Tag: data
Bulk load performance in PostgreSQL
© Laurenz Albe 2023 There are several techniques to bulk load data into PostgreSQL. I decided to compare their performance in a simple test case. I’ll add some recommendations for parameter settings to improve the performance even more. An example table to bulk load data The table is simple enough: It is a narrow table […]
Practical Examples of Data Normalization in PostgreSQL
Normalization by Osmosis Part 2: Practical Examples of Data Normalization in PostgreSQL Having gone through a theoretical introduction in part 1, it is now time to discuss some more practical examples. I’ll first talk about the… type modeling … of columns. It is important to realize that during the time Codd formulated the base 3 […]
Data Normalization in PostgreSQL
Normalization by Osmosis Part 1: A Guide to Data Normalization in PostgreSQL By Michał Małecki Introduction Although I installed my first PostgreSQL database on RedHat Linux (not RHEL) 6.0 around 1999, I probably limited my interaction with this 6.5 release to starting the server with SysinitV scripts. PostgreSQL was at that time an underdog to […]
pg_rewrite: PostgreSQL Table Partitioning
PostgreSQL table partitioning is by far one of the most popular new PostgreSQL features. Many people look to partitioned tables as a way to improve performance and broaden scalability. However, partitioning comes with a little problem: How can you partition an existing table without locking up the database? The answer is: pg_rewrite can help you […]
Data warehousing: Making use of synchronized seq scans
PostgreSQL contains some hidden gems which have been around for many years and help to silently speed up your queries. They optimize your SQL statements in a clever and totally transparent way. One of those hidden gems is the ability to synchronize sequential scans. Actually, this feature has been around for 15+ years, but has […]
Postgres scaling advice
By Kaarel Moppel: So, you’re building the next unicorn startup and are thinking feverishly about a future-proof PostgreSQL architecture to house your bytes? My advice here, having seen dozens of hopelessly over-engineered / oversized solutions as a database consultant over the last 5 years, is short and blunt: Don’t overthink, and keep it simple on […]
Binary data performance in PostgreSQL
© Laurenz Albe 2020 A frequently asked question in this big data world is whether it is better to store binary data inside or outside of a PostgreSQL database. Also, since PostgreSQL has two ways of storing binary data, which one is better? I decided to benchmark the available options to have some data points […]
PostgreSQL v13 new feature: tuning autovacuum on insert-only tables
© Laurenz Albe 2020 Most people know that autovacuum is necessary to get rid of dead tuples. These dead tuples are a side effect of PostgreSQL’s MVCC implementation. So many people will be confused when they read that from PostgreSQL v13 on, commit b07642dbc adds support for autovacuuming insert-only tables (also known as “append-only tables”). […]
Looking at MySQL 8 with PostgreSQL goggles on
By Kaarel Moppel – Compare MySQL and PostgreSQL – First off – I’m not trying to kindle any flame wars here, just trying to broaden my (your) horizons a bit, gather some ideas (maybe I’m missing out on something cool, it’s the most used Open Source RDBMS after all) and to somewhat compare the two […]
COPY in PostgreSQL: Moving data between servers
The COPY command in PostgreSQL is a simple way to copy data between a file and a table. COPY can either copy the content of a table to or from a table. Traditionally data was copied between PostgreSQL and a file. However, recently a pretty cool feature was added to PostgreSQL: It is now possible […]
Preparing data for machine learning in PostgreSQL
You have probably noticed that everyone is talking about Artificial Intelligence and Machine Learning these days. Quite rightly, because it is a very important topic, which is going to shape our future for sure. However, when looking at most of the code related to machine learning available on the net, it strikes me, how much […]
Forking databases – the art of copying without copying
I received a question about how to fork PostgreSQL databases like you can do on Heroku. As I did not find any good examples on how to do this, I decided to do a quick write up. What does “fork databases” mean? Forking a database means taking a copy of a database where you can […]