Are your foreign keys indexed?
Foreign key constraints are an important tool to keep your database consistent while also documenting relationships between tables. A fact that is often ignored is that foreign keys need proper indexing to perform well. This article will explain that and show you how to search for missing indexes. Index at the target of a […]
Correlation of PostgreSQL columns explained
After you ANALYZE a PostgreSQL table to collect value distribution statistics, you will find the gathered statistics for each column in the pg_stats system view. This article will explain the meaning of the correlation column and its impact on index scans. Physical vs. logical ordering: Most common PostgreSQL data types have an ordering: they support […]
Rules or triggers to log bulk updates?
Inspired by my co-worker’s recent blog post, I decided to revisit the old question of rules vs. triggers and run a little benchmark to see which one does better. About rules: While triggers are well known to most application developers and database administrators, rules are less well known. The full name “query rewrite rule” […]
Adding an index can decrease SELECT performance
We all know that you have to pay a price for a new index you create: data-modifying operations will become slower, and indexes use disk space. That’s why you try to have no more indexes than you actually need. But most people think that SELECT performance will never suffer from a new […]
Linux cgroups for PostgreSQL
In a recent wrestling match with the Linux “out-of-memory killer” for a CYBERTEC customer, I got acquainted with Linux control groups (“cgroups”), and I want to give you a short introduction to how they can be used with PostgreSQL and discuss their usefulness. Warning: This was done on my RedHat Fedora 27 system running Linux […]
Avoiding “OR” for better query performance
PostgreSQL query tuning is our daily bread at CYBERTEC, and once you have done some of that, you’ll start bristling whenever you see an OR in a query, because it is usually the cause of bad query performance. Of course there is a reason why there is an OR in SQL, and if you […]
What’s in an xmax?
xmax is a PostgreSQL system column that is used to implement Multiversion Concurrency Control (MVCC). The documentation is somewhat terse: The identity (transaction ID) of the deleting transaction, or zero for an undeleted row version. It is possible for this column to be nonzero in a visible row version. That usually indicates that the deleting […]
Get rid of your unused indexes!
Why should I get rid of unused indexes? Everybody knows that a database index is a good thing because it can speed up SQL queries. But this does not come for free. The disadvantages of indexes are: Indexes use up space. It is not unusual for database indexes to use as much storage space as […]
Three reasons why VACUUM won’t remove dead rows from a table
Why VACUUM? Whenever rows in a PostgreSQL table are updated or deleted, dead rows are left behind. VACUUM gets rid of them so that the space can be reused. If a table doesn’t get vacuumed, it will get bloated, which wastes disk space and slows down sequential table scans (and – to a smaller extent […]
New features for sequences: gains and pitfalls
About sequences: Sequences are used to generate values for artificial numeric primary key columns of tables. A sequence provides a “new ID” that is guaranteed to be unique, even if many database sessions are using the sequence at the same time. Sequences are not transaction-safe, because they are not supposed to block the caller. That is […]
How a bad network configuration can cause table bloat
I recently had an interesting support case that shows how the cause of a problem can sometimes be where you would least suspect it. About table bloat: After an UPDATE or DELETE, PostgreSQL keeps old versions of a table row around. This way, concurrent sessions that want to read the row don’t have to wait. […]