By Kaarel Moppel – UPDATED by Laurenz Albe 06.07.2023 – See what progress has been made
What are PostgreSQL’s weaknesses? How can PostgreSQL be improved? Usually, in this blog I write about various fun topics around PostgreSQL – like perhaps new cool features, some tricky configuration parameters, performance of particular features or on some “life hacks” to ease the life of DBA-s or developers. This post will be quite different though. It’s inspired by an article called “10 Things I Hate about PostgreSQL” which I stumbled upon recently. I thought I’d also try my hand at writing about some less flattering sides of PostgreSQL 🙂 And yes, there are some, for sure – as with any major piece of software.
In the article I referenced, quite a few points are debatable from a technology viewpoint. However, in general I enjoyed the idea of exploring the dark side of the moon once in a while. Especially in our relational database domain, where problems tend to be quite complex – since there’s a bunch of legacy code out there (PostgreSQL has been around sind 1996!), there are the official SQL standards to be adhered to, and there are a gazillion different database / OS / hardware combinations out there in the wild.
So yeah, you definitely shouldn’t get blinded and praise any product X univocally. Especially given that the software world has gotten so complex and is moving so fast nowadays, that my guess is that nobody actually has the time to try out and learn all those competing technologies to a sufficiently good level. Well maybe only in academia, where one is not expected to directly start generating $$, the faster the better.
To love it or to hate it, that is the question
About the title – although it was kind of tempting to also use the word “hate” (more clicks probably), I found it too harsh in the end. In general – PostgreSQL is a truly well written and maintained software project, with a healthy Open Source community around it and it is also my default database recommendation for most applications. Its price – performance ratio is just yet to be beaten in the RDBMS sphere I believe. And there are far more benefits than shortcomings that I can think of.
The most important ones for me are its lightweightedness and simplicity for DBA operations (building a replication server is a one-liner!). From the development side, it is maybe the fact that Postgres is at the very top when it comes to implementing the ISO SQL standard, so that PostgreSQL can’t really become a bottleneck for you. You could migrate to something else with a reasonable amount of work. In this area, a lot of commercial competitors have no real interest in such qualities – would you want to make migrating away from your product easy, if it’s generating you tens of thousands of dollars per instance per year?
So in short, as concluded in the post referenced, you can’t go wrong with PostgreSQL for most applications. But anyways, here are some of my points that I think could be improved to make PostgreSQL even better. Most of them are rather minor complaints you’ll find, or just looking at the clouds and wishing for more.
Construction Site #1 – Rigid query planning
This is related to the “no hints dogma” from the blog post referenced above (not to forget about the “pg_hint_plan” 3rd party extension though!) one of my biggest gripes with Postgres when it comes to complex queries – is the mostly static query planner. Mostly here means that except for some extra logic on choosing plans for prepared statements and PL/pgSQL stored procedures (which basically also boil down to prepared statements), the planner / execution doesn’t gather any feedback from the actual runs!
It pretty much cold-heartedly looks at the query, the pre-calculated statistics of tables mentioned in there, and then selects a plan and sticks to whatever happens. This holds true even if the row estimate for the first executed node is a million times off and some millions of rows are now going to be passed higher via a nested loop join, an algorithm that’s rather meant for handling small to medium datasets.
I do understand that employing some adaptive query planning strategies is a huge task. It would require decent resources to tackle it. However, some simple heuristics like automatic re-analyze based on seen data, or trying some other join types in the presented example should be probably doable as “low hanging fruits”. And it seems some people are already trying out such ideas already also – there’s for example the AQO project. But for real progress some wider cooperation would be probably needed to keep up with the big commercial products, who already throw around buzzwords like “AI-based query optimization”. Not sure if just marketing though or they really have something, not much stuff on that on the interwebs.
Could be improved: #2 Some levels of automatic tuning
Slightly connected to the first point is the idea of PostgreSQL automatically selecting some better values for default parameters, based on statistics gathered over time. Alternatively, it could just look at available OS resources. Currently, Postgres actually tries hard on purpose not to know too much about the OS (to support almost all platforms out there). For some reasonable subset of platforms and configuration settings, it could be done – theoretically.
I’m not talking about complex stuff even, but just the main things, like: optimizing the checkpoint settings (after some safety time + warnings), if checkpoints occur too often, increase
autovacuum aggressiveness in case there’s lots of data changes coming in, look at memory availability and increase
work_mem to get rid of some temp files, or enable
wal_compression if the CPU is sitting idle. Currently, I think the only parameter that is automatically tuned / set is the “wal_buffers” parameter.
Well, in general, the topic of tuning is not a real problem for seasoned DBA’s. Actually, the opposite is true – it’s our bread 🙂 However, it would surely benefit most developers out there. Also, this is what the competition is already touting. In the long run, this would benefit the project hugely, since DBA’s are kind of a scarce resource nowadays, and extensive tuning could be off-putting for a lot of developers.
Still waiting on #3: Lost statistics during pg_upgrade
I’m sure most people have not heard or seen this issue, so it can’t really be described as a major problem. It has to do with the fact that old table statistics are not carried over when migrating to a newer Postgres version using the
pg_upgrade utility, which is the fastest way to migrate, by the way. The original post talks about some hours of downtime… but with the “–link” flag, it’s usually less than a minute!
As I said, it’s not a biggie for most users that have ample downtime for upgrades or have only normal simple queries, selecting or updating a couple of rows over an index. On the other hand, for advanced 24/7 shops with lots of data, it can be quite annoying. It’s hard to predict the kind of spikes you’re going to get during those first critical minutes after an upgrade, when statistics are still being rebuilt. So here, it would be actually really nice if the statistics would not be “just deleted“ with every release, but only when some incompatibilities really exist.
For some versions, I even looked at the pertinent structures (
pg_statistic group of tables) and couldn’t see any differences. I think it’s just a corner case and hasn’t gotten enough attention. There are also alternatives. If you really want to avoid such “iffy” moments, you could use logical replication (v10 and above) instead of the good old “in-place” upgrade method. Some details on such upgrades can be found in this post about logical replication.
This one has gotten better: #4 Autovacuum and XID Wraparound – UPDATED
Also “featured” in the original post – historically speaking, the XID Wraparound and the seemingly randomly operated autovacuum background process have definitely been the number one problem for those who are not so up to date on how Postgres MVCC row versioning works, and haven’t tuned accordingly. At default settings, after some years of operation, given that transaction counts are also increasing steadily, it’s indeed very possible to have some downtime to take care of the gathered “debt”.
Still, I’d say it’s not so tragic for most people as it is depicted, as it affects only very busy databases. And if you’re running with version v9.6 of PostgreSQL or higher, then there’s a good chance that you’ll never get any wraparound-related downtime, because the algorithms are now a lot smarter. It has improved even more as of v12, where autovacuum is much more aggressive by default. You can do quick vacuums by skipping index maintenance!
To defend Postgres a bit – you can enable early warning signals for such situations (
log_autovacuum_min_duration) and there are many internal statistics available…you just need to use this information and take action – the tools are there. But there’s definitely room for more improvement. For example, one cool idea (from my colleague Ants) would be to allow explicit declaration of “tables of interest” for long running snapshots. Currently, many autovacuum / wraparound problems are caused by the fact that long-running transactions, sadly, block pretty much all autovacuum cleanup activity within the affected database, even if we’re selecting a table that is completely static. Such declarations could probably also help on the replica side with reducing recovery conflict errors, making Postgres more load-balancing friendly.
– in this area there are also some very promising developments happening with the zHeap project, that aims to provide an alternative to the MVCC row storage model, reducing bloat and thereby surprises from the background processes like the autovacuum.
Notes to #4 by Laurenz Albe:
There have been major improvements in #4, although the fundamental problem is still there:
- in v13, we have autovacuum triggered by
INSERT, which can trigger VACUUM earlier and reduce the magnitude of anti-wraparound autovacuum for
INSERT-only tables (see this article about autovacuum and insert-only tables)
- From v14 on, anti-wraparound autovacuum skips index cleanup and removing dead tuples if things get dangerous (See more about this on github)
The second is the more important improvement, but the first is cooler.
On its way but not there yet… #5 Disk footprint and bloat – UPDATED
This is another well-acknowledged problem-area for PostgreSQL – its “on disk” footprint is mostly a bit higher than that of its competitors. Normally this doesn’t show in real life, since most of your “working data set” should be cached anyway, but it can gradually slow things down. This is especially true if you run at default configs or fail to do any manual maintenance from time to time. But luckily, some things are already happening in this area, like the zHeap project and also a new hybrid store (row + column storage) called ZedStore is emerging that will compress the data size considerably. It will make Postgres more suitable for ultra large data warehouses – so let’s hope it makes it into Postgres v14.
One could throw in here also the full compression topic…but in my opinion it’s not really that important for OLTP databases at least, as enterprise SSD disks are fast as hell nowadays, and you have the option to dabble on the file system level. For those rare use cases where 10TB+ data warehouses with near real-time analytical query expectations are needed, I’d maybe not recommend Postgres as a default anyway. There are better specialized systems which employ hybrid and columnar storage approaches for such needs; automatic block compression alone would not save us there.
Not to say that automatic compression is a bad idea, though: for example, the latest versions of TimescaleDB (a 3rd party PostgreSQL extension for time-series data) employ such tricks, can reduce the “on disk” size dramatically, and thereby speed up heavier queries. There’s also another extension called cstore_fdw. It’s an extension for data warehouse use cases, so there are already options out there for Postgres.
Notes by Laurenz:
For #5, v13 has somewhat improved things with index de-duplication (at least B-tree indexes with duplicates take less space now).
That is only a slight step in the right direction, however.
#6 Out-of-the-box vulnerability against brute force password attacks – UPDATED
I put this one in even though there’s been some thought put into it, since there’s a “contrib” (bundled with Postgres) extension named “auth_delay” for exactly this purpose. What’s true about Postgres out of the box – there’s no limit on password login attempts! Also, by default, users can open up to
$max_connections (minus some superuser reserved connections) sessions in parallel…this could mean trouble if you have some mean gremlins in your network. Note, however, that under “password login” I mean “md5” and “scram-256” auth methods, meaning if you forward your authentication to LDAP or AD, you should be safe – but be sure to validate that. Or, you could enable the “auth_delay” extension; better safe than sorry! It’s really easy to crack weak passwords if you know the username – basically a one-liner! So it would be really nice to see some kind of soft limits kick in automatically.
N.B.! This doesn’t mean that you don’t get notified of the failed login attempts, but it could be too late if you don’t monitor the log files actively! P.S.: there’s also an extension which deals with the same “problem space”, called pg_auth_mon and we (CYBERTEC) also have a patch for support customers to disable such attacked accounts automatically after X failed attempts.
Don’t forget though – by default, access config Postgres only listens to connections from the localhost, so the potential threat only occurs if some IP range is made explicitly accessible!
Notes by Laurenz:
- v14 has changed
password_encryptionto scram-sha-256 by default, which makes brute force attacks way harder, since scram hashes are more expensive.
- v16 has introduced (or will introduce)
require_auth, which allows the client to reject unsafe authentication methods and makes identity theft by faking the server way harder..
Finally, #7: No query cache
This is one of those “wishing for more” or “nice to have” items on my list. The general idea behind it is that it’s quite wasteful to execute exactly the same deterministic query (fixed parameters, immutable functions etc) and scan some large tables fully every time, returning exactly the same results – if the data hasn’t changed! There are probably some non-obvious technical pitfalls lurking around here…but some databases have managed to implement something like that. MySQL now decided to retire it in v8.0 due to some unfortunate implementation details…so the stuff is not easy. However, since performance is the main thing that developers (especially junior ones) worry about, and given that Postgres materialized views are also not always useful, something like that would be an awesome addition.
That should do
Phew…that was tiring already. Seems it’s not really that easy to find 10 things to improve about PostgreSQL…because it’s a really solid database engine 🙂 So let’s stop at 7 this time, but if you have some additional suggestions, please leave a comment. Hope it broadened your (database) horizons, and do remember to decide on the grand total of pluses and minuses.
Read more about PostgreSQL performance and data types in Laurenz Albe’s blog: UNION ALL, Data Types and Performance