Unearthing some hidden PostgreSQL 11 gems

12.2018 | Category: News | Tags: performance

12.2018

Category: News

By Kaarel Moppel - PostgreSQL 11 new features - It’s been a bit over a month since the last Postgres major version was released (and also the 1st minor update is out) so it’s not exactly fresh out of the oven...and as usual there has been already a barrage of articles on the most prominent features. Which is great, as I can save some keyboard strokes on those. But there are of course some other little gems that didn’t get any spotlight (by the way release notes are about 13 pages long so a lot of stuff!)...and now luckily had some “aluminium tube time” to have a second look on some more interesting/useful little features and improvements. So here my findings (somewhat logically grouped).

Table of Contents

General PostgreSQL 11 features

Add column pg_stat_activity.backend_type

This makes it possible to report the “connected/active users” correctly as just a “count(*)” could lie due to the “parallel query” features added already 2 versions ago.

Allow ALTER TABLE to add a column with a non-null default without doing a table rewrite

Adding new columns with DEFAULT values to large and active tables is a classical beginner mistake, basically halting operation. But no more – Postgres is now a lot more beginner-friendly!

Performance

Fill the unused portion of force-switched WAL segment files with zeros for improved compressibility

My favourite low-hanging fruit from this release...and I wonder why it took so long. Why is it cool? Well, one can now set the “archive_timeout” so low that RPO-s from 1 second (lower values currently not allowed) are doable without a replica, with plain WAL-shipping. A great alternative for semi-important systems where losing a couple of last records is not so tragic.

Improve performance of monotonically increasing index additions

The most common use case for "serial" ID columns got a 30-50% boost!

Allow bitmap scans to perform index-only scans when possible

Bitmap index scans (which are quite different from normal index scans) could get quite costly, so this is very good news.

Add support for large pages on Windows

Mostly known as "huge pages", this feature (at least on Linux) is quite recommended for machines with more than 128GB of RAM. Needs to be enabled on OS side also though.

Replication

Exclude unlogged tables, temporary tables, and pg_internal.init files from streaming base backups

This could be quite a time-saver for huge Data Warehouse type of databases where “unlogged” is used quite often for “staging” tables.

Allow checksums of heap pages to be verified during streaming base backup

And actually if checksums are enabled on the master this happens by default! Can be disabled by “--no-verify-checksums” flag...but you should not. And there was actually also a “sister”-feature added where pg_verify_checksums tool can now run on an offline cluster which could be useful in some cases.

Replicate TRUNCATE activity when using logical replication

A huge one for LR users. Previously there were no good workarounds except creating a DML trigger to catch truncations on the publisher side and to error out.

Partioning / "sharding"

Allow INSERT, UPDATE, and COPY on partitioned tables to properly route rows to foreign partitions

This is supported only by postgres_fdw foreign tables but a great addition when you’re doing “home-grown sharding”.

Allow postgres_fdw to push down aggregates to foreign tables that are partitions

Again great if you’re doing sharding - less data fetched equals faster queries, especially when network is a bit laggy.

Allow partition elimination during query execution

Previously, partition elimination only happened at planning time, meaning many joins and prepared queries could not use partition elimination and it resulted in some unnecessary scans.

Auxiliary tools

Add an approximately Zipfian-distributed random generator to pgbench

Quite a nice feature at least for me personally as I’m a big fan and user of pgbench for quick performance gauging. The feature allows tests to be closer to real-life use cases where we typically have a cold/hot data scenario and active data stays generally more or less in shared buffers. To use the feature one needs to change the default test script though, there's no flag.

Allow the random seed to be set in pgbench

More repeatable test runs, yay! I’ve made use of it already for example here.

Allow initdb to set group read access to the data directory

This is actually useful for 3rd party file-based backup software, such which previously needed “root” or “postgres” privileges, which are of course not good for the cluster’s operational security.

Reduce the number of files copied by pg_rewind

Good news for HA-tools that use pg_rewind automatically (Patroni for example). Helps to quickly restore cluster redundancy.

Allow extension pg_prewarm to restore the previous shared buffer contents on startup

The extension can now be used to automatically reduce the time of degraded performance that usually lasts for a couple of minutes after a restart. Shared buffers are then periodically (or only on shutdown) stored in files and fetched back into shared buffers after a (re)start by 2 background workers. For bigger shared buffers sizes you also want to make sure that the Postgres $DATADIR has enough disk space.

Add psql command gdesc to display the names and types of the columns in a query result

Well that was that for PostgreSQL 11. Please write in the comments section if you found some other small but cool additions for version 11 🙂

PostgreSQL 11 was a while ago. Find out about the latest new features of the current PostgreSQL version in our What's New blog spot.

Unearthing some hidden PostgreSQL 11 gems

General PostgreSQL 11 features

Performance

Replication

Partioning / "sharding"

Auxiliary tools

Leave a Reply Cancel reply

CYBERTEC Guest

Blog Tags

NEWSLETTER

Articles by our PostgreSQL Experts