Last week a new PostgreSQL major version with the number 9.6 was released! The announcement, release notes and the official “What’s new” overview can be found here, here and here – it’s highly recommended for reading, so check them out. But as always, there’s also a slew of blog-posts from exited members of the global Postgres community follows (check out Planet PostgreSQL here if not yet subscribed), each with a bit of a different angle. Now I would like to add my own impressions on the most interesting/relevant features, summarized for easy digestion.
As always, users who upgrade or initialize a fresh cluster, will enjoy huge performance wins (avoid scanning frozen pages unnecessarily during vacuum freeze, scalability on multi-CPU-socket servers, checkpoint writes in sorted order, index-only scans for partial indexes) out of the box without doing, or being able to do anything, but here I would like to look at the things that you won’t get out of the box, but you actually need to take some steps to start benefiting from them. The list below highlights PostgreSQL 9.6 features compiled from a DBA’s viewpoint. This week a similar article where we look at the changes from a developer’s point of view will follow.
First, here’s a list of things that could most likely cause problems when migrating to PostgreSQL 9.6 from an older version. Before migrating one should of course test on a separate replica and go through the full list of possibly incompatible changes from the release notes.
- Columns for pg_stat_activity (information on active sessions) have changed, providing more details on the lock types blocking the given process.
“waiting” column has been replaced with “wait_event_type” and “wait_event”.
- Treat role names beginning with “pg_” as reserved.
Pretty simple stuff, “pg_upgrade” will give you an error “The old cluster contains roles starting with ‘pg_'”.
- psql’s -c option no longer implies –no-psqlrc.
In case there exists a “.psqlrc” file this could cause your Cron scripts to generate some unwanted output (translating to emails usually), even with the “-q/–quiet” flag.
My favourites from a DBA’s point of view
- Parallel execution of sequential scans, joins and aggregates.
Needs enabling via “max_parallel_workers_per_gather” parameter, which can luckily be done on the user level too, thus making per-query parallelization possible. But one thing to note here is that the total number worker processes is limited with “max_worker_processes” parameter, so this might need increasing (default is 8) on good hardware when doing parallel queries from lots of concurrent sessions.
- Time-based limiting of maximum MVCC snapshot age via the “old_snapshot_threshold” parameter.
Beyond the threshold, old data may be vacuumed away, and users will get a “snapshot too old” error when trying to read such old rows. Warning! From the documentation – “When this feature is enabled, freed space at the end of a relation cannot be released to the operating system”…sothis is basically a double-edged sword basically and it’s not enabled by default.
- Synchronous replication now allows multiple standby servers for increased reliability.
Together with the new “remote_apply” setting for the “synchronous_commit” parameter user have power to create “mirrored” multi-machine clusters. Awesome feature!
- Add pg_stat_progress_vacuum system view to provide progress reporting for VACUUM operations
- Improve pg_rewind so that it can work when the target timeline changes
From release notes – “This allows, for example, rewinding a promoted standby back to some state of the old master’s timeline”. Meaning you could promote a replica, do some migration testing say and then convert it back into a normal replica. Great!
- Add pg_control_system() and some other pg_control_* functions
Enables to read information equivalent to “pg_controldata” utility via SQL. Previously one had to work around it via a custom PL/Pythonu stored procedure or even custom extension, when wanting to expose “database system identifier” for monitoring queries for example.
- Add pg_blocking_pids() function
As release notes formulated it well “Historically users have obtained such information using a self-join on the pg_locks view. However, it is unreasonably tedious…”, this means one can hugely simplify monitoring scripts and ad-hoc troubleshooting, paired with some joins to pg_stat_activity.
- Allow sessions to be terminated automatically if they are in idle-in-transaction state for too long via “idle_in_transaction_session_timeout” parameter.
Useful to prevent forgotten transactions from holding locks or preventing vacuum cleanup for too long. Bye-bye Cron scripts trying to do the same by reading pg_stat_activity regularly and terminating misbehaving transactions.
- Merge the “archive” and “hot_standby” values of the “wal_level” configuration parameter into a single new value “replica”
Postgresql.conf needs adjusting while migrating and already not using “logical”!
- Add a CASCADE option to CREATE EXTENSION to automatically create any extensions the requested one depends on
- New built-in role called “pg_signal_backend” to enable this previously superuser-only functionality.
More built-in roles to be expected in future.
- Add pg_size_bytes() function to convert human-readable size strings to numbers
- “psql” supports multiple -c and -f command-line options.
The specified operations are carried out in the order in which the options are given, and then psql terminates.