Sometimes you reach a point where you know that things cannot get any easier than they already are.
I think with PostgreSQL 9.3 we will pretty much have reached this point when it comes to firing up a simple replica. In the old days you had to call pg_start_backup, rsync the entire data directory, call pg_stop_backup, and finally come up with a recovery.conf file to set up a single standby.
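For illustration, the old procedure looked roughly like this (hostnames, paths, and connection settings below are made up for this sketch, and details such as excluding pg_xlog are omitted):

    # on the master
    psql -c "SELECT pg_start_backup('base_backup');"
    rsync -a /var/lib/pgsql/data/ standby:/var/lib/pgsql/data/
    psql -c "SELECT pg_stop_backup();"

    # recovery.conf, written by hand on the standby
    standby_mode = 'on'
    primary_conninfo = 'host=master port=5432 user=repuser'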
Things have changed, and some time ago pg_basebackup made life so much easier. No more rsync (in the typical case), no more function calls. But still, a normal user had to come up with a simple recovery.conf file to control the recovery process.
However, when you take a look at pg_basebackup, you will notice that the call itself already contains all the information a user typically needs to write a simple recovery.conf file. So why not use this information and write a recovery.conf file straight away if desired?
This is exactly what Zoltan has implemented for 9.3. The patch was committed recently: http://git.postgresql.org/pg/commitdiff/915a29a10cdabfbe301dc7201299841339b9798f
All you have to do is call pg_basebackup with the "-R" option. A simple recovery.conf file will be written automatically. The only thing left to do is to fire up your brand new standby (pg_ctl start ...).
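A minimal example, assuming a replication user called repuser and a target directory of /path/to/standby (both names are just placeholders):

    pg_basebackup -h master.example.com -U repuser -D /path/to/standby -R
    pg_ctl -D /path/to/standby start

The -R option makes pg_basebackup write a recovery.conf containing standby_mode and a primary_conninfo derived from the connection parameters used for the backup itself.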
I guess there is no way to make it any easier.
Planner cost parameters such as random_page_cost have had the same default values for many years. This was fine for most users. Recently, however, we have seen a couple of systems that were already based entirely on SSDs.
SSDs have a nice advantage over traditional disks: random disk access is not the nightmare it used to be on spinning drives. On an SSD, random access is no longer many times slower than sequential access. This is important information when it comes to optimizing a query.
It has turned out to be very beneficial to adjust random_page_cost in postgresql.conf to a value close to 1 (instead of 4, which has been the default for many years) when running PostgreSQL on SSDs.
This is especially important if you happen to work with very large data sets (with small data sets which can be cached, setting random_page_cost to a value close to 1 might have been a good idea in the past anyway).
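In postgresql.conf this is a one-line change; the exact value is a matter of testing, and 1.1 below is only an illustration:

    # postgresql.conf on an SSD-only system
    random_page_cost = 1.1      # default is 4
    # seq_page_cost keeps its default of 1

The setting takes effect after a simple reload; no restart is required.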
Recently there have already been some cases where PostgreSQL decided to use too many hash joins. Adjusting random_page_cost can help here too, because index scans, and therefore less hashing, become more likely (if this is not sufficient, some other parameters have to be adjusted as well).
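To check whether a lower random_page_cost changes a given plan, you can experiment at the session level before touching postgresql.conf (the table and column names below are made up):

    SET random_page_cost = 1.1;
    EXPLAIN SELECT *
    FROM orders o JOIN customers c ON o.customer_id = c.id
    WHERE c.country = 'AT';

    -- compare against the current default
    SET random_page_cost = 4;
    EXPLAIN SELECT *
    FROM orders o JOIN customers c ON o.customer_id = c.id
    WHERE c.country = 'AT';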