Sometimes you reach a point where you know that things cannot get any easier than they already are.
I think with PostgreSQL 9.3 we will pretty much have reached this point when it comes to firing up a simple replica. In the old days you had to call pg_start_backup, rsync the entire data directory, call pg_stop_backup, and finally come up with a recovery.conf file to set up a single standby.
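For illustration, the old procedure looked roughly like this (hostnames, paths, and connection settings below are made up for this sketch, and details such as excluding pg_xlog are omitted):

    # on the master
    psql -c "SELECT pg_start_backup('base_backup');"
    rsync -a /var/lib/pgsql/data/ standby:/var/lib/pgsql/data/
    psql -c "SELECT pg_stop_backup();"

    # recovery.conf, written by hand on the standby
    standby_mode = 'on'
    primary_conninfo = 'host=master port=5432 user=repuser'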
Things have changed, and some time ago pg_basebackup made life so much easier. No more rsync (in the typical case), no more function calls. But still, a normal user had to come up with a simple recovery.conf file to control the recovery process.
However, when you take a look at pg_basebackup, you will notice that the call itself already contains all the information a user typically needs to write a simple recovery.conf file. So why not use this information and write a recovery.conf file straight away if desired?
This is exactly what Zoltan has implemented for 9.3. The patch was committed recently: http://git.postgresql.org/pg/commitdiff/915a29a10cdabfbe301dc7201299841339b9798f
All you have to do is call pg_basebackup with the "-R" option. A simple recovery.conf file will be written automatically. The only thing left to do is to fire up your brand new standby (pg_ctl start ...).
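A minimal example, assuming a replication user called repuser and a target directory of /path/to/standby (both names are just placeholders):

    pg_basebackup -h master.example.com -U repuser -D /path/to/standby -R
    pg_ctl -D /path/to/standby start

The -R option makes pg_basebackup write a recovery.conf containing standby_mode and a primary_conninfo derived from the connection parameters used for the backup itself.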
I guess there is no way to make it any easier.
Planner cost parameters such as random_page_cost have had the same default values for many years. This was fine for most users. Recently, however, we have seen a couple of systems that were already based entirely on SSDs.
SSDs have a nice advantage over traditional disks: random disk access is not the nightmare it used to be on spinning drives. On an SSD, random access is no longer many times slower than sequential access. This is important information when it comes to optimizing a query.
It has turned out to be very beneficial to adjust random_page_cost in postgresql.conf to a value close to 1 (instead of 4, which has been the default for many years) when running PostgreSQL on SSDs.
This is especially important if you happen to work with very large data sets (with small data sets which can be cached, setting random_page_cost to a value close to 1 might have been a good idea in the past anyway).
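In postgresql.conf this is a one-line change; the exact value is a matter of testing, and 1.1 below is only an illustration:

    # postgresql.conf on an SSD-only system
    random_page_cost = 1.1      # default is 4
    # seq_page_cost keeps its default of 1

The setting takes effect after a simple reload; no restart is required.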
Recently there have already been some cases where PostgreSQL decided to use too many hash joins. Adjusting random_page_cost can help here too, because index scans, and therefore less hashing, become more likely (if this is not sufficient, some other parameters have to be adjusted as well).
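To check whether a lower random_page_cost changes a given plan, you can experiment at the session level before touching postgresql.conf (the table and column names below are made up):

    SET random_page_cost = 1.1;
    EXPLAIN SELECT *
    FROM orders o JOIN customers c ON o.customer_id = c.id
    WHERE c.country = 'AT';

    -- compare against the current default
    SET random_page_cost = 4;
    EXPLAIN SELECT *
    FROM orders o JOIN customers c ON o.customer_id = c.id
    WHERE c.country = 'AT';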