While it’s generally well known how to create physical replicas (i.e. block level copies), using the super simple and really effortless pg_basebackup tool, not so many know what can be done, when the process is painfully slow for bigger databases. The reason for slowness can be typically linked to machine hardware (slow disks, high CPU load) or more commonly slow network between the master and replica machines – think disaster recovery, with DBs in different datacenters. So what can be done?
Tackling the problem
In the first case (slow systems) there’s not too much magic that can be done… hardware has its limits after all. One could try to review/optimize the configuration, hot-swap disks to SSDs when RAID setup allows that or such. The only good news might be that when you already have a spare “high-availability” replica, that’s just idling, it can be used for taking “base backup” also and things should be snappier. There are some special considerations for that use case though, so look at the pg_basebackup documentation before hitting the road.
But what about the network bottleneck? Here we have some options:
1) Set up some VPN/SSH tunnel with compression
Using pg_basebackup will be convenient in the end but the tunnel setup could get quite complex and might be “out of scope” for DBAs.
2) Local pg_basebackup (or just file level copy) compressed before copying over.
This approach requires possibly a lot of extra disk space on the master that might not be available but otherwise a good option.
3) The SSH streaming approach.
Dump pg_basebackup output to stdout on the master and stream it to the replica. We’ll concentrate on this approach with this post.
Compressed streaming pg_basebackup over SSH
So here’s our quick fix to alleviate slow network – streaming “base backup” with master side compression, preferably with multiple compression threads! Note the “master side” part – pg_basebackup already has a “–gzip/–compress” flag, but it actually only compresses things on the replica side, so that all the Postgres data files are still moved over the wire “as is”, which is definitely not optimal (I wonder are there any plans to address this issue?). But ok, how much “profit” can one expect with this trick, given that Postgres also has built in compression for larger strings? Well, it depends heavily on the contents of your data and the compression/parallelization level you’ll configure – but 30-50% decrease in total replica creation time are easily possible. It might not sound too much, but this could translate already to hours for 1TB databases. For really slow networks the gains would be even bigger, when using highest levels of compression with more threads.
A sample implementation of such approach can be found here. Some adjusting before usage is required but the main idea is that we need to combine the TAR format and compression on the master side and then stream over already the compressed data and then de-compress it on the replica side. For parallel compression/de-compression of the TAR stream I would recommend the “pigz” utility. NB! And not to forget about streaming the WAL files separately with pg_receivexlog as pg_basebackup does it for you only in the “plain” mode – without the WALS the whole “base backup” is later worthless.
# pseudocode on the replica # 1. start streaming of WALs pg_receivexlog -h master -D streamed_wals # 2. start the parallel (2 threads) streamed base backup ssh master "pg_basebackup -h /unix/socket -Ft -D- | pigz -c -p2 " | pigz -cd -p2 | tar -xf- -C pgdata # 3. stop WAL streaming and move WALs to XLOG folder pkill pg_receivexlog && mv streamed_wals/* pgdata/pg_xlog # 4. edit configs and start the replica
So take a look and leave a comment when you have some thoughts on the approach or apply some other tricks. Cheers!