
pg_basebackup and backup speed

In the past couple of years, replication has been made easier and easier. Setting up streaming replication is pretty straightforward these days, and when I train people I get a lot of positive feedback: people seem to like the way entire database instances can be cloned.

In many cases, DBAs use a straightforward approach to creating base backups: pg_basebackup.

Potential showstoppers on the way

While this is simple, a small issue can pop up: what if the I/O capacity of the two servers is low while the network bandwidth is large? With a plain base backup, that can lead to high disk wait on both servers. The problem is that pg_basebackup running at full speed can easily cause response times to go up, because the disk is simply too busy reading all the data.

Some customers have reported that this kind of problem even feels like downtime.

The solution to the problem is to control the speed of pg_basebackup. By reducing the backup speed to a reasonable level, the disk on the master has enough time to serve normal requests.

The -r flag of pg_basebackup can be used for exactly that:
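(A minimal sketch; the host name and target directory are placeholders, and 10M caps the transfer rate at 10 MB per second.)

    # Throttle the base backup so the master's disks keep headroom
    # for normal traffic; -r is the short form of --max-rate:
    pg_basebackup -h master.example.com -D /target/data_dir \
        -r 10M --checkpoint=fast --progress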

Controlling the speed of your base backups can be a good idea if I/O bandwidth is an issue on your systems. It makes sure that your systems stay responsive all the time, even during a backup.

 



I received a question about how to fork PostgreSQL databases the way you can on Heroku. As I did not find any good examples of how to do this, I decided to do a quick write-up.

What does "fork databases" mean?

Forking a database means taking a copy of a database in which you can make changes that don't affect the original source database. The obvious solution here is to just take a copy of the database (also known as a backup) and start a new database instance on it. There are enough articles and documentation on backup procedures, so I will not dwell on this point. The issue with the obvious solution is performance: copying a large amount of data takes a lot of time. If we want to use database forks to refresh our staging or (confidentiality requirements permitting) testing environments, we can't afford to wait several hours each time we do this.

We need to somehow take a copy without actually taking the time to copy everything. The good news is that, with a little help from your storage layer, this is possible. The technical term for what we are looking for is a copy-on-write snapshot: a feature provided by specialized storage devices, but also in software by the Linux LVM layer and by the Btrfs and ZFS filesystems.

For this demonstration I will go with Btrfs: it is the simplest to set up, and it is also what I happened to have lying around. I have a 99% full RAID1 Btrfs filesystem mounted at /mnt/data, backed by three Western Digital Green spinning disks (<6000 RPM). So basically a storage system about as slow as you can make it. First I set up a database to run our tests on: I create a new subvolume for the master database, initialize a new PostgreSQL cluster in it and start it up on port 6000.
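Something along these lines, assuming the subvolume lives at /mnt/data/master and everything runs as the postgres user:

    # Create a Btrfs subvolume for the master cluster:
    sudo btrfs subvolume create /mnt/data/master
    sudo chown postgres:postgres /mnt/data/master

    # Initialize a fresh cluster and start it on port 6000:
    initdb -D /mnt/data/master
    pg_ctl -D /mnt/data/master -l /tmp/master.log -o "-p 6000" start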

Now that we have a master server running, let's generate some dummy data in it. I will use pgbench to do this.
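Something like the following; the scale factor of 1000 is my assumption to end up with roughly 15GB of data:

    # Create the test database and fill it with pgbench's schema:
    createdb -p 6000 pgbench
    pgbench -i -s 1000 -p 6000 pgbench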

At this point I have a master database with 15GB of data in it. Let's also modify some data in the master so we can track our forks. We will set the filler column on a row in the pgbench_branches table to do this.
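For example (the marker value is arbitrary):

    psql -p 6000 pgbench \
        -c "UPDATE pgbench_branches SET filler = 'master' WHERE bid = 1;"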

The normal PostgreSQL hot backup procedure is to start your backup with pg_start_backup(), copy the database contents over, end the backup with pg_stop_backup() and then copy the xlogs over. However, Btrfs supports atomic snapshots. With an atomic snapshot we can just create a snapshot and get the exact same contents we would have had if we had SIGKILL'ed PostgreSQL at that point in time and taken a regular copy. PostgreSQL's durability mechanisms ensure that we get a consistent state that includes everything that had successfully committed at that time and nothing more. The backup management commands are not required in our case.

To make things more interesting, I will start a workload on the master in a second shell to demonstrate that we can easily do this on a production database. 16 clients trying to keep a steady load of 50 transactions per second should do it. 50 transactions per second is not a lot, but it is a considerable challenge for the slow storage system we are using here.
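A pgbench invocation along these lines should do it; the thread count, duration and progress interval are my assumptions:

    # 16 clients, rate-limited to 50 transactions per second in total:
    pgbench -p 6000 -c 16 -j 4 -R 50 -T 3600 -P 5 pgbench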

We are ready to fork the database, let's go ahead and do it:
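(A sketch; I am assuming the fork is to live at /mnt/data/fork1.)

    # An atomic, copy-on-write snapshot of the running cluster:
    time sudo btrfs subvolume snapshot /mnt/data/master /mnt/data/fork1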

And the performance stats from that point in time:

So we managed to fork a 15GB database in 6 seconds, with only a small hiccup in performance. We are ready to start up the forked database.

To start the fork we have a couple of things to do. First, we have to remove the pid file of the master. Usually PostgreSQL can remove it on its own, but because the master is still running on the same machine it isn't convinced that is safe in this case, so we have to do it manually. Second, we have to configure the fork to run on a different port than the master. When this is done, we can start the server up and observe that it successfully recovers from a crash:
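(A sketch, assuming the fork runs on port 6001.)

    # The pid file was inherited from the still-running master:
    rm /mnt/data/fork1/postmaster.pid

    # Start the fork on its own port; the server log should show
    # automatic crash recovery kicking in:
    pg_ctl -D /mnt/data/fork1 -l /tmp/fork1.log -o "-p 6001" start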

Next we can verify that we indeed have two different databases running. Let's update our tracking row in both the master and fork databases and verify that they are independent.

Let's make a change on the forked database:
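(Again, the marker value is arbitrary.)

    psql -p 6001 pgbench \
        -c "UPDATE pgbench_branches SET filler = 'fork1' WHERE bid = 1;"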

And check that master database still has the old data (except the bbalance column that pgbench has updated):
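(A query along these lines; the master should still show its original marker.)

    psql -p 6000 pgbench \
        -c "SELECT bid, bbalance, filler FROM pgbench_branches WHERE bid = 1;"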

With this we have a recipe for successfully forking a running production database with only a minor hiccup. You probably don't want to run your staging tests or dev environment on the same machine as the master, though. To have your forks on a separate machine, run a streaming replication standby on your staging or dev environment and fork new databases off that standby. Just don't forget to remove or replace recovery.conf; otherwise your snapshot will still be running as a standby.

It's also advisable to switch the fork to a new timeline, so that PostgreSQL knows it's a fork and will give you errors if you try to do something stupid, like having the fork replicate from the master (or vice versa). To do this, create a recovery.conf that contains the single line restore_command = '/bin/false'. This switches PostgreSQL to point-in-time-recovery mode (as opposed to regular crash recovery), creating a timeline switch at the end of the transaction log. /bin/false is there to signify that there is no archive from which to fetch additional transaction logs.
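In other words, before starting the fork:

    echo "restore_command = '/bin/false'" > /mnt/data/fork1/recovery.conf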

Happy forking.

In case you need any assistance, please feel free to contact us.
 


