Exclusive backup deprecated - updated for PostgreSQL v15
© Laurenz Albe 2019

 

UPDATE 12.07.2022: The “exclusive backup” method of calling pg_start_backup('label') before backup and pg_stop_backup() afterwards has been removed in PostgreSQL v15.

This article describes the problems with the old method and discusses the options for those who still use this backup method. I include scripts below to help you.

The “exclusive” backup method

Before pg_basebackup was invented, there was only one online file-system level backup method:
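In its simplest form (and assuming WAL archiving is already set up), it looked like this:

    SELECT pg_start_backup('label');
    -- copy the data directory with an operating system tool
    -- (tar, rsync, a storage snapshot, ...)
    SELECT pg_stop_backup();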

This method is called exclusive because only one such backup can be performed simultaneously.

pg_start_backup creates a file backup_label in the data directory that contains the location of the checkpoint starting the backup. This makes sure that during startup, PostgreSQL does not recover from the latest checkpoint registered in pg_control. Doing so would cause data corruption, since the backup may contain data files from before that checkpoint. Don't forget that database activity, including checkpointing, continues normally in backup mode!

The problem with the exclusive backup method

This backup method can cause trouble if PostgreSQL or the operating system crash during backup mode.
When PostgreSQL starts up after such a crash, it will find the backup_label file and deduce that it is recovering a backup. There is no way to distinguish the data directory of a server that crashed while in backup mode from a backup!

Consequently, PostgreSQL will try to recover from the checkpoint in backup_label. Lacking a restore_command (configured in recovery.conf before v12, or in postgresql.conf together with a recovery.signal file from v12 on), it will resort to the transaction log (WAL) files in pg_wal (pg_xlog on older versions).

But the database might have been in backup mode for a longer time before the crash. If there has been enough data modification activity in that time, the WAL segment with the starting checkpoint may already have been archived and removed.

The startup process will then fail with this error message:
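The exact wording differs between versions, but the message looks roughly like this:

    FATAL:  could not locate required checkpoint record
    HINT:  If you are not restoring from a backup, try removing the file ".../backup_label".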

You have to manually remove the backup_label file left behind from the failed backup to be able to restart PostgreSQL.
Today, in the age of automated provisioning, requiring manual intervention is even less tolerated than it used to be. So this behavior is not acceptable in many cases.

Overcoming the problem with pg_basebackup

In PostgreSQL 9.1, pg_basebackup was introduced, which provides a much simpler method to create an online file-system backup.

It introduced the “non-exclusive” backup method, meaning that several such backups can be performed at the same time. backup_label is not written to the data directory but added only to the backup. Consequently, pg_basebackup is not vulnerable to the problem described above.

pg_basebackup makes backups simple, but since it copies all data files via a single database connection, it can take too long to back up a large database.
To deal with such databases, you still had to resort to the “low-level backup API” provided by pg_start_backup and pg_stop_backup with all its problems.

The improved “low-level backup API”

Version 9.6 brought the non-exclusive backup to pg_start_backup and pg_stop_backup.
Backups can now be performed like this:
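Roughly like this - note that the session that calls pg_start_backup has to stay connected until the end of the backup (from v15 on, the functions are named pg_backup_start and pg_backup_stop):

    -- v9.6 to v14: the third argument FALSE selects the non-exclusive mode
    SELECT pg_start_backup('label', false, false);
    -- copy the data directory while this session stays open
    SELECT * FROM pg_stop_backup(false);
    -- save the returned "labelfile" column as backup_label in the backup

    -- from v15 on, only the non-exclusive variant exists:
    SELECT pg_backup_start('label');
    -- copy the data directory
    SELECT * FROM pg_backup_stop();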

Removal of the exclusive backup method

Since version 9.6, the documentation contained the following sentence:

The non-exclusive method is recommended and the exclusive one is deprecated and will eventually be removed.

If you are still using the exclusive backup method, DON'T PANIC.

PostgreSQL releases are supported for 5 years after their release date. So you have until 2026 (the end-of-life date for PostgreSQL v14) to adjust your backup scripts if you are using the exclusive backup method.

Using a pre-backup and post-backup script

Your backup may be driven by a company-wide backup software, or maybe you use snapshots on the storage subsystem to back up a large database.

In both cases, it is not unusual that the backup software offers to run a “pre-backup” and a “post-backup” command on the target machine. The pre-backup script prepares the machine for being backed up, and the post-backup script resumes normal operation.

In such a situation it is difficult to switch from exclusive backup to non-exclusive backup: You cannot easily keep the database session where you ran pg_start_backup open, because the backup will only start once the pre-backup script has ended. But you need to keep that session open, so that you can run pg_stop_backup in the same session to complete the backup!

People with such a backup scenario will probably find it hardest to move away from the exclusive backup method.

Pre-backup and post-backup scripts for non-exclusive backups

To overcome this problem, I have written pre- and post-backup scripts that use non-exclusive backups. They are available here.

They work by creating a table in the database postgres and a “co-process” that stays around after the pre-backup script has finished. The post-backup script notifies the co-process to complete the backup and write the contents of the backup_label file to the database table. You can get that information either from the standard output of the post-backup script or from the database table.

There is one last thing you have to do: you have to store the backup_label file with the checkpoint information along with the backup. The file must be present after the backup has been restored. Remember that if you start PostgreSQL on a restored data directory without the correct backup_label file, the result will be data corruption. This is because the pg_control file in the backup usually contains a later checkpoint than the one taken during pg_start_backup.

By Kaarel Moppel - Compare MySQL and PostgreSQL

First off - I'm not trying to kindle any flame wars here, just trying to broaden my (and your) horizons a bit, gather some ideas (maybe I'm missing out on something cool - MySQL is the most used open-source RDBMS after all) and to compare the two, even though doing that correctly and objectively is difficult. I'm also leaving performance comparisons aside and looking only at the available features, the general querying experience and the clarity of the documentation, as I guess these matter most for beginners. So this is just a list of points I made for myself, grouped in no particular order.

Disclaimer: The last time I used MySQL for a personal project was 10 years ago, so I'm basically starting from zero, and it only took a day and a half to get to know it - so if you see that I've gotten something screamingly wrong, please leave a comment and I'll change it. Also, my bias in this article probably tends to favour Postgres... but I'm pretty sure a MySQL veteran with good knowledge of its pros and cons could write up something similar about Postgres, so my hope is that you can leave this aside and learn a thing or two about either system.

To run MySQL, I used the official Docker image, version 8.0.14. "MySQL" below refers to the default InnoDB storage engine.

"mysql" CLI ("psql" equivalent) and general querying experience

* When the server requires a password, why doesn't it just ask for it?

MySQL positive findings

MySQL downsides

PostgreSQL architectural/conceptual/platform advantages

MySQL architectural/conceptual/platform advantages

Things I found weird in MySQL

A table alias in an aggregate can break a query:

Things I found very weird in MySQL

MySQL cool features that I would welcome in Postgres

My verdict on MySQL 8 vs PostgreSQL 11

First, again: the idea of this article is not to bash MySQL - it has shown a lot of progress recently with the latest version 8. Judging by the release notes, a lot of issues have been eliminated and cool features (e.g. CTEs, window functions) have been added, making it more enterprise-suitable. There is also much more activity on the source code repository compared to Postgres (according to www.openhub.net), and - even if it's a bit hard for a PostgreSQL consultant to acknowledge - it has many more installations and very good prospects to develop further thanks to solid financial backing, which is somewhat of a weak point for Postgres, since it is not owned by any single company (a good thing in other respects).

In summary, I recommend PostgreSQL for 99% of users needing a relational database, thanks to its lightweight nature, data integrity, and fewer surprises. The remaining 1% may require MySQL for global start-up scaling due to its multi-master support. PostgreSQL's constraints ensure data integrity, while MySQL requires constant vigilance from developers, who may overlook important safeguards under pressure. Additionally, PostgreSQL offers over 100 Foreign Data Wrappers for diverse data integration needs.

Hope you found something new and interesting for yourself, thanks for reading!


Over the years many people have asked for “timetravel” or “AS OF”-queries in PostgreSQL. Oracle has provided this kind of functionality for quite some time already. However, in the PostgreSQL world “AS OF timestamp” is not directly available. The question now is: How can we implement this vital functionality in user land and mimic Oracle functionality?

Implementing “AS OF” and timetravel in user land

Let us suppose we want to version a simple table consisting of just three columns: id, some_data1 and some_data2. To do this, we first have to install the btree_gist module, which adds some valuable operators we will need to manage time travel. The table storing the data will need an additional column to handle the validity of a row. Fortunately, PostgreSQL supports “range types”, which allow you to store ranges in an easy and efficient way. Here is how it works:
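A minimal sketch of what this can look like - the table and column names below (t_object, valid) are an assumption for illustration, and the validity is stored as a tstzrange:

    CREATE EXTENSION IF NOT EXISTS btree_gist;

    CREATE TABLE t_object
    (
        id          int8,
        some_data1  text,
        some_data2  text,
        valid       tstzrange,
        EXCLUDE USING gist (id WITH =, valid WITH &&)
    );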

Attention!

Mind the last line here: “EXCLUDE USING gist” ensures that rows with the same “id” must not have overlapping validity periods (“valid”). The idea is to ensure that the same “id” has only one entry at any given time. PostgreSQL will automatically create a GiST index on those columns. The feature is called an “exclusion constraint”. If you are looking for more information about this feature, consider checking out the official documentation (https://www.postgresql.org/docs/current/ddl-constraints.html).

How can I filter?

If you want to filter on some_data1 and some_data2 consider creating indexes. Remember, missing indexes are in many cases the root cause of bad performance:
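For example, plain B-tree indexes on the two payload columns (the index names are just for illustration):

    CREATE INDEX idx_some_data1 ON t_object (some_data1);
    CREATE INDEX idx_some_data2 ON t_object (some_data2);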

By creating a view, it should be super easy to extract data from the underlying tables:
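A possible definition, based on the table sketched above: a row is current if its validity range contains the current timestamp.

    CREATE VIEW t_object_current AS
        SELECT id, some_data1, some_data2
        FROM   t_object
        WHERE  current_timestamp <@ valid;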

For the sake of simplicity I have created a view which returns the most up-to-date state of the data. However, it should also be possible to select an old version of the data. To make it easy for application developers, I decided to introduce a new GUC (= runtime variable), which allows users to set the desired point in time. Here is how it works:
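Any "prefix.name" parameter can serve as such a variable; the name timetravel.as_of used below is an assumption, not a predefined setting:

    -- set the desired point in time for the current session
    SET timetravel.as_of = '2019-01-01 00:00:00';
    SHOW timetravel.as_of;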

Then you can create a second view, which returns the old data:
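Again just a sketch, reading the variable defined above with current_setting():

    CREATE VIEW t_object_asof AS
        SELECT id, some_data1, some_data2
        FROM   t_object
        -- current_setting(..., true) returns NULL (rather than an error)
        -- if the parameter has not been set in the session
        WHERE  current_setting('timetravel.as_of', true)::timestamptz <@ valid;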

It is of course also possible to do that with just one view. However, the code is easier to read if two views are used (for the purpose of this blog post). Feel free to adjust the code to your needs.

If you are running an application you usually don't care what is going on behind the scenes - you simply want to modify a table and things should take care of themselves in an easy way. Therefore, it makes sense to add a trigger to your t_object_current table, which takes care of versioning. Here is an example:
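A sketch of such a trigger, under the assumptions made above: since t_object_current is defined as a view here, the versioning logic goes into an INSTEAD OF trigger that closes the currently valid row and inserts a new version.

    CREATE FUNCTION version_trigger() RETURNS trigger AS
    $$
    BEGIN
        IF TG_OP IN ('UPDATE', 'DELETE') THEN
            -- close the validity range of the currently valid row
            UPDATE t_object
            SET    valid = tstzrange(lower(valid), current_timestamp)
            WHERE  id = OLD.id
              AND  current_timestamp <@ valid;
        END IF;

        IF TG_OP IN ('INSERT', 'UPDATE') THEN
            -- add the new version, valid from now on
            INSERT INTO t_object (id, some_data1, some_data2, valid)
            VALUES (NEW.id, NEW.some_data1, NEW.some_data2,
                    tstzrange(current_timestamp, NULL));
            RETURN NEW;
        END IF;

        RETURN OLD;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER tr_version
        INSTEAD OF INSERT OR UPDATE OR DELETE ON t_object_current
        FOR EACH ROW EXECUTE FUNCTION version_trigger();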

The trigger ensures that INSERT, UPDATE, and DELETE are all handled properly.

Finally: Timetravel made easy

It is obvious that versioning has an impact on performance. You should also keep in mind that UPDATE and DELETE become more expensive than before. The advantage, however, is that things are really easy from an application point of view. Time travel can be implemented quite generically, and most applications might not have to be changed at all. What is true, however, is that foreign keys need some special attention and are not easy to implement generically. Whether this kind of restriction is a problem depends on your application.

 


