pg_resetwal (formerly known as
pg_resetxlog) is a command provided by PostgreSQL which allows you to reset the WAL (Write Ahead Log) in case of trouble. It is a means of last resort to get a broken database server to start. However, what happens behind the scenes when this command is called? Is it dangerous? Required? Useful or just a “nice to have” kind of thing? Let’s dive in and figure it out.
The purpose of the WAL in PostgreSQL
pg_resetwal, we need to talk about the purpose of WAL in the first place:
Let’s consider this basic transaction:
BEGIN; INSERT INTO a VALUES (1); INSERT INTO a VALUES (2); <crash> …
In case the database crashes while inserting the second row, we are in trouble. Our data might only be half written (e.g. only half of an 8k block, missing index entries, etc.) which causes a problem, since our data files would face instant corruption (just like MyISAM got corrupted in the old days). However, this is PostgreSQL and the goal is to protect data at all costs. Therefore PostgreSQL writes redo information to the WAL first, and uses it to fix the data files (= remove corruption) after a crash.
Note: During normal operations, data files are not supposed to be up to date, consistent and correct. Yes, you have just read this correctly: In case your database is under load your data files are usually inconsistent. If the WAL gets lost or corrupted, your database is (in case of a crash) simply broken, incomplete and inconsistent.
The rule is:
- data files + WAL = consistent state
- data files + shared buffers = consistent state
Your data files alone are not sufficient to ensure consistency.
pg_resetwal and risk
This is where the problem starts: If your WAL is so corrupted that you have to reset it, it usually means data loss or corruption because, remember, data files are inconsistent during normal operations. What
pg_resetwal does is to remove the “instructions to fix those data files after a restart”.
In reality, this leads to some very important rules:
- ALWAYS backup what you have BEFORE invoking
- Use it as a means of last resort
- Do not call
- Expect data loss
Keep this warning in mind at all times, otherwise you put your data at risk!
This does not mean that
pg_resetwal is always bad – it just means that you need to thoroughly understand it before you use it.
pg_resetwal and how to use it correctly
After this introduction, it’s time to take a look at the syntax of the
postgres_support:~ hs$ pg_resetwal --help pg_resetwal resets the PostgreSQL write-ahead log. Usage: pg_resetwal [OPTION]... DATADIR Options: -c, --commit-timestamp-ids=XID,XID set oldest and newest transactions bearing commit timestamp (zero means no change) [-D, --pgdata=]DATADIR data directory -e, --epoch=XIDEPOCH set next transaction ID epoch -f, --force force update to be done -l, --next-wal-file=WALFILE set minimum starting location for new WAL -m, --multixact-ids=MXID,MXID set next and oldest multitransaction ID -n, --dry-run no update, just show what would be done -o, --next-oid=OID set next OID -O, --multixact-offset=OFFSET set next multitransaction offset -u, --oldest-transaction-id=XID set oldest transaction ID -V, --version output version information, then exit -x, --next-transaction-id=XID set next transaction ID --wal-segsize=SIZE size of WAL segments, in megabytes -?, --help show this help, then exit
Note is that
pg_resetwal is called when the database is not able to start up. You have to point to a data directory and then you can set a few variables.
The problem is: The WAL is not supposed to have less data than the data directory (
$PGDATA). Therefore setting the status of the WAL back will lead to “interesting” results. You might face duplicate primary key entries, missing index pointers and a lot of other difficulties you have never even heard of.
In case you really have to reset the system, it is super important to properly handle the database afterwards. Usually it is not a good idea to consider the content of the database to be trustworthy. Therefore it is a good idea to
pg_restore the database (in case this is still possible). A binary backup is NOT what you are looking for, since you would merely create a binary copy of the broken database – which is broken too!
It is therefore a really good idea to use
pg_dump to see what can be extracted from the database and what not. In fact: It is quite likely that
pg_dump will error out after
pg_resetwal as the database might be, well, broken. All you can do in this case is to use manual copy and restore whatever you can extract from the database, step-by-step.
You might be interested in this article that demonstrates how
pg_resetwal can be used to cause data corruption.
If you want to learn more about PostgreSQL and how to spot performance problems, see our performance blogs for more great tips. We cover the latest information to support PostgreSQL and to help people use their systems to the greatest advantage. If you want to know more about PostgreSQL support and PostgreSQL consulting – reach out to us.