pg_resetwal: When to reset the WAL in PostgreSQL

10.2023 | Category: How To | Tags: backup and recovery, under the hood

10.2023

Category: How To

Tags: backup and recovery, under the hood

pg_resetwal (formerly known as pg_resetxlog) is a command provided by PostgreSQL which allows you to reset the WAL (Write Ahead Log) in case of trouble. It is a means of last resort to get a broken database server to start. However, what happens behind the scenes when this command is called? Is it dangerous? Required? Useful or just a “nice to have” kind of thing? Let’s dive in and figure it out.

Table of Contents

The purpose of the WAL in PostgreSQL

Before discussing pg_resetwal, we need to talk about the purpose of WAL in the first place:

Let’s consider this basic transaction:

BEGIN;
INSERT INTO a VALUES (1);
INSERT INTO a VALUES (2);
 …

1

2

3

4

BEGIN;

INSERT INTO a VALUES (1);

INSERT INTO a VALUES (2);

…

In case the database crashes while inserting the second row, we are in trouble. Our data might only be half written (e.g. only half of an 8k block, missing index entries, etc.) which causes a problem, since our data files would face instant corruption (just like MyISAM got corrupted in the old days). However, this is PostgreSQL and the goal is to protect data at all costs. Therefore PostgreSQL writes redo information to the WAL first, and uses it to fix the data files (= remove corruption) after a crash.

Note: During normal operations, data files are not supposed to be up to date, consistent and correct. Yes, you have just read this correctly: In case your database is under load your data files are usually inconsistent. If the WAL gets lost or corrupted, your database is (in case of a crash) simply broken, incomplete and inconsistent.

The rule is:

data files + WAL = consistent state
data files + shared buffers = consistent state

Your data files alone are not sufficient to ensure consistency.

`pg_resetwal` and risk

This is where the problem starts: If your WAL is so corrupted that you have to reset it, it usually means data loss or corruption because, remember, data files are inconsistent during normal operations. What pg_resetwal does is to remove the “instructions to fix those data files after a restart”.

In reality, this leads to some very important rules:

ALWAYS backup what you have BEFORE invoking pg_resetwal
Use it as a means of last resort
Do not call pg_resetwal carelessly
Expect data loss

Keep this warning in mind at all times, otherwise you put your data at risk!
This does not mean that pg_resetwal is always bad - it just means that you need to thoroughly understand it before you use it.

`pg_resetwal` and how to use it correctly

After this introduction, it’s time to take a look at the syntax of the pg_resetwal command:

postgres_support:~ hs$ pg_resetwal --help
pg_resetwal resets the PostgreSQL write-ahead log.

Usage:
  pg_resetwal [OPTION]... DATADIR

Options:
  -c, --commit-timestamp-ids=XID,XID
                                   set oldest and newest transactions bearing
                                   commit timestamp (zero means no change)
 [-D, --pgdata=]DATADIR            data directory
  -e, --epoch=XIDEPOCH             set next transaction ID epoch
  -f, --force                      force update to be done
  -l, --next-wal-file=WALFILE      set minimum starting location for new WAL
  -m, --multixact-ids=MXID,MXID    set next and oldest multitransaction ID
  -n, --dry-run                    no update, just show what would be done
  -o, --next-oid=OID               set next OID
  -O, --multixact-offset=OFFSET    set next multitransaction offset
  -u, --oldest-transaction-id=XID  set oldest transaction ID
  -V, --version                    output version information, then exit
  -x, --next-transaction-id=XID    set next transaction ID
      --wal-segsize=SIZE           size of WAL segments, in megabytes
  -?, --help                       show this help, then exit

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

postgres_support:~ hs$ pg_resetwal --help

pg_resetwal resets the PostgreSQL write-ahead log.

Usage:

pg_resetwal [OPTION]... DATADIR

Options:

-c, --commit-timestamp-ids=XID,XID

set oldest and newest transactions bearing

commit timestamp (zero means no change)

[-D, --pgdata=]DATADIR data directory

-e, --epoch=XIDEPOCH set next transaction ID epoch

-f, --force force update to be done

-l, --next-wal-file=WALFILE set minimum starting location for new WAL

-m, --multixact-ids=MXID,MXID set next and oldest multitransaction ID

-n, --dry-run no update, just show what would be done

-o, --next-oid=OID set next OID

-O, --multixact-offset=OFFSET set next multitransaction offset

-u, --oldest-transaction-id=XID set oldest transaction ID

-V, --version output version information, then exit

-x, --next-transaction-id=XID set next transaction ID

--wal-segsize=SIZE size of WAL segments, in megabytes

-?, --help show this help, then exit

Note is that pg_resetwal is called when the database is not able to start up. You have to point to a data directory and then you can set a few variables.

The problem is: The WAL is not supposed to have less data than the data directory ($PGDATA). Therefore setting the status of the WAL back will lead to “interesting” results. You might face duplicate primary key entries, missing index pointers and a lot of other difficulties you have never even heard of.

Living after `pg_resetwal`

In case you really have to reset the system, it is super important to properly handle the database afterwards. Usually it is not a good idea to consider the content of the database to be trustworthy. Therefore it is a good idea to pg_dump / pg_restore the database (in case this is still possible). A binary backup is NOT what you are looking for, since you would merely create a binary copy of the broken database - which is broken too!

It is therefore a really good idea to use pg_dump to see what can be extracted from the database and what not. In fact: It is quite likely that pg_dump will error out after pg_resetwal as the database might be, well, broken. All you can do in this case is to use manual copy and restore whatever you can extract from the database, step-by-step.

Finally …

You might be interested in this article that demonstrates how pg_resetwal can be used to cause data corruption.

If you want to learn more about PostgreSQL and how to spot performance problems, see our performance blogs for more great tips. We cover the latest information to support PostgreSQL and to help people use their systems to the greatest advantage. If you want to know more about PostgreSQL support and PostgreSQL consulting - reach out to us.

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

pg_resetwal: When to reset the WAL in PostgreSQL

The purpose of the WAL in PostgreSQL

`pg_resetwal` and risk

`pg_resetwal` and how to use it correctly

Living after `pg_resetwal`

Finally …

Leave a Reply Cancel reply

Hans-Jürgen Schönig

Blog Tags

NEWSLETTER

Articles by our PostgreSQL Experts

pg_resetwal: When to reset the WAL in PostgreSQL

The purpose of the WAL in PostgreSQL

pg_resetwal and risk

pg_resetwal and how to use it correctly

Living after pg_resetwal

Finally …

Leave a Reply Cancel reply

Hans-Jürgen Schönig

Blog Tags

NEWSLETTER

Articles by our PostgreSQL Experts

`pg_resetwal` and risk

`pg_resetwal` and how to use it correctly

Living after `pg_resetwal`