If you are running PostgreSQL in production you might have already noticed that adjusting checkpoints can be a very beneficial thing to tune the server and to improve database performance overall. However, there is more to this: Increasing checkpoint distances can also help to actually reduce the amount of WAL that is created in the first place. That is right. It is not just a performance issue, but it also has an impact on the volume of log created.
You want to know why? Let us take a closer look!
Creating a test installation
For my test I am using a normal PostgreSQL 12 installation. To do that I am running initdb by hand:
% initdb -D /some_where/db12 restart
Then I have adjusted two parameters in postgresql.conf:
max_wal_size = 128MB synchronous_commit = off
and restarted the server:
% pg_ctl -D /some_where/db12 restart
First of all I have reduced max_wal_size to make sure that PostgreSQL checkpoint a lot more often than this would be the case in the default installation. Performance will drop like a stone. The second parameter is to tell PostgreSQL to use asynchronous commits. This does not impact WAL generation but will simply speed up the benchmark to get results faster.
Finally I am using pgbench to create a test database containing 10 million rows (= 100 x 100.000)
% createdb test % pgbench -s 100 -i test dropping old tables... NOTICE: table "pgbench_accounts" does not exist, skipping NOTICE: table "pgbench_branches" does not exist, skipping NOTICE: table "pgbench_history" does not exist, skipping NOTICE: table "pgbench_tellers" does not exist, skipping creating tables... generating data... 100000 of 10000000 tuples (1%) done (elapsed 0.13 s, remaining 12.90 s) 200000 of 10000000 tuples (2%) done (elapsed 0.28 s, remaining 13.79 s) … 9700000 of 10000000 tuples (97%) done (elapsed 12.32 s, remaining 0.38 s) 9800000 of 10000000 tuples (98%) done (elapsed 12.46 s, remaining 0.25 s) 9900000 of 10000000 tuples (99%) done (elapsed 12.59 s, remaining 0.13 s) 10000000 of 10000000 tuples (100%) done (elapsed 12.68 s, remaining 0.00 s) vacuuming... creating primary keys... done.
Running a PostgreSQL benchmark
Let us run a test now: We want to run 10 x 1 million transactions. If you want to repeat the tests on your database make sure the test runs long enough. Let us take a look at the test!
% pgbench -c 10 -j 5 -t 1000000 test starting vacuum...end. transaction type: <builtin: TPC-B (sort of)> scaling factor: 100 query mode: simple number of clients: 10 number of threads: 5 number of transactions per client: 1000000 number of transactions actually processed: 10000000/10000000 latency average = 0.730 ms tps = 13699.655677 (including connections establishing) tps = 13699.737288 (excluding connections establishing)
As you can see we managed to run roughly 13.700 transactions per second (on my personal laptop). Let us take a look at the amount of WAL created:
test=# SELECT pg_current_wal_insert_lsn(); pg_current_wal_insert_lsn --------------------------- 13/522921E8 (1 row)
The way to figure out the current WAL position is to call the pg_current_insert_lsn() function. To most people this format might be incomprehensible and therefore it is easier to transform the value into a readable number:
test=# SELECT pg_current_wal_insert_lsn() - '0/00000000'::pg_lsn; ?column? ------------- 82982805992 (1 row)
The pg_size_pretty() function will return the number in a readily understandable format:
test=# SELECT pg_size_pretty(pg_current_wal_insert_lsn() - '0/00000000'::pg_lsn); pg_size_pretty ---------------- 77 GB (1 row)
Increasing checkpoint distances
Let us stop the database, delete the data directory and initialize it again. This time my postgresql.conf parameters will be set to way higher values:
max_wal_size = 150GB checkpoint_timeout = 1d synchronous_commit = off
Then we start the database again, populate it with data and run pgbench with the exact same parameters. Keep in mind that to repeat the process it is necessary to run a fixed number of transactions instead of running the test for a certain time. This way we can fairly compare the amount of WAL created.
Let us take a look and see how much WAL has been created this time:
test=# SELECT pg_current_wal_insert_lsn(); pg_current_wal_insert_lsn --------------------------- 1/4F210328 (1 row) test=# SELECT pg_size_pretty(pg_current_wal_insert_lsn() - '0/00000000'::pg_lsn); pg_size_pretty ---------------- 5362 MB (1 row)
Wow, only 5.3 GB of WAL. This is a major improvement but why did it happen?
Understanding WAL creation
The idea of the transaction log is to make sure that in case of a crash the system will be able to recover nicely. However, PostgreSQL cannot keep the transaction log around forever. The purpose of a checkpoint is ensuring that all data is safe in the data file before the WAL can be removed from disk.
Here comes the trick: If a block is modified after a checkpoint for the first time it has to be logged completely. The entire block will be written to the WAL. Every additional change does not require the entire block to be written – only the changes have to be logged which is, of course, a lot less.
This is the reason for the major difference we see here. By increasing the checkpoint distances to a really high value, we basically reduced checkpoints to an absolute minimum which in turn reduces the number of “first blocks” written.
Changing WAL settings is not just good for performance in the long run – it will also reduce the volume of WAL.
If you want to know more about PostgreSQL and want to learn something about VACUUM check out my blog about this important topic.