vacuuming insert-only carpets
© Laurenz Albe 2020

Most people know that autovacuum is necessary to get rid of dead tuples. These dead tuples are a side effect of PostgreSQL’s MVCC implementation. So many people will be confused when they read that from PostgreSQL v13 on, commit b07642dbc adds support for autovacuuming insert-only tables (also known as “append-only tables”).

This article explains the reasons behind that and gives some advice on how to best use the new feature. It will also explain how to achieve similar benefits in older PostgreSQL releases.

Note that all that I say here about insert-only tables also applies to insert-mostly tables, which are tables that receive only few updates and deletes.

How insert-triggered autovacuum works

From v13 on, PostgreSQL will gather statistics on how many rows were inserted since a table last received a VACUUM. You can see this new value in the new “n_ins_since_vacuum” column of the pg_stat_all_tables catalog view (and in pg_stat_user_tables and pg_stat_sys_tables).

Autovacuum runs on a table whenever that count exceeds a certain value. This value is calculated from the two new parameters “autovacuum_vacuum_insert_threshold” (default 1000) and “autovacuum_vacuum_insert_scale_factor” (default 0.2) as follows:

insert_threshold + insert_scale_factor * reltuples

where reltuples is the estimate for the number of rows in the table, taken from the pg_class catalog.

Like other autovacuum parameters, you can override autovacuum_vacuum_insert_threshold and autovacuum_vacuum_insert_scale_factor with storage parameters of the same name for individual tables. You can disable the new feature by setting autovacuum_vacuum_insert_threshold to -1.

You can use “toast.autovacuum_vacuum_insert_threshold” and “toast.autovacuum_vacuum_insert_scale_factor” to change the parameters for the associated TOAST table.

Use case 1: “anti-wraparound” vacuum on insert-only tables

Why do insert-only tables need VACUUM?

PostgreSQL stores transaction IDs in the xmin and xmax system columns to determine which row version is visible to which query. These transaction IDs are unsigned 4-byte integer values, so after slightly more than 4 billion transactions the counter hits the upper limit. Then it “wraps around” and starts again at 3.

As described in this blog post, that method would cause data loss after about 2 billion transactions. So old table rows must be “frozen” (marked as unconditionally visible) before that happens. This is one of the many jobs of the autovacuum daemon.

Why anti-wraparound vacuum on insert-only tables can be a problem

The problem is that PostgreSQL only triggers such “anti-wraparound” runs once the oldest unfrozen table row is more than 200 million transactions old. For an insert-only table, this is normally the first time ever that autovacuum runs on a table. There are two potential problems with that:

  • The anti-wraparound vacuum doesn’t get done in time. Then, one million transactions before it would suffer data corruption, PostgreSQL will not accept any new transactions. You have to start it in single-user mode and run VACUUM manually. You can find a description of such cases in this and this blog.
  • Different from other autovacuum runs, anti-wraparound autovacuum will not give up when it blocks a concurrent transaction. This will block even short operations that require an ACCESS EXCLUSIVE lock (like DDL statements on the table). Such a blocked operation will block all other access to the table, and processing comes to a standstill. You can find such a case described in this blog.

How to protect yourself from disruptive anti-wraparound vacuums

From PostgreSQL v13 on, the default settings should already protect you from this problem. This was indeed the motivation behind the new feature.

If it is truly an insert-only table, you should also set the vacuum_freeze_min_age storage option to 0 for that table, so that autovacuum freezes as many tuples as it can:

ALTER TABLE mytable SET (
   vacuum_freeze_min_age = 0
);

With a table that also receives some updates and deletes, you should use a higher value of vacuum_freeze_min_age to avoid unnecessary I/O.

For PostgreSQL versions older than v13, you can achieve a similar effect by triggering anti-wraparound vacuum earlier, so that it becomes less disruptive. For example, if you want to vacuum a table every 100000 transactions, you can set this additional storage parameter:

ALTER TABLE mytable SET (
   autovacuum_freeze_max_age = 100000,
   vacuum_freeze_min_age = 0
);

Use case 2: index-only scans on insert-only tables

How index-only scans work in PostgreSQL

As mentioned above, each row contains the information for which transactions it is visible. However, the index does not contain this information. Now if you consider an SQL query like this:

SELECT count(*) FROM mytables WHERE id < 100;

where you have an index on id, all the information you need is available in the index. So you should not need to fetch the actual table row (“heap fetch”), which is the expensive part of an index scan. But unfortunately you have to visit the table row anyway, just to check if the index entry is visible or not.

To work around that, PostgreSQL has a shortcut that makes index-only scans possible: the visibility map. This data structure stores two bits per 8kB table block, one of which indicates if all rows in the block are visible to all transactions. If a query scans an index entry and finds that the block containing the referenced table row is all-visible, it can skip checking visibility for that entry.

So you can have index-only scans in PostgreSQL if most blocks of a table are marked all-visible in the visibility map.

The problem with index-only scans on insert-only tables

Since VACUUM removes dead tuples, which is required to make a table block all-visible, it is also VACUUM that updates the visibility map. So to have most blocks all-visible in order to get an index-only scan, VACUUM needs to run on the table often enough.

Now if a table receives enough UPDATEs or DELETEs, you can set autovacuum_vacuum_scale_factor to a low value like 0.005. Then autovacuum will keep the visibility map in good shape.

But with an insert-only table, it is not as simple to get index-only scans before PostgreSQL v13. One report of a problem related to that is here.

How to get index-only scans on insert-only tables

From PostgreSQL v13 on, all you have to do is to lower autovacuum_vacuum_insert_scale_factor on the table:

ALTER TABLE mytable SET (
   autovacuum_vacuum_insert_scale_factor = 0.005
);

In older PostgreSQL versions, this is more difficult. You have two options:

  • schedule regular VACUUM runs with cron or a different scheduler
  • set autovacuum_freeze_max_age low for that table, so that autovacuum processes it often enough

Use case 3: hint bits on insert-only tables

In PostgreSQL, the first query that reads a newly created row has to consult the commit log to figure out if the transaction that created the row was committed or not. It then sets a hint bit on the row that persists that information. That way, the first reader saves future readers the effort of checking the commit log.

As a consequence, the first reader of a new row “dirties” (modifies in memory) the block that contains it. If a lot of rows were recently inserted in a table, that can cause a performance hit for the first reader. Therefore, it is considered good practice in PostgreSQL to VACUUM a table after you insert (or COPY) a lot of rows into it.

But people don’t always follow that recommendation. Also, if you want to write software that supports several database systems, it is annoying to have to add special cases for individual systems. With the new feature, PostgreSQL automatically vacuums insert-only tables after large inserts, so you have one less thing to worry about.

Future work

During the discussion for the new feature we saw that there is still a lot of room for improvement. Autovacuum is already quite complicated (just look at the many configuration parameters) and still does not do everything right. For example, truly insert-only tables would benefit from freezing rows right away. On the other hand, for tables that receive some updates or deletes as well as for table partitions that don’t live long enough to reach wraparound age, such aggressive freezing can lead to unnecessary I/O activity.

One promising idea Andres Freund propagated was to freeze all tuples in a block whenever the block becomes dirty, that is, has to be written anyway.

The fundamental problem is that autovacuum serves so many different purposes. Basically, it is the silver bullet that should solve all of the problems of PostgreSQL’s MVCC architecture. That is why it is so complicated. However, it would take a major redesign to improve that situation.

Conclusion

While it seems to be an oxymoron at first glance, autovacuum for insert-only tables mitigates several problems that large databases used to suffer from.

In a world where people collect “big data”, it becomes even more important to keep such databases running smoothly. With careful tuning, that was possible even before PostgreSQL v13. But autovacuum is not simple to tune, and many people lack the required knowledge. So it is good to have new autovacuum functionality that takes care of more potential problems automatically.