CYBERTEC PostgreSQL Logo

When The Dead Won't Die

05.2025 / Category: / Tags: |


VACUUM is one of the most important maintenance tasks in a PostgreSQL database. Over the years, many improvements were made to VACUUM and especially autovacuum to meet heavy workloads and make PostgreSQL fit for future tasks.

One of the optimizations is VACUUM skipping indexes and just focusing on heap tuples. This saves time and makes VACUUM/autovacuum more aggressive at the cost of leaving indexes aside. This behavior can be controlled by the INDEX_CLEANUP ON|OFF|AUTO option to VACUUM, introduced in PostgreSQL 12 and improved in PostgreSQL 14 to be used automatically in appropriate cases. There is also a relation storage parameter that allows controlling the behavior on table level.

Confusing metrics

Recently, a customer asked for advice, since he has problems interpreting metrics from his database monitoring that he couldn't explain. For longer periods, VACUUM and autovacuum respectively don't seem to free dead tuples, causing them to show up in metrics. Especially when doing this manually, he keeps seeing dead tuples in e.g. pg_stat_user_tables system view, but only sometimes. And even more strangely, they seem to reoccur after manual VACUUM (ANALYZE) ...

Some Page Layout Background

PostgreSQL has an advanced technique to place tuples into pages, which are 8kB in size (if not compiled otherwise). Tuples are addressed not directly within their physical location within that page, but rather pointed-to by so-called line pointers. Each tuple gets a special flag showing its purpose and lifetime, like the following:

An LP_NORMAL flag indicates that the line pointer is currently in use and points to a valid tuple. LP_UNUSED shows that this line pointer can immediately be reused by a new tuple. LP_DEAD is an interesting flag though: the line pointer flagged this way doesn't necessarily point to a valid tuple anymore (for example, no space used anymore), but is still mandatory and can't be reused at the moment.

LP_REDIRECT is something special: the tuple pointed to by such a flagged line pointer is the first of a Heap Only Tuple (HOT) tuple chain. Read this excellent article by Laurenz Albe on HOT updates in PostgreSQL for better performance.

The following graphic illustrates the described page layout in a simplified form:

As already mentioned, the PostgreSQL page layout has an array at the beginning of each page right after the page header. This line pointer array can grow from the beginning up to MaxHeapTuplesPerPage, which is platform dependent. Each entry in this array, aka line pointer, points to a specific tuple within the page. Tuples are stored from the end of the page (leaving room for some reserved space), while the line pointer array grows from the beginning. This makes it very flexible in placing and addressing tuple space within the page. A line pointer is a structure which, along with other properties, stores the specific flags in the lp_flag member.

Skip Indexes

The cause of this previously mentioned behavior is the "skip index" optimization in combination with a manual VACUUM (ANALYZE) command. Since PostgreSQL 14 (auto)VACUUM is able to perform the "skip indexes" optimization automatically, whether this optimization is used during vacuum is determined by the following formula (via source):

When VACUUM or autovacuum decide to skip vacuuming indexes, they cannot kill dead heap tuples entirely. Since the index tuples are still pointing to these tuples, they have to keep at least the line pointer entry in the page.

So to simplify things: If there aren't already more than 2 percent of LP_DEAD tuples in the table and the TID storage is not yet exceeding 32 MB, then (auto)VACUUM will skip vacuuming the indexes.

By skipping the indexes, VACUUM can save some significant work, depending on the index size and number of indexes of a relation. However, it can't just free the line pointers within the table, since index entries are still pointing to the tuples. In this case, VACUUM marks these line pointers as LP_DEAD, indicating that they are still in use, but considered dead and will be probably be freed soon. This also happens during normal VACUUM operation, but in the 2nd phase of each VACUUM run they will finally be set to LP_UNUSED.

The described discrepancy in PostgreSQL dead tuple metrics now happens if VACUUM is combined with ANALYZE, or if an ANALYZE is run some time after the last VACUUM with skipped index cleanup. ANALYZE still recognizes the tuples marked LP_DEAD as dead rows, which might cause them to suddenly reappear in the system metrics. This isn't a problem, but it is exactly this surprising condition that happens for people monitoring their PostgreSQL instance very closely.

Example

The following example demonstrates the described behavior. First, create a small test table and insert a bunch of tuples:

Since autovacuum kicks in within autovacuum_naptime and we just populated the table, we wait until it has done its initial job:

Initially, the pg_stat_user_tables view shows the following statistics for this table now:

Now, we decide to delete a bunch of tuples. To showcase the VACUUM skip-index optimization, we try to keep the number of those tuples below the 2 percent page threshold. Normally autovacuum_vacuum_scale_factor is set to 0.2 per default, so this prevents autovacuum from kicking in, again.

This now shows 1000 dead tuples. To showcase, we issue an explicit VACUUM now, which should skip the index associated with the primary key constraint defined on the id column. We use the VERBOSE option to print some additional information, but especially one interesting line:

So VACUUM tells us that it has skipped the indexes. Let's investigate the statistics of this table again:

This is what we expected. However, we now want to ANALYZE this table, too, so that we have current statistics for the optimizer:

Nothing else happened in between, but the statistics are showing the dead tuples again. Also note that ANALYZE (VERBOSE) reported that it has found 1000 dead tuples. This is because VACUUM cannot free the tuples immediately, as indexes are still addressing the line pointers within the pages of the table it belongs to. ANALYZE still recognizes them though, because they're relevant for the data distribution on a page and thus important in order to calculate the costs of a query properly. Note: tuples marked as LP_DEAD are gone, but we can't delete their line pointers until the index is cleaned up accordingly. So given the right circumstances it might seem to you that dead tuples believed to have been successfully deleted suddenly "reappear", which is just a detail of the skip-index optimization of VACUUM. Once VACUUM finds the number of tuples needed to exceed the index cleanup threshold, those line pointers will be freed for reuse again, too.

Summary

VACCUM and autovacuum can skip cleaning up indexes since PostgreSQL 14 if they find that the number of dead tuples is below a specific threshold. This is to save on unnecessary work, since cleaning up indexes can be an expensive operation and it’s better to aggregate some amount of work to do that. This optimization can lead to some confusing behavior of reappearing dead tuples in the table statistics (e.g. pg_stat_user_tables view). If DBAs find this optimization to be suboptimal for their workload, they can force index cleanups via the INDEX_CLEANUP ON parameter for manual VACUUM, or even deactivate that optimization on table level:

Though, I highly recommend reviewing this setting carefully to determine if it’s really needed. Otherwise VACUUM does too much unnecessary work, which might have negative impacts on very large tables and keeping up with potential high write traffic there.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2025
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram