CYBERTEC Logo

Four reasons why VACUUM won't remove dead rows from a table

03.2018 / Category: / Tags: | |

Why vacuum?

Whenever rows in a PostgreSQL table are updated or deleted, dead rows are left behind. VACUUM gets rid of them so that the space can be reused. If a table doesn't get vacuumed, it will get bloated, which wastes disk space and slows down sequential table scans (and – to a smaller extent – index scans).

VACUUM also takes care of freezing table rows so to avoid problems when the transaction ID counter wraps around, but that's a different story.

Normally you don't have to take care of all that, because the autovacuum daemon built into PostgreSQL does it for you. To find out more about enabling and disabling autovacuum, read this post.

Problems with vacuum: bloated tables

In case your tables bloat, the first thing you check is whether autovacuum processed them or not:

If your bloated table does not show up here, n_dead_tup is zero and last_autovacuum is NULL, you might have a problem with the statistics collector.

If the bloated table is right there on top, but last_autovacuum is NULL, you might need to configure autovacuum to be more aggressive so that it finishes the table.

But sometimes the result will look like this:

Here autovacuum ran recently, but it didn't free the dead tuples!

We can verify the problem by running VACUUM (VERBOSE):

Why won't VACUUM remove the dead rows?

VACUUM only removes those row versions (also known as “tuples”) that are not needed any more. A tuple is not needed if the transaction ID of the deleting transaction (as stored in the xmax system column) is older than the oldest transaction still active in the PostgreSQL database. (Or, in the whole cluster for shared tables).

This value (22300 in the VACUUM output above) is called the “xmin horizon”.

There are three things that can hold back this xmin horizon in a PostgreSQL cluster:

  1. Long-running transactions and VACUUM:

    You can find those and their xmin value with the following query:

    You can use the pg_terminate_backend() function to terminate the database session that is blocking your VACUUM.

     

  2. Abandoned replication slots and VACUUM:

    A replication slot is a data structure that keeps the PostgreSQL server from discarding information that is still needed by a standby server to catch up with the primary.

    If replication is delayed or the standby server is down, the replication slot will prevent VACUUM from deleting old rows.

    You can find all replication slots and their xmin value with this query:

    Use the pg_drop_replication_slot() function to drop replication slots that are no longer needed.

    Note: This can only happen with physical replication if hot_standby_feedback = on. For logical replication there is a similar hazard, but only it only affects system catalogs. Examine the column catalog_xmin in that case.

     

  3. Orphaned prepared transactions and VACUUM:

    During two-phase commit, a distributed transaction is first prepared with the PREPARE statement and then committed with the COMMIT PREPARED statement.

    Once Postgres prepares a transaction, the transaction is kept “hanging around” until it Postgres commits it or aborts it. It even has to survive a server restart! Normally, transactions don't remain in the prepared state for long, but sometimes things go wrong and the administrator has to remove a prepared transaction manually.

    You can find all prepared transactions and their xmin value with the following query:

    Use the ROLLBACK PREPARED SQL statement to remove prepared transactions.

     

  4. Standby server with hot_standby_feedback = on and VACUUM:

    Normally, the primary server in a streaming replication setup does not care about queries running on the standby server. Thus, VACUUM will happily remove dead rows which may still be needed by a long-running query on the standby, which can lead to replication conflicts. To reduce replication conflicts, you can set hot_standby_feedback = on on the standby server. Then the standby will keep the primary informed about the oldest open transaction, and VACUUM on the primary will not remove old row versions still needed on the standby.

    To find out the xmin of all standby servers, you can run the following query on the primary server:

Read more about PostgreSQL table bloat and autocommit in my post here.

5 2 votes
Article Rating
Subscribe
Notify of
guest
8 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
PgM
PgM
1 year ago

Hello Laurenz,
on 'long-running transactions'. In isolation level 'read committed'. If I just do selects and keep the transaction open in between. Can that cause problems for vacuum? I guess not as it does not guarantee read consistency? Thank you!

laurenz
laurenz
1 year ago
Reply to  PgM

You are right. What holds back VACUUM is open snapshots, and in READ COMMITTED isolation, each statement takes a new snapshot. If you look at the query I provide, you will see that it checks backend_xmin and backend_xid. You will see that your read-only
READ COMMITTED has both values set to NULL between queries.

Susano Novici
Susano Novici
3 years ago

Hi Laurenz,
I've got an issue with this auto-vacuum
[2020-08-05 16:45:17.157 +07][][][][][431][XX001]ERROR: found xmin 2756976979 from before relfrozenxid 300006063
[2020-08-05 16:45:17.157 +07][][][][][431][XX001]CONTEXT: automatic vacuum of table "template1.pg_catalog.pg_authid"

May u please kindly help me with this problem?

laurenz
laurenz
3 years ago
Reply to  Susano Novici

You have data corruption.

Dump and restore the database to a new PostgreSQL cluster. If you need more help, please contact sales.at

Багир Гварамадзе
Багир Гварамадзе
4 years ago

Hi Laurenz, thanks a lot for post.
I got bloated table because of oldest xmin, but this xmin belong to physical replication. How can I solve this problem without loosing replication?
Best regards.

laurenz
laurenz
4 years ago

It's a bit unclear what your problem is, and it seems unrelated to the article, but perhaps you need to drop the replication slot.

Purav Chovatia
Purav Chovatia
5 years ago

Hi Laurenz, Thanks for this post. It was helpful to us in identifying some issues related to auto-vacuum.

We are on postgresql 9.6 and when we run the vacuum verbose cmd, it does not show us the oldest xmin as seen in your output above. Wondering if you tried this on a newer version.

Regards

laurenz
laurenz
5 years ago
Reply to  Purav Chovatia

You are right, this came with commit 9eb344faf54 in PostgreSQL v10.

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    8
    0
    Would love your thoughts, please comment.x
    ()
    x
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram