Why vacuum?
Whenever rows in a PostgreSQL table are updated or deleted, dead rows are left behind. VACUUM gets rid of them so that the space can be reused. If a table doesn’t get vacuumed, it will get bloated, which wastes disk space and slows down sequential table scans (and – to a smaller extent – index scans).
VACUUM also takes care of freezing table rows to avoid problems when the transaction ID counter wraps around, but that’s a different story.
Normally you don’t have to take care of all that, because the autovacuum daemon built into PostgreSQL does it for you. To find out more about enabling and disabling autovacuum, read this post.
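To make this concrete, here is a minimal sketch of that life cycle (my own example, not from the original post):

CREATE TABLE vacme (id integer);
INSERT INTO vacme SELECT generate_series(1, 100000);
-- deleting (or updating) rows leaves dead row versions behind
DELETE FROM vacme WHERE id <= 50000;
-- plain VACUUM marks the space of the dead rows as reusable
-- (it does not shrink the underlying file)
VACUUM vacme;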
Problems with vacuum: bloated tables
If your tables get bloated, the first thing to check is whether autovacuum has processed them:

SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_all_tables
ORDER BY n_dead_tup
         / (n_live_tup
            * current_setting('autovacuum_vacuum_scale_factor')::float8
            + current_setting('autovacuum_vacuum_threshold')::float8)
         DESC
LIMIT 10;
If your bloated table does not show up here, n_dead_tup is zero and last_autovacuum is NULL, you might have a problem with the statistics collector.
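One quick sanity check (my suggestion, not from the original post): autovacuum relies on the statistics collector, which requires track_counts to be enabled:

SHOW track_counts;  -- must be 'on', otherwise autovacuum never sees any dead tuples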
If the bloated table is right there on top, but last_autovacuum is NULL, you might need to configure autovacuum to be more aggressive so that it can finish processing the table.
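For example, you could lower the per-table threshold and the cost delay with storage parameters (the values here are just a sketch of a starting point, not a recommendation):

ALTER TABLE laurenz.vacme SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- vacuum at 1% dead tuples instead of the default 20%
    autovacuum_vacuum_cost_delay = 0        -- let autovacuum run at full speed on this table
);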
But sometimes the result will look like this:
 schemaname |   relname    | n_live_tup | n_dead_tup |   last_autovacuum
------------+--------------+------------+------------+---------------------
 laurenz    | vacme        |      50000 |      50000 | 2018-02-22 13:20:16
 pg_catalog | pg_attribute |         42 |        165 |
 pg_catalog | pg_amop      |        871 |        162 |
 pg_catalog | pg_class     |          9 |         31 |
 pg_catalog | pg_type      |         17 |         27 |
 pg_catalog | pg_index     |          5 |         15 |
 pg_catalog | pg_depend    |       9162 |        471 |
 pg_catalog | pg_trigger   |          0 |         12 |
 pg_catalog | pg_proc      |        183 |         16 |
 pg_catalog | pg_shdepend  |          7 |          6 |
(10 rows)
Here autovacuum ran recently, but it didn’t free the dead tuples!
We can verify the problem by running VACUUM (VERBOSE):
test=> VACUUM (VERBOSE) vacme;
INFO:  vacuuming "laurenz.vacme"
INFO:  "vacme": found 0 removable, 100000 nonremovable row versions in 443 out of 443 pages
DETAIL:  50000 dead row versions cannot be removed yet, oldest xmin: 22300
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
Why won’t VACUUM remove the dead rows?
VACUUM only removes those row versions (also known as “tuples”) that are not needed any more. A tuple is not needed if the transaction ID of the deleting transaction (as stored in the xmax system column) is older than the oldest transaction still active in the PostgreSQL database (or in the whole cluster, for shared tables).
This value (22300 in the VACUUM output above) is called the “xmin horizon”.
There are four things that can hold back this xmin horizon in a PostgreSQL cluster:
Long-running transactions and VACUUM:
You can find those transactions and their xmin value with the following query:

SELECT pid, datname, usename, state, backend_xmin, backend_xid
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
   OR backend_xid IS NOT NULL
ORDER BY greatest(age(backend_xmin), age(backend_xid)) DESC;
You can use the pg_terminate_backend() function to terminate the database session that is blocking your VACUUM.
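For example (the process ID is hypothetical; take it from the pid column of the query above):

SELECT pg_terminate_backend(12345);  -- forcibly ends the session that holds back the xmin horizon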
Abandoned replication slots and VACUUM:
A replication slot is a data structure that keeps the PostgreSQL server from discarding information that is still needed by a standby server to catch up with the primary. If replication is delayed or the standby server is down, the replication slot will prevent VACUUM from deleting old rows.
You can find all replication slots and their xmin value with this query:

SELECT slot_name, slot_type, database, xmin
FROM pg_replication_slots
ORDER BY age(xmin) DESC;
Use the pg_drop_replication_slot() function to drop replication slots that are no longer needed (see the sketch below).
Note: This can only happen with physical replication if hot_standby_feedback = on. For logical replication there is a similar hazard, but it only affects system catalogs; examine the column catalog_xmin in that case.
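A sketch of dropping an abandoned slot (the slot name is made up; use one from the query above):

SELECT pg_drop_replication_slot('gone_standby');  -- irreversible: the standby can no longer catch up via this slot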
Orphaned prepared transactions and VACUUM:
During two-phase commit, a distributed transaction is first prepared with the PREPARE TRANSACTION statement and then committed with the COMMIT PREPARED statement. Once Postgres prepares a transaction, the transaction is kept “hanging around” until Postgres commits or aborts it. It even has to survive a server restart! Normally, transactions don’t remain in the prepared state for long, but sometimes things go wrong and the administrator has to remove a prepared transaction manually.
You can find all prepared transactions and their xmin value with the following query:

SELECT gid, prepared, owner, database, transaction AS xmin
FROM pg_prepared_xacts
ORDER BY age(transaction) DESC;
Use the ROLLBACK PREPARED SQL statement to remove prepared transactions.
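For example (the transaction identifier is hypothetical; use a gid from the query above):

ROLLBACK PREPARED 'prep_txn_1';  -- aborts the orphaned prepared transaction and releases its xmin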
Standby server with hot_standby_feedback = on and VACUUM:
Normally, the primary server in a streaming replication setup does not care about queries running on the standby server. Thus, VACUUM will happily remove dead rows that may still be needed by a long-running query on the standby, which can lead to replication conflicts. To reduce replication conflicts, you can set hot_standby_feedback = on on the standby server. Then the standby will keep the primary informed about its oldest open transaction, and VACUUM on the primary will not remove old row versions that are still needed on the standby.
To find out the xmin of all standby servers, you can run the following query on the primary server:

SELECT application_name, client_addr, backend_xmin
FROM pg_stat_replication
ORDER BY age(backend_xmin) DESC;
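If bloat on the primary hurts more than the occasional cancelled query on the standby, one option is to turn the feedback off again. A minimal sketch, assuming you have superuser access on the standby:

-- run on the standby server
ALTER SYSTEM SET hot_standby_feedback = off;
SELECT pg_reload_conf();  -- this parameter only requires a reload, not a restart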
Read more about PostgreSQL table bloat and autocommit in my post here.