Many people are wondering why deleting data from a table in a PostgreSQL database does not shrink files on disk. You would expect storage consumption to go down when data is deleted. This is not always the case.

To show this really works I have compiled some small examples. Let us get started with a simple table and add 100k rows to it:

test=# CREATE TABLE t_test (id int4);

CREATE TABLE
test=# INSERT INTO t_test SELECT * FROM generate_series(1, 100000);

INSERT 0 100000

The size of the table will be around 3.5 MB on disk. We can find out about the size of a table using pg_relation_size:

test=# SELECT pg_size_pretty(pg_relation_size('t_test'));

pg_size_pretty

----------------

3544 kB

(1 row)

Let us delete half of the rows now:

test=# DELETE FROM t_test WHERE id > 50000;

DELETE 50000

Many users would expect this table to shrink to half the size. This is not the case:

test=# SELECT pg_size_pretty(pg_relation_size('t_test'));

pg_size_pretty

----------------

3544 kB

(1 row)

The reason here is that when you delete some rows it does not necessarily mean that nobody else can see those rows anymore. There might still be transactions around seeing those rows you are about to delete. In other words: PostgreSQL does not delete a row instantly but marks it invisible so that certain transactions cannot see it.

To clean up rows, which are really dead, we can run VACUUM:

test=# VACUUM t_test;

VACUUM

In our example the table has been reduced in size:

test=# SELECT pg_size_pretty(pg_relation_size('t_test'));

pg_size_pretty

----------------

1776 kB

(1 row)

How shrinking works

VACUUM is able to reclaim space and return it to the filesystem, but this is not always possible. It is only possible to shrink a file by cutting off stuff at the end. Cutting out data at the end of a data file is only possible if (and only if) there are no valid rows anymore AFTER a certain position in the file. In our example we have loaded data in ascending order into our PostgreSQL table (1, 2, 3, etc.). If we delete everything larger than 50.000 it means that the second half of the table will be gone. VACUUM will go through the table and figure out that the second half of the data file contains nothing but chunk and truncate the data file to return space to the filesystem.

When shrinking does not work

In the first example we have arranged things in a way that we can see VACUUM returning space to the filesystem. Let us construct an example where this is not possible:

test=# DROP TABLE t_test;

DROP TABLE

First of all we drop the table and create a new one:

test=# CREATE TABLE t_test (id int4);

CREATE TABLE

Then we load 100.000 rows just like before:

test=# INSERT INTO t_test SELECT * FROM generate_series(1, 100000);

INSERT 0 100000

In this example we will delete the first half of the data:

test=# DELETE FROM t_test WHERE id < 50000;

DELETE 49999

Just like before we can run VACUUM:

test=# VACUUM t_test;

VACUUM

This time the size of the underlying table is the same size:

test=# SELECT pg_size_pretty(pg_relation_size('t_test'));

pg_size_pretty

----------------

3544 kB

(1 row)

Remember what we said about cleanup and VACUUM? VACUUM can only shrink a file when free space is at the end. However, our example has been set up in a way that this is not the case.

CTIDs – a nice thing to know

In PostgreSQL there is a thing called CTID, which says where a certain row is. ctid=(0, 14) would mean that we are talking about the 14th row in the first data block of the table.

Let us take a look at our table and checkout the highest 5 CTIDs.

test=# SELECT ctid, * FROM t_test ORDER BY ctid DESC LIMIT 5;

ctid | id

-----------+--------

(442,108) | 100000

(442,107) | 99999

(442,106) | 99998

(442,105) | 99997

(442,104) | 99996

(5 rows)

As you can see the last row is in block 442. Let us do the math: 442 * 8192 = 3.5 MB. This is why the table has not changed in size.

Let us check the lowest CTID:

test=# SELECT min(ctid) FROM t_test;

min

----------

(221,54)

(1 row)

We can see that the first valid row is somewhere in the middle of the table. Therefore the table has not been truncated by VACUUM.

———-
We welcome your feedback on our brand new Facebook page: www.fb.com/cybertec.postgresql