CYBERTEC Logo

Detecting table bloat in PostgreSQL

08.2013 / Category: , / Tags: |

If your PostgreSQL database is purely maintained (lack of VACUUM) or badly structured, you might face some table bloat.
UPDATE Sept. 2023 - See this blog for a more recent update on this topic: Table bloat revisited.

Table bloat is somewhat nasty because it slows down your database and eats up more storage than needed. Therefore it can be important to figure out if a table is bloated or not. The PostgreSQL contrib package provides a package called pgstattuple to diagnose table bloat. 

To use the module we have to activate it:

For a test we create a table and add some 10.000 rows to it on the fly:

We can already see, how pgstattuple works. To make sure the psql output stays readable, we have called ...

In our example pgstattuple will return the following data:

As you can see the size of the table is somewhat over 368k. Our table has a fill grade of around 76%. Note that those numbers don't add up to 100% completely. This is due to some overhead. In reality the fill grade of a freshly loaded table will be a lot higher than in our trivial single column case.

To demonstrate table bloat we can delete some data. In this example we delete one third of those rows:

Note, the size of the table on disk is still the same. Some of those valid rows have just been moved to the “dead tuple” section of the output:

The first lesson here is that DELETE does not shrink a table on disk. It merely marks rows as dead. This is highly important – many people are misled by this behavior.

To reclaim the space occupied by those dead rows we can call VACUUM:

What we see here is that those dead rows have been converted to free space. And yes, the size of the table is still the same:

This free space can now be used to store new rows inside your table.

Checking many tables

In our example we have checked one table. But what if you want to check your entire database to see who is to blame?

In this case, you must make use of the PostgreSQL type system and a system catalog. Let us discuss the type system first: In the previous example we have seen that pgstattuple can be called in the FROM-clause because it returns an entire row. But we have to do this for all tables – not just for some. To make sure that we can put the list of objects we want to check into the FROM-clause (in the form of the pg_class system table) we have to move the function call to the SELECT clause. The call will return a composite type consisting of all fields:

This is not really readable so we have to break up those fields again. Some brackets and a star will do the job:

We can now sort and filter at will to figure out which table is to blame.

See also Laurenz Albe's post about "How a bad network connection can cause table bloat."

----------
We welcome your feedback on our brand new Facebook page: www.fb.com/cybertec.postgresql

0 0 votes
Article Rating
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments
CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    0
    Would love your thoughts, please comment.x
    ()
    x
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram