CYBERTEC PostgreSQL Logo

Orphaned files after a PostgreSQL crash

04.2025 / Category: / Tags: |

A computer screen where "ls --orphaned" ran on a data directory and returned files with all kinds of funny orphan names
© Laurenz Albe 2025

PostgreSQL is famous for its stability, and rightly so. But that does not mean that it can never crash. And while PostgreSQL usually cleans up after itself, it has no good way to do so after a crash (after all, it has lost its memory). As a consequence, you can end up with orphaned files in your data directory. If these files are small, they probably won't worry you. But sometimes you can end up with a lot of junk in your data directory. It is notoriously difficult to deal with that problem, so I decided to write about it.

Why would PostgreSQL crash?

In case the title of this article gives you bad feelings about PostgreSQL, I'll try to dispel them. PostgreSQL is not in the habit of crashing. But sometimes a crash is not PostgreSQL's fault. Let me enumerate a few possible causes for a crash:

In fact, the most frequent cause of a crash that leaves large orphaned files in its wake is people running VACUUM (FULL). This command creates a copy of the table and all its indexes. If you run out of disk space and PostgreSQL cannot create a new WAL segment, the database server will crash. Such a crash will leave the partly created copy of the table behind. To avoid that kind of problem, I recommend that you keep pg_wal and the data directory on different file systems (and while you are at it, put the log files on a separate file system as well).

Why can a crash leave orphaned files behind?

You will probably understand that PostgreSQL cannot clean up when it crashes, but why can't it clean up when it restarts? I have two answers for that:

  • going through all the files in all database directories and comparing them with the catalog data and pg_filenode.map could take a long time, during which the server is not yet available
  • data corruption in the catalog tables could cause PostgreSQL to remove legitimate data files, which would exacerbate the data corruption

I see particularly the second point as a very good reason to forgo any automatic clean-up activity.

How can I identify and remove orphaned files after a crash?

That's the crucial question. I'll offer you two approaches:

The safe way: dump and restore

The simple, safe and supported way to deal with the problem is:

  • stop the application
  • pg_dump the database to disk
  • DROP DATABASE — that will delete all files, including the orphans
  • create a new, empty database
  • restore the dump to the new database
  • start the application

Incidentally, that is also the correct way to clean a database that has experienced any form of data corruption.

While safe and simple, cleaning a large database that way can cause egregious down time. If you feel that you cannot afford that, and you don't want to take the risky path described below, you should simply live with the orphaned files. They won't do any harm, they only eat up some of your disk space, and these days disk space is often much cheaper than an extended down time.

The risky way: identify and remove the orphaned files

There are a couple of PostgreSQL functions that allow a superuser to examine the file system:

Selected PostgreSQL file information functions
function description
pg_ls_dir(dirname) lists a directory on the database server, returns a table of text
pg_stat_file(filename) returns a composite record with all kinds of information about a database server file, including the size and the modification timestamp
pg_filenode_relation(tablespace, filenode) returns the object ID of the relation (table, index, sequence, ...) that belongs to a certain file in the database directory (NULL if there is no such relation)

With some knowledge of the internals of PostgreSQL, that allows us to compose a query for all orphaned files in a database directory:

To delete a filenode returned by the query (let's assume we get 12345), go to the database directory:

and delete the files that belong to it:

Warnings about deleting orphaned files

Be careful when deleting files:

  • The query above will only identify files in the default tablespace. If you created additional tablespaces, it won't find orphaned files in those tablespaces.
  • Don't delete very recent files. A concurrent transaction may have created them, and the catalog entries are not yet visible to your query.
  • Best wait a week or two after the crash. Then, only delete files returned by the query that were last modified around the time of the crash.
  • If the query gives you some small files, don't bother deleting them. They won't bother you and are not worth the risk.

Before you delete any files in the data directory, make sure you have a good backup.

Conclusion

The safe way to get rid of orphaned files after a crash is to dump and restore the database. For the more daring people out there, I gave you a query that helps you identify those orphaned files.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2025
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram