Prepared transactions are disabled in PostgreSQL by default, since the parameter max_prepared_transactions has the default value 0.

You don’t need prepared transactions in most cases. However, they can cause nasty problems, so I think that everybody who runs a PostgreSQL database should understand them.

To illustrate these problems, I’ll show you how to use prepared transactions to get a PostgreSQL into an inaccessible state.

 

What are prepared transactions?

Normally, a database transaction that spans multiple statements is ended with COMMIT or ROLLBACK. With prepared transactions, another step is added:

  1. BEGIN or START TRANSACTION: starts a transaction as usual.
  2. PREPARE TRANSACTION 'name': prepares the transaction for commit or rollback and assigns a name to it.
  3. { COMMIT | ROLLBACK } PREPARED 'name': commits or rolls back a previously prepared transaction.

 

The PREPARE TRANSACTION step performs all actions that may fail during COMMIT. That way, both COMMIT PREPARED and ROLLBACK PREPARED are guaranteed to succeed once a transaction is prepared. Moreover, PREPARE TRANSACTION persists the still open transaction, so that it will survive a crash or server restart.

Once a transaction is prepared, it is complete. Subsequent SQL statements belong to different transactions. You cannot do anything with a prepared transaction except COMMIT PREPARED and ROLLBACK PREPARED.

 

What is the use of prepared transactions?

Prepared transactions are used to implement “distributed transactions”.
Distributed transactions are transactions that affect more than one data source.
The protocol is as follows:

  1. Start a transaction on all data sources involved in the transaction.
  2. Modify data in the data sources. If there is a problem, ROLLBACK all involved transactions.
  3. Once you are done, PREPARE all involved transactions.
  4. If the PREPARE step fails in any of the transactions, issue ROLLBACK PREPARED everywhere.
  5. If the PREPARE step succeeds everywhere, COMMIT PREPARED all involved transactions.

 

This so-called “two-phase commit protocol” guarantees that the distributed transaction either succeeds or is rolled back everywhere, leaving the whole system consistent.

To make that works reliably, you need a “distributed transaction manager”.
That is software that keeps track of all distributed transactions, persisting their state to survive crashes and other interruptions.
That way it can complete all interrupted distributed transactions as soon as operation is resumed.

 

Problems caused by prepared transactions

Normally, no transaction should be in the prepared state for longer than a split second. But software bugs and other disruptions can cause a transaction to remain in the prepared state for a longer time. This causes the problems associated with long running transactions in general:

  • locks that are held for a long time, blocking other sessions and increasing the risk of a deadlock
  • VACUUM cannot clean up dead tuples created after the the start of the transaction

These problems are exacerbated by the fact that prepared transactions and their locks stay around even after the database server is restarted.

 

Implementation details

Preparing a transaction will write a WAL record, so the prepared transaction can be restored during crash recovery. This requires forcing the WAL to disk, just like a normal commit does.

During a checkpoint, the state of the prepared transaction is persisted in a file in the pg_twophase subdirectory of the data directory. The name of the file is the hexadecimal transaction ID.

On startup, all prepared transactions are restored from pg_twophase.

The file is deleted when the prepared transaction is committed or rolled back.

 

Getting rid of “orphaned” prepared transactions

You can examine all prepared transactions in the PostgreSQL database cluster using the view pg_prepared_xacts.

If a prepared transaction is “orphaned” because the transaction manager failed to close it, you will have to do that manually. Connect to the correct database and run COMMIT PREPARED or ROLLBACK PREPARED.

 

Locking up a database with prepared transactions

Warning: Don’t try this on your production database!

As a database superuser, run the following:

BEGIN;

LOCK pg_catalog.pg_authid;

PREPARE TRANSACTION 'locked';

Then disconnect from the database.

pg_authid, the table that contains the database users, is required to authenticate a database session. Since this table is locked by the prepared transaction, all future connection attempts will hang.

Restarting the database won’t help, because the prepared transaction will be retained.

Before you read on to the next part that contains the solution, let me invite you to try and get out of this dilemma yourself.

 

Enter Houdini!

Your first reaction will probably be the same as mine: Start PostgreSQL in single user mode. Alas, no luck:

$ pg_ctl stop
waiting for server to shut down.... done
server stopped
$ postgres --single postgres
LOG:  recovering prepared transaction 571 from shared memory
^C
FATAL:  canceling statement due to user request

Single user mode just hangs until I send it a SIGINT by pressing Ctrl+C, which shuts down the server.

But we can easily find a way to recover by reviewing the implementation details above:
571 in hexadecimal is 23b, so while PostgreSQL is shut down, we can remove the prepared transaction as follows:

$ rm $PGDATA/pg_twophase/0000023B

This will essentially roll back the transaction, and its effects will be undone when PostgreSQL is restarted.