Subtransactions in PL/pgSQL
© Laurenz Albe 2020

 

Recently, while troubleshooting PostgreSQL performance problems, I ran into problems with subtransactions twice. So I thought this was a nice topic for another blog post.

What are subtransactions?

Everybody knows database transactions. In PostgreSQL, which is operating in autocommit mode, you have to start a transaction that spans multiple statements explicitly with BEGIN or START TRANSACTION and close it with END or COMMIT. If you abort the transaction with ROLLBACK (or end the database session without committing) all work done inside the transaction becomes undone.

Now subtransactions allow you to roll back part of the work done in a transaction. You start a subtransaction inside a transaction with the standard SQL statement:

SAVEPOINT name;

name” is an identifier (no single quotes!) for the subtransaction. You cannot commit a subtransaction in SQL (it is automatically committed with the transaction that contains it), but you can roll it back with:

ROLLBACK TO SAVEPOINT name;

The use of subtransactions

Subtransactions are useful in longer transactions. In PostgreSQL, any error inside a transaction will abort the transaction:

test=> BEGIN;
BEGIN
test=*> SELECT 'Some work is done';
     ?column?      
-------------------
 Some work is done
(1 row)

test=*> SELECT 12 / (factorial(0) - 1);
ERROR:  division by zero
test=!> SELECT 'try to do more work';
ERROR:  current transaction is aborted, commands ignored until end of transaction block
test=!> COMMIT;
ROLLBACK

With a transaction that does a lot of work this is quite annoying, because it means that all work done so far is lost. Subtransactions can help you recover from such a situation:

test=> BEGIN;
BEGIN
test=*> SELECT 'Some work is done';
     ?column?      
-------------------
 Some work is done
(1 row)

test=*> SAVEPOINT a;
SAVEPOINT
test=*> SELECT 12 / (factorial(0) - 1);
ERROR:  division by zero
test=!> ROLLBACK TO SAVEPOINT a;
ROLLBACK
test=*> SELECT 'try to do more work';
      ?column?       
---------------------
 try to do more work
(1 row)

test=*> COMMIT;
COMMIT

Note that ROLLBACK TO SAVEPOINT starts another subtransaction called a when it rolls back the old one.

Subtransactions in PL/pgSQL

Even if you have never used the SAVEPOINT statement, you may already have encountered subtransactions. Written in PL/pgSQL, the code in the previous section looks like this:

BEGIN
   PERFORM 'Some work is done';

   BEGIN  -- a block inside a block
      PERFORM 12 / (factorial(0) - 1);
   EXCEPTION
      WHEN division_by_zero THEN
         NULL;  -- ignore the error
   END;

   PERFORM 'try to do more work';
END;

Every time you enter a block with an EXCEPTION clause, a new subtransaction is started. The subtransaction is committed when you leave the block, and rolled back when you enter the exception handler.

Compatibility with other databases

Many other databases handle errors inside a transaction differently. Rather than aborting the complete transaction, they roll back only the statement that caused the error, leaving the transaction itself active.

When migrating or porting from such a database to PostgreSQL, you might be tempted to wrap every single statement in a subtransaction to emulate the above behavior.

The PostgreSQL JDBC driver even has a connection parameterautosave” that you can set to “always” to automatically set a savepoint before each statement and rollback in case of failure.

As the following will show, this alluring technique will lead to serious performance problems.

A performance test case

To demonstrate the problems caused by overuse of subtransactions, I created a test table:

CREATE UNLOGGED TABLE contend (
    id integer PRIMARY KEY,
    val integer NOT NULL
)
WITH (fillfactor='50');

INSERT INTO contend (id, val)
SELECT i, 0
FROM generate_series(1, 10000) AS i;

VACUUM (ANALYZE) contend;

The table is small, unlogged and has a low fillfactor to reduce the required I/O as much as possible. This way, I can observe the effects of subtransactions better.

I’ll use pgbench, the benchmarking tool shipped with PostgreSQL, to run the following custom SQL script:

BEGIN;
PREPARE sel(integer) AS
   SELECT count(*)
   FROM contend
   WHERE id BETWEEN $1 AND $1 + 100;
PREPARE upd(integer) AS
   UPDATE contend SET val = val + 1
   WHERE id IN ($1, $1 + 10, $1 + 20, $1 + 30);

SAVEPOINT a;
\set rnd random(1,990)
EXECUTE sel(10 * :rnd + :client_id + 1);
EXECUTE upd(10 * :rnd + :client_id);

SAVEPOINT a;
\set rnd random(1,990)
EXECUTE sel(10 * :rnd + :client_id + 1);
EXECUTE upd(10 * :rnd + :client_id);

...

SAVEPOINT a;
\set rnd random(1,990)
EXECUTE sel(10 * :rnd + :client_id + 1);
EXECUTE upd(10 * :rnd + :client_id);

DEALLOCATE ALL;
COMMIT;

The script will set 60 savepoints for test number 1, and 90 for test number 2. It uses prepared statements to minimize the overhead of query parsing.

pgbench will replace :client_id with a number unique to the database session. So as long as there are no more than 10 clients, each client’s UPDATEs won’t collide with those of other clients, but they will SELECT each other’s rows.

Running the performance test

Since my machine has 8 cores, I’ll run the tests with 6 concurrent clients for ten minutes.

To see meaningful information with “perf top”, you need the PostgreSQL debugging symbols installed. This is particularly recommended on production systems.

Test 1 (60 subtransactions)

pgbench -f subtrans.sql -n -c 6 -T 600

transaction type: subtrans.sql
scaling factor: 1
query mode: simple
number of clients: 6
number of threads: 1
duration: 600 s
number of transactions actually processed: 100434
latency average = 35.846 ms
tps = 167.382164 (including connections establishing)
tps = 167.383187 (excluding connections establishing)

This is what “perf top --no-children --call-graph=fp --dsos=/usr/pgsql-12/bin/postgres” shows while the test is running:

+    1.86%  [.] tbm_iterate
+    1.77%  [.] hash_search_with_hash_value
     1.75%  [.] AllocSetAlloc
+    1.36%  [.] pg_qsort
+    1.12%  [.] base_yyparse
+    1.10%  [.] TransactionIdIsCurrentTransactionId
+    0.96%  [.] heap_hot_search_buffer
+    0.96%  [.] LWLockAttemptLock
+    0.85%  [.] HeapTupleSatisfiesVisibility
+    0.82%  [.] heap_page_prune
+    0.81%  [.] ExecInterpExpr
+    0.80%  [.] SearchCatCache1
+    0.79%  [.] BitmapHeapNext
+    0.64%  [.] LWLockRelease
+    0.62%  [.] MemoryContextAllocZeroAligned
+    0.55%  [.] _bt_checkkeys                                 
     0.54%  [.] hash_any
+    0.52%  [.] _bt_compare
     0.51%  [.] ExecScan

Test 2 (90 subtransactions)

pgbench -f subtrans.sql -n -c 6 -T 600

transaction type: subtrans.sql
scaling factor: 1
query mode: simple
number of clients: 6
number of threads: 1
duration: 600 s
number of transactions actually processed: 41400
latency average = 86.965 ms
tps = 68.993634 (including connections establishing)
tps = 68.993993 (excluding connections establishing)

This is what “perf top --no-children --call-graph=fp --dsos=/usr/pgsql-12/bin/postgres” has to say:

+   10.59%  [.] LWLockAttemptLock
+    7.12%  [.] LWLockRelease
+    2.70%  [.] LWLockAcquire
+    2.40%  [.] SimpleLruReadPage_ReadOnly
+    1.30%  [.] TransactionIdIsCurrentTransactionId
+    1.26%  [.] tbm_iterate
+    1.22%  [.] hash_search_with_hash_value
+    1.08%  [.] AllocSetAlloc
+    0.77%  [.] heap_hot_search_buffer
+    0.72%  [.] pg_qsort
+    0.72%  [.] base_yyparse
+    0.66%  [.] SubTransGetParent
+    0.62%  [.] HeapTupleSatisfiesVisibility
+    0.54%  [.] ExecInterpExpr
+    0.51%  [.] SearchCatCache1

Even if we take into account that the transactions in test 2 are one longer, this is still a performance regression of 60% compared with test 1.

Implementation of subtransactions

To understand what is going on, we have to understand how transactions and subtransactions are implemented.

Whenever a transaction or subtransaction modifies data, it is assigned a transaction ID. PostgreSQL keeps track of these transaction IDs in the commit log, which is persisted in the pg_xact subdirectory of the data directory.

However, there are some differences between transactions and subtransactions:

  • Each subtransaction has a containing transaction or subtransaction (the “parent”)
  • Committing a subtransaction does not require a WAL flush
  • There can be only one transaction per database session, but many subtransactions

The information which (sub)transaction is the parent of a given subtransaction is persisted in the pg_subtrans subdirectory of the data directory. Since this information becomes obsolete as soon as the containing transaction has ended, this data do not have to be preserved across a shutdown or crash.

Subtransactions and visibility

The visibility of a row version (“tuple”) in PostgreSQL is determined by the xmin and xmax system columns, which contain the transaction ID of the creating and destroying transactions. If the transaction ID stored is that of a subtransaction, PostgreSQL also has to consult the state of the containing (sub)transaction to determine if the transaction ID is valid or not.

To determine which tuples a statement can see, PostgreSQL takes a snapshot of the database at the beginning of the statement (or the transaction). Such a snapshot consists of:

  • a maximum transaction ID: everything from that transaction on is invisible
  • the list of transactions and subtransactions that were active when the snapshot was taken
  • the command number of the earliest visible command in the current (sub)transaction

A snapshot is initialized by looking at the process array, which is stored in shared memory and contains information about all currently running backends. This, of course, contains the current transaction ID of the backend and has room for at most 64 non-aborted subtransactions per session. If there are more than 64 such subtransactions, the snapshot is marked as suboverflowed.

Explaining the test results

A suboverflowed snapshot does not contain all data required to determine visibility, so PostgreSQL will occasionally have to resort to pg_subtrans. These pages are cached in shared buffers, but you can see the overhead of looking them up in the high rank of SimpleLruReadPage_ReadOnly in the perf output. Other transactions have to update pg_subtrans to register subtransactions, and you can see in the perf output how they vie for lightweight locks with the readers.

Diagnosing the problem of too many subtransactions

Apart from looking at “perf top”, there are other symptoms that point at the direction of this problem:

  • Your workload performs well when run single-threaded, but badly when run in several concurrent database sessions
  • You often see the wait event “SubtransControlLock” in pg_stat_activity
  • If you export a snapshot using the “pg_export_snapshot()” function, the resulting file in the pg_snapshots subdirectory of the data directory will contain the line “sof:1” to indicate that the subtransaction array overflowed

Conclusion

Subtransactions are a great tool, but you should use them wisely. If you need concurrency, don’t start more than 64 subtransactions per transaction.

The diagnostics shown in this article should help you determine whether you are suffering from the problem, or not.

Finding the cause of that problem can be tricky. For example, it may not be obvious that a function with an exception handler that you call for each result row of an SQL statement (perhaps in a trigger) starts a new subtransaction.