[Comic: a rower complains about row locks. © Laurenz Albe 2023]

The PostgreSQL documentation has some information about row locks. But you typically cannot see them in pg_locks, and not everybody knows how they work or how to track and debug them. This article intends to give some insight into how PostgreSQL row locks work “under the hood”.

Why are there four kinds of row locks?

In the old days, there were only FOR UPDATE and FOR SHARE. PostgreSQL used the latter for foreign key constraints: if you inserted a row in a table with a foreign key, PostgreSQL locked the referenced row with FOR SHARE to prevent concurrent deletes. Unfortunately, that also prevented concurrent updates, which was bad for concurrency.

PostgreSQL 9.4 introduced FOR KEY SHARE and FOR NO KEY UPDATE. Now the former lock is taken on the row referenced by a newly inserted row, and the latter is taken for updates that do not modify a primary or unique key column. Since FOR KEY SHARE and FOR NO KEY UPDATE are compatible, you can now update a row even if a concurrent transaction inserts a row referencing the row you are updating.
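
To illustrate that compatibility, here is a minimal sketch with two psql sessions; the tables parent and child and their columns are made up for this example:

CREATE TABLE parent (id bigint PRIMARY KEY, info text);
CREATE TABLE child  (id bigint PRIMARY KEY,
                     parent_id bigint REFERENCES parent);

INSERT INTO parent VALUES (1, 'original');

-- session 1: an UPDATE that touches no key column takes a FOR NO KEY UPDATE row lock
BEGIN;
UPDATE parent SET info = 'changed' WHERE id = 1;

-- session 2: inserting a referencing row takes a FOR KEY SHARE lock on the parent row;
-- it does not block, because the two lock modes are compatible
BEGIN;
INSERT INTO child VALUES (1, 1);
COMMIT;

-- session 1:
COMMIT;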

Which row lock should I take with SELECT ... FOR ...?

Most people use SELECT ... FOR UPDATE to explicitly lock a row against concurrent updates. But most of the time, that is excessive. You should only take a FOR UPDATE lock if you intend to delete the row or modify a primary or unique key column. Otherwise, it is better to take FOR NO KEY UPDATE locks, so that you do not block inserts into tables referencing the table you are updating. I have seen cases where this simple change was enough to dissolve massive locking problems into thin air.
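
In practice, the change is as simple as this sketch, which reuses the hypothetical parent table from above:

-- the habitual, stronger lock:
SELECT * FROM parent WHERE id = 1 FOR UPDATE;

-- usually sufficient, unless you plan to DELETE the row
-- or to modify a primary or unique key column:
SELECT * FROM parent WHERE id = 1 FOR NO KEY UPDATE;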

I have never seen a need to explicitly take a share lock on a row. PostgreSQL multiversioning allows you to see the unmodified version of a row even if it is concurrently updated or deleted.

How does PostgreSQL store row locks?

PostgreSQL creates the shared memory lock table when the server starts, and it has a fixed size. Therefore, PostgreSQL does not keep row locks in the lock table. While the number of tables (and table locks) in a database cluster is normally moderate, the number of rows can be very large, so keeping row locks in the lock table would risk overflowing it. There is the concept of lock escalation, which promotes many row locks to a page lock or a table lock, and some other databases do that. But apart from the extra work, lock escalation can lead to surprising and unpredictable deadlocks.

Instead, PostgreSQL stores row locks on the row itself. To that end, it repurposes the xmax system column, which is otherwise unused as long as no transaction has deleted or updated the row (read my article for details).

While that allows PostgreSQL to take as many row locks as it needs, it also comes at a price: if you modify a table row, the buffer that contains the row becomes dirty (if it isn’t already) and PostgreSQL has to write it out to disk during the next checkpoint. So taking row locks causes extra write operations in PostgreSQL.
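
You can observe this by selecting the xmax system column; a quick sketch, again using the hypothetical parent table from above (on PostgreSQL 13 or later, pg_current_xact_id() shows your own transaction ID for comparison):

BEGIN;
SELECT * FROM parent WHERE id = 1 FOR NO KEY UPDATE;

-- the row's xmax now contains our transaction ID, marking the row lock
SELECT ctid, xmin, xmax, id, info FROM parent WHERE id = 1;
SELECT pg_current_xact_id();

COMMIT;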

What happens when PostgreSQL acquires row locks?

Check if somebody else holds a lock on the row

First, we check the hint bits and the xmax of the row to see if somebody holds a lock on the row. That will become important later on.
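
If you want to see those hint bits and xmax at the storage level, the pageinspect extension (which typically requires superuser privileges) can show them; a sketch, again with the hypothetical parent table:

CREATE EXTENSION IF NOT EXISTS pageinspect;

-- inspect the first block (block 0) of the table; t_xmax holds the locker,
-- t_infomask contains the hint bits and lock-related flags as a raw bit mask
SELECT lp, t_ctid, t_xmax, t_infomask
FROM heap_page_items(get_raw_page('parent', 0));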

Take a short-lived tuple lock on the row

Then we take an exclusive “tuple” lock on the row in question. This is a regular “heavyweight” lock in the shared memory lock table and serves two purposes:

  • If more than one session wants to lock the same row, they have to queue. This queue is maintained by the tuple lock.
  • The lock protects the row from concurrent modifications until we have written our lock to the row itself.

Wait for the locking transaction if necessary

If the initial check showed that other transactions are holding a lock on the row, we hang on to the tuple lock and go to sleep waiting for the transaction ID of one of the locking transactions. As soon as all blocking transactions have terminated, we can proceed. But first, we check if there were concurrent modifications of the row while we were waiting. If yes, the behavior depends on the current transaction isolation level:

  • With READ COMMITTED, we fetch the latest committed version of the row (and check if it still satisfies the WHERE condition). This is good for concurrency, but can lead to interesting anomalies (see the sketch after this list).
  • With higher isolation levels, PostgreSQL throws a serialization error.
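
Here is a sketch of the serialization error case, with two psql sessions and the hypothetical parent table from above:

-- session 1:
BEGIN;
UPDATE parent SET info = 'newer' WHERE id = 1;

-- session 2:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT * FROM parent WHERE id = 1 FOR NO KEY UPDATE;   -- blocks behind session 1

-- session 1:
COMMIT;

-- session 2 now fails with
-- ERROR:  could not serialize access due to concurrent update
-- With the default READ COMMITTED, session 2 would instead lock the latest
-- committed version of the row (if it still satisfies the WHERE condition).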

Write the row lock to the row

As soon as all blocking row locks are gone and we have the most recent version of the row, we can proceed. We modify xmax and some hint bits on the row and release the tuple lock.

The beauty of that algorithm is that no session ever needs more than two locks in the lock table, so there is no danger of running out of memory.

A clarifying example of row locks

Since the above was a bit complicated, let’s look at an example where three database transactions try to lock the same row at the same time.

  • The first transaction that gets there will grab a tuple lock on the row. Then it writes the row lock to the row and releases the tuple lock immediately.
  • The second transaction that comes along sees that the first transaction has locked the row. It grabs a tuple lock on the row and goes to sleep waiting for the first transaction to finish.
  • The third (and any following) transaction also sees that the row is locked. It blocks waiting for the tuple lock on the row.

If you see a tuple lock in pg_locks, it means that somebody is waiting for a row lock, whether or not that tuple lock has been granted.
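
So a simple way to spot sessions queued behind a row lock is to look for locks of type “tuple”; a sketch using only standard pg_locks columns:

SELECT locktype, relation::regclass AS relation, page, tuple,
       pid, mode, granted
FROM pg_locks
WHERE locktype = 'tuple';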

Debugging row locks with the pgrowlocks extension

pg_locks is the system view that allows you to examine the lock table. Unless there are conflicts on row locks, you will never see a row lock in pg_locks. So we cannot use pg_locks to see who holds which row lock. But we can use the pgrowlocks extension:

CREATE EXTENSION IF NOT EXISTS pgrowlocks;

SELECT * FROM pgrowlocks('table_name');

 locked_row │ locker │ multi │      xids       │           modes           │     pids      
════════════╪════════╪═══════╪═════════════════╪═══════════════════════════╪═══════════════
 (0,2)      │      2 │ t     │ {571228,571229} │ {"Key Share","Key Share"} │ {13644,13911}
 (0,3)      │ 571230 │ f     │ {571230}        │ {"For No Key Update"}     │ {13913}
(2 rows)

“locker” is the value stored in xmax. We see that two rows in the first table block are locked:

  • The sessions 13644 and 13911 hold a FOR KEY SHARE lock on the row with ctid (0,2). Since more than one transaction holds a lock on the row, PostgreSQL created the multixact with ID 2 as a placeholder.
  • The session 13913 holds a FOR NO KEY UPDATE lock on the row with ctid (0,3).

Note that pgrowlocks() performs a sequential scan on the table, so this can take a while and will put load on your disk.
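
To see who is behind those pids, you can join the result with pg_stat_activity; a sketch (the table name is a placeholder):

SELECT l.locked_row, l.modes, a.pid, a.usename, a.query
FROM pgrowlocks('table_name') AS l
     JOIN pg_stat_activity AS a ON a.pid = ANY (l.pids);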

The upshot

Row locks are different, because PostgreSQL does not permanently store them in the shared memory lock table. Because of that, we normally cannot see them in pg_locks. However, there is the pgrowlocks extension that can help with debugging lock problems. Next time you need to explicitly lock a row, use SELECT ... FOR NO KEY UPDATE!

To learn more about SQL, find out about using the MERGE command in PostgreSQL v15 and higher.