Many might have seen PostgreSQL issue the following error message: "ERROR: deadlock detected". But what does it really mean? How can we prevent a deadlock and how can we reproduce the problem? Let’s dive into PostgreSQL locking and understand what deadlock and deadlock_timeout really mean.

How does a deadlock happen?

Many people approach us because they want to understand what a deadlock is and how it can happen. They also want to understand how a deadlock can be avoided and what software developers can do about it.
If you want to understand how a deadlock occurs, all you need is a table containing two lines. That’s sufficient to explain the basic principle of deadlocks.

Here is some easy-to-use sample data:


test=# CREATE TABLE t_data (id int, data int);
CREATE TABLE
test=# INSERT INTO t_data VALUES (1, 100), (2, 200);
INSERT 0 2
test=# TABLE t_data;
id  | data
----+------
1   | 100
2   | 200
(2 rows)

The crux is that if data is updated in a different order, transactions might have to wait for one another to be finished. It is perfectly fine if transaction 1 has to wait for transaction 2. But what happens if transaction 1 has to wait for transaction 2 and transaction 2 has to wait for transaction 1? In that case, the system has two choices:

  • Wait infinitely, or
  • Abort one transaction and commit the other transaction.

As waiting infinitely is not an option, PostgreSQL will abort one of these transactions after some time (deadlock_timeout). Here is what happens:

Transaction 1Transaction 2Comment
BEGIN;BEGIN;
UPDATE t_data
SET data = data * 10
WHERE id = 1
RETURNING *;
id  | data
----+------
1   | 1000
(1 row)
UPDATE t_data
SET data = data * 10
WHERE id = 2
RETURNING *;
id  | data
----+------
2   | 2000
(1 row)
works perfectly
UPDATE t_data
SET data = data * 10
WHERE id = 2
RETURNING *;
has to wait until transaction 2 releases the lock on the row containing id = 2
… waits …UPDATE t_data
SET data = data * 10
WHERE id = 1
RETURNING *;
wants to lock the row locked by transaction id: now both are supposed to wait
… deadlock timeout …… deadlock timeout …PostgreSQL waits (deadlock_timeout) and triggers deadlock detection after this timeout (not immediately)
update proceeds: “UPDATE 1”ERROR: deadlock detecteda transaction has to die
COMMIT;the rest commits normally

 

The error message we will see is:

ERROR: deadlock detected
DETAIL: Process 70725 waits for ShareLock on transaction 891717; blocked by process 70713.
Process 70713 waits for ShareLock on transaction 891718; blocked by process 70725.
HINT: See server log for query details.
CONTEXT: while updating tuple (0,1) in relation "t_data"

The reason is that transactions have to wait for one another. If two transactions are in a conflict, PostgreSQL will not resolve the problem immediately, rather it will wait for deadlock_timeout and then trigger the deadlock detection algorithm to resolve the problem.

Why does PostgreSQL wait for some time before it steps in and fixes things? The reason is that deadlock detection is quite expensive, and therefore not immediately triggering it makes sense. The default value here is 1 second, which is high enough to avoid pointless deadlock detection attempts, but is still short enough to fix the problem in a useful and timely manner.

How to fix and avoid deadlocks

The most important thing to know is: There is NO MAGIC CONFIGURATION PARAMETER to fix this problem. The problem does NOT depend on configuration. It depends on the execution order of operations. In other words, you cannot magically fix it without understanding the application and its underlying operations.

The only thing that can fix the problem is to change the execution order, as shown in the next listing:


test=# SELECT * FROM t_data ;
id  | data
----+------
1   | 1000
2   | 2000
(2 rows)

This is the data you should see after committing the transaction that did not fail before. Thus we can see what happens if two transactions execute in a different order:

Transaction 1Transaction 2Comment
BEGIN;
UPDATE t_data
SET data = data * 10
WHERE id = 1
RETURNING *;
id  | data
----+------
1   | 1000
(1 row)
BEGIN;
UPDATE t_data
SET data = data * 10
WHERE id = 1
RETURNING *;
UPDATE t_data
SET data = data * 10
WHERE id = 2
RETURNING *;
id  | data
----+-------
2   | 20000
(1 row)
… wait …
COMMIT;… wait …
UPDATE t_data
SET data = data * 10
WHERE id = 1
RETURNING *;
id  | data
----+--------
1   | 100000
(1 row)
re-read the value and use the newly committed entries
UPDATE t_data
SET data = data * 10
WHERE id = 2
RETURNING *;
id  | data
----+--------
2   | 200000
(1 row)
re-read the value and use the newly committed entries
COMMIT;

 

In this case, there is no deadlock. However, in a real work scenario it is hardly possible to simply swap the execution order. That’s why this is more of a theoretical solution to the problem than a practical one. However, there are no other options to fix the problem of deadlocks. In the case of deadlocks, being aware of how to prevent them is the best cure.

Finally …

Locking is really important. Deadlocks are not the only concern in this area. Performance might be equally important, therefore it makes sense to deal with performance-related locking effects as well. Stay tuned for more on this topic.

If you want to learn more about important features of PostgreSQL, you might want to check out a blog post about UPDATE which can be found here.