CYBERTEC Logo

Debugging deadlocks in PostgreSQL

debugging deadlocks made unnecessary
© Laurenz Albe 2022

 

(Updated 2023-04-07) Even if you understand what a deadlock is, debugging deadlocks can be tricky. This article shows some techniques on how to figure out the cause of a deadlock.

A simple deadlock example

Setting the stage

We will test our techniques for debugging deadlocks with the following example:

The deadlock

To provoke the deadlock, we run the following transactions in parallel.

Each transaction adds a row to the child table and then tries to modify the corresponding row in parent.

How a deadlock appears in the log

The PostgreSQL log reveals more details:

Why is debugging deadlocks difficult?

The error message sent to the client does not contain any meaningful detail. That is because the SQL statement from the other sessions participating in the deadlock may contain data that you have no right to see. The PostgreSQL log contains more helpful data, but also not enough to understand the deadlock. To understand the cause of the deadlock, you would have to know all statements from the involved transactions, because locks are always held until the end of a transaction. The two statements from the error message alone could never cause a deadlock. So some earlier statement in these transactions must already have taken locks that are part of the problem. However, PostgreSQL does not retain any memory of these previous statements, so we have to find some other way to get the whole picture.

Usually you need help from the application's developers. Perhaps you are lucky, the SQL statements from the error message are enough to identify the database transactions involved, and it is easy to reconstruct the SQL statements that were issued from each transaction. By examining the locks taken by these statements, you should be able to reconstruct the cause of the deadlock.

The cause of the deadlock in our example

If we know the statements from our transactions in the example above, we can deduce the following:

  • The INSERT statements take a FOR KEY SHARE lock on the referenced rows in parent. That is necessary so that no concurrent transaction can delete the row referenced by our not yet committed INSERT. As you can see in the compatibility matrix, these locks don't conflict with each other.
  • The FOR UPDATE row locks from the SELECT ... FOR UPDATE statements conflict with the FOR KEY SHARE locks from the other transaction, which leads to the deadlock.

Once we know that, the solution is simple: change “SELECT ... FOR UPDATE” to “SELECT ... FOR NO KEY UPDATE”, and there will be no more lock conflicts and consequently no deadlock.

Note that a FOR UPDATE row lock is only required if you want to DELETE a row, or if you intend to update a column with a unique or primary key constraint on it (a key). The lock name FOR UPDATE is counter-intuitive, since it suggests that this is the correct lock for an UPDATE, and it is a common trap for PostgreSQL beginners.. See my article about row locks for more detail.

Great, with our complete knowledge we could fix the problem. But how do we get the statements from the transactions?

Techniques for debugging deadlocks

There are several approaches:

Debugging deadlocks by logging on the application side

If the application logs information about each error and each database transaction started, it is easy to determine the transactions that caused the problem. However, that requires that you either have control over the application code, or that the application has sufficiently well-designed tracing facilities. You are not always in that lucky position.

Debugging deadlocks by annotating queries on the application side

This is also a method that requires that you can modify the application. The trick here is to add a comment to the query, like

The information can be anything that allows you to tell which transaction the statement belongs to.

Debugging deadlocks by logging on the database side

In order to enable us to trace statements, we have to make sure that the process ID and transaction number are logged with each statement. With CSV logging, that is always the case, but with stderr logging, you have to change log_line_prefix:

Also, you have to log all statements:

Now the amount and performance impact of logging all statements may be forbidding for busy applications. PostgreSQL v12 introduced log_transaction_sample_rate, which allows you to log only a certain percentage of all transactions. However, in many cases that won't be good enough, because you'll typically only see the statements from one of the transactions involved.

The log file for our example would look as follows:

Note that the transaction ID is logged as 0 until after the first data modifying statement in the transaction. This is because PostgreSQL assigns transaction IDs lazily, only when they are needed.

Debugging deadlocks by increasing deadlock_timeout

The PostgreSQL deadlock detector doesn't kick in immediately, but waits until after deadlock_timeout has passed. This is because the deadlock detector is a computationally expensive component, so we only want to activate it if there is very likely a problem. By default, the parameter is set to one second. Unfortunately, one second is much too short to debug the problem. Now if you can afford to let deadlocked sessions “hang” until you can debug them, you can increase the parameter. The maximum allowed value is almost 600 days, which is longer than you need. With proper monitoring, you could for example set deadlock_timeout to 15 minutes to catch deadlocks in the act.

Once you have an active deadlock, you can find client_addr and client_port from pg_stat_activity for the hanging sessions:

We sort by state_change, because other sessions might later get stuck behind the sessions that originally caused the deadlock. For that reason, you want to focus on the oldest entries.

Next, log into the client machine and find out which process has a TCP connection on port 49578:

Now you know that the client process is 3974, running the program myapp. You can now attach to the program with a debugger and see the current execution stack. That should give you a clue as to what is being executed, so you can figure out the SQL statements run in that transaction.

Needless to say, this is yet another method that requires control over the application code.

Conclusion

We have seen several methods for debugging deadlocks. Most of them require that you have control over the application code. Not all techniques are useful in every case: for example, you may not be able to afford logging all statements or leaving deadlocks hanging for a longer time.

For more information on deadlocks, see Hans' blog on Understanding Deadlocks.


Please leave your comments below. In order to receive regular updates on important changes in PostgreSQL, subscribe to the CYBERTEC newsletter, or follow us on TwitterFacebook, or LinkedIn

0 0 votes
Article Rating
Subscribe
Notify of
guest
2 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Frane Jelavić
Frane Jelavić
1 year ago

Laurenz,
Could you explain this part about:

Note that a FOR UPDATE row lock is only required if you want to DELETE a row, or if you intend to update a column with a unique or primary key constraint on it (a key).

You've probably talked about it in the blog, I'm searching it as we speak.

laurenz
laurenz
1 year ago
Reply to  Frane Jelavić

I only briefly glossed over it, but my article about row locks describes it in some more detail.

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram