CYBERTEC Logo

Reducing the impact of locking

10.2013 / Category: / Tags: |

“Waiting is the slowest way to execute an operation” - this is how a professor at my university in Vienna usually described bad locking and low concurrency. He could not have been more right. The same applies to performing calculations: The fastest way to calculate something is to try to skip it entirely. In this posting we want to focus on the first half of the story – minimizing locking.

Let us assume we have three tables:

We can populate those tables with a handful of values to get our example started:

SELECT ... FOR UPDATE

Concurrency (= doing things in parallel) is an essential thing to consider when writing code. In most applications there is more than just one user at the same time and those users might very well focus their attention on the same set of data.

To make sure that nobody can modify a row while we are processing it, all modern relational databases such as Oracle and PostgreSQL provide a syntax component called “SELECT .. FOR UPDATE”. All rows returned by “SELECT FOR UPDATE” are locked as if they were about to be updated. This will help us to avoid concurrency issues and strange behavior (race conditions).

In our example we want to perform a join between table A and table B and lock all rows returned by SELECT. Here is the code:

As you can see, data is returned as expected. For the sake of this example we keep the transaction open. Let us run a second transaction now, which is executed, while the first session is still active:

The point here is: The second session has to wait because session 1 and session 2 will have to lock rows in the B table. This is exactly the desired behavior.

Improving concurrency

But: What if B was just a simple lookup table? What if A was a payment table and B just a postal code table? What if C was a table storing data about people?

If somebody was about to modify some payment and if he has to look up some postal code for this purpose, we would automatically lock other transactions out, which want to modify data about people. The lock on the postal code table would magically spread to other areas of the code, which have nothing to do with payment.

If changes to payments are frequent, the postal code table would be the poison pill for all writing transactions touching the table storing people. Clearly, this is not acceptable. How can you ever scale to 64 CPUs or beyond if you are shot down by a simple row level lock?

PostgreSQL has the right solution to this kind of problem: SELECT FOR UPDATE allows you to specify, which table in the join you want to lock. It works like this:

In this example we are telling the system that we are joining A and B but we only want to lock A. This makes sure that C can be locked undisturbed:

This SELECT statement will only take care of table C and make sure that B is not touched because we don't want to modify it anyway.

As you can see, a small change can have a significant impact on the overall performance because we gave magically allowed a second transaction (and thus a second CPU) to proceed without having to wait on a statement.

The golden rule of thumb is: The more concurrency you got in your system the faster you are overall. Never lock more than needed.

Visit us on facebook: www.fb.com/cybertec.postgresql

Comments are closed.

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram