CYBERTEC Logo

Parallel queries and SERIALIZABLE transactions

11.2016 / Category: / Tags:

In version 9.6 PostgreSQL introduced parallel queries. The ability to use more than just one CPU core per query is a giant leap forward and has made PostgreSQL an even more desirable database. With parallel queries many workloads can be sped up considerably. However, in this article, I want to point out a small missing feature, which bugged one of our support clients recently: lack of support for SERIALIZABLE. UPDATE Sept. 2023: This restriction was lifted as of v12, see this note in the documentation.

Consider the following example:

The sample table contains 10 million rows.

To tell the optimizer about the content of the table, running ANALYZE seems like a good idea:

Then users can tell PostgreSQL about the maximum number of cores allowed for the query we are about to run:

After that we can already run a simple aggregation:

PostgreSQL will execute a so called parallel sequential scan using 4 CPU cores (= worker processes). The system will scale nicely and significantly better execution times can be observed.

SERIALIZABLE and parallelism

We have seen a couple of people use SERIALIZABLE transactions for analytical requests. While this might not be a good idea for other reasons, there are also implications when it comes to the use of parallel queries. At this point (as of PostgreSQL 9.6) there is some code in PostgreSQL which is not yet fully parallel safe. Therefore parallel queries are not yet supported if you happen to use SERIALIZABLE. If you are affected by this, consider using REPEATABLE READ.

Here is what happens:

As you can see, only a single core is used.

The code inside the query planner (planner.c) is very clear about this:

We can't use parallelism in serializable mode because the predicate locking code is not parallel-aware.  It's not catastrophic if someone tries to run a parallel plan in serializable mode; it just won't get any workers and will run serially.  But it seems like a good heuristic to assume that the same serialization level will be in effect at plan time and execution time, so don't generate a parallel plan if we're in serializable mode.

Future versions of PostgreSQL will surely relax this at some point so that all isolation levels can benefit from parallel queries.

UPDATE:

This limitation was removed as of PostgreSQL v12: see the documentation on "Allow parallelized queries when in SERIALIZABLE isolation mode (Thomas Munro)" for more information.

See also this note about the limitation to parallel query (v15).

In case you need any assistance, please feel free to contact us.
 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

0 0 votes
Article Rating
Subscribe
Notify of
guest
1 Comment
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
David Fetter
David Fetter
7 years ago

What's involved with making predicate locking parallel-aware?

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    1
    0
    Would love your thoughts, please comment.x
    ()
    x
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram