CYBERTEC PostgreSQL Logo

PostgreSQL 18: Better I/O performance with AIO

09.2025
Category: 

PostgreSQL 18 is around the corner and it is time to take a look at one of the most important improvements that have been added to the core engine. We are of course talking about the introduction of asynchronous I/O (AIO), which has been a huge topic over the years. 

Synchronous vs. asynchronous I/O

Let's dive into this and understand what the fuzz is all about. The standard I/O model works roughly like this:

PostgreSQL opens a file and reads one 8k block after the other. Yes, the kernel will do readahead and so on, but essentially the process is:

  • request an 8k block from the OS
  • wait until it arrives 
  • process the block

This process is repeated. In real life, this means that the database cannot do calculations while we are waiting for the data to arrive. That is the curse of synchronous I/O. So, wouldn’t it be cool if the OS could already produce the data while the database is doing its calculations? The goal of AIO is to make reading and processing happen in parallel.

Here is a picture that illustrates this in more detail:

How does this work? The database schedules I/O and asks to provide it for later use. In the meantime, processing of already existing data can happen. This greatly reduces the latency the system will face. Obviously this is not beneficial to all operations, but it can greatly speed up things such as sequential scans, bitmap scans and so on.

In reality, one can expect better I/O performance - especially when running large scale operations.

However, let us not stick to theoretical explanations and move on to a more practical showcase.

Testing asynchronous I/O in PostgreSQL 18

The first thing one can do in PostgreSQL 18 and beyond is to actually configure the I/O method we are looking at. Here is how it works:

Configuring AIO in PostgreSQL

PostgreSQL 18 supports three I/O modes:

  • worker: asynchronous I/O using worker processes
  • io_uring: execute asynchronous I/O using io_uring
  • sync (execute asynchronous-eligible I/O synchronously)

The parameter in use here is io_method, which can be found in postgresql.conf. We recommend using io_uring for maximum efficiency.

PostgreSQL 18 performance data

For the purpose of this test we will simply use pgbench, which can provide us with a simple enough data set for the tests we want to perform. In this example, we will use pgbench to initialize a database called “bench”, which will be roughly 1 TB in size:

The “-s” parameter says that we want to produce 75000 times x 100k rows in our sample database. 

Note that we are running the test on a local NVME disk on a Fedora Linux box:

The table that is ideal for the I/O test is the first one in the list. It is sufficiently big to not fit into memory and provides us exactly what we need: The chance to run a big fat simple sequential scan on a large table.

But let us check the desired configuration first:

The most simplistic query that forces PostgreSQL to read the entire table might be as follows: We simply count the number of entries in the “bid” column of our big table. There is no index on this column and therefore the system will trigger a parallel sequential scan. The following listing shows the execution plan of our SQL statement: 

Let us just run it and see what happens:

The query takes almost 5 minutes. When looking at the I/O throughput of our disk, we can see that the disk peaks around 3.4 GB / sec (the “bi” column in the next listing contains the real system data represented as a 2 second average):

It is also noteworthy that the “wa” column (“disk wait”) is around 40. During the execution, one can see that PostgreSQL will use up around 3 cores and keep the disk totally busy.

Inspecting asynchronous I/O

To see what is going on during asynchronous reads, PostgreSQL offers a couple of new functions, which allow us to inspect how much I/O has been scheduled and what is going on in general. The first thing to look at is the pg_get_aios() function, which returns some insights:

On my local machine, the return value happened to be around 30 and 50 given 2 parallel worker processes. However, this number can of course vary depending on hardware, configuration, structure of your data and a lot more. 

Of course, we can also take a deeper look at the data returned by the SFR (“set returning function”):

For every I/O request PostgreSQL has scheduled, we will get one entry so that we can understand exactly what is going on behind the scenes. 

What is interesting to see here is: Regardless of the number of worker processes (assuming max_parallel_workers_per_gather is at least 2) the throughput on disk won’t change anymore because we have simply hit the limit of what the disk is able to handle. We are simply completely I/O bound already. 

Comparing synchronous and asynchronous I/O

Back in the old days, all we had was synchronous I/O, and the new behavior is cutting edge. So let us compare the old and new technology by setting “io_method  = sync”, which is what we had in PostgreSQL 17 and before:

The changes are substantial: We top out at 2.6 GB already, which is roughly 800 MB / sec less than before. To get closer to what is doable, we have to raise the number of worker processes to somehow compensate for this:

In other words: We have to throw additional CPUs at the problem to make better use of the disk as such. When we take a look at “vmstat 2 -w”, there is something worth noticing:

The disk wait has gone up substantially - we are now at around 75 instead of 40, which of course has a couple of implications that can affect the rest of your system in a fairly significant way. 

Comparing results

Visualizing the results can give us a feeling of what we are dealing with:

Now, with faster (or more) disks, we can see even higher differences and bigger gains. However, what is already pretty evident is: “PostgreSQL 18 makes significantly better use of your hardware” - especially if you are running on premise. 

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2025
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram