
After testing shared_buffers recently, I decided to do a little more testing on our new office desktop PC (8-core AMD CPU, 4 GHz, 8 GB RAM and a nice 128 GB Samsung 840 SSD). All tests were conducted on the SSD drive.

This time I tested the impact of maintenance_work_mem on indexing speed. To get started I created half a billion rows using the following simple code snippet:
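A minimal sketch of such a snippet, assuming a table called t_test with a single integer column id (both names are just examples):

-- half a billion (2^29) sequentially numbered integers, loaded in ascending order
CREATE TABLE t_test (id int4);
INSERT INTO t_test
    SELECT * FROM generate_series(1, 536870912);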

Once the data is loaded we can check the database size. The goal of the data set is to clearly exceed the size of RAM so that we can observe more realistic results.
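A simple way to check this from psql (again assuming the hypothetical table name t_test):

-- total size of the current database; it should clearly exceed the 8 GB of RAM
SELECT pg_size_pretty(pg_database_size(current_database()));

-- number of rows loaded
SELECT count(*) FROM t_test;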

In our case we got around 24 GB of data, and the number of rows is 536870912.

Sorted input

The point is that our data set has an important property: the integer column we are going to index was loaded in ascending order, which implies that the input for the index creation is already sorted.

Each index is built like this:
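A sketch of a single run, reusing the hypothetical names from above (the index name and the concrete maintenance_work_mem value are varied per run):

-- force dirty data to disk so indexing is not distorted by checkpoint I/O
CHECKPOINT;

-- the setting under test; varied between 1 MB and 4 GB
SET maintenance_work_mem TO '1MB';

\timing
CREATE INDEX idx_id ON t_test (id);

-- drop the index again before the next run
DROP INDEX idx_id;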

We only measure the time of the index creation itself. A checkpoint is requested before we start building the index to make sure that dirty data is forced to disk beforehand and no unexpected I/O load can occur while indexing.

The results are as follows:

1 MB: 516 sec
32 MB: 605 sec
128 MB: 641 sec
512 MB: 671 sec
1 GB: 688 sec
2 GB: 704 sec
4 GB: 709 sec

The interesting part here is that the top-performing index creation happens with very little memory. This is not what one might expect at first, but it makes sense given the way things work: with presorted input there is little sorting to do, so extra sort memory does not buy anything. Given more memory, we see a steady decline instead.

Unsorted input

Let us repeat the test using unsorted data now.

We create a new table containing the same data in completely random order:
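A minimal sketch, assuming the new table is called t_random:

-- same data as before, but in completely random physical order
CREATE TABLE t_random AS
    SELECT id FROM t_test ORDER BY random();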

Let us run those index creations again:
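Each run again looks like the sketch above, just against the new table:

CHECKPOINT;
SET maintenance_work_mem TO '128MB';   -- varied per run, as before
\timing
CREATE INDEX idx_random_id ON t_random (id);

The results are: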

1 MB: 1810 sec
32 MB: 1176 sec
128 MB: 1130 sec
512 MB: 1244 sec
1 GB: 1315 sec
2 GB: 1385 sec
4 GB: 1411 sec

As somewhat expected, the time needed to create the indexes has risen. But this time the best performer is not at the lower or the upper edge of the spectrum, but somewhere in the middle, at 128 MB. Once the system gets more memory than that, we again see a decline, which is actually quite significant.

From my point of view this is a clear indication that being too generous with maintenance_work_mem is not the best strategy.

Of course, results will differ on different hardware and depend on the nature of the data as well as on the type of index in use (GiST and GIN would behave differently). For B-trees, however, the picture seems clear.


Shared buffers and performance

A lot has been said about PostgreSQL shared buffers and performance. As my new desktop box arrived this week, I decided to give it a try and see how a simple benchmark performs with various settings of shared_buffers.

The first thing needed to run the test is a simple test database. I decided to use pgbench with a scale factor of 100. This gives a total of 10 million rows and a database size of roughly 1.5 GB:
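A sketch of the initialization, assuming a database called pgbench:

createdb pgbench
pgbench -i -s 100 pgbench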

The test box we are running on is an AMD FX-8350 (8 cores) with 16 GB of RAM and a 256 GB Samsung 840 SSD.

Running the test

A simple shell script is used to conduct the test:
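A sketch of what such a script can look like; the client count, thread count and runtime are assumptions, and shared_buffers is changed in postgresql.conf (with a server restart) between runs:

#!/bin/sh
DB=pgbench

# read-only run using the simple query protocol
pgbench -S -M simple -c 16 -j 4 -T 60 $DB

# the same read-only run using prepared statements
pgbench -S -M prepared -c 16 -j 4 -T 60 $DB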

The goal of the test is to see if the amount of shared memory does have an impact on the performance of the system.

NOTE: In our test the benchmarking program is running on the same node, consuming around 1.5 CPU cores itself. So the server can basically only use the remaining 6.5 cores. While this does not yield absolute performance figures, it is still enough to see what impact changing the size of the shared buffers has.

Here are the results. Buffers are given in 8 kB blocks, so 1024 buffers correspond to 8 MB and 1048576 buffers to 8 GB of shared memory:

Buffers    TPS simple    TPS prepared
1024       80529,33      120403,43
4096       80496,44      119709,02
16384      82487,12      124697,52
65536      87791,88      139695,82
131072     92086,75      150641,91
262144     94760,85      155698,34
524288     94708,02      155766,49
1048576    94676,83      156040,41

What we see is a steady increase of database throughput up to roughly the size of the database itself. Then the curve flattens out, as expected:

[Chart: transactions per second vs. shared_buffers]

It is interesting to see that the impact of not having to ask the operating system for each and every block is actually quite significant.

The same can be said about prepared queries. Avoiding parsing has a serious impact on performance, especially on short read-only transactions. Close to 160,000 reads per second on 10 million rows is actually quite a lot, given the fact that we are talking about hardware that costs around 1,000 euros including VAT.
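The -M prepared mode essentially does what PREPARE / EXECUTE do in plain SQL: the query is parsed once and then only executed. A minimal illustration against the pgbench schema (the statement name is arbitrary):

-- parse and name the query once
PREPARE get_balance (int) AS
    SELECT abalance FROM pgbench_accounts WHERE aid = $1;

-- subsequent executions skip the parsing step
EXECUTE get_balance(42);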

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
