For many years various commercial database vendors have already offered this feature and we are glad that PostgreSQL has become part of this elite club, which offers multi-core index creation that will dramatically improve the usability of large database deployments in the future.
Creating large tables in PostgreSQL
Since version 11, PostgreSQL supports classical “stored procedures”. The beauty is that a procedure can run more than one transaction, which is ideal if you want to generate huge amounts of random data. When you call generate_series to generate 1 million rows, PostgreSQL has to keep this data in memory. Consequently generating hundreds of millions of random rows using more than 1 transactions can be really useful to reduce the memory footprint. Here is how it works:
CREATE TABLE t_demo (data numeric); CREATE OR REPLACE PROCEDURE insert_data(buckets integer) LANGUAGE plpgsql AS $$ DECLARE i int; BEGIN i := 0; WHILE i < buckets LOOP INSERT INTO t_demo SELECT random() FROM generate_series(1, 1000000); i := i + 1; RAISE NOTICE 'inserted % buckets', i; COMMIT; END LOOP; RETURN; END; $$; CALL insert_data(500);
This tiny bit of code loads 500 million random numeric values, which should be enough to demonstrate, how CREATE INDEX can be improved in PostgreSQL 11. In our example 500 million rows translate to roughly 21 GB of data:
test=# \d+ List of relations Schema | Name | Type | Owner | Size | Description --------+--------+-------+-------+-------+------------- public | t_demo | table | hs | 21 GB | (1 row)
The reason why I went for numeric is that numeric causes the most overhead of all number data types. Creating a numeric index is a lot more costly than indexing, say, int4 or int8. The goal is to see, how much CPU time we can save by building a large index on a fairly expensive field.
CREATE INDEX: Using just 1 CPU core
In PostgreSQL 11 parallel index creation is on by default. The parameter in charge for this issue is called max_parallel_maintenance_workers, which can be set in postgresql.conf:
test=# SHOW max_parallel_maintenance_workers; max_parallel_maintenance_workers ---------------------------------- 2 (1 row)
The default value here tells PostgreSQL that if the table is sufficiently large, it can launch two workers to help with index creation. To compare a “traditional” way to create the index with the new settings, I have set max_parallel_maintenance_workers to 0. This will ensure that no multicore indexing is available:
test=# SET max_parallel_maintenance_workers TO 0; SET
The consequence is that indexing will take forever. When running the CREATE INDEX statement we will see a lot of I/O and a lot of CPU. To make things worse I left all memory parameters at their default value, which means that the index creation has to work with only 4 MB of memory, which is nothing given the size of the table.
Here are the results on my “Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz”:
test=# CREATE INDEX idx1 ON t_demo (data); CREATE INDEX Time: 1031650.658 ms (17:11.651)
17 minutes, not too bad. Remember, we are talking about 500 million of really nasty lines of data.
Using more than just one core
Let us run the same type of indexing on 2 cores:
test=# SET max_parallel_maintenance_workers TO 2; SET test=# CREATE INDEX idx2 ON t_demo (data); CREATE INDEX Time: 660672.867 ms (11:00.673)
Wow, we are down to 11 minutes. Of course the operation is not completely linear because we have to keep in mind that those partial results have to be merged together and all that. But, there is a catch: If set max_parallel_maintenance_workers to 2 and what we saw is 2 cores, right? What if we set the value to 4? In my case 4 is the number of physical cores in the machine so it makes no sense to use any higher values. What you will see is that PostgreSQL still uses only two cores.
How can we change that? The answer can be found in the next listing: ALTER TABLE … SET … allows us to lift this restriction and use more workers:
test=# ALTER TABLE t_demo SET (parallel_workers = 4); ALTER TABLE test=# SET max_parallel_maintenance_workers TO 4; SET
In this case both, max_parallel_workers and the table parameter are set to 4. What we will see now is that PostgreSQL will utilize 5 processes. Why does that happen? What you will see is one main process and 4 processes helping with index creation. That might not be totally obvious, but it makes sense when you think about it.
Of course we cannot add an infinite number of workers and expect performance to grow linearly. At this stage our (single) SSD will also start to run into performance limitations and we won’t see a two times increase anymore:
test=# CREATE INDEX idx3 ON t_demo (data); CREATE INDEX Time: 534775.040 ms (08:54.775)
Everybody is doing the same thing pretty much at the same time so we will see wild swings in our I/O curve, which naturally makes the entire thing a bit slower and not linear. Still, we managed to speed up our index creation from 17 minutes to close to 9 minutes by simply adding more cores to the system.
Using more memory for CREATE INDEX
CPU cores are not the only limiting factor during index creation. Memory is also of significant importance. By default maintenance_work_mem is set to a really low value (64 MB), which greatly limits the amount of data, which can be sorted in memory. Therefore the next logical step is to increase this parameter and set it to a higher value creating the new index:
test=# SET maintenance_work_mem TO '4 GB'; SET
In my case I decided to pump the value to 4 GB. My server has 32 GB of memory and we have to keep in mind that we are not the only ones, which might create an index so 4 GB x 5 cores might already be a really aggressive value in a real-world scenario.
What we will see while creating the index is a lot more parallelism going on in the first phase of the index creation, which is exactly what we are supposed to see and what we expected. You can also see quite clearly that towards the end CPU usage is pretty low and PostgreSQL is waiting on the disk to do its job. The entire system has been set up with default values so writes have not been optimized yet and are therefore going to be an issue.
However, we will still see a nice improvement:
test=# CREATE INDEX idx4 ON t_demo (data); CREATE INDEX Time: 448498.535 ms (07:28.499)
7 minutes and 28 seconds. That is already very nice. But let us see if we can do even better. What we have seen so far is that checkpoints and I/O have started to become a limiting factor. Therefore we will try to improve on that by telling PostgreSQL to use larger checkpoint distances. In this example I have decided to change postgresql.conf to the following values:
checkpoint_timeout = 120min max_wal_size = 50GB min_wal_size = 80MB
Those settings can easily be activated by reloading the config file:
test=# SELECT pg_reload_conf(); pg_reload_conf ---------------- t (1 row)
Let us create a new index using those larger checkpoint distances.
When looking at the process table while building the index you can notice that PostgreSQL spent quite a lot of time on writing the WAL to disk. As long as we stick to a single SSD there is not much more we can do about it. However, what will happen if we play our next trump card? Additional hardware. What if we created all our temporary data on one disk, send the WAL to the main disk and create the index on a third SSD? This way we could split the amount of I/O needed quite nicely and see what happens.
Using tablespaces in PostgreSQL to speed up indexing
As already stated adding more hardware by using tablespaces might be a good idea. I am well aware that this might not be possible in a modern cloud environment. However, on my test server I still got the luxury items: A couple of real physical SSD drives.
So let us give them a try and create two tablespaces, which can store the data. On top of that I will tell PostgreSQL to use those tablespaces for sorting and to store the new index:
test=# CREATE TABLESPACE indexspace LOCATION '/ssd1/tabspace1'; CREATE TABLESPACE test=# CREATE TABLESPACE sortspace LOCATION '/ssd2/tabspace2'; CREATE TABLESPACE
Then we can tell PostgreSQL where to put temporary data:
test=# SET temp_tablespaces TO sortspace; SET
In the next step the index creation can start:
test=# CREATE INDEX idx6 ON t_demo (data) TABLESPACE indexspace; CREATE INDEX Time: 408508.976 ms (06:48.509)
What we see here during the index creation is that our throughput peaks at higher values than before because more than one SSD can work at the same time. Instead of 500 MB / sec peak our throughput goes up to as much as 900 MB / sec at times. The overall speed has improved as well. We are already below 7 minutes, which is really nice.
If you add more hardware to the box it might be worth considering creating one filesystem using all disks at once. I did not have time to test this options, but I assume that it might similar and maybe even better results then what I was able to come up with in this first test.
TIP: Don’t underestimate the importance of the data type in use. If we did the same test using normal integer values, we could create the index in 3 min 51 seconds. In other words: The data type is of significant importance.
In this post you have seen that creating indexes can be improved. However, keep in mind that new indexes are not always beneficial. Pointless indexes can even slow down things. To figure out, which indexes might not be needed, consider reading a post written by Laurenz Albe, who explains, how to tackle this kind of problem.