PostgreSQL is a powerful database which supports partitioning. In contrast to Oracle, partitioning is part of the PostgreSQL core engine and does not require any additional licensing or extensions. If you are migrating from Oracle to PostgreSQL, this is a major advantage.
However, just like any other technology, partitioning is not without risk. While it can help handle large quantities of data efficiently, it also has downsides which have to be taken into consideration when using this powerful feature.
Querying data without partitions
The most basic example involves a table consisting of two columns:
test=# CREATE TABLE t_simple (
           id  serial,
           val int DEFAULT random()*100000
       );
CREATE TABLE
test=# INSERT INTO t_simple SELECT FROM generate_series(1, 10000000);
INSERT 0 10000000
test=# CREATE INDEX ON t_simple (val);
CREATE INDEX
test=# VACUUM ANALYZE;
VACUUM
In this case we have created a table and loaded 10 million rows. The column we want to query is indexed. When looking for a handful of values, we end up with a single index scan:
test=# explain (analyze, buffers, costs) SELECT * FROM t_simple WHERE val = 454650;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on t_simple  (cost=5.20..388.39 rows=99 width=8) (actual time=0.010..0.011 rows=0 loops=1)
   Recheck Cond: (val = 454650)
   Buffers: shared hit=3
   ->  Bitmap Index Scan on t_simple_val_idx  (cost=0.00..5.18 rows=99 width=0) (actual time=0.009..0.010 rows=0 loops=1)
         Index Cond: (val = 454650)
         Buffers: shared hit=3
 Planning Time: 0.066 ms
 Execution Time: 0.028 ms
(8 rows)
This is not really a surprise. What is noteworthy here is that the query is executed in a fraction of a millisecond. PostgreSQL can execute thousands of such queries per second per CPU core. On a large server we can easily reach more than 1 million queries per second.
Using PostgreSQL partitioning to store data
Let’s deploy the same data using partitions. To prove our point, we’ll use hash partitioning:
test=# CREATE TABLE t_part (
           id  serial,
           val int DEFAULT random()*100000
       ) PARTITION BY HASH (id);
CREATE TABLE
After deploying the parent table, we can create the partitions. For the sake of simplicity, I have created a set of only 8 partitions, which is enough to make the point. The effect you are going to see grows even larger as the number of partitions increases:
CREATE TABLE t_part_1 PARTITION OF t_part FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE t_part_2 PARTITION OF t_part FOR VALUES WITH (MODULUS 8, REMAINDER 1);
CREATE TABLE t_part_3 PARTITION OF t_part FOR VALUES WITH (MODULUS 8, REMAINDER 2);
CREATE TABLE t_part_4 PARTITION OF t_part FOR VALUES WITH (MODULUS 8, REMAINDER 3);
CREATE TABLE t_part_5 PARTITION OF t_part FOR VALUES WITH (MODULUS 8, REMAINDER 4);
CREATE TABLE t_part_6 PARTITION OF t_part FOR VALUES WITH (MODULUS 8, REMAINDER 5);
CREATE TABLE t_part_7 PARTITION OF t_part FOR VALUES WITH (MODULUS 8, REMAINDER 6);
CREATE TABLE t_part_8 PARTITION OF t_part FOR VALUES WITH (MODULUS 8, REMAINDER 7);
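If you prefer not to write eight nearly identical statements by hand, the same partitions can be generated in a loop. This is a minimal sketch using a PL/pgSQL DO block with format(); it assumes the t_part parent table already exists and produces exactly the eight CREATE TABLE statements shown above:

```sql
-- Create the 8 hash partitions in a loop rather than one by one.
-- Equivalent to the eight CREATE TABLE statements above.
DO $$
BEGIN
    FOR i IN 0..7 LOOP
        EXECUTE format(
            'CREATE TABLE t_part_%s PARTITION OF t_part
             FOR VALUES WITH (MODULUS 8, REMAINDER %s)',
            i + 1, i
        );
    END LOOP;
END
$$;
```

The same loop scales naturally if you later decide to experiment with a larger modulus.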
Once the data structure has been created, we can load the same data we loaded before:
test=# INSERT INTO t_part SELECT FROM generate_series(1, 10000000);
INSERT 0 10000000

Again, an index is created to make a fair comparison:

test=# CREATE INDEX ON t_part (val);
CREATE INDEX
test=# VACUUM ANALYZE;
VACUUM
Running the same query as before will result in a far more complex execution plan, which already points to the root cause we’re going to discuss:
test=# explain (analyze, buffers, costs) SELECT * FROM t_part WHERE val = 454650;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Append  (cost=4.53..438.92 rows=104 width=8) (actual time=0.141..0.145 rows=0 loops=1)
   Buffers: shared read=24
   ->  Bitmap Heap Scan on t_part_1  (cost=4.53..54.80 rows=13 width=8) (actual time=0.024..0.025 rows=0 loops=1)
         Recheck Cond: (val = 454650)
         Buffers: shared read=3
         ->  Bitmap Index Scan on t_part_1_val_idx  (cost=0.00..4.53 rows=13 width=0) (actual time=0.023..0.023 rows=0 loops=1)
               Index Cond: (val = 454650)
               Buffers: shared read=3
   ->  Bitmap Heap Scan on t_part_2  (cost=4.53..54.80 rows=13 width=8) (actual time=0.018..0.019 rows=0 loops=1)
         Recheck Cond: (val = 454650)
         Buffers: shared read=3
         ->  Bitmap Index Scan on t_part_2_val_idx  (cost=0.00..4.53 rows=13 width=0) (actual time=0.018..0.018 rows=0 loops=1)
               Index Cond: (val = 454650)
               Buffers: shared read=3
   ->  Bitmap Heap Scan on t_part_3  (cost=4.53..54.80 rows=13 width=8) (actual time=0.014..0.015 rows=0 loops=1)
         Recheck Cond: (val = 454650)
         Buffers: shared read=3
         ->  Bitmap Index Scan on t_part_3_val_idx  (cost=0.00..4.53 rows=13 width=0) (actual time=0.014..0.014 rows=0 loops=1)
               Index Cond: (val = 454650)
               Buffers: shared read=3
   ->  Bitmap Heap Scan on t_part_4  (cost=4.53..54.80 rows=13 width=8) (actual time=0.014..0.014 rows=0 loops=1)
         Recheck Cond: (val = 454650)
         Buffers: shared read=3
         ->  Bitmap Index Scan on t_part_4_val_idx  (cost=0.00..4.53 rows=13 width=0) (actual time=0.014..0.014 rows=0 loops=1)
               Index Cond: (val = 454650)
               Buffers: shared read=3
   ->  Bitmap Heap Scan on t_part_5  (cost=4.53..54.80 rows=13 width=8) (actual time=0.016..0.017 rows=0 loops=1)
         Recheck Cond: (val = 454650)
         Buffers: shared read=3
         ->  Bitmap Index Scan on t_part_5_val_idx  (cost=0.00..4.53 rows=13 width=0) (actual time=0.016..0.016 rows=0 loops=1)
               Index Cond: (val = 454650)
               Buffers: shared read=3
   ->  Bitmap Heap Scan on t_part_6  (cost=4.53..54.80 rows=13 width=8) (actual time=0.018..0.018 rows=0 loops=1)
         Recheck Cond: (val = 454650)
         Buffers: shared read=3
         ->  Bitmap Index Scan on t_part_6_val_idx  (cost=0.00..4.53 rows=13 width=0) (actual time=0.017..0.017 rows=0 loops=1)
               Index Cond: (val = 454650)
               Buffers: shared read=3
   ->  Bitmap Heap Scan on t_part_7  (cost=4.53..54.80 rows=13 width=8) (actual time=0.017..0.017 rows=0 loops=1)
         Recheck Cond: (val = 454650)
         Buffers: shared read=3
         ->  Bitmap Index Scan on t_part_7_val_idx  (cost=0.00..4.53 rows=13 width=0) (actual time=0.017..0.017 rows=0 loops=1)
               Index Cond: (val = 454650)
               Buffers: shared read=3
   ->  Bitmap Heap Scan on t_part_8  (cost=4.53..54.80 rows=13 width=8) (actual time=0.017..0.018 rows=0 loops=1)
         Recheck Cond: (val = 454650)
         Buffers: shared read=3
         ->  Bitmap Index Scan on t_part_8_val_idx  (cost=0.00..4.53 rows=13 width=0) (actual time=0.017..0.017 rows=0 loops=1)
               Index Cond: (val = 454650)
               Buffers: shared read=3
 Planning:
   Buffers: shared hit=100
 Planning Time: 0.769 ms
 Execution Time: 0.235 ms
(54 rows)
Wow, what an execution plan. Partitioning has not done us any favors here. The reason is that the query does not filter on the partition key, so PostgreSQL has to scan every single partition to find the data. This causes a significantly higher runtime, since it is simply more work for the database engine.
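For contrast, consider what happens when the partition key does appear in the WHERE clause. The planner can then prune all non-matching hash partitions at plan time. This is a sketch against the t_part table from above; the id value is arbitrary and the exact cost numbers will differ on your system:

```sql
-- Filtering on the partition key (id) allows plan-time pruning:
EXPLAIN SELECT * FROM t_part WHERE id = 123456;
-- The plan should show a scan on a single t_part_N partition
-- instead of an Append node over all eight partitions.
```

This is the scenario partitioning is designed for: queries that carry the partitioning criterion touch only the partitions that can actually contain matching rows.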
Conclusion and takeaways
The takeaway is that if the partitioning criterion is not part of the query, or if many partitions have to be touched in general, runtime will suffer, which is exactly what happened here. It is also important to note that using too many partitions will significantly increase the time the planner needs to do its job. Having hundreds of partitions can easily lead to a real disaster.
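When diagnosing such cases, it is also worth verifying that partition pruning is enabled at all. It is controlled by the enable_partition_pruning parameter, which has been available and on by default since PostgreSQL 11:

```sql
-- Check that plan-time / execution-time partition pruning is active:
SHOW enable_partition_pruning;   -- expected: on (the default)
```

If this parameter has been switched off, every query against a partitioned table will scan all partitions regardless of its WHERE clause.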
Partitioning has many advantages. However, it has to be used cleverly, not blindly. No feature is advantageous in every situation, and partitioning is no exception, so it makes sense to take these trade-offs into consideration.
For further blogs on the topic of partitioning and performance, see these posts:
- Automatic partition creation in PostgreSQL
- Which partition did I insert my data into?
- Learn to implement pg_rewrite