Improving transaction latency by moving indexes to faster media

By Kaarel Moppel

The topic of transaction performance is as relevant as ever. In spite of constant hardware improvements, we're quite often asked how to improve in that area. But the truth is that once the most important PostgreSQL configuration parameters are more or less tuned, it is usually really hard to squeeze that extra something out of a setup without also modifying the schema.

What I've noticed is that a lot of people with performance issues still seem to be using "old" spinning disks. Of course there are many different reasons for that (maybe long-term server lease contracts, or losing support when changing hardware configurations). But for cases where this is done because there's just too much data and moving everything would get very expensive, there might be a remedy. Most people don't seem to realize that for OLTP systems there's already a huge win in moving just the indexes of the busiest tables to SSDs or some other low-latency media. So here's a quick overview of how to do that, with some indicative numbers.

Moving indexes to fast storage - improve transaction latency

The process is quite simple.

1. Install/connect/mount the media. This is probably the hardest part.

2. Create a Postgres tablespace (superuser needed). It would then make sense to also adjust the random_page_cost parameter.

CREATE TABLESPACE fast_storage LOCATION '/some/mount'
  WITH (random_page_cost=1.25);

3. Move the impactful (lots of columns) indexes, or all existing ones, to that tablespace. NB! This will result in full locking. Since 9.4 it's also possible to move all indexes to another tablespace with "ALTER INDEX ALL IN TABLESPACE", but this basically means downtime, as everything is locked first and only then does the moving start. One can also do it in a more controlled, manual way via "CREATE INDEX CONCURRENTLY ... TABLESPACE ..." + "ALTER INDEX ... RENAME" + "DROP INDEX", or use the pg_squeeze/pg_repack extensions, which can basically do the same.

-- single index
ALTER INDEX pgbench_accounts_aid_idx SET TABLESPACE fast_storage;
-- or all indexes on a table with a small DO block (which could be improved to deal with schemas)
DO $$
DECLARE
  r record;
BEGIN
  FOR r IN SELECT ci.relname FROM pg_class c
                  JOIN pg_index i ON i.indrelid = c.oid JOIN pg_class ci ON i.indexrelid = ci.oid
                  WHERE c.relname = 'pgbench_accounts'
  LOOP
    -- %I quotes the index name as an identifier
    EXECUTE format('ALTER INDEX %I SET TABLESPACE fast_storage', r.relname);
  END LOOP;
END
$$;
-- or by leveraging psql's newish \gexec (less locking, as every statement runs in its own transaction)
SELECT format('ALTER INDEX %I SET TABLESPACE fast_storage', ci.relname) FROM pg_class c
                  JOIN pg_index i ON i.indrelid = c.oid JOIN pg_class ci ON i.indexrelid = ci.oid
                  WHERE c.relname = 'pgbench_accounts' \gexec
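
For the more controlled "CREATE INDEX CONCURRENTLY" route mentioned in step 3, a minimal sketch could look like the following. It only prints the statements, so they can be reviewed before being piped to psql. The function name, the "_new" suffix and the example arguments are illustrative assumptions, and the identifiers are assumed to be simple names that need no quoting.

```shell
#!/usr/bin/env bash
# Sketch: print the SQL for rebuilding one index in another tablespace
# without holding a long exclusive lock. Nothing is executed here.
# Assumptions (not from the original post): the function name, the "_new"
# suffix, and plain lowercase identifiers that need no quoting.

rebuild_index_sql() {
  local idx="$1" table="$2" cols="$3" tblspc="$4"
  # 1) build a copy of the index on the fast tablespace, without blocking writes
  printf 'CREATE INDEX CONCURRENTLY %s_new ON %s (%s) TABLESPACE %s;\n' \
    "$idx" "$table" "$cols" "$tblspc"
  # 2) drop the old index
  printf 'DROP INDEX CONCURRENTLY %s;\n' "$idx"
  # 3) give the new index the original name
  printf 'ALTER INDEX %s_new RENAME TO %s;\n' "$idx" "$idx"
}

rebuild_index_sql pgbench_accounts_aid_idx pgbench_accounts aid fast_storage
```

The output could be applied with something like "rebuild_index_sql ... | psql -X". Each statement then runs in its own transaction, which matters because the CONCURRENTLY variants cannot run inside a transaction block. Note that this ordering (create, drop, rename) is only one possibility, and that indexes backing primary key or unique constraints cannot be dropped this way.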

4. Optionally, it might be a good idea to set this new tablespace as the default, if the schema is somewhat static.

ALTER SYSTEM SET default_tablespace TO fast_storage;
select pg_reload_conf();

Trying to compare HDD vs SSD

To "visualize" the possible performance benefits (there are some for sure), I performed a small and simplistic test comparing HDD and SSD transaction latencies under a lot of random IO. To really hit the disk a lot, I chose a very small amount of RAM (~5% of the dataset fits into shared_buffers/kernel cache), but increased max_wal_size a lot so that we wouldn't stall on checkpoints during the test, giving more predictable latencies. To generate random IO easily, I decided to just create 9 extra indexes on the pgbench schema; having 10 to 20 indexes on a central OLTP table is actually quite common. Also, to illustrate only the HDD vs SSD difference in multi-index update latencies, I removed other activities: WAL logging (by using unlogged tables) and the background writer (disabled), and changed the pgbench default transaction so that only the UPDATE on the single pgbench_accounts table is executed.

HW info: Google Compute Engine (europe-west-1), 1 vCPU, 1 GB RAM, 100GB Standard persistent disk / SSD persistent disk
Postgres info: PG 10.3, max_wal_size=50GB, checkpoint_timeout=1d, shared_buffers=256MB, bgwriter_lru_maxpages=0
Test script:

SCALE=684 # estimated ~10Gi DB, see jsfiddle.net/kmoppel/6zrfwbas/ for the formula
TEST_DURATION_SEC=900
CLIENTS=1
NR_OF_AID_INDEXES=8
MAX_TPS=10

pgbench --unlogged-tables -i -s $SCALE

for i in $(seq 1 $NR_OF_AID_INDEXES) ; do psql -qXc 'create index on pgbench_accounts (aid)' ; done
psql -qXc 'create index on pgbench_accounts (abalance)' # an extra index to effectively disable HOT updates

psql -qXc 'checkpoint'

cat << EOF > bench_upd_only.sql
\set aid random(1, 100000 * :scale)
\set delta random(-5000, 5000)
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
EOF

pgbench -f bench_upd_only.sql -R $MAX_TPS -T $TEST_DURATION_SEC -P $TEST_DURATION_SEC -c $CLIENTS

The results

And the results were actually quite surprising: a roughly 38x difference in average update latency with 10 indexes! I somehow thought it would be a bit less, maybe 5-10x...

Disk Type	Latency avg.	Latency Stddev.
HDD	141 ms	158 ms
SSD	3.7 ms	2.5 ms

NB! The test doesn't make any claims to absolute truth. I used separate Google Cloud machines (with the same non-disk specs though), which could have had different utilization levels. To counteract that, I limited the transaction rate to a very low 10 TPS, making it a transaction latency test rather than a total throughput test, so in the end it should at least give some idea of the possible performance gains. We can also see that HDD latencies (at least in shared cloud environments) jump around quite a lot on random updates, with "Latency Stddev" being bigger than "Latency avg".
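
For reference, the ratios behind these statements can be recomputed from the table with a few lines of shell. This is just a sketch of the arithmetic; the helper name "ratio" is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: recompute the ratios quoted in the text from the results table.
# "ratio" is a hypothetical helper name; awk does the floating point math.

ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

echo "HDD vs SSD average latency: $(ratio 141 3.7)x"
echo "HDD stddev / avg:           $(ratio 158 141)"
echo "SSD stddev / avg:           $(ratio 2.5 3.7)"
```

This gives about 38.11 for the average latency ratio, and a stddev-to-average ratio of about 1.12 for the HDD versus about 0.68 for the SSD, so the HDD is not only slower on average but also relatively more jittery.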
