Find and fix a missing PostgreSQL Index

Spot a missing PostgreSQL index

Missing indexes are a key ingredient if you are looking for a perfect recipe to ruin performance in the most efficient way possible. 🙂 However, if you want to ensure that your database performs well and if you are generally not in favor of user complaints – better watch out for missing indexes and make sure that all relevant tables are properly taken care of. One PostgreSQL index can make all the difference in performance.

Table of Contents

To help you understand, I have compiled this little guide about how to locate missing indexes, what you can do to fix them, and how to achieve good database performance.

Setting up a test database

In order to demonstrate how to find missing indexes, I have to first create a test database. One way to do it is to use pgbench:

[hs@hansmacbook ~]$ pgbench -i -s 100 test
dropping old tables...
NOTICE: table 'pgbench_accounts' does not exist, skipping
NOTICE: table 'pgbench_branches' does not exist, skipping
NOTICE: table 'pgbench_history' does not exist, skipping
NOTICE: table 'pgbench_tellers' does not exist, skipping
creating tables...
generating data (client-side)...
10000000 of 10000000 tuples (100%) done (elapsed 13.65 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 19.92 s (drop tables 0.00 s, create tables 0.00 s, client-side generate 13.70 s, vacuum 1.95 s, primary keys 4.27 s).

[hs@hansmacbook ~]$ pgbench -i -s 100 test

dropping old tables...

NOTICE: table 'pgbench_accounts' does not exist, skipping

NOTICE: table 'pgbench_branches' does not exist, skipping

NOTICE: table 'pgbench_history' does not exist, skipping

NOTICE: table 'pgbench_tellers' does not exist, skipping

creating tables...

generating data (client-side)...

10000000 of 10000000 tuples (100%) done (elapsed 13.65 s, remaining 0.00 s)

vacuuming...

creating primary keys...

done in 19.92 s (drop tables 0.00 s, create tables 0.00 s, client-side generate 13.70 s, vacuum 1.95 s, primary keys 4.27 s).

What happens here is that pgbench just provided us with a little sample database which contains 4 tables.

The largest one is pgbench_accounts:

test=# d+
                                           List of relations
 Schema |     Name         | Type  | Owner | Persistence | Access method |  Size   | Description 
--------+------------------+-------+-------+-------------+---------------+---------+-------------
 public | pgbench_accounts | table | hs    | permanent   | heap          | 1281 MB | 
 public | pgbench_branches | table | hs    | permanent   | heap          |   40 kB | 
 public | pgbench_history  | table | hs    | permanent   | heap          | 0 bytes | 
 public | pgbench_tellers  | table | hs    | permanent   | heap          |   80 kB | 
(4 rows)

test=# d+

List of relations

--------+------------------+-------+-------+-------------+---------------+---------+-------------

(4 rows)

This database is perfectly indexed by default, so we have to drop some indexes in order to find something we can fix later:

test=# d pgbench_accounts
          Table 'public.pgbench_accounts'
  Column  |     Type      | Collation | Nullable | Default 
----------+---------------+-----------+----------+---------
 aid      |       integer |           | not null | 
 bid      |       integer |           |          | 
 abalance |       integer |           |          | 
 filler   | character(84) |           |          | 
Indexes:
'pgbench_accounts_pkey' PRIMARY KEY, btree (aid)

test=# ALTER TABLE pgbench_accounts 
             DROP CONSTRAINT pgbench_accounts_pkey;
ALTER TABLE

test=# d pgbench_accounts

Table 'public.pgbench_accounts'

Column | Type | Collation | Nullable | Default

----------+---------------+-----------+----------+---------

aid | integer | | not null |

bid | integer | | |

abalance | integer | | |

filler | character(84) | | |

Indexes:

'pgbench_accounts_pkey' PRIMARY KEY, btree (aid)

test=# ALTER TABLE pgbench_accounts

DROP CONSTRAINT pgbench_accounts_pkey;

ALTER TABLE

We have simply dropped the primary key: which is, internally, nothing other than a unique index which does not allow NULL entries.

Running a demonstration benchmark

Before we start running our benchmark to see how bad performance really gets, you need to make sure that the most important tool to handle performance problems is installed and active: pg_stat_statements. Without pg_stat_statements, tracking down performance problems is unnecessarily hard.

Therefore consider performing the next steps to install pg_stat_statements:

Add “pg_stat_statements” to shared_preload_libraries (postgresql.conf)
Restart the database
Run “CREATE EXTENSION pg_stat_statements” in your database

Once this is done, we are ready to run our benchmark. Let's see what happens:

[hs@hansmacbook ~]$ pgbench -c 10 -T 60 -j 10 test
pgbench (14.1)
starting vacuum...end.
transaction type: 
scaling factor: 100
query mode: simple
number of clients: 10
number of threads: 10
duration: 60 s
number of transactions actually processed: 252
latency average = 2446.148 ms
initial connection time = 8.833 ms
tps = 4.088061 (without initial connection time)

[hs@hansmacbook ~]$ pgbench -c 10 -T 60 -j 10 test

pgbench (14.1)

starting vacuum...end.

transaction type:

scaling factor: 100

query mode: simple

number of clients: 10

number of threads: 10

duration: 60 s

number of transactions actually processed: 252

latency average = 2446.148 ms

initial connection time = 8.833 ms

tps = 4.088061 (without initial connection time)

Despite opening 10 connections (-c 10) and proving pgbench with 10 threads (-j 10) we managed to run 4 - yes, 4 - transactions per second. One might argue that hardware is the problem, but it's definitely not:

Model Name: MacBook Pro
Model Identifier: MacBookPro16,1
Processor Name: 8-Core Intel Core i9
Processor Speed: 2,3 GHz
Number of Processors: 1
Total Number of Cores: 8

Model Name: MacBook Pro

Model Identifier: MacBookPro16,1

Processor Name: 8-Core Intel Core i9

Processor Speed: 2,3 GHz

Number of Processors: 1

Total Number of Cores: 8

This is a modern, 8 core machine. Even if the clockspeed were 10 times as high, we would have topped out at 40 transactions per second. That's still way below what you would expect.

pg_stat_user_tables: An important monitoring view for your PostgreSQL indexes

The first clue that indexes might be missing can be found in pg_stat_user_tables. The following table contains the relevant columns:

test=# d pg_stat_user_tables 
View 'pg_catalog.pg_stat_user_tables'
       Column        | Type ... 
---------------------+---------------- …
 relid               | oid ...
 schemaname          | name ...
 relname             | name ...
 seq_scan            | bigint ...
 seq_tup_read        | bigint ...
 idx_scan            | bigint ...
...

test=# d pg_stat_user_tables

View 'pg_catalog.pg_stat_user_tables'

Column | Type ...

---------------------+---------------- …

relid | oid ...

schemaname | name ...

relname | name ...

seq_scan | bigint ...

seq_tup_read | bigint ...

idx_scan | bigint ...

...

What we see here is the name of the table (relname) including the schemaname. Then we can see how often our table has been read sequentially (seq_scan) and how often an index has been used (idx_scan). Finally, there is the most relevant information: seq_tup_read. So what does that mean? It actually tells us how many rows the system had to process to satisfy all those sequential scans. This number is really really important and I cannot stress it enough: If “a lot” is read “really often” it will lead to an insane entry in the seq_tup_read column. That also means we have to process an enormous number of rows to read a table sequentially again and again.

Now, let's run a really important query:

test=# SELECT schemaname, 
              relname, 
              seq_scan, 
              seq_tup_read, 
              idx_scan, 
              seq_tup_read / seq_scan AS avg 
 FROM         pg_stat_user_tables 
 WHERE        seq_scan > 0 
 ORDER BY     seq_tup_read DESC;
 schemaname |     relname      | seq_scan | seq_tup_read | idx_scan | avg
------------+------------------+----------+--------------+----------+---------
 public     | pgbench_accounts |      954 |   5050000000 |          | 5293501
 public     | pgbench_branches |      254 |        25400 |        0 |     100
 public     | pgbench_tellers  |        1 |         1000 |      252 |    1000
(3 rows)

test=# SELECT schemaname,

relname,

seq_scan,

seq_tup_read,

idx_scan,

seq_tup_read / seq_scan AS avg

FROM pg_stat_user_tables

WHERE seq_scan > 0

ORDER BY seq_tup_read DESC;

------------+------------------+----------+--------------+----------+---------

public | pgbench_accounts | 954 | 5050000000 | | 5293501

public | pgbench_branches | 254 | 25400 | 0 | 100

public | pgbench_tellers | 1 | 1000 | 252 | 1000

(3 rows)

This one is true magic. It returns those tables which have been hit by sequential scans the most and tells us how many rows a sequential scan has hit on average. In the case of our top query, a sequential scan has read 5 million rows on average, and indexes were not used at all. This gives us a clear indicator that something is wrong with this table. If you happen to know the application, a simple d will uncover the most obvious problems. However, let us dig deeper and confirm our suspicion:

pg_stat_statements: Finding slow queries

As stated before, pg_stat_statements is really the gold standard when it comes to finding slow queries. Usually, those tables which show up in pg_stat_user_tables will also rank high in some of the worst queries shown in pg_stat_statements.

The following query can uncover the truth:

test=# SELECT query, 
              total_exec_time, 
              calls, 
              mean_exec_time 
       FROM   pg_stat_statements 
       ORDER BY total_exec_time DESC;
-[ RECORD 1 ]---+-------------------------------------------------------------------
query           | UPDATE pgbench_accounts 
                  SET abalance = abalance + $1 WHERE aid = $2
total_exec_time | 433708.0210760001
calls           | 252
mean_exec_time  | 1721.0635756984136
-[ RECORD 2 ]---+-------------------------------------------------------------------
query           | SELECT abalance FROM pgbench_accounts 
                  WHERE aid = $1
total_exec_time | 174819.2974120001
calls           | 252
mean_exec_time  | 693.7273706825395
…

test=# SELECT query,

total_exec_time,

calls,

mean_exec_time

FROM pg_stat_statements

ORDER BY total_exec_time DESC;

-[ RECORD 1 ]---+-------------------------------------------------------------------

query | UPDATE pgbench_accounts

SET abalance = abalance + $1 WHERE aid = $2

total_exec_time | 433708.0210760001

calls | 252

mean_exec_time | 1721.0635756984136

-[ RECORD 2 ]---+-------------------------------------------------------------------

query | SELECT abalance FROM pgbench_accounts

WHERE aid = $1

total_exec_time | 174819.2974120001

calls | 252

mean_exec_time | 693.7273706825395

…

Wow, the top query has an average execution time of 1.721 seconds! That is a lot. If we examine the query, we see that there is only a simple WHERE clause which filters on “aid”. If we take a look at the table, we discover that there is no index on “aid” - which is fatal from a performance point of view.

Further examination of the second query will uncover precisely the same problem.

Improve your PostgreSQL indexing and benchmarking

Let's deploy the index and reset pg_stat_statements as well as the normal system statistics created by PostgreSQL:

test=# CREATE UNIQUE INDEX idx_accounts 
         ON pgbench_accounts (aid);
CREATE INDEX
test=# SELECT pg_stat_statements_reset();
 pg_stat_statements_reset 
--------------------------

(1 row)

test=# SELECT pg_stat_reset();
 pg_stat_reset 
---------------

(1 row)

test=# CREATE UNIQUE INDEX idx_accounts

ON pgbench_accounts (aid);

CREATE INDEX

test=# SELECT pg_stat_statements_reset();

pg_stat_statements_reset

--------------------------

(1 row)

test=# SELECT pg_stat_reset();

pg_stat_reset

---------------

(1 row)

Once the missing indexes have been deployed, we can run the test again and see what happens:

[hs@hansmacbook ~]$ pgbench -c 10 -T 60 -j 10 test
pgbench (14.1)
starting vacuum...end.
transaction type: 
scaling factor: 100
query mode: simple
number of clients: 10
number of threads: 10
duration: 60 s
number of transactions actually processed: 713740
latency average = 0.841 ms
initial connection time = 7.541 ms
tps = 11896.608085 (without initial connection time)

[hs@hansmacbook ~]$ pgbench -c 10 -T 60 -j 10 test

pgbench (14.1)

starting vacuum...end.

transaction type:

scaling factor: 100

query mode: simple

number of clients: 10

number of threads: 10

duration: 60 s

number of transactions actually processed: 713740

latency average = 0.841 ms

initial connection time = 7.541 ms

tps = 11896.608085 (without initial connection time)

What an improvement. The database speed has gone up 3000 times. No “better hardware” in the world could provide us with this type of improvement. The takeaway here is that a SINGLE missing PostgreSQL index in a relevant place can ruin the entire database and keep the entire system busy without yielding useful performance.

What is really important to remember is the way we have approached the problem. pg_stat_user_tables is an excellent indicator to help you figure out where to look for problems. Then you can inspect pg_stat_statements and look for the worst queries. Sorting by total_exec_time DESC is the key.

Finally...

If you want to learn more about PostgreSQL and database performance in general, I can highly recommend Laurenz Albe’s post about “Count(*) made fast” or his post entitled Query Parameter Data Types and Performance.

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on X, Facebook, or LinkedIn.

0 0 votes

Article Rating

0 Comments

Inline Feedbacks

View all comments

Stay tuned with our