PostgreSQL: Indexes and foreign keys

07.2017

Recently we received a couple of PostgreSQL support calls related to bad performance on various deployments. In many cases the reason for the slowness was that people assume PostgreSQL automatically creates an index on BOTH sides of a foreign key relation, which is not the case. The referenced column must carry a primary key or unique constraint and is therefore indexed, but the referencing column gets no index unless you create one yourself. By the way: this behavior is not PostgreSQL specific. Oracle and many other database systems behave in exactly the same way, so this piece of advice applies to many more database products out there.

Missing indexes and foreign keys

The typical scenario most people face is actually pretty simple. There are two tables and a foreign key:
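The original listing did not survive here, so the following is only a minimal sketch of such a schema. The table names “a” and “b” and the cascade clauses come from the text; the column names a_id and b_id are assumptions, chosen so that PostgreSQL's default constraint name comes out as b_a_id_fkey.

```sql
-- Referenced ("parent") table: the primary key creates an index on a_id automatically.
CREATE TABLE a (
    a_id int PRIMARY KEY
);

-- Referencing ("child") table: the foreign key does NOT create an index on b.a_id.
CREATE TABLE b (
    b_id int,
    a_id int REFERENCES a (a_id)
             ON UPDATE CASCADE
             ON DELETE CASCADE
);
```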

To keep the example simple the tables in our PostgreSQL database contain only the most basic information needed to make this work.

Then some data can be added:
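A sketch of how the test data might be generated, assuming the hypothetical columns a_id and b_id. The 5 million rows in “b” match the text; filling “a” with the same number of rows, one referencing row per parent, is an assumption.

```sql
-- Fill the parent table first so that every foreign key value in b has a match.
INSERT INTO a
SELECT x FROM generate_series(1, 5000000) AS x;

-- Each row in b references exactly one row in a.
INSERT INTO b
SELECT x, x FROM generate_series(1, 5000000) AS x;
```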

Five million records should be enough to show how bad things are if indexes are missing. Of course the effect will be larger if you add more data.

To rebuild the optimizer statistics, a simple ANALYZE can be used:
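In its simplest form the command takes no arguments at all:

```sql
-- Without a table name, ANALYZE collects statistics for every table
-- in the current database that the current user is allowed to analyze.
ANALYZE;
```

To restrict the work to the example tables, `ANALYZE a;` and `ANALYZE b;` can be run individually instead.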

Suffering from missing indexes

The trouble with missing indexes in any database is that seemingly simple operations become very expensive and reliably destroy performance.

Here is what happens:
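The statement and plan shown in the original are not reproduced here. A sketch of how the effect could be measured, assuming the hypothetical column a_id and an arbitrarily chosen key value of 10:

```sql
BEGIN;

-- EXPLAIN ANALYZE really executes the DELETE, so it is wrapped in a
-- transaction that is rolled back afterwards to keep the test data intact.
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM a WHERE a_id = 10;

ROLLBACK;
```

In the output, the time spent on the cascading delete is not part of the plan node for “a”. It is reported on a separate line starting with "Trigger for constraint", followed by the constraint name.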

PostgreSQL uses an index scan on “a” to find the row. BUT: Keep in mind that our constraint is defined as “ON UPDATE CASCADE ON DELETE CASCADE”, which means that deleting a single row from “a” also triggers the deletion of all rows in “b” that reference it. Without an index on the referencing column, PostgreSQL has to read all 5 million entries in “b” behind the scenes to find the right rows. Therefore the operation takes more than 300 ms, which is a total disaster.

Deploying missing indexes

Deploying the missing index will be a complete game changer:
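A minimal sketch, assuming the referencing column is called a_id; the index name idx_b_a_id is made up for illustration:

```sql
-- Index the referencing side of the foreign key so that the cascading
-- delete can locate matching rows in b without scanning the whole table.
CREATE INDEX idx_b_a_id ON b (a_id);
```

On a busy production table, `CREATE INDEX CONCURRENTLY` may be preferable: it avoids blocking writes while the index is built, at the cost of a slower build.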

The very same operation is now orders of magnitude faster than before, because all that is needed now are two index scans (one on “a” and one on “b”):
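To compare, the same kind of measurement can be repeated once the index exists. As before, this is a sketch that assumes the hypothetical column a_id and an arbitrary key value:

```sql
BEGIN;

-- Same DELETE as in the unindexed case; only the index on b (a_id) is new.
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM a WHERE a_id = 10;

ROLLBACK;
```

The line to watch is again the one starting with "Trigger for constraint", since that is where the lookup in “b” is accounted for.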

As you can see, the runtime needed here has been reduced dramatically to a fraction of a millisecond.

Performance hint

If you happen to use foreign keys (which most people do), it definitely makes sense to check for missing indexes, because otherwise cleanups might simply take too long. Consider the following scenario: Suppose you want to delete 1 million rows from the referenced table without an index on the referencing column. PostgreSQL would have to read 5 million rows 1 million times, which amounts to roughly 5 × 10^12 row reads. Clearly, this will lead to enormous performance problems.
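One way to run such a check is to query the system catalogs. The following is a simplified sketch, not a complete tool: it lists foreign key constraints for which no non-partial index starts with the referencing columns. It assumes PostgreSQL 9.4 or later, because of cardinality().

```sql
-- List foreign key constraints whose referencing columns are not
-- covered by the leading columns of any non-partial index.
SELECT c.conrelid::regclass  AS referencing_table,
       c.conname             AS constraint_name,
       c.confrelid::regclass AS referenced_table
FROM pg_catalog.pg_constraint AS c
WHERE c.contype = 'f'
  AND NOT EXISTS (
        SELECT 1
        FROM pg_catalog.pg_index AS i
        WHERE i.indrelid = c.conrelid
          AND i.indpred IS NULL  -- ignore partial indexes
          -- indkey is 0-based: compare its first N entries with the N key columns
          AND (i.indkey::smallint[])[0:cardinality(c.conkey) - 1] @> c.conkey
      )
ORDER BY pg_catalog.pg_relation_size(c.conrelid) DESC;
```

Sorting by table size puts the constraints first where a missing index is likely to hurt most. Small lookup tables that show up in the result may not need an index at all.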


3 Comments
Tobias Bussmann
7 years ago

didn't you miss the important line regarding "Trigger for constraint b_a_id_fkey" in the second EXPLAIN ANALYZE output?

Tobias Bussmann
7 years ago

There is actually a nice query to find missing indexes on foreign keys in the PostgreSQL Wiki: https://wiki.postgresql.org/wiki/Unindexed_foreign_keys

Milos Babic
7 years ago

To get all unindexed foreign keys, check this nice post from Tom Lane
https://www.postgresql.org/message-id/11236.1230499883@sss.pgh.pa.us
