Functions: The most widely ignored performance tweak

05.2013 / Category: / Tags: |

It happens quite frequently that people complain about stored procedure performance in PostgreSQL. In many cases the reason for bad performance can be explained quite nicely when looking at the definition of a function.

In general, a PostgreSQL function can be marked as follows: VOLATILE, STABLE, IMMUTABLE or [NOT] LEAKPROOF.

What does this actually mean?

To show which impact this can have, we can create a small table consisting of just integer values:

In our example we have added 1 mio rows to the table. In the next step we can define a simple index:

The point now is: Many complaints about poor performance arise when people use the output of a function and compare it to a column as shown in the following listing:

We want to check, if any of the rows is identical to round(17.5, 0)::int4. As you can see PostgreSQL can use an index nicely. The reason for that is that the round function will ALWAYS return 18 if you pass 17.5 as parameter. It is simply a mathematical fact. Technically this means that PostgreSQL can calculate the function ONCE and use it to search the index.
Inside PostgreSQL the function is marked an IMMUTABLE. Its output will never change:

What happens if we try the very same thing using the random() function? The execution plan will be very different:

In this case we have to read the entire table to find the right answer. Reading 1 mio rows is expensive and therefore it is not a good strategy. The problem is that there is no way for the optimizer to use an index. The random() function will change its result everyt ime it is called. So, which value would you look up in the index? The answer is: There is no way to use the index because every time you are inspecting a row the output of random() will already be something else. This behavior is called VOLATILE:

The performance difference between a VOLATILE and an IMMUTABLE function will be substantial because the VOLATILE function has to be called 1 mio times:

In our example we can see an execution time, which is 240 times higher than in the index optimized case.

Tweaking your own functions

So, if you are writing your own procedures – please never forget to mark your function appropriately. Otherwise the function might be called ways too often, which is bad for performance. Defining your function properly is a very simple tweak offering great potential.

Here is how this can be done:

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

Comments are closed.

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf

+43 (0) 2622 93022-0

Get the newest PostgreSQL Info & Tools

    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    CYBERTEC PostgreSQL International GmbH
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram