PostgreSQL has tons of useful features, so it is somehow inevitable that some of them get forgotten every now and then. In my case that’s exactly what happens a lot with a cool piece of functionality called “auto_explain”, which is a module in the contrib package. So I thought I’d write a short reminder to my future self, and potentially also to some other people out there who occasionally have the problem of fitting the working set into main memory 😉
What does auto_explain do?
From the excellent PostgreSQL documentation we can read:
“The auto_explain module provides a means for logging execution plans of slow statements automatically, without having to run EXPLAIN by hand. This is especially helpful for tracking down un-optimized queries in large applications.”
Sounds pretty useful, right? One only needs to keep in mind that all features come at a certain cost. The hint from the documentation is that this approach falls a bit into the “let’s bring out the big guns” category, and it probably doesn’t make a lot of sense if everything is running fine, since there is no need to pay even a small extra execution penalty. But if you’re seeing sporadic slow statements in the database log from your application’s sessions (given log_min_duration_statement is set accordingly) but can’t really pinpoint the problem by re-running those slow statements with manual EXPLAINs (the rows with those IDs from the query are gone, or the data distribution has changed), enabling this extension could give you some helpful insights.
The basic setup is actually pretty simple: just modify postgresql.conf to set the “auto_explain.log_min_duration” parameter to your “allowed pain-threshold” value, e.g. ‘5s’, and enable the library for all sessions by setting “shared_preload_libraries = auto_explain”. Then restart your database, and voilà – execution plans of all statements that take more than 5 seconds will be logged. A tip – the module can also be configured at the user level if you’re really conscious about performance.
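A minimal postgresql.conf fragment for this setup could look like the following (the 5-second threshold and the role name are just example values; the per-user variant uses session_preload_libraries so the module is loaded only for that role’s sessions):

```ini
# postgresql.conf -- global setup (requires a server restart)
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '5s'   # log plans of statements slower than 5 s
```

```sql
-- Alternative: enable only for one (hypothetical) role, no restart needed
ALTER ROLE app_user SET session_preload_libraries = 'auto_explain';
ALTER ROLE app_user SET auto_explain.log_min_duration = '5s';
```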
For more advanced use cases, even more configuration parameters exist – logically, all the extra options of the standard EXPLAIN (verbosity, buffers, timing information, output format) can be enabled, plus the more exotic possibilities of including detailed trigger execution timing and sub-statement “explain” information for stored procedures.
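The corresponding parameters, as named in the auto_explain documentation, could be set like this (note that the analyze-style options add measurement overhead to every statement, not just the logged ones):

```ini
# postgresql.conf -- optional extras
auto_explain.log_analyze = on            # include actual row counts / run times (adds overhead!)
auto_explain.log_buffers = on            # buffer usage statistics (needs log_analyze)
auto_explain.log_timing = on             # per-node timing (needs log_analyze)
auto_explain.log_format = 'text'         # text, xml, json or yaml
auto_explain.log_triggers = on           # trigger execution statistics (needs log_analyze)
auto_explain.log_nested_statements = on  # statements executed inside functions
```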
A log entry including trigger execution statistics looks something like this for my simple test scenario. Code here.
```
STATEMENT:  insert into t1 select 1;
LOG:  duration: 1001.976 ms  plan:
        Query Text: insert into t1 select 1;
        Insert on t1  (cost=0.00..0.01 rows=1 width=0) (actual time=0.048..0.048 rows=0 loops=1)
          ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
        Trigger t1_upd_trg: time=1001.924 calls=1
```
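A test scenario that would produce a similar log entry could be sketched as follows – the table and trigger names are taken from the log above, but the trigger body (a pg_sleep call to simulate ~1 s of trigger work) and the column definition are my assumptions:

```sql
-- Hypothetical reconstruction of the test scenario
CREATE TABLE t1 (c1 int);

CREATE FUNCTION t1_trg_func() RETURNS trigger AS $$
BEGIN
  PERFORM pg_sleep(1);  -- simulate slow trigger work (~1 s, matching the log)
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t1_upd_trg
  AFTER INSERT ON t1
  FOR EACH ROW EXECUTE PROCEDURE t1_trg_func();

-- enable auto_explain just for this session
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;  -- log every statement
SET auto_explain.log_analyze = on;      -- required for trigger stats
SET auto_explain.log_triggers = on;

INSERT INTO t1 SELECT 1;  -- plan plus trigger timing go to the server log
```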