By Kaarel Moppel
How can I use pg_stat_statements for troubleshooting? First, an introductory story (with some slight “drama” added for fun) from those good old days of on-site consulting 🙂 So…I’m at a client where the database is not behaving nicely among other things…what a crappy DB product indeed I hear, it gets cranky every morning although there should be a constant workload and is saturating the CPU and making some important queries time out!
Hmm very interesting I think, Postgres usually doesn’t tend to care too much about mornings or time in general…Okay, well let’s sit down and look at the metrics I say…oh, surprise-surprise – we discover that there’s no monitoring framework in place! Sounds very bad and rare, right? Well, quite common actually as many people rely only on the server log files…which I wouldn’t recommend doing personally. But not too tragic in this case actually – as luckily their DBA at least had read from somewhere that it’s good to install the pg_stat_statements extension when taking an instance into use – which is of course absolutely true and a highly recommended thing to do!
Giving hope to DBA’s
So I say – well, let’s just quickly install some monitoring tool and have a coffee and talk about other issues while the metrics, that will help us to get to the bottom of this, are being gathered. And then I hear that distinct audible “sigh” and they look at me with despair in their eyes – you know, we just can’t install some software like that…firstly there’s no Internet…and 2nd, it needs to go through managerial and security acceptance processes X,Y and Z and it would take a week at best.
Yoda and direct console access
OK, doesn’t make life easier for sure…the poor DBA is also already feeling bad that their organisation is not exactly the candidate for the “Agile company of the year” award…But, then I ask…you know, we do have direct console access to the database still, right? Yes, we do! And then I can finally use my wise Yoda quote: Well great, this is pretty much the same as a monitoring tool – it’s all about the data anyways! The DBA starts to look at me in a very suspicious and puzzled way…what the hell am I talking about? How can a plain text-based query interface be nearly as useful as some modern graphical monitoring tool for troubleshooting?
Ad-hoc setup of pg_stat_statements monitoring
So let’s shed some light on what I mean. A disclaimer – it’s not the most “humane” method of course, but it will definitely gather all the necessary information and thus do the trick. For most users though, who have the possibility, I’d recommend installing some real monitoring tool – like pgwatch2 or pghero, which are both dead simple to use and have a graphical interface for pinpointing problems.
Step 1 – prerequisites
Making sure the pg_stat_statements extension is loaded via the shared_preload_libraries parameter and also the extension created in some database – doesn’t have to be even the problematic one, as the extension works on the cluster level. For new installs, a restart of the PostgreSQL server is needed though, which might be problematic in some cases – so again: it’s a really highly recommended step when performing the initial server setup.
Step 2 – “snapshot” table setup
Log into the database where pg_stat_statements is available and create a table to hold the periodic snapshots of the tracking information the extension is providing. And this is the crux of the “trick” actually – we want to regularly store snapshots of detailed query execution details so that we can later calculate the so called “deltas” for each query and see who was burning the most resources within a certain timespan! This is actually exactly what most “continuous monitoring” tools are doing in a nicer coat – just storing snapshots of the internal PostgreSQL performance counters for various objects or queries and then calculating deltas or anomalies based on that data and representing it in some nice and digestible, usually graphical, format.
A DDL statement to create a table for such “snapshot” storing purposes would look something like that:
CREATE TABLE stat_statements_snapshots AS SELECT now() AS ts, * FROM pg_stat_statements WHERE false;
Step 3 – metrics gathering
Set up periodic gathering of pg_stat_statements data into our “snapshot” table. In the simplest form on the “psql” console it can be done with the built-in “watch” subcommand. Namely, this command will re-execute the previous user query with some regular interval – by default set at 2 seconds…which is a bit too frequent mostly though for general pg_stat_statements monitoring (but might again be too infrequent for pg_locks or pg_stat_activity monitoring). So in short it would look something like this:
/* let’s do the first insert normally and then let “watch” do the same every INSERT INTO stat_statements_snapshots SELECT now(), * FROM pg_stat_statements; \watch 60
Remember – “psql” + “watch” is just one option and a simple Bash loop or a Cron with the same SQL would be as good, and maybe even better – if you plan to keep it running for a day or so as there might be network interruptions or restarts that terminate your “psql“ session.
Analyzing the data
Now when the data collection has been running for a while and our target time range is covered, we obviously need to analyze the data somehow. And there are quite some ways to do that – many resource usage columns to choose from depend on the detected problem type and also many SQL query techniques could be used. But to give you a general idea, in the chapter below there are some simpler examples of some standard questions and the resulting queries for such ad-hoc troubleshooting sessions…but as already said, there are many-many different paths here and the queries provided are by far not the only ones.
The burning question
Which top 3 queries were burning the most CPU time between times t1 and t2 and approximately how much of the total Postgres runtime did that represent? Let’s see:
explain with q_snap_data as ( select ts, queryid, max(query) as query, /* avoiding groupings over long text values is a good idea generally */ sum(total_time) as total_time, sum(calls) as calls from stat_statements_snapshots where not query like 'insert into stat_statements_snapshots%' /* don't include our snapshot collecting query */ group by ts, queryid /* 1st gruping needed as queries have separate rows for various users which we want to ignore */ ), q_min_total as ( select sum(total_time) as min from q_snap_data group by ts order by ts limit 1 ), q_max_total as ( select sum(total_time) as max from q_snap_data group by ts order by ts desc limit 1 ) select queryid, max(query) as query, max(total_time) - min(total_time) as total_time, max(calls) - min(calls) as calls, (max(total_time) - min(total_time)) / (max(calls) - min(calls)) as mean_time, 100 * (max(total_time) - min(total_time))::numeric / (select max - min from q_max_total, q_min_total) as approx_pct_of_total from q_snap_data /*where ts between t1 and t2 */ group by queryid having max(calls) > min(calls) order by total_time desc limit 3;
What queries had the biggest (absolute) execution time change compared to the hour before?
with q_snap_data as ( select ts, queryid, max(query) as query, /* avoiding grupings over long text values is a good idea generally */ sum(total_time) as total_time, sum(calls) as calls from stat_statements_snapshots where not query like 'insert into stat_statements_snapshots%' /* don't include our snapshot collecting query */ /* and ts between t1 and t2 */ group by ts, queryid /* 1st grouping needed as queries have separate rows for various users which we want to ignore */ ), q_snap_data_hour as ( select extract(hour from ts) as ts, queryid, max(query) as query, max(total_time) - min(total_time) as total_time, max(calls) - min(calls) as calls from q_snap_data group by extract(hour from ts), queryid ) select *, total_time - lag_total_time as total_time_growth from ( select ts, queryid, query, total_time, lag(total_time) over (partition by queryid order by ts) as lag_total_time, calls from q_snap_data_hour ) x where lag_total_time > 0 and total_time > lag_total_time order by total_time - lag_total_time desc limit 3;
Another warning on the above queries
The results can be considered useful only in case there was no Postmaster crash or user invoked statistics reset in between. If that was the case, your best bet is almost always to go all in on the awesome Window Functions!
Improving resource usage
In some cases, as in the above described fully utilized CPU case, one doesn’t really want to add any unnecessary load on the database server with monitoring activities (especially if you use shorter intervals), as it might actually be that last straw that breaks the camel’s back…so how do you minimize such risks a bit?
First performance hint
We could (and mostly should) use the UNLOGGED table modifier for our snapshot gathering table, to bypass the transaction log completely for our frequently inserted data – thus reducing disk flushing and network load (remember – WAL is sent also to the replicas). This modifier of course has the downside that the table is not effectively crash safe and we would lose our data in case of a PostgreSQL crash or a power outage! Nevertheless, this is especially recommended when the whole monitoring operation is planned to be relatively short-lived, as it mostly should, or if the strange “peaks” are appearing very frequently and we could just re-start our capturing any time and still get usable “leads”. I actually use “UNLOGGED” almost always for such ad-hoc troubleshooting – better to be safe than sorry and not to strain the system under monitoring.
Second performance hint
We could leave out the longest info (byte-wise) from the pg_stat_statements view snapshots – the actual queries! This is OK to do as the internal query IDs are static and good enough to look up the real query texts later with a JOIN, when we have already identified some suspicious query ID-s.
Third performance hint
We don’t actually need to store the pg_stat_statement statistics on each interval for all queries…as probably only a subset of them are “active” in the monitored time range. So basically we could add some WHERE filter condition that looks up the very first entry for a specific query and dismisses it if the execution counter has not progressed for the current timestamp – this usually minimizes the data amount at least 2 times based on my experience. A hint though – for fast lookups on that first entry, you might want to also throw some index on the “snapshot” table, so that we again burn a bit less CPU time.
CREATE INDEX ON stat_statements_snapshots (queryid, ts); INSERT INTO stat_statements_snapshots SELECT now(), * FROM pg_stat_statements s WHERE calls > (select calls from stat_statements_snapshots where queryid = s.queryid and dbid = s.dbid order by ts limit 1);
By the way, exactly the same approach can be used also for all the other internal statics views that are cumulative in nature, as most of them are – first such that comes to mind for example is the pg_stat_user_tables view, that enables to determine at which times some tables are getting a lot of sequential scans. There though I’d also add some minimum table size limit into the mix as well, as scans on tiny tables, unless in billions per day, are not usually a thing to worry about.
Hopefully, that wasn’t too “hacky”, and you have a new trick up your sleeve now.
Read more about putting pg_stat_statements to good use in Hans’ blog Find and Fix a Missing PostgreSQL Index.