How to terminate long running queries in PostgreSQL:
People might be fans of SQL, people might like PostgreSQL on Kubernetes or people might even love database ORMs. But have you ever heard of a single person who loves slow queries? In 20+ years of professional database engineering, I have not seen a single fan club paying respect to slow queries, bad SQL and a horrible user experience. So let’s terminate long running queries and simply remove them from our system.
Identifying slow queries and bad performance in PostgreSQL
A lot has been written about how to identify and fix slow queries, and I simple want to refer to some of those posts to help people along:
However, in this blog we want to focus on how to actually kill a slow query once it has been found in the system.
Killing slow queries
Basically, there are two ways to get rid of a slow query:
- Terminate the query but keep the database connection alive
- Kill the entire database connection
To ensure we have a slow query which we can eliminate, we first need to start one:
Artificially create a long running query
test=# SELECT pg_sleep(1000000);
This query will simply sleep and not terminate in the foreseeable future. So how can we find and kill it?
First, let’s find the long running query:
test=# \x Expanded display is on. test=# SELECT * FROM pg_stat_activity WHERE datname IS NOT NULL AND state = 'active' ORDER BY query_start LIMIT 1; -[ RECORD 1 ]----+------------------------------ datid | 16384 datname | test pid | 1207 leader_pid | usesysid | 10 usename | hs application_name | psql client_addr | client_hostname | client_port | -1 backend_start | 2023-08-10 15:24:13.980079+02 xact_start | 2023-08-10 15:24:20.443691+02 query_start | 2023-08-10 15:24:20.443691+02 state_change | 2023-08-10 15:24:20.443697+02 wait_event_type | Timeout wait_event | PgSleep state | active backend_xid | backend_xmin | 899 query_id | query | SELECT pg_sleep(1000000); backend_type | client backend
PostgreSQL successfully returned the query which ran for the longest time on my system. What is important here: When querying
pg_stat_activity, you need to make sure that only look for rows that actually have a valid database as we would otherwise end up with system processes.
So: How can you eliminate this query? The important information we want to retrieve from
pg_stat_activity is the process ID (pid) of the database backend executing what we want to terminate. In my case this PID = 1207.
PostgreSQL offers two functions to kill slow queries:
pg_cancel_backend: Terminate the query and keep the connection alive
pg_terminate_backend: Terminate the database connection including the query
Let’s try the first function:
test=# SELECT pg_cancel_backend(1207); pg_cancel_backend ------------------- t (1 row)
Executing this in a second connection will ensure that PostgreSQL kills the long running query, which will cause the following output in the first connection:
test=# SELECT pg_sleep(1000000); ERROR: canceling statement due to user request
Sometimes queries will not stop immediately. This is fine, and patience is certainly advised.
However, sometimes we want to kill the entire backend. We do that by using the second function:
test=# SELECT pg_terminate_backend(1207); pg_terminate_backend ---------------------- t (1 row)
With both methods, our problematic query will stop consuming valuable resources. The first method is less invasive, since it does not terminate the database session.
Preventing bad SQL in the future
While it is easy to fix a single case, it’s also important to keep an eye on the overall picture and make sure that bad SQL gets cleaned up automatically.
statement_timeout to automatically remove slow queries
One way to do that is to make PostgreSQL remove bad queries for you. Use the
statement_timeout variable. Here’s how it works:
test=# SET statement_timeout TO 1000; SET test=# SELECT pg_sleep(1000000); ERROR: canceling statement due to statement timeout
PostgreSQL terminated the query after 1 second.
What is important here: We can also set a reasonably good value in
postgresql.conf to ensure that the setting is in effect in all database sessions of the PostgreSQL instance.
Long running queries have all kinds of implications which have to be kept in mind. It is not only about resources and performance, but also about making sure that table bloat is kept under control at all times. Check out the out post about Autovacuum, UPDATE and a lot more to learn more about this vital topic.
Sometimes a statement keeps running even though you tried to interrupt it with the methods shown in this article. Then you might want to read our article on how to cancel a statement that refuses to die.