© Laurenz Albe 2022

 

Sometimes a PostgreSQL query takes forever. Usually, it is easy to interrupt (cancel) such a statement, but sometimes a query cannot be canceled and turns into a real problem. This article discusses the possible causes. I'll also show you a trick for solving the problem (not for the faint of heart!)

How to cancel a running query

The PostgreSQL protocol has provisions for interrupting a running statement. This is done by opening a new connection and sending a CancelRequest message with a secret key. That secret key was sent by the server during the start of the original connection. Without that key, anybody could cancel your query, which would be an unacceptable security problem.

The C library libpq provides the functions PQgetCancel() and PQcancel() to cancel queries, and other database APIs should have similar provisions. In an interactive psql session, you can simply hit Ctrl+C to send a cancel request, and GUI clients usually have a button for that purpose.
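As a sketch, canceling the statement currently running on a libpq connection could look like this:

```c
#include <stdio.h>
#include <libpq-fe.h>

/* send a cancel request for the statement currently
   running on the connection "conn" */
static void
cancel_current_statement(PGconn *conn)
{
    char      errbuf[256];
    PGcancel *cancel = PQgetCancel(conn);

    if (cancel == NULL)
    {
        fprintf(stderr, "out of memory\n");
        return;
    }

    if (!PQcancel(cancel, errbuf, sizeof errbuf))
        fprintf(stderr, "could not send cancel request: %s\n", errbuf);

    PQfreeCancel(cancel);
}
```

Note that a successful PQcancel() only means that the request was sent; it is no guarantee that the statement actually gets canceled.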

But it is also possible to cancel somebody else's query by calling the database function pg_cancel_backend(). Another option is pg_terminate_backend(), which goes one step further and terminates someone else's database session. To be able to use these functions, you must either be a superuser or a member of the default role pg_signal_backend (the next section will explain why), or you must have connected to the database as the same database user as the session you want to torpedo (you are allowed to cancel your own statements).
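For example, to cancel the statement running in the session with process ID 12345 (a placeholder value; you can find the real process ID in the pid column of pg_stat_activity):

```sql
SELECT pg_cancel_backend(12345);
```

The function returns true if the signal was sent, which does not guarantee that the query actually got canceled.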

How the server responds to a cancel request

Inter-process communication in PostgreSQL relies a lot on signals.

When the postmaster process receives a CancelRequest, it sends the signal SIGINT to the backend process of the corresponding database session. This is also what the function pg_cancel_backend() does. pg_terminate_backend() sends the signal SIGTERM.

Now each PostgreSQL process has a signal handler that processes these signals when they are received. This signal handler does not immediately interrupt the backend process, but it sets global variables for the process. SIGINT will set QueryCancelPending and SIGTERM will set ProcDiePending. These variables act as flags, and it is the responsibility of the backend process to react to them as soon as is convenient. This ensures that no process is interrupted at an inconvenient time, when it would for example leave shared memory in an inconsistent state.

Calls to the CHECK_FOR_INTERRUPTS() macro, which invoke the ProcessInterrupts() function, are sprinkled all over the PostgreSQL code at safe places. This function will then throw the error that cancels the current statement or it will terminate the backend process, depending on which flag was set.

Reasons why canceling a query may not work

There are several possible causes:

  1. Execution is stuck in a loop that does not contain CHECK_FOR_INTERRUPTS(). That would be a PostgreSQL bug, and the fix would be to add another call to the macro.
  2. Execution is stuck in a third-party C function that was called in an SQL statement. In this case, you should report it as a bug to the author of the function.
  3. Execution is stuck in a system call that cannot be interrupted. That would indicate a problem on the operating system or hardware level. Note that the delivery of signals is held off while a process is in kernel space.

Don't use kill -9 unless you are desperate

It is perfectly fine to use plain kill on a PostgreSQL backend process. That will send the SIGTERM signal, so it is the same as calling pg_terminate_backend() for that backend. If that has no effect, it is tempting to use kill -9, which will send SIGKILL. That signal cannot be caught and immediately terminates the process. The problem is that the postmaster detects if one of its child processes did not shut down cleanly. It will then kill all other PostgreSQL processes and undergo crash recovery, which causes an outage of the whole database that can take seconds to minutes.

Note that while using kill -9 on a backend causes a short down time, kill -9 on the postmaster process itself has even worse effects and should be avoided at all costs. It opens a time window during which a new postmaster could be started while some of the old postmaster's children are still alive, which is likely to lead to corruption of the data on disk. Never, ever, kill the postmaster process with kill -9!

Sometimes even kill -9 won't be able to kill a PostgreSQL backend. That means that the backend is stuck in an uninterruptible system call, for example performing I/O on network-attached storage that is no longer available. If that condition persists, the only way to get rid of the process is a reboot of the operating system.

A trick to cancel a stuck query without crashing the server

Sometimes you can avoid crash recovery and an outage by proceeding as follows. This example uses the GNU debugger on Linux; you'll have to adapt it to other environments as necessary.

An example of a hanging function

We write this simple C function (source file loop.c):
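A minimal function that loops forever, never calling CHECK_FOR_INTERRUPTS(), could look like this (a sketch, not necessarily the original listing):

```c
#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(loop);

/* busy-loop forever without ever checking for interrupts */
Datum
loop(PG_FUNCTION_ARGS)
{
    for (;;)
        ;

    PG_RETURN_VOID();    /* never reached */
}
```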

Build the shared library like this (change the include path as appropriate):
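For example, with gcc on Linux (“pg_config --includedir-server” prints the correct include path):

```shell
gcc -I "$(pg_config --includedir-server)" -fPIC -shared -o loop.so loop.c
```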

Then copy the file into the PostgreSQL shared library directory (which you can get with “pg_config --libdir”).

Define and call the function

Define the function in SQL as superuser:
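A definition along these lines will do (a sketch; the original statement may have looked different):

```sql
CREATE FUNCTION loop() RETURNS void
   LANGUAGE c
   AS 'loop';
```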

Then, as an arbitrary user, call the function:
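The call is simply:

```sql
SELECT loop();
```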

Execution will hang. You can try to cancel the query, but it will keep running.

Identify the hanging backend process and send it a termination signal

Open another database connection with the same database user. Then find out the process ID of the backend process, which identifies the database session:
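One way to do that is to search pg_stat_activity for the query text (adjust the filter to the actual statement):

```sql
SELECT pid, state, query
FROM pg_stat_activity
WHERE query = 'SELECT loop();';
```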

Once you know the process ID, send the process a SIGTERM:
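For example (12345 is a placeholder):

```sql
SELECT pg_terminate_backend(12345);
```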

Of course, the argument should be the process ID obtained in the previous step. The function returns TRUE, since the signal was sent, but the query continues to execute.

Attach with the debugger

Make sure that the GNU debugger gdb is installed. To get a readable stack trace, you should have the debugging symbols for the PostgreSQL server installed, although that is not necessary for the trick I will show you. Log into the database server machine as the PostgreSQL user, which is typically called postgres. Call gdb like this (using the proper path to the postgres executable and the proper process ID):
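With the paths used in this article, the invocation could look like this (12345 again stands for the process ID determined earlier):

```shell
gdb /usr/pgsql-14/bin/postgres -p 12345
```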

As soon as you get the prompt “(gdb)”, generate a stack trace with the command “bt”.

The stack trace is a valuable help for determining where the problem is. Include it if you report a bug to PostgreSQL!

If you don't want to perform the next step, you can chicken out and enter “detach” to detach the debugger from the process and allow it to continue.

Cancel execution by making the hanging backend exit cleanly

The above stack trace shows that execution is currently not inside PostgreSQL code, but in a custom function (in loop () from /usr/pgsql-14/lib/loop.so). That means that it is pretty safe to let the process exit. If execution is somewhere inside the PostgreSQL server, there is a small risk that PostgreSQL is in the middle of modifying shared state, holding a spinlock or something similar. If you know the PostgreSQL source, a look at the call stack will help you assess that risk. Now, if you dare, call ProcessInterrupts(), which will cause the process to exit, since ProcDiePending is set:
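The call is a single debugger command. Since ProcDiePending is set, the function will not return; instead, the backend exits:

```
(gdb) print ProcessInterrupts()
```

gdb will then report that the program terminated, and PostgreSQL cleans up the session normally, without crash recovery.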

Fix the function to allow the user to cancel execution

To improve the situation, the function code should be modified like this:
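A possible fix, sketched under the assumption that a reaction time of up to two seconds is acceptable:

```c
#include "postgres.h"
#include "fmgr.h"
#include "miscadmin.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(loop);

Datum
loop(PG_FUNCTION_ARGS)
{
    for (;;)
    {
        /* sleep for two seconds */
        pg_usleep(2000000L);

        /* react to QueryCancelPending or ProcDiePending,
           if one of them has been set in the meantime */
        CHECK_FOR_INTERRUPTS();
    }

    PG_RETURN_VOID();    /* never reached */
}
```

pg_usleep() suspends execution for the given number of microseconds, and CHECK_FOR_INTERRUPTS() processes a pending cancel or termination request at a safe point.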

The above changes will cause the function to check for interrupts every two seconds, so that execution can be canceled safely.

Conclusion

Canceling queries works by sending the backend a SIGINT signal. If neither that nor sending SIGTERM can interrupt a backend, you can attach to the hanging backend with gdb and call ProcessInterrupts() directly to make it exit.

If you found this article interesting, perhaps you will also like my article about how to use TCP keepalive to keep abandoned queries from running forever, and to keep idle sessions from closing.

 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

If you are using pg_timetable to run multiple jobs on PostgreSQL, you need to understand how to calculate the number of database sessions, so you don't overload your system. Recently, a related question came up on pg_timetable's Github Discussion page: What do you do when you get the following error message: "Failed to send chain to the execution channel?"
TL;DR: see the Key Takeaways at the bottom of the post.

Q: Run multiple jobs?

I have started multiple (20) jobs using timetable.add_job(). These are long-running, CPU-intensive jobs.
I have checked the timetable.log regularly and I sometimes found the following error message:
Failed to send chain to the execution channel
But the chain keeps running regardless.
Could you explain to me what this message is?

A: Understand channel capacity

Assuming you're running the latest version, you have run out of capacity in the channel used to pass chains to workers.
The channel accumulates jobs, and workers pull jobs from it one by one. If all of your workers are busy while new jobs are fetched for execution, the channel runs out of capacity to hold the new jobs.

Solution: run multiple jobs but limit the number of sessions

Q: Rules regarding active sessions?

I now have 2 workers. I started 10 chains using self-destruction and job_max_instances = 1.
After the 10 chains finished their run, I see 10 active pg_timetable sessions in pg_stat_activity, and timetable.active_session contains 39 rows.
When will the 10 active sessions become idle? I suspect that if I schedule another 10 chains, the active sessions in pg_stat_activity might grow to 20, which is not suitable for my 4-CPU database server.
UPDATE: when I launched the next 10 chains, the active sessions were gone from pg_stat_activity.
But I would like to know what the rule is regarding this.

A: Overview of workers' parameters

Let's assume that when we speak about 2 workers, we are talking about 2 pg_timetable instances.
Each pg_timetable instance can use at most --cron-workers + --interval-workers parallel routines, according to the manual.

Solution:

In the worst scenario (highest load), every cron worker and every interval worker is busy executing a chain at the same time, and each of them holds its own database session, in addition to the main session that the pg_timetable instance itself keeps open.

That means the maximum connection (session) number for one pg_timetable instance (worker) would be roughly:

1 (main session) + --cron-workers + --interval-workers

In other words, if you want to limit the number of the database sessions you should decrease the --cron-workers and/or --interval-workers start parameters (default value 16 for both).

job_max_instances applies only to a single chain. Since your chains self-destruct, it has no effect here. This parameter allows you to run the same chain in parallel, limited to the job_max_instances value. For example, I might want to start some chain every minute to process something, but run no more than 10 chains at once. When an old chain finishes, the system will start a new one, so at any point in time there will be no more than job_max_instances instances of that chain running.

Key Takeaways - run multiple jobs on PostgreSQL using pg_timetable

  1. The message "Failed to send chain to the execution channel" means that all workers are busy and the channel holding pending chains is full; the chain is not lost and will still be executed.
  2. To limit the number of database sessions, decrease the --cron-workers and/or --interval-workers start parameters (default 16 each).
  3. job_max_instances does not limit the total number of sessions; it only limits how many instances of the same chain may run in parallel.

For more reading on using pg_timetable to schedule cron jobs, see my previous blog.

Stay safe, healthy, and wealthy!
Be happy! Peace! Love! ❤

Pavlo

This information about limiting active sessions is republished from messages on CYBERTEC's pg_timetable Github page:

Failed to send chain to the execution channel #427
and
active sessions #428

Texts were edited for proofreading and clarity.

 


© Laurenz Albe 2022

 

Next to character encoding, time zones are among the least-loved topics in computing. In addition, PostgreSQL's implementation of timestamp with time zone is somewhat surprising. So I thought it might be worthwhile to write an introduction to time zone management, with recommendations for its practical use.

Time zones and the SQL standard

The SQL standard has rather limited provisions for specifying time zones: essentially, all you can do is attach a fixed offset from UTC to a timestamp, as in TIMESTAMP '2022-06-01 12:00:00+02:00'.

It has the following to say about time zones:

The surface of the earth is divided into zones, called time zones, in which every correct clock tells the same time, known as local time. Local time is equal to UTC (Coordinated Universal Time) plus the time zone displacement, which is an interval value that ranges between INTERVAL '-14:00' HOUR TO MINUTE and INTERVAL '+14:00' HOUR TO MINUTE. The time zone displacement is constant throughout a time zone, and changes at the beginning and end of Summer Time, where applicable.

In other words, the standard only has time zone management provisions for representing an offset from UTC. Users or applications that have to cope with daylight savings time changes have to change the offset at the appropriate times to reflect that. Apparently not a lot of usability considerations went into this part of the SQL standard (or none of the big database vendors whose voice carries weight in the standard committee had a smarter implementation).

It is also interesting to note that the SQL standard seems to consider the time zone offset to be a part of the timestamp value, although it is not totally clear on that point.

Timestamp data types in PostgreSQL

There are two data types for timestamps in PostgreSQL: timestamp (also known as timestamp without time zone) and timestamp with time zone (or, shorter, timestamptz). timestamp is oblivious to time zone considerations and does not concern us here. Both data types have a resolution of microseconds (six decimal places).

timestamp with time zone diverges from the SQL standard's ideas in two respects:

  1. PostgreSQL does not store a time zone offset as part of the value. Instead, the value is converted to UTC and stored as an absolute point in time.
  2. PostgreSQL does not limit time zones to mere offsets from UTC, but supports named time zones with all their daylight savings time rules.

Both of these points cause confusion for PostgreSQL users, but the first point more than the second. In effect, the name “timestamp with time zone” is not an accurate description for the PostgreSQL data type, and it would be more appropriate to call it “absolute timestamp”: it represents a fixed point in time, without any respect for the time zone in which an event takes place.

For completeness' sake, let me comment on a bad practice: for reasons that are beyond my understanding, many people store timestamps in a database as the number of seconds since the “Unix epoch” (midnight 1970-01-01 UTC). There is no advantage to doing that from a database perspective, only a number of disadvantages: the values are not human-readable, the type system cannot catch errors (any number is accepted), and none of PostgreSQL's date and time arithmetic and functions can be used directly.

The TimeZone database parameter

The observations in the previous section make you wonder how to use timestamp with time zone in PostgreSQL. The key to that is the database parameter TimeZone, which contains the time zone in use in the current database session. Whenever PostgreSQL converts a timestamp with time zone to a string, it formats it according to the current setting of TimeZone and displays an appropriate offset from UTC:
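For example:

```sql
SET timezone = 'Europe/Vienna';

SELECT TIMESTAMP WITH TIME ZONE '2022-06-01 12:00:00+00';
-- displayed as 2022-06-01 14:00:00+02

SET timezone = 'UTC';

SELECT TIMESTAMP WITH TIME ZONE '2022-06-01 12:00:00+00';
-- displayed as 2022-06-01 12:00:00+00
```

The stored value is the same in both cases; only the display changes.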

The important aspect here is that TimeZone is not intended to be set centrally on the database server. Rather, each database session should override the parameter according to the time zone in effect on the client side. That way, everybody sees the same timestamp in a different way, formatted in the correct time zone.

Time zone management: syntax in PostgreSQL

In addition to the limited provisions of the SQL standard, PostgreSQL allows three ways to specify time zones. You can use them as values for the parameter TimeZone or in the AT TIME ZONE construct, but also to specify a time zone in a timestamp constant.

IANA time zone names

IANA is an internet organization best known for managing top level domains and IP address ranges, but they also maintain a database of time zone definitions that is used by many software components throughout the world. This database is also known as “Olson database” after its founder.

IANA time zone names are of the form “Area/Location”. “Area” is a continent or ocean, or the special area Etc, which is used for “administrative zones” that have no geographical location. “Location” is the most important city or other small locale and stands for the time zone used in its surroundings. Examples of such time zones are Europe/Vienna, Europe/Paris, America/New_York and Etc/UTC.

To get a list of the time zones available in your PostgreSQL installation, look at the catalog view pg_timezone_names. PostgreSQL maintains its own copy of the IANA time zone database, but most operating systems also have a copy. You can configure which copy you want to use when you build PostgreSQL from source. To determine which copy of the IANA time zone database your PostgreSQL binaries are using, run “pg_config --configure”: if the output contains --with-system-tzdata, the operating system's copy is used.

Note that many of these time zones (like Europe/Vienna and Europe/Paris) are currently the same. However, it makes sense to retain different names, because these time zones were not always the same, nor do they need to stay the same in the future.

The IANA time zones include data about present and historical time zone and daylight saving times definitions:
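You can see this with queries like the following (the exact historical offsets come from the IANA database):

```sql
SET timezone = 'Europe/Vienna';

SELECT TIMESTAMPTZ '2022-03-01 12:00:00 UTC';  -- displayed with offset +01
SELECT TIMESTAMPTZ '2022-04-01 12:00:00 UTC';  -- displayed with offset +02
SELECT TIMESTAMPTZ '1880-01-01 12:00:00 UTC';  -- displayed with an odd, pre-1893 solar-time offset
```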

The shift between the first and second result is explained by the fact that Vienna changed to daylight savings time on the last Sunday in March 2022, and the odd last result shows that before 1893, Vienna used local solar time rather than a time zone offset from Greenwich Mean Time.

Using IANA time zone names offers the richest functionality, and they are the royal road through the impenetrable jungle of regional time zone definitions.

Time zone abbreviations

These are the well-known abbreviations like PST for “Pacific Standard Time” or CET for “Central European Time”. You can find the list that your PostgreSQL database understands in the system catalog pg_timezone_abbrevs.

Note that these time zone abbreviations have several downsides:

  1. They are ambiguous: IST, for example, can stand for Irish Standard Time, Israel Standard Time or India Standard Time, and which interpretation wins depends on the configuration parameter timezone_abbreviations.
  2. An abbreviation stands for a fixed offset from UTC, so it cannot handle daylight savings time changes: CET is always UTC+1, and in summer you have to switch to CEST yourself.

Convenient as it is to have a short abbreviation, I would advise to avoid time zone abbreviations other than UTC for the above reasons.

POSIX-style time zone specifications

You can find a description of the syntax for POSIX time zones in appendix B of the PostgreSQL documentation. The main advantage is that POSIX time zones allow you to specify daylight savings time changes, so they may be useful for cases that are not covered by the IANA time zone database, unlikely as such a case may be.

To illustrate the full power of the format, here is the time zone that describes what is currently in use in most countries of the European Union:
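Such a POSIX specification could look like this:

```
CET-1CEST,M3.5.0,M10.5.0/3
```

No start time is given for the switch to daylight savings time, so the POSIX default of 02:00 applies.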

That is to be read as follows: during winter, the time zone abbreviation in effect is CET, which is one hour ahead of UTC. Daylight savings time has the abbreviation CEST, starts on the last (“fifth”) Sunday (day 0) of March (month 3) at 2 am and ends on the last Sunday in October at 3 am.

There are also some problems with POSIX time zones:

  1. The sign of the offset is the opposite of what ISO 8601 and PostgreSQL timestamps use: positive offsets are west of Greenwich, so CET-1 denotes one hour east of Greenwich.
  2. PostgreSQL accepts any syntactically valid POSIX time zone without verifying that it exists: a typo like CETT-1 is silently accepted as a brand-new time zone.

Particularly the first of these points is a frequent source of confusion. It is therefore best to avoid POSIX time zones as well.

Type casts and conversions in PostgreSQL time zone management

There are two ways to convert between timestamp with time zone and timestamp: type casts and the expression AT TIME ZONE.

When data is cast between the two data types, the timestamp is always interpreted as being in the time zone given by TimeZone:
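For example, with TimeZone set to Europe/Vienna:

```sql
SET timezone = 'Europe/Vienna';

-- timestamp → timestamp with time zone:
-- the value is interpreted as Vienna local time
SELECT CAST (TIMESTAMP '2022-06-01 12:00:00' AS timestamptz);
-- result: 2022-06-01 12:00:00+02

-- timestamp with time zone → timestamp:
-- the value is converted to Vienna local time
SELECT CAST (TIMESTAMPTZ '2022-06-01 12:00:00+00' AS timestamp);
-- result: 2022-06-01 14:00:00
```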

The expression AT TIME ZONE also converts between the two data types:

When applied to a timestamp with time zone, the result is a timestamp that shows what a wall clock in the given time zone would show:
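For example:

```sql
SELECT TIMESTAMPTZ '2022-06-01 12:00:00+00' AT TIME ZONE 'Europe/Vienna';
-- result: 2022-06-01 14:00:00  (a plain timestamp)
```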

At twelve noon in time zone UTC, a Viennese wall clock would show 2 pm.

When applied to a timestamp, the result of AT TIME ZONE is a timestamp with time zone representing the absolute time corresponding to the given timestamp on a wall clock in the given time zone:
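For example (with TimeZone set to UTC, so that the result is displayed with a zero offset):

```sql
SET timezone = 'UTC';

SELECT TIMESTAMP '2022-06-01 12:00:00' AT TIME ZONE 'Europe/Vienna';
-- result: 2022-06-01 10:00:00+00
```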

When a Viennese clock shows twelve noon, it is 10 am UTC.

How to write a time zone-aware application

There are two good ways to deal with time zone management in your application:

  1. Use timestamp with time zone everywhere, and have each database session set the TimeZone parameter to the client's time zone, so that PostgreSQL performs all conversions for you.
  2. Use UTC everywhere, both in the database and in the application, and convert to the user's local time only in the presentation layer.

Don't try hybrid solutions, they will probably lead to pain and confusion.

Conclusion

Even though it is easy to get confused with time zones, you can steer clear of most problems if you use timestamp with time zone everywhere, stick with IANA time zone names, and make sure to set the TimeZone parameter to the time zone in effect on the client side. Then PostgreSQL will do all the heavy lifting for you.

If you need to group nearby timestamps together in an analytical query, you might be interested in our article about the new date_bin function.


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

It's been many years since the first ideas to implement the MERGE command surfaced on the PostgreSQL mailing list. Now this important feature has finally made it into PostgreSQL core, in PostgreSQL 15. To show people how this vital command works, I have decided to come up with a technical preview to introduce my readers to this wonderful command. It adds so much power to PostgreSQL.

MERGE: Preparing a sample table

As the name suggests, MERGE can be used to merge two data sets into one. It allows you to define the desired behavior for matching and non-matching rows, opens an easy way to import missing data, and a lot more.

Let's prepare some data and try it out:
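A setup along these lines produces such a list (the column names a and b are illustrative; the original names may differ):

```sql
CREATE TABLE t_test (a int, b int);

INSERT INTO t_test
SELECT i, i * 10
FROM generate_series(1, 5) AS i;
```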

What we've produced is a list consisting of two columns. The second value is 10 times the value in the first column. This table will be our “target table” which is supposed to receive changes.

Using MERGE in PostgreSQL 15

Let's try to run MERGE on our data set and see what happens. Here's an example of a simple query:
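A statement along the following lines matches that description (a sketch; it assumes the sample table has columns a and b, with b being 10 times a):

```sql
MERGE INTO t_test
USING (SELECT i AS a, i * 10 AS b
       FROM generate_series(1, 9, 2) AS i) AS y
ON t_test.a = y.a
WHEN MATCHED THEN
   UPDATE SET b = y.b
WHEN NOT MATCHED THEN
   INSERT VALUES (y.a, y.b);
```

The source y generates the odd numbers 1 to 9, so the odd rows already in the table are overwritten and the missing rows 7 and 9 are inserted.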

The t_test table will be the target table and we merge “y”, which is the result of the USING clause, into it. The WHEN clauses simply define the rules applied during the MERGE operation. When we have a match, we want to overwrite the existing value. In case the value does not exist we want to run INSERT. There are two things worth mentioning here: If we look closely at the UPDATE statement, we see that the target table is not needed anymore. That's also true for the INSERT statement at the end - there is no need to define the table all over again.

The result is as expected:

What we see here is that all the odd numbers have been updated, while the rest remained unchanged. Missing rows were added.

However, there is more: We can also use DELETE inside of a MERGE statement:
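For example, to delete every row that has a match in the source (same illustrative sample data and column names as above):

```sql
MERGE INTO t_test
USING (SELECT i AS a, i * 10 AS b
       FROM generate_series(1, 9, 2) AS i) AS y
ON t_test.a = y.a
WHEN MATCHED THEN
   DELETE
WHEN NOT MATCHED THEN
   INSERT VALUES (y.a, y.b);
```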

In this case we have deleted all matching rows. The DELETE does not need any further arguments. It is totally clear which rows are affected and column information is not needed.

Finally …

MERGE is a valuable new feature in PostgreSQL 15. We've waited for many years and now it makes a lot of code much easier to deal with.
If you want to learn more about PostgreSQL and if you're interested in how PostgreSQL stores functions and procedures, check out our post about it.
