Time zone as an excuse for oversleeping
© Laurenz Albe 2022

Next to character encoding, time zones are among the least-loved topics in computing. In addition, PostgreSQL’s implementation of timestamp with time zone is somewhat surprising. So I thought it might be worth to write up an introduction to time zone management and recommendations for its practical use.

Time zones and the SQL standard

The SQL standard has rather limited provisions for specifying time zones:

<timestamp string> ::=
  <quote> <unquoted timestamp string> <quote>

<unquoted time string> ::=
  <time value> [ <time zone interval> ]

<time zone interval> ::=
  <sign> <hours value> <colon> <minutes value>

It has the following to say about time zones:

The surface of the earth is divided into zones, called time zones, in which every correct clock tells the same time, known as local time. Local time is equal to UTC (Coordinated Universal Time) plus the time zone displacement, which is an interval value that ranges between INTERVAL ‘–14:00’ HOUR TO MINUTE and INTERVAL ‘+14:00’ HOUR TO MINUTE. The time zone displacement is constant throughout a time zone, and changes at the beginning and end of Summer Time, where applicable.

In other words, the standard only has time zone management provisions for representing an offset from UTC. Users or applications that have to cope with daylight savings time changes have to change the offset at the appropriate times to reflect that. Apparently not a lot of usability considerations went into this part of the SQL standard (or none of the big database vendors whose voice carries weight in the standard committee had a smarter implementation).

It is also interesting to note that the SQL standard seems to consider the time zone offset to be a part of the timestamp value, although it is not totally clear on that point.

Timestamp data types in PostgreSQL

There are two data types for timestamps in PostgreSQL: timestamp (also known as timestamp without time zone) and timestamp with time zone (or, shorter, timestamptz). timestamp is oblivious to time zone considerations and does not concern us here. Both data types have a resolution of microseconds (six decimal places).

timestamp with time zone diverges from the SQL standard’s ideas in two respects:

  • It does not store the time zone. Instead, the timestamp is stored as the number of seconds since midnight 2000-01-01 UTC.
  • There are more ways than the offset from UTC to specify a time zone (details about that later). That allows you to deal with daylight savings time changes effectively.

Both of these points cause confusion for PostgreSQL users, but the first point more than the second. In effect, the name “timestamp with time zone” is not an accurate description for the PostgreSQL data type, and it would be more appropriate to call it “absolute timestamp”: it represents a fixed point in time, without any respect for the time zone in which an event takes place.

For completeness’ sake, let me comment on a bad practice: for reasons that are beyond my understanding, many people store timestamps in a database as the number of seconds since the “Unix epoch” (midnight 1970-01-01 UTC). There is no advantage to doing that from a database perspective, only a number of disadvantages:

  • The values stored in the database are incomprehensible to the naked eye.
  • You are losing most of the features of datetime arithmetic, such as being able to calculate current_timestamp - INTERVAL '1 month'.
  • As a consequence of the previous point, your WHERE conditions tend to be more complicated, which is bad for the performance of your queries.

The TimeZone database parameter

The observations in the previous section make you wonder how to use timestamp with time zone in PostgreSQL. The key to that is the database parameter TimeZone, which contains the time zone in use in the current database session. Whenever PostgreSQL converts a timestamp with time zone to a string, it formats it according to the current setting of TimeZone and displays an appropriate offset from UTC:

CREATE TABLE tstz (
   id bigint PRIMARY KEY,
   t timestamp with time zone NOT NULL
);

INSERT INTO tstz VALUES (1, '2022-04-01 12:00:00+02:00');
INSERT INTO tstz VALUES (2, '2022-04-01 12:00:00-02:00');

SET TimeZone = 'UTC';

TABLE tstz;

 id │           t            
════╪════════════════════════
  1 │ 2022-04-01 10:00:00+00
  2 │ 2022-04-01 14:00:00+00
(2 rows)

The important aspect here is that TimeZone is not intended to be set centrally on the database server. Rather, each database session should override the parameter according to the time zone in effect on the client side. That way, everybody sees the same timestamp in a different way, formatted in the correct time zone.

Time zone management: syntax in PostgreSQL

In addition to the limited provisions of the SQL standard, PostgreSQL allows three ways to specify time zones. You can use them as values for the parameter TimeZone or in the AT TIME ZONE construct, but also to specify a time zone in a timestamp constant.

IANA time zone names

IANA is an internet organization best known for managing top level domains and IP address ranges, but they also maintain a database of time zone definitions that is used by many software components throughout the world. This database is also known as “Olson database” after its founder.

IANA time zone names are of the form “Area/Location”. “Area” is a continent or ocean or the special area Etc, which is used for “administrative zones” that have no geographical location. “Location” is the most important city or other small locale and stands for the time zone used in its surroundings. Examples for such time zones are:

America/New_York
Europe/Vienna
Europe/Paris
Pacific/Pitcairn
Etc/UTC

To get a list of the time zones available in your PostgreSQL installation, look at the system catalog pg_timezone_names. PostgreSQL maintains its own copy of the IANA time zone database, but most operating systems also have a copy. You can configure which copy you want to use when you build PostgreSQL from source. To determine which copy of the IANA time zone database your PostgreSQL binaries are using, run “pg_config --configure”: if the output contains --with-system-tzdata, the operating system’s copy is used.

Note that many of these time zones (like Europe/Vienna and Europe/Paris) are currently the same. However, it makes sense to retain different names, because these time zones were no always the same, nor do they need to stay the same in the future.

The IANA time zones include data about present and historical time zone and daylight saving times definitions:

SET TimeZone = 'UTC';

SELECT TIMESTAMPTZ '2022-04-01 12:00:00 Europe/Vienna';

      timestamptz       
════════════════════════
 2022-04-01 10:00:00+00
(1 row)

SELECT TIMESTAMPTZ '2022-03-01 12:00:00 Europe/Vienna';

      timestamptz       
════════════════════════
 2022-03-01 11:00:00+00
(1 row)

SELECT TIMESTAMPTZ '1850-02-01 12:00:00 Europe/Vienna';

      timestamptz       
════════════════════════
 1850-02-01 10:54:39+00
(1 row)

The shift between the first and second result is explained by the fact that Vienna changed to daylight savings time on the last Sunday in March 2022, an the odd last result shows that before 1893, Vienna used local solar time rather than a time zone offset from Greenwich Mean Time.

Using IANA time zone names offers the richest functionality, and using them is the king’s way through the impenetrable jungle of regional time zone definitions.

Time zone abbreviations

These are the well-known abbreviations like PST for “Pacific Standard Time” or CET for “Central European Time”. You can find the list that your PostgreSQL database understands in the system catalog pg_timezone_abbrevs.

Note that these time zone abbreviations have several downsides:

  • They are merely abbreviations for a certain fixed offset from UTC and do not contain any information about daylight savings time. For example, there is CET for “Central European Time” and CEST for “Central European Summer Time”.
  • There is no international agreement on these abbreviations. For example, on my PostgreSQL installation, IST means “Israel Standard Time” (used in winter and offset 2 hours from UTC), while to the Irishwoman the same abbreviation stands for “Irish Summer Time” and almost a billion people would expect it to be “India Standard Time”.

Convenient as it is to have a short abbreviation, I would advise to avoid time zone abbreviations other than UTC for the above reasons.

POSIX-style time zone specifications

You can find a description of the syntax for POSIX time zones in appendix B of the PostgreSQL documentation. The main advantage is that POSIX time zones allow you to specify daylight savings time changes, so they may be useful for cases that are not covered by the IANA time zone database, unlikely as such a case may be.

To illustrate the full power of the format, here is the time zone that describes what is currently in use in most countries of the European Union:

CET-1CEST,M3.5.0/2,M10.5.0/3

That is to be read as follows: during winter, the time zone abbreviation in effect is CET, which is one hour ahead of UTC. Daylight savings time has the abbreviation CEST, starts on the last (“fifth”) Sunday (day 0) of March (month 3) at 2 am and ends on the last Sunday in October at 3 am.

There are also some problems with POSIX time zones:

  • The offset in POSIX time zones has the opposite sign of the time zone offset used elsewhere. For example, “2022-04-01 12:00:00 UTC-2” is the same as “2022-04-01 12:00:00+02”: two hours east of UTC.
  • More complicated POSIX time zones cannot be used in timestamp literals, since they confuse the timestamp parser.
  • There is no way to express historical time zone changes with POSIX time zone syntax.

Particularly the first of these points is a frequent source of confusion. It is therefore best to avoid POSIX time zones as well.

Type casts and conversions in PostgreSQL time zone management

There are two ways to convert between timestamp with time zone and timestamp: type casts and the expression AT TIME ZONE.

When data is cast between the two data types, the timestamp is always interpreted as being in the time zone given by TimeZone:

SET TimeZone = 'UTC';

SELECT CAST (TIMESTAMP '2022-04-01 12:00:00'
             AS timestamp with time zone);

      timestamptz       
════════════════════════
 2022-04-01 12:00:00+00
(1 row)

SELECT CAST (TIMESTAMP WITH TIME ZONE '2022-04-01 12:00:00+02'
             AS timestamp);

      timestamp      
═════════════════════
 2022-04-01 10:00:00
(1 row)

The expression AT TIME ZONE also converts between the two data types:

When applied to a timestamp with time zone, the result is a timestamp that shows what a wall clock in the given time zone would show:

SELECT TIMESTAMP WITH TIME ZONE '2022-04-01 12:00:00+00'
       AT TIME ZONE 'Europe/Vienna';

      timezone       
═════════════════════
 2022-04-01 14:00:00
(1 row)

At twelve noon in time zone UTC, a Viennese wall clock would show 2 pm.

When applied to a timestamp, the result of AT TIME ZONE is a timestamp with time zone representing the absolute time corresponding to the given timestamp on a wall clock in the given time zone:

SELECT TIMESTAMP '2022-04-01 12:00:00'
       AT TIME ZONE 'Europe/Vienna';

        timezone        
════════════════════════
 2022-04-01 10:00:00+00
(1 row)

When a Viennese clock shows twelve noon, it is 10 am UTC.

How to write a time zone-aware application

There are two good ways to deal with time zone management in your application:

  • Use timestamp throughout, store UTC timestamps and let the application handle time zone conversions.
  • Use timestamp with time zone throughout, set TimeZone correctly in each session and let PostgreSQL handle time zone conversions.

Don’t try hybrid solutions, they will probably lead to pain and confusion.

Conclusion

Even though it is easy to get confused with time zones, you can steer clear of most problems if you use timestamp with timezone everywhere, stick with IANA time zone names and make sure to set the TimeZone parameter to the time zone on the client side. Then PostgreSQL will do all the heavy lifting for you.

If you need to group nearby timestamps together in an analytical query, you might be interested in our article about the new date_bin function.