
Creating reports is a core task of every PostgreSQL database engineer or developer. However, many people think that it is enough to hack up some SQL that aggregates the data and run it. This is not quite true. We have repeatedly seen reports that were just plain wrong without anyone even noticing.

How can this happen?

Transaction isolation does make a difference

In our example we want to aggregate some simple data:
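The exact table doesn't matter; a minimal sketch could look like this (table and column names are assumptions):

    CREATE TABLE t_demo (
        gender  text,
        name    text
    );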

We add two boys and one girl:
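Something along these lines will do (the names are made up):

    INSERT INTO t_demo (gender, name)
    VALUES ('boy', 'Paul'),
           ('boy', 'John'),
           ('girl', 'Anna');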

Let us start with our report. We want to create a nice and colorful analysis displaying the number of girls, the number of boys, and the overall total.

To prove our point we have written two queries:

We start a transaction, do an analysis per gender followed by an overall count (and yes, there are more efficient ways of doing that).
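Assuming the t_demo table from above, the two queries could look like this:

    BEGIN;
    SELECT gender, count(*) FROM t_demo GROUP BY gender;   -- analysis per gender
    SELECT count(*) FROM t_demo;                            -- overall count
    COMMIT;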

The question now is: Is this report correct or not? Most people would ask: “Why not?”

Picture two concurrent sessions, User 1 and User 2: User 1 runs the report, while User 2 changes the data and commits in between. Here is why:
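A sketch of the timeline (the statement issued by User 2 is an assumption; any data change committed between the two queries produces the same effect):

    -- User 1
    BEGIN;
    SELECT gender, count(*) FROM t_demo GROUP BY gender;   -- 2 boys, 1 girl

    -- User 2, in a separate session (committed immediately via autocommit)
    DELETE FROM t_demo WHERE name = 'John';

    -- User 1 again
    SELECT count(*) FROM t_demo;                            -- returns 2 instead of 3
    COMMIT;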

What we see here is that the count no longer reflects what we saw before. The overall count is two, even though it is supposed to be three. The reason for this behavior, which is unexpected for many users, is that we are running in the so-called “read committed” isolation level. In “read committed” mode, every statement takes a new snapshot of the database. In other words: every SELECT sees all data committed by other transactions before that SELECT started. In our case the second SELECT already sees the other transaction's commit. So, over time, a “read committed” transaction can change its view of the data. This implies that a report consisting of more than one query might not operate on the same data and can therefore return inconsistent results.

Repeatable read – a way to fix transaction visibility

To fix the problem we can run our transaction in “repeatable read” mode. This will make sure that your transaction will have a consistent view of the data throughout the transaction and therefore return correct results.

Running a transaction in “repeatable read” works like this:
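Sticking with the t_demo example:

    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SELECT gender, count(*) FROM t_demo GROUP BY gender;
    SELECT count(*) FROM t_demo;   -- sees the same snapshot as the first query
    COMMIT;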

If you happen to do reporting, we always recommend using “repeatable read”, because it allows you to run your reports on a consistent view of the data.


UPDATED August 2023 - Table bloat in PostgreSQL: Many people wonder why deleting data from a table in a PostgreSQL database does not shrink files on disk. You would expect storage consumption to go down when data is deleted. This is not always the case.

To show how this works, I have compiled some examples.

Let's get started with a basic table and add 100k rows to it:
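For example (the table name t_test is an assumption):

    CREATE TABLE t_test (id int);
    INSERT INTO t_test SELECT * FROM generate_series(1, 100000);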

The size of the table will be around 3.5 MB on disk. We can find out about the size of a table using pg_relation_size:
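Assuming the t_test table from above:

    SELECT pg_size_pretty(pg_relation_size('t_test'));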

Now let's delete half of the rows:
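Deleting everything larger than 50,000 removes the upper half of the data:

    DELETE FROM t_test WHERE id > 50000;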

Many users would expect this table to shrink to half the size. This is not the case:
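Checking again:

    SELECT pg_size_pretty(pg_relation_size('t_test'));   -- still roughly 3.5 MB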

The reason the storage space doesn't decrease is that deleting a row does not necessarily mean that nobody can see that row anymore. There might still be transactions around which will, at some point, need the row versions you have just deleted. In other words: PostgreSQL does not remove a row from the data file instantly; it merely marks the row as deleted, so that it becomes invisible to new transactions while older transactions can still see it.

To clean up those rows which are really dead, we can run VACUUM:
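In its simplest form:

    VACUUM t_test;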

In our example the table has been reduced in size:
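Checking the size once more:

    SELECT pg_size_pretty(pg_relation_size('t_test'));   -- now roughly half the original size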

How shrinking works

VACUUM is able to reclaim space and return it to the filesystem, but this is not always possible. A file can only be shrunk by cutting it off at the end, and that is only possible if there are no valid rows left after a certain position in the file. In our example we have loaded the data in ascending order into our PostgreSQL table (1, 2, 3, etc.). If we delete everything larger than 50,000, the second half of the table is gone. VACUUM will go through the table, figure out that the second half of the data file contains nothing but junk, and truncate the data file to return the space to the filesystem.

When shrinking does not reduce table bloat

In the first example, we arranged things in a way that we can see VACUUM returning space to the file system.

Now, let's construct an example where this is not possible:

First of all we drop the table and create a new one:
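Using the same (assumed) table name as before:

    DROP TABLE t_test;
    CREATE TABLE t_test (id int);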

Then we load 100,000 rows just like before:
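Again in ascending order:

    INSERT INTO t_test SELECT * FROM generate_series(1, 100000);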

In this example we will delete the first half of the data:
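That is, everything up to 50,000:

    DELETE FROM t_test WHERE id <= 50000;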

Just like before we can run VACUUM:
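The call is the same as before:

    VACUUM t_test;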

This time the size of the underlying data file has not changed:
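Checking with pg_relation_size again:

    SELECT pg_size_pretty(pg_relation_size('t_test'));   -- still around 3.5 MB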

Remember what we said about cleanup and VACUUM? VACUUM can only shrink a file when free space is at the end. However, our example has been set up in a way that this is not the case.

CTIDs and their role in understanding PostgreSQL table bloat

In PostgreSQL there is a thing called CTID, which says where a certain row is. ctid=(0, 14) would mean that we are talking about the 14th row in the first data block of the table.

Take a look at our table and check out the highest 5 CTIDs:
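One way to do that (assuming the t_test table from above):

    SELECT ctid, id FROM t_test ORDER BY ctid DESC LIMIT 5;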

As you can see, the last row is in block 442. Do the math: 442 * 8192 bytes ≈ 3.5 MB. This is why the table has not changed in size.

Let's check the lowest CTID:
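Same query, sorted the other way round:

    SELECT ctid, id FROM t_test ORDER BY ctid LIMIT 1;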

We can see that the first valid row is somewhere in the middle of the table. Therefore the table has not been truncated by VACUUM.

UPDATE AUGUST 2023:

Postgres version 12 introduced a table storage parameter called vacuum_truncate, which controls whether VACUUM truncates empty pages off the end of a table. There are situations in which you might want to avoid vacuum truncation, for example because the truncation step briefly takes an ACCESS EXCLUSIVE lock on the table, which can interfere with concurrent activity (and can lead to query conflicts on standby servers).
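To disable the truncation for a single table, the storage parameter can be set with ALTER TABLE (using the assumed t_test table from above):

    ALTER TABLE t_test SET (vacuum_truncate = off);
    ALTER TABLE t_test RESET (vacuum_truncate);   -- revert to the default behavior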



If your PostgreSQL database is poorly maintained (lack of VACUUM) or badly structured, you might face some table bloat.
UPDATE Sept. 2023 - See this blog for a more recent update on this topic: Table bloat revisited.

Table bloat is somewhat nasty because it slows down your database and eats up more storage than needed. Therefore it can be important to figure out whether a table is bloated or not. The PostgreSQL contrib section provides a module called pgstattuple to diagnose table bloat.

To use the module we have to activate it:
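The extension ships with the standard contrib packages:

    CREATE EXTENSION pgstattuple;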

For a test we create a table and add some 10,000 rows to it on the fly:
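A minimal sketch (the table name is an assumption, and a single integer column is enough for our purposes):

    CREATE TABLE t_test AS
        SELECT * FROM generate_series(1, 10000) AS id;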

We can already see how pgstattuple works. To keep the wide psql output readable, it helps to switch psql to expanded display first (\x). In our example, the call and its output look roughly like this:
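    SELECT * FROM pgstattuple('t_test');
    -- returns table_len, tuple_count, tuple_len, tuple_percent,
    -- dead_tuple_count, dead_tuple_len, dead_tuple_percent,
    -- free_space and free_percent for the table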

As you can see, the size of the table is just over 368 kB. Our table has a fill grade of around 76%. Note that these numbers don't quite add up to 100%; the difference is page- and row-level overhead. In reality, the fill grade of a freshly loaded table will be a lot higher than in our trivial single-column case, because the fixed per-row overhead weighs much less for wider rows.

To demonstrate table bloat we can delete some data. In this example we delete one third of those rows:
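The exact predicate doesn't matter; any statement removing about a third of the rows will do, for example:

    DELETE FROM t_test WHERE id % 3 = 0;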

Note that the size of the table on disk is still the same. The deleted rows have simply moved to the “dead tuple” part of the output:
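Running pgstattuple again:

    SELECT * FROM pgstattuple('t_test');
    -- table_len is unchanged; dead_tuple_count is now roughly a third of the rows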

The first lesson here is that DELETE does not shrink a table on disk. It merely marks rows as dead. This is highly important – many people are misled by this behavior.

To reclaim the space occupied by those dead rows we can call VACUUM:
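As before:

    VACUUM t_test;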

What we see here is that those dead rows have been converted to free space. And yes, the size of the table is still the same:
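A final look at pgstattuple:

    SELECT * FROM pgstattuple('t_test');
    -- dead_tuple_count and dead_tuple_percent are back to zero,
    -- free_space / free_percent have grown, table_len is unchanged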

This free space can now be used to store new rows inside your table.

Checking many tables

In our example we have checked one table. But what if you want to check your entire database to see who is to blame?

In this case, you have to make use of the PostgreSQL type system and a system catalog. Let us discuss the type system first: in the previous example we saw that pgstattuple can be called in the FROM clause, because it returns an entire row. But we have to do this for all tables, not just for one. To be able to put the list of objects we want to check into the FROM clause (in the form of the pg_class system catalog), we move the function call into the SELECT clause instead. The call will then return a composite type consisting of all the fields:
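A sketch of this approach, restricted to ordinary tables (relkind = 'r'):

    SELECT relname, pgstattuple(oid::regclass)
    FROM pg_class
    WHERE relkind = 'r';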

This is not really readable so we have to break up those fields again. Some brackets and a star will do the job:
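Continuing the sketch from above:

    SELECT relname, (pgstattuple(oid::regclass)).*
    FROM pg_class
    WHERE relkind = 'r';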

We can now sort and filter at will to figure out which table is to blame.

See also Laurenz Albe's post about "How a bad network connection can cause table bloat."


In my role as a PostgreSQL consultant and trainer, there are a couple of issues that pop up on a regular basis. One of them has to do with set operations: many people seem to misunderstand the way UNION and UNION ALL work.

UPDATE Sept. 2023: See this post on UNION ALL for more recent information.

The basic assumption of most users is that UNION just appends data – nothing can be further from the truth. Here is why:
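A minimal example:

    SELECT 1
    UNION
    SELECT 1;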

Most people would expect that two rows will be returned from this query. In fact, it is just one row. The reason for that is simple. UNION does NOT just append data – it also removes duplicates. As we have two identical numbers in our example, only one row will remain.

We can use the explain command to see how this operation is performed internally:
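For instance:

    explain SELECT 1 UNION SELECT 1;
    -- depending on the version, the plan shows the two results being appended
    -- and then de-duplicated (e.g. via a Sort/Unique or HashAggregate step)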

PostgreSQL sorts the data and removes duplicates afterwards (newer versions may use hashing instead of sorting). The way PostgreSQL handles this query also explains why many users report performance problems when using UNION: PostgreSQL has to process all the data to remove duplicates, which is clearly more expensive than just appending one result to the other.

What most people really want

What people in many cases really want is UNION ALL.

Let us take a look at an example:
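A sketch analogous to the UNION case:

    SELECT 1
    UNION ALL
    SELECT 1;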

In this case data is really just appended and therefore we get two lines of data.

The plan will reveal that things are really the way we want them to be:
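For instance:

    explain SELECT 1 UNION ALL SELECT 1;
    -- the plan contains just an Append of the two results,
    -- with no extra sort, unique or aggregation step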

PostgreSQL will take both result sets and just add them up.

So be careful when it comes to set operations and think twice if you really want to filter duplicates or not.


After digging through a table consisting of dozens of boolean columns, I thought it might be helpful for more folks out there to know about bitfields in PostgreSQL.

Basically “bit” can be used just as a normal char data type. The idea behind bit is to have a fixed set of bits, which can be accessed at will. Here is an example:
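A minimal sketch (the table and column names are made up):

    CREATE TABLE t_bit (flags bit(4));
    INSERT INTO t_bit VALUES ('1100'), ('0011');
    SELECT * FROM t_bit;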

You can use zeros and ones to represent bits in a PostgreSQL bitfield. The nice thing about it is that you can easily cast those bitfields to integer:
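Continuing with the assumed t_bit table:

    SELECT flags, flags::int FROM t_bit;
    -- '1100' becomes 12, '0011' becomes 3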

In our case we would have reduced 4 boolean columns to just one single column. Just imagine the benefit if you have to deal with hundreds of values. Clearly – normalizing boolean columns is not too attractive either.

Converting integer back to bit

The previous example has shown how a bitfield can be turned into integer. Let us try to convert stuff back now:
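Casting an integer to bit(n) keeps the n least significant bits:

    SELECT 12::bit(4);   -- returns 1100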

In this case we have turned a number into a bitfield. All it takes here is a simple cast.

Setting and getting bits

What happens if you want to update a column containing a bitfield? To do so, PostgreSQL offers a function called set_bit. It takes a bitfield, a position, and the desired new value of the bit you want to change. Note that positions are counted from zero, starting at the leftmost bit.

Setting a bit works as follows:
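A quick example:

    SELECT set_bit('1000'::bit(4), 2, 1);   -- returns 1010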

If you want to fetch a certain bit from your field, you can call get_bit:
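Like this:

    SELECT get_bit('1010'::bit(4), 2);   -- returns 1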

In this example we have extracted the third bit from the bitfield.

Bitshifting

Shifting bits around is a pretty common operation, and it can also be done in PostgreSQL. Here are two examples – one with a variable-length bitfield (varbit, the bit counterpart of varchar) and one with a fixed-length bitfield:
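As a minimal sketch, here are shifts on a fixed-length bit(8) value (shift operators preserve the length of the bit string and fill with zeros):

    SELECT '11000000'::bit(8) >> 2;   -- 00110000
    SELECT '11000000'::bit(8) << 1;   -- 10000000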

The nice thing about bitfields is that their size is virtually unlimited. You want a thousand bits or maybe ten thousand? No problem – this is all possible and easy to do.

For more posts on data types and their use in PostgreSQL, check out our data type blog spot.

UPDATED AUGUST 2023: After my previous post, I received mails asking about what was meant by using “generate_series” in an outer join.

Let me try to address this issue with an example: Many people use PostgreSQL to aggregate data. In this example, we assume that we want to aggregate data on an hourly basis.

Sample data 

Here's some sample data:

We create a table with just two fields:
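A sketch of such a table and some rows for it (names and exact timestamps are assumptions):

    CREATE TABLE t_data (
        id  serial,
        t   timestamptz
    );

    INSERT INTO t_data (t)
    VALUES ('2023-08-01 03:10:00'),
           ('2023-08-01 03:45:00'),
           ('2023-08-01 04:20:00'),
           ('2023-08-01 06:05:00');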

We have data for 3am, 4am and 6am ... we've intentionally left 5am out. If we want to start to aggregate the data to see how many rows there are per hour, we can use the following query:
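For example (assuming the t_data table sketched above):

    SELECT date_trunc('hour', t) AS hour, count(*)
    FROM t_data
    GROUP BY 1
    ORDER BY 1;
    -- note: there is no row for 5am at all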

The SQL challenge

A simple GROUP BY reveals the gap: there is simply no row for the 5am group. A zero is not displayed, because GROUP BY only returns groups that actually contain rows.

In order to create a list of times we want to see in our final report, we can execute generate_series. Let's assume we want to have a complete table from midnight to 6am:
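Using the (assumed) date of the sample data:

    SELECT *
    FROM generate_series('2023-08-01 00:00:00'::timestamptz,
                         '2023-08-01 06:00:00'::timestamptz,
                         '1 hour'::interval) AS h;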

This will return a complete list – exactly what we need to display in the final results. Remember, we also want to display hours with no rows.

Joining things together with an outer join

Let's now join things together using a standard PostgreSQL outer join:
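Something along these lines (again based on the assumed sample data):

    SELECT h, t
    FROM generate_series('2023-08-01 00:00:00'::timestamptz,
                         '2023-08-01 06:00:00'::timestamptz,
                         '1 hour'::interval) AS h
    LEFT JOIN t_data ON date_trunc('hour', t_data.t) = h
    ORDER BY h;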

We run date_trunc to reduce the precision of our timestamps to full hours, and then use that value as the join key. Note that for hours without matching rows, the columns coming from the “data” side of the join will be NULL.

Watch out for trouble – be precise

Once we have joined the data we have to aggregate to fetch the final result. When doing this aggregation, watch out for a common mistake:
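A sketch of the problematic version:

    SELECT h, count(*)
    FROM generate_series('2023-08-01 00:00:00'::timestamptz,
                         '2023-08-01 06:00:00'::timestamptz,
                         '1 hour'::interval) AS h
    LEFT JOIN t_data ON date_trunc('hour', t_data.t) = h
    GROUP BY h
    ORDER BY h;
    -- 5am shows a count of 1 instead of 0, because the outer join
    -- still produces one (all-NULL) row for that hour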

Never EVER use count(*) along with an outer join.

The problem is that count(*) counts ROWS. We don't want to count rows here: for hours without data, the outer join still produces a row (full of NULLs), so count(*) returns 1 instead of 0. As you can see, the result is just plain wrong.

To correct the problem, I advise you to count non-NULL values. In SQL, this means that you have to use count(column) rather than count(*). count(column) will only count values which are not NULL. That's exactly what we want:
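The corrected sketch:

    SELECT h, count(t_data.t)
    FROM generate_series('2023-08-01 00:00:00'::timestamptz,
                         '2023-08-01 06:00:00'::timestamptz,
                         '1 hour'::interval) AS h
    LEFT JOIN t_data ON date_trunc('hour', t_data.t) = h
    GROUP BY h
    ORDER BY h;
    -- 5am now correctly shows 0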

Voilà, the result is exactly as desired.



UPDATED August 2023: Time is one of those unloved topics every programmer has to deal with. Have you ever written a routine heavily dependent on time calculations in C, Python, PHP, Perl, or any other language? In that case, I'm sure you've fallen in love with time management as much as I have.

When it comes to processing time, PostgreSQL is really like a kind of revelation – simple, easy to deal with, efficient, and highly capable.

Day constants in PostgreSQL - YESTERDAY, TODAY, and TOMORROW

One of the most common tasks that PostgreSQL excels at is to figure out about yesterday, today and tomorrow. Consider the following query:
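One way to write it:

    SELECT 'yesterday'::date, 'today'::date, 'tomorrow'::date;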

These three PostgreSQL constants allow you to fetch important dates in SQL quickly and easily. There is no need for nasty math here.

PostgreSQL can also handle dates like February 29th nicely:
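For instance, adding a day to February 28th does the right thing depending on the year:

    SELECT '2024-02-28'::date + 1;   -- 2024-02-29 (leap year)
    SELECT '2023-02-28'::date + 1;   -- 2023-03-01 (no leap day)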

Coping with intervals

But PostgreSQL is not only able to work with dates. It can also be extremely convenient in terms of working with intervals. Basically, an interval can be subtracted and added to a timestamp at will.

Consider the following example:
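A couple of sketches:

    SELECT now() + '3 days 2 hours'::interval;
    SELECT now() - '1 century 5 hours'::interval;
    -- unit order doesn't matter: '5 hours 1 century' works just as well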

The interval datatype accepts an easily readable format. You don't even have to put the units in exact order – hours can be placed before centuries, or the other way around. There are absolutely no restrictions on order, which makes coding with this type a lot simpler.

Generating timeseries in PostgreSQL: generate_series

Did you ever write a web application which has to display a calendar? In case you did, I'm pretty sure you had a problem coming up with a list of days for a given month. In most programming languages, doing that kind of processing is pretty hard. PostgreSQL can assist here. It provides a simple, efficient method:
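A sketch for a month-long calendar (the dates are arbitrary):

    SELECT *
    FROM generate_series('2023-08-01'::timestamp,
                         '2023-08-31'::timestamp,
                         '1 day'::interval);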

The generate_series function takes three parameters: The first one defines the starting timestamp. The second parameter defines the ending timestamp, and the third parameter will tell PostgreSQL the size of the increments for the interval. In our example, we defined the interval length as one day, but you can use any increment you desire.

Keep in mind: generating a time series can come in handy when you have to write outer joins – see the next blog post in this series for more info. Doing stuff like that on the database side can dramatically reduce the amount of engineering needed.

Update: date_bin and timestamps in PostgreSQL

Since this blog post was written, a new function was added in PostgreSQL 14 to solve a problem which has caused challenges for many users: How can we map timestamps to time bins? The function is called date_bin.

What people often do is round a timestamp to a full hour. That’s commonly done using the date_trunc function. But what if you want to round data in a way that things fit into a 30-minute or a 15-minute grid? Find out all about date_bin in this blog post.
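A quick taste of date_bin (stride, source timestamp, origin):

    SELECT date_bin('15 minutes',
                    '2023-08-01 10:38:00'::timestamp,
                    '2023-08-01 00:00:00'::timestamp);
    -- 2023-08-01 10:30:00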

Finally...

For more advanced information on how PostgreSQL processes time, check out our other blog posts on the topic.


