A couple of weeks ago I saw a horribly designed, 370 billion (with a "b") row Oracle database monster running some sort of aggregations. Given the sheer amount of data, I naturally thought about how I would implement the same thing in PostgreSQL. What I noticed is that most people implement aggregations in PostgreSQL in a slightly suboptimal way - so I decided to share some thoughts on the GROUP BY clause and performance.
Before getting started, two relations are created:
test=# CREATE TABLE t_gender (id int, name text);
CREATE TABLE
test=# INSERT INTO t_gender VALUES (1, 'male'), (2, 'female');
INSERT 0 2
t_gender is a classical "lookup table". It contains only a handful of rows, which will be joined on. The real "fact table" in my example is t_person:
test=# CREATE TABLE t_person (
    id              serial,
    gender          int,
    data            char(40)
);
CREATE TABLE
For this example it should be enough to load a couple of million rows. The following INSERT statement makes sure that genders are evenly distributed in the table: half of the population is male, the other half is female:
test=# INSERT INTO t_person (gender, data)
       SELECT x % 2 + 1, 'data'
       FROM generate_series(1, 5000000) AS x;
INSERT 0 5000000
The goal of the following code is to take our data and turn it into a simple report. Reporting is pretty important these days, so you will see many queries like the one outlined below:
test=# SELECT name, count(*)
       FROM t_gender AS a, t_person AS b
       WHERE a.id = b.gender
       GROUP BY 1;
  name  |  count
--------+---------
 female | 2500000
 male   | 2500000
(2 rows)

Time: 961.034 ms
The goal is to figure out how many men and women our database contains. The way to do that is to simply join the lookup table. On my laptop this takes around 961 ms.
The question is: Can we do better? Of course there is always a way to speed things up: more CPUs, more cache, etc. However, that is not the kind of improvement I am talking about. My question is: Can we use a smarter algorithm? Many people might be surprised, but the answer is "yes".
If you want to understand a performance problem there is usually no way to get around reading execution plans. Here is what the planner thinks:
explain SELECT name, count(*)
        FROM t_gender AS a, t_person AS b
        WHERE a.id = b.gender
        GROUP BY 1;

                    QUERY PLAN
-----------------------------------------------------
 HashAggregate ...
   Group Key: a.name
   ->  Hash Join  (rows=5000034)
         Hash Cond: (b.gender = a.id)
         ->  Seq Scan on t_person b  (rows=5000034)
         ->  Hash  (cost=1.02..1.02 rows=2 width=10)
               ->  Seq Scan on t_gender a  (rows=2)
PostgreSQL scans both tables sequentially and joins them together. Then the joined data is aggregated. In other words: 5 million rows will be joined with a small table.
However, there is an alternative: What if we aggregate first and join later? What if we just count the IDs and then look up the names? The beauty of this approach is that we only have to join 2 rows instead of 5 million.
Here is what we could do:
test=# WITH x AS
       (
            SELECT gender, count(*) AS res
            FROM t_person AS a
            GROUP BY 1
       )
       SELECT name, res
       FROM x, t_gender AS y
       WHERE x.gender = y.id;
...
...
Time: 526.472 ms
Voila, the same thing happens A LOT faster. The Common Table Expression (CTE) is executed first and then joined. WITH acts as an "optimization barrier", making sure that the optimizer cannot rearrange things across the CTE boundary.
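(A side note for readers on newer versions: starting with PostgreSQL 12, plain CTEs may be inlined by the optimizer, so the barrier behavior described here is no longer guaranteed. To force the CTE to be materialized first, the MATERIALIZED keyword can be spelled out explicitly - a sketch based on the query above:)

```sql
WITH x AS MATERIALIZED
(
    SELECT gender, count(*) AS res
    FROM t_person AS a
    GROUP BY 1
)
SELECT name, res
FROM x, t_gender AS y
WHERE x.gender = y.id;
```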
Let us take a look at the plan:
test=# explain WITH x AS
       (
            SELECT gender, count(*) AS res
            FROM t_person AS a
            GROUP BY 1
       )
       SELECT name, res
       FROM x, t_gender AS y
       WHERE x.gender = y.id;

           QUERY PLAN
---------------------------------
 Hash Join  (rows=2)
   Hash Cond: (y.id = x.gender)
   CTE x
     ->  HashAggregate  (rows=2)
           Group Key: a.gender
           ->  Seq Scan on t_person a  (rows=5000034)
   ->  Seq Scan on t_gender y  (rows=2)
   ->  Hash  (rows=2)
         ->  CTE Scan on x  (rows=2)
Sweet: The CTE is calculated and later joined, giving us the extra boost we desired.
The reason why PostgreSQL doesn't do this automatically is buried deep inside the structure of the planner. As of version 10.x, PostgreSQL always has to join first and aggregate later. Serious work is currently being done to lift this restriction and give the planner a bit more flexibility. Various developers, including people from my team here at CYBERTEC, are actively working on this issue, and I am hopeful to see a speedup in PostgreSQL 11 or maybe PostgreSQL 12.
Find out the latest info on PostgreSQL performance tuning in our blogposts.
By Kaarel Moppel - Restricting column access - Something from the security realm this time. Normally this side of the "database world" doesn't get too much attention and "love", because it's difficult/tedious and requires some deep understanding of how the system works... but ignoring security, as we know, can catch up with you badly.
Anyway, I'm not telling a real horror story here. Some weeks ago a friend who's a "casual DBA" (install, start/stop, backup/restore) asked for help with a situation where a column was "leaking" data, and I had to think a bit longer than usual to find an appropriate place where this issue could be fixed. The solution I suggested in the end made use of the rule system of PostgreSQL. I hadn't seen rules being used in years (one should generally try to avoid having stuff happen behind the scenes) and had to refresh on them a bit, but this was the correct use case for applying them, so I thought I'd share it as well.
So, the application they had was a bought "black box" webapp that did some quite trivial CRUD on some data forms, and all was fine - until they noticed that in one form a column with address data was visible to all users of the system, instead of just a specific login role. They couldn't immediately change the code and were looking for a quick fix to hide the data in that column altogether for all users on the DB side (where they could make changes), until the app got properly fixed. So what means/tricks are available for this kind of stuff? Not too many, and they usually have implications... but let's have a look at them.
When data is accessed by different login roles, normally the best way to limit access to single columns is the "column privileges" feature - just do a REVOKE SELECT followed by a GRANT SELECT (col1, col2, ...), leaving the "secret" columns out of the list. With our app it wouldn't work, as only one backend role was used, and all the SELECTs on that table would start to throw errors. Additionally, the SQL behind every other form showing the column would need to be adjusted. A no-go.
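For reference, the column privileges approach would look something like this (the role name app_user is made up for illustration; the table matches the example used later in this post):

```sql
-- remove blanket table access, then grant back only the harmless columns
REVOKE SELECT ON problem_table FROM app_user;
GRANT SELECT (id) ON problem_table TO app_user;

-- now "SELECT id FROM problem_table" still works for app_user,
-- while "SELECT * FROM problem_table" and any query touching
-- leaky_data fail with a "permission denied" error
```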
This would be the easiest trick (and very easily rolled back) for a temporary fix... but it only works when no new data is being added - in our case, though, new customer registrations were still coming in from the web. Another no-go.
Renaming the old table and creating a simple view with the same name, including all the wanted columns (and adding the appropriate GRANTs, for example by copying the existing ones with "pg_dump -s -t problem_table | grep ^GRANT"), is relatively easy. We could replace our "secret" column's value with, for example, an empty string and all would be fine... until we again get some INSERTs or UPDATEs. Then we would see an error like the one below, although simple DML is allowed on views by default in Postgres:
ERROR:  cannot insert into column "leaky_data" of view "problem_table"
DETAIL:  View columns that are not columns of their base relation are not updatable.
So how do you get around this INSERT/UPDATE problem?
To overcome the errors on adding/changing data, we need to re-route the data into the “base table” of our view. This can be easily done with rules and it would be a good fit for this use case. Read on for a sample.
So let’s say this is our leaking table:
CREATE TABLE problem_table(id int, leaky_data text);
And now we want to replace all values in the leaky_data column with empty strings, so we write something like this:
BEGIN;

ALTER TABLE problem_table RENAME TO problem_table_real;

CREATE VIEW problem_table AS
    SELECT id, ''::text AS leaky_data
    FROM problem_table_real;

CREATE RULE problem_table_ins AS
    ON INSERT TO problem_table
    DO INSTEAD
    INSERT INTO problem_table_real
    VALUES (NEW.id, NEW.leaky_data);

CREATE RULE problem_table_upd AS
    ON UPDATE TO problem_table
    DO INSTEAD
    UPDATE problem_table_real
    SET id = NEW.id,
        leaky_data = NEW.leaky_data
    WHERE id = NEW.id;

COMMIT;
Now indeed it does what we want:
INSERT INTO problem_table (id, leaky_data) VALUES (1, 'secret');

SELECT * FROM problem_table;
 id │ leaky_data
────┼────────────
  1 │
(1 row)
NB! If the application uses constructs like UPDATE ... RETURNING *, it's also possible to close the loophole there in the rule definition, by adding a "RETURNING ..., ''::text AS leaky_data" part. Hope it helps!
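As a sketch (an untested variation of the UPDATE rule above), the sanitized RETURNING clause could look like this:

```sql
CREATE OR REPLACE RULE problem_table_upd AS
    ON UPDATE TO problem_table
    DO INSTEAD
    UPDATE problem_table_real
    SET id = NEW.id,
        leaky_data = NEW.leaky_data
    WHERE id = NEW.id
    -- the RETURNING list must match the view's columns,
    -- so we can substitute an empty string for the real value:
    RETURNING problem_table_real.id,
              ''::text AS leaky_data;
```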
Find out the latest on security matters in PostgreSQL by reading our security blog posts.
While I was at a database training to migrate some poor fellows from Oracle to PostgreSQL, I was confronted with this simple question: "How many users can you have in PostgreSQL?". Obviously somebody had made some bad experiences with other databases, or Oracle in particular, to come up with this kind of question. "Well, as many as you want. How many do you need?". After some brief discussion it seemed that people just did not believe what I said. So I decided to show them how easily a large number of database users can be created in no time.
The good thing about PostgreSQL is that DDL statements can run inside transactions. As user creation is done via DDL, it is possible to create all those new accounts in a single transaction. Also, 1 million users is a lot, so we clearly don't want to compile this list by hand. And: the entire process has to be quick and take only 5 minutes at a training course.
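To illustrate the point about transactional DDL, here is a quick sketch (the user name is made up for this example): user creation can be rolled back like any other statement, which is exactly what makes the single-transaction approach safe.

```sql
BEGIN;
CREATE USER throwaway_demo;
ROLLBACK;

-- as far as the system is concerned, the user never existed:
SELECT count(*) FROM pg_user WHERE usename = 'throwaway_demo';
-- returns 0
```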
In short: I needed something fast, so I decided to come up with the following script...
BEGIN;
SELECT 'CREATE USER xy' || id
FROM generate_series(1, 1000000) AS id
\gexec
COMMIT;
It simply generates one million SQL commands, which do exactly what I want. Here is the output of the SELECT:
test=# SELECT 'CREATE USER xy' || id
       FROM generate_series(1, 1000000) AS id;
       ?column?
-----------------------
 CREATE USER xy1
 CREATE USER xy2
 CREATE USER xy3
 CREATE USER xy4
 CREATE USER xy5
 CREATE USER xy6
 CREATE USER xy7
...
The idea is to get 1 million statements. Before running the script, pg_user contains just a single user:
test=# SELECT count(*) FROM pg_user;
 count
-------
     1
(1 row)
Here comes the trick: \gexec takes each row returned by the query and executes it directly as an SQL statement. Finally the transaction is committed. Running the whole script from a file looks like this:
[hs@zenbook master]$ time psql test < /tmp/a.sql > /dev/null

real    1m0.009s
user    0m4.212s
sys     0m8.600s
On my laptop, generating 1 million users in a single transaction takes around 1 minute. Many people might be surprised that it is actually possible to create this number of users. Others might be delighted to see that it only takes one minute. The first feedback I got from my folks here at Cybertec HQ was: “Why does it take so long 😉 ”. Well, people in the PostgreSQL community see things differently sometimes 🙂
test=# SELECT count(*) FROM pg_user;
  count
---------
 1000001
(1 row)
The bottom line is really that PostgreSQL can easily handle a million users without any problems. It is good to see that it is possible to create so many accounts with just 4 lines of code.
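Cleaning up afterwards works the same way; a sketch using the same \gexec trick in reverse:

```sql
BEGIN;
SELECT 'DROP USER xy' || id
FROM generate_series(1, 1000000) AS id
\gexec
COMMIT;
```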
Ever since Hannes Eder published the idea of the SKYLINE OF operator on the PostgreSQL mailing list years ago, I have been somewhat hooked on the idea of being able to write more intelligent queries in PostgreSQL. So, what is the idea of a "Skyline query"? Here is the basic concept: Imagine you want to go on holiday, and you are looking for a nice hotel on the beach. The trouble is: The hotels with a nice view of the beach are way too expensive, while the hotels further back are cheap but far away from the sea. The question is: What is the best compromise?
That's exactly what this post is all about.
Here is an example:
test=# CREATE TABLE t_hotel (
    id              serial,
    name            text,
    price           numeric,
    distance_beach  numeric
);
CREATE TABLE
The table stores the name of a hotel, the price and the distance to the beach. Let us add a couple of rows manually:
test=# INSERT INTO t_hotel (name, price, distance_beach)
       VALUES ('ABC Motel', 120, 2.4);
INSERT 0 1
test=# INSERT INTO t_hotel (name, price, distance_beach)
       VALUES ('Crapstone Hotel', 90, 2.2);
INSERT 0 1
test=# INSERT INTO t_hotel (name, price, distance_beach)
       VALUES ('Luxury Watch Spa Hotel', 495, 0.2);
INSERT 0 1
test=# INSERT INTO t_hotel (name, price, distance_beach)
       VALUES ('Nowhere Middle Hotel', 45, 9.6);
INSERT 0 1
If we select our hotels sorted by price, we will most likely end up far away from the beach in a cheap, low-quality hotel. Clearly, this is not desirable:
test=# SELECT * FROM t_hotel ORDER BY price;
 id |          name          | price | distance_beach
----+------------------------+-------+----------------
  4 | Nowhere Middle Hotel   |    45 |            9.6
  2 | Crapstone Hotel        |    90 |            2.2
  1 | ABC Motel              |   120 |            2.4
  3 | Luxury Watch Spa Hotel |   495 |            0.2
(4 rows)
However, if we sort by distance, we will end up close to the beach, but we won't be able to afford it. The trouble is that none of those queries will actually offer us a good compromise:
test=# SELECT * FROM t_hotel ORDER BY distance_beach;
 id |          name          | price | distance_beach
----+------------------------+-------+----------------
  3 | Luxury Watch Spa Hotel |   495 |            0.2
  2 | Crapstone Hotel        |    90 |            2.2
  1 | ABC Motel              |   120 |            2.4
  4 | Nowhere Middle Hotel   |    45 |            9.6
(4 rows)
Fortunately, PostgreSQL allows us to use more sophisticated sort criteria. Sorting by a single column is boring; what we want is to weight the columns differently. In this case, customers might feel that distance is not really linear: whether you are 20 or 50 meters away from the beach hardly matters anymore, but being 50 meters versus 1 km away matters a lot. To keep it simple, I decided to go for the square root of the distance, while taking the price as it is. The result looks way more promising than before:
test=# SELECT price * sqrt(distance_beach), *
       FROM t_hotel
       ORDER BY 1;
   ?column?   | id |          name          | price | distance_beach
--------------+----+------------------------+-------+----------------
 133.491572   |  2 | Crapstone Hotel        |    90 |            2.2
 139.427400   |  4 | Nowhere Middle Hotel   |    45 |            9.6
 185.903200   |  1 | ABC Motel              |   120 |            2.4
 221.37072977 |  3 | Luxury Watch Spa Hotel |   495 |            0.2
(4 rows)
It seems that the Crapstone hotel is the best bargain here. It is not the cheapest hotel, but it is pretty close and still reasonably priced, so maybe it is most fitting to book that one.
The trouble starts when we look at the execution plan of this tiny PostgreSQL query:
test=# explain SELECT price * sqrt(distance_beach), *
       FROM t_hotel
       ORDER BY 1;
                            QUERY PLAN
------------------------------------------------------------------
 Sort  (cost=48.74..50.32 rows=630 width=132)
   Sort Key: ((price * sqrt(distance_beach)))
   ->  Seq Scan on t_hotel  (cost=0.00..19.45 rows=630 width=132)
(3 rows)
PostgreSQL will read all the data and sort it by our custom criterion. While this is fine for a small data set, it will kill us as the amount of data keeps growing.
Let us see what happens if we load 5 million rows into our table:
test=# TRUNCATE t_hotel;
TRUNCATE TABLE
test=# INSERT INTO t_hotel (price, distance_beach)
       SELECT 40 + random() * 200, random() * 15
       FROM generate_series(1, 5000000);
INSERT 0 5000000
Loading all this data is clearly not a problem, but check out what happens now:
test=# \timing
Timing is on.
test=# SELECT price * sqrt(distance_beach), price, distance_beach
       FROM t_hotel
       ORDER BY 1
       LIMIT 10;
           ?column?           |      price       |     distance_beach
------------------------------+------------------+-------------------------
 0.15700293278521447089153382 | 199.127877652645 | 0.000000621657818555832
 0.3200968902259212440086465  | 77.0173728093505 | 0.0000172737054526806
 0.3452837023672139082940331  | 59.0800635144114 | 0.0000341562554240227
...
(10 rows)

Time: 18916.807 ms (00:18.917)
It took almost 19 seconds (on my laptop) to run the query. Most users would not tolerate this kind of behavior too often, so we somehow need to improve the runtime.
The SKYLINE OF operator does not exist in PostgreSQL (nor in any other database engine I am aware of). However, PostgreSQL offers functional indexes, which are ideal in this case:
test=# CREATE FUNCTION rate_hotel(numeric, numeric)
       RETURNS numeric AS
$$
    SELECT $1 * sqrt($2)
$$ LANGUAGE 'sql' IMMUTABLE;
CREATE FUNCTION
The important thing here is to use an IMMUTABLE function. We must ensure that the function used to rank the data is perfectly deterministic: given the same input parameters, its result must never change.
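To see why the keyword matters, here is a quick sketch of what happens without it (the function name is made up for this demonstration): a function without an explicit volatility declaration defaults to VOLATILE, and PostgreSQL refuses to use it in an index expression.

```sql
-- same body, but VOLATILE by default because nothing is declared
CREATE FUNCTION rate_hotel_volatile(numeric, numeric)
RETURNS numeric AS
$$
    SELECT $1 * sqrt($2)
$$ LANGUAGE 'sql';

CREATE INDEX idx_bad ON t_hotel (rate_hotel_volatile(price, distance_beach));
-- ERROR:  functions in index expression must be marked IMMUTABLE
```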
Creating the index is easy:
test=# CREATE INDEX idx_fix_hotel
       ON t_hotel (rate_hotel(price, distance_beach));
CREATE INDEX
Time: 22706.882 ms (00:22.707)
test=# ANALYZE;
ANALYZE
Time: 354.660 ms
Our new index really boosts things and reduces the runtime of this query to around 1 millisecond, which is roughly 20,000 times faster than before. The result is the same:
test=# SELECT rate_hotel(price, distance_beach), price, distance_beach
       FROM t_hotel
       ORDER BY 1
       LIMIT 10;
          rate_hotel          |      price       |     distance_beach
------------------------------+------------------+-------------------------
 0.15700293278521447089153382 | 199.127877652645 | 0.000000621657818555832
 0.3200968902259212440086465  | 77.0173728093505 | 0.0000172737054526806
 0.3452837023672139082940331  | 59.0800635144114 | 0.0000341562554240227
...
(10 rows)

Time: 1.024 ms
The execution plan shows that PostgreSQL will directly go to the index and fetch the data needed. Indexes in PostgreSQL return sorted data so there is no need for sorting and no need to touch more than a handful of rows:
test=# explain SELECT rate_hotel(price, distance_beach), price, distance_beach
       FROM t_hotel
       ORDER BY 1
       LIMIT 10;
                                QUERY PLAN
--------------------------------------------------------------------------
 Limit  (cost=0.43..1.12 rows=10 width=55)
   ->  Index Scan using idx_fix_hotel on t_hotel
         (cost=0.43..346214.73 rows=4999993 width=55)
(2 rows)
Of course, the approach is somewhat different from the one described in the paper about Skyline queries. However, the method I have shown is easy to implement, efficient, and serves most real-world purposes. You can make your rating function reasonably sophisticated and still maintain good performance.
Read on about PostgreSQL in our latest performance blogs.