UPDATED July 2023: PostgreSQL has a sophisticated security system capable of handling complex policies, and user permissions are an essential component of every production setup. However, over the past couple of years I have noticed that many users fall into the same trap: the public schema.
To make life easier, PostgreSQL provides a thing called the “public” schema, which can be found in every database. Prior to version 15, everybody could create objects in the public schema. This could lead to some interesting and unexpected behavior. (For info about PG v15 and higher, see the last paragraph.) Let's assume I'm connected to a database as superuser in v14:
security=# CREATE ROLE john NOSUPERUSER LOGIN;
CREATE ROLE
security=# CREATE TABLE t_test (id int4);
CREATE TABLE
I've created a new user as well as a table belonging to the superuser. Let's see what the new user is capable of doing.
$ psql security -U john
psql (14.5)
Type "help" for help.

security=> SELECT * FROM t_test;
ERROR: permission denied for relation t_test
As expected, our new user is not allowed to read from existing tables. He is not allowed to create a new schema or to drop the database either:
security=> CREATE SCHEMA sales;
ERROR: permission denied for database security
security=> DROP DATABASE security;
ERROR: must be owner of database security
So far everything is going according to plan. But, let us take a look at the next example:
security=> CREATE TABLE t_disaster (id int4);
CREATE TABLE
The user can easily create a new table. Most people would have expected a “permission denied” error here. However, this does not happen, because the public schema can be used by everybody. \d reveals what is going on here:
security=> \d
        List of relations
 Schema |    Name    | Type  | Owner
--------+------------+-------+-------
 public | t_disaster | table | john
 public | t_test     | table | hs
(2 rows)
We can see that the new relation belongs to John.
To make sure that things like this can't happen, we have to remove permissions from the public schema. The superuser can execute the following command to achieve that:
security=# REVOKE CREATE ON SCHEMA public FROM PUBLIC;
REVOKE
This will lead to exactly the kind of behavior we expected before:
security=> CREATE TABLE t_forbidden (id int4);
ERROR: no schema has been selected to create in
security=> CREATE TABLE public.t_forbidden (id int4);
ERROR: permission denied for schema public
Don't be misled by the error message in the first statement: it only means that PostgreSQL does not know in which schema to create the table. Prefixing the table name with the schema reveals the real cause of the problem and leads to the expected result.
You may run into this same error message yourself - to learn how to fix it, see my blog about ERROR: Permission Denied: schema public.
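By the way, if a specific role should still be allowed to create objects in the public schema after the REVOKE, a targeted grant is enough. Here is a minimal sketch (not part of the original example), using the john role from above:

security=# GRANT CREATE ON SCHEMA public TO john;
GRANT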
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
After digging through some customer source code yesterday I thought that it might be worth publishing a post about space consumption and enum types. It seems to be quite common to store status information as text fields. This is highly inefficient.
Here is an example:
test=# CREATE TABLE t_log (id serial, t timestamp, status text);
Unless we have defined a check constraint we can add any data we want:
test=# INSERT INTO t_log (status) VALUES ('online');
INSERT 0 1
test=# INSERT INTO t_log (status) VALUES ('offline');
INSERT 0 1
test=# INSERT INTO t_log (status) VALUES ('some useless stuff');
INSERT 0 1
The problem I want to point out here is that a string needs a fair amount of space. In addition to that, it is pretty likely that there is only a handful of status flags available anyway. Just imagine that you have to store billions of rows – a couple of bytes can definitely make a difference.
To reduce the amount of space needed, we can create an enum type. The advantage here is that an enum value needs just four bytes of storage, no matter how long its textual label is. An enum is also an implicit “check constraint” making sure that only those values we really want are allowed:
test=# CREATE TYPE log_status AS ENUM ('online', 'offline', 'waiting');
CREATE TYPE
In our example three values are allowed.
We can now use this type in our table (dropping the old version of t_log first):
test=# DROP TABLE t_log;
DROP TABLE
test=# CREATE TABLE t_log (id serial, t timestamp, status log_status);
CREATE TABLE
When we try to insert data, we will see that bad rows are not accepted:
test=# INSERT INTO t_log (status) VALUES ('online');
INSERT 0 1
test=# INSERT INTO t_log (status) VALUES ('some useless stuff');
ERROR: invalid input value for enum log_status: "some useless stuff"
LINE 1: INSERT INTO t_log (status) VALUES ('some useless stuff');
                                           ^
Enum data types are really a nice way to model a finite list of textual values efficiently. One cool feature is that those types preserve order. What does that mean? It means that if you sort an enum column, data will be returned in the order in which the enum values were defined, not in alphabetical order.
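A quick way to see that order is the enum_range function, which returns the values of an enum type in their sort order. Here is a small sketch using the log_status type from above:

test=# SELECT unnest(enum_range(NULL::log_status));
 unnest
---------
 online
 offline
 waiting
(3 rows)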
Changing enums later on in the process is definitely possible. ALTER TYPE provides a nice interface to add values to an enum type.
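To give you an idea of what that looks like, here is a minimal sketch; the 'maintenance' value is just made up for this example:

test=# ALTER TYPE log_status ADD VALUE 'maintenance' AFTER 'offline';
ALTER TYPE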
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
If a good programmer is facing a technical problem, he will always try to abstract it. In a procedural language such as C, people will create typedefs, functions and so on – in object-oriented programming, people will simply create objects. But what about data type abstraction in SQL? Most people seem to miss this possibility completely.
PostgreSQL (and most other relational databases I am aware of) provides a command called CREATE DOMAIN. The idea is to have a mechanism capable of providing data type abstraction.
Let us assume you are writing an application, which has to store information about a simple shop. You might have sales prices, discounts, taxes, and so on. In short: There will be many places in the data structure where you actually want to store a price or some sort of financial information.
Here is an example:
CREATE TABLE t_product (
    id              serial,
    name            text,
    normal_price    numeric(10, 4),
    discount_price  numeric(10, 4)
);
In this case we are storing two prices, and we have to repeat the same definition over and over again. If you store currency values in ten different tables, there is always a danger that one of your programmers will define them slightly differently – and that will inevitably lead to inconsistencies in your model.
CREATE DOMAIN is exactly what we need here. The syntax is actually quite simple:
test=# \h CREATE DOMAIN
Command:     CREATE DOMAIN
Description: define a new domain
Syntax:
CREATE DOMAIN name [ AS ] data_type
    [ COLLATE collation ]
    [ DEFAULT expression ]
    [ constraint [ ... ] ]

where constraint is:

[ CONSTRAINT constraint_name ]
{ NOT NULL | NULL | CHECK (expression) }
In our case we want to make sure that our column has at most 10 digits overall (4 of them after the decimal point), must not be NULL, and only accepts values higher than zero. Here is the command to achieve exactly that:
test=# CREATE DOMAIN currency AS numeric(10, 4) NOT NULL CHECK (VALUE > 0);
CREATE DOMAIN
Once we have created the domain we can use it just like any other data type:
test=# CREATE TABLE t_product (id serial, name text, normal_price currency, discount_price currency);
CREATE TABLE
test=# \d t_product
                             Table "public.t_product"
     Column     |   Type   |                        Modifiers
----------------+----------+---------------------------------------------------------
 id             | integer  | not null default nextval('t_product_id_seq'::regclass)
 name           | text     |
 normal_price   | currency |
 discount_price | currency |
The main advantage is that those checks we have assigned to the domain will now be used by all columns using the data type. It is a consistent way to store data.
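Here is a small sketch of what happens when somebody tries to sneak in a value that violates the domain (the exact constraint name shown in the error message may differ on your system):

test=# INSERT INTO t_product (name, normal_price, discount_price)
       VALUES ('sample product', 10.0, -1.0);
ERROR: value for domain currency violates check constraint "currency_check"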
If you want to modify things later on, you can still use ALTER DOMAIN, so you are not stuck with a mistake.
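A minimal sketch of such a later change; the constraint name and the upper limit are just made up for this example:

test=# ALTER DOMAIN currency ADD CONSTRAINT currency_upper_limit CHECK (VALUE < 1000000);
ALTER DOMAIN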
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
Histograms are a common way for a relational database to store statistical information about data. What is this kind of statistical information good for? Well, statistics are the rocket fuel behind performance. If statistics are bad, the optimizer will come up with bad decisions and poor execution plans will be the result. The PostgreSQL optimizer is a damn sophisticated piece of software and it provides end users with good plans in general – still, without statistics even the best SQL optimizer will yield bad performance.
The good news is that when things are a little tight, the end user can have some influence on the way PostgreSQL stores statistics. The ALTER TABLE ... SET STATISTICS command will allow users to define the size of the internal histograms storing statistical information.
Before we dig into those details we create a simple table and add some data to it:
test=# CREATE TABLE t_test AS SELECT * FROM generate_series(1, 1000000) AS x;
SELECT 1000000
This will create a table containing one million numbers. To make sure that the optimizer will know about the content of the table, we can run ANALYZE (a command in charge of creating statistics):
test=# ANALYZE;
ANALYZE
Whenever PostgreSQL has to plan a query, it will rely on the statistics we have just created. Let us take a look at a very simple plan:
test=# explain SELECT * FROM t_test WHERE x < 10000;
                          QUERY PLAN
-------------------------------------------------------------
 Seq Scan on t_test  (cost=0.00..16925.00 rows=9109 width=4)
   Filter: (x < 10000)
(2 rows)
We try to find all values lower than 10,000. The planner estimates that we will find roughly 9,100 rows. Overall this is pretty good and totally sufficient to plan the query efficiently. Remember, statistics are not about having the final answer right away – they are about having a reasonably good estimate.
Let us see which statistics PostgreSQL has gathered. To do so, we can take a look at a system view called pg_stats:
test=# \d pg_stats
          View "pg_catalog.pg_stats"
         Column         |   Type   | Modifiers
------------------------+----------+-----------
 schemaname             | name     |
 tablename              | name     |
 attname                | name     |
 inherited              | boolean  |
 null_frac              | real     |
 avg_width              | integer  |
 n_distinct             | real     |
 most_common_vals       | anyarray |
 most_common_freqs      | real[]   |
 histogram_bounds       | anyarray |
 correlation            | real     |
 most_common_elems      | anyarray |
 most_common_elem_freqs | real[]   |
 elem_count_histogram   | real[]   |
If you are interested in the content of pg_stats, we recommend taking a look at the PostgreSQL documentation. It outlines nicely how things work in this area: http://www.postgresql.org/docs/9.3/static/view-pg-stats.html
Sometimes it can be quite beneficial to change the size of a column's histogram. Reducing the size of the histogram will make estimates a lot less precise – but it will also reduce the time needed to plan a query. If you are dealing with a table with 1 billion rows and all you do is hit the primary key to fetch exactly one row at a time, a narrow histogram is definitely not a big issue.
However, if you are dealing with more complex operations, keeping the default of 100 or even raising this value can be quite beneficial.
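The default of 100 comes from the default_statistics_target parameter; ALTER TABLE ... SET STATISTICS simply overrides it for a single column. Assuming the setting has not been changed, you can check it like this:

test=# SHOW default_statistics_target;
 default_statistics_target
---------------------------
 100
(1 row)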
This is how you can change the size of the histogram of a specific column:
test=# ALTER TABLE t_test ALTER COLUMN x SET STATISTICS 10;
ALTER TABLE
Once this has been done, we can recreate the stats:
test=# ANALYZE t_test;
ANALYZE
Querying the stats table reveals what PostgreSQL knows about the column:
test=# \x
Expanded display is on.
test=# SELECT * FROM pg_stats WHERE tablename = 't_test' AND attname = 'x';
-[ RECORD 1 ]----------+-----------------------------------------------------------------------------
schemaname             | public
tablename              | t_test
attname                | x
inherited              | f
null_frac              | 0
avg_width              | 4
n_distinct             | -1
most_common_vals       |
most_common_freqs      |
histogram_bounds       | {492,100971,194099,290435,398419,485780,587410,692588,793607,898079,999686}
correlation            | 1
most_common_elems      |
most_common_elem_freqs |
elem_count_histogram   |
Note that the histogram consists of just 11 values now (which gives us 10 bars).
We can toy around with this setting now and see what happens to the plan made by PostgreSQL once the statistics target has been raised to 1,000:
test=# ALTER TABLE t_test ALTER COLUMN x SET STATISTICS 1000;
ALTER TABLE
test=# ANALYZE;
ANALYZE
test=# explain SELECT * FROM t_test WHERE x < 10000;
                          QUERY PLAN
-------------------------------------------------------------
 Seq Scan on t_test  (cost=0.00..16925.00 rows=9921 width=4)
   Filter: (x < 10000)
(2 rows)
In our example the estimate is a little bit more precise than before. However, this comes at a cost: planning time has increased, while the plan itself is still the same (because there is no other choice for the planner anyway).
By the way, keep in mind that those values should not be changed on a daily basis, and in many cases it is not necessary to change them at all – but it is nice to have the option when it is beneficial.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
“Waiting is the slowest way to execute an operation” - this is how a professor at my university in Vienna usually described bad locking and low concurrency. He could not have been more right. The same applies to performing calculations: The fastest way to calculate something is to try to skip it entirely. In this posting we want to focus on the first half of the story: Reducing the impact of locking.
Let us assume we have three tables:
test=# CREATE TABLE a (aid int4);
CREATE TABLE
test=# CREATE TABLE b (bid int4);
CREATE TABLE
test=# CREATE TABLE c (cid int4);
CREATE TABLE
We can populate those tables with a handful of values to get our example started:
test=# INSERT INTO a VALUES (1);
INSERT 0 1
test=# INSERT INTO b VALUES (1);
INSERT 0 1
test=# INSERT INTO c VALUES (1);
INSERT 0 1
Concurrency (= doing things in parallel) is an essential thing to consider when writing code. In most applications there is more than just one user at the same time and those users might very well focus their attention on the same set of data.
To make sure that nobody can modify a row while we are processing it, all modern relational databases such as Oracle and PostgreSQL provide a syntax component called “SELECT .. FOR UPDATE”. All rows returned by “SELECT FOR UPDATE” are locked as if they were about to be updated. This will help us to avoid concurrency issues and strange behavior (race conditions).
In our example we want to perform a join between table A and table B and lock all rows returned by SELECT. Here is the code:
test=# BEGIN;
BEGIN
test=# SELECT * FROM a, b WHERE aid = bid FOR UPDATE;
 aid | bid
-----+-----
   1 |   1
(1 row)
As you can see, data is returned as expected. For the sake of this example we keep the transaction open. Let us run a second transaction now, which is executed, while the first session is still active:
test=# BEGIN;
BEGIN
test=# SELECT * FROM a, b WHERE aid = bid FOR UPDATE;
-- waiting for the first session to commit
The point here is: The second session has to wait because session 1 and session 2 will have to lock rows in the B table. This is exactly the desired behavior.
But: What if B was just a simple lookup table? What if A was a payment table and B just a postal code table? What if C was a table storing data about people?
If somebody modifies a payment and has to look up a postal code for this purpose, we would automatically lock out other transactions that want to modify data about people, simply because they touch the very same postal code rows. The lock on the postal code table would magically spread to areas of the code which have nothing to do with payments.
If changes to payments are frequent, the postal code table would be the poison pill for all writing transactions touching the table storing people. Clearly, this is not acceptable. How can you ever scale to 64 CPUs or beyond if you are shot down by a simple row level lock?
PostgreSQL has the right solution to this kind of problem: SELECT FOR UPDATE allows you to specify which table in the join you want to lock. It works like this:
test=# BEGIN;
BEGIN
test=# SELECT * FROM a, b WHERE aid = bid FOR UPDATE OF a;
 aid | bid
-----+-----
   1 |   1
(1 row)
In this example we are telling the system that we are joining A and B, but we only want to lock A. This makes sure that B stays unlocked, so a second transaction that joins B and C and only locks C can proceed undisturbed:
test=# BEGIN;
BEGIN
test=# SELECT * FROM b, c WHERE bid = cid FOR UPDATE OF c;
 bid | cid
-----+-----
   1 |   1
(1 row)
This SELECT statement will only lock rows in table C and leave B untouched, because we don't want to modify B anyway.
As you can see, a small change can have a significant impact on overall performance, because we have allowed a second transaction (and thus a second CPU) to proceed without having to wait on another statement.
The golden rule of thumb is: the more concurrency you have in your system, the faster you are overall. Never lock more than needed.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
Just like most other database systems I am aware of, PostgreSQL provides a feature called tablespaces. In general, a tablespace is a collection of storage files residing under a certain mount point. The main idea of a tablespace is to give administrators a chance to scale out their I/O and to make things more efficient. Typically a tablespace makes a lot of sense if you want to use more than just one I/O subsystem.
As I have mentioned before, a tablespace can be used to add more hardware to the system – but, is there some other way to make use of a tablespace? Well, there is some sort of hidden “abuse”, which can be highly beneficial to performance: Using tablespaces to help the optimizer.
In PostgreSQL the optimizer is in charge of finding a good execution plan for a given query. To come up with a good plan, PostgreSQL has to calculate artificial costs for certain operations. To estimate I/O, two essential cost parameters are used: seq_page_cost (the cost of reading a block sequentially, 1.0 by default) and random_page_cost (the cost of reading a block through a random read, 4.0 by default).
In reality (on mechanical disks) random I/O is a lot more expensive than sequential I/O because the disk has to seek each block. This is why PostgreSQL assigns 4 penalty points to such a block.
Let us assume for a second there is no cache around. In a case like that, random I/O can very well be 50-100 times more expensive than sequential I/O. The question is: why is the cost parameter still at 4 instead of at 100 then? The answer is that we can assume that not all blocks will come from disk. Some will be cached, and on average 4 is a pretty good number.
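Assuming the defaults have not been changed, you can check both parameters easily:

test=# SHOW seq_page_cost;
 seq_page_cost
---------------
 1
(1 row)

test=# SHOW random_page_cost;
 random_page_cost
------------------
 4
(1 row)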
But what if a table (maybe 10 TB?) is so large that we can never expect an index lookup to benefit from a cache hit? What if we had additional small lookup tables which will always be in cache because they are used frequently? Clearly, it seems unfair to use the same cost parameters for those different types of tables.
A tablespace can come to the rescue here. PostgreSQL allows people to assign different cost parameters to different tablespaces. We could pack tables we expect to be uncached into one tablespace and lookup tables into some other tablespace. Those two tablespaces could even reside on the same set of drives – we would just use the tablespaces to honour different cost models.
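Here is a minimal sketch of what such a split might look like; the tablespace names and directories (ts_big, ts_lookup, /data/...) are made up for this example, and the directories must already exist and be owned by the postgres operating system user:

-- one tablespace for huge, rarely cached tables, one for small lookup tables
test=# CREATE TABLESPACE ts_big LOCATION '/data/ts_big';
CREATE TABLESPACE
test=# CREATE TABLESPACE ts_lookup LOCATION '/data/ts_lookup';
CREATE TABLESPACE
test=# CREATE TABLE t_huge (id int8, payload text) TABLESPACE ts_big;
CREATE TABLE
test=# CREATE TABLE t_lookup (code int4, description text) TABLESPACE ts_lookup;
CREATE TABLE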
Then we can tell the optimizer that reads from a certain tablespace are more expensive than usual. Here is how it works:
test=# ALTER TABLESPACE name_of_tablespace SET (random_page_cost = 20);
ALTER TABLESPACE
In this case we have told the system that a random read from a tablespace called name_of_tablespace is 5 times more expensive than a typical random read (a random_page_cost of 20 instead of the default of 4).
Especially when data sets are composed of very large and fairly small tables, using different cost parameters for various tablespaces can be highly beneficial, because it allows the optimizer to estimate costs more reliably. Of course, it takes some time to adjust those parameters properly, but it is well worth the effort.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.