A father finds out that CHECK constraints won't work on children - they eloped to the neighbor's garden
© Laurenz Albe 2023


I am on a spree! After writing about breaking your database and transaction ID wraparound, here I am writing about damage caused by CHECK constraints! Trust me, I don’t want to destroy PostgreSQL databases. It’s just that this Twitter message was brought to my attention and triggered memories of other reports about CHECK constraints that caused problems.

When is a CHECK constraint correct?

The documentation warns that there are rules you have to play by when using CHECK constraints:

PostgreSQL assumes that CHECK constraints’ conditions are immutable, that is, they will always give the same result for the same input row. This assumption is what justifies examining CHECK constraints only when rows are inserted or updated, and not at other times.

That makes sense. Otherwise, PostgreSQL would have to check the condition whenever anything in the database changes (or whenever time has passed), which is not an option. However, PostgreSQL does not enforce that the expression in a CHECK constraint be IMMUTABLE.
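
You can verify that yourself with a deliberately nonsensical example (the table and column names are made up): PostgreSQL accepts even the VOLATILE function random() in a CHECK constraint without complaint:

/* accepted, although random() is VOLATILE and the condition
   gives a different result on every evaluation */
CREATE TABLE lottery (
   ticket integer CHECK (ticket <> floor(random() * 1000000)::integer)
);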

Why is IMMUTABLE not enforced in CHECK constraint expressions?

PostgreSQL normally doesn’t leave requirements like that to the discretion of the user. For example, trying to use a function that is not IMMUTABLE in an index definition causes the error:

ERROR:  functions in index expression must be marked IMMUTABLE
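
For example, the following statement fails with exactly that error, because date_trunc on timestamp with time zone is only STABLE (its result depends on the current timezone setting). The table is a throwaway, just for illustration:

CREATE TABLE stamped (created timestamp with time zone);

CREATE INDEX ON stamped (date_trunc('day', created));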

So why does PostgreSQL not enforce that for CHECK constraints? The reason is that there are valid use cases for CHECK constraints where the expression is not IMMUTABLE in the literal sense (but the constraint is). This is even required by the SQL standard. To quote ISO/IEC 9075-2:2003, chapter 11, verse 9:

<check constraint definition> ::= CHECK <left paren> <search condition> <right paren>

5) The <search condition> shall simply contain a <boolean value expression> that is retrospectively deterministic.

Earlier on, the standard explains:

A retrospectively deterministic <boolean value expression> has the property that if it is True at one point in time, then it is True for all later points in time if re-evaluated for the identical SQL-data by an arbitrary user with the identical set of privileges. The precise definition is found in Subclause 6.34, “<boolean value expression>”.

I’ll spare you the “precise” definition of “retrospectively deterministic”. It is enough to bring tears to your eyes and is a prime example of how an attempt to formalize an idea can go wrong. What you can distill from it is that the standard committee had something like this in mind:

CHECK (colname < current_timestamp - INTERVAL '1' DAY)

This is not an IMMUTABLE expression (current_timestamp can have a different value tomorrow), but certainly a reasonable CHECK constraint. Any row that satisfies that condition now will also satisfy it in the future.

Breaking valid UPDATEs with a bad CHECK constraint

A bad constraint to check the format of a string

Imagine we have a table

CREATE TABLE data (
   id bigint PRIMARY KEY,
   lirum text NOT NULL
);

and we have a table to commemorate deleted rows

CREATE TABLE history (
   id bigint NOT NULL,
   archived_at timestamp with time zone NOT NULL,
   row text NOT NULL,
   view_count integer DEFAULT 0 NOT NULL
);

that is populated by a trigger

CREATE FUNCTION archive() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   INSERT INTO history
      (id, archived_at, row)
   VALUES (OLD.id, current_timestamp, OLD::text);

   /* the return value of an AFTER trigger is ignored */
   RETURN NULL;
END;$$;

CREATE TRIGGER archive AFTER DELETE ON data
   FOR EACH ROW EXECUTE FUNCTION archive();

We want to create a CHECK constraint that enforces that the string has the correct format. So we define

/* never mind the IS NOT NULL, the type cast is the test */
ALTER TABLE history ADD CONSTRAINT row_correct
   CHECK (row::data IS NOT NULL);
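
A quick test with a malformed string (made-up values) shows that the constraint does what we want:

INSERT INTO history (id, archived_at, row)
VALUES (2, current_timestamp, 'no record literal');
ERROR:  malformed record literal: "no record literal"
DETAIL:  Missing left parenthesis.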

Demonstrating the problem

Let’s add and delete some data:

INSERT INTO data (id, lirum)
VALUES (1, 'some data');

DELETE FROM data WHERE id = 1;

Sure enough, there is now a row in history:

TABLE history;

 id │          archived_at          │       row       │ view_count 
════╪═══════════════════════════════╪═════════════════╪════════════
  1 │ 2023-01-26 07:26:07.460679+01 │ (1,"some data") │          0
(1 row)

At some later time, we need to add a new column to our table:

ALTER TABLE data ADD larum text NOT NULL;

Next time we look at the history table and want to increase view_count, we get:

UPDATE history
SET view_count = view_count + 1
WHERE id = 1
RETURNING id, archived_at, row;
ERROR:  malformed record literal: "(1,"some data")"
DETAIL:  Too few columns.

What went wrong, and what should we have done instead?

What happened? By adding a column to the table data, we also changed the composite data type of the same name. That broke “retrospective determinism”: the stored strings no longer match the changed data type. Even though we didn’t touch the checked value at all, the updated row no longer satisfies the CHECK constraint.
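
You can reproduce the failing cast in isolation:

SELECT '(1,"some data")'::data;
ERROR:  malformed record literal: "(1,"some data")"
DETAIL:  Too few columns.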

A better solution for this requirement would have been a trigger. You can configure the trigger to check the data only once, when the row is inserted. It is also possible to define the trigger so that it checks the data whenever the row is updated, but only if “row” is modified.
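
A minimal sketch of such a trigger (the function and trigger names are my own invention):

CREATE FUNCTION check_history_row() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   /* the cast throws an error if the string is malformed */
   PERFORM NEW.row::data;

   RETURN NEW;
END;$$;

CREATE TRIGGER check_history_row
   BEFORE INSERT OR UPDATE OF row ON history
   FOR EACH ROW EXECUTE FUNCTION check_history_row();

With that trigger (and the CHECK constraint dropped), existing rows are never checked again, so adding a column to data can no longer break later UPDATEs.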

Breaking restore with a bad CHECK constraint

A bad constraint that checks dependencies between tables

For the sake of the example, let’s consider a rental car management system:

CREATE TABLE vehicle (
   id bigint PRIMARY KEY,
   model text NOT NULL,
   seats smallint NOT NULL
);

CREATE TABLE client (
   id bigint PRIMARY KEY,
   name text NOT NULL,
   group_size smallint NOT NULL
);

/* for the exclusion constraint */
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE rented (
   vehicle_id bigint REFERENCES vehicle NOT NULL,
   client_id bigint REFERENCES client NOT NULL,
   from_to daterange NOT NULL,
   PRIMARY KEY (vehicle_id, client_id),
   EXCLUDE USING gist (from_to WITH &&, vehicle_id WITH =)
);

Now it is not a good idea to rent out a small car to a big group, and we know that constraints are best enforced in the database. So we want a constraint that keeps us from assigning a car to a client whose group size exceeds the number of seats in the car:

ALTER TABLE rented ADD CHECK (
   (SELECT vehicle.seats FROM vehicle
    WHERE vehicle.id = rented.vehicle_id) >=
   (SELECT client.group_size FROM client
    WHERE client.id = rented.client_id)
);
ERROR:  cannot use subquery in check constraint

What an annoying limitation! This shows us that PostgreSQL indeed places some limits on what we are allowed to do in a CHECK constraint. Fortunately we can work around it with functions:

/* SQL-standard function bodies (BEGIN ATOMIC) require PostgreSQL v14 or better */
CREATE FUNCTION get_seats(bigint) RETURNS smallint
BEGIN ATOMIC
   SELECT seats FROM vehicle WHERE id = $1;
END;

CREATE FUNCTION get_group_size(bigint) RETURNS smallint
BEGIN ATOMIC
   SELECT group_size FROM client WHERE id = $1;
END;

ALTER TABLE rented ADD CHECK
   (coalesce(get_seats(vehicle_id), 0) >=
    coalesce(get_group_size(client_id), 0));

Great! That’s much more readable anyway!

Testing the constraint

Let’s verify that this works as intended:

INSERT INTO vehicle (id, model, seats) VALUES
   (1, 'Mercedes Vito', 9),
   (2, 'Audi TT', 4);

INSERT INTO client (id, name, group_size) VALUES
   (100, 'Albe', 5),
   (101, 'McDuck', 1);

INSERT INTO rented (vehicle_id, client_id, from_to) VALUES
   (2, 100, '[2022-07-01,2022-07-14]');
ERROR:  new row for relation "rented" violates check constraint "rented_check"
DETAIL:  Failing row contains (2, 100, [2022-07-01,2022-07-15)).

INSERT INTO rented (vehicle_id, client_id, from_to) VALUES 
   (1, 100, '[2022-07-01,2022-07-14]');

Looks good!

The broken backup

We perform database backups using

pg_dump -F c -f dumpfile cars

On the dark day of data loss, when we try to restore our backup, we will get an unexpected error message:

pg_restore -d cars dumpfile
pg_restore: error: COPY failed for table "rented": ERROR:  new row for relation "rented" violates check constraint "rented_check"
DETAIL:  Failing row contains (1, 100, [2022-07-01,2022-07-15)).
CONTEXT:  COPY rented, line 1: "1	100	[2022-07-01,2022-07-15)"
pg_restore: warning: errors ignored on restore: 1

What went wrong, and what should we have done instead?

PostgreSQL dumps tables in alphabetical order to be deterministic. While it dumps most constraints at the end, to speed up data loading and to prevent integrity problems, it considers CHECK constraints part of the table definition and dumps them at the beginning. That is fine, since the expressions in CHECK constraints must be retrospectively deterministic and depend only on the row itself. Unfortunately, our cute constraint does not satisfy that requirement. PostgreSQL creates the tables, then restores the data for client, rented and vehicle, in that order. So when rented is loaded, vehicle is still empty, get_seats returns NULL, coalesce turns that into 0, and the constraint fails.
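
If you are curious, you can verify that order by looking at the archive's table of contents, which doesn't touch any database:

pg_restore --list dumpfile

The TABLE DATA entries for client, rented and vehicle appear in exactly that alphabetical order.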

Again, the correct solution would have been a trigger that throws an error if the condition is violated. While constraints have to be fulfilled all the time, triggers fire only on certain events. A dump contains the trigger definitions at the end, and the triggers don't fire at all while the data are restored, because they are only created afterwards.
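
Here is a minimal sketch of such a trigger, reusing the two lookup functions from above (the names are my own invention):

CREATE FUNCTION check_seats() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   IF coalesce(get_seats(NEW.vehicle_id), 0) <
      coalesce(get_group_size(NEW.client_id), 0) THEN
      RAISE EXCEPTION 'vehicle % is too small for client %',
         NEW.vehicle_id, NEW.client_id;
   END IF;

   RETURN NEW;
END;$$;

CREATE TRIGGER check_seats BEFORE INSERT OR UPDATE ON rented
   FOR EACH ROW EXECUTE FUNCTION check_seats();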

If we want to make sure that the condition is always fulfilled, we can go a different way: we could add a copy of seats and group_size to rented and include these columns in the FOREIGN KEY constraints to make sure they are always identical. Then the CHECK constraint only has to compare columns of rented, and the problem is gone.
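
For illustration, here is a sketch of that design, assuming we recreate rented from scratch. The additional UNIQUE constraints are needed as targets for the composite foreign keys, and ON UPDATE CASCADE keeps the copied columns in sync:

ALTER TABLE vehicle ADD UNIQUE (id, seats);
ALTER TABLE client ADD UNIQUE (id, group_size);

CREATE TABLE rented (
   vehicle_id bigint NOT NULL,
   client_id bigint NOT NULL,
   seats smallint NOT NULL,
   group_size smallint NOT NULL,
   from_to daterange NOT NULL,
   PRIMARY KEY (vehicle_id, client_id),
   FOREIGN KEY (vehicle_id, seats)
      REFERENCES vehicle (id, seats) ON UPDATE CASCADE,
   FOREIGN KEY (client_id, group_size)
      REFERENCES client (id, group_size) ON UPDATE CASCADE,
   CHECK (seats >= group_size),
   EXCLUDE USING gist (from_to WITH &&, vehicle_id WITH =)
);

Since the CHECK constraint now compares only columns of rented, a restore can load the data in any order.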

Conclusion

We saw two examples of how a badly defined CHECK constraint can break a database. The restore failure was particularly unpleasant. Sadly, these are not academic cases, and there have been reports about restore failures on the mailing lists. So make sure that your CHECK constraints are defined correctly, even if PostgreSQL cannot enforce that.

Apart from that, we have learned the pretentious phrase “retrospectively deterministic”, which may draw admiration or a shake of the head, depending on how gullible the audience is.