CYBERTEC Logo

Exclusion constraints in PostgreSQL and a tricky problem

09.2023 / Category: / Tags: | |

Exclusion constraints are a feature that is not very well known, but can be used to implement highly sophisticated constraints. A few years ago, Hans wrote his blog post about EXCLUDE USING GIST... WITH. Recently we received a note from someone dealing with a very tricky problem concerning exclusion constraints: (many thanks to @necrotikS at YouTube for a very good problem to solve!)

Q:

I think this is a very difficult topic. I have the following columns: status, which can be either "SCHEDULED", "PENDING" or "CANCELED", field_id and the duration, which is a tstzrange type. I need a constraint that do not allow creating overlapping appointments for the same field ONLY IF THE APPOINTMENT STATUS IS "SCHEDULED". If there are appointment with status "PENDING" or "CANCELED", there could be other overlapping appointments. For example:
[sql gutter="false"]
Row 1 (Field 1, 'SCHEDULED', '[2023-01-01 08:00:00, 2023-01-01 09:00:00)')

Tries to create overlapping appointment with any status, should not allow, since there's a "SCHEDULED" appointment, for example:
[sql gutter="false"]
INSERT (Field 1, 'PENDING', '[2023-01-01 08:00:00, 2023-01-01 09:00:00)')
OR (Field 1, 'PENDING', '[2023-01-01 08:00:00, 2023-01-01 09:00:00)')
OR (Field 1, 'CANCELED', '[2023-01-01 08:00:00, 2023-01-01 09:00:00)')

However, if we update the Row 1 to status "CANCELED" or "PENDING", when the user tries to create the previous overlapping appointment, it should allow it, since the status is not "SCHEDULED" anymore.

Is it possible to achieve this logic using the examples demonstrated in the video?

A:

It should be fairly easy by using a partial index, i.e. just append the following:
[sql gutter="false"]where (status = 'SCHEDULED')

...to the index definition

Q:

Yes, I had already tried using this WHERE clause, however, the only thing that it changes is that it checks for overlapping when the new row has status "SCHEDULED". But I need the check to run for any status, and throw the error if overlapping with any "SCHEDULED" appointments. With this WHERE you told me to use, I can have a appointment with status "PENDING" overlapping with an already existing appointment with status "SCHEDULED", but that can never happen. It only works for creating a new appointment with status "SCHEDULED", then it throws the error.

A:

The easy options to deal with the status column in an exclusion constraint are:
* ignore it
* exclude (status with = )
* only look at SCHEDULED - where (status = 'SCHEDULED')

What you are asking for is "exclude if one of the status values is 'SCHEDULED'" which is unfortunately more complicated. The idea would be to create a new operator to exclude on in place of = which check if either argument is 'SCHEDULED'. That part is easy:

And then define the table as follows:

However, that is not enough:

At this point, I'd just give up since defining operator families is a quite deep rabbit hole.

On a side note, exclusion constraints are symmetric. This means that while if the above worked, it would prevent any new inserts of a PENDING event if a matching SCHEDULED event is already there, it would also prevent a SCHEDULED event from being inserted if a PENDING event is also there. Not sure this would be desired in the workflow you describe.

I'd be curious if anyone finds a clever solution that works, but my current guess would be it's not easy.

Q:

Thank you for the answer. Well, I guess I'll stick to code validation. Probably I'll change the columns I already have to a single range column (currently I don't use it), and before inserting any new rows, I use the available range operators to check for overlapping and status, since they're very good and simpler than my current SQL queries for finding that.

A:

Perhaps as a practical answer: I would probably go with the where (status = 'SCHEDULED') partial exclusion constraint that solves the problem in the strict sense that no two events can be scheduled in parallel, and deal with the "soft" part of the problem on the application side by simply checking for any conflicting events before inserting a PENDING one.


A short disambiguation: exclusion contraint vs. constraint exclusion!

"Exclusion constraints" and "constraint exclusion" are two entirely separate things (yay naming!):
Exclusion constraints are table-level constraints that dictate which data must not appear twice in a column (or set of columns). The most simple example would be a "unique constraint" that says value X must appear only once in a column. Exclusion constraints (what the video is about) expand that concept from simple equality (X = Y) to more general "overlapping", e.g. the standard example of non-overlapping room reservations.
 
Constraint exclusion is a run-time technique for partitioned tables that allows the planner to skip scanning some partitions. For example, if you partition a table by months, and the query has "where date > 'March', the planner can infer that it doesn't have to look at the Jan and Feb partitions and needs only Mar and later.

More Resources - PostgreSQL Exclusion Constraints

 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

0 0 votes
Article Rating
Subscribe
Notify of
guest
2 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Cédric Villemain
7 months ago

Maybe this is easier to use expression, like here with value A or B on t matching a locked range:


CREATE TABLE qq (
id int generated always as identity
, t text
, ts tstzrange
, exclude using gist (
(case when t ='A' or t = 'B' THEN true end) with =,
ts with && )
);

insert into qq (t, ts) values ('A', tstzrange(now(), now() interval '1 day'));
insert into qq (t, ts) values ('B', tstzrange(now(), now() interval '1 day')); -- must fail
insert into qq (t, ts) values ('C', tstzrange(now(), now() interval '1 day'));
insert into qq (t, ts) values (NULL, tstzrange(now(), now() interval '1 day'));

ουαν_εντ_ονλι
ουαν_εντ_ονλι
5 months ago

Hi, no because it treats A and B as being interchangeable. The OP says explicitly that SCHEDULED is a special status, everything can overlap as long as there is no SCHEDULED involved, everything conflicts on overlapping with a SCHEDULED status. So the correct solution is with the ==*== operator, but this needs some Gist hacking to come across.
Another dump solution would be via a trigger. Each insert/update checks all the rows, which is means special index scanning should be planned, but still slow.

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    2
    0
    Would love your thoughts, please comment.x
    ()
    x
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram