Exclusion constraints are a feature that is not very well known, but can be used to implement highly sophisticated constraints. A few years ago, Hans wrote his blog post about EXCLUDE USING GIST… WITH. Recently we received a note from someone dealing with a very tricky problem concerning exclusion constraints: (many thanks to @necrotikS at YouTube for a very good problem to solve!)
I think this is a very difficult topic. I have the following columns: status, which can be either “SCHEDULED”, “PENDING” or “CANCELED”,
field_id and the
duration, which is a
tstzrange type. I need a constraint that do not allow creating overlapping appointments for the same field ONLY IF THE APPOINTMENT STATUS IS “SCHEDULED”. If there are appointment with status “PENDING” or “CANCELED”, there could be other overlapping appointments. For example:
Row 1 (Field 1, "SCHEDULED", '[2023-01-01 08:00:00, 2023-01-01 09:00:00)')
Tries to create overlapping appointment with any status, should not allow, since there’s a “SCHEDULED” appointment, for example:
INSERT (Field 1, "PENDING", '[2023-01-01 08:00:00, 2023-01-01 09:00:00)') OR (Field 1, "PENDING", '[2023-01-01 08:00:00, 2023-01-01 09:00:00)') OR (Field 1, "CANCELED", '[2023-01-01 08:00:00, 2023-01-01 09:00:00)')
However, if we update the Row 1 to status “CANCELED” or “PENDING”, when the user tries to create the previous overlapping appointment, it should allow it, since the status is not “SCHEDULED” anymore.
Is it possible to achieve this logic using the examples demonstrated in the video?
It should be fairly easy by using a partial index, i.e. just append the following:
where (status = 'SCHEDULED')
…to the index definition
Yes, I had already tried using this
WHERE clause, however, the only thing that it changes is that it checks for overlapping when the new row has status “SCHEDULED”. But I need the check to run for any status, and throw the error if overlapping with any “SCHEDULED” appointments. With this WHERE you told me to use, I can have a appointment with status “PENDING” overlapping with an already existing appointment with status “SCHEDULED”, but that can never happen. It only works for creating a new appointment with status “SCHEDULED”, then it throws the error.
The easy options to deal with the status column in an exclusion constraint are:
* ignore it
* exclude (status with = )
* only look at SCHEDULED – where (status = ‘SCHEDULED’)
What you are asking for is “exclude if one of the status values is ‘SCHEDULED'” which is unfortunately more complicated. The idea would be to create a new operator to exclude on in place of = which check if either argument is ‘SCHEDULED’. That part is easy:
create function one_is_scheduled (a text, b text) returns boolean language sql return 'scheduled' in (a, b) ; create operator ==*== ( leftarg = text, rightarg = text, function = one_is_scheduled, commutator = ==*==);
And then define the table as follows:
create extension btree_gist; create table events ( field_id int, status text, time tstzrange, exclude using gist ( field_id with =, status with ==*==, time with && ) );
However, that is not enough:
ERROR: 42809: operator ==*==(text,text) is not a member of operator family "gist_text_ops" DETAIL: The exclusion operator must be related to the index operator class for the constraint.
At this point, I’d just give up since defining operator families is a quite deep rabbit hole.
On a side note, exclusion constraints are symmetric. This means that while if the above worked, it would prevent any new inserts of a PENDING event if a matching SCHEDULED event is already there, it would also prevent a SCHEDULED event from being inserted if a PENDING event is also there. Not sure this would be desired in the workflow you describe.
I’d be curious if anyone finds a clever solution that works, but my current guess would be it’s not easy.
Thank you for the answer. Well, I guess I’ll stick to code validation. Probably I’ll change the columns I already have to a single range column (currently I don’t use it), and before inserting any new rows, I use the available range operators to check for overlapping and status, since they’re very good and simpler than my current SQL queries for finding that.
Perhaps as a practical answer: I would probably go with the
where (status = 'SCHEDULED') partial exclusion constraint that solves the problem in the strict sense that no two events can be scheduled in parallel, and deal with the “soft” part of the problem on the application side by simply checking for any conflicting events before inserting a
A short disambiguation: exclusion contraint vs. constraint exclusion!
“Exclusion constraints” and “constraint exclusion” are two entirely separate things (yay naming!):
Exclusion constraints are table-level constraints that dictate which data must not appear twice in a column (or set of columns). The most simple example would be a “unique constraint” that says value X must appear only once in a column. Exclusion constraints (what the video is about) expand that concept from simple equality (X = Y) to more general “overlapping”, e.g. the standard example of non-overlapping room reservations.
Constraint exclusion is a run-time technique for partitioned tables that allows the planner to skip scanning some partitions. For example, if you partition a table by months, and the query has “where date > ‘March’, the planner can infer that it doesn’t have to look at the Jan and Feb partitions and needs only Mar and later.
More Resources – PostgreSQL Exclusion Constraints
- Youtube video on exclusion constraints: Hans’ Explaining PostgreSQL YouTube video
- Hans’ blog post about EXCLUDE USING gist… WITH exclusion constraints