© Laurenz Albe 2023
This article is about the notoriously difficult problem of “conditional foreign keys”. In object-oriented programming languages, polymorphism is a natural concept. However, it does not translate well to SQL. As a consequence, many people whose data models are driven by an object-oriented application design end up trying to implement conditional foreign keys. In this article, I will explore different ways to solve the problem and their respective advantages and disadvantages. Most of this discussion is not specific to PostgreSQL, but I will also explore how PostgreSQL features like JSON support and table inheritance can help with conditional foreign keys.
Polymorphism in object-oriented languages
As an example, let’s consider a data model in Java for a traffic insurance company. We want to model different kinds of vehicles:
public abstract class Vehicle { private java.util.Date manufacturer; private java.math.BigDecimal price; } public abstract class MotorVehicle extends Vehicle { private String powersource; private String license_plate; private int watt; } public class Car extends MotorVehicle { private int doors; private int seats; } public class Bicycle extends Vehicle { private int gears; }
Vehicle
and MotorVehicle
are abstract classes, that is, you cannot create objects of that type. They define attributes that are common to all vehicles or motor vehicles. The classes you can instantiate (create actual objects of that type) are Car
and Bicycle
.
Now we can model customers and contracts:
public class Customer { private String name; private java.util.Date birthday; } public class Contract { private Customer customer; private Vehicle vehicle; private java.util.Date start; private java.util.Date end; private java.math.BigDecimal payment; }
A Contract
references a Customer
and a Vehicle
. Note that while there can be no objects of type Vehicle
, the attribute vehicle
can reference any object of a subclass of Vehicle
, like a Car
or a Bicycle
. This useful feature is called polymorphism.
The problem of conditional foreign keys
To persist the data from the insurance application, we have to map the application objects to database tables. Customer
, Car
and Bicycle
are easy to translate:
CREATE TABLE customer ( customer_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, name text NOT NULL, birthday date NOT NULL ); CREATE TABLE car ( car_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, watt integer NOT NULL, doors integer NOT NULL, seats integer NOT NULL, manufacturer text NOT NULL, price numeric(15,2) NOT NULL, powersource text NOT NULL, license_plate text NOT NULL ); CREATE TABLE bicycle ( bicycle_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, gears integer NOT NULL, manufacturer text NOT NULL, price numeric(15,2) NOT NULL );
The problem occurs when we try to map Contract
to a database table:
CREATE TABLE contract ( contract_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, customer_id bigint REFERENCES customer NOT NULL, vehicle_id bigint REFERENCES ???, start date NOT NULL, end date NOT NULL, payment numeric(15,2) NOT NULL );
The foreign key vehicle_id
should sometimes reference a car
, sometimes a bicycle
, and sometimes some other type of vehicle. This is what I call a “conditional foreign key”. You cannot directly write a conditional foreign key in SQL. So, what options do you have to model one?
Method 1: Modeling conditional foreign keys with multiple foreign keys
With this technique, you add several foreign keys to contract
: one for each type of vehicle. An additional column v_type
identified the type of vehicle.
CREATE TABLE vehicle_type ( id integer PRIMARY KEY, name text NOT NULL ); INSERT INTO vehicle_type VALUES (1, 'car'), (2, 'bicycle'); CREATE TABLE contract ( contract_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, customer_id bigint REFERENCES customer NOT NULL, car_id bigint REFERENCES car, bicycle_id bigint REFERENCES bicycle, start_date date NOT NULL, end_date date NOT NULL, v_type integer REFERENCES vehicle_type NOT NULL, payment numeric(15,2) NOT NULL );
I added v_type
further down to avoid wasting storage space. An additional check constraint makes sure that the data are consistent:
ALTER TABLE contract ADD CONSTRAINT conditional_key_check CHECK (CAST ((v_type = 1 AND car_id IS NOT NULL AND bicycle_id IS NULL) AS integer) + CAST ((v_type = 2 AND car_id IS NULL AND bicycle_id IS NOT NULL) AS integer) = 1);
This code makes use of the fact that the boolean FALSE
converts to the integer 0 and TRUE
converts to 1.
The advantages of this solution are that consistency is guaranteed, and that we don’t have to modify car
and bicycle
. The disadvantage is that we need many foreign key columns in contract
. We have to add another foreign key for each new kind of vehicle, and we have to adapt the check constraint. This solution is particularly appealing in cases with not too many types of vehicles and if the various types have few common attributes.
Method 2: Modeling conditional foreign keys by storing all referenced data in a single table
With this technique, we store all vehicle data in a single table:
CREATE TABLE vehicle ( vehicle_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, watt integer, doors integer, seats integer, gears integer, v_type integer REFERENCES vehicle_type NOT NULL, price numeric(15,2) NOT NULL, manufacturer text NOT NULL, powersource text, license_plate text, CHECK (CAST ((v_type = 1 AND ROW(watt, doors, seats, powersource, license_plate) IS NOT NULL AND gears IS NULL) AS integer) + CAST ((v_type = 2 AND ROW(watt, doors, seats, powersource, license_plate) IS NULL AND gears IS NOT NULL) AS integer) = 1) );
Again, a check constraint makes sure that the data are consistent, and each type of vehicle has only the appropriate attributes set. If we want, we can create views for car
and bicycle
:
DROP TABLE IF EXISTS car; CREATE VIEW car AS SELECT vehicle_id AS car_id, watt, doors, seats, manufacturer, price, powersource, license_plate FROM vehicle WHERE v_type = 1; DROP TABLE IF EXISTS bicycle; CREATE VIEW bicycle AS SELECT vehicle_id AS bicycle_id, manufacturer, price, gears FROM vehicle WHERE v_type = 2;
Now, the definition of contract
is simple:
CREATE TABLE contract ( contract_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, customer_id bigint REFERENCES customer NOT NULL, vehicle_id bigint REFERENCES vehicle NOT NULL, start_date date NOT NULL, end_date date NOT NULL, payment numeric(15,2) NOT NULL );
The advantage of this solution is that the definition of contract
is straightforward. The disadvantage is that vehicle
has to grow new columns for each additional vehicle type. This solution is particularly useful if the tables referenced have most columns in common, so that vehicle
does not end up having too many columns. It is also an appealing solution if there are many types of vehicles.
Method 3: Modeling conditional foreign keys with a single table and jsonb
This is a variation of the previous method and relies on PostgreSQL’s JSON capabilities. Instead of creating columns for all vehicle types in the vehicle
table, we add the additional attributes in a single jsonb
column:
CREATE TABLE vehicle ( vehicle_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, v_type integer REFERENCES vehicle_type NOT NULL, price numeric(15,2) NOT NULL, manufacturer text NOT NULL, data jsonb NOT NULL, CHECK (CAST ((v_type = 1 AND ROW(data -> 'watt', data -> 'doors', data -> 'seats', data -> 'powersource', data -> 'license_plate') IS NOT NULL AND data -> 'gears' IS NULL) AS integer) + CAST ((v_type = 2 AND ROW(data -> 'watt', data -> 'doors', data -> 'seats', data -> 'powersource', data -> 'license_plate') IS NULL AND data -> 'gears' IS NOT NULL) AS integer) = 1) );
Again, we can create views for the individual types of vehicles:
CREATE VIEW car AS SELECT vehicle_id AS car_id, (data ->> 'watt')::integer AS watt, (data ->> 'doors')::integer AS doors, (data ->> 'seats')::integer AS seats, manufacturer, price, data ->> 'powersource' AS powersource, data ->> 'license_plate' AS license_plate FROM vehicle WHERE v_type = 1; CREATE VIEW bicycle AS SELECT vehicle_id AS bicycle_id, manufacturer, price, (data ->> 'gears')::integer AS gears FROM vehicle WHERE v_type = 2;
The idea is to model all frequent attributes as normal columns and all rare ones as JSON attributes. The advantage is that the number of columns stays moderate, which is great if there are many types of vehicles with many attributes. If you create a GIN index on the jsonb
column, you can perform efficient searches. On the down side, fetching and updating the JSON attributes will be more expensive.
Method 4: Modeling conditional foreign keys by pointing the foreign key in the other direction – “faking it”
With this method, we don’t add a foreign key vehicle_id
to contract
. Instead, we add foreign keys from car
and bicycle
to contract
:
ALTER TABLE car ADD contract_id bigint REFERENCES contract NOT NULL; ALTER TABLE bicycle ADD contract_id bigint REFERENCES contract NOT NULL;
You will say that that is not really correct, and you are right. The disadvantage of this solution is that we don’t have a guarantee that each contract
is related to one and only one vehicle. But it is a simple solution that you can resort to if the previous solutions are not feasible. If you index these foreign keys like you should, you can still find the vehicle that belongs to a contract fairly efficiently.
PostgreSQL table inheritance and polymorphism
Back in the day when object-oriented programming was the latest fad, PostgreSQL added table inheritance to turn from a RDBMS (relational database management system) to an ORDBMS (object-relational database management system). With inheritance, you can model subclasses as tables:
CREATE TABLE vehicle ( vehicle_id bigint GENERATED ALWAYS AS IDENTITY, manufacturer date NOT NULL, price numeric(15,2) NOT NULL ); CREATE TABLE motorvehicle ( powersource text NOT NULL, license_plate text NOT NULL, watt integer ) INHERITS (vehicle); CREATE TABLE car ( doors integer NOT NULL, seats integer NOT NULL ) INHERITS (motorvehicle); CREATE TABLE bicycle ( gears integer NOT NULL ) INHERITS (vehicle);
At first glance, this looks like it would be a perfect fit for our problem! Indeed, it can save you from writing the same column definition over and over, since tables automatically inherit all columns from their inheritance parents. But all these tables are separate tables, and a primary key constraint on vehicle
will exist only on that table itself. It won’t constrain the values of vehicle_id
in the other tables. Each table needs its own primary key, and you will have to implement a solution like the first one I suggested.
Conclusion
I showed you some ways to model conditional foreign keys. Each of them has its advantages and disadvantages, and none of them is perfect. PostgreSQL table inheritance does not solve the problem either. Polymorphism is nice in object-oriented languages, but it doesn’t carry over well into SQL.
If you are interested in data models that are not a natural fit for relational databases, you might want to read my article on how to use JSON in PostgreSQL correctly. For an easy introduction to normalization, read Michał’s article on the topic.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.