Is a key left by the neighbor with a request to feed the cat while he is gone a conditional foreign key?
© Laurenz Albe 2023

This article is about the notoriously difficult problem of “conditional foreign keys”. In object-oriented programming languages, polymorphism is a natural concept. However, it does not translate well to SQL. As a consequence, many people whose data models are driven by an object-oriented application design end up trying to implement conditional foreign keys. In this article, I will explore different ways to solve the problem and their respective advantages and disadvantages. Most of this discussion is not specific to PostgreSQL, but I will also explore how PostgreSQL features like JSON support and table inheritance can help with conditional foreign keys.

Polymorphism in object-oriented languages

As an example, let’s consider a data model in Java for a traffic insurance company. We want to model different kinds of vehicles:

public abstract class Vehicle {
    private java.util.Date manufacturer;
    private java.math.BigDecimal price;
}

public abstract class MotorVehicle extends Vehicle {
    private String powersource;
    private String license_plate;
    private int watt;
}

public class Car extends MotorVehicle {
    private int doors;
    private int seats;
}

public class Bicycle extends Vehicle {
    private int gears;
}

Vehicle and MotorVehicle are abstract classes, that is, you cannot create objects of that type. They define attributes that are common to all vehicles or motor vehicles. The classes you can instantiate (create actual objects of that type) are Car and Bicycle.

Now we can model customers and contracts:

public class Customer {
    private String name;
    private java.util.Date birthday;
}

public class Contract {
    private Customer customer;
    private Vehicle vehicle;
    private java.util.Date start;
    private java.util.Date end;
    private java.math.BigDecimal payment;
}

A Contract references a Customer and a Vehicle. Note that while there can be no objects of type Vehicle, the attribute vehicle can reference any object of a subclass of Vehicle, like a Car or a Bicycle. This useful feature is called polymorphism.

The problem of conditional foreign keys

To persist the data from the insurance application, we have to map the application objects to database tables. Customer, Car and Bicycle are easy to translate:

CREATE TABLE customer (
   customer_id bigint GENERATED ALWAYS AS IDENTITY
      PRIMARY KEY,
   name        text   NOT NULL,
   birthday    date   NOT NULL
);

CREATE TABLE car (
   car_id        bigint        GENERATED ALWAYS AS IDENTITY
      PRIMARY KEY,
   watt          integer       NOT NULL,
   doors         integer       NOT NULL,
   seats         integer       NOT NULL,
   manufacturer  text          NOT NULL,
   price         numeric(15,2) NOT NULL,
   powersource   text          NOT NULL,
   license_plate text          NOT NULL
);

CREATE TABLE bicycle (
   bicycle_id    bigint        GENERATED ALWAYS AS IDENTITY
      PRIMARY KEY,
   gears         integer       NOT NULL,
   manufacturer  text          NOT NULL,
   price         numeric(15,2) NOT NULL
);

The problem occurs when we try to map Contract to a database table:

CREATE TABLE contract (
   contract_id  bigint        GENERATED ALWAYS AS IDENTITY
      PRIMARY KEY,
   customer_id  bigint        REFERENCES customer NOT NULL,
   vehicle_id   bigint        REFERENCES ???,
   start        date          NOT NULL,
   end          date          NOT NULL,
   payment      numeric(15,2) NOT NULL
);

The foreign key vehicle_id should sometimes reference a car, sometimes a bicycle, and sometimes some other type of vehicle. This is what I call a “conditional foreign key”. You cannot directly write a conditional foreign key in SQL. So, what options do you have to model one?

Method 1: Modeling conditional foreign keys with multiple foreign keys

With this technique, you add several foreign keys to contract: one for each type of vehicle. An additional column v_type identified the type of vehicle.

CREATE TABLE vehicle_type (
   id   integer PRIMARY KEY,
   name text    NOT NULL
);

INSERT INTO vehicle_type VALUES
   (1, 'car'),
   (2, 'bicycle');

CREATE TABLE contract (
   contract_id  bigint        GENERATED ALWAYS AS IDENTITY
      PRIMARY KEY,
   customer_id  bigint        REFERENCES customer NOT NULL,
   car_id       bigint        REFERENCES car,
   bicycle_id   bigint        REFERENCES bicycle,
   start_date   date          NOT NULL,
   end_date     date          NOT NULL,
   v_type       integer       REFERENCES vehicle_type NOT NULL,
   payment      numeric(15,2) NOT NULL
);

I added v_type further down to avoid wasting storage space. An additional check constraint makes sure that the data are consistent:

ALTER TABLE contract ADD CONSTRAINT conditional_key_check
   CHECK (CAST ((v_type = 1 AND
                 car_id     IS NOT NULL AND
                 bicycle_id IS     NULL) AS integer) +
          CAST ((v_type = 2 AND
                 car_id     IS     NULL AND
                 bicycle_id IS NOT NULL) AS integer)
          = 1);

This code makes use of the fact that the boolean FALSE converts to the integer 0 and TRUE converts to 1.

The advantages of this solution are that consistency is guaranteed, and that we don’t have to modify car and bicycle. The disadvantage is that we need many foreign key columns in contract. We have to add another foreign key for each new kind of vehicle, and we have to adapt the check constraint. This solution is particularly appealing in cases with not too many types of vehicles and if the various types have few common attributes.

Method 2: Modeling conditional foreign keys by storing all referenced data in a single table

With this technique, we store all vehicle data in a single table:

CREATE TABLE vehicle (
   vehicle_id  bigint        GENERATED ALWAYS AS IDENTITY
      PRIMARY KEY,
   watt          integer,
   doors         integer,
   seats         integer,
   gears         integer,
   v_type        integer REFERENCES vehicle_type NOT NULL,
   price         numeric(15,2) NOT NULL,
   manufacturer  text          NOT NULL,
   powersource   text,
   license_plate text,
   CHECK (CAST ((v_type = 1 AND
                 ROW(watt, doors, seats, powersource, license_plate) IS NOT NULL AND
                 gears IS NULL) AS integer) +
          CAST ((v_type = 2 AND
                 ROW(watt, doors, seats, powersource, license_plate) IS     NULL AND
                 gears IS NOT NULL) AS integer)
          = 1)
);

Again, a check constraint makes sure that the data are consistent, and each type of vehicle has only the appropriate attributes set. If we want, we can create views for car and bicycle:

DROP TABLE IF EXISTS car;

CREATE VIEW car AS
SELECT vehicle_id AS car_id,
       watt,
       doors,
       seats,
       manufacturer,
       price,
       powersource,
       license_plate
FROM vehicle
WHERE v_type = 1;

DROP TABLE IF EXISTS bicycle;

CREATE VIEW bicycle AS
SELECT vehicle_id AS bicycle_id,
       manufacturer, 
       price,
       gears
FROM vehicle
WHERE v_type = 2;

Now, the definition of contract is simple:

CREATE TABLE contract (
   contract_id  bigint        GENERATED ALWAYS AS IDENTITY
      PRIMARY KEY,
   customer_id  bigint        REFERENCES customer NOT NULL,
   vehicle_id   bigint        REFERENCES vehicle NOT NULL,
   start_date   date          NOT NULL,
   end_date     date          NOT NULL,
   payment      numeric(15,2) NOT NULL
);

The advantage of this solution is that the definition of contract is straightforward. The disadvantage is that vehicle has to grow new columns for each additional vehicle type. This solution is particularly useful if the tables referenced have most columns in common, so that vehicle does not end up having too many columns. It is also an appealing solution if there are many types of vehicles.

Method 3: Modeling conditional foreign keys with a single table and jsonb

This is a variation of the previous method and relies on PostgreSQL’s JSON capabilities. Instead of creating columns for all vehicle types in the vehicle table, we add the additional attributes in a single jsonb column:

CREATE TABLE vehicle (
   vehicle_id  bigint        GENERATED ALWAYS AS IDENTITY
      PRIMARY KEY,
   v_type        integer REFERENCES vehicle_type NOT NULL,
   price         numeric(15,2) NOT NULL,
   manufacturer  text          NOT NULL,
   data          jsonb         NOT NULL,
   CHECK (CAST ((v_type = 1 AND
                 ROW(data -> 'watt', data -> 'doors', data -> 'seats', data -> 'powersource', data -> 'license_plate') IS NOT NULL AND
                 data -> 'gears' IS NULL) AS integer) +
          CAST ((v_type = 2 AND
                 ROW(data -> 'watt', data -> 'doors', data -> 'seats', data -> 'powersource', data -> 'license_plate') IS     NULL AND
                 data -> 'gears' IS NOT NULL) AS integer)
          = 1)
);

Again, we can create views for the individual types of vehicles:

CREATE VIEW car AS
SELECT vehicle_id AS car_id,
       (data ->> 'watt')::integer AS watt,
       (data ->> 'doors')::integer AS doors,
       (data ->> 'seats')::integer AS seats,
       manufacturer,
       price,
       data ->> 'powersource' AS powersource,
       data ->> 'license_plate' AS license_plate
FROM vehicle
WHERE v_type = 1;

CREATE VIEW bicycle AS
SELECT vehicle_id AS bicycle_id,
       manufacturer, 
       price,
       (data ->> 'gears')::integer AS gears
FROM vehicle
WHERE v_type = 2;

The idea is to model all frequent attributes as normal columns and all rare ones as JSON attributes. The advantage is that the number of columns stays moderate, which is great if there are many types of vehicles with many attributes. If you create a GIN index on the jsonb column, you can perform efficient searches. On the down side, fetching and updating the JSON attributes will be more expensive.

Method 4: Modeling conditional foreign keys by pointing the foreign key in the other direction – “faking it”

With this method, we don’t add a foreign key vehicle_id to contract. Instead, we add foreign keys from car and bicycle to contract:

ALTER TABLE car
   ADD contract_id bigint REFERENCES contract NOT NULL;

ALTER TABLE bicycle
   ADD contract_id bigint REFERENCES contract NOT NULL;

You will say that that is not really correct, and you are right. The disadvantage of this solution is that we don’t have a guarantee that each contract is related to one and only one vehicle. But it is a simple solution that you can resort to if the previous solutions are not feasible. If you index these foreign keys like you should, you can still find the vehicle that belongs to a contract fairly efficiently.

PostgreSQL table inheritance and polymorphism

Back in the day when object-oriented programming was the latest fad, PostgreSQL added table inheritance to turn from a RDBMS (relational database management system) to an ORDBMS (object-relational database management system). With inheritance, you can model subclasses as tables:

CREATE TABLE vehicle (
   vehicle_id   bigint        GENERATED ALWAYS AS IDENTITY,
   manufacturer date          NOT NULL,
   price        numeric(15,2) NOT NULL
);

CREATE TABLE motorvehicle (
   powersource   text NOT NULL,
   license_plate text NOT NULL,
   watt          integer
) INHERITS (vehicle);

CREATE TABLE car (
   doors integer NOT NULL,
   seats integer NOT NULL
) INHERITS (motorvehicle);

CREATE TABLE bicycle (
   gears integer NOT NULL
) INHERITS (vehicle);

At first glance, this looks like it would be a perfect fit for our problem! Indeed, it can save you from writing the same column definition over and over, since tables automatically inherit all columns from their inheritance parents. But all these tables are separate tables, and a primary key constraint on vehicle will exist only on that table itself. It won’t constrain the values of vehicle_id in the other tables. Each table needs its own primary key, and you will have to implement a solution like the first one I suggested.

Conclusion

I showed you some ways to model conditional foreign keys. Each of them has its advantages and disadvantages, and none of them is perfect. PostgreSQL table inheritance does not solve the problem either. Polymorphism is nice in object-oriented languages, but it doesn’t carry over well into SQL.

If you are interested in data models that are not a natural fit for relational databases, you might want to read my article on how to use JSON in PostgreSQL correctly. For an easy introduction to normalization, read Michał’s article on the topic.

 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.