Most of my readers will know about primary keys and all kinds of table constraints. However, only a few of you may have ever thought about the difference between a primary key and a UNIQUE constraint. Isn’t it all just the same? In both cases, PostgreSQL will create an index that avoids duplicate entries. So what is the difference? Let’s dig in and find out…
What primary keys and UNIQUE constraints do
The following example shows both a primary key and a unique constraint:
test=# CREATE TABLE t_sample (a int PRIMARY KEY, b int UNIQUE); CREATE TABLE test=# \d t_sample Table "public.t_sample" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- a | integer | | not null | b | integer | | | Indexes: "t_sample_pkey" PRIMARY KEY, btree (a) "t_sample_b_key" UNIQUE CONSTRAINT, btree (b)
The really important observation is that both features make PostgreSQL create an index. This is important because people often use additional indexes on primary keys or unique columns. These additional indexes are not only unnecessary, but actually counterproductive.
The key to success: NULL handling
What makes a primary key different from a unique index is the way NULL entries are handled. Let’s take a look at a simple example:
test=# INSERT INTO t_sample VALUES (1, NULL); INSERT 0 1
The example above works perfectly. PostgreSQL will accept the NULL value for the second column. As long as the primary key contains a unique value, we are OK. However, if that changes, then an error will occur:
test=# INSERT INTO t_sample VALUES (NULL, 2); ERROR: null value in column "a" of relation "t_sample" violates not-null constraint DETAIL: Failing row contains (null, 2).
This is actually the single biggest difference between these two types of constraints. Keep that in mind.
Using foreign keys
The next logical question which arises is: What does that mean for foreign keys? Does it make a difference? Can we reference primary keys as well as unique constraints?
The simple answer is yes:
test=# CREATE TABLE t_fk_1 ( id serial PRIMARY KEY, aid int REFERENCES t_sample (a) ); CREATE TABLE test=# CREATE TABLE t_fk_2 ( id serial PRIMARY KEY, bid int REFERENCES t_sample (b) ); CREATE TABLE test=# \d t_fk_1 Table "public.t_fk_1" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+------------------------------------ id | integer | | not null | nextval('t_fk_1_id_seq'::regclass) aid | integer | | | Indexes: "t_fk_1_pkey" PRIMARY KEY, btree (id) Foreign-key constraints: "t_fk_1_aid_fkey" FOREIGN KEY (aid) REFERENCES t_sample(a) test=# \d t_fk_2 Table "public.t_fk_2" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+------------------------------------ id | integer | | not null | nextval('t_fk_2_id_seq'::regclass) bid | integer | | | Indexes: "t_fk_2_pkey" PRIMARY KEY, btree (id) Foreign-key constraints: "t_fk_2_bid_fkey" FOREIGN KEY (bid) REFERENCES t_sample(b)
It’s perfectly acceptable to reference a unique column containing NULL entries, in other words: We can nicely reference primary keys as well as unique constraints equally – there are absolutely no differences to worry about.
If you want to know more about NULL in general, check out my post about NULL values in PostgreSQL.
Primary keys and unique constraints are not only important from a logical perspective, they also matter from a database-performance point of view. Indexing in general can have a significant impact on performance. This is true for read as well as write transactions. If you want to ensure good performance, and if you want to read something about PostgreSQL performance right now, check out our blog.