About sequences

Sequences are used to generate values for artificial numeric primary key columns in tables.
A sequence provides a “new ID” that is guaranteed to be unique, even if many database sessions are using the sequence at the same time.

Sequences are not transaction safe: a call to nextval is never rolled back, because sequences are not supposed to block concurrent callers. That is not a shortcoming, but intentional.

As a consequence, a transaction that requests a new value from the sequence and then rolls back will leave a “gap” in the values committed to the database. In the rare case that you really need a “gap-less” series of values, a sequence is not the right solution for you.
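For example, with a hypothetical sequence my_seq, a rolled-back transaction leaves a gap:

```sql
CREATE SEQUENCE my_seq;

BEGIN;
SELECT nextval('my_seq');   -- returns 1
ROLLBACK;

SELECT nextval('my_seq');   -- returns 2; the value 1 is lost for good
```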

PostgreSQL’s traditional way of using sequences (nextval('my_seq')) differs from the SQL standard, which uses NEXT VALUE FOR <sequence generator name>.
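Side by side (my_seq is an example name; PostgreSQL does not accept the standard form):

```sql
SELECT nextval('my_seq');          -- PostgreSQL's traditional syntax
-- SELECT NEXT VALUE FOR my_seq;   -- SQL standard syntax, not supported
```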

New developments in PostgreSQL v10

Identity columns

PostgreSQL v10 has introduced the standard SQL way of defining a table with an automatically generated unique value:

GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( sequence_options ) ]

Here is an example:

CREATE TABLE my_tab (
   id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
   ...
);

Behind the scenes, this uses a sequence, and it is roughly equivalent to the traditional

CREATE TABLE my_tab (
   id bigserial PRIMARY KEY,
   ...
);

which is a shorthand for

CREATE SEQUENCE my_tab_id_seq;

CREATE TABLE my_tab (
   id bigint PRIMARY KEY DEFAULT nextval('my_tab_id_seq'::regclass),
   ...
);

ALTER SEQUENCE my_tab_id_seq OWNED BY my_tab.id;

The problem with such a primary key column is that the generated value is a default value, so if the user explicitly inserts a different value into this column, it will override the generated one.

This is usually not what you want, because it will lead to a constraint violation error as soon as the sequence counter reaches the same value. Rather, you want the explicit insertion to fail, since it is probably a mistake.
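A minimal sketch of that failure mode, using a made-up table:

```sql
CREATE TABLE demo_tab (
   id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
   val text
);

INSERT INTO demo_tab (id, val) VALUES (1, 'explicit');   -- overrides the generated value
INSERT INTO demo_tab (val) VALUES ('generated');
-- ERROR:  duplicate key value violates unique constraint "demo_tab_pkey"
-- (the sequence also produces 1 on its first call)
```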

For this you use GENERATED ALWAYS:

CREATE TABLE my_tab (
   id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   ...
);

You can still override the generated value, but you’ll have to use the OVERRIDING SYSTEM VALUE clause for that, which makes it much harder for such an INSERT to happen by mistake:

INSERT INTO my_tab (id) OVERRIDING SYSTEM VALUE VALUES (42);
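Without the clause, such an INSERT fails with an error (message abbreviated):

```sql
INSERT INTO my_tab (id) VALUES (42);
-- ERROR:  cannot insert into column "id"
-- DETAIL:  Column "id" is an identity column defined as GENERATED ALWAYS.
```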

New system catalog pg_sequence

Before PostgreSQL v10, a sequence’s metadata (starting value, increment, and others) was stored in the sequence itself.

This information is now stored in a new catalog table pg_sequence.

The only data that remain in the sequence are the data changed by the sequence manipulation functions nextval, currval, lastval and setval.
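For example, to inspect the metadata of the sequence behind my_tab.id from the example above:

```sql
SELECT seqstart, seqincrement, seqmin, seqmax, seqcache, seqcycle
FROM pg_sequence
WHERE seqrelid = 'my_tab_id_seq'::regclass;
```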

Transactional DDL for sequences

A sequence in PostgreSQL is a “special table” with a single row.

In “normal tables”, an UPDATE does not modify the existing row, but writes a new version of it and marks the old version as obsolete. Since sequence operations should be fast and are never rolled back, PostgreSQL can be more efficient by just modifying the single row of a sequence in place whenever its values change.

Since all metadata of a sequence were kept in the sequence itself prior to PostgreSQL v10 (as explained in the previous section), this had the downside that ALTER SEQUENCE, which also modified the single row of a sequence, could not be rolled back.

Since PostgreSQL v10 has moved this metadata to pg_sequence, and catalog modifications are transaction safe in PostgreSQL, this limitation could be removed in that release.
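A quick demonstration, assuming the sequence my_tab_id_seq from the example above:

```sql
BEGIN;
ALTER SEQUENCE my_tab_id_seq MAXVALUE 100000;
ROLLBACK;   -- since v10, the change to MAXVALUE is undone as well
```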

Performance regression with ALTER SEQUENCE

When I said above that ALTER SEQUENCE has become transaction safe just by introducing a new catalog table, I cheated a little. There is one variant of ALTER SEQUENCE that modifies the values stored in a sequence:

ALTER SEQUENCE my_tab_id_seq RESTART;

If only some variants of ALTER SEQUENCE were transaction safe and others weren’t, this would lead to surprising and buggy behavior.

That problem was fixed with this commit:

commit 3d79013b970d4cc336c06eb77ed526b44308c03e
Author: Andres Freund <andres@anarazel.de>
Date:   Wed May 31 16:39:27 2017 -0700

    Make ALTER SEQUENCE, including RESTART, fully transactional.
    
    Previously the changes to the "data" part of the sequence, i.e. the
    one containing the current value, were not transactional, whereas the
    definition, including minimum and maximum value were.  That leads to
    odd behaviour if a schema change is rolled back, with the potential
    that out-of-bound sequence values can be returned.
    
    To avoid the issue create a new relfilenode fork whenever ALTER
    SEQUENCE is executed, similar to how TRUNCATE ... RESTART IDENTITY
    already is already handled.
    
    This commit also makes ALTER SEQUENCE RESTART transactional, as it
    seems to be too confusing to have some forms of ALTER SEQUENCE behave
    transactionally, some forms not.  This way setval() and nextval() are
    not transactional, but DDL is, which seems to make sense.
    
    This commit also rolls back parts of the changes made in 3d092fe540
    and f8dc1985f as they're now not needed anymore.
    
    Author: Andres Freund
    Discussion: https://postgr.es/m/20170522154227.nvafbsm62sjpbxvd@alap3.anarazel.de
    Backpatch: Bug is in master/v10 only

This means that every ALTER SEQUENCE statement will now create a new data file for the sequence; the old one gets deleted during COMMIT. This is similar to the way TRUNCATE, CLUSTER, VACUUM (FULL) and some ALTER TABLE statements are implemented.

Of course this makes ALTER SEQUENCE much slower in PostgreSQL v10 than in previous releases, but you can expect this statement to be rare enough that it should not cause a performance problem.
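You can watch the new data file being created by checking the sequence’s relfilenode in pg_class (sequence name taken from the example above):

```sql
SELECT relfilenode FROM pg_class WHERE relname = 'my_tab_id_seq';

ALTER SEQUENCE my_tab_id_seq RESTART;

SELECT relfilenode FROM pg_class WHERE relname = 'my_tab_id_seq';
-- the value differs from the first query: a new data file was created
```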

However, there is this old blog post by depesz that recommends the following function to efficiently get a gap-less block of sequence values:

CREATE OR REPLACE FUNCTION multi_nextval(
   use_seqname text,
   use_increment integer
) RETURNS bigint AS $$
DECLARE
   reply bigint;
BEGIN
   PERFORM pg_advisory_lock(123);
   EXECUTE 'ALTER SEQUENCE ' || quote_ident(use_seqname)
           || ' INCREMENT BY ' || use_increment::text;
   reply := nextval(use_seqname);
   EXECUTE 'ALTER SEQUENCE ' || quote_ident(use_seqname)
           || ' INCREMENT BY 1';
   PERFORM pg_advisory_unlock(123);
   RETURN reply;
END;
$$ LANGUAGE plpgsql;

This function returns the last value of the gap-less sequence value block (and does not work correctly when called on a newly created sequence).

Since this function calls ALTER SEQUENCE not only once but twice, you can imagine that every application that uses it a lot will experience quite a performance hit when upgrading to PostgreSQL v10.

Fortunately you can achieve the same thing with the normal sequence manipulation functions, so you can have a version of the function that will continue performing well in PostgreSQL v10:

CREATE OR REPLACE FUNCTION multi_nextval(
   use_seqname regclass,
   use_increment integer
) RETURNS bigint AS $$
DECLARE
   reply bigint;
   -- derive a per-sequence advisory lock key in integer range from the sequence's OID
   lock_id bigint := (use_seqname::oid::bigint - 2147483648)::integer;
BEGIN
   PERFORM pg_advisory_lock(lock_id);
   reply := nextval(use_seqname);
   PERFORM setval(use_seqname, reply + use_increment - 1, TRUE);
   PERFORM pg_advisory_unlock(lock_id);
   RETURN reply + use_increment - 1;
END;
$$ LANGUAGE plpgsql;

If you want to get the first value of the sequence value block, use RETURN reply;
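Usage is the same for both versions of the function; for example, to reserve a block of 1000 values:

```sql
SELECT multi_nextval('my_tab_id_seq', 1000);
```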

Note that both the original function and the improved one use advisory locks, so they will only work reliably if the sequence is only used with that function.