New features for sequences: gains and pitfalls

02.2018 | Category: | Tags: development , sql help , transactions

02.2018

Category:

Tags: development , sql help , transactions

About sequences

Sequences are used to generate artificial numeric primary key columns for tables.
A sequence provides a “new ID” that is guaranteed to be unique, even if many database sessions are using the sequence at the same time.

Table of Contents

Sequences are not transaction safe, because they are not supposed to block the caller. That is not a shortcoming, but intentional.

As a consequence, a transaction that requests a new value from the sequence and then rolls back will leave a “gap” in the values committed to the database. In the rare case that you really need a “gap-less” series of values, a sequence is not the right solution for you.

PostgreSQL's traditional way of using sequences (nextval('my_seq')) differs from the SQL standard, which uses NEXT VALUE FOR .

New developments in PostgreSQL v10

Identity columns

PostgreSQL v10 has introduced the standard SQL way of defining a table with an automatically generated unique value:

GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( sequence_options ) ]

1	GENERATED { ALWAYS \| BY DEFAULT } AS IDENTITY [ ( sequence_options ) ]

Here is an example:

CREATE TABLE my_tab (
   id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
   ...
);

1

2

3

4

CREATE TABLE my_tab (

id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,

...

);

Behind the scenes, this uses a sequence, and it is roughly equivalent to the traditional

CREATE TABLE my_tab (
   id bigserial PRIMARY KEY,
   ...
);

1

2

3

4

CREATE TABLE my_tab (

id bigserial PRIMARY KEY,

...

);

which is a shorthand for

CREATE SEQUENCE my_tab_id_seq;

CREATE TABLE my_tab (
   id bigint PRIMARY KEY DEFAULT nextval('my_tab_id_seq'::regclass),
   ...
);

ALTER SEQUENCE my_tab_id_seq OWNED BY my_tab.id;

1

2

3

4

5

6

7

8

CREATE SEQUENCE my_tab_id_seq;

CREATE TABLE my_tab (

id bigint PRIMARY KEY DEFAULT nextval('my_tab_id_seq'::regclass),

...

);

ALTER SEQUENCE my_tab_id_seq OWNED BY my_tab.id;

The problem with such a primary key column is that the generated value is a default value, so if the user explicitly inserts a different value into this column, it will override the generated one.

This is usually not what you want, because it will lead to a constraint violation error as soon as the sequence counter reaches the same value. Rather, you want the explicit insertion to fail, since it is probably a mistake.

For this you use GENERATED ALWAYS:

CREATE TABLE my_tab (
   id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   ...
);

1

2

3

4

CREATE TABLE my_tab (

id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,

...

);

You can still override the generated value, but you'll have to use the OVERRIDING SYSTEM VALUE clause for that, which makes it much harder for such an INSERT to happen by mistake:

INSERT INTO my_tab (id) OVERRIDING SYSTEM VALUE VALUES (42);

1	INSERT INTO my_tab (id) OVERRIDING SYSTEM VALUE VALUES (42);

New system catalog `pg_sequence`

Before PostgreSQL v10, Postgres stored a sequence's metadata (starting value, increment and others) in the sequence itself.

This information is now stored in a new catalog table pg_sequence.

The only data that remain in the sequence are the data changed by the sequence manipulation functions nextval, currval, lastval and setval.

Transactional DDL for sequences

A sequence in PostgreSQL is a “special table” with a single row.

In “normal tables”, an UPDATE does not modify the existing row, but writes a new version of it and marks the old version as obsolete. Since sequence operations should be fast and are never rolled back, PostgreSQL can be more efficient by just modifying the single row of a sequence in place whenever its values change.

Since prior to PostgreSQL v10 all metadata of a sequence were kept in the sequence (as explained in the previous section), this had the downside that ALTER SEQUENCE, which also modified the single row of a sequence, could not be rolled back.

Since PostgreSQL v10 has given us pg_sequence, and catalog modifications are transaction safe in PostgreSQL, this limitation could be removed with the latest release.

Performance regression with `ALTER SEQUENCE`

When I said above that ALTER SEQUENCE has become transaction safe just by introducing a new catalog table, I cheated a little. There is one variant of ALTER SEQUENCE that modifies the values stored in a sequence:

ALTER SEQUENCE my_tab_id_seq RESTART;

1	ALTER SEQUENCE my_tab_id_seq RESTART;

If only some variants of ALTER SEQUENCE were transaction safe and others weren't, this would lead to surprising and buggy behavior.

That problem was fixed with this commit:

commit 3d79013b970d4cc336c06eb77ed526b44308c03e
Author: Andres Freund <andres@anarazel.de>
Date:   Wed May 31 16:39:27 2017 -0700

    Make ALTER SEQUENCE, including RESTART, fully transactional.
    
    Previously the changes to the 'data' part of the sequence, i.e. the
    one containing the current value, were not transactional, whereas the
    definition, including minimum and maximum value were.  That leads to
    odd behaviour if a schema change is rolled back, with the potential
    that out-of-bound sequence values can be returned.
    
    To avoid the issue create a new relfilenode fork whenever ALTER
    SEQUENCE is executed, similar to how TRUNCATE ... RESTART IDENTITY
    already is already handled.
    
    This commit also makes ALTER SEQUENCE RESTART transactional, as it
    seems to be too confusing to have some forms of ALTER SEQUENCE behave
    transactionally, some forms not.  This way setval() and nextval() are
    not transactional, but DDL is, which seems to make sense.
    
    This commit also rolls back parts of the changes made in 3d092fe540
    and f8dc1985f as they're now not needed anymore.
    
    Author: Andres Freund
    Discussion: https://postgr.es/m/20170522154227.nvafbsm62sjpbxvd@alap3.anarazel.de
    Backpatch: Bug is in master/v10 only

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

commit 3d79013b970d4cc336c06eb77ed526b44308c03e

Author: Andres Freund <andres@anarazel.de>

Date: Wed May 31 16:39:27 2017 -0700

Make ALTER SEQUENCE, including RESTART, fully transactional.

Previously the changes to the 'data' part of the sequence, i.e. the

one containing the current value, were not transactional, whereas the

definition, including minimum and maximum value were. That leads to

odd behaviour if a schema change is rolled back, with the potential

that out-of-bound sequence values can be returned.

To avoid the issue create a new relfilenode fork whenever ALTER

SEQUENCE is executed, similar to how TRUNCATE ... RESTART IDENTITY

already is already handled.

This commit also makes ALTER SEQUENCE RESTART transactional, as it

seems to be too confusing to have some forms of ALTER SEQUENCE behave

transactionally, some forms not. This way setval() and nextval() are

not transactional, but DDL is, which seems to make sense.

This commit also rolls back parts of the changes made in 3d092fe540

and f8dc1985f as they're now not needed anymore.

Author: Andres Freund

Discussion: https://postgr.es/m/20170522154227.nvafbsm62sjpbxvd@alap3.anarazel.de

Backpatch: Bug is in master/v10 only

This means that every ALTER SEQUENCE statement will now create a new data file for the sequence; the old one gets deleted during COMMIT. This is similar to the way TRUNCATE, CLUSTER, VACUUM (FULL) and some ALTER TABLE statements are implemented.

Of course this makes ALTER SEQUENCE much slower in PostgreSQL v10 than in previous releases, but you can expect this statement to be rare enough that it should not cause a performance problem.

However, there is this old blog post by depesz that recommends the following function to efficiently get a gap-less block of sequence values:

CREATE OR REPLACE FUNCTION multi_nextval(
   use_seqname text,
   use_increment integer
) RETURNS bigint AS $$
DECLARE
   reply bigint;
BEGIN
   PERFORM pg_advisory_lock(123);
   EXECUTE 'ALTER SEQUENCE ' || quote_ident(use_seqname)
           || ' INCREMENT BY ' || use_increment::text;
   reply := nextval(use_seqname);
   EXECUTE 'ALTER SEQUENCE ' || quote_ident(use_seqname)
           || ' INCREMENT BY 1';
   PERFORM pg_advisory_unlock(123);
   RETURN reply;
END;
$$ LANGUAGE 'plpgsql';

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

CREATE OR REPLACE FUNCTION multi_nextval(

use_seqname text,

use_increment integer

) RETURNS bigint AS $$

DECLARE

reply bigint;

BEGIN

PERFORM pg_advisory_lock(123);

EXECUTE 'ALTER SEQUENCE ' || quote_ident(use_seqname)

|| ' INCREMENT BY ' || use_increment::text;

reply := nextval(use_seqname);

EXECUTE 'ALTER SEQUENCE ' || quote_ident(use_seqname)

|| ' INCREMENT BY 1';

PERFORM pg_advisory_unlock(123);

RETURN reply;

END;

$$ LANGUAGE 'plpgsql';

This function returns the last value of the gap-less sequence value block (and does not work correctly when called on a newly created sequence).

Since this function calls ALTER SEQUENCE not only once but twice, you can imagine that every application that uses it a lot will experience quite a performance hit when upgrading to PostgreSQL v10.

Fortunately you can achieve the same thing with the normal sequence manipulation functions, so you can have a version of the function that will continue performing well in PostgreSQL v10:

CREATE OR REPLACE FUNCTION multi_nextval(
   use_seqname regclass,
   use_increment integer
) RETURNS bigint AS $$
DECLARE
   reply bigint;
   lock_id bigint := use_seqname::bigint;
BEGIN
   PERFORM pg_advisory_lock(lock_id);
   reply := nextval(use_seqname);
   PERFORM setval(use_seqname, reply + use_increment - 1, TRUE);
   PERFORM pg_advisory_unlock(lock_id);
   RETURN reply + increment - 1;
END;
$$ LANGUAGE plpgsql;

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

CREATE OR REPLACE FUNCTION multi_nextval(

use_seqname regclass,

use_increment integer

) RETURNS bigint AS $$

DECLARE

reply bigint;

lock_id bigint := use_seqname::bigint;

BEGIN

PERFORM pg_advisory_lock(lock_id);

reply := nextval(use_seqname);

PERFORM setval(use_seqname, reply + use_increment - 1, TRUE);

PERFORM pg_advisory_unlock(lock_id);

RETURN reply + increment - 1;

END;

$$ LANGUAGE plpgsql;

If you want to get the first value of the sequence value block, use RETURN reply;

Note that both the original function and the improved one, use advisory locks. That means they will only work reliably if the sequence is only used with that function.

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

5 responses to “New features for sequences: gains and pitfalls”

Yepeng Yin says:

April 30, 2019 at 4:12 pm

The improved multi_nextval version even requires normal nextval to use the same
advisory lock. Otherwise, there will be call to nextval between following statements:

reply := nextval(use_seqname);
----> other connections will call nextval

PERFORM setval(use_seqname, reply use_increment - 1, TRUE);

Reply
Rob says:

March 29, 2022 at 7:02 pm

very helpful function! but, when creating the bigint lockid, why the addition of minimum int4 value and casting to int?
i.e. why can't you just do:
lock_id bigint := use_seqname::bigint
rather than:
lock_id bigint := (use_seqname::bigint - 2147483648)::integer;

Reply
- laurenz says:
  
  March 29, 2022 at 8:41 pm
  
  Yes, that seems silly; fixed. I cannot really remember why I did it that way.
  
  Reply
  - Rob says:
    
    March 30, 2022 at 9:36 am
    
    Thanks, I wondered whether it might relate to the fact that there are two forms to the pg_advisory_lock function. From the manual: pg_advisory_lock locks an application-defined resource, which can be identified either by a single 64-bit key value or two 32-bit key values (note that these two key spaces do not overlap). Or, as I guess one risk with advisory locks is that every use case needs to be aware of every other use case, to ensure there is no unintended collision of keys between them.
    
    Reply
    - laurenz says:
      
      March 30, 2022 at 10:40 am
      
      Sure, but the original, more complicated code did that no better than the simple one does.
      There can be no collision, because no two sequences can have the same object ID.
      
      Reply

Leave a Reply Cancel reply

Stay tuned with our

NEWSLETTER

CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0

office@cybertec-postgresql.com

ISO_27001_Badge

Customer Support

Support Platform

Services

Support CYBERTEC Partner PostgreSQL Books

Company

STAY TUNED WITH OUR NEWSLETTER

Get the newest PostgreSQL Info & Tools

Data Protection Policy Terms and conditions Terms of Service Imprint

©

2025

CYBERTEC PostgreSQL International GmbH