New features for sequences: gains and pitfalls

02.2018

About sequences

Sequences are used to generate values for artificial numeric primary key columns of tables.
A sequence provides a “new ID” that is guaranteed to be unique, even if many database sessions are using the sequence at the same time.

Sequences are not transaction safe, because they are not supposed to block the caller. That is not a shortcoming, but intentional.

As a consequence, a transaction that requests a new value from the sequence and then rolls back will leave a “gap” in the values committed to the database. In the rare case that you really need a “gap-less” series of values, a sequence is not the right solution for you.
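The gap is easy to reproduce. The following sketch assumes a freshly created sequence with default settings; the name gap_seq is made up for illustration:

```sql
-- gap_seq is a hypothetical sequence created with default settings
CREATE SEQUENCE gap_seq;

BEGIN;
SELECT nextval('gap_seq');   -- returns 1
ROLLBACK;                    -- the transaction is undone, the sequence is not

SELECT nextval('gap_seq');   -- returns 2, so the value 1 is never used
```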

PostgreSQL's traditional way of using sequences (nextval('my_seq')) differs from the SQL standard, which uses NEXT VALUE FOR my_seq.

New developments in PostgreSQL v10

Identity columns

PostgreSQL v10 has introduced the standard SQL way of defining a table with an automatically generated unique value, the identity column, which is declared with GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY.
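As a minimal sketch, an identity primary key could be declared like this. The table name my_tab and the column payload are assumptions chosen for illustration:

```sql
-- my_tab and payload are hypothetical names used for illustration
CREATE TABLE my_tab (
   id      bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
   payload text
);

-- id is omitted, so it is filled from the implicit sequence
INSERT INTO my_tab (payload) VALUES ('first row') RETURNING id;
```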

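Behind the scenes, this uses a sequence, and it is roughly equivalent to a traditional serial or bigserial column, which in turn is a shorthand for creating a sequence, using nextval on it as the column default, and marking the sequence as owned by the column.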
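The two traditional spellings can be sketched side by side. All table and sequence names below are hypothetical, and the second form is only an approximation of what the shorthand expands to:

```sql
-- traditional shorthand: bigserial is notation, not a real data type
CREATE TABLE my_serial_tab (
   id      bigserial PRIMARY KEY,
   payload text
);

-- roughly the same thing written out by hand
CREATE SEQUENCE my_manual_tab_id_seq;

CREATE TABLE my_manual_tab (
   id      bigint PRIMARY KEY DEFAULT nextval('my_manual_tab_id_seq'),
   payload text
);

-- tie the sequence to the column so that it is dropped together with it
ALTER SEQUENCE my_manual_tab_id_seq OWNED BY my_manual_tab.id;
```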

The problem with such a primary key column is that the generated value is a default value, so if the user explicitly inserts a different value into this column, it will override the generated one.

This is usually not what you want, because it will lead to a constraint violation error as soon as the sequence counter reaches the same value. Rather, you want the explicit insertion to fail, since it is probably a mistake.
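The pitfall can be reproduced in a few statements. This is a sketch with a hypothetical table demo_default; the explicit value is chosen so that the collision happens immediately:

```sql
-- demo_default is a hypothetical table used for illustration
CREATE TABLE demo_default (
   id      bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
   payload text
);

-- accepted: the explicit value overrides the default,
-- and the underlying sequence is not advanced
INSERT INTO demo_default (id, payload) VALUES (1, 'explicit id');

-- fails with a unique constraint violation,
-- because the sequence now generates the value 1 as well
INSERT INTO demo_default (payload) VALUES ('generated id');
```

The failed statement still consumes the sequence value, so a retry would succeed, which makes this kind of error intermittent and hard to track down.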

To make such an insertion fail, define the column with GENERATED ALWAYS instead of GENERATED BY DEFAULT.

You can still override the generated value, but you'll have to use the OVERRIDING SYSTEM VALUE clause for that, which makes it much harder for such an INSERT to happen by mistake.
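A short sketch of both behaviors, again with a hypothetical table (demo_always):

```sql
-- demo_always is a hypothetical table used for illustration
CREATE TABLE demo_always (
   id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   payload text
);

-- rejected with an error, because id is GENERATED ALWAYS
INSERT INTO demo_always (id, payload) VALUES (42, 'explicit id');

-- accepted, because the override is spelled out
INSERT INTO demo_always (id, payload)
   OVERRIDING SYSTEM VALUE
   VALUES (42, 'explicit id');
```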

New system catalog pg_sequence

Before PostgreSQL v10, Postgres stored a sequence's metadata (starting value, increment and others) in the sequence itself.

This information is now stored in a new catalog table pg_sequence.

The only data that remain in the sequence are the data changed by the sequence manipulation functions nextval and setval (currval and lastval only read the sequence state).
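To see the split between catalog and sequence, you can query both. The sequence name meta_seq and its settings are assumptions for this sketch:

```sql
-- meta_seq is a hypothetical sequence with non-default settings
CREATE SEQUENCE meta_seq START WITH 100 INCREMENT BY 5;

-- metadata: stored in the catalog since PostgreSQL v10
SELECT seqstart, seqincrement, seqmin, seqmax, seqcache, seqcycle
FROM pg_sequence
WHERE seqrelid = 'meta_seq'::regclass;

-- state: still stored in the single row of the sequence itself
SELECT last_value, is_called
FROM meta_seq;
```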

Transactional DDL for sequences

A sequence in PostgreSQL is a “special table” with a single row.

In “normal tables”, an UPDATE does not modify the existing row, but writes a new version of it and marks the old version as obsolete. Since sequence operations should be fast and are never rolled back, PostgreSQL can be more efficient by just modifying the single row of a sequence in place whenever its values change.

Before PostgreSQL v10, all metadata of a sequence were kept in the sequence itself (as explained in the previous section). The downside was that ALTER SEQUENCE, which also modified the single row of the sequence in place, could not be rolled back.

Since PostgreSQL v10 has given us pg_sequence, and catalog modifications are transaction safe in PostgreSQL, this limitation could be removed with the latest release.
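The difference is visible with a simple experiment on PostgreSQL v10 or later; ddl_seq is a hypothetical sequence name:

```sql
-- ddl_seq is a hypothetical sequence created with default settings
CREATE SEQUENCE ddl_seq;

BEGIN;
ALTER SEQUENCE ddl_seq INCREMENT BY 10;
ROLLBACK;

-- from v10 on, the rolled back change is gone and the increment is 1 again
SELECT seqincrement
FROM pg_sequence
WHERE seqrelid = 'ddl_seq'::regclass;
```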

Performance regression with ALTER SEQUENCE

When I said above that ALTER SEQUENCE has become transaction safe just by introducing a new catalog table, I cheated a little. There is one variant of ALTER SEQUENCE that modifies the values stored in the sequence itself rather than in the catalog: ALTER SEQUENCE ... RESTART.

If only some variants of ALTER SEQUENCE were transaction safe and others weren't, this would lead to surprising and buggy behavior.

That problem was fixed with a commit that made ALTER SEQUENCE, including RESTART, fully transactional.

This means that every ALTER SEQUENCE statement will now create a new data file for the sequence; the old one gets deleted during COMMIT. This is similar to the way TRUNCATE, CLUSTER, VACUUM (FULL) and some ALTER TABLE statements are implemented.
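One way to observe this, assuming PostgreSQL v10 and a hypothetical sequence file_seq, is to compare the data file path before and after the statement:

```sql
-- file_seq is a hypothetical sequence created with default settings
CREATE SEQUENCE file_seq;
SELECT nextval('file_seq');                  -- returns 1

SELECT pg_relation_filepath('file_seq');     -- path of the current data file

BEGIN;
ALTER SEQUENCE file_seq RESTART WITH 500;
SELECT pg_relation_filepath('file_seq');     -- expected to show a different file
ROLLBACK;                                    -- the old data file stays in use

SELECT nextval('file_seq');                  -- continues with 2, the restart is undone
```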

Of course this makes ALTER SEQUENCE much slower in PostgreSQL v10 than in previous releases, but you can expect this statement to be rare enough that it should not cause a performance problem.

However, there is this old blog post by depesz that recommends the following function to efficiently get a gap-less block of sequence values:
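The following is not the verbatim function from that post, only a sketch of the approach as described here: raise the increment, fetch one value, and reset the increment, all under an advisory lock. The function name multi_nextval_alter and the choice of lock key are assumptions:

```sql
-- sketch only: multi_nextval_alter is a hypothetical name
CREATE OR REPLACE FUNCTION multi_nextval_alter(
   use_seqname   regclass,
   use_increment integer
) RETURNS bigint
LANGUAGE plpgsql AS
$$
DECLARE
   reply   bigint;
   -- assumption: use the object ID of the sequence as lock key
   lock_id bigint := use_seqname::bigint;
BEGIN
   PERFORM pg_advisory_lock(lock_id);

   -- first ALTER SEQUENCE: make the next nextval() skip a whole block
   EXECUTE format('ALTER SEQUENCE %s INCREMENT BY %s',
                  use_seqname, use_increment);

   reply := nextval(use_seqname);

   -- second ALTER SEQUENCE: restore the assumed original increment of 1
   EXECUTE format('ALTER SEQUENCE %s INCREMENT BY 1', use_seqname);

   PERFORM pg_advisory_unlock(lock_id);

   RETURN reply;
END;
$$;
```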

This function returns the last value of the gap-less sequence value block (and does not work correctly when called on a newly created sequence).

Since this function calls ALTER SEQUENCE not only once but twice, you can imagine that every application that uses it a lot will experience quite a performance hit when upgrading to PostgreSQL v10.

Fortunately you can achieve the same thing with the normal sequence manipulation functions, so you can have a version of the function that will continue performing well in PostgreSQL v10:
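A sketch of such a function, built from nextval and setval only, could look like this. It assumes that the sequence has an increment of 1; the sequence name block_seq is made up for the usage example:

```sql
CREATE OR REPLACE FUNCTION multi_nextval(
   use_seqname   regclass,
   use_increment integer
) RETURNS bigint
LANGUAGE plpgsql AS
$$
DECLARE
   reply   bigint;
   -- the object ID of the sequence serves as advisory lock key
   lock_id bigint := use_seqname::bigint;
BEGIN
   PERFORM pg_advisory_lock(lock_id);

   -- first value of the block
   reply := nextval(use_seqname);

   -- mark the rest of the block as used (assumes an increment of 1)
   PERFORM setval(use_seqname, reply + use_increment - 1, TRUE);

   PERFORM pg_advisory_unlock(lock_id);

   -- last value of the block
   RETURN reply + use_increment - 1;
END;
$$;

-- usage with a hypothetical, freshly created sequence
CREATE SEQUENCE block_seq;
SELECT multi_nextval('block_seq', 100);   -- reserves 1 to 100 and returns 100
```

Unlike the variant based on ALTER SEQUENCE, this one also behaves correctly on a newly created sequence, because setval places the counter explicitly at the end of the block.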

If you want to get the first value of the sequence value block, use RETURN reply;

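Note that both the original function and the improved one use advisory locks. That means they will only work reliably if the sequence is used exclusively through that function: a plain nextval call from another session does not take the lock and can slip in between the statements of the function.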

 


5 Comments
Rob, 2 years ago

very helpful function! but, when creating the bigint lockid, why the addition of minimum int4 value and casting to int?
i.e. why can't you just do:
lock_id bigint := use_seqname::bigint
rather than:
lock_id bigint := (use_seqname::bigint - 2147483648)::integer;

laurenz, 2 years ago, in reply to Rob

Yes, that seems silly; fixed. I cannot really remember why I did it that way.

Rob, 2 years ago, in reply to laurenz

Thanks, I wondered whether it might relate to the fact that there are two forms to the pg_advisory_lock function. From the manual: pg_advisory_lock locks an application-defined resource, which can be identified either by a single 64-bit key value or two 32-bit key values (note that these two key spaces do not overlap). Or, as I guess one risk with advisory locks is that every use case needs to be aware of every other use case, to ensure there is no unintended collision of keys between them.

laurenz, 2 years ago, in reply to Rob

Sure, but the original, more complicated code did that no better than the simple one does.
There can be no collision, because no two sequences can have the same object ID.

Yepeng Yin, 4 years ago

The improved multi_nextval version even requires plain nextval calls to take the same
advisory lock. Otherwise, another connection can call nextval between the following statements:

reply := nextval(use_seqname);
----> other connections will call nextval

PERFORM setval(use_seqname, reply + use_increment - 1, TRUE);
