Dealing with streaming replication conflicts in PostgreSQL

11.2020
© Laurenz Albe 2020

Streaming replication in PostgreSQL is a well-established master-slave replication technique. It is simple to set up, stable and performs well. So many people are surprised when they learn about replication conflicts — after all, the standby server is read-only.

This article describes replication conflicts and tells you how to deal with them.

What is a replication conflict?

A replication conflict occurs whenever the recovery process cannot apply WAL information from the primary server to the standby, because the change would disrupt query processing there. These conflicts cannot happen with queries on the primary server, but they happen on the streaming replication standby server because the primary server has limited knowledge about what is going on on the standby.

There are several kinds of replication conflicts:

Snapshot replication conflicts

This is the most frequent replication conflict.

Snapshot conflicts can occur if VACUUM processes a table and removes dead tuples. This removal is replayed on the standby. Now a query on the standby may have started before VACUUM on the primary (it has an older snapshot), so it can still see the tuples that should be removed. This constitutes a snapshot conflict.
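As a sketch, a sequence of events like the following can provoke such a conflict (the table name large_table is made up for illustration):

```sql
-- On the standby: start a long-running query (it holds an old snapshot)
BEGIN;
SELECT count(*) FROM large_table;   -- still running ...

-- Meanwhile, on the primary: create dead tuples and clean them up
DELETE FROM large_table WHERE id < 100000;
VACUUM large_table;   -- removes the dead tuples; the removal is written to WAL

-- When WAL replay on the standby reaches the VACUUM records, the
-- still-running SELECT causes a snapshot replication conflict.
```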

Lock replication conflicts

The queries on a standby server take an ACCESS SHARE lock on the tables they are reading. So any ACCESS EXCLUSIVE lock on the primary (which conflicts with ACCESS SHARE) must be replayed on the standby to keep incompatible operations on the table from happening. PostgreSQL takes such a lock for operations that conflict with SELECT, for example DROP TABLE, TRUNCATE and many ALTER TABLE statements. If the standby has to replay such a lock on a table that a query is currently using, we have a lock conflict.
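For example (assuming a table called mytab), each of the following statements on the primary takes an ACCESS EXCLUSIVE lock whose replay can conflict with a concurrent query on the standby:

```sql
TRUNCATE mytab;                          -- empties the table
ALTER TABLE mytab ADD COLUMN val text;   -- most ALTER TABLE variants
DROP TABLE mytab;                        -- removes the table altogether
```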

Buffer pin replication conflicts

One way to reduce the need for VACUUM is to use HOT updates. Then any query on the primary that accesses a page with dead heap-only tuples and can get an exclusive lock on it will prune the HOT chains. PostgreSQL always holds such page locks for a short time, so there is no conflict with processing on the primary. There are other causes for page locks, but this is perhaps the most frequent one.

When the standby server has to replay such an exclusive page lock while a query is using the page (“has the page pinned” in PostgreSQL jargon), you get a buffer pin replication conflict. Pages can be pinned for a while, for example during a sequential scan of a table on the outer side of a nested loop join.

HOT chain pruning can of course also lead to snapshot replication conflicts.

Rare kinds of replication conflicts

The following types of conflict are rare and will not bother you:

  • Deadlock replication conflicts: A query on the standby blocks while using the shared buffer that is needed to replay WAL from the primary. PostgreSQL will cancel such a query immediately.
  • Tablespace replication conflicts: A tablespace is in temp_tablespaces on the standby server, and a query has temporary files there. When a DROP TABLESPACE is replayed from the primary, we get a conflict. PostgreSQL cancels all queries on the standby in that case.
  • Database replication conflicts: Replication of DROP DATABASE causes a conflict if the standby has active sessions on the database. PostgreSQL terminates all connections to the database on the standby in that case.

Monitoring replication conflicts

The statistics view pg_stat_database_conflicts contains a detailed account of all replication conflicts that happened since the last statistics reset. You have to look at that view on the standby server, not the primary, because that is where replication conflicts occur.
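For example, to see the per-database conflict counts, you can run a query like this on the standby:

```sql
-- run this on the standby server, not the primary
SELECT datname,
       confl_tablespace,   -- tablespace conflicts
       confl_lock,         -- lock conflicts
       confl_snapshot,     -- snapshot conflicts
       confl_bufferpin,    -- buffer pin conflicts
       confl_deadlock      -- deadlock conflicts
FROM pg_stat_database_conflicts;
```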

Note that this view does not show all replication conflicts that occurred, it only shows the ones that led to a canceled query on the standby (see the next section).

How does the standby server resolve replication conflicts?

The parameter max_standby_streaming_delay determines what happens when WAL replay encounters a replication conflict (there is a similar parameter max_standby_archive_delay that does the same thing for archive recovery). PostgreSQL suspends replay of the WAL information for at most max_standby_streaming_delay milliseconds. If the conflicting query is still running after that time, PostgreSQL cancels it with an error message like:

    ERROR:  canceling statement due to conflict with recovery
    DETAIL:  User query might have needed to see row versions that must be removed.

The detail message shows that this was from a snapshot replication conflict.

max_standby_streaming_delay has a default value of 30 seconds, so queries on the standby get a “grace time” of half a minute to finish before they get canceled if they cause a replication conflict. This is a middle ground between the extreme settings of 0 (PostgreSQL cancels queries immediately, no delay for replay) and the special value -1 (PostgreSQL never cancels queries, arbitrarily long replay delay).
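In postgresql.conf on the standby, the three settings discussed above would look like this (only one line would actually be active):

```
# the default: give conflicting queries 30 seconds to finish
max_standby_streaming_delay = 30s

# alternatives:
# max_standby_streaming_delay = 0    # replay immediately, cancel conflicting queries at once
# max_standby_streaming_delay = -1   # never cancel queries, delay replay indefinitely
```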

To understand how to best configure PostgreSQL to deal with replication conflicts, we have to take a look at the use cases for streaming replication.

Use cases for streaming replication

High availability

Streaming replication is the basis for most high availability solutions. Together with software like Patroni that manages failover, it provides a robust shared-nothing architecture to build a fault tolerant system.

Clearly, the main objective with high availability is to have as little replication delay as possible. This way, standby promotion is fast and little information is lost during the failover. In this case, you want to set max_standby_streaming_delay low.

Note that you need not lose more data during a failover if the standby is behind with applying WAL – the WAL information is still streamed to the standby and written to pg_wal. But it will take longer for the standby to catch up, so the failover time increases.

Off-loading big queries

Big queries for reporting or data analysis can generate load that may overload the productive system. The best solution for that is a data warehouse specifically designed for such queries. But often a standby server can serve as a “poor man's data warehouse”.

Another example of off-loading are database backups: backing up a standby server puts no stress on the primary system.

The main objective in this case is to let the queries (or the backup) complete without interruption.

In this case, you want to set max_standby_streaming_delay to a value greater than the lifetime of the longest query, and it is no problem if there is a delay replaying WAL.

Horizontal scaling

You can use standby servers to distribute database workload across several machines. In practice, there are several limitations to the usefulness of this approach:

  • all writing statements have to go to the primary server, so only reading can be scaled
  • the application has to be able to direct queries and data modifications to different databases
  • the application has to cope with the problem that data modifications may not immediately be visible to queries (synchronous replication avoids this, but the performance impact on writing transactions is staggering)

Another difficulty you face is that there is no good setting for max_standby_streaming_delay: a low value will make queries on the standby fail, while a high value will cause queries on the standby to see stale data.

How to deal with conflicting requirements

Ideally, a standby server serves only a single purpose, so that you can adjust max_standby_streaming_delay or hot_standby accordingly. So the best approach is to have dedicated standby servers for failover and for off-loading work.

But sometimes you cannot afford a second standby server, or you may be stuck in a situation like in the “horizontal scaling” scenario above. Then your only option is to reduce the number of replication conflicts as much as possible.

Avoiding replication conflicts

Avoiding all conflicts by disabling hot standby

Obviously there can be no replication conflicts if there are no queries on the standby server. So if you set hot_standby = off on the standby, you don't have to worry about this at all.

But while this method is simple and effective, it will only be feasible if the standby server is exclusively used for high availability. If you are not in that fortunate position, read on.

Avoiding lock conflicts

The obvious measure to avoid lock conflicts is not to issue statements that take an ACCESS EXCLUSIVE lock on the table. The most important statements that do so are:

  • DROP TABLE
  • TRUNCATE
  • LOCK
  • DROP INDEX
  • DROP TRIGGER
  • ALTER TABLE

But there is one kind of ACCESS EXCLUSIVE lock that you cannot avoid that way: locks from VACUUM truncation. When VACUUM has finished processing a table, and the pages at the end of the table have become empty, it tries to get a short ACCESS EXCLUSIVE lock on the table. If that succeeds, it will truncate the empty pages and immediately release the lock. While such locks don't disrupt processing on the primary, they can cause replication conflicts on the standby.

There are two ways to avoid VACUUM truncation:

  • From PostgreSQL v12 on, you can disable the feature for individual tables with the vacuum_truncate storage parameter.

  • You can set old_snapshot_threshold on the primary to a value other than -1. This disables VACUUM truncation as an undocumented side effect.
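As a sketch, assuming a table named mytab, the first option looks like this:

```sql
-- PostgreSQL v12 or later: disable VACUUM truncation for one table
ALTER TABLE mytab SET (vacuum_truncate = off);
```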

Avoiding snapshot conflicts

The way to reduce such conflicts is to keep the primary from removing dead tuples that might still be visible on the standby. There are two parameters that help with this:

  • Set hot_standby_feedback on the standby to on. Then the feedback messages from standby to primary server will contain the snapshot xmin of the oldest active transaction on the standby, and the primary will not remove any tuples that this transaction still could see.
    That will get rid of most of these replication conflicts, but now long running queries on the standby can lead to table bloat on the primary, which is why that setting is not enabled by default. Consider the risk carefully.
  • Set vacuum_defer_cleanup_age on the primary to a value greater than 0. Then VACUUM will not clean up dead tuples unless they are more than vacuum_defer_cleanup_age transactions old. This is less specific than hot_standby_feedback and can also lead to table bloat.

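A minimal configuration sketch for the two options (the vacuum_defer_cleanup_age value is just an example):

```
# in postgresql.conf on the standby; the primary learns the
# oldest snapshot via the feedback messages
hot_standby_feedback = on

# alternatively, in postgresql.conf on the primary (value in transactions):
# vacuum_defer_cleanup_age = 10000
```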
Note that while hot_standby_feedback = on will get rid of most of the snapshot replication conflicts, it will not necessarily eliminate buffer pin conflicts, since the page that the standby is using could contain some very old tuples. Moreover, I have seen snapshot conflicts in databases even when hot_standby_feedback was on, although after consulting the source I don't understand how that could happen. Maybe a reader can enlighten me :^)

Avoiding buffer pin conflicts

There is no very good way to avoid these conflicts. Perhaps you can reduce the number of HOT updates, but that would harm performance on the primary.

Conclusion

The best way to avoid replication conflicts is to have dedicated standby servers: one for high availability and one for off-loading queries or backups. Then you can easily configure each to avoid replication conflicts.

If you cannot afford that expense, or you want to use a standby for horizontal scaling, you will have to adjust hot_standby_feedback, max_standby_streaming_delay and the vacuum_truncate storage parameter to get as few canceled queries as possible while avoiding excessive table bloat and long replication delays.

Comments
Ayub123
1 year ago

Thanks for your wonderful blog posts, they explain concepts in a very easy to understand manner. I prefer reading your blog first and then going to the official documentation. Now for "Buffer pin replication conflicts" – does it mean the way to avoid them is to disable (auto)vacuum, so that it won't hold an exclusive lock on the dead tuple blocks and prune them? In streaming replication, will the standby have the same blocks at the storage level?

laurenz
1 year ago
Reply to  Ayub123

Thank you for the praise.
Even if you disable autovacuum (which would be a mistake), you would still get buffer pins and conflicts because of them. HOT updates lead to page locks even if autovacuum is disabled.
Don't try to avoid replication conflicts. Have two standby servers: one for high availability and one for running queries on.

Joe
1 year ago
Reply to  laurenz

How can "two standby servers: one for high availability and one for running queries on" solve replication conflicts?

High availability standby server should have hot_standby = off
but second standby server with hot_standby = on will still have conflicts.

laurenz
1 year ago
Reply to  Joe

Yes, but it won't cancel queries in the case of a conflict if you set max_standby_streaming_delay to -1. Instead, replay of the WAL information will be delayed.

Joe
1 year ago
Reply to  laurenz

I have master and standby server with hot_standby = on (pgpool with disable_load_balance_on_write = 'transaction'), last week I have checked settings with "max_standby_streaming_delay to -1" and synchronous_commit=remote_apply.

write_lag and flush_lag was ok but replay_lag had terrible delays, some SELECT FOR UPDATE and UPDATES waits for almost 2h and in logs "remaining connection slots are reserved for non-replication superuser connections....." eventually I had to turn off replication. It looked like never-ending queries.

laurenz
1 year ago
Reply to  Joe

Synchronous replication with synchronous_commit = remote_apply and max_standby_streaming_delay = -1 are a toxic combination. That cannot work: queries on the standby will be allowed to delay replay of the WAL indefinitely, and COMMIT on the primary will wait for WAL to be replayed, so any write on the primary will start taking forever.
You either have to tolerate stale reads or canceled queries on the standby. There is no third way.

Joe
1 year ago
Reply to  laurenz

Thank you

Byung Gyu Park
1 year ago

I admired your article. Thank you so much for writing such a nice article.
I have a question.

You explained as follows.
------------------------------------------------------
max_standby_streaming_delay has a default value of 30 seconds, so queries on the standby get a “grace time” of half a minute to finish before they get canceled if they cause a replication conflict. This is a middle ground between the extreme settings of 0 and the special value -1
------------------------------------------------------
Q1. Does "half a minute" mean 30 minutes? Or 30 seconds?
Q2. You explained that the intermediate value between the values "0" and "-1" of "max_standby_streaming_delay" is "grace time".
A value of "-1" means does not end the query, and is an arbitrary long duration.
In other words, it is not a quantitatively determined time, and I do not understand why it is explained that half of the time that is not quantitatively defined is "grace time" and that "grace time" is specified as "half a minute", a quantitative number.

Could you please explain this part as a supplement?
What I'm really curious about is how many minutes or seconds "grace time" is exactly, and why?
If there is a way to know the "grace time" when referring to the source code or the official documentation, please let me know.
Lastly, I would like to say thank you once again for writing such a good article.

laurenz
1 year ago
Reply to  Byung Gyu Park

Much of the language that confused you is not technical language, so let me try to clarify:
- "half a minute" is English for 30 seconds.
- "grace time" is not a technical time, all it means is "some extra time before I get nasty".
By default that is 30 seconds; if the query that causes the conflict has not finished after that time, it will be canceled.
- "30 seconds" is not halfway between 0 (immediately) and -1 (infinitely long).
But most queries take less than 30 seconds, so most queries will not get canceled. That's what I mean by "middle ground": most queries don't get canceled, and at the same time replay is not delayed for more than 30 seconds.

Byung Gyu Park
1 year ago
Reply to  laurenz

You will be truly blessed for being kind enough to answer my poor English skills. You are not only technically good, but you have a great personality.

postgres dba
2 years ago

minor correction :

Avoiding snapshot conflicts:
hot_standby_feedback should be set on standby
Regards,
Harikrishna kuchipudi

laurenz
2 years ago
Reply to  postgres dba

What an embarrassing mistake. Thanks! I have fixed the text.

postgres dba
2 years ago
Reply to  laurenz

Nice explanation...learnt lot of things..thank you so much

yhuelf
3 years ago

Hi Laurenz, thank you for this very interesting blog post!!

I'd like to report a typo in "The queries on a standby server take an ACCESS EXCLUSIVE lock on the tables they are reading"

(replace "ACCESS EXCLUSIVE" by "ACCESS SHARE")

laurenz
3 years ago
Reply to  yhuelf

Thank you, what an embarrassing bug. Fixed.

Amine Tengilimoglu
3 years ago

I have a standby server on pg13. There is no transaction or request on it. When I display the pg_stat_replication view, I see 46 hours of replay lag. How can this be?

laurenz
3 years ago

I guess you mean pg_stat_replication.
Perhaps there is no data modification activity.
The relevant measure is lag in bytes. Try this query on the primary server:

SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) FROM pg_stat_replication;

Amine Tengilimoglu
3 years ago
Reply to  laurenz

Yes, I mean pg_stat_replication. I see the replay_lag in this view, and it is about 2 days. On the master there are lots of insert statements, but on the slave there is no workload. Is there any way to force replaying the WAL on the standby?

laurenz
3 years ago

Make sure there are no queries running on the standby. Look at the standby's log file for errors. Buy some consulting from us.

Amine Tengilimoglu
3 years ago
Reply to  laurenz

Thank you Laurenz. I have resolved it.

jobinau
3 years ago

Minor correction: the view name is mentioned as "pg_stat_replication_conflicts"; it is pg_stat_database_conflicts.

laurenz
3 years ago
Reply to  jobinau

Thanks, fixed.
I wrote "replication conflict" so often in this article that it got stuck...

Kaarel
3 years ago

Another increasingly popular technique for avoiding query conflicts is setting up a logical replication replica. It's certainly a bit more work than a simple streaming standby, and it needs monitoring and maintenance when the schema evolves, etc., but you can avoid most downsides: no extra bloat on the primary, no replication lag, no killed queries.
