© Laurenz Albe 2022
Recently, a customer asked me why a bulk load into a PostgreSQL table kept slowing down as time went by. They saw that it was a trigger on the table that took longer and longer, and in that trigger, the updates were getting slower all the time. Now slow updates are a frequent problem, so I thought it would be interesting to explain what’s going on here in detail.
An example that demonstrates updates getting slower in PostgreSQL
Our example scenario is a table that receives bulk loads. Because we frequently need to query the sum of the values in the table, we keep a summary table that is maintained by a trigger (see this article for an explanation):
CREATE TABLE item ( id bigint GENERATED ALWAYS AS IDENTITY, value bigint NOT NULL ); CREATE TABLE sum ( total bigint NOT NULL ); INSERT INTO sum VALUES (0); CREATE FUNCTION add_to_sum() RETURNS trigger LANGUAGE plpgsql AS $$BEGIN UPDATE sum SET total = total + NEW.value; RETURN NEW; END;$$; CREATE CONSTRAINT TRIGGER add_to_sum AFTER INSERT ON item DEFERRABLE INITIALLY DEFERRED FOR EACH ROW EXECUTE FUNCTION add_to_sum();
We are using a deferred constraint trigger here so that the update takes place at the very end of the transaction. Then the row in the
sum table is not locked any longer than absolutely necessary, and we get the best possible concurrency. Also, we are careful not to place any index on
sum (which would be tempting to enforce that there is only a single row) so that we can get fast HOT updates.
How we perform bulk loading
We will bulk load the table with a script like this:
\timing on COPY item (value) FROM STDIN; 11638 422 17165 [a total of 100000 random numbers] \.
Since this script mixes a
COPY statement with data, we need to execute it with
psql -d dbname -f script.sql
On my machine, the above script takes 60 seconds, which is a terribly long time to load 100000 rows. If I drop the trigger on
item, the same script runs in less than 70 milliseconds.
How to prove that the problem is updates getting slower?
The problem clearly is the trigger, and the trigger function contains only a single
UPDATE statement. Why does that statement take so long? We can use the auto_explain extension to see details of the execution of the individual
UPDATE statements. For that, we add the following lines to
shared_preload_libraries = 'auto_explain' auto_explain.log_min_duration = 0 auto_explain.log_analyze = on auto_explain.log_buffers = on auto_explain.log_wal = on auto_explain.log_nested_statements = on
We need to restart PostgreSQL for the changed
shared_preload_libraries setting to take effect. Note that these are not settings I would want to use on a busy server. For one,
auto_explain.log_analyze = on will log all statements, which can be more than your disk can handle. Also,
auto_explain.log_analyze = on will slow down all statements considerably, since PostgreSQL then measures statement execution in detail.
However, if we start the experiment over, we get each of the 100000
UPDATE statements from the trigger logged, and the result is enlightening:
Query Text: UPDATE sum SET total = total + NEW.value Update on sum (cost=0.00..38.25 rows=0 width=0) (actual time=0.016..0.017 rows=0 loops=1) Buffers: shared hit=3 WAL: records=1 bytes=74 -> Seq Scan on sum (cost=0.00..38.25 rows=2260 width=14) (actual time=0.005..0.005 rows=1 loops=1) Buffers: shared hit=1
Then execution gets slower and slower, until the 100000th
Query Text: UPDATE sum SET total = total + NEW.value Update on sum (cost=0.00..38.25 rows=0 width=0) (actual time=1.641..1.641 rows=0 loops=1) Buffers: shared hit=445 WAL: records=1 bytes=74 -> Seq Scan on sum (cost=0.00..38.25 rows=2260 width=14) (actual time=1.637..1.637 rows=1 loops=1) Buffers: shared hit=443
Explaining the cause of updates getting slower
Row versions keep accumulating
The time is clearly spent in a sequential scan of
sum, which must be growing to a size of 443 8kB-pages. To understand that, remember that an
UPDATE in PostgreSQL does not overwrite the existing row, but adds a new row version to the table. This is similar (but not exactly the same) as a
DELETE followed by an
INSERT. So the 100000
UPDATEs actually add 100000 new row versions to the table.
VACUUM cannot clean up
It is the job of
VACUUM to remove these “dead row versions” produced by
UPDATE statements. Normally, the autovacuum process takes care of this automatically. So why doesn’t it work in our case? The reason is that
VACUUM can only remove row versions that are older than any currently active transaction. But our
COPY statement is a single transaction, which normally is a good thing, since it makes bulk loading fast. So autovacuum can only clean up the dead row versions in
sum after the
COPY statement is done, and by that time the
sum table has already been bloated out of shape.
HOT pruning is no help either
But wait a minute: we carefully designed our setup to get HOT updates so that PostgreSQL does not have to run
VACUUM to clean up dead row versions. Every SQL statement can perform HOT pruning to get rid of dead row versions cheaply. However, HOT pruning is subject to the same limitation as
VACUUM and also cannot clean up row versions from currently open transactions.
Indexes cannot help
After a first glance at the slow execution plan, you may think that the reason for the slowdown is that PostgreSQL performs a sequential scan, and an index would speed up processing. That would not work with the
sum table in its current definition, but we could certainly add a primary key column. However, that would harm performance rather than improve it: PostgreSQL would still have to follow the HOT chain within each block to the end. Also, every
UPDATE that is not HOT (because there is no more room in the current page) would create a new (identical) primary key index entry, and scanning through these entries would make us visit every page of the table. So, if anything, an index would make the
UPDATE still slower.
Examining the table with
Reasoning is one thing, but a look at the actual table data is more convincing. Normally, we cannot see dead row versions, but with the functions from the pageinspect extension a superuser can see all the data in a table. If we are quick enough and run the following query after
COPY is done, but before autovacuum has cleaned up the table, we can see what is in the first page of
SELECT format('(0,%s)', lp) AS tid, t_ctid AS replacement, t_xmax <> 0 AS dead, t_infomask2 & 16384 <> 0 AS hot_updated, t_attrs AS total FROM heap_page_item_attrs(get_raw_page('sum', 0), 'sum');
The result will be as follows:
tid │ replacement │ dead │ hot_updated │ total ═════════╪═════════════╪══════╪═════════════╪════════════════════ (0,1) │ (0,2) │ t │ t │ \x0000000000000000 (0,2) │ (0,3) │ t │ t │ \x214e000000000000 (0,3) │ (0,4) │ t │ t │ \xe766000000000000 (0,4) │ (0,5) │ t │ t │ \xeaa0000000000000 (0,5) │ (0,6) │ t │ t │ \x9113010000000000 (0,6) │ (0,7) │ t │ t │ \x5c23010000000000 ... (0,225) │ (0,226) │ t │ t │ \x1356380000000000 (0,226) │ (1,1) │ t │ f │ \xab99380000000000 (226 rows)
Each entry is the replacement for the previous one, all entries are dead (
xmax is set), and all
UPDATEs except the last one (which created a row in the next page) were HOT.
A remedy for updates getting slower
Now that we understand the problem, the solution is clear: don’t repeatedly update the same row in a single transaction. The best way to avoid that will depend on the details of the original workload. In our case, one possible solution would be a statement level trigger using a transition table:
DROP TRIGGER add_to_sum ON item; CREATE OR REPLACE FUNCTION add_to_sum() RETURNS trigger LANGUAGE plpgsql AS $$BEGIN UPDATE sum SET total = total + (SELECT sum(value) FROM new_rows); RETURN NULL; END;$$; CREATE TRIGGER add_to_sum AFTER INSERT ON item REFERENCING NEW TABLE AS new_rows FOR EACH STATEMENT EXECUTE FUNCTION add_to_sum();
With this trigger, only a single
sum will be performed for each
COPY statement. With that, the bulk load takes only slightly longer than 90 milliseconds on my machine!
The most common reason for updates getting slower over time is if you repeatedly update the same rows in a single transaction. The simple solution is to avoid such repeated updates: after all, only the last update of a row in a transaction will be visible after the transaction is done.
- If you would like to learn how to detect slow queries, see Hans’ blog on 3 Ways to Detect Slow Queries in PostgreSQL.
- To find out why
VACUUMdoes not always remove dead rows, see my blog on 4 Reasons Why VACUUM Won’t Remove Dead Rows
- For more help in speeding up your PostgreSQL database performance, check out our performance blogs.