Rules or triggers to log bulk updates?

07.2018 / Category: Performance / Tags: administration | trigger | tuning

Triggers with transition tables rule! — © Laurenz Albe 2018

Table of Contents

Inspired by my co-worker's recent blog post, I decided to revisit the old question of rules vs. triggers and run a little benchmark to see which one does better.

About rules

While triggers are well known to most application developers and database administrators, rules are less well known. The full name “query rewrite rule” explains what they are doing: Before the query is optimized, a rule can either replace the query with a different one or add additional queries. These are then planned and executed instead of or together with the original query.

While rules are very powerful, they are also tricky to get right, particularly when data modification is involved. It is usually recommended to use the more modern triggers, and there have been attempts to deprecate them. But since rules are used to implement views, they are probable here to stay.

Moreover, they are said to be faster than triggers when many rows are affected. That's what I tried to measure.

The test case

We have a simple table:

CREATE UNLOGGED TABLE upd (
   id  integer NOT NULL,
   val text    NOT NULL
) WITH (autovacuum_enabled = off);

1

2

3

4

CREATE UNLOGGED TABLE upd (

id integer NOT NULL,

val text NOT NULL

) WITH (autovacuum_enabled = off);

Since we are only interested in the speed of processing, I decided to use an unlogged table to bypass disk I/O as much as possible. Autovacuum is disabled to avoid inferences with it.

Every update to this table should be logged to a second table:

CREATE UNLOGGED TABLE log (
   id       integer     NOT NULL,
   log_time timestamptz NOT NULL
            DEFAULT current_timestamp,
   new_val  text        NOT NULL
) WITH (autovacuum_enabled = off);

1

2

3

4

5

6

CREATE UNLOGGED TABLE log (

id integer NOT NULL,

log_time timestamptz NOT NULL

DEFAULT current_timestamp,

new_val text NOT NULL

) WITH (autovacuum_enabled = off);

The test will repeatedly perform the following in a loop:

TRUNCATE upd;
TRUNCATE log;

INSERT INTO upd (id, val)
   SELECT i, 'text number ' || i
   FROM generate_series(1, 100000) i;

VACUUM (ANALYZE) upd;

UPDATE upd SET val = val;

1

2

3

4

5

6

7

8

9

10

TRUNCATE upd;

TRUNCATE log;

INSERT INTO upd (id, val)

SELECT i, 'text number ' || i

FROM generate_series(1, 100000) i;

VACUUM (ANALYZE) upd;

UPDATE upd SET val = val;

baseline

I'll perform two baseline measurements: one without the UPDATE (called “zero”), and one with just the UPDATE, but no logging (called “base”). This will enable me to measure the net time spent in the UPDATE by subtraction so I can calculate how much performance is slowed down by the different logging techniques.

logging with a rule

For this measurement, which I call “rule”, I'll use the following rule to log updates:

CREATE RULE upd_rule AS ON UPDATE TO upd
   DO ALSO INSERT INTO log (id, new_val)
           VALUES (NEW.id, NEW.val);

1

2

3

CREATE RULE upd_rule AS ON UPDATE TO upd

DO ALSO INSERT INTO log (id, new_val)

VALUES (NEW.id, NEW.val);

logging with a row level trigger

For this measurement, which I call “trig_row”, I'll use the following trigger to log updates:

CREATE FUNCTION upd_row_trig() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   INSERT INTO log (id, new_val)
   VALUES (NEW.id, NEW.val);

   RETURN NEW;
END;$$;

CREATE TRIGGER upd_row_trig
   AFTER UPDATE ON upd FOR EACH ROW
   EXECUTE PROCEDURE upd_row_trig();

1

2

3

4

5

6

7

8

9

10

11

12

CREATE FUNCTION upd_row_trig() RETURNS trigger

LANGUAGE plpgsql AS

$$BEGIN

INSERT INTO log (id, new_val)

VALUES (NEW.id, NEW.val);

RETURN NEW;

END;$$;

CREATE TRIGGER upd_row_trig

AFTER UPDATE ON upd FOR EACH ROW

EXECUTE PROCEDURE upd_row_trig();

logging with a statement level trigger

For this measurement, which I call “trig_stmt”, I'll use the following trigger to log updates:

CREATE FUNCTION upd_stmt_trig() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   INSERT INTO log (id, new_val)
   SELECT id, val FROM newrows;

   RETURN NEW;
END;$$;

CREATE TRIGGER upd_row_trig AFTER UPDATE ON upd
   REFERENCING NEW TABLE AS newrows FOR EACH STATEMENT
   EXECUTE PROCEDURE upd_stmt_trig();

1

2

3

4

5

6

7

8

9

10

11

12

CREATE FUNCTION upd_stmt_trig() RETURNS trigger

LANGUAGE plpgsql AS

$$BEGIN

INSERT INTO log (id, new_val)

SELECT id, val FROM newrows;

RETURN NEW;

END;$$;

CREATE TRIGGER upd_row_trig AFTER UPDATE ON upd

REFERENCING NEW TABLE AS newrows FOR EACH STATEMENT

EXECUTE PROCEDURE upd_stmt_trig();

This uses transition tables which are new in PostgreSQL v10. The REFERENCING clause will make all rows modified by the statement available in a “temporary” table newrows.

Test environment

All commands are run from a shell script on my Linux laptop using psql -c "...". The script loops through the statements twenty times, and the elapsed time is measured with time.

Each test is executed three times to ensure that the execution time does not vary, then the middle value is used.

Test results

The zero measurements have been subtracted and the result divided by twenty to obtain the following numbers:

Performance of rule and triggers for logging bulk updates
	baseline	rule	row level trigger	statement level trigger
execution time	295 ms	454 ms	3816 ms	642 ms
overhead percentage	0 %	54 %	1193 %	117 %

Discussion

The test confirms that rules perform better than anything else when it comes to logging bulk updates. However, while logging with row level triggers caused a slowdown by a forbidding factor of 13, statement level triggers with transition tables only doubled the execution time.

So with PostgreSQL v10 we have a good alternative that performs half as well as rules, and there is no need to use rules unless you need cutting edge performance.

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

6 responses to “Rules or triggers to log bulk updates?”

Wutikrai Pornchai says:

July 5, 2018 at 12:25 am

Very useful and informative article, Thank you.

Reply
Nick says:

July 5, 2018 at 11:05 am

Note that this kind of usage of rules may result in unexpected behavior. For instance a simple query like update upd set val = val || random(); will result in different values actually written to val and logged into log table.

Reply
- laurenz says:
  
  July 25, 2018 at 1:04 pm
  
  You are absolutely right.
  Rules are not easy to use correctly, particularly with data modifications. That is why it is recommended to use triggers.
  I was only out to compare performance, not usability.
  
  Reply
Shivam Rathore says:

November 30, 2018 at 3:49 am

What about using logical decoding to retrieve logs?
https://www.postgresql.org/docs/9.5/logicaldecoding.html

Reply
Wellspring says:

October 2, 2019 at 11:15 am

Good read -- thanks much!

Reply
Chris Savvopoulos says:

December 13, 2022 at 10:23 am

Thank you for posting this Benchmark. for each row has one saving grace: when. For example, for each row when (new.val = 'bla') seems to outperform rules.

Reply

Leave a Reply Cancel reply

Stay tuned with our

NEWSLETTER

CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Customer Support

Support Platform

Services

Support CYBERTEC Partner PostgreSQL Books

Company

STAY TUNED WITH OUR NEWSLETTER

Get the newest PostgreSQL Info & Tools

Data Protection Policy Terms and conditions Terms of Service Imprint

©

2024

CYBERTEC PostgreSQL International GmbH