CYBERTEC Logo

Performance differences between normal and generic audit triggers

04.2020 / Category: / Tags: |

What are the performance differences between normal and generic audit triggers? Recently I was talking in a more general way about some common auditing / change tracking approaches for PostgreSQL...but it also made me curious, how does it look from the performance side?

To quickly recap the previous blog post: the most common approaches for tracking important changes are mostly solved with writing some triggers. There are two main variations: table specific triggers / audit tables and a more generic approach with only one trigger function and one (or also many) generic audit tables, usually relying on PostgreSQL’s NoSQL features that allow quite a high degree of genericness in quite a convenient and usable way.

Obviously one could “guesstimate” that the generic approach would perform worse than the tailored approach as this is commonly the case in software. But the question is - how much worse? Can it be considered negligible? Well there's only one way to find out I guess….and finally after finding some time to set up a small test case I can say that I was in for a bit of a surprise! But do read on for details or see the last paragraph for the executive summary.

Test schema for the “table specific” approach

As usual I turned to my old buddy pgbench, that aims to effortlessly aid with a typical OLTP transaction scenario (3 updates, 1 select, 1 insert - detailed SQL seen here). Based on that it quickly hacked up some SQL to create the “background” auditing tables for the 3 pgbench tables that get UPDATE-s in the default mode (FYI - there’s also other modes built in + one can use custom scripts) and attached the per table triggers. Although it’s quite a simple schema, there’s probably still too much code to list here - so this is only an excerpt to get the gist of it. The full script can be found here. Note that I didn’t bother to drop the history table populated by pgbench itself as there are no indexes on it, so it shouldn’t slow things down too much, plus I left it there for both test cases so everything should still be equal.

Test schema for the “generic” approach

As with the explicit schema, we’re starting with the default pgbench schema but now extending it only with a single trigger function that will be attached to all 3 tables getting updates! And there will also only be a single logging table relying on the “generic” JSONB data type to handle all kinds of various inputs with just one column. Full SQL code below (or here):

Setup info

Some hardware / software information for completeness:

Test host: 4 CPU i5-6600, 16GB RAM, SATA SSD

PostgreSQL: v12.1, all defaults except shared_buffers set to 25% of RAM, i.e. 2GB, checkpoint_completion_target = 0.9, backend_flush_after = '2MB' (see here for explanation).

Pgbench: scale 1000, i.e. ~ 13GB DB size to largely factor out disk access (working set should fit more or less into RAM for good performance on any DB system), 4h runtime, 2 concurrent sessions (--jobs) not to cause any excessive locking on the smaller tables. Query latencies were measured directly in the database using the pg_stat_statement extension, so they should be accurate.

Results - TPS & latencies

Performance has many aspects but let’s start with TPS (Transactions per Second) and transaction latencies (single transaction duration) as they are probably the most important metrics for the performance hungry. Surprisingly it turned out that the generic NoSQL approach was quite a bit faster in total numbers! Whew, I didn’t expect that for sure. Note the “in total numbers” though...as updates on the 2 small tables were still a tiny bit slower still, but the costliest update a.k.a. the “heart” of the transaction (on pgbench_accounts table) was still significantly faster still. See below table for details.

 

Explicit triggers Generic triggers Difference (%)
TPS 585 794 +35.7
Total transaction 6324546 8581130 +35.7
Mean time of UPDATE pgbench_accounts (ms) 1.1127 0.8282 -25.6
Mean time of UPDATE pgbench_branches (ms) 0.0493 0.0499 +1.3
Mean time of UPDATE pgbench_tellers (ms) 0.0479 0.0483 +1.0

Results - disk footprint

Well here the decision is very clear when looking at the resulting auditing table sizes after the tests finished - the grand total of tables representing the traditional / explicit approach are much smaller than the one big generic table! So this is definitely also worth considering if you’re building the next Amazon or such. We shouldn’t only look at absolute numbers here as we had 35% more transactions with the generic approach...so we should also factor that in.

 

Explicit triggers Generic triggers Difference (%) TPS adjusted difference (%)
Total size of auditing tables (in MB) 1578 4255 +170 +109

Note that the table sizes don’t include indexes.

Quering

Not directly related to this test but another important aspect not to be overlooked when planning change tracking: From time to time it’s also needed to actually look into the audit trail...and preferably in a speedy way I assume. After some ad-hoc experimenting with the given schemas, the winner here is quite clear and obvious - the explicit / non-generic audit table approach! It is much better in two areas. Firstly, you can get by without needing to know any JSONB syntax (which will take half an hour to learn for newbies though…) and secondly, according to some quick testing the query performance is also noticeably better for non-trivial queries as the audit trail data starts to amount up and doesn’t fit into shared buffers. Remember - as we saw above, the generic approach uses ~2x more space so for very frequent queries over the whole history it might not be the best idea.

Also you will most probably need more targeted functional indexes on the JSONB column as I wouldn’t recommend to just index everything with a single GiN index - although it would be the simplest way to go, it only makes sense if you really perform searches on the majority of all “columns”.

Anyways it’s hard to say something definitive here as real life requirements will differ vastly I imagine, and the amount of indexes / data will be decisive.

Summary

My personal learning from this test - it's not over until the fat lady sings 🙂 My initial hunch on the performance cost of generic JSONB storing triggers was not correct this time and the performance is actually great! So one shouldn’t be afraid of this approach due to the performance hit...but if at all, rather due to disk “costs”...or due to possible woes when needing to actually query the gathered data in some complex ways.

By the way - if you’re wondering why the generic approach was actually faster, my best bet is that it has to do with (at least) 3 factors:

1) PostgreSQL JSONB implementation is very-very efficient

2) less PL/pgSQL parsing / planning as same trigger code is used three times

3) better CPU level data caching as we have less active / hot data blocks.

And as always - remember that when choosing between different approaches for your own project I always recommend testing with your exact schema / hardware as schemas vary vastly and a lot depends on data types used. This test was just one of the simpler examples.

Bonus idea - generic “differential” change auditing

After finishing my test I started wondering if it would be somehow possible to reduce the disk “cost” downside of this otherwise neat “generic” approach...and it actually wasn’t too hard to put together some trigger code that looks at the row data and logs only columns and values that are changed!

The core idea can be seen in the code below (full sample here) and also the first short testing looked quite promising actually - about 40% of disk space savings compared to the above described generic approach! So this could be definitely an interesting way to go. Thanks for reading!

Read more about PostgreSQL, triggers and performance, see this blog: Why Are My PostgreSQL Updates Getting Slower? By Laurenz Albe.


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

0 0 votes
Article Rating
Subscribe
Notify of
guest
5 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Wolfgang
Wolfgang
4 years ago

Interesting post. Thanks for it.

Just a tiny glitch: "Test host: 4 CPU i5-6600, 16GB RAM... shared_buffers set to 25% of RAM, i.e. 2GB" => 4GB shared buffers or test host with 8GB of RAM?

Kaarel
Kaarel
4 years ago
Reply to  Wolfgang

Ah good catch, shared_buffers was 4GB indeed

Pavel Stěhule
Pavel Stěhule
4 years ago

Looks so the pattern var.* in plpgsql has significant overhead - in my tests about 6-10%

Kaarel
Kaarel
4 years ago
Reply to  Pavel Stěhule

Ok interesting. You mean ~10% for a single usage of such construct vs explicit column list or end result for a similar test? But seems this amount of penalty should still leave the JSONB approach on top still, even when taking more care when writing the normal triggers.

Pavel Stěhule
Pavel Stěhule
4 years ago
Reply to  Kaarel

yes, there is a overhead of plpgsql with passing parameters to INSERT, and there is some CPU overhead related to number of columns. If your bottleneck is CPU, then is better to have less columns than more.

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram