CYBERTEC Logo

What is fillfactor and how does it affect PostgreSQL performance?

07.2020 / Category: / Tags:

By Kaarel Moppel

Recently I was asked if there’s a rule of thumb / best practice for setting the fillfactor in Postgres - and the only answer I could give was to decrease it “a bit” if you do lots and lots of updates on some table. Good advice? Well, it could be better - this kind of coarse quantization leaves a lot of room for interpretation and possible adverse effects. So to have a nicer answer ready for the next time, I thought it would be nice to get some real approximate numbers. Time to conjure up some kind of a test! If you already know what fillfactor does, then feel free to skip to the bottom sections. There you'll find some numbers and a rough recommendation principle.

What is fillfactor?

But first a bit of theory for the newcomers - so what does fillfactor do and how do you configure it? Simon says:

The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default. When you specify a smaller fillfactor, INSERT operations pack table pages only to the indicated percentage. The remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the same page as the original, which is more efficient than placing it on a different page. For a table whose entries are never updated, complete packing is the best choice. However, in heavily updated tables smaller fillfactors are appropriate. This parameter cannot be set for TOAST tables.

Per table parameter

In short, it’s a per table (or index!) parameter that directs Postgres to initially leave some extra disk space on data pages unused. Later UPDATE-s could use it, touching ultimately only one data block, thus speeding up such operations. And besides normal updates it also potentially (depending on if updated columns are indexed or not) enables another special, and even more beneficial, type of update, known as HOT updates. The acronym means Heap Only Tuples, i.e. indexes are not touched.

There's quite a simple base concept: how densely should we initially “pack” the rows? (Many other database systems also offer something similar.) Not much more to it - but how to set it? Sadly (or probably luckily) there’s no global parameter for that and we need to change it “per table”. Via some SQL, like this:

Effects of fillfactor

As the documentation mentions, for heavily updated tables we can gain on transaction performance by reducing the fillfactor (FF). But in what range should we adjust it and how much? Documentation doesn’t take a risk with any numbers here. Based on my personal experience, after slight FF reductions small transaction improvements can usually be observed. Not immediately, but over some longer period of time (days, weeks). And as I haven’t witnessed any cases where it severely harms performance, you can certaily try such FF experiments for busy tables. But in order to get some ballpark numbers, I guess the only way is to set up a test...

Test setup

As per usual, I modified some test scripts I had lying around, that use the default pgbench schema and transactions, which should embody a typical simple OLTP transaction with lots of UPDATE-s...so exactly what we want. The most important parameter (relative to hardware, especially memory) for pgbench transaction performance is the “scale factor”, so here I chose different values covering 3 cases - initial active data set fits almost into RAM, half fits and only a fraction (ca 10%) fits. Tested fillfactor ranges were 70, 80, 90, 100.

Test host: 4 CPU i5-6600, 16GB RAM, SATA SSD

PostgreSQL: v12.2, all defaults except: shared_buffers set to 25% of RAM, i.e. 4GB, checkpoint_completion_target = 0.9, track_io_timing = on, wal_compression = on, shared_preload_libraries=’pg_stat_statements’

Pgbench: scales 1000/2500/10000, 12h runtime for each scale / FF combination, 2 concurrent sessions.

Query latencies were measured directly in the database using the pg_stat_statement extension, so they should be accurate.

By the way, if you wonder why I’m not using the latest Postgres version v12.3 for my testing (which is normally the way to go) - all these combinations took a week to run through and although a new release appeared during that time I thought it’s not worth it as I didn’t see anything relevant from the release notes.

Test results

Performance has many aspects and even for a bit of a simplistic pgbench test we could measure many things - maybe most important for us in this fillfactor context are the frequent updates on our biggest table that we want to optimize. But let’s not also forget about the effect of our fillfactor changes on the Postgres background processes, worse caching rates etc, so for completeness let’s also look at the total Transactions per Seconds numbers. Remember - 1 pgbench default transaction includes 3 updates (2 mini-tables + 1 main), 1 insert into the write-only history table + 1 select from the main table by PK.

So here the effects on pgbench_accounts UPDATE mean time as measured via pg_stat_statements in milliseconds with global TPS in parentheses:

Data scale FF=100  FF=90 FF=80 FF=70
Mem. 5.01 (374) 4.80 (389) 4.89 (382) 4.98 (376)
2x Mem. 5.31 (363) 5.33 (359) 5.38 (357) 5.42 (353)
10x Mem. 6.45 (249) 5.67 (282) 5.61 (284) 5.72 (279)

Deriving a rule of thumb

What can we learn from the test data? Although it seems that there was some slight randomness in the tests (as 2x Mem test actually made things minimally slower), on the whole it seems that decreasing FF a bit also improves performance “a bit”! My general hunch has something to it, even 🙂 On average a 10% boost, when decreasing FF by 10 or 20%. It's not game changing, but it could make a visible difference.

And the second learning - don’t overdo it! As we see that FF 70% clearly deteriorates the update performance instead of improving it, with all scaling factors.

My try at a rule of thumb - when your active / hot data set is a lot bigger than the amount of RAM, going with fillfactor 90% seems to be a good idea. Don’t forget - there are still no free lunches out there. When optimizing for fast updates on table X, we pay some of the winnings back with global TPS numbers. Our tables will be somewhat bigger, we lose a bit on cache hit ratios, and also background workers like the autovacuum daemon have more scanning to do. So for smaller data vs RAM ratios, the benefits are not particularly visible. You might even lose slightly in global TPS numbers.

Some more thoughts - I’m pretty sure that if I had tuned autovacuum to be more aggressive, we would have seen some more performance improvement. In spite of decreasing the FF, the pgbench_accounts table was still growing considerably during the testing (~35% at biggest scale) even at FF 70%. And the same for old spinning disks - the potential for additional winnings with fillfactor tuning is even bigger there compared to SSD-s, because fillfactor helps to reduce exactly the Achilles heel of rotating disks - the costly random access.

Read more about performance improvements:


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

0 0 votes
Article Rating
Subscribe
Notify of
guest
8 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Dave Myron
Dave Myron
1 year ago

If the updates-to-inserts ratio for a single table is 3:1, would that suggest that a fillfactor=25 be a good idea?

laurenz
laurenz
1 year ago
Reply to  Dave Myron

No, that is way too low.
Unless you are bulk updating major parts of the table, it is probably good enough to have space for a few more rows in the block.
Depending on the size of your rows, a fillfactor between 70 and 90 is usually good.

Sulivan Oliveira
Sulivan Oliveira
3 years ago

Hi Kaarel, thank you very much for this post!

Reka putra
Reka putra
3 years ago

HI Kaarel, does fill_factor impact about how many data per Page for every Shared_buffer cache? if it bloat or not, as empty space also get into cache?

laurenz
laurenz
3 years ago
Reply to  Reka putra

Yes, setting a fillfactor of less than 100 will bloat your table, and that bloat will also affect cache efficiency, since you will cache more empty space.
That is the price yo are paying for more HOT updates.

Reka putra
Reka putra
3 years ago
Reply to  laurenz

Ok then, it's same as others rdbms fillfactor, and the table with fillfactor specified. The empty space only maintained after table reorganization/rebuild/ vacuum in postgres, many DML operation the pages will be full again until next vacuum.
Great article with detail explanation and research. It help me understanding postgres internal and saving time to do a research about fillfactor
Thanks.
Reka

Sebastian Dressler
Sebastian Dressler
3 years ago

Hi Kaarel, thanks a lot for this! Could you maybe share your autovacuum config and what you specifically mean by "more aggressive" in that context?

laurenz
laurenz
3 years ago

You may want to read my blog for a discussion of that.

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram