PostgreSQL knows a parameter called effective_cache_size. To many this is one of the most mysterious settings in the entire PostgreSQL config. Maybe this is the core reason why many people just completely ignore this setting – in many cases this is not the best idea of all. effective_cache_size can have a serious impact on execution plans and change things dramatically. So, it is definitely a good idea to take this into account and see what kind of difference it can make.
What does effective_cache_size actually do? Well, there is a pretty cool comment in the PostgreSQL optimizer explaining what this is all about:
* We also use a rough estimate "effective_cache_size" of the number of
* disk pages in Postgres + OS-level disk cache. (We can't simply use
* NBuffers for this purpose because that would ignore the effects of
* the kernel's disk cache.)
*
* Obviously, taking constants for these values is an oversimplification,
* but it's tough enough to get any useful estimates even at this level of
* detail. Note that all of these parameters are user-settable, in case
* the default values are drastically off for a particular platform.
The idea is to give the planner an idea of what the operating system is doing and adjust its estimates accordingly. In PostgreSQL the default value of effective_cache_size is 128 MB – on most production system this is by far not enough.
The practical impact of effective_cache_size:
Let us take a look at some practical implications of effective_cache_size. To do so we can create a simple table:
1 2 |
test=# CREATE TABLE t_test (id int4); CREATE TABLE |
Then we can import some data into the table. We do this in a random order to make the impact more visible:
1 2 |
test=# INSERT INTO t_test SELECT * FROM generate_series(1, 2500000) ORDER BY random(); INSERT 0 2500000 |
Let us create an index now:
1 2 |
test=# CREATE INDEX idx_in ON t_test (id); CREATE INDEX |
As I have stated before, the default value of effective_cache_size is 128 MB. We can set this to 1 MB on the fly (for our session only):
1 2 |
test=# SET effective_cache_size TO '1 MB'; SET |
To look for the lowest 10 numbers we can use the following query:
1 2 3 4 5 6 |
test=# explain SELECT * FROM t_test ORDER BY id LIMIT 10; QUERY PLAN -------------------------------------------------------------------------------------------- Limit (cost=0.00..39.97 rows=10 width=4) -> Index Only Scan using idx_in on t_test (cost=0.00..9992553.14 rows=2500000 width=4) (2 rows) |
As you can see costs of this query are estimated at 39.97 penalty points.
What happens if we change effective_cache_size to an insanely high value?
1 2 3 4 5 6 7 8 9 |
test=# SET effective_cache_size TO '10000 MB'; SET test=# explain SELECT * FROM t_test ORDER BY id LIMIT 10; QUERY PLAN ------------------------------------------------------------------------------------------- Limit (cost=0.00..0.44 rows=10 width=4) -> Index Only Scan using idx_in on t_test (cost=0.00..109180.31 rows=2500000 width=4) (2 rows) |
As you can see the costs will drop dramatically. This makes sense because if we don't expect the kernel to cache any data if we got only 1 MB of RAM – however, we expect the cache hit rate on the kernel side to up dramatically if we can expect things to be cached by the OS. Random I/O is the most expensive thing and changing this cost parameter has serious impacts on what the planner believes. Just imagine a more complex query – different cost estimates can lead to totally different plans.
+43 (0) 2622 93022-0
office@cybertec.at
What if increase effective_cache_size makes worsening in execution plains? Maybe you can reduce the cost but i have experience with a coup of queries whose time execution worsened without any other cause.
I don't get what the conclusion in terms of "which value to choose for xy amount of available RAM" is here. I doubt effective_cache_size = '10000 MB' is the recommendation.
Since the only difference in the query plans in the example is the estimated cost I assume that the performance would actually be the same? In this instance the estimated cost has not changed the selected plan?