fighting ove max_connections
© Laurenz Albe 2020

In my daily work, I see many databases with a bad setting for max_connections. There is little advice out there for setting this parameter correctly, even though it is vital for the health of a database. So I decided to write up something.

What is max_connections?

According to the documentation, max_connections determines the maximum number of concurrent connections to the database server. But don’t forget that superuser_reserved_connections of these connections are for superusers only (so that superusers can connect even if all other connection slots are blocked). With the default setting of 100 and 3 for these parameters, there are 97 connections open for application purposes.

Since PostgreSQL reserves shared memory for all possible connections at server start, you have to restart PostgreSQL after changing max_connections.

Why do people choose high values for max_connections?

There are several reasons:

  • Since you need a restart to increase max_connections, people want to be “on the safe side”.
  • People got the error
    ERROR: remaining connection slots are reserved for non-replication superuser connections
    and raised the limit so that they don’t get the error anymore.
  • The application developers convinced the DBA that they need many database connections for best performance.

It seems like an easy enough solution to increase the number of connections.

What are the problems of high values for max_connections?

Setting max_connections to a high value can have severe consequences:

Risk of overloading the database

As long as all but three of your 500 database sessions are idle, not much harm is done. Perhaps taking the snapshot at the beginning of each query is a little slower, but you probably won’t notice that.

But there is nothing that can prevent 100 of the connections from becoming active at the same time. Perhaps a particular event occurred (everybody wants to buy a ticket the minute after sales started). Perhaps a rogue statement kept a lock too long and processes ready to run “piled up” behind the lock.

If that happens, your CPU and/or I/O subsystem will be overloaded. The CPU will be busy switching between the processes or waiting for I/O from the storage subsystem, and none of your database sessions will make much progress. The whole system can “melt down” and become unresponsive, so that all you can do is reboot.

Obviously, you don’t want to get into that situation, and it is not going to improve application performance either.

Insufficient resources for each database connection

There is a limited amount of RAM in your machine. If you allow more connections, you can allot less RAM to each connection. Otherwise, you run the danger of running out of memory. The private memory available for each operation during query execution is limited by work_mem. So you will have to set this parameter low if you use a high value for max_connections.

Now work_mem has a direct influence on query performance: sorting will be faster if it can use enough RAM, or PostgreSQL may prefer a faster hash join or hash aggregate and avoid a sort at all.

So setting max_connections high will make queries perform slower than they could, unless you want to risk running out of memory.

An upper limit for max_connections

You want to utilize your resources without overloading the machine. So your setting should satisfy

max_connections < max(num_cores, parallel_io_limit) /
                  (session_busy_ratio * avg_parallelism)
  • num_cores is the number of cores available
  • parallel_io_limit is the number of concurrent I/O requests your storage subsystem can handle
  • session_busy_ratio is the fraction of time that the connection is active executing a statement in the database
  • avg_parallelism is the average number of backend processes working on a single query.

The first two numbers are easy to obtain, and you usually have a good idea if you will use a lot of parallel query or not. But session_busy_ratio can be tricky to estimate. If your workload consists of big analytical queries, session_busy_ratio can be up to 1, and avg_parallelism will be one more than max_parallel_workers_per_gather. If your workload consists of many short statements, session_busy_ratio can be close to 0, and avg_parallelism is 1.

Often you will have to use a load test to determine the best value of max_connections by experiment. The aim is to keep the database busy, but not overloaded.

For some workloads, this formula or an experiment would lead to a high setting for max_connections. But you still shouldn’t do that, as stated above: the more connections you have, the bigger the danger of many connections suddenly becoming active and overloading the machine. The better solution in this case is to use a connection pool to increase session_busy_ratio.

Using a connection pool

A connection pool is a piece of software that keeps a number of persistent database connections open. It uses these connections to handle database requests from the front-end. There are two kinds of connection pools:

  • Connection pools built into the application or the application server. Rather than opening a database connection, application processes request a connection from the pool. When they are done, they return it to the pool rather than closing the connection.
  • External connection pools like pgBouncer. Such a connection pool looks like a like a database server to the front end. SQL statements from the application are executed over a limited number of backend connections to the database.

Connection pools provide an artificial bottleneck by limiting the number of active database sessions. Using them increases the session_busy_ratio. This way, you can get by with a much lower setting for max_connections without unduly limiting your application.

Conclusion

It is important for the health and performance of your application not to set max_connections too high.

If you need to be able to handle many database connections due to architecture limitations (the application server has no connection pool, or there are too many application servers), use a connection pooler like pgBouncer. Then you can give each connection the resources it needs for good performance.