CYBERTEC Logo

Memory context: private memory management in PostgreSQL

06.2023 / Category: / Tags: | |
Pre-sales engineer claims that memory contexts are good to store semantic networks
© Laurenz Albe 2023

 

PostgreSQL uses shared memory for data shared between processes. With the exception of the dynamic shared memory segments used for exchanging data between parallel workers, the server allocates shared memory with a fixed size when it starts. But each PostgreSQL backend process also has to manage private memory to process SQL statements. In this article, I want to describe what a memory context is, how PostgreSQL uses them to manage its private memory, and how you can examine memory usage. This is primarily interesting for people who write PostgreSQL server code, but I want to focus on the perspective of a user trying to understand and debug the memory consumption of an SQL statement.

I got inspired to write about this topic after debugging a memory leak in PostgreSQL.

What is a memory context?

PostgreSQL is written in C, and memory management in C is notoriously tricky. A program has to explicitly release all dynamically allocated memory. As a consequence, it is easy to develop memory leaks by not freeing memory. This can lead to ever-increasing memory consumption, which eventually proves fatal for long-lived processes like PostgreSQL backends.

To reduce the danger of memory leaks, PostgreSQL has implemented its own memory management system: memory contexts. Memory contexts are chunks of memory that can grow on demand. You never directly malloc() memory in PostgreSQL, but you request memory from a memory context. If necessary, PostgreSQL will extend the memory context.

The big advantage of memory contexts is that you can delete them, which frees all the memory at once. That means that you don't have to keep track of all your memory allocations. Simply make sure that you allocate memory in a memory context with the appropriate lifespan, and there is no danger of memory leaking past the end of that lifespan. For example, the PostgreSQL query executor will create an ExecutorState context before it starts processing a statement. If you need memory, you simply allocate it in that memory context. The executor will delete the ExecutorState when it is done, and there is no danger that any memory can leak past the end of the query execution.

For details about the design and usage of memory contexts, read src/backend/utils/mmgr/README in the PostgreSQL source.

Organization of memory contexts

Memory contexts form a hierarchy. The top memory context TopMemoryContext exists for the whole lifetime of a backend process. Any other memory context has a parent memory context. When PostgreSQL deletes a memory context, it will recursively delete all descendent memory contexts. Consequently, the programmer rarely needs to free memory explicitly. If she need several chunks of memory for a shorter time, for example to process a certain step of an execution plan, she can create a new memory context with ExecutorState as parent and delete it when the step is done. If the executor terminates before that, no memory from that memory context can leak.

Important Memory Contexts
Memory Context Description
TopMemoryContext The root of the hierarchy. It is never deleted.
CacheMemoryContext Contains a cache for database metadata and cached query execution plans. Consumes more space if your database contains many objects, for example table partitions, or if you have many prepared statements.
MessageContext Contains the statement from the client and sometimes planning and parsing data.
PortalContext Memory associated with the currently active statement (known as portal or cursor)

How much memory does an SQL statement use?

Unfortunately, that question is not easy to answer. Each step of an execution plan should in theory be limited by work_mem, but often that is not enough to estimate the memory usage:

  • a single statement may have many memory-intense execution steps, so it can allocate work_mem several times
  • if the statement uses parallel query, it could create dynamic shared memory segments that are not bounded by work_mem
  • before PostgreSQL v13, hashes could grow substantially larger than work_mem if the optimizer underestimated the number of entries
  • large data values, such as bytea binary data or large PostGIS geometries, will reside in memory and are not limited by work_mem

There are a few helpers to see how much memory is stored in PostgreSQL memory contexts.

Viewing memory context usage with pg_backend_memory_contexts

The view pg_backend_memory_contexts shows all memory contexts held by the current session.

You can only query this view between statements, but it would be much more useful to see the contents while you are executing an SQL statement. For that, you can create a function and build it into your SQL statement at strategic points:

Logging memory context usage with pg_log_backend_memory_contexts()

There is also the function pg_log_backend_memory_contexts(integer) to write the current state of the memory contexts of an arbitrary session to the log file. The argument is the process ID of the backend process, which you can find in pg_stat_activity. By default, only superusers may call that function, but you can GRANT the EXECUTE privilege on that function to additional roles.

This way, you can easily examine the memory usage of a long-running SQL statement. The problem is that a statement that consumes a lot of memory need not run for a long time, although the two often correlate. It can be tricky to catch a short statement “in the act”.

Logging memory context usage with a debugger

This method gives you precise control over the point in the execution of a statement when you want to examine memory usage. But it requires some familiarity with PostgreSQL code, and attaching a debugger to a backend process of your productive database might not be for the faint of heart. I will use the GNU's gdb debugger.

First, we have to identify the backend process ID of the session we want to examine. I'll use 12345 as an example. Then we attach the debugger to the backend:

Then we set a breakpoint at some point in the PostgreSQL code. One useful function is ExecutorEnd, which PostgreSQL enters at the end of processing a statement.

Now we execute the problematic statement in the session that is being debugged. As soon as execution hits the breakpoint, we trigger a memory context dump:

This writes a memory context dump to the log file. We can now detach from the process and quit gdb:

See this article for a more detailed description of the above technique.

Estimating total memory usage for PostgreSQL

What we have seen so far refers to a single PostgreSQL session. Now a busy database will have many sessions running concurrently, and it is hard to say how many connections there will be and if they will be executing simple or complicated statements. So it is very hard or impossible to predict how much memory a PostgreSQL database is going to use. Christophe Pettus (if you ever get a chance to hear him talk, do!) aptly says that everything you know about setting work_mem is wrong. Naturally, he goes on to suggest his own formula:

50% of free memory + file system buffers divided by the number of connections

(Note that you have to first add, then divide, even though there are no parentheses.) I agree that that is usually a save value to prevent going out of memory. You will note that the number of connections plays a crucial role in this formula. This indicates the importance of using a properly sized connection pool if you want to have good performance. After all, a big enough work_mem setting is the most important requisite for the good performance of non-trivial SQL statements.

Going out of memory in PostgreSQL

Naturally, you don't want to go out of memory. But if it happens, the consequences depend largely on how you configured the operating system kernel (I am only talking about Linux in this section).

With the default configuration, Linux will invoke the “out-of memory killer” if you run out of memory. This unfriendly kernel component will send a SIGKILL to some PostgreSQL processes to unconditionally terminate them and free some memory. As detailed in this article, the untimely death of a PostgreSQL process will cause PostgreSQL to terminate all connections and undergo crash recovery. Crash recovery means downtime until PostgreSQL has recovered all data modifications since the latest checkpoint.

As detailed in the PostgreSQL documentation, the correct way to avoid such a crash is to set the kernel parameter vm.overcommit_memory to 2 (and tune vm.overcommit_ratio). Then you will get a regular “out of memory” error, and PostgreSQL will write a memory context dump to the log file. That memory context dump is very useful for understanding where PostgreSQL did allocate all that memory. Watch out: while you usually catch the misbehaving backend that way, it could also be that the actual culprit almost exhausted the memory, and some other, innocent process gets the error.

Conclusion

It is important to have some concept of how PostgreSQL uses memory contexts to manage its private memory, even if you are not a core developer. That will enable you to understand the memory context dump you get if you run out of memory on a properly configured database server. We also looked at some tools to examine memory context usage with system functions and views.

If you are interested in PostgreSQL resource management, you might want to know how too many subtransactions affect PostgreSQL performance.

 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

0 0 votes
Article Rating
Subscribe
Notify of
guest
1 Comment
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Lakhveer Singh
Lakhveer Singh
10 months ago

Always love reading your blogs. Thanks for sharing!

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    1
    0
    Would love your thoughts, please comment.x
    ()
    x
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram