CYBERTEC PostgreSQL Logo

Monitoring Performance for PostgreSQL with Citus

09.2023 / Category: , / Tags: | |

In the fast-paced world of data management, scalability rules supreme and Citus plays an ever greater role. The question now is: How can we see what is going on inside Citus? How can businesses leverage monitoring technology to optimize their PostgreSQL database performance?

Monitoring PostgreSQL and Citus

Database monitoring is important regardless of the extension you are using. The best way to find performance bottlenecks in PostgreSQL is definitely to use pg_stat_statements. I've written about detecting PostgreSQL performance problems in the past.

pg_stat_statements is the single most powerful tool to spot slow queries and to ensure that you have all the insights you need to actually react to bad performance.

However, if Citus comes into the picture, there is a bit more you have to know in order to inspect performance problems.

Configuring Citus performance monitoring

Enabling pg_stat_statements should be done on any PostgreSQL database deployment. The performance overhead is basically non-existent and therefore pg_stat_statements is truly a must.

When Citus enters the picture, we have to change some additional variables. The most important one is citus.stat_statements_track, which should be set to all. Here's how it works:

After reconnecting to the database, we'll be able to see information. In general, we also recommend turning on track_io_timing in PostgreSQL to gain some more insights about the time needed to perform I/O operations:

A restart is not needed in this context. Sending a signal (“reload”) is enough for all new connections coming in.

Checking Citus performance data

When Citus is enabled in your platform, you can call a set returning function which returns a list of all Citus-related queries along with some more information:

The important information here is the query ID. It is identical to what we can find when taking a look at pg_stat_statements. We can use an IN-statement to figure out all Citus-related queries and extract them from pg_stat_statements, as shown in the next listing:

Together, pg_stat_statements and citus_stat_statements() will reveal all the important information we need to detect performance problems. Once all the data has been collected, it is up to the database engineer to make sense of the data, and adjust queries accordingly.

Optimizing our monitoring configuration

When you configure PostgreSQL and Citus for query performance inspection, there are two more relevant parameters: The purge interval and the maximum number of statements. Let's take a look:

The first variable means that performance data is synced every 10 milliseconds. In most deployments, this is totally fine and we didn't change that setting in our projects. The citus.stat_statements_max variable is similar to what pg_stat_statements has to offer. Because of the way PostgreSQL tracks data, this is a static variable. It defines how many different types of statements will be tracked. In case an excess number of types of statements exist, PostgreSQL will purge the less frequent ones. However, in real life deployments, using the default setting (50000) is totally fine.

Finally …

 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram