In the fast-paced world of data management, scalability rules supreme and Citus plays an ever greater role. The question now is: How can we see what is going on inside Citus? How can businesses leverage monitoring technology to optimize their PostgreSQL database performance?

Monitoring PostgreSQL and Citus

Database monitoring is important regardless of the extension you are using. The best way to find performance bottlenecks in PostgreSQL is definitely to use pg_stat_statements. I’ve written about detecting PostgreSQL performance problems in the past.

pg_stat_statements is the single most powerful tool to spot slow queries and to ensure that you have all the insights you need to actually react to bad performance.

However, if Citus comes into the picture, there is a bit more you have to know in order to inspect performance problems.

Configuring Citus performance monitoring

Enabling pg_stat_statements should be done on any PostgreSQL database deployment. The performance overhead is basically non-existent and therefore pg_stat_statements is truly a must.

When Citus enters the picture, we have to change some additional variables. The most important one is citus.stat_statements_track, which should be set to all. Here’s how it works:

postgres=# ALTER SYSTEM 
SET citus.stat_statements_track TO 'all';
ALTER SYSTEM
postgres=# SELECT pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

After reconnecting to the database, we’ll be able to see information. In general, we also recommend turning on track_io_timing in PostgreSQL to gain some more insights about the time needed to perform I/O operations:

postgres=# ALTER SYSTEM 
SET track_io_timing TO on;
ALTER SYSTEM
postgres=# SELECT pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

A restart is not needed in this context. Sending a signal (“reload”) is enough for all new connections coming in.

Checking Citus performance data

When Citus is enabled in your platform, you can call a set returning function which returns a list of all Citus-related queries along with some more information:

postgres=# \x
Expanded display is on.
postgres=# SELECT * FROM citus_stat_statements();
-[ RECORD 1 ]-+---------------------------------------
queryid       | 47780130472803892
userid        | 10
dbid          | 5
query         | SELECT count(*) FROM t_row_timeseries;
executor      | 1
partition_key | 
calls         | 1
-[ RECORD 2 ]-+---------------------------------------
queryid       | 7664479320327094882
userid        | 10
dbid          | 5
query         | SELECT count(*) FROM t_col_timeseries;
executor      | 1
partition_key | 
calls         | 3

The important information here is the query ID. It is identical to what we can find when taking a look at pg_stat_statements. We can use an IN-statement to figure out all Citus-related queries and extract them from pg_stat_statements, as shown in the next listing:

postgres=# SELECT * 
FROM 	 pg_stat_statements 
WHERE queryid IN (SELECT queryid 
  FROM   citus_stat_statements() );
-[ RECORD 1 ]----------+---------------------------------------
userid                 | 10
dbid                   | 5
toplevel               | t
queryid                | 7664479320327094882
query                  | SELECT count(*) FROM t_col_timeseries;
plans                  | 0
total_plan_time        | 0
min_plan_time          | 0
max_plan_time          | 0
mean_plan_time         | 0
stddev_plan_time       | 0
calls                  | 6
total_exec_time        | 5739.103502999999
min_exec_time          | 948.513001
max_exec_time          | 964.7458330000001
mean_exec_time         | 956.5172505
stddev_exec_time       | 5.902929807416584
rows                   | 6
shared_blks_hit        | 172
shared_blks_read       | 0
shared_blks_dirtied    | 0
shared_blks_written    | 0
local_blks_hit         | 0
local_blks_read        | 0
local_blks_dirtied     | 0
local_blks_written     | 0
temp_blks_read         | 0
temp_blks_written      | 0
blk_read_time          | 0
blk_write_time         | 0
temp_blk_read_time     | 0
temp_blk_write_time    | 0
wal_records            | 0
wal_fpi                | 0
wal_bytes              | 0
jit_functions          | 0
jit_generation_time    | 0
jit_inlining_count     | 0
jit_inlining_time      | 0
jit_optimization_count | 0
jit_optimization_time  | 0
jit_emission_count     | 0
jit_emission_time      | 0
-[ RECORD 2 ]----------+---------------------------------------
userid                 | 10
dbid                   | 5
toplevel               | t
queryid                | 47780130472803892
query                  | SELECT count(*) FROM t_row_timeseries;
plans                  | 0
total_plan_time        | 0
min_plan_time          | 0
max_plan_time          | 0
mean_plan_time         | 0
stddev_plan_time       | 0
calls                  | 4
total_exec_time        | 23872.893667
min_exec_time          | 3022.5416250000003
max_exec_time          | 14483.156667
mean_exec_time         | 5968.223416749999
stddev_exec_time       | 4916.539866467654
rows                   | 4
shared_blks_hit        | 77
shared_blks_read       | 0
shared_blks_dirtied    | 0
shared_blks_written    | 0
local_blks_hit         | 0
local_blks_read        | 0
local_blks_dirtied     | 0
local_blks_written     | 0
temp_blks_read         | 0
temp_blks_written      | 0
blk_read_time          | 0
blk_write_time         | 0
temp_blk_read_time     | 0
temp_blk_write_time    | 0
wal_records            | 0
wal_fpi                | 0
wal_bytes              | 0
jit_functions          | 0
jit_generation_time    | 0
jit_inlining_count     | 0
jit_inlining_time      | 0
jit_optimization_count | 0
jit_optimization_time  | 0
jit_emission_count     | 0
jit_emission_time      | 0

Together, pg_stat_statements and citus_stat_statements() will reveal all the important information we need to detect performance problems. Once all the data has been collected, it is up to the database engineer to make sense of the data, and adjust queries accordingly.

Optimizing our monitoring configuration

When you configure PostgreSQL and Citus for query performance inspection, there are two more relevant parameters: The purge interval and the maximum number of statements. Let’s take a look:

postgres=# SHOW citus.stat_statements_purge_interval;
 citus.stat_statements_purge_interval 
--------------------------------------
 10ms
(1 row)

postgres=# SHOW citus.stat_statements_max ;
 citus.stat_statements_max 
---------------------------
 50000
(1 row)

The first variable means that performance data is synced every 10 milliseconds. In most deployments, this is totally fine and we didn’t change that setting in our projects. The citus.stat_statements_max variable is similar to what pg_stat_statements has to offer. Because of the way PostgreSQL tracks data, this is a static variable. It defines how many different types of statements will be tracked. In case an excess number of types of statements exist, PostgreSQL will purge the less frequent ones. However, in real life deployments, using the default setting (50000) is totally fine.

Finally …

 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.