Monitoring Performance for PostgreSQL with Citus

In the fast-paced world of data management, scalability rules supreme and Citus plays an ever greater role. The question now is: How can we see what is going on inside Citus? How can businesses leverage monitoring technology to optimize their PostgreSQL database performance?

Table of Contents

Monitoring PostgreSQL and Citus

Database monitoring is important regardless of the extension you are using. The best way to find performance bottlenecks in PostgreSQL is definitely to use pg_stat_statements. I've written about detecting PostgreSQL performance problems in the past.

pg_stat_statements is the single most powerful tool to spot slow queries and to ensure that you have all the insights you need to actually react to bad performance.

However, if Citus comes into the picture, there is a bit more you have to know in order to inspect performance problems.

Configuring Citus performance monitoring

Enabling pg_stat_statements should be done on any PostgreSQL database deployment. The performance overhead is basically non-existent and therefore pg_stat_statements is truly a must.

When Citus enters the picture, we have to change some additional variables. The most important one is citus.stat_statements_track, which should be set to all. Here's how it works:

postgres=# ALTER SYSTEM 
SET citus.stat_statements_track TO 'all';
ALTER SYSTEM
postgres=# SELECT pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

postgres=# ALTER SYSTEM

SET citus.stat_statements_track TO 'all';

ALTER SYSTEM

postgres=# SELECT pg_reload_conf();

pg_reload_conf

----------------

(1 row)

After reconnecting to the database, we'll be able to see information. In general, we also recommend turning on track_io_timing in PostgreSQL to gain some more insights about the time needed to perform I/O operations:

postgres=# ALTER SYSTEM 
SET track_io_timing TO on;
ALTER SYSTEM
postgres=# SELECT pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

postgres=# ALTER SYSTEM

SET track_io_timing TO on;

ALTER SYSTEM

postgres=# SELECT pg_reload_conf();

pg_reload_conf

----------------

(1 row)

A restart is not needed in this context. Sending a signal (“reload”) is enough for all new connections coming in.

Checking Citus performance data

When Citus is enabled in your platform, you can call a set returning function which returns a list of all Citus-related queries along with some more information:

postgres=# x
Expanded display is on.
postgres=# SELECT * FROM citus_stat_statements();
-[ RECORD 1 ]-+---------------------------------------
queryid       | 47780130472803892
userid        | 10
dbid          | 5
query         | SELECT count(*) FROM t_row_timeseries;
executor      | 1
partition_key | 
calls         | 1
-[ RECORD 2 ]-+---------------------------------------
queryid       | 7664479320327094882
userid        | 10
dbid          | 5
query         | SELECT count(*) FROM t_col_timeseries;
executor      | 1
partition_key | 
calls         | 3

postgres=# x

Expanded display is on.

postgres=# SELECT * FROM citus_stat_statements();

-[ RECORD 1 ]-+---------------------------------------

queryid | 47780130472803892

userid | 10

dbid | 5

query | SELECT count(*) FROM t_row_timeseries;

executor | 1

partition_key |

calls | 1

-[ RECORD 2 ]-+---------------------------------------

queryid | 7664479320327094882

userid | 10

dbid | 5

query | SELECT count(*) FROM t_col_timeseries;

executor | 1

partition_key |

calls | 3

The important information here is the query ID. It is identical to what we can find when taking a look at pg_stat_statements. We can use an IN-statement to figure out all Citus-related queries and extract them from pg_stat_statements, as shown in the next listing:

postgres=# SELECT * 
FROM 	 pg_stat_statements 
WHERE queryid IN (SELECT queryid 
  FROM   citus_stat_statements() );
-[ RECORD 1 ]----------+---------------------------------------
userid                 | 10
dbid                   | 5
toplevel               | t
queryid                | 7664479320327094882
query                  | SELECT count(*) FROM t_col_timeseries;
plans                  | 0
total_plan_time        | 0
min_plan_time          | 0
max_plan_time          | 0
mean_plan_time         | 0
stddev_plan_time       | 0
calls                  | 6
total_exec_time        | 5739.103502999999
min_exec_time          | 948.513001
max_exec_time          | 964.7458330000001
mean_exec_time         | 956.5172505
stddev_exec_time       | 5.902929807416584
rows                   | 6
shared_blks_hit        | 172
shared_blks_read       | 0
shared_blks_dirtied    | 0
shared_blks_written    | 0
local_blks_hit         | 0
local_blks_read        | 0
local_blks_dirtied     | 0
local_blks_written     | 0
temp_blks_read         | 0
temp_blks_written      | 0
blk_read_time          | 0
blk_write_time         | 0
temp_blk_read_time     | 0
temp_blk_write_time    | 0
wal_records            | 0
wal_fpi                | 0
wal_bytes              | 0
jit_functions          | 0
jit_generation_time    | 0
jit_inlining_count     | 0
jit_inlining_time      | 0
jit_optimization_count | 0
jit_optimization_time  | 0
jit_emission_count     | 0
jit_emission_time      | 0
-[ RECORD 2 ]----------+---------------------------------------
userid                 | 10
dbid                   | 5
toplevel               | t
queryid                | 47780130472803892
query                  | SELECT count(*) FROM t_row_timeseries;
plans                  | 0
total_plan_time        | 0
min_plan_time          | 0
max_plan_time          | 0
mean_plan_time         | 0
stddev_plan_time       | 0
calls                  | 4
total_exec_time        | 23872.893667
min_exec_time          | 3022.5416250000003
max_exec_time          | 14483.156667
mean_exec_time         | 5968.223416749999
stddev_exec_time       | 4916.539866467654
rows                   | 4
shared_blks_hit        | 77
shared_blks_read       | 0
shared_blks_dirtied    | 0
shared_blks_written    | 0
local_blks_hit         | 0
local_blks_read        | 0
local_blks_dirtied     | 0
local_blks_written     | 0
temp_blks_read         | 0
temp_blks_written      | 0
blk_read_time          | 0
blk_write_time         | 0
temp_blk_read_time     | 0
temp_blk_write_time    | 0
wal_records            | 0
wal_fpi                | 0
wal_bytes              | 0
jit_functions          | 0
jit_generation_time    | 0
jit_inlining_count     | 0
jit_inlining_time      | 0
jit_optimization_count | 0
jit_optimization_time  | 0
jit_emission_count     | 0
jit_emission_time      | 0

postgres=# SELECT *

FROM pg_stat_statements

WHERE queryid IN (SELECT queryid

FROM citus_stat_statements() );

-[ RECORD 1 ]----------+---------------------------------------

userid | 10

dbid | 5

toplevel | t

queryid | 7664479320327094882

query | SELECT count(*) FROM t_col_timeseries;

plans | 0

total_plan_time | 0

min_plan_time | 0

max_plan_time | 0

mean_plan_time | 0

stddev_plan_time | 0

calls | 6

total_exec_time | 5739.103502999999

min_exec_time | 948.513001

max_exec_time | 964.7458330000001

mean_exec_time | 956.5172505

stddev_exec_time | 5.902929807416584

rows | 6

shared_blks_hit | 172

shared_blks_read | 0

shared_blks_dirtied | 0

shared_blks_written | 0

local_blks_hit | 0

local_blks_read | 0

local_blks_dirtied | 0

local_blks_written | 0

temp_blks_read | 0

temp_blks_written | 0

blk_read_time | 0

blk_write_time | 0

temp_blk_read_time | 0

temp_blk_write_time | 0

wal_records | 0

wal_fpi | 0

wal_bytes | 0

jit_functions | 0

jit_generation_time | 0

jit_inlining_count | 0

jit_inlining_time | 0

jit_optimization_count | 0

jit_optimization_time | 0

jit_emission_count | 0

jit_emission_time | 0

-[ RECORD 2 ]----------+---------------------------------------

userid | 10

dbid | 5

toplevel | t

queryid | 47780130472803892

query | SELECT count(*) FROM t_row_timeseries;

plans | 0

total_plan_time | 0

min_plan_time | 0

max_plan_time | 0

mean_plan_time | 0

stddev_plan_time | 0

calls | 4

total_exec_time | 23872.893667

min_exec_time | 3022.5416250000003

max_exec_time | 14483.156667

mean_exec_time | 5968.223416749999

stddev_exec_time | 4916.539866467654

rows | 4

shared_blks_hit | 77

shared_blks_read | 0

shared_blks_dirtied | 0

shared_blks_written | 0

local_blks_hit | 0

local_blks_read | 0

local_blks_dirtied | 0

local_blks_written | 0

temp_blks_read | 0

temp_blks_written | 0

blk_read_time | 0

blk_write_time | 0

temp_blk_read_time | 0

temp_blk_write_time | 0

wal_records | 0

wal_fpi | 0

wal_bytes | 0

jit_functions | 0

jit_generation_time | 0

jit_inlining_count | 0

jit_inlining_time | 0

jit_optimization_count | 0

jit_optimization_time | 0

jit_emission_count | 0

jit_emission_time | 0

Together, pg_stat_statements and citus_stat_statements() will reveal all the important information we need to detect performance problems. Once all the data has been collected, it is up to the database engineer to make sense of the data, and adjust queries accordingly.

Optimizing our monitoring configuration

When you configure PostgreSQL and Citus for query performance inspection, there are two more relevant parameters: The purge interval and the maximum number of statements. Let's take a look:

postgres=# SHOW citus.stat_statements_purge_interval;
 citus.stat_statements_purge_interval 
--------------------------------------
 10ms
(1 row)

postgres=# SHOW citus.stat_statements_max ;
 citus.stat_statements_max 
---------------------------
 50000
(1 row)

postgres=# SHOW citus.stat_statements_purge_interval;

citus.stat_statements_purge_interval

--------------------------------------

10ms

(1 row)

postgres=# SHOW citus.stat_statements_max ;

citus.stat_statements_max

---------------------------

50000

(1 row)

The first variable means that performance data is synced every 10 milliseconds. In most deployments, this is totally fine and we didn't change that setting in our projects. The citus.stat_statements_max variable is similar to what pg_stat_statements has to offer. Because of the way PostgreSQL tracks data, this is a static variable. It defines how many different types of statements will be tracked. In case an excess number of types of statements exist, PostgreSQL will purge the less frequent ones. However, in real life deployments, using the default setting (50000) is totally fine.

Finally …

Check out our other Citus blogs.
If you'd like to learn more about performance, see our latest performance content here.
In case you need any assistance, please feel free to contact us.

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

Monitoring Performance for PostgreSQL with Citus

Monitoring PostgreSQL and Citus

Configuring Citus performance monitoring

Checking Citus performance data

Optimizing our monitoring configuration

Finally …

Leave a Reply Cancel reply

Hans-Jürgen Schönig

Blog Tags

NEWSLETTER

Articles by our PostgreSQL Experts