CYBERTEC Logo

Grafana dashboard for pg_timetable scheduled jobs

01.2024 / Category: / Tags: | |

As a little boy, I dreamed of becoming a supersonic jet pilot or an astronaut. I also yearned to pilot trains, subways, trolleybuses, trams, and other vessels. I remember stealing pot lids from my grandparent's kitchen to make a steering wheel. And also all sorts of devices and tools to emulate the cockpit dashboard. My childish mind easily turned a hammer into a landing gear release lever. Clothes pegs quickly served as switches to open the airlocks.

That could be one of the reasons why I have fun times being the pgwatch maintainer. After all, one of the duties is to maintain Grafana dashboards, which are widely used for monitoring and visualization. Today, I want to introduce an updated dashboard to monitor pg_timetable scheduled jobs.

Grafana dashboard to monitor pg_timetable jobs

Grafana dashboard overview

Let me describe all the elements on this dashboard.

Client name

Select client name in Grafana dashboard for pg_timetable

The "Client name" drop-down list allows you to filter data only for selected pg_timetable clients. By default, all client sessions are displayed.

Schedulers running

The "Schedulers running" panel indicates in "real-time" how many pg_timetable instances are running and connected to the current database. This panel is an excellent choice to create an alert based on it. If you prefer the direct query, then use this one:

Total chains vs Enabled chains

The "Total chains" panel shows you how many chain definitions are available in the system, including paused. One can get this value simply by executing:

On the contrary, the "Enabled chains" panel displays how many jobs the client can execute. If the "Client name" value is "All," this panel indicates the total number of all non-paused chains. This query allows you to get such a piece of information directly:

Please note that if client_name is null here, all running schedulers can execute such a chain.

All following panels are affected by the "Client name" and the time interval selected in the top bar, e.g., "Last 1 hour".

Total task executions

The "Total task executions" panel represents the number of tasks executed by the specified client(s) during the selected time frame. Pay attention; we are talking about tasks here, not chains. We want such a granularity here because some chains may consist of several tasks. To get this value directly, one can run such a query:

We examine timetable.execution_log here. That means the client_name column will definitely indicate the runner. The last_run column holds the start time of a task. If you're looking for the time when a task was finished, use a corresponding finished column.

Total task errors

The "Total task errors" panel shows how many tasks the selected client(s) failed during the time frame chosen. This number, however, doesn't include failures of tasks with the timetable.task.ignore_error column set to true. Remember to tune the panel thresholds. By default, 0 (no errors) is shown as green, 1-10 errors are colored orange, and more than that are painted red.
To get this information directly, use such the query:

Average task duration

The "Average task duration" panel displays the average duration of successful tasks during the selected interval. There are no predefined thresholds set for this panel. You should probably apply some based on your specific workflows. To collect this value, we use the query like this:

The "Task executions (total vs error)" graph provides detailed execution information for the selected time interval. The visualization consists of bars indicating a heartbeat of the pg_timetable equal to 1 minute. For example, from the graph provided above, we can tell:

  1. We have no information about the system before 11:23. Either logs were truncated at that time, or the system was initialized at this time.
  2. From 11:23 till 12:00, the schedulers were producing nearly 500 task executions each minute (green line), and roughly half of the executions returned ignored (silenced) errors (blue line).
  3. There were 102 non-ignored errors in total, according to the "Total task error" panel. However, we cannot observe it on the graph (red line) due to the scale.

If you want to examine these trends in your favorite SQL client, start with such code snippet:

The bottom last "Task executions errors" grid lets you see all non-ignored task errors for the selected interval. This grid is a detailed view of the previously discussed "Total task errors" panel.

Finally...

If you want to contribute to pg_timetable and help to make it better:

In conclusion, I wish you all the best! 💙💛
See you soon in person at one of the conferences, meetups, or training sessions!

0 0 votes
Article Rating
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments
CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram