pg_timetable v4.4 is available immediately!

Our team is proud to introduce the new pg_timetable v4.4 release!

This time we focused on implementing a couple of new features, as well as improving performance.

I want to remind you that pg_timetable is a community project. So, please, don’t hesitate to ask any questions, to report bugs, to star the pg_timetable project, and to tell the world about it.

REST API

The first cool new feature in pg_timetable v4.4 is a built-in web server that provides a REST API. Right now it serves only two endpoints: /liveness and /readiness.

GET /liveness always returns HTTP status code 200, which only indicates that pg_timetable is running, e.g.

$ curl -i localhost:8080/liveness 
HTTP/1.1 200 OK
Date: Wed, 09 Feb 2022 15:11:02 GMT
Content-Length: 0

GET /readiness returns HTTP status code 200 when pg_timetable is running and the scheduler is in the main loop processing chains, e.g.

$ curl -i localhost:8080/readiness
HTTP/1.1 200 OK
Date: Wed, 09 Feb 2022 15:30:25 GMT
Content-Length: 0

While the scheduler is still connecting to the database, creating the database schema, or upgrading it, /readiness returns HTTP status code 503, e.g.

$ curl -i localhost:8080/readiness
HTTP/1.1 503 Service Unavailable
Date: Wed, 09 Feb 2022 15:10:48 GMT
Content-Length: 0

This is useful for monitoring purposes, for example, to perform HTTP health checks. We are planning to add more endpoints to start, stop, reinitialize, restart, and reload the scheduler, and to provide extended monitoring statistics.
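As an illustration of such a health check, here is a minimal Python sketch that polls the readiness endpoint until it answers 200. The helper names probe and wait_until_ready are made up for this example and are not part of pg_timetable. It relies only on the behavior described above: 200 means ready, 503 means still starting.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=2.0):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code (e.g. 503) is what we want.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure: the server is not up (yet).
        return None


def wait_until_ready(get_status, attempts=10, delay=1.0):
    """Call get_status() until it returns 200; True if that happens in time."""
    for attempt in range(attempts):
        if get_status() == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # back off before the next probe
    return False
```

Assuming pg_timetable was started with --rest-port=8080 on the same host, a call could look like wait_until_ready(lambda: probe("http://localhost:8080/readiness")). Passing the status getter as a callable keeps the retry loop independent of the HTTP client.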

The REST API server is disabled by default. You should use the --rest-port command-line parameter to activate it:

$ ./pg_timetable --rest-port=8080 --clientname=loader2 postgresql://scheduler@localhost/timetable
2022-02-09 16:36:08.593 [INFO] [port:8080] Starting REST API server...
2022-02-09 16:36:08.867 [INFO] Database connection established
2022-02-09 16:36:08.875 [INFO] Accepting asynchronous chains execution requests...
2022-02-09 16:36:08.880 [INFO] [count:0] Retrieve scheduled chains to run @reboot
2022-02-09 16:36:08.884 [INFO] [count:2] Retrieve interval chains to run
...

Version Output

For debugging and monitoring purposes, we've added detailed version output in pg_timetable v4.4. You should use the -v, --version command-line argument to force pg_timetable to output the associated version information:

$ pg_timetable.exe -v
pg_timetable:
  Version:      4.4.0
  DB Schema:    00381
  Git Commit:   52e12177d0025b9b01c737cea06048fc350315f5
  Built:        2022-02-07T14:06:57Z

The first line is the version of the binary itself, or the name of the branch if this is a development build. For example, the latest tag of our cybertecpostgresql/pg_timetable Docker image is always built against the master branch, thus the output will be slightly different:

$ docker run --rm cybertecpostgresql/pg_timetable:latest -v
pg_timetable:
  Version:      master
  DB Schema:    00381
  Git Commit:   e67c6872ab9aa91a262aab5b75fb76ea51e050b8
  Built:        2022-02-07T16:01:25+01:00

⚠️ Since the latest tag tracks the master branch, you probably want to pin a stable release tag in production instead.

The database schema line in the output indicates the version of the latest database migration applied. We use the ID of the GitHub issue that caused these changes as an identifier. That helps to quickly locate the history connected with the schema change, e.g. Issue #381.

The Git commit line shows the commit the binary was built from, and the last line shows the exact build time.
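For monitoring scripts it can be handy to read these fields programmatically. The following Python sketch parses the text printed by -v into a dictionary. The function name parse_version_output is hypothetical, and the sketch assumes the "Key: value" layout shown above, which is not a documented stable format.

```python
def parse_version_output(text):
    """Turn the 'Key: value' lines of the -v output into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


# Sample copied from the v4.4.0 output shown in this article.
sample = """pg_timetable:
  Version:      4.4.0
  DB Schema:    00381
  Git Commit:   52e12177d0025b9b01c737cea06048fc350315f5
  Built:        2022-02-07T14:06:57Z
"""

info = parse_version_output(sample)
# The schema value is the zero-padded GitHub issue ID.
issue_id = int(info["DB Schema"])
```

The header line pg_timetable: has no value after the colon, so it is skipped.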

Rewritten active chains handling

It turns out that on highly loaded systems the scheduler inserts too many rows into the system table run_status: one row when a chain starts and another when it finishes. Over time, the table can accumulate so many rows that internal functions take about 2-3 seconds per call, and resource usage grows accordingly.

The whole idea behind run_status was to track active chains so the scheduler won't run new chains if the active number exceeds max_instances.

In fact, we don't need such a detailed table, because the log and execution_log tables already store every step of chain execution.
The run_status table was also designed in a rather complicated way, which allowed it to hold many details. However, managing active (running) chains is logically the same as managing active sessions. So in the new version we replaced run_status with a new active_chain table, which follows the same idea as the active_session table that we already use for sessions.

The idea itself can be described in 3 steps:
1. make the active_chain table UNLOGGED to save space and avoid generating WAL
2. add a row to the active_chain table when a chain starts
3. delete the row from the active_chain table when the chain finishes or fails.
In this way, we can handle several thousand parallel jobs without visible degradation, at least in our test environment.
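The bookkeeping in steps 2 and 3 can be sketched in a few lines of Python. This is a simplified model using an in-memory SQLite database, not the actual pg_timetable implementation. The column names chain_id and client_name and the helpers try_start and finish are assumptions made for illustration. UNLOGGED is a PostgreSQL feature that SQLite cannot express, so step 1 is not modeled.

```python
import sqlite3

# Hypothetical, simplified stand-in for the active_chain table.
# In PostgreSQL this would be created as an UNLOGGED table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE active_chain (chain_id INTEGER NOT NULL, client_name TEXT NOT NULL)"
)


def try_start(chain_id, client_name, max_instances):
    """Register a running chain unless max_instances copies are already active."""
    (running,) = conn.execute(
        "SELECT count(*) FROM active_chain WHERE chain_id = ?", (chain_id,)
    ).fetchone()
    if running >= max_instances:
        return False  # too many active instances, do not start another
    conn.execute("INSERT INTO active_chain VALUES (?, ?)", (chain_id, client_name))
    return True


def finish(chain_id, client_name):
    """Remove one row for the chain, whether it finished or failed."""
    conn.execute(
        "DELETE FROM active_chain WHERE rowid = ("
        "SELECT rowid FROM active_chain"
        " WHERE chain_id = ? AND client_name = ? LIMIT 1)",
        (chain_id, client_name),
    )
```

Because rows exist only while a chain is running, the table stays small no matter how many chains have run in the past, which is the point of the redesign.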

Finally...

There are some more improvements. The full changelog is available on the v4.4 release page. We want to thank all contributors and users for their help.

If you want to contribute to pg_timetable and help to make it better:

⭐give a star to the project,
feel free to open an 🤚issue and ask a 🎓question
or even consider submitting a 📜pull request.

In conclusion, I wish you all the best! ♥️
Please, stay safe – so we can meet in person at one of the conferences, meetups, or training sessions!

Pavlo Golub
