pg_timetable is a brand-new job scheduler for PostgreSQL implemented completely from scratch (in Go). Traditional job scheduling is quite limited and does not cover all we needed in real life. Therefore, we thought about job execution in more detail and came up with a new concept that has not been seen in the PostgreSQL world before. pg_timetable does not simply execute some simple SQL statements periodically. It has been designed to execute entire sequences of operations at a time while offering some other important features we will discuss in this post.
Installing and running pg_timetable
You can run pg_timetable in 2 ways:
- Standalone application
Both methods are described on our Github page so it should be easy for most people to run pg_timetable and to give it a try.
For the sake of simplicity we assume that you have installed our new PostgreSQL job scheduler and everything is up and running. So let’s take a look and see how pg_timetable can be configured.
Base tasks and chains
The core idea is to allow users to execute more complicated sequences of operations. Now what does that mean? Suppose you want to …
- Start a transaction
- Write a log entry to the database
- Download a file
- Import the file
- Run aggregations
- Commit the transaction
A sequence like that is quite common. Let us take a look at those steps in pg_timetable terminology: Each step on the way is a “base task”. There are various types of base tasks:
- Built-in tasks (such as “send email”, start transaction, etc.)
- SQL tasks (some database instruction)
- Shell tasks (some external executables)
Those tasks can be combined and turned into chains. A base task can take parameters which are added at runtime. What does that mean? Suppose you want to send an email once a day. Who is going to receive this email? This kind of information can be passed to chains / base tasks scheduled for execution.
Configuring pg_timetable: An example
Let us take a look at an easy pg_timetable configuration example. The goal is just to make the database sleep periodically. We created a simple config script:
WITH noop(id) AS ( SELECT task_id FROM timetable.base_task WHERE name = 'Sleep' ), chain_insert(chain_id) AS ( INSERT INTO timetable.task_chain (chain_id, parent_id, task_id, run_uid, database_connection, ignore_error) VALUES (DEFAULT, NULL, (SELECT id FROM noop), NULL, NULL, TRUE) RETURNING chain_id ), chain_config(id) AS ( INSERT INTO timetable.chain_execution_config VALUES ( DEFAULT, -- chain_execution_config, (SELECT chain_id FROM chain_insert), -- chain_id, 'sleep every minute', -- chain_name, NULL, -- run_at_minute, NULL, -- run_at_hour, NULL, -- run_at_day, NULL, -- run_at_month, NULL, -- run_at_day_of_week, 1, -- max_instances, TRUE, -- live, FALSE, -- self_destruct, FALSE -- exclusive_execution, ) RETURNING chain_execution_config ) INSERT INTO timetable.chain_execution_parameters (chain_execution_config, chain_id, order_id, value) VALUES ( (SELECT id FROM chain_config), (SELECT chain_id FROM chain_insert), 1, '5' :: jsonb) ;
The WITH statement just selects the ID of a base task called “Sleep”. Each base task has a name and an ID. We will need the ID to schedule those events in the form of a simple chain. In my example, the chain has just one element, so there is no need for parent elements. Also: We simply want to sleep. There is no need for a database connect string.
To do the actual scheduling, we need to insert into timetable.chain_execution_config. The configuration is pretty similar to cron. However, there are a couple of interesting things here:
• max_instances tells us that this job must not run concurrently. One sleep is allowed at a time
• self_destruct = false means that the chain will survive its own execution. If you set this to true, a chain is only executed once (in case it succeeds).
• exclusive_execution means that other chains doing something might basically happen in parallel.
Finally, parameters are added to the chain.
Using pg_timetable concurrency protection
Concurrency protection is an important aspect. Suppose you want to backup a database daily. What is going to happen if the backup takes 60 hours to complete? At some point, your server is going to die because jobs pile up. Or maybe backup files will start to overwrite each other. In short: If you want to make sure that only one backup at a time is allowed, pg_timetable concurrency protection is what you want. The max_instances = 1 setting is the easiest way to do that. There is no more need for pid files, no need for anything else. pg_timetable does it all for you.
Utilizing self-destructive chains
More often than not, it happens that a job should only be done once. The idea is simple: Suppose somebody gets a 7-day free trial. After 7 days, you want to shut down the person’s account. Obviously, shutting down somebody’s account is not done daily – it happens once, and if the process was successful, that was basically it.
Self-destructive chains are especially useful for asynchronous execution. What do I mean by that? Suppose you have a website and you press a button to launch a job that runs for 2 hours. Clearly, nobody is going to sit in front of the browser window and wait. What you do instead is to tell pg_timetable to execute stuff as soon as possible and kill the job as soon as it is ended successfully. Asynchronous execution is the prime use case.
More simplistic jobs
Now I want to show you a more complex example. Let’s take a look at the following file: https://github.com/cybertec-postgresql/pg_timetable/blob/master/samples/ManyTasks.sql
In this case, 500 base tasks are created on the fly and assembled in a chain. The logic is the same as we had before. First, base tasks are selected (or created). Then, they are put into a chain and finally, execution is scheduled before we can at last add parameters to the job. The concept is always the same.
Making use of cron-style configuration
pg_timetable is really powerful, but sometimes you don’t need all those features. Sometimes all you want is to simply run some SQL and nothing more. To make configuration easier, we have introduced a cron-style configuration interface. The following example creates a dummy table and adds a row once in a while:
CREATE TABLE timetable.dummy_log ( log_ID BIGSERIAL, event_name TEXT, timestmp TIMESTAMPTZ DEFAULT TRANSACTION_TIMESTAMP(), PRIMARY KEY (log_ID) ); -- Parameters detail for timetable.job_add() -- task_name: The name of the Task -- task_function: The function which will be executed. -- client_name: The name of worker under which this task will execute -- task_type: Type of the function SQL,SHELL and BUILTIN -- by_cron: Time Schedule in Cron Syntax -- by_minute: This specifies the minutes on which the job is to run -- by_hour: This specifies the hours on which the job is to run -- by_day: This specifies the days on which the job is to run. -- by_month: This specifies the month on which the job is to run -- by_day_of_week: This specifies the day of week (0,7 is sunday) on which the job is to run -- max_instances: The amount of instances that this chain may have running at the same time. -- live: Control if the chain may be executed once it reaches its schedule. -- self_destruct: Self destruct the chain. ----CRON-Style -- * * * * * command to execute -- ┬ ┬ ┬ ┬ ┬ -- │ │ │ │ │ -- │ │ │ │ └──── day of the week (0 - 7) (Sunday to Saturday)(0 and 7 is Sunday); -- │ │ │ └────── month (1 - 12) -- │ │ └──────── day of the month (1 - 31) -- │ └────────── hour (0 - 23) -- └──────────── minute (0 - 59) SELECT timetable.job_add ( task_name => 'cron_Job run after 40th minutes after 2 hour on 27th of every month ', task_function => $$INSERT INTO timetable.dummy_log (event_name) VALUES ('Cron test')$$, client_name => NULL, -- any worker may execute task_type => 'SQL', by_cron => '40 */2 27 * *', by_minute => NULL, by_hour => NULL, by_day => NULL, by_month => NULL, by_day_of_week => NULL, max_instances => 1, live => TRUE, self_destruct => FALSE );
As you can see, all you have to do is to call the “timetable.job_add” function and pass a couple of parameters to the function. pg_timetable will automatically translate things into the internal configuration format for you.
If you like pg_timetable, we are looking forward to receive you feedback, or maybe even feature requests, bug reports and contributions. We also want to point out that commercial support for pg_timetable is available internationally.
If you want to read something else right now, we can recommend our post about pg_permissions which is available here.