By Kaarel Moppel
The functionality of using table partitions to speed up queries and make tables more manageable as data amounts grow has been available in Postgres for a long time already. Nicer declarative support became available from v10 on – so in general it’s a known technique for developers. But what is not so uniformly clear is the way how low-level partition management is done. Postgres leaves it up to users, but no real standard tools or even concepts have emerged.
So it happened again that a customer approached me with a plan to use partitions for an action logging use case. The customer couldn’t find a nice tool to his liking after a session of googling and was looking for some advice. Sure I know some tools. But my actual advice seemed quite unconventional and surprising for him: Why all the craziness with tools for this simple task? Why not throw together and use something simple yourself?
The problem with tools
Of course tools in a broader sense are necessary and great, we wouldn’t be here today without our stone axes. Software developers thrive on the skillful use of editors, programming languages, etc. But the thing with tools is that you should really know when you actually need to use one! And when it comes to simple, isolated tasks I’d argue that it’s better to delay picking one up until you absolutely need one, and understand the “problem space” on a sufficiently good level.
Adopting some deeply integrated software tools also bring risks to the table – like the added additional complexity. When something then goes sour you’re usually up for some serious sweating / swearing as you quickly need to gain a deep understanding of the tool’s internals (as it was mostly set up some X months or years ago and who remembers all this stuff). That’s a tough thing to do when the production database is unable to process user queries and your manager is breathing down your neck.
Also, most tools or products in the PostgreSQL realm are in the end just GitHub repos with code, but without the associated business model and support services behind them. Which means you can forget about any guarantees / SLA-s whatsoever unless you’re covered by some Postgres support provider. Even then not all 3rd party tools / extensions fall under the SLA, typically.
But sure – if you evaluate the associated dangers, learn the basics, evaluate the project liveliness (a must for Open Source tools I think) you could look for some tools that more or less automatically take care of the tasks needed. For PostgreSQL partitioning tasks, by the way, I think the most common ones are pg_partman and the very similarly named pg_pathman.
Rolling your own
Disclaimer – this approach can only be recommended if you’re familiar with the problem space and have some programming skills, or actually rather SQL skills when it comes to database partitioning. What I wanted to show with this blog post is that it’s not really hard to do. That’s true at least for the simple task of table partitioning, and I personally prefer this way over some more complex tools, where by the time you’re done with the README and digested all the concepts and configuration options, you could have actually written and tested the whole code needed. So below find some sample code describing one approach for how to implement your own time-based (the most common partition scheme I believe) partition management – i.e. pre-creating new partitions for future time ranges and dropping old partitions.
Our sample business schema
From the business logic side let’s assume here that we have a schema for capturing millions of user interaction events per day (not uncommon for bigger shops), but luckily we need to store those events only for 6 months – a perfect use case for partitions!
CREATE TABLE event ( created_on timestamptz NOT NULL DEFAULT now(), user_id int8 NOT NULL, data jsonb NOT NULL ) PARTITION BY RANGE (created_on); CREATE INDEX ON event USING brin (created_on); CREATE EXTENSION IF NOT EXISTS btree_gin; CREATE INDEX ON event USING gin (user_id);
Now let’s create the first partitions manually.
That way we can get going and send to application to QA and have some time to figure out how we want to solve the future automation part. NB! Note that I’m creating a separate schema for the sub-partitions! This is actually a best practice, when it’s foreseeable that the amount of partitions is going to be more than a dozen or so (not the case for our demo schema though), so that they’re out of sight if we inspect our schema with some query tool and table access is still going to happen over the main table so partitions can be considered implementation details.
-- partitions for the current and next month CREATE SCHEMA subpartitions; CREATE TABLE event_y2020m05 PARTITION OF event FOR VALUES FROM ('2020-05-01') TO ('2020-06-01');
Sample partition management code
So now to the most important bits – creating new partitions and dropping old ones. The below query generates as SQL so that, when executed (for example by uncommenting the ‘\gexec’ directive when using psql), it pre-creates a partition for the next month. When run as a weekly Cron, it should be enough.
WITH q_last_part AS ( select /* extract partition boundaries and take the last one */ *, ((regexp_match(part_expr, $$ TO \('(.*)'\)$$)))::timestamptz as last_part_end from ( select /* get all current subpartitions of the 'event' table */ format('%I.%I', n.nspname, c.relname) as part_name, pg_catalog.pg_get_expr(c.relpartbound, c.oid) as part_expr from pg_class p join pg_inherits i ON i.inhparent = p.oid join pg_class c on c.oid = i.inhrelid join pg_namespace n on n.oid = c.relnamespace where p.relname = 'event' and p.relkind = 'p' ) x order by last_part_end desc limit 1 ) SELECT format($$CREATE TABLE IF NOT EXISTS subpartitions.event_y%sm%s PARTITION OF event FOR VALUES FROM ('%s') TO ('%s')$$, extract(year from last_part_end), lpad((extract(month from last_part_end))::text, 2, '0'), last_part_end, last_part_end + '1month'::interval) AS sql_to_exec FROM q_last_part; -- \gexec
And SQL for dropping old ones:
SELECT format('DROP TABLE IF EXISTS %s', subpartition_name) as sql_to_exec FROM ( SELECT format('%I.%I', n.nspname, c.relname) AS subpartition_name, ((regexp_match(pg_catalog.pg_get_expr(c.relpartbound, c.oid), $$ TO \('(.*)'\)$$)))::timestamptz AS part_end FROM pg_class p JOIN pg_inherits i ON i.inhparent = p.oid JOIN pg_class c ON c.oid = i.inhrelid JOIN pg_namespace n ON n.oid = c.relnamespace WHERE p.relname = 'event' AND p.relkind = 'p' AND n.nspname = 'subpartitions' ) x WHERE part_end < current_date - '6 months'::interval ORDER BY part_end;
For real production use I’d recommend wrapping such things into nice, descriptively named stored procedures. Then, call them from good old Cron jobs or maybe also use the pg_cron extension if you have it installed already, or try out pg_timetable (our job scheduling software with advanced rules / chaining) or similar. Also don’t forget to set up some alerting and re-try mechanisms for production usage, as some hiccups will sooner or later appear – table locking issues on partition attaching / dropping, accidentally killed sessions etc.
And for multi-role access models I’d also recommend setting up reasonable DEFAULT PRIVILEGES for your schemas involved. That way all necessary application users will automatically gain correct access rights to new subpartition tables. This is not actually directly related to partitioning and is a good thing to set up for any schema.
Some more ideas to manage partitions
For a lot of environments the simple task of running periodic jobs is actually not so simple – for example the environment might be very dynamic, with database and scheduling nodes jumping around the infrastructure and connectivity might become a problem, scheduling nodes might have different criticality classes and thus no access, etc. For such cases there are also some alternative partitioning tricks that can be applied. Mostly they look something like that:
* Calling some partition management function with every insert
If the volume of inserts is not too great it’s more or less OK to call the partition management code transparently in the background via triggers. The trigger will then look a bit into the future and pre-create the next sub-partitions on-the-fly, if not already existing – applying some dynamic SQL techniques usually.
For non-native (pre v10, inheritance based) partitioning you could also look into the actual row (see here for a code sample) and verify / create only the required partitions, but the newer declarative partition mechanism doesn’t allow “before” triggers, so you need to look into the future a bit.
* Calling the partition management functions only for some transactions
Although from a performance point of view the previous method does not hurt as much as it might seem, as system catalog info is highly likely to be cached and mostly we don’t also need to create or drop any partitions, but under heavy volumes it could still start to hurt so a better idea might be to apply the whole partition management procedure only for a fraction of rows – most commonly by using the builtin random() function. Note that in this case a good trick is to apply the randomness filter (0.1% might be a good starting choice) at trigger definition level (the WHEN clause), not at trigger function level – this again saves some CPU cycles.
To formulate some kind of a “takeaway” for this post – we saw above that partition management can basically be implemented with only 2 SQL statements (without validation and retry mechanisms) – so why tie yourself to some 3rd party project X that basically doesn’t give you any guarantees, needs maybe installation of an extension and might make you wait before adding compatibility for some newly released PostgreSQL version?
We are so used to using all kinds of tools for everything and being always on the lookout for quick wins and those very rare “silver bullets” that we sometimes forget to stop and ask – do I really need a tool or an utility for the task at hand?
So my advice – if the task looks relatively simple, try to stop and rethink your needs. Is the new tool “investment” really worth it? What percentage of features of the tool are you actually interested in? And in a lot of cases when dealing with PostgreSQL you’ll see that things are not really that difficult to implement by yourself…and by doing so your system will gain one very important attribute – you stay in full control, know exactly what’s going on, and can easily make changes if needed.
Read on to find out more about partitioning in these posts: