CYBERTEC Logo

Partial aggregation: The beautiful way

02.2015 / Category: / Tags: |

Partial aggregation: Aggregations are a fundamental feature of SQL. Functions such as min, max, count, sum, avg and so on are used in virtually every application - it is basically impossible to write a reasonably complex applications without them.

Here is a trivial aggregate: All it does is summing up numbers from 1 to 10:

So far so good. But what if we only want to sum up even numbers? We can rely on a simple trick: NULL values are ignored by aggregation functions. This makes a lot of sense because NULL really means “undefined” and you cannot average or sum up undefined values.  A simple CASE WHEN can do the job:

This works perfectly - however, it is a bit old fashioned and pretty nasty to write (especially in case of more complex operations).

The fancy way to partial aggregation

There is a more fancy way to do that:

The FILTER keywords allow users to tell the systems which rows should make it into the aggregate function. The new syntax is far more readable, as well as a lot shorter.

Find out more about partial aggregation in PostgreSQL 16 in this blog post about Parallel Aggregates by Pavlo Golub.

In case you need any assistance, please feel free to contact us.
 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

0 0 votes
Article Rating
Subscribe
Notify of
guest
7 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Efi
Efi
9 years ago

Is there any difference performance wise between the FILTER version and the CASE WHEN version?

David Fetter
David Fetter
9 years ago
Reply to  Efi

Clarity and speed. The planner knows about FILTER, so it has what it needs to make it faster.

Slava Moudry
Slava Moudry
9 years ago
Reply to  David Fetter

Thanks! This is really great. David, can you elaborate a bit on planner optimizations for FILTER.
Would it use data stats on columns or can it pull data using index-only scan?

David Fetter
David Fetter
9 years ago
Reply to  Slava Moudry

Check the git commit logs for FILTER. The initial mechanics of the optimization are there.

David Fetter
David Fetter
9 years ago
Reply to  Slava Moudry

FILTER is about eliminating, not adding. See the patch for places where that happens. http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b560ec1b0d7b910ce13edc51ffaafaca72136e3b

David Fetter
David Fetter
9 years ago
Reply to  Slava Moudry

Basically, FILTER is a way of removing rows efficiently compared to evaluating a CASE at each row. The technical details of this are here: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b560ec1b0d7b910ce13edc51ffaafaca72136e3b

David Fetter
David Fetter
9 years ago

Thanks for promoting the feature I wrote 🙂

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    7
    0
    Would love your thoughts, please comment.x
    ()
    x
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram