CYBERTEC Logo

PostgreSQL: GROUP BY expression

08.2020 / Category: / Tags: |

GROUP BY is nothing new and is available in any relational database I am aware of. It is an integral part of SQL and PostgreSQL but what many people might not know is the fact that the GROUP BY expression can do more than just group by simple fields. You can use expressions to group in an even more sophisticated way and here is how:

Importing test data

To show you what you can do with GROUP BY I have compiled some test data:

This data set is pretty easy to understand. It tells us how much oil was produced and consumed by which country in the past. To load the data you can either download the file or simply ask COPY to load it directly from the web.

To use COPY … FROM PROGRAM you have to be superuser otherwise it does not work for security reasons. In my example 644 rows have been loaded successfully.

Simple GROUP BY statements

Using GROUP BY is pretty simple. If we want to make some kind of analysis for each group we basically got two choices:

“GROUP BY 1” basically means the same as the following query:

“GROUP BY region” and “GROUP BY 1” are therefore identical. It is a question of faith which type of syntax you prefer. People have told me once than once that one or the other syntax is “evil”. In reality it makes absolutely no difference. It is just syntactic sugar.

GROUP BY “expression”

Most people group by one or more columns or no column at all:

However, there is more: You can also use an expression to determine the groups on the fly.

Here is an example:

In this case we got two groups: One group is for rows greater than 9000 and one for rows lower or equal than 9000. The name of those two groups is therefore “true” (> 9000) or “false” (<= 9000). You can use any expression to calculate those groups on your own. Consider the next example:

In this example we are counting odd and even years. Note that it is not necessary to list the grouping criterial in the SELECT clause. You might not understand the result if you miss half of the column list but you are free to do that. What is also interesting is that you can use a full SQL query in the GROUP BY clause.

The following statement is identical to what you have just seen:

If you run those queries it makes sense to take a look at the execution plan. As you can see the plan is just like any other GROUP BY statement. It is also noteworthy that PostgreSQL does a really good job to estimate the number of groups:

PostgreSQL successfully figured out that two groups are to be expected.
HAVING clauses done right

Recently people have asked us about aliases in the HAVING clause: Can one use aliases in a HAVING clause?

The answer is simply no. SQL does not allow that. I remember seeing people who actually wanted to implement this feature in PostgreSQL over the years but such a feature is not going to be supported.

If you want to use a HAVING clause it is necessary to explicitly use the entire expression:

However, consider the following example:

As you can see we use “count” in the SELECT clause, but we use “avg” in the HAVING clause. This is definitely possible and in some cases really useful. You can use different aggregate functions in GROUP BY and HAVING clauses without any problems.

Making use of grouping sets

So far you have seen what GROUP BY can do for you in the simple case butthere is more. I don't want to go into too much detail in this blog but just one thing: You can do more than just one aggregation at the same time to speed up your query. Here is an example:

In this case we got two grouping operations: One containing all rows and one for before respectively after 1990.

If you don't like this syntax you can also try ROLLUP which is equivalent to the query you have just seen:

ROLLUP is basically the “bottom line” - it adds one additional row counting everything.

Finally …

If you want to find out more about grouping in general I recommend checking out my post about speeding up GROUP BY in general.


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

0 0 votes
Article Rating
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments
CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    0
    Would love your thoughts, please comment.x
    ()
    x
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram