What is a parallel aggregate?

In PostgreSQL, a parallel aggregate refers to a way of processing aggregate functions (such as SUM, AVG, MAX, MIN, etc.) on large amounts of data in a parallel and distributed manner, thereby making the query execution faster.

When executing an aggregate query, the database system automatically breaks up the result set into smaller pieces and distributes the work among the available resources, and then combines the results to produce the final output. This approach can significantly improve the performance of aggregate queries on large datasets, but it requires sufficient resources and may not always be faster than serial execution.

Not all aggregate functions support so-called “Partial Mode”, which indicates the aggregate is eligible to participate in various optimizations, such as parallel aggregation. And that was true for array_agg() and string_agg().

The first one collects all the input values, including nulls, into an array, while the second concatenates the non-null input values into a string.

What’s new in PostgreSQL 16?

David Rowley implemented a new functionality for parallel aggregates on string_agg() and array_agg() functions. The patch was reviewed by Andres Freund, Tomas Vondra, Stephen Frost and Tom Lane. Committed by David Rowley. The commit message is:

This adds combine, serial and deserial functions for the array_agg() and
string_agg() aggregate functions, thus allowing these aggregates to
partake in partial aggregations.  This allows both parallel aggregation to
take place when these aggregates are present and also allows additional
partition-wise aggregation plan shapes to include plans that require
additional aggregation once the partially aggregated results from the
partitions have been combined.

Author: David Rowley
Reviewed-by: Andres Freund, Tomas Vondra, Stephen Frost, Tom Lane
Discussion: https://postgr.es/m/CAKJS1f9sx_6GTcvd6TMuZnNtCh0VhBzhX6FZqw17TgVFH-ga_A@mail.gmail.com

Show time for new parallel aggregate functionality

test=# select version();
 PostgreSQL 16devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
(1 row)

test=# create table pagg_test (x int, y int);

test=# insert into pagg_test
test-# select (case x % 4 when 1 then null else x end), x % 10
test-# from generate_series(1,5000) x;
INSERT 0 5000

test=# set parallel_setup_cost TO 0;

test=# set parallel_tuple_cost TO 0;

test=# set parallel_leader_participation TO 0;

test=# set min_parallel_table_scan_size = 0;

test=# explain select
test-#     y,
test-#     string_agg(x::text, ',') AS t,
test-#     string_agg(x::text::bytea, ',') AS b,
test-#     array_agg(x) AS a,
test-#     array_agg(ARRAY[x]) AS aa
test-# from pagg_test
test-# group by y;
                                          QUERY PLAN                                          
 Finalize GroupAggregate  (cost=117.14..118.02 rows=10 width=132)
   Group Key: y
   ->  Gather Merge  (cost=117.14..117.37 rows=20 width=132)
         Workers Planned: 2
         ->  Sort  (cost=117.12..117.14 rows=10 width=132)
               Sort Key: y
               ->  Partial HashAggregate  (cost=116.75..116.95 rows=10 width=132)
                     Group Key: y
                     ->  Parallel Seq Scan on pagg_test  (cost=0.00..48.00 rows=2500 width=8)
(9 rows)

And to be absolutely sure, let’s try the same test on a PostreSQL 13 cluster. The EXPLAIN output will be the same for PostgreSQL 14 and 15.

                                    QUERY PLAN
 HashAggregate  (cost=185.50..185.70 rows=10 width=132)
   Group Key: y
   ->  Gather  (cost=0.00..48.00 rows=5000 width=8)
         Workers Planned: 2
         ->  Parallel Seq Scan on pagg_test  (cost=0.00..48.00 rows=2500 width=8)
(5 rows)

We cannot compare timing but we can compare costs. As you can see, the final cost in the upcoming PostgreSQL 16 with parallel aggregate implemented is 118.02! The final execution cost for previous versions was 185.70.

Let me remind you what the cost means in the PostgreSQL’s EXPLAIN output. Cost is an estimated measure of the query execution time in arbitrary units, representing the processing power required to execute a particular step in the query plan. It generally represents a combination of CPU, I/O, and memory usage and helps the query planner to choose the fastest execution plan.


See “Handling Bonus Programs in SQL” blog post by Hans-Jürgen Schönig to check how to efficiently code bonus programs in SQL. There is a real life example of using array_agg aggregate function. Or check out my “ER diagrams with SQL and Mermaid” post to see an example on how to use string_agg() function.

Check out out the “what’s new” post series to know more about upcoming PostgreSQL versions.

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.