PostgreSQL GROUPING SETS - Multiple grouping in one query
This article describes how to use GROUPING SETS
in a GROUP BY
clause to generate a report on multiple dimensions in one query.
In PostgreSQL, the GROUPING SETS
clause is a parameter to the GROUP BY
clause that allows you to generate a report on multiple dimensions in one query.
For example, you can use GROUPING SETS
in the GROUP BY
clause to to get a sales report with rows for sales per year and rows for sales per department.
PostgreSQL GROUPING SETS
syntax
The following syntax shows how to use GROUPING SETS
in the GROUP BY
clause:
SELECT ...
FROM table_name
[WHERE ...]
GROUP BY
GROUPING SETS (
(group_expr_list_1)
[, (group_expr_list_2), ...]
)
;
here,
- The
GROUPING SETS
is an argument of theGROUP BY
clause, it must follow theGROUP BY
keyword. - The
(group_expr_list_N)
is a list of expressions used for grouping, which can contain zero or more columns or expressions, like those used directly in theGROUP BY
clause. - You can specify one or more grouping expressions for
GROUPING SETS
, and all result sets produced by all grouping expressions will be merged into the final result set. - If there are no columns or expressions in
(group_expr_list_N)
, that is()
, all eligible rows are aggregated into one group.
PostgreSQL GROUP BY
Examples
We will use the tables in the Sakila sample database for demonstration, please install the Sakila sample database in PostgreSQL first.
In the Sakila sample database, the film
table stores all the films in the DVD store.
To find the number of films for each film rating from the film
table, use a statement like this:
SELECT rating, count(*)
FROM film
GROUP BY rating
ORDER BY rating;
rating | count
--------+-------
G | 178
PG | 194
PG-13 | 223
R | 195
NC-17 | 210
(5 rows)
To find the number of films for each rental from the film
table, use a statement like this:
SELECT rental_rate, count(*)
FROM film
GROUP BY rental_rate
ORDER BY rental_rate;
rental_rate | count
-------------+-------
0.99 | 341
2.99 | 323
4.99 | 336
(3 rows)
To merge the above two reports in one report, you can UNION ALL
combine the as follows:
SELECT rating, NULL rental_rate, count(*)
FROM film
GROUP BY rating
UNION ALL
SELECT NULL rating, rental_rate, count(*)
FROM film
GROUP BY rental_rate
ORDER BY rating, rental_rate;
rating | rental_rate | count
--------+-------------+-------
G | | 178 ┐
PG | | 194 │
PG-13 | | 223 │ > ranting
R | | 195 │
NC-17 | | 210 ┘
| 0.99 | 341 ┐
| 2.99 | 323 │ > rental_rate
| 4.99 | 336 ┘
(8 rows)
However, the above statement with UNION ALL
looks complicated. It would be great to be able to do all this in one simple statement. PostgreSQL GROUPING SETS
can do all this for us:
SELECT rating, rental_rate, count(*)
FROM film
GROUP BY GROUPING SETS ((rating), (rental_rate))
ORDER BY rating, rental_rate;
rating | rental_rate | count
--------+-------------+-------
G | | 178 ┐
PG | | 194 │
PG-13 | | 223 │ > ranting
R | | 195 │
NC-17 | | 210 ┘
| 0.99 | 341 ┐
| 2.99 | 323 │ > rental_rate
| 4.99 | 336 ┘
(8 rows)
So, you can simply understand that GROUPING SETS
is a UNION ALL
operation on multiple grouped result sets.
If you wanted to add a row to the above report to show the total number of films, you could be an empty grouping expression ()
in GROUPING SETS
, as following:
SELECT rating, rental_rate, count(*)
FROM film
GROUP BY GROUPING SETS ((rating), (rental_rate), ())
ORDER BY rating, rental_rate;
rating | rental_rate | count
--------+-------------+-------
G | | 178
PG | | 194
PG-13 | | 223
R | | 195
NC-17 | | 210
| 0.99 | 341
| 2.99 | 323
| 4.99 | 336
| | 1000
Conclusion
In the GROUP BY
clause, you can use GROUPING SETS
to generate a collection of result sets for multiple grouped queries in one query.
For more complex needs, PostgreSQL also provides ROLLUP
and CUBE
to simplify GROUPING SETS
.