MySQL GROUP BY
This article will describes MySQL GROUP BY
clause which can group some rows by specified columns or expression.
Sometimes, we need to summarize some rows base on one or more columns. This is often used in statistics, consider the following use cases:
- Get the average grade by class.
- Summarize total score by student.
- Summarize sales by year or month.
- Summarize the number of users by country or region.
You can use the GROUP BY
clauses in these use cases.
In MySQL, GROUP BY
clauses are used to group some rows into summary rows by columns values or expressions.
GROUP BY syntax
The GROUP BY
Clause is used in SELECT
statement. The syntax of the GROUP BY
clause is as follows:
SELECT column1[, column2, ...], aggregate_function(ci)
FROM table
[WHERE clause]
GROUP BY column1[, column2, ...];
[HAVING clause]
Here:
column1[, column2, ...]
afterGROUP BY
keyword is the column by which the grouping is based. It can be one or more columns.aggregate_function(ci)
is an aggregate function, used to summarize. This is optional, but generally available. You can alse use more aggregate functions here.- The columns after
SELECT
keyword must be one or more columns in the grouping columns. - The
WHERE
clause is used to filter the rows in the result set and it is optional. - The
HAVING
clause is used to filter grouped rows and it is optional.
Here is some aggregate functions often used in the GROUP BY
clause:
SUM()
AVG()
MAX()
MIN()
COUNT()
GROUP BY examples
In the following example, we will use actor
and paypment
tables from Sakila sample database.
Simple GROUP BY example
We use the GROUP BY
clause to list all last names in the actor
table.
SELECT last_name
FROM actor
GROUP BY last_name;
+--------------+
| last_name |
+--------------+
| AKROYD |
| ALLEN |
| ASTAIRE |
| BACALL |
| BAILEY |
...
| ZELLWEGER |
+--------------+
121 rows in set (0.00 sec)
In this example, the GROUP BY
clause grouped all rows based on last_name
column values.
The output of this example is as same as following statement using DISTINCT
:
SELECT DISTINCT last_name FROM actor;
using aggregate functions
If you want to know the count of every last name in the above example, you can use aggregate functions COUNT()
. Here is the statement:
SELECT last_name, COUNT(*)
FROM actor
GROUP BY last_name
ORDER BY COUNT(*) DESC;
+--------------+----------+
| last_name | COUNT(*) |
+--------------+----------+
| KILMER | 5 |
| NOLTE | 4 |
| TEMPLE | 4 |
| AKROYD | 3 |
| ALLEN | 3 |
| BERRY | 3 |
...
| WRAY | 1 |
+--------------+----------+
121 rows in set (0.00 sec)
In this example, here is the order of execution:
- First, use
GROUP BY
clause to group all rows bylast_name
column values. - Second, use the aggregate function
COUNT(*)
to count rows in each group. - Finally, use
ORDER BY
clause to sortCOUNT(*)
column in descending order.
In this way, the last name KILMER
with the largest number is ranked first.
GROUP BY, LIMIT, aggregate function
In this example, let us find the top 10 customers from the payment
table. We will use GROUP BY
clause, LIMIT
clause and aggregate functions SUM()
.
SELECT customer_id, SUM(amount) total
FROM payment
GROUP BY customer_id
ORDER BY total DESC
LIMIT 10;
+-------------+--------+
| customer_id | total |
+-------------+--------+
| 526 | 221.55 |
| 148 | 216.54 |
| 144 | 195.58 |
| 137 | 194.61 |
| 178 | 194.61 |
| 459 | 186.62 |
| 469 | 177.60 |
| 468 | 175.61 |
| 236 | 175.58 |
| 181 | 174.66 |
+-------------+--------+
10 rows in set (0.02 sec)
In this example, here is the order of execution:
- First, use
GROUP BY
clause to group all rows bycustomer_id
column values. - Second, use the aggregate function
SUM(amount)
to sumamount
columns in each group, and give it a aliastotal
; - Third, use
ORDER BY
clause to sorttotal
column in descending order. - Finally, use the
LIMIT 10
clause to return the top 10 rows.
Examples of GROUP BY and HAVING
You can use HAVING
clause after GROUP BY
clause to filter grouped rows. This statement returns customers whose total is more than 180.
SELECT customer_id, SUM(amount) total
FROM payment
GROUP BY customer_id
HAVING total > 180
ORDER BY total DESC;
+-------------+--------+
| customer_id | total |
+-------------+--------+
| 526 | 221.55 |
| 148 | 216.54 |
| 144 | 195.58 |
| 137 | 194.61 |
| 178 | 194.61 |
| 459 | 186.62 |
+-------------+--------+
6 rows in set (0.02 sec)
In this example, here is the order of execution:
- First, use
GROUP BY
clause to group all rows bycustomer_id
column values. - Second, use aggregate function
SUM(amount)
to sumamount
columns in each group, and give it a aliastotal
; - Third, use
HAVING
clause to filtering rows whichtotal
column value is more than 180. - Finally, use
ORDER BY
clause to sorttotal
column in descending order.
Conclusion
In this article, you learned MySQL GROUP BY
syntax and use cases. The following are key points of the GROUP BY
clause:
- The
GROUP BY
clause is used to group some rows by specified columns or expressions. - The
HAVING
clause is used to filter grouped rows. - The
GROUP BY
clause is often used for summarization with aggregate functions.