PostgreSQL DISTINCT Usages
This article describes how to use in PostgreSQL DISTINCT
to return a result set with no duplicate rows.
In PostgreSQL, the DISTINCT
keyword is used in the SELECT
statement so that it returns a result set with no duplicate rows.
PostgreSQL DISTINCT
syntax
To return a result set with no duplicate rows, use the SELECT
statement with the DISTINCT
keyword:
Here is the syntax for DISTINCT
:
SELECT
DISTINCT column1 [, column2, ...]
FROM
table_name;
Explanation:
- The keyword
DISTINCT
should be specified afterSELECT
. - Specify the columns to evaluate for duplicates after the
DISTINCT
keyword . - Multiple column names need to be separated by
,
. If multiple column names are specified, MySQL will evaluate for duplicates based on the combined value of these columns. - You can use
DISTINCT *
to evaluate duplicates for all columns.
PostgreSQL also provides DISTINCT ON (expression)
to keep the first row of each set of duplicates using the following syntax:
SELECT
DISTINCT ON (column1) column_alias,
column2
FROM
table_name
ORDER BY
column1,
column2;
It is a good practice to always use the ORDER BY
clause with the DISTINCT ON(expression)
to make the result set predictable.
Note that the DISTINCT ON
expression must match the leftmost expression in the ORDER BY
clause .
PostgreSQL DISTINCT Examples
We will use the tables in the Sakila sample database for demonstration, please install the Sakila sample database in PostgreSQL first.
To retrieve all ratings of films from the film
table, use the following statement:
SELECT
DISTINCT rating
FROM
film;
rating
--------
R
PG-13
G
PG
NC-17
(5 rows)
Here, in order to find all the films ratings, we use DISTINCT rating
, so that each film rating appears only once in the result set.
To retrieve all rent amounts from the film
table, use the following statement:
SELECT
DISTINCT rental_rate
FROM
film;
rental_rate
-------------
2.99
4.99
0.99
(3 rows)
Here, in order to find all the film rental amounts, we use DISTINCT rental_rate
, so that each film rental amount appears only once in the result set.
To retrieve all combinations of ratings and rental amounts from the film
table, use the following statement:
SELECT
DISTINCT rating, rental_rate
FROM
film
ORDER BY rating;
rating | rental_rate
--------+-------------
G | 0.99
G | 4.99
G | 2.99
PG | 2.99
PG | 0.99
PG | 4.99
PG-13 | 4.99
PG-13 | 0.99
PG-13 | 2.99
R | 0.99
R | 2.99
R | 4.99
NC-17 | 0.99
NC-17 | 2.99
NC-17 | 4.99
(15 rows)
Here, we used DISTINCT rating, rental_rate
to find all combinations of film ratings and rental amounts. To make the output more readable, we use ORDER BY
to sort the result set in positive order by film ranking.
If you want to return the first row for each set of films ratings, use the DISTINCT ON
:
SELECT
DISTINCT ON (rating) rating,
film_id,
title
FROM
film
ORDER BY rating, film_id DESC;
rating | film_id | title
--------+---------+------------------
G | 2 | ACE GOLDFINGER
PG | 1 | ACADEMY DINOSAUR
PG-13 | 7 | AIRPLANE SIERRA
R | 8 | AIRPORT POLLOCK
NC-17 | 3 | ADAPTATION HOLES
DISTINCT and NULL
DISTINCT
treats all null values ββare the same, regardless of the field’s type, so only one null value is left from DISTINCT
.
For example the following SQL statement returns multiple rows of NULL
records:
SELECT NULL nullable_col
UNION ALL
SELECT NULL nullable_col
UNION ALL
SELECT NULL nullable_col;
nullable_col
--------------
<null>
<null>
<null>
(3 rows)
Here, we have 3 rows, each of which has a nullable_col
column value of NULL
.
After using DISTINCT
:
SELECT
DISTINCT nullable_col
FROM
(
SELECT NULL nullable_col
UNION ALL
SELECT NULL nullable_col
UNION ALL
SELECT NULL nullable_col
) t;
nullable_col
--------------
<null>
(1 row)
This example uses to UNION ALL
simulate a recordset containing multiple NULL
values.
Conclusion
This article describes how to use the DISTINCT
to clear duplicate rows in a result set. The key points of the usage of the DISTINCT
are as follows:
- The
SELECT DISTINCT
statement returns a result set with no duplicate rows. - You can specify one or more columns after
DISTINCT
, or you can use it*
. - The
DISTINCT
treat allNULL
as equal and keep only one. - The
DISTINCT ON
is used to return the first row of each group of repeated values.