PostgreSQL lag() Function
The PostgreSQL lag()
function returns the value from the specified row before the current row in the partition where the current row is located.
lag()
Syntax
Here is the syntax of the PostgreSQL lag()
function:
lag(expr[, offset[, default]])
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
Parameters
expr
-
Required. It can be a column name or an expression.
offset
-
Optional. The number of lines at the offset relative to the current line. The default value is 1.
default
-
Optional. It can be a column name or an expression.
partition_column_list
-
List of columns for partitioning.
partition_column_list
-
List of columns for sorting.
Return value
The PostgreSQL lag()
function returns the value from the specified row before the current row in the partition where the current row is located.
lag()
Examples
Preparing Data
Use the following CREATE TABLE
statement to create a table named tax_revenue
to store tax returns for every quarter:
CREATE TABLE tax_revenue (
id SERIAL PRIMARY KEY,
year CHAR(4) NOT NULL,
quarter CHAR(1) NOT NULL,
revenue INT NOT NULL
);
This tax_revenue
table has 4 columns as following:
id
- The row ID, primary key.year
- The year.quarter
- The quarter of the year, from 1 to 4.revenue
- Tax revenue.
Insert some rows into the tax_revenue
table using the following INSERT
statement:
INSERT INTO tax_revenue
(year, quarter, revenue)
VALUES
('2020', '1', 3515),
('2020', '2', 3678),
('2020', '3', 4203),
('2020', '4', 3924),
('2021', '1', 3102),
('2021', '2', 3293),
('2021', '3', 3602),
('2021', '4', 2901);
Use the following SELECT
statement to show all rows of this table:
SELECT * FROM tax_revenue;
id | year | quarter | revenue
----+------+---------+---------
1 | 2020 | 1 | 3515
2 | 2020 | 2 | 3678
3 | 2020 | 3 | 4203
4 | 2020 | 4 | 3924
5 | 2021 | 1 | 3102
6 | 2021 | 2 | 3293
7 | 2021 | 3 | 3602
8 | 2021 | 4 | 2901
(8 rows)
Example 1
The following statement uses the PostgreSQL lag()
function to add a next_quarter_revenue
column for comparing the revenue of the current quarter and the next quarter:
SELECT *,
lag(revenue, 1) OVER (
PARTITION BY year
ORDER BY quarter DESC
) next_quarter_revenue
FROM tax_revenue;
id | year | quarter | revenue | next_quarter_revenue
----+------+---------+---------+----------------------
4 | 2020 | 4 | 3924 | <null>
3 | 2020 | 3 | 4203 | 3924
2 | 2020 | 2 | 3678 | 4203
1 | 2020 | 1 | 3515 | 3678
8 | 2021 | 4 | 2901 | <null>
7 | 2021 | 3 | 3602 | 2901
6 | 2021 | 2 | 3293 | 3602
5 | 2021 | 1 | 3102 | 3293
(8 rows)
Note that the window function in the SQL statement above:
lag(revenue, 1) OVER (
PARTITION BY year
ORDER BY quarter DESC
) next_quarter_revenue
In the OVER
clause,
- The
PARTITION BY year
partitions all rows by year. - The
ORDER BY quarter DESC
sort the rows within each partition in descending order by quarter. - The
lag(revenue, 1)
returns the revenue (revenue
) from the previous row (1
) for within the partition where the current row is located.
So the value of the next_quarter_revenue
column is the revenue of the next quarter of the current row. And the value of the next_quarter_revenue
column in the first row of each partition is null.
You can specify a default value for nulls ββin the next_quarter_revenue
column. The following statement is used 0
as a default value:
SELECT *,
lag(revenue, 1, 0) OVER (
PARTITION BY year
ORDER BY quarter DESC
) next_quarter_revenue
FROM tax_revenue;
id | year | quarter | revenue | next_quarter_revenue
----+------+---------+---------+----------------------
4 | 2020 | 4 | 3924 | 0
3 | 2020 | 3 | 4203 | 3924
2 | 2020 | 2 | 3678 | 4203
1 | 2020 | 1 | 3515 | 3678
8 | 2021 | 4 | 2901 | 0
7 | 2021 | 3 | 3602 | 2901
6 | 2021 | 2 | 3293 | 3602
5 | 2021 | 1 | 3102 | 3293
(8 rows)