How the RANK() function works in MariaDB
The RANK() function is a window function in MariaDB that returns the rank of a row within a partition of a result set. The rank of a row is determined by the order of the values in the ORDER BY clause of the window definition: it is one plus the number of rows in the partition that sort strictly before it. If two or more rows have the same value, they are assigned the same rank, and the ranks that the tied rows would otherwise have occupied are skipped. The function can be used for analysis involving ranks, such as finding the top or bottom values in each group, or estimating the percentile of a value.
Syntax
The syntax of the RANK() function is as follows:
RANK() OVER (window_definition)
RANK() itself takes no arguments. The OVER clause takes a window definition:
- window_definition: Specifies the partitioning and ordering of the result set. It can include the following clauses:
  - PARTITION BY: Divides the result set into partitions based on the values of one or more expressions. The RANK() function is applied to each partition separately. This clause is optional. If it is omitted, the entire result set is treated as a single partition.
  - ORDER BY: Specifies the order of the rows within each partition based on the values of one or more expressions. The RANK() function assigns ranks to the rows according to this order. This clause is mandatory for RANK(). Each expression can be followed by ASC or DESC to indicate ascending or descending order, respectively. The default order is ascending.
  - ROWS or RANGE: A frame clause restricts an aggregate window function to a subset of the partition around the current row, defined by a physical offset (ROWS) or a logical offset (RANGE). It does not apply to RANK(): the rank always depends on the ordering of the whole partition, and MariaDB rejects a frame clause on ranking functions with an error.
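The effect of PARTITION BY can be seen by ranking the same rows with and without it. The following is a minimal sketch; the derived table t and its columns grp and val are hypothetical and exist only for illustration.

```sql
-- Hypothetical inline data: two groups ('a' and 'b') with two values each.
SELECT grp, val,
       -- No PARTITION BY: all four rows form a single partition.
       RANK() OVER (ORDER BY val DESC) AS overall_rank,
       -- PARTITION BY grp: ranking restarts at 1 inside each group.
       RANK() OVER (PARTITION BY grp ORDER BY val DESC) AS rank_in_group
FROM (
    SELECT 'a' AS grp, 10 AS val
    UNION ALL SELECT 'a', 30
    UNION ALL SELECT 'b', 20
    UNION ALL SELECT 'b', 40
) AS t
ORDER BY grp, val DESC;
```

With this data, the row ('a', 30) gets overall_rank 2 because only the value 40 sorts before it, but rank_in_group 1 because it is the largest value in group 'a'.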
The function returns an integer value that represents the rank of the row within the partition of the result set, as follows:
- The function assigns ranks to the rows according to the order of the values in the ORDER BY clause of the window definition. The rank of a row is one plus the number of rows that sort strictly before it in the partition.
- If two or more rows have the same value, they are assigned the same rank, and the next rank is skipped. For example, if the values are 1, 2, 2, 3, the ranks are 1, 2, 2, 4.
- The rank is always computed over the whole partition. A ROWS or RANGE frame cannot narrow the set of rows that RANK() considers.
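The tie rule for the values 1, 2, 2, 3 can be checked directly. This is a small sketch that builds the four values inline; the derived table t and the column v are hypothetical names.

```sql
-- Hypothetical one-column data set holding the values 1, 2, 2, 3.
SELECT v,
       RANK() OVER (ORDER BY v) AS rnk
FROM (
    SELECT 1 AS v
    UNION ALL SELECT 2
    UNION ALL SELECT 2
    UNION ALL SELECT 3
) AS t
ORDER BY v;
-- rnk is 1, 2, 2, 4: three rows sort strictly before the value 3,
-- so its rank is 3 + 1 = 4 and rank 3 is never assigned.
```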
Examples
Example 1: Finding the rank of sales by product
The following example ranks products by sales using the RANK() function. The table sales contains the product name, the product category, and the sales amount for each product. The query uses the PARTITION BY clause to divide the result set into partitions by product category, and RANK() assigns ranks within each category based on the sales amount in descending order. The query returns the product name, the product category, the sales amount, and the rank of sales for each product.
SELECT product_name, product_category, sales_amount,
RANK() OVER (PARTITION BY product_category ORDER BY sales_amount DESC) AS rank_of_sales
FROM sales;
The output is:
+--------------+------------------+--------------+---------------+
| product_name | product_category | sales_amount | rank_of_sales |
+--------------+------------------+--------------+---------------+
| Laptop       | Electronics      |        50000 |             1 |
| TV           | Electronics      |        40000 |             2 |
| Camera       | Electronics      |        30000 |             3 |
| Phone        | Electronics      |        20000 |             4 |
| Book         | Books            |         5000 |             1 |
| Magazine     | Books            |         3000 |             2 |
| Newspaper    | Books            |         2000 |             3 |
| Pen          | Stationery       |         1000 |             1 |
| Pencil       | Stationery       |          500 |             2 |
| Eraser       | Stationery       |          200 |             3 |
+--------------+------------------+--------------+---------------+
The output shows that the RANK() function assigns ranks to the products based on their sales amount in descending order within each product category. For example, the laptop has the highest sales amount in the electronics category, so it has a rank of 1. The TV has the second highest sales amount in the electronics category, so it has a rank of 2. The book has the highest sales amount in the books category, so it has a rank of 1. The pen has the highest sales amount in the stationery category, so it has a rank of 1. No two products in this data set have the same sales amount within a category, so the ranks in each category are consecutive: 1, 2, 3, 4 for electronics and 1, 2, 3 for books and stationery. If two products in a category had the same sales amount, they would be assigned the same rank, and the next rank would be skipped.
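To reproduce this example and use the rank for a top-N query, something like the following could work. It is a sketch under assumptions: the column types are guesses, since the article does not give the table definition, and the derived-table alias ranked is hypothetical. The derived table is needed because a window function cannot appear in a WHERE clause.

```sql
-- Assumed schema for the sales table; the column types are not from the article.
CREATE TABLE sales (
    product_name     VARCHAR(50),
    product_category VARCHAR(50),
    sales_amount     INT
);

INSERT INTO sales (product_name, product_category, sales_amount) VALUES
    ('Laptop', 'Electronics', 50000),
    ('TV', 'Electronics', 40000),
    ('Camera', 'Electronics', 30000),
    ('Phone', 'Electronics', 20000),
    ('Book', 'Books', 5000),
    ('Magazine', 'Books', 3000),
    ('Newspaper', 'Books', 2000),
    ('Pen', 'Stationery', 1000),
    ('Pencil', 'Stationery', 500),
    ('Eraser', 'Stationery', 200);

-- Keep only the two best-selling products in each category.
-- The rank is computed in a derived table and filtered in the outer query.
SELECT product_name, product_category, sales_amount, rank_of_sales
FROM (
    SELECT product_name, product_category, sales_amount,
           RANK() OVER (PARTITION BY product_category
                        ORDER BY sales_amount DESC) AS rank_of_sales
    FROM sales
) AS ranked
WHERE rank_of_sales <= 2
ORDER BY product_category, rank_of_sales;
```

With the data above, this returns Book and Magazine for books, Laptop and TV for electronics, and Pen and Pencil for stationery. Because RANK() gives tied rows the same rank, a tie at rank 2 would return more than two rows for that category.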
Example 2: Finding the rank of scores by student
The following example ranks students by score using the RANK() function. The table scores contains the student name and the score for each student. The query uses the RANK() function to assign ranks to the students based on their score in ascending order. No PARTITION BY clause is given, so the whole result set is treated as a single partition. RANK() does not take a ROWS or RANGE frame, so the window definition contains only the ORDER BY clause. The query returns the student name, the score, and the rank of score for each student.
SELECT student_name, score,
RANK() OVER (ORDER BY score ASC) AS rank_of_score
FROM scores;
The output is:
+--------------+-------+---------------+
| student_name | score | rank_of_score |
+--------------+-------+---------------+
| Alice        |    50 |             1 |
| Bob          |    60 |             2 |
| Charlie      |    60 |             2 |
| David        |    80 |             4 |
| Eve          |    90 |             5 |
+--------------+-------+---------------+
The output shows that the RANK() function assigns ranks to the students based on their score in ascending order. Alice has the lowest score of 50, so she has a rank of 1. Bob has the second lowest score of 60, so he has a rank of 2. Charlie has the same score as Bob, so he also has a rank of 2. Because two rows share rank 2, rank 3 is skipped: David, with a score of 80, has a rank of 4, and Eve, with the highest score of 90, has a rank of 5.
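The tie between Bob and Charlie can be reproduced with a small setup script. This is a sketch that assumes a simple two-column definition for the scores table (the column types are not given in the article) and that Bob and Charlie both scored 60, as the explanation states. It also adds a descending rank, a hypothetical extra column, to show how the direction of ORDER BY changes the result.

```sql
-- Assumed schema for the scores table; the column types are not from the article.
CREATE TABLE scores (
    student_name VARCHAR(50),
    score        INT
);

INSERT INTO scores (student_name, score) VALUES
    ('Alice', 50),
    ('Bob', 60),
    ('Charlie', 60),
    ('David', 80),
    ('Eve', 90);

SELECT student_name, score,
       -- Lowest score gets rank 1.
       RANK() OVER (ORDER BY score ASC)  AS rank_low_first,
       -- Highest score gets rank 1.
       RANK() OVER (ORDER BY score DESC) AS rank_high_first
FROM scores
ORDER BY score, student_name;
-- rank_low_first:  Alice 1, Bob 2, Charlie 2, David 4, Eve 5
-- rank_high_first: Alice 5, Bob 3, Charlie 3, David 2, Eve 1
```

In both directions the tied students share a rank and exactly one rank number is skipped after them.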
Related Functions
There are some other functions that are related to the RANK() function, such as:
- DENSE_RANK(): This function is similar to the RANK() function, but it does not skip any ranks if there are ties. The syntax of the function is DENSE_RANK() OVER (window_definition), where window_definition is the same as in the RANK() function. The function returns an integer value that represents the dense rank of the row within the partition of the result set. For example, if the values are 1, 2, 2, 3, the ranks are 1, 2, 2, 4, and the dense ranks are 1, 2, 2, 3.
- ROW_NUMBER(): This function returns the sequential number of a row within a partition of a result set. Tied rows still receive distinct numbers. The syntax of the function is ROW_NUMBER() OVER (window_definition), where window_definition is the same as in the RANK() function.
- NTILE(): This function returns the bucket number of a row within a partition of a result set. The rows are divided into a specified number of buckets that are as equal in size as possible. The syntax of the function is NTILE(number) OVER (window_definition), where number is an integer expression that specifies the number of buckets, and window_definition is the same as in the RANK() function. The function returns an integer value that represents the bucket number of the row within the partition of the result set. For example, if the number of buckets is 4 and the values are 1, 2, 3, 4, 5, 6, 7, 8, the bucket numbers are 1, 1, 2, 2, 3, 3, 4, 4.
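Running the related functions side by side over the same rows makes the differences easier to see. The query below is an illustrative sketch; the derived table t, the column v, and the output aliases are hypothetical.

```sql
-- Hypothetical one-column data set holding the values 1, 2, 2, 3.
SELECT v,
       RANK()       OVER (ORDER BY v) AS rnk,        -- ties share a rank, gap after them
       DENSE_RANK() OVER (ORDER BY v) AS dense_rnk,  -- ties share a rank, no gap
       ROW_NUMBER() OVER (ORDER BY v) AS row_num,    -- always distinct, even for ties
       NTILE(2)     OVER (ORDER BY v) AS bucket      -- two buckets of two rows each
FROM (
    SELECT 1 AS v
    UNION ALL SELECT 2
    UNION ALL SELECT 2
    UNION ALL SELECT 3
) AS t
ORDER BY v, row_num;
-- rnk:       1, 2, 2, 4
-- dense_rnk: 1, 2, 2, 3
-- row_num:   1, 2, 3, 4
-- bucket:    1, 1, 2, 2
```

Which of the two tied rows receives row number 2 and which receives 3 is not determined by ORDER BY v alone, so a query that relies on ROW_NUMBER() usually needs an additional tie-breaking sort expression.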
Conclusion
The RANK() function returns the rank of a row within a partition of a result set. The rank is determined by the order of the values in the ORDER BY clause of the window definition: it is one plus the number of rows that sort strictly before the row in its partition. If two or more rows have the same value, they are assigned the same rank, and the next rank is skipped. The function takes no arguments of its own; the OVER clause supplies a window definition that specifies the partitioning and ordering of the result set, and the function returns an integer. It can be used for analysis involving ranks, such as finding the top or bottom values, or the percentile of a value, and it can be used alongside other window functions, such as DENSE_RANK(), ROW_NUMBER(), and NTILE(), in the same query.