Using the row_number Function in PostgreSQL for Ordered Row Numbers
PostgreSQL row_number Function
The row_number function in PostgreSQL is a window function that assigns a unique number to each row within a partition of a result set. This function is particularly useful for creating row numbers or ranking rows based on a specific order. Unlike other ranking functions, row_number assigns a unique, consecutive integer to each row without considering ties, making it ideal for ordering results for pagination, deduplication, or displaying sorted records.
Syntax:
ROW_NUMBER() OVER ([PARTITION BY partition_column] ORDER BY sort_column)
Explanation:
- PARTITION BY: Divides the result set into partitions to which the row_number function is applied independently. Optional.
- ORDER BY: Specifies the order of rows within each partition, determining how row numbers are assigned.
Example 1: Basic row_number Function Usage
Code:
-- Retrieves each employee's details with an assigned row number ordered by salary in descending order.
SELECT
employee_id,
first_name,
last_name,
salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_number -- Assigns row number based on salary order
FROM employees;
Explanation:
- In this example, ROW_NUMBER() generates a unique number for each row in the employees table, ordering them by salary in descending order. This is helpful for ranking employees by salary, where each employee is assigned a distinct row number.
Example 2: Using row_number with Partitioning
Code:
-- Assigns a row number to each employee within their department, ordered by hire date.
SELECT
department_id,
employee_id,
first_name,
last_name,
hire_date,
ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY hire_date ASC) AS row_number -- Row number per department based on hire date
FROM employees;
Explanation:
- Here, the row_number function partitions the rows by department_id and then orders them within each department by hire_date. This means each department will have a row numbering starting from 1, ordered by the employees' hire dates. This setup is useful for identifying seniority within departments.
Example 3: Removing Duplicates with row_number
Code:
-- Removes duplicate entries based on the "email" column, keeping only the most recent entries.
SELECT *
FROM (
SELECT
employee_id,
email,
created_at,
ROW_NUMBER() OVER (PARTITION BY email ORDER BY created_at DESC) AS row_number -- Number rows within each email group by date
FROM employees
) AS ranked_employees
WHERE row_number = 1; -- Filters to keep only the latest record for each email
Explanation:
- This example uses row_number to partition records by email and order them by created_at. The inner query assigns row numbers within each email partition based on creation date. The outer query then filters out rows where row_number is greater than 1, keeping only the most recent entry for each email.
Notes:
1. Difference from Other Ranking Functions:
- The row_number function does not consider ties. If two rows have the same values in the ordered columns, they will still receive different row numbers. For cases where ties should share the same rank, consider using RANK() or DENSE_RANK() instead.
2. Use in Pagination:
- row_number is commonly used for pagination, especially with OFFSET and LIMIT clauses. By generating a unique row number, you can retrieve specific rows from a large result set.
3. Performance:
- Be cautious with large tables, as using row_number with complex ORDER BY or PARTITION BY clauses can impact query performance. Indexing the ordered columns may help improve performance.
Summary:
The row_number function in PostgreSQL is a powerful tool for adding a sequential row identifier to a result set based on a specified order. This function is ideal for tasks like pagination, identifying unique entries, or ranking within partitions. It differs from other ranking functions by ensuring a unique row number for each row, regardless of ties.
All PostgreSQL Questions, Answers, and Code Snippets Collection.
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics