Select the first row of each set of rows grouped with a GROUP BY in PostgreSQL
Select First Row in Each Group by Group in PostgreSQL
In PostgreSQL, selecting the first row in each group after a GROUP BY operation requires a combination of window functions or custom sorting. Here are some efficient ways to achieve this.
1. Using DISTINCT ON
PostgreSQL's DISTINCT ON clause provides a straightforward way to select the first row in each group. You specify the columns for distinct groups, and then order the rows to ensure the correct row appears as the first within each group.
Syntax:
SELECT DISTINCT ON (group_column) group_column, other_columns FROM table_name ORDER BY group_column, sorting_column;
Example Code:
-- Select the first row in each group based on 'group_column'
SELECT DISTINCT ON (category) -- Select unique 'category' values
category, product_name, price -- Specify additional columns to display
FROM products -- From 'products' table
ORDER BY category, price DESC; -- Order by 'category' and descending 'price'
Explanation:
- DISTINCT ON (category): Returns only the first row within each unique category.
- ORDER BY category, price DESC: Orders by category and then sorts each group by price in descending order, making the row with the highest price appear first in each category group.
2. Using Window Functions (ROW_NUMBER)
If more flexibility is needed, use the ROW_NUMBER() window function to assign row numbers to each row in each group and then select only the first row per group.
Syntax:
SELECT group_column, other_columns FROM ( SELECT group_column, other_columns, ROW_NUMBER() OVER (PARTITION BY group_column ORDER BY sorting_column) AS row_num FROM table_name ) AS subquery WHERE row_num = 1;
Example Code:
-- Use ROW_NUMBER() to select the first row in each group
SELECT category, product_name, price -- Specify columns to display
FROM (
SELECT category, product_name, price, -- Columns to show in output
ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS row_num -- Row number per category
FROM products -- From 'products' table
) AS ranked_rows -- Alias for subquery
WHERE row_num = 1; -- Select only the first row in each group
Explanation:
- ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC): Generates a row number for each row in each category, ordered by price in descending order.
- WHERE row_num = 1: Filters the query to return only the first row per category group.
3. Using LATERAL JOIN with LIMIT
Another approach involves using a LATERAL join to select the first row of each group. This method is useful when filtering within each group.
Syntax:
SELECT group_column, other_columns FROM table_name AS main_table JOIN LATERAL ( SELECT other_columns FROM table_name AS sub_table WHERE main_table.group_column = sub_table.group_column ORDER BY sorting_column LIMIT 1 ) AS first_row;
Example Code:
-- Select first row in each group using LATERAL join
SELECT main_table .category,
first_row.product_name,
first_row.price -- Specify columns to display
FROM products AS main_table -- 'products' table as main_table
JOIN LATERAL (
SELECT product_name, price -- Columns to select in subquery
FROM products AS sub_table -- 'products' table as sub_table
WHERE main_table.category = sub_table.category -- Join on category
ORDER BY price DESC -- Order by descending price
LIMIT 1 -- Limit to the first row
) AS first_row ON true; -- Alias for lateral join
Explanation:
- JOIN LATERAL (...): Joins each row from main_table with the result of the subquery (first_row).
- WHERE main_table.category = sub_table.category: Ensures each subquery only considers rows within the same group.
- LIMIT 1: Restricts the subquery to return only the first row within each group.
All PostgreSQL Questions, Answers, and Code Snippets Collection.
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics