Using PostgreSQL array_agg Function for data aggregation
PostgreSQL array_agg Function
The array_agg function in PostgreSQL is an aggregate function that collects multiple values from a group and returns them as an array. This is especially useful for aggregating data from multiple rows into a single array format, enabling you to perform complex data manipulation, such as collecting a list of related items or grouping values from multiple rows.
Syntax
array_agg(expression [ORDER BY sort_expression ASC|DESC])
- expression: The column or expression to be aggregated.
- ORDER BY: (Optional) Specifies the order of the elements in the resulting array.
Examples and Code Explanation
Example 1: Basic Array Aggregation
Suppose we have a students table with the following data:
id | name | course |
---|---|---|
1 | Alice | Math |
2 | Bob | Math |
3 | Charlie | Science |
4 | David | Math |
5 | Eve | Science |
You can use array_agg to list all student names by course.
Code:
SELECT
course, -- Group results by course
array_agg(name) AS student_names -- Collect names in an array for each course
FROM
students -- From students table
GROUP BY
course;
Result:
course student_names Math {Alice, Bob, David} Science {Charlie, Eve}
Explanation:
- array_agg(name): Aggregates the name field for each group (course) and returns it as an array.
- GROUP BY course: Groups the results by course, so each course has a separate row with all student names in that course.
Example 2: Using ORDER BY in array_agg
You can use the ORDER BY clause within array_agg to specify the order of items within the array.
Code:
SELECT
course, -- Group results by course
array_agg(name ORDER BY name ASC) AS student_names
FROM
students
GROUP BY
course;
This query sorts the names within each course in alphabetical order in the resulting arrays.
Example 3: Aggregating Multiple Columns
array_agg can also aggregate complex expressions, such as concatenating multiple columns.
Code:
SELECT
course, -- Group results by course
array_agg(name || ' (' || id || ')') AS student_details -- Concatenate name and id within array elements
FROM
students
GROUP BY
course;
Result:
course student_details Math {Alice (1), Bob (2), David (4)} Science {Charlie (3), Eve (5)}
Explanation:
- name || ' (' || id || ')': This concatenates name and id to give details for each student.
- array_agg: Collects each concatenated result into an array per group.
Example 4: Flattening an Array from Subquery
Sometimes, you may want to use array_agg in combination with a subquery to aggregate data more flexibly.
Code:
SELECT
array_agg(name) AS all_students
FROM
(SELECT DISTINCT name FROM students) AS unique_names; -- Get unique names before aggregation
Explanation:
- Subquery: Selects distinct student names to avoid duplicates.
- array_agg: Collects all unique student names into a single array.
Explanation of Key Points:
- Aggregating Data with array_agg: This function is ideal for collecting multiple rows into a single array, providing a compact, readable way to display grouped data.
- Using ORDER BY: The ORDER BY clause within array_agg gives control over the order of array elements, making it easier to manage sorted lists.
- Combining Expressions: array_agg can be combined with expressions and concatenations, making it possible to display multiple fields in a single array element.
All PostgreSQL Questions, Answers, and Code Snippets Collection.
It will be nice if you may share this link in any developer community or anywhere else, from where other developers may find this content. Thanks.
https://w3resource.com/PostgreSQL/snippets/postgresql-array-agg.php
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics