SQL SELECT with DISTINCT on multiple columns
DISTINCT on multiple columns
In SQL multiple fields may also be added with DISTINCT clause. DISTINCT will eliminate those rows where all the selected fields are identical.
Contents:
Example: Sample SELECT statement
This query retrieves specific columns from the orders table, filtered by agent_code='A002'
SQL Code:
SELECT agent_code, ord_amount, cust_code, ord_num
-- Select specific columns: agent_code, ord_amount, cust_code, ord_num
FROM orders
-- From the table 'orders'
WHERE agent_code='A002';
-- Filter the results to only include rows where the 'agent_code' is 'A002'
Explanation:
- SELECT agent_code, ord_amount, cust_code, ord_num: This line specifies the columns that you want to retrieve data from. It selects the columns 'agent_code', 'ord_amount', 'cust_code', and 'ord_num' from the 'orders' table.
- FROM orders: This line specifies the table from which you want to retrieve data. In this case, it's the 'orders' table.
- WHERE agent_code='A002': This line specifies a condition for filtering the results. It filters the results to only include rows where the value in the 'agent_code' column is 'A002'. This condition acts as a filter, allowing only rows with 'A002' as the agent code to be included in the result set.
Relational Algebra Expression:
Relational Algebra Tree:
Sample table: orders
Output:
AGENT_CODE ORD_AMOUNT CUST_CODE ORD_NUM ---------- ---------- ---------- ---------- A002 2500 C00005 200106 A002 500 C00022 200123 A002 500 C00009 200120 A002 500 C00022 200126 A002 3500 C00009 200128 A002 1200 C00009 200133 A002 4000 C00022 200113
The above result shows the same agent_code, ord_amount and cust_code appears more than once in theorders table.
Example: SELECT with DISTINCT on two columns
The SQL statement below retrieves identical rows based on the 'agent_code' and 'ord_amount' columns from the orders table :
SQL Code:
SELECT DISTINCT agent_code, ord_amount
-- Select distinct combinations of 'agent_code' and 'ord_amount'
FROM orders
-- From the table 'orders'
WHERE agent_code='A002';
-- Filter the results to only include rows where the 'agent_code' is 'A002'
Explanation:
- SELECT DISTINCT agent_code, ord_amount: This line specifies that you want to retrieve unique combinations of 'agent_code' and 'ord_amount'. The DISTINCT keyword ensures that only unique combinations are returned; any duplicate combinations will be eliminated.
- FROM orders: This line specifies the table from which you want to retrieve data. In this case, it's the 'orders' table.
- WHERE agent_code='A002': This line specifies a condition for filtering the results. It filters the results to only include rows where the value in the 'agent_code' column is 'A002'. This condition acts as a filter, allowing only rows with 'A002' as the agent code to be included in the result set.
Relational Algebra Expression:
Relational Algebra Tree:
Output:
AGENT_CODE ORD_AMOUNT ---------- ---------- A002 3500 A002 1200 A002 4000 A002 500 A002 2500
Pictorial presentation:
Example: SELECT with DISTINCT on three columns
The following SQL statement can be used to retrieve identical rows based on the 'agent_code', 'ord_amount', and 'cust_code' columns once from the orders table:
SQL Code:
SELECT DISTINCT agent_code, ord_amount, cust_code
-- Select distinct combinations of 'agent_code', 'ord_amount', and 'cust_code'
FROM orders
-- From the table 'orders'
WHERE agent_code='A002';
-- Filter the results to only include rows where the 'agent_code' is 'A002'
Explanation:
- SELECT DISTINCT agent_code, ord_amount, cust_code: This line specifies that you want to retrieve unique combinations of 'agent_code', 'ord_amount', and 'cust_code'. The DISTINCT keyword ensures that only unique combinations are returned; any duplicate combinations will be eliminated.
- FROM orders: This line specifies the table from which you want to retrieve data. In this case, it's the 'orders' table.
- WHERE agent_code='A002': This line specifies a condition for filtering the results. It filters the results to only include rows where the value in the 'agent_code' column is 'A002'. This condition acts as a filter, allowing only rows with 'A002' as the agent code to be included in the result set.
Relational Algebra Expression:
Relational Algebra Tree:
Output:
AGENT_CODE ORD_AMOUNT CUST_CODE ---------- ---------- ---------- A002 500 C00022 A002 3500 C00009 A002 2500 C00005 A002 500 C00009 A002 4000 C00022 A002 1200 C00009
Pictorial presentation:
Example : SELECT with DISTINCT on all columns of the first query
To get the identical rows (on four columns agent_code, ord_amount, cust_code, and ord_num) once from the orders table , the following SQL statement can be used :
SQL Code:
SELECT DISTINCT agent_code, ord_amount, cust_code, ord_num
-- Select distinct combinations of 'agent_code', 'ord_amount', 'cust_code', and 'ord_num'
FROM orders
-- From the table 'orders'
WHERE agent_code='A002';
-- Filter the results to only include rows where the 'agent_code' is 'A002'
Explanation:
- SELECT DISTINCT agent_code, ord_amount, cust_code, ord_num: This line specifies that you want to retrieve unique combinations of 'agent_code', 'ord_amount', 'cust_code', and 'ord_num'. The DISTINCT keyword ensures that only unique combinations are returned; any duplicate combinations will be eliminated.
- FROM orders: This line specifies the table from which you want to retrieve data. In this case, it's the 'orders' table.
- WHERE agent_code='A002': This line specifies a condition for filtering the results. It filters the results to only include rows where the value in the 'agent_code' column is 'A002'. This condition acts as a filter, allowing only rows with 'A002' as the agent code to be included in the result set.
Relational Algebra Expression:
Relational Algebra Tree:
Output:
AGENT_CODE ORD_AMOUNT CUST_CODE ORD_NUM ---------- ---------- ---------- ---------- A002 500 C00022 200126 A002 2500 C00005 200106 A002 500 C00009 200120 A002 1200 C00009 200133 A002 4000 C00022 200113 A002 3500 C00009 200128 A002 500 C00022 200123
In the above output, all rows whose agent_code is 'A002' have returned because there is no identical rows on agent_code, ord_amount, cust_code and ord_num. See the following presentation :
Pictorial presentation:
SELECT with DISTINCT on multiple columns and ORDER BY clause
You can use an order by clause in the select statement with distinct on multiple columns. Here is an example:
SQL Code:
SELECT DISTINCT agent_code, ord_amount
-- Select distinct combinations of 'agent_code' and 'ord_amount'
FROM orders
-- From the table 'orders'
WHERE agent_code='A002'
-- Filter the results to only include rows where the 'agent_code' is 'A002'
ORDER BY ord_amount;
-- Sort the results in ascending order based on the 'ord_amount' column
Explanation:
- SELECT DISTINCT agent_code, ord_amount: This line specifies that you want to retrieve unique combinations of 'agent_code' and 'ord_amount'. The DISTINCT keyword ensures that only unique combinations are returned; any duplicate combinations will be eliminated.
- FROM orders: This line specifies the table from which you want to retrieve data. In this case, it's the 'orders' table.
- WHERE agent_code='A002': This line specifies a condition for filtering the results. It filters the results to only include rows where the value in the 'agent_code' column is 'A002'. This condition acts as a filter, allowing only rows with 'A002' as the agent code to be included in the result set.
- ORDER BY ord_amount: This line specifies how the results should be sorted. It sorts the results in ascending order based on the 'ord_amount' column. This means that the rows will be arranged from the smallest to the largest 'ord_amount'.
Relational Algebra Expression:
Relational Algebra Tree:
Output:
AGENT_CODE ORD_AMOUNT ---------- ---------- A002 500 A002 1200 A002 2500 A002 3500 A002 4000
Pictorial presentation:
COUNT() function and SELECT with DISTINCT on multiple columns
You can use the count() function in a select statement with distinct on multiple columns to count the distinct rows. Here is an example:
SELECT COUNT(*)
-- Count the number of rows in the result set
FROM (
SELECT DISTINCT agent_code, ord_amount, cust_code
-- Select distinct combinations of 'agent_code', 'ord_amount', and 'cust_code'
FROM orders
-- From the table 'orders'
WHERE agent_code='A002'
-- Filter the results to only include rows where the 'agent_code' is 'A002'
);
Explanation:
- SELECT COUNT(*): This line specifies that you want to count the number of rows in the result set.
- FROM (...): This line wraps the inner query in parentheses and treats it as a subquery. The subquery selects distinct combinations of 'agent_code', 'ord_amount', and 'cust_code' from the 'orders' table where the 'agent_code' is 'A002'.
- The inner query:
- SELECT DISTINCT agent_code, ord_amount, cust_code: Selects distinct combinations of 'agent_code', 'ord_amount', and 'cust_code' from the 'orders' table.
- FROM orders: Specifies the table from which you want to retrieve data. In this case, it's the 'orders' table.
- WHERE agent_code='A002': Filters the results to only include rows where the value in the 'agent_code' column is 'A002'.
Output:
COUNT(*) ---------- 6
Pictorial presentation:
Check out our 1000+ SQL Exercises with solution and explanation to improve your skills.
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics