SQL COUNT() with distinct
COUNT() function with distinct clause
In SQL, the COUNT() function is used to count the number of rows that match a specified condition. The DISTINCT keyword is used to return only distinct (unique) values.
Syntax :
COUNT(DISTINCT expr,[expr...]) or SELECT COUNT(DISTINCT expression) FROM table_name WHERE condition;
Here, the term "expression" refers to the combination of columns or an expression that must be evaluated to obtain distinct values, while "condition" is an optional parameter.
Important Points :
Example :
Sample table: orders
ORD_NUM ORD_AMOUNT ADVANCE_AMOUNT ORD_DATE CUST_CODE AGENT_CODE ORD_DESCRIPTION ---------- ---------- -------------- --------- --------------- --------------- ----------------- 200114 3500 2000 15-AUG-08 C00002 A008 200122 2500 400 16-SEP-08 C00003 A004 200118 500 100 20-JUL-08 C00023 A006 200119 4000 700 16-SEP-08 C00007 A010 200121 1500 600 23-SEP-08 C00008 A004 200130 2500 400 30-JUL-08 C00025 A011 200134 4200 1800 25-SEP-08 C00004 A005 200108 4000 600 15-FEB-08 C00008 A004 200103 1500 700 15-MAY-08 C00021 A005 200105 2500 500 18-JUL-08 C00025 A011 ...........
1. Count the number of unique cust_code values from the orders table.
2. Display the result with the heading "Number of employees".
How can you write an SQL query to achieve the above?
SQL Code:
-- Counting the number of distinct values in the 'cust_code' column of the 'orders' table
-- and aliasing the result as "Number of employees"
SELECT COUNT(DISTINCT cust_code) AS "Number of employees"
-- From the 'orders' table
FROM orders;
Explanation:
- SELECT COUNT(DISTINCT cust_code): This is the main part of the SQL query. It uses the COUNT() function with the DISTINCT keyword to count the number of distinct (unique) values in the 'cust_code' column of the 'orders' table.
- The DISTINCT keyword ensures that each unique value of 'cust_code' is counted only once.
- AS "Number of employees": This renames the column returned by the COUNT(DISTINCT cust_code) function as "Number of employees". The AS keyword is used to give an alias or a new name to the result column. In this case, the result will have a single column with the name "Number of employees".
- FROM orders: This specifies the source of the data for the query, which is the 'orders' table. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'orders' table.
Output:
Number of employees ------------------- 25
Visual Presentation:
Example :
Sample table : product_mast
id|prod|price|year|
--+----+-----+----+
1|Pro1| 300|2018|
2|Pro2| 400|2019|
3|Pro1| 350|2018|
4|Pro3| 500|2019|
5|Pro2| 450|2018|
To count the number of unique products sold in the year 2018 from the product_mast table, we can use the following SQL query:
SQL Code:
-- Counting the number of distinct values in the 'prod' column
SELECT COUNT(DISTINCT prod)
-- From the 'product_mast' table
FROM product_mast
-- Filtering the results to include only rows where the 'year' is equal to 2018
WHERE year = 2018;
Explanation:
- SELECT COUNT(DISTINCT prod): This is the main part of the SQL query. It uses the COUNT() function with the DISTINCT keyword to count the number of distinct (unique) values in the 'prod' column of the 'product_mast' table.
- The DISTINCT keyword ensures that each unique value of 'prod' is counted only once.
- FROM product_mast: This specifies the source of the data for the query, which is the 'product_mast' table. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'product_mast' table.
- WHERE year = 2018: This is a condition applied to the data being selected. It filters the rows from the 'product_mast' table where the value in the 'year' column is equal to 2018. This condition restricts the count to only include products from the year 2018.
Output:
count --------- 2
This will return the result 2, as there are two distinct products (Pro1 and Pro2) sold in the year 2018.
Example :
Sample table : product_mast
id|prod|price|year|
--+----+-----+----+
1|Pro1| 300|2018|
2|Pro2| 400|2019|
3|Pro1| 350|2018|
4|Pro3| 500|2019|
5|Pro2| 450|2018|
If we want to count the number of distinct combinations of product and year, we can use the following SQL query:
SQL Code:
-- Counting the number of distinct combinations of 'prod' and 'year' in the 'product_mast' table
SELECT COUNT(DISTINCT (prod, year))
-- From the 'product_mast' table
FROM product_mast;
Explanation:
- SELECT COUNT(DISTINCT CONCAT(prod, year)): It concatenates the values of the 'prod' and 'year' columns for each row using the CONCAT() function, and then counts the number of distinct combinations of these concatenated values using the COUNT() function with the DISTINCT keyword.
- FROM product_mast: The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'product_mast' table.
Output:
count --------- 4
This will return the result 4, as there are four distinct product and year combinations (Pro1-2018, Pro2-2019, Pro2-2018, and Pro3-2019) in the product_mast table.
Example :
Sample table : product_mast
id|prod|price|year|
--+----+-----+----+
1|Pro1| 300|2018|
2|Pro2| 400|2019|
3|Pro1| 350|2018|
4|Pro3| 500|2019|
5|Pro2| 450|2018|
To count the number of distinct products sold in the year 2018 with a price greater than 300, we can use the following SQL query:
SQL Code:
-- Counting the number of distinct values in the 'prod' column
SELECT COUNT(DISTINCT prod)
-- From the 'product_mast' table
FROM product_mast
-- Filtering the results to include only rows where the 'year' is equal to 2018
-- and the 'price' is greater than 300
WHERE year = 2018 AND price > 300;
Explanation:
- SELECT COUNT(DISTINCT prod): It selects the count of distinct (unique) values in the 'prod' column of the 'product_mast' table using the COUNT() function with the DISTINCT keyword.
- FROM product_mast: This specifies the source of the data for the query. The FROM keyword is used to indicate the table from which the data will be selected.
- WHERE year = 2018 AND price > 300;: This is a condition applied to the data being selected. It filters the rows from the 'product_mast' table where the value in the 'year' column is equal to 2018 and the value in the 'price' column is greater than 300. This condition restricts the count to only include products from the year 2018 with a price greater than 300.
Output:
count --------- 2
This will return the result 2, as there are two distinct products (Pro1 and Pro2) sold in the year 2018 with a price greater than 300.
Example :
Sample table : product_mast
id|prod|price|year|
--+----+-----+----+
1|Pro1| 300|2018|
2|Pro2| 400|2019|
3|Pro1| 350|2018|
4|Pro3| 500|2019|
5|Pro2| 450|2018|
To count the number of unique combinations of product and year where the price is greater than 400, you can use the following SQL query:
SQL Code:
-- Counting the number of distinct combinations of 'prod' and 'year'
SELECT COUNT(DISTINCT (prod, year))
-- From the 'product_mast' table
FROM product_mast
-- Filtering the results to include only rows where the 'price' is greater than 400
WHERE price > 400;
Explanation:
- SELECT COUNT(DISTINCT CONCAT(prod, year)): It concatenates the values of the 'prod' and 'year' columns for each row using the CONCAT() function, and then counts the number of distinct combinations of these concatenated values using the COUNT() function with the DISTINCT keyword.
- FROM product_mast: This specifies the source of the data for the query. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'product_mast' table.
- WHERE price > 400;: This is a condition applied to the data being selected. It filters the rows from the 'product_mast' table where the value in the 'price' column is greater than 400. This condition restricts the count to only include products with a price greater than 400.
Output:
count --------- 2
This will return the result 2, as there is two distinct products and year combination (Pro3-2022, Pro2-2018) in the product_mast table with a price greater than 400.
SQL COUNT( ) with All
In the following, we discuss the usage of the ALL clause with the SQL COUNT() function to count only non-NULL values for the specified column. The difference between * (asterisk) and ALL is that * counts both NULL and non-NULL values, while ALL counts only non-NULL values.
Example:
To count the total number of rows in the 'customer' table, including duplicate values in the 'grade' column, you can use the following SQL statement:
SQL Code:
-- Counting the total number of rows in the 'customer' table,
-- including duplicate values in the 'grade' column
SELECT COUNT(ALL grade)
-- From the 'customer' table
FROM customer;
Explanation:
- SELECT COUNT(ALL grade): This is the main part of the SQL query. It uses the COUNT() function to count the number of values in the 'grade' column of the 'customer' table. The ALL keyword is optional and doesn't change the behavior in this context; it's used to explicitly indicate that all values, including NULLs, should be counted. However, ALL is the default behavior for COUNT() if no distinct keyword is specified.
- FROM customer;: This specifies the source of the data for the query, which is the 'customer' table. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'customer' table.
Sample table: customer
+-----------+-------------+-------------+--------------+--------------+-------+-------------+-------------+-------------+---------------+--------------+------------+ |CUST_CODE | CUST_NAME | CUST_CITY | WORKING_AREA | CUST_COUNTRY | GRADE | OPENING_AMT | RECEIVE_AMT | PAYMENT_AMT |OUTSTANDING_AMT| PHONE_NO | AGENT_CODE | +-----------+-------------+-------------+--------------+--------------+-------+-------------+-------------+-------------+---------------+--------------+------------+ | C00013 | Holmes | London | London | UK | 2 | 6000.00 | 5000.00 | 7000.00 | 4000.00 | BBBBBBB | A003 | | C00001 | Micheal | New York | New York | USA | 2 | 3000.00 | 5000.00 | 2000.00 | 6000.00 | CCCCCCC | A008 | | C00020 | Albert | New York | New York | USA | 3 | 5000.00 | 7000.00 | 6000.00 | 6000.00 | BBBBSBB | A008 | | C00025 | Ravindran | Bangalore | Bangalore | India | 2 | 5000.00 | 7000.00 | 4000.00 | 8000.00 | AVAVAVA | A011 | | C00024 | Cook | London | London | UK | 2 | 4000.00 | 9000.00 | 7000.00 | 6000.00 | FSDDSDF | A006 | | C00015 | Stuart | London | London | UK | 1 | 6000.00 | 8000.00 | 3000.00 | 11000.00 | GFSGERS | A003 | | C00002 | Bolt | New York | New York | USA | 3 | 5000.00 | 7000.00 | 9000.00 | 3000.00 | DDNRDRH | A008 | | C00018 | Fleming | Brisban | Brisban | Australia | 2 | 7000.00 | 7000.00 | 9000.00 | 5000.00 | NHBGVFC | A005 | | C00021 | Jacks | Brisban | Brisban | Australia | 1 | 7000.00 | 7000.00 | 7000.00 | 7000.00 | WERTGDF | A005 | | C00019 | Yearannaidu | Chennai | Chennai | India | 1 | 8000.00 | 7000.00 | 7000.00 | 8000.00 | ZZZZBFV | A010 | ........
Output:
COUNT(ALLGRADE) --------------- 25
Note: Outputs of the said SQL statement shown here is taken by using Oracle Database 10g Express Edition.
Here is a slide presentation of all aggregate functions.
Frequently Asked Questions (FAQ) - SQL COUNT() with DISTINCT
1. What does COUNT() with DISTINCT do in SQL?
COUNT() with DISTINCT counts the number of unique values in a specified column or expression, excluding duplicates.
2. Why would we use SQL COUNT(DISTINCT ...) instead of just SQL COUNT()?
Use COUNT(DISTINCT ...) when we need to determine the number of unique (distinct) values in a column, as opposed to the total number of rows including duplicates.
3. How does SQL COUNT(DISTINCT ...) handle NULL values?
COUNT(DISTINCT ...) ignores NULL values, as NULL is not considered a distinct value.
4. Is SQL COUNT(DISTINCT ...) slower than SQL COUNT(*)?
Yes, COUNT(DISTINCT ...) can be slower than COUNT(*) because it requires sorting and eliminating duplicates to count the unique values.
5. Can we use SQL COUNT(DISTINCT ...) with multiple columns?
Yes, COUNT(DISTINCT ...) can be used with multiple columns to count unique combinations of values across those columns.
6. How many times can SQL DISTINCT appear in a single SELECT statement?
The DISTINCT keyword can only appear once in a given SELECT statement.
7. What is the general format for using SQL COUNT(DISTINCT ...) in a query?
The general format involves specifying the column(s) you want to count the unique values of, using the DISTINCT keyword within the COUNT() function..
Previous: COUNT Function
Next: COUNT with Group by
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics