w3resource

SQL COUNT() with distinct


COUNT() function with distinct clause

In SQL, the COUNT() function is used to count the number of rows that match a specified condition. The DISTINCT keyword is used to return only distinct (unique) values.

  • When combined, COUNT and DISTINCT can be used to count the number of unique values in a column or a set of columns.

  • The DISTINCT clause eliminates the repetitive appearance of the same data.

  • The DISTINCT can come only once in a given select statement.

    Syntax :

    COUNT(DISTINCT expr,[expr...])
    or
    SELECT COUNT(DISTINCT expression) FROM table_name WHERE condition;
    

    Here, the term "expression" refers to the combination of columns or an expression that must be evaluated to obtain distinct values, while "condition" is an optional parameter.

    Important Points :

  • COUNT(DISTINCT column_name) will ignore NULL values because NULL is not considered a distinct value.

  • The performance of COUNT(DISTINCT ...) can be slower compared to COUNT(*) because it requires sorting and removing duplicates to find the unique values.

  • Example :

    Sample table: orders

    
       ORD_NUM ORD_AMOUNT ADVANCE_AMOUNT ORD_DATE  CUST_CODE       AGENT_CODE      ORD_DESCRIPTION
    ---------- ---------- -------------- --------- --------------- --------------- -----------------
        200114       3500           2000 15-AUG-08 C00002          A008
        200122       2500            400 16-SEP-08 C00003          A004
        200118        500            100 20-JUL-08 C00023          A006
        200119       4000            700 16-SEP-08 C00007          A010
        200121       1500            600 23-SEP-08 C00008          A004
        200130       2500            400 30-JUL-08 C00025          A011
        200134       4200           1800 25-SEP-08 C00004          A005
        200108       4000            600 15-FEB-08 C00008          A004
        200103       1500            700 15-MAY-08 C00021          A005
        200105       2500            500 18-JUL-08 C00025          A011
    ...........
    

    View the table

    1. Count the number of unique cust_code values from the orders table.

    2. Display the result with the heading "Number of employees".

    How can you write an SQL query to achieve the above?

    SQL Code:

    
    -- Counting the number of distinct values in the 'cust_code' column of the 'orders' table
    -- and aliasing the result as "Number of employees"
    SELECT COUNT(DISTINCT cust_code) AS "Number of employees"
    -- From the 'orders' table
    FROM orders;
    

    Explanation:

    • SELECT COUNT(DISTINCT cust_code): This is the main part of the SQL query. It uses the COUNT() function with the DISTINCT keyword to count the number of distinct (unique) values in the 'cust_code' column of the 'orders' table.

    • The DISTINCT keyword ensures that each unique value of 'cust_code' is counted only once.

    • AS "Number of employees": This renames the column returned by the COUNT(DISTINCT cust_code) function as "Number of employees". The AS keyword is used to give an alias or a new name to the result column. In this case, the result will have a single column with the name "Number of employees".

    • FROM orders: This specifies the source of the data for the query, which is the 'orders' table. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'orders' table.

    Output:

    Number of employees
    -------------------
                     25
    

    Visual Presentation:

    SQL COUNT WITH DISTINCT clause


    Example :

    Sample table : product_mast

    id|prod|price|year|
    --+----+-----+----+
    1|Pro1| 300|2018|
    2|Pro2| 400|2019|
    3|Pro1| 350|2018|
    4|Pro3| 500|2019|
    5|Pro2| 450|2018|

    To count the number of unique products sold in the year 2018 from the product_mast table, we can use the following SQL query:

    SQL Code:

    
    -- Counting the number of distinct values in the 'prod' column 
    SELECT COUNT(DISTINCT prod) 
    -- From the 'product_mast' table
    FROM product_mast
    -- Filtering the results to include only rows where the 'year' is equal to 2018
    WHERE year = 2018;
    

    Explanation:

    • SELECT COUNT(DISTINCT prod): This is the main part of the SQL query. It uses the COUNT() function with the DISTINCT keyword to count the number of distinct (unique) values in the 'prod' column of the 'product_mast' table.

    • The DISTINCT keyword ensures that each unique value of 'prod' is counted only once.

    • FROM product_mast: This specifies the source of the data for the query, which is the 'product_mast' table. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'product_mast' table.

    • WHERE year = 2018: This is a condition applied to the data being selected. It filters the rows from the 'product_mast' table where the value in the 'year' column is equal to 2018. This condition restricts the count to only include products from the year 2018.

    Output:

    count
    ---------
           2
    

    This will return the result 2, as there are two distinct products (Pro1 and Pro2) sold in the year 2018.

    Example :

    Sample table : product_mast

    id|prod|price|year|
    --+----+-----+----+
    1|Pro1| 300|2018|
    2|Pro2| 400|2019|
    3|Pro1| 350|2018|
    4|Pro3| 500|2019|
    5|Pro2| 450|2018|

    If we want to count the number of distinct combinations of product and year, we can use the following SQL query:

    SQL Code:

    
    -- Counting the number of distinct combinations of 'prod' and 'year' in the 'product_mast' table
    SELECT COUNT(DISTINCT (prod, year))
    -- From the 'product_mast' table
    FROM product_mast;
    

    Explanation:

    • SELECT COUNT(DISTINCT CONCAT(prod, year)): It concatenates the values of the 'prod' and 'year' columns for each row using the CONCAT() function, and then counts the number of distinct combinations of these concatenated values using the COUNT() function with the DISTINCT keyword.

    • FROM product_mast: The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'product_mast' table.

    Output:

    count
    ---------
           4
    

    This will return the result 4, as there are four distinct product and year combinations (Pro1-2018, Pro2-2019, Pro2-2018, and Pro3-2019) in the product_mast table.

    Example :

    Sample table : product_mast

    id|prod|price|year|
    --+----+-----+----+
    1|Pro1| 300|2018|
    2|Pro2| 400|2019|
    3|Pro1| 350|2018|
    4|Pro3| 500|2019|
    5|Pro2| 450|2018|

    To count the number of distinct products sold in the year 2018 with a price greater than 300, we can use the following SQL query:

    SQL Code:

    
    -- Counting the number of distinct values in the 'prod' column 
    SELECT COUNT(DISTINCT prod)
    -- From the 'product_mast' table
    FROM product_mast
    -- Filtering the results to include only rows where the 'year' is equal to 2018 
    -- and the 'price' is greater than 300
    WHERE year = 2018 AND price > 300;
    

    Explanation:

    • SELECT COUNT(DISTINCT prod): It selects the count of distinct (unique) values in the 'prod' column of the 'product_mast' table using the COUNT() function with the DISTINCT keyword.

    • FROM product_mast: This specifies the source of the data for the query. The FROM keyword is used to indicate the table from which the data will be selected.

    • WHERE year = 2018 AND price > 300;: This is a condition applied to the data being selected. It filters the rows from the 'product_mast' table where the value in the 'year' column is equal to 2018 and the value in the 'price' column is greater than 300. This condition restricts the count to only include products from the year 2018 with a price greater than 300.

    Output:

    count
    ---------
           2
    

    This will return the result 2, as there are two distinct products (Pro1 and Pro2) sold in the year 2018 with a price greater than 300.

    Example :

    Sample table : product_mast

    id|prod|price|year|
    --+----+-----+----+
    1|Pro1| 300|2018|
    2|Pro2| 400|2019|
    3|Pro1| 350|2018|
    4|Pro3| 500|2019|
    5|Pro2| 450|2018|

    To count the number of unique combinations of product and year where the price is greater than 400, you can use the following SQL query:

    SQL Code:

    -- Counting the number of distinct combinations of 'prod' and 'year' 
    SELECT COUNT(DISTINCT (prod, year))
    -- From the 'product_mast' table
    FROM product_mast
    -- Filtering the results to include only rows where the 'price' is greater than 400
    WHERE price > 400;
    

    Explanation:

    • SELECT COUNT(DISTINCT CONCAT(prod, year)): It concatenates the values of the 'prod' and 'year' columns for each row using the CONCAT() function, and then counts the number of distinct combinations of these concatenated values using the COUNT() function with the DISTINCT keyword.

    • FROM product_mast: This specifies the source of the data for the query. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'product_mast' table.

    • WHERE price > 400;: This is a condition applied to the data being selected. It filters the rows from the 'product_mast' table where the value in the 'price' column is greater than 400. This condition restricts the count to only include products with a price greater than 400.

    Output:

    count
    ---------
           2
    

    This will return the result 2, as there is two distinct products and year combination (Pro3-2022, Pro2-2018) in the product_mast table with a price greater than 400.

    SQL COUNT( ) with All

    In the following, we discuss the usage of the ALL clause with the SQL COUNT() function to count only non-NULL values for the specified column. The difference between * (asterisk) and ALL is that * counts both NULL and non-NULL values, while ALL counts only non-NULL values.

    Example:

    To count the total number of rows in the 'customer' table, including duplicate values in the 'grade' column, you can use the following SQL statement:

    SQL Code:

    
    -- Counting the total number of rows in the 'customer' table, 
    -- including duplicate values in the 'grade' column
    SELECT COUNT(ALL grade)
    -- From the 'customer' table
    FROM customer;
    

    Explanation:

    • SELECT COUNT(ALL grade): This is the main part of the SQL query. It uses the COUNT() function to count the number of values in the 'grade' column of the 'customer' table. The ALL keyword is optional and doesn't change the behavior in this context; it's used to explicitly indicate that all values, including NULLs, should be counted. However, ALL is the default behavior for COUNT() if no distinct keyword is specified.

    • FROM customer;: This specifies the source of the data for the query, which is the 'customer' table. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'customer' table.

    Sample table: customer

      
    +-----------+-------------+-------------+--------------+--------------+-------+-------------+-------------+-------------+---------------+--------------+------------+  
    |CUST_CODE  | CUST_NAME   | CUST_CITY   | WORKING_AREA | CUST_COUNTRY | GRADE | OPENING_AMT | RECEIVE_AMT | PAYMENT_AMT |OUTSTANDING_AMT| PHONE_NO     | AGENT_CODE |
    +-----------+-------------+-------------+--------------+--------------+-------+-------------+-------------+-------------+---------------+--------------+------------+
    | C00013    | Holmes      | London      | London       | UK           |     2 |     6000.00 |     5000.00 |     7000.00 |       4000.00 | BBBBBBB      | A003       |
    | C00001    | Micheal     | New York    | New York     | USA          |     2 |     3000.00 |     5000.00 |     2000.00 |       6000.00 | CCCCCCC      | A008       |
    | C00020    | Albert      | New York    | New York     | USA          |     3 |     5000.00 |     7000.00 |     6000.00 |       6000.00 | BBBBSBB      | A008       |
    | C00025    | Ravindran   | Bangalore   | Bangalore    | India        |     2 |     5000.00 |     7000.00 |     4000.00 |       8000.00 | AVAVAVA      | A011       |
    | C00024    | Cook        | London      | London       | UK           |     2 |     4000.00 |     9000.00 |     7000.00 |       6000.00 | FSDDSDF      | A006       |
    | C00015    | Stuart      | London      | London       | UK           |     1 |     6000.00 |     8000.00 |     3000.00 |      11000.00 | GFSGERS      | A003       |
    | C00002    | Bolt        | New York    | New York     | USA          |     3 |     5000.00 |     7000.00 |     9000.00 |       3000.00 | DDNRDRH      | A008       |
    | C00018    | Fleming     | Brisban     | Brisban      | Australia    |     2 |     7000.00 |     7000.00 |     9000.00 |       5000.00 | NHBGVFC      | A005       |
    | C00021    | Jacks       | Brisban     | Brisban      | Australia    |     1 |     7000.00 |     7000.00 |     7000.00 |       7000.00 | WERTGDF      | A005       |
    | C00019    | Yearannaidu | Chennai     | Chennai      | India        |     1 |     8000.00 |     7000.00 |     7000.00 |       8000.00 | ZZZZBFV      | A010       |
    ........
    

    View the table

    Output:

    COUNT(ALLGRADE)
    ---------------
                 25
    

    Note: Outputs of the said SQL statement shown here is taken by using Oracle Database 10g Express Edition.

    Here is a slide presentation of all aggregate functions.

    Frequently Asked Questions (FAQ) - SQL COUNT() with DISTINCT

    1. What does COUNT() with DISTINCT do in SQL?

    COUNT() with DISTINCT counts the number of unique values in a specified column or expression, excluding duplicates.

    2. Why would we use SQL COUNT(DISTINCT ...) instead of just SQL COUNT()?

    Use COUNT(DISTINCT ...) when we need to determine the number of unique (distinct) values in a column, as opposed to the total number of rows including duplicates.

    3. How does SQL COUNT(DISTINCT ...) handle NULL values?

    COUNT(DISTINCT ...) ignores NULL values, as NULL is not considered a distinct value.

    4. Is SQL COUNT(DISTINCT ...) slower than SQL COUNT(*)?

    Yes, COUNT(DISTINCT ...) can be slower than COUNT(*) because it requires sorting and eliminating duplicates to count the unique values.

    5. Can we use SQL COUNT(DISTINCT ...) with multiple columns?

    Yes, COUNT(DISTINCT ...) can be used with multiple columns to count unique combinations of values across those columns.

    6. How many times can SQL DISTINCT appear in a single SELECT statement?

    The DISTINCT keyword can only appear once in a given SELECT statement.

    7. What is the general format for using SQL COUNT(DISTINCT ...) in a query?

    The general format involves specifying the column(s) you want to count the unique values of, using the DISTINCT keyword within the COUNT() function..

    Previous: COUNT Function
    Next: COUNT with Group by

    

    Follow us on Facebook and Twitter for latest update.