Find duplicate data in MySQL
Objective
There are many occasions when you need to find duplicate values available in a column of a MySql table. Often, you may want to count the number of duplicate values in a MySQL table.
In this article, we have discussed a query where you can find duplicates, triplicates, quadruplicates (or more) data from a MySQL table.
We have discussed how to find duplicate values with INNER JOIN and subquery, INNER JOIN and DISTINCT, and also how to count duplicate values with GROUP BY and HAVING.
Table in question
We have used a table called 'item' to apply the query :
Table Name: item
Structure:
item_code varchar(20), value int(11), quantity int(11) where item_code is the primary key.
![item master](https://www.w3resource.com/w3r_images/item-master.png)
Using INNER JOIN and Subquery
Now we want to get the details of those records where quantity field have duplicate/triplicates values. In the image above, values marked with red rectangle exist more than once.
![item masterduplicate](https://www.w3resource.com/w3r_images/item-master-duplicate.png)
Here is the query:
Code:
SELECT item_code, value, item.quantity
FROM item
INNER JOIN(
SELECT quantity
FROM item
GROUP BY quantity
HAVING COUNT(item_code) >1
)temp ON item.quantity= temp.quantity;
Relational Algebra Expression:
![Relational Algebra Expression: Using INNER JOIN and Subquery.](https://www.w3resource.com/w3r_images/find-duplicate-data-mysql-using-inner-join-and-subquery-relational-algebra.png)
Relational Algebra Tree:
![Relational Algebra Tree: Using INNER JOIN and Subquery.](https://www.w3resource.com/w3r_images/find-duplicate-data-mysql-using-inner-join-and-subquery-relational-algebra-tree-diagram.png)
Sample Output:
![item master duplicate result](https://www.w3resource.com/w3r_images/item-master-duplicate-result.png)
To get the above result we have used a query with an INNER JOIN (INNER JOIN selects all rows from both participating tables as long as there is a match between the columns.) statement. INNER JOIN uses the main table 'item' and a temporary table 'temp' whose data comes from a subquery. Here is the subquery and it's output:
Code:
SELECT quantity
FROM item
GROUP BY quantity
HAVING COUNT(item_code) >1
Relational Algebra Expression:
![Relational Algebra Expression: Using INNER JOIN and Subquery.](https://www.w3resource.com/w3r_images/find-duplicate-data-mysql-using-inner-join-and-subquery-1-relational-algebra.png)
Relational Algebra Tree:
![Relational Algebra Tree: Using INNER JOIN and Subquery.](https://www.w3resource.com/w3r_images/find-duplicate-data-mysql-using-inner-join-and-subquery-1-relational-algebra-tree-diagram.png)
Sample Output:
![item-duplicate-data-subquery](https://www.w3resource.com/w3r_images/item-duplicate-data-subquery.png)
Now the following main query will execute on 'item' and 'temp' tables where the common field is quantity and the result will be as follows:
Code:
SELECT item_code, value, item.quantity
FROM item
INNER JOIN temp ON item.quantity= temp.quantity;
Relational Algebra Expression:
![Relational Algebra Expression: Using INNER JOIN and Subquery.](https://www.w3resource.com/w3r_images/find-duplicate-data-mysql-using-inner-join-and-subquery-2-relational-algebra.png)
Relational Algebra Tree:
![Relational Algebra Tree: Using INNER JOIN and Subquery.](https://www.w3resource.com/w3r_images/find-duplicate-data-mysql-using-inner-join-and-subquery-2-relational-algebra-tree-diagram.png)
Using INNER JOIN and DISTINCT
You can use the following query to get the same result. Here we apply INNER JOIN the table with itself. As the same quantity value exists in more than two records, a DISTINCT clause is used.
Here is the code and the output :
Code:
SELECT distinct a.item_code, a.value, a.quantity
FROM item a
INNER JOIN item b ON a.quantity = b.quantity
WHERE a.item_code <> b.item_code
Relational Algebra Expression:
![Relational Algebra Expression: Using INNER JOIN and DISTINCT.](https://www.w3resource.com/w3r_images/find-duplicate-data-mysql-using-inner-join-and-distinct-relational-algebra.png)
Relational Algebra Tree:
![Relational Algebra Tree: Using INNER JOIN and DISTINCT.](https://www.w3resource.com/w3r_images/find-duplicate-data-mysql-using-inner-join-and-distinct-relational-algebra-tree-diagram.png)
Sample Output:
![item-master-duplicate-result2](https://www.w3resource.com/w3r_images/item-master-duplicate-result2.png)
Count duplicate data in MySQL
The following query count those records where quantity field holds duplicate/triplicates (or more) data.
Table data:
![item master](https://www.w3resource.com/w3r_images/item-master.png)
Code:
SELECT item_code, COUNT( quantity ) x
FROM item
GROUP BY quantity
HAVING x >1
Sample Output:
![item-master-duplicate-data-count](https://www.w3resource.com/w3r_images/item-master-duplicate-data-count.png)
Count duplicate records in MySQL
To count the total duplicate (or more) 'quantity' of 'item' table you can use the following query:
Code:
SELECT count(*) AS Total_duplicate_count
FROM
(SELECT item_code FROM item
GROUP BY quantity HAVING COUNT(quantity) > 1
)AS x
Relational Algebra Expression:
![Relational Algebra Expression: Count duplicate records in MySQL.](https://www.w3resource.com/w3r_images/count-duplicate-records-in-mysql-relational-algebra.png)
Relational Algebra Tree:
![Relational Algebra Tree: Count duplicate records in MySQL.](https://www.w3resource.com/w3r_images/count-duplicate-records-in-mysql-relational-algebra-tree-diagram.png)
Sample Output:
![item-master-duplicate-data-count-product](https://www.w3resource.com/w3r_images/item-master-duplicate-data-count-product.png)
Previous: MySQL Miscellaneous Functions
Next: Rows holding group wise maximum for a column
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics