Pandas - Detecting duplicate rows in a DataFrame using duplicated()
3. Data Cleaning Techniques
Write a Pandas program to detect duplicate rows using the duplicated() method.
In this exercise, you will identify duplicate rows in a DataFrame using the duplicated() method.
Sample Solution:
Code:
import pandas as pd
# Create a sample DataFrame with duplicate rows
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})
# Detect duplicates in the DataFrame
duplicates = df.duplicated()
# Output the result
print(duplicates)
Output:
0    False
1    False
2    False
3     True
dtype: bool
Explanation:
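- Created a DataFrame in which the last row (index 3) repeats the first row (index 0).
- Called duplicated() to flag every row that is identical, across all columns, to an earlier row.
- Printed the resulting Boolean Series: True marks a duplicate, and the first occurrence of a repeated row is marked False by default.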
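The Boolean Series can also be used as a mask, and duplicated() accepts keep and subset arguments that change which rows are flagged. The following is a minimal sketch on the same sample data; the variable names (first_mask, all_mask, name_mask, duplicate_rows, unique_rows) are chosen here for illustration and are not part of the original solution.

```python
import pandas as pd

# Same sample data as the solution above
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})

# Default (keep='first'): only later repeats are flagged
first_mask = df.duplicated()

# keep=False: every row that belongs to a duplicate group is flagged
all_mask = df.duplicated(keep=False)

# subset: compare only the listed columns instead of the whole row
name_mask = df.duplicated(subset=['Name'])

# Use the mask to select the duplicate rows, or invert it to keep the rest
duplicate_rows = df[first_mask]
unique_rows = df[~first_mask]

print(all_mask.tolist())
print(duplicate_rows)
print(unique_rows)
```

With the default keep='first', selecting df[~first_mask] keeps the same rows as df.drop_duplicates(), so the mask form is mainly useful when the duplicates need to be inspected before they are dropped.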
For more practice, solve these related problems:
- Write a Pandas program to detect and remove rows with outliers using the IQR method.
- Write a Pandas program to normalize a numeric column using Z-score scaling.
- Write a Pandas program to apply a function to clean a text column.
- Write a Pandas program to convert categorical data into numerical values.