w3resource

Pandas - Detecting duplicate rows in a DataFrame using duplicated()


3. Data Cleaning Techniques

Write a Pandas program to detect duplicate rows using the duplicated() method.

In this exercise, you will identify duplicate rows in a DataFrame using the duplicated() method.

Sample Solution:

Code:

import pandas as pd

# Create a sample DataFrame with duplicate rows
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})

# Detect duplicates in the DataFrame
duplicates = df.duplicated()

# Output the result
print(duplicates)

Output:

0    False
1    False
2    False
3     True
dtype: bool

Explanation:

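  • Created a DataFrame in which the row at index 3 repeats the row at index 0.
  • Called duplicated(), which by default compares all columns and marks a row as True only if an identical row appears earlier in the DataFrame; the first occurrence stays False.
  • Printed the resulting Boolean Series, where True at index 3 identifies the duplicate row.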
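
The default behaviour described above can be adjusted with the keep and subset parameters of duplicated(). The following is a minimal sketch, not part of the original solution: it reuses the sample data and adds one hypothetical extra row (a third 'David' with a different salary) so that the effect of subset is visible. The variable names are illustrative only.

```python
import pandas as pd

# Sample data from the exercise, plus one hypothetical extra row (index 4)
# added only to make the subset example below more informative.
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David', 'David'],
    'Age': [25, 30, 22, 25, 25],
    'Salary': [50000, 60000, 70000, 50000, 55000]
})

# Default (keep='first'): only later repeats of a row are flagged.
first_mask = df.duplicated()

# keep='last': earlier copies are flagged and the last occurrence is not.
last_mask = df.duplicated(keep='last')

# keep=False: every row that has an identical twin is flagged.
all_mask = df.duplicated(keep=False)

# subset: compare only the listed columns instead of the whole row.
name_age_mask = df.duplicated(subset=['Name', 'Age'])

# The Boolean Series can be used as a mask to count or select rows.
num_duplicates = int(first_mask.sum())
duplicate_rows = df[first_mask]
deduplicated = df[~first_mask]  # same rows as df.drop_duplicates()

print(first_mask.tolist())     # [False, False, False, True, False]
print(last_mask.tolist())      # [True, False, False, False, False]
print(all_mask.tolist())       # [True, False, False, True, False]
print(name_age_mask.tolist())  # [False, False, False, True, True]
print(num_duplicates)          # 1
```

Using keep=False is handy when you want to inspect every copy of a repeated record before deciding which one to retain, while subset helps when only some columns define what counts as "the same" record.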

For more practice, solve these related problems:

  • Write a Pandas program to detect and remove rows with outliers using the IQR method.
  • Write a Pandas program to normalize a numeric column using Z-score scaling.
  • Write a Pandas program to apply a function to clean a text column.
  • Write a Pandas program to convert categorical data into numerical values.
