w3resource

Pandas - Removing duplicate rows in a DataFrame using drop_duplicates()


Pandas: Data Cleaning and Preprocessing Exercise-4 with Solution


Write a Pandas program to remove duplicate rows from a DataFrame.

This exercise demonstrates how to remove duplicate rows from a DataFrame using drop_duplicates().

Sample Solution:

Code:

import pandas as pd

# Create a sample DataFrame with duplicate rows
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})

# Remove duplicate rows from the DataFrame
df_no_duplicates = df.drop_duplicates()

# Output the result
print(df_no_duplicates)

Output:

      Name  Age  Salary
0    David   25   50000
1  Annabel   30   60000
2  Charlie   22   70000

Explanation:

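  • Created a DataFrame in which the row at index 3 repeats the row at index 0.
  • Called drop_duplicates(), which by default compares all columns and keeps the first occurrence of each duplicated row.
  • Printed the returned DataFrame; the original df is left unchanged, and the remaining rows keep their original index labels.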
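
The steps above rely on the default behavior of drop_duplicates(). As a minimal sketch, not part of the original solution, the example below reuses the same sample data to show duplicated() for inspecting which rows are flagged, and the keep, subset, and ignore_index parameters of drop_duplicates(). The variable names mask, last_kept, and by_name are illustrative, and ignore_index assumes pandas 1.0 or later.

```python
import pandas as pd

# Same sample data as in the solution above
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})

# Boolean mask: True for each row that repeats an earlier row
mask = df.duplicated()
print(mask.tolist())

# Keep the last occurrence of each duplicated row instead of the first
last_kept = df.drop_duplicates(keep='last')
print(last_kept.index.tolist())

# Compare only the 'Name' column and renumber the index from 0
# (ignore_index assumes pandas >= 1.0)
by_name = df.drop_duplicates(subset=['Name'], ignore_index=True)
print(by_name)
```

Restricting the comparison with subset is useful when rows should count as duplicates if only certain key columns match, even when other columns differ.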
