Pandas - Removing duplicate rows in a DataFrame using drop_duplicates()

Last update on May 05 2025 13:03:41 (UTC/GMT +8 hours)

4. Removing Duplicate Rows in Pandas

Write a Pandas program to remove duplicate rows from a DataFrame.

This exercise demonstrates how to remove duplicate rows from a DataFrame using drop_duplicates().

Sample Solution:

Code:

import pandas as pd

# Create a sample DataFrame with duplicate rows
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})

# Remove duplicate rows from the DataFrame
df_no_duplicates = df.drop_duplicates()

# Output the result
print(df_no_duplicates)

Output:

      Name  Age  Salary
0    David   25   50000
1  Annabel   30   60000
2  Charlie   22   70000
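Before dropping anything, it can help to see which rows pandas treats as duplicates. The following is a minimal sketch, assuming the same sample data as above; the variable names mask, num_duplicates, and same_result are illustrative and not part of the original solution.

```python
import pandas as pd

# Same sample data as in the solution above
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})

# duplicated() returns a boolean Series that is True for each row
# repeating an earlier row (the first occurrence is not flagged)
mask = df.duplicated()
print(mask.tolist())  # [False, False, False, True]

# Number of rows that drop_duplicates() would remove
num_duplicates = int(mask.sum())
print(num_duplicates)  # 1

# Filtering out the flagged rows matches the drop_duplicates() result
same_result = df[~mask].equals(df.drop_duplicates())
print(same_result)  # True
```

Only the row at index 3 is flagged, which is why the output above keeps the rows at index 0, 1, and 2.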

Explanation:

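Created a DataFrame in which the last row (index 3) repeats the first row (index 0).
Called drop_duplicates(), which by default compares all columns and keeps the first occurrence of each duplicated row.
Stored the returned DataFrame in df_no_duplicates; the original df is left unchanged.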
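drop_duplicates() also accepts the subset, keep, and ignore_index parameters, which control which columns are compared, which occurrence survives, and whether the index is renumbered. The sketch below illustrates them on hypothetical data that is not part of the original exercise: the second 'David' row has a different Salary, so it is no longer a full-row duplicate.

```python
import pandas as pd

# Hypothetical data: the two 'David' rows differ only in Salary
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 55000]
})

# No two rows match on every column, so nothing is removed
unchanged = df.drop_duplicates()
print(len(unchanged))  # 4

# Compare only Name and Age, keep the last occurrence,
# and renumber the index from 0
by_name_age = df.drop_duplicates(
    subset=['Name', 'Age'], keep='last', ignore_index=True
)
print(by_name_age)

# keep=False drops every row that has a duplicate, including the first
no_repeats = df.drop_duplicates(subset=['Name', 'Age'], keep=False)
print(no_repeats)
```

Restricting the comparison with subset is useful when a record counts as a repeat even though a less important column differs; keep=False suits cases where any repeated key should be discarded entirely.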

For more practice, solve these related problems:

Write a Pandas program to extract numbers from a text column.
Write a Pandas program to find and replace specific words in a text column.
Write a Pandas program to count occurrences of a word in each row of a column.
Write a Pandas program to split a column containing full names into first and last names.
