Pandas - Removing duplicate rows in a DataFrame using drop_duplicates()
4. Removing Duplicate Rows in Pandas
Write a Pandas program to remove duplicate rows from a DataFrame.
This exercise demonstrates how to remove duplicate rows from a DataFrame using drop_duplicates().
Sample Solution:
Code:
import pandas as pd
# Create a sample DataFrame with duplicate rows
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})
# Remove duplicate rows from the DataFrame
df_no_duplicates = df.drop_duplicates()
# Output the result
print(df_no_duplicates)
Output:
      Name  Age  Salary
0    David   25   50000
1  Annabel   30   60000
2  Charlie   22   70000
Explanation:
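- Created a DataFrame in which the last row (David, 25, 50000) is an exact copy of the first.
- Called drop_duplicates(), which by default compares all columns and keeps the first occurrence of each repeated row.
- Printed the new DataFrame. The original df is left unchanged, and the surviving rows keep their original index labels (0, 1, 2).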
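The default call is only one way to use drop_duplicates(). The sketch below, which reuses the sample DataFrame from the solution, shows how duplicated() can be used to inspect the repeated rows first, and how the keep, subset, and ignore_index parameters change the result. The variable names (kept_last, no_repeats, by_name, renumbered) are illustrative only, and ignore_index assumes pandas 1.0 or later.

```python
import pandas as pd

# Same sample data as in the solution above
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})

# duplicated() returns a boolean Series that marks each row
# repeating an earlier row (here only the last row is marked)
print(df.duplicated())

# keep='last' retains the final occurrence instead of the first
kept_last = df.drop_duplicates(keep='last')

# keep=False drops every row that has a duplicate anywhere
no_repeats = df.drop_duplicates(keep=False)

# subset limits the comparison to the listed columns
by_name = df.drop_duplicates(subset=['Name'])

# ignore_index=True renumbers the result from 0 (pandas >= 1.0)
renumbered = df.drop_duplicates(keep='last', ignore_index=True)

print(kept_last)
print(no_repeats)
print(by_name)
print(renumbered)
```

With this particular data, subset=['Name'] gives the same rows as the default call, because the two David rows match in every column. The two calls would differ if the rows shared a name but had different ages or salaries.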
 
For more practice, solve these related problems:
- Write a Pandas program to extract numbers from a text column.
- Write a Pandas program to find and replace specific words in a text column.
- Write a Pandas program to count occurrences of a word in each row of a column.
- Write a Pandas program to split a column containing full names into first and last names.
 
