Pandas - Removing duplicate rows in a DataFrame using drop_duplicates()
Pandas: Data Cleaning and Preprocessing Exercise-4 with Solution
Write a Pandas program to remove duplicate rows from a DataFrame.
This exercise demonstrates how to remove duplicate rows from a DataFrame using drop_duplicates().
Sample Solution:
Code:
import pandas as pd
# Create a sample DataFrame with duplicate rows
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})
# Remove duplicate rows from the DataFrame
df_no_duplicates = df.drop_duplicates()
# Output the result
print(df_no_duplicates)
Output:
      Name  Age  Salary
0    David   25   50000
1  Annabel   30   60000
2  Charlie   22   70000
Explanation:
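- Created a DataFrame in which the last row (index 3) repeats the first row (index 0) exactly.
- Called drop_duplicates() with its default arguments, which compares all columns and keeps the first occurrence of each duplicated row.
- Stored the result in df_no_duplicates and printed it. drop_duplicates() returns a new DataFrame, so df itself is unchanged, and the remaining rows keep their original index labels.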
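The solution above relies on the defaults of drop_duplicates(). As a minimal sketch of how the subset and keep parameters change the result, the example below uses a hypothetical variation of the sample data (not part of the original exercise) in which the second 'David' row has a different Age and Salary, so the rows match on Name only.

```python
import pandas as pd

# Hypothetical variation of the sample data: the two 'David' rows
# now differ in Age and Salary, so they are duplicates by Name only.
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 26],
    'Salary': [50000, 60000, 70000, 55000]
})

# Compare rows on the 'Name' column only; keep the first occurrence (default)
by_name_first = df.drop_duplicates(subset=['Name'])

# Same comparison, but keep the last occurrence of each name
by_name_last = df.drop_duplicates(subset=['Name'], keep='last')

# keep=False drops every row whose name appears more than once
no_repeats = df.drop_duplicates(subset=['Name'], keep=False)

# The surviving rows keep their original labels; reset them if needed
reindexed = by_name_last.reset_index(drop=True)

print(by_name_first)
print(by_name_last)
print(no_repeats)
print(reindexed)
```

With the default arguments alone, no rows would be removed from this variation, because no two rows match across all columns.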