Removing Duplicate Rows from a DataFrame Using Pandas
Pandas: Data Validation Exercise-5 with Solution
Write a Pandas program to remove duplicate rows from a DataFrame.
This exercise demonstrates how to remove duplicate rows from a DataFrame using drop_duplicates().
Sample Solution:
Code:
import pandas as pd
# Create a sample DataFrame with duplicate rows
df = pd.DataFrame({
    'Name': ['Orville', 'Arturo', 'Ruth', 'Orville'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})
# Remove duplicate rows
df_no_duplicates = df.drop_duplicates()
# Output the result
print(df_no_duplicates)
Output:
      Name  Age  Salary
0  Orville   25   50000
1   Arturo   30   60000
2     Ruth   22   70000
Explanation:
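- Created a DataFrame in which the row at index 3 repeats the row at index 0 ('Orville', 25, 50000).
- Used drop_duplicates() to remove duplicate rows. By default it compares all columns and keeps the first occurrence of each duplicated row.
- drop_duplicates() returned a new DataFrame without the duplicate; the original df is unchanged, and the remaining rows keep their original index labels (0, 1, 2).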
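The defaults described above can be adjusted. The following is a minimal sketch, reusing the same sample data, of a few commonly used options: duplicated() to inspect which rows would be dropped, and the subset, keep, and ignore_index parameters of drop_duplicates(). The variable names (dup_mask, by_name_last, renumbered, no_repeats) are illustrative only, and ignore_index assumes pandas 1.0 or later.

```python
import pandas as pd

# Same sample data as in the solution above
df = pd.DataFrame({
    'Name': ['Orville', 'Arturo', 'Ruth', 'Orville'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})

# duplicated() returns a boolean Series: True marks a row that repeats an earlier one
dup_mask = df.duplicated()
print(dup_mask.tolist())  # [False, False, False, True]

# Compare only the 'Name' column and keep the last occurrence instead of the first
by_name_last = df.drop_duplicates(subset=['Name'], keep='last')
print(by_name_last.index.tolist())  # [1, 2, 3]

# Renumber the result from 0 instead of keeping the original index labels
renumbered = df.drop_duplicates(ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2]

# keep=False drops every row that has a duplicate, leaving only unique rows
no_repeats = df.drop_duplicates(keep=False)
print(no_repeats['Name'].tolist())  # ['Arturo', 'Ruth']
```

Checking duplicated() first is a cheap way to confirm which rows will be removed before committing to the cleaned DataFrame.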