Merging DataFrames and removing duplicate rows in Pandas

Last update on May 06 2025 13:19:44 (UTC/GMT +8 hours)

16. Merge and Drop Duplicates

Write a Pandas program to merge DataFrames and drop duplicates.

In this exercise, we merge two DataFrames and then remove any duplicate rows that may arise from the merge.

Sample Solution:

Code:

import pandas as pd

# Create two sample DataFrames with potential duplicates
df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Annabel', 'Selena', 'Caeso']
})

df2 = pd.DataFrame({
    'ID': [2, 3, 1],
    'Name': ['Selena', 'Caeso', 'Annabel'],
    'Age': [30, 22, 25]
})

# Merge the DataFrames on the 'ID' and 'Name' columns (inner join by default)
merged_df = pd.merge(df1, df2, on=['ID', 'Name'])

# Drop any duplicate rows from the merged DataFrame
merged_df_no_duplicates = merged_df.drop_duplicates()

# Output the result
print(merged_df_no_duplicates)

Output:

   ID     Name  Age
0   1  Annabel   25
1   2   Selena   30
2   3    Caeso   22

Explanation:

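Created two DataFrames, df1 and df2, that share the same 'ID' and 'Name' values; df2 also carries an 'Age' column.
Merged the DataFrames on the 'ID' and 'Name' columns with pd.merge(), which performs an inner join by default.
Removed any duplicate rows in the merged DataFrame using drop_duplicates(). In this sample each ID/Name pair matches exactly once, so the merge produces no duplicates and the call leaves the result unchanged.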
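
The sample data above never produces a duplicate, so the effect of drop_duplicates() is not visible there. The following is a minimal sketch of a case where it matters, assuming a hypothetical right-hand DataFrame in which the row for ID 2 appears twice (the names left, right, merged, and deduped are illustrative and not part of the original solution):

```python
import pandas as pd

left = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Annabel', 'Selena', 'Caeso']
})

# Assumption for illustration: the row for ID 2 is repeated in this frame
right = pd.DataFrame({
    'ID': [2, 2, 3, 1],
    'Name': ['Selena', 'Selena', 'Caeso', 'Annabel'],
    'Age': [30, 30, 22, 25]
})

# The repeated row on the right matches the same left row twice,
# so the merged result contains two identical rows for ID 2
merged = pd.merge(left, right, on=['ID', 'Name'])

# drop_duplicates() keeps the first of each set of identical rows;
# reset_index(drop=True) renumbers the remaining rows from 0
deduped = merged.drop_duplicates().reset_index(drop=True)

print(len(merged), len(deduped))
print(deduped)
```

Here the merge should yield four rows and the deduplicated result three. Resetting the index is optional; it only tidies the row labels after rows are removed.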

For more Practice: Solve these Related Problems:

Write a Pandas program to merge two DataFrames, drop duplicates, and then count the number of unique rows in the result.
Write a Pandas program to merge two DataFrames and drop duplicates based on specific columns before sorting the output.
Write a Pandas program to merge two DataFrames and drop duplicates while ensuring rows with null key values are retained.
Write a Pandas program to merge two DataFrames, drop duplicates, and then export the cleaned result to a CSV file.
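
As a starting point for the first two problems, here is a hedged sketch of dropping duplicates on specific columns, sorting, and counting the remaining rows. The sample data (two conflicting ages for ID 2) and the variable names are assumptions made for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Annabel', 'Selena', 'Caeso']
})

# Assumption for illustration: ID 2 appears twice with different ages,
# so the two rows are not identical across all columns
df2 = pd.DataFrame({
    'ID': [2, 2, 3, 1],
    'Name': ['Selena', 'Selena', 'Caeso', 'Annabel'],
    'Age': [30, 31, 22, 25]
})

merged = pd.merge(df1, df2, on=['ID', 'Name'])

# A plain drop_duplicates() would keep both ID 2 rows because 'Age' differs.
# subset= limits the comparison to the listed columns; keep='first' retains
# the first row found for each ID/Name pair.
result = (
    merged
    .drop_duplicates(subset=['ID', 'Name'], keep='first')
    .sort_values('Age')
    .reset_index(drop=True)
)

# Number of unique ID/Name rows left after deduplication
unique_count = len(result)

print(result)
print(unique_count)
```

Which row to keep when values conflict (keep='first', keep='last', or keep=False to drop them all) depends on the data, so treat the choice here as a placeholder.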
