Tracking the source of merged rows using the indicator argument in Pandas
12. Merge with Indicator
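Write a Pandas program to merge two DataFrames with an indicator column that tracks the source of each row.
In this exercise, we use the indicator argument of pd.merge() to record, for each row of the merged result, whether it came from the left DataFrame, the right DataFrame, or both.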
Sample Solution:
Code:
import pandas as pd
# Create two sample DataFrames
df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Selena', 'Annabel', 'Caeso']
})
df2 = pd.DataFrame({
    'ID': [2, 3, 4],
    'Age': [25, 30, 22]
})
# Merge the DataFrames with an indicator
merged_df = pd.merge(df1, df2, on='ID', how='outer', indicator=True)
# Output the result
print(merged_df)
Output:
   ID     Name   Age      _merge
0   1   Selena   NaN   left_only
1   2  Annabel  25.0        both
2   3    Caeso  30.0        both
3   4      NaN  22.0  right_only
Explanation:
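- Created two DataFrames, df1 and df2, that share an ID column but only partly overlap in their ID values.
- Used pd.merge() with how='outer' and indicator=True to keep every row from both DataFrames and add a column that shows whether each row came from the left DataFrame, the right DataFrame, or both.
- The result includes an extra _merge column whose values are left_only, right_only, or both. Columns with no matching row on the other side are filled with NaN, which is why Age is displayed as a float.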
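The solution above uses the default column name _merge. The indicator argument also accepts a string, which is used as the name of the indicator column. The following is a minimal sketch of that variant; the column name 'Source' is an illustrative choice and not part of the original solution.

```python
import pandas as pd

# Same sample DataFrames as in the solution above
df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Selena', 'Annabel', 'Caeso']
})
df2 = pd.DataFrame({
    'ID': [2, 3, 4],
    'Age': [25, 30, 22]
})

# Passing a string instead of True names the indicator column.
# 'Source' is a hypothetical name chosen for this example.
custom_df = pd.merge(df1, df2, on='ID', how='outer', indicator='Source')

# The values are still left_only, right_only, or both
print(custom_df)
```

Only the column name changes; the three possible values stay the same, so any later filtering logic works identically.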
For more Practice: Solve these Related Problems:
- Write a Pandas program to merge two DataFrames with a custom indicator column marking the origin of each row, then count occurrences per source.
- Write a Pandas program to merge two DataFrames with a custom indicator and filter rows that appear in both DataFrames.
- Write a Pandas program to merge two DataFrames with a custom indicator and group by the indicator to compute aggregate values.
- Write a Pandas program to merge two DataFrames with a custom indicator column and then display only the rows originating from a specified source.
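As a starting point for the problems above, here is a hedged sketch of three common follow-up steps on the indicator column: counting rows per source, filtering by source, and grouping by source. It reuses the sample data from the solution; the variable names (counts, both_rows, left_only_rows, mean_age) are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Selena', 'Annabel', 'Caeso']
})
df2 = pd.DataFrame({
    'ID': [2, 3, 4],
    'Age': [25, 30, 22]
})

merged = pd.merge(df1, df2, on='ID', how='outer', indicator=True)

# Count how many rows came from each source
counts = merged['_merge'].value_counts()
print(counts)

# Keep only the rows whose ID appears in both DataFrames
both_rows = merged[merged['_merge'] == 'both']
print(both_rows)

# Keep only the rows that exist in the left DataFrame alone
left_only_rows = merged[merged['_merge'] == 'left_only']
print(left_only_rows)

# Group by the indicator and aggregate; _merge is categorical,
# so observed=True limits the groups to values present in the data
mean_age = merged.groupby('_merge', observed=True)['Age'].mean()
print(mean_age)
```

Because left_only rows have no matching Age, their group mean is NaN; that is expected and not a sign of a failed merge.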