Merging Pandas DataFrames on multiple columns
Python Pandas Numpy: Exercise-32 with Solution
Merge two DataFrames based on multiple columns.
Sample Solution:
Python Code:
import pandas as pd
# Create two sample DataFrames
data1 = {'ID': [1, 2, 3, 4],
'Name': ['Imen', 'Karthika', 'Cosimo', 'Cathrine'],
'Department': ['HR', 'IT', 'Finance', 'IT']}
data2 = {'ID': [1, 2, 3, 5],
'Salary': [50000, 60000, 45000, 70000],
'Department': ['HR', 'IT', 'Finance', 'Marketing']}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
# Merge DataFrames based on 'ID' and 'Department'
merged_df = pd.merge(df1, df2, on=['ID', 'Department'], how='inner')
# Display the merged DataFrame
print(merged_df)
Output:
ID Name Department Salary 0 1 Imen HR 50000 1 2 Karthika IT 60000 2 3 Cosimo Finance 45000
Explanation:
Here's a breakdown of the above code:
- First we create two sample DataFrames (df1 and df2) with columns 'ID', 'Name', 'Department' and 'ID', 'Salary', 'Department', respectively.
- The pd.merge(df1, df2, on=['ID', 'Department'], how='inner') line merges the DataFrames based on the common columns 'ID' and 'Department'.
- The how='inner' parameter specifies that only the rows with matching values in both DataFrames will be included in the result.
- The resulting "merged_df" DataFrame contains the merged data.
Flowchart:
Python Code Editor:
Previous: Transposing DataFrame: Pandas data manipulation.
Next: Aggregating data in Pandas: Multiple functions example.
What is the difficulty level of this exercise?
Test your Programming skills with w3resource's quiz.
It will be nice if you may share this link in any developer community or anywhere else, from where other developers may find this content. Thanks.
https://w3resource.com/python-exercises/pandas_numpy/pandas_numpy-exercise-32.php
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics