Pandas: Groupby and aggregate over multiple lists
Pandas Grouping and Aggregating: Split-Apply-Combine Exercise-30 with Solution
Write a Pandas program to split the following dataset using group by on the first column and aggregate element-wise over the multiple lists in the second column.
Test Data:
  student_id         marks
0       S001  [88, 89, 90]
1       S001  [78, 81, 60]
2       S002  [84, 83, 91]
3       S002  [84, 88, 91]
4       S003  [90, 89, 92]
5       S003  [88, 59, 90]
Sample Solution:
Python Code:
import pandas as pd
import numpy as np
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
df = pd.DataFrame({
    'student_id': ['S001','S001','S002','S002','S003','S003'],
    'marks': [[88,89,90],[78,81,60],[84,83,91],[84,88,91],[90,89,92],[88,59,90]]})
print("Original DataFrame:")
print(df)
print("\nGroupby and aggregate over multiple lists:")
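# Collect each student's mark lists into one list of lists,
# then average them position by position (axis 0).
result = df.set_index('student_id')['marks'].groupby('student_id').apply(list).apply(lambda x: np.mean(x,0))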
print(result)
Sample Output:
Original DataFrame:
  student_id         marks
0       S001  [88, 89, 90]
1       S001  [78, 81, 60]
2       S002  [84, 83, 91]
3       S002  [84, 88, 91]
4       S003  [90, 89, 92]
5       S003  [88, 59, 90]

Groupby and aggregate over multiple lists:
student_id
S001    [83.0, 85.0, 75.0]
S002    [84.0, 85.5, 91.0]
S003    [89.0, 74.0, 91.0]
Name: marks, dtype: object
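The same result can probably be reached a little more explicitly. The sketch below is not part of the original solution: it assumes every list in 'marks' has the same length, and the helper name mean_by_position is made up for illustration. It stacks each group's lists into a 2-D array and averages down the rows, which is what np.mean(x, 0) does in the solution above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'student_id': ['S001', 'S001', 'S002', 'S002', 'S003', 'S003'],
    'marks': [[88, 89, 90], [78, 81, 60], [84, 83, 91],
              [84, 88, 91], [90, 89, 92], [88, 59, 90]],
})

def mean_by_position(lists):
    # Hypothetical helper (not from the original solution).
    # Stack the group's equal-length lists into a 2-D array with one row
    # per list, then average down the rows (axis=0) so each position is
    # averaged across the group's lists.
    stacked = np.vstack(lists.tolist())
    return stacked.mean(axis=0).tolist()

# Group directly on the column instead of moving it into the index first.
result = df.groupby('student_id')['marks'].apply(mean_by_position)
print(result)
```

Here each group's value is a plain Python list rather than a NumPy array, which may be easier to serialize or compare. If the lists differ in length, np.vstack raises an error, so this approach only suits data shaped like the test data.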