w3resource

Pandas: Groupby and aggregate over multiple lists


30. Grouping by First Column with Multiple List Aggregations

Write a Pandas program to split the following dataset using group by on first column and aggregate over multiple lists on second column.

Test Data:

  student_id         marks
0       S001  [88, 89, 90]
1       S001  [78, 81, 60]
2       S002  [84, 83, 91]
3       S002  [84, 88, 91]
4       S003  [90, 89, 92]
5       S003  [88, 59, 90]  

Sample Solution:

Python Code :

import pandas as pd
import numpy as np
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
df = pd.DataFrame({
    'student_id': ['S001','S001','S002','S002','S003','S003'],
    'marks': [[88,89,90],[78,81,60],[84,83,91],[84,88,91],[90,89,92],[88,59,90]]})
print("Original DataFrame:")
print(df)
print("\nGroupby and aggregate over multiple lists:")
result = df.set_index('student_id')['marks'].groupby('student_id').apply(list).apply(lambda x: np.mean(x,0))
print(result)

Sample Output:

Original DataFrame:
  student_id         marks
0       S001  [88, 89, 90]
1       S001  [78, 81, 60]
2       S002  [84, 83, 91]
3       S002  [84, 88, 91]
4       S003  [90, 89, 92]
5       S003  [88, 59, 90]

Groupby and aggregate over multiple lists:
student_id
S001    [83.0, 85.0, 75.0]
S002    [84.0, 85.5, 91.0]
S003    [89.0, 74.0, 91.0]
Name: marks, dtype: object

For more Practice: Solve these Related Problems:

  • Write a Pandas program to group the dataframe by the first column and then aggregate the second column values into multiple lists per group.
  • Write a Pandas program to split the dataframe by a key and then create lists of values for a second column for each group.
  • Write a Pandas program to group by the first column and then apply a custom function to aggregate values from the second column into a list.
  • Write a Pandas program to split the dataframe on the first column and then combine the values of the second column into a single aggregated list per group.

Python Code Editor:

Have another way to solve this solution? Contribute your code (and comments) through Disqus.

Previous: Write a Pandas program to split a given dataset using group by on specified column into two labels and ranges.
Next: Write a Pandas program to split the following dataset using group by on ‘salesman_id’ and find the first order date for each group.

What is the difficulty level of this exercise?

Test your Programming skills with w3resource's quiz.



Follow us on Facebook and Twitter for latest update.