w3resource

Pandas: Split a given dataframe into groups and display target column as a list of unique values


20. Grouping with Unique Values as List

Write a Pandas program to split a given dataframe into groups and display target column as a list of unique values.

Test Data:

   id  type     book
0   A     1     Math
1   A     1     Math
2   A     1  English
3   A     1  Physics
4   A     2     Math
5   A     2  English
6   B     1  Physics
7   B     1  English
8   B     1  Physics
9   B     2  English
10  B     2  English

Sample Solution:

Python Code :

import pandas as pd
df = pd.DataFrame( {'id' : ['A','A','A','A','A','A','B','B','B','B','B'], 
                    'type' : [1,1,1,1,2,2,1,1,1,2,2], 
                    'book' : ['Math','Math','English','Physics','Math','English','Physics','English','Physics','English','English']})

print("Original DataFrame:")
print(df)
new_df = df[['id', 'type', 'book']].drop_duplicates()\
                         .groupby(['id','type'])['book']\
                         .apply(list)\
                         .reset_index()

new_df['book'] = new_df.apply(lambda x: (','.join([str(s) for s in x['book']])), axis = 1)
print("\nList all unique values in a group:")
print(new_df)

Sample Output:

Original DataFrame:
   id  type     book
0   A     1     Math
1   A     1     Math
2   A     1  English
3   A     1  Physics
4   A     2     Math
5   A     2  English
6   B     1  Physics
7   B     1  English
8   B     1  Physics
9   B     2  English
10  B     2  English

List all unique values in a group:
  id  type                  book
0  A     1  Math,English,Physics
1  A     2          Math,English
2  B     1       Physics,English
3  B     2               English

For more Practice: Solve these Related Problems:

  • Write a Pandas program to group the dataframe by specific columns and then aggregate the target column into a comma-separated list of unique values.
  • Write a Pandas program to split the dataframe into groups and then apply a function to convert group values into a list of distinct entries.
  • Write a Pandas program to group by identifiers and then join the unique target values into a single string per group.
  • Write a Pandas program to group the dataframe and then use a lambda function to aggregate the target column as a list of unique values.

Python Code Editor:

Have another way to solve this solution? Contribute your code (and comments) through Disqus.

Previous: Write a Pandas program to split a given dataframe into groups with multiple aggregations.

Next: Write a Pandas program to split the following dataframe into groups and calculate quarterly purchase amount.

What is the difficulty level of this exercise?

Test your Programming skills with w3resource's quiz.



Follow us on Facebook and Twitter for latest update.