w3resource

Group by and Filter Groups in Pandas


4. Group by and Filter Groups

Write a Pandas program that groups a DataFrame by a column and then filters out entire groups that fail a condition, keeping only the rows of the groups that pass.

Sample Solution:

Python Code:

import pandas as pd
# Sample DataFrame
data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Value': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
print("Sample DataFrame:")
print(df)
# Group by 'Category'
grouped = df.groupby('Category')
# Filter groups where the sum of 'Value' > 5
print("\nFilter groups where the sum of 'Value' > 5")
filtered = grouped.filter(lambda x: x['Value'].sum() > 5)

print(filtered)

Output:

Sample DataFrame:
  Category  Value
0        A      1
1        A      2
2        B      3
3        B      4
4        C      5
5        C      6

Filter groups where the sum of 'Value' > 5
  Category  Value
2        B      3
3        B      4
4        C      5
5        C      6
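Note that the filtered output keeps the original index labels (2 to 5) rather than renumbering from 0. The following is a minimal sketch of how the index could be renumbered if that is wanted; the variable name `renumbered` is illustrative and not part of the original solution.

```python
import pandas as pd

# Same sample data as in the solution above
df = pd.DataFrame({'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'Value': [1, 2, 3, 4, 5, 6]})

filtered = df.groupby('Category').filter(lambda x: x['Value'].sum() > 5)

# filter() returns the surviving rows with their original index labels
print(list(filtered.index))    # [2, 3, 4, 5]

# reset_index(drop=True) discards the old labels and renumbers from 0
renumbered = filtered.reset_index(drop=True)
print(list(renumbered.index))  # [0, 1, 2, 3]
```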

Explanation:

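  • Import pandas as pd.
  • Create a sample DataFrame with a 'Category' column and a numeric 'Value' column.
  • Group the rows by 'Category' with df.groupby('Category').
  • Call filter() with a function that returns True for each group whose 'Value' sum is greater than 5. The group sums are A = 3, B = 7, and C = 11, so group A is dropped.
  • Print the filtered DataFrame, which contains the original rows of groups B and C with their original index labels.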
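The steps above can also be expressed without a lambda. The sketch below is an alternative to the sample solution, not a replacement for it: it uses transform('sum') to broadcast each group's sum back onto its rows and then applies a boolean mask. The names `group_sums`, `mask`, and `filtered_alt` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'Value': [1, 2, 3, 4, 5, 6]})

# Per-group sums, to see why each group passes or fails the condition
group_sums = df.groupby('Category')['Value'].sum()
print(group_sums)  # A -> 3, B -> 7, C -> 11

# transform('sum') returns one value per row: the sum of that row's group
mask = df.groupby('Category')['Value'].transform('sum') > 5
filtered_alt = df[mask]

# The result matches the filter() call from the sample solution
filtered = df.groupby('Category').filter(lambda x: x['Value'].sum() > 5)
print(filtered_alt.equals(filtered))  # True
```

On large DataFrames with many groups, the mask-based form is often faster because it avoids calling a Python function once per group, although the difference depends on the data.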

For more practice, solve these related problems:

  • Write a Pandas program to group data and then filter out groups where the total sum of a numeric column is below a threshold.
  • Write a Pandas program to group a DataFrame and keep only those groups whose average exceeds a given value.
  • Write a Pandas program to group data and filter groups based on a custom condition applied to the count of records in each group.
  • Write a Pandas program to group data and filter groups dynamically based on the standard deviation of a numeric column.
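As a starting point for the related problems above, here is a minimal sketch that applies the same filter() pattern with a sum, a mean, a row count, and a standard deviation. The DataFrame and every threshold are made up for illustration; they are assumptions, not part of the original exercise.

```python
import pandas as pd

# Hypothetical data: group sizes and spreads differ on purpose
df = pd.DataFrame({'Category': ['A', 'A', 'A', 'B', 'B', 'C'],
                   'Value': [1, 2, 3, 10, 20, 5]})
grouped = df.groupby('Category')

# Sum threshold: drop groups whose total is below 6 (sums: A=6, B=30, C=5)
by_sum = grouped.filter(lambda g: g['Value'].sum() >= 6)

# Mean threshold: keep groups whose average exceeds 4 (means: A=2, B=15, C=5)
by_mean = grouped.filter(lambda g: g['Value'].mean() > 4)

# Count condition: keep groups with at least 2 rows (counts: A=3, B=2, C=1)
by_count = grouped.filter(lambda g: len(g) >= 2)

# Standard deviation: keep groups whose sample std exceeds 2.
# A single-row group has a NaN std, and NaN > 2 is False, so C is dropped.
by_std = grouped.filter(lambda g: g['Value'].std() > 2)

print(by_sum['Category'].unique())    # ['A' 'B']
print(by_mean['Category'].unique())   # ['B' 'C']
print(by_count['Category'].unique())  # ['A' 'B']
print(by_std['Category'].unique())    # ['B']
```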
