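By “group by” we are referring to a process involving one or more of the following steps: splitting the data into groups based on some criteria, applying a function to each group independently, and combining the results into a data structure.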
import numpy as np
import pandas as pd
df = pd.DataFrame({'M': ['foo', 'bar', 'foo', 'bar'],
                   'N': ['one', 'one', 'two', 'three'],
                   'O': np.random.randn(4),
                   'P': np.random.randn(4)})
df
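Grouping by one column and then applying the sum() function to the numeric columns of the resulting groups. Selecting 'O' and 'P' explicitly keeps the string column 'N' out of the sum.
df.groupby('M')[['O', 'P']].sum()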
Grouping by multiple columns forms a hierarchical index, and again we can apply the sum function.
df.groupby(['M', 'N']).sum()
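Because the columns 'O' and 'P' above are random, the grouped sums differ on every run. The following is a minimal sketch that swaps in fixed values so the results can be checked by hand. The name df_fixed and the numbers in it are assumptions made for illustration, not part of the original example.

```python
import pandas as pd

# Hypothetical fixed data: same keys as df above, but with hand-picked
# numbers instead of np.random.randn so the sums are predictable.
df_fixed = pd.DataFrame({'M': ['foo', 'bar', 'foo', 'bar'],
                         'N': ['one', 'one', 'two', 'three'],
                         'O': [1.0, 2.0, 3.0, 4.0],
                         'P': [10.0, 20.0, 30.0, 40.0]})

# Group by one key; select the numeric columns explicitly before summing.
by_m = df_fixed.groupby('M')[['O', 'P']].sum()

# Group by two keys; the result is indexed by a MultiIndex of (M, N) pairs.
by_mn = df_fixed.groupby(['M', 'N']).sum()

print(by_m)
# A single group of the hierarchical result is addressed with a tuple.
print(by_mn.loc[('foo', 'two')])
```

Here the 'bar' rows hold 2.0 and 4.0 in 'O', so the 'bar' entry of by_m is 6.0 in that column. Each (M, N) pair occurs only once in this data, so by_mn simply repeats the original values under the two-level index.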