Examples
Single level columns:
import numpy as np
import pandas as pd
df_single_level_cols = pd.DataFrame([[0, 2], [3, 4]],
                                    index=['deer', 'monkey'],
                                    columns=['weight', 'height'])
Stacking a dataframe with a single-level column axis returns a Series:
df_single_level_cols
df_single_level_cols.stack()
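As a quick check of the statement above, the following minimal sketch rebuilds the same frame and inspects the stacked result. The variable name `stacked` is introduced here for illustration only.

```python
import pandas as pd

df_single_level_cols = pd.DataFrame([[0, 2], [3, 4]],
                                    index=['deer', 'monkey'],
                                    columns=['weight', 'height'])

stacked = df_single_level_cols.stack()

# The column labels move into a new innermost index level, so each
# value is addressed by a (row label, column label) pair.
print(type(stacked).__name__)             # Series
print(stacked.index.nlevels)              # 2
print(stacked.loc[('monkey', 'height')])  # 4
```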
Multi level columns, simple case:
multicol1 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('weight', 'pounds')])
df_multi_level_cols1 = pd.DataFrame([[3, 4], [4, 5]],
                                    index=['deer', 'monkey'],
                                    columns=multicol1)
Stacking a dataframe with a multi-level column axis:
df_multi_level_cols1
df_multi_level_cols1.stack()
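With two column levels, stacking the innermost level leaves one column level behind, so the result should stay a DataFrame rather than collapse to a Series. The sketch below illustrates this; `stacked` is a name chosen here, not part of the original example.

```python
import pandas as pd

multicol1 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('weight', 'pounds')])
df_multi_level_cols1 = pd.DataFrame([[3, 4], [4, 5]],
                                    index=['deer', 'monkey'],
                                    columns=multicol1)

stacked = df_multi_level_cols1.stack()

# The inner level ('kg', 'pounds') becomes part of the row index;
# the outer level ('weight') remains as the only column.
print(type(stacked).__name__)
print(list(stacked.columns))
print(stacked.loc[('deer', 'pounds'), 'weight'])
```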
Missing values:
multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('height', 'm')])
df_multi_level_cols2 = pd.DataFrame([[2.0, 3.0], [4.0, 5.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)
It is common to have missing values when stacking a dataframe with multi-level columns,
as the stacked dataframe typically has more values than the original dataframe. Missing values
are filled with NaNs:
df_multi_level_cols2
df_multi_level_cols2.stack()
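To make the source of the NaNs concrete, this sketch counts them in the stacked result. The original frame holds four values, while the stacked frame has eight cells, because the label combinations ('weight', 'm') and ('height', 'kg') do not exist in the original columns. Column order in the result may differ between pandas versions, so the checks below avoid relying on it.

```python
import pandas as pd

multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('height', 'm')])
df_multi_level_cols2 = pd.DataFrame([[2.0, 3.0], [4.0, 5.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)

stacked = df_multi_level_cols2.stack()

# 2 animals x 2 units = 4 rows, each with a 'weight' and a 'height' cell.
n_cells = stacked.shape[0] * stacked.shape[1]
n_missing = int(stacked.isna().sum().sum())

print(n_cells)     # 8
print(n_missing)   # 4
```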
Prescribing the level(s) to be stacked:
The first parameter controls which level or levels are stacked:
df_multi_level_cols2.stack(0)
df_multi_level_cols2.stack([0, 1])
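The difference between the two calls above can be checked with a short sketch. The names `by_outer` and `all_levels` are introduced here for illustration. Stacking level 0 moves the measurement names into the index and keeps the units as columns, while stacking both levels leaves no column level, so the result is a Series.

```python
import pandas as pd

multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('height', 'm')])
df_multi_level_cols2 = pd.DataFrame([[2.0, 3.0], [4.0, 5.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)

# Stack only the outer level: rows are (animal, measurement), columns are units.
by_outer = df_multi_level_cols2.stack(0)
print(sorted(by_outer.columns))                # ['kg', 'm']
print(by_outer.loc[('deer', 'weight'), 'kg'])  # 2.0

# Stack both levels: rows are (animal, measurement, unit), no columns remain.
all_levels = df_multi_level_cols2.stack([0, 1])
print(type(all_levels).__name__)               # Series
print(all_levels.loc[('monkey', 'height', 'm')])  # 5.0
```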
Dropping missing values:
df_multi_level_cols3 = pd.DataFrame([[None, 2.0], [3.0, 4.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)
Note that rows where all values are missing are dropped by default, but this behaviour
can be controlled via the dropna keyword parameter:
df_multi_level_cols3
df_multi_level_cols3.stack(dropna=False)
df_multi_level_cols3.stack(dropna=True)
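Whether `stack` accepts the dropna keyword, and whether all-NaN rows are dropped by default, appears to depend on the installed pandas version, so it is worth checking against your release. As a version-independent sketch, the all-NaN rows can instead be removed explicitly by chaining `DataFrame.dropna(how='all')` after the stack. The name `cleaned` is chosen here for illustration.

```python
import pandas as pd

multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('height', 'm')])
df_multi_level_cols3 = pd.DataFrame([[None, 2.0], [3.0, 4.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)

# The ('deer', 'kg') row would hold NaN for both 'weight' and 'height'.
# Dropping all-NaN rows explicitly gives the same result whether or not
# stack() already removed that row on its own.
cleaned = df_multi_level_cols3.stack().dropna(how='all')

print(len(cleaned))                        # 3
print(('deer', 'kg') in cleaned.index)     # False
```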