The following examples show factorize() as a top-level function, pd.factorize(values). The results are identical
for methods like Series.factorize().
import numpy as np
import pandas as pd
labels, uniques = pd.factorize(['q', 'q', 'p', 'r', 'q'])
labels
uniques
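As a minimal sketch of how the two return values relate, the snippet below checks that each label is the position of the corresponding input value in uniques. The input is wrapped in an object-dtype NumPy array, which is an assumption made here because newer pandas versions warn on plain lists; the name restored is illustrative only.

```python
import numpy as np
import pandas as pd

values = np.array(['q', 'q', 'p', 'r', 'q'], dtype=object)
labels, uniques = pd.factorize(values)

# Without sort=True, uniques keeps the order of first appearance: q, p, r
print(labels)   # integer position of each value in uniques
print(uniques)

# Indexing uniques with the labels rebuilds the original values
restored = uniques.take(labels)
print(restored)
```

The round trip through uniques.take(labels) only holds when there are no missing values in the input.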
If sort=True, the uniques will be sorted, and labels will be shuffled so that the relationship between them is maintained.
labels, uniques = pd.factorize(['q', 'q', 'p', 'r', 'q'], sort=True)
labels
uniques
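To illustrate what "the relationship is maintained" means, the following sketch compares the sorted and unsorted results on the same input. The variable names are illustrative and the object-dtype array is an assumption chosen for compatibility with newer pandas versions.

```python
import numpy as np
import pandas as pd

values = np.array(['q', 'q', 'p', 'r', 'q'], dtype=object)

labels, uniques = pd.factorize(values)
sorted_labels, sorted_uniques = pd.factorize(values, sort=True)

# The uniques come back in sorted order and the labels are renumbered to match
print(sorted_labels)
print(sorted_uniques)

# Both encodings decode to the same original values
same = list(uniques.take(labels)) == list(sorted_uniques.take(sorted_labels))
print(same)
```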
Missing values are indicated in labels with na_sentinel (-1 by default). Note that missing values are never
included in uniques.
labels, uniques = pd.factorize(['q', None, 'p', 'r', 'q'])
labels
uniques
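Because -1 is also a valid NumPy index (it refers to the last element), decoding the labels needs a little care when missing values are present. The sketch below relies on the default sentinel of -1 and masks those positions before indexing; the names missing and restored are illustrative.

```python
import numpy as np
import pandas as pd

values = np.array(['q', None, 'p', 'r', 'q'], dtype=object)
labels, uniques = pd.factorize(values)

# Positions that held a missing value carry the sentinel -1
missing = labels == -1

# uniques.take(labels) would wrap -1 around to the last unique,
# so fill only the non-missing slots and leave the rest as None
restored = np.full(len(labels), None, dtype=object)
restored[~missing] = uniques[labels[~missing]]
print(restored)
```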
When factorizing pandas objects, the type of uniques will differ.
For Categoricals, a Categorical is returned.
cat = pd.Categorical(['p', 'p', 'r'], categories=['p', 'q', 'r'])
labels, uniques = pd.factorize(cat)
labels
uniques
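One detail worth checking here is that the labels returned by factorize are positions in uniques, not the Categorical's own category codes. The sketch below contrasts the two; it is an illustration under that reading, not a statement about how pandas computes the result internally.

```python
import pandas as pd

cat = pd.Categorical(['p', 'p', 'r'], categories=['p', 'q', 'r'])
labels, uniques = pd.factorize(cat)

# uniques holds only the observed values but keeps the full category set
print(type(uniques).__name__)
print(list(uniques))
print(list(uniques.categories))

# labels index into uniques; cat.codes index into cat.categories
print(labels)
print(cat.codes)
```

Since 'q' is never observed, 'r' gets label 1 from factorize but code 2 in cat.codes.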
For other pandas objects, an Index of the appropriate type is returned.
ser = pd.Series(['p', 'p', 'r'])
labels, uniques = pd.factorize(ser)
labels
uniques
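The following sketch illustrates "an Index of the appropriate type" with two Series and also checks that the method form agrees with the top-level function. The datetime example and all variable names are illustrative additions, not part of the original examples.

```python
import pandas as pd

ser = pd.Series(['p', 'p', 'r'])
labels, uniques = pd.factorize(ser)

# The method form returns the same labels and uniques
method_labels, method_uniques = ser.factorize()

# A datetime Series yields a DatetimeIndex rather than a generic Index
dates = pd.Series(pd.to_datetime(['2020-01-01', '2020-01-01', '2020-01-02']))
date_labels, date_uniques = pd.factorize(dates)

print(type(uniques).__name__)
print(type(date_uniques).__name__)
```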