w3resource

Pandas: Data Manipulation - factorize() function

factorize() function

The factorize() function encodes an object as an enumerated type or categorical variable. It is useful for obtaining a numeric representation of an array when all that matters is identifying distinct values.

Syntax:

pandas.factorize(values, sort=False, order=None, na_sentinel=-1, size_hint=None)

Parameters:

Name | Description | Type | Default Value | Required / Optional
values | A 1-D sequence. Sequences that aren’t pandas objects are coerced to ndarrays before factorization. | sequence | | Required
sort | Sort uniques and shuffle labels to maintain the relationship. | bool | False | Optional
na_sentinel | Value to mark “not found”. | int | -1 | Optional
size_hint | Hint to the hashtable sizer. | int | None | Optional
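
As a rough illustration of the sort flag from the signature above, the following sketch factorizes the same array with and without sorting. The sample array and the variable names are made up for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
letters = np.array(["b", "b", "a", "c", "b"], dtype=object)

# Default (sort=False): uniques appear in order of first occurrence.
codes_unsorted, uniques_unsorted = pd.factorize(letters)

# sort=True: uniques are sorted and the codes are remapped to match.
codes_sorted, uniques_sorted = pd.factorize(letters, sort=True)

print(codes_unsorted, uniques_unsorted)  # [0 0 1 2 0] ['b' 'a' 'c']
print(codes_sorted, uniques_sorted)      # [1 1 0 2 1] ['a' 'b' 'c']
```

In both cases each code still points at the same original value; only the numbering of the distinct values changes.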

Returns:

labels: ndarray - An integer ndarray that’s an indexer into uniques. uniques.take(labels) will have the same values as values.
uniques: ndarray, Index, or Categorical - The unique valid values. When values is Categorical, uniques is a Categorical. When values is some other pandas object, an Index is returned. Otherwise, a 1-D ndarray is returned.
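
The return types described above can be checked with a small sketch like the one below. The input data and variable names are illustrative assumptions, and the inputs deliberately contain no missing values, so taking the codes from uniques rebuilds the original array.

```python
import numpy as np
import pandas as pd

# Plain ndarray input: uniques comes back as a 1-D ndarray.
arr = np.array(["x", "y", "x", "z"], dtype=object)
arr_codes, arr_uniques = pd.factorize(arr)

# Series input (a pandas object): uniques comes back as an Index.
ser_codes, ser_uniques = pd.factorize(pd.Series(["x", "y", "x", "z"]))

# Categorical input: uniques comes back as a Categorical.
cat = pd.Categorical(["x", "x", "z"], categories=["x", "y", "z"])
cat_codes, cat_uniques = pd.factorize(cat)

print(type(arr_uniques).__name__)
print(type(ser_uniques).__name__)
print(type(cat_uniques).__name__)

# With no missing values, taking the codes from uniques rebuilds the input.
restored = arr_uniques.take(arr_codes)
print(restored)
```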

Note: Even if there’s a missing value in values, uniques will not contain an entry for it.
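
A minimal sketch of the missing-value behaviour, using made-up data: the missing entry receives the sentinel code -1 and is left out of uniques. The sketch relies only on the default sentinel and does not pass na_sentinel explicitly, since that keyword is not accepted by every pandas release.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value (None).
with_gap = np.array(["b", None, "a", "c", "b"], dtype=object)

gap_codes, gap_uniques = pd.factorize(with_gap)

print(gap_codes)    # [ 0 -1  1  2  0]
print(gap_uniques)  # ['b' 'a' 'c']

# Boolean mask of the positions that held a missing value.
is_missing = gap_codes == -1
print(is_missing.sum())  # 1
```

Because -1 is a sentinel rather than a real position in uniques, it should be masked out before the codes are used as an indexer.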

Example:
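
The snippet below is one possible usage sketch, not the page's original example: it factorizes a text column of a DataFrame and stores the integer codes in a new column. The DataFrame, the column names, and the variable names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical DataFrame with a single text column.
df = pd.DataFrame({"color": ["red", "green", "red", "blue", "green"]})

# Encode each distinct color as an integer, in order of first appearance.
color_codes, color_uniques = pd.factorize(df["color"])
df["color_code"] = color_codes

print(df)
print(list(color_uniques))  # ['red', 'green', 'blue']
```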

