Pandas: Data Manipulation - get_dummies() function

Last update on August 19 2022 21:50:52 (UTC/GMT +8 hours)

get_dummies() function

The get_dummies() function converts categorical variables into dummy/indicator variables: each distinct value becomes its own column, which marks the rows where that value occurs.

Syntax:

pandas.get_dummies(data, prefix=None, prefix_sep='_', dummy_na=False, columns=None, sparse=False, drop_first=False, dtype=None)

Parameters:

Name	Description	Type	Default Value	Required / Optional
data	Data of which to get dummy indicators.	array-like, Series, or DataFrame		Required
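prefix	String to prepend to the generated dummy column names. When encoding a DataFrame, pass a list with one entry per encoded column, or a dictionary mapping column names to prefixes.	str, list of str, or dict of str	Default: None	Optional
prefix_sep	Separator/delimiter placed between the prefix and the value when a prefix is used. Or pass a list or dictionary as with prefix.	str	Default: '_'	Optional
dummy_na	Add a column to indicate NaNs; if False, NaNs are ignored.	bool	Default: False	Optional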
columns	Column names in the DataFrame to be encoded. If columns is None then all the columns with object or category dtype will be converted.	list-like	Default: None	Optional
sparse	Whether the dummy-encoded columns should be backed by a SparseArray (True) or a regular NumPy array (False).	bool	Default: False	Optional
drop_first	Whether to get k-1 dummies out of k categorical levels by removing the first level.	bool	Default: False	Optional
dtype	Data type for new columns. Only a single dtype is allowed.	dtype	Default: None (resolves to np.uint8 in pandas versions before 2.0, and to bool from 2.0 onward)	Optional
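
As a rough illustration of how several of these parameters interact, the sketch below encodes a small made-up DataFrame. The column names (color, size, price) and their values are assumptions chosen for illustration, and dtype=int is passed explicitly so the result does not depend on the installed pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not taken from the original tutorial.
df = pd.DataFrame({
    "color": ["red", "green", np.nan, "red"],
    "size": ["S", "M", "L", "S"],
    "price": [10, 12, 9, 11],
})

# columns + prefix + prefix_sep: encode only "color" and name the
# new columns "c-<value>". The row with NaN gets 0 in every dummy column.
encoded = pd.get_dummies(df, columns=["color"], prefix="c",
                         prefix_sep="-", dtype=int)
print(list(encoded.columns))  # ['size', 'price', 'c-green', 'c-red']

# dummy_na=True adds one more indicator column that flags the missing value.
with_na = pd.get_dummies(df, columns=["color"], prefix="c",
                         prefix_sep="-", dummy_na=True, dtype=int)

# drop_first=True keeps k-1 of the k levels; here the first level, "L", is dropped.
reduced = pd.get_dummies(df["size"], drop_first=True, dtype=int)
print(list(reduced.columns))  # ['M', 'S']

# A dict assigns each encoded column its own prefix.
both = pd.get_dummies(df, columns=["color", "size"],
                      prefix={"color": "c", "size": "s"}, dtype=int)
print(list(both.columns))  # ['price', 'c_green', 'c_red', 's_L', 's_M', 's_S']
```

Dropping the first level is mainly useful when the dummies feed a linear model, where keeping all k columns makes them perfectly collinear with the intercept.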

Returns: DataFrame - Dummy-coded data.
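
To make the return value concrete, here is a small sketch using made-up data (the size and price columns are assumptions for illustration). It shows that the result is a DataFrame even for Series input, that columns which are not encoded pass through unchanged, and that the dtype argument controls the type of the new columns.

```python
import pandas as pd

# Hypothetical sample data; not taken from the original tutorial.
df = pd.DataFrame({"size": ["S", "M", "L", "S"], "price": [10, 12, 9, 11]})

# With columns=None, only the string column "size" is encoded;
# the numeric "price" column is carried over as-is.
result = pd.get_dummies(df, dtype=float)
print(type(result).__name__)      # DataFrame
print(list(result.columns))       # ['price', 'size_L', 'size_M', 'size_S']
print(result["size_S"].tolist())  # [1.0, 0.0, 0.0, 1.0]

# A Series input also yields a DataFrame, one column per distinct value.
from_series = pd.get_dummies(df["size"], dtype=float)
print(type(from_series).__name__)  # DataFrame
```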

Example:
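
A minimal example, assuming a small made-up Series of letters (the data is illustrative only). dtype=int is passed so the indicators print as 0/1 regardless of pandas version; without it, newer versions print True/False.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
s = pd.Series(["a", "b", "c", "a"])

# One indicator column per distinct value, in sorted order.
dummies = pd.get_dummies(s, dtype=int)
print(dummies)
#    a  b  c
# 0  1  0  0
# 1  0  1  0
# 2  0  0  1
# 3  1  0  0
```

Each row contains exactly one 1, in the column matching that row's original value.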

