Pandas: Data Manipulation - get_dummies() function
get_dummies() function
The get_dummies() function converts categorical variables into dummy/indicator variables: each distinct category value becomes its own column that flags the rows where that value occurs.
Syntax:
pandas.get_dummies(data, prefix=None, prefix_sep='_', dummy_na=False, columns=None, sparse=False, drop_first=False, dtype=None)
Parameters:
Name | Description | Type | Default Value | Required / Optional |
---|---|---|---|---|
data | Data of which to get dummy indicators. | array-like, Series, or DataFrame | | Required |
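prefix | String to prepend to the generated column names. Pass a list with length equal to the number of columns being encoded, or a dictionary mapping column names to prefixes. | str, list of str, or dict of str | Default: None | Optional |
prefix_sep | Separator/delimiter placed between the prefix and the category value. A list or dictionary can also be passed, as with prefix. | str | Default: '_' | Optional |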
dummy_na | Add a column to indicate NaNs. If False, NaNs are ignored. | bool | Default: False | Optional |
columns | Column names in the DataFrame to be encoded. If columns is None then all the columns with object or category dtype will be converted. | list-like | Default: None | Optional |
sparse | Whether the dummy-encoded columns should be backed by a SparseArray (True) or a regular NumPy array (False). | bool | Default: False | Optional |
drop_first | Whether to get k-1 dummies out of k categorical levels by removing the first level. | bool | Default: False | Optional |
dtype | Data type for new columns. Only a single dtype is allowed. | dtype | Default: None (resolves to bool in pandas 2.0 and later, np.uint8 in earlier versions) | Optional |
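The effect of the optional parameters in the table can be illustrated with a short sketch. The sample values and the "grp" prefix are made up for illustration and are not part of the original tutorial; dtype=int is passed explicitly so the result holds 0/1 values regardless of which default dtype the installed pandas version uses.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing a missing value.
s = pd.Series(["a", "b", np.nan])

# dummy_na=False (default): no NaN column; the NaN row is all zeros.
without_na = pd.get_dummies(s, dtype=int)

# dummy_na=True: an extra column flags the row that holds NaN.
with_na = pd.get_dummies(s, dummy_na=True, dtype=int)

# drop_first=True: k-1 columns for k levels; the first level ("a") is dropped.
reduced = pd.get_dummies(pd.Series(list("abca")), drop_first=True, dtype=int)

# prefix and prefix_sep control how the new column names are built.
named = pd.get_dummies(pd.Series(list("abc")), prefix="grp", prefix_sep="-", dtype=int)

print(without_na)
print(with_na)
print(reduced.columns.tolist())  # ['b', 'c']
print(named.columns.tolist())    # ['grp-a', 'grp-b', 'grp-c']
```

Dropping the first level is typically useful when the dummies feed a linear model, since the k columns produced otherwise always sum to 1 per row.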
Returns: DataFrame - Dummy-coded data.
Example:
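A minimal sketch of typical usage on a Series and on a DataFrame follows. The data is hypothetical and stands in for the original notebook, which is not reproduced here; dtype=int is passed so the indicators print as 0/1 on any pandas version.

```python
import pandas as pd

# Hypothetical sample data, not taken from the original notebook.
s = pd.Series(list("abca"))

# One indicator column per distinct value in the Series.
dummies = pd.get_dummies(s, dtype=int)
print(dummies)

# On a DataFrame, only object/category columns are encoded by default;
# the numeric column "C" is passed through unchanged.
df = pd.DataFrame({"A": ["a", "b", "a"], "B": ["b", "a", "c"], "C": [1, 2, 3]})
encoded = pd.get_dummies(df, prefix=["col1", "col2"], dtype=int)
print(encoded.columns.tolist())
# ['C', 'col1_a', 'col1_b', 'col2_a', 'col2_b', 'col2_c']

# columns= restricts encoding to the listed columns; "B" stays as-is.
only_a = pd.get_dummies(df, columns=["A"], dtype=int)
print(only_a.columns.tolist())
# ['B', 'C', 'A_a', 'A_b']
```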