Pandas DataFrame: to_parquet() function

Last update on August 19 2022 21:50:33 (UTC/GMT +8 hours)

DataFrame - to_parquet() function

The to_parquet() function writes a DataFrame to a file in the binary Parquet format.

Syntax:

DataFrame.to_parquet(fname, engine='auto', compression='snappy', index=None, partition_cols=None, **kwargs)

Parameters:

Name	Description	Type / Default Value	Required / Optional
fname	File path or Root Directory path. Will be used as Root Directory path while writing a partitioned dataset.	str	Required
engine	Parquet library to use. If 'auto', then the option io.parquet.engine is used. The default io.parquet.engine behavior is to try 'pyarrow', falling back to 'fastparquet' if 'pyarrow' is unavailable.	{'auto', 'pyarrow', 'fastparquet'} Default Value: 'auto'	Optional
compression	Name of the compression to use. Use None for no compression.	{'snappy', 'gzip', 'brotli', None} Default Value: 'snappy'	Optional
index	If True, include the dataframe’s index(es) in the file output. If False, they will not be written to the file. If None, the behavior depends on the chosen engine.	bool Default Value: None	Optional
partition_cols	Column names by which to partition the dataset. Columns are partitioned in the order they are given.	list Default Value: None	Optional
**kwargs	Additional arguments passed to the parquet library.		Optional

Example:

