You are viewing documentation about an older version (1.18.0).

snowflake.snowpark.DataFrameWriter.parquet

DataFrameWriter.parquet(location: str, *, partition_by: Optional[Union[Column, str]] = None, format_type_options: Optional[Dict[str, str]] = None, header: bool = False, statement_params: Optional[Dict[str, str]] = None, block: bool = True, **copy_options: Optional[str]) → Union[List[Row], AsyncJob]

Internally executes a COPY INTO <location> statement to unload data from a DataFrame into a PARQUET file in a stage or external stage.

Parameters:
  • location – The destination stage location.

  • partition_by – Specifies an expression used to partition the unloaded table rows into separate files. It can be a Column, a column name, or a SQL expression.

  • format_type_options – Format-specific options for the file format type, which is PARQUET for this method. Use the options documented in the Format Type Options.

  • header – Specifies whether to include the table column headings in the output files.

  • statement_params – Dictionary of statement-level parameters to be set while executing this action.

  • copy_options – The kwargs that are used to specify the copy options. Use the options documented in the Copy Options.

  • block – A bool value indicating whether this function will wait until the result is available. When it is False, this function executes the underlying queries of the dataframe asynchronously and returns an AsyncJob.
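The keyword-only parameters above can be combined in a single call. The following is a minimal sketch, not an official example: the helper name `unload_by_last_name`, the `LAST_NAME` column, and the stage path are hypothetical, and `df` is assumed to be a `snowflake.snowpark.DataFrame`. `COMPRESSION` is taken from the Format Type Options for Parquet. The sketch passes no `overwrite` or `single` copy options, on the assumption that Snowflake rejects them together with a partition expression.

```python
def unload_by_last_name(df, stage_location):
    """Unload `df` as Snappy-compressed Parquet files partitioned by LAST_NAME.

    Assumptions: `df` is a snowflake.snowpark.DataFrame with a LAST_NAME
    column, and `stage_location` is a writable stage path such as
    "@my_stage/names/" (both are hypothetical names).
    """
    return df.write.parquet(
        stage_location,
        # A column name; a Column object or a SQL expression is also accepted.
        partition_by="LAST_NAME",
        # Format type option for Parquet output.
        format_type_options={"COMPRESSION": "SNAPPY"},
        # Include the column headings in the output files.
        header=True,
    )
```

With the default `block=True`, the call returns once the unload has finished, and the result is a list of Row objects.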

Returns:

A list of Row objects containing unloading results when block is True, or an AsyncJob when block is False.
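Because the return type depends on `block`, a non-blocking call needs one extra step to get at the unloading results. The sketch below assumes that `AsyncJob.result()` waits for the query and returns the same list of Row objects as a blocking call; the helper name `unload_without_blocking` and its arguments are hypothetical.

```python
def unload_without_blocking(df, stage_location):
    """Start an asynchronous Parquet unload and return the rows-unloaded count.

    Assumptions: `df` is a snowflake.snowpark.DataFrame and `stage_location`
    is a writable stage path (both are hypothetical names).
    """
    # block=False returns an AsyncJob right away instead of a list of Row objects.
    job = df.write.parquet(stage_location, overwrite=True, single=True, block=False)

    # Other client-side work could run here while the unload executes.

    # Assumption: result() blocks until the unload finishes and yields the
    # Row list that a blocking call would have returned.
    rows = job.result()
    return rows[0].rows_unloaded
```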

Example:

>>> # save this dataframe to a parquet file on the session stage
>>> df = session.create_dataframe([["John", "Berry"], ["Rick", "Berry"], ["Anthony", "Davis"]], schema = ["FIRST_NAME", "LAST_NAME"])
>>> remote_file_path = f"{session.get_session_stage()}/names.parquet"
>>> copy_result = df.write.parquet(remote_file_path, overwrite=True, single=True)
>>> copy_result[0].rows_unloaded
3