snowflake.snowpark.DataFrameWriter.parquet

DataFrameWriter.parquet(location: str, *, partition_by: Optional[Union[Column, str]] = None, format_type_options: Optional[Dict[str, str]] = None, header: bool = False, statement_params: Optional[Dict[str, str]] = None, block: bool = True, validation_mode: Optional[str] = None, storage_integration: Optional[str] = None, credentials: Optional[dict] = None, encryption: Optional[dict] = None, **copy_options: Optional[str]) → Union[List[Row], AsyncJob]

Internally executes a COPY INTO <location> statement to unload data from a DataFrame into one or more Parquet files in a stage or external stage.

Parameters:
  • location – The destination stage location.

  • partition_by – Specifies an expression used to partition the unloaded table rows into separate files. It can be a Column, a column name, or a SQL expression.

  • format_type_options – Additional format-specific options for the PARQUET file format. Use the options documented in the Format Type Options.

  • header – Specifies whether to include the table column headings in the output files.

  • statement_params – Dictionary of statement level parameters to be set while executing this action.

  • copy_options – The kwargs that are used to specify the copy options. Use the options documented in the Copy Options.

  • block – A bool value indicating whether this function waits until the result is available. When it is False, this function executes the underlying queries of the DataFrame asynchronously and returns an AsyncJob.

  • validation_mode – String (constant) that instructs the COPY command to return the results of the query in the SQL statement instead of unloading the results to the specified cloud storage location. The only supported validation option is RETURN_ROWS. This option returns all rows produced by the query.

  • storage_integration – Specifies the name of the storage integration used to delegate authentication responsibility for external cloud storage to a Snowflake identity and access management (IAM) entity.

  • credentials – Specifies the security credentials for connecting to the cloud provider and accessing the private/protected cloud storage.

  • encryption – Specifies the encryption settings used to encrypt the files unloaded to the storage location.
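
The following is a minimal sketch of how partition_by, format_type_options, header, and validation_mode might be passed. It assumes df is a Snowpark DataFrame with a LAST_NAME column and stage_location is a stage path such as "@my_stage/names/"; both helper function names are hypothetical and not part of Snowpark.

```python
# Sketch only: `df` is assumed to be a snowflake.snowpark.DataFrame that has a
# LAST_NAME column, and `stage_location` a stage path such as "@my_stage/names/".
# Both function names are hypothetical helpers, not part of Snowpark.


def unload_partitioned(df, stage_location):
    """Unload to Parquet, partitioning the unloaded rows by LAST_NAME."""
    return df.write.parquet(
        stage_location,
        partition_by="LAST_NAME",  # a column name; a Column or SQL expression also works
        format_type_options={"COMPRESSION": "SNAPPY"},  # a Parquet format type option
        header=True,  # include the column headings in the output files
    )


def preview_unload(df, stage_location):
    """Return the rows produced by the query instead of unloading them."""
    return df.write.parquet(stage_location, validation_mode="RETURN_ROWS")
```

The partitioned sketch leaves out the overwrite and single copy options, which the COPY INTO <location> reference describes as incompatible with PARTITION BY; check that reference before combining them.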

Returns:

A list of Row objects containing unloading results. When block is False, an AsyncJob is returned instead.
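
As a rough illustration of the non-blocking case, the sketch below starts the unload with block=False and later collects the outcome from the returned AsyncJob through its result() method. It assumes df is a Snowpark DataFrame and stage_location is a writable stage path; the helper function name is hypothetical.

```python
# Sketch only: `df` is assumed to be a snowflake.snowpark.DataFrame and
# `stage_location` a writable stage path. The function name is hypothetical.


def unload_without_blocking(df, stage_location):
    """Start the unload asynchronously, then wait for and return its result."""
    # With block=False the call returns an AsyncJob right away.
    job = df.write.parquet(stage_location, block=False, overwrite=True)

    # Other client-side work could run here while the unload executes.

    # AsyncJob.result() blocks until the query has finished.
    return job.result()
```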

Example:

>>> # save this dataframe to a parquet file on the session stage
>>> df = session.create_dataframe([["John", "Berry"], ["Rick", "Berry"], ["Anthony", "Davis"]], schema = ["FIRST_NAME", "LAST_NAME"])
>>> remote_file_path = f"{session.get_session_stage()}/names.parquet"
>>> copy_result = df.write.parquet(remote_file_path, overwrite=True, single=True)
>>> copy_result[0].rows_unloaded
3