snowflake.snowpark.Session.write_pandas¶
- Session.write_pandas(df: Union[pandas.DataFrame, modin.pandas.DataFrame, modin.pandas.Series], table_name: str, *, database: Optional[str] = None, schema: Optional[str] = None, chunk_size: Optional[int] = None, compression: str = 'gzip', on_error: str = 'abort_statement', parallel: int = 4, quote_identifiers: bool = True, auto_create_table: bool = False, create_temp_table: bool = False, overwrite: bool = False, table_type: Literal['', 'temp', 'temporary', 'transient'] = '', use_logical_type: Optional[bool] = None, **kwargs: Dict[str, Any]) Table[source]¶
- Writes a pandas DataFrame to a table in Snowflake and returns a Snowpark - DataFrameobject referring to the table where the pandas DataFrame was written to.- Parameters:
- df – The pandas DataFrame or Snowpark pandas DataFrame or Series we’d like to write back. 
- table_name – Name of the table we want to insert into. 
- database – Database that the table is in. If not provided, the default one will be used. 
- schema – Schema that the table is in. If not provided, the default one will be used. 
- chunk_size – Number of rows to be inserted once. If not provided, all rows will be dumped once. Default to None normally, 100,000 if inside a stored procedure. Note: If auto_create_table or overwrite is set to True, the chunk size may affect schema inference because different chunks might contain varying data types, especially when None values are present. This can lead to inconsistencies in inferred types. 
- compression – The compression used on the Parquet files: gzip or snappy. Gzip gives supposedly a better compression, while snappy is faster. Use whichever is more appropriate. 
- on_error – Action to take when COPY INTO statements fail. See details at copy options. 
- parallel – Number of threads to be used when uploading chunks. See details at parallel parameter. 
- quote_identifiers – By default, identifiers, specifically database, schema, table and column names (from - DataFrame.columns) will be quoted. If set to- False, identifiers are passed on to Snowflake without quoting, i.e. identifiers will be coerced to uppercase by Snowflake.
- auto_create_table – When true, automatically creates a table to store the passed in pandas DataFrame using the passed in - database,- schema, and- table_name. Note: there are usually multiple table configurations that would allow you to upload a particular pandas DataFrame successfully. If you don’t like the auto created table, you can always create your own table before calling this function. For example, auto-created tables will store- list,- tupleand- dictas strings in a VARCHAR column.
- create_temp_table – (Deprecated) The to-be-created table will be temporary if this is set to - True. Note that to avoid breaking changes, currently when this is set to True, it overrides- table_type.
- overwrite – Default value is - Falseand the pandas DataFrame data is appended to the existing table. If set to- Trueand if auto_create_table is also set to- True, then it drops the table. If set to- Trueand if auto_create_table is set to- False, then it truncates the table. Note that in both cases (when overwrite is set to- True) it will replace the existing contents of the table with that of the passed in pandas DataFrame.
- table_type – The table type of table to be created. The supported values are: - temp,- temporary, and- transient. An empty string means to create a permanent table. Learn more about table types here.
- use_logical_type – Boolean that specifies whether to use Parquet logical types when reading the parquet files for the uploaded pandas dataframe. With this file format option, Snowflake can interpret Parquet logical types during data loading. To enable Parquet logical types, set use_logical_type as True. Set to None to use Snowflakes default. For more information, see: file format options:. 
 
 - Example: - >>> import pandas as pd >>> pandas_df = pd.DataFrame([(1, "Steve"), (2, "Bob")], columns=["id", "name"]) >>> snowpark_df = session.write_pandas(pandas_df, "write_pandas_table", auto_create_table=True, table_type="temp") >>> snowpark_df.sort('"id"').to_pandas() id name 0 1 Steve 1 2 Bob >>> pandas_df2 = pd.DataFrame([(3, "John")], columns=["id", "name"]) >>> snowpark_df2 = session.write_pandas(pandas_df2, "write_pandas_table", auto_create_table=False) >>> snowpark_df2.sort('"id"').to_pandas() id name 0 1 Steve 1 2 Bob 2 3 John >>> pandas_df3 = pd.DataFrame([(1, "Jane")], columns=["id", "name"]) >>> snowpark_df3 = session.write_pandas(pandas_df3, "write_pandas_table", auto_create_table=False, overwrite=True) >>> snowpark_df3.to_pandas() id name 0 1 Jane >>> pandas_df4 = pd.DataFrame([(1, "Jane")], columns=["id", "name"]) >>> snowpark_df4 = session.write_pandas(pandas_df4, "write_pandas_transient_table", auto_create_table=True, table_type="transient") >>> snowpark_df4.to_pandas() id name 0 1 Jane - Note - 1. Unless - auto_create_tableis- True, you must first create a table in Snowflake that the passed in pandas DataFrame can be written to. If your pandas DataFrame cannot be written to the specified table, an exception will be raised.- 2. If the dataframe is Snowpark pandas - DataFrameor- Series, it will call- modin.pandas.DataFrame.to_snowflakeor- modin.pandas.Series.to_snowflakeinternally to write a Snowpark pandas DataFrame into a Snowflake table.- 3. If the input pandas DataFrame has datetime64[ns, tz] columns and auto_create_table is set to True, they will be converted to TIMESTAMP_LTZ in the output Snowflake table by default. If TIMESTAMP_TZ is needed for those columns instead, please manually create the table before loading data.