snowflake.snowpark.DataFrame.to_snowpark_pandas
- DataFrame.to_snowpark_pandas(index_col: Optional[Union[str, List[str]]] = None, columns: Optional[List[str]] = None) → modin.pandas.DataFrame
Convert the Snowpark DataFrame to a Snowpark pandas DataFrame.
- Parameters:
index_col – A column name or a list of column names to use as index.
columns – A list of column names for the columns to select from the Snowpark DataFrame. If not specified, select all columns except ones configured in index_col.
- Returns:
DataFrame
A Snowpark pandas DataFrame that contains index and data columns based on a snapshot of the current Snowpark DataFrame. Taking the snapshot triggers an eager evaluation.
If index_col is provided, the specified column(s) are used as the index of the resulting DataFrame; otherwise, a default range index from 0 to n - 1 is created, where n is the number of rows. Note that this index is also used as the initial row ordering for the DataFrame, but there is no guarantee that the default row ordering is the same for two Snowpark pandas DataFrames created from the same Snowpark DataFrame.
If columns is provided, the specified columns are selected as the data column(s) of the resulting DataFrame; otherwise, all Snowpark DataFrame columns (excluding index_col) are selected as data columns.
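The selection rules above can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's implementation; `resolve_columns` and its arguments are hypothetical names introduced here.

```python
from typing import List, Optional, Tuple, Union


def resolve_columns(
    all_columns: List[str],
    index_col: Optional[Union[str, List[str]]] = None,
    columns: Optional[List[str]] = None,
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: return (index columns, data columns)."""
    # index_col may be a single name or a list of names.
    if index_col is None:
        index_cols: List[str] = []
    elif isinstance(index_col, str):
        index_cols = [index_col]
    else:
        index_cols = list(index_col)

    if columns is not None:
        # Explicit columns are used as given, duplicates included.
        data_cols = list(columns)
    else:
        # Default: every column that is not used as an index column.
        data_cols = [c for c in all_columns if c not in index_cols]
    return index_cols, data_cols


print(resolve_columns(["A", "B", "C"]))
print(resolve_columns(["A", "B", "C"], index_col="A"))
print(resolve_columns(["A", "B", "C"], index_col=["B", "A"], columns=["A", "C", "A"]))
```

An empty list of index columns in this sketch corresponds to the case where the default range index is created.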
Note
Transformations performed on the returned Snowpark pandas DataFrame do not affect the Snowpark DataFrame from which it was created. Call modin.pandas.to_snowpark to transform a Snowpark pandas DataFrame back to a Snowpark DataFrame.
The column names used for columns or index_col must be Normalized Snowflake Identifiers. The Normalized Snowflake Identifiers of a Snowpark DataFrame can be displayed by calling df.show(). For details about Normalized Snowflake Identifiers, refer to the Note in read_snowflake().
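As a rough illustration of why a column created as "a" is addressed as 'A' in the example on this page, the sketch below approximates Snowflake's identifier resolution: unquoted identifiers resolve to uppercase, while double-quoted identifiers keep their case. `normalize_identifier` is a hypothetical helper written for this note, not part of Snowpark, and it ignores edge cases such as names that Snowpark quotes automatically.

```python
def normalize_identifier(name: str) -> str:
    """Hypothetical helper: approximate the normalized form of an identifier."""
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Quoted identifier: case is preserved; "" inside is an escaped quote.
        return name[1:-1].replace('""', '"')
    # Unquoted identifier: Snowflake resolves it as uppercase.
    return name.upper()


for raw in ["a", '"a"', "Order_Id", '"my col"']:
    print(f"{raw!r} -> {normalize_identifier(raw)!r}")
```

When in doubt, calling df.show() and copying the displayed column names remains the reliable way to obtain the identifiers to pass as index_col or columns.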
to_snowpark_pandas works only when the environment is set up correctly for Snowpark pandas. This environment may require versions of Python and pandas different from those that Snowpark Python uses. If the environment is set up incorrectly, an error is raised when to_snowpark_pandas is called.
For Python version support information, refer to:
- the prerequisites section: https://docs.snowflake.com/en/developer-guide/snowpark/python/snowpark-pandas#prerequisites
- the installation section: https://docs.snowflake.com/en/developer-guide/snowpark/python/snowpark-pandas#installing-the-snowpark-pandas-api
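Before calling to_snowpark_pandas, it can help to inspect the local environment. The following is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper, and the module names checked (pandas, modin) are an assumption about what a Snowpark pandas setup needs. It does not verify version requirements, which are listed in the sections linked above.

```python
import importlib.util
import sys
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Hypothetical helper: return the names that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is absent or a module has no spec.
            found = False
        if not found:
            missing.append(name)
    return missing


print("Python version:", ".".join(str(part) for part in sys.version_info[:3]))
print("Missing modules:", missing_modules(["pandas", "modin"]))
```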
- Example::
>>> df = session.create_dataframe([[1, 2, 3]], schema=["a", "b", "c"])
>>> snowpark_pandas_df = df.to_snowpark_pandas()
>>> snowpark_pandas_df
   A  B  C
0  1  2  3

>>> snowpark_pandas_df = df.to_snowpark_pandas(index_col='A')
>>> snowpark_pandas_df
   B  C
A
1  2  3
>>> snowpark_pandas_df = df.to_snowpark_pandas(index_col='A', columns=['B'])
>>> snowpark_pandas_df
   B
A
1  2
>>> snowpark_pandas_df = df.to_snowpark_pandas(index_col=['B', 'A'], columns=['A', 'C', 'A'])
>>> snowpark_pandas_df
     A  C  A
B A
2 1  1  3  1