You are viewing documentation about an older version (1.18.0).

modin.pandas.to_snowpark

snowflake.snowpark.modin.pandas.to_snowpark(obj: Union[DataFrame, Series], index: bool = True, index_label: Optional[Union[Hashable, Sequence[Hashable]]] = None) → DataFrame

Convert the Snowpark pandas DataFrame or Series to a Snowpark DataFrame. Note that the resulting Snowpark DataFrame preserves no ordering information. To keep the row order, call reset_index before to_snowpark: it generates a default index column whose values match the row positions.
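The ordering caveat can be worked around by turning the row position into a regular column before converting. The following is a minimal sketch, not part of the library: the helper name `to_snowpark_with_position` and the default label `row_position` are hypothetical, and the conversion function is passed in as an argument (for example `pd.to_snowpark`) so that the helper can be defined without a Snowflake session.

```python
def to_snowpark_with_position(obj, to_snowpark, label="row_position"):
    """Convert obj while keeping its row position as a sortable column.

    obj         -- a Snowpark pandas DataFrame or Series
    to_snowpark -- the conversion function, e.g. pd.to_snowpark
    label       -- hypothetical column name for the row position
    """
    # reset_index(drop=True) discards the current index and installs the
    # default index, whose values match the row positions (0, 1, 2, ...).
    positioned = obj.reset_index(drop=True)
    # Keep that default index as the first column of the Snowpark DataFrame.
    return to_snowpark(positioned, index=True, index_label=label)
```

Under these assumptions, `to_snowpark_with_position(df, pd.to_snowpark).order_by('"row_position"')` would return the rows in their original order. Because `drop=True` discards the existing index, this sketch is only suitable when the index values are not needed in the result.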

Parameters:
  • obj – The object to be converted to a Snowpark DataFrame. It must be either a Snowpark pandas DataFrame or Series.

  • index – bool, default True. Whether to keep the index columns in the resulting Snowpark DataFrame. If True, the index columns will be the first set of columns. Otherwise, no index column will be included in the final Snowpark DataFrame.

  • index_label – IndexLabel, default None. Column label(s) to use for the index column(s). If None is given (default) and index is True, then the original index column labels are used. A sequence should be given if the DataFrame uses a MultiIndex, and the length of the given sequence should be the same as the number of index columns.
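The interaction between index and index_label described above can be sketched in plain Python. This is an illustration of the documented rules only, not the library's implementation: the function name `resolve_index_labels` is hypothetical, treating only a list as "multiple labels" is an assumption, and raising ValueError on a length mismatch is this sketch's choice (the error the library raises in that case is not stated here).

```python
from typing import Hashable, List, Optional, Sequence, Union


def resolve_index_labels(
    original: Sequence[Hashable],
    index: bool = True,
    index_label: Optional[Union[Hashable, Sequence[Hashable]]] = None,
) -> List[Hashable]:
    """Return the labels the index columns would carry after conversion."""
    if not index:
        # index=False: no index column is included at all.
        return []
    if index_label is None:
        # Default: the original index column labels are used.
        return list(original)
    # Assumption: a list means one label per index column; anything else
    # is treated as a single label.
    labels = list(index_label) if isinstance(index_label, list) else [index_label]
    if len(labels) != len(original):
        # Sketch-only error; the documented rule is that the lengths match.
        raise ValueError(
            f"expected {len(original)} index label(s), got {len(labels)}"
        )
    return labels
```

For a MultiIndex named `['first', 'second']`, `resolve_index_labels(['first', 'second'], index_label=['A', 'B'])` yields `['A', 'B']`, which matches the column headers shown in the MultiIndex example.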

Returns:

DataFrame

A Snowpark DataFrame that contains the index columns (if index=True) followed by all data columns of the Snowpark pandas DataFrame. Each column identifier in the Snowpark DataFrame is the normalized quoted identifier with the same name as the corresponding pandas label.

Raises:
  • ValueError if duplicated labels occur among the index and data columns.

  • ValueError if the label used for an index or data column is None.
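The two error conditions above can be checked ahead of time on the label lists themselves. The sketch below is a hypothetical pre-flight check written in plain Python; the function name `check_labels` and the error messages are illustrative and are not taken from the library.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def check_labels(
    index_labels: Sequence[Hashable], data_labels: Sequence[Hashable]
) -> List[Hashable]:
    """Validate the combined index and data labels; return them in order."""
    labels = list(index_labels) + list(data_labels)
    # Rule 1: no index or data column may be labeled None.
    if any(label is None for label in labels):
        raise ValueError("index and data column labels must not be None")
    # Rule 2: labels must be unique across index and data columns together.
    duplicates = [label for label, count in Counter(labels).items() if count > 1]
    if duplicates:
        raise ValueError(f"duplicated labels: {duplicates}")
    return labels
```

An unnamed index carries the label None, so under the second rule such a frame would presumably need either an explicit index_label or index=False, as in the first two examples.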

Note

The labels of the Snowpark pandas DataFrame, or the index_label provided, are used as the Normalized Snowflake Identifiers of the Snowpark DataFrame. For details about Normalized Snowflake Identifiers, refer to the Note in read_snowflake().
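Because the resulting column identifiers are quoted, Snowpark calls such as order_by refer to them with double quotes, as the examples do with `'"Max Speed"'`. The helper below is a small sketch for building such a string from a pandas label. It assumes the standard Snowflake quoting convention (wrap in double quotes, double any embedded double quote) and string labels only; the name `quoted_identifier` is hypothetical, and the full normalization rules are those described in the read_snowflake() Note.

```python
def quoted_identifier(label: str) -> str:
    """Wrap a string label in double quotes for use as a Snowflake identifier."""
    # Assumption: an embedded double quote is escaped by doubling it,
    # per the usual Snowflake quoted-identifier convention.
    return '"' + label.replace('"', '""') + '"'
```

With this helper, `snowpark_df.order_by(quoted_identifier('Max Speed'))` would be equivalent to the `order_by('"Max Speed"')` calls in the examples.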

Examples:

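>>> # Assumed setup for Snowpark pandas; an active Snowpark session is also required.
>>> import modin.pandas as pd
>>> import snowflake.snowpark.modin.plugin
>>> df = pd.DataFrame({'Animal': ['Falcon', 'Falcon',
...                               'Parrot', 'Parrot'],
...                    'Max Speed': [380., 370., 24., 26.]})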
>>> df
   Animal  Max Speed
0  Falcon      380.0
1  Falcon      370.0
2  Parrot       24.0
3  Parrot       26.0
>>> snowpark_df = pd.to_snowpark(df, index_label='Order')
>>> snowpark_df.order_by('"Max Speed"').show()
------------------------------------
|"Order"  |"Animal"  |"Max Speed"  |
------------------------------------
|2        |Parrot    |24.0         |
|3        |Parrot    |26.0         |
|1        |Falcon    |370.0        |
|0        |Falcon    |380.0        |
------------------------------------

>>> snowpark_df = pd.to_snowpark(df, index=False)
>>> snowpark_df.order_by('"Max Speed"').show()
--------------------------
|"Animal"  |"Max Speed"  |
--------------------------
|Parrot    |24.0         |
|Parrot    |26.0         |
|Falcon    |370.0        |
|Falcon    |380.0        |
--------------------------

>>> df = pd.DataFrame({'Animal': ['Falcon', 'Falcon',
...                               'Parrot', 'Parrot'],
...                    'Max Speed': [380., 370., 24., 26.]}, index=pd.Index([3, 5, 6, 7], name="id"))
>>> df
    Animal  Max Speed
id
3   Falcon      380.0
5   Falcon      370.0
6   Parrot       24.0
7   Parrot       26.0
>>> snowpark_df = pd.to_snowpark(df)
>>> snowpark_df.order_by('"id"').show()
---------------------------------
|"id"  |"Animal"  |"Max Speed"  |
---------------------------------
|3     |Falcon    |380.0        |
|5     |Falcon    |370.0        |
|6     |Parrot    |24.0         |
|7     |Parrot    |26.0         |
---------------------------------


MultiIndex usage

>>> df = pd.DataFrame({'Animal': ['Falcon', 'Falcon',
...                               'Parrot', 'Parrot'],
...                    'Max Speed': [380., 370., 24., 26.]},
...                    index=pd.MultiIndex.from_tuples([('bar', 'one'), ('foo', 'one'), ('bar', 'two'), ('foo', 'three')], names=['first', 'second']))
>>> df
                Animal  Max Speed
first second
bar   one     Falcon      380.0
foo   one     Falcon      370.0
bar   two     Parrot       24.0
foo   three   Parrot       26.0
>>> snowpark_df = pd.to_snowpark(df, index=True, index_label=['A', 'B'])
>>> snowpark_df.order_by('"A"', '"B"').show()
----------------------------------------
|"A"  |"B"    |"Animal"  |"Max Speed"  |
----------------------------------------
|bar  |one    |Falcon    |380.0        |
|bar  |two    |Parrot    |24.0         |
|foo  |one    |Falcon    |370.0        |
|foo  |three  |Parrot    |26.0         |
----------------------------------------

>>> snowpark_df = pd.to_snowpark(df, index=False)
>>> snowpark_df.order_by('"Max Speed"').show()
--------------------------
|"Animal"  |"Max Speed"  |
--------------------------
|Parrot    |24.0         |
|Parrot    |26.0         |
|Falcon    |370.0        |
|Falcon    |380.0        |
--------------------------

>>> snowpark_df = pd.to_snowpark(df["Animal"], index=False)
>>> snowpark_df.order_by('"Animal"').show()
------------
|"Animal"  |
------------
|Falcon    |
|Falcon    |
|Parrot    |
|Parrot    |
------------