You are viewing documentation about an older version (1.3.0).

snowflake.snowpark.DataFrame.to_pandas_batches

DataFrame.to_pandas_batches(*, statement_params: Dict[str, str] | None = None, block: bool = True, **kwargs: Dict[str, Any]) → Iterator[pandas.DataFrame]
DataFrame.to_pandas_batches(*, statement_params: Dict[str, str] | None = None, block: bool = False, **kwargs: Dict[str, Any]) → AsyncJob

Executes the query representing this DataFrame and returns an iterator of pandas DataFrames, each containing a subset of the rows, that you can use to retrieve the results.

Unlike to_pandas(), this method does not load all data into memory at once.

Example:

>>> df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
>>> for pandas_df in df.to_pandas_batches():
...     print(pandas_df)
   A  B
0  1  2
1  3  4
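
The following is a minimal sketch of how the iterator might be consumed so that only one batch is referenced at a time. The helper name `consume_batches` and its `handle_batch` callback are hypothetical and not part of Snowpark; the helper is duck-typed and relies only on `len()`, which for a pandas DataFrame is its row count.

```python
from typing import Any, Callable, Iterable


def consume_batches(batches: Iterable[Any], handle_batch: Callable[[Any], None]) -> int:
    """Pass each batch to handle_batch and return the total number of rows seen.

    Hypothetical helper, not part of Snowpark. Only the current batch is
    referenced inside the loop, so earlier batches can be garbage-collected
    once handle_batch returns (unless handle_batch keeps its own reference).
    """
    total_rows = 0
    for batch in batches:
        handle_batch(batch)
        total_rows += len(batch)  # len() of a pandas DataFrame is its row count
    return total_rows


# Assumed usage with a Snowpark DataFrame `df` (requires a live session):
# rows = consume_batches(df.to_pandas_batches(), lambda pdf: print(pdf.shape))
```

Passing the iterator straight into the helper, rather than first collecting it into a list, is what keeps peak memory bounded by the largest single batch.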
Parameters:
  • statement_params – Dictionary of statement-level parameters to be set while executing this action.

  • block – A bool indicating whether this function waits until the result is available. When it is False, this function executes the underlying queries of the DataFrame asynchronously and returns an AsyncJob.
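
A minimal sketch of the non-blocking form is shown here. The wrapper name `fetch_batches_async` is hypothetical, and the sketch assumes that calling `result()` on the returned AsyncJob waits for the query to finish and yields the same iterator of pandas DataFrames that the blocking call would return.

```python
from typing import Any, Iterator


def fetch_batches_async(df: Any) -> Iterator[Any]:
    """Start the query without blocking, then wait for and return its batches.

    Hypothetical wrapper, not part of Snowpark. `df` is expected to be a
    Snowpark DataFrame.
    """
    # block=False returns an AsyncJob immediately instead of an iterator.
    job = df.to_pandas_batches(block=False)

    # Other client-side work could run here while the query executes.

    # Assumption: AsyncJob.result() blocks until the query completes and
    # returns the iterator of pandas DataFrames for this job.
    return job.result()


# Assumed usage (requires a live session):
# for pandas_df in fetch_batches_async(df):
#     print(pandas_df)
```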

Note

  1. This method is only available if pandas is installed.

  2. If you use Session.sql() with this method, the input query of Session.sql() can only be a SELECT statement.