You are viewing documentation about an older version (1.12.1).

snowflake.snowpark.DataFrame.cache_result

DataFrame.cache_result(*, statement_params: Optional[Dict[str, str]] = None) → Table

Caches the content of this DataFrame to create a new cached Table DataFrame.

All subsequent operations on the returned cached DataFrame are performed on the cached data and have no effect on the original DataFrame.

Use Table.drop_table() or a with statement to clean up the cached result when it is no longer needed. See the examples below.

Note

An error is raised if a cached result, or any DataFrame derived from it, is used after the cached result has been cleaned up.
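Because of the rule in the note above, any rows you need have to be materialized before the cached table is dropped. The following is a minimal sketch, assuming `df` is a Snowpark DataFrame; `collect_from_cache` and `transform` are hypothetical names introduced for illustration and are not part of Snowpark:

```python
def collect_from_cache(df, transform):
    """Cache `df`, apply `transform` to the cached Table, and return rows.

    Hypothetical helper, not part of Snowpark. `df` is assumed to expose
    cache_result(), and `transform` is assumed to take the cached Table
    and return a DataFrame derived from it.
    """
    with df.cache_result() as cached:
        derived = transform(cached)
        # Materialize inside the block: once the block exits, the cached
        # result is cleaned up and `derived` can no longer be evaluated.
        return derived.collect()
```

Returning the collected rows rather than `derived` itself keeps callers from holding a DataFrame that depends on a cached result that has already been cleaned up.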

Examples:
>>> create_result = session.sql("create temp table RESULT (NUM int)").collect()
>>> insert_result = session.sql("insert into RESULT values(1),(2)").collect()
>>> df = session.table("RESULT")
>>> df.collect()
[Row(NUM=1), Row(NUM=2)]
>>> # Run cache_result and then insert into the original table to see
>>> # that the cached result is not affected
>>> df1 = df.cache_result()
>>> insert_again_result = session.sql("insert into RESULT values (3)").collect()
>>> df1.collect()
[Row(NUM=1), Row(NUM=2)]
>>> df.collect()
[Row(NUM=1), Row(NUM=2), Row(NUM=3)]
>>> # You can run cache_result on a result that has already been cached
>>> df2 = df1.cache_result()
>>> df2.collect()
[Row(NUM=1), Row(NUM=2)]
>>> df3 = df.cache_result()
>>> # Drop RESULT and see that the cached results still exist
>>> drop_table_result = session.sql("drop table RESULT").collect()
>>> df1.collect()
[Row(NUM=1), Row(NUM=2)]
>>> df2.collect()
[Row(NUM=1), Row(NUM=2)]
>>> df3.collect()
[Row(NUM=1), Row(NUM=2), Row(NUM=3)]
>>> # Clean up the cached result
>>> df3.drop_table()
>>> # Use a context manager to clean up the cached result after its use.
>>> with df2.cache_result() as df4:
...     df4.collect()
[Row(NUM=1), Row(NUM=2)]
Parameters:

statement_params – Dictionary of statement level parameters to be set while executing this action.
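As an illustration of statement_params, the sketch below tags the caching statement so it can be found in query history. It assumes `df` is a Snowpark DataFrame and that `QUERY_TAG` is accepted as a statement-level parameter in your account; `cache_with_query_tag` is a hypothetical helper name, not part of Snowpark:

```python
def cache_with_query_tag(df, tag):
    """Cache `df`, passing a query tag as a statement-level parameter.

    Hypothetical helper, not part of Snowpark. Using QUERY_TAG here is an
    assumption about which parameters the account accepts.
    """
    # Keys and values are both strings, matching Optional[Dict[str, str]].
    return df.cache_result(statement_params={"QUERY_TAG": tag})
```

The returned Table can then be used, and later cleaned up, like any other cached result.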

Returns:

A Table object that holds the cached result in a temporary table. All operations on this new DataFrame have no effect on the original.