snowflake.hypothesis_snowpark.dataframe_strategy
- snowflake.hypothesis_snowpark.dataframe_strategy(schema: str | DataFrameSchema, session: Session, size: int | None = None) → SearchStrategy[DataFrame]
Create a Hypothesis strategy for generating Snowpark DataFrames based on a given schema.
- Parameters:
schema – A schema defining the columns, data types, and checks that the generated DataFrame should satisfy. This can be either a path to a JSON schema file generated by the snowflake.snowpark_checkpoints_collector.collect_dataframe_checkpoint() function when the collection mode is set to SCHEMA, or a Pandera DataFrameSchema object.
session – The Snowpark session to use for creating the DataFrames.
size – The number of rows to generate for each DataFrame. If not specified, the strategy will generate DataFrames of different sizes.
Examples
Generate a Snowpark DataFrame from a JSON schema file:
>>> from hypothesis import given
>>> from snowflake.hypothesis_snowpark import dataframe_strategy
>>> from snowflake.snowpark import DataFrame, Session
>>> @given(
...     df=dataframe_strategy(
...         schema="path/to/schema.json",
...         session=Session.builder.getOrCreate(),
...         size=10,
...     )
... )
... def test_my_function(df: DataFrame):
...     ...
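Because schema is passed as a path string, a relative path is interpreted against whatever working directory the test runner happens to use. The following is a minimal sketch of one way to make the path explicit and fail early when the file is missing. The helper name resolve_schema_path and its base_dir argument are hypothetical and are not part of snowflake.hypothesis_snowpark.

```python
from pathlib import Path


def resolve_schema_path(filename: str, base_dir: Path) -> str:
    """Return an absolute path to a schema file, failing early if it is missing.

    Hypothetical helper for illustration; not part of snowflake.hypothesis_snowpark.
    """
    path = (base_dir / filename).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Schema file not found: {path}")
    # The documented type of `schema` is `str`, so return a string, not a Path.
    return str(path)
```

In a test module, the result could then be passed as schema=resolve_schema_path("schema.json", Path(__file__).parent), assuming the schema file is stored next to the test.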
Generate a Snowpark DataFrame from a Pandera DataFrameSchema object:
>>> import pandera as pa
>>> from hypothesis import given
>>> from snowflake.hypothesis_snowpark import dataframe_strategy
>>> from snowflake.snowpark import DataFrame, Session
>>> @given(
...     df=dataframe_strategy(
...         schema=pa.DataFrameSchema(
...             {
...                 "A": pa.Column(pa.Int, checks=pa.Check.in_range(0, 10)),
...                 "B": pa.Column(pa.Bool),
...             }
...         ),
...         session=Session.builder.getOrCreate(),
...         size=10,
...     )
... )
... def test_my_function(df: DataFrame):
...     ...
You can control aspects like the maximum number of test cases, the deadline for each test execution, verbosity levels and many others using the Hypothesis @settings decorator.
>>> from datetime import timedelta
>>> from hypothesis import given, settings
>>> from snowflake.hypothesis_snowpark import dataframe_strategy
>>> from snowflake.snowpark import DataFrame, Session
>>> @given(
...     df=dataframe_strategy(
...         schema="path/to/schema.json",
...         session=Session.builder.getOrCreate(),
...         size=10,
...     )
... )
... @settings(
...     deadline=timedelta(milliseconds=800),
...     max_examples=25,
... )
... def test_my_function(df: DataFrame):
...     ...
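If several tests share the same settings, Hypothesis settings profiles may be more convenient than repeating the @settings decorator on each test. The sketch below uses only the general Hypothesis API (settings.register_profile and settings.load_profile) and is not specific to dataframe_strategy. The profile names, their values, and the HYPOTHESIS_PROFILE environment variable are illustrative assumptions.

```python
import os
from datetime import timedelta

from hypothesis import settings

# Profile names and values are illustrative assumptions, not library defaults.
settings.register_profile(
    "snowpark_dev",
    max_examples=5,   # fewer generated DataFrames for quick local runs
    deadline=None,    # disable the per-example deadline
)
settings.register_profile(
    "snowpark_ci",
    max_examples=25,
    deadline=timedelta(milliseconds=800),
)

# HYPOTHESIS_PROFILE is a user-chosen variable name; Hypothesis does not read it itself.
settings.load_profile(os.environ.get("HYPOTHESIS_PROFILE", "snowpark_dev"))
```

With pytest, code like this would typically live in conftest.py so that the profile is loaded before any test is collected.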
- Returns:
A Hypothesis strategy that generates Snowpark DataFrames.