snowflake.snowpark_checkpoints.check_with_spark¶
- snowflake.snowpark_checkpoints.check_with_spark(job_context: SnowparkJobContext | None, spark_function: F, checkpoint_name: str, sample_number: int | None = 100, sampling_strategy: SamplingStrategy | None = 1, output_path: str | None = None) Callable[[F], F] ¶
Validate a function's output against its PySpark equivalent.
The decorator samples the input Snowpark DataFrame of the decorated function, converts the sample to a Spark DataFrame, and executes spark_function on it. The output of spark_function is then compared against the output of the decorated function for the same sample of data.
- Parameters:
job_context (SnowparkJobContext) – The job context containing configuration and details for the validation.
spark_function (fn) – The equivalent PySpark function to compare against the Snowpark implementation.
checkpoint_name (str) – A name for the checkpoint, used to identify the validation results.
sample_number (Optional[int], optional) – The number of rows for validation. Defaults to 100.
sampling_strategy (Optional[SamplingStrategy], optional) – The strategy used for sampling data. Defaults to SamplingStrategy.RANDOM_SAMPLE.
output_path (Optional[str], optional) – The path to store the validation results. Defaults to None.
- Returns:
A decorator that wraps the original Snowpark function with validation logic.
- Return type:
Callable[[fn], fn]
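The core pattern behind this decorator can be sketched in plain Python: sample the input, run both the wrapped function and a reference implementation on the sample, and compare the outputs before returning the full result. The sketch below is a simplified, self-contained illustration of that flow, not the library's implementation; check_with_reference, reference_fn, and the list-based "DataFrames" are hypothetical stand-ins for the Snowpark/Spark machinery.

```python
import random
from typing import Callable, List

def check_with_reference(reference_fn: Callable[[List[int]], List[int]],
                         sample_number: int = 100,
                         seed: int = 42) -> Callable:
    """Hypothetical sketch of the check_with_spark pattern: validate a
    function against a reference implementation on a data sample."""
    def decorator(fn: Callable[[List[int]], List[int]]) -> Callable[[List[int]], List[int]]:
        def wrapper(rows: List[int]) -> List[int]:
            # Draw a deterministic sample of at most sample_number rows
            # (the real library offers a SamplingStrategy instead).
            rng = random.Random(seed)
            sample = rng.sample(rows, min(sample_number, len(rows)))
            expected = reference_fn(sample)  # reference path (PySpark in the real API)
            actual = fn(sample)              # wrapped path (Snowpark in the real API)
            if sorted(expected) != sorted(actual):
                raise AssertionError("validation mismatch on sampled data")
            # Validation passed: run the wrapped function on the full input.
            return fn(rows)
        return wrapper
    return decorator

# Usage: both implementations double each value, so validation passes.
@check_with_reference(reference_fn=lambda rows: [r * 2 for r in rows],
                      sample_number=5)
def double_all(rows: List[int]) -> List[int]:
    return [r * 2 for r in rows]
```

In the real API, the decorated function receives a Snowpark DataFrame, the sample is converted to a Spark DataFrame before spark_function runs, and mismatches are reported through the job context rather than raised inline.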