snowflake.snowpark_checkpoints.check_with_spark

snowflake.snowpark_checkpoints.check_with_spark(job_context: SnowparkJobContext | None, spark_function: F, checkpoint_name: str, sample_number: int | None = 100, sampling_strategy: SamplingStrategy | None = SamplingStrategy.RANDOM_SAMPLE, output_path: str | None = None) → Callable[[F], F]

Validate a function's output against a Spark instance.

Takes the input Snowpark DataFrame of the decorated function, samples its rows, converts the sample to a Spark DataFrame, and executes spark_function on it. The output of spark_function is then compared with the output of the decorated function for the same data sample.

Parameters:
  • job_context (SnowparkJobContext) – The job context containing configuration and details for the validation.

  • spark_function (fn) – The equivalent PySpark function to compare against the Snowpark implementation.

  • checkpoint_name (str) – A name for the checkpoint.

  • sample_number (Optional[int], optional) – The number of rows for validation. Defaults to 100.

  • sampling_strategy (Optional[SamplingStrategy], optional) – The strategy used for sampling data. Defaults to SamplingStrategy.RANDOM_SAMPLE.

  • output_path (Optional[str], optional) – The path to store the validation results. Defaults to None.

Returns:

A decorator that wraps the original Snowpark function with validation logic.

Return type:

Callable[[fn], fn]
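
Example

A minimal usage sketch. The names job_context, session, and the column AMOUNT are illustrative assumptions, not part of this reference; building a SnowparkJobContext and a Snowpark session is not shown here.

```python
from snowflake.snowpark_checkpoints import check_with_spark

# `job_context` is assumed to be a SnowparkJobContext constructed elsewhere
# (it bundles the Snowpark and Spark sessions used for the comparison).

def spark_filter_positive(df):
    # PySpark equivalent of the Snowpark logic being validated.
    return df.filter(df.AMOUNT > 0)

@check_with_spark(
    job_context=job_context,            # hypothetical context object
    spark_function=spark_filter_positive,
    checkpoint_name="filter_positive_amounts",
    sample_number=50,                   # validate on a 50-row sample
)
def filter_positive(df):
    # Snowpark implementation; its output on the sample is compared
    # against spark_filter_positive's output on the same sample.
    return df.filter(df["AMOUNT"] > 0)
```

When the decorated function runs, validation executes as a side effect; the decorator returns the original function's result unchanged.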