snowflake.snowpark_checkpoints.validate_dataframe_checkpoint

snowflake.snowpark_checkpoints.validate_dataframe_checkpoint(df: DataFrame, checkpoint_name: str, job_context: SnowparkJobContext | None = None, mode: CheckpointMode | None = CheckpointMode.SCHEMA, custom_checks: dict[Any, Any] | None = None, skip_checks: dict[Any, Any] | None = None, sample_frac: float | None = 1.0, sample_number: int | None = None, sampling_strategy: SamplingStrategy | None = 1, output_path: str | None = None) → tuple[bool, DataFrame] | None

Validate a Snowpark DataFrame against a specified checkpoint.

Parameters:
  • df (SnowparkDataFrame) – The DataFrame to validate.

  • checkpoint_name (str) – The name of the checkpoint to validate against.

  • job_context (SnowparkJobContext, optional) – The job context for the validation. Required for PARQUET mode.

  • mode (CheckpointMode, optional) – The mode of validation (e.g., SCHEMA, PARQUET). Defaults to SCHEMA.

  • custom_checks (Optional[dict[Any, Any]], optional) – Custom checks to apply during validation.

  • skip_checks (Optional[dict[Any, Any]], optional) – Checks to skip during validation.

  • sample_frac (Optional[float], optional) – Fraction of the DataFrame to sample for validation. Defaults to 1.0.

  • sample_number (Optional[int], optional) – Number of rows to sample for validation.

  • sampling_strategy (Optional[SamplingStrategy], optional) – Strategy to use for sampling. Defaults to RANDOM_SAMPLE.

  • output_path (Optional[str], optional) – The output path for the validation results.
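
The parameters above can be passed as keyword arguments. The following is a minimal sketch, not a definitive usage pattern: `check_orders` is a hypothetical wrapper, and the checkpoint name `"orders_checkpoint"` and the output path are placeholder values. The `validate` argument is an assumption added only so the sketch can be exercised without a Snowflake session; when it is omitted, the documented function is imported and called.

```python
from typing import Any, Callable, Optional


def check_orders(
    df: Any,
    job_context: Any = None,
    validate: Optional[Callable[..., Any]] = None,
) -> Any:
    """Hypothetical wrapper: validate `df` against a placeholder checkpoint."""
    if validate is None:
        # Imported lazily so this sketch loads even where the package is absent.
        from snowflake.snowpark_checkpoints import validate_dataframe_checkpoint

        validate = validate_dataframe_checkpoint
    return validate(
        df,
        "orders_checkpoint",  # placeholder checkpoint name
        job_context=job_context,
        sample_frac=1.0,  # same value as the default shown in the signature
        output_path="./checkpoint_results",  # placeholder output path
    )
```

With a real Snowpark DataFrame, `check_orders(df, job_context=ctx)` would forward to `validate_dataframe_checkpoint`; `mode` is left at its default of SCHEMA here.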

Returns:

A tuple containing a boolean that indicates whether validation succeeded and a pandas DataFrame with the validation results, or None if validation is not applicable.

Return type:

Union[tuple[bool, PandasDataFrame], None]

Raises:

ValueError – If an invalid validation mode is provided or if job_context is None for PARQUET mode.
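
A caller has three documented outcomes to handle: a `(bool, DataFrame)` tuple, `None`, and a `ValueError`. The sketch below shows one way to tell them apart. `summarize` and the `Outcome` alias are hypothetical names, and the validator is passed in as an argument (an assumption made so the helper can be tried with a stand-in function).

```python
from typing import Any, Callable, Optional

# Shape of the documented return value: (success flag, results) or None.
Outcome = Optional[tuple[bool, Any]]


def summarize(
    validate: Callable[..., Outcome],
    df: Any,
    checkpoint_name: str,
    **options: Any,
) -> str:
    """Hypothetical helper: run a validator and describe the outcome as text."""
    try:
        result = validate(df, checkpoint_name, **options)
    except ValueError as exc:
        # Documented cases: invalid mode, or job_context missing for PARQUET mode.
        return f"configuration error: {exc}"
    if result is None:
        # Documented as "validation is not applicable".
        return "not applicable"
    passed, _results = result
    return "passed" if passed else "failed"
```

In real use, `validate` would be `validate_dataframe_checkpoint` imported from `snowflake.snowpark_checkpoints`, and the second tuple element would be the pandas DataFrame of validation results.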