Evaluate models in an experiment

With Snowflake ML Experiments, you can set up experiments: organized evaluations of the results of model training. Experiments let you quickly compare the results of hyperparameter adjustments, different target metrics, and the behavior of different model types so that you can select the best model for your needs. Each experiment consists of a series of runs, each of which holds the metadata and artifacts from a training attempt. Snowflake is unopinionated about your run artifacts; you can submit anything that’s useful for your model evaluation process.

After you complete an experiment, the results are visible through Snowsight. You can also retrieve run artifacts at any time in Python or SQL.

Note

Snowflake Experiments require snowflake-ml-python version 1.19.0 or later.

Access control requirements

Creating an experiment requires the CREATE EXPERIMENT privilege on the schema where run artifacts are stored, as well as the USAGE privilege on the parent database and schema.

Create an experiment

First, create an experiment. This requires an existing database and schema, which are used to store run information.

Experiment support is available in the snowflake.ml.experiment.ExperimentTracking class. Use the set_experiment(name: Optional[str]) method to open an experiment with the given name and set it as the active experiment context for logs and artifacts. If an experiment with that name doesn’t exist yet, it is created.

The following example shows how to create or open an experiment named My_Experiment in the active database and schema and set it as the active experiment, using an existing session:

from snowflake.ml.experiment import ExperimentTracking

exp = ExperimentTracking(session=session)
exp.set_experiment("My_Experiment")
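
The example above assumes an existing Snowpark session. If you don’t already have one, the following is a minimal sketch for creating a session; the connection parameters are placeholders to replace with your own account details:

from snowflake.snowpark import Session

# Placeholder connection parameters; substitute your own values.
session = Session.builder.configs({
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "role": "<role>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}).create()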

Start an experiment run

Each run in an experiment has its own set of metrics, parameters, and artifacts. This information is used in Snowsight to provide visualizations and data about your model training and its results.

Start a run with the start_run(name: Optional[str]) method on your ExperimentTracking instance. This returns a new Run, which supports use in a with statement. Snowflake recommends that you use with statements, so that runs are cleanly completed and it’s easier to reason about run scope.

with exp.start_run("my_run"):
    # Train your model and log parameters, metrics, and artifacts here
    ...

Automatically log training information

You can autolog training information for XGBoost, LightGBM, or Keras models during model training. Autologging works by registering a callback that refers to your experiment and carries information about the model you’re training. Each time a training method adjusts a parameter or produces a metric, the value is automatically logged to the active run in your experiment.

The following example shows how to configure the experiment callback for an XGBoost model and then start a basic training run that logs training information automatically.

# exp: ExperimentTracking

from xgboost import XGBClassifier

from snowflake.ml.experiment.callback.xgboost import SnowflakeXgboostCallback
from snowflake.ml.model.model_signature import infer_signature

sig = infer_signature(X, y)
callback = SnowflakeXgboostCallback(
    exp, model_name="name", model_signature=sig
)
model = XGBClassifier(callbacks=[callback])
with exp.start_run("my_run"):
    model.fit(X, y, eval_set=[(X, y)])
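
The example assumes that X and y already hold your training features and labels. To try autologging end to end without your own dataset, a minimal sketch that creates toy data with scikit-learn (assuming scikit-learn is installed in your environment):

from sklearn.datasets import make_classification

# Synthetic classification data purely for illustration; use your own data in practice.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)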

Manually log training information and artifacts

For models that don’t support automatic logging, or for pre-trained models, you can manually log experiment information and upload artifacts in Python. Parameters are constant inputs to model training, while metrics are values evaluated at a particular step; for example, you can represent each training epoch as a step. The following example shows how to log parameters, log metrics, log a model, and upload artifacts.

Note

The default step value is 0.

# Logging requires an active run for the exp: ExperimentTracking instance.

# Log model parameters with the log_param(...) or log_params(...) methods
exp.log_param("learning_rate", 0.01)
exp.log_params({"optimizer": "adam", "batch_size": 64})

# Log model metrics with the log_metric(...) or log_metrics(...) methods
exp.log_metric("loss", 0.3, step=100)
exp.log_metrics({"loss": 0.4, "accuracy": 0.8}, step=200)

# Log your model to the experiment's model registry with the log_model(...) method,
# providing either explicit signatures or sample input data.
exp.log_model(model, model_name="my_model", signatures={"predict": model_signature})
exp.log_model(model, model_name="my_model", sample_input_data=data)

# Log local artifacts to an experiment run with the log_artifact(...) method.
exp.log_artifact('/tmp/file.txt', artifact_path='artifacts')
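
As a sketch of how these calls fit together, the following hypothetical training loop logs parameters once per run and metrics once per epoch, using the epoch number as the step. The train_one_epoch and evaluate functions are placeholders for your own training and evaluation code:

# Minimal sketch; train_one_epoch and evaluate are hypothetical placeholders.
with exp.start_run("manual_run"):
    exp.log_params({"learning_rate": 0.01, "epochs": 3})
    for epoch in range(3):
        train_one_epoch(model)                   # your training code
        loss, accuracy = evaluate(model)         # your evaluation code
        exp.log_metrics({"loss": loss, "accuracy": accuracy}, step=epoch)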

Note

Artifact logging isn’t supported on Warehouse Notebooks.

Complete a run

Completing a run makes it immutable and presents it as finished in Snowsight.

If you started a run as part of a with statement, the run is automatically completed when exiting scope. Otherwise, you can end a run by calling your experiment’s end_run(name: Optional[str]) method with the name of the run to complete:

exp.end_run("my_run")
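
If you manage a run without a with statement, a minimal sketch of the full lifecycle is to start the run, log to it, and then complete it by name:

exp.start_run("my_run")
exp.log_metric("loss", 0.25)
exp.end_run("my_run")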

Compare runs within an experiment

Experiment evaluation is done through Snowsight. In the navigation menu, select AI & ML » Experiments, then select the experiment that you want to examine from the list.

The experiment view contains information on Run count over time, a display for your currently selected metric, and a list of your available runs in the experiment. The upper-right dropdown provides a list of the available metrics to inspect, which populates the Metric Value Range chart in the experiment view.

The list of runs on this page includes the Run name, Status, when the run was Created, and an additional column for each metric available in the experiment. Information on parameters and model versions is available from the run comparison view.

You can select up to five runs in your experiment and then select the Compare button to open the comparison view, which displays run metadata, parameters, metrics, and model version information. Each metric is displayed as a chart by default; use the toggle in the Metrics section to switch between Charts and Tables.

Retrieve artifacts from a run

At any time during or after a run, you can retrieve its artifacts. The following example shows how to list a run’s available artifacts under the logs path and download the logs/log0.txt artifact for the run my_run in the experiment my_experiment to the local directory /tmp:

# exp: ExperimentTracking
exp.set_experiment("my_experiment")

exp.list_artifacts("my_run", artifact_path="logs")
exp.download_artifacts("my_run", artifact_path="logs/log0.txt", target_path="/tmp")
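
The exact shape of the entries returned by list_artifacts may vary between snowflake-ml-python versions; a short sketch of inspecting the result before downloading:

# Print each artifact entry returned for the run's logs path.
for artifact in exp.list_artifacts("my_run", artifact_path="logs"):
    print(artifact)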

Delete runs and experiments

After finishing an experiment, you can remove it and all of its associated run artifacts. The following example removes the experiment my_experiment:

# exp: ExperimentTracking
exp.delete_experiment("my_experiment")

You can also remove an individual run from an experiment. The following example removes the run my_run from the experiment my_experiment:

# exp: ExperimentTracking
exp.set_experiment("my_experiment")
exp.delete_run("my_run")

Limitations

Snowflake Experiments are subject to the following limitations:

  • Each schema is limited to 500 experiments.

  • Each experiment is limited to 500 runs.

  • Runs are stored for a maximum of one year.

  • Individual run parameters and metrics are limited to 200 KB total.

Cost considerations

Using Snowflake Experiments incurs standard Snowflake consumption-based costs. These include: