Snowpark ML Ops: Migrating from the Model Registry Preview API

Snowflake previously made a model registry available in private preview to select customers. The registry feature described in this topic differs significantly from that preview version in both functionality and APIs. Most notably, the core registry functionality is now hosted natively inside Snowflake using a new schema-level model object.

Note

The public preview version does not yet support deploying models to Snowpark Container Services (SPCS). If you rely on this functionality, continue to use the private preview registry for now.

The following comparison summarizes the key differences between the two registry implementations. The API of the privately available preview version is designated the “Preview API,” while the current publicly released API is called the “Public API.”

Preview API

Metadata is stored in tables. Models are stored in stages. The registry API is a Python library that creates and maintains these objects for the models stored in the registry.

  • Users must have the privileges to create schemas, tables, and stages in order to create a registry.

  • The registry can be made inconsistent by updating model metadata outside of the Python API.

  • Models must be explicitly deployed in order to be used.

  • Individual models can’t have role-based access control (though model deployments, which are user-defined functions, can).

  • Porting the registry API requires implementing the entire registry functionality in the new language.

Public API

Models are native schema-level objects, like tables and stages. The Python registry API is a class that facilitates interaction with model objects in Python, using SQL under the hood.

  • Models are stored in an existing schema. Schemas do not need any special preparation for use as a registry, and users need only one privilege to create a model in a schema they do not own.

  • No metadata is stored outside the model object, so the registry can’t become inconsistent.

  • Models contain methods that can be called from SQL or Python, and don’t need to be explicitly deployed.

  • Usage privileges can be granted independently on specific models.

  • Porting the registry API to another language is straightforward since the Python library is a thin layer on top of SQL.

The following sections describe the differences between the two APIs in more detail.

Importing and Accessing the Registry API

Both registry APIs are in the main Snowpark ML package, snowflake.ml.

Preview API

from snowflake.ml.registry import model_registry

Use model_registry.ModelRegistry to access registry functionality.

Public API

from snowflake.ml.registry import Registry

Use the Registry class to access registry functionality.

Creating a Registry

The Preview API requires that the registry be created by the Python library.

Preview API

model_registry.create_model_registry(...)

Required before using the registry for the first time.

Public API

Not applicable. Any existing schema can be used as a registry.

Opening a Registry

You open a registry to add new models to it and to work with the models already in it.

Preview API

reg = model_registry.ModelRegistry(
          session=session,
          database_name="MODEL_REGISTRY")

Public API

reg = Registry(
          session=session,
          database_name="ML",
          schema_name="REGISTRY")

Logging a Model

Adding a model to the registry is called logging. Both APIs use a registry method called log_model for this purpose. This method has two minor differences in the Public API:

  • The parameter that specifies the model version, previously called model_version, is now called version_name to better reflect its semantics.

  • You can’t set tags when logging a model. Instead, add tags after logging using the model’s set_tag method.
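
As a rough illustration of these two differences, the following sketch converts Preview-style log_model keyword arguments to their Public API form. The helper name to_public_log_model_kwargs is hypothetical and not part of either API. It assumes that tags were passed in a tags keyword argument, and it passes every other argument through unchanged.

```python
from typing import Any, Dict, Tuple


def to_public_log_model_kwargs(
    preview_kwargs: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Split Preview-style log_model keyword arguments into Public-style
    keyword arguments and a separate dict of tags to apply afterwards.

    Hypothetical helper: it only handles the two differences listed above.
    """
    kwargs = dict(preview_kwargs)  # copy so the caller's dict is untouched

    # The version parameter was renamed from model_version to version_name.
    if "model_version" in kwargs:
        kwargs["version_name"] = kwargs.pop("model_version")

    # Tags can no longer be passed to log_model; hand them back separately.
    raw_tags = kwargs.pop("tags", None) or {}
    tags = {str(name): str(value) for name, value in raw_tags.items()}
    return kwargs, tags
```

You could then pass the returned keyword arguments to the registry's log_model method and apply each returned tag with the model’s set_tag method.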

Getting a Reference to a Model

Getting a reference to a model allows you to update its metadata and perform other operations on it.

Preview API

Getting a model from the registry always returns a specific version of the model, so it is necessary to specify the version you want when you retrieve the model.

model = model_registry.ModelReference(
            registry=reg,
            model_name="my_model",
            model_version="101")

Public API

Model versions are separate from the model itself. To get a reference to the model:

m = reg.get_model("my_model")

To get a reference to a specific version, first get a reference to the model as above, then retrieve the desired version. Note that the model object has an attribute, default, that contains the version of the model you have designated as the default. This is the actual ModelVersion object, not a string.

mv = m.version('v1')  # a specific version, by name
mv = m.default        # the version designated as the default
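
If your Preview API code always fetched a model together with a version, a small wrapper can keep that calling pattern while using the Public API underneath. This is only a sketch: get_model_version is a hypothetical helper name, and reg is assumed to be an open Registry as shown earlier.

```python
from typing import Any, Optional


def get_model_version(
    reg: Any, model_name: str, version_name: Optional[str] = None
) -> Any:
    """Return a model version from an open Public API registry.

    reg is expected to be a Registry instance. It is typed as Any so that
    this sketch can be imported without a Snowflake connection.
    """
    m = reg.get_model(model_name)    # reference to the model itself
    if version_name is None:
        return m.default             # the version designated as the default
    return m.version(version_name)   # a specific, named version
```

Calling get_model_version(reg, "my_model", "v1") returns the named version, while omitting the version name falls back to the model’s default version.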

Deploying a Model

Preview API

Models must be explicitly deployed to a warehouse (as a user-defined function) or to Snowpark Container Services (as a service). This example illustrates deploying to a warehouse.

model.deploy(
    deployment_name="my_warehouse_predict",
    target_method="predict",
    permanent=True)

Public API

It is not necessary to explicitly deploy a model.

Using a Model for Inference

Inference refers to using the model to make predictions on new data, such as the test data in the examples below.

Preview API

Specify the deployment name you used when you deployed the model in order to run inference.

result_dataframe = model.predict(
    "my_warehouse_predict", test_dataframe)

Public API

Models are run in a warehouse. You can call methods of the model from Python or from SQL.

Python

You call methods using the run method of a model version.

remote_prediction = mv.run(
    test_features, function_name="predict")

SQL

You can call methods of the default version using a simple SELECT query, or specify a version using a WITH clause.

-- Use default version
SELECT my_model!predict() FROM test_table;

-- Use a specific version
WITH my_model_v1 AS MODEL my_model VERSION "v1"
     SELECT my_model_v1!predict() FROM test_table;

Accessing and Updating Descriptions

Preview API

The model reference provides getter and setter methods for the description. (In this API, a model reference is always to a specific version of the model.)

print(model.get_model_description())

model.set_model_description("A better description")

Public API

Both models and model versions provide access to a description through the comment and description attributes, which are equivalent.

print(m.comment)
m.comment = "A better description"

print(m.description)
m.description = "A better description"

print(mv.comment)
mv.comment = "A better description"

print(mv.description)
mv.description = "A better description"

Accessing and Updating Tags

Preview API

Tags are set and accessed at the model version level (model references always refer to a specific version).

Get all tags

print(model.get_tags())

Add tag or set new tag value

model.set_tag("minor_rev", "1")

Remove a tag

model.remove_tag("minor_rev")

Public API

Tags are set at the model level (a model comprises a collection of versions) and are implemented using SQL tags. See Object Tagging to learn how to create tags and define their permissible values.

Get all tags

print(m.show_tags())

Add tag or set new tag value

m.set_tag("minor_rev", "1")

Remove a tag

m.unset_tag("minor_rev")
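
Because Preview API tags belong to a model version and Public API tags belong to the model, migrating tags means re-applying them at the model level. The following is a minimal sketch, assuming you have collected the old tags into a dictionary and that each tag name already exists as a SQL tag. The helper name copy_tags_to_model is hypothetical.

```python
from typing import Any, Mapping


def copy_tags_to_model(tags: Mapping[str, Any], m: Any) -> int:
    """Apply each name/value pair in tags to a Public API model reference.

    Returns the number of tags set. Values are converted to strings,
    since SQL tag values are strings.
    """
    count = 0
    for name, value in tags.items():
        m.set_tag(name, str(value))  # same call as in the example above
        count += 1
    return count
```

If several Preview API versions of the same model carried different values for one tag, you need to decide which value the model-level tag should keep before calling a helper like this.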

Accessing and Updating Metrics

In both APIs, metrics are set at the model version level.

Preview API

Set scalar metric

model.set_metric("test_accuracy", test_accuracy)

Set hierarchical (dictionary) metric

model.set_metric("dataset_test", {"accuracy": test_accuracy})

Set multivalent (matrix) metric

model.set_metric("confusion_matrix", test_confusion_matrix)

Get all metrics

print(model.get_metrics())

Remove a metric

model.remove_metric("test_accuracy")

Public API

Set scalar metric

mv.set_metric("test_accuracy", test_accuracy)

Set hierarchical (dictionary) metric

mv.set_metric("dataset_test", {"accuracy": test_accuracy})

Set multivalent (matrix) metric

mv.set_metric("confusion_matrix", test_confusion_matrix)

Get all metrics

print(mv.show_metrics())

Remove a metric

mv.delete_metric("test_accuracy")

Deleting a Model

Preview API

You can only delete specific versions of a model. To delete the model completely, delete all its versions.

reg.delete_model(
    model_name="my_model",
    model_version="100")

Public API

Deleting a model deletes all its versions.

reg.delete_model("my_model")

You may also delete a specific version of a model through the model’s delete_version method.

m.delete_version("v1")

Listing Versions of a Model

Preview API

The list_models method returns a DataFrame of all model versions. You can filter this to show only the versions of a specific model.

model_list = reg.list_models()
model_list.filter(model_list["NAME"] == "my_model").show()

Public API

Given a model reference, you can get the versions of the model as either a list of ModelVersion instances or as a DataFrame containing information about the model’s versions.

Get a list of ModelVersion instances

version_list = m.versions()

Get informational DataFrame

version_df = m.show_versions()
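
When you only need the version names, for example to check whether a version already exists before logging, you can derive them from the list of ModelVersion instances. This sketch assumes each ModelVersion exposes its name through a version_name attribute. The helper name version_names is hypothetical.

```python
from typing import Any, List


def version_names(m: Any) -> List[str]:
    """Return the names of all versions of a Public API model reference."""
    # Assumption: each ModelVersion has a version_name attribute.
    return [mv.version_name for mv in m.versions()]
```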

Managing Model Lifecycle

The registry is designed for you to manage a model’s lifecycle using tags. For example, you might create a tag called stage to record the model’s current status, using values such as “experimental”, “alpha”, “beta”, “production”, “deprecated”, and “obsolete”.

In the Public API, tags are implemented using SQL tag objects. See Object Tagging to learn how to create tags and define their permissible values.

The Public API also has the concept of a model’s default version, which is the version used when none is specified, particularly in SQL. When you train a new version of a model and it is ready for widespread use, you can make it the default by setting the model’s default attribute.

m.default = "2"

You can then get the default version of the model, as a ModelVersion object, as follows.

mv = m.default

Or you can call the default version’s predict method directly.

m.default.run(test_features, function_name="predict")
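
The two lifecycle mechanisms described here, tags and the default version, can be combined into a single promotion step. The following is a sketch only: promote_version is a hypothetical helper, and it assumes a SQL tag named stage (as in the example above) already exists and permits the value being set.

```python
from typing import Any


def promote_version(
    m: Any,
    version_name: str,
    stage_tag: str = "stage",
    stage_value: str = "production",
) -> Any:
    """Make version_name the model's default and record its lifecycle stage.

    m is expected to be a Public API model reference, as returned by
    reg.get_model.
    """
    m.default = version_name           # version used when none is specified
    m.set_tag(stage_tag, stage_value)  # tags are set at the model level
    return m.default                   # read back the new default
```

Because tags apply to the model as a whole rather than to one version, the stage tag here describes the model’s overall status, not the status of an individual version.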