Snowflake ML release notes¶
This article contains the release notes for Snowflake ML, including the following when applicable:
Behavior changes
New features
Customer-facing bug fixes
Note
These notes do not include changes in features that have not been publicly announced. Such features may appear in the Snowflake ML source code but not in the public documentation.
See Snowflake ML: End-to-End Machine Learning for documentation.
Verifying the snowflake-ml-python package¶
All Snowflake packages are signed, allowing you to verify their origin. To verify the snowflake-ml-python package, follow the steps below:
Install cosign. This example uses the Go installation: Installing cosign with Go.
Download the file from a repository such as PyPI.
Download a .sig file for that release from the GitHub releases page.
Verify the signature using cosign. For example:
cosign verify-blob snowflake_ml_python-1.7.0.tar.gz --key snowflake-ml-python-1.7.0.pub --signature resources.linux.snowflake_ml_python-1.7.0.tar.gz.sig
Note
This example uses the library and signature for version 1.7.0 of the package. Use the filenames of the version you are verifying.
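When verifying several releases, it can help to generate the command from the version number. The following is a minimal sketch, not an official tool: the function name is hypothetical, and the filename patterns are an assumption extrapolated from the 1.7.0 example above, so check them against the actual files you downloaded.

```python
import shlex


def cosign_verify_command(version: str) -> list[str]:
    """Build the cosign argument list for one release.

    Assumption: every release follows the same naming pattern as the
    1.7.0 example in these notes.
    """
    package = f"snowflake_ml_python-{version}.tar.gz"
    return [
        "cosign", "verify-blob", package,
        "--key", f"snowflake-ml-python-{version}.pub",
        "--signature", f"resources.linux.{package}.sig",
    ]


command = cosign_verify_command("1.7.2")
print(shlex.join(command))
# To run the check, pass the list to subprocess.run(command, check=True).
```

Building the command as a list rather than a single string avoids shell quoting issues when it is passed to subprocess.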
Version 1.7.2 (2024-11-21)¶
New features¶
New model registry features:
Model registry now supports asynchronous model inference service creation with the block option in the ModelVersion.create_service method. Set this option to False to create the service asynchronously. The default is True.
Bug fixes¶
Model explainability bug fixes:
Fixed an issue where explain was enabled for scikit-learn pipelines whose task is UNKNOWN, only to fail later when invoked.
Version 1.7.1 (2024-11-05)¶
New features¶
New model registry features:
Null values are now ignored in the dataframe used for model signature inference. Only non-null values are used to infer signatures.
Null values are now allowed in dataframes used for prediction.
pandas extension data types are now supported in model signature inference.
pandas Series can be used in input and output data.
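The null-handling rule in signature inference can be pictured with a small pure-Python sketch. This is an illustration of the idea only, not the registry's implementation: the function name is hypothetical, plain lists stand in for dataframe columns, and None stands in for a null value.

```python
from typing import Any, Optional, Sequence


def infer_column_type(values: Sequence[Any]) -> Optional[type]:
    """Return the common Python type of the non-null values.

    Returns None when the column holds no non-null values at all.
    """
    non_null = [v for v in values if v is not None]  # nulls are skipped
    if not non_null:
        return None  # nothing to infer from
    types = {type(v) for v in non_null}
    if len(types) > 1:
        names = sorted(t.__name__ for t in types)
        raise ValueError(f"mixed types in column: {names}")
    return types.pop()


sample = {"age": [None, 31, 47], "city": ["Oslo", None, "Lima"]}
signature = {name: infer_column_type(col) for name, col in sample.items()}
print(signature)
```

A leading null no longer blocks inference in this sketch, because the type is taken from whichever values are present.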
New model monitoring features:
The option enable_monitoring is now available when logging a model in the registry. This option gates access to private preview features of model monitoring.
Bug fixes¶
Data bug fixes:
Missing snowflake.ml.data exports in wheel have been added.
Dataset bug fixes:
Missing snowflake.ml.dataset exports in wheel have been added.
Model registry bug fixes:
Fixed issue where tf_keras.Model was not recognized as a keras model when logging.
Version 1.7.0 (2024-10-22)¶
Behavior changes¶
General behavior changes:
Python 3.9 is now the minimum required version.
Data connector behavior changes:
to_torch_dataset and to_torch_datapipe now create a dimension of 1 for scalar data. This allows more seamless integration with the PyTorch DataLoader, which creates batches by stacking inputs. The following example illustrates the difference.
ds = connector.to_torch_dataset(shuffle=False, batch_size=3)
Input data: "col1": [10, 11, 12]
Previous result: array([10., 11., 12.]) with shape (3,)
New result: array([[10.], [11.], [12.]]) with shape (3, 1)
Input data: [[0, 100], [1, 110], [2, 200]]
Previous result: array([[ 0, 100], [ 1, 110], [ 2, 200]]) with shape (3, 2)
New result: No change
You can now specify a batch size of None in to_torch_dataset to squeeze dimensions of 1 for better interoperability with the PyTorch DataLoader. None is the new default batch size.
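The shape change can be reproduced without Snowflake or PyTorch. The sketch below uses nested Python lists as stand-ins for arrays; both helper functions are hypothetical and exist only to make the before and after shapes concrete.

```python
def shape(data):
    """Return the shape of a regular nested list as a tuple."""
    dims = []
    while isinstance(data, list):
        dims.append(len(data))
        data = data[0] if data else None
    return tuple(dims)


def add_feature_dim(column):
    """Wrap each scalar in a one-element list; leave list rows untouched."""
    return [row if isinstance(row, list) else [row] for row in column]


scalars = [10.0, 11.0, 12.0]
vectors = [[0, 100], [1, 110], [2, 200]]

print(shape(scalars), "->", shape(add_feature_dim(scalars)))
print(shape(vectors), "->", shape(add_feature_dim(vectors)))
```

Scalar columns gain a trailing dimension of 1, while data that is already multi-dimensional keeps its shape, which matches the two cases listed above.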
Model Development behavior changes:
The eps (epsilon) argument is no longer used with the log_loss metric. The argument is still accepted for backward compatibility, but its value is ignored, and the epsilon is now computed by the underlying scikit-learn implementation.
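To make the "accepted but ignored" behavior concrete, here is a minimal pure-Python sketch of a binary log loss. It is not the Snowflake ML or scikit-learn code: the function name is hypothetical, and clipping with the float64 machine epsilon is an assumption about how the underlying implementation picks its value.

```python
import math
import sys


def binary_log_loss(y_true, y_prob, eps=None):
    """Mean binary cross-entropy.

    `eps` is kept in the signature for backward compatibility, but its
    value is never read (assumption: clipping uses machine epsilon).
    """
    tiny = sys.float_info.epsilon  # machine epsilon for 64-bit floats
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, tiny), 1.0 - tiny)  # keep p away from 0 and 1
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


labels = [1, 0]
probabilities = [0.9, 0.1]
print(binary_log_loss(labels, probabilities))
print(binary_log_loss(labels, probabilities, eps=0.25))  # same result
```

Callers that still pass eps keep working, but they should not expect the value to influence the result.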
Model Registry behavior changes:
External access integrations are no longer required when creating an inference service in Snowflake 8.40 or later.
New features¶
New Model Registry features:
You can now pass keyword arguments when instantiating ModelContext to provide a variable number of context values. For example:
import json

import pandas as pd
from snowflake.ml.model import custom_model

# model1 is a previously created model object that has a predict method.
mc = custom_model.ModelContext(
    config='local_model_dir/config.json',
    m1=model1,
)

class ExamplePipelineModel(custom_model.CustomModel):
    def __init__(self, context: custom_model.ModelContext) -> None:
        super().__init__(context)
        # Read the bias value from the JSON file passed as a context value.
        with open(self.context['config']) as f:
            self.bias = json.load(f)['bias']

    @custom_model.inference_api
    def predict(self, input: pd.DataFrame) -> pd.DataFrame:
        model_output = self.context['m1'].predict(input)
        return pd.DataFrame({'output': model_output + self.bias})
Support for pandas’s CategoricalDtype for categorical columns.
The log_model method now accepts both signature and sample_input_data parameters to capture background data for explainability and data lineage.
Bug fixes¶
Data Connector bug fixes:
For multi-dimensional data, to_torch_dataset and to_torch_datapipe now return a numpy array with an appropriate data type instead of a list.
Feature Store bug fixes:
Fixed an issue where ExampleHelper used an incomplete table name.
Changed weather features aggregation time to one hour instead of one day.
Model Explainability bug fixes:
Fixed an issue with explainability for XGBoost models by using a new SHAP library version.
Version 1.6.4 (2024-10-17)¶
Bug fixes¶
Model Registry bug fixes:
Fix issue with using ModelVersion.run with Model Serving (inference on SPCS).
Version 1.6.3 (2024-10-07)¶
Behavior changes¶
Model Registry behavior changes:
This release no longer contains the preview Model Registry API. Use the public API in snowflake.ml.model_registry instead.
Bug fixes¶
Model Registry bug fixes:
Fix unexpected package name normalizations for packages that do not follow PEP-508 conventions when logging a model.
Fix “Not a valid remote URI” error when logging MLflow models.
Fix nested calls to ModelVersion.run.
Fix log_model failure when a local package version number contains parts other than the base version.
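For readers unfamiliar with the term, the base version is the release number with any pre-release, post-release, development, or local suffix removed. The following sketch extracts it with a regular expression. It is a simplified illustration based on the PEP 440 version layout, not the code used by log_model, and the function name is hypothetical.

```python
import re

# Optional "v" prefix, optional epoch ("1!"), then the dotted release number.
_BASE = re.compile(r"^\s*v?(?:(?P<epoch>\d+)!)?(?P<release>\d+(?:\.\d+)*)")


def base_version(version: str) -> str:
    """Return the epoch and release segment of a version string."""
    match = _BASE.match(version)
    if match is None:
        raise ValueError(f"not a recognizable version: {version!r}")
    epoch, release = match.group("epoch"), match.group("release")
    return f"{epoch}!{release}" if epoch else release


for candidate in ["1.7.0+local.1", "2.0.1rc1", "1.6.3.dev0"]:
    print(candidate, "->", base_version(candidate))
```

In production code, the packaging library's Version class exposes the same information through its base_version attribute and handles more edge cases than this regular expression.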
New features¶
New Model Registry features:
You can now set a task type for the model in log_model via the task parameter.
New Feature Store features:
FeatureView now supports ON_CREATE and ON_SCHEDULE initialization modes.
Version 1.6.2 (2024-09-04)¶
Bug fixes¶
Fix a bug involving invalid names passed where fully-qualified names were required. These now correctly raise an exception.
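As an illustration of this kind of check, the sketch below validates that a name has the three-part form database.schema.object and raises an exception otherwise. It is a hypothetical helper, not the library's validation logic, and its identifier rules (unquoted names made of letters, digits, underscores, and dollar signs, or double-quoted names) are a simplification.

```python
import re

# One identifier: unquoted, or double-quoted with "" as an escaped quote.
_IDENT = r'(?:[A-Za-z_][A-Za-z0-9_$]*|"(?:[^"]|"")+")'
_FQN = re.compile(rf"^({_IDENT})\.({_IDENT})\.({_IDENT})$")


def parse_fully_qualified_name(name: str) -> tuple[str, str, str]:
    """Split a fully-qualified name into (database, schema, object)."""
    match = _FQN.match(name)
    if match is None:
        raise ValueError(
            f"expected <database>.<schema>.<object>, got {name!r}"
        )
    return match.groups()


print(parse_fully_qualified_name("MY_DB.PUBLIC.MY_MODEL"))
try:
    parse_fully_qualified_name("PUBLIC.MY_MODEL")
except ValueError as error:
    print("rejected:", error)
```

Failing fast with a clear exception is generally easier to debug than letting a malformed name reach a SQL statement.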
Modeling bug fixes:
Correctly log models built using XGBoost version 2 and higher.
Model explainability bug fixes:
Workarounds and better error handling for XGBoost version 2.1.0 and higher.
Correctly handle multiclass XGBoost classification models.
New features¶
New Feature Store features:
The update_feature_view method now accepts a FeatureView object as an alternative to name and version.
Version 1.6.1 (2024-08-13)¶
Bug fixes¶
Feature Store bug fixes:
Metadata size is no longer limited when generating a dataset.
Model Registry bug fixes:
Fix an error message in the run method of model versions when a function name is not given and the model has multiple target methods.
New features¶
New Modeling features:
The set_params method is now available to set the parameters of the underlying scikit-learn estimator, if the Snowpark ML model has been fitted.
New Model Registry features:
Support for model explainability in XGBoost, LightGBM, CatBoost, and scikit-learn models supported by the shap library.
Version 1.6.0 (2024-07-29)¶
Behavior changes¶
Feature Store behavior changes:
Many positional arguments are now keyword arguments. The following table lists the affected arguments for each method.
Method | Arguments
Entity initializer | desc
FeatureView initializer | timestamp_col, refresh_freq, desc
FeatureStore initializer | creation_mode
FeatureStore.update_entity | desc
FeatureStore.register_feature_view | block, overwrite
FeatureStore.list_feature_views | entity_name, feature_view_name
FeatureStore.get_refresh_history | verbose
FeatureStore.retrieve_feature_values | spine_timestamp_col, exclude_columns, include_feature_view_timestamp_col
FeatureStore.generate_training_set | save_as, spine_timestamp_col, spine_label_cols, exclude_columns, include_feature_view_timestamp_col
FeatureStore.generate_dataset | version, spine_timestamp_col, spine_label_cols, exclude_columns, include_feature_view_timestamp_col, desc, output_type
Added a new warehouse column to the output of list_feature_views.
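In practice, this change means that calls passing these arguments by position must be updated to pass them by name. The sketch below shows the effect with a hypothetical stand-in class. It is not the real Entity class, and its constructor signature is assumed for illustration only.

```python
class Entity:
    """Hypothetical stand-in that mimics a keyword-only `desc` argument."""

    def __init__(self, name, join_keys, *, desc=""):
        # Everything after the bare * must be passed by keyword.
        self.name = name
        self.join_keys = join_keys
        self.desc = desc


# Passing desc by keyword works.
entity = Entity("CUSTOMER", ["CUSTOMER_ID"], desc="One row per customer")

# Passing desc by position raises TypeError.
positional_error = None
try:
    Entity("CUSTOMER", ["CUSTOMER_ID"], "One row per customer")
except TypeError as error:
    positional_error = error

print(entity.desc)
print(type(positional_error).__name__)
```

Code that already passes these arguments by keyword is unaffected by the change.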
Bug fixes¶
Modeling bug fixes:
Fixed an issue in which SimpleImputer could not impute integer columns with integer values.
Model Registry bug fixes:
Fixed an issue when providing a non-zero-index-based pandas DataFrame to ModelVersion.run.
New features¶
New Feature Store features:
Added overloads to certain methods to accept both a FeatureView and name/version strings. Affected APIs include read_feature_view, refresh_feature_view, get_refresh_history, resume_feature_view, suspend_feature_view, and delete_feature_view.
Added docstring inline examples for all public APIs.
Added ExampleHelper utility class to help with loading source data to simplify public notebooks.
Added update_entity method.
Added warehouse argument to FeatureView constructor to override the default warehouse.
New Model Registry features:
Added option to enable explainability when registering XGBoost, LightGBM, and CatBoost models.
Added support for logging a model from a ModelVersion object.
New modeling features:
You can disable the 10GB training data size limit in distributed hyperparameter optimization by executing:
from snowflake.ml.modeling._internal.snowpark_implementations import (
    distributed_hpo_trainer,
)
distributed_hpo_trainer.ENABLE_EFFICIENT_MEMORY_USAGE = False
Version 1.5.4 (2024-07-11)¶
Bug fixes¶
Model Registry bug fixes:
Fixed “401 Unauthorized” issue when deploying a model to Snowpark Container Services.
Feature Store bug fixes:
Some exceptions in property setters have been downgraded to warnings, allowing you to change desc, refresh_freq, and warehouse in “draft” feature views.
Modeling bug fixes:
Fixed issues with calling OneHotEncoder and OrdinalEncoder with a dictionary as the categories parameter and the data in a pandas DataFrame.
New features¶
New Model Registry features:
Allow overriding device_map and device when loading Hugging Face pipeline models.
Add set_alias and unset_alias methods to ModelVersion instances to manage the model version’s aliases.
Add partitioned_inference_api decorator to create partitioned inference methods in models.
New Feature Store features:
New refresh_freq, refresh_mode, and scheduling_state columns have been added to the output of the list_feature_views method.
The update_feature_view method now supports updating a feature view’s description.
New methods refresh_feature_view and get_refresh_history manage updates of feature views.
New method generate_training_set generates table-backed feature snapshots.
generate_dataset(..., output_type="table") has been deprecated and generates a DeprecationWarning.
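Python hides DeprecationWarning in many contexts by default, so a deprecated call can go unnoticed. The sketch below shows how to surface such warnings using only the standard warnings module. The generate_dataset function here is a hypothetical stand-in that mimics the deprecation described above, not the Feature Store method itself.

```python
import warnings


def generate_dataset(name, output_type="dataset"):
    """Hypothetical stand-in that warns when output_type is "table"."""
    if output_type == "table":
        warnings.warn(
            'output_type="table" is deprecated; '
            "use generate_training_set instead",
            DeprecationWarning,
            stacklevel=2,
        )
    return f"{name} ({output_type})"


# Record every warning raised inside the block, including deprecations.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    generate_dataset("TRAINING_DATA", output_type="table")

deprecations = [
    w for w in caught if issubclass(w.category, DeprecationWarning)
]
for warning in deprecations:
    print(warning.message)
```

Running a test suite with warnings.simplefilter("error", DeprecationWarning) is another way to find deprecated calls before they are removed.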
New Modeling features:
OneHotEncoder and OrdinalEncoder now accept a list of array-like values for the categories argument.
Version 1.5.3 (2024-06-17)¶
Bug fixes¶
Model Registry bug fixes:
Fix an issue causing incorrect results when using a pandas DataFrame with over 100,000 rows as the input of the ModelVersion.run method in stored procedures.
Modeling bug fixes:
Fix an issue with passing categories to OneHotEncoder and OrdinalEncoder as a dictionary or as a pandas DataFrame.
New features¶
New Model Registry features:
Model Registry now supports timestamp (TIMESTAMP_NTZ) columns in input and output data.
New modeling features:
OneHotEncoder and OrdinalEncoder now support a list of array-like values for the categories argument.
New Dataset features:
DatasetVersion instances now have label_cols and exclude_cols properties.
Version 1.5.2 (2024-06-10)¶
Bug fixes¶
Model Registry bug fixes:
Fixed an issue that prevented calls to log_model in a stored procedure.
Modeling bug fixes:
Quick fix for import snowflake.ml.modeling.parameters.enable_anonymous_sproc not working due to a package dependency error.
Version 1.5.1 (2024-05-22)¶
New features¶
New Model Registry features:
log_model, get_model, and delete_model methods now support fully-qualified names.
New modeling features:
You can now use an anonymous stored procedure during fitting, so that modeling does not require privileges to operate on the registry schema. Run import snowflake.ml.modeling.parameters.enable_anonymous_sproc to enable this feature.
Bug fixes¶
Model registry bug fixes:
Fix issue with loading older models.
Version 1.5.0 (2024-05-01)¶
Behavior changes¶
Model Registry behavior changes:
The fit_transform method can now return either a Snowpark DataFrame or a pandas DataFrame, matching the kind of DataFrame passed to the method.
New features¶
New Model Registry features:
Added support for exporting models from the registry (ModelVersion.export).
Added support for loading the underlying model object (ModelVersion.load).
Added support for renaming models (Model.rename).
Bug fixes¶
Model Registry bug fixes:
Fixed the “invalid parameter SHOW_MODEL_DETAILS_IN_SHOW_VERSIONS_IN_MODEL” error.
Version 1.4.1 (2024-04-18)¶
New features¶
New Model Registry features:
Added support for catboost models (catboost.CatBoostClassifier, catboost.CatBoostRegressor).
Added support for lightgbm models (lightgbm.Booster, lightgbm.LightGBMClassifier, lightgbm.LightGBMRegressor).
Bug fixes¶
Model Registry bug fixes:
Fixed bug that caused relax_version option to not work.
Version 1.4.0 (2024-04-08)¶
Behavior changes¶
Model Registry behavior changes:
The apply method is no longer included as a target method by default when logging an XGBoost model. If you need this method available in logged models, include it manually in the target_methods option:
log_model(..., options={"target_methods": ["apply", ...]})
New features¶
New model registry features:
The registry now supports logging sentence transformer models (sentence_transformers.SentenceTransformer).
The version_name argument is no longer required when logging a model. A random human-readable ID is generated if none is provided.
Bug fixes¶
Model registry bug fixes:
Fix issue where, when multiple models are called in the same query, models after the first returned incorrect results. This fix is applied when models are logged and does not benefit existing models; you must log your models again to correct this behavior.
Modeling bug fixes:
Fix bug in registering a model where only methods mentioned in save_model were added to the model signature for Snowpark ML models.
Fix bug in batch inference methods such as predict and predict_log_proba where, when n_jobs was not 1, the methods would not be executed.
Fix bug in batch inference methods where they could not infer datatypes when the first row of data contained NULL.
The output column names from distributed hyperparameter optimization are now correctly matched with the Snowflake identifier.
Relaxed the versions of dependencies of distributed hyperparameter optimization methods; these were too strict and caused these methods to fail.
scikit-learn is now listed as a dependency of the LightGBM package.
Version 1.3.1 (2024-03-21)¶
New features¶
FileSet/FileSystem updates:
snowflake.ml.fileset.sfcfs.SFFileSystem can now be used in UDFs and stored procedures.
Version 1.3.0 (2024-03-12)¶
Behavior changes¶
Model registry behavior changes:
As previously announced, the default for the relax_version option (in the options argument of log_model) is now True, allowing more reliable deployment in most cases by permitting dependency versions available in Snowflake.
When running model methods, value range based input validation (which prevents input from overflowing) is now optional. This should improve performance and should not lead to issues for most types of models. To enable validation, pass the named argument strict_input_validation=True when calling the model’s run method.
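The idea behind value range validation can be sketched in a few lines. The code below is an illustration only: the run function is a hypothetical stand-in for a model method, the integer ranges are the standard signed two's-complement bounds, and nothing here reflects how the registry implements the check.

```python
# Inclusive bounds for signed integer column types.
INT_RANGES = {
    "INT8": (-2**7, 2**7 - 1),
    "INT16": (-2**15, 2**15 - 1),
    "INT32": (-2**31, 2**31 - 1),
    "INT64": (-2**63, 2**63 - 1),
}


def validate_range(values, dtype):
    """Raise ValueError if any value does not fit in the given type."""
    low, high = INT_RANGES[dtype]
    for value in values:
        if not low <= value <= high:
            raise ValueError(f"{value} is outside the {dtype} range")


def run(values, dtype, strict_input_validation=False):
    """Hypothetical stand-in for a model's run method."""
    if strict_input_validation:
        validate_range(values, dtype)  # the extra pass costs time
    return [value * 2 for value in values]  # placeholder for inference


print(run([1, 200], "INT8"))  # no validation by default
try:
    run([1, 200], "INT8", strict_input_validation=True)
except ValueError as error:
    print("rejected:", error)
```

Skipping the extra pass over the input is where the performance gain comes from, at the cost of not catching out-of-range values up front.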
Model development behavior changes:
The fit_predict method now returns either a pandas or a Snowpark DataFrame, depending on the type of the input data, and is available on all classes where it is available in the underlying scikit-learn, xgboost, or lightgbm class.
New features and updates¶
FileSet/FileSystem updates:
Instances of snowflake.ml.fileset.sfcfs.SFFileSystem can now be serialized with pickle.
Bug fixes¶
Model registry bug fixes:
Fix a problem with importing log_model in some circumstances.
Fix an incorrect error message when validating input Snowpark DataFrame with an array feature.
Model development bug fixes:
Relax package versions for all inference methods when the installed version of a dependency is not available in the Snowflake conda channel.
Version 1.2.3 (2024-02-26)¶
New features and updates¶
Model development updates:
All modeling classes now include a score_samples method to calculate the log-likelihood of the given samples.
Model registry updates:
Decimal type features are automatically cast (with a warning) to a DOUBLE or FLOAT instead of producing an error.
Improve error message for currently-unsupported pip-requirements option.
You can now delete a version of a model.
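The Decimal handling described above, casting with a warning instead of failing, follows a common pattern that can be sketched with the standard library. The function below is a hypothetical illustration, not the registry's code. It converts decimal.Decimal values to Python floats, which are double precision.

```python
import warnings
from decimal import Decimal


def cast_decimals(column):
    """Convert Decimal values to float, warning once per column."""
    if any(isinstance(value, Decimal) for value in column):
        warnings.warn(
            "Decimal values were cast to float; precision may be lost",
            UserWarning,
            stacklevel=2,
        )
        return [
            float(value) if isinstance(value, Decimal) else value
            for value in column
        ]
    return column  # nothing to convert


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    converted = cast_decimals([Decimal("1.5"), Decimal("2.25")])

print(converted, len(caught))
```

Because the cast can lose precision for values with many significant digits, it is worth reviewing the warning rather than silencing it.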
Bug fixes¶
Model development fixes:
precision_recall_fscore_support returned incorrect results with average="samples".
Model registry fixes:
Descriptions, models, and tags were not retrieved correctly in newly-created registries under the private preview model registry API due to a recent Snowflake behavior change.
Version 1.2.2 (2024-02-13)¶
New features and updates¶
Model registry updates:
You can now specify external access integrations when deploying a model to Snowpark Container Services using the private preview registry API, allowing models to access the internet to retrieve dependencies during deployment. The following endpoints are required for all deployments:
docker.com:80
docker.com:443
anaconda.com:80
anaconda.com:443
anaconda.org:80
anaconda.org:443
pypi.org:80
pypi.org:443
For models derived from HuggingFacePipeLineModel, the following endpoints are required.
huggingface.com:80
huggingface.com:443
huggingface.co:80
huggingface.co:443
Version 1.2.1 (2024-01-25)¶
New features and updates¶
Model development updates:
Infer column data type for transformers when possible.
Model registry updates:
relax_version option (in options argument of log_model) relaxes dependencies of stated versions to allow newer minor versions when set to True.
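One way to picture version relaxation is as a rewrite of an exact pin into a range. The sketch below is an assumption about what such a rule could look like: it turns pkg==X.Y.Z into pkg>=X.Y,<X+1. The function name is hypothetical, and the exact rule that log_model applies may differ.

```python
import re

# Matches an exact pin such as "xgboost==1.7.6".
_PINNED = re.compile(
    r"\s*([A-Za-z0-9._-]+)\s*==\s*(\d+)\.(\d+)(?:\.\d+)*\s*"
)


def relax_requirement(requirement: str) -> str:
    """Rewrite 'pkg==X.Y.Z' as 'pkg>=X.Y,<X+1'; leave other specs alone."""
    match = _PINNED.fullmatch(requirement)
    if match is None:
        return requirement  # not an exact pin, nothing to relax
    name = match.group(1)
    major, minor = int(match.group(2)), int(match.group(3))
    return f"{name}>={major}.{minor},<{major + 1}"


for spec in ["xgboost==1.7.6", "scikit-learn==1.3.0", "numpy>=1.23"]:
    print(spec, "->", relax_requirement(spec))
```

A range like this lets the environment resolve to a newer minor or patch release that is actually available, while still excluding the next major version.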
Version 1.2.0 (2024-01-12)¶
New features and updates¶
Public preview release of model registry. See Snowflake Model Registry. The previous private preview release of the model registry has been deprecated, but will continue to be supported while it includes features not yet available in the public preview version.
Model development updates:
Added support for fit_predict method in AgglomerativeClustering, DBSCAN, and OPTICS classes.
Added support for fit_transform method in MDS, SpectralEmbedding, and TSNE classes.