Snowpark ML release notes¶
This article contains the release notes for Snowpark ML, including the following when applicable:
Behavior changes
New features
Customer-facing bug fixes
Note
These notes do not include changes in features that have not been publicly announced. Such features may appear in the Snowpark ML source code but not in the public documentation.
Version 1.6.1 (2024-08-13)¶
Bug fixes¶
Feature Store bug fixes:
Metadata size is no longer limited when generating a dataset.
Model Registry bug fixes:
Fix an error message in the run method of model versions when a function name is not given and the model has multiple target methods.
New features¶
New Modeling features:
The set_params method is now available to set the parameters of the underlying scikit-learn estimator, if the Snowpark ML model has been fitted.
New Model Registry features:
Support for model explainability in XGBoost, LightGBM, CatBoost, and scikit-learn models supported by the shap library.
Version 1.6.0 (2024-07-29)¶
Behavior changes¶
Feature Store behavior changes:
Many positional arguments are now keyword arguments. The following table lists the affected arguments for each method.
Method | Arguments
Entity initializer | desc
FeatureView initializer | timestamp_col, refresh_freq, desc
FeatureStore initializer | creation_mode
FeatureStore.update_entity | desc
FeatureStore.register_feature_view | block, overwrite
FeatureStore.list_feature_views | entity_name, feature_view_name
FeatureStore.get_refresh_history | verbose
FeatureStore.retrieve_feature_values | spine_timestamp_col, exclude_columns, include_feature_view_timestamp_col
FeatureStore.generate_training_set | save_as, spine_timestamp_col, spine_label_cols, exclude_columns, include_feature_view_timestamp_col
FeatureStore.generate_dataset | version, spine_timestamp_col, spine_label_cols, exclude_columns, include_feature_view_timestamp_col, desc, output_type
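As a rough illustration of what this change means for calling code, the sketch below uses a hypothetical stand-in function (register_feature_view_stub, which is not part of Snowpark ML). Its signature assumes the keyword-only shape that the table implies: block and overwrite sit after a bare *, so they can no longer be passed by position.

```python
# Hypothetical stand-in, NOT the Snowpark ML API. It only mimics the
# assumed "keyword-only after *" shape of the affected arguments.
def register_feature_view_stub(feature_view, version, *, block=True, overwrite=False):
    return {
        "feature_view": feature_view,
        "version": version,
        "block": block,
        "overwrite": overwrite,
    }

# Passing the affected arguments by name works.
result = register_feature_view_stub("my_fv", "v1", block=False, overwrite=True)

# Passing them by position raises TypeError.
try:
    register_feature_view_stub("my_fv", "v1", False, True)
    positional_ok = True
except TypeError:
    positional_ok = False

print(result["block"], positional_ok)
```

Code written against earlier versions that passed any of the listed arguments positionally would need to name them explicitly.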
Add new column warehouse to the output of list_feature_views.
Bug fixes¶
Modeling bug fixes:
Fixed an issue in which SimpleImputer could not impute integer columns with integer values.
Model Registry bug fixes:
Fixed an issue when providing a pandas DataFrame with a non-zero-based index to ModelVersion.run.
New features¶
New Feature Store features:
Added overloads to certain methods to accept both a FeatureView and name/version strings. Affected APIs include read_feature_view, refresh_feature_view, get_refresh_history, resume_feature_view, suspend_feature_view, and delete_feature_view.
Added docstring inline examples for all public APIs.
Added ExampleHelper utility class to help with loading source data to simplify public notebooks.
Added update_entity method.
Added warehouse argument to FeatureView constructor to override the default warehouse.
New Model Registry features:
Added option to enable explainability when registering XGBoost, LightGBM, and CatBoost models.
Added support for logging a model from a ModelVersion object.
New modeling features:
You can disable the 10GB training data size limit in distributed hyperparameter optimization by executing:
from snowflake.ml.modeling._internal.snowpark_implementations import (
    distributed_hpo_trainer,
)
distributed_hpo_trainer.ENABLE_EFFICIENT_MEMORY_USAGE = False
Version 1.5.4 (2024-07-11)¶
Bug fixes¶
Model Registry bug fixes:
Fixed “401 Unauthorized” issue when deploying a model to Snowpark Container Services.
Feature Store bug fixes:
Some exceptions in property setters have been downgraded to warnings, allowing you to change desc, refresh_freq, and warehouse in “draft” feature views.
Modeling bug fixes:
Fixed issues with calling OneHotEncoder and OrdinalEncoder with a dictionary as the categories parameter and the data in a pandas DataFrame.
New features¶
New Model Registry features:
Allow overriding device_map and device when loading Hugging Face pipeline models.
Add set_alias and unset_alias methods to ModelVersion instances to manage the model version’s aliases.
Add partitioned_inference_api decorator to create partitioned inference methods in models.
New Feature Store features:
New refresh_freq, refresh_mode, and scheduling_state columns have been added to the output of the list_feature_views method.
The update_feature_view method now supports updating a feature view’s description.
New methods refresh_feature_view and get_refresh_history manage updates of feature views.
New method generate_training_set generates table-backed feature snapshots.
generate_dataset(..., output_type="table") has been deprecated and generates a DeprecationWarning.
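Python hides DeprecationWarning in many contexts by default, so a deprecated call can go unnoticed. The sketch below shows one way to surface such warnings with the standard warnings module. The generate_dataset_stub function and its warning message are hypothetical stand-ins that only mimic the deprecation described above; they are not the Feature Store implementation.

```python
import warnings


def generate_dataset_stub(name, output_type="dataset"):
    """Hypothetical stand-in that mimics the deprecation described above."""
    if output_type == "table":
        warnings.warn(
            'output_type="table" is deprecated; consider generate_training_set.',
            DeprecationWarning,
            stacklevel=2,
        )
    return f"{name}:{output_type}"


# Record warnings instead of printing them, and make sure none are filtered out.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    generate_dataset_stub("my_dataset", output_type="table")

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
for w in deprecations:
    print(f"{w.category.__name__}: {w.message}")
```

In a test suite, warnings.simplefilter("error", DeprecationWarning) turns the same warnings into exceptions, which can help locate remaining call sites.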
New Modeling features:
OneHotEncoder and OrdinalEncoder now accept a list of array-like values for the categories argument.
Version 1.5.3 (2024-06-17)¶
Bug fixes¶
Model Registry bug fixes:
Fix an issue causing incorrect results when using a pandas DataFrame with over 100,000 rows as the input of the ModelVersion.run method in stored procedures.
Modeling bug fixes:
Fix an issue with passing categories to OneHotEncoder and OrdinalEncoder as a dictionary or as a pandas DataFrame.
New features¶
New Model Registry features:
Model Registry now supports timestamp (TIMESTAMP_NTZ) columns in input and output data.
New modeling features:
OneHotEncoder and OrdinalEncoder now support a list of array-like values for the categories argument.
New Dataset features:
DatasetVersion instances now have label_cols and exclude_cols properties.
Version 1.5.2 (2024-06-10)¶
Bug fixes¶
Model Registry bug fixes:
Fixed an issue that prevented calls to log_model in a stored procedure.
Modeling bug fixes:
Quick fix for import snowflake.ml.modeling.parameters.enable_anonymous_sproc not working due to a package dependency error.
Version 1.5.1 (2024-05-22)¶
New features¶
New Model Registry features:
log_model, get_model, and delete_model methods now support fully-qualified names.
New modeling features:
You can now use an anonymous stored procedure during fitting, so that modeling does not require privileges to operate on the registry schema. Run import snowflake.ml.modeling.parameters.enable_anonymous_sproc to enable this feature.
Bug fixes¶
Model registry bug fixes:
Fix issue with loading older models.
Version 1.5.0 (2024-05-01)¶
Behavior changes¶
Model Registry behavior changes:
The fit_transform method can now return either a Snowpark DataFrame or a pandas DataFrame, matching the kind of DataFrame passed to the method.
New features¶
New Model Registry features:
Added support for exporting models from the registry (ModelVersion.export).
Added support for loading the underlying model object (ModelVersion.load).
Added support for renaming models (Model.rename).
Bug fixes¶
Model Registry bug fixes:
Fixed the “invalid parameter SHOW_MODEL_DETAILS_IN_SHOW_VERSIONS_IN_MODEL” error.
Version 1.4.1 (2024-04-18)¶
New features¶
New Model Registry features:
Added support for catboost models (catboost.CatBoostClassifier, catboost.CatBoostRegressor).
Added support for lightgbm models (lightgbm.Booster, lightgbm.LGBMClassifier, lightgbm.LGBMRegressor).
Bug fixes¶
Model Registry bug fixes:
Fixed a bug that caused the relax_version option to not work.
Version 1.4.0 (2024-04-08)¶
Behavior changes¶
Model Registry behavior changes:
The apply method is no longer included as a target method by default when logging an XGBoost model. If you need this method available in logged models, include it manually in the target_methods option: log_model(..., options={"target_methods": ["apply", ...]})
New features¶
New model registry features:
The registry now supports logging sentence transformer models (sentence_transformers.SentenceTransformer).
The version_name argument is no longer required when logging a model. A random human-readable ID is generated if none is provided.
Bug fixes¶
Model registry bug fixes:
Fix issue where, when multiple models are called in the same query, models after the first returned incorrect results. This fix is applied when models are logged and does not benefit existing models; you must log your models again to correct this behavior.
Modeling bug fixes:
Fix bug in registering a model where only methods mentioned in save_model were added to the model signature for Snowpark ML models.
Fix bug in batch inference methods such as predict and predict_log_proba where, when n_jobs was not 1, the methods would not be executed.
Fix bug in batch inference methods where they could not infer datatypes when the first row of data contained NULL.
The output column names from distributed hyperparameter optimization are now correctly matched with the Snowflake identifier.
Relaxed the versions of dependencies of distributed hyperparameter optimization methods; these were too strict and caused these methods to fail.
scikit-learn is now listed as a dependency of the LightGBM package.
Version 1.3.1 (2024-03-21)¶
New features¶
FileSet/FileSystem updates:
snowflake.ml.fileset.sfcfs.SFFileSystem can now be used in UDFs and stored procedures.
Version 1.3.0 (2024-03-12)¶
Behavior changes¶
Model registry behavior changes:
As previously announced, the default for the relax_version option (in the options argument of log_model) is now True, allowing more reliable deployment in most cases by permitting dependency versions available in Snowflake.
When running model methods, value range based input validation (which prevents input from overflowing) is now optional. This should improve performance and should not lead to issues for most types of models. To enable validation, pass the named argument strict_input_validation=True when calling the model’s run method.
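To give a sense of what value range based validation involves, here is a minimal sketch under stated assumptions. The helpers out_of_range and validate_column are hypothetical and are not the Snowpark ML implementation; the sketch assumes validation means checking every value against the bounds of a declared signed integer width, which is why skipping it saves a full scan of the input.

```python
# Inclusive bounds for a few signed integer widths.
INT_BOUNDS = {
    bits: (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1) for bits in (8, 16, 32, 64)
}


def out_of_range(values, bits):
    """Return the values that do not fit in a signed integer of the given width."""
    low, high = INT_BOUNDS[bits]
    return [v for v in values if not low <= v <= high]


def validate_column(values, bits, strict=False):
    """Hypothetical helper: only scans the values when strict is True."""
    if not strict:
        return values  # fast path: no per-value check
    bad = out_of_range(values, bits)
    if bad:
        raise ValueError(f"{len(bad)} value(s) overflow int{bits}: {bad}")
    return values


column = [12, 127, 300]
validate_column(column, bits=8)  # passes because nothing is checked

try:
    validate_column(column, bits=8, strict=True)
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```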
Model development behavior changes:
The fit_predict method now returns either a pandas or a Snowpark DataFrame, depending on the type of the input data, and is available on all classes where it is available in the underlying scikit-learn, xgboost, or lightgbm class.
New features and updates¶
FileSet/FileSystem updates:
Instances of snowflake.ml.fileset.sfcfs.SFFileSystem can now be serialized with pickle.
Bug fixes¶
Model registry bug fixes:
Fix a problem with importing log_model in some circumstances.
Fix an incorrect error message when validating input Snowpark DataFrame with an array feature.
Model development bug fixes:
Relax package versions for all inference methods when the installed version of a dependency is not available in the Snowflake conda channel.
Version 1.2.3 (2024-02-26)¶
New features and updates¶
Model development updates:
All modeling classes now include a score_samples method to calculate the log-likelihood of the given samples.
Model registry updates:
Decimal type features are automatically cast (with a warning) to a DOUBLE or FLOAT instead of producing an error.
Improve error message for currently-unsupported pip-requirements option.
You can now delete a version of a model.
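The decimal cast mentioned in the first item above can lose precision, which is presumably why it comes with a warning. A minimal sketch of the same pattern in plain Python follows; cast_decimals is a hypothetical helper, and the warning category and message are assumptions rather than what the registry actually emits.

```python
import warnings
from decimal import Decimal


def cast_decimals(column):
    """Hypothetical helper: convert Decimal values to float, warning once."""
    if any(isinstance(v, Decimal) for v in column):
        warnings.warn(
            "Decimal values were cast to float; precision may be lost.",
            UserWarning,
            stacklevel=2,
        )
    return [float(v) if isinstance(v, Decimal) else v for v in column]


# Capture the warning so it can be inspected alongside the converted values.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    prices = cast_decimals([Decimal("19.99"), Decimal("0.10"), None])

print(prices, len(caught))
```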
Bug fixes¶
Model development fixes:
precision_recall_fscore_support returned incorrect results with average="samples".
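For readers unfamiliar with the option, average="samples" applies to multilabel data: precision, recall, and F-score are computed for each sample and then averaged over samples. The pure-Python sketch below illustrates that definition for F1. It is not the Snowpark ML or scikit-learn implementation; the helper name is hypothetical, and treating a zero denominator as 0.0 is an assumption.

```python
def samples_average_prf(y_true, y_pred):
    """Per-sample precision, recall and F1, averaged over samples.

    y_true and y_pred are lists of label sets, one set per sample.
    A ratio with a zero denominator is counted as 0.0 (an assumption).
    """

    def ratio(num, den):
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for true_labels, pred_labels in zip(y_true, y_pred):
        hits = len(true_labels & pred_labels)
        p = ratio(hits, len(pred_labels))
        r = ratio(hits, len(true_labels))
        precisions.append(p)
        recalls.append(r)
        f1s.append(ratio(2 * p * r, p + r))

    n = len(y_true)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


y_true = [{0, 1}, {2}, {0}]
y_pred = [{0}, {1, 2}, {0}]
precision, recall, f1 = samples_average_prf(y_true, y_pred)
print(round(precision, 4), round(recall, 4), round(f1, 4))
```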
Model registry fixes:
Descriptions, models, and tags were not retrieved correctly in newly-created registries under the private preview model registry API due to a recent Snowflake behavior change.
Version 1.2.2 (2024-02-13)¶
New features and updates¶
Model registry updates:
You can now specify external access integrations when deploying a model to Snowpark Container Services using the private preview registry API, allowing models to access the internet to retrieve dependencies during deployment. The following endpoints are required for all deployments:
docker.com:80
docker.com:443
anaconda.com:80
anaconda.com:443
anaconda.org:80
anaconda.org:443
pypi.org:80
pypi.org:443
For models derived from HuggingFacePipeLineModel, the following endpoints are required:
huggingface.com:80
huggingface.com:443
huggingface.co:80
huggingface.co:443
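The endpoint lists above follow a regular pattern: each host on ports 80 and 443. The sketch below generates them and formats a Snowflake CREATE NETWORK RULE statement as a string. The rule name ml_deploy_egress_rule is hypothetical, and these notes do not cover how the rule is attached to an external access integration or passed to the registry API, so treat this only as a starting point.

```python
# Hosts taken from the lists above; every host is needed on ports 80 and 443.
BASE_HOSTS = ["docker.com", "anaconda.com", "anaconda.org", "pypi.org"]
HUGGING_FACE_HOSTS = ["huggingface.com", "huggingface.co"]
PORTS = [80, 443]


def required_endpoints(include_hugging_face=False):
    """Return host:port strings for the deployment endpoints listed above."""
    hosts = BASE_HOSTS + (HUGGING_FACE_HOSTS if include_hugging_face else [])
    return [f"{host}:{port}" for host in hosts for port in PORTS]


def network_rule_sql(rule_name, endpoints):
    """Format an egress network rule; rule_name is chosen by the caller."""
    value_list = ", ".join(f"'{endpoint}'" for endpoint in endpoints)
    return (
        f"CREATE OR REPLACE NETWORK RULE {rule_name}\n"
        "  MODE = EGRESS\n"
        "  TYPE = HOST_PORT\n"
        f"  VALUE_LIST = ({value_list});"
    )


sql = network_rule_sql(
    "ml_deploy_egress_rule",  # hypothetical rule name
    required_endpoints(include_hugging_face=True),
)
print(sql)
```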
Version 1.2.1 (2024-01-25)¶
New features and updates¶
Model development updates:
Infer column data type for transformers when possible.
Model registry updates:
relax_version option (in options argument of log_model) relaxes dependencies of stated versions to allow newer minor versions when set to True.
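One way to picture this relaxation is as a rewrite of exact version pins into ranges. The sketch below assumes the rule is "replace ==x.y.z with >=x.y,<(x+1)"; that exact rule, the regular expression, and the relax_pin helper are illustrative assumptions, not the library's code.

```python
import re

# Matches exact pins such as "scikit-learn==1.3.0" or "numpy==1.24".
_PIN = re.compile(
    r"^(?P<name>[A-Za-z0-9_.\-]+)==(?P<major>\d+)\.(?P<minor>\d+)(?:\.\d+)*$"
)


def relax_pin(requirement: str) -> str:
    """Hypothetical helper: turn name==x.y.z into name>=x.y,<(x+1)."""
    match = _PIN.match(requirement.strip())
    if match is None:
        return requirement  # not an exact pin, leave it unchanged
    major = int(match["major"])
    minor = int(match["minor"])
    return f"{match['name']}>={major}.{minor},<{major + 1}"


for req in ["scikit-learn==1.3.0", "numpy==1.24", "xgboost>=1.7"]:
    print(req, "->", relax_pin(req))
```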
Version 1.2.0 (2024-01-12)¶
New features and updates¶
Public preview release of model registry. See Snowflake Model Registry. The previous private preview release of the model registry has been deprecated, but will continue to be supported while it includes features not yet available in the public preview version.
Model development updates:
Added support for fit_predict method in AgglomerativeClustering, DBSCAN, and OPTICS classes.
Added support for fit_transform method in MDS, SpectralEmbedding and TSNE classes.