Snowflake Feature Store¶
The Snowflake Feature Store is a native solution that data scientists and ML engineers can use to create, maintain, and use ML features in data science and ML workloads.
Features are enriched or transformed data used as inputs into a machine learning model. For example, a feature might derive the day of the week from a timestamp, allowing the model to detect weekly patterns (such as “sales are 20% lower on Wednesdays”) useful in prediction. Other common features involve aggregating or time-shifting data. Feature engineering, the process of defining the features needed by your models, is a vital part of building high-quality ML applications.
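As a minimal illustration of the day-of-week example above, the following sketch derives such a feature in plain Python. The `orders` rows, column names, and the `day_of_week_feature` helper are hypothetical and are not part of the Feature Store API.

```python
from datetime import datetime


def day_of_week_feature(ts: datetime) -> int:
    """Return the ISO day of week for a timestamp (1 = Monday, 7 = Sunday)."""
    return ts.isoweekday()


# Hypothetical raw rows; in practice these would come from a source table.
orders = [
    {"order_id": 1, "ordered_at": datetime(2024, 1, 3, 9, 30)},   # a Wednesday
    {"order_id": 2, "ordered_at": datetime(2024, 1, 6, 18, 5)},   # a Saturday
]

# Enrich each raw row with the derived feature column.
features = [
    {**row, "day_of_week": day_of_week_feature(row["ordered_at"])}
    for row in orders
]
```

A model trained on `day_of_week` alongside sales figures can then pick up weekly patterns that the raw timestamp alone does not expose.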
A Feature Store lets you standardize ML features in a single managed and governed repository. Having commonly-used features defined centrally in a Feature Store can help reduce redundancy and duplication of data and effort, improving the productivity of data science teams. By improving consistency in how features are extracted from raw data, a Feature Store can also help improve the robustness of production ML pipelines.
The Snowflake Feature Store is designed to make creating, storing, and managing features for data science and machine learning workloads easier and more efficient. Key benefits of Snowflake’s Feature Store are:
Easy authoring of common feature transformations in Python or SQL
Support for batch and streaming data
Continuous, automated, incremental feature updates on new data
Support for backfill and point-in-time correct features with ASOF JOIN
Fine-grained role-based access control and governance
Python API for retrieving features and creating training datasets
Support for feature pipelines authored and maintained in external tools such as dbt
Integration with Model Registry and other Snowflake ML features
Feature Store UI for easy feature search and discovery
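To make "point-in-time correct" concrete, the following sketch shows the kind of lookup an as-of join performs: for each event, take the most recent feature value recorded at or before the event time, so that no future data leaks into training. This is a pure-Python illustration under the assumption of "at or before" matching, not Snowflake's implementation; the `as_of_lookup` helper and the sample data are hypothetical.

```python
from bisect import bisect_right


def as_of_lookup(feature_history, event_time):
    """Return the latest feature value recorded at or before event_time.

    feature_history is a list of (timestamp, value) pairs sorted by
    timestamp in ascending order. Returns None if no value had been
    recorded yet at event_time.
    """
    timestamps = [ts for ts, _ in feature_history]
    # Index of the first timestamp strictly greater than event_time.
    idx = bisect_right(timestamps, event_time)
    return feature_history[idx - 1][1] if idx else None


# Hypothetical feature history: (timestamp, value), integer timestamps for brevity.
history = [(1, 10.0), (5, 12.5), (9, 20.0)]

value_at_6 = as_of_lookup(history, 6)    # uses the value recorded at t=5
value_at_0 = as_of_lookup(history, 0)    # nothing recorded yet
```

The value recorded at `t=9` is never returned for an event at `t=6`, which is the property that keeps training data free of information from the future.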
Feature Store Workflow¶
The following diagram shows the high-level workflow of the Snowflake Feature Store.
For details on the Python API, see Snowflake Feature Store API Reference.
Installation¶
The Snowflake Feature Store is part of the Snowpark ML Python package, snowflake-ml-python. For installation instructions, see Using Snowpark ML Locally.
Key Concepts¶
Within Snowflake, feature stores are schemas. You may create as many feature stores as you need and organize them in the databases you choose. See Creating or Connecting to a Feature Store.
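As a rough sketch of what creating or connecting to a feature store might look like with the Python API: the function name and all arguments are placeholders, and `session` is assumed to be an existing Snowpark session. The import sits inside the function only so that the snippet can be loaded without the package installed. See the API reference for the authoritative signature.

```python
def connect_feature_store(session, database, schema_name, warehouse):
    """Create the feature store schema if it is missing, then connect to it.

    All arguments are supplied by the caller; none of the names here
    come from the documentation itself.
    """
    from snowflake.ml.feature_store import CreationMode, FeatureStore

    return FeatureStore(
        session=session,
        database=database,
        name=schema_name,            # the feature store is a schema
        default_warehouse=warehouse,
        creation_mode=CreationMode.CREATE_IF_NOT_EXIST,
    )
```

A call such as `connect_feature_store(session, "MY_DB", "MY_FEATURE_STORE", "MY_WH")` would then return a handle bound to the schema `MY_DB.MY_FEATURE_STORE`.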
A feature store contains feature views. A feature view encapsulates a pipeline for transforming raw data into one or more related features that are refreshed from the data source at the same time. Inside Snowflake, a feature view is a dynamic table or a view. See Creating and Using Feature Views.
Tip
Users who have access to multiple feature stores can combine feature views from more than one feature store to create training and inference datasets.
A feature view can be materialized based on a specific table. Features in the materialized feature view are updated incrementally and efficiently as the source table receives new data. A materialized feature view is a Snowflake dynamic table. (This is different from a materialized view.)
Feature views are organized in the feature store according to the entity to which they apply. An entity is a higher-level abstraction that represents what the features are about. For example, in a movie streaming service, the main entities might be users and movies. Raw movie data and user activity data can be converted into useful features such as per-movie viewing time and user session length. See Creating and Registering Entities.
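The relationship between entities and feature views can be sketched as follows, using the movie streaming example. This assumes `fs` is a connected `FeatureStore` and `feature_df` is a Snowpark DataFrame that already computes per-user features keyed by a `USER_ID` column. The entity name, join key, feature view name, refresh frequency, and version are all hypothetical.

```python
def register_user_session_features(fs, feature_df):
    """Register a USER entity and a feature view attached to it.

    fs is assumed to be a connected FeatureStore; feature_df is assumed
    to be a Snowpark DataFrame containing a USER_ID column plus one or
    more feature columns (for example, session length).
    """
    from snowflake.ml.feature_store import Entity, FeatureView

    # The entity names what the features are about and how to join on it.
    user = Entity(name="USER", join_keys=["USER_ID"], desc="Streaming service user")
    fs.register_entity(user)

    # Setting refresh_freq asks for a managed, periodically refreshed view.
    feature_view = FeatureView(
        name="USER_SESSION_FEATURES",
        entities=[user],
        feature_df=feature_df,
        refresh_freq="1 day",
        desc="Per-user session features",
    )
    return fs.register_feature_view(feature_view=feature_view, version="1")
```

Feature views that share the `USER` entity can later be joined on `USER_ID` when assembling training or inference datasets.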
Examples¶
The Snowflake quickstarts include several Feature Store examples:
Intro to Feature Store. Start your journey here for an introduction to Feature Store concepts.
Develop and Manage ML Models with Feature Store and Model Registry. This is an end-to-end ML development cycle demo with Feature Store and Model Registry.
Getting Started with Snowflake Feature Store API. This is an overview of Feature Store Python APIs.
Advanced Guide to Snowflake Feature Store. This is a more advanced example of Feature Store and pipelines.
You can find additional demo notebooks in Snowflake Labs.
For common feature and query patterns, see Common feature and query patterns.
Note
These are only shown as examples, and following along with the example may require additional rights to third-party data, products, or services that are not owned or provided by Snowflake. Snowflake does not guarantee the accuracy of these examples.
Feature Store Back End and Data Model¶
Feature store objects map directly to Snowflake objects. All feature store objects are therefore subject to Snowflake access control rules.
Feature store object | Snowflake object |
---|---|
feature store | schema |
feature view | dynamic table (internal features) or view (external features) |
entity | tag |
feature | column in a dynamic table (internal features) or view (external features) |
Properties of feature views (such as name and entity) are implemented as tags on dynamic tables or views.
You can query or manipulate the Snowflake objects directly using SQL. Changes you make via SQL are reflected in the Python API.
All objects of a Snowflake Feature Store are stored in the feature store’s schema. You can easily delete an entire feature store by dropping the schema (but make sure the schema doesn’t contain other resources).
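Because feature views are ordinary dynamic tables or views, they can be inspected with plain SQL. The sketch below lists both kinds of object in a feature store's schema. It assumes `session` is a Snowpark session (anything with a `sql(...).collect()` interface works); the helper name and the database and schema arguments are hypothetical.

```python
def list_feature_view_objects(session, database, schema):
    """List the dynamic tables and views in a feature store schema.

    Returns a (dynamic_tables, views) tuple holding whatever rows the
    two SHOW commands return.
    """
    qualified = f"{database}.{schema}"
    dynamic_tables = session.sql(
        f"SHOW DYNAMIC TABLES IN SCHEMA {qualified}"
    ).collect()
    views = session.sql(f"SHOW VIEWS IN SCHEMA {qualified}").collect()
    return dynamic_tables, views
```

Running this before dropping a schema is one way to check what the schema contains, although it does not list other object types such as regular tables or stages.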