Snowflake Feature Store
Feature engineering, in which raw data is transformed into features that can be used to train machine learning models, is a vital part of building high-quality machine learning applications. A feature store lets you easily create, find, and employ features that work with your data.
The Snowflake Feature Store is designed to make creating, storing, and managing features for data science and machine learning workloads easier and more efficient. It provides:
- A Python SDK for defining, registering, retrieving, and managing features.
- Back-end infrastructure with Snowflake dynamic tables, tables, views, and tags for automating feature pipelines and governance.
Feature Store Workflow
The following diagram shows the high-level workflow of the Snowflake Feature Store.
![Overall architecture of Snowflake Feature Store](../../../_images/snowflake-feature-store.png)
Note
This release of the Snowflake Feature Store includes the following features:
- Python SDK for feature definition and management
- Support for externally created or user-maintained feature tables
- Helper functions for defining time window features and AsOf JOINs
- Continuous, incremental feature update pipelines using dynamic tables
- API for retrieval of batch features and training datasets with point-in-time lookup
- Linking features to model metadata with the Snowflake Model Registry
Additionally, these related features are available to selected accounts:
- A Snowsight user interface for the Feature Store
- APIs for tracking end-to-end lineage of ML artifacts (source data, features, datasets, and models)
Additional capabilities for feature quality monitoring and low-latency online feature serving are on the roadmap.
For details on the Python API, see Snowflake Feature Store API Reference.
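To illustrate what point-in-time lookup means in the list above: each training row is paired with the most recent feature value recorded at or before that row's timestamp, so no later information leaks into the training data. The following is a minimal pure-Python sketch of that idea only. It is not how the Feature Store implements the lookup, and the function name and sample data are hypothetical.

```python
from bisect import bisect_right
from typing import Optional


def point_in_time_lookup(
    feature_history: list[tuple[int, float]],
    label_timestamps: list[int],
) -> list[Optional[float]]:
    """For each label timestamp, return the latest feature value at or before it.

    feature_history is a list of (timestamp, value) pairs sorted by timestamp.
    """
    timestamps = [ts for ts, _ in feature_history]
    results: list[Optional[float]] = []
    for label_ts in label_timestamps:
        # bisect_right gives the number of feature timestamps <= label_ts.
        idx = bisect_right(timestamps, label_ts)
        results.append(feature_history[idx - 1][1] if idx > 0 else None)
    return results


# Hypothetical data: a user's session length recorded at times 10, 20, and 30.
history = [(10, 1.5), (20, 2.0), (30, 4.0)]
print(point_in_time_lookup(history, [5, 10, 25, 40]))  # [None, 1.5, 2.0, 4.0]
```

A label at time 5 gets no value because no feature had been recorded yet, while a label at time 25 gets the value recorded at time 20 rather than the later one at time 30.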
Installation
The Snowflake Feature Store is part of the Snowpark ML Python package, snowflake-ml-python. For installation instructions, see Installing Snowpark ML.
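After installing, one way to confirm that the package is visible to the current Python environment is to query its distribution metadata. This is a small sketch using only the standard library. The helper name `installed_version` is hypothetical and is not part of Snowpark ML.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "snowflake-ml-python") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("snowflake-ml-python is not installed in this environment")
else:
    print(f"snowflake-ml-python {version} is installed")
```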
Key Concepts
Within Snowflake, feature stores are schemas. You may create as many feature stores as you need and organize them in the databases you choose. See Creating or Connecting to a Feature Store.
A feature store contains feature views. A feature view encapsulates a pipeline for transforming raw data into one or more related features that are refreshed from the data source at the same time. Inside Snowflake, a feature view is a dynamic table or a view. See Creating and Using Feature Views.
Tip
Users who have access to multiple feature stores can combine feature views from more than one feature store to create training and inference datasets.
A feature view can be materialized based on a specific table. Features in the materialized feature view are updated incrementally and efficiently as the source table receives new data. A materialized feature view is a Snowflake dynamic table. (This is different from a materialized view.)
Feature views are organized in the feature store according to the entity to which they apply. An entity is a higher-level abstraction that represents what the features are about. For example, in a movie streaming service, the main entities might be users and movies. Raw movie data and user activity data can be converted into useful features such as per-movie viewing time and user session length. See Creating and Registering Entities.
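The concepts above (feature store, entity, feature view) can be sketched with the Python SDK roughly as follows. This is an outline under stated assumptions rather than a definitive recipe. It assumes an active Snowpark session and an existing database and warehouse, and every object name (`ML_DB`, `MOVIE_FEATURES`, `ML_WH`, `ML_DB.RAW.USER_ACTIVITY`, the column names) is a hypothetical placeholder. It also relies on the understanding that a feature view registered with a refresh frequency is backed by a dynamic table. Consult the API reference for the authoritative signatures.

```python
def build_example_feature_store(session):
    """Create a feature store, register an entity, and register a feature view.

    `session` is assumed to be an active snowflake.snowpark.Session.
    Imports are deferred so this file loads even where snowflake-ml-python is absent.
    """
    from snowflake.ml.feature_store import (
        CreationMode,
        Entity,
        FeatureStore,
        FeatureView,
    )

    # A feature store is a schema; the names here are hypothetical placeholders.
    fs = FeatureStore(
        session=session,
        database="ML_DB",
        name="MOVIE_FEATURES",
        default_warehouse="ML_WH",
        creation_mode=CreationMode.CREATE_IF_NOT_EXIST,
    )

    # An entity names what the features describe and the key used to join on it.
    user = Entity(name="USER", join_keys=["USER_ID"], desc="Streaming service user")
    fs.register_entity(user)

    # The feature view wraps a Snowpark DataFrame that produces the features.
    feature_df = session.table("ML_DB.RAW.USER_ACTIVITY").select(
        "USER_ID", "EVENT_TS", "SESSION_LENGTH"
    )
    fv = FeatureView(
        name="USER_SESSION_FEATURES",
        entities=[user],
        feature_df=feature_df,
        timestamp_col="EVENT_TS",
        refresh_freq="1 hour",  # assumption: a refresh frequency yields a dynamic table
        desc="Per-user session features",
    )
    return fs.register_feature_view(feature_view=fv, version="1")
```

Calling `build_example_feature_store(session)` with a real session would return the registered feature view. Nothing runs against Snowflake until the function is called.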
Examples
You can find example notebooks for getting started in the open-source Snowflake-Labs organization on GitHub. Specifically, you can find four Jupyter notebooks:
- Feature Store Quickstart
- Feature Store API Overview
- End-to-end ML with Feature Store and Model Registry
- Manage features in DBT with Feature Store
For a more advanced example of Feature Store concepts and end-to-end feature and ML pipelines, see this quickstart.
For common feature and query patterns, see Common feature and query patterns.
Note
These quickstarts are shown only as examples. Following along may require additional rights to third-party data, products, or services that are not owned or provided by Snowflake. Snowflake does not guarantee the accuracy of these examples.
Feature Store Back End and Data Model
Feature store objects map directly to Snowflake objects. All feature store objects are therefore subject to Snowflake access control rules.
| Feature store object | Snowflake object |
|---|---|
| feature store | schema |
| feature view | dynamic table (internal features) or view (external features) |
| entity | tag |
| feature | column in a dynamic table (internal features) or view (external features) |
Properties of feature views (such as name and entity) are implemented as tags on dynamic tables or views.
You can query or manipulate the Snowflake objects directly using SQL. Changes you make via SQL are reflected in the Python API.
All objects of a Snowflake Feature Store are stored in the feature store’s schema. You can easily delete an entire feature store by dropping the schema (but make sure the schema doesn’t contain other resources).
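Because feature views are ordinary dynamic tables or views inside the feature store's schema, they can be inspected with plain SQL. The sketch below lists them with `SHOW DYNAMIC TABLES` and `SHOW VIEWS`. The helper name is hypothetical, the `session` argument is assumed to behave like a Snowpark session (`sql(...).collect()`), and the code assumes each returned row exposes the object name under a `name` key.

```python
import re

# Accept only simple unquoted identifiers so the SQL text stays safe to build.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def list_feature_store_objects(session, database: str, schema: str) -> dict[str, list[str]]:
    """List dynamic tables and views in a feature store schema via SHOW commands.

    `session` is any object exposing sql(text).collect(), such as a Snowpark Session.
    """
    for part in (database, schema):
        if not _IDENTIFIER.match(part):
            raise ValueError(f"unsupported identifier: {part!r}")
    location = f"{database}.{schema}"
    found: dict[str, list[str]] = {}
    for kind in ("DYNAMIC TABLES", "VIEWS"):
        rows = session.sql(f"SHOW {kind} IN SCHEMA {location}").collect()
        # Assumption: the SHOW output carries the object name in a "name" column.
        found[kind] = [row["name"] for row in rows]
    return found
```

Running a listing like this before dropping a schema is one way to check what the schema contains, though it covers only dynamic tables and views, not every kind of resource a schema can hold.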