Snowflake Feature Store

The Snowflake Feature Store is a native solution that data scientists and ML engineers can use to create, maintain, and use ML features in data science and ML workloads.

Features are enriched or transformed data used as inputs into a machine learning model. For example, a feature might derive the day of the week from a timestamp, allowing the model to detect weekly patterns (such as “sales are 20% lower on Wednesdays”) useful in prediction. Other common features involve aggregating or time-shifting data. Feature engineering, the process of defining the features needed by your models, is a vital part of building high-quality ML applications.
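As an illustration of the day-of-week example above, the following minimal sketch derives that feature from raw timestamps using only the Python standard library. The function name, the sample rows, and the column names are hypothetical and are not part of the Snowflake Feature Store API.

```python
from datetime import datetime

WEEKDAYS = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def day_of_week_feature(timestamp: str) -> str:
    """Return the weekday name for an ISO 8601 timestamp string."""
    return WEEKDAYS[datetime.fromisoformat(timestamp).weekday()]


# Hypothetical raw rows: (order timestamp, sale amount).
raw_sales = [
    ("2024-05-06T09:15:00", 120.0),
    ("2024-05-08T14:30:00", 95.0),
    ("2024-05-11T18:45:00", 210.0),
]

# Enrich each raw row with the derived feature.
features = [
    {"timestamp": ts, "amount": amount, "day_of_week": day_of_week_feature(ts)}
    for ts, amount in raw_sales
]
```

In a feature store, the same transformation would typically be written once in SQL or Python and shared, rather than reimplemented in each project.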

A Feature Store lets you standardize ML features in a single managed and governed repository. Having commonly used features defined centrally in a Feature Store can help reduce redundancy and duplication of data and effort, improving the productivity of data science teams. By improving consistency in how features are extracted from raw data, a Feature Store can also help improve the robustness of production ML pipelines.

The Snowflake Feature Store is designed to make creating, storing, and managing features for data science and machine learning workloads easier and more efficient. Key benefits of Snowflake’s Feature Store are:

  • Easy authoring of common feature transformations in Python or SQL

  • Support for batch and streaming data

  • Continuous, automated, incremental feature updates on new data

  • Support for backfill and point-in-time correct features with ASOF JOIN

  • Fine-grained role-based access control and governance

  • Python API for retrieving features and creating training datasets

  • Support for feature pipelines authored and maintained in external tools such as dbt

  • Integration with Model Registry and other Snowflake ML features

  • Feature Store UI for easy feature search and discovery
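To make the idea of point-in-time correct features from the list above concrete, the sketch below mimics the lookup in plain Python: for each label event, it picks the most recent feature value recorded at or before the event time, so no future information leaks into the training data. This illustrates the concept only and is not how ASOF JOIN is implemented; the function name and sample data are hypothetical.

```python
from bisect import bisect_right


def asof_lookup(feature_history, event_time):
    """Return the latest feature value whose timestamp is <= event_time.

    feature_history is a list of (timestamp, value) pairs sorted by timestamp.
    Returns None when no feature value existed yet at event_time.
    """
    timestamps = [ts for ts, _ in feature_history]
    idx = bisect_right(timestamps, event_time)
    return feature_history[idx - 1][1] if idx > 0 else None


# Hypothetical feature history for one user: (day number, sessions in last 7 days).
history = [(1, 3), (5, 7), (9, 4)]

# Hypothetical label events: the day on which each training label was observed.
label_days = [0, 5, 8, 12]

# Pair each label event with the feature value that was known at that time.
training_rows = [(day, asof_lookup(history, day)) for day in label_days]
```

The event on day 8 receives the value recorded on day 5, not the later value from day 9, which is the property that makes the resulting training set point-in-time correct.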

Feature Store Workflow

The following diagram shows the high-level workflow of the Snowflake Feature Store.

Figure: Overall architecture of Snowflake Feature Store

For details on the Python API, see Snowflake Feature Store API Reference.

Installation

The Snowflake Feature Store is part of the Snowpark ML Python package, snowflake-ml-python. For installation instructions, see Using Snowpark ML Locally.

Key Concepts

Within Snowflake, feature stores are schemas. You may create as many feature stores as you need and organize them in the databases you choose. See Creating or Connecting to a Feature Store.

A feature store contains feature views. A feature view encapsulates a pipeline for transforming raw data into one or more related features that are refreshed from the data source at the same time. Inside Snowflake, a feature view is a dynamic table or a view. See Creating and Using Feature Views.

Tip

Users who have access to multiple feature stores can combine feature views from more than one feature store to create training and inference datasets.

A feature view can be materialized based on a specific table. Features in the materialized feature view are updated incrementally and efficiently as the source table receives new data. A materialized feature view is a Snowflake dynamic table. (This is different from a materialized view.)

Feature views are organized in the feature store according to the entity to which they apply. An entity is a higher-level abstraction that represents what the features are about. For example, in a movie streaming service, the main entities might be users and movies. Raw movie data and user activity data can be converted into useful features such as per-movie viewing time and user session length. See Creating and Registering Entities.
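The concepts above (feature store, entity, feature view) can be sketched with the Python API as follows. This is a minimal sketch, not a definitive implementation: it assumes an existing Snowpark session and that the snowflake-ml-python package is installed, and every database, warehouse, table, column, entity, and feature view name is a hypothetical placeholder. Check the Snowflake Feature Store API Reference for the exact parameters supported by your installed version.

```python
def build_example_feature_store(session):
    """Create a feature store, register an entity, and register a feature view.

    `session` is an existing snowflake.snowpark.Session. All object names
    below are hypothetical placeholders.
    """
    # Imported lazily so this sketch can be loaded without the package installed.
    from snowflake.ml.feature_store import (
        CreationMode,
        Entity,
        FeatureStore,
        FeatureView,
    )

    # A feature store is a schema inside a database of your choice.
    fs = FeatureStore(
        session=session,
        database="MY_DB",
        name="MY_FEATURE_STORE",
        default_warehouse="MY_WH",
        creation_mode=CreationMode.CREATE_IF_NOT_EXIST,
    )

    # An entity names what the features are about and the keys used to join to it.
    user = Entity(
        name="STREAMING_USER",
        join_keys=["USER_ID"],
        desc="Streaming service user",
    )
    fs.register_entity(user)

    # The feature view wraps a query that turns raw data into features.
    feature_df = session.sql(
        "SELECT USER_ID, AVG(SESSION_SECONDS) AS AVG_SESSION_SECONDS "
        "FROM MY_DB.RAW.USER_SESSIONS GROUP BY USER_ID"
    )
    fv = FeatureView(
        name="USER_SESSION_FEATURES",
        entities=[user],
        feature_df=feature_df,
        refresh_freq="1 hour",  # with a refresh frequency, backed by a dynamic table
        desc="Per-user session length features",
    )
    return fs.register_feature_view(feature_view=fv, version="1")
```

Calling build_example_feature_store(session) against a real account would create the schema, the entity tag, and the dynamic table described in this section.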

Examples

Examples that use the Feature Store are available in the Snowflake quickstarts.

Additional demo notebooks are available in Snowflake Labs.

For common feature and query patterns, see Common feature and query patterns.

Note

These are only shown as examples, and following along with the example may require additional rights to third-party data, products, or services that are not owned or provided by Snowflake. Snowflake does not guarantee the accuracy of these examples.

Feature Store Back End and Data Model

Feature store objects map directly to Snowflake objects. All feature store objects are therefore subject to Snowflake access control rules.

Feature store object    Snowflake object
feature store           schema
feature view            dynamic table (internal features) or view (external features)
entity                  tag
feature                 column in a dynamic table (internal features) or view (external features)

Properties of feature views (such as name and entity) are implemented as tags on dynamic tables or views.

You can query or manipulate the Snowflake objects directly using SQL. Changes you make via SQL are reflected in the Python API.

All objects of a Snowflake Feature Store are stored in the feature store’s schema. You can easily delete an entire feature store by dropping the schema (but make sure the schema doesn’t contain other resources).
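Because feature store objects are ordinary Snowflake objects, they can be inspected with standard SQL. The sketch below only builds the statement strings; the helper names and the database and schema names are hypothetical, and each statement could be run from a worksheet or through a Snowpark session with session.sql(statement).collect().

```python
def inspection_statements(database: str, feature_store: str) -> list:
    """Build SQL statements that list the objects backing a feature store."""
    schema = f"{database}.{feature_store}"
    return [
        f"SHOW DYNAMIC TABLES IN SCHEMA {schema}",  # internal feature views
        f"SHOW VIEWS IN SCHEMA {schema}",           # external feature views
        f"SHOW TAGS IN SCHEMA {schema}",            # entities and feature view properties
    ]


def drop_statement(database: str, feature_store: str) -> str:
    """Build the SQL statement that deletes an entire feature store.

    Run it only after confirming the schema holds no other resources.
    """
    return f"DROP SCHEMA {database}.{feature_store}"


statements = inspection_statements("MY_DB", "MY_FEATURE_STORE")
```

Reviewing the output of the inspection statements before running the drop statement is one way to confirm that the schema contains nothing besides feature store objects.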