Snowflake Feature Store

Feature engineering, in which raw data is transformed into features that can be used to train machine learning models, is a vital part of building high-quality machine learning applications. A feature store lets you easily create, find, and employ features that work with your data.

The Snowflake Feature Store is designed to make creating, storing, and managing features for data science and machine learning workloads easier and more efficient. It provides:

  • A Python SDK for defining, registering, retrieving, and managing features.

  • Back-end infrastructure built on Snowflake dynamic tables, tables, views, and tags for automating feature pipelines and governance.

Feature Store Workflow

The following diagram shows the high-level workflow of the Snowflake Feature Store.

[Figure: Overall architecture of Snowflake Feature Store]

Note

This release of the Snowflake Feature Store includes the following features:

  • Python SDK for feature definition and management

  • Support for externally created or user-maintained feature tables

  • Helper functions for defining time window features and AsOf JOINs

  • Continuous, incremental feature update pipelines using dynamic tables

  • API for retrieval of batch features and training datasets with point-in-time lookup

  • Linking features to model metadata with the Snowflake Model Registry

Additionally, these related features are available to selected accounts:

  • A Snowsight user interface for the Feature Store

  • APIs for tracking end-to-end lineage of ML artifacts (source data, features, datasets, and models)

Additional capabilities for feature quality monitoring and low-latency online feature serving are on the roadmap.

For details on the Python API, see Snowflake Feature Store API Reference.
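To make the idea of point-in-time lookup concrete, the following is a minimal, library-free sketch. It is not the Feature Store implementation. It assumes that each spine row (an entity key plus a timestamp) should receive the most recent feature value recorded at or before that timestamp, which is the usual way to avoid leaking future data into a training set. The function name and the sample rows are hypothetical.

```python
from bisect import bisect_right


def point_in_time_lookup(feature_rows, spine_rows):
    """Attach to each (key, ts) spine row the latest feature value with feature ts <= ts.

    feature_rows: iterable of (key, ts, value) tuples
    spine_rows:   iterable of (key, ts) tuples
    """
    # Build a per-key history with timestamps in ascending order.
    history = {}
    for key, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        times, values = history.setdefault(key, ([], []))
        times.append(ts)
        values.append(value)

    result = []
    for key, ts in spine_rows:
        times, values = history.get(key, ([], []))
        # Count of feature rows recorded at or before the spine timestamp.
        idx = bisect_right(times, ts)
        result.append((key, ts, values[idx - 1] if idx else None))
    return result


# Hypothetical feature history: (user, timestamp, average session length).
features = [
    ("u1", 1, 10.0),
    ("u1", 5, 12.5),
    ("u2", 3, 7.0),
]
# Hypothetical spine: the (user, timestamp) pairs we want features for.
spine = [("u1", 4), ("u1", 5), ("u2", 2)]

print(point_in_time_lookup(features, spine))
# [('u1', 4, 10.0), ('u1', 5, 12.5), ('u2', 2, None)]
```

The last spine row gets no value because the only feature row for u2 was recorded after the requested timestamp.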

Installation

The Snowflake Feature Store is part of the Snowpark ML Python package, snowflake-ml-python. For installation instructions, see Installing Snowpark ML.

Key Concepts

Within Snowflake, feature stores are schemas. You may create as many feature stores as you need and organize them in the databases you choose. See Creating or Connecting to a Feature Store.

A feature store contains feature views. A feature view encapsulates a pipeline for transforming raw data into one or more related features that are refreshed from the data source at the same time. Inside Snowflake, a feature view is a dynamic table or a view. See Creating and Using Feature Views.

Tip

Users who have access to multiple feature stores can combine feature views from more than one feature store to create training and inference datasets.

A feature view can be materialized based on a specific table. Features in the materialized feature view are updated incrementally and efficiently as the source table receives new data. A materialized feature view is a Snowflake dynamic table. (This is different from a materialized view.)

Feature views are organized in the feature store according to the entity to which they apply. An entity is a higher-level abstraction that represents what the features are about. For example, in a movie streaming service, the main entities might be users and movies. Raw movie data and user activity data can be converted into useful features such as per-movie viewing time and user session length. See Creating and Registering Entities.
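The concepts above (feature store, entity, feature view) might fit together roughly as in the sketch below. It assumes the snowflake-ml-python package and an existing Snowpark session. The function name, the table USER_ACTIVITY, its columns, and the entity and feature view names are hypothetical. Treat the call signatures as an approximation and check them against the Snowflake Feature Store API Reference for your installed version.

```python
def register_session_length_features(session, database, schema, warehouse):
    """Sketch: create a feature store, register an entity, and register a feature view."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from snowflake.ml.feature_store import (
        CreationMode,
        Entity,
        FeatureStore,
        FeatureView,
    )
    from snowflake.snowpark import functions as F

    # A feature store is a schema; create it if it does not exist yet.
    fs = FeatureStore(
        session=session,
        database=database,
        name=schema,
        default_warehouse=warehouse,
        creation_mode=CreationMode.CREATE_IF_NOT_EXIST,
    )

    # The entity names what the features are about and which key joins to them.
    user = Entity(
        name="STREAMING_USER",
        join_keys=["USER_ID"],
        desc="A user of the movie streaming service",
    )
    fs.register_entity(user)

    # Hypothetical source table and columns: one row per user session.
    feature_df = (
        session.table("USER_ACTIVITY")
        .group_by("USER_ID")
        .agg(F.avg("SESSION_SECONDS").alias("AVG_SESSION_SECONDS"))
    )

    # Supplying refresh_freq asks for a refreshed (dynamic table) feature view;
    # this behavior is an assumption to verify in the API reference.
    fv = FeatureView(
        name="USER_SESSION_FEATURES",
        entities=[user],
        feature_df=feature_df,
        refresh_freq="1 hour",
        desc="Per-user session length features",
    )
    return fs.register_feature_view(feature_view=fv, version="1")
```

Calling the function requires a live Snowflake connection, so no output is shown here.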

Examples

You can find example notebooks for getting started in the open source Snowflake-Labs repositories on GitHub. Specifically, there are four Jupyter notebooks:

  • Feature Store Quickstart

  • Feature Store API Overview

  • End-to-end ML with Feature Store and Model Registry

  • Manage features in DBT with Feature Store

For a more advanced example of Feature Store concepts and end-to-end feature and ML pipelines, see this quickstart.

For common feature and query patterns, see Common feature and query patterns.

Note

These quickstarts are shown only as examples, and following along with them may require additional rights to third-party data, products, or services that are not owned or provided by Snowflake. Snowflake does not guarantee the accuracy of these examples.

Feature Store Back End and Data Model

Feature store objects map directly to Snowflake objects. All feature store objects are therefore subject to Snowflake access control rules.

Feature store object    Snowflake object
feature store           schema
feature view            dynamic table (internal features) or view (external features)
entity                  tag
feature                 column in a dynamic table (internal features) or view (external features)

Properties of feature views (such as name and entity) are implemented as tags on dynamic tables or views.

You can query or manipulate the Snowflake objects directly using SQL. Changes you make via SQL are reflected in the Python API.

All objects of a Snowflake Feature Store are stored in the feature store’s schema. You can easily delete an entire feature store by dropping the schema (but make sure the schema doesn’t contain other resources).
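Because a feature store is an ordinary schema, one cautious way to act on the advice above is to list what the schema contains before dropping it. The sketch below only builds the SQL text. The helper names are hypothetical, and it assumes simple unquoted identifiers for the database and schema.

```python
def feature_store_inspection_sql(database, schema):
    """Return SHOW statements that list the objects stored in a feature store schema."""
    # Assumes plain, unquoted identifiers; quoting rules are not handled here.
    fq_schema = f"{database}.{schema}"
    return [
        f"SHOW DYNAMIC TABLES IN SCHEMA {fq_schema}",
        f"SHOW VIEWS IN SCHEMA {fq_schema}",
        f"SHOW TABLES IN SCHEMA {fq_schema}",
        f"SHOW TAGS IN SCHEMA {fq_schema}",
    ]


def drop_feature_store_sql(database, schema):
    """Return the statement that drops the schema and everything stored in it."""
    return f"DROP SCHEMA IF EXISTS {database}.{schema}"


# Hypothetical database and feature store names.
for statement in feature_store_inspection_sql("ML_DB", "MY_FEATURE_STORE"):
    print(statement)
print(drop_feature_store_sql("ML_DB", "MY_FEATURE_STORE"))
```

With a Snowpark session, each statement could be run with session.sql(statement).collect(). Review the SHOW results for anything unrelated to the feature store before running the DROP statement.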