Creating and Using Feature Views

Introduction

A feature view is a group of logically related features that are refreshed on the same schedule. The FeatureView constructor accepts a Snowpark DataFrame that contains the feature generation logic. The provided DataFrame must contain the join_keys columns specified in the entities associated with the feature view. A timestamp column name is required if your feature view includes time-series features.
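The join-key requirement can be checked before the feature view is constructed. The following is a minimal sketch, not part of the snowflake.ml API: missing_join_keys is a hypothetical helper that works on plain lists of column names, and it assumes unquoted identifiers, so names are compared without regard to case.

```python
from typing import Iterable, List


def missing_join_keys(df_columns: Iterable[str], join_keys: Iterable[str]) -> List[str]:
    """Return the join keys that do not appear among the DataFrame's columns."""
    # Assumption: identifiers are unquoted, so the comparison ignores case.
    available = {col.upper() for col in df_columns}
    return [key for key in join_keys if key.upper() not in available]


# Illustrative column names; in practice these would come from the feature DataFrame.
columns = ["USER_ID", "TS", "AMOUNT"]
print(missing_join_keys(columns, ["user_id"]))                 # []
print(missing_join_keys(columns, ["USER_ID", "MERCHANT_ID"]))  # ['MERCHANT_ID']
```

An empty result means every join key is present; any names returned would need to be added to the DataFrame before it is passed to FeatureView.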

The refresh frequency can be a time delta (minimum value 1 minute), or it can be a cron expression with time zone (e.g. * * * * * America/Los_Angeles).
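As a rough illustration of the two accepted forms, the sketch below classifies a refresh frequency value before it is passed to FeatureView. classify_refresh_freq is a hypothetical helper, not a library function. The unit names it accepts and the six-field test for cron expressions are assumptions made for this example; the library performs its own validation.

```python
import re
from typing import Optional

# Assumed unit spellings, for illustration only; the library may accept others.
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
_DELTA_RE = re.compile(r"^\s*(\d+)\s+([a-z]+?)s?\s*$", re.IGNORECASE)


def classify_refresh_freq(freq: Optional[str]) -> str:
    """Classify a refresh frequency as 'unmanaged', 'time_delta', or 'cron'."""
    if freq is None:
        # No refresh frequency: the feature store never refreshes the data.
        return "unmanaged"
    match = _DELTA_RE.match(freq)
    if match:
        value, unit = int(match.group(1)), match.group(2).lower()
        if unit not in _UNIT_SECONDS:
            raise ValueError(f"unrecognized time unit in {freq!r}")
        if value * _UNIT_SECONDS[unit] < 60:
            raise ValueError("the minimum refresh frequency is 1 minute")
        return "time_delta"
    # Assumption: a cron schedule is five cron fields followed by a time zone.
    if len(freq.split()) == 6:
        return "cron"
    raise ValueError(f"not a time delta or cron expression: {freq!r}")


print(classify_refresh_freq("5 minutes"))
print(classify_refresh_freq("* * * * * America/Los_Angeles"))
```

The helper only inspects the shape of the string. It does not verify that the cron fields or the time zone name are valid.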

from snowflake.ml.feature_store import FeatureView

managed_fv = FeatureView(
    name="MY_MANAGED_FV",
    entities=[entity],
    feature_df=my_df,               # a Snowpark DataFrame
    timestamp_col="ts",             # optional timestamp column name in the dataframe
    refresh_freq="5 minutes",       # optional refresh frequency (time delta or cron expression)
    desc="my managed feature view"  # optional description string
)

The example above assumes that the features of interest have already been defined in the my_df DataFrame. You can write custom feature logic using Snowpark Python or SQL. The Snowpark Python API provides utility functions for defining common feature types such as windowed aggregations. Examples of these are shown in Common feature and query patterns.

If you have ready-to-use features generated outside of the feature store, you can still register them by omitting the refresh frequency. The feature DataFrame can contain a simple projection from the existing feature table, or additional transformations that are executed during feature consumption. Registering features this way does not incur additional storage cost, and the other feature store capabilities remain available.

With such features, refresh, immutability, consistency, and correctness are not managed by the feature store. The ready-to-use features must be maintained in some other fashion.

external_fv = FeatureView(
    name="MY_EXTERNAL_FV",
    entities=[entity],
    feature_df=my_external_df,
    timestamp_col="ts",
    refresh_freq=None,      # None = Feature Store will never refresh the feature data
    desc="my external feature view"
)

To enrich metadata at the feature level, you can add per-feature descriptions to the FeatureView. This makes it easier to find features using Snowsight Universal Search.

external_fv = external_fv.attach_feature_desc(
    {
        "SENDERID":"Sender account-id for the Transaction",
        "RECIEVERID":"Receiver account-id for the Transaction",
        "IBAN":"International Bank Identifier for the Receiver Bank",
        "AMOUNT":"Amount for the Transaction"
    }
)

At this point, the feature view has been completely defined and can be registered in the feature store.

Registering Feature Views

You register feature views using the register_feature_view method, with a customized name and version. Incremental maintenance (for supported query types) and automatic refresh will occur based on the specified refresh frequency.

When the provided query cannot be maintained via incremental maintenance using a dynamic table, the table will be fully refreshed from the query at the specified frequency. This may lead to greater lag in feature refresh and higher maintenance costs. You can alter the query logic, breaking the query into multiple smaller queries that support incremental maintenance, or provision a larger virtual warehouse for dynamic table maintenance. See General limitations for the latest information on dynamic table limitations.

registered_fv: FeatureView = fs.register_feature_view(
    feature_view=managed_fv,    # feature view created above, could also use external_fv
    version="1",
    block=True,         # whether function call blocks until initial data is available
    overwrite=False,    # whether to replace existing feature view with same name/version
)

A feature view pipeline definition is immutable after it has been registered, providing consistent feature computation as long as the feature view exists.

Retrieving Feature Views

Once a feature view has been registered with the feature store, you can retrieve it by name and version using the feature store’s get_feature_view method.

retrieved_fv: FeatureView = fs.get_feature_view(
    name="MY_MANAGED_FV",
    version="1"
)

Discovering Feature Views

You can list all registered feature views in the feature store, optionally filtering by entity name or feature view name, using the list_feature_views method. Information about the matching feature views is returned as a Snowpark DataFrame.

fs.list_feature_views(
    entity_name="<entity_name>",                # optional
    feature_view_name="<feature_view_name>",    # optional
).show()

Features can also be discovered using the Snowsight Feature Store UI (available to select customers) or Universal Search.

Cost Considerations

Materialized features use Snowflake dynamic tables. See About monitoring dynamic tables for information on monitoring dynamic tables and Understanding cost for dynamic tables for information on the costs of dynamic tables.

Known Limitations

  • The maximum number of managed feature views and the feature transformation queries in feature views are subject to the limitations of dynamic tables.

  • Not all feature transformation queries are supported by dynamic table incremental refresh. See the dynamic table limitations.

  • Feature view names are SQL identifiers and subject to Snowflake identifier requirements.

  • Feature view versions are strings and have a maximum length of 128 characters. Some characters are not permitted and will produce an error message.
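
The last two limitations can be checked before registration. The sketch below is illustrative only: check_name_and_version is a hypothetical helper, it validates the name against the rules for unquoted Snowflake identifiers (quoted identifiers are not handled), and it checks only the length of the version string, because the set of disallowed version characters is not listed here.

```python
import re
from typing import List

# Unquoted Snowflake identifiers start with a letter or underscore, followed by
# letters, digits, underscores, or dollar signs, up to 255 characters in total.
_UNQUOTED_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")
MAX_IDENTIFIER_LENGTH = 255
MAX_VERSION_LENGTH = 128  # limit stated in the list above


def check_name_and_version(name: str, version: str) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(name) > MAX_IDENTIFIER_LENGTH or not _UNQUOTED_IDENTIFIER.match(name):
        problems.append(f"{name!r} is not a valid unquoted identifier")
    if len(version) > MAX_VERSION_LENGTH:
        problems.append(f"version is longer than {MAX_VERSION_LENGTH} characters")
    return problems


print(check_name_and_version("MY_MANAGED_FV", "1"))  # []
print(check_name_and_version("my fv", "1"))          # one problem: invalid name
```

Passing this check does not guarantee that registration succeeds, since the feature store applies its own rules to names and versions.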