Creating and Using Feature Views
Introduction
A feature view is a group of logically-related features that are refreshed on the same schedule. The FeatureView constructor accepts a Snowpark DataFrame that contains the feature generation logic. The provided DataFrame must contain the join_keys columns specified in the entities associated with the feature view. A timestamp column name is required if your feature view includes time-series features.
The refresh frequency can be a time delta (minimum value 1 minute), or it can be a cron expression with time zone (e.g. * * * * * America/Los_Angeles).
from snowflake.ml.feature_store import FeatureView

managed_fv = FeatureView(
    name="MY_MANAGED_FV",
    entities=[entity],
    feature_df=my_df,                # a Snowpark DataFrame
    timestamp_col="ts",              # optional timestamp column name in the dataframe
    refresh_freq="5 minutes",        # optional time unit of how often feature data refreshes
    desc="my managed feature view",  # optional description string
)
The example above assumes that features of interest have already been defined in the my_df DataFrame. You can write custom feature logic using Snowpark Python or SQL. The Snowpark Python API provides utility functions for defining common feature types such as windowed aggregations. Examples of these are shown in Common feature and query patterns.
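The two forms of refresh frequency described in the introduction can be told apart with a small check before you construct a feature view. The sketch below is not part of snowflake.ml: the helper name classify_refresh_freq, the accepted unit spellings, and the "five cron fields plus a time zone" test are assumptions made for illustration, and the library performs its own validation.

```python
import re
from datetime import timedelta

# Assumed unit spellings; the library may accept a different set.
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
_DELTA_RE = re.compile(r"^\s*(\d+)\s*([a-zA-Z]+?)s?\s*$")
MIN_REFRESH = timedelta(minutes=1)  # minimum value stated in the introduction


def classify_refresh_freq(value):
    """Return ("timedelta", timedelta) or ("cron", str) for a refresh_freq string."""
    match = _DELTA_RE.match(value)
    if match:
        amount, unit = int(match.group(1)), match.group(2).lower()
        if unit not in _UNIT_SECONDS:
            raise ValueError(f"unknown time unit: {unit!r}")
        delta = timedelta(seconds=amount * _UNIT_SECONDS[unit])
        if delta < MIN_REFRESH:
            raise ValueError("refresh frequency must be at least 1 minute")
        return "timedelta", delta
    # Five cron fields followed by a time zone name,
    # e.g. "* * * * * America/Los_Angeles".
    if len(value.split()) == 6:
        return "cron", value
    raise ValueError(f"unrecognized refresh frequency: {value!r}")


print(classify_refresh_freq("5 minutes")[0])                      # timedelta
print(classify_refresh_freq("* * * * * America/Los_Angeles")[0])  # cron
```

The cron branch only counts fields; it does not verify that each field or the time zone name is valid.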
If you have ready-to-use features generated outside of the feature store, you can still register them by omitting the refresh frequency. The feature DataFrame could contain a simple projection from the existing feature table, or additional transformations that will be executed during feature consumption. This does not incur additional storage cost, but other feature store capabilities remain available.
With such features, refresh, immutability, consistency, and correctness are not managed by the feature store. The ready-to-use features must be maintained in some other fashion.
external_fv = FeatureView(
    name="MY_EXTERNAL_FV",
    entities=[entity],
    feature_df=my_external_df,
    timestamp_col="ts",
    refresh_freq=None,  # None = Feature Store will never refresh the feature data
    desc="my external feature view",
)
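Both managed and external feature views require the feature DataFrame to contain the join keys of their entities, so checking the column names before calling the constructor can surface mistakes early. This is a minimal sketch rather than a snowflake.ml API: missing_join_keys is a hypothetical helper, the column and key names are made up, and the case-insensitive comparison assumes unquoted identifiers. In practice the column list might come from the DataFrame's columns property.

```python
def missing_join_keys(df_columns, join_keys):
    """Return the join keys that are absent from df_columns, in their given order.

    The comparison is case-insensitive, which assumes unquoted identifiers.
    """
    available = {col.upper() for col in df_columns}
    return [key for key in join_keys if key.upper() not in available]


# Hypothetical column and key names, for illustration only.
columns = ["CUSTOMER_ID", "TS", "AMOUNT_7D_SUM"]
print(missing_join_keys(columns, ["customer_id"]))              # []
print(missing_join_keys(columns, ["customer_id", "store_id"]))  # ['store_id']
```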
To enrich metadata at the feature level, you can add per-feature descriptions to the FeatureView. This makes it easier to find features using Snowsight Universal Search.
external_fv = external_fv.attach_feature_desc(
    {
        "SENDERID": "Sender account-id for the Transaction",
        "RECIEVERID": "Receiver account-id for the Transaction",
        "IBAN": "International Bank Identifier for the Receiver Bank",
        "AMOUNT": "Amount for the Transaction",
    }
)
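Because each description is keyed by a feature name, a misspelled key is easy to overlook. The following sketch flags keys that match no column. unknown_desc_keys is a hypothetical helper, not a library function, the column list is invented for illustration, and how attach_feature_desc itself treats unmatched keys is not covered here.

```python
def unknown_desc_keys(feature_columns, descriptions):
    """Return description keys that match no feature column, sorted alphabetically.

    Matching is case-insensitive, which assumes unquoted identifiers.
    """
    known = {col.upper() for col in feature_columns}
    return sorted(key for key in descriptions if key.upper() not in known)


# Hypothetical columns and descriptions, for illustration only.
feature_columns = ["SENDERID", "IBAN", "AMOUNT"]
descriptions = {
    "SENDERID": "Sender account-id for the Transaction",
    "AMUONT": "Amount for the Transaction",  # deliberate misspelling
}
print(unknown_desc_keys(feature_columns, descriptions))  # ['AMUONT']
```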
At this point, the feature view has been completely defined and can be registered in the feature store.
Registering Feature Views
You register feature views using the register_feature_view method, with a customized name and version. Incremental maintenance (for supported query types) and automatic refresh will occur based on the specified refresh frequency.
When the provided query cannot be maintained via incremental maintenance using a dynamic table, the table will be fully refreshed from the query at the specified frequency. This may lead to greater lag in feature refresh and higher maintenance costs. You can alter the query logic, breaking the query into multiple smaller queries that support incremental maintenance, or provision a larger virtual warehouse for dynamic table maintenance. See General limitations for the latest information on dynamic table limitations.
registered_fv: FeatureView = fs.register_feature_view(
    feature_view=managed_fv,  # feature view created above, could also use external_fv
    version="1",
    block=True,               # whether function call blocks until initial data is available
    overwrite=False,          # whether to replace existing feature view with same name/version
)
A feature view pipeline definition is immutable after it has been registered, providing consistent feature computation as long as the feature view exists.
Retrieving Feature Views
Once a feature view has been registered with the feature store, you can retrieve it when needed using the feature store’s get_feature_view method.
retrieved_fv: FeatureView = fs.get_feature_view(
    name="MY_MANAGED_FV",
    version="1",
)
Discovering Feature Views
You can list all registered feature views in the feature store, optionally filtering by entity name or feature view name, using the list_feature_views method. Information about the matching feature views is returned as a Snowpark DataFrame.
fs.list_feature_views(
    entity_name="<entity_name>",              # optional
    feature_view_name="<feature_view_name>",  # optional
).show()
Features can also be discovered using the Snowsight Feature Store UI (available to select customers) or Universal Search.
Cost Considerations
Materialized features use Snowflake dynamic tables. See About monitoring dynamic tables for information on monitoring dynamic tables and Understanding cost for dynamic tables for information on the costs of dynamic tables.
Known Limitations
The maximum number of managed feature views and the feature transformation queries in feature views are subject to the limitations of dynamic tables.
Not all feature transformation queries are supported by incremental refresh of dynamic tables. See the dynamic table limitations.
Feature view names are SQL identifiers and subject to Snowflake identifier requirements.
Feature view versions are strings and have a maximum length of 128 characters. Some characters are not permitted and will produce an error message.
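The version length limit can be checked on the client before registration. The sketch below enforces only the 128-character maximum stated above. check_version is a hypothetical helper, rejecting empty strings is an assumption, and because the set of disallowed characters is not listed here, any character rule must be supplied by the caller as a regular expression.

```python
import re

MAX_VERSION_LENGTH = 128  # limit stated above


def check_version(version, allowed=None):
    """Return version unchanged, or raise ValueError if it fails a check.

    `allowed` is an optional caller-supplied regular expression. No default
    character set is assumed, since the permitted characters are not listed.
    """
    if not version:
        raise ValueError("version must be a non-empty string")
    if len(version) > MAX_VERSION_LENGTH:
        raise ValueError(f"version exceeds {MAX_VERSION_LENGTH} characters")
    if allowed is not None and not re.fullmatch(allowed, version):
        raise ValueError(f"version {version!r} does not match {allowed!r}")
    return version


print(check_version("1"))                        # 1
print(check_version("2.0", r"[A-Za-z0-9._]+"))   # 2.0
```

The pattern in the second call is an example only and does not reflect the feature store's actual rules.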