snowflake.ml.feature_store.FeatureView

class snowflake.ml.feature_store.FeatureView(name: str, entities: list[Entity], feature_df: Optional[DataFrame] = None, *, timestamp_col: Optional[str] = None, refresh_freq: Optional[str] = None, desc: str = '', warehouse: Optional[str] = None, initialize: str = 'ON_CREATE', refresh_mode: str = 'AUTO', cluster_by: Optional[list[str]] = None, online_config: Optional[OnlineConfig] = None, feature_granularity: Optional[str] = None, features: Optional[list[Feature]] = None, rollup_config: Optional[RollupConfig] = None, storage_config: Optional[StorageConfig] = None, stream_config: Optional[StreamConfig] = None, **_kwargs: Any)

Bases: LineageNode

A FeatureView instance encapsulates a logical group of features.

Create a FeatureView instance.

Parameters:
  • name – The name of the FeatureView. This must follow Snowflake identifier rules.

  • entities – The entities that the FeatureView is associated with.

  • feature_df – The Snowpark DataFrame containing the data source and all feature transformation logic. The final projection of the DataFrame should contain the feature names, the join keys, and the timestamp column if applicable.

  • timestamp_col – Name of the timestamp column used for point-in-time lookup when consuming the feature values.

  • refresh_freq

    Time unit defining how often the new feature data should be generated, in the format { <num> { seconds | minutes | hours | days } | DOWNSTREAM | <cron expr> <time zone>}.

    The minimum refresh frequency is 1 minute.

    When using a cron format, you must provide a time zone.

    When you don’t provide a refresh value, the FeatureView is registered as a View on the Snowflake backend. There are no extra storage costs incurred for this view.

  • desc – Description of the FeatureView.

  • warehouse – The warehouse used to refresh this feature view. Not needed when refresh_freq is None. If specified, this warehouse overrides the default warehouse of the Feature Store; otherwise the default warehouse is used.

  • initialize – Specifies the behavior of the initial refresh of the feature view. This property cannot be altered after you register the feature view. It supports ON_CREATE (default) or ON_SCHEDULE. ON_CREATE refreshes the feature view synchronously at creation. ON_SCHEDULE refreshes the feature view at the next scheduled refresh. It is only effective when refresh_freq is not None.

  • refresh_mode – The refresh mode of a managed feature view. The value can be ‘AUTO’, ‘FULL’, or ‘INCREMENTAL’. For a managed feature view, the default value is ‘AUTO’. For a static feature view, it has no effect. For more information, see CREATE DYNAMIC TABLE.

  • cluster_by – Columns to cluster the feature view by. If timestamp_col is provided, it is added to the default clustering keys. Default is to use the join keys from entities in the view.

  • online_config

    Configuration for online storage. If provided with enable=True, online storage will be enabled. Defaults to None (no online storage).

    Note

    This feature is currently in preview.

  • feature_granularity – The tile interval for time-series aggregations (e.g., “1h”, “1d”). When specified along with features, enables tile-based aggregation where a Dynamic Table stores pre-computed partial aggregations (tiles), and dataset generation merges these tiles for point-in-time correct results.

  • features – List of aggregation feature definitions using the Feature class. Required when feature_granularity is specified. Defines the aggregations to compute (e.g., SUM, COUNT, LAST_N) with their windows.

  • rollup_config – Configuration for rolling up a tiled FeatureView to a coarser entity level. When specified, this FeatureView will aggregate features from the source FeatureView using the provided entity mapping. Cannot be used together with feature_df. See RollupConfig for details.

  • storage_config – Configuration for storage format using StorageConfig. Supports Snowflake native format (default) or Iceberg format. When using Iceberg, specify external_volume and optionally base_location.

  • stream_config

    Configuration for streaming feature views using StreamConfig. When provided, feature_df serves as backfill data, and an online feature table with a streaming spec is automatically created. Requires timestamp_col and feature_df. Online storage is always enabled with the POSTGRES store type.

    Note

    This feature is currently in private preview.

  • _kwargs

    Reserved kwargs for system generated args.

    Caution

    Use of additional keywords is prohibited.
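
The refresh_freq formats described above can be told apart with a small helper. The following is a minimal sketch for illustration only: classify_refresh_freq is a hypothetical function and not part of snowflake.ml. It assumes that single-letter unit abbreviations (as in the '1d' used in the examples) are accepted and that a cron expression has five fields followed by a time zone. Snowflake performs its own validation at registration time.

```python
import re
from typing import Optional

# Hypothetical helper, not part of snowflake.ml. It only mirrors the
# documented refresh_freq formats to show how each value is interpreted.
_INTERVAL = re.compile(
    r"^(\d+)\s*(seconds?|s|minutes?|m|hours?|h|days?|d)$", re.IGNORECASE
)
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def classify_refresh_freq(refresh_freq: Optional[str]) -> str:
    """Return 'view', 'downstream', 'interval', or 'cron'."""
    if refresh_freq is None:
        # No refresh value: the FeatureView is registered as a plain View.
        return "view"
    value = refresh_freq.strip()
    if value.upper() == "DOWNSTREAM":
        return "downstream"
    match = _INTERVAL.match(value)
    if match:
        # Convert to seconds using the first letter of the unit.
        seconds = int(match.group(1)) * _UNIT_SECONDS[match.group(2)[0].lower()]
        if seconds < 60:
            raise ValueError("the minimum refresh frequency is 1 minute")
        return "interval"
    # Assumption: five cron fields plus one time zone token.
    if len(value.split()) == 6:
        return "cron"
    raise ValueError(f"unrecognized refresh_freq: {refresh_freq!r}")


print(classify_refresh_freq("1d"))
print(classify_refresh_freq("0 0 * * * UTC"))
```

Only the 'view' case avoids extra storage, because nothing is materialized; the other three cases describe a managed feature view that is refreshed on the given schedule.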

Example:

>>> fs = FeatureStore(...)
>>> # draft_fv is a local object that hasn't materialized to Snowflake backend yet.
>>> feature_df = session.sql("select f_1, f_2 from source_table")
>>> draft_fv = FeatureView(
...     name="my_fv",
...     entities=[e1, e2],
...     feature_df=feature_df,
...     timestamp_col='TS', # optional
...     refresh_freq='1d',  # optional
...     desc='A line about this feature view',  # optional
...     warehouse='WH'      # optional, the warehouse used to refresh (managed) feature view
... )
>>> print(draft_fv.status)
FeatureViewStatus.DRAFT

>>> # registered_fv is a local object that maps to a Snowflake backend object.
>>> registered_fv = fs.register_feature_view(draft_fv, "v1")
>>> print(registered_fv.status)
FeatureViewStatus.ACTIVE

>>> # Example with online configuration for online feature storage
>>> config = OnlineConfig(enable=True, target_lag='15s')
>>> online_fv = FeatureView(
...     name="my_online_fv",
...     entities=[e1, e2],
...     feature_df=feature_df,
...     timestamp_col='TS',
...     refresh_freq='1d',
...     desc='Feature view with online storage',
...     online_config=config  # optional, enables online feature storage
... )
>>> registered_online_fv = fs.register_feature_view(online_fv, "v1")
>>> print(registered_online_fv.online)
True

>>> # Example with Iceberg storage configuration
>>> storage = StorageConfig(
...     format=StorageFormat.ICEBERG,
...     external_volume='MY_EXTERNAL_VOLUME',
...     base_location='feature_store/my_fv'  # optional
... )
>>> iceberg_fv = FeatureView(
...     name="my_iceberg_fv",
...     entities=[e1, e2],
...     feature_df=feature_df,
...     refresh_freq='1d',
...     storage_config=storage  # optional, configures Iceberg storage
... )

Methods

attach_feature_desc(descs: dict[str, str]) FeatureView

Associate feature-level descriptions with the FeatureView.

Parameters:

descs – Dictionary mapping feature names to their descriptions.

Returns:

FeatureView with feature-level descriptions attached.

Raises:

ValueError – if feature name is not found in the FeatureView.

Example:

>>> fs = FeatureStore(...)
>>> e = fs.get_entity('TRIP_ID')
>>> feature_df = session.table(source_table).select('TRIPDURATION', 'START_STATION_LATITUDE', 'TRIP_ID')
>>> draft_fv = FeatureView(name='F_TRIP', entities=[e], feature_df=feature_df)
>>> draft_fv = draft_fv.attach_feature_desc({
...     "TRIPDURATION": "Duration of a trip.",
...     "START_STATION_LATITUDE": "Latitude of the start station."
... })
>>> registered_fv = fs.register_feature_view(draft_fv, version='1.0')
>>> registered_fv.feature_descs
OrderedDict([('TRIPDURATION', 'Duration of a trip.'),
    ('START_STATION_LATITUDE', 'Latitude of the start station.')])
classmethod from_json(json_str: str, session: Session) FeatureView
fully_qualified_name() str

Returns the fully qualified name (<database_name>.<schema_name>.<feature_view_name>) for the FeatureView in Snowflake.

Returns:

fully qualified name string.

Raises:

RuntimeError – if the FeatureView is not registered.

Example:

>>> fs = FeatureStore(...)
>>> e = fs.get_entity('TRIP_ID')
>>> feature_df = session.table(source_table).select(
...     'TRIPDURATION',
...     'START_STATION_LATITUDE',
...     'TRIP_ID'
... )
>>> draft_fv = FeatureView(name='F_TRIP', entities=[e], feature_df=feature_df)
>>> registered_fv = fs.register_feature_view(draft_fv, version='1.0')
>>> registered_fv.fully_qualified_name()
'MY_DB.MY_SCHEMA."F_TRIP$1.0"'
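
Judging from the output above, the object name combines the feature view name and its version with a `$` separator inside a quoted identifier. The sketch below reproduces that composition under the assumption that the format holds in general; build_fully_qualified_name is a hypothetical helper, not a library function.

```python
def build_fully_qualified_name(database: str, schema: str, name: str, version: str) -> str:
    """Compose <database>.<schema>."<name>$<version>" (assumed format)."""
    # The name and version are joined with '$' and quoted as one identifier,
    # which keeps characters such as '.' in the version from being parsed
    # as a namespace separator.
    physical_name = f"{name}${version}"
    return f'{database}.{schema}."{physical_name}"'


print(build_fully_qualified_name("MY_DB", "MY_SCHEMA", "F_TRIP", "1.0"))
```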
fully_qualified_online_table_name() str

Get the fully qualified name for the online feature table.

Returns:

The fully qualified name (<database_name>.<schema_name>.<online_table_name>) for the online feature table in Snowflake.

Raises:

RuntimeError – if the FeatureView is not registered or not configured for online storage.

lineage(direction: Literal['upstream', 'downstream'] = 'downstream', domain_filter: Optional[set[Literal['feature_view', 'dataset', 'model', 'table', 'view']]] = None) list[Union[FeatureView, Dataset, ModelVersion, LineageNode]]

Retrieves the lineage nodes connected to this node.

Parameters:
  • direction – The direction to trace lineage. Defaults to “downstream”.

  • domain_filter – Set of domains to filter nodes. Defaults to None.

Returns:

A list of connected lineage nodes.

Return type:

List[LineageNode]

list_columns() DataFrame

List all columns and their information.

Returns:

A Snowpark DataFrame containing feature information.

Raises:

ValueError – if the FeatureView has no feature DataFrame (e.g. unregistered rollup).

Example:

>>> fs = FeatureStore(...)
>>> e = Entity("foo", ["id"], desc='my entity')
>>> fs.register_entity(e)

>>> draft_fv = FeatureView(
...     name="fv",
...     entities=[e],
...     feature_df=session.table(source_table).select(["NAME", "ID", "TITLE", "AGE", "TS"]),
...     timestamp_col="ts",
... ).attach_feature_desc({"AGE": "my age", "TITLE": '"my title"'})
>>> fv = fs.register_feature_view(draft_fv, '1.0')

>>> fv.list_columns().show()
--------------------------------------------------
|"NAME"  |"CATEGORY"  |"DTYPE"      |"DESC"      |
--------------------------------------------------
|NAME    |FEATURE     |string(64)   |            |
|ID      |ENTITY      |bigint       |my entity   |
|TITLE   |FEATURE     |string(128)  |"my title"  |
|AGE     |FEATURE     |bigint       |my age      |
|TS      |TIMESTAMP   |bigint       |NULL        |
--------------------------------------------------
slice(names: list[str]) FeatureViewSlice

Select a subset of features within the FeatureView.

Parameters:

names – feature names to select.

Returns:

FeatureViewSlice instance containing selected features.

Raises:

ValueError – if any selected feature name is not found in the FeatureView.

Example:

>>> fs = FeatureStore(...)
>>> e = fs.get_entity('TRIP_ID')
>>> # feature_df contains 3 features and 1 entity
>>> feature_df = session.table(source_table).select(
...     'TRIPDURATION',
...     'START_STATION_LATITUDE',
...     'END_STATION_LONGITUDE',
...     'TRIP_ID'
... )
>>> draft_fv = FeatureView(name='F_TRIP', entities=[e], feature_df=feature_df)
>>> fv = fs.register_feature_view(draft_fv, version='1.0')
>>> # shows all 3 features
>>> fv.feature_names
['TRIPDURATION', 'START_STATION_LATITUDE', 'END_STATION_LONGITUDE']

>>> # slice a subset of features
>>> fv_slice = fv.slice(['TRIPDURATION', 'START_STATION_LATITUDE'])
>>> fv_slice.names
['TRIPDURATION', 'START_STATION_LATITUDE']

>>> # query the full set of features in original feature view
>>> fv_slice.feature_view_ref.feature_names
['TRIPDURATION', 'START_STATION_LATITUDE', 'END_STATION_LONGITUDE']
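
The slicing semantics shown above (a subset of names, a back-reference to the full feature view, and a ValueError for unknown names) can be modeled without a Snowflake connection. This is a toy sketch only: ToyFeatureView and ToySlice are hypothetical stand-ins and do not reflect how FeatureView or FeatureViewSlice are implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyFeatureView:
    """Hypothetical stand-in that only tracks feature names."""
    feature_names: list

    def slice(self, names: list) -> "ToySlice":
        # Reject any requested name that the feature view does not contain.
        missing = [n for n in names if n not in self.feature_names]
        if missing:
            raise ValueError(f"feature names not found: {missing}")
        return ToySlice(feature_view_ref=self, names=list(names))


@dataclass(frozen=True)
class ToySlice:
    """Hypothetical stand-in: selected names plus the originating view."""
    feature_view_ref: ToyFeatureView
    names: list


fv = ToyFeatureView(["TRIPDURATION", "START_STATION_LATITUDE", "END_STATION_LONGITUDE"])
fv_slice = fv.slice(["TRIPDURATION", "START_STATION_LATITUDE"])
print(fv_slice.names)
print(fv_slice.feature_view_ref.feature_names)
```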
to_df(session: Optional[Session] = None) DataFrame

Convert feature view to a Snowpark DataFrame object.

Parameters:

session – [deprecated] This argument has no effect. No need to pass a session object.

Returns:

A Snowpark DataFrame object containing information about the feature view.

Example:

>>> fs = FeatureStore(...)
>>> e = Entity("foo", ["id"], desc='my entity')
>>> fs.register_entity(e)

>>> draft_fv = FeatureView(
...     name="fv",
...     entities=[e],
...     feature_df=session.table(source_table).select(["NAME", "ID", "TITLE", "AGE", "TS"]),
...     timestamp_col="ts",
... ).attach_feature_desc({"AGE": "my age", "TITLE": '"my title"'})
>>> fv = fs.register_feature_view(draft_fv, '1.0')

>>> fv.to_df().show()
----------------------------------------------------------------...
|"NAME"  |"ENTITIES"                |"TIMESTAMP_COL"  |"DESC"  |
----------------------------------------------------------------...
|FV      |[                         |TS               |foobar  |
|        |  {                       |                 |        |
|        |    "desc": "my entity",  |                 |        |
|        |    "join_keys": [        |                 |        |
|        |      "ID"                |                 |        |
|        |    ],                    |                 |        |
|        |    "name": "FOO",        |                 |        |
|        |    "owner": null         |                 |        |
|        |  }                       |                 |        |
|        |]                         |                 |        |
----------------------------------------------------------------...
to_json() str
with_name(name: str) FeatureView

Use this Feature View with a namespace for dataset generation.

Returns a copy of this Feature View with a custom namespace that will be used to prefix all feature columns during dataset generation. This is useful for avoiding column name collisions or using the same Feature View multiple times with different contexts.

The name is NOT persisted to the Feature Store and only affects column naming in retrieve_feature_values(), generate_dataset(), and generate_training_set().

Parameters:

name – Namespace for feature columns. Will be formatted as ‘{name}_’ before each column name:

  • “sender” -> sender_count, sender_total

  • “c” -> c_count, c_total

  • “” (empty) -> no prefix (useful to override auto_prefix)

Returns:

A new FeatureView instance with the namespace attached.

Examples:

>>> # Avoid collisions
>>> fs.generate_training_set(
...     features=[
...         cart_fv.with_name("cart"),
...         page_fv.with_name("page"),
...     ]
... )
>>> # Same FV twice (sender/recipient pattern)
>>> user_fv = fs.get_feature_view("user_features", "v1")
>>> fs.generate_training_set(
...     features=[
...         user_fv.with_name("sender"),
...         user_fv.with_name("recipient"),
...     ]
... )
>>> # Override auto_prefix
>>> fs.generate_training_set(
...     features=[
...         important_fv.with_name(""),  # No prefix
...         other_fv,  # Gets auto prefix
...     ],
...     auto_prefix=True
... )

Note

Similar to Tecton’s with_name() method. This setting takes precedence over the auto_prefix parameter.
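
The prefixing rule for the name parameter can be expressed in a few lines. This is a minimal sketch of the documented ‘{name}_’ formatting only; apply_namespace is a hypothetical helper, and it ignores details such as identifier case handling that the library may apply.

```python
def apply_namespace(name: str, feature_columns: list) -> list:
    """Prefix each feature column with '{name}_'; an empty name adds nothing."""
    prefix = f"{name}_" if name else ""
    return [f"{prefix}{column}" for column in feature_columns]


columns = ["count", "total"]
print(apply_namespace("sender", columns))     # same view used as the sender
print(apply_namespace("recipient", columns))  # same view used as the recipient
print(apply_namespace("", columns))           # empty namespace: no prefix
```

Because the two namespaced results share no column names, the same feature view can appear twice in one training set without collisions.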

Attributes

aggregation_specs

Get the aggregation specifications (internal use).

cluster_by
column_alias

Returns the column namespace if set via with_name(), else None.

database
desc
entities
feature_descs
feature_df
feature_granularity

Get the tile interval for aggregations.

feature_names
initialize
is_rollup

Check if this feature view is a rollup of another tiled feature view.

Returns:

True if this feature view was created with rollup_config.

is_streaming

Check if this feature view is a streaming feature view.

Returns:

True if stream_config was provided or the FV was reconstructed from a streaming backend object.

is_tiled

Check if this feature view uses tile-based aggregation.

Returns:

True if feature_granularity and features are configured, or if this is a rollup feature view.
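
As a rough illustration of what tile-based aggregation means, the sketch below pre-computes partial aggregates per one-hour tile and then merges tiles to answer a windowed query as of a given time. It is a conceptual model in plain Python, not the Dynamic Table logic Snowflake generates. The function names are hypothetical, timestamps are plain epoch seconds, and the sketch assumes that only tiles that are complete before the as-of time are merged.

```python
from collections import defaultdict

TILE_SECONDS = 3600  # corresponds conceptually to feature_granularity="1h"


def build_tiles(events):
    """Group (timestamp, value) events into per-tile partial aggregates."""
    tiles = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for ts, value in events:
        tile_start = ts - ts % TILE_SECONDS  # floor to the tile boundary
        tiles[tile_start]["sum"] += value
        tiles[tile_start]["count"] += 1
    return dict(tiles)


def window_sum(tiles, as_of, window_seconds):
    """Merge the partial sums of complete tiles inside the window before as_of."""
    end = as_of - as_of % TILE_SECONDS  # exclude the tile still in progress
    start = end - window_seconds
    return sum(t["sum"] for tile_start, t in tiles.items() if start <= tile_start < end)


events = [(0, 1.0), (1800, 2.0), (3600, 4.0), (7300, 8.0)]
tiles = build_tiles(events)
print(window_sum(tiles, as_of=7200, window_seconds=7200))
print(window_sum(tiles, as_of=7500, window_seconds=3600))
```

Merging a handful of tiles is cheaper than rescanning raw events for every lookup time, which is the motivation for storing partial aggregations.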

name
online

Check if online storage is enabled for this feature view.

Returns:

True if online storage is enabled, False otherwise.

online_config
ordered_entity_columns

Deduplicated entity join key column names, preserving order.

Returns:

List of resolved (uppercased) join key names from all entities, with duplicates removed while preserving the order of first occurrence.
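
The deduplication described here can be sketched as follows. dedupe_join_keys is a hypothetical helper that takes one list of join keys per entity. It approximates identifier resolution with a plain upper() call, which is a simplification: quoted Snowflake identifiers keep their case.

```python
def dedupe_join_keys(entity_join_keys: list) -> list:
    """Flatten join keys across entities, uppercase them, and drop repeats."""
    resolved = (key.upper() for keys in entity_join_keys for key in keys)
    # dict.fromkeys keeps the first occurrence of each key in insertion order.
    return list(dict.fromkeys(resolved))


print(dedupe_join_keys([["id", "region"], ["REGION", "user_id"]]))
```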

output_schema
owner
query
refresh_freq
refresh_mode
refresh_mode_reason
rollup_config

Get the rollup configuration if this is a rollup feature view.

Returns:

The RollupConfig used to create this feature view, or None.

schema
session
status
storage_config

Get the storage configuration for this feature view.

Returns:

StorageConfig object if set, None otherwise. When None, the feature view uses the default Snowflake-native storage (Dynamic Table).

stream_config

Get the stream configuration if this is a streaming feature view.

Returns:

The StreamConfig used to create this feature view, or None. For reconstructed FVs (via get_feature_view), this is None since the transformation function is only needed at registration time.

timestamp_col
transformation_fn_source

Get the transformation function source code for streaming feature views.

For draft FVs (with stream_config), returns the source from the config. For reconstructed FVs (via get_feature_view), returns the source from stored metadata. Returns None for non-streaming FVs.

Returns:

The plain-text source code string, or None for non-streaming FVs.

Example:

>>> fv = fs.get_feature_view("realtime_txn_features", "v1")
>>> print(fv.transformation_fn_source)
def normalize_txn(df):
    df["amount_cents"] = (df["amount"] * 100).astype(int)
    df["is_large"] = df["amount"] > 1000
    return df
version
warehouse