snowflake.ml.feature_store.FeatureStore
- class snowflake.ml.feature_store.FeatureStore(session: Session, database: str, name: str, default_warehouse: str, creation_mode: CreationMode = CreationMode.FAIL_IF_NOT_EXIST)
Bases: object
FeatureStore provides APIs to create, materialize, retrieve and manage feature pipelines.
Creates a FeatureStore instance.
- Parameters:
session – Snowpark Session used to interact with the Snowflake backend.
database – Database in which to create the FeatureStore instance.
name – Target FeatureStore name, maps to a schema in the database.
default_warehouse – Default warehouse for feature store compute.
creation_mode – If FAIL_IF_NOT_EXIST, the feature store raises an exception when required resources do not already exist. If CREATE_IF_NOT_EXIST, the feature store creates any required resources that do not already exist. Required resources include the schema and tags. Note that the database must already exist in either mode.
- Raises:
SnowflakeMLException – [ValueError] default_warehouse does not exist.
SnowflakeMLException – [ValueError] Required resources do not exist when creation_mode is FAIL_IF_NOT_EXIST.
SnowflakeMLException – [RuntimeError] Failed to find resources.
SnowflakeMLException – [RuntimeError] Failed to create feature store.
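The two creation modes can be illustrated with a small wrapper around the constructor. This is only a sketch: the helper name open_feature_store and its create_if_missing flag are hypothetical, and it assumes that CreationMode and FeatureStore are importable from snowflake.ml.feature_store. The import is done lazily inside the function so the snippet loads even where the package is not installed.

```python
from typing import Any


def open_feature_store(
    session: Any,
    database: str,
    name: str,
    default_warehouse: str,
    create_if_missing: bool = False,
) -> Any:
    """Hypothetical helper: build a FeatureStore with the chosen creation mode."""
    # Assumption: both names are exported from snowflake.ml.feature_store.
    # Imported lazily so this module loads without snowflake-ml-python.
    from snowflake.ml.feature_store import CreationMode, FeatureStore

    mode = (
        CreationMode.CREATE_IF_NOT_EXIST  # create the schema and tags when missing
        if create_if_missing
        else CreationMode.FAIL_IF_NOT_EXIST  # raise if the schema or tags are missing
    )
    # The database itself must already exist in either mode.
    return FeatureStore(
        session=session,
        database=database,
        name=name,
        default_warehouse=default_warehouse,
        creation_mode=mode,
    )
```

A first-time setup would pass create_if_missing=True, while later sessions can rely on the default so that a mistyped schema name fails loudly instead of silently creating a new feature store.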
Methods
- delete_entity(name: str) → None
Delete a previously registered Entity.
- Parameters:
name – Entity name.
- Raises:
SnowflakeMLException – [ValueError] Entity with the given name does not exist.
SnowflakeMLException – [RuntimeError] Failed to alter schema or drop tag.
SnowflakeMLException – [RuntimeError] Failed to find resources.
- delete_feature_view(feature_view: FeatureView) → None
Delete a FeatureView.
- Parameters:
feature_view – FeatureView to delete.
- Raises:
SnowflakeMLException – [ValueError] FeatureView is not registered.
- generate_dataset(name: str, spine_df: DataFrame, features: List[Union[FeatureView, FeatureViewSlice]], version: Optional[str] = None, spine_timestamp_col: Optional[str] = None, spine_label_cols: Optional[List[str]] = None, exclude_columns: Optional[List[str]] = None, include_feature_view_timestamp_col: bool = False, desc: str = '', output_type: Literal['dataset'] = 'dataset') → Dataset
- generate_dataset(name: str, spine_df: DataFrame, features: List[Union[FeatureView, FeatureViewSlice]], output_type: Literal['table'], version: Optional[str] = None, spine_timestamp_col: Optional[str] = None, spine_label_cols: Optional[List[str]] = None, exclude_columns: Optional[List[str]] = None, include_feature_view_timestamp_col: bool = False, desc: str = '') → DataFrame
Generate a dataset from the given source table (spine_df) and feature views.
- Parameters:
name – The name of the Dataset to be generated. Datasets are uniquely identified within a schema by their name and version.
spine_df – The fact table containing the raw dataset.
features – A list of FeatureView or FeatureViewSlice objects containing the features to be joined.
version – The version of the Dataset to be generated. If not specified, the current timestamp is used instead.
spine_timestamp_col – Name of the timestamp column in spine_df that will be used to join time-series features. If spine_timestamp_col is not None, the input features must also have a timestamp_col.
spine_label_cols – Name(s) of the column(s) in spine_df that contain labels.
exclude_columns – Column names to exclude from the result dataframe. The underlying storage will still contain the columns.
include_feature_view_timestamp_col – If True, the generated dataset includes the timestamp column of each feature view that has one. Defaults to False.
desc – A description of this dataset.
output_type – The type of Snowflake storage to use for the generated training data: 'dataset' (default) or 'table'.
- Returns:
If output_type is 'dataset' (default), returns a Dataset object. If output_type is 'table', returns a Snowpark DataFrame representing the table.
- Raises:
SnowflakeMLException – [ValueError] Dataset name/version already exists.
SnowflakeMLException – [ValueError] Snapshot creation failed.
SnowflakeMLException – [ValueError] Invalid output_type specified.
SnowflakeMLException – [RuntimeError] Failed to create clone from table.
SnowflakeMLException – [RuntimeError] Failed to find resources.
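To show how the keyword arguments above fit together, here is a minimal sketch that requests the 'table' output type. The helper name build_training_table and the values MY_TRAINING_DATA, EVENT_TS, and LABEL are placeholders, not names defined by the library. The fs argument is assumed to be an existing FeatureStore instance.

```python
from typing import Any, List


def build_training_table(fs: Any, spine_df: Any, feature_views: List[Any]) -> Any:
    """Hypothetical helper: join features onto a spine and return a DataFrame."""
    return fs.generate_dataset(
        name="MY_TRAINING_DATA",         # placeholder dataset name
        spine_df=spine_df,
        features=feature_views,
        spine_timestamp_col="EVENT_TS",  # placeholder; feature views then need a timestamp_col
        spine_label_cols=["LABEL"],      # placeholder label column in spine_df
        output_type="table",             # return a Snowpark DataFrame rather than a Dataset
    )
```

Because version is omitted, the current timestamp is used as the version, as described in the parameter list.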
- get_entity(name: str) → Entity
Retrieve a previously registered Entity object.
- Parameters:
name – Entity name.
- Returns:
Entity object.
- Raises:
SnowflakeMLException – [ValueError] Entity is not found.
SnowflakeMLException – [RuntimeError] Failed to retrieve tag reference information.
SnowflakeMLException – [RuntimeError] Failed to find resources.
- get_feature_view(name: str, version: str) → FeatureView
Retrieve a previously registered FeatureView.
- Parameters:
name – FeatureView name.
version – FeatureView version.
- Returns:
FeatureView object.
- Raises:
SnowflakeMLException – [ValueError] No FeatureView with the given name and version was found, or an exception occurred while reconstructing the FeatureView object.
- list_entities() → DataFrame
List all Entities in the FeatureStore.
- Returns:
Snowpark DataFrame containing the results.
- list_feature_views(entity_name: Optional[str] = None, feature_view_name: Optional[str] = None) → DataFrame
List FeatureViews in the FeatureStore. If entity_name is specified, only FeatureViews associated with that Entity are listed. If feature_view_name is specified, the results are further reduced to those matching that name.
- Parameters:
entity_name – Entity name.
feature_view_name – FeatureView name.
- Returns:
FeatureViews information as a Snowpark DataFrame.
- load_feature_views_from_dataset(ds: Dataset) → List[Union[FeatureView, FeatureViewSlice]]
Retrieve FeatureViews used during Dataset construction.
- Parameters:
ds – Dataset object created from feature store.
- Returns:
List of FeatureViews used during Dataset construction.
- Raises:
ValueError – If the Dataset object was not generated from a feature store.
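One plausible use of this method is to rebuild, at inference time, the same feature set that was used for training. The sketch below chains it with retrieve_feature_values, whose features parameter accepts a list of FeatureView or FeatureViewSlice objects. The helper name enrich_for_inference is hypothetical, and fs is assumed to be an existing FeatureStore instance.

```python
from typing import Any


def enrich_for_inference(fs: Any, ds: Any, spine_df: Any) -> Any:
    """Hypothetical helper: reuse a Dataset's feature views to enrich new rows."""
    # Recover the FeatureView / FeatureViewSlice objects the Dataset was built from.
    features = fs.load_feature_views_from_dataset(ds)
    # Join the same features onto the new spine DataFrame.
    return fs.retrieve_feature_values(spine_df=spine_df, features=features)
```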
- read_feature_view(feature_view: FeatureView) → DataFrame
Read FeatureView data.
- Parameters:
feature_view – FeatureView to retrieve data from.
- Returns:
Snowpark DataFrame (lazy mode) containing the FeatureView data.
- Raises:
SnowflakeMLException – [ValueError] FeatureView is not registered.
- register_entity(entity: Entity) → Entity
Register an Entity in the FeatureStore.
- Parameters:
entity – Entity object to register.
- Returns:
A registered entity object.
- Raises:
SnowflakeMLException – [RuntimeError] Failed to find resources.
- register_feature_view(feature_view: FeatureView, version: str, block: bool = True, overwrite: bool = False) → FeatureView
Materialize a FeatureView to the Snowflake backend. Incremental maintenance for updates on the source data will be automated if refresh_freq is set. NOTE: Each new materialization will trigger a full FeatureView history refresh for the data included in the FeatureView.
Examples
    …
    draft_fv = FeatureView(name="my_fv", entities=[entities], feature_df=feature_df)
    registered_fv = fs.register_feature_view(feature_view=draft_fv, version="v1")
    …
- Parameters:
feature_view – FeatureView instance to materialize.
version – Version of the registered FeatureView. NOTE: The version accepts only letters, numbers, and underscores. The version will also be capitalized.
block – Specifies whether the FeatureView backend materialization should be blocking. If blocking, the API waits until the initial FeatureView data is generated. Defaults to True.
overwrite – Overwrite the existing FeatureView with the same version. This is equivalent to dropping the FeatureView and then recreating it. NOTE: A backfill cost is incurred if the FeatureView is being continuously maintained.
- Returns:
A materialized FeatureView object.
- Raises:
SnowflakeMLException – [ValueError] FeatureView entity has not been registered.
SnowflakeMLException – [ValueError] Warehouse or default warehouse is not specified.
SnowflakeMLException – [RuntimeError] Failed to create dynamic table, task, or view.
SnowflakeMLException – [RuntimeError] Failed to find resources.
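The version rule stated for register_feature_view can be checked before calling the API. The sketch below is not the library's own validation: normalize_version is a hypothetical helper, and it assumes that "capitalized" means the whole string is converted to uppercase.

```python
import re

# Letters, numbers, and underscores only, as stated in the version parameter note.
_VERSION_RE = re.compile(r"[A-Za-z0-9_]+")


def normalize_version(version: str) -> str:
    """Hypothetical pre-check mirroring the documented version rule."""
    if not _VERSION_RE.fullmatch(version):
        raise ValueError(
            f"Invalid version {version!r}: only letters, numbers, and underscores are accepted."
        )
    # Assumption: "capitalized" is read here as upper-casing the whole string.
    return version.upper()
```

Under that assumption, "v1" and "V1" would refer to the same registered version, while a string such as "v1.0" would be rejected by this pre-check.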
- resume_feature_view(feature_view: FeatureView) → FeatureView
Resume a previously suspended FeatureView.
- Parameters:
feature_view – FeatureView to resume.
- Returns:
A new feature view with updated status.
- retrieve_feature_values(spine_df: DataFrame, features: Union[List[Union[FeatureView, FeatureViewSlice]], List[str]], spine_timestamp_col: Optional[str] = None, exclude_columns: Optional[List[str]] = None, include_feature_view_timestamp_col: bool = False) → DataFrame
Enrich spine dataframe with feature values. Mainly used to generate inference data input. If spine_timestamp_col is specified, point-in-time feature values will be fetched.
- Parameters:
spine_df – Snowpark DataFrame to join features into.
features – List of features to join into the spine_df. Can be a list of FeatureView or FeatureViewSlice, or a list of serialized feature objects from Dataset.
spine_timestamp_col – Timestamp column in spine_df for point-in-time feature value lookup.
exclude_columns – Column names to exclude from the result dataframe.
include_feature_view_timestamp_col – If True, the result includes the timestamp column of each feature view that has one. Defaults to False.
- Returns:
Snowpark DataFrame containing the joined results.
- Raises:
ValueError – If features is empty.
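The idea of a point-in-time lookup can be illustrated in plain Python, independent of Snowflake. The sketch below is not the library's implementation: point_in_time_lookup is a hypothetical function, timestamps are plain integers, and it assumes the lookup returns the most recent feature value whose timestamp is at or before the spine row's timestamp.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple


def point_in_time_lookup(
    spine: List[Tuple[str, int]],
    feature_rows: List[Tuple[str, int, float]],
) -> List[Tuple[str, int, Optional[float]]]:
    """For each (entity, ts) in spine, pick the latest feature value with feature ts <= ts."""
    # Group feature rows by entity key and sort each group by timestamp.
    history: Dict[str, List[Tuple[int, float]]] = {}
    for key, ts, value in feature_rows:
        history.setdefault(key, []).append((ts, value))
    for rows in history.values():
        rows.sort(key=lambda row: row[0])

    result: List[Tuple[str, int, Optional[float]]] = []
    for key, spine_ts in spine:
        rows = history.get(key, [])
        timestamps = [ts for ts, _ in rows]
        # Count of feature rows with timestamp <= spine_ts (inclusive by assumption).
        idx = bisect_right(timestamps, spine_ts)
        value = rows[idx - 1][1] if idx > 0 else None
        result.append((key, spine_ts, value))
    return result
```

Spine rows that precede every feature row for their entity, or whose entity has no feature rows at all, get None rather than a value from the future, which is what prevents label leakage in training data.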
- suspend_feature_view(feature_view: FeatureView) → FeatureView
Suspend an active FeatureView.
- Parameters:
feature_view – FeatureView to suspend.
- Returns:
A new feature view with updated status.
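Since both suspend_feature_view and resume_feature_view return a new feature view, callers should keep working with the returned object. The sketch below pauses a FeatureView around some maintenance step. The helper name run_while_suspended and the action callback are hypothetical, and fs is assumed to be an existing FeatureStore instance.

```python
from typing import Any, Callable


def run_while_suspended(
    fs: Any, name: str, version: str, action: Callable[[], None]
) -> Any:
    """Hypothetical helper: suspend a FeatureView, run action, then resume it."""
    fv = fs.get_feature_view(name, version)
    # Keep the returned object: it carries the updated status.
    suspended = fs.suspend_feature_view(fv)
    try:
        action()
    finally:
        # Resume even if action raised; the exception still propagates afterwards.
        resumed = fs.resume_feature_view(suspended)
    return resumed
```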
- update_default_warehouse(warehouse_name: str) → None
Update default warehouse for feature store.
- Parameters:
warehouse_name – Name of warehouse.
- Raises:
SnowflakeMLException – If the warehouse does not exist.
- update_feature_view(name: str, version: str, refresh_freq: Optional[str] = None, warehouse: Optional[str] = None) → FeatureView
Update a registered feature view.
Check feature_view.py for which fields are allowed to be updated after registration.
- Parameters:
name – Name of the FeatureView to be updated.
version – Version of the FeatureView to be updated.
refresh_freq – Updated refresh frequency.
warehouse – Updated warehouse.
- Returns:
Updated FeatureView.
- Raises:
SnowflakeMLException – [RuntimeError] If FeatureView is not managed and refresh_freq is defined.
SnowflakeMLException – [RuntimeError] Failed to update feature view.