FORECAST

Fully-qualified name: SNOWFLAKE.ML.FORECAST

A forecast model produces a forecast for a single time series or for multiple time series. Use CREATE SNOWFLAKE.ML.FORECAST to create and train the forecasting model, then use the model’s <model_name>!FORECAST method to produce forecasts. The <model_name>!EXPLAIN_FEATURE_IMPORTANCE method reports how much each feature in the training data influences the forecast.

CREATE SNOWFLAKE.ML.FORECAST

Creates a new forecast model from the training data you provide, or replaces an existing forecast model of the same name.

Syntax

CREATE [ OR REPLACE ] SNOWFLAKE.ML.FORECAST [ IF NOT EXISTS ] <model_name>(
    INPUT_DATA => <input_data>,
    [SERIES_COLNAME => '<series_colname>',]
    TIMESTAMP_COLNAME => '<timestamp_colname>',
    TARGET_COLNAME => '<target_colname>')
  [ [ WITH ] TAG ( <tag_name> = '<tag_value>' [ , <tag_name> = '<tag_value>' , ... ] ) ]
  [ COMMENT = '<string_literal>' ]

Note

Named arguments make argument order irrelevant and produce more readable code. However, you can also use positional arguments, as in the following example:

CREATE SNOWFLAKE.ML.FORECAST <name>(
  '<input_data>', '<series_colname>', '<timestamp_colname>', '<target_colname>');

Parameters

model_name

Specifies the identifier for the model; must be unique for the schema in which the model is created.

If the model identifier is not fully qualified (in the form of db_name.schema_name.name or schema_name.name), the command creates the model in the current schema for the session.

In addition, the identifier must start with an alphabetic character and cannot contain spaces or special characters unless the entire identifier string is enclosed in double quotes (for example, "My object"). Identifiers enclosed in double quotes are also case-sensitive.

For more details, see Identifier requirements.

Constructor Arguments

Required:

INPUT_DATA => input_data

A reference to the input data. Using a reference allows the training process, which runs with limited privileges, to use your privileges to access the data. You can use a reference to a table or a view if your data is already in that form, or you can use a query reference to provide the query to be executed to obtain the data.

The referenced data is the entire training data consumed by the forecasting model. If input_data contains any columns that are not named as timestamp_colname, target_colname, or series_colname, they are considered exogenous variables (additional features).

Order of the columns in the input data is irrelevant.

Your input data must have columns with appropriate types for your use case. See Examples for details on each use case.

Use Case

Columns and types

Single time series

Multiple time series

Single time series with exogenous variables

Multiple time series with exogenous variables

TIMESTAMP_COLNAME => 'timestamp_colname'

Name of the column containing the timestamps in input_data.

TARGET_COLNAME => 'target_colname'

Name of the column containing the training label (dependent value) in input_data.

Optional:

SERIES_COLNAME => 'series_colname'

For multiple time series models, the name of the column defining the multiple time series in input_data. This column can be a value of any type, or an array of values from one or more other columns, as shown in Forecast on Multiple Series.

If you are providing arguments positionally, this must be the second argument.
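
As a rough illustration of the constructor arguments described above, the following sketch trains one single-series model and one multi-series model. It assumes an existing, populated table sales_data(store_id, sale_date TIMESTAMP_NTZ, total_sold FLOAT); that table, the view, and the model names are hypothetical. It also assumes the references are built with the SYSTEM$REFERENCE and SYSTEM$QUERY_REFERENCE functions.

```sql
-- Assumption: sales_data already exists and contains historical rows.
-- Single series: expose only the timestamp and target columns through a view,
-- so that no extra column is treated as an exogenous variable.
CREATE OR REPLACE VIEW v_sales_store_1 AS
  SELECT sale_date, total_sold
  FROM sales_data
  WHERE store_id = 1;

CREATE SNOWFLAKE.ML.FORECAST single_store_model(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'v_sales_store_1'),
  TIMESTAMP_COLNAME => 'sale_date',
  TARGET_COLNAME => 'total_sold'
);

-- Multiple series: pass a query reference and name the series column.
CREATE SNOWFLAKE.ML.FORECAST all_stores_model(
  INPUT_DATA => SYSTEM$QUERY_REFERENCE(
    'SELECT store_id, sale_date, total_sold FROM sales_data'),
  SERIES_COLNAME => 'store_id',
  TIMESTAMP_COLNAME => 'sale_date',
  TARGET_COLNAME => 'total_sold'
);
```

Selecting only the needed columns in the view or query keeps unrelated columns from being picked up as additional features.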

Usage Notes

Replication of class instances is currently not supported.

Examples

See Examples.

SHOW SNOWFLAKE.ML.FORECAST

Lists all forecasting models.

Syntax

SHOW SNOWFLAKE.ML.FORECAST [ LIKE <pattern> ]
                           [ IN
                               {
                                   ACCOUNT                  |

                                   DATABASE                 |
                                   DATABASE <database_name> |

                                   SCHEMA                   |
                                   SCHEMA <schema_name>     |
                                   <schema_name>
                                }
                            ]

Parameters

LIKE 'pattern'

Filters the command output by object name. The filter uses case-insensitive pattern matching with support for SQL wildcard characters (% and _).

For example, the following patterns return the same results:

... LIKE '%testing%' ...
... LIKE '%TESTING%' ...

[ IN ... ]

Optionally specifies the scope of the command. Specify one of the following:

ACCOUNT

Returns records for the entire account.

DATABASE, DATABASE db_name

Returns records for the current database in use or for a specified database (db_name).

If you specify DATABASE without db_name and no database is in use, the keyword has no effect on the output.

SCHEMA, SCHEMA schema_name, schema_name

Returns records for the current schema in use or a specified schema (schema_name).

SCHEMA is optional if a database is in use or if you specify the fully qualified schema_name (for example, db.schema).

If no database is in use, specifying SCHEMA has no effect on the output.

Default: Depends on whether the session currently has a database in use:

  • Database: DATABASE is the default (that is, the command returns the objects you have privileges to view in the database).

  • No database: ACCOUNT is the default (that is, the command returns the objects you have privileges to view in your account).
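
For illustration, the scope and filter parameters above might be combined as follows. The database and schema names (my_db, my_schema) and the name pattern are hypothetical.

```sql
-- List every forecast model you have privileges to view in the account.
SHOW SNOWFLAKE.ML.FORECAST IN ACCOUNT;

-- List models in one schema whose names contain "sales"
-- (pattern matching is case-insensitive).
SHOW SNOWFLAKE.ML.FORECAST LIKE '%sales%' IN SCHEMA my_db.my_schema;
```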

Output

The command output provides model properties and metadata in the following columns:

Column            Description
created_on        Date and time when the model was created
name              Name of the model
database_name     Database in which the model is stored
schema_name       Schema in which the model is stored
current_version   The version of the model, currently 1
comment           Comment for the model
owner             The role that owns the model

DROP SNOWFLAKE.ML.FORECAST

Removes the specified model from the current or specified schema.

Syntax

DROP SNOWFLAKE.ML.FORECAST [IF EXISTS] <name>;

Parameters

name

Specifies the identifier for the model to drop. If the identifier contains spaces, special characters, or mixed-case characters, the entire string must be enclosed in double quotes. Identifiers enclosed in double quotes are also case-sensitive.

If the model identifier is not fully qualified (in the form of db_name.schema_name.name or schema_name.name), the command looks for the model in the current schema for the session.

Usage Notes

Dropped models cannot be recovered; they must be recreated.
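
A minimal sketch of the two identifier forms described above; both model names are hypothetical.

```sql
-- Drop a model by its fully qualified name.
-- IF EXISTS means no error is raised when the model is absent.
DROP SNOWFLAKE.ML.FORECAST IF EXISTS my_db.my_schema.sales_model;

-- An identifier with spaces or mixed case must be double-quoted
-- and matches case-sensitively.
DROP SNOWFLAKE.ML.FORECAST IF EXISTS "My Forecast Model";
```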

<model_name>!FORECAST

Generates a forecast from the previously trained model model_name.

Syntax

The required arguments vary depending on what use case the model was trained for.

For single-series models without exogenous variables:

<model_name>!FORECAST(
  FORECASTING_PERIODS => <forecasting_periods>,
  [CONFIG_OBJECT => <config_object>]
);

For single-series models with exogenous variables:

<model_name>!FORECAST(
  INPUT_DATA => <input_data>,
  TIMESTAMP_COLNAME => '<timestamp_colname>',
  [CONFIG_OBJECT => <config_object>]
);

For multiple-series models without exogenous variables:

<model_name>!FORECAST(
  SERIES_VALUE => <series>,
  FORECASTING_PERIODS => <forecasting_periods>,
  [CONFIG_OBJECT => <config_object>]
);

For multiple-series models with exogenous variables:

<model_name>!FORECAST(
  SERIES_VALUE => <series>,
  SERIES_COLNAME => '<series_colname>',
  INPUT_DATA => <input_data>,
  TIMESTAMP_COLNAME => '<timestamp_colname>',
  [CONFIG_OBJECT => <config_object>]
);

Arguments

Required:

Not all of the following arguments are required for every use case previously listed.

FORECASTING_PERIODS => forecasting_periods

Required for forecasts without exogenous variables.

The number of steps ahead to forecast. The interval between steps is inferred by the model during training.

INPUT_DATA => input_data

Required for forecasts with exogenous variables. Your exogenous variables must be in the same order at this step as they were in the training step.

A reference to a table, view, or query that contains the future timestamps and values of the exogenous variables (additional user-provided features) that were passed as input_data when training the model. Using a reference allows the forecasting process, which runs with limited privileges, to use your privileges to access the data. Columns are matched between this argument and the original exogenous training data by name.

TIMESTAMP_COLNAME => 'timestamp_colname'

Required for forecasts with exogenous variables.

The name of the column in input_data containing the timestamps.

SERIES_COLNAME => 'series_colname'

Required for multi-series forecasts with exogenous variables.

The name of the column in input_data specifying the series.

SERIES_VALUE => series

Required for multi-series forecasts.

The time series to forecast. Can be a single value (for example, 'Series A'::variant) or a VARIANT, but must specify a series that the model has been trained on. If not specified, all trained series are predicted.

Optional:

CONFIG_OBJECT => config_object

An OBJECT containing key-value pairs used to configure the forecast job. At present, there is only one configuration option.

Key                   Type    Description
prediction_interval   FLOAT   A value in [0, 1). Defaults to 0.95, meaning 95% of future points are expected to fall within the interval [lower_bound, upper_bound] from the forecast result.
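
The argument combinations above can be sketched as follows. This assumes the method is invoked with CALL and that three models have already been trained: single_store_model (single series), regional_model (multiple series, including a series named 'Series A'), and promo_model (single series with exogenous variables). These model names, the view v_future_promos, and the column sale_date are hypothetical.

```sql
-- Single series, no exogenous variables: forecast 7 steps ahead
-- with an 80% prediction interval instead of the default 95%.
CALL single_store_model!FORECAST(
  FORECASTING_PERIODS => 7,
  CONFIG_OBJECT => {'prediction_interval': 0.8}
);

-- Multiple series, no exogenous variables: forecast one trained series.
CALL regional_model!FORECAST(
  SERIES_VALUE => 'Series A'::VARIANT,
  FORECASTING_PERIODS => 7
);

-- Single series with exogenous variables: the view supplies the future
-- timestamps and the future values of each exogenous column.
CALL promo_model!FORECAST(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'v_future_promos'),
  TIMESTAMP_COLNAME => 'sale_date'
);
```

With exogenous variables, the forecast horizon is determined by the rows in the input data, which is why FORECASTING_PERIODS does not appear in the last call.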

Output

The SERIES column is present only for multi-series forecasts. Single-series forecasts do not have this column.

An additional column is also present for each exogenous variable. The number, names, and types of these vary, so they are not included here.

Column        Type            Description
SERIES        VARIANT         Series for the predicted value in this row (for multi-series forecasts)
TS            TIMESTAMP_NTZ   Timestamp
FORECAST      FLOAT           Forecast target value
LOWER_BOUND   FLOAT           Lower boundary for prediction interval
UPPER_BOUND   FLOAT           Upper boundary for prediction interval
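
One possible way to keep these columns for later use is to read the result of the call with RESULT_SCAN. The sketch below assumes a previously trained single-series model named single_store_model and a target table named store_forecast; both names are hypothetical.

```sql
-- Produce the forecast; the result set has the columns listed above.
CALL single_store_model!FORECAST(FORECASTING_PERIODS => 7);

-- Read the result set of the CALL that just ran and persist it.
CREATE OR REPLACE TABLE store_forecast AS
  SELECT ts, forecast, lower_bound, upper_bound
  FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
```

RESULT_SCAN must run in the same session immediately after the CALL, because LAST_QUERY_ID() refers to the most recently executed statement.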

<model_name>!EXPLAIN_FEATURE_IMPORTANCE

Returns the relative feature importance for each feature used by the model.

Syntax

<model_name>!EXPLAIN_FEATURE_IMPORTANCE();

Output

Column             Type      Description
SERIES             VARIANT   Series value (only present if model was trained with multiple time series)
RANK               INTEGER   The importance rank of a feature for a particular series
FEATURE_NAME       VARCHAR   The name of the feature used to train the model. aggregated_endogenous_features represents all features derived as transformations of your target variable.
IMPORTANCE_SCORE   FLOAT     The feature’s importance score: a value in [0, 1], with 0 being the lowest possible importance, and 1 the highest.
FEATURE_TYPE       VARCHAR   The source of the feature, one of: user_provided, derived_from_timestamp, derived_from_endogenous
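
As a hedged sketch of how this output might be inspected, the following calls the method and then filters the result. It assumes a previously trained model named promo_model (hypothetical), that the method is invoked with CALL, and that the result columns can be referenced by the unquoted names shown in the table above.

```sql
-- Return one row per feature (and per series for multi-series models).
CALL promo_model!EXPLAIN_FEATURE_IMPORTANCE();

-- Keep only the user-provided (exogenous) features, most important first.
SELECT feature_name, importance_score
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE feature_type = 'user_provided'
ORDER BY importance_score DESC;
```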

Examples

See Examples.