Introduction to tasks

Use Tasks to automate, schedule, and optimize business procedures on your data pipeline.

In Snowflake, Tasks can execute any one of the following types of functions:

  • A single SQL statement.

  • A call to a stored procedure.

  • Procedural logic using Snowflake Scripting.

For complicated procedures, like generating reports or performing periodic table maintenance, combine tasks by creating task graphs. To continuously process new or changed data as it comes in, as in Extract, Load, Transform (ELT) workflows, use Triggered tasks to combine tasks with table streams.

Note

Table schema evolution is not supported by tasks.

Task creation workflow overview

  1. Create a task administrator role that can run the commands in the following steps.

  2. Define a new task using CREATE TASK. The task is suspended by default. For information about setting the task parameters, see:

  3. Manually test tasks using EXECUTE TASK.

  4. Change the task, or allow it to run continuously using ALTER TASK … RESUME. For information about running Tasks, see:

Compute resources

Tasks require compute resources to execute SQL, Python, Java, and Scala functions, as well as stored procedures. For each Task, you can choose whether to allow Snowflake to manage the resources by creating serverless tasks, or to manage it yourself using the user-managed virtual warehouse model.

Serverless Tasks

With this model, Tasks execute on Snowflake-managed compute. Snowflake automatically resizes the resources as required for each workload.

Tasks that run regularly are good candidates for the serverless compute model. This model is also recommended when you have under-utilized warehouses that have few existing tasks running on them.

Snowflake determines the ideal size of compute resources for a given run based on a dynamic analysis of statistics for the most recent runs of the same task.

Limitations

  • The maximum compute size for a serverless task is equivalent to an X2LARGE virtual warehouse.

Create a task using the serverless compute model

When you use CREATE TASK or ALTER TASK, leave out the WAREHOUSE parameter.

The role that executes the Task must have the global EXECUTE MANAGED TASK privilege. For more information about the access control requirements for tasks, see Task security.

You can take some control over the cost and performance of serverless tasks by setting the following parameters:

  • SERVERLESS_TASK_MAX_STATEMENT_SIZE: the maximum size of warehouse to prevent unexpected costs.

  • SERVERLESS_TASK_MIN_STATEMENT_SIZE: the minimum size of warehouse for predictable performance.

  • TARGET_COMPLETION_INTERVAL: the desired task completion time.

When these settings are specified, the precedence is as follows:

  • SERVERLESS_TASK_MAX_STATEMENT_SIZE

  • SERVERLESS_TASK_MIN_STATEMENT_SIZE

  • TARGET_COMPLETION_INTERVAL

How Snowflake estimates serverless compute resources

Snowflake automatically estimates the compute resources needed to complete a task based on the target time frame and the performance of previous task runs.

The target time frame can be set by using TARGET_COMPLETION_INTERVAL. If this value is not set, Snowflake resizes serverless compute resources to complete before the next scheduled execution time.

After a task completes, Snowflake estimates compute resources needed as follows:

  • If tasks are completing too quickly (within a time frame that is a factor less than the target time frame), Snowflake scales down resources in subsequent runs by one warehouse size.

  • If tasks are completing too slowly (within a time frame that is a factor more than the target time frame), Snowflake scales up resources in subsequent runs by one warehouse size.

Examples

Example 1: Create a serverless task to run every hour with a target completion interval of 120 minutes:

CREATE TASK SCHEDULED_T1 SCHEDULE='USING CRON 0 * * * * America/Los_Angeles'
TARGET_COMPLETION_INTERVAL='120 MINUTE' AS SELECT 1;
Copy

By setting TARGET_COMPLETION_INTERVAL with a relatively long duration, this indicates it’s acceptable for the task execution to run longer. Snowflake automatically sizes serverless compute so that the task execution completes within the specified TARGET_COMPLETION_INTERVAL.

Example 2: Create a serverless task to run every hour with a target completion time of 10 minutes:

CREATE TASK SCHEDULED_T2 SCHEDULE='USING CRON 0 * * * * UTC'
TARGET_COMPLETION_INTERVAL='10 MINUTE'
SERVERLESS_TASK_MAX_STATEMENT_SIZE='LARGE'
AS SELECT 1;
Copy

By setting TARGET_COMPLETION_INTERVAL with a relatively short duration, this indicates a preference for the task to quickly complete. Snowflake automatically sizes serverless compute so that the task execution completes within the specified TARGET_COMPLETION_INTERVAL until the SERVERLESS_TASK_MAX_STATEMENT_SIZE is reached.

Example 3: Create a serverless task scheduled once a day at some point within a target completion interval and range of minimum and maximum warehouse sizes:

CREATE TASK SCHEDULED_T3 SCHEDULE='USING CRON 0 0 * * * UTC'
TARGET_COMPLETION_INTERVAL='180 M'
SCHEDULING_MODE = 'FLEXIBLE'
SERVERLESS_TASK_MIN_STATEMENT_SIZE='MEDIUM' SERVERLESS_TASK_MAX_STATEMENT_SIZE='LARGE' AS SELECT 1;
Copy

With this configuration, the serverless task runs once a day at midnight on at least a medium warehouse with a TARGET_COMPLETION_INTERVAL of 3 hours. Snowflake automatically sizes serverless compute so that the task execution completes within the specified TARGET_COMPLETION_INTERVAL until the SERVERLESS_TASK_MAX_STATEMENT_SIZE is reached. By setting SCHEDULING_MODE as flexible, the task can begin at any time during the day.

Example 4: Create a serverless task scheduled once a day without a target completion interval:

CREATE TASK SCHEDULED_T4 SCHEDULE='USING CRON 0 0 * * * UTC'
SERVERLESS_TASK_MAX_STATEMENT_SIZE='LARGE' AS SELECT 1;
Copy

With this configuration, the serverless task runs once a day at midnight. If a series of task execution durations don’t complete before the next scheduled time, Snowflake automatically sizes serverless compute so that the task execution completes within the specified SCHEDULE until the SERVERLESS_TASK_MAX_STATEMENT_SIZE is reached.

For more information about the parameters and syntax, see CREATE TASK and ALTER TASK.

User-managed virtual warehouse model

With this model, you have full control of the compute resources used for each workload.

This model is recommended when you want to manually manage unpredictable loads on compute resources. Multi-cluster warehouses with auto-suspend and auto-resume enabled can also help moderate your credit consumption.

Choosing a warehouse size

If you choose to use existing warehouses to supply the compute resources for individual tasks, follow the best practices described in Warehouse considerations. To understand compute needs for your task, analyze the average run time for a single task or task graph using different warehouses based on warehouse size and clustering. You should also consider whether the warehouse is shared by multiple processes.

To analyze the average run time of your tasks, query the TASK_HISTORY account usage view. The average difference between the scheduled and completed times for a task is the expected average run time for the task. This difference includes the time the task was queued while other processes are using the compute resources in the warehouse.

For task graphs, there is a brief lag after a predecessor task finishes running and any child task starts. Choose a warehouse size large enough to accommodate multiple child tasks running simultaneously.

The following diagram shows a window of 1 minute in which a single task queued for 20 seconds and then ran for 40 seconds. This means that other processes were using the warehouse resources for the first 20 seconds after the task was scheduled.

Example task batch window

The following diagram shows a task graph that requires 5 minutes on average to complete for each run. The diagram shows the window for 2 runs of the task graph to complete. This window is calculated from the time the root task is scheduled to start until the last child task has completed running.

In this example, the warehouse the task graph is running on is shared with other concurrent operations. These concurrent operations consume all available resources when each task in the task graph finishes running and before the next task starts running. As a result, the window for each task includes some amount of queuing while it waits for other operations to finish and free up compute resources.

Example task graph batch window

Running tasks

This section describes the different ways that a task can be scheduled and run, how task failures are handled, and how the version of a task is determined.

Schedule a task

Tasks generally run on a schedule.

You can define the schedule when creating a task using CREATE TASK or later using ALTER TASK. When you combine tasks into a task graph, define the schedule using the CREATE TASK and ALTER TASK commands in the root task.

Snowflake ensures only one instance of a task with a schedule is executed at a time. If a task is still running when the next scheduled execution time occurs, then that scheduled time is skipped.

Snowflake automatically resizes and scales the compute resources for serverless tasks. For user-managed tasks, choose an appropriate warehouse size for the task to complete its workload within the schedule. For information, see Choosing a Warehouse Size.

Task scheduling and daylight saving time

The cron expression in a task definition supports specifying a time zone. Tasks scheduled when the transition from standard time to daylight saving time, or the reverse, occurs can have unexpected behaviors.

For example:

  • During the change from daylight saving time to standard time, a task scheduled to start at 1 AM in the America/Los_Angeles time zone (0 1 * * * America/Los_Angeles) would run twice. At 1 AM and then again when 1:59:59 AM shifts to 1:00:00 AM local time.

  • During the change from standard time to daylight saving time, a task scheduled to start at 2 AM in the America/Los_Angeles time zone (0 2 * * * America/Los_Angeles) would not run because the local time shifts from 1:59:59 AM to 3:00:00 AM.

To avoid unexpected task executions due to daylight saving time, consider the following:

  • Don’t schedule tasks to run between 1 AM and 3 AM.

  • Manually adjust the cron expression for tasks scheduled between 1 AM and 3 AM twice each year to compensate for the time change.

  • Use a time format that does not apply daylight savings time, such as UTC.

Triggered Tasks

Use Triggered Tasks to automatically run tasks whenever a defined stream has new data. This eliminates the need to poll a source frequently when the availability of new data is unpredictable. It also reduces latency because data is processed immediately.

Triggered Tasks don’t use compute resources until the defined stream has data and triggers the task run.

The following are run conditions for Triggered Tasks:

  • When data is changed in the table (or tables) that the associated stream tracks.

  • When the task is first resumed, to consume any data already in the stream.

  • Triggered Tasks automatically run a health check every 12 hours to prevent the stream from becoming stale. If no data is in the stream, Snowflake skips the run without using compute resources.

Create a Triggered Task

Use CREATE TASK, and define the target stream using the WHEN clause. (Do not include the SCHEDULE parameter.)

Assign compute resources for the task by choosing one of the following:

  • To manually manage the resources (user-managed warehouse task), add the WAREHOUSE parameter.

  • To allow Snowflake to manage resources (serverless task), add the TARGET_COMPLETION_INTERVAL parameter. Snowflake estimates the resources needed and adjusts to complete the task in this time.

Migrate an existing task from a scheduled task to a Triggered Task

Suspend the task, unset the SCHEDULE parameter, and resume the task. The existing task must have a target stream defined in the WHEN clause.

ALTER TASK task SUSPEND;
ALTER TASK task UNSET SCHEDULE;
ALTER TASK task RESUME;
Copy

Migrate an existing user-managed Triggered Task to a serverless Triggered Task

Suspend the task. Remove the WAREHOUSE parameter and add the TARGET_COMPLETION_INTERVAL parameter. Resume the task.

Triggered Tasks considerations

  • In the SHOW TASKS and DESC TASK output, the SCHEDULE property displays NULL for Triggered Tasks.

  • In the output of the task_history view of the information_schema and account_usage schemas, the SCHEDULED_FROM column displays TRIGGER.

  • When working with multiple data streams, you can manage when tasks start by using conditional parameters: WHEN AND and WHEN OR.

Triggered Tasks Limitations

  • By default, Triggered Tasks run at most every 30 seconds. You can modify the USER_TASK_MINIMUM_TRIGGER_INTERVAL_IN_SECONDS parameter to run more frequently, up to every 10 seconds.

  • Streams on Directory Tables, External Tables, and Hybrid Tables are not supported.

Triggered Tasks examples

Example 1: Create a user-managed task that runs whenever data changes in either of two streams:

CREATE TASK my_task
  WHEN SYSTEM$STREAM_HAS_DATA('my_return_stream')
  OR   SYSTEM$STREAM_HAS_DATA('my_order_stream')
  WAREHOUSE = my_warehouse
  AS
    INSERT INTO customer_activity
    SELECT customer_id, return_total, return_date, return
    FROM my_return_stream
    UNION ALL
    SELECT customer_id, order_total, order_date, order
    FROM my_order_stream;
Copy

Example 2: Create a task to run whenever data changes are detected in two different data streams. Because the task uses the AND conditional, the task is skipped if only one of the two streams has new data.

Because TARGET_COMPLETION_INTERVAL is 120 minutes, Snowflake estimates and adjusts compute resources necessary to complete the task in this time.

CREATE TASK triggeredTask
  WHEN SYSTEM$STREAM_HAS_DATA('orders_stream')
  TARGET_COMPLETION_INTERVAL='120 MINUTES'
  AS
    INSERT INTO completed_promotions
    SELECT order_id, order_total, order_time, promotion_id
    FROM orders_stream
    WHERE promotion_id IS NOT NULL;
Copy

Run a task manually

Use CREATE TASK or ALTER TASK to create a new task and to set the task parameters.

Use EXECUTE TASK to trigger a single run of the task. This SQL command is useful for testing new or modified tasks.

You can call this SQL command directly in scripts or in stored procedures. In addition, this command supports integrating tasks in external data pipelines. Any third-party services that can authenticate into your Snowflake account and authorize SQL actions can execute the EXECUTE TASK command to run tasks.

Versioning of task runs

When a standalone task is first resumed or manually executed, an initial version of the task is set. The standalone task runs using this version. After a task is suspended and modified, a new version is set when the standalone task is resumed or manually executed.

When the task is suspended, all future scheduled runs of the task are cancelled; however, currently running tasks continue to run using the current version.

For example, suppose the task is suspended, but a scheduled run of this task has already started. The owner of the task modifies the SQL code called by the task while the task is still running. The task runs and executes the SQL code in its definition using the version of the task that was current when the task started its run. When the task is resumed or is manually executed, a new version of the task is set. This new version includes the modifications to the task.

To retrieve the history of task versions, query TASK_VERSIONS Account Usage view (in the SNOWFLAKE shared database).

Automatically suspend tasks after failed runs

Optionally suspend tasks automatically after a specified number of consecutive runs that either fail or time out. This feature can reduce costs by suspending tasks that consume Snowflake credits but fail to run to completion.

Set the SUSPEND_TASK_AFTER_NUM_FAILURES = num parameter on a task. When the parameter is set to a value greater than 0, tasks are automatically suspended after the specified number of consecutive task runs either fail or time out.

The parameter can be set when creating a task using CREATE TASK or later using ALTER TASK. You can also change this value in Snowsight.

The SUSPEND_TASK_AFTER_NUM_FAILURES parameter can also be set at the account, database, or schema level. The setting applies to all tasks contained in the modified object. Note that explicitly setting the parameter at a lower level overrides the parameter value set at a higher level.

Automatically retry failed task runs

If any task completes in a FAILED state, Snowflake can automatically retry the task. The automatic task retry is disabled by default. To enable this feature, set TASK_AUTO_RETRY_ATTEMPTS to a value greater than 0.

Tasks that use error notifications send notifications for each failed retry attempt. For more information, see Configuring a task to send error notifications.

When you set the TASK_AUTO_RETRY_ATTEMPTS parameter value at the account, database, or schema level, the change is applied to tasks contained in the modified object during their next scheduled run.

Setting session parameters for tasks

You can set session parameters for the session in which a task runs. To do so, modify an existing task and set the desired parameter values using ALTER TASKSET session_parameter = value[, session_parameter = value ... ] or edit the task in Snowsight.

A task supports all session parameters. For the complete list, see Parameters. Tasks don’t support account or user parameters.

Viewing the task history for your account

You can view the task history for your account using SQL or Snowsight. To view task history in Snowsight, refer to Viewing tasks and task graphs in Snowsight. For information about required privileges, see Viewing task history.

To view the run history for a single task:

SQL:

Query the TASK_HISTORY table function (in the Snowflake Information Schema).

To view details on a task graph run that is currently scheduled or is executing:

SQL:

Query the CURRENT_TASK_GRAPHS table function (in the Snowflake Information Schema).

To view the history for task graph runs that executed successfully, failed, or were cancelled in the past 60 minutes:

SQL:

Query the COMPLETE_TASK_GRAPHS table function (in the Snowflake Information Schema).

Query the COMPLETE_TASK_GRAPHS view view (in Account Usage).

Task costs

The costs associated with running a task to execute SQL code differ depending on the source of the compute resources for the task:

User-managed warehouse

Snowflake bills your account for credit usage based on warehouse usage while a task is running, similar to the warehouse usage for executing the same SQL statements in a client or the Snowflake web interface. Per-second credit billing and warehouse auto-suspend give you the flexibility to start with larger warehouse sizes and then adjust the size to match your task workloads.

Serverless compute model

Snowflake bills your account based on compute resource usage. Charges are calculated based on your total usage of the resources, including cloud service usage, measured in compute-hours credit usage. The compute-hours cost changes based on warehouse size and query runtime. For more information, see Serverless credit usage or Query: Total serverless task cost.

Snowflake analyzes task runs in the task history to dynamically determine the correct size and number of the serverless compute resources. As Snowflake automatically scales up and down resources to manage your task runs, the cost to run the task runs scales proportionally.

To learn how many credits are consumed by tasks, refer to the “Serverless Feature Credit Table” in the Snowflake Service Consumption Table.

Consider the following best practices to optimize for cost when you create tasks:

  • Set the SCHEDULE to run less frequently.

  • Use the auto-suspend and auto-retry parameters to prevent resource waste on failing tasks.

  • Create a budget and alert on spend limits for serverless features. For more information, see Monitor credit usage with budgets.

To retrieve the current credit usage for a specific task, query the SERVERLESS_TASK_HISTORY table function. Execute the following statement as the task owner, where <database_name> is the database that contains the task and <task_name> is the name of the task:

SET num_credits = (SELECT SUM(credits_used)
  FROM TABLE(<database_name>.information_schema.serverless_task_history(
    date_range_start=>dateadd(D, -1, current_timestamp()),
    date_range_end=>dateadd(D, 1, current_timestamp()),
    task_name => '<task_name>')
    )
  );
Copy

To retrieve the current credit usage for all serverless tasks, query the SERVERLESS_TASK_HISTORY view. Execute the following statement as an account administrator:

SELECT start_time,
  end_time,
  task_id,
  task_name,
  credits_used,
  schema_id,
  schema_name,
  database_id,
  database_name
FROM snowflake.account_usage.serverless_task_history
ORDER BY start_time, task_id;
Copy

Task security

To get started with tasks, you must have the correct access privileges. This section describes how to manage access to tasks.

For information about task graph ownership, see Manage task graph ownership.

Access control privileges

Creating tasks

Creating tasks requires a role with a minimum of the following privileges:

Object

Privilege

Notes

Account

EXECUTE MANAGED TASK

Required only for tasks that rely on serverless compute resources.

Database

USAGE

Schema

USAGE, CREATE TASK

Warehouse

USAGE

Required only for tasks that rely on user-managed warehouses.

Running tasks

After a task is created, the task owner must have the following privileges for the task to run:

Object

Privilege

Notes

Account

EXECUTE TASK

Required to run any tasks the role owns. Revoking the EXECUTE TASK privilege on a role prevents all subsequent task runs from starting under that role.

Account

EXECUTE MANAGED TASK

Required only for tasks that rely on serverless compute resources.

Database

USAGE

Schema

USAGE

Task

USAGE

Warehouse

USAGE

Required only for tasks that rely on user-managed warehouses.

In addition, the role must have the permissions required to run the SQL statement executed by the task.

Viewing task history

To view tasks, you must have one or more of the following privileges:

  • The ACCOUNTADMIN role

  • The OWNERSHIP privilege on the task

  • The global MONITOR EXECUTION privilege

Resuming or suspending tasks

In addition to the task owner, a role that has the OPERATE privilege on the task can suspend or resume the task. This role must have the USAGE privilege on the database and schema that contain the task. No other privileges are required.

When a task is resumed, Snowflake verifies that the task owner role has the privileges listed in Running tasks.

Create custom roles to manage task permissions

With custom roles you can easily manage permissions granted to each account or role in Snowflake. To make changes to permissions for all accounts or roles using the custom role, update the custom role. Or, revoke permissions by removing the custom role.

Create a custom role to create tasks

Snowflake requires different permissions to create serverless and user-managed tasks.

For example, to create user-managed tasks, create a custom role named warehouse_task_creation and grant that role the CREATE TASK and USAGE privileges on the warehouse that the role can create tasks in.

USE SYSADMIN;

CREATE ROLE warehouse_task_creation
  COMMENT = 'This role can create user-managed tasks.';
Copy
USE ACCOUNTADMIN;

GRANT CREATE TASK
  ON SCHEMA schema1
  TO ROLE warehouse_task_creation;
Copy
GRANT USAGE
  ON WAREHOUSE warehouse1
  TO ROLE warehouse_task_creation;
Copy

As an example of a role that can create serverless tasks; create a custom role named serverless_task_creation and grant the role the CREATE TASK privilege and the account level EXECUTE MANAGED TASK privilege.

USE SYSADMIN;

CREATE ROLE serverless_task_creation
  COMMENT = 'This role can create serverless tasks.';
Copy
USE ACCOUNTADMIN;

GRANT CREATE TASK
  ON SCHEMA schema1
  TO ROLE serverless_task_creation;
Copy
GRANT EXECUTE MANAGED TASK ON ACCOUNT
  TO ROLE serverless_task_creation;
Copy

Create a custom role to administer tasks

Create a custom role and with the EXECUTE TASK privilege and grant this custom role to any task owner role to allow altering their own tasks. To remove the ability for the task owner role to execute the task, revoke this custom role from the task owner role.

For example, create a custom role name taskadmin and grant that role the EXECUTE TASK privilege. Assign the taskadmin role to a task owner role named myrole:

USE ROLE securityadmin;

CREATE ROLE taskadmin;
Copy

Set the active role to ACCOUNTADMIN before granting the account-level privileges to the new role

USE ROLE accountadmin;

GRANT EXECUTE TASK, EXECUTE MANAGED TASK ON ACCOUNT TO ROLE taskadmin;
Copy

Set the active role to SECURITYADMIN to show that this role can grant a role to another role

USE ROLE securityadmin;

GRANT ROLE taskadmin TO ROLE myrole;
Copy

For more information on creating custom roles and role hierarchies, see Configuring access control.

Drop a task owner role

When you delete the owner role of a task, the task transfers ownership to the role that dropped the owner role. When a task transfers ownership, it is automatically paused and new executions aren’t scheduled until the new owner resumes the task.

If you drop the role while the task is running, the task run completes processing under the dropped role.

System service task execution

Snowflake runs tasks with the privileges of the task owner, but task runs are not associated with a user. Instead, each run is executed by a system service. Tasks are decoupled from specific users to avoid complications that can arise when users are dropped, locked due to authentication issues, or have roles removed.

Because task runs are decoupled from a user, the query history for task runs are associated with the system service. As such, there are no user credentials for this service, and no individual can assume its identity. Activity for the system service is limited to your account. The same encryption protections and other security protocols are built into this service as are enforced for other operations.

Task DDL

To support creating and managing tasks, Snowflake provides the following set of special DDL operations:

In addition, providers can view, grant, or revoke access to the necessary database objects for ELT using the following standard access control DDL:

Task functions

To support retrieving information about tasks, Snowflake provides the following set of functions: