Categories:

Data Pipeline DDL

Creates a new task in the current/specified schema or replaces an existing task.

This command also supports the following variant:

• CREATE TASK … CLONE (creates a clone of an existing task)

In this Topic:

## Syntax¶

CREATE [ OR REPLACE ] TASK [ IF NOT EXISTS ] <name>
WAREHOUSE = <string>
[ SCHEDULE = '{ <num> MINUTE | USING CRON <expr> <time_zone> }' ]
[ ALLOW_OVERLAPPING_EXECUTION = TRUE | FALSE ]
[ <session_parameter> = <value> [ , <session_parameter> = <value> ... ] ]
[ COPY GRANTS ]
[ COMMENT = '<string_literal>' ]
[ AFTER <string> ]
[ WHEN <boolean_expr> ]
AS
<sql>


## Variant Syntax¶

Creates a new task with the same parameter values:

CREATE [ OR REPLACE ] TASK <name> CLONE <source_task>
[ COPY GRANTS ]
[ ... ]


For more details, see CREATE <object> … CLONE.

## Required Parameters¶

name

String that specifies the identifier (i.e. name) for the task; must be unique for the schema in which the task is created.

In addition, the identifier must start with an alphabetic character and cannot contain spaces or special characters unless the entire identifier string is enclosed in double quotes (e.g. "My object"). Identifiers enclosed in double quotes are also case-sensitive.

For more details, see Identifier Requirements.

WAREHOUSE = string

Specifies the virtual warehouse that is used to execute the task.

sql

Any single SQL statement, or a call to a stored procedure, executed when the task runs.

Note

The SQL statement or stored procedure must be executable on its own. We highly recommend that you verify the sql executes as expected before you create the task. Tasks are intended to automate SQL statements and stored procedures that have already been tested thoroughly.

## Optional Parameters¶

SCHEDULE ...

Specifies the schedule for periodically running the task:

Note

• A schedule must be defined for a standalone task or the root task in a task tree, or the task will never run.

• A schedule cannot be specified for child tasks in a task tree (i.e. tasks that have a predecessor task set using the AFTER parameter).

• USING CRON expr time_zone

Specifies a cron expression and time zone for periodically running the task. Supports a subset of standard cron utility syntax.

For a list of time zones, see the list of tz database time zones (in Wikipedia).

The cron expression consists of the following fields:

# __________ minute (0-59)
# | ________ hour (0-23)
# | | ______ day of month (1-31, or L)
# | | | ____ month (1-12, JAN-DEC)
# | | | | _ day of week (0-6, SUN-SAT, or L)
# | | | | |
# | | | | |
* * * * *


The following special characters are supported:

*

Wildcard. Specifies any occurrence of the field.

L

Stands for “last”. When used in the day-of-week field, it allows you to specify constructs such as “the last Friday” (“5L”) of a given month. In the day-of-month field, it specifies the last day of the month.

/n

Indicates the nth instance of a given unit of time. For example, if 4/3 is specified in the month field, then task is scheduled for April, July and October (i.e. every 3 months, starting with the 4th month of the year).

Note

• The cron expression currently evaluates against the specified time zone only. Altering the TIMEZONE parameter value for the account (or setting the value at the user or session level) does not change the time zone for the task.

• The cron expression defines all valid run times for the task. Snowflake attempts to run a task based on this schedule; however, any valid run time is skipped if a previous run has not completed before the next valid run time starts.

• When both a specific day of month and day of week are included in the cron expression, then the task is scheduled on days satisfying either the day of month or day of week. For example, SCHEDULE = 'USING CRON 0 0 10-20 * TUE,THU UTC' schedules a task at 0AM on any 10th to 20th day of the month and also on any Tuesday or Thursday outside of those dates.

• num MINUTE

Specifies an interval (in minutes) of wait time inserted between runs of the task. Accepts positive integers only.

Also supports num M syntax.

To avoid ambiguity, a base interval time is set when:

The base interval time starts the interval counter from the current clock time. For example, if an INTERVAL value of 10 is set and the task is enabled at 9:03 AM, then the task runs at 9:13 AM, 9:23 AM, and so on. Note that we make a best effort to ensure absolute precision, but only guarantee that tasks do not execute before their set interval occurs (e.g. in the current example, the task could first run at 9:14 AM, but will definitely not run at 9:12 AM).

ALLOW_OVERLAPPING_EXECUTION = TRUE | FALSE

Specifies whether to allow multiple instances of the task tree to run concurrently.

Note

This parameter can only be set on a root task. The setting applies to all tasks in the tree.

The parameter can be set on standalone tasks but does not affect the task behavior. Snowflake ensures only one instance of a standalone task is running at a given time.

• TRUE ensures only one instance of a root task is running at a given time. If a root task is still running when the next scheduled run time occurs, then that scheduled time is skipped. This guarantee does not extend to child tasks. If the next scheduled run of the root task occurs while the current run of a child task is still in operation, another instance of the task tree begins.

• FALSE ensures only one instance of a particular tree of tasks is allowed to run at a time. The next run of a root task is scheduled only after all child tasks in the tree have finished running. This means that if the cumulative time required to run all tasks in the tree exceeds the explicit scheduled time set in the definition of the root task, at least one run of the task tree is skipped.

Default: FALSE

session_parameter = value [ , session_parameter = value ... ]

Specifies a comma-separated list of session parameters to set for the session when the task runs. A task supports all session parameters. For the complete list, see Session Parameters.

USER_TASK_TIMEOUT_MS = num

Specifies the time limit on a single run of the task before it times out (in milliseconds).

Note

Before you increase the time limit on a task significantly, consider whether the SQL statement initiated by the task could be optimized (either by rewriting the statement or using a stored procedure) or the warehouse size should be increased.

Values: 0 - 86400000 (1 day).

Default: 3600000 (1 hour)

COPY GRANTS

Specifies to retain the access permissions from the original task when a new task is created using any of the following CREATE TASK variants:

The parameter copies all permissions, except OWNERSHIP, from the existing task to the new task. By default, the role that executes the CREATE TASK command owns the new task.

Note:

• If the CREATE TASK statement references more than one task (e.g. create or replace task t1 clone t2;), the COPY GRANTS clause gives precedence to the task being replaced.

• The SHOW GRANTS output for the replacement task lists the grantee for the copied privileges as the role that executed the CREATE TASK statement, with the current timestamp when the statement was executed.

• The operation to copy grants occurs atomically in the CREATE TASK command (i.e. within the same transaction).

Note

This parameter is not supported currently.

COMMENT = 'string_literal'

Specifies a comment for the task.

Default: No value

AFTER string

Specifies the predecessor task for the current task. When a run of the predecessor task finishes successfully, it triggers this task (after a brief lag).

This parameter enables defining a simple tree of tasks; i.e. a set of tasks organized by their dependencies. In this context, a tree is a series of tasks that start with a scheduled root task and are linked together by their dependencies.

Note

• The root task in the tree should have a defined schedule; each of the other tasks in the tree have a defined predecessor task (i.e. a task specified using the AFTER parameter) to link them together.

• A task is limited to a single predecessor task; however, a task can have a maximum of 100 child tasks (i.e. other tasks that identify the task as a predecessor); in addition, a simple tree of tasks is limited to a maximum of 1000 tasks total (including the root task) in either a resumed or suspended state.

For example, task T2 is limited to a single predecessor task, e.g. T1; however, T1 can serve as the predecessor task for tasks T2, T3, T4, etc.

• All tasks in a simple tree must have the same task owner (i.e. a single role must have the OWNERSHIP privilege on all of the tasks in the tree).

• All tasks in a simple tree must exist in the same schema.

• If any task in a tree is cloned, the role that clones the task becomes the owner of the clone by default.

• If another role creates the clone, then the task clone can have a schedule but not a predecessor.

• Accounts are currently limited to a maximum of 10000 resumed tasks.

• Current limitations:

• Snowflake guarantees that at most one instance of a task with a defined schedule is running at a given time; however, we cannot provide the same guarantee for tasks with a defined predecessor task.

WHEN boolean_expr

Specifies a Boolean SQL expression; multiple conditions joined with AND/OR are supported. When a task is triggered (based on its SCHEDULE or AFTER setting), it validates the conditions of the expression to determine whether to execute. If the conditions of the expression are not met, then the task skips the current run. Any tasks that identify this task as a predecessor also do not run.

Currently, only the following function is supported for evaluation in the SQL expression:

## Examples¶

Create a task that inserts the current timestamp into a table every hour starting at 9 AM and ending at 5 PM on Sundays (America/Los_Angeles time zone). The task sets the TIMESTAMP_INPUT_FORMAT parameter for the session in which the task runs:

CREATE TASK mytask_hour
WAREHOUSE = mywh
SCHEDULE = 'USING CRON 0 9-17 * * SUN America/Los_Angeles'
TIMESTAMP_INPUT_FORMAT = 'YYYY-MM-DD HH24'
AS
INSERT INTO mytable(ts) VALUES(CURRENT_TIMESTAMP);


SCHEDULE Value

Description

* * * * * UTC

Every minute. UTC time zone.

0 2 * * * UTC

Every night at 2 AM. UTC time zone.

0 5,17 * * * UTC

Twice daily, at 5 AM and 5 PM (at the top of the hour). UTC time zone.

30 2 L 6 * UTC

In June, on the last day of the month, at 2:30 AM. UTC time zone.

Create a task that inserts the current timestamp into a table every 5 minutes:

CREATE TASK mytask_minute
WAREHOUSE = mywh
SCHEDULE = '5 MINUTE'
AS
INSERT INTO mytable(ts) VALUES(CURRENT_TIMESTAMP);


Create a task that inserts change tracking data for INSERT operations from a stream into a table every 5 minutes. The task polls the stream using the SYSTEM$STREAM_HAS_DATA function to determine whether change data exists and, if the result is FALSE, skips the current run: CREATE TASK mytask1 WAREHOUSE = mywh SCHEDULE = '5 minute' WHEN SYSTEM$STREAM_HAS_DATA('MYSTREAM')
AS
INSERT INTO mytable1(id,name) SELECT id, name FROM mystream WHERE METADATA\$ACTION = 'INSERT';


Create a simple tree of tasks by specifying the existing mytask1 task as the predecessor task that triggers the new mytask2 task when run successfully. The new task queries the mytable table and inserts the query results into another table:

CREATE TASK mytask2
WAREHOUSE = mywh
AS
INSERT INTO mytable2(id,name) SELECT id, name FROM mytable1;


Create a task named my_copy_task that calls a stored procedure to unload data from the mytable table to the named mystage stage (using COPY INTO <location>) every hour:

-- Create a stored procedure that unloads data from a table
-- The COPY statement in the stored procedure unloads data to files in a path identified by epoch time (using the Date.now() method)
$$var my_sql_command = "" var my_sql_command = my_sql_command.concat("copy into @mystage","/",Date.now(),"/"," from mytable overwrite=true;"); var statement1 = snowflake.createStatement( {sqlText: my_sql_command} ); var result_set1 = statement1.execute(); return my_sql_command; // Statement returned for info/debug purposes$$;