Use SQL to set up data metric functions¶
This topic describes how to use SQL to associate a data metric function (DMF) with a table or view so it runs at regular intervals. It also describes how to call a DMF directly, for example, if you want to test a DMF before associating it with a table or view.
Note
To use a user interface to set up data quality checks, which includes associating a DMF with a table, see Use Snowsight to set up data quality checks.
Associate a DMF¶
You can associate a DMF with a table or view to automatically call it on regular intervals. When associating the DMF, you specify which columns are passed to the DMF as arguments.
Use an ALTER TABLE or ALTER VIEW command to associate a DMF and specify which
columns are passed as arguments. For example, the following command associates the NULL_COUNT system DMF with table t. When the
DMF runs, it will return the number of NULL values in the column c1.
Some DMFs don’t accept a column as an argument. For example, to associate the ROW_COUNT system DMF with view v2, run the following command:
The ACCEPTED_VALUES DMF contains a lambda expression as well as the column name, which
allows you to check how many records do not match an expected value. For example, the following statement associates the function with table
t1 so the function returns the number of records where the value of the column age is not equal to five.
Drop a DMF from an object¶
You can drop a DMF using an ALTER TABLE or ALTER VIEW command. For example:
Adjust the schedule for DMFs¶
The DATA_METRIC_SCHEDULE object parameter for a table, view, or materialized view controls how often DMFs run. By default, the schedule is set to one hour. All data metric functions on a table or view follow the same schedule.
You can use the following approaches to schedule your DMF to run:
Set the DMF to run after a specified number of minutes.
Use a cron expression to schedule the DMF to run at a particular frequency.
Use a trigger event to schedule the DMF to run when there is a DML change to the table, such as inserting a new row into the table. However:
The reclustering of tables doesn’t trigger a DMF to run.
The trigger approach is only available for certain kinds of tables. For more information, see ALTER TABLE … SET DATA_METRIC_SCHEDULE.
For example:
Set the data metric function schedule to run every 5 minutes:
Set the data metric function schedule to run at 8:00 AM daily:
Set the data metric function schedule to run at 8:00 AM on weekdays only:
Set the data metric function schedule to run three times daily at 0600, 1200, and 1800 UTC:
Set the data metric function to run when a general DML operation, such as inserting a new row, modifies the table:
You can use the SHOW PARAMETERS command to view the DMF schedule for a supported table object:
For view and materialized view objects, specify TABLE as the object domain and check the schedule as follows:
Note
There is a 10-minute lag from when you modify the DMF from a table for any scheduling changes to take effect on previous DMFs that are assigned to the table. However, new DMF assignments to the table are not subject to the 10 minute delay. Plan the DMF scheduling and DMF unsetting operations carefully to align with your expected DMF costs.
Additionally, when you evaluate the DMF results, such as by querying the
DATA_QUALITY_MONITORING_RESULTS view, specify the measurement_time
column in your query as the basis for the evaluation. There is an internal process that initiates the DMF evaluation, and it is possible
that table updates, such as INSERT operations, can occur between the scheduled time and the measurement time. When you use the
measurement_time column, you have a more accurate assessment of the DMF results because the measurement time indicates the
evaluation time of the DMF.
Suspend DMFs¶
You can suspend a DMF to prevent it from running even though it is associated with a table. Alternatively, you can suspend all DMFs associated with a table with a single statement.
To suspend a specific DMF associated with a table, modify the association to set the SUSPEND parameter. For example:
To resume running the DMF, use another MODIFY DATA METRIC FUNCTION statement to set the RESUME parameter.
To suspend all DMFs associated with a table, set the table’s schedule to an empty string. For example:
To resume the DMFs, set the DATA_METRIC_SCHEDULE parameter to a valid value.
Call a DMF manually¶
Calling a DMF directly can be useful to test the output of the DMF before associating it with a table or view.
Use the following syntax to call a DMF:
Where:
data_metric_functionSpecifies a system- or user-defined DMF.
querySpecifies a SQL query on a table or view.
The columns projected by the query must match the column arguments in the DMF signature.
Note
The following system DMFs don’t follow this syntax because they don’t take any arguments:
For example, to call a custom DMF count_positive_numbers, which accepts three columns as arguments, run the following command:
For example, to call the NULL_COUNT (system data metric function) system DMF to view the number of NULL values
in the ssn column, run the following command:
If a custom DMF accepts arguments from multiple tables, each query that projects a column must be enclosed in parentheses. For example, if you want to manually call the REFERENTIAL_CHECK DMF, execute the following: