DUPLICATE_COUNT (system data metric function)
Returns the count of column values that have duplicates, including NULL values.
This topic provides the syntax for calling the function directly. To learn how to associate the function with a table or view so it runs at regular intervals, see Associate a DMF to automate data quality checks.
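As a rough sketch of what the association looks like, the statements below attach the function to the `ssn` column of the table used in the example on this page. The five-minute schedule is an assumption chosen for illustration; the linked topic covers the supported schedule values and prerequisites.

```sql
-- Assumption: a 5-minute schedule, chosen only for illustration.
-- The schedule must be set on the table before a DMF can be added to it.
ALTER TABLE hr.tables.empl_info
  SET DATA_METRIC_SCHEDULE = '5 MINUTE';

-- Associate the system DMF with the ssn column of the table.
ALTER TABLE hr.tables.empl_info
  ADD DATA METRIC FUNCTION SNOWFLAKE.CORE.DUPLICATE_COUNT ON (ssn);
```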
Syntax
SNOWFLAKE.CORE.DUPLICATE_COUNT(<query>)
Arguments
query
Specifies a SQL query that projects a single column.
Allowed data types
The column projected by the query must have one of the following data types:
DATE
FLOAT
NUMBER
TIMESTAMP_LTZ
TIMESTAMP_NTZ
TIMESTAMP_TZ
VARCHAR
Returns
The function returns a scalar value with a NUMBER data type.
Access control requirements
Associating and running a system DMF requires the USAGE privilege on the system DMF. You can grant the SNOWFLAKE.DATA_METRIC_USER database role to give users the USAGE privilege on all system DMFs. For more information, see Grant the USAGE privilege on system DMFs.
For instructions on creating a custom role with a specified set of privileges, see Creating custom roles.
For general information about roles and privilege grants for performing SQL actions on securable objects, see Overview of Access Control.
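A minimal sketch of the grant described above, assuming a hypothetical custom role named `dq_analyst` (the role name is not from this topic; substitute your own):

```sql
-- dq_analyst is a hypothetical role name used only for illustration.
-- Granting the database role gives the role USAGE on all system DMFs.
GRANT DATABASE ROLE SNOWFLAKE.DATA_METRIC_USER TO ROLE dq_analyst;
```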
Example
Determine the number of duplicate US Social Security numbers in the SSN column:
SELECT SNOWFLAKE.CORE.DUPLICATE_COUNT(
  SELECT
    ssn
  FROM hr.tables.empl_info
);
+---------------------------------------------------------------------+
| SNOWFLAKE.CORE.DUPLICATE_COUNT(SELECT ssn FROM hr.tables.empl_info) |
+---------------------------------------------------------------------+
| 0                                                                   |
+---------------------------------------------------------------------+
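When the function returns a nonzero count, a plain aggregate query can help locate the offending values. The following is ordinary SQL for inspection, not a description of how the DMF is implemented, and its row count is not guaranteed to match the DMF result:

```sql
-- List each SSN that appears more than once, with its number of occurrences.
-- GROUP BY places all NULL values in a single group, so repeated NULLs
-- also appear in the output.
SELECT
  ssn,
  COUNT(*) AS occurrences
FROM hr.tables.empl_info
GROUP BY ssn
HAVING COUNT(*) > 1
ORDER BY occurrences DESC;
```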