Categories: Data metric functions

DUPLICATE_COUNT (system data metric function)

Returns the count of column values that have duplicates, including NULL values.

This topic provides the syntax for calling the function directly. To learn how to associate the function with a table or view so it runs at regular intervals, see Associate a DMF to automate data quality checks.
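As a rough sketch of what that association might look like, the statements below set a schedule on a table and attach the function to one column. The table `hr.tables.empl_info` and column `ssn` are taken from the example in this topic, and the 60-minute schedule is an illustrative assumption. The linked topic remains the authoritative source for the required privileges and options.

```sql
-- Sketch only: the schedule value is an assumption chosen for illustration.
-- Set how often data metric functions associated with the table run.
ALTER TABLE hr.tables.empl_info
  SET DATA_METRIC_SCHEDULE = '60 MINUTE';

-- Associate the system DMF with the ssn column of the table.
ALTER TABLE hr.tables.empl_info
  ADD DATA METRIC FUNCTION SNOWFLAKE.CORE.DUPLICATE_COUNT ON (ssn);
```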

Syntax

SNOWFLAKE.CORE.DUPLICATE_COUNT(<query>)

Arguments

query

Specifies a SQL query that projects a single column.

Allowed data types

The column projected by the query must have one of the following data types:

  • DATE

  • FLOAT

  • NUMBER

  • TIMESTAMP_LTZ

  • TIMESTAMP_NTZ

  • TIMESTAMP_TZ

  • VARCHAR

Returns

The function returns a scalar value with a NUMBER data type.

Example

Determine the number of duplicate US Social Security numbers in the SSN column:

SELECT SNOWFLAKE.CORE.DUPLICATE_COUNT(
  SELECT
    ssn
  FROM hr.tables.empl_info
);
+---------------------------------------------------------------------+
| SNOWFLAKE.CORE.DUPLICATE_COUNT(SELECT ssn FROM hr.tables.empl_info) |
+---------------------------------------------------------------------+
| 0                                                                   |
+---------------------------------------------------------------------+
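
The function returns only a count. To inspect which values are duplicated, an ordinary `GROUP BY` query can be run against the same column. The following is a minimal sketch in standard SQL and is not part of the function's interface. It reuses the table and column from the example above. Whether its row count matches the function's result exactly, in particular for NULL values, is an assumption that should be verified.

```sql
-- List each ssn value that appears more than once, with its number of occurrences.
-- GROUP BY places all NULL values in a single group, so repeated NULLs are listed too.
SELECT
  ssn,
  COUNT(*) AS occurrences
FROM hr.tables.empl_info
GROUP BY ssn
HAVING COUNT(*) > 1
ORDER BY occurrences DESC;
```

An empty result from this query means that no value in the column appears more than once.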