Using Dynamic Data Masking

This topic provides instructions on how to configure and use Dynamic Data Masking in Snowflake.

To learn more about using a masking policy with a tag, see Tag-based masking policies.

The following lists the high-level steps to configure and use Dynamic Data Masking in Snowflake:

  1. Grant masking policy management privileges to a custom role for a security or privacy officer.

  2. Grant the custom role to the appropriate users.

  3. The security or privacy officer creates and defines masking policies and applies them to columns with sensitive data.

  4. Execute queries in Snowflake. Note the following:

    • Snowflake dynamically rewrites the query, applying the masking policy SQL expression to the column.

    • The column rewrite occurs at every place where the column specified in the masking policy appears in the query (e.g. projections, join predicate, where clause predicate, order by, and group by).

    • Users see masked data based on the execution context conditions defined in the masking policies. For more information on the execution context in Dynamic Data Masking policies, see Advanced Column-level Security topics.
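The query rewrite described in step 4 can be pictured as substituting the policy body for each reference to the protected column. The sketch below is an illustration only. It assumes the email_mask policy and user_info table used later in this topic, and it is not the literal statement Snowflake generates internally.

```sql
-- Query as written by the user.
SELECT email
FROM user_info
WHERE email LIKE '%@example.com';

-- Conceptual equivalent once the policy expression is substituted for the
-- column in both the projection and the WHERE predicate (illustration only).
SELECT
  CASE WHEN CURRENT_ROLE() IN ('ANALYST') THEN email ELSE '*********' END AS email
FROM user_info
WHERE CASE WHEN CURRENT_ROLE() IN ('ANALYST') THEN email ELSE '*********' END
      LIKE '%@example.com';
```

Because the predicate is also rewritten, a role that sees the mask filters on the masked value rather than on the underlying data.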

Step 1: Grant masking policy privileges to custom role

A security or privacy officer should serve as the masking policy administrator (e.g. through a custom role named MASKING_ADMIN) and have the privileges to define, manage, and apply masking policies to columns.

Snowflake provides the following privileges to grant to a security or privacy officer for Column-level Security masking policies:

  • CREATE MASKING POLICY

    This schema-level privilege controls who can create masking policies.

  • APPLY MASKING POLICY

    This account-level privilege controls who can set or unset masking policies on columns and is granted to the ACCOUNTADMIN role by default. This privilege only allows applying a masking policy to a column and does not provide any additional table privileges described in Access control privileges.

  • APPLY ON MASKING POLICY

    Optional. This policy-level privilege can be used by a policy owner to decentralize the set and unset operations of a given masking policy on columns to the object owners (i.e. the role that has the OWNERSHIP privilege on the object). Snowflake supports discretionary access control, where object owners are also considered data stewards. If the policy administrator trusts the object owners to be data stewards for protected columns, the policy administrator can use this privilege to decentralize the policy set and unset operations.

The following example creates the MASKING_ADMIN role and grants masking policy privileges to that role.

Create a masking policy administrator custom role:

USE ROLE useradmin;
CREATE ROLE masking_admin;

Grant privileges to the masking_admin role:

USE ROLE securityadmin;
GRANT CREATE MASKING POLICY ON SCHEMA <db_name.schema_name> TO ROLE masking_admin;
GRANT APPLY MASKING POLICY ON ACCOUNT TO ROLE masking_admin;

Allow the table_owner role to set or unset the ssn_mask masking policy (optional; the ssn_mask policy must already exist):

GRANT APPLY ON MASKING POLICY ssn_mask to ROLE table_owner;

Where:

  • db_name.schema_name

    Specifies the identifier for the schema for which the privilege should be granted.
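To confirm that the grants took effect, the standard SHOW GRANTS command can be run against the role. This is a small sketch that assumes the masking_admin role from the example above.

```sql
-- List every privilege currently granted to the masking_admin role.
-- After the statements above, the output is expected to include
-- CREATE MASKING POLICY on the schema and APPLY MASKING POLICY on the account.
SHOW GRANTS TO ROLE masking_admin;
```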

Step 2: Grant the custom role to a user

Grant the MASKING_ADMIN custom role to a user serving as the security or privacy officer.

GRANT ROLE masking_admin TO USER jsmith;

Step 3: Create a masking policy

Using the MASKING_ADMIN role, create a masking policy. The policy is applied to a column in the next step.

In this representative example, users with the ANALYST role see the unmasked value. Users without the ANALYST role see a full mask.

CREATE OR REPLACE MASKING POLICY email_mask AS (val string) RETURNS string ->
  CASE
    WHEN CURRENT_ROLE() IN ('ANALYST') THEN val
    ELSE '*********'
  END;

Tip

If you want to update an existing masking policy and need to see the current definition of the policy, call the GET_DDL function or run the DESCRIBE MASKING POLICY command.
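The tip above might look like the following in practice. This sketch assumes the email_mask policy from this step. Adding the SUPPORT role to the policy body is an invented change, included only to show the shape of an update.

```sql
-- Show the policy signature, return type, and body.
DESCRIBE MASKING POLICY email_mask;

-- Return the full CREATE statement; 'POLICY' is the GET_DDL object type for policies.
SELECT GET_DDL('POLICY', 'email_mask');

-- Replace the policy body in place (illustrative change: also unmask for SUPPORT).
ALTER MASKING POLICY email_mask SET BODY ->
  CASE
    WHEN CURRENT_ROLE() IN ('ANALYST', 'SUPPORT') THEN val
    ELSE '*********'
  END;
```

Updating the body with ALTER MASKING POLICY leaves the policy attached to the columns it already protects.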

Step 4: Apply the masking policy to a table or view column

These examples assume that no masking policy was applied to the table column when the table was created, or to the view column when the view was created. Optionally, you can apply a masking policy to a table column in the CREATE TABLE statement or to a view column in the CREATE VIEW statement.

Execute the following statements to apply the policy to a table column or a view column.

-- apply masking policy to a table column

ALTER TABLE IF EXISTS user_info MODIFY COLUMN email SET MASKING POLICY email_mask;

-- apply the masking policy to a view column

ALTER VIEW user_info_v MODIFY COLUMN email SET MASKING POLICY email_mask;
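As an alternative to the ALTER statements above, a policy can be attached when the object is created. The sketch below assumes the email_mask policy already exists. The single-column table definition is shortened for illustration and is not the full user_info schema.

```sql
-- Attach the policy when the table is created.
CREATE TABLE user_info (
  email STRING WITH MASKING POLICY email_mask
);

-- Attach the policy to a view column when the view is created.
CREATE VIEW user_info_v (
  email WITH MASKING POLICY email_mask
) AS
  SELECT email FROM user_info;

-- Detach the policy from the table column again.
ALTER TABLE IF EXISTS user_info MODIFY COLUMN email UNSET MASKING POLICY;
```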

Step 5: Query data in Snowflake

Execute two different queries in Snowflake, one query with the ANALYST role and another query with a different role, to verify that users without the ANALYST role see a full mask.

-- using the ANALYST role

USE ROLE analyst;
SELECT email FROM user_info; -- should see plain text value

-- using the PUBLIC role

USE ROLE PUBLIC;
SELECT email FROM user_info; -- should see full data mask
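Besides querying the data, it can help to confirm which columns a policy is attached to. The following sketch uses the POLICY_REFERENCES Information Schema table function. The names my_db and my_schema are placeholders for the database and schema that contain the policy.

```sql
-- List every column that the email_mask policy is currently set on.
-- my_db and my_schema are placeholders, not names from this topic.
SELECT *
FROM TABLE(
  my_db.INFORMATION_SCHEMA.POLICY_REFERENCES(
    POLICY_NAME => 'my_db.my_schema.email_mask'
  )
);
```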

Masking policy with a memoizable function

This example uses a memoizable function to cache the result of a query on the mapping table that determines whether a role is authorized to view PII data. A data engineer uses a masking policy to protect the columns in the table.

The following procedure references these objects:

  • A table that contains PII data, employee_data:

    +----------+-------------+---------------+
    | USERNAME |     ID      | PHONE_NUMBER  |
    +----------+-------------+---------------+
    | JSMITH   | 12-3456-89  | 1555-523-8790 |
    | AJONES   | 12-0124-32  | 1555-125-1548 |
    +----------+-------------+---------------+
    
  • A mapping table that determines whether a particular role is authorized to view data, auth_role_t:

    +---------------+---------------+
    | ROLE          | IS_AUTHORIZED |
    +---------------+---------------+
    | DATA_ENGINEER | TRUE          |
    | DATA_STEWARD  | TRUE          |
    | IT_ADMIN      | TRUE          |
    | PUBLIC        | FALSE         |
    +---------------+---------------+
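To follow along, the two objects above could be created as shown below. This is a minimal sketch. The column data types are assumptions, because only the table contents are shown here.

```sql
-- Table with PII data; all columns are assumed to be VARCHAR.
CREATE TABLE employee_data (
  username     VARCHAR,
  id           VARCHAR,
  phone_number VARCHAR
);

INSERT INTO employee_data (username, id, phone_number) VALUES
  ('JSMITH', '12-3456-89', '1555-523-8790'),
  ('AJONES', '12-0124-32', '1555-125-1548');

-- Mapping table of roles and whether each role may view the protected data.
CREATE TABLE auth_role_t (
  role          VARCHAR,
  is_authorized BOOLEAN
);

INSERT INTO auth_role_t (role, is_authorized) VALUES
  ('DATA_ENGINEER', TRUE),
  ('DATA_STEWARD',  TRUE),
  ('IT_ADMIN',      TRUE),
  ('PUBLIC',        FALSE);
```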
    

Complete these steps to create a masking policy that calls a memoizable function with arguments:

  1. Create a memoizable function that queries the mapping table. The function returns TRUE if the role name passed as the argument is in the array of roles whose is_authorized value is TRUE:

    CREATE FUNCTION is_role_authorized(arg1 VARCHAR)
    RETURNS BOOLEAN
    MEMOIZABLE
    AS
    $$
      SELECT ARRAY_CONTAINS(
        arg1::VARIANT,
        (SELECT ARRAY_AGG(role) FROM auth_role_t WHERE is_authorized = TRUE)
      )
    $$;
    
  2. Call the memoizable function to cache the query results. In this example, pass the role name IT_ADMIN as a string argument; the function returns TRUE because the mapping table lists this role as authorized:

    SELECT is_role_authorized('IT_ADMIN');

    +---------------------------------------------+
    |       is_role_authorized('IT_ADMIN')        |
    +---------------------------------------------+
    |                    TRUE                     |
    +---------------------------------------------+
    
  3. Create a masking policy to protect the id column. The policy calls the memoizable function to determine whether the role used to query the table is authorized to see the data in the protected column:

    CREATE OR REPLACE MASKING POLICY empl_id_mem_mask
    AS (val VARCHAR) RETURNS VARCHAR ->
    CASE
      WHEN is_role_authorized(CURRENT_ROLE()) THEN val
      ELSE NULL
    END;
    
  4. Set the masking policy on the table column with an ALTER TABLE … MODIFY COLUMN command:

    ALTER TABLE employee_data MODIFY COLUMN id
      SET MASKING POLICY empl_id_mem_mask;
    
  5. Query the table to test the policy:

    USE ROLE data_engineer;
    SELECT * FROM employee_data;
    

    This query returns unmasked data.

    However, if you switch to the PUBLIC role and repeat the query in this step, the values in the id column are replaced with NULL.

Additional masking policy examples

The following are additional, representative examples that can be used in the body of the Dynamic Data Masking policy.

Allow a production account to see unmasked values and all other accounts (e.g. development, test) to see masked values.

case
  when current_account() in ('<prod_account_identifier>') then val
  else '*********'
end;

Return NULL for unauthorized users:

case
  when current_role() IN ('ANALYST') then val
  else NULL
end;

Return a static masked value for unauthorized users:

CASE
  WHEN current_role() IN ('ANALYST') THEN val
  ELSE '********'
END;

Return a hash value using SHA2, SHA2_HEX for unauthorized users. Using a hashing function in a masking policy may result in collisions; therefore, exercise caution with this approach. For more information, see Advanced Column-level Security topics.

CASE
  WHEN current_role() IN ('ANALYST') THEN val
  ELSE sha2(val) -- return hash of the column value
END;

Apply a partial mask or full mask:

CASE
  WHEN current_role() IN ('ANALYST') THEN val
  WHEN current_role() IN ('SUPPORT') THEN regexp_replace(val,'.+\@','*****@') -- leave email domain unmasked
  ELSE '********'
END;

Using timestamps:

case
  WHEN current_role() in ('SUPPORT') THEN val
  else date_from_parts(0001, 01, 01)::timestamp_ntz -- returns 0001-01-01 00:00:00.000
end;

Important

Currently, Snowflake does not support different input and output data types in a masking policy, such as defining the masking policy to target a timestamp and return a string (e.g. ***MASKED***); the input and output data types must match.

A workaround is to return a fabricated timestamp value in place of the actual timestamp value. For more information, see DATE_FROM_PARTS and CAST, ::.
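Placed in a complete policy definition, the timestamp body might look like the sketch below. The policy name ts_mask is hypothetical. The signature uses TIMESTAMP_NTZ for both input and output so that the data types match.

```sql
-- ts_mask is a hypothetical policy name used only for illustration.
-- Input and output are both TIMESTAMP_NTZ, as the matching-types rule requires.
CREATE OR REPLACE MASKING POLICY ts_mask AS (val TIMESTAMP_NTZ) RETURNS TIMESTAMP_NTZ ->
  CASE
    WHEN CURRENT_ROLE() IN ('SUPPORT') THEN val
    ELSE DATE_FROM_PARTS(0001, 01, 01)::TIMESTAMP_NTZ  -- fabricated placeholder value
  END;
```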

Using a UDF:

CASE
  WHEN current_role() IN ('ANALYST') THEN val
  ELSE mask_udf(val) -- custom masking function
END;
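This topic does not define mask_udf. One possible definition, offered only as an assumption, is a SQL UDF that returns a string of asterisks with the same length as the input:

```sql
-- Hypothetical definition of mask_udf; the body is an assumption, not taken from this topic.
-- Returns one asterisk per character of the input, so the masked value keeps its length.
CREATE OR REPLACE FUNCTION mask_udf(val STRING)
  RETURNS STRING
  AS
  $$
    REPEAT('*', LENGTH(val))
  $$;
```

The UDF's argument and return types need to match the masking policy signature, in this case a string in and a string out.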

On variant data:

CASE
   WHEN current_role() IN ('ANALYST') THEN val
   ELSE OBJECT_INSERT(val, 'USER_IPADDRESS', '****', true)
END;
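To see what the OBJECT_INSERT call does outside a policy, it can be run against a literal value. The JSON document below is made-up sample data.

```sql
-- Overwrite the USER_IPADDRESS key of a sample object (sample data is invented).
SELECT OBJECT_INSERT(
  PARSE_JSON('{"USER_IPADDRESS": "192.0.2.10", "USER_NAME": "jsmith"}'),
  'USER_IPADDRESS',
  '****',
  TRUE  -- update flag: replace the value when the key already exists
) AS masked;
```

With the update flag set to TRUE, the existing key is replaced and the other keys in the object are left unchanged.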

Using a custom entitlement table. Note the use of EXISTS in the WHEN clause. Always use EXISTS when including a subquery in the masking policy body. For more information on subqueries that Snowflake supports, see Working with Subqueries.

CASE
  WHEN EXISTS
    (SELECT role FROM <db>.<schema>.entitlement WHERE mask_method='unmask' AND role = current_role()) THEN val
  ELSE '********'
END;
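The entitlement table itself is not defined in this topic. A minimal sketch follows, assuming only the two columns that the policy body references. The row values, including the 'partial' method, are hypothetical.

```sql
-- Hypothetical entitlement table; create it in the database and schema that the
-- policy body references. Only role and mask_method are used by the policy.
CREATE TABLE entitlement (
  role        VARCHAR,
  mask_method VARCHAR
);

-- Role names are stored in uppercase to match the values CURRENT_ROLE() returns
-- for roles created with unquoted identifiers.
INSERT INTO entitlement (role, mask_method) VALUES
  ('ANALYST', 'unmask'),
  ('SUPPORT', 'partial');
```

With these rows, the EXISTS subquery finds a match only when the current role is ANALYST, so every other role receives the static mask.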

Using DECRYPT on previously encrypted data with either ENCRYPT or ENCRYPT_RAW, with a passphrase on the encrypted data:

case
  when current_role() in ('ANALYST') then DECRYPT(val, $passphrase)
  else val -- shows encrypted value
end;

Using a JavaScript UDF on JSON (VARIANT):

In this example, a JavaScript UDF masks location data in a JSON string. It is important to set the data type as VARIANT in the UDF and the masking policy. If the data type in the table column, UDF, and masking policy signature do not match, Snowflake returns an error message because it cannot resolve the SQL.

-- Flatten the JSON data

create or replace table <table_name> (v variant) as
select value::variant
from @<table_name>,
  table(flatten(input => parse_json($1):stationLocation));

-- JavaScript UDF to mask latitude, longitude, and location data

CREATE OR REPLACE FUNCTION full_location_masking(v variant)
  RETURNS variant
  LANGUAGE JAVASCRIPT
  AS
  $$
    if ("latitude" in V) {
      V["latitude"] = "**latitudeMask**";
    }
    if ("longitude" in V) {
      V["longitude"] = "**longitudeMask**";
    }
    if ("location" in V) {
      V["location"] = "**locationMask**";
    }

    return V;
  $$;

  -- Transfer ownership of the UDF to ACCOUNTADMIN

  grant ownership on function FULL_LOCATION_MASKING(variant) to role accountadmin;

  -- Create a masking policy using JavaScript UDF

  create or replace masking policy json_location_mask as (val variant) returns variant ->
    CASE
      WHEN current_role() IN ('ANALYST') THEN val
      else full_location_masking(val)
      -- else object_insert(val, 'latitude', '**locationMask**', true) -- limited to one value at a time
    END;

Using the GEOGRAPHY data type:

In this example, a masking policy uses the TO_GEOGRAPHY function to convert all GEOGRAPHY data in a column to a fixed point, the longitude and latitude for Snowflake in San Mateo, California, for users whose CURRENT_ROLE is not ANALYST.

create masking policy mask_geo_point as (val geography) returns geography ->
  case
    when current_role() IN ('ANALYST') then val
    else to_geography('POINT(-122.35 37.55)')
  end;

Set the masking policy on a column with the GEOGRAPHY data type and set the GEOGRAPHY_OUTPUT_FORMAT value for the session to GeoJSON:

alter table mydb.myschema.geography modify column b set masking policy mask_geo_point;
alter session set geography_output_format = 'GeoJSON';
use role public;
select * from mydb.myschema.geography;

Snowflake returns the following:

---+--------------------+
 A |         B          |
---+--------------------+
 1 | {                  |
   |   "coordinates": [ |
   |     -122.35,       |
   |     37.55          |
   |   ],               |
   |   "type": "Point"  |
   | }                  |
 2 | {                  |
   |   "coordinates": [ |
   |     -122.35,       |
   |     37.55          |
   |   ],               |
   |   "type": "Point"  |
   | }                  |
---+--------------------+

The query result values in column B depend on the GEOGRAPHY_OUTPUT_FORMAT parameter value for the session. For example, if the parameter value is set to WKT, Snowflake returns the following:

alter session set geography_output_format = 'WKT';
select * from mydb.myschema.geography;

---+----------------------+
 A |         B            |
---+----------------------+
 1 | POINT(-122.35 37.55) |
 2 | POINT(-122.35 37.55) |
---+----------------------+

For examples using other context functions and role hierarchy, see Advanced Column-level Security topics.
