Using Dynamic Data Masking¶
This topic provides instructions on how to configure and use Dynamic Data Masking in Snowflake.
To learn more about using a masking policy with a tag, see Tag-based masking policies.
Using Dynamic Data Masking¶
The following lists the high-level steps to configure and use Dynamic Data Masking in Snowflake:
Grant masking policy management privileges to a custom role for a security or privacy officer.
Grant the custom role to the appropriate users.
The security or privacy officer creates and defines masking policies and applies them to columns with sensitive data.
Execute queries in Snowflake. Note the following:
Snowflake dynamically rewrites the query applying the masking policy SQL expression to the column.
The column rewrite occurs at every place where the column specified in the masking policy appears in the query (e.g. projections, join predicate, where clause predicate, order by, and group by).
Users see masked data based on the execution context conditions defined in the masking policies. For more information on the execution context in Dynamic Data Masking policies, see Advanced Column-level Security topics.
Enforce dynamic data masking policies on Apache Iceberg tables queried from Apache Spark™¶
Snowflake supports enforcing dynamic data masking policies on Apache Iceberg tables that you query from Apache Spark™ through Snowflake Horizon Catalog. For more information, see Enforce data protection policies when querying Apache Iceberg™ tables from Apache Spark™.
Step 1: Grant masking policy privileges to the custom role¶
A security or privacy officer should serve as the masking policy administrator (i.e. custom role: MASKING_ADMIN) and have the privileges to define, manage, and apply masking policies to columns.
Snowflake provides the following privileges to grant to a security or privacy officer for Column-level Security masking policies:
Privilege |
Object |
Description |
|---|---|---|
CREATE MASKING POLICY |
Schema |
This privilege controls who can create masking policies. |
APPLY MASKING POLICY |
Account |
This privilege controls who can [un]set masking policies on columns and is granted to the ACCOUNTADMIN role by default. . This privilege only allows applying a masking policy to a column and does not provide any additional table privileges described in Access control privileges. |
APPLY |
Masking policy |
Optional. This policy-level privilege can be used by a policy owner to decentralize the [un]set operations of a given masking policy on columns to the object owners (i.e. the role that has the OWNERSHIP privilege on the object). . Snowflake supports discretionary access control where object owners are also considered data stewards. . If the policy administrator trusts the object owners to be data stewards for protected columns, then the policy administrator can use this privilege to decentralize applying the policy [un]set operations. |
The following example creates the MASKING_ADMIN role and grants masking policy privileges to that role.
Create a masking policy administrator custom role:
Grant privileges to masking_admin role:
Allow table_owner role to set or unset the ssn_mask masking policy (optional):
Where:
db_name.schema_nameSpecifies the identifier for the schema for which the privilege should be granted.
For more information, see:
Step 2: Grant the custom role to a user¶
Grant the MASKING_ADMIN custom role to a user serving as the security or privacy officer.
Step 3: Create a masking policy¶
Using the MASKING_ADMIN role, create a masking policy and apply it to a column.
In this representative example, users with the ANALYST role see the unmasked value. Users without the ANALYST role see a full mask.
Tip
If you want to update an existing masking policy and need to see the current definition of the policy, call the GET_DDL function or run the DESCRIBE MASKING POLICY command.
Step 4: Apply the masking policy to a table or view column¶
These examples assume that a masking policy is not applied to the table column when the table is created and the view column when the view is created. You can optionally apply a masking policy to a table column when you create the table with a CREATE TABLE statement or a view column with a CREATE VIEW statement.
Execute the following statements to apply the policy to a table column or a view column.
Step 5: Query data in Snowflake¶
Execute two different queries in Snowflake, one query with the ANALYST role and another query with a different role, to verify that users without the ANALYST role see a full mask.
Masking policy with a memoizable function¶
This example uses a memoizable function to cache the result of a query on the mapping table that determines whether a role is authorized to view PII data. A data engineer uses a masking policy to protect the columns in the table.
The following procedure references these objects:
A table that contains PII data,
employee_data:A mapping table that determines whether a particular role is authorized to view data,
auth_role_t:
Complete these steps to create a masking policy that calls a memoizable function with arguments:
Create a memoizable function that queries the mapping table. The function returns an array of roles based on the value of the
is_authorizedcolumn:Call the memoizable function to cache the query results. In this example, pass the value
TRUEas the argument value because the resultant array serves as the source of allowed roles to access the data protected by the masking policy:Create a masking policy to protect the
idcolumn. The policy calls the memoizable function to determine whether the role used to query the table is authorized to see the data in the protected column:Set the masking policy on the table with an ALTER TABLE … ALTER COLUMN command:
Query the table to test the policy:
This query returns unmasked data.
However, if you switch roles to the PUBLIC role and repeat the query in this step, the values in the
idare replaced withNULL.
Additional masking policy examples¶
The following are additional, representative examples that can be used in the body of the Dynamic Data Masking policy.
Allow a production account to see unmasked values and all other accounts (e.g. development, test) to see masked values.
Return NULL for unauthorized users:
Return a static masked value for unauthorized users:
Return a hash value using SHA2 , SHA2_HEX for unauthorized users. Using a hashing function in a masking policy may result in collisions; therefore, exercise caution with this approach. For more information, see Advanced Column-level Security topics.
Apply a partial mask or full mask:
Using timestamps.
Important
Currently, Snowflake does not support different input and output data types in a masking policy, such as defining the masking policy to target a timestamp and return a string (e.g.
***MASKED***); the input and output data types must match.A workaround is to cast the actual timestamp value with a fabricated timestamp value. For more information, see DATE_FROM_PARTS and CAST , ::.
Using a UDF:
On variant data:
Using a custom entitlement table. Note the use of EXISTS in the WHEN clause. Always use EXISTS when including a subquery in the masking policy body. For more information on subqueries that Snowflake supports, see Working with Subqueries.
Using DECRYPT on previously encrypted data with either ENCRYPT or ENCRYPT_RAW, with a passphrase on the encrypted data:
Using a <JavaScript UDF on JSON (VARIANT):
In this example, a JavaScript UDF masks location data in a JSON string. It is important to set the data type as VARIANT in the UDF and the masking policy. If the data type in the table column, UDF, and masking policy signature do not match, Snowflake returns an error message because it cannot resolve the SQL.
Using the GEOGRAPHY data type:
In this example, a masking policy uses the TO_GEOGRAPHY function to convert all GEOGRAPHY data in a column to a fixed point, the longitude and latitude for Snowflake in San Mateo, California, for users whose CURRENT_ROLE is not
ANALYST.Set the masking policy on a column with the GEOGRAPHY data type and set the GEOGRAPHY_OUTPUT_FORMAT value for the session to
GeoJSON:Snowflake returns the following:
The query result values in column B depend on the GEOGRAPHY_OUTPUT_FORMAT parameter value for the session. For example, if the parameter value is set to
WKT, Snowflake returns the following:
For examples using other context functions and role hierarchy, see Advanced Column-level Security topics.
Next Topics: