Using Dynamic Data Masking¶
This topic provides instructions on how to configure and use Dynamic Data Masking in Snowflake.
To learn more about using a masking policy with a tag, see Tag-based masking policies.
Using Dynamic Data Masking¶
The following lists the high-level steps to configure and use Dynamic Data Masking in Snowflake:
Grant masking policy management privileges to a custom role for a security or privacy officer.
Grant the custom role to the appropriate users.
The security or privacy officer creates and defines masking policies and applies them to columns with sensitive data.
Execute queries in Snowflake. Note the following:
Snowflake dynamically rewrites the query applying the masking policy SQL expression to the column.
The column rewrite occurs at every place where the column specified in the masking policy appears in the query (e.g. projections, join predicate, where clause predicate, order by, and group by).
Users see masked data based on the execution context conditions defined in the masking policies. For more information on the execution context in Dynamic Data Masking policies, see Advanced Column-level Security topics.
Step 1: Grant masking policy privileges to custom role¶
A security or privacy officer should serve as the masking policy administrator (i.e. custom role: MASKING_ADMIN
) and have the privileges to define, manage, and apply masking policies to columns.
Snowflake provides the following privileges to grant to a security or privacy officer for Column-level Security masking policies:
Privilege |
Description |
---|---|
CREATE MASKING POLICY |
This schema-level privilege controls who can create masking policies. |
APPLY MASKING POLICY |
This account-level privilege controls who can [un]set masking policies on columns and is granted to the ACCOUNTADMIN role by default. . This privilege only allows applying a masking policy to a column and does not provide any additional table privileges described in Access control privileges. |
APPLY ON MASKING POLICY |
Optional. This policy-level privilege can be used by a policy owner to decentralize the [un]set operations of a given masking policy on columns to the object owners (i.e. the role that has the OWNERSHIP privilege on the object). . Snowflake supports discretionary access control where object owners are also considered data stewards. . If the policy administrator trusts the object owners to be data stewards for protected columns, then the policy administrator can use this privilege to decentralize applying the policy [un]set operations. |
The following example creates the MASKING_ADMIN
role and grants masking policy privileges to that role.
Create a masking policy administrator custom role:
use role useradmin; CREATE ROLE masking_admin;
Grant privileges to masking_admin
role:
use role securityadmin; GRANT CREATE MASKING POLICY on SCHEMA <db_name.schema_name> to ROLE masking_admin; GRANT APPLY MASKING POLICY on ACCOUNT to ROLE masking_admin;
Allow table_owner
role to set or unset the ssn_mask
masking policy (optional):
GRANT APPLY ON MASKING POLICY ssn_mask to ROLE table_owner;
Where:
db_name.schema_name
Specifies the identifier for the schema for which the privilege should be granted.
For more information, see:
Step 2: Grant the custom role to a user¶
Grant the MASKING_ADMIN
custom role to a user serving as the security or privacy officer.
GRANT ROLE masking_admin TO USER jsmith;
Step 3: Create a masking policy¶
Using the MASKING_ADMIN role, create a masking policy and apply it to a column.
In this representative example, users with the ANALYST role see the unmasked value. Users without the ANALYST role see a full mask.
CREATE OR REPLACE MASKING POLICY email_mask AS (val string) RETURNS string ->
CASE
WHEN CURRENT_ROLE() IN ('ANALYST') THEN val
ELSE '*********'
END;
Tip
If you want to update an existing masking policy and need to see the current definition of the policy, call the GET_DDL function or run the DESCRIBE MASKING POLICY command.
Step 4: Apply the masking policy to a table or view column¶
These examples assume that a masking policy is not applied to the table column when the table is created and the view column when the view is created. You can optionally apply a masking policy to a table column when you create the table with a CREATE TABLE statement or a view column with a CREATE VIEW statement.
Execute the following statements to apply the policy to a table column or a view column.
-- apply masking policy to a table column
ALTER TABLE IF EXISTS user_info MODIFY COLUMN email SET MASKING POLICY email_mask;
-- apply the masking policy to a view column
ALTER VIEW user_info_v MODIFY COLUMN email SET MASKING POLICY email_mask;
Step 5: Query data in Snowflake¶
Execute two different queries in Snowflake, one query with the ANALYST role and another query with a different role, to verify that users without the ANALYST role see a full mask.
-- using the ANALYST role
USE ROLE analyst;
SELECT email FROM user_info; -- should see plain text value
-- using the PUBLIC role
USE ROLE PUBLIC;
SELECT email FROM user_info; -- should see full data mask
Masking policy with a memoizable function¶
This example uses a memoizable function to cache the result of a query on the mapping table that determines whether a role is authorized to view PII data. A data engineer uses a masking policy to protect the columns in the table.
The following procedure references these objects:
A table that contains PII data,
employee_data
:+----------+-------------+---------------+ | USERNAME | ID | PHONE_NUMBER | +----------+-------------+---------------+ | JSMITH | 12-3456-89 | 1555-523-8790 | | AJONES | 12-0124-32 | 1555-125-1548 | +----------+-------------+---------------+
A mapping table that determines whether a particular role is authorized to view data,
auth_role_t
:+---------------+---------------+ | ROLE | IS_AUTHORIZED | +---------------+---------------+ | DATA_ENGINEER | TRUE | | DATA_STEWARD | TRUE | | IT_ADMIN | TRUE | | PUBLIC | FALSE | +---------------+---------------+
Complete these steps to create a masking policy that calls a memoizable function with arguments:
Create a memoizable function that queries the mapping table. The function returns an array of roles based on the value of the
is_authorized
column:CREATE FUNCTION is_role_authorized(arg1 VARCHAR) RETURNS BOOLEAN MEMOIZABLE AS $$ SELECT ARRAY_CONTAINS( arg1::VARIANT, (SELECT ARRAY_AGG(role) FROM auth_role WHERE is_authorized = TRUE) ) $$;
Call the memoizable function to cache the query results. In this example, pass the value
TRUE
as the argument value because the resultant array serves as the source of allowed roles to access the data protected by the masking policy:SELECT is_role_authorized(IT_ADMIN);
+---------------------------------------------+ | is_role_authorized(IT_ADMIN) | +---------------------------------------------+ | TRUE | +---------------------------------------------+
Create a masking policy to protect the
id
column. The policy calls the memoizable function to determine whether the role used to query the table is authorized to see the data in the protected column:CREATE OR REPLACE MASKING POLICY empl_id_mem_mask AS (val VARCHAR) RETURNS VARCHAR -> CASE WHEN is_role_authorized(CURRENT_ROLE()) THEN val ELSE NULL END;
Set the masking policy on the table with an ALTER TABLE … ALTER COLUMN command:
ALTER TABLE employee_data MODIFY COLUMN id SET MASKING POLICY empl_id_mem_mask;
Query the table to test the policy:
USE ROLE data_engineer; SELECT * FROM employee_data;
This query returns unmasked data.
However, if you switch roles to the PUBLIC role and repeat the query in this step, the values in the
id
are replaced withNULL
.
Additional masking policy examples¶
The following are additional, representative examples that can be used in the body of the Dynamic Data Masking policy.
Allow a production account to see unmasked values and all other accounts (e.g. development, test) to see masked values.
case when current_account() in ('<prod_account_identifier>') then val else '*********' end;
Return NULL for unauthorized users:
case when current_role() IN ('ANALYST') then val else NULL end;
Return a static masked value for unauthorized users:
CASE WHEN current_role() IN ('ANALYST') THEN val ELSE '********' END;
Return a hash value using SHA2 , SHA2_HEX for unauthorized users. Using a hashing function in a masking policy may result in collisions; therefore, exercise caution with this approach. For more information, see Advanced Column-level Security topics.
CASE WHEN current_role() IN ('ANALYST') THEN val ELSE sha2(val) -- return hash of the column value END;
Apply a partial mask or full mask:
CASE WHEN current_role() IN ('ANALYST') THEN val WHEN current_role() IN ('SUPPORT') THEN regexp_replace(val,'.+\@','*****@') -- leave email domain unmasked ELSE '********' END;
Using timestamps.
case WHEN current_role() in ('SUPPORT') THEN val else date_from_parts(0001, 01, 01)::timestamp_ntz -- returns 0001-01-01 00:00:00.000 end;Important
Currently, Snowflake does not support different input and output data types in a masking policy, such as defining the masking policy to target a timestamp and return a string (e.g.
***MASKED***
); the input and output data types must match.A workaround is to cast the actual timestamp value with a fabricated timestamp value. For more information, see DATE_FROM_PARTS and CAST , ::.
Using a UDF:
CASE WHEN current_role() IN ('ANALYST') THEN val ELSE mask_udf(val) -- custom masking function END;
On variant data:
CASE WHEN current_role() IN ('ANALYST') THEN val ELSE OBJECT_INSERT(val, 'USER_IPADDRESS', '****', true) END;
Using a custom entitlement table. Note the use of EXISTS in the WHEN clause. Always use EXISTS when including a subquery in the masking policy body. For more information on subqueries that Snowflake supports, see Working with Subqueries.
CASE WHEN EXISTS (SELECT role FROM <db>.<schema>.entitlement WHERE mask_method='unmask' AND role = current_role()) THEN val ELSE '********' END;
Using DECRYPT on previously encrypted data with either ENCRYPT or ENCRYPT_RAW, with a passphrase on the encrypted data:
case when current_role() in ('ANALYST') then DECRYPT(val, $passphrase) else val -- shows encrypted value end;
Using a <JavaScript UDF on JSON (VARIANT):
In this example, a JavaScript UDF masks location data in a JSON string. It is important to set the data type as VARIANT in the UDF and the masking policy. If the data type in the table column, UDF, and masking policy signature do not match, Snowflake returns an error message because it cannot resolve the SQL.
-- Flatten the JSON data create or replace table <table_name> (v variant) as select value::variant from @<table_name>, table(flatten(input => parse_json($1):stationLocation)); -- JavaScript UDF to mask latitude, longitude, and location data CREATE OR REPLACE FUNCTION full_location_masking(v variant) RETURNS variant LANGUAGE JAVASCRIPT AS $$ if ("latitude" in V) { V["latitude"] = "**latitudeMask**"; } if ("longitude" in V) { V["longitude"] = "**longitudeMask**"; } if ("location" in V) { V["location"] = "**locationMask**"; } return V; $$; -- Grant UDF usage to ACCOUNTADMIN grant ownership on function FULL_LOCATION_MASKING(variant) to role accountadmin; -- Create a masking policy using JavaScript UDF create or replace masking policy json_location_mask as (val variant) returns variant -> CASE WHEN current_role() IN ('ANALYST') THEN val else full_location_masking(val) -- else object_insert(val, 'latitude', '**locationMask**', true) -- limited to one value at a time END;
Using the GEOGRAPHY data type:
In this example, a masking policy uses the TO_GEOGRAPHY function to convert all GEOGRAPHY data in a column to a fixed point, the longitude and latitude for Snowflake in San Mateo, California, for users whose CURRENT_ROLE is not
ANALYST
.create masking policy mask_geo_point as (val geography) returns geography -> case when current_role() IN ('ANALYST') then val else to_geography('POINT(-122.35 37.55)') end;Set the masking policy on a column with the GEOGRAPHY data type and set the GEOGRAPHY_OUTPUT_FORMAT value for the session to
GeoJSON
:alter table mydb.myschema.geography modify column b set masking policy mask_geo_point; alter session set geography_output_format = 'GeoJSON'; use role public; select * from mydb.myschema.geography;Snowflake returns the following:
---+--------------------+ A | B | ---+--------------------+ 1 | { | | "coordinates": [ | | -122.35, | | 37.55 | | ], | | "type": "Point" | | } | 2 | { | | "coordinates": [ | | -122.35, | | 37.55 | | ], | | "type": "Point" | | } | ---+--------------------+The query result values in column B depend on the GEOGRAPHY_OUTPUT_FORMAT parameter value for the session. For example, if the parameter value is set to
WKT
, Snowflake returns the following:alter session set geography_output_format = 'WKT'; select * from mydb.myschema.geography; ---+----------------------+ A | B | ---+----------------------+ 1 | POINT(-122.35 37.55) | 2 | POINT(-122.35 37.55) | ---+----------------------+
For examples using other context functions and role hierarchy, see Advanced Column-level Security topics.
Next Topics: