Working with privacy budgets¶
This topic describes the tasks that a data provider who has implemented differential privacy can perform to manage privacy budgets. For an introduction to privacy budgets and how they help prevent queries from revealing sensitive information about an entity, see Limiting privacy loss.
A privacy budget is created automatically when you define a privacy budget name in the body of the privacy policy. You don’t create a privacy budget independent of a privacy policy.
To manage a privacy budget, you need OWNERSHIP privilege on the privacy policy that specifies the privacy budget.
View a privacy budget¶
Each privacy budget is namespaced to a privacy policy. There can be multiple privacy budgets with the same name, but each is unique to a privacy policy. Within a privacy policy, a privacy budget is further namespaced to the consumer account incurring privacy loss. As a result, multiple accounts can have a privacy budget with the same name and limit on privacy loss, but Snowflake tallies the cumulative privacy loss for each account separately.
Viewing a privacy budget lets you see its limit on privacy loss as well as the cumulative privacy loss incurred by users associated with the budget. You can use this information to determine whether the cumulative privacy loss is approaching the privacy budget’s limit.
Note
The cumulative privacy loss associated with a privacy budget does not include privacy loss incurred in accounts outside of the data provider’s account.
You have the following two options for viewing privacy budgets. For both options, a privacy budget appears only if analysts associated with the privacy budget have incurred privacy loss or if an administrator has reset the privacy budget.
To query all privacy budgets in the account, use the PRIVACY_BUDGETS view in the Account Usage schema. The PRIVACY_BUDGETS view in the ACCOUNT USAGE schema contains all privacy budgets in the account. You can use it to view privacy budgets associated with all of the privacy policies that you own, and can filter results to focus on specific privacy budgets by name. For example, to focus on a specific privacy budget associated with the
patients_policy
privacy policy, you might execute the following query:SELECT * FROM snowflake.account_usage.privacy_budgets WHERE policy_name='patients_policy' AND budget_name='analyst_budget';
To view the privacy budgets associated with a particular privacy policy, use the CUMULATIVE_PRIVACY_LOSSES table function. You can use the CUMULATIVE_PRIVACY_LOSSES table function to retrieve privacy budgets associated with a particular privacy policy. Unlike the PRIVACY_BUDGETS view in the ACCOUNT USAGE schema, this function does not have a fixed amount of latency and will return the real-time values for the cumulative privacy losses. When calling the function, the name of the privacy policy must be fully qualified.
For example, to view the privacy budgets that are specified in the
my_policy_privacy
policy, execute the following:SELECT * FROM TABLE(SNOWFLAKE.DATA_PRIVACY.CUMULATIVE_PRIVACY_LOSSES( 'my_policy_db.my_policy_schema.my_policy_privacy'));
Set privacy settings for a privacy budget¶
Snowflake lets you adjust the privacy budget’s limit on privacy loss and the maximum amount of privacy budget spent per aggregate (collectively known as the epsilon in differential privacy). You set these controls by specifying the following parameters in the body of the privacy policy:
BUDGET_LIMIT
— Sets the privacy budget’s limit on cumulative privacy loss.MAX_BUDGET_PER_AGGREGATE
– Sets the maximum amount of the privacy budget spend per aggregate (that is, the maximum privacy loss incurred by each aggregate function in a query).
For example, to use the ALTER PRIVACY POLICY command to adjust the privacy controls of an existing privacy budget, you might execute:
ALTER PRIVACY POLICY users_policy SET BODY ->
PRIVACY_BUDGET(BUDGET_NAME=>'analysts',
BUDGET_LIMIT=>300,
MAX_BUDGET_PER_AGGREGATE=>0.1);
You can also define these controls when executing the CREATE PRIVACY POLICY command to create the privacy policy.
Caution
When changing the BUDGET_LIMIT
, MAX_BUDGET_PER_AGGREGATE
, or BUDGET_WINDOW
parameter, any
parameter not specified in your ALTER PRIVACY POLICY command reverts back to its default value. So in the previous example,
the BUDGET_WINDOW
parameter, which determines how often Snowflake resets the privacy budget, will revert to its default value.
For more information about setting privacy controls, see Adjust privacy controls.
Privacy budget refresh¶
About the refresh period¶
Snowflake periodically resets the cumulative privacy loss of a privacy budget to 0 to let analysts run a new set of queries. This refresh period is known as the budget window. This automatic refresh lets analysts access new data as it is added to a table. Theoretically, the analyst hasn’t learned any information about this new data, so it’s appropriate to let them run more queries.
The default budget window is weekly.
Modify the refresh period¶
To modify the privacy budget refresh period, update the budget_window
value of the privacy policy’s privacy_budget
. For example:
ALTER PRIVACY POLICY users_policy SET BODY ->
PRIVACY_BUDGET(BUDGET_NAME=>'analysts', BUDGET_WINDOW=>'daily');
Caution
When changing the BUDGET_LIMIT
, MAX_BUDGET_PER_AGGREGATE
, or BUDGET_WINDOW
parameter, any parameter not specified
in your ALTER PRIVACY POLICY command reverts back to its default value. So in the previous example, BUDGET_LIMIT
and
MAX_BUDGET_PER_AGGREGATE
will revert to default values.
Reset cumulative privacy loss¶
As analysts execute queries on data protected by a policy, Snowflake tallies the cumulative privacy loss of those queries. You can call the RESET_PRIVACY_BUDGET stored procedure to reset the cumulative privacy loss to 0, letting the analysts execute additional queries.
The RESET_PRIVACY_BUDGET stored procedure is intended to reset the budget when analysts inadvertently incur privacy loss and want to start over. Remember that the privacy loss is automatically set to 0 when the privacy budget is refreshed.
Only the cumulative privacy loss associated with analysts in the specified account is reset to 0, even if the privacy budget is associated with analysts in multiple accounts.
Note
When calling RESET_PRIVACY_BUDGET, the cumulative privacy loss is not reset immediately. It is reset the next time a query incurs privacy loss. As a result, if you view the privacy budget after calling the function but before the first query incurs privacy loss, the cumulative privacy loss will not be 0.
Example
Here’s an example of zeroing out the privacy usage count for all users executing queries in the companyorg.account_123
account:
CALL SNOWFLAKE.DATA_PRIVACY.RESET_PRIVACY_BUDGET(
'my_policy_db.my_policy_schema.my_policy',
'analyst_budget',
'companyorg',
'account_123');