Differential privacy in Snowflake Data Clean Rooms

Differential privacy is one of the main privacy-enhancing technologies that Snowflake Data Clean Rooms offer. It strengthens protections for data providers, gives mathematical guarantees around user privacy, and protects against repeated queries that try to extract information about individuals. Because a data clean room acts as a trusted environment, Snowflake Data Clean Rooms implement global differential privacy [1], in which noise is applied to query results rather than to individual records, so that noise levels stay low for a given privacy guarantee. This approach is consistent with how differential privacy is deployed in large-scale production systems.

Differential privacy techniques publish high-level insights about data without revealing any individual row. Simple aggregation alone does not hide row-level information: an adversary who can issue two “close” queries, whose underlying data differs by a single row, can subtract one result from the other and recover that row's contribution. This is often referred to as a “differencing attack”.
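To make the differencing attack concrete, here is a minimal Python sketch. The table, names, and values are made up for illustration and are not taken from any Snowflake example. Each query is an aggregate over several people, yet two queries that differ by one row leak that row exactly.

```python
# Illustrative data only: a tiny "private" table of salaries.
salaries = {"alice": 52000, "bob": 61000, "carol": 58000, "dave": 75000}


def total_salary(rows, exclude=None):
    """Exact aggregate: sum of all salaries, optionally leaving one person out."""
    return sum(value for name, value in rows.items() if name != exclude)


# Two "close" queries: identical except for a single row.
q_all = total_salary(salaries)
q_without_dave = total_salary(salaries, exclude="dave")

# Subtracting the two aggregates recovers one individual's value.
leaked = q_all - q_without_dave
print(f"Recovered salary for dave: {leaked}")
```

Neither query returns row-level data on its own; the leak comes entirely from comparing the two exact answers.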

Differential privacy counters this attack and provides strong mathematical guarantees on data privacy by carefully injecting noise into the return value of every query against private data. With a suitable noise mechanism and a strategically set privacy budget, the results of any two such “close” queries are statistically hard to tell apart: the probability of observing any given output changes by at most a bounded factor when a single row is added or removed.
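As a sketch of how such noise injection can work, the following Python example applies the Laplace mechanism, a standard differential privacy technique, to a count query. It is not a description of the clean room's internal implementation. The function names, the sensitivity of 1 (one row changes a count by at most 1), and the epsilon value are assumptions chosen for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    # If X and Y are independent exponentials with mean `scale`,
    # then X - Y follows a Laplace distribution with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Return a count with Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(7)  # seeded only so the sketch is repeatable
epsilon = 0.5           # assumed per-query privacy budget

# Two "close" queries whose true answers differ by one row.
samples_a = [noisy_count(1000, epsilon, rng) for _ in range(20000)]
samples_b = [noisy_count(1001, epsilon, rng) for _ in range(20000)]

mean_a = sum(samples_a) / len(samples_a)
mean_abs_noise = sum(abs(s - 1000) for s in samples_a) / len(samples_a)
print(f"one noisy answer for count=1000: {samples_a[0]:.2f}")
print(f"one noisy answer for count=1001: {samples_b[0]:.2f}")
```

A smaller epsilon gives a larger noise scale and stronger privacy. Averages over many rows stay useful because the noise is centered on zero, while a single noisy answer says little about whether one particular row was present.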

Differential privacy in developer edition

Users can add differential privacy to any custom template deployed in the developer edition of a Snowflake Data Clean Room. With the SQL Jinja custom template mechanism, the following call adds noise to the output according to the chosen mechanism:

cleanroom.addNoise(QUERY_RESULT,EPSILON,RANDOM_NUMBER,MECHANISM,...)

Because the developer edition of a Snowflake Data Clean Room supports custom templates, any custom noise mechanism can be designed and deployed in a template. Users can therefore develop their own differential privacy mechanisms and deploy them without any changes to the clean room backend.
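As a rough illustration of what a pluggable noise mechanism could look like, the following Python sketch mirrors the argument order of the call above (query result, epsilon, random number, mechanism). It is a hypothetical analogue, not the clean room's implementation. The `MECHANISMS` registry, the `register` decorator, the `sensitivity` parameter, and the assumption that the random number is a uniform draw in (0, 1) are all introduced here for illustration.

```python
import math
from typing import Callable, Dict

# Hypothetical registry: mechanism name -> f(query_result, scale, u) -> noisy result.
MECHANISMS: Dict[str, Callable[[float, float, float], float]] = {}


def register(name):
    """Decorator that adds a noise mechanism to the registry under `name`."""
    def wrap(fn):
        MECHANISMS[name] = fn
        return fn
    return wrap


@register("laplace")
def laplace(query_result, scale, u):
    # Inverse-CDF sampling: map a uniform draw u in (0, 1) to Laplace(0, scale) noise.
    p = u - 0.5
    noise = -scale * math.copysign(1.0, p) * math.log(1.0 - 2.0 * abs(p))
    return query_result + noise


@register("laplace_count")
def laplace_count(query_result, scale, u):
    # Custom variant for counts: round and clamp at zero.
    # Post-processing a differentially private output does not weaken the guarantee.
    return max(0, round(laplace(query_result, scale, u)))


def add_noise(query_result, epsilon, random_number, mechanism="laplace", sensitivity=1.0):
    """Dispatch to a registered mechanism with noise scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not 0.0 < random_number < 1.0:
        raise ValueError("random_number must lie strictly between 0 and 1")
    return MECHANISMS[mechanism](query_result, sensitivity / epsilon, random_number)


print(add_noise(100, 1.0, 0.75))                  # Laplace noise added to 100
print(add_noise(0, 1.0, 0.1, "laplace_count"))    # clamped, rounded count
```

Keeping each mechanism behind a common signature means a new one can be added by registering a function, which is the same idea as swapping mechanisms inside a template without touching the backend.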

Usage

See Snowflake Data Clean Rooms: Overlap Analysis for an example of this implemented inside an analysis template.