Multi-provider consumer analysis¶
A consumer can run an analysis in multiple clean rooms owned by the same or multiple providers in a single request, and see the combined results. The results are a union of analysis results from each clean room, not the consumer data analyzed against a union of provider data in multiple clean rooms.
Providers can see all the details of the request, including the template and which other clean rooms are being queried, before approving the request. Providers cannot see data from other clean rooms, or even which provider tables are being queried in other clean rooms. Data from each clean room is accessed and handled securely, and in compliance with the join, column, and other policies in each clean room.
Multi-provider analysis results can be activated if activation is enabled in all clean rooms.
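The difference between a union of results and an analysis over unioned provider data can be illustrated with a small model. This is a sketch in plain Python, not clean rooms API code: the sets, the clean room names, and the `overlap_count` helper are all invented for illustration.

```python
# Illustrative model only: Python sets stand in for clean room tables.
consumer_ids = {"a", "b", "c", "d"}
provider_tables = {
    "clean_room_1": {"a", "b", "x"},
    "clean_room_2": {"b", "c", "y"},
}


def overlap_count(consumer, provider):
    """Count the identifiers present on both sides (a stand-in for a template)."""
    return len(consumer & provider)


# What a multi-provider analysis returns: one result per clean room, combined.
union_of_results = [
    {"clean_room": name, "overlap": overlap_count(consumer_ids, table)}
    for name, table in provider_tables.items()
]

# What it does NOT compute: one analysis against pooled provider data.
pooled = set().union(*provider_tables.values())
pooled_overlap = overlap_count(consumer_ids, pooled)

print(union_of_results)
print(pooled_overlap)
```

In this model the per-clean-room overlaps do not add up to the pooled overlap, because identifier "b" appears in both provider tables and is counted once per clean room.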
Requirements¶
Templates: Multi-provider analysis can be run on any Snowflake-provided or custom template.
Environments: Multi-provider analysis can be implemented only by using the clean rooms API. It cannot be run in the clean rooms UI.
Billing: The consumer pays for multi-provider queries.
Other requirements: All clean rooms in a multi-provider analysis must have a provider join policy. If the analysis fails in any clean room, the entire request fails.
Overview of multi-provider analysis¶
Here is a high-level overview of how multi-provider analysis works. See the code samples for runnable sample code.
The consumer installs all clean rooms to use in their multi-provider flow, in the standard way. Clean rooms in a multi-provider analysis are standard clean rooms.
The consumer links datasets and sets join policies in the standard way. Provider and consumer clean rooms must all have join policies defined, whether or not the policy is checked by the template.
The request is for a single template, and all clean rooms must have that template installed. If the consumer wants to use a custom template, they must go through the standard consumer template request flow:
- The consumer calls consumer.create_template_request on each clean room, passing in the custom template.
- The provider calls provider.list_pending_template_requests to see pending template requests, then calls provider.approve_template_request to approve adding the template to their clean room.
- The consumer calls consumer.list_template_requests to see if the request was granted.
The providers and consumer set column policies for the template, in the standard way.
The provider calls provider.enable_multiprovider_computation in each clean room to surface requests from the consumer. (Requests sent before this call are queued, but are not visible to the provider until this procedure is called.)
The consumer requests approval to run a template. There are a few variations on the request and approval flow, but here is the default flow:
- The consumer sends a multi-provider request to all clean rooms by calling consumer.prepare_multiprovider_flow. This request specifies the list of clean rooms, the template used, and all template parameters, including the provider and consumer tables. The call is made only once and is broadcast to all clean rooms in the request. Each clean room sees all the request details, but the provider table list is filtered to the provider tables in that clean room.
- Each provider calls provider.view_multiprovider_requests to see multi-provider requests sent by the consumer, then calls provider.process_multiprovider_request to approve it.
When all clean rooms have approved the request, the consumer calls consumer.execute_multiprovider_flow to run the request. The template is run in each clean room with the information provided in the most recent consumer.prepare_multiprovider_flow request, and the combined results are sent to the consumer. The consumer can call execute_multiprovider_flow again without another approval until a provider revokes permission for the query, or the consumer runs out of differential privacy budget (if differential privacy is enabled).
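The per-clean-room filtering of the provider table list described above can be modeled as follows. This is a hedged sketch in plain Python, not the clean rooms API: the request dictionary shape, the `my_table` argument name, the `linked_tables` map, and the `view_for` helper are assumptions made for illustration. Only `source_table` is named in this document.

```python
# Hypothetical request shape; the field names are illustrative, not the API's.
request = {
    "clean_rooms": ["clean_room_1", "clean_room_2"],
    "template": "overlap_template",
    "arguments": {
        "source_table": ["cr1_db.sch.customers", "cr2_db.sch.subscribers"],
        "my_table": ["consumer_db.sch.users"],
    },
}

# Assumed map of which provider tables are linked in which clean room.
linked_tables = {
    "clean_room_1": {"cr1_db.sch.customers"},
    "clean_room_2": {"cr2_db.sch.subscribers"},
}


def view_for(clean_room, request, linked_tables):
    """Return the request as a single clean room would see it."""
    args = dict(request["arguments"])
    # Keep only the provider tables that belong to this clean room.
    args["source_table"] = [
        t for t in args["source_table"] if t in linked_tables[clean_room]
    ]
    return {**request, "arguments": args}


views = {cr: view_for(cr, request, linked_tables) for cr in request["clean_rooms"]}
for name, view in views.items():
    print(name, view["arguments"]["source_table"])
```

Each view keeps the full clean room list, template name, and consumer tables, while the provider table list shrinks to the tables owned by that clean room.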
Request and approval details¶
Here are details about the request and approval process. There are several variations on the process.
Request and approval flow variations¶
There are three possible flows when running a multi-provider analysis:
- Per-request approval (default behavior):
  - Consumer calls consumer.prepare_multiprovider_flow with the query details.
  - Provider sees the query (provider.view_multiprovider_requests) and approves it (provider.process_multiprovider_request). This step can be omitted if the provider has previously approved this query.
  - Consumer runs the query (consumer.execute_multiprovider_flow).
- Per-consumer and clean room approval:
  - Consumer calls consumer.prepare_multiprovider_flow with the query details.
  - Provider sees the query (provider.view_multiprovider_requests) and approves it (provider.process_multiprovider_request), passing in -1 as the query ID rather than the actual query ID. As a result, the consumer must still call consumer.prepare_multiprovider_flow with future requests, but approval will be granted automatically, without additional provider approval.
  - Consumer runs the query (consumer.execute_multiprovider_flow).
- Automated approval:
  - Provider calls provider.resume_multiprovider_tasks on the clean room. All consumer multi-provider requests in this clean room will be approved automatically.
  - Consumer calls consumer.prepare_multiprovider_flow with the query details. The request is approved automatically.
  - Consumer runs the query (consumer.execute_multiprovider_flow).
In all flows, you can revoke any query approval previously granted.
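The three flows can be summarized as a small state model. This is a sketch in plain Python under stated assumptions, not the clean rooms API: the `CleanRoomApprovals` class and its fields are hypothetical, the query ID is modeled as an integer, and approvals are keyed by request ID for brevity even though real approvals are matched on request properties.

```python
from dataclasses import dataclass, field

ALL_QUERIES = -1  # the special query ID that grants per-consumer approval


@dataclass
class CleanRoomApprovals:
    """Hypothetical model of one clean room's multi-provider approval state."""

    auto_approve_all: bool = False  # models provider.resume_multiprovider_tasks
    blanket_consumers: set = field(default_factory=set)
    approved_requests: set = field(default_factory=set)

    def process_request(self, consumer, query_id):
        """Model of the provider approving a request."""
        if query_id == ALL_QUERIES:
            self.blanket_consumers.add(consumer)
        else:
            self.approved_requests.add((consumer, query_id))

    def can_execute(self, consumer, query_id):
        """Would the consumer's prepared query be allowed to run?"""
        return (
            self.auto_approve_all
            or consumer in self.blanket_consumers
            or (consumer, query_id) in self.approved_requests
        )


room = CleanRoomApprovals()
room.process_request("CONSUMER_A", 101)          # per-request approval
room.process_request("CONSUMER_B", ALL_QUERIES)  # per-consumer approval
print(room.can_execute("CONSUMER_A", 101))  # approved explicitly
print(room.can_execute("CONSUMER_A", 102))  # never approved
print(room.can_execute("CONSUMER_B", 999))  # covered by blanket approval
```

Setting `auto_approve_all` to `True` models the automated approval flow, where every consumer's request is allowed without an explicit approval step.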
Criteria for query approval¶
consumer.execute_multiprovider_flow evaluates the last query sent to consumer.prepare_multiprovider_flow to see whether it has ever been approved. If a matching approved request is found, the query will proceed. If no previous matching request is found, consumer.execute_multiprovider_flow will fail. (If blanket approval has been granted for this consumer, or all consumers, then the approval check is skipped.) Previous approvals are matched against the following values:
- The list of clean rooms being queried.
- Argument names sent to the clean room. All argument names present in the approved request must be present in the new request. Argument values are not checked, only argument names.
- Name of the template being run.
Additionally, the template must comply with all clean room policies, including row policies, column policies, and differential privacy (if enabled), whether or not blanket approvals have been given.
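The matching criteria above can be expressed as a predicate. This is a minimal sketch in plain Python, not the clean rooms implementation: the request dictionary shape and the `matches_approval` helper are hypothetical, and comparing the clean room lists as unordered sets is an assumption, since this document does not say whether order matters.

```python
def matches_approval(approved, new):
    """Return True if a new request matches a previously approved one."""
    return (
        # Same template name.
        approved["template"] == new["template"]
        # Same clean rooms (order-insensitive comparison is an assumption).
        and set(approved["clean_rooms"]) == set(new["clean_rooms"])
        # Every approved argument NAME must appear in the new request;
        # argument values are deliberately not compared.
        and set(approved["arguments"]) <= set(new["arguments"])
    )


approved = {
    "template": "overlap_template",
    "clean_rooms": ["clean_room_1", "clean_room_2"],
    "arguments": {"source_table": ["t1"], "my_table": ["u1"]},
}
same_names_new_values = {
    "template": "overlap_template",
    "clean_rooms": ["clean_room_1", "clean_room_2"],
    "arguments": {"source_table": ["t9"], "my_table": ["u9"]},
}
missing_argument = {
    "template": "overlap_template",
    "clean_rooms": ["clean_room_1", "clean_room_2"],
    "arguments": {"source_table": ["t1"]},
}

print(matches_approval(approved, same_names_new_values))
print(matches_approval(approved, missing_argument))
```

In this model a request with the same argument names but different values still matches, while a request that drops an approved argument name does not.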
Approval history¶
consumer.execute_multiprovider_flow tries to run the last query sent to consumer.prepare_multiprovider_flow. The query can run if all security checks pass and a matching query has been approved in the past. Therefore, you can see a flow like this:
Consumer prepares query A.
Provider approves A.
Consumer runs A.
Consumer prepares query B.
Provider approves B.
Consumer runs B.
Consumer prepares query A again, with all the same parameters.
Consumer can run A again without approval from the provider, because it was already approved. See Criteria for query approval to learn how the clean room determines whether a query has already been approved.
Enable or disable automated query approvals¶
Query approvals can be automated per consumer, or for all consumers, in a clean room.
To automatically approve all multi-provider queries from a given consumer in the clean room, the provider calls provider.process_multiprovider_request with -1 as the query ID. All multi-provider requests from that consumer in that clean room will then be approved. (The consumer must still call consumer.prepare_multiprovider_flow when changing the query, and the provider will still see all consumer.prepare_multiprovider_flow calls in the provider.view_multiprovider_requests request history.) To disable automated approval granted to a given consumer, you must update the request log, as described in Deny a previously approved request.
To automatically approve all multi-provider queries for all consumers, the provider calls provider.resume_multiprovider_tasks on the clean room. All multi-provider requests from all consumers will then be approved automatically. (The consumer must still call consumer.prepare_multiprovider_flow when changing the query, and the provider will still see all consumer.prepare_multiprovider_flow calls in the provider.view_multiprovider_requests request history.) To disable automated approval granted in this way, call provider.suspend_multiprovider_tasks.
Designing your template¶
The multi-provider flow is time consuming to test and has many steps that can obscure a template error. Before using a template in a multi-provider flow, verify that it is correct by testing it in a basic provider-created, consumer-run flow.
Each clean room runs the exact same template, but the source_table list passed to each clean room can vary in length and table name, because the list is filtered to show only the provider tables in that clean room. Therefore, depending on your query, you might need to use Jinja looping and conditional statements to handle varying list length or table names.
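The looping logic that a template needs for a variable-length source_table list can be sketched in plain Python. This is an illustration of the idea only, not a clean rooms template: the `build_overlap_query` helper, the `email` join column, and the generated SQL shape are all assumptions. In a real template the same loop would be written with Jinja `{% for %}` and `{% if %}` statements.

```python
def build_overlap_query(source_table, my_table, join_column="email"):
    """Build one SELECT per provider table and combine them with UNION ALL.

    Mirrors what a Jinja loop over source_table would emit, so the same
    logic works whether a clean room passes one provider table or several.
    """
    if not source_table:
        # Conditional branch: a clean room might pass no provider tables.
        raise ValueError("no provider tables were passed to this clean room")
    selects = [
        f"SELECT '{table}' AS provider_table, COUNT(*) AS overlap "
        f"FROM {table} p JOIN {my_table} c "
        f"ON p.{join_column} = c.{join_column}"
        for table in source_table
    ]
    return "\nUNION ALL\n".join(selects)


# One clean room receives a single provider table, another receives two.
print(build_overlap_query(["cr1_db.sch.customers"], "consumer_db.sch.users"))
print(build_overlap_query(
    ["cr2_db.sch.subscribers", "cr2_db.sch.leads"], "consumer_db.sch.users"
))
```

The point of the sketch is that the query text is derived from the list rather than hard-coding a table count or table name, which is what lets one template run unchanged in every clean room.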
Managing automated approvals and changing your approval status¶
Request history log¶
When the provider calls provider.enable_multiprovider_computation, this creates a log table named samooha_cleanroom_cleanroom_ID.admin.request_log_multiprovider. Whenever a provider approves a multi-provider analysis for a clean room, the request is logged here. This is the table that the clean room checks when a consumer attempts to execute a multi-provider query. The table includes an approved column that indicates whether the request is allowed to be executed. Because a query must be approved before being added to the table, the initial approved status will be true, but you can change it later if you decide to revoke approval.
The log table holds consumer query requests from provider.process_multiprovider_request, not query executions by the consumer. Consumer query executions are not logged.
Deny a previously approved request¶
To deny a previously approved request, you must update the approved column in the multi-provider request log table to FALSE.
Because a consumer can submit the same query multiple times, and any approval for a given query allows the query to be run, the safest way to disable a multi-provider query is to set approved=false for all queries from a given consumer, then tell the consumer to resubmit any queries they want to run by calling consumer.prepare_multiprovider_flow.
-- Revoke access to a query you had previously approved
UPDATE samooha_cleanroom_Samooha_Cleanroom_Multiprovider_Clean_Room_1.admin.request_log_multiprovider
SET APPROVED=FALSE
WHERE PARTY_ACCOUNT='<CONSUMER_LOCATOR>';
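To see the effect of this kind of update without touching a real clean room, the statement can be rehearsed against a stand-in table. The sketch below uses Python's built-in sqlite3 module with an in-memory table. Only the approved and party_account columns come from this document. The request_id column, the sample rows, and the use of 0 and 1 for the boolean are assumptions made for the model.

```python
import sqlite3

# In-memory stand-in for the request log; not the real Snowflake table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE request_log_multiprovider ("
    "request_id TEXT, party_account TEXT, approved INTEGER)"
)
conn.executemany(
    "INSERT INTO request_log_multiprovider VALUES (?, ?, ?)",
    [
        ("r1", "CONSUMER_A", 1),  # query A, approved
        ("r2", "CONSUMER_A", 1),  # the same query submitted again, approved
        ("r3", "CONSUMER_B", 1),  # a different consumer
    ],
)

# Revoke every approval for one consumer, mirroring the UPDATE above.
conn.execute(
    "UPDATE request_log_multiprovider SET approved = 0 WHERE party_account = ?",
    ("CONSUMER_A",),
)

still_approved = [
    row[0]
    for row in conn.execute(
        "SELECT request_id FROM request_log_multiprovider "
        "WHERE approved = 1 ORDER BY request_id"
    )
]
print(still_approved)
```

Filtering on the consumer rather than on a single request row is what catches duplicate submissions of the same query, which is why the consumer-wide update is described as the safest option.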
Enable a previously denied request¶
If you previously approved a request, then denied it, and want to re-approve it, it’s generally safest to ask a consumer to resubmit their flow request and then approve it in the standard flow, rather than to update the request log table.
Revoke automated approvals¶
- If you enabled automatic approval for all consumers in a clean room by calling provider.resume_multiprovider_tasks, call provider.suspend_multiprovider_tasks to revoke blanket approvals for all users.
- If you enabled automatic approval for a specific consumer in a clean room by specifying -1 as the query ID in provider.process_multiprovider_request, see Deny a previously approved request.
Code samples¶
Download the following workbooks to try out the multi-provider analysis flow. You need two Snowflake accounts with the clean rooms API installed: one to serve as the provider and one as the consumer. The provider account creates two clean rooms, which behave similarly to having multiple providers with one clean room each.