Provider-run analysis

Overview

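In the default clean room configuration, only consumers can run analyses in a clean room. However, a provider can ask a consumer for permission to run templates using the consumer's data in a specific clean room. Provider-run analyses can be enabled and run using the clean rooms UI or code.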

The following diagram shows the data flow and the main components in a basic provider-run analysis:

Basic data flow direction in a provider-run analysis
  1. In a basic provider-run analysis, the consumer and provider both link their data into the clean room. Source data is linked into the clean room as private views in the account where the data lives.

  2. When the provider runs an analysis, the provider’s data is shared with the clean room app in the consumer’s account. The analysis runs on the consumer’s account.

  3. The encrypted results are temporarily written to the consumer DB in the consumer’s account.

  4. The encrypted results are copied to the analysis results back share on the provider’s account (also called the governance back share) and decrypted. Because the analysis runs on the consumer’s account, the consumer is billed for the analysis.

For more information, see Snowflake Data Clean Rooms: Installed objects.

Templates that support provider-run analyses

The following templates support provider-run analyses:

  • Audience Overlap & Segmentation

  • SQL Query (UI only)

  • Custom templates (API only)

Billing and cost details

Provider-run analyses run in the consumer’s account, and consumers are billed for a provider-run analysis. To stop incurring costs from provider-run analyses, the consumer must uninstall the clean room.

A consumer can estimate the number of credits consumed by the provider within the last N days by executing the following query. Specify the number of previous days as a negative number.

-- Estimate the number of credits consumed in the past 5 days.
SELECT * FROM TABLE(SAMOOHA_BY_SNOWFLAKE_LOCAL_DB.LIBRARY.PRA_CONSUMPTION_UDTF(-5));

When a provider runs an analysis in the clean rooms UI, the clean room uses auto-scaling logic based on dataset sizes to choose a warehouse for your analysis.

When a provider creates and runs a clean room using the API, the provider can explicitly choose a warehouse size for each analysis. The consumer can limit the size and type of warehouses available to the provider when running a given template.

General notes

  • Providers can activate results to their own account using the UI or the API, or to third-party providers if using the UI. For information about how to enable activation and view results, see Implementing activation in a clean room.

  • If the consumer and provider are in different cloud regions, cross-cloud auto-fulfillment must be enabled for both accounts and both clean rooms.

    Note that provider-run cross-cloud queries can take some time to run because provider source data must be replicated from the provider to the consumer, and query results from the consumer to the provider, all across cloud regions.

  • Any templates run by the provider require column names or aliases for all columns generated in the results. If a column is aggregated (for example, SUM(col1)) or calls a custom function (for example, cleanroom.my_function(p.hashed_email)), you must explicitly specify an alias for the column name as shown here:

    SELECT SUM(col1) AS TOTAL FROM my_db.my_sch.T; -- Correct
    SELECT SUM(col1)          FROM my_db.my_sch.T; -- Error: aggregated column needs an explicit alias.
    
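The aliasing rule above also covers custom function calls and mixed column lists. The following is a minimal sketch, not taken from a real template: the table my_db.my_sch.T, the table alias p, the status column, and the output alias names are placeholders chosen for illustration, and cleanroom.my_function is the example function named above.

```sql
-- Hypothetical table, column, and alias names, for illustration only.

-- A custom function call needs an explicit alias.
SELECT cleanroom.my_function(p.hashed_email) AS processed_email
FROM my_db.my_sch.T AS p;

-- A plain column keeps its own name; the aggregated column needs an alias.
SELECT p.status, COUNT(*) AS row_count
FROM my_db.my_sch.T AS p
GROUP BY p.status;
```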

Provider-run analyses in the UI

Here is how to enable provider-run analysis in a new clean room using the clean rooms UI:

  1. The provider creates and configures a clean room, using one of the supported templates. Configure the clean room up to the Share Clean Room step.

  2. In the Share Clean Room step of clean room configuration, the provider selects Enable run analysis & query next to their own account to enable them to run all templates in this clean room that support provider-run analysis.

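    • This setting cannot be changed after the clean room is created. To change a specific account's permission to run queries in a published clean room, you must delete the clean room and create a new one.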

  3. The consumer joins and configures the clean room as is usual for all templates in the clean room, including any templates that support provider analysis. If the consumer does not want to enable a provider to run a specific template, they can omit required details for that template.

    • Before a consumer joins the clean room, a warning is shown that provider-run analysis is enabled for that clean room.

    • The consumer can run queries as soon as the clean room is joined, but there is a delay of up to 30 minutes before the provider can run the template. This setup delay occurs only during the initial join step; if the provider later adds other provider-run templates, the provider can run them as soon as the consumer configures their clean room for that template.

  4. After the join step completes, the clean room is available for both provider-run analyses and consumer-run analyses.

    Important:

    • Providers must wait about 10 minutes after the consumer installs the clean room before they can run an analysis. The delay is for additional background configuration required for provider-run analyses.

    • The consumer is billed for every analysis in this clean room, whether it is run by the provider or the consumer.

Provider-run analyses in the API

Here is how to enable provider-run analysis in a new clean room using the clean rooms API:

  1. Provider

    1. Create and configure the clean room, its data, and its policies in the standard way.

    2. Add consumers in the standard way.

    3. Enable provider-run analysis for specific consumer accounts in the clean room by calling provider.enable_provider_run_analysis.

      Important:

    • You must call provider.enable_provider_run_analysis after adding consumers to a clean room, but before any consumer installs the clean room. Each consumer account must approve this request for their data to be accessible for provider-run analyses in this clean room.

    • Any time the provider changes the provider-run analysis setting for a clean room, the clean room must be re-installed by all consumers for the change to take effect. Because it can be difficult to force all collaborators to re-install a clean room, it is more reliable for the provider to delete a published, shared clean room when changing the analysis permissions, then create a new clean room with the desired permissions.

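    4. Publish the clean room.

    5. Inform the consumer of the clean room's availability, the clean room name, and the templates to be run in the clean room.

  2. Consumer

    1. Install the clean room and link in your data in the standard way.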

    2. Set any join and column policies needed on your data.

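    3. Call either consumer.enable_templates_for_provider_run (for multiple templates) or consumer.approve_template (for a single template) to allow provider-run analysis for specific templates in the clean room.

      Note

      If the provider changes a template after the consumer has approved it, the consumer must approve the template again. Until the template is re-approved, the provider runs the previously cached version of the approved template.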

    4. (Optional) Provider-run analyses are billed to the consumer. A consumer can limit the warehouse type or sizes available for provider-run analyses: see Restricting warehouse size and type limits.

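    5. Inform the provider that you have installed the clean room and approved provider-run analysis.

  3. Provider

    1. After the consumer installs the clean room, you must enable data sharing from the consumer to the provider account so that the analysis can access the consumer's data. The process depends on whether the provider and consumer are in the same cloud region or in different cloud regions.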

      • If the provider and consumer are in the same cloud region, the provider calls provider.mount_request_logs_for_all_consumers once. If a new consumer account installs the clean room later and you want to use their data in this template, you must re-run this procedure to be able to access that data.

      • If the provider and consumer are in different cloud regions, the provider and consumer must enable cross-cloud auto-fulfillment. When a provider runs an analysis across regions, the query can take some time to complete, because query data is sent from the provider’s region to the consumer’s region and back.

    2. Call provider.view_warehouse_sizes_for_template to see if the consumer has limited the type and size of warehouse used for the analysis. If the consumer has limited your warehouse selection, you must provide supported warehouse_type and warehouse_size values in your analysis request in the next step. If the consumer has not specified warehouse limits, those fields are optional in your request. For more information, see Restricting warehouse size and type limits.

    3. Run the analysis by calling provider.submit_analysis_request with the template name, the table names, and the template arguments. If the consumer has specified limits on warehouse sizes or types, you must also specify the warehouse size and type in your request.

      • Save the request ID returned by provider.submit_analysis_request; the ID is needed to check the status and results of the analysis.

    4. Check the status of the analysis by calling provider.check_analysis_status. When status is reported as COMPLETED, call provider.get_analysis_result to get the analysis results.
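The API steps above can be sketched end to end as follows. This is a rough outline under stated assumptions, not a verified script: the argument lists for submit_analysis_request and view_warehouse_sizes_for_template follow the examples elsewhere on this page, while the argument lists shown for enable_provider_run_analysis, enable_templates_for_provider_run, mount_request_logs_for_all_consumers, check_analysis_status, and get_analysis_result are assumptions that should be checked against the API reference. All values in angle brackets are placeholders, and the table name and template arguments are copied from the warehouse example on this page.

```sql
-- Sketch only. Provider and consumer statements run in different accounts.
SET cleanroom_name = '<clean room name>';
SET template_name  = '<template name>';

-- PROVIDER (step 1.3): after adding consumers, before any consumer installs.
-- Assumed arguments: clean room name, list of consumer account locators.
CALL samooha_by_snowflake_local_db.provider.enable_provider_run_analysis(
  $cleanroom_name, ['<CONSUMER_ACCOUNT_LOCATOR>']);

-- CONSUMER (step 2.3): approve templates for provider-run analysis.
-- Assumed arguments: clean room name, template names, differential privacy flag.
CALL samooha_by_snowflake_local_db.consumer.enable_templates_for_provider_run(
  $cleanroom_name, ['<template name>'], FALSE);

-- PROVIDER (step 3.1, same cloud region): mount the consumer request logs.
CALL samooha_by_snowflake_local_db.provider.mount_request_logs_for_all_consumers(
  $cleanroom_name);

-- PROVIDER (step 3.2): check for warehouse limits set by the consumer.
CALL samooha_by_snowflake_local_db.provider.view_warehouse_sizes_for_template(
  $cleanroom_name, $template_name, '<CONSUMER_ACCOUNT_LOCATOR>');

-- PROVIDER (step 3.3): submit the request and note the returned request ID.
CALL samooha_by_snowflake_local_db.provider.submit_analysis_request(
  $cleanroom_name,
  '<CONSUMER_ACCOUNT_LOCATOR>',
  $template_name,
  ['SAMOOHA_SAMPLE_DATABASE.DEMO.CUSTOMERS'],   -- provider tables
  ['SAMOOHA_SAMPLE_DATABASE.DEMO.CUSTOMERS'],   -- consumer tables
  object_construct(
    'dimensions', ['c.REGION_CODE'],
    'measure_type', ['AVG'],
    'measure_column', ['c.DAYS_ACTIVE']
  )
);
SET request_id = '<request ID returned by submit_analysis_request>';

-- PROVIDER (step 3.4): poll until the status is COMPLETED, then fetch results.
-- Assumed arguments: clean room name, request ID, consumer account locator.
CALL samooha_by_snowflake_local_db.provider.check_analysis_status(
  $cleanroom_name, $request_id, '<CONSUMER_ACCOUNT_LOCATOR>');
CALL samooha_by_snowflake_local_db.provider.get_analysis_result(
  $cleanroom_name, $request_id, '<CONSUMER_ACCOUNT_LOCATOR>');
```

If the consumer has restricted warehouses for the template, the object_construct arguments would also need the warehouse_type and warehouse_size keys described in the next section.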

Restricting warehouse size and type limits

Here is how a consumer sets a warehouse size and type limitation, and how a provider chooses a warehouse size and type when running an analysis:

  1. The consumer calls consumer.set_provider_run_configuration and specifies the warehouse sizes and types that the provider can use for a given template.

    CALL samooha_by_snowflake_local_db.consumer.set_provider_run_configuration(
      $cleanroom_name,
      {
        $template_name: {
          'warehouse_type': 'STANDARD',
          'warehouse_size': ['MEDIUM', 'LARGE']}
      });
    
  2. The provider calls provider.view_warehouse_sizes_for_template to see which warehouse sizes and types are allowed for provider-run analyses with that template.

    CALL samooha_by_snowflake_local_db.provider.view_warehouse_sizes_for_template(
      $cleanroom_name,
      $template_name,
      $consumer_account_loc
    );
    
  3. The provider specifies which supported warehouse size and type to use in their analysis run request.

    CALL samooha_by_snowflake_local_db.provider.submit_analysis_request(
      $cleanroom_name,
      $consumer_locator_id,
      $template_name,
      ['SAMOOHA_SAMPLE_DATABASE.DEMO.CUSTOMERS'],
      ['SAMOOHA_SAMPLE_DATABASE.DEMO.CUSTOMERS'],
      object_construct(
        'dimensions', ['c.REGION_CODE'],
        'measure_type', ['AVG'],
        'measure_column', ['c.DAYS_ACTIVE'],
        'warehouse_type', 'STANDARD',      -- Any other value would cause the request to fail.
        'warehouse_size', 'LARGE'          -- Only MEDIUM and LARGE supported.
      )
    );
    

Install and run the code example

You can download and install a complete running example to create and run a provider-run analysis. To run this example, you need two Snowflake accounts in the same organization and cloud hosting region with the Snowflake Data Clean Room environment installed.

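  1. Download the example notebook.

  2. Install the notebook in both the provider account and the consumer account.

    To upload a notebook:

    1. In the navigation menu, select Projects » Notebooks.

    2. Select + Notebook » Import .ipynb file.

    3. Select the .ipynb file that you downloaded.

    4. Name the file as you like, and select a database and schema.

    5. Keep the default warehouse, APP_WH.

    6. Select Create.

  3. In the provider account, open the notebook and complete the provider portion to create the clean room.

  4. In the consumer account, open the notebook and complete the consumer portion to install and configure the clean room and run the template.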

The following procedures manage which party can run analyses in a clean room.

Consumer-run analysis (allowed by default): changes take effect immediately.

  • provider.enable_consumer_run_analysis

  • provider.disable_consumer_run_analysis

Provider-run analysis (disabled by default): changes require the consumer to reinstall the clean room.

  • provider.enable_provider_run_analysis (requires the consumer to approve by calling consumer.enable_templates_for_provider_run)

  • provider.disable_provider_run_analysis
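As a rough illustration of how a provider might call the procedures listed above, the sketch below assumes that each one takes the clean room name and a list of consumer account locators. That argument list is an assumption and should be confirmed in the provider API reference; the values in angle brackets are placeholders.

```sql
-- Sketch only: argument lists are assumed, values in angle brackets are placeholders.
SET cleanroom_name = '<clean room name>';

-- Consumer-run analysis: the change takes effect immediately.
CALL samooha_by_snowflake_local_db.provider.disable_consumer_run_analysis(
  $cleanroom_name, ['<CONSUMER_ACCOUNT_LOCATOR>']);
CALL samooha_by_snowflake_local_db.provider.enable_consumer_run_analysis(
  $cleanroom_name, ['<CONSUMER_ACCOUNT_LOCATOR>']);

-- Provider-run analysis: the consumer must reinstall the clean room
-- before this change takes effect.
CALL samooha_by_snowflake_local_db.provider.disable_provider_run_analysis(
  $cleanroom_name, ['<CONSUMER_ACCOUNT_LOCATOR>']);
```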