Data offering specification
Defines a set of tables that a provider is willing to share with analysis runners, as well as sharing rules, such as policies, column formats, and whether the table must be used with a template.
The data provider submits this specification by calling REGISTER_DATA_OFFERING, which returns an offering ID that can be used in the collaboration specification.
A data offering won’t be available in a collaboration until the account that registered the data offering joins the collaboration.
You must have the REGISTER DATA OFFERING account privilege to join any collaboration in which you can activate data; that is, you are an analysis runner and the collaboration specification includes an activation_destinations field. For more information, see the access management API reference guide.
Schema:
api_version
  The version of the Collaboration API used. Must be 2.0.0.
spec_type
  Specification type identifier. Must be data_offering.
name: data_offering_name
  A name for a set of tables and columns to expose to collaborators. This name is used as the data offering reference value in a collaboration specification. You can create multiple data offerings with overlapping tables and columns for different use cases. Must follow Snowflake identifier rules with a maximum of 75 characters and be unique within your Snowflake data clean room account. The name_version pair must be unique for all data offerings in this account.
version
  A custom version identifier for this data offering specification (maximum 20 characters). Must follow Snowflake identifier rules. The version string is given its own column in the response to VIEW_DATA_OFFERINGS and VIEW_REGISTERED_DATA_OFFERINGS, so use a value that can be sorted by increasing value. Example: V0
description: data_offering_description (Optional)
  A description of the data offering (maximum 1,000 characters).
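The constraints on the top-level fields can be checked before a specification is submitted. The following Python sketch is illustrative only: the function name check_offering_header and the example values are hypothetical, the specification is modeled as a plain dictionary, and the identifier pattern covers only unquoted Snowflake identifiers, which is a simplifying assumption. It is not the validation that REGISTER_DATA_OFFERING performs.

```python
import re

# Assumption: only unquoted identifiers are checked (a letter or underscore,
# followed by letters, digits, underscores, or dollar signs).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def check_offering_header(spec: dict) -> list:
    """Return a list of problems found in the top-level fields of a spec."""
    problems = []
    if spec.get("api_version") != "2.0.0":
        problems.append("api_version must be 2.0.0")
    if spec.get("spec_type") != "data_offering":
        problems.append("spec_type must be data_offering")

    name = spec.get("name", "")
    if not _IDENTIFIER.match(name) or len(name) > 75:
        problems.append("name must be an identifier of at most 75 characters")

    version = spec.get("version", "")
    if not _IDENTIFIER.match(version) or len(version) > 20:
        problems.append("version must be an identifier of at most 20 characters")

    # description is optional, so a missing value is accepted.
    if len(spec.get("description", "")) > 1000:
        problems.append("description must be at most 1,000 characters")

    if not spec.get("datasets"):
        problems.append("datasets must list at least one dataset")
    return problems


# Hypothetical example values.
example = {
    "api_version": "2.0.0",
    "spec_type": "data_offering",
    "name": "my_offering",
    "version": "V0",
    "description": "Example offering",
    "datasets": [{"alias": "customers"}],  # placeholder dataset entry
}
print(check_offering_header(example))
```

An empty list means that none of the checked constraints was violated; uniqueness of the name within the account can only be verified by the service itself.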
datasets
  A list of one or more datasets to make available to the collaboration.
  alias: dataset_name
    A name for this data object, used in collaboration.run. Must follow Snowflake identifier rules and be unique within this offering. Maximum 75 characters.
  data_object_fqn: fully_qualified_table_name
    Describes a single table available to collaborators. Provide the fully-qualified name of the source object in your account (database.schema.table_name). Maximum length is 773 characters.
  allowed_analyses: allowed_analysis_type
    The type of analyses that collaborators can run against this table. Required field with the following values:
      template_only: The analysis runner can query this table only by using a template listed in the collaboration specification.
      template_and_freeform_sql: The analysis runner can query this table by using either a template listed in the collaboration specification, or by using free-form SQL queries in a code environment.
  object_class (Optional)
    The type of object. One of the following values:
      ads_log: The tables and columns listed here must fit the ad log requirements.
      custom: A custom set of tables and columns that doesn’t have any special requirements.
  schema_and_template_policies
    Provide a list of column names from the table listed by data_object_fqn and define the policies and format of each column. Only columns listed here are available to collaborators. Each column has the following descriptors:
    category: category_type
      The category determines whether any column renaming is applied, and any data format enforcement that should be applied. category and column_type determine the column name exposed to the analysis runner. The following values are supported:
        join_standard: This is a joinable column with data in a format specified in the column_type field. This column is renamed to the column_type value in the shared data offering. This column is added to the clean room’s join policy.
        join_custom: This is a joinable column in any format. Use this when there isn’t an appropriate column_type for your join column. The original column name is used in the shared data offering. This column is added to the clean room’s join policy.
        timestamp: This is a projectable column that specifies a timestamp for any event. The column is renamed as timestamp in the shared data offering.
        passthrough: This is a projectable column of any other type. The original column name is used in the shared data offering.
        event_type: This is a projectable column that records an event type classification for this row, for example: “purchase”, “sign-up”, “impression”, “click”, and so on.
    column_type: <format_type> (Required when category=join_standard, ignored for other category types)
      The format of the data. If the data doesn’t conform to this format, your call to REGISTER_DATA_OFFERING will fail. Provide this field for columns where category = join_standard. category and column_type determine the column name exposed to the analysis runner. You can’t assign the same column_type value to multiple columns in the same table. The following format types are supported:
        email: A raw email address.
        hashed_email_sha256: A SHA256 hashed email.
        hashed_email_b64_encoded: A base64-encoded hashed email.
        phone: A phone number without punctuation. For example: 2015551212.
        hashed_phone_sha256: A SHA256 hashed phone number. The original number should be in the phone format.
        hashed_phone_b64_encoded: A base64-encoded hashed phone number.
        device_id: A raw device ID, such as a mobile advertising ID or a CTV device ID.
        hashed_device_id_sha256: SHA256 hashed device ID. The original should be in the device_id format.
        hashed_device_b64_encoded: A base64-encoded hashed device ID.
        ip_address: A raw IP address in IPv4 format.
        hashed_ip_address_sha256: SHA256 hashed IPv4 address. The original should be in the ip_address format.
        hashed_ip_address_b64_encoded: A base64-encoded hashed IP address.
        first_name: A raw first name.
        hashed_first_name_sha256: A SHA256 hashed first name. The original should be in the first_name format.
        hashed_first_name_b64_encoded: A base64-encoded hashed first name.
        last_name: A raw last name.
        hashed_last_name_sha256: A SHA256 hashed last name. The original should be in the last_name format.
        hashed_last_name_b64_encoded: A base64-encoded hashed last name.
    activation_allowed (Optional)
      Whether this column can be used for activation purposes. Default is false.
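The renaming rules that category and column_type impose can be summarized in a short Python sketch. It is illustrative only: the function names and column names are hypothetical, and schema_and_template_policies is modeled as a dictionary keyed by source column name, which is an assumption about the layout. The text above does not say whether event_type columns are renamed, so the sketch keeps the original name for them; treat that branch as an assumption too.

```python
from typing import Optional


def exposed_column_name(column_name: str, category: str,
                        column_type: Optional[str] = None) -> str:
    """Return the column name the analysis runner would see."""
    if category == "join_standard":
        if not column_type:
            raise ValueError("column_type is required when category is join_standard")
        return column_type            # renamed to the column_type value
    if category == "timestamp":
        return "timestamp"            # renamed to the literal name timestamp
    if category in ("join_custom", "passthrough"):
        return column_name            # original name is kept
    if category == "event_type":
        # Assumption: renaming is not documented for event_type,
        # so the original name is kept here.
        return column_name
    raise ValueError(f"unsupported category: {category}")


def exposed_columns(policies: dict) -> dict:
    """Map each source column to its exposed name.

    Rejects a column_type value that is assigned to more than one column.
    """
    seen_types = set()
    result = {}
    for column_name, descriptor in policies.items():
        category = descriptor["category"]
        column_type = descriptor.get("column_type")
        if category == "join_standard":
            if column_type in seen_types:
                raise ValueError(f"column_type {column_type} is used more than once")
            seen_types.add(column_type)
        result[column_name] = exposed_column_name(column_name, category, column_type)
    return result


# Hypothetical column names.
policies = {
    "EMAIL_HASH": {"category": "join_standard", "column_type": "hashed_email_sha256"},
    "LOYALTY_ID": {"category": "join_custom"},
    "EVENT_TS": {"category": "timestamp"},
    "REGION": {"category": "passthrough", "activation_allowed": True},
}
print(exposed_columns(policies))
```

Knowing the exposed names matters because policy configurations that refer to renamed columns must use the generated column name rather than the source name.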
  freeform_sql_policies (Optional)
    If allowed_analyses is template_and_freeform_sql, this optional field lists any Snowflake policies that should be applied in free-form SQL queries run on this data offering. For more information, see Apply the Snowflake policy to the data offering (free-form query usage only).
    The following types are supported:
    aggregation_policy (Optional)
      A single aggregation policy configuration.
      name: The fully-qualified policy name.
      entity_keys (Optional): List of column names that serve as entity keys for the aggregation policy. NOTE: if these columns have been renamed, you must use the generated column name.
    join_policy (Optional)
      A single join policy configuration.
      name: The fully-qualified policy name. NOTE: if this column has been renamed, you must use the generated column name.
      columns (Optional): List of column names this policy applies to.
    masking_policies (Optional)
      An array of masking policy configurations.
      name: The fully-qualified policy name. NOTE: if this column has been renamed, you must use the generated column name.
      columns (Optional): List of column names this policy applies to.
    projection_policies (Optional)
      An array of projection policy configurations.
      name: The fully-qualified policy name. NOTE: if this column has been renamed, you must use the generated column name.
      columns (Optional): List of column names this policy applies to.
    row_access_policy (Optional)
      An object that describes a row access policy configuration.
      name: The fully-qualified policy name. NOTE: if this column has been renamed, you must use the generated column name.
      columns (Optional): List of column names this policy applies to.
  require_freeform_sql_policy (Optional)
    Whether this data source must define freeform_sql_policies. This is used as a failsafe to prevent linking a data source that supports free-form SQL queries without assigning policies to it.
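The failsafe can be pictured as a simple pre-submission check. The Python sketch below is illustrative only: check_freeform_failsafe, the policy names, and the table name are hypothetical, the dataset is modeled as a plain dictionary, and the sketch assumes that require_freeform_sql_policy sits alongside freeform_sql_policies in a dataset entry and that an omitted flag means no policies are required.

```python
def check_freeform_failsafe(dataset: dict) -> None:
    """Raise if a dataset that requires free-form SQL policies defines none."""
    # Assumption: a missing flag is treated as "not required".
    required = dataset.get("require_freeform_sql_policy", False)
    policies = dataset.get("freeform_sql_policies") or {}
    if required and not policies:
        raise ValueError(
            f"dataset {dataset.get('alias')} requires freeform_sql_policies "
            "but defines none"
        )


# Hypothetical dataset entry; all object and policy names are placeholders.
dataset = {
    "alias": "customers",
    "data_object_fqn": "my_db.my_schema.customers",
    "allowed_analyses": "template_and_freeform_sql",
    "require_freeform_sql_policy": True,
    "freeform_sql_policies": {
        "aggregation_policy": {
            "name": "my_db.my_schema.agg_policy",
            # Renamed columns are referenced by their generated name.
            "entity_keys": ["hashed_email_sha256"],
        },
        "masking_policies": [
            {"name": "my_db.my_schema.mask_policy", "columns": ["REGION"]},
        ],
    },
}
check_freeform_failsafe(dataset)  # passes: policies are present
```

The example uses single objects for aggregation_policy and lists for masking_policies, following the "single configuration" and "array" wording of the field descriptions.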