Using internal tables for multistep workflows¶
Overview¶
Many clean room use cases involve running a single SQL query against one or more tables in a clean room and displaying the results in the response. However, there are use cases where you might require creating an internal table that can be used within subsequent templates to support a multistep workflow. For example, a machine learning flow, where the model is trained once against a dataset and then run multiple times against varying input data, either singly or in batches.
Creating internal tables¶
You can create internal tables inside a clean room to store intermediary results, or as persistent storage for usage downstream (for example, to save training data that is used for multiple runs). See properties and guidance of internal tables below:
You can create internal tables by using a clean room template that executes CREATE TABLE, or by running a UDF/UDTF that uses Python to create a table.
Internal tables can be created in the
cleanroomschema, which is available by default. If a custom schema is preferred, the schema must be created first before creating the table.By default, internal tables are only accessible by approved templates in the clean room. If access needs to be provided outside of templates, then the CLEANROOM_PUBLIC_ROLE application role of the clean room needs to be granted corresponding privileges. For example, the following grant can be given:
GRANT SELECT ON TABLE CLEANROOM.MY_TABLE TO APPLICATION ROLE CLEANROOM_PUBLIC_ROLE;If you have proper access, you can list the internal tables in your collaboration. Internal tables can be found at
SFDCR_collaboration_name.cleanroom, and can be listed by running the following SQL code:SHOW TABLES IN SCHEMA SFDCR_collaboration_name.CLEANROOM;.Internal tables are deleted when the collaboration is removed. However, if an internal table is designed to have a shorter lifetime than the collaboration, consider deleting the table when it’s no longer needed.
Here are some examples of creating an internal table:
A JinjaSQL template can create an internal table, which is done in some types of activation.
This example template creates the table and returns the table name, so that the name can be passed in as a parameter to other templates.
A UDF can create an internal table. This is typically done by executing SQL in Python.