About Openflow Connector for Confluence Data Center¶

Note

The connector is subject to the Snowflake Connector Terms.

This topic describes the basic concepts of the Openflow Connector for Confluence Data Center, its use cases, benefits, key features, and limitations.

The Openflow Connector for Confluence Data Center connects a Confluence Data Center site to a Snowflake account so that the connector can ingest Confluence documents and document permissions. When Confluence data changes, the connector automatically applies those changes in Snowflake.

Supported use cases¶

The following use cases are supported by the Openflow Connector for Confluence Data Center:

The Simple flow stores Confluence content in Snowflake. Ingested documents are made available in a stage, and document metadata is stored in a table. You can then build pipelines or applications on top of the ingested documents.
The Cortex flow enables natural language and keyword search on your documents for conversational analysis or for use in AI assistants, through SQL, Python, or REST APIs.

Both flows can ingest documents with or without document access control lists (ACLs). If you choose to ingest documents with ACLs, the ACLs are stored in tables together with other document metadata.

Note

The limitations listed in this topic apply to the predefined, versioned flow. If you customize the flow so that it no longer uses the predefined components, the limitations related to those components may not apply.

Simple ingestion use case¶

With the Simple flow, Confluence content is ingested into Snowflake and stored in a stage. Processing concludes once the file is stored in the internal stage of the destination schema, and its metadata is saved in the DOC_METADATA table.

Use the connector definition to perform custom processing on ingested files.

Cortex use case¶

You can use the Cortex Search service to build chat and search applications that query Confluence documents. After you install and configure the connector and it begins ingesting content from Confluence, you can query the Cortex Search service directly or build applications on top of it.

For more information about using Cortex Search, see Query a Cortex Search service.

Response filtering by user where the flow includes ACLs¶

To restrict responses from the Cortex Search service to documents that a specific user can access in Confluence, specify a filter containing the user ID or email address of that user when you query Cortex Search, for example filter.@contains.user_ids or filter.@contains.user_emails. The name of the Cortex Search service created by the connector is CORTEX_SEARCH_SERVICE.

Run the following SQL command in a SQL worksheet to query the Cortex Search service over the files ingested from your Confluence instance.

Replace the following:

application_instance_name: Name of your destination database.
user_email: Email address of the user whose access you want to use to filter the responses.
your_question: Question that you want to get responses for.
number_of_results: Maximum number of results to return in the response. The maximum value is 1000, and the default value is 10.

SELECT PARSE_JSON(
  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
    '<application_instance_name>.cortex.cortex_search_service',
    '{
      "query": "<your_question>",
      "columns": ["chunk", "web_url"],
      "filter": {"@contains": {"user_emails": "<user_email>"}},
      "limit": <number_of_results>
    }'
  )
)['results'] AS results;
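If you call the service from Python instead of a worksheet, the JSON request body shown in the query above can be assembled programmatically rather than by string substitution. The following is a minimal sketch using only the Python standard library. The function name `build_search_request` and its parameters are hypothetical, and rejecting a limit below 1 is an assumption; only the field names, the default of 10, and the maximum of 1000 come from this topic.

```python
import json

# Values stated in the parameter list above.
DEFAULT_LIMIT = 10
MAX_LIMIT = 1000


def build_search_request(question, user_email, limit=DEFAULT_LIMIT,
                         columns=("chunk", "web_url")):
    """Return the JSON request body for a user-filtered Cortex Search query.

    Hypothetical helper: it only builds the string that is passed as the
    second argument of SNOWFLAKE.CORTEX.SEARCH_PREVIEW. It does not
    connect to Snowflake.
    """
    # Assumption: a limit below 1 is treated as invalid.
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    body = {
        "query": question,
        "columns": list(columns),
        # Restrict results to pages this user can access in Confluence.
        "filter": {"@contains": {"user_emails": user_email}},
        "limit": limit,
    }
    # json.dumps escapes quotes and other special characters in the question.
    return json.dumps(body)


if __name__ == "__main__":
    print(build_search_request("How do I request VPN access?",
                               "jane.doe@example.com", limit=5))
```

Building the body with `json.dumps` keeps the JSON valid even when the question contains double quotes. When you pass the resulting string to the query, a bind parameter is generally safer than concatenating it into the SQL text.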

The Cortex Search service includes the following columns:


Column name	Type	Description
`full_name`	String	Title of the Confluence page.
`web_url`	String	URL that displays an original Confluence page in a browser.
`last_modified_date_time`	String	Date and time when the page was most recently modified.
`chunk`	String	A piece of text from the page that matched the Cortex Search query.
`user_ids`	Array	An array of IDs of the Confluence users who have access to the page, including the members of all Confluence groups assigned to the page.
`user_emails`	Array	An array of email addresses of the Confluence users who have access to the page, including the members of all Confluence groups assigned to the page.
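When a query requests the `chunk` and `web_url` columns, each entry in the returned `results` array carries the matched text and a link back to the original page. As a rough sketch, assuming the `results` value has already been fetched into Python as a JSON string, a small helper can turn it into short, linkable snippets. The function name `summarize_results`, the `max_chars` parameter, and the sample data are hypothetical.

```python
import json


def summarize_results(results_json, max_chars=80):
    """Format each search result as '<web_url>: <snippet>'.

    Hypothetical helper: assumes results_json is a JSON array of objects
    that contain the requested "chunk" and "web_url" columns.
    """
    lines = []
    for row in json.loads(results_json):
        # Collapse runs of whitespace so the snippet fits on one line.
        snippet = " ".join(row.get("chunk", "").split())
        if len(snippet) > max_chars:
            snippet = snippet[:max_chars].rstrip() + "..."
        lines.append(f"{row.get('web_url', '<no url>')}: {snippet}")
    return lines


if __name__ == "__main__":
    # Made-up sample shaped like the "results" array; not real connector output.
    sample = json.dumps([
        {"chunk": "Rotate keys  every 90 days.",
         "web_url": "https://confluence.example.com/x/1"},
    ])
    for line in summarize_results(sample):
        print(line)
```

Keeping `web_url` next to each snippet lets an application cite the source page for every answer it shows.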

Limitations¶

The connector is subject to the following limitations:

This connector is intended for use with Confluence Data Center only, and does not support Confluence Cloud.
Cortex Parse Document limitations and requirements apply.
Cortex Search limitations apply.
Only a Confluence Personal Access Token can be used for authentication.

For optimal performance, Snowflake recommends that you enable hybrid tables in the account where the data is ingested. When hybrid tables are not available, standard tables are used, which can reduce ingestion performance.

Next steps¶

Review and set up the connector