About Openflow Connector for Confluence Data Center

Note

The connector is subject to the Snowflake Connector Terms.

This topic describes the basic concepts of the Openflow Connector for Confluence Data Center, its use cases, benefits, key features, and limitations.

The Openflow Connector for Confluence Data Center connects a Confluence Data Center site to a Snowflake account so that it can ingest Confluence documents and document permissions. Changes made to Confluence data are automatically reflected in Snowflake.

Supported use cases

The following use cases are supported by the Openflow Connector for Confluence Data Center:

  • The Simple flow stores Confluence content in Snowflake. Ingested documents are made available in a stage, with document metadata stored in a table. You can then build pipelines or applications on top of the ingested documents.

  • The Cortex flow enables natural language and keyword search on your documents for conversational analysis or for use in AI Assistants using SQL, Python or REST APIs.

Both use cases can run with or without document ACLs. If you choose to ingest documents with ACLs, the ACLs are stored in tables together with other document metadata.

Note

Limitations are listed for the predefined versioned flow. If the flow is customized and doesn’t use the predefined components, the limitations related to those components may not apply.

Simple ingestion use case

With the Simple flow, Confluence content is ingested into Snowflake and stored in a stage. Processing concludes once the file is stored in the internal stage of the destination schema, and its metadata is saved in the DOC_METADATA table.

Use the connector definition to perform custom processing on ingested files.

Cortex use case

You can use the Cortex Search service to build chat and search applications that query Confluence documents. After you install and configure the connector and it begins ingesting content from Confluence, you can query the Cortex Search service directly or build applications on top of it.

For more information about using Cortex Search, see Query a Cortex Search service.

Response filtering by user where the flow includes ACLs

To restrict responses from the Cortex Search service to documents that a specific user has access to in Confluence, specify a filter containing the user ID or email address of the user when you query Cortex Search, for example filter.@contains.user_ids or filter.@contains.user_emails. The name of the Cortex Search service created by the connector is CORTEX_SEARCH_SERVICE.

Run the following SQL commands in a SQL worksheet to query the Cortex Search service with files ingested from your Confluence instance.

Replace the following:

  • application_instance_name: Name of your destination database.

  • user_email: Email address of the user for whom you want to filter the responses.

  • your_question: Question that you want to get responses for.

  • number_of_results: Maximum number of results to return in the response. The maximum value is 1000, and the default value is 10.

SELECT PARSE_JSON(
  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
    '<application_instance_name>.cortex.cortex_search_service',
    '{
      "query": "<your_question>",
      "columns": ["chunk", "web_url"],
      "filter": {"@contains": {"user_emails": "<user_email>"}},
      "limit": <number_of_results>
    }'
  )
)['results'] AS results;
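When the query is issued from application code, the JSON request body is easy to get wrong by hand (quoting, the limit range, the filter shape). The following Python sketch shows one way to build it. The function names build_search_request and build_preview_sql are hypothetical helpers, not part of any Snowflake library, and the quote and backslash escaping assumes the statement is sent as literal SQL text; where your client supports bind parameters, passing the JSON as a bound value is usually the safer choice.

```python
import json

MAX_RESULTS = 1000  # upper bound on "limit" stated in this topic
ACL_COLUMNS = ("user_emails", "user_ids")  # ACL columns listed in this topic


def build_search_request(question, user_value, acl_column="user_emails",
                         columns=("chunk", "web_url"), limit=10):
    """Return the JSON request body for SEARCH_PREVIEW as a string (hypothetical helper)."""
    if acl_column not in ACL_COLUMNS:
        raise ValueError(f"acl_column must be one of {ACL_COLUMNS}")
    if not 1 <= limit <= MAX_RESULTS:
        raise ValueError(f"limit must be between 1 and {MAX_RESULTS}")
    request = {
        "query": question,
        "columns": list(columns),
        # Restrict results to pages whose ACL array contains this user.
        "filter": {"@contains": {acl_column: user_value}},
        "limit": limit,
    }
    return json.dumps(request)


def build_preview_sql(database, request_json):
    """Wrap the request body in the SELECT statement from this topic (hypothetical helper)."""
    service = f"{database}.cortex.cortex_search_service"
    # Escape backslashes first, then single quotes, so the JSON survives
    # as a single-quoted SQL string literal.
    escaped = request_json.replace("\\", "\\\\").replace("'", "''")
    return (
        "SELECT PARSE_JSON(\n"
        "  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(\n"
        f"    '{service}',\n"
        f"    '{escaped}'\n"
        "  )\n"
        ")['results'] AS results;"
    )
```

For example, build_preview_sql("my_db", build_search_request("How do I request access?", "jane@example.com", limit=5)) produces a statement equivalent to the SQL above, where my_db and the email address are placeholder values.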

Column names include:

| Column name | Type | Description |
|---|---|---|
| full_name | String | Title of the Confluence page. |
| web_url | String | URL that displays an original Confluence page in a browser. |
| last_modified_date_time | String | Date and time when the page was most recently modified. |
| chunk | String | A piece of text from the page that matched the Cortex Search query. |
| user_ids | Array | An array of IDs of the Confluence users who have access to the page. It also includes user IDs from all the Confluence groups that are assigned to the page. |
| user_emails | Array | An array of Confluence user emails with access to the page. It also includes user emails from all the Confluence groups assigned to the page. |
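On the client side, the columns listed above map naturally onto a small typed record. The following Python sketch is illustrative only: SearchResult, parse_results, and visible_to are hypothetical names, and it assumes the results value reaches your code as a JSON array string whose objects use the column names above. The visible_to check is a defensive double check, not a replacement for the server-side filter, and it only works when user_emails is among the requested columns.

```python
import json
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class SearchResult:
    """One search result, using the column names listed in this topic."""
    full_name: Optional[str] = None
    web_url: Optional[str] = None
    last_modified_date_time: Optional[str] = None
    chunk: Optional[str] = None
    user_ids: List[str] = field(default_factory=list)
    user_emails: List[str] = field(default_factory=list)


def parse_results(results_json):
    """Convert a JSON array of result objects into SearchResult records."""
    known = {f.name for f in fields(SearchResult)}
    parsed = []
    for row in json.loads(results_json):
        # Keep only documented columns and skip nulls so defaults apply;
        # a query returns just the columns it asked for.
        values = {k: v for k, v in row.items() if k in known and v is not None}
        parsed.append(SearchResult(**values))
    return parsed


def visible_to(results, user_email):
    """Return only the results whose user_emails array lists the given user."""
    return [r for r in results if user_email in r.user_emails]
```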

Limitations

The connector is subject to the following limitations:

For optimal performance, Snowflake recommends that you enable hybrid tables in the account where the data is ingested. When hybrid tables are not available, standard tables are used, which can impact ingestion performance.

Next steps

Review and set up the connector