Set up the Atlassian Jira Cloud (Core) flow¶
Note
This connector is subject to the Snowflake Connector Terms.
This topic describes the steps to install and configure the Atlassian Jira Cloud (Core) flow, the core flow of the Openflow Connector for Jira Cloud. The agile flow is documented separately in Set up the Atlassian Jira Cloud (Agile) flow.
Prerequisites¶
Ensure that you have reviewed About Openflow Connector for Jira Cloud.
Ensure that you have completed Set up Openflow - BYOC or Set up Openflow - Snowflake Deployments.
If using Openflow - Snowflake Deployments, ensure that you’ve reviewed configuring required domains and have granted access to the required domains for the Jira Cloud connector.
Get the credentials¶
As a Jira Cloud administrator, perform the following tasks in your Atlassian account:
Navigate to the API tokens page.
Select Create API token with scopes.
In the Create an API token dialog box, provide a descriptive name and select an expiration date for the API token, which can range from 1 to 365 days.
Select Jira as the app for the API token.
Select the required scopes based on the features you plan to use. See Required API scopes for details.
Select Create token.
In the Copy your API token dialog box, select Copy to copy your generated API token, then paste the token into the connector parameters or save it securely.
Select Close to close the dialog box.
Required API scopes¶
The core flow always requires the following baseline Jira API scopes:
- `read:jira-work` (covers issues, projects, fields, comments, changelogs, worklogs, votes, watchers, remote links, permissions, project components, and project versions)
- `read:jira-user` (covers users and user groups, as well as the connection verification and timezone lookup that run at startup against `GET /rest/api/3/myself`)
The API token owner additionally needs the Browse projects Jira permission on every project that you want to ingest.
Some optional tables require additional scopes or permissions on top of the baseline:
| Table (Enabled Tables value) | Additional Jira API scope | Additional Jira permission |
|---|---|---|
| `ISSUE_VOTE` | None. | View voters and watchers on the relevant projects. |
| `ISSUE_WATCHER` | None. | View voters and watchers on the relevant projects. |
| `WORKLOG` | None. | View worklogs on the relevant projects. |
| `ISSUE_SECURITY_SCHEME` |  | Administer Jira (global). |
| `PERMISSION` |  | Administer Jira (global). |
Comments restricted to specific roles or groups are visible only when the API token owner is a member of these roles or groups, regardless of the token scope or permission configuration.
Tokens without scopes are also supported and grant access based solely on the API token owner’s permissions. However, tokens with scopes are recommended for fine-grained access control.
Set up Snowflake account¶
As a Snowflake account administrator, perform the following tasks:
Create a new role or use an existing role.
Create a new Snowflake service user with type SERVICE.
Grant the Snowflake service user the role you created in the previous step.
Configure key-pair authentication for the Snowflake service user from step 2.
Configure a secrets manager supported by Openflow (recommended), for example, AWS, Azure, and HashiCorp, and store the public and private keys in the secret store.
Note
If, for any reason, you don’t want to use a secrets manager, you are responsible for safeguarding the public key and private key files used for key-pair authentication according to the security policies of your organization.
After the secrets manager is configured, determine how you will authenticate to it. On AWS, use the EC2 instance role associated with Openflow; this way, no other secrets have to be persisted.
In Openflow, configure a Parameter Provider associated with this secrets manager: from the main menu (⋮) in the upper-right corner, navigate to Controller Settings » Parameter Provider, and then fetch your parameter values.
At this point, all credentials can be referenced with the associated parameter paths and no sensitive values need to be persisted within Openflow.
If any other Snowflake users require access to the tables ingested by the connector (for example, for custom processing in Snowflake), then grant those users the role created in step 1.
Create a database and schema in Snowflake for the connector to store ingested data, and grant the role created in the first step the required privileges on that database and schema (see the SQL sketch after these steps).
Create a warehouse for the connector to use, or use an existing one. Start with the smallest warehouse size, then experiment with the size depending on the amount of data transferred. Large data volumes typically scale better with multi-cluster warehouses than with larger warehouse sizes.
Ensure that the user with the role used by the connector has the required privileges to use the warehouse. If not, grant the required privileges to the role.
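The following SQL is a minimal sketch of these steps. The object names (JIRA_CONNECTOR_ROLE, JIRA_CONNECTOR_USER, JIRA_DB, JIRA_SCHEMA, JIRA_WH) are hypothetical placeholders, and the exact set of privileges should be adapted to your organization's policies.

```sql
-- Minimal sketch of the Snowflake-side setup (all names are placeholders).
CREATE ROLE IF NOT EXISTS JIRA_CONNECTOR_ROLE;

-- Service user for the connector, authenticated with a key pair.
CREATE USER IF NOT EXISTS JIRA_CONNECTOR_USER
  TYPE = SERVICE
  DEFAULT_ROLE = JIRA_CONNECTOR_ROLE
  RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';  -- public key from your generated key pair

GRANT ROLE JIRA_CONNECTOR_ROLE TO USER JIRA_CONNECTOR_USER;

-- Database and schema where the connector stores ingested data.
CREATE DATABASE IF NOT EXISTS JIRA_DB;
CREATE SCHEMA IF NOT EXISTS JIRA_DB.JIRA_SCHEMA;
GRANT USAGE ON DATABASE JIRA_DB TO ROLE JIRA_CONNECTOR_ROLE;
GRANT USAGE, CREATE TABLE ON SCHEMA JIRA_DB.JIRA_SCHEMA TO ROLE JIRA_CONNECTOR_ROLE;

-- Warehouse for the connector; start with the smallest size.
CREATE WAREHOUSE IF NOT EXISTS JIRA_WH
  WAREHOUSE_SIZE = XSMALL
  AUTO_SUSPEND = 300
  AUTO_RESUME = TRUE;
GRANT USAGE ON WAREHOUSE JIRA_WH TO ROLE JIRA_CONNECTOR_ROLE;
```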
Set up the connector¶
The core flow is shipped as the Atlassian Jira Cloud (Core) process group. As a data engineer, perform the following tasks to install and configure it.
Install the connector¶
To install the connector, do the following as a data engineer:
Navigate to the Openflow overview page. In the Featured connectors section, select View more connectors.
On the Openflow connectors page, find the connector and select Add to runtime.
In the Select runtime dialog, select your runtime from the Available runtimes drop-down list and select Add.
Note
Before you install the connector, ensure that you have created a database and schema in Snowflake for the connector to store ingested data.
Authenticate to the deployment with your Snowflake account credentials and select Allow when prompted to allow the runtime application to access your Snowflake account. The connector installation process takes a few minutes to complete.
Authenticate to the runtime with your Snowflake account credentials.
The Openflow canvas appears with the connector process group added to it; after import, the core flow appears as the Atlassian Jira Cloud (Core) process group.
Configure the connector¶
Right-click on the imported Atlassian Jira Cloud (Core) process group and select Parameters.
Populate the required parameter values as described in Flow parameters.
Flow parameters¶
The core flow uses the following parameter contexts:
Jira Cloud (Core) Source Parameters: Used to establish connection with the Jira API.
Jira Cloud (Core) Destination Parameters: Used to establish connection with Snowflake.
Jira Cloud (Core) Ingestion Parameters: Used to define the configuration of data ingested from Jira.
Jira Cloud (Core) Source Parameters¶
| Parameter | Description |
|---|---|
| Jira Email | Email address for the Atlassian account used for authentication. |
| Jira API Token | API access token for your Atlassian Jira account. See Required API scopes for the scopes to configure. |
| Environment URL | URL to the Atlassian Jira environment. For example, `https://<your-domain>.atlassian.net`. |
Jira Cloud (Core) Destination Parameters¶
| Parameter | Description | Required |
|---|---|---|
| Destination Database | The database where data will be persisted. It must already exist in Snowflake. The name is case-sensitive; for unquoted identifiers, provide the name in uppercase. | Yes |
| Destination Schema | The schema where data will be persisted. It must already exist in Snowflake. The name is case-sensitive; for unquoted identifiers, provide the name in uppercase. | Yes |
| Snowflake Authentication Strategy | Strategy for authenticating to Snowflake. Use `SNOWFLAKE_SESSION_TOKEN` when running the flow on an Openflow - Snowflake Deployments runtime, and `KEY_PAIR` when using key-pair authentication, such as with Openflow - BYOC. | Yes |
| Snowflake Account Identifier | Snowflake account identifier, formatted as `[organization-name]-[account-name]`, where data will be persisted. Used with the `KEY_PAIR` authentication strategy; with `SNOWFLAKE_SESSION_TOKEN`, the account is taken from the runtime session. | Yes |
| Snowflake Private Key | The RSA private key used for authentication with the `KEY_PAIR` strategy. Either this parameter or Snowflake Private Key File must be set. | No |
| Snowflake Private Key File | Path to the file containing the RSA private key used for authentication with the `KEY_PAIR` strategy, formatted according to PKCS8 standards with standard PEM headers and footers. | No |
| Snowflake Private Key Password | Password for the Snowflake private key file, if the key file is encrypted. | No |
| Snowflake Role | Snowflake role used during query execution; the role created during the Snowflake account setup. | Yes |
| Snowflake Username | Username of the Snowflake service user used to connect to the Snowflake instance with the `KEY_PAIR` strategy. | Yes |
| Oversized Value Strategy | Determines how the connector handles values that exceed its internal size limit (16 MB) during replication. | No |
| Snowflake Warehouse | Snowflake warehouse used to run queries. | Yes |
Jira Cloud (Core) Ingestion Parameters¶
| Parameter | Description |
|---|---|
| Enabled Tables | Comma-separated list of optional tables to populate. See Enabled tables configuration for the full list of values and guidance on which tables to enable. |
| Issue Fields | A list of fields to return for each issue, used to retrieve a subset of fields. See Issue fields configuration for available values and custom field handling. Default value: `*standard`. |
| Project Keys Filter | Optional comma-separated list of Jira project keys that limits ingestion to specific projects. If empty, all projects accessible to the API token owner are fetched. For example, `PROJ1,PROJ2`. |
| Deletes Fetch Strategy | Strategy for fetching deleted issues. |
| Merge Interval | Time interval between journal-to-destination merge operations. When a merge runs, the Snowflake warehouse resumes. The merge is skipped if no new data has been loaded since the previous merge. |
Run the flow¶
Right-click on the canvas and select Enable all Controller Services.
Right-click on the Atlassian Jira Cloud (Core) process group and select Start. The flow starts the data ingestion.
On first run, the flow creates the required Snowflake tables in the destination schema. See Destination tables for the full list of tables created by the core flow and the parameters that control which optional tables are populated.
Resetting the connector state¶
If you need to change the project filter or want to restart the ingestion from scratch, you must clear the ingestion state. The core flow uses a centralized state service rather than per-processor state.
To reset the state, perform the following steps:
Right-click the Atlassian Jira Cloud (Core) process group and select Stop.
Navigate to the Controller Settings for the process group.
Find the StandardJiraIngestionStateService controller service and select View State.
Select Clear State. This clears all project tracking, pagination, and timestamp state.
Optionally, update the connector parameters if needed.
Right-click the Atlassian Jira Cloud (Core) process group and select Start.
Note
Clearing the ingestion state causes the connector to re-fetch all data from the beginning. The
destination tables are not truncated. Existing rows are updated in place, and rows that no
longer exist in Jira are flagged with _SNOWFLAKE_DELETED = TRUE.
Accessing the data¶
Data fetched from Jira is available in the destination tables with explicit column schemas. There is no need to use JSON flattening or views to query the data.
Each entity is stored in its own table, so related entities can be joined directly. For example, you can query issues together with their comments.
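A minimal sketch, assuming the COMMENT table references its parent issue through an ISSUE_ID column and that the ISSUE table exposes ID and SUMMARY columns (the exact columns depend on your Issue Fields configuration):

```sql
-- Issues joined to their comments (ISSUE_ID, SUMMARY, and BODY are assumed column names).
SELECT
    i.ID,
    i.SUMMARY,
    c.BODY AS COMMENT_BODY
FROM ISSUE AS i
JOIN COMMENT AS c
    ON c.ISSUE_ID = i.ID;
```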
To exclude deleted issues from query results, filter on the connector-managed _SNOWFLAKE_DELETED column. The connector sets this flag to TRUE on the matching ISSUE row when an issue is deleted in Jira, so no anti-join against DELETED_ISSUE is needed.
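A sketch of the filter; `IS DISTINCT FROM TRUE` also keeps rows where the flag is NULL, in case the column is not populated for some rows:

```sql
-- Active (non-deleted) issues only.
SELECT *
FROM ISSUE
WHERE _SNOWFLAKE_DELETED IS DISTINCT FROM TRUE;
```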
The DELETED_ISSUE table is still useful when you need the deletion timestamp or the user who
performed the deletion. See Connector-managed columns for the full set of connector-managed
metadata columns.
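For example, a sketch that lists recent deletions. The DELETED_ISSUE column names used here (ISSUE_ID, DELETED_AT, DELETED_BY) are assumptions for illustration; check the actual schema in your account with DESCRIBE TABLE:

```sql
-- Recent deletions with who and when (column names are assumptions; verify first).
SELECT ISSUE_ID, DELETED_AT, DELETED_BY
FROM DELETED_ISSUE
ORDER BY DELETED_AT DESC
LIMIT 20;
```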
Enabled tables configuration¶
The Enabled Tables parameter controls which optional tables are populated. Ingestion of the
ISSUE, PROJECT, USER, and FIELD tables is always enabled and can’t be disabled.
Enabling all tables may cause performance issues and require a larger runtime.
Available values for Enabled Tables:
- `CHANGELOG` (field change history for issues)
- `COMMENT` (comments on issues)
- `ISSUE_REMOTE_LINK` (remote links attached to issues)
- `ISSUE_SECURITY_SCHEME` (issue-level security configurations)
- `ISSUE_VOTE` (users who voted on issues)
- `ISSUE_WATCHER` (users watching issues)
- `PERMISSION` (global and project permission definitions)
- `PROJECT_COMPONENT` (components defined in a project)
- `PROJECT_VERSION` (release versions of a project)
- `USER_GROUP` (group memberships per user)
- `WORKLOG` (time tracking entries on issues)
The per-issue tables (CHANGELOG, COMMENT, ISSUE_REMOTE_LINK,
ISSUE_VOTE, ISSUE_WATCHER, WORKLOG) and per-project tables
(PROJECT_COMPONENT, PROJECT_VERSION) only ingest data for issues and projects
that are also covered by Project Keys Filter.
Some tables are populated by calling the Jira API once per parent entity (for example, once per user or once per issue). On large Jira instances, enabling these tables can significantly increase the number of API calls and the load on the ingestion runtime, and can slow down population of the parent table due to back-pressure on the upstream processor. Enable these tables only when the corresponding data is required.
Issue fields configuration¶
The ISSUE table schema depends on the Issue Fields parameter. The parameter accepts a
comma-separated list of field IDs or one of the special values below. Prefix a field with a minus
(-) to exclude it. For example, *all,-description returns all fields except description.
- `*standard` (default): Fetches standard Jira fields (Summary, Status, Priority, Assignee).
- `*navigable`: Fetches all navigable fields.
- `*all`: Fetches all fields, including custom fields.

Individual field IDs can also be specified (for example, `summary,status,customfield_10001`).
The default value *standard doesn’t include custom fields. To ingest custom fields,
set this parameter to *all or list the custom fields explicitly by ID, for example,
*standard,customfield_10001. To find custom field IDs, follow
this guide.
Column names in the ISSUE table are derived from Jira field display names by:
Uppercasing the display name.
Replacing spaces with underscores.
Removing every character that isn’t a letter, digit, or underscore.
For example, the display name OF Test (Multi-User) becomes the column OF_TEST_MULTIUSER.
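As a sketch, the same transformation can be reproduced in Snowflake SQL to predict the column name a given display name will produce:

```sql
-- Reproduces the documented rules: uppercase, spaces to underscores,
-- then strip every character that isn't a letter, digit, or underscore.
SELECT REGEXP_REPLACE(
           REPLACE(UPPER('OF Test (Multi-User)'), ' ', '_'),
           '[^A-Z0-9_]',
           ''
       ) AS column_name;  -- returns OF_TEST_MULTIUSER
```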
If two fields produce the same column name after this transformation, the second field’s column
is suffixed with __<flattened_field_id> to keep names unique. For example, two fields with
display name Custom Field and IDs customfield_1 and customfield_2 produce columns
CUSTOM_FIELD and CUSTOM_FIELD__CUSTOMFIELD_2.
Jira field types are mapped to Snowflake column types as follows:
| Jira field type | Snowflake column type |
|---|---|
| `number` | NUMBER |
| `array` | ARRAY |
|  | VARIANT |
| All other types | VARCHAR |
Next steps¶
Set up the Atlassian Jira Cloud (Agile) flow to install the agile flow.
Migrate from the legacy Openflow Connector for Jira Cloud if you’re moving from a previous version of the Jira Cloud connector.