Install and configure the Openflow Connector for Oracle

Note

This connector is subject to the Snowflake Connector Terms.

Note

The Openflow Connector for Oracle is also subject to additional terms of service beyond the standard connector terms of service. For more information, see the Openflow Connector for Oracle Addendum.

This topic describes the steps to install and configure the Openflow Connector for Oracle connector.

As a data engineer, perform the following tasks to install and configure the connector:

Install the connector

To install the connector, do the following as a data engineer:

  1. Navigate to the Openflow overview page. In the Featured connectors section, select View more connectors.

  2. On the Openflow connectors page, find the connector and select Add to runtime.

  3. In the Select runtime dialog, select your runtime from the Available runtimes drop-down list and click Add.

    Note

    Before you install the connector, ensure that you have created a database and schema in Snowflake for the connector to store ingested data.

  4. Authenticate to the deployment with your Snowflake account credentials and select Allow when prompted to allow the runtime application to access your Snowflake account. The connector installation process takes a few minutes to complete.

  5. Authenticate to the runtime with your Snowflake account credentials.

The Openflow canvas appears with the connector process group added to it.

Runtime sizing

The runtime size determines the CPU and memory available to the connector. The available sizes are Small, Medium, and Large. The connector requires Medium or Large. Choose the size when you create the runtime: you can’t change the size of an existing runtime in place.

Choose Large if you expect high replication throughput or if source tables contain wide rows.

Resize a runtime

Runtime size is fixed at creation, so to change size you run the connector on a different runtime. You have two options depending on whether you want to preserve the current replication progress.

If you don’t need to keep the progress of the current connector, the simplest path is to create a new runtime at the size you need and install a new connector instance on it. The new connector starts from scratch: it snapshots all configured tables and then captures ongoing changes from that point. The replication progress of the existing connector is discarded.

To keep the progress of the current connector, for example to avoid re-snapshotting tables that took a long time to snapshot initially, migrate the connector to the new runtime. This reuses the existing destination tables and resumes incremental replication from where it left off.

For migration instructions, see Reinstall the connector.

Configure the connector

To configure the connector, do the following as a data engineer:

  1. Right-click on the added runtime and select Parameters.

  2. Populate the required parameter values.

    For more information on the required parameter values, see the following sections:

Snowflake Destination Parameters

ParameterDescriptionRequired
Destination Database

The database where data is persisted. It must already exist in Snowflake. The name is case-sensitive. For unquoted identifiers, provide the name in uppercase.

Yes
Destination Schema Pattern

A pattern for the names of destination schemas where data is persisted. The connector creates the schemas if they don’t exist.

You can customize the pattern per ingested table using these optional variables:

  • ${source.database.name}: a source table’s database.
  • ${source.schema.name}: a source table’s schema.
  • ${source.table.name}: a source table’s name.

For example, for a table with the qualified name source_db.tenant_a.data, the pattern prefix_${source.database.name}_${source.schema.name} evaluates to prefix_source_db_tenant_a.

To ingest all tables into a single schema, provide a schema name without any variables, like destination_schema.

Important

Don’t change this setting after the connector has begun ingesting data. Changing this setting after ingestion has begun breaks the existing ingestion. If you must change this setting, create a new connector instance.

Yes
Snowflake Authentication Strategy

When using:

  • Snowflake Openflow Deployment or BYOC: Use SNOWFLAKE_MANAGED_TOKEN. This token is managed automatically by Snowflake. BYOC deployments must have previously configured runtime roles to use SNOWFLAKE_MANAGED_TOKEN.
  • BYOC: Alternatively BYOC can use KEY_PAIR as the value for authentication strategy.
Yes
Snowflake Account Identifier

When using:

  • Session Token Authentication Strategy: Must be blank.
  • KEY_PAIR: Snowflake account name formatted as [organization-name]-[account-name] where data is persisted.
Yes
Snowflake Connection Strategy

When using KEY_PAIR, specify the strategy for connecting to Snowflake:

  • STANDARD (default): Connect using standard public routing to Snowflake services.
  • PRIVATE_CONNECTIVITY: Connect using private addresses associated with the supporting cloud platform such as AWS PrivateLink.
Required for BYOC with KEY_PAIR only, otherwise ignored.
Snowflake Private Key

When using:

  • Session Token Authentication Strategy: Must be blank.
KEY_PAIR: Must be the RSA private key used for authentication.

The RSA key must be formatted according to PKCS8 standards and have standard PEM headers and footers. Note that either a Snowflake Private Key File or a Snowflake Private Key must be defined.

No
Snowflake Private Key File

When using:

  • Session token authentication strategy: The private key file must be blank.
  • KEY_PAIR: Upload the file that contains the RSA private key used for authentication to Snowflake, formatted according to PKCS8 standards and including standard PEM headers and footers. The header line begins with -----BEGIN PRIVATE. To upload the private key file, select the Reference asset checkbox.
No
Snowflake Private Key Password

When using

  • Session Token Authentication Strategy: Must be blank.
  • KEY_PAIR: Provide the password associated with the Snowflake Private Key File.
No
Snowflake Role

When using

  • Session Token Authentication Strategy: Use Snowflake Role assigned to the runtime or child role granted to this Snowflake Role. You can find your runtime Snowflake Role in the Openflow UI, by expanding the More Options [⋮] button for your runtime and selecting Set Snowflake role.
  • KEY_PAIR Authentication Strategy: Use a valid role configured for your service user.
Yes
Snowflake Username

When using

  • Session Token Authentication Strategy: Must be blank.
  • KEY_PAIR: Provide the user name used to connect to the Snowflake instance.
Yes
Oversized Value Strategy

Determines how the connector handles values that exceed its internal size limits (16 MB) during replication. Possible values are:

  • Fail Table (default): The table is marked as permanently failed, and replication stops for that table.
  • Set Null: The value is replaced with NULL in the destination table. Use this to prevent table failures when it is acceptable to lose data in tables beyond the oversized value.
No
Snowflake WarehouseSnowflake warehouse used to run queries.Yes

Oracle Ingestion Parameters

ParameterDescription
Included Table Names

Comma-separated list of fully-qualified table paths. Tables must be specified using fully qualified database, schema and table name format: DATABASE_NAME.SCHEMA_NAME.TABLE_NAME.

For example: MYPDB.SALES.CUSTOMERS, MYPDB.SALES.ORDERS

Included Table Regex

A regular expression to match table paths for automatic inclusion of existing and new tables. The regex pattern must match the three-part naming convention: DATABASE_NAME.SCHEMA_NAME.TABLE_NAME.

For example: MYPDB\.SALES\..* to match all tables in the SALES schema within the MYPDB database.

Column Filter JSON

Optional. A JSON array of filter objects specifying which columns to include or exclude per table. For syntax details and examples, see Replicate a subset of columns in a table.

Table Key Configuration Service

Optional. A MultiDatabaseJsonTableKeyConfigService controller service that supplies a user-declared logical key for one or more tables. The service exposes a Table Key Configuration JSON property where you define the key mappings. When configured, the logical key takes the highest priority and overrides any primary key, unique constraint, or unique index that the connector would otherwise auto-detect.

For more information on when to use this and how to configure it, see Specify a logical key for a table.

Merge Task Schedule CRON

A CRON expression to define when merge operations from the Journal to the Destination Table are triggered. For example, * * * * * ? for continuous merge.

Object Identifier Resolution

Specifies how source object identifiers such as schemas, tables, and column names are stored and queried in Snowflake. This setting determines if you must use double quotes in SQL queries.

Option 1: Default, case-insensitive (recommended).

  • Transformation: All identifiers are converted to uppercase. For example, My_Table becomes MY_TABLE.
  • Queries: SQL queries are case-insensitive and don’t require SQL double quotes.

For example SELECT * FROM my_table; returns the same results as SELECT * FROM MY_TABLE;.

Note

Snowflake recommends using this option if database objects aren’t expected to have mixed case names.

Option 2: case-sensitive.

  • Transformation: Case is preserved. For example, My_Table remains My_Table.
  • Queries: SQL queries must use double quotes to match the exact case for database objects. For example, SELECT * FROM "My_Table";.

Important

Do not change this setting after connector ingestion has begun. Changing this setting after ingestion has begun breaks the existing ingestion. If you must change this setting, create a new connector instance.

Snapshot Fetching Strategy

Determines the snapshot load fetching strategy:

  • CONCURRENT_BY_ROWID (default): Splits tables into chunks bound by ranges of physical row ids, and retrieves each chunk in parallel.
  • SEQUENTIAL_BY_PRIMARY_KEY: Uses fixed-size batches retrieved sequentially by the table’s replication key (primary key, unique constraint, unique index, or logical key). Despite the name, this strategy uses whatever key the connector resolved for the table, not specifically the primary key.
Concurrent Snapshot QueriesMaximum number of concurrent queries to the source database to run in the Snapshot flow. Increasing this can speed up snapshotting large numbers of tables, but will also increase the load on the source database.

Oracle Source Parameters

ParameterDescriptionRequired
Oracle Connection URL

JDBC URL of the database connection to the DB. The URL must specify the target container (PDB or CDB) that contains the data to be replicated. For example jdbc:oracle:thin@<host>:<port>/YOUR_DB_NAME where YOUR_DB_NAME is the name of your PDB or CDB.

When SSL is enabled, use the TCPS protocol, for example jdbc:oracle:thin:@tcps://<host>:<tcps_port>/YOUR_DB_NAME.

Note

The connector works within a single database/container. Ensure the JDBC URL points directly to the container that holds the tables to be replicated.

Yes
Oracle UsernameUsername of the connect user that has access to the XStream Server.Yes
Oracle PasswordPassword of the connect user that has access to the XStream Server.Yes
Oracle SSL Mode

Controls SSL encryption for connections to the Oracle database.

  • DISABLED, which is the default: Connect without SSL.
  • VERIFY_CA: Connect with SSL. Verifies that a trusted Certificate Authority issued the server certificate.
  • VERIFY_IDENTITY: Connect with SSL. Verifies the CA certificate and that the server hostname matches the certificate’s subject.

When set to VERIFY_CA or VERIFY_IDENTITY, you must also provide the Oracle Wallet Filename parameter.

Yes
Oracle Wallet Filename

Upload the file that contains the Oracle auto-login wallet file (cwallet.sso). The wallet must contain the trusted server certificate for SSL connections.

For information about creating the wallet, see Configure SSL connections (optional).

Required when SSL Mode is not DISABLED
Oracle Database Processor MultiplierCore Processor Licensing Factor as described in Oracle Processor Core Factor TableRequired for Embedded License only
Oracle Database Processor CoresThe number of processor cores in your Oracle database.Required for Embedded License only
XStream Billing AcknowledgementA confirmation of the licensing agreementRequired for Embedded License only
XStream Out Server NameThe name of the XStream Server that must already exist in Oracle.Yes
XStream Out Server URL

JDBC URL of the database connection for XStream, must use OCI driver. For example jdbc:oracle:oci:@<host>:<port>/SID.

When SSL is enabled, use the TCPS protocol, for example jdbc:oracle:oci:@tcps://<host>:<tcps_port>/SID.

Note

When SSL Mode is enabled, the connector automatically adds SSL_SERVER_DN_MATCH and MY_WALLET_DIRECTORY to the XStream URL. You don’t need to include these manually.

Yes

Restart table replication

A table in FAILED state — for example, due to a missing primary key or unsupported schema change — does not restart automatically. If a table enters a FAILED state or you need to restart replication from scratch, use the following procedure to remove and re-add the table to replication.

Note

If the failure was caused by an issue in the source table such as a missing primary key, resolve that issue in the source database before continuing.

  1. Remove the table from replication, using one of the following methods:

    • Add the table to the Re-snapshot Table Exclusions parameter to temporarily exclude it from replication. This is convenient when the table is matched by an Included Table Regex that you don’t want to change.
    • In the Ingestion Parameters context, either remove the table from Included Table Names or modify the Included Table Regex so the table is no longer matched.
  2. Verify the table has been removed:

    1. In the Openflow runtime canvas, right-click a processor group and choose Controller Services.
    2. In the table listing controller services, locate the Table State Store row, click the three vertical dots on the right side of the row, then choose View State.

    Important

    You must wait until the table’s state is fully removed from this list before proceeding. Do not continue until this configuration change has completed.

  3. Clean up the destination: Once the table’s state shows as fully removed, manually DROP the destination table in Snowflake. Note that the connector will not overwrite an existing destination table during the snapshot phase; if the table still exists, replication will fail again. Optionally, the journal table and stream can also be removed if they are no longer needed.

  4. Re-add the table by reversing the change you made in the first step: either remove the table from Re-snapshot Table Exclusions, or add it back to Included Table Names or Included Table Regex. The connector then re-snapshots the table.

  5. Verify the restart: Check the Table State Store using the instructions given previously. The state of the table should appear with the status NEW, then transition to SNAPSHOT_REPLICATION, and finally INCREMENTAL_REPLICATION.

Replicate a subset of columns in a table

The connector can filter the data replicated per table to a subset of configured columns. Primary key columns are always included regardless of exclusions.

To apply column filters, set the Column Filter JSON parameter in the Ingestion Parameters context to a JSON array of filter objects, one per table you want to filter.

Columns can be included or excluded by name or by regular expression pattern. You can apply a single condition per table, or combine multiple conditions, with exclusions always taking precedence over inclusions.

Syntax

Each object in the array identifies a table and specifies which columns to include or exclude. Because this connector uses three-part fully qualified names (database, schema, and table), each object can include a database or databasePattern field in addition to the schema and table fields.

[
    {
        "database": "<database>" | "databasePattern": "<regex>",
        "schema": "<schema>" | "schemaPattern": "<regex>",
        "table": "<table>" | "tablePattern": "<regex>",
        "included": ["<column>", "<column>"],
        "excluded": ["<column>", "<column>"],
        "includedPattern": "<regex>",
        "excludedPattern": "<regex>"
    }
]

The following rules apply:

  • Use database, schema, and table for exact name matching, or databasePattern, schemaPattern, and tablePattern for regex matching. You can’t use both a field and its pattern variant in the same object (for example, schema and schemaPattern can’t both appear).
  • At least one of included, excluded, includedPattern, or excludedPattern must be provided.
  • When both included and excluded filters are specified, exclusions take precedence.
  • When multiple filters match the same table, the last matching filter is used, with exact matches taking precedence over pattern-based filters.
  • The value can be an array of objects to apply different filters to different tables.

Examples

Include specific columns by name:

[
    {
        "database": "my_db",
        "schema": "dbo",
        "table": "orders",
        "included": ["account_id", "status", "created_at"]
    }
]

Exclude specific columns by name:

[
    {
        "database": "my_db",
        "schema": "dbo",
        "table": "orders",
        "excluded": ["internal_note", "debug_flag"]
    }
]

Combine an include pattern with a specific exclusion (for example, include all email columns except admin_email):

[
    {
        "database": "my_db",
        "schema": "dbo",
        "table": "contacts",
        "includedPattern": ".*_email",
        "excluded": ["admin_email"]
    }
]

Mix a database pattern with an exact schema and table name to apply a filter across databases:

[
    {
        "databasePattern": "prod_.*",
        "schema": "dbo",
        "table": "customers",
        "excluded": ["internal_note"]
    }
]

Pass multiple filter objects to apply different rules to different tables:

[
    {"database": "my_db", "schema": "dbo", "table": "orders", "included": ["account_id", "status"]},
    {"database": "my_db", "schema": "dbo", "table": "customers", "excludedPattern": ".*_internal"}
]

Including and excluding the same column

Removing a column from a table’s replicated set (by excluding it or by removing it from the included list) has the same effect on the destination as dropping the column at the source: the connector soft-deletes the column on the destination by renaming it with a suffix (by default, __SNOWFLAKE_DELETED). If you then add the column back to the replicated set and later remove it a second time, replication for the affected table fails because the soft-deleted column name is already taken. To recover, restart replication for the affected table.

Specify a logical key for a table

The connector requires a replication key for every table it replicates. By default, the connector picks the replication key automatically, in this order: a primary key, then a qualifying unique constraint, then a qualifying unique index. For the full selection rules, see How the connector chooses a replication key.

A logical key is a user-declared replacement for the auto-detected key. Configure a logical key when:

  • A table has no primary key and no qualifying unique constraint or unique index, but one or more columns are unique in the data.
  • A specific column or set of columns should be used as the replication key, regardless of what the connector would auto-detect (for example, to override a synthetic primary key).

A logical key takes the highest priority. When the connector finds a logical key for a table, it uses that key and ignores any primary key, unique constraint, or unique index on the table.

JSON syntax

The Table Key Configuration JSON value is a JSON array. Each entry maps one table to its logical key columns:

[
    {
        "database": "<database>",
        "schema": "<schema>",
        "table": "<table>",
        "logicalKey": ["<column>", "<column>"]
    }
]

The fields are:

FieldDescription
databaseRequired. The exact source database (PDB or CDB) name, matching the database in the table’s three-part fully qualified name.
schemaRequired. The exact source schema name.
tableRequired. The exact source table name.
logicalKeyRequired. A non-empty array of source column names that uniquely identify rows in the table.

The following rules apply:

  • database, schema, and table matching is case-sensitive. Oracle stores identifiers in uppercase by default, so use uppercase names unless the identifiers were created with double-quoted mixed-case or lowercase names.
  • logicalKey column matching is case-insensitive. The connector matches column names against the source table schema regardless of case.
  • An entry whose database, schema, and table don’t match any replicated table is silently ignored.

Logical key configuration examples

A single-column logical key on a table without a primary key:

[
    {
        "database": "MYPDB",
        "schema": "SALES",
        "table": "AUDIT_LOG",
        "logicalKey": ["EVENT_ID"]
    }
]

A composite logical key:

[
    {
        "database": "MYPDB",
        "schema": "SALES",
        "table": "ORDER_LINES",
        "logicalKey": ["ORDER_ID", "LINE_ITEM_ID"]
    }
]

Logical keys for several tables in one JSON value:

[
    {
        "database": "MYPDB",
        "schema": "SALES",
        "table": "AUDIT_LOG",
        "logicalKey": ["EVENT_ID"]
    },
    {
        "database": "MYPDB",
        "schema": "SALES",
        "table": "ORDER_LINES",
        "logicalKey": ["ORDER_ID", "LINE_ITEM_ID"]
    }
]

Restrictions

The connector rejects the configuration when any of the following is true:

  • logicalKey is missing, empty, or not an array.
  • logicalKey contains duplicate column names (compared case-insensitively).
  • logicalKey contains the pseudo-column ROWID. ROWID isn’t a reliable replication key because it can change when a row is moved (for example, after a table rebuild or partition operation).
  • logicalKey contains a column name that doesn’t exist in the source table.

When the configuration is rejected, the connector either fails to enable the controller service (for structural issues detected at enablement time) or holds the table in the NEW state (for issues detected when the table is initialized). After you fix the configuration, replication for the table resumes without resetting state.

Warnings logged for risky configurations

The connector accepts the following configurations but logs a warning at table initialization. Verify the data carefully or arrange a periodic full reload to correct drift.

When choosing logical-key columns, prefer columns with high cardinality and, where possible, monotonically increasing values. Low-cardinality or non-monotonic keys can degrade snapshot performance if you use the SEQUENTIAL_BY_PRIMARY_KEY strategy, which orders rows by the replication key.

  • A logical-key column is a large-object type (BLOB, CLOB, NCLOB, LONG, LONG RAW). Using large objects as keys severely degrades MERGE performance.
  • A logical-key column is a floating-point type (FLOAT, DOUBLE, REAL, BINARY_FLOAT, BINARY_DOUBLE). Floating-point comparisons can produce inconsistent results because of precision differences.
  • The composite logical key has more than five columns. Long composite keys often indicate a design issue and might degrade MERGE performance.
  • The logical key replaces an existing primary key on the table.

Limitation: Changes to a logical-key value

Important

When a source UPDATE changes the value of a logical-key column, the connector does not soft-delete the row keyed by the old value before inserting the row keyed by the new value. The destination table ends up with two active rows for what’s a single row in the source: the original row, still active under its old key value, and a new row under the new key value.

This differs from how the connector handles a primary-key value change on tables that don’t use a logical key. For more information on that behavior, see Changes to a replication key value.

To avoid this limitation, choose logical-key columns whose values don’t change in the source. If logical-key values do change, periodically run a full reload for the affected tables to reconcile the destination with the source.

Schema changes that affect a logical key

Logical keys reference column names. The connector doesn’t follow renames or drops of those columns:

  • If a logical-key column is dropped on the source, replication for the affected table fails. The table is marked FAILED. For recovery steps, see A logical-key column was dropped or renamed on the source in the troubleshooting topic.
  • If a logical-key column is renamed on the source, the configuration still references the old name and replication fails. Update the JSON to use the new name and restart table replication.

Run the flow

  1. Right-click on the plane and select Enable all Controller Services.
  2. Right-click on the imported process group and select Start. The connector starts the data ingestion.

Next steps