About Openflow Connector for PostgreSQL¶
Note
The connector is subject to the Connector Terms.
This topic describes the basic concepts of Openflow Connector for PostgreSQL, its workflow, and limitations.
The Openflow Connector for PostgreSQL connects a PostgreSQL database instance to Snowflake and replicates data from selected tables in near real time or on a schedule. The connector also creates a log of all data changes, available alongside the current state of the replicated tables.
Workflow¶
A database administrator configures PostgreSQL replication settings, creates a publication, and creates credentials for the connector. Optionally, they also provide the SSL certificate. For a sketch of these statements, see the first example after this workflow.
A Snowflake account administrator performs the following tasks:
Creates a service user for the connector, a warehouse for the connector, and a destination database to replicate into (see the second example after this workflow).
Downloads and imports the connector definition file from the documentation into the Snowflake Openflow canvas.
Specifies the required parameters for the flow template.
Runs the flow. The connector performs the following tasks when run in Openflow:
Creates a schema for journal tables.
Creates the schemas and destination tables matching the source tables configured for replication.
Starts replication following the table replication lifecycle.
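The PostgreSQL-side setup from the first step usually comes down to a few statements like the following sketch. All names here (`openflow_user`, `openflow_publication`, and the table list) are placeholder assumptions, and managed services such as Amazon RDS expose `wal_level` through parameter groups rather than `ALTER SYSTEM`:

```sql
-- Enable logical replication (requires a server restart).
ALTER SYSTEM SET wal_level = 'logical';

-- Create a user for the connector; the name and password are placeholders.
CREATE USER openflow_user WITH REPLICATION PASSWORD '<password>';

-- Let the connector read the tables selected for replication.
GRANT SELECT ON public.orders, public.customers TO openflow_user;

-- Create a publication covering those tables.
CREATE PUBLICATION openflow_publication
    FOR TABLE public.orders, public.customers;
```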
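The Snowflake-side objects from the second step can be created with statements along these lines. Object names are placeholders, and depending on your security policies the service user may need key-pair authentication and narrower grants:

```sql
-- Role and service user for the connector (names are placeholders).
CREATE ROLE openflow_role;
CREATE USER openflow_service_user
    TYPE = SERVICE
    DEFAULT_ROLE = openflow_role;
GRANT ROLE openflow_role TO USER openflow_service_user;

-- Warehouse for the connector's loading work.
CREATE WAREHOUSE openflow_wh
    WAREHOUSE_SIZE = XSMALL
    AUTO_SUSPEND = 300;
GRANT USAGE ON WAREHOUSE openflow_wh TO ROLE openflow_role;

-- Destination database to replicate into.
CREATE DATABASE postgres_replica;
GRANT OWNERSHIP ON DATABASE postgres_replica TO ROLE openflow_role;
```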
The table replication lifecycle¶
Schema introspection: The connector discovers the columns in the source table, including their names and types, and validates them against Snowflake’s and the connector’s limitations. A validation failure causes this stage to fail and ends the lifecycle for the table. After schema introspection completes successfully, the connector creates an empty destination table.
Snapshot load: The connector copies all data available in the source table into the destination table. A failure at this stage ends the lifecycle for the table, and no more data is replicated. After successful completion, the full contents of the source table are available in the destination table.
Incremental load: The connector continuously tracks changes in the source table and copies them into the destination table. This continues until the table is removed from replication. A failure at this stage stops replication of the source table until the issue is resolved.
Note
Transient failures, such as connection issues, do not stop the replication of any table. Permanent failures, such as unsupported data type mappings, stop the affected table from replicating further, while other configured tables continue unchanged. After you fix the issue with the affected table, remove it from replication and then add it again.
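During the incremental load stage, changes flow through a PostgreSQL logical replication slot. To check on the PostgreSQL side whether that slot is active and how far it lags behind the write-ahead log, a query like this sketch can help; the slot name is a placeholder for whatever name was assigned when replication was configured:

```sql
-- Inspect the connector's logical replication slot and its lag.
SELECT slot_name,
       active,
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)
       ) AS replication_lag
FROM pg_replication_slots
WHERE slot_name = '<connector_slot_name>';
```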
Supported PostgreSQL versions¶
The following PostgreSQL versions are supported:
| 11 | 12 | 13 | 14 | 15 | 16 | 17 |
| --- | --- | --- | --- | --- | --- | --- |
| Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Limitations¶
The connector supports PostgreSQL version 11 or later.
The connector supports only username and password authentication with PostgreSQL.
The connector does not support PostgreSQL read replica instances.
The connector does not replicate tables with data that exceeds Snowflake’s type limitations.
The connector does not replicate tables with TOAST values (see the second example below).
The connector requires every replicated table to have a primary key, and requires the table’s replica identity to match the primary key (see the first example below).
The connector supports source table schema changes, except for changing primary key definitions and changing the precision or scale of a numeric column.
Note
You can work around limitations that affect specific table columns by excluding those columns from replication.
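To check whether a table meets the primary key and replica identity requirement before adding it to replication, you can inspect `pg_class` as in the following sketch; the table and column names are placeholders:

```sql
-- 'd' (default) means the replica identity is the primary key.
SELECT relreplident
FROM pg_class
WHERE oid = 'public.orders'::regclass;

-- If the table lacks a primary key, add one, then reset the
-- replica identity to the default (the primary key).
ALTER TABLE public.orders ADD PRIMARY KEY (order_id);
ALTER TABLE public.orders REPLICA IDENTITY DEFAULT;
```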
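Similarly, this sketch lists the tables in a schema that have an associated TOAST table, meaning their rows can hold the out-of-line (TOASTed) values affected by the limitation above; the schema name is a placeholder:

```sql
-- Tables whose rows can store out-of-line (TOASTed) values.
SELECT c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'      -- schema name is a placeholder
  AND c.relkind = 'r'           -- ordinary tables only
  AND c.reltoastrelid <> 0;
```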