CaptureChangePostgreSQL 2025.10.9.21

Bundle

com.snowflake.openflow.runtime | runtime-database-cdc-processors-nar

Description

Reads CDC events from a PostgreSQL database. The processor continuously reads events arriving in the stream, keeps those related to tables provided by the TableStateService, and discards the rest. After the current batch of events is processed, the processor confirms the replication slot position back to PostgreSQL, which lets PostgreSQL trim the WAL.

The processor outputs two types of FlowFiles. DDL FlowFiles contain the schema of a table: one is emitted with the initial schema, and another every time the schema changes. DML FlowFiles contain records representing changes to the data in the table. Each FlowFile contains data for a single table.

The DDL with the schema is written to the FlowFile content as a JSON object, in a form such as:

    {
      "columns": [
        { "name": "<columnName>", "type": "<snowflakeType>", "nullable": <true|false>, "scale": <scale>, "precision": <precision> },
        …
      ],
      "primaryKeys": ["<primaryKey1>", "<primaryKey2>", …]
    }

The DML records are structured as:

    {
      "primaryKeys": { "<column>": <value>, … },
      "payload": { "<column>": <value>, … },
      "metadata": { "<column>": <value>, … }
    }
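
As an illustration of the DDL and DML shapes described above, the following minimal sketch reads both and derives a primary key tuple for a DML record. It assumes the configured Record Writer emits JSON. The sample values, the type names, and the helper names `columns_by_name`, `unknown_payload_columns`, and `key_of` are hypothetical and are not part of the processor.

```python
import json

# Hypothetical sample contents; every value below is invented for illustration.
SAMPLE_DDL = """
{
  "columns": [
    {"name": "id", "type": "NUMBER", "nullable": false, "scale": 0, "precision": 38},
    {"name": "email", "type": "VARCHAR", "nullable": true, "scale": null, "precision": null}
  ],
  "primaryKeys": ["id"]
}
"""
SAMPLE_DML = """
{"primaryKeys": {"id": 7}, "payload": {"id": 7, "email": "a@example.com"}, "metadata": {}}
"""


def columns_by_name(ddl_text):
    """Map each column name to its declaration from a DDL FlowFile body."""
    schema = json.loads(ddl_text)
    return {column["name"]: column for column in schema["columns"]}


def unknown_payload_columns(ddl_text, dml_text):
    """Return payload columns of a DML record that the DDL schema does not declare."""
    known = columns_by_name(ddl_text)
    record = json.loads(dml_text)
    return sorted(set(record["payload"]) - set(known))


def key_of(ddl_text, dml_text):
    """Build the primary key tuple of a DML record, in the order given by the DDL."""
    schema = json.loads(ddl_text)
    record = json.loads(dml_text)
    return tuple(record["primaryKeys"][name] for name in schema["primaryKeys"])
```

A downstream consumer could use a check like `unknown_payload_columns` to detect a DML record that arrived before the matching schema change was applied.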

Tags

cdc, event, jdbc, postgresql, sql

Input Requirement

FORBIDDEN

Supports Sensitive Dynamic Properties

false

Properties

Property | Description
Column Filter Store | Service storing per-table column filtering settings.
JDBC Driver Location | Comma-separated list of files/folders and/or URLs containing the driver JAR and its dependencies (if any). For example: '/var/tmp/postgresql-java-client-42.7.5.jar'
JDBC URL | JDBC URL of the database connection, e.g. jdbc:postgresql://localhost:5432/postgres
Max Batch Size | The maximum number of records to process in a single iteration.
Max Batch Wait Time | The maximum time to wait for data to appear in the CDC stream.
Password | Password to access the PostgreSQL database.
Publication Name | The name of the CDC publication to read from.
Record Writer | The Record Writer used for serializing DML events.
Replication Slot Name | The name of the replication slot to use. Maximum 63 characters. If the slot doesn't exist, the processor will create it.
SSL Context Service | SSL Context Service supporting encrypted socket communication.
SSL Mode | Whether to use and enforce SSL when connecting to PostgreSQL.
TOASTed Value Placeholder | The value to put into a TOASTed column.
TOASTed Value Strategy | Determines how to handle TOASTed values.
Table State Store | The shared store holding the state of replicated tables.
Username | Username to access the PostgreSQL database.
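
A few of the constraints in the property list can be checked before the processor is configured. The sketch below is a hypothetical pre-flight check and not part of the processor: the function name `check_settings` and the idea of passing properties as a dictionary are assumptions. The 63-character limit comes from the property description; the character rule (lower-case letters, digits, underscores) is PostgreSQL's own rule for replication slot names.

```python
import re

# Limit stated in the Replication Slot Name property description.
MAX_SLOT_NAME_LENGTH = 63
# PostgreSQL accepts lower-case letters, digits, and underscores in slot names.
SLOT_NAME_PATTERN = re.compile(r"[a-z0-9_]+")


def check_settings(settings):
    """Return a list of human-readable problems; an empty list means no problem was found.

    `settings` is a hypothetical dict keyed by the property names listed above.
    """
    problems = []

    slot = settings.get("Replication Slot Name", "")
    if not slot:
        problems.append("Replication Slot Name is empty")
    elif len(slot) > MAX_SLOT_NAME_LENGTH:
        problems.append("Replication Slot Name is longer than 63 characters")
    elif not SLOT_NAME_PATTERN.fullmatch(slot):
        problems.append("Replication Slot Name has characters PostgreSQL rejects")

    url = settings.get("JDBC URL", "")
    if not url.startswith("jdbc:postgresql:"):
        problems.append("JDBC URL does not use the jdbc:postgresql: scheme")

    return problems
```

For example, `check_settings({"Replication Slot Name": "openflow_slot", "JDBC URL": "jdbc:postgresql://localhost:5432/postgres"})` returns an empty list.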

State management

Scopes | Description
CLUSTER | The processor stores a pointer to the current CDC event in the database, so that it can continue from the same location after a restart, as well as the name of the replication slot created in PostgreSQL.

Relationships

Name | Description
success | FlowFiles successfully created from CDC stream events.

Writes attributes

Name | Description
source.schema.name | Name of the schema of the table from which an event originated.
source.table.name | Name of the table from which an event originated.
cdc.event.type | Type of event carried by the FlowFile: ddl or dml.
cdc.most.significant.position | The DDL event's most significant position in the CDC stream.
cdc.least.significant.position | The DDL event's least significant position in the CDC stream.
cdc.event.seen.at | Timestamp of when the DDL event was read by the processor.
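
The attributes listed above are enough to route FlowFiles by table and event type. The following minimal sketch groups attribute maps that way. It assumes each FlowFile's attributes are available as a plain dictionary; the function name `group_by_table` and the sample values are hypothetical.

```python
from collections import defaultdict


def group_by_table(flowfile_attributes):
    """Group attribute maps by (schema, table), then by cdc.event.type."""
    grouped = defaultdict(dict)
    for attrs in flowfile_attributes:
        table = (attrs["source.schema.name"], attrs["source.table.name"])
        event_type = attrs["cdc.event.type"]  # "ddl" or "dml"
        grouped[table].setdefault(event_type, []).append(attrs)
    return dict(grouped)


# Hypothetical attribute maps, invented for illustration.
SAMPLE_ATTRIBUTES = [
    {"source.schema.name": "public", "source.table.name": "orders", "cdc.event.type": "ddl"},
    {"source.schema.name": "public", "source.table.name": "orders", "cdc.event.type": "dml"},
    {"source.schema.name": "public", "source.table.name": "orders", "cdc.event.type": "dml"},
    {"source.schema.name": "public", "source.table.name": "users", "cdc.event.type": "ddl"},
]
```

Because each FlowFile carries data for a single table, grouping on the schema and table attributes is sufficient and the content does not need to be inspected.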