About Openflow Connector for Snowflake to Kafka
Note
The connector is subject to the Connector Terms.
This topic describes the basic concepts and limitations of Openflow Connector for Snowflake to Kafka.
The connector consumes a Snowflake stream and sends the consumed change data capture (CDC) records to a Kafka topic. A Snowflake stream object records data manipulation language (DML) changes made to tables, including inserts, updates, and deletes, as well as metadata about each change, so that actions can be taken using the changed data. This process is referred to as CDC.
Workflow
The workflow may differ slightly depending on the configuration of the Kafka broker that receives the CDC data.
A Snowflake account administrator performs the following tasks:
Creates or identifies the Snowflake stream that will be the source of the CDC data.
Designates a warehouse to be used by the connector.
Configures or identifies the Snowflake user used by the connector and a role for this user. The user must have appropriate permissions to the source Snowflake stream. At a minimum, the user needs USAGE privilege on the database and schema containing the Snowflake stream, and SELECT privilege on the stream and the stream’s underlying table or view object.
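The minimum privileges described above can be expressed as a handful of GRANT statements. The following is a minimal sketch that only builds the statement text; the function name `minimum_grants` and all object and role names (`MY_DB`, `MY_SCHEMA`, `MY_STREAM`, `MY_TABLE`, `KAFKA_CONNECTOR_ROLE`) are hypothetical placeholders, and the sketch assumes simple unquoted identifiers. It does not cover any privileges on the warehouse designated for the connector.

```python
def minimum_grants(database, schema, stream, source_object, role, source_kind="TABLE"):
    """Build GRANT statements for the minimum privileges on a stream.

    All arguments are illustrative names supplied by the caller; this
    helper is not part of the connector.
    """
    if source_kind not in ("TABLE", "VIEW"):
        raise ValueError("source_kind must be 'TABLE' or 'VIEW'")
    qualified_schema = f"{database}.{schema}"
    return [
        # USAGE on the database and schema that contain the stream.
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
        f"GRANT USAGE ON SCHEMA {qualified_schema} TO ROLE {role};",
        # SELECT on the stream itself.
        f"GRANT SELECT ON STREAM {qualified_schema}.{stream} TO ROLE {role};",
        # SELECT on the stream's underlying table or view.
        f"GRANT SELECT ON {source_kind} {qualified_schema}.{source_object} TO ROLE {role};",
    ]


if __name__ == "__main__":
    for statement in minimum_grants(
        "MY_DB", "MY_SCHEMA", "MY_STREAM", "MY_TABLE", "KAFKA_CONNECTOR_ROLE"
    ):
        print(statement)
```

An administrator would review the generated statements and run them with a role that is allowed to grant these privileges.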
A Kafka administrator performs the following tasks:
Creates or identifies a Kafka broker and topic that will be the destination for the CDC data captured from the Snowflake stream.
Sets up the authentication mechanism that the connector will use to connect to the Kafka broker.
A data engineer performs the following tasks:
Installs and configures the connector.
Provides Snowflake credentials and configuration.
Provides Kafka credentials and configuration.
Provides connector parameters.
Stream metadata columns
Stream metadata columns METADATA$ROW_ID, METADATA$ISUPDATE, and METADATA$ACTION are sent to the Kafka topic.
The names of these columns are modified before they are sent to Kafka.
In the JSON message payload that is sent, they become METADATA_ROW_ID, METADATA_ISUPDATE, and METADATA_ACTION.
For more information, see Stream Columns.
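To make the renaming concrete, the sketch below applies the same mapping to a single stream row represented as a Python dictionary. It illustrates the effect only and is not the connector's implementation; the function names and the sample row (`ID`, `NAME`, and all values) are hypothetical, and the exact way the connector serializes values in the JSON payload is not specified here.

```python
import json

# The three stream metadata columns named in this topic.
STREAM_METADATA_COLUMNS = ("METADATA$ROW_ID", "METADATA$ISUPDATE", "METADATA$ACTION")


def rename_metadata_columns(row):
    """Return a copy of row with '$' replaced by '_' in the metadata column names."""
    renamed = {}
    for name, value in row.items():
        if name in STREAM_METADATA_COLUMNS:
            name = name.replace("$", "_")
        renamed[name] = value
    return renamed


def to_json_payload(row):
    """Serialize a row to JSON after renaming the metadata columns."""
    return json.dumps(rename_metadata_columns(row))


if __name__ == "__main__":
    # Hypothetical row as it might be read from a stream.
    sample_row = {
        "ID": 1,
        "NAME": "example",
        "METADATA$ACTION": "INSERT",
        "METADATA$ISUPDATE": False,
        "METADATA$ROW_ID": "hypothetical-row-id",
    }
    print(to_json_payload(sample_row))
```

Only the three metadata columns are renamed; other column names pass through unchanged in this sketch.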
Limitations
A single connector can capture CDC data from only one Snowflake stream.
Messages are sent without a schema.
Schema evolution is not supported.
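Because messages arrive without a schema, a downstream consumer may want to validate each payload itself. The following is a minimal sketch of such a check on the consumer side, assuming each Kafka message value is a JSON object that carries the renamed metadata fields. In Snowflake streams, an update appears as a DELETE and INSERT pair with METADATA$ISUPDATE set to TRUE, which the sketch uses to label records. The function names and the labels (`insert`, `delete`, `update_before`, `update_after`) are hypothetical, and since the serialized form of the boolean flag is not specified here, the sketch accepts either a JSON boolean or the strings "true" and "false".

```python
import json

REQUIRED_METADATA = ("METADATA_ROW_ID", "METADATA_ISUPDATE", "METADATA_ACTION")


def as_bool(value):
    """Interpret a JSON boolean or a 'true'/'false' string as a bool."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str) and value.strip().lower() in ("true", "false"):
        return value.strip().lower() == "true"
    raise ValueError(f"cannot interpret {value!r} as a boolean")


def classify_change(message_value):
    """Parse one message value (str or bytes) and label the kind of change."""
    record = json.loads(message_value)
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_METADATA if key not in record]
    if missing:
        raise ValueError(f"missing metadata fields: {missing}")

    action = record["METADATA_ACTION"]
    is_update = as_bool(record["METADATA_ISUPDATE"])
    if action == "INSERT":
        kind = "update_after" if is_update else "insert"
    elif action == "DELETE":
        kind = "update_before" if is_update else "delete"
    else:
        raise ValueError(f"unexpected METADATA_ACTION: {action!r}")
    return kind, record
```

A consumer could call `classify_change` on each message value and route records by the returned label, rejecting payloads that lack the expected metadata fields.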