Monitoring the Kafka connector using Java Management Extensions (JMX)¶
This topic describes how to use Java Management Extensions (JMX) to monitor the Snowflake Connector for Kafka. Kafka Connect provides pre-configured JMX metrics that provide information about the Kafka connector. The Snowflake Connector for Kafka provides multiple Managed Beans (MBeans) that you can use to access metrics about the Kafka environment. You can load this information into third-party tools, including Prometheus and Grafana.
The JMX feature is enabled in the connector by default. To disable JMX, set the `jmx` property to `false`.
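If you don't want the connector to expose JMX metrics, you can turn the feature off in the connector configuration. The following is a minimal, illustrative excerpt of a connector properties file; the connector name and topic are placeholders, and only the `jmx` line is specific to this feature:

```
# Illustrative excerpt of a Kafka connector configuration (placeholder values).
name=my_snowflake_connector
connector.class=com.snowflake.kafka.connector.SnowflakeSinkConnector
topics=my_topic

# JMX metrics are enabled by default; set jmx=false to disable them.
jmx=false
```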
Important
Snowpipe is supported by Kafka connector version 1.6.0 and later.
Snowpipe Streaming is supported by Kafka connector version 2.1.2 and later.
Configuring JMX in the Kafka connector¶
JMX is enabled by default in the Snowflake Kafka connector. To enable JMX in your Kafka environment, complete the following steps:
1. Enable JMX to connect to your Kafka installation:

   - To make JMX connections to a Kafka installation running on a remote server, set the `KAFKA_JMX_OPTS` environment variable in your Kafka Connect startup script:

     ```
     export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=<ip_address> -Dcom.sun.management.jmxremote.port=<jmx_port>"
     ```

     Where:

     - `<ip_address>`: specifies the IP address of your Kafka Connect installation.
     - `<jmx_port>`: specifies the JMX port where Kafka Connect listens for JMX connections.

   - To make JMX connections to Kafka running on the same server, set the `JMX_PORT` environment variable in your Kafka startup script:

     ```
     export JMX_PORT=<port_number>
     ```

     Where `<port_number>` is the JMX port of your Kafka installation.

2. Restart the Kafka connector. A minimal connectivity check is sketched below.
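After restarting, you can confirm that the JMX endpoint is reachable with any JMX client. The following is a minimal sketch in Java, assuming the standard RMI connector URL; the host and port are placeholders for the values you set in `KAFKA_JMX_OPTS` or `JMX_PORT`:

```java
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JmxConnectionCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder host and port; substitute the values configured for JMX above.
        String host = "192.0.2.10";
        int port = 9999;

        // Standard RMI-based JMX service URL for a JVM started with
        // -Dcom.sun.management.jmxremote.port=<jmx_port> (or JMX_PORT).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");

        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // A successful call confirms that the endpoint is serving JMX requests.
            System.out.println("Connected. Registered MBeans: " + conn.getMBeanCount());
        } finally {
            connector.close();
        }
    }
}
```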
Using the Snowflake Kafka connector managed beans (MBeans)¶
JMX uses MBeans to represent objects within Kafka that it can monitor (for example, thread count and CPU load). The Snowflake Kafka connector provides MBeans for accessing objects managed by the connector. You can use these MBeans to create monitoring dashboards.
The general format of the Kafka Connector MBean object name is:
`snowflake.kafka.connector:connector=connector_name,pipe=pipe_name,category=category_name,name=metric_name`

Where:

- `connector=connector_name` specifies the name of the connector defined in the Kafka configuration file.
- `pipe=pipe_name` specifies the Snowpipe object used to ingest data. The Kafka connector defines Snowpipe objects for each partition.
- `category=category_name` specifies the category of the MBean. Each category contains a set of metrics.
- `name=metric_name` specifies the name of the metric.
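As a sketch of how these object names can be used programmatically, the following Java snippet queries every MBean registered under the `snowflake.kafka.connector` domain and prints each object name with its readable attributes. It assumes the remote JMX endpoint configured earlier; the host and port are placeholders:

```java
import java.util.Set;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ListSnowflakeConnectorMBeans {
    public static void main(String[] args) throws Exception {
        // Placeholder host and port; substitute the values configured for JMX.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://192.0.2.10:9999/jmxrmi");

        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();

            // Match every MBean in the snowflake.kafka.connector domain, that is,
            // all connectors, pipes, categories, and metric names.
            ObjectName pattern = new ObjectName("snowflake.kafka.connector:*");
            Set<ObjectName> names = conn.queryNames(pattern, null);

            for (ObjectName name : names) {
                System.out.println(name);
                for (MBeanAttributeInfo attr : conn.getMBeanInfo(name).getAttributes()) {
                    if (!attr.isReadable()) {
                        continue; // Skip write-only attributes.
                    }
                    Object value = conn.getAttribute(name, attr.getName());
                    System.out.println("  " + attr.getName() + " = " + value);
                }
            }
        } finally {
            connector.close();
        }
    }
}
```

To narrow the query to a single connector or category, replace the pattern with a more specific one, for example `snowflake.kafka.connector:connector=<connector_name>,category=offsets,*`.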
The following sections list the names of the categories and metrics provided by the Snowflake Kafka connector.
Category: file-counts¶
This category of metrics applies only to the Snowpipe-based Kafka connector; it does not apply to Snowpipe Streaming.
| Metric Name | Data Type | Description |
| --- | --- | --- |
| `file-count-on-stage` | long | The number of files currently on an internal stage. This value is decremented after the process of purging the files has started, so it provides an estimate of how many files are currently on an internal stage. |
| `file-count-on-ingestion` | long | The number of files in Snowpipe, determined by calling the `insertFiles` REST API. |
| `file-count-table-stage-ingestion-fail` | long | The number of files on the table stage that failed ingestion. |
| `file-count-table-stage-broken-record` | long | The number of files present on the table stage that correspond to a broken offset. |
| `file-count-purged` | long | The number of files purged from the internal stage after the ingestion status was determined. |
Category: offsets¶
The `offsetPersistedInSnowflake` and `latestConsumerOffset` metrics apply to the Snowpipe Streaming-based Kafka connector. The rest of this category applies only to the Snowpipe-based Kafka connector.
| Metric Name | Data Type | Description |
| --- | --- | --- |
| `processed-offset` | long | An offset referring to the most recent record sent to the in-memory buffer. |
| `flushed-offset` | long | An offset referring to a record that is being flushed to an internal stage after the buffer threshold was reached. The buffer can reach its threshold by time, number of records, or size. |
| `committed-offset` | long | An offset referring to a record that has had the precommit API called and has called the Snowpipe `insertFiles` REST API. |
| `purged-offset` | long | An offset referring to a record that is being purged from the internal stage. This number is the value of the highest recent offset that was purged from the internal stage. |
| `offsetPersistedInSnowflake` | long | An offset that refers to a record that has the latest persisted data in Snowflake. The offset is determined by the channel offset token. |
| `latestConsumerOffset` | long | An offset that refers to the most recent record sent to the in-memory buffer. It is only used to resend the offset when the channel offset token is `NULL`. |
Category: buffer¶
This category of metrics is available only to the Snowpipe-based Kafka connector.
| Metric Name | Data Type | Description |
| --- | --- | --- |
| `buffer-size-bytes` | long | Based on buffer thresholds, returns the buffer size (in bytes) before it is flushed to an internal stage. This value may not be the same as the file size, because files are compressed when they are loaded to an internal stage. |
| `buffer-record-count` | long | Based on buffer thresholds, returns the number of Kafka records buffered in memory before the buffer is flushed to an internal stage. |
Category: latencies¶
This category of metrics is available only to the Snowpipe-based Kafka connector.
| Metric Name | Data Type | Description |
| --- | --- | --- |
| `kafka-lag` | long | The difference (in seconds) between the time the record is put into Kafka and the time the record is fetched into Kafka Connect. Note that this value can be null if the value was not set inside a record. |
| `commit-lag` | long | The difference (in seconds) between the time the file is uploaded to an internal stage and the time the `insertFiles` REST API is called. |
| `ingestion-lag` | long | The difference (in seconds) between the time a file is uploaded to an internal stage and the time the file ingestion status is reported through the `insertReport` or `loadHistoryScan` REST API. |