Troubleshoot Openflow Connector for Kinesis¶
Note
This connector is subject to the Snowflake Connector Terms.
This topic describes how to troubleshoot common issues with the Openflow Connector for Kinesis.
Common issues¶
Messages are not ingested¶
Symptom
The ConsumeKinesis processor in a Kinesis JSON Source process group is running, but no data is produced and no error bulletins are emitted.
Cause
An error might have occurred in the underlying AWS Kinesis Client Library (KCL) without being propagated to the Openflow UI.
Solution
Check the KCL logs to identify the underlying error. For instructions, see Check KCL logs.
FlowFile queues are full¶
Symptom
FlowFile queues are filled up and the connector is not processing data fast enough.
Cause
The downstream processors cannot keep up with the incoming data rate.
The slowest processor is most likely Put Data To Snowflake (of type PutSnowpipeStreaming) in a Streaming Destination process group.
Solution
Adjust the number of concurrent tasks for the processor. Concurrent tasks allow processors to run multiple threads simultaneously, improving throughput for high-volume scenarios.
To adjust concurrent tasks for a processor, complete the following steps:
Right-click the processor in the Openflow canvas.
Select Configure from the context menu.
Navigate to the Scheduling tab.
In the Concurrent tasks field, enter the preferred number of concurrent tasks.
Select Apply to save the configuration.
Snowflake recommends the following task count values, although the correct value might differ for a particular use case:
1-2 for small runtimes
2-4 for medium runtimes
6-8 for large runtimes
After changing the task count, observe the processor to confirm that the increase improves throughput.
Check KCL logs¶
The connector uses the AWS Kinesis Client Library (KCL) v3 under the hood. Errors that occur in KCL are not always propagated to the Openflow UI, so checking KCL logs might be necessary for troubleshooting.
The KCL logs are stored in a configured event table. You can retrieve them with the following query:
SELECT
    timestamp,
    runtime_key,
    resource_attributes,
    log,
    log:formattedMessage
FROM (
    SELECT
        timestamp,
        resource_attributes,
        resource_attributes:"openflow.dataplane.id" AS deployment_id,
        resource_attributes:"k8s.namespace.name" AS runtime_key,
        resource_attributes:"k8s.pod.name" AS runtime_pod,
        TRY_PARSE_JSON(value) AS log
    FROM <event_table>
    WHERE TRUE
        AND timestamp > DATEADD(minute, -30, SYSDATE())
        AND record_type = 'LOG'
        AND runtime_key = 'runtime-<runtime_name>'
        AND resource_attributes:"k8s.container.name" ILIKE '%-server'
)
WHERE TRUE
    AND log:loggerName LIKE 'software.amazon.kinesis.%'
    AND log:level IN ('WARN', 'ERROR')
ORDER BY timestamp DESC
;
Replace <event_table> with the name of your configured event table and <runtime_name> with the name of your runtime.
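For example, assuming a hypothetical event table named OPENFLOW_DB.TELEMETRY.EVENTS and a runtime named kinesis_runtime (both names are placeholders for this illustration), the inner FROM and WHERE clauses would read:

    FROM OPENFLOW_DB.TELEMETRY.EVENTS
    WHERE TRUE
        AND timestamp > DATEADD(minute, -30, SYSDATE())
        AND record_type = 'LOG'
        AND runtime_key = 'runtime-kinesis_runtime'
        AND resource_attributes:"k8s.container.name" ILIKE '%-server'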
Common KCL errors¶
This section describes common errors that can appear in the KCL logs and how to resolve them.
Error: UnknownHostException¶
Error message
java.net.UnknownHostException: dynamodb.eu-west-1.amazonaws.com
Cause
If the runtime is using a Snowflake Deployment, the network rule is most likely misconfigured.
Solution
Make sure the required AWS domains are allowlisted in your network rule. For the list of required domains, see Set up Openflow - Snowflake Deployment: Configure allowed domains for Openflow connectors.
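As a minimal sketch, assuming a runtime in the eu-west-1 region and a hypothetical rule name, an egress network rule that allowlists the endpoints KCL reaches (Kinesis, plus the DynamoDB and CloudWatch endpoints KCL uses for lease tracking and metrics) could look like the following. Treat the exact domain list as an assumption and use the authoritative list from the documentation linked above:

CREATE OR REPLACE NETWORK RULE kinesis_connector_egress_rule
  TYPE = HOST_PORT
  MODE = EGRESS
  VALUE_LIST = (
    'kinesis.eu-west-1.amazonaws.com',     -- Kinesis stream endpoint
    'dynamodb.eu-west-1.amazonaws.com',    -- KCL lease table (from the error above)
    'monitoring.eu-west-1.amazonaws.com'   -- CloudWatch metrics (assumed, if KCL metrics are enabled)
  );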
Error: No shards found¶
Error message
java.lang.IllegalStateException: No shards found when attempting to validate complete hash range.
Cause
This error can occur if the Kinesis stream does not exist or the AWS region is incorrectly specified.
Solution
Check the KCL logs for messages like:
Got ResourceNotFoundException when fetching shard list for stream-name. Stream no longer exists.
Verify that the stream name is correct and that the stream exists in AWS.
Verify that the AWS region is specified correctly in the connector configuration.
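To confirm the cause, you can narrow the query from Check KCL logs to entries that mention ResourceNotFoundException. This is a sketch that reuses the same <event_table> and <runtime_name> placeholders; the cast to STRING and the ILIKE pattern on log:formattedMessage are illustrative additions:

SELECT
    timestamp,
    log:formattedMessage
FROM (
    SELECT
        timestamp,
        resource_attributes,
        resource_attributes:"k8s.namespace.name" AS runtime_key,
        TRY_PARSE_JSON(value) AS log
    FROM <event_table>
    WHERE TRUE
        AND timestamp > DATEADD(minute, -30, SYSDATE())
        AND record_type = 'LOG'
        AND runtime_key = 'runtime-<runtime_name>'
        AND resource_attributes:"k8s.container.name" ILIKE '%-server'
)
WHERE TRUE
    AND log:loggerName LIKE 'software.amazon.kinesis.%'
    AND log:formattedMessage::STRING ILIKE '%ResourceNotFoundException%'
ORDER BY timestamp DESC
;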