SnowflakeDetectDuplicate 2025.5.31.15
Bundle
com.snowflake.openflow.runtime | runtime-snowflake-processors-nar
Description
Checks whether a FlowFile’s hash (provided as a FlowFile attribute) already exists in a Snowflake table, then routes the FlowFile to ‘duplicate’ if the hash is found, ‘distinct’ if it is not found, or ‘failure’ on errors.
Input Requirement
REQUIRED
Supports Sensitive Dynamic Properties
false
Properties
Property | Description |
---|---|
Content Hash | The name of the FlowFile attribute that holds the pre-computed hash. Supports Expression Language. |
Document Source Identifier | Specifies the document source identifier (doc ID). Supports Expression Language. |
Document Source Name | Specifies the document source system name. Supports Expression Language. |
Snowflake Connection Service | The DBCPService that provides the connection to Snowflake. |
Snowflake Table Name | The name of the Snowflake table that stores the file hashes. The database and schema must already be configured in the Snowflake Connection Service. |
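
The processor expects the hash to be computed upstream and stored in a FlowFile attribute. As a minimal sketch of that precondition, assuming SHA-256 as the hash algorithm and a hypothetical attribute name `content.hash` (this page specifies neither), with a plain dictionary standing in for the FlowFile attributes:

```python
import hashlib

# Hypothetical attribute name; this page does not prescribe one.
HASH_ATTRIBUTE = "content.hash"


def with_content_hash(content: bytes, attributes: dict) -> dict:
    """Return a copy of the attributes with a SHA-256 hex digest of the content added.

    SHA-256 is an assumption made for illustration only.
    """
    updated = dict(attributes)
    updated[HASH_ATTRIBUTE] = hashlib.sha256(content).hexdigest()
    return updated


attrs = with_content_hash(b"hello", {"filename": "a.txt"})
```

Under these assumptions, the Content Hash property would be set to `content.hash` so that the processor reads the digest from that attribute.
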
Relationships
Name | Description |
---|---|
distinct | FlowFiles that do not match an existing document are routed here, and the new hash is inserted. |
duplicate | FlowFiles that match an existing document (same hash) are routed here. |
failure | FlowFiles that encounter an error or exception during processing are routed here. |
Writes attributes
Name | Description |
---|---|
snowflake.detect.duplicate | A ‘true’ or ‘false’ attribute indicating if the FlowFile was detected as a duplicate. |
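
The routing and attribute behavior described above can be illustrated with a small sketch. This is not the processor's implementation: an in-memory set stands in for the Snowflake table, the function name `detect_duplicate` and the attribute name `content.hash` are hypothetical, and treating a missing hash as a failure is an assumption. The page does not say how Document Source Identifier and Document Source Name take part in the lookup, so the sketch matches on the hash alone.

```python
def detect_duplicate(attributes: dict, hash_attribute: str, known_hashes: set):
    """Return (relationship, updated_attributes) for one FlowFile.

    known_hashes is an in-memory stand-in for the Snowflake table of hashes.
    """
    content_hash = attributes.get(hash_attribute)
    if not content_hash:
        # Assumption: a missing or empty hash is treated as a processing error.
        return "failure", dict(attributes)

    is_duplicate = content_hash in known_hashes
    if not is_duplicate:
        # On the 'distinct' path the new hash is recorded.
        known_hashes.add(content_hash)

    updated = dict(attributes)
    updated["snowflake.detect.duplicate"] = "true" if is_duplicate else "false"
    return ("duplicate" if is_duplicate else "distinct"), updated


table = set()
first = detect_duplicate({"content.hash": "abc"}, "content.hash", table)
second = detect_duplicate({"content.hash": "abc"}, "content.hash", table)
missing = detect_duplicate({}, "content.hash", table)
```

In this sketch the first FlowFile carrying a given hash goes to ‘distinct’, and a later FlowFile with the same hash goes to ‘duplicate’, because the first call recorded the hash.
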