SnowflakeDetectDuplicate 2025.10.9.21

Bundle

com.snowflake.openflow.runtime | runtime-snowflake-processors-nar

Description

Checks whether a FlowFile’s hash (provided as a FlowFile attribute) is already in a Snowflake table, and routes the FlowFile to ‘duplicate’ if found, ‘distinct’ if not found, or ‘failure’ on errors.

Tags

database, detect, duplicates, hash, snowflake

Input Requirement

REQUIRED

Supports Sensitive Dynamic Properties

false

Properties

Property | Description
Content Hash | The name of the FlowFile attribute that holds the pre-computed hash. Supports Expression Language.
Document Source Identifier | Specifies the document source identifier (doc ID). Supports Expression Language.
Document Source Name | Specifies the document source system name. Supports Expression Language.
Snowflake Connection Service | The DBCPService that provides a connection to Snowflake.
Snowflake Table Name | The name of the Snowflake table that stores the file hashes. The table name is case-insensitive. The database and schema must already be configured in the Snowflake Connection Service.

Relationships

Name | Description
distinct | FlowFiles that do not match an existing document are routed here (new hash inserted).
duplicate | FlowFiles that match an existing document (same hash) are routed here.
failure | FlowFiles that encounter an error or exception during processing are routed here.

Writes attributes

Name | Description
snowflake.detect.duplicate | A ‘true’ or ‘false’ attribute indicating if the FlowFile was detected as a duplicate.
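
Example

The routing behavior described above can be illustrated with a minimal Python sketch. It is not the processor's implementation: the function name route_flowfile, the attribute name content.hash, and the in-memory set standing in for the Snowflake table are all hypothetical. How Document Source Identifier and Document Source Name take part in the lookup is not stated on this page, so the sketch leaves them out, and it assumes that no attribute is written on failure.

```python
from typing import Dict, Set, Tuple


def route_flowfile(
    attributes: Dict[str, str],
    hash_attribute: str,
    known_hashes: Set[str],
) -> Tuple[str, Dict[str, str]]:
    """Return (relationship, attributes) for one FlowFile.

    known_hashes is a stand-in for the Snowflake table of stored hashes.
    """
    # A missing or empty hash attribute is treated as an error (assumption).
    content_hash = attributes.get(hash_attribute)
    if not content_hash:
        return "failure", dict(attributes)

    if content_hash in known_hashes:
        # Hash already stored: route to 'duplicate'.
        return "duplicate", {**attributes, "snowflake.detect.duplicate": "true"}

    # Hash not stored yet: record it and route to 'distinct'.
    known_hashes.add(content_hash)
    return "distinct", {**attributes, "snowflake.detect.duplicate": "false"}


if __name__ == "__main__":
    table: Set[str] = set()
    flowfile = {"content.hash": "abc123"}
    print(route_flowfile(flowfile, "content.hash", table)[0])  # first sighting
    print(route_flowfile(flowfile, "content.hash", table)[0])  # same hash again
```

In the sketch, the first call with a given hash is routed to ‘distinct’ and stores the hash, so a second call with the same hash is routed to ‘duplicate’.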