SnowflakeDetectDuplicate 2025.3.28.13-SNAPSHOT

BUNDLE

com.snowflake.openflow.runtime | runtime-snowflake-processors-nar

DESCRIPTION

Checks whether a FlowFile’s hash (supplied in a FlowFile attribute) already exists in a Snowflake table, then routes the FlowFile to ‘duplicate’ if the hash is found, ‘distinct’ if it is not, or ‘failure’ if an error occurs.
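
The routing behavior described above can be illustrated with a minimal sketch. This is not the processor's implementation: it uses Python's built-in sqlite3 module as a stand-in for Snowflake, and the table name `doc_hashes`, the column `content_hash`, and the attribute name `content.hash` are all hypothetical. Leaving the output attribute unset on failure is likewise an assumption.

```python
import sqlite3


def detect_duplicate(conn, attributes, hash_attribute="content.hash"):
    """Return (relationship, attributes) for one FlowFile-like attribute dict.

    Assumptions: table `doc_hashes` and column `content_hash` are hypothetical;
    sqlite3 stands in for the Snowflake connection.
    """
    try:
        # A missing hash attribute raises KeyError and is treated as a failure.
        content_hash = attributes[hash_attribute]
        row = conn.execute(
            "SELECT 1 FROM doc_hashes WHERE content_hash = ?", (content_hash,)
        ).fetchone()
        if row is not None:
            # Hash already stored: route to 'duplicate'.
            return "duplicate", {**attributes, "snowflake.detect.duplicate": "true"}
        # Hash not seen before: store it and route to 'distinct'.
        conn.execute(
            "INSERT INTO doc_hashes (content_hash) VALUES (?)", (content_hash,)
        )
        conn.commit()
        return "distinct", {**attributes, "snowflake.detect.duplicate": "false"}
    except (KeyError, sqlite3.Error):
        # Any lookup or database error: route to 'failure' (attribute left unset here).
        return "failure", attributes


# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_hashes (content_hash TEXT PRIMARY KEY)")
```

In this sketch, calling `detect_duplicate(conn, {"content.hash": "abc123"})` twice returns `"distinct"` the first time and `"duplicate"` the second, because the first call inserts the hash.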

TAGS

database, detect, duplicates, hash, snowflake

INPUT REQUIREMENT

REQUIRED

SUPPORTS SENSITIVE DYNAMIC PROPERTIES

false

PROPERTIES

Property

Description

Content Hash

The name of the FlowFile attribute that holds the pre-computed hash. Supports Expression Language.

Document Source Identifier

Specifies the document source identifier (doc ID). Supports Expression Language.

Document Source Name

Specifies the document source system name. Supports Expression Language.

Snowflake Connection Service

The DBCPService that provides the connection to Snowflake.

Snowflake Table Name

The name of the Snowflake table that stores the file hashes. The database and schema must already be configured in the Snowflake Connection Service.
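
Because the hash must be pre-computed and present as a FlowFile attribute before this processor runs, an upstream step has to produce it. A minimal sketch of that step follows, assuming SHA-256 as the algorithm and `content.hash` as the attribute name; neither is specified by this page, and the function name is hypothetical.

```python
import hashlib


def add_content_hash(content: bytes, attributes: dict,
                     hash_attribute: str = "content.hash") -> dict:
    """Return a copy of `attributes` with a hex digest of `content` added.

    Assumptions: SHA-256 and the attribute name `content.hash` are
    illustrative choices, not requirements stated by the processor.
    """
    digest = hashlib.sha256(content).hexdigest()
    return {**attributes, hash_attribute: digest}
```

Identical content always yields the same digest, which is what lets a later lookup against the hash table recognize a repeated document.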

RELATIONSHIPS

NAME

DESCRIPTION

distinct

FlowFiles whose hash does not match an existing document are routed here; the new hash is inserted into the table.

failure

FlowFiles that encounter an error or exception during processing are routed here.

duplicate

FlowFiles that match an existing document (same hash) are routed here.

WRITES ATTRIBUTES

NAME

DESCRIPTION

snowflake.detect.duplicate

Set to ‘true’ or ‘false’ to indicate whether the FlowFile was detected as a duplicate.