All processors (alphabetical)

This topic lists all Snowflake Openflow processors in alphabetical order. The list includes:

  • The name of each processor

  • A summary of each processor

A

Processor

Description

AttributesToCSV

Generates a CSV representation of the input FlowFile Attributes.

AttributesToJSON

Generates a JSON representation of the input FlowFile Attributes.

C

Processor

Description

CalculateRecordStats

Counts the number of Records in a record set, optionally counting the number of elements per category, where the categories are defined by user-defined properties.

Snowflake

CaptureChangeMySQL

Reads CDC events from a MySQL database.

Snowflake

CaptureChangePostgreSQL

Reads CDC events from a PostgreSQL database.

Snowflake

CaptureGoogleDriveChanges

Captures changes to a Shared Google Drive and emits a FlowFile for each change that occurs.

Snowflake

CaptureMicrosoft365GroupsChanges

Captures Microsoft365 groups changes and emits a FlowFile for each change that occurs.

Snowflake

CaptureSharepointChanges

Captures changes from a Sharepoint Document Library and emits a FlowFile for each change that occurs.

Snowflake

CheckMetaAdsReportReadiness

Checks whether the Meta Ads report is ready for download.

Snowflake

ChunkDocument

Given an input Openflow Document, chunks the data into segments that are more applicable for LLM synthesis or semantic embedding.

Snowflake

ChunkRecordText

Chunks text with options for recursively splitting by delimiters and max character length.

Snowflake

ChunkText

Chunks text with options for recursively splitting by delimiters and max character length.

CompressContent

Compresses or decompresses the contents of FlowFiles using a user-specified compression algorithm and updates the mime.

ConnectWebSocket

Acts as a WebSocket client endpoint to interact with a remote WebSocket server.

ConsumeAMQP

Consumes AMQP Messages from an AMQP Broker using the AMQP 0.

ConsumeAzureEventHub

Receives messages from Microsoft Azure Event Hubs with checkpointing to ensure consistent event processing.

ConsumeBoxEnterpriseEvents

Consumes Enterprise Events from Box admin_logs_streaming Stream Type.

ConsumeBoxEvents

Consumes all events from Box.

ConsumeElasticsearch

A processor that repeatedly runs a paginated query against a field using a Range query to consume new Documents from an Elasticsearch index/query.

ConsumeGCPubSub

Consumes messages from the configured Google Cloud PubSub subscription.

ConsumeIMAP

Consumes messages from Email Server using IMAP protocol.

ConsumeJMS

Consumes JMS Message of type BytesMessage, TextMessage, ObjectMessage, MapMessage or StreamMessage transforming its content to a FlowFile and transitioning it to ‘success’ relationship.

Snowflake

ConsumeKafka

Consumes messages from Apache Kafka using the Kafka Consumer API.

ConsumeKinesisStream

Reads data from the specified AWS Kinesis stream and outputs a FlowFile for every processed Record (raw) or a FlowFile for a batch of processed records if a Record Reader and Record Writer are configured.

ConsumeMQTT

Subscribes to a topic and receives messages from an MQTT broker

ConsumePOP3

Consumes messages from Email Server using POP3 protocol.

ConsumeSlack

Retrieves messages from one or more configured Slack channels.

Snowflake

ConsumeSlackConversation

Retrieves messages from Slack conversations available to the App.

Snowflake

ConsumeSlackHistory

Fetches historical messages from all Slack channels available to the App.

Snowflake

ConsumeSnowflakeStream

Fetches data from a Snowflake stream and writes it to a FlowFile.

ConsumeTwitter

Streams tweets from Twitter’s streaming API v2.

ControlRate

Controls the rate at which data is transferred to follow-on processors.

ConvertCharacterSet

Converts a FlowFile’s content from one character set to another

Snowflake

ConvertOfficeFormat

Converts an Open Office compatible file to PDF or DOCX format.

Snowflake

ConvertPdfToImage

Converts a PDF file into a series of images, one for each page.

ConvertRecord

Converts records from one data format to another using configured Record Reader and Record Write Controller Services.

Snowflake

ConvertToJournalSchema

Converts the incoming database schema into the appropriate schema for a Snowflake CDC Journal table.

CopyAzureBlobStorage_v12

Copies a blob in Azure Blob Storage from one account/container to another.

CopyS3Object

Copies a file from one bucket and key to another in AWS S3

CountText

Counts various metrics on incoming text.

Snowflake

CreateAmazonAdsReport

Processor which creates report configuration for Amazon Ads connector.

Snowflake

CreateAzureOpenAiEmbeddings

Uses Azure OpenAI to create embeddings for text.

Snowflake

CreateCohereEmbeddings

Uses Cohere to create embeddings for text.

Snowflake

CreateMetaAdsReport

Processor which creates report configuration for Meta Ads connector.

Snowflake

CreateOllamaEmbeddings

Uses Ollama to create embeddings for text.

Snowflake

CreateOpenAiEmbeddings

Uses OpenAI to create embeddings for text.

Snowflake

CreateSnowflakeEmbeddings

Create vector embeddings using Snowflake Cortex Large Language Model functions

Snowflake

CreateVertexAIEmbeddings

Uses VertexAI to create embeddings for text.

CryptographicHashContent

Calculates a cryptographic hash value for the flowfile content using the given algorithm and writes it to an output attribute.
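
The ChunkText and ChunkRecordText entries above mention recursive splitting by delimiters with a maximum character length. The following Python sketch illustrates that general idea only; it is not the processors' implementation. The function name `chunk_text`, the default delimiter order, and the hard-cut fallback are all assumptions made for illustration.

```python
def chunk_text(text, delimiters=("\n\n", "\n", " "), max_chars=40):
    """Recursively split text so that every chunk is at most max_chars long.

    Hypothetical helper: the delimiter order and the fallback behaviour are
    illustrative assumptions, not documented processor behaviour.
    """
    if len(text) <= max_chars:
        return [text] if text else []
    if not delimiters:
        # No delimiters left: fall back to a hard cut every max_chars.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    head, rest = delimiters[0], delimiters[1:]
    chunks, current = [], ""
    for piece in text.split(head):
        candidate = piece if not current else current + head + piece
        if len(candidate) <= max_chars:
            current = candidate      # the piece still fits in the open chunk
            continue
        if current:
            chunks.append(current)   # close the open chunk
            current = ""
        if len(piece) <= max_chars:
            current = piece          # start a new chunk with this piece
        else:
            # The piece alone is too long: retry with the finer delimiters.
            chunks.extend(chunk_text(piece, rest, max_chars))
    if current:
        chunks.append(current)
    return chunks


sample = ("First paragraph here.\n\n"
          "Second paragraph is quite a bit longer than the limit.\nTail")
chunks = chunk_text(sample, max_chars=30)
```

Coarse delimiters are tried first so that paragraphs stay together when they fit, and finer delimiters are used only for pieces that exceed the limit.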

D

Processor

Description

DebugFlow

The DebugFlow processor aids testing and debugging the FlowFile framework by allowing various responses to be explicitly triggered in response to the receipt of a FlowFile or a timer event without a FlowFile if using timer or cron based scheduling.

DecryptContentAge

Decrypt content using the age-encryption.

DecryptContentPGP

Decrypt contents of OpenPGP messages.

DeduplicateRecord

This processor de-duplicates individual records within a record set.

DeleteAzureBlobStorage_v12

Deletes the specified blob from Azure Blob Storage.

DeleteAzureDataLakeStorage

Deletes the provided file from Azure Data Lake Storage

DeleteByQueryElasticsearch

Delete from an Elasticsearch index using a query.

Snowflake

DeleteDBFSResource

Deletes DBFS files and directories.

DeleteDynamoDB

Deletes a document from DynamoDB based on hash and range key.

DeleteFile

Deletes a file from the filesystem.

DeleteGCSObject

Deletes objects from a Google Cloud Bucket.

DeleteGridFS

Deletes a file from GridFS using a file name or a query.

Snowflake

DeleteMilvus

Deletes vectors from Milvus database from a collection by ID.

DeleteMongo

Executes a delete query against a MongoDB collection.

Snowflake

DeletePinecone

Deletes vectors from a Pinecone index.

DeleteS3Object

Deletes a file from an Amazon S3 Bucket.

DeleteSFTP

Deletes a file residing on an SFTP server.

DeleteSQS

Deletes a message from an Amazon Simple Queuing Service Queue

Snowflake

DeleteUnityCatalogResource

Delete a Unity Catalog file or directory.

DetectDuplicate

Caches a value, computed from FlowFile attributes, for each incoming FlowFile and determines if the cached value has already been seen.

DistributeLoad

Distributes FlowFiles to downstream processors based on a Distribution Strategy.

DuplicateFlowFile

Intended for load testing, this processor will create the configured number of copies of each incoming FlowFile.

E

Processor

Description

EncodeContent

Encode or decode the contents of a FlowFile using Base64, Base32, or hex encoding schemes

EncryptContentAge

Encrypt content using the age-encryption.

EncryptContentPGP

Encrypt contents using OpenPGP.

EnforceOrder

Enforces expected ordering of FlowFiles that belong to the same data group within a single node.

Snowflake

EnrichAttributes

Looks up a value using the configured Lookup Service and adds the results to the FlowFile as one or more attributes.

Snowflake

EnrichCdcStream

Enriches incoming FlowFiles that come from CaptureChangePostgreSQL, etc.

EvaluateJsonPath

Evaluates one or more JsonPath expressions against the content of a FlowFile.

Snowflake

EvaluateRagAnswerCorrectness

Evaluates the correctness of generated answers in a Retrieval-Augmented Generation (RAG) context by computing metrics such as F1 score, cosine similarity, and answer correctness.

Snowflake

EvaluateRagFaithfulness

Evaluates the faithfulness of generated answers in a Retrieval-Augmented Generation (RAG) system by analyzing responses using an LLM (e.

Snowflake

EvaluateRagRetrieval

Calculates retrieval metrics (Precision@N, Recall@N, FScore@N, MAP@N, MRR) for a RAG system using an LLM as a judge.

EvaluateXPath

Evaluates one or more XPaths against the content of a FlowFile.

EvaluateXQuery

Evaluates one or more XQueries against the content of a FlowFile.

ExecuteGroovyScript

Experimental Extended Groovy script processor.

ExecuteProcess

Runs an operating system command specified by the user and writes the output of that command to a FlowFile.

ExecuteScript

Experimental - Executes a script given the flow file and a process session.

ExecuteSQL

Executes provided SQL select query.

ExecuteSQLRecord

Executes provided SQL select query.

Snowflake

ExecuteSQLStatement

Executes a SQL DDL or DML Statement against a database.

ExecuteStreamCommand

The ExecuteStreamCommand processor provides a flexible way to integrate external commands and scripts into NiFi data flows.

ExtractAvroMetadata

Extracts metadata from the header of an Avro datafile.

Snowflake

ExtractDocumentRawText

Extracts the text from a Document and writes it to the FlowFile content.

ExtractEmailAttachments

Extract attachments from a mime formatted email file, splitting them into individual flowfiles.

ExtractEmailHeaders

Using the flowfile content as source of data, extract header from an RFC compliant email file adding the relevant attributes to the flowfile.

ExtractGrok

Evaluates one or more Grok Expressions against the content of a FlowFile, adding the results as attributes or replacing the content of the FlowFile with a JSON notation of the matched content

ExtractRecordSchema

Extracts the record schema from the FlowFile using the supplied Record Reader and writes it to the ‘avro.

ExtractText

Evaluates one or more Regular Expressions against the content of a FlowFile.
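
The EvaluateRagRetrieval entry above names several retrieval metrics. As a rough illustration of what Precision@N, Recall@N, FScore@N, and the reciprocal rank behind MRR measure, here is a Python sketch using the common textbook definitions. It assumes relevance labels are already known and that document IDs are unique, whereas the processor uses an LLM as a judge. All function names are hypothetical, and the processor's exact formulas may differ.

```python
def precision_at_n(retrieved, relevant, n):
    """Fraction of the top-n retrieved items that are relevant."""
    if n <= 0:
        return 0.0
    hits = sum(1 for doc in retrieved[:n] if doc in relevant)
    return hits / n


def recall_at_n(retrieved, relevant, n):
    """Fraction of all relevant items that appear in the top n."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in retrieved[:n] if doc in relevant)
    return hits / len(relevant)


def f_score_at_n(retrieved, relevant, n):
    """Harmonic mean of Precision@N and Recall@N."""
    p = precision_at_n(retrieved, relevant, n)
    r = recall_at_n(retrieved, relevant, n)
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant item; MRR averages this over queries."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


# Illustrative data: document IDs in ranked order, plus the relevant set.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d9"}
p3 = precision_at_n(retrieved, relevant, 3)
r3 = recall_at_n(retrieved, relevant, 3)
f3 = f_score_at_n(retrieved, relevant, 3)
rr = reciprocal_rank(retrieved, relevant)
```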

F

Processor

Description

FetchAzureBlobStorage_v12

Retrieves the specified blob from Azure Blob Storage and writes its content to the content of the FlowFile.

FetchAzureDataLakeStorage

Fetch the specified file from Azure Data Lake Storage

FetchBoxFile

Fetches files from a Box Folder.

FetchBoxFileInfo

Fetches metadata for files from Box and adds it to the FlowFile’s attributes.

FetchBoxFileRepresentation

Fetches a Box file representation using a representation hint and writes it to the FlowFile content.

FetchDistributedMapCache

Computes cache key(s) from FlowFile attributes, for each incoming FlowFile, and fetches the value(s) from the Distributed Map Cache associated with each key.

FetchDropbox

Fetches files from Dropbox.

FetchFile

Reads the contents of a file from disk and streams it into the contents of an incoming FlowFile.

FetchFTP

Fetches the content of a file from a remote FTP server and overwrites the contents of an incoming FlowFile with the content of the remote file.

FetchGCSObject

Fetches a file from a Google Cloud Bucket.

FetchGoogleDrive

Fetches files from a Google Drive Folder.

Snowflake

FetchGoogleDriveFileComments

Fetches comments and their replies for a Google Drive file.

Snowflake

FetchGoogleDriveMetadata

Fetches Google Drive file metadata.

FetchGridFS

Retrieves one or more files from a GridFS bucket by file name or by a user-defined query.

Snowflake

FetchJiraIssues

Fetches issues from Jira Cloud using REST API v3 with configurable search options.

Snowflake

FetchMicrosoftDataverseTable

Fetch records from Microsoft Dataverse Tables

FetchS3Object

Retrieves the contents of an S3 Object and writes it to the content of a FlowFile

FetchSFTP

Fetches the content of a file from a remote SFTP server and overwrites the contents of an incoming FlowFile with the content of the remote file.

Snowflake

FetchSharepointFile

Fetches the contents of a file from a Sharepoint Drive, optionally downloading a PDF or HTML version of the file when applicable.

Snowflake

FetchSharepointMetadata

For each drive item retrieves its metadata and permissions and writes them as FlowFile attributes.

Snowflake

FetchSlackConversationInfo

Fetches Slack conversation info and member emails

Snowflake

FetchSlackFile

Downloads a file shared on Slack.

Snowflake

FetchSlackMessage

Fetches data about a single Slack message

FetchSmb

Fetches files from a SMB Share.

Snowflake

FetchSnowflakeTableProperties

Reads properties from a table and stores them as flow file attributes.

Snowflake

FetchSourceTableSchema

Fetches the table schema (i.

Snowflake

FetchTableSnapshot

Fetches a snapshot of a table from a database.

FilterAttribute

Filters the attributes of a FlowFile by retaining specified attributes and removing the rest or by removing specified attributes and retaining the rest.

FlattenJson

Provides the user with the ability to take a nested JSON document and flatten it into a simple key/value pair document.

ForkEnrichment

Used in conjunction with the JoinEnrichment processor, this processor is responsible for adding the attributes that are necessary for the JoinEnrichment processor to perform its function.

ForkRecord

This processor allows the user to fork a record into multiple records.

Snowflake

FormatWordDocument

Formats an MS Word docx file.
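
The FlattenJson entry above describes turning a nested JSON document into a simple key/value document. The Python sketch below shows one way such flattening can work. The `flatten` function, the `.` separator, and the use of list indexes as key segments are assumptions for illustration and may not match the processor's defaults or options.

```python
import json


def flatten(value, separator=".", prefix=""):
    """Flatten nested dicts and lists into a single-level dict.

    Hypothetical helper: keys are joined with `separator`, and list items
    are addressed by their index.
    """
    if isinstance(value, dict) and value:
        items = value.items()
    elif isinstance(value, list) and value:
        items = ((str(i), v) for i, v in enumerate(value))
    else:
        # Scalars and empty containers are treated as leaves.
        return {prefix: value}

    flat = {}
    for key, child in items:
        path = f"{prefix}{separator}{key}" if prefix else str(key)
        flat.update(flatten(child, separator, path))
    return flat


document = json.loads('{"user": {"name": "Ada", "tags": ["a", "b"]}, "active": true}')
flat = flatten(document)
```

Here `flat` maps `user.name`, `user.tags.0`, `user.tags.1`, and `active` to their leaf values.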

G

Processor

Description

Snowflake

GenerateAnswersFromContext

Generates synthetic answers for each question present in the incoming records using a Large Language Model (LLM).

Snowflake

GenerateAnswersFromGroundTruth

Generates synthetic answers for each question in the incoming records using an LLM.

GenerateFlowFile

This processor creates FlowFiles with random data or custom content.

GenerateRecord

This processor creates FlowFiles with records having random values for the specified fields.

GenerateTableFetch

Generates SQL select queries that fetch “pages” of rows from a table.

GeoEnrichIP

Looks up geolocation information for an IP address and adds the geo information to FlowFile attributes.

GeoEnrichIPRecord

Looks up geolocation information for an IP address and adds the geo information to FlowFile attributes.

Snowflake

GetAmazonAdsReport

Downloads a report from Amazon Ads if it is ready.

GetAwsPollyJobStatus

Retrieves the current status of an AWS Polly job.

GetAwsTextractJobStatus

Retrieves the current status of an AWS Textract job.

GetAwsTranscribeJobStatus

Retrieves the current status of an AWS Transcribe job.

GetAwsTranslateJobStatus

Retrieves the current status of an AWS Translate job.

GetAzureEventHub

Receives messages from Microsoft Azure Event Hubs without reliable checkpoint tracking.

GetAzureQueueStorage_v12

Retrieves the messages from an Azure Queue Storage.

GetBoxFileCollaborators

Retrieves all collaborators on a Box file and adds the collaboration information to the FlowFile’s attributes.

GetBoxGroupMembers

Retrieves members for a Box Group and writes their details in FlowFile attributes.

Snowflake

GetDBFSFile

Read a DBFS file.

GetDynamoDB

Retrieves a document from DynamoDB based on hash and range key.

GetElasticsearch

Elasticsearch get processor that uses the official Elastic REST client libraries to fetch a single document from Elasticsearch by _id.

GetFile

Creates FlowFiles from files in a directory.

GetFileResource

This processor creates FlowFiles with the content of the configured File Resource.

GetFTP

Fetches files from an FTP Server and creates FlowFiles from them

GetGcpVisionAnnotateFilesOperationStatus

Retrieves the current status of a Google Vision operation.

GetGcpVisionAnnotateImagesOperationStatus

Retrieves the current status of a Google Vision operation.

Snowflake

GetGoogleAdsReport

A processor which can interact with Google Ads Reporting API.

Snowflake

GetGoogleGroupMembers

Retrieves the immediate (top-level) members of one or more Google Groups, specified as a comma-separated list of group IDs that is given as a FlowFile attribute.

Snowflake

GetGoogleSheets

Processor responsible for fetching data from Google Sheets.

GetHubSpot

Retrieves JSON data from a private HubSpot application.

Snowflake

GetHubSpotObject

Get a HubSpot object and its associations by ID or unique value.

Snowflake

GetMicrosoft365GroupMembers

Retrieves Microsoft365 group members and emits a FlowFile for each change that occurs.

GetMongo

Creates FlowFiles from documents in MongoDB loaded by a user-specified query.

GetMongoRecord

A record-based version of GetMongo that uses the Record writers to write the MongoDB result set.

GetS3ObjectMetadata

Check for the existence of an Object in S3 and fetch its Metadata without attempting to download it.

GetS3ObjectTags

Check for the existence of an Object in S3 and fetch its Tags without attempting to download it.

GetSFTP

Fetches files from an SFTP Server and creates FlowFiles from them

Snowflake

GetSharepointSiteGroupMembers

Retrieves all members of a SharePoint site group.

GetShopify

Retrieves objects from a custom Shopify store.

GetSmbFile

Reads files from a Samba network location into FlowFiles.

GetSplunk

Retrieves data from Splunk Enterprise.

GetSQS

Fetches messages from an Amazon Simple Queuing Service Queue

Snowflake

GetUnityCatalogFile

Read a Unity Catalog file up to 5 GiB.

Snowflake

GetUnityCatalogFileMetadata

Checks for Unity Catalog file metadata.

GetWorkdayReport

A processor which can interact with a configurable Workday Report.

GetZendesk

Incrementally fetches data from Zendesk API.

H

Processor

Description

HandleHttpRequest

Starts an HTTP Server and listens for HTTP Requests.

HandleHttpResponse

Sends an HTTP Response to the Requestor that generated a FlowFile.

I

Processor

Description

IdentifyMimeType

Attempts to identify the MIME Type used for a FlowFile.

Snowflake

InferJiraIssueSchema

Automatically infers and generates an Apache Avro schema from Jira issue data.

InvokeHTTP

An HTTP client processor which can interact with a configurable HTTP Endpoint.

InvokeScriptedProcessor

Experimental - Invokes a script engine for a Processor defined in the given script.

ISPEnrichIP

Looks up ISP information for an IP address and adds the information to FlowFile attributes.

J

Processor

Description

JoinEnrichment

Joins together Records from two different FlowFiles, where one FlowFile, the ‘original’, contains arbitrary records and the second FlowFile, the ‘enrichment’, contains additional data that should be used to enrich the first.

JoltTransformJSON

Applies a list of Jolt specifications to the flowfile JSON payload.

JoltTransformRecord

Applies a JOLT specification to each record in the FlowFile payload.

JSLTTransformJSON

Applies a JSLT transformation to the FlowFile JSON payload.

JsonQueryElasticsearch

A processor that allows the user to run a query (with aggregations) written with the Elasticsearch JSON DSL.

L

Processor

Description

Snowflake

ListArchivedHubSpotData

Lists archived data from HubSpot for the chosen object type and generates one FlowFile per listed object with the corresponding metadata as FlowFile attributes.

ListAzureBlobStorage_v12

Lists blobs in an Azure Blob Storage container.

ListAzureDataLakeStorage

Lists directory in an Azure Data Lake Storage Gen 2 filesystem

ListBoxFile

Lists files in a Box folder.

ListDatabaseTables

Generates a set of flow files, each containing attributes corresponding to metadata about a table from a database connection.

Snowflake

ListDBFSDirectory

List file names in a DBFS directory and output a new FlowFile with the filename.

ListDropbox

Retrieves a listing of files from Dropbox (shortcuts are ignored).

ListenFTP

Starts an FTP server that listens on the specified port and transforms incoming files into FlowFiles.

ListenHTTP

Starts an HTTP Server and listens on a given base path to transform incoming requests into FlowFiles.

ListenOTLP

Collect OpenTelemetry messages over HTTP or gRPC.

ListenSlack

Retrieves real-time messages or Slack commands from one or more Slack conversations.

ListenSyslog

Listens for Syslog messages being sent to a given port over TCP or UDP.

ListenTCP

Listens for incoming TCP connections and reads data from each connection using a line separator as the message demarcator.

ListenUDP

Listens for Datagram Packets on a given port.

ListenUDPRecord

Listens for Datagram Packets on a given port and reads the content of each datagram using the configured Record Reader.

ListenWebSocket

Acts as a WebSocket server endpoint to accept client connections.

ListFile

Retrieves a listing of files from the input directory.

ListFTP

Performs a listing of the files residing on an FTP server.

ListGCSBucket

Retrieves a listing of objects from a GCS bucket.

ListGoogleDrive

Performs a listing of concrete files (shortcuts are ignored) in a Google Drive folder.

Snowflake

ListGoogleGroups

Lists all of the groups for a given domain in Google Workspace.

Snowflake

ListHubSpotObjects

Fetches data from HubSpot for specified object types, and generates one FlowFile per listed object with the corresponding metadata as FlowFile attributes.

Snowflake

ListMicrosoftDataverseTables

List Tables from Microsoft Dataverse environments

ListS3

Retrieves a listing of objects from an S3 bucket.

ListSFTP

Performs a listing of the files residing on an SFTP server.

Snowflake

ListSharepointSiteGroups

Lists all SharePoint site groups available on a specified SharePoint site.

ListSmb

Lists concrete files shared via SMB protocol.

Snowflake

ListTableNames

Fetches all source table names and matches them with one of the possible configurations:- regexp expression e.

Snowflake

ListUnityCatalogDirectory

List file names in a Unity Catalog directory and output a new FlowFile with the filename.

LogAttribute

Emits attributes of the FlowFile at the specified log level

LogMessage

Emits a log message at the specified log level

LookupAttribute

Lookup attributes from a lookup service

LookupRecord

Extracts one or more fields from a Record and looks up a value for those fields in a LookupService.

M

Processor

Description

MergeContent

Merges a Group of FlowFiles together based on a user-defined strategy and packages them into a single FlowFile.

Snowflake

MergeDocumentElements

Given a FlowFile that contains a full Document and one or more FlowFiles that contain additional data to merge into the Document, this Processor will merge the additional data into the Document.

MergeRecord

This Processor merges together multiple record-oriented FlowFiles into a single FlowFile that contains all of the Records of the input FlowFiles.

Snowflake

MergeSnowflakeJournalTable

Triggers a merge operation on changes from journal table to a destination table in Snowflake.

ModifyBytes

Discard byte range at the start and end or all content of a binary file.

ModifyCompression

Changes the compression algorithm used to compress the contents of a FlowFile by decompressing the contents of FlowFiles using a user-specified compression algorithm and recompressing the contents using the specified compression format properties.

MonitorActivity

Monitors the flow for activity and sends out an indicator when the flow has not had any data for some specified amount of time and again when the flow’s activity is restored

MoveAzureDataLakeStorage

Moves content within Azure Data Lake Storage Gen 2.

N

Processor

Description

Notify

Caches a release signal identifier in the distributed cache, optionally along with the FlowFile’s attributes.

O

Processor

Description

Snowflake

OpenAiTranscribeAudio

Transcribes audio into English text.

P

Processor

Description

PackageFlowFile

This processor will package FlowFile attributes and content into an output FlowFile that can be exported from NiFi and imported back into NiFi, preserving the original attributes and content.

PaginatedJsonQueryElasticsearch

A processor that allows the user to run a paginated query (with aggregations) written with the Elasticsearch JSON DSL.

ParseEvtx

Parses the contents of a Windows Event Log file (evtx) and writes the resulting XML to the FlowFile

Snowflake

ParsePdfDocument

Parses a PDF file, extracting the text and additional information into a structured JSON document.

ParseSyslog

Attempts to parse the contents of a Syslog message in accordance with the RFC5424 and RFC3164 formats and adds attributes to the FlowFile for each of the parts of the Syslog message.

ParseSyslog5424

Attempts to parse the contents of a well-formed Syslog message in accordance with the RFC5424 format and adds attributes to the FlowFile for each of the parts of the Syslog message, including Structured Data.

Snowflake

ParseTableImage

Extracts the text from a Table image and writes it to the FlowFile content in csv format.

PartitionRecord

Splits, or partitions, record-oriented data based on the configured fields in the data.

Snowflake

PerformOCR

Uses the Openflow Tesseract OCR Service to extract text from a PDF or image, optionally providing metadata including the bounding box, page number, and confidence level of the OCR.

Snowflake

PerformSnowflakeCortexOCR

Performs Optical Character Recognition (OCR) on PDF documents using Snowflake Cortex ML functions.

Snowflake

PromptAnthropicAI

Sends a prompt to Anthropic, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile.

Snowflake

PromptAzureOpenAI

Sends a prompt to Azure’s OpenAI service, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile.

Snowflake

PromptLLM

This processor sends a user defined prompt to a Large Language Model (LLM) to respond.

Snowflake

PromptOllama

Sends a prompt to Ollama, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile.

Snowflake

PromptOpenAI

Sends a prompt to OpenAI, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile.

Snowflake

PromptSnowflakeCortex

Sends a prompt to Snowflake Cortex, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile.

Snowflake

PromptVertexAI

Sends a prompt to VertexAI, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile.

PublishAMQP

Creates an AMQP Message from the contents of a FlowFile and sends the message to an AMQP Exchange.

PublishGCPubSub

Publishes the content of the incoming flowfile to the configured Google Cloud PubSub topic.

PublishJMS

Creates a JMS Message from the contents of a FlowFile and sends it to a JMS Destination (queue or topic) as JMS BytesMessage or TextMessage.

Snowflake

PublishKafka

Sends the contents of a FlowFile as either a message or as individual records to Apache Kafka using the Kafka Producer API.

PublishMQTT

Publishes a message to an MQTT topic

PublishSlack

Posts a message to the specified Slack channel.

PutAzureBlobStorage_v12

Puts content into a blob on Azure Blob Storage.

PutAzureCosmosDBRecord

This processor is a record-aware processor for inserting data into Cosmos DB with Core SQL API.

PutAzureDataExplorer

Acts as an Azure Data Explorer sink which sends FlowFiles to the provided endpoint.

PutAzureDataLakeStorage

Writes the contents of a FlowFile as a file on Azure Data Lake Storage Gen 2

PutAzureEventHub

Send FlowFile contents to Azure Event Hubs

PutAzureQueueStorage_v12

Writes the content of the incoming FlowFiles to the configured Azure Queue Storage.

PutBigQuery

Writes the contents of a FlowFile to a Google BigQuery table.

PutBoxFile

Puts content to a Box folder.

PutCloudWatchMetric

Publishes metrics to Amazon CloudWatch.

PutDatabaseRecord

The PutDatabaseRecord processor uses a specified RecordReader to input (possibly multiple) records from an incoming flow file.

Snowflake

PutDatabricksSQL

Submit a SQL Execution using Databricks REST API then write the JSON response to FlowFile Content.

Snowflake

PutDBFSFile

Write FlowFile content to DBFS.

PutDistributedMapCache

Gets the content of a FlowFile and puts it to a distributed map cache, using a cache key computed from FlowFile attributes.

PutDropbox

Puts content to a Dropbox folder.

PutDynamoDB

Puts a document into DynamoDB based on hash and range key.

PutDynamoDBRecord

Inserts items into DynamoDB based on record-oriented data.

PutElasticsearchJson

An Elasticsearch put processor that uses the official Elastic REST client libraries.

PutElasticsearchRecord

A record-aware Elasticsearch put processor that uses the official Elastic REST client libraries.

PutEmail

Sends an e-mail to configured recipients for each incoming FlowFile

PutFile

Writes the contents of a FlowFile to the local file system

PutFTP

Sends FlowFiles to an FTP Server

PutGCSObject

Writes the contents of a FlowFile as an object in Google Cloud Storage.

PutGoogleDrive

Writes the contents of a FlowFile as a file in Google Drive.

PutGridFS

Writes a file to a GridFS bucket.

Snowflake

PutHubSpot

Upsert a HubSpot object.

Snowflake

PutIcebergTable

Store records in Iceberg using configurable Catalog for managing namespaces and tables.

PutKinesisFirehose

Sends the contents to a specified Amazon Kinesis Firehose.

PutKinesisStream

Sends the contents to a specified Amazon Kinesis.

PutLambda

Sends the contents to a specified Amazon Lambda Function.

PutMongo

Writes the contents of a FlowFile to MongoDB

PutMongoBulkOperations

Writes the contents of a FlowFile to MongoDB as bulk-update

PutMongoRecord

This processor is a record-aware processor for inserting/upserting data into MongoDB.

PutRecord

The PutRecord processor uses a specified RecordReader to input (possibly multiple) records from an incoming flow file, and sends them to a destination specified by a Record Destination Service (i.

PutRedisHashRecord

Puts record field data into Redis using a specified hash value, which is determined by a RecordPath to a field in each record containing the hash value.

PutS3Object

Writes the contents of a FlowFile as an S3 Object to an Amazon S3 Bucket.

PutSalesforceObject

Creates new records for the specified Salesforce sObject.

PutSFTP

Sends FlowFiles to an SFTP Server

PutSmbFile

Writes the contents of a FlowFile to a samba network location.

Snowflake

PutSnowflakeInternalStageFile

Puts files into a Snowflake internal stage.

Snowflake

PutSnowpipeStreaming

Streams records into a Snowflake table.

PutSNS

Sends the content of a FlowFile as a notification to the Amazon Simple Notification Service

PutSplunk

Sends logs to Splunk Enterprise over TCP, TCP + TLS/SSL, or UDP.

PutSplunkHTTP

Sends flow file content to the specified Splunk server over HTTP or HTTPS.

PutSQL

Executes a SQL UPDATE or INSERT command.

PutSQS

Publishes a message to an Amazon Simple Queuing Service Queue

PutSyslog

Sends Syslog messages to a given host and port over TCP or UDP.

PutTCP

Sends serialized FlowFiles or Records over TCP to a configurable destination with optional support for TLS

PutUDP

The PutUDP processor receives a FlowFile and packages the FlowFile content into a single UDP datagram packet which is then transmitted to the configured UDP server.

Snowflake

PutUnityCatalogFile

Write FlowFile content with max size of 5 GiB to Unity Catalog.

Snowflake

PutVectaraDocument

Generate and upload a JSON document to Vectara’s upload endpoint.

Snowflake

PutVectaraFile

Uploads the content of a FlowFile to Vectara’s index endpoint.

Snowflake

PutVespaDocument

Uses the Vespa Document API to update a record in a specific namespace.

PutWebSocket

Sends messages to a WebSocket remote endpoint using a WebSocket session that is established by either ListenWebSocket or ConnectWebSocket.

PutZendeskTicket

Creates Zendesk tickets using the Zendesk API.

Q

Processor

Description

QueryAzureDataExplorer

Queries Azure Data Explorer and streams JSON results to output FlowFiles.

QueryDatabaseTable

Generates a SQL select query, or uses a provided statement, and executes it to fetch all rows whose values in the specified Maximum Value column(s) are larger than the previously-seen maxima.

QueryDatabaseTableRecord

Generates a SQL select query, or uses a provided statement, and executes it to fetch all rows whose values in the specified Maximum Value column(s) are larger than the previously-seen maxima.

Snowflake

QueryDocument

Evaluates a SQL-like query against the incoming Openflow Document JSON, producing the results on the outgoing FlowFile.

Snowflake

QueryMilvus

Queries a given collection in a Milvus database using vectors.

Snowflake

QueryPinecone

Queries Pinecone for vectors that are similar to the input vector, or retrieves a vector by ID.

QueryRecord

Evaluates one or more SQL queries against the contents of a FlowFile.

QuerySalesforceObject

Retrieves records from a Salesforce sObject.

QuerySplunkIndexingStatus

Queries the Splunk server to acquire the status of indexing acknowledgement.

R

Processor

Description

RemoveRecordField

Modifies the contents of a FlowFile that contains Record-oriented data (i.

RenameRecordField

Renames one or more fields in each Record of a FlowFile.

ReplaceText

Updates the content of a FlowFile by searching for some textual value in the FlowFile content (via Regular Expression/regex, or literal value) and replacing the section of the content that matches with some alternate value.

ReplaceTextWithMapping

Updates the content of a FlowFile by evaluating a Regular Expression against it and replacing the section of the content that matches the Regular Expression with some alternate value provided in a mapping file.

RetryFlowFile

FlowFiles passed to this Processor have a ‘Retry Attribute’ value checked against a configured ‘Maximum Retries’ value.

RouteOnAttribute

Routes FlowFiles based on their Attributes using the Attribute Expression Language.

RouteOnContent

Applies Regular Expressions to the content of a FlowFile and routes a copy of the FlowFile to each destination whose Regular Expression matches.

RouteText

Routes textual data based on a set of user-defined rules.

Snowflake

RunDatabricksJob

Triggers a pre-defined Databricks job to run with custom parameters.

RunMongoAggregation

Runs an aggregation query whenever a FlowFile is received.

S

Processor

Description

SampleRecord

Samples the records of a FlowFile based on a specified sampling strategy (such as Reservoir Sampling).

ScanAttribute

Scans the specified attributes of FlowFiles, checking whether any of their values are present in the specified dictionary of terms.

ScanContent

Scans the content of FlowFiles for terms that are found in a user-supplied dictionary.

ScriptedFilterRecord

This processor provides the ability to filter records out from FlowFiles using the user-provided script.

ScriptedPartitionRecord

Receives Record-oriented data (i.

ScriptedTransformRecord

Provides the ability to evaluate a simple script against each record in an incoming FlowFile.

ScriptedValidateRecord

This processor provides the ability to validate records in FlowFiles using the user-provided script.

SearchElasticsearch

A processor that allows the user to repeatedly run a paginated query (with aggregations) written with the Elasticsearch JSON DSL.

SegmentContent

Segments a FlowFile into multiple smaller segments on byte boundaries.

SignContentPGP

Signs content using OpenPGP private keys.

Snowflake

SnowflakeDetectDuplicate

Checks if a FlowFile’s hash (provided as a FlowFile attribute) is already in a Snowflake table, and routes the FlowFile to ‘duplicate’ if found, ‘distinct’ if not found, or ‘failure’ on errors.

SplitAvro

Splits a binary encoded Avro datafile into smaller files based on the configured Output Size.

SplitContent

Splits incoming FlowFiles by a specified byte sequence.

SplitExcel

Splits a multi-sheet Microsoft Excel spreadsheet into multiple Microsoft Excel spreadsheets, where each sheet from the original file is converted to an individual spreadsheet in its own FlowFile.

SplitJson

Splits a JSON File into multiple, separate FlowFiles for an array element specified by a JsonPath expression.

SplitRecord

Splits an input FlowFile that is in a record-oriented data format into multiple smaller FlowFiles.

SplitText

Splits a text file into multiple smaller text files on line boundaries limited by maximum number of lines or total size of fragment.

SplitXml

Splits an XML File into multiple separate FlowFiles, each comprising a child or descendant of the original root element.

StartAwsPollyJob

Triggers an AWS Polly job.

StartAwsTextractJob

Triggers an AWS Textract job.

StartAwsTranscribeJob

Triggers an AWS Transcribe job.

StartAwsTranslateJob

Triggers an AWS Translate job.

StartGcpVisionAnnotateFilesOperation

Triggers a Vision operation on file input.

StartGcpVisionAnnotateImagesOperation

Triggers a Vision operation on image input.

Snowflake

SummarizeText

This processor uses a Large Language Model (LLM) to summarize the content of a FlowFile.

T

Processor

Description

TagS3Object

Adds or updates a tag on an Amazon S3 Object.

TailFile

“Tails” a file, or a list of files, ingesting data from the file as it is written to the file.

TransformXml

Applies the provided XSLT file to the FlowFile XML payload.

U

Processor

Description

UnpackContent

Unpacks the content of FlowFiles that have been packaged with one of several different Packaging Formats, emitting one to many FlowFiles for each input FlowFile.

UpdateAttribute

Updates the Attributes of a FlowFile by using the Attribute Expression Language, and/or deletes attributes based on a regular expression.

UpdateByQueryElasticsearch

Updates documents in an Elasticsearch index using a query.

UpdateCounter

This processor allows users to set specific counters and key points in their flow.

UpdateDatabaseTable

This processor uses a JDBC connection and incoming records to generate any database table changes needed to support the incoming records.

UpdateRecord

Updates the contents of a FlowFile that contains Record-oriented data (i.

Snowflake

UpdateSnowflakeDatabase

Updates the definition of a Snowflake table based on the schema provided in the incoming FlowFile.

Snowflake

UpdateTableState

Updates the state of a table in the Table State Service.

Snowflake

UpsertMilvus

Upserts vectors into a Milvus database for a given collection.

Snowflake

UpsertPinecone

Publishes vectors, including metadata, and optionally text, to a Pinecone index.

V

Processor

Description

ValidateCsv

Validates the contents of FlowFiles or a FlowFile attribute value against a user-specified CSV schema.

ValidateJson

Validates the contents of FlowFiles against a configurable JSON Schema.

ValidateRecord

Validates the Records of an incoming FlowFile against a given schema.

ValidateXml

Validates XML contained in a FlowFile.

VerifyContentMAC

Calculates a Message Authentication Code using the provided Secret Key and compares it with the provided MAC property.

VerifyContentPGP

Verifies signatures using OpenPGP public keys.

W

Processor

Description

Wait

Routes incoming FlowFiles to the ‘wait’ relationship until a matching release signal is stored in the distributed cache from a corresponding Notify processor.

Snowflake

WaitForTableState

Blocks incoming FlowFiles until the corresponding table state is not equal to the accepted state.