Snowpipe Streaming Best Practices¶
Cost Optimization¶
As a best practice, we recommend calling the API with fewer Snowpipe Streaming clients that write more data per second. Aggregate data from multiple sources such as IoT devices or sensors using a Java or Scala application, then call the API to load data using the Snowflake Ingest SDK at higher flow rates. The API efficiently aggregates data across multiple target tables in an account.
A single Snowpipe Streaming client can open multiple channels to send data, but the client cost is only charged per active client. The number of channels does not affect the client cost. Therefore, we recommend using multiple channels per client for performance and cost optimization.
Using the same tables for both batch and streaming ingestion can also result in reduced Snowpipe Streaming compute costs due to pre-empted file migration operations. If Automatic Clustering is also enabled on the same table that Snowpipe Streaming is inserting into, compute costs for file migration may be reduced. The clustering operation will optimize and migrate data in the same transaction.
Performance Recommendations¶
For optimal performance in high-throughput deployments, we recommend the following actions:
Pass values for the TIME, DATE and all TIMESTAMP columns as one of the supported types from the
java.time
package.When creating a channel using
OpenChannelRequest.builder
, set theOnErrorOption
toOnErrorOption.CONTINUE
and manually check the return value frominsertRows
for potential ingestion errors. This approach currently leads to a better performance than relying on exceptions thrown whenOnErrorOption.ABORT
is used.Keep the size of each row batch passed to
insertRows
below 16MB.If you are loading multiple rows, using
insertRows
will be more performant and cost effective than callinginsertRow
multiple times as there is less time spent on locks.When setting the default log level to DEBUG, make sure that the following loggers keep logging on INFO. That’s because their DEBUG output is very verbose, which can lead to a significant performance degradation.
net.snowflake.ingest.internal.apache.parquet
org.apache.parquet