Search optimization cost estimation and management¶
The search optimization service impacts costs for both storage and compute resources:
Storage resources: The search optimization service creates a search access path data structure that requires space for each table on which search optimization is enabled. The storage cost of the search access path depends upon multiple factors, including:
The number of distinct values in the table. In the extreme case where all columns have data types that use the search access path, and all data values in each column are unique, the required storage can be as much as the original table’s size.
Typically, however, the size is approximately 1/4 of the original table’s size.
Compute resources:
Adding search optimization to a table consumes resources during the initial build phase.
Maintaining the search optimization service also requires resources. Resource consumption is higher when there is high churn (i.e. when large volumes of data in the table change). These costs are roughly proportional to the amount of data ingested (added or changed). Deletes also have some cost.
Automatic clustering, while improving the latency of queries in tables with search optimization, can further increase the maintenance costs of search optimization. If a table has a high churn rate, enabling automatic clustering and configuring search optimization for the table can result in higher maintenance costs than if the table is just configured for search optimization.
Snowflake ensures efficient credit usage by billing your account only for the actual resources used. Billing is calculated in 1-second increments.
See the “Serverless Feature Credit Table” in the Snowflake Service Consumption Table for the costs per compute hour.
Once you enable the search optimization service, you can view the costs for your use of the service.
Tip
Snowflake recommends starting slowly with this feature (i.e. adding search optimization to only a few tables at first) and closely monitoring the costs and benefits.
Estimating the costs of search optimization¶
To estimate the cost of adding search optimization to a table and configuring specific columns for search optimization, use the SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS function.
In general, the costs are proportional to:
The number of columns on which the feature is enabled and the number of distinct values in those columns.
The amount of data that changes in these tables.
Important
Cost estimates returned by the SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS function are best efforts. The actual realized costs can vary by up to 50% (or, in rare cases, by several times) from the estimated costs.
Build and storage cost estimates are based on sampling a subset of the rows in the table
Maintenance cost estimates are based on recent create, delete, and update activity in the table
Viewing the costs of search optimization¶
You can view the actual billed costs for the search optimization service by using either the web interface or SQL. See Exploring compute cost.
Reducing the costs of search optimization¶
You can control the cost of the search optimization service by carefully choosing the tables and columns for which to enable search optimization.
In addition, to reduce the cost of the search optimization service:
Snowflake recommends batching DML operations on the table:
DELETE
: If tables store data for the most recent time period (e.g. the most recent day or week or month), then when you trim your table by deleting old data, the search optimization service must take into account the updates. In some cases, you might be able to reduce costs by deleting less frequently (e.g. daily rather than hourly).INSERT
,UPDATE
, andMERGE
: Batching these types of DML statements on the table can reduce the cost of maintenance by the search optimization service.
If you recluster the entire table, consider dropping the SEARCH OPTIMIZATION property for that table before reclustering, and then add the SEARCH OPTIMIZATION property back to the table after reclustering.
Before enabling search optimization for substring searches (
ON SUBSTRING(col)
) or VARIANTs (ON EQUALITY(variant_col)
), call SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS to estimate the costs. The initial build and maintenance for these search methods can be computationally intensive, so you should assess the trade-off between performance and cost.