Batch Cortex Search¶
The Batch Cortex Search function is a table function that lets you submit a batch of queries to a Cortex Search Service. It is intended for offline use cases with high throughput requirements, such as entity resolution, deduplication, or clustering tasks.
Jobs submitted to a Cortex Search Service with the CORTEX_SEARCH_BATCH function leverage additional compute resources to provide significantly higher throughput (queries per second) than the interactive query interfaces (Python API, REST API, or SEARCH_PREVIEW).
Syntax¶
Use the following syntax to query a Cortex Search Service in batch mode using the CORTEX_SEARCH_BATCH table function:
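The sketch below shows one plausible call shape, assuming the function is invoked through TABLE(...) in a lateral join over an input table of queries, with named arguments matching the parameters described in the next section; the table, column, and service names are placeholders.

```sql
-- Sketch only: argument names follow the parameter list below;
-- my_queries, query_text, and the service name are placeholders.
SELECT
    q.query_text,
    result.*
FROM my_queries AS q,
    TABLE(
        CORTEX_SEARCH_BATCH(
            SERVICE_NAME => 'my_db.my_schema.my_search_service',
            QUERY        => q.query_text,
            LIMIT        => 10
        )
    ) AS result;
```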
Parameters¶
The CORTEX_SEARCH_BATCH function supports the following parameters:
service_name (string, required)
    Fully-qualified name of the Cortex Search Service to query.

query (string, optional)
    Column containing the query string used to search the service.

multi_index_query (variant, optional)
    An object that specifies one or more vector or keyword query inputs to search against the service index. See multi_index_query for details on how to construct this parameter.

    Note
    For performance reasons, multi_index_query currently supports at most one vector index entry in the query array.

filter (variant, optional)
    Column containing filter objects to apply to the search results.

limit (integer, optional)
    Maximum number of results to return per query. Default: 10.

options (variant, optional)
    Column containing additional search options and configurations.
Note
At least one of query, multi_index_query, or filter must be specified.
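Because query, multi_index_query, and filter are each optional, a batch job can be driven by a filter alone. The following sketch illustrates a filter-only call; the filter object built with OBJECT_CONSTRUCT assumes the Cortex Search filter syntax, and all table, column, and service names are placeholders.

```sql
-- Sketch: filter-only batch call (no query text), assuming the
-- Cortex Search filter object syntax; names are placeholders.
SELECT
    f.region,
    result.*
FROM my_filters AS f,
    TABLE(
        CORTEX_SEARCH_BATCH(
            SERVICE_NAME => 'my_db.my_schema.my_search_service',
            FILTER       => OBJECT_CONSTRUCT('@eq', OBJECT_CONSTRUCT('region', f.region)),
            LIMIT        => 20
        )
    ) AS result;
```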
Usage notes¶
The throughput of the batch search function might vary depending on the amount of data indexed in the queried Cortex Search Service and the complexity of the search queries. Run the function on a small number of queries to measure the throughput for your specific workload. In general, queries to larger services with more filter conditions see lower throughput.
The throughput of the batch search function (the number of search queries processed per second) is not influenced by the size of the warehouse used to run the query.
Because batch search spins up dedicated resources to serve each job, it incurs additional startup latency. If you need to run fewer than 2,000 queries, you’ll typically get faster results using the interactive Cortex Search API (Python or REST API) rather than batch search.
Unlike the interactive Cortex Search API, the batch search function can query services whose serving is currently suspended.
A single Cortex Search Service can be queried in interactive and batch mode concurrently without any degradation to interactive query performance or throughput. Separate compute resources are used to serve interactive and batch queries.
Concurrent batch queries on a given service have no limit.
Cost considerations¶
Batch search has three cost components:
- Serving cost
A charge based on the size of the search index data and the duration of the batch search job, excluding the startup time.
- Query embedding cost
A charge for the number of tokens embedded as a result of the input queries. Unlike interactive Cortex Search, query embedding is not free for batch search.
- Virtual warehouse cost
A charge for the virtual warehouse compute used to run the batch job.
For usage tracking, see the CORTEX_SEARCH_BATCH_QUERY_USAGE_HISTORY Account Usage view. For more information on Cortex Search costs, see Cost considerations.
Regional availability¶
Batch search is available in all regions where Cortex Search is available. See Regional availability for a full list of supported regions.
Example usage¶
The following example matches products in a user-submitted order form against a “golden” product catalog.
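A minimal sketch of this scenario is shown below, assuming an input table my_db.my_schema.order_form with order_id and product_name columns; all table, column, and service names are placeholders.

```sql
-- Sketch: match each order-form line against the golden product catalog,
-- returning the top 3 candidate matches per input row.
SELECT
    o.order_id,
    o.product_name,
    result.*
FROM my_db.my_schema.order_form AS o,
    TABLE(
        CORTEX_SEARCH_BATCH(
            SERVICE_NAME => 'my_db.my_schema.golden_product_service',
            QUERY        => o.product_name,
            LIMIT        => 3
        )
    ) AS result;
```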
The following example uses multi_index_query to submit precomputed embeddings as the query input instead of raw text. Here, the source table my_db.my_schema.product_embeddings contains a column embedding with precomputed vectors, and the Cortex Search Service my_db.my_schema.golden_product_service was created with a bring-your-own-vector (BYOV) configuration. For details on constructing multi_index_query, see multi_index_query.
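A sketch of this variant follows. The structure of the multi_index_query object (the 'queries', 'index', and 'vector' keys and the index name) is assumed here for illustration only; see the multi_index_query reference for the actual format.

```sql
-- Sketch: query the BYOV service with precomputed embeddings.
-- The multi_index_query object structure below is illustrative only.
SELECT
    p.product_id,
    result.*
FROM my_db.my_schema.product_embeddings AS p,
    TABLE(
        CORTEX_SEARCH_BATCH(
            SERVICE_NAME      => 'my_db.my_schema.golden_product_service',
            MULTI_INDEX_QUERY => OBJECT_CONSTRUCT(
                'queries', ARRAY_CONSTRUCT(
                    OBJECT_CONSTRUCT(
                        'index',  'embedding',  -- name of the BYOV vector index (assumed)
                        'vector', p.embedding   -- precomputed query vector, assumed to be an ARRAY of floats
                    )
                )
            ),
            LIMIT             => 3
        )
    ) AS result;
```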