CREATE CORTEX SEARCH SERVICE¶
Creates a new Cortex Search service or replaces an existing one.
Syntax¶
CREATE [ OR REPLACE ] CORTEX SEARCH SERVICE [ IF NOT EXISTS ] <name>
ON <search_column>
ATTRIBUTES <col_name> [ , ... ]
WAREHOUSE = <warehouse_name>
TARGET_LAG = '<num> { seconds | minutes | hours | days }'
[ EMBEDDING_MODEL = <embedding_model_name> ]
[ INITIALIZE = { ON_CREATE | ON_SCHEDULE } ]
[ COMMENT = '<comment>' ]
AS <query>;
Required parameters¶
name
String that specifies the identifier (i.e. name) for the Cortex Search service; must be unique for the schema in which the service is created.
In addition, the identifier must start with an alphabetic character and cannot contain spaces or special characters unless the entire identifier string is enclosed in double quotes (for example,
"My object"
). Identifiers enclosed in double quotes are also case-sensitive.For more information, see Identifier requirements.
ON search_column
Specifies the text column in the base table that you wish to search on. This column must be a text value.
ATTRIBUTES col_name [ , ... ]
Specifies comma-separated list of columns in the base table that you wish to filter on when issuing queries to the service. Attribute columns must be included in the source query, either via explicit enumeration or wildcard, (
*
).WAREHOUSE = warehouse_name
Specifies the warehouse to use for running the source query, building the search index, and keeping it refreshed per the TARGET_LAG target.
TARGET_LAG = 'num { seconds | minutes | hours | days }'
Specifies the maximum amount of time that the Cortex Search service content should lag behind updates to the base tables specified in the source query.
Optional parameters¶
EMBEDDING_MODEL = <embedding_model_name>
Optional parameter that specifies the embedding model to use in the Cortex Search Service. This property cannot be altered after you create the Cortex Search Service. To modify the property, recreate the Cortex Search Service with a CREATE OR REPLACE CORTEX SEARCH SERVICE command.
Some embedding models are only available in certain cloud regions for Cortex Search. For an availability list by model by region, see Cortex Search Regional Availability.
Each model may incur a different cost per million input tokens processed. Refer to the Snowflake Service Consumption Table for each function’s cost in credits per million tokens.
If the
EMBEDDING_MODEL
is not specified, the default model is used. The default model issnowflake-arctic-embed-m-v1.5
.INITIALIZE
Specifies the behavior of the initial refresh of the Cortex Search Service. This property cannot be altered after you create the service. To modify the property, replace the cortex search service with a CREATE OR REPLACE CORTEX SEARCH SERVICE command.
ON_CREATE
Refreshes the Cortex Search Service synchronously at creation. If this refresh fails, service creation fails and displays an error message.
ON_SCHEDULE
Refreshes the Cortex Search Service at the next scheduled refresh.
The Cortex Search Service is populated when the refresh schedule process runs. No data is populated when the Cortex Search Service is created. If you try to query the service, you might see the following error because the first scheduled refresh has not yet occurred.
Your service has not yet been loaded into our serving system. Please retry your request in a few minutes.
Default:
ON_CREATE
COMMENT = 'comment'
Specifies a comment for the service.
AS query
Specifies a query defining the base table from which the service is created.
Access Control Requirements¶
A role used to execute this operation must have the following privileges at a minimum:
Privilege |
Object |
---|---|
CREATE CORTEX SEARCH SERVICE |
Schema in which you plan to create the search service. |
SELECT |
Tables, and views that you plan to query for the new service. |
USAGE |
Warehouse that you plan to use to refresh the service. |
The USAGE privilege on the parent database and schema are required to perform operations on any object in a schema.
For instructions on creating a custom role with a specified set of privileges, see Creating custom roles.
For general information about roles and privilege grants for performing SQL actions on securable objects, see Overview of Access Control.
Attention
To create a Cortex Search Service, your role must have the same privileges required to use Cortex LLM Functions. See Cortex LLM Functions Required Privileges.
Usage Notes¶
Attention
Customers should ensure that no personal data (other than for a User object), sensitive data, export-controlled data, or other regulated data is entered as metadata when using the Snowflake service. For more information, see Metadata fields in Snowflake.
The size of the Warehouse used to run the Cortex Search service source query does impact the speed and cost of each refresh. A larger warehouse decreases build and refresh time. However, during this preview, Snowflake recommends using a warehouse size no larger than MEDIUM for Cortex Search services.
Snowflake recommends using a dedicated warehouse for each Cortex Search service so as to not interfere with other workloads.
The search index is built as part of the create statement, which means the CREATE CORTEX SEARCH SERVICE statement may take longer to complete for larger datasets.
The
OR REPLACE
andIF NOT EXISTS
clauses are mutually exclusive. They can’t both be used in the same statement.CREATE OR REPLACE <object> statements are atomic. That is, when an object is replaced, the old object is deleted and the new object is created in a single transaction.
Example¶
Create a Cortex Search service named mysvc
using the snowflake-arctic-embed-l-v2.0
embedding model:
CREATE OR REPLACE CORTEX SEARCH SERVICE mysvc
ON transcript_text
ATTRIBUTES region,agent_id
WAREHOUSE = mywh
TARGET_LAG = '1 hour'
EMBEDDING_MODEL = 'snowflake-arctic-embed-l-v2.0'
AS (
SELECT
transcript_text,
date,
region,
agent_id
FROM support_db.public.transcripts_etl
);
Create a Cortex Search service named mysvc
, with the first refresh
scheduled to run after one TARGET_LAG
period (1 hour) has passed.
CREATE OR REPLACE CORTEX SEARCH SERVICE mysvc
ON transcript_text
ATTRIBUTES region
WAREHOUSE = mywh
TARGET_LAG = '1 hour'
INITIALIZE = ON_SCHEDULE
AS SELECT * FROM support_db.public.transcripts_etl;