snowflake.snowpark.functions.ai_embed
- snowflake.snowpark.functions.ai_embed(model: str, input: Union[Column, str]) → Column
Creates an embedding vector from text or an image.
- Parameters:
model –
A string specifying the vector embedding model to be used. Supported models:
- For text embeddings:
'snowflake-arctic-embed-l-v2.0': Arctic large model (default for text)
'snowflake-arctic-embed-l-v2.0-8k': Arctic large model with 8K context
'nv-embed-qa-4': NVIDIA embedding model for Q&A
'multilingual-e5-large': Multilingual embedding model
'voyage-multilingual-2': Voyage multilingual model
- For image embeddings:
'voyage-multimodal-3': Voyage multimodal model (only for images)
input – The text or image to generate an embedding from: either a string containing text or a FILE column containing an image.
- Returns:
A VECTOR containing the embedding representation of the input.
Examples:
>>> from snowflake.snowpark.functions import ai_embed, col, to_file
>>> # Text embedding
>>> df = session.range(1).select(
...     ai_embed('snowflake-arctic-embed-l-v2.0', 'Hello, world!').alias("embedding")
... )
>>> result = df.collect()[0][0]
>>> len(result) > 0
True

>>> # Text embedding with multilingual model
>>> df = session.create_dataframe([
...     ['Hello world'],
...     ['Bonjour le monde'],
...     ['Hola mundo']
... ], schema=["text"])
>>> df = df.select(
...     col("text"),
...     ai_embed('multilingual-e5-large', col("text")).alias("embedding")
... )
>>> embeddings = df.collect()
>>> all(len(row[1]) > 0 for row in embeddings)
True

>>> # Image embedding
>>> _ = session.sql("CREATE OR REPLACE TEMP STAGE mystage ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE')").collect()
>>> _ = session.file.put("tests/resources/dog.jpg", "@mystage", auto_compress=False)
>>> df = session.range(1).select(
...     ai_embed('voyage-multimodal-3', to_file('@mystage/dog.jpg')).alias("image_embedding")
... )
>>> result = df.collect()[0][0]
>>> len(result) > 0
True
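
The examples above only check that an embedding is non-empty. A common next step is comparing embeddings, for instance by cosine similarity. The following is a minimal client-side sketch, assuming each collected embedding arrives as a plain Python sequence of floats. The helper `cosine_similarity` and the three-dimensional sample vectors are hypothetical and not part of Snowpark; the vectors stand in for values returned by `df.collect()`, which would have many more dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical stand-ins for (text, embedding) rows returned by df.collect().
# These values are made up for illustration, not real model output.
rows = [
    ("Hello world", [1.0, 0.0, 1.0]),
    ("Bonjour le monde", [1.0, 0.0, 1.0]),
    ("Hola mundo", [0.0, 1.0, 0.0]),
]

# Rank the remaining rows by similarity to the first row.
query_text, query_vec = rows[0]
ranked = sorted(
    ((text, cosine_similarity(query_vec, vec)) for text, vec in rows[1:]),
    key=lambda pair: pair[1],
    reverse=True,
)
```

Here `ranked` lists the other texts from most to least similar to the query. For large tables it is likely preferable to compute similarity inside Snowflake rather than collecting every vector to the client.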