snowflake.snowpark.DataFrameAIFunctions.embed
- DataFrameAIFunctions.embed(input_column: Union[snowflake.snowpark.column.Column, str], model: str, *, output_column: Optional[str] = None) → snowflake.snowpark.DataFrame
Generate embedding vectors from text or images.
This method creates dense vector representations (embeddings) of text or images, which can be used for similarity search, clustering, or as features for machine learning.
- Parameters:
input_column – The column (Column object or column name as string) containing the text or images (FILE data type) to embed.
model –
The embedding model to use. Supported models:
- For text embeddings:
snowflake-arctic-embed-l-v2.0: Arctic large model (default for text)
snowflake-arctic-embed-l-v2.0-8k: Arctic large model with 8K context
nv-embed-qa-4: NVIDIA embedding model for Q&A
multilingual-e5-large: Multilingual embedding model
voyage-multilingual-2: Voyage multilingual model
- For image embeddings:
voyage-multimodal-3: Voyage multimodal model (only for images)
output_column – The name of the output column to be appended. If not provided, a column named AI_EMBED_OUTPUT is appended.
- Returns:
A new DataFrame with an appended output column containing VECTOR embeddings.
Examples:
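A minimal sketch of a text-embedding call. It assumes the method is reached through the `ai` accessor on a DataFrame (`df.ai.embed`) and that `session` is an already-open Snowpark session. The table `REVIEWS`, the column `REVIEW_TEXT`, and the helper name `embed_reviews` are hypothetical and chosen only for illustration.

```python
def embed_reviews(session, table_name="REVIEWS"):
    """Append an embedding column to a (hypothetical) table of text reviews."""
    # `session` is assumed to be an existing snowflake.snowpark.Session.
    df = session.table(table_name)
    # Assumption: DataFrameAIFunctions is exposed as the `ai` accessor on DataFrame.
    return df.ai.embed(
        input_column="REVIEW_TEXT",             # hypothetical text column
        model="snowflake-arctic-embed-l-v2.0",  # one of the text models listed above
        output_column="REVIEW_EMBEDDING",       # hypothetical output column name
    )
```

If `output_column` were omitted, the embeddings would land in a column named `AI_EMBED_OUTPUT`, as described under Parameters.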
Note
Embeddings can be used with vector similarity functions to find similar items.
Different models produce embeddings of different dimensions.
For best results, use the same model for all items you want to compare.
This function or method is experimental since 1.39.0.
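The points in the note about similarity and dimensions can be illustrated locally. The helper below is plain Python and not part of Snowpark. It assumes the embeddings have already been fetched as lists of floats, and it rejects vectors of different lengths, which is what would happen when mixing models. Inside Snowflake, a built-in vector function such as `VECTOR_COSINE_SIMILARITY` would normally do this work instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors given as sequences of floats."""
    if len(a) != len(b):
        # Vectors from different models usually differ in dimension
        # and cannot be compared directly.
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Values near 1 indicate vectors pointing in nearly the same direction, and values near 0 indicate unrelated directions.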