snowflake.snowpark.DataFrameAIFunctions.count_tokens¶
- DataFrameAIFunctions.count_tokens(model: str, prompt: Union[snowflake.snowpark.column.Column, str], *, output_column: Optional[str] = None) → snowflake.snowpark.DataFrame [source]¶
Count the number of tokens in text for a specified language model.
This method returns the number of tokens that would be consumed by the specified model when processing the input text. This is useful for estimating costs and ensuring inputs fit within model token limits.
- Parameters:
model – The model to base the token count on. Required. Supported models include:
deepseek-r1, e5-base-v2, e5-large-v2, gemma-7b, jamba-1.5-large, jamba-1.5-mini, jamba-instruct,
llama2-70b-chat, llama3-70b, llama3-8b, llama3.1-405b, llama3.1-70b, llama3.1-8b, llama3.2-1b,
llama3.2-3b, llama3.3-70b, llama4-maverick, llama4-scout, mistral-7b, mistral-large, mistral-large2,
mixtral-8x7b, nv-embed-qa-4, reka-core, reka-flash, snowflake-arctic-embed-l-v2.0,
snowflake-arctic-embed-m-v1.5, snowflake-arctic-embed-m, snowflake-arctic, snowflake-llama-3.1-405b,
snowflake-llama-3.3-70b, voyage-multilingual-2
prompt – The column (Column object or column name as string) containing the text to count tokens for.
output_column – The name of the output column to be appended. If not provided, a column named COUNT_TOKENS_OUTPUT is appended.
- Returns:
A new DataFrame with an appended output column containing the token count as an integer.
Examples:
>>> # Count tokens for a simple text
>>> df = session.create_dataframe([
...     ["What is a large language model?"],
...     ["Explain quantum computing in simple terms."],
... ], schema=["text"])
>>> result_df = df.ai.count_tokens(
...     model="llama3.1-70b",
...     prompt="text",
...     output_column="token_count"
... )
>>> result_df.show()
---------------------------------------------------------------
|"TEXT"                                       |"TOKEN_COUNT"  |
---------------------------------------------------------------
|What is a large language model?              |8              |
|Explain quantum computing in simple terms.   |9              |
---------------------------------------------------------------
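A further sketch, assuming the same session and df as above: omitting output_column appends the default COUNT_TOKENS_OUTPUT column, which can then be combined with standard Snowpark filtering to keep only rows that fit within a token budget. The 8192-token budget below is an illustrative assumption, not a documented model limit.
>>> from snowflake.snowpark.functions import col  # standard Snowpark column helper
>>> counted_df = df.ai.count_tokens(model="llama3.1-70b", prompt="text")
>>> # No output_column was given, so the count is appended as COUNT_TOKENS_OUTPUT.
>>> within_budget_df = counted_df.filter(col("COUNT_TOKENS_OUTPUT") <= 8192)  # assumed budget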
Note
The token count does not account for any managed system prompt that may be automatically added when using other Cortex AI functions. The actual token usage may be higher when using those functions.
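One hedged way to account for that overhead, continuing the sketch above, is to reserve headroom below the model's context window before passing rows to another Cortex AI function. Both numbers here are assumptions for illustration only and should be replaced with values from the chosen model's documentation.
>>> CONTEXT_LIMIT = 128000  # assumed context window for the chosen model
>>> HEADROOM = 512          # assumed allowance for a managed system prompt
>>> fits_df = counted_df.filter(col("COUNT_TOKENS_OUTPUT") <= CONTEXT_LIMIT - HEADROOM)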
This function or method is experimental since 1.39.0.