Cortex AI Guardrails¶
Overview¶
Cortex AI Guardrails, part of the Snowflake Horizon Catalog, provide run-time protection against prompt injection and jailbreak attacks on Cortex Code.
As enterprises move AI applications from pilot to production, they face increased risk from adversarial prompts that can threaten data integrity and security. Cortex AI Guardrails extend Snowflake’s default protections against known prompt injection techniques with an additional detection layer that identifies and mitigates adversarial threats.
Integrated centrally into Snowflake Horizon Catalog, Cortex AI Guardrails leverage contextual reasoning to detect and neutralize malicious intent, preventing adversarial threats from circumventing established security boundaries and hardened permissions.
Key capabilities¶
Cortex AI Guardrails provide the following protections:
Prompt injection detection: Identifies and blocks attempts to override system instructions through malicious prompts, including indirect prompt injections embedded in tool calls.
Jailbreak prevention: Detects attempts to bypass the model’s safety protocols and security boundaries.
Zero-day-style protection: Uses advanced techniques to identify sophisticated, previously unknown attack patterns in real time.
Configure Cortex AI Guardrails¶
You can configure Cortex AI Guardrails at the account level using the AI_SETTINGS parameter. This
provides centralized control over guardrail behavior for Cortex Code in your account. Users with
the ACCOUNTADMIN role can configure Cortex AI Guardrails.
Enable guardrails¶
To enable Cortex AI Guardrails for your account, use the ALTER ACCOUNT command with the AI_SETTINGS
parameter:
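For example, the statement might look like the following sketch. The JSON structure passed to AI_SETTINGS, including the key names shown here, is an illustrative assumption; consult the AI_SETTINGS parameter reference for the exact schema supported by your account.

```sql
-- Illustrative sketch: enable guardrails account-wide via AI_SETTINGS.
-- The JSON keys below are assumed, not confirmed; verify against the
-- AI_SETTINGS parameter reference before running.
ALTER ACCOUNT SET AI_SETTINGS = '{"guardrails": {"enabled": true}}';
```

You must use a role with the ACCOUNTADMIN role (or a role granted the equivalent privilege) to run ALTER ACCOUNT.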
View guardrail settings¶
To view the current guardrail configuration for your account:
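Account-level parameters can be inspected with SHOW PARAMETERS, for example:

```sql
-- Show the current value of the AI_SETTINGS parameter for the account.
SHOW PARAMETERS LIKE 'AI_SETTINGS' IN ACCOUNT;
```

The output includes the parameter's current value, its default, and the level at which it was set.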
Disable guardrails¶
To disable Cortex AI Guardrails:
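A minimal sketch follows; as with enabling, the JSON keys inside AI_SETTINGS are assumed here and should be checked against the AI_SETTINGS parameter reference.

```sql
-- Illustrative sketch: disable guardrails by updating AI_SETTINGS.
-- The JSON keys below are assumed, not confirmed.
ALTER ACCOUNT SET AI_SETTINGS = '{"guardrails": {"enabled": false}}';
```

Alternatively, ALTER ACCOUNT UNSET AI_SETTINGS resets the parameter to its default; note that unsetting restores the default behavior rather than explicitly disabling guardrails.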
Monitor guardrail activity¶
When Cortex AI Guardrails detect a potential threat, the event is logged for audit and monitoring purposes. For Cortex Code, you can review detected threats in the conversation logs. For more information about managing conversation history, see conversation history.
Use these logs to:
Monitor for attempted attacks against your AI workloads
Identify patterns in blocked or flagged requests
Audit guardrail effectiveness
Considerations¶
While Cortex AI Guardrails are optimized for high accuracy, some legitimate prompts may occasionally be flagged. Review your guardrail logs periodically to identify any patterns.
Cortex AI Guardrails for prompt injection are currently available with Cortex Code.
Cost¶
You are charged credits for the use of Cortex AI Guardrails as listed in the Snowflake Service Consumption Table. Usage is measured based on the number of tokens scanned.