Cortex AI Guardrails

Overview

Cortex AI Guardrails, part of the Snowflake Horizon Catalog, provide run-time protection against prompt injection and jailbreak attacks on Cortex Code.

As enterprises move AI applications from pilot to production, they face increased risk from adversarial prompts that can threaten data integrity and security. Cortex AI Guardrails extend Snowflake’s default protections against known prompt injection techniques by adding guardrails to detect and mitigate adversarial threats.

Integrated centrally into the Snowflake Horizon Catalog, Cortex AI Guardrails use contextual reasoning to detect and neutralize malicious intent, preventing adversarial threats from circumventing established security boundaries and hardened permissions.

Key capabilities

Cortex AI Guardrails provide the following protections:

  • Prompt injection detection: Identifies and blocks attempts to override system instructions through malicious prompts, including indirect prompt injections embedded in tool calls.

  • Jailbreak prevention: Detects attempts to bypass the model’s safety protocols and security boundaries.

  • Zero-day-style protection: Uses advanced detection techniques to identify sophisticated, previously unknown attack patterns in real time.

Configure Cortex AI Guardrails

You can configure Cortex AI Guardrails at the account level using the AI_SETTINGS parameter, which provides centralized control over guardrail behavior for Cortex Code across your account. Configuring Cortex AI Guardrails requires the ACCOUNTADMIN role.

Enable guardrails

To enable Cortex AI Guardrails for your account, use the ALTER ACCOUNT command with the AI_SETTINGS parameter:

ALTER ACCOUNT SET AI_SETTINGS = $$
  guardrails:
    advanced_prompt_injection:
      - enabled: true
$$;

View guardrail settings

To view the current guardrail configuration for your account:

SHOW PARAMETERS LIKE 'AI_SETTINGS' IN ACCOUNT;

Disable guardrails

To disable Cortex AI Guardrails, unset the AI_SETTINGS parameter. Note that this removes the entire AI_SETTINGS value for the account, including any other settings it contains:

ALTER ACCOUNT UNSET AI_SETTINGS;
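
If you want to turn off only this guardrail while preserving any other values in AI_SETTINGS, you can instead set enabled to false. This is a sketch that assumes the same YAML structure shown in the enable example above; verify the schema against your account's current AI_SETTINGS value:

ALTER ACCOUNT SET AI_SETTINGS = $$
  guardrails:
    advanced_prompt_injection:
      - enabled: false
$$;

After either change, run SHOW PARAMETERS LIKE 'AI_SETTINGS' IN ACCOUNT; to confirm the new configuration.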

Monitor guardrail activity

When Cortex AI Guardrails detect a potential threat, the event is logged for audit and monitoring purposes. For Cortex Code, you can review detected threats in the conversation logs. For more information about managing conversation history, see conversation history.

Use these logs to:

  • Monitor for attempted attacks against your AI workloads

  • Identify patterns in blocked or flagged requests

  • Audit guardrail effectiveness

Considerations

  • While Cortex AI Guardrails are optimized for high accuracy, some legitimate prompts may occasionally be flagged. Review your guardrail logs periodically to identify any patterns.

  • Cortex AI Guardrails for prompt injection are currently available with Cortex Code.

Cost

You are charged credits for the use of Cortex AI Guardrails as listed in the Snowflake Service Consumption Table. Usage is measured based on the number of tokens scanned.
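
As a rough illustration of how token-based metering translates into credits, the query below uses a made-up placeholder rate; it is not an actual Snowflake price. Consult the Snowflake Service Consumption Table for the real per-token rate:

-- Hypothetical cost estimate for guardrail scanning.
-- ASSUMPTION: 0.001 credits per 1,000 tokens is a placeholder rate,
-- not an actual Snowflake price.
SELECT 2000000 AS tokens_scanned,
       (2000000 / 1000) * 0.001 AS estimated_credits;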