YAML Specification for Semantic Views¶
Semantic Views are schema-level objects that define business concepts over your data, making it easier for users to query and analyze data using business terminology. You can use the YAML specification to create a semantic view in Cortex Analyst or use the SYSTEM$CREATE_SEMANTIC_VIEW_FROM_YAML stored procedure to create a semantic view from a YAML specification.
Overview¶
Semantic Views are the recommended approach for defining business semantics in Snowflake. They are schema-level objects that integrate with Snowflake’s privilege system, sharing mechanisms, and metadata catalog.
Note
Legacy semantic model YAML files (stored on stages) can still be used with Cortex Analyst for backward compatibility, but we recommend using semantic views for new implementations.
The benefits of semantic views over legacy semantic models are:
Native Snowflake integration: Schema-level objects with full RBAC, sharing, and catalog support
Advanced features: Support for derived metrics and access modifiers (public/private)
Better governance: Integrated with Snowflake’s privilege and sharing systems
Simplified management: No need to manage YAML files on stages
YAML format¶
Semantic views can take a YAML specification to define their behavior, allowing for readable, plain-text definitions.
The general syntax of a semantic view YAML specification is:
# Name and description of the semantic view.
name: <name>
description: <string>
comments: <string>
# Logical table-level concepts
# A semantic view can contain one or more logical tables.
tables:
# A logical table on top of a base table.
- name: <name>
description: <string>
# The fully qualified name of the base table.
base_table:
database: <database>
schema: <schema>
table: <base table name>
# Dimension columns in the logical table.
dimensions:
- name: <name>
synonyms: <array of strings>
description: <string>
expr: <SQL expression>
data_type: <data type>
unique: <boolean>
cortex_search_service:
service: <string>
literal_column: <string>
database: <string>
schema: <string>
is_enum: <boolean>
- ...
# Time dimension columns in the logical table.
time_dimensions:
- name: <name>
synonyms: <array of strings>
description: <string>
expr: <SQL expression>
data_type: <data type>
unique: <boolean>
# Fact columns in the logical table.
facts:
- name: <name>
synonyms: <array of strings>
description: <string>
access_modifier: <public_access | private_access> # Default is public_access.
expr: <SQL expression>
data_type: <data type>
# Regular metrics scoped to the logical table.
metrics:
- name: <name>
synonyms: <array of strings>
description: <string>
access_modifier: <public_access | private_access> # Default is public_access.
expr: <SQL expression>
# Commonly used filters over the logical table.
filters:
- name: <name>
synonyms: <array of strings>
description: <string>
expr: <SQL expression>
# View-level concepts
# Relationships between logical tables
relationships:
- name: <string>
left_table: <table>
right_table: <table>
relationship_columns:
- left_column: <column>
right_column: <column>
- left_column: <column>
right_column: <column>
# Derived metrics scoped to the semantic view.
# Derived metrics combine metrics from multiple tables.
metrics:
- name: <name>
synonyms: <array of strings>
description: <string>
access_modifier: <public_access | private_access> # Default is public_access
expr: <SQL expression>
# Additional context concepts
# Verified queries with example questions and queries that answer them
verified_queries:
- name: <string> # A descriptive name of the query.
question: <string> # The natural language question that this query answers.
verified_at: <int> # Optional: Time (in seconds since the UNIX epoch, January 1, 1970) when the query was verified.
verified_by: <string> # Optional: Name of the person who verified the query.
use_as_onboarding_question: <boolean> # Optional: Marks this question as an onboarding question for the end user.
sql: <string> # The SQL query for answering the question
Important
Semantic views do not require the join_type or relationship_type fields that were used in legacy semantic
models. The relationship type is automatically inferred from the data.
Key concepts¶
Tables¶
Logical tables represent business entities (such as customers, orders, or products) and map to physical database tables. Each logical table can define:
Base table: The fully qualified name of the physical table
Primary key: Columns that uniquely identify rows
Synonyms: Alternative names for the table
Description: Business-friendly explanation of what the table represents
Dimensions¶
Dimensions represent categorical attributes that provide context for analysis. They answer “who,” “what,” “where,” and “when” questions. Dimensions can be:
Regular dimensions: Text, numeric, or other categorical values
Time dimensions: Date or timestamp columns with special time-based handling
Properties of dimensions¶
expr: SQL expression to calculate the dimension valuesynonyms: Alternative terms users might useunique: Whether values are unique across rowsis_enum: Whether the dimension has a fixed set of valuescortex_search_service: Optional Cortex Search service for semantic search
Optional properties for physical dimensions¶
These fields are optional, but recommended for producing higher-quality results from a semantic view search.
synonymsA list of other terms/phrases used to refer to this dimension. Must be unique across all synonyms in this semantic model.
descriptionA brief description of this dimension. Include information that provides useful context, such as data this dimension represents.
uniqueA boolean value that indicates this dimension has unique values.
sample_valuesSample values of this column, if any. Add any value that is likely to be referenced in the user questions.
is_enumA Boolean value. If
True, the values in thesample_valuesfield are taken to be the full list of possible values, and the model only chooses from those values when filtering on that column.cortex_search_serviceSpecifies the Cortex Search Service to use for this dimension. It has the following fields:
service: The name of the Cortex Search Service.literal_column: (optional) The column in the Cortex Search Service that contains the literal values.database: (optional) The database where the Cortex Search Service is located. Defaults tobase_table’s database.schema: (optional) The schema where the Cortex Search Service is located. Defaults tobase_table’s schema.
cortex_search_servicereplaces thecortex_search_service_namefield, which could only specify the name.cortex_search_service_namehas been deprecated.
Optional properties for time dimensions¶
These fields are optional, but recommended for producing higher-quality results from a semantic view search.
synonymsA list of other terms/phrases used to refer to this time dimension. Must be unique across all synonyms in this semantic model.
descriptionA brief description of this dimension. Include information that provides useful context, such as the time zone that this dimension uses as a reference point.
unique:A boolean value that indicates this column has unique values.
sample_values:Sample values of this column, if any. Add any values that are likely to be referenced in the user questions.
Facts¶
Facts are row-level quantitative attributes that represent specific business events or transactions. Facts capture “how much” or “how many” at the most granular level, such as individual sales amounts, quantities purchased, or costs.
Facts typically function as “helper” concepts within the semantic view to help construct dimensions and metrics.
The properties of facts are:
expr: SQL expression to calculate the fact valueaccess_modifier: Set toprivate_accessto hide from queries (useful for intermediate calculations)data_type: The data type of the fact
Metrics¶
Metrics are quantifiable measures of business performance calculated by aggregating facts or other columns using functions like SUM, AVG, and COUNT.
Two types of metrics:
Table-level metrics: Scoped to a specific logical table, aggregating data within that table
Derived metrics: View-level metrics that combine metrics from multiple tables
Properties of metrics:
expr: SQL expression with aggregation functionaccess_modifier: Set toprivate_accessto hide from queries (useful for intermediate calculations)synonyms: Alternative terms for the metric
Derived metrics¶
Derived metrics are view-level metrics not tied to a specific table. They can combine metrics from multiple tables or perform calculations across the entire view.
Example of a derived metric:
metrics:
- name: total_profit_margin
description: "Overall profit margin across all products"
expr: (orders.total_revenue - orders.total_cost) / orders.total_revenue
access_modifier: public_access
Relationships¶
Relationships define how logical tables join together. Each relationship specifies:
left_table: The table containing the foreign keyright_table: The table being referencedrelationship_columns: Pairs of columns to join on, asleft_columnandright_column
The relationship type (one-to-one, many-to-one) is automatically inferred from the data and primary key definitions.
Note
Unlike legacy semantic models, semantic views do not require explicit join_type or relationship_type
specifications. These are determined automatically.
Filters¶
Filters define commonly used filtering conditions that can be referenced by name. This helps ensure consistent filtering logic across queries.
Example:
filters:
- name: active_customers
description: "Customers who have made a purchase in the last 12 months"
expr: "customer_last_purchase_date >= DATEADD(month, -12, CURRENT_DATE())"
Verified queries¶
Verified queries are example questions with their corresponding SQL queries. They help Cortex Analyst understand how to answer similar questions and serve as documentation for users.
Properties:
question: Natural language questionsql: SQL query that answers the questionverified_by: Optional person who verified the query is correctverified_at: Optional timestamp when verifieduse_as_onboarding_question: Optional flag to show this as a suggestion to users
Access modifiers¶
Semantic views support access modifiers for facts and metrics, allowing you to control visibility:
public_access(default): Visible and queryable by usersprivate_access: Hidden from queries, used only for intermediate calculations
Example:
facts:
- name: internal_cost
expr: unit_cost * quantity
data_type: NUMBER
access_modifier: private_access # Not visible in queries
metrics:
- name: total_revenue
expr: SUM(sale_amount)
access_modifier: public_access # Visible in queries
Custom instructions for Cortex Analyst¶
You can use SQL commands to provide custom instructions in the semantic view definition. These instructions guide how the queries are generated and how questions are categorized. These instructions are not part of the YAML specification but are set using the CREATE SEMANTIC VIEW command.
For more information, see Providing custom instructions for Cortex Analyst.
Example semantic view YAML¶
Here’s a complete example of a semantic view YAML specification:
name: revenue_analysis
description: "Semantic view for analyzing revenue across products and customers"
tables:
- name: customers
description: "Customer information"
base_table:
database: sales_db
schema: public
table: customers
dimensions:
- name: customer_name
synonyms: ["client name", "customer"]
description: "Full name of the customer"
expr: c_name
data_type: VARCHAR
- name: customer_segment
synonyms: ["segment", "market segment"]
description: "Customer market segment"
expr: c_mktsegment
data_type: VARCHAR
is_enum: true
- name: orders
description: "Order information"
base_table:
database: sales_db
schema: public
table: orders
dimensions:
- name: order_date
description: "Date when order was placed"
expr: o_orderdate
data_type: DATE
time_dimensions:
- name: order_year
description: "Year when order was placed"
expr: YEAR(o_orderdate)
data_type: NUMBER
facts:
- name: order_total
description: "Total order amount"
expr: o_totalprice
data_type: NUMBER
metrics:
- name: total_orders
description: "Total number of orders"
expr: COUNT(*)
- name: average_order_value
description: "Average order value"
expr: AVG(o_totalprice)
relationships:
- name: orders_to_customers
left_table: orders
right_table: customers
relationship_columns:
- left_column: o_custkey
right_column: c_custkey
metrics:
- name: revenue_per_customer
description: "Average revenue per customer"
expr: orders.total_revenue / customers.customer_count
access_modifier: public_access
verified_queries:
- name: top_customers_by_revenue
question: "Who are the top 10 customers by revenue?"
sql: |
SELECT
customer_name,
SUM(order_total) as total_revenue
FROM revenue_analysis
GROUP BY customer_name
ORDER BY total_revenue DESC
LIMIT 10
use_as_onboarding_question: true
Creating a semantic view from YAML¶
To create a semantic view from a YAML specification, use the SYSTEM$CREATE_SEMANTIC_VIEW_FROM_YAML stored procedure.
For more information, see Creating a semantic view from a YAML specification.
Getting YAML from a semantic view¶
To export a semantic view to YAML format, use the SYSTEM$READ_YAML_FROM_SEMANTIC_VIEW function.
For more information, see Getting the YAML specification for a semantic view.
Differences from legacy semantic models¶
If you’re migrating from legacy semantic model YAML files to semantic views, note these key differences:
Feature |
Legacy Semantic Models |
Semantic Views |
|---|---|---|
Storage |
YAML files on stages |
Schema-level objects in database |
Privileges |
Stage-based access control |
Full Snowflake RBAC integration |
Sharing |
Manual file sharing |
Native Snowflake sharing |
Join types |
Requires |
Automatically inferred |
Derived metrics |
Not supported |
Fully supported |
Access modifiers |
Not supported |
|
Custom instructions |
In YAML file |
Set via SQL commands |
When converting from a legacy semantic model to a semantic view:
Remove
join_typeandrelationship_typefrom relationshipsConsider using derived metrics for view-level calculations
Add
access_modifierto facts/metrics you want to make privateMove custom instructions to SQL CREATE SEMANTIC VIEW command