Attributing cost

An organization can gain insight into the cost of using Snowflake by attributing those costs to logical units within the organization (e.g. departments, environments, and projects). This chargeback or showback model is useful for accounting purposes, but it also allows an organization to identify which units are spending the most. This insight helps pinpoint areas of the organization that could benefit from controls and optimizations in order to reduce costs.

Attribute cost of individual resources

Note

Object tagging is available in Enterprise Edition or higher.

The most direct, powerful way to attribute cost to different groupings like departments, environments, or projects is to tag specific resources using the object tagging feature. In this strategy, an administrator creates a tag (e.g. cost_center), then defines a list of possible values of the tag (e.g. sales, finance). In this example, each cost center gets a unique tag value. The tag/value combination is then assigned to resources used by a cost center. As these resources consume credits, you can run reports grouped by the tag value. Because this tag value corresponds directly to a particular grouping within the organization, costs can be attributed accurately.

Object tagging allows you to attribute cost based on something as general as an account or as specific as a table. Note that not all taggable objects accumulate cost. From a cost management perspective, tagging an object only makes sense if there is a schema view that contains usage or cost data related to that object. For a list of Snowflake usage and cost views, see Exploring overall cost.

Attribution workflow

Using object tags to attribute cost consists of three basic tasks:

  1. Design a tagging strategy and create the appropriate tags.

  2. Tag resources with the correct tag/value pair.

  3. Run reports based on usage and tag data.

To help illustrate this workflow, consider a Snowflake account that:

  • Has two cost centers: Sales and finance.

  • Has three warehouses. Sales uses Warehouse1 and Warehouse2, while the finance department uses Warehouse3.

Three warehouses in an account

In this example, the goal is to attribute warehouse costs associated with each warehouse to the appropriate department.

Note

The following example assumes that appropriate privileges have been granted to the role tag_admin. An organization can use the granular tagging permissions to develop a secure tagging strategy.

Task 1: Design tagging strategy and create the appropriate tags

For this example, an administrator with the proper privileges creates a single tag called cost_center.

USE ROLE tag_admin;
USE SCHEMA my_db.my_schema;
CREATE TAG cost_center;
Copy
Task 2: Tag objects with the correct tag/value pair

The tag administrator applies the appropriate tag/value pair to the warehouses. For example, since the sales department uses both warehouse1 and warehouse2, the same tag/value combination is applied to these two resources.

USE ROLE tag_admin;
ALTER WAREHOUSE warehouse1 SET TAG cost_center='SALES';
ALTER WAREHOUSE warehouse2 SET TAG cost_center='SALES';
ALTER WAREHOUSE warehouse3 SET TAG cost_center='FINANCE';
Copy

Note

If administrators want to control which values can be set on the object, they can use the optional allowed_values parameter when creating the tag. For details, see CREATE TAG.

Task 3: Run reports based on usage and tag data

You can attribute costs by reporting on the use of resources that have the cost_center tag. This data is available in Snowsight, or by querying the SNOWFLAKE.ACCOUNT_USAGE schema.

Snowsight:
  1. Switch to a role that has access to the ACCOUNT_USAGE schema.

  2. Navigate to Admin » Cost Management.

  3. Select Consumption.

  4. From the Tags drop-down, select the cost_center tag.

  5. To focus on a specific cost center, select a value from the list of the tag’s values.

  6. Select Apply.

For more details about filtering in Snowsight, see Filter by tag.

SQL:

The Account Usage schema contains data about both resource usage and the tags in the schema.

You can view the cost attribution for all warehouses in a given month with their tags and tag values. This query also identifies untagged warehouses:

SELECT tag_references.tag_name,
     COALESCE(tag_references.tag_value, 'untagged') AS tag_value,
     SUM(warehouse_metering_history.credits_used_compute) AS total_credits
FROM snowflake.account_usage.warehouse_metering_history
LEFT JOIN snowflake.account_usage.tag_references
ON warehouse_metering_history.warehouse_id = tag_references.object_id
WHERE warehouse_metering_history.start_time >= DATE_TRUNC('MONTH', DATEADD(MONTH, -1, CURRENT_DATE))
  AND warehouse_metering_history.start_time < DATE_TRUNC('MONTH',  CURRENT_DATE)
GROUP BY tag_references.tag_name, COALESCE(tag_references.tag_value, 'untagged')
ORDER BY total_credits DESC;
Copy
+-------------+-------------+-----------------+
| TAG_NAME    | TAG_VALUE   |   TOTAL_CREDITS |
|-------------+-------------+-----------------|
| NULL        | untagged    |    20.360277159 |
| COST_CENTER | Sales       |    17.173333333 |
| COST_CENTER | Finance     |      8.14444444 |
+-------------+-------------+-----------------+

For complete details about using object tags, see Object Tagging.

Attribute cost for queries

You can use the QUERY_ATTRIBUTION_HISTORY view to attribute cost based on queries. The cost per query is the warehouse credit usage for executing the query. This cost does not include any other credit usage that is incurred as a result of query execution. For example, the following are not included in the query cost:

  • Data transfer costs

  • Storage costs

  • Cloud services costs

  • Costs for serverless features

  • Costs for tokens processed by AI services

For queries that are executed concurrently, the cost of the warehouse is attributed to individual queries based on the weighted average of their resource consumption during a given time interval.

The cost per query does not include warehouse idle time. Idle time is a period of time in which no queries are running in the warehouse and can be measured at the warehouse level.

Cost attribution by tag, user, and query

You can attribute warehouse and query costs by tag, user, or by query (for example, recurrent queries).

Attribution of warehouse cost based on query tag

You can use the QUERY_TAG parameter to allocate cost across different workloads. A query tag can be assigned at the session level for a workload or a set of queries. For more information about tag-based attribution, see Attribution workflow.

To set a query tag for a session, follow these steps:

  1. Create a tag. For example, create a tag for the finance team:

    CREATE TAG finance;
    
    Copy
  2. Set a query tag for the current session:

    ALTER SESSION SET query_tag = finance;
    
    Copy

    The finance tag is associated with all queries executed during the session.

  3. To view the warehouse cost of all tagged queries for the current month, execute the following statement. Note that this query allocates the entire warehouse compute cost across tags (including idle time) for reconciliation:

    WITH wh_bill AS (
       SELECT SUM(credits_used_compute) AS compute_credits
         FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
         WHERE start_time >= DATE_TRUNC('MONTH', CURRENT_DATE)
         AND start_time < CURRENT_DATE
    ),
    tag_credits AS (
       SELECT COALESCE(NULLIF(query_tag, ''), 'untagged') AS tag,
              SUM(credits_attributed_compute) AS credits
         FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ATTRIBUTION_HISTORY
         WHERE start_time >= DATE_TRUNC('MONTH', CURRENT_DATE)
         AND start_time < CURRENT_DATE
         GROUP BY tag
    ),
    total_credit AS (
       SELECT SUM(credits) AS sum_all_credits
         FROM tag_credits
    )
    SELECT tc.tag,
           tc.credits / t.sum_all_credits * w.compute_credits AS attributed_credits
      FROM tag_credits tc, total_credit t, wh_bill w;
    
    Copy

User-based attribution of warehouse cost

You can attribute warehouse cost by user. For example, to determine the total warehouse cost attributed to each user for the current month, execute the following statement:

WITH wh_bill AS (
   SELECT SUM(credits_used_compute) AS compute_credits
     FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
     WHERE start_time >= DATE_TRUNC('MONTH', CURRENT_DATE)
     AND start_time < CURRENT_DATE
),
user_credits AS (
   SELECT user_name, SUM(credits_attributed_compute) AS credits
     FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ATTRIBUTION_HISTORY
     WHERE start_time >= DATE_TRUNC('MONTH', CURRENT_DATE)
     AND start_time < CURRENT_DATE
     GROUP BY user_name
),
total_credit AS (
   SELECT SUM(credits) AS sum_all_credits
     FROM user_credits
)
SELECT u.user_name,
       u.credits / t.sum_all_credits * w.compute_credits AS attributed_credits
  FROM user_credits u, total_credit t, wh_bill w
  ORDER BY attributed_credits DESC;
Copy
+-----------+--------------------+
| USER_NAME | ATTRIBUTED_CREDITS |
|-----------+--------------------|
| JSMITH    |       17.173333333 |
|-----------+--------------------|
| MJONES    |         8.14444444 |
|-----------+--------------------|
| SYSTEM    |         5.33985393 |
+-----------+--------------------+

Query cost for recurrent queries

For recurrent or similar queries, use the query_hash or query_parameterized_hash to group costs by query.

To find the most expensive recurrent queries for the current month, execute the following statement:

SELECT query_parameterized_hash,
       COUNT(*) AS query_count,
       SUM(credits_attributed_compute) AS total_credits
  FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ATTRIBUTION_HISTORY
  WHERE start_time >= DATE_TRUNC('MONTH', CURRENT_DATE)
  AND start_time < CURRENT_DATE
  GROUP BY query_parameterized_hash
  ORDER BY total_credits DESC
  LIMIT 20;
Copy

For an additional query based on query ID, see Examples.

Query cost for hierarchical queries

For stored procedures that issue multiple hierarchical queries, you can compute the attributed query costs for the procedure by using the root query ID for the procedure.

  1. To find the root query ID for a stored procedure, use the ACCESS_HISTORY view. For example, to find the root query ID for a stored procedure, set the query_id and execute the following statements:

    SET query_id = '<query_id>';
    
    SELECT query_id,
           parent_query_id,
           root_query_id,
           direct_objects_accessed
      FROM SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY
      WHERE query_id = $query_id;
    
    Copy

    For more information, see Example: Ancestor queries with stored procedures.

  2. To sum the query cost for the entire procedure, replace <root_query_id> and execute the following statements:

    SET query_id = '<root_query_id>';
    
    SELECT SUM(credits_attributed_compute) AS total_attributed_credits
      FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ATTRIBUTION_HISTORY
      WHERE (root_query_id = $query_id OR query_id = $query_id);
    
    Copy