Snowflake DevOps¶
You can combine Snowflake features with your existing tools to build DevOps processes that safely and efficiently manage code and data.
What is Snowflake DevOps?¶
Snowflake provides an integrated approach to accelerate development lifecycles and improve overall productivity for data teams. This approach integrates Git version control, Python APIs, declarative object management, and seamless CI/CD automation.
Git repository for all of your sources
By keeping your data assets, code, and configurations centrally managed and version-controlled, you can ensure consistency, simplify collaboration, and streamline rollbacks if needed. Using your Git repository from Snowflake lets you do all of this within Snowflake’s secure perimeter, which is crucial for production-ready environments.
Declarative syntax
By embracing a declarative approach to database change management — defining and managing Snowflake objects using Python or SQL — you eliminate the need for complex scripts while promoting readability.
CI/CD automation
You can integrate with your existing CI/CD tools or Snowflake CLI to automatically execute commands and orchestrate your entire pipeline for efficient and reliable deployments.
What are the building blocks of Snowflake DevOps?¶
Snowflake offers several features that integrate to make DevOps tasks simpler and more secure.
CREATE OR ALTER for tables or tasks
You can use the CREATE OR ALTER command to apply updates for consistency across database objects, simplifying database change management. For more information, see Manage changes with declarative code and versioning.
EXECUTE IMMEDIATE FROM
With the EXECUTE IMMEDIATE FROM Snowflake command, you can execute SQL and Python code from a Git repository in Snowflake. For more information, see Parameterize scripts with Jinja templates.
Snowflake CLI
In scripts that drive your CI/CD pipeline, you can run Snowflake CLI commands to automate deployments, collaborate with version control, and integrate with other CI/CD tools. For more information, see Automate CI/CD jobs with GitHub Actions.
Snowflake Python APIs
Using Snowflake’s Python APIs, you can manage Snowflake resources, including databases, schemas, tables, and tasks. For more information, see Orchestrate Snowflake code execution with Snowflake tasks.
Git repository stage
You can fetch project configuration and data pipelines (schema, tables, scripts) from a Git repository to trigger the deployment workflow. For more information, see Streamline workflows by keeping assets in a Git repository connected with Snowflake.
Using Snowflake features in DevOps workflows¶
Using Snowflake with your existing tools, you can achieve the following DevOps goals:
Streamline workflows by keeping assets in a Git repository connected with Snowflake.
You can take advantage of your existing Git investment by connecting the repository to your Snowflake account. Once this connection is in place, with the repository cloned locally in Snowflake itself, you can browse and execute its code directly in Snowflake.
Minimize risk by maintaining separate environments for development, testing, and production.
By using separate Snowflake databases for each environment, you can minimize the risk of unwanted changes affecting live systems. To more easily manage deployments, you use scripts that parameterize the deployment process.
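A minimal sketch of this idea: a deploy script derives the target database name from an environment parameter, so the same code can deploy to any environment. The database prefix and environment names here are illustrative, not from the original.

```python
# Minimal sketch: derive a deployment target database from an environment
# name, so one script can deploy to dev, test, or prod. The prefix and
# environment names are illustrative.
VALID_ENVIRONMENTS = {"dev", "test", "prod"}

def target_database(environment: str, prefix: str = "MYDB") -> str:
    """Return the database name for a given deployment environment."""
    env = environment.lower()
    if env not in VALID_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    return f"{prefix}_{env.upper()}"

print(target_database("dev"))   # MYDB_DEV
print(target_database("prod"))  # MYDB_PROD
```

A CI/CD pipeline can pass the environment name as a parameter, keeping the deployment logic itself identical across environments.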
Manage database changes in a controlled and repeatable way for development, testing, and production by using scripts and declarative code.
Automate CI/CD jobs, including deployment of code and data, by using GitHub Actions.
Orchestrate Snowflake code execution by using Snowflake tasks.
Note
With the Snowflake Extension for Visual Studio Code, you can write and execute Snowflake SQL statements directly in VS Code. By connecting VS Code with the same repository you’ve connected to Snowflake, you can develop code locally in VS Code, keep your work in your Git repository, and then access your code from within Snowflake.
Streamline workflows by keeping assets in a Git repository connected with Snowflake¶
You can ensure consistency, simplify collaboration, and streamline rollbacks if needed by keeping your data assets, code, and configurations centrally managed and version-controlled in a Git repository and using the repository from Snowflake.
You can connect your Snowflake account with your Git repository so that Snowflake can execute code in files cloned from the repository. The result is a Git repository stage that represents a full clone of your repository. From within Snowflake, you can access version-controlled files at a specific commit, branch, or tag.
For example, with cloned repository files in Snowflake, you can use the EXECUTE IMMEDIATE FROM command to run code in a cloned file. This way, you can declaratively create database objects with SQL and then execute that SQL from other code, such as a GitHub Action.
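The stage path that commands such as EXECUTE IMMEDIATE FROM use to reference a file follows the pattern `@<stage>/<branches|tags|commits>/<ref>/<file>`. The following sketch is plain string handling, not a Snowflake API; it only illustrates how such a path is composed:

```python
# Illustrative helper (not a Snowflake API): compose the stage path that
# references a file at a specific branch, tag, or commit in a Git
# repository stage.
def repo_stage_path(stage: str, ref_type: str, ref: str, file: str) -> str:
    if ref_type not in ("branches", "tags", "commits"):
        raise ValueError(f"unsupported ref type: {ref_type}")
    return f"@{stage}/{ref_type}/{ref}/{file}"

print(repo_stage_path("myco_common.public.my_repo", "branches", "main",
                      "deploy_parameterized_pipeline.sql"))
# @myco_common.public.my_repo/branches/main/deploy_parameterized_pipeline.sql
```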
Maintain separate environments for development, test, and production¶
By maintaining separate environments for development, testing, and production, your teams can isolate development activities from the production environment, reducing the likelihood of unintended consequences and data corruption.
When you separate workflows into multiple environments, each gets its own Snowflake database — typically an identical copy.
Parameterize scripts with Jinja templates¶
To support deploying and executing code in essentially identical ways between development and production environments, you can parameterize references to environment specifics — such as which database to use during deployment. That way, you can enable a CI/CD pipeline to choose the deployment target appropriate to the environment.
To parameterize scripts, you can use Jinja, a popular templating language with features ranging from simple script parameterization to extensive scripting in a language similar to Python. Snowflake supports the execution of templated SQL scripts with EXECUTE IMMEDIATE FROM.
To change a deployment target, for example, you use variable replacement with Jinja. In SQL scripts, you indicate a Jinja variable with {{...}}. You can provide a value for a variable in the USING clause of EXECUTE IMMEDIATE FROM.

The following line of SQL uses a Jinja variable {{environment}} as a placeholder for the portion of the database name that differentiates this database from others whose names are similar:

CREATE DATABASE IF NOT EXISTS MYDB_{{environment}};
For example, you could execute this CREATE DATABASE command from jobs.<job_id>.steps[*].run in a GitHub Actions workflow:

run: |
  snow sql -q "EXECUTE IMMEDIATE FROM @myco_common.public.my_repo/branches/main/deploy_parameterized_pipeline.sql USING (environment => 'prod', retention_time => 1)"

Code in this GitHub Actions workflow file excerpt includes the following commands:

run runs the snow sql command of the Snowflake CLI.
snow sql executes the contents of the deploy_parameterized_pipeline.sql file from the main branch of the GitHub repository cloned to the myco_common.public.my_repo Git repository stage in Snowflake.
deploy_parameterized_pipeline.sql contains the CREATE DATABASE command.
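Snowflake renders the Jinja template itself when you run EXECUTE IMMEDIATE FROM; the following pure-Python sketch only mimics the simple {{variable}} replacement to show what the USING clause supplies. Real Jinja offers far more than this.

```python
import re

# Pure-Python sketch of simple {{variable}} replacement, mimicking what
# the USING clause of EXECUTE IMMEDIATE FROM supplies to the template.
# Snowflake renders the template server-side; this only illustrates the idea.
def render(template: str, **variables) -> str:
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for {{{{{name}}}}}")
        return str(variables[name])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

sql = "CREATE DATABASE IF NOT EXISTS MYDB_{{environment}};"
print(render(sql, environment="PROD"))
# CREATE DATABASE IF NOT EXISTS MYDB_PROD;
```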
With Snowflake CLI, you can create, manage, update, and view apps running on Snowflake across workloads.
Manage changes with declarative code and versioning¶
You can more easily manage changes to your database resources with reusable configuration files in your Git repository.
You can define database objects using the CREATE OR ALTER command, which creates the object or alters it to match the definition specified by the command. By using this command from a versioned file in a repository, you can more easily roll back changes to a previous version: you merely execute a previous version of the file.
CREATE OR ALTER TABLE vacation_spots (
city VARCHAR,
airport VARCHAR,
avg_temperature_air_f FLOAT,
avg_relative_humidity_pct FLOAT,
avg_cloud_cover_pct FLOAT,
precipitation_probability_pct FLOAT
) data_retention_time_in_days = 1;
The following Python code makes a comparable change by using the Snowflake Python APIs:

from snowflake.core import Root
from snowflake.core.table import TableColumn

# Assumes that `root` is a Root object created from an established
# Snowflake connection or session, for example: root = Root(session)
my_table = root.databases["my_db"].schemas["my_schema"].tables["vacation_spots"].fetch()
my_table.columns.append(TableColumn(name="city", datatype="varchar", nullable=False))
my_table.columns.append(TableColumn(name="airport", datatype="varchar", nullable=False))
my_table.columns.append(TableColumn(name="avg_temperature_air_f", datatype="float", nullable=False))
my_table.columns.append(TableColumn(name="avg_relative_humidity_pct", datatype="float", nullable=False))
my_table.columns.append(TableColumn(name="avg_cloud_cover_pct", datatype="float", nullable=False))
my_table.columns.append(TableColumn(name="precipitation_probability_pct", datatype="float", nullable=False))
my_table_res = root.databases["my_db"].schemas["my_schema"].tables["vacation_spots"]
my_table_res.create_or_alter(my_table)
Note
You can also use the Snowflake Python APIs and Snowflake CLI to manage Snowflake resources.
Automate CI/CD jobs with GitHub Actions¶
As indicated briefly in Parameterize scripts with Jinja templates, you can use GitHub Actions to automate the jobs that constitute a CI/CD pipeline. With GitHub Actions, you can define workflows that automate build, test, and deployment tasks.
In these workflows, you can use other features of Snowflake and GitHub to put the pieces together. You can do the following tasks:
Store your Snowflake credentials in a GitHub secret so that GitHub Actions can connect to Snowflake.
Execute commands in Snowflake by using the Snowflake CLI.
Use Snowflake CLI to execute code contained in files that are retrieved from a connected Git repository and held by Snowflake.
The GitHub Actions workflow excerpt in the following example deploys a pipeline. The workflow authenticates with Snowflake by retrieving the needed values from previously configured secrets.

To deploy the pipeline, the workflow executes the following two SQL commands with the snow sql command:

ALTER GIT REPOSITORY fetches the latest from the GitHub repository cloned to the quickstart_common.public.quickstart_repo Git repository stage in Snowflake.
EXECUTE IMMEDIATE FROM executes code in the deploy_parameterized_pipeline.sql file from the main branch of the repository.
name: Deploy data pipeline to prod
# Controls when the action will run.
on:
push:
branches:
- main
# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup Python 3.10
uses: actions/setup-python@v5
with:
python-version: '3.10'
- name: Install Python packages
run: pip install -r requirements.txt
- name: Deploy data pipeline
env:
SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_USER }}
SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
SNOWFLAKE_ROLE: ${{ secrets.SNOWFLAKE_ROLE }}
        SNOWFLAKE_WAREHOUSE: ${{ secrets.SNOWFLAKE_WAREHOUSE }}
run: |
snow sql -q "ALTER GIT REPOSITORY quickstart_common.public.quickstart_repo FETCH" --temporary-connection --account $SNOWFLAKE_ACCOUNT --user $SNOWFLAKE_USER --role $SNOWFLAKE_ROLE --warehouse $SNOWFLAKE_WAREHOUSE
snow sql -q "EXECUTE IMMEDIATE FROM @quickstart_common.public.quickstart_repo/branches/main/deploy_parameterized_pipeline.sql USING (environment => 'prod', retention_time => 1)" --temporary-connection --account $SNOWFLAKE_ACCOUNT --user $SNOWFLAKE_USER --role $SNOWFLAKE_ROLE --warehouse $SNOWFLAKE_WAREHOUSE
Orchestrate Snowflake code execution with Snowflake tasks¶
You can create Snowflake tasks to orchestrate the execution of SQL statements in Snowflake. Tasks include the ability to execute on a schedule, retry, suspend after failures, and execute after specific other tasks have been completed.
With a task, you can join views, send email notifications, feed data to other parts of your application, and so on.
After you create a task that coordinates these SQL statements, you can execute the task in a single execution. As mentioned elsewhere in this topic, for example, you could execute a task by including EXECUTE TASK in a repository file you execute with EXECUTE IMMEDIATE FROM.
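Snowflake schedules and orders task graphs itself; the following plain-Python sketch (not a Snowflake API) only mimics the AFTER semantics to show the ordering idea, using the task names from the example below.

```python
# Illustrative sketch (plain Python, not a Snowflake API): run "tasks"
# whose AFTER-style prerequisites must finish first, mimicking how a
# Snowflake task graph orders execution.
def run_graph(tasks: dict, start: str, executed=None) -> list:
    """tasks maps name -> (list of prerequisite names, callable)."""
    if executed is None:
        executed = []
    prereqs, action = tasks[start]
    for dep in prereqs:
        if dep not in executed:
            run_graph(tasks, dep, executed)
    action()
    executed.append(start)
    return executed

order = run_graph(
    {
        "vacation_spots_update": ([], lambda: None),
        "email_notification": (["vacation_spots_update"], lambda: None),
    },
    "email_notification",
)
print(order)  # ['vacation_spots_update', 'email_notification']
```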
For information on creating tasks with SQL, see CREATE TASK.
For information on creating tasks using a Python API, see Managing Snowflake tasks and task graphs with Python.
You can monitor tasks, including task execution history, by using SQL or Snowsight.
For information on monitoring tasks with SQL, see TASK_HISTORY view.
For information on monitoring tasks in Snowsight, see Viewing tasks and task graphs in Snowsight.
For information on managing tasks using a Python API, see Managing Snowflake tasks and task graphs with Python.
The task in the following example selects from a vacation_spots
table. Then, based on the result, the task calls SYSTEM$SEND_EMAIL to
send an email notification.
CREATE OR ALTER TASK email_notification
WAREHOUSE = 'quickstart_wh'
AFTER vacation_spots_update
AS
BEGIN
LET OPTIONS VARCHAR := (
SELECT TO_VARCHAR(ARRAY_AGG(OBJECT_CONSTRUCT(*)))
FROM vacation_spots
WHERE TRUE
AND punctual_pct >= 50
AND avg_temperature_air_f >= 70
-- STEP 5: INSERT CHANGES HERE
LIMIT 10);
IF (:OPTIONS = '[]') THEN
CALL SYSTEM$SEND_EMAIL(
'email_integration',
'<insert your email here>', -- INSERT YOUR EMAIL HERE
'New data successfully processed: No suitable vacation spots found.',
'The query did not return any results. Consider adjusting your filters.');
END IF;
LET QUERY VARCHAR := 'Considering the data provided below in JSON format, pick the best city for a family vacation in summer?
Explain your choice, offer a short description of the location and provide tips on what to pack for the vacation considering the weather conditions?
Finally, could you provide a detailed plan of daily activities for a one week long vacation covering the highlights of the chosen destination?\n\n';
LET RESPONSE VARCHAR := (SELECT SNOWFLAKE.CORTEX.COMPLETE('mistral-7b', :query || :options));
CALL SYSTEM$SEND_EMAIL(
'email_integration',
'<insert your email here>', -- INSERT YOUR EMAIL HERE
'New data successfully processed: The perfect place for your summer vacation has been found.',
:response);
EXCEPTION
WHEN EXPRESSION_ERROR THEN
CALL SYSTEM$SEND_EMAIL(
'email_integration',
'<insert your email here>', -- INSERT YOUR EMAIL HERE
'New data successfully processed: Cortex LLM function inaccessible.',
'It appears that the Cortex LLM functions are not available in your region');
END;
For an interactive walk-through of using Snowflake DevOps, see the Getting Started with Snowflake DevOps Quickstart.