DevOps for Snowflake

You can combine Snowflake tools with your existing tools to build DevOps processes to safely and efficiently manage code and data.

What is DevOps with Snowflake?

Snowflake provides an integrated approach to accelerate development lifecycles and improve overall productivity for data teams. This approach integrates Git version control, Python APIs, declarative object management, and seamless CI/CD automation.

Git repository for all of your sources

By keeping your data assets, code, and configurations centrally managed and version-controlled, you can ensure consistency, simplify collaboration, and streamline rollbacks if needed.

Using your Git repository from Snowflake lets you do all of this within Snowflake’s secure perimeter, which is crucial for production-ready environments.
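For example, connecting a repository typically involves creating an API integration for your Git provider and then a Git repository stage that clones the repository into Snowflake. The following is a minimal sketch; the integration name, organization URL, and repository name are placeholders, and a private repository would also need an authentication secret:

```sql
-- Allow Snowflake to reach the Git provider (names and URLs are placeholders).
CREATE OR REPLACE API INTEGRATION my_git_api_integration
  API_PROVIDER = git_https_api
  API_ALLOWED_PREFIXES = ('https://github.com/my-org')
  ENABLED = TRUE;

-- Create a Git repository stage backed by the remote repository.
CREATE OR REPLACE GIT REPOSITORY my_repo
  API_INTEGRATION = my_git_api_integration
  ORIGIN = 'https://github.com/my-org/my-project.git';
```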

Declarative syntax

By embracing a declarative approach to database change management — defining and managing Snowflake objects using Python or SQL — you eliminate the need for complex scripts while promoting readability.

CI/CD automation

You can integrate with your existing CI/CD tools or Snowflake CLI to automatically execute commands and orchestrate your entire pipeline for efficient and reliable deployments.

What are the building blocks of DevOps with Snowflake?

Snowflake offers several features that integrate to make DevOps tasks simpler and more secure.

CREATE OR ALTER for tables or tasks

You can use the CREATE OR ALTER command to apply updates for consistency across database objects, simplifying database change management.

For more information, see Manage changes with declarative code and versioning.

EXECUTE IMMEDIATE FROM

With the EXECUTE IMMEDIATE FROM Snowflake command, you can execute SQL and Python code from a Git repository in Snowflake.

For more information, see Parameterize scripts with Jinja templates.
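For example, assuming a Git repository stage named my_repo in the mydb.public schema (hypothetical names), you could run a versioned script from the main branch like this:

```sql
-- Execute a SQL file directly from the cloned repository.
EXECUTE IMMEDIATE FROM @mydb.public.my_repo/branches/main/scripts/setup.sql;
```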

Snowflake CLI

In scripts to drive your CI/CD pipeline, you can run Snowflake CLI commands to automate deployments, collaborate with version control, and integrate with other CI/CD tools.

For more information, see Automate CI/CD jobs with GitHub Actions.

Python APIs

Using Snowflake’s Python APIs, you can manage Snowflake resources, including databases, schemas, tables, and tasks.

For more information, see Orchestrate Snowflake code execution with Snowflake tasks.
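As a sketch, the Snowflake Python API exposes Snowflake resources as Python objects. The connection settings and database name below are placeholder assumptions, and running this requires a live Snowflake connection:

```python
from snowflake.core import Root, CreateMode
from snowflake.core.database import Database
from snowflake.snowpark import Session

# Placeholder connection; in practice, load credentials from your config or secrets.
session = Session.builder.configs({"connection_name": "default"}).create()
root = Root(session)

# Idempotently create a database, similar to CREATE DATABASE IF NOT EXISTS.
root.databases.create(Database(name="MYDB_DEV"), mode=CreateMode.if_not_exists)
```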

Git in Snowflake

You can fetch project configuration and data pipeline definitions (schemas, tables, scripts) from a Git repository to trigger the deployment workflow.

For more information, see Streamline workflows by keeping assets in a Git repository connected with Snowflake.

Using Snowflake features in DevOps workflows

Using Snowflake with your existing tools, you can achieve the following DevOps goals:

  • Streamline workflows by keeping assets in a Git repository connected with Snowflake.

    You can take advantage of your existing Git investment by connecting the repository to your Snowflake account. Once this connection is in place, with the repository cloned locally in Snowflake itself, you can browse and execute its code directly in Snowflake.

  • Minimize risk by maintaining separate environments for development, testing, and production.

    By using separate Snowflake databases for each environment, you can minimize the risk of unwanted changes affecting live systems. To more easily manage deployments, you use scripts that parameterize the deployment process.

  • Manage database changes in a controlled and repeatable way for development, testing, and production by using scripts and declarative code.

  • Automate CI/CD jobs, including deployment of code and data, by using GitHub Actions.

  • Orchestrate Snowflake code execution by using Snowflake tasks.

Note

With the Snowflake Extension for Visual Studio Code, you can write and execute Snowflake SQL statements directly in VS Code. By connecting VS Code with the same repository you’ve connected to Snowflake, you can develop code locally in VS Code, keep your work in your Git repository, and then access your code from within Snowflake.

Streamline workflows by keeping assets in a Git repository connected with Snowflake

You can ensure consistency, simplify collaboration, and streamline rollbacks if needed by keeping your data assets, code, and configurations centrally managed and version-controlled in a Git repository and using the repository from Snowflake.

You can connect your Snowflake account with your Git repository so that Snowflake can execute code in files cloned from the repository. The result is a Git repository stage that represents a full clone of your repository. From within Snowflake, you can access version-controlled files at a specific commit, branch, or tag.

For example, with cloned repository files in Snowflake, you can use the EXECUTE IMMEDIATE FROM command to run code in a cloned file. This way, you can declaratively create database objects with SQL and then execute that SQL from other code, such as a GitHub Action.
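In SQL, that flow might look like the following sketch (the repository stage and file names are placeholders):

```sql
-- Pull the latest commits from the remote into the Git repository stage.
ALTER GIT REPOSITORY mydb.public.my_repo FETCH;

-- Run a versioned script from a specific branch.
EXECUTE IMMEDIATE FROM @mydb.public.my_repo/branches/main/create_objects.sql;
```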

Diagram showing Git repository exchanging files with development tools and Snowflake.

Maintain separate environments for development, test, and production

By maintaining separate environments for development, testing, and production, your teams can isolate development activities from the production environment, reducing the likelihood of unintended consequences and data corruption.

When you separate workflows into multiple environments, each environment gets its own Snowflake database, typically created from the same versioned definitions so that the environments remain structurally identical.
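For example, a deployment script might create one database per environment from the same definitions (the names are illustrative):

```sql
-- One database per environment, created from the same versioned definitions.
CREATE DATABASE IF NOT EXISTS MYDB_DEV;
CREATE DATABASE IF NOT EXISTS MYDB_TEST;
CREATE DATABASE IF NOT EXISTS MYDB_PROD;
```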

Parameterize scripts with Jinja templates

To support deploying and executing code in essentially identical ways between development and production environments, you can parameterize references to environment specifics — such as which database to use during deployment. That way, you can enable a CI/CD pipeline to choose the deployment target appropriate to the environment.

To parameterize scripts, you can use Jinja, a popular templating language with features ranging from simple script parameterization to extensive scripting in a language similar to Python. Snowflake supports the execution of templated SQL scripts with EXECUTE IMMEDIATE FROM.

For changing a deployment target, for example, you’d use variable replacement with Jinja. In SQL scripts you indicate a Jinja variable with {{...}}. You can provide a value for a variable in the USING clause of EXECUTE IMMEDIATE FROM.

The following line of SQL uses a Jinja variable {{environment}} as a placeholder for a portion of the database name that differentiates this database from others whose names are similar:

CREATE DATABASE IF NOT EXISTS MYDB_{{environment}};
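To illustrate the substitution that this kind of templating performs, here is a minimal Python stand-in (not the Jinja library itself) that replaces {{name}} placeholders with values, much as EXECUTE IMMEDIATE FROM does with its USING clause:

```python
import re

TEMPLATE = "CREATE DATABASE IF NOT EXISTS MYDB_{{environment}};"

def render(template: str, **params: str) -> str:
    # Minimal stand-in for Jinja variable substitution: {{name}} -> value.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}",
                  lambda m: str(params[m.group(1)]),
                  template)

print(render(TEMPLATE, environment="prod"))
# -> CREATE DATABASE IF NOT EXISTS MYDB_prod;
```

In a real pipeline, Jinja itself handles the rendering; this sketch only shows why the same template can target MYDB_DEV, MYDB_TEST, or MYDB_PROD depending on the value passed in.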

For example, you could run the CREATE DATABASE command from jobs.<job_id>.steps[*].run in a GitHub Actions workflow.

run: |
  snow sql -q "EXECUTE IMMEDIATE FROM @myco_common.public.my_repo/branches/main/deploy_parameterized_pipeline.sql USING (environment => 'prod', retention_time => 1)"

Code in this GitHub Action workflow file excerpt includes the following commands:

  • run runs the snow sql command of the Snowflake CLI.

  • snow sql executes the contents of the deploy_parameterized_pipeline.sql file from the main branch of the GitHub repository, as cloned to the myco_common.public.my_repo Git repository stage in Snowflake.

    deploy_parameterized_pipeline.sql contains the CREATE DATABASE command.

With Snowflake CLI, you can create, manage, update, and view apps running on Snowflake across workloads.

Manage changes with declarative code and versioning

You can more easily manage changes to your database resources with reusable configuration files in your Git repository.

You can define database objects using the CREATE OR ALTER command, which creates the object or alters it to match the definition specified by the command. By using this command from a versioned file in a repository, you can more easily roll back changes to a previous version: you merely execute a previous version of the file.

CREATE OR ALTER TABLE vacation_spots (
  city VARCHAR,
  airport VARCHAR,
  avg_temperature_air_f FLOAT,
  avg_relative_humidity_pct FLOAT,
  avg_cloud_cover_pct FLOAT,
  precipitation_probability_pct FLOAT
) data_retention_time_in_days = 1;

Note

You can also use the Snowflake Python API and Snowflake CLI to manage Snowflake resources.

Automate CI/CD jobs with GitHub Actions

As indicated briefly in Parameterize scripts with Jinja templates, you can use GitHub Actions to automate the jobs that constitute a CI/CD pipeline. With GitHub Actions, you can define workflows that automate build, test, and deployment tasks.

In these workflows, you can use other features of Snowflake and GitHub to put the pieces together. You can do the following tasks:

  • Store your Snowflake credentials in a GitHub secret so that GitHub Actions can connect to Snowflake.

  • Execute commands in Snowflake by using the Snowflake CLI.

  • Use Snowflake CLI to execute code contained in files that are retrieved from a connected Git repository and held by Snowflake.

The GitHub Actions workflow excerpt in the following example deploys a pipeline. The workflow authenticates with Snowflake by retrieving the needed values from previously configured secrets.

To deploy the pipeline, the workflow executes the following two SQL commands with the snow sql command:

  • ALTER GIT REPOSITORY fetches the latest changes from the GitHub repository cloned to the quickstart_common.public.quickstart_repo Git repository stage in Snowflake.

  • EXECUTE IMMEDIATE FROM executes code in the deploy_parameterized_pipeline.sql file from the main branch of the repository.

name: Deploy data pipeline to prod

# Controls when the action will run.
on:
  push:
    branches:
      - main

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Setup Python 3.10
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'

      - name: Install Python packages
        run: pip install -r requirements.txt

      - name: Deploy data pipeline
        env:
          SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
          SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_USER }}
          SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
          SNOWFLAKE_ROLE: ${{ secrets.SNOWFLAKE_ROLE }}
          SNOWFLAKE_WAREHOUSE: ${{ secrets.SNOWFLAKE_WAREHOUSE }}
        run: |
          snow sql -q "ALTER GIT REPOSITORY quickstart_common.public.quickstart_repo FETCH" --temporary-connection --account $SNOWFLAKE_ACCOUNT --user $SNOWFLAKE_USER --role $SNOWFLAKE_ROLE --warehouse $SNOWFLAKE_WAREHOUSE
          snow sql -q "EXECUTE IMMEDIATE FROM @quickstart_common.public.quickstart_repo/branches/main/deploy_parameterized_pipeline.sql USING (environment => 'prod', retention_time => 1)" --temporary-connection --account $SNOWFLAKE_ACCOUNT --user $SNOWFLAKE_USER --role $SNOWFLAKE_ROLE --warehouse $SNOWFLAKE_WAREHOUSE

Orchestrate Snowflake code execution with Snowflake tasks

You can create Snowflake tasks to orchestrate the execution of SQL statements in Snowflake. Tasks can run on a schedule, retry automatically, suspend after repeated failures, and run after other specified tasks complete.
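For example, a root task can run on a schedule, with child tasks running after it completes. The warehouse name, schedule, and task body below are illustrative placeholders:

```sql
-- Hypothetical root task that runs daily at 06:00 UTC.
CREATE OR ALTER TASK vacation_spots_update
  WAREHOUSE = 'quickstart_wh'
  SCHEDULE = 'USING CRON 0 6 * * * UTC'
  AS
    -- Placeholder body; in practice, refresh the data that downstream tasks read.
    SELECT 1;
```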

With a task, you can join views, send email notifications, feed data to other parts of your application, and so on.

After you create a task that coordinates these SQL statements, you can run all of them with a single execution of the task. For example, as mentioned elsewhere in this topic, you could execute a task by including EXECUTE TASK in a repository file that you execute with EXECUTE IMMEDIATE FROM.

You can monitor tasks, including task execution history, by using SQL or Snowsight.
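For example, you can query recent task runs with the TASK_HISTORY table function:

```sql
-- Show the ten most recent task runs in the current database.
SELECT name, state, scheduled_time, error_message
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY())
ORDER BY scheduled_time DESC
LIMIT 10;
```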

The task in the following example selects from a vacation_spots table. Then, based on the result, the task calls SYSTEM$SEND_EMAIL to send an email notification.

CREATE OR ALTER TASK email_notification
  WAREHOUSE = 'quickstart_wh'
  AFTER vacation_spots_update
  AS
    BEGIN
      LET OPTIONS VARCHAR := (
        SELECT TO_VARCHAR(ARRAY_AGG(OBJECT_CONSTRUCT(*)))
        FROM vacation_spots
        WHERE TRUE
          AND punctual_pct >= 50
          AND avg_temperature_air_f >= 70
          -- STEP 5: INSERT CHANGES HERE
        LIMIT 10);


      IF (:OPTIONS = '[]') THEN
        CALL SYSTEM$SEND_EMAIL(
            'email_integration',
            '<insert your email here>', -- INSERT YOUR EMAIL HERE
            'New data successfully processed: No suitable vacation spots found.',
            'The query did not return any results. Consider adjusting your filters.');
      END IF;

      LET QUERY VARCHAR := 'Considering the data provided below in JSON format, pick the best city for a family vacation in summer?
      Explain your choice, offer a short description of the location and provide tips on what to pack for the vacation considering the weather conditions?
      Finally, could you provide a detailed plan of daily activities for a one week long vacation covering the highlights of the chosen destination?\n\n';

      LET RESPONSE VARCHAR := (SELECT SNOWFLAKE.CORTEX.COMPLETE('mistral-7b', :query || :options));

      CALL SYSTEM$SEND_EMAIL(
        'email_integration',
        '<insert your email here>', -- INSERT YOUR EMAIL HERE
        'New data successfully processed: The perfect place for your summer vacation has been found.',
        :response);
    EXCEPTION
        WHEN EXPRESSION_ERROR THEN
            CALL SYSTEM$SEND_EMAIL(
            'email_integration',
            '<insert your email here>', -- INSERT YOUR EMAIL HERE
            'New data successfully processed: Cortex LLM function inaccessible.',
            'It appears that the Cortex LLM functions are not available in your region');
    END;

For an interactive walk-through of using DevOps for Snowflake, see the Getting Started with DevOps for Snowflake Quickstart.