Snowflake API Reference (Python)¶
Snowflake API (Python) allows you to manage Snowflake resources by using Python APIs. For more information, see the Snowflake API Developer Guide for Python.
Overview¶
The Snowflake platform has a rich SQL command API that lets you interact with Snowflake entities: creating, deleting, and modifying them, and more. The library described here, called the “Snowflake Python API” (or “SnowAPI” for short), provides the same functionality using the Python language.
Here is a quick overview of the concepts, classes, and functionality for using Python to interact with the Snowflake platform, in most cases eliminating the need to write SQL or to use the Python SQL connector.
Connecting to Snowflake¶
Before you begin, you must define your connection to Snowflake.
Note
Snowflake is actively evolving the connection definition mechanism to make it easier and safer to define your connections. These instructions will change before the public release of SnowAPI.
You can define your connection information in a dictionary in your code like so:
>>> CONNECTION_PARAMETERS = dict(
...     account='your account name',
...     user='your user name',
...     password='your password',
...     database='database name',
...     schema='schema name',
...     role='role if needed',
...     warehouse='warehouse name',
... )
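Rather than hard-coding credentials, a common pattern is to read them from environment variables. This is a sketch of that pattern, not a SnowAPI requirement; the `SNOWFLAKE_*` variable names below are illustrative assumptions, and the fallback strings mirror the placeholders above.

```python
import os

# Read each connection parameter from an environment variable,
# falling back to a placeholder when the variable is not set.
CONNECTION_PARAMETERS = dict(
    account=os.environ.get("SNOWFLAKE_ACCOUNT", "your account name"),
    user=os.environ.get("SNOWFLAKE_USER", "your user name"),
    password=os.environ.get("SNOWFLAKE_PASSWORD", "your password"),
    database=os.environ.get("SNOWFLAKE_DATABASE", "database name"),
    schema=os.environ.get("SNOWFLAKE_SCHEMA", "schema name"),
    role=os.environ.get("SNOWFLAKE_ROLE", "role if needed"),
    warehouse=os.environ.get("SNOWFLAKE_WAREHOUSE", "warehouse name"),
)
```

Keeping credentials out of source code makes them easier to rotate and keeps them out of version control.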
You then use this connection information to create a session:
>>> from snowflake.snowpark import Session
>>> session = Session.builder.configs(CONNECTION_PARAMETERS).create()
The root resource¶
The SnowAPI is organized as a tree of resources, modeling the resources available in the Snowflake REST API. You’ll first need to instantiate the root of this resource tree in order to interact with other Snowflake objects:
>>> from snowflake.core import Root
>>> root = Root(session)
When instantiating the root object, you need to pass in either a session, as shown here, or a SnowflakeConnection object.
Local references vs Snowflake snapshots¶
There are two important concepts that help you understand how the SnowAPI library functions. Python objects that you create and interact with are either local references (a.k.a. handles) or snapshots of state stored on Snowflake. This separation is made because communicating with Snowflake incurs costs: network activity and performance costs, but also usage credit costs. For this reason, SnowAPI exposes explicit operations when it talks to Snowflake.
In general, if you act on or retrieve information from Snowflake, you do so through a local, in-memory reference object. These references do not synchronize with Snowflake until you call a given method. Calling a method is a good indication of communication with Snowflake, which incurs some or all of the costs outlined above. Other operations on in-memory references incur no such costs.
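The reference-vs-snapshot split can be sketched in plain Python. The classes below are a minimal illustration of the pattern, not the actual SnowAPI classes: creating the handle costs nothing, and only `fetch()` performs the (simulated) round trip to the server.

```python
class TaskSnapshot:
    """An immutable copy of server-side state as of fetch time."""
    def __init__(self, name, schedule):
        self.name = name
        self.schedule = schedule

class TaskReference:
    """A local handle: cheap to create; only fetch() talks to the server."""
    def __init__(self, name, server):
        self.name = name      # stored exactly as you typed it
        self._server = server

    def fetch(self):
        # In SnowAPI, this is the point where network I/O (and any
        # usage-credit cost) would occur.
        state = self._server[self.name.upper()]
        return TaskSnapshot(name=state["name"], schedule=state["schedule"])

# Simulated server-side store (Snowflake upper-cases unquoted names).
server = {"TASK1": {"name": "TASK1", "schedule": 3600}}

ref = TaskReference("task1", server)  # purely local, no cost
snap = ref.fetch()                    # one simulated round trip
print(ref.name, snap.name)            # task1 TASK1
```

The snapshot is frozen at fetch time: later changes on the "server" are not visible until you fetch again, which is exactly the behavior described below.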
We’ll call out these differences as we explore more examples.
Example: create a task in Snowflake¶
In this example, we’ll step through creating a task in Snowflake, providing its definition, naming it, pushing it to Snowflake, as well as suspending, resuming, and deleting the task.
In order to create a task, we first create a task object describing the task. We’ll give it a name, a definition, and a schedule to run. Because this is a local reference, this task definition lives only on the client side; it has not been pushed to Snowflake.
>>> from snowflake.core.task import Task
>>> from datetime import timedelta
>>> task_definition = Task(
...     'task1', definition='select 1', schedule=timedelta(hours=1))
This task_definition object isn’t a local reference per se; it’s simply a Python object that holds a description of the task, which can be pushed to Snowflake in order to create the task.
This task definition must be pushed into Snowflake before it can execute. As you can see from the SQL CREATE TASK documentation, task creation requires a schema. In Python terms, this means that creating a new task must happen relative to an existing schema object. Schemas in turn are relative to a database.
Before we can create the task in Snowflake, we need to build a local path to the database and schema where the task will be created. This is done by using mapping-like syntax in various collections, rooted at the root object.

Here’s how we build a path to a schema named “public” in the database named “my_db”. Note most importantly that this is purely local; no communication with Snowflake occurs.
>>> schema = root.databases['my_db'].schemas['public']
>>> print(schema.name)
public
>>> print(schema.database.name)
my_db
As you can see, these are just the local reference objects for the database and schema, and accessing their attributes simply echoes the names back to you as you gave them. Now we’re ready to push this task definition into Snowflake:
>>> task_reference = schema.tasks.create(task_definition)
This method call communicates with Snowflake.
The returned task_reference is a local reference object: it holds a handle to the task created in Snowflake, but its properties haven’t been retrieved from Snowflake yet. To retrieve them, we have to make a call on the reference object, which returns a “snapshot” object that does contain the properties of the newly created task. You’ll see this pattern often: fetching the snapshot communicates with Snowflake and retrieves the properties of the referenced object.
>>> task_reference.name
'task1'
>>> task = task_reference.fetch()
>>> task.name
'TASK1'
Notice that the name on the reference object is lower case while the name on the task snapshot object is upper case. That’s because Snowflake stores the task name in upper case, and the .fetch() call synced this value to the local snapshot object.
Note
Be aware that this snapshot is only guaranteed to be current as of the moment the .fetch() executes. After that, changes to underlying Snowflake objects are not reflected in the local snapshot objects. You must call .fetch() again to re-sync the data locally.
The snapshot task object has additional attributes on it, such as the task schedule:
>>> task.schedule
datetime.timedelta(seconds=3600)
and the task’s schema name (notice the upper case again here):
>>> task.schema_name
'PUBLIC'
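Since the schedule round-trips as a standard library timedelta, you can inspect and compare it with ordinary datetime arithmetic. This is plain Python, independent of SnowAPI:

```python
from datetime import timedelta

# The one-hour schedule from the task definition above.
schedule = timedelta(hours=1)

# timedelta normalizes its arguments, so hours=1 and seconds=3600
# are the same value.
print(schedule.total_seconds())             # 3600.0
print(schedule == timedelta(seconds=3600))  # True
```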
The task reference object has some additional attributes and methods too. You can get the task’s fully qualified name:
>>> task_reference.fully_qualified_name
'my_db.public.task1'
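As the value above suggests, the fully qualified name is the database, schema, and object names joined with dots. A minimal sketch, using the names from this example:

```python
# Compose a fully qualified name from its parts (illustrative only;
# SnowAPI computes this for you via fully_qualified_name).
database, schema, task = "my_db", "public", "task1"
fqn = ".".join([database, schema, task])
print(fqn)  # my_db.public.task1
```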
You can also execute, resume, and suspend the task by using methods on the task reference object:
>>> task_reference.resume()
>>> task_reference.suspend()
>>> task_reference.execute()
You can also delete the task by using the task reference object.
>>> task_reference.delete()
Of course, once you’ve done this, you can’t resume, suspend, or execute the task, because it no longer exists. You also can’t call .fetch() to get the latest snapshot object. All of these operations raise a 404 Not Found exception (note that the full traceback is omitted here):
>>> task_reference.execute()
...
snowflake.core.exceptions.NotFoundError: (404)
Reason: None
HTTP response headers: HTTPHeaderDict({'Content-Type': 'application/json'})
HTTP response body: {"error_code": "404", ... }
- snowflake.core
- snowflake.core.paging
- snowflake.core.exceptions
- snowflake.core.database
- snowflake.core.schema
- snowflake.core.task
- snowflake.core.task.context
- snowflake.core.task.dagv1
- snowflake.core.compute_pool
- snowflake.core.image_repository
- snowflake.core.service
- snowflake.core.table
- snowflake.core.warehouse