Snowpark Container Services: Working with compute pools

A compute pool is a collection of one or more virtual machine (VM) nodes on which Snowflake runs your Snowpark Container Services jobs and services. You create a compute pool using the CREATE COMPUTE POOL command. You then specify it when creating a service or executing a job.

Creating a compute pool

A compute pool is an account-level construct, analogous to a Snowflake virtual warehouse. The naming scope of the compute pool is your account. That is, you cannot have multiple compute pools with the same name in your account.

The minimum information required to create a compute pool includes the following:

  • The machine type to provision for the compute pool nodes

  • The minimum number of nodes to launch the compute pool with

  • The maximum number of nodes the compute pool can scale to (Snowflake manages the scaling.)

If you expect a substantial load or sudden bursts of activity on the services you intend to run within your compute pool, you can set a minimum node count greater than 1. This approach ensures that additional nodes are readily available when needed, instead of waiting for autoscaling to start.

Setting a maximum node limit prevents an unexpectedly large number of nodes from being added to your compute pool by Snowflake autoscaling. This can be crucial in scenarios such as unexpected load spikes or issues in your code that might cause Snowflake to allocate a larger number of compute pool nodes than originally planned.
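
As a hedged illustration of the sizing guidance above, the following sketch creates a pool that keeps two nodes provisioned for bursts and caps autoscaling at five nodes. The pool name burst_ready_pool and the node counts are hypothetical; choose values that fit your workload and the limits of the instance family.

```sql
-- Hypothetical pool name and node counts, for illustration only.
CREATE COMPUTE POOL burst_ready_pool
  MIN_NODES = 2                  -- two nodes are provisioned up front, ready for bursts
  MAX_NODES = 5                  -- autoscaling never grows the pool beyond five nodes
  INSTANCE_FAMILY = CPU_X64_S;
```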

The following CREATE COMPUTE POOL command creates a one-node compute pool:

CREATE COMPUTE POOL tutorial_compute_pool
  MIN_NODES = 1
  MAX_NODES = 1
  INSTANCE_FAMILY = CPU_X64_XS;

INSTANCE_FAMILY identifies the type of machine you want to provision for compute nodes in the compute pool. Specifying INSTANCE_FAMILY when creating a compute pool is similar to specifying warehouse size (XSMALL, SMALL, MEDIUM, LARGE, and so on) when creating a warehouse. The following table lists the available machine types.

Consumption Table Mapping  INSTANCE_FAMILY  vCPU  Memory (GiB)  Storage (GiB)  GPU            GPU Memory per GPU (GiB)  Max. Limit  Description
CPU | XS                   CPU_X64_XS       2     8             250            n/a            n/a                       50          Smallest instance available for Snowpark Containers. Ideal for cost-savings and getting started.
CPU | S                    CPU_X64_S        4     16            250            n/a            n/a                       50          Ideal for hosting multiple services/jobs while saving cost.
CPU | M                    CPU_X64_M        8     32            250            n/a            n/a                       20          Ideal for having a full stack application or multiple services
CPU | L                    CPU_X64_L        32    128           250            n/a            n/a                       20          For applications which need an unusually large number of CPUs, memory and Storage.
High-Memory CPU | S        HIGHMEM_X64_S    8     64            250            n/a            n/a                       20          For memory intensive applications.
High-Memory CPU | M        HIGHMEM_X64_M    32    256           250            n/a            n/a                       20          For hosting multiple memory intensive applications on a single machine.
High-Memory CPU | L        HIGHMEM_X64_L    128   1024          250            n/a            n/a                       20          Largest high-memory machine available for processing large in-memory data.
GPU | S                    GPU_NV_S         8     32            250            1 NVIDIA A10G  24                        10          Our smallest NVIDIA GPU size available for Snowpark Containers to get started.
GPU | M                    GPU_NV_M         48    192           250            4 NVIDIA A10G  24                        5           Optimized for intensive GPU usage scenarios like Computer Vision or LLMs/VLMs
GPU | L                    GPU_NV_L         192   2048          250            8 NVIDIA A100  40                        On request  Largest GPU instance for specialized and advanced GPU cases like LLMs and Clustering etc.

For information about available instance families, see CREATE COMPUTE POOL.

Autoscaling of compute pool nodes

After you create a compute pool, Snowflake launches the minimum number of nodes and then automatically adds nodes as needed, up to the maximum allowed. This is called autoscaling. New nodes are allocated when the running nodes cannot take any additional workload. For example, suppose that two service instances are running on two nodes within your compute pool. If you execute another service within the same compute pool, the additional resource requirements might cause Snowflake to start an additional node.

Conversely, if no services or jobs run on a node for a specific duration, Snowflake automatically removes the node, but it never shrinks the compute pool below the minimum number of nodes.

Managing a compute pool

Snowpark Container Services provides the following commands to manage compute pools:

  • Monitoring: Use the SHOW COMPUTE POOLS command to get information about compute pools.

  • Operating: Use the ALTER COMPUTE POOL command to change the state of a compute pool.

    ALTER COMPUTE POOL <name> { SUSPEND | RESUME | STOP ALL }
    

    When you suspend a compute pool, Snowflake suspends all services, but the jobs continue to run until they reach a terminal state (DONE or FAILED), after which the compute pool nodes are released.

    A suspended compute pool must be resumed before you can start a new service or a job. If the compute pool is configured to auto-resume (with the AUTO_RESUME property set to TRUE), Snowflake automatically resumes the pool when a service or job is submitted to it. Otherwise, you need to run the ALTER COMPUTE POOL command to manually resume the compute pool.

  • Modifying: Use the ALTER COMPUTE POOL command to change compute pool properties.

    ALTER COMPUTE POOL <name> SET propertiesToAlter
    propertiesToAlter := { MIN_NODES | MAX_NODES | AUTO_RESUME | AUTO_SUSPEND_SECS | COMMENT }
    

    When you decrease MAX_NODES, note the following potential effects:

    • Snowflake might need to terminate one or more service instances and restart them on other available nodes in the compute pool. If MAX_NODES is set too low, Snowflake might be unable to schedule certain service instances.

    • If a terminated node had a job execution in progress, the job execution fails, and Snowflake does not restart the job.

    Example:

    ALTER COMPUTE POOL MYPOOL SET MIN_NODES = 2 MAX_NODES = 2;
      
  • Removing: Use the DROP COMPUTE POOL command to remove a compute pool.

    Example:

    DROP COMPUTE POOL <name>
    

    You must stop all running services before you can drop a compute pool.

  • Listing compute pools and viewing properties: Use SHOW COMPUTE POOLS and DESCRIBE COMPUTE POOL commands. For examples, see Show Compute Pools.
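
The management commands above can be sketched as one short session. This is a minimal sketch, not a recommended procedure: the pool name my_pool is hypothetical, and it assumes the current role holds the privileges each statement needs.

```sql
-- Hypothetical pool name my_pool, for illustration only.

-- Monitoring: check the pool's current state and properties.
SHOW COMPUTE POOLS LIKE 'my_pool';
DESCRIBE COMPUTE POOL my_pool;

-- Operating: suspend the pool, then resume it manually.
ALTER COMPUTE POOL my_pool SUSPEND;
ALTER COMPUTE POOL my_pool RESUME;

-- Modifying: let the pool resume automatically when work is submitted.
ALTER COMPUTE POOL my_pool SET AUTO_RESUME = TRUE;

-- Removing: stop the services and jobs in the pool, then drop it.
ALTER COMPUTE POOL my_pool STOP ALL;
DROP COMPUTE POOL my_pool;
```

With AUTO_RESUME set to TRUE, the manual RESUME step is needed only if you want nodes provisioned before you submit work.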

Compute pool lifecycle

A compute pool can be in any of the following states:

  • IDLE: The compute pool has the desired number of virtual machine (VM) nodes, but no services or jobs are scheduled. In this state, autoscaling can shrink the compute pool to the minimum size due to lack of activity.

  • ACTIVE: The compute pool has at least one service running or scheduled to run on it. The pool can grow (up to the maximum nodes) or shrink (down to the minimum nodes) in response to load or user actions.

  • SUSPENDED: The pool currently contains no running virtual machine nodes, but if the AUTO_RESUME compute pool property is set to TRUE, the pool will automatically resume when a service or job is scheduled.

The following states are transient:

  • STARTING: When you create or resume a compute pool, the compute pool enters the STARTING state until at least one node is provisioned.

  • STOPPING: When you suspend a compute pool (using ALTER COMPUTE POOL), the compute pool enters the STOPPING state until Snowflake has released all nodes in the compute pool. When you suspend a compute pool, Snowflake suspends all services, but the jobs continue to run until they reach a terminal state (DONE or FAILED), after which the compute pool nodes are released.

  • RESIZING: When you create a compute pool, it initially enters the STARTING state. After one node is provisioned, it enters the RESIZING state until the minimum number of nodes (as specified in CREATE COMPUTE POOL) is provisioned. When you change a compute pool (ALTER COMPUTE POOL) and update the minimum and maximum node values, the pool enters the RESIZING state until the minimum number of nodes is provisioned. Note that autoscaling of a compute pool also puts the compute pool in the RESIZING state.
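
To see which of these states a pool is in, one option is to read the SHOW COMPUTE POOLS output as a table. The following is a sketch under two assumptions: the pool name my_pool is hypothetical, and the SHOW output is assumed to expose "name" and "state" columns.

```sql
-- Hypothetical pool name my_pool, for illustration only.
SHOW COMPUTE POOLS LIKE 'my_pool';

-- Read the state from the SHOW output. Column names in SHOW results are
-- lowercase, so they are double-quoted here. The "name" and "state"
-- columns are an assumption about the output.
SELECT "name", "state"
  FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
```

Because STARTING, STOPPING, and RESIZING are transient, repeated checks may show the pool moving through them before it settles in IDLE, ACTIVE, or SUSPENDED.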

Compute pool privileges

When you work with compute pools, the following privilege model applies:

  • To create a compute pool in an account, the current role needs the CREATE COMPUTE POOL privilege on the account. If you create a pool, as the owner you have the OWNERSHIP privilege, which grants full control over that compute pool. Having OWNERSHIP of one compute pool does not imply any privileges on other compute pools.

  • For compute pool management, the following privileges (capabilities) are supported:

    Privilege           Usage
    MODIFY              Enables altering any compute pool properties, including changing the size.
    MONITOR             Enables viewing compute pool usage, including describing compute pool properties.
    OPERATE             Enables changing the state of the compute pool (suspend, resume). In addition, enables stopping any scheduled services and jobs.
    USAGE               Enables creating services and jobs in the compute pool. Note that when a compute pool is in a suspended state and has its AUTO_RESUME property set to true, a role with USAGE permission on the compute pool can implicitly trigger the compute pool’s resumption when they start or resume a service, even if the role lacks the OPERATE permission.
    OWNERSHIP           Grants full control over the compute pool. Only a single role can hold this privilege on a specific object at a time.
    ALL [ PRIVILEGES ]  Grants all privileges, except OWNERSHIP, on the compute pool.
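
As an illustration of how these privileges might be divided, the following sketch grants different privileges to two roles. The role names service_dev_role and pool_ops_role are hypothetical, and the split between them is only one possible arrangement.

```sql
-- Hypothetical role names, for illustration only.

-- A role that creates services and jobs in the pool and views its properties.
GRANT USAGE, MONITOR ON COMPUTE POOL tutorial_compute_pool TO ROLE service_dev_role;

-- A role that suspends and resumes the pool and changes its properties.
GRANT OPERATE, MODIFY ON COMPUTE POOL tutorial_compute_pool TO ROLE pool_ops_role;
```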

Compute pool maintenance

As part of routine internal infrastructure maintenance, Snowflake performs updates on its older compute pools. This process involves essential tasks such as operating system upgrades, driver enhancements, and the resolution of any security vulnerabilities present in the compute pool nodes.

During these maintenance procedures, compute pool nodes are taken offline and replaced with updated nodes periodically (every few weeks). A compute pool node remains active for a maximum of a month, after which Snowflake will retire the node and replace it with a newly updated one. When this occurs:

  • All service instances currently running in these compute pools will be automatically recreated on the new nodes.

  • All ongoing jobs will experience disruption and will need to be restarted by customers after the compute pool is operational again.

The expected maintenance window is approximately 30 minutes.

These internal infrastructure updates take place every Tuesday, Wednesday, Thursday, and Friday during the hours of 8 am to 1 pm local time relative to the region.

How services are scheduled on a compute pool

At the time of creating a service, you might choose to run multiple instances to manage incoming load. Snowflake uses the following general guidelines when scheduling your service instances on compute pool nodes:

  • All containers in a service instance always run on a single compute pool node. That is, a service instance never spans across multiple nodes.

  • When you run multiple service instances, Snowflake may run these service instances on the same node or on different nodes within the compute pool. When making this decision, Snowflake considers any specified hard resource requirements (such as memory and GPU) as outlined in the service specification file (see the containers.resources field).

    For example, suppose each node in your compute pool provides 8 GB of memory. If your service specification includes a 6-GB memory requirement, and you choose to run two instances when creating a service, Snowflake cannot run both instances on the same node. In this case, Snowflake schedules each instance on a separate node within the compute pool to fulfill the memory requirements.
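
The memory example above could be expressed roughly as follows. This is a sketch rather than a definitive specification: the service name, container name, image path, and pool name my_pool are hypothetical, and it assumes an inline specification with a containers.resources.requests.memory setting together with the MIN_INSTANCES and MAX_INSTANCES properties of CREATE SERVICE.

```sql
-- Hypothetical service, container, image, and pool names, for illustration only.
-- Assumes my_pool has at least two nodes, each providing 8 GB of memory.
CREATE SERVICE memory_heavy_service
  IN COMPUTE POOL my_pool
  FROM SPECIFICATION $$
spec:
  containers:
  - name: main
    image: /my_db/my_schema/my_repository/my_image:latest
    resources:
      requests:
        memory: 6G    # hard requirement: each instance needs 6 GB
$$
  MIN_INSTANCES = 2   -- two 6 GB instances cannot share one 8 GB node
  MAX_INSTANCES = 2;
```

Under these assumptions, each instance is placed on its own node, so the pool needs a MAX_NODES value of at least 2 for both instances to be scheduled.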

Guidelines and limitations

  • CREATE COMPUTE POOL permission: If you cannot create a compute pool under the current role, use the ACCOUNTADMIN role to grant permission. For example:

    GRANT CREATE COMPUTE POOL ON ACCOUNT TO ROLE <role_name> [WITH GRANT OPTION];
    

    For more information, see GRANT <privileges>.

  • Per-account limit on compute pool nodes: For any given instance family (see CREATE COMPUTE POOL), there is a per-account limit on the total number of nodes that can be active at any time. If you see an error message like "Requested number of nodes <#> exceeds the node limit for the account", you have encountered this limit. For more information, contact your account representative.