Packages Policies

Introduction

Using a packages policy enables you to set allowlists and blocklists for third-party Python packages from Anaconda at the account level. This lets you meet stricter auditing and security requirements and gives you more fine-grained control over which packages are available or blocked in your environment.

For more information about how Snowpark Python allows you to bring in third-party packages from Anaconda, see Using Third-Party Packages.

When you create a Python UDF, UDTF or stored procedure, the allowlist and blocklist will be taken into account when creating the Python environment. The allowlist and blocklist will apply to all packages that are required to create the Python environment. If it’s not possible to create an environment with the specified packages, the query will fail.

When you execute a Python UDF, UDTF or stored procedure, Snowflake will check the allowlist and blocklist and make sure that all of the packages are allowed by the packages policy. Otherwise, the query will fail.

Limitations

  • Packages policies will only apply if the Anaconda legal terms have been accepted.

  • Packages policies will not be applied for built-in functions and will also not be applied for native apps.

Implementing and Using a Packages Policy

In order to create a packages policy object, you must have the following privileges:

  • USAGE on the database and schema in which you plan to create the packages policy.

  • CREATE PACKAGES POLICY on the schema in which you plan to create the packages policy.

After the packages policy object is created, you must have the following privileges to apply it to the account:

  • OWNERSHIP on the packages policy object.

  • APPLY PACKAGES POLICY on the account.

Follow these steps to implement a packages policy.

Step 1: Create a Packages Policy Admin Custom Role

Create a custom role that allows users to create and manage packages policies. Throughout this topic, the example custom role is named policy_admin, although the role could have any appropriate name.

If the custom role already exists, continue to the next step.

Otherwise, create the policy_admin custom role.

use role useradmin;

create role policy_admin;
Copy

Step 2: Grant Privileges to the policy_admin Custom Role

If the policy_admin custom role does not already have the following privileges, grant these privileges as shown below:

  • USAGE on the database and schema that will contain the packages policy.

  • CREATE PACKAGES POLICY on the schema that will contain the packages policy.

  • APPLY PACKAGES POLICY on the account.

use role securityadmin;

grant usage on database yourdb to role policy_admin;

grant usage, create packages policy on schema yourdb.yourschema to role policy_admin;

grant apply packages policy on account to role policy_admin;
Copy

Step 3: Create a New Packages Policy

Using the policy_admin custom role, create a new packages policy, with a language, allowlist, and blocklist specified. ALLOWLIST, BLOCKLIST, ADDITIONAL_CREATION_BLOCKLIST, and COMMENT are optional parameters. By default, the allowlist value is ('*'), and the blocklist value is ().

If a package is specified in both the allowlist and the blocklist, then the blocklist takes precedence. You must explicitly add the Python runtime version in the allowlist and you must also explicitly add all packages and underlying dependencies of a parent package to the allowlist.

You can specify a particular package version or a range of versions by using these version specifiers in the allowlist or blocklist: : ==, <=, >=, <,or >. For example, numpy>=1.2.3. You can use wildcards, such as, numpy==1.2.*, which means any micro version of numpy 1.2.

Note

Currently, in an allowlist or blocklist, only one range operator can be specified per package. Specifying multiple range operators is not supported, for example pkg>1.0, <1.5. Because of this, to configure a policy to allow an interval of a package version, set one side of the range in the allowlist and the other side of the range in the blocklist. For example, to allow package versions greater than 1.0 and less than 1.5, set the allowlist to pkg>1.0 and the blocklist to pkg>1.5.

USE ROLE policy_admin;

CREATE PACKAGES POLICY yourdb.yourschema.packages_policy_prod_1
  LANGUAGE PYTHON
  ALLOWLIST = ('numpy', 'pandas==1.2.3', ...)
  BLOCKLIST = ('numpy==1.2.3', 'bad_package', ...)
  ADDITIONAL_CREATION_BLOCKLIST = ('bad_package2', 'bad_package3', ...)
  COMMENT = 'Packages policy for the prod_1 environment'
;
Copy

Where:

yourdb.yourschema.packages_policy_prod_1

The fully qualified name of the packages policy.

LANGUAGE PYTHON

The language that this packages policy will apply to.

ALLOWLIST = ('numpy', 'pandas==1.2.3', ...)

The allowlist for this packages policy. This is a comma-separated string with package specs.

BLOCKLIST = ('numpy==1.2.3', 'bad_package', ...)

The blocklist for this packages policy. This is a comma-separated string with package specs.

ADDITIONAL_CREATION_BLOCKLIST = ('bad_package2', 'bad_package3', ...)

Specifies a list of package specs that are blocked at creation time. To unset this parameter, specify an empty list. If the ADDITIONAL_CREATION_BLOCKLIST is set, it is appended to the basic BLOCKLIST at the creation time. For temporary UDFs and anonymous stored procedures, the ADDITIONAL_CREATION_BLOCKLIST is appended to the BLOCKLIST at both creation and execution time.

COMMENT = 'Packages policy for the prod_1 environment'

A comment specifying the purpose of the packages policy.

In the example above, the blocklist applied for the creation time will be the ADDITIONAL_CREATION_BLOCKLIST plus the BLOCKLIST so the blocked packages will be numpy==1.2.3, bad_package, bad_package2 and bad_package3. The blocklist applied for the execution will be: numpy==1.2.3 and bad_package. For temporary UDFs and anonymous stored procedures, the blocklist containing numpy==1.2.3, bad_package, bad_package2 and bad_package3 will be applied at both creation and execution time.

To get a list of the dependencies of a Python package, use the SHOW_PYTHON_PACKAGES_DEPENDENCIES function. The first parameter is the Python runtime version you are using and the second is a list of the packages to show dependencies for. For example, to show the dependencies of the numpy package, use this command.

USE ROLE ACCOUNTADMIN;

select SNOWFLAKE.SNOWPARK.SHOW_PYTHON_PACKAGES_DEPENDENCIES('3.8', ['numpy']);
Copy

The result is a list of the dependencies and their versions.

['_libgcc_mutex==0.1', '_openmp_mutex==5.1', 'blas==1.0', 'ca-certificates==2023.05.30', 'intel-openmp==2021.4.0',
'ld_impl_linux-64==2.38', 'ld_impl_linux-aarch64==2.38', 'libffi==3.4.4', 'libgcc-ng==11.2.0', 'libgfortran-ng==11.2.0',
'libgfortran5==11.2.0', 'libgomp==11.2.0', 'libopenblas==0.3.21', 'libstdcxx-ng==11.2.0', 'mkl-service==2.4.0',
'mkl==2021.4.0', 'mkl_fft==1.3.1', 'mkl_random==1.2.2', 'ncurses==6.4', 'numpy-base==1.24.3', 'numpy==1.24.3',
'openssl==3.0.10', 'python==3.8.16', 'readline==8.2', 'six==1.16.0', 'sqlite==3.41.2', 'tk==8.6.12', 'xz==5.4.2', 'zlib==1.2.13']

To show the dependencies of Python 3.8 within Snowpark environment, call the function without specifying any packages.

select SNOWFLAKE.SNOWPARK.SHOW_PYTHON_PACKAGES_DEPENDENCIES('3.8', []);
Copy

If you want to know which packages a function is using, you can use DESCRIBE FUNCTION to print them out. This is an alternative way to identify all of the dependencies of a package. To do this, create a function and in the package specification, provide the top level packages. Next, use DESCRIBE FUNCTION to get a list of all of the packages and their dependencies. You can copy and paste this list into the package allowlist. Note that the packages policy must be temporarily unset or some packages might be blocked. The following example shows how to find the dependencies for the ‘snowflake-snowpark-python’ package.

create or replace function my_udf()
returns string
language python
packages=('snowflake-snowpark-python')
runtime_version=3.10
handler='echo'
as $$
def echo():
  return 'hi'
$$;

describe function my_udf();
Copy

If you want to show all of the packages and versions that are available, query the INFORMATION_SCHEMA.PACKAGES view.

select * from information_schema.packages;
Copy

If you want to see the current set of packages you are using, you can use this SQL statement.

-- at the database level

CREATE OR REPLACE VIEW USED_ANACONDA_PACKAGES
ASSELECT FUNCTION_NAME, VALUE PACKAGE_NAME
FROM (SELECT FUNCTION_NAME,PARSE_JSON(PACKAGES)
PACKAGES FROM INFORMATION_SCHEMA.FUNCTIONS
WHERE FUNCTION_LANGUAGE='PYTHON') USED_PACKAGES,LATERAL FLATTEN(USED_PACKAGES.PACKAGES);

-- at the account level

CREATE OR REPLACE VIEW ACCOUNT_USED_ANACONDA_PACKAGES
AS SELECT FUNCTION_CATALOG, FUNCTION_SCHEMA, FUNCTION_NAME, VALUE PACKAGE_NAME
FROM (SELECT FUNCTION_CATALOG, FUNCTION_SCHEMA, FUNCTION_NAME,PARSE_JSON(PACKAGES)
PACKAGES FROM SNOWFLAKE.ACCOUNT_USAGE.FUNCTIONS
WHERE FUNCTION_LANGUAGE='PYTHON') USED_PACKAGES,LATERAL FLATTEN(USED_PACKAGES.PACKAGES);
Copy

To get a list of third-party packages that are available from Anaconda, use the GET_ANACONDA_PACKAGES_REPODATA function. The parameter is the architecture, which can be: linux-64, linux-aarch64, osx-64, osx-arm64, win-64, or noarch.

For example, to show the list of third-party packages from Anaconda for the linux-64 archtecture, use this command.

USE ROLE ACCOUNTADMIN;

select SNOWFLAKE.SNOWPARK.GET_ANACONDA_PACKAGES_REPODATA('linux-64');
Copy

Step 4: Set the Packages Policy on an Account

Using the policy_admin custom role, set the policy on an account with the ALTER ACCOUNT command.

use role policy_admin;

alter account set packages policy yourdb.yourschema.packages_policy_prod_1;
Copy

Note

To replace a packages policy that is already set for an account, unset the packages policy first and then set the new packages policy for the account. Alternatively, you can use FORCE to set the packages policy without having to unset the packages policy. For example:

alter account set packages policy yourdb.yourschema.packages_policy_prod_2 force;
Copy

If you want to see which policy is active on the account, you can use this SQL statement.

select * from table(information_schema.policy_references(ref_entity_domain=>'ACCOUNT', ref_entity_name=>'<your_account_name>'))
Copy

The result of this query will display a column with the name POLICY_STATUS.

Later, if you want to unset the package policy on your account, use this SQL statement.

alter account unset packages policy;
Copy

Privileges Required to Execute DDL Commands

The following table summarizes the relationship between the packages policy DDL operations and their necessary privileges.

Operation

Privilege required

Create packages policy

A role with the CREATE PACKAGES POLICY privilege on the schema.

Alter packages policy

A role with the OWNERSHIP privilege on the packages policy.

Drop packages policy

A role with the OWNERSHIP privilege on the packages policy.

Describe packages policy

A role with the OWNERSHIP or USAGE privilege on the packages policy.

Show packages policies

A role with the OWNERSHIP or USAGE privilege on the packages policy.

Set & unset packages policy

A role with the APPLY PACKAGES POLICY privilege on the account and the OWNERSHIP privilege on the packages policy.

Packages Policy DDL

Snowflake provides the following DDL commands to manage packages policy objects:

Packages Policy Observability

Users who do not have access to the packages policy that is set on the account are able to see the contents of it.

Users can control who sees the contents of the packages policy by adding the USAGE privilege to the packages policies. The account administrator or packages policy owner can grant this privilege to roles that need to use packages policies.

The CURRENT_PACKAGES_POLICY Information Schema view displays a row for each Snowpark packages policy on the current account.

select * from information_schema.current_packages_policy;
Copy
+------+----------+-----------+-----------+-------------------------------+---------+
| NAME | LANGUAGE | ALLOWLIST | BLOCKLIST | ADDITIONAL_CREATION_BLOCKLIST | COMMENT |
+------+----------+-----------+-----------+-------------------------------+---------+
| P1   | PYTHON   | ['*']     | []        | [NULL]                        | [NULL]  |
+------+----------+-----------+-----------+-------------------------------+---------+

To see the Anaconda packages that are used at the database level for function, use this SQL statement.

USE DATABASE mydb;

CREATE OR REPLACE VIEW USED_ANACONDA_PACKAGES
AS
SELECT FUNCTION_NAME, VALUE PACKAGE_NAME
FROM (SELECT FUNCTION_NAME,PARSE_JSON(PACKAGES)
PACKAGES FROM INFORMATION_SCHEMA.FUNCTIONS
WHERE FUNCTION_LANGUAGE='PYTHON') USED_PACKAGES,LATERAL FLATTEN(USED_PACKAGES.PACKAGES);
Copy

To see the Anaconda packages that are used at the account level for function, use this SQL statement.

USE DATABASE mydb;

CREATE OR REPLACE VIEW ACCOUNT_USED_ANACONDA_PACKAGES
AS
SELECT  FUNCTION_CATALOG, FUNCTION_SCHEMA, FUNCTION_NAME, VALUE PACKAGE_NAME
FROM (SELECT FUNCTION_CATALOG, FUNCTION_SCHEMA, FUNCTION_NAME,PARSE_JSON(PACKAGES)
PACKAGES FROM SNOWFLAKE.ACCOUNT_USAGE.FUNCTIONS
WHERE FUNCTION_LANGUAGE='PYTHON') USED_PACKAGES,LATERAL FLATTEN(USED_PACKAGES.PACKAGES);
Copy

To see all of the installed Anaconda packages on your account, use this SQL statement.

USE DATABASE mydb;

CREATE OR REPLACE VIEW ACCOUNT_USED_ANACONDA_PACKAGES
AS
SELECT 'FUNCTION' TYPE, FUNCTION_CATALOG DATABASE, FUNCTION_SCHEMA SCHEMA, FUNCTION_NAME NAME, VALUE::STRING PACKAGE_NAME
FROM (SELECT FUNCTION_CATALOG, FUNCTION_SCHEMA, FUNCTION_NAME,PARSE_JSON(PACKAGES)
PACKAGES FROM SNOWFLAKE.ACCOUNT_USAGE.FUNCTIONS
WHERE FUNCTION_LANGUAGE='PYTHON' AND PACKAGES IS NOT NULL) USED_PACKAGES,LATERAL FLATTEN(USED_PACKAGES.PACKAGES)
UNION
(SELECT 'PROCEDURE' TYPE, PROCEDURE_CATALOG DATABASE, PROCEDURE_SCHEMA SCHEMA, PROCEDURE_NAME, VALUE::STRING PACKAGE_NAME
FROM (SELECT PROCEDURE_CATALOG, PROCEDURE_SCHEMA,PROCEDURE_NAME,PARSE_JSON(PACKAGES)
PACKAGES FROM SNOWFLAKE.ACCOUNT_USAGE.PROCEDURES
WHERE PROCEDURE_LANGUAGE='PYTHON' AND PACKAGES IS NOT NULL) USED_PACKAGES,LATERAL FLATTEN(USED_PACKAGES.PACKAGES));
Copy

Replication and Packages Policies

Packages policies are replicated from a source account to target accounts if the database containing the packages policy is replicated. For more information, see Dangling references and packages policies.