Introduction to External Functions

This topic describes external functions, which call executable code that is developed, maintained, stored, and executed outside Snowflake.

In this Topic:

See also:

CREATE EXTERNAL FUNCTION , CREATE API INTEGRATION

What is an External Function?

An external function calls code that executes outside Snowflake; the executed code is known as a remote service.

Users can write and call their own remote services, or call remote services written by third parties. These remote services can be written using any HTTP server stack, including cloud serverless compute services such as AWS Lambda.

From the perspective of a user running a SQL statement, an external function behaves like any other scalar function. A SQL statement performs the following actions:

  1. Calls the function, optionally passing parameters.

  2. Receives a value back from the function.

In SQL statements, external functions generally behave like UDFs (user-defined functions). For example, external functions follow these rules:

  • Inside Snowflake, an external function is represented as a database object. That object is created in a specific database and schema, and can be referenced using dot notation (e.g. MY_DATABASE.MY_SCHEMA.MY_EXTERNAL_FUNCTION()).

  • An external function can appear in any clause of a SQL statement in which other types of functions can appear (e.g. the WHERE clause).

  • The returned value can be a compound value, such as a VARIANT that contains JSON.

  • External functions can be overloaded; two different functions can have the same name but different signatures (different numbers or data types of input parameters).

  • An external function can be part of a more complex expression:

    select upper(zipcode_to_city_external_function(zipcode))
      from address_table;
    

Frequently-used Terms for External Functions

Below are definitions for frequently-used terms:

remote service

A remote service is stored and executed outside Snowflake, and returns a value. For example, remote services can be implemented as:

  • An AWS Lambda function.

  • An HTTPS server (e.g. Node.js) running on an EC2 instance.

To be called by the Snowflake external function feature, the remote service must:

  • Expose an HTTPS endpoint.

  • Accept JSON inputs and return JSON outputs.

    For more information about Snowflake-compatible HTTP headers and JSON formatted data, see Data Format.

external function

An external function is a Snowflake construct that makes a remote service appear as though it is inside Snowflake, so that the remote service can be called by a SQL statement. In other words, an external function makes a remote service look like a local function.

An external function is a set of information that Snowflake uses to call the remote service. This information includes the URL of the proxy service (for example, AWS API Gateway) that relays information to and from the remote service. This information is specified as part of the “CREATE EXTERNAL FUNCTION” command and is stored inside a database object.

user-defined function

A user-defined function (UDF) is a function written by a user, rather than by Snowflake. A UDF can be:

  • Internal: the executable code is stored and executed inside Snowflake.

  • External: an external function, as defined here (i.e. a construct that makes a remote service look like a local function).

proxy service

Snowflake does not call a remote service directly; instead, Snowflake calls a remote service indirectly via a proxy service. A proxy service relays a request or function call (including data) from one piece of software to another, for example from Snowflake to a remote service. Typically, it also relays the returned value(s) back to the caller. Using a proxy service increases flexibility and security.

The AWS API Gateway is an example of a proxy service.

Advantages of External Functions

External functions have the following advantages over other UDFs (user-defined functions):

  • The code for the remote service can be written in languages that other UDFs cannot be written in, including:

    • Python

    • Go

    • C#

  • Remote services can use functions and libraries that can’t be accessed by internal UDFs.

    • Remote services can interface with commercially available third-party libraries, such as machine-learning scoring libraries.

  • Developers can write remote services that can be called both from Snowflake and from other software written to use the same interface.

Limitations of External Functions

External functions have the following limitations:

  • Snowflake calls remote services indirectly through a cloud HTTP proxy service (such as the Amazon AWS API gateway), so the remote service for an external function must be callable from a proxy service. Fortunately, almost any function that can act as an HTTPS endpoint can be accessed as an external function via a proxy service. The function author must program the proxy service to call the remote service (e.g. a function running on AWS Lambda).

  • Currently, external functions cannot be shared with data consumers via Secure Data Sharing.

  • Because the remote service is opaque to Snowflake, the optimizer might not be able to perform some optimizations that it could perform for non-external functions.

  • Currently, external functions must be scalar functions. A scalar external function returns a single value for each input “row”.

  • External functions cannot be used in the following situations:

    • The DEFAULT clause of a CREATE TABLE statement. In other words, the default value for a column cannot be an expression that calls an external function. If you try to include an external function in a DEFAULT clause, then the CREATE TABLE statement fails.

    • COPY transformations.

  • The external functions feature supports only functions, not stored procedures.

Billing for External Functions Usage

Using external functions incurs normal costs associated with:

In addition, you might need to pay indirect or third-party charges, including:

  • Charges by the provider of the remote service.

Charges can vary from vendor to vendor.

Additional Considerations for External Functions

  • Before an external function can be called the first time, an administrator must do some configuration work. This work requires knowledge of the cloud platform (e.g. AWS), especially security.

  • Writing a custom remote service typically requires knowledge of topics such as HTTP and REST APIs.

  • External functions have more overhead than internal functions (both built-in and UDFs) and usually execute more slowly.

  • External functions can incur additional processing costs outside of Snowflake.

  • External functions can raise additional security issues. For example, if you call a third party’s function, that party could keep copies of the data passed to the function.

  • Some cloud platforms might have specific requirements. For example, on AWS, external functions require regional endpoints. For more details, see Supported Platforms.

Calling an External Function

Calling an external function is identical to calling any other user-defined function. External functions can appear in almost all clauses that other functions can appear in.

Here are examples of calling a scalar external function:

select my_external_function_2(column_1, column_2)
    from table_1;

select col1
    from table_1
    where my_external_function_3(col2) < 0
    ;

create view view1 (col1) as
    select my_external_function_5(col1)
        from table9;

Supported Platforms

You’ll need a platform and an account on that platform for the remote service and for the proxy service. In many cases, the platform and account for the remote services are the same as the platform and account for the proxy service. However, that is not required.

These platform(s) and account(s) are separate from your Snowflake platform and account.

Platforms for the Remote Service

You’ll need an HTTP server stack to host the remote service. Any HTTP server stack that can support the remote service should be compatible with external functions.

Snowflake provides instructions for creating a remote service as:

  • An AWS Lambda function.

  • A Microsoft Azure function.

Platforms for the Proxy Service

You’ll need an instance of a native HTTP proxy service on a cloud platform.

To configure your proxy service, you typically need:

  • An account with a cloud platform’s provider, for example Amazon AWS. This account provides storage and compute services for the proxy service. This account is separate from your Snowflake account.

  • A cloud platform role that has the privileges required to configure a proxy service. This cloud platform role is separate from your Snowflake role(s).

The following HTTPS proxy services are supported:

  • Amazon AWS API Gateway.

  • Microsoft Azure API Management Service.

Calling an external function from SQL is the same for all platforms. The SQL statements needed to configure access to these services are the same for all platforms. However, administrative details, such as copying your external function code to the service, are different on different platforms.

If your virtual warehouse is on Google Cloud Platform (GCP), you can create an external function that accesses a remote service through an AWS API Gateway or through the Azure API Management Service.

The sections below contain platform-specific information that users should be aware of before choosing a platform.

Amazon AWS API Gateway