Troubleshooting External Functions

This topic describes how to troubleshoot an external function.

Note that for some cloud platforms, the instructions for creating an external function on that cloud platform might have additional troubleshooting information.

In this Topic:

General Troubleshooting

Ensure that Arguments to the External Function Correspond to Arguments to the Remote Service

If you write your own remote service, and if you change the number, data types, or order of the arguments to the remote service, remember to make the corresponding changes to the external function. Currently, the ALTER EXTERNAL FUNCTION command does not have an option to change parameters, so you must drop and re-create the external function to change the arguments.

Performance

The following tips might help you debug performance problems:

See also the troubleshooting section on scalability.

Scalability

The following tips might help you debug scalability problems:

  • Use the Query Profile page page to see average latency per request.

  • Use the Query Profile page page to see how many times requests were retried due to transient errors, including those listed in the section titled Do Not Assume that the Remote Service is Passed Each Row Exactly Once.

  • Monitor your remote service resource usage to see how it scales to the load, and ensure that the remote service has enough capacity to serve peak load.

  • Utilize logging in the Amazon API Gateway or in the remote service to get per-request details.

  • Control the concurrency with which Snowflake sends requests to their remote service. For more details, see concurrency.

  • Return HTTP Response Code 429 from the remote service when it is overloaded. Return this as early as possible, rather than wait for latency to increase.

  • Take into account the proxy service timeout. For example, as of July 2020, the timeout for Amazon API Gateway is 30 seconds. Timeouts can be caused by various factors, including overloading of the remote service.

Snowflake attempts to retry transient errors/timeouts within a reasonable time, but if the service continues to be overloaded, and retries do not succeed, eventually the query is aborted.

See also the troubleshooting section on Performance.

Specific Symptoms

Platform-Independent Symptoms

Symptom: When trying to call the external function, you get the message “Error parsing JSON: Invalid response”

Possible Causes(s)

The most likely cause is that the JSON returned by the remote service (e.g. AWS Lambda function) is not constructed correctly.

Possible Solution(s)

Make sure that you return an array of arrays, with one inner array returned for each input row received. Review the description of the output format at Data Format Received by Snowflake.

Symptom: An error message saying that the format of the returned value is not JSON

Possible Cause(s)

One possible cause of this is that your return value includes double quotes inside the value.

Possible Solution(s)

Although JSON strings are delimited by double quotes, the string itself should not start and end with a quotation mark in most cases. If the embedded double quotes are incorrect, then remove them.

Symptom: An error message saying that the function receiving the wrong number of rows

Possible Cause(s)

The remote service probably tried to return more or fewer rows than it received. (Remember, even though the function is nominally scalar, it might receive multiple rows in the “body” field of the “event” parameter, and should return exactly as many rows as it received.)

Possible Solution(s)

Make sure that the remote service returns one row for each row that it receives.

AWS

Symptom: When you call your function from SQL, you get a message about row numbers being out of order

Possible Cause(s)

Remember that the row numbers you return should be monotonically ascending integers starting at 0. The input row numbers must also follow that rule, and each output should match the corresponding input row; e.g. the output in output row 0 should correspond to the input in input row 0.

Possible Solution(s)
  1. Make sure that the row numbers you return are the same as the row numbers you received, and that each output value uses the row number of the corresponding input. That should work. If it doesn’t, then it is possible that the input row numbers were not right, or that you did not return the rows in the correct order, so move on to step 2 below.

  2. Make sure that the output row numbers start from 0, increase by 1, and are in order.

Symptom: An Amazon API Gateway returns error 502 while the endpoint is using Lambda proxy integration

Possible Cause(s)

The Lambda function might have:

  • Timed out.

  • Thrown an exception.

  • Failed in some other way.

Possible Solution(s)

If the Lambda or API Gateway logs, are available to you, examine them.

If the source code of the Lambda function is available to you, then analyze and debug the code in the Lambda function. In some cases, you might be able to execute a copy of that code in a simpler context (outside AWS) to help debug it.

Verify that the data sent to the Lambda function is in the format that Lambda function expects. You might want to try sending a smaller, simpler data set to see whether that succeeds.

Verify that you are not sending too much data at a time.

In some cases, increasing the timeout might solve the problem, especially if the Lambda function requires a lot of CPU resources, or if the Lambda function itself calls other remote services and thus requires more time.

EXTERNAL_FUNCTIONS_HISTORY Function

Symptom: EXTERNAL_FUNCTIONS_HISTORY Returns “…invalid identifier…”

Possible Cause

You might not have put the function signature in single quotes. For example, the following is wrong because it does not include the quotes:

select *
  from table(information_schema.external_functions_history(
    function_signature => mydb.public.myfunction(integer, varchar)));
Possible Solution

Correct this by adding quotation marks around the function signature:

select *
  from table(information_schema.external_functions_history(
    function_signature => 'mydb.public.myfunction(integer, varchar)'));

Symptom: EXTERNAL_FUNCTIONS_HISTORY returns only one row of output, and many of the columns are NULL

Possible Cause

You probably did not include a function signature. If you do not specify a function signature, then EXTERNAL_FUNCTION_HISTORY() returns the aggregate values for columns such as INVOCATIONS, SENT ROWS, etc., and returns NULL for columns such as the function name, the argument lists, etc.

Possible Solution

If you intended to get information for one function, then include a function signature.

If you intended to get information for all functions, then the NULL values for some columns are correct, and you do not need to fix the query.