Troubleshooting Document AI

The following scenarios can help you troubleshoot issues that might occur when working with Document AI.

Extracting query is not working

For the extracting query to work, you must store the documents for extraction in either an internal or external stage. Ensure that you specify the SNOWFLAKE_SSE encryption type when you create an internal stage.

Error	Depending on the document format, you might get an error such as one of the following:
{ "__processingErrors": [ "File extension does not match actual mime type. Mime-Type: application/octet-stream" ] }
{ "__processingErrors": [ "cannot identify image file <_io.BytesIO object at 0x7f8a800ba020>" ] }
Cause	You didn’t specify the SNOWFLAKE_SSE encryption type when you created the internal stage that stores the documents.
Solution	Create the internal stage with the SNOWFLAKE_SSE encryption type by running the CREATE STAGE command, as shown in the following example:
CREATE STAGE doc_ai_stage
  DIRECTORY = (ENABLE = TRUE)
  ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE');

Presigned URL has expired

The presigned URL of the staged documents is a required argument to <model_build_name>!PREDICT. To get the presigned URL, call the GET_PRESIGNED_URL function. Unless you pass an expiration time, the URL expires after the default period of 3600 seconds (one hour).

For more information, see GET_PRESIGNED_URL.

Error	{ "__processingErrors": [ "Received HTTP 403 response for presigned URL. URL may be expired." ] }
Cause	The presigned URL expired before Document AI retrieved the document.
Solution	Either reduce the number of documents in one query so that each document is retrieved before its URL expires, or extend the expiration time. For more information about extending the expiration time, see GET_PRESIGNED_URL.
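
As an illustration of extending the expiration time, the following sketch passes an explicit expiration (in seconds) as the third argument of GET_PRESIGNED_URL. The model build name (doc_ai_db.doc_ai_schema.my_model_build), the version number 1, the stage name, and the four-hour value are all assumptions chosen for the example; substitute your own.

```sql
-- Hypothetical names: doc_ai_db.doc_ai_schema.my_model_build (version 1), @doc_ai_stage.
SELECT
  relative_path,
  doc_ai_db.doc_ai_schema.my_model_build!PREDICT(
    GET_PRESIGNED_URL(@doc_ai_stage, relative_path, 14400),  -- URL valid for 4 hours (example value)
    1                                                        -- model build version (assumed)
  ) AS extraction
FROM DIRECTORY(@doc_ai_stage);
```

A longer expiration keeps the URLs valid for longer, so choose the shortest value that still covers the expected run time of the query.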

Too many documents in one query

Document AI limits the number of documents that can be processed in one extracting query. For more information, see Known limitations to Document AI.

Error	{ "__processingErrors": [ "Query limit reached: too many documents in a single query." ] }
Cause	You tried to process too many documents in one query.
Solution	Use several queries to process the documents.
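
One possible way to split the work is to number the staged files through the directory table and process one slice per query. This is only a sketch: the model build name, version, and stage name are hypothetical, and the batch size of 500 is an arbitrary example rather than the documented limit.

```sql
-- Hypothetical names: doc_ai_db.doc_ai_schema.my_model_build (version 1), @doc_ai_stage.
-- The batch size of 500 is an arbitrary example, not the documented limit.
WITH numbered_docs AS (
  SELECT
    relative_path,
    ROW_NUMBER() OVER (ORDER BY relative_path) AS doc_number  -- stable order by file path
  FROM DIRECTORY(@doc_ai_stage)
)
SELECT
  relative_path,
  doc_ai_db.doc_ai_schema.my_model_build!PREDICT(
    GET_PRESIGNED_URL(@doc_ai_stage, relative_path),
    1
  ) AS extraction
FROM numbered_docs
WHERE doc_number BETWEEN 1 AND 500;  -- next run: BETWEEN 501 AND 1000, and so on
```

The numbering shifts if files are added to or removed from the stage between runs, so this approach assumes the stage contents stay unchanged until all batches are processed.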

Documents don’t meet specific requirements

The documents you process with Document AI must meet specific requirements. For more information, see Prepare your documents for Document AI.

Error	You might get one of the following errors:
{ "__processingErrors": [ "Page 0 size is larger than the limit. Actual: 1083 mm x 1384 mm. Maximum: 1200 mm x 1200 mm." ] }
{ "__processingErrors": [ "Document has too many pages. Actual: 150. Maximum: 125." ] }
{ "__processingErrors": [ "Image size is too small. Actual: 20x20 px. Minimum: 50x50 px." ] }
{ "__processingErrors": [ "Unsupported file format. Actual: csv. Supported: docx, eml, htm, html, jpeg, jpg, pdf, png, text, tif, tiff, txt." ] }
{ "__processingErrors": [ "File exceeds maximum size. Actual: 54096026 bytes. Maximum: 50000000 bytes." ] }
Cause	The documents you tried to process don’t meet the requirements of Document AI. For more information about the requirements, see Prepare your documents for Document AI.
Solution	Prepare your documents to meet the requirements.
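
Some of these problems can be spotted before running an extracting query. The following sketch lists staged files whose size or file extension would likely be rejected, using the directory table of the stage. The stage name is hypothetical, and the size limit and extension list are copied from the sample error messages above, so confirm them against the current requirements. Page count, page dimensions, and image dimensions cannot be checked this way.

```sql
-- Hypothetical stage name: @doc_ai_stage (directory table must be enabled).
-- Limits below are taken from the sample error messages, not from a live lookup.
SELECT
  relative_path,
  size
FROM DIRECTORY(@doc_ai_stage)
WHERE size > 50000000                                          -- larger than the maximum file size in bytes
   OR LOWER(SPLIT_PART(relative_path, '.', -1)) NOT IN (       -- last dot-separated part = file extension
        'docx', 'eml', 'htm', 'html', 'jpeg', 'jpg',
        'pdf', 'png', 'text', 'tif', 'tiff', 'txt'
      );
```

A file without any dot in its name is also returned, because the whole path is then treated as the extension.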

The Document AI model build was not published

To extract information with Document AI, you need to have the Document AI model build published. You don’t need to publish the model build if you trained the model and didn’t add new data values (ask new questions) after the training.

Error	The error message starts with the following: Request failed for external function DOCUMENT_EXTRACT_FEATURES$V1 with remote service error: 422
Cause	The Document AI model build was not published.
Solution	Publish the Document AI model build. For more information, see Publish a Document AI model build.

Required privileges are not granted or the model build name is duplicated

To create a Document AI model build, your role must have the required privileges, and you must choose a model build name that is unique within the database and schema.

For more information on required privileges, see Document AI access control.

Error	Unable to create a build on the specified database and schema. Please check the documentation to learn more.
Cause	Possible causes are:
The CREATE SNOWFLAKE.ML.DOCUMENT_INTELLIGENCE privilege is not granted to your role.
The model build name already exists in the database and schema.
Solution	Do one or both of the following, depending on the cause:
Grant the CREATE SNOWFLAKE.ML.DOCUMENT_INTELLIGENCE privilege to your role. See Grant the required roles and privileges to Document AI users.
Use a unique model build name within the database and schema.
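
Granting the privilege might look like the following sketch. The database, schema, and role names (doc_ai_db, doc_ai_schema, doc_ai_role) are placeholders, and the statement assumes it is run by a role that is allowed to grant privileges on that schema.

```sql
-- Placeholder names: doc_ai_db.doc_ai_schema (target schema), doc_ai_role (role that creates model builds).
GRANT CREATE SNOWFLAKE.ML.DOCUMENT_INTELLIGENCE
  ON SCHEMA doc_ai_db.doc_ai_schema
  TO ROLE doc_ai_role;
```

This covers only the privilege named in the error cause; the linked access control topic lists the other privileges that Document AI users need.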
