Troubleshooting Document AI
The following scenarios can help you troubleshoot issues that might occur when working with Document AI.
Extracting query is not working
For the extracting query to work, you must store the documents for extraction in either an internal or external stage.
Ensure that you specify the SNOWFLAKE_SSE encryption type when you create an internal stage.
| Error | Depending on the document format, you might get an error such as one of the following: `{ "__processingErrors": [ "File extension does not match actual mime type. Mime-Type: application/octet-stream" ] }` `{ "__processingErrors": [ "cannot identify image file <_io.BytesIO object at 0x7f8a800ba020>" ] }` |
|---|---|
| Cause | You didn’t specify the SNOWFLAKE_SSE encryption type when you created the internal stage. |
| Solution | To create an internal stage, run the CREATE STAGE command as shown in the following example: `CREATE STAGE doc_ai_stage DIRECTORY = (ENABLE = TRUE) ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE');` |
Presigned URL has expired
The presigned URL of the staged documents is a required argument to <model_build_name>!PREDICT. To get the presigned URL, call the GET_PRESIGNED_URL function. If you don’t pass an expiration time to the function, the URL expires after the default expiration time.
For more information, see GET_PRESIGNED_URL.
| Error | `{ "__processingErrors": [ "Received HTTP 403 response for presigned URL. URL may be expired." ] }` |
|---|---|
| Cause | Presigned URL has expired. |
| Solution | Either reduce the number of documents in one query, or extend the expiration time. For more information about extending the expiration time, see GET_PRESIGNED_URL. |
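
As a minimal sketch of extending the expiration time, the following query passes an explicit expiration, in seconds, as the third argument of GET_PRESIGNED_URL. The model build name my_model_build, the stage doc_ai_stage, the build version 1, and the 7200-second value are placeholders chosen for illustration, not values from this page. Check GET_PRESIGNED_URL for the range that your cloud platform allows.

```sql
-- Hypothetical names: my_model_build and doc_ai_stage. Build version 1 is assumed.
SELECT
  relative_path,
  my_model_build!PREDICT(
    GET_PRESIGNED_URL(@doc_ai_stage, relative_path, 7200),  -- expiration time in seconds
    1                                                       -- model build version
  ) AS extraction
FROM DIRECTORY(@doc_ai_stage);
```

A longer expiration time only helps if the query would otherwise outlive the URLs. If the query is slow because it covers many documents, splitting it into smaller queries might be the simpler fix.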
Too many documents in one query
Document AI has a limitation on the number of documents processed in one extracting query. For more information, see Known limitations to Document AI.
| Error | `{ "__processingErrors": [ "Query limit reached: too many documents in a single query." ] }` |
|---|---|
| Cause | You tried to process too many documents in one query. |
| Solution | Use several queries to process the documents. |
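
One possible way to split the work is to number the files in the stage's directory table and process one range of row numbers per query. This is only a sketch: my_model_build, doc_ai_stage, build version 1, and the batch size of 500 are assumptions made for illustration. Choose a batch size that stays under the limit described in Known limitations to Document AI.

```sql
-- Hypothetical names: my_model_build and doc_ai_stage. The batch size (500) is an assumption.
-- Batch 1: files 1 through 500, ordered by path so that the batches are stable across runs.
SELECT
  relative_path,
  my_model_build!PREDICT(
    GET_PRESIGNED_URL(@doc_ai_stage, relative_path), 1
  ) AS extraction
FROM (
  SELECT
    relative_path,
    ROW_NUMBER() OVER (ORDER BY relative_path) AS rn
  FROM DIRECTORY(@doc_ai_stage)
)
WHERE rn BETWEEN 1 AND 500;

-- Batch 2: run the same query with WHERE rn BETWEEN 501 AND 1000, and so on.
```

If the documents are already organized in folders on the stage, filtering on relative_path with LIKE is a simpler alternative to row numbers.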
Documents don’t meet specific requirements
The documents you process with Document AI must meet specific requirements. For more information, see Prepare your documents for Document AI.
| Error | You might get one of the following errors: `{ "__processingErrors": [ "Page 0 size is larger than the limit. Actual: 1083 mm x 1384 mm. Maximum: 1200 mm x 1200 mm." ] }` `{ "__processingErrors": [ "Document has too many pages. Actual: 150. Maximum: 125." ] }` `{ "__processingErrors": [ "Image size is too small. Actual: 20x20 px. Minimum: 50x50 px." ] }` `{ "__processingErrors": [ "Unsupported file format. Actual: csv. Supported: docx, eml, htm, html, jpeg, jpg, pdf, png, text, tif, tiff, txt." ] }` `{ "__processingErrors": [ "File exceeds maximum size. Actual: 54096026 bytes. Maximum: 50000000 bytes." ] }` |
|---|---|
| Cause | The documents you tried to process don’t meet the requirements of Document AI. For more information about the requirements, see Prepare your documents for Document AI. |
| Solution | Prepare your documents to meet the requirements. |
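
Some of these requirements can be checked before you run the extracting query. The following sketch lists staged files that exceed the 50000000-byte limit or whose extension is not among the supported formats quoted in the error messages above. The stage name doc_ai_stage is a placeholder. Page count, page dimensions, and image resolution are not available in the directory table, so this check does not cover them.

```sql
-- Hypothetical stage name: doc_ai_stage.
-- The size limit and the extension list are taken from the error messages in this section.
SELECT
  relative_path,
  size
FROM DIRECTORY(@doc_ai_stage)
WHERE size > 50000000
   OR LOWER(SPLIT_PART(relative_path, '.', -1)) NOT IN (
        'docx', 'eml', 'htm', 'html', 'jpeg', 'jpg',
        'pdf', 'png', 'text', 'tif', 'tiff', 'txt'
      );
```

An empty result means that no staged file fails these two checks. It does not guarantee that every document meets the remaining requirements.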
The Document AI model build was not published
To extract information with Document AI, you need to have the Document AI model build published. You don’t need to publish the model build if you trained the model and didn’t add new data values (ask new questions) after the training.
| Error | The error message starts with the following: `Request failed for external function DOCUMENT_EXTRACT_FEATURES$V1 with remote service error: 422` |
|---|---|
| Cause | The Document AI model build was not published. |
| Solution | Publish the Document AI model build. For more information, see Publish a Document AI model build. |
Required privileges are not granted or the model build name is duplicated
To create a Document AI model build, your role must have the required privileges, and you must choose a unique model build name.
For more information on required privileges, see Document AI access control.
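| Error | `Unable to create a build on the specified database and schema. Please check the documentation to learn more.` |
|---|---|
| Cause | Possible causes are: the required privileges are not granted to your role, or the model build name is already in use. |
| Solution | Grant the required privileges to your role, or choose a unique model build name. |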
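
As a rough sketch of granting privileges, the following statements show grants that are commonly set up for a role that creates model builds. The role, database, and schema names (doc_ai_role, doc_ai_db, doc_ai_schema) are placeholders, and the set of grants shown here is an assumption that might be incomplete for your account. Document AI access control is the authoritative list.

```sql
-- Hypothetical names: doc_ai_role, doc_ai_db, doc_ai_schema.
-- Assumed set of grants; verify against Document AI access control.
GRANT DATABASE ROLE SNOWFLAKE.DOCUMENT_INTELLIGENCE_CREATOR TO ROLE doc_ai_role;

GRANT USAGE ON DATABASE doc_ai_db TO ROLE doc_ai_role;
GRANT USAGE ON SCHEMA doc_ai_db.doc_ai_schema TO ROLE doc_ai_role;

-- Allows the role to create model builds in the schema.
GRANT CREATE SNOWFLAKE.ML.DOCUMENT_INTELLIGENCE
  ON SCHEMA doc_ai_db.doc_ai_schema TO ROLE doc_ai_role;
```

If the grants are already in place, try creating the model build under a different name.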