<model_build_name>!PREDICT¶
Extracts information from documents in a stage, and provides answers in a JSON object. If you specify a single document, the method returns results for that document. Otherwise, the method returns results for each document in the stage.
Syntax¶
<model_build_name>!PREDICT(<presigned_url>,
[ <model_build_version> ]
)
Arguments¶
Required:
presigned_url
Pre-signed URL of the staged documents.
To get the pre-signed URL to pass in as an argument, call the GET_PRESIGNED_URL function. See GET_PRESIGNED_URL.
For more information, see Example.
Note
The GET_PRESIGNED_URL function has a default expiration time (60 minutes). For more information about extending the expiration time, see GET_PRESIGNED_URL.
Optional:
model_build_version
Version of the Document AI model build.
If not specified, the latest available model build version is used by default.
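As a hedged sketch of how the two arguments fit together, the following call passes an explicit expiration time to GET_PRESIGNED_URL and pins the model build version. The stage and model build names are borrowed from the Example section, and the 7200-second expiration is an arbitrary illustrative value, not a recommendation.

```sql
-- Sketch: request a pre-signed URL valid for 2 hours (7200 seconds)
-- instead of the default, and pin version 1 of the model build.
SELECT inspections!PREDICT(
  GET_PRESIGNED_URL(@pdf_inspections_stage, RELATIVE_PATH, 7200),
  1) AS result
FROM DIRECTORY(@pdf_inspections_stage);
```

Omitting the second argument to PREDICT would instead use the latest available model build version.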
Returns¶
Returns a JSON object with the following fields:
ocrScore
Specifies the confidence score for the optical character recognition (OCR) process.
score
Specifies the confidence score for a specific value.
value
Specifies the extracted answer to the question.
{
  "__documentMetadata": {
    "ocrScore": 0.918
  },
  "invoice_number": [
    {
      "score": 0.925,
      "value": "123/20"
    }
  ],
  "invoice_items": [
    {
      "score": 0.839,
      "value": "NEW CRUSHED VELVET DIVAN BED"
    },
    {
      "score": 0.839,
      "value": "Vintage Radiator"
    },
    {
      "score": 0.839,
      "value": "Solid Wooden Worktop"
    },
    {
      "score": 0.839,
      "value": "Sienna Crushed Velvet Curtains"
    }
  ],
  "tax_amount": [
    {
      "score": 0.879,
      "value": "77.57"
    }
  ],
  "total_amount": [
    {
      "score": 0.809,
      "value": "465.43 GBP"
    }
  ],
  "buyer_name": [
    {
      "score": 0.925
    }
  ],
  "vendor_name": [
    {
      "score": 0.9,
      "value": "UK Exports & Imports Ltd"
    }
  ]
}
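One way to read individual fields out of a result like this is Snowflake's path notation for semi-structured data. The sketch below is self-contained: it parses a trimmed copy of the sample object with PARSE_JSON rather than calling a model, and the column alias result and the output column names are assumptions made for illustration.

```sql
-- Sketch: pull scalar fields out of a PREDICT-style JSON object.
-- The literal stands in for the value returned by <model_build_name>!PREDICT.
WITH prediction AS (
  SELECT PARSE_JSON('{
    "__documentMetadata": {"ocrScore": 0.918},
    "invoice_number": [{"score": 0.925, "value": "123/20"}],
    "total_amount": [{"score": 0.809, "value": "465.43 GBP"}]
  }') AS result
)
SELECT
  result:__documentMetadata.ocrScore::FLOAT AS ocr_score,
  -- Each answer is an array of objects, so index the first element.
  result:invoice_number[0]:value::STRING    AS invoice_number,
  result:invoice_number[0]:score::FLOAT     AS invoice_number_score,
  result:total_amount[0]:value::STRING      AS total_amount
FROM prediction;
```

In a real query, the CTE body would be replaced by the PREDICT call itself.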
Access control requirements¶
To extract information with Document AI, you must use an account role that is granted the SNOWFLAKE.DOCUMENT_INTELLIGENCE_CREATOR database role. For more information, see Document AI access control.
Usage notes¶
Ensure you meet the prerequisites for using this method. For more information, see Prerequisites.
Document AI has a limitation for the number of documents processed in one query. For more information, see Known limitations to Document AI.
All documents must be in the same directory of the stage.
Document AI uses directory tables. For more information, see Querying directory tables.
If the Document AI model does not find an answer in the document, the model does not return a value key. However, it does return the score key, which indicates how confident the model is that the document does not contain the answer. See the buyer_name field as an example.
The Document AI model can return lists. See the invoice_items field as an example.
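The last two notes can be illustrated with a short, self-contained sketch. It assumes a result shaped like the sample in the Returns section (parsed here from a literal instead of a model call) and uses LATERAL FLATTEN to expand the list. The aliases result, p, and item are illustrative names only.

```sql
-- Sketch: handle a missing "value" key and expand a list answer.
WITH prediction AS (
  SELECT PARSE_JSON('{
    "buyer_name": [{"score": 0.925}],
    "invoice_items": [
      {"score": 0.839, "value": "Vintage Radiator"},
      {"score": 0.839, "value": "Solid Wooden Worktop"}
    ]
  }') AS result
)
SELECT
  -- Evaluates to NULL here because buyer_name has no "value" key.
  p.result:buyer_name[0]:value::STRING AS buyer_name,
  p.result:buyer_name[0]:score::FLOAT  AS buyer_name_score,
  -- One output row per element of the invoice_items array.
  item.value:value::STRING             AS invoice_item,
  item.value:score::FLOAT              AS invoice_item_score
FROM prediction p,
  LATERAL FLATTEN(input => p.result:invoice_items) item;
```

Checking for a NULL value alongside the score is one way to distinguish "no answer found" from a low-confidence answer.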
Example¶
The following example extracts information from all of the documents on the pdf_inspections_stage stage for version 1 of the inspections model build:
SELECT inspections!PREDICT(
GET_PRESIGNED_URL(@pdf_inspections_stage, RELATIVE_PATH), 1)
FROM DIRECTORY(@pdf_inspections_stage);
The following example extracts information from the 'paystubs/paystub01.pdf' document on the pdf_paystubs_stage stage for version 1 of the paystubs model build:
SELECT paystubs!PREDICT(
GET_PRESIGNED_URL(@pdf_paystubs_stage, 'paystubs/paystub01.pdf'), 1);
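Because each call returns a JSON object, it can be convenient to store the results next to the file they came from. The following is a minimal sketch of that idea, not a prescribed workflow: the table name paystub_predictions and the column names file_name and result are hypothetical, and the query assumes the documents sit in the same directory of the stage, as required in the usage notes.

```sql
-- Sketch: persist one row per staged document with its extraction result.
-- paystub_predictions, file_name, and result are hypothetical names.
CREATE OR REPLACE TABLE paystub_predictions AS
SELECT
  RELATIVE_PATH AS file_name,
  paystubs!PREDICT(
    GET_PRESIGNED_URL(@pdf_paystubs_stage, RELATIVE_PATH), 1) AS result
FROM DIRECTORY(@pdf_paystubs_stage);
```

Storing the raw object keeps every score available for later filtering, at the cost of parsing fields at query time.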