Category: :doc:`/sql-reference/functions-string` (AI functions)

AI_EXTRACT

Extracts information from an input string or file.

Syntax

Extract information from an input string:

AI_EXTRACT( <text>, <responseFormat> )

AI_EXTRACT( text => <text>,
            responseFormat => <responseFormat> )

Extract information from a file:

AI_EXTRACT( <file>, <responseFormat> )

AI_EXTRACT( file => <file>,
            responseFormat => <responseFormat> )

Arguments

text

An input string for extraction.

file

A FILE for extraction.

Supported file formats:

PDF
PNG
PPTX, PPT
EML
DOC, DOCX
JPEG, JPG
HTM, HTML
TEXT, TXT
TIF, TIFF
BMP, GIF, WEBP
MD

Files must be less than 100 MB in size.
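
The format and size constraints above can be checked on the client before a call is made. The following Python sketch is not part of Snowflake: the helper name `is_supported_file` is hypothetical, and it assumes that 100 MB means 100 × 1024 × 1024 bytes, which this page does not specify.

```python
from pathlib import Path

# Extensions taken from the supported file formats listed above.
SUPPORTED_EXTENSIONS = {
    "pdf", "png", "pptx", "ppt", "eml", "doc", "docx", "jpeg", "jpg",
    "htm", "html", "text", "txt", "tif", "tiff", "bmp", "gif", "webp", "md",
}

# Assumption: "100 MB" is interpreted as 100 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 100 * 1024 * 1024


def is_supported_file(file_name: str, size_bytes: int) -> bool:
    """Return True if the extension is listed and the size is under the limit."""
    extension = Path(file_name).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS and size_bytes < MAX_FILE_BYTES
```

The check is case-insensitive on the extension, which is a convenience of this sketch rather than documented behavior of the function.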

responseFormat

Information to be extracted, in one of the following response formats:

A simple object schema that maps each label to the information to be extracted; for example:

{'name': 'What is the last name of the employee?', 'address': 'What is the address of the employee?'}

An array of strings that contain the information to be extracted; for example:

['What is the last name of the employee?', 'What is the address of the employee?']

An array of arrays that contain two strings (label and the information to be extracted); for example:

[['name', 'What is the last name of the employee?'], ['address', 'What is the address of the employee?']]

A JSON schema that defines the structure of the extracted information. Supports entity and table extraction. For example:
```
{
  'schema': {
    'type': 'object',
    'properties': {
      'income_table': {
        'description': 'Income for FY2026Q2',
        'type': 'object',
        'column_ordering': ['month', 'income'],
        'properties': {
          'month': {
            'description': 'Month',
            'type': 'array'
          },
          'income': {
            'description': 'Income',
            'type': 'array'
          }
        }
      },
      'title': {
        'description': 'What is the title of the document?',
        'type': 'string'
      },
      'employees': {
        'description': 'What are the names of employees?',
        'type': 'array'
      }
    }
  }
}
```
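Note
- You can't combine the JSON schema format with other response formats. If responseFormat contains the schema key, you must define all questions within the JSON schema. Additional keys are not supported.
- The model accepts only certain shapes of JSON schema. The top-level type must always be an object that contains independently extracted sub-objects. A sub-object can be a table (an object of lists of strings that represent columns), a list of strings, or a string.

  String is currently the only supported scalar type.
- The description field is optional.

  Use the description field to provide context to the model; for example, to help the model locate the right table in a document.
- Use the column_ordering field to specify the order of all columns in the extracted table. The column_ordering field is case-sensitive and must match the column names defined in the properties field.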
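
The schema rules in the notes above can be sketched as a small builder. This is an illustrative Python sketch only: `table_schema` and `check_column_ordering` are hypothetical helper names, and the dictionary mirrors the structure of the JSON schema example without showing how it is passed to AI_EXTRACT.

```python
def table_schema(description, columns):
    """Build a table sub-object: an object whose properties are arrays."""
    return {
        "description": description,
        "type": "object",
        "column_ordering": list(columns),
        "properties": {name: {"type": "array"} for name in columns},
    }


def check_column_ordering(table):
    """Check that column_ordering names exactly the columns in properties.

    The comparison is case-sensitive, as the note above requires.
    """
    return sorted(table["column_ordering"]) == sorted(table["properties"])


# Top-level type is an object; sub-objects are a table, a string, and an array.
response_format = {
    "schema": {
        "type": "object",
        "properties": {
            "income_table": table_schema("Income for FY2026Q2", ["month", "income"]),
            "title": {"description": "What is the title of the document?", "type": "string"},
            "employees": {"description": "What are the names of employees?", "type": "array"},
        },
    }
}
```

Generating `properties` from the same column list as `column_ordering` keeps the two fields consistent by construction, so a case mismatch such as `Month` versus `month` cannot creep in.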

Returns

A JSON object containing the extracted information.

Example of an output that includes array, table, and single value extraction:

{
  "error": null,
  "response": {
    "employees": [
      "Smith",
      "Johnson",
      "Doe"
    ],
    "income_table": {
      "income": ["$120 678","$130 123","$150 998"],
      "month": ["February", "March", "April"]
    },
    "title": "Financial report"
  }
}
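
To illustrate how a client might consume this output, the following Python sketch parses the sample above and turns the column-oriented table into rows. It makes two assumptions that are not stated on this page: a non-null `error` value signals a failed extraction, and all columns of a table have the same length. The helper name `table_rows` is hypothetical.

```python
import json

# Sample output copied from the example above.
raw = """
{
  "error": null,
  "response": {
    "employees": ["Smith", "Johnson", "Doe"],
    "income_table": {
      "income": ["$120 678", "$130 123", "$150 998"],
      "month": ["February", "March", "April"]
    },
    "title": "Financial report"
  }
}
"""


def table_rows(table, column_ordering):
    """Turn a column-oriented table into a list of row tuples."""
    return list(zip(*(table[name] for name in column_ordering)))


result = json.loads(raw)
# Assumption: a non-null "error" means the extraction failed.
if result["error"] is not None:
    raise RuntimeError(result["error"])

response = result["response"]
rows = table_rows(response["income_table"], ["month", "income"])
```

The column order is passed explicitly because the keys of `income_table` in the sample output do not follow the `column_ordering` given in the schema.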

Access control requirements

Users must use a role that has been granted the SNOWFLAKE.CORTEX_USER database role. For information about granting this privilege, see Cortex LLM privileges.

Usage notes

You can't use both the text and file parameters in the same function call.
You can either ask questions in natural language or describe information to be extracted (such as city, street, ZIP code); for example:
{'address': 'City, street, ZIP', 'name': 'First and last name'}
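The following languages are supported:
- Arabic
- Bengali
- Burmese
- Cebuano
- Chinese
- Czech
- Dutch
- English
- French
- German
- Hebrew
- Hindi
- Indonesian
- Italian
- Japanese
- Khmer
- Korean
- Lao
- Malay
- Persian
- Polish
- Portuguese
- Russian
- Spanish
- Tagalog
- Thai
- Turkish
- Urdu
- Vietnamese
Documents must be no more than 125 pages long.
In a single AI_EXTRACT call, you can ask a maximum of 100 questions for entity extraction and a maximum of 10 questions for table extraction.

A table extraction question is equal to 10 entity extraction questions. For example, you can ask 4 table extraction questions and 60 entity extraction questions in a single AI_EXTRACT call.
The maximum output length for entity extraction is 512 tokens per question. For table extraction, the model returns answers of at most 4096 tokens.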
Client-side encrypted stages are not supported.
Confidence scores are not supported.
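
The per-call question limits above amount to a shared budget, which can be checked before building a request. The following Python sketch reads the limits as follows, based on the 4-table, 60-entity example: each table question costs 10 entity questions against a total of 100, with no more than 10 table questions. The function name `within_question_budget` is hypothetical.

```python
MAX_ENTITY_QUESTIONS = 100   # per AI_EXTRACT call
MAX_TABLE_QUESTIONS = 10     # per AI_EXTRACT call
ENTITY_COST_PER_TABLE = 10   # one table question counts as 10 entity questions


def within_question_budget(entity_questions: int, table_questions: int) -> bool:
    """Return True if the mix of questions fits in a single call."""
    if table_questions > MAX_TABLE_QUESTIONS:
        return False
    used = entity_questions + ENTITY_COST_PER_TABLE * table_questions
    return used <= MAX_ENTITY_QUESTIONS
```

With this reading, 4 table questions and 60 entity questions use exactly 4 × 10 + 60 = 100 units, so one more entity question would exceed the limit.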

Examples

Extraction from an input string

The following example extracts information from an input text:

SELECT AI_EXTRACT(
  text => 'John Smith lives in San Francisco and works for Snowflake',
  responseFormat => {'name': 'What is the first name of the employee?', 'city': 'What is the address of the employee?'}
);

The following example extracts and parses information from an input text:

SELECT AI_EXTRACT(
  text => 'John Smith lives in San Francisco and works for Snowflake',
  responseFormat => PARSE_JSON('{"name": "What is the first name of the employee?", "address": "What is the address of the employee?"}')
);

Extraction from a file

The following example extracts information from the document.pdf file:

SELECT AI_EXTRACT(
  file => TO_FILE('@db.schema.files','document.pdf'),
  responseFormat => [['name', 'What is the first name of the employee?'], ['city', 'Where does the employee live?']]
);

The following example extracts information from all files in a directory on a stage:

Note

Make sure that the directory table is enabled. For more information, see Manage directory tables.
```
SELECT AI_EXTRACT(
  file => TO_FILE('@db.schema.files', relative_path),
  responseFormat => [
    'What is this document?',
    'How would you classify this document?'
  ]
) FROM DIRECTORY (@db.schema.files);
```

The following example extracts the title value from the report.pdf file:

SELECT AI_EXTRACT(
  file => TO_FILE('@db.schema.files', 'report.pdf'),
  responseFormat => {
    'schema': {
      'type': 'object',
      'properties': {
        'title': {
          'description': 'What is the title of document?',
          'type': 'string'
        }
      }
    }
  }
);

The following example extracts the employees array from the report.pdf file:

SELECT AI_EXTRACT(
  file => TO_FILE('@db.schema.files', 'report.pdf'),
  responseFormat => {
    'schema': {
      'type': 'object',
      'properties': {
        'employees': {
          'description': 'What are the surnames of employees?',
          'type': 'array'
        }
      }
    }
  }
);

The following example extracts the income_table table from the report.pdf file:

SELECT AI_EXTRACT(
  file => TO_FILE('@db.schema.files', 'report.pdf'),
  responseFormat => {
    'schema': {
      'type': 'object',
      'properties': {
        'income_table': {
          'description': 'Income for FY2026Q2',
          'type': 'object',
          'column_ordering': ['month', 'income'],
          'properties': {
            'month': {
              'type': 'array'
            },
            'income': {
              'type': 'array'
            }
          }
        }
      }
    }
  }
);

The following example extracts a table (income_table), a single value (title), and an array (employees) from the report.pdf file:

SELECT AI_EXTRACT(
  file => TO_FILE('@db.schema.files', 'report.pdf'),
  responseFormat => {
    'schema': {
      'type': 'object',
      'properties': {
        'income_table': {
          'description': 'Income for FY2026Q2',
          'type': 'object',
          'column_ordering': ['month', 'income'],
          'properties': {
            'month': {
              'type': 'array'
            },
            'income': {
              'type': 'array'
            }
          }
        },
        'title': {
          'description': 'What is the title of document?',
          'type': 'string'
        },
        'employees': {
          'description': 'What are the surnames of employees?',
          'type': 'array'
        }
      }
    }
  }
);

Regional availability

See Regional availability.

Legal notices

For legal notices, see Snowflake AI and ML.