Categories:

Utility & Hash Functions (Utility)

GENERATE_COLUMN_DESCRIPTION

Generates a list of columns from a set of staged files that contain semi-structured data using the INFER_SCHEMA function output. The output from this function can be used as input when creating a table, external table, or view (using the appropriate CREATE <object> command) based on the column definitions of the staged files.

Syntax

GENERATE_COLUMN_DESCRIPTION( <expr> , '<string>' )

Arguments

expr

Output of the INFER_SCHEMA function formatted as an array.

'string'

Type of object that could be created from the column list. The appropriate formatting for this type is applied to the output.

Possible values are table, external_table, or view.

Usage Notes

Currently, all columns in the output are defined as nullable.

Output

The function returns the list of columns a set of staged files, which can be used as input when creating an object of the type identified in the second argument.

Examples

Detect, format, and output the set of column definitions in a set of Parquet files staged in the mystage stage. The output columns are formatted for creating a table.

This example builds on an example in the INFER_SCHEMA topic:

-- Create a file format that sets the file type as Parquet.
CREATE FILE FORMAT my_parquet_format
  TYPE = parquet;

-- Query the GENERATE_COLUMN_DESCRIPTION function.
SELECT GENERATE_COLUMN_DESCRIPTION(ARRAY_AGG(OBJECT_CONSTRUCT(*)), 'table') AS COLUMNS
  FROM TABLE (
    INFER_SCHEMA(
      LOCATION=>'@mystage',
      FILE_FORMAT=>'my_parquet_format'
    )
  );

+--------------------+
| COLUMN_DESCRIPTION |
|--------------------|
| "country" VARIANT, |
| "continent" TEXT   |
+--------------------+

-- The function output can be used to define the columns in a table.
CREATE TABLE mytable ("country" VARIANT, "continent" TEXT);

Same as the previous example, but generates a set of columns formatted for creating an external table:

-- Query the GENERATE_COLUMN_DESCRIPTION function.
SELECT GENERATE_COLUMN_DESCRIPTION(ARRAY_AGG(OBJECT_CONSTRUCT(*)), 'external_table') AS COLUMNS
  FROM TABLE (
    INFER_SCHEMA(
      LOCATION=>'@mystage',
      FILE_FORMAT=>'my_parquet_format'
    )
  );

+---------------------------------------------+
| COLUMN_DESCRIPTION                          |
|---------------------------------------------|
| "country" VARIANT AS ($1:country::VARIANT), |
| "continent" TEXT AS ($1:continent::TEXT)    |
+---------------------------------------------+

Same as the previous examples, but generates a set of columns formatted for creating a view:

-- Query the GENERATE_COLUMN_DESCRIPTION function.
SELECT GENERATE_COLUMN_DESCRIPTION(ARRAY_AGG(OBJECT_CONSTRUCT(*)), 'view') AS COLUMNS
  FROM TABLE (
    INFER_SCHEMA(
      LOCATION=>'@mystage',
      FILE_FORMAT=>'my_parquet_format'
    )
  );

+--------------------+
| COLUMN_DESCRIPTION |
|--------------------|
| "country" ,        |
| "continent"        |
+--------------------+