SnowConvert AI - Hive - CREATE EXTERNAL TABLE¶
Applies to
Hive SQL
Spark SQL
Databricks SQL
Description¶
An external table defines a new table using a data source. (Spark SQL Language Reference: CREATE DATASOURCE TABLE)
The CREATE EXTERNAL TABLE statement from Spark/Databricks will be transformed into a Snowflake CREATE EXTERNAL TABLE statement; however, this transformation requires user intervention to complete.
To complete the transformation performed by SnowConvert AI, it is necessary to define a Storage Integration, an External Stage, and (optionally) a Notification Integration that have access to the external source where files are located. Please refer to the following guides on how to set up the connection for each provider:
Important considerations for the transformations shown on this page:
The @EXTERNAL_STAGE placeholder must be replaced with the external stage created after following the previous guide.
It is assumed that the external stage points to the root of the bucket. This is important because the PATTERN clause generated for each table specifies the file/folder paths starting at the base of the bucket; defining an external stage that points to a different location in the bucket might produce undesired behavior.
The AUTO_REFRESH = FALSE clause is generated to avoid errors. Please note that automatic refresh of external table metadata is only valid if your Snowflake account's cloud provider and the bucket's provider are the same, and a Notification Integration was created.
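As a minimal sketch of the prerequisite objects for an AWS S3 source (all names, the role ARN, and the bucket URL below are illustrative assumptions, not SnowConvert AI output):

```sql
-- Hypothetical storage integration; replace the role ARN and bucket
-- with your own values.
CREATE STORAGE INTEGRATION my_s3_integration
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my_snowflake_role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://my-bucket/');

-- External stage pointing to the ROOT of the bucket, so the generated
-- PATTERN clauses resolve correctly.
CREATE STAGE EXTERNAL_STAGE
  URL = 's3://my-bucket/'
  STORAGE_INTEGRATION = my_s3_integration;
```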
Sample Source Patterns¶
Create External Table with explicit column list¶
When the column list is provided, SnowConvert AI will automatically generate the AS expression column options for each column to extract the file values.
Input Code:¶
Output Code:¶
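For illustration only (table, column, and path names are invented, and the exact SnowConvert AI output may differ), a Spark source table such as:

```sql
CREATE EXTERNAL TABLE sales (
    id INT,
    amount DOUBLE
)
USING PARQUET
LOCATION 's3://my-bucket/sales';
```

would map to a Snowflake external table of roughly this shape, with an AS expression generated for each column:

```sql
CREATE EXTERNAL TABLE sales (
    id INT AS (VALUE:id::INT),
    amount DOUBLE AS (VALUE:amount::DOUBLE)
)
LOCATION = @EXTERNAL_STAGE
PATTERN = 'sales/.*'
FILE_FORMAT = (TYPE = PARQUET)
AUTO_REFRESH = FALSE;
```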
CREATE EXTERNAL TABLE without an explicit column list¶
When the column list is not provided, Spark automatically detects the schema of the columns from the file structure. To replicate this behavior, SnowConvert AI will generate a USING TEMPLATE clause that makes use of the INFER_SCHEMA function to generate the column definitions.
Since the INFER_SCHEMA function requires a file format to work, SnowConvert AI will generate a temporary file format for this purpose. This file format is only required when running the CREATE EXTERNAL TABLE statement, and it will be automatically dropped when the session ends.
Input Code:¶
Output Code:¶
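As a hedged sketch of this pattern (names are invented and the generated code may differ in detail), a Spark table declared without columns, e.g. `CREATE EXTERNAL TABLE events USING PARQUET LOCATION 's3://my-bucket/events';`, would become roughly:

```sql
-- Temporary file format required by INFER_SCHEMA; dropped automatically
-- when the session ends.
CREATE TEMPORARY FILE FORMAT events_infer_format TYPE = PARQUET;

-- USING TEMPLATE derives the column definitions from the files
-- via the INFER_SCHEMA function.
CREATE EXTERNAL TABLE events
USING TEMPLATE (
    SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
    FROM TABLE(INFER_SCHEMA(
        LOCATION => '@EXTERNAL_STAGE/events/',
        FILE_FORMAT => 'events_infer_format'
    ))
)
LOCATION = @EXTERNAL_STAGE
PATTERN = 'events/.*'
FILE_FORMAT = (TYPE = PARQUET)
AUTO_REFRESH = FALSE;
```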
CREATE EXTERNAL TABLE using Hive format¶
The creation of external tables using Hive format is also supported. An FDM will be added to these tables to inform the user that inserting into them is not supported.
Input Code:¶
Output Code:¶
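An illustrative sketch (invented names; the real output carries an FDM message whose exact wording is produced by SnowConvert AI): a Hive-format source such as:

```sql
CREATE EXTERNAL TABLE logs (
    msg STRING
)
STORED AS PARQUET
LOCATION 's3://my-bucket/logs';
```

would translate to roughly:

```sql
-- Note: inserting into this external table is not supported
-- (flagged by an FDM in the generated code).
CREATE EXTERNAL TABLE logs (
    msg STRING AS (VALUE:msg::STRING)
)
LOCATION = @EXTERNAL_STAGE
PATTERN = 'logs/.*'
FILE_FORMAT = (TYPE = PARQUET)
AUTO_REFRESH = FALSE;
```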
Known Issues¶
1. External tables with unsupported file formats
Snowflake supports the following Spark formats:
CSV
PARQUET
ORC
XML
JSON
AVRO
Other formats will be marked as not supported.
2. Unsupported table options
Some table options are not supported by SnowConvert AI and are marked with an EWI.