- Categories:
Semi-structured Data Functions (Parsing)
PARSE_XMLΒΆ
Interprets an input string as an XML document, producing an OBJECT value. If the input is NULL, the output is NULL.
- See also:
SyntaxΒΆ
PARSE_XML( <string_containing_xml> [ , <disable_auto_convert> ] )
ArgumentsΒΆ
Required:
string_containing_xml
This expression should evaluate to a VARCHAR. The VARCHAR should contain valid XML.
Optional:
disable_auto_convert
Boolean expression that specifies whether or not the function should attempt to convert numeric and boolean values in
string_containing_xml
to Snowflake data types. (For details about this conversion, see Usage Notes below.)If you do not want the function to convert these values, set this to
TRUE
. This has an effect that is similar to theDISABLE_AUTO_CONVERT
parameter in CREATE FILE FORMAT.If you want the function to convert these values, set this to
FALSE
or omit this argument.
Default:
FALSE
ReturnsΒΆ
The data type of the returned value is OBJECT. The OBJECT contains an internal representation of the XML.
Usage NotesΒΆ
The content of every element in XML documents is text. PARSE_XML attempts to convert some XML data from text to native (Snowflake SQL) Data Types.
NUMERIC and BOOLEAN:
PARSE_XML attempts to convert obviously numeric and Boolean values to the native representation in a way that printing these values back produces textually identical results. For example, when parsing decimal numbers, PARSE_XML attempts to preserve exactness of the representation by treating 123.45 as NUMBER(5,2), not as a DOUBLE. However:
Numbers in scientific notation (i.e. 1.2345e+02) or numbers that cannot be stored as fixed-point decimals due to range or scale limitations are stored as DOUBLE.
If the content of an XML element is a number with digits after the decimal point, then PARSE_XML might truncate trailing zeros.
If you do not want the function to perform this conversion, pass
TRUE
for thedisable_auto_convert
argument.TIMESTAMP, DATE, TIME, BINARY:
Because XML does not represent values such as TIMESTAMP, DATE, TIME, or BINARY natively, these have to be represented as strings in XML. PARSE_XML does not automatically recognize these values; they are retained as strings, so users must convert the values from strings to native SQL data types if needed.
XML attributes are an unordered collection of name/value pairs. The PARSE_XML function does not necessarily preserve order. For example, converting text to XML and back to text might result in a string that contains the original information in a different order.
You might see changes in whitespace between elements when converting from string to XML.
ExamplesΒΆ
The following example demonstrates how to use the PARSE_XML function to convert a string of XML to an OBJECT that can be inserted into an OBJECT column:
CREATE TABLE xtab (v OBJECT); INSERT INTO xtab SELECT PARSE_XML(column1) AS v FROM VALUES ('<a/>'), ('<a attr="123">text</a>'), ('<a><b>X</b><b>Y</b></a>'); SELECT * FROM xtab; +-------------------------+ | V | +-------------------------+ | <a></a> | | <a attr="123">text</a> | | <a><b>X</b><b>Y</b></a> | +-------------------------+
The following example demonstrates the differences between using and disabling the conversion of numeric values. In this example, when the conversion is not disabled, the function interprets a number in scientific notation as a DOUBLE.
SELECT PARSE_XML('<test>22257e111</test>'), PARSE_XML('<test>22257e111</test>', TRUE); +-------------------------------------+-------------------------------------------+ | PARSE_XML('<TEST>22257E111</TEST>') | PARSE_XML('<TEST>22257E111</TEST>', TRUE) | |-------------------------------------+-------------------------------------------| | <test>2.225700000000000e+115</test> | <test>22257e111</test> | +-------------------------------------+-------------------------------------------+