Categories:

Semi-structured Data Functions (Parsing)

# PARSE_XML¶

Interprets an input string as an XML document, producing an OBJECT value. If the input is NULL, the output is NULL.

## Usage Notes¶

Every elementary value in XML documents is text; however it is not very useful for processing the data.

PARSE_XML attempts to convert obviously numeric and Boolean values to the native representation in a way that printing these values back produces textually identical results. When parsing decimal numbers, PARSE_XML attempts to preserve exactness of the representation by treating 123.45 as NUMBER(5,2), not as a DOUBLE. However, numbers in scientific notation (i.e. 1.2345e+02) or numbers that cannot be stored as fixed-point decimals due to range or scale limitations are stored as DOUBLE. Because XML does not represent values such as TIMESTAMP, DATE, TIME, or BINARY natively, these have to be represented as strings.

In the current Snowflake release, PARSE_XML does not provide automatic recognition of these values; they are treated as strings (and the explicit conversion from strings is required).

XML attributes are an unordered collection of name/value pairs. The PARSE_XML function does not necessarily preserve order. For example, converting text to XML and back to text might result in a string that contains the original information in a different order.

You might also see changes in whitespace between elements when converting from string to XML.

## Examples¶

CREATE TABLE xtab (v OBJECT);

INSERT INTO xtab SELECT PARSE_XML(column1) AS v
FROM VALUES ('<a/>'), ('<a attr="123">text</a>'), ('<a><b>X</b><b>Y</b></a>');

SELECT * FROM xtab;

-------------------------+
V             |
-------------------------+
<a></a>                 |
<a attr="123">text</a>  |
<a><b>X</b><b>Y</b></a> |
-------------------------+