- Categories:
Semi-structured and structured data functions (Extraction)
XMLGET¶
Extracts an XML element object (often referred to as simply a tag) from the content of the outer XML element based on the name and instance number of the specified tag.
(Note that an XML tag is not the same as a Snowflake data governance tag.)
If any argument of XMLGET is NULL, the result is NULL.
If the tag instance is not found, the result is NULL.
Syntax¶
XMLGET( <expression> , <tag_name> [ , <instance_number> ] )
Arguments¶
expression
The expression from which to extract the element.
The expression must evaluate to an OBJECT (or a VARIANT containing an OBJECT). The OBJECT must contain valid XML in the internal format that Snowflake supports. Typically, that means that the OBJECT was produced by one of the following:
Calling the PARSE_XML function.
Loading the data (e.g. via the COPY INTO <table> command) and specifying that the data is in XML format.
The XMLGET function does not operate directly on a VARCHAR expression even if that VARCHAR contains valid XML text.
tag_name
The name of an XML tag stored in the
expression
.instance_number
If the XML contains multiple instances of
tag_name
, then useinstance_number
to specify which instance to retrieve. Like an array index, theinstance_number
is 0-based, not 1-based.instance_number
can be omitted, in which case the default value 0 is used.
Returns¶
The data type of the returned value is OBJECT.
See the Usage Notes for more details.
Usage notes¶
The result of XMLGET is not the content of the tag (i.e. the text between the tags), but the entire element (the opening tag, content, and closing tag). The return value is an OBJECT. From this OBJECT, you can extract the tag name, the tag’s attribute values, and the contents of the element (including nested tags) by using the GET function:
To extract attribute values, use
GET(tag, '@attrname')
.To extract the content, use
GET(tag, '$')
.To extract the tag name, use
GET(tag, '@')
.
You can extract nested tags by nesting XMLGET() calls, e.g.:
select xmlget(xmlget(my_xml_column, 'my_tag'), 'my_inner_tag') ...;
Positions of the inner tags in the content can be obtained by using
GET(tag, 'inner-tag-name')
; if the content contains multiple elements, the positions will be represented as an array.You cannot use XMLGET to extract the outermost element. To get the outermost element, simply select the
expression
itself.
Examples¶
The following example creates a table with an OBJECT that contains XML, then uses XMLGET() to extract elements from that OBJECT.
CREATE TABLE xml_demo (ID INTEGER, object_col OBJECT);
INSERT INTO xml_demo (id, object_col)
SELECT 1001,
PARSE_XML('<level1> 1 <level2> 2 <level3> 3A </level3> <level3> 3B </level3> </level2> </level1>');
SELECT object_col,
XMLGET(object_col, 'level2'),
XMLGET(XMLGET(object_col, 'level2'), 'level3', 1)
FROM xml_demo;
+-------------------------+------------------------------+---------------------------------------------------+
| OBJECT_COL | XMLGET(OBJECT_COL, 'LEVEL2') | XMLGET(XMLGET(OBJECT_COL, 'LEVEL2'), 'LEVEL3', 1) |
|-------------------------+------------------------------+---------------------------------------------------|
| <level1> | <level2> | <level3>3B</level3> |
| 1 | 2 | |
| <level2> | <level3>3A</level3> | |
| 2 | <level3>3B</level3> | |
| <level3>3A</level3> | </level2> | |
| <level3>3B</level3> | | |
| </level2> | | |
| </level1> | | |
+-------------------------+------------------------------+---------------------------------------------------+
This example shows how to use GET() with XMLGET() to retrieve the content of an element. In this example, the level2
tag
contains three items (text and two nested tags), so GET returns these items in an ARRAY. The
nested tags are represented by OBJECTs (key-value pairs). The @
property contains the nested tag name and the $
property
contains the nested tag contents.
SELECT object_col,
GET(XMLGET(object_col, 'level2'), '$')
FROM xml_demo;
+-------------------------+----------------------------------------+
| OBJECT_COL | GET(XMLGET(OBJECT_COL, 'LEVEL2'), '$') |
|-------------------------+----------------------------------------|
| <level1> | [ |
| 1 | 2, |
| <level2> | { |
| 2 | "$": "3A", |
| <level3>3A</level3> | "@": "level3" |
| <level3>3B</level3> | }, |
| </level2> | { |
| </level1> | "$": "3B", |
| | "@": "level3" |
| | } |
| | ] |
+-------------------------+----------------------------------------+
This example shows how to use GET() with XMLGET() to retrieve an attribute of a tag:
INSERT INTO xml_demo (id, object_col)
SELECT 1002,
PARSE_XML('<level1> 1 <level2 an_attribute="my attribute"> 2 </level2> </level1>');
SELECT object_col,
GET(XMLGET(object_col, 'level2'), '@an_attribute')
FROM xml_demo
WHERE ID = 1002;
+--------------------------------------------------+----------------------------------------------------+
| OBJECT_COL | GET(XMLGET(OBJECT_COL, 'LEVEL2'), '@AN_ATTRIBUTE') |
|--------------------------------------------------+----------------------------------------------------|
| <level1> | "my attribute" |
| 1 | |
| <level2 an_attribute="my attribute">2</level2> | |
| </level1> | |
+--------------------------------------------------+----------------------------------------------------+