You are viewing documentation about an older version (1.4.0). View latest version

snowflake.snowpark.functions.xmlget¶

snowflake.snowpark.functions.xmlget(xml: ColumnOrName, tag: ColumnOrName, instance_num: ColumnOrName | int = 0) → Column[source]¶

Extracts an XML element object (often referred to as simply a tag) from a content of outer XML element object by the name of the tag and its instance number (counting from 0).

The following example returns the first inner level (level2) from the XML object created via parse_xml.

Example:

>>> df = session.create_dataframe(['<level1 attr1="a">1<level2 attr2="b">2<level3>3a</level3><level3>3b</level3></level2></level1>'], schema=["str"]).select(parse_xml("str").as_("obj"))
>>> df.collect()
[Row(OBJ='<level1 attr1="a">\n  1\n  <level2 attr2="b">\n    2\n    <level3>3a</level3>\n    <level3>3b</level3>\n  </level2>\n</level1>')]
>>> df.select(xmlget("obj", lit("level2")).as_("ans")).collect()
[Row(ANS='<level2 attr2="b">\n  2\n  <level3>3a</level3>\n  <level3>3b</level3>\n</level2>')]
Copy

When multiple tags exist at a level, instance_num can be used to distinguish which element to return.

Example:

>>> df.select(xmlget(xmlget("obj", lit("level2")), lit("level3"), lit(0)).as_("ans")).collect()
[Row(ANS='<level3>3a</level3>')]
>>> df.select(xmlget(xmlget("obj", lit("level2")), lit("level3"), lit(1)).as_("ans")).collect()
[Row(ANS='<level3>3b</level3>')]
>>> df.select(xmlget("obj", lit("level2"), lit(5)).as_("ans")).collect()
[Row(ANS=None)]
Copy

In order to get the tagname, the value of an attribute or the content within a tag the get function can be used.

Example:

>>> df.select(get(xmlget("obj", lit("level2")), lit("@")).as_("ans")).collect()
[Row(ANS='"level2"')]
>>> df.select(get(xmlget("obj", lit("level2")), lit("$")).as_("ans")).collect()
[Row(ANS='[\n  2,\n  {\n    "$": "3a",\n    "@": "level3"\n  },\n  {\n    "$": "3b",\n    "@": "level3"\n  }\n]')]
>>> df.select(get(xmlget(xmlget("obj", lit("level2")), lit("level3")), lit("$")).as_("ans")).collect()
[Row(ANS='"3a"')]
>>> df.select(get(xmlget("obj", lit("level2")), lit("@attr2")).as_("ans")).collect()
[Row(ANS='"b"')]
Copy