Class TableFunctions


  • public class TableFunctions
    extends Object
    Provides utility functions that generate table function expressions, which can be passed to the DataFrame join method and the Session tableFunction method.

    This class also provides functions that correspond to Snowflake system-defined table functions.

    Since:
    1.2.0
    • Method Detail

      • split_to_table

        public static TableFunction split_to_table()
        This table function splits a string (based on a specified delimiter) and flattens the results into rows.

        Argument List:

        First argument (no name): Required. Text to be split.

        Second argument (no name): Required. Text to split string by.

        Example

        
         session.tableFunction(TableFunctions.split_to_table(),
           Functions.lit("split by space"), Functions.lit(" "));
         
        Returns:
        The result TableFunction reference
        Since:
        1.2.0
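
        To make the "splits a string and flattens the results into rows" behavior concrete, here is a minimal plain-Java sketch that emulates it locally. It does not call Snowpark or Snowflake. The class SplitToTableSketch, its Row type, and the method splitToTable are hypothetical names introduced only for this illustration, and the sketch assumes the delimiter is matched as literal text and that positions are numbered from 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical local emulation; not part of the Snowpark API.
class SplitToTableSketch {

    // One output row: a 1-based position and the piece of text found there.
    static final class Row {
        final int index;
        final String value;

        Row(int index, String value) {
            this.index = index;
            this.value = value;
        }
    }

    // Splits text on a literal delimiter and returns one Row per piece.
    static List<Row> splitToTable(String text, String delimiter) {
        // Pattern.quote makes String.split treat the delimiter as plain text, not a regex.
        // The -1 limit keeps trailing empty pieces instead of dropping them.
        String[] parts = text.split(Pattern.quote(delimiter), -1);
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < parts.length; i++) {
            rows.add(new Row(i + 1, parts[i]));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Same inputs as the documented example: "split by space" split on " ".
        for (Row row : splitToTable("split by space", " ")) {
            System.out.println(row.index + " -> " + row.value);
        }
    }
}
```

        Running main prints three lines, one per word, which mirrors how a single input string becomes several output rows.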
      • split_to_table

        public static Column split_to_table(Column str,
                                            String delimiter)
        This table function splits a string (based on a specified delimiter) and flattens the results into rows.

        Example

        
         session.tableFunction(TableFunctions.split_to_table(
           Functions.lit("split by space"), " "));
         
        Parameters:
        str - Text to be split.
        delimiter - Text to split string by.
        Returns:
        The result Column reference
        Since:
        1.10.0
      • flatten

        public static TableFunction flatten()
        Flattens (explodes) compound values into multiple rows.

        Argument List:

        input: Required. The expression that will be unnested into rows. The expression must be of data type VariantType, MapType or ArrayType.

        path: Optional. The path to the element within a VariantType data structure which needs to be flattened. Can be a zero-length string (i.e. empty path) if the outermost element is to be flattened. Default: Zero-length string (i.e. empty path)

        outer: Optional boolean value. If FALSE, any input rows that cannot be expanded, either because they cannot be accessed in the path or because they have zero fields or entries, are completely omitted from the output. If TRUE, exactly one row is generated for zero-row expansions (with NULL in the KEY, INDEX, and VALUE columns). Default: FALSE

        recursive: Optional boolean value. If FALSE, only the element referenced by PATH is expanded. If TRUE, the expansion is performed for all sub-elements recursively. Default: FALSE

        mode: Optional String ("object", "array", or "both"). Specifies whether only objects, arrays, or both should be flattened. Default: both

        Example

        
         Map<String, Column> args = new HashMap<>();
         args.put("input", Functions.parse_json(Functions.lit("[1,2]")));
         session.tableFunction(TableFunctions.flatten(), args);
         
        Returns:
        The result TableFunction reference
        Since:
        1.2.0
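
        The effect of the outer argument can be easier to see with concrete data. The following plain-Java sketch emulates only that one rule on in-memory lists. It does not call Snowpark, it models just the VALUE column, and the class FlattenOuterSketch and method flattenValues are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical local emulation of the "outer" rule; not part of the Snowpark API.
class FlattenOuterSketch {

    // Expands each input array into one output value per element.
    // An empty input array is a "zero-row expansion":
    //   outer == false -> the input row is omitted from the output
    //   outer == true  -> exactly one output row is produced, holding null
    static List<Integer> flattenValues(List<List<Integer>> inputRows, boolean outer) {
        List<Integer> output = new ArrayList<>();
        for (List<Integer> row : inputRows) {
            if (row.isEmpty()) {
                if (outer) {
                    output.add(null);
                }
                continue;
            }
            output.addAll(row);
        }
        return output;
    }

    public static void main(String[] args) {
        // Two input rows: one array with two elements and one empty array.
        List<List<Integer>> input =
            Arrays.asList(Arrays.asList(1, 2), Collections.<Integer>emptyList());

        System.out.println(flattenValues(input, false)); // prints [1, 2]
        System.out.println(flattenValues(input, true));  // prints [1, 2, null]
    }
}
```

        With outer set to FALSE the empty array contributes nothing, while TRUE keeps a single placeholder row for it, which is useful when the join must not lose input rows.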
      • flatten

        public static Column flatten(Column input,
                                     String path,
                                     boolean outer,
                                     boolean recursive,
                                     String mode)
        Flattens (explodes) compound values into multiple rows.

        Example

        
         df.join(TableFunctions.flatten(
           Functions.parse_json(df.col("col")), "path", true, true, "both"));
         
        Parameters:
        input - The expression that will be unnested into rows. The expression must be of data type VariantType, MapType or ArrayType.
        path - The path to the element within a VariantType data structure which needs to be flattened. Can be a zero-length string (i.e. empty path) if the outermost element is to be flattened. Default: Zero-length string (i.e. empty path)
        outer - If FALSE, any input rows that cannot be expanded, either because they cannot be accessed in the path or because they have zero fields or entries, are completely omitted from the output. If TRUE, exactly one row is generated for zero-row expansions (with NULL in the KEY, INDEX, and VALUE columns).
        recursive - If FALSE, only the element referenced by PATH is expanded. If TRUE, the expansion is performed for all sub-elements recursively. Default: FALSE
        mode - ("object", "array", or "both") Specifies whether only objects, arrays, or both should be flattened.
        Returns:
        The result Column reference
        Since:
        1.10.0
      • flatten

        public static Column flatten(Column input)
        Flattens (explodes) compound values into multiple rows.

        Example

        
         df.join(TableFunctions.flatten(
           Functions.parse_json(df.col("col"))));
         
        Parameters:
        input - The expression that will be unnested into rows. The expression must be of data type VariantType, MapType or ArrayType.
        Returns:
        The result Column reference
        Since:
        1.10.0
      • explode

        public static Column explode(Column input)
        Flattens a given array or map type column into individual rows. For an array input column, the output column is `VALUE`. For a map input column, the output columns are `KEY` and `VALUE`.

        Example

        
         DataFrame df =
           getSession()
             .createDataFrame(
               new Row[] {Row.create("{\"a\":1, \"b\":2}")},
               StructType.create(new StructField("col", DataTypes.StringType)));
         DataFrame df1 =
           df.select(
             Functions.parse_json(df.col("col"))
               .cast(DataTypes.createMapType(DataTypes.StringType, DataTypes.IntegerType))
               .as("col"));
         df1.select(TableFunctions.explode(df1.col("col"))).show();
         
        Parameters:
        input - The expression that will be unnested into rows. The expression must be of data type MapType or ArrayType.
        Returns:
        The result Column reference
        Since:
        1.10.0
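
        As a rough picture of the output shapes described for explode, the plain-Java sketch below emulates them on in-memory collections: a map yields KEY and VALUE pairs, and an array yields VALUE entries only. It does not call Snowpark. The class ExplodeSketch and its methods are hypothetical names for this illustration, and the row order shown simply follows the insertion order of the LinkedHashMap, which is an assumption of the sketch rather than a documented guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical local emulation of explode's output shapes; not part of the Snowpark API.
class ExplodeSketch {

    // Map input: one output row per entry, with two columns {KEY, VALUE}.
    static List<String[]> explodeMap(Map<String, Integer> input) {
        List<String[]> rows = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : input.entrySet()) {
            rows.add(new String[] {entry.getKey(), String.valueOf(entry.getValue())});
        }
        return rows;
    }

    // Array input: one output row per element, with a single VALUE column.
    static List<Integer> explodeArray(List<Integer> input) {
        return new ArrayList<>(input);
    }

    public static void main(String[] args) {
        // Same data as the documented example: {"a":1, "b":2}.
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        for (String[] row : explodeMap(map)) {
            System.out.println("KEY=" + row[0] + " VALUE=" + row[1]);
        }

        for (Integer value : explodeArray(Arrays.asList(10, 20))) {
            System.out.println("VALUE=" + value);
        }
    }
}
```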