Class Functions


  • public final class Functions
    extends Object
    Provides utility functions that generate Column expression that you can pass to DataFrame transformation methods. These functions generate references to columns, literals, and SQL expression.

    This object also provides functions that correspond to Snowflake system-defined functions (built-in functions), including functions for aggregation and window functions.

    Since:
    0.9.0
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static Column abs​(Column col)
      Returns the absolute value of a numeric expression.
      static Column acos​(Column col)
      Computes the inverse cosine (arc cosine) of its input; the result is a number in the interval [-pi, pi].
      static Column add_months​(Column startDate, Column numMonths)
      Adds or subtracts a specified number of months to a date or timestamp, preserving the end-of-month information.
      static Column any_value​(Column e)
      Returns a non-deterministic value for the specified column.
      static Column approx_count_distinct​(Column col)
      Uses HyperLogLog to return an approximation of the distinct cardinality of the input (i.e.
      static Column approx_percentile​(Column col, double percentile)
      Returns an approximated value for the desired percentile.
      static Column approx_percentile_accumulate​(Column col)
      Returns the internal representation of the t-Digest state (as a JSON object) at the end of aggregation.
      static Column approx_percentile_combine​(Column state)
      Combines (merges) percentile input states into a single output state.
      static Column approx_percentile_estimate​(Column col, double percentile)
      Returns the desired approximated percentile value for the specified t-Digest state.
      static Column array​(Column... cols)
      Returns an ARRAY constructed from zero, one, or more inputs.
      static Column array_agg​(Column col)
      Returns the input values, pivoted into an ARRAY.
      static Column array_append​(Column array, Column element)
      Returns an ARRAY containing all elements from the source ARRAYas well as the new element.
      static Column array_cat​(Column array1, Column array2)
      Returns the concatenation of two ARRAYs.
      static Column array_compact​(Column array)
      Returns a compacted ARRAY with missing and null values removed, effectively converting sparse arrays into dense arrays.
      static Column array_construct​(Column... cols)
      Returns an ARRAY constructed from zero, one, or more inputs.
      static Column array_construct_compact​(Column... cols)
      Returns an ARRAY constructed from zero, one, or more inputs; the constructed ARRAY omits any NULL input values.
      static Column array_contains​(Column variant, Column array)
      Returns true if the specified VARIANT is found in the specified ARRAY.
      static Column array_insert​(Column array, Column pos, Column element)
      Returns an ARRAY containing all elements from the source ARRAY as well as the new element.
      static Column array_intersection​(Column col1, Column col2)
      Returns an ARRAY that contains the matching elements in the two input ARRAYs.
      static Column array_position​(Column variant, Column array)
      Returns the index of the first occurrence of an element in an ARRAY.
      static Column array_prepend​(Column array, Column element)
      Returns an ARRAY containing the new element as well as all elements from the source ARRAY.
      static Column array_size​(Column array)
      Returns the size of the input ARRAY.
      static Column array_slice​(Column array, Column from, Column to)
      Returns an ARRAY constructed from a specified subset of elements of the input ARRAY.
      static Column array_to_string​(Column array, Column separator)
      Returns an input ARRAY converted to a string by casting all values to strings (using TO_VARCHAR) and concatenating them (using the string from the second argument to separate the elements).
      static Column arrays_overlap​(Column a1, Column a2)
      Compares whether two arrays have at least one element in common.
      static Column as_array​(Column variant)
      Casts a VARIANT value to an array.
      static Column as_binary​(Column variant)
      Casts a VARIANT value to a binary string.
      static Column as_char​(Column variant)
      Casts a VARIANT value to a string.
      static Column as_date​(Column variant)
      Casts a VARIANT value to a date.
      static Column as_decimal​(Column variant)
      Casts a VARIANT value to a fixed-point decimal (does not match floating-point values).
      static Column as_decimal​(Column variant, int precision)
      Casts a VARIANT value to a fixed-point decimal (does not match floating-point values), with precision.
      static Column as_decimal​(Column variant, int precision, int scale)
      Casts a VARIANT value to a fixed-point decimal (does not match floating-point values), with precision and scale.
      static Column as_double​(Column variant)
      Casts a VARIANT value to a floating-point value.
      static Column as_integer​(Column variant)
      Casts a VARIANT value to an integer.
      static Column as_number​(Column variant)
      Casts a VARIANT value to a fixed-point decimal (does not match floating-point values).
      static Column as_number​(Column variant, int precision)
      Casts a VARIANT value to a fixed-point decimal (does not match floating-point values), with precision and scale.
      static Column as_number​(Column variant, int precision, int scale)
      Casts a VARIANT value to a fixed-point decimal (does not match floating-point values), with precision.
      static Column as_object​(Column variant)
      Casts a VARIANT value to an object.
      static Column as_real​(Column variant)
      Casts a VARIANT value to a floating-point value.
      static Column as_time​(Column variant)
      Casts a VARIANT value to a time value.
      static Column as_timestamp_ltz​(Column variant)
      Casts a VARIANT value to a TIMESTAMP value with local timezone.
      static Column as_timestamp_ntz​(Column variant)
      Casts a VARIANT value to a TIMESTAMP value with no timezone.
      static Column as_timestamp_tz​(Column variant)
      Casts a VARIANT value to a TIMESTAMP value with timezone.
      static Column as_varchar​(Column variant)
      Casts a VARIANT value to a string.
      static Column asc​(String name)
      Returns a Column expression with values sorted in ascending order.
      static Column ascii​(Column e)
      Returns the ASCII code for the first character of a string.
      static Column asin​(Column col)
      Computes the inverse sine (arc sine) of its argument; the result is a number in the interval [-pi, pi].
      static Column atan​(Column col)
      Computes the inverse tangent (arc tangent) of its argument; the result is a number in the interval [-pi, pi].
      static Column atan2​(Column y, Column x)
      Computes the inverse tangent (arc tangent) of the ratio of its two arguments.
      static Column avg​(Column col)
      Returns the average of non-NULL records.
      static Column base64​(Column c)
      Computes the BASE64 encoding of a column and returns it as a string column.
      static Column bitnot​(Column col)
      Returns the bitwise negation of a numeric expression.
      static Column bitshiftleft​(Column e, Column numBits)
      Shifts the bits for a numeric expression numBits positions to the left.
      static Column bitshiftright​(Column e, Column numBits)
      Shifts the bits for a numeric expression numBits positions to the right.
      static Column callUDF​(String udfName, Column... cols)
      Calls a user-defined function (UDF) by name.
      static Column ceil​(Column col)
      Returns values from the specified column rounded to the nearest equal or larger integer.
      static Column charindex​(Column targetExpr, Column sourceExpr)
      Searches for targetExpr in sourceExpr and, if successful, returns the position (1-based) of the targetExpr in sourceExpr.
      static Column charindex​(Column targetExpr, Column sourceExpr, Column position)
      Searches for targetExpr in sourceExpr and, if successful, returns the position (1-based) of the targetExpr in sourceExpr.
      static Column check_json​(Column col)
      Checks the validity of a JSON document.
      static Column check_xml​(Column col)
      Checks the validity of an XML document.
      static Column chr​(Column col)
      Converts a Unicode code point (including 7-bit ASCII) into the character that matches the input Unicode.
      static Column coalesce​(Column... cols)
      Returns the first non-NULL expression among its arguments, or NULL if all its arguments are NULL.
      static Column col​(DataFrame df)
      Generates a Column representing the result of the input DataFrame.
      static Column col​(String name)
      Creates a Column with the specified name.
      static Column collate​(Column expr, String collationSpec)
      Returns a copy of expr, but with the specified collationSpec property instead of the original collation specification property.
      static Column collation​(Column expr)
      Returns the collation specification of expr.
      static Column collect_list​(Column c)
      Returns the input values, pivoted into an ARRAY.
      static Column collect_set​(Column e)
      Aggregate function: returns a set of objects with duplicate elements eliminated.
      static Column collect_set​(String e)
      Aggregate function: returns a set of objects with duplicate elements eliminated.
      static Column concat​(Column... exprs)
      Concatenates one or more strings, or concatenates one or more binary values.
      static Column concat_ws​(Column separator, Column... exprs)
      Concatenates two or more strings, or concatenates two or more binary values.
      static Column contains​(Column col, Column str)
      Returns true if col contains str.
      static Column convert_timezone​(Column targetTimeZone, Column sourceTimestamp)
      Converts the given sourceTimestampNTZ to targetTimeZone.
      static Column convert_timezone​(Column sourceTimeZone, Column targetTimeZone, Column sourceTimestampNTZ)
      Converts the given sourceTimestampNTZ from sourceTimeZone to targetTimeZone.
      static Column corr​(Column col1, Column col2)
      Returns the correlation coefficient for non-null pairs in a group.
      static Column cos​(Column col)
      Computes the cosine of its argument; the argument should be expressed in radians.
      static Column cosh​(Column col)
      Computes the hyperbolic cosine of its argument.
      static Column count​(Column col)
      Returns either the number of non-NULL records for the specified columns, or the total number of records.
      static Column count_distinct​(Column first, Column... remaining)
      Returns either the number of non-NULL distinct records for the specified columns, or the total number of the distinct records.
      static Column countDistinct​(Column first, Column... remaining)
      Returns either the number of non-NULL distinct records for the specified columns, or the total number of the distinct records.
      static Column countDistinct​(String first, String... remaining)
      Returns either the number of non-NULL distinct records for the specified columns, or the total number of the distinct records.
      static Column covar_pop​(Column col1, Column col2)
      Calculates the population covariance for non-null pairs in a group.
      static Column covar_samp​(Column col1, Column col2)
      Calculates the sample covariance for non-null pairs in a group.
      static Column cume_dist()
      Finds the cumulative distribution of a value with regard to other values within the same window partition.
      static Column current_account()
      Returns the account used by the user's current session.
      static Column current_available_roles()
      Returns a JSON string that lists all roles granted to the current user.
      static Column current_database()
      Returns the name of the database in use for the current session.
      static Column current_date()
      Returns the current date of the system.
      static Column current_region()
      Returns the name of the region for the account where the current user is logged in.
      static Column current_role()
      Returns the name of the role in use for the current session.
      static Column current_schema()
      Returns the name of the schema in use by the current session.
      static Column current_schemas()
      Returns active search path schemas.
      static Column current_session()
      Returns a unique system identifier for the Snowflake session corresponding to the present connection.
      static Column current_statement()
      Returns the SQL text of the statement that is currently executing.
      static Column current_time()
      Returns the current time for the system.
      static Column current_timestamp()
      Returns the current timestamp for the system.
      static Column current_user()
      Returns the name of the user currently logged into the system.
      static Column current_version()
      Returns the current Snowflake version.
      static Column current_warehouse()
      Returns the name of the warehouse in use for the current session.
      static Column date_add​(Column start, Column days)
      Returns the date that is `days` days after `start` Usage - DATE_ADD( date_or_time_part, value, date_or_time_expr )
      static Column date_add​(Integer days, Column start)
      Returns the date that is `days` days after `start` Usage - DATE_ADD( date_or_time_part, value, date_or_time_expr )
      static Column date_format​(Column col, String s)
      Converts an input expression into the corresponding date in the specified date format.
      static Column date_from_parts​(Column year, Column month, Column day)
      Creates a date from individual numeric components that represent the year, month, and day of the month.
      static Column date_trunc​(String format, Column timestamp)
      Truncates a DATE, TIME, or TIMESTAMP to the specified precision.
      static Column dateadd​(String part, Column value, Column expr)
      Adds the specified value for the specified date or time art to date or time expr.
      static Column datediff​(String part, Column col1, Column col2)
      Calculates the difference between two date, time, or timestamp columns based on the date or time part requested.
      static Column dayname​(Column expr)
      Extracts the three-letter day-of-week name from the specified date or timestamp.
      static Column dayofmonth​(Column e)
      Extracts the day of month from a date or timestamp.
      static Column dayofweek​(Column e)
      Extracts the day of week from a date or timestamp.
      static Column dayofyear​(Column e)
      Extracts the day of year from a date or timestamp.
      static Column degrees​(Column e)
      Converts radians to degrees.
      static Column dense_rank()
      Returns the rank of a value within a group of values, without gaps in the ranks.
      static Column desc​(String name)  
      static Column div0​(Column dividend, Column divisor)
      Performs division like the division operator (/), but returns 0 when the divisor is 0 (rather than reporting an error).
      static Column endswith​(Column expr, Column str)
      Returns TRUE if expr ends with str.
      static Column equal_nan​(Column col)
      Returns true if the value in the column is not a number (NaN).
      static Column exp​(Column col)
      Computes Euler's number e raised to a floating-point value.
      static Column expr​(String s)
      Creates a Column expression from row SQL text.
      static Column factorial​(Column col)
      Computes the factorial of its input.
      static Column floor​(Column col)
      Returns values from the specified column rounded to the nearest equal or smaller integer.
      static Column format_number​(Column x, Integer d)
      Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places with HALF_EVEN round mode, and returns the result as a string column.
      static Column from_unixtime​(Column ut)
      Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the yyyy-MM-dd HH:mm:ss format.
      static Column from_unixtime​(Column ut, String f)
      Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the given format.
      static Column from_utc_timestamp​(Column ts)
      Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone.
      static Column get​(Column col1, Column col2)
      Extracts a value from an object or array; returns NULL if either of the arguments is NULL.
      static Column get_ignore_case​(Column obj, Column field)
      Extracts a field value from an object; returns NULL if either of the arguments is NULL.
      static Column get_path​(Column col, Column path)
      Extracts a value from semi-structured data using a path name.
      static Column greatest​(Column... cols)
      Returns the largest value from a list of expressions.
      static Column grouping​(Column col)
      Describes which of a list of expressions are grouped in a row produced by a GROUP BY query.
      static Column grouping_id​(Column... cols)
      Describes which of a list of expressions are grouped in a row produced by a GROUP BY query.
      static Column hash​(Column... cols)
      Returns a signed 64-bit hash value.
      static Column hex​(Column c)
      Computes hex value of the given column.
      static Column hour​(Column e)
      Extracts the hour from a date or timestamp.
      static Column iff​(Column condition, Column expr1, Column expr2)
      Returns one of two specified expressions, depending on a condition.
      static Column in​(Column[] columns, DataFrame df)
      Returns a conditional expression that you can pass to the filter or where method to perform the equivalent of a WHERE ...
      static Column in​(Column[] columns, List<List<Object>> values)
      Returns a conditional expression that you can pass to the filter or where method to perform the equivalent of a WHERE ...
      static Column initcap​(Column e)
      Returns the input string with the first letter of each word in uppercase and the subsequent letters in lowercase.
      static Column insert​(Column baseExpr, Column position, Column length, Column insertExpr)
      Replaces a substring of the specified length, starting at the specified position, with a new string or binary value.
      static Column instr​(Column str, String substring)
      Locate the position of the first occurrence of substr column in the given string.
      static Column is_array​(Column col)
      Returns true if the specified VARIANT column contains an ARRAY value.
      static Column is_binary​(Column col)
      Returns true if the specified VARIANT column contains a binary value.
      static Column is_boolean​(Column col)
      Returns true if the specified VARIANT column contains a Boolean value.
      static Column is_char​(Column col)
      Returns true if the specified VARIANT column contains a string value.
      static Column is_date​(Column col)
      Returns true if the specified VARIANT column contains a DATE value.
      static Column is_date_value​(Column col)
      Returns true if the specified VARIANT column contains a DATE value.
      static Column is_decimal​(Column col)
      Returns true if the specified VARIANT column contains a fixed-point decimal value or integer.
      static Column is_double​(Column col)
      Returns true if the specified VARIANT column contains a floating-point value, fixed-point decimal, or integer.
      static Column is_integer​(Column col)
      Returns true if the specified VARIANT column contains an integer value.
      static Column is_null​(Column col)
      Returns true if the value in the column is null.
      static Column is_null_value​(Column col)
      Returns true if the specified VARIANT column is a JSON null value.
      static Column is_object​(Column col)
      Returns true if the specified VARIANT column contains an OBJECT value.
      static Column is_real​(Column col)
      Returns true if the specified VARIANT column contains a floating-point value, fixed-point decimal, or integer.
      static Column is_time​(Column col)
      Returns true if the specified VARIANT column contains a TIME value.
      static Column is_timestamp_ltz​(Column col)
      Returns true if the specified VARIANT column contains a TIMESTAMP value to be interpreted using the local time zone.
      static Column is_timestamp_ntz​(Column col)
      Returns true if the specified VARIANT column contains a TIMESTAMP value with no time zone.
      static Column is_timestamp_tz​(Column col)
      Returns true if the specified VARIANT column contains a TIMESTAMP value with a time zone.
      static Column is_varchar​(Column col)
      Returns true if the specified VARIANT column contains a string value.
      static Column isnull​(Column c)
      Wrapper for Snowflake built-in isnull function.
      static Column json_extract_path_text​(Column col, Column path)
      Parses a JSON string and returns the value of an element at a specified path in the resulting JSON document.
      static Column kurtosis​(Column col)
      Returns the population excess kurtosis of non-NULL records.
      static Column lag​(Column col)
      Accesses data in a previous row in the same result set without having to join the table to itself.
      static Column lag​(Column col, int offset)
      Accesses data in a previous row in the same result set without having to join the table to itself.
      static Column lag​(Column col, int offset, Column defaultValue)
      Accesses data in a previous row in the same result set without having to join the table to itself.
      static Column last​(Column col)
      Returns the last value of the column in a group.
      static Column last_day​(Column e)
      Returns the last day of the specified date part for a date or timestamp.
      static Column lead​(Column col)
      Accesses data in a subsequent row in the same result set without having to join the table to itself.
      static Column lead​(Column col, int offset)
      Accesses data in a subsequent row in the same result set without having to join the table to itself.
      static Column lead​(Column col, int offset, Column defaultValue)
      Accesses data in a subsequent row in the same result set without having to join the table to itself.
      static Column least​(Column... cols)
      Returns the smallest value from a list of expressions.
      static Column left​(Column strExpr, Column lengthExpr)
      Returns a left most substring of strExpr.
      static Column length​(Column e)
      Returns the length of an input string or binary value.
      static Column listagg​(Column col)
      Returns the concatenated input values, separated by empty string.
      static Column listagg​(Column col, String delimiter)
      Returns the concatenated input values, separated by `delimiter` string.
      static Column listagg​(Column col, String delimiter, boolean isDistinct)
      Returns the concatenated input values, separated by `delimiter` string.
      static Column lit​(Object literal)
      Creates a Column expression for a literal value.
      static Column locate​(Column substr, Column str, int pos)
      Locate the position of the first occurrence of substr in a string column, after position pos.
      static Column locate​(String substr, Column str)
      Locate the position of the first occurrence of substr in a string column, after position pos.
      static Column log​(Column base, Column a)
      Returns the logarithm of a numeric expression.
      static Column log10​(Column col)
      Computes the logarithm of the given value in base 10.
      static Column log10​(String s)
      Computes the logarithm of the given value in base 10.
      static Column log1p​(Column col)
      Computes the logarithm of the given value in base 10.
      static Column log1p​(String s)
      Computes the logarithm of the given value in base 10.
      static Column lower​(Column e)
      Returns the input string with all characters converted to lowercase.
      static Column lpad​(Column str, Column len, Column pad)
      Left-pads a string with characters from another string, or left-pads a binary value with bytes from another binary value.
      static Column ltrim​(Column e)
      Removes leading characters, including whitespace, from a string.
      static Column ltrim​(Column e, Column trimString)
      Removes leading characters, including whitespace, from a string.
      static Column max​(Column col)
      Returns the maximum value for the records in a group.
      static Column max​(String colName)
      Returns the maximum value for the records in a group.
      static Column md5​(Column e)
      Returns a 32-character hex-encoded string containing the 128-bit MD5 message digest.
      static Column mean​(Column col)
      Returns the average of non-NULL records.
      static Column mean​(String colName)
      Returns the average of non-NULL records.
      static Column median​(Column col)
      Returns the median value for the records in a group.
      static Column min​(Column col)
      Returns the minimum value for the records in a group.
      static Column min​(String colName)
      Returns the minimum value for the records in a group.
      static Column minute​(Column e)
      Extracts the minute from a date or timestamp.
      static Column monotonically_increasing_id()
      A column expression that generates monotonically increasing 64-bit integers.
      static Column month​(Column e)
      Extracts the month from a date or timestamp.
      static Column monthname​(Column expr)
      Extracts the three-letter month name from the specified date or timestamp.
      static Column months_between​(String end, String start)
      Returns number of months between dates `start` and `end`.
      static Column negate​(Column col)
      Returns the negation of the value in the column (equivalent to a unary minus).
      static Column next_day​(Column date, Column dayOfWeek)
      Returns the date of the first specified DOW (day of week) that occurs after the input date.
      static Column not​(Column col)
      Returns the inverse of a boolean expression.
      static Column ntile​(int n)
      Window function: returns the ntile group id (from 1 to `n` inclusive) in an ordered window partition.
      static Column ntile​(Column col)
      Divides an ordered data set equally into the number of buckets specified by n.
      static Column object_construct​(Column... key_values)
      Returns an OBJECT constructed from the arguments.
      static Column object_delete​(Column obj, Column key1, Column... keys)
      Returns an object containing the contents of the input (i.e.source) object with one or more keys removed.
      static Column object_insert​(Column obj, Column key, Column value)
      Returns an object consisting of the input object with a new key-value pair inserted.
      static Column object_insert​(Column obj, Column key, Column value, Column update_flag)
      Returns an object consisting of the input object with a new key-value pair inserted (or an existing key updated with a new value).
      static Column object_keys​(Column obj)
      Returns an array containing the list of keys in the input object.
      static Column object_pick​(Column obj, Column key1, Column... keys)
      Returns a new OBJECT containing some of the key-value pairs from an existing object.
      static Column objectagg​(Column key, Column value)
      Returns one OBJECT per group.
      static Column parse_json​(Column col)
      Parse the value of the specified column as a JSON string and returns the resulting JSON document.
      static Column parse_xml​(Column col)
      Parse the value of the specified column as a JSON string and returns the resulting XML document.
      static Column percent_rank()
      Returns the relative rank of a value within a group of values, specified as a percentage ranging from 0.0 to 1.0.
      static Column pow​(Column l, Column r)
      Returns a number (l) raised to the specified power (r).
      static Column pow​(Column l, Double r)
      Returns a number (l) raised to the specified power (r).
      static Column pow​(Column l, String r)
      Returns a number (l) raised to the specified power (r).
      static Column pow​(Double l, Column r)
      Returns a number (l) raised to the specified power (r).
      static Column pow​(Double l, String r)
      Returns a number (l) raised to the specified power (r).
      static Column pow​(String l, Column r)
      Returns a number (l) raised to the specified power (r).
      static Column pow​(String l, Double r)
      Returns a number (l) raised to the specified power (r).
      static Column pow​(String l, String r)
      Returns a number (l) raised to the specified power (r).
      static Column previous_day​(Column date, Column dayOfWeek)
      Returns the date of the first specified DOW (day of week) that occurs before the input date.
      static Column quarter​(Column e)
      Extracts the quarter from a date or timestamp.
      static Column radians​(Column e)
      Converts degrees to radians.
      static Column randn()
      Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
      static Column randn​(long seed)
      Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
      static Column random()
      Each call returns a pseudo-random 64-bit integer.
      static Column random​(long seed)
      Each call returns a pseudo-random 64-bit integer.
      static Column rank()
      Returns the rank of a value within an ordered group of values.
      static Column regexp_count​(Column strExpr, Column pattern)
      Returns the number of times that a pattern occurs in a strExpr.
      static Column regexp_count​(Column strExpr, Column pattern, Column position, Column parameters)
      Returns the number of times that a pattern occurs in a strExpr.
      static Column regexp_extract​(Column col, String exp, Integer position, Integer Occurences, Integer grpIdx)
      Signature - snowflake.snowpark.functions.regexp_extract (value: Union[Column, str], regexp: Union[Column, str], idx: int) Column Extract a specific group matched by a regex, from the specified string column.
      static Column regexp_replace​(Column strExpr, Column pattern)
      Returns the subject with the specified pattern (or all occurrences of the pattern) removed.
      static Column regexp_replace​(Column strExpr, Column pattern, Column replacement)
      Returns the subject with the specified pattern (or all occurrences of the pattern) replaced by a replacement string.
      static Column repeat​(Column str, Column n)
      Builds a string by repeating the input for the specified number of times.
      static Column replace​(Column strExpr, Column pattern)
      Removes all occurrences of a specified strExpr, and optionally replaces them with replacement.
      static Column replace​(Column strExpr, Column pattern, Column replacement)
      Removes all occurrences of a specified strExpr, and optionally replaces them with replacement.
      static Column reverse​(Column name)
      Wrapper for Snowflake built-in reverse function.
      static Column right​(Column strExpr, Column lengthExpr)
      Returns a right most substring of strExpr.
      static Column round​(Column e)
      Rounds the numeric values of the given column e to 0 decimal places using the half away from zero rounding mode.
      static Column round​(Column e, int scale)
      Rounds the numeric values of the given column e to the scale decimal places using the half away from zero rounding mode.
      static Column round​(Column e, Column scale)
      Rounds the numeric values of the given column e to the scale decimal places using the half away from zero rounding mode.
      static Column row_number()
      Returns a unique row number for each row within a window partition.
      static Column rpad​(Column str, Column len, Column pad)
      Right-pads a string with characters from another string, or right-pads a binary value with bytes from another binary value.
      static Column rtrim​(Column e)
      Removes trailing characters, including whitespace, from a string.
      static Column rtrim​(Column e, Column trimString)
      Removes trailing characters, including whitespace, from a string.
      static Column second​(Column e)
      Extracts the second from a date or timestamp.
      static Column seq1()
      Generates a sequence of monotonically increasing integers, with wrap-around.
      static Column seq1​(boolean startsFromZero)
      Generates a sequence of monotonically increasing integers, with wrap-around.
      static Column seq2()
      Generates a sequence of monotonically increasing integers, with wrap-around.
      static Column seq2​(boolean startsFromZero)
      Generates a sequence of monotonically increasing integers, with wrap-around.
      static Column seq4()
      Generates a sequence of monotonically increasing integers, with wrap-around.
      static Column seq4​(boolean startsFromZero)
      Generates a sequence of monotonically increasing integers, with wrap-around.
      static Column seq8()
      Generates a sequence of monotonically increasing integers, with wrap-around.
      static Column seq8​(boolean startsFromZero)
      Generates a sequence of monotonically increasing integers, with wrap-around.
      static Column sha1​(Column e)
      Returns a 40-character hex-encoded string containing the 160-bit SHA-1 message digest.
      static Column sha2​(Column e, int numBits)
      Returns a hex-encoded string containing the N-bit SHA-2 message digest, where N is the specified output digest size.
      static Column shiftleft​(Column c, int numBits)
      Shift the given value numBits left.
      static Column shiftright​(Column c, int numBits)
      Shift the given value numBits right.
      static Column sign​(Column col)
      Returns the sign of its argument:
      static Column signum​(Column col)
      Returns the sign of its argument:
      static Column sin​(Column e)
      Computes the sine of its argument; the argument should be expressed in radians.
      static Column sinh​(Column e)
      Computes the hyperbolic sine of its argument.
      static Column size​(Column col)
      Returns the size of the input ARRAY.
      static Column skew​(Column col)
      Returns the sample skewness of non-NULL records.
      static Column soundex​(Column e)
      Returns a string that contains a phonetic representation of the input string.
      static Column split​(Column str, Column pattern)
      Splits a given string with a given separator and returns the result in an array of strings.
      static Column sqlExpr​(String sqlText)
      Creates a Column expression from row SQL text.
      static Column sqrt​(Column col)
      Returns the square-root of a non-negative numeric expression.
      static Column startswith​(Column col, Column str)
      Returns true if col starts with str.
      static Column stddev​(Column col)
      Returns the sample standard deviation (square root of sample variance) of non-NULL values.
      static Column stddev_pop​(Column col)
      Returns the population standard deviation (square root of variance) of non-NULL values.
      static Column stddev_samp​(Column col)
      Returns the sample standard deviation (square root of sample variance) of non-NULL values.
      static Column strip_null_value​(Column col)
      Converts a JSON "null" value in the specified column to a SQL NULL value.
      static Column strtok_to_array​(Column array)
      Tokenizes the given string using the given set of delimiters and returns the tokens as an array.
      static Column strtok_to_array​(Column array, Column delimiter)
      Tokenizes the given string using the given set of delimiters and returns the tokens as an array.
      static Column substring​(Column str, Column pos, Column len)
      Returns the portion of the string or binary value str, starting from the character/byte specified by pos, with limited length.
      static Column substring_index​(String col, String delim, Integer count)
      Returns the substring from string str before count occurrences of the delimiter delim.
      static Column sum​(Column col)
      Returns the sum of non-NULL records in a group.
      static Column sum​(String colName)
      Returns the sum of non-NULL records in a group.
      static Column sum_distinct​(Column col)
      Returns the sum of non-NULL distinct records in a group.
      static Column sysdate()
      Returns the current timestamp for the system, but in the UTC time zone.
      static Column tan​(Column e)
      Computes the tangent of its argument; the argument should be expressed in radians.
      static Column tanh​(Column e)
      Computes the hyperbolic tangent of its argument.
      static Column time_from_parts​(Column hour, Column minute, Column second)
      Creates a time from individual numeric components.
      static Column time_from_parts​(Column hour, Column minute, Column second, Column nanosecond)
      Creates a time from individual numeric components.
      static Column timestamp_from_parts​(Column dateExpr, Column timeExpr)
      Creates a timestamp from individual numeric components.
      static Column timestamp_from_parts​(Column year, Column month, Column day, Column hour, Column minute, Column second)
      Creates a timestamp from individual numeric components.
      static Column timestamp_from_parts​(Column year, Column month, Column day, Column hour, Column minute, Column second, Column nanosecond)
      Creates a timestamp from individual numeric components.
      static Column timestamp_ltz_from_parts​(Column year, Column month, Column day, Column hour, Column minute, Column second)
      Creates a timestamp from individual numeric components.
      static Column timestamp_ltz_from_parts​(Column year, Column month, Column day, Column hour, Column minute, Column second, Column nanosecond)
      Creates a timestamp from individual numeric components.
      static Column timestamp_ntz_from_parts​(Column dateExpr, Column timeExpr)
      Creates a timestamp from individual numeric components.
      static Column timestamp_ntz_from_parts​(Column year, Column month, Column day, Column hour, Column minute, Column second)
      Creates a timestamp from individual numeric components.
      static Column timestamp_ntz_from_parts​(Column year, Column month, Column day, Column hour, Column minute, Column second, Column nanosecond)
      Creates a timestamp from individual numeric components.
      static Column timestamp_tz_from_parts​(Column year, Column month, Column day, Column hour, Column minute, Column second)
      Creates a timestamp from individual numeric components.
      static Column timestamp_tz_from_parts​(Column year, Column month, Column day, Column hour, Column minute, Column second, Column nanosecond)
      Creates a timestamp from individual numeric components.
      static Column timestamp_tz_from_parts​(Column year, Column month, Column day, Column hour, Column minute, Column second, Column nanosecond, Column timezone)
      Creates a timestamp from individual numeric components.
      static Column to_array​(Column col)
      Converts the input expression into an array:
      static Column to_date​(Column e)
      Converts an input expression to a date.
      static Column to_date​(Column e, Column fmt)
      Converts an input expression to a date.
      static Column to_decimal​(Column expr, int precision, int scale)
      Converts an input expression to a decimal
      static Column to_json​(Column col)
      Converts any VARIANT value to a string containing the JSON representation of the value.
      static Column to_object​(Column col)
      Converts the input value to an object:
      static Column to_timestamp​(Column s)
      Converts an input expression into the corresponding timestamp.
      static Column to_timestamp​(Column s, Column fmt)
      Converts an input expression into the corresponding timestamp.
      static Column to_utc_timestamp​(Column ts)
      Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC.
      static Column to_variant​(Column col)
      Converts any value to VARIANT value or NULL (if input is NULL).
      static Column to_xml​(Column col)
      Converts any VARIANT value to a string containing the XML representation of the value.
      static Column toScalar​(DataFrame df)
      Generates a Column representing the result of the input DataFrame.
      static Column translate​(Column src, Column matchingString, Column replaceString)
      Translates src from the characters in matchingString to the characters in replaceString.
      static Column trim​(Column e, Column trimString)
      Removes leading and trailing characters from a string.
      static Column trunc​(Column expr, Column scale)
      Rounds the input expression down to the nearest (or equal) integer closer to zero, or to the nearest equal or smaller value with the specified number of places after the decimal point.
      static UserDefinedFunction udf​(JavaUDF0<?> func, DataType output)
      Registers a Java Lambda of 0 argument as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF1<?,​?> func, DataType input, DataType output)
      Registers a Java Lambda of 1 argument as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF10<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 10 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF11<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 11 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF12<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 12 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF13<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 13 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF14<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 14 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF15<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 15 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF16<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 16 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF17<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 17 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF18<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 18 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF19<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 19 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF2<?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 2 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF20<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 20 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF21<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 21 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF22<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 22 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF3<?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 3 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF4<?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 4 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF5<?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 5 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF6<?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 6 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF7<?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 7 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF8<?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 8 arguments as a Snowflake UDF and returns the UDF.
      static UserDefinedFunction udf​(JavaUDF9<?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func, DataType[] input, DataType output)
      Registers a Java Lambda of 9 arguments as a Snowflake UDF and returns the UDF.
      static Column unbase64​(Column c)
      Decodes a BASE64 encoded string column and returns it as a column.
      static Column unhex​(Column c)
      Inverse of hex.
      static Column uniform​(Column min, Column max, Column gen)
      Returns a uniformly random number, in the inclusive range (`min`, `max`)
      static Column unix_timestamp​(Column c)
      Returns the current Unix timestamp (in seconds) as a long.
      static Column upper​(Column e)
      Returns the input string with all characters converted to uppercase.
      static Column var_pop​(Column col)
      Returns the population variance of non-NULL records in a group.
      static Column var_samp​(Column col)
      Returns the sample variance of non-NULL records in a group.
      static Column variance​(Column col)
      Returns the sample variance of non-NULL records in a group.
      static Column weekofyear​(Column e)
      Extracts the week of year from a date or timestamp.
      static CaseExpr when​(Column condition, Column value)
      Works like a cascading if-then-else statement.
      static Column xmlget​(Column xml, Column tag)
      Extracts the first XML element object (often referred to as simply a tag) from a content of outer XML element object by the name of the tag
      static Column xmlget​(Column xml, Column tag, Column instance)
      Extracts an XML element object (often referred to as simply a tag) from a content of outer XML element object by the name of the tag and its instance number (counting from 0).
      static Column year​(Column e)
      Extracts the year from a date or timestamp.
    • Method Detail

      • col

        public static Column col​(String name)
        Creates a Column with the specified name.
        Parameters:
        name - The column name
        Returns:
        The result column
        Since:
        0.9.0
      • col

        public static Column col​(DataFrame df)
        Generates a Column representing the result of the input DataFrame. The parameter 'df' should have one column and must produce one row. It is an alias of toScalar function.

        For example:

        
         DataFrame df1 = session.sql("select * from values(1,1,1),(2,2,3) as T(c1, c2, c3)");
         DataFrame df2 = session.sql("select * from values(2) as T(a)");
         df1.select(Functions.col("c1"), Functions.col(df2)).show();
         
        Parameters:
        df - The sub-query Dataframe
        Returns:
        The result column
        Since:
        0.9.0
      • toScalar

        public static Column toScalar​(DataFrame df)
        Generates a Column representing the result of the input DataFrame. The parameter 'df' should have one column and must produce one row.

        For example:

        
         DataFrame df1 = session.sql("select * from values(1,1,1),(2,2,3) as T(c1, c2, c3)");
         DataFrame df2 = session.sql("select * from values(2) as T(a)");
         df1.select(Functions.col("c1"), Functions.toScalar(df2)).show();
         
        Parameters:
        df - The sub-query Dataframe
        Returns:
        The result column
        Since:
        0.9.0
      • lit

        public static Column lit​(Object literal)
        Creates a Column expression for a literal value.
        Parameters:
        literal - The literal value
        Returns:
        The result column
        Since:
        0.9.0
      • lead

        public static Column lead​(Column col,
                                  int offset,
                                  Column defaultValue)
        Accesses data in a subsequent row in the same result set without having to join the table to itself.
        Parameters:
        col - The input column
        offset - The function offset
        defaultValue - The default value of lead function
        Returns:
        The result column
        Since:
        0.9.0
      • lead

        public static Column lead​(Column col,
                                  int offset)
        Accesses data in a subsequent row in the same result set without having to join the table to itself.
        Parameters:
        col - The input column
        offset - The function offset
        Returns:
        The result column
        Since:
        0.9.0
      • lead

        public static Column lead​(Column col)
        Accesses data in a subsequent row in the same result set without having to join the table to itself.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • sqlExpr

        public static Column sqlExpr​(String sqlText)
        Creates a Column expression from row SQL text.

        Note that the function does not interpret or check the SQL text.

        Parameters:
        sqlText - The SQL query
        Returns:
        The result column
        Since:
        0.9.0
      • approx_count_distinct

        public static Column approx_count_distinct​(Column col)
        Uses HyperLogLog to return an approximation of the distinct cardinality of the input (i.e. returns an approximation of `COUNT(DISTINCT col)`).
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • avg

        public static Column avg​(Column col)
        Returns the average of non-NULL records. If all records inside a group are NULL, the function returns NULL.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • corr

        public static Column corr​(Column col1,
                                  Column col2)
        Returns the correlation coefficient for non-null pairs in a group.
        Parameters:
        col1 - The first input column
        col2 - The second input column
        Returns:
        The result column
        Since:
        0.9.0
      • count

        public static Column count​(Column col)
        Returns either the number of non-NULL records for the specified columns, or the total number of records.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • countDistinct

        public static Column countDistinct​(String first,
                                           String... remaining)
        Returns either the number of non-NULL distinct records for the specified columns, or the total number of the distinct records. An alias of count_distinct.
        Parameters:
        first - The first column name
        remaining - A column name list except the first column name
        Returns:
        The result column
        Since:
        1.13.0
      • countDistinct

        public static Column countDistinct​(Column first,
                                           Column... remaining)
        Returns either the number of non-NULL distinct records for the specified columns, or the total number of the distinct records. An alias of count_distinct.
        Parameters:
        first - The first column
        remaining - A column list except the first column
        Returns:
        The result column
        Since:
        1.13.0
      • count_distinct

        public static Column count_distinct​(Column first,
                                            Column... remaining)
        Returns either the number of non-NULL distinct records for the specified columns, or the total number of the distinct records.
        Parameters:
        first - The first column
        remaining - A column list except the first column
        Returns:
        The result column
        Since:
        0.9.0
      • covar_pop

        public static Column covar_pop​(Column col1,
                                       Column col2)
        Calculates the population covariance for non-null pairs in a group.
        Parameters:
        col1 - The first column
        col2 - The second column
        Returns:
        The result column
        Since:
        0.9.0
      • covar_samp

        public static Column covar_samp​(Column col1,
                                        Column col2)
        Calculates the sample covariance for non-null pairs in a group.
        Parameters:
        col1 - The first column
        col2 - The second column
        Returns:
        The result column
        Since:
        0.9.0
      • grouping

        public static Column grouping​(Column col)
        Describes which of a list of expressions are grouped in a row produced by a GROUP BY query.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • grouping_id

        public static Column grouping_id​(Column... cols)
        Describes which of a list of expressions are grouped in a row produced by a GROUP BY query.
        Parameters:
        cols - A list of input column
        Returns:
        The result column
        Since:
        0.9.0
      • kurtosis

        public static Column kurtosis​(Column col)
        Returns the population excess kurtosis of non-NULL records. If all records inside a group are NULL, the function returns NULL.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • max

        public static Column max​(String colName)
        Returns the maximum value for the records in a group. NULL values are ignored unless all the records are NULL, in which case a NULL value is returned.

        Example:

        
         DataFrame df = session.createDataFrame(
                 new Row[] {Row.create(1), Row.create(3), Row.create(10), Row.create(1), Row.create(3)},
                 StructType.create(new StructField("x", DataTypes.IntegerType))
         );
         df.select(max("x")).show();
        
         ----------------
         |"MAX(""X"")"  |
         ----------------
         |10            |
         ----------------
         
        Parameters:
        colName - The name of the column
        Returns:
        The maximum value of the given column
        Since:
        1.13.0
      • max

        public static Column max​(Column col)
        Returns the maximum value for the records in a group. NULL values are ignored unless all the records are NULL, in which case a NULL value is returned.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • min

        public static Column min​(String colName)
        Returns the minimum value for the records in a group. NULL values are ignored unless all the records are NULL, in which case a NULL value is returned.

        Example:

        
         DataFrame df = session.createDataFrame(
                 new Row[] {Row.create(1), Row.create(3), Row.create(10), Row.create(1), Row.create(3)},
                 StructType.create(new StructField("x", DataTypes.IntegerType))
         );
         df.select(min("x")).show();
        
         ----------------
         |"MIN(""X"")"  |
         ----------------
         |1             |
         ----------------
         
        Parameters:
        colName - The name of the column
        Returns:
        The minimum value of the given column
        Since:
        1.13.0
      • min

        public static Column min​(Column col)
        Returns the minimum value for the records in a group. NULL values are ignored unless all the records are NULL, in which case a NULL value is returned.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • mean

        public static Column mean​(String colName)
        Returns the average of non-NULL records. If all records inside a group are NULL, the function returns NULL. Alias of avg.

        Example:

        
         DataFrame df = session.createDataFrame(
                 new Row[] {Row.create(1), Row.create(3), Row.create(10), Row.create(1), Row.create(3)},
                 StructType.create(new StructField("x", DataTypes.IntegerType))
         );
         df.select(mean("x")).show();
        
         ----------------
         |"AVG(""X"")"  |
         ----------------
         |3.600000      |
         ----------------
         
        Parameters:
        colName - The name of the column
        Returns:
        The average value of the given column
        Since:
        1.13.0
      • mean

        public static Column mean​(Column col)
        Returns the average of non-NULL records. If all records inside a group are NULL, the function returns NULL. Alias of avg
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • median

        public static Column median​(Column col)
        Returns the median value for the records in a group. NULL values are ignored unless all the records are NULL, in which case a NULL value is returned.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • skew

        public static Column skew​(Column col)
        Returns the sample skewness of non-NULL records. If all records inside a group are NULL, the function returns NULL.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • stddev

        public static Column stddev​(Column col)
        Returns the sample standard deviation (square root of sample variance) of non-NULL values. If all records inside a group are NULL, returns NULL.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • stddev_samp

        public static Column stddev_samp​(Column col)
        Returns the sample standard deviation (square root of sample variance) of non-NULL values. If all records inside a group are NULL, returns NULL. Alias of stddev
        Parameters:
        col - The input Column
        Returns:
        The result Column
        Since:
        0.9.0
      • stddev_pop

        public static Column stddev_pop​(Column col)
        Returns the population standard deviation (square root of variance) of non-NULL values. If all records inside a group are NULL, returns NULL.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • sum

        public static Column sum​(Column col)
        Returns the sum of non-NULL records in a group. If all records inside a group are NULL, the function returns NULL.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • sum

        public static Column sum​(String colName)
        Returns the sum of non-NULL records in a group. If all records inside a group are NULL, the function returns NULL.
        Parameters:
        colName - The input column name
        Returns:
        The result column
        Since:
        1.12.0
      • sum_distinct

        public static Column sum_distinct​(Column col)
        Returns the sum of non-NULL distinct records in a group. You can use the DISTINCT keyword to compute the sum of unique non-null values. If all records inside a group are NULL, the function returns NULL.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • variance

        public static Column variance​(Column col)
        Returns the sample variance of non-NULL records in a group. If all records inside a group are NULL, a NULL is returned.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • var_samp

        public static Column var_samp​(Column col)
        Returns the sample variance of non-NULL records in a group. If all records inside a group are NULL, a NULL is returned. Alias of var_samp
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • var_pop

        public static Column var_pop​(Column col)
        Returns the population variance of non-NULL records in a group. If all records inside a group are NULL, a NULL is returned.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • approx_percentile

        public static Column approx_percentile​(Column col,
                                               double percentile)
        Returns an approximated value for the desired percentile. This function uses the t-Digest algorithm.
        Parameters:
        col - The input column
        percentile - The desired percentile
        Returns:
        The result column
        Since:
        0.9.0
      • approx_percentile_accumulate

        public static Column approx_percentile_accumulate​(Column col)
        Returns the internal representation of the t-Digest state (as a JSON object) at the end of aggregation. This function uses the t-Digest algorithm.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • approx_percentile_estimate

        public static Column approx_percentile_estimate​(Column col,
                                                        double percentile)
        Returns the desired approximated percentile value for the specified t-Digest state. APPROX_PERCENTILE_ESTIMATE(APPROX_PERCENTILE_ACCUMULATE(.)) is equivalent to APPROX_PERCENTILE(.).
        Parameters:
        col - The input column
        percentile - The desired percentile
        Returns:
        The result column
        Since:
        0.9.0
      • approx_percentile_combine

        public static Column approx_percentile_combine​(Column state)
        Combines (merges) percentile input states into a single output state.

        This allows scenarios where APPROX_PERCENTILE_ACCUMULATE is run over horizontal partitions of the same table, producing an algorithm state for each table partition. These states can later be combined using APPROX_PERCENTILE_COMBINE, producing the same output state as a single run of APPROX_PERCENTILE_ACCUMULATE over the entire table.

        Parameters:
        state - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • cume_dist

        public static Column cume_dist()
        Finds the cumulative distribution of a value with regard to other values within the same window partition.
        Returns:
        The result column
        Since:
        0.9.0
      • dense_rank

        public static Column dense_rank()
        Returns the rank of a value within a group of values, without gaps in the ranks. The rank value starts at 1 and continues up sequentially. If two values are the same, they will have the same rank.
        Returns:
        The result column
        Since:
        0.9.0
      • lag

        public static Column lag​(Column col,
                                 int offset,
                                 Column defaultValue)
        Accesses data in a previous row in the same result set without having to join the table to itself.
        Parameters:
        col - The input column
        offset - The function offset
        defaultValue - The default value
        Returns:
        The result column
        Since:
        0.9.0
      • lag

        public static Column lag​(Column col,
                                 int offset)
        Accesses data in a previous row in the same result set without having to join the table to itself.
        Parameters:
        col - The input column
        offset - The function offset
        Returns:
        The result column
        Since:
        0.9.0
      • lag

        public static Column lag​(Column col)
        Accesses data in a previous row in the same result set without having to join the table to itself.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • ntile

        public static Column ntile​(Column col)
        Divides an ordered data set equally into the number of buckets specified by n. Buckets are sequentially numbered 1 through n.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • percent_rank

        public static Column percent_rank()
        Returns the relative rank of a value within a group of values, specified as a percentage ranging from 0.0 to 1.0.
        Returns:
        The result column
        Since:
        0.9.0
      • rank

        public static Column rank()
        Returns the rank of a value within an ordered group of values. The rank value starts at 1 and continues up.
        Returns:
        The result column
        Since:
        0.1.0
      • row_number

        public static Column row_number()
        Returns a unique row number for each row within a window partition. The row number starts at 1 and continues up sequentially.
        Returns:
        The result column
        Since:
        0.9.0
      • coalesce

        public static Column coalesce​(Column... cols)
        Returns the first non-NULL expression among its arguments, or NULL if all its arguments are NULL.
        Parameters:
        cols - The list of input column
        Returns:
        The result column
        Since:
        0.9.0
      • equal_nan

        public static Column equal_nan​(Column col)
        Returns true if the value in the column is not a number (NaN).
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • is_null

        public static Column is_null​(Column col)
        Returns true if the value in the column is null.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.1.0
      • negate

        public static Column negate​(Column col)
        Returns the negation of the value in the column (equivalent to a unary minus).
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • not

        public static Column not​(Column col)
        Returns the inverse of a boolean expression.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • random

        public static Column random​(long seed)
        Each call returns a pseudo-random 64-bit integer.
        Parameters:
        seed - The random seed
        Returns:
        The result column
        Since:
        0.9.0
      • random

        public static Column random()
        Each call returns a pseudo-random 64-bit integer.
        Returns:
        The result column
        Since:
        0.9.0
      • bitnot

        public static Column bitnot​(Column col)
        Returns the bitwise negation of a numeric expression.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.9.0
      • to_decimal

        public static Column to_decimal​(Column expr,
                                        int precision,
                                        int scale)
        Converts an input expression to a decimal
        Parameters:
        expr - The input column
        precision - The precision
        scale - The scale
        Returns:
        The result column
        Since:
        0.9.0
      • div0

        public static Column div0​(Column dividend,
                                  Column divisor)
        Performs division like the division operator (/), but returns 0 when the divisor is 0 (rather than reporting an error).
        Parameters:
        dividend - The dividend
        divisor - The divisor
        Returns:
        The result column
        Since:
        0.9.0
      • sqrt

        public static Column sqrt​(Column col)
        Returns the square-root of a non-negative numeric expression.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • abs

        public static Column abs​(Column col)
        Returns the absolute value of a numeric expression.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • acos

        public static Column acos​(Column col)
        Computes the inverse cosine (arc cosine) of its input; the result is a number in the interval [-pi, pi].
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • asin

        public static Column asin​(Column col)
        Computes the inverse sine (arc sine) of its argument; the result is a number in the interval [-pi, pi].
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • atan

        public static Column atan​(Column col)
        Computes the inverse tangent (arc tangent) of its argument; the result is a number in the interval [-pi, pi].
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • atan2

        public static Column atan2​(Column y,
                                   Column x)
        Computes the inverse tangent (arc tangent) of the ratio of its two arguments.
        Parameters:
        y - The value of y
        x - The value of x
        Returns:
        The result column
        Since:
        0.9.0
      • ceil

        public static Column ceil​(Column col)
        Returns values from the specified column rounded to the nearest equal or larger integer.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • floor

        public static Column floor​(Column col)
        Returns values from the specified column rounded to the nearest equal or smaller integer.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • cos

        public static Column cos​(Column col)
        Computes the cosine of its argument; the argument should be expressed in radians.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • cosh

        public static Column cosh​(Column col)
        Computes the hyperbolic cosine of its argument.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • exp

        public static Column exp​(Column col)
        Computes Euler's number e raised to a floating-point value.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • factorial

        public static Column factorial​(Column col)
        Computes the factorial of its input. The input argument must be an integer expression in the range of 0 to 33.
        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • greatest

        public static Column greatest​(Column... cols)
        Returns the largest value from a list of expressions. If any of the argument values is NULL, the result is NULL. GREATEST supports all data types, including VARIANT.
        Parameters:
        cols - The list of input column
        Returns:
        The result column
        Since:
        0.9.0
      • least

        public static Column least​(Column... cols)
        Returns the smallest value from a list of expressions. LEAST supports all data types, including VARIANT.
        Parameters:
        cols - The list of input column
        Returns:
        The result column
        Since:
        0.9.0
      • log

        public static Column log​(Column base,
                                 Column a)
        Returns the logarithm of a numeric expression.
        Parameters:
        base - The value of base
        a - the value of A
        Returns:
        The result column
        Since:
        0.9.0
      • pow

        public static Column pow​(Column l,
                                 Column r)
        Returns a number (l) raised to the specified power (r).
        Parameters:
        l - The value of l
        r - The value of r
        Returns:
        The result column
        Since:
        0.9.0
      • pow

        public static Column pow​(Column l,
                                 String r)
        Returns a number (l) raised to the specified power (r).

        Example:

        
         DataFrame df = session.sql("select * from (values (0.1, 2), (2, 3), (2, 0.5), (2, -1)) as T(base, exponent)");
         df.select(col("base"), col("exponent"), pow(col("base"), "exponent").as("result")).show();
        
         ----------------------------------------------
         |"BASE"  |"EXPONENT"  |"RESULT"              |
         ----------------------------------------------
         |0.1     |2.0         |0.010000000000000002  |
         |2.0     |3.0         |8.0                   |
         |2.0     |0.5         |1.4142135623730951    |
         |2.0     |-1.0        |0.5                   |
         ----------------------------------------------
         
        Parameters:
        l - The numeric column representing the base.
        r - The name of the numeric column representing the exponent.
        Returns:
        A column containing the result of raising l to the power of r.
        Since:
        1.15.0
      • pow

        public static Column pow​(String l,
                                 Column r)
        Returns a number (l) raised to the specified power (r).

        Example:

        
         DataFrame df = session.sql("select * from (values (0.1, 2), (2, 3), (2, 0.5), (2, -1)) as T(base, exponent)");
         df.select(col("base"), col("exponent"), pow("base", col("exponent")).as("result")).show();
        
         ----------------------------------------------
         |"BASE"  |"EXPONENT"  |"RESULT"              |
         ----------------------------------------------
         |0.1     |2.0         |0.010000000000000002  |
         |2.0     |3.0         |8.0                   |
         |2.0     |0.5         |1.4142135623730951    |
         |2.0     |-1.0        |0.5                   |
         ----------------------------------------------
         
        Parameters:
        l - The name of the numeric column representing the base.
        r - The numeric column representing the exponent.
        Returns:
        A column containing the result of raising l to the power of r.
        Since:
        1.15.0
      • pow

        public static Column pow​(String l,
                                 String r)
        Returns a number (l) raised to the specified power (r).

        Example:

        
         DataFrame df = session.sql("select * from (values (0.1, 2), (2, 3), (2, 0.5), (2, -1)) as T(base, exponent)");
         df.select(col("base"), col("exponent"), pow("base", "exponent").as("result")).show();
        
         ----------------------------------------------
         |"BASE"  |"EXPONENT"  |"RESULT"              |
         ----------------------------------------------
         |0.1     |2.0         |0.010000000000000002  |
         |2.0     |3.0         |8.0                   |
         |2.0     |0.5         |1.4142135623730951    |
         |2.0     |-1.0        |0.5                   |
         ----------------------------------------------
         
        Parameters:
        l - The name of the numeric column representing the base.
        r - The name of the numeric column representing the exponent.
        Returns:
        A column containing the result of raising l to the power of r.
        Since:
        1.15.0
      • pow

        public static Column pow​(Column l,
                                 Double r)
        Returns a number (l) raised to the specified power (r).

        Example:

        
         DataFrame df = session.sql("select * from (values (0.5), (2), (2.5), (4)) as T(base)");
         df.select(col("base"), lit(2.0).as("exponent"), pow(col("base"), 2.0).as("result")).show();
        
         ----------------------------------
         |"BASE"  |"EXPONENT"  |"RESULT"  |
         ----------------------------------
         |0.5     |2.0         |0.25      |
         |2.0     |2.0         |4.0       |
         |2.5     |2.0         |6.25      |
         |4.0     |2.0         |16.0      |
         ----------------------------------
         
        Parameters:
        l - The numeric column representing the base.
        r - The value of the exponent.
        Returns:
        A column containing the result of raising l to the power of r.
        Since:
        1.15.0
      • pow

        public static Column pow​(String l,
                                 Double r)
        Returns a number (l) raised to the specified power (r).

        Example:

        
         DataFrame df = session.sql("select * from (values (0.5), (2), (2.5), (4)) as T(base)");
         df.select(col("base"), lit(2.0).as("exponent"), pow("base", 2.0).as("result")).show();
        
         ----------------------------------
         |"BASE"  |"EXPONENT"  |"RESULT"  |
         ----------------------------------
         |0.5     |2.0         |0.25      |
         |2.0     |2.0         |4.0       |
         |2.5     |2.0         |6.25      |
         |4.0     |2.0         |16.0      |
         ----------------------------------
         
        Parameters:
        l - The name of the numeric column representing the base.
        r - The value of the exponent.
        Returns:
        A column containing the result of raising l to the power of r.
        Since:
        1.15.0
      • pow

        public static Column pow​(Double l,
                                 Column r)
        Returns a number (l) raised to the specified power (r).

        Example:

        
         DataFrame df = session.sql("select * from (values (0.5), (2), (2.5), (4)) as T(exponent)");
         df.select(lit(2.0).as("base"), col("exponent"), pow(2.0, col("exponent")).as("result")).show();
        
         --------------------------------------------
         |"BASE"  |"EXPONENT"  |"RESULT"            |
         --------------------------------------------
         |2.0     |0.5         |1.4142135623730951  |
         |2.0     |2.0         |4.0                 |
         |2.0     |2.5         |5.656854249492381   |
         |2.0     |4.0         |16.0                |
         --------------------------------------------
         
        Parameters:
        l - The value of the base.
        r - The numeric column representing the exponent.
        Returns:
        A column containing the result of raising l to the power of r.
        Since:
        1.15.0
      • pow

        public static Column pow​(Double l,
                                 String r)
        Returns a number (l) raised to the specified power (r).

        Example:

        
         DataFrame df = session.sql("select * from (values (0.5), (2), (2.5), (4)) as T(exponent)");
         df.select(lit(2.0).as("base"), col("exponent"), pow(2.0, "exponent").as("result")).show();
        
         --------------------------------------------
         |"BASE"  |"EXPONENT"  |"RESULT"            |
         --------------------------------------------
         |2.0     |0.5         |1.4142135623730951  |
         |2.0     |2.0         |4.0                 |
         |2.0     |2.5         |5.656854249492381   |
         |2.0     |4.0         |16.0                |
         --------------------------------------------
         
        Parameters:
        l - The value of the base.
        r - The name of the numeric column representing the exponent.
        Returns:
        A column containing the result of raising l to the power of r.
        Since:
        1.15.0
      • round

        public static Column round​(Column e,
                                   Column scale)
        Rounds the numeric values of the given column e to the scale decimal places using the half away from zero rounding mode.

        Example:

        
         DataFrame df = session.sql("select * from (values (-3.78), (-2.55), (1.23), (2.55), (3.78)) as T(a)");
         df.select(round(col("a"), lit(1)).alias("round")).show();
        
         -----------
         |"ROUND"  |
         -----------
         |-3.8     |
         |-2.6     |
         |1.2      |
         |2.6      |
         |3.8      |
         -----------
         
        Parameters:
        e - The input column
        scale - The scale
        Returns:
        The result column
        Since:
        0.9.0
      • round

        public static Column round​(Column e)
        Rounds the numeric values of the given column e to 0 decimal places using the half away from zero rounding mode.

        Example:

        
         DataFrame df = session.sql("select * from (values (-3.7), (-2.5), (1.2), (2.5), (3.7)) as T(a)");
         df.select(round(col("a")).alias("round")).show();
        
         -----------
         |"ROUND"  |
         -----------
         |-4       |
         |-3       |
         |1        |
         |3        |
         |4        |
         -----------
         
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • round

        public static Column round​(Column e,
                                   int scale)
        Rounds the numeric values of the given column e to the scale decimal places using the half away from zero rounding mode.

        Example:

        
         DataFrame df = session.sql("select * from (values (-3.78), (-2.55), (1.23), (2.55), (3.78)) as T(a)");
         df.select(round(col("a"), 1).alias("round")).show();
        
         -----------
         |"ROUND"  |
         -----------
         |-3.8     |
         |-2.6     |
         |1.2      |
         |2.6      |
         |3.8      |
         -----------
         
        Parameters:
        e - The column of numeric values to round.
        scale - The number of decimal places to which e should be rounded.
        Returns:
        A new column containing the rounded numeric values.
        Since:
        1.14.0
      • bitshiftleft

        public static Column bitshiftleft​(Column e,
                                          Column numBits)
        Shifts the bits for a numeric expression numBits positions to the left.
        Parameters:
        e - The input column
        numBits - The number of bits
        Returns:
        The result column
        Since:
        0.9.0
      • bitshiftright

        public static Column bitshiftright​(Column e,
                                           Column numBits)
        Shifts the bits for a numeric expression numBits positions to the right.
        Parameters:
        e - The input column
        numBits - The number of bits
        Returns:
        The result column
        Since:
        0.9.0
      • sin

        public static Column sin​(Column e)
        Computes the sine of its argument; the argument should be expressed in radians.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • sinh

        public static Column sinh​(Column e)
        Computes the hyperbolic sine of its argument.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • tan

        public static Column tan​(Column e)
        Computes the tangent of its argument; the argument should be expressed in radians.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • tanh

        public static Column tanh​(Column e)
        Computes the hyperbolic tangent of its argument.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.9.0
      • degrees

        public static Column degrees​(Column e)
        Converts radians to degrees.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • radians

        public static Column radians​(Column e)
        Converts degrees to radians.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • md5

        public static Column md5​(Column e)
        Returns a 32-character hex-encoded string containing the 128-bit MD5 message digest.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • sha1

        public static Column sha1​(Column e)
        Returns a 40-character hex-encoded string containing the 160-bit SHA-1 message digest.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • sha2

        public static Column sha2​(Column e,
                                  int numBits)
        Returns a hex-encoded string containing the N-bit SHA-2 message digest, where N is the specified output digest size.
        Parameters:
        e - The input column
        numBits - The number of bits
        Returns:
        The result column
        Since:
        0.11.0
      • hash

        public static Column hash​(Column... cols)
        Returns a signed 64-bit hash value. Note that HASH never returns NULL, even for NULL inputs.
        Parameters:
        cols - The list of input column
        Returns:
        The result column
        Since:
        0.11.0
      • ascii

        public static Column ascii​(Column e)
        Returns the ASCII code for the first character of a string. If the string is empty, a value of 0 is returned.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • concat_ws

        public static Column concat_ws​(Column separator,
                                       Column... exprs)
        Concatenates two or more strings, or concatenates two or more binary values. If any of the values is null, the result is also null.
        Parameters:
        separator - The separator
        exprs - The list of input values
        Returns:
        The result column
        Since:
        0.11.0
      • initcap

        public static Column initcap​(Column e)
        Returns the input string with the first letter of each word in uppercase and the subsequent letters in lowercase.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • length

        public static Column length​(Column e)
        Returns the length of an input string or binary value. For strings, the length is the number of characters, and UTF-8 characters are counted as a single character. For binary, the length is the number of bytes.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • lower

        public static Column lower​(Column e)
        Returns the input string with all characters converted to lowercase.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • upper

        public static Column upper​(Column e)
        Returns the input string with all characters converted to uppercase.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • lpad

        public static Column lpad​(Column str,
                                  Column len,
                                  Column pad)
        Left-pads a string with characters from another string, or left-pads a binary value with bytes from another binary value.
        Parameters:
        str - The input string
        len - The length
        pad - The pad
        Returns:
        The result column
        Since:
        0.11.0
      • rpad

        public static Column rpad​(Column str,
                                  Column len,
                                  Column pad)
        Right-pads a string with characters from another string, or right-pads a binary value with bytes from another binary value.
        Parameters:
        str - The input string
        len - The length
        pad - The pad
        Returns:
        The result column
        Since:
        0.11.0
      • ltrim

        public static Column ltrim​(Column e,
                                   Column trimString)
        Removes leading characters, including whitespace, from a string.
        Parameters:
        e - The input column
        trimString - The trim string
        Returns:
        The result column
        Since:
        0.11.0
      • ltrim

        public static Column ltrim​(Column e)
        Removes leading characters, including whitespace, from a string.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • rtrim

        public static Column rtrim​(Column e,
                                   Column trimString)
        Removes trailing characters, including whitespace, from a string.
        Parameters:
        e - The input column
        trimString - The trim string
        Returns:
        The result column
        Since:
        0.11.0
      • rtrim

        public static Column rtrim​(Column e)
        Removes trailing characters, including whitespace, from a string.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • trim

        public static Column trim​(Column e,
                                  Column trimString)
        Removes leading and trailing characters from a string.
        Parameters:
        e - The input column
        trimString - The trim string
        Returns:
        The result column
        Since:
        0.11.0
      • repeat

        public static Column repeat​(Column str,
                                    Column n)
        Builds a string by repeating the input for the specified number of times.
        Parameters:
        str - The input string
        n - The time of repeat
        Returns:
        The result column
        Since:
        0.11.0
      • soundex

        public static Column soundex​(Column e)
        Returns a string that contains a phonetic representation of the input string.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • split

        public static Column split​(Column str,
                                   Column pattern)
        Splits a given string with a given separator and returns the result in an array of strings.
        Parameters:
        str - The input string
        pattern - The pattern
        Returns:
        The result column
        Since:
        0.11.0
      • substring

        public static Column substring​(Column str,
                                       Column pos,
                                       Column len)
        Returns the portion of the string or binary value str, starting from the character/byte specified by pos, with limited length.
        Parameters:
        str - The input string
        len - The length
        pos - The position
        Returns:
        The result column
        Since:
        0.11.0
      • any_value

        public static Column any_value​(Column e)
        Returns a non-deterministic value for the specified column.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        1.2.0
      • translate

        public static Column translate​(Column src,
                                       Column matchingString,
                                       Column replaceString)
        Translates src from the characters in matchingString to the characters in replaceString.
        Parameters:
        src - The source column
        matchingString - The matching string
        replaceString - The replacement
        Returns:
        The result column
        Since:
        0.11.0
      • contains

        public static Column contains​(Column col,
                                      Column str)
        Returns true if col contains str.
        Parameters:
        col - The input column
        str - The target string
        Returns:
        The result column
        Since:
        0.11.0
      • startswith

        public static Column startswith​(Column col,
                                        Column str)
        Returns true if col starts with str.
        Parameters:
        col - The input column
        str - The target string
        Returns:
        The result column
        Since:
        0.11.0
      • chr

        public static Column chr​(Column col)
        Converts a Unicode code point (including 7-bit ASCII) into the character that matches the input Unicode.

        alias for Snowflake char function.

        Parameters:
        col - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • add_months

        public static Column add_months​(Column startDate,
                                        Column numMonths)
        Adds or subtracts a specified number of months to a date or timestamp, preserving the end-of-month information.
        Parameters:
        startDate - The start date
        numMonths - the number of Months
        Returns:
        The result column
        Since:
        0.11.0
      • current_date

        public static Column current_date()
        Returns the current date of the system.
        Returns:
        The result column
        Since:
        0.11.0
      • current_timestamp

        public static Column current_timestamp()
        Returns the current timestamp for the system.
        Returns:
        The result column
        Since:
        0.11.0
      • current_region

        public static Column current_region()
        Returns the name of the region for the account where the current user is logged in.
        Returns:
        The result column
        Since:
        0.11.0
      • current_time

        public static Column current_time()
        Returns the current time for the system.
        Returns:
        The result column
        Since:
        0.11.0
      • current_version

        public static Column current_version()
        Returns the current Snowflake version.
        Returns:
        The result column
        Since:
        0.11.0
      • current_account

        public static Column current_account()
        Returns the account used by the user's current session.
        Returns:
        The result column
        Since:
        0.11.0
      • current_role

        public static Column current_role()
        Returns the name of the role in use for the current session.
        Returns:
        The result column
        Since:
        0.11.0
      • current_available_roles

        public static Column current_available_roles()
        Returns a JSON string that lists all roles granted to the current user.
        Returns:
        The result column
        Since:
        0.11.0
      • current_session

        public static Column current_session()
        Returns a unique system identifier for the Snowflake session corresponding to the present connection.
        Returns:
        The result column
        Since:
        0.11.0
      • current_statement

        public static Column current_statement()
        Returns the SQL text of the statement that is currently executing.
        Returns:
        The result column
        Since:
        0.11.0
      • current_user

        public static Column current_user()
        Returns the name of the user currently logged into the system.
        Returns:
        The result column
        Since:
        0.11.0
      • current_database

        public static Column current_database()
        Returns the name of the database in use for the current session.
        Returns:
        The result column
        Since:
        0.11.0
      • current_schema

        public static Column current_schema()
        Returns the name of the schema in use by the current session.
        Returns:
        The result column
        Since:
        0.11.0
      • current_schemas

        public static Column current_schemas()
        Returns active search path schemas.
        Returns:
        The result column
        Since:
        0.11.0
      • current_warehouse

        public static Column current_warehouse()
        Returns the name of the warehouse in use for the current session.
        Returns:
        The result column
        Since:
        0.11.0
      • sysdate

        public static Column sysdate()
        Returns the current timestamp for the system, but in the UTC time zone.
        Returns:
        The result column
        Since:
        0.11.0
      • convert_timezone

        public static Column convert_timezone​(Column sourceTimeZone,
                                              Column targetTimeZone,
                                              Column sourceTimestampNTZ)
        Converts the given sourceTimestampNTZ from sourceTimeZone to targetTimeZone.

        Supported time zones are listed here

        Example

        
         df.select(Functions.convert_timezone(Functions.lit("America/Los_Angeles"),
         Functions.lit("America/New_York"), df.col("time")));
         
        Parameters:
        sourceTimestampNTZ - The timestamp
        targetTimeZone - The target time zone
        sourceTimeZone - The source time zone
        Returns:
        The result column
        Since:
        0.11.0
      • convert_timezone

        public static Column convert_timezone​(Column targetTimeZone,
                                              Column sourceTimestamp)
        Converts the given sourceTimestampNTZ to targetTimeZone.

        Supported time zones are listed here

        Example

        
         df.select(Functions.convert_timezone(Functions.lit("America/New_York"), df.col("time")));
         
        Parameters:
        sourceTimestamp - The timestamp
        targetTimeZone - The target time zone
        Returns:
        The result column
        Since:
        0.11.0
      • year

        public static Column year​(Column e)
        Extracts the year from a date or timestamp.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • quarter

        public static Column quarter​(Column e)
        Extracts the quarter from a date or timestamp.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • month

        public static Column month​(Column e)
        Extracts the month from a date or timestamp.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • dayofweek

        public static Column dayofweek​(Column e)
        Extracts the day of week from a date or timestamp.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • dayofmonth

        public static Column dayofmonth​(Column e)
        Extracts the day of month from a date or timestamp.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • dayofyear

        public static Column dayofyear​(Column e)
        Extracts the day of year from a date or timestamp.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • last_day

        public static Column last_day​(Column e)
        Returns the last day of the specified date part for a date or timestamp. Commonly used to return the last day of the month for a date or timestamp.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • weekofyear

        public static Column weekofyear​(Column e)
        Extracts the week of year from a date or timestamp.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • hour

        public static Column hour​(Column e)
        Extracts the hour from a date or timestamp.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • minute

        public static Column minute​(Column e)
        Extracts the minute from a date or timestamp.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • second

        public static Column second​(Column e)
        Extracts the second from a date or timestamp.
        Parameters:
        e - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • next_day

        public static Column next_day​(Column date,
                                      Column dayOfWeek)
        Returns the date of the first specified DOW (day of week) that occurs after the input date.
        Parameters:
        date - The date
        dayOfWeek - The day of week
        Returns:
        The result column
        Since:
        0.11.0
      • previous_day

        public static Column previous_day​(Column date,
                                          Column dayOfWeek)
        Returns the date of the first specified DOW (day of week) that occurs before the input date.
        Parameters:
        date - The date
        dayOfWeek - The day of week
        Returns:
        The result column
        Since:
        0.11.0
      • to_timestamp

        public static Column to_timestamp​(Column s)
        Converts an input expression into the corresponding timestamp.
        Parameters:
        s - The input column
        Returns:
        The result column
        Since:
        0.11.0
      • to_timestamp

        public static Column to_timestamp​(Column s,
                                          Column fmt)
        Converts an input expression into the corresponding timestamp.
        Parameters:
        s - The input value
        fmt - The time format
        Returns:
        The result column
        Since:
        0.11.0
      • to_date

        public static Column to_date​(Column e)
        Converts an input expression to a date.
        Parameters:
        e - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • to_date

        public static Column to_date​(Column e,
                                     Column fmt)
        Converts an input expression to a date.
        Parameters:
        e - The input value
        fmt - The time format
        Returns:
        The result column
        Since:
        0.11.0
      • date_from_parts

        public static Column date_from_parts​(Column year,
                                             Column month,
                                             Column day)
        Creates a date from individual numeric components that represent the year, month, and day of the month.
        Parameters:
        year - The year
        month - The month
        day - The day
        Returns:
        The result column
        Since:
        0.11.0
      • time_from_parts

        public static Column time_from_parts​(Column hour,
                                             Column minute,
                                             Column second,
                                             Column nanosecond)
        Creates a time from individual numeric components.
        Parameters:
        hour - The hour
        minute - The minute
        second - The second
        nanosecond - The nanosecond
        Returns:
        The result column
        Since:
        0.11.0
      • time_from_parts

        public static Column time_from_parts​(Column hour,
                                             Column minute,
                                             Column second)
        Creates a time from individual numeric components.
        Parameters:
        hour - The hour
        minute - The minute
        second - The second
        Returns:
        The result column
        Since:
        0.11.0
      • timestamp_from_parts

        public static Column timestamp_from_parts​(Column year,
                                                  Column month,
                                                  Column day,
                                                  Column hour,
                                                  Column minute,
                                                  Column second)
        Creates a timestamp from individual numeric components. If no time zone is in effect, the function can be used to create a timestamp from a date expression and a time expression.
        Parameters:
        year - The year
        month - The month
        day - The day
        hour - The hour
        minute - The minute
        second - The second
        Returns:
        The result column
        Since:
        0.11.0
      • timestamp_from_parts

        public static Column timestamp_from_parts​(Column year,
                                                  Column month,
                                                  Column day,
                                                  Column hour,
                                                  Column minute,
                                                  Column second,
                                                  Column nanosecond)
        Creates a timestamp from individual numeric components. If no time zone is in effect, the function can be used to create a timestamp from a date expression and a time expression.
        Parameters:
        year - The year
        month - The month
        day - The day
        hour - The hour
        minute - The minute
        second - The second
        nanosecond - The nanosecond
        Returns:
        The result column
        Since:
        0.11.0
      • timestamp_from_parts

        public static Column timestamp_from_parts​(Column dateExpr,
                                                  Column timeExpr)
        Creates a timestamp from individual numeric components. If no time zone is in effect, the function can be used to create a timestamp from a date expression and a time expression.
        Parameters:
        dateExpr - The date expression
        timeExpr - The time expression
        Returns:
        The result column
        Since:
        0.11.0
      • timestamp_ltz_from_parts

        public static Column timestamp_ltz_from_parts​(Column year,
                                                      Column month,
                                                      Column day,
                                                      Column hour,
                                                      Column minute,
                                                      Column second)
        Creates a timestamp from individual numeric components. If no time zone is in effect, the function can be used to create a timestamp from a date expression and a time expression.
        Parameters:
        year - The year
        month - The month
        day - The day
        hour - The hour
        minute - The minute
        second - The second
        Returns:
        The result column
        Since:
        0.11.0
      • timestamp_ltz_from_parts

        public static Column timestamp_ltz_from_parts​(Column year,
                                                      Column month,
                                                      Column day,
                                                      Column hour,
                                                      Column minute,
                                                      Column second,
                                                      Column nanosecond)
        Creates a timestamp from individual numeric components. If no time zone is in effect, the function can be used to create a timestamp from a date expression and a time expression.
        Parameters:
        year - The year
        month - The month
        day - The day
        hour - The hour
        minute - The minute
        second - The second
        nanosecond - The nanosecond
        Returns:
        The result column
        Since:
        0.11.0
      • timestamp_ntz_from_parts

        public static Column timestamp_ntz_from_parts​(Column year,
                                                      Column month,
                                                      Column day,
                                                      Column hour,
                                                      Column minute,
                                                      Column second)
        Creates a timestamp from individual numeric components. If no time zone is in effect, the function can be used to create a timestamp from a date expression and a time expression.
        Parameters:
        year - The year
        month - The month
        day - The day
        hour - The hour
        minute - The minute
        second - The second
        Returns:
        The result column
        Since:
        0.11.0
      • timestamp_ntz_from_parts

        public static Column timestamp_ntz_from_parts​(Column year,
                                                      Column month,
                                                      Column day,
                                                      Column hour,
                                                      Column minute,
                                                      Column second,
                                                      Column nanosecond)
        Creates a timestamp from individual numeric components. If no time zone is in effect, the function can be used to create a timestamp from a date expression and a time expression.
        Parameters:
        year - The year
        month - The month
        day - The day
        hour - The hour
        minute - The minute
        second - The second
        nanosecond - The nanosecond
        Returns:
        The result column
        Since:
        0.11.0
      • timestamp_ntz_from_parts

        public static Column timestamp_ntz_from_parts​(Column dateExpr,
                                                      Column timeExpr)
        Creates a timestamp from individual numeric components. If no time zone is in effect, the function can be used to create a timestamp from a date expression and a time expression.
        Parameters:
        dateExpr - The date expression
        timeExpr - The time expression
        Returns:
        The result column
        Since:
        0.11.0
      • timestamp_tz_from_parts

        public static Column timestamp_tz_from_parts​(Column year,
                                                     Column month,
                                                     Column day,
                                                     Column hour,
                                                     Column minute,
                                                     Column second)
        Creates a timestamp from individual numeric components. If no time zone is in effect, the function can be used to create a timestamp from a date expression and a time expression.
        Parameters:
        year - The year
        month - The month
        day - The day
        hour - The hour
        minute - The minute
        second - The second
        Returns:
        The result column
        Since:
        0.11.0
      • timestamp_tz_from_parts

        public static Column timestamp_tz_from_parts​(Column year,
                                                     Column month,
                                                     Column day,
                                                     Column hour,
                                                     Column minute,
                                                     Column second,
                                                     Column nanosecond)
        Creates a timestamp from individual numeric components. If no time zone is in effect, the function can be used to create a timestamp from a date expression and a time expression.
        Parameters:
        year - The year
        month - The month
        day - The day
        hour - The hour
        minute - The minute
        second - The second
        nanosecond - The nanosecond
        Returns:
        The result column
        Since:
        0.11.0
      • timestamp_tz_from_parts

        public static Column timestamp_tz_from_parts​(Column year,
                                                     Column month,
                                                     Column day,
                                                     Column hour,
                                                     Column minute,
                                                     Column second,
                                                     Column nanosecond,
                                                     Column timezone)
        Creates a timestamp from individual numeric components. If no time zone is in effect, the function can be used to create a timestamp from a date expression and a time expression.
        Parameters:
        year - The year
        month - The month
        day - The day
        hour - The hour
        minute - The minute
        second - The second
        nanosecond - The nanosecond
        timezone - The time zone
        Returns:
        The result column
        Since:
        0.11.0
      • dayname

        public static Column dayname​(Column expr)
        Extracts the three-letter day-of-week name from the specified date or timestamp.
        Parameters:
        expr - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • monthname

        public static Column monthname​(Column expr)
        Extracts the three-letter month name from the specified date or timestamp.
        Parameters:
        expr - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • dateadd

        public static Column dateadd​(String part,
                                     Column value,
                                     Column expr)
        Adds the specified value for the specified date or time art to date or time expr.

        Supported date and time parts are listed here

        Example: add one year on dates

        
         date.select(Functions.dateadd("year", Functions.lit(1), date.col("date_col")))
         
        Parameters:
        part - The part of time being added
        value - The value being added
        expr - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • datediff

        public static Column datediff​(String part,
                                      Column col1,
                                      Column col2)
        Calculates the difference between two date, time, or timestamp columns based on the date or time part requested.

        Supported date and time parts are listed here

        Example: year difference between two date columns

        
         date.select(Functions.datediff("year", date.col("date_col1"), date.col("date_col2")))
         
        Parameters:
        part - The part of time
        col1 - The first input value
        col2 - The second input value
        Returns:
        The result column
        Since:
        0.11.0
      • trunc

        public static Column trunc​(Column expr,
                                   Column scale)
        Rounds the input expression down to the nearest (or equal) integer closer to zero, or to the nearest equal or smaller value with the specified number of places after the decimal point.
        Parameters:
        expr - The input value
        scale - The scale
        Returns:
        The result column
        Since:
        0.11.0
      • date_trunc

        public static Column date_trunc​(String format,
                                        Column timestamp)
        Truncates a DATE, TIME, or TIMESTAMP to the specified precision.
        Parameters:
        format - The time format
        timestamp - The input timestamp
        Returns:
        The result column
        Since:
        0.11.0
      • concat

        public static Column concat​(Column... exprs)
        Concatenates one or more strings, or concatenates one or more binary values. If any of the values is null, the result is also null.
        Parameters:
        exprs - A list of input values
        Returns:
        The result column
        Since:
        0.11.0
      • arrays_overlap

        public static Column arrays_overlap​(Column a1,
                                            Column a2)
        Compares whether two arrays have at least one element in common. Returns TRUE if there is at least one element in common; otherwise returns FALSE. The function is NULL-safe, meaning it treats NULLs as known values for comparing equality.
        Parameters:
        a1 - The first input array
        a2 - The second input array
        Returns:
        The result column
        Since:
        0.11.0
      • endswith

        public static Column endswith​(Column expr,
                                      Column str)
        Returns TRUE if expr ends with str.
        Parameters:
        expr - The input value
        str - The ending string
        Returns:
        The result column
        Since:
        0.11.0
      • insert

        public static Column insert​(Column baseExpr,
                                    Column position,
                                    Column length,
                                    Column insertExpr)
        Replaces a substring of the specified length, starting at the specified position, with a new string or binary value.
        Parameters:
        baseExpr - The base expression
        position - The position
        insertExpr - The new expression
        length - The length
        Returns:
        The result column
        Since:
        0.11.0
      • left

        public static Column left​(Column strExpr,
                                  Column lengthExpr)
        Returns a left most substring of strExpr.
        Parameters:
        strExpr - The input string
        lengthExpr - The length
        Returns:
        The result column
        Since:
        0.11.0
      • right

        public static Column right​(Column strExpr,
                                   Column lengthExpr)
        Returns a right most substring of strExpr.
        Parameters:
        strExpr - The input string
        lengthExpr - The length
        Returns:
        The result column
        Since:
        0.11.0
      • regexp_count

        public static Column regexp_count​(Column strExpr,
                                          Column pattern,
                                          Column position,
                                          Column parameters)
        Returns the number of times that a pattern occurs in a strExpr.

        Pattern syntax is specified here

        Parameter detail is specified here

        Parameters:
        strExpr - The input string
        pattern - The pattern
        position - The position
        parameters - The parameters
        Returns:
        The result column
        Since:
        0.11.0
      • regexp_count

        public static Column regexp_count​(Column strExpr,
                                          Column pattern)
        Returns the number of times that a pattern occurs in a strExpr.

        Pattern syntax is specified here

        Parameter detail is specified here

        Parameters:
        strExpr - The input string
        pattern - The pattern
        Returns:
        The result column
        Since:
        0.11.0
      • regexp_replace

        public static Column regexp_replace​(Column strExpr,
                                            Column pattern)
        Returns the subject with the specified pattern (or all occurrences of the pattern) removed. If no matches are found, returns the original subject.
        Parameters:
        strExpr - The input string
        pattern - The pattern
        Returns:
        The result column
        Since:
        1.9.0
      • regexp_replace

        public static Column regexp_replace​(Column strExpr,
                                            Column pattern,
                                            Column replacement)
        Returns the subject with the specified pattern (or all occurrences of the pattern) replaced by a replacement string. If no matches are found, returns the original subject.
        Parameters:
        strExpr - The input string
        pattern - The pattern
        replacement - The replacement string
        Returns:
        The result column
        Since:
        1.9.0
      • replace

        public static Column replace​(Column strExpr,
                                     Column pattern,
                                     Column replacement)
        Removes all occurrences of a specified strExpr, and optionally replaces them with replacement.
        Parameters:
        strExpr - The input string
        pattern - The pattern
        replacement - The replacement string
        Returns:
        The result column
        Since:
        0.11.0
      • replace

        public static Column replace​(Column strExpr,
                                     Column pattern)
        Removes all occurrences of a specified strExpr, and optionally replaces them with replacement.
        Parameters:
        strExpr - The input string
        pattern - The pattern
        Returns:
        The result column
        Since:
        0.11.0
      • charindex

        public static Column charindex​(Column targetExpr,
                                       Column sourceExpr)
        Searches for targetExpr in sourceExpr and, if successful, returns the position (1-based) of the targetExpr in sourceExpr.
        Parameters:
        sourceExpr - The source value
        targetExpr - The target value
        Returns:
        The result column
        Since:
        0.11.0
      • charindex

        public static Column charindex​(Column targetExpr,
                                       Column sourceExpr,
                                       Column position)
        Searches for targetExpr in sourceExpr and, if successful, returns the position (1-based) of the targetExpr in sourceExpr.
        Parameters:
        sourceExpr - The source value
        targetExpr - The target value
        position - The position
        Returns:
        The result column
        Since:
        0.11.0
      • collate

        public static Column collate​(Column expr,
                                     String collationSpec)
        Returns a copy of expr, but with the specified collationSpec property instead of the original collation specification property.

        Collation Specification is specified here

        Parameters:
        expr - The input value
        collationSpec - The collation specification
        Returns:
        The result column
        Since:
        0.11.0
      • collation

        public static Column collation​(Column expr)
        Returns the collation specification of expr.
        Parameters:
        expr - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • array_intersection

        public static Column array_intersection​(Column col1,
                                                Column col2)
        Returns an ARRAY that contains the matching elements in the two input ARRAYs.
        Parameters:
        col1 - The first input array
        col2 - The second input array
        Returns:
        The result column
        Since:
        0.11.0
      • is_array

        public static Column is_array​(Column col)
        Returns true if the specified VARIANT column contains an ARRAY value.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_boolean

        public static Column is_boolean​(Column col)
        Returns true if the specified VARIANT column contains a Boolean value.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_binary

        public static Column is_binary​(Column col)
        Returns true if the specified VARIANT column contains a binary value.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_char

        public static Column is_char​(Column col)
        Returns true if the specified VARIANT column contains a string value.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_varchar

        public static Column is_varchar​(Column col)
        Returns true if the specified VARIANT column contains a string value.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_date

        public static Column is_date​(Column col)
        Returns true if the specified VARIANT column contains a DATE value.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_date_value

        public static Column is_date_value​(Column col)
        Returns true if the specified VARIANT column contains a DATE value.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_decimal

        public static Column is_decimal​(Column col)
        Returns true if the specified VARIANT column contains a fixed-point decimal value or integer.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_double

        public static Column is_double​(Column col)
        Returns true if the specified VARIANT column contains a floating-point value, fixed-point decimal, or integer.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_real

        public static Column is_real​(Column col)
        Returns true if the specified VARIANT column contains a floating-point value, fixed-point decimal, or integer.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_integer

        public static Column is_integer​(Column col)
        Returns true if the specified VARIANT column contains an integer value.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_null_value

        public static Column is_null_value​(Column col)
        Returns true if the specified VARIANT column is a JSON null value.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_object

        public static Column is_object​(Column col)
        Returns true if the specified VARIANT column contains an OBJECT value.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_time

        public static Column is_time​(Column col)
        Returns true if the specified VARIANT column contains a TIME value.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_timestamp_ltz

        public static Column is_timestamp_ltz​(Column col)
        Returns true if the specified VARIANT column contains a TIMESTAMP value to be interpreted using the local time zone.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_timestamp_ntz

        public static Column is_timestamp_ntz​(Column col)
        Returns true if the specified VARIANT column contains a TIMESTAMP value with no time zone.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • is_timestamp_tz

        public static Column is_timestamp_tz​(Column col)
        Returns true if the specified VARIANT column contains a TIMESTAMP value with a time zone.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.11.0
      • check_json

        public static Column check_json​(Column col)
        Checks the validity of a JSON document. If the input string is a valid JSON document or a NULL (i.e. no error would occur when parsing the input string), the function returns NULL. In case of a JSON parsing error, the function returns a string that contains the error message.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.12.0
      • check_xml

        public static Column check_xml​(Column col)
        Checks the validity of an XML document. If the input string is a valid XML document or a NULL (i.e. no error would occur when parsing the input string), the function returns NULL. In case of an XML parsing error, the output string contains the error message.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.12.0
      • json_extract_path_text

        public static Column json_extract_path_text​(Column col,
                                                    Column path)
        Parses a JSON string and returns the value of an element at a specified path in the resulting JSON document.
        Parameters:
        col - Column containing the JSON string that should be parsed.
        path - Column containing the path to the element that should be extracted.
        Returns:
        The result column
        Since:
        0.12.0
      • parse_json

        public static Column parse_json​(Column col)
        Parse the value of the specified column as a JSON string and returns the resulting JSON document.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.12.0
      • parse_xml

        public static Column parse_xml​(Column col)
        Parse the value of the specified column as a JSON string and returns the resulting XML document.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.12.0
      • strip_null_value

        public static Column strip_null_value​(Column col)
        Converts a JSON "null" value in the specified column to a SQL NULL value. All other VARIANT values in the column are returned unchanged.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.12.0
      • array_agg

        public static Column array_agg​(Column col)
        Returns the input values, pivoted into an ARRAY. If the input is empty, an empty ARRAY is returned.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.12.0
      • array_append

        public static Column array_append​(Column array,
                                          Column element)
        Returns an ARRAY containing all elements from the source ARRAYas well as the new element. The new element is located at end of the ARRAY.
        Parameters:
        array - The column containing the source ARRAY.
        element - The column containing the element to be appended. The element may be of almost any data type. The data type does not need to match the data type(s) of the existing elements in the ARRAY.
        Returns:
        The result column
        Since:
        0.12.0
      • array_cat

        public static Column array_cat​(Column array1,
                                       Column array2)
        Returns the concatenation of two ARRAYs.
        Parameters:
        array1 - Column containing the source ARRAY.
        array2 - Column containing the ARRAY to be appended to array1.
        Returns:
        The result column
        Since:
        0.12.0
      • array_compact

        public static Column array_compact​(Column array)
        Returns a compacted ARRAY with missing and null values removed, effectively converting sparse arrays into dense arrays.
        Parameters:
        array - The input array
        Returns:
        The result column
        Since:
        0.12.0
      • array_construct

        public static Column array_construct​(Column... cols)
        Returns an ARRAY constructed from zero, one, or more inputs.
        Parameters:
        cols - Columns containing the values (or expressions that evaluate to values). The values do not all need to be of the same data type.
        Returns:
        The result column
        Since:
        0.12.0
      • array_construct_compact

        public static Column array_construct_compact​(Column... cols)
        Returns an ARRAY constructed from zero, one, or more inputs; the constructed ARRAY omits any NULL input values.
        Parameters:
        cols - Columns containing the values (or expressions that evaluate to values). The values do not all need to be of the same data type.
        Returns:
        The result column
        Since:
        0.12.0
      • array_contains

        public static Column array_contains​(Column variant,
                                            Column array)
        Returns true if the specified VARIANT is found in the specified ARRAY.
        Parameters:
        variant - Column containing the VARIANT to find.
        array - Column containing the ARRAY to search.
        Returns:
        The result column
        Since:
        0.12.0
      • array_insert

        public static Column array_insert​(Column array,
                                          Column pos,
                                          Column element)
        Returns an ARRAY containing all elements from the source ARRAY as well as the new element.
        Parameters:
        array - Column containing the source ARRAY.
        pos - Column containing a (zero-based) position in the source ARRAY. The new element is inserted at this position. The original element from this position (if any) and all subsequent elements (if any) are shifted by one position to the right in the resulting array (i.e. inserting at position 0 has the same effect as using [[array_prepend]]). A negative position is interpreted as an index from the back of the array (e.g. -1 results in insertion before the last element in the array).
        element - Column containing the element to be inserted. The new element is located at position pos. The relative order of the other elements from the source array is preserved.
        Returns:
        The result column
        Since:
        0.12.0
      • array_position

        public static Column array_position​(Column variant,
                                            Column array)
        Returns the index of the first occurrence of an element in an ARRAY.
        Parameters:
        variant - Column containing the VARIANT value that you want to find. The function searches for the first occurrence of this value in the array.
        array - Column containing the ARRAY to be searched.
        Returns:
        The result column
        Since:
        0.12.0
      • array_prepend

        public static Column array_prepend​(Column array,
                                           Column element)
        Returns an ARRAY containing the new element as well as all elements from the source ARRAY. The new element is positioned at the beginning of the ARRAY.
        Parameters:
        array - Column containing the source ARRAY.
        element - Column containing the element to be prepended.
        Returns:
        The result column
        Since:
        0.12.0
      • array_size

        public static Column array_size​(Column array)
        Returns the size of the input ARRAY.

        If the specified column contains a VARIANT value that contains an ARRAY, the size of the ARRAY is returned; otherwise, NULL is returned if the value is not an ARRAY.

        Parameters:
        array - The input array
        Returns:
        The result column
        Since:
        0.12.0
      • array_slice

        public static Column array_slice​(Column array,
                                         Column from,
                                         Column to)
        Returns an ARRAY constructed from a specified subset of elements of the input ARRAY.
        Parameters:
        array - Column containing the source ARRAY.
        from - Column containing a position in the source ARRAY. The position of the first element is 0. Elements from positions less than this parameter are not included in the resulting ARRAY.
        to - Column containing a position in the source ARRAY. Elements from positions equal to or greater than this parameter are not included in the resulting array.
        Returns:
        The result column
        Since:
        0.12.0
      • array_to_string

        public static Column array_to_string​(Column array,
                                             Column separator)
        Returns an input ARRAY converted to a string by casting all values to strings (using TO_VARCHAR) and concatenating them (using the string from the second argument to separate the elements).
        Parameters:
        array - Column containing the ARRAY of elements to convert to a string.
        separator - Column containing the string to put between each element (e.g. a space, comma, or other human-readable separator).
        Returns:
        The result column
        Since:
        0.12.0
      • objectagg

        public static Column objectagg​(Column key,
                                       Column value)
        Returns one OBJECT per group. For each (key, value) input pair, where key must be a VARCHAR and value must be a VARIANT, the resulting OBJECT contains a key:value field.
        Parameters:
        key - The key
        value - The value
        Returns:
        The result column
        Since:
        0.12.0
      • object_construct

        public static Column object_construct​(Column... key_values)
        Returns an OBJECT constructed from the arguments.
        Parameters:
        key_values - The key and value
        Returns:
        The result column
        Since:
        0.12.0
      • object_delete

        public static Column object_delete​(Column obj,
                                           Column key1,
                                           Column... keys)
        Returns an object containing the contents of the input (i.e.source) object with one or more keys removed.
        Parameters:
        obj - The input object
        key1 - The key being removed
        keys - A list of keys being removed except key1
        Returns:
        The result column
        Since:
        0.12.0
      • object_insert

        public static Column object_insert​(Column obj,
                                           Column key,
                                           Column value)
        Returns an object consisting of the input object with a new key-value pair inserted. The input key must not exist in the object.
        Parameters:
        obj - The input object
        key - The key
        value - The value
        Returns:
        The result column
        Since:
        0.12.0
      • object_insert

        public static Column object_insert​(Column obj,
                                           Column key,
                                           Column value,
                                           Column update_flag)
        Returns an object consisting of the input object with a new key-value pair inserted (or an existing key updated with a new value).
        Parameters:
        obj - The input object
        key - The key
        value - The value
        update_flag - The flags
        Returns:
        The result column
        Since:
        0.12.0
      • object_pick

        public static Column object_pick​(Column obj,
                                         Column key1,
                                         Column... keys)
        Returns a new OBJECT containing some of the key-value pairs from an existing object.

        To identify the key-value pairs to include in the new object, pass in the keys as arguments, or pass in an array containing the keys.

        If a specified key is not present in the input object, the key is ignored.

        Parameters:
        obj - The input object
        key1 - The key
        keys - A list of keys except key1
        Returns:
        The result column
        Since:
        0.12.0
      • as_array

        public static Column as_array​(Column variant)
        Casts a VARIANT value to an array.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_binary

        public static Column as_binary​(Column variant)
        Casts a VARIANT value to a binary string.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_char

        public static Column as_char​(Column variant)
        Casts a VARIANT value to a string. Does not convert values of other types into string.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_varchar

        public static Column as_varchar​(Column variant)
        Casts a VARIANT value to a string. Does not convert values of other types into string.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_date

        public static Column as_date​(Column variant)
        Casts a VARIANT value to a date. Does not convert from timestamps.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_decimal

        public static Column as_decimal​(Column variant)
        Casts a VARIANT value to a fixed-point decimal (does not match floating-point values).
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_decimal

        public static Column as_decimal​(Column variant,
                                        int precision)
        Casts a VARIANT value to a fixed-point decimal (does not match floating-point values), with precision.
        Parameters:
        variant - The input variant
        precision - The precision
        Returns:
        The result column
        Since:
        0.12.0
      • as_decimal

        public static Column as_decimal​(Column variant,
                                        int precision,
                                        int scale)
        Casts a VARIANT value to a fixed-point decimal (does not match floating-point values), with precision and scale.
        Parameters:
        variant - The input variant
        precision - The precision
        scale - The scale
        Returns:
        The result column
        Since:
        0.12.0
      • as_number

        public static Column as_number​(Column variant)
        Casts a VARIANT value to a fixed-point decimal (does not match floating-point values).
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_number

        public static Column as_number​(Column variant,
                                       int precision)
        Casts a VARIANT value to a fixed-point decimal (does not match floating-point values), with precision and scale.
        Parameters:
        variant - The input variant
        precision - The precision
        Returns:
        The result column
        Since:
        0.12.0
      • as_number

        public static Column as_number​(Column variant,
                                       int precision,
                                       int scale)
        Casts a VARIANT value to a fixed-point decimal (does not match floating-point values), with precision.
        Parameters:
        variant - The input variant
        precision - The precision
        scale - The scale
        Returns:
        The result column
        Since:
        0.12.0
      • as_double

        public static Column as_double​(Column variant)
        Casts a VARIANT value to a floating-point value.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_real

        public static Column as_real​(Column variant)
        Casts a VARIANT value to a floating-point value.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_integer

        public static Column as_integer​(Column variant)
        Casts a VARIANT value to an integer. Does not match non-integer values.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_object

        public static Column as_object​(Column variant)
        Casts a VARIANT value to an object.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_time

        public static Column as_time​(Column variant)
        Casts a VARIANT value to a time value. Does not convert from timestamps.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_timestamp_ltz

        public static Column as_timestamp_ltz​(Column variant)
        Casts a VARIANT value to a TIMESTAMP value with local timezone.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_timestamp_ntz

        public static Column as_timestamp_ntz​(Column variant)
        Casts a VARIANT value to a TIMESTAMP value with no timezone.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • as_timestamp_tz

        public static Column as_timestamp_tz​(Column variant)
        Casts a VARIANT value to a TIMESTAMP value with timezone.
        Parameters:
        variant - The input variant
        Returns:
        The result column
        Since:
        0.12.0
      • strtok_to_array

        public static Column strtok_to_array​(Column array)
        Tokenizes the given string using the given set of delimiters and returns the tokens as an array. If either parameter is a NULL, a NULL is returned. An empty array is returned if tokenization produces no tokens.
        Parameters:
        array - The input array
        Returns:
        The result column
        Since:
        0.12.0
      • strtok_to_array

        public static Column strtok_to_array​(Column array,
                                             Column delimiter)
        Tokenizes the given string using the given set of delimiters and returns the tokens as an array. If either parameter is a NULL, a NULL is returned. An empty array is returned if tokenization produces no tokens.
        Parameters:
        array - The input array
        delimiter - The delimiter
        Returns:
        The result column
        Since:
        0.12.0
      • to_array

        public static Column to_array​(Column col)
        Converts the input expression into an array:

        If the input is an ARRAY, or VARIANT containing an array value, the result is unchanged. For NULL or a JSON null input, returns NULL. For any other value, the result is a single-element array containing this value.

        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.12.0
      • to_json

        public static Column to_json​(Column col)
        Converts any VARIANT value to a string containing the JSON representation of the value. If the input is NULL, the result is also NULL.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.12.0
      • to_object

        public static Column to_object​(Column col)
        Converts the input value to an object:

        For a variant value containing an object, returns this object (in a value of type OBJECT). For a variant value containing JSON null or for NULL input, returns NULL. For all other input values, reports an error.

        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.12.0
      • to_variant

        public static Column to_variant​(Column col)
        Converts any value to VARIANT value or NULL (if input is NULL).
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.12.0
      • to_xml

        public static Column to_xml​(Column col)
        Converts any VARIANT value to a string containing the XML representation of the value. If the input is NULL, the result is also NULL.
        Parameters:
        col - The input value
        Returns:
        The result column
        Since:
        0.12.0
      • get

        public static Column get​(Column col1,
                                 Column col2)
        Extracts a value from an object or array; returns NULL if either of the arguments is NULL.
        Parameters:
        col1 - The first input value
        col2 - The second input value
        Returns:
        The result column
        Since:
        0.12.0
      • get_ignore_case

        public static Column get_ignore_case​(Column obj,
                                             Column field)
        Extracts a field value from an object; returns NULL if either of the arguments is NULL. This function is similar to GET but applies case-insensitive matching to field names.
        Parameters:
        obj - The input object
        field - The field
        Returns:
        The result column
        Since:
        0.12.0
      • object_keys

        public static Column object_keys​(Column obj)
        Returns an array containing the list of keys in the input object.
        Parameters:
        obj - The input object
        Returns:
        The result column
        Since:
        0.12.0
      • xmlget

        public static Column xmlget​(Column xml,
                                    Column tag,
                                    Column instance)
        Extracts an XML element object (often referred to as simply a tag) from a content of outer XML element object by the name of the tag and its instance number (counting from 0).
        Parameters:
        xml - The input xml
        tag - The tag
        instance - The instance
        Returns:
        The result column
        Since:
        0.12.0
      • xmlget

        public static Column xmlget​(Column xml,
                                    Column tag)
        Extracts the first XML element object (often referred to as simply a tag) from a content of outer XML element object by the name of the tag
        Parameters:
        xml - The input xml
        tag - The tag
        Returns:
        The result column
        Since:
        0.12.0
      • get_path

        public static Column get_path​(Column col,
                                      Column path)
        Extracts a value from semi-structured data using a path name.
        Parameters:
        col - The input value
        path - The path
        Returns:
        The result column
        Since:
        0.12.0
      • when

        public static CaseExpr when​(Column condition,
                                    Column value)
        Works like a cascading if-then-else statement. A series of conditions are evaluated in sequence. When a condition evaluates to TRUE, the evaluation stops and the associated result (after THEN) is returned. If none of the conditions evaluate to TRUE, then the result after the optional OTHERWISE is returned, if present; otherwise NULL is returned. For Example:
        
         import com.snowflake.snowpark_java.Functions;
         df.select(Functions
             .when(df.col("col").is_null, Functions.lit(1))
             .when(df.col("col").equal_to(Functions.lit(1)), Functions.lit(6))
             .otherwise(Functions.lit(7)));
         
        Parameters:
        condition - The condition
        value - The result value
        Returns:
        The result column
        Since:
        0.12.0
      • iff

        public static Column iff​(Column condition,
                                 Column expr1,
                                 Column expr2)
        Returns one of two specified expressions, depending on a condition.

        This is equivalent to an `if-then-else` expression. If `condition` evaluates to TRUE, the function returns `expr1`. Otherwise, the function returns `expr2`.

        Parameters:
        condition - The condition to evaluate.
        expr1 - The expression to return if the condition evaluates to TRUE.
        expr2 - The expression to return if the condition is not TRUE (i.e. if it is FALSE or NULL).
        Returns:
        The result column
        Since:
        0.12.0
      • in

        public static Column in​(Column[] columns,
                                List<List<Object>> values)
        Returns a conditional expression that you can pass to the filter or where method to perform the equivalent of a WHERE ... IN query that matches rows containing a sequence of values.

        The expression evaluates to true if the values in a row matches the values in one of the specified sequences.

        For example, the following code returns a DataFrame that contains the rows in which the columns `c1` and `c2` contain the values: - `1` and `"a"`, or - `2` and `"b"` This is equivalent to `SELECT * FROM table WHERE (c1, c2) IN ((1, 'a'), (2, 'b'))`.

        
         import com.snowflake.snowpark_java.Functions;
         df.filter(Functions.in(new Column[]{df.col("c1"), df.col("c2")},
           Arrays.asList(Array.asList(1, "a"), Array.asList(2, "b"))));
         
        Parameters:
        columns - A sequence of the columns to compare for the IN operation.
        values - A sequence containing the sequences of values to compare for the IN operation.
        Returns:
        The result column
        Since:
        0.12.0
      • in

        public static Column in​(Column[] columns,
                                DataFrame df)
        Returns a conditional expression that you can pass to the filter or where method to perform the equivalent of a WHERE ... IN query with the subquery represented by the specified DataFrame.

        The expression evaluates to true if the value in the column is one of the values in the column of the same name in a specified DataFrame.

        For example, the following code returns a DataFrame that contains the rows where the values of the columns `c1` and `c2` in `df2` match the values of the columns `a` and `b` in `df1`. This is equivalent to SELECT * FROM table2 WHERE (c1, c2) IN (SELECT a, b FROM table1).

        
         import com.snowflake.snowpark_java.Functions;
         import com.snowflake.snowpark_java.DataFrame;
         DataFrame df1 = session.sql("select a, b from table1");
         DataFrame df2 = session.table("table2");
         df2.filter(Functions.in(new Column[]{df2.col("c1"), df2.col("c2")}, df1));
         
        Parameters:
        columns - A sequence of the columns to compare for the IN operation.
        df - The DataFrame used as the values for the IN operation
        Returns:
        The result column
        Since:
        0.12.0
      • seq1

        public static Column seq1()
        Generates a sequence of monotonically increasing integers, with wrap-around. Wrap-around occurs after the largest representable integer of the integer width 1 byte. the sequence continues at 0 after wrap-around.
        Returns:
        The result column
        Since:
        0.12.0
      • seq1

        public static Column seq1​(boolean startsFromZero)
        Generates a sequence of monotonically increasing integers, with wrap-around. Wrap-around occurs after the largest representable integer of the integer width 1 byte.
        Parameters:
        startsFromZero - if true, the sequence continues at 0 after wrap-around, otherwise, continues at the smallest representable number based on the given integer width.
        Returns:
        The result column
        Since:
        0.12.0
      • seq2

        public static Column seq2()
        Generates a sequence of monotonically increasing integers, with wrap-around. Wrap-around occurs after the largest representable integer of the integer width 2 byte. the sequence continues at 0 after wrap-around.
        Returns:
        The result column
        Since:
        0.12.0
      • seq2

        public static Column seq2​(boolean startsFromZero)
        Generates a sequence of monotonically increasing integers, with wrap-around. Wrap-around occurs after the largest representable integer of the integer width 2 byte.
        Parameters:
        startsFromZero - if true, the sequence continues at 0 after wrap-around, otherwise, continues at the smallest representable number based on the given integer width.
        Returns:
        The result column
        Since:
        0.12.0
      • seq4

        public static Column seq4()
        Generates a sequence of monotonically increasing integers, with wrap-around. Wrap-around occurs after the largest representable integer of the integer width 4 byte. the sequence continues at 0 after wrap-around.
        Returns:
        The result column
        Since:
        0.12.0
      • seq4

        public static Column seq4​(boolean startsFromZero)
        Generates a sequence of monotonically increasing integers, with wrap-around. Wrap-around occurs after the largest representable integer of the integer width 4 byte.
        Parameters:
        startsFromZero - if true, the sequence continues at 0 after wrap-around, otherwise, continues at the smallest representable number based on the given integer width.
        Returns:
        The result column
        Since:
        0.12.0
      • seq8

        public static Column seq8()
        Generates a sequence of monotonically increasing integers, with wrap-around. Wrap-around occurs after the largest representable integer of the integer width 8 byte. the sequence continues at 0 after wrap-around.
        Returns:
        The result column
        Since:
        0.12.0
      • seq8

        public static Column seq8​(boolean startsFromZero)
        Generates a sequence of monotonically increasing integers, with wrap-around. Wrap-around occurs after the largest representable integer of the integer width 8 byte.
        Parameters:
        startsFromZero - if true, the sequence continues at 0 after wrap-around, otherwise, continues at the smallest representable number based on the given integer width.
        Returns:
        The result column
        Since:
        0.12.0
      • uniform

        public static Column uniform​(Column min,
                                     Column max,
                                     Column gen)
        Returns a uniformly random number, in the inclusive range (`min`, `max`)

        For example: {{{ import com.snowflake.snowpark_java.Functions; session.generator(10, Functions.seq4(), Functions.uniform(Functions.lit(1), Functions.lit(5), Functions.random())).show() }}}

        Parameters:
        min - The lower bound
        max - The upper bound
        gen - The generator expression for the function. for more information, see here
        Returns:
        The result column
        Since:
        0.12.0
      • listagg

        public static Column listagg​(Column col,
                                     String delimiter,
                                     boolean isDistinct)
        Returns the concatenated input values, separated by `delimiter` string.

        For example:

        
         df.groupBy(df.col("col1")).agg(Functions.listagg(df.col("col2"), ",")
              .withinGroup(df.col("col2").asc()))
        
        
         df.select(Functions.listagg(df.col("col2"), ",", false))
         
        Parameters:
        col - The expression (typically a Column) that determines the values to be put into the list. The expression should evaluate to a string, or to a data type that can be cast to string.
        delimiter - A string delimiter.
        isDistinct - Whether the input expression is distinct.
        Returns:
        The result Column
        Since:
        1.1.0
      • listagg

        public static Column listagg​(Column col,
                                     String delimiter)
        Returns the concatenated input values, separated by `delimiter` string.

        For example:

        
         df.groupBy(df.col("col1")).agg(Functions.listagg(df.col("col2"), ",")
              .withinGroup(df.col("col2").asc()))
        
        
         df.select(Functions.listagg(df.col("col2"), ",", false))
         
        Parameters:
        col - The expression (typically a Column) that determines the values to be put into the list. The expression should evaluate to a string, or to a data type that can be cast to string.
        delimiter - A string delimiter.
        Returns:
        The result Column
        Since:
        1.1.0
      • listagg

        public static Column listagg​(Column col)
        Returns the concatenated input values, separated by empty string.

        For example:

        
         df.groupBy(df.col("col1")).agg(Functions.listagg(df.col("col2"), ",")
              .withinGroup(df.col("col2").asc()))
        
        
         df.select(Functions.listagg(df.col("col2"), ",", false))
         
        Parameters:
        col - The expression (typically a Column) that determines the values to be put into the list. The expression should evaluate to a string, or to a data type that can be cast to string.
        Returns:
        The result Column
        Since:
        1.1.0
      • reverse

        public static Column reverse​(Column name)
        Wrapper for Snowflake built-in reverse function. Gets the reversed string. Reverses the order of characters in a string, or of bytes in a binary value. The returned value is the same length as the input, but with the characters/bytes in reverse order. If subject is NULL, the result is also NULL.

        Example:

        
         SELECT REVERSE('Hello, world!');
           +--------------------------+
           | REVERSE('HELLO, WORLD!') |
           |--------------------------|
           | !dlrow ,olleH            |
           +--------------------------+
         
        Parameters:
        name - Column to be reverse.
        Returns:
        Column object.
        Since:
        1.14.0
      • isnull

        public static Column isnull​(Column c)
        Wrapper for Snowflake built-in isnull function. Gets a boolean depending if value is NULL or not. Return true if the value in the column is null.

        Example::

        
         >>> from snowflake.snowpark.functions import is_null >>> df = session.create_dataframe([1.2,
         float("nan"), None, 1.0], schema=["a"]) >>> df.select(is_null("a").as_("a")).collect()
         [Row(A=False), Row(A=False), Row(A=True), Row(A=False)]
         
        Parameters:
        c - Column to analyze if it is null value.
        Returns:
        Column object.
        Since:
        1.14.0
      • unix_timestamp

        public static Column unix_timestamp​(Column c)
        Returns the current Unix timestamp (in seconds) as a long. Extracts a specified date or time portion from a date, time, or timestamp. All calls of `unix_timestamp` within the same query return the same value

        Example - DATE_PART( date_or_time_part ,date_or_time_expr )

        
         SELECT TO_TIMESTAMP('2013-05-08T23:39:20.123-07:00') AS "TIME_STAMP1",
          DATE_PART(EPOCH_SECOND, "TIME_STAMP1") AS "EXTRACTED EPOCH SECOND";
         +-------------------------+------------------------+
         | TIME_STAMP1             | EXTRACTED EPOCH SECOND |
         |-------------------------+------------------------|
         | 2013-05-08 23:39:20.123 |             1368056360 |
         +-------------------------+------------------------+
         
        Parameters:
        c - Column to be converted.
        Returns:
        Column object.
        Since:
        1.14.0
      • regexp_extract

        public static Column regexp_extract​(Column col,
                                            String exp,
                                            Integer position,
                                            Integer Occurences,
                                            Integer grpIdx)
        Signature - snowflake.snowpark.functions.regexp_extract (value: Union[Column, str], regexp: Union[Column, str], idx: int) Column Extract a specific group matched by a regex, from the specified string column. If the regex did not match, or the specified group did not match, an empty string is returned. Example:
        
         from snowflake.snowpark.functions import regexp_extract
         df = session.createDataFrame([["id_20_30", 10], ["id_40_50", 30]], ["id", "age"])
         df.select(regexp_extract("id", r"(\d+)", 1).alias("RES")).show()
            ---------
             |"RES"  |
             ---------
             |20     |
             |40     |
             ---------
         
        Parameters:
        col - Column.
        exp - String
        position - Integer.
        Occurences - Integer.
        grpIdx - Integer.
        Returns:
        Column object.
        Since:
        1.14.0
      • signum

        public static Column signum​(Column col)
        Returns the sign of its argument:

        - -1 if the argument is negative. - 1 if it is positive. - 0 if it is 0.

        Args: col: The column to evaluate its sign Example:: *

        df =
         session.create_dataframe([(-2, 2, 0)], ["a", "b", "c"]) >>>
         df.select(sign("a").alias("a_sign"), sign("b").alias("b_sign"),
         sign("c").alias("c_sign")).show()
           ----------------------------------
             |"A_SIGN"  |"B_SIGN"  |"C_SIGN"  |
             ----------------------------------
             |-1        |1         |0         |
             ----------------------------------
         
        Parameters:
        col - Column to calculate the sign.
        Returns:
        Column object.
        Since:
        1.14.0
      • sign

        public static Column sign​(Column col)
        Returns the sign of its argument:

        - -1 if the argument is negative. - 1 if it is positive. - 0 if it is 0.

        Args: col: The column to evaluate its sign Example::

        df =
         session.create_dataframe([(-2, 2, 0)], ["a", "b", "c"]) >>>
         df.select(sign("a").alias("a_sign"), sign("b").alias("b_sign"),
         sign("c").alias("c_sign")).show()
           ----------------------------------
             |"A_SIGN"  |"B_SIGN"  |"C_SIGN"  |
             ----------------------------------
             |-1        |1         |0         |
             ----------------------------------
         
        Parameters:
        col - Column to calculate the sign.
        Returns:
        Column object.
        Since:
        1.14.0
      • substring_index

        public static Column substring_index​(String col,
                                             String delim,
                                             Integer count)
        Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything the left of the final delimiter (counting from left) is returned. If count is negative, every to the right of the final delimiter (counting from the right) is returned. substring_index performs a case-sensitive match when searching for delim.
        Parameters:
        col - String.
        delim - String
        count - Integer.
        Returns:
        Column object.
        Since:
        1.14.0
      • collect_list

        public static Column collect_list​(Column c)
        Returns the input values, pivoted into an ARRAY. If the input is empty, an empty ARRAY is returned.

        Example::

        
         df = session.create_dataframe([[1], [2], [3], [1]], schema=["a"])
         df.select(array_agg("a", True).alias("result")).show()
         "RESULT" [ 1, 2, 3 ]
         
        Parameters:
        c - Column to be collect.
        Returns:
        The array.
        Since:
        1.14.0
      • date_add

        public static Column date_add​(Integer days,
                                      Column start)
        Returns the date that is `days` days after `start` Usage - DATE_ADD( date_or_time_part, value, date_or_time_expr )

        Example::

        
         SELECT TO_DATE('2013-05-08') AS v1, DATE_ADD(year, 2, TO_DATE('2013-05-08')) AS v;
        
         +------------+------------+
         | V1         | V          |
         |------------+------------|
         | 2013-05-08 | 2015-05-08 |
         +------------+------------+
         
        Parameters:
        days - Int
        start - Column name
        Returns:
        Column
        Since:
        1.15.0
      • date_add

        public static Column date_add​(Column start,
                                      Column days)
        Returns the date that is `days` days after `start` Usage - DATE_ADD( date_or_time_part, value, date_or_time_expr )

        Example::

        
         SELECT TO_DATE('2013-05-08') AS v1, DATE_ADD(year, 2, TO_DATE('2013-05-08')) AS v;
        
         +------------+------------+
         | V1         | V          |
         |------------+------------|
         | 2013-05-08 | 2015-05-08 |
         +------------+------------+
         
        Parameters:
        start - Column name
        days - Column name
        Returns:
        Column
        Since:
        1.15.0
      • collect_set

        public static Column collect_set​(Column e)
        Aggregate function: returns a set of objects with duplicate elements eliminated. Returns the input values, pivoted into an ARRAY. If the input is empty, an empty ARRAY is returned.

        Example::

        
         >>> df = session.create_dataframe([[1], [2], [3], [1]], schema=["a"])
         >>> df.select(array_agg("a", True).alias("result")).show()
         ------------
         |"RESULT"  |
         ------------
         |[         |
         |  1,      |
         |  2,      |
         |  3       |
         |]         |
         ------------
        
         
        Parameters:
        e - The column to collect the list values
        Returns:
        A list with unique values
        Since:
        1.15.0
      • collect_set

        public static Column collect_set​(String e)
        Aggregate function: returns a set of objects with duplicate elements eliminated. Returns the input values, pivoted into an ARRAY. If the input is empty, an empty ARRAY is returned.

        Example::

        
         df = session.create_dataframe([[1], [2], [3], [1]], schema=["a"])
         df.select(array_agg("a", True).alias("result")).show()
         ------------
         |"RESULT" |
         ------------
         |[| | 1, | | 2, | | 3 | |] |
         ------------
         
        Parameters:
        e - The string to collect the list values
        Returns:
        A list with unique values
        Since:
        1.15.0
      • from_unixtime

        public static Column from_unixtime​(Column ut)
        Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the yyyy-MM-dd HH:mm:ss format. Converts an input expression into the corresponding date in the specified date format.

        Example::

        
         df = session.create_dataframe([("2023-10-10",), ("2022-05-15",), ("invalid",)] ,
         schema=['date']) df.select(date_format('date', 'YYYY/MM/DD').as_('formatted_date')).show()
         --------------------
         |"FORMATTED_DATE" |
         --------------------
         |2023/10/10 | |2022/05/15 | |NULL
         | --------------------
         
        Parameters:
        ut - A number of a type that is castable to a long, such as string or integer. Can be negative for timestamps before the unix epoch
        Returns:
        A string, or null if the input was a string that could not be cast to a long
        Since:
        1.15.0
      • from_unixtime

        public static Column from_unixtime​(Column ut,
                                           String f)
        Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the given format. Converts an input expression into the corresponding date in the specified date format.

        Example::

        
         df = session.create_dataframe([("2023-10-10",), ("2022-05-15",),
         ("invalid",)], schema=['date'])
         df.select(date_format('date',
         'YYYY/MM/DD').as_('formatted_date')).show()
         --------------------
         |"FORMATTED_DATE" |
         --------------------
         |2023/10/10 | |2022/05/15 | |NULL |
         --------------------
         
        Parameters:
        ut - Column A number of a type that is castable to a long, such as string or integer. Can be negative for timestamps before the unix epoch
        f - string A date time pattern that the input will be formatted to
        Returns:
        Column, or null if `ut` was a string that could not be cast to a long or `f` was an invalid date time pattern
        Since:
        1.15.0
      • monotonically_increasing_id

        public static Column monotonically_increasing_id()
        A column expression that generates monotonically increasing 64-bit integers. Returns a sequence of monotonically increasing integers, with wrap-around which happens after largest representable integer of integer width 8 byte.

        Example::

        
         >>> df = session.generator(seq8(0), rowcount=3)
         >>> df.collect()
         [Row(SEQ8(0)=0),Row(SEQ8(0)=1), Row(SEQ8(0)=2)]
         
        Returns:
        A sequence of monotonically increasing integers, with wrap-around * which happens after largest representable integer of integer width 8 byte.
        Since:
        1.15.0
      • months_between

        public static Column months_between​(String end,
                                            String start)
        Returns number of months between dates `start` and `end`.

        A whole number is returned if both inputs have the same day of month or both are the last day of their respective months. Otherwise, the difference is calculated assuming 31 days per month.

        For example:

        
         {{{
         months_between("2017-11-14", "2017-07-14")  // returns 4.0
         months_between("2017-01-01", "2017-01-10")  // returns 0.29032258
         months_between("2017-06-01", "2017-06-16 12:00:00")  // returns -0.5
         }}}
         
        Parameters:
        end - A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss.SSSS`
        start - A date, timestamp or string. If a string, the data must be in a format that can cast to a timestamp, such as `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss.SSSS`
        Returns:
        A double, or null if either `end` or `start` were strings that could not be cast to a timestamp. Negative if `end` is before `start`
        Since:
        1.15.0
      • instr

        public static Column instr​(Column str,
                                   String substring)
        Locate the position of the first occurrence of substr column in the given string. Returns null if either of the arguments are null.

        Example

        
         SELECT id,
                string1,
                REGEXP_SUBSTR(string1, 'nevermore\\d') AS substring,
                REGEXP_INSTR( string1, 'nevermore\\d') AS position
           FROM demo1
           ORDER BY id;
        
           +----+-------------------------------------+------------+----------+
         | ID | STRING1                             | SUBSTRING  | POSITION |
         |----+-------------------------------------+------------+----------|
         |  1 | nevermore1, nevermore2, nevermore3. | nevermore1 |        1 |
         +----+-------------------------------------+------------+----------+
         
        The position is not zero based, but 1 based index. Returns 0 if substr could not be found in str.
        Parameters:
        str - Column on which instr has to be applied
        substring - Pattern to be retrieved
        Returns:
        A null if either of the arguments are null.
        Since:
        1.15.0
      • from_utc_timestamp

        public static Column from_utc_timestamp​(Column ts)
        Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.

        For Example

        
         ALTER SESSION SET TIMEZONE = 'America/Los_Angeles';
         SELECT TO_TIMESTAMP_TZ('2024-04-05 01:02:03');
          +----------------------------------------+
         | TO_TIMESTAMP_TZ('2024-04-05 01:02:03') |
         |----------------------------------------|
         | 2024-04-05 01:02:03.000 -0700          |
         +----------------------------------------+
         
        Parameters:
        ts - A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss.SSSS` A string detailing the time zone ID that the input should be adjusted to. It should be in the format of either region-based zone IDs or zone offsets. Region IDs must have the form 'area/city', such as 'America/Los_Angeles'. Zone offsets must be in the format '(+|-)HH:mm', for example '-08:00' or '+01:00'. Also 'UTC' and 'Z' are supported as aliases of '+00:00'. Other short names are not recommended to use because they can be ambiguous.
        Returns:
        A timestamp, or null if `ts` was a string that could not be cast to a timestamp or `tz` was an invalid value
        Since:
        1.15.0
      • to_utc_timestamp

        public static Column to_utc_timestamp​(Column ts)
        Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.
        Parameters:
        ts - A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss.SSSS` A string detailing the time zone ID that the input should be adjusted to. It should be in the format of either region-based zone IDs or zone offsets. Region IDs must have the form 'area/city', such as 'America/Los_Angeles'. Zone offsets must be in the format '(+|-)HH:mm', for example '-08:00' or '+01:00'. Also 'UTC' and 'Z' are supported as aliases of '+00:00'. Other short names are not recommended to use because they can be ambiguous.
        Returns:
        A timestamp, or null if `ts` was a string that could not be cast to a timestamp or `tz` was an invalid value
        Since:
        1.15.0
      • format_number

        public static Column format_number​(Column x,
                                           Integer d)
        Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places with HALF_EVEN round mode, and returns the result as a string column.

        If d is 0, the result has no decimal point or fractional part. If d is less than 0, the result will be null.

        Parameters:
        x - numeric column to be transformed
        d - Amount of decimal for the number format
        Returns:
        Number casted to the specific string format
        Since:
        1.15.0
      • asc

        public static Column asc​(String name)
        Returns a Column expression with values sorted in ascending order.

        Example: order column values in ascending

        
         DataFrame df = getSession().sql("select * from values(3),(1),(2) as t(a)");
         df.sort(Functions.asc("a")).show();
         -------
         |"A"  |
         -------
         |1    |
         |2    |
         |3    |
         -------
         
        Parameters:
        name - The input column name
        Returns:
        Column object ordered in ascending manner.
        Since:
        1.14.0
      • size

        public static Column size​(Column col)
        Returns the size of the input ARRAY.

        If the specified column contains a VARIANT value that contains an ARRAY, the size of the ARRAY is returned; otherwise, NULL is returned if the value is not an ARRAY.

        Example: calculate size of the array in a column

        
         DataFrame df = getSession().sql("select array_construct(a,b,c) as arr from values(1,2,3) as T(a,b,c)");
         df.select(Functions.size(Functions.col("arr"))).show();
         -------------------------
         |"ARRAY_SIZE(""ARR"")"  |
         -------------------------
         |3                      |
         -------------------------
         
        Parameters:
        col - The input column name
        Returns:
        size of the input ARRAY.
        Since:
        1.14.0
      • expr

        public static Column expr​(String s)
        Creates a Column expression from row SQL text.

        Note that the function does not interpret or check the SQL text.

        
         DataFrame df = getSession().sql("select a from values(1), (2), (3) as T(a)");
         df.filter(Functions.expr("a > 2")).show();
         -------
         |"A"  |
         -------
         |3    |
         -------
         
        Parameters:
        s - The SQL text
        Returns:
        column expression from input statement.
        Since:
        1.14.0
      • array

        public static Column array​(Column... cols)
        Returns an ARRAY constructed from zero, one, or more inputs.
        
         DataFrame df = getSession().sql("select * from values(1,2,3) as T(a,b,c)");
         df.select(Functions.array(df.col("a"), df.col("b"), df.col("c")).as("array")).show();
         -----------
         |"ARRAY"  |
         -----------
         |[        |
         |  1,     |
         |  2,     |
         |  3      |
         |]        |
         -----------
         
        Parameters:
        cols - The input column names
        Returns:
        Column object as array.
        Since:
        1.14.0
      • date_format

        public static Column date_format​(Column col,
                                         String s)
        Converts an input expression into the corresponding date in the specified date format.
        
         DataFrame df = getSession().sql("select * from values ('2023-10-10'), ('2022-05-15') as T(a)");
         df.select(Functions.date_format(df.col("a"), "YYYY/MM/DD").as("formatted_date")).show();
         --------------------
         |"FORMATTED_DATE"  |
         --------------------
         |2023/10/10        |
         |2022/05/15        |
         --------------------
         
        Parameters:
        col - The input date column name
        s - string format
        Returns:
        formatted column object.
        Since:
        1.14.0
      • last

        public static Column last​(Column col)
        Returns the last value of the column in a group.
        
         DataFrame df = getSession().sql("select * from values (5, 'a', 10), (5, 'b', 20),\n" +
                     "    (3, 'd', 15), (3, 'e', 40) as T(grade,name,score)");
         df.select(Functions.last(df.col("name")).over(Window.partitionBy(df.col("grade")).orderBy(df.col("score").desc()))).show();
         ----------------
         |"LAST_VALUE"  |
         ----------------
         |a             |
         |a             |
         |d             |
         |d             |
         ----------------
         
        Parameters:
        col - The input column to get last value
        Returns:
        column object from last function.
        Since:
        1.14.0
      • log10

        public static Column log10​(Column col)
        Computes the logarithm of the given value in base 10.
        
         DataFrame df = getSession().sql("select * from values (100) as T(a)");
         df.select(Functions.log10(df.col("a")).as("log10")).show();
         -----------
         |"LOG10"  |
         -----------
         |2.0      |
         -----------
         
        Parameters:
        col - The input column to get logarithm value
        Returns:
        column object from logarithm function.
        Since:
        1.14.0
      • log10

        public static Column log10​(String s)
        Computes the logarithm of the given value in base 10.
        
         DataFrame df = getSession().sql("select * from values (100) as T(a)");
         df.select(Functions.log10("a").as("log10")).show();
         -----------
         |"LOG10"  |
         -----------
         |2.0      |
         -----------
         
        Parameters:
        s - The input columnName in string to get logarithm value
        Returns:
        column object from logarithm function.
        Since:
        1.14.0
      • log1p

        public static Column log1p​(Column col)
        Computes the logarithm of the given value in base 10.
        
         DataFrame df = getSession().sql("select * from values (0.1) as T(a)");
         df.select(Functions.log1p(df.col("a")).as("log1p")).show();
         -----------------------
         |"LOG1P"              |
         -----------------------
         |0.09531017980432493  |
         -----------------------
         
        Parameters:
        col - The input column to get logarithm value
        Returns:
        column object from logarithm function.
        Since:
        1.14.0
      • log1p

        public static Column log1p​(String s)
        Computes the logarithm of the given value in base 10.
        
         DataFrame df = getSession().sql("select * from values (0.1) as T(a)");
         df.select(Functions.log1p("a").as("log1p")).show();
         -----------------------
         |"LOG1P"              |
         -----------------------
         |0.09531017980432493  |
         -----------------------
         
        Parameters:
        s - The input columnName in string to get logarithm value
        Returns:
        column object from logarithm function.
        Since:
        1.14.0
      • base64

        public static Column base64​(Column c)
        Computes the BASE64 encoding of a column and returns it as a string column. This is the reverse of unbase64.
        
         DataFrame df = getSession().sql("select * from values ('test') as T(a)");
         df.select(Functions.base64(Functions.col("a")).as("base64")).show();
         ------------
         |"BASE64"  |
         ------------
         |dGVzdA==  |
         ------------
         
        Parameters:
        c - ColumnName to apply base64 operation
        Returns:
        base64 encoded value of the given input column.
        Since:
        1.14.0
      • unbase64

        public static Column unbase64​(Column c)
        Decodes a BASE64 encoded string column and returns it as a column.
        
         DataFrame df = getSession().sql("select * from values ('dGVzdA==') as T(a)");
         df.select(Functions.unbase64(Functions.col("a")).as("unbase64")).show();
         --------------
         |"UNBASE64"  |
         --------------
         |test        |
         --------------
         
        Parameters:
        c - ColumnName to apply unbase64 operation
        Returns:
        the decoded value of the given encoded value.
        Since:
        1.14.0
      • locate

        public static Column locate​(Column substr,
                                    Column str,
                                    int pos)
        Locate the position of the first occurrence of substr in a string column, after position pos.
        
         DataFrame df = getSession().sql("select * from values ('scala', 'java scala python'), \n " +
                     "('b', 'abcd') as T(a,b)");
         df.select(Functions.locate(Functions.col("a"), Functions.col("b"), 1).as("locate")).show();
         ------------
         |"LOCATE"  |
         ------------
         |6         |
         |2         |
         ------------
         
        Parameters:
        substr - string to search
        str - value where string will be searched
        pos - index for starting the search
        Returns:
        returns the position of the first occurrence.
        Since:
        1.14.0
      • locate

        public static Column locate​(String substr,
                                    Column str)
        Locate the position of the first occurrence of substr in a string column, after position pos. default to 1.
        
         DataFrame df = getSession().sql("select * from values ('abcd') as T(s)");
         df.select(Functions.locate("b", Functions.col("s")).as("locate")).show();
         ------------
         |"LOCATE"  |
         ------------
         |2         |
         ------------
         
        Parameters:
        substr - string to search
        str - value where string will be searched
        Returns:
        returns the position of the first occurrence.
        Since:
        1.14.0
      • ntile

        public static Column ntile​(int n)
        Window function: returns the ntile group id (from 1 to `n` inclusive) in an ordered window partition. For example, if `n` is 4, the first quarter of the rows will get value 1, the second quarter will get 2, the third quarter will get 3, and the last quarter will get 4.

        This is equivalent to the NTILE function in SQL.

        
         DataFrame df = getSession().sql("select * from values(1,2),(1,2),(2,1),(2,2),(2,2) as T(x,y)");
         df.select(Functions.ntile(4).over(Window.partitionBy(df.col("x")).orderBy(df.col("y"))).as("ntile")).show();
         -----------
         |"NTILE"  |
         -----------
         |1        |
         |2        |
         |3        |
         |1        |
         |2        |
         -----------
         
        Parameters:
        n - number of groups
        Returns:
        returns the ntile group id (from 1 to n inclusive) in an ordered window partition.
        Since:
        1.14.0
      • randn

        public static Column randn()
        Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution. Return a call to the Snowflake RANDOM function. NOTE: Snowflake returns integers of 17-19 digits.
        
         DataFrame df = getSession().sql("select * from values(1),(2),(3) as T(a)");
         df.withColumn("randn",Functions.randn()).select("randn").show();
         ------------------------
         |"RANDN"               |
         ------------------------
         |6799378361097866000   |
         |-7280487148628086605  |
         |775606662514393461    |
         ------------------------
         
        Returns:
        Random number.
        Since:
        1.14.0
      • randn

        public static Column randn​(long seed)
        Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution. Return a call to the Snowflake RANDOM function. NOTE: Snowflake returns integers of 17-19 digits.
        
         DataFrame df = getSession().sql("select * from values(1),(2),(3) as T(a)");
         df.withColumn("randn_with_seed",Functions.randn(123l)).select("randn_with_seed").show();
         ------------------------
         |"RANDN_WITH_SEED"     |
         ------------------------
         |5777523539921853504   |
         |-8190739547906189845  |
         |-1138438814981368515  |
         ------------------------
         
        Returns:
        Random number.
        Since:
        1.14.0
      • shiftleft

        public static Column shiftleft​(Column c,
                                       int numBits)
        Shift the given value numBits left. If the given value is a long value, this function will return a long value else it will return an integer value.
        
         DataFrame df = getSession().sql("select * from values(1),(2),(3) as T(a)");
         df.select(Functions.shiftleft(Functions.col("a"), 1).as("shiftleft")).show();
         ---------------
         |"SHIFTLEFT"  |
         ---------------
         |2            |
         |4            |
         |6            |
         ---------------
         
        Returns:
        Column object.
        Since:
        1.14.0
      • shiftright

        public static Column shiftright​(Column c,
                                        int numBits)
        Shift the given value numBits right. If the given value is a long value, it will return a long value else it will return an integer value.
        
         DataFrame df = getSession().sql("select * from values(1),(2),(3) as T(a)");
         df.select(Functions.shiftright(Functions.col("a"), 1).as("shiftright")).show();
         ---------------
         |"SHIFTRIGHT"  |
         ---------------
         |0            |
         |1            |
         |1            |
         ---------------
         
        Returns:
        Column object.
        Since:
        1.14.0
      • hex

        public static Column hex​(Column c)
        Computes hex value of the given column.
        
         DataFrame df = getSession().sql("select * from values(1),(2),(3) as T(a)");
         df.select(Functions.hex(Functions.col("a")).as("hex")).show();
         ---------
         |"HEX"  |
         ---------
         |31     |
         |32     |
         |33     |
         ---------
         
        Returns:
        Column object.
        Since:
        1.14.0
      • unhex

        public static Column unhex​(Column c)
        Inverse of hex. Interprets each pair of characters as a hexadecimal number and converts to the byte representation of number.
        
         DataFrame df = getSession().sql("select * from values(31),(32),(33) as T(a)");
         df.select(Functions.unhex(Functions.col("a")).as("unhex")).show();
         -----------
         |"UNHEX"  |
         -----------
         |1        |
         |2        |
         |3        |
         -----------
         
        Returns:
        Column object.
        Since:
        1.14.0
      • callUDF

        public static Column callUDF​(String udfName,
                                     Column... cols)
        Calls a user-defined function (UDF) by name.
        Parameters:
        udfName - The name of UDF
        cols - The list of parameters
        Returns:
        The result column
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF0<?> func,
                                              DataType output)
        Registers a Java Lambda of 0 argument as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF1<?,​?> func,
                                              DataType input,
                                              DataType output)
        Registers a Java Lambda of 1 argument as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataType
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF2<?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 2 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF3<?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 3 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF4<?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 4 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF5<?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 5 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF6<?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 6 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF7<?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 7 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF8<?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 8 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF9<?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 9 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF10<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 10 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF11<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 11 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF12<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 12 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF13<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 13 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF14<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 14 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF15<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 15 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF16<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 16 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF17<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 17 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF18<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 18 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF19<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 19 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF20<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 20 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF21<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 21 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0
      • udf

        public static UserDefinedFunction udf​(JavaUDF22<?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?,​?> func,
                                              DataType[] input,
                                              DataType output)
        Registers a Java Lambda of 22 arguments as a Snowflake UDF and returns the UDF.
        Parameters:
        func - the Java lambda to be registered
        input - the UDF input types.DataTypes
        output - the UDF return types.DataType
        Returns:
        The result UserDefinedFunction reference
        Since:
        0.12.0