snowflake.snowpark.functions.jarowinkler_similarity
- snowflake.snowpark.functions.jarowinkler_similarity(string_expr1: Union[snowflake.snowpark.column.Column, str], string_expr2: Union[snowflake.snowpark.column.Column, str]) → Column
Computes the Jaro-Winkler similarity between two strings. The Jaro-Winkler similarity is a string metric that measures how closely two character sequences match. It is a variant of the Jaro similarity metric that gives more favorable ratings to strings sharing a common prefix.
- Parameters:
string_expr1 (ColumnOrName) – The first string expression to compare.
string_expr2 (ColumnOrName) – The second string expression to compare.
- Returns:
The Jaro-Winkler similarity score as an integer between 0 and 100.
- Return type:
Column
- Examples:
>>> from snowflake.snowpark.functions import jarowinkler_similarity
>>> df = session.create_dataframe([
...     ("Snowflake", "Oracle"),
...     ("Ich weiß nicht", "Ich wei? nicht"),
...     ("Gute nacht", "Ich weis nicht"),
...     ("święta", "swieta"),
...     ("", ""),
...     ("test", "test")
... ], schema=["s", "t"])
>>> df.select(jarowinkler_similarity(df["s"], df["t"]).alias("similarity")).collect()
[Row(SIMILARITY=61), Row(SIMILARITY=97), Row(SIMILARITY=56), Row(SIMILARITY=77), Row(SIMILARITY=0), Row(SIMILARITY=100)]
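To illustrate how the common-prefix bonus mentioned in the description works, here is a minimal pure-Python sketch of the textbook Jaro-Winkler formula. It is not Snowflake's implementation: the helper names `jaro` and `jaro_winkler`, the prefix scale of 0.1, and the prefix cap of 4 characters are assumptions taken from the commonly cited definition. The sketch returns a float in [0, 1] rather than the integer between 0 and 100 documented above, and it does not try to reproduce any normalization or rounding the server may apply.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1] (illustrative helper, not a Snowpark API)."""
    if not s1 or not s2:
        return 0.0
    if s1 == s2:
        return 1.0

    # Characters count as matching only if they are equal and close enough.
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Walk the matched characters of both strings in order; each pair that
    # differs is half a transposition.
    k = 0
    out_of_order = 0
    for i in range(len(s1)):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (
        matches / len(s1)
        + matches / len(s2)
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the shared prefix (assumed defaults)."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    # The bonus shrinks as the base score approaches 1, so the result stays <= 1.
    return base + prefix * prefix_scale * (1 - base)
```

For the classic pair "MARTHA" and "MARHTA", the Jaro score of about 0.944 rises to about 0.961 because the two strings share the three-character prefix "MAR".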