Snowpark Migration Accelerator: Issue Codes for Spark - Scala

SPRKSCL1126

Message: org.apache.spark.sql.functions.covar_pop has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.covar_pop function, which has a workaround.

Input

Below is an example of the org.apache.spark.sql.functions.covar_pop function, first used with column names as the arguments and then with column objects.

val df = Seq(
  (10.0, 100.0),
  (20.0, 150.0),
  (30.0, 200.0),
  (40.0, 250.0),
  (50.0, 300.0)
).toDF("column1", "column2")

val result1 = df.select(covar_pop("column1", "column2").as("covariance_pop"))
val result2 = df.select(covar_pop(col("column1"), col("column2")).as("covariance_pop"))

Output

The SMA adds the EWI SPRKSCL1126 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  (10.0, 100.0),
  (20.0, 150.0),
  (30.0, 200.0),
  (40.0, 250.0),
  (50.0, 300.0)
).toDF("column1", "column2")

/*EWI: SPRKSCL1126 => org.apache.spark.sql.functions.covar_pop has a workaround, see documentation for more info*/
val result1 = df.select(covar_pop("column1", "column2").as("covariance_pop"))
/*EWI: SPRKSCL1126 => org.apache.spark.sql.functions.covar_pop has a workaround, see documentation for more info*/
val result2 = df.select(covar_pop(col("column1"), col("column2")).as("covariance_pop"))

Recommended fix

Snowpark has an equivalent covar_pop function that receives two column objects as arguments. For that reason, the Spark overload that receives two column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives two string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  (10.0, 100.0),
  (20.0, 150.0),
  (30.0, 200.0),
  (40.0, 250.0),
  (50.0, 300.0)
).toDF("column1", "column2")

val result1 = df.select(covar_pop(col("column1"), col("column2")).as("covariance_pop"))
val result2 = df.select(covar_pop(col("column1"), col("column2")).as("covariance_pop"))

Additional recommendations

SPRKSCL1112

Message: *spark element* is not supported

Category: Conversion error

Description

This issue appears when the SMA detects the use of a Spark element that is not supported by Snowpark and that does not have its own error code. This is a generic error code used by the SMA for any unsupported Spark element.

Scenario

Input

Below is an example of a Spark element that is not supported by Snowpark and therefore generates this EWI.

val df = session.range(10)
val result = df.isLocal

Output

The SMA adds the EWI SPRKSCL1112 to the output code to let you know that this element is not supported by Snowpark.

val df = session.range(10)
/*EWI: SPRKSCL1112 => org.apache.spark.sql.Dataset.isLocal is not supported*/
val result = df.isLocal

Recommended fix

Since this is a generic error code that applies to a range of unsupported functions, there is no single, specific fix. The appropriate action depends on the particular element being used.

Please note that even though an element is not supported, it does not necessarily mean that no solution or workaround can be found. It only means that the SMA itself cannot find a solution.
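
As an illustration only, a manual workaround for the isLocal example above could rely on the fact that Snowpark DataFrames always execute in Snowflake, never locally; this is a sketch under that assumption, not an SMA output:

val df = session.range(10)
// df.isLocal has no Snowpark equivalent; since Snowpark always runs in
// Snowflake, a local-execution check can be replaced with a constant.
val result = false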

Additional recommendations

SPRKSCL1143

Message: An error occurred when loading the symbol table

Category: Conversion error

Description

This issue appears when there is an error loading the symbols into the SMA symbol table. The symbol table is part of the underlying architecture of the SMA and enables more complex conversions.

Additional recommendations

  • This is unlikely to be an error in the source code itself, but rather is an error in how the SMA processes the source code. The best resolution would be to post an issue in the SMA.

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.

SPRKSCL1153

Warning

This issue code has been deprecated since Spark Conversion Core Version 4.3.2

Message: org.apache.spark.sql.functions.max has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.max function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.max function, first used with a column name as an argument and then with a column object.

val df = Seq(10, 12, 20, 15, 18).toDF("value")
val result1 = df.select(max("value"))
val result2 = df.select(max(col("value")))

Output

The SMA adds the EWI SPRKSCL1153 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(10, 12, 20, 15, 18).toDF("value")
/*EWI: SPRKSCL1153 => org.apache.spark.sql.functions.max has a workaround, see documentation for more info*/
val result1 = df.select(max("value"))
/*EWI: SPRKSCL1153 => org.apache.spark.sql.functions.max has a workaround, see documentation for more info*/
val result2 = df.select(max(col("value")))

Recommended fix

Snowpark has an equivalent max function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(10, 12, 20, 15, 18).toDF("value")
val result1 = df.select(max(col("value")))
val result2 = df.select(max(col("value")))

Additional recommendations

SPRKSCL1102

This issue code has been deprecated since Spark Conversion Core 2.3.22

Message: Explode is not supported

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.explode function, which is not supported by Snowpark.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.explode function used to get the consolidated information of the array fields of the dataset.

    val explodeData = Seq(
      Row("Cat", Array("Gato","Chat")),
      Row("Dog", Array("Perro","Chien")),
      Row("Bird", Array("Ave","Oiseau"))
    )

    val explodeSchema = StructType(
      List(
        StructField("Animal", StringType),
        StructField("Translation", ArrayType(StringType))
      )
    )

    val rddExplode = session.sparkContext.parallelize(explodeData)

    val dfExplode = session.createDataFrame(rddExplode, explodeSchema)

    dfExplode.select(explode(dfExplode("Translation").alias("exploded")))

Output

The SMA adds the EWI SPRKSCL1102 to the output code to let you know that this function is not supported by Snowpark.

    val explodeData = Seq(
      Row("Cat", Array("Gato","Chat")),
      Row("Dog", Array("Perro","Chien")),
      Row("Bird", Array("Ave","Oiseau"))
    )

    val explodeSchema = StructType(
      List(
        StructField("Animal", StringType),
        StructField("Translation", ArrayType(StringType))
      )
    )

    val rddExplode = session.sparkContext.parallelize(explodeData)

    val dfExplode = session.createDataFrame(rddExplode, explodeSchema)

    /*EWI: SPRKSCL1102 => Explode is not supported */
    dfExplode.select(explode(dfExplode("Translation").alias("exploded")))

Recommended fix

Since explode is not supported by Snowpark, the flatten function can be used as a substitute.

The following fix flattens the dfExplode DataFrame and then performs the query to replicate the result from Spark.

    val explodeData = Seq(
      Row("Cat", Array("Gato","Chat")),
      Row("Dog", Array("Perro","Chien")),
      Row("Bird", Array("Ave","Oiseau"))
    )

    val explodeSchema = StructType(
      List(
        StructField("Animal", StringType),
        StructField("Translation", ArrayType(StringType))
      )
    )

    val rddExplode = session.sparkContext.parallelize(explodeData)

    val dfExplode = session.createDataFrame(rddExplode, explodeSchema)

    val dfFlatten = dfExplode.flatten(col("Translation")).alias("exploded")
                             .select(col("exploded.value").alias("Translation"))

Additional recommendations

SPRKSCL1136

Warning

This issue code has been deprecated since Spark Conversion Core 4.3.2

Message: org.apache.spark.sql.functions.min has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.min function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.min function, first used with a column name as an argument and then with a column object.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
val result1 = df.select(min("value"))
val result2 = df.select(min(col("value")))

Output

The SMA adds the EWI SPRKSCL1136 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
/*EWI: SPRKSCL1136 => org.apache.spark.sql.functions.min has a workaround, see documentation for more info*/
val result1 = df.select(min("value"))
/*EWI: SPRKSCL1136 => org.apache.spark.sql.functions.min has a workaround, see documentation for more info*/
val result2 = df.select(min(col("value")))

Recommended fix

Snowpark has an equivalent min function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that takes a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
val result1 = df.select(min(col("value")))
val result2 = df.select(min(col("value")))

Additional recommendations

SPRKSCL1167

Message: Project file not found in the input folder

Category: Warning

Description

This issue appears when the SMA detects that the input folder does not contain a project configuration file. The project configuration files supported by the SMA are listed below (a minimal example follows the list):

  • build.sbt

  • build.gradle

  • pom.xml
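
For illustration, a minimal build.sbt sketch of the kind the SMA can detect; the project name, versions, and dependency shown are placeholder assumptions, not requirements of the tool:

// build.sbt (minimal sketch; adjust names and versions to your project)
name := "my-spark-project"
version := "0.1.0"
scalaVersion := "2.12.18"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.3.0"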

Additional recommendations

SPRKSCL1147

Message: org.apache.spark.sql.functions.tanh has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.tanh function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.tanh function, first used with a column name as an argument and then with a column object.

val df = Seq(-1.0, 0.5, 1.0, 2.0).toDF("value")
val result1 = df.withColumn("tanh_value", tanh("value"))
val result2 = df.withColumn("tanh_value", tanh(col("value")))

Output

The SMA adds the EWI SPRKSCL1147 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(-1.0, 0.5, 1.0, 2.0).toDF("value")
/*EWI: SPRKSCL1147 => org.apache.spark.sql.functions.tanh has a workaround, see documentation for more info*/
val result1 = df.withColumn("tanh_value", tanh("value"))
/*EWI: SPRKSCL1147 => org.apache.spark.sql.functions.tanh has a workaround, see documentation for more info*/
val result2 = df.withColumn("tanh_value", tanh(col("value")))

Recommended fix

Snowpark has an equivalent tanh function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(-1.0, 0.5, 1.0, 2.0).toDF("value")
val result1 = df.withColumn("tanh_value", tanh(col("value")))
val result2 = df.withColumn("tanh_value", tanh(col("value")))

Additional recommendations

SPRKSCL1116

Warning

This issue code has been deprecated since Spark Conversion Core Version 2.40.1

Message: org.apache.spark.sql.functions.split has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.split function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.split function that generates this EWI.

val df = Seq("apple,banana,orange", "grape,lemon,lime", "cherry,blueberry,strawberry").toDF("values")
val result1 = df.withColumn("split_values", split(col("values"), ","))
val result2 = df.withColumn("split_values", split(col("values"), ",", 0))

Output

The SMA adds the EWI SPRKSCL1116 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("apple,banana,orange", "grape,lemon,lime", "cherry,blueberry,strawberry").toDF("values")
/*EWI: SPRKSCL1116 => org.apache.spark.sql.functions.split has a workaround, see documentation for more info*/
val result1 = df.withColumn("split_values", split(col("values"), ","))
/*EWI: SPRKSCL1116 => org.apache.spark.sql.functions.split has a workaround, see documentation for more info*/
val result2 = df.withColumn("split_values", split(col("values"), ",", 0))

Recommended fix

For the Spark overload that receives two arguments, you can convert the second argument into a column object using the com.snowflake.snowpark.functions.lit function as a workaround.

The overload that receives three arguments is not yet supported by Snowpark and there is no workaround.

val df = Seq("apple,banana,orange", "grape,lemon,lime", "cherry,blueberry,strawberry").toDF("values")
val result1 = df.withColumn("split_values", split(col("values"), lit(",")))
val result2 = df.withColumn("split_values", split(col("values"), ",", 0)) // This overload is not supported yet

Additional recommendations

SPRKSCL1122

Message: org.apache.spark.sql.functions.corr has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.corr function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.corr function, first used with column names as the arguments and then with column objects.

val df = Seq(
  (10.0, 20.0),
  (20.0, 40.0),
  (30.0, 60.0)
).toDF("col1", "col2")

val result1 = df.select(corr("col1", "col2"))
val result2 = df.select(corr(col("col1"), col("col2")))

Output

The SMA adds the EWI SPRKSCL1122 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  (10.0, 20.0),
  (20.0, 40.0),
  (30.0, 60.0)
).toDF("col1", "col2")

/*EWI: SPRKSCL1122 => org.apache.spark.sql.functions.corr has a workaround, see documentation for more info*/
val result1 = df.select(corr("col1", "col2"))
/*EWI: SPRKSCL1122 => org.apache.spark.sql.functions.corr has a workaround, see documentation for more info*/
val result2 = df.select(corr(col("col1"), col("col2")))

Recommended fix

Snowpark has an equivalent corr function that receives two column objects as arguments. For that reason, the Spark overload that receives column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives two string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  (10.0, 20.0),
  (20.0, 40.0),
  (30.0, 60.0)
).toDF("col1", "col2")

val result1 = df.select(corr(col("col1"), col("col2")))
val result2 = df.select(corr(col("col1"), col("col2")))

Additional recommendations

SPRKSCL1173

Message: SQL embedded code cannot be processed.

Category: Warning.

Description

This issue appears when the SMA detects SQL-embedded code that cannot be processed. As a result, the SQL-embedded code cannot be converted to Snowflake.

Scenario

Input

Below is an example of SQL-embedded code that cannot be processed.

spark.sql("CREATE VIEW IF EXISTS My View" + "AS Select * From my Table WHERE date < current_date()")

Output

The SMA adds the EWI SPRKSCL1173 to the output code to let you know that the SQL-embedded code cannot be processed.

/*EWI: SPRKSCL1173 => SQL embedded code cannot be processed.*/
spark.sql("CREATE VIEW IF EXISTS My View" + "AS Select * From my Table WHERE date < current_date()")

Recommended fix

Make sure the SQL-embedded code is a plain string without interpolations, variables, or string concatenations, as sketched below.
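
A sketch of the statement above rewritten as a single string literal; the exact SQL (view name, table, and filter) is assumed for illustration:

spark.sql("CREATE VIEW IF NOT EXISTS MyView AS SELECT * FROM myTable WHERE date < current_date()")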

Additional recommendations

SPRKSCL1163

Message: The element is not a literal and cannot be evaluated.

Category: Conversion error.

Description

This issue appears when the element currently being processed is not a literal and therefore cannot be evaluated by the SMA.

Scenario

Input

Below is an example where the element being processed is not a literal, so it cannot be evaluated by the SMA.

val format_type = "csv"
spark.read.format(format_type).load(path)

Output

The SMA adds the EWI SPRKSCL1163 to the output code to let you know that the format_type parameter is not a literal and cannot be evaluated by the SMA.

/*EWI: SPRKSCL1163 => format_type is not a literal and can't be evaluated*/
val format_type = "csv"
spark.read.format(format_type).load(path)

Recommended fix

  • Make sure the value of the variable is a valid value, to avoid unexpected behavior (see the sketch after this list).
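
For instance, when the variable's value is known at migration time, you can inline the literal so the SMA can evaluate it; csv here is just an assumed example value:

// Inlining the known literal lets the SMA evaluate the format call.
spark.read.format("csv").load(path)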

Additional recommendations

SPRKSCL1132

Message: org.apache.spark.sql.functions.grouping_id has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.grouping_id function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.grouping_id function, first used with multiple column names as arguments and then with column objects.

val df = Seq(
  ("Store1", "Product1", 100),
  ("Store1", "Product2", 150),
  ("Store2", "Product1", 200),
  ("Store2", "Product2", 250)
).toDF("store", "product", "amount")

val result1 = df.cube("store", "product").agg(sum("amount"), grouping_id("store", "product"))
val result2 = df.cube("store", "product").agg(sum("amount"), grouping_id(col("store"), col("product")))

Output

The SMA adds the EWI SPRKSCL1132 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Store1", "Product1", 100),
  ("Store1", "Product2", 150),
  ("Store2", "Product1", 200),
  ("Store2", "Product2", 250)
).toDF("store", "product", "amount")

/*EWI: SPRKSCL1132 => org.apache.spark.sql.functions.grouping_id has a workaround, see documentation for more info*/
val result1 = df.cube("store", "product").agg(sum("amount"), grouping_id("store", "product"))
/*EWI: SPRKSCL1132 => org.apache.spark.sql.functions.grouping_id has a workaround, see documentation for more info*/
val result2 = df.cube("store", "product").agg(sum("amount"), grouping_id(col("store"), col("product")))

Recommended fix

Snowpark has an equivalent grouping_id function that receives multiple column objects as arguments. For that reason, the Spark overload that receives multiple column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives multiple string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("Store1", "Product1", 100),
  ("Store1", "Product2", 150),
  ("Store2", "Product1", 200),
  ("Store2", "Product2", 250)
).toDF("store", "product", "amount")

val result1 = df.cube("store", "product").agg(sum("amount"), grouping_id(col("store"), col("product")))
val result2 = df.cube("store", "product").agg(sum("amount"), grouping_id(col("store"), col("product")))

Additional recommendations

SPRKSCL1106

Warning

This issue code is now deprecated

Message: The writer option is not supported.

Category: Conversion error.

Description

This issue appears when the tool detects, in a writer statement, the use of an option that is not supported by Snowpark.

Scenario

Input

Below is an example of the org.apache.spark.sql.DataFrameWriter.option method used to add options to a writer statement.

df.write.format("net.snowflake.spark.snowflake").option("dbtable", tablename)

Output

The SMA adds the EWI SPRKSCL1106 to the output code to let you know that the option method is not supported by Snowpark.

df.write.saveAsTable(tablename)
/*EWI: SPRKSCL1106 => Writer option is not supported .option("dbtable", tablename)*/

Recommended fix

There is no recommended fix for this scenario.

Additional recommendations

SPRKSCL1157

Message: org.apache.spark.sql.functions.kurtosis has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.kurtosis function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.kurtosis function that generates this EWI. In this example, the kurtosis function is used to calculate the kurtosis of the selected column.

val df = Seq("1", "2", "3").toDF("elements")
val result1 = kurtosis(col("elements"))
val result2 = kurtosis("elements")

Output

The SMA adds the EWI SPRKSCL1157 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("1", "2", "3").toDF("elements")
/*EWI: SPRKSCL1157 => org.apache.spark.sql.functions.kurtosis has a workaround, see documentation for more info*/
val result1 = kurtosis(col("elements"))
/*EWI: SPRKSCL1157 => org.apache.spark.sql.functions.kurtosis has a workaround, see documentation for more info*/
val result2 = kurtosis("elements")

Recommended fix

Snowpark has an equivalent kurtosis function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq("1", "2", "3").toDF("elements")
val result1 = kurtosis(col("elements"))
val result2 = kurtosis(col("elements"))

Additional recommendations

SPRKSCL1146

Message: org.apache.spark.sql.functions.tan has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.tan function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.tan function, first used with a column name as an argument and then with a column object.

val df = Seq(math.Pi / 4, math.Pi / 3, math.Pi / 6).toDF("angle")
val result1 = df.withColumn("tan_value", tan("angle"))
val result2 = df.withColumn("tan_value", tan(col("angle")))

Output

The SMA adds the EWI SPRKSCL1146 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(math.Pi / 4, math.Pi / 3, math.Pi / 6).toDF("angle")
/*EWI: SPRKSCL1146 => org.apache.spark.sql.functions.tan has a workaround, see documentation for more info*/
val result1 = df.withColumn("tan_value", tan("angle"))
/*EWI: SPRKSCL1146 => org.apache.spark.sql.functions.tan has a workaround, see documentation for more info*/
val result2 = df.withColumn("tan_value", tan(col("angle")))

Recommended fix

Snowpark has an equivalent tan function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(math.Pi / 4, math.Pi / 3, math.Pi / 6).toDF("angle")
val result1 = df.withColumn("tan_value", tan(col("angle")))
val result2 = df.withColumn("tan_value", tan(col("angle")))

Additional recommendations

SPRKSCL1117

Warning

This issue code has been deprecated since Spark Conversion Core 2.40.1

Message: org.apache.spark.sql.functions.translate has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.translate function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.translate function that generates this EWI. In this example, the translate function is used to replace the characters 'a', 'e' and 'o' in each word with '1', '2' and '3', respectively.

val df = Seq("hello", "world", "scala").toDF("word")
val result = df.withColumn("translated_word", translate(col("word"), "aeo", "123"))

Output

The SMA adds the EWI SPRKSCL1117 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("hello", "world", "scala").toDF("word")
/*EWI: SPRKSCL1117 => org.apache.spark.sql.functions.translate has a workaround, see documentation for more info*/
val result = df.withColumn("translated_word", translate(col("word"), "aeo", "123"))

Recommended fix

As a workaround, you can convert the second and third argument into a column object using the com.snowflake.snowpark.functions.lit function.

val df = Seq("hello", "world", "scala").toDF("word")
val result = df.withColumn("translated_word", translate(col("word"), lit("aeo"), lit("123")))

Additional recommendations

SPRKSCL1123

Message: org.apache.spark.sql.functions.cos has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.cos function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.cos function, first used with a column name as an argument and then with a column object.

val df = Seq(0.0, Math.PI / 4, Math.PI / 2, Math.PI).toDF("angle_radians")
val result1 = df.withColumn("cosine_value", cos("angle_radians"))
val result2 = df.withColumn("cosine_value", cos(col("angle_radians")))

Output

The SMA adds the EWI SPRKSCL1123 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(0.0, Math.PI / 4, Math.PI / 2, Math.PI).toDF("angle_radians")
/*EWI: SPRKSCL1123 => org.apache.spark.sql.functions.cos has a workaround, see documentation for more info*/
val result1 = df.withColumn("cosine_value", cos("angle_radians"))
/*EWI: SPRKSCL1123 => org.apache.spark.sql.functions.cos has a workaround, see documentation for more info*/
val result2 = df.withColumn("cosine_value", cos(col("angle_radians")))

Recommended fix

Snowpark has an equivalent cos function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(0.0, Math.PI / 4, Math.PI / 2, Math.PI).toDF("angle_radians")
val result1 = df.withColumn("cosine_value", cos(col("angle_radians")))
val result2 = df.withColumn("cosine_value", cos(col("angle_radians")))

Additional recommendations

SPRKSCL1172

Message: Snowpark does not support StructField with a metadata parameter.

Category: Warning

Description

This issue appears when the SMA detects a use of org.apache.spark.sql.types.StructField.apply with org.apache.spark.sql.types.Metadata as a parameter. This is because Snowpark does not support the metadata parameter.

Scenario

Input

Below is an example of the org.apache.spark.sql.types.StructField.apply function that generates this EWI. In this example, the apply function is used to generate an instance of StructField.

val result = StructField("f1", StringType, true, metadata)

Output

The SMA adds the EWI SPRKSCL1172 to the output code to let you know that the metadata parameter is not supported by Snowflake.

/*EWI: SPRKSCL1172 => Snowpark does not support StructField with metadata parameter.*/
val result = StructField("f1", StringType, true, metadata)

Recommended fix

Snowpark has an equivalent com.snowflake.snowpark.types.StructField.apply function that receives three parameters. Therefore, as a workaround, you can remove the metadata argument.

val result = StructField("f1", StringType, true)

Additional recommendations

SPRKSCL1162

Note

This issue code is now deprecated

Message: An error occurred when extracting the dbc files.

Category: Warning.

Description

This issue appears when a dbc file cannot be extracted. The warning could be caused by one or more of the following reasons: the file is too large, inaccessible, read-only, etc.

Additional recommendations

  • As a workaround, check the size of the file in case it is too large to be processed. Also, verify whether the tool can access the file, to avoid access issues.

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.

SPRKSCL1133

Message: org.apache.spark.sql.functions.least has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.least function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.least function, first used with multiple column names as arguments and then with column objects.

val df = Seq((10, 20, 5), (15, 25, 30), (7, 14, 3)).toDF("value1", "value2", "value3")
val result1 = df.withColumn("least", least("value1", "value2", "value3"))
val result2 = df.withColumn("least", least(col("value1"), col("value2"), col("value3")))

Output

The SMA adds the EWI SPRKSCL1133 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq((10, 20, 5), (15, 25, 30), (7, 14, 3)).toDF("value1", "value2", "value3")
/*EWI: SPRKSCL1133 => org.apache.spark.sql.functions.least has a workaround, see documentation for more info*/
val result1 = df.withColumn("least", least("value1", "value2", "value3"))
/*EWI: SPRKSCL1133 => org.apache.spark.sql.functions.least has a workaround, see documentation for more info*/
val result2 = df.withColumn("least", least(col("value1"), col("value2"), col("value3")))

Recommended fix

Snowpark has an equivalent least function that receives multiple column objects as arguments. For that reason, the Spark overload that receives multiple column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives multiple string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq((10, 20, 5), (15, 25, 30), (7, 14, 3)).toDF("value1", "value2", "value3")
val result1 = df.withColumn("least", least(col("value1"), col("value2"), col("value3")))
val result2 = df.withColumn("least", least(col("value1"), col("value2"), col("value3")))

Additional recommendations

SPRKSCL1107

Warning

This issue code is now deprecated

Message: Writer save is not supported.

Category: Conversion error.

Description

This issue appears when the tool detects, in a writer statement, the use of a save method that is not supported by Snowpark.

Scenario

Input

Below is an example of the org.apache.spark.sql.DataFrameWriter.save method used to save the DataFrame content.

df.write.format("net.snowflake.spark.snowflake").save()

Output

The SMA adds the EWI SPRKSCL1107 to the output code to let you know that the save method is not supported by Snowpark.

df.write.saveAsTable(tablename)
/*EWI: SPRKSCL1107 => Writer method is not supported .save()*/

Recommended fix

There is no recommended fix for this scenario.

Additional recommendations

SPRKSCL1156

Message: org.apache.spark.sql.functions.degrees has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.degrees function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.degrees function, first used with a column name as an argument and then with a column object.

val df = Seq(math.Pi, math.Pi / 2, math.Pi / 4, math.Pi / 6).toDF("radians")
val result1 = df.withColumn("degrees", degrees("radians"))
val result2 = df.withColumn("degrees", degrees(col("radians")))

Output

The SMA adds the EWI SPRKSCL1156 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(math.Pi, math.Pi / 2, math.Pi / 4, math.Pi / 6).toDF("radians")
/*EWI: SPRKSCL1156 => org.apache.spark.sql.functions.degrees has a workaround, see documentation for more info*/
val result1 = df.withColumn("degrees", degrees("radians"))
/*EWI: SPRKSCL1156 => org.apache.spark.sql.functions.degrees has a workaround, see documentation for more info*/
val result2 = df.withColumn("degrees", degrees(col("radians")))

Recommended fix

Snowpark has an equivalent degrees function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(math.Pi, math.Pi / 2, math.Pi / 4, math.Pi / 6).toDF("radians")
val result1 = df.withColumn("degrees", degrees(col("radians")))
val result2 = df.withColumn("degrees", degrees(col("radians")))

Additional recommendations

SPRKSCL1127

Message: org.apache.spark.sql.functions.covar_samp has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.covar_samp function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.covar_samp function, first used with column names as the arguments and then with column objects.

val df = Seq(
  (10.0, 20.0),
  (15.0, 25.0),
  (20.0, 30.0),
  (25.0, 35.0),
  (30.0, 40.0)
).toDF("value1", "value2")

val result1 = df.select(covar_samp("value1", "value2").as("sample_covariance"))
val result2 = df.select(covar_samp(col("value1"), col("value2")).as("sample_covariance"))

Output

The SMA adds the EWI SPRKSCL1127 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  (10.0, 20.0),
  (15.0, 25.0),
  (20.0, 30.0),
  (25.0, 35.0),
  (30.0, 40.0)
).toDF("value1", "value2")

/*EWI: SPRKSCL1127 => org.apache.spark.sql.functions.covar_samp has a workaround, see documentation for more info*/
val result1 = df.select(covar_samp("value1", "value2").as("sample_covariance"))
/*EWI: SPRKSCL1127 => org.apache.spark.sql.functions.covar_samp has a workaround, see documentation for more info*/
val result2 = df.select(covar_samp(col("value1"), col("value2")).as("sample_covariance"))

Recommended fix

Snowpark has an equivalent covar_samp function that receives two column objects as arguments. For that reason, the Spark overload that receives two column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives two string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  (10.0, 20.0),
  (15.0, 25.0),
  (20.0, 30.0),
  (25.0, 35.0),
  (30.0, 40.0)
).toDF("value1", "value2")

val result1 = df.select(covar_samp(col("value1"), col("value2")).as("sample_covariance"))
val result2 = df.select(covar_samp(col("value1"), col("value2")).as("sample_covariance"))

Additional recommendations

SPRKSCL1113

Message: org.apache.spark.sql.functions.next_day has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.next_day function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.next_day function, first used with a string as the second argument and then with a column object.

val df = Seq("2024-11-06", "2024-11-13", "2024-11-20").toDF("date")
val result1 = df.withColumn("next_monday", next_day(col("date"), "Mon"))
val result2 = df.withColumn("next_monday", next_day(col("date"), lit("Mon")))

Output

The SMA adds the EWI SPRKSCL1113 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("2024-11-06", "2024-11-13", "2024-11-20").toDF("date")
/*EWI: SPRKSCL1113 => org.apache.spark.sql.functions.next_day has a workaround, see documentation for more info*/
val result1 = df.withColumn("next_monday", next_day(col("date"), "Mon"))
/*EWI: SPRKSCL1113 => org.apache.spark.sql.functions.next_day has a workaround, see documentation for more info*/
val result2 = df.withColumn("next_monday", next_day(col("date"), lit("Mon")))

Recommended fix

Snowpark has an equivalent next_day function that receives two column objects as arguments. For that reason, the Spark overload that receives two column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives a column object and a string, you can convert the string into a column object using the com.snowflake.snowpark.functions.lit function as a workaround.

val df = Seq("2024-11-06", "2024-11-13", "2024-11-20").toDF("date")
val result1 = df.withColumn("next_monday", next_day(col("date"), lit("Mon")))
val result2 = df.withColumn("next_monday", next_day(col("date"), lit("Mon")))

Additional recommendations

SPRKSCL1002

Message: This code section has recovery from parsing errors *statement*

Category: Parsing error.

Description

This issue appears when the SMA detects a statement in the code of a file that cannot be read or understood correctly; this is called a parsing error. The SMA can recover from this parsing error and continue analyzing the code of the file. In this case, the SMA is able to process the rest of the file's code without errors.

Scenario

Input

Below is an example of invalid Scala code from which the SMA can recover.

class myClass {

    def function1() & = { 1 }

    def function2() = { 2 }

    def function3() = { 3 }

}

Output

The SMA adds the EWI SPRKSCL1002 to the output code to let you know that the code of the file has parsing errors; however, the SMA can recover from that error and continue analyzing the code of the file.

class myClass {

    def function1();//EWI: SPRKSCL1002 => Unexpected end of declaration. Failed token: '&' @(3,21).
    & = { 1 }

    def function2() = { 2 }

    def function3() = { 3 }

}

Recommended fix

Since the message pinpoints the error in the statement, you can try to identify the invalid syntax and remove it, or comment out the statement to avoid the parsing error.

class myClass {

    def function1() = { 1 }

    def function2() = { 2 }

    def function3() = { 3 }

}
class myClass {

    // def function1() & = { 1 }

    def function2() = { 2 }

    def function3() = { 3 }

}

Additional recommendations

SPRKSCL1142

Message: *spark element* is not defined

Category: Conversion error

Description

This issue appears when the SMA could not determine an appropriate mapping status for the given element. This means that the SMA does not yet know whether the element is supported by Snowpark or not. Please note that this is a generic error code used by the SMA for any element that is not defined.

Scenario

Input

Below is an example of a function for which the SMA could not determine an appropriate mapping status, and therefore it generated this EWI. In this case, you should assume that notDefinedFunction() is a valid Spark function and the code runs.

val df = session.range(10)
val result = df.notDefinedFunction()

Output

The SMA adds the EWI SPRKSCL1142 to the output code to let you know that this element is not defined.

val df = session.range(10)
/*EWI: SPRKSCL1142 => org.apache.spark.sql.DataFrame.notDefinedFunction is not defined*/
val result = df.notDefinedFunction()

Recommended fix

To try to identify the issue, you can perform the following checks:

  • Check whether it is a valid Spark element.

  • Check whether the element has the correct syntax and is spelled correctly.

  • Check whether you are using a Spark version that is supported by the SMA.

If this is a valid Spark element, please report that you encountered a conversion error on that particular element using the Report an Issue option of the SMA and include any additional information that you think may be helpful.

Please note that if an element is not defined by the SMA, it does not necessarily mean that it is not supported by Snowpark. You should check the Snowpark Documentation to verify whether an equivalent element exists.

Additional recommendations

SPRKSCL1152

Message: org.apache.spark.sql.functions.variance has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.variance function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.variance function, first used with a column name as an argument and then with a column object.

val df = Seq(10, 20, 30, 40, 50).toDF("value")
val result1 = df.select(variance("value"))
val result2 = df.select(variance(col("value")))

Output

The SMA adds the EWI SPRKSCL1152 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(10, 20, 30, 40, 50).toDF("value")
/*EWI: SPRKSCL1152 => org.apache.spark.sql.functions.variance has a workaround, see documentation for more info*/
val result1 = df.select(variance("value"))
/*EWI: SPRKSCL1152 => org.apache.spark.sql.functions.variance has a workaround, see documentation for more info*/
val result2 = df.select(variance(col("value")))

Recommended fix

Snowpark has an equivalent variance function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(10, 20, 30, 40, 50).toDF("value")
val result1 = df.select(variance(col("value")))
val result2 = df.select(variance(col("value")))

Additional recommendations

SPRKSCL1103

This issue code is now deprecated

Message: SparkBuilder method is not supported *method name*

Category: Conversion error

Description

This issue appears when the SMA detects a method in the SparkBuilder method chaining that is not supported by Snowflake. As a result, it can affect the migration of the reader statement.

The following are the unsupported SparkBuilder methods:

  • master

  • appName

  • enableHiveSupport

  • withExtensions

Scenario

Input

Below is an example of a SparkBuilder method chaining with several methods that are not supported by Snowflake.

val spark = SparkSession.builder()
           .master("local")
           .appName("testApp")
           .config("spark.sql.broadcastTimeout", "3600")
           .enableHiveSupport()
           .getOrCreate()

Output

The SMA adds the EWI SPRKSCL1103 to the output code to let you know that the master, appName and enableHiveSupport methods are not supported by Snowpark. This might affect the migration of the Spark session statement.

val spark = Session.builder.configFile("connection.properties")
/*EWI: SPRKSCL1103 => SparkBuilder method is not supported .master("local")*/
/*EWI: SPRKSCL1103 => SparkBuilder method is not supported .appName("testApp")*/
/*EWI: SPRKSCL1103 => SparkBuilder method is not supported .enableHiveSupport()*/
.create

Recommended fix

To create the session, you need to add the proper Snowflake Snowpark configuration.

In this example a configs variable is used.

    val configs = Map (
      "URL" -> "https://<myAccount>.snowflakecomputing.com:<port>",
      "USER" -> <myUserName>,
      "PASSWORD" -> <myPassword>,
      "ROLE" -> <myRole>,
      "WAREHOUSE" -> <myWarehouse>,
      "DB" -> <myDatabase>,
      "SCHEMA" -> <mySchema>
    )
    val session = Session.builder.configs(configs).create

Alternatively, the use of a configFile (profile.properties) with the connection information is also recommended:

## profile.properties file (a text file)
URL = https://<account_identifier>.snowflakecomputing.com
USER = <username>
PRIVATEKEY = <unencrypted_private_key_from_the_private_key_file>
ROLE = <role_name>
WAREHOUSE = <warehouse_name>
DB = <database_name>
SCHEMA = <schema_name>

With Session.builder.configFile, the session can be created:

val session = Session.builder.configFile("/path/to/properties/file").create

Additional recommendations

SPRKSCL1137

Message: org.apache.spark.sql.functions.sin has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.sin function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.sin function, first used with a column name as an argument and then with a column object.

val df = Seq(Math.PI / 2, Math.PI, Math.PI / 6).toDF("angle")
val result1 = df.withColumn("sin_value", sin("angle"))
val result2 = df.withColumn("sin_value", sin(col("angle")))

Output

The SMA adds the EWI SPRKSCL1137 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(Math.PI / 2, Math.PI, Math.PI / 6).toDF("angle")
/*EWI: SPRKSCL1137 => org.apache.spark.sql.functions.sin has a workaround, see documentation for more info*/
val result1 = df.withColumn("sin_value", sin("angle"))
/*EWI: SPRKSCL1137 => org.apache.spark.sql.functions.sin has a workaround, see documentation for more info*/
val result2 = df.withColumn("sin_value", sin(col("angle")))

Recommended fix

Snowpark has an equivalent sin function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(Math.PI / 2, Math.PI, Math.PI / 6).toDF("angle")
val result1 = df.withColumn("sin_value", sin(col("angle")))
val result2 = df.withColumn("sin_value", sin(col("angle")))

Additional recommendations

SPRKSCL1166

Note

This issue code is now deprecated

Message: org.apache.spark.sql.DataFrameReader.format is not supported.

Category: Warning.

Description

This issue appears when the org.apache.spark.sql.DataFrameReader.format has an argument that is not supported by Snowpark.

Scenarios

There are several scenarios depending on the type of format you are trying to load; it can be a supported or an unsupported format.

Scenario 1

Input

The tool analyzes the type of format you are trying to load. The supported formats are:

  • csv

  • json

  • orc

  • parquet

  • text

The below example shows how the tool transforms the format method when passing a csv value.

spark.read.format("csv").load(path)

Output

The tool transforms the format method into a csv method call when the load function has one parameter.

spark.read.csv(path)

Recommended fix

In this case, the tool does not show the EWI, which means no fix is needed.

Scenario 2

Input

The below example shows how the tool transforms the format method when passing a net.snowflake.spark.snowflake value.

spark.read.format("net.snowflake.spark.snowflake").load(path)

Output

The tool shows the EWI SPRKSCL1166 indicating that the value net.snowflake.spark.snowflake is not supported.

/*EWI: SPRKSCL1166 => The parameter net.snowflake.spark.snowflake is not supported for org.apache.spark.sql.DataFrameReader.format
  EWI: SPRKSCL1112 => org.apache.spark.sql.DataFrameReader.load(scala.String) is not supported*/
spark.read.format("net.snowflake.spark.snowflake").load(path)

Recommended fix

For the unsupported scenarios there is no specific fix, since it depends on the files being read.

Scenario 3

Input

The example below shows how the tool transforms the format method when passing csv through a variable instead of a literal.

val myFormat = "csv"
spark.read.format(myFormat).load(path)

Output

Since the tool cannot determine the value of the variable at runtime, it shows the EWI SPRKSCL1163 indicating that the variable cannot be evaluated.

/*EWI: SPRKSCL1163 => myFormat is not a literal and can't be evaluated
  EWI: SPRKSCL1112 => org.apache.spark.sql.DataFrameReader.load(scala.String) is not supported*/
spark.read.format(myFormat).load(path)

Recommended fix

As a workaround, you can check the value of the variable and add it as a string to the format call.
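
For example, if you know the variable always holds csv (an assumption for illustration), you can pass the literal directly:

// Passing the known literal instead of the variable lets the tool convert it.
spark.read.format("csv").load(path)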

Additional recommendations

SPRKSCL1118

Message: org.apache.spark.sql.functions.trunc has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.trunc function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.trunc function that generates this EWI.

val df = Seq(
  Date.valueOf("2024-10-28"),
  Date.valueOf("2023-05-15"),
  Date.valueOf("2022-11-20"),
).toDF("date")

val result = df.withColumn("truncated", trunc(col("date"), "month"))

Output

The SMA adds the EWI SPRKSCL1118 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  Date.valueOf("2024-10-28"),
  Date.valueOf("2023-05-15"),
  Date.valueOf("2022-11-20"),
).toDF("date")

/*EWI: SPRKSCL1118 => org.apache.spark.sql.functions.trunc has a workaround, see documentation for more info*/
val result = df.withColumn("truncated", trunc(col("date"), "month"))

Recommended fix

As a workaround, you can convert the second argument into a column object using the com.snowflake.snowpark.functions.lit function.

val df = Seq(
  Date.valueOf("2024-10-28"),
  Date.valueOf("2023-05-15"),
  Date.valueOf("2022-11-20"),
).toDF("date")

val result = df.withColumn("truncated", trunc(col("date"), lit("month")))

Additional recommendations

SPRKSCL1149

Message: org.apache.spark.sql.functions.toRadians has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.toRadians function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.toRadians function, first used with a column name as an argument and then with a column object.

val df = Seq(0, 45, 90, 180, 270).toDF("degrees")
val result1 = df.withColumn("radians", toRadians("degrees"))
val result2 = df.withColumn("radians", toRadians(col("degrees")))

Output

The SMA adds the EWI SPRKSCL1149 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(0, 45, 90, 180, 270).toDF("degrees")
/*EWI: SPRKSCL1149 => org.apache.spark.sql.functions.toRadians has a workaround, see documentation for more info*/
val result1 = df.withColumn("radians", toRadians("degrees"))
/*EWI: SPRKSCL1149 => org.apache.spark.sql.functions.toRadians has a workaround, see documentation for more info*/
val result2 = df.withColumn("radians", toRadians(col("degrees")))

Recommended fix

As a workaround, you can use the radians function. For the Spark overload that receives a string argument, you additionally have to convert the string into a column object using the com.snowflake.snowpark.functions.col function.

val df = Seq(0, 45, 90, 180, 270).toDF("degrees")
val result1 = df.withColumn("radians", radians(col("degrees")))
val result2 = df.withColumn("radians", radians(col("degrees")))

Additional recommendations

SPRKSCL1159

Message: org.apache.spark.sql.functions.stddev_samp has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.stddev_samp function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.stddev_samp function that generates this EWI. In this example, the stddev_samp function is used to calculate the sample standard deviation of the selected column.

val df = Seq("1.7", "2.1", "3.0", "4.4", "5.2").toDF("elements")
val result1 = stddev_samp(col("elements"))
val result2 = stddev_samp("elements")

Output

The SMA adds the EWI SPRKSCL1159 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("1.7", "2.1", "3.0", "4.4", "5.2").toDF("elements")
/*EWI: SPRKSCL1159 => org.apache.spark.sql.functions.stddev_samp has a workaround, see documentation for more info*/
val result1 = stddev_samp(col("elements"))
/*EWI: SPRKSCL1159 => org.apache.spark.sql.functions.stddev_samp has a workaround, see documentation for more info*/
val result2 = stddev_samp("elements")

Recommended fix

Snowpark has an equivalent stddev_samp function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq("1.7", "2.1", "3.0", "4.4", "5.2").toDF("elements")
val result1 = stddev_samp(col("elements"))
val result2 = stddev_samp(col("elements"))

Additional recommendations

SPRKSCL1108

Note

This issue code is now deprecated.

Message: org.apache.spark.sql.DataFrameReader.format is not supported.

Category: Warning.

Description

This issue appears when the org.apache.spark.sql.DataFrameReader.format function has an argument that is not supported by Snowpark.

Scenarios

There are several scenarios depending on the format you are trying to load, which can be either a supported or an unsupported format.

Scenario 1

Input

The tool analyzes the format you are trying to load; the supported formats are:

  • csv

  • json

  • orc

  • parquet

  • text

The below example shows how the tool transforms the format method when passing a csv value.

spark.read.format("csv").load(path)

Output

The tool transforms the format method into a csv method call when the load function has one parameter.

spark.read.csv(path)

Recommended fix

In this case, the tool does not display the EWI, meaning that no fix is required.

Scenario 2

Input

The below example shows how the tool transforms the format method when passing a net.snowflake.spark.snowflake value.

spark.read.format("net.snowflake.spark.snowflake").load(path)

Output

The tool shows the EWI SPRKSCL1108 indicating that the value net.snowflake.spark.snowflake is not supported.

/*EWI: SPRKSCL1108 => The parameter net.snowflake.spark.snowflake is not supported for org.apache.spark.sql.DataFrameReader.format
  EWI: SPRKSCL1112 => org.apache.spark.sql.DataFrameReader.load(scala.String) is not supported*/
spark.read.format("net.snowflake.spark.snowflake").load(path)

Recommended fix

For the unsupported scenarios there is no specific fix, since it depends on the files that are being read.

Scenario 3

Input

The below example shows how the tool transforms the format method when passing a csv value through a variable instead.

val myFormat = "csv"
spark.read.format(myFormat).load(path)

Output

Since the tool cannot determine the value of the variable at runtime, it shows the EWI SPRKSCL1108 indicating that the value cannot be evaluated.

/*EWI: SPRKSCL1108 => myFormat is not a literal and can't be evaluated
  EWI: SPRKSCL1112 => org.apache.spark.sql.DataFrameReader.load(scala.String) is not supported*/
spark.read.format(myFormat).load(path)

Recommended fix

As a workaround, you can check the value of the variable and add it as a string to the format call.
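
For example, if you verify that myFormat always holds "csv", you can inline the literal so that the SMA can evaluate the format at analysis time. Below is a sketch of the pre-migration Spark code after that change.

// Inline the verified literal so the SMA can resolve the format.
spark.read.format("csv").load(path)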

Additional recommendations

SPRKSCL1128

Message: org.apache.spark.sql.functions.exp has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.exp function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.exp function, first used with a column name as an argument and then with a column object.

val df = Seq(1.0, 2.0, 3.0).toDF("value")
val result1 = df.withColumn("exp_value", exp("value"))
val result2 = df.withColumn("exp_value", exp(col("value")))

Output

The SMA adds the EWI SPRKSCL1128 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(1.0, 2.0, 3.0).toDF("value")
/*EWI: SPRKSCL1128 => org.apache.spark.sql.functions.exp has a workaround, see documentation for more info*/
val result1 = df.withColumn("exp_value", exp("value"))
/*EWI: SPRKSCL1128 => org.apache.spark.sql.functions.exp has a workaround, see documentation for more info*/
val result2 = df.withColumn("exp_value", exp(col("value")))

Recommended fix

Snowpark has an equivalent exp function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(1.0, 2.0, 3.0).toDF("value")
val result1 = df.withColumn("exp_value", exp(col("value")))
val result2 = df.withColumn("exp_value", exp(col("value")))

Additional recommendations

SPRKSCL1169

Message: *Spark element* is missing on the method chaining.

Category: Warning.

Description

This issue appears when the SMA detects that the call of a Spark element is missing in the method chaining. The SMA needs to know that Spark element in order to analyze the statement.

Scenario

Input

Below is an example where the load function call is missing in the method chaining.

val reader = spark.read.format("json")
val df = reader.load(path)

Output

The SMA adds the EWI SPRKSCL1169 to the output code to let you know that the load function call is missing in the method chaining, so the SMA cannot analyze the statement.

/*EWI: SPRKSCL1169 => Function 'org.apache.spark.sql.DataFrameReader.load' is missing on the method chaining*/
val reader = spark.read.format("json")
val df = reader.load(path)

Recommended fix

Make sure that all the function calls of the method chaining are in the same statement.

val reader = spark.read.format("json").load(path)

Additional recommendations

SPRKSCL1138

Message: org.apache.spark.sql.functions.sinh has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.sinh function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.sinh function, first used with a column name as an argument and then with a column object.

val df = Seq(0.0, 1.0, 2.0, 3.0).toDF("value")
val result1 = df.withColumn("sinh_value", sinh("value"))
val result2 = df.withColumn("sinh_value", sinh(col("value")))

Output

The SMA adds the EWI SPRKSCL1138 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(0.0, 1.0, 2.0, 3.0).toDF("value")
/*EWI: SPRKSCL1138 => org.apache.spark.sql.functions.sinh has a workaround, see documentation for more info*/
val result1 = df.withColumn("sinh_value", sinh("value"))
/*EWI: SPRKSCL1138 => org.apache.spark.sql.functions.sinh has a workaround, see documentation for more info*/
val result2 = df.withColumn("sinh_value", sinh(col("value")))

Recommended fix

Snowpark has an equivalent sinh function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(0.0, 1.0, 2.0, 3.0).toDF("value")
val result1 = df.withColumn("sinh_value", sinh(col("value")))
val result2 = df.withColumn("sinh_value", sinh(col("value")))

Additional recommendations

SPRKSCL1129

Message: org.apache.spark.sql.functions.floor has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.floor function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.floor function, first used with a column name as an argument, then with a column object, and finally with a column object and a scale.

val df = Seq(4.75, 6.22, 9.99).toDF("value")
val result1 = df.withColumn("floor_value", floor("value"))
val result2 = df.withColumn("floor_value", floor(col("value")))
val result3 = df.withColumn("floor_value", floor(col("value"), lit(1)))

Output

The SMA adds the EWI SPRKSCL1129 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(4.75, 6.22, 9.99).toDF("value")
/*EWI: SPRKSCL1129 => org.apache.spark.sql.functions.floor has a workaround, see documentation for more info*/
val result1 = df.withColumn("floor_value", floor("value"))
/*EWI: SPRKSCL1129 => org.apache.spark.sql.functions.floor has a workaround, see documentation for more info*/
val result2 = df.withColumn("floor_value", floor(col("value")))
/*EWI: SPRKSCL1129 => org.apache.spark.sql.functions.floor has a workaround, see documentation for more info*/
val result3 = df.withColumn("floor_value", floor(col("value"), lit(1)))

Recommended fix

Snowpark has an equivalent floor function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

For the overload that receives a column object and a scale, you can use the callBuiltin function to invoke the Snowflake builtin FLOOR function. To use it, you should pass the string "floor" as the first argument, the column as the second argument and the scale as the third argument.

val df = Seq(4.75, 6.22, 9.99).toDF("value")
val result1 = df.withColumn("floor_value", floor(col("value")))
val result2 = df.withColumn("floor_value", floor(col("value")))
val result3 = df.withColumn("floor_value", callBuiltin("floor", col("value"), lit(1)))

Additional recommendations

SPRKSCL1168

Message: *Spark element* with argument(s) value(s) *given arguments* is not supported.

Category: Warning.

Description

This issue appears when the SMA detects that the Spark element with the given parameters is not supported.

Scenario

Input

Below is an example of a Spark element whose parameter is not supported.

spark.read.format("text").load(path)

Output

The SMA adds the EWI SPRKSCL1168 to the output code to let you know that the Spark element with the given parameter is not supported.

/*EWI: SPRKSCL1168 => org.apache.spark.sql.DataFrameReader.format(scala.String) with argument(s) value(s) (spark.format) is not supported*/
spark.read.format("text").load(path)

Recommended fix

There is no specific fix for this scenario.

Additional recommendations

SPRKSCL1139

Message: org.apache.spark.sql.functions.sqrt has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.sqrt function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.sqrt function, first used with a column name as an argument and then with a column object.

val df = Seq(4.0, 16.0, 25.0, 36.0).toDF("value")
val result1 = df.withColumn("sqrt_value", sqrt("value"))
val result2 = df.withColumn("sqrt_value", sqrt(col("value")))

Output

The SMA adds the EWI SPRKSCL1139 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(4.0, 16.0, 25.0, 36.0).toDF("value")
/*EWI: SPRKSCL1139 => org.apache.spark.sql.functions.sqrt has a workaround, see documentation for more info*/
val result1 = df.withColumn("sqrt_value", sqrt("value"))
/*EWI: SPRKSCL1139 => org.apache.spark.sql.functions.sqrt has a workaround, see documentation for more info*/
val result2 = df.withColumn("sqrt_value", sqrt(col("value")))

Recommended fix

Snowpark has an equivalent sqrt function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(4.0, 16.0, 25.0, 36.0).toDF("value")
val result1 = df.withColumn("sqrt_value", sqrt(col("value")))
val result2 = df.withColumn("sqrt_value", sqrt(col("value")))

Additional recommendations

SPRKSCL1119

Message: org.apache.spark.sql.Column.endsWith has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.Column.endsWith function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.Column.endsWith function, first used with a literal string argument and then with a column object argument.

val df1 = Seq(
  ("Alice", "alice@example.com"),
  ("Bob", "bob@example.org"),
  ("David", "david@example.com")
).toDF("name", "email")
val result1 = df1.filter(col("email").endsWith(".com"))

val df2 = Seq(
  ("Alice", "alice@example.com", ".com"),
  ("Bob", "bob@example.org", ".org"),
  ("David", "david@example.org", ".com")
).toDF("name", "email", "suffix")
val result2 = df2.filter(col("email").endsWith(col("suffix")))

Output

The SMA adds the EWI SPRKSCL1119 to the output code to let you know that this function is not directly supported by Snowpark, but it has a workaround.

val df1 = Seq(
  ("Alice", "alice@example.com"),
  ("Bob", "bob@example.org"),
  ("David", "david@example.com")
).toDF("name", "email")
/*EWI: SPRKSCL1119 => org.apache.spark.sql.Column.endsWith has a workaround, see documentation for more info*/
val result1 = df1.filter(col("email").endsWith(".com"))

val df2 = Seq(
  ("Alice", "alice@example.com", ".com"),
  ("Bob", "bob@example.org", ".org"),
  ("David", "david@example.org", ".com")
).toDF("name", "email", "suffix")
/*EWI: SPRKSCL1119 => org.apache.spark.sql.Column.endsWith has a workaround, see documentation for more info*/
val result2 = df2.filter(col("email").endsWith(col("suffix")))

Recommended fix

As a workaround, you can use the com.snowflake.snowpark.functions.endswith function, where the first argument is the column whose values will be checked and the second argument is the suffix to check against the column values. Please note that if the argument of Spark's endsWith function is a literal string, you should convert it into a column object using the com.snowflake.snowpark.functions.lit function.

val df1 = Seq(
  ("Alice", "alice@example.com"),
  ("Bob", "bob@example.org"),
  ("David", "david@example.com")
).toDF("name", "email")
val result1 = df1.filter(endswith(col("email"), lit(".com")))

val df2 = Seq(
  ("Alice", "alice@example.com", ".com"),
  ("Bob", "bob@example.org", ".org"),
  ("David", "david@example.org", ".com")
).toDF("name", "email", "suffix")
val result2 = df2.filter(endswith(col("email"), col("suffix")))

Additional recommendations

SPRKSCL1148

Message: org.apache.spark.sql.functions.toDegrees has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.toDegrees function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.toDegrees function, first used with a column name as an argument and then with a column object.

val df = Seq(Math.PI, Math.PI / 2, Math.PI / 4).toDF("angle_in_radians")
val result1 = df.withColumn("angle_in_degrees", toDegrees("angle_in_radians"))
val result2 = df.withColumn("angle_in_degrees", toDegrees(col("angle_in_radians")))

Output

The SMA adds the EWI SPRKSCL1148 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(Math.PI, Math.PI / 2, Math.PI / 4).toDF("angle_in_radians")
/*EWI: SPRKSCL1148 => org.apache.spark.sql.functions.toDegrees has a workaround, see documentation for more info*/
val result1 = df.withColumn("angle_in_degrees", toDegrees("angle_in_radians"))
/*EWI: SPRKSCL1148 => org.apache.spark.sql.functions.toDegrees has a workaround, see documentation for more info*/
val result2 = df.withColumn("angle_in_degrees", toDegrees(col("angle_in_radians")))

Recommended fix

As a workaround, you can use the degrees function. For the Spark overload that receives a string argument, you additionally have to convert the string into a column object using the com.snowflake.snowpark.functions.col function.

val df = Seq(Math.PI, Math.PI / 2, Math.PI / 4).toDF("angle_in_radians")
val result1 = df.withColumn("angle_in_degrees", degrees(col("angle_in_radians")))
val result2 = df.withColumn("angle_in_degrees", degrees(col("angle_in_radians")))

Additional recommendations

SPRKSCL1158

Message: org.apache.spark.sql.functions.skewness has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.skewness function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.skewness function that generates this EWI. In this example, the skewness function is used to calculate the skewness of the selected column.

val df = Seq("1", "2", "3").toDF("elements")
val result1 = skewness(col("elements"))
val result2 = skewness("elements")

Output

The SMA adds the EWI SPRKSCL1158 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("1", "2", "3").toDF("elements")
/*EWI: SPRKSCL1158 => org.apache.spark.sql.functions.skewness has a workaround, see documentation for more info*/
val result1 = skewness(col("elements"))
/*EWI: SPRKSCL1158 => org.apache.spark.sql.functions.skewness has a workaround, see documentation for more info*/
val result2 = skewness("elements")

Recommended fix

Snowpark has an equivalent skew function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq("1", "2", "3").toDF("elements")
val result1 = skew(col("elements"))
val result2 = skew(col("elements"))

Additional recommendations

SPRKSCL1109

Note

This issue code is now deprecated

Message: The parameter is not defined for org.apache.spark.sql.DataFrameReader.option

Category: Warning

Description

This issue appears when the SMA detects that the given parameter of org.apache.spark.sql.DataFrameReader.option is not defined.

Scenario

Input

Below is an example of an undefined parameter for the org.apache.spark.sql.DataFrameReader.option function.

spark.read.option("header", true).json(path)

Output

The SMA adds the EWI SPRKSCL1109 to the output code to let you know that the given parameter of the org.apache.spark.sql.DataFrameReader.option function is not defined.

/*EWI: SPRKSCL1109 => The parameter header=true is not supported for org.apache.spark.sql.DataFrameReader.option*/
spark.read.option("header", true).json(path)

Recommended fix

Check the Snowpark documentation for the reader format options here, in order to identify the defined options.
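
As a sketch of a defined option (assuming a CSV file on a hypothetical stage @myStage), the Spark header option roughly corresponds to Snowflake's SKIP_HEADER file format option in Snowpark:

// SKIP_HEADER is the Snowflake file format option closest to Spark's header=true.
val df = session.read.option("SKIP_HEADER", 1).csv("@myStage/data.csv")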

Additional recommendations

SPRKSCL1114

Message: org.apache.spark.sql.functions.repeat has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.repeat function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.repeat function that generates this EWI.

val df = Seq("Hello", "World").toDF("word")
val result = df.withColumn("repeated_word", repeat(col("word"), 3))

Output

The SMA adds the EWI SPRKSCL1114 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("Hello", "World").toDF("word")
/*EWI: SPRKSCL1114 => org.apache.spark.sql.functions.repeat has a workaround, see documentation for more info*/
val result = df.withColumn("repeated_word", repeat(col("word"), 3))

Recommended fix

As a workaround, you can convert the second argument into a column object using the com.snowflake.snowpark.functions.lit function.

val df = Seq("Hello", "World").toDF("word")
val result = df.withColumn("repeated_word", repeat(col("word"), lit(3)))

Additional recommendations

SPRKSCL1145

Message: org.apache.spark.sql.functions.sumDistinct has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.sumDistinct function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.sumDistinct function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Alice", 10),
  ("Alice", 20),
  ("Bob", 15)
).toDF("name", "value")

val result1 = df.groupBy("name").agg(sumDistinct("value"))
val result2 = df.groupBy("name").agg(sumDistinct(col("value")))

Output

The SMA adds the EWI SPRKSCL1145 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Alice", 10),
  ("Alice", 20),
  ("Bob", 15)
).toDF("name", "value")

/*EWI: SPRKSCL1145 => org.apache.spark.sql.functions.sumDistinct has a workaround, see documentation for more info*/
val result1 = df.groupBy("name").agg(sumDistinct("value"))
/*EWI: SPRKSCL1145 => org.apache.spark.sql.functions.sumDistinct has a workaround, see documentation for more info*/
val result2 = df.groupBy("name").agg(sumDistinct(col("value")))

Recommended fix

As a workaround, you can use the sum_distinct function. For the Spark overload that receives a string argument, you additionally have to convert the string into a column object using the com.snowflake.snowpark.functions.col function.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Alice", 10),
  ("Alice", 20),
  ("Bob", 15)
).toDF("name", "value")

val result1 = df.groupBy("name").agg(sum_distinct(col("value")))
val result2 = df.groupBy("name").agg(sum_distinct(col("value")))

Additional recommendations

SPRKSCL1171

Message: Snowpark does not support split functions with more than two parameters or containing regex patterns. See documentation for more info.

Category: Warning.

Description

This issue appears when the SMA detects that org.apache.spark.sql.functions.split has more than two parameters or contains a regex pattern.

Scenarios

The split function is used to separate the given column around matches of the given pattern. This Spark function has three overloads.

Scenario 1

Input

Below is an example of the org.apache.spark.sql.functions.split function that generates this EWI. In this example, the split function has two parameters and the second argument is a string, not a regex pattern.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
val result = df.select(split(col("words"), "Snow"))

Output

The SMA adds the EWI SPRKSCL1171 to the output code to let you know that this function is not fully supported by Snowpark.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
/* EWI: SPRKSCL1171 => Snowpark does not support split functions with more than two parameters or containing regex pattern. See documentation for more info. */
val result = df.select(split(col("words"), "Snow"))

Recommended fix

Snowpark has an equivalent split function that receives a column object as its second argument. For that reason, for the Spark overload whose second argument is a string that is not a regex pattern, you can convert the string into a column object using the com.snowflake.snowpark.functions.lit function as a workaround.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
val result = df.select(split(col("words"), lit("Snow")))

Scenario 2

Input

Below is an example of the org.apache.spark.sql.functions.split function that generates this EWI. In this example, the split function has two parameters and the second argument is a regex pattern.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
val result = df.select(split(col("words"), "^([\\d]+-[\\d]+-[\\d])"))

Output

The SMA adds the EWI SPRKSCL1171 to the output code to let you know that this function is not fully supported by Snowpark because regex patterns are not supported by Snowflake.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
/* EWI: SPRKSCL1171 => Snowpark does not support split functions with more than two parameters or containing regex pattern. See documentation for more info. */
val result = df.select(split(col("words"), "^([\\d]+-[\\d]+-[\\d])"))

Recommended fix

Since Snowflake does not support regex patterns, try to replace the pattern with a non-regex pattern string.
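
For example, if the regex only ever matches a fixed delimiter, it can be replaced by that literal string. Below is a sketch with hypothetical data where the regex matched a literal "-" separator.

// A plain string wrapped in lit works as the split pattern in Snowpark.
val df = Seq("2024-01-15", "2023-12-31").toDF("dates")
val result = df.select(split(col("dates"), lit("-")))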

Scenario 3

Input

Below is an example of the org.apache.spark.sql.functions.split function that generates this EWI. In this example, the split function has more than two parameters.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
val result = df.select(split(df("words"), "Snow", 3))

Output

The SMA adds the EWI SPRKSCL1171 to the output code to let you know that this function is not fully supported by Snowpark, because Snowflake does not have a split function with more than two parameters.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
/* EWI: SPRKSCL1171 => Snowpark does not support split functions with more than two parameters or containing regex pattern. See documentation for more info. */
val result = df.select(split(df("words"), "Snow", 3))

Recommended fix

Since Snowflake does not support a split function with more than two parameters, try to use the split function supported by Snowflake.
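
Below is a sketch of that change, assuming the limit argument can simply be dropped; the two-parameter form returns all fragments, so any limit semantics have to be reproduced manually.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
// Two-parameter split supported by Snowpark; the third (limit) argument is dropped.
val result = df.select(split(col("words"), lit("Snow")))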

Additional recommendations

SPRKSCL1120

Message: org.apache.spark.sql.functions.asin has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.asin function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.asin function, first used with a column name as an argument and then with a column object.

val df = Seq(0.5, 0.6, -0.5).toDF("value")
val result1 = df.select(col("value"), asin("value").as("asin_value"))
val result2 = df.select(col("value"), asin(col("value")).as("asin_value"))

Output

The SMA adds the EWI SPRKSCL1120 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(0.5, 0.6, -0.5).toDF("value")
/*EWI: SPRKSCL1120 => org.apache.spark.sql.functions.asin has a workaround, see documentation for more info*/
val result1 = df.select(col("value"), asin("value").as("asin_value"))
/*EWI: SPRKSCL1120 => org.apache.spark.sql.functions.asin has a workaround, see documentation for more info*/
val result2 = df.select(col("value"), asin(col("value")).as("asin_value"))

Recommended fix

Snowpark has an equivalent asin function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(0.5, 0.6, -0.5).toDF("value")
val result1 = df.select(col("value"), asin(col("value")).as("asin_value"))
val result2 = df.select(col("value"), asin(col("value")).as("asin_value"))

Additional recommendations

SPRKSCL1130

Message: org.apache.spark.sql.functions.greatest has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.greatest function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.greatest function, first used with multiple column names as arguments and then with multiple column objects.

val df = Seq(
  ("apple", 10, 20, 15),
  ("banana", 5, 25, 18),
  ("mango", 12, 8, 30)
).toDF("fruit", "value1", "value2", "value3")

val result1 = df.withColumn("greatest", greatest("value1", "value2", "value3"))
val result2 = df.withColumn("greatest", greatest(col("value1"), col("value2"), col("value3")))

Output

The SMA adds the EWI SPRKSCL1130 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("apple", 10, 20, 15),
  ("banana", 5, 25, 18),
  ("mango", 12, 8, 30)
).toDF("fruit", "value1", "value2", "value3")

/*EWI: SPRKSCL1130 => org.apache.spark.sql.functions.greatest has a workaround, see documentation for more info*/
val result1 = df.withColumn("greatest", greatest("value1", "value2", "value3"))
/*EWI: SPRKSCL1130 => org.apache.spark.sql.functions.greatest has a workaround, see documentation for more info*/
val result2 = df.withColumn("greatest", greatest(col("value1"), col("value2"), col("value3")))

Recommended fix

Snowpark has an equivalent greatest function that receives multiple column objects as arguments. For that reason, the Spark overload that receives column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives multiple string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("apple", 10, 20, 15),
  ("banana", 5, 25, 18),
  ("mango", 12, 8, 30)
).toDF("fruit", "value1", "value2", "value3")

val result1 = df.withColumn("greatest", greatest(col("value1"), col("value2"), col("value3")))
val result2 = df.withColumn("greatest", greatest(col("value1"), col("value2"), col("value3")))

Additional recommendations


SPRKSCL1161

Message: Unable to add dependencies.

Category: Conversion error.

Description

This issue appears when the SMA detects a Spark version in the project configuration file that is not supported by the SMA. Because of that, the SMA could not add the Snowpark and Snowpark Extensions dependencies to the corresponding project configuration file. If the Snowpark dependencies are not added, the migrated code will not compile.

Scenarios

There are three possible scenarios: sbt, gradle, and pom.xml. The SMA tries to process the project configuration file by removing the Spark dependencies and adding the Snowpark and Snowpark Extensions dependencies.

Scenario 1

Input

Below is an example of the dependencies section of an sbt project configuration file.

...
libraryDependencies += "org.apache.spark" % "spark-core_2.13" % "3.5.3"
libraryDependencies += "org.apache.spark" % "spark-sql_2.13" % "3.5.3"
...

Output

The SMA adds the EWI SPRKSCL1161 to the issues inventory since the Spark version is not supported and keeps the output the same.

...
libraryDependencies += "org.apache.spark" % "spark-core_2.13" % "3.5.3"
libraryDependencies += "org.apache.spark" % "spark-sql_2.13" % "3.5.3"
...

Recommended fix

Manually remove the Spark dependencies and add the Snowpark and Snowpark Extensions dependencies to the sbt project configuration file.

...
libraryDependencies += "com.snowflake" % "snowpark" % "1.14.0"
libraryDependencies += "net.mobilize.snowpark-extensions" % "snowparkextensions" % "0.0.18"
...

Make sure to use the Snowpark version that best fits your project's requirements.

Scenario 2

Input

Below is an example of the dependencies section of a gradle project configuration file.

dependencies {
    implementation group: 'org.apache.spark', name: 'spark-core_2.13', version: '3.5.3'
    implementation group: 'org.apache.spark', name: 'spark-sql_2.13', version: '3.5.3'
    ...
}

Output

The SMA adds the EWI SPRKSCL1161 to the issues inventory since the Spark version is not supported and keeps the output the same.

dependencies {
    implementation group: 'org.apache.spark', name: 'spark-core_2.13', version: '3.5.3'
    implementation group: 'org.apache.spark', name: 'spark-sql_2.13', version: '3.5.3'
    ...
}

Recommended fix

Manually remove the Spark dependencies and add the Snowpark and Snowpark Extensions dependencies to the gradle project configuration file.

dependencies {
    implementation 'com.snowflake:snowpark:1.14.2'
    implementation 'net.mobilize.snowpark-extensions:snowparkextensions:0.0.18'
    ...
}

Make sure that the dependency versions meet your project's requirements.

Scenario 3

Input

Below is an example of the dependencies section of a pom.xml project configuration file.

<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.13</artifactId>
    <version>3.5.3</version>
  </dependency>

  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.13</artifactId>
    <version>3.5.3</version>
    <scope>compile</scope>
  </dependency>
  ...
</dependencies>

Output

The SMA adds the EWI SPRKSCL1161 to the issues inventory since the Spark version is not supported and keeps the output the same.

<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.13</artifactId>
    <version>3.5.3</version>
  </dependency>

  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.13</artifactId>
    <version>3.5.3</version>
    <scope>compile</scope>
  </dependency>
  ...
</dependencies>

Recommended fix

Manually remove the Spark dependencies and add the Snowpark and Snowpark Extensions dependencies to the pom.xml project configuration file.

<dependencies>
  <dependency>
    <groupId>com.snowflake</groupId>
    <artifactId>snowpark</artifactId>
    <version>1.14.2</version>
  </dependency>

  <dependency>
    <groupId>net.mobilize.snowpark-extensions</groupId>
    <artifactId>snowparkextensions</artifactId>
    <version>0.0.18</version>
  </dependency>
  ...
</dependencies>

Make sure that the dependency versions meet your project's requirements.

Additional recommendations

  • Make sure that the input has a project configuration file:

    • build.sbt

    • build.gradle

    • pom.xml

  • The Spark version supported by the SMA is 2.12:3.1.2 (Scala 2.12, Spark 3.1.2)

  • You can check the latest Snowpark version here.

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.

SPRKSCL1155

Warning

This issue code has been deprecated since Spark Conversion Core Version 4.3.2

Message: org.apache.spark.sql.functions.countDistinct has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.countDistinct function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.countDistinct function, first used with column names as arguments and then with column objects.

val df = Seq(
  ("Alice", 1),
  ("Bob", 2),
  ("Alice", 3),
  ("Bob", 4),
  ("Alice", 1),
  ("Charlie", 5)
).toDF("name", "value")

val result1 = df.select(countDistinct("name", "value"))
val result2 = df.select(countDistinct(col("name"), col("value")))

Output

The SMA adds the EWI SPRKSCL1155 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Alice", 1),
  ("Bob", 2),
  ("Alice", 3),
  ("Bob", 4),
  ("Alice", 1),
  ("Charlie", 5)
).toDF("name", "value")

/*EWI: SPRKSCL1155 => org.apache.spark.sql.functions.countDistinct has a workaround, see documentation for more info*/
val result1 = df.select(countDistinct("name", "value"))
/*EWI: SPRKSCL1155 => org.apache.spark.sql.functions.countDistinct has a workaround, see documentation for more info*/
val result2 = df.select(countDistinct(col("name"), col("value")))

Recommended fix

As a workaround, you can use the count_distinct function. For the Spark overload that receives string arguments, you additionally have to convert the strings into column objects using the com.snowflake.snowpark.functions.col function.

val df = Seq(
  ("Alice", 1),
  ("Bob", 2),
  ("Alice", 3),
  ("Bob", 4),
  ("Alice", 1),
  ("Charlie", 5)
).toDF("name", "value")

val result1 = df.select(count_distinct(col("name"), col("value")))
val result2 = df.select(count_distinct(col("name"), col("value")))

Additional recommendations

SPRKSCL1104

This issue code is now deprecated

Message: The Spark Session builder option is not supported.

Category: Conversion error.

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.SparkSession.Builder.config function, which sets an option of the Spark session that is not supported by Snowpark.

Scenario

Input

Below is an example of the org.apache.spark.sql.SparkSession.Builder.config function used to set an option in the Spark session.

val spark = SparkSession.builder()
           .master("local")
           .appName("testApp")
           .config("spark.sql.broadcastTimeout", "3600")
           .getOrCreate()

Output

The SMA adds the EWI SPRKSCL1104 to the output code to let you know that the config method is not supported by Snowpark. It is therefore not possible to set options in the Spark session via the config function, and this might affect the migration of the Spark session statement.

val spark = Session.builder.configFile("connection.properties")
/*EWI: SPRKSCL1104 => SparkBuilder Option is not supported .config("spark.sql.broadcastTimeout", "3600")*/
.create()

Recommended fix

To create the session, you have to add the proper Snowflake Snowpark configuration.

In this example, a configs variable is used.

    val configs = Map (
      "URL" -> "https://<myAccount>.snowflakecomputing.com:<port>",
      "USER" -> <myUserName>,
      "PASSWORD" -> <myPassword>,
      "ROLE" -> <myRole>,
      "WAREHOUSE" -> <myWarehouse>,
      "DB" -> <myDatabase>,
      "SCHEMA" -> <mySchema>
    )
    val session = Session.builder.configs(configs).create

In addition, the use of a configFile (profile.properties) with the connection information is recommended:

## profile.properties file (a text file)
URL = https://<account_identifier>.snowflakecomputing.com
USER = <username>
PRIVATEKEY = <unencrypted_private_key_from_the_private_key_file>
ROLE = <role_name>
WAREHOUSE = <warehouse_name>
DB = <database_name>
SCHEMA = <schema_name>

With Session.builder.configFile, the session can then be created:

val session = Session.builder.configFile("/path/to/properties/file").create

Additional recommendations

SPRKSCL1124

Message: org.apache.spark.sql.functions.cosh has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.cosh function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.cosh function, first used with a column name as an argument and then with a column object.

val df = Seq(0.0, 1.0, 2.0, -1.0).toDF("value")
val result1 = df.withColumn("cosh_value", cosh("value"))
val result2 = df.withColumn("cosh_value", cosh(col("value")))

Output

The SMA adds the EWI SPRKSCL1124 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(0.0, 1.0, 2.0, -1.0).toDF("value")
/*EWI: SPRKSCL1124 => org.apache.spark.sql.functions.cosh has a workaround, see documentation for more info*/
val result1 = df.withColumn("cosh_value", cosh("value"))
/*EWI: SPRKSCL1124 => org.apache.spark.sql.functions.cosh has a workaround, see documentation for more info*/
val result2 = df.withColumn("cosh_value", cosh(col("value")))

Recommended fix

Snowpark has an equivalent cosh function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(0.0, 1.0, 2.0, -1.0).toDF("value")
val result1 = df.withColumn("cosh_value", cosh(col("value")))
val result2 = df.withColumn("cosh_value", cosh(col("value")))

Additional recommendations

SPRKSCL1175

Message: The two-parameter udf function is not supported in Snowpark. It should be converted into a single-parameter udf function. Please check the documentation to learn how to manually modify the code to make it work in Snowpark.

Category: Conversion error.

Description

This issue appears when the SMA detects a use of the two-parameter org.apache.spark.sql.functions.udf function in the source code. Since Snowpark does not have an equivalent two-parameter udf function, the output code might not compile.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.udf function that generates this EWI. In this example, the udf function has two parameters.

val myFuncUdf = udf(new UDF1[String, Integer] {
  override def call(s: String): Integer = s.length()
}, IntegerType)

Output

The SMA adds the EWI SPRKSCL1175 to the output code to let you know that the udf function is not supported, because it has two parameters.

/*EWI: SPRKSCL1175 => The two-parameter udf function is not supported in Snowpark. It should be converted into a single-parameter udf function. Please check the documentation to learn how to manually modify the code to make it work in Snowpark.*/
val myFuncUdf = udf(new UDF1[String, Integer] {
  override def call(s: String): Integer = s.length()
}, IntegerType)

Recommended fix

Snowpark only supports the single-parameter udf function (without the return type parameter), so you should convert your two-parameter udf function into a single-parameter udf function in order to make it work in Snowpark.

For example, for the code shown above, you would have to manually convert it into this:

val myFuncUdf = udf((s: String) => s.length())

Please note that there are some caveats about creating udfs in Snowpark that might require you to make some additional manual changes to your code. Please check the recommendations here related to creating single-parameter udf functions in Snowpark for more details.

Additional recommendations

  • To learn more about how to create user-defined functions in Snowpark, please refer to the following documentation: Creating User-Defined Functions (UDFs) for DataFrames in Scala

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.

SPRKSCL1001

Message: This code section has parsing errors. The parsing error was found at: line *line number*, column *column number*. When trying to parse *statement*. This file was not converted, so it is expected to still have references to the Spark API.

Category: Parsing error.

Description

This issue appears when the SMA detects a statement in the code of a file that cannot be read or understood correctly; this is known as a parsing error. In addition, this issue appears whenever a file has one or more parsing errors.

Scenario

Input

Below is an example of invalid Scala code.

/#/(%$"$%

Class myClass {

    def function1() = { 1 }

}

Output

The SMA adds the EWI SPRKSCL1001 to the output code to let you know that the code of the file has parsing errors. Therefore, SMA is not able to process a file with this error.

// **********************************************************************************************************************
// EWI: SPRKSCL1001 => This code section has parsing errors
// The parsing error was found at: line 0, column 0. When trying to parse ''.
// This file was not converted, so it is expected to still have references to the Spark API
// **********************************************************************************************************************
/#/(%$"$%

Class myClass {

    def function1() = { 1 }

}

Recommended fix

Since the message pinpoints the error statement, you can try to identify the invalid syntax and remove it, or comment out the statement to avoid the parsing error.

Class myClass {

    def function1() = { 1 }

}
// /#/(%$"$%

Class myClass {

    def function1() = { 1 }

}

Additional recommendations

SPRKSCL1141

Message: org.apache.spark.sql.functions.stddev_pop has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.stddev_pop function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.stddev_pop function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("Alice", 23),
  ("Bob", 30),
  ("Carol", 27),
  ("David", 25),
).toDF("name", "age")

val result1 = df.select(stddev_pop("age"))
val result2 = df.select(stddev_pop(col("age")))

Output

The SMA adds the EWI SPRKSCL1141 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Alice", 23),
  ("Bob", 30),
  ("Carol", 27),
  ("David", 25),
).toDF("name", "age")

/*EWI: SPRKSCL1141 => org.apache.spark.sql.functions.stddev_pop has a workaround, see documentation for more info*/
val result1 = df.select(stddev_pop("age"))
/*EWI: SPRKSCL1141 => org.apache.spark.sql.functions.stddev_pop has a workaround, see documentation for more info*/
val result2 = df.select(stddev_pop(col("age")))

Recommended fix

Snowpark has an equivalent stddev_pop function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("Alice", 23),
  ("Bob", 30),
  ("Carol", 27),
  ("David", 25),
).toDF("name", "age")

val result1 = df.select(stddev_pop(col("age")))
val result2 = df.select(stddev_pop(col("age")))

Additional recommendations

SPRKSCL1110

Note

This issue code is now deprecated

Message: Reader method not supported *method name*.

Category: Warning

Description

This issue appears when the SMA detects a method in the DataFrameReader method chaining that is not supported by Snowflake. This might affect the migration of the reader statement.

Scenario

Input

Below is an example of a DataFrameReader method chaining where the load method is not supported by Snowflake.

spark.read.
    format("net.snowflake.spark.snowflake").
    option("query", s"select * from $tablename").
    load()

Output

The SMA adds the EWI SPRKSCL1110 to the output code to let you know that the load method is not supported by Snowpark, which might affect the migration of the reader statement.

session.sql(s"select * from $tablename")
/*EWI: SPRKSCL1110 => Reader method not supported .load()*/

Recommended fix

Check the Snowpark documentation for the reader here, in order to know which methods are supported by Snowflake.
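
As a sketch of a supported reader chain (assuming a staged CSV file at the hypothetical location @myStage/data.csv), Snowpark readers use a schema plus a format-specific method instead of format and load:

import com.snowflake.snowpark.types._

// Supported Snowpark reader pattern: a schema plus a format-specific method such as csv.
val schema = StructType(Seq(StructField("NAME", StringType), StructField("AGE", IntegerType)))
val df = session.read.schema(schema).csv("@myStage/data.csv")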

Additional recommendations

SPRKSCL1100

This issue code has been deprecated since Spark Conversion Core 2.3.22

Message: Repartition is not supported.

Category: Parsing error.

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.DataFrame.repartition function, which is not supported by Snowpark. Snowflake manages the storage and the workload on the clusters, making the repartition operation inapplicable.

Scenario

Input

Below is an example of the org.apache.spark.sql.DataFrame.repartition function used to return a new DataFrame partitioned by the given partitioning expressions.

    var nameData = Seq("James", "Sarah", "Dylan", "Leila", "Laura", "Peter")
    var jobData = Seq("Police", "Doctor", "Actor", "Teacher", "Dentist", "Fireman")
    var ageData = Seq(40, 38, 34, 27, 29, 55)

    val dfName = nameData.toDF("name")
    val dfJob = jobData.toDF("job")
    val dfAge = ageData.toDF("age")

    val dfRepartitionByExpresion = dfName.repartition($"name")

    val dfRepartitionByNumber = dfJob.repartition(3)

    val dfRepartitionByBoth = dfAge.repartition(3, $"age")

    val joinedDf = dfRepartitionByExpresion.join(dfRepartitionByNumber)

Output

The SMA adds the EWI SPRKSCL1100 to the output code to let you know that this function is not supported by Snowpark.

    var nameData = Seq("James", "Sarah", "Dylan", "Leila", "Laura", "Peter")
    var jobData = Seq("Police", "Doctor", "Actor", "Teacher", "Dentist", "Fireman")
    var ageData = Seq(40, 38, 34, 27, 29, 55)

    val dfName = nameData.toDF("name")
    val dfJob = jobData.toDF("job")
    val dfAge = ageData.toDF("age")

    /*EWI: SPRKSCL1100 => Repartition is not supported*/
    val dfRepartitionByExpresion = dfName.repartition($"name")

    /*EWI: SPRKSCL1100 => Repartition is not supported*/
    val dfRepartitionByNumber = dfJob.repartition(3)

    /*EWI: SPRKSCL1100 => Repartition is not supported*/
    val dfRepartitionByBoth = dfAge.repartition(3, $"age")

    val joinedDf = dfRepartitionByExpresion.join(dfRepartitionByNumber)

Recommended fix

Since Snowflake manages the storage and the workload on the clusters, repartitioning is not necessary. This means that using repartition before the join is not required at all.

    var nameData = Seq("James", "Sarah", "Dylan", "Leila", "Laura", "Peter")
    var jobData = Seq("Police", "Doctor", "Actor", "Teacher", "Dentist", "Fireman")
    var ageData = Seq(40, 38, 34, 27, 29, 55)

    val dfName = nameData.toDF("name")
    val dfJob = jobData.toDF("job")
    val dfAge = ageData.toDF("age")

    val dfRepartitionByExpresion = dfName

    val dfRepartitionByNumber = dfJob

    val dfRepartitionByBoth = dfAge

    val joinedDf = dfRepartitionByExpresion.join(dfRepartitionByNumber)

Additional recommendations

SPRKSCL1151

Message: org.apache.spark.sql.functions.var_samp has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.var_samp function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.var_samp function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("A", 10),
  ("A", 20),
  ("A", 30),
  ("B", 40),
  ("B", 50),
  ("B", 60)
).toDF("category", "value")

val result1 = df.groupBy("category").agg(var_samp("value"))
val result2 = df.groupBy("category").agg(var_samp(col("value")))

Output

The SMA adds the EWI SPRKSCL1151 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("A", 10),
  ("A", 20),
  ("A", 30),
  ("B", 40),
  ("B", 50),
  ("B", 60)
).toDF("category", "value")

/*EWI: SPRKSCL1151 => org.apache.spark.sql.functions.var_samp has a workaround, see documentation for more info*/
val result1 = df.groupBy("category").agg(var_samp("value"))
/*EWI: SPRKSCL1151 => org.apache.spark.sql.functions.var_samp has a workaround, see documentation for more info*/
val result2 = df.groupBy("category").agg(var_samp(col("value")))

Recommended fix

Snowpark has an equivalent var_samp function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("A", 10),
  ("A", 20),
  ("A", 30),
  ("B", 40),
  ("B", 50),
  ("B", 60)
).toDF("category", "value")

val result1 = df.groupBy("category").agg(var_samp(col("value")))
val result2 = df.groupBy("category").agg(var_samp(col("value")))

Additional recommendations


SPRKSCL1165

Message: Reader format on DataFrameReader method chaining can't be defined

Category: Warning

Description

This issue appears when the SMA detects that the format of the reader in the DataFrameReader method chaining is not one of the following formats supported by Snowpark: avro, csv, json, orc, parquet, and xml. Therefore, the SMA cannot determine whether the options being set are defined or not.

Scenario

Input

Below is an example of a DataFrameReader method chaining where the SMA cannot determine the format of the reader.

spark.read.format("net.snowflake.spark.snowflake")
                 .option("query", s"select * from $tableName")
                 .load()

Output

The SMA adds the EWI SPRKSCL1165 to the output code to let you know that the format of the reader cannot be determined in the given DataFrameReader method chaining.

/*EWI: SPRKSCL1165 => Reader format on DataFrameReader method chaining can't be defined*/
spark.read.option("query", s"select * from $tableName")
                 .load()

Recommended fix

Check the Snowpark documentation here to get more information about the format of the reader.
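
As a sketch (with the hypothetical stage path @myStage/data.json), using one of the formats Snowpark recognizes lets the SMA resolve the reader format and validate its options:

// A format-specific reader method (json here) makes the format explicit for the SMA.
val df = session.read.json("@myStage/data.json")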

Additional recommendations

SPRKSCL1134

Message: org.apache.spark.sql.functions.log has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.log function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.log function that generates this EWI.

val df = Seq(10.0, 20.0, 30.0, 40.0).toDF("value")
val result1 = df.withColumn("log_value", log(10, "value"))
val result2 = df.withColumn("log_value", log(10, col("value")))
val result3 = df.withColumn("log_value", log("value"))
val result4 = df.withColumn("log_value", log(col("value")))

Output

The SMA adds the EWI SPRKSCL1134 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(10.0, 20.0, 30.0, 40.0).toDF("value")
/*EWI: SPRKSCL1134 => org.apache.spark.sql.functions.log has a workaround, see documentation for more info*/
val result1 = df.withColumn("log_value", log(10, "value"))
/*EWI: SPRKSCL1134 => org.apache.spark.sql.functions.log has a workaround, see documentation for more info*/
val result2 = df.withColumn("log_value", log(10, col("value")))
/*EWI: SPRKSCL1134 => org.apache.spark.sql.functions.log has a workaround, see documentation for more info*/
val result3 = df.withColumn("log_value", log("value"))
/*EWI: SPRKSCL1134 => org.apache.spark.sql.functions.log has a workaround, see documentation for more info*/
val result4 = df.withColumn("log_value", log(col("value")))

Recommended fix

Below are the different workarounds for all the overloads of the log function.

1. def log(base: Double, columnName: String): Column

You can convert the base into a column object using the com.snowflake.snowpark.functions.lit function and convert the column name into a column object using the com.snowflake.snowpark.functions.col function.

val result1 = df.withColumn("log_value", log(lit(10), col("value")))

2. def log(base: Double, a: Column): Column

You can convert the base into a column object using the com.snowflake.snowpark.functions.lit function.

val result2 = df.withColumn("log_value", log(lit(10), col("value")))

3. def log(columnName: String): Column

You can pass lit(Math.E) as the first argument and convert the column name into a column object using the com.snowflake.snowpark.functions.col function and pass it as the second argument.

val result3 = df.withColumn("log_value", log(lit(Math.E), col("value")))

4. def log(e: Column): Column

You can pass lit(Math.E) as the first argument and the column object as the second argument.

val result4 = df.withColumn("log_value", log(lit(Math.E), col("value")))

Additional recommendations

SPRKSCL1125

Warning

This issue code is deprecated since Spark Conversion Core 2.9.0

Message: org.apache.spark.sql.functions.count has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.count function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.count function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("Alice", "Math"),
  ("Bob", "Science"),
  ("Alice", "Science"),
  ("Bob", null)
).toDF("name", "subject")

val result1 = df.groupBy("name").agg(count("subject").as("subject_count"))
val result2 = df.groupBy("name").agg(count(col("subject")).as("subject_count"))

Output

The SMA adds the EWI SPRKSCL1125 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Alice", "Math"),
  ("Bob", "Science"),
  ("Alice", "Science"),
  ("Bob", null)
).toDF("name", "subject")

/*EWI: SPRKSCL1125 => org.apache.spark.sql.functions.count has a workaround, see documentation for more info*/
val result1 = df.groupBy("name").agg(count("subject").as("subject_count"))
/*EWI: SPRKSCL1125 => org.apache.spark.sql.functions.count has a workaround, see documentation for more info*/
val result2 = df.groupBy("name").agg(count(col("subject")).as("subject_count"))

Recommended fix

Snowpark has an equivalent count function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("Alice", "Math"),
  ("Bob", "Science"),
  ("Alice", "Science"),
  ("Bob", null)
).toDF("name", "subject")

val result1 = df.groupBy("name").agg(count(col("subject")).as("subject_count"))
val result2 = df.groupBy("name").agg(count(col("subject")).as("subject_count"))

Additional recommendations

SPRKSCL1174

Message: The single-parameter udf function is supported in Snowpark but it might require manual intervention. Please check the documentation to learn how to manually modify the code to make it work in Snowpark.

Category: Warning.

Description

This issue appears when the SMA detects a use of the single-parameter org.apache.spark.sql.functions.udf function in the code, which might require manual intervention.

The Snowpark API provides an equivalent com.snowflake.snowpark.functions.udf function that allows you to create a user-defined function from a lambda or function in Scala. However, there are some caveats about creating UDFs in Snowpark that might require you to make some manual changes to your code in order for it to work properly.
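
For reference, a minimal sketch of creating and applying a single-parameter UDF with the Snowpark API (the session, the DataFrame df and the column name are illustrative assumptions):

import com.snowflake.snowpark.functions.{col, udf}

// A lambda-based UDF, equivalent in shape to the single-parameter Spark udf call.
val plusOneUdf = udf((x: Int) => x + 1)
val result = df.select(plusOneUdf(col("value")))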

Scenarios

The Snowpark udf function should work as intended for a wide range of cases without requiring manual intervention. However, there are some scenarios that would require you to manually modify your code in order to get it to work in Snowpark. Some of those scenarios are listed below:

Scenario 1

Input

Below is an example of creating UDFs in an object with the App trait.

Scala's App trait simplifies creating executable programs by providing a main method that automatically runs the code within the object definition. Extending App delays the initialization of the fields until the main method is executed, which can affect the UDF definitions if they rely on initialized fields. This means that if an object extends App and the udf references an object field, the udf definition uploaded to Snowflake will not include the initialized value of the field. This can result in null values being returned by the udf.

For example, in the following code the variable myValue will resolve to null in the udf definition:

object Main extends App {
  ...
  val myValue = 10
  val myUdf = udf((x: Int) => x + myValue) // myValue in the `udf` definition will resolve to null
  ...
}

Output

The SMA adds the EWI SPRKSCL1174 to the output code to let you know that the single-parameter udf function is supported in Snowpark but it requires manual intervention.

object Main extends App {
  ...
  val myValue = 10
  /*EWI: SPRKSCL1174 => The single-parameter udf function is supported in Snowpark but it might require manual intervention. Please check the documentation to learn how to manually modify the code to make it work in Snowpark.*/
  val myUdf = udf((x: Int) => x + myValue) // myValue in the `udf` definition will resolve to null
  ...
}

Recommended fix

To avoid this issue, it is recommended not to extend App and to implement a separate main method for your code. This ensures that object fields are initialized before udf definitions are created and uploaded to Snowflake.

object Main {
  ...
  def main(args: Array[String]): Unit = {
    val myValue = 10
    val myUdf = udf((x: Int) => x + myValue)
  }
  ...
}

For more details about this topic, see Caveat About Creating UDFs in an Object With the App Trait.

Scenario 2

Input

Below is an example of creating UDFs in a Jupyter Notebook.

def myFunc(s: String): String = {
  ...
}

val myFuncUdf = udf((x: String) => myFunc(x))
df1.select(myFuncUdf(col("name"))).show()

Output

The SMA adds the EWI SPRKSCL1174 to the output code to let you know that the single-parameter udf function is supported in Snowpark but it requires manual intervention.

def myFunc(s: String): String = {
  ...
}

/*EWI: SPRKSCL1174 => The single-parameter udf function is supported in Snowpark but it might require manual intervention. Please check the documentation to learn how to manually modify the code to make it work in Snowpark.*/
val myFuncUdf = udf((x: String) => myFunc(x))
df1.select(myFuncUdf(col("name"))).show()

Recommended fix

To create a udf in a Jupyter Notebook, you should define the implementation of your function in a class that extends Serializable. For example, you should manually convert it into this:

object ConvertedUdfFuncs extends Serializable {
  def myFunc(s: String): String = {
    ...
  }

  val myFuncAsLambda = ((x: String) => ConvertedUdfFuncs.myFunc(x))
}

val myFuncUdf = udf(ConvertedUdfFuncs.myFuncAsLambda)
df1.select(myFuncUdf(col("name"))).show()

For more details about how to create UDFs in Jupyter Notebooks, see Creating UDFs in Jupyter Notebooks.

Additional recommendations

SPRKSCL1000

Message: Source project spark-core version is *version number*, the spark-core version supported by snowpark is 2.12:3.1.2 so there may be functional differences between the existing mappings

Category: Warning

Description

This issue appears when the SMA detects a version of spark-core that is not supported by the SMA. Therefore, there may be functional differences between the existing mappings, and the output might have unexpected behaviors.

Additional recommendations

  • The spark-core version supported by the SMA is 2.12:3.1.2. Consider changing the version of your source code, as shown in the sketch after this list.

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.
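
A minimal build.sbt sketch for pinning the source project to the supported versions (the exact Scala patch version is an illustrative assumption):

// Scala 2.12 with Spark 3.1.2, matching the 2.12:3.1.2 version supported by the SMA.
scalaVersion := "2.12.15"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.1.2",
  "org.apache.spark" %% "spark-sql"  % "3.1.2"
)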

SPRKSCL1140

Message: org.apache.spark.sql.functions.stddev has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.stddev function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.stddev function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Charlie", 20),
  ("David", 25),
).toDF("name", "score")

val result1 = df.select(stddev("score"))
val result2 = df.select(stddev(col("score")))

Output

The SMA adds the EWI SPRKSCL1140 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Charlie", 20),
  ("David", 25),
).toDF("name", "score")

/*EWI: SPRKSCL1140 => org.apache.spark.sql.functions.stddev has a workaround, see documentation for more info*/
val result1 = df.select(stddev("score"))
/*EWI: SPRKSCL1140 => org.apache.spark.sql.functions.stddev has a workaround, see documentation for more info*/
val result2 = df.select(stddev(col("score")))

Recommended fix

Snowpark has an equivalent stddev function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Charlie", 20),
  ("David", 25),
).toDF("name", "score")

val result1 = df.select(stddev(col("score")))
val result2 = df.select(stddev(col("score")))

Additional recommendations

SPRKSCL1111

Note

This issue code is now deprecated

Message: CreateDecimalType is not supported.

Category: Conversion error.

Description

This issue appears when the SMA detects a usage of the org.apache.spark.sql.types.DataTypes.createDecimalType function.

Scenario

Input

Below is an example of a use of the org.apache.spark.sql.types.DataTypes.createDecimalType function.

var result = DataTypes.createDecimalType(18, 8)

Output

The SMA adds the EWI SPRKSCL1111 to the output code to let you know that the createDecimalType function is not supported by Snowpark.

/*EWI: SPRKSCL1111 => CreateDecimalType is not supported*/
var result = createDecimalType(18, 8)

Recommended fix

There is no recommended fix yet.
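
That said, if the goal is only to declare a decimal type, a minimal sketch using Snowpark's own type could look like this (assuming com.snowflake.snowpark.types.DecimalType covers your precision and scale needs):

import com.snowflake.snowpark.types.DecimalType

// Assumption: DecimalType(precision, scale) takes the place of DataTypes.createDecimalType(18, 8).
var result = DecimalType(18, 8)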

SPRKSCL1104

Message: The Spark Session builder option is not supported.

Category: Conversion error.

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.SparkSession.Builder.config function, which sets an option of the Spark Session that is not supported by Snowpark.

Scenario

Input

Below is an example of the org.apache.spark.sql.SparkSession.Builder.config function used to set an option in the Spark Session.

val spark = SparkSession.builder()
           .master("local")
           .appName("testApp")
           .config("spark.sql.broadcastTimeout", "3600")
           .getOrCreate()

Output

The SMA adds the EWI SPRKSCL1104 to the output code to let you know that the config method is not supported by Snowpark. Therefore, it is not possible to set options in the Spark Session via the config function, which might affect the migration of the Spark Session statement.

val spark = Session.builder.configFile("connection.properties")
/*EWI: SPRKSCL1104 => SparkBuilder Option is not supported .config("spark.sql.broadcastTimeout", "3600")*/
.create()

Recommended fix

To create the session, you need to add the proper Snowflake Snowpark configuration.

This example uses a configs variable.

    val configs = Map (
      "URL" -> "https://<myAccount>.snowflakecomputing.com:<port>",
      "USER" -> <myUserName>,
      "PASSWORD" -> <myPassword>,
      "ROLE" -> <myRole>,
      "WAREHOUSE" -> <myWarehouse>,
      "DB" -> <myDatabase>,
      "SCHEMA" -> <mySchema>
    )
    val session = Session.builder.configs(configs).create

It is also recommended to use a configFile (profile.properties) with the connection information:

## profile.properties file (a text file)
URL = https://<account_identifier>.snowflakecomputing.com
USER = <username>
PRIVATEKEY = <unencrypted_private_key_from_the_private_key_file>
ROLE = <role_name>
WAREHOUSE = <warehouse_name>
DB = <database_name>
SCHEMA = <schema_name>

Then the session can be created with Session.builder.configFile:

val session = Session.builder.configFile("/path/to/properties/file").create

Additional recommendations

SPRKSCL1101

This issue code has been deprecated since Spark Conversion Core 2.3.22

Message: Broadcast is not supported

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.broadcast function, which is not supported by Snowpark. This function is not supported because Snowflake does not support broadcast variables.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.broadcast function used to create a broadcast object to use on each Spark cluster:

    var studentData = Seq(
      ("James", "Orozco", "Science"),
      ("Andrea", "Larson", "Business"),
    )

    var collegeData = Seq(
      ("Arts", 1),
      ("Business", 2),
      ("Science", 3)
    )

    val dfStudent = studentData.toDF("FirstName", "LastName", "CollegeName")
    val dfCollege = collegeData.toDF("CollegeName", "CollegeCode")

    dfStudent.join(
      broadcast(dfCollege),
      Seq("CollegeName")
    )

Output

The SMA adds the EWI SPRKSCL1101 to the output code to let you know that this function is not supported by Snowpark.

    var studentData = Seq(
      ("James", "Orozco", "Science"),
      ("Andrea", "Larson", "Business"),
    )

    var collegeData = Seq(
      ("Arts", 1),
      ("Business", 2),
      ("Science", 3)
    )

    val dfStudent = studentData.toDF("FirstName", "LastName", "CollegeName")
    val dfCollege = collegeData.toDF("CollegeName", "CollegeCode")

    dfStudent.join(
      /*EWI: SPRKSCL1101 => Broadcast is not supported*/
      broadcast(dfCollege),
      Seq("CollegeName")
    )

Recommended fix

Since Snowflake manages the memory and the workload on the clusters, broadcast objects are not applicable. This means that broadcasting may not be required at all, but each case should be analyzed further.

The recommended approach is to replace a broadcast Spark DataFrame with a regular Snowpark DataFrame, or to use a DataFrame method such as join.

For the proposed input, the fix is to adapt the join to use the dataframe dfCollege directly, without broadcasting it.

    var studentData = Seq(
      ("James", "Orozco", "Science"),
      ("Andrea", "Larson", "Business"),
    )

    var collegeData = Seq(
      ("Arts", 1),
      ("Business", 2),
      ("Science", 3)
    )

    val dfStudent = studentData.toDF("FirstName", "LastName", "CollegeName")
    val dfCollege = collegeData.toDF("CollegeName", "CollegeCode")

    dfStudent.join(
      dfCollege,
      Seq("CollegeName")
    ).show()

Additional recommendations

SPRKSCL1150

Message: org.apache.spark.sql.functions.var_pop has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.var_pop function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.var_pop function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("A", 10.0),
  ("A", 20.0),
  ("A", 30.0),
  ("B", 40.0),
  ("B", 50.0),
  ("B", 60.0)
).toDF("group", "value")

val result1 = df.groupBy("group").agg(var_pop("value"))
val result2 = df.groupBy("group").agg(var_pop(col("value")))

Output

The SMA adds the EWI SPRKSCL1150 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("A", 10.0),
  ("A", 20.0),
  ("A", 30.0),
  ("B", 40.0),
  ("B", 50.0),
  ("B", 60.0)
).toDF("group", "value")

/*EWI: SPRKSCL1150 => org.apache.spark.sql.functions.var_pop has a workaround, see documentation for more info*/
val result1 = df.groupBy("group").agg(var_pop("value"))
/*EWI: SPRKSCL1150 => org.apache.spark.sql.functions.var_pop has a workaround, see documentation for more info*/
val result2 = df.groupBy("group").agg(var_pop(col("value")))

Recommended fix

Snowpark has an equivalent var_pop function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("A", 10.0),
  ("A", 20.0),
  ("A", 30.0),
  ("B", 40.0),
  ("B", 50.0),
  ("B", 60.0)
).toDF("group", "value")

val result1 = df.groupBy("group").agg(var_pop(col("value")))
val result2 = df.groupBy("group").agg(var_pop(col("value")))

Additional recommendations




SPRKSCL1164

Note

This issue code is now deprecated

Message: The parameter is not defined for org.apache.spark.sql.DataFrameReader.option

Category: Warning

Description

This issue appears when the SMA detects that the given parameter of org.apache.spark.sql.DataFrameReader.option is not defined.

Scenario

Input

Below is an example of an undefined parameter for the org.apache.spark.sql.DataFrameReader.option function.

spark.read.option("header", true).json(path)

Output

The SMA adds the EWI SPRKSCL1164 to the output code to let you know that the given parameter of the org.apache.spark.sql.DataFrameReader.option function is not defined.

/*EWI: SPRKSCL1164 => The parameter header=true is not supported for org.apache.spark.sql.DataFrameReader.option*/
spark.read.option("header", true).json(path)

Recommended fix

Check the Snowpark documentation for the reader format options here in order to identify the defined options.
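
For example, a minimal sketch of a reader call that only uses an option defined in Snowpark (the schema, the stage path and the SKIP_HEADER option are illustrative assumptions):

import com.snowflake.snowpark.types.{StructField, StructType, StringType}

// Snowpark CSV reads require an explicit schema; SKIP_HEADER is a Snowflake
// file format option that takes the place of the Spark header option.
val userSchema = StructType(Seq(StructField("name", StringType)))
val df = session.read
  .schema(userSchema)
  .option("SKIP_HEADER", 1)
  .csv("@myStage/data.csv")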

Additional recommendations

SPRKSCL1135

Warning

This issue code is deprecated since Spark Conversion Core 4.3.2

Message: org.apache.spark.sql.functions.mean has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.mean function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.mean function, first used with a column name as an argument and then with a column object.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
val result1 = df.select(mean("value"))
val result2 = df.select(mean(col("value")))

Output

The SMA adds the EWI SPRKSCL1135 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
/*EWI: SPRKSCL1135 => org.apache.spark.sql.functions.mean has a workaround, see documentation for more info*/
val result1 = df.select(mean("value"))
/*EWI: SPRKSCL1135 => org.apache.spark.sql.functions.mean has a workaround, see documentation for more info*/
val result2 = df.select(mean(col("value")))

Recommended fix

Snowpark has an equivalent mean function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
val result1 = df.select(mean(col("value")))
val result2 = df.select(mean(col("value")))

Additional recommendations

SPRKSCL1115

Warning

This issue code has been deprecated since Spark Conversion Core Version 4.6.0

Message: org.apache.spark.sql.functions.round has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.round function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.round function that generates this EWI.

val df = Seq(3.9876, 5.673, 8.1234).toDF("value")
val result1 = df.withColumn("rounded_value", round(col("value")))
val result2 = df.withColumn("rounded_value", round(col("value"), 2))

Output

The SMA adds the EWI SPRKSCL1115 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(3.9876, 5.673, 8.1234).toDF("value")
/*EWI: SPRKSCL1115 => org.apache.spark.sql.functions.round has a workaround, see documentation for more info*/
val result1 = df.withColumn("rounded_value", round(col("value")))
/*EWI: SPRKSCL1115 => org.apache.spark.sql.functions.round has a workaround, see documentation for more info*/
val result2 = df.withColumn("rounded_value", round(col("value"), 2))

Recommended fix

Snowpark has an equivalent round function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a column object and a scale, you can convert the scale into a column object using the com.snowflake.snowpark.functions.lit function as a workaround.

val df = Seq(3.9876, 5.673, 8.1234).toDF("value")
val result1 = df.withColumn("rounded_value", round(col("value")))
val result2 = df.withColumn("rounded_value", round(col("value"), lit(2)))

Additional recommendations

SPRKSCL1144

Message: The symbol table could not be loaded

Category: Parsing error

Description

This issue appears when there is a critical error in the execution of the SMA. Since the symbol table cannot be loaded, the SMA cannot start the assessment or conversion process.

Additional recommendations

  • This is unlikely to be an error in the source code itself, but rather is an error in how the SMA processes the source code. The best resolution would be to post an issue in the SMA.

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.

SPRKSCL1170

Note

This issue code is now deprecated

Message: sparkConfig member key is not supported with platform specific key.

Category: Conversion error

Description

If you are using an older version, please upgrade to the latest version.

Additional recommendations

SPRKSCL1121

Message: org.apache.spark.sql.functions.atan has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.atan function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.atan function, first used with a column name as an argument and then with a column object.

val df = Seq(1.0, 0.5, -1.0).toDF("value")
val result1 = df.withColumn("atan_value", atan("value"))
val result2 = df.withColumn("atan_value", atan(col("value")))

Output

The SMA adds the EWI SPRKSCL1121 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(1.0, 0.5, -1.0).toDF("value")
/*EWI: SPRKSCL1121 => org.apache.spark.sql.functions.atan has a workaround, see documentation for more info*/
val result1 = df.withColumn("atan_value", atan("value"))
/*EWI: SPRKSCL1121 => org.apache.spark.sql.functions.atan has a workaround, see documentation for more info*/
val result2 = df.withColumn("atan_value", atan(col("value")))

Recommended fix

Snowpark has an equivalent atan function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(1.0, 0.5, -1.0).toDF("value")
val result1 = df.withColumn("atan_value", atan(col("value")))
val result2 = df.withColumn("atan_value", atan(col("value")))

Additional recommendations

SPRKSCL1131

Message: org.apache.spark.sql.functions.grouping has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.grouping function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.grouping function, first used with a column name as an argument and then with a column object.

val df = Seq(("Alice", 2), ("Bob", 5)).toDF("name", "age")
val result1 = df.cube("name").agg(grouping("name"), sum("age"))
val result2 = df.cube("name").agg(grouping(col("name")), sum("age"))

Output

The SMA adds the EWI SPRKSCL1131 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(("Alice", 2), ("Bob", 5)).toDF("name", "age")
/*EWI: SPRKSCL1131 => org.apache.spark.sql.functions.grouping has a workaround, see documentation for more info*/
val result1 = df.cube("name").agg(grouping("name"), sum("age"))
/*EWI: SPRKSCL1131 => org.apache.spark.sql.functions.grouping has a workaround, see documentation for more info*/
val result2 = df.cube("name").agg(grouping(col("name")), sum("age"))

Recommended fix

Snowpark has an equivalent grouping function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(("Alice", 2), ("Bob", 5)).toDF("name", "age")
val result1 = df.cube("name").agg(grouping(col("name")), sum("age"))
val result2 = df.cube("name").agg(grouping(col("name")), sum("age"))

Additional recommendations

SPRKSCL1160

Note

This issue code has been deprecated since Spark Conversion Core 4.1.0

Message: org.apache.spark.sql.functions.sum has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.sum function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.sum function that generates this EWI. In this example, the sum function is used to calculate the sum of a selected column.

val df = Seq("1", "2", "3", "4", "5").toDF("elements")
val result1 = sum(col("elements"))
val result2 = sum("elements")

Output

The SMA adds the EWI SPRKSCL1160 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("1", "2", "3", "4", "5").toDF("elements")
/*EWI: SPRKSCL1160 => org.apache.spark.sql.functions.sum has a workaround, see documentation for more info*/
val result1 = sum(col("elements"))
/*EWI: SPRKSCL1160 => org.apache.spark.sql.functions.sum has a workaround, see documentation for more info*/
val result2 = sum("elements")

Recommended fix

Snowpark has an equivalent sum function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq("1", "2", "3", "4", "5").toDF("elements")
val result1 = sum(col("elements"))
val result2 = sum(col("elements"))

Additional recommendations

SPRKSCL1154

Message: org.apache.spark.sql.functions.ceil has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.ceil function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.ceil function, first used with a column name as an argument, then with a column object and finally with a column object and a scale.

val df = Seq(2.33, 3.88, 4.11, 5.99).toDF("value")
val result1 = df.withColumn("ceil", ceil("value"))
val result2 = df.withColumn("ceil", ceil(col("value")))
val result3 = df.withColumn("ceil", ceil(col("value"), lit(1)))

Output

The SMA adds the EWI SPRKSCL1154 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(2.33, 3.88, 4.11, 5.99).toDF("value")
/*EWI: SPRKSCL1154 => org.apache.spark.sql.functions.ceil has a workaround, see documentation for more info*/
val result1 = df.withColumn("ceil", ceil("value"))
/*EWI: SPRKSCL1154 => org.apache.spark.sql.functions.ceil has a workaround, see documentation for more info*/
val result2 = df.withColumn("ceil", ceil(col("value")))
/*EWI: SPRKSCL1154 => org.apache.spark.sql.functions.ceil has a workaround, see documentation for more info*/
val result3 = df.withColumn("ceil", ceil(col("value"), lit(1)))

Recommended fix

Snowpark has an equivalent ceil function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

For the overload that receives a column object and a scale, you can use the callBuiltin function to invoke the Snowflake built-in CEIL function. To use it, you should pass the string "ceil" as the first argument, the column as the second argument and the scale as the third argument.

val df = Seq(2.33, 3.88, 4.11, 5.99).toDF("value")
val result1 = df.withColumn("ceil", ceil(col("value")))
val result2 = df.withColumn("ceil", ceil(col("value")))
val result3 = df.withColumn("ceil", callBuiltin("ceil", col("value"), lit(1)))

Additional recommendations

SPRKSCL1105

This issue code is now deprecated

Message: The writer format value is not supported.

Category: Conversion error

Description

This issue appears when the org.apache.spark.sql.DataFrameWriter.format function has an argument that is not supported by Snowpark.

Scenarios

There are several scenarios depending on the format you are trying to save. It can be a supported or an unsupported format.

Scenario 1

Input

The tool analyzes the format you are trying to save. The supported formats are:

  • csv

  • json

  • orc

  • parquet

  • text

    dfWrite.write.format("csv").save(path)

Output

The tool transforms the format method into a csv method call when the save function has one parameter.

    dfWrite.write.csv(path)

Recommended fix

In this case, the tool does not show the EWI, meaning no fix is required.

Scenario 2

Input

The example below shows how the tool transforms the format method when passing a net.snowflake.spark.snowflake value.

dfWrite.write.format("net.snowflake.spark.snowflake").save(path)

Output

The tool shows the EWI SPRKSCL1105 indicating that the value net.snowflake.spark.snowflake is not supported.

/*EWI: SPRKSCL1105 => Writer format value is not supported .format("net.snowflake.spark.snowflake")*/
dfWrite.write.format("net.snowflake.spark.snowflake").save(path)

Recommended fix

For the unsupported scenarios there is no specific fix, since it depends on the files that are being saved.

Scenario 3

Input

The example below shows how the tool transforms the format method when passing csv through a variable instead of a literal.

val myFormat = "csv"
dfWrite.write.format(myFormat).save(path)

Output

Since the tool cannot determine the value of the variable at run time, it shows the EWI SPRKSCL1163 indicating that the value is not supported.

val myFormat = "csv"
/*EWI: SPRKSCL1163 => format_type is not a literal and can't be evaluated*/
dfWrite.write.format(myFormat).save(path)

Recommended fix

As a workaround, you can check the value of the variable and add it as a string literal to the format call, as shown in the sketch below.
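
A minimal sketch of that workaround, assuming the variable always resolves to the supported csv format:

// With the literal in place, the SMA can resolve the format and map the call
// to the Snowpark csv writer method, as in Scenario 1.
dfWrite.write.csv(path)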

Additional recommendations