Snowpark Migration Accelerator: Issue Codes for Spark - Scala

SPRKSCL1126

Message: org.apache.spark.sql.functions.covar_pop has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.covar_pop function, which has a workaround.

Input

Below is an example of the org.apache.spark.sql.functions.covar_pop function, first used with column names as the arguments and then with column objects.

val df = Seq(
  (10.0, 100.0),
  (20.0, 150.0),
  (30.0, 200.0),
  (40.0, 250.0),
  (50.0, 300.0)
).toDF("column1", "column2")

val result1 = df.select(covar_pop("column1", "column2").as("covariance_pop"))
val result2 = df.select(covar_pop(col("column1"), col("column2")).as("covariance_pop"))

Output

The SMA adds the EWI SPRKSCL1126 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  (10.0, 100.0),
  (20.0, 150.0),
  (30.0, 200.0),
  (40.0, 250.0),
  (50.0, 300.0)
).toDF("column1", "column2")

/*EWI: SPRKSCL1126 => org.apache.spark.sql.functions.covar_pop has a workaround, see documentation for more info*/
val result1 = df.select(covar_pop("column1", "column2").as("covariance_pop"))
/*EWI: SPRKSCL1126 => org.apache.spark.sql.functions.covar_pop has a workaround, see documentation for more info*/
val result2 = df.select(covar_pop(col("column1"), col("column2")).as("covariance_pop"))

Recommended fix

Snowpark has an equivalent covar_pop function that receives two column objects as arguments. For that reason, the Spark overload that receives two column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives two string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  (10.0, 100.0),
  (20.0, 150.0),
  (30.0, 200.0),
  (40.0, 250.0),
  (50.0, 300.0)
).toDF("column1", "column2")

val result1 = df.select(covar_pop(col("column1"), col("column2")).as("covariance_pop"))
val result2 = df.select(covar_pop(col("column1"), col("column2")).as("covariance_pop"))
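To sanity-check the migrated query, the expected population covariance for the sample data can be computed in plain Scala, with no Spark or Snowpark dependency. This is a minimal sketch: the object name CovarianceCheck and the helper populationCovariance are hypothetical and are not part of either API.

```scala
object CovarianceCheck {
  // Sample data copied from the example DataFrame.
  val column1: Seq[Double] = Seq(10.0, 20.0, 30.0, 40.0, 50.0)
  val column2: Seq[Double] = Seq(100.0, 150.0, 200.0, 250.0, 300.0)

  // Population covariance: the mean of the products of the deviations
  // of each value from the mean of its own column.
  def populationCovariance(xs: Seq[Double], ys: Seq[Double]): Double = {
    require(xs.nonEmpty && xs.length == ys.length, "inputs must be non-empty and of equal length")
    val meanX = xs.sum / xs.length
    val meanY = ys.sum / ys.length
    xs.zip(ys).map { case (x, y) => (x - meanX) * (y - meanY) }.sum / xs.length
  }

  def main(args: Array[String]): Unit =
    println(populationCovariance(column1, column2)) // prints 1000.0
}
```

Both result1 and result2 should report the same value in the covariance_pop column once the string overload has been rewritten with col.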

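Additional recommendations

SPRKSCL1112

Message: *spark element* is not supported

Category: Conversion error

Description

This issue appears when the SMA detects a use of a Spark element that is not supported by Snowpark and that does not have its own error code associated with it. This is a generic error code used by the SMA for any unsupported Spark element.

Scenario

Input

Below is an example of a Spark element that is not supported by Snowpark, and therefore it generates an EWI.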

val df = session.range(10)
val result = df.isLocal

Output

The SMA adds the EWI SPRKSCL1112 to the output code to let you know that this element is not supported by Snowpark.

val df = session.range(10)
/*EWI: SPRKSCL1112 => org.apache.spark.sql.Dataset.isLocal is not supported*/
val result = df.isLocal

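Recommended fix

Since this is a generic error code that applies to a range of unsupported functions, there is no single specific fix. The appropriate action depends on the particular element in use.

Note that an element being unsupported does not necessarily mean that no solution or workaround exists. It only means that the SMA itself cannot find one.

Additional recommendations

SPRKSCL1143

Message: An error occurred when loading the symbol table

Category: Conversion error

Description

This issue appears when there is an error loading the symbols of the SMA symbol table. The symbol table is part of the underlying architecture of the SMA, and it enables more complex conversions.

Additional recommendations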

  • This is unlikely to be an error in the source code itself, but rather is an error in how the SMA processes the source code. The best resolution would be to post an issue in the SMA.

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.

SPRKSCL1153

Warning

This issue code has been deprecated since Spark Conversion Core Version 4.3.2

Message: org.apache.spark.sql.functions.max has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.max function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.max function, first used with a column name as an argument and then with a column object.

val df = Seq(10, 12, 20, 15, 18).toDF("value")
val result1 = df.select(max("value"))
val result2 = df.select(max(col("value")))

Output

The SMA adds the EWI SPRKSCL1153 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(10, 12, 20, 15, 18).toDF("value")
/*EWI: SPRKSCL1153 => org.apache.spark.sql.functions.max has a workaround, see documentation for more info*/
val result1 = df.select(max("value"))
/*EWI: SPRKSCL1153 => org.apache.spark.sql.functions.max has a workaround, see documentation for more info*/
val result2 = df.select(max(col("value")))

Recommended fix

Snowpark has an equivalent max function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(10, 12, 20, 15, 18).toDF("value")
val result1 = df.select(max(col("value")))
val result2 = df.select(max(col("value")))

Additional recommendations

SPRKSCL1102

This issue code has been deprecated since Spark Conversion Core 2.3.22

Message: Explode is not supported

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.explode function, which is not supported by Snowpark.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.explode function used to get the consolidated information of the array fields of the dataset.

    val explodeData = Seq(
      Row("Cat", Array("Gato","Chat")),
      Row("Dog", Array("Perro","Chien")),
      Row("Bird", Array("Ave","Oiseau"))
    )

    val explodeSchema = StructType(
      List(
        StructField("Animal", StringType),
        StructField("Translation", ArrayType(StringType))
      )
    )

    val rddExplode = session.sparkContext.parallelize(explodeData)

    val dfExplode = session.createDataFrame(rddExplode, explodeSchema)

    dfExplode.select(explode(dfExplode("Translation").alias("exploded")))

Output

The SMA adds the EWI SPRKSCL1102 to the output code to let you know that this function is not supported by Snowpark.

    val explodeData = Seq(
      Row("Cat", Array("Gato","Chat")),
      Row("Dog", Array("Perro","Chien")),
      Row("Bird", Array("Ave","Oiseau"))
    )

    val explodeSchema = StructType(
      List(
        StructField("Animal", StringType),
        StructField("Translation", ArrayType(StringType))
      )
    )

    val rddExplode = session.sparkContext.parallelize(explodeData)

    val dfExplode = session.createDataFrame(rddExplode, explodeSchema)

    /*EWI: SPRKSCL1102 => Explode is not supported */
    dfExplode.select(explode(dfExplode("Translation").alias("exploded")))

Recommended fix

Since explode is not supported by Snowpark, the function flatten could be used as a substitute.

The following fix flattens the dfExplode DataFrame and then selects the flattened values to replicate the Spark result.

    val explodeData = Seq(
      Row("Cat", Array("Gato","Chat")),
      Row("Dog", Array("Perro","Chien")),
      Row("Bird", Array("Ave","Oiseau"))
    )

    val explodeSchema = StructType(
      List(
        StructField("Animal", StringType),
        StructField("Translation", ArrayType(StringType))
      )
    )

    val rddExplode = session.sparkContext.parallelize(explodeData)

    val dfExplode = session.createDataFrame(rddExplode, explodeSchema)

    val dfFlatten = dfExplode.flatten(col("Translation")).alias("exploded")
                             .select(col("exploded.value").alias("Translation"))

Additional recommendations

SPRKSCL1136

Warning

This issue code has been deprecated since Spark Conversion Core 4.3.2

Message: org.apache.spark.sql.functions.min has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.min function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.min function, first used with a column name as an argument and then with a column object.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
val result1 = df.select(min("value"))
val result2 = df.select(min(col("value")))

Output

The SMA adds the EWI SPRKSCL1136 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
/*EWI: SPRKSCL1136 => org.apache.spark.sql.functions.min has a workaround, see documentation for more info*/
val result1 = df.select(min("value"))
/*EWI: SPRKSCL1136 => org.apache.spark.sql.functions.min has a workaround, see documentation for more info*/
val result2 = df.select(min(col("value")))

Recommended fix

Snowpark has an equivalent min function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that takes a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
val result1 = df.select(min(col("value")))
val result2 = df.select(min(col("value")))

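Additional recommendations

SPRKSCL1167

Message: Project file not found on input folder

Category: Warning

Description

This issue appears when the SMA detects that the input folder does not contain any project configuration file. The project configuration files supported by the SMA are: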

  • build.sbt

  • build.gradle

  • pom.xml
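For illustration, a minimal build.sbt placed at the root of the input folder might look like the sketch below. The project name and every version number are placeholder assumptions, not values required by the SMA.

```scala
// build.sbt (hypothetical minimal example; name and versions are placeholders)
name := "my-spark-project"
version := "0.1.0"
scalaVersion := "2.12.18"

// Spark dependency of the source project being migrated; the version is an assumption.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.5.0" % "provided"
```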

Additional recommendations

SPRKSCL1147

Message: org.apache.spark.sql.functions.tanh has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.tanh function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.tanh function, first used with a column name as an argument and then with a column object.

val df = Seq(-1.0, 0.5, 1.0, 2.0).toDF("value")
val result1 = df.withColumn("tanh_value", tanh("value"))
val result2 = df.withColumn("tanh_value", tanh(col("value")))

Output

The SMA adds the EWI SPRKSCL1147 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(-1.0, 0.5, 1.0, 2.0).toDF("value")
/*EWI: SPRKSCL1147 => org.apache.spark.sql.functions.tanh has a workaround, see documentation for more info*/
val result1 = df.withColumn("tanh_value", tanh("value"))
/*EWI: SPRKSCL1147 => org.apache.spark.sql.functions.tanh has a workaround, see documentation for more info*/
val result2 = df.withColumn("tanh_value", tanh(col("value")))

Recommended fix

Snowpark has an equivalent tanh function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(-1.0, 0.5, 1.0, 2.0).toDF("value")
val result1 = df.withColumn("tanh_value", tanh(col("value")))
val result2 = df.withColumn("tanh_value", tanh(col("value")))

Additional recommendations

SPRKSCL1116

Warning

This issue code has been deprecated since Spark Conversion Core Version 2.40.1

Message: org.apache.spark.sql.functions.split has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.split function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.split function that generates this EWI.

val df = Seq("apple,banana,orange", "grape,lemon,lime", "cherry,blueberry,strawberry").toDF("values")
val result1 = df.withColumn("split_values", split(col("values"), ","))
val result2 = df.withColumn("split_values", split(col("values"), ",", 0))

Output

The SMA adds the EWI SPRKSCL1116 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("apple,banana,orange", "grape,lemon,lime", "cherry,blueberry,strawberry").toDF("values")
/*EWI: SPRKSCL1116 => org.apache.spark.sql.functions.split has a workaround, see documentation for more info*/
val result1 = df.withColumn("split_values", split(col("values"), ","))
/*EWI: SPRKSCL1116 => org.apache.spark.sql.functions.split has a workaround, see documentation for more info*/
val result2 = df.withColumn("split_values", split(col("values"), ",", 0))

Recommended fix

For the Spark overload that receives two arguments, you can convert the second argument into a column object using the com.snowflake.snowpark.functions.lit function as a workaround.

The overload that receives three arguments is not yet supported by Snowpark and does not have a workaround.

val df = Seq("apple,banana,orange", "grape,lemon,lime", "cherry,blueberry,strawberry").toDF("values")
val result1 = df.withColumn("split_values", split(col("values"), lit(",")))
val result2 = df.withColumn("split_values", split(col("values"), ",", 0)) // This overload is not supported yet

Additional recommendations

SPRKSCL1122

Message: org.apache.spark.sql.functions.corr has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.corr function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.corr function, first used with column names as the arguments and then with column objects.

val df = Seq(
  (10.0, 20.0),
  (20.0, 40.0),
  (30.0, 60.0)
).toDF("col1", "col2")

val result1 = df.select(corr("col1", "col2"))
val result2 = df.select(corr(col("col1"), col("col2")))

Output

The SMA adds the EWI SPRKSCL1122 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  (10.0, 20.0),
  (20.0, 40.0),
  (30.0, 60.0)
).toDF("col1", "col2")

/*EWI: SPRKSCL1122 => org.apache.spark.sql.functions.corr has a workaround, see documentation for more info*/
val result1 = df.select(corr("col1", "col2"))
/*EWI: SPRKSCL1122 => org.apache.spark.sql.functions.corr has a workaround, see documentation for more info*/
val result2 = df.select(corr(col("col1"), col("col2")))

Recommended fix

Snowpark has an equivalent corr function that receives two column objects as arguments. For that reason, the Spark overload that receives column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives two string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  (10.0, 20.0),
  (20.0, 40.0),
  (30.0, 60.0)
).toDF("col1", "col2")

val result1 = df.select(corr(col("col1"), col("col2")))
val result2 = df.select(corr(col("col1"), col("col2")))

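Additional recommendations

SPRKSCL1173

Message: SQL embedded code cannot be processed.

Category: Warning

Description

This issue appears when the SMA detects SQL-embedded code that it cannot process. In that case, the SQL-embedded code cannot be converted to Snowflake.

Scenario

Input

Below is an example of SQL-embedded code that cannot be processed.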

spark.sql("CREATE VIEW IF EXISTS My View" + "AS Select * From my Table WHERE date < current_date()")

Output

The SMA adds the EWI SPRKSCL1173 to the output code to let you know that the SQL-embedded code cannot be processed.

/*EWI: SPRKSCL1173 => SQL embedded code cannot be processed.*/
spark.sql("CREATE VIEW IF EXISTS My View" + "AS Select * From my Table WHERE date < current_date()")

Recommended fix

Make sure that the SQL-embedded code is a single string without interpolation, variables, or string concatenation.
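As a rough illustration, the statement can be written as one plain string literal before it is passed to spark.sql. The sketch below is plain Scala with no Spark dependency; the object name and the view and table names (my_view, my_table) are hypothetical. It also shows one way concatenation can silently alter the SQL text.

```scala
object EmbeddedSqlLiteral {
  // Concatenating fragments is easy to get wrong: no space is inserted between
  // the two pieces, so the statement text contains "My ViewAS Select".
  val brokenConcatenation: String =
    "CREATE VIEW IF EXISTS My View" + "AS Select * From my Table WHERE date < current_date()"

  // One plain literal: no s"..." interpolation, no variables, no "+" concatenation.
  // The view and table names are hypothetical placeholders.
  val createViewSql: String =
    "CREATE VIEW IF NOT EXISTS my_view AS SELECT * FROM my_table WHERE date < current_date()"

  def main(args: Array[String]): Unit = {
    println(brokenConcatenation)
    println(createViewSql)
  }
}
```

In the source project, the literal itself would be written directly as the argument of spark.sql rather than being stored in a variable first.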

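Additional recommendations

SPRKSCL1163

Message: The element is not a literal and can't be evaluated.

Category: Conversion error

Description

This issue appears when the element being processed is not a literal, so the SMA cannot evaluate it.

Scenario

Input

Below is an example where the element to process is not a literal and cannot be evaluated by the SMA.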

val format_type = "csv"
spark.read.format(format_type).load(path)

Output

The SMA adds the EWI SPRKSCL1163 to the output code to let you know that the format_type parameter is not a literal and cannot be evaluated by the SMA.

/*EWI: SPRKSCL1163 => format_type is not a literal and can't be evaluated*/
val format_type = "csv"
spark.read.format(format_type).load(path)

Recommended fix

  • To avoid unexpected behavior, make sure that the value of the variable is a valid one.

Additional recommendations

SPRKSCL1132

Message: org.apache.spark.sql.functions.grouping_id has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.grouping_id function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.grouping_id function, first used with multiple column names as arguments and then with column objects.

val df = Seq(
  ("Store1", "Product1", 100),
  ("Store1", "Product2", 150),
  ("Store2", "Product1", 200),
  ("Store2", "Product2", 250)
).toDF("store", "product", "amount")

val result1 = df.cube("store", "product").agg(sum("amount"), grouping_id("store", "product"))
val result2 = df.cube("store", "product").agg(sum("amount"), grouping_id(col("store"), col("product")))

Output

The SMA adds the EWI SPRKSCL1132 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Store1", "Product1", 100),
  ("Store1", "Product2", 150),
  ("Store2", "Product1", 200),
  ("Store2", "Product2", 250)
).toDF("store", "product", "amount")

/*EWI: SPRKSCL1132 => org.apache.spark.sql.functions.grouping_id has a workaround, see documentation for more info*/
val result1 = df.cube("store", "product").agg(sum("amount"), grouping_id("store", "product"))
/*EWI: SPRKSCL1132 => org.apache.spark.sql.functions.grouping_id has a workaround, see documentation for more info*/
val result2 = df.cube("store", "product").agg(sum("amount"), grouping_id(col("store"), col("product")))

Recommended fix

Snowpark has an equivalent grouping_id function that receives multiple column objects as arguments. For that reason, the Spark overload that receives multiple column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives multiple string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("Store1", "Product1", 100),
  ("Store1", "Product2", 150),
  ("Store2", "Product1", 200),
  ("Store2", "Product2", 250)
).toDF("store", "product", "amount")

val result1 = df.cube("store", "product").agg(sum("amount"), grouping_id(col("store"), col("product")))
val result2 = df.cube("store", "product").agg(sum("amount"), grouping_id(col("store"), col("product")))

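Additional recommendations

SPRKSCL1106

Warning

This issue code has been deprecated

Message: Writer option is not supported.

Category: Conversion error

Description

This issue appears when the tool detects, in a writer statement, the use of an option that is not supported by Snowpark.

Scenario

Input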

Below is an example of the org.apache.spark.sql.DataFrameWriter.option used to add options to a writer statement.

df.write.format("net.snowflake.spark.snowflake").option("dbtable", tablename)

Output

The SMA adds the EWI SPRKSCL1106 to the output code to let you know that the option method is not supported by Snowpark.

df.write.saveAsTable(tablename)
/*EWI: SPRKSCL1106 => Writer option is not supported .option("dbtable", tablename)*/

Recommended fix

There is no recommended fix for this scenario

Additional recommendations

SPRKSCL1157

Message: org.apache.spark.sql.functions.kurtosis has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.kurtosis function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.kurtosis function that generates this EWI. In this example, the kurtosis function is used to calculate the kurtosis of the selected column.

val df = Seq("1", "2", "3").toDF("elements")
val result1 = kurtosis(col("elements"))
val result2 = kurtosis("elements")

Output

The SMA adds the EWI SPRKSCL1157 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("1", "2", "3").toDF("elements")
/*EWI: SPRKSCL1157 => org.apache.spark.sql.functions.kurtosis has a workaround, see documentation for more info*/
val result1 = kurtosis(col("elements"))
/*EWI: SPRKSCL1157 => org.apache.spark.sql.functions.kurtosis has a workaround, see documentation for more info*/
val result2 = kurtosis("elements")

Recommended fix

Snowpark has an equivalent kurtosis function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq("1", "2", "3").toDF("elements")
val result1 = kurtosis(col("elements"))
val result2 = kurtosis(col("elements"))

Additional recommendations

SPRKSCL1146

Message: org.apache.spark.sql.functions.tan has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.tan function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.tan function, first used with a column name as an argument and then with a column object.

val df = Seq(math.Pi / 4, math.Pi / 3, math.Pi / 6).toDF("angle")
val result1 = df.withColumn("tan_value", tan("angle"))
val result2 = df.withColumn("tan_value", tan(col("angle")))

Output

The SMA adds the EWI SPRKSCL1146 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(math.Pi / 4, math.Pi / 3, math.Pi / 6).toDF("angle")
/*EWI: SPRKSCL1146 => org.apache.spark.sql.functions.tan has a workaround, see documentation for more info*/
val result1 = df.withColumn("tan_value", tan("angle"))
/*EWI: SPRKSCL1146 => org.apache.spark.sql.functions.tan has a workaround, see documentation for more info*/
val result2 = df.withColumn("tan_value", tan(col("angle")))

Recommended fix

Snowpark has an equivalent tan function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(math.Pi / 4, math.Pi / 3, math.Pi / 6).toDF("angle")
val result1 = df.withColumn("tan_value", tan(col("angle")))
val result2 = df.withColumn("tan_value", tan(col("angle")))

Additional recommendations

SPRKSCL1117

Warning

This issue code has been deprecated since Spark Conversion Core 2.40.1

Message: org.apache.spark.sql.functions.translate has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.translate function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.translate function that generates this EWI. In this example, the translate function is used to replace the characters ‘a’, ‘e’ and ‘o’ in each word with ‘1’, ‘2’ and ‘3’, respectively.

val df = Seq("hello", "world", "scala").toDF("word")
val result = df.withColumn("translated_word", translate(col("word"), "aeo", "123"))

Output

The SMA adds the EWI SPRKSCL1117 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("hello", "world", "scala").toDF("word")
/*EWI: SPRKSCL1117 => org.apache.spark.sql.functions.translate has a workaround, see documentation for more info*/
val result = df.withColumn("translated_word", translate(col("word"), "aeo", "123"))

Recommended fix

As a workaround, you can convert the second and third arguments into column objects using the com.snowflake.snowpark.functions.lit function.

val df = Seq("hello", "world", "scala").toDF("word")
val result = df.withColumn("translated_word", translate(col("word"), lit("aeo"), lit("123")))
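To check the migrated column against the intended result, the character mapping can be emulated in plain Scala, with no Spark or Snowpark dependency. This is only a sketch: the object name TranslateCheck and the helper translateChars are hypothetical, and the code assumes that a matched character with no counterpart in the replacement string is dropped.

```scala
object TranslateCheck {
  // Replaces each character of `s` found in `from` with the character at the
  // same position in `to`. Assumption: a matched character whose position
  // falls beyond the end of `to` is removed from the result.
  def translateChars(s: String, from: String, to: String): String = {
    val sb = new StringBuilder
    for (c <- s) {
      val i = from.indexOf(c)
      if (i < 0) sb.append(c)                          // not in `from`: keep as-is
      else if (i < to.length) sb.append(to.charAt(i))  // mapped character
      // otherwise the character has no replacement and is dropped
    }
    sb.toString
  }

  def main(args: Array[String]): Unit =
    Seq("hello", "world", "scala").foreach { w =>
      println(s"$w -> ${translateChars(w, "aeo", "123")}")
    }
}
```

For the sample words this yields h2ll3, w3rld and sc1l1, which is what the translated_word column is expected to contain.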

Additional recommendations

SPRKSCL1123

Message: org.apache.spark.sql.functions.cos has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.cos function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.cos function, first used with a column name as an argument and then with a column object.

val df = Seq(0.0, Math.PI / 4, Math.PI / 2, Math.PI).toDF("angle_radians")
val result1 = df.withColumn("cosine_value", cos("angle_radians"))
val result2 = df.withColumn("cosine_value", cos(col("angle_radians")))

Output

The SMA adds the EWI SPRKSCL1123 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(0.0, Math.PI / 4, Math.PI / 2, Math.PI).toDF("angle_radians")
/*EWI: SPRKSCL1123 => org.apache.spark.sql.functions.cos has a workaround, see documentation for more info*/
val result1 = df.withColumn("cosine_value", cos("angle_radians"))
/*EWI: SPRKSCL1123 => org.apache.spark.sql.functions.cos has a workaround, see documentation for more info*/
val result2 = df.withColumn("cosine_value", cos(col("angle_radians")))

Recommended fix

Snowpark has an equivalent cos function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(0.0, Math.PI / 4, Math.PI / 2, Math.PI).toDF("angle_radians")
val result1 = df.withColumn("cosine_value", cos(col("angle_radians")))
val result2 = df.withColumn("cosine_value", cos(col("angle_radians")))

Additional recommendations

SPRKSCL1172

Message: Snowpark does not support StructFiled with metadata parameter.

Category: Warning

Description

This issue appears when the SMA detects a use of org.apache.spark.sql.types.StructField.apply with org.apache.spark.sql.types.Metadata as a parameter. This is because Snowpark does not support the metadata parameter.

Scenario

Input

Below is an example of the org.apache.spark.sql.types.StructField.apply function that generates this EWI. In this example, the apply function is used to generate an instance of StructField.

val result = StructField("f1", StringType, true, metadata)

Output

The SMA adds the EWI SPRKSCL1172 to the output code to let you know that the metadata parameter is not supported by Snowpark.

/*EWI: SPRKSCL1172 => Snowpark does not support StructFiled with metadata parameter.*/
val result = StructField("f1", StringType, true, metadata)

Recommended fix

Snowpark has an equivalent com.snowflake.snowpark.types.StructField.apply function that receives three parameters. As a workaround, you can try removing the metadata argument.

val result = StructField("f1", StringType, true)

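Additional recommendations

SPRKSCL1162

Note

This issue code has been deprecated

Message: An error occurred when extracting the dbc files.

Category: Warning

Description

This issue appears when a dbc file cannot be extracted. This warning can be caused by one or more of the following reasons: the file is too large, inaccessible, or read-only.

Additional recommendations

  • As a workaround, check the size of the file in case it is too large to be processed. Also, verify that the tool can access the file to avoid access issues.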

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.

SPRKSCL1133

Message: org.apache.spark.sql.functions.least has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.least function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.least function, first used with multiple column names as arguments and then with column objects.

val df = Seq((10, 20, 5), (15, 25, 30), (7, 14, 3)).toDF("value1", "value2", "value3")
val result1 = df.withColumn("least", least("value1", "value2", "value3"))
val result2 = df.withColumn("least", least(col("value1"), col("value2"), col("value3")))

Output

The SMA adds the EWI SPRKSCL1133 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq((10, 20, 5), (15, 25, 30), (7, 14, 3)).toDF("value1", "value2", "value3")
/*EWI: SPRKSCL1133 => org.apache.spark.sql.functions.least has a workaround, see documentation for more info*/
val result1 = df.withColumn("least", least("value1", "value2", "value3"))
/*EWI: SPRKSCL1133 => org.apache.spark.sql.functions.least has a workaround, see documentation for more info*/
val result2 = df.withColumn("least", least(col("value1"), col("value2"), col("value3")))

Recommended fix

Snowpark has an equivalent least function that receives multiple column objects as arguments. For that reason, the Spark overload that receives multiple column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives multiple string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq((10, 20, 5), (15, 25, 30), (7, 14, 3)).toDF("value1", "value2", "value3")
val result1 = df.withColumn("least", least(col("value1"), col("value2"), col("value3")))
val result2 = df.withColumn("least", least(col("value1"), col("value2"), col("value3")))

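Additional recommendations

SPRKSCL1107

Warning

This issue code has been deprecated

Message: Writer save is not supported.

Category: Conversion error

Description

This issue appears when the tool detects, in a writer statement, the use of a writer save method that is not supported by Snowpark.

Scenario

Input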

Below is an example of the org.apache.spark.sql.DataFrameWriter.save used to save the DataFrame content.

df.write.format("net.snowflake.spark.snowflake").save()

Output

The SMA adds the EWI SPRKSCL1107 to the output code to let you know that the save method is not supported by Snowpark.

df.write.saveAsTable(tablename)
/*EWI: SPRKSCL1107 => Writer method is not supported .save()*/

Recommended fix

There is no recommended fix for this scenario

Additional recommendations

SPRKSCL1156

Message: org.apache.spark.sql.functions.degrees has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.degrees function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.degrees function, first used with a column name as an argument and then with a column object.

val df = Seq(math.Pi, math.Pi / 2, math.Pi / 4, math.Pi / 6).toDF("radians")
val result1 = df.withColumn("degrees", degrees("radians"))
val result2 = df.withColumn("degrees", degrees(col("radians")))

Output

The SMA adds the EWI SPRKSCL1156 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(math.Pi, math.Pi / 2, math.Pi / 4, math.Pi / 6).toDF("radians")
/*EWI: SPRKSCL1156 => org.apache.spark.sql.functions.degrees has a workaround, see documentation for more info*/
val result1 = df.withColumn("degrees", degrees("radians"))
/*EWI: SPRKSCL1156 => org.apache.spark.sql.functions.degrees has a workaround, see documentation for more info*/
val result2 = df.withColumn("degrees", degrees(col("radians")))

Recommended fix

Snowpark has an equivalent degrees function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(math.Pi, math.Pi / 2, math.Pi / 4, math.Pi / 6).toDF("radians")
val result1 = df.withColumn("degrees", degrees(col("radians")))
val result2 = df.withColumn("degrees", degrees(col("radians")))

Additional recommendations

SPRKSCL1127

Message: org.apache.spark.sql.functions.covar_samp has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.covar_samp function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.covar_samp function, first used with column names as the arguments and then with column objects.

val df = Seq(
  (10.0, 20.0),
  (15.0, 25.0),
  (20.0, 30.0),
  (25.0, 35.0),
  (30.0, 40.0)
).toDF("value1", "value2")

val result1 = df.select(covar_samp("value1", "value2").as("sample_covariance"))
val result2 = df.select(covar_samp(col("value1"), col("value2")).as("sample_covariance"))

Output

The SMA adds the EWI SPRKSCL1127 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  (10.0, 20.0),
  (15.0, 25.0),
  (20.0, 30.0),
  (25.0, 35.0),
  (30.0, 40.0)
).toDF("value1", "value2")

/*EWI: SPRKSCL1127 => org.apache.spark.sql.functions.covar_samp has a workaround, see documentation for more info*/
val result1 = df.select(covar_samp("value1", "value2").as("sample_covariance"))
/*EWI: SPRKSCL1127 => org.apache.spark.sql.functions.covar_samp has a workaround, see documentation for more info*/
val result2 = df.select(covar_samp(col("value1"), col("value2")).as("sample_covariance"))

Recommended fix

Snowpark has an equivalent covar_samp function that receives two column objects as arguments. For that reason, the Spark overload that receives two column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives two string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  (10.0, 20.0),
  (15.0, 25.0),
  (20.0, 30.0),
  (25.0, 35.0),
  (30.0, 40.0)
).toDF("value1", "value2")

val result1 = df.select(covar_samp(col("value1"), col("value2")).as("sample_covariance"))
val result2 = df.select(covar_samp(col("value1"), col("value2")).as("sample_covariance"))
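
To sanity-check the migrated query, the expected value can be worked out by hand. The sketch below is plain Scala with no Spark or Snowpark dependency. `sampleCovariance` is a hypothetical helper introduced only for this illustration, and it assumes the usual sample covariance definition with an n - 1 denominator.

```scala
// Hypothetical helper (not part of Spark or Snowpark): sample covariance
// of two equally sized sequences, using the n - 1 denominator.
def sampleCovariance(xs: Seq[Double], ys: Seq[Double]): Double = {
  require(xs.size == ys.size && xs.size > 1, "need two sequences of equal size > 1")
  val meanX = xs.sum / xs.size
  val meanY = ys.sum / ys.size
  // Product of the deviations from the mean, pair by pair.
  val crossDeviations = xs.zip(ys).map { case (x, y) => (x - meanX) * (y - meanY) }
  crossDeviations.sum / (xs.size - 1)
}

// Same data as the example above.
val value1 = Seq(10.0, 15.0, 20.0, 25.0, 30.0)
val value2 = Seq(20.0, 25.0, 30.0, 35.0, 40.0)
val expectedCovariance = sampleCovariance(value1, value2) // 62.5
```

If the value returned by the migrated query differs noticeably from this figure, the column arguments are a reasonable first place to look.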

Additional recommendations

SPRKSCL1113

Message: org.apache.spark.sql.functions.next_day has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.next_day function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.next_day function, first used with a string as the second argument and then with a column object.

val df = Seq("2024-11-06", "2024-11-13", "2024-11-20").toDF("date")
val result1 = df.withColumn("next_monday", next_day(col("date"), "Mon"))
val result2 = df.withColumn("next_monday", next_day(col("date"), lit("Mon")))

Output

The SMA adds the EWI SPRKSCL1113 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("2024-11-06", "2024-11-13", "2024-11-20").toDF("date")
/*EWI: SPRKSCL1113 => org.apache.spark.sql.functions.next_day has a workaround, see documentation for more info*/
val result1 = df.withColumn("next_monday", next_day(col("date"), "Mon"))
/*EWI: SPRKSCL1113 => org.apache.spark.sql.functions.next_day has a workaround, see documentation for more info*/
val result2 = df.withColumn("next_monday", next_day(col("date"), lit("Mon")))

Recommended fix

Snowpark has an equivalent next_day function that receives two column objects as arguments. For that reason, the Spark overload that receives two column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives a column object and a string, you can convert the string into a column object using the com.snowflake.snowpark.functions.lit function as a workaround.

val df = Seq("2024-11-06", "2024-11-13", "2024-11-20").toDF("date")
val result1 = df.withColumn("next_monday", next_day(col("date"), lit("Mon")))
val result2 = df.withColumn("next_monday", next_day(col("date"), lit("Mon")))
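
As a cross-check for the migrated column, the expected dates can be computed with java.time alone. This is a sketch under the assumption that next_day returns the first matching weekday strictly after the input date; `nextMonday` is a hypothetical helper, not a Spark or Snowpark function.

```scala
import java.time.{DayOfWeek, LocalDate}
import java.time.temporal.TemporalAdjusters

// Hypothetical helper: first Monday strictly after the given ISO date.
// `with` is a Scala keyword, so the Java method name needs backticks.
def nextMonday(isoDate: String): LocalDate =
  LocalDate.parse(isoDate).`with`(TemporalAdjusters.next(DayOfWeek.MONDAY))

// Same input dates as the example above (all three are Wednesdays).
val expectedMondays = Seq("2024-11-06", "2024-11-13", "2024-11-20").map(nextMonday)
// expectedMondays: 2024-11-11, 2024-11-18, 2024-11-25
```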

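Additional recommendations

SPRKSCL1002

Message: This code section has recovery from parsing errors *statement*

Category: Parsing error.

Description

This issue appears when the SMA detects a statement in the code of a file that it cannot correctly read or understand (a parsing error), but it is able to recover from that parsing error and continue analyzing the code of the file. In this case, the SMA can process the code of the file without errors.

Scenario

Input

Below is an example of invalid Scala code from which the SMA can recover.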

class myClass {

    def function1() & = { 1 }

    def function2() = { 2 }

    def function3() = { 3 }

}

Output

The SMA adds the EWI SPRKSCL1002 to the output code to let you know that the code of the file has parsing errors; however, the SMA can recover from that error and continue analyzing the code of the file.

class myClass {

    def function1();//EWI: SPRKSCL1002 => Unexpected end of declaration. Failed token: '&' @(3,21).
    & = { 1 }

    def function2() = { 2 }

    def function3() = { 3 }

}

Recommended fix

Since the message pinpoints the error in the statement, you can identify the invalid syntax and either remove it or comment out the statement to avoid the parsing error.

class myClass {

    def function1() = { 1 }

    def function2() = { 2 }

    def function3() = { 3 }

}

class myClass {

    // def function1() & = { 1 }

    def function2() = { 2 }

    def function3() = { 3 }

}

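Additional recommendations

SPRKSCL1142

Message: *spark element* is not defined

Category: Conversion error

Description

This issue appears when the SMA could not determine an appropriate mapping status for the given element. This means that the SMA does not yet know whether this element is supported by Snowpark. Note that this is a generic error code that the SMA uses for any undefined element.

Scenario

Input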

Below is an example of a function for which the SMA could not determine an appropriate mapping status, and therefore it generated this EWI. In this case, you should assume that notDefinedFunction() is a valid Spark function and the code runs.

val df = session.range(10)
val result = df.notDefinedFunction()

Output

The SMA adds the EWI SPRKSCL1142 to the output code to let you know that this element is not defined.

val df = session.range(10)
/*EWI: SPRKSCL1142 => org.apache.spark.sql.DataFrame.notDefinedFunction is not defined*/
val result = df.notDefinedFunction()

Recommended fix

You can perform the following validations to identify the problem:

  • Check that it is a valid Spark element.

  • Check that the syntax and spelling of the element are correct.

  • Check that you are using a Spark version supported by the SMA.

If this is a valid Spark element, please report that you encountered a conversion error on that particular element using the Report an Issue option of the SMA and include any additional information that you think may be helpful.

Please note that if an element is not defined by the SMA, it does not necessarily mean that it is not supported by Snowpark. You should check the Snowpark documentation to verify whether an equivalent element exists.

Additional recommendations

SPRKSCL1152

Message: org.apache.spark.sql.functions.variance has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.variance function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.variance function, first used with a column name as an argument and then with a column object.

val df = Seq(10, 20, 30, 40, 50).toDF("value")
val result1 = df.select(variance("value"))
val result2 = df.select(variance(col("value")))

Output

The SMA adds the EWI SPRKSCL1152 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(10, 20, 30, 40, 50).toDF("value")
/*EWI: SPRKSCL1152 => org.apache.spark.sql.functions.variance has a workaround, see documentation for more info*/
val result1 = df.select(variance("value"))
/*EWI: SPRKSCL1152 => org.apache.spark.sql.functions.variance has a workaround, see documentation for more info*/
val result2 = df.select(variance(col("value")))

Recommended fix

Snowpark has an equivalent variance function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(10, 20, 30, 40, 50).toDF("value")
val result1 = df.select(variance(col("value")))
val result2 = df.select(variance(col("value")))

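Additional recommendations

SPRKSCL1103

This issue code has been deprecated

Message: SparkBuilder method is not supported *method name*

Category: Conversion error

Description

This issue appears when the SMA detects a method in the SparkBuilder method chaining that is not supported by Snowflake. Therefore, it might affect the migration of the reader statement.

The following SparkBuilder methods are not supported:

  • master

  • appName

  • enableHiveSupport

  • withExtensions

Scenario

Input

Below is an example of a SparkBuilder method chain in which many of the methods are not supported by Snowflake.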

val spark = SparkSession.builder()
           .master("local")
           .appName("testApp")
           .config("spark.sql.broadcastTimeout", "3600")
           .enableHiveSupport()
           .getOrCreate()

Output

The SMA adds the EWI SPRKSCL1103 to the output code to let you know that the master, appName and enableHiveSupport methods are not supported by Snowpark. This might affect the migration of the Spark Session statement.

val spark = Session.builder.configFile("connection.properties")
/*EWI: SPRKSCL1103 => SparkBuilder Method is not supported .master("local")*/
/*EWI: SPRKSCL1103 => SparkBuilder Method is not supported .appName("testApp")*/
/*EWI: SPRKSCL1103 => SparkBuilder method is not supported .enableHiveSupport()*/
.create

Recommended fix

To create the session, you need to add the proper Snowflake Snowpark configuration.

In this example, a configs variable is used.

    val configs = Map (
      "URL" -> "https://<myAccount>.snowflakecomputing.com:<port>",
      "USER" -> "<myUserName>",
      "PASSWORD" -> "<myPassword>",
      "ROLE" -> "<myRole>",
      "WAREHOUSE" -> "<myWarehouse>",
      "DB" -> "<myDatabase>",
      "SCHEMA" -> "<mySchema>"
    )
    val session = Session.builder.configs(configs).create

Using a configFile (profile.properties) that holds the connection information is also recommended:

## profile.properties file (a text file)
URL = https://<account_identifier>.snowflakecomputing.com
USER = <username>
PRIVATEKEY = <unencrypted_private_key_from_the_private_key_file>
ROLE = <role_name>
WAREHOUSE = <warehouse_name>
DB = <database_name>
SCHEMA = <schema_name>

The session can then be created with Session.builder.configFile:

val session = Session.builder.configFile("/path/to/properties/file").create
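
If neither hard-coded literals nor a properties file suit your deployment, the configs map can also be assembled from environment variables. The following is only a sketch: the helper `configsFromEnv` and the variable names (SNOWFLAKE_URL, SNOWFLAKE_USER, and so on) are assumptions made for this illustration. Only the map keys come from the configs example above.

```scala
// Hypothetical helper: builds the configs map from environment-style
// key/value pairs instead of hard-coded literals. The SNOWFLAKE_* variable
// names are assumptions made for this sketch.
def configsFromEnv(env: Map[String, String]): Map[String, String] = {
  val keys = Seq("URL", "USER", "PASSWORD", "ROLE", "WAREHOUSE", "DB", "SCHEMA")
  keys.map { key =>
    val variable = s"SNOWFLAKE_$key"
    // Fail fast with a clear message when a variable is not set.
    val value = env.getOrElse(variable, sys.error(s"Missing environment variable: $variable"))
    key -> value
  }.toMap
}

// Usage (requires the Snowpark library and a reachable account):
//   val session = Session.builder.configs(configsFromEnv(sys.env)).create
```

Passing the environment in as a plain Map keeps the helper easy to exercise without touching the real process environment.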

Additional recommendations

SPRKSCL1137

Message: org.apache.spark.sql.functions.sin has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.sin function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.sin function, first used with a column name as an argument and then with a column object.

val df = Seq(Math.PI / 2, Math.PI, Math.PI / 6).toDF("angle")
val result1 = df.withColumn("sin_value", sin("angle"))
val result2 = df.withColumn("sin_value", sin(col("angle")))

Output

The SMA adds the EWI SPRKSCL1137 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(Math.PI / 2, Math.PI, Math.PI / 6).toDF("angle")
/*EWI: SPRKSCL1137 => org.apache.spark.sql.functions.sin has a workaround, see documentation for more info*/
val result1 = df.withColumn("sin_value", sin("angle"))
/*EWI: SPRKSCL1137 => org.apache.spark.sql.functions.sin has a workaround, see documentation for more info*/
val result2 = df.withColumn("sin_value", sin(col("angle")))

Recommended fix

Snowpark has an equivalent sin function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(Math.PI / 2, Math.PI, Math.PI / 6).toDF("angle")
val result1 = df.withColumn("sin_value", sin(col("angle")))
val result2 = df.withColumn("sin_value", sin(col("angle")))

Additional recommendations

SPRKSCL1166

Note

This issue code has been deprecated

Message: org.apache.spark.sql.DataFrameReader.format is not supported.

Category: Warning.

Description

This issue appears when the org.apache.spark.sql.DataFrameReader.format has an argument that is not supported by Snowpark.

Scenario

There are several scenarios, depending on whether the format you are trying to load is supported or not.

Scenario 1

Input

The tool analyzes the type of format that you are trying to load. The supported formats are:

  • csv

  • json

  • orc

  • parquet

  • text

The below example shows how the tool transforms the format method when passing a csv value.

spark.read.format("csv").load(path)

Output

The tool transforms the format method into a csv method call when the load function has one parameter.

spark.read.csv(path)

Recommended fix

In this case the tool does not show an EWI, so no fix is needed.

Scenario 2

Input

The below example shows how the tool transforms the format method when passing a net.snowflake.spark.snowflake value.

spark.read.format("net.snowflake.spark.snowflake").load(path)

Output

The tool shows the EWI SPRKSCL1166 indicating that the value net.snowflake.spark.snowflake is not supported.

/*EWI: SPRKSCL1166 => The parameter net.snowflake.spark.snowflake is not supported for org.apache.spark.sql.DataFrameReader.format
  EWI: SPRKSCL1112 => org.apache.spark.sql.DataFrameReader.load(scala.String) is not supported*/
spark.read.format("net.snowflake.spark.snowflake").load(path)

Recommended fix

For unsupported formats there is no specific fix, since it depends on the files that you are trying to read.

Scenario 3

Input

The below example shows how the tool transforms the format method when passing a csv, but using a variable instead.

val myFormat = "csv"
spark.read.format(myFormat).load(path)

Output

Since the tool cannot determine the value of the variable at runtime, it shows the EWI SPRKSCL1163, indicating that the value is not supported.

/*EWI: SPRKSCL1163 => myFormat is not a literal and can't be evaluated
  EWI: SPRKSCL1112 => org.apache.spark.sql.DataFrameReader.load(scala.String) is not supported*/
spark.read.format(myFormat).load(path)

Recommended fix

As a workaround, you can check the value of the variable and add it as a string to the format call.

Additional recommendations

SPRKSCL1118

Message: org.apache.spark.sql.functions.trunc has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.trunc function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.trunc function that generates this EWI.

val df = Seq(
  Date.valueOf("2024-10-28"),
  Date.valueOf("2023-05-15"),
  Date.valueOf("2022-11-20"),
).toDF("date")

val result = df.withColumn("truncated", trunc(col("date"), "month"))

Output

The SMA adds the EWI SPRKSCL1118 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  Date.valueOf("2024-10-28"),
  Date.valueOf("2023-05-15"),
  Date.valueOf("2022-11-20"),
).toDF("date")

/*EWI: SPRKSCL1118 => org.apache.spark.sql.functions.trunc has a workaround, see documentation for more info*/
val result = df.withColumn("truncated", trunc(col("date"), "month"))

Recommended fix

As a workaround, you can convert the second argument into a column object using the com.snowflake.snowpark.functions.lit function.

val df = Seq(
  Date.valueOf("2024-10-28"),
  Date.valueOf("2023-05-15"),
  Date.valueOf("2022-11-20"),
).toDF("date")

val result = df.withColumn("truncated", trunc(col("date"), lit("month")))
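
To verify the migrated column, the expected values can be derived with java.time. The sketch assumes that truncating to "month" yields the first day of the month of each date. `truncToMonth` is a hypothetical helper introduced for this illustration.

```scala
import java.time.LocalDate

// Hypothetical helper: truncate a date to the first day of its month.
def truncToMonth(date: LocalDate): LocalDate = date.withDayOfMonth(1)

// Same dates as the example above.
val inputDates = Seq("2024-10-28", "2023-05-15", "2022-11-20").map(s => LocalDate.parse(s))
val expectedTruncated = inputDates.map(truncToMonth)
// expectedTruncated: 2024-10-01, 2023-05-01, 2022-11-01
```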

Additional recommendations

SPRKSCL1149

Message: org.apache.spark.sql.functions.toRadians has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.toRadians function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.toRadians function, first used with a column name as an argument and then with a column object.

val df = Seq(0, 45, 90, 180, 270).toDF("degrees")
val result1 = df.withColumn("radians", toRadians("degrees"))
val result2 = df.withColumn("radians", toRadians(col("degrees")))

Output

The SMA adds the EWI SPRKSCL1149 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(0, 45, 90, 180, 270).toDF("degrees")
/*EWI: SPRKSCL1149 => org.apache.spark.sql.functions.toRadians has a workaround, see documentation for more info*/
val result1 = df.withColumn("radians", toRadians("degrees"))
/*EWI: SPRKSCL1149 => org.apache.spark.sql.functions.toRadians has a workaround, see documentation for more info*/
val result2 = df.withColumn("radians", toRadians(col("degrees")))

Recommended fix

As a workaround, you can use the radians function. For the Spark overload that receives a string argument, you additionally have to convert the string into a column object using the com.snowflake.snowpark.functions.col function.

val df = Seq(0, 45, 90, 180, 270).toDF("degrees")
val result1 = df.withColumn("radians", radians(col("degrees")))
val result2 = df.withColumn("radians", radians(col("degrees")))
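
A quick way to confirm that the renamed function behaves as intended is to compare a few rows against values computed locally. The sketch below uses only the Scala standard library; the variable names are illustrative.

```scala
// Expected values for the sample column, computed with the standard
// library's degree-to-radian conversion.
val degreesValues = Seq(0, 45, 90, 180, 270)
val expectedRadians = degreesValues.map(d => math.toRadians(d.toDouble))

// 180 degrees should be approximately math.Pi, and 90 degrees math.Pi / 2.
// Compare with a tolerance rather than exact equality, since the values
// are floating point.
val closeToPi = math.abs(expectedRadians(3) - math.Pi) < 1e-12
```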

Additional recommendations

SPRKSCL1159

Message: org.apache.spark.sql.functions.stddev_samp has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.stddev_samp function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.stddev_samp function that generates this EWI. In this example, the stddev_samp function is used to calculate the sample standard deviation of the selected column.

val df = Seq("1.7", "2.1", "3.0", "4.4", "5.2").toDF("elements")
val result1 = stddev_samp(col("elements"))
val result2 = stddev_samp("elements")

Output

The SMA adds the EWI SPRKSCL1159 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("1.7", "2.1", "3.0", "4.4", "5.2").toDF("elements")
/*EWI: SPRKSCL1159 => org.apache.spark.sql.functions.stddev_samp has a workaround, see documentation for more info*/
val result1 = stddev_samp(col("elements"))
/*EWI: SPRKSCL1159 => org.apache.spark.sql.functions.stddev_samp has a workaround, see documentation for more info*/
val result2 = stddev_samp("elements")

Recommended fix

Snowpark has an equivalent stddev_samp function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq("1.7", "2.1", "3.0", "4.4", "5.2").toDF("elements")
val result1 = stddev_samp(col("elements"))
val result2 = stddev_samp(col("elements"))

Additional recommendations

SPRKSCL1108

Note

This issue code has been deprecated.

Message: org.apache.spark.sql.DataFrameReader.format is not supported.

Category: Warning.

Description

This issue appears when the org.apache.spark.sql.DataFrameReader.format has an argument that is not supported by Snowpark.

Scenario

There are several scenarios, depending on whether the format you are trying to load is supported or not.

Scenario 1

Input

The tool analyzes the type of format that you are trying to load. The supported formats are:

  • csv

  • json

  • orc

  • parquet

  • text

The below example shows how the tool transforms the format method when passing a csv value.

spark.read.format("csv").load(path)

Output

The tool transforms the format method into a csv method call when the load function has one parameter.

spark.read.csv(path)

Recommended fix

In this case the tool does not show an EWI, so no fix is needed.

Scenario 2

Input

The below example shows how the tool transforms the format method when passing a net.snowflake.spark.snowflake value.

spark.read.format("net.snowflake.spark.snowflake").load(path)

Output

The tool shows the EWI SPRKSCL1108 indicating that the value net.snowflake.spark.snowflake is not supported.

/*EWI: SPRKSCL1108 => The parameter net.snowflake.spark.snowflake is not supported for org.apache.spark.sql.DataFrameReader.format
  EWI: SPRKSCL1112 => org.apache.spark.sql.DataFrameReader.load(scala.String) is not supported*/
spark.read.format("net.snowflake.spark.snowflake").load(path)

Recommended fix

For unsupported formats there is no specific fix, since it depends on the files that you are trying to read.

Scenario 3

Input

The below example shows how the tool transforms the format method when passing a csv, but using a variable instead.

val myFormat = "csv"
spark.read.format(myFormat).load(path)

Output

Since the tool cannot determine the value of the variable at runtime, it shows the EWI SPRKSCL1163, indicating that the value is not supported.

/*EWI: SPRKSCL1108 => myFormat is not a literal and can't be evaluated
  EWI: SPRKSCL1112 => org.apache.spark.sql.DataFrameReader.load(scala.String) is not supported*/
spark.read.format(myFormat).load(path)

Recommended fix

As a workaround, you can check the value of the variable and add it as a string to the format call.

Additional recommendations

SPRKSCL1128

Message: org.apache.spark.sql.functions.exp has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.exp function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.exp function, first used with a column name as an argument and then with a column object.

val df = Seq(1.0, 2.0, 3.0).toDF("value")
val result1 = df.withColumn("exp_value", exp("value"))
val result2 = df.withColumn("exp_value", exp(col("value")))

Output

The SMA adds the EWI SPRKSCL1128 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(1.0, 2.0, 3.0).toDF("value")
/*EWI: SPRKSCL1128 => org.apache.spark.sql.functions.exp has a workaround, see documentation for more info*/
val result1 = df.withColumn("exp_value", exp("value"))
/*EWI: SPRKSCL1128 => org.apache.spark.sql.functions.exp has a workaround, see documentation for more info*/
val result2 = df.withColumn("exp_value", exp(col("value")))

Recommended fix

Snowpark has an equivalent exp function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(1.0, 2.0, 3.0).toDF("value")
val result1 = df.withColumn("exp_value", exp(col("value")))
val result2 = df.withColumn("exp_value", exp(col("value")))

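Additional recommendations

SPRKSCL1169

Message: *Spark element* is missing on the method chaining.

Category: Warning.

Description

This issue appears when the SMA detects that a Spark element call is missing on the method chaining. The SMA needs to know that Spark element to analyze the statement.

Scenario

Input

Below is an example where the load function call is missing on the method chaining.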

val reader = spark.read.format("json")
val df = reader.load(path)

Output

The SMA adds the EWI SPRKSCL1169 to the output code to let you know that the load function call is missing on the method chaining and the SMA cannot analyze the statement.

/*EWI: SPRKSCL1169 => Function 'org.apache.spark.sql.DataFrameReader.load' is missing on the method chaining*/
val reader = spark.read.format("json")
val df = reader.load(path)

Recommended fix

Make sure that all function calls of the method chaining are in the same statement.

val df = spark.read.format("json").load(path)

Additional recommendations

SPRKSCL1138

Message: org.apache.spark.sql.functions.sinh has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.sinh function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.sinh function, first used with a column name as an argument and then with a column object.

val df = Seq(0.0, 1.0, 2.0, 3.0).toDF("value")
val result1 = df.withColumn("sinh_value", sinh("value"))
val result2 = df.withColumn("sinh_value", sinh(col("value")))

Output

The SMA adds the EWI SPRKSCL1138 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(0.0, 1.0, 2.0, 3.0).toDF("value")
/*EWI: SPRKSCL1138 => org.apache.spark.sql.functions.sinh has a workaround, see documentation for more info*/
val result1 = df.withColumn("sinh_value", sinh("value"))
/*EWI: SPRKSCL1138 => org.apache.spark.sql.functions.sinh has a workaround, see documentation for more info*/
val result2 = df.withColumn("sinh_value", sinh(col("value")))

Recommended fix

Snowpark has an equivalent sinh function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(0.0, 1.0, 2.0, 3.0).toDF("value")
val result1 = df.withColumn("sinh_value", sinh(col("value")))
val result2 = df.withColumn("sinh_value", sinh(col("value")))

Additional recommendations

SPRKSCL1129

Message: org.apache.spark.sql.functions.floor has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.floor function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.floor function, first used with a column name as an argument, then with a column object and finally with two column objects.

val df = Seq(4.75, 6.22, 9.99).toDF("value")
val result1 = df.withColumn("floor_value", floor("value"))
val result2 = df.withColumn("floor_value", floor(col("value")))
val result3 = df.withColumn("floor_value", floor(col("value"), lit(1)))

Output

The SMA adds the EWI SPRKSCL1129 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(4.75, 6.22, 9.99).toDF("value")
/*EWI: SPRKSCL1129 => org.apache.spark.sql.functions.floor has a workaround, see documentation for more info*/
val result1 = df.withColumn("floor_value", floor("value"))
/*EWI: SPRKSCL1129 => org.apache.spark.sql.functions.floor has a workaround, see documentation for more info*/
val result2 = df.withColumn("floor_value", floor(col("value")))
/*EWI: SPRKSCL1129 => org.apache.spark.sql.functions.floor has a workaround, see documentation for more info*/
val result3 = df.withColumn("floor_value", floor(col("value"), lit(1)))

Recommended fix

Snowpark has an equivalent floor function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

For the overload that receives a column object and a scale, you can use the callBuiltin function to invoke the Snowflake builtin FLOOR function. To use it, pass the string "floor" as the first argument, the column as the second argument and the scale as the third argument.

val df = Seq(4.75, 6.22, 9.99).toDF("value")
val result1 = df.withColumn("floor_value", floor(col("value")))
val result2 = df.withColumn("floor_value", floor(col("value")))
val result3 = df.withColumn("floor_value", callBuiltin("floor", col("value"), lit(1)))
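
To check the scaled variant (result3) after migration, the expected values can be worked out locally. The sketch below assumes that flooring with a scale of 1 rounds toward negative infinity while keeping one digit after the decimal point. `floorAtScale` is a hypothetical helper, and the inputs are given as strings so that the decimal values are represented exactly.

```scala
import scala.math.BigDecimal.RoundingMode

// Hypothetical helper: round toward negative infinity, keeping `scale`
// digits after the decimal point.
def floorAtScale(value: String, scale: Int): BigDecimal =
  BigDecimal(value).setScale(scale, RoundingMode.FLOOR)

// Same values as the example above.
val expectedFloors = Seq("4.75", "6.22", "9.99").map(floorAtScale(_, 1))
// expectedFloors: 4.7, 6.2, 9.9
```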

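Additional recommendations

SPRKSCL1168

Message: *Spark element* with argument(s) value(s) *given arguments* is not supported.

Category: Warning.

Description

This issue appears when the SMA detects that the Spark element with the given parameters is not supported.

Scenario

Input

Below is an example of a Spark element whose parameter is not supported.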

spark.read.format("text").load(path)

Output

The SMA adds the EWI SPRKSCL1168 to the output code to let you know that the Spark element with the given parameter is not supported.

/*EWI: SPRKSCL1168 => org.apache.spark.sql.DataFrameReader.format(scala.String) with argument(s) value(s) (spark.format) is not supported*/
spark.read.format("text").load(path)

Recommended fix

There is no specific fix for this scenario.

Additional recommendations

SPRKSCL1139

Message: org.apache.spark.sql.functions.sqrt has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.sqrt function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.sqrt function, first used with a column name as an argument and then with a column object.

val df = Seq(4.0, 16.0, 25.0, 36.0).toDF("value")
val result1 = df.withColumn("sqrt_value", sqrt("value"))
val result2 = df.withColumn("sqrt_value", sqrt(col("value")))

Output

The SMA adds the EWI SPRKSCL1139 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(4.0, 16.0, 25.0, 36.0).toDF("value")
/*EWI: SPRKSCL1139 => org.apache.spark.sql.functions.sqrt has a workaround, see documentation for more info*/
val result1 = df.withColumn("sqrt_value", sqrt("value"))
/*EWI: SPRKSCL1139 => org.apache.spark.sql.functions.sqrt has a workaround, see documentation for more info*/
val result2 = df.withColumn("sqrt_value", sqrt(col("value")))

Recommended fix

Snowpark has an equivalent sqrt function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(4.0, 16.0, 25.0, 36.0).toDF("value")
val result1 = df.withColumn("sqrt_value", sqrt(col("value")))
val result2 = df.withColumn("sqrt_value", sqrt(col("value")))

Additional recommendations

SPRKSCL1119

Message: org.apache.spark.sql.Column.endsWith has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.Column.endsWith function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.Column.endsWith function, first used with a literal string argument and then with a column object argument.

val df1 = Seq(
  ("Alice", "alice@example.com"),
  ("Bob", "bob@example.org"),
  ("David", "david@example.com")
).toDF("name", "email")
val result1 = df1.filter(col("email").endsWith(".com"))

val df2 = Seq(
  ("Alice", "alice@example.com", ".com"),
  ("Bob", "bob@example.org", ".org"),
  ("David", "david@example.org", ".com")
).toDF("name", "email", "suffix")
val result2 = df2.filter(col("email").endsWith(col("suffix")))

Output

The SMA adds the EWI SPRKSCL1119 to the output code to let you know that this function is not directly supported by Snowpark, but it has a workaround.

val df1 = Seq(
  ("Alice", "alice@example.com"),
  ("Bob", "bob@example.org"),
  ("David", "david@example.com")
).toDF("name", "email")
/*EWI: SPRKSCL1119 => org.apache.spark.sql.Column.endsWith has a workaround, see documentation for more info*/
val result1 = df1.filter(col("email").endsWith(".com"))

val df2 = Seq(
  ("Alice", "alice@example.com", ".com"),
  ("Bob", "bob@example.org", ".org"),
  ("David", "david@example.org", ".com")
).toDF("name", "email", "suffix")
/*EWI: SPRKSCL1119 => org.apache.spark.sql.Column.endsWith has a workaround, see documentation for more info*/
val result2 = df2.filter(col("email").endsWith(col("suffix")))

Recommended fix

As a workaround, you can use the com.snowflake.snowpark.functions.endswith function, where the first argument is the column whose values will be checked and the second argument is the suffix to check against the column values. Please note that if the argument of the Spark endsWith function is a literal string, you should convert it into a column object using the com.snowflake.snowpark.functions.lit function.

val df1 = Seq(
  ("Alice", "alice@example.com"),
  ("Bob", "bob@example.org"),
  ("David", "david@example.com")
).toDF("name", "email")
val result1 = df1.filter(endswith(col("email"), lit(".com")))

val df2 = Seq(
  ("Alice", "alice@example.com", ".com"),
  ("Bob", "bob@example.org", ".org"),
  ("David", "david@example.org", ".com")
).toDF("name", "email", "suffix")
val result2 = df2.filter(endswith(col("email"), col("suffix")))
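
To confirm that the migrated filter keeps the right rows, the same predicate can be applied to the sample data in plain Scala. This is only an illustrative cross-check for the df2 case, with no Spark or Snowpark dependency; the variable names are assumptions made for this sketch.

```scala
// Rows from df2 as plain tuples: (name, email, suffix).
val rows = Seq(
  ("Alice", "alice@example.com", ".com"),
  ("Bob", "bob@example.org", ".org"),
  ("David", "david@example.org", ".com")
)

// Keep the rows whose email ends with the suffix stored in the same row.
val expectedNames = rows.collect {
  case (name, email, suffix) if email.endsWith(suffix) => name
}
// expectedNames: Seq("Alice", "Bob")
```

David is filtered out because his address ends in ".org" while the suffix column holds ".com".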

Additional recommendations

SPRKSCL1148

Message: org.apache.spark.sql.functions.toDegrees has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.toDegrees function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.toDegrees function, first used with a column name as an argument and then with a column object.

val df = Seq(Math.PI, Math.PI / 2, Math.PI / 4).toDF("angle_in_radians")
val result1 = df.withColumn("angle_in_degrees", toDegrees("angle_in_radians"))
val result2 = df.withColumn("angle_in_degrees", toDegrees(col("angle_in_radians")))

Output

The SMA adds the EWI SPRKSCL1148 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(Math.PI, Math.PI / 2, Math.PI / 4).toDF("angle_in_radians")
/*EWI: SPRKSCL1148 => org.apache.spark.sql.functions.toDegrees has a workaround, see documentation for more info*/
val result1 = df.withColumn("angle_in_degrees", toDegrees("angle_in_radians"))
/*EWI: SPRKSCL1148 => org.apache.spark.sql.functions.toDegrees has a workaround, see documentation for more info*/
val result2 = df.withColumn("angle_in_degrees", toDegrees(col("angle_in_radians")))

Recommended fix

As a workaround, you can use the degrees function. For the Spark overload that receives a string argument, you additionally have to convert the string into a column object using the com.snowflake.snowpark.functions.col function.

val df = Seq(Math.PI, Math.PI / 2, Math.PI / 4).toDF("angle_in_radians")
val result1 = df.withColumn("angle_in_degrees", degrees(col("angle_in_radians")))
val result2 = df.withColumn("angle_in_degrees", degrees(col("angle_in_radians")))
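For a quick local reference, the expected conversions can be computed in plain Scala. This is a minimal sketch, assuming you only want values to compare against the migrated output; DegreesCheck and approxEqual are hypothetical names, and the comparison uses a tolerance because the results are floating-point numbers.

```scala
object DegreesCheck {
  // Same input angles as the example above, in radians.
  val anglesInRadians: Seq[Double] = Seq(math.Pi, math.Pi / 2, math.Pi / 4)

  // Reference values computed with the JVM's own radians-to-degrees conversion.
  val anglesInDegrees: Seq[Double] = anglesInRadians.map(math.toDegrees)

  // Floating-point results should be compared with a tolerance, not with ==.
  def approxEqual(a: Double, b: Double, eps: Double = 1e-9): Boolean =
    math.abs(a - b) <= eps
}
```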

Additional recommendations

SPRKSCL1158

Message: org.apache.spark.sql.functions.skewness has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.skewness function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.skewness function that generates this EWI. In this example, the skewness function is used to calculate the skewness of the selected column.

val df = Seq("1", "2", "3").toDF("elements")
val result1 = skewness(col("elements"))
val result2 = skewness("elements")

Output

The SMA adds the EWI SPRKSCL1158 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("1", "2", "3").toDF("elements")
/*EWI: SPRKSCL1158 => org.apache.spark.sql.functions.skewness has a workaround, see documentation for more info*/
val result1 = skewness(col("elements"))
/*EWI: SPRKSCL1158 => org.apache.spark.sql.functions.skewness has a workaround, see documentation for more info*/
val result2 = skewness("elements")

Recommended fix

Snowpark has an equivalent function named skew that receives a column object as an argument. For the Spark overload that receives a column object as an argument, you only need to replace skewness with skew.

For the overload that receives a string argument, you additionally have to convert the string into a column object using the com.snowflake.snowpark.functions.col function.

val df = Seq("1", "2", "3").toDF("elements")
val result1 = skew(col("elements"))
val result2 = skew(col("elements"))

Additional recommendations

SPRKSCL1109

Note

This issue code has been deprecated

Message: The parameter is not defined for org.apache.spark.sql.DataFrameReader.option.

Category: Warning

Description

This issue appears when the SMA detects that the given parameter of org.apache.spark.sql.DataFrameReader.option is not defined.

Scenario

Input

Below is an example of an undefined parameter for the org.apache.spark.sql.DataFrameReader.option function.

spark.read.option("header", true).json(path)

Output

The SMA adds the EWI SPRKSCL1109 to the output code to let you know that the given parameter of the org.apache.spark.sql.DataFrameReader.option function is not defined.

/*EWI: SPRKSCL1109 => The parameter header=True is not supported for org.apache.spark.sql.DataFrameReader.option*/
spark.read.option("header", true).json(path)

Recommended fix

Check the Snowpark documentation on reader format options to identify the defined options.

Additional recommendations

SPRKSCL1114

Message: org.apache.spark.sql.functions.repeat has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.repeat function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.repeat function that generates this EWI.

val df = Seq("Hello", "World").toDF("word")
val result = df.withColumn("repeated_word", repeat(col("word"), 3))

Output

The SMA adds the EWI SPRKSCL1114 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("Hello", "World").toDF("word")
/*EWI: SPRKSCL1114 => org.apache.spark.sql.functions.repeat has a workaround, see documentation for more info*/
val result = df.withColumn("repeated_word", repeat(col("word"), 3))

Recommended fix

As a workaround, you can convert the second argument into a column object using the com.snowflake.snowpark.functions.lit function.

val df = Seq("Hello", "World").toDF("word")
val result = df.withColumn("repeated_word", repeat(col("word"), lit(3)))
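The values expected in the repeated_word column can be reproduced in plain Scala as a reference. This is a small sketch, not part of the SMA output; RepeatCheck is a hypothetical name and the repetition relies on Scala's built-in String * Int operator.

```scala
object RepeatCheck {
  // Same input words as the example above.
  val words: Seq[String] = Seq("Hello", "World")

  // In Scala, "abc" * n concatenates the string n times.
  def repeated(n: Int): Seq[String] = words.map(_ * n)
}
```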

Additional recommendations

SPRKSCL1145

Message: org.apache.spark.sql.functions.sumDistinct has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.sumDistinct function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.sumDistinct function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Alice", 10),
  ("Alice", 20),
  ("Bob", 15)
).toDF("name", "value")

val result1 = df.groupBy("name").agg(sumDistinct("value"))
val result2 = df.groupBy("name").agg(sumDistinct(col("value")))

Output

The SMA adds the EWI SPRKSCL1145 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Alice", 10),
  ("Alice", 20),
  ("Bob", 15)
).toDF("name", "value")

/*EWI: SPRKSCL1145 => org.apache.spark.sql.functions.sumDistinct has a workaround, see documentation for more info*/
val result1 = df.groupBy("name").agg(sumDistinct("value"))
/*EWI: SPRKSCL1145 => org.apache.spark.sql.functions.sumDistinct has a workaround, see documentation for more info*/
val result2 = df.groupBy("name").agg(sumDistinct(col("value")))

Recommended fix

As a workaround, you can use the sum_distinct function. For the Spark overload that receives a string argument, you additionally have to convert the string into a column object using the com.snowflake.snowpark.functions.col function.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Alice", 10),
  ("Alice", 20),
  ("Bob", 15)
).toDF("name", "value")

val result1 = df.groupBy("name").agg(sum_distinct(col("value")))
val result2 = df.groupBy("name").agg(sum_distinct(col("value")))
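To see what the aggregation should return for the sample data, the distinct sums per name can be computed with plain Scala collections. This is a minimal sketch for comparison only; SumDistinctCheck is a hypothetical name and the code does not use Snowpark.

```scala
object SumDistinctCheck {
  // Same sample rows as the example above: (name, value).
  val rows: Seq[(String, Int)] = Seq(
    ("Alice", 10),
    ("Bob", 15),
    ("Alice", 10),
    ("Alice", 20),
    ("Bob", 15)
  )

  // Group by name, drop duplicate values within each group, then sum.
  def sumDistinctByName: Map[String, Int] =
    rows.groupBy(_._1).map { case (name, group) =>
      name -> group.map(_._2).distinct.sum
    }
}
```

Alice's duplicate 10 is counted once, which is the difference from a plain sum.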

Additional recommendations

SPRKSCL1171

Message: Snowpark does not support split functions with more than two parameters or containing regex pattern. See documentation for more info.

Category: Warning.

Description

This issue appears when the SMA detects that org.apache.spark.sql.functions.split has more than two parameters or contains a regex pattern.

Scenario

The split function is used to separate the given column around matches of the given pattern. This Spark function has three overloads.

Scenario 1

Input

Below is an example of the org.apache.spark.sql.functions.split function that generates this EWI. In this example, the split function has two parameters and the second argument is a string, not a regex pattern.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
val result = df.select(split(col("words"), "Snow"))

Output

The SMA adds the EWI SPRKSCL1171 to the output code to let you know that this function is not fully supported by Snowpark.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
/* EWI: SPRKSCL1171 => Snowpark does not support split functions with more than two parameters or containing regex pattern. See documentation for more info. */
val result = df.select(split(col("words"), "Snow"))

Recommended fix

Snowpark has an equivalent split function that receives a column object as its second argument. For the Spark overload whose second argument is a string that is not a regex pattern, you can convert the string into a column object using the com.snowflake.snowpark.functions.lit function as a workaround.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
val result = df.select(split(col("words"), lit("Snow")))

Scenario 2

Input

Below is an example of the org.apache.spark.sql.functions.split function that generates this EWI. In this example, the split function has two parameters and the second argument is a regex pattern.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
val result = df.select(split(col("words"), "^([\\d]+-[\\d]+-[\\d])"))

Output

The SMA adds the EWI SPRKSCL1171 to the output code to let you know that this function is not fully supported by Snowpark because regex patterns are not supported by Snowflake.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
/* EWI: SPRKSCL1171 => Snowpark does not support split functions with more than two parameters or containing regex pattern. See documentation for more info. */
val result = df.select(split(col("words"), "^([\\d]+-[\\d]+-[\\d])"))

Recommended fix

Since regex patterns are not supported by Snowflake in this function, try replacing the pattern with a string that is not a regex pattern.
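When reviewing many split calls, it can help to flag which delimiters are plain literals and which rely on regex syntax. The sketch below is a rough heuristic under stated assumptions: SplitDelimiterCheck and isPlainLiteral are hypothetical names, and the check simply looks for characters that have special meaning in Java regular expressions, so it may flag some harmless strings.

```scala
object SplitDelimiterCheck {
  // Characters with special meaning in Java regular expressions.
  private val regexMetaChars: Set[Char] = """\^$.|?*+()[]{}""".toSet

  // Heuristic: a delimiter without metacharacters can be treated as a literal.
  def isPlainLiteral(delimiter: String): Boolean =
    !delimiter.exists(regexMetaChars.contains)
}
```

Delimiters that pass the check are candidates for the lit-based workaround from Scenario 1; the others need a manual rewrite.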

Scenario 3

Input

Below is an example of the org.apache.spark.sql.functions.split function that generates this EWI. In this example, the split function has more than two parameters.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
val result = df.select(split(df("words"), "Snow", 3))

Output

The SMA adds the EWI SPRKSCL1171 to the output code to let you know that this function is not fully supported by Snowpark, because Snowflake does not have a split function with more than two parameters.

val df = Seq("Snowflake", "Snowpark", "Snow", "Spark").toDF("words")
/* EWI: SPRKSCL1171 => Snowpark does not support split functions with more than two parameters or containing regex pattern. See documentation for more info. */
val result = df.select(split(df("words"), "Snow", 3))

Recommended fix

Since Snowflake does not support a split function with more than two parameters, try using the split function that Snowflake supports.
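Before dropping the third argument, it is worth checking whether the limit changes the result for your data. The sketch below uses java.lang.String.split as a stand-in, on the assumption that its limit semantics match those of the Spark function; SplitLimitCheck and its helpers are hypothetical names.

```scala
import java.util.regex.Pattern

object SplitLimitCheck {
  // Same sample words as the example above.
  val words: Seq[String] = Seq("Snowflake", "Snowpark", "Snow", "Spark")

  // Split on a literal delimiter; a limit of -1 means "no limit".
  def splitLiteral(s: String, delimiter: String, limit: Int): List[String] =
    s.split(Pattern.quote(delimiter), limit).toList

  // True when the limit never changes the result for the sample words.
  def limitIsRedundant(delimiter: String, limit: Int): Boolean =
    words.forall { w =>
      splitLiteral(w, delimiter, limit) == splitLiteral(w, delimiter, -1)
    }
}
```

For the sample data, no word splits into more than two pieces, so a limit of 3 has no effect and the two-parameter form gives the same pieces.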

Additional recommendations

SPRKSCL1120

Message: org.apache.spark.sql.functions.asin has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.asin function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.asin function, first used with a column name as an argument and then with a column object.

val df = Seq(0.5, 0.6, -0.5).toDF("value")
val result1 = df.select(col("value"), asin("value").as("asin_value"))
val result2 = df.select(col("value"), asin(col("value")).as("asin_value"))

Output

The SMA adds the EWI SPRKSCL1120 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(0.5, 0.6, -0.5).toDF("value")
/*EWI: SPRKSCL1120 => org.apache.spark.sql.functions.asin has a workaround, see documentation for more info*/
val result1 = df.select(col("value"), asin("value").as("asin_value"))
/*EWI: SPRKSCL1120 => org.apache.spark.sql.functions.asin has a workaround, see documentation for more info*/
val result2 = df.select(col("value"), asin(col("value")).as("asin_value"))

Recommended fix

Snowpark has an equivalent asin function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(0.5, 0.6, -0.5).toDF("value")
val result1 = df.select(col("value"), asin(col("value")).as("asin_value"))
val result2 = df.select(col("value"), asin(col("value")).as("asin_value"))

Additional recommendations

SPRKSCL1130

Message: org.apache.spark.sql.functions.greatest has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.greatest function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.greatest function, first used with multiple column names as arguments and then with multiple column objects.

val df = Seq(
  ("apple", 10, 20, 15),
  ("banana", 5, 25, 18),
  ("mango", 12, 8, 30)
).toDF("fruit", "value1", "value2", "value3")

val result1 = df.withColumn("greatest", greatest("value1", "value2", "value3"))
val result2 = df.withColumn("greatest", greatest(col("value1"), col("value2"), col("value3")))

Output

The SMA adds the EWI SPRKSCL1130 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("apple", 10, 20, 15),
  ("banana", 5, 25, 18),
  ("mango", 12, 8, 30)
).toDF("fruit", "value1", "value2", "value3")

/*EWI: SPRKSCL1130 => org.apache.spark.sql.functions.greatest has a workaround, see documentation for more info*/
val result1 = df.withColumn("greatest", greatest("value1", "value2", "value3"))
/*EWI: SPRKSCL1130 => org.apache.spark.sql.functions.greatest has a workaround, see documentation for more info*/
val result2 = df.withColumn("greatest", greatest(col("value1"), col("value2"), col("value3")))

Recommended fix

Snowpark has an equivalent greatest function that receives multiple column objects as arguments. For that reason, the Spark overload that receives column objects as arguments is directly supported by Snowpark and does not require any changes.

For the overload that receives multiple string arguments, you can convert the strings into column objects using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("apple", 10, 20, 15),
  ("banana", 5, 25, 18),
  ("mango", 12, 8, 30)
).toDF("fruit", "value1", "value2", "value3")

val result1 = df.withColumn("greatest", greatest(col("value1"), col("value2"), col("value3")))
val result2 = df.withColumn("greatest", greatest(col("value1"), col("value2"), col("value3")))
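As a reference for the greatest column, the per-row maximum of the sample data can be computed in plain Scala. This is a minimal sketch for comparison only; GreatestCheck is a hypothetical name, and the sample data contains no NULL values, so NULL handling is not covered.

```scala
object GreatestCheck {
  // Same sample rows as the example above: (fruit, value1, value2, value3).
  val rows: Seq[(String, Int, Int, Int)] = Seq(
    ("apple", 10, 20, 15),
    ("banana", 5, 25, 18),
    ("mango", 12, 8, 30)
  )

  // Row-wise maximum across the three value columns.
  def greatestByFruit: Seq[(String, Int)] =
    rows.map { case (fruit, v1, v2, v3) => fruit -> Seq(v1, v2, v3).max }
}
```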

Additional recommendations

SPRKSCL1161

Message: Failed to add dependencies.

Category: Conversion error.

Description

This issue occurs when the SMA detects a Spark version in the project configuration file that the SMA does not support, so the SMA cannot add the Snowpark and Snowpark Extensions dependencies to that project configuration file. If the Snowpark dependencies are not added, the migrated code will not compile.

Scenario

There are three possible scenarios: sbt, gradle, and pom.xml. To process the project configuration file, the SMA tries to remove the Spark dependencies and add the Snowpark and Snowpark Extensions dependencies.

Scenario 1

Input

Below is an example of the dependencies section of an sbt project configuration file.

...
libraryDependencies += "org.apache.spark" % "spark-core_2.13" % "3.5.3"
libraryDependencies += "org.apache.spark" % "spark-sql_2.13" % "3.5.3"
...

Output

The SMA adds the EWI SPRKSCL1161 to the issues inventory since the Spark version is not supported and keeps the output the same.

...
libraryDependencies += "org.apache.spark" % "spark-core_2.13" % "3.5.3"
libraryDependencies += "org.apache.spark" % "spark-sql_2.13" % "3.5.3"
...

Recommended fix

Manually remove the Spark dependencies and add the Snowpark and Snowpark Extensions dependencies to the sbt project configuration file.

...
libraryDependencies += "com.snowflake" % "snowpark" % "1.14.0"
libraryDependencies += "net.mobilize.snowpark-extensions" % "snowparkextensions" % "0.0.18"
...

Make sure to use the Snowpark version that best meets your project's requirements.

Scenario 2

Input

Below is an example of the dependencies section of a gradle project configuration file.

dependencies {
    implementation group: 'org.apache.spark', name: 'spark-core_2.13', version: '3.5.3'
    implementation group: 'org.apache.spark', name: 'spark-sql_2.13', version: '3.5.3'
    ...
}

Output

The SMA adds the EWI SPRKSCL1161 to the issues inventory since the Spark version is not supported and keeps the output the same.

dependencies {
    implementation group: 'org.apache.spark', name: 'spark-core_2.13', version: '3.5.3'
    implementation group: 'org.apache.spark', name: 'spark-sql_2.13', version: '3.5.3'
    ...
}

Recommended fix

Manually remove the Spark dependencies and add the Snowpark and Snowpark Extensions dependencies to the gradle project configuration file.

dependencies {
    implementation 'com.snowflake:snowpark:1.14.2'
    implementation 'net.mobilize.snowpark-extensions:snowparkextensions:0.0.18'
    ...
}

Make sure that the dependency versions match your project's requirements.

Scenario 3

Input

Below is an example of the dependencies section of a pom.xml project configuration file.

<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.13</artifactId>
    <version>3.5.3</version>
  </dependency>

  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.13</artifactId>
    <version>3.5.3</version>
    <scope>compile</scope>
  </dependency>
  ...
</dependencies>

Output

The SMA adds the EWI SPRKSCL1161 to the issues inventory since the Spark version is not supported and keeps the output the same.

<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.13</artifactId>
    <version>3.5.3</version>
  </dependency>

  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.13</artifactId>
    <version>3.5.3</version>
    <scope>compile</scope>
  </dependency>
  ...
</dependencies>

Recommended fix

Manually remove the Spark dependencies and add the Snowpark and Snowpark Extensions dependencies to the pom.xml project configuration file.

<dependencies>
  <dependency>
    <groupId>com.snowflake</groupId>
    <artifactId>snowpark</artifactId>
    <version>1.14.2</version>
  </dependency>

  <dependency>
    <groupId>net.mobilize.snowpark-extensions</groupId>
    <artifactId>snowparkextensions</artifactId>
    <version>0.0.18</version>
  </dependency>
  ...
</dependencies>

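Make sure that the dependency versions match your project's requirements.

Additional recommendations

  • Make sure that the input has a project configuration file:

    • build.sbt

    • build.gradle

    • pom.xml

  • The Spark version supported by the SMA is 2.12:3.1.2.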

  • You can check the latest Snowpark version here.

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.
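When removing the Spark dependencies by hand, a quick scan of the build file can list the lines that still reference Spark. The helper below is a hypothetical sketch, not an SMA feature: SparkDependencyScan and sparkDependencyLines are illustrative names, and the match is a simple substring test on the org.apache.spark group id, which works for the sbt, gradle, and pom.xml snippets shown above.

```scala
object SparkDependencyScan {
  // Returns the trimmed lines of a build file that mention the Spark group id.
  def sparkDependencyLines(buildFile: String): List[String] =
    buildFile
      .split("\n")
      .toList
      .map(_.trim)
      .filter(_.contains("org.apache.spark"))
}
```

An empty result after the manual edit suggests that no Spark dependency declarations remain in that file.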

SPRKSCL1155

Warning

This issue code has been deprecated since Spark Conversion Core Version 4.3.2

Message: org.apache.spark.sql.functions.countDistinct has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.countDistinct function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.countDistinct function, first used with column names as arguments and then with column objects.

val df = Seq(
  ("Alice", 1),
  ("Bob", 2),
  ("Alice", 3),
  ("Bob", 4),
  ("Alice", 1),
  ("Charlie", 5)
).toDF("name", "value")

val result1 = df.select(countDistinct("name", "value"))
val result2 = df.select(countDistinct(col("name"), col("value")))

Output

The SMA adds the EWI SPRKSCL1155 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Alice", 1),
  ("Bob", 2),
  ("Alice", 3),
  ("Bob", 4),
  ("Alice", 1),
  ("Charlie", 5)
).toDF("name", "value")

/*EWI: SPRKSCL1155 => org.apache.spark.sql.functions.countDistinct has a workaround, see documentation for more info*/
val result1 = df.select(countDistinct("name", "value"))
/*EWI: SPRKSCL1155 => org.apache.spark.sql.functions.countDistinct has a workaround, see documentation for more info*/
val result2 = df.select(countDistinct(col("name"), col("value")))

Recommended fix

As a workaround, you can use the count_distinct function. For the Spark overload that receives string arguments, you additionally have to convert the strings into column objects using the com.snowflake.snowpark.functions.col function.

val df = Seq(
  ("Alice", 1),
  ("Bob", 2),
  ("Alice", 3),
  ("Bob", 4),
  ("Alice", 1),
  ("Charlie", 5)
).toDF("name", "value")

val result1 = df.select(count_distinct(col("name"), col("value")))
val result2 = df.select(count_distinct(col("name"), col("value")))
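The number the query should return for the sample data can be checked with plain Scala collections. This is a minimal sketch for comparison only; CountDistinctCheck is a hypothetical name, and the sample data has no NULL values, so NULL handling is not covered.

```scala
object CountDistinctCheck {
  // Same sample rows as the example above: (name, value).
  val rows: Seq[(String, Int)] = Seq(
    ("Alice", 1),
    ("Bob", 2),
    ("Alice", 3),
    ("Bob", 4),
    ("Alice", 1),
    ("Charlie", 5)
  )

  // Number of distinct (name, value) combinations.
  def distinctPairs: Int = rows.distinct.size
}
```

Only the repeated ("Alice", 1) pair collapses, so five of the six rows are distinct.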

Additional recommendations

SPRKSCL1104

This issue code has been deprecated

Message: Spark Session builder option is not supported.

Category: Conversion error.

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.SparkSession.Builder.config function, which sets an option of the Spark Session and is not supported by Snowpark.

Scenario

Input

Below is an example of the org.apache.spark.sql.SparkSession.Builder.config function used to set an option in the Spark Session.

val spark = SparkSession.builder()
           .master("local")
           .appName("testApp")
           .config("spark.sql.broadcastTimeout", "3600")
           .getOrCreate()

Output

The SMA adds the EWI SPRKSCL1104 to the output code to let you know that the config method is not supported by Snowpark. As a result, options cannot be set in the Spark Session through the config function, which might affect the migration of the Spark Session statement.

val spark = Session.builder.configFile("connection.properties")
/*EWI: SPRKSCL1104 => SparkBuilder Option is not supported .config("spark.sql.broadcastTimeout", "3600")*/
.create()

Recommended fix

To create the session, you need to add the proper Snowflake Snowpark configuration.

In this example, a configs variable is used.

    val configs = Map (
      "URL" -> "https://<myAccount>.snowflakecomputing.com:<port>",
      "USER" -> "<myUserName>",
      "PASSWORD" -> "<myPassword>",
      "ROLE" -> "<myRole>",
      "WAREHOUSE" -> "<myWarehouse>",
      "DB" -> "<myDatabase>",
      "SCHEMA" -> "<mySchema>"
    )
    val session = Session.builder.configs(configs).create

Using a configFile (profile.properties) with the connection information is also recommended.

## profile.properties file (a text file)
URL = https://<account_identifier>.snowflakecomputing.com
USER = <username>
PRIVATEKEY = <unencrypted_private_key_from_the_private_key_file>
ROLE = <role_name>
WAREHOUSE = <warehouse_name>
DB = <database_name>
SCHEMA = <schema_name>

And with the Session.builder.configFile the session can be created:

val session = Session.builder.configFile("/path/to/properties/file").create
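If you want to inspect or adjust the connection settings before building the session, the properties text can be loaded into a Map first. The sketch below is an illustration under stated assumptions: ProfilePropertiesSketch and parse are hypothetical names, and the code uses only java.util.Properties from the JDK, so it does not open a Snowflake connection.

```scala
import java.io.StringReader
import java.util.Properties

object ProfilePropertiesSketch {
  // Parses properties-file text into an immutable Map[String, String].
  def parse(text: String): Map[String, String] = {
    val props = new Properties()
    props.load(new StringReader(text))
    val names = props.stringPropertyNames().iterator()
    var result = Map.empty[String, String]
    while (names.hasNext) {
      val key = names.next()
      result += key -> props.getProperty(key)
    }
    result
  }
}
```

A map produced this way has the same shape as the configs variable shown earlier, so it could be passed to Session.builder.configs.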

Additional recommendations

SPRKSCL1124

Message: org.apache.spark.sql.functions.cosh has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.cosh function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.cosh function, first used with a column name as an argument and then with a column object.

val df = Seq(0.0, 1.0, 2.0, -1.0).toDF("value")
val result1 = df.withColumn("cosh_value", cosh("value"))
val result2 = df.withColumn("cosh_value", cosh(col("value")))

Output

The SMA adds the EWI SPRKSCL1124 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(0.0, 1.0, 2.0, -1.0).toDF("value")
/*EWI: SPRKSCL1124 => org.apache.spark.sql.functions.cosh has a workaround, see documentation for more info*/
val result1 = df.withColumn("cosh_value", cosh("value"))
/*EWI: SPRKSCL1124 => org.apache.spark.sql.functions.cosh has a workaround, see documentation for more info*/
val result2 = df.withColumn("cosh_value", cosh(col("value")))

Recommended fix

Snowpark has an equivalent cosh function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(0.0, 1.0, 2.0, -1.0).toDF("value")
val result1 = df.withColumn("cosh_value", cosh(col("value")))
val result2 = df.withColumn("cosh_value", cosh(col("value")))

Additional recommendations

SPRKSCL1175

Message: The two-parameter udf function is not supported in Snowpark. It should be converted into a single-parameter udf function. Please check the documentation to learn how to manually modify the code to make it work in Snowpark.

Category: Conversion error.

Description

This issue appears when the SMA detects a use of the two-parameter org.apache.spark.sql.functions.udf function in the source code. Because Snowpark does not have an equivalent two-parameter udf function, the output code might not compile.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.udf function that generates this EWI. In this example, the udf function has two parameters.

val myFuncUdf = udf(new UDF1[String, Integer] {
  override def call(s: String): Integer = s.length()
}, IntegerType)

Output

The SMA adds the EWI SPRKSCL1175 to the output code to let you know that the udf function is not supported, because it has two parameters.

/*EWI: SPRKSCL1175 => The two-parameter udf function is not supported in Snowpark. It should be converted into a single-parameter udf function. Please check the documentation to learn how to manually modify the code to make it work in Snowpark.*/
val myFuncUdf = udf(new UDF1[String, Integer] {
  override def call(s: String): Integer = s.length()
}, IntegerType)

Recommended fix

Snowpark only supports the single-parameter udf function (without the return type parameter), so you should convert your two-parameter udf function into a single-parameter udf function in order to make it work in Snowpark.

For example, for the sample code mentioned above, you would have to convert it manually like this:

val myFuncUdf = udf((s: String) => s.length())

Please note that there are some caveats about creating UDFs in Snowpark that might require additional manual changes to your code. For more details, see the other recommendations related to creating single-parameter udf functions in Snowpark.
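The conversion amounts to moving the body of the call method into a plain Scala function value. The sketch below shows that step in isolation, without Spark or Snowpark on the classpath: UDF1Like is a hypothetical stand-in for Spark's UDF1 interface, and UdfLambdaSketch, legacy, and asLambda are illustrative names.

```scala
object UdfLambdaSketch {
  // Hypothetical stand-in for Spark's org.apache.spark.sql.api.java.UDF1 interface.
  trait UDF1Like[T, R] { def call(t: T): R }

  // Two-parameter style: the logic lives in an object with a call method.
  val legacy: UDF1Like[String, Integer] = new UDF1Like[String, Integer] {
    override def call(s: String): Integer = s.length()
  }

  // Single-parameter style: the same logic as a plain function value,
  // which is the kind of argument the single-parameter udf form receives.
  val asLambda: String => Int = (s: String) => s.length()
}
```

Both forms return the same length for the same input, which is a simple way to confirm that the logic was carried over unchanged.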

Additional recommendations

  • To learn more about how to create user-defined functions in Snowpark, please refer to the following documentation: Creating User-Defined Functions (UDFs) for DataFrames in Scala

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.

SPRKSCL1001

Message: This code section has parsing errors. The parsing error was found at: line *line number*, column *column number*. When trying to parse *statement*. This file was not converted, so it is expected to still have references to the Spark API.

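Category: Parsing error.

Description

This issue appears when the SMA detects a statement in the code of a file that it cannot read or understand correctly, which is called a parsing error. The issue is reported when a file has one or more parsing errors.

Scenario

Input

Below is an example of invalid Scala code.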

/#/(%$"$%

Class myClass {

    def function1() = { 1 }

}

Output

The SMA adds the EWI SPRKSCL1001 to the output code to let you know that the code of the file has parsing errors. Therefore, the SMA is not able to process a file with this error.

// **********************************************************************************************************************
// EWI: SPRKSCL1001 => This code section has parsing errors
// The parsing error was found at: line 0, column 0. When trying to parse ''.
// This file was not converted, so it is expected to still have references to the Spark API
// **********************************************************************************************************************
/#/(%$"$%

Class myClass {

    def function1() = { 1 }

}

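Recommended fix

Since the message pinpoints the erroring statement, you can identify the invalid syntax and either remove it or comment out the statement to avoid the parsing error. The two snippets below show both options, and they also use the lowercase class keyword that Scala requires.

class myClass {

    def function1() = { 1 }

}

// /#/(%$"$%

class myClass {

    def function1() = { 1 }

}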

Additional recommendations

SPRKSCL1141

Message: org.apache.spark.sql.functions.stddev_pop has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.stddev_pop function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.stddev_pop function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("Alice", 23),
  ("Bob", 30),
  ("Carol", 27),
  ("David", 25),
).toDF("name", "age")

val result1 = df.select(stddev_pop("age"))
val result2 = df.select(stddev_pop(col("age")))

Output

The SMA adds the EWI SPRKSCL1141 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Alice", 23),
  ("Bob", 30),
  ("Carol", 27),
  ("David", 25),
).toDF("name", "age")

/*EWI: SPRKSCL1141 => org.apache.spark.sql.functions.stddev_pop has a workaround, see documentation for more info*/
val result1 = df.select(stddev_pop("age"))
/*EWI: SPRKSCL1141 => org.apache.spark.sql.functions.stddev_pop has a workaround, see documentation for more info*/
val result2 = df.select(stddev_pop(col("age")))

Recommended fix

Snowpark has an equivalent stddev_pop function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("Alice", 23),
  ("Bob", 30),
  ("Carol", 27),
  ("David", 25),
).toDF("name", "age")

val result1 = df.select(stddev_pop(col("age")))
val result2 = df.select(stddev_pop(col("age")))
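For a local reference value, the population standard deviation of the sample ages can be computed directly from its definition. This is a minimal sketch for comparison only; StddevPopCheck and stddevPop are hypothetical names, and the result should be compared with a tolerance because it is a floating-point number.

```scala
object StddevPopCheck {
  // Same ages as the example above.
  val ages: Seq[Double] = Seq(23.0, 30.0, 27.0, 25.0)

  // Population standard deviation: divide the squared deviations by n, not n - 1.
  def stddevPop(xs: Seq[Double]): Double = {
    val mean = xs.sum / xs.size
    val variance = xs.map(x => math.pow(x - mean, 2)).sum / xs.size
    math.sqrt(variance)
  }
}
```

For the sample ages the mean is 26.25 and the population variance is 6.6875, so the expected value is roughly 2.586.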

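Additional recommendations

SPRKSCL1110

Note

This issue code has been deprecated

Message: Reader method not supported *method name*.

Category: Warning

Description

This issue appears when the SMA detects a method in the DataFrameReader method chaining that is not supported by Snowflake. This might affect the migration of the reader statement.

Scenario

Input

Below is an example of a DataFrameReader method chain where the load method is not supported by Snowflake.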

spark.read.
    format("net.snowflake.spark.snowflake").
    option("query", s"select * from $tablename").
    load()

Output

The SMA adds the EWI SPRKSCL1110 to the output code to let you know that the load method is not supported by Snowpark, which might affect the migration of the reader statement.

session.sql(s"select * from $tablename")
/*EWI: SPRKSCL1110 => Reader method not supported .load()*/

Recommended fix

Check the Snowpark documentation for the reader to learn which methods are supported by Snowflake.

Additional recommendations

SPRKSCL1100

This issue code has been deprecated since Spark Conversion Core 2.3.22

Message: Repartition is not supported.

Category: Parsing error.

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.DataFrame.repartition function, which is not supported by Snowpark. Snowflake manages the storage and the workload on the clusters, which makes the repartition operation inapplicable.

Scenario

Input

Below is an example of the org.apache.spark.sql.DataFrame.repartition function used to return a new DataFrame partitioned by the given partitioning expressions.

    var nameData = Seq("James", "Sarah", "Dylan", "Leila", "Laura", "Peter")
    var jobData = Seq("Police", "Doctor", "Actor", "Teacher", "Dentist", "Fireman")
    var ageData = Seq(40, 38, 34, 27, 29, 55)

    val dfName = nameData.toDF("name")
    val dfJob = jobData.toDF("job")
    val dfAge = ageData.toDF("age")

    val dfRepartitionByExpresion = dfName.repartition($"name")

    val dfRepartitionByNumber = dfJob.repartition(3)

    val dfRepartitionByBoth = dfAge.repartition(3, $"age")

    val joinedDf = dfRepartitionByExpresion.join(dfRepartitionByNumber)

Output

The SMA adds the EWI SPRKSCL1100 to the output code to let you know that this function is not supported by Snowpark.

    var nameData = Seq("James", "Sarah", "Dylan", "Leila", "Laura", "Peter")
    var jobData = Seq("Police", "Doctor", "Actor", "Teacher", "Dentist", "Fireman")
    var ageData = Seq(40, 38, 34, 27, 29, 55)

    val dfName = nameData.toDF("name")
    val dfJob = jobData.toDF("job")
    val dfAge = ageData.toDF("age")

    /*EWI: SPRKSCL1100 => Repartition is not supported*/
    val dfRepartitionByExpresion = dfName.repartition($"name")

    /*EWI: SPRKSCL1100 => Repartition is not supported*/
    val dfRepartitionByNumber = dfJob.repartition(3)

    /*EWI: SPRKSCL1100 => Repartition is not supported*/
    val dfRepartitionByBoth = dfAge.repartition(3, $"age")

    val joinedDf = dfRepartitionByExpresion.join(dfRepartitionByNumber)

Recommended fix

Since Snowflake manages the storage and the workload on the clusters, the repartition operation is not applicable. This means that repartitioning before a join is not needed at all.

    var nameData = Seq("James", "Sarah", "Dylan", "Leila", "Laura", "Peter")
    var jobData = Seq("Police", "Doctor", "Actor", "Teacher", "Dentist", "Fireman")
    var ageData = Seq(40, 38, 34, 27, 29, 55)

    val dfName = nameData.toDF("name")
    val dfJob = jobData.toDF("job")
    val dfAge = ageData.toDF("age")

    val dfRepartitionByExpresion = dfName

    val dfRepartitionByNumber = dfJob

    val dfRepartitionByBoth = dfAge

    val joinedDf = dfRepartitionByExpresion.join(dfRepartitionByNumber)

Additional recommendations

SPRKSCL1151

Message: org.apache.spark.sql.functions.var_samp has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.var_samp function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.var_samp function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("A", 10),
  ("A", 20),
  ("A", 30),
  ("B", 40),
  ("B", 50),
  ("B", 60)
).toDF("category", "value")

val result1 = df.groupBy("category").agg(var_samp("value"))
val result2 = df.groupBy("category").agg(var_samp(col("value")))

Output

The SMA adds the EWI SPRKSCL1151 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("A", 10),
  ("A", 20),
  ("A", 30),
  ("B", 40),
  ("B", 50),
  ("B", 60)
).toDF("category", "value")

/*EWI: SPRKSCL1151 => org.apache.spark.sql.functions.var_samp has a workaround, see documentation for more info*/
val result1 = df.groupBy("category").agg(var_samp("value"))
/*EWI: SPRKSCL1151 => org.apache.spark.sql.functions.var_samp has a workaround, see documentation for more info*/
val result2 = df.groupBy("category").agg(var_samp(col("value")))

Recommended fix

Snowpark has an equivalent var_samp function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("A", 10),
  ("A", 20),
  ("A", 30),
  ("B", 40),
  ("B", 50),
  ("B", 60)
).toDF("category", "value")

val result1 = df.groupBy("category").agg(var_samp(col("value")))
val result2 = df.groupBy("category").agg(var_samp(col("value")))
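
For reference, var_samp returns the unbiased sample variance for each group: the sum of squared deviations from the mean, divided by n - 1. The sketch below is plain Scala with no Snowpark dependency, and the object and method names are illustrative assumptions rather than part of either API. It reproduces the arithmetic for the example data so that a migrated result can be sanity-checked.

```scala
// Illustrative helper, not part of Spark or Snowpark.
object VarianceSketch {
  // Sample variance: sum of squared deviations from the mean, divided by (n - 1).
  def sampleVariance(xs: Seq[Double]): Double = {
    require(xs.size > 1, "sample variance needs at least two values")
    val mean = xs.sum / xs.size
    xs.map(x => (x - mean) * (x - mean)).sum / (xs.size - 1)
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq(("A", 10), ("A", 20), ("A", 30), ("B", 40), ("B", 50), ("B", 60))
    // Group by category, mirroring df.groupBy("category").agg(var_samp(col("value"))).
    val byCategory = rows.groupBy(_._1).map { case (category, group) =>
      category -> sampleVariance(group.map(_._2.toDouble))
    }
    byCategory.toSeq.sortBy(_._1).foreach { case (k, v) => println(s"$k -> $v") }
  }
}
```

For the group A values (10, 20, 30) the mean is 20, the squared deviations sum to 200, and dividing by n - 1 = 2 gives 100.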

Additional recommendations


SPRKSCL1165

Message: Reader format on DataFrameReader method chaining can't be defined

Category: Warning

Description

This issue appears when the SMA detects that the reader format in a DataFrameReader method chain is not one of the formats supported by Snowpark: avro, csv, json, orc, parquet, and xml. As a result, the SMA cannot determine whether the options being set are defined.

Scenario

Input

Below is an example of DataFrameReader method chaining whose reader format is not one of the formats listed above.

spark.read.format("net.snowflake.spark.snowflake")
                 .option("query", s"select * from $tableName")
                 .load()

Output

The SMA adds the EWI SPRKSCL1165 to the output code to let you know that the format of the reader cannot be determined in the given DataFrameReader method chain.

/*EWI: SPRKSCL1165 => Reader format on DataFrameReader method chaining can't be defined*/
spark.read.option("query", s"select * from $tableName")
                 .load()

Recommended fix

Check the Snowpark documentation for more information about the supported reader formats.

Additional recommendations

SPRKSCL1134

Message: org.apache.spark.sql.functions.log has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.log function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.log function that generates this EWI.

val df = Seq(10.0, 20.0, 30.0, 40.0).toDF("value")
val result1 = df.withColumn("log_value", log(10, "value"))
val result2 = df.withColumn("log_value", log(10, col("value")))
val result3 = df.withColumn("log_value", log("value"))
val result4 = df.withColumn("log_value", log(col("value")))

Output

The SMA adds the EWI SPRKSCL1134 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(10.0, 20.0, 30.0, 40.0).toDF("value")
/*EWI: SPRKSCL1134 => org.apache.spark.sql.functions.log has a workaround, see documentation for more info*/
val result1 = df.withColumn("log_value", log(10, "value"))
/*EWI: SPRKSCL1134 => org.apache.spark.sql.functions.log has a workaround, see documentation for more info*/
val result2 = df.withColumn("log_value", log(10, col("value")))
/*EWI: SPRKSCL1134 => org.apache.spark.sql.functions.log has a workaround, see documentation for more info*/
val result3 = df.withColumn("log_value", log("value"))
/*EWI: SPRKSCL1134 => org.apache.spark.sql.functions.log has a workaround, see documentation for more info*/
val result4 = df.withColumn("log_value", log(col("value")))

Recommended fix

Below are the workarounds for each overload of the log function.

1. def log(base: Double, columnName: String): Column

You can convert the base into a column object using the com.snowflake.snowpark.functions.lit function and convert the column name into a column object using the com.snowflake.snowpark.functions.col function.

val result1 = df.withColumn("log_value", log(lit(10), col("value")))

2. def log(base: Double, a: Column): Column

You can convert the base into a column object using the com.snowflake.snowpark.functions.lit function.

val result2 = df.withColumn("log_value", log(lit(10), col("value")))

3. def log(columnName: String): Column

You can pass lit(Math.E) as the first argument and convert the column name into a column object using the com.snowflake.snowpark.functions.col function and pass it as the second argument.

val result3 = df.withColumn("log_value", log(lit(Math.E), col("value")))

4. def log(e: Column): Column

You can pass lit(Math.E) as the first argument and the column object as the second argument.

val result4 = df.withColumn("log_value", log(lit(Math.E), col("value")))
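
Overloads 3 and 4 pass lit(Math.E) because a logarithm with base e is the natural logarithm, which is what Spark's single-argument log computes. The plain-Scala sketch below (illustrative names, no Snowpark dependency) checks this using the change-of-base identity log_b(x) = ln(x) / ln(b).

```scala
// Illustrative helper, not part of Spark or Snowpark.
object LogBaseSketch {
  // Change-of-base identity: log_b(x) = ln(x) / ln(b).
  def logBase(base: Double, x: Double): Double = math.log(x) / math.log(base)

  def main(args: Array[String]): Unit = {
    Seq(10.0, 20.0, 30.0, 40.0).foreach { v =>
      val natural = math.log(v)         // natural logarithm, as in Spark's log(col)
      val viaBaseE = logBase(math.E, v) // the value log(lit(Math.E), col) expresses
      println(f"$v%.1f ln=$natural%.6f log_e=$viaBaseE%.6f")
    }
  }
}
```

The two columns printed for each value should agree up to floating-point rounding.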

Additional recommendations

SPRKSCL1125

Warning

This issue code has been deprecated since Spark Conversion Core 2.9.0

Message: org.apache.spark.sql.functions.count has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.count function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.count function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("Alice", "Math"),
  ("Bob", "Science"),
  ("Alice", "Science"),
  ("Bob", null)
).toDF("name", "subject")

val result1 = df.groupBy("name").agg(count("subject").as("subject_count"))
val result2 = df.groupBy("name").agg(count(col("subject")).as("subject_count"))

Output

The SMA adds the EWI SPRKSCL1125 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Alice", "Math"),
  ("Bob", "Science"),
  ("Alice", "Science"),
  ("Bob", null)
).toDF("name", "subject")

/*EWI: SPRKSCL1125 => org.apache.spark.sql.functions.count has a workaround, see documentation for more info*/
val result1 = df.groupBy("name").agg(count("subject").as("subject_count"))
/*EWI: SPRKSCL1125 => org.apache.spark.sql.functions.count has a workaround, see documentation for more info*/
val result2 = df.groupBy("name").agg(count(col("subject")).as("subject_count"))

Recommended fix

Snowpark has an equivalent count function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("Alice", "Math"),
  ("Bob", "Science"),
  ("Alice", "Science"),
  ("Bob", null)
).toDF("name", "subject")

val result1 = df.groupBy("name").agg(count(col("subject")).as("subject_count"))
val result2 = df.groupBy("name").agg(count(col("subject")).as("subject_count"))
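
Note that count over a column counts only non-null values, which is why the null subject for Bob matters in this example. The plain-Scala sketch below (illustrative names, no Snowpark dependency) models SQL NULL as None to show the expected per-name counts.

```scala
// Illustrative helper, not part of Spark or Snowpark.
object CountNonNullSketch {
  // Count non-null subjects per name; None stands in for SQL NULL.
  def subjectCounts(rows: Seq[(String, Option[String])]): Map[String, Int] =
    rows.groupBy(_._1).map { case (name, group) =>
      name -> group.count(_._2.isDefined)
    }

  def main(args: Array[String]): Unit = {
    val rows: Seq[(String, Option[String])] = Seq(
      ("Alice", Some("Math")),
      ("Bob", Some("Science")),
      ("Alice", Some("Science")),
      ("Bob", None)
    )
    subjectCounts(rows).toSeq.sortBy(_._1).foreach { case (n, c) => println(s"$n -> $c") }
  }
}
```

Alice has two non-null subjects and Bob has one, so the null row does not contribute to Bob's count.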

Additional recommendations

SPRKSCL1174

Message: The single-parameter udf function is supported in Snowpark but it might require manual intervention. Please check the documentation to learn how to manually modify the code to make it work in Snowpark.

Category: Warning

Description

This issue appears when the SMA detects a use of the single-parameter org.apache.spark.sql.functions.udf function in the code. In that case, manual intervention might be required.

The Snowpark API provides an equivalent com.snowflake.snowpark.functions.udf function that allows you to create a user-defined function from a lambda or function in Scala. However, there are some caveats about creating a udf in Snowpark that might require manual changes to your code for it to work properly.

Scenario

The Snowpark udf function should work as intended for a wide range of cases without requiring manual intervention. However, there are some scenarios that require you to manually modify your code to get it to work in Snowpark. Some of those scenarios are listed below:

Scenario 1

Input

Below is an example of creating UDFs in an object that extends the App trait.

Scala's App trait simplifies creating executable programs by providing a main method that automatically runs the code within the object definition. Extending App delays the initialization of the fields until the main method is executed, which can affect UDF definitions if they rely on initialized fields. This means that if an object extends App and the udf references an object field, the udf definition uploaded to Snowflake will not include the initialized value of the field. This can result in null values being returned by the udf.

For example, in the following code the variable myValue will resolve to null in the udf definition:

object Main extends App {
  ...
  val myValue = 10
  val myUdf = udf((x: Int) => x + myValue) // myValue in the `udf` definition will resolve to null
  ...
}

Output

The SMA adds the EWI SPRKSCL1174 to the output code to let you know that the single-parameter udf function is supported in Snowpark but it requires manual intervention.

object Main extends App {
  ...
  val myValue = 10
  /*EWI: SPRKSCL1174 => The single-parameter udf function is supported in Snowpark but it might require manual intervention. Please check the documentation to learn how to manually modify the code to make it work in Snowpark.*/
  val myUdf = udf((x: Int) => x + myValue) // myValue in the `udf` definition will resolve to null
  ...
}

Recommended fix

To avoid this issue, it is recommended not to extend App and instead to implement a separate main method for your code. This ensures that object fields are initialized before udf definitions are created and uploaded to Snowflake.

object Main {
  ...
  def main(args: Array[String]): Unit = {
    val myValue = 10
    val myUdf = udf((x: Int) => x + myValue)
  }
  ...
}

For more details about this topic, see Caveat About Creating UDFs in an Object With the App Trait.
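
The initialization-order effect can be observed in plain Scala without Snowpark. The sketch below uses hypothetical object names and assumes Scala 2.x, where App relies on delayed initialization; under Scala 3 the first read may already see the initialized value.

```scala
// Hypothetical objects for illustration only; no Snowpark dependency.
object WithApp extends App {
  val myValue = 10 // under Scala 2, assigned only when main runs
}

object WithMain {
  val myValue = 10 // assigned when the object is first accessed
}

object AppInitSketch {
  def main(args: Array[String]): Unit = {
    // Read before WithApp.main has run: under Scala 2 this is typically the
    // JVM default for the field type rather than the assigned value.
    println(s"WithApp before main: ${WithApp.myValue}")
    WithApp.main(Array.empty[String])
    println(s"WithApp after main:  ${WithApp.myValue}")
    println(s"WithMain:            ${WithMain.myValue}")
  }
}
```

A closure that captures the field of the App-based object before main has run sees the uninitialized value, which is the situation the recommended fix avoids.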

Scenario 2

Input

Below is an example of creating UDFs in Jupyter Notebooks.

def myFunc(s: String): String = {
  ...
}

val myFuncUdf = udf((x: String) => myFunc(x))
df1.select(myFuncUdf(col("name"))).show()

Output

The SMA adds the EWI SPRKSCL1174 to the output code to let you know that the single-parameter udf function is supported in Snowpark but it requires manual intervention.

def myFunc(s: String): String = {
  ...
}

/*EWI: SPRKSCL1174 => The single-parameter udf function is supported in Snowpark but it might require manual intervention. Please check the documentation to learn how to manually modify the code to make it work in Snowpark.*/
val myFuncUdf = udf((x: String) => myFunc(x))
df1.select(myFuncUdf(col("name"))).show()

Recommended fix

To create a udf in a Jupyter Notebook, you should define the implementation of your function in a class that extends Serializable. For example, you should manually convert it into this:

object ConvertedUdfFuncs extends Serializable {
  def myFunc(s: String): String = {
    ...
  }

  val myFuncAsLambda = ((x: String) => ConvertedUdfFuncs.myFunc(x))
}

val myFuncUdf = udf(ConvertedUdfFuncs.myFuncAsLambda)
df1.select(myFuncUdf(col("name"))).show()

For more details about how to create UDFs in Jupyter Notebooks, see Creating UDFs in Jupyter Notebooks.

Additional recommendations

SPRKSCL1000

Message: Source project spark-core version is *version number*, the spark-core version supported by snowpark is 2.12:3.1.2 so there may be functional differences between the existing mappings

Category: Warning

Description

This issue appears when the SMA detects a version of spark-core that is not supported by the SMA. Therefore, there may be functional differences between the existing mappings, and the output might behave unexpectedly.

Additional recommendations

  • The version of spark-core supported by the SMA is 2.12:3.1.2. Consider changing the version of your source code.

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.

SPRKSCL1140

Message: org.apache.spark.sql.functions.stddev has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.stddev function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.stddev function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Charlie", 20),
  ("David", 25),
).toDF("name", "score")

val result1 = df.select(stddev("score"))
val result2 = df.select(stddev(col("score")))

Output

The SMA adds the EWI SPRKSCL1140 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Charlie", 20),
  ("David", 25),
).toDF("name", "score")

/*EWI: SPRKSCL1140 => org.apache.spark.sql.functions.stddev has a workaround, see documentation for more info*/
val result1 = df.select(stddev("score"))
/*EWI: SPRKSCL1140 => org.apache.spark.sql.functions.stddev has a workaround, see documentation for more info*/
val result2 = df.select(stddev(col("score")))

Recommended fix

Snowpark has an equivalent stddev function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("Alice", 10),
  ("Bob", 15),
  ("Charlie", 20),
  ("David", 25),
).toDF("name", "score")

val result1 = df.select(stddev(col("score")))
val result2 = df.select(stddev(col("score")))
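
In Spark, stddev is an alias for the sample standard deviation (stddev_samp). The plain-Scala sketch below (illustrative names, no Snowpark dependency) reproduces the arithmetic for the example scores so that a migrated result can be sanity-checked.

```scala
// Illustrative helper, not part of Spark or Snowpark.
object StddevSketch {
  // Sample standard deviation: square root of the sum of squared deviations
  // from the mean, divided by (n - 1).
  def sampleStddev(xs: Seq[Double]): Double = {
    require(xs.size > 1, "sample standard deviation needs at least two values")
    val mean = xs.sum / xs.size
    math.sqrt(xs.map(x => (x - mean) * (x - mean)).sum / (xs.size - 1))
  }

  def main(args: Array[String]): Unit = {
    val scores = Seq(10.0, 15.0, 20.0, 25.0)
    println(sampleStddev(scores))
  }
}
```

For the scores 10, 15, 20, and 25 the mean is 17.5 and the squared deviations sum to 125, so the result is the square root of 125 / 3.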

Additional recommendations

SPRKSCL1111

Note

This issue code has been deprecated

Message: CreateDecimalType is not supported.

Category: Conversion error

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.types.DataTypes.CreateDecimalType function.

Scenario

Input

Below is an example of the DataTypes.CreateDecimalType function.

var result = DataTypes.createDecimalType(18, 8)

Output

The SMA adds the EWI SPRKSCL1111 to the output code to let you know that the CreateDecimalType function is not supported by Snowpark.

/*EWI: SPRKSCL1111 => CreateDecimalType is not supported*/
var result = createDecimalType(18, 8)

Recommended fix

There is no recommended fix yet.

SPRKSCL1104

Message: Spark Session builder option is not supported.

Category: Conversion error

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.SparkSession.Builder.config function, which sets an option of the Spark Session and is not supported by Snowpark.

Scenario

Input

Below is an example of the org.apache.spark.sql.SparkSession.Builder.config function used to set an option in the Spark Session.

val spark = SparkSession.builder()
           .master("local")
           .appName("testApp")
           .config("spark.sql.broadcastTimeout", "3600")
           .getOrCreate()

Output

The SMA adds the EWI SPRKSCL1104 to the output code to let you know that the config method is not supported by Snowpark. As a result, it is not possible to set options in the Spark Session via the config function, and this might affect the migration of the Spark Session statement.

val spark = Session.builder.configFile("connection.properties")
/*EWI: SPRKSCL1104 => SparkBuilder Option is not supported .config("spark.sql.broadcastTimeout", "3600")*/
.create()

Recommended fix

To create the session, you need to add the proper Snowflake Snowpark configuration.

In this example, a configs variable is used.

    // Replace each <placeholder> with your own connection value.
    val configs = Map (
      "URL" -> "https://<myAccount>.snowflakecomputing.com:<port>",
      "USER" -> "<myUserName>",
      "PASSWORD" -> "<myPassword>",
      "ROLE" -> "<myRole>",
      "WAREHOUSE" -> "<myWarehouse>",
      "DB" -> "<myDatabase>",
      "SCHEMA" -> "<mySchema>"
    )
    val session = Session.builder.configs(configs).create

Using a configFile (profile.properties) that holds the connection information is also recommended.

## profile.properties file (a text file)
URL = https://<account_identifier>.snowflakecomputing.com
USER = <username>
PRIVATEKEY = <unencrypted_private_key_from_the_private_key_file>
ROLE = <role_name>
WAREHOUSE = <warehouse_name>
DB = <database_name>
SCHEMA = <schema_name>

And with the Session.builder.configFile the session can be created:

val session = Session.builder.configFile("/path/to/properties/file").create
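
If the connection settings need to be inspected or adjusted before the session is built, the same properties format can be read into a Map[String, String], which is the shape passed to Session.builder.configs in the earlier example. The sketch below is plain Scala using java.util.Properties; the object name, helper name, and sample values are assumptions for illustration, and it does not open a connection.

```scala
import java.io.StringReader
import java.util.Properties

// Illustrative helper, not part of Snowpark.
object ProfileSketch {
  // Parse properties-format text into an immutable Map of trimmed values.
  def parseProfile(text: String): Map[String, String] = {
    val props = new Properties()
    props.load(new StringReader(text))
    val names = props.stringPropertyNames().iterator()
    val builder = Map.newBuilder[String, String]
    while (names.hasNext) {
      val key = names.next()
      builder += (key -> props.getProperty(key).trim)
    }
    builder.result()
  }

  def main(args: Array[String]): Unit = {
    // Placeholder values only; real credentials should not be hard-coded.
    val sample =
      """## profile.properties file (a text file)
        |URL = https://<account_identifier>.snowflakecomputing.com
        |ROLE = <role_name>
        |""".stripMargin
    val configs = parseProfile(sample)
    configs.toSeq.sortBy(_._1).foreach { case (k, v) => println(s"$k=$v") }
    // With Snowpark on the classpath, a map like this could be passed to
    // Session.builder.configs(configs).create
  }
}
```

Lines starting with # are treated as comments by the properties format, so the header line of the file is ignored.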

Additional recommendations

SPRKSCL1101

This issue code has been deprecated since Spark Conversion Core 2.3.22

Message: Broadcast is not supported

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.broadcast function, which is not supported by Snowpark. This function is not supported because Snowflake does not support broadcast variables.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.broadcast function used to create a broadcast object to use on each Spark cluster:

    var studentData = Seq(
      ("James", "Orozco", "Science"),
      ("Andrea", "Larson", "Bussiness"),
    )

    var collegeData = Seq(
      ("Arts", 1),
      ("Bussiness", 2),
      ("Science", 3)
    )

    val dfStudent = studentData.toDF("FirstName", "LastName", "CollegeName")
    val dfCollege = collegeData.toDF("CollegeName", "CollegeCode")

    dfStudent.join(
      broadcast(dfCollege),
      Seq("CollegeName")
    )

Output

The SMA adds the EWI SPRKSCL1101 to the output code to let you know that this function is not supported by Snowpark.

    var studentData = Seq(
      ("James", "Orozco", "Science"),
      ("Andrea", "Larson", "Bussiness"),
    )

    var collegeData = Seq(
      ("Arts", 1),
      ("Bussiness", 2),
      ("Science", 3)
    )

    val dfStudent = studentData.toDF("FirstName", "LastName", "CollegeName")
    val dfCollege = collegeData.toDF("CollegeName", "CollegeCode")

    dfStudent.join(
      /*EWI: SPRKSCL1101 => Broadcast is not supported*/
      broadcast(dfCollege),
      Seq("CollegeName")
    )

Recommended fix

Snowflake manages the storage and the workload on the clusters, so broadcast objects do not apply. This means that broadcast might not be required at all, but each case requires further analysis.

The recommended approach is to replace a Spark broadcast dataframe with a regular Snowpark dataframe, or to use a dataframe method such as join.

For the proposed input, the fix is to adapt the join to use the dataframe dfCollege directly, without wrapping it in broadcast.

    var studentData = Seq(
      ("James", "Orozco", "Science"),
      ("Andrea", "Larson", "Bussiness"),
    )

    var collegeData = Seq(
      ("Arts", 1),
      ("Bussiness", 2),
      ("Science", 3)
    )

    val dfStudent = studentData.toDF("FirstName", "LastName", "CollegeName")
    val dfCollege = collegeData.toDF("CollegeName", "CollegeCode")

    dfStudent.join(
      dfCollege,
      Seq("CollegeName")
    ).show()

Additional recommendations

SPRKSCL1150

Message: org.apache.spark.sql.functions.var_pop has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.var_pop function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.var_pop function, first used with a column name as an argument and then with a column object.

val df = Seq(
  ("A", 10.0),
  ("A", 20.0),
  ("A", 30.0),
  ("B", 40.0),
  ("B", 50.0),
  ("B", 60.0)
).toDF("group", "value")

val result1 = df.groupBy("group").agg(var_pop("value"))
val result2 = df.groupBy("group").agg(var_pop(col("value")))

Output

The SMA adds the EWI SPRKSCL1150 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(
  ("A", 10.0),
  ("A", 20.0),
  ("A", 30.0),
  ("B", 40.0),
  ("B", 50.0),
  ("B", 60.0)
).toDF("group", "value")

/*EWI: SPRKSCL1150 => org.apache.spark.sql.functions.var_pop has a workaround, see documentation for more info*/
val result1 = df.groupBy("group").agg(var_pop("value"))
/*EWI: SPRKSCL1150 => org.apache.spark.sql.functions.var_pop has a workaround, see documentation for more info*/
val result2 = df.groupBy("group").agg(var_pop(col("value")))

Recommended fix

Snowpark has an equivalent var_pop function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(
  ("A", 10.0),
  ("A", 20.0),
  ("A", 30.0),
  ("B", 40.0),
  ("B", 50.0),
  ("B", 60.0)
).toDF("group", "value")

val result1 = df.groupBy("group").agg(var_pop(col("value")))
val result2 = df.groupBy("group").agg(var_pop(col("value")))

Additional recommendations


SPRKSCL1164

Note

This issue code has been deprecated

Message: The parameter is not defined for org.apache.spark.sql.DataFrameReader.option

Category: Warning

Description

This issue appears when the SMA detects that the given parameter of org.apache.spark.sql.DataFrameReader.option is not defined.

Scenario

Input

Below is an example of an undefined parameter for the org.apache.spark.sql.DataFrameReader.option function.

spark.read.option("header", True).json(path)

Output

The SMA adds the EWI SPRKSCL1164 to the output code to let you know that the given parameter of the org.apache.spark.sql.DataFrameReader.option function is not defined.

/*EWI: SPRKSCL1164 => The parameter header=True is not supported for org.apache.spark.sql.DataFrameReader.option*/
spark.read.option("header", True).json(path)

Recommended fix

Check the Snowpark documentation on reader format options to identify which options are defined.

Additional recommendations

SPRKSCL1135

Warning

This issue code has been deprecated since Spark Conversion Core 4.3.2

Message: org.apache.spark.sql.functions.mean has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.mean function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.mean function, first used with a column name as an argument and then with a column object.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
val result1 = df.select(mean("value"))
val result2 = df.select(mean(col("value")))

Output

The SMA adds the EWI SPRKSCL1135 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
/*EWI: SPRKSCL1135 => org.apache.spark.sql.functions.mean has a workaround, see documentation for more info*/
val result1 = df.select(mean("value"))
/*EWI: SPRKSCL1135 => org.apache.spark.sql.functions.mean has a workaround, see documentation for more info*/
val result2 = df.select(mean(col("value")))

Recommended fix

Snowpark has an equivalent mean function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(1, 3, 10, 1, 3).toDF("value")
val result1 = df.select(mean(col("value")))
val result2 = df.select(mean(col("value")))

Additional recommendations

SPRKSCL1115

Warning

This issue code has been deprecated since Spark Conversion Core Version 4.6.0

Message: org.apache.spark.sql.functions.round has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.round function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.round function that generates this EWI.

val df = Seq(3.9876, 5.673, 8.1234).toDF("value")
val result1 = df.withColumn("rounded_value", round(col("value")))
val result2 = df.withColumn("rounded_value", round(col("value"), 2))

Output

The SMA adds the EWI SPRKSCL1115 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(3.9876, 5.673, 8.1234).toDF("value")
/*EWI: SPRKSCL1115 => org.apache.spark.sql.functions.round has a workaround, see documentation for more info*/
val result1 = df.withColumn("rounded_value", round(col("value")))
/*EWI: SPRKSCL1115 => org.apache.spark.sql.functions.round has a workaround, see documentation for more info*/
val result2 = df.withColumn("rounded_value", round(col("value"), 2))

Recommended fix

Snowpark has an equivalent round function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a column object and a scale, you can convert the scale into a column object using the com.snowflake.snowpark.functions.lit function as a workaround.

val df = Seq(3.9876, 5.673, 8.1234).toDF("value")
val result1 = df.withColumn("rounded_value", round(col("value")))
val result2 = df.withColumn("rounded_value", round(col("value"), lit(2)))

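Additional recommendations

SPRKSCL1144

Message: The symbol table could not be loaded

Category: Parsing error

Description

This issue appears when there is a critical error in the SMA execution process. Because the symbol table cannot be loaded, the SMA cannot start the assessment or conversion process.

Additional recommendations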

  • This is unlikely to be an error in the source code itself, but rather is an error in how the SMA processes the source code. The best resolution would be to post an issue in the SMA.

  • For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.

SPRKSCL1170

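Note

This issue code has been deprecated

Message: The sparkConfig member key is not supported with a platform-specific key.

Category: Conversion error

Description

If you are using an older version, upgrade to the latest version.

Additional recommendations

SPRKSCL1121

Message: org.apache.spark.sql.functions.atan has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.atan function, which has a workaround.

Scenario

Input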

Below is an example of the org.apache.spark.sql.functions.atan function, first used with a column name as an argument and then with a column object.

val df = Seq(1.0, 0.5, -1.0).toDF("value")
val result1 = df.withColumn("atan_value", atan("value"))
val result2 = df.withColumn("atan_value", atan(col("value")))

Output

The SMA adds the EWI SPRKSCL1121 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(1.0, 0.5, -1.0).toDF("value")
/*EWI: SPRKSCL1121 => org.apache.spark.sql.functions.atan has a workaround, see documentation for more info*/
val result1 = df.withColumn("atan_value", atan("value"))
/*EWI: SPRKSCL1121 => org.apache.spark.sql.functions.atan has a workaround, see documentation for more info*/
val result2 = df.withColumn("atan_value", atan(col("value")))

Recommended fix

Snowpark has an equivalent atan function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(1.0, 0.5, -1.0).toDF("value")
val result1 = df.withColumn("atan_value", atan(col("value")))
val result2 = df.withColumn("atan_value", atan(col("value")))

Additional recommendations

SPRKSCL1131

Message: org.apache.spark.sql.functions.grouping has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.grouping function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.grouping function, first used with a column name as an argument and then with a column object.

val df = Seq(("Alice", 2), ("Bob", 5)).toDF("name", "age")
val result1 = df.cube("name").agg(grouping("name"), sum("age"))
val result2 = df.cube("name").agg(grouping(col("name")), sum("age"))

Output

The SMA adds the EWI SPRKSCL1131 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(("Alice", 2), ("Bob", 5)).toDF("name", "age")
/*EWI: SPRKSCL1131 => org.apache.spark.sql.functions.grouping has a workaround, see documentation for more info*/
val result1 = df.cube("name").agg(grouping("name"), sum("age"))
/*EWI: SPRKSCL1131 => org.apache.spark.sql.functions.grouping has a workaround, see documentation for more info*/
val result2 = df.cube("name").agg(grouping(col("name")), sum("age"))

Recommended fix

Snowpark has an equivalent grouping function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq(("Alice", 2), ("Bob", 5)).toDF("name", "age")
val result1 = df.cube("name").agg(grouping(col("name")), sum("age"))
val result2 = df.cube("name").agg(grouping(col("name")), sum("age"))

Additional recommendations

SPRKSCL1160

Note

This issue code has been deprecated since Spark Conversion Core 4.1.0

Message: org.apache.spark.sql.functions.sum has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.sum function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.sum function that generates this EWI. In this example, the sum function is used to calculate the sum of the selected column.

val df = Seq("1", "2", "3", "4", "5").toDF("elements")
val result1 = sum(col("elements"))
val result2 = sum("elements")

Output

The SMA adds the EWI SPRKSCL1160 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq("1", "2", "3", "4", "5").toDF("elements")
/*EWI: SPRKSCL1160 => org.apache.spark.sql.functions.sum has a workaround, see documentation for more info*/
val result1 = sum(col("elements"))
/*EWI: SPRKSCL1160 => org.apache.spark.sql.functions.sum has a workaround, see documentation for more info*/
val result2 = sum("elements")

Recommended fix

Snowpark has an equivalent sum function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

val df = Seq("1", "2", "3", "4", "5").toDF("elements")
val result1 = sum(col("elements"))
val result2 = sum(col("elements"))

Additional recommendations

SPRKSCL1154

Message: org.apache.spark.sql.functions.ceil has a workaround, see documentation for more info

Category: Warning

Description

This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.ceil function, which has a workaround.

Scenario

Input

Below is an example of the org.apache.spark.sql.functions.ceil function, first used with a column name as an argument, then with a column object and finally with a column object and a scale.

val df = Seq(2.33, 3.88, 4.11, 5.99).toDF("value")
val result1 = df.withColumn("ceil", ceil("value"))
val result2 = df.withColumn("ceil", ceil(col("value")))
val result3 = df.withColumn("ceil", ceil(col("value"), lit(1)))

Output

The SMA adds the EWI SPRKSCL1154 to the output code to let you know that this function is not fully supported by Snowpark, but it has a workaround.

val df = Seq(2.33, 3.88, 4.11, 5.99).toDF("value")
/*EWI: SPRKSCL1154 => org.apache.spark.sql.functions.ceil has a workaround, see documentation for more info*/
val result1 = df.withColumn("ceil", ceil("value"))
/*EWI: SPRKSCL1154 => org.apache.spark.sql.functions.ceil has a workaround, see documentation for more info*/
val result2 = df.withColumn("ceil", ceil(col("value")))
/*EWI: SPRKSCL1154 => org.apache.spark.sql.functions.ceil has a workaround, see documentation for more info*/
val result3 = df.withColumn("ceil", ceil(col("value"), lit(1)))

Recommended fix

Snowpark has an equivalent ceil function that receives a column object as an argument. For that reason, the Spark overload that receives a column object as an argument is directly supported by Snowpark and does not require any changes.

For the overload that receives a string argument, you can convert the string into a column object using the com.snowflake.snowpark.functions.col function as a workaround.

For the overload that receives a column object and a scale, you can use the callBuiltin function to invoke the Snowflake built-in CEIL function. To use it, pass the string "ceil" as the first argument, the column as the second argument, and the scale as the third argument.

val df = Seq(2.33, 3.88, 4.11, 5.99).toDF("value")
val result1 = df.withColumn("ceil", ceil(col("value")))
val result2 = df.withColumn("ceil", ceil(col("value")))
val result3 = df.withColumn("ceil", callBuiltin("ceil", col("value"), lit(1)))
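To make the effect of the scale argument concrete, here is a small local illustration in plain Scala. It is not Snowflake's implementation and does not call Snowpark; it only assumes that CEIL with a scale rounds toward positive infinity while keeping that many digits after the decimal point. The names CeilWithScale and ceilWithScale are hypothetical.

```scala
import scala.math.BigDecimal.RoundingMode

object CeilWithScale {
  // Rounds `value` toward positive infinity, keeping `scale` digits
  // after the decimal point.
  def ceilWithScale(value: Double, scale: Int): BigDecimal =
    BigDecimal(value).setScale(scale, RoundingMode.CEILING)

  def main(args: Array[String]): Unit = {
    // Same sample values as the "value" column in the example.
    val values = Seq(2.33, 3.88, 4.11, 5.99)
    values.foreach(v => println(s"$v -> ${ceilWithScale(v, 1)}"))
  }
}
```

Under this assumption, a scale of 1 turns 2.33 into 2.4 and 5.99 into 6.0, which is the kind of result to expect from result3.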

Additional recommendations

SPRKSCL1105

This issue code has been deprecated

Message: Writer format value is not supported.

Category: Conversion error

Description

This issue appears when org.apache.spark.sql.DataFrameWriter.format is called with an argument that is not supported by Snowpark.

Scenario

There are several scenarios, depending on whether the format you are trying to save is supported or not.

Scenario 1

Input

The tool analyzes the format you are trying to save. The supported formats are:

  • csv

  • json

  • orc

  • parquet

  • text

    dfWrite.write.format("csv").save(path)

Output

The tool transforms the format method into a csv method call when the save function has a single parameter.

    dfWrite.write.csv(path)

Recommended fix

In this case the tool does not show the EWI, so no fix is needed.

Scenario 2

Input

The example below shows how the tool handles the format method when it is passed the net.snowflake.spark.snowflake value.

dfWrite.write.format("net.snowflake.spark.snowflake").save(path)

Output

The tool shows the EWI SPRKSCL1105 indicating that the value net.snowflake.spark.snowflake is not supported.

/*EWI: SPRKSCL1105 => Writer format value is not supported .format("net.snowflake.spark.snowflake")*/
dfWrite.write.format("net.snowflake.spark.snowflake").save(path)

Recommended fix

For unsupported formats there is no specific fix, since it depends on the files being written.

Scenario 3

Input

The example below shows how the tool handles the format method when the csv value is passed through a variable instead of a literal.

val myFormat = "csv"
dfWrite.write.format(myFormat).save(path)

Output

Because the tool cannot determine the value of the variable at runtime, it adds the EWI SPRKSCL1163, indicating that the format value is not a literal and cannot be evaluated.

val myFormat = "csv"
/*EWI: SPRKSCL1163 => format_type is not a literal and can't be evaluated*/
dfWrite.write.format(myFormat).save(path)

Recommended fix

As a workaround, check the value of the variable and pass it to the format call as a string literal.
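A minimal sketch of that workaround follows. The comments show the call before and after the variable is replaced with its literal value, and the helper checks a candidate value against the formats listed as supported in Scenario 1. WriterFormatCheck and isSupportedLiteral are hypothetical names, and the exact, lowercase match is an assumption rather than documented SMA behavior.

```scala
object WriterFormatCheck {
  // Formats listed as supported in Scenario 1.
  val supportedFormats: Set[String] = Set("csv", "json", "orc", "parquet", "text")

  // Hypothetical helper: true when the value found in the variable is one
  // of the supported formats. Assumes an exact, lowercase match.
  def isSupportedLiteral(format: String): Boolean =
    supportedFormats.contains(format)
}

// Before (flagged because the argument is a variable):
//   val myFormat = "csv"
//   dfWrite.write.format(myFormat).save(path)
//
// After (the value is confirmed and written as a literal):
//   dfWrite.write.format("csv").save(path)
```

Once the literal is in place, the call matches the shape shown in Scenario 1.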

Additional recommendations