Snowpark Migration Accelerator: Deploying the Output Code¶
To run the output code generated by the Snowpark Migration Accelerator (SMA), follow these environment-specific recommendations based on your source platform.
Spark Scala¶
Before executing your migrated Apache Spark code in Snowpark, please review these important considerations:
Add snowpark and snowpark extensions library reference¶
The migrated project must include references to both the Snowpark library and its extensions.
Snowpark Extensions¶
Snowpark Extensions is a library that extends the standard Snowpark library with Apache Spark features that are not currently available in Snowpark, making it easier to migrate projects from Apache Spark to Snowpark.
Follow these steps to reference Snowpark and Snowpark Extensions libraries in your migrated code:
1. Add the Snowpark library reference to your project.
2. Add the Snowpark Extensions library reference to your project.
3. Update your code to use these libraries.
Step 1 - Add snowpark and snowpark extensions library references to the project configuration file¶
The tool automatically adds these dependencies to your project configuration file. After the dependencies are added, your build tool will handle resolving them.
Based on the file extension of your project configuration file, the tool automatically adds the appropriate references in the following way:
build.gradle¶
dependencies {
    implementation 'com.snowflake:snowpark:1.6.2'
    implementation 'net.mobilize.snowpark-extensions:snowparkextensions:0.0.9'
    ...
}
build.sbt¶
...
libraryDependencies += "com.snowflake" % "snowpark" % "1.6.2"
libraryDependencies += "net.mobilize.snowpark-extensions" % "snowparkextensions" % "0.0.9"
...
pom.xml¶
<dependencies>
    <dependency>
        <groupId>com.snowflake</groupId>
        <artifactId>snowpark</artifactId>
        <version>1.6.2</version>
    </dependency>
    <dependency>
        <groupId>net.mobilize.snowpark-extensions</groupId>
        <artifactId>snowparkextensions</artifactId>
        <version>0.0.9</version>
    </dependency>
    ...
</dependencies>
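Once these references are in place, the dependencies are resolved as part of a normal build; for example, gradle build, sbt compile, or mvn compile, depending on your build tool.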
Step 2 - Add snowpark extensions library import statements¶
The tool automatically adds these two import statements to every generated .scala file.
import com.snowflake.snowpark_extensions.Extensions._
import com.snowflake.snowpark_extensions.Extensions.functions._
Code example¶
The code below uses the hex and isin functions, which are native to Spark but not to Snowpark. The code still executes successfully because these functions are provided by the Snowpark Extensions library.
Input code¶
package com.mobilize.spark

import org.apache.spark.sql._
import org.apache.spark.sql.functions._

object Main {
  def main(args: Array[String]): Unit = {
    var languageArray = Array("Java");
    var languageHex = hex(col("language"));
    col("language").isin(languageArray: _*);
  }
}
Output code¶
package com.mobilize.spark

import com.snowflake.snowpark._
import com.snowflake.snowpark.functions._
import com.snowflake.snowpark_extensions.Extensions._
import com.snowflake.snowpark_extensions.Extensions.functions._

object Main {
  def main(args: Array[String]): Unit = {
    var languageArray = Array("Java");
    // hex does not exist in Snowpark. It is an extension.
    var languageHex = hex(col("language"));
    // isin does not exist in Snowpark. It is an extension.
    col("language").isin(languageArray: _*)
  }
}
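Note that the migrated program still needs a Snowpark Session before any DataFrame operations can run against your account. The sketch below is a minimal, hypothetical setup: the configuration keys are standard Snowpark connection properties, but the placeholder values must be replaced with your own account settings.

import com.snowflake.snowpark._

object SessionSetup {
  def main(args: Array[String]): Unit = {
    // Hypothetical placeholder values; replace with your account settings.
    val session = Session.builder.configs(Map(
      "URL" -> "https://<account_identifier>.snowflakecomputing.com",
      "USER" -> "<user>",
      "PASSWORD" -> "<password>",
      "ROLE" -> "<role>",
      "WAREHOUSE" -> "<warehouse>",
      "DB" -> "<database>",
      "SCHEMA" -> "<schema>"
    )).create

    // The migrated code can now run its DataFrame operations on this session.
    session.close()
  }
}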
PySpark¶
Before running your migrated PySpark code in Snowpark, please review these important considerations:
Install snowpark and snowpark extensions libraries¶
The migrated project must include references to both the Snowpark library and its extensions.
Snowpark Extensions¶
Snowpark Extensions is a library that extends the standard Snowpark library with PySpark features that are not currently available in Snowpark, making it easier to migrate projects from PySpark to Snowpark.
Follow these steps to reference Snowpark and Snowpark Extensions libraries in your migrated code:
1. Install the Snowpark library.
2. Install the Snowpark Extensions library.
3. Add the Snowpark Extensions import statement to your migrated code.
Step 1 - Install snowpark library¶
pip install snowflake-snowpark-python
Step 2 - Install snowpark extensions library¶
pip install snowpark-extensions
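To verify that both packages are available in the active environment, both of the following imports should succeed:

# Both imports should succeed if the installation worked.
import snowflake.snowpark
import snowpark_extensions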
Step 3 - Add snowpark extensions library import statements¶
The tool automatically adds the snowpark_extensions import statement to every file that requires PySpark functionality.
import snowpark_extensions
Code example¶
The create_map function is native to PySpark but is not available in Snowpark; it is provided through the Snowpark Extensions library, so the migrated code runs correctly without further modification.
Input code¶
import pyspark.sql.functions as F

# df is an existing Spark DataFrame
df.select(F.create_map('name', 'age').alias("map")).collect()
Output code¶
import snowpark_extensions
import snowflake.snowpark.functions as F

# df is an existing Snowpark DataFrame
df.select(F.create_map('name', 'age').alias("map")).collect()
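As in the Scala case, the migrated script needs a Snowpark Session before any DataFrame calls can run. The sketch below is minimal and hypothetical: the connection parameters are placeholders, and the sample DataFrame stands in for whatever data the migrated workload actually reads.

import snowpark_extensions
from snowflake.snowpark import Session
import snowflake.snowpark.functions as F

# Hypothetical placeholder values; replace with your account settings.
connection_parameters = {
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "role": "<role>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}

session = Session.builder.configs(connection_parameters).create()

# Illustrative sample data standing in for the migrated workload's input.
df = session.create_dataframe([("John", 30)], schema=["name", "age"])
df.select(F.create_map('name', 'age').alias("map")).collect()

session.close()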