Snowpark Migration Accelerator: Walkthrough Setup

This guide offers practical experience with the Snowpark Migration Accelerator (SMA). Through real-world examples, you will learn how to evaluate code and interpret assessment results, giving you a clear understanding of the tool’s capabilities.

Materials

To complete this tutorial, you will need the following on the same computer:

  • The Snowpark Migration Accelerator (SMA) application, installed

  • The sample code files

The sections below explain how to obtain both.

SMA Application

The Snowpark Migration Accelerator (SMA) helps developers convert their PySpark and Spark Scala applications to run on Snowflake. It automatically detects Spark API calls in your Python or Scala code and transforms them into equivalent Snowpark API calls. This guide will demonstrate basic SMA functionality by analyzing sample Spark code and showing how it assists with migration projects.

During the initial assessment phase, Snowpark Migration Accelerator (SMA) examines your source code and builds a detailed model that captures all the functionality in your code. Based on this analysis, SMA creates several reports, including a detailed assessment report that we’ll review in this walkthrough. These reports help you understand how ready your code is for migration to Snowpark and estimate the effort needed for the transition. We’ll look at these findings in more detail as we continue through this lab.

Download and Installation

To begin an assessment with the Snowpark Migration Accelerator (SMA), you only need to install the application. Snowflake offers optional training on using the SMA, but it is not required, and no access code is needed. Simply:

  1. Visit our Download and Access section

  2. Download the installer

  3. Follow our Installation instructions to set up the application on your computer

Sample Codebase

This guide uses Python code examples to demonstrate the migration process. We have selected two publicly available sample codebases from third-party Git repositories as unbiased, real-world examples. You can access these codebases at:

To prepare the codebases for analysis with the Snowpark Migration Accelerator (SMA), follow these steps:

  1. Download the codebases as zip files from GitHub. You can find instructions on how to do this in the GitHub documentation.

  2. Create separate folders on your computer for each codebase.

  3. Extract each zip file into its designated folder, as shown in the image below:

Directory with Codebases
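
If you prefer to script the extraction rather than unzip by hand, the steps above can be sketched in Python as follows. This is a minimal sketch using only the standard library. The function name `extract_codebases` and the folder layout (one folder per archive, named after the zip file) are assumptions for illustration, not part of SMA.

```python
import zipfile
from pathlib import Path


def extract_codebases(zip_paths, target_root):
    """Extract each zip archive into its own folder under target_root."""
    target_root = Path(target_root)
    extracted = []
    for zip_path in map(Path, zip_paths):
        # One folder per codebase, named after the archive (without ".zip").
        destination = target_root / zip_path.stem
        destination.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(destination)
        extracted.append(destination)
    return extracted
```

Calling `extract_codebases` with the paths of the two downloaded zip files and a target folder of your choice returns the list of folders that can then be selected as SMA input directories.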

These sample codebases demonstrate how SMA evaluates Spark API references to calculate the Spark API Readiness Score. Let’s look at two scenarios:

  1. A codebase that received a high score, indicating it is highly compatible with Snowpark and ready for migration

  2. A codebase that received a low score, indicating it requires additional review and potential modifications before migration

While the readiness score provides valuable insight, it should not be the only factor considered when planning a migration. A comprehensive evaluation of all aspects is necessary for both high and low scoring assessments to ensure a successful migration.
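
To make the idea of a readiness score concrete, here is a small sketch. It assumes the score is the percentage of Spark API references found in the code that are supported in Snowpark. The function name and the handling of the zero-reference case are illustrative assumptions; the SMA reports remain the authoritative source for the actual value.

```python
def readiness_score(supported_references, total_references):
    """Percentage of Spark API references that are supported in Snowpark.

    Assumption: score = supported / total * 100. This mirrors the idea
    described above and is not SMA's internal implementation.
    """
    if total_references == 0:
        # No Spark API references were found, so a score is not meaningful.
        return None
    return 100.0 * supported_references / total_references
```

Under this assumption, a codebase with 90 supported references out of 100 would score 90.0, while one with only a handful of supported references would score low and warrant closer review.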

Once the codebases are extracted, SMA analyzes only files in supported code and notebook formats, checking them for references to the Spark API and other third-party APIs. The supported file types are listed in the SMA documentation.
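
Before running SMA, it can help to get a rough picture of what an extracted codebase contains. The sketch below counts files by extension and flags those that mention `pyspark`. The extension set is an illustrative assumption, not SMA's official list of supported file types, and the plain text search is far cruder than SMA's own analysis.

```python
from collections import Counter
from pathlib import Path

# Assumption: an illustrative subset of extensions, not SMA's official list.
CANDIDATE_EXTENSIONS = {".py", ".ipynb"}


def summarize_candidate_files(codebase_dir):
    """Count candidate files by extension and list those mentioning pyspark."""
    counts = Counter()
    mentions_spark = []
    for path in sorted(Path(codebase_dir).rglob("*")):
        suffix = path.suffix.lower()
        if not path.is_file() or suffix not in CANDIDATE_EXTENSIONS:
            continue
        counts[suffix] += 1
        # Plain substring search; this only hints at Spark usage.
        text = path.read_text(encoding="utf-8", errors="ignore")
        if "pyspark" in text:
            mentions_spark.append(path)
    return counts, mentions_spark
```

The returned counts and file list are only a quick sanity check that the extraction worked and that the folder contains Spark code; the SMA assessment reports provide the actual inventory.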

Throughout the rest of this walkthrough, we will examine how SMA assesses these two codebases.

Support

For help with installation or to get access to the code, please email sma-support@snowflake.com.


After downloading and unzipping the codebases into separate directories, you can either: