Setting Up the Scala REPL for Snowpark Scala

This topic explains how to set up the Scala REPL for Snowpark.

Installing the Scala REPL

The Scala REPL (read-eval-print loop) is provided with the Scala build tool. To install the supported version of the Scala build tool, find the version that you plan to use, and follow the installation instructions.

Running the Scala REPL

To use the Snowpark library in the Scala REPL:

  1. If you have not already done so, download the Snowpark library archive file and extract the contents of the file.

  2. Start the REPL by running the run.sh shell script provided in the archive file:

cd <path>/snowpark-1.14.0
./run.sh

The run.sh script does the following:

  • Adds the Snowpark library and dependencies to the classpath.

  • Creates a <path>/snowpark-1.14.0/repl_classes/ directory for the classes generated by the Scala REPL.

  • Preloads the preload.scala file, which imports the com.snowflake.snowpark package and the com.snowflake.snowpark.functions object.
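As a rough illustration of what the preload step gives you, the sketch below repeats the two imports described above and then uses names from them without qualification. It assumes the Snowpark library is on the classpath; the `PreloadCheck` object and the `amount` column name are hypothetical and exist only for this example.

```scala
// The imports that preload.scala is described as providing.
import com.snowflake.snowpark._
import com.snowflake.snowpark.functions._

object PreloadCheck {
  // `col` and `lit` resolve through the functions object, and `Column`
  // through the snowpark package. "amount" is a hypothetical column name.
  val doubled: Column = col("amount") * lit(2)
}
```

In a REPL started with run.sh, the import lines should be unnecessary, because the preload has already brought these names into scope.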

If you are using a different REPL for Scala:

  1. Add the Snowpark library JAR file and dependencies to the classpath.

    • The Snowpark library JAR file is in the top-level directory of the extracted TAR/ZIP archive file.

    • The dependencies are in the lib directory of the extracted TAR/ZIP archive file.

  2. Create a temporary directory for the classes generated by the REPL, and configure the REPL to generate classes in that directory.

Later, when defining inline user-defined functions (UDFs), you’ll need to specify the directory for the REPL classes as a dependency.
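A minimal sketch of that later step, assuming you already have a Snowpark `Session` and that `replClassDir` is the directory you configured for the REPL classes. The `ReplDependency` object and its `register` method are hypothetical names used only for illustration; the underlying call is `Session.addDependency`.

```scala
import com.snowflake.snowpark._

object ReplDependency {
  // Hypothetical helper: tells Snowpark to include the directory of
  // REPL-generated classes when it uploads dependencies for inline UDFs.
  def register(session: Session, replClassDir: String): Unit =
    session.addDependency(replClassDir)
}
```

For the REPL started by run.sh, the directory would be `<path>/snowpark-1.14.0/repl_classes/`; for a different REPL, pass whichever directory you created in step 2.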

Verifying Your Scala REPL Configuration

To verify that you have configured the Scala REPL to use Snowpark, run a simple example of Snowpark code.

  1. In the directory containing the files extracted from the .zip / .tar.gz file (that is, the directory containing the run.sh script), create a Main.scala file that contains the code below:

    import com.snowflake.snowpark._
    import com.snowflake.snowpark.functions._
    
    object Main {
      def main(args: Array[String]): Unit = {
        // Replace the <placeholders> below.
        val configs = Map(
          "URL" -> "https://<account_identifier>.snowflakecomputing.com:443",
          "USER" -> "<user name>",
          "PASSWORD" -> "<password>",
          "ROLE" -> "<role name>",
          "WAREHOUSE" -> "<warehouse name>",
          "DB" -> "<database name>",
          "SCHEMA" -> "<schema name>"
        )
        val session = Session.builder.configs(configs).create
        session.sql("show tables").show()
      }
    }
    

    Note the following:

    • Replace the <placeholders> with values that you use to connect to Snowflake.

    • For <account_identifier>, specify your account identifier.

    • If you prefer to use key pair authentication:

      • Replace PASSWORD with PRIVATE_KEY_FILE, and set it to the path to your private key file.

      • If the private key is encrypted, you must set PRIVATE_KEY_FILE_PWD to the passphrase for decrypting the private key.

      As an alternative to setting PRIVATE_KEY_FILE and PRIVATE_KEY_FILE_PWD, you can set the PRIVATEKEY property to the string value of the unencrypted private key from the private key file.

      • For example, if your private key file is unencrypted, set PRIVATEKEY to the value of the key in the file (without the -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY----- header and footer and without the line endings).

      • Note that if the private key is encrypted, you must decrypt the key before setting it as the value of the PRIVATEKEY property.

  2. From within the directory, run the run.sh script to start the Scala REPL with the settings needed for the Snowpark library:

    ./run.sh
    
  3. In the Scala REPL shell, enter the following command to load the sample file that you just created:

    :load Main.scala
    
  4. Run the following statement to execute the main method of the Main object that you loaded:

    Main.main(Array[String]())
    

    This runs the SHOW TABLES command and prints out the first 10 rows of the results.
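If you use key pair authentication instead of a password, the notes in step 1 can be sketched as two small helpers that work on the same kind of `Map` passed to `Session.builder.configs`. This is only an illustration: the `KeyPairConfig` object and the `withKeyFile` and `pemBody` names are hypothetical and are not part of Snowpark. Only the property names (PASSWORD, PRIVATE_KEY_FILE, PRIVATE_KEY_FILE_PWD, PRIVATEKEY) come from the notes above.

```scala
object KeyPairConfig {
  // Replace password authentication with a private key file.
  // Pass a passphrase only when the key file is encrypted.
  def withKeyFile(
      configs: Map[String, String],
      keyFile: String,
      passphrase: Option[String] = None
  ): Map[String, String] = {
    val base = (configs - "PASSWORD") + ("PRIVATE_KEY_FILE" -> keyFile)
    passphrase.fold(base)(p => base + ("PRIVATE_KEY_FILE_PWD" -> p))
  }

  // Reduce the text of an unencrypted PEM private key to a single-line
  // value: drops the BEGIN/END header and footer and joins the remaining
  // lines without line endings.
  def pemBody(pem: String): String =
    pem
      .split("\\r?\\n")
      .map(_.trim)
      .filterNot(line => line.isEmpty || line.startsWith("-----"))
      .mkString
}
```

With these in scope, `KeyPairConfig.withKeyFile(configs, "<path to key file>")` yields a map for the PRIVATE_KEY_FILE approach, and `(configs - "PASSWORD") + ("PRIVATEKEY" -> KeyPairConfig.pemBody(pemText))` yields one for the PRIVATEKEY approach, where `pemText` is assumed to hold the contents of an unencrypted key file.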