Snowflake Connector for Spark¶
The Snowflake Connector for Spark (“Spark connector”) brings Snowflake into the Apache Spark ecosystem, enabling Spark to read data from, and write data to, Snowflake. From Spark’s perspective, Snowflake looks similar to other Spark data sources (PostgreSQL, HDFS, S3, etc.).
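For example, once the connector and the Snowflake JDBC driver are available to your Spark application, a Snowflake table can be read into a DataFrame much like any other data source. The following is a minimal sketch only; the account URL, credentials, database, schema, warehouse, and table name are placeholders, not real values:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("snowflake-read-example").getOrCreate()

    // Connection options for the Snowflake source; every value here is a placeholder.
    val sfOptions = Map(
      "sfURL"       -> "myaccount.snowflakecomputing.com",
      "sfUser"      -> "MY_USER",
      "sfPassword"  -> "MY_PASSWORD",
      "sfDatabase"  -> "MY_DATABASE",
      "sfSchema"    -> "PUBLIC",
      "sfWarehouse" -> "MY_WAREHOUSE"
    )

    // Read a Snowflake table into a Spark DataFrame.
    val df = spark.read
      .format("net.snowflake.spark.snowflake")
      .options(sfOptions)
      .option("dbtable", "MY_TABLE")
      .load()

    df.show()

Writing a DataFrame back to Snowflake follows the same pattern, using df.write with the same source name and options.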
Note
As an alternative to using Spark, consider writing your code to use the Snowpark API instead. Snowpark lets you perform all of your work within Snowflake (rather than in a separate Spark compute cluster), and it supports pushdown of all operations, including Snowflake UDFs.
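As a rough illustration of that alternative (the connection parameters, table, and column names below are placeholders, not values from this page), a Snowpark Scala session runs equivalent work inside Snowflake:

    import com.snowflake.snowpark._
    import com.snowflake.snowpark.functions._

    // Build a session; every configuration value here is a placeholder.
    val session = Session.builder.configs(Map(
      "URL"       -> "https://myaccount.snowflakecomputing.com",
      "USER"      -> "MY_USER",
      "PASSWORD"  -> "MY_PASSWORD",
      "WAREHOUSE" -> "MY_WAREHOUSE",
      "DB"        -> "MY_DATABASE",
      "SCHEMA"    -> "PUBLIC"
    )).create

    // The filter and aggregation are pushed down and executed in Snowflake,
    // not in a separate Spark compute cluster.
    val df = session.table("MY_TABLE")
      .filter(col("AMOUNT") > 100)
      .groupBy(col("REGION"))
      .agg(sum(col("AMOUNT")).as("TOTAL"))

    df.show()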
Snowflake supports multiple versions of the Spark connector:

Spark Connector 2.x: Spark versions 3.2, 3.3, and 3.4. There is a separate version of the Snowflake connector for each version of Spark; use the connector version that matches your version of Spark.

Spark Connector 3.x: Spark versions 3.2, 3.3, 3.4, and 3.5. Each Spark Connector 3 package supports most versions of Spark.
The connector runs as a Spark plugin and is provided as a Spark package (spark-snowflake).
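For Scala builds, the dependency can be declared roughly as follows. The coordinates follow the pattern published on Maven Central for the connector and the Snowflake JDBC driver; the version strings are placeholders, and the release you choose must match your Spark and Scala versions (the package can also be supplied at runtime via spark-shell or spark-submit with --packages):

    // build.sbt (version values are placeholders)
    libraryDependencies ++= Seq(
      "net.snowflake" %% "spark-snowflake" % "<connector version>",
      "net.snowflake" %  "snowflake-jdbc"  % "<jdbc driver version>"
    )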