Snowpark Connect for Spark release notes for 2026

Snowflake uses semantic versioning for Snowpark Connect for Spark updates.
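
Semantic versions of the form MAJOR.MINOR.PATCH compare numerically per component, not as strings. A minimal sketch of that ordering (the `parse_version` helper below is hypothetical, for illustration only, not part of any Snowflake package):

```python
def parse_version(v):
    """Split a MAJOR.MINOR.PATCH string into a comparable tuple of ints."""
    major, minor, patch = v.split(".")
    return (int(major), int(minor), int(patch))

# Tuples compare component by component, so 1.24.0 sorts after 1.9.0
# even though "1.24.0" < "1.9.0" when compared as plain strings.
assert parse_version("1.24.0") > parse_version("1.9.0")
assert "1.24.0" < "1.9.0"
```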

For documentation, see Snowpark Connect for Apache Spark and Orchestrating Snowpark Connect for Spark workloads.

1.24.0 (April 24, 2026)

Snowpark Connect for Spark

Bug fixes

  • Disable filter_classpath_jars at server startup

  • Support UDT cast-to-string and reject invalid UDT casts

  • Fix DataFrame describe and summary APIs

  • Add SUPPORTED_SCALES guard to skip workloads at unsupported scales

New features

  • Add Scala 2.13 equivalent JARs to dependency packages

  • Add Hive partitioning implementation and limitations reference

  • Remove 29 unused JARs from snowpark_connect_deps packages (~23 MB)

  • Skip explicit structured cast when server supports implicit cast for Parquet

  • Bump Snowpark dependency to 1.50.0

1.23.0 (April 22, 2026)

Snowpark Connect for Spark

Behavior changes

  • Set Parquet useLogicalType default to true

Bug fixes

  • Fix count() to match Spark SQL behavior

  • Relax protobuf version constraint from <6.32.0 to <6.34.0

  • Consistently coerce to unstructured types

  • Replace snowflake.snowpark_connect.includes import with pyspark.sql

  • Always use vectorized Parquet scanner; remove useVectorizedScanner configuration option

  • Fix regexp_extract defaults, inline flags, and PCRE handling

  • Fix SQL operator compatibility gaps

  • Fix IN NULL semantics to match Spark behavior

  • Support named persistent external stage read in XML UDTF

  • Preserve UDT metadata through temp views and toDF renames

  • Use SQL path for catalog table existence checks

  • Allow star expression in the map columns aggregation
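
The IN-with-NULL fix above follows SQL's three-valued logic, which often surprises users: `NULL IN (...)` is NULL, and a non-matching probe against a list containing NULL is also NULL rather than false. A minimal Python emulation of the semantics Spark implements (`sql_in` is a hypothetical helper for illustration, with `None` standing in for SQL NULL):

```python
def sql_in(value, candidates):
    """Emulate SQL three-valued IN semantics (None models SQL NULL).

    Returns True on a match, None when the result is unknown
    (NULL probe, or no match but a NULL in the list), else False.
    """
    if value is None:
        return None  # NULL IN (...) is NULL
    saw_null = False
    for c in candidates:
        if c is None:
            saw_null = True
        elif c == value:
            return True
    # No match: unknown if a NULL could have been the match
    return None if saw_null else False
```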

New features

  • Implement sequence support for timestamp/date and interval types

  • Add CTE session parameter

  • Initialize nullability tracking for columns and complex types

  • Track nullability for built-in functions across multiple expression categories

  • Track nullability in the SET command

  • Add nullability tracking to range

  • Introduce performance regression gate in GitHub Actions

1.22.0 (April 18, 2026)

Snowpark Connect for Spark

Bug fixes

  • Fix CTE-qualified column refs in ORDER BY/WHERE/GROUP BY

  • Fix withColumn on join key after using-style join

  • Fix fillna raising immediately for missing subset column

  • Fix case-sensitive read of internal stage

  • Reduce window function boundary materialization

  • Preserve struct/map/array schema with empty content

  • Support ON_ERROR=CONTINUE for INFER_SCHEMA in CSV and JSON reads

  • Fix hex compile-time type dispatch

  • Avoid redundant temp table creation for read.parquet to saveAsTable

  • Preserve StructType/MapType in strict mode

  • Case-insensitive qualifier comparison in column resolution

  • Use Snowpark builtin for CBRT function

  • Fix XML nullValue and whitespace handling

  • Use Decimal for DecimalType in strict mode

  • Fix map_concat bug

  • Fix unionByName to handle quotes in column names and respect caseSensitive config

  • Remove trailing commas from JSON test resource file
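
The DecimalType fix above matters because binary floats cannot represent most base-10 fractions exactly, which breaks the exact precision and scale that DecimalType promises. Python's standard decimal module shows the difference (an illustration of the motivation, not of the library's internals):

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so their sum drifts away from 0.3 ...
assert 0.1 + 0.2 != 0.3

# ... while Decimal keeps exact base-10 precision and scale,
# which is what DecimalType values require in strict mode.
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
```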

New features

  • Add Snowpark Connect Java client library to support Spark Scala and Java workloads

  • Use native implementation for ARRAY_REPEAT and MAP_ENTRIES

  • Use MAP_ENTRIES in map_cast

  • Reduce number of queries used for VARIANT inference in read_parquet

  • Add cross-request sub-plan cache for map_relation

1.21.1 (April 10, 2026)

Snowpark Connect for Spark

Bug fixes

  • Implement JSON encoding validation

  • Reduce query size for functions that internally rename columns

  • Relax py4j version constraints to allow for broader compatibility

  • Isolate artifacts by spark session

New features

  • Add default application name for session

  • Add JSON date/time format conversion

1.21.0 (April 09, 2026)

Snowpark Connect for Spark

Bug fixes

  • Handle glob metacharacter escaping in CSV/JSON paths

  • Fix JSON non-nullable schema to match Spark behavior

  • Add default column matching case for XML

  • Fix TEXT lineSep with hex encoding for RECORD_DELIMITER

  • Fix Spark XML read from external stage

  • Make empty CSV reads return an empty DataFrame

  • Add default idx to regexp_extract

  • Fix CSV non-nullable schema to match Spark behavior

  • Fix temp stage naming collision under parallel tests

  • Add fast path to regexp functions

  • Fix schema coercion on storeAssignmentPolicy

  • Fix CSV backslash delimiter double-escaping

  • Optimize posexplode

  • Fix CSV lineSep empty validation

  • Fix bug where XML could not read a file from an external stage

  • Reduce default log verbosity for users
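
The regexp_extract fixes above concern which capture group is returned and what happens on no match. A minimal stdlib emulation of the convention Spark follows, assuming group index 1 as the default and an empty string when the pattern does not match (a sketch, not the actual implementation):

```python
import re

def regexp_extract(s, pattern, idx=1):
    """Mimic Spark SQL regexp_extract: return capture group `idx`
    of the first match, or '' when the pattern does not match."""
    m = re.search(pattern, s)
    if m is None:
        return ""
    # A matched-but-empty optional group also yields ''
    return m.group(idx) or ""
```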

New features

  • Added support for DML row counts

  • Support overwrite(condition) for DataFrameWriterV2

  • Iceberg mergeSchema on write — top-level column evolution

  • Added support for partition overwrites in DataFrameWriterV2

  • Add app_name parameter to init_spark_session

1.20.0 (April 03, 2026)

Snowpark Connect for Spark

Bug fixes

  • Fix performance issue

  • Fix merge schema for JSON

  • Fix arrays_zip for complex types

  • Fix LCAs in implicit aggregations

New features

  • Cache result of JSON file format

  • Resolve known types from map_unresolved_function without typer

  • Support Hive partitioning for JSON COPY INTO mode

  • Add SCOS session registration on server initialization

  • Modify warmup query with distinct string for filtering

1.19.0 (March 26, 2026)

Snowpark Connect for Spark

Bug fixes

  • Fix accessing struct field from array via getItem

  • Fix names for accessing array elements

  • Added missing compression for TEXT format

  • Reduce query size in DataFrame.replace, UDTF creation, and read_parquet

  • Emulate types on create [temp] view

  • Fixed casting of structured types

  • Fix text write type validation

  • Support XML read dir in parallel

  • Optimize conv function usage

  • Support both Snowflake and net.snowflake.spark.snowflake format read and write

  • Emulate types on create table

  • Fix accessing nested structs with arrays

  • Fix Parquet error message

  • Optimize to_number reducing query size

  • Fix UDF cache to consider query database change

  • Optimize mask function

  • Pass PATTERN to NVS fallback reader during Parquet schema inference

  • Null and structured type coercion

New features

  • Introduce DIRECTED join hint

  • Integrate XML inferSchema

1.18.0 (March 19, 2026)

Snowpark Connect for Spark

Bug fixes

  • Added missing JDBC Type mapping

  • Support user-provided schema in Parquet

  • Handle invalid UTF-8 characters in JSON gracefully

  • Resolve LCA columns only if actually used

  • Optimize get_json_object query generation

  • Strip semicolon from SQL query

  • Make processInBulk=True the default for JSON reads and fix NullType schema inference

  • Fix bug regarding incorrect stage read

  • Add None check in UDF registration

  • Tighten limit for error message

  • Allow missing fields in user-provided schema

  • JSON and CSV compression inference

  • Fix for coalesce(1) creating a single file

New features

  • Add execute_jar method to launch Java/Scala workloads

Snowpark Submit

Bug fixes

  • Fix error swallowing with --wait-for-completion flag

1.17.0 (March 13, 2026)

Snowpark Connect for Spark

Bug fixes

  • JSON and CSV compression inference.

  • Fix for coalesce creating a single file.

  • Refactor JSON read to use COPY INTO for single-file reads and add VariantType schema inference.

  • Allow JSON loading without explicit schema.

  • Fix multi_line in JSON.

  • Fix JSON infer schema to avoid scanning whole files.

  • Correctly handle casting to TIMESTAMP_LTZ.

  • Clamp the value returned by hash.

  • Fix for repartition with partitionBy.

  • Fix to use [connections.spark-connect] section header in config.toml.

  • Convert Java date/timestamp format tokens to Snowflake equivalents for CSV reads.

  • Calculate schema for pivot functions.

  • Fix UDTFs in aliased lateral join.

  • Align result for SQL SET command.

  • Fix return type for CEIL and FLOOR functions.

  • Improve query generation in unbase64 v2.

  • Fix some option-to-Snowflake mappings for CSV.

  • Fix serialization for POJO.

  • Improve CSV header error messages.

  • Improve MapType detection logic with try_cast for Parquet reads.
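
The config.toml fix above concerns the section header used to look up connection settings. A hypothetical minimal file might look like the following (all values are placeholders, shown only to illustrate the section header, not as a definitive template):

```toml
# ~/.snowflake/config.toml -- illustrative placeholder values only
[connections.spark-connect]
account = "myorg-myaccount"
user = "my_user"
warehouse = "my_warehouse"
database = "my_database"
schema = "public"
```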

New features

  • Support for reduceGroups API.

  • Support specifying connection name inside init_spark_session.

  • Add config param to use UDF for unbase64.

1.16.0 (March 12, 2026)

Snowpark Connect for Spark

Bug fixes

  • Optimize SQL generation in function unbase64.

  • Fix from_json regression

  • Fix for records that span multiple BZ2 compression block boundaries

  • Fix nullability mapping in unresolved attribute

  • Initialize spark-connect session with any connection, not just one named spark-connect

  • Add XML options validation

  • Drop CSV ESCAPE option when it matches the quote character to prevent compilation error

  • Fix incorrect conversion of named tuples in productEncoder

  • Verify mergeSchema for CSV and JSON is not supported

  • Fix Parquet complex type round-trip (write + read)

  • Fix schema for pivot/unpivot

  • Fix return type for MOD and PMOD functions

  • Fix CSV header extraction for files with leading blank lines

  • Test timezones correctly and replace string-based date/time serialization with epoch-based

  • Update Java version check for Windows

  • Flatten nested withColumn calls

  • Change logic for Literal _IntegralType in add/sub operations

  • Return LongType for COUNT functions

  • Read JSON: test compression = bz2/bzip2/none

  • Improve performance of to_varchar/to_char

  • Improve comparisons in I/O testing

  • Set multi_line to False by default for copy JSON

Snowpark Submit

Bug fixes

  • Throw error on unspecified compute pool.

1.15.0 (March 06, 2026)

Snowpark Connect for Spark

Bug fixes

  • Remove result scan when calling df.count()

  • Make sure infer schema runs on limited rows for reading JSON

  • Fix createDataFrame for interval types

  • Change logic for Literal _IntegralType in multiplication and division operations

  • Widen and coerce type for Set operations

  • Fix Neo4j multi-label support

  • Modify JAR metadata so that Grype does not detect Netty vulnerability

  • Return correct type for ANY_VALUE function

  • Return widened type for sequence

  • Add support for config spark.sql.parquet.inferTimestampNTZ.enabled

  • Batch column rename/cast in _validate_schema_and_get_writer

  • Fix JDBC hang when partitioned queries are run with a fetch size

  • Return trimmed exception message when it exceeds the HTTP header limits

  • Fix map_type_to_snowflake_type for BigDecimal

  • Fix literal decimal precision and scale

  • Improve random string generation

  • Make BZ2 compressed JSON loading ignore corrupt records

New features

  • Use staged files from config in Scala UDFs

  • Use permissive TRY_CAST in JSON reading

  • Make the number of server threads configurable

Snowpark Submit

Bug fixes

  • Add init_spark_session() back to testing

  • Update snowpark-submit command line output to clarify snowflake-connection-name is required.

1.14.0 (February 19, 2026)

Snowpark Connect for Spark

Bug fixes

  • Cache table type when running saveAsTable

  • Optimize literal input for substring and type casting for coalesce

  • Handle decimal overflow in avg/mean and fix decimal type coercion

  • Iceberg - Preserve grants on overwrite

  • Standardize SQL passthrough mode

  • Optimize from_utc_timestamp/to_utc_timestamp for literal timezone

  • Handle JSON null values in structured types to match Spark semantics

  • Emulate integral types on creating tables from SQL

  • Fix edge case with mapping nested rows in Scala UDFs

  • Fix how Parquet handles read and write of complex structured datatypes

  • Support save ignore argument for parquet files

  • Add support for artifact repository

  • Fix array nullability in Scala UDxF

  • Fix log1p for arguments in the (-1, 0) range

  • Fix first_value and last_value in aggregate context

  • Fix reading DayTimeIntervalType for Scala client

New features

  • Handle timezones correctly in Scala UDFs

  • Support Java 11 and 17 without any configuration

Snowpark Submit

New features

  • Support snowpark-submit for Python 3.9

  • Enhance init_spark_session to be usable in snowpark-submit workflow

1.13.0 (February 13, 2026)

Snowpark Connect for Spark

Bug fixes

  • Fixed split function issue

  • Downgraded snowflake-snowpark-python dependency to version 1.44

  • Fixed Neo4j dialect matching to improve SQL translation

  • Fixed operation ID returned in execute responses to be consistent

  • Fixed gRPC metadata handling for TCP channel connections

New features

  • Added support for partition_hint in mapPartitions operations

  • Added XML reader support for scenarios with user-defined schemas

1.11.0 (January 28, 2026)

Snowpark Connect for Spark

Bug fixes

  • Preserve hidden columns after various DataFrame operators

  • Fix issues with Scala UDF input types (byte, binary, scala.math.BigDecimal)

Other updates

  • Add snowpark-submit user-defined args to comment

1.10.0 (January 22, 2026)

Snowpark Connect for Spark

Bug fixes

  • Fix config unset error for session configuration.

  • Use copy into to load CSV files in parallel.

  • Fix writes for DataFrames using outer joins.

  • Handle nulls in Scala UDFs.

  • Optimize CTE query generation with parameter protection.

  • Avoid casting arguments of DATEDIFF.

  • Fix appending partitioned files and reading of null partitions.

  • Make a 10X performance improvement for conversion between base 10 and 16 using SQL.
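
The base-10/16 conversion above corresponds to Spark's conv function. In plain Python the same operation is available through format and int, shown here only to illustrate what the conversion does, not how the SQL implementation works:

```python
# Decimal -> hexadecimal, analogous to conv(col, 10, 16) in Spark SQL
assert format(255, "X") == "FF"

# Hexadecimal -> decimal, analogous to conv(col, 16, 10)
assert int("FF", 16) == 255
```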

New features

  • Overwrite only modified partitions for parquet files.

Other updates

  • Updated logic to detect if Snowpark Connect for Spark is running on XP.

  • Support writing to a table with variant data type in Snowflake.

  • Remove unnecessary info logs.

  • Move Java tests out of Scala tests job to a separate job.

  • Update the dependency version for gcsfs.

Snowpark Submit

None.

1.9.0 (January 14, 2026)

Snowpark Connect for Spark

Bug fixes

  • Fix serializing Scala tuples.

  • Fix loading huge JSON files.

  • Implement small fixes for customer issues.

  • Implement fixes for struct comparisons.

  • Add handling for 0-column DataFrames.

  • Correct upload file path.

  • Fix upload_files_if_needed not running in parallel.

  • Improve input type inference when UDF input types are not defined in the proto.

  • Fix NA edge cases.

New features

  • Support reading single JSON BZ2 file.

  • Support Scala UDFs in server-side Snowpark Connect for Spark.

  • Implement cast between string and day-time interval.

  • Add support for Scala UDFs in group_map.

Snowpark Submit

Bug fixes

  • Shorten generated workload names.

1.8.0 (January 07, 2026)

Snowpark Connect for Spark

Bug fixes

  • Fixed JAVA_HOME handling for Windows.

New features

  • Support neo4j data source via JDBC.

Snowpark Submit

None.