Snowpark Library for Scala and Java release notes for 2022¶
This article contains the release notes for the Snowpark Library for Scala and Snowpark Library for Java, including the following when applicable:
Behavior changes
New features
Customer-facing bug fixes
Snowflake uses semantic versioning for Snowpark Library for Scala and Java updates.
See Snowpark Developer Guide for Java and Snowpark Developer Guide for Scala for documentation.
Version 1.6.2 (October 26, 2022)¶
Compatible Snowflake release: 6.35.x
Improvements¶
Made internal improvements for stored procedures written in Java or Scala.
Version 1.6.1 (September 30, 2022)¶
Compatible Snowflake release: 6.31.x
This version has a known issue that might break the creation of temporary objects. Use version 1.6.2 instead.
Improvements¶
Made internal improvements for stored procedures written in Java or Scala.
Version 1.6.0 (August 12, 2022)¶
Compatible Snowflake release: 6.27.x
Improvements¶
Made internal improvements to UDTFs.
Version 1.5.0 (July 1, 2022)¶
Compatible Snowflake release: 6.22.x
New features¶
Improvements¶
Optimized the SQL queries generated by the Snowpark client library.
Improved the error message that is logged when the Snowpark library fails to resolve a column name in a DataFrame (e.g. when you attempt to access a column that does not exist).
Version 1.4.1 (May 26, 2022)¶
Compatible Snowflake release: 6.17.x
Changes¶
Updated the version of jackson-core and jackson-annotations that the Snowpark library depends on to 2.13.2.
Updated the version of jackson-databind that the Snowpark library depends on to 2.13.2.2.
Removed the jackson-core, jackson-databind, and jackson-annotations classes from the Snowpark JAR file.
If you downloaded the .tar.gz/.zip file, the JAR files for the Jackson classes are now provided separately in the lib/ subdirectory (jackson-core-2.13.2.jar, jackson-databind-2.13.2.2.jar, and jackson-annotations-2.13.2.jar).
If you are specifying the Snowpark library as a dependency in your pom.xml file and you want to depend on a different version of the Jackson libraries in your pom.xml, you can exclude the dependency on the Jackson libraries from the Snowpark library dependency.
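The exclusion described in the last item might look like the following pom.xml fragment. This is a sketch, not an official snippet: it assumes the library is published under the Maven coordinates com.snowflake:snowpark, so check the coordinates and version against your own build before copying it.

```xml
<!-- Sketch only: the com.snowflake:snowpark coordinates are an assumption. -->
<dependency>
  <groupId>com.snowflake</groupId>
  <artifactId>snowpark</artifactId>
  <version>1.4.1</version>
  <exclusions>
    <!-- Keep Snowpark from pulling in its own Jackson versions. -->
    <exclusion>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-core</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-annotations</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

With the exclusions in place, you would declare the Jackson versions you want as ordinary dependencies elsewhere in the same pom.xml.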
Version 1.4.0 (April 28, 2022)¶
Compatible Snowflake release: 6.14.x
New features¶
Made the Snowpark Java API generally available on AWS and Azure.
The API is still available as a preview feature on GCP.
Made the Snowpark Scala API generally available on Azure.
Prior to this release, the API was only generally available on AWS. The API is still available as a preview feature on GCP.
Added a Java API for creating UDTFs. Note that this is a preview feature.
Added new APIs in Scala and Java for uploading and downloading data from a stage (FileOperation.uploadStream and FileOperation.downloadStream).
Added the DataFrameWriter.option method in Scala and Java for specifying how values in columns in the DataFrame should be mapped to columns in the table. The option method allows you to specify that the DataFrameWriter should use the column name, rather than the column order.
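To see why mapping by column name rather than by column order matters, consider a DataFrame whose columns are in a different order than the target table. The following plain-Java sketch models the two strategies with ordinary collections. It does not call the Snowpark API, and the class and method names (ColumnMapping, byOrder, byName) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the two mapping strategies; NOT Snowpark API code.
final class ColumnMapping {
    // By order: the i-th value in the row goes to the i-th table column.
    static Map<String, Object> byOrder(List<String> tableColumns, List<Object> rowValues) {
        Map<String, Object> written = new LinkedHashMap<>();
        for (int i = 0; i < tableColumns.size(); i++) {
            written.put(tableColumns.get(i), rowValues.get(i));
        }
        return written;
    }

    // By name: each table column takes the value of the DataFrame column with the same name.
    static Map<String, Object> byName(List<String> tableColumns,
                                      List<String> frameColumns,
                                      List<Object> rowValues) {
        Map<String, Object> written = new LinkedHashMap<>();
        for (String column : tableColumns) {
            int source = frameColumns.indexOf(column);
            if (source < 0) {
                throw new IllegalArgumentException("no DataFrame column named " + column);
            }
            written.put(column, rowValues.get(source));
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> tableColumns = List.of("ID", "NAME");
        List<String> frameColumns = List.of("NAME", "ID"); // same columns, different order
        List<Object> row = List.of("alice", 1);
        System.out.println("by order: " + byOrder(tableColumns, row));
        System.out.println("by name:  " + byName(tableColumns, frameColumns, row));
    }
}
```

In this model, mapping by order writes "alice" into ID and 1 into NAME, while mapping by name puts each value in the column that matches its name.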
Improvements¶
Disabled the Closure Cleaner in Java sessions. The Closure Cleaner only works in Scala programs.
Improved Array and Map support in the Java Row API.
Version 1.3.0 (March 18, 2022)¶
Compatible Snowflake release: 6.8.x
New features¶
Added support for writing stored procedures in Java.
Added support for asynchronously merging rows into a table in Scala.
Version 1.2.0 (March 2, 2022)¶
Compatible Snowflake release: 6.5.x
New features¶
Added the Java API for Snowpark.
Added preview support in the Scala API for creating UDTFs.
Added a separate version of the library that complies with the security requirements of FIPS (Federal Information Processing Standard). You can download this library from:
To point to the FIPS-compliant library from an sbt build file or Maven project, use snowpark-fips as the artifactId.
Version 1.1.0 (February 4, 2022)¶
Compatible Snowflake release: 6.2.x
Added support for writing stored procedures in Scala.
The API reference for this release is available in the Snowflake documentation and in a .zip or .tar.gz file in the Snowflake Client Repository.
Version 1.0.0 (January 26, 2022)¶
Compatible Snowflake release: 6.1.x
General availability (GA) release on AWS. (Snowpark is still a preview feature on Azure and GCP.)
The API reference for this release is available in a .zip or .tar.gz file in the Snowflake Client Repository.
Version 0.12.0 (January 4, 2022)¶
Compatible Snowflake release: 5.45.x
The API reference for this release is available in a .zip or .tar.gz file in the Snowflake Client Repository.
New features¶
Improvements¶
In the generated code for UDFs, replaced a static code block with an object instance function.
Reorganized error messages.
Changed the saveAsTable function so that a new table is not created in Append mode.
Improved the callUDF function to support any type of argument.
Changed the library to set the query tag at the statement level, rather than at the session level.
Version 0.11.0 (November 16, 2021)¶
Compatible Snowflake release: 5.45.x
The API reference for this release is available in a .zip or .tar.gz file in the Snowflake Client Repository.
New features¶
Added the generator method to the Session class and the seq1, seq2, seq4, seq8, and uniform functions to the functions object.
Added the getSessionInfo method to the Session class.
Added APIs for performing actions on DataFrames asynchronously.
Improvements¶
Upgraded the Snowflake JDBC driver to 3.13.9.
Improved the error message reported when no current database is selected for use.
Version 0.10.1 (October 27, 2021)¶
Compatible Snowflake release: 5.38.x
The API reference for this release is available in a .zip or .tar.gz file in the Snowflake Client Repository.
Bug fixes¶
Fixed a problem with uploading files to a GCP stage where the wrong prefix was used.
Fixed a problem in which a 403 HTTP response was returned when accessing a pre-signed URL for GCP.
Version 0.10.0 (October 18, 2021)¶
Compatible Snowflake release: 5.37.x
The API reference for this release is available in a .zip or .tar.gz file in the Snowflake Client Repository.
New features¶
Added the new method dropDuplicates to the DataFrame class.
Added support for in expressions to the Column class (with the in method) and the functions object (with the in function).
Extended the Iterator returned by DataFrame.toLocalIterator to support the Closeable interface, which allows you to call the close method on the iterator.
Added support for the new configuration property snowpark_request_timeout_in_seconds. You can set this in the configuration map or file to adjust the timeout that the library uses when uploading dependencies to a stage. By default, the timeout is 86400 seconds (1 day).
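An iterator that implements Closeable can be released as soon as you stop reading, even if rows remain. The following plain-Java sketch illustrates that pattern with try-with-resources. The CloseableRows class is a hypothetical stand-in written for this example. It is not the iterator that the Snowpark library returns.

```java
import java.io.Closeable;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a row iterator that holds a resource until closed.
// It models the usage pattern only; it is NOT the Snowpark implementation.
final class CloseableRows implements Iterator<String>, Closeable {
    private final Iterator<String> rows;
    private boolean closed = false;

    CloseableRows(List<String> rows) {
        this.rows = rows.iterator();
    }

    @Override
    public boolean hasNext() {
        return !closed && rows.hasNext();
    }

    @Override
    public String next() {
        if (closed) {
            throw new NoSuchElementException("iterator is closed");
        }
        return rows.next();
    }

    // Mark the iterator as released; calling this more than once is harmless.
    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    // Read only the first row, then let try-with-resources call close().
    static String firstRow(CloseableRows source) {
        try (CloseableRows rows = source) {
            return rows.hasNext() ? rows.next() : null;
        }
    }

    public static void main(String[] args) {
        CloseableRows rows = new CloseableRows(List.of("row-1", "row-2", "row-3"));
        System.out.println(firstRow(rows));   // first row only
        System.out.println(rows.isClosed());  // closed by try-with-resources
    }
}
```

The point of the pattern is that the caller does not have to drain the iterator to release what it holds.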
Improvements¶
Added logic to the DataFrame.withColumns method to verify that duplicate input column names are not specified.
Updated the clone methods in the Copyable and Updatable classes to return the correct DataFrame types.
Added support for specifying the application ID by setting the application JDBC property in the configuration map or file.
Behavior changes¶
Removed APIs intended only for Java from the Scala API.
Replaced the default logger log4j with SLF4J SimpleLogger.
Bug fixes¶
Updated the library to close unused statements automatically in order to reduce memory usage.
Fixed the column order in the result of the DataFrame.withColumns method.
Version 0.9.0 (September 20, 2021)¶
Compatible Snowflake release: 5.34.x
The API reference for this release is available in a .zip or .tar.gz file in the Snowflake Client Repository.
New features¶
Added a new DataFrame subclass, CopyableDataFrame, that you can use to copy data from a staged file into a table. This is equivalent to the COPY INTO <table> command.
Added the new method DataFrame.rename() for renaming columns in a DataFrame.
Added the new function functions.iff() for specifying an if-then-else expression. This is equivalent to the IFF function.
Added new constructors for the DecimalType class.
Behavior changes¶
Changed the DataFrame.union() and DataFrame.unionByName() methods to use UNION, rather than UNION ALL.
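The practical effect of this change is that duplicate rows are no longer kept in the combined result. The following plain-Java sketch models the difference between the two SQL operators using lists. It does not call the Snowpark API, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Plain-Java model of the two SQL set operators; NOT Snowpark API code.
final class UnionSemantics {
    // UNION ALL: concatenate both inputs and keep every duplicate row.
    static <T> List<T> unionAll(List<T> left, List<T> right) {
        List<T> combined = new ArrayList<>(left);
        combined.addAll(right);
        return combined;
    }

    // UNION: concatenate, then drop duplicate rows.
    // This model keeps first-seen order; SQL itself guarantees no row order.
    static <T> List<T> union(List<T> left, List<T> right) {
        return new ArrayList<>(new LinkedHashSet<>(unionAll(left, right)));
    }

    public static void main(String[] args) {
        List<Integer> a = List.of(1, 2, 2);
        List<Integer> b = List.of(2, 3);
        System.out.println("UNION ALL: " + unionAll(a, b));
        System.out.println("UNION:     " + union(a, b));
    }
}
```

For the inputs shown, the UNION ALL model returns five rows and the UNION model returns three, so code that relied on the earlier behavior may see fewer rows after upgrading.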
Bug fixes¶
Fixed the error “SQL compilation error: Missing column specification” that could occur when the Snowpark library created a temporary view.
Version 0.8.0 (August 9, 2021)¶
Compatible Snowflake release: 5.30.x
The API reference for this release is available in a .zip or .tar.gz file in the Snowflake Client Repository.
Improvements¶
Refactored some internal code to remove some dependencies.
Bug fixes¶
Fixed an issue with BigDecimal literals in cases where scale might be larger than precision.
Fixed an issue that could occur when performing multiple set operations (e.g. union and intersect).
Version 0.7.0 (July 23, 2021)¶
Compatible Snowflake release: 5.29.x
The API reference for this release is available in a .zip or .tar.gz file in the Snowflake Client Repository.
New APIs¶
Introduced the new Session.close() method. Call this method to close the Snowpark session, which cancels all running queries and prevents the subsequent use of this session to execute queries.
Introduced the new Updatable class. Updatable extends the DataFrame class and provides additional table-related capabilities (e.g. the ability to update and delete values).
The Session.table() method now returns an Updatable object, rather than a DataFrame object.
Introduced new signatures for the registerTemporary methods in the UDFRegistration class. These signatures do not have a parameter for the name of the UDF, which means that you can use these to register an anonymous temporary UDF.
API Changes¶
As mentioned above, the Session.table() method now returns an Updatable object, which extends DataFrame.
In the Geography class, removed support for formats other than GeoJSON. Now, Geography only supports the GeoJSON data format.
Improvements¶
Improved the DataFrame.cacheResult() method to reduce the possibility of “object already exists” errors.
Improved some error messages.
Added a new log message that prints out session information after you log in.
Bug fixes¶
Fixed an issue in which the DataFrame.show() method did not display binary data correctly.
Fixed an error that occurred when getting the version number.
Version 0.6.0 (June 14, 2021)¶
Compatible Snowflake release: 5.21.x
Preview release on AWS
The API reference for this release is available in a .zip or .tar.gz file in the Snowflake Client Repository.
API Changes¶
In this release, the following methods in RelationalGroupedDataFrame now require an argument:
avg
max
median
min
sum
In previous releases, if you called these methods without an argument, these methods were applied to all numeric columns in the DataFrame. For example, for a DataFrame df with the columns (a int, b string, c float), calling df.groupBy("a").max() was equivalent to calling df.groupBy("a").max(col("a"), col("c")).
With this release, calling these methods without an argument results in a SnowparkClientException.
Version 0.5.0¶
New features¶
Added a maxWidth parameter to the DataFrame.show() method. You can use this parameter to adjust the number of characters printed in the output for each column.
Added the Session.cancelAll() method, which you can use to cancel all running actions on this session.
Added the DataFrame.toLocalIterator() method, which returns an iterator that you can use to retrieve data, row by row. You can use this rather than DataFrame.collect(), if you don’t want to load all of the data into memory at once.
Added the median method to the RelationalGroupedDataFrame class.
Improvements¶
Improved the error message returned when an identifier is invalid.
Enhanced the error checking to report an error when no database or schema name is specified.
Added a performance improvement when inserting a large number of values in a table.
Updated the library to consistently handle Snowflake object identifiers (table and view names). Now, all parameters that specify table or view names support the use of:
Short names (e.g. table_name and view_name)
Fully-qualified names (e.g. database.schema.table_name)
Multi-part identifiers (e.g. Seq("database", "schema", "view_name"))
Added a check to verify that a supported version of Scala is being used. The library reports an error if the Scala version is not compatible.
Bug fixes¶
Fixed a problem with registering UDFs on Microsoft Windows.
Fixed a problem with the order of results when using DataFrame.sort() with DataFrame.limit().
Fixed Session.range() to generate a sequence of numbers without gaps.
Version 0.4.1¶
In this version, you no longer need to specify a temporary schema or temporary database for Snowpark objects (the TEMP_SCHEMA and TEMP_DB settings). The Snowpark library automatically creates temporary versions of the objects needed.
API Changes¶
Replaced the DataFrame.cache() method with the DataFrame.cacheResult() method.
The new method creates and returns a new DataFrame with the cached results and has no effect on the current DataFrame. As a result of this change, the DataFrame object is now immutable.
New APIs¶
Added the following new methods to the RelationalGroupedDataFrame class:
avg
max
Added the following new methods to the DataFrame class:
groupByGroupingSets
clone
createOrReplaceTempView
Added the following new functions to the functions object:
toScalar
Added a Session.file object, which provides the following new methods for performing file operations:
get
put
Made the following changes to the Session.createDataFrame method:
Added support for user-provided schemas.
Added support for specifying an array/map of variant/geography data.
Added support for Geography/Variant data types in UDFs.
Added registerPermanent methods to the UDFRegistration class.
Bug fixes¶
Fixed a problem when the DataFrame column name contains quotation marks.
Fixed a problem in which data containing backslashes, single quotes, or newline characters could not be escaped.
Fixed a problem where UDF creation failed with the error message “code too large”.
Fixed a problem where the UDF closure failed to capture the value of a local string variable.
Added the result schema for the following SQL clauses:
GRANT/REVOKE
DESCRIBE
CREATE
USE
Fixed a problem when using Snowpark in Visual Studio Code with the Metals extension to create a UDF.