Snowpark Library for Python release notes for 2023¶
This article contains the release notes for the Snowpark Library for Python, including the following when applicable:
Behavior changes
New features
Customer-facing bug fixes
Snowflake uses semantic versioning for Snowpark Library for Python updates.
See Snowpark Developer Guide for Python for documentation.
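Because the version numbers on this page have three numeric parts, comparing them as plain strings gives the wrong order (for example, "1.9.0" sorts after "1.11.1"). The following is a minimal sketch of numeric comparison; the helper name parse_version is made up for this illustration and is not part of the Snowpark library.

```python
def parse_version(text):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


# A few versions listed on this page, in no particular order.
releases = ["1.9.0", "1.11.1", "1.10.0", "1.1.0"]

# Tuples of integers compare part by part, so 1.11.1 ranks above 1.9.0.
newest_first = sorted(releases, key=parse_version, reverse=True)
print(newest_first)
```

Sorting the same list as strings would put "1.9.0" first, which is why the parts are converted to integers before comparing.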
Version 1.11.1 (2023-12-07)¶
Version 1.11.1 of the Snowpark library introduces some new features.
New features¶
Added the conn_error attribute to SnowflakeSQLException, which stores the whole underlying exception from snowflake-connector-python.
Added support for RelationalGroupedDataFrame.pivot() to access pivot in the following pattern: DataFrame.group_by(...).pivot(...).
Added the experimental feature, Local Testing Mode, which allows you to create and operate on Snowpark Python DataFrames locally without connecting to a Snowflake account. You can use the local testing framework to test your DataFrame operations locally, on your development machine or in a CI (continuous integration) pipeline, before deploying code changes to your account.
Added support for the new arrays_to_object function in snowflake.snowpark.functions.
Added support for the vector data type.
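As a rough illustration of Local Testing Mode, the sketch below builds and filters a small DataFrame without a Snowflake account. It assumes that the feature is switched on with the "local_testing" session option and that the snowflake-snowpark-python package (with pandas) is installed; the function name run_local_example and the sample data are made up for this example.

```python
def run_local_example():
    """Build and filter a small DataFrame without connecting to Snowflake."""
    # Imported inside the function so this module still loads where
    # Snowpark is not installed.
    from snowflake.snowpark import Session
    from snowflake.snowpark.functions import col

    # Assumption: the "local_testing" option enables Local Testing Mode.
    session = Session.builder.config("local_testing", True).create()

    df = session.create_dataframe(
        [[1, "x"], [2, "y"], [3, "z"]], schema=["a", "b"]
    )
    rows = df.filter(col("a") > 1).collect()

    # Return plain tuples so the result is easy to assert on in a test.
    return sorted((row[0], row[1]) for row in rows)
```

A test in a CI pipeline could call run_local_example() and compare the returned tuples with the expected rows. Because the feature is experimental, not every DataFrame operation may be available locally.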
Dependency updates¶
Bumped the cloudpickle dependency to work with cloudpickle==2.2.1.
Updated snowflake-connector-python to version 3.4.0.
Bug fixes¶
DataFrame column names quoting check now supports newline characters.
Fixed a bug where a DataFrame generated by session.read.with_metadata created an inconsistent table when doing df.write.save_as_table.
Version 1.10.0 (2023-11-03)¶
Version 1.10.0 of the Snowpark library introduces some new features.
New features¶
Added support for managing case sensitivity in DataFrame.to_local_iterator().
Added support for specifying vectorized UDTF’s input column names by using the optional parameter input_names in UDTFRegistration.register, UDTFRegistration.register_file, and functions.pandas_udtf. By default, RelationalGroupedDataFrame.applyInPandas will infer the column names from the current DataFrame schema.
Added sql_error_code and raw_message attributes to SnowflakeSQLException when it is caused by a SQL exception.
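The new exception attributes can be read in an error handler. The sketch below is a minimal illustration: the helper names summarize_sql_error and collect_or_report are made up, the handler catches Exception broadly only to stay independent of the exact exception class, and getattr is used with a default so the code also works with exceptions that lack these attributes.

```python
def summarize_sql_error(exc):
    """Collect the SQL-related attributes named in these notes from an exception."""
    return {
        # getattr with a default keeps this safe for exceptions (or older
        # library versions) that do not carry the attribute.
        "sql_error_code": getattr(exc, "sql_error_code", None),
        "raw_message": getattr(exc, "raw_message", None),
        "message": str(exc),
    }


def collect_or_report(df):
    """Return (rows, None) on success or (None, error summary) on failure."""
    try:
        return df.collect(), None
    except Exception as exc:  # in real code, catch the Snowpark SQL exception class
        return None, summarize_sql_error(exc)
```

Passing a Snowpark DataFrame to collect_or_report returns either the collected rows or a small dictionary that can be logged.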
Bug fixes¶
Fixed a bug in DataFrame.to_pandas() where converting Snowpark DataFrames to Pandas DataFrames was losing precision on integers with more than 19 digits.
Fixed a bug in session.add_packages where it could not handle a requirement specifier that contained a project name with an underscore and a version.
Fixed a bug in DataFrame.limit() when offset is used and the parent DataFrame uses limit. Now the offset won’t impact the parent DataFrame’s limit.
Fixed a bug in DataFrame.write.save_as_table where DataFrames created from the read API could not save data into Snowflake because of an invalid column name $1.
Behavior changes¶
Changed the behavior of date_format:
    The format argument changed from optional to required.
    The returned result changed from a date object to a date-formatted string.
When a window function or a sequence-dependent data generator (normal, zipf, uniform, seq1, seq2, seq4, seq8) function is used, the sort and filter operation will no longer be flattened when generating the query.
Version 1.9.0 (2023-10-16)¶
Version 1.9.0 of the Snowpark library introduces some new features.
New features¶
Added support for the Python 3.11 runtime environment.
Added support for PythonObjJSONEncoder JSON-serializable objects for ARRAY and OBJECT literals.
Dependency updates¶
Re-added the typing-extensions dependency.
Bug fixes¶
Fixed a bug where imports from permanent stage locations were ignored for temporary stored procedures, UDTFs, UDFs, and UDAFs.
Reverted to using a CTAS (CREATE TABLE AS SELECT) statement for DataFrameWriter.save_as_table, which does not need insert permission for writing tables.
Version 1.8.0 (2023-09-14)¶
Version 1.8.0 of the Snowpark library introduces some new features.
New features¶
Added support for VOLATILE and IMMUTABLE keywords when registering UDFs.
Added support for specifying clustering keys when saving dataframes using DataFrame.save_as_table.
Added support for Iterable objects as the schema input when creating dataframes using Session.create_dataframe.
Added the DataFrame.session property to return a Session object.
Added the Session.session_id property to return an integer that represents the session ID.
Added the Session.connection property to return a SnowflakeConnection object.
Added support for creating a Snowpark session from a configuration file or environment variables.
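These notes do not describe how the library itself reads a configuration file or environment variables. As a hand-rolled alternative, the sketch below gathers connection parameters from environment variables and passes them to Session.builder.configs. The environment variable names (DEMO_SNOWFLAKE_*) and both helper functions are hypothetical and chosen only for this example.

```python
import os

# Hypothetical variable names for this sketch; Snowpark does not define them.
ENV_TO_PARAM = {
    "DEMO_SNOWFLAKE_ACCOUNT": "account",
    "DEMO_SNOWFLAKE_USER": "user",
    "DEMO_SNOWFLAKE_PASSWORD": "password",
    "DEMO_SNOWFLAKE_WAREHOUSE": "warehouse",
}


def connection_parameters_from_env(environ=os.environ):
    """Map the environment variables above to connection parameter names."""
    return {
        param: environ[name]
        for name, param in ENV_TO_PARAM.items()
        if name in environ
    }


def create_session(environ=os.environ):
    """Create a Snowpark session from the collected parameters."""
    # Imported here so the helper above can be used without Snowpark installed.
    from snowflake.snowpark import Session

    return Session.builder.configs(connection_parameters_from_env(environ)).create()
```

Keeping the parameter lookup separate from session creation makes the lookup easy to unit test with a plain dictionary in place of os.environ.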
Dependency updates¶
Updated snowflake-connector-python to 3.2.0.
Bug fixes¶
Fixed a bug where an automatic package upload would raise ValueError even when compatible package versions were added in session.add_packages.
Fixed a bug where table stored procedures were not registered correctly when using register_from_file.
Fixed a bug where dataframe joins failed with an invalid_identifier error.
Fixed a bug where DataFrame.copy disabled the SQL simplifier for the returned copy.
Fixed a bug where session.sql().select() would fail if any parameters were specified to session.sql().
Version 1.7.0 (2023-08-28)¶
Version 1.7.0 of the Snowpark library introduces some new features.
Behavior changes¶
When creating stored procedures, UDFs, UDTFs, and UDAFs with the parameter is_permanent=False, temporary objects are created even when stage_name is provided. Because is_permanent defaults to False, users who do not explicitly set it to True for permanent objects will notice a change in behavior.
types.StructField now enquotes the column identifier by default.
New features¶
Added parameters external_access_integrations and secrets that can be used when creating a UDF, UDTF or stored procedure from Snowpark Python to allow integration with external access.
Added support for these new functions in snowflake.snowpark.functions: array_flatten and flatten.
Added support for apply_in_pandas in snowflake.snowpark.relational_grouped_dataframe.
Added support for replicating your local Python environment on Snowflake via Session.replicate_local_environment.
Bug fixes¶
Fixed a bug where session.create_dataframe failed to properly set nullable columns where nullability was affected by order or when data was given.
Fixed a bug where DataFrame.select could not identify and alias columns when using table functions when output columns of the table function overlapped with columns in the DataFrame.
Version 1.6.1 (2023-08-02)¶
Behavior changes¶
DataFrameWriter.save_as_table now respects the nullable field of the schema provided by the user, or of the inferred schema based on data from user input.
New features¶
Added support for new functions in snowflake.snowpark.functions:
    array_sort
    sort_array
    array_min
    array_max
    explode_outer
Added support for pure Python packages specified via Session.add_requirements or Session.add_packages. They are now usable in stored procedures and UDFs even if packages are not present on the Snowflake Anaconda channel.
Added the Session parameters custom_packages_upload_enabled and custom_packages_force_upload_enabled to enable the support for the pure Python packages feature mentioned above. Both parameters default to False.
Added support for specifying package requirements by passing a conda environment YAML file to Session.add_requirements.
Added support for asynchronous execution of multi-query dataframes that contain binding variables.
Added support for renaming multiple columns in DataFrame.rename.
Added support for Geometry datatypes.
Added support for params in session.sql() in stored procedures.
Added support for user-defined aggregate functions (UDAFs). This feature is currently in private preview.
Added support for vectorized user-defined table functions (vectorized UDTFs). This feature is currently in public preview.
Added support for Snowflake Timestamp variants (i.e., TIMESTAMP_NTZ, TIMESTAMP_LTZ, TIMESTAMP_TZ):
    Added TimestampTimezone as an argument in the TimestampType constructor.
    Added the type hints NTZ, LTZ, TZ, and Timestamp to annotate functions when registering UDFs.
Improvements¶
Removed redundant dependency typing-extensions.
DataFrame.cache_result now creates a temp table of fully-qualified names under the current database and schema.
Bug fixes¶
Fixed a bug where a type check happened on pandas before it was imported.
Fixed a bug when creating a UDF from numpy.ufunc.
Fixed a bug where DataFrame.union was not generating the correct Selectable.schema_query when the SQL simplifier was enabled.
Dependency updates¶
Updated snowflake-connector-python to version 3.0.4.
Version 1.5.1 (2023-06-20)¶
New features and updates¶
Added support for the Python 3.10 runtime environment.
Version 1.5.0 (2023-06-13)¶
Behavior changes¶
Aggregation results, from functions such as DataFrame.agg and DataFrame.describe, no longer strip away non-printing characters from column names.
New features and updates¶
Added support for the Python 3.9 runtime environment.
Added support for new functions in snowflake.snowpark.functions:
    array_generate_range
    array_unique_agg
    collect_set
    sequence
Added support for registering and calling stored procedures with the TABLE return type.
Added support for parameter length in StringType() to specify the maximum number of characters that can be stored by the column.
Added the alias functions.element_at() for functions.get().
Added the alias Column.contains for functions.contains.
Added the experimental feature DataFrame.alias.
Added support for querying metadata columns from stage when creating DataFrame using DataFrameReader.
Added support for StructType.add to append more fields to existing StructType objects.
Added support for parameter execute_as in StoredProcedureRegistration.register_from_file() to specify stored procedure caller rights.
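Two of the additions above, StructType.add and the length parameter of StringType(), can be combined when building a schema step by step. The sketch below is illustrative only: the function name build_schema and the field names are made up, and it assumes that StructType can be created empty and that add accepts a field name, a data type, and a nullable flag.

```python
def build_schema():
    """Build a two-column schema by appending fields to a StructType."""
    # Imported inside the function so this module loads without Snowpark.
    from snowflake.snowpark.types import IntegerType, StringType, StructType

    schema = StructType()
    # Append a non-nullable integer column.
    schema.add("id", IntegerType(), nullable=False)
    # StringType(16) limits the column to at most 16 characters.
    schema.add("name", StringType(16))
    return schema
```

The resulting object can be passed as the schema argument when creating a DataFrame.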
Bug fixes¶
Fixed a bug where DataFrame.join_table_function did not run all of the necessary queries to set up the join table function when the SQL simplifier was enabled.
Fixed type hint declarations for the custom types ColumnOrName, ColumnOrLiteralStr, ColumnOrSqlExpr, LiteralType and ColumnOrLiteral that were breaking mypy checks.
Fixed a bug where DataFrameWriter.save_as_table and DataFrame.copy_into_table failed to parse fully qualified table names.
Version 1.4.0 (2023-04-24)¶
New features¶
Added support for session.getOrCreate.
Added support for alias Column.getField.
Added support for new functions in snowflake.snowpark.functions:
    date_add and date_sub to make add and subtract operations easier.
    daydiff
    explode
    array_distinct
    regexp_extract
    struct
    format_number
    bround
    substring_index
Added parameter skip_upload_on_content_match when creating UDFs, UDTFs, and stored procedures using register_from_file to skip uploading files to a stage if the same version of the files are already on the stage.
Added support for the DataFrame.save_as_table method to take table names that contain dots.
Flattened generated SQL when DataFrame.filter() or DataFrame.order_by() is followed by a projection statement (e.g. DataFrame.select(), DataFrame.with_column()).
Added support for creating dynamic tables (in private preview) using DataFrame.create_or_replace_dynamic_table.
Added an optional argument, params, in session.sql() to support binding variables. Note that this argument is not supported in stored procedures yet.
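The params argument of session.sql() lets values be bound to ? placeholders instead of being formatted into the SQL text. The following is a minimal sketch: the orders table, its columns, and the function name find_orders are hypothetical, and session is assumed to be an existing Snowpark Session.

```python
# Hypothetical table and columns, used only for illustration.
QUERY = "SELECT id, amount FROM orders WHERE amount > ? AND status = ?"


def find_orders(session, min_amount, status):
    """Run QUERY with both values bound through the params argument."""
    # Each ? placeholder is filled, in order, from the params list.
    return session.sql(QUERY, params=[min_amount, status]).collect()
```

For example, find_orders(session, 100, "OPEN") binds 100 to the first placeholder and "OPEN" to the second.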
Bug fixes¶
Fixed a bug in
strtok_to_arraywhere an exception was thrown when a delimiter was passed in.Fixed a bug in
session.add_importwhere the module had the same namespace as other dependencies.
Version 1.3.0 (2023-03-28)¶
New features¶
Added support for the delimiters parameter in functions.initcap().
Added support for functions.hash() to accept a variable number of input expressions.
Added API Session.conf for getting, setting or checking the mutability of any runtime configuration.
Added support for managing case sensitivity in Row results from DataFrame.collect using the case_sensitive parameter.
Added indexer support for snowflake.snowpark.types.StructType.
Added a keyword argument log_on_exception to DataFrame.collect and DataFrame.collect_no_wait to optionally disable error logging for SQL exceptions.
Bug fixes¶
Fixed a bug where a DataFrame set operation (DataFrame.subtract, DataFrame.union, etc.) being called after another DataFrame set operation and DataFrame.select or DataFrame.with_column threw an exception.
Fixed a bug where chained sort statements were overwritten by the SQL simplifier.
Improvements¶
Simplified JOIN queries to use constant subquery aliases (SNOWPARK_LEFT, SNOWPARK_RIGHT) by default. Users can disable this at runtime with session.conf.set('use_constant_subquery_alias', False) to use randomly generated alias names instead.
Allowed specifying statement parameters in session.call().
Enabled the uploading of large pandas DataFrames in stored procedures by defaulting to a chunk size of 100,000 rows.
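The runtime setting mentioned above can be changed through Session.conf. The sketch below checks that the setting is mutable before changing it and then reads the value back. The function name disable_constant_aliases is made up for this example, session is assumed to be an existing Snowpark Session, and the is_mutable, set, and get methods are assumed to behave as the Session.conf description in these notes suggests.

```python
KEY = "use_constant_subquery_alias"


def disable_constant_aliases(session):
    """Switch JOIN subquery aliases to randomly generated names, if allowed.

    Returns the value of the setting after the attempt.
    """
    # Only change the setting when the runtime configuration allows it.
    if session.conf.is_mutable(KEY):
        session.conf.set(KEY, False)
    return session.conf.get(KEY)
```

A return value of False indicates that subsequent JOIN queries will use randomly generated alias names instead of SNOWPARK_LEFT and SNOWPARK_RIGHT.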
Version 1.2.0 (2023-03-02)¶
New features and updates¶
Added support for displaying source code as comments in the generated scripts when registering stored procedures. This is enabled by default; to turn it off, specify source_code_display=False at registration.
Added a parameter if_not_exists when creating a UDF, UDTF or Stored Procedure from Snowpark Python to ignore creating the specified function or procedure if it already exists.
Added support for integers when calling snowflake.snowpark.functions.get to extract a value from an array.
Added functions.reverse in functions to open access to the Snowflake built-in function REVERSE.
Added parameter require_scoped_url in snowflake.snowpark.files.SnowflakeFile.open() (in Private Preview) to replace is_owner_file, which is marked for deprecation.
Bug fixes¶
Fixed a bug that overwrote paramstyle to qmark when creating a Snowpark session.
Fixed a bug where df.join(..., how="cross") failed with SnowparkJoinException: (1112): Unsupported using join type 'Cross'.
Fixed a bug where querying a DataFrame column created from chained function calls used a wrong column name.
Version 1.1.0 (2023-01-26)¶
New features and updates¶
Added asc, asc_nulls_first, asc_nulls_last, desc, desc_nulls_first, desc_nulls_last, date_part, and unix_timestamp in functions.
Added the property DataFrame.dtypes to return a list of column name and data type pairs.
Added the following aliases:
    functions.expr() for functions.sql_expr().
    functions.date_format() for functions.to_date().
    functions.monotonically_increasing_id() for functions.seq8().
    functions.from_unixtime() for functions.to_timestamp().
Bug fixes¶
Fixed a bug in SQL simplifier that didn’t handle Column alias and join well in some cases. See https://github.com/snowflakedb/snowpark-python/issues/658 for details.
Fixed a bug in SQL simplifier that generated wrong column names for function calls, NaN and INF.
Improvements¶
The session parameter PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER will be True after Snowflake 7.3 is released. In snowpark-python, session.sql_simplifier_enabled reads the value of PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER by default, meaning that the SQL simplifier is enabled by default after the Snowflake 7.3 release. To turn this off, set PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER in Snowflake to False or run session.sql_simplifier_enabled = False from Snowpark. It is recommended to use the SQL simplifier because it helps to generate more concise SQL.