Snowpark Library for Python release notes for 2023¶
This article contains the release notes for the Snowpark Library for Python, including the following when applicable:
- Behavior changes
- New features
- Customer-facing bug fixes
Snowflake uses semantic versioning for updates.
See Snowpark Developer Guide for Python for documentation.
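Because releases follow semantic versioning, version strings such as those in the headings of these notes order correctly only when each component is compared as an integer. The following is a minimal standard-library sketch; the helper name `parse_version` is hypothetical, and it assumes plain MAJOR.MINOR.PATCH strings with no pre-release suffix.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split a MAJOR.MINOR.PATCH string into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


releases = ["1.9.0", "1.11.1", "1.10.0"]

# Comparing integer tuples gives the intended ordering.
newest = max(releases, key=parse_version)
print(newest)         # 1.11.1

# Comparing raw strings does not, because "9" sorts after "1".
print(max(releases))  # 1.9.0
```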
Version 1.11.1 (2023-12-07)¶
Version 1.11.1 of the Snowpark library introduces some new features.
New features¶
- Added the `conn_error` attribute to `SnowflakeSQLException`, which stores the whole underlying exception from `snowflake-connector-python`.
- Added support for `RelationalGroupedDataFrame.pivot()` to access `pivot` in the following pattern: `DataFrame.group_by(...).pivot(...)`.
- Added the experimental feature, Local Testing Mode, which allows you to create and operate on Snowpark Python DataFrames locally without connecting to a Snowflake account. You can use the local testing framework to test your DataFrame operations locally, on your development machine or in a CI (continuous integration) pipeline, before deploying code changes to your account.
- Added support for the new function `arrays_to_object` in `snowflake.snowpark.functions`.
- Added support for the vector data type.
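As a rough illustration of Local Testing Mode, the sketch below builds a small DataFrame and filters it without a Snowflake account. It assumes `snowflake-snowpark-python` 1.11.1 or later is installed together with pandas, and that the session is created with the `local_testing` configuration option; the function name `labels_above` and the sample data are hypothetical.

```python
def labels_above(threshold: int):
    """Filter a small in-memory DataFrame without connecting to Snowflake."""
    # Imports are deferred so that this module still loads where Snowpark
    # is not installed.
    from snowflake.snowpark import Session
    from snowflake.snowpark.functions import col

    # The "local_testing" option requests a local session instead of a
    # connection to a Snowflake account.
    session = Session.builder.config("local_testing", True).create()

    df = session.create_dataframe(
        [[1, "a"], [2, "b"], [3, "c"]],
        schema=["id", "label"],
    )
    rows = df.filter(col("id") > threshold).select("label").collect()

    # Each result is a Row; index 0 is the single selected column.
    return sorted(row[0] for row in rows)
```

A test in a CI pipeline could call `labels_above(1)` and compare the result with the labels it expects, with no credentials configured.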
Dependency updates¶
- Bumped the cloudpickle dependency to work with `cloudpickle==2.2.1`.
- Updated `snowflake-connector-python` to version `3.4.0`.
Bug fixes¶
- DataFrame column names quoting check now supports newline characters.
- Fixed a bug where a DataFrame generated by `session.read.with_metadata` created an inconsistent table when doing `df.write.save_as_table`.
Version 1.10.0 (2023-11-03)¶
Version 1.10.0 of the Snowpark library introduces some new features.
New features¶
- Added support for managing case sensitivity in `DataFrame.to_local_iterator()`.
- Added support for specifying vectorized UDTF’s input column names by using the optional parameter `input_names` in `UDTFRegistration.register`, `UDTFRegistration.register_file`, and `functions.pandas_udtf`. By default, `RelationalGroupedDataFrame.applyInPandas` will infer the column names from the current DataFrame schema.
- Added `sql_error_code` and `raw_message` attributes to `SnowflakeSQLException` when it is caused by a SQL exception.
Bug fixes¶
- Fixed a bug in `DataFrame.to_pandas()` where converting Snowpark DataFrames to Pandas DataFrames was losing precision on integers with more than 19 digits.
- Fixed a bug in `session.add_packages` where it could not handle a requirement specifier that contained a project name with an underscore and a version.
- Fixed a bug in `DataFrame.limit()` when `offset` is used and the parent `DataFrame` uses `limit`. Now the `offset` won’t impact the parent DataFrame’s `limit`.
- Fixed a bug in `DataFrame.write.save_as_table` where DataFrames created from the read API could not save data into Snowflake because of an invalid column name `$1`.
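The 19-digit threshold in the `DataFrame.to_pandas()` fix matches the width of a signed 64-bit integer. The standard-library sketch below illustrates the general hazard with wider integers; it is not the library's conversion code, and the helper name `survives_float64` is hypothetical.

```python
from decimal import Decimal

INT64_MAX = 2**63 - 1  # largest signed 64-bit integer, 19 digits


def survives_float64(value: int) -> bool:
    """Return True if the integer round-trips through a 64-bit float."""
    return int(float(value)) == value


big = 12345678901234567890123  # 23 digits, outside the int64 range

print(len(str(INT64_MAX)))            # 19
print(big > INT64_MAX)                # True
print(survives_float64(big))          # False
print(int(Decimal(str(big))) == big)  # True
```

Arbitrary-precision types such as Python's `int` or `Decimal` keep every digit, whereas a 64-bit float keeps only about 15 to 17 significant decimal digits.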
Behavior changes¶
- Changed the behavior of `date_format`:
  - The `format` argument changed from optional to required.
  - The returned result changed from a date object to a date-formatted string.
- When a window function or a sequence-dependent data generator (`normal`, `zipf`, `uniform`, `seq1`, `seq2`, `seq4`, `seq8`) function is used, the sort and filter operation will no longer be flattened when generating the query.
Version 1.9.0 (2023-10-16)¶
Version 1.9.0 of the Snowpark library introduces some new features.
New features¶
- Added support for the Python 3.11 runtime environment.
- Added support for `PythonObjJSONEncoder` JSON-serializable objects for `ARRAY` and `OBJECT` literals.
Dependency updates¶
- Re-added the dependency of `typing-extensions`.
Bug fixes¶
- Fixed a bug where imports from permanent stage locations were ignored for temporary stored procedures, UDTFs, UDFs, and UDAFs.
- Reverted to using the CTAS (CREATE TABLE AS SELECT) statement for `DataFrameWriter.save_as_table`, which does not need insert permission for writing tables.
Version 1.8.0 (2023-09-14)¶
Version 1.8.0 of the Snowpark library introduces some new features.
New features¶
- Added support for `VOLATILE` and `IMMUTABLE` keywords when registering UDFs.
- Added support for specifying clustering keys when saving dataframes using `DataFrame.save_as_table`.
- Accept `Iterable` objects input for `schema` when creating dataframes using `Session.create_dataframe`.
- Added the `DataFrame.session` property to return a `Session` object.
- Added the `Session.session_id` property to return an integer that represents the session ID.
- Added the `Session.connection` property to return a `SnowflakeConnection` object.
- Added support for creating a Snowpark session from a configuration file or environment variables.
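For comparison with the built-in support for configuration files and environment variables, the sketch below shows a manual way to collect connection parameters from the environment. It is not the mechanism added in this release: the environment variable names, the `ENV_VARS` mapping, and the `connection_parameters` helper are all hypothetical.

```python
import os
from typing import Dict, Mapping

# Hypothetical mapping from connection parameter to environment variable.
ENV_VARS = {
    "account": "SNOWFLAKE_ACCOUNT",
    "user": "SNOWFLAKE_USER",
    "password": "SNOWFLAKE_PASSWORD",
}


def connection_parameters(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build a connection-parameter dict from environment variables."""
    missing = [var for var in ENV_VARS.values() if var not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {param: env[var] for param, var in ENV_VARS.items()}


# With Snowpark installed, the dict can be passed to the session builder:
#   from snowflake.snowpark import Session
#   session = Session.builder.configs(connection_parameters()).create()
```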
Dependency updates¶
- Updated `snowflake-connector-python` to 3.2.0.
Bug fixes¶
- Fixed a bug where an automatic package upload would raise `ValueError` even when compatible package versions were added in `session.add_packages`.
- Fixed a bug where table stored procedures were not registered correctly when using `register_from_file`.
- Fixed a bug where dataframe joins failed with an `invalid_identifier` error.
- Fixed a bug where `DataFrame.copy` disabled SQL simplifier for the returned copy.
- Fixed a bug where `session.sql().select()` would fail if any parameters were specified to `session.sql()`.
Version 1.7.0 (2023-08-28)¶
Version 1.7.0 of the Snowpark library introduces some new features.
Behavior changes¶
- When creating stored procedures, UDFs, UDTFs, and UDAFs with the parameter `is_permanent=False`, temporary objects are created even when `stage_name` is provided. The default value of `is_permanent` is `False`, so users who do not explicitly set this value to `True` for permanent objects will notice a change in behavior.
- `types.StructField` now enquotes the column identifier by default.
New features¶
- Added parameters `external_access_integrations` and `secrets` that can be used when creating a UDF, UDTF or stored procedure from Snowpark Python to allow integration with external access.
- Added support for these new functions in `snowflake.snowpark.functions`: `array_flatten` and `flatten`.
- Added support for `apply_in_pandas` in `snowflake.snowpark.relational_grouped_dataframe`.
- Added support for replicating your local Python environment on Snowflake via `Session.replicate_local_environment`.
Bug fixes¶
- Fixed a bug where `session.create_dataframe` failed to properly set nullable columns where nullability was affected by order or when data was given.
- Fixed a bug where `DataFrame.select` could not identify and alias columns when using table functions when output columns of the table function overlapped with columns in the DataFrame.
Version 1.6.1 (2023-08-02)¶
Behavior changes¶
- `DataFrameWriter.save_as_table` now respects the nullable field of the schema provided by the user, or of the schema inferred from user input data.
New features¶
- Added support for new functions in `snowflake.snowpark.functions`:
  - `array_sort`
  - `sort_array`
  - `array_min`
  - `array_max`
  - `explode_outer`
- Added support for pure Python packages specified via `Session.add_requirements` or `Session.add_packages`. They are now usable in stored procedures and UDFs even if packages are not present on the Snowflake Anaconda channel.
- Added the Session parameters `custom_packages_upload_enabled` and `custom_packages_force_upload_enabled` to enable the support for pure Python packages feature mentioned above. Both parameters default to `False`.
- Added support for specifying package requirements by passing a conda environment YAML file to `Session.add_requirements`.
- Added support for asynchronous execution of multi-query dataframes that contain binding variables.
- Added support for renaming multiple columns in `DataFrame.rename`.
- Added support for Geometry datatypes.
- Added support for params in `session.sql()` in stored procedures.
- Added support for user-defined aggregate functions (UDAFs). This feature is currently in private preview.
- Added support for vectorized user-defined table functions (vectorized UDTFs). This feature is currently in public preview.
- Added support for Snowflake Timestamp variants (i.e., `TIMESTAMP_NTZ`, `TIMESTAMP_LTZ`, `TIMESTAMP_TZ`):
  - Added `TimestampTimezone` as an argument in the `TimestampType` constructor.
  - Added type hints: `NTZ`, `LTZ`, `TZ` and `Timestamp` to annotate functions when registering UDFs.
Improvements¶
- Removed redundant dependency typing-extensions.
- `DataFrame.cache_result` now creates a temp table with a fully qualified name under the current database and schema.
Bug fixes¶
- Fixed a bug where type check happens on pandas before it is imported.
- Fixed a bug when creating a UDF from `numpy.ufunc`.
- Fixed a bug where `DataFrame.union` was not generating the correct `Selectable.schema_query` when SQL simplifier is enabled.
Dependency updates¶
- Updated `snowflake-connector-python` to version 3.0.4.
Version 1.5.1 (2023-06-20)¶
New features and updates¶
- Added support for the Python 3.10 runtime environment.
Version 1.5.0 (2023-06-13)¶
Behavior changes¶
- Aggregation results, from functions such as `DataFrame.agg` and `DataFrame.describe`, no longer strip away non-printing characters from column names.
New features and updates¶
- Added support for the Python 3.9 runtime environment.
- Added support for new functions in `snowflake.snowpark.functions`:
  - `array_generate_range`
  - `array_unique_agg`
  - `collect_set`
  - `sequence`
- Added support for registering and calling stored procedures with the `TABLE` return type.
- Added support for parameter length in `StringType()` to specify the maximum number of characters that can be stored by the column.
- Added the alias `functions.element_at()` for `functions.get()`.
- Added the alias `Column.contains` for `functions.contains`.
- Added the experimental feature `DataFrame.alias`.
- Added support for querying metadata columns from stage when creating `DataFrame` using `DataFrameReader`.
- Added support for `StructType.add` to append more fields to existing `StructType` objects.
- Added support for parameter `execute_as` in `StoredProcedureRegistration.register_from_file()` to specify stored procedure caller rights.
Bug fixes¶
- Fixed a bug where `DataFrame.join_table_function` did not run all of the necessary queries to set up the join table function when SQL simplifier was enabled.
- Fixed type hint declarations for the custom types `ColumnOrName`, `ColumnOrLiteralStr`, `ColumnOrSqlExpr`, `LiteralType` and `ColumnOrLiteral`, which were breaking `mypy` checks.
- Fixed a bug where `DataFrameWriter.save_as_table` and `DataFrame.copy_into_table` failed to parse fully qualified table names.
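Parsing a fully qualified table name is less trivial than splitting on dots, because a double-quoted identifier may itself contain dots. The sketch below shows one way to do it with the standard library; it is not the library's parser, the names `_PART` and `split_table_name` are hypothetical, and it deliberately ignores forms such as `db..table` with an omitted schema.

```python
import re
from typing import List

# One identifier: either double-quoted (with "" as an escaped quote)
# or an unquoted run of characters without dots or quotes.
_PART = re.compile(r'"(?:[^"]|"")*"|[^."]+')


def split_table_name(name: str) -> List[str]:
    """Split [database.][schema.]table into parts, keeping quoted dots."""
    parts = []
    pos = 0
    while True:
        match = _PART.match(name, pos)
        if match is None:
            raise ValueError(f"invalid identifier at position {pos}: {name!r}")
        parts.append(match.group(0))
        pos = match.end()
        if pos == len(name):
            break
        if name[pos] != ".":
            raise ValueError(f"expected '.' at position {pos}: {name!r}")
        pos += 1  # skip the separating dot
    if len(parts) > 3:
        raise ValueError(f"too many name parts: {name!r}")
    return parts


print(split_table_name('mydb.myschema."my.table"'))
# ['mydb', 'myschema', '"my.table"']
```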
Version 1.4.0 (2023-04-24)¶
New features¶
- Added support for `session.getOrCreate`.
- Added support for alias `Column.getField`.
- Added support for new functions in `snowflake.snowpark.functions`:
  - `date_add` and `date_sub` to make add and subtract operations easier.
  - `daydiff`
  - `explode`
  - `array_distinct`
  - `regexp_extract`
  - `struct`
  - `format_number`
  - `bround`
  - `substring_index`
- Added parameter `skip_upload_on_content_match` when creating UDFs, UDTFs, and stored procedures using `register_from_file` to skip uploading files to a stage if the same version of the files is already on the stage.
- Added support for the `DataFrame.save_as_table` method to take table names that contain dots.
- Flattened generated SQL when `DataFrame.filter()` or `DataFrame.order_by()` is followed by a projection statement (e.g. `DataFrame.select()`, `DataFrame.with_column()`).
- Added support for creating dynamic tables (in private preview) using `DataFrame.create_or_replace_dynamic_table`.
- Added an optional argument, `params`, in `session.sql()` to support binding variables. Note that this argument is not supported in stored procedures yet.
Bug fixes¶
- Fixed a bug in `strtok_to_array` where an exception was thrown when a delimiter was passed in.
- Fixed a bug in `session.add_import` where the module had the same namespace as other dependencies.
Version 1.3.0 (2023-03-28)¶
New features¶
- Added support for the delimiters parameter in `functions.initcap()`.
- Added support for `functions.hash()` to accept a variable number of input expressions.
- Added API `Session.conf` for getting, setting or checking the mutability of any runtime configuration.
- Added support for managing case sensitivity in `Row` results from `DataFrame.collect` using the `case_sensitive` parameter.
- Added indexer support for `snowflake.snowpark.types.StructType`.
- Added a keyword argument `log_on_exception` to `DataFrame.collect` and `DataFrame.collect_no_wait` to optionally disable error logging for SQL exceptions.
Bug fixes¶
- Fixed a bug where a DataFrame set operation (`DataFrame.subtract`, `DataFrame.union`, etc.) threw an exception when called after another DataFrame set operation and `DataFrame.select` or `DataFrame.with_column`.
- Fixed a bug where chained sort statements were overwritten by the SQL simplifier.
Improvements¶
- Simplified JOIN queries to use constant subquery aliases (`SNOWPARK_LEFT`, `SNOWPARK_RIGHT`) by default. Users can disable this at runtime with `session.conf.set('use_constant_subquery_alias', False)` to use randomly generated alias names instead.
- Allowed specifying statement parameters in `session.call()`.
- Enabled the uploading of large pandas DataFrames in stored procedures by defaulting to a chunk size of 100,000 rows.
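To make the 100,000-row chunk size concrete, the sketch below computes the row ranges that such a chunk size would produce for a given row count. It is illustrative only and is not the library's upload code; `DEFAULT_CHUNK_SIZE` and `chunk_bounds` are hypothetical names.

```python
from typing import Iterator, Tuple

DEFAULT_CHUNK_SIZE = 100_000  # rows per chunk, as stated in the note above


def chunk_bounds(
    total_rows: int, chunk_size: int = DEFAULT_CHUNK_SIZE
) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) row ranges that cover total_rows in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_rows, chunk_size):
        yield start, min(start + chunk_size, total_rows)


print(list(chunk_bounds(250_000)))
# [(0, 100000), (100000, 200000), (200000, 250000)]
```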
Version 1.2.0 (2023-03-02)¶
New features and updates¶
- Added support for displaying source code as comments in the generated scripts when registering stored procedures. This is enabled by default; turn it off by specifying `source_code_display=False` at registration.
- Added a parameter `if_not_exists` when creating a UDF, UDTF or Stored Procedure from Snowpark Python to ignore creating the specified function or procedure if it already exists.
- Accept integers when calling `snowflake.snowpark.functions.get` to extract a value from an array.
- Added `functions.reverse` in functions to open access to the Snowflake built-in function REVERSE.
- Added parameter `require_scoped_url` in `snowflake.snowpark.files.SnowflakeFile.open()` (in Private Preview) to replace `is_owner_file`, which is marked for deprecation.
Bug fixes¶
- Fixed a bug that overwrote `paramstyle` to `qmark` when creating a Snowpark session.
- Fixed a bug where `df.join(..., how="cross")` fails with `SnowparkJoinException: (1112): Unsupported using join type 'Cross'`.
- Fixed a bug where querying a `DataFrame` column created from chained function calls used a wrong column name.
Version 1.1.0 (2023-01-26)¶
New features and updates¶
- Added `asc`, `asc_nulls_first`, `asc_nulls_last`, `desc`, `desc_nulls_first`, `desc_nulls_last`, `date_part`, and `unix_timestamp` in functions.
- Added the property `DataFrame.dtypes` to return a list of column name and data type pairs.
- Added the following aliases:
  - `functions.expr()` for `functions.sql_expr()`.
  - `functions.date_format()` for `functions.to_date()`.
  - `functions.monotonically_increasing_id()` for `functions.seq8()`.
  - `functions.from_unixtime()` for `functions.to_timestamp()`.
Bug fixes¶
- Fixed a bug in SQL simplifier that didn’t handle Column alias and join well in some cases. See https://github.com/snowflakedb/snowpark-python/issues/658 for details.
- Fixed a bug in SQL simplifier that generated wrong column names for function calls, NaN and INF.
Improvements¶
- The session parameter `PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER` will be `True` after Snowflake 7.3 is released. In snowpark-python, `session.sql_simplifier_enabled` reads the value of `PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER` by default, meaning that the SQL simplifier is enabled by default after the Snowflake 7.3 release. To turn this off, set `PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER` in Snowflake to False or run `session.sql_simplifier_enabled = False` from Snowpark. It is recommended to use the SQL simplifier because it helps to generate more concise SQL.