Snowpark Library for Python release notes for 2024¶
This article contains the release notes for the Snowpark Library for Python, including the following when applicable:
- Behavior changes 
- New features 
- Customer-facing bug fixes 
Snowflake uses semantic versioning for Snowpark Library for Python updates.
See Snowpark Developer Guide for Python for documentation.
Warning
As Python 3.8 has reached its End of Life, deprecation warnings will be triggered when using snowpark-python with Python 3.8. For more details, see Snowflake Python Runtime Support. Snowpark Python 1.24.0 will be the last client and server version to support Python 3.8, in accordance with Anaconda’s policy. Upgrade your existing Python 3.8 objects to Python 3.9 or greater.
Version 1.26.0 (2024-12-05)¶
New features¶
- Added support for property - versionand class method- get_active_sessionfor- Sessionclass.
- Added new methods and variables to enhance data type handling and JSON serialization/deserialization: - To - DataType, its derived classes, and- StructField:- type_name: Returns the type name of the data.
- simple_string: Provides a simple string representation of the data.
- json_value: Returns the data as a JSON-compatible value.
- json: Converts the data to a JSON string.
 
- To - ArrayType,- MapType,- StructField,- PandasSeriesType,- PandasDataFrameTypeand- StructType:- from_json: Enables these types to be created from JSON data.
 
- To - MapType:- keyType: keys of the map
- valueType: values of the map
 
 
- Added support for method - appNamein- SessionBuilder.
- Added support for - include_nullsargument in- DataFrame.unpivot.
- Added support for following functions in - functions.py:- sizeto get size of array, object, or map columns.
- collect_listan alias of- array_agg.
- concat_ws_ignore_nullsto concatenate strings with a separator, ignoring null values.
- substringmakes- lenargument optional.
 
- Added parameter - ast_enabledto session for internal usage (default:- False).
Improvements¶
- Added support for specifying the following to - DataFrame.create_or_replace_dynamic_table:- iceberg_configA dictionary that can hold the following iceberg configuration options:- external_volume
- catalog
- base_location
- catalog_sync
- storage_serialization_policy
 
 
- Added support for nested data types to - DataFrame.print_schema
- Added support for - levelparameter to- DataFrame.print_schema
- Improved flexibility of - DataFrameReaderand- DataFrameWriterAPI by adding support for the following:- Added - formatmethod to- DataFrameReaderand- DataFrameWriterto specify file format when loading or unloading results.
- Added - loadmethod to- DataFrameReaderto work in conjunction with- format.
- Added - savemethod to- DataFrameWriterto work in conjunction with- format.
- Added support to read keyword arguments to - optionsmethod for- DataFrameReaderand- DataFrameWriter.
 
- Relaxed the cloudpickle dependency for Python 3.11 to simplify build requirements. However, for Python 3.11, - cloudpickle==2.2.1remains the only supported version.
Bug fixes¶
- Removed warnings that dynamic pivot features were in private preview, because dynamic pivot is now generally available. 
- Fixed a bug in - session.read.optionswhere- FalseBoolean values were incorrectly parsed as- Truein the generated file format.
Dependency updates¶
- Added a runtime dependency on - python-dateutil.
Snowpark pandas API updates¶
New features¶
- Added partial support for - Series.mapwhen- argis a pandas- Seriesor a- collections.abc.Mapping. No support for instances of- dictthat implement- __missing__but are not instances of- collections.defaultdict.
- Added support for - DataFrame.alignand- Series.alignfor- axis=1and- axis=None.
- Added support for - pd.json_normalize.
- Added support for - GroupBy.pct_changewith- axis=0,- freq=None, and- limit=None.
- Added support for - DataFrameGroupBy.__iter__and- SeriesGroupBy.__iter__.
- Added support for - np.sqrt,- np.trunc,- np.floor, numpy trig functions,- np.exp,- np.abs,- np.positiveand- np.negative.
- Added partial support for the dataframe interchange protocol method - DataFrame.__dataframe__().
Bug fixes¶
- Fixed a bug in - df.locwhere setting a single column from a series results in unexpected- Nonevalues.
Improvements¶
- Use UNPIVOT INCLUDE NULLS for unpivot operations in pandas instead of sentinel values. 
- Improved documentation for - pd.read_excel.
Version 1.25.0 (2024-11-13)¶
New features¶
- Added the following new functions in - snowflake.snowpark.dataframe:- map
 
Improvements¶
- When target stage is not set in profiler, a default stage from - Session.get_session_stageis used instead of raising- SnowparkSQLException.
- Allowed lower case or mixed case input when calling - Session.stored_procedure_profiler.set_active_profiler.
- Added distributed tracing using open telemetry APIs for action function in - DataFrame:- cache_result
 
- Removed opentelemetry warning from logging. 
Bug fixes¶
- Fixed the pre-action and post-action query propagation when - Inexpressions were used in selects.
- Fixed a bug that raised error - AttributeErrorwhile calling- Session.stored_procedure_profiler.get_outputwhen- Session.stored_procedure_profileris disabled.
Dependency updates¶
- Added a dependency on - protobuf>=5.28and- tzlocalat runtime.
- Added a dependency on - protoc-wheel-0for the development profile.
- Require - snowflake-connector-python>=3.12.0, <4.0.0(was- >=3.10.0).
Snowpark pandas API updates¶
New features¶
- Added support for - Index.to_numpy.
- Added support for - DataFrame.alignand- Series.alignfor- axis=0.
- Added support for - snowflake.snowpark.functions.window
- Added support for - pd.read_pickle(Uses native pandas for processing).
- Added support for - pd.read_html(Uses native pandas for processing).
- Added support for - pd.read_xml(Uses native pandas for processing).
- Added support for aggregation functions - "size"and- lenin- GroupBy.aggregate,- DataFrame.aggregate, and- Series.aggregate.
- Added support for list values in - Series.str.len.
Bug fixes¶
- Fixed a bug where aggregating a single-column dataframe with a single callable function (e.g. - pd.DataFrame([0]).agg(np.mean)) would fail to transpose the result.
- Fixed bugs where - DataFrame.dropna()would:- Treat an empty - subset(e.g.- []) as if it specified all columns instead of no columns.
- Raise a - TypeErrorfor a scalar- subsetinstead of filtering on just that column.
- Raise a - ValueErrorfor a- subsetof type- pandas.Indexinstead of filtering on the columns in the index.
 
- Creation of scoped read-only tables to mitigate - TableNotFoundErrorwhen using dynamic pivot in a notebook environment.
- Fixed a bug when concat dataframe or series objects are coming from the same dataframe when axis = 1. 
Improvements¶
- Improve - np.wherewith scalar x value by eliminating unnecessary join and temp table creation.
- Improve - get_dummiesperformance by flattening the pivot with join.
Snowpark local testing updates¶
New features¶
- Added support for patching functions that are unavailable in the - snowflake.snowpark.functionsmodule.
- Added support for - snowflake.snowpark.functions.any_value
Bug fixes¶
- Fixed a bug where - Table.updatecould not handle- VariantType,- MapType, and- ArrayTypedata types.
- Fixed a bug where column aliases were incorrectly resolved in - DataFrame.join, causing errors when selecting columns from a joined DataFrame.
- Fixed a bug where - Table.updateand- Table.mergecould fail if the target table’s index was not the default- RangeIndex.
Version 1.24.0 (2024-10-28)¶
New features¶
- Updated - Sessionclass to be thread-safe. This allows concurrent DataFrame transformations, DataFrame actions, UDF and stored procedure registration, and concurrent file uploads when using the same- Sessionobject.- The feature is disabled by default and can be enabled by setting - FEATURE_THREAD_SAFE_PYTHON_SESSIONto- Truefor account.
- Updating session configurations, like changing database or schema, when multiple threads are using the session may lead to unexpected behavior. 
- When enabled, some internally created temporary table names returned from - DataFrame.queriesAPI are not deterministic, and may be different when DataFrame actions are executed. This does not affect explicit user-created temporary tables.
 
- Added support for ‘Service’ domain to - session.lineage.traceAPI.
- Added support for the following methods in - DataFrameWriterto support daisy-chaining:- option
- options
- partition_by
 
- Added support for - snowflake_cortex_summarize.
Improvements¶
- Improved the following new capability for function - snowflake.snowpark.functions.array_removeso that it is now possible to use in python.
- Disables SQL simplification when sort is performed after limit. - Previously, - df.sort().limit()and- df.limit().sort()generated the same query with sort in front of limit. Now,- df.limit().sort()generates a query that reads- df.limit().sort().
- Improve performance of generated queries for - df.limit().sort()because limit stops table scanning as soon as the number of records is satisfied.
 
Bug fixes¶
- Fixed a bug where the automatic cleanup of temporary tables could interfere with the results of async query execution. 
- Fixed a bug in - DataFrame.analytics.time_series_aggfunction to handle multiple data points in same sliding interval.
- Fixed a bug that created inconsistent casing in field names of structured objects in iceberg schemas. 
Deprecations¶
- As Python 3.8 has reached its End of Life, deprecation warnings will be triggered when using snowpark-python with Python 3.8. For more details, see Snowflake Python Runtime Support. 
- Snowpark 1.24.0 is the last client and server version to support Python 3.8, in accordance with Anaconda’s policy. Upgrade your existing Python 3.8 objects to Python 3.9 or greater. 
Snowpark pandas API updates¶
New features¶
- Added support for - np.subtract,- np.multiply,- np.divide, and- np.true_divide.
- Added support for tracking usages of - __array_ufunc__.
- Added numpy compatibility support for - np.float_power,- np.mod,- np.remainder,- np.greater,- np.greater_equal,- np.less,- np.less_equal,- np.not_equal, and- np.equal.
- Added numpy compatibility support for - np.log,- np.log2, and- np.log10
- Added support for - DataFrameGroupBy.bfill,- SeriesGroupBy.bfill,- DataFrameGroupBy.ffill, and- SeriesGroupBy.ffill.
- Added support for - onparameter with- Resampler.
- Added support for timedelta inputs in - value_counts().
- Added support for applying Snowpark Python function - snowflake_cortex_summarize.
- Added support for - DataFrame.attrsand- Series.attrs.
- Added support for - DataFrame.style.
- Added numpy compatibility support for - np.full_like
Improvements¶
- Improved generated SQL query for - headand- ilocwhen the row key is a slice.
- Improved error message when passing an unknown timezone to - tz_convertand- tz_localizein- Series,- DataFrame,- Series.dt, and- DatetimeIndex.
- Improved documentation for - tz_convertand- tz_localizein- Series,- DataFrame,- Series.dt, and- DatetimeIndexto specify the supported timezone formats.
- Added additional kwargs support for - df.applyand- series.apply( as well as- mapand- applymap) when using snowpark functions. This allows for some position independent compatibility between apply and functions where the first argument is not a pandas object.
- Improved generated SQL query for - ilocand- iatwhen the row key is a scalar.
- Removed all joins in - iterrows.
- Improved documentation for - Series.mapto reflect the unsupported features.
- Added support for - np.may_share_memorywhich is used internally by many scikit-learn functions. This method will always return false when called with a Snowpark pandas object.
Bug Fixes¶
- Fixed a bug where - DataFrameand- Series- pct_change()would raise- TypeErrorwhen input contained timedelta columns.
- Fixed a bug where - replace()would sometimes propagate- Timedeltatypes incorrectly through- replace(). Instead raise- NotImplementedErrorfor- replace()on- Timedelta.
- Fixed a bug where - DataFrameand- Series- round()would raise- AssertionErrorfor- Timedeltacolumns. Instead raise- NotImplementedErrorfor- round()on- Timedelta.
- Fixed a bug where - reindexfails when the new index is a Series with non-overlapping types from the original index.
- Fixed a bug where calling - __getitem__on a DataFrameGroupBy object always returned a DataFrameGroupBy object if- as_index=False.
- Fixed a bug where inserting timedelta values into an existing column would silently convert the values to integers instead of raising - NotImplementedError.
- Fixed a bug where - DataFrame.shift()on axis=0 and axis=1 would fail to propagate timedelta types.
- DataFrame.abs(),- DataFrame.__neg__(),- DataFrame.stack(), and- DataFrame.unstack()now raise- NotImplementedErrorfor timedelta inputs instead of failing to propagate timedelta types.
Snowpark local testing updates¶
Bug fixes¶
- Fixed a bug where - DataFrame.aliasraises- KeyErrorfor input column name.
- Fixed a bug where - to_csvon Snowflake stage fails when data contains empty strings.
Version 1.23.0 (2024-10-09)¶
New features¶
- Added the following new functions in - snowflake.snowpark.functions:- make_interval
 
- Added support for using Snowflake Interval constants with - Window.range_between()when the order by column is TIMESTAMP or DATE type.
- Added support for file writes. This feature is currently in private preview. 
- Added - thread_idto- QueryRecordto track the thread id submitting the query history.
- Added support for - Session.stored_procedure_profiler.
Bug fixes¶
- Fixed a bug where registering a stored procedure or UDxF with type hints would give a warning - NoneTypehas no- len()when trying to read default values from function.
Snowpark pandas API updates¶
New features¶
- Added support for - TimedeltaIndex.meanmethod.
- Added support for some cases of aggregating - Timedeltacolumns on- axis=0with- aggor- aggregate.
- Added support for - by,- left_by,- right_by,- left_index, and- right_indexfor- pd.merge_asof.
- Added support for passing parameter - include_describeto- Session.query_history.
- Added support for - DatetimeIndex.meanand- DatetimeIndex.stdmethods.
- Added support for - Resampler.asfreq,- Resampler.indices,- Resampler.nunique, and- Resampler.quantile.
- Added support for - resamplefrequency- W,- ME,- YEwith- closed = "left".
- Added support for - DataFrame.rolling.corrand- Series.rolling.corrfor- pairwise = Falseand int- window.
- Added support for string time-based - windowand- min_periods = Nonefor- Rolling.
- Added support for - DataFrameGroupBy.fillnaand- SeriesGroupBy.fillna.
- Added support for constructing - Seriesand- DataFrameobjects with the lazy- Indexobject as- data,- index, and- columnsarguments.
- Added support for constructing - Seriesand- DataFrameobjects with- indexand- columnvalues not present in- DataFrame/- Series- data.
- Added support for - pd.read_sas(Uses native pandas for processing).
- Added support for applying - rolling().count()and- expanding().count()to- Timedeltaseries and columns.
- Added support for - tzin both- pd.date_rangeand- pd.bdate_range.
- Added support for - Series.items.
- Added support for - errors="ignore"in- pd.to_datetime.
- Added support for - DataFrame.tz_localizeand- Series.tz_localize.
- Added support for - DataFrame.tz_convertand- Series.tz_convert.
- Added support for applying Snowpark Python functions (e.g., - sin) in- Series.map,- Series.apply,- DataFrame.applyand- DataFrame.applymap.
Improvements¶
- Improved - to_pandasto persist the original timezone offset for TIMESTAMP_TZ type.
- Improved - dtyperesults for TIMESTAMP_TZ type to show correct timezone offset.
- Improved - dtyperesults for TIMESTAMP_LTZ type to show correct timezone.
- Improved error message when passing non-bool value to - numeric_onlyfor groupby aggregations.
- Removed unnecessary warning about sort algorithm in - sort_values.
- Use SCOPED object for internal create temp tables. The SCOPED objects will be stored sproc scoped if created within stored sproc, otherwise will be session scoped, and the object will be automatically cleaned at the end of the scope. 
- Improved warning messages for operations that lead to materialization with inadvertent slowness. 
- Removed unnecessary warning message about - convert_dtypein- Series.apply.
Bug fixes¶
- Fixed a bug where an - Indexobject created from a- Series/- DataFrameincorrectly updates the- Series/- DataFrame’s index name after an inplace update has been applied to the original- Series/- DataFrame.
- Suppressed an unhelpful - SettingWithCopyWarningthat sometimes appeared when printing- Timedeltacolumns.
- Fixed - inplaceargument for- Seriesobjects derived from other- Seriesobjects.
- Fixed a bug where - Series.sort_valuesfailed if series name overlapped with index column name.
- Fixed a bug where transposing a dataframe would map - Timedeltaindex levels to integer column levels.
- Fixed a bug where - Resamplermethods on timedelta columns would produce integer results.
- Fixed a bug where - pd.to_numeric()would leave- Timedeltainputs as- Timedeltainstead of converting them to integers.
- Fixed - locset when setting a single row, or multiple rows, of a DataFrame with a Series value.
Version 1.22.1 (2024-09-11)¶
- This is a re-release of 1.22.0. Please refer to the 1.22.0 release notes for detailed release content. 
Version 1.22.0 (2024-09-10)¶
New features¶
- Added the following new functions in - snowflake.snowpark.functions:- array_remove
- ln
 
Improvements¶
- Improved documentation for - Session.write_pandasby making the- use_logical_typeoption more explicit.
- Added support for specifying the following to - DataFrameWriter.save_as_table:- enable_schema_evolution
- data_retention_time
- max_data_extension_time
- change_tracking
- copy_grants
- iceberg_config- A dictionary that can hold the following iceberg configuration options:- external_volume
- catalog
- base_location
- catalog_sync
- storage_serialization_policy
 
 
- Added support for specifying the following to - DataFrameWriter.copy_into_table:- iceberg_config- A dictionary that can hold the following iceberg configuration options:- external_volume
- catalog
- base_location
- catalog_sync
- storage_serialization_policy
 
 
- Added support for specifying the following parameters to - DataFrame.create_or_replace_dynamic_table:- mode
- refresh_mode
- initialize
- clustering_keys
- is_transient
- data_retention_time
- max_data_extension_time
 
Bug fixes¶
- Fixed a bug in - session.read.csvthat caused an error when setting- PARSE_HEADER = Truein an externally defined file format.
- Fixed a bug in query generation from set operations that allowed generation of duplicate queries when children have common subqueries. 
- Fixed a bug in - session.get_session_stagethat referenced a non-existing stage after switching database or schema.
- Fixed a bug where calling - DataFrame.to_snowpark_pandaswithout explicitly initializing the Snowpark pandas plugin caused an error.
- Fixed a bug where using the - explodefunction in dynamic table creation caused a SQL compilation error due to improper boolean type casting on the- outerparameter.
Snowpark local testing updates¶
New features¶
- Added support for type coercion when passing columns as input to UDF calls. 
- Added support for - Index.identical.
Bug fixes¶
- Fixed a bug where the truncate mode in - DataFrameWriter.save_as_tableincorrectly handled DataFrames containing only a subset of columns from the existing table.
- Fixed a bug where function - to_timestampdoes not set the default timezone of the column datatype.
Snowpark pandas API updates¶
New features¶
- Added limited support for the - Timedeltatype, including the following features. Snowpark pandas will raise- NotImplementedErrorfor unsupported- Timedeltause cases.- support for tracking the - Timedeltatype through- copy,- cache_result,- shift,- sort_index,- assign,- bfill,- ffill,- fillna,- compare,- diff,- drop,- dropna,- duplicated,- empty,- equals,- insert,- isin,- isna,- items,- iterrows,- join,- len,- mask,- melt,- merge,- nlargest,- nsmallest,- to_pandas.
- support for converting non-timedelta to timedelta via - astype.
- NotImplementedErrorwill be raised for the rest of methods that do not support- Timedelta.
- support for subtracting two timestamps to get a - Timedelta.
- support indexing with - Timedeltadata columns.
- support for adding or subtracting timestamps and - Timedelta.
- support for binary arithmetic between two - Timedeltavalues.
- support for binary arithmetic and comparisons between - Timedeltavalues and numeric values.
- support for lazy - TimedeltaIndex.
- support for - pd.to_timedelta.
- support for - GroupByaggregations- min,- max,- mean,- idxmax,- idxmin,- std,- sum,- median,- count,- any,- all,- size,- nunique,- head,- tail,- aggregate.
- support for - GroupByfiltrations- firstand- last.
- support for - TimedeltaIndexattributes:- days,- seconds,- microsecondsand- nanoseconds.
- support for - diffwith timestamp columns on- axis=0and- axis=1.
- support for - TimedeltaIndexmethods:- ceil,- floorand- round.
- support for - TimedeltaIndex.total_secondsmethod.
 
- Added support for index’s arithmetic and comparison operators. 
- Added support for - Series.dt.round.
- Added documentation pages for - DatetimeIndex.
- Added support for - Index.name,- Index.names,- Index.rename, and- Index.set_names.
- Added support for - Index.__repr__.
- Added support for - DatetimeIndex.month_nameand- DatetimeIndex.day_name.
- Added support for - Series.dt.weekday,- Series.dt.time, and- DatetimeIndex.time.
- Added support for - Index.minand- Index.max.
- Added support for - pd.merge_asof.
- Added support for - Series.dt.normalizeand- DatetimeIndex.normalize.
- Added support for - Index.is_boolean,- Index.is_integer,- Index.is_floating,- Index.is_numeric, and- Index.is_object.
- Added support for - DatetimeIndex.round,- DatetimeIndex.floorand- DatetimeIndex.ceil.
- Added support for - Series.dt.days_in_monthand- Series.dt.daysinmonth.
- Added support for - DataFrameGroupBy.value_countsand- SeriesGroupBy.value_counts.
- Added support for - Series.is_monotonic_increasingand- Series.is_monotonic_decreasing.
- Added support for - Index.is_monotonic_increasingand- Index.is_monotonic_decreasing.
- Added support for - pd.crosstab.
- Added support for - pd.bdate_rangeand included business frequency support (B, BME, BMS, BQE, BQS, BYE, BYS) for both- pd.date_rangeand- pd.bdate_range.
- Added support for lazy - Indexobjects as- labelsin- DataFrame.reindexand- Series.reindex.
- Added support for - Series.dt.days,- Series.dt.seconds,- Series.dt.microseconds, and- Series.dt.nanoseconds.
- Added support for creating a - DatetimeIndexfrom an- Indexof numeric or string type.
- Added support for string indexing with - Timedeltaobjects.
- Added support for - Series.dt.total_secondsmethod.
Improvements¶
- Improved concat and join performance when operations are performed on a series coming from the same DataFrame by avoiding unnecessary joins. 
- Refactored - quoted_identifier_to_snowflake_typeto avoid making metadata queries if the types have been cached locally.
- Improved - pd.to_datetimeto handle all local input cases.
- Create a lazy index from another lazy index without pulling data to client. 
- Raised - NotImplementedErrorfor Index bitwise operators.
- Display a more clear error message when - Index.namesis set to a non-list-like object.
- Raise a warning whenever - MultiIndexvalues are pulled in locally.
- Improved warning message for - pd.read_snowflaketo include the creation reason when temp table creation is triggered.
- Improved performance for - DataFrame.set_index, or setting- DataFrame.indexor- Series.indexby avoiding checks that require eager evaluation. As a consequence, when the new index that does not match the current- Seriesor- DataFrameobject length, a- ValueErroris no longer raised. Instead, when the- Seriesor- DataFrameobject is longer than the provided index, the new index of the- Seriesor- DataFrameis filled with- NaNvalues for the “extra” elements. Otherwise, the extra values in the provided index are ignored.
Bug fixes¶
- Stopped ignoring nanoseconds in - pd.Timedeltascalars.
- Fixed - AssertionErrorin tree of binary operations.
- Fixed bug in - Series.dt.isocalendarusing a named Series
- Fixed - inplaceargument for Series objects derived from DataFrame columns.
- Fixed a bug where - Series.reindexand- DataFrame.reindexdid not update the result index’s name correctly.
- Fixed a bug where - Series.takedid not give an error when- axis=1was specified.
Version 1.21.1 (2024-09-05)¶
Bug fixes¶
- Fixed a bug where using - to_pandas_batcheswith async jobs caused an error due to improper handling of waiting for asynchronous query completion.
Version 1.21.0 (2024-08-19)¶
New features¶
- Added support for - snowflake.snowpark.testing.assert_dataframe_equal, which is a utility function to check the equality of two Snowpark DataFrames.
Improvements¶
- Added support for server-side string size limitations. 
- Added support for creating and invoking stored procedures, UDFs and UDTFs with optional arguments. 
- Added support for column lineage in the - DataFrame.lineage.traceAPI.
- Added support for passing - INFER_SCHEMAoptions to- DataFrameReadervia- INFER_SCHEMA_OPTIONS.
- Added support for passing - parametersparameter to- Column.rlikeand- Column.regexp.
- Added support for automatically cleaning up temporary tables created by - df.cache_result()in the current session when the DataFrame is no longer referenced (i.e., gets garbage collected). It is still an experimental feature and not enabled by default. It can be enabled by setting- session.auto_clean_up_temp_table_enabledto- True.
- Added support for string literals to the - fmtparameter of- snowflake.snowpark.functions.to_date.
Bug fixes¶
- Fixed a bug where the SQL generated for selecting - *column has an incorrect subquery.
- Fixed a bug in - DataFrame.to_pandas_batcheswhere the iterator could throw an error if a certain transformation is made to the pandas DataFrame due to the wrong isolation level.
- Fixed a bug in - DataFrame.lineage.traceto split the quoted feature view’s name and version correctly.
- Fixed a bug in - Column.isinthat caused invalid SQL generation when passed an empty list.
- Fixed a bug that fails to raise - NotImplementedErrorwhile setting a cell with a list-like item.
Snowpark local testing updates¶
New features¶
- Added support for the following APIs: - snowflake.snowpark.functions- rank
- dense_rank
- percent_rank
- cume_dist
- ntile
- datediff
- array_agg
 
- snowflake.snowpark.column.Column.within_group
 
- Added support for parsing flags in Regex statements for mocked plans. This maintains parity with the - rlikeand- regexpchanges above.
Bug fixes¶
- Fixed a bug where the window functions - LEADand- LAGdo not handle the option- ignore_nullsproperly.
- Fixed a bug where values were not populated into the result DataFrame during the insertion of a table merge operation. 
Improvements¶
- Fix pandas - FutureWarningabout integer indexing.
Snowpark pandas API updates¶
New features¶
- Added support for - DataFrame.backfill,- DataFrame.bfill,- Series.backfill, and- Series.bfill.
- Added support for - DataFrame.compareand- Series.comparewith default parameters.
- Added support for - Series.dt.microsecondand- Series.dt.nanosecond.
- Added support for - Index.is_uniqueand- Index.has_duplicates.
- Added support for - Index.equals.
- Added support for - Index.value_counts.
- Added support for - Series.dt.day_nameand- Series.dt.month_name.
- Added support for indexing on Index, e.g., - df.index[:10].
- Added support for - DataFrame.unstackand- Series.unstack.
- Added support for - DataFrame.asfreqand- Series.asfreq.
- Added support for - Series.dt.is_month_startand- Series.dt.is_month_end.
- Added support for - Index.alland- Index.any.
- Added support for - Series.dt.is_year_startand- Series.dt.is_year_end.
- Added support for - Series.dt.is_quarter_startand- Series.dt.is_quarter_end.
- Added support for lazy - DatetimeIndex.
- Added support for - Series.argmaxand- Series.argmin.
- Added support for - Series.dt.is_leap_year.
- Added support for - DataFrame.items.
- Added support for - Series.dt.floorand- Series.dt.ceil.
- Added support for - Index.reindex.
- Added support for - DatetimeIndexproperties:- year,- month,- day,- hour,- minute,- second,- microsecond,- nanosecond,- date,- dayofyear,- day_of_year,- dayofweek,- day_of_week,- weekday,- quarter,- is_month_start,- is_month_end,- is_quarter_start,- is_quarter_end,- is_year_start,- is_year_endand- is_leap_year.
- Added support for - Resampler.fillnaand- Resampler.bfill.
- Added limited support for the - Timedeltatype, including creating- Timedeltacolumns and- to_pandas.
- Added support for - Index.argmaxand- Index.argmin.
Improvements¶
- Removed the public preview warning message when importing Snowpark pandas. 
- Removed unnecessary count query from the - SnowflakeQueryCompiler.is_series_likemethod.
- Dataframe.columnsnow returns a native pandas Index object instead of a Snowpark Index object.
- Refactored and introduced - query_compilerargument in- Indexconstructor to create- Indexfrom query compiler.
- pd.to_datetimenow returns a- DatetimeIndexobject instead of a- Seriesobject.
- pd.date_rangenow returns a- DatetimeIndexobject instead of a- Seriesobject.
Bug fixes¶
- Made passing an unsupported aggregation function to - pivot_tableraise- NotImplementedErrorinstead of- KeyError.
- Removed axis labels and callable names from error messages and telemetry about unsupported aggregations. 
- Fixed - AssertionErrorin- Series.drop_duplicatesand- DataFrame.drop_duplicateswhen called after- sort_values.
- Fixed a bug in - Index.to_framewhere the result frame’s column name may be wrong where name is unspecified.
- Fixed a bug where some Index docstrings are ignored. 
- Fixed a bug in - Series.reset_index(drop=True)where the result name may be wrong.
- Fixed a bug in - Groupby.first/lastordering by the correct columns in the underlying window expression.
Version 1.20.0 (2024-07-17)¶
Version 1.20.0 of the Snowpark Library for Python introduces some new features.
New features¶
- Added distributed tracing using open telemetry APIs for table stored procedure functions in - DataFrame:- _execute_and_get_query_id
 
- Added support for the - arrays_zipfunction.
- Improved performance for binary column expressions and - df._inby avoiding unnecessary casts for numeric values. You can enable this optimization by setting- session.eliminate_numeric_sql_value_cast_enabled = True.
- Improved error messages for - write_pandaswhen the target table does not exist and- auto_create_table=False.
- Added open telemetry tracing on UDxF functions in Snowpark. 
- Added open telemetry tracing on stored procedure registration in Snowpark. 
- Added a new optional parameter called - format_jsonto the- Session.SessionBuilder.app_namefunction that sets the app name in the- Session.query_tagin JSON format. By default, this parameter is set to- False.
Bug fixes¶
- Fixed a bug where the SQL generated for - lag(x, 0)was incorrect and failed with the error message- argument 1 to function LAG needs to be constant, found 'SYSTEM$NULL_TO_FIXED(null)'.
Snowpark local testing updates¶
New features¶
- Added support for the following APIs: - snowflake.snowpark.functions- random
 
 
- Added new parameters to the - patchfunction when registering a mocked function:- distinctallows an alternate function to be specified for when a SQL function should be distinct.
- pass_column_indexpasses a named parameter,- column_index, to the mocked function that contains the- pandas.Indexfor the input data.
- pass_row_indexpasses a named parameter,- row_index, to the mocked function that is the 0-indexed row number on which the function is currently operating.
- pass_input_datapasses a named parameter,- input_data, to the mocked function that contains the entire input dataframe for the current expression.
- Added support for the - column_orderparameter in the- DataFrameWriter.save_as_tablemethod.
 
Bug fixes¶
- Fixed a bug that caused - DecimalTypecolumns to be incorrectly truncated to integer precision when used in- BinaryExpressions.
Snowpark pandas API Updates¶
New features¶
- Added new API support for the following: - DataFrames - DataFrame.nlargestand- DataFrame.nsmallest
- DataFrame.assign
- DataFrame.stack
- DataFrame.pivot
- DataFrame.to_csv
- DataFrame.corr
- DataFrame.corr
- DataFrame.equals
- DataFrame.reindex
- DataFrame.atand- DataFrame.iat
 
- Series - Series.nlargestand- Series.nsmallest
- Series.atand- Series.iat
- Series.dt.isocalendar
- Series.equals
- Series.reindex
- Series.to_csv
- Series.case_whenexcept when condition or replacement is callable
- series.plot()with data materialized the data to the local client
 
- GroupBy - DataFrameGroupBy.alland- DataFrameGroupBy.any
- DataFrameGroupByand- SeriesGroupByaggregations- firstand- last
- DataFrameGroupBy.get_group
- SeriesGroupBy.alland- SeriesGroupBy.any
 
- General - pd.pivot
- read_excel(Uses local pandas for processing)
- df.plot()with data materialized the data to the local client
 
 
- Extended existing APIs as follows: - Added support for - replaceand- frac > 1in- DataFrame.sampleand- Series.sample.
- Added partial support for - Series.str.translatewhere the values in the- tableare single-codepoint strings.
- Added support for - limitparameter when- methodparameter is used in- fillna.
 
- Added documentation pages for - Indexand its APIs.
Bug fixes¶
- Fixed an issue when using np.where and df.where when the scalar - otheris the literal 0.
- Fixed a bug regarding precision loss when converting to Snowpark pandas - DataFrameor- Serieswith- dtype=np.uint64.
- Fixed a bug where - valuesis set to- indexwhen- indexand- columnscontain all columns in DataFrame during- pivot_table.
Improvements¶
- Added support for - Index.copy().
- Added support for Index APIs: - dtype,- values,- item(),- tolist(),- to_series()and- to_frame().
- Expand support for DataFrames with no rows in - pd.pivot_tableand- DataFrame.pivot_table.
- Added support for - inplaceparameter in- DataFrame.sort_indexand- Series.sort_index.
Version 1.19.0 (2024-06-25)¶
Version 1.19.0 of the Snowpark Library for Python introduces some new features.
New features¶
- Added support for the - to_booleanfunction.
- Added documentation pages for - Indexand its APIs.
Bug fixes¶
- Fixed a bug where Python stored procedures with tables return type fails when run in a task. 
- Fixed a bug where - df.dropnafails due to- RecursionError: maximum recursion depth exceededwhen the DataFrame has more than 500 columns.
- Fixed a bug where - AsyncJob.result("no_result")doesn’t wait for the query to finish execution.
Local testing updates¶
New features¶
- Added support for the - strictparameter when registering UDFs and Stored Procedures.
Bug fixes¶
- Fixed a bug in - convert_timezonethat made setting the- source_timezoneparameter return an error.
- Fixed a bug where creating a DataFrame with empty data of type - DateTyperaises- AttributeError.
- Fixed a bug where table merge fails when an update clause exists but no update takes place. 
- Fixed a bug in the mock implementation of - to_charthat raises- IndexErrorwhen an incoming column has a nonconsecutive row index.
- Fixed a bug in handling - CaseExprexpressions that raises- IndexErrorwhen an incoming column has a nonconsecutive row index.
- Fixed a bug in the implementation of - Column.likethat raises- IndexErrorwhen an incoming column has a nonconsecutive row index.
Improvements¶
- Added support for type coercion in the implementation of - DataFrame.replace,- DataFrame.dropna, and the mock function- iff.
Snowpark pandas API updates¶
New features¶
- Added partial support for - DataFrame.pct_changeand- Series.pct_changewithout the- freqand- limitparameters.
- Added support for - Series.str.get.
- Added support for - Series.dt.dayofweek,- Series.dt.day_of_week,- Series.dt.dayofyear, and- Series.dt.day_of_year.
- Added support for - Series.str.__getitem__ (Series.str[...]).
- Added support for - Series.str.lstripand- Series.str.rstrip.
- Added support for - DataFrameGroupby.sizeand- SeriesGroupby.size.
- Added support for - DataFrame.expandingand- Series.expandingfor aggregations- count,- sum,- min,- max,- mean,- std, and- varwith- axis=0.
- Added support for - DataFrame.rollingand- Series.rollingfor aggregation count with- axis=0.
- Added support for - Series.str.match.
- Added support for - DataFrame.resampleand- Series.resamplefor aggregation size.
Bug fixes¶
- Fixed a bug that causes output of - GroupBy.aggregatecolumns to be ordered incorrectly.
- Fixed a bug where calling - DataFrame.describeon a frame with duplicate columns of differing- dtypescould cause an error or incorrect results.
- Fixed a bug in - DataFrame.rollingand- Series.rollingso- window=0now throws- NotImplementedErrorinstead of- ValueError
Improvements¶
- Added support for named aggregations in - DataFrame.aggregateand- Series.aggregatewith- axis=0.
- pd.read_csvreads using the native pandas CSV parser, then uploads data to Snowflake using parquet. This enables most of the parameters supported by- read_csv, including date parsing and numeric conversions. Uploading via parquet is roughly twice as fast as uploading via CSV.
- Initial work to support a - pd.Indexdirectly in Snowpark pandas. Support for- pd.Indexas a first-class component of Snowpark pandas is under active development.
- Added a lazy index constructor and support for - len,- shape,- size,- empty,- to_pandas(), and- names. For- df.index, Snowpark pandas creates a lazy index object.
- For - df.columns, Snowpark pandas supports a non-lazy version of an- Indexas the data is already stored locally.
Version 1.18.0 (2024-05-28)¶
Version 1.18.0 of the Snowpark library introduces some new features.
New features¶
- Added the - DataFrame.cache_resultand- Series.cache_resultmethods for users to persist- DataFrameand- Seriesobjects to a temporary table for the duration of a session to improve latency of subsequent operations.
Improvements¶
- Added support for - DataFrame.pivot_tablewith no- indexparameter and with the- marginsparameter.
- Updated the signature of - DataFrame.shift,- Series.shift,- DataFrameGroupBy.shift, and- SeriesGroupBy.shiftto match pandas 2.2.1. Snowpark pandas does not yet support the newly-added suffix argument or sequence values of periods.
- Re-added support for - Series.str.split.
Bug fixes¶
- Fixed an issue with mixed columns for string methods ( - Series.str.*).
Local testing updates¶
New features¶
- Added support for the following - DataFrameReaderread options to file formats CSV and JSON:- PURGE 
- PATTERN 
- INFER_SCHEMA with value - False
- ENCODING with value - UTF8
 
- Added support for - DataFrame.analytics.moving_aggand- DataFrame.analytics.cumulative_agg_agg.
- Added support for the - if_not_existsparameter during UDF and stored procedure registration.
Bug fixes¶
- Fixed a bug with processing time formats where the fractional second part was not handled properly. 
- Fixed a bug that caused function calls on - *to fail.
- Fixed a bug that prevented the creation of - mapand- structtype objects.
- Fixed a bug where the function - date_addwas unable to handle some numeric types.
- Fixed a bug where - TimestampTypecasting resulted in incorrect data.
- Fixed a bug that caused - DecimalTypedata to have incorrect precision in some cases.
- Fixed a bug where referencing a missing table or view raised an - IndexError.
- Fixed a bug where the mocked function - to_timestamp_ntzcould not handle- Nonedata.
- Fixed a bug where mocked UDFs handled output data of - Noneimproperly.
- Fixed a bug where - DataFrame.with_column_renamedignored attributes from parent- DataFramesafter join operations.
- Fixed a bug where the integer precision of large values was lost when converted to a pandas - DataFrame.
- Fixed a bug where the schema of a - datetimeobject was wrong when creating a- DataFramefrom a pandas- DataFrame.
- Fixed a bug in the implementation of - Column.equal_nanwhere null data was handled incorrectly.
- Fixed a bug where - DataFrame.dropignored attributes from parent- DataFramesafter join operations.
- Fixed a bug in mocked function - date_partwhere column type was set incorrectly.
- Fixed a bug where - DataFrameWriter.save_as_tabledid not raise exceptions when inserting null data into non-nullable columns.
- Fixed a bug in the implementation of - DataFrameWriter.save_as_tablewhere:- Append or truncate failed when incoming data had a different schema than the existing table. 
- Truncate failed when incoming data did not specify columns that are nullable. 
 
Improvements¶
- Removed the dependency check for - pyarrowbecause it is not used.
- Improved the target type coverage of - Column.cast, adding support for casting to boolean and all integral types.
- Aligned the error experience when calling UDFs and stored procedures. 
- Added appropriate error messages for the - is_permanentand- anonymousoptions in UDFs and stored procedures registration to make it clearer that those features are not yet supported.
- File read operations with unsupported options and values now raise - NotImplementedErrorinstead of warnings and unclear error information.
Version 1.17.0 (2024-05-21)¶
Version 1.17.0 of the Snowpark library introduces some new features.
New features¶
- Added support to add a comment on tables and views using the functions listed below: - DataFrameWriter.save_as_table
- DataFrame.create_or_replace_view
- DataFrame.create_or_replace_temp_view
- DataFrame.create_or_replace_dynamic_table
 
Improvements¶
- Improved error message to remind users to set - {"infer_schema": True}when reading CSV file without specifying its schema.
Local testing updates¶
New features¶
- Added support for - NumericTypeand- VariantTypedata conversion in the mocked function- to_timestamp_ltz,- to_timestamp_ntz,- to_timestamp_tzand- to_timestamp.
- Added support for - DecimalType,- BinaryType,- ArrayType,- MapType,- TimestampType,- DateTypeand- TimeTypedata conversion in the mocked function- to_char.
- Added support for the following APIs: - snowflake.snowpark.functions.to_varchar
- snowflake.snowpark.DataFrame.pivot
- snowflake.snowpark.Session.cancel_all
 
- Introduced a new exception class - snowflake.snowpark.mock.exceptions.SnowparkLocalTestingException.
- Added support for casting to - FloatType.
Bug fixes¶
- Fixed a bug that stored procedures and UDFs should not remove imports already in the - sys.pathduring the clean-up step.
- Fixed a bug that when processing - datetimeformat, the fractional second part is not handled properly.
- Fixed a bug where file operations on the Windows platform were unable to properly handle file separators in directory names. 
- Fixed a bug that on the Windows platform that, when reading a pandas dataframe, an - IntervalTypecolumn with integer data can not be processed.
- Fixed a bug that prevented users from being able to select multiple columns with the same alias. 
- Fixed a bug where - Session.get_current_[schema|database|role|user|account|warehouse]returns uppercased identifiers when identifiers are quoted.
- Fixed a bug that function - substrand- substringcan not handle a zero-based- start_expr.
Improvements¶
- Standardized the error experience by raising - SnowparkLocalTestingExceptionin error cases, which is on par with the- SnowparkSQLExceptionraised in non-local execution.
- Improved the error experience of the - Session.write_pandasmethod so that- NotImplementErrorwill be raised when called.
- Aligned the error experience with reusing a closed session in non-local execution. 
Version 1.16.0 (2024-05-08)¶
Version 1.16.0 of the Snowpark library introduces some new features.
New features¶
- Added - snowflake.snowpark.Session.lineage.traceto explore data lineage of Snowflake objects.
- Added support for registering stored procedures with packages given as Python modules. 
- Added support for structured type schema parsing. 
Bug fixes¶
- Fixed a bug where, when inferring a schema, single quotes were added to stage files that already had single quotes. 
Local testing updates¶
New features¶
- Added support for - StringType,- TimestampTypeand- VariantTypedata conversion in the mocked function- to_date.
- Added support for the following APIs: - snowflake.snowpark.functions:- get
- concat
- concat_ws
 
 
Bug fixes¶
- Fixed a bug that caused - NaTand- NaNvalues to not be recognized.
- Fixed a bug where, when inferring a schema, single quotes were added to stage files that already had single quotes. 
- Fixed a bug where - DataFrameReader.csvwas unable to handle quoted values containing a delimiter.
- Fixed a bug that when there is a - Nonevalue in an arithmetic calculation, the output should remain- Noneinstead of- math.nan.
- Fixed a bug in function - sumand- covar_popthat when there is a- math.nanvalue in the data, the output should also be- math.nan.
- Fixed a bug where stage operations can not handle directories. 
- Fixed a bug that - DataFrame.to_pandasshould take Snowflake numeric types with precision 38 as- int64.
Version 1.15.0 (2024-04-24)¶
Version 1.15.0 of the Snowpark library introduces some new features.
New features¶
- Added - truncatesave mode in- DataFrameWriteto overwrite existing tables by truncating the underlying table instead of dropping it.
- Added telemetry to calculate query plan height and number of duplicate nodes during collect operations. 
- Added the functions below to unload data from a - DataFrameinto one or more files in a stage:- DataFrame.write.json
- DataFrame.write.csv
- DataFrame.write.parquet
 
- Added distributed tracing using open telemetry APIs for action functions in - DataFrameand- DataFrameWriter:- snowflake.snowpark.DataFrame:- collect
- collect_nowait
- to_pandas
- count
- show
 
- snowflake.snowpark.DataFrameWriter:- save_as_table
 
 
- Added support for - snow://URLs to- snowflake.snowpark.Session.file.getand- snowflake.snowpark.Session.file.get_stream
- Added support to register stored procedures and UDFs with a - comment.
- UDAF client support is ready for public preview. Please stay tuned for the Snowflake announcement of UDAF public preview. 
- Added support for dynamic pivot. This feature is currently in private preview. 
Improvements¶
- Improved the generated query performance for both compilation and execution by converting duplicate subqueries to Common Table Expressions (CTEs). It is still an experimental feature and it is not enabled by default. You can enable it by setting - session.cte_optimization_enabledto- True.
Bug fixes¶
- Fixed a bug where - statement_paramsis not passed to query executions that register stored procedures and user defined functions.
- Fixed a bug causing - snowflake.snowpark.Session.file.get_streamto fail for quoted stage locations.
- Fixed a bug that an internal type hint in - utils.pymight raise- AttributeErrorwhen the underlying module can not be found.
Local testing updates¶
New features¶
- Added support for registering UDFs and stored procedures. 
- Added support for the following APIs: - snowflake.snowpark.Session:- file.put
- file.put_stream
- file.get
- file.get_stream
- read.json
- add_import
- remove_import
- get_imports
- clear_imports
- add_packages
- add_requirements
- clear_packages
- remove_package
- udf.register
- udf.register_from_file
- sproc.register
- sproc.register_from_file
 
- snowflake.snowpark.functions- current_database
- current_session
- date_trunc
- object_construct
- object_construct_keep_null
- pow
- sqrt
- udf
- sproc
 
 
- Added support for - StringType,- TimestampTypeand- VariantTypedata conversion in the mocked function- to_time.
Bug fixes¶
- Fixed a bug that null filled columns for constant functions. 
- Fixed - to_object,- to_arrayand- to_binaryto better handle null inputs.
- Fixed a bug that timestamp data comparison can not handle years beyond 2262. 
- Fixed a bug that - Session.builder.getOrCreateshould return the created mock session.
Version 1.14.0 (2024-03-20)¶
Version 1.14.0 of the Snowpark library introduces some new features.
New features¶
- Added support for creating vectorized UDTFs with the - processmethod.
- Added support for dataframe functions: - to_timestamp_ltz
- to_timestamp_ntz
- to_timestamp_tz
- locate
 
- Added support for ASOF JOIN type. 
- Added support for the following local testing APIs: - snowflake.snowpark.functions: - to_double
- to_timestamp
- to_timestamp_ltz
- to_timestamp_ntz
- to_timestamp_tz
- greatest
- least
- convert_timezone
- dateadd
- date_part
 
- snowflake.snowpark.Session: - get_current_account
- get_current_warehouse
- get_current_role
- use_schema
- use_warehouse
- use_database
- use_role
 
 
Improvements¶
- Added telemetry to local testing. 
- Improved the error message of - DataFrameReaderto raise- FileNotFounderror when reading a path that does not exist or when there are no files under the path.
Bug fixes¶
- Fixed a bug in - SnowflakePlanBuilderwhere- save_as_tabledoes not correctly filter columns whose names start with- $and is followed by a number.
- Fixed a bug where statement parameters might have no effect when resolving imports and packages. 
- Fixed bugs in local testing: - LEFT ANTI and LEFT SEMI joins drop rows with null values. 
- DataFrameReader.csvincorrectly parses data when the optional parameter- field_optionally_enclosed_byis specified.
- Column.regexponly considers the first entry when- patternis a- Column.
- Table.updateraises- KeyErrorwhen updating null values in the rows.
- VARIANT columns raise errors at - DataFrame.collect.
- count_distinctdoes not work correctly when counting.
- Null values in integer columns raise - TypeError.
 
Version 1.13.0 (2024-02-26)¶
Version 1.13.0 of the Snowpark library introduces some new features.
New Features¶
- Added support for an optional - date_partargument in function- last_day.
- SessionBuilder.app_namewill set the- query_tagafter the session is created.
- Added support for the following local testing functions: - current_timestamp
- current_date
- current_time
- strip_null_value
- upper
- lower
- length
- initcap
 
Improvements¶
- Added cleanup logic at interpreter shutdown to close all active sessions. 
Bug fixes¶
- Fixed a bug in - DataFrame.to_local_iteratorwhere the iterator could yield wrong results if another query is executed before the iterator finishes due to wrong isolation level.
- Fixed a bug that truncated table names in error messages while running a plan with local testing enabled. 
- Fixed a bug that - Session.rangereturns empty result when the range is large.
Version 1.12.1 (2024-02-08)¶
Version 1.12.1 of the Snowpark library introduces some new features.
Improvements¶
- Use - split_blocks=Trueby default, during- to_pandasconversion, for optimal memory allocation. This parameter is passed to- pyarrow.Table.to_pandas, which enables- PyArrowto split the memory allocation into smaller, more manageable blocks instead of allocating a single contiguous block. This results in better memory management when dealing with larger datasets.
Bug fixes¶
- Fixed a bug in - DataFrame.to_pandasthat caused an error when evaluating on a Dataframe with an- IntegerTypecolumn with null values.
Version 1.12.0 (2024-01-29)¶
Version 1.12.0 of the Snowpark library introduces some new features.
Behavior Changes (API Compatible)¶
- When parsing data types during a - to_pandasoperation, we rely on GS precision value to fix precision issues for large integer values. This may affect users where a column that was earlier returned as- int8gets returned as- int64. Users can fix this by explicitly specifying precision values for their return column.
- Aligned behavior for - Session.callin case of table stored procedures where running- Session.callwould not trigger a stored procedure unless a- collect()operation was performed.
- StoredProcedureRegistrationnow automatically adds- snowflake-snowpark-pythonas a package dependency on the client’s local version of the library. An error is thrown if the server cannot support that version.
New features¶
- Exposed - statement_paramsin- StoredProcedure.__call__.
- Added two optional arguments to - Session.add_import:- chunk_size: The number of bytes to hash per chunk of the uploaded files.
- whole_file_hash: By default only the first chunk of the uploaded import is hashed to save time. When this is set to True each uploaded file is fully hashed instead.
 
- Added parameters - external_access_integrationsand- secretswhen creating a UDAF from Snowpark Python to allow integration with external access.
- Added a new method - Session.append_query_tag, which allows an additional tag to be added to the current query tag by appending it as a comma separated value.
- Added a new method - Session.update_query_tag, which allows updates to a JSON encoded dictionary query tag.
- SessionBuilder.getOrCreatewill now attempt to replace the singleton it returns when token expiration has been detected.
- Added the following functions in - snowflake.snowpark.functions:- array_except
- create_map
- sign/- signum
 
- Added the following functions to - DataFrame.analytics:- Added the - moving_aggfunction in- DataFrame.analyticsto enable moving aggregations like sums and averages with multiple window sizes.
- Added the - cumulative_aggfunction in- DataFrame.analyticsto enable moving aggregations like sums and averages with multiple window sizes.
 
Bug fixes¶
- Fixed a bug in - DataFrame.na.fillthat caused Boolean values to erroneously override integer values.
- Fixed a bug in - Session.create_dataframewhere the Snowpark DataFrames created using pandas DataFrames were not inferring the type for timestamp columns correctly. The behavior is as follows:- Earlier timestamp columns without a timezone would be converted to nanosecond epochs and inferred as - LongType(), but will now be correctly maintained as timestamp values and be inferred as- TimestampType(TimestampTimeZone.NTZ).
- Earlier timestamp columns with a timezone would be inferred as - TimestampType(TimestampTimeZone.NTZ)and loose timezone information but will now be correctly inferred as- TimestampType(TimestampTimeZone.LTZ)and timezone information is retained correctly.
- Set session parameter - PYTHON_SNOWPARK_USE_LOGICAL_TYPE_FOR_CREATE_DATAFRAMEto revert back to old behavior. Snowflake recommends that you update your code to align with correct behavior because the parameter will be removed in the future.
 
- Fixed a bug that - DataFrame.to_pandasgets decimal type when scale is not 0, and creates an object dtype in- pandas. Instead, we cast the value to a float64 type.
- Fixed bugs that wrongly flattened the generated SQL when one of the following happens: - DataFrame.filter()is called after- DataFrame.sort().limit().
- DataFrame.sort()or- filter()is called on a DataFrame that already has a window function or sequence-dependent data generator column. For instance,- df.select("a", seq1().alias("b")).select("a", "b").sort("a")won’t flatten the sort clause anymore.
- A window or sequence-dependent data generator column is used after - DataFrame.limit(). For instance,- df.limit(10).select(row_number().over())won’t flatten the limit and select in the generated SQL.
 
- Fixed a bug where aliasing a DataFrame column raised an error when the DataFrame was copied from another DataFrame with an aliased column. For instance, - df = df.select(col("a").alias("b")) df = copy(df) df.select(col("b").alias("c")) # Threw an error. Now it's fixed. 
- Fixed a bug in - Session.create_dataframethat the non-nullable field in a schema is not respected for Boolean type. Note that this fix is only effective when the user has the privilege to create a temp table.
- Fixed a bug in SQL simplifier where non-select statements in - session.sqldropped a SQL query when used with- limit().
- Fixed a bug that raised an exception when session parameter - ERROR_ON_NONDETERMINISTIC_UPDATEis true.