Snowpark Library for Python release notes for 2025

This article contains the release notes for the Snowpark Library for Python, including the following when applicable:

  • Behavior changes

  • New features

  • Customer-facing bug fixes

Snowflake uses semantic versioning for Snowpark Library for Python updates.

See Snowpark Developer Guide for Python for documentation.

Warning

As Python 3.8 has reached its End of Life, deprecation warnings will be triggered when using snowpark-python with Python 3.8. For more details, see Snowflake Python Runtime Support. Snowpark Python 1.24.0 will be the last client and server version to support Python 3.8, in accordance with Anaconda’s policy. Upgrade your existing Python 3.8 objects to Python 3.9 or greater.

Version 1.31.0 (2025-04-24)

New features

  • Added support for the restricted caller permission of the execute_as argument in StoredProcedure.register().

  • Added support for non-select statements in DataFrame.to_pandas().

  • Added support for the artifact_repository parameter to Session.add_packages, Session.add_requirements, Session.get_packages, Session.remove_package, and Session.clear_packages.

  • Added support for reading an XML file using a row tag with session.read.option('rowTag', <tag_name>).xml(<stage_file_path>) (experimental); see the sketch after this list.

    • Each XML record is extracted as a separate row.

    • Each field within that record becomes a separate column of type VARIANT, which can be further queried using the dot notation, such as col(a.b.c).

  • Added updates to DataFrameReader.dbapi (PrPr):

    • Added the fetch_merge_count parameter for optimizing performance by merging multiple fetched batches of data into a single Parquet file.

    • Added support for Databricks.

    • Added support for ingestion with Snowflake UDTF.

  • Added support for the following AI-powered functions in functions.py (Private Preview):

    • prompt

    • ai_filter (added support for prompt() function and image files, and changed the second argument name from expr to file)

    • ai_classify
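
A minimal sketch of the experimental row-tag XML reader above; the stage path, row tag, and column names are placeholders:

```python
from snowflake.snowpark import Session

session = Session.builder.create()  # assumes a default connection is configured

# Every <book> element in the file becomes one row, and each field of the
# record is exposed as a VARIANT column (stage path and tag are illustrative).
df = session.read.option("rowTag", "book").xml("@my_stage/books.xml")
df.show()

# Nested fields inside the VARIANT columns can be reached with dot notation,
# as described above; inspect df.schema for the actual column names.
```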

Improvements

  • Renamed the relaxed_ordering parameter to enforce_ordering in DataFrame.to_snowpark_pandas. The new default value, enforce_ordering=False, has the opposite effect of the previous default, relaxed_ordering=False; see the sketch after this list.

  • Improved DataFrameReader.dbapi (PrPr) reading performance by setting the default fetch_size parameter value to 1000.

  • Improved the error message for the invalid identifier SQL error by suggesting potentially matching identifiers.

  • Reduced the number of describe queries issued when creating a DataFrame from a Snowflake table using session.table.

  • Improved performance and accuracy of DataFrameAnalyticsFunctions.time_series_agg().
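
A short sketch of the renamed enforce_ordering parameter mentioned above; the table name is illustrative:

```python
from snowflake.snowpark import Session

session = Session.builder.create()  # assumes a default connection is configured
df = session.table("EXAMPLE_TABLE")  # table name is illustrative

# enforce_ordering now defaults to False; pass True to keep the stricter
# ordering guarantees that the previous relaxed_ordering=False default implied.
pandas_df = df.to_snowpark_pandas(enforce_ordering=True)
print(pandas_df.head())
```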

Bug fixes

  • Fixed a bug in DataFrame.group_by().pivot().agg when the pivot column and aggregate column are the same.

  • Fixed a bug in DataFrameReader.dbapi (PrPr) where a TypeError was raised when create_connection returned a connection object of an unsupported driver type.

  • Fixed a bug where df.limit(0) call would not properly apply.

  • Fixed a bug in DataFrameWriter.save_as_table that caused reserved names to throw errors when using append mode.

Deprecations

  • Deprecated support for Python 3.8.

  • Deprecated argument sliding_interval in DataFrameAnalyticsFunctions.time_series_agg().

Snowpark local testing updates

New features

  • Added support for Interval expression to Window.range_between.

  • Added support for the array_construct function, as sketched below.
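
A minimal local testing sketch for the new array_construct support:

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import array_construct, col

# Local testing mode runs in-process, without connecting to a Snowflake account.
session = Session.builder.config("local_testing", True).create()

df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
df.select(array_construct(col("a"), col("b")).alias("arr")).show()
```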

Bug fixes

  • Fixed a bug in local testing where transient __pycache__ directory was unintentionally copied during stored procedure execution via import.

  • Fixed a bug in local testing that produced incorrect results for Column.like calls.

  • Fixed a bug in local testing that caused Column.getItem and snowflake.snowpark.functions.get to raise IndexError rather than return null.

  • Fixed a bug in local testing where df.limit(0) call would not properly apply.

  • Fixed a bug in local testing where a Table.merge into an empty table would cause an exception.

Snowpark pandas API updates

Dependency updates

  • Updated modin from 0.30.1 to 0.32.0.

  • Added support for numpy 2.0 and above.

New features

  • Added support for DataFrame.create_or_replace_view and Series.create_or_replace_view.

  • Added support for DataFrame.create_or_replace_dynamic_table and Series.create_or_replace_dynamic_table.

  • Added support for DataFrame.to_view and Series.to_view.

  • Added support for DataFrame.to_dynamic_table and Series.to_dynamic_table.

  • Added support for DataFrame.groupby.resample for aggregations max, mean, median, min, and sum.

  • Added support for reading stage files (see the sketch after this list) using:

    • pd.read_excel

    • pd.read_html

    • pd.read_pickle

    • pd.read_sas

    • pd.read_xml

  • Added support for DataFrame.to_iceberg and Series.to_iceberg.

  • Added support for dict values in Series.str.len.
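
A hedged sketch combining two of the items above: reading an Excel file from a stage and saving the result as a view. The stage path, file, and view name are placeholders, and to_view is assumed to take the target name as its first argument:

```python
import modin.pandas as pd
import snowflake.snowpark.modin.plugin  # noqa: F401  (registers the Snowflake backend)
from snowflake.snowpark import Session

Session.builder.create()  # assumes a default connection is configured

# Read an Excel file directly from a stage (path is illustrative).
df = pd.read_excel("@my_stage/sales.xlsx")

# Persist the result as a view; the view name is a placeholder.
df.to_view("SALES_VIEW")
```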

Improvements

  • Improved the performance of DataFrame.groupby.apply and Series.groupby.apply by avoiding an expensive pivot step.

  • Added an estimate for the row count upper bound to OrderedDataFrame to enable better engine switching. This could potentially result in increased query counts.

  • Renamed the relaxed_ordering parameter to enforce_ordering in pd.read_snowflake. The new default value, enforce_ordering=False, has the opposite effect of the previous default, relaxed_ordering=False.

Bug fixes

  • Fixed a bug in pd.read_snowflake when reading Iceberg tables with enforce_ordering=True.

Version 1.30.0 (2025-03-27)

New features

  • Added support for relaxed consistency and ordering guarantees in DataFrame.to_snowpark_pandas by introducing the relaxed_ordering parameter.

  • DataFrameReader.dbapi (preview) now accepts a list of strings for the session_init_statement parameter, allowing multiple SQL statements to be executed during session initialization.
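
A hedged sketch of the preview dbapi reader with multiple initialization statements. Apart from session_init_statement, which is named above, the argument names are assumptions, and sqlite3 stands in for any DB-API 2.0 driver:

```python
import sqlite3

from snowflake.snowpark import Session

session = Session.builder.create()  # assumes a default connection is configured

def create_connection():
    # Any DB-API 2.0 connection works; sqlite3 is used purely for illustration.
    return sqlite3.connect("local.db")

df = session.read.dbapi(
    create_connection,
    table="MY_TABLE",  # assumed parameter name for the source table
    session_init_statement=[
        "PRAGMA case_sensitive_like = true",
        "PRAGMA temp_store = MEMORY",
    ],
)
df.show()
```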

Improvements

  • Improved query generation for DataFrame.stat.sample_by to generate a single flat query that scales well with a large fractions dictionary, compared to the older method of creating a UNION ALL subquery for each key in fractions. To enable this feature, set session.conf.set("use_simplified_query_generation", True); see the sketch after this list.

  • Improved the performance of DataFrameReader.dbapi by enabling the vectorized option when copying a Parquet file into a table.

  • Improved query generation for DataFrame.random_split in the following ways. They can be enabled by setting session.conf.set("use_simplified_query_generation", True):

    • Removed the need for cache_result in the internal implementation of the input DataFrame, resulting in a purely lazy DataFrame operation.

    • The seed argument now behaves as expected with repeatable results across multiple calls and sessions.

  • DataFrame.fillna and DataFrame.replace now both support fitting int and float into Decimal columns if include_decimal is set to True.

  • Added documentation for the following UDF and stored procedure functions in files.py as a result of their General Availability:

    • SnowflakeFile.write

    • SnowflakeFile.writelines

    • SnowflakeFile.writeable

  • Minor documentation changes for SnowflakeFile and SnowflakeFile.open().
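
The sample_by and random_split improvements above are opt-in; a short sketch with an illustrative table name:

```python
from snowflake.snowpark import Session

session = Session.builder.create()  # assumes a default connection is configured
session.conf.set("use_simplified_query_generation", True)

df = session.table("EXAMPLE_TABLE")  # table name is illustrative

# sample_by now compiles to one flat query instead of a UNION ALL branch per key.
sampled = df.stat.sample_by("CATEGORY", {"A": 0.1, "B": 0.5})

# random_split no longer caches its input, and the seed is repeatable.
train, test = df.random_split([0.8, 0.2], seed=42)
```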

Bug fixes

  • Fixed a bug in the following functions that raised errors when .cast() is applied to their output:

    • from_json

    • size

Snowpark local testing updates

Bug fixes

  • Fixed a bug in aggregation that caused empty groups to still produce rows.

  • Fixed a bug in DataFrame.except_ that would cause rows to be incorrectly dropped.

  • Fixed a bug that caused to_timestamp to fail when casting filtered columns.

Snowpark pandas API updates

New features

  • Added support for list values in Series.str.__getitem__ (Series.str[...]).

  • Added support for pd.Grouper objects in GROUP BY operations. When freq is specified, the default values of the sort, closed, label, and convention arguments are supported; origin is supported when it is start or start_day.

  • Added support for relaxed consistency and ordering guarantees in pd.read_snowflake for both named data sources (for example, tables and views) and query data sources by introducing the new parameter relaxed_ordering.
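
A hedged sketch of the two pandas additions above; the table and column names are placeholders:

```python
import modin.pandas as pd
import snowflake.snowpark.modin.plugin  # noqa: F401  (registers the Snowflake backend)
from snowflake.snowpark import Session

Session.builder.create()  # assumes a default connection is configured

# relaxed_ordering trades strict ordering guarantees for better performance.
df = pd.read_snowflake("SALES", relaxed_ordering=True)

# pd.Grouper with freq is now supported in GROUP BY operations.
daily = df.groupby(pd.Grouper(key="SALE_DATE", freq="D"))["AMOUNT"].sum()
print(daily.head())
```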

Improvements

  • Raised a warning whenever QUOTED_IDENTIFIERS_IGNORE_CASE is found to be set, asking the user to unset it.

  • Improved how a missing index_label in DataFrame.to_snowflake and Series.to_snowflake is handled when index=True. Instead of raising a ValueError, system-defined labels are used for the index columns.

  • Improved the error message for groupby, DataFrame, or Series.agg when the function name is not supported.

Version 1.29.1 (2025-03-12)

Bug fixes

  • Fixed a bug in DataFrameReader.dbapi (private preview) that prevented usage in stored procedures and Snowbooks.

Version 1.29.0 (2025-03-05)

New features

  • Added support for the following AI-powered functions in functions.py (Private Preview):

    • ai_filter

    • ai_agg

    • summarize_agg

  • Added support for the new FILE SQL type, with the following related functions in functions.py (Private Preview):

    • fl_get_content_type

    • fl_get_etag

    • fl_get_file_type

    • fl_get_last_modified

    • fl_get_relative_path

    • fl_get_scoped_file_url

    • fl_get_size

    • fl_get_stage

    • fl_get_stage_file_url

    • fl_is_audio

    • fl_is_compressed

    • fl_is_document

    • fl_is_image

    • fl_is_video

  • Added support for importing third-party packages from PyPI using Artifact Repository (Private Preview):

    • Use the keyword arguments artifact_repository and artifact_repository_packages to specify your artifact repository and packages, respectively, when registering stored procedures or user-defined functions; see the sketch after this list.

    • Supported APIs are:

      • Session.sproc.register

      • Session.udf.register

      • Session.udaf.register

      • Session.udtf.register

      • functions.sproc

      • functions.udf

      • functions.udaf

      • functions.udtf

      • functions.pandas_udf

      • functions.pandas_udtf
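
A hedged sketch of registering a UDF against a Private Preview artifact repository; the repository name and package are placeholders, and the feature requires an artifact repository configured in your account:

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import udf

session = Session.builder.create()  # assumes a default connection is configured

@udf(
    name="parse_year",
    replace=True,
    artifact_repository="MY_PYPI_REPO",                # placeholder repository name
    artifact_repository_packages=["python-dateutil"],  # packages pulled from PyPI
)
def parse_year(s: str) -> int:
    from dateutil import parser
    return parser.parse(s).year
```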

Improvements

  • Improved version validation warnings for snowflake-snowpark-python package compatibility when registering stored procedures. Now, warnings are only triggered if the major or minor version does not match, while bugfix version differences no longer generate warnings.

  • Bumped cloudpickle dependency to also support cloudpickle==3.0.0 in addition to previous versions.

Bug fixes

  • Fixed a bug where creating a DataFrame with a large number of values raised an Unsupported feature 'SCOPED_TEMPORARY' error if the thread-safe session was disabled.

  • Fixed a bug where df.describe raised an internal SQL execution error when the DataFrame was created by reading a stage file and CTE optimization was enabled.

  • Fixed a bug where df.order_by(A).select(B).distinct() would generate invalid SQL when simplified query generation was enabled using session.conf.set("use_simplified_query_generation", True).

    • Disabled simplified query generation by default.

Snowpark pandas API updates

Improvements

  • Improved the error message for pd.to_snowflake, DataFrame.to_snowflake, and Series.to_snowflake when the table does not exist.

  • Improved the readability of the docstring for the if_exists parameter in pd.to_snowflake, DataFrame.to_snowflake, and Series.to_snowflake.

  • Improved the error message for all pandas functions that use UDFs with Snowpark objects.

Bug fixes

  • Fixed a bug in Series.rename_axis where an AttributeError was being raised.

  • Fixed a bug where pd.get_dummies didn’t ignore NULL/NaN values by default.

  • Fixed a bug where repeated calls to pd.get_dummies resulted in a 'Duplicated column name' error.

  • Fixed a bug in pd.get_dummies where passing a list of columns generated incorrect column labels in the output DataFrame.

  • Updated pd.get_dummies to return bool values instead of int.

Snowpark local testing updates

New features

  • Added support for literal values to range_between window function.

Version 1.28.0 (2025-02-20)

New features

  • Added support for the following functions in functions.py:

    • normal

    • randn

  • Added support for the allow_missing_columns parameter to DataFrame.union_by_name and DataFrame.union_all_by_name.
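
A short sketch of the new allow_missing_columns option:

```python
from snowflake.snowpark import Session

session = Session.builder.create()  # assumes a default connection is configured

df1 = session.create_dataframe([[1, "a"]], schema=["id", "name"])
df2 = session.create_dataframe([[2]], schema=["id"])

# Columns missing from either side are filled with NULL; without the flag,
# the mismatched schemas raise an error.
df1.union_by_name(df2, allow_missing_columns=True).show()
```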

Improvements

  • Improved random object name generation to avoid collisions.

  • Improved query generation for DataFrame.distinct to generate SELECT DISTINCT instead of SELECT with GROUP BY on all columns. To disable this feature, set session.conf.set("use_simplified_query_generation", False).

Deprecations

  • Deprecated Snowpark Python function snowflake_cortex_summarize. Users can install snowflake-ml-python and use the snowflake.cortex.summarize function instead.

  • Deprecated Snowpark Python function snowflake_cortex_sentiment. Users can install snowflake-ml-python and use the snowflake.cortex.sentiment function instead.

Bug fixes

  • Fixed a bug where the session-level query tag was overwritten by a stack trace for DataFrames that generate multiple queries. Now, the query tag is only set to the stack trace if session.conf.set("collect_stacktrace_in_query_tag", True) is used.

  • Fixed a bug in Session._write_pandas where it was erroneously passing use_logical_type parameter to Session._write_modin_pandas_helper when writing a Snowpark pandas object.

  • Fixed a bug in options SQL generation that could cause multiple values to be formatted incorrectly.

  • Fixed a bug in Session.catalog where empty strings for database or schema were not handled correctly and were generating erroneous SQL statements.

Experimental Features

  • Added support for writing pyarrow Tables to Snowflake tables.

Snowpark pandas API updates

New features

  • Added support for applying Snowflake Cortex functions Summarize and Sentiment.

  • Added support for list values in Series.str.get.

Bug fixes

  • Fixed a bug in apply where kwargs were not being correctly passed into the applied function.

Snowpark local testing updates

New features

  • Added support for the following functions:

    • hour

    • minute

  • Added support for NULL_IF parameter to CSV reader.

  • Added support for date_format, datetime_format, and timestamp_format options when loading CSVs.

Bug fixes

  • Fixed a bug in DataFrame.join that caused columns to have incorrect typing.

  • Fixed a bug in when statements that caused incorrect results in the otherwise clause.

Version 1.27.0 (2025-02-05)

New features

  • Added support for the following functions in functions.py:

    • array_reverse

    • divnull

    • map_cat

    • map_contains_key

    • map_keys

    • nullifzero

    • snowflake_cortex_sentiment

    • acosh

    • asinh

    • atanh

    • bit_length

    • bitmap_bit_position

    • bitmap_bucket_number

    • bitmap_construct_agg

    • cbrt

    • equal_null

    • from_json

    • ifnull

    • localtimestamp

    • max_by

    • min_by

    • nth_value

    • nvl

    • octet_length

    • position

    • regr_avgx

    • regr_avgy

    • regr_count

    • regr_intercept

    • regr_r2

    • regr_slope

    • regr_sxx

    • regr_sxy

    • regr_syy

    • try_to_binary

    • base64

    • base64_decode_string

    • base64_encode

    • editdistance

    • hex

    • hex_encode

    • instr

    • log1p

    • log2

    • log10

    • percentile_approx

    • unbase64

  • Added support for specifying a schema string (including implicit struct syntax) when calling DataFrame.create_dataframe.

  • Added support for DataFrameWriter.insert_into/insertInto. This method also supports local testing mode; see the sketch after this list.

  • Added support for DataFrame.create_temp_view to create a temporary view. It will fail if the view already exists.

  • Added support for multiple columns in the functions map_cat and map_concat.

  • Added an option keep_column_order for keeping original column order in DataFrame.with_column and DataFrame.with_columns.

  • Added options to column casts that allow renaming or adding fields in StructType columns.

  • Added support for contains_null parameter to ArrayType.

  • Added support for creating a temporary view via DataFrame.create_or_replace_temp_view from a DataFrame created by reading a file from a stage.

  • Added support for value_contains_null parameter to MapType.

  • Added interactive to telemetry that indicates whether the current environment is an interactive one.

  • Allow session.file.get in a Native App to read file paths starting with / from the current version.

  • Added support for multiple aggregation functions after DataFrame.pivot.
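
A hedged sketch touching three of the items above. The schema-string grammar and the table name are assumptions, and insert_into requires an existing table with a compatible schema:

```python
from snowflake.snowpark import Session

session = Session.builder.create()  # assumes a default connection is configured

# Schema supplied as a string; the exact implicit struct syntax is assumed here.
df = session.create_dataframe([[1, "a"], [2, "b"]], schema="id: int, name: string")

# Append into an existing table (name is a placeholder).
df.write.insert_into("EXISTING_TABLE")

# Expose the data as a temporary view; this fails if the view already exists.
df.create_temp_view("MY_TEMP_VIEW")
```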

Experimental features

  • Added a catalog class to manage Snowflake objects; it can be accessed via Session.catalog.

    • snowflake.core is a dependency required for this feature.

  • Allow a user-provided schema (or multiple schemas) when reading a JSON file on a stage.

  • Added support for specifying a schema string (including implicit struct syntax) when calling DataFrame.create_dataframe.

Improvements

  • Updated README.md to include instructions on how to verify package signatures using cosign.

Bug fixes

  • Fixed a bug in local testing mode that caused a column to contain None when it should contain 0.

  • Fixed a bug in StructField.from_json that prevented TimestampTypes with tzinfo from being parsed correctly.

  • Fixed a bug in function date_format that caused an error when the input column was date type or timestamp type.

  • Fixed a bug in DataFrame that allowed null values to be inserted in a non-nullable column.

  • Fixed a bug in the functions replace and lit, which raised a type hint assertion error when Column expression objects were passed.

  • Fixed a bug in pandas_udf and pandas_udtf where session parameters were erroneously ignored.

  • Fixed a bug that raised an incorrect type conversion error for system functions called through session.call.

Snowpark pandas API updates

New features

  • Added support for Series.str.ljust and Series.str.rjust.

  • Added support for Series.str.center.

  • Added support for Series.str.pad.

  • Added support for applying the Snowpark Python function snowflake_cortex_sentiment.

  • Added support for DataFrame.map.

  • Added support for DataFrame.from_dict and DataFrame.from_records.

  • Added support for mixed case field names in struct type columns.

  • Added support for SeriesGroupBy.unique.

  • Added support for Series.dt.strftime with the following directives (see the sketch after this list):

    • %d: Day of the month as a zero-padded decimal number.

    • %m: Month as a zero-padded decimal number.

    • %Y: Year with century as a decimal number.

    • %H: Hour (24-hour clock) as a zero-padded decimal number.

    • %M: Minute as a zero-padded decimal number.

    • %S: Second as a zero-padded decimal number.

    • %f: Microsecond as a decimal number, zero-padded to 6 digits.

    • %j: Day of the year as a zero-padded decimal number.

    • %X: Locale’s appropriate time representation.

    • %%: A literal ‘%’ character.

  • Added support for Series.between.

  • Added support for include_groups=False in DataFrameGroupBy.apply.

  • Added support for expand=True in Series.str.split.

  • Added support for DataFrame.pop and Series.pop.

  • Added support for first and last in DataFrameGroupBy.agg and SeriesGroupBy.agg.

  • Added support for Index.drop_duplicates.

  • Added support for aggregations "count", "median", np.median, "skew", "std", np.std, "var", and np.var in pd.pivot_table(), DataFrame.pivot_table(), and pd.crosstab().
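
A brief sketch exercising two of the additions above, Series.dt.strftime with supported directives and Series.between:

```python
import modin.pandas as pd
import snowflake.snowpark.modin.plugin  # noqa: F401  (registers the Snowflake backend)
from snowflake.snowpark import Session

Session.builder.create()  # assumes a default connection is configured

ts = pd.Series(pd.to_datetime(["2025-01-05 08:30:00", "2025-02-14 17:45:10"]))
# Only the directives listed above are supported; this combines several of them.
print(ts.dt.strftime("%Y-%m-%d %H:%M:%S").to_list())

nums = pd.Series([1, 5, 10])
print(nums.between(2, 8).to_list())
```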

Improvements

  • Improved performance of DataFrame.map, Series.apply and Series.map methods by mapping numpy functions to Snowpark functions if possible.

  • Added documentation for DataFrame.map.

  • Improved performance of DataFrame.apply by mapping numpy functions to Snowpark functions if possible.

  • Added documentation on the extent of Snowpark pandas interoperability with scikit-learn.

  • Inferred the return type of functions in Series.map, Series.apply, and DataFrame.map when a type hint is not provided.

  • Added call_count to telemetry that counts method calls including interchange protocol calls.