Snowpark Library for Python release notes for 2025¶
This article contains the release notes for the Snowpark Library for Python, including the following when applicable:
Behavior changes
New features
Customer-facing bug fixes
Snowflake uses semantic versioning for Snowpark Library for Python updates.
See Snowpark Developer Guide for Python for documentation.
Warning
Because Python 3.8 has reached its End of Life, deprecation warnings will be triggered when you use snowpark-python with Python 3.8. For more information, see Snowflake Python Runtime Support. Snowpark Python 1.24.0 will be the last client and server version to support Python 3.8, in accordance with Anaconda’s policy. Upgrade your existing Python 3.8 objects to Python 3.9 or later.
Version 1.41.0: Oct 23, 2025¶
New features¶
- Added a new function service in snowflake.snowpark.functions that allows users to create a callable representing a Snowpark Container Services (SPCS) service.
- Added a new function group_by_all() to the DataFrame class.
- Added a connection_parameters parameter to the DataFrameReader.dbapi() (Public Preview) method to allow passing keyword arguments to the create_connection callable.
- Added support for Session.begin_transaction, Session.commit, and Session.rollback.
- Added support for the following functions in functions.py:
  - Geospatial functions: st_interpolate, st_intersection, st_intersection_agg, st_intersects, st_isvalid, st_length, st_makegeompoint, st_makeline, st_makepolygon, st_makepolygonoriented, st_disjoint, st_distance, st_dwithin, st_endpoint, st_envelope, st_geohash, st_geomfromgeohash, st_geompointfromgeohash, st_hausdorffdistance, st_makepoint, st_npoints, st_perimeter, st_pointn, st_setsrid, st_simplify, st_srid, st_startpoint, st_symdifference, st_transform, st_union, st_union_agg, st_within, st_x, st_xmax, st_xmin, st_y, st_ymax, st_ymin, st_geogfromgeohash, st_geogpointfromgeohash, st_geographyfromwkb, st_geographyfromwkt, st_geometryfromwkb, st_geometryfromwkt, try_to_geography, try_to_geometry
- Added a parameter to enable and disable automatic column name aliasing for the interval_day_time_from_parts and interval_year_month_from_parts functions.
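The new Session.begin_transaction, Session.commit, and Session.rollback methods enable an explicit transaction pattern. Below is a minimal sketch of that pattern; StubSession is a stand-in for a real snowflake.snowpark.Session so the control flow is self-contained, and run_in_transaction is a hypothetical helper, not a Snowpark API.

```python
class StubSession:
    """Records transaction calls; a real Session would issue BEGIN/COMMIT/ROLLBACK."""
    def __init__(self):
        self.calls = []
    def begin_transaction(self):
        self.calls.append("begin")
    def commit(self):
        self.calls.append("commit")
    def rollback(self):
        self.calls.append("rollback")

def run_in_transaction(session, work):
    """Run work(session), committing on success and rolling back on error."""
    session.begin_transaction()
    try:
        result = work(session)
    except Exception:
        session.rollback()
        raise
    session.commit()
    return result

session = StubSession()
run_in_transaction(session, lambda s: "ok")
print(session.calls)  # ['begin', 'commit']
```

With a real session the same shape applies; only the session object changes.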
Bug fixes¶
- Fixed a bug where DataFrameReader.xml failed to parse XML files with undeclared namespaces when ignoreNamespace was True.
- Added a fix for floating-point precision discrepancies in interval_day_time_from_parts.
- Fixed a bug where writing Snowpark pandas DataFrames on the pandas backend with a column multiindex to Snowflake with to_snowflake would raise KeyError.
- Fixed a bug where DataFrameReader.dbapi (Public Preview) was not compatible with oracledb 3.4.0.
- Fixed a bug where modin would unintentionally be imported during session initialization in some scenarios.
- Fixed a bug where session.udf|udtf|udaf|sproc.register failed when an extra session argument was passed. These methods do not expect a session argument; remove it if provided.
Improvements¶
- The default maximum length for inferred StringType columns during schema inference in DataFrameReader.dbapi has been increased from 16 MB to 128 MB for Parquet file-based ingestion.
Dependency updates¶
- Updated the dependency on snowflake-connector-python to >=3.17,<5.0.0.
Snowpark pandas API updates¶
New features¶
- Added support for the dtypes parameter of pd.get_dummies.
- Added support for nunique in df.pivot_table, df.agg, and other places where aggregate functions can be used.
- Added support for DataFrame.interpolate and Series.interpolate with the "linear", "ffill"/"pad", and "backfill"/"bfill" methods. These use the SQL INTERPOLATE_LINEAR, INTERPOLATE_FFILL, and INTERPOLATE_BFILL functions (Public Preview).
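A pure-Python sketch of what the three interpolation modes compute on a sequence with gaps; this mirrors the semantics of the SQL INTERPOLATE_* functions but is not the Snowpark implementation.

```python
def interpolate(values, method="linear"):
    """Fill None gaps: linearly between known neighbors, or by forward/backward fill."""
    out = list(values)
    n = len(out)
    if method in ("ffill", "pad"):
        for i in range(1, n):
            if out[i] is None:
                out[i] = out[i - 1]  # carry the last known value forward
    elif method in ("backfill", "bfill"):
        for i in range(n - 2, -1, -1):
            if out[i] is None:
                out[i] = out[i + 1]  # carry the next known value backward
    elif method == "linear":
        for i in range(n):
            if out[i] is None:
                lo = next((j for j in range(i - 1, -1, -1) if out[j] is not None), None)
                hi = next((j for j in range(i + 1, n) if out[j] is not None), None)
                if lo is not None and hi is not None:
                    step = (out[hi] - out[lo]) / (hi - lo)
                    out[i] = out[lo] + step * (i - lo)
    return out

print(interpolate([1.0, None, 3.0]))           # [1.0, 2.0, 3.0]
print(interpolate([1.0, None, 3.0], "ffill"))  # [1.0, 1.0, 3.0]
print(interpolate([1.0, None, 3.0], "bfill"))  # [1.0, 3.0, 3.0]
```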
Improvements¶
- Improved performance of Series.to_snowflake and pd.to_snowflake(series) for large data by uploading data via a Parquet file. You can control the dataset size at which Snowpark pandas switches to Parquet with the variable modin.config.PandasToSnowflakeParquetThresholdBytes.
- Enhanced autoswitching functionality from Snowflake to native pandas for methods with unsupported argument combinations:
  - get_dummies() with dummy_na=True, drop_first=True, or custom dtype parameters
  - cumsum(), cummin(), cummax() with axis=1 (column-wise operations)
  - skew() with axis=1 or numeric_only=False parameters
  - round() with a decimals parameter given as a Series
  - corr() with method != "pearson"
- Set cte_optimization_enabled to True for all Snowpark pandas sessions.
- Added support for the following in faster pandas: isin, isna, isnull, notna, notnull, str.contains, str.startswith, str.endswith, str.slice, dt.date, dt.time, dt.hour, dt.minute, dt.second, dt.microsecond, dt.nanosecond, dt.year, dt.month, dt.day, dt.quarter, dt.is_month_start, dt.is_month_end, dt.is_quarter_start, dt.is_quarter_end, dt.is_year_start, dt.is_year_end, dt.is_leap_year, dt.days_in_month, dt.daysinmonth, sort_values, loc (setting columns), to_datetime, drop, invert, duplicated, iloc, head, columns (e.g., df.columns = ["A", "B"]), agg, min, max, count, sum, mean, median, std, var, groupby.agg, groupby.min, groupby.max, groupby.count, groupby.sum, groupby.mean, groupby.median, groupby.std, groupby.var, drop_duplicates
- Reused the row count from the relaxed query compiler in get_axis_len.
Bug fixes¶
- Fixed a bug where the row count was not cached in the ordered DataFrame each time count_rows() was called.
Version 1.40.0: October 6, 2025¶
New features¶
- Added a new module snowflake.snowpark.secrets that provides Python wrappers for accessing Snowflake Secrets within Python UDFs and stored procedures that execute inside Snowflake: get_generic_secret_string, get_oauth_access_token, get_secret_type, get_username_password, get_cloud_provider_token.
- Added support for the following scalar functions in functions.py:
  - Conditional expression functions: booland, boolnot, boolor, boolxor, boolor_agg, decode, greatest_ignore_nulls, least_ignore_nulls, nullif, nvl2, regr_valx
  - Semi-structured and structured data functions: array_remove_at, as_boolean, map_delete, map_insert, map_pick, map_size
  - String and binary functions: chr, hex_decode_binary
  - Numeric functions: div0null
  - Differential privacy functions: dp_interval_high, dp_interval_low
  - Context functions: last_query_id, last_transaction
  - Geospatial functions: h3_cell_to_boundary, h3_cell_to_children, h3_cell_to_children_string, h3_cell_to_parent, h3_cell_to_point, h3_compact_cells, h3_compact_cells_strings, h3_coverage, h3_coverage_strings, h3_get_resolution, h3_grid_disk, h3_grid_distance, h3_int_to_string, h3_polygon_to_cells, h3_polygon_to_cells_strings, h3_string_to_int, h3_try_grid_path, h3_try_polygon_to_cells, h3_try_polygon_to_cells_strings, h3_uncompact_cells, h3_uncompact_cells_strings, haversine, h3_grid_path, h3_is_pentagon, h3_is_valid_cell, h3_latlng_to_cell, h3_latlng_to_cell_string, h3_point_to_cell, h3_point_to_cell_string, h3_try_coverage, h3_try_coverage_strings, h3_try_grid_distance, st_area, st_asewkb, st_asewkt, st_asgeojson, st_aswkb, st_aswkt, st_azimuth, st_buffer, st_centroid, st_collect, st_contains, st_coveredby, st_covers, st_difference, st_dimension
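Among the functions above, haversine computes the great-circle distance in kilometres between two (latitude, longitude) points given in degrees. A pure-Python sketch of that computation for illustration; the 6371 km mean Earth radius is an assumption, not a documented constant of the SQL function.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # assumed mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# One degree of longitude along the equator is roughly 111.2 km.
print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 1))  # 111.2
```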
Bug fixes¶
- Fixed a bug that caused DataFrame.limit() to fail if the executed SQL contained parameter binding when used outside stored procedures and UDxFs.
- Added an experimental fix for a bug in schema query generation that could cause invalid SQL to be generated when using nested structured types.
- Fixed multiple bugs in DataFrameReader.dbapi (Public Preview):
  - Fixed UDTF ingestion failure with the pyodbc driver caused by unprocessed row data.
  - Fixed SQL Server query input failure due to incorrect select query generation.
  - Fixed UDTF ingestion not preserving column nullability in the output schema.
  - Fixed an issue that caused the program to hang during multithreaded Parquet-based ingestion when a data fetching error occurred.
  - Fixed a bug in schema parsing when custom schema strings used upper-cased data type names (NUMERIC, NUMBER, DECIMAL, VARCHAR, STRING, TEXT).
- Fixed a bug in Session.create_dataframe where schema string parsing failed when using upper-cased data type names (e.g., NUMERIC, NUMBER, DECIMAL, VARCHAR, STRING, TEXT).
Improvements¶
- Improved DataFrameReader.dbapi (Public Preview) so it doesn't retry on non-retryable errors, such as a SQL syntax error in the external data source query.
- Removed unnecessary warnings about local package version mismatch when using session.read.option('rowTag', <tag_name>).xml(<stage_file_path>) or xpath functions.
- Improved DataFrameReader.dbapi (Public Preview) reading performance by setting the default fetch_size parameter value to 100000.
- Improved the error message for XSD validation failure when reading XML files using session.read.option('rowValidationXSDPath', <xsd_path>).xml(<stage_file_path>).
Snowpark pandas API updates¶
Dependency updates¶
- Updated the supported modin versions to >=0.36.0 and <0.38.0 (was previously >=0.35.0 and <0.37.0).
New features¶
- Added support for DataFrame.query for DataFrames with single-level indexes.
- Added support for DataFrameGroupBy.__len__ and SeriesGroupBy.__len__.
Improvements¶
- Hybrid execution mode is now enabled by default. Certain operations on smaller data now automatically execute in native pandas in memory. Use from modin.config import AutoSwitchBackend; AutoSwitchBackend.disable() to turn this off and force all execution to occur in Snowflake.
- Added a session parameter pandas_hybrid_execution_enabled to enable/disable hybrid execution as an alternative to using AutoSwitchBackend.
- Removed an unnecessary SHOW OBJECTS query issued from read_snowflake under certain conditions.
- When hybrid execution is enabled, pd.merge, pd.concat, DataFrame.merge, and DataFrame.join can now move arguments to backends other than those among the function arguments.
- Improved performance of DataFrame.to_snowflake and pd.to_snowflake(dataframe) for large data by uploading data via a Parquet file. You can control the dataset size at which Snowpark pandas switches to Parquet with the variable modin.config.PandasToSnowflakeParquetThresholdBytes.
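To tune the Parquet switch-over threshold mentioned above, the modin configuration variable can be set programmatically. A hedged config fragment, assuming snowflake-snowpark-python[modin] is installed and registers this variable with the standard modin config interface:

```python
# Configuration fragment: lower the threshold so Snowpark pandas switches to
# Parquet-based upload for smaller datasets in to_snowflake.
# Assumes modin (with the Snowpark pandas plugin) is installed.
import modin.config as cfg

cfg.PandasToSnowflakeParquetThresholdBytes.put(10 * 1024 * 1024)  # 10 MB
```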
Version 1.39.1: September 25, 2025¶
Bug fixes¶
- Added an experimental fix for a bug in schema query generation that could cause invalid SQL to be generated when using nested structured types.
Version 1.39.0: September 17, 2025¶
New features¶
- Downgraded to level logging.DEBUG - 1 the log message saying that the Snowpark DataFrame reference of an internal DataFrameReference object has changed.
- Eliminated duplicate parameter check queries for casing status when retrieving the session.
- DataFrame row counts are now retrieved through object metadata to avoid a COUNT(*) query (performance).
- Added support for applying the Snowflake Cortex function Complete.
- Introduced faster pandas, which improves performance by deferring row position computation. The following operations are currently supported and can benefit from the optimization: read_snowflake, repr, loc, reset_index, merge, and binary operations. If a lazy object (e.g., a DataFrame or Series) depends on a mix of supported and unsupported operations, the optimization is not used.
- Updated the error message for when Snowpark pandas is referenced within apply.
- Added a session parameter dummy_row_pos_optimization_enabled to enable/disable dummy row position optimization in faster pandas.
Dependency updates¶
- Updated the supported modin versions to >=0.35.0 and <0.37.0 (was previously >=0.34.0 and <0.36.0).
Bug fixes¶
- Fixed an issue with drop_duplicates where the same data source could be read multiple times in the same query but in a different order each time, resulting in missing rows in the final result. The fix ensures that the data source is read only once.
- Fixed a bug with hybrid execution mode where an AssertionError was unexpectedly raised by certain indexing operations.
Snowpark local testing updates¶
New features¶
- Added support for patching functions.ai_complete.
Version 1.38.0: September 4, 2025¶
New features¶
- Added support for the following AI-powered functions in functions.py: ai_extract, ai_parse_document, ai_transcribe.
- Added time travel support for querying historical data:
  - Session.table() now supports time travel parameters: time_travel_mode, statement, offset, timestamp, timestamp_type, and stream.
  - DataFrameReader.table() supports the same time travel parameters as direct arguments.
  - DataFrameReader supports time travel via option chaining (e.g., session.read.option("time_travel_mode", "at").option("offset", -60).table("my_table")).
- Added support for specifying the following parameters to DataFrameWriter.copy_into_location for validation and writing data to external locations: validation_mode, storage_integration, credentials, and encryption.
- Added support for Session.directory and Session.read.directory to retrieve the list of all files on a stage, with metadata.
- Added support for DataFrameReader.jdbc (Private Preview), which allows the JDBC driver to ingest external data sources.
- Added support for FileOperation.copy_files to copy files from a source location to an output stage.
- Added support for the following scalar functions in functions.py: all_user_names, bitand, bitand_agg, bitor, bitor_agg, bitxor, bitxor_agg, current_account_name, current_client, current_ip_address, current_role_type, current_organization_name, current_organization_user, current_secondary_roles, current_transaction, getbit.
Bug fixes¶
- Fixed the __repr__ of TimestampType to match the actual subtype it represents.
- Fixed a bug in DataFrameReader.dbapi where UDTF ingestion did not work in stored procedures.
- Fixed a bug in schema inference that caused incorrect stage prefixes to be used.
Improvements¶
- Enhanced error handling in DataFrameReader.dbapi thread-based ingestion to prevent unnecessary operations, which improves resource efficiency.
- Bumped the cloudpickle dependency to also support cloudpickle==3.1.1 in addition to previous versions.
- Improved DataFrameReader.dbapi (Public Preview) ingestion performance for PostgreSQL and MySQL by using a server-side cursor to fetch data.
Snowpark pandas API Updates¶
New features¶
- Completed support for the following functions and methods on the "Pandas" and "Ray" backends: pd.read_snowflake(), pd.to_iceberg(), pd.to_pandas(), pd.to_snowpark(), pd.to_snowflake(), DataFrame.to_iceberg(), DataFrame.to_pandas(), DataFrame.to_snowpark(), DataFrame.to_snowflake(), Series.to_iceberg(), Series.to_pandas(), Series.to_snowpark(), and Series.to_snowflake(). Previously, only some of these functions and methods were supported on the Pandas backend.
- Added support for Index.get_level_values().
Improvements¶
- Set the default transfer limit in hybrid execution for data leaving Snowflake to 100k, which can be overridden with the SnowflakePandasTransferThreshold environment variable. This configuration is appropriate for scenarios with two available engines, "pandas" and "Snowflake", on relational workloads.
- Improved the import error message by adding --upgrade to pip install "snowflake-snowpark-python[modin]" in the message.
- Reduced the telemetry messages from the modin client by pre-aggregating into five-second windows and keeping only a narrow band of metrics that are useful for tracking hybrid execution and native pandas performance.
- Set the initial row count only when hybrid execution is enabled, which reduces the number of queries issued for many workloads.
- Added a new test parameter for integration tests to enable hybrid execution.
Bug fixes¶
- Raised NotImplementedError instead of AttributeError when attempting to call the Snowflake extension functions/methods to_dynamic_table(), cache_result(), to_view(), create_or_replace_dynamic_table(), and create_or_replace_view() on DataFrames or Series using the pandas or ray backends.
Version 1.37.0: August 18, 2025¶
New features¶
- Added support for the following xpath functions in functions.py: xpath, xpath_string, xpath_boolean, xpath_int, xpath_float, xpath_double, xpath_long, and xpath_short.
- Added support for the use_vectorized_scanner parameter in the Session.write_arrow() function.
- The DataFrame profiler now adds the following information about each query: describe query time, execution time, and sql query text. To view this information, call session.dataframe_profiler.enable() and then call get_execution_profile on a DataFrame.
- Added support for DataFrame.col_ilike.
- Added support for non-blocking stored procedure calls that return AsyncJob objects:
  - Added the block: bool = True parameter to Session.call(). When block=False, the call returns an AsyncJob instead of blocking until completion.
  - Added the block: bool = True parameter to StoredProcedure.__call__() for async support across both named and anonymous stored procedures.
  - Added Session.call_nowait(), which is equivalent to Session.call(block=False).
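The non-blocking calls above return AsyncJob handles that can be polled. A self-contained sketch of the polling pattern follows, using a stub in place of a real AsyncJob; with a real session, the job would come from session.call("my_proc", block=False) or session.call_nowait("my_proc").

```python
import time

class StubAsyncJob:
    """Minimal stand-in exposing the AsyncJob surface used below."""
    def __init__(self, value):
        self._value = value
        self._done_at = time.monotonic() + 0.01  # pretend the job takes ~10 ms
    def is_done(self):
        return time.monotonic() >= self._done_at
    def result(self):
        while not self.is_done():
            time.sleep(0.001)
        return self._value

job = StubAsyncJob(42)
# Do other work while the "procedure" runs, then collect the result.
while not job.is_done():
    time.sleep(0.001)
print(job.result())  # 42
```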
Bug fixes¶
- Fixed a bug in the CTE optimization stage where a deepcopy of internal plans would cause a memory spike when a DataFrame was created locally with session.create_dataframe() from large input data.
- Fixed a bug in DataFrameReader.parquet where the ignore_case option in infer_schema_options was not respected.
- Fixed a bug where to_pandas() produced a different column name format depending on whether the query result format was set to JSON or ARROW.
Deprecations¶
- Deprecated pkg_resources.
Dependency updates¶
- Added a dependency on protobuf<6.32.
Snowpark pandas API Updates¶
New features¶
- Added support for efficient transfer of data between Snowflake and Ray with the DataFrame.set_backend method. The installed version of modin must be at least 0.35.0, and ray must be installed.
Dependency updates¶
- Updated the supported modin versions to >=0.34.0 and <0.36.0 (was previously >=0.33.0 and <0.35.0).
- Added support for pandas 2.3 when the installed modin version is 0.35.0 or greater.
Bug fixes¶
- Fixed an issue in hybrid execution mode (Private Preview) where pd.to_datetime and pd.to_timedelta would unexpectedly raise IndexError.
- Fixed a bug where pd.explain_switch would raise IndexError or return None if called before any potential switch operations were performed.
Version 1.36.0: August 5, 2025¶
New features¶
- Session.create_dataframe now accepts keyword arguments that are forwarded in the internal call to Session.write_pandas or Session.write_arrow when creating a DataFrame from a pandas DataFrame or a pyarrow table.
- Added new APIs for AsyncJob:
  - AsyncJob.is_failed() returns a bool indicating whether a job has failed. It can be used in combination with AsyncJob.is_done() to determine whether a job is finished and erred.
  - AsyncJob.status() returns a string representing the current query status (such as "RUNNING", "SUCCESS", or "FAILED_WITH_ERROR") for detailed monitoring without calling result().
- Added a DataFrame profiler. To use it, call get_execution_profile() on your desired DataFrame. The profiler reports the queries executed to evaluate a DataFrame and statistics about each of the query operators. This is currently an experimental feature.
- Added support for the following functions in functions.py: ai_sentiment.
- Updated the interface for the context.configure_development_features experimental feature. All development features are disabled by default unless explicitly enabled by the user.
Improvements¶
- Improved hybrid execution row estimates and reduced eager calls.
- Added a new configuration variable to control transfer costs out of Snowflake when using hybrid execution.
- Added support for creating permanent and immutable UDFs/UDTFs with DataFrame/Series/GroupBy.apply, map, and transform by passing the snowflake_udf_params keyword argument.
- Added support for mapping np.unique to DataFrame and Series inputs using pd.unique.
Bug fixes¶
- Fixed an issue where the Snowpark pandas plugin would unconditionally disable AutoSwitchBackend even when users had explicitly configured it programmatically or with environment variables.
Version 1.35.0: July 24, 2025¶
New features¶
- Added support for the following functions in functions.py: ai_embed and try_parse_json.
Improvements¶
- Improved the query parameter in DataFrameReader.dbapi (Private Preview) so that parentheses aren't needed around the query.
- Improved the error experience in DataFrameReader.dbapi (Private Preview) for exceptions raised when inferring the schema of the target data source.
Bug fixes¶
- Fixed a bug in DataFrameReader.dbapi (Private Preview) that caused dbapi to fail with process exit code 1 in a Python stored procedure.
- Fixed a bug in DataFrameReader.dbapi (Private Preview) where custom_schema accepted an illegal schema.
- Fixed a bug in DataFrameReader.dbapi (Private Preview) where custom_schema did not work when connecting to PostgreSQL and MySQL.
- Fixed a bug in schema inference that caused it to fail for external stages.
Snowpark local testing updates¶
New features¶
- Added local testing support for reading files with SnowflakeFile. The testing support uses local file paths, the Snow URL semantic (snow://...), local testing framework stages, and Snowflake stages (@stage/file_path).
Version 1.34.0: Jul 14, 2025¶
New features¶
- Added a new option TRY_CAST to DataFrameReader. When TRY_CAST is True, columns are wrapped in a TRY_CAST statement instead of a hard cast when loading data.
- Added a new option USE_RELAXED_TYPES to the INFER_SCHEMA_OPTIONS of DataFrameReader. When set to True, this option casts all strings to max-length strings and all numeric types to DoubleType.
- Added debuggability improvements to eagerly validate DataFrame schema metadata. Enable them using snowflake.snowpark.context.configure_development_features().
- Added a new function snowflake.snowpark.dataframe.map_in_pandas that allows users to map a function across a DataFrame. The mapping function takes an iterator of pandas DataFrames as input and provides one as output.
- Added a TTL cache for describe queries. Repeated queries in a 15-second interval use the cached value rather than requerying Snowflake.
- Added a parameter fetch_with_process to DataFrameReader.dbapi (Private Preview) to enable multiprocessing for parallel data fetching in local ingestion. By default, local ingestion uses multithreading. Multiprocessing can improve performance for CPU-bound tasks like Parquet file generation.
- Added a new function snowflake.snowpark.functions.model that allows users to call methods of a model.
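The describe-query cache above works on a time-to-live basis: a result is reused while it is younger than the TTL, then recomputed. An illustrative TTL cache in pure Python (not Snowpark's implementation); it takes a clock function so expiry is easy to reason about and test.

```python
import time

class TTLCache:
    """Reuse computed values until they are older than ttl_seconds."""
    def __init__(self, ttl_seconds=15.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: no recompute (no re-query)
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value

calls = []
def describe():
    calls.append(1)  # stands in for an actual describe query round-trip
    return "schema"

cache = TTLCache(ttl_seconds=15.0)
cache.get_or_compute("SELECT 1", describe)
cache.get_or_compute("SELECT 1", describe)
print(len(calls))  # 1: the second lookup within 15 s hits the cache
```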
Improvements¶
- Added support for row validation using an XSD schema via the rowValidationXSDPath option when reading XML files with a row tag using the rowTag option.
- Improved SQL generation for session.table().sample() to generate a flat SQL statement.
- Added support for complex column expressions as input for functions.explode.
- Added debuggability improvements to show which Python lines a SQL compilation error corresponds to. Enable them using snowflake.snowpark.context.configure_development_features(). This feature also depends on AST collection being enabled in the session, which can be done using session.ast_enabled = True.
- Set enforce_ordering=True when calling to_snowpark_pandas() from a Snowpark DataFrame containing DML/DDL queries, instead of throwing a NotImplementedError.
Bug fixes¶
- Fixed a bug caused by redundant validation when creating an Iceberg table.
- Fixed a bug in DataFrameReader.dbapi (Private Preview) where closing the cursor or connection could unexpectedly raise an error and terminate the program.
- Fixed ambiguous column errors when using table functions in DataFrame.select() that have output columns matching the input DataFrame's columns. This improvement works when DataFrame columns are provided as Column objects.
- Fixed a bug where having a NULL in a column of DecimalType would cast the column to FloatType instead, leading to precision loss.
Snowpark Local testing Updates¶
- Fixed a bug when processing windowed functions that led to incorrect indexing in results.
- When a scalar numeric is passed to fillna, Snowflake now ignores non-numeric columns instead of producing an error.
Snowpark pandas API Updates¶
New features¶
- Added support for DataFrame.to_excel and Series.to_excel.
- Added support for pd.read_feather, pd.read_orc, and pd.read_stata.
- Added support for pd.explain_switch() to return debugging information on hybrid execution decisions.
- Added support for pd.read_snowflake when the global modin backend is Pandas.
- Added support for pd.to_dynamic_table, pd.to_iceberg, and pd.to_view.
Improvements¶
- Added modin telemetry on API calls and hybrid engine switches.
- Show more helpful error messages to Snowflake Notebook users when the modin or pandas version does not match our requirements.
- Added a data type guard to the cost functions for hybrid execution mode (Private Preview) that checks for data type compatibility.
- Added automatic switching to the pandas backend in hybrid execution mode (Private Preview) for many methods that are not directly implemented in pandas on Snowflake.
- Set the type and other standard fields for pandas on Snowflake telemetry.
Dependency updates¶
- Added tqdm and ipywidgets as dependencies so that progress bars appear when the user switches between modin backends.
- Updated the supported modin versions to >=0.33.0 and <0.35.0 (was previously >=0.32.0 and <0.34.0).
Bug fixes¶
- Fixed a bug in hybrid execution mode (Private Preview) where certain Series operations would raise TypeError: numpy.ndarray object is not callable.
- Fixed a bug in hybrid execution mode (Private Preview) where calling numpy operations like np.where on modin objects with the Pandas backend would raise an AttributeError. This fix requires modin version 0.34.0 or later.
- Fixed an issue in df.melt where the resulting values had an additional suffix applied.
Version 1.33.0 (2025-06-19)¶
New features¶
- Added support for MySQL in DataFrameWriter.dbapi (Private Preview) for both Parquet and UDTF-based ingestion.
- Added support for PostgreSQL in DataFrameReader.dbapi (Private Preview) for both Parquet and UDTF-based ingestion.
- Added support for Databricks in DataFrameWriter.dbapi (Private Preview) for UDTF-based ingestion.
- Added support to DataFrameReader to enable use of PATTERN when reading files with INFER_SCHEMA enabled.
- Added support for the following AI-powered functions in functions.py:
  - ai_complete
  - ai_similarity
  - ai_summarize_agg (originally summarize_agg)
  - different config options for ai_classify
- Added support for more options when reading XML files with a row tag using the rowTag option:
  - Added support for removing namespace prefixes from column names using the ignoreNamespace option.
  - Added support for specifying the prefix for the attribute column in the result table using the attributePrefix option.
  - Added support for excluding attributes from the XML element using the excludeAttributes option.
  - Added support for specifying the column name for the value when there are attributes in an element that has no child elements using the valueTag option.
  - Added support for specifying the value to treat as a null value using the nullValue option.
  - Added support for specifying the character encoding of the XML file using the charset option.
  - Added support for ignoring surrounding whitespace in the XML element using the ignoreSurroundingWhitespace option.
- Added support for the parameter return_dataframe in Session.call, which can be used to set the return type of the functions to a DataFrame object.
- Added a new argument to DataFrame.describe called strings_include_math_stats that triggers stddev and mean to be calculated for String columns.
- Added support for retrieving Edge.properties when retrieving lineage from DGQL in DataFrame.lineage.trace.
- Added a parameter table_exists to DataFrameWriter.save_as_table that allows specifying whether a table already exists. This allows skipping a table lookup that can be expensive.
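The ignoreNamespace option in the XML reader list above drops namespace prefixes so fields become plain column names. A standard-library illustration of that behavior (not the Snowpark reader itself):

```python
import xml.etree.ElementTree as ET

xml_doc = '<row xmlns:ns="http://example.com"><ns:a>1</ns:a><ns:b>2</ns:b></row>'
root = ET.fromstring(xml_doc)

def local_name(tag):
    # ElementTree renders namespaced tags as '{uri}name'; keep only 'name'.
    return tag.rsplit("}", 1)[-1]

# With namespaces ignored, each child becomes a plainly named field.
record = {local_name(child.tag): child.text for child in root}
print(record)  # {'a': '1', 'b': '2'}
```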
Bug fixes¶
- Fixed a bug in DataFrameReader.dbapi (Private Preview) where a create_connection defined as a local function was incompatible with multiprocessing.
- Fixed a bug in DataFrameReader.dbapi (Private Preview) where the Databricks TIMESTAMP type was converted to the Snowflake TIMESTAMP_NTZ type instead of the TIMESTAMP_LTZ type.
- Fixed a bug in DataFrameReader.json where repeated reads with the same reader object would create incorrectly quoted columns.
- Fixed a bug in DataFrame.to_pandas() that would drop column names when converting a DataFrame that did not originate from a select statement.
- Fixed a bug where DataFrame.create_or_replace_dynamic_table raised an error when the DataFrame contained a UDTF and SELECT * in the UDTF was not parsed correctly.
- Fixed a bug where casted columns could not be used in the values clause of functions.
Improvements¶
- Improved the error message for Session.write_pandas() and Session.create_dataframe() when the input pandas DataFrame does not have a column.
- Improved DataFrame.select when the arguments contain a table function with output columns that collide with columns of the current DataFrame. With the improvement, if the user provides non-colliding columns in df.select("col1", "col2", table_func(...)) as string arguments, the query generated by the Snowpark client will not raise an ambiguous column error.
- Improved DataFrameReader.dbapi (Private Preview) to use in-memory Parquet-based ingestion for better performance and security.
- Improved DataFrameReader.dbapi (Private Preview) to use MATCH_BY_COLUMN_NAME=CASE_SENSITIVE in the copy into table operation.
Snowpark Local testing Updates¶
New features¶
- Added support for snow URLs (snow://) in local file testing.
Bug fixes¶
- Fixed a bug in Column.isin that would cause incorrect filtering on joined or previously filtered data.
- Fixed a bug in snowflake.snowpark.functions.concat_ws that would cause results to have an incorrect index.
Snowpark pandas API Updates¶
Dependency updates¶
- Updated the modin dependency constraint from 0.32.0 to >=0.32.0, <0.34.0. The latest version tested with Snowpark pandas is modin 0.33.1.
New features¶
- Added support for Hybrid Execution (Private Preview). By running from modin.config import AutoSwitchBackend; AutoSwitchBackend.enable(), pandas on Snowflake automatically chooses whether to run certain pandas operations locally or on Snowflake. This feature is disabled by default.
Improvements¶
- Set the default value of the index parameter to False for DataFrame.to_view, Series.to_view, DataFrame.to_dynamic_table, and Series.to_dynamic_table.
- Added an iceberg_version option to table creation functions.
- Reduced the query count for many operations, including insert, repr, and groupby, that previously issued a query to retrieve the input data's size.
Bug fixes¶
- Fixed a bug in Series.where when the other parameter is an unnamed Series.
Version 1.32.0 (2025-05-15)¶
Improvements¶
- Invoking Snowflake system procedures no longer issues an additional describe procedure call to check the return type of the procedure.
- Added support for Session.create_dataframe() with the stage URL and FILE data type.
- Added support for different modes for dealing with corrupt XML records when reading an XML file using session.read.option('mode', <mode>).option('rowTag', <tag_name>).xml(<stage_file_path>). Currently PERMISSIVE, DROPMALFORMED, and FAILFAST are supported.
- Improved the error message of the XML reader when the specified row tag is not found in the file.
- Improved query generation for DataFrame.drop to use SELECT * EXCLUDE () to exclude the dropped columns. To enable this feature, set session.conf.set("use_simplified_query_generation", True).
- Added support for VariantType in StructType.from_json.
Bug fixes¶
- Fixed a bug in DataFrameWriter.dbapi (Private Preview) where unicode or double-quoted column names in external databases caused errors because they were not quoted correctly.
- Fixed a bug where named fields in nested OBJECT data could cause errors when containing spaces.
Snowpark local testing updates¶
Bug fixes¶
- Fixed a bug in snowflake.snowpark.functions.rank that would not respect sort direction.
- Fixed a bug in snowflake.snowpark.functions.to_timestamp_* that would cause incorrect results on filtered data.
Snowpark pandas API Updates¶
New features¶
- Added support for dict values in Series.str.get, Series.str.slice, and Series.str.__getitem__ (Series.str[...]).
- Added support for DataFrame.to_html.
- Added support for DataFrame.to_string and Series.to_string.
- Added support for reading files from S3 buckets using pd.read_csv.
Improvements¶
- Made iceberg_config a required parameter for DataFrame.to_iceberg and Series.to_iceberg.
Version 1.31.0 (2025-04-24)¶
New features¶
Added support for the
restricted callerpermission ofexecute_asargument inStoredProcedure.register():code:.Added support for non-select statements in
DataFrame.to_pandas().Added support for the
artifact_repositoryparameter toSession.add_packages,Session.add_requirements,Session.get_packages,Session.remove_package, andSession.clear_packages.Added support for reading an XML file using a row tag by
session.read.option('rowTag', <tag_name>).xml(<stage_file_path>)(experimental).Each XML record is extracted as a separate row.
Each field within that record becomes a separate column of type
VARIANT, which can be further queried using the dot notation, such ascol(a.b.c).
- Added updates to `DataFrameReader.dbapi` (Private Preview):
  - Added the `fetch_merge_count` parameter for optimizing performance by merging multiple fetched batches into a single Parquet file.
  - Added support for Databricks.
  - Added support for ingestion with Snowflake UDTF.
- Added support for the following AI-powered functions in `functions.py` (Private Preview):
  - `prompt`
  - `ai_filter` (added support for the `prompt()` function and image files, and changed the second argument name from `expr` to `file`)
  - `ai_classify`
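The row-tag XML reading described above maps each element matching the row tag to a row, and each child field to a column. A rough standard-library illustration of that record-to-row mapping (not the Snowpark reader itself, which returns queryable `VARIANT` columns; the sample data and helper name here are invented for illustration):

```python
# Conceptual stand-in for the rowTag XML option, using only the standard
# library: each element matching the row tag becomes a "row", and each of
# its child fields becomes a separate column.
import xml.etree.ElementTree as ET

SAMPLE = """
<books>
  <book><title>Snowpark</title><year>2025</year></book>
  <book><title>Pandas</title><year>2024</year></book>
</books>
"""

def rows_for_tag(xml_text: str, row_tag: str) -> list[dict]:
    root = ET.fromstring(xml_text)
    # Every element named row_tag yields one row; its children become fields.
    return [
        {child.tag: child.text for child in record}
        for record in root.iter(row_tag)
    ]

rows = rows_for_tag(SAMPLE, "book")
# Two <book> records -> two rows, each field a separate key/column.
```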
Improvements¶
- Renamed the `relaxed_ordering` parameter to `enforce_ordering` for `DataFrame.to_snowpark_pandas`. The new default value, `enforce_ordering=False`, has the opposite effect of the previous default value, `relaxed_ordering=False`.
- Improved `DataFrameReader.dbapi` (Private Preview) reading performance by setting the default `fetch_size` parameter value to 1000.
- Improved the error message for the invalid identifier SQL error by suggesting potentially matching identifiers.
- Reduced the number of describe queries issued when creating a DataFrame from a Snowflake table using `session.table`.
- Improved the performance and accuracy of `DataFrameAnalyticsFunctions.time_series_agg()`.
Bug fixes¶
- Fixed a bug in `DataFrame.group_by().pivot().agg` when the pivot column and aggregate column are the same.
- Fixed a bug in `DataFrameReader.dbapi` (Private Preview) where a `TypeError` was raised when `create_connection` returned a connection object of an unsupported driver type.
- Fixed a bug where a `df.limit(0)` call was not properly applied.
- Fixed a bug in `DataFrameWriter.save_as_table` that caused reserved names to throw errors when using append mode.
Deprecations¶
- Deprecated support for Python 3.8.
- Deprecated the `sliding_interval` argument in `DataFrameAnalyticsFunctions.time_series_agg()`.
Snowpark local testing updates¶
New features¶
- Added support for Interval expressions in `Window.range_between`.
- Added support for the `array_construct` function.
Bug fixes¶
- Fixed a bug in local testing where the transient `__pycache__` directory was unintentionally copied during stored procedure execution via import.
- Fixed a bug in local testing that produced incorrect results for `Column.like` calls.
- Fixed a bug in local testing that caused `Column.getItem` and `snowflake.snowpark.functions.get` to raise `IndexError` rather than return `null`.
- Fixed a bug in local testing where a `df.limit(0)` call was not properly applied.
- Fixed a bug in local testing where a `Table.merge` into an empty table would cause an exception.
Snowpark pandas API updates¶
Dependency updates¶
- Updated `modin` from 0.30.1 to 0.32.0.
- Added support for `numpy` 2.0 and above.
New features¶
- Added support for `DataFrame.create_or_replace_view` and `Series.create_or_replace_view`.
- Added support for `DataFrame.create_or_replace_dynamic_table` and `Series.create_or_replace_dynamic_table`.
- Added support for `DataFrame.to_view` and `Series.to_view`.
- Added support for `DataFrame.to_dynamic_table` and `Series.to_dynamic_table`.
- Added support for `DataFrame.groupby.resample` for the aggregations `max`, `mean`, `median`, `min`, and `sum`.
- Added support for reading stage files using:
  - `pd.read_excel`
  - `pd.read_html`
  - `pd.read_pickle`
  - `pd.read_sas`
  - `pd.read_xml`
- Added support for `DataFrame.to_iceberg` and `Series.to_iceberg`.
- Added support for dict values in `Series.str.len`.
Improvements¶
- Improved the performance of `DataFrame.groupby.apply` and `Series.groupby.apply` by avoiding an expensive pivot step.
- Added an estimate for the row count upper bound to `OrderedDataFrame` to enable better engine switching. This could potentially result in increased query counts.
- Renamed the `relaxed_ordering` parameter to `enforce_ordering` in `pd.read_snowflake`. The new default value, `enforce_ordering=False`, has the opposite effect of the previous default value, `relaxed_ordering=False`.
Bug fixes¶
- Fixed a bug in `pd.read_snowflake` when reading Iceberg tables with `enforce_ordering=True`.
Version 1.30.0 (2025-03-27)¶
New features¶
- Added support for relaxed consistency and ordering guarantees in `DataFrame.to_snowpark_pandas` by introducing the `relaxed_ordering` parameter.
- `DataFrameReader.dbapi` (preview) now accepts a list of strings for the `session_init_statement` parameter, allowing multiple SQL statements to be executed during session initialization.
Improvements¶
- Improved query generation for `DataFrame.stat.sample_by` to generate a single flat query that scales well with a large `fractions` dictionary, compared to the older method of creating a UNION ALL subquery for each key in `fractions`. To enable this feature, set `session.conf.set("use_simplified_query_generation", True)`.
- Improved the performance of `DataFrameReader.dbapi` by enabling the vectorized option when copying a Parquet file into a table.
- Improved query generation for `DataFrame.random_split` in the following ways. They can be enabled by setting `session.conf.set("use_simplified_query_generation", True)`:
  - Removed the need for `cache_result` in the internal implementation of the input DataFrame, resulting in a purely lazy DataFrame operation.
  - The `seed` argument now behaves as expected, with repeatable results across multiple calls and sessions.
- `DataFrame.fillna` and `DataFrame.replace` now both support fitting `int` and `float` into `Decimal` columns if `include_decimal` is set to `True`.
- Added documentation for the following UDF and stored procedure functions in `files.py` as a result of their General Availability:
  - `SnowflakeFile.write`
  - `SnowflakeFile.writelines`
  - `SnowflakeFile.writeable`
- Made minor documentation changes for `SnowflakeFile` and `SnowflakeFile.open()`.
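`SnowflakeFile.write` and `SnowflakeFile.writelines` follow Python's standard file-object protocol. Because exercising the real API requires a Snowflake stage, this sketch substitutes `io.StringIO` as a stand-in to show the call shapes a UDF or stored procedure body would use; the `write_report` helper is invented for illustration:

```python
# SnowflakeFile.write / writelines follow Python's file-object protocol.
# io.StringIO is used here as a writable stand-in, since the real handle
# only exists inside a Snowflake UDF/stored procedure.
import io

def write_report(f) -> int:
    # The same calls would be made on a writable SnowflakeFile handle.
    written = f.write("header\n")          # returns number of characters written
    f.writelines(["row1\n", "row2\n"])     # writes each string, no separators added
    return written

buf = io.StringIO()
write_report(buf)
content = buf.getvalue()
# content == "header\nrow1\nrow2\n"
```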
Bug fixes¶
- Fixed a bug for the following functions that raised errors when `.cast()` is applied to their output:
  - `from_json`
  - `size`
Snowpark local testing updates¶
Bug fixes¶
- Fixed a bug in aggregation that caused empty groups to still produce rows.
- Fixed a bug in `DataFrame.except_` that caused rows to be incorrectly dropped.
- Fixed a bug that caused `to_timestamp` to fail when casting filtered columns.
Snowpark pandas API updates¶
New features¶
- Added support for list values in `Series.str.__getitem__` (`Series.str[...]`).
- Added support for `pd.Grouper` objects in GROUP BY operations. When `freq` is specified, the default values of the `sort`, `closed`, `label`, and `convention` arguments are supported; `origin` is supported when it is `start` or `start_day`.
- Added support for relaxed consistency and ordering guarantees in `pd.read_snowflake` for both named data sources (for example, tables and views) and query data sources by introducing the new `relaxed_ordering` parameter.
Improvements¶
- Raised a warning whenever `QUOTED_IDENTIFIERS_IGNORE_CASE` is found to be set, asking the user to unset it.
- Improved how a missing `index_label` in `DataFrame.to_snowflake` and `Series.to_snowflake` is handled when `index=True`. Instead of raising a `ValueError`, system-defined labels are used for the index columns.
- Improved the error message for `agg` on `groupby`, `DataFrame`, or `Series` objects when the function name is not supported.
Version 1.29.1 (2025-03-12)¶
Bug fixes¶
- Fixed a bug in `DataFrameReader.dbapi` (Private Preview) that prevented usage in stored procedures and Snowbooks.
Version 1.29.0 (2025-03-05)¶
New features¶
- Added support for the following AI-powered functions in `functions.py` (Private Preview):
  - `ai_filter`
  - `ai_agg`
  - `summarize_agg`
- Added support for the new FILE SQL type, with the following related functions in `functions.py` (Private Preview):
  - `fl_get_content_type`
  - `fl_get_etag`
  - `fl_get_file_type`
  - `fl_get_last_modified`
  - `fl_get_relative_path`
  - `fl_get_scoped_file_url`
  - `fl_get_size`
  - `fl_get_stage`
  - `fl_get_stage_file_url`
  - `fl_is_audio`
  - `fl_is_compressed`
  - `fl_is_document`
  - `fl_is_image`
  - `fl_is_video`
- Added support for importing third-party packages from PyPI using Artifact Repository (Private Preview):
  - Use the keyword arguments `artifact_repository` and `packages` to specify your artifact repository and packages, respectively, when registering stored procedures or user-defined functions.
  - Supported APIs are:
    - `Session.sproc.register`
    - `Session.udf.register`
    - `Session.udaf.register`
    - `Session.udtf.register`
    - `functions.sproc`
    - `functions.udf`
    - `functions.udaf`
    - `functions.udtf`
    - `functions.pandas_udf`
    - `functions.pandas_udtf`
Improvements¶
- Improved version validation warnings for `snowflake-snowpark-python` package compatibility when registering stored procedures. Warnings are now triggered only if the major or minor version does not match; bugfix version differences no longer generate warnings.
- Bumped the cloudpickle dependency to also support `cloudpickle==3.0.0` in addition to previous versions.
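The relaxed version check described above can be sketched as a comparison of the major and minor components only. The helper name and exact policy below are illustrative, not the library's actual implementation:

```python
# Sketch of the relaxed compatibility check: warn only when the major or
# minor version differs; ignore bugfix-level differences.
def needs_version_warning(client: str, server: str) -> bool:
    c_major, c_minor, _ = (int(p) for p in client.split("."))
    s_major, s_minor, _ = (int(p) for p in server.split("."))
    # Bugfix component is deliberately ignored.
    return (c_major, c_minor) != (s_major, s_minor)

# 1.29.0 vs 1.29.3 differ only at the bugfix level: no warning.
# 1.29.0 vs 1.30.0 differ at the minor level: warning.
```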
Bug fixes¶
- Fixed a bug where creating a DataFrame with a large number of values raised the error `Unsupported feature 'SCOPED_TEMPORARY'.` when the thread-safe session was disabled.
- Fixed a bug where `df.describe` raised an internal SQL execution error when the DataFrame was created from reading a stage file and CTE optimization was enabled.
- Fixed a bug where `df.order_by(A).select(B).distinct()` would generate invalid SQL when simplified query generation was enabled using `session.conf.set("use_simplified_query_generation", True)`.
- Disabled simplified query generation by default.
Snowpark pandas API updates¶
Improvements¶
- Improved the error message for `pd.to_snowflake`, `DataFrame.to_snowflake`, and `Series.to_snowflake` when the table does not exist.
- Improved the readability of the docstring for the `if_exists` parameter in `pd.to_snowflake`, `DataFrame.to_snowflake`, and `Series.to_snowflake`.
- Improved the error messages for all pandas functions that use UDFs with Snowpark objects.
Bug fixes¶
- Fixed a bug in `Series.rename_axis` where an `AttributeError` was being raised.
- Fixed a bug where `pd.get_dummies` did not ignore NULL/NaN values by default.
- Fixed a bug where repeated calls to `pd.get_dummies` resulted in a "Duplicated column name" error.
- Fixed a bug in `pd.get_dummies` where passing a list of columns generated incorrect column labels in the output DataFrame.
- Updated `pd.get_dummies` to return bool values instead of int.
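The `pd.get_dummies` changes above can be summarized with a tiny standard-library model: None (NULL/NaN) values are ignored by default, repeated values do not produce duplicated columns, and the indicator values are bools rather than ints. The `one_hot` helper below is an invented stand-in that only models the documented behavior:

```python
# Minimal stand-in for the pd.get_dummies behavior described above:
# None values are ignored, each distinct value yields exactly one column,
# and indicator values are bools.
def one_hot(values: list) -> dict[str, list[bool]]:
    # Deduplicate categories; skip None so NULL/NaN produce no column.
    categories = sorted({v for v in values if v is not None})
    return {c: [v == c for v in values] for c in categories}

dummies = one_hot(["a", None, "b", "a"])
# {'a': [True, False, False, True], 'b': [False, False, True, False]}
```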
Snowpark local testing updates¶
New features¶
- Added support for literal values in the `range_between` window function.
Version 1.28.0 (2025-02-20)¶
New features¶
- Added support for the following functions in `functions.py`:
  - `normal`
  - `randn`
- Added support for the `allow_missing_columns` parameter in `DataFrame.union_by_name` and `DataFrame.union_all_by_name`.
Improvements¶
- Improved random object name generation to avoid collisions.
- Improved query generation for `DataFrame.distinct` to generate `SELECT DISTINCT` instead of `SELECT` with a `GROUP BY` on all columns. To disable this feature, set `session.conf.set("use_simplified_query_generation", False)`.
Deprecations¶
- Deprecated the Snowpark Python function `snowflake_cortex_summarize`. Users can install `snowflake-ml-python` and use the `snowflake.cortex.summarize` function instead.
- Deprecated the Snowpark Python function `snowflake_cortex_sentiment`. Users can install `snowflake-ml-python` and use the `snowflake.cortex.sentiment` function instead.
Bug fixes¶
- Fixed a bug where the session-level query tag was overwritten by a stack trace for DataFrames that generate multiple queries. Now the query tag is only set to the stack trace when `session.conf.set("collect_stacktrace_in_query_tag", True)` is set.
- Fixed a bug in `Session._write_pandas` where it erroneously passed the `use_logical_type` parameter to `Session._write_modin_pandas_helper` when writing a Snowpark pandas object.
- Fixed a bug in options SQL generation that could cause multiple values to be formatted incorrectly.
- Fixed a bug in `Session.catalog` where empty strings for database or schema were not handled correctly and generated erroneous SQL statements.
Experimental Features¶
- Added support for writing pyarrow Tables to Snowflake tables.
Snowpark pandas API updates¶
New features¶
- Added support for applying the Snowflake Cortex functions `Summarize` and `Sentiment`.
- Added support for list values in `Series.str.get`.
Bug fixes¶
- Fixed a bug in `apply` where kwargs were not being correctly passed into the applied function.
Snowpark local testing updates¶
New features¶
- Added support for the following functions:
  - `hour`
  - `minute`
- Added support for the `NULL_IF` parameter in the CSV reader.
- Added support for the `date_format`, `datetime_format`, and `timestamp_format` options when loading CSVs.
Bug fixes¶
- Fixed a bug in `DataFrame.join` that caused columns to have incorrect typing.
- Fixed a bug in `when` statements that caused incorrect results in the `otherwise` clause.
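The `when`/`otherwise` fix above concerns CASE-style conditional semantics: the first matching branch wins, and `otherwise` supplies the fallback for unmatched rows. A plain-Python model of those semantics (not Snowpark's actual API; `case_when` is an invented helper):

```python
# Plain-Python model of when/otherwise (SQL CASE) semantics: branches
# are checked in order, the first matching predicate wins, and the
# otherwise value covers everything that matched no branch.
def case_when(value, branches, otherwise):
    # branches: list of (predicate, result) pairs, evaluated in order.
    for predicate, result in branches:
        if predicate(value):
            return result
    return otherwise

label = case_when(
    7,
    [(lambda v: v < 5, "small"), (lambda v: v < 10, "medium")],
    "large",
)
# label == "medium"; a value of 42 would fall through to "large".
```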
Version 1.27.0 (2025-02-05)¶
New features¶
- Added support for the following functions in `functions.py`:
  - `array_reverse`
  - `divnull`
  - `map_cat`
  - `map_contains_key`
  - `map_keys`
  - `nullifzero`
  - `snowflake_cortex_sentiment`
  - `acosh`
  - `asinh`
  - `atanh`
  - `bit_length`
  - `bitmap_bit_position`
  - `bitmap_bucket_number`
  - `bitmap_construct_agg`
  - `cbrt`
  - `equal_null`
  - `from_json`
  - `ifnull`
  - `localtimestamp`
  - `max_by`
  - `min_by`
  - `nth_value`
  - `nvl`
  - `octet_length`
  - `position`
  - `regr_avgx`
  - `regr_avgy`
  - `regr_count`
  - `regr_intercept`
  - `regr_r2`
  - `regr_slope`
  - `regr_sxx`
  - `regr_sxy`
  - `regr_syy`
  - `try_to_binary`
  - `base64`
  - `base64_decode_string`
  - `base64_encode`
  - `editdistance`
  - `hex`
  - `hex_encode`
  - `instr`
  - `log1p`
  - `log2`
  - `log10`
  - `percentile_approx`
  - `unbase64`
- Added support for specifying a schema string (including implicit struct syntax) when calling `DataFrame.create_dataframe`.
- Added support for `DataFrameWriter.insert_into`/`insertInto`. This method also supports local testing mode.
- Added support for `DataFrame.create_temp_view` to create a temporary view. It will fail if the view already exists.
- Added support for multiple columns in the functions `map_cat` and `map_concat`.
- Added a `keep_column_order` option for keeping the original column order in `DataFrame.with_column` and `DataFrame.with_columns`.
- Added options to column casts that allow renaming or adding fields in `StructType` columns.
- Added support for the `contains_null` parameter in `ArrayType`.
- Added support for creating a temporary view via `DataFrame.create_or_replace_temp_view` from a DataFrame created by reading a file from a stage.
- Added support for the `value_contains_null` parameter in `MapType`.
- Added `interactive` to telemetry, which indicates whether the current environment is an interactive one.
- Allowed `session.file.get` in a Native App to read file paths starting with `/` from the current version.
- Added support for multiple aggregation functions after `DataFrame.pivot`.
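Among the functions listed above, `editdistance` computes the Levenshtein distance between two strings. The Snowpark function pushes the computation into SQL; as a hedged illustration of the metric itself (not the library's implementation), a compact reference version:

```python
# Reference implementation of Levenshtein edit distance, the metric
# computed by the SQL EDITDISTANCE function that editdistance wraps.
def levenshtein(a: str, b: str) -> int:
    # Row of distances from the empty prefix of a to every prefix of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                # deletion
                curr[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),   # substitution (0 if chars match)
            ))
        prev = curr
    return prev[len(b)]

# levenshtein("kitten", "sitting") == 3
```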
Experimental features¶
- Added the `Session.catalog` class to manage Snowflake objects. It can be accessed via `Session.catalog`. `snowflake.core` is a dependency required for this feature.
- Allowed a user-input schema or user-input schemas when reading a JSON file on a stage.
- Added support for specifying a schema string (including implicit struct syntax) when calling `DataFrame.create_dataframe`.
Improvements¶
- Updated `README.md` to include instructions on how to verify package signatures using `cosign`.
Bug fixes¶
- Fixed a bug in local testing mode that caused a column to contain `None` when it should contain 0.
- Fixed a bug in `StructField.from_json` that prevented `TimestampType` with `tzinfo` from being parsed correctly.
- Fixed a bug in the function `date_format` that caused an error when the input column was of date or timestamp type.
- Fixed a bug in DataFrame that allowed null values to be inserted in a non-nullable column.
- Fixed a bug in the functions `replace` and `lit` that raised a type hint assertion error when passing `Column` expression objects.
- Fixed a bug in `pandas_udf` and `pandas_udtf` where session parameters were erroneously ignored.
- Fixed a bug that raised an incorrect type conversion error for system functions called through `session.call`.
Snowpark pandas API updates¶
New features¶
- Added support for `Series.str.ljust` and `Series.str.rjust`.
- Added support for `Series.str.center`.
- Added support for `Series.str.pad`.
- Added support for applying the Snowpark Python function `snowflake_cortex_sentiment`.
- Added support for `DataFrame.map`.
- Added support for `DataFrame.from_dict` and `DataFrame.from_records`.
- Added support for mixed case field names in struct type columns.
- Added support for `SeriesGroupBy.unique`.
- Added support for `Series.dt.strftime` with the following directives:
  - `%d`: Day of the month as a zero-padded decimal number.
  - `%m`: Month as a zero-padded decimal number.
  - `%Y`: Year with century as a decimal number.
  - `%H`: Hour (24-hour clock) as a zero-padded decimal number.
  - `%M`: Minute as a zero-padded decimal number.
  - `%S`: Second as a zero-padded decimal number.
  - `%f`: Microsecond as a decimal number, zero-padded to 6 digits.
  - `%j`: Day of the year as a zero-padded decimal number.
  - `%X`: Locale's appropriate time representation.
  - `%%`: A literal '%' character.
- Added support for `Series.between`.
- Added support for `include_groups=False` in `DataFrameGroupBy.apply`.
- Added support for `expand=True` in `Series.str.split`.
- Added support for `DataFrame.pop` and `Series.pop`.
- Added support for `first` and `last` in `DataFrameGroupBy.agg` and `SeriesGroupBy.agg`.
- Added support for `Index.drop_duplicates`.
- Added support for the aggregations `"count"`, `"median"`, `np.median`, `"skew"`, `"std"`, `np.std`, `"var"`, and `np.var` in `pd.pivot_table()`, `DataFrame.pivot_table()`, and `pd.crosstab()`.
Improvements¶
- Improved the performance of the `DataFrame.map`, `Series.apply`, and `Series.map` methods by mapping numpy functions to Snowpark functions when possible.
- Added documentation for `DataFrame.map`.
- Improved the performance of `DataFrame.apply` by mapping numpy functions to Snowpark functions when possible.
- Added documentation on the extent of Snowpark pandas interoperability with scikit-learn.
- Inferred the return type of functions in `Series.map`, `Series.apply`, and `DataFrame.map` when a type hint is not provided.
- Added `call_count` to telemetry, which counts method calls including interchange protocol calls.
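Return-type handling like that described in the bullets above relies on reading a function's annotations; a minimal sketch of the mechanism using the standard `typing` module (this shows only the idea, not Snowpark pandas' actual inference logic, and `inferred_return_type` is an invented helper):

```python
# When a type hint is present, the mapped function's return type can be
# read directly from its annotations; absent a hint, it must be inferred.
from typing import get_type_hints

def double(x: int) -> int:
    return x * 2

def unannotated(x):
    return x * 2

def inferred_return_type(func):
    # Returns the annotated return type, or None when no hint exists.
    return get_type_hints(func).get("return")

rt = inferred_return_type(double)        # int
missing = inferred_return_type(unannotated)  # None -> must infer from results
```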