Snowpark Migration Accelerator: Release Notes¶
Note that the release notes below are organized by release date. Version numbers for both the application and the conversion core will appear below.
February 5th, 2025¶
Hotfix: Application & CLI Version 2.5.2¶
Desktop App¶
Fixed an issue that occurred when converting with the sample project option.
Included SMA Core Versions¶
Snowpark Conversion Core 5.3.0
February 4th, 2025¶
Application & CLI Version 2.5.1¶
Desktop App¶
Added a new modal when the user does not have write permission.
Updated the licensing agreement; acceptance is now required.
CLI¶
Fixed the year shown on the CLI screen when running with “--version” or “-v”.
Included SMA Core Versions¶
Snowpark Conversion Core 5.3.0
Added¶
Added the following Python Third-Party libraries with Direct status:
about-time
affinegap
aiohappyeyeballs
alibi-detect
alive-progress
allure-nose2
allure-robotframework
anaconda-cloud-cli
anaconda-mirror
astropy-iers-data
asynch
asyncssh
autots
autoviml
aws-msk-iam-sasl-signer-python
azure-functions
backports.tarfile
blas
bottle
bson
cairo
capnproto
captum
categorical-distance
census
clickhouse-driver
clustergram
cma
conda-anaconda-telemetry
configspace
cpp-expected
dask-expr
data-science-utils
databricks-sdk
datetime-distance
db-dtypes
dedupe
dedupe-variable-datetime
dedupe_lehvenshtein_search
dedupe_levenshtein_search
diff-cover
diptest
dmglib
docstring_parser
doublemetaphone
dspy-ai
econml
emcee
emoji
environs
eth-abi
eth-hash
eth-typing
eth-utils
expat
filetype
fitter
flask-cors
fpdf2
frozendict
gcab
geojson
gettext
glib-tools
google-ads
google-ai-generativelanguage
google-api-python-client
google-auth-httplib2
google-cloud-bigquery
google-cloud-bigquery-core
google-cloud-bigquery-storage
google-cloud-bigquery-storage-core
google-cloud-resource-manager
google-generativeai
googlemaps
grapheme
graphene
graphql-relay
gravis
greykite
grpc-google-iam-v1
harfbuzz
hatch-fancy-pypi-readme
haversine
hiclass
hicolor-icon-theme
highered
hmmlearn
holidays-ext
httplib2
icu
imbalanced-ensemble
immutabledict
importlib-metadata
importlib-resources
inquirerpy
iterative-telemetry
jaraco.context
jaraco.test
jiter
jiwer
joserfc
jsoncpp
jsonpath
jsonpath-ng
jsonpath-python
kagglehub
keplergl
kt-legacy
langchain-community
langchain-experimental
langchain-snowflake
langchain-text-splitters
libabseil
libflac
libgfortran-ng
libgfortran5
libglib
libgomp
libgrpc
libgsf
libmagic
libogg
libopenblas
libpostal
libprotobuf
libsentencepiece
libsndfile
libstdcxx-ng
libtheora
libtiff
libvorbis
libwebp
lightweight-mmm
litestar
litestar-with-annotated-types
litestar-with-attrs
litestar-with-cryptography
litestar-with-jinja
litestar-with-jwt
litestar-with-prometheus
litestar-with-structlog
lunarcalendar-ext
matplotlib-venn
metricks
mimesis
modin-ray
momepy
mpg123
msgspec
msgspec-toml
msgspec-yaml
msitools
multipart
namex
nbconvert-all
nbconvert-core
nbconvert-pandoc
nlohmann_json
numba-cuda
numpyro
office365-rest-python-client
openapi-pydantic
opentelemetry-distro
opentelemetry-instrumentation
opentelemetry-instrumentation-system-metrics
optree
osmnx
pathlib
pdf2image
pfzy
pgpy
plumbum
pm4py
polars
polyfactory
poppler-cpp
postal
pre-commit
prompt-toolkit
propcache
py-partiql-parser
py_stringmatching
pyatlan
pyfakefs
pyfhel
pyhacrf-datamade
pyiceberg
pykrb5
pylbfgs
pymilvus
pymoo
pynisher
pyomo
pypdf
pypdf-with-crypto
pypdf-with-full
pypdf-with-image
pypng
pyprind
pyrfr
pysoundfile
pytest-codspeed
pytest-trio
python-barcode
python-box
python-docx
python-gssapi
python-iso639
python-magic
python-pandoc
python-zstd
pyuca
pyvinecopulib
pyxirr
qrcode
rai-sdk
ray-client
ray-observability
readline
rich-click
rouge-score
ruff
scikit-criteria
scikit-mobility
sentencepiece-python
sentencepiece-spm
setuptools-markdown
setuptools-scm
setuptools-scm-git-archive
shareplum
simdjson
simplecosine
sis-extras
slack-sdk
smac
snowflake-sqlalchemy
snowflake_legacy
socrata-py
spdlog
sphinxcontrib-images
sphinxcontrib-jquery
sphinxcontrib-youtube
splunk-opentelemetry
sqlfluff
squarify
st-theme
statistics
streamlit-antd-components
streamlit-condition-tree
streamlit-echarts
streamlit-feedback
streamlit-keplergl
streamlit-mermaid
streamlit-navigation-bar
streamlit-option-menu
strictyaml
stringdist
sybil
tensorflow-cpu
tensorflow-text
tiledb-ptorchaudio
torcheval
trio-websocket
trulens-connectors-snowflake
trulens-core
trulens-dashboard
trulens-feedback
trulens-otel-semconv
trulens-providers-cortex
tsdownsample
typing
typing-extensions
typing_extensions
unittest-xml-reporting
uritemplate
us
uuid6
wfdb
wsproto
zlib
zope.index
Added the following Python BuiltIn libraries with Direct status:
aifc
array
ast
asynchat
asyncio
asyncore
atexit
audioop
base64
bdb
binascii
bisect
builtins
bz2
calendar
cgi
cgitb
chunk
cmath
cmd
code
codecs
codeop
colorsys
compileall
concurrent
contextlib
contextvars
copy
copyreg
cprofile
crypt
csv
ctypes
curses
dbm
difflib
dis
distutils
doctest
email
ensurepip
enum
errno
faulthandler
fcntl
filecmp
fileinput
fnmatch
fractions
ftplib
functools
gc
getopt
getpass
gettext
graphlib
grp
gzip
hashlib
heapq
hmac
html
http
idlelib
imaplib
imghdr
imp
importlib
inspect
ipaddress
itertools
keyword
linecache
locale
lzma
mailbox
mailcap
marshal
math
mimetypes
mmap
modulefinder
msilib
multiprocessing
netrc
nis
nntplib
numbers
operator
optparse
ossaudiodev
pdb
pickle
pickletools
pipes
pkgutil
platform
plistlib
poplib
posix
pprint
profile
pstats
pty
pwd
py_compile
pyclbr
pydoc
queue
quopri
random
re
reprlib
resource
rlcompleter
runpy
sched
secrets
select
selectors
shelve
shlex
signal
site
sitecustomize
smtpd
smtplib
sndhdr
socket
socketserver
spwd
sqlite3
ssl
stat
string
stringprep
struct
subprocess
sunau
symtable
sysconfig
syslog
tabnanny
tarfile
telnetlib
tempfile
termios
test
textwrap
threading
timeit
tkinter
token
tokenize
tomllib
trace
traceback
tracemalloc
tty
turtle
turtledemo
types
unicodedata
urllib
uu
uuid
venv
warnings
wave
weakref
webbrowser
wsgiref
xdrlib
xml
xmlrpc
zipapp
zipfile
zipimport
zoneinfo
Added the following Python BuiltIn libraries with NotSupported status:
msvcrt
winreg
winsound
Changed¶
Updated the .NET version to v9.0.0.
Improved EWI SPRKPY1068.
Bumped the version of Snowpark Python API supported by the SMA from 1.24.0 to 1.25.0.
Updated the detailed report template, which now includes the Snowpark version for Pandas.
Changed the following libraries from ThirdPartyLib to BuiltIn:
configparser
dataclasses
pathlib
readline
statistics
zlib
Updated the mapping status for the following Pandas elements, from Direct to Partial:
pandas.core.frame.DataFrame.add
pandas.core.frame.DataFrame.aggregate
pandas.core.frame.DataFrame.all
pandas.core.frame.DataFrame.apply
pandas.core.frame.DataFrame.astype
pandas.core.frame.DataFrame.cumsum
pandas.core.frame.DataFrame.div
pandas.core.frame.DataFrame.dropna
pandas.core.frame.DataFrame.eq
pandas.core.frame.DataFrame.ffill
pandas.core.frame.DataFrame.fillna
pandas.core.frame.DataFrame.floordiv
pandas.core.frame.DataFrame.ge
pandas.core.frame.DataFrame.groupby
pandas.core.frame.DataFrame.gt
pandas.core.frame.DataFrame.idxmax
pandas.core.frame.DataFrame.idxmin
pandas.core.frame.DataFrame.inf
pandas.core.frame.DataFrame.join
pandas.core.frame.DataFrame.le
pandas.core.frame.DataFrame.loc
pandas.core.frame.DataFrame.lt
pandas.core.frame.DataFrame.mask
pandas.core.frame.DataFrame.merge
pandas.core.frame.DataFrame.mod
pandas.core.frame.DataFrame.mul
pandas.core.frame.DataFrame.ne
pandas.core.frame.DataFrame.nunique
pandas.core.frame.DataFrame.pivot_table
pandas.core.frame.DataFrame.pow
pandas.core.frame.DataFrame.radd
pandas.core.frame.DataFrame.rank
pandas.core.frame.DataFrame.rdiv
pandas.core.frame.DataFrame.rename
pandas.core.frame.DataFrame.replace
pandas.core.frame.DataFrame.resample
pandas.core.frame.DataFrame.rfloordiv
pandas.core.frame.DataFrame.rmod
pandas.core.frame.DataFrame.rmul
pandas.core.frame.DataFrame.rolling
pandas.core.frame.DataFrame.round
pandas.core.frame.DataFrame.rpow
pandas.core.frame.DataFrame.rsub
pandas.core.frame.DataFrame.rtruediv
pandas.core.frame.DataFrame.shift
pandas.core.frame.DataFrame.skew
pandas.core.frame.DataFrame.sort_index
pandas.core.frame.DataFrame.sort_values
pandas.core.frame.DataFrame.sub
pandas.core.frame.DataFrame.to_dict
pandas.core.frame.DataFrame.transform
pandas.core.frame.DataFrame.transpose
pandas.core.frame.DataFrame.truediv
pandas.core.frame.DataFrame.var
pandas.core.indexes.datetimes.date_range
pandas.core.reshape.concat.concat
pandas.core.reshape.melt.melt
pandas.core.reshape.merge.merge
pandas.core.reshape.pivot.pivot_table
pandas.core.reshape.tile.cut
pandas.core.series.Series.add
pandas.core.series.Series.aggregate
pandas.core.series.Series.all
pandas.core.series.Series.any
pandas.core.series.Series.cumsum
pandas.core.series.Series.div
pandas.core.series.Series.dropna
pandas.core.series.Series.eq
pandas.core.series.Series.ffill
pandas.core.series.Series.fillna
pandas.core.series.Series.floordiv
pandas.core.series.Series.ge
pandas.core.series.Series.gt
pandas.core.series.Series.lt
pandas.core.series.Series.mask
pandas.core.series.Series.mod
pandas.core.series.Series.mul
pandas.core.series.Series.multiply
pandas.core.series.Series.ne
pandas.core.series.Series.pow
pandas.core.series.Series.quantile
pandas.core.series.Series.radd
pandas.core.series.Series.rank
pandas.core.series.Series.rdiv
pandas.core.series.Series.rename
pandas.core.series.Series.replace
pandas.core.series.Series.resample
pandas.core.series.Series.rfloordiv
pandas.core.series.Series.rmod
pandas.core.series.Series.rmul
pandas.core.series.Series.rolling
pandas.core.series.Series.rpow
pandas.core.series.Series.rsub
pandas.core.series.Series.rtruediv
pandas.core.series.Series.sample
pandas.core.series.Series.shift
pandas.core.series.Series.skew
pandas.core.series.Series.sort_index
pandas.core.series.Series.sort_values
pandas.core.series.Series.std
pandas.core.series.Series.sub
pandas.core.series.Series.subtract
pandas.core.series.Series.truediv
pandas.core.series.Series.value_counts
pandas.core.series.Series.var
pandas.core.series.Series.where
pandas.core.tools.numeric.to_numeric
Updated the mapping status for the following Pandas elements, from NotSupported to Direct:
pandas.core.frame.DataFrame.attrs
pandas.core.indexes.base.Index.to_numpy
pandas.core.series.Series.str.len
pandas.io.html.read_html
pandas.io.xml.read_xml
pandas.core.indexes.datetimes.DatetimeIndex.mean
pandas.core.resample.Resampler.indices
pandas.core.resample.Resampler.nunique
pandas.core.series.Series.items
pandas.core.tools.datetimes.to_datetime
pandas.io.sas.sasreader.read_sas
pandas.core.frame.DataFrame.style
pandas.core.frame.DataFrame.items
pandas.core.groupby.generic.DataFrameGroupBy.head
pandas.core.groupby.generic.DataFrameGroupBy.median
pandas.core.groupby.generic.DataFrameGroupBy.min
pandas.core.groupby.generic.DataFrameGroupBy.nunique
pandas.core.groupby.generic.DataFrameGroupBy.tail
pandas.core.indexes.base.Index.is_boolean
pandas.core.indexes.base.Index.is_floating
pandas.core.indexes.base.Index.is_integer
pandas.core.indexes.base.Index.is_monotonic_decreasing
pandas.core.indexes.base.Index.is_monotonic_increasing
pandas.core.indexes.base.Index.is_numeric
pandas.core.indexes.base.Index.is_object
pandas.core.indexes.base.Index.max
pandas.core.indexes.base.Index.min
pandas.core.indexes.base.Index.name
pandas.core.indexes.base.Index.names
pandas.core.indexes.base.Index.rename
pandas.core.indexes.base.Index.set_names
pandas.core.indexes.datetimes.DatetimeIndex.day_name
pandas.core.indexes.datetimes.DatetimeIndex.month_name
pandas.core.indexes.datetimes.DatetimeIndex.time
pandas.core.indexes.timedeltas.TimedeltaIndex.ceil
pandas.core.indexes.timedeltas.TimedeltaIndex.days
pandas.core.indexes.timedeltas.TimedeltaIndex.floor
pandas.core.indexes.timedeltas.TimedeltaIndex.microseconds
pandas.core.indexes.timedeltas.TimedeltaIndex.nanoseconds
pandas.core.indexes.timedeltas.TimedeltaIndex.round
pandas.core.indexes.timedeltas.TimedeltaIndex.seconds
pandas.core.reshape.pivot.crosstab
pandas.core.series.Series.dt.round
pandas.core.series.Series.dt.time
pandas.core.series.Series.dt.weekday
pandas.core.series.Series.is_monotonic_decreasing
pandas.core.series.Series.is_monotonic_increasing
Updated the mapping status for the following Pandas elements, from NotSupported to Partial:
pandas.core.frame.DataFrame.align
pandas.core.series.Series.align
pandas.core.frame.DataFrame.tz_convert
pandas.core.frame.DataFrame.tz_localize
pandas.core.groupby.generic.DataFrameGroupBy.fillna
pandas.core.groupby.generic.SeriesGroupBy.fillna
pandas.core.indexes.datetimes.bdate_range
pandas.core.indexes.datetimes.DatetimeIndex.std
pandas.core.indexes.timedeltas.TimedeltaIndex.mean
pandas.core.resample.Resampler.asfreq
pandas.core.resample.Resampler.quantile
pandas.core.series.Series.map
pandas.core.series.Series.tz_convert
pandas.core.series.Series.tz_localize
pandas.core.window.expanding.Expanding.count
pandas.core.window.rolling.Rolling.count
pandas.core.groupby.generic.DataFrameGroupBy.aggregate
pandas.core.groupby.generic.SeriesGroupBy.aggregate
pandas.core.frame.DataFrame.applymap
pandas.core.series.Series.apply
pandas.core.groupby.generic.DataFrameGroupBy.bfill
pandas.core.groupby.generic.DataFrameGroupBy.ffill
pandas.core.groupby.generic.SeriesGroupBy.bfill
pandas.core.groupby.generic.SeriesGroupBy.ffill
pandas.core.frame.DataFrame.backfill
pandas.core.frame.DataFrame.bfill
pandas.core.frame.DataFrame.compare
pandas.core.frame.DataFrame.unstack
pandas.core.frame.DataFrame.asfreq
pandas.core.series.Series.backfill
pandas.core.series.Series.bfill
pandas.core.series.Series.compare
pandas.core.series.Series.unstack
pandas.core.series.Series.asfreq
pandas.core.series.Series.argmax
pandas.core.series.Series.argmin
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.microsecond
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.nanosecond
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.day_name
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_name
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_start
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_end
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_year_start
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_year_end
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_quarter_start
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_quarter_end
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_leap_year
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.floor
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.ceil
pandas.core.groupby.generic.DataFrameGroupBy.idxmax
pandas.core.groupby.generic.DataFrameGroupBy.idxmin
pandas.core.groupby.generic.DataFrameGroupBy.std
pandas.core.tools.timedeltas.to_timedelta
Known Issue¶
Converting the sample project does not work in this version. This will be fixed in the next release.
January 9th, 2025¶
Application & CLI Version 2.4.3¶
Desktop App¶
Added link to the troubleshooting guide in the crash report modal.
Included SMA Core Versions¶
Snowpark Conversion Core 4.15.0
Added¶
Added the following PySpark elements to the ConversionStatusPySpark.csv file as NotSupported:
pyspark.sql.streaming.readwriter.DataStreamReader.table
pyspark.sql.streaming.readwriter.DataStreamReader.schema
pyspark.sql.streaming.readwriter.DataStreamReader.options
pyspark.sql.streaming.readwriter.DataStreamReader.option
pyspark.sql.streaming.readwriter.DataStreamReader.load
pyspark.sql.streaming.readwriter.DataStreamReader.format
pyspark.sql.streaming.query.StreamingQuery.awaitTermination
pyspark.sql.streaming.readwriter.DataStreamWriter.partitionBy
pyspark.sql.streaming.readwriter.DataStreamWriter.toTable
pyspark.sql.streaming.readwriter.DataStreamWriter.trigger
pyspark.sql.streaming.readwriter.DataStreamWriter.queryName
pyspark.sql.streaming.readwriter.DataStreamWriter.outputMode
pyspark.sql.streaming.readwriter.DataStreamWriter.format
pyspark.sql.streaming.readwriter.DataStreamWriter.option
pyspark.sql.streaming.readwriter.DataStreamWriter.foreachBatch
pyspark.sql.streaming.readwriter.DataStreamWriter.start
Changed¶
Updated Hive SQL EWIs format.
SPRKHVSQL1001
SPRKHVSQL1002
SPRKHVSQL1003
SPRKHVSQL1004
SPRKHVSQL1005
SPRKHVSQL1006
Updated Spark SQL EWIs format.
SPRKSPSQL1001
SPRKSPSQL1002
SPRKSPSQL1003
SPRKSPSQL1004
SPRKSPSQL1005
SPRKSPSQL1006
Fixed¶
Fixed a bug that caused some PySpark elements not to be identified by the tool.
Fixed the mismatch between the number of ThirdParty identified calls and the number of ThirdParty import calls.
December 13th, 2024¶
Application & CLI Version 2.4.2¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.14.0
Added¶
Added the following Spark elements to ConversionStatusPySpark.csv:
pyspark.broadcast.Broadcast.value
pyspark.conf.SparkConf.getAll
pyspark.conf.SparkConf.setAll
pyspark.conf.SparkConf.setMaster
pyspark.context.SparkContext.addFile
pyspark.context.SparkContext.addPyFile
pyspark.context.SparkContext.binaryFiles
pyspark.context.SparkContext.setSystemProperty
pyspark.context.SparkContext.version
pyspark.files.SparkFiles
pyspark.files.SparkFiles.get
pyspark.rdd.RDD.count
pyspark.rdd.RDD.distinct
pyspark.rdd.RDD.reduceByKey
pyspark.rdd.RDD.saveAsTextFile
pyspark.rdd.RDD.take
pyspark.rdd.RDD.zipWithIndex
pyspark.sql.context.SQLContext.udf
pyspark.sql.types.StructType.simpleString
Changed¶
Updated the documentation of the Pandas EWIs PNDSPY1001, PNDSPY1002 and PNDSPY1003, and of SPRKSCL1137, to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the documentation of the Scala EWIs SPRKSCL1106 and SPRKSCL1107 to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Fixed¶
Fixed a bug that caused UserDefined symbols to appear in the third-party usages inventory.
December 4th, 2024¶
Application & CLI Version 2.4.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.13.1
Command Line Interface¶
Changed
Added a timestamp to the output folder.
Snowpark Conversion Core 4.13.1¶
Added¶
Added a ‘Source Language’ column to the Library Mappings Table.
Added Others as a new category in the Pandas API Summary table of the DetailedReport.docx.
Changed¶
Updated the documentation for Python EWI SPRKPY1058.
Updated the message for the pandas EWI PNDSPY1002 to show the related pandas element.
Updated the way the .csv reports are created; they are now overwritten after a second run.
Fixed¶
Fixed a bug that caused Notebook files not to be generated in the output.
Fixed the replacer for the get and set methods from pyspark.sql.conf.RuntimeConfig; the replacer now matches the correct full names.
Fixed the incorrect version in the query tag.
Fixed UserDefined packages being reported as ThirdPartyLib.
November 14th, 2024¶
Application & CLI Version 2.3.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.12.0
Desktop App¶
Fixed
Fixed case-sensitivity issues in the --sql options.
Removed
Removed the platform name from the show-ac message.
Snowpark Conversion Core 4.12.0¶
Added¶
Added support for Snowpark Python 1.23.0 and 1.24.0.
Added a new EWI for the pyspark.sql.dataframe.DataFrame.writeTo function. All the usages of this function will now have the EWI SPRKPY1087.
Changed¶
Updated the documentation of the Scala EWIs from SPRKSCL1137 to SPRKSCL1156 to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the documentation of the Scala EWIs from SPRKSCL1117 to SPRKSCL1136 to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the message that is shown for the following EWIs:
SPRKPY1082
SPRKPY1083
Updated the documentation of the Scala EWIs from SPRKSCL1100 to SPRKSCL1105, from SPRKSCL1108 to SPRKSCL1116, and from SPRKSCL1157 to SPRKSCL1175 to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the mapping status of the following PySpark elements from NotSupported to Direct with EWI:
pyspark.sql.readwriter.DataFrameWriter.option => snowflake.snowpark.DataFrameWriter.option: All the usages of this function now have the EWI SPRKPY1088.
pyspark.sql.readwriter.DataFrameWriter.options => snowflake.snowpark.DataFrameWriter.options: All the usages of this function now have the EWI SPRKPY1089.
Updated the mapping status of the following PySpark elements from Workaround to Rename:
pyspark.sql.readwriter.DataFrameWriter.partitionBy => snowflake.snowpark.DataFrameWriter.partition_by
Updated EWI documentation: SPRKSCL1000, SPRKSCL1001, SPRKSCL1002, SPRKSCL1100, SPRKSCL1101, SPRKSCL1102, SPRKSCL1103, SPRKSCL1104, SPRKSCL1105.
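To make the difference between the Direct and Rename statuses concrete, here is a minimal sketch that stores the three DataFrameWriter mappings from this release in a small lookup table. The table layout and the `describe` helper are hypothetical and only illustrate the idea; they do not reflect how the SMA stores or applies its mappings.

```python
# Mapping entries are taken from the release notes; the tuple layout
# (status, Snowpark target, EWI code) is an assumption made for this sketch.
MAPPINGS = {
    "pyspark.sql.readwriter.DataFrameWriter.option":
        ("Direct", "snowflake.snowpark.DataFrameWriter.option", "SPRKPY1088"),
    "pyspark.sql.readwriter.DataFrameWriter.options":
        ("Direct", "snowflake.snowpark.DataFrameWriter.options", "SPRKPY1089"),
    "pyspark.sql.readwriter.DataFrameWriter.partitionBy":
        ("Rename", "snowflake.snowpark.DataFrameWriter.partition_by", None),
}


def describe(element: str) -> str:
    """Summarize how the member name changes between PySpark and Snowpark."""
    status, target, ewi = MAPPINGS[element]
    source_name = element.rsplit(".", 1)[-1]
    target_name = target.rsplit(".", 1)[-1]
    note = f" (flagged with {ewi})" if ewi else ""
    return f"{source_name} -> {target_name} [{status}]{note}"


for element in MAPPINGS:
    print(describe(element))
```

With a Direct mapping the member keeps its name, while a Rename mapping changes only the name, as with partitionBy and partition_by.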
Removed¶
Removed the pyspark.sql.dataframe.DataFrameStatFunctions.writeTo element from the conversion status because this element does not exist.
Deprecated¶
Deprecated the following EWI codes:
SPRKPY1081
SPRKPY1084
October 30th, 2024¶
Application & CLI Version 2.3.0¶
Snowpark Conversion Core 4.11.0
Snowpark Conversion Core 4.11.0¶
Added¶
Added a new column called Url to the Issues.csv file, which redirects to the corresponding EWI documentation.
Added new EWIs for the following Spark elements:
[SPRKPY1082] pyspark.sql.readwriter.DataFrameReader.load
[SPRKPY1083] pyspark.sql.readwriter.DataFrameWriter.save
[SPRKPY1084] pyspark.sql.readwriter.DataFrameWriter.option
[SPRKPY1085] pyspark.ml.feature.VectorAssembler
[SPRKPY1086] pyspark.ml.linalg.VectorUDT
Added 38 new Pandas elements:
pandas.core.frame.DataFrame.select
pandas.core.frame.DataFrame.str
pandas.core.frame.DataFrame.str.replace
pandas.core.frame.DataFrame.str.upper
pandas.core.frame.DataFrame.to_list
pandas.core.frame.DataFrame.tolist
pandas.core.frame.DataFrame.unique
pandas.core.frame.DataFrame.values.tolist
pandas.core.frame.DataFrame.withColumn
pandas.core.groupby.generic._SeriesGroupByScalar
pandas.core.groupby.generic._SeriesGroupByScalar[S1].agg
pandas.core.groupby.generic._SeriesGroupByScalar[S1].aggregate
pandas.core.indexes.datetimes.DatetimeIndex.year
pandas.core.series.Series.columns
pandas.core.tools.datetimes.to_datetime.date
pandas.core.tools.datetimes.to_datetime.dt.strftime
pandas.core.tools.datetimes.to_datetime.strftime
pandas.io.parsers.readers.TextFileReader.apply
pandas.io.parsers.readers.TextFileReader.astype
pandas.io.parsers.readers.TextFileReader.columns
pandas.io.parsers.readers.TextFileReader.copy
pandas.io.parsers.readers.TextFileReader.drop
pandas.io.parsers.readers.TextFileReader.drop_duplicates
pandas.io.parsers.readers.TextFileReader.fillna
pandas.io.parsers.readers.TextFileReader.groupby
pandas.io.parsers.readers.TextFileReader.head
pandas.io.parsers.readers.TextFileReader.iloc
pandas.io.parsers.readers.TextFileReader.isin
pandas.io.parsers.readers.TextFileReader.iterrows
pandas.io.parsers.readers.TextFileReader.loc
pandas.io.parsers.readers.TextFileReader.merge
pandas.io.parsers.readers.TextFileReader.rename
pandas.io.parsers.readers.TextFileReader.shape
pandas.io.parsers.readers.TextFileReader.to_csv
pandas.io.parsers.readers.TextFileReader.to_excel
pandas.io.parsers.readers.TextFileReader.unique
pandas.io.parsers.readers.TextFileReader.values
pandas.tseries.offsets
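The Url column added to Issues.csv in this release can be used to jump from a reported issue to its EWI documentation. The sketch below reads a small in-memory sample and builds a code-to-link table. Only the Url column name comes from these notes; the Code and Description columns, the sample rows, and the example.com links are placeholders invented for the illustration.

```python
import csv
import io

# Hypothetical excerpt of Issues.csv. Only the "Url" column is documented in
# the release notes; the other columns and all values are placeholders.
SAMPLE = """Code,Description,Url
SPRKPY1082,Example description,https://example.com/ewi/SPRKPY1082
SPRKPY1083,Example description,https://example.com/ewi/SPRKPY1083
SPRKPY1082,Example description,https://example.com/ewi/SPRKPY1082
"""


def ewi_links(csv_file):
    """Map each EWI code to its documentation URL, keeping the first URL seen."""
    links = {}
    for row in csv.DictReader(csv_file):
        links.setdefault(row["Code"], row["Url"])
    return links


links = ewi_links(io.StringIO(SAMPLE))
for code, url in links.items():
    print(code, url)
```

To run this against a real report, replace `io.StringIO(SAMPLE)` with an open file handle and adjust the column names to match the actual header row.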
October 24th, 2024¶
Application Version 2.2.3¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.10.0
Desktop App¶
Fixed
Fixed a bug that caused the SMA to show the label SnowConvert instead of Snowpark Migration Accelerator in the menu bar of the Windows version.
Fixed a bug that caused the SMA to crash when it did not have read and write permissions to the .config directory in macOS and the AppData directory in Windows.
Command Line Interface¶
Changed
Renamed the CLI executable from snowct to sma.
Removed the source language argument so you no longer need to specify if you are running a Python or Scala assessment / conversion.
Expanded the command line arguments supported by the CLI by adding the following new arguments:
--enableJupyter | -j: Flag to indicate if the conversion of Databricks notebooks to Jupyter is enabled or not.
--sql | -f: Database engine syntax to be used when a SQL command is detected.
--customerEmail | -e: Configure the customer email.
--customerCompany | -c: Configure the customer company.
--projectName | -p: Configure the customer project.
Updated some texts to reflect the correct name of the application, ensuring consistency and clarity in all the messages.
Updated the terms of use of the application.
Updated and expanded the documentation of the CLI to reflect the latest features, enhancements and changes.
Updated the text that is shown before proceeding with the execution of the SMA to improve
Updated the CLI to accept “Yes” as a valid argument when prompting for user confirmation.
Allowed the CLI to continue the execution without waiting for user interaction by specifying the argument -y or --yes.
Updated the help information of the --sql argument to show the values that this argument expects.
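For scripted runs, the arguments introduced in this release could be assembled as in the sketch below. It only builds and prints the command line and does not execute anything. The flag names come from the list above, but the way each option takes its value, the `<sql-engine>` placeholder, and the sample customer data are assumptions. Arguments these notes do not describe, such as input and output locations, are left out.

```python
import shlex


def build_sma_command(sql_engine, email, company, project, assume_yes=True):
    """Assemble an argument list for the sma CLI from the documented flags."""
    argv = [
        "sma",
        "--sql", sql_engine,           # database engine syntax for detected SQL
        "--customerEmail", email,
        "--customerCompany", company,
        "--projectName", project,
    ]
    if assume_yes:
        argv.append("--yes")           # continue without waiting for confirmation
    return argv


# "<sql-engine>" is a placeholder; the accepted values are shown by the
# help text of the --sql argument.
command = build_sma_command("<sql-engine>", "dev@example.com", "Example Corp", "demo-project")
print(shlex.join(command))
```

Building the list first and quoting it with `shlex.join` keeps values that contain spaces, such as a company name, intact when the command is pasted into a shell.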
Snowpark Conversion Core Version 4.10.0¶
Added¶
Added a new EWI for the pyspark.sql.readwriter.DataFrameWriter.partitionBy function. All the usages of this function will now have the EWI SPRKPY1081.
Added a new column called Technology to the ImportUsagesInventory.csv file.
Changed¶
Updated the Third-Party Libraries readiness score to also take into account the Unknown libraries.
Updated the AssessmentFiles.zip file to include .json files instead of .pam files.
Improved the CSV to JSON conversion mechanism to make processing of inventories more performant.
Improved the documentation of the following EWIs:
SPRKPY1029
SPRKPY1054
SPRKPY1055
SPRKPY1063
SPRKPY1075
SPRKPY1076
Updated the mapping status of the following Spark Scala elements from Direct to Rename:
org.apache.spark.sql.functions.shiftLeft => com.snowflake.snowpark.functions.shiftleft
org.apache.spark.sql.functions.shiftRight => com.snowflake.snowpark.functions.shiftright
Updated the mapping status of the following Spark Scala elements from Not Supported to Direct:
org.apache.spark.sql.functions.shiftleft => com.snowflake.snowpark.functions.shiftleft
org.apache.spark.sql.functions.shiftright => com.snowflake.snowpark.functions.shiftright
Fixed¶
Fixed a bug that caused the SMA to incorrectly populate the Origin column of the ImportUsagesInventory.csv file.
Fixed a bug that caused the SMA to not classify imports of the libraries io, json, logging and unittest as Python built-in imports in the ImportUsagesInventory.csv file and in the DetailedReport.docx file.
October 11th, 2024¶
Application Version 2.2.2¶
Feature Updates include:
Snowpark Conversion Core 4.8.0
Snowpark Conversion Core Version 4.8.0¶
Added¶
Added EwiCatalog.csv and .md files to reorganize documentation.
Added the mapping status of pyspark.sql.functions.ln as Direct.
Added a transformation for pyspark.context.SparkContext.getOrCreate. Please check the EWI SPRKPY1080 for further details.
Added an improvement for the SymbolTable to infer the type of parameters in functions.
Added SymbolTable support for static methods, so the first parameter is no longer assumed to be self for them.
Added documentation for missing EWIs:
SPRKHVSQL1005
SPRKHVSQL1006
SPRKSPSQL1005
SPRKSPSQL1006
SPRKSCL1002
SPRKSCL1170
SPRKSCL1171
SPRKPY1057
SPRKPY1058
SPRKPY1059
SPRKPY1060
SPRKPY1061
SPRKPY1064
SPRKPY1065
SPRKPY1066
SPRKPY1067
SPRKPY1069
SPRKPY1070
SPRKPY1077
SPRKPY1078
SPRKPY1079
SPRKPY1101
Changed¶
Updated the mapping status of:
pyspark.sql.functions.array_remove from NotSupported to Direct.
Fixed¶
Fixed the Code File Sizing table in the Detail Report to exclude .sql and .hql files and added the Extra Large row in the table.
Fixed the missing update_query_tag when SparkSession is defined across multiple lines in Python.
Fixed the missing update_query_tag when SparkSession is defined across multiple lines in Scala.
Fixed the missing EWI SPRKHVSQL1001 in some SQL statements with parsing errors.
Fixed an issue so that new line values inside string literals are kept.
Fixed the total lines of code shown in the File Type Summary table.
Fixed the Parsing Score being shown as 0 when files are recognized successfully.
Fixed the LOC count in the cell inventory for Databricks Magic SQL cells.
September 26th, 2024¶
Application Version 2.2.0¶
Feature Updates include:
Snowpark Conversion Core 4.6.0
Snowpark Conversion Core Version 4.6.0¶
Added¶
Added a transformation for pyspark.sql.readwriter.DataFrameReader.parquet.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.option when it is a Parquet method.
Changed¶
Updated the mapping status of:
pyspark.sql.types.StructType.fields from NotSupported to Direct.
pyspark.sql.types.StructType.names from NotSupported to Direct.
pyspark.context.SparkContext.setLogLevel from Workaround to Transformation. More detail can be found in EWIs SPRKPY1078 and SPRKPY1079.
org.apache.spark.sql.functions.round from Workaround to Direct.
org.apache.spark.sql.functions.udf from NotDefined to Transformation. More detail can be found in EWIs SPRKSCL1174 and SPRKSCL1175.
Updated the mapping status of the following Spark elements from DirectHelper to Direct:
org.apache.spark.sql.functions.hex
org.apache.spark.sql.functions.unhex
org.apache.spark.sql.functions.shiftleft
org.apache.spark.sql.functions.shiftright
org.apache.spark.sql.functions.reverse
org.apache.spark.sql.functions.isnull
org.apache.spark.sql.functions.unix_timestamp
org.apache.spark.sql.functions.randn
org.apache.spark.sql.functions.signum
org.apache.spark.sql.functions.sign
org.apache.spark.sql.functions.collect_list
org.apache.spark.sql.functions.log10
org.apache.spark.sql.functions.log1p
org.apache.spark.sql.functions.base64
org.apache.spark.sql.functions.unbase64
org.apache.spark.sql.functions.regexp_extract
org.apache.spark.sql.functions.expr
org.apache.spark.sql.functions.date_format
org.apache.spark.sql.functions.desc
org.apache.spark.sql.functions.asc
org.apache.spark.sql.functions.size
org.apache.spark.sql.functions.locate
org.apache.spark.sql.functions.ntile
Fixed¶
Fixed the value shown in the Percentage of total Pandas API.
Fixed the Total percentage on the ImportCalls table in the DetailReport.
Deprecated¶
Deprecated the following EWI code:
SPRKSCL1115
September 12th, 2024¶
Application Version 2.1.7¶
Feature Updates include:
Snowpark Conversion Core 4.5.7
Snowpark Conversion Core 4.5.2
Snowpark Conversion Core Version 4.5.7¶
Hotfixed¶
Fixed the Total row being added to the Spark Usages Summaries when there are no usages.
Bumped the Python Assembly to Version=1.3.111.
Parsed trailing commas in multiline arguments.
Snowpark Conversion Core Version 4.5.2¶
Added¶
Added a transformation for pyspark.sql.readwriter.DataFrameReader.option:
When the chain is from a CSV method call.
When the chain is from a JSON method call.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.json.
Changed¶
Executed SMA on SQL strings passed to Python/Scala functions
Create AST in Scala/Python to emit temporary SQL unit
Create SqlEmbeddedUsages.csv inventory
Deprecate SqlStatementsInventory.csv and SqlExtractionInventory.csv
Integrate EWI when the SQL literal could not be processed
Create new task to process SQL-embedded code
Collect info for SqlEmbeddedUsages.csv inventory in Python
Replace SQL transformed code to Literal in Python
Update test cases after implementation
Create table, views for telemetry in SqlEmbeddedUsages inventory
Collect info for SqlEmbeddedUsages.csv report in Scala
Replace SQL transformed code to Literal in Scala
Check line number order for Embedded SQL reporting
Filled the SqlFunctionsInfo.csv with the SQL functions documented for SparkSQL and HiveSQL
Updated the mapping status for:
org.apache.spark.sql.SparkSession.sparkContext from NotSupported to Transformation.
org.apache.spark.sql.Builder.config from NotSupported to Transformation. With this new mapping status, the SMA will remove all the usages of this function from the source code.
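The embedded-SQL items in this release concern SQL strings passed to Python/Scala functions and the line order in which they are reported. A minimal sketch of that idea for Python is shown below, assuming the SQL arrives as a string literal in the first argument of a `.sql(...)` call. It uses only the standard `ast` module; the function `find_embedded_sql` and the sample source are hypothetical, and this is not how the SMA itself is implemented.

```python
import ast

# Hypothetical input: Python source with SQL passed as string literals.
SOURCE = '''
df = spark.sql("SELECT id FROM orders")
names = spark.sql(
    "SELECT name FROM customers"
)
print("not sql")
'''


def find_embedded_sql(source: str):
    """Return (line, column, sql_text) for string literals passed to a .sql(...) call."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "sql"
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            literal = node.args[0]
            found.append((literal.lineno, literal.col_offset, literal.value))
    # Sort so the usages come out in line-number order.
    return sorted(found)


usages = find_embedded_sql(SOURCE)
for line, column, sql_text in usages:
    print(line, column, sql_text)
```

Only literal arguments are detected here; SQL built by concatenation or formatting would need additional handling.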
September 5th, 2024¶
Application Version 2.1.6¶
Hotfix change for Snowpark Engines Core version 4.5.1
Spark Conversion Core Version 4.5.1¶
Hotfix
Added a mechanism to convert the temporary Databricks notebooks generated by the SMA into exported Databricks notebooks.
August 29th, 2024¶
Application Version 2.1.5¶
Feature Updates include:
Updated Spark Conversion Core: 4.3.2
Spark Conversion Core Version 4.3.2¶
Added¶
Added the mechanism (via decoration) to get the line and the column of the elements identified in notebook cells.
Added an EWI for pyspark.sql.functions.from_json.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.csv.
Enabled the query tag mechanism for Scala files.
Added the Code Analysis Score and additional links to the Detailed Report.
Added a column called OriginFilePath to InputFilesInventory.csv
Changed¶
Updated the mapping status of pyspark.sql.functions.from_json from Not Supported to Transformation.
Updated the mapping status of the following Spark elements from Workaround to Direct:
org.apache.spark.sql.functions.countDistinct
org.apache.spark.sql.functions.max
org.apache.spark.sql.functions.min
org.apache.spark.sql.functions.mean
Deprecated¶
Deprecated the following EWI codes:
SPRKSCL1135
SPRKSCL1136
SPRKSCL1153
SPRKSCL1155
Fixed¶
Fixed a bug that caused an incorrect calculation of the Spark API score.
Fixed an error that prevented empty or commented-out SQL files from being copied to the output folder.
Fixed a bug in the DetailedReport where the notebook stats for LOC and cell count were not accurate.
August 14th, 2024¶
Application Version 2.1.2¶
Feature Updates include:
Updated Spark Conversion Core: 4.2.0
Spark Conversion Core Version 4.2.0¶
Added add¶
Added a technology column to SparkUsagesInventory.
Added an EWI for undefined SQL elements.
Added SqlFunctions Inventory.
Collect info for SqlFunctions Inventory.
Changed changed.3¶
The engine now processes and prints partially parsed Python files instead of leaving the original file without modifications.
Python notebook cells that have parsing errors will also be processed and printed.
Fixed fixed¶
Fixed a bug where pandas.core.indexes.datetimes.DatetimeIndex.strftime was being reported wrongly.
Fixed a mismatch between the SQL readiness score and SQL Usages by Support Status.
Fixed a bug that caused the SMA to report pandas.core.series.Series.empty with an incorrect mapping status.
Fixed a mismatch between Spark API Usages Ready for Conversion in DetailedReport.docx and the UsagesReadyForConversion row in Assessment.json.
August 8th, 2024¶
Application Version 2.1.1¶
Feature Updates include:
Updated Spark Conversion Core: 4.1.0
Spark Conversion Core Version 4.1.0¶
Added add.1¶
Added the following information to the AssessmentReport.json file:
The third-party libraries readiness score.
The number of third-party library calls that were identified.
The number of third-party library calls that are supported in Snowpark.
The color code associated with the third-party readiness score, the Spark API readiness score, and the SQL readiness score.
Transformed SqlSimpleDataType in Spark create tables.
Added the mapping of pyspark.sql.functions.get as direct.
Added the mapping of pyspark.sql.functions.to_varchar as direct.
As part of the changes after unification, the tool now generates an execution info file in the Engine.
Added a replacer for pyspark.sql.SparkSession.builder.appName.
Changed changed.4¶
Updated the mapping status for the following Spark elements
From Not Supported to Direct mapping:
pyspark.sql.functions.sign
pyspark.sql.functions.signum
Changed the Notebook Cells Inventory report to indicate the kind of content for every cell in the column Element
Added a SCALA_READINESS_SCORE column that reports the readiness score as related only to references to the Spark API in Scala files.
Partial support to transform table properties in ALTER TABLE and ALTER VIEW
Updated the conversion status of the node SqlSimpleDataType from Pending to Transformation in Spark create tables
Updated the version of the Snowpark Scala API supported by the SMA from 1.7.0 to 1.12.1:
Updated the mapping status of:
org.apache.spark.sql.SparkSession.getOrCreate from Rename to Direct
org.apache.spark.sql.functions.sum from Workaround to Direct
Updated the version of the Snowpark Python API supported by the SMA from 1.15.0 to 1.20.0:
Updated the mapping status of:
pyspark.sql.functions.arrays_zip from Not Supported to Direct
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.frame.DataFrame.any
pandas.core.frame.DataFrame.applymap
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.core.frame.DataFrame.groupby
pandas.core.frame.DataFrame.index
pandas.core.frame.DataFrame.T
pandas.core.frame.DataFrame.to_dict
From Not Supported to Rename mapping:
pandas.core.frame.DataFrame.map
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.frame.DataFrame.where
pandas.core.groupby.generic.SeriesGroupBy.agg
pandas.core.groupby.generic.SeriesGroupBy.aggregate
pandas.core.groupby.generic.DataFrameGroupBy.agg
pandas.core.groupby.generic.DataFrameGroupBy.aggregate
pandas.core.groupby.generic.DataFrameGroupBy.apply
Not Supported mappings:
pandas.core.frame.DataFrame.to_parquet
pandas.core.generic.NDFrame.to_csv
pandas.core.generic.NDFrame.to_excel
pandas.core.generic.NDFrame.to_sql
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.series.Series.empty
pandas.core.series.Series.apply
pandas.core.reshape.tile.qcut
Direct mappings with EWI:
pandas.core.series.Series.fillna
pandas.core.series.Series.astype
pandas.core.reshape.melt.melt
pandas.core.reshape.tile.cut
pandas.core.reshape.pivot.pivot_table
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.series.Series.dt
pandas.core.series.Series.groupby
pandas.core.series.Series.loc
pandas.core.series.Series.shape
pandas.core.tools.datetimes.to_datetime
pandas.io.excel._base.ExcelFile
Not Supported mappings:
pandas.core.series.Series.dt.strftime
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.io.parquet.read_parquet
pandas.io.parsers.readers.read_csv
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.io.pickle.read_pickle
pandas.io.sql.read_sql
pandas.io.sql.read_sql_query
Updated the description of Understanding the SQL Readiness Score.
Updated PyProgramCollector to collect the packages and populate the current packages inventory with data from Python source code.
Updated the mapping status of pyspark.sql.SparkSession.builder.appName from Rename to Transformation.
Removed the following Scala integration tests:
AssesmentReportTest_AssessmentMode.ValidateReports_AssessmentMode
AssessmentReportTest_PythonAndScala_Files.ValidateReports_PythonAndScala
AssessmentReportTestWithoutSparkUsages.ValidateReports_WithoutSparkUsages
Updated the mapping status of pandas.core.generic.NDFrame.shape from Not Supported to Direct.
Updated the mapping status of pandas.core.series from Not Supported to Direct.
Deprecated deprecated¶
Deprecated the EWI code SPRKSCL1160 since org.apache.spark.sql.functions.sum is now a direct mapping.
Fixed fixed.1¶
Fixed a bug caused by not supporting Custom Magics without arguments in Jupyter Notebook cells.
Fixed incorrect generation of EWIs in the issues.csv report when parsing errors occur.
Fixed a bug that caused the SMA not to process the Databricks exported notebook as Databricks notebooks.
Fixed a stack overflow error while processing clashing type names of declarations created inside package objects.
Fixed the processing of complex lambda type names involving generics, e.g.,
def func[X,Y](f: (Map[Option[X], Y] => Map[Y, X]))...
Fixed a bug that caused the SMA to add a PySpark EWI code instead of a Pandas EWI code to the Pandas elements that are not yet recognized.
Fixed a typo in the detailed report template: renaming a column from “Percentage of all Python Files” to “Percentage of all files”.
Fixed a bug where pandas.core.series.Series.shape was wrongly reported.
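To make the Custom Magics fix in this list more concrete, the sketch below shows one way a notebook magic line could be recognized whether or not it carries arguments. It is an assumption-laden illustration, not the SMA's implementation: the pattern `MAGIC_LINE`, the function `parse_magic`, and the magic name `my_magic` are all hypothetical.

```python
import re

# Hypothetical pattern: "%" or "%%", a magic name, then optional arguments.
MAGIC_LINE = re.compile(r"^(%%?)(\w+)(?:\s+(.*))?$")


def parse_magic(line: str):
    """Split a magic line into kind, name, and arguments; return None if it is not a magic."""
    match = MAGIC_LINE.match(line.strip())
    if match is None:
        return None
    prefix, name, arguments = match.groups()
    return {
        "kind": "cell" if prefix == "%%" else "line",
        "name": name,
        # A magic without arguments yields an empty string, not a failure.
        "arguments": arguments or "",
    }


print(parse_magic("%%my_magic"))
print(parse_magic("%%sql --limit 5"))
print(parse_magic("x = 1"))
```

Making the argument group optional is the point of the sketch: a magic with no arguments is still parsed as a magic rather than rejected.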