pd.Series supported APIs

The tables below are structured as follows: the first column contains the attribute or method name, and the second column flags whether Snowpark implements it. The remaining columns list missing parameters (for methods) and notes on the current implementation.

Note

Y stands for yes (a distributed implementation is supported); N stands for no (the API raises an error); P stands for partial (some parameters may not be supported yet); and D means the call defaults to single-node pandas execution via UDF/Sproc.
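The legend above can be written down as a small lookup, which is handy when scanning the tables programmatically. This is only a sketch: `SupportFlag` and `can_be_called` are hypothetical names chosen for illustration and are not part of Snowpark pandas.

```python
from enum import Enum


class SupportFlag(Enum):
    """Hypothetical enum mirroring the Y/N/P/D legend of this page."""

    Y = "distributed implementation"
    N = "not implemented; the API raises an error"
    P = "partial; some parameters may not be supported yet"
    D = "defaults to single-node pandas execution via UDF/Sproc"


def can_be_called(flag: str) -> bool:
    """Return True for every flag except N, which always errors out."""
    return SupportFlag[flag] is not SupportFlag.N
```

A flag of P still needs a look at the "Missing parameters" and "Notes" columns, because the call can fail for specific argument combinations.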

Attributes

Series attribute

Snowpark implemented? (Y/N/P/D)

Notes for current implementation

T

Y

array

N

at

P

N for set with MultiIndex

attrs

Y

axes

Y

dtype

Y

dtypes

Y

empty

Y

flags

P

flags can only be read, and not set.

hasnans

Y

iat

Y

iloc

Y

index

Y

is_monotonic_decreasing

Y

is_monotonic_increasing

Y

is_unique

Y

loc

P

N for set with MultiIndex

name

Y

nbytes

N

ndim

Y

shape

Y

size

Y

values

Y

Methods

Series method

Snowpark implemented? (Y/N/P/D)

Missing parameters

Notes for current implementation

abs

Y

add

P

level

add_prefix

Y

add_suffix

Y

agg

P

Y when function is one of count, mean, min, max, sum, median, size; std and var supported with ddof=0 or ddof=1; quantile is supported when q is the default value or a scalar.

aggregate

P

See agg

align

P

copy, level, fill_value

N for MultiIndex; for the deprecated parameters method, limit, fill_axis, and broadcast_axis; if axis == 1 or None; or if fill_value is not the default np.nan

all

P

N for non-integer/boolean types

any

P

N for non-integer/boolean types

apply

P

convert_dtype is ignored

N if func is not callable.

argmax

P

N if the Series has a MultiIndex index.

argmin

P

N if the Series has a MultiIndex index.

argsort

N

asfreq

P

how, normalize, fill_value

Only DatetimeIndex is supported and its freq will be lost. Only rule frequencies ‘s’, ‘min’, ‘h’, and ‘D’ are supported.

asof

N

astype

P

N if from string to datetime/timedelta or errors == "ignore"

at_time

N

autocorr

N

axes

Y

backfill

P

N if param downcast is set.

between

N

between_time

N

bfill

P

N if param downcast is set.

bool

N

case_when

P

N if condition or replacement is a callable.

clip

N

combine

N

combine_first

N

compare

P

align_axis, keep_shape, keep_equal, result_names

convert_dtypes

N

copy

Y

corr

N

count

Y

cov

N

cummax

Y

cummin

Y

cumprod

N

cumsum

P

Y if values are numeric, otherwise fails.

describe

Y

diff

Y

div

P

level

See truediv

divide

P

level

See truediv

divmod

N

dot

N

drop

Y

drop_duplicates

Y

droplevel

N

dropna

P

duplicated

Y

eq

P

level

equals

Y

ewm

N

expanding

P

method is ignored

N if axis = 1

explode

N

factorize

N

ffill

P

N if parameter downcast is set. limit parameter only supported if method parameter is used.

fillna

P

See ffill

filter

N

first

Y

first_valid_index

Y

floordiv

P

level

Raises division by zero exception when the right hand side contains at least one zero. pandas allows division by zero for non-object type Series and returns +/-inf.

ge

P

level

get

Y

groupby

P

observed is ignored since Categoricals are not implemented yet

Y when axis == 0 and by is a column label or a Series from the current DataFrame; otherwise N. Supported functions are agg, count, cumcount, cummax, cummin, cumsum, first, last, max, mean, median, min, quantile, shift, size, std, sum, and var; all other functions are N.

gt

P

level

head

Y

hist

N


idxmax

Y


idxmin

Y


infer_objects

N

info

D

Different Index types are used in pandas but not in Snowpark pandas

interpolate

N

isin

Y

Snowpark pandas deviates with respect to handling NA values

isna

Y

isnull

Y

item

N

items

Y

keys

Y

kurt

N

kurtosis

N

last

Y

last_valid_index

Y

le

P

level

lt

P

level

map

P

na_action

N if func is not callable

mask

P

N if given axis or level parameters, N if cond or other is Callable

max

Y

mean

Y

median

Y

memory_usage

N

min

Y

mod

P

level

mode

N

mul

P

level

multiply

P

level

ne

P

level

nlargest

P

N if keep == "all"

notna

Y

notnull

Y

nsmallest

P

N if keep == "all"

nunique

Y

pad

P

See ffill

pct_change

P

limit, freq

pipe

N

plot

D

Performed locally on the client

pop

N

pow

P

level

prod

N

product

N

quantile

P

Y if values are numeric, and interpolation is "linear" or "nearest"; N if q is a DataFrame or Series

radd

P

level

rank

P

N if axis == 1

ravel

N

rdiv

P

level

See truediv

rdivmod

N

reindex

P

N if the series has MultiIndex, or method is nearest.

reindex_like

N

rename

P

copy is ignored

N if mapper is callable or the series has MultiIndex

rename_axis

Y

reorder_levels

N

repeat

N

replace

P

method, limit

resample

P

axis, label, convention, kind, level, origin, offset, group_keys

Only DatetimeIndex is supported and its freq will be lost. Rule frequencies ‘s’, ‘min’, ‘h’, and ‘D’ are supported. Rule frequencies ‘W’, ‘ME’, and ‘YE’ are supported with closed = “left”.

reset_index

Y

rfloordiv

P

level

See floordiv

rmod

P

level

rmul

P

level

rolling

P

method is ignored, step, win_type, closed, on

N for non-integer window, axis = 1, or min_periods = 0

round

Y

rpow

P

level

rsub

P

level

rtruediv

P

level

See truediv

sample

P

N if weights or random_state is specified when axis = 0

searchsorted

N

sem

N

set_axis

Y

copy is ignored

shift

P

freq

No support for freq != None

skew

P

N if axis == 1 or skipna == False or numeric_only=False

sort_index

P

key

N if given the key param, or MultiIndex

sort_values

P

key, kind is ignored

The kind parameter has no effect. Snowpark pandas always uses a stable sort algorithm, while pandas by default does not.

squeeze

Y

std

P

N if ddof is not 0 or 1

sub

P

level

subtract

P

level

sum

Y

swapaxes

N

swaplevel

N

tail

Y

take

Y

to_clipboard

N

to_csv

P

Supports writing to both local files and Snowflake stages. A filepath starting with @ is treated as a Snowflake stage location. Writing to a local file supports all parameters. Writing to a Snowflake stage does not support the float_format, mode, encoding, quoting, quotechar, lineterminator, doublequote, and decimal parameters.

to_dict

Y

to_frame

Y

to_hdf

N

to_json

N

to_latex

N

to_list

Y

to_markdown

N

to_numpy

Y

copy is ignored

to_period

N

to_pickle

N

to_sql

N

to_string

N

to_timestamp

N

to_xarray

N

tolist

Y

transform

N

transpose

Y

truediv

P

level

Raises division by zero exception when the right-hand side contains at least one zero. pandas allows division by zero for non-object type Series and returns +/-inf.

truncate

N

tz_convert

P

axis, level, copy

N if timezone format is not supported. Only timezones listed in pytz.all_timezones are supported. For example, UTC is supported but UTC+/-<offset>, such as UTC+09:00, is not supported.

tz_localize

P

axis, level, copy, ambiguous, nonexistent

N if timezone format is not supported. Only timezones listed in pytz.all_timezones are supported. For example, UTC is supported but UTC+/-<offset>, such as UTC+09:00, is not supported.

unique

Y

unstack

P

sort

N for non-integer level.

update

Y

value_counts

P

bins

var

P

See std

view

N

where

P

See mask

xs

N
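Several notes in the Methods table describe conditions that can be checked before a call is issued. The sketch below encodes three of them (the to_csv, resample, and tz_convert/tz_localize rows) as plain Python helpers. All function and constant names are hypothetical, the stage name in the usage comment is made up, and the checks only restate what the table says: they do not query Snowpark pandas, and the time zone helper only detects the UTC+/-<offset> form rather than confirming membership in pytz.all_timezones.

```python
import re
from typing import List, Optional

# Parameters the to_csv row lists as unsupported when writing to a stage.
STAGE_UNSUPPORTED_TO_CSV = {
    "float_format", "mode", "encoding", "quoting", "quotechar",
    "lineterminator", "doublequote", "decimal",
}

# Rule frequencies named in the resample row.
RESAMPLE_RULES_ANY_CLOSED = {"s", "min", "h", "D"}
RESAMPLE_RULES_LEFT_CLOSED = {"W", "ME", "YE"}


def unsupported_to_csv_params(path: str, **kwargs) -> List[str]:
    """Return the passed keyword names that the table marks as unsupported."""
    if not path.startswith("@"):
        return []  # local file: the table says all parameters are supported
    return sorted(STAGE_UNSUPPORTED_TO_CSV.intersection(kwargs))


def resample_rule_listed(rule: str, closed: Optional[str] = None) -> bool:
    """True if the exact rule string appears in the resample row.

    Assumption: only bare rule strings are matched; multiples such as
    "2D" are not covered by the table text and are reported as unlisted.
    """
    if rule in RESAMPLE_RULES_ANY_CLOSED:
        return True
    # 'W', 'ME', and 'YE' are listed only together with closed="left".
    return rule in RESAMPLE_RULES_LEFT_CLOSED and closed == "left"


def is_utc_offset_name(tz: str) -> bool:
    """True for names shaped like UTC+09:00, which the table marks unsupported."""
    return re.fullmatch(r"UTC[+-]\d{2}:\d{2}", tz) is not None


# Example usage with a hypothetical stage path:
# unsupported_to_csv_params("@my_stage/out.csv", encoding="utf-8", index=False)
```

Keeping the checks as data (sets of names) rather than hard-coded branches makes them easy to update if the table changes in a later release.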