DataFrame

All supported DataFrame APIs

Constructor

DataFrame([data, index, columns, dtype, ...])

Snowpark pandas representation of pandas.DataFrame with a lazily-evaluated relational dataset.

Attributes

DataFrame.index

Get the index for this Series/DataFrame.

DataFrame.columns

Get the columns for this Snowpark pandas DataFrame.

DataFrame.dtypes

Return the dtypes in the DataFrame.

DataFrame.info([verbose, buf, max_cols, ...])

Print a concise summary of the DataFrame.

DataFrame.select_dtypes([include, exclude])

Return a subset of the DataFrame's columns based on the column dtypes.

DataFrame.values

Return a NumPy representation of the dataset.

DataFrame.axes

Return a list representing the axes of the DataFrame.

DataFrame.ndim

Return the number of dimensions of the underlying data, by definition 2.

DataFrame.size

Return an int representing the number of elements in this object.

DataFrame.shape

Return a tuple representing the dimensionality of the DataFrame.

DataFrame.empty

Indicator whether the DataFrame is empty.
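
The shape-related attributes above can be illustrated with a small sketch. It uses plain pandas as a stand-in, on the assumption that Snowpark pandas returns the same values; the frame and its column names are invented for the example.

```python
import pandas as pd

# Hypothetical 3-row, 2-column frame used only for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

shape = df.shape             # (rows, columns)
size = df.size               # rows * columns
ndim = df.ndim               # always 2 for a DataFrame
is_empty = df.empty          # False, because the frame has elements
column_labels = list(df.columns)

print(shape, size, ndim, is_empty, column_labels)
```

Note that `size` counts every cell, so it equals the product of the two entries in `shape`.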

Snowflake Specific

DataFrame.to_pandas(*[, statement_params])

Convert the Snowpark pandas DataFrame to a pandas.DataFrame.

DataFrame.to_snowflake(name[, if_exists, ...])

Save the Snowpark pandas DataFrame as a Snowflake table.

DataFrame.to_snowpark([index, index_label])

Convert the Snowpark pandas DataFrame to a Snowpark DataFrame.

DataFrame.cache_result([inplace])

Persists the current Snowpark pandas DataFrame to a temporary table to improve the latency of subsequent operations.

Conversion

DataFrame.astype(dtype[, copy, errors])

Cast a pandas object to the specified dtype.

DataFrame.convert_dtypes([infer_objects, ...])

Convert columns to best possible dtypes using dtypes supporting pd.NA.

DataFrame.copy([deep])

Make a copy of this object's indices and data.

Indexing, iteration

DataFrame.assign(**kwargs)

Assign new columns to a DataFrame.

DataFrame.head([n])

Return the first n rows.

DataFrame.loc

Access a group of rows and columns by label(s) or a boolean array.

DataFrame.iloc

Purely integer-location based indexing for selection by position.

DataFrame.insert(loc, column, value[, ...])

Insert column into DataFrame at specified location.

DataFrame.__iter__()

Iterate over info axis.

DataFrame.keys()

Get columns of the DataFrame.

DataFrame.iterrows()

Iterate over DataFrame rows as (index, Series) pairs.

DataFrame.items()

Iterate over (column name, Series) pairs.

DataFrame.itertuples([index, name])

Iterate over DataFrame rows as namedtuples.

DataFrame.tail([n])

Return the last n rows.

DataFrame.isin(values)

Whether each element in the DataFrame is contained in values.

DataFrame.where(cond[, other, inplace, ...])

Replace values where the condition is False.

DataFrame.mask(cond[, other, inplace, axis, ...])

Replace values where the condition is True.
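
Because where and mask are mirror images of each other, a short side-by-side sketch may help. It uses plain pandas as a stand-in, assuming Snowpark pandas follows the same semantics; the data is made up.

```python
import pandas as pd

# Hypothetical frame with a mix of positive and negative values.
df = pd.DataFrame({"a": [1, -2, 3], "b": [-4, 5, -6]})

# where keeps entries for which the condition is True and replaces the rest.
kept_positive = df.where(df > 0, 0)

# mask does the opposite: it replaces entries for which the condition is True.
masked_positive = df.mask(df > 0, 0)

print(kept_positive)
print(masked_positive)
```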

Binary operator functions

DataFrame.add(other[, axis, level, fill_value])

Get addition of DataFrame and other, element-wise (binary operator add).

DataFrame.sub(other[, axis, level, fill_value])

Get subtraction of DataFrame and other, element-wise (binary operator sub).

DataFrame.mul(other[, axis, level, fill_value])

Get multiplication of DataFrame and other, element-wise (binary operator mul).

DataFrame.div(other[, axis, level, fill_value])

Get floating division of DataFrame and other, element-wise (binary operator truediv).

DataFrame.truediv(other[, axis, level, ...])

Get floating division of DataFrame and other, element-wise (binary operator truediv).

DataFrame.floordiv(other[, axis, level, ...])

Get integer division of DataFrame and other, element-wise (binary operator floordiv).

DataFrame.mod(other[, axis, level, fill_value])

Get modulo of DataFrame and other, element-wise (binary operator mod).

DataFrame.pow(other[, axis, level, fill_value])

Get exponential power of DataFrame and other, element-wise (binary operator pow).

DataFrame.radd(other[, axis, level, fill_value])

Get addition of DataFrame and other, element-wise (binary operator radd).

DataFrame.rsub(other[, axis, level, fill_value])

Get subtraction of DataFrame and other, element-wise (binary operator rsub).

DataFrame.rmul(other[, axis, level, fill_value])

Get multiplication of DataFrame and other, element-wise (binary operator rmul).

DataFrame.rdiv(other[, axis, level, fill_value])

Get floating division of DataFrame and other, element-wise (binary operator rtruediv).

DataFrame.rtruediv(other[, axis, level, ...])

Get floating division of DataFrame and other, element-wise (binary operator rtruediv).

DataFrame.rfloordiv(other[, axis, level, ...])

Get integer division of DataFrame and other, element-wise (binary operator rfloordiv).

DataFrame.rmod(other[, axis, level, fill_value])

Get modulo of DataFrame and other, element-wise (binary operator rmod).

DataFrame.rpow(other[, axis, level, fill_value])

Get exponential power of DataFrame and other, element-wise (binary operator rpow).

DataFrame.lt(other[, axis, level])

Get less than comparison of DataFrame and other, element-wise (binary operator lt).

DataFrame.gt(other[, axis, level])

Get greater than comparison of DataFrame and other, element-wise (binary operator gt).

DataFrame.le(other[, axis, level])

Get less than or equal comparison of DataFrame and other, element-wise (binary operator le).

DataFrame.ge(other[, axis, level])

Get greater than or equal comparison of DataFrame and other, element-wise (binary operator ge).

DataFrame.ne(other[, axis, level])

Get not equal comparison of DataFrame and other, element-wise (binary operator ne).

DataFrame.eq(other[, axis, level])

Perform equality comparison of DataFrame and other (binary operator eq).
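
The difference between a forward operator, its reflected counterpart, and the fill_value argument is easiest to see on a tiny example. The sketch below uses plain pandas as a stand-in, assuming Snowpark pandas matches this behavior; all frames are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0, 4.0]})

forward = df.sub(10)     # computes df - 10
reflected = df.rsub(10)  # computes 10 - df (operands swapped)

# fill_value substitutes for a missing operand on one side before the operation.
left = pd.DataFrame({"x": [1.0, None]})
right = pd.DataFrame({"x": [10.0, 20.0]})
plain_sum = left + right                     # the missing value propagates
filled_sum = left.add(right, fill_value=0)   # the missing value is treated as 0

# Strict and non-strict comparisons differ only at equality.
strictly_less = df.lt(2)
at_most = df.le(2)

print(forward, reflected, plain_sum, filled_sum, strictly_less, at_most, sep="\n")
```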

Function application, GroupBy & window

DataFrame.apply(func[, axis, raw, ...])

Apply a function along an axis of the DataFrame.

DataFrame.applymap(func[, na_action])

Apply a function to a DataFrame elementwise.

DataFrame.agg([func, axis])

Aggregate using one or more operations over the specified axis.

DataFrame.aggregate([func, axis])

Aggregate using one or more operations over the specified axis.

DataFrame.transform(func[, axis])

Call func on self producing a Snowpark pandas DataFrame with the same axis shape as self.

DataFrame.groupby([by, axis, level, ...])

Group DataFrame using a mapper or by a Series of columns.

DataFrame.rolling(window[, min_periods, ...])

Provide rolling window calculations.
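
A minimal sketch of how aggregation, grouping, and rolling windows relate, using plain pandas as a stand-in. It assumes Snowpark pandas produces the same results for these calls, which may not hold for every argument combination; the column names g and v are invented.

```python
import pandas as pd

df = pd.DataFrame({"g": ["a", "a", "b"], "v": [1, 2, 3]})

# agg with a list of function names returns one row per function.
col_stats = df[["v"]].agg(["min", "max"])

# groupby splits rows by key, then aggregates each group separately.
totals = df.groupby("g").sum()

# rolling aggregates over a sliding window; the first window is incomplete,
# so its result is missing.
rolling_sum = df[["v"]].rolling(window=2).sum()

print(col_stats, totals, rolling_sum, sep="\n")
```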

Computations / descriptive stats

DataFrame.abs()

Return a BasePandasDataset with absolute numeric value of each element.

DataFrame.all([axis, bool_only, skipna])

Return whether all elements are True, potentially over an axis.

DataFrame.any(*[, axis, bool_only, skipna])

Return whether any element is True, potentially over an axis.

DataFrame.count([axis, numeric_only])

Count non-NA cells for each column or row.

DataFrame.cummax([axis, skipna])

Return cumulative maximum over a BasePandasDataset axis.

DataFrame.cummin([axis, skipna])

Return cumulative minimum over a BasePandasDataset axis.

DataFrame.cumsum([axis, skipna])

Return cumulative sum over a BasePandasDataset axis.

DataFrame.describe([percentiles, include, ...])

Generate descriptive statistics for columns in the dataset.

DataFrame.diff([periods, axis])

First discrete difference of element.

DataFrame.max([axis, skipna, numeric_only])

Return the maximum of the values over the requested axis.

DataFrame.mean([axis, skipna, numeric_only])

Return the mean of the values over the requested axis.

DataFrame.median([axis, skipna, numeric_only])

Return the median of the values over the requested axis.

DataFrame.min([axis, skipna, numeric_only])

Return the minimum of the values over the requested axis.

DataFrame.pct_change([periods, fill_method, ...])

Fractional change between the current and a prior element.

DataFrame.quantile([q, axis, numeric_only, ...])

Return values at the given quantile over requested axis.

DataFrame.rank([axis, method, numeric_only, ...])

Compute numerical data ranks (1 through n) along axis.

DataFrame.round([decimals])

Round a BasePandasDataset to a variable number of decimal places.

DataFrame.skew([axis, skipna, numeric_only])

Return unbiased skew over the requested axis, normalized by N-1.

DataFrame.sum([axis, skipna, numeric_only, ...])

Return the sum of the values over the requested axis.

DataFrame.std([axis, skipna, ddof, numeric_only])

Return sample standard deviation over requested axis.

DataFrame.var([axis, skipna, ddof, numeric_only])

Return unbiased variance over requested axis.

DataFrame.nunique([axis, dropna])

Count number of distinct elements in specified axis.

DataFrame.value_counts([subset, normalize, ...])

Return a Series containing the frequency of each distinct row in the DataFrame.
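
The words "sample" and "unbiased" in the std and var entries refer to the ddof (delta degrees of freedom) argument, which defaults to 1 in pandas. The sketch below uses plain pandas as a stand-in, assuming Snowpark pandas uses the same default; the data is chosen so the population values come out round.

```python
import pandas as pd

# Hypothetical column: mean is 5 and the squared deviations sum to 32.
df = pd.DataFrame({"v": [2, 4, 4, 4, 5, 5, 7, 9]})

sample_var = df.var()["v"]             # divides by n - 1 (ddof=1, the default)
population_var = df.var(ddof=0)["v"]   # divides by n
population_std = df.std(ddof=0)["v"]   # square root of the population variance

print(sample_var, population_var, population_std)
```

With eight observations, the sample variance is 32/7 while the population variance is 32/8 = 4.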

Reindexing / selection / label manipulation

DataFrame.add_prefix(prefix[, axis])

Prefix labels with string prefix.

DataFrame.add_suffix(suffix[, axis])

Suffix labels with string suffix.

DataFrame.drop([labels, axis, index, ...])

Drop specified labels from rows or columns.

DataFrame.drop_duplicates([subset, keep, ...])

Return DataFrame with duplicate rows removed.

DataFrame.duplicated([subset, keep])

Return boolean Series denoting duplicate rows.

DataFrame.equals(other)

Test whether two dataframes contain the same elements.

DataFrame.first(offset)

Select initial periods of time series data based on a date offset.

DataFrame.get(key[, default])

Get item from object for given key (ex: DataFrame column).

DataFrame.head([n])

Return the first n rows.

DataFrame.idxmax([axis, skipna, numeric_only])

Return index of first occurrence of maximum over requested axis.

DataFrame.idxmin([axis, skipna, numeric_only])

Return index of first occurrence of minimum over requested axis.

DataFrame.last(offset)

Select final periods of time series data based on a date offset.

DataFrame.rename([mapper, index, columns, ...])

Rename columns or index labels.

DataFrame.rename_axis([mapper, index, ...])

Set the name of the axis for the index or columns.

DataFrame.reset_index([level, drop, ...])

Reset the index, or a level of it.

DataFrame.sample([n, frac, replace, ...])

Return a random sample of items from an axis of object.

DataFrame.set_axis(labels, *[, axis, copy])

Assign desired index to given axis.

DataFrame.set_index(keys[, drop, append, ...])

Set the DataFrame index using existing columns.

DataFrame.tail([n])

Return the last n rows.

DataFrame.take(indices[, axis])

Return the elements in the given positional indices along an axis.

Missing data handling

DataFrame.backfill(*[, axis, inplace, ...])

Synonym for DataFrame.fillna() with method='bfill'.

DataFrame.bfill(*[, axis, inplace, limit, ...])

Synonym for DataFrame.fillna() with method='bfill'.

DataFrame.dropna(*[, axis, how, thresh, ...])

Remove missing values.

DataFrame.ffill(*[, axis, inplace, limit, ...])

Synonym for DataFrame.fillna() with method='ffill'.

DataFrame.fillna([value, method, axis, ...])

Fill NA/NaN values using the specified method.

DataFrame.isna()

Detect missing values.

DataFrame.isnull()

DataFrame.isnull is an alias for DataFrame.isna.

DataFrame.notna()

Detect non-missing values for an array-like object.

DataFrame.notnull()

Detect non-missing values for an array-like object.

DataFrame.pad(*[, axis, inplace, limit, ...])

Synonym for DataFrame.fillna() with method='ffill'.

DataFrame.replace([to_replace, value, ...])

Replace values given in to_replace with value.
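
The fill direction of ffill and bfill is easy to mix up, so here is a small sketch. It uses plain pandas as a stand-in, assuming Snowpark pandas fills in the same way; the column name t is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"t": [1.0, np.nan, np.nan, 4.0]})

forward = df.ffill()        # carry the last valid value forward
backward = df.bfill()       # pull the next valid value backward
constant = df.fillna(0.0)   # replace every missing value with a constant
missing_count = int(df.isna().sum()["t"])  # number of missing cells in column t

print(forward, backward, constant, missing_count, sep="\n")
```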

Reshaping, sorting, transposing

DataFrame.melt([id_vars, value_vars, ...])

Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.

DataFrame.nlargest(n, columns[, keep])

Return the first n rows ordered by columns in descending order.

DataFrame.nsmallest(n, columns[, keep])

Return the first n rows ordered by columns in ascending order.

DataFrame.pivot(*, columns[, index, values])

Return reshaped DataFrame organized by given index / column values.

DataFrame.pivot_table([values, index, ...])

Create a spreadsheet-style pivot table as a DataFrame.

DataFrame.sort_index(*[, axis, level, ...])

Sort object by labels (along an axis).

DataFrame.sort_values(by[, axis, ascending, ...])

Sort by the values along either axis.

DataFrame.squeeze([axis])

Squeeze 1 dimensional axis objects into scalars.

DataFrame.stack([level, dropna, sort, ...])

Stack the prescribed level(s) from columns to index.

DataFrame.T

Transpose index and columns.

DataFrame.transpose([copy])

Transpose index and columns.

DataFrame.unstack([level, fill_value, sort])

Pivot a level of the (necessarily hierarchical) index labels.
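
melt and pivot are roughly inverse operations, which a round trip makes concrete. The sketch uses plain pandas as a stand-in, assuming Snowpark pandas reshapes the same way; the column names id, quarter, and sales are invented for the example.

```python
import pandas as pd

# Wide format: one row per id, one column per quarter.
wide = pd.DataFrame({"id": ["a", "b"], "q1": [1, 3], "q2": [2, 4]})

# melt unpivots the quarter columns into (quarter, sales) pairs.
long = wide.melt(id_vars="id", var_name="quarter", value_name="sales")

# pivot spreads the quarter values back out into columns.
back = long.pivot(index="id", columns="quarter", values="sales")

# nlargest keeps the n rows with the highest values in the given column.
top_sale = long.nlargest(1, "sales")

print(long, back, top_sale, sep="\n")
```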

Combining / comparing / joining / merging

DataFrame.compare(other[, align_axis, ...])

Compare to another DataFrame and show the differences.

DataFrame.join(other[, on, how, lsuffix, ...])

Join columns of another DataFrame.

DataFrame.merge(right[, how, on, left_on, ...])

Merge DataFrame or named Series objects with a database-style join.

DataFrame.update(other)

Modify in place using non-NA values from another DataFrame.
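
The practical difference between merge and join is which keys they align on: merge matches on columns by default, join matches on the index. A minimal sketch using plain pandas as a stand-in, assuming Snowpark pandas behaves the same; the tables and column names are hypothetical.

```python
import pandas as pd

orders = pd.DataFrame({"cust": [1, 2, 3], "amount": [10, 20, 30]})
names = pd.DataFrame({"cust": [1, 2, 4], "name": ["Ann", "Bob", "Dee"]})

# Inner merge keeps only keys present in both frames.
inner = orders.merge(names, on="cust", how="inner")

# Left merge keeps every row of orders; unmatched names become missing.
left = orders.merge(names, on="cust", how="left")

# join aligns on the index, so the key column is moved into the index first.
joined = orders.set_index("cust").join(names.set_index("cust"), how="left")

print(inner, left, joined, sep="\n")
```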

Time Series-related

DataFrame.shift([periods, freq, axis, ...])

Shift data by desired number of periods along axis and replace columns with fill_value (default: None).

DataFrame.first_valid_index()

Return index for first non-NA value or None, if no non-NA value is found.

DataFrame.last_valid_index()

Return index for last non-NA value or None, if no non-NA value is found.

DataFrame.resample(rule[, axis, closed, ...])

Resample time-series data.
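
As a rough illustration of shift and its fill_value argument, the sketch below uses plain pandas as a stand-in, assuming Snowpark pandas fills the vacated positions the same way; the column name v is made up.

```python
import pandas as pd

df = pd.DataFrame({"v": [1, 2, 3]})

# Shifting down by one period leaves the first position missing.
shifted = df.shift(1)

# fill_value is written into the vacated position instead.
filled = df.shift(1, fill_value=0)

# The first non-missing label in the shifted frame is therefore row 1.
first_valid = shifted.first_valid_index()

print(shifted, filled, first_valid, sep="\n")
```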

Serialization / IO / conversion

DataFrame.to_csv([path_or_buf, sep, na_rep, ...])

Write object to a comma-separated values (csv) file.