DataFrame

Constructor


`DataFrame`([data, index, columns, dtype, ...])	Snowpark pandas representation of `pandas.DataFrame` with a lazily-evaluated relational dataset.

Attributes


`DataFrame.index`	Get the index for this DataFrame.
`DataFrame.columns`	Get the columns for this Snowpark pandas `DataFrame`.
`DataFrame.dtypes`	Return the dtypes in the `DataFrame`.
`DataFrame.info`([verbose, buf, max_cols, ...])	Print a concise summary of the `DataFrame`.
`DataFrame.select_dtypes`([include, exclude])	Return a subset of the `DataFrame`'s columns based on the column dtypes.
`DataFrame.values`	Return a NumPy representation of the dataset.
`DataFrame.axes`	Return a list representing the axes of the DataFrame.
`DataFrame.ndim`	Return the number of dimensions of the underlying data, by definition 2.
`DataFrame.size`	Return an int representing the number of elements in this object.
`DataFrame.shape`	Return a tuple representing the dimensionality of the `DataFrame`.
`DataFrame.empty`	Indicator whether the DataFrame is empty.

Snowflake Specific


`DataFrame.to_pandas`(*[, statement_params])	Convert the Snowpark pandas DataFrame to a `pandas.DataFrame`.
`DataFrame.to_snowflake`(name[, if_exists, ...])	Save the Snowpark pandas DataFrame as a Snowflake table.
`DataFrame.to_snowpark`([index, index_label])	Convert the Snowpark pandas DataFrame to a Snowpark DataFrame.
`DataFrame.cache_result`([inplace])	Persists the current Snowpark pandas DataFrame to a temporary table to improve the latency of subsequent operations.

Conversion


`DataFrame.astype`(dtype[, copy, errors])	Cast a pandas object to a specified dtype `dtype`.
`DataFrame.convert_dtypes`([infer_objects, ...])	Convert columns to best possible dtypes using dtypes supporting `pd.NA`.
`DataFrame.copy`([deep])	Make a copy of this object's indices and data.

Indexing, iteration


`DataFrame.head`([n])	Return the first n rows.
`DataFrame.loc`	Access a group of rows and columns by label(s) or a boolean array.
`DataFrame.iloc`	Purely integer-location based indexing for selection by position.
`DataFrame.insert`(loc, column, value[, ...])	Insert column into DataFrame at specified location.
`DataFrame.__iter__`()	Iterate over info axis.
`DataFrame.keys`()	Get columns of the `DataFrame`.
`DataFrame.iterrows`()	Iterate over `DataFrame` rows as (index, `Series`) pairs.
`DataFrame.itertuples`([index, name])	Iterate over DataFrame rows as namedtuples.
`DataFrame.tail`([n])	Return the last n rows.
`DataFrame.isin`(values)	Whether each element in the DataFrame is contained in values.
`DataFrame.where`(cond[, other, inplace, ...])	Replace values where the condition is False.
`DataFrame.mask`(cond[, other, inplace, axis, ...])	Replace values where the condition is True.
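
`where` and `mask` are easy to confuse because they are mirror images of each other. The sketch below uses plain pandas as a stand-in so that it runs without a Snowflake session. That substitution is an assumption, and the frame and its column names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, -2, 3], "b": [-4, 5, -6]})

# where: keep values where the condition is True, replace the rest with 0.
kept_positive = df.where(df > 0, 0)

# mask: the inverse, replace values where the condition is True with 0.
masked_positive = df.mask(df > 0, 0)

print(kept_positive)
print(masked_positive)
```

The two results are complementary: every cell holds its original value in exactly one of them.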

Binary operator functions


`DataFrame.add`(other[, axis, level, fill_value])	Get addition of `DataFrame` and other, element-wise (binary operator add).
`DataFrame.sub`(other[, axis, level, fill_value])	Get subtraction of `DataFrame` and other, element-wise (binary operator sub).
`DataFrame.mul`(other[, axis, level, fill_value])	Get multiplication of `DataFrame` and other, element-wise (binary operator mul).
`DataFrame.div`(other[, axis, level, fill_value])	Get floating division of `DataFrame` and other, element-wise (binary operator truediv).
`DataFrame.truediv`(other[, axis, level, ...])	Get floating division of `DataFrame` and other, element-wise (binary operator truediv).
`DataFrame.floordiv`(other[, axis, level, ...])	Get integer division of `DataFrame` and other, element-wise (binary operator floordiv).
`DataFrame.mod`(other[, axis, level, fill_value])	Get modulo of `DataFrame` and other, element-wise (binary operator mod).
`DataFrame.pow`(other[, axis, level, fill_value])	Get exponential power of `DataFrame` and other, element-wise (binary operator pow).
`DataFrame.radd`(other[, axis, level, fill_value])	Get addition of `DataFrame` and other, element-wise (binary operator radd).
`DataFrame.rsub`(other[, axis, level, fill_value])	Get subtraction of `DataFrame` and other, element-wise (binary operator rsub).
`DataFrame.rmul`(other[, axis, level, fill_value])	Get multiplication of `DataFrame` and other, element-wise (binary operator rmul).
`DataFrame.rdiv`(other[, axis, level, fill_value])	Get floating division of `DataFrame` and other, element-wise (binary operator rtruediv).
`DataFrame.rtruediv`(other[, axis, level, ...])	Get floating division of `DataFrame` and other, element-wise (binary operator rtruediv).
`DataFrame.rfloordiv`(other[, axis, level, ...])	Get integer division of `DataFrame` and other, element-wise (binary operator rfloordiv).
`DataFrame.rmod`(other[, axis, level, fill_value])	Get modulo of `DataFrame` and other, element-wise (binary operator rmod).
`DataFrame.rpow`(other[, axis, level, fill_value])	Get exponential power of `DataFrame` and other, element-wise (binary operator rpow).
`DataFrame.lt`(other[, axis, level])	Get less than comparison of `DataFrame` and other, element-wise (binary operator lt).
`DataFrame.gt`(other[, axis, level])	Get greater than comparison of `DataFrame` and other, element-wise (binary operator gt).
`DataFrame.le`(other[, axis, level])	Get less than or equal comparison of `DataFrame` and other, element-wise (binary operator le).
`DataFrame.ge`(other[, axis, level])	Get greater than or equal comparison of `DataFrame` and other, element-wise (binary operator ge).
`DataFrame.ne`(other[, axis, level])	Get not equal comparison of `DataFrame` and other, element-wise (binary operator ne).
`DataFrame.eq`(other[, axis, level])	Perform equality comparison of `DataFrame` and other (binary operator eq).
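
The difference between a forward method and its reflected `r`-prefixed counterpart, and the effect of `fill_value`, can be sketched as follows. Plain pandas is used as a stand-in here (an assumption, so the example runs without a Snowflake session), and the frames are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 4.0]})

# Forward vs. reflected: sub computes df - 10, rsub computes 10 - df.
forward = df.sub(10)
reflected = df.rsub(10)

# fill_value stands in for a value missing on one side before the operation.
left = pd.DataFrame({"a": [1.0, None]})
right = pd.DataFrame({"a": [10.0, 20.0]})
filled = left.add(right, fill_value=0)

# lt is strict (<), le is inclusive (<=).
strictly_less = df.lt(2)
less_or_equal = df.le(2)
```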

Function application, GroupBy & window


`DataFrame.apply`(func[, axis, raw, ...])	Apply a function along an axis of the DataFrame.
`DataFrame.applymap`(func[, na_action])	Apply a function to a DataFrame element-wise.
`DataFrame.agg`([func, axis])	Aggregate using one or more operations over the specified axis.
`DataFrame.aggregate`([func, axis])	Aggregate using one or more operations over the specified axis.
`DataFrame.transform`(func[, axis])	Call `func` on self producing a Snowpark pandas DataFrame with the same axis shape as self.
`DataFrame.groupby`([by, axis, level, ...])	Group DataFrame using a mapper or by a Series of columns.
`DataFrame.rolling`(window[, min_periods, ...])	Provide rolling window calculations.
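
A minimal sketch of how `groupby`, `agg`, and `rolling` relate, written against plain pandas as a stand-in (an assumption; Snowpark pandas may support only a subset of the arguments). The `team` and `score` columns are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"team": ["x", "x", "y"], "score": [1, 3, 10]})

# groupby splits rows by key, then aggregates each group.
totals = df.groupby("team")["score"].sum()

# agg applies one or more named aggregations to whole columns.
summary = df.agg({"score": ["min", "max"]})

# rolling aggregates over a sliding window of rows; the first window
# is incomplete, so its result is missing.
rolling_mean = df[["score"]].rolling(window=2).mean()
```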

Computations / descriptive stats


`DataFrame.abs`()	Return a DataFrame with absolute numeric value of each element.
`DataFrame.all`([axis, bool_only, skipna])	Return whether all elements are True, potentially over an axis.
`DataFrame.any`([axis, bool_only, skipna])	Return whether any element is True, potentially over an axis.
`DataFrame.count`([axis, numeric_only])	Count non-NA cells for each column or row.
`DataFrame.cummax`([axis, skipna])	Return cumulative maximum over a DataFrame axis.
`DataFrame.cummin`([axis, skipna])	Return cumulative minimum over a DataFrame axis.
`DataFrame.cumsum`([axis, skipna])	Return cumulative sum over a DataFrame axis.
`DataFrame.describe`([percentiles, include, ...])	Generate descriptive statistics for columns in the dataset.
`DataFrame.diff`([periods, axis])	First discrete difference of element.
`DataFrame.max`([axis, skipna, numeric_only])	Return the maximum of the values over the requested axis.
`DataFrame.mean`([axis, skipna, numeric_only])	Return the mean of the values over the requested axis.
`DataFrame.median`([axis, skipna, numeric_only])	Return the median of the values over the requested axis.
`DataFrame.min`([axis, skipna, numeric_only])	Return the minimum of the values over the requested axis.
`DataFrame.quantile`([q, axis, numeric_only, ...])	Return values at the given quantile over requested axis.
`DataFrame.rank`([axis, method, numeric_only, ...])	Compute numerical data ranks (1 through n) along axis.
`DataFrame.round`([decimals])	Round a DataFrame to a variable number of decimal places.
`DataFrame.skew`([axis, skipna, numeric_only])	Return unbiased skew, normalized over n-1.
`DataFrame.sum`([axis, skipna, numeric_only, ...])	Return the sum of the values over the requested axis.
`DataFrame.std`([axis, skipna, ddof, numeric_only])	Return sample standard deviation over requested axis.
`DataFrame.var`([axis, skipna, ddof, numeric_only])	Return unbiased variance over requested axis.
`DataFrame.nunique`([axis, dropna])	Count number of distinct elements in specified axis.
`DataFrame.value_counts`([subset, normalize, ...])	Return a Series containing the frequency of each distinct row in the DataFrame.
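
"Sample" standard deviation and "unbiased" variance both refer to the `ddof` parameter, which defaults to 1 in pandas (divide by n-1). The sketch below shows the difference from the population variance (`ddof=0`). It uses plain pandas as a stand-in, which is an assumption, and a made-up column `x`.

```python
import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]})

# Squared deviations from the mean (2.5) sum to 5.0.
sample_var = df.var()["x"]            # ddof=1: 5.0 / (4 - 1)
population_var = df.var(ddof=0)["x"]  # ddof=0: 5.0 / 4

# std is the square root of the variance computed with the same ddof.
sample_std = df.std()["x"]
```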

Reindexing / selection / label manipulation


`DataFrame.add_prefix`(prefix)	Prefix labels with string prefix.
`DataFrame.add_suffix`(suffix)	Suffix labels with string suffix.
`DataFrame.drop`([labels, axis, index, ...])	Drop specified labels from rows or columns.
`DataFrame.drop_duplicates`([subset, keep, ...])	Return `DataFrame` with duplicate rows removed.
`DataFrame.duplicated`([subset, keep])	Return boolean Series denoting duplicate rows.
`DataFrame.first`(offset)	Select initial periods of time series data based on a date offset.
`DataFrame.get`(key[, default])	Get item from object for given key (ex: DataFrame column).
`DataFrame.head`([n])	Return the first n rows.
`DataFrame.idxmax`([axis, skipna, numeric_only])	Return index of first occurrence of maximum over requested axis.
`DataFrame.idxmin`([axis, skipna, numeric_only])	Return index of first occurrence of minimum over requested axis.
`DataFrame.last`(offset)	Select final periods of time series data based on a date offset.
`DataFrame.rename`([mapper, index, columns, ...])	Rename columns or index labels.
`DataFrame.rename_axis`([mapper, index, ...])	Set the name of the axis for the index or columns.
`DataFrame.reset_index`([level, drop, ...])	Reset the index, or a level of it.
`DataFrame.sample`([n, frac, replace, ...])	Return a random sample of items from an axis of object.
`DataFrame.set_axis`(labels, *[, axis, copy])	Assign desired index to given axis.
`DataFrame.set_index`(keys[, drop, append, ...])	Set the DataFrame index using existing columns.
`DataFrame.tail`([n])	Return the last n rows.
`DataFrame.take`(indices[, axis])	Return the elements in the given positional indices along an axis.

Missing data handling


`DataFrame.dropna`(*[, axis, how, thresh, ...])	Remove missing values.
`DataFrame.ffill`([axis, inplace, limit, downcast])	Synonym for `DataFrame.fillna()` with `method='ffill'`.
`DataFrame.fillna`([value, method, axis, ...])	Fill NA/NaN values using the specified method.
`DataFrame.isna`()	Detect missing values.
`DataFrame.isnull`()	DataFrame.isnull is an alias for DataFrame.isna.
`DataFrame.notna`()	Detect non-missing values for an array-like object.
`DataFrame.notnull`()	Detect non-missing values for an array-like object.
`DataFrame.pad`([axis, inplace, limit, downcast])	Synonym for `DataFrame.fillna()` with `method='ffill'`.
`DataFrame.replace`([to_replace, value, ...])	Replace values given in to_replace with value.
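
The missing-data methods fall into three groups: detecting, dropping, and filling. A short sketch under the assumption that plain pandas behaves as a stand-in for Snowpark pandas on this small, hypothetical frame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [4.0, 5.0, None]})

# Detect: isna returns a boolean frame, so sum counts missing cells per column.
missing_per_column = df.isna().sum()

# Drop: by default, any row containing a missing value is removed.
complete_rows = df.dropna()

# Fill: with a constant, or by carrying the last valid value forward.
zero_filled = df.fillna(0)
forward_filled = df.ffill()
```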

Reshaping, sorting, transposing


`DataFrame.pivot_table`([values, index, ...])	Create a spreadsheet-style pivot table as a `DataFrame`.
`DataFrame.sort_values`(by[, axis, ascending, ...])	Sort by the values along either axis.
`DataFrame.sort_index`([axis, level, ...])	Sort object by labels (along an axis).
`DataFrame.melt`([id_vars, value_vars, ...])	Unpivot a `DataFrame` from wide to long format, optionally leaving identifiers set.
`DataFrame.squeeze`([axis])	Squeeze 1 dimensional axis objects into scalars.
`DataFrame.T`	Transpose index and columns.
`DataFrame.transpose`([copy])	Transpose index and columns.
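
`melt` and `pivot_table` are roughly inverse operations: one unpivots wide data into long form, the other rebuilds the wide layout. The round trip below is a sketch using plain pandas as a stand-in (an assumption); the `id`, `q1`, `q2`, `quarter`, and `sales` names are invented for the example.

```python
import pandas as pd

wide = pd.DataFrame({"id": [1, 2], "q1": [10, 30], "q2": [20, 40]})

# Wide to long: one row per (id, quarter) pair.
long = wide.melt(id_vars="id", var_name="quarter", value_name="sales")

# Long back to wide: quarters become columns again, indexed by id.
back = long.pivot_table(values="sales", index="id", columns="quarter", aggfunc="sum")
```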

Combining / comparing / joining / merging


`DataFrame.join`(other[, on, how, lsuffix, ...])	Join columns of another DataFrame.
`DataFrame.merge`(right[, how, on, left_on, ...])	Merge DataFrame or named Series objects with a database-style join.
`DataFrame.update`(other[, join, overwrite, ...])	Modify in place using non-NA values from another `DataFrame`.
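
`merge` matches rows on column values, while `join` matches on the index by default. The following sketch contrasts the two, again using plain pandas as a stand-in (an assumption) and hypothetical `key`, `l`, and `r` columns.

```python
import pandas as pd

left = pd.DataFrame({"key": [1, 2, 3], "l": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "r": ["x", "y", "z"]})

# merge: database-style join on a shared column.
inner = left.merge(right, on="key", how="inner")  # keys present in both
outer = left.merge(right, on="key", how="outer")  # keys present in either

# join: aligns on the index, so move the key into the index first.
joined = left.set_index("key").join(right.set_index("key"), how="left")
```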

Time Series-related


`DataFrame.shift`([periods, freq, axis, ...])	Shift data by desired number of periods along axis and replace columns with fill_value (default: None).
`DataFrame.first_valid_index`()	Return index for first non-NA value or None, if no non-NA value is found.
`DataFrame.last_valid_index`()	Return index for last non-NA value or None, if no non-NA value is found.
`DataFrame.resample`(rule[, axis, closed, ...])	Resample time-series data.
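
A brief illustration of `shift` and the `*_valid_index` helpers on a frame with missing values at both ends. This is a sketch that assumes plain pandas as a stand-in; the column `v` and the integer index labels are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"v": [None, 2.0, 3.0, None]}, index=[10, 20, 30, 40])

# Index labels of the first and last non-missing values.
first_label = df.first_valid_index()
last_label = df.last_valid_index()

# shift moves the data down by one row; the index stays in place and
# the vacated first row is filled with a missing value.
shifted = df.shift(1)
```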