DataFrame

Constructor

`DataFrame`([data, index, columns, dtype, ...])	Snowpark pandas representation of pandas.DataFrame with a lazily-evaluated relational dataset.

Attributes

`DataFrame.index`	Get the index for this Series/DataFrame.
`DataFrame.columns`	Get the columns for this `DataFrame`.
`DataFrame.dtypes`	Return the dtypes in the `DataFrame`.
`DataFrame.info`([verbose, buf, max_cols, ...])	Print a concise summary of the `DataFrame`.
`DataFrame.select_dtypes`([include, exclude])	Return a subset of the `DataFrame`'s columns based on the column dtypes.
`DataFrame.values`	Return a NumPy representation of the dataset.
`DataFrame.axes`	Return a list representing the axes of the DataFrame.
`DataFrame.ndim`	Return the number of dimensions of the underlying data, by definition 2.
`DataFrame.size`	Return an int representing the number of elements in this object.
`DataFrame.shape`	Return a tuple representing the dimensionality of the `DataFrame`.
`DataFrame.empty`	Indicator whether the DataFrame is empty.

Snowflake Specific

`DataFrame.to_pandas`()	Convert Modin `DataFrame` to pandas `DataFrame`.
`DataFrame.to_snowflake`(*args, **kwargs)
`DataFrame.to_snowpark`(*args, **kwargs)
`DataFrame.cache_result`(*args, **kwargs)
`DataFrame.create_or_replace_view`(*args, **kwargs)
`DataFrame.create_or_replace_dynamic_table`(...)
`DataFrame.to_view`(*args, **kwargs)
`DataFrame.to_dynamic_table`(*args, **kwargs)
`DataFrame.to_iceberg`(*args, **kwargs)
`DataFrame.get_backend`()	Get the backend for this `DataFrame`.
`DataFrame.set_backend`(backend[, inplace, ...])	Move the data in this `DataFrame` from its current backend to the given one.
`DataFrame.move_to`(backend[, inplace, ...])	Move the data in this `DataFrame` from its current backend to the given one.
`DataFrame.pin_backend`([inplace])	Pin the object's underlying data, preventing Modin from automatically moving it to another backend.
`DataFrame.unpin_backend`([inplace])	Unpin the object's underlying data, allowing Modin to automatically move it to another backend.

Conversion

`DataFrame.astype`(dtype[, copy, errors])	Cast a pandas object to a specified dtype `dtype`.
`DataFrame.convert_dtypes`([infer_objects, ...])	Convert columns to best possible dtypes using dtypes supporting `pd.NA`.
`DataFrame.copy`([deep])	Make a copy of this object's indices and data.

Indexing, iteration

`DataFrame.assign`(**kwargs)	Assign new columns to a `DataFrame`.
`DataFrame.head`([n])	Return the first n rows.
`DataFrame.loc`	Access a group of rows and columns by label(s) or a boolean array.
`DataFrame.iloc`	Purely integer-location based indexing for selection by position.
`DataFrame.insert`(loc, column, value[, ...])	Insert column into DataFrame at specified location.
`DataFrame.__iter__`()	Iterate over info axis.
`DataFrame.keys`()	Get columns of the `DataFrame`.
`DataFrame.iterrows`()	Iterate over `DataFrame` rows as (index, `Series`) pairs.
`DataFrame.items`()	Iterate over (column name, `Series`) pairs.
`DataFrame.itertuples`([index, name])	Iterate over DataFrame rows as namedtuples.
`DataFrame.tail`([n])	Return the last n rows.
`DataFrame.isin`(values)	Whether each element in the DataFrame is contained in values.
`DataFrame.where`(cond[, other, inplace, ...])	Replace values where the condition is False.
`DataFrame.mask`(cond[, other, inplace, axis, ...])	Replace values where the condition is True.
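
`where` and `mask` are easy to confuse because they are inverses of each other. The sketch below shows the difference on a tiny frame. It uses plain pandas as a stand-in (an assumption made so the snippet runs without a Snowflake session); the frame contents and variable names are illustrative only.

```python
import pandas as pd  # assumption: plain pandas stands in for Snowpark pandas

df = pd.DataFrame({"a": [1, -2, 3], "b": [-4, 5, -6]})

# where: keep values where the condition is True, replace the rest with 0.
kept_positive = df.where(df > 0, 0)

# mask: replace values where the condition is True with 0, keep the rest.
masked_positive = df.mask(df > 0, 0)

print(kept_positive)
print(masked_positive)
```

Because the two calls partition the frame by the same condition, adding the two results element-wise gives back the original values in this example.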

Binary operator functions

`DataFrame.add`(other[, axis, level, fill_value])	Get addition of `DataFrame` and other, element-wise (binary operator add).
`DataFrame.sub`(other[, axis, level, fill_value])	Get subtraction of `DataFrame` and other, element-wise (binary operator sub).
`DataFrame.mul`(other[, axis, level, fill_value])	Get multiplication of `DataFrame` and other, element-wise (binary operator mul).
`DataFrame.div`(other[, axis, level, fill_value])	Get floating division of `DataFrame` and other, element-wise (binary operator truediv).
`DataFrame.truediv`(other[, axis, level, ...])	Get floating division of `DataFrame` and other, element-wise (binary operator truediv).
`DataFrame.floordiv`(other[, axis, level, ...])	Get integer division of `DataFrame` and other, element-wise (binary operator floordiv).
`DataFrame.mod`(other[, axis, level, fill_value])	Get modulo of `DataFrame` and other, element-wise (binary operator mod).
`DataFrame.pow`(other[, axis, level, fill_value])	Get exponential power of `DataFrame` and other, element-wise (binary operator pow).
`DataFrame.radd`(other[, axis, level, fill_value])	Get addition of `DataFrame` and other, element-wise (binary operator radd).
`DataFrame.rsub`(other[, axis, level, fill_value])	Get subtraction of `DataFrame` and other, element-wise (binary operator rsub).
`DataFrame.rmul`(other[, axis, level, fill_value])	Get multiplication of `DataFrame` and other, element-wise (binary operator rmul).
`DataFrame.rdiv`(other[, axis, level, fill_value])	Get floating division of `DataFrame` and other, element-wise (binary operator rtruediv).
`DataFrame.rtruediv`(other[, axis, level, ...])	Get floating division of `DataFrame` and other, element-wise (binary operator rtruediv).
`DataFrame.rfloordiv`(other[, axis, level, ...])	Get integer division of `DataFrame` and other, element-wise (binary operator rfloordiv).
`DataFrame.rmod`(other[, axis, level, fill_value])	Get modulo of `DataFrame` and other, element-wise (binary operator rmod).
`DataFrame.rpow`(other[, axis, level, fill_value])	Get exponential power of `DataFrame` and other, element-wise (binary operator rpow).
`DataFrame.lt`(other[, axis, level])	Get less than comparison of `DataFrame` and other, element-wise (binary operator lt).
`DataFrame.gt`(other[, axis, level])	Get greater than comparison of `DataFrame` and other, element-wise (binary operator gt).
`DataFrame.le`(other[, axis, level])	Get less than or equal comparison of `DataFrame` and other, element-wise (binary operator le).
`DataFrame.ge`(other[, axis, level])	Get greater than or equal comparison of `DataFrame` and other, element-wise (binary operator ge).
`DataFrame.ne`(other[, axis, level])	Get not equal comparison of `DataFrame` and other, element-wise (binary operator ne).
`DataFrame.eq`(other[, axis, level])	Perform equality comparison of `DataFrame` and other (binary operator eq).
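
The reflected variants (`radd`, `rsub`, and so on) swap the operand order, and `fill_value` substitutes for a missing entry on one side before the operation runs. A minimal sketch follows, again using plain pandas as a stand-in (an assumption); the frames and names are made up for illustration.

```python
import pandas as pd  # assumption: plain pandas stands in for Snowpark pandas

df = pd.DataFrame({"x": [1.0, 2.0], "y": [4.0, 8.0]})

forward = df.sub(10)     # df - 10
reflected = df.rsub(10)  # 10 - df

# Strict versus inclusive comparison against the same scalar.
strictly_less = df.lt(2)
less_or_equal = df.le(2)

# fill_value replaces a value missing on one side before adding.
left = pd.DataFrame({"x": [1.0, None]})
right = pd.DataFrame({"x": [10.0, 20.0]})
filled = left.add(right, fill_value=0)

print(forward, reflected, filled, sep="\n")
```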

Function application, GroupBy & window

`DataFrame.apply`(func[, axis, raw, ...])	Apply a function along an axis of the DataFrame.
`DataFrame.applymap`(func[, na_action])	Apply a function to a DataFrame elementwise.
`DataFrame.agg`([func, axis])	Aggregate using one or more operations over the specified axis.
`DataFrame.aggregate`([func, axis])	Aggregate using one or more operations over the specified axis.
`DataFrame.transform`(func[, axis])	Call `func` on self producing a Snowpark pandas DataFrame with the same axis shape as self.
`DataFrame.groupby`([by, axis, level, ...])	Group DataFrame using a mapper or by a Series of columns.
`DataFrame.rolling`(window[, min_periods, ...])	Provide rolling window calculations.
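
As a rough illustration of how `groupby`, `agg`, and `rolling` differ, the sketch below computes a per-group total, a column-wise summary, and a moving average. It runs on plain pandas as a stand-in (an assumption), and the column names `team` and `score` are hypothetical.

```python
import pandas as pd  # assumption: plain pandas stands in for Snowpark pandas

df = pd.DataFrame({"team": ["a", "a", "b", "b"], "score": [1, 2, 3, 4]})

# groupby: one result per distinct key.
totals = df.groupby("team")["score"].sum()

# agg: several reductions over the same column at once.
summary = df.agg({"score": ["min", "max"]})

# rolling: a reduction over a sliding window of 2 rows;
# the first row has no full window, so its result is NaN.
moving_avg = df[["score"]].rolling(window=2).mean()

print(totals, summary, moving_avg, sep="\n")
```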

Computations / descriptive stats

`DataFrame.abs`()	Return a DataFrame with absolute numeric value of each element.
`DataFrame.all`([axis, bool_only, skipna])	Return whether all elements are True, potentially over an axis.
`DataFrame.any`(*[, axis, bool_only, skipna])	Return whether any element is True, potentially over an axis.
`DataFrame.count`([axis, numeric_only])	Count non-NA cells for each column or row.
`DataFrame.cummax`([axis, skipna])	Return cumulative maximum over a BasePandasDataset axis.
`DataFrame.cummin`([axis, skipna])	Return cumulative minimum over a BasePandasDataset axis.
`DataFrame.cumsum`([axis, skipna])	Return cumulative sum over a BasePandasDataset axis.
`DataFrame.describe`([percentiles, include, ...])	Generate descriptive statistics for columns in the dataset.
`DataFrame.diff`([periods, axis])	First discrete difference of element.
`DataFrame.max`([axis, skipna, numeric_only])	Return the maximum of the values over the requested axis.
`DataFrame.mean`([axis, skipna, numeric_only])	Return the mean of the values over the requested axis.
`DataFrame.median`([axis, skipna, numeric_only])	Return the median of the values over the requested axis.
`DataFrame.min`([axis, skipna, numeric_only])	Return the minimum of the values over the requested axis.
`DataFrame.pct_change`([periods, fill_method, ...])	Fractional change between the current and a prior element.
`DataFrame.quantile`([q, axis, numeric_only, ...])	Return values at the given quantile over requested axis.
`DataFrame.rank`([axis, method, numeric_only, ...])	Compute numerical data ranks (1 through n) along axis.
`DataFrame.round`([decimals])	Round a DataFrame to a variable number of decimal places.
`DataFrame.skew`([axis, skipna, numeric_only])	Return unbiased skew, normalized over n-1.
`DataFrame.sum`([axis, skipna, numeric_only, ...])	Return the sum of the values over the requested axis.
`DataFrame.std`([axis, skipna, ddof, numeric_only])	Return sample standard deviation over requested axis.
`DataFrame.var`([axis, skipna, ddof, numeric_only])	Return unbiased variance over requested axis.
`DataFrame.nunique`([axis, dropna])	Count number of distinct elements in specified axis.
`DataFrame.value_counts`([subset, normalize, ...])	Return a Series containing the frequency of each distinct row in the DataFrame.

Reindexing / selection / label manipulation

`DataFrame.add_prefix`(prefix[, axis])	Prefix labels with string prefix.
`DataFrame.add_suffix`(suffix[, axis])	Suffix labels with string suffix.
`DataFrame.drop`([labels, axis, index, ...])	Drop specified labels from rows or columns.
`DataFrame.drop_duplicates`([subset, keep, ...])	Return `DataFrame` with duplicate rows removed.
`DataFrame.duplicated`([subset, keep])	Return boolean Series denoting duplicate rows.
`DataFrame.equals`(other)	Test whether two dataframes contain the same elements.
`DataFrame.first`(offset)	Select initial periods of time series data based on a date offset.
`DataFrame.get`(key[, default])	Get item from object for given key (ex: DataFrame column).
`DataFrame.head`([n])	Return the first n rows.
`DataFrame.idxmax`([axis, skipna, numeric_only])	Return index of first occurrence of maximum over requested axis.
`DataFrame.idxmin`([axis, skipna, numeric_only])	Return index of first occurrence of minimum over requested axis.
`DataFrame.last`(offset)	Select final periods of time series data based on a date offset.
`DataFrame.rename`([mapper, index, columns, ...])	Rename columns or index labels.
`DataFrame.rename_axis`([mapper, index, ...])	Set the name of the axis for the index or columns.
`DataFrame.reset_index`([level, drop, ...])	Reset the index, or a level of it.
`DataFrame.sample`([n, frac, replace, ...])	Return a random sample of items from an axis of object.
`DataFrame.set_axis`(labels, *[, axis, copy])	Assign desired index to given axis.
`DataFrame.set_index`(keys, *[, drop, append, ...])	Set the DataFrame index using existing columns.
`DataFrame.tail`([n])	Return the last n rows.
`DataFrame.take`(indices[, axis])	Return the elements in the given positional indices along an axis.

Missing data handling

`DataFrame.backfill`(*[, axis, inplace, ...])	Synonym for DataFrame.fillna with `method='bfill'`.
`DataFrame.bfill`(*[, axis, inplace, limit, ...])	Fill NA/NaN values by using the next valid observation to fill the gap.
`DataFrame.dropna`(*[, axis, how, thresh, ...])	Remove missing values.
`DataFrame.ffill`(*[, axis, inplace, limit, ...])	Fill NA/NaN values by propagating the last valid observation to next valid.
`DataFrame.fillna`([value, method, axis, ...])	Fill NA/NaN values using the specified method.
`DataFrame.isna`()	Detect missing values.
`DataFrame.isnull`()	DataFrame.isnull is an alias for DataFrame.isna.
`DataFrame.notna`()	Detect non-missing values for an array-like object.
`DataFrame.notnull`()	Detect non-missing values for an array-like object.
`DataFrame.pad`(*[, axis, inplace, limit, ...])	Fill NA/NaN values by propagating the last valid observation to next valid.
`DataFrame.replace`([to_replace, value, ...])	Replace values given in to_replace with value.
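
The fill methods differ only in which neighbour supplies the replacement value. The sketch below contrasts them on one column with a two-row gap. Plain pandas is used as a stand-in (an assumption), and the column name `v` is illustrative.

```python
import pandas as pd  # assumption: plain pandas stands in for Snowpark pandas

df = pd.DataFrame({"v": [1.0, None, None, 4.0]})

missing = df.isna()      # True where a value is missing
forward = df.ffill()     # propagate the last valid value forward
backward = df.bfill()    # pull the next valid value backward
constant = df.fillna(0)  # replace every missing value with a constant
dropped = df.dropna()    # remove rows that contain a missing value

print(forward, backward, constant, dropped, sep="\n")
```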

Reshaping, sorting, transposing

`DataFrame.melt`([id_vars, value_vars, ...])	Unpivot a `DataFrame` from wide to long format, optionally leaving identifiers set.
`DataFrame.nlargest`(n, columns[, keep])	Return the first n rows ordered by columns in descending order.
`DataFrame.nsmallest`(n, columns[, keep])	Return the first n rows ordered by columns in ascending order.
`DataFrame.pivot`(*, columns[, index, values])	Return reshaped DataFrame organized by given index / column values.
`DataFrame.pivot_table`([values, index, ...])	Create a spreadsheet-style pivot table as a `DataFrame`.
`DataFrame.sort_index`(*[, axis, level, ...])	Sort object by labels (along an axis).
`DataFrame.sort_values`(by, *[, axis, ...])	Sort by the values along either axis.
`DataFrame.squeeze`([axis])	Squeeze 1 dimensional axis objects into scalars.
`DataFrame.stack`([level, dropna, sort, ...])	Stack the prescribed level(s) from columns to index.
`DataFrame.T`	Transpose index and columns.
`DataFrame.transpose`([copy])	Transpose index and columns.
`DataFrame.unstack`([level, fill_value, sort])	Pivot a level of the (necessarily hierarchical) index labels.
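
`melt` and `pivot` are roughly inverse reshapes: one goes from wide to long, the other back to wide. The following sketch shows a round trip on plain pandas, used here as a stand-in (an assumption); the column names `id`, `q1`, `q2`, `quarter`, and `sales` are hypothetical.

```python
import pandas as pd  # assumption: plain pandas stands in for Snowpark pandas

wide = pd.DataFrame({"id": [1, 2], "q1": [10, 30], "q2": [20, 40]})

# Wide to long: one row per (id, quarter) pair.
long = wide.melt(id_vars="id", var_name="quarter", value_name="sales")

# Long back to wide: quarters become columns again, indexed by id.
back = long.pivot(index="id", columns="quarter", values="sales")

# Sorting the long frame by value, largest first.
ranked = long.sort_values(by="sales", ascending=False)

print(long, back, ranked, sep="\n")
```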

Combining / comparing / joining / merging

`DataFrame.compare`(other[, align_axis, ...])	Compare to another DataFrame and show the differences.
`DataFrame.join`(other[, on, how, lsuffix, ...])	Join columns of another DataFrame.
`DataFrame.merge`(right[, how, on, left_on, ...])	Merge DataFrame or named Series objects with a database-style join.
`DataFrame.update`(other[, join, overwrite, ...])	Modify in place using non-NA values from another `DataFrame`.

Time Series-related

`DataFrame.shift`([periods, freq, axis, ...])	Shift data by desired number of periods along axis and replace columns with fill_value (default: None).
`DataFrame.first_valid_index`()	Return index for first non-NA value or None, if no non-NA value is found.
`DataFrame.last_valid_index`()	Return index for last non-NA value or None, if no non-NA value is found.
`DataFrame.resample`(rule[, axis, closed, ...])	Resample time-series data.

Plotting

`DataFrame.boxplot`([column, by, ax, ...])	Make a box plot from DataFrame columns.

Serialization / IO / conversion

`DataFrame.to_csv`([path_or_buf, sep, na_rep, ...])	Write object to a comma-separated values (csv) file.
`DataFrame.to_excel`(excel_writer[, ...])	Write object to an Excel sheet.
`DataFrame.to_html`([buf, columns, col_space, ...])	Render a `DataFrame` as an HTML table.
`DataFrame.to_string`([buf, columns, ...])	Render a DataFrame to a console-friendly tabular output.