modin.pandas.DataFrame.iterrows

DataFrame.iterrows() → Iterable[tuple[Hashable, modin.pandas.series.Series]]

Iterate over DataFrame rows as (index, Series) pairs.

Yields:

index (label or tuple of label) – The index of the row. A tuple for a MultiIndex.
data (Series) – The data of the row as a Series.
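
As a minimal sketch of the MultiIndex case, the snippet below builds a small frame with a two-level index and collects the index values that iterrows yields. It uses plain pandas as a stand-in, on the assumption that Snowpark pandas behaves the same way here; the frame, its index names, and the variable names are hypothetical.

```python
import pandas as pd  # assumption: plain pandas stands in for Snowpark pandas

# Hypothetical two-level index, used only for illustration.
multi_index = pd.MultiIndex.from_tuples(
    [("a", 1), ("b", 2)], names=["letter", "number"]
)
df_multi = pd.DataFrame({"value": [10, 20]}, index=multi_index)

# Each yielded index is a tuple with one entry per index level.
labels = [index for index, _ in df_multi.iterrows()]
for label in labels:
    print(label)
```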

See also

DataFrame.itertuples: Iterate over DataFrame rows as namedtuples of the values.
DataFrame.items: Iterate over (column name, Series) pairs.

Notes

Iterating over rows is an antipattern in Snowpark pandas and pandas. Use df.apply() or other aggregation methods when possible instead of iterating over a DataFrame. Iterators and for loops do not scale well.
Because iterrows returns a Series for each row, it does not preserve dtypes across the rows (dtypes are preserved across columns for DataFrames).
Never modify the rows you are iterating over. The iterator returns a copy of the data, so writing to it has no effect on the DataFrame.
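
The notes above can be illustrated with a short sketch. It writes to each yielded row, checks that the DataFrame is unchanged, and then performs a column-wise operation as one possible alternative to the loop. Plain pandas is used as a stand-in (an assumption), and the names df_notes and float_doubled are hypothetical.

```python
import pandas as pd  # assumption: plain pandas stands in for Snowpark pandas

df_notes = pd.DataFrame([[1, 1.5], [2, 2.5], [3, 7.8]], columns=["int", "float"])

# Writing to the yielded row changes only the per-row copy.
for _, row in df_notes.iterrows():
    row["float"] = 0.0

# The original column still holds its original values.
unchanged = df_notes["float"].tolist()

# Column-wise alternative: one vectorized operation, no Python-level loop.
df_notes["float_doubled"] = df_notes["float"] * 2
doubled = df_notes["float_doubled"].tolist()
```

To change values, assign to columns of the DataFrame itself rather than to the row objects produced by the iterator.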

Examples

>>> df = pd.DataFrame([[1, 1.5], [2, 2.5], [3, 7.8]], columns=['int', 'float'])
>>> df
   int  float
0    1    1.5
1    2    2.5
2    3    7.8

Print the first row’s index and the row as a Series.

>>> index_and_row = next(df.iterrows())
>>> index_and_row  # doctest: +SKIP
(0, int      1.0
float    1.5
Name: 0, dtype: float64)

Print the first row as a Series.

>>> row = next(df.iterrows())[1]
>>> row
int      1.0
float    1.5
Name: 0, dtype: float64
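
The row Series above shows the int value as 1.0 because a Series holds a single dtype. The sketch below makes that upcast explicit and contrasts it with itertuples, which in plain pandas keeps a separate type per field. Plain pandas is used as a stand-in here, so identical behavior in Snowpark pandas is an assumption; df_types and the other variable names are hypothetical.

```python
import pandas as pd  # assumption: plain pandas stands in for Snowpark pandas

df_types = pd.DataFrame([[1, 1.5], [2, 2.5]], columns=["int", "float"])

# The column keeps its integer dtype inside the DataFrame...
column_dtype = df_types["int"].dtype

# ...but the row Series needs one dtype for all values, so 1 becomes 1.0.
_, first_row = next(df_types.iterrows())
row_dtype = first_row.dtype

# itertuples yields namedtuples, which can hold a different type per field.
first_tuple = next(df_types.itertuples(index=False))

print(column_dtype, row_dtype, first_tuple)
```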

Pretty printing every row.

>>> for row in df.iterrows():
...     print(row[1])
...
int      1.0
float    1.5
Name: 0, dtype: float64
int      2.0
float    2.5
Name: 1, dtype: float64
int      3.0
float    7.8
Name: 2, dtype: float64

>>> df = pd.DataFrame([[0, 2, 3], [0, 4, 1]], columns=['A', 'B', 'C'])
>>> df
   A  B  C
0  0  2  3
1  0  4  1

Pretty printing the results to distinguish index and Series.

>>> for row in df.iterrows():
...     print(f"Index: {row[0]}")
...     print("Series:")
...     print(row[1])
...
Index: 0
Series:
A    0
B    2
C    3
Name: 0, dtype: int64
Index: 1
Series:
A    0
B    4
C    1
Name: 1, dtype: int64
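
Since each item is an (index, Series) pair, the pair can also be unpacked directly in the loop header instead of indexing with row[0] and row[1]. This is a small sketch of that style using plain pandas as a stand-in (an assumption); df_abc and summaries are hypothetical names.

```python
import pandas as pd  # assumption: plain pandas stands in for Snowpark pandas

df_abc = pd.DataFrame([[0, 2, 3], [0, 4, 1]], columns=["A", "B", "C"])

# Unpack each (index, Series) pair directly in the loop header.
summaries = []
for index, row in df_abc.iterrows():
    summaries.append(f"Index {index}: B={row['B']}, C={row['C']}")

print("\n".join(summaries))
```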