modin.pandas.DataFrame.iterrows
DataFrame.iterrows() -> Iterator[tuple[Hashable, modin.pandas.series.Series]]
Iterate over DataFrame rows as (index, Series) pairs.
Yields:
index (label or tuple of label) – The index of the row. A tuple for a MultiIndex.
data (Series) – The data of the row as a Series.
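As a rough illustration of the yielded pairs, the sketch below iterates over a small frame with a two-level MultiIndex. It uses plain pandas as a stand-in, on the assumption that Snowpark pandas behaves the same way here. The frame contents and the names `key`, `num`, and `value` are made up for the example.

```python
import pandas as pd  # assumption: plain pandas stands in for modin.pandas

# Hypothetical frame with a two-level MultiIndex.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["key", "num"]
)
df = pd.DataFrame({"value": [10, 20, 30]}, index=index)

labels = []
values = []
for label, row in df.iterrows():
    # With a MultiIndex, each label is a tuple with one entry per level.
    labels.append(label)
    # row is a Series indexed by the column names.
    values.append(row["value"])
```

Unpacking the pair directly in the `for` statement, as above, tends to read more clearly than indexing into the tuple with `row[0]` and `row[1]`.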
See also
DataFrame.itertuples
Iterate over DataFrame rows as namedtuples of the values.
DataFrame.items
Iterate over (column name, Series) pairs.
Notes
Iterating over rows is an antipattern in Snowpark pandas and pandas. Use df.apply() or other aggregation methods when possible instead of iterating over a DataFrame. Iterators and for loops do not scale well.
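To make the advice above concrete, here is a minimal sketch that computes the same per-row product three ways: with iterrows, with df.apply(), and with a column-wise expression. It is written against plain pandas on the assumption that the same calls are available through modin.pandas; the column names `price` and `qty` are hypothetical.

```python
import pandas as pd  # assumption: plain pandas stands in for modin.pandas

df = pd.DataFrame({"price": [2.0, 3.5, 4.0], "qty": [3, 2, 5]})

# Row-by-row iteration: works, but builds a Series per row.
totals_loop = [row["price"] * row["qty"] for _, row in df.iterrows()]

# df.apply along axis=1: the row-wise logic is handed to the library.
totals_apply = df.apply(lambda row: row["price"] * row["qty"], axis=1)

# Column-wise expression: no per-row Python code at all.
totals_vec = df["price"] * df["qty"]
```

All three produce the same values. When an operation can be written on whole columns, the last form is usually the one to prefer.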
Because iterrows returns a Series for each row, it does not preserve dtypes across the rows (dtypes are preserved across columns for DataFrames).
You should never modify something you are iterating over. This will not work. The iterator returns a copy of the data and writing to it will have no effect.
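The two points above can be checked with a short sketch. It uses plain pandas as a stand-in, assuming Snowpark pandas behaves the same for these calls. The column names `n` and `x` are made up, and the `df.loc` assignment at the end is one possible alternative, not the only one.

```python
import pandas as pd  # assumption: plain pandas stands in for modin.pandas

df = pd.DataFrame({"n": [1, 2, 3], "x": [1.5, 2.5, 7.8]})

# iterrows packs each row into one Series, so the integer column is
# upcast to float64 to share a dtype with the float column.
_, first_row = next(df.iterrows())
row_dtype = str(first_row.dtype)

# itertuples keeps each value's own type: n stays an integer.
first_tuple = next(df.itertuples(index=False))

# Writing to the yielded row changes only the copy, not the frame.
for _, row in df.iterrows():
    row["x"] = 0.0
unchanged = df["x"].tolist()

# To change the frame, assign through the frame itself.
df.loc[df["n"] > 1, "x"] = 0.0
updated = df["x"].tolist()
```

Here `unchanged` still holds the original values, while `updated` reflects the assignment made through `df.loc`.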
Examples
>>> df = pd.DataFrame([[1, 1.5], [2, 2.5], [3, 7.8]], columns=['int', 'float'])
>>> df
   int  float
0    1    1.5
1    2    2.5
2    3    7.8
Print the first row’s index and the row as a Series.
>>> index_and_row = next(df.iterrows())
>>> index_and_row
(0, int      1.0
float    1.5
Name: 0, dtype: float64)
Print the first row as a Series.
>>> row = next(df.iterrows())[1]
>>> row
int      1.0
float    1.5
Name: 0, dtype: float64
Pretty printing every row.
>>> for row in df.iterrows():
...     print(row[1])
...
int      1.0
float    1.5
Name: 0, dtype: float64
int      2.0
float    2.5
Name: 1, dtype: float64
int      3.0
float    7.8
Name: 2, dtype: float64
>>> df = pd.DataFrame([[0, 2, 3], [0, 4, 1]], columns=['A', 'B', 'C'])
>>> df
   A  B  C
0  0  2  3
1  0  4  1
Pretty printing the results to distinguish index and Series.
>>> for row in df.iterrows():
...     print(f"Index: {row[0]}")
...     print("Series:")
...     print(row[1])
...
Index: 0
Series:
A    0
B    2
C    3
Name: 0, dtype: int64
Index: 1
Series:
A    0
B    4
C    1
Name: 1, dtype: int64