DataFrame.dropna(how: str = 'any', thresh: Optional[int] = None, subset: Optional[Union[str, Iterable[str]]] = None) → DataFrame

Returns a new DataFrame that excludes all rows containing fewer than a specified number of non-null and non-NaN values in the specified columns.

  • how – A str whose value is either ‘any’ or ‘all’. If ‘any’, drop a row if it contains any null or NaN values in the specified columns. If ‘all’, drop a row only if all of its values in the specified columns are null or NaN. The default value is ‘any’. If thresh is provided, how is ignored.

  • thresh

    The minimum number of non-null and non-NaN values that must be present in the specified columns for a row to be kept. It overrides how. Specifically:

    • If thresh is not provided or None, the length of subset will be used when how is ‘any’ and 1 will be used when how is ‘all’.

    • If thresh is greater than the number of the specified columns, the method returns an empty DataFrame.

    • If thresh is less than 1, the method returns the original DataFrame.

  • subset

    The name of a column, or a list of column names, to check for null and NaN values. In each case:

    • If subset is not provided or None, all columns will be included.

    • If subset is empty, the method returns the original DataFrame.
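
The rules above for how, thresh, and subset can be sketched in plain Python. This is only an illustrative model of the documented behavior, not Snowpark's implementation: the helper names is_missing, effective_thresh, and dropna_rows are hypothetical, rows are modeled as plain lists, and column names are matched exactly as given.

```python
import math
from typing import Any, Iterable, List, Optional, Sequence, Union


def is_missing(value: Any) -> bool:
    """Treat None (standing in for SQL NULL) and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def effective_thresh(how: str, thresh: Optional[int], n_checked: int) -> int:
    """Resolve the minimum number of non-missing values a row must have."""
    if thresh is not None:
        return thresh  # an explicit thresh overrides how
    if how == "any":
        return n_checked  # every checked column must be present
    if how == "all":
        return 1  # a single present value is enough
    raise ValueError("how must be 'any' or 'all'")


def dropna_rows(
    rows: Sequence[Sequence[Any]],
    columns: Sequence[str],
    how: str = "any",
    thresh: Optional[int] = None,
    subset: Optional[Union[str, Iterable[str]]] = None,
) -> List[List[Any]]:
    """Hypothetical model: keep rows with enough non-missing checked values."""
    if subset is None:
        checked = list(columns)  # no subset: check every column
    elif isinstance(subset, str):
        checked = [subset]  # a single column name
    else:
        checked = list(subset)
    if not checked:
        return [list(row) for row in rows]  # empty subset: nothing is dropped
    needed = effective_thresh(how, thresh, len(checked))
    positions = [list(columns).index(name) for name in checked]
    return [
        list(row)
        for row in rows
        if sum(not is_missing(row[i]) for i in positions) >= needed
    ]


columns = ["a", "b"]
rows = [[1.0, 1], [float("nan"), 2], [None, 3], [4.0, None], [float("nan"), None]]

print(dropna_rows(rows, columns))              # [[1.0, 1]]
print(dropna_rows(rows, columns, subset="a"))  # [[1.0, 1], [4.0, None]]
print(dropna_rows(rows, columns, thresh=3))    # []
```

In this model, a thresh below 1 keeps every row and a thresh above the number of checked columns keeps none, which matches the two edge cases listed under thresh.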


>>> df = session.create_dataframe([[1.0, 1], [float('nan'), 2], [None, 3], [4.0, None], [float('nan'), None]]).to_df("a", "b")
>>> # drop a row if it contains any nulls, checking all columns
>>> df.dropna().show()
|"A"  |"B"  |
|1.0  |1    |

>>> # drop a row only if all its values are null, checking all columns
>>> df.dropna(how='all').show()
|"A"   |"B"   |
|1.0   |1     |
|nan   |2     |
|NULL  |3     |
|4.0   |NULL  |

>>> # keep a row only if it contains at least one non-null and non-NaN value, checking all columns
>>> df.dropna(thresh=1).show()
|"A"   |"B"   |
|1.0   |1     |
|nan   |2     |
|NULL  |3     |
|4.0   |NULL  |

>>> # drop a row if it contains any nulls, checking only column "a"
>>> df.dropna(subset=["a"]).show()
|"A"  |"B"   |
|1.0  |1     |
|4.0  |NULL  |

>>> # same check, passing the column name as a single str
>>> df.dropna(subset="a").show()
|"A"  |"B"   |
|1.0  |1     |
|4.0  |NULL  |