Answers by Sentry (2023-10-15)

Remove DataFrame rows with missing values in Python

David Y.

The Problem

In Pandas, how do I remove DataFrame rows in which every column is None or NaN? How do I remove rows in which only some columns contain these values?

The Solution

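We can achieve both of these results using the DataFrame.dropna method, which treats None and NaN alike as missing values. By default it returns a new DataFrame and leaves the original unchanged. Its how parameter controls whether a row is dropped only when all of its values are missing ("all") or as soon as any one of them is ("any", the default). For example: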

import pandas
from numpy import nan

df = pandas.DataFrame(
    {
        "Test 1": [90, 10, nan, nan],
        "Test 2": [41, nan, 32, nan],
        "Test 3": [89, 35, 72, nan],
        "Test 4": [52, nan, nan, nan],
    }
)
print(df)

# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
# 1    10.0     NaN    35.0     NaN
# 2     NaN    32.0    72.0     NaN
# 3     NaN     NaN     NaN     NaN

df_no_empty_rows = df.dropna(how="all")  # drop rows in which every value is NaN
print(df_no_empty_rows)

# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
# 1    10.0     NaN    35.0     NaN
# 2     NaN    32.0    72.0     NaN

df_no_empty_values = df.dropna(how="any")  # drop rows containing at least one NaN
print(df_no_empty_values)

# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
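
If only certain columns matter, or if a row should survive as long as it has enough data, dropna also accepts the subset and thresh parameters. The following is a minimal sketch that reuses the same sample data; the variable names df_subset and df_thresh are illustrative and not part of the original answer:

```python
import pandas
from numpy import nan

# Same sample data as in the example above
df = pandas.DataFrame(
    {
        "Test 1": [90, 10, nan, nan],
        "Test 2": [41, nan, 32, nan],
        "Test 3": [89, 35, 72, nan],
        "Test 4": [52, nan, nan, nan],
    }
)

# Only look at "Test 1" and "Test 3" when deciding which rows to drop
df_subset = df.dropna(subset=["Test 1", "Test 3"])
print(df_subset)

# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
# 1    10.0     NaN    35.0     NaN

# Keep only rows that have at least two non-missing values
df_thresh = df.dropna(thresh=2)
print(df_thresh)

# output:
#    Test 1  Test 2  Test 3  Test 4
# 0    90.0    41.0    89.0    52.0
# 1    10.0     NaN    35.0     NaN
# 2     NaN    32.0    72.0     NaN
```

Note that the surviving rows keep their original index labels. If a fresh 0-based index is needed, calling reset_index(drop=True) on the result is one way to get it.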

