w3resource

Removing columns with too many missing values using dropna() in Pandas


Pandas: Data Cleaning and Preprocessing Exercise-13 with Solution


Write a Pandas program to remove columns with too many missing values.

The following exercise uses dropna() to remove columns that contain too many missing values.

Sample Solution:

Code:

import pandas as pd

# Create a sample DataFrame with missing values
df = pd.DataFrame({
    'Name': ['Selena', 'Annabel', 'Caeso'],
    'Age': [25, None, 22],
    'Salary': [None, None, 70000]
})

# Keep only columns with at least 2 non-missing values; in this
# 3-row DataFrame that removes columns with more than 50% missing values
df_cleaned = df.dropna(thresh=2, axis=1)

# Output the result
print(df_cleaned)

Output:

      Name   Age
0   Selena  25.0
1  Annabel   NaN
2    Caeso  22.0

Explanation:

  • Created a DataFrame with multiple columns containing missing values.
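  • Used dropna(thresh=2, axis=1) to keep only columns with at least two non-missing values. thresh is a count of non-missing values, not a percentage; with three rows, it removes columns with more than 50% missing values ('Salary' here).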
  • Returned the DataFrame with only columns that have sufficient data.
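
Because thresh is a fixed count, the value 2 only corresponds to "more than 50% missing" for a three-row DataFrame. The sketch below is one possible way to derive the count from a fraction so the same rule applies to a DataFrame of any length. The names max_missing, min_non_missing, df_by_thresh and df_by_mask are illustrative and not part of the original solution.

```python
import math
import pandas as pd

# Same sample DataFrame as in the solution above
df = pd.DataFrame({
    'Name': ['Selena', 'Annabel', 'Caeso'],
    'Age': [25, None, 22],
    'Salary': [None, None, 70000]
})

# Illustrative parameter: largest fraction of missing values a column may have
max_missing = 0.5

# dropna(thresh=...) expects a count of non-missing values,
# so convert the allowed missing fraction into that count
min_non_missing = math.ceil((1 - max_missing) * len(df))
df_by_thresh = df.dropna(thresh=min_non_missing, axis=1)

# Alternative: compute the fraction of missing values per column
# and select columns with a boolean mask
missing_fraction = df.isna().mean()
df_by_mask = df.loc[:, missing_fraction <= max_missing]

print(missing_fraction)
print(df_by_thresh.columns.tolist())  # ['Name', 'Age']
```

For this sample, both approaches keep 'Name' and 'Age' and drop 'Salary'. The mask version states the rule directly as a fraction, which can be easier to read when the threshold is specified as a percentage.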
