Pandas - Detecting and removing outliers in a DataFrame using Z-score

Last update on May 05 2025 13:05:16 (UTC/GMT +8 hours)

5. Handling Outliers with Z-Score Method

Write a Pandas program to handle outliers in a DataFrame with the Z-score method.

This exercise demonstrates how to identify and remove outliers from a DataFrame using the Z-score method.

Sample Solution:

Code:

import pandas as pd

# Create a sample DataFrame with one extreme value.
# Eight rows are used because, with the sample standard deviation, |Z| cannot
# exceed (n - 1) / sqrt(n), which is only 1.5 for n = 4.
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David', 'Emma', 'Frank', 'Grace', 'Henry'],
    'Age': [25, 30, 22, 99, 28, 27, 24, 26]  # 99 is an outlier
})

# Calculate Z-scores using the sample standard deviation (ddof=1, the pandas default)
mean_age = df['Age'].mean()
std_age = df['Age'].std()
df['Z_Score'] = (df['Age'] - mean_age) / std_age

# Keep rows whose absolute Z-score is at most 2 (99 has a Z-score of about 2.46)
df_no_outliers = df[df['Z_Score'].abs() <= 2]

# Drop the helper Z_Score column
df_no_outliers = df_no_outliers.drop(columns='Z_Score')

# Output the result
print(df_no_outliers)

Output:

      Name  Age
0    David   25
1  Annabel   30
2  Charlie   22
4     Emma   28
5    Frank   27
6    Grace   24
7    Henry   26

Explanation:

Creates a DataFrame whose 'Age' column contains one extreme value (99).
Calculates a Z-score for each age: the value minus the column mean, divided by the sample standard deviation (Series.std() uses ddof=1 by default).
Keeps only the rows whose absolute Z-score is at most 2, treating anything beyond that threshold as an outlier.
Drops the helper Z_Score column and prints the filtered DataFrame.
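
One caveat worth checking when the sample is small: because the extreme value itself inflates the mean and standard deviation, no single value among n values can have an absolute Z-score above (n - 1) / sqrt(n) when the sample standard deviation is used. For n = 4 that ceiling is 1.5, so a threshold of 2 can never flag anything. The sketch below illustrates this; the helper names zscore_filter and max_possible_zscore are hypothetical and chosen only for this example, not part of pandas.

```python
import math
import pandas as pd

def zscore_filter(df, column, threshold=2.0):
    """Return the rows of df whose absolute Z-score in `column` is <= threshold."""
    # Series.std() uses the sample standard deviation (ddof=1) by default
    z = (df[column] - df[column].mean()) / df[column].std()
    return df[z.abs() <= threshold]

def max_possible_zscore(n):
    """Largest |Z| a single value can reach among n values (sample std)."""
    return (n - 1) / math.sqrt(n)

# Illustrative data: the same extreme value in a small and a larger sample
small = pd.DataFrame({'Age': [25, 30, 22, 99]})
large = pd.DataFrame({'Age': [25, 30, 22, 99, 28, 27, 24, 26]})

print(max_possible_zscore(len(small)))    # ceiling on |Z| for four values
print(len(zscore_filter(small, 'Age')))   # rows kept from the small sample
print(len(zscore_filter(large, 'Age')))   # rows kept from the larger sample
```

With four values the filter keeps every row, including 99; with eight values the 99 row is dropped. For small datasets, a lower threshold or a median-based rule may be a better fit than a fixed cutoff of 2.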

For more Practice: Solve these Related Problems:

Write a Pandas program to identify and remove outliers using Z-Score on a specific numeric column.
Write a Pandas program to calculate the Z-Score for each row and filter out rows exceeding a given threshold.
Write a Pandas program to visualize the distribution of Z-Scores and highlight potential outliers in a DataFrame.
Write a Pandas program to replace detected outliers with the median value using the Z-Score method.
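
As a possible starting point for the last problem, the sketch below replaces flagged values with the median instead of dropping the rows. The data, the threshold of 2, the Age_clean column name, and the choice to take the median over the non-outlier values only are all assumptions made for this example.

```python
import pandas as pd

# Illustrative data with one extreme value
df = pd.DataFrame({'Age': [25, 30, 22, 99, 28, 27, 24, 26]})

# Flag values whose absolute Z-score exceeds the chosen threshold
z = (df['Age'] - df['Age'].mean()) / df['Age'].std()
is_outlier = z.abs() > 2

# Median of the values that were not flagged (an assumption of this sketch)
median_age = df.loc[~is_outlier, 'Age'].median()

# Keep original values where they are not outliers, otherwise use the median
df['Age_clean'] = df['Age'].where(~is_outlier, median_age)
print(df)
```

Writing the result to a new column keeps the original values available for comparison.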
