Pandas - Detecting outliers in a DataFrame using the IQR Method
Pandas: Data Validation Exercise-8 with Solution
Write a Pandas program to detect outliers in a DataFrame.
This exercise shows how to detect outliers in a column using the Interquartile Range (IQR) method.
Sample Solution:
Code:
import pandas as pd
# Create a sample DataFrame with outliers
df = pd.DataFrame({
    'Value': [10, 15, 14, 18, 90, 12, 11, 13]
})
# Calculate the IQR for the 'Value' column
Q1 = df['Value'].quantile(0.25)
Q3 = df['Value'].quantile(0.75)
IQR = Q3 - Q1
# Define outliers as values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR
outliers = df[(df['Value'] < Q1 - 1.5 * IQR) | (df['Value'] > Q3 + 1.5 * IQR)]
# Output the result
print(outliers)
Output:
   Value
4     90
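To see why only the row containing 90 is flagged, it can help to print the quartiles and the two cut-off values. The following is a minimal sketch that reuses the same data; the names `lower` and `upper` are introduced here for illustration, and the numbers assume the default linear interpolation used by `quantile`.

```python
import pandas as pd

df = pd.DataFrame({'Value': [10, 15, 14, 18, 90, 12, 11, 13]})

# Quartiles and IQR, converted to plain floats for readable printing
q1 = float(df['Value'].quantile(0.25))
q3 = float(df['Value'].quantile(0.75))
iqr = q3 - q1

# Cut-off values: anything below `lower` or above `upper` counts as an outlier
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

print(q1, q3, iqr)   # 11.75 15.75 4.0
print(lower, upper)  # 5.75 21.75
```

Every value except 90 lies between 5.75 and 21.75, so 90 is the only row returned.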
Explanation:
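- Created a DataFrame with one numeric column, 'Value', that contains one extreme value (90).
- Calculated the first quartile (Q1), the third quartile (Q3), and the Interquartile Range (IQR = Q3 - Q1) of that column.
- Defined outliers as values below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR, and filtered the DataFrame with a boolean mask to display them.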
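The steps above can also be wrapped in a small helper so that the same check can be reused on other columns. This is only a sketch: the function name `iqr_outlier_mask` and its parameter `k` are hypothetical and not part of pandas, and the default `k=1.5` simply mirrors the multiplier used in the solution.

```python
import pandas as pd

def iqr_outlier_mask(series, k=1.5):
    """Return a boolean Series that is True where a value lies outside
    [Q1 - k*IQR, Q3 + k*IQR]. (Hypothetical helper, not a pandas API.)"""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)

df = pd.DataFrame({'Value': [10, 15, 14, 18, 90, 12, 11, 13]})

mask = iqr_outlier_mask(df['Value'])
outliers = df[mask]    # rows flagged as outliers
cleaned = df[~mask]    # remaining rows

print(outliers)
print(len(cleaned))
```

Returning the mask rather than the filtered rows keeps both uses available: `df[mask]` shows the outliers, while `df[~mask]` gives the data with them removed.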