w3resource

Pandas - Detecting outliers in a DataFrame using the IQR Method


8. Detecting Outliers in a DataFrame

Write a Pandas program to detect outliers in a DataFrame.

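This exercise shows how to detect outliers in a column using the Interquartile Range (IQR) method, which flags any value lying more than 1.5 times the IQR below the first quartile (Q1) or above the third quartile (Q3).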

Sample Solution:

Code:

import pandas as pd

# Create a sample DataFrame with outliers
df = pd.DataFrame({
    'Value': [10, 15, 14, 18, 90, 12, 11, 13]
})

# Calculate the IQR for the 'Value' column
Q1 = df['Value'].quantile(0.25)
Q3 = df['Value'].quantile(0.75)
IQR = Q3 - Q1

# Flag values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR as outliers
outliers = df[(df['Value'] < Q1 - 1.5 * IQR) | (df['Value'] > Q3 + 1.5 * IQR)]

# Output the result
print(outliers)

Output:

   Value
4     90

Explanation:

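  • Created a DataFrame with a single numeric column, 'Value', that contains one obvious outlier (90).
  • Computed Q1 (11.75) and Q3 (15.75) with quantile(), giving IQR = Q3 - Q1 = 4.0.
  • Flagged values below Q1 - 1.5 * IQR (5.75) or above Q3 + 1.5 * IQR (21.75) as outliers and filtered the DataFrame with a boolean mask to display them; only 90 falls outside that range.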
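The steps above can also be wrapped in a small helper so the same rule is reusable on other columns. The following is a minimal sketch, not part of the original solution: the function name iqr_fences, the multiplier parameter k, and the Is_Outlier column name are hypothetical choices made for illustration.

```python
import pandas as pd


def iqr_fences(series, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR.

    `iqr_fences` and `k` are illustrative names, not a pandas API.
    """
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Same sample data as in the solution above
df = pd.DataFrame({'Value': [10, 15, 14, 18, 90, 12, 11, 13]})

lower, upper = iqr_fences(df['Value'])

# Boolean column: True where the value lies strictly outside the fences
df['Is_Outlier'] = (df['Value'] < lower) | (df['Value'] > upper)

# Show only the flagged rows
print(df[df['Is_Outlier']])
```

Keeping the flag as a column, rather than filtering immediately, leaves the original rows in place, so the outliers can later be dropped, replaced, or inspected as needed.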

For more practice, solve these related problems:

  • Write a Pandas program to detect outliers in a DataFrame using the IQR method and mark them in a new column.
  • Write a Pandas program to identify outliers in a numerical column using Z-score and return the indices of the outlier rows.
  • Write a Pandas program to detect outliers and replace them with the median value of the column.
  • Write a Pandas program to create a box plot for a DataFrame column and programmatically flag the outlier values.
