Pandas: Find placeholder values that carry no information in a given DataFrame and replace them with NaN
Pandas Handling Missing Values: Exercise-4 with Solution
Write a Pandas program to find the placeholder values in a given DataFrame that carry no useful information (such as ? and --) and replace them with NaN.
Example:
Missing values: ?, --
Replace those values with NaN
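Before replacing anything, it can help to locate the placeholders first. The following is a minimal sketch of that "find" step, using a hypothetical three-row DataFrame (not the test data below) and the two markers named above; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical three-row sample, not the exercise's test data.
sample = pd.DataFrame({
    "ord_no": [70001, "--", 70002],
    "purch_amt": [150.5, 270.65, "?"],
})

placeholders = ["?", "--"]  # markers treated as missing in this exercise

# Boolean mask: True wherever a cell holds one of the placeholders.
mask = sample.isin(placeholders)

# Number of placeholder cells in each column.
placeholder_counts = mask.sum()
print(placeholder_counts)

# Rows that contain at least one placeholder.
rows_with_placeholders = sample[mask.any(axis=1)]
print(rows_with_placeholders)
```

Counting the placeholders per column this way gives a quick check that the later replacement touched exactly the cells that were expected.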
Test Data:
    ord_no  purch_amt    ord_date  customer_id  salesman_id
0    70001      150.5           ?         3002         5002
1      NaN     270.65  2012-09-10         3001         5003
2    70002      65.26         NaN         3001            ?
3    70004      110.5  2012-08-17         3003         5001
4      NaN      948.5  2012-09-10         3002          NaN
5    70005     2400.6  2012-07-27         3001         5002
6       --       5760  2012-09-10         3001         5001
7    70010          ?  2012-10-10         3004            ?
8    70003      12.43  2012-10-10           --         5003
9    70012     2480.4  2012-06-27         3002         5002
10     NaN     250.45  2012-08-17         3001         5003
11   70013     3045.6  2012-04-25         3001           --
Sample Solution:
Python Code:
import numpy as np
import pandas as pd

# Show every row when printing the DataFrame.
pd.set_option('display.max_rows', None)
#pd.set_option('display.max_columns', None)

# Orders data in which "?" and "--" stand in for missing values.
df = pd.DataFrame({
    'ord_no':[70001,np.nan,70002,70004,np.nan,70005,"--",70010,70003,70012,np.nan,70013],
    'purch_amt':[150.5,270.65,65.26,110.5,948.5,2400.6,5760,"?",12.43,2480.4,250.45, 3045.6],
    'ord_date': ['?','2012-09-10',np.nan,'2012-08-17','2012-09-10','2012-07-27','2012-09-10','2012-10-10','2012-10-10','2012-06-27','2012-08-17','2012-04-25'],
    'customer_id':[3002,3001,3001,3003,3002,3001,3001,3004,"--",3002,3001,3001],
    'salesman_id':[5002,5003,"?",5001,np.nan,5002,5001,"?",5003,5002,5003,"--"]})

print("Original Orders DataFrame:")
print(df)

# Map each placeholder to NaN; replace() returns a new DataFrame and leaves df unchanged.
print("\nReplace the missing values with NaN:")
result = df.replace({"?": np.nan, "--": np.nan})
print(result)
Sample Output:
Original Orders DataFrame:
    ord_no  purch_amt    ord_date  customer_id  salesman_id
0    70001      150.5           ?         3002         5002
1      NaN     270.65  2012-09-10         3001         5003
2    70002      65.26         NaN         3001            ?
3    70004      110.5  2012-08-17         3003         5001
4      NaN      948.5  2012-09-10         3002          NaN
5    70005     2400.6  2012-07-27         3001         5002
6       --       5760  2012-09-10         3001         5001
7    70010          ?  2012-10-10         3004            ?
8    70003      12.43  2012-10-10           --         5003
9    70012     2480.4  2012-06-27         3002         5002
10     NaN     250.45  2012-08-17         3001         5003
11   70013     3045.6  2012-04-25         3001           --

Replace the missing values with NaN:
     ord_no  purch_amt    ord_date  customer_id  salesman_id
0   70001.0     150.50         NaN       3002.0       5002.0
1       NaN     270.65  2012-09-10       3001.0       5003.0
2   70002.0      65.26         NaN       3001.0          NaN
3   70004.0     110.50  2012-08-17       3003.0       5001.0
4       NaN     948.50  2012-09-10       3002.0          NaN
5   70005.0    2400.60  2012-07-27       3001.0       5002.0
6       NaN    5760.00  2012-09-10       3001.0       5001.0
7   70010.0        NaN  2012-10-10       3004.0          NaN
8   70003.0      12.43  2012-10-10          NaN       5003.0
9   70012.0    2480.40  2012-06-27       3002.0       5002.0
10      NaN     250.45  2012-08-17       3001.0       5003.0
11  70013.0    3045.60  2012-04-25       3001.0          NaN
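In the output above, the numeric columns are displayed as floats once the placeholders are gone. Whether replace() performs that conversion on its own may depend on the installed pandas version (recent releases warn that this kind of automatic downcasting is deprecated). The sketch below is one possible way to check the result and make the conversion explicit. It uses a few rows adapted from the test data, and the numeric_cols list is an assumption about which columns should be numeric.

```python
import numpy as np
import pandas as pd

# A few rows adapted from the exercise's test data (illustrative subset).
df = pd.DataFrame({
    "ord_no": [70001, np.nan, "--", 70010],
    "purch_amt": [150.5, 270.65, 5760, "?"],
    "ord_date": ["?", "2012-09-10", "2012-09-10", "2012-10-10"],
})

result = df.replace({"?": np.nan, "--": np.nan})

# Count missing values per column after the replacement.
missing_counts = result.isna().sum()
print(missing_counts)

# Convert the numeric columns explicitly, so their dtype does not depend on
# whether replace() downcasts object columns in the installed pandas version.
numeric_cols = ["ord_no", "purch_amt"]  # assumed numeric columns
result[numeric_cols] = result[numeric_cols].apply(pd.to_numeric)
print(result.dtypes)
```

With the explicit pd.to_numeric step, both columns end up as float64 whether or not replace() already converted them.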