﻿ Pandas: Interpolate the missing values using the Linear Interpolation method - w3resource

# Pandas: Interpolate the missing values using the Linear Interpolation method

## Pandas Handling Missing Values: Exercise-15 with Solution

Write a Pandas program to interpolate the missing values using the Linear Interpolation method in a given DataFrame.

From Wikipedia, in mathematics, linear interpolation is a method of curve fitting using linear polynomials to construct new data points within the range of a discrete set of known data points.

Test Data:

```     ord_no  purch_amt  sale_amt    ord_date  customer_id  salesman_id
0   70001.0     150.50     10.50  2012-10-05         3002       5002.0
1       NaN        NaN     20.65  2012-09-10         3001       5003.0
2   70002.0      65.26       NaN         NaN         3001       5001.0
3   70004.0     110.50     11.50  2012-08-17         3003          NaN
4       NaN     948.50     98.50  2012-09-10         3002       5002.0
5   70005.0        NaN       NaN  2012-07-27         3001       5001.0
6       NaN    5760.00     57.00  2012-09-10         3001       5001.0
7   70010.0    1983.43     19.43  2012-10-10         3004          NaN
8   70003.0        NaN       NaN  2012-10-10         3003       5003.0
9   70012.0     250.45     25.45  2012-06-27         3002       5002.0
10      NaN      75.29     75.29  2012-08-17         3001       5003.0
11  70013.0    3045.60     35.60  2012-04-25         3001          NaN
```

Sample Solution:

Python Code :

``````import pandas as pd
import numpy as np
pd.set_option('display.max_rows', None)
#pd.set_option('display.max_columns', None)
df = pd.DataFrame({
'ord_no':[70001,np.nan,70002,70004,np.nan,70005,np.nan,70010,70003,70012,np.nan,70013],
'purch_amt':[150.5,np.nan,65.26,110.5,948.5,np.nan,5760,1983.43,np.nan,250.45, 75.29,3045.6],
'sale_amt':[10.5,20.65,np.nan,11.5,98.5,np.nan,57,19.43,np.nan,25.45, 75.29,35.6],
'ord_date': ['2012-10-05','2012-09-10',np.nan,'2012-08-17','2012-09-10','2012-07-27','2012-09-10','2012-10-10','2012-10-10','2012-06-27','2012-08-17','2012-04-25'],
'customer_id':[3002,3001,3001,3003,3002,3001,3001,3004,3003,3002,3001,3001],
'salesman_id':[5002,5003,5001,np.nan,5002,5001,5001,np.nan,5003,5002,5003,np.nan]})
print("Original Orders DataFrame:")
print(df)
print("\nInterpolate the missing values using the Linear Interpolation method (purch_amt):")
df['purch_amt'].interpolate(method='linear', direction = 'forward', inplace=True)
print(df)
``````

Sample Output:

```Original Orders DataFrame:
ord_no  purch_amt     ...      customer_id salesman_id
0   70001.0     150.50     ...             3002      5002.0
1       NaN        NaN     ...             3001      5003.0
2   70002.0      65.26     ...             3001      5001.0
3   70004.0     110.50     ...             3003         NaN
4       NaN     948.50     ...             3002      5002.0
5   70005.0        NaN     ...             3001      5001.0
6       NaN    5760.00     ...             3001      5001.0
7   70010.0    1983.43     ...             3004         NaN
8   70003.0        NaN     ...             3003      5003.0
9   70012.0     250.45     ...             3002      5002.0
10      NaN      75.29     ...             3001      5003.0
11  70013.0    3045.60     ...             3001         NaN

[12 rows x 6 columns]

Interpolate the missing values using the Linear Interpolation method (purch_amt):
ord_no  purch_amt     ...      customer_id salesman_id
0   70001.0     150.50     ...             3002      5002.0
1       NaN     107.88     ...             3001      5003.0
2   70002.0      65.26     ...             3001      5001.0
3   70004.0     110.50     ...             3003         NaN
4       NaN     948.50     ...             3002      5002.0
5   70005.0    3354.25     ...             3001      5001.0
6       NaN    5760.00     ...             3001      5001.0
7   70010.0    1983.43     ...             3004         NaN
8   70003.0    1116.94     ...             3003      5003.0
9   70012.0     250.45     ...             3002      5002.0
10      NaN      75.29     ...             3001      5003.0
11  70013.0    3045.60     ...             3001         NaN

[12 rows x 6 columns]
```

Python Code Editor:

Have another way to solve this solution? Contribute your code (and comments) through Disqus.

What is the difficulty level of this exercise?

Test your Programming skills with w3resource's quiz.

﻿

## Python: Tips of the Day

F strings:

It is a common practice to add variables inside strings. F strings are by far the coolest way of doing it. To appreciate the f strings more, let's first perform the operation with the format function.

```name = 'Owen'
age = 25
print("{} is {} years old".format(name, age))
```

Output:

```Owen is 25 years old
```

We specify the variables that go inside the curly braces by using the format function at the end. F strings allow for specifying the variables inside the string.

```name = 'Owen'
age = 25
print(f"{name} is {age} years old")
```

Output:

```Owen is 25 years old
```

F strings are easier to follow and type. Moreover, they make the code more readable.

```A, B, C = {2, 4, 6}
print(A, B, C)
A, B, C = ['p', 'q', 'r']
print(A, B, C)
```

Output:

```2 4 6
p q r
```