Pandas Series: interpolate() function
Fill NA/missing values in a Pandas series
The interpolate() function fills NaN values in a Series using the chosen interpolation method.
Syntax:
Series.interpolate(method='linear', axis=0, limit=None, inplace=False, limit_direction='forward', limit_area=None, downcast=None, **kwargs)
Parameters:
Name | Description | Type/Default Value | Required / Optional |
---|---|---|---|
method | Interpolation technique to use, for example ‘linear’, ‘pad’, ‘polynomial’ or ‘spline’. | str Default Value: ‘linear’ | Optional |
axis | Axis to interpolate along. | {0 or ‘index’, 1 or ‘columns’, None} Default Value: 0 | Optional |
limit | Maximum number of consecutive NaNs to fill. Must be greater than 0. | int Default Value: None | Optional |
inplace | Update the data in place if possible. | bool Default Value: False | Optional |
limit_direction | If limit is specified, consecutive NaNs will be filled in this direction. | {‘forward’, ‘backward’, ‘both’} Default Value: ‘forward’ | Optional |
limit_area | If limit is specified, consecutive NaNs will be filled with this restriction. | {None, ‘inside’, ‘outside’} Default Value: None | Optional |
downcast | Downcast dtypes if possible. | ‘infer’ or None Default Value: None | Optional |
**kwargs | Keyword arguments to pass on to the interpolating function. | | Optional |
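The effect of limit, limit_direction and limit_area is easiest to see on a small Series. The following is a minimal sketch; the data and the variable names (s, fwd, both, inside) are made up for illustration and do not come from the examples on this page.

```python
import numpy as np
import pandas as pd

# Illustrative data: NaNs at the start, in the middle and at the end.
s = pd.Series([np.nan, 1.0, np.nan, np.nan, np.nan, 5.0, np.nan])

# Fill at most one NaN after each valid value (default direction: forward).
fwd = s.interpolate(limit=1)

# Fill at most one NaN on either side of each valid value.
both = s.interpolate(limit=1, limit_direction='both')

# Fill only NaNs that are surrounded by valid values.
inside = s.interpolate(limit_area='inside')

# Show the original and the three results side by side.
print(pd.DataFrame({'s': s, 'fwd': fwd, 'both': both, 'inside': inside}))
```

With this input, fwd leaves the leading NaN and the last two NaNs of the middle gap untouched, while inside fills the whole middle gap but neither end.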
Returns: Series or DataFrame. Returns the same object type as the caller, interpolated at some or all NaN values.
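A short sketch of what the return value means in practice, assuming the default inplace=False: the call returns a new object and the caller keeps its NaN values. The name filled is chosen here for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 2, np.nan, 5])

# Default (inplace=False): a new Series is returned, s is left unchanged.
filled = s.interpolate()

print(s.isna().sum())       # number of NaNs still in the original
print(filled.isna().sum())  # number of NaNs in the returned Series
```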
Notes
The ‘krogh’, ‘piecewise_polynomial’, ‘spline’, ‘pchip’ and ‘akima’ methods are wrappers around the respective SciPy implementations of similar names. These use the actual numerical values of the index.
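To illustrate what using the actual numerical values of the index means, the sketch below compares the default ‘linear’ method, which treats the values as equally spaced, with the ‘index’ method. ‘index’ is not one of the SciPy wrappers listed above; it is used here only because it shows the same idea without requiring SciPy. The data is made up.

```python
import numpy as np
import pandas as pd

# Unevenly spaced index: the gap sits at 1, close to 0 and far from 10.
s = pd.Series([0.0, np.nan, 10.0], index=[0, 1, 10])

# 'linear' ignores the index and places the value halfway between neighbours.
by_position = s.interpolate(method='linear')

# 'index' weights the neighbours by their distance along the index.
by_index = s.interpolate(method='index')

print(by_position)
print(by_index)
```

Here the positional result for the missing entry is 5.0, while the index-aware result is 1.0.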
Example - Filling in NaN in a Series via linear interpolation:
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series([0, 2, np.nan, 5])
s
Output:
0    0.0
1    2.0
2    NaN
3    5.0
dtype: float64
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series([0, 2, np.nan, 5])
s.interpolate()
Output:
0    0.0
1    2.0
2    3.5
3    5.0
dtype: float64
Example - Filling in NaN in a Series by padding, but filling at most two consecutive NaNs at a time:
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series([np.nan, "single_one", np.nan,
               "fill_two_more", np.nan, np.nan,
               3.71, np.nan])
s
Output:
0              NaN
1       single_one
2              NaN
3    fill_two_more
4              NaN
5              NaN
6             3.71
7              NaN
dtype: object
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series([np.nan, "single_one", np.nan,
               "fill_two_more", np.nan, np.nan,
               3.71, np.nan])
s.interpolate(method='pad', limit=2)
Output:
0              NaN
1       single_one
2       single_one
3    fill_two_more
4    fill_two_more
5    fill_two_more
6             3.71
7             3.71
dtype: object
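In recent pandas releases (2.1 and later, as far as the release notes indicate), method='pad' in interpolate() and interpolation of object-dtype Series are deprecated. A minimal sketch of the same fill using Series.ffill(), assuming that replacement suits your pandas version; the name padded is chosen here for illustration:

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, "single_one", np.nan,
               "fill_two_more", np.nan, np.nan,
               3.71, np.nan])

# Forward-fill, propagating each valid value over at most two NaNs.
padded = s.ffill(limit=2)
print(padded)
```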
Example - Filling in NaN in a Series via polynomial interpolation or splines:
Both ‘polynomial’ and ‘spline’ methods require that you also specify an order (int).
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series([0, 4, np.nan, 8])
s.interpolate(method='polynomial', order=2)
Output:
0    0.000000
1    4.000000
2    6.666667
3    8.000000
dtype: float64
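The value 6.666667 can be checked by hand: a second-order polynomial through the three known points (0, 0), (1, 4) and (3, 8) is evaluated at position 2. The sketch below does this with numpy.polyfit. It is an independent check under that assumption, not the way pandas computes the result (pandas delegates polynomial interpolation to SciPy), and the variable names are illustrative.

```python
import numpy as np

# Positions and values of the non-missing entries of the Series.
x_known = [0, 1, 3]
y_known = [0, 4, 8]

# Three points determine a degree-2 polynomial exactly.
coeffs = np.polyfit(x_known, y_known, deg=2)

# Evaluate the polynomial at the position of the missing entry.
value_at_2 = np.polyval(coeffs, 2)
print(round(value_at_2, 6))
```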
Example - Fill the DataFrame forward (that is, going down) along each column using linear interpolation:
Note how the last entry in column ‘p’ is interpolated differently, because there is no entry after it to use for interpolation. Note how the first entry in column ‘q’ remains NaN, because there is no entry before it to use for interpolation.
Python-Pandas Code:
import numpy as np
import pandas as pd
df = pd.DataFrame([(0.0, np.nan, -2.0, 2.0),
                   (np.nan, 3.0, np.nan, np.nan),
                   (2.0, 3.0, np.nan, 7.0),
                   (np.nan, 4.0, -4.0, 16.0)],
                  columns=list('pqrs'))
df
Output:
     p    q    r     s
0  0.0  NaN -2.0   2.0
1  NaN  3.0  NaN   NaN
2  2.0  3.0  NaN   7.0
3  NaN  4.0 -4.0  16.0
Python-Pandas Code:
import numpy as np
import pandas as pd
df = pd.DataFrame([(0.0, np.nan, -2.0, 2.0),
                   (np.nan, 3.0, np.nan, np.nan),
                   (2.0, 3.0, np.nan, 7.0),
                   (np.nan, 4.0, -4.0, 16.0)],
                  columns=list('pqrs'))
df.interpolate(method='linear', limit_direction='forward', axis=0)
Output:
     p    q         r     s
0  0.0  NaN -2.000000   2.0
1  1.0  3.0 -2.666667   4.5
2  2.0  3.0 -3.333333   7.0
3  2.0  4.0 -4.000000  16.0
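If the leading NaN in column ‘q’ should be filled as well, limit_direction='both' is one option. A sketch using the same DataFrame; with this setting the first valid value of the column is carried backward. The name filled_both is chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([(0.0, np.nan, -2.0, 2.0),
                   (np.nan, 3.0, np.nan, np.nan),
                   (2.0, 3.0, np.nan, 7.0),
                   (np.nan, 4.0, -4.0, 16.0)],
                  columns=list('pqrs'))

# Interpolate down each column and also fill NaNs before the first valid value.
filled_both = df.interpolate(method='linear', limit_direction='both', axis=0)
print(filled_both)
```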
Example - Using polynomial interpolation:
Python-Pandas Code:
import numpy as np
import pandas as pd
df = pd.DataFrame([(0.0, np.nan, -2.0, 2.0),
                   (np.nan, 3.0, np.nan, np.nan),
                   (2.0, 3.0, np.nan, 7.0),
                   (np.nan, 4.0, -4.0, 16.0)],
                  columns=list('pqrs'))
df['s'].interpolate(method='polynomial', order=2)
Output:
0     2.000000
1     2.333333
2     7.000000
3    16.000000
Name: s, dtype: float64