Pandas Series: asof() function
Get the last row(s) without any NaNs in Pandas series
The asof() function returns the last row(s) without any NaNs at or before where.
For each element of where (if it is a list), the last row without any NaN is taken. For a DataFrame, the last row without NaN is taken considering only the subset of columns (if subset is not None).
New in version 0.19.0: For DataFrame
If there is no good value, NaN is returned for a Series or a Series of NaN values for a DataFrame
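As a small sketch of the "no good value" case (the data and the lookup value 5 below are made up for illustration), a lookup that falls before the first index label gives NaN for a Series and a Series of NaN values for a DataFrame:

```python
import numpy as np
import pandas as pd

# Illustrative data; the index is sorted, as asof() expects.
s = pd.Series([2.0, 3.0], index=[20, 30])
df = pd.DataFrame({"p": [2.0, 3.0], "q": [7.0, 8.0]}, index=[20, 30])

# 5 comes before the first index label (20), so there is no earlier row.
series_result = s.asof(5)   # a NaN scalar
frame_result = df.asof(5)   # a Series of NaN, indexed by the column names

print(series_result)
print(frame_result)
```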
Syntax:
Series.asof(self, where, subset=None)
Parameters:
Name | Description | Type/Default Value | Required / Optional |
---|---|---|---|
where | Date(s) before which the last row(s) are returned. | date or array-like of dates | Required |
subset | For DataFrame, if not None, only use these columns to check for NaNs. | str or array-like of str Default Value: None | Optional |
Returns: scalar, Series, or DataFrame
The return can be:
- scalar : when self is a Series and where is a scalar
- Series: when self is a Series and where is an array-like, or when self is a DataFrame and where is a scalar
- DataFrame : when self is a DataFrame and where is an array-like
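The mapping between the caller, the kind of where, and the return type can be checked with a short sketch. The DataFrame and the lookup values here are illustrative assumptions, not part of the original examples:

```python
import numpy as np
import pandas as pd

s = pd.Series([2, 3, np.nan, 5], index=[20, 30, 40, 50])
df = pd.DataFrame({"p": [1.0, 2.0, 3.0, 4.0]}, index=[20, 30, 40, 50])

scalar_from_series = s.asof(35)        # Series + scalar where
series_from_series = s.asof([25, 35])  # Series + array-like where
series_from_frame = df.asof(35)        # DataFrame + scalar where
frame_from_frame = df.asof([25, 35])   # DataFrame + array-like where

for result in (scalar_from_series, series_from_series,
               series_from_frame, frame_from_frame):
    print(type(result).__name__)
```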
Notes: The index (dates) is assumed to be sorted. asof() raises an error if this is not the case.
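The sketch below illustrates the sorting requirement with a made-up series whose index is deliberately out of order. It assumes the error is a ValueError, which is what recent pandas versions raise; sorting the index first makes the lookup valid:

```python
import pandas as pd

# Hypothetical series with an unsorted index.
unsorted = pd.Series([2, 3, 5], index=[30, 20, 50])

try:
    unsorted.asof(25)
    raised = False
except ValueError as exc:
    # Assumption: pandas reports an unsorted index as a ValueError.
    raised = True
    print(f"asof() failed: {exc}")

# After sort_index() the index is 20, 30, 50, so the lookup succeeds.
result = unsorted.sort_index().asof(25)
print(result)
```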
Example - A Series and a scalar where:
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series([2, 3, np.nan, 5], index=[20, 30, 40, 50])
s
Output:
20    2.0
30    3.0
40    NaN
50    5.0
dtype: float64
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series([2, 3, np.nan, 5], index=[20, 30, 40, 50])
s.asof(20)
Output:
2.0
Example - For a sequence where, a Series is returned. The first value is NaN, because the first element of where is before the first index value:
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series([2, 3, np.nan, 5], index=[20, 30, 40, 50])
s.asof([5, 30])
Output:
5     NaN
30    3.0
dtype: float64
Example - Missing values are not considered. The following is 3.0, not NaN, even though NaN is at the index location for 40:
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series([2, 3, np.nan, 5], index=[20, 30, 40, 50])
s.asof(40)
Output:
3.0
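One way to read "missing values are not considered" is that the lookup behaves as if the NaN rows had been dropped first. The sketch below reuses the series from the examples above and compares both approaches; the lookup values 40, 45 and 50 are chosen only for illustration:

```python
import numpy as np
import pandas as pd

s = pd.Series([2, 3, np.nan, 5], index=[20, 30, 40, 50])

results = {}
for where in (40, 45, 50):
    direct = s.asof(where)                 # skips the NaN stored at label 40
    after_dropna = s.dropna().asof(where)  # NaN row removed before the lookup
    results[where] = (direct, after_dropna)
    print(where, direct, after_dropna)
```

Both columns of the printed output agree for every lookup value, since the NaN at label 40 is never a candidate.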
Example - Take all columns into consideration:
Python-Pandas Code:
import numpy as np
import pandas as pd
df = pd.DataFrame({'p': [20, 30, 40, 50, 60],
                   'q': [None, None, None, None, 500]},
                  index=pd.DatetimeIndex(['2019-02-28 09:02:00',
                                          '2019-02-28 09:03:00',
                                          '2019-02-28 09:04:00',
                                          '2019-02-28 09:05:00',
                                          '2019-02-28 09:06:00']))
df.asof(pd.DatetimeIndex(['2019-02-28 09:04:30',
                          '2019-02-28 09:05:30']))
Output:
                      p   q
2019-02-28 09:04:30 NaN NaN
2019-02-28 09:05:30 NaN NaN
Example - Take a single column into consideration:
Python-Pandas Code:
import numpy as np
import pandas as pd
df = pd.DataFrame({'p': [20, 30, 40, 50, 60],
                   'q': [None, None, None, None, 500]},
                  index=pd.DatetimeIndex(['2019-02-28 09:02:00',
                                          '2019-02-28 09:03:00',
                                          '2019-02-28 09:04:00',
                                          '2019-02-28 09:05:00',
                                          '2019-02-28 09:06:00']))
df.asof(pd.DatetimeIndex(['2019-02-28 09:04:30',
                          '2019-02-28 09:05:30']),
        subset=['p'])
Output:
                        p   q
2019-02-28 09:04:30  40.0 NaN
2019-02-28 09:05:30  50.0 NaN