Examples
Scalar 'to_replace' and 'value'
import numpy as np
import pandas as pd
s = pd.Series([0, 2, 3, 4, 5])
s.replace(0, 6)
df = pd.DataFrame({'X': [0, 2, 3, 4, 5],
'Y': [6, 7, 8, 9, 1],
'Z': ['p', 'q', 'r', 's', 't']})
df.replace(0, 5)
List-like 'to_replace'
df.replace([0, 2, 3, 4], 5)
df.replace([0, 2, 3, 4], [4, 3, 2, 1])
s.replace([2, 3], method='bfill')
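When to_replace and value are lists of equal length, each entry of to_replace is paired with the entry of value at the same position. The sketch below rebuilds the df from the scalar example and checks two things: that the pairing is applied in a single pass (a 0 that becomes 4 is not then turned into 1), and that the same mapping written as a dict gives the same frame. The names paired and mapped are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'X': [0, 2, 3, 4, 5],
                   'Y': [6, 7, 8, 9, 1],
                   'Z': ['p', 'q', 'r', 's', 't']})

# Pairwise mapping: 0 -> 4, 2 -> 3, 3 -> 2, 4 -> 1.
paired = df.replace([0, 2, 3, 4], [4, 3, 2, 1])

# The same mapping expressed as a dict of old -> new values.
mapped = df.replace({0: 4, 2: 3, 3: 2, 4: 1})

print(paired['X'].tolist())   # [4, 3, 2, 1, 5]
print(paired.equals(mapped))  # True
```

Column Y is unchanged because none of its values appear in to_replace.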
dict-like 'to_replace'
df.replace({0: 20, 1: 80})
df.replace({'X': 0, 'Y': 6}, 80)
df.replace({'X': {0: 100, 3: 200}})
Regular expression 'to_replace'
df = pd.DataFrame({'X': ['bbb', 'fff', 'bii'],
'Y': ['abc', 'brr', 'pqr']})
df.replace(to_replace=r'^ba.$', value='new', regex=True)
df.replace({'X': r'^ba.$'}, {'X': 'new'}, regex=True)
df.replace(regex=r'^ba.$', value='new')
df.replace(regex={r'^ba.$': 'new', 'fff': 'pqr'})
df.replace(regex=[r'^ba.$', 'fff'], value='new')
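It can help to check which cells a pattern matches before reading the results. The sketch below uses Python's re module on the frame defined above: with this data, r'^ba.$' matches no cell, so only the 'fff' pattern has any effect. The broader pattern r'^b..$' is a hypothetical addition for illustration and is not part of the original examples.

```python
import re
import pandas as pd

df = pd.DataFrame({'X': ['bbb', 'fff', 'bii'],
                   'Y': ['abc', 'brr', 'pqr']})

# Which cells does each pattern match? (plain `re`, independent of pandas)
cells = df['X'].tolist() + df['Y'].tolist()
matches_ba = [c for c in cells if re.search(r'^ba.$', c)]
matches_fff = [c for c in cells if re.search(r'fff', c)]
print(matches_ba)   # []
print(matches_fff)  # ['fff']

# Only 'fff' is replaced; the '^ba.$' entry has nothing to act on.
replaced = df.replace(regex={r'^ba.$': 'new', 'fff': 'pqr'})
print(replaced['X'].tolist())  # ['bbb', 'pqr', 'bii']

# Hypothetical pattern: any three-character string starting with 'b'.
broader = df.replace(regex=r'^b..$', value='new')
print(broader['X'].tolist())   # ['new', 'fff', 'new']
print(broader['Y'].tolist())   # ['abc', 'new', 'pqr']
```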
Note that when replacing multiple bool or datetime64 objects, the data types in the to_replace parameter
must match the data type of the value being replaced:
df = pd.DataFrame({'X': [True, False, True],
'Y': [False, True, False]})
df.replace({'a string': 'new value', True: False})  # raises
This raises a TypeError because one of the dict keys is not of the correct type for replacement.
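One way to avoid the type mismatch described above is to check the dict keys against the column type before calling replace. The following is a minimal sketch, not a pandas feature: the mapping, the names bad_keys and safe_mapping, and the filtering step are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'X': [True, False, True],
                   'Y': [False, True, False]})

# Illustrative mapping with one str key and one bool key.
mapping = {'a string': 'new value', True: False}

# Hypothetical pre-check: separate keys that are not bool from those that are.
bool_types = (bool, np.bool_)
bad_keys = [k for k in mapping if not isinstance(k, bool_types)]
safe_mapping = {k: v for k, v in mapping.items() if isinstance(k, bool_types)}
print(bad_keys)  # ['a string']

# With only bool keys left, every True in the frame becomes False.
result = df.replace(safe_mapping)
print(result['X'].tolist())  # [False, False, False]
```

Whether to drop the offending keys or report them is a design choice; the sketch simply filters them out.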
Compare the behavior of s.replace({'p': None}) and s.replace('p', None) to understand the peculiarities
of the to_replace parameter:
s = pd.Series([10, 'p', 'p', 'q', 'p'])
When a dict is passed as to_replace, the value(s) in the dict take the role of the value parameter.
s.replace({'p': None}) is equivalent to s.replace(to_replace={'p': None}, value=None, method=None):
s.replace({'p': None})
When value=None and to_replace is a scalar, list or tuple, replace uses the method parameter (default 'pad')
to do the replacement. This is why the 'p' values in rows 1 and 2 are replaced by 10, and the 'p' in row 4 by 'q'.
The command s.replace('p', None) is equivalent to s.replace(to_replace='p', value=None, method='pad'):
s.replace('p', None)
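The two outcomes described above can be written out explicitly. The sketch below uses the dict form to turn every 'p' into a missing value, then reproduces the forward-fill ("pad") result with Series.mask followed by Series.ffill. The mask/ffill combination is a substitute chosen here so the example does not depend on the method parameter; it is not a description of how replace works internally.

```python
import pandas as pd

s = pd.Series([10, 'p', 'p', 'q', 'p'])

# Dict form: every 'p' becomes a missing value.
as_missing = s.replace({'p': None})
print(as_missing.isna().tolist())  # [False, True, True, False, True]

# Forward-fill result spelled out: blank the 'p' entries, then carry the
# previous valid value forward.
padded = s.mask(s == 'p').ffill()
print(padded.tolist())  # [10, 10, 10, 'q', 'q']
```

Rows 1 and 2 take the value 10 from row 0, and row 4 takes 'q' from row 3, matching the description of the 'pad' behavior.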