Pandas Series: str.rsplit() function
The str.rsplit() function splits each string in a Series or Index around a given separator (delimiter).
Splitting starts from the end (right side) of each string, at the specified delimiter string. It is the element-wise equivalent of Python's built-in str.rsplit().
Syntax:
Series.str.rsplit(pat=None, n=-1, expand=False)
Parameters:
Name | Description | Type/Default Value | Required / Optional |
---|---|---|---|
pat | String to split on. If not specified, split on whitespace. Unlike str.split(), rsplit() treats pat as a literal string, not a regular expression. | str, default None | Optional |
n | Limit number of splits in output. None, 0 and -1 will be interpreted as return all splits. | int, default -1 (all) | Optional |
expand | Expand the split strings into separate columns. | bool, default False | Optional |
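Example - A minimal sketch of how pat is matched. It assumes, based on observed pandas behavior, that rsplit passes pat to Python's own str.rsplit, so a character such as "." is matched literally rather than as a regular-expression wildcard. The sample string "a.b.c" is made up for illustration:

```python
import pandas as pd

# Hypothetical sample data: a dotted name with two separators.
s = pd.Series(["a.b.c"])

# Split once, starting from the right. The "." is taken literally,
# so only the last dot is used as the split point.
parts = s.str.rsplit(".", n=1)
print(parts.iloc[0])
```

If a regular expression is needed, str.split() is the method that offers regex matching; rsplit() is best kept for plain separators.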
Returns: Series, Index, DataFrame or MultiIndex
Type matches caller unless expand=True (see Notes).
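Example - A short sketch of how the return type follows the caller and the expand flag. The sample values and variable names below are illustrative assumptions, not part of the original page:

```python
import pandas as pd

# Hypothetical sample data used only to inspect the return types.
s = pd.Series(["a_b", "c_d"])
idx = pd.Index(["a_b", "c_d"])

as_series = s.str.rsplit("_")               # Series of lists
as_frame = s.str.rsplit("_", expand=True)   # one column per piece
as_index = idx.str.rsplit("_")              # Index of lists
as_multi = idx.str.rsplit("_", expand=True) # one level per piece

for result in (as_series, as_frame, as_index, as_multi):
    print(type(result).__name__)
```

Calling rsplit on a Series gives a Series (or a DataFrame with expand=True), while calling it on an Index gives an Index (or a MultiIndex with expand=True).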
Note:
The handling of the n keyword depends on the number of found splits:
- If found splits > n, make only n splits (for rsplit, these are the n splits nearest the end of the string)
- If found splits <= n, make all splits
- If for a certain row the number of found splits < n, append None for padding up to n if expand=True
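Example - The three cases above can be sketched with a small made-up Series whose rows contain three, one and zero whitespace separators. The data and variable names are assumptions chosen for illustration:

```python
import pandas as pd

# Hypothetical rows with 3, 1 and 0 possible splits.
s = pd.Series(["a b c d", "a b", "a"])

# n=2: the first row is split only twice (from the right);
# the other rows have at most n splits, so all of them are made.
limited = s.str.rsplit(n=2)
print(limited)

# expand=True: shorter rows are padded with missing values so that
# every row has the same number of columns.
expanded = s.str.rsplit(n=2, expand=True)
print(expanded)
```

With expand=True the result has n + 1 columns here, because the longest row produced three pieces.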
Example - In the default setting, the string is split by whitespace:
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
"https://www.w3resource.com/pandas/index.php",
np.nan])
s.str.split()
Output:
0                         [this, is, my, new, pen]
1    [https://www.w3resource.com/pandas/index.php]
2                                              NaN
dtype: object
Example - Without the n parameter, the outputs of rsplit and split are identical:
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
"https://www.w3resource.com/pandas/index.php",
np.nan])
s.str.rsplit()
Output:
0                         [this, is, my, new, pen]
1    [https://www.w3resource.com/pandas/index.php]
2                                              NaN
dtype: object
Example - The n parameter can be used to limit the number of splits on the delimiter. The outputs of split and rsplit are different:
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
"https://www.w3resource.com/pandas/index.php",
np.nan])
s.str.split(n=2)
Output:
0                           [this, is, my new pen]
1    [https://www.w3resource.com/pandas/index.php]
2                                              NaN
dtype: object
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
"https://www.w3resource.com/pandas/index.php",
np.nan])
s.str.rsplit(n=2)
Output:
0                           [this is my, new, pen]
1    [https://www.w3resource.com/pandas/index.php]
2                                              NaN
dtype: object
Example - The pat parameter can be used to split by other characters:
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
"https://www.w3resource.com/pandas/index.php",
np.nan])
s.str.split(pat="/")
Output:
0                                 [this is my new pen]
1    [https:, , www.w3resource.com, pandas, index.php]
2                                                  NaN
dtype: object
Example - When using expand=True, the split elements will expand out into separate columns. If NaN is present, it is propagated throughout the columns during the split:
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
"https://www.w3resource.com/pandas/index.php",
np.nan])
s.str.split(expand=True)
Output:
                                             0     1     2     3     4
0                                         this    is    my   new   pen
1  https://www.w3resource.com/pandas/index.php  None  None  None  None
2                                          NaN   NaN   NaN   NaN   NaN
Example - For slightly more complex use cases, like splitting the document name from the end of a URL, a combination of parameter settings can be used:
Python-Pandas Code:
import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
"https://www.w3resource.com/pandas/index.php",
np.nan])
s.str.rsplit("/", n=1, expand=True)
Output:
                                   0          1
0                 this is my new pen       None
1  https://www.w3resource.com/pandas  index.php
2                                NaN        NaN
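Example - When only the last piece is needed, one possible variation is to skip expand=True and index into the resulting lists with .str[-1]. This is a sketch using the same sample Series; the variable name last_part is an assumption for illustration:

```python
import numpy as np
import pandas as pd

s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])

# Split once from the right, then take the last element of each list.
# Rows without "/" come back unchanged; NaN stays NaN.
last_part = s.str.rsplit("/", n=1).str[-1]
print(last_part)
```

Note that a row containing no "/" yields the whole original string, so the result may need a further check if only real file names are wanted.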