Examples

In [1]:
import numpy as np
import pandas as pd
In [2]:
s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])

By default, each string is split on whitespace.

In [3]:
s.str.split()
Out[3]:
0                         [this, is, my, new, pen]
1    [https://www.w3resource.com/pandas/index.php]
2                                              NaN
dtype: object

Without the n parameter, the outputs of rsplit and split are identical.

The n parameter limits the number of splits on the delimiter. Once n is set, split counts its splits from
the left and rsplit counts them from the right, so the two outputs can differ.

In [4]:
s.str.split(n=2)
Out[4]:
0                           [this, is, my new pen]
1    [https://www.w3resource.com/pandas/index.php]
2                                              NaN
dtype: object
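
To make the difference between split and rsplit concrete, the following minimal sketch applies both methods with n=2 to a copy of the series above. The names s_demo, left and right are illustrative and do not appear in the original example.

```python
import numpy as np
import pandas as pd

# Copy of the example series, rebuilt so the snippet runs on its own
s_demo = pd.Series(["this is my new pen",
                    "https://www.w3resource.com/pandas/index.php",
                    np.nan])

# split counts its two splits from the left, rsplit from the right
left = s_demo.str.split(n=2)
right = s_demo.str.rsplit(n=2)

print(left[0])   # ['this', 'is', 'my new pen']
print(right[0])  # ['this is my', 'new', 'pen']
```

The second string contains no whitespace, so both methods return it as a single-element list.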

The pat parameter can be used to split by other characters.

In [5]:
s.str.split(pat="/")
Out[5]:
0                                 [this is my new pen]
1    [https:, , www.w3resource.com, pandas, index.php]
2                                                  NaN
dtype: object

When using expand=True, the split elements will expand out into separate columns. If NaN is present,
it is propagated throughout the columns during the split.

In [6]:
s.str.split(expand=True)
Out[6]:
                                             0     1     2     3     4
0                                         this    is    my   new   pen
1  https://www.w3resource.com/pandas/index.php  None  None  None  None
2                                          NaN   NaN   NaN   NaN   NaN
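
The n and expand parameters can also be combined when a fixed number of columns is wanted. The sketch below is an illustration rather than part of the original example. It rebuilds the series under the assumed name s_demo and allows at most one split per string, so the result has two columns.

```python
import numpy as np
import pandas as pd

# Copy of the example series, rebuilt so the snippet runs on its own
s_demo = pd.Series(["this is my new pen",
                    "https://www.w3resource.com/pandas/index.php",
                    np.nan])

# At most one split per string, so the frame has at most two columns
parts = s_demo.str.split(n=1, expand=True)

print(parts.shape)             # (3, 2)
print(parts.iloc[0].tolist())  # ['this', 'is my new pen']
```

Rows with fewer pieces than columns are padded with missing values, and the NaN row stays missing in every column.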

Remember to escape special characters, such as + in the example below, when pat is a regular expression.

In [7]:
s = pd.Series(["1+1=2"])
In [8]:
s.str.split(r"\+|=", expand=True)
Out[8]:
   0  1  2
0  1  1  2
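
When the delimiters are held in a list, escaping them by hand can be error-prone. One possible approach, sketched below, builds the pattern with re.escape from the Python standard library. The names delimiters, pattern and s_demo are hypothetical, and the sketch relies on pandas treating a multi-character pat as a regular expression.

```python
import re

import pandas as pd

# Hypothetical list of literal delimiters to split on
delimiters = ["+", "="]

# re.escape backslash-escapes regex metacharacters such as "+"
pattern = "|".join(re.escape(d) for d in delimiters)

s_demo = pd.Series(["1+1=2"])
parts = s_demo.str.split(pat=pattern, expand=True)

print(parts.iloc[0].tolist())  # ['1', '1', '2']
```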