w3resource

Pandas Series: str.split() function

Series-str.split() function

The str.split() function is used to split strings around a given separator/delimiter.

The function splits each string in the Series/Index from the beginning, at the specified delimiter string. It is the element-wise equivalent of Python's built-in str.split().

Syntax:

Series.str.split(self, pat=None, n=-1, expand=False)

Parameters:

Name | Description | Type / Default Value | Required / Optional
pat | String or regular expression to split on. If not specified, split on whitespace. | str / Default Value: None | Optional
n | Limit on the number of splits in the output. None, 0 and -1 are interpreted as "return all splits". | int / Default Value: -1 (all) | Optional
expand | Expand the split strings into separate columns. If True, return a DataFrame/MultiIndex, expanding dimensionality. If False, return a Series/Index containing lists of strings. | bool / Default Value: False | Optional

Returns: Series, Index, DataFrame or MultiIndex
Type matches caller unless expand=True
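
The return-type rule above can be checked directly. The following is a minimal sketch, assuming a two-element Series and Index invented for illustration (the names ser, idx and the result variables are not part of the original page):

```python
import pandas as pd

# Hypothetical inputs used only to illustrate the return types.
ser = pd.Series(["a b", "c d"])
idx = pd.Index(["a b", "c d"])

ser_lists = ser.str.split()             # expand=False on a Series
idx_lists = idx.str.split()             # expand=False on an Index
ser_frame = ser.str.split(expand=True)  # expand=True on a Series
idx_multi = idx.str.split(expand=True)  # expand=True on an Index

# Print the class of each result to compare against the rule above.
for label, result in [
    ("Series, expand=False", ser_lists),
    ("Index, expand=False", idx_lists),
    ("Series, expand=True", ser_frame),
    ("Index, expand=True", idx_multi),
]:
    print(label, "->", type(result).__name__)
```

With expand=False the result keeps the caller's type and holds lists; with expand=True a Series becomes a DataFrame and an Index becomes a MultiIndex.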

Example - In the default setting, the string is split by whitespace:

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])
s.str.split()			   

Output:

0                         [this, is, my, new, pen]
1    [https://www.w3resource.com/pandas/index.php]
2                                              NaN
dtype: object

Example - The n parameter can be used to limit the number of splits on the delimiter:

Without the n parameter, the outputs of split and rsplit are identical. With n set, they differ: split counts its splits from the left, whereas rsplit counts from the right.

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])
s.str.split(n=2)			   

Output:

0                           [this, is, my new pen]
1    [https://www.w3resource.com/pandas/index.php]
2                                              NaN
dtype: object
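
To make the difference between split and rsplit concrete, here is a small sketch that applies both with n=2 to the first string from the example above. The variable names left and right are chosen for illustration only:

```python
import pandas as pd

s = pd.Series(["this is my new pen"])

# split() performs at most n splits starting from the left ...
left = s.str.split(n=2)
# ... while rsplit() performs at most n splits starting from the right.
right = s.str.rsplit(n=2)

print(left[0])   # ['this', 'is', 'my new pen']
print(right[0])  # ['this is my', 'new', 'pen']

# Without n, both methods return the same pieces.
print(s.str.split()[0] == s.str.rsplit()[0])  # True
```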

Example - The pat parameter can be used to split by other characters:

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])
s.str.split(pat = "/")		   

Output:

0                                 [this is my new pen]
1    [https:, , www.w3resource.com, pandas, index.php]
2                                                  NaN
dtype: object
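
Once a string has been split, individual pieces are often needed rather than the whole list. As a hedged sketch (the variable names parts, last and third are assumptions made for this illustration), the .str accessor can also index into the resulting lists:

```python
import numpy as np
import pandas as pd

s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])

parts = s.str.split(pat="/")

# .str[...] indexes into each list; -1 picks the last piece.
last = parts.str[-1]
# .str.get(i) does the same, returning NaN when a list is too short.
third = parts.str.get(2)

print(last)
print(third)
```

Rows whose list has no element at the requested position, and rows that were NaN to begin with, come back as NaN.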

Example - When using expand=True, the split elements will expand out into separate columns. If NaN is present, it is propagated throughout the columns during the split:

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])
s.str.split(expand=True)	   

Output:

                                             0     1     2     3     4
0                                         this    is    my   new   pen
1  https://www.w3resource.com/pandas/index.php  None  None  None  None
2                                          NaN   NaN   NaN   NaN   NaN
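
Combining expand=True with n gives control over how many columns are produced. The sketch below assumes a made-up Series of names and hypothetical column labels ("first", "rest"); it is an illustration, not part of the original example set:

```python
import pandas as pd

# Hypothetical data: names with a varying number of words, plus a missing value.
names = pd.Series(["Ada Lovelace", "Alan Mathison Turing", None])

# n=1 allows at most one split, so expand=True yields exactly two columns here.
parts = names.str.split(pat=" ", n=1, expand=True)
parts.columns = ["first", "rest"]  # illustrative column labels

print(parts)
```

Because only one split is allowed, everything after the first space stays together in the second column, and the missing row is propagated as missing in both columns.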

Example - Remember to escape special characters when explicitly using regular expressions:

Python-Pandas Code:

import pandas as pd
s = pd.Series(["1+1=2"])
s.str.split(r"\+|=", expand=True)

Output:

   0  1  2
0  1  1  2
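
When the delimiters are not known in advance, escaping them by hand is error-prone. One possible approach, sketched here under the assumption that the delimiters arrive as a list of literal strings (the names delimiters and pattern are hypothetical), is to build the pattern with re.escape from the Python standard library:

```python
import re

import pandas as pd

# Hypothetical list of literal delimiters; "+" is special in regular expressions.
delimiters = ["+", "="]

# re.escape() backslash-escapes regex metacharacters, so each delimiter
# is matched literally; "|" joins them into one alternation.
pattern = "|".join(re.escape(d) for d in delimiters)

s = pd.Series(["1+1=2"])
result = s.str.split(pat=pattern, expand=True)

print(pattern)
print(result)
```

The pattern is longer than one character, so pandas treats it as a regular expression, the same way as the hand-written r"\+|=" above.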
