Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point
numbers, Python objects, etc.). The axis labels are collectively referred to as the index.
import numpy as np
import pandas as pd
s = pd.Series(data, index=index)
Here, data can be one of several things: an ndarray, a Python dict, or a scalar value.
From ndarray
If data is an ndarray, index must be the same length as data. If no index is passed, one will be created having
values [0, ..., len(data) - 1].
s = pd.Series(np.random.randn(6), index=['p', 'q', 'r', 'n', 't', 'v'])
s
s.index
pd.Series(np.random.randn(6))
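The two index rules above can be checked with a small sketch. The variable names (values, default_labels, length_mismatch_raised) and the sample numbers are illustrative assumptions, not part of the original example.

```python
import numpy as np
import pandas as pd

values = np.array([10.0, 20.0, 30.0])

# No index passed: pandas builds the default 0 .. len(data) - 1 index.
default_indexed = pd.Series(values)
default_labels = list(default_indexed.index)

# An index whose length differs from the data is rejected with ValueError.
try:
    pd.Series(values, index=['a', 'b'])
    length_mismatch_raised = False
except ValueError:
    length_mismatch_raised = True
```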
From dict
Series can be instantiated from dicts:
n = {'q': 1, 'p': 2, 'r': 3}
pd.Series(n)
In the example above, on a Python version earlier than 3.6 or a pandas version earlier than 0.23,
the Series would instead be ordered by the sorted dict keys (i.e. ['p', 'q', 'r'] rather than ['q', 'p', 'r']).
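If an index is passed, the values in data corresponding to the labels in the index will be pulled out.
Labels in the index that are not keys of the dict get NaN, the standard missing-data marker in pandas.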
n = {'p': 2., 'q': 1., 'r': 3.}
pd.Series(n)
pd.Series(n, index=['q', 'r', 'n', 'p'])
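As a minimal sketch of what happens to the label 'n', which is not a key of the dict: the result follows the order of the passed index, and the unmatched label holds NaN. The names picked and missing_labels are illustrative.

```python
import pandas as pd

n = {'p': 2., 'q': 1., 'r': 3.}
picked = pd.Series(n, index=['q', 'r', 'n', 'p'])

# The result follows the order of the passed index, not of the dict.
result_order = list(picked.index)

# 'n' is not a key of the dict, so its value is NaN.
missing_labels = picked[picked.isna()].index.tolist()
```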
From scalar value
If data is a scalar value, an index must be provided. The value will be repeated to match the length of index.
pd.Series(4., index=['p', 'q', 'r', 'n', 't'])
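A short sketch of the repetition, using a shorter, illustrative index: the scalar is broadcast to every label, so the length of the Series is the length of the index.

```python
import pandas as pd

filled = pd.Series(4., index=['p', 'q', 'r'])

# One copy of the scalar per label.
n_values = len(filled)
all_equal_scalar = bool((filled == 4.0).all())
```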
Series is ndarray-like
Series acts very similarly to an ndarray, and is a valid argument to most NumPy functions. However, operations
such as slicing will also slice the index.
import numpy as np
import pandas as pd
s = pd.Series(np.random.randn(6), index=['p', 'q', 'r', 'n', 't', 'v'])
s.iloc[0]  # positional access; integer keys such as s[0] are deprecated on a label index
s.iloc[:4]
s[s > s.median()]
s.iloc[[5, 4, 3]]
np.exp(s)
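To make "slicing will also slice the index" concrete, here is a small sketch with fixed values instead of random ones, so the results are predictable. It assumes .iloc for positional access; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0], index=['p', 'q', 'r', 'n'])

# A positional slice carries the matching labels along.
head = s.iloc[:2]

# A boolean mask keeps the labels of the selected values (median is 2.5).
above_median = s[s > s.median()]

# A NumPy ufunc returns a Series with the original index.
exponentiated = np.exp(s)
```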
Like a NumPy array, a pandas Series has a dtype.
s.dtype
If you need the actual array backing a Series, use Series.array.
s.array
Accessing the array can be useful when you need to do some operation without the index.
While Series is ndarray-like, if you need an actual ndarray, then use Series.to_numpy().
s.to_numpy()
Even if the Series is backed by an ExtensionArray, Series.to_numpy() will return a NumPy ndarray.
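The difference between Series.array and Series.to_numpy() can be sketched with a categorical Series, chosen here only as one convenient example of an extension-backed Series. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

cat_series = pd.Series(['a', 'b', 'a'], dtype='category')

# .array exposes the backing pandas ExtensionArray (a Categorical here).
backing = cat_series.array

# .to_numpy() always hands back a plain NumPy ndarray.
as_numpy = cat_series.to_numpy()
```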
Series is dict-like
A Series is like a fixed-size dict in that you can get and set values by index label:
import numpy as np
import pandas as pd
s = pd.Series(np.random.randn(6), index=['p', 'q', 'r', 'n', 't', 'v'])
s['q']
s['n'] = 10.
s
'n' in s
'd' in s
If a label is not contained, an exception is raised:
s['d']  # raises KeyError: 'd'
Using the get method, a missing label returns None or the specified default:
s.get('d')
s.get('d', np.nan)
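The three lookup behaviors can be compared side by side in a short sketch with fixed values. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0], index=['p', 'q'])

present = s.get('q')           # label exists: its value is returned
absent = s.get('d')            # label missing, no default: None
fallback = s.get('d', np.nan)  # label missing: the supplied default

# Bracket access on a missing label raises instead of returning a default.
try:
    s['d']
    key_error_raised = False
except KeyError:
    key_error_raised = True
```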
Vectorized operations and label alignment with Series
Series can also be passed into most NumPy methods expecting an ndarray.
s + s
s * 2
np.exp(s)
A key difference between Series and ndarray is that operations between Series automatically align the data based
on label.
s[2:] + s[:-2]
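A sketch of alignment with fixed values: when the two operands do not share all labels, the result covers the union of the labels, and any label present in only one operand is NaN. The overlapping slices below are an illustrative choice so that some labels match and some do not.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0], index=['p', 'q', 'r', 'n'])

left = s.iloc[1:]    # labels q, r, n
right = s.iloc[:-1]  # labels p, q, r

# Values are matched by label, not by position.
total = left + right

# Labels found in only one operand come out as NaN.
unmatched = sorted(total[total.isna()].index)
```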
Name attribute
Series can also have a name attribute:
s = pd.Series(np.random.randn(6), name='research')
s
s.name
You can rename a Series with the pandas.Series.rename() method.
s2 = s.rename("search")
s2.name
Note that s and s2 refer to different objects.
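A quick check of that last point, as a sketch with fixed values: rename with a scalar returns a new Series and leaves the original name untouched.

```python
import pandas as pd

s = pd.Series([1.0, 2.0], name='research')
s2 = s.rename('search')

# The original keeps its name; only the returned Series carries the new one.
names = (s.name, s2.name)
same_object = s2 is s
```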