w3resource

Pandas: Series - rank() function

Compute numerical data ranks along axis

The rank() function computes numerical data ranks (1 through n) along an axis.

By default, equal values are assigned a rank that is the average of the ranks of those values.
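
The default tie handling described above can be sketched with a small Series. The values below are made up for illustration and are not part of the original example:

```python
import pandas as pd

# Illustrative data: the two 20s compete for positions 2 and 3.
s = pd.Series([10, 20, 20, 30])

# With the default method='average', each tied value gets (2 + 3) / 2 = 2.5.
ranks = s.rank()
print(ranks.tolist())  # [1.0, 2.5, 2.5, 4.0]
```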

Syntax:

Series.rank(axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False)

Parameters:

axis
  Description: Index to direct ranking.
  Type: {0 or 'index', 1 or 'columns'}
  Default Value: 0
  Optional

method
  Description: How to rank the group of records that have the same value (i.e. ties):
  • average: average rank of the group
  • min: lowest rank in the group
  • max: highest rank in the group
  • first: ranks assigned in the order they appear in the array
  • dense: like 'min', but rank always increases by 1 between groups
  Type: {'average', 'min', 'max', 'first', 'dense'}
  Default Value: 'average'
  Optional

numeric_only
  Description: For DataFrame objects, rank only numeric columns if set to True.
  Type: bool
  Optional

na_option
  Description: How to rank NaN values:
  • keep: assign NaN rank to NaN values
  • top: assign smallest rank to NaN values if ascending
  • bottom: assign highest rank to NaN values if ascending
  Type: {'keep', 'top', 'bottom'}
  Default Value: 'keep'
  Optional

ascending
  Description: Whether or not the elements should be ranked in ascending order.
  Type: bool
  Default Value: True
  Optional

pct
  Description: Whether or not to display the returned rankings in percentile form.
  Type: bool
  Default Value: False
  Optional
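
To make the five tie-breaking options of the method parameter easier to compare, here is a minimal sketch that applies each one to the same Series. The data and the names `methods` and `comparison` are illustrative assumptions, not part of the original page:

```python
import pandas as pd

# Illustrative data: three tied 4s occupy positions 2, 3 and 4.
s = pd.Series([4, 4, 4, 8, 2])

# Rank the same values once per tie-breaking method.
methods = ['average', 'min', 'max', 'first', 'dense']
comparison = pd.DataFrame({m: s.rank(method=m) for m in methods})
comparison.insert(0, 'value', s)
print(comparison)
```

In this sketch the tied values receive 3.0 under 'average', 2.0 under 'min', 4.0 under 'max', and 2.0, 3.0, 4.0 in order of appearance under 'first'. Under 'dense' they receive 2.0, and the next distinct value (8) receives 3.0 rather than 5.0.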

Returns: same type as caller
Return a Series or DataFrame with data ranks as values.

Example:

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame(data={'Animal': ['lion', 'fox', 'cow',
                                   'spider', 'snake'],
                        'Number_legs': [4, 4, 4, 8, np.nan]})
df

Output:

   Animal  Number_legs
0    lion          4.0
1     fox          4.0
2     cow          4.0
3  spider          8.0
4   snake          NaN

The following example shows how the method behaves with the above parameters:

  • default_rank: this is the default behaviour obtained without using any parameter.
  • max_rank: with method = 'max', records that share the same value receive the highest rank in their group (e.g., since ‘lion’, ‘fox’ and ‘cow’ occupy the 1st, 2nd and 3rd positions, each is assigned rank 3).
  • NA_bottom: choosing na_option = 'bottom', if there are records with NaN values they are placed at the bottom of the ranking.
  • pct_rank: when setting pct = True, the ranking is expressed as percentile rank.

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame(data={'Animal': ['lion', 'fox', 'cow',
                                   'spider', 'snake'],
                        'Number_legs': [4, 4, 4, 8, np.nan]})
df['default_rank'] = df['Number_legs'].rank()
df['max_rank'] = df['Number_legs'].rank(method='max')
df['NA_bottom'] = df['Number_legs'].rank(na_option='bottom')
df['pct_rank'] = df['Number_legs'].rank(pct=True)
df

Output:

   Animal  Number_legs  default_rank  max_rank  NA_bottom  pct_rank
0    lion          4.0           2.0       3.0        2.0       0.5
1     fox          4.0           2.0       3.0        2.0       0.5
2     cow          4.0           2.0       3.0        2.0       0.5
3  spider          8.0           4.0       4.0        4.0       1.0
4   snake          NaN           NaN       NaN        5.0       NaN
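
The example above does not exercise na_option = 'top' or ascending = False, so the following sketch applies them to the same leg counts. It also checks one reading of the pct column: with the default settings, the percentile rank appears to be the ordinary rank divided by the number of non-NaN values. The variable names (`legs`, `na_top`, `descending`, `pct_manual`) are illustrative assumptions:

```python
import numpy as np
import pandas as pd

# Same leg counts as above, indexed by animal for readability.
legs = pd.Series([4, 4, 4, 8, np.nan],
                 index=['lion', 'fox', 'cow', 'spider', 'snake'])

# na_option='top': the NaN entry takes rank 1 and the other ranks shift up.
na_top = legs.rank(na_option='top')

# ascending=False: the largest value (8) takes rank 1; NaN stays NaN.
descending = legs.rank(ascending=False)

# Assumed relationship for the defaults: pct rank = rank / count of non-NaN values.
pct_manual = legs.rank() / legs.count()

print(pd.DataFrame({'na_top': na_top,
                    'descending': descending,
                    'pct_manual': pct_manual}))
```

Here `legs.count()` is 4 because it ignores the NaN entry, which is why the tied animals land on 2.0 / 4 = 0.5.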
