w3resource

Pandas: Sort a given DataFrame by two or more columns

Pandas: DataFrame Exercise-50 with Solution

Write a Pandas program to sort a given DataFrame by two or more columns.

Sample Solution:

Python Code:

import pandas as pd
import numpy as np

# Exam records; np.nan marks a missing score.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
        'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
        'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
df = pd.DataFrame(exam_data)
print("Original DataFrame:")
print(df)
print("\nSort the above DataFrame on attempts, name:")
# Sort by attempts first, then by name among rows with equal attempts (both ascending).
df = df.sort_values(['attempts', 'name'], ascending=[True, True])
print(df)

Sample Output:

Original DataFrame:
        name  score  attempts qualify
0  Anastasia   12.5         1     yes
1       Dima    9.0         3      no
2  Katherine   16.5         2     yes
3      James    NaN         3      no
4      Emily    9.0         2      no
5    Michael   20.0         3     yes
6    Matthew   14.5         1     yes
7      Laura    NaN         1      no
8      Kevin    8.0         2      no
9      Jonas   19.0         1     yes

Sort the above DataFrame on attempts, name:
        name  score  attempts qualify
0  Anastasia   12.5         1     yes
9      Jonas   19.0         1     yes
7      Laura    NaN         1      no
6    Matthew   14.5         1     yes
4      Emily    9.0         2      no
2  Katherine   16.5         2     yes
8      Kevin    8.0         2      no
1       Dima    9.0         3      no
3      James    NaN         3      no
5    Michael   20.0         3     yes

Explanation:

The above code creates a Pandas DataFrame df using the dictionary exam_data. The DataFrame has four columns named name, score, attempts, and qualify.

df = df.sort_values(['attempts', 'name'], ascending=[True, True]): Here the sort_values() method sorts the DataFrame by two columns, ‘attempts’ and ‘name’. The ascending parameter takes one boolean per column, in the same order as the column list; [True, True] requests ascending order for both. The rows are therefore ordered by ‘attempts’ first, and rows with the same number of attempts are then ordered alphabetically by ‘name’. sort_values() returns a new DataFrame, which is assigned back to ‘df’.
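
Because ascending accepts one flag per column, the two columns do not have to share a direction. The following is a minimal sketch on a smaller, illustrative subset of the exercise data (the variable name mixed is only for this example): attempts is sorted in ascending order while name is sorted in descending order within each attempts value.

```python
import pandas as pd

# Illustrative subset of the exercise data (not the full exam_data table).
df = pd.DataFrame({
    'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily'],
    'attempts': [1, 3, 2, 3, 2],
})

# One flag per column: attempts ascending, name descending within ties.
mixed = df.sort_values(['attempts', 'name'], ascending=[True, False])
print(mixed)
```

The list passed to ascending should have the same length as the list of column names.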

Finally, the print() function prints the sorted DataFrame ‘df’.
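
Note that the printed result keeps the original index labels, so they appear out of order after sorting. If consecutive labels are preferred, one possible approach (a sketch on illustrative data, with hypothetical variable names sorted_df and renumbered) is to call reset_index(drop=True) on the sorted result. Recent pandas versions (1.0 and later) also accept ignore_index=True in sort_values() for the same effect.

```python
import pandas as pd

# Illustrative subset of the exercise data.
df = pd.DataFrame({
    'name': ['Anastasia', 'Dima', 'Katherine', 'Jonas'],
    'attempts': [1, 3, 2, 1],
})

sorted_df = df.sort_values(['attempts', 'name'], ascending=[True, True])
print(sorted_df.index.tolist())  # original labels, now out of order

# drop=True discards the old labels instead of keeping them as a column.
renumbered = sorted_df.reset_index(drop=True)
print(renumbered)
```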
