Pandas: Remove the duplicates of a specific column in a given dataframe

Last update on September 06 2025 12:51:18 (UTC/GMT +8 hours)

5. Duplicate Removal in 'WHO region'

Write a Pandas program to remove the duplicates from the 'WHO region' column of the World alcohol consumption dataset, so that only the first row for each region remains.

Test Data:

   Year       WHO region                Country Beverage Types  Display Value
0  1986  Western Pacific               Viet Nam           Wine           0.00
1  1986         Americas                Uruguay          Other           0.50
2  1985           Africa          Côte d'Ivoire           Wine           1.62
3  1986         Americas               Colombia           Beer           4.27
4  1987         Americas  Saint Kitts and Nevis           Beer           1.98

Sample Solution:

Python Code:

import pandas as pd

# Load the World alcohol consumption data
w_a_con = pd.read_csv('world_alcohol.csv')
print("World alcohol consumption sample data:")
print(w_a_con.head())

# Keep only the first row for each distinct 'WHO region' value
print("\nAfter removing the duplicates of WHO region column:")
print(w_a_con.drop_duplicates(subset='WHO region'))

Sample Output:

World alcohol consumption sample data:
   Year       WHO region      ...      Beverage Types Display Value
0  1986  Western Pacific      ...                Wine          0.00
1  1986         Americas      ...               Other          0.50
2  1985           Africa      ...                Wine          1.62
3  1986         Americas      ...                Beer          4.27
4  1987         Americas      ...                Beer          1.98

[5 rows x 5 columns]

After removing the duplicates of WHO region column:
    Year             WHO region      ...      Beverage Types Display Value
0   1986        Western Pacific      ...                Wine          0.00
1   1986               Americas      ...               Other          0.50
2   1985                 Africa      ...                Wine          1.62
13  1984  Eastern Mediterranean      ...               Other          0.00
18  1984                 Europe      ...             Spirits          1.62
20  1986        South-East Asia      ...                Wine          0.00

[6 rows x 5 columns]
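
The result keeps the original index labels (0, 1, 2, 13, 18, 20) because drop_duplicates keeps the first row it meets for each region and does not renumber. The following is a minimal sketch of that behaviour that runs without the CSV file. It assumes only the five test rows shown earlier, typed in by hand; the variable names are illustrative and not part of the original solution.

```python
import pandas as pd

# The five test rows from this exercise, entered by hand so no CSV is needed.
sample = pd.DataFrame({
    "Year": [1986, 1986, 1985, 1986, 1987],
    "WHO region": ["Western Pacific", "Americas", "Africa", "Americas", "Americas"],
    "Country": ["Viet Nam", "Uruguay", "Côte d'Ivoire", "Colombia",
                "Saint Kitts and Nevis"],
    "Beverage Types": ["Wine", "Other", "Wine", "Beer", "Beer"],
    "Display Value": [0.00, 0.50, 1.62, 4.27, 1.98],
})

# keep="first" is the default: the first row seen for each region survives.
first_per_region = sample.drop_duplicates(subset="WHO region")

# keep="last" keeps the final row seen for each region instead.
last_per_region = sample.drop_duplicates(subset="WHO region", keep="last")

# Original index labels are preserved; reset_index(drop=True) renumbers from 0.
renumbered = first_per_region.reset_index(drop=True)

print(first_per_region)
print(last_per_region)
print(renumbered)
```

On these five rows, the default keeps rows 0, 1 and 2, while keep="last" keeps rows 0, 2 and 4, since row 4 is the last 'Americas' entry.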


For more Practice: Solve these Related Problems:

Write a Pandas program to extract unique values from the 'WHO region' column and then sort them in descending order.
Write a Pandas program to drop duplicate rows based solely on the 'WHO region' column while keeping the last occurrence.
Write a Pandas program to identify duplicate entries in 'WHO region' and then create a new DataFrame with only the unique values.
Write a Pandas program to remove duplicates in 'WHO region' and then count the frequency of each unique region in the dataset.
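
One possible starting point for the four problems above is sketched below. It is not an official solution: the small DataFrame is a hypothetical stand-in for world_alcohol.csv built from the five test rows, and the reading of each problem (for example, counting frequencies on the full data rather than on the de-duplicated rows) is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for world_alcohol.csv, limited to the five test rows.
df = pd.DataFrame({
    "Year": [1986, 1986, 1985, 1986, 1987],
    "WHO region": ["Western Pacific", "Americas", "Africa", "Americas", "Americas"],
    "Display Value": [0.00, 0.50, 1.62, 4.27, 1.98],
})

# 1. Unique regions, sorted in descending (reverse alphabetical) order.
regions_desc = sorted(df["WHO region"].unique(), reverse=True)

# 2. One row per region, keeping the last occurrence.
last_rows = df.drop_duplicates(subset="WHO region", keep="last")

# 3. Flag rows whose region has already appeared, then keep the unflagged rows.
is_repeat = df["WHO region"].duplicated()
unique_rows = df[~is_repeat]

# 4. How often each region occurs in the full data.
region_counts = df["WHO region"].value_counts()

print(regions_desc)
print(last_rows)
print(unique_rows)
print(region_counts)
```

Here duplicated() returns a boolean mask, so filtering with ~is_repeat selects the same rows as drop_duplicates with the default keep="first".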
