w3resource

Pandas: Filter rows where a given column contains a particular substring

Pandas Filter: Exercise-16 with Solution

Write a Pandas program to filter the records in the world alcohol consumption dataset whose WHO region contains the substring "Ea".

Test Data:

   Year       WHO region                Country Beverage Types  Display Value
0  1986  Western Pacific               Viet Nam           Wine           0.00
1  1986         Americas                Uruguay          Other           0.50
2  1985           Africa          Côte d'Ivoire           Wine           1.62
3  1986         Americas               Colombia           Beer           4.27
4  1987         Americas  Saint Kitts and Nevis           Beer           1.98   

Sample Solution:

Python Code:

import pandas as pd
# World alcohol consumption data
w_a_con = pd.read_csv('world_alcohol.csv')
print("World alcohol consumption sample data:")
print(w_a_con.head())
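# Drop rows with NA / NaN in any column so str.contains() yields a clean boolean mask
new_w_a_con = w_a_con.dropna()
print("\nMatch if  a given column has a particular sub string:")
# Keep rows whose "WHO region" contains the case-sensitive substring "Ea"
print(new_w_a_con[new_w_a_con["WHO region"].str.contains("Ea")])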
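Note that dropna() removes every row with a missing value in any column, not only in "WHO region". As a possible alternative, the sketch below passes na=False to str.contains() so that missing regions simply count as non-matches. It uses a small hand-made frame; the rows are illustrative assumptions and are not taken from world_alcohol.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frame; the values are illustrative only
df = pd.DataFrame({
    "WHO region": ["Eastern Mediterranean", "Americas", np.nan, "South-East Asia"],
    "Display Value": [0.0, 0.5, 1.62, np.nan],
})

# na=False: a missing region is treated as "no match" instead of NaN in the mask
# regex=False: "Ea" is matched as a literal substring, not as a regular expression
mask = df["WHO region"].str.contains("Ea", na=False, regex=False)
filtered = df[mask]
print(filtered)
```

In this sketch the last row is kept even though its "Display Value" is missing, whereas calling dropna() first would have discarded it.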

Sample Output:

World alcohol consumption sample data:
   Year       WHO region      ...      Beverage Types Display Value
0  1986  Western Pacific      ...                Wine          0.00
1  1986         Americas      ...               Other          0.50
2  1985           Africa      ...                Wine          1.62
3  1986         Americas      ...                Beer          4.27
4  1987         Americas      ...                Beer          1.98

[5 rows x 5 columns]

Match if  a given column has a particular sub string:
    Year             WHO region      ...      Beverage Types Display Value
13  1984  Eastern Mediterranean      ...               Other          0.00
20  1986        South-East Asia      ...                Wine          0.00
25  1984  Eastern Mediterranean      ...               Other          0.00
27  1984  Eastern Mediterranean      ...                Beer          2.22
36  1987  Eastern Mediterranean      ...                Beer          0.07
38  1987  Eastern Mediterranean      ...               Other          0.00
52  1986  Eastern Mediterranean      ...                Wine          0.00
53  1984  Eastern Mediterranean      ...                Beer          0.00
58  1984  Eastern Mediterranean      ...             Spirits          0.00
59  1989  Eastern Mediterranean      ...               Other          0.00
60  1987  Eastern Mediterranean      ...               Other          0.00
63  1985  Eastern Mediterranean      ...               Other          0.00
65  1989  Eastern Mediterranean      ...                Beer          0.00
66  1987  Eastern Mediterranean      ...                Wine          0.01
73  1986  Eastern Mediterranean      ...               Other          0.01
75  1989  Eastern Mediterranean      ...               Other          0.00
84  1986        South-East Asia      ...               Other          0.00
87  1989  Eastern Mediterranean      ...                Wine          0.01
88  1987  Eastern Mediterranean      ...                Beer          0.42
89  1986  Eastern Mediterranean      ...                Wine          0.70
97  1984        South-East Asia      ...                Wine          0.00
99  1985        South-East Asia      ...                Wine          0.00

[22 rows x 5 columns]
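The result includes the South-East Asia rows because str.contains() looks for the substring anywhere in the value ("East" in this case) and is case-sensitive by default. The following is a minimal sketch on a hand-written Series; "Oceania" is not one of the dataset's regions and is included only as an assumption to show the effect of case=False.

```python
import pandas as pd

# Hand-written labels; "Oceania" is a made-up entry for illustration
regions = pd.Series(["Eastern Mediterranean", "South-East Asia", "Oceania", "Europe"])

# Default behaviour: case-sensitive, so only a capital "E" followed by "a" matches
case_sensitive = regions.str.contains("Ea")

# case=False also matches lowercase "ea", as in "Oceania"
case_insensitive = regions.str.contains("Ea", case=False)

print(pd.DataFrame({
    "region": regions,
    "case_sensitive": case_sensitive,
    "case_insensitive": case_insensitive,
}))
```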

Click to download world_alcohol.csv


