Pandas: Remove the HTML tags within the specified column of a given DataFrame
Pandas: String and Regular Expression Exercise-41 with Solution
Write a Pandas program to remove the HTML tags from a specified column of a given DataFrame.
Sample Solution:
Python Code:
import re

import pandas as pd

df = pd.DataFrame({
    'company_code': ['Abcd', 'EFGF', 'zefsalf', 'sdfslew', 'zekfsdf'],
    'date_of_sale': ['12/05/2002', '16/02/1999', '05/09/1998', '12/02/2022', '15/09/1997'],
    'address': ['9910 Surrey <b>Avenue</b>', '92 N. Bishop Avenue', '9910 <br>Golden Star Avenue', '102 Dunbar <i></i>St.', '17 West Livingston Court']
})
print("Original DataFrame:")
print(df)

def remove_tags(string):
    # Non-greedy match: delete everything from a '<' up to the nearest '>'
    return re.sub(r'<.*?>', '', string)

# Apply the helper to every address and store the result in a new column
df['with_out_tags'] = df['address'].apply(remove_tags)
print("\nAddresses without tags:")
print(df)
Sample Output:
Original DataFrame:
  company_code  ...                      address
0         Abcd  ...    9910 Surrey <b>Avenue</b>
1         EFGF  ...          92 N. Bishop Avenue
2      zefsalf  ...  9910 <br>Golden Star Avenue
3      sdfslew  ...        102 Dunbar <i></i>St.
4      zekfsdf  ...     17 West Livingston Court

[5 rows x 3 columns]

Addresses without tags:
  company_code  ...             with_out_tags
0         Abcd  ...        9910 Surrey Avenue
1         EFGF  ...       92 N. Bishop Avenue
2      zefsalf  ...   9910 Golden Star Avenue
3      sdfslew  ...            102 Dunbar St.
4      zekfsdf  ...  17 West Livingston Court

[5 rows x 4 columns]
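The solution above calls a Python function once per row. As a hedged sketch of two alternatives, the same pattern can be passed to the vectorized Series.str.replace method, or the tags can be stripped with the standard library's html.parser module when the text may also contain entities such as &amp;. The names _TextExtractor, strip_tags_parsed, via_regex and via_parser are illustrative and do not come from the original exercise.

```python
from html.parser import HTMLParser

import pandas as pd


class _TextExtractor(HTMLParser):
    """Hypothetical helper: keeps only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; entities are already unescaped
        self.parts.append(data)


def strip_tags_parsed(fragment):
    """Return the fragment's text content with all tags dropped."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    return ''.join(parser.parts)


addresses = pd.Series([
    '9910 Surrey <b>Avenue</b>',
    '92 N. Bishop Avenue',
    '9910 <br>Golden Star Avenue',
    '102 Dunbar <i></i>St.',
    '17 West Livingston Court',
])

# Alternative 1: vectorized regex replacement, same pattern as the solution
via_regex = addresses.str.replace(r'<.*?>', '', regex=True)

# Alternative 2: parser-based stripping, applied element by element
via_parser = addresses.apply(strip_tags_parsed)

print(via_regex.tolist())
print(via_parser.tolist())
```

For the sample addresses both approaches should give the same strings. The regex version is shorter, while the parser version also decodes entities (for example, 'Fish &amp; <b>Chips</b>' becomes 'Fish & Chips'), which the regex leaves untouched.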
https://w3resource.com/python-exercises/pandas/string/python-pandas-string-exercise-41.php