Cleaning text data using str.replace() in Pandas
Pandas: Data Cleaning and Preprocessing Exercise-8 with Solution
Write a Pandas program that cleans a text column with str.replace().
This exercise shows how to clean text data by replacing specific substrings in a column using str.replace().
Sample Solution:
Code:
import pandas as pd
# Create a sample DataFrame with messy text
df = pd.DataFrame({
    'Product': ['$50-Discount', '$100-Off', '$200-Rebate']
})
# Clean the text by removing special characters like '$' and '-'
df['Product_Cleaned'] = df['Product'].str.replace('[$-]', '', regex=True)
# Output the result
print(df)
Output:
        Product Product_Cleaned
0  $50-Discount      50Discount
1      $100-Off          100Off
2   $200-Rebate       200Rebate
Explanation:
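- Created a DataFrame whose 'Product' column contains the special characters $ and -.
- Used str.replace() with the regular expression [$-], a character class that matches either $ or -, and replaced every match with an empty string. regex=True is passed explicitly because str.replace() treats the pattern as a literal string by default in pandas 2.0 and later.
- Stored the result in a new column, 'Product_Cleaned', leaving the original 'Product' column unchanged.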
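As a point of comparison, the same cleanup can be sketched without a regular expression by chaining two literal replacements. The example below is a minimal illustration, not part of the original solution; the variable names (products, cleaned_regex, cleaned_literal) are made up for this sketch.

```python
import pandas as pd

# Same sample values as the exercise, held in a Series for brevity
products = pd.Series(['$50-Discount', '$100-Off', '$200-Rebate'])

# Regex mode: '[$-]' is a character class matching either '$' or '-'
cleaned_regex = products.str.replace('[$-]', '', regex=True)

# Literal mode: each call removes one exact substring, so the calls are chained
cleaned_literal = (
    products
    .str.replace('$', '', regex=False)
    .str.replace('-', '', regex=False)
)

# Both approaches should produce identical results
print(cleaned_regex.equals(cleaned_literal))
```

The regex version is shorter when several characters have to go at once. The literal version avoids having to think about regex metacharacters such as $, which otherwise act as an end-of-string anchor outside a character class.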