Cleaning text data using str.replace() in Pandas
8. Converting Data Types and Column Operations
Write a Pandas program that handles text data with str.replace().
This exercise shows how to clean text data by replacing specific substrings in a column using str.replace().
Sample Solution:
Code:
import pandas as pd
# Create a sample DataFrame with messy text
df = pd.DataFrame({
    'Product': ['$50-Discount', '$100-Off', '$200-Rebate']
})
# Clean the text by removing special characters like '$' and '-'
df['Product_Cleaned'] = df['Product'].str.replace('[$-]', '', regex=True)
# Output the result
print(df)
Output:
        Product  Product_Cleaned
0  $50-Discount       50Discount
1      $100-Off           100Off
2   $200-Rebate        200Rebate
Explanation:
- Created a DataFrame with text data that contains special characters.
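- Used str.replace() with the regular expression [$-], a character class that matches either $ or -, to remove those characters from the 'Product' column. Since pandas 2.0 the regex parameter defaults to False, so regex=True must be passed explicitly for the pattern to be treated as a regular expression.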
- Added a new column 'Product_Cleaned' with the cleaned text.
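To make the role of the regex flag concrete, the following minimal sketch applies the same pattern in both modes to the sample values and also shows a chained literal replacement as one possible alternative. The variable names (products, cleaned_regex, unchanged, cleaned_literal) are illustrative and not part of the original solution.

```python
import pandas as pd

# Same sample values as the exercise, held in a Series for brevity
products = pd.Series(['$50-Discount', '$100-Off', '$200-Rebate'])

# regex=True: '[$-]' is a character class matching either '$' or '-'
cleaned_regex = products.str.replace('[$-]', '', regex=True)

# regex=False: the pattern is the literal text '[$-]', which never
# occurs in these values, so they come back unchanged
unchanged = products.str.replace('[$-]', '', regex=False)

# Literal alternative: chain one plain replacement per character
cleaned_literal = (
    products.str.replace('$', '', regex=False)
            .str.replace('-', '', regex=False)
)

print(cleaned_regex.tolist())
print(unchanged.tolist())
print(cleaned_literal.tolist())
```

The chained literal form avoids regular-expression escaping concerns when only a few fixed characters need removing, while the character class scales better as the set of characters grows.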
For more practice, solve these related problems:
- Write a Pandas program to convert a DataFrame column from object to datetime and extract specific components (e.g., month).
- Write a Pandas program to convert multiple columns to numeric types and handle errors during conversion.
- Write a Pandas program to strip whitespace from all string columns and then convert them to lowercase.
- Write a Pandas program to rename columns to a standardized format after performing data type conversions.