w3resource

Cleaning text data using str.replace() in Pandas


8. Converting Data Types and Column Operations

Write a Pandas program that cleans text data in a column with str.replace().

This exercise shows how to clean text data by replacing specific substrings in a column using str.replace(). The substrings can be matched literally or with a regular expression.

Sample Solution:

Code:

import pandas as pd

# Create a sample DataFrame with messy text
df = pd.DataFrame({
    'Product': ['$50-Discount', '$100-Off', '$200-Rebate']
})

# Clean the text by removing every '$' and '-'.
# The character class [$-] matches either character; regex=True enables pattern matching.
df['Product_Cleaned'] = df['Product'].str.replace(r'[$-]', '', regex=True)

# Output the result
print(df)

Output:

        Product Product_Cleaned
0  $50-Discount      50Discount
1      $100-Off          100Off
2   $200-Rebate       200Rebate

Explanation:

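  • Created a DataFrame whose 'Product' column holds text containing the special characters $ and -.
  • Called str.replace() with the regular expression [$-] and regex=True to remove every $ and - from the 'Product' column. Inside a character class, $ is a literal character, and a - placed last is also literal. Since pandas 2.0 the regex parameter defaults to False, so it must be set explicitly for the pattern to be treated as a regular expression.
  • Stored the result in a new column, 'Product_Cleaned', leaving the original column unchanged.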
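The difference between literal and regex replacement can be sketched as follows. This is a minimal illustration using the same sample data; the variable names no_dollar, cleaned and cleaned_literal are chosen here for the example and are not part of the original solution.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['$50-Discount', '$100-Off', '$200-Rebate']
})

# Literal replacement: with regex=False, '$' is treated as a plain character
no_dollar = df['Product'].str.replace('$', '', regex=False)

# Regex replacement: the character class [$-] matches either '$' or '-'
cleaned = df['Product'].str.replace(r'[$-]', '', regex=True)

# Chained literal replacements remove the same characters without a regex
cleaned_literal = (
    df['Product']
    .str.replace('$', '', regex=False)
    .str.replace('-', '', regex=False)
)

print(no_dollar.tolist())               # ['50-Discount', '100-Off', '200-Rebate']
print(cleaned.tolist())                 # ['50Discount', '100Off', '200Rebate']
print(cleaned.equals(cleaned_literal))  # True
```

Chaining literal replacements may be easier to read when only a few fixed characters are involved, while a character class scales better as the set of unwanted characters grows.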

For more Practice: Solve these Related Problems:

  • Write a Pandas program to convert a DataFrame column from object to datetime and extract specific components (e.g., month).
  • Write a Pandas program to convert multiple columns to numeric types and handle errors during conversion.
  • Write a Pandas program to strip whitespace from all string columns and then convert them to lowercase.
  • Write a Pandas program to rename columns to a standardized format after performing data type conversions.
