NLTK Tokenize: Remove username handles from a tweet
NLTK Tokenize: Exercise-7 with Solution
Write a Python NLTK program to remove Twitter username handles (such as @abcd) from the text of a given tweet.
Sample Solution:
Python Code:
from nltk.tokenize import TweetTokenizer

# strip_handles=True tells the tokenizer to drop @username handles
tknzr = TweetTokenizer(strip_handles=True)
tweet_text = "@abcd @pqrs NoSQL introduction - w3resource http://bit.ly/1ngHC5F #nosql #database #webdev"
print("\nOriginal Tweet:")
print(tweet_text)
# tokenize() returns a list of tokens; the handles are already removed
result = tknzr.tokenize(tweet_text)
print("\nTokenize a twitter text:")
print(result)
Sample Output:
Original Tweet:
@abcd @pqrs NoSQL introduction - w3resource http://bit.ly/1ngHC5F #nosql #database #webdev

Tokenize a twitter text:
['NoSQL', 'introduction', '-', 'w3resource', 'http://bit.ly/1ngHC5F', '#nosql', '#database', '#webdev']
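To make it clearer what stripping handles involves, here is a minimal sketch that does roughly the same job with only the standard re module. The pattern HANDLE_RE and the helper strip_handles are hypothetical names introduced for illustration. The pattern is an approximation and is not the one NLTK uses internally. It assumes a handle is "@" followed by 1 to 15 ASCII letters, digits, or underscores. It returns a cleaned string rather than a list of tokens.

```python
import re

# Hypothetical pattern (an assumption, not NLTK's own regex):
# "@" followed by 1-15 ASCII word characters, and not preceded by a
# word character, so an address like "user@example.com" is left alone.
HANDLE_RE = re.compile(r"(?<!\w)@\w{1,15}\b", re.ASCII)


def strip_handles(text):
    """Remove handle-like substrings and collapse the leftover whitespace."""
    without_handles = HANDLE_RE.sub(" ", text)
    return " ".join(without_handles.split())


tweet_text = "@abcd @pqrs NoSQL introduction - w3resource http://bit.ly/1ngHC5F #nosql #database #webdev"
print(strip_handles(tweet_text))
```

Unlike TweetTokenizer, this sketch does not tokenize the text, so the URL and hashtags simply remain as part of the returned string. For real tweet processing the NLTK solution above is the more robust choice.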