NLTK Tokenize: Split all punctuation into separate tokens
NLTK Tokenize: Exercise-4 with Solution
Write a Python NLTK program to split all punctuation into separate tokens.
Sample Solution:
Python Code:
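from nltk.tokenize import WordPunctTokenizer

text = "Reset your password if you just can't remember your old one."
print("\nOriginal string:")
print(text)

# WordPunctTokenizer splits text into runs of word characters and runs of punctuation,
# so the apostrophe in "can't" and the final period become separate tokens.
result = WordPunctTokenizer().tokenize(text)
print("\nSplit all punctuation into separate tokens:")
print(result)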
Sample Output:
Original string:
Reset your password if you just can't remember your old one.

Split all punctuation into separate tokens:
['Reset', 'your', 'password', 'if', 'you', 'just', 'can', "'", 't', 'remember', 'your', 'old', 'one', '.']
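To see why "can't" becomes three tokens, it may help to reproduce the behaviour without NLTK. The NLTK documentation describes WordPunctTokenizer as a regular-expression tokenizer using the pattern \w+|[^\w\s]+, so the sketch below applies that pattern with Python's built-in re module. The names WORD_PUNCT_PATTERN and split_punct are illustrative helpers, not part of NLTK, and this is a rough equivalent rather than the library's actual implementation.

```python
import re

# Pattern documented for NLTK's WordPunctTokenizer: a run of word characters,
# or a run of characters that are neither word characters nor whitespace.
WORD_PUNCT_PATTERN = re.compile(r"\w+|[^\w\s]+")


def split_punct(text):
    """Return word and punctuation tokens (illustrative helper, not an NLTK API)."""
    return WORD_PUNCT_PATTERN.findall(text)


text = "Reset your password if you just can't remember your old one."
print(split_punct(text))
# ['Reset', 'your', 'password', 'if', 'you', 'just', 'can', "'", 't', 'remember', 'your', 'old', 'one', '.']

# Adjacent punctuation characters stay together as one token.
print(split_punct("Wait...what?!"))
# ['Wait', '...', 'what', '?!']
```

Note that the pattern groups consecutive punctuation marks into a single token, so "..." and "?!" are each returned whole rather than character by character.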