w3resource

NLTK Tokenize: Split all punctuation into separate tokens

NLTK Tokenize: Exercise-4 with Solution

Write a Python NLTK program to split all punctuation into separate tokens.

Sample Solution:

Python Code:

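# WordPunctTokenizer separates runs of alphanumeric characters from runs of punctuation.
from nltk.tokenize import WordPunctTokenizer

text = "Reset your password if you just can't remember your old one."
print("\nOriginal string:")
print(text)

# The apostrophe in "can't" and the final period each become a token of their own.
result = WordPunctTokenizer().tokenize(text)
print("\nSplit all punctuation into separate tokens:")
print(result)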

Sample Output:

Original string:
Reset your password if you just can't remember your old one.

Split all punctuation into separate tokens:
['Reset', 'your', 'password', 'if', 'you', 'just', 'can', "'", 't', 'remember', 'your', 'old', 'one', '.']
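
Explanation:

The output above splits "can't" into three tokens because, according to the NLTK documentation, WordPunctTokenizer is a regular-expression tokenizer built on the pattern \w+|[^\w\s]+ (a run of word characters, or a run of characters that are neither word characters nor whitespace). The sketch below reproduces that behavior with Python's standard re module only, so it runs without NLTK. The function names word_punct_tokens and single_punct_tokens are hypothetical helpers introduced for illustration, and the second pattern is an assumed variant, not an NLTK feature.

```python
import re

# Pattern that NLTK documents for WordPunctTokenizer: a run of word
# characters, or a run of characters that are neither word nor whitespace.
WORD_PUNCT_PATTERN = r"\w+|[^\w\s]+"

# Hypothetical variant (not part of NLTK): without the trailing "+",
# every punctuation character becomes its own token.
SINGLE_PUNCT_PATTERN = r"\w+|[^\w\s]"


def word_punct_tokens(text):
    """Approximate WordPunctTokenizer().tokenize(text) with re.findall."""
    return re.findall(WORD_PUNCT_PATTERN, text)


def single_punct_tokens(text):
    """Like word_punct_tokens, but split punctuation runs character by character."""
    return re.findall(SINGLE_PUNCT_PATTERN, text)


text = "Reset your password if you just can't remember your old one."
print(word_punct_tokens(text))

tricky = "Wait... what?!"
print(word_punct_tokens(tricky))    # ['Wait', '...', 'what', '?!']
print(single_punct_tokens(tricky))  # ['Wait', '.', '.', '.', 'what', '?', '!']
```

Note that consecutive punctuation marks such as "..." or "?!" stay together as a single token under the first pattern. If every punctuation character must be a separate token, a pattern like the second one is one possible approach.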
