NLTK corpus: Remove stop words from a given text
NLTK corpus: Exercise-4 with Solution
Write a Python NLTK program to remove stop words from a given text.
Sample Solution:
Python Code:
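# Requires the stop word corpus: run nltk.download('stopwords') once beforehand.
from nltk.corpus import stopwords
# English stop words; every entry in this list is lowercase.
stoplist = stopwords.words('english')
text = '''
In computing, stop words are words which are filtered out before or after
processing of natural language data (text). Though "stop words" usually
refers to the most common words in a language, there is no single universal
list of stop words used by all natural language processing tools, and
indeed not all tools even use such a list. Some tools specifically avoid
removing these stop words to support phrase search.
'''
print("\nOriginal string:")
print(text)
# Split on whitespace and keep the tokens that are not in the stop list.
# The comparison is case-sensitive and punctuation stays attached to tokens,
# so capitalized words such as 'In' and 'Some' are not removed.
clean_word_list = [word for word in text.split() if word not in stoplist]
print("\nAfter removing stop words from the said text:")
print(clean_word_list)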
Sample Output:
Original string:
In computing, stop words are words which are filtered out before or after
processing of natural language data (text). Though "stop words" usually
refers to the most common words in a language, there is no single universal
list of stop words used by all natural language processing tools, and
indeed not all tools even use such a list. Some tools specifically avoid
removing these stop words to support phrase search.

After removing stop words from the said text:
['In', 'computing,', 'stop', 'words', 'words', 'filtered', 'processing', 'natural', 'language', 'data', '(text).', 'Though', '"stop', 'words"', 'usually', 'refers', 'common', 'words', 'language,', 'single', 'universal', 'list', 'stop', 'words', 'used', 'natural', 'language', 'processing', 'tools,', 'indeed', 'tools', 'even', 'use', 'list.', 'Some', 'tools', 'specifically', 'avoid', 'removing', 'stop', 'words', 'support', 'phrase', 'search.']
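Note that the output above still contains 'In' and 'Some', and that punctuation stays attached to tokens such as 'computing,' and '(text).'. This happens because the solution splits on whitespace and compares each token against the lowercase stop list exactly as written. The following is a minimal sketch of one possible variation, not part of the original solution: the helper name remove_stop_words and the short demo_stoplist are assumptions made for illustration. In practice you would likely pass stopwords.words('english') as the stop list.

```python
import string


def remove_stop_words(text, stoplist):
    """Return tokens of text that are not stop words.

    Matching ignores case and leading/trailing punctuation.
    This helper is a hypothetical illustration, not an NLTK function.
    """
    # A set gives constant-time membership tests.
    stop_set = {word.lower() for word in stoplist}
    kept = []
    for token in text.split():
        # Strip punctuation from both ends: 'computing,' -> 'computing'
        core = token.strip(string.punctuation)
        if core and core.lower() not in stop_set:
            kept.append(core)
    return kept


# A tiny hand-written stop list, for illustration only (an assumption).
# With NLTK installed, stopwords.words('english') could be passed instead.
demo_stoplist = ['in', 'are', 'which', 'out', 'before', 'or', 'after', 'of']

sample = ('In computing, stop words are words which are filtered out '
          'before or after processing of natural language data (text).')

print(remove_stop_words(sample, demo_stoplist))
```

With the demo list, 'In' is now removed and 'computing,' is returned as 'computing'. Stripping punctuation only at the ends of a token leaves internal characters (for example the apostrophe in "don't") untouched; whether that is the right choice depends on the text being processed.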