NLTK corpus: Extract the last letter of all the labeled names and create a new array with the last letter of each name
NLTK corpus: Exercise-13 with Solution
Write a Python NLTK program to extract the last letter of all the labeled names and create a new array with the last letter of each name and the associated label.
Sample Solution:
Python Code :
from nltk.corpus import names
import random
male_names = names.words('male.txt')
female_names = names.words('female.txt')
labeled_male_names = [(str(name), 'male') for name in male_names]
labeled_female_names = [(str(name), 'female') for name in female_names]
# combine labeled male and labeled female names
all_labeled_names = labeled_male_names + labeled_female_names
feature_set = [(name[-1], gender) for (name, gender) in all_labeled_names]
print("\nFirst 15 labeled names:")
print((all_labeled_names[:15]))
print("\nLast letter of all the labeled names with the associated label:")
print((feature_set[:15]))
Sample Output:
First 15 labeled names: [('Aamir', 'male'), ('Aaron', 'male'), ('Abbey', 'male'), ('Abbie', 'male'), ('Abbot', 'male'), ('Abbott', 'male'), ('Abby', 'male'), ('Abdel', 'male'), ('Abdul', 'male'), ('Abdulkarim', 'male'), ('Abdullah', 'male'), ('Abe', 'male'), ('Abel', 'male'), ('Abelard', 'male'), ('Abner', 'male')] Last letter of all the labeled names with the associated label: [('r', 'male'), ('n', 'male'), ('y', 'male'), ('e', 'male'), ('t', 'male'), ('t', 'male'), ('y', 'male'), ('l', 'male'), ('l', 'male'), ('m', 'male'), ('h', 'male'), ('e', 'male'), ('l', 'male'), ('d', 'male'), ('r', 'male')]
Have another way to solve this solution? Contribute your code (and comments) through Disqus.
What is the difficulty level of this exercise?
Test your Programming skills with w3resource's quiz.
It will be nice if you may share this link in any developer community or anywhere else, from where other developers may find this content. Thanks.
https://w3resource.com/python-exercises/nltk/nltk-corpus-exercise-13.php
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics