Python Scikit-learn: K Nearest Neighbors - Create a plot to present the performance for different values of k
Python Machine learning K Nearest Neighbors: Exercise-7 with Solution
Write a Python program using Scikit-learn to split the iris dataset into 80% train data and 20% test data. Of the 150 records in total, the training set will contain 120 and the test set the remaining 30. Fit a K Nearest Neighbors model to the training data and create a plot that presents its performance for different values of k.
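The 120/30 split described above can be checked directly. This is a minimal sketch, not the exercise's solution: it assumes the copy of the iris data bundled with scikit-learn (`load_iris`) rather than a local `iris.csv`, and the `random_state=0` seed is an arbitrary choice made only so the split is repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: use the iris data bundled with scikit-learn (150 rows, 4 features)
X, y = load_iris(return_X_y=True)

# test_size=0.20 reserves 20% of the 150 rows for testing;
# random_state=0 is an arbitrary seed chosen for repeatability
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0
)

print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)
```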
Sample Solution:
Python Code:
# Import necessary modules
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn import metrics
iris = pd.read_csv("iris.csv")
#Drop id column
iris = iris.drop('Id',axis=1)
X = iris.iloc[:, :-1].values
y = iris.iloc[:, 4].values
#Split arrays or matrices into train and test subsets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)
knn = KNeighborsClassifier(n_neighbors=7)
knn.fit(X_train, y_train)
a_index = list(range(1, 11))
# Collect one accuracy value per k in a plain list
# (pd.Series.append was removed in pandas 2.0)
a = []
# Calculate the accuracy of the model for different values of k
for i in np.arange(1, 10):
    knn2 = KNeighborsClassifier(n_neighbors=i)
    knn2.fit(X_train, y_train)
    print("For k = %d accuracy is" % i, knn2.score(X_test, y_test))
# Visual presentation: accuracy for various values of k in K-Nearest Neighbors
print("\nVisual presentation: Various values of n for K-Nearest neighbors:")
for i in a_index:
    model = KNeighborsClassifier(n_neighbors=i)
    model.fit(X_train, y_train)
    prediction = model.predict(X_test)
    a.append(metrics.accuracy_score(y_test, prediction))
plt.plot(a_index, a)
plt.xlabel("k")
plt.ylabel("Accuracy")
plt.show()
Sample Output:
For k = 1 accuracy is 0.9333333333333333
For k = 2 accuracy is 0.9333333333333333
For k = 3 accuracy is 0.9666666666666667
For k = 4 accuracy is 0.9666666666666667
For k = 5 accuracy is 0.9666666666666667
For k = 6 accuracy is 0.9666666666666667
For k = 7 accuracy is 1.0
For k = 8 accuracy is 1.0
For k = 9 accuracy is 1.0

Visual presentation: Various values of n for K-Nearest neighbors:
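The split in the solution is unseeded, so the accuracies above change from run to run. The following sketch shows one way to get a repeatable, labeled plot of k against accuracy. It is an illustration under stated assumptions, not the exercise's own method: the data comes from scikit-learn's bundled `load_iris`, the helper `accuracy_by_k`, the seed `random_state=0`, and the output filename `knn_accuracy.png` are all hypothetical choices.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render to a file, no display required
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def accuracy_by_k(k_values, random_state=0):
    """Return the test accuracy for each k on a seeded 80/20 iris split."""
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, random_state=random_state
    )
    scores = []
    for k in k_values:
        model = KNeighborsClassifier(n_neighbors=k)
        model.fit(X_train, y_train)
        # score() returns the mean accuracy on the given test data
        scores.append(model.score(X_test, y_test))
    return scores


k_values = list(range(1, 11))
scores = accuracy_by_k(k_values)

# Plot accuracy against k with labeled axes and one tick per k
plt.plot(k_values, scores, marker="o")
plt.xlabel("k (number of neighbors)")
plt.ylabel("Test accuracy")
plt.xticks(k_values)
plt.savefig("knn_accuracy.png")  # hypothetical output filename
```

With only 30 test records, each misclassified sample moves the accuracy by about 0.033, so small differences between neighboring values of k should be read with caution.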