Splitting a Dataset into Training and Testing Sets Using Pandas
9. Splitting Dataset into Training and Testing Sets
Write a Pandas program that splits a dataset into training and testing sets.
This exercise shows how to split a dataset held in a Pandas DataFrame into training and testing sets using scikit-learn's train_test_split().
Sample Solution:
Code:
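The original listing is not reproduced here, so the following is only a minimal sketch that is consistent with the output and explanation in this exercise. The six-row DataFrame, its column names (Age, Salary, Purchased), and the random_state value are assumptions made for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical six-row dataset; the column names are illustrative only.
df = pd.DataFrame({
    "Age": [25, 32, 47, 51, 38, 29],
    "Salary": [50000, 60000, 82000, 90000, 70000, 55000],
    "Purchased": [0, 1, 1, 1, 0, 0],
})

# Separate the features (X) from the target (y).
X = df.drop(columns="Purchased")
y = df["Purchased"]

# Hold out 20% of the rows for testing; random_state (an assumed value)
# fixes the shuffle so the split is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Report how many rows landed in each subset.
print("Training set size:", len(X_train))
print("Testing set size:", len(X_test))
```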
Output:
Training set size: 4
Testing set size: 2
Explanation:
- Loaded the dataset and split it into features (X) and target (y).
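- Used train_test_split() to split the dataset into training and testing sets with a requested 80-20 ratio. Because the dataset has only six rows, the test share is rounded up to 2 rows, leaving 4 for training.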
- Displayed the size of the training and testing sets.
For more Practice: Solve these Related Problems:
- Write a Pandas program to split a dataset into training and testing sets with stratified sampling based on a categorical column.
- Write a Pandas program to split a DataFrame into training and testing sets while maintaining the original index order.
- Write a Pandas program to partition a dataset into multiple sets for cross-validation and report the sizes of each set.
- Write a Pandas program to split a dataset into training and testing sets and then save each set into separate CSV files.
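As a starting point for the first two problems, here is one possible sketch of a stratified split that then restores the original index order. The ten-row DataFrame and its column names (feature, label) are hypothetical, and passing stratify to train_test_split() is just one way to approach the task.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: ten rows with a balanced categorical column "label".
df = pd.DataFrame({
    "feature": range(10),
    "label": ["a", "b"] * 5,
})

X = df[["feature"]]
y = df["label"]

# stratify=y asks train_test_split to preserve the class proportions
# of y in both the training and the testing subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The split shuffles rows; sorting by index restores the original order.
X_train, y_train = X_train.sort_index(), y_train.sort_index()
X_test, y_test = X_test.sort_index(), y_test.sort_index()

# Class counts per subset, to check that the proportions were kept.
print(y_train.value_counts().to_dict())
print(y_test.value_counts().to_dict())
```

With balanced classes, each subset should contain the two labels in equal numbers. Stratification needs at least two rows per class and a test set large enough to hold one row from every class.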