Pandas Machine Learning Integration: Exercises and Solutions for Data Integrity
This resource offers a total of 85 Pandas Machine Learning Integration problems for practice. It includes 17 main exercises, each accompanied by solutions, detailed explanations, and four related problems.
Structure of data.csv:
ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
Column Description:
ID: A unique identifier for each record (integer).
Name: The name of the individual (string).
Age: Age of the individual (numerical, may have missing values).
Gender: Gender of the individual (categorical: Male/Female).
Salary: The individual's salary (numerical, may have missing values).
Target: The target variable for binary classification (binary: 0 or 1).
1. Loading Dataset from CSV
Write a Pandas program that loads a Dataset from a CSV file.
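A minimal sketch of one way to do this, assuming the sample rows shown above. To stay self-contained, the example first writes those rows to a temporary data.csv (the temporary location is an assumption, not part of the exercise) and then loads the file with pd.read_csv.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "data.csv"
    csv_path.write_text(CSV_TEXT)
    # read_csv infers column types and treats the literal "NaN" as missing.
    df = pd.read_csv(csv_path)

print(df)
print(df.dtypes)
```

With a real file, only the pd.read_csv call with the file's path would be needed.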
2. Checking for Missing Values in a Dataset
Write a Pandas program to check for missing values in a dataset.
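One possible approach, sketched with the sample rows above held in an in-memory string (CSV_TEXT is an illustrative name): isna() flags missing cells, and summing the flags gives a per-column count.

```python
import io

import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Number of missing values in each column.
missing_counts = df.isna().sum()

# Rows that contain at least one missing value.
rows_with_missing = df[df.isna().any(axis=1)]

print(missing_counts)
print(rows_with_missing)
```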
3. Dropping Rows with Missing Values from a Dataset
Write a Pandas program to drop rows with missing values from a dataset.
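A short sketch, again assuming the sample rows above: dropna() with its default settings removes every row that has at least one missing value and returns a new DataFrame.

```python
import io

import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Drop any row with a missing value; the original df is left unchanged.
cleaned = df.dropna()

print(cleaned)
```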
4. Filling Missing Values with the Mean
Write a Pandas program that fills missing values with the Mean.
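A possible sketch using the sample rows above. It assumes that only the Age and Salary columns should be imputed. Each column's mean is computed from its non-missing values, since mean() skips NaN by default.

```python
import io

import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Assumption: only these numeric columns need imputation.
numeric_cols = ["Age", "Salary"]

filled = df.copy()
# fillna with a Series fills each column with the value under its own label.
filled[numeric_cols] = filled[numeric_cols].fillna(filled[numeric_cols].mean())

print(filled)
```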
5. Converting Categorical Variables into Numerical Values Using Label Encoding
Write a Pandas program that converts categorical variables into numerical values using label encoding.
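One pandas-only way to sketch this, assuming the sample rows above: converting Gender to the category dtype and reading its integer codes. scikit-learn's LabelEncoder would be another option. The Gender_encoded column name is chosen here for illustration.

```python
import io

import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Categories are sorted alphabetically, so Female -> 0 and Male -> 1.
df["Gender_encoded"] = df["Gender"].astype("category").cat.codes

print(df[["Gender", "Gender_encoded"]])
```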
6. Applying One-Hot Encoding to Categorical Variables
Write a Pandas program to apply one-hot encoding to categorical variables.
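A brief sketch with the sample rows above, using pd.get_dummies. Passing dtype=int is a choice made here so the indicator columns hold 0/1 rather than booleans.

```python
import io

import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Replace Gender with one indicator column per category.
encoded = pd.get_dummies(df, columns=["Gender"], dtype=int)

print(encoded)
```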
7. Normalizing Numerical Data Using Min-Max Scaling
Write a Pandas program that normalizes numerical data using Min-Max scaling.
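Min-Max scaling maps each value x to (x - min) / (max - min), so every scaled column spans the range 0 to 1. The sketch below applies that formula directly in pandas to the sample rows above. Missing values are left as NaN here, which is an assumption about the desired behaviour.

```python
import io

import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

numeric_cols = ["Age", "Salary"]

# min() and max() ignore NaN, so the range comes from observed values only.
col_min = df[numeric_cols].min()
col_max = df[numeric_cols].max()

scaled = df.copy()
scaled[numeric_cols] = (df[numeric_cols] - col_min) / (col_max - col_min)

print(scaled[numeric_cols])
```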
8. Standardizing Numerical Data Using Z-Score Scaling
Write a Pandas program to standardize numerical data using Z-Score scaling.
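Z-score scaling replaces each value x with (x - mean) / std. A minimal pandas sketch on the sample rows above follows. It uses the population standard deviation (ddof=0), a choice made here to match the convention of scikit-learn's StandardScaler. The pandas default is ddof=1.

```python
import io

import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

numeric_cols = ["Age", "Salary"]

# Column statistics are computed from non-missing values only.
col_mean = df[numeric_cols].mean()
col_std = df[numeric_cols].std(ddof=0)

standardized = df.copy()
standardized[numeric_cols] = (df[numeric_cols] - col_mean) / col_std

print(standardized[numeric_cols])
```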
9. Splitting Dataset into Training and Testing Sets
Write a Pandas program that splits Dataset into Training and Testing sets.
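A sketch using scikit-learn's train_test_split on the sample rows above. Dropping ID and Name from the features, the test fraction of 0.33, and the random_state value are all assumptions made for illustration.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Assumption: ID and Name are identifiers, not predictive features.
X = df.drop(columns=["ID", "Name", "Target"])
y = df["Target"]

# Hold out roughly a third of the rows; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

print(X_train.shape, X_test.shape)
```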
10. Removing Outliers from a Dataset
Write a Pandas program that removes outliers from a Dataset.
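The exercise does not name a method, so the sketch below assumes the common 1.5 × IQR rule applied to Salary. The sample rows above contain no extreme values, so a hypothetical seventh row with a very large salary is appended purely for demonstration. Rows with a missing Salary are kept, which is also an assumption.

```python
import io

import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Hypothetical extra row (not in the original data) to act as an outlier.
extra = pd.DataFrame(
    [{"ID": 7, "Name": "Example", "Age": 40, "Gender": "Female",
      "Salary": 1_000_000, "Target": 0}]
)
df = pd.concat([df, extra], ignore_index=True)

# IQR rule: flag values beyond 1.5 * IQR from the quartiles.
q1 = df["Salary"].quantile(0.25)
q3 = df["Salary"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep in-range rows; rows with a missing Salary are not treated as outliers.
in_range = df["Salary"].between(lower, upper)
filtered = df[in_range | df["Salary"].isna()]

print(filtered)
```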
11. Imputing Missing Values Using K-Nearest Neighbours
Write a Pandas program that imputes missing values using K-Nearest neighbours.
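A possible sketch using scikit-learn's KNNImputer on the Age and Salary columns of the sample rows above. The choice of n_neighbors=2 is an assumption suited to such a small dataset. On real data, features on very different scales would usually be scaled before imputing.

```python
import io

import pandas as pd
from sklearn.impute import KNNImputer

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

numeric_cols = ["Age", "Salary"]

# Each missing value is replaced by the mean of that feature
# over the nearest rows that have it observed.
imputer = KNNImputer(n_neighbors=2)

imputed = df.copy()
imputed[numeric_cols] = imputer.fit_transform(df[numeric_cols])

print(imputed)
```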
12. Selecting Features Using Variance Threshold
Write a Pandas program to select features using a variance threshold.
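A sketch with scikit-learn's VarianceThreshold. None of the numeric columns in the sample rows above is constant, so a hypothetical Constant column is added to give the selector something to remove. Rows with missing values are dropped first, which is a simplification made for this example.

```python
import io

import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

features = df[["Age", "Salary"]].dropna().copy()
# Hypothetical zero-variance column, added only for demonstration.
features["Constant"] = 1

# threshold=0.0 removes features whose variance is exactly zero.
selector = VarianceThreshold(threshold=0.0)
selector.fit(features)

# get_support() returns a boolean mask over the input columns.
kept_columns = features.columns[selector.get_support()]
selected = features[kept_columns]

print(list(kept_columns))
```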
13. Handling Class Imbalance Using Random Oversampling
Write a Pandas program that handles class imbalance using random oversampling.
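A pandas-only sketch of random oversampling. The imbalanced-learn package offers a dedicated RandomOverSampler as an alternative. The six sample rows above are already balanced (three rows per class), so the example assumes a subset of the first five rows to create an imbalance. The random_state value is likewise arbitrary.

```python
import io

import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Assumption: use the first five rows so the classes are 3 vs 2.
imbalanced = df.head(5)
majority_size = imbalanced["Target"].value_counts().max()

parts = []
for _, group in imbalanced.groupby("Target"):
    parts.append(group)
    shortfall = majority_size - len(group)
    if shortfall > 0:
        # Duplicate randomly chosen minority rows (sampling with replacement).
        parts.append(group.sample(n=shortfall, replace=True, random_state=0))

balanced = pd.concat(parts, ignore_index=True)

print(balanced["Target"].value_counts())
```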
14. Applying Polynomial Features for Feature Expansion
Write a Pandas program that applies Polynomial Features for feature expansion.
15. Scaling Numerical Features Using Scikit-learn's RobustScaler
Write a Pandas program to scale numerical features using Scikit-learn's RobustScaler.
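A minimal sketch with the sample rows above. RobustScaler centres each column on its median and divides by the interquartile range, which makes it less sensitive to extreme values than mean-based scaling. The default settings are assumed, and missing values pass through as NaN.

```python
import io

import pandas as pd
from sklearn.preprocessing import RobustScaler

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

numeric_cols = ["Age", "Salary"]

# Defaults: centre on the median, scale by the 25th-75th percentile range.
scaler = RobustScaler()

scaled = df.copy()
scaled[numeric_cols] = scaler.fit_transform(df[numeric_cols])

print(scaled[numeric_cols])
```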
16. Saving the Processed Dataset to a CSV File
Write a Pandas program to save the processed Dataset to a CSV file.
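A short sketch, assuming the "processed" dataset is simply the sample rows above with incomplete rows dropped. The output name processed_data.csv and the temporary directory are hypothetical. The file is read back only to confirm the round trip.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Stand-in processing step: drop rows with missing values.
processed = df.dropna().reset_index(drop=True)

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "processed_data.csv"  # hypothetical file name
    # index=False keeps the row index out of the file.
    processed.to_csv(out_path, index=False)
    reloaded = pd.read_csv(out_path)

print(reloaded)
```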
17. Applying Log Transformation to Skewed Data
Write a Pandas program that applies Log Transformation to Skewed Data.
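A sketch that treats Salary as the skewed column, which is an assumption because the small sample above is not strongly skewed. np.log1p computes log(1 + x), which stays defined at zero. The Salary_log column name is illustrative.

```python
import io

import numpy as np
import pandas as pd

# Sample rows copied from the data.csv structure described above.
CSV_TEXT = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# log1p compresses large values; missing salaries remain NaN.
df["Salary_log"] = np.log1p(df["Salary"])

print(df[["Salary", "Salary_log"]])
```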