w3resource

Pandas Machine Learning Integration: Exercises and Solutions for Data Integrity


This resource offers a total of 85 Pandas Machine Learning Integration problems for practice. It includes 17 main exercises, each accompanied by solutions, detailed explanations, and four related problems.



Structure of data.csv:

ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1

Column Description:

ID: A unique identifier for each record (integer).

Name: The name of the individual (string).

Age: Age of the individual (numerical, may have missing values).

Gender: Gender of the individual (categorical: Male/Female).

Salary: The individual's salary (numerical, may have missing values).

Target: The target variable for binary classification (binary: 0 or 1).


1. Loading Dataset from CSV

Write a Pandas program that loads a dataset from a CSV file.
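One possible approach is sketched below. To stay runnable without a file on disk, it reads the sample rows shown above from an in-memory string (the name csv_text is an assumption for illustration); with the real file, the call would be pd.read_csv("data.csv").

```python
import io

import pandas as pd

# Stand-in for data.csv so the sketch runs without a file on disk.
# With the real file, the call would be pd.read_csv("data.csv").
csv_text = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first five rows
print(df.dtypes)   # Age and Salary load as floats because they contain NaN
```

By default, read_csv treats the literal text NaN as a missing value, so no extra na_values argument is needed for this file.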



2. Checking for Missing Values in a Dataset

Write a Pandas program to check for missing values in a dataset.
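A minimal sketch of one way to do this, assuming the numeric columns of the sample rows above are rebuilt in memory rather than loaded from data.csv. It counts missing values per column with isna().sum() and lists the rows that contain at least one missing value.

```python
import numpy as np
import pandas as pd

# Numeric columns of the sample rows, rebuilt in memory for illustration.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Age": [25, 30, 22, 35, np.nan, 29],
    "Salary": [50000, 60000, 70000, 80000, 55000, np.nan],
    "Target": [0, 1, 0, 1, 0, 1],
})

# Number of missing values in each column.
missing_per_column = df.isna().sum()

# Rows that have a missing value in any column.
rows_with_missing = df[df.isna().any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```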



3. Dropping Rows with Missing Values from a Dataset

Write a Pandas program to drop rows with missing values from a dataset.
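The following sketch shows one option, again assuming the sample rows are rebuilt in memory. dropna() with no arguments removes every row that has any missing value; the subset argument restricts the check to chosen columns.

```python
import numpy as np
import pandas as pd

# Numeric columns of the sample rows, rebuilt in memory for illustration.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Age": [25, 30, 22, 35, np.nan, 29],
    "Salary": [50000, 60000, 70000, 80000, 55000, np.nan],
    "Target": [0, 1, 0, 1, 0, 1],
})

# Drop rows with a missing value in any column.
cleaned = df.dropna()

# Drop rows only when Age is missing; a missing Salary is tolerated.
cleaned_age_only = df.dropna(subset=["Age"])

print(cleaned)
print(cleaned_age_only)
```

dropna() returns a new DataFrame and leaves df unchanged.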



4. Filling Missing Values with the Mean

Write a Pandas program that fills missing values with the column mean.
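A short sketch under the assumption that only the numeric columns Age and Salary should be filled. Each missing entry is replaced by the mean of the non-missing values in the same column.

```python
import numpy as np
import pandas as pd

# Numeric columns of the sample rows, rebuilt in memory for illustration.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Age": [25, 30, 22, 35, np.nan, 29],
    "Salary": [50000, 60000, 70000, 80000, 55000, np.nan],
})

numeric_cols = ["Age", "Salary"]

# mean() skips NaN by default, so each mean uses the five known values.
column_means = df[numeric_cols].mean()

filled = df.copy()
filled[numeric_cols] = df[numeric_cols].fillna(column_means)

print(column_means)
print(filled)
```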



5. Converting Categorical Variables into Numerical Values Using Label Encoding

Write a Pandas program that converts categorical variables into numerical values using label encoding.
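One way to sketch label encoding with Pandas alone is to convert the column to the category dtype and take its integer codes. This is an assumption about the approach; scikit-learn's LabelEncoder is a common alternative. The column name Gender_code is hypothetical.

```python
import pandas as pd

# Name and Gender columns of the sample rows, rebuilt in memory.
df = pd.DataFrame({
    "Name": ["Sara", "Ophrah", "Torben", "Masaharu", "Kaya", "Abaddon"],
    "Gender": ["Female", "Male", "Male", "Male", "Female", "Male"],
})

# Categories are sorted alphabetically, so Female -> 0 and Male -> 1.
gender_as_category = df["Gender"].astype("category")
df["Gender_code"] = gender_as_category.cat.codes

# Mapping from integer code back to the original label.
code_to_label = dict(enumerate(gender_as_category.cat.categories))

print(df)
print(code_to_label)
```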



6. Applying One-Hot Encoding to Categorical Variables

Write a Pandas program to apply one-hot encoding to categorical variables.
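A minimal sketch using pd.get_dummies, assuming Gender is the only categorical column to encode. The dtype=int argument is a choice made here so the indicator columns hold 0/1 instead of booleans.

```python
import pandas as pd

# Name and Gender columns of the sample rows, rebuilt in memory.
df = pd.DataFrame({
    "Name": ["Sara", "Ophrah", "Torben", "Masaharu", "Kaya", "Abaddon"],
    "Gender": ["Female", "Male", "Male", "Male", "Female", "Male"],
})

# Replace Gender with one indicator column per category.
encoded = pd.get_dummies(df, columns=["Gender"], dtype=int)

print(encoded)
```

The original Gender column is dropped and replaced by Gender_Female and Gender_Male.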



7. Normalizing Numerical Data Using Min-Max Scaling

Write a Pandas program that normalizes numerical data using Min-Max scaling.
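Min-Max scaling maps each value x to (x - min) / (max - min), so every column ends up in the range 0 to 1. The sketch below applies that formula directly with Pandas, which is one option among several (scikit-learn's MinMaxScaler is another). Missing values are left as NaN here; that is a choice of this sketch.

```python
import numpy as np
import pandas as pd

# Numeric columns of the sample rows, rebuilt in memory for illustration.
df = pd.DataFrame({
    "Age": [25, 30, 22, 35, np.nan, 29],
    "Salary": [50000, 60000, 70000, 80000, 55000, np.nan],
})

numeric_cols = ["Age", "Salary"]

# Column-wise minimum and maximum; NaN entries are skipped.
col_min = df[numeric_cols].min()
col_max = df[numeric_cols].max()

scaled = df.copy()
scaled[numeric_cols] = (df[numeric_cols] - col_min) / (col_max - col_min)

print(scaled)
```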



8. Standardizing Numerical Data Using Z-Score Scaling

Write a Pandas program to standardize numerical data using Z-Score scaling.
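Z-score scaling replaces each value x with (x - mean) / std, giving each column a mean of 0 and a standard deviation of 1. The sketch below uses the population standard deviation (ddof=0), which matches what scikit-learn's StandardScaler computes; Pandas defaults to ddof=1, so this is an explicit assumption.

```python
import numpy as np
import pandas as pd

# Numeric columns of the sample rows, rebuilt in memory for illustration.
df = pd.DataFrame({
    "Age": [25, 30, 22, 35, np.nan, 29],
    "Salary": [50000, 60000, 70000, 80000, 55000, np.nan],
})

numeric_cols = ["Age", "Salary"]

# Column-wise mean and population standard deviation; NaN is skipped.
col_mean = df[numeric_cols].mean()
col_std = df[numeric_cols].std(ddof=0)

standardized = df.copy()
standardized[numeric_cols] = (df[numeric_cols] - col_mean) / col_std

print(standardized)
```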



9. Splitting Dataset into Training and Testing Sets

Write a Pandas program that splits a dataset into training and testing sets.
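A hedged sketch using scikit-learn's train_test_split, assuming Age and Salary are the features and Target is the label. With only six sample rows, the test set is fixed at two rows (test_size=2); random_state=42 is an arbitrary seed chosen so the split is repeatable.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Feature and target columns of the sample rows, rebuilt in memory.
df = pd.DataFrame({
    "Age": [25, 30, 22, 35, np.nan, 29],
    "Salary": [50000, 60000, 70000, 80000, 55000, np.nan],
    "Target": [0, 1, 0, 1, 0, 1],
})

X = df[["Age", "Salary"]]   # features
y = df["Target"]            # label

# Hold out two rows for testing; the remaining four are for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=2, random_state=42
)

print(X_train.shape, X_test.shape)
```

In a real workflow, missing values would normally be handled before or after the split, depending on the imputation strategy.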



10. Removing Outliers from a Dataset

Write a Pandas program that removes outliers from a dataset.
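The exercise does not say which outlier rule to use; the sketch below assumes the common interquartile range (IQR) rule, which keeps values between Q1 - 1.5 * IQR and Q3 + 1.5 * IQR. Because the sample salaries contain no obvious outlier, one hypothetical row (ID 99, salary 500000) is added purely for illustration.

```python
import pandas as pd

# Known salaries from the sample rows plus one HYPOTHETICAL extreme row
# (ID 99) that is not part of data.csv.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 99],
    "Salary": [50000, 60000, 70000, 80000, 55000, 500000],
})

# Quartiles and interquartile range of the Salary column.
q1 = df["Salary"].quantile(0.25)
q3 = df["Salary"].quantile(0.75)
iqr = q3 - q1

lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

# Keep only the rows whose salary lies inside the bounds.
filtered = df[df["Salary"].between(lower_bound, upper_bound)]

print(lower_bound, upper_bound)
print(filtered)
```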



11. Imputing Missing Values Using K-Nearest Neighbours

Write a Pandas program that imputes missing values using K-Nearest Neighbours.
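A minimal sketch using scikit-learn's KNNImputer, assuming only Age and Salary take part in the imputation and that two neighbours (n_neighbors=2) is a reasonable choice for six rows. Each missing value is replaced by the average of that feature over the nearest rows that have it.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Numeric columns of the sample rows, rebuilt in memory for illustration.
df = pd.DataFrame({
    "Age": [25, 30, 22, 35, np.nan, 29],
    "Salary": [50000, 60000, 70000, 80000, 55000, np.nan],
})

numeric_cols = ["Age", "Salary"]

# n_neighbors=2 is an illustrative setting, not a recommendation.
imputer = KNNImputer(n_neighbors=2)

# fit_transform returns a NumPy array, so wrap it back into a DataFrame.
imputed = pd.DataFrame(
    imputer.fit_transform(df[numeric_cols]),
    columns=numeric_cols,
    index=df.index,
)

print(imputed)
```

Distances here are computed on unscaled values, so Salary dominates whenever both features are present; scaling the features first is often worth considering.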



12. Selecting Features Using Variance Threshold

Write a Pandas program to select features using a variance threshold.
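The sketch below uses scikit-learn's VarianceThreshold with threshold=0.0, which removes features that have the same value in every row. It assumes the four complete sample rows and adds a hypothetical constant column named Flag (not in data.csv) so that there is something to remove.

```python
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# The four complete sample rows, plus a HYPOTHETICAL constant column
# "Flag" added only to give the selector a zero-variance feature.
X = pd.DataFrame({
    "Age": [25, 30, 22, 35],
    "Salary": [50000, 60000, 70000, 80000],
    "Flag": [1, 1, 1, 1],
})

# threshold=0.0 drops features whose variance is exactly zero.
selector = VarianceThreshold(threshold=0.0)
selector.fit(X)

# get_support() returns a boolean mask of the features that were kept.
kept_columns = X.columns[selector.get_support()]
X_selected = X[kept_columns]

print(list(kept_columns))
print(X_selected)
```

Variance depends on the scale of each feature, so a non-zero threshold is usually only meaningful after the features have been put on comparable scales.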



13. Handling Class Imbalance Using Random Oversampling

Write a Pandas program to handle class imbalance using random oversampling.
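Random oversampling duplicates randomly chosen rows of the smaller class until the classes are the same size. The six sample rows are already balanced (three of each class), so this sketch assumes only the first five rows, which gives three records of class 0 and two of class 1. It uses Pandas sampling with replacement; the imbalanced-learn package offers a RandomOverSampler for the same purpose.

```python
import pandas as pd

# First five sample rows only, so that Target is imbalanced (3 vs 2).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Gender": ["Female", "Male", "Male", "Male", "Female"],
    "Target": [0, 1, 0, 1, 0],
})

# Size of the largest class; every class is grown to this size.
majority_size = df["Target"].value_counts().max()

parts = []
for label, group in df.groupby("Target"):
    parts.append(group)
    shortfall = majority_size - len(group)
    if shortfall > 0:
        # Draw extra rows from the same class, with replacement.
        extra = group.sample(n=shortfall, replace=True, random_state=0)
        parts.append(extra)

balanced = pd.concat(parts, ignore_index=True)

print(balanced["Target"].value_counts())
```

Oversampling is normally applied to the training set only, so that duplicated rows do not leak into the test set.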



14. Applying Polynomial Features for Feature Expansion

Write a Pandas program that applies Polynomial Features for feature expansion.



15. Scaling Numerical Features Using Scikit-learn's RobustScaler

Write a Pandas program to scale numerical features using Scikit-learn's RobustScaler.
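RobustScaler centres each feature on its median and divides by its interquartile range, which makes it less sensitive to outliers than mean and standard deviation scaling. The sketch below assumes the four complete sample rows and the scaler's default settings.

```python
import pandas as pd
from sklearn.preprocessing import RobustScaler

# The four complete sample rows (no missing values).
X = pd.DataFrame({
    "Age": [25, 30, 22, 35],
    "Salary": [50000, 60000, 70000, 80000],
})

# Defaults: subtract the median, divide by the 25th-75th percentile range.
scaler = RobustScaler()

# fit_transform returns a NumPy array, so wrap it back into a DataFrame.
scaled = pd.DataFrame(
    scaler.fit_transform(X),
    columns=X.columns,
    index=X.index,
)

print(scaled)
```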



16. Saving the Processed Dataset to a CSV File

Write a Pandas program to save the processed dataset to a CSV file.
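A minimal sketch of saving with to_csv. The file name processed_data.csv and the use of a temporary directory are assumptions made so the example cleans up after itself; dropna() stands in for whatever preprocessing was actually applied.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Numeric columns of the sample rows, rebuilt in memory for illustration.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Age": [25, 30, 22, 35, np.nan, 29],
    "Salary": [50000, 60000, 70000, 80000, 55000, np.nan],
    "Target": [0, 1, 0, 1, 0, 1],
})

# Stand-in preprocessing step: keep only the complete rows.
processed = df.dropna()

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical output name; any writable path works.
    out_path = os.path.join(tmp_dir, "processed_data.csv")

    # index=False keeps the row index out of the file.
    processed.to_csv(out_path, index=False)

    # Read the file back to confirm what was written.
    reloaded = pd.read_csv(out_path)

print(reloaded)
```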



17. Applying Log Transformation to Skewed Data

Write a Pandas program that applies a log transformation to skewed data.
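One way to sketch this is with NumPy's log1p, which computes log(1 + x) and therefore also handles zeros. Applying it to Salary is an assumption, since the exercise does not name a column, and Salary_log is a hypothetical column name.

```python
import numpy as np
import pandas as pd

# Salary column of the sample rows, rebuilt in memory for illustration.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Salary": [50000, 60000, 70000, 80000, 55000, np.nan],
})

# log1p(x) = log(1 + x); missing values stay missing.
df["Salary_log"] = np.log1p(df["Salary"])

# expm1 is the inverse and recovers the original scale.
recovered = np.expm1(df["Salary_log"])

print(df)
```

A log transformation compresses large values more than small ones, which is why it is commonly used on right-skewed, non-negative data.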


