Pandas: Divide a DataFrame in a given ratio
Pandas: DataFrame Exercise-38 with Solution
Write a Pandas program to divide a DataFrame in a given ratio.
Sample data:
Original DataFrame:
          0         1
0  0.316147 -0.767359
1 -0.813410 -2.522672
2  0.869615  1.194704
3 -0.892915 -0.055133
4 -0.341126  0.518266
5  1.857342  1.361229
6 -0.044353 -1.205002
7 -0.726346 -0.535147
8 -1.350726  0.563117
9  1.051666 -0.441533
70% of the said DataFrame:
          0         1
8 -1.350726  0.563117
2  0.869615  1.194704
5  1.857342  1.361229
6 -0.044353 -1.205002
3 -0.892915 -0.055133
1 -0.813410 -2.522672
0  0.316147 -0.767359
30% of the said DataFrame:
          0         1
4 -0.341126  0.518266
7 -0.726346 -0.535147
9  1.051666 -0.441533
Sample Solution:
Python Code:
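import pandas as pd
import numpy as np
# Build a 10x2 DataFrame of standard-normal values (unseeded, so the values differ on each run)
df = pd.DataFrame(np.random.randn(10, 2))
print("Original DataFrame:")
print(df)
# Randomly sample 70% of the rows; random_state makes the selection repeatable
part_70 = df.sample(frac=0.7, random_state=10)
# The remaining 30% are the rows whose index labels are not in part_70
part_30 = df.drop(part_70.index)
print("\n70% of the said DataFrame:")
print(part_70)
print("\n30% of the said DataFrame:")
print(part_30)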
Sample Output:
Original DataFrame:
          0         1
0  0.316147 -0.767359
1 -0.813410 -2.522672
2  0.869615  1.194704
3 -0.892915 -0.055133
4 -0.341126  0.518266
5  1.857342  1.361229
6 -0.044353 -1.205002
7 -0.726346 -0.535147
8 -1.350726  0.563117
9  1.051666 -0.441533

70% of the said DataFrame:
          0         1
8 -1.350726  0.563117
2  0.869615  1.194704
5  1.857342  1.361229
6 -0.044353 -1.205002
3 -0.892915 -0.055133
1 -0.813410 -2.522672
0  0.316147 -0.767359

30% of the said DataFrame:
          0         1
4 -0.341126  0.518266
7 -0.726346 -0.535147
9  1.051666 -0.441533
Explanation:
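The above code first generates a Pandas DataFrame df with 10 rows and 2 columns, filled with values drawn from the standard normal distribution using NumPy's np.random.randn. Because no NumPy seed is set, the values in df differ on each run.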
part_70 = df.sample(frac=0.7, random_state=10): This line creates a new DataFrame 'part_70' by randomly sampling 70% of the rows of 'df', without replacement, using the sample method. The 'frac' parameter specifies the fraction of rows to return, while the 'random_state' parameter seeds the sampler so that the same rows are selected every time the code is run with the same random_state value.
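The effect of random_state can be checked with a small sketch. The seeded generator and the names rng, first and second are assumptions added for illustration; they are not part of the original solution.

```python
import numpy as np
import pandas as pd

# Seeded generator so the data is repeatable (the seed 0 is an arbitrary choice)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 2)))

# Two independent calls with the same random_state
first = df.sample(frac=0.7, random_state=10)
second = df.sample(frac=0.7, random_state=10)

print(len(first))            # 7 (70% of 10 rows)
print(first.equals(second))  # True: same rows, in the same order
```

Omitting random_state would generally give a different selection on each call.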
part_30 = df.drop(part_70.index): This line creates another DataFrame 'part_30' by dropping the rows of 'part_70' from 'df'. The drop method removes rows by index label, and the labels to remove are taken from the index attribute of 'part_70'. The resulting DataFrame 'part_30' contains the remaining 30% of the rows of 'df'. This works cleanly here because the default index of 'df' has unique labels.
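The two steps can be wrapped in a small function for reuse. This is only a sketch: split_by_ratio is a hypothetical helper name, not a Pandas function, and the uniqueness check is an added assumption, included because drop removes every row that carries a given label.

```python
import numpy as np
import pandas as pd

def split_by_ratio(df, frac, seed=None):
    """Hypothetical helper: return (sampled part, remaining rows)."""
    # drop() works on labels, so duplicate labels would remove extra rows
    if not df.index.is_unique:
        raise ValueError("index labels must be unique for a drop-based split")
    part = df.sample(frac=frac, random_state=seed)
    rest = df.drop(part.index)
    return part, rest

# Seeded data for a repeatable demonstration (the seed 0 is arbitrary)
df = pd.DataFrame(np.random.default_rng(0).standard_normal((10, 2)))
part_70, part_30 = split_by_ratio(df, 0.7, seed=10)

print(len(part_70), len(part_30))  # 7 3
```

If the index of a DataFrame is not unique, calling reset_index(drop=True) before splitting is one way to satisfy the check.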