C++ File handling: Split a large text file
C++ File handling: Exercise-10 with Solution
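Write a C++ program to split a large text file into smaller files of a fixed size in bytes. When the file size is not an exact multiple of the chunk size, the last file is smaller than the others.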
Sample Solution:
C++ Code:
#include <iostream> // Including the input/output stream library
#include <fstream> // Including the file stream library
#include <string> // Including the string handling library
#include <vector> // Including the vector container

// Function to split a file into smaller chunks
void splitFile(const std::string& inputFile, const std::string& outputPrefix, int chunkSize) {
    if (chunkSize <= 0) { // A non-positive chunk size would cause a division by zero or an empty buffer
        std::cerr << "Chunk size must be positive." << std::endl;
        return;
    }

    std::ifstream input(inputFile, std::ios::binary); // Open the input file in binary mode
    if (!input.is_open()) { // Check if the input file was successfully opened
        std::cerr << "Failed to open the input file." << std::endl;
        return;
    }

    // Get the file size
    input.seekg(0, std::ios::end); // Move the file pointer to the end of the file
    const std::streamoff fileSize = input.tellg(); // The position at the end of the file is its size in bytes
    input.seekg(0, std::ios::beg); // Move the file pointer back to the beginning of the file
    if (fileSize < 0) { // tellg() returns -1 on failure
        std::cerr << "Failed to determine the input file size." << std::endl;
        return;
    }

    // Calculate the number of chunks, rounding up so a partial last chunk is included
    const std::streamoff numChunks = (fileSize + chunkSize - 1) / chunkSize;

    std::vector<char> buffer(static_cast<std::size_t>(chunkSize)); // One buffer, reused for every chunk

    // Read and write each chunk
    for (std::streamoff i = 0; i < numChunks; ++i) {
        // Create or overwrite the output file with an incremental suffix
        const std::string outputName = outputPrefix + std::to_string(i + 1) + ".txt";
        std::ofstream output(outputName, std::ios::binary);
        if (!output.is_open()) { // Check if the output file was successfully opened
            std::cerr << "Failed to open output file: " << outputName << std::endl;
            return; // Stop here so later parts are not filled from the wrong offset
        }

        input.read(buffer.data(), chunkSize); // Read up to chunkSize bytes; the last chunk may be shorter
        output.write(buffer.data(), input.gcount()); // Write only the bytes that were actually read
        output.close(); // Close the output file
        if (!output) { // Check that the write and close succeeded
            std::cerr << "Failed to write output file: " << outputName << std::endl;
            return;
        }
    }

    input.close(); // Close the input file
    std::cout << "File split successfully." << std::endl; // Reached only when every chunk was written
}

int main() {
    std::string inputFile = "merged_test_file.txt"; // Input file
    std::string outputPrefix = "part_"; // Prefix for output files
    int chunkSize = 400; // Chunk size in bytes
    splitFile(inputFile, outputPrefix, chunkSize); // Call the function to split the file
    return 0; // Return 0 to indicate successful execution
}
Sample Output:
File split successfully.
Explanation:
In the above exercise,
- The function splitFile() takes three parameters: inputFile (the name of the input file to be split), outputPrefix (the prefix for the output files), and chunkSize (the size of each chunk in bytes).
- The program opens the input file using std::ifstream in binary mode. It then determines the size of the input file using the seekg() and tellg() functions.
- Next, it calculates the number of chunks required to split the file based on the specified chunk size.
- The program iterates over each chunk, creates or overwrites the corresponding output file using std::ofstream, and reads a chunk of data from the input file using a std::vector<char> buffer.
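- Each chunk is then written to its output file using write(), passing input.gcount() as the length so that only the bytes actually read are written. This matters for the last chunk, which is usually shorter than chunkSize.
- Each output file is closed once its chunk has been written. After the last chunk, the input file is closed and a success message is displayed.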
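The chunk-count step can be looked at in isolation. The sketch below is only an illustration: chunkCount() and chunkLength() are hypothetical helper names that do not appear in the solution, and the 1000-byte file size used in the note after the code is an assumed figure, not the size of the sample file.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of chunks needed to hold fileSize bytes.
// Adding chunkSize - 1 before dividing rounds the result up (ceiling division).
std::int64_t chunkCount(std::int64_t fileSize, std::int64_t chunkSize) {
    assert(fileSize >= 0 && chunkSize > 0);
    return (fileSize + chunkSize - 1) / chunkSize;
}

// Hypothetical helper: size in bytes of the chunk at a zero-based index.
// Every chunk is chunkSize bytes except possibly the last one.
std::int64_t chunkLength(std::int64_t fileSize, std::int64_t chunkSize, std::int64_t index) {
    assert(index >= 0 && index < chunkCount(fileSize, chunkSize));
    const std::int64_t start = index * chunkSize;     // offset of the chunk's first byte
    const std::int64_t remaining = fileSize - start;  // bytes left from that offset
    return remaining < chunkSize ? remaining : chunkSize;
}
```

For an assumed 1000-byte file and 400-byte chunks, chunkCount(1000, 400) gives 3, and the three chunks are 400, 400 and 200 bytes long. An empty file gives 0 chunks, so no output file is created.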
Note:
Content of "merged_test_file.txt"
Many vendors provide C++ compilers, including the Free Software Foundation, LLVM, Microsoft, Intel, Embarcadero, Oracle, and IBM.
C++ is a high-level, general-purpose programming language created by Danish computer scientist Bjarne Stroustrup.
It is almost always implemented in a compiled language.
Modern C++ currently has object-oriented, generic, and functional features, in addition to facilities for low-level memory manipulation.
First released in 1985 as an extension of the C programming language, it has since expanded significantly over time.
Content of the split files
part_1.txt
Many vendors provide C++ compilers, including the Free Software Foundation, LLVM, Microsoft, Intel, Embarcadero, Oracle, and IBM.
C++ is a high-level, general-purpose programming language created by Danish computer scientist Bjarne Stroustrup.
It is almost always implemented in a compiled language.
Modern C++ currently has object-oriented, generic, and functional features, in addition to facil
part_2.txt
ities for low-level memory manipulation.
First released in 1985 as an extension of the C programming language, it has since expanded significantly over time.