You should have learnt to use [code][/code] tags by now.
> Only and only thing I care about is 'SPEED'.
Your average instruction execution time is measured in nanoseconds, while the average seek time of a spinning hard disk is measured in milliseconds, which is 6 to 8 orders of magnitude slower (1 nanosecond versus 10 milliseconds is a factor of ten million). With a 10G file, your hard disk is going to be doing a lot of seeking, which means your code is going to be doing a lot of waiting.
This measures the lower bound on how fast you can read the file with trivial processing. Consider it the benchmark you measure approaches #1, #2, #3, ... #n against.
[code]
#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream myfile("file.txt");
    if (myfile.is_open()) {
        std::string line;
        long nlines = 0;
        // Read every line, doing nothing with it but counting.
        while (std::getline(myfile, line)) {
            ++nlines;
        }
        std::cout << nlines << std::endl;
    }
}
[/code]
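If you want an actual number out of it rather than a stopwatch, the same loop can time itself with std::chrono. This is just a sketch of that idea: "file.txt" is the placeholder name from the snippet above, and the byte count is approximate because getline strips the newline.

[code]
#include <chrono>
#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream myfile("file.txt");
    if (!myfile.is_open()) return 1;

    auto start = std::chrono::steady_clock::now();

    std::string line;
    long nlines = 0;
    long long nbytes = 0;
    while (std::getline(myfile, line)) {
        ++nlines;
        nbytes += line.size() + 1;  // +1 for the '\n' getline strips
    }

    double elapsed = std::chrono::duration<double>(
        std::chrono::steady_clock::now() - start).count();

    std::cout << nlines << " lines, "
              << (nbytes / 1e6) / elapsed << " MB/s\n";
}
[/code]

Bear in mind the OS caches the file, so a second run will look much faster than the first; the cold-cache number is the honest baseline.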
While it's doing that, use whatever process monitoring tools you have to see how much time is spent in your code. It won't be much.
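One quick way to see that split without external tools is to compare CPU time against wall-clock time around the read loop. A minimal sketch of the idea: on POSIX systems std::clock reports CPU time consumed by your process (on Windows it tracks wall time instead, so use a profiler there), and a low CPU/wall ratio means the disk, not your code, is the bottleneck.

[code]
#include <chrono>
#include <ctime>
#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream myfile("file.txt");
    if (!myfile.is_open()) return 1;

    std::clock_t c0 = std::clock();              // CPU time used so far
    auto w0 = std::chrono::steady_clock::now();  // wall-clock start

    std::string line;
    while (std::getline(myfile, line)) { /* read only */ }

    double cpu  = double(std::clock() - c0) / CLOCKS_PER_SEC;
    double wall = std::chrono::duration<double>(
        std::chrono::steady_clock::now() - w0).count();

    // If this prints, say, 0.1, then 90% of the run was spent waiting on I/O.
    std::cout << "CPU/wall ratio: " << cpu / wall << '\n';
}
[/code]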
So how often do you need to process a 10G file? Where is it coming from, and who needs the results of your analysis? To me, it seems like a run-once-a-day kind of problem.
If the benchmark above takes, say, 10 minutes, there is nothing you could ever do to make it run in 10 seconds. Accept that it's going to take 10 minutes and tell the users to take a coffee break.
Similarly, if the benchmark is 10 minutes, your #1 approach takes 11 minutes, and your #2 approach takes 12 minutes (but the code is a lot cleaner for you to maintain), then users are not going to change their work pattern. It's still a coffee-break interval to them.
If the ballpark is an hour, then they'll go to lunch instead.
They won't care about +/-10% around whatever baseline time it takes.
They WILL care about you delivering the software late because you're obsessing over saving milliseconds in a program that takes minutes or hours to run.