In text mode, getline() reads all chars from the specified stream up to, but NOT including, the specified terminator char (\n by default) into the specified variable. It then reads and DISCARDS the terminator char, unless eof was reached first. Hence if there are no non-terminator chars before the next terminator char, getline() returns an empty string.
On Windows in text mode, the line terminator written as '\n' in code is actually written to the file as \r\n by the text-mode translation. Consider:
#include <fstream>
#include <iostream>
#include <string>

int main() {
    // Default (text) mode: on Windows each '\n' ends up in the file as \r\n.
    std::ofstream of("tt.txt");
    of << "l1" << '\n' << "l2" << '\n';
}
If you look at the file tt.txt with a hex editor you'll see these bytes (shown here as decimal values):
108 49 13 10 108 50 13 10
If this is now read back with getline():
#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream ifs("tt.txt");
    // Print each line exactly as getline() returns it.
    for (std::string s; std::getline(ifs, s); std::cout << s << '\n');
}
you get the expected:
l1
l2
However, if you now read the same file character by character:
#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream ifs("tt.txt");
    // Print the numeric value of every char the text-mode stream delivers.
    for (char c; ifs.get(c); std::cout << static_cast<int>(c) << ' ');
    std::cout << '\n';
}
you get:
108 49 10 108 50 10
No 13, even though the hex editor shows that the 13s are present in the file. In text mode, \r\n is converted to \n before the program sees it.
If you open the same file in binary mode:
#include <fstream>
#include <iostream>
#include <string>

int main() {
    // Binary mode: no line-ending translation is applied.
    std::ifstream ifs("tt.txt", std::ios::binary);
    for (char c; ifs.get(c); std::cout << static_cast<int>(c) << ' ');
    std::cout << '\n';
}
then you do see the \r\n termination:
108 49 13 10 108 50 13 10
The interesting one is:
#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream ifs("tt.txt", std::ios::binary);
    // Wrap each line in '!' so that stray chars at either end become visible.
    for (std::string s; std::getline(ifs, s); std::cout << "!" << s << "!\n");
}
which displays:
!l1
!l2
which at first glance seems OK. But it's not. According to the code, each displayed string should be delimited on output by ! on both sides, yet the output only shows ! at the start of the line, not at the end. This is because in binary mode the \r before the \n is not translated but becomes part of the string that is read. On output, \r is a carriage return, which moves the cursor back to the start of the line, so the closing ! is written over the opening !. Yikes!
What about text mode if the line termination is just \n instead of \r\n?
No problem. This still works as expected. If there's no \r before the \n then no conversion takes place.
But what about \n\r in text mode? This is a problem. The \n is treated as line terminator but the following \r is treated as the first char of the next line!
So reading a text file in Windows in text mode is OK if the line terminator is either \n or \r\n.
PS. Why \r\n and not \n\r? This goes back to the age of mechanical teletypewriters. \r caused the mechanical head to move back to the left-hand side, which took time. If a char was output during this movement then it wasn't necessarily printed on the left as expected, but somewhere along the line. If a \r was followed by \n (line feed: physically advance the paper by 1 line) then the paper could be advanced during the time it took for the head to move to the left. So the char to be printed after \r\n was printed where expected. Using \n\r didn't work the same!