I have a parsing problem to do with stripping whitespace. I'm saving data for an application in JSON format, and I'm writing my own "schema" parser for JSON (which is why I'm not just using an existing one), so the parser isn't very extensive.
The problem I've run into is how to deal with whitespace. At first I was just calling "consume whitespace" wherever whitespace is valid, but that seemed like a horribly inefficient way to do it.
So I changed it to cull all of the whitespace in one go at the start. That went well until I tried a large, complex schema: the parser stumbled over something halfway through the file and gave me an error it shouldn't have. Having a line number was rather essential to figuring out where the problem was, but wait! I had already stripped out the whitespace, so I had no way to tell where in the original file a given character came from.
Now, I could resort to hacks such as deleting all the whitespace in the schema, but I'd rather have a nice parser than a hack that gives no proper feedback on bad input, even if the hack would let me solve the current error.
Is there some straightforward, ingenious way that I'm missing to track the current character/line number while still stripping whitespace at the start of the parse? Or will it probably be as tricky as I think, and no more efficient than stripping whitespace during the parse?
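To make the question concrete, the only approach I can picture for keeping positions with up-front stripping is a side table that remembers where each surviving character came from. Below is a rough sketch of that idea, not code from my parser: `SourcePos`, `StrippedInput` and `stripWhitespace` are names made up for illustration, positions are 1-based, and I'm assuming whitespace inside string literals has to be kept.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper types, invented for this sketch.
struct SourcePos {
    std::size_t line;   // 1-based line in the original input
    std::size_t column; // 1-based column in the original input
};

struct StrippedInput {
    std::string text;              // input with insignificant whitespace removed
    std::vector<SourcePos> origin; // origin[i] is where text[i] came from
};

// Remove whitespace outside string literals, remembering where each kept
// character came from so an error can still report a line and column.
StrippedInput stripWhitespace(const std::string& src) {
    StrippedInput out;
    std::size_t line = 1, column = 1;
    bool inString = false; // currently inside a "..." literal
    bool escaped = false;  // previous character was a backslash inside a string

    for (char c : src) {
        const bool isSpace = (c == ' ' || c == '\t' || c == '\n' || c == '\r');
        if (inString || !isSpace) {
            out.text.push_back(c);
            out.origin.push_back({line, column});
        }

        // Track string state so spaces inside literals are preserved.
        if (inString) {
            if (escaped) escaped = false;
            else if (c == '\\') escaped = true;
            else if (c == '"') inString = false;
        } else if (c == '"') {
            inString = true;
        }

        // Advance the position in the original input.
        if (c == '\n') { ++line; column = 1; }
        else { ++column; }
    }
    return out;
}
```

With this, an error at index `i` of the stripped text could be reported as `origin[i].line` and `origin[i].column`. The catch is that the stripping pass still has to look at every character and now also stores a position per kept character, which is part of why I doubt it beats skipping whitespace during the parse.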
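//my character stream class, basic 1-character lookahead stream
class ParserInput {
public:
    void setStream(...); //initialize from whatever input source
    void consume() { //go to next character
        //how I would implement line/char tracking without whitespace stripping
        if (cur() == '\n') {
            ++currentLine;
            currentChar = 0;
        } else {
            ++currentChar;
        }
        charPtr++;
    }
    char cur(); //current char
    char peek(); //lookahead
    //plus lots of convenience functions built on the ones above
private:
    //state used by consume(); the exact types here are only illustrative
    const char* charPtr = nullptr; //position in the input
    int currentLine = 0; //line of the current character
    int currentChar = 0; //offset of the current character within its line
};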
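For comparison, here is a self-contained sketch of the "skip whitespace during the parse" variant with position tracking built in. It is a simplified stand-in for my class, not the real thing: `TrackingInput`, `skipWhitespace`, `line()` and `column()` are names invented for the example, it reads from a `std::string` rather than an arbitrary source, and it counts lines and columns from 1.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical, simplified variant of the stream class above.
class TrackingInput {
public:
    explicit TrackingInput(std::string text) : text_(std::move(text)) {}

    bool atEnd() const { return pos_ >= text_.size(); }

    // Current character, or '\0' at end of input.
    char cur() const { return atEnd() ? '\0' : text_[pos_]; }

    // One-character lookahead, or '\0' if there is none.
    char peek() const { return pos_ + 1 < text_.size() ? text_[pos_ + 1] : '\0'; }

    // Advance one character, keeping line and column in step.
    void consume() {
        if (atEnd()) return;
        if (text_[pos_] == '\n') { ++line_; column_ = 1; }
        else { ++column_; }
        ++pos_;
    }

    // Skip whitespace through consume(), so positions stay correct.
    void skipWhitespace() {
        while (!atEnd() && isSpace(cur())) consume();
    }

    std::size_t line() const { return line_; }
    std::size_t column() const { return column_; }

private:
    static bool isSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    std::string text_;
    std::size_t pos_ = 0;
    std::size_t line_ = 1;   // 1-based line of cur()
    std::size_t column_ = 1; // 1-based column of cur()
};
```

Because `skipWhitespace()` goes through `consume()`, the reported position always refers to the original file, and the cost per whitespace character is a comparison and an increment.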