gettoken

I have trouble understanding this gettoken function. A token is a name, a pair of parentheses, a pair of brackets perhaps including a number, or any other single character. The function is used when parsing a declarator.

int gettoken(void)   /* return next token */
{
    int c, getch(void);
    void ungetch(int);
    char *p = token;

    while ((c = getch()) == ' ' || c == '\t')
        ;                          /* skip blanks and tabs */
    if (c == '(') {
        if ((c = getch()) == ')') {
            strcpy(token, "()");
            return tokentype = PARENS;
        } else {
            ungetch(c);            /* not ')': put it back for the next call */
            return tokentype = '(';
        }
    } else if (c == '[') {
        for (*p++ = c; (*p++ = getch()) != ']'; )
            ;
        *p = '\0';
        return tokentype = BRACKETS;
    } else if (isalpha(c)) {
        for (*p++ = c; isalnum(c = getch()); )
            *p++ = c;              /* collect letters and digits */
        *p = '\0';
        ungetch(c);                /* first character after the name */
        return tokentype = NAME;
    } else
        return tokentype = c;
}



I don't understand the two ungetch calls (one in the '(' branch, one after the loop that reads a name). How do they work here?
It looks like that code uses the header <conio.h>, which is non-standard and thus cannot be guaranteed to exist or to behave consistently across all systems. Use at your own risk.

In the above code, the while loop at the top reads characters from the input buffer. Spaces and tabs are ignored. The first character that is anything else is kept in c and decides which branch runs. In the '(' branch, one more character is read to check for ')'. If it is something else, the program does not want to lose it. Thus it uses the ungetch to put the character back into the buffer, before going on to process the input.

In the name branch, characters are read and stored in the character string pointed to by p for as long as they are alphanumeric. The first character that is not alphanumeric is put back into the buffer so that it does not get lost.


Reference:
http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/devwin32/getch_xml.html
http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/devwin32/ungetch_xml.html

I don't know about these ungetch

It puts a character back into the input buffer as if it had never been read.
http://stackoverflow.com/questions/476615/what-is-the-purpose-of-ungetc-or-ungetch-from-kr
ungetch is a non-standard extension.

The supported way to do this is through the use of C++ istream::unget
http://www.cplusplus.com/reference/istream/istream/unget/
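Since the code in question is C rather than C++, the closest standard counterpart is ungetc from <stdio.h>. Here is a minimal sketch of a peek built from getc and ungetc. The helper name peek_char is made up for illustration; it is not part of any library.

```c
#include <stdio.h>

/* Hypothetical helper: return the next character of fp without
   consuming it, or EOF if the stream is at its end. */
int peek_char(FILE *fp)
{
    int c = getc(fp);      /* take the character out of the stream */

    if (c != EOF)
        ungetc(c, fp);     /* push it back so the next read sees it again */
    return c;
}
```

The C standard only guarantees one character of pushback per stream, which is all a peek needs.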

I think conio.h is a Windows header so a Windows compiler should know where to find it.

No, I don't think the conio header is used here at all.

I'm still somewhat puzzled about this code.

Thus it uses the ungetch to put the character back into the buffer, before going on to process the input.

I still don't understand. Why?





Back at the dawn of time, in 1982, conio.h was a header provided with the first C compiler for the IBM PC. It provided some helpful input/output functions for the PC DOS terminal on the 8 bit PCs of the day.

Borland, when they pushed out Turbo C in 1987, decided to provide the same (and some extra) functions in a header of the same name.



Since then, and to this day over three decades later, people are still passing around code from the eighties and complaining that their modern C++14 compiler on their 64 bit graphical operating system doesn't emulate the C compiler provided for an 8 bit DOS prompt from 1982. Every time the OS and the compiler advance, someone somewhere tortuously reimplements the interface (and in doing so manages to get a modern terminal emulating a terminal from 1982) so that people can keep passing around C code from the eighties.

It's particularly popular in the Indian education system where the people teaching C++ learned to code using Turbo C++ in 1990, and never learned anything after that; they continue to teach people how to use Turbo C++ in the year 1990, with heavy use of conio.h.

It's non-standard, and if you want to use it, you'll have to find an implementation for your particular operating system. I usually advise people to stop using it and instead use a modern terminal library, or whatever their operating system provides.
Thus it uses the ungetch to put the character back into the buffer, before going on to process the input.


I still don't understand. Why?

Think of it this way. What the program really needs to do is to take a sneak peek at the next character. Instead, it extracts the character, takes a look at it and puts it back unchanged.
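If the code comes from K&R, as the Stack Overflow link above suggests, then getch and ungetch are not library functions at all. The book defines them itself, on top of a small pushback buffer, roughly like this (a sketch from memory, so treat the details as an approximation):

```c
#include <stdio.h>

#define BUFSIZE 100

static char buf[BUFSIZE];  /* characters that were pushed back */
static int bufp = 0;       /* next free position in buf */

/* Get a character: a pushed-back one if there is any, else fresh input. */
int getch(void)
{
    return (bufp > 0) ? buf[--bufp] : getchar();
}

/* Push a character back so that the next getch() returns it. */
void ungetch(int c)
{
    if (bufp >= BUFSIZE)
        printf("ungetch: too many characters\n");
    else
        buf[bufp++] = c;
}
```

So "putting back" just means storing the character in buf. The next getch call looks there first and only falls through to getchar when the buffer is empty.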

Topics like this are not necessarily easy to understand in the abstract. If you write down some actual input which this code should process, you should be able to step through the code (on paper) one instruction at a time, to see how it is working.
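To step through it on a machine instead of on paper, here is a sketch that feeds a fixed string to gettoken and records each token. The names src, pushed and trace are made up for this example, getch reads from the string instead of the keyboard, and only one character of pushback is kept, which is all gettoken needs.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

enum { NAME, PARENS, BRACKETS };

static const char *src = "";   /* hypothetical input source */
static int pushed = EOF;       /* at most one pushed-back character */
static char token[100];
static int tokentype;

static int getch(void)
{
    if (pushed != EOF) {       /* hand out the pushed-back character first */
        int c = pushed;
        pushed = EOF;
        return c;
    }
    return *src ? (unsigned char)*src++ : EOF;
}

static void ungetch(int c)
{
    pushed = c;
}

/* Same logic as the function under discussion. Like the original,
   it assumes every '[' is eventually followed by ']'. */
static int gettoken(void)
{
    int c;
    char *p = token;

    while ((c = getch()) == ' ' || c == '\t')
        ;
    if (c == '(') {
        if ((c = getch()) == ')') {
            strcpy(token, "()");
            return tokentype = PARENS;
        }
        ungetch(c);
        return tokentype = '(';
    } else if (c == '[') {
        for (*p++ = c; (*p++ = getch()) != ']'; )
            ;
        *p = '\0';
        return tokentype = BRACKETS;
    } else if (isalpha(c)) {
        for (*p++ = c; isalnum(c = getch()); )
            *p++ = c;
        *p = '\0';
        ungetch(c);
        return tokentype = NAME;
    }
    return tokentype = c;
}

/* Tokenize s and append one "TYPE:text " entry per token to out. */
void trace(const char *s, char *out, size_t outsize)
{
    char line[120];

    src = s;
    pushed = EOF;
    out[0] = '\0';
    while (gettoken() != EOF) {
        switch (tokentype) {
        case NAME:     snprintf(line, sizeof line, "NAME:%s ", token); break;
        case PARENS:   snprintf(line, sizeof line, "PARENS:%s ", token); break;
        case BRACKETS: snprintf(line, sizeof line, "BRACKETS:%s ", token); break;
        default:       snprintf(line, sizeof line, "CHAR:%c ", tokentype); break;
        }
        strncat(out, line, outsize - strlen(out) - 1);
    }
}
```

For the input (*pf)()[13] the trace lists '(' as a single character, then '*', the name pf, ')', the token "()" and finally "[13]". The two interesting moments are the '*' after the first '(' and the ')' after pf: both were read one step too early and only show up as tokens because ungetch put them back.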
I realize this is not rocket science.
I think I got it now. getch delivers the next input character to be considered. Once that number (or anything else) is finished, ungetch pushes the unwanted character back onto the input, so we can read it later (on the next call to getch).
Do you have to read the input char by char with getch()? Can't you just read the whole input into a string before parsing it?
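For what it is worth, here is a rough sketch of how the whole-string approach could look for the name case. The function scan_name and its parameters are invented for illustration. With the input already in memory, looking at *s does not consume anything, so no ungetch is needed.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: skip blanks and tabs, then copy the next name
   (a letter followed by letters and digits) from *sp into out and
   advance *sp past it. Returns the length of the name, 0 if there is
   none. outsize must be at least 1; longer names are cut off and the
   rest stays in the input. */
size_t scan_name(const char **sp, char *out, size_t outsize)
{
    const char *s = *sp;
    size_t n = 0;

    while (*s == ' ' || *s == '\t')
        s++;
    if (isalpha((unsigned char)*s)) {
        /* *s is only inspected here; s moves on only for kept characters */
        while (isalnum((unsigned char)*s) && n + 1 < outsize)
            out[n++] = *s++;
    }
    out[n] = '\0';
    *sp = s;
    return n;
}
```

The trade-off is that the complete input has to be available up front, whereas the getch version can work on input as it arrives.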