"how we can convert a line instruction to an exe file?"
I'm going to focus on this part, if I understand the question correctly.
"A line instruction" I assume is something like this:
int a;
int b = 5;
int c = 6;
a = b * c;
The "line" in this case is "a = b * c;"
This is assumed to be a multiplication of "b" times "c". The answer will be stored in "a".
The purpose of compiling this is to generate machine instructions. These are often loosely called assembler instructions, though strictly speaking assembler (assembly) is the human-readable notation for them. These are the instructions the CPU hardware recognizes.
If the CPU can perform multiplication (and that is not always true, some can't), this may be translated into machine code for the multiply instruction. That instruction may have requirements. It may require that one or both of the operands are stored in CPU registers. It may only require one of them to be in a register, and the other might come from memory.
The purpose of the compiler will be to figure out how to fashion those instructions.
They will be different for each CPU design. The instructions for ARM CPUs are different from those for PowerPC CPUs, which are different again from those for Intel/AMD CPUs.
Once the multiply instruction has executed, the compiled code still has to move the answer into the memory for "a", the result. This is usually a move (or store) instruction, which copies the answer from a CPU register into the memory for "a".
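To make that sequence concrete, here is a rough sketch in C++ of what the compiler's output for "a = b * c;" might look like. The instruction set (Load, Mul, Store), the two registers, and the function names are all made up for illustration; no real CPU uses exactly this, and a real compiler emits binary machine code, not C++ structs.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical instruction set, invented for illustration only.
enum class Op { Load, Mul, Store };

struct Instr {
    Op op;
    int reg;          // destination (Load, Mul) or source (Store) register
    int reg2;         // second operand register, used by Mul only
    std::string var;  // memory variable name, used by Load and Store
};

// What a compiler might emit for "a = b * c;" on this made-up machine.
std::vector<Instr> compile_a_equals_b_times_c() {
    return {
        {Op::Load,  0, 0, "b"},  // r0 <- memory[b]
        {Op::Load,  1, 0, "c"},  // r1 <- memory[c]
        {Op::Mul,   0, 1, ""},   // r0 <- r0 * r1
        {Op::Store, 0, 0, "a"},  // memory[a] <- r0
    };
}

// Stand-in for the CPU: executes each instruction in order.
void run(const std::vector<Instr>& program,
         std::map<std::string, std::int32_t>& memory) {
    std::int32_t regs[2] = {0, 0};
    for (const Instr& in : program) {
        switch (in.op) {
            case Op::Load:  regs[in.reg] = memory[in.var]; break;
            case Op::Mul:   regs[in.reg] *= regs[in.reg2]; break;
            case Op::Store: memory[in.var] = regs[in.reg]; break;
        }
    }
}
```

If memory starts with b = 5 and c = 6, calling run() on that program leaves 30 in "a". One line of source became four instructions: two loads, the multiply, and the store.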
This highly detailed process is just for a simple multiply instruction.
If the CPU does not have multiplication instructions (and some didn't in the past, some still don't), then the compiler may be required to generate instructions which perform multiplication using whatever primitive features the CPU does have.
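One common way to do that is shift-and-add, which needs only addition, bit shifts, and a bit test. The sketch below is written in C++ for readability; the function name is my own, and a compiler or runtime library would do the equivalent in machine instructions, possibly with a different algorithm.

```cpp
#include <cstdint>

// Multiply two unsigned numbers using only add, shift, and bit-test,
// the way it might be done on a CPU with no multiply instruction.
// The result wraps around on overflow, as unsigned arithmetic does.
std::uint32_t multiply_shift_add(std::uint32_t x, std::uint32_t y) {
    std::uint32_t result = 0;
    while (y != 0) {
        if (y & 1u) {     // lowest bit of y is set:
            result += x;  // add the current shifted copy of x
        }
        x <<= 1;          // x doubles each round
        y >>= 1;          // move on to the next bit of y
    }
    return result;
}
```

For example, multiply_shift_add(5, 6) gives 30: 6 is 110 in binary, so the loop adds 5 shifted left once (10) and 5 shifted left twice (20).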
Now, leaving that focus for a moment, there is a much larger study about parsing (what you called the Lexic step). Parsing is how the text is "read" and "understood" by the compiler. It is what involves, as you put it, discovering keywords, operators, etc.
To do that, one must fashion a type of grammar. That is its own computer science study.
The grammar takes care of handling all the ways humans might write code, but enforces the lexical demands of the language.
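As a small taste of the very first part of that work, here is a minimal sketch of a tokenizer in C++. It only splits text into identifiers, numbers, and single-character symbols. The names (TokenKind, Token, tokenize) are my own, keywords such as "int" are not treated specially, and a real compiler front end does far more, including checking the tokens against the grammar.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class TokenKind { Identifier, Number, Symbol };

struct Token {
    TokenKind kind;
    std::string text;
};

// Split source text into tokens, skipping whitespace.
std::vector<Token> tokenize(const std::string& source) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < source.size()) {
        unsigned char ch = static_cast<unsigned char>(source[i]);
        if (std::isspace(ch)) {
            ++i;
            continue;
        }
        std::size_t start = i;
        if (std::isalpha(ch) || ch == '_') {
            // Identifier: a letter or underscore, then letters, digits, underscores.
            while (i < source.size() &&
                   (std::isalnum(static_cast<unsigned char>(source[i])) ||
                    source[i] == '_')) {
                ++i;
            }
            tokens.push_back({TokenKind::Identifier, source.substr(start, i - start)});
        } else if (std::isdigit(ch)) {
            // Number: a run of decimal digits.
            while (i < source.size() &&
                   std::isdigit(static_cast<unsigned char>(source[i]))) {
                ++i;
            }
            tokens.push_back({TokenKind::Number, source.substr(start, i - start)});
        } else {
            // Anything else is treated as a one-character symbol.
            ++i;
            tokens.push_back({TokenKind::Symbol, source.substr(start, 1)});
        }
    }
    return tokens;
}
```

Given the text "a = b * c;", this produces six tokens: a, =, b, *, c and the semicolon. The parser then has to decide whether that sequence is a legal statement.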
"now i'm confused, that's why i convert the new language to C++..."
I don't really understand this text. I have no idea what "why I convert the new language to C++" means.
My point here is that this is a complex, deep subject of many levels.
There are books on the subject, and no post will do the subject justice.
What I've tried to do is introduce the concept of machine language to you. The CPU has its own language, but it is extremely primitive.
The CPU is an electronic version of simple mechanical devices.
Imagine you have two gears. One gear has 10 teeth. The second has 30 teeth. When put together, the small gear must turn 3 times to move the large gear one rotation.
This is a mechanical divider. If you count the rotations of the large gear, as a result of turning the small gear, this mechanical device performs division.
If, instead, you turn the large gear, the small gear turns 3 times for every rotation of the large gear, making this a multiplication device.
The circuits of a CPU are related to this kind of simple, mechanical idea. They are more complex, of course, and work in base 2 (or binary math).
That said, the machine instructions are not so much, really, a language - but a way of firing specific circuits in the CPU that do the various primitive functions the CPU can perform.
To make a compiler, one must know those primitive features: what the CPU can do, and which instructions invoke each of those features. That is the last stage of the work, and appears to be what you refer to when you post "I'm confused".