|
Original poster
Posted at:
15:57:15 2016-01-04
First, a tribute to Dennis Ritchie, the father of the C language!
Today almost every practical compiler or interpreter (both called "compiler" below) is written in C. Some languages, such as Clojure and Jython, are built on the JVM, that is, implemented in Java, and IronPython is built on .NET. But Java and C# themselves depend on C/C++ for their implementation, so in the end they call C indirectly. Discussing the portability of a high-level language therefore comes down to discussing the portability of ANSI/ISO C.

C is a very low-level language and resembles assembly language in many respects. The book "Intel32 assembly language program design" even shows how to translate simple C into assembly by hand. For system software such as a compiler, writing it in C is only natural. Even a high-level language like Python still depends on C underneath. (I pick Python as the example because hackers at Intel have been trying to make Python run without an operating system; what that actually removes is a one-time layer of C code above the BIOS.)

Nowadays a student who has studied compiler theory and has some programming ability can write a simple compiler for a C-like language. But have you ever wondered: if we all write compilers in C or in C-based languages, how was the world's first C compiler written? Isn't this a "chicken and egg" problem?

Let us review the history of C. Around 1970, Thompson and Ritchie developed the B language on the basis of BCPL, and by 1973 they had developed the C we know today on the basis of B. Before C became the system programming language, Thompson had already used B for systems programming on early Unix. So by the time C was being implemented, B was already in practical use.
So the first prototype of the C compiler may well have been written in B, or in a mix of B and PDP assembly language. We now know that B was relatively inefficient. But writing everything in assembly language would mean a long development cycle and difficult maintenance and, worse, would lose the portability that a high-level language needs. So the early C compiler took a clever shortcut: first write a compiler for a subset of C in assembly language, then use that subset to derive the complete C compiler step by step. The detailed procedure is as follows:

1. Create a subset of C that has only the most basic features and call it C0. C0 is simple enough that its compiler can be written directly in assembly language.
2. Building on the features C0 already has, design C1, which is more complex than C0 but still not complete C. C0 is a subset of C1 and C1 is a subset of C. Use C0 to write the C1 compiler.
3. On the basis of C1, design C2, another subset of C that is more complex than C1 but still not complete C, and write the C2 compiler in C1.
4. Continue in this way until CN, which is powerful enough to write a compiler for the complete C language.

How large N is depends on the complexity of the target language (here, C) and on the programmer's ability. Simply put, once you reach a subset whose existing features make it convenient to implement all of C, you have found N. The process can be pictured as the following layers, from top to bottom:

C language
CN language
......
C0 language
assembly language
machine language

Doesn't this look a little familiar?
Right, it is the layering we have seen with virtual machines, only here it is a CVM (C language virtual machine). The language at each virtual layer can be compiled independently and, except for the C layer, the output of each layer serves as the input of the next layer (the output of the last layer is the application). It works just like rolling a snowball: pack a handful of snow together by hand (assembly language), roll it along little by little, and it grows into a big snowball. This is probably what is meant by "0 begets 1, 1 begets C, C begets all things"?

So what is the theoretical basis for this bold method of simplifying by subsets, and how is it carried out? Here we need to introduce a concept, "self-compilation" (Self-Compile). It applies to certain strongly typed programming languages with clear bootstrapping properties. (Strongly typed here means that every variable in a program must have a declared type before it can be used, as in C; by contrast, some scripting languages have no such notion of declared types.) For these languages, a finite subset can express the language itself through a finite number of recursive steps. C, Pascal and Ada are such languages. As for why they can self-compile, see "Compiler Principles" from Tsinghua University Press, which implements a compiler for a Pascal subset.

In short, computer scientists have already shown that, in theory, a complete C compiler can be built with the CVM method described above. So how is it done in practice? The following are the C99 keywords:

auto enum restrict unsigned
break extern return void
case float short volatile
char for signed while
const goto sizeof _Bool
continue if static _Complex
default inline struct _Imaginary
do int switch
double long typedef
else register union

A total of 37.
Look closer and you will see that many of these keywords exist to help the compiler optimize, and others define the scope, linkage or lifetime of variables and functions. An early compiler implementation does not need them, so we can drop auto, restrict, extern, volatile, const, sizeof, static, inline, register and typedef. This gives a subset of C, the C3 language, whose keywords are:

unsigned enum return void
break float short case
for signed while char
_Bool goto if _Complex
continue struct _Imaginary default
int switch do long
double union else

A total of 27.

Thinking again, C3 still has many types and type modifiers that do not need to be added all at once. For example, of the three integer types it is enough to implement int. So we go on to remove these keywords: unsigned, float, short, char (a char is an int), signed, _Bool, _Complex, _Imaginary and long. This gives our C2 language, whose keywords are:

enum return void break
case while for goto
if continue struct default
int switch do double
union else

A total of 18.

Thinking further, even a C with only 18 keywords still has many advanced features, such as composite data structures built on the basic data types. Also, our keyword tables do not list operators. Overly flexible expressions such as the compound assignment operators, ->, ++ and -- can be dropped entirely at this stage. So we can remove the keywords enum, struct and union, which gives the C1 keywords:

return void break case
while for goto if
continue default int switch
do double else

A total of 15. Close to perfect, but the last step has to be bolder still.
This time arrays and pointers go as well. C1 also still has a lot of redundancy: loops and branches can each be written in several ways, and these can be reduced to one. Specifically, there are three kinds of loop, while, do...while and for, and keeping while is enough. Branching has four forms: if...{}, if...{}...else, if...{}...else if..., and switch. All of them can be built from two or more if...{} statements, so keeping if...{} is enough. But think once more: a branch or a loop is just a conditional jump, and a function call is just a stack push plus a jump, so all we really need is goto (an unrestricted goto). So we boldly remove all the structured keywords, and even functions, and the C0 keywords are:

void break goto int double

A total of 5. This is the ultimate in simplicity.

With only 5 keywords, the language can be implemented quickly and entirely in assembly language. Through this reverse analysis we have reconstructed how the first C compiler was written, and we can also appreciate the wisdom and hard work of the scientists who came before us! We are all just dust on the shoulders of giants! 0 begets 1, 1 begets C, C begets all things. How wonderful! |
Replies: 9
|
|
#1
Score: 0
Reply to:
16:03:47 2016-01-04
0 begets 1, 1 begets C, C begets all things. How wonderful!
|
|
|
#2
Score: 0
Reply to:
16:12:20 2016-01-04
Thanks. I came across this article and thought it was pretty good, so I am sharing it with everyone.
|
|
#3
Score: 0
Reply to:
16:15:18 2016-01-04
KG top!!!!!
|
|
|
|
#4
Score: 0
Reply to:
16:14:59 2016-01-04
!!! 0 begets all things!
|
|
|
#5
Score: 0
Reply to:
16:15:11 2016-01-04
Just so-so
|
|
|
#6
Score: 0
Reply to:
16:17:26 2016-01-04
Quite interesting.......
|
|
|
#7
Score: 0
Reply to:
18:57:59 2016-01-04
Good
|
|
|
#8
Score: 0
Reply to:
10:23:48 2016-01-05
How was the first DNA created by God?
|
|
#9
Score: 0
Reply to:
13:53:53 2016-01-05
Look a little deeper and a compiler is a mathematical problem. Define a core spec, for example only the int type, only the simple operators + - * / & ^ |, and only if and goto for control flow, and constructing an LR parser for it by hand at the start is not much trouble. From there it is easy to bootstrap upward.
The first C compiler was probably written in B, right? After all, a high-level language was already available by then. |
|
|