This is a page for the C and assembler on Linux class, Tuesdays at 5:30 PM in Church.

Here's a write-up that covers the first half of
the first C on Linux class that I gave last Tuesday,
20120619, in the Church classroom from 5:30 to 7 PM.
I hope to write up the balance of last Tuesday's
class before the weekend's out.
Note the To: list, please; if you know of anyone
who's missing, please let them and me know.
Complaints, suggestions, sarcasms, all are welcome.

jim
415 823 4590 my cellphone, call anytime

Learning C programming on Linux

* The C programming language is a specification that defines keywords,
operators, and rules of syntax.
This may sound stupidly obvious or like useless knowledge, but if you
really get into using C, you may find that it's a practical
concept: useful, intelligently obvious.

* A C compiler is a software program that implements the C specification:
parser, keywords, operators, syntax rules.
The practical point of this idea is that there are different C
compilers for different machines and for different purposes. If you're
just starting to learn C, this idea will seem pretty nearly as useless
as the idea that C is a specification.

The tools you use to write C programs include, at minimum, an editor
and a C compiler. Many more tools are available, such as debuggers
and profilers.

The process you follow is to use a text editor to write some ASCII
text that complies with the rules of the C language, then use a C
compiler to read your ASCII file and create a new file that contains
executable machine code.
Look for error messages generated by the C compiler. If there are any,
even one, the compiler does not make an executable file; you have to
fix all errors. You may also see warning messages, which indicate that
the compiler found one or more things that are questionable but not
bad enough to stop it. By default, warnings alone do not prevent the
compiler from making the executable file.
If you get an executable file, run it and see if it works as you
expect. If it does, you probably won't learn anything more from this
exercise. If it doesn't, you get to learn about runtime and logic
errors: you wrote a program that is correct according to the C language
but incorrect in terms of implementing what you hoped it would do.

The following commands exemplify the process using a bash shell:
$ vi myfile.c
$ gcc myfile.c
$ ls
a.out  myfile.c
$ chmod 755 a.out    # usually not needed: gcc already marks a.out executable
$ ./a.out

You use a text editor such as vi to create a file of text that
conforms to the rules of the C specification.

You run the C compiler so that it reads what you wrote. The C
compiler sees your program file as an ASCII character stream that it
interprets as a token stream.
So, what is a "token"? A token is one or more ASCII characters that
the compiler treats as a single meaningful unit. To compare with the
English language, think of a token as a word, a word ending, a
punctuation mark, or some other element that's meaningful.

The C compiler is a software program that conforms to a particular
design: the design for interpreters and compilers. Generally, any
compiler or interpreter includes an input stage that parses the incoming
ASCII (token) stream, a set of keywords and operators that are reserved
sequences of ASCII characters, and a set of rules that the compiler
applies to the tokens it reads.
When the compiler begins, it sets itself to a neutral state, which
is to say that it will examine the first ASCII characters to verify that
it can parse them as a stream of tokens.
When the compiler identifies the first token, it verifies that the
token is of a class that can be a first token and then resets its own
state so that the following token must be one of a limited
set of tokens. For example:
1+2
The compiler reads the 1 and then the '+' character, at which point
it determines that it has at least one valid token: 1. The compiler
continues reading and sees the 2 and determines that it now has two
tokens, 1 and '+'. The 1 token is integer data whose value is 1. The
'+' token, because it occurs between the 1 and the 2, represents the
addition operator. The compiler continues reading, finds only
whitespace, and is then able to identify the ASCII stream as a set of
three tokens (a value, an operator, and a value) that together form an
expression.
An expression is at least one operand and zero or more operators
that resolve to a single value.
The compiler resolves the expression 1+2 to a single value of 3.
If you write a C program that is exactly 1+2 and nothing else, it's
very likely your compiler will generate an error message (remember, a
compiler implements the C programming language specification, and does
so in its own way; the C specification is deliberately permissive in
some aspects of implementation).
If you get an error message, very likely it will be a complaint that
there's not a complete statement, or that there's a problem at the end
of the file, or some such.

The C compiler is designed to read statements. A statement is a set
of valid tokens that follow the rules of the C programming language and
end with a statement termination character, which is the ; character.
Try revising your program to read
1+2;
The 1+2 is an expression: the C compiler sees 1 followed by +
followed by 2 and verifies that this is a valid sequence of tokens that
makes an expression. It interprets the ; character as a statement
terminator, which means the compiler creates the machine code for the
expression and resets itself to a neutral state, ready to read the next
statement (ASCII character stream of valid tokens).
Whether this compiles depends on where the statement sits. A compiler
such as gcc rejects a statement that stands alone at the top of a file,
because C requires statements to appear inside a function body. Placed
inside a function such as main, the statement compiles, possibly with a
warning that it has no effect, and the compiler makes a new file
named a.out.
You may think that the compiler would leave the 1+2 in the file as data
and machine instructions that the CPU runs to create the sum, 3.
Instead, the compiler very likely does the addition itself while
compiling and works with the value 3 from then on. That the compiler
does the arithmetic before the program ever runs is a matter of
optimization.

The C compiler generally runs in four different phases:
1 preprocessor
2 compiler
3 optimizer
4 linker
(With gcc, optimization is part of the compile step, and an assembler
runs between compiling and linking.)

Consider the program:
1+2;
The preprocessor runs and sees nothing to do.
The compiler runs and translates the ASCII to data and machine code,
which is ultimately a set of 1 bits and 0 bits that represent integer 1,
integer 2, and the operation of addition.
The optimizer recognizes that this expression can be resolved now
without doing any harm to any other parts of the program, so the
optimizer replaces the code with the integer value of 3.
The linker runs and finds nothing of yours to combine: there is no
other module to which to link this one.

Consider the following program:
1+2
3 + 4 ;
How many statements do you see? How many expressions? How many
tokens?

There is a single ; and so at most a single statement, which contains
two expressions and a total of seven tokens: 1, +, 2, 3, +, 4, and ;
(we're not counting the space characters or the newline characters).
Note that the C compiler sees 1+2 and 3 + 4 identically: two
expressions that add two integer values together. The spaces make no
difference, and neither does the newline: it does not end the first
expression the way the ; ends a statement.
Because nothing joins the two expressions, a compiler such as gcc
reports a syntax error when it reaches the 3. Give each expression its
own ; and the optimizer pass very likely resolves them to 3 and 7.
Note that the program does nothing with the 3 and the 7.
Because no machine operation ever uses them, the optimizer of your
compiler may eliminate the values altogether, which is what a compiler
such as gcc typically does. If you want a file that carries data to
link to one or more other programs that you'll write at some time, the
data has to be stored in a variable.

The discussion so far includes the terms ASCII stream, token stream,
values, operands, operators, expressions, statements, and the four
compiler passes: preprocessor, compiler, optimizer, and linker.