Table of Contents
Compilation in general is split into roughly 5 stages: Preprocessing, Parsing, Translation, Assembling, and Linking.
All 5 stages are implemented by one program in UNIX, namely cc, or in our case, gcc (or g++). The general order of things goes gcc -> gcc -E -> gcc -S -> as -> ld.
Under Windows, however, the process is a bit more obfuscated, but once you delve under the MSVC++ front end, it is essentially the same. Also note that the GNU toolchain is available under Windows, through both the MinGW project as well as the Cygwin Project and behaves the same as under UNIX. Cygwin provides an entire POSIX compatibility layer and UNIX-like environment, where as MinGW just provides the GNU buildchain itself, and allows you to build native windows apps without having to ship an additional dll. Many other commercial compilers exist, but they are omitted for space.
Despite their seemingly disparate approaches to the development environment, both UNIX and Windows do share a common architectural back-end when it comes to compilers (and many many other things, as we will find out in the coming pages). Executable generation is essentially handled end-to-end on both systems by one program: the compiler. Both systems have a single front-end executable that acts as glue for essentially all 5 steps mentioned above.
gcc is the C compiler of choice for most UNIX. The program gcc itself is actually just a front end that executes various other programs corresponding to each stage in the compilation process. To get it to print out the commands it executes at each step, use gcc -v
cl.exe is the back end to MSVC++, which is the the most prevalent development environment in use on Windows. You'll find it has many options that are quite similar to gcc. Try running cl -? for details.
The problem with running cl.exe outside of MSVC++ is that none of your include paths or library paths are set. Running the program vsvars32.bat in the CommonX/Tools directory will give you a shell with all the appropriate environment variables set to compile from the command line. If you're a fan of Cygwin, you may find it more comfortable to cut and paste vsvars32.bat into cygwin.bat.
The preprocessor is what handles the logic behind all the # directives in C. It runs in a single pass, and essentially is just a substitution engine.
gcc -E runs only the preprocessor stage. This places all include files into your .c file, and also translates all macros into inline C code. You can add -o file to redirect to a file.
The parsing and translation stages are the most useful stages of the compiler. Later in this book, we will use this functionality to teach ourselves assembly, and to get a feel for the type of code generated by the compiler under certain circumstances. Unfortunately, the UNIX world and the Windows world diverge on their choice of syntax for assembly, as we shall see in a bit. It is our hope that exposure to both of these syntax methods will increase the flexibility of the reader when moving between the two environments. Note that most of the GNU tools do allow the flexibility to choose Intel syntax, should you wish to just pick one syntax and stick with it. We will cover both, however. (FIXME: Should we?)
gcc -S will take .c files as input and output .s assembly files in AT&T syntax. If you wish to have Intel syntax, add the option -masm=intel. To gain some association between variables and stack usage, use add -fverbose-asm to the flags.
gcc can be called with various optimization options that can do interesting things to the assembly code output. There are between 4 and 7 general optimization classes that can be specified with a -ON, where 0 <= N <= 6. 0 is no optimization (default), and 6 is usually maximum, although oftentimes no optimizations are done past 4, depending on architecture and gcc version.
There are also several fine-grained assembly options that are specified with the -f flag. The most interesting are -funroll-loops, -finline-functions, and -fomit-frame-pointer. Loop unrolling means to expand a loop out so that there are n copies of the code for n iterations of the loop (ie no jmp statements to the top of the loop). On modern processors, this optimization is negligible. Inlining functions means to effectively convert all functions in a file to macros, and place copies of their code directly in line in the calling function (like the C++ inline keyword). This only applies for functions called in the same C file as their definition. It is also a relatively small optimization. Omitting the frame pointer (aka the base pointer) frees up an extra register for use in your program. If you have more than 4 heavily used local variables, this may be rather large advantage, otherwise it is just a nuisance (and makes debugging much more difficult).
NOTE | |
---|---|
Since some of these get turned on by default in the higher optimization classes, it is useful to know that despite the fact that the manual page does not mention it explicitly, all of the -f options have -fno- equivalents. So -fno-inline-functions prevents function inlining, regardless of the -O option. If you use -fverbose-asm, a non-inclusive list of compiler options is now printed at the top of the assembly output file. An annoying nuisance with gcc-3.x is that it enables many optimizations even at the -O0 level, making it difficult to generate hand-tuned asm from C. You can turn these off one by one using the above mentioned -fno- switch, however. Also one can write inline assembly to make sure that gcc will generate the code desired, but this should not be the preferred approach. |
Likewise, cl.exe has a -S option that will generate assembly, and also has several optimization options. Unfortunately, cl does not appear to allow optimizations to be controlled to as fine a level as gcc does. The main optimization options that cl offers are predefined ones for either speed or space. A couple of options that are similar to what gcc offers are:
-Ob<n> - inline functions (-finline-functions)
-Oy - enable frame pointer omission (-fomit-frame-pointer)
FIXME: Play with these.
The assembly stage is where assembly code is translated almost directly to machine instructions. Some minimal preprocessing, padding, and instruction reordering can occur, however. We won't concern ourselves with that too much, as it will become visible during disassembly, which is covered in the section Know Your Compiler
as is the GNU assembler. It takes input as an AT&T or Intel syntax asm file and generates a .o object file.
Both Windows and UNIX have similar linking procedures, although the support is slightly different. Both systems support 3 styles of linking, and both implement these in remarkably similar ways.
Static linking means that for each function your program calls, the assembly to that function is actually included in the executable file. Function calls are performed by calling the address of this code directly, the same way that functions of your program are called.
Dynamic linking means that the library exists in only one location on the entire system, and the operating system's virtual memory system will map that single location into your program's address space when your program loads. The address at which this map occurs is not always guaranteed, although it will remain constant once the executable has been built. Functions calls are performed by making calls to a compile-time generated section of the executable, called the Procedure Linkage Table, PLT, or jump table, which is essentially a huge array of jump instructions to the proper addresses of the mapped memory. These structures will be discussed in Chapter 8, Executable formats and also in the Code Modification Chapter. (FIXME: Verify PLT on windows)
Runtime linking is linking that happens when a program requests a function from a library it was not linked against at compile time. The library is mapped with dlopen() under UNIX, and LoadLibrary() under Windows, both of which return a handle that is then passed to symbol resolution functions (dlsym() and GetProcAddress()), which actually return a function pointer that may be called directly from the program as if it were any normal function. This approach is often used by applications to load user-specified plugin libraries with well-defined initialization functions. Such initialization functions typically report further function addresses to the program that loaded them.
ld is the GNU linker. It will generate a valid executable file. If you link against shared libraries, you will want to actually use what gcc calls, which is collect2. FIXME: Watch gcc -v for flags
This is the MSVC++ linker. Normally, you will just pass it options indirectly via cl's -link option. However, you can use it directly to link object files and .dll files together into an executable. For some reason though, Windows requires that you have a .lib (or a .def) file in addition to your .dlls in order to link against them. The .lib file is only used in the interim stages, but the location to it must be specified on the -LIBPATH: option.