
Exploring the GNU Compiler Collection and the GNU Binutils


The GNU Compiler Collection (GCC) includes front ends for C, C++, Objective-C, Fortran, Ada, Go, and D, as well as libraries for these languages. GNU Binutils is a collection of binary tools such as ld (the GNU linker), as (the GNU assembler), and gold (a newer, faster, ELF-only linker). Together they form the main toolchain for building ELF (Executable and Linkable Format) executables on a Linux system.

  • preprocessing

Before the actual compilation, the code goes through a preprocessing stage, which handles macro expansion, file inclusion, and conditional compilation. The output shows how the code has been transformed before it gets compiled. Running gcc -E stops after this stage and prints the preprocessed source to standard output.

  • compilation

Once preprocessed, the code proceeds to the compilation stage, which translates the high-level source into human-readable assembly language for the target processor. The assembly file is an intermediate step before conversion into machine code. Running gcc -S stops after this stage and writes the assembly to a .s file.

  • assembling

The assembling stage converts the generated assembly code into machine code, stored in an object file. The as assembler performs this step, producing instructions in a form the processor can execute. Running gcc -c stops after this stage and leaves a .o object file without linking it.

  • linking

The final stage is linking, performed by the ld linker. It combines the object code from the assembling stage with the libraries and other object files the program needs to create an executable file. The linker resolves symbols, addresses, and references so that the pieces form one working program. Run without -E, -S, or -c, gcc carries out all four stages and produces this executable.
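The four stages above can also be run one at a time, each consuming the output of the previous one. The following is a minimal Python sketch of that idea, assuming gcc is available on the PATH. The sample program and the file names hello.c, hello.i, hello.s, hello.o, and hello are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical sample program: the macro and #ifdef give the preprocessor work to do.
C_SOURCE = """\
#include <stdio.h>
#define GREETING "hello"

int main(void) {
#ifdef SHOUT
    puts("HELLO");
#else
    puts(GREETING);
#endif
    return 0;
}
"""

# (stage name, gcc arguments, output file); each step consumes the previous output.
STAGES = [
    ("preprocessing", ["-E", "hello.c", "-o", "hello.i"], "hello.i"),
    ("compilation", ["-S", "hello.i", "-o", "hello.s"], "hello.s"),
    ("assembling", ["-c", "hello.s", "-o", "hello.o"], "hello.o"),
    ("linking", ["hello.o", "-o", "hello"], "hello"),
]


def build_in_stages(workdir):
    """Run the four stages one by one and return {stage name: output path}."""
    work = Path(workdir)
    (work / "hello.c").write_text(C_SOURCE)
    outputs = {}
    for stage, args, out_name in STAGES:
        # check=True raises CalledProcessError if gcc reports a failure.
        subprocess.run(["gcc", *args], cwd=work, check=True)
        outputs[stage] = work / out_name
    return outputs


if __name__ == "__main__":
    if shutil.which("gcc") is None:
        print("gcc not found on PATH, nothing to demonstrate")
    else:
        with tempfile.TemporaryDirectory() as tmp:
            for stage, path in build_in_stages(tmp).items():
                print(f"{stage:>13}: {path.name} ({path.stat().st_size} bytes)")
```

Opening hello.i after the first step should show the GREETING macro already replaced by its string, and hello.s is plain text that can be read in any editor.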

On Linux, the output of gcc is an ELF executable. ELF is a standard file format that defines the structure of executables, object code, shared libraries, and even core dumps. An ELF executable holds the entire program, organized into sections such as “.init” and “.fini”. The “.init” section contains initialization code that runs before the main function, while the “.fini” section holds termination or cleanup code that runs after the main function returns.
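Sections such as “.init” and “.fini” are ordinary entries in the section header table of the file, so their presence can be checked directly. The sketch below is an illustrative Python reader, not a replacement for real tooling: the function name section_names is hypothetical, and it ignores rarer cases such as extended section numbering.

```python
import struct
import sys


def section_names(image):
    """Return the section names of an ELF image, in section header order."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = image[4] == 2                     # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    order = "<" if image[5] == 1 else ">"    # EI_DATA: 1 = little-endian, 2 = big-endian

    # e_shoff, e_shentsize, e_shnum and e_shstrndx sit at fixed header offsets.
    if is64:
        (shoff,) = struct.unpack_from(order + "Q", image, 40)
        entsize, count, strndx = struct.unpack_from(order + "HHH", image, 58)
    else:
        (shoff,) = struct.unpack_from(order + "I", image, 32)
        entsize, count, strndx = struct.unpack_from(order + "HHH", image, 46)
    if shoff == 0 or count == 0:
        return []                            # no section header table present

    def header(index):
        """Return (name offset, file offset, size) for one section header."""
        base = shoff + index * entsize
        (name_off,) = struct.unpack_from(order + "I", image, base)
        if is64:
            offset, size = struct.unpack_from(order + "QQ", image, base + 24)
        else:
            offset, size = struct.unpack_from(order + "II", image, base + 16)
        return name_off, offset, size

    # Names are NUL-terminated strings inside the section-name string table.
    _, str_off, str_size = header(strndx)
    strtab = image[str_off:str_off + str_size]
    names = []
    for i in range(count):
        start = header(i)[0]
        names.append(strtab[start:strtab.index(b"\0", start)].decode("ascii", "replace"))
    return names


if __name__ == "__main__":
    # Demonstration on the running Python interpreter, if it happens to be an ELF file.
    if sys.executable:
        with open(sys.executable, "rb") as f:
            image = f.read()
        if image[:4] == b"\x7fELF":
            print(section_names(image))
```

On a typical dynamically linked Linux executable the list usually includes “.init” and “.fini” alongside “.text” and “.data”, though the exact set depends on the toolchain and target.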

The readelf tool, part of GNU Binutils, can dump the details of an ELF file.
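To give a feel for the kind of information readelf reports, here is a small Python sketch that decodes a few fields from the start of an ELF header, including the file type that tells an executable apart from an object file, shared library, or core dump. The function name parse_elf_header and the wording of the returned labels are hypothetical; readelf itself prints far more detail.

```python
import struct
import sys

# e_type values defined by the ELF format, with informal descriptions.
ELF_TYPES = {
    1: "relocatable object",
    2: "executable",
    3: "shared object or PIE",
    4: "core dump",
}


def parse_elf_header(image):
    """Decode the identification bytes and a few fixed fields of an ELF header."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = image[4] == 2                     # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    order = "<" if image[5] == 1 else ">"    # EI_DATA: 1 = little-endian, 2 = big-endian

    # e_type, e_machine and e_version follow the 16 identification bytes.
    e_type, e_machine, _e_version = struct.unpack_from(order + "HHI", image, 16)
    # e_entry is 8 bytes wide in ELF64 and 4 bytes wide in ELF32.
    (e_entry,) = struct.unpack_from(order + ("Q" if is64 else "I"), image, 24)
    return {
        "class": "ELF64" if is64 else "ELF32",
        "byte_order": "little-endian" if order == "<" else "big-endian",
        "type": ELF_TYPES.get(e_type, f"other ({e_type})"),
        "machine": e_machine,                # numeric code, e.g. 62 for x86-64
        "entry": hex(e_entry),
    }


if __name__ == "__main__":
    # Demonstration on the running Python interpreter, if it happens to be an ELF file.
    if sys.executable:
        with open(sys.executable, "rb") as f:
            start = f.read(64)
        if start[:4] == b"\x7fELF":
            for key, value in parse_elf_header(start).items():
                print(f"{key:>10}: {value}")
```

The same fields can be compared against the output of readelf -h on the same file as a sanity check.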

By mastering these tools, we can understand in depth how an executable file is made on a Linux system.

Published Oct 10, 2010

Flying code monkey