# Binary Objects

## Binary files

The result of a compilation process.

* Translating high-level code (C/C++, etc…) into native code or bytecode.

Code is encapsulated in a binary format.

* It’s not a raw file with unstructured bytes.

The target system (CPU or VM) will process the resulting code.

* Which may be only part of the file content.

## Compilation process

### The C/C++ use case

<figure><img src="/files/PQPIKiXvjszIZXYfczxu" alt=""><figcaption></figcaption></figure>

The pre-processor (which may be part of the compiler itself) processes the source code, validating the structure of its directives, pulling in `#include`d files and expanding macros.

The result is a single block of text ready for the compiler, usually with no remaining dependencies on external header files.
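
A minimal sketch of macro expansion, assuming two hypothetical macros (`BUFFER_SIZE` and `SQUARE`) that are not part of the original example. Running `gcc -E` on a file like this should show the function body with both macros already replaced by their definitions.

```c
/* Hypothetical macros, introduced only for this illustration. */
#define BUFFER_SIZE 16
#define SQUARE(x) ((x) * (x))

/* After pre-processing, the return statement reads:
 *     return ((n) * (n)) + 16;
 * The compiler proper never sees the names SQUARE or BUFFER_SIZE. */
int scaled(int n) {
    return SQUARE(n) + BUFFER_SIZE;
}
```

The parentheses in `SQUARE` matter because expansion is purely textual: without them, an argument such as `a + 1` would expand to `a + 1 * a + 1`.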

### Source code

```c
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv) {
    printf("Hello World\n");
    return 0;
}
```

### Pre-process: `gcc -E -o hello.e hello.c`

The output is more than 1500 lines long. An excerpt from the end of the file:

```c
…
extern int rpmatch (const char *__response) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1))) ;
# 980 "/usr/include/stdlib.h" 3 4
extern int getsubopt (char **__restrict __optionp,
                char *const *__restrict __tokens,
                char **__restrict __valuep)
        __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2, 3))) ;
# 1026 "/usr/include/stdlib.h" 3 4
extern int getloadavg (double __loadavg[], int __nelem)
        __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
# 1036 "/usr/include/stdlib.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/bits/stdlib-float.h" 1 3 4
# 1037 "/usr/include/stdlib.h" 2 3 4
# 1048 "/usr/include/stdlib.h" 3 4

# 3 "hello.c" 2

# 5 "hello.c"
int main(int argc, char** argv) {
    printf("Hello World\n");
    return 0;
}
```

The **compiler processes the pre-processed file and produces assembly code**. This may be assembly for an intermediate processor rather than the final target processor.

The compiler builds abstract syntax trees (ASTs) and may tweak or optimize the result according to the options it was given.

For GCC, the `-m` and `-f` switches, together with the `-O<n>` optimization levels, typically modify the output. In other words, the same source code can produce different assembly depending on the compiler, target and flags.
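
To see the effect of the flags, a small function can be compiled twice and the outputs compared. This is a sketch under assumptions: the file name `sum.c` and the function `sum_to` are hypothetical, and the exact assembly depends on the GCC version and target.

```c
/* Sums the integers 1..n with an explicit loop.
 *
 * Try, for example:
 *     gcc -O0 -S -o sum_O0.s sum.c
 *     gcc -O2 -S -o sum_O2.s sum.c
 * and compare the two .s files. At -O0 the loop is generally emitted
 * as written, with variables kept on the stack; at -O2 the compiler
 * may keep values in registers or restructure the loop entirely. */
unsigned int sum_to(unsigned int n) {
    unsigned int total = 0;
    for (unsigned int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

Both builds compute the same result; only the generated instructions differ.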

### Compile: `gcc -masm=intel -S -o hello.s hello.c`

<figure><img src="/files/dV25vHwR2U72igBjufZG" alt=""><figcaption></figcaption></figure>

### Assembler

The assembler takes input **containing assembly code and transforms it into machine code**. The output is a set of object files (modules) with a `.o` extension.

Code produced may use relative addresses, making it reusable (technically relocatable) when integrated into a final binary file.

**Symbols are also present as they are required at later stages.**

Although these object files contain machine code, they are not executable: they don't include all the code required, only what was present in the original `.c` file and the `.h` files it included.
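
The kinds of symbols an object file carries can be sketched with a small module. The file name `ids.c` and all identifiers below are hypothetical. After `gcc -c ids.c`, running `nm ids.o` would typically list `next_id` and `report` as defined, `counter` as a local symbol, and `printf` as undefined, since its code lives in the C library and is not part of this object file.

```c
#include <stdio.h>

/* Local symbol: "static" restricts visibility to this object file. */
static int counter = 0;

/* Global symbol, defined here: other object files may reference it. */
int next_id(void) {
    counter += 1;
    return counter;
}

/* Also defined here, but it references printf, which is only declared
 * in <stdio.h>. The reference stays unresolved until link time. */
void report(void) {
    printf("id=%d\n", next_id());
}
```

There is no `main` in this module, which is another reason the resulting `.o` cannot run on its own.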

<figure><img src="/files/5hj0VoGL6W0CqTkPyawC" alt=""><figcaption></figcaption></figure>

### Linker

Takes all the **object files belonging to a program and merges them into a single coherent executable**, typically intended to be loaded at a particular memory address.

As the arrangement of all modules in the executable is known, the linker can also resolve most symbolic references.

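References to libraries may or may not be completely resolved, depending on the type of library. Code from static libraries is merged into the executable at link time. For dynamic (shared) libraries, the library is instead added as a dependency and the symbol is resolved when the program is loaded or while it runs.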

<figure><img src="/files/PKW8KlPMlzF2FEZWcNga" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/tr9Gb1vc9BsqLVaZBQza" alt=""><figcaption></figcaption></figure>

