Low-level languages

Machine Code

Each CPU has a specific instruction set.

When a program is compiled to “binary”, the high-level logic is converted to a sequence of instructions.

This sequence may be executed by a family of CPUs or a single model.
Running this sequence on another CPU may involve binary translation (conversion).

Humans typically cannot read binary instructions directly, but machine code can be translated (disassembled) into Assembly.
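As a rough illustration of the binary-to-Assembly translation, the sketch below decodes a very small subset of x86 opcodes (`mov r32, imm32`, `nop`, `ret`). The function name `disassemble` and the sample bytes are made up for this example; a real disassembler covers the full instruction set, prefixes, and addressing modes.

```python
import struct

# Register order used by the x86 "B8+r" opcode family (mov r32, imm32).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def disassemble(code: bytes) -> list[str]:
    """Decode bytes made only of `mov r32, imm32`, `nop`, and `ret` (toy subset)."""
    out = []
    i = 0
    while i < len(code):
        op = code[i]
        if 0xB8 <= op <= 0xBF:
            # Opcode B8+r is followed by a 4-byte little-endian immediate.
            if i + 5 > len(code):
                raise ValueError("truncated mov instruction")
            (imm,) = struct.unpack_from("<I", code, i + 1)
            out.append(f"mov {REG32[op - 0xB8]}, {imm:#x}")
            i += 5
        elif op == 0x90:
            out.append("nop")
            i += 1
        elif op == 0xC3:
            out.append("ret")
            i += 1
        else:
            raise ValueError(f"opcode {op:#04x} is outside this toy subset")
    return out


# Hypothetical sample: mov eax, 0x2a ; nop ; ret
machine_code = bytes.fromhex("b82a000000 90 c3")
for line in disassemble(machine_code):
    print(line)
```

The mapping from bytes to mnemonics is mechanical, which is why Assembly can be recovered from the binary while names, types, and comments from the original source are not.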

For compiled programs, RE tasks involve extracting information from the sequence of Assembly instructions.

Reconstruction is never perfect!

Different levels of abstraction: e.g., it is not trivial to recover C++ class structure and OOP relations from Assembly code.
Different compilers generate different assembly for the same source code.
The same compiler may also generate different assembly for the same source code, depending on how it is invoked.
- Optimization flags, CPU matching, protection mechanisms, target object type…
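One way to observe this effect is to ask a compiler for its assembly output at two optimization levels and compare the results. The sketch below assumes `gcc` is installed (it returns an empty dictionary otherwise); the C snippet and the helper names `asm_command` and `compile_variants` are invented for the example.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical one-function C source used only for this comparison.
C_SOURCE = "int square(int x) { return x * x; }\n"


def asm_command(compiler, source, output, opt_level):
    """Build the command line: -S stops after generating assembly."""
    return [compiler, "-S", opt_level, str(source), "-o", str(output)]


def compile_variants(levels=("-O0", "-O2"), compiler="gcc"):
    """Return {opt_level: assembly text}, or {} if the compiler is missing."""
    if shutil.which(compiler) is None:
        return {}
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "square.c"
        src.write_text(C_SOURCE)
        for level in levels:
            out = Path(tmp) / f"square{level}.s"
            subprocess.run(asm_command(compiler, src, out, level), check=True)
            results[level] = out.read_text()
    return results


if __name__ == "__main__":
    for level, asm in compile_variants().items():
        print(level, len(asm.splitlines()), "lines of assembly")
```

The exact output depends on the compiler version and target, so the sketch only reports line counts rather than claiming a specific listing.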

Some languages are compiled into a bytecode (!= machine code).

Bytecode contains a compact (optimized) representation of the higher layer structures.

A framework or VM executes the bytecode on the target CPU.
The same bytecode can usually be executed on multiple CPUs, provided there is a native VM implementation for each.
- The Java motto: Write Once, Run Anywhere.
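Java is the example named above; as a stand-in that needs no extra tooling, the sketch below uses CPython, which also compiles source to bytecode and runs it in a VM. It prints the raw bytecode of a small function and the opcode names recovered by the standard `dis` module. The function `add` is made up for the example, and the opcode names vary between CPython versions.

```python
import dis


def add(a, b):
    return a + b


# Raw bytecode as stored in the function's code object.
raw = add.__code__.co_code
print(raw.hex())

# The same bytes decoded into opcode names by the standard library.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)
```

The bytes printed here are instructions for the Python VM, not for the CPU, which is why the same compiled function can run wherever a matching interpreter exists.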

Bytecode allows easier extraction of information, provided tooling (such as a disassembler or decompiler) exists for that bytecode format.
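To show how much higher-level structure a bytecode container can retain, the sketch below inspects the code object of a CPython function. CPython is used here only as a convenient example of a bytecode format, and the function `greet` is hypothetical.

```python
def greet(name):
    message = "hello " + name
    return message.upper()


code = greet.__code__

# Identifiers and literals survive compilation as metadata next to the bytecode.
print(code.co_name)      # function name
print(code.co_argcount)  # number of positional parameters
print(code.co_varnames)  # parameter and local variable names
print(code.co_names)     # attribute and global names referenced
print(code.co_consts)    # literal constants
```

Function names, local variable names, and string literals are all still present, whereas a stripped native binary usually keeps none of the identifiers.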
