Considerations

(Need for) Stability

Reversing is significantly more difficult if execution is unstable.

  • Observations are affected by "random" factors, such as multithreaded execution, hardware behaviour, user interactions with graphical interfaces, etc.

  • Applications being reversed should be isolated from external effects as much as possible.

A deterministic design results in stable execution across program runs.

  • Thus it facilitates debugging and reversing.

  • State may also be deterministically altered, for the entire program or for a specific function (e.g., when fuzzing).
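
One way to obtain such determinism is to interpose the library functions that introduce randomness. Below is a minimal sketch assuming a dynamically linked Linux target; the library name, the fixed timestamp and ./target are illustrative, and other sources of nondeterminism (threads, I/O timing) need their own treatment.

```c
/* fixtime.c — a minimal sketch: interpose time() via LD_PRELOAD so
 * every run observes the same clock value, removing one source of
 * nondeterminism. File and library names are illustrative.
 *
 *   gcc -shared -fPIC -o libfixtime.so fixtime.c
 *   LD_PRELOAD=./libfixtime.so ./target
 */
#include <time.h>

time_t time(time_t *tloc) {
    time_t fixed = 1700000000;  /* arbitrary fixed instant */
    if (tloc)
        *tloc = fixed;
    return fixed;
}
```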

Logs can be obtained from executions using monitor applications.

(Need for) Saving and Replaying

Reversing may require tracing back from the current state to the code where a change was produced.

  • It implies moving "back in time".

  • To restore the past program state, one must re-run it and try to find the failure source.

  • This operation may be performed multiple times, moving backwards step-by-step, and then forward.

Deterministic replay reconstructs program execution using previously recorded input data.

  • The first program run is used to record these inputs into a log.

  • Then all following runs will reconstruct the same behaviour because the program uses only recorded inputs.

  • The recording should include all inputs (disk, network).
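
As a concrete illustration, the following minimal C sketch records bytes read from one input channel (stdin) into a hypothetical inputs.log on the first run, and serves them back from that log on replay, reconstructing the same behaviour.

```c
/* replay.c — a minimal sketch of the record/replay idea for a single
 * input channel (stdin); "inputs.log" is an illustrative name. A real
 * system must intercept every input source (files, network, time, ...). */
#include <stdio.h>
#include <string.h>

static FILE *log_file;
static int replaying;

/* Serve input live (recording it) or from the log (replaying it). */
static size_t get_input(char *buf, size_t len) {
    if (replaying)
        return fread(buf, 1, len, log_file);  /* recorded input only */
    size_t n = fread(buf, 1, len, stdin);     /* live input */
    fwrite(buf, 1, n, log_file);              /* ... recorded for later */
    return n;
}

int main(int argc, char **argv) {
    replaying = (argc > 1 && strcmp(argv[1], "replay") == 0);
    log_file = fopen("inputs.log", replaying ? "rb" : "wb");
    if (!log_file) { perror("inputs.log"); return 1; }

    char buf[64];
    size_t n = get_input(buf, sizeof buf);
    printf("consumed %zu input bytes\n", n);  /* identical on every replay */
    fclose(log_file);
    return 0;
}
```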

(Need for) Safety

The target binary may be malicious (... it is always malicious until proven safe).

An important aspect of reversing binaries is malware analysis.

  • Malware is usually far too complex to be analyzed purely statically.

  • However, executing the malware may be dangerous.

    • Most importantly: dangerous in ways unknown to the reverse engineer.

Solutions must create adequate isolation boundaries between environments.

  • If stability is required, there should be no external interactions with the software under analysis.

  • Sometimes, isolation must be broken to trigger specific behaviour.

    • A network connection allows contacting a C&C address or downloading a payload.

    • Disk or file presence.

    • Whenever possible, such resources should be virtualized.
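
One such virtualized boundary can be sketched with Linux namespaces: the target is launched with networking removed entirely, and the analyst can later relax this (e.g., with a mock network) to trigger the C&C behaviour. The paths and flags below assume a reasonably modern Linux kernel.

```c
/* isolate.c — a hedged sketch (Linux-specific): run the target inside
 * fresh user + network namespaces, so only an inactive loopback device
 * exists and no traffic can leave. "./target" is a placeholder. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* CLONE_NEWUSER lets an unprivileged user create the namespaces on
     * most modern Linux systems; CLONE_NEWNET cuts all connectivity. */
    if (unshare(CLONE_NEWUSER | CLONE_NEWNET) != 0) {
        perror("unshare");
        return 1;
    }
    execl("./target", "./target", (char *)NULL);
    perror("execl");  /* reached only if exec fails */
    return 1;
}
```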

(Need for) Support of Heterogeneous Architectures

Dynamic analysis requires the execution of the program under analysis.

An analyst will mostly work on an Intel x86-64 computer (a COTS laptop or server).

  • Most embedded devices are ARM, which has several variants.

  • Microcontrollers frequently use 8085, AVR, PIC, or MIPS architectures.

  • Several speciality SoCs use custom architectures (the list is large... ).

  • Several binary formats are popular: ELF and PE (plus debug formats such as DWARF), and many others from the IoT world.
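
A small example of what format support entails: merely identifying a binary's architecture already requires parsing its container format. The sketch below, assuming a glibc-style elf.h, reads the e_machine field of an ELF header; PE and the various IoT formats would each need their own logic.

```c
/* arch_id.c — a hedged sketch: identify an ELF binary's architecture
 * from the e_machine field of its header. Other containers (PE,
 * raw IoT firmware) each need their own parser. */
#include <elf.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <elf-file>\n", argv[0]);
        return 1;
    }

    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror(argv[1]); return 1; }

    Elf64_Ehdr hdr;  /* e_ident and e_machine sit at the same offsets in ELF32 and ELF64 */
    if (fread(&hdr, sizeof hdr, 1, f) != 1 ||
        memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0) {
        fprintf(stderr, "not an ELF file\n");
        fclose(f);
        return 1;
    }
    fclose(f);

    switch (hdr.e_machine) {
    case EM_X86_64:  puts("x86-64");       break;
    case EM_ARM:     puts("ARM (32-bit)"); break;
    case EM_AARCH64: puts("AArch64");      break;
    case EM_MIPS:    puts("MIPS");         break;
    case EM_AVR:     puts("AVR");          break;
    default:         printf("e_machine = %u\n", hdr.e_machine);
    }
    return 0;
}
```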

Frameworks must be extensible to support a wide range of architectures.

  • And the related interfaces and customizations.

  • While minimizing the need for new tools.

(Need for) Support of Peripherals and external entities

Reversing an application with external interactions may require the existence of the related entities.

  • Web sites and servers at fixed or dynamic IP addresses.

  • Common physical devices for user input, storage, ...

  • Exotic external devices communicating through known or unknown buses.

  • Hardware dongles.

Need to recreate the set of devices/entities required to trigger a specific path.

  • This frequently means resorting to device emulation with mock software constructs.
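
For example, a serial peripheral can be mocked with a pseudo-terminal: the target opens the slave end as if it were a real device while the analyst scripts the master end. A minimal Linux sketch, where the canned OK reply stands in for the real device's protocol:

```c
/* mock_serial.c — a hedged sketch: emulate a serial peripheral with a
 * pseudo-terminal. The target opens the slave side as if it were a real
 * device; we script the master side. The "OK" reply is a placeholder
 * for whatever protocol the real peripheral speaks. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    int master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master < 0 || grantpt(master) < 0 || unlockpt(master) < 0) {
        perror("pty setup");
        return 1;
    }

    /* The target should open this path as its "serial device". */
    printf("point the target at: %s\n", ptsname(master));

    /* Mock device behaviour: answer every request with a canned reply. */
    char buf[128];
    while (read(master, buf, sizeof buf) > 0)
        write(master, "OK\r\n", 4);
    return 0;
}
```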

(Need for) Content manipulation (instrumentation)

The main limitation of a dynamic approach is coverage.

  • Every path that is not covered by the instrumented executions cannot be analyzed.

  • This limitation can be reduced somewhat by performing active instrumentation, in particular by forcing the outcome of conditional branches.
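
A hedged sketch of this on x86-64 Linux: a tracer rewrites a conditional jump in the child so the otherwise-uncovered path executes. The target path and branch address are hypothetical placeholders an analyst would take from a disassembler.

```c
/* force_branch.c — a hedged sketch for x86-64 Linux: rewrite a
 * conditional jump in a traced child so an otherwise-uncovered path
 * executes. BRANCH_ADDR and ./target are hypothetical placeholders. */
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define BRANCH_ADDR 0x401234UL  /* hypothetical address of a 'je' (0x74) */

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);  /* let the parent control us */
        execl("./target", "./target", (char *)NULL);
        _exit(1);
    }

    waitpid(pid, NULL, 0);  /* child stops after execve */

    /* Read the word holding the branch; on little-endian x86 the byte
     * at BRANCH_ADDR is the least significant one. 0x74 (je) becomes
     * 0xeb (jmp), making the branch unconditional. */
    long word = ptrace(PTRACE_PEEKTEXT, pid, (void *)BRANCH_ADDR, NULL);
    word = (word & ~0xffL) | 0xeb;
    ptrace(PTRACE_POKETEXT, pid, (void *)BRANCH_ADDR, (void *)word);

    ptrace(PTRACE_CONT, pid, NULL, NULL);  /* run the forced path */
    waitpid(pid, NULL, 0);
    return 0;
}
```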

(Need for) Context manipulation (instrumentation)

A reversing task will need to observe structure and behaviour.

  • The analysis should have enough coverage to recover an adequate level of detail.

  • But while static analysis aims for wide coverage, dynamic analysis aims for focus.

  • What if a specific course of execution is not triggered?

  • Results of dynamic analysis are dependent on the context of the execution.

Context manipulation allows setting the adequate state to trigger a specific flow of execution, increasing the reversing coverage.

  • Achieved by careful manipulation of execution state, registers and memory content.

  • Problems:

    • This may lead to the recovery of an incorrect design, as the flow found may be a decoy!

    • It may lead to the recovery of artificial vulnerabilities that do not exist, since the forced state may be unreachable in a real execution.

Context manipulation (instrumentation)

  • Live patching: modifying registers or RAM in a debugger/controlled environment (see the sketch after this list).

  • File patching: alter binary files to replace their content.

  • Binary instrumentation: real-time, automated modification.
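
A minimal sketch of the first technique on x86-64 Linux, where a tracer overwrites a register in the stopped child so a check appears to succeed (the target path and the choice of stop point are illustrative):

```c
/* live_patch.c — a hedged sketch for x86-64 Linux: overwrite RAX in a
 * stopped child so that a call such as a hypothetical check_license()
 * appears to have returned success. In practice the stop would be a
 * breakpoint right after the call; here the post-execve stop stands in
 * for it, and ./target is a placeholder. */
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("./target", "./target", (char *)NULL);
        _exit(1);
    }

    waitpid(pid, NULL, 0);  /* child is stopped and patchable */

    struct user_regs_struct regs;
    ptrace(PTRACE_GETREGS, pid, NULL, &regs);
    regs.rax = 1;  /* force the "success" return value */
    ptrace(PTRACE_SETREGS, pid, NULL, &regs);

    ptrace(PTRACE_CONT, pid, NULL, NULL);
    waitpid(pid, NULL, 0);
    return 0;
}
```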

Design Fidelity

The program under analysis may detect the analysis and actively defend against it.

  • For instance, it can hide a part of its behaviour if it detects that it is being analyzed.

  • These anti-debugging and anti-instrumentation techniques are used by many malware families.

So, when we arrive at a design hypothesis, how correct is it?

Example of gdb+br detection
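
A sketch of what such a check can look like on Linux (illustrative, not taken from any particular sample): the process concludes gdb is present when it is already being traced, and concludes a breakpoint (br) is set when its own code contains the int3 opcode 0xcc.

```c
/* detect.c — an illustrative sketch: detect gdb via ptrace() and detect
 * a software breakpoint by scanning our own code for the int3 opcode
 * (0xcc) that "br" plants. */
#include <stdio.h>
#include <sys/ptrace.h>

void sensitive(void) {
    puts("sensitive work");
}

int main(void) {
    /* Only one tracer is allowed per process, so this call fails when
     * gdb (or any other debugger) is already attached. */
    if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1) {
        puts("debugger detected");
        return 1;
    }

    /* Scan the first bytes of sensitive() for 0xcc. A real check must
     * cope with 0xcc occurring inside legitimate instructions. */
    const unsigned char *code = (const unsigned char *)sensitive;
    for (int i = 0; i < 32; i++) {
        if (code[i] == 0xcc) {
            puts("breakpoint detected");
            return 1;
        }
    }

    sensitive();
    return 0;
}
```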
