Processes

  1. Tracing

  2. Debugging

  3. Sandboxing

  4. Emulation

  5. Instrumentation

Tracers

Tracers execute a binary, logging information about function and system calls.

A binary is executed in the analyst's system (VM).

A tracer adds hooks to applications or to the kernel to gain information about execution.

  • Access to files, packets sent, registry access.

No confinement or security measures are in place.

  • There may be no interaction between the tracer and the application.

    • The tracer monitors the system through kernel debug interfaces.

Limitations

  • No isolation, so malicious or harmful code cannot be analyzed safely.

  • Can only inspect interactions between the application and the external environment.

  • The host environment must be compatible with the target binary.

    • Windows binaries cannot be analyzed on Linux (or vice versa), nor can embedded system binaries be analyzed on Windows, etc.

Linux: ltrace, strace (ptrace), bpftrace, Wireshark, Valgrind (cachegrind, callgrind, helgrind)

Windows: Process Monitor, Wireshark

Debugging

Debuggers are applications that can control (trace) a running target binary.

  • Debuggers can create a process and analyze it or attach themselves to a running process.

    • The process usually executes in the host system.

  • This is the “typical”, low-tech way of dynamically analyzing a program.

    • Reuses concepts and tools from the software engineering process, applied to reverse engineering.

Debuggers provide extensive, interactive control over a process's execution flow.

  • Frequently at the level of opcodes and assembly.

  • Can be integrated with static analysis tools.

    • Combining execution information with decompiled code, CFGs, and disassembly.

Limitations

Debugging can be detected and subverted by the target application.

  • Especially popular in malware and DRM systems.
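One common Linux detection trick can be sketched as follows. The kernel reports the PID of any ptrace-based tracer in the `TracerPid` field of `/proc/<pid>/status`, so a program can read its own entry and change behavior when the value is non-zero. The helper names `tracer_pid` and `is_being_traced` are hypothetical, and this is only one of many detection techniques, shown here so that an analyst can recognize it.

```python
def tracer_pid(status_text: str) -> int:
    """Return the TracerPid value from /proc/<pid>/status text (0 = not traced)."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1].strip())
    return 0


def is_being_traced(status_path: str = "/proc/self/status") -> bool:
    """True if a ptrace-based tracer (gdb, strace, ...) is attached to this process."""
    try:
        with open(status_path) as f:
            return tracer_pid(f.read()) != 0
    except OSError:
        # Not on Linux, or /proc is unavailable: nothing can be concluded.
        return False


if __name__ == "__main__":
    print("traced" if is_being_traced() else "not traced")
```

Running the script directly and then under `strace` or `gdb` should give different answers, which is exactly the signal a protected binary looks for.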

The target application must be executed in a fully hosted environment.

  • Without isolation measures, this poses a serious security risk.

  • Remote debugging may be used to circumvent this limitation.

Host system architecture must match the target binary architecture.

  • The binary is loaded to the host system as a standard process.

  • No debugging of Windows binaries on Linux, or of ARM or MIPS binaries on x86.

  • No direct way of debugging shellcode or a binary blob (e.g. firmware).

How do debuggers work?

Debuggers rely on system calls provided by the operating system.

  • Debuggers either:

    • create a child process and trace it from its first instruction (the child has its own address space, which the debugger reads and writes through the OS).

    • attach to an existing process given that the user has the correct permissions (e.g. root).

  • Linux: ptrace

  • Windows: provides API for process control.

    • CreateProcess with specific dwCreationFlags (DEBUG_PROCESS).

    • OpenProcess with dwDesiredAccess (PROCESS_VM_READ, PROCESS_VM_WRITE, PROCESS_VM_OPERATION).

Debuggers may attach to hardware devices providing external debugging.

  • Used in embedded devices.

Debuggers set breakpoints, which trigger SIGTRAP and return control to the debugger.

Breakpoints are implemented by patching the code with 0xCC (the x86 INT3 opcode) or by using hardware breakpoints (through ptrace).

Example
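A minimal sketch of the bookkeeping behind a software breakpoint, simulated on a plain byte buffer rather than on a live process. The class `BreakpointTable` and its methods are hypothetical names; a real debugger performs the same byte swap in the target's memory (for instance with ptrace peek/poke requests on Linux) instead of on a `bytearray`.

```python
INT3 = 0xCC  # x86 one-byte breakpoint opcode


class BreakpointTable:
    """Tracks software breakpoints patched into a mutable code buffer."""

    def __init__(self, code: bytearray):
        self.code = code
        self.saved = {}  # offset -> original byte

    def set(self, offset: int) -> None:
        if offset in self.saved:
            return  # already set; never save 0xCC as the "original" byte
        self.saved[offset] = self.code[offset]
        self.code[offset] = INT3

    def clear(self, offset: int) -> None:
        # Restore the original byte so the instruction can execute normally.
        self.code[offset] = self.saved.pop(offset)


# push rbp; mov rbp, rsp; nop; pop rbp; ret
code = bytearray(b"\x55\x48\x89\xe5\x90\x5d\xc3")
table = BreakpointTable(code)
table.set(4)            # break on the nop
print(code.hex(" "))    # the byte at offset 4 is now cc
table.clear(4)
print(code.hex(" "))    # original bytes are back
```

When the CPU executes the patched byte, the kernel delivers SIGTRAP to the debugger. To resume, the debugger typically restores the original byte, moves the instruction pointer back by one, single-steps, and writes 0xCC again.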

Sandboxing

Sandboxing improves the control that debuggers provide.

  • Creation of a distinct execution environment.

    • Different libraries? Restricted view of the filesystem (minimal access to files).

  • Isolate some actions, providing some safety to analyze malicious applications.

Implementation: lightweight virtual machines or namespaces/containers.

  • Supported by mechanisms of the Operating System or additional tools.

  • Tools: Sandboxie, PyREBox, PANDA.

An agent monitors interactions of the application inside the environment and may allow instrumentation.

  • File access, network communication.

  • Remote debugging.

Emulators

Emulators are common backends for secure sandboxes.

  • May provide much better isolation as the guest and host environments are distinct.

    • The kernel is not shared, hardware is emulated.

  • Tools: QEMU, VirtualBox, VMware.

Emulation types.

  • Full system emulation.

  • User mode emulation.

User Mode Emulation

Launches processes directly, but in a restricted environment.

  • Process may be compiled for one CPU and executed on another CPU.

  • The address space is restricted, as are the filesystem and the libraries available.

  • Interaction with the Host OS is mediated by the emulator.

The emulator processes the guest's CPU instructions (emulation/translation) and must:

  • Provide means to translate syscalls from guest to host OS.

  • Understand process-related system calls such as clone.

    • clone is used to spawn new processes and requires the creation of a new emulation environment.

  • Handle signals between the analyzed binary and the host system.
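The syscall translation step can be illustrated with a small lookup table. System call numbers are specific to each architecture, so the number a guest places in its syscall register cannot be passed to the host kernel unchanged. The sketch below assumes a 32-bit ARM (EABI) guest on an x86-64 Linux host and covers only four calls; the table and function names are hypothetical, and a real emulator also has to convert the arguments.

```python
# Guest side: 32-bit ARM EABI syscall numbers (Linux).
ARM_EABI_SYSCALLS = {1: "exit", 3: "read", 4: "write", 20: "getpid"}

# Host side: x86-64 syscall numbers (Linux).
X86_64_SYSCALLS = {"read": 0, "write": 1, "getpid": 39, "exit": 60}


def translate_syscall(guest_nr: int) -> int:
    """Map an ARM EABI syscall number to the equivalent x86-64 number."""
    try:
        return X86_64_SYSCALLS[ARM_EABI_SYSCALLS[guest_nr]]
    except KeyError:
        raise NotImplementedError(f"guest syscall {guest_nr} is not mapped") from None


if __name__ == "__main__":
    for nr in (3, 4, 1):
        print(f"guest {nr} ({ARM_EABI_SYSCALLS[nr]}) -> host {translate_syscall(nr)}")
```

Calls missing from the table raise an error instead of being forwarded, which mirrors the conservative choice of refusing anything the emulator does not understand.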

May provide integration with debugging tools.

User Mode Emulation with QEMU

QEMU supports user mode emulation as long as the guest binary targets the same OS as the host (e.g. a Linux binary for another CPU on a Linux host).

What it does:

  • Machine code translation from any supported guest CPU to any supported host CPU.

  • Syscall mapping.

  • Data structure conversion (byte-order and bit-width conversions).

  • Extensive tracing capability to the level of Micro Ops.
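The data structure conversion can be sketched with Python's `struct` module. The example assumes a big-endian 32-bit guest whose `timeval` holds two signed 32-bit fields, and a little-endian 64-bit host whose `timeval` holds two signed 64-bit fields. These layouts and the function names are illustrative assumptions, not a description of QEMU's internal code.

```python
import struct

GUEST_TIMEVAL = struct.Struct(">ii")  # big-endian, two signed 32-bit fields
HOST_TIMEVAL = struct.Struct("<qq")   # little-endian, two signed 64-bit fields


def guest_to_host_timeval(raw: bytes) -> bytes:
    """Widen and byte-swap a guest timeval into the host layout."""
    sec, usec = GUEST_TIMEVAL.unpack(raw)
    return HOST_TIMEVAL.pack(sec, usec)


def host_to_guest_timeval(raw: bytes) -> bytes:
    """Narrow and byte-swap a host timeval into the guest layout.

    Raises struct.error if a value does not fit in 32 bits.
    """
    sec, usec = HOST_TIMEVAL.unpack(raw)
    return GUEST_TIMEVAL.pack(sec, usec)


if __name__ == "__main__":
    guest = bytes.fromhex("00000001" "00000002")  # tv_sec=1, tv_usec=2
    print(guest_to_host_timeval(guest).hex(" "))
```

Every structure that crosses the guest/host boundary in a system call needs this kind of treatment, in both directions.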

Provides a gdbserver interface for interaction with GDB.

Usefulness: reverse engineering applications compiled to other architectures.

Full System Emulation

A full-blown virtual machine.

  • Emulates a highly configurable set of hardware, including embedded devices.

  • Maps interactions to Host resources (screen, disk, network).

  • RE-aware software tools expose debugging interfaces (usually to gdb).

Provides the best level of isolation.

  • All accesses are mediated by the emulator, reducing the attack surface to emulator components.

  • Allows analyzing other binaries besides standard executable files.

    • Firmware, MBR, UEFI.

Malware frequently tries to detect Virtual Machines, emulators and debuggers...

  • With variable sophistication.

Remote debugging with emulators

gdb and gdbserver

gdb can debug remote applications.

  • It can even debug remote kernels and firmware.

  • Why? Consider embedded devices and software inside an emulator.

gdbserver is launched on the target system, with the arguments:

  • Either a device name (to use a serial line) or a TCP hostname and port number, followed by the path and filename of the executable to be debugged.

  • It then waits passively for the host gdb to communicate with it.

gdb is run on the host, with the arguments:

  • The path and filename of the executable (and any sources) on the host.

  • A device name (for a serial line) or the IP address and port number needed for connection to the target system.

Alternative: the remote application is compiled with a stub that provides a gdbserver interface when the application is launched.
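Whichever side implements it (gdbserver, an emulator, or a stub compiled into the application), the two ends talk using the GDB Remote Serial Protocol. Its basic packet shape is `$<payload>#<checksum>`, where the checksum is the sum of the payload bytes modulo 256, written as two hex digits. The sketch below shows only that framing. The function names are hypothetical, and escaping, run-length encoding, and acknowledgements are deliberately left out.

```python
def rsp_checksum(payload: bytes) -> int:
    """Sum of all payload bytes, modulo 256."""
    return sum(payload) % 256


def rsp_frame(payload: str) -> bytes:
    """Wrap a command such as 'g' or 'qSupported' in a protocol packet."""
    data = payload.encode("ascii")
    return b"$" + data + b"#" + f"{rsp_checksum(data):02x}".encode("ascii")


def rsp_unframe(packet: bytes) -> str:
    """Validate a packet and return its payload."""
    if len(packet) < 4 or not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    data = packet[1:-3]
    received = int(packet[-2:].decode("ascii"), 16)
    if rsp_checksum(data) != received:
        raise ValueError("checksum mismatch")
    return data.decode("ascii")


if __name__ == "__main__":
    print(rsp_frame("g"))           # request to read the general registers
    print(rsp_frame("qSupported"))  # feature negotiation query
```

Because the framing is this simple, a stub fits comfortably in firmware for a small embedded target, which is why so many emulators and devices can expose the interface.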

Example: Reversing an ARM binary

unknown.bin

$ file unknown.bin returns “unknown.bin: DOS/MBR boot sector”, but it does look like a PDF file.

What we may extrapolate from that:

  • Seems to be a DOS/Master Boot Record.

  • DOS was only released for the x86 family (16-bit and 32-bit).

  • qemu-system-i386 may boot it if used as a hard disk or floppy disk.

How to address such files?

  • Binary files in formats other than ELF, PE, or similar still obey a fixed set of rules.

  • Check the datasheets and gather the information required to load the file.

  • Important:

    • CPU used, CPU mode, relevant or required peripherals: to know how to decode the binary instructions.

    • Program Entry Point: to know where the program starts, and where disassembly should start.

From a Master Boot Record, we may know:

  • The MBR is loaded at address 0x7C00.

  • MBR code runs in Intel x86 Real Mode (16-bit).

  • There are quite a few limitations and assumptions.

  • No OS is running. Input/Output must use BIOS Interrupts.
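The facts above are enough to write a small loader for a candidate boot sector. The sketch below assumes the classic MBR layout: 446 bytes of boot code, four 16-byte partition entries, and the two-byte signature 0x55 0xAA at offset 510. The name `parse_mbr` and the returned dictionary keys are hypothetical.

```python
import struct

SECTOR_SIZE = 512
LOAD_ADDRESS = 0x7C00          # where the BIOS places the boot sector
PARTITION_TABLE_OFFSET = 446   # four 16-byte entries follow the boot code
# status, CHS start, partition type, CHS end, LBA start, sector count
ENTRY = struct.Struct("<B3sB3sII")


def parse_mbr(sector: bytes) -> dict:
    """Split a 512-byte boot sector into boot code and partition entries."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("a boot sector is 512 bytes long")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot signature")
    partitions = []
    for i in range(4):
        status, _, ptype, _, lba, count = ENTRY.unpack_from(
            sector, PARTITION_TABLE_OFFSET + ENTRY.size * i
        )
        if ptype != 0:  # type 0 marks an unused entry
            partitions.append(
                {"bootable": status == 0x80, "type": ptype,
                 "lba_start": lba, "sectors": count}
            )
    return {
        "entry_point": LOAD_ADDRESS,
        "boot_code": sector[:PARTITION_TABLE_OFFSET],
        "partitions": partitions,
    }
```

A typical use would be `parse_mbr(open("unknown.bin", "rb").read(512))`. The boot code can then be disassembled as 16-bit code based at 0x7C00, for example with `objdump -D -b binary -m i8086 --adjust-vma=0x7c00 unknown.bin`.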
