Software Reversing

System Level Reversing

Observe how the software is provided and how it operates.

  • Involves analyzing the environment, packaging, and dependencies, and then observing behaviour.

  • May require tools to intercept traffic, system calls, and input/output.

End goal: collect information to direct further analysis.

  • Guides the selection of tools, processes, and overall strategy.

    • Programming language used, packing algorithms, encryption.

  • Important to characterize behaviour and identify external dependencies.

    • Remote servers involved, files accessed, communication channels used.
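A first triage pass over a binary can be sketched in a few lines of Python. This is only an illustration, not a replacement for dedicated tools: it measures byte entropy (values close to 8 bits per byte often hint at packing, compression, or encryption) and extracts printable strings, which frequently reveal host names, file paths, and library names. The function names and the sample bytes are assumptions made for this example.

```python
import math
import re
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Extract runs of printable ASCII, similar in spirit to the `strings` tool."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


if __name__ == "__main__":
    # Made-up sample standing in for the contents of a file under analysis.
    sample = b"\x00\x01GET /update HTTP/1.1\x00\xffhost=example.invalid\x00ab\x00"
    print(f"entropy: {shannon_entropy(sample):.2f} bits/byte")
    for s in printable_strings(sample):
        print(s)
```

Low entropy together with many readable strings suggests an unpacked binary. High entropy with almost no strings suggests that unpacking or decryption has to happen before code-level analysis.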

Code Level Reversing

Extract design concepts and algorithms from binaries.

  • Compiled to binary code or bytecode.

It’s a complex, architecture-dependent process.

  • Some say “an art form”.

  • Expensive enough that fully reversing and reassembling a competitor's software is not usually pursued (except in some cases).

Relies on tools, such as disassemblers and decompilers, that represent the low-level code in a “human compatible” form.

  • Compiler optimization and obfuscation make this process uncertain.

  • Perfect reconstruction is frequently impossible as low-level languages do not use the same constructs as higher-level ones.
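As a small illustration of such a tool, Python's standard `dis` module renders CPython bytecode as a readable listing. The sketch below disassembles a toy function. The function `checksum` is hypothetical and only stands in for logic under analysis. The listing has no `for` statement, only jumps and an iterator opcode, which shows why high-level constructs have to be inferred rather than read back directly.

```python
import dis


def checksum(values):
    """Toy function standing in for the logic under analysis (hypothetical)."""
    total = 0
    for v in values:
        total = total + v
    return total


def listing(func):
    """Return (offset, opname, argrepr) tuples for a function's bytecode."""
    return [(ins.offset, ins.opname, ins.argrepr)
            for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    for offset, opname, argrepr in listing(checksum):
        print(f"{offset:4d}  {opname:<20} {argrepr}")
```

The exact opcode names and offsets differ between CPython versions, so the same source can produce different listings depending on the interpreter that compiled it.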

Activities

Understanding the Processes.

  • Large-scale observation of the program at a process level.

  • Identification of major components and their functionality.

Understanding the Data.

  • Understand the data structures used.
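Once a layout has been guessed from observation, it helps to encode the guess as a parser and check it against captured samples. The sketch below uses Python's `struct` module. The `CFG1` format, its field names, and its field sizes are entirely hypothetical and exist only for this example.

```python
import struct

# Hypothetical layout inferred from observation, not a real file format:
#   4 bytes  magic    b"CFG1"
#   2 bytes  version  (little-endian, unsigned)
#   2 bytes  count    (little-endian, unsigned)
#   then `count` entries, each: 1-byte length + that many bytes of UTF-8 text
HEADER = struct.Struct("<4sHH")


def parse_records(blob: bytes) -> dict:
    """Parse a blob according to the guessed layout and return its fields."""
    magic, version, count = HEADER.unpack_from(blob, 0)
    if magic != b"CFG1":
        raise ValueError("unexpected magic; the layout guess is probably wrong")
    offset = HEADER.size
    entries = []
    for _ in range(count):
        length = blob[offset]
        offset += 1
        entries.append(blob[offset:offset + length].decode("utf-8"))
        offset += length
    return {"version": version, "entries": entries}


if __name__ == "__main__":
    blob = b"CFG1" + struct.pack("<HH", 2, 2) + b"\x04host" + b"\x04port"
    print(parse_records(blob))
```

If the parser fails or produces implausible values on real samples, that is evidence the hypothesis about the structure needs revising.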

Understanding Interfaces.

  • Which interfaces exist and how does the process react to them?

Software Reversing

Programs are developed in a high-level programming language.

  • C, C++, C#, Java, Python, Go…

A compiler converts the high-level instructions to low-level instructions.

  • Machine Code: instructions that are executed directly by the CPU.

  • Bytecode: instructions that are executed by middleware, such as a virtual machine (VM) or an interpreter.

Reverse Engineering involves understanding low-level instructions.

  • This is difficult and costly.

  • Requires knowledge of the specific target being analyzed (the VM, the CPU).

    • Different CPUs have different opcodes and execution behaviour.
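Because the instruction set determines how every byte of code is decoded, identifying the target is usually one of the first steps. For ELF binaries the header states it directly. The sketch below reads the word size, the byte order, and the `e_machine` field from the first 20 bytes of a file. The lookup table covers only a handful of well-known values, and the function name is an assumption made for this example.

```python
# Small subset of e_machine values defined for ELF files.
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64", 243: "RISC-V"}


def describe_elf(header: bytes) -> dict:
    """Read word size, byte order, and target CPU from an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    bits = {1: 32, 2: 64}.get(header[4])            # EI_CLASS
    order = {1: "little", 2: "big"}.get(header[5])  # EI_DATA
    if bits is None or order is None:
        raise ValueError("invalid EI_CLASS or EI_DATA")
    machine = int.from_bytes(header[18:20], order)  # e_machine
    name = ELF_MACHINES.get(machine, f"unknown ({machine})")
    return {"bits": bits, "byteorder": order, "machine": name}


if __name__ == "__main__":
    # Hand-built header for a 64-bit little-endian x86-64 executable.
    demo = (b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
            + (2).to_bytes(2, "little") + (62).to_bytes(2, "little"))
    print(describe_elf(demo))
```

For a real file, the same function can be applied to the first 20 bytes read in binary mode. Other executable formats, such as PE and Mach-O, carry equivalent fields at different offsets.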
