Content Type Obfuscation

Disguise one file type as another file type or as raw data.

  • Exploiting how the file is processed

  • Exploiting how users interact with it

  • Exploiting how researchers and automatic tools process a file

Purposes (some):

  • Marketing, branding and usability

  • Exploit users through social engineering

  • Increase the cost required for a reverse engineering task

  • Carry a malicious payload while escaping manual analysis

  • Carry a malicious payload bypassing automatic filtering

Marketing, Branding and Usability

Aims to make a file type more usable, or to keep the brand visible to the user

  • Benign and common usage

Approach: The file has one specific type, but uses another file extension

  • The environment has a configuration that states how to handle that file extension

  • Exploits the fact that the environment uses a fixed string (the extension) to decide how to open the file

Impact: File explorers will present content based on the file extension, not based on the content.

For a PPTX file

  • The file utility reports a Zip archive, and the magic value is PK

  • DOCX and XLSX are similar
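
The PPTX observation can be checked with a minimal sketch like the one below. It assumes the usual ZIP local file header signature (PK 03 04) and that Office Open XML packages carry a [Content_Types].xml entry; neither detail is stated above. The function name looks_like_ooxml is hypothetical, and the sample archive is an in-memory stand-in, not a real presentation.

```python
import io
import zipfile

# Assumption: signature of the first local file header in a non-empty ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def looks_like_ooxml(data: bytes) -> bool:
    """Return True if data is a ZIP archive that contains [Content_Types].xml."""
    if not data.startswith(ZIP_MAGIC):
        return False
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return "[Content_Types].xml" in archive.namelist()
    except zipfile.BadZipFile:
        # Starts with PK but is not a readable archive.
        return False


# Build a tiny stand-in archive in memory (hypothetical content).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
sample = buffer.getvalue()

print(sample[:2], looks_like_ooxml(sample))
```

The check ignores the file name entirely, which is the point: the extension tells the environment which application to launch, while the content says the file is a ZIP container.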

Exploit users through social engineering

It aims to confuse users about the purpose of a file

  • Malicious and common in phishing campaigns and malware

Approach: The file has a filename and presentation that confuses users

  • The mail client or file explorer presents what looks like a safe file with a known extension

  • But the icon is stored in the file metadata, and the file has two extensions (file.txt.exe)

Impact: The user thinks that a file is not malicious (e.g., that it is a Word document), while in reality it executes malicious code

Windows hides extensions of known file types

  • Sample.pptx becomes only Sample

Executable files may have an embedded icon

  • Freely defined by the developer

  • Explorer will show that icon

A file named Sample.pptx.exe will be shown as Sample.pptx

  • Users recognize the extension and may think the file is safe
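
The double-extension trick described above can be flagged with a short check. This is a sketch under stated assumptions: the two extension sets are illustrative rather than exhaustive, and the helper names displayed_name and has_deceptive_double_extension are hypothetical.

```python
from pathlib import PurePath

# Assumption: illustrative lists only, not a complete inventory.
EXECUTABLE_EXTENSIONS = {".exe", ".scr", ".com", ".bat"}
DECOY_EXTENSIONS = {".pptx", ".docx", ".xlsx", ".pdf", ".txt"}


def displayed_name(filename: str) -> str:
    """Name a user would see if the final extension is hidden."""
    return PurePath(filename).stem


def has_deceptive_double_extension(filename: str) -> bool:
    """True if an executable extension follows a document-like extension."""
    suffixes = [suffix.lower() for suffix in PurePath(filename).suffixes]
    return (
        len(suffixes) >= 2
        and suffixes[-1] in EXECUTABLE_EXTENSIONS
        and suffixes[-2] in DECOY_EXTENSIONS
    )


for name in ("Sample.pptx", "Sample.pptx.exe"):
    print(name, displayed_name(name), has_deceptive_double_extension(name))
```

A filter of this kind only looks at the name; it says nothing about the embedded icon, which would need inspection of the executable itself.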

In an RE task, a file may have bogus extensions

Increase the cost required for a reverse engineering task

Aims to disguise or manipulate files so that an RE task skips the file or processes it incorrectly

Approaches:

  • Hide content in a file without an extension, without headers, or with modified headers

  • Mangle content to make it less human-friendly

  • Use polyglots (files that are valid as more than one format)

Impact: The reverse engineer or forensic analyst will not process the file, or will not process it with the correct approach or tools

  • This may prevent the researcher from recovering the original file

Magic Headers

Besides extensions, most files can be recognized by a magic value at the start or end of the file

  • Manipulating headers can lead to incorrect detection and possibly incorrect processing

Some magic values:

  • Legacy Office documents (OLE2, e.g. DOC, XLS, PPT): D0 CF 11 E0

  • ELF: 7F E L F

  • JPG: FF D8

  • PNG: 89 P N G 0D 0A 1A 0A

  • Java class: CA FE BA BE
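
The magic values listed above can be turned into a small detector. The sketch below only matches signatures at the start of the data and uses the hypothetical names MAGIC_VALUES and identify; real tools check many more signatures and offsets.

```python
# Signatures taken from the list above; labels are for illustration only.
MAGIC_VALUES = {
    bytes.fromhex("D0CF11E0"): "Office document",
    b"\x7fELF": "ELF",
    bytes.fromhex("FFD8"): "JPG",
    b"\x89PNG\r\n\x1a\n": "PNG",
    bytes.fromhex("CAFEBABE"): "Java class",
}


def identify(data: bytes) -> str:
    """Return the type whose magic value prefixes data, or 'unknown'."""
    for magic, name in MAGIC_VALUES.items():
        if data.startswith(magic):
            return name
    return "unknown"


# Stand-in bytes: a PNG signature followed by placeholder content.
png_like = b"\x89PNG\r\n\x1a\n" + b"placeholder body"

# Overwriting a single header byte is enough to defeat this detector.
mangled = b"\x00" + png_like[1:]

print(identify(png_like), identify(mangled))
```

The mangled sample keeps all of its content except one byte, which illustrates how cheaply header manipulation can cause incorrect detection.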

Headers are important to maintain compatibility with third-party software

Headers may be irrelevant for custom software

  • The software has the filetype hard coded

PyInstaller converts Python code into an executable.

  • It packs the pyc files into a container. The container is extracted at runtime and the compiled Python code is executed

  • Headers are omitted from the pyc files. If a header is added back, the extracted file executes as a standard pyc file
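
Restoring a pyc header can be sketched as follows. The code assumes the 16-byte header layout used by CPython 3.7 and later (magic, flags, source mtime, source size) and reuses the magic number of the running interpreter; for a real extracted file the magic must match the Python version that produced the bytecode. The helper add_pyc_header is hypothetical, and the headerless input is simulated with marshal instead of being taken from an actual PyInstaller container.

```python
import importlib.util
import marshal

HEADER_SIZE = 16  # Assumption: CPython 3.7+ pyc header length.


def add_pyc_header(code_bytes: bytes) -> bytes:
    """Prepend a timestamp-style pyc header to marshalled code bytes."""
    header = importlib.util.MAGIC_NUMBER      # 4 bytes, interpreter specific
    header += (0).to_bytes(4, "little")       # flags: 0 means timestamp based
    header += (0).to_bytes(4, "little")       # source mtime, unknown so zeroed
    header += (0).to_bytes(4, "little")       # source size, unknown so zeroed
    return header + code_bytes


# Simulate a headerless pyc: just the marshalled code object.
code = compile("result = 6 * 7", "<extracted>", "exec")
headerless = marshal.dumps(code)

pyc = add_pyc_header(headerless)

# Check that the payload after the header still loads and runs.
namespace = {}
exec(marshal.loads(pyc[HEADER_SIZE:]), namespace)
print(namespace["result"])
```

Writing the resulting bytes to a file with a .pyc extension should give a file that the matching interpreter and common decompilers accept, since they rely on the header to recognize the format.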
