Exercise 1

Analyze the Hello application

The Java Virtual Machine

Java bytecode is designed for a stack-based machine.

  • Instructions pop values from the stack, and push the result.

  • Minimal number of registers (essentially only 2 for arithmetic).

  • Stack stores intermediate data.

Result

  • Very few assumptions about the target architecture (number of registers).

  • Maximizes compatibility.

  • Very compact code.

  • Simple tools (compiler), simple state maintenance.

A similar design is used in CPython, WebAssembly, PostScript, Apache Harmony and many others.
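To make the stack model above concrete, here is a minimal sketch of a stack-machine interpreter. The opcode names (`push`, `add`, `mul`) are hypothetical and chosen for illustration; they are not real JVM opcodes.

```python
def run(program):
    """Evaluate a list of (opcode, *operands) tuples on an operand stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            # Push a constant onto the stack.
            stack.append(args[0])
        elif op == "add":
            # Pop two operands, push their sum.
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "mul":
            # Pop two operands, push their product.
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    # The result of the computation is whatever is left on top.
    return stack.pop()


# (2 + 3) * 4: intermediate values live only on the stack.
program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
result = run(program)
```

Note that no instruction names a register: operands are implicit, which is why the encoding can stay compact and independent of the target architecture.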

The Android Environment

Android runs Linux, with both native binary programs and Java applications.

  • Most user space applications are Java (or HTML).

  • But can load binary objects through JNI or NDK.

The VM differs from the standard JVM, following a register-based architecture.

  • Originally named Dalvik.

  • Replaced by ART, which was introduced in Android 4.4 and became the default runtime in Android 5.0.

  • Both environments process the Dalvik bytecode from Dalvik Executable (DEX) files.

The focus is on better exploiting the capabilities of the hardware while keeping a low footprint.

  • Each application is executed in an independent VM instance.

  • Crashes and any other side effects are limited to one application.

  • Data isolation is ensured by the independent execution environments and forced communication through a single interface.

Dalvik VM

The machine model and calling conventions imitate common real architectures and C-style calling conventions.

  • The machine is register-based, and frames are fixed in size upon creation.

  • Each frame consists of several registers (specified by the method) as well as any adjunct data needed to execute the method.

  • Registers are considered 32 bits wide. Adjacent register pairs are used for 64-bit values.

  • A method may use up to 65,536 registers (v0 to v65535). Most instructions can address only the first 16, and some the first 256.
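The frame layout described above can be sketched as follows. The `Frame` class and its method names are hypothetical, and the choice of storing the low word in the lower-numbered register is an assumption made for this illustration.

```python
class Frame:
    """Fixed-size frame of 32-bit registers (hypothetical helper)."""

    MASK32 = 0xFFFFFFFF

    def __init__(self, register_count):
        # The number of registers is fixed when the frame is created.
        self.regs = [0] * register_count

    def put(self, index, value):
        # Store a 32-bit value in a single register.
        self.regs[index] = value & self.MASK32

    def put_wide(self, index, value):
        # Store a 64-bit value in the adjacent pair v[index], v[index + 1].
        # Assumption: low word first, high word second.
        self.regs[index] = value & self.MASK32
        self.regs[index + 1] = (value >> 32) & self.MASK32

    def get_wide(self, index):
        # Reassemble the 64-bit value from the register pair.
        return (self.regs[index + 1] << 32) | self.regs[index]


frame = Frame(4)
frame.put_wide(0, 0x1122334455667788)
```

After `put_wide`, registers v0 and v1 are both occupied, so a method that works with 64-bit values needs twice as many registers for them.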

Before execution, files are optimized for faster execution.

  • Some optimizations include resolving methods and updating the vtable.

    • Methods have a signature that must be resolved to an actual vtable entry. Optimization changes bytecode by resolving the method location (index) in the vtable.

  • The result is stored as an ODEX file in the Dalvik cache (/data/dalvik-cache).

    • Applications are stored "twice" as standard (APK with DEX) and optimized versions (ODEX).
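The vtable optimization described above can be sketched as a rewriting pass. This is an illustrative model only: the tuple-based instruction format, the sample vtable contents and the `optimize` helper are hypothetical. The rewritten opcode is named `invoke-virtual-quick` here, after the quickened form used in optimized DEX.

```python
# Hypothetical vtable for one class: position in the list is the vtable index.
vtable = ["toString()Ljava/lang/String;", "hashCode()I", "run()V"]


def optimize(instructions, vtable):
    """Replace signature-based virtual calls with index-based ones."""
    index_of = {signature: i for i, signature in enumerate(vtable)}
    optimized = []
    for op, operand in instructions:
        if op == "invoke-virtual":
            # Resolve the signature once, ahead of execution.
            optimized.append(("invoke-virtual-quick", index_of[operand]))
        else:
            optimized.append((op, operand))
    return optimized


code = [("const", 1), ("invoke-virtual", "run()V")]
optimized_code = optimize(code, vtable)
```

At run time the quickened call only needs an array lookup, instead of a search by method signature on every invocation.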

Bytecode is processed using a just-in-time (JIT) approach.

  • The VM compiles and translates code at run time, during execution.

  • Garbage collection tasks also execute in the foreground (impact on performance).

DEX files

Dalvik EXecutable files are the standard execution format for previous Android versions.

  • Created with the dx command:

    • In reality: java -Xmx1024M -jar ${SDK_ROOT}.../lib/dx.jar

  • However, the format is still relevant in current systems.

Contains Java bytecode that was converted to Dalvik bytecode.

  • Java uses a stack plus 4 registers, while DEX uses registers v0 to v65535.

    • DEX registers can be mapped to ARM registers (32-bit ARM has 13 general-purpose registers, r0 to r12).

  • Optimized for constrained devices, but not as compact, because individual instructions may be larger.

    • Java instructions take 1-5 bytes, while Dalvik instructions take 2-10 bytes.
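As an illustration of the size trade-off, the sketch below encodes the statement c = a + b both ways. The slot and register assignments (locals 1, 2, 3 on the Java side; v0, v1, v2 on the Dalvik side) are assumptions picked for the example, and `dalvik_add_int` is a hypothetical helper.

```python
import struct

# Java bytecode: four one-byte instructions working through the stack.
# iload_1 (0x1b), iload_2 (0x1c), iadd (0x60), istore_3 (0x3e)
java_code = bytes([0x1B, 0x1C, 0x60, 0x3E])


def dalvik_add_int(dest, src_a, src_b):
    """Encode add-int vAA, vBB, vCC (opcode 0x90) as two 16-bit code units."""
    unit0 = (dest << 8) | 0x90   # high byte: destination register, low byte: opcode
    unit1 = (src_b << 8) | src_a  # high byte: second source, low byte: first source
    return struct.pack("<HH", unit0, unit1)


# Dalvik bytecode: a single instruction naming all three registers.
dalvik_code = dalvik_add_int(0, 1, 2)

java_instruction_count = len(java_code)  # every instruction here is 1 byte
dalvik_instruction_count = 1
```

Here both encodings occupy 4 bytes, but Dalvik needs one instruction where Java needs four: fewer instructions to dispatch, each one larger.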

DEX bytecode is very similar to Java bytecode, and it can be converted both ways.

  • dx compiles .jar to .dex, dex2jar decompiles .dex to .jar.

  • Allows re-engineering applications (download the APK, reverse, modify, rebuild, sign, publish to a store).
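When reversing an APK, a common first step is confirming that a file really is a DEX file. The sketch below reads a few fixed header fields (magic and version, Adler-32 checksum, file size, header size, endian tag). The helper names are hypothetical, and `make_sample` builds a synthetic 0x70-byte header for demonstration only, not a loadable DEX file.

```python
import struct
import zlib


def parse_dex_header(data):
    """Return selected header fields, or raise ValueError if not DEX."""
    if len(data) < 0x70 or data[:4] != b"dex\n" or data[7:8] != b"\x00":
        raise ValueError("not a DEX file")
    version = data[4:7].decode("ascii")
    (checksum,) = struct.unpack_from("<I", data, 8)
    file_size, header_size, endian_tag = struct.unpack_from("<III", data, 32)
    return {
        "version": version,
        # The checksum covers everything after the checksum field itself.
        "checksum_ok": checksum == zlib.adler32(data[12:]),
        "file_size": file_size,
        "header_size": header_size,
        "endian_tag": endian_tag,
    }


def make_sample():
    """Build a synthetic header-only blob (signature left zeroed)."""
    blob = bytearray(0x70)
    blob[0:8] = b"dex\n035\x00"
    struct.pack_into("<III", blob, 32, len(blob), 0x70, 0x12345678)
    struct.pack_into("<I", blob, 8, zlib.adler32(bytes(blob[12:])))
    return bytes(blob)


info = parse_dex_header(make_sample())
```

A modified DEX file must have its checksum (and signature) recomputed before it is repackaged, which is one reason repackaging tools rebuild the header.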

DEX and Java Bytecode

| DEX opcode | Java bytecode | Purpose |
| --- | --- | --- |
| 60-66: sget-*, 52-58: iget-* | b2: getstatic, b4: getfield | Read a static or instance variable |
| 67-6d: sput-*, 59-5f: iput-* | b3: putstatic, b5: putfield | Write a static or instance variable |
| 6e: invoke-virtual, 6f: invoke-super, 70: invoke-direct, 71: invoke-static, 72: invoke-interface | b6: invokevirtual, b7: invokespecial, b8: invokestatic, b9: invokeinterface, ba: invokedynamic | Call a method |
| 20: instance-of | c1: instanceof | Return true if the object is an instance of the class |
| 1f: check-cast | c0: checkcast | Check whether a type cast can be performed |
| 22: new-instance | bb: new | New (unconstructed) instance of an object |
| 12-1c: const* | 12: ldc, 13: ldc_w, 14: ldc2_w | Define a constant |
| 21: array-length | be: arraylength | Get the length of an array |
| 23: new-array | bd: anewarray | Instantiate an array |
| 24-25: filled-new-array[/range] | N/A | Populate an array |
| 32-37: if-*, 38-3d: if-*z | 9f-a4: if_icmp*, a5-a6: if_acmp*, 99-9e: if* | Conditional branch |
| 2b: packed-switch | aa: tableswitch | Switch statement (dense case values) |
| 2c: sparse-switch | ab: lookupswitch | Switch statement (sparse case values) |
| 28: goto, 29: goto/16, 2a: goto/32 | a7: goto, c8: goto_w | Jump to an offset in the code |
| 27: throw | bf: athrow | Throw an exception |

Android RunTime (ART)

Alternative runtime which presents an optimized execution path.

  • Introduced in Android 4.4, implemented in C++, and supports 64-bit architectures.

  • Runs OAT files, which contain native code (not bytecode).

  • References to Java objects point towards C++ objects managed by the VM.

    • While application logic is expressed in Java, framework methods actually execute in native code.

ART introduces ahead-of-time (AOT) compilation.

  • At install time, ART compiles apps using the on-device dex2oat tool.

  • This utility accepts DEX files as input and generates a compiled app executable for the target device.

  • Improves performance over ODEX files, as repetitive file load operations are avoided.

Improves Garbage Collection by optimizing memory usage.

  • Avoiding GC driven app pauses.

  • Overall, it provides much better performance.

  • JIT compilation is less efficient, and doing it at run time costs performance and battery.

ART specific files

  • .oat – only at /system/framework/[arch]/boot.oat

    • Main ART format. OAT stands for "Of Ahead Time" (from Ahead of Time).

      • “We went with that because then we say that process of converting .dex files to .oat files would be called quakerizing and that would be really funny.” (a reference to the Quaker Oats Company)

  • .odex – an .OAT file containing the precompiled applications

    • Although the extension is the same as under Dalvik, .odex files under ART are OAT files, which are in reality ELF files.

    • Stored in /data/dalvik-cache

      • But Dalvik is not used with ART…

  • .art – only at /system/framework/[arch]/boot.art

    • An .OAT file containing vital framework classes (base Java classes to be used by ART).

  • .vdex - contains the uncompressed DEX code of the APK, with some additional metadata to speed up verification.

    • Assumed to be already verified DEX files.

OAT files (or ODEX files in ART, which are also OAT)

  • Are ELF files containing DEX code.

    • OAT Header, followed by DEX files in an ELF container.

      • DEX files can be extracted with oat2dex.

  • Java classes in the DEX file are mirrored in C++.

    • java.lang.String -> art::mirror::String

    • When the Java code creates an object, the object is created in the C++ (native) code by the VM.

      • The VM handles references to the C++ object.

  • On boot, common objects are instantiated (ones in Android Framework) by loading boot.art.

    • To speed up execution as such classes are required by most applications.
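The "DEX inside an ELF container" layout can be explored with a naive scan, sketched below. This is not how oat2dex works: a real extractor parses the OAT header, whereas this hypothetical `find_embedded_dex` helper only checks the ELF magic and searches for DEX magic bytes, trusting the file-size field at offset 32 of each candidate. The sample blob is synthetic.

```python
import struct

ELF_MAGIC = b"\x7fELF"
DEX_MAGIC = b"dex\n"


def find_embedded_dex(blob):
    """Return (offset, size) pairs for DEX-looking regions in an ELF blob."""
    if not blob.startswith(ELF_MAGIC):
        raise ValueError("not an ELF container")
    found = []
    pos = blob.find(DEX_MAGIC)
    while pos != -1:
        if pos + 36 <= len(blob):
            # file_size is a little-endian uint32 at offset 32 of the DEX header.
            (size,) = struct.unpack_from("<I", blob, pos + 32)
            # Keep only candidates whose declared size is plausible.
            if 0x70 <= size <= len(blob) - pos:
                found.append((pos, size))
        pos = blob.find(DEX_MAGIC, pos + 1)
    return found


# Synthetic container: ELF magic, padding, one fake DEX header, padding.
fake_dex = bytearray(0x70)
fake_dex[0:8] = b"dex\n035\x00"
struct.pack_into("<I", fake_dex, 32, len(fake_dex))
sample = ELF_MAGIC + b"\x00" * 60 + bytes(fake_dex) + b"\x00" * 16

matches = find_embedded_dex(sample)
```

Each match could then be sliced out with `sample[offset:offset + size]` and handed to a DEX-aware tool.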
