Exercise 1
Analyze the Hello application
Java bytecode is built for a stack-based machine.
Instructions pop values from the stack, and push the result.
Minimal number of registers (essentially only 2 for arithmetic).
Stack stores intermediate data.
Very few assumptions about the target architecture (number of registers).
Maximizes compatibility.
Very compact code.
Simple tools (compiler), simple state maintenance.
Similar designs are used in CPython, WebAssembly, PostScript, Apache Harmony and many others.
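To make the stack-based model concrete, here is a minimal sketch of such an interpreter in Python. The instruction names (push, add, mul) and the run function are hypothetical and chosen only for illustration; they are not real JVM opcodes.

```python
def run(program):
    """Evaluate a list of (opcode, operand) tuples on an operand stack."""
    stack = []
    for op, arg in program:
        if op == "push":
            # Push a constant onto the stack.
            stack.append(arg)
        elif op == "add":
            # Pop two operands, push their sum.
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "mul":
            # Pop two operands, push their product.
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    # The result of the computation is whatever is left on top.
    return stack.pop()


# (2 + 3) * 4, with every intermediate value living on the stack.
program = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]
result = run(program)
```

Note that no instruction names a register: operands are implicit, which is what keeps the encoding compact.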
Android runs Linux with native binary programs and Java applications.
Most user space applications are Java (or HTML).
But can load binary objects through JNI or NDK.
The VM differs from the standard JVM, following a register-based architecture.
Originally named Dalvik.
Later replaced by ART, which was introduced in Android 4.4 and became the default runtime in Android 5.0.
Both environments process the Dalvik bytecode from Dalvik Executable (DEX) files.
Focus on better exploiting the capabilities of the hardware, while keeping a low footprint.
Each application is executed in an independent VM instance.
Crashes and any other side effects are limited to one application.
Data isolation is ensured by the independent execution environments and forced communication through a single interface.
The machine model and calling conventions imitate common real architectures and C-style calling conventions.
The machine is register-based, and frames are fixed in size upon creation.
Each frame consists of several registers (specified by the method) as well as any adjunct data needed to execute the method.
Registers are considered 32-bit wide. Adjacent register pairs are used for 64-bit values.
A method may address up to 65536 registers (v0 to v65535); most instructions can address only the first 16, and many can address the first 256.
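The frame layout described above can be sketched as follows. This is an illustrative model only: the Frame class and its method names are hypothetical, values are treated as unsigned, and placing the low half of a 64-bit value in the lower-numbered register is an assumption made for the sketch.

```python
MASK32 = 0xFFFFFFFF


class Frame:
    """Hypothetical method frame: a fixed number of 32-bit registers."""

    def __init__(self, register_count):
        # The frame size is fixed when the frame is created.
        self.regs = [0] * register_count

    def put(self, index, value):
        # Store a 32-bit value in a single register.
        self.regs[index] = value & MASK32

    def get(self, index):
        return self.regs[index]

    def put_wide(self, index, value):
        # A 64-bit value occupies the adjacent pair v[index], v[index + 1].
        # Assumption: low 32 bits go in the lower-numbered register.
        self.regs[index] = value & MASK32
        self.regs[index + 1] = (value >> 32) & MASK32

    def get_wide(self, index):
        return self.regs[index] | (self.regs[index + 1] << 32)


frame = Frame(4)
frame.put(0, 42)
frame.put_wide(2, 0x1122334455667788)
```

Here v2 holds 0x55667788 and v3 holds 0x11223344, so a 64-bit operand costs two of the method's registers.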
Before execution, files are optimized for faster execution.
Some optimizations include resolving methods and updating the vtable.
Methods have a signature that must be resolved to an actual vtable entry. Optimization changes the bytecode by resolving the method location (index) in the vtable.
The result is stored as an .odex file in /system/cache.
Applications are stored "twice" as standard (APK with DEX) and optimized versions (ODEX).
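The vtable optimization can be illustrated with a small sketch. The tuple-based bytecode, the build_vtable and optimize helpers, and the sample signatures are all hypothetical; the rewritten opcode name is loosely modeled on the index-based invoke variants found in optimized DEX files, and this is not how the real optimizer is implemented.

```python
def build_vtable(signatures):
    """Return (vtable, index_of) for an ordered list of method signatures."""
    vtable = list(signatures)
    index_of = {sig: i for i, sig in enumerate(vtable)}
    return vtable, index_of


def optimize(bytecode, index_of):
    """Replace symbolic virtual calls with calls by vtable index."""
    optimized = []
    for op, operand in bytecode:
        if op == "invoke-virtual":
            # Resolve the signature once, ahead of execution.
            optimized.append(("invoke-virtual-quick", index_of[operand]))
        else:
            optimized.append((op, operand))
    return optimized


vtable, index_of = build_vtable(
    ["toString()Ljava/lang/String;", "hashCode()I", "sayHello()V"]
)
bytecode = [("const", 1), ("invoke-virtual", "sayHello()V"), ("return-void", None)]
optimized = optimize(bytecode, index_of)
```

After optimization the call site carries the index 2 instead of the signature string, so the lookup no longer has to happen at run time.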
Bytecode is processed using a just-in-time (JIT) approach.
The VM compiles and translates code at run time, during execution.
Garbage collection tasks also execute in the foreground (impact on performance).
Dalvik Executable (DEX) files were the standard execution format in previous Android versions.
Created with the dx command, which in reality runs: java -Xmx1024M -jar ${SDK_ROOT}.../lib/dx.jar
However, the format is still relevant in current systems.
Contains Java bytecode that was converted to Dalvik bytecode.
Java uses a stack plus 4 registers, while DEX uses registers v0 to v65535.
DEX registers can be mapped to ARM registers (ARM has 10 general-purpose registers).
Optimized for constrained devices, but not as compact, since instructions may be larger.
Java instructions take 1-5 bytes, while DEX instructions take 2-10 bytes.
DEX is very similar to Java bytecode, and code can be converted both ways.
dx compiles .jar to .dex, and dex2jar decompiles .dex to .jar.
This allows re-engineering applications (download the APK, reverse it, change it, rebuild, sign, publish to a store).
| Dalvik opcode | Java opcode | Description |
|---|---|---|
| 60-66:sget-* 52-58:iget-* | b2:getstatic b4:getfield | Read a static or instance variable |
| 67-6d:sput-* 59-5f:iput-* | b3:putstatic b5:putfield | Write a static or instance variable |
| 6e:invoke-virtual 6f:invoke-super 70:invoke-direct 71:invoke-static 72:invoke-interface | b6:invokevirtual ba:invokedynamic b7:invokespecial b8:invokestatic b9:invokeinterface | Call a method |
| 20:instance-of | c1:instanceof | Return true if obj is of class |
| 1f:check-cast | c0:checkcast | Check if a type cast can be performed |
| 22:new-instance | bb:new | New (unconstructed) instance of object |
| 12-1c:const* | 12:ldc 13:ldc_w 14:ldc2_w | Define constant |
| 21:array-length | be:arraylength | Get length of an array |
| 23:new-array | bd:anewarray | Instantiate an array |
| 24-25:filled-new-array[/range] | N/A | Populate an array |
| 32..37:if-* 38..3d:if-*z | 9f-a4:if_icmp* 99-9e:if* | Branch on a logical condition |
| 2b:packed-switch | aa:tableswitch | Switch statement (contiguous cases) |
| 2c:sparse-switch | ab:lookupswitch | Switch statement (sparse cases) |
| 28:goto 29:goto/16 2a:goto/32 | a7:goto c8:goto_w | Jump to offset in code |
| 27:throw | bf:athrow | Throw exception |
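As a small illustration of how the opcode correspondences listed above could be used in a tool, the sketch below maps a few single-opcode Dalvik instructions to their Java counterparts. The dictionary covers only a subset of the entries above, and the names DALVIK_TO_JAVA and java_equivalent are hypothetical.

```python
# Dalvik opcode byte -> (Dalvik mnemonic, Java mnemonic).
# Only one-to-one entries from the list above are included.
DALVIK_TO_JAVA = {
    0x1F: ("check-cast", "checkcast"),
    0x20: ("instance-of", "instanceof"),
    0x21: ("array-length", "arraylength"),
    0x22: ("new-instance", "new"),
    0x23: ("new-array", "anewarray"),
    0x27: ("throw", "athrow"),
    0x6E: ("invoke-virtual", "invokevirtual"),
    0x71: ("invoke-static", "invokestatic"),
    0x72: ("invoke-interface", "invokeinterface"),
}


def java_equivalent(dalvik_opcode):
    """Return the Java mnemonic for a Dalvik opcode byte, or None if unmapped."""
    entry = DALVIK_TO_JAVA.get(dalvik_opcode)
    if entry is None:
        return None
    return entry[1]


new_mnemonic = java_equivalent(0x22)
```

A real converter also has to translate between register operands and stack operands, which a plain opcode lookup does not address.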
ART (Android Runtime) is an alternative runtime that provides an optimized execution path.
Introduced in Android 4.4, implemented in C++, and supports 64-bit architectures.
Runs OAT files, which contain native code (not bytecode).
References to Java objects point towards C++ objects managed by the VM.
While application logic is expressed in Java, framework methods actually execute in native code.
ART introduces ahead-of-time (AOT) compilation.
At install time, ART compiles apps using the on-device dex2oat tool.
This utility accepts DEX files as input and generates a compiled app executable for the target device.
Improves performance over ODEX files, as repetitive file load operations are avoided.
Improves garbage collection by optimizing memory usage, avoiding GC-driven app pauses.
Overall, it provides much better performance.
JIT is not that efficient, and compiling at run time hurts performance and battery life.
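The difference between the two strategies can be sketched with a toy model. Everything here is hypothetical: "bytecode" is just an (operation, constant) pair, compile_method stands in for the real compiler, and the counters only show when the compilation work happens.

```python
def compile_method(bytecode):
    """Toy 'compiler': turn an (operation, constant) pair into a callable."""
    op, constant = bytecode
    if op == "mul":
        return lambda x: x * constant
    if op == "add":
        return lambda x: x + constant
    raise ValueError(f"unknown operation: {op}")


class JitRuntime:
    """Compiles a method the first time it is called, during execution."""

    def __init__(self, methods):
        self.methods = methods
        self.cache = {}
        self.compiles_during_run = 0

    def call(self, name, x):
        if name not in self.cache:
            self.cache[name] = compile_method(self.methods[name])
            self.compiles_during_run += 1
        return self.cache[name](x)


class AotRuntime:
    """Compiles every method once, at install time."""

    def __init__(self, methods):
        self.compiled = {name: compile_method(bc) for name, bc in methods.items()}
        self.compiles_during_run = 0

    def call(self, name, x):
        return self.compiled[name](x)


methods = {"double": ("mul", 2), "increment": ("add", 1)}
jit = JitRuntime(methods)
aot = AotRuntime(methods)
jit_results = [jit.call("double", 21), jit.call("double", 4)]
aot_result = aot.call("increment", 5)
```

In this model the JIT runtime pays for one compilation while the app is running, whereas the AOT runtime pays for all of them before the first call.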
.oat – only at /system/framework/[arch]/boot.oat
Main ART format. OAT: Of Ahead Time (from Ahead of Time).
“We went with that because then we say that process of converting .dex files to .oat files would be called quakerizing and that would be really funny.” (a reference to the Quaker Oats Company)
.odex – an OAT file containing the precompiled application.
Although they use the same extension as Dalvik's optimized files, .odex files under ART are OAT files, which are in reality ELF files.
Stored in /data/dalvik-cache
But Dalvik is not used with ART…
.art – only at /system/framework/[arch]/boot.art
An .OAT file containing vital framework classes (base Java classes to be used by ART).
.vdex – contains the uncompressed DEX code of the APK, with some additional metadata to speed up verification.
Assumed to be already verified DEX files.
OAT files are ELF files containing DEX code.
An OAT header, followed by DEX files, in an ELF container.
DEX files can be extracted with oat2dex.
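Because these containers are ELF files, while plain DEX files start with their own magic bytes, a file's type can be told apart by inspecting its first bytes. The sketch below is a minimal check and not a parser; the function names identify and identify_file are hypothetical.

```python
ELF_MAGIC = b"\x7fELF"       # first four bytes of any ELF file
DEX_MAGIC_PREFIX = b"dex\n"  # DEX files start with "dex\n" plus a version


def identify(data):
    """Classify a byte string by its leading magic bytes."""
    if data.startswith(ELF_MAGIC):
        return "elf"
    if data.startswith(DEX_MAGIC_PREFIX):
        return "dex"
    return "unknown"


def identify_file(path):
    """Read the first bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify(handle.read(8))


sample_kind = identify(b"dex\n035\x00")
```

Running identify_file on an .odex file taken from an ART device would be expected to report an ELF container rather than raw DEX.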
Java classes in DEX files are mirrored in C++, for example java.lang.String -> art::mirror::String.
When the Java code creates an object, the object is created in the C++ (native) code by the VM.
The VM handles references to the C++ object.
On boot, common objects (the ones in the Android Framework) are instantiated by loading boot.art.
This speeds up execution, as such classes are required by most applications.