A few years ago, mobile phones were used to make calls or send messages. Today, they surpass computers as the most commonly used digital device. They manage our agendas, emails, credit cards, itineraries and business documents. Android is the most popular operating system for mobile and embedded devices: it has the largest application market, and 85% of all smartphones sold in 2019 were equipped with an Android OS [1]. Android is an open platform, which means that applications can be downloaded from sources other than the official Google Play store. This important feature has contributed to its unquestionable success, given the breadth of available applications that draws people to the platform, but it also makes Android an ideal target for malicious application downloads.
Indeed, users are increasingly exposed to attacks targeting the Android environment via malicious applications. Such applications endanger private information by disclosing sensitive data (FakeNetflix malware [2]) or collecting sensitive banking information, especially with the increasing use of banking applications (Anubis trojan [3]). Furthermore, installing malicious applications that appear legitimate can lead to clandestine eavesdropping on telephone conversations, tracking of the GPS position, and exploitation of pay services that causes financial losses to the user for the benefit of the attacker, by calling or sending SMS messages to premium-rate numbers without the user’s knowledge (SMS Trojans such as FakePlayer, AsiaHitGroup and GGTracker [4
To deal with this, automated tools for analyzing, verifying and enforcing the security of Android applications are highly needed [7]. Nevertheless, they must be based on a formal specification of the target platform to give solid results. In this paper, we propose formal operational semantics for a subset of the low-level Android code, which we consider particularly relevant for modeling Android applications and which we call Smali+. It includes the main bytecode instructions of Dalvik and a few important API methods related to Java concurrency. Smali+ is ultimately derived from Smali, with some essential native methods replaced by macro-instructions for simplification. Smali+ is intended to serve as a basis for further analysis of Android applications and security implementation techniques.
Android applications are mainly written in Java. The Java source files are first compiled by the standard Java compiler, javac, into class files that store Java Virtual Machine (JVM) bytecode. The Java bytecode is then translated into an optimized bytecode called Dalvik through a tool called dx. At this stage, all the class files are converted and consolidated into a single Dalvik EXecutable file, or simply a DEX file, to save memory. An Android Package Kit (APK) is essentially a zip archive of the DEX file accompanied by an AndroidManifest.xml file, a set of resources and potentially shared libraries. Figure 1 illustrates these steps.
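Since an APK is essentially a zip archive, its layout can be inspected with any zip reader. The following Python sketch builds a toy archive in memory and groups its entries by role. The entry names under res/ and lib/ are illustrative assumptions, and the byte contents are placeholders rather than a valid DEX file or manifest.

```python
import io
import zipfile


def build_toy_apk() -> bytes:
    """Build an in-memory zip that mimics the layout of an APK (placeholder contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"<manifest/>")
        apk.writestr("classes.dex", b"placeholder, not real Dalvik bytecode")
        apk.writestr("res/layout/main.xml", b"<LinearLayout/>")   # hypothetical resource
        apk.writestr("lib/armeabi-v7a/libnative.so", b"")         # hypothetical shared library
    return buffer.getvalue()


def classify_entries(apk_bytes: bytes) -> dict:
    """Group archive entries into the parts of an APK named in the text."""
    groups = {"dex": [], "manifest": [], "resources": [], "libraries": [], "other": []}
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        for name in apk.namelist():
            if name.endswith(".dex"):
                groups["dex"].append(name)
            elif name == "AndroidManifest.xml":
                groups["manifest"].append(name)
            elif name.startswith("res/"):
                groups["resources"].append(name)
            elif name.startswith("lib/"):
                groups["libraries"].append(name)
            else:
                groups["other"].append(name)
    return groups


if __name__ == "__main__":
    for role, names in classify_entries(build_toy_apk()).items():
        print(role, names)
```

The same classification applies to a real APK read from disk, but a real manifest is stored in a binary XML encoding and needs a tool such as Apktool to be decoded.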
In this work, we focus on the DEX file format, which contains the Dalvik binary code that is still used by the successor of Dalvik, the Android Runtime (ART), introduced in Android 5.0.
Formalizing low-level code, rather than high-level Java source or intermediate-level Java bytecode, is our choice for several reasons. Firstly, Dalvik bytecode is always available and is easily obtainable from any Android application. Secondly, Dalvik bytecode is the common executable format for all Android applications, and therefore it is much closer to the code actually executed. Although decompilation from Dalvik back to Java or to Java bytecode is possible using reverse engineering tools (such as dex2jar and ded), there is no guarantee of recovering the original source code, since no Dalvik-to-Java reverse translation tool is fully robust and correct [11]. Moreover, even when source code or Java bytecode can be retrieved from Dalvik, editing or improving code at this level requires the user to convert it back to Dalvik, and running the application afterward will often fail [9]. Focusing directly on Smali avoids such problems. However, the binary code obtained at this level, in the DEX file, is illegible and requires conversion into a more understandable format before being analyzed, improved or edited. Reverse engineering makes it possible to convert a machine-readable binary file into a human-readable file, which is what is needed for DEX files.
Apktool is a reverse engineering tool that simplifies the entire process of assembling and disassembling Android applications. It includes “Smali” and “bakSmali”, which are equivalent to an “assembler” and a “disassembler”, respectively, allowing the passage to and from the DEX format. Apktool allows the user to disassemble applications to a nearly original form. It uses BakSmali to produce, from an APK, a human-readable format akin to assembly languages called Smali (Smali is both the name of a mnemonic language for the Dalvik bytecode and the name of its assembler). This code is nothing but a translation of the bytecode executed by the DVM. In other words, it is a readable representation of Dalvik bytecode in an assembly-like code, with mnemonic instructions. BakSmali creates a Smali file for each class in the application, preserving the original signature. The structure of such a file is presented in Figure 2. In addition to the code contained in the classes.dex file, Apktool generates the application’s decoded resources, as well as the AndroidManifest.xml file (in a readable version). These reverse engineering analysis techniques are still effective with the newly introduced ART environment [13
In this paper, we put forward formal semantics for Smali. Smali is an assembly-like language that runs on Android’s DVM. It is obtained by ‘bakSmaling’ the Dalvik executable file (.dex). A syntax and semantics have been adopted to specify this low-level code. The resulting formal language, called Smali+, is a simpler sub-language of Smali. A set of the most used Dalvik instructions has been generalized into 12 semantically different instructions (see [11] for the generalization process), compared to more than 200 original Dalvik instructions in Smali. In addition to this set, our semantics includes instructions related to multi-threading. We plan to use Smali+ in the near future to specify security properties for Android applications, in order to protect the user from security threats that target the Android environment through downloaded applications.
The paper is organized as follows. In Section 2, we present related work on bytecode formalization and discuss its advantages as well as its drawbacks and limitations. In Section 3, we give some essential preliminaries related to Smali (registers, adopted notations, types, etc.). In Section 4, we present the syntax and operational semantics of Smali+ for a single-threaded application. In Section 5, we present the syntax and operational semantics of Smali+ for a multi-threaded application. In Section 8, we conclude and introduce future avenues of our research.
2. Related Work
Most studies based on formal semantics of Android target a single well-defined goal. This can be an analysis for certification, the detection of potential vulnerabilities or malicious behavior in an application, or the verification of a given aspect. It can also be a means to reveal security breaches in Android applications [14]. We will see in the studies presented hereafter that the formalized elements differ substantially from one objective to another. That being said, it is practically impossible to evaluate the efficiency of analyses that are not based on a formal specification of the targeted platform.
Payet et al. define operational semantics for a subset of Dalvik opcodes covering register manipulation, arithmetic operations, object creation and access, and method calls, as well as Android activities. The semantic rules are relatively complex. An Android program is modeled as a graph of blocks, where each block contains one or more of the selected instructions. Blocks are linked so as to express the control flow passing from one block to another. Invoke and return instructions are required to occur only at the beginning and at the end of a block, respectively. The semantics of a block incorporates the semantics of its instructions other than calls and returns. The semantics of a call instruction allows passing from the block of the caller method to the block of the callee method. Activity semantics depend on the activity state, callback methods, the activity life cycle and external events. These semantics are defined to be the basis of static analyses that take into account the life cycle of activities. Despite the importance of the connection between threads and activities in Android semantics, threading was detached from the activity semantics and concurrency was ignored in this work.
], the authors propose a formal operational semantics for the Dalvik bytecode. The formalization is accompanied by a control flow analysis to detect potentially malicious actions. Although the results highlight threading as the most often used language feature (90.18%), this feature was omitted from both the analysis and the semantics, which focus instead on reflection, exceptions and dynamic dispatch, with 73.00% and 19.53%, respectively, which we find somewhat surprising. This motivates us to pay special attention to modeling multi-threading for Android.
], the authors present SymDroid, a Dalvik bytecode interpreter for detecting potential security vulnerabilities. It performs symbolic execution on a simplified intermediate language covering a fraction of Dalvik opcodes, named μ-Dalvik. SymDroid receives the Dalvik bytecode (the .dex file) as input. The bytecode is first translated to μ-Dalvik, which is based on 16 instructions considered the most relevant for code analysis. Then, it is processed by a symbolic execution core that uses an SMT solver to generate traces as an intermediate result. Finally, the post-analyzer inspects the output traces and determines the final result. Entry points and all possible events affecting the application’s behavior are specified in a client-oriented way (it is up to the user to model them) to drive the application under test as desired. Although this work models, in addition to bytecode instructions, system libraries including Bundle and Intent, the life cycle of Android components, services and views, it ignores the system’s concurrent nature, both in the selected bytecode instructions and at the symbolic execution level, where the program is considered sequential.
In the same vein, Julia, presented in [18], is a static analyzer for Java bytecode based on abstract interpretation. It was extended in [19] and adapted to analyze Dalvik bytecode and handle specific features of Android such as its event-driven nature, potentially concurrent entry points and dynamic inflation of graphical views. It applies several static analyses to Android applications, namely classcast, nullness, dead code and termination analyses, but does not track information flow. Multi-threaded applications were not included in this work, and event handlers are executed by a single thread.
Gunadi et al. [20] propose an operational semantics of DEX bytecode for certifying non-interference properties through a type system. This study includes a translation process from the Java bytecode semantics developed in [22] to Dalvik bytecode, concluding that if the first type system guarantees non-interference, then its translation into Dalvik bytecode is also typable. Therefore, existing bytecode verifiers for Java could certify non-interference properties of Dalvik bytecode.
The semantics of multi-threaded programming in applications has lately drawn increasing attention. Some studies combine it with event handling [23], others consider the main API methods related to it [26]. In [24], Kanade proposes a semantics for a combined concurrency model of threads and events. This work focuses entirely on the event-driven nature of Android and its relationship with the application’s threads. As a consequence, all other states that the semantics could reach, such as those resulting from the execution of basic instructions (method call, jump, return, etc.), are not treated. The semantics proposed in [26] is the closest to ours. It covers the most important Dalvik instructions and handles multi-threading. That paper could be seen as an extension of [27], with the major change being the semantics needed for the concurrent setting and exception handling. However, thread scheduling is not discussed, and thread spawning is left to the virtual machine to execute at an unpredictable point in time.
Along the same lines, in [28], Chaudhuri presents a formal security study of Android using operational semantics and a type system for specific Android constructs. However, the semantics ignores all Java constructs that may appear in Android applications (no class or method modeling) and focuses instead on Android components, intents and the Android-specific features related to them (binding a service, sending an intent, etc.). This can be seen as a unified formal understanding of security that helps users and developers of Android applications deal with their security concerns.
Some works have focused on other aspects of Android such as multi-tasking. For instance, ASM, presented in [29], is a formal model that formalizes all Android elements related to multi-tasking, such as activities, back stacks and tasks. An Android application is seen as a collection of activities of different types that interact with the user through a back stack. ASM has recently been extended in [30] to capture all the core elements of the multi-tasking mechanism used in inter-component communication.
Over time, formalization has included the permission system as well [31]. For example, Bagheri et al. propose in [31] a formal specification of the permission system of Android applications, written in a specification language called Alloy. It aims to formally specify the behavior of Android applications, in particular the mutual interaction between applications based on permissions and the security consequences of that interaction, which the authors call inter-app permission leakage vulnerabilities. Almost all Android elements related to inter-app permissions are taken into account in the formalism. Every application is modeled as a set of components, permissions, intent filters and vulnerable paths. Similarly, in [33], a formal model of Android’s permissions is specified in the syntax of the Coq theorem prover.
] is an automated testing tool for Android apps. It is based on Acteve [35] but improved to support input events and broadcast events in order to achieve higher coverage. The authors use a non-standard operational semantics that describes the concolic execution of the program. The semantics describes program execution in response to a sequence of events generated automatically from an external environment. All other features and instructions that Android handles are neglected in favor of the event-driven paradigm, which we find not expressive enough to model an Android application. Our operational semantics considers, besides concurrency, a variety of instructions that model method invocation, object creation and the whole tree structure of an application (classes, methods and fields).
], the authors focus on the low-level interactions with the operating system by recording the system calls (syscalls) invoked. To benefit from both levels, the analysis uses generic low-level syscall traces to reconstruct the high-level semantics. While syscall analysis offers more security guarantees, in our opinion it complicates the task, especially because this information is extracted from internal interfaces between the Android libraries and the kernel, which may change in later versions of Android without notice. In our work, we propose a rich semantics that covers API calls at a high level, and we consider that this is sufficient to enforce security policies later.
Some flow analysis tools, such as Stowaway [8] and ComDroid [37], directly analyze the disassembled DEX file of a given application to identify potential component and/or communication vulnerabilities. Despite the promising results of both tools in analyzing Dalvik bytecode and Android’s API, proving their soundness and evaluating their efficiency or deficiency is practically impossible in the absence of a formal specification and proof.
Concurrent programming concepts and techniques are widely used in Android in order to manage different tasks and threads. Our formalism Smali+ considers this important feature, which has been neglected before because of its complexity. Overall, none of the aforementioned studies, including those considering multi-threading, offers a complete semantics that covers all the states a thread can reach and represents all multi-threading essentials. Most of the studies that formalize Dalvik bytecode and handle multi-threading include only the two Dalvik instructions related to monitor use, monitor-enter and monitor-exit, since these are the only Dalvik opcodes related to threading. However, a semantics for an Android program should not be limited to these instructions and must also consider instructions related to thread communication, signaling and scheduling. In this paper, we fill this gap by proposing a semantics that incorporates, in addition to Dalvik instructions, a wide range of API methods covering multi-threading essentials, formulated as macro-instructions for the sake of simplicity. In comparison with test-based approaches, Smali+ is based on formal methods with their foundation in mathematical logic, allowing us to achieve rigorous and unambiguous reasoning in the system specification and proofs and to ensure system properties, whereas test-based approaches can only ensure that systems satisfy the requirements for the tested cases. In sum, the proposed formal language is expressive enough to enforce security properties and to detect security-critical APIs (i.e., those related to sensitive data access such as camera, SMS, telephony and contact list). Its syntax includes the fully qualified class name for each invoked method, which makes such APIs easier to locate.
In this section, we present the most essential information for Smali. First, we present the DVM architecture and how it affects Smali syntax. Then, we present method invocation and how it affects Smali registers. Finally, we present Smali special notations for types.
Being optimized to run on devices with scarce resources and limited processor speed, the DVM has a register-based architecture. Local variables are assigned to any of the available registers. A register can hold any data value, except for double values, each of which requires two registers (64 bits). Dalvik opcodes operate on register contents instead of accessing elements on a program stack, as stack-based virtual machines do. Hence, registers allow the DVM to keep track of program evolution while it executes bytecode [38]. Each method in Smali has its own set of registers for its arguments and local variables, plus a special register for its return value. We will see later that most instructions include source and destination registers. The Smali language denotes each set of registers differently, which allows us to visually distinguish between a method’s local and argument registers.
The alternate .locals directive specifies the number of local registers used by the method (non-parameter registers), which is statically known. Local registers in Smali are denoted by v0, v1, and so on, where v0 is the first local register, v1 the second, and so on until the last register. This includes a special register for the method’s return value, which allows passing return values from the callee back to the caller.
Parameter registers in Smali are denoted by p0, p1, and so on. The first parameter of a non-static method is always the object on which the method is invoked; in this case, p0 holds the object reference and p1 is the second parameter register. For a static method invocation, p0 is the first parameter register. For more details, please see the Method invocation subsection.
The .registers directive specifies the total number of registers in the method. This includes the registers needed to hold the method parameters, which are stored in the last registers in the method.
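Because parameters occupy the last registers of a method, each parameter register name is an alias for one of the numbered registers. The following Python sketch computes that aliasing from the total register count and the number of parameter registers. The function name and the example figures are illustrative assumptions; the parameter count is a count of registers, so a 64-bit argument contributes two.

```python
def register_layout(total_registers: int, parameter_registers: int) -> dict:
    """Map each parameter register name (p0, p1, ...) to its numbered alias (vN).

    total_registers     -- the value given by the .registers directive
    parameter_registers -- registers used by parameters, including the one for
                           the receiver object of a non-static method
    """
    if parameter_registers > total_registers:
        raise ValueError("a method cannot have more parameter registers than registers")
    # Everything that is not a parameter register is a local register (.locals).
    local_registers = total_registers - parameter_registers
    return {
        f"p{index}": f"v{local_registers + index}"
        for index in range(parameter_registers)
    }


if __name__ == "__main__":
    # A non-static method with two int arguments and ".registers 5":
    # three parameter registers (receiver + 2 arguments), hence two locals.
    print(register_layout(5, 3))
```

With these figures, v0 and v1 are the local registers and p0, p1, p2 alias v2, v3, v4, which is why declaring .locals 2 describes the same method as declaring .registers 5.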
3.2. Method Invocation
The DVM conforms to the ARM calling convention, which is used for low-level code where parameters, return values, return addresses and scope links are placed in registers. It dictates how these elements are shared between the caller and the callee. In fact, the two share a part of their register arrays, so that the caller passes arguments to the callee by setting its parameter registers in the right order. For class (static) methods, a lookup procedure searches the list of all static methods that belong to the named class (classes have distinct names) and locates the invoked method through its signature (i.e., name, argument types and number, and return type). Then, its parameter register array is set according to the ARM calling convention, so that the first argument goes to the first parameter register and so on until the last argument, which goes to the last parameter register (n arguments lead to n parameter registers).
In the dynamic invocation case, the class of the object whose method is being called (the recipient object’s class) is statically unknown, so it is first retrieved from the heap through the object reference (see the semantics section for more details). Then, a lookup procedure searches the class method list and upwards along its super-class chain for a method matching the given method signature. The parameter registers comprise an additional register for the object reference, called p0 in Smali code. Hence, the actual number of parameter registers is n + 1.
Local register contents are initially undefined (registers are untyped in Dalvik); however, their number is statically known.
3.3. Types in Smali
Smali code has two major classes of types, primitive types and reference types.
Primitive types have a particular notation in Smali: a single letter specifies each type. For example, V is used for the void type.
Reference types are objects (i.e., class types) and arrays. A class type takes the form Lpackagename/ClassName; where the leading L indicates that it is a class type, packagename is the path of the package to which the class belongs, and ClassName is the class name. For example, a thread object in Smali has the type Ljava/lang/Thread; which is equivalent to java.lang.Thread in Java. Arrays take the form [type (where type could be a primitive or a reference type). Arrays with multiple dimensions are written with the corresponding number of "[" characters. For example, a two-dimensional array of ints is written [[I, which is equivalent to int[][] in Java. Table 1 summarizes the different types in Smali.
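The type notation described above is regular enough to be decoded mechanically. The following Python sketch converts a Smali type descriptor into the corresponding Java type name. The function name is an illustrative assumption, and the letter table follows the standard Dalvik type descriptors.

```python
# Single-letter descriptors of the primitive types (plus void).
PRIMITIVE_TYPES = {
    "V": "void", "Z": "boolean", "B": "byte", "S": "short", "C": "char",
    "I": "int", "J": "long", "F": "float", "D": "double",
}


def descriptor_to_java(descriptor: str) -> str:
    """Convert a Smali type descriptor such as '[[I' into a Java type name."""
    # Each leading '[' adds one array dimension.
    element = descriptor.lstrip("[")
    dimensions = len(descriptor) - len(element)

    if element in PRIMITIVE_TYPES:
        base = PRIMITIVE_TYPES[element]
    elif element.startswith("L") and element.endswith(";") and len(element) > 2:
        # Class type: drop the leading 'L' and trailing ';', use dots as separators.
        base = element[1:-1].replace("/", ".")
    else:
        raise ValueError(f"not a valid type descriptor: {descriptor!r}")

    if base == "void" and dimensions > 0:
        raise ValueError("arrays of void do not exist")
    return base + "[]" * dimensions


if __name__ == "__main__":
    for example in ("V", "Ljava/lang/Thread;", "[[I", "[Ljava/lang/String;"):
        print(example, "->", descriptor_to_java(example))
```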
4. Operational Semantics for a Single-Threaded Application
Throughout the paper, we use the following notations:
to designate a stack, where A is the top-most value of the stack, B is the underlying element and C is the remaining portion of the stack. An empty stack is presented by .
⊥ to denote any undefined value.
is the domain of a function f. The notation expresses the domain where the value of a function f is updated to x.
expresses the function f where value x maps to y so .
Table 2 provides the basic syntactic categories as well as the syntax of the selected instructions.
A package in the disassembled DEX bytecode format is specified by a name and a sequence of classes. In our formal model, we consider that a package consists only of classes that correspond to .smali files (the AndroidManifest.xml file and the other XML files are not considered in our formalization).
A class definition includes its access flags, which are keywords defining the class visibility, and a fully qualified class name that gives the class package path followed by the class name c (we assume an unlimited supply of distinct names). It also includes the fully qualified name of its direct super-class (single inheritance), where ⊤ is used for classes without super-classes such as the Object class and the Thread class, and finally a set of implemented interfaces, fields and methods.
An interface is specified by its fully qualified name, access flags, a set of super-interfaces, its abstract methods (given by their method signatures) and constant fields. A field definition comprises its name f, its access flags and a type (which could be a primitive for static fields or a class type for instance fields). A method definition includes a set of access flags that determines its scope, the method signature, the number of local registers it operates on and a sequence of labeled instructions that represents the method body. A method signature consists of the method name m, the argument type(s) and a return type, which might be void, a primitive or a class type.
In Smali+, we consider a subset of Dalvik instructions selected based on the results of a study of 1700 Android applications, carried out to determine which instructions and language features are most often used in typical applications [16]. In fact, Dalvik bytecode comprises 218 instructions [39]. We make some modifications to the selected instructions that do not affect the expressive power of the Dalvik language but simplify the presentation of our semantics. For example, Dalvik has 13 variants of the move instruction that are semantically similar; we model this group of instructions by only one move instruction.
In our formal model, we consider instructions expressing unconditional and conditional jumps with the goto and if instructions, respectively, and a move instruction that moves values from a source to a destination. A destination may be a register name v, an instance field or a static field, whereas a source may be any of these elements as well as a constant. We also consider instructions expressing the creation of a new object of a class and the return from a void or non-void method, with the new-instance, return-void and return instructions, respectively. A method invocation refers to the method name, argument types and number, return type and registers. For methods that are dynamically dispatched, it additionally includes a register holding the recipient object reference.
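The grouping of the move variants into a single instruction can be pictured as a normalization step over opcode names. The following Python sketch lists the 13 move opcodes of the Dalvik instruction set and maps each of them to one generic move. It is only an illustration: the function name is an assumption, and how the formal language treats move-result and move-exception in particular is not detailed in this section.

```python
# The 13 Dalvik opcodes whose mnemonic starts with "move".
MOVE_VARIANTS = frozenset({
    "move", "move/from16", "move/16",
    "move-wide", "move-wide/from16", "move-wide/16",
    "move-object", "move-object/from16", "move-object/16",
    "move-result", "move-result-wide", "move-result-object",
    "move-exception",
})


def generalize(opcode: str) -> str:
    """Collapse every move variant into the single generic 'move' instruction.

    Opcodes outside the move family are returned unchanged in this sketch.
    """
    return "move" if opcode in MOVE_VARIANTS else opcode


if __name__ == "__main__":
    for opcode in ("move-wide/16", "move-object", "goto/16"):
        print(opcode, "->", generalize(opcode))
```

The variants differ in operand width, register index size and the origin of the value, none of which changes the effect of copying a value into a destination register.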
Table 3 defines the domains used by our operational semantics. Each application has at least one thread that defines the code path of execution, and all of the code is processed along the same code path if no other thread is created. Hereafter, we suppose a single-threaded execution, a simple programming model with a deterministic execution order, which means that an instruction has to wait for all preceding instructions to finish before being processed. We model such an execution with a local configuration, which models the full state of a single-threaded program. It includes a call stack, a heap H and a static heap S. The call stack keeps track of all information concerning the methods invoked in the program. It is initially empty and is represented as a sequence of method frames. A method frame is a triplet consisting of a method name m and a program counter i, which together determine the program point in the invoked method, and a register array R mapping register names (parameters, locals and return) to values. We adopt the same notations for registers as in Smali, as explained in the Registers subsection. Therefore, we have a set of registers for the method parameters and a set for the method local variables. The contents of local registers are initially undefined, denoted by ⊥. The top of the call stack represents the frame of the currently executing method. Values can be either primitives or heap locations. The heap H maps locations (we suppose an arbitrary number of unique locations) to objects. Objects record their class and a mapping from (class) fields to values, whereas arrays record the array type and its values. Finally, the static heap S is a mapping from static (class) field names to their values. Fields are annotated with their type, used for initialization, to determine the default value of each primitive type (see Table 4). This annotation is omitted when it is not needed.
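The local configuration described above can be pictured as a small data structure. The following Python sketch is one possible encoding, not the formal definition: the class and field names are illustrative assumptions, None stands for the undefined value ⊥, and the top of the call stack is the last element of a list.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

UNDEFINED = None  # stands for the undefined value ⊥


@dataclass
class Frame:
    """A method frame: method name, program counter and register array."""
    method: str
    pc: int
    registers: Dict[str, Any]


@dataclass
class HeapObject:
    """An object records its class and a mapping from fields to values."""
    class_name: str
    fields: Dict[str, Any] = field(default_factory=dict)


@dataclass
class LocalConfiguration:
    """Full state of a single-threaded program."""
    call_stack: List[Frame] = field(default_factory=list)      # initially empty
    heap: Dict[int, HeapObject] = field(default_factory=dict)  # H: location -> object
    static_heap: Dict[str, Any] = field(default_factory=dict)  # S: static field -> value


def new_frame(method: str, local_count: int, arguments: List[Any]) -> Frame:
    """Create a frame whose locals are undefined and whose parameters hold the arguments."""
    registers = {f"v{i}": UNDEFINED for i in range(local_count)}
    registers.update({f"p{i}": value for i, value in enumerate(arguments)})
    return Frame(method=method, pc=0, registers=registers)


if __name__ == "__main__":
    config = LocalConfiguration()
    config.heap[1] = HeapObject("Ljava/lang/Thread;")
    config.call_stack.append(new_frame("main", local_count=2, arguments=[1]))
    print(config.call_stack[-1])  # top-most frame: the currently executing method
```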
The relation models the evolution of a starting configuration into a new one as the result of a computation step. The program point corresponds to the instruction at position i in a specified method m, always for the top-most method frame of the call stack in the configuration.
To illustrate the semantics, we present in Table 5 the semantic rules for the instructions presented in Table 2. These rules are as follows. The rule for goto updates the program counter to the specified one unconditionally. The rules for a move instruction from a source to a destination use an evaluation function ⟦-⟧ that evaluates a destination or a source under the current configuration, except for registers. In this case, for the sake of simplicity, we directly use the content of v in the register array R of the top-most method frame of the call stack, to which ⟦v⟧ is equivalent. Constants are evaluated to themselves, whereas static and instance fields are evaluated based on the static heap S and the dynamic heap H, respectively, under the current configuration. The rule for a register destination evaluates the source sub-expression and then updates the content of the destination register in the register array. The rules for field destinations update an instance field and a static field, respectively, with the content of the source register. The rule for a constant source is quite straightforward: after evaluating the source to a constant, it updates the content of the destination register with the constant value.
The rule for new-instance creates a new object in the heap by reserving memory at a fresh location l, loading the class that is instantiated and initializing its static fields, each with its default value according to Table 4. Once the object is created, the rule returns it by storing its heap location in a destination register v.
The rules for binary and unary operations compute a binary or unary expression, respectively, and store the result in the destination register. The rules for the if instruction model a conditional jump: if the guard evaluates to true, execution branches to the targeted program counter; otherwise, the program counter is advanced to the next instruction. In the rules for static and dynamic method invocation, a lookup function is called to find the appropriate method. In the dynamic case, the class of the method is retrieved from the heap through the object location l, which is passed in the register holding the object reference. In both rules, a new method frame is pushed on top of the call stack. It includes the method name, a program counter set to 0 and a register array set as explained in the Method invocation subsection. Notice that we increment the program counter of the caller by one so that execution resumes from the correct instruction once the callee returns.
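The effect of these rules on the call stack and the program counter can be traced with a toy interpreter. The following Python sketch is a simplified illustration under stated assumptions, not the formal rules: instructions are tuples, register names are strings and constants are integers, the conditional jump is restricted to an equality test, the callee's local registers are not pre-initialized, and the register name "ret" for the return value is hypothetical.

```python
def step(program, stack):
    """Apply one computation step to the top-most frame of the call stack."""
    frame = stack[-1]
    instr = program[frame["method"]][frame["pc"]]
    op, regs = instr[0], frame["regs"]

    def value(src):
        # Register names are strings; anything else is a constant.
        return regs[src] if isinstance(src, str) else src

    if op == "goto":                      # ("goto", target)
        frame["pc"] = instr[1]
    elif op == "move":                    # ("move", dst, src)
        regs[instr[1]] = value(instr[2])
        frame["pc"] += 1
    elif op == "if-eq":                   # ("if-eq", a, b, target)
        taken = value(instr[1]) == value(instr[2])
        frame["pc"] = instr[3] if taken else frame["pc"] + 1
    elif op == "invoke-static":           # ("invoke-static", callee, [argument registers])
        arguments = [value(r) for r in instr[2]]
        frame["pc"] += 1                  # the caller resumes after the call
        stack.append({"method": instr[1], "pc": 0,
                      "regs": {f"p{i}": a for i, a in enumerate(arguments)}})
    elif op == "return":                  # ("return", src)
        result = value(instr[1])
        stack.pop()
        if stack:
            stack[-1]["regs"]["ret"] = result   # hypothetical return register
    elif op == "return-void":             # ("return-void",)
        stack.pop()
    else:
        raise ValueError(f"unsupported instruction: {op}")


def run(program, entry):
    """Run the entry method until the call stack is empty; return its registers."""
    main = {"method": entry, "pc": 0, "regs": {}}
    stack = [main]
    while stack:
        step(program, stack)
    return main["regs"]


PROGRAM = {
    "main": [
        ("move", "v0", 7),
        ("invoke-static", "identity", ["v0"]),
        ("move", "v1", "ret"),
        ("if-eq", "v0", "v1", 5),
        ("move", "v1", 0),                # skipped when the branch is taken
        ("return-void",),
    ],
    "identity": [("return", "p0")],
}

if __name__ == "__main__":
    print(run(PROGRAM, "main"))
```

In this run the call pushes a frame for identity with its program counter at 0, the return pops it and hands 7 back to the caller, and the conditional jump skips the instruction at position 4.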
A lookup function searches for a method matching the given method signature in the given class and upwards along its super-class chain. Once located, it returns the method signature together with the number of its local registers. We assume that the identified class and method exist in the package and in the class ancestry, respectively, with an array of local registers. Moreover, we assume that all verification checks are performed by the DVM; for instance, it is verified that the method can be legally accessed by the class. Thus, the invoke instructions are safe to execute.
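The lookup described above amounts to a walk along the super-class chain. The following Python sketch shows one way to write it. The class table, the class names under app/ and the signature strings are hypothetical, and unlike the formal model, which assumes the method exists, the sketch raises an error when the chain is exhausted.

```python
def lookup(classes, class_name, signature):
    """Search class_name and then its super-classes for a method with this signature.

    Returns the declaring class, the signature and the number of local registers.
    """
    current = class_name
    while current is not None:
        definition = classes[current]
        if signature in definition["methods"]:
            return current, signature, definition["methods"][signature]
        current = definition["super"]   # move one step up the super-class chain
    raise LookupError(f"{signature} not found in {class_name} or its super-classes")


# Hypothetical class table: each method signature maps to its number of local registers.
CLASSES = {
    "Ljava/lang/Object;": {"super": None,
                           "methods": {"toString()Ljava/lang/String;": 1}},
    "Lapp/Base;": {"super": "Ljava/lang/Object;", "methods": {"run()V": 2}},
    "Lapp/Child;": {"super": "Lapp/Base;", "methods": {"stop()V": 0}},
}

if __name__ == "__main__":
    # run()V is not declared in Child, so it is found in Base.
    print(lookup(CLASSES, "Lapp/Child;", "run()V"))
```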
The rules for return and return-void pop the top frame from the call stack and pass the return value from the callee back to the caller through its return register. Notice that, in the case of a void method, the return value must be moved to this register by the callee before the return-void instruction.
5. Operational Semantics for a Multi-Threaded Program
Results shown in [17] have highlighted multi-threading as a widely used feature in Android applications, with 90.18% of them including a reference to Java/lang/Thread and 88% using monitors. These high rates motivate us to take this feature into account in our formalization in order to develop complete semantics.
Here, we consider multi-threaded programs. Multi-threading semantics include the single-threaded semantics for each running thread separately. Threads in the same DVM interact and synchronize using shared objects and the monitors associated with these objects. In order to give a full account of Java concurrency, we consider the instructions related to this aspect. We define macro-instructions that cover methods of the Java Thread API [40], namely start for thread spawning and join for joining a referenced thread. We also define macro-instructions that cover several methods of the Java Object API [41] related to thread signaling, such as notify, and to synchronization, such as wait. We also give the semantics of the Dalvik instructions related to thread synchronization and monitors, monitor-enter and monitor-exit. The syntax of all these instructions is given in Table 6.
An overall configuration models the full state of an Android application in its low-level implementation. It represents a multi-threaded program configuration comprising, as its first attribute, the running thread's call stack, followed by a set of runnable threads, a heap H and a static heap S.
Each thread in the program has a call stack for the methods being invoked, their arguments and local variables, with the same syntax used in Table 3.
The runnable set is a set of pending threads. Each thread is represented by its call stack, which holds the information on invoked methods, plus a special register holding the thread reference. Threads in this set are in a “runnable” state (i.e., waiting to be selected by the scheduler).
H and S are the dynamic and static heaps, which are shared between all threads in the program and have the same semantic domains used for the single-threaded program in Table 3.
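As a rough illustration, the four components of a configuration can be grouped in a single record. The class and field names below are hypothetical; only the four-part structure (running call stack, runnable set, heap H, static heap S) comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Configuration:
    running: list                                     # call stack of the running thread
    runnable: list = field(default_factory=list)      # call stacks of pending threads
    heap: dict = field(default_factory=dict)          # H: location -> object
    static_heap: dict = field(default_factory=dict)   # S: static field -> value


# An initial configuration: one running thread, nothing pending, empty heaps.
initial = Configuration(running=[{"method": "main", "pc": 0, "regs": {}}])
```

The heaps are plain fields of the configuration rather than of a thread, which reflects that they are shared between all threads.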
A new semantic domain for multi-threaded programs is provided in Table 7. Some changes are applied to the object definition, which includes three new fields:
The first field indicates whether the object's monitor is acquired by a thread. If this is the case, it contains this thread's reference; otherwise, it contains the undefined value ⊥, since an object cannot be reserved by more than one thread at a given time.
The second field is a set of blocked threads waiting for the object's monitor to be released.
The third field is a set of threads pending notification (threads that executed the wait instruction).
The initial state of a new instance object in a multi-threaded context is set as in the single-threaded environment (with default values). The new attributes are initialized as follows:
The monitor owner is initialized to the undefined value, which means that initially the object is free and can be acquired by any thread.
The blocked set is initialized to an empty set, which means that initially no thread is waiting for the monitor to be released.
The waiting set is initialized to an empty set of threads waiting to be notified.
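A minimal sketch of this extended object definition and of its initial state follows. The field names `owner`, `blocked` and `waiting` are placeholders chosen for the example, and `None` plays the role of the undefined value ⊥.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HeapObject:
    class_name: str
    fields: dict = field(default_factory=dict)
    owner: Optional[str] = None                 # thread holding the monitor, None if free
    blocked: set = field(default_factory=set)   # threads waiting for the monitor
    waiting: set = field(default_factory=set)   # threads pending notification


def new_instance(heap, class_name, defaults):
    """Allocate an object at a fresh location and return that location."""
    loc = len(heap)  # fresh because this sketch never frees locations
    heap[loc] = HeapObject(class_name, dict(defaults))
    return loc
```

Every new object therefore starts free, with no blocked and no waiting threads.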
A class is a Thread class if and only if it is an instance of a Thread class (⊥=Thread), which means that its superclass is either the Thread top class path (LJava/lang/Thread) or another class that extends this class. Each thread object has a Boolean finished field indicating whether the thread has completed its execution, a mapping from a group of threads to a set of thread call stacks, which contains the threads waiting to join this thread, and an attribute indicating the current state of the thread. Each thread has a run method. Thread attributes are initialized as follows:
The join set is initialized to an empty set, which means that initially no thread is waiting to join the current thread.
The rules in this part give the semantics of spawning and scheduling threads. The start rule starts a new thread, whose reference is stored in a register. It internally calls the referenced thread's run method, which will be executed separately in this thread once it is selected. Therefore, a lookup procedure for its run method is performed and a separate call stack for the new thread is created with one frame comprising all the information about the thread's run method returned by the lookup function. This thread moves to a "runnable" state in the runnable set. When it gets a chance to execute, its target run() method will be executed. The actual execution of the launched thread is managed by the scheduling rule. Notice that, as expressed by the start rule, the reference of the launched thread is always stored in the same special register, and we assume that it remains there for all semantics rules and for all method frames in the thread's call stack.
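The spawning step can be sketched as follows. The dictionary layout, the `"this"` key for the special register and the fact that the spawning thread's program counter advances are assumptions made for the illustration.

```python
def start(running_stack, runnable, thread_ref, run_registers=1):
    """Spawn thread_ref: build a call stack holding a single run() frame."""
    new_stack = {
        "this": thread_ref,  # special register holding the thread reference
        "frames": [{"method": "run()", "pc": 0, "regs": [None] * run_registers}],
    }
    runnable.append(new_stack)              # the new thread is now runnable
    running_stack["frames"][-1]["pc"] += 1  # assumption: the spawner moves on
    return new_stack
```

The new thread does not execute anything at this point; it only becomes a candidate for the scheduler.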
Two rules manage thread scheduling. The first selects one thread from the runnable set to be executed for a time slice. The selected thread's state is updated to a “Running()” state. The thread's call stack is removed from the runnable set and placed at the first position of the configuration to start execution. The selection function is based on the CFS scheduling algorithm for the threads in the runnable set. It takes into account the threads' nice values and returns the selected thread's local state, represented by its current call stack, as well as the time slice allocated to it for execution.
The second rule stops, in a monitoring mode (i.e., a mode that monitors the execution time given to each thread), a thread whose allocated time slice has expired. We model the timing aspect in our formalism by a function that represents the scheduler timer controlling running threads.
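The two scheduling rules can be approximated by the sketch below. It is a deliberate simplification: the thread with the lowest nice value is selected, whereas the real CFS tracks virtual runtimes, and the time-slice formula is invented for the example. All names are hypothetical.

```python
def pick(runnable, base_slice=20):
    """Return (thread, time_slice) for the runnable thread with the lowest nice value."""
    thread = min(runnable, key=lambda t: t["nice"])
    # Hypothetical formula: a lower nice value yields a longer slice.
    return thread, max(1, base_slice - thread["nice"])


def schedule(config):
    """First rule: move the selected thread from the runnable set to the running slot."""
    thread, time_slice = pick(config["runnable"])
    config["runnable"].remove(thread)
    thread["state"] = ("Running", time_slice)
    config["running"] = thread


def tick(config):
    """Second rule: consume one time unit and preempt when the slice expires."""
    thread = config["running"]
    state, remaining = thread["state"]
    remaining -= 1
    if remaining > 0:
        thread["state"] = (state, remaining)
        return False
    thread["state"] = ("Runnable", None)
    config["runnable"].append(thread)
    config["running"] = None
    return True
```

After a preemption the running slot is empty, so `schedule` has to be applied again before any other rule can fire.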
Synchronization in Dalvik is modeled by the use of monitors with the instructions monitor-enter and monitor-exit. These instructions correspond to the synchronized keyword in Java. A monitor is attached to an object and can be acquired and released by threads.
The semantics of these two instructions must fulfill two conditions. The first is related to the mutually exclusive access to shared objects in the heap by different threads. The second relates to the cooperation between these threads. Cooperation is modeled by a set of threads waiting for notification when the object is released by another thread. The sole thread running and owning the monitor is in a critical section. Table 9 presents the rules related to synchronization. Monitor-enter semantics represent a thread trying to access the critical section by acquiring the monitor of the object whose reference is stored in a register. It first checks whether the object is acquired by any other thread. If this is the case, the current thread is blocked (mutual exclusion condition) and added to the object's blocked set, joining the other threads (if any) in the same situation (cooperation condition). This case is modeled by one rule. Otherwise, the current thread can take ownership of the monitor. The owner attribute is then updated with this thread's reference, and the thread can resume its execution in the critical section. This case is modeled by a second rule.
Monitor-exit semantics represent a thread that reaches the end of the critical section and releases the monitor it owns so that another thread can take ownership, which fulfills the cooperation condition. In the corresponding rule, the current thread must first own the object's monitor. Once this condition is satisfied, the owner attribute is updated to the undefined value (the object is free). Then, all threads waiting in the object's blocked set are moved to the runnable set. It is up to the scheduler to select which thread to execute (there is no ordering among the blocked threads).
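The behavior of the two monitor instructions can be sketched as follows. Objects are plain dictionaries with hypothetical `owner` and `blocked` keys, `None` stands for the undefined value, and re-entrant acquisition counts are not modeled. Raising `RuntimeError` for a non-owner is a choice of this sketch.

```python
def monitor_enter(obj, thread):
    """Return True if thread now owns the monitor, False if it was blocked."""
    if obj["owner"] in (None, thread):
        obj["owner"] = thread      # the monitor is free: take ownership
        return True
    obj["blocked"].add(thread)     # held by another thread: block
    return False


def monitor_exit(obj, thread, runnable):
    """Release the monitor and make every blocked thread runnable again."""
    if obj["owner"] != thread:
        raise RuntimeError("monitor-exit by a thread that does not own the monitor")
    obj["owner"] = None
    runnable.update(obj["blocked"])  # unordered: the scheduler picks the next thread
    obj["blocked"].clear()
```

Released threads are not handed the monitor directly; each of them has to retry the acquisition once it is scheduled.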
A thread can voluntarily give up ownership of the monitor before reaching the end of the critical section by calling the wait method or by executing the wait instruction. This thread releases ownership of the monitor and remains in a waiting state (i.e., suspended or inactive until it is notified by another thread). The corresponding rule provides the semantics of the wait instruction. The calling thread must own the object's monitor (i.e., it must be executing inside a synchronized block) and then relinquishes it. Once the monitor associated with the object is released, the current thread is placed in the wait set of this object.
Two rules express the signaling mechanism. The first represents the semantics of waking up a single thread that is waiting for the object's monitor in the waiting set. One thread in the set is chosen randomly by a selection function. This thread is moved from the waiting set to the runnable set, to be selected later by the scheduler and then processed. The second rule is similar to the first, except that it wakes all the threads in the waiting set, which are all moved to the runnable set. Notice that, in addition to the waiting thread(s), both rules release all blocked threads in the blocked set. The two sets have the same privileges with regard to acquiring the monitor. In other words, waiting threads have no precedence over potentially blocked threads that also want to synchronize on this object.
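The waiting and signaling rules can be sketched together. The dictionary keys, the function names and the injectable `choose` parameter (standing in for the random selection function) are assumptions of this example; the ownership check on notification is omitted because the text does not state it.

```python
import random


def wait(obj, thread):
    """The owner relinquishes the monitor and joins the waiting set."""
    if obj["owner"] != thread:
        raise RuntimeError("wait by a thread that does not own the monitor")
    obj["owner"] = None
    obj["waiting"].add(thread)


def notify(obj, runnable, choose=random.choice):
    """Wake one waiting thread and release all blocked threads."""
    if obj["waiting"]:
        woken = choose(sorted(obj["waiting"]))  # sorted: random.choice needs a sequence
        obj["waiting"].remove(woken)
        runnable.add(woken)
    runnable.update(obj["blocked"])
    obj["blocked"].clear()


def notify_all(obj, runnable):
    """Wake every waiting thread and release all blocked threads."""
    runnable.update(obj["waiting"], obj["blocked"])
    obj["waiting"].clear()
    obj["blocked"].clear()
```

Since woken and blocked threads land in the same runnable set, neither group has precedence when the monitor is next acquired.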
The remaining rules give the semantics of thread termination and of the join instruction. The join rules check whether the joined thread has finished its execution. If so, the current thread resumes execution. Otherwise, the blocking rule is applied: the current running thread is moved into the set of threads waiting for the same thread to complete its execution (monitors acquired by the running thread are not released here). The termination rule ensures that when a thread completes its execution (i.e., its run() method returns), it releases all the threads waiting in its join set by moving them to the runnable set.
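A compact sketch of joining and termination follows; the `finished` and `joiners` keys are hypothetical names for the Boolean field and the join set described earlier.

```python
def join(target, current):
    """Return True if current may continue, False if it is parked in the join set."""
    if target["finished"]:
        return True
    target["joiners"].add(current)  # monitors held by current are kept
    return False


def finish(target, runnable):
    """Called when run() returns: mark the thread finished and release its joiners."""
    target["finished"] = True
    runnable.update(target["joiners"])
    target["joiners"].clear()
```

A join issued after termination returns immediately, which matches the first case of the rules.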
6. Practical Aspects
We illustrate hereafter some practical aspects of Smali+ through an example. For the sake of simplicity and due to space limitations, we only present a single-threaded program in Smali+ that includes various important instructions such as method call, return, and static and instance field updates. As shown in Table 12, the program is sequential and consists of two classes belonging to the same package, called p. Figure 3 shows the initial configuration. Through this example, we show in detail how the rules are applied and how the configuration evolves at every step. Each rule is followed by the resulting configuration.
The first table corresponds to the call stack, which contains the current method frame. The second table corresponds to an empty heap H, and the last two tables correspond to the register arrays of the two methods, respectively. The first Smali instruction to execute is the move instruction labeled 5. It moves a constant, so the corresponding move rule applies. Since constants evaluate to themselves, the local register is updated with the constant value and the program counter is incremented.
The next instruction is an unconditional jump, so the jump rule applies and sets the program counter to the instruction labeled 10.
The next instruction is an invocation of a static method, so the static invocation rule applies. A new frame for the called method is pushed on top of the call stack and the program counter in the caller's method frame is incremented.
After some execution steps, we suppose that a register of the called method has been updated with the new value "CA" and that the current instruction to execute is the one labeled 18 in the called method.
This instruction is a return from a non-void method, so the corresponding return rule applies. The top frame of the call stack is popped and the return value is passed from the callee back to the caller through its return register.
The next instruction is a static field update, so the corresponding rule applies and updates the indicated field in the static heap S with the register content.
The next instruction is an object creation. The corresponding rule applies to create a new instance of the class in the heap H, and all fields are initialized according to their types.
The last instruction is an instance field update, so the corresponding rule applies. The register holds the instance location in H, and the instance field at this location is updated with the content of the source register.
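The step-by-step evolution of a configuration can be reproduced with a tiny interpreter. The sketch below is not the program of Table 12: the instruction encoding, the program `PROGRAM` and the field names are invented for illustration, and only a handful of single-threaded rules are covered.

```python
def step(state, program):
    """Execute one instruction of the top frame; return the opcode that fired."""
    frame = state["stack"][-1]
    regs = frame["regs"]
    op, *args = program[frame["pc"]]
    if op == "const":              # constant move
        dst, value = args
        regs[dst] = value
        frame["pc"] += 1
    elif op == "goto":             # unconditional jump
        (target,) = args
        frame["pc"] = target
    elif op == "sput":             # static field update in S
        src, name = args
        state["S"][name] = regs[src]
        frame["pc"] += 1
    elif op == "new-instance":     # object creation in H
        dst, cls, defaults = args
        loc = len(state["H"])
        state["H"][loc] = {"class": cls, "fields": dict(defaults)}
        regs[dst] = loc
        frame["pc"] += 1
    elif op == "iput":             # instance field update
        src, obj, name = args
        state["H"][regs[obj]]["fields"][name] = regs[src]
        frame["pc"] += 1
    else:
        raise ValueError(f"unsupported instruction: {op}")
    return op


PROGRAM = [
    ("const", "v0", 3),
    ("goto", 3),
    ("const", "v0", 99),                         # skipped by the jump
    ("sput", "v0", "Lp/A;->count"),
    ("new-instance", "v1", "Lp/B;", {"x": 0}),
    ("iput", "v0", "v1", "x"),
]


def run(program):
    """Run until the program counter leaves the program; return state and trace."""
    state = {"stack": [{"method": "main", "pc": 0, "regs": {}}], "H": {}, "S": {}}
    trace = []
    while state["stack"][-1]["pc"] < len(program):
        trace.append(step(state, program))
    return state, trace
```

Calling `run(PROGRAM)` returns the final state together with the sequence of applied rules, which is the kind of trace the walkthrough above follows by hand.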
So far, we have proposed a formal language for Android programs called Smali+. Presented in BNF notation, Smali+ is a simple language that remains faithful to the original Smali notation and to the .Smali file structure. It contains 12 generalized instructions out of the 218 Dalvik instructions [39] and some macro-instructions modeling the concurrency aspect. These 12 instructions were selected carefully to highlight Dalvik's characteristics, such as its register-based architecture, the assembly-like code of Smali, method invocations and monitors. Macro-instructions were used for the sake of simplification as well as to model multi-threading in Android. All the important API methods that affect a thread's life cycle were considered in the Smali+ semantics. Another important feature that has so far been lacking in Android application semantics is thread scheduling. In general, this aspect consists in picking a thread for execution and allocating an execution time to it, depending on its priority, before selecting a new thread to execute and switching the context. Android applications, including their threads, adhere to the Linux execution environment. Threads are therefore scheduled using the standard scheduler of the Linux kernel, known as the completely fair scheduler
(CFS). On Linux, the thread priority is called a “nice value”. A low nice value corresponds to a high priority and vice versa. In Android, a Linux thread has nice values in the range of −20 (highest priority) to 19 (lowest priority), with a default value of 0 [42]. In this work, we presented two rules related to the scheduling feature in Android. In the first rule, we introduced a function that plays the same role as the CFS: it selects the highest-priority thread among the runnable threads by comparing nice values and allocates an amount of execution time to it. The second rule stops a thread when the allocated time expires, before a new one is picked through the first rule. By the “monitoring mode” mentioned in thread scheduling, we mean a monitor based on the CFS algorithm that monitors each thread for each task executed, and we suppose that each rule in the concurrent context executes under this monitoring mode. This mode was shown only for the second scheduling rule and omitted in the other rules for the sake of simplicity. The operational semantics are mainly intended to secure Android applications. In fact, we intend to use these semantics in upcoming work to check a number of security properties in order to protect users from rogue applications. Our ultimate goal is to formally enforce security policies on Android applications. That is to say, starting from a Smali+
program and a formal specification of a security policy, we automatically generate a new, equivalent and secure version of the original program that respects the security policy. Formally, the approach takes as input a Smali+ program P and a formal specification of a security policy, and generates as output a new version of the program. The new version preserves all the behavior of the original version, except in cases where the security policy is on the verge of being violated. This is equivalent to saying that the traces of the new version are the intersection of the traces accepted by the security policy and the traces of P. It is formally modeled by (1).
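The property stated in words above can be written compactly as follows. The notation is an assumption made here for readability: P is the original program, P' the rewritten version, Φ the security policy, and 𝒯(·) the set of traces of a program or the set of traces accepted by a policy.

```latex
\mathcal{T}(P') \;=\; \mathcal{T}(P) \,\cap\, \mathcal{T}(\Phi)
```

Under this reading, every trace of P' is both a behavior of P and accepted by Φ, and no behavior of P that Φ accepts is lost.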
Security policies will be enforced through a program-rewriting approach that combines static and dynamic techniques. It rewrites the program statically, according to a given security property, and then generates a new executable version that satisfies this property. Security modifications or tests are added at well-calculated points in the program to force it to conform to the security property during execution. In other words, the untrusted code is transformed into self-monitoring code that is instrumented at specific points in the program. The rewritten version should be equivalent to, but more restrictive than, the original, so that it can prevent potentially dangerous operations before they occur.
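To make the idea of inserting tests at chosen program points more concrete, here is a minimal sketch of such a rewriting pass over a list of instructions. The tuple encoding, the `SENSITIVE` set and the guard method `Lpolicy/Monitor;->check` are hypothetical; a real pass would work on Smali+ programs and derive the insertion points from the policy.

```python
# Hypothetical policy: invocations that must be preceded by a runtime check.
SENSITIVE = {"Landroid/telephony/SmsManager;->sendTextMessage"}


def rewrite(instructions, sensitive=SENSITIVE, guard="Lpolicy/Monitor;->check"):
    """Return a copy of the program with a guard call before each sensitive invoke."""
    rewritten = []
    for instruction in instructions:
        opcode, target = instruction[0], instruction[1]
        if opcode.startswith("invoke") and target in sensitive:
            # The guard receives the name of the call it protects.
            rewritten.append(("invoke-static", guard, target))
        rewritten.append(instruction)
    return rewritten


original = [
    ("const", "v0"),
    ("invoke-virtual", "Landroid/telephony/SmsManager;->sendTextMessage"),
    ("invoke-static", "Lp/A;->f"),
]
secured = rewrite(original)
```

Instructions that the policy does not mention are copied unchanged, so the rewritten program differs from the original only at the guarded points.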
The enforced security properties will be specific to the malware and attacks threatening Android applications. These include sensitive information leakage (SMS contents, call logs, contact information or geographical location), Android financial malware, which exploits premium services to cause financial loss to the user for the benefit of the attacker, for example by calling or texting premium-rate numbers without the user's consent, and privilege escalation attacks [43]. Therefore, all the channels that could be exploited by this kind of malware, such as Internet access, access to system services including SMS, contacts, telephony, Bluetooth and the Global Positioning System (GPS), as well as APIs resulting from inter-application communication, will be checked through security policies. Such APIs are easily located in Smali+, since it provides the fully qualified class name for each invocation.