A Context-Aware Android Malware Detection Approach Using Machine Learning

The Android platform has become the most popular smartphone operating system, which makes it a target for malicious mobile apps. This paper proposes a machine learning-based approach for Android malware detection based on application features. Unlike much prior research, which focused exclusively on API Calls and permissions features to improve detection efficiency and accuracy, this paper incorporates applications’ contextual features with API Calls and permissions features. Moreover, the proposed approach extracted a new dataset of static API Calls and permissions features from a large dataset of malicious and benign Android APK samples. Furthermore, the proposed approach used the Information Gain algorithm to reduce the API Calls and permissions feature space from 527 features to the 50 most relevant ones. Several combinations of API Calls, permissions, and contextual features were fed into different machine learning algorithms to show the significance of the selected contextual features in detecting Android malware. The experiments show that the proposed model achieved an accuracy of about 99.4% when using contextual features, compared to 97.2% without them. Moreover, the paper shows that the proposed approach outperformed the state-of-the-art models considered in this work.


Introduction
The Internet's expansion and the technical revolution in smartphones have led to a tremendous increase in the number of smartphone users. This encouraged competition among software companies to serve customers by releasing powerful platforms for their smartphones. Android is the most popular mobile operating system, with millions of users around the world taking advantage of its services. Google began developing the Android operating system in 2005, and the first Android smartphone was introduced in 2008 [1]. By 2021, there were approximately 2 billion Android-based devices (smartphones and tablets), indicating that Android is growing rapidly and outpacing other mobile operating systems such as iOS and Windows [2]. As the most popular and powerful platform, Android offers a vast number of mobile applications in various categories to Android users of all ages worldwide.
Android malware is growing immensely due to the vast growth in the number of Android users, which poses a threat to the security and privacy of those users. Android malware is known for sending fraudulent SMS messages, misusing users' private information, consuming network traffic, downloading malicious applications, enabling remote control, exploiting data, and other dangerous behaviors [3]. According to some statistics [4], the number of Android-based malware cases rises every year. About 3.5 million Android malware samples were observed in the first quarter of 2021, up from 1 million in 2019 and 2020.
Android malware varies widely: different families behave differently, attack differently, and cause different damage. Some Android malware infiltrates the device by exploiting the user and then launches an attack via malicious applications, while other malware replicates itself in various locations before installing malicious applications there to spread damage as broadly as possible. Table 1 lists mobile malware types and their behaviors, with an example of each.

Table 1. Mobile malware types, their behaviors, and examples.

| Mobile Malware | Behavior | Example |
| --- | --- | --- |
| Trojan [5] | Appears to be a harmless application that convinces users to download it and then installs malware on their mobile devices. | Android.Counterlank |
| Worms [6] | Can infect additional devices while operating on infected systems, and can carry a payload that degrades mobile network capacity. | Ikee.B |
| Adware [7] | Deceives the user through malicious advertising. | UAPush |
| Spyware [8] | Collects the user's data and behavior, such as emails and passwords, and sends it to another location across the network. | Zitmo |
| Botnet [9] | Comprises many internet-connected cellphones controlled by a malicious user; it gains full access to the device and its contents and sends data to the malicious controller. | NotCompatible |

The diversity and complexity of Android malware, as well as the use of various strategies to elude detection, make traditional malware detection techniques ineffective, necessitating the development of more efficient and powerful ways to overcome this constraint [10]. Existing malware detection methods are limited and reveal malware only after a device has been infected. Automated detection techniques, such as supervised and unsupervised machine learning algorithms, operate effectively by extracting app features using both static and dynamic analysis to classify Android apps into two groups: malware or benign [11]. Current detection software is unable to detect zero-day attacks. As a result, most researchers use various machine learning classifiers to detect Android-based malware, such as Support Vector Machines (SVM), Naïve Bayes (NB), Random Forest (RF), and Decision Trees (DT) [12]. Machine learning algorithms take advantage of features and characteristics learned from malware and benign samples during training.
Machine learning detection techniques rely on static, dynamic, and hybrid analysis to extract and gather application features that are used to classify and detect malicious behaviors. System API Calls, permissions, privileges used, and contextual information are some of the extracted features that machine learning-based algorithms use [13]. The main machine learning approaches used to detect Android malware are static and dynamic approaches.
Static analysis involves extracting features from Java bytecode or the AndroidManifest.xml file, which contains contextual information and a collection of features, such as the permissions required by the app [14]. The static analysis of Android applications to extract static features is shown in Figure 1. Dynamic analysis is used to discover harmful behavior in applications while they are running. It collects the system calls that the application makes while it is executing. Furthermore, dynamic analysis also works well with unidentified application signatures [15]. Figure 2 illustrates the dynamic analysis of Android applications. The hybrid technique combines static and dynamic features to improve malware detection results and avoid the flaws that can occur when using either a static or a dynamic approach alone. The hybrid analysis of Android applications is illustrated in Figure 3. The benefits of static analysis include the ability to detect malware before it executes, as well as the ability to detect unknown malware and code vulnerabilities. Dynamic analysis has the advantage of being able to detect undiscovered malware as well as zero-day attacks [16]. However, dynamic analysis can be time-consuming and resource-intensive.
The contributions of the paper are summarized as follows:
1. The paper created a new dataset of static API Calls and permissions features from a large number of Android APKs.
2. The paper selected and used the most relevant contextual features along with the API Calls and permissions to test the efficacy of using contextual information in detecting Android malware.
3. The proposed model used the Information Gain algorithm [17] to reduce the feature space from 527 API Calls and permissions to only 50 features while achieving an accuracy very close to that achieved using all 527 features.
4. The paper tested several machine learning algorithms, namely Random Forest, Logistic Regression, SVM, K-NN, and Decision Trees, using different combinations of API Calls, permissions, and contextual features to evaluate their accuracy in detecting Android malware.
5. The experiments show that, using the selected contextual features, the proposed model achieved a high accuracy of about 99.4% in detecting Android malware.
6. The paper considers different state-of-the-art models that used contextual features or the same dataset used in this work, and it shows that the proposed models outperformed the state-of-the-art models.
The rest of the paper is organized as follows. Section 2 discusses some related work. Section 3 describes the methodology of the proposed approach and discusses the dataset, data pre-processing, feature extraction, feature selection, and machine learning algorithms. Section 4 shows and discusses the experiments and results. Finally, Section 5 concludes the work.

Related Work
Much research has been conducted on detecting Android malware using machine learning. This section discusses some related work in this direction.
Le et al. [18] proposed an approach for Android malware detection that employs different machine learning methods as detectors to identify malicious Android applications. They extracted the features using static analysis by decompiling source files into smali code and using some C++ libraries to read the information from AndroidManifest.xml. In their work, Decision Tree, Naïve Bayes, and an ensemble of Random Forest, Stochastic Gradient Boosting, and AdaBoost were trained on application features such as behavior, permissions, the size of the application, the number of classes in the application, and the number of user interfaces in the application. The dataset contained about 16,589 malicious Android applications collected from different sources, such as Virusshare [19] and Koodous [20], in addition to 12,290 benign Android applications downloaded from Google Play. The results of their approach showed that the Random Forest classifier achieved the best accuracy at about 98.66%.
Kaushal et al. [21] used permissions from AndroidManifest.xml files as features of Android applications to build an automated Android malware detection system. They trained two machine learning algorithms (Support Vector Machine (SVM) and Naïve Bayes) alongside a deep learning algorithm (Recurrent Neural Network with LSTM architecture). Then, they used the extracted permissions to classify applications as malicious or benign. Their results showed that the Recurrent Neural Network achieved the best results with an accuracy of 95%, outperforming the other machine learning classifiers.
Hyoil et al. [22] used Support Vector Machines (SVM) for Android malware detection. They used a dataset of malicious and benign Android applications: 30,113 malicious samples from the AMD [23] and Drebin [24] datasets, and 28,489 benign samples downloaded from Google Play, the Amazon AppStore, and APKPure [25]. Then, they employed a static analysis technique to extract 133,227 API Calls to be used as features for classification. They claimed that the experiments showed that their approach outperformed existing approaches by obtaining an accuracy of 99.97% in detecting malicious Android applications.
Bilal et al. [26] used static and dynamic analysis techniques to propose an approach for Android malware detection using machine learning algorithms based on a hybrid of static, dynamic, and some intrinsic features. They extracted 20 different features from a sample of about 600 Android applications (malicious and benign) that were collected from the Androtracker project [27] and the Google Play store [28]. After extracting these features, two machine learning classifiers, K-Nearest Neighbor and Logistic Regression, were built as detection models to perform malware classification. The experiments showed that the proposed approach performs well: both classifiers achieved the same accuracy of 97.5% on the training dataset, while the Logistic Regression classifier outperformed K-Nearest Neighbor on the testing dataset.
Fang et al. [29] proposed a method based on the Dalvik Executable file (DEX file) for Android malware family classification. The method extracted the DEX files of 24,553 malicious Android samples from the Android Malware Dataset (AMD) [23] and obtained RGB images and plaintext from each DEX file. Next, it extracted the text features of the plaintext, as well as the color and texture features of the images. To perform the classification, the study used a feature fusion algorithm based on multi-kernel machine learning. The experiments showed that the proposed method achieved an accuracy of about 96%.
Danish et al. [30] proposed an image-based multiclass malware family detection method using a fine-tuned Convolutional Neural Network (CNN). The method transformed the raw malware binary files into color images to be used as inputs to the CNN for classification. The experiments were performed on two different datasets. The first is the Malimg malware dataset [31], which consists of 9435 malicious samples, while the other is the IoT-Android mobile dataset, which includes 14,733 malicious samples and 2486 benign samples. Data augmentation was applied during the fine-tuning process. The experiments showed that the proposed method achieved an accuracy of about 98.82% on the Malimg malware dataset and about 97.35% on the IoT-Android mobile dataset.
Halil et al. [32] proposed an approach for Android malware classification and detection based on a visualization technique and various machine learning algorithms. They converted some Android application files (Manifest.xml, DEX code, and Resources (ARSC)) into grayscale images to extract different types of global and local image-based features to be used for training. Before training the algorithms, they normalized the extracted global features into one feature vector and applied the Bag of Visual Words (BOVW) algorithm to build one feature vector from the extracted local feature descriptors. To test the model, they performed the experiments on three grayscale image datasets, each consisting of 4850 benign samples and 4580 malicious samples. Six machine learning algorithms, Random Forest, K-Nearest Neighbor, Decision Tree, Bagging, AdaBoost, and Gradient Boost, were tested. The experiments showed that the model achieved an accuracy of about 98.75%.
Nuren et al. [33] proposed a machine learning-based approach for Android malware detection. They extracted the application permissions from the AndroidManifest.xml file using static analysis and loaded them into WEKA, where the top 15 permissions were used as malware features. Five machine learning classifiers, Random Forest, J48, Multi-Layer Perceptron, Decision Table, and Naïve Bayes, were trained for classification. Using a dataset of 10,000 malicious Android applications and 10,000 benign applications, the Random Forest classifier achieved the best accuracy of about 89.36%.
Talal et al. [34] conducted an empirical study of Android malware detection using various supervised machine learning algorithms. They employed static analysis to extract features from the AndroidManifest.xml and DEX files. The extracted features were permissions, intents, and API Calls. Then, they evaluated and compared the performance of six different supervised machine learning classifiers, K-Nearest Neighbour, Support Vector Machine, Decision Tree, Naïve Bayes, Random Forest, and Logistic Regression, on a dataset of 1260 malicious Android applications and 2539 benign Android applications. The results showed that the Random Forest classifier achieved the best accuracy of about 99.21%, while the Naïve Bayes classifier achieved the lowest detection accuracy of about 95.45%.
Other methods in this field were proposed by Du et al. [35], Narayanan et al. [36], Mahdavifar et al. [37], and Hadiprakoso et al. [38]. Du et al. [35] proposed a context-based approach that used the semantics and contextual information of the network flow of Android applications. Similarly, Narayanan et al. [36] proposed a context-based approach that used a multiple kernel learning method to detect malicious code patterns. Mahdavifar et al. [37] and Hadiprakoso et al. [38] used the same dataset that is used in this work, CICMalDroid2020 [39]. However, neither approach used contextual information. The approaches discussed in this paragraph are selected as the state-of-the-art models and are explained in detail in Section 5.

Methodology
This section introduces and discusses the methodology of the proposed approach. Figure 4 shows an illustration of the methodology, and each step is discussed in the following subsections.

Datasets
Unlike many studies conducted in this field, which used small datasets, this paper used a large dataset of malicious and benign Android application samples (APKs) called CICMalDroid2020 [39] and extracted from it a new dataset of API Calls and permissions features. The APKs were collected and published by Mahdavifar et al. [37] and provided by the Canadian Institute for Cybersecurity [40]. The dataset consists of about 16,900 Android samples in different categories; 12,800 samples are malware applications, while the remaining 4100 samples are benign applications. The dataset was collected from 2017 to 2018 from different sources such as MalDozer [41], AMD [23], the VirusTotal service [42], and the Contagio security blog [43]. The collected Android application samples include various application categories, such as advertising, social, and educational. The dataset categories and their sample counts are shown in Figure 5. Table 2 provides a brief explanation of each category of malware inspected in this paper.

Table 2. Android malware categories inspected in this paper.

| Android Malware Category | Concept |
| --- | --- |
| Adware | Malware that uses advertising to exploit the user. |
| Banking | Malware that exploits the banking accounts of the user. |
| SMS | Malware that exploits the user by sending a malicious SMS. |
| Riskware | A program that appears benign but is malware. |

Features Extraction
The samples of Android applications are in the form of an Android Application Package (APK). An APK is a file that holds all the files and components that are responsible for running the application. Figure 6 illustrates the structure of an Android application (APK) [44]. APKs need to be converted and analyzed to obtain the features that will be used for detection. Static analysis is a technique for extracting static application features from APK files without running the application. This paper adopts a static analysis approach to extract static features from the applications, such as API Calls, permissions, and some contextual information, using the Python programming language. The feature extraction process produced a total of 531 distinct features, as shown in Table 3. More information about the extracted features is provided next.

API Calls Features
The Application Programming Interface (API) is a set of rules that the application uses for communication. API Calls are considered a significant indicator for distinguishing between malware and benign applications and for revealing suspicious behavior [45]. Therefore, we extracted a set of API Calls from Android APK samples to be used for malware detection. Table 4 shows the extracted API Calls. The API Calls for each application were extracted by a Python script using "Androguard", a full-featured Python tool for manipulating and handling Android files [46]. It reverse-engineers each APK file by analyzing its DEX file [47]. Then, the API Calls were converted into binary values (0 or 1) that indicate the presence of each API Call in an APK. A total of 15 API Call features were extracted from 15,836 Android samples, comprising 11,800 malicious applications and 4036 benign applications. Table 5 displays the number of Android application samples for each category from which API Call features were extracted.

Permissions Features
Android application permissions grant apps access to the phone's hardware and data, as well as the ability to control the phone. Permissions can be used legitimately or maliciously. For example, when an application asks for permission to access sensitive data, such as the phone book or the camera, the application could be suspicious and potentially malicious. Therefore, permissions are powerful indicators for detecting malicious apps and separating them from benign ones. Figure 7 shows examples of permissions requested by an Android application. A large set of permissions was extracted from Android application samples for use in malware detection. Tables 6 and 7 provide a brief description of some normal and dangerous permissions extracted from various Android applications [44]. Each application's permissions were extracted by a Python script using "Androguard" and then converted to a binary representation (0 or 1). About 700 permissions features were extracted from 16,703 Android samples, including 12,692 malicious apps and 4011 benign apps. The number of Android application samples from which permissions features were extracted is shown in Table 8.
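As a rough illustration of how requested permissions can be read from a manifest, the sketch below parses `uses-permission` entries with Python's standard library. It is not the paper's script: the paper used Androguard on APK files, whereas this sketch assumes the manifest has already been decoded to plain-text XML, and the `extract_permissions` helper and the sample manifest are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace that qualifies the android:name attribute in a manifest.
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def extract_permissions(manifest_xml):
    """Return the set of permission names requested in a decoded manifest.

    Assumption: manifest_xml is plain-text XML. Manifests inside an APK are
    stored in a binary format and must be decoded first (the paper relied on
    Androguard for this step).
    """
    root = ET.fromstring(manifest_xml)
    name_attr = "{" + ANDROID_NS + "}name"
    return {
        elem.attrib[name_attr]
        for elem in root.iter("uses-permission")
        if name_attr in elem.attrib
    }


# Hypothetical manifest used only for illustration.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.app">
    <uses-permission android:name="android.permission.SEND_SMS"/>
    <uses-permission android:name="android.permission.INTERNET"/>
</manifest>
"""

permissions = extract_permissions(SAMPLE_MANIFEST)
print(sorted(permissions))
```

The resulting set of names can then be turned into 0/1 indicators against a fixed permission vocabulary.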

Contextual Features Extraction
Contextual information refers to information that characterizes the state of an Android application, such as the activities that the app launches, the system services that the app uses, and the resources that the app loads. Because many studies relied solely on API Calls and permissions to distinguish between malware and benign applications, this paper combines application contextual information with API Calls and permissions to enhance detection performance and detect malicious behavior in Android applications with high accuracy. The authors of the dataset used in this paper employed a dynamic analysis technique to run all Android application samples in a VMI-based dynamic analysis system using CopperDroid, and then recorded the results in JSON format [40]. In this paper, we parsed and analyzed the large volume of JSON data for each Android application (APK) using Python scripts to extract the contextual information that helps improve Android malware detection. As indicated in Table 9, four categories of application contextual information were selected from various Android application samples. Table 10 shows the number of Android application samples from which contextual information was extracted.

Features Processing
The term "features preprocessing" refers to preparing and transforming features so that they can be used to train machine learning algorithms. Feature transformation, or converting features from one format to another, is one of the feature preprocessing mechanisms used in this paper. All extracted API Calls and permissions were converted to a binary representation (0 or 1). In other words, if an application calls a specific API (for example, startService), the value will be 1; otherwise, it will be 0. Similarly, if an application uses a specific permission (for example, SEND_SMS), the value will be 1; otherwise, it will be 0. An example of the binary representation of the extracted API Calls and permissions features is shown in Tables 11 and 12. In this example, as shown in Table 11, the first Android application uses API Call 1 and API Call 2, the second Android application uses only API Call 3, and the third Android application uses all API Calls (1, 2, and 3). According to Table 12, the first Android application uses only permission 3, the second Android application uses all permissions (1, 2, and 3), and the third Android application uses no permissions.
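The binary representation described above can be sketched in a few lines of Python. The example reproduces the three-application case described for Tables 11 and 12; the function name `to_binary_matrix` and the feature labels are illustrative assumptions, not names from the paper's scripts.

```python
def to_binary_matrix(apps, vocabulary):
    """Encode each app as a 0/1 row: 1 if the feature is present, else 0."""
    return [[1 if feature in used else 0 for feature in vocabulary] for used in apps]


# API Calls used by the three example applications (as described for Table 11).
api_vocabulary = ["API Call 1", "API Call 2", "API Call 3"]
apps_api_calls = [
    {"API Call 1", "API Call 2"},                # first application
    {"API Call 3"},                              # second application
    {"API Call 1", "API Call 2", "API Call 3"},  # third application
]
api_matrix = to_binary_matrix(apps_api_calls, api_vocabulary)

# Permissions used by the same applications (as described for Table 12).
permission_vocabulary = ["Permission 1", "Permission 2", "Permission 3"]
apps_permissions = [
    {"Permission 3"},                                  # first application
    {"Permission 1", "Permission 2", "Permission 3"},  # second application
    set(),                                             # third application
]
permission_matrix = to_binary_matrix(apps_permissions, permission_vocabulary)

print(api_matrix)         # [[1, 1, 0], [0, 0, 1], [1, 1, 1]]
print(permission_matrix)  # [[0, 0, 1], [1, 1, 1], [0, 0, 0]]
```

Keeping the vocabulary as an ordered list fixes the column order, so every application is encoded against the same feature positions.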

Feature Extraction and Selection
Many application API Calls and permissions were extracted. Therefore, this paper applies feature selection to the combined API Calls and permissions features to eliminate duplicate and inconsistent features that reduce classification efficiency. To achieve this, Information Gain (IG) was used. IG is a feature evaluation method that measures how much information a feature provides about the class, that is, the expected reduction in entropy when the only information provided is the presence of the feature and the accompanying class distribution [48]. IG is based on entropy, which quantifies the uncertainty of the class distribution; the more a feature reduces this uncertainty, the more useful it is for classifying the data [49]. This paper adopts IG to select the most relevant API Calls and permissions features. The results of using IG for API Calls and permissions selection, where the top 50 ranked features were selected, are shown in Figure 8. Each API Call and permission feature has an IG value, with a higher value indicating a greater impact on classification.
The proposed method employed mutual information to measure the correlation between variables, where a higher value means higher dependency.
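To make the selection step concrete, the following is a minimal sketch of Information Gain ranking for binary features, written with the standard library only. It assumes the textbook definition of IG (class entropy minus the conditional entropy given the feature); the paper does not list its implementation, and the function names and toy data here are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_column, labels):
    """Entropy of the labels minus the entropy left after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_column):
        subset = [y for x, y in zip(feature_column, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


def top_k_features(matrix, labels, names, k):
    """Score every feature column by IG and return the k best names plus all scores."""
    scores = {
        name: information_gain([row[j] for row in matrix], labels)
        for j, name in enumerate(names)
    }
    return sorted(scores, key=scores.get, reverse=True)[:k], scores


# Toy data (hypothetical): feat_a separates the classes, the others do not.
names = ["feat_a", "feat_b", "feat_c"]
matrix = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
labels = ["malware", "malware", "benign", "benign"]

selected, scores = top_k_features(matrix, labels, names, k=1)
print(selected)
```

In the paper's setting, the same ranking would be applied to the 527 API Calls and permissions features with k = 50.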

Machine Learning Algorithms
Supervised and unsupervised learning are two types of machine learning. This paper relies entirely on the supervised technique, which predicts the class of a problem instance based on related input examples of similar objects. Machine learning classifiers use many features extracted from static analysis to accomplish training for malware classification. This section discusses the various machine learning algorithms used in this paper.

Random Forest RF
RF is one of the most powerful and versatile supervised machine learning algorithms for classification and regression. Random Forest fits a forest of numerous decision trees, in which increasing the number of trees increases the robustness of the prediction, resulting in improved accuracy and reduced overfitting [50]. According to the results of literature reviews on Android malware detection [51][52][53][54], the Random Forest classifier is the best machine learning discriminator between malware and benign applications. This paper sets the value of n_estimators to 100 after testing 10, 50, 100, and 200.

Support Vector Machines SVM
SVM is a supervised learning model used for classification that constructs a hyperplane to divide the data into classes [55]. SVMs are robust classifiers that give accurate results, but their computations are complex and they are slow on huge datasets [56]. This paper applied the linear support vector classification SVC, with kernel = "linear"; this has more flexibility in the choice of the loss functions and scales better to large numbers of samples.

Logistic Regression LR
LR is a statistical machine learning classification technique used for predicting binary dependent variables. This classifier excels at linear problems, delivering accurate results while consuming minimal computing resources [57].

Naïve Bayesian NB
NB is a probabilistic supervised machine learning algorithm based on Bayes' Theorem, which gives the conditional probability of an event A given an event B, and it is used for classification tasks [57]. The NB classifier is quick to compute and can deal with noisy data, but it performs poorly when the data includes many features [56].

K-Nearest Neighbor KNN
The K-Nearest Neighbors technique is a simple, easy-to-implement, and commonly used supervised machine learning algorithm that calculates the similarity between training and testing samples to handle classification and regression tasks [58]. This paper set the value of n_neighbors to 10.

Decision Trees DT
DT is a supervised machine learning algorithm and a type of tree-structured classifier, which is used to accomplish classification and regression tasks. DT splits the data into subsets and presents the results as a tree with two kinds of nodes: decision nodes for data splitting and leaf nodes for final decisions [59,60].
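The classifiers discussed in this section can be instantiated as in the sketch below, which assumes scikit-learn (the parameter names n_estimators, n_neighbors, and kernel quoted in the text match its API, but the paper does not name the library explicitly). Only the three stated hyperparameters come from the paper; the synthetic data, the train/test split, the random seeds, `max_iter`, and the choice of `BernoulliNB` as the Naïve Bayes variant are assumptions made for illustration.

```python
import random

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary feature vectors (NOT the paper's data): the first two
# features agree with the label 90% of the time, the other eight are noise.
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    label = rng.randint(0, 1)
    row = [label if rng.random() < 0.9 else 1 - label for _ in range(2)]
    row += [rng.randint(0, 1) for _ in range(8)]
    X.append(row)
    y.append(label)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

classifiers = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),  # value stated in the paper
    "SVM": SVC(kernel="linear"),                                     # kernel stated in the paper
    "LR": LogisticRegression(max_iter=1000),                         # max_iter is an assumption
    "NB": BernoulliNB(),                                             # NB variant is an assumption
    "KNN": KNeighborsClassifier(n_neighbors=10),                     # value stated in the paper
    "DT": DecisionTreeClassifier(random_state=0),
}

accuracies = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, clf.predict(X_test))

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
```

Holding the split fixed across classifiers keeps the comparison on equal footing, which mirrors how the paper compares algorithms on the same feature sets.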

Experiments and Results
This section discusses the experiments conducted to evaluate the model and analyzes the results. The first subsection goes over each experiment that was performed in order to detect malicious Android apps based on their features. The second subsection summarizes all the experiments and determines which one had the highest detection accuracy.

Results and Analysis
Different metrics were computed to measure and evaluate the performance and effectiveness of each machine learning classifier in order to select the best and most accurate one. The evaluation metrics are calculated from the following four counts:
• TP (True Positive) is the number of malware samples that are correctly labeled as malware,
• TN (True Negative) is the number of benign samples that are correctly identified as benign,
• FP (False Positive) is the number of benign samples that are mistakenly identified as malware, and
• FN (False Negative) is the number of malware samples that are incorrectly identified as benign.
The most intuitive evaluation metric is Accuracy, which is the ratio of correctly predicted samples to all samples. Accuracy is not always a reliable indicator, for example when the classes are imbalanced; in such cases, other metrics should be evaluated as well. Precision is the ratio of correctly predicted positive outcomes to all predicted positive outcomes. Recall is the ratio of correctly predicted positive outcomes to all actual positive samples. The F1-Score (F-Measure) is the harmonic mean of Precision and Recall [61].
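For reference, the standard definitions of these metrics in terms of TP, TN, FP, and FN are given below. They are the usual textbook forms and are assumed, rather than confirmed, to match the equations used in the paper.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
\text{F1-Score}  &= 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```

In the multi-class experiments, these quantities are computed per class by treating that class as positive and all other classes as negative.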

API Calls-Based Android Malware Detection
In this part, Android API Calls features were used to classify the applications. A dataset of 15 features from 15,836 Android application APKs (11,800 malware and 4036 benign) was used to train various machine learning classifiers. The outcomes of the classification using the Random Forest classifier, along with the confusion matrix, are shown in Figure 9. The classification report in Figure 9 shows the Random Forest classifier's detection performance for each class of Android applications. For example, on adware, the Random Forest obtained 80% Precision, which means that 80% of the applications it labeled as adware are actually adware. Similarly, it shows a 78% Recall, which means it correctly identified 78% of the actual adware samples. Random Forest on adware also earned a 79% F1-Score, the harmonic mean of its Precision and Recall.
The number of correct and incorrect predictions for each type of Android application is shown in the confusion matrix in Figure 9. For example, the number of correct adware predictions achieved by Random Forest is 477 out of 600, while the number of incorrect predictions is 123. Here are some examples of incorrect adware predictions: seven adware are falsely labeled as banking, fifty-six adware are labeled as benign, fifty-four adware are labeled as riskware, and four adware are labeled as SMS.
Table 13 and Figure 10 compare the performance of different machine learning classifiers for API Calls-based Android malware detection. They show that employing API Calls alone to detect malicious applications is insufficient to produce accurate detection results. The Random Forest classifier had the best Accuracy, Precision, Recall, and F1-Score, but overall, all machine learning algorithms performed poorly in this experiment. The results show that the algorithms' detection accuracy is between 75% and 87%, implying that up to about 25% of the predictions are incorrect, which is unsatisfactory. Furthermore, the algorithms attained a Precision of 75-87%, meaning that 75-87% of the applications assigned to a class actually belong to it. The algorithms achieved a Recall of 75-87%, meaning that they correctly identified 75-87% of the samples in a class.

Permissions-Based Android Malware Detection
This experiment extracts permissions from Android apps to train various machine learning algorithms to classify whether an app is malicious or benign. A dataset of 512 features from 16,703 Android application APKs (12,692 malicious and 4011 benign) was employed. Figure 11 shows the classification report and the confusion matrix for the Random Forest classifier, including its detection performance for each class of Android applications. For riskware, for example, Random Forest achieved 95% precision, meaning that 95% of the applications it labeled as riskware were actually riskware, and 98% Recall, meaning that it identified 98% of the riskware samples. The F1-Score for riskware, the harmonic mean of precision and recall, was 97%.
The confusion matrix in Figure 11 shows the number of correct and incorrect predictions for each type of Android application. For example, Random Forest correctly predicted 1522 of the 1558 riskware samples and misclassified 36. Of these 36, three were labeled as adware, eighteen as banking, twelve as benign, and three as SMS.
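The relationship between a confusion matrix and the per-class Precision, Recall, and F1-Score quoted above can be sketched as follows. This is a minimal illustration, not the code used in the paper: the function name `per_class_metrics` and the three-class toy matrix are made up for the example, and only the riskware row (3, 18, 12, 1522, 3) is taken from the Figure 11 numbers reported in the text.

```python
from typing import Dict, List


def per_class_metrics(labels: List[str],
                      matrix: List[List[int]]) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall, and F1 for every class.

    matrix[i][j] is the number of samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    metrics = {}
    for k, name in enumerate(labels):
        tp = matrix[k][k]
        actual = sum(matrix[k])                    # row sum: true members of class k
        predicted = sum(row[k] for row in matrix)  # column sum: predictions of class k
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[name] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Toy three-class matrix (hypothetical values, for illustration only).
toy_labels = ["adware", "benign", "riskware"]
toy_matrix = [
    [8, 1, 1],    # true adware
    [2, 15, 3],   # true benign
    [0, 1, 19],   # true riskware
]
toy_metrics = per_class_metrics(toy_labels, toy_matrix)

# Riskware row as reported in the text for Figure 11:
# predicted as adware, banking, benign, riskware, SMS.
riskware_row = [3, 18, 12, 1522, 3]
riskware_recall = riskware_row[3] / sum(riskware_row)  # 1522 / 1558
```

Recall needs only the row of the true class, which is why it can be recomputed from the riskware counts given in the text; precision additionally needs the riskware column of the full matrix, which the text does not list.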
Table 14 and Figure 12 show the performance of different machine learning classifiers in detecting Android malware applications. They show that using application permissions to discriminate between malicious and benign programs is effective and yields reliable detection results. The Random Forest classifier had the best Accuracy, Precision, Recall, and F1-Score. The detection accuracy of the algorithms ranged from 95% to 97%, meaning that only 3% to 5% of the predictions were incorrect. Precision also ranged from 95% to 97%, meaning that 95% to 97% of the applications assigned to a class actually belonged to that class. Recall ranged from 95% to 97%, meaning that 95% to 97% of the applications in a class were detected as such.

API Calls and Permissions-Based Android Malware Detection
In this experiment, we combined the API Calls and permissions found in the apps so that the machine learning algorithms could learn to detect harmful Android applications more effectively. Various machine learning algorithms were trained with 527 features from 11,781 malware and 4008 benign real-world Android applications. Figure 13 shows the classification report and the confusion matrix for the Random Forest classifier.
Figure 13 displays the Random Forest detection performance for each class of Android applications. For SMS, for example, Random Forest obtained 99% precision, meaning that 99% of the applications it labeled as SMS malware were actually SMS malware, and 99% Recall, meaning that it identified 99% of the SMS samples. The F1-Score for SMS was also 99%. The confusion matrix in Figure 13 shows the number of correct and incorrect predictions for each type of Android application. For example, Random Forest correctly predicted 1895 of the 1905 SMS samples, with only 10 incorrect predictions: four SMS samples were classified as riskware, one as benign, five as banking, and none as adware.
Table 15 and Figure 14 demonstrate that combining API Calls and permissions improves the detection of malicious Android apps. The Random Forest classifier had the best Accuracy, Precision, Recall, and F1-Score. Detection accuracy ranged from 96% to 98%, indicating a high percentage of correct predictions. Precision also ranged from 96% to 98%, meaning that 96% to 98% of the applications assigned to a class actually belonged to that class. Recall ranged from 96% to 98%, meaning that 96% to 98% of the applications in a class were detected as such.

API Calls and Permissions-Based Android Malware Detection with Feature Selection
To reduce the feature dimension and improve detection performance, this experiment applied a feature selection method (mutual information gain) to the combined API Calls and permissions features. The top-ranked 50 API Calls and permissions features were picked from 11,781 malware and 4008 benign real-world Android applications; refer to Figure 8 for more information. They were picked after experimenting with different feature dimensions, as shown in Figure 15. This experiment shows increased efficiency with no discernible effect on classification accuracy. Figure 16 shows the classification report and the confusion matrix for the Random Forest classifier.
Table 16 and Figure 17 show the results of various machine learning classifiers in detecting malicious Android applications. As they show, feature selection did not increase detection accuracy. The findings are nearly identical to those of the prior experiment (without feature selection), in which the Random Forest classifier had the best Accuracy, Precision, Recall, and F1-Score. The only advantage observed in this experiment is improved runtime performance, that is, a shorter prediction time.

API Calls and Permissions with Feature Selection and Contextual Information-Based Android Malware Detection
In this experiment, four contextual information features, num_services, num_receivers, num_activities, and num_providers, were selected and used along with the 50 API Calls and permissions features selected in the previous experiment to show the effect of contextual information on detection accuracy. These four features represent the number of times the application performs a certain activity, making them crucial and highly indicative features for detecting malicious behavior and distinguishing between malware and benign applications. This experiment used 842 benign samples and 5866 malware samples.
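One plausible way to obtain the four contextual features is to count the components declared in an app's manifest. The sketch below rests on that assumption: the paper does not describe its extraction code, and the mapping from `num_activities`, `num_services`, `num_receivers`, and `num_providers` to `<activity>`, `<service>`, `<receiver>`, and `<provider>` elements is an interpretation, not a confirmed detail. It also assumes the manifest has already been decoded to plain-text XML (manifests inside APKs are stored in a binary format), and the sample manifest and function name are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Assumed mapping from feature name to manifest element tag.
COMPONENT_TAGS = {
    "num_activities": "activity",
    "num_services": "service",
    "num_receivers": "receiver",
    "num_providers": "provider",
}


def contextual_features(manifest_xml: str) -> Dict[str, int]:
    """Count declared components in a decoded AndroidManifest.xml string."""
    root = ET.fromstring(manifest_xml)
    app = root.find("application")
    if app is None:
        # No <application> element: every count is zero.
        return {feature: 0 for feature in COMPONENT_TAGS}
    # Components are direct children of <application>.
    return {feature: len(app.findall(tag))
            for feature, tag in COMPONENT_TAGS.items()}


# Hypothetical manifest used only to exercise the function.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.demo">
  <application>
    <activity android:name=".MainActivity"/>
    <activity android:name=".SettingsActivity"/>
    <service android:name=".SyncService"/>
    <receiver android:name=".BootReceiver"/>
    <receiver android:name=".SmsReceiver"/>
  </application>
</manifest>
"""

features = contextual_features(SAMPLE_MANIFEST)
```

The resulting four integers can be appended to the 50 binary API Call and permission features to form one feature vector per app.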
Table 17 and Figure 18 show that using contextual features along with API Calls and permissions increases detection accuracy and delivers better results. The Random Forest classifier produced the best results, with an accuracy of 99.4%. Figure 19 compares the accuracy of all tested algorithms across three configurations: API and permissions features without feature selection (527 features), with feature selection (50 features), and with feature selection (50 features) plus contextual features. The figure shows that adding the contextual features to the 50 API and permissions features enhances the accuracy of all algorithms and outperforms the other two configurations. In addition, an important finding is the rise in the accuracy of Naïve Bayes (NB) when contextual information is used: its accuracy increased sharply from 82.6% to 92.5%, which is an outstanding enhancement.
Figure 20 shows the classification results and the confusion matrix for the Random Forest classifier. As shown in the figure, the detection results for each Android application category are outstanding, with a very small proportion of wrong predictions. Random Forest classified the applications with 99.4% accuracy. Moreover, the number of incorrect predictions for all Android malware categories is modest, as illustrated in the confusion matrix in Figure 20. For example, just 6 of the 1331 SMS predictions and only 8 of the 426 adware predictions are incorrect.

Results Summary
The experiments conducted in this paper show that using API Calls alone to identify suspicious applications is insufficient; the results were not accurate enough, and there were numerous incorrect predictions. Detection based on application permissions alone performed better than detection based on API Calls alone. Combining API Calls and permissions performed better still, raising accuracy by about 1% to around 98% with the Random Forest algorithm. Feature selection did not enhance the accuracy. However, it achieved a close accuracy of about 97.2% using only 50 features, compared with about 98% using 527 features.
The most interesting results were achieved when the contextual features were used along with the selected 50 API Calls and permissions. With this combination, the highest accuracy reached about 99.4% using the Random Forest algorithm. This indicates that the selected contextual features enhance classification and detect Android malware with very high accuracy when they are used with API Calls and permissions features. Moreover, using contextual features raised Naïve Bayes accuracy sharply from 82.6% to 92.5%.
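The Information Gain ranking behind the reduction from 527 to 50 features can be sketched in plain Python for binary features. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the three feature names, and the eight-sample toy dataset are hypothetical. The information gain of a feature X with respect to the label Y is computed as H(Y) - H(Y|X).

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a label sequence."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature: Sequence[int], labels: Sequence[str]) -> float:
    """H(Y) - H(Y|X) for one discrete feature column X and labels Y."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


def top_k_features(columns: Dict[str, Sequence[int]],
                   labels: Sequence[str], k: int) -> List[str]:
    """Return the names of the k features with the highest information gain."""
    scores = {name: information_gain(col, labels)
              for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (hypothetical): 1 means the app uses the permission or API call.
labels = ["malware"] * 4 + ["benign"] * 4
columns = {
    "INTERNET":    [1, 1, 1, 1, 1, 1, 1, 1],  # present everywhere: no signal
    "SEND_SMS":    [1, 1, 1, 1, 0, 0, 0, 0],  # separates the classes perfectly
    "getDeviceId": [1, 1, 1, 0, 0, 0, 0, 1],  # weak signal
}
selected = top_k_features(columns, labels, k=2)
```

In the paper's setting, the same ranking would be applied to the 527 binary API Call and permission columns with k = 50; a feature that every app shares, such as the constant column here, scores zero and is dropped first.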

State of the Art
The proposed model in this work has achieved very high accuracy, as discussed in the previous section. To show the significance of this work, four state-of-the-art models are considered: Du et al. [35], Narayanan et al. [36], Mahdavifar et al. [37], and Hadiprakoso et al. [38].
Du et al. [35] proposed a context-based approach, called FlowCog, that used natural language processing and deep learning methods to analyze the semantics and contextual information of the network flow of Android applications. Their approach used a large dataset of more than 8000 samples collected from different sources, such as the ICC-bench dataset [62], Google Play, and Drebin. Their proposed model achieved an accuracy of about 95.4%. Similarly, Narayanan et al. [36] proposed a context-based approach, called MKLDroid, which used a multiple kernel learning method that extracted contextual subgraph features from the applications' dependency graphs to detect malicious code patterns. MKLDroid was applied using two datasets, Drebin and Virusshare. The authors claimed that MKLDroid achieved an accuracy of about 97%.
Mahdavifar et al. [37] and Hadiprakoso et al. [38] used the same dataset used in this work, CIC_Maldroid2020. However, neither method used contextual information. The authors in [37] proposed a deep neural network method that used about 470 features, such as system calls, binders, and composite behaviors. Their proposed method achieved an accuracy of about 97.84%. Meanwhile, Hadiprakoso et al. [38] proposed a machine learning model and tested several machine learning algorithms such as SVM, KNN, RF, and XGBoost. Their model used many static and dynamic features such as API Calls, permissions, and system calls. The authors claimed that their model achieved an accuracy of about 96%.
Table 18 compares the accuracy of the proposed model with that of the state-of-the-art models. The table shows that the proposed work outperformed the approaches that used contextual features, [35] and [36], which achieved an accuracy of about 95.4% and 97%, respectively. These approaches, however, used neither the CIC_Maldroid2020 dataset used in this work nor conventional machine learning algorithms. The comparison indicates the significance of the machine learning model proposed in this work and of the chosen contextual features in detecting Android malware with very high accuracy. Moreover, the table shows that the proposed work outperformed the approaches that used the same dataset as this work. These approaches used deep learning algorithms and conventional machine learning methods, but they did not use contextual features. This supports the significance of using contextual features in achieving very high accuracy in Android malware detection.

Conclusions
The accuracy of Android malware detection methods using machine learning depends on the features used. API Calls and permissions are two of the most important features used in Android malware detection. However, most machine learning methods use these features without considering the context. This paper has shed light on the effect of using contextual features with API Calls and permissions on the detection accuracy of machine learning models. The paper has proposed a machine learning model based on four important contextual features and fifty API Calls and permission features, which were extracted from a large dataset of 12,800 malicious and 4100 benign Android apps. To test the model, the paper has used several machine learning algorithms: Random Forest, SVM, Linear Regression, Naïve Bayes, K-NN, and Decision Tree. The results have shown that when using the proposed model with API Calls and permissions only, the best accuracy achieved was 98.1% using the Random Forest algorithm. Moreover, after applying the Information Gain selection algorithm to select the most relevant features, only 50 features out of 527 are needed to achieve a close accuracy of about 97.2%. Furthermore, using contextual features along with the 50 API Calls and permissions achieved a very high accuracy of about 99.4% with the Random Forest algorithm. In addition, the algorithm most affected by the contextual features was Naïve Bayes, whose accuracy rose sharply from 82.6% to 92.5%. Finally, this paper considered four important methods as state-of-the-art models. The comparison has shown that the proposed model outperformed these state-of-the-art models.

Figure 4. A framework for context-aware Android malware detection approach using machine learning techniques.

Figure 8. Top 50 ranked-selected API and permissions based on Information Gain (IG).

Figure 9. Android application permissions classification report and confusion matrix for the Random Forest classifier: API Calls only.

Figure 10. Other machine learning classifier results in detecting malicious Android applications: API Calls only.

Figure 11. Android application permissions classification report and confusion matrix for Random Forest classifier: permissions only.

Figure 13. Android application permissions classification report and confusion matrix for the Random Forest classifier: API Calls with permissions.

Figure 14. Other machine learning classifier performance in Android malware detection: API Calls with permissions.

Figure 15. Accuracy scores when training with different feature dimensions.

Figure 16. Classification report and confusion matrix for the Random Forest Classifier: API Calls with permissions (with feature selection).

Figure 17. Machine learning classifiers results in detecting Android malware: API Calls with permissions (with feature selection).

Figure 18. Detection of Android malware results with other machine learning algorithms: API Calls, permissions, and contextual information.

Figure 19. Enhancement of the accuracy of the proposed model using contextual features.

Figure 20. The classification results and confusion matrix for the Random Forest Classifier: API Calls, permissions, and contextual features.

Table 1. Types of mobile malware.

Table 3. Number of extracted features.

Table 4. Extracted API Call categories.
getLastKnownLocation: Returns the data from the last known location retrieved from the supplied source.
isProviderEnabled: Returns whether the given provider is enabled or disabled.

Table 5. Number of Android application samples from which API Call features were extracted.

Table 6. Sample of normal extracted permissions.

Table 7. Sample of extracted dangerous permissions.

Table 8. Number of Android application samples from which permissions features were extracted.

Table 10. Number of APK samples from which the contextual features were extracted.

Table 11. Binary representation of the extracted API Calls.

Table 12. Binary representation of the extracted permissions.

Table 13. Machine learning classifiers results: API Calls only.

Table 14. Other machine learning classifier results in Android malware detection: permissions only.

Table 15. The results of using machine learning algorithms in detecting Android malware: API Calls with permissions.

Table 16. The results of the other machine learning algorithms in Android malware detection: API Calls with permissions (with feature selection).

Table 17. Android malware detection results with other machine learning algorithms: API Calls, permissions, and contextual information.

Table 18. Comparison with the state-of-the-art methods.