1. Introduction
The number of mobile devices used in 2022 was almost 16 billion [1], nearly twice the world population, which was reported as 8.1 billion this year [2]. These devices include resources such as memory, sensors, microphone, GPS (Global Positioning System), Bluetooth, camera, and NFC (Near Field Communication), which applications must manage to provide functionality to users. During the development of a mobile application, developers implement the acquisition, use, and release of such resources. For example, when one opens a camera application, the camera resource is acquired and used to take pictures, and, once the application is closed, the camera resource must be released. However, sometimes these resources are acquired by a mobile application and not properly released, resulting in a defect called a resource leak, which can lead to unnecessary battery consumption, crashes, and slowdowns [3].
In recent years, some methods have been developed to identify resource leaks in mobile applications. For example, Android Studio has the Android Lint method that inspects code and shows resource leaks as well as other defects [4]. The FindBugs method performs code analysis in order to find structural problems and resource leaks [5].
In a previous work, the authors presented the LeakPred approach, which identifies resource leaks in Android applications using machine learning (ML) [6]. A study was carried out to determine which ML classifier is best suited to LeakPred, and it showed that the KNN (K-Nearest Neighbor) and DNN (Deep Neural Network) classifiers obtained the best results [6].
Continuing the evolution of this approach, we want to demonstrate the usefulness of LeakPred in helping app developers to identify components with resource leak problems (in this work, a component is a method of a class in an object-oriented program). To this end, we compared the approach against three state-of-the-art methods through a controlled experiment. These state-of-the-art methods identify leaks in only a limited number of resource classes, whereas the number of resource classes currently in use is greater. Therefore, this article presents a comparative study.
The three state-of-the-art methods used in the study were Android Lint [4], FindBugs [5], and Infer [7]. The study was carried out as a controlled experiment using 15 mobile applications randomly selected from the list of applications available in [8].
The results of this study indicate the feasibility of the LeakPred approach to assist developers in identifying components with resource leaks, since the approach reached the best median (85.37%) of components with identified resource leaks and had the highest coverage (96.15%) of classes of resource leaks in applications. The main contributions of this work are as follows:
This article is organized as follows: Section 2 shows the related work, Section 3 presents the planning and execution of the study, Section 4 shows the results found, Section 5 presents the study discussions, and Section 6 discusses the threats to the validity of the study. Finally, Section 7 shows the conclusions and future work.
2. Related Work
In recent years, many resource leak identification techniques have been proposed to help Android developers manage device resources properly. However, there was no database of applications with leaks, so evaluating these new techniques required considerable effort to find applications to use in studies [8]. To reduce this problem, Liu et al. [8] presented the DroidLeaks database and also revealed some characteristics of leaks in Android applications, as well as some defect patterns in resource management. To demonstrate the usefulness of the database, they performed a study comparing eight methods for detecting leaks in Android mobile applications, in which the Android Lint and FindBugs methods did not produce any false positives but also did not achieve the best detection rates [8].
Some state-of-the-art techniques for identifying resource leaks in Android applications are LeakPred, Infer, FindBugs, and Android Lint, which will be briefly explained below.
Lima et al. [6] proposed an approach called LeakPred, which uses machine learning to identify leaks in Android application components. Because ML requires a database, Lima et al. presented the CompLeaks database, which was based on the DroidLeaks database. To determine the best classifier to use, a study compared six classifiers commonly used in defect prediction studies, in which the KNN and DNN classifiers had the best results. In this study, the DNN classifier was used, as it had the best result (78.93%) in the ROC AUC metric. This evaluation metric was chosen because it is recommended when the database is unbalanced [6].
Facebook [7] uses the Infer method, which relies on static analysis. It can parse Java and C/C++/Objective-C code. After parsing, it produces a list of possible defects (including resource leaks). Some of the companies that use this method are Amazon Web Services, Spotify, Uber, WhatsApp, Microsoft, Mozilla, and Instagram [7].
Pugh et al. [5] presented the static analysis method FindBugs. It can identify more than 200 defect patterns, such as null pointers, infinite recursive loops, resource leaks, the misuse of Java libraries, and deadlocks. The project is open-source, has been downloaded over 230,000 times, and is used by many large companies and financial institutions [5].
Android [4] provides the Android Lint method. This method helps to find code with an inefficient structure that can affect the reliability and efficiency of Android applications (including resource leaks) and make code maintenance difficult. Possible defects and improvements are grouped according to the following criteria: accuracy, security, performance, usability, accessibility, and internationalization [4].
3. Study Planning
This feasibility study aimed to compare the identification of components with resource leaks by the LeakPred approach against state-of-the-art methods, using mobile applications from the Android platform. The results and the body of knowledge gained from conducting the study are expected to provide information that supports the evolution of the approach and improves its use in identifying leaking components in Android mobile applications.
The purpose of this study was to answer the following question: “Is the use of the LeakPred approach feasible in analyzing its effectiveness in comparison to representative state-of-the-art methods in identifying components with resource leaks in mobile applications on Android platform?”. Effectiveness is understood in this study as the number of components with resource leaks identified in relation to the total number of components with resource leaks in mobile applications. In this context, the list of components identified with resource leaks by the LeakPred approach was compared with the list identified by each of the state-of-the-art methods. Therefore, the research question defined for this study was the following:
RQ1. How effective is the LeakPred approach in relation to the identification performed by state-of-the-art methods regarding the number of components with identified resource leaks?
To help answer the research question, five metrics were chosen: the True Positive Rate (TPR, also known as Recall), which is the probability that a leaking component will be identified as leaking (Equation (1)); the False Discovery Rate (FDR), which is the expected ratio of the number of non-leaking components identified as leaking (false discoveries) to the total number of components identified as leaking (Equation (2)); the accuracy; the precision; and the F1-score. In Equations (1) and (2), TP is the number of components with leaks correctly identified, P is the number of leaks that could be identified, and FP is the number of non-leaking components identified as having a leak.
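Equations (1) and (2) are not reproduced in this text. Based only on the definitions of TP, P, and FP given above, they would presumably take the following standard form (a reconstruction, not the authors' original typesetting):

```latex
\begin{equation}
\mathrm{TPR} = \frac{TP}{P}
\end{equation}

\begin{equation}
\mathrm{FDR} = \frac{FP}{FP + TP}
\end{equation}
```

As a consistency check, this form agrees with the rates reported for OI Notepad in Section 4.1: 17 identified leaks out of 20 give a TPR of 85%, and 10 false positives against 17 correctly identified leaks give an FDR of 10/27, or about 37.04%.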
3.1. Methodology for Identifying Resource Leaking Components
To identify components with resource leaks, the same protocol used to assemble the DroidLeaks database in [8] was followed. Figure 1 shows an overview of the process, which consists of two steps: (1) a keyword search in the commit logs and in the commit code differences (example keywords are shown in Table 1); and (2) the manual validation of the resource leaks identified in the commits found in step 1. This process was applied to the applications discussed in the next subsection, and some of the applications with identified resource leaks were used to evaluate the effectiveness of the LeakPred approach.
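Step 1 of this process can be sketched as a simple keyword filter over commit messages and diffs. The sketch below is illustrative only: the `Commit` record, the function names, and in particular the keyword list are assumptions made for the example (the actual keywords are those in Table 1), and step 2 remains a manual activity.

```python
import re
from dataclasses import dataclass

# Hypothetical keywords for illustration only; the real list is given in Table 1.
KEYWORDS = ("leak", "release", "close", "recycle")


@dataclass
class Commit:
    """Minimal, assumed representation of a commit."""
    sha: str
    message: str
    diff: str


def matches_keywords(text: str, keywords=KEYWORDS) -> bool:
    """Return True if any keyword occurs at the start of a word (case-insensitive)."""
    pattern = r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\w*"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def candidate_commits(commits):
    """Step 1: keep commits whose log message or code diff mentions a keyword.

    Step 2 (manual validation of each candidate) is performed by a person
    and is not modelled here.
    """
    return [c for c in commits
            if matches_keywords(c.message) or matches_keywords(c.diff)]
```

Matching on the diff as well as the message catches fixes whose log text does not mention the leak but whose code adds a call such as `close()`.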
3.2. Mobile Applications Selection
Liu et al. [8] provided a list of 170 mobile applications that meet the following criteria:
They have more than 10,000 downloads in the store (application is popular);
They have a public defect tracking system (defects are trackable);
The application’s code repository has over 100 code reviews (application is actively maintained);
They have at least 1000 lines of Java source code (the application has a medium or high level of complexity) [8].
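The four criteria above amount to a simple predicate over application metadata. The following is a minimal sketch under stated assumptions: the `AppInfo` record and its field names are hypothetical and chosen for the example; only the thresholds come from the list.

```python
from dataclasses import dataclass


@dataclass
class AppInfo:
    """Hypothetical metadata record for one open-source application."""
    name: str
    downloads: int
    has_public_issue_tracker: bool
    code_reviews: int
    java_loc: int


def meets_criteria(app: AppInfo) -> bool:
    """Apply the four selection criteria listed above."""
    return (
        app.downloads > 10_000             # popular
        and app.has_public_issue_tracker   # defects are trackable
        and app.code_reviews > 100         # actively maintained
        and app.java_loc >= 1_000          # medium or high complexity
    )
```

Note that the first and third criteria are strict ("more than", "over"), while the lines-of-code criterion is inclusive ("at least").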
For the creation of the DroidLeaks database, 34 of these 170 applications were used. Therefore, for this study, 32 applications were randomly selected from the remaining 136 (Figure 2). These 32 selected applications are shown in Table 2. Of these, 22 were used to enlarge the database; 5 of these 22 were also randomly selected for the comparison study between the methods, and the other 17 were used together with the old version of the database (CompLeaks) to train the LeakPred approach. Therefore, 15 applications were used in this study. More information, such as the component name and leaked resource class, is found at https://bit.ly/3o7XJr9 (accessed on 1 May 2024).
3.3. Tools
For this study, some state-of-the-art methods for resource leak detection in mobile applications were chosen from a systematic mapping that we performed on techniques related to resource leaks. Those selected are as follows:
Android Lint provides a code scan that helps identify resource leaks and other structural code issues, using static analysis to check whether the code breaks existing lint rules [4]. A lint rule has the following information: id, summary, explanation, category, priority, severity, detector class (responsible for detecting the occurrence of the issue; it can be written using UAST, the Universal Abstract Syntax Tree), and scope [9].
FindBugs is a program that uses static analysis to look for defects (including resource leaks) in Java code [5]. It often syntactically matches source code with faulty code patterns, but also uses data flow analysis to check for defects [10].
Infer uses static analysis to check for resource leaks and other defects [7]. It develops a compositional, bottom-up variant of the RHS inter-procedural analysis algorithm [11].
These methods serve as a basis for comparison with the LeakPred approach [6].
3.4. Execution of the Study
The execution of this study had three steps: (1) compiling the applications (some methods need the application compiled), (2) executing the methods, and (3) listing the classes of resource leaks by method. These steps are presented in Figure 3. For steps 1 and 2, a maximum of 2 weeks was allowed for resolving compilation or execution problems for each application and method (this amount of time was considered reasonable for understanding how to prepare the environment to compile an application or run a method). If the problem could not be solved within this time, the application or the method was not used in the study.
The first step was to compile the 15 applications randomly selected for this study (the applications that contain the word experiment in the use column of Table 2). The applications with a gray background in the table, with ids 1, 5, 10, 25, 26, 27, 29, 30, 31, and 32, could not be compiled due to library dependencies and design errors. It was only possible to compile the applications with a green background in the table, namely, OI Notepad (15), OI Safe (16), AnyMemo (23), Avare (24), and Seafile (28). Therefore, of the 15 applications previously selected, only five could be used in this study, as some of the methods require the application to be compiled.
The second step was to execute the state-of-the-art methods. The Android Lint, FindBugs, and Infer methods were successfully executed. Thus, the three methods plus the LeakPred approach were executed to analyze resource leaks in the five mobile applications.
The third step was to make a list of the resource leak classes that each method could identify in the five applications (Table 3), since each method identifies only some types of resource leaks and not all methods provide a list of the resource classes that they can identify. For this, we used the list of leaks identified by at least one of the methods in the five applications and the list of leaks that each method could identify as mapped in DroidLeaks [8]. The next section presents the analysis of the results, in which the detection rate of each method is based only on the leaks that it could identify in each application.
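The idea of restricting each method's detection rate to the leaks it could identify can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function name, the data layout, and the component names in the usage note are assumptions made for the example.

```python
def detection_rate(found, ground_truth, supported_classes):
    """Detection rate of one method in one application.

    found:             set of component names the method reported as leaking
    ground_truth:      dict mapping each leaking component to its resource class
    supported_classes: set of resource classes the method is able to detect

    Only leaks whose resource class is supported count toward the denominator.
    Returns None when the application contains no leak the method could identify.
    """
    detectable = {component for component, resource_class in ground_truth.items()
                  if resource_class in supported_classes}
    if not detectable:
        return None
    return len(found & detectable) / len(detectable)
```

For instance, with hypothetical components `{"NoteList.load": "android.database.Cursor", "Backup.write": "java.io.BufferedWriter"}`, a method that supports only `android.database.Cursor` and reports `NoteList.load` scores 100%, even though it misses the `BufferedWriter` leak it was never able to detect. Returning `None` separates the case of no identifiable leaks from a rate of 0%.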
4. Results
The results of this study are shown for each of the mobile applications, starting with the OI Notepad application, followed by the results for the OI Safe, AnyMemo, Avare, and Seafile applications. More information (for example, resource class or which method identified each leak) about the components with identified resource leaks in each of the applications is available at https://bit.ly/3mq5RTv (accessed on 1 May 2024).
4.1. Results: OI Notepad Application
Table 4 presents the 20 resource leaks identified by the methods in the OI Notepad application. To show how many resource leaks were identified by more than one method, Figure 4 presents a Venn diagram of resource leaks and false positives, in which it can be seen that LeakPred identified 17 leaks, FindBugs identified 0 (zero), and the Infer and Android Lint methods identified 10 leaks each. Figure 4 also presents the detection rates of resource leaks and false positives, where it is possible to observe that the LeakPred approach obtained the best detection coverage, with 85%, as well as the highest percentage of false positives, with 37.04%. The FindBugs method did not find either of the two leaks it could identify, and it did not have any false positives either.
The three leaks that the LeakPred approach failed to identify were two from the java.io.BufferedWriter class (the two components are about 81% similar) and one from the android.database.Cursor class. Regarding the false positives reported by the LeakPred approach, nine are from the android.database.Cursor class and one is from the java.io.InputStream class. This suggests that there is room for the approach to reduce its percentage of false positives.
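The LeakPred figures for OI Notepad can be reproduced with a few lines of arithmetic. The sketch below assumes the standard definitions TPR = TP / P and FDR = FP / (FP + TP), which are consistent with the metric descriptions in Section 3; the function names and the zero-report convention are choices made for the example.

```python
def true_positive_rate(tp: int, p: int) -> float:
    """Share of identifiable leaks that were actually identified."""
    return tp / p


def false_discovery_rate(fp: int, tp: int) -> float:
    """Share of reported components that were not real leaks.

    Assumed convention: a method that reports nothing has an FDR of 0.
    """
    reported = fp + tp
    return fp / reported if reported else 0.0


# LeakPred on OI Notepad: 17 of 20 leaks identified, 9 + 1 = 10 false positives.
tpr = true_positive_rate(tp=17, p=20)
fdr = false_discovery_rate(fp=10, tp=17)
print(f"TPR = {tpr:.2%}, FDR = {fdr:.2%}")  # TPR = 85.00%, FDR = 37.04%
```

The result matches the reported 85% coverage and 37.04% false positives, which indicates that the false positive percentage is computed relative to all components the approach reported, not relative to all non-leaking components.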
4.2. Results: OI Safe Application
Table 5 presents the 13 resource leaks discovered by the methods in the OI Safe application, and Figure 5 shows the number of resource leaks and false positives that each method identified and how many were identified by more than one method. For example, LeakPred identified eight leaks, and Infer identified seven. Figure 5 also shows the detection rates of resource leaks and false positives, where the Android Lint method scored 100% in coverage, identifying the only leaky component it could identify in this application. Next is the Infer method with 70% coverage, and then the LeakPred approach with 61.54% coverage, identifying 8 out of 13 possible leaks.
It is worth mentioning that the LeakPred approach does not identify resource leaks in an intermediate class that inherits from the original resource class, and this application had two resource leaks in this situation. These leaks were from the InputStreamData class, which inherits from the java.io.InputStream class, making it impossible to identify them using the LeakPred approach. These 2 resource leaks were counted among the 13 that could be identified.
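To illustrate why a leak in a subclass such as InputStreamData can be missed, the sketch below contrasts a lookup by exact class name with a lookup that walks up the inheritance chain. It is not LeakPred's implementation, whose internals are not described here; the hierarchy map, the set of known resource classes, and the function names are all assumptions made for the example.

```python
# Hypothetical, simplified hierarchy: child class -> parent class.
SUPERCLASS = {
    "InputStreamData": "java.io.InputStream",
    "java.io.InputStream": "java.lang.Object",
    "java.lang.String": "java.lang.Object",
}

# Assumed set of resource classes known to a detector.
KNOWN_RESOURCE_CLASSES = {"java.io.InputStream", "android.database.Cursor"}


def is_resource_exact(class_name: str) -> bool:
    """Match only the resource classes themselves; subclasses are missed."""
    return class_name in KNOWN_RESOURCE_CLASSES


def is_resource_with_inheritance(class_name: str) -> bool:
    """Walk up the hierarchy so that subclasses of a resource class also match."""
    current = class_name
    while current is not None:
        if current in KNOWN_RESOURCE_CLASSES:
            return True
        current = SUPERCLASS.get(current)
    return False
```

Under these assumptions, `is_resource_exact("InputStreamData")` is false while `is_resource_with_inheritance("InputStreamData")` is true, which is one possible direction for removing the limitation.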
With regard to false positives, the LeakPred approach had the highest percentage, with 33.33%, followed by the Infer method with 30%. The Android Lint and FindBugs methods did not report any false positives. However, the FindBugs method did not find any of the six resource leaks that could be identified in this application.
4.3. Results: AnyMemo Application
The 12 leaked components detected by the methods in the AnyMemo application are presented in Table 6. Figure 6 shows how many leaks were identified by each method. In this figure, it can be seen that Infer identified two leaks and had a false positive. Figure 6 also shows the detection rates of leaks and false positives, in which it is highlighted that the LeakPred approach had the best coverage, identifying 100% of the leaks that the approach could identify. As for the false positives of the LeakPred approach, 7 out of 18 were in test files. Therefore, an improvement in the approach would be to ignore test files during code analysis.
The Infer method had the second best coverage with 50%. The Android Lint method, on the other hand, did not have any leaks that it could identify in this application, and the FindBugs method had two possible leaks to be identified, but it did not identify any of them and had a false positive (100%).
4.4. Results: Avare Application
Table 7 shows the 22 resource leaks found by the methods in the Avare application. Figure 7 shows the number of leaks identified by each method. The LeakPred approach identified 20 leaks and had 39 false positives. Figure 7 also shows the detection rates of leaks and false positives, and it is observed that the LeakPred approach achieved the best coverage, 90.91%, as well as the highest percentage of false positives, 62.10%. Next is the FindBugs method with a coverage of 11.11% and a false positive rate of 50%.
The Infer method could identify 11 leaks but did not identify any of them, and the Android Lint method had no leaks that could be identified in this application. Regarding the false positives of the LeakPred approach, it can be noted that 14 of the 39 were from resources in class-level variables and not from the component; in other words, generally, these resources are not closed in the component where they are used.
4.5. Results: Seafile Application
Table 8 displays the 42 leaking components that the methods identified in the Seafile application. Figure 8 shows how many components were identified by more than one method. It shows that one leak was identified by all three of the LeakPred, Infer, and Android Lint methods. The LeakPred approach had the best coverage, 85.37%, and the highest false positive rate, 40.68%. It is followed by the Infer method, with 37.50% coverage and a false positive rate of 25%.
The Android Lint method did not have any false positives, but it only identified 2 out of 21 possible leaks (9.52%), and the FindBugs method did not have any leaking components that could be identified in this application.
5. Discussion
The LeakPred approach achieved the best coverage of identified leaks (mean of 84.56% and median of 85.37%) and covered 96.15% of the classes of leaks that could be identified in the five applications. This approach also had the highest rate of false positives (mean of 47.84% and median of 40.68%), which indicates the need for further refinement to reduce false positives.
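The mean and median coverage of LeakPred follow directly from the per-application coverage values reported in Section 4. A minimal check, using only those five reported percentages (the variable names are chosen for the example):

```python
from statistics import mean, median

# LeakPred coverage per application, in percent, as reported in Section 4.
coverage = {
    "OI Notepad": 85.00,
    "OI Safe": 61.54,
    "AnyMemo": 100.00,
    "Avare": 90.91,
    "Seafile": 85.37,
}

values = list(coverage.values())
mean_coverage = round(mean(values), 2)
median_coverage = median(values)

print(mean_coverage)    # 84.56
print(median_coverage)  # 85.37
```

The median is less sensitive than the mean to the comparatively low coverage in OI Safe, which is one reason to report both.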
Three patterns stand out among the false positives of the LeakPred approach. (1) A class-level variable is instantiated in one component and released in another; a possible improvement would be to find metrics that could represent this situation. (2) Test files were analyzed, which caused some false positives in these files; one solution would be to ignore the test files during code analysis. (3) During file manipulation, a resource class is often instantiated and passed as a parameter to instantiate another resource class, and so on; sometimes, it is necessary to close only one of them for the resource to be released correctly. A way to solve this problem would be to define a heuristic to deal with this pattern. Another way to decrease false positives would be to increase the number of leaks in the database.
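The second improvement, ignoring test files during analysis, could be approximated with a path-based filter. The sketch below is one possible heuristic and not part of LeakPred: the directory names and the `Test` file-name suffix are assumed conventions of typical Android/Gradle projects, and real projects may organize tests differently.

```python
from pathlib import PurePosixPath

# Assumed directory names that usually hold tests in Android/Gradle projects.
TEST_DIR_NAMES = {"test", "tests", "androidTest"}


def is_test_file(path: str) -> bool:
    """Heuristically decide whether a source file belongs to the test code."""
    p = PurePosixPath(path)
    # Any parent directory named like a test source set marks the file as a test.
    if any(part in TEST_DIR_NAMES for part in p.parts[:-1]):
        return True
    # Otherwise fall back to the common "<Name>Test.java" naming convention.
    return p.stem.endswith("Test")


def files_to_analyze(paths):
    """Keep Java source files that are not recognized as test files."""
    return [p for p in paths if p.endswith(".java") and not is_test_file(p)]
```

A name-based fallback can also exclude production classes that happen to end in `Test`, so a real filter would need to be checked against each project's layout.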
Table 9 shows a summary of metrics. The Android Lint method had the second highest coverage (mean of 62.15% and median of 76.92%) and the highest accuracy (median of 100%) of components with identified resource leaks and no false positives. However, it can only identify the resource class android.database.Cursor (1.92%) among the leak classes of the five applications in the study. The Infer method had the second highest accuracy (median of 95.16%), and the LeakPred method had the lowest accuracy (median of 81.25%).
The LeakPred approach can identify leaks in several resource classes. This is an advantage, since the fact that a method can identify several categories of leaks reduces the number of methods to be configured and executed, which can help in its use during the development of mobile applications and, consequently, in reducing costs and/or time throughout the project.
6. Threats to Validity
This study has internal, construct, and external threats to validity that must be examined. This section details each one as follows:
Internal Validity: The sample of projects was not completely random, as the applications were randomly selected only from the list of open-source applications presented in [8]. As this was a feasibility study, it is believed that this issue does not pose a significant threat.
Construct Validity: This feasibility study used five Android platform applications from different categories developed in the Java language. This study may not be representative of other categories of mobile applications. Commits containing resource leak fixes were identified using keywords, and to ensure that each commit was related to a leak, a manual validation step was included.
External Validity: To reduce this threat, five applications from four different categories were used, namely, productivity, maps and navigation, tools, and education. Likewise, the LeakPred approach was compared with more than one state-of-the-art method. In the future, it is intended to increase the number of applications of different categories. We also chose the median, as it provides a direct understanding of the central point of the data and is not as influenced by outliers as the mean.
7. Conclusions
In this work, a feasibility study for the LeakPred approach was presented, aiming to analyze it through a controlled experiment with respect to its effectiveness in identifying components with resource leaks compared to state-of-the-art methods. The results show the possibility of using the LeakPred approach to identify resource leaks in mobile applications, as it obtained the highest median coverage percentage of identified resource leaks with 85.37%, as well as having the highest percentage of leak class coverage, 96.15%. However, it had the lowest accuracy (median of 81.25%). There is also the possibility of refinement to reduce the number of false positives.
The results provide a basic understanding of the feasibility of the LeakPred approach. As future work, a study could be carried out with a larger number of open-source mobile applications and with company applications. Another possibility is to evaluate the efficiency of the methods. A further direction would be to remove the limitation of the LeakPred approach in identifying leaks in an intermediate class that inherits from the original leak class and to stop parsing test files. Finally, the approach could be adapted and a feasibility study carried out for the iOS platform.