Unsupervised Detection of Changes in Usage-Phases of a Mobile App
Abstract
1. Introduction
2. Related Work
3. Materials and Methods
3.1. Experiment Description
3.1.1. Problem Statement
3.1.2. Data
3.2. Basic Change Detection Algorithm
Algorithm 1: Basic change detection algorithm.
3.3. SIFT Based Method
3.4. Graph Based Methods
3.4.1. Graph Entropy Based Detection
3.4.2. Graph Kernel Based Detection
3.5. Probability Distribution Based Methods
3.5.1. KLD Based Detection
3.5.2. Usage-Phase Model Based Detection
3.5.3. Hypothesis Testing Based Detection
4. Results and Discussions
4.1. Overall Performance
4.2. Case Studies
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
Abbreviations
AI | Artificial intelligence |
CUSUM | Cumulative sum |
ELM | Extreme learning machine |
FN | False negative |
FP | False positive |
GUI | Graphical user interface |
KLD | Kullback–Leibler divergence |
LSDD | Least-squares density-difference |
ML | Machine learning |
NPrecision | Negative precision |
NRecall | Negative recall |
OCR | Optical character recognition |
R-CNN | Regions with convolutional neural network |
SIFT | Scale-invariant feature transform |
TISLF | Target image search based on local features |
TN | True negative |
TP | True positive |
UI | User interface |
Sample Availability: Samples of the GUI screenshots are available from the authors.
4shared | alba heaven | albamon | amazon | baedal | cars | cbs sports | cgv |
231 | 92 | 176 | 201 | 135 | 294 | 252 | 154 |
(34/6.8) | (8/11.5) | (20/8.8) | (35/5.7) | (8/16.9) | (42/7.0) | (18/14.0) | (36/4.3) |
coupang | deep | door | drink water | ebay | emart | espn | |
booster | dash | reminder | |||||
133 | 433 | 580 | 217 | 143 | 158 | 97 | 225 |
(19/7.0) | (55/7.9) | (187/3.1) | (22/9.9) | (24/6.0) | (13/12.2) | (15/6.5) | (27/8.3) |
fish | groupon | hawhae | home& | kakao | korail | little | |
brain | translator | shopping | bus | caesars | |||
310 | 189 | 335 | 196 | 177 | 279 | 212 | 193 |
(40/7.8) | (32/5.9) | (47/7.1) | (20/9.8) | (46/3.8) | (59/4.7) | (40/5.3) | (48/4.0) |
mcdonalds | melon | messenger | my fitness | naver webtoon | news break | offerup | papago |
373 | 201 | 388 | 387 | 150 | 273 | 365 | 173 |
(69/5.4) | (11/18.3) | (53/7.3) | (49/7.9) | (17/8.8) | (25/10.9) | (32/11.4) | (27/6.4) |
pinterest | pluto tv | poshmark | roku | shareit | smart news | spotify | the weather channel |
394 | 299 | 348 | 333 | 458 | 263 | 263 | 195 |
(49/8.0) | (32/9.3) | (74/4.7) | (22/15.1) | (63/7.3) | (28/9.4) | (57/4.6) | (18/10.8) |
today home | triple | tubi | wayfair | wish | workout for woman | yanolja | yelp |
227 | 237 | 227 | 549 | 393 | 359 | 258 | 411 |
(31/7.3) | (72/3.3) | (14/16.2) | (96/5.7) | (68/5.8) | (56/6.4) | (63/4.1) | (76/5.4) |
yeogieotae | zumo | The mean of the manually-labeled usage-phases |
208 | 128 | |
(44/4.7) | (31/4.1) | 265.44 |

Threshold | Method | TP | FP | TN | FN | Precision | Recall | Accuracy |
---|---|---|---|---|---|---|---|---|
(Min + Max)/2 | SIFT | 684 | 1413 | 9787 | 1388 | 0.326 | 0.330 | 0.789 |
(Min + Max)/2 | Graph kernel | 323 | 368 | 10,832 | 1749 | 0.467 | 0.156 | 0.840 |
(Min + Max)/2 | Graph entropy | 369 | 682 | 10,518 | 1703 | 0.351 | 0.178 | 0.820 |
(Min + Max)/2 | KLD | 264 | 428 | 10,772 | 1808 | 0.382 | 0.127 | 0.832 |
(Min + Max)/2 | Likelihood | 260 | 432 | 10,768 | 1812 | 0.376 | 0.125 | 0.831 |
(Min + Max)/2 | Hypothesis testing | 1084 | 5481 | 5719 | 988 | 0.165 | 0.523 | 0.513 |
Mean | SIFT | 1031 | 4306 | 6894 | 1041 | 0.193 | 0.498 | 0.597 |
Mean | Graph kernel | 682 | 1550 | 9650 | 1390 | 0.306 | 0.329 | 0.778 |
Mean | Graph entropy | 953 | 2413 | 8787 | 1119 | 0.283 | 0.460 | 0.734 |
Mean | KLD | 778 | 2458 | 8742 | 1294 | 0.240 | 0.375 | 0.717 |
Mean | Likelihood | 781 | 2533 | 8667 | 1291 | 0.236 | 0.377 | 0.712 |
Mean | Hypothesis testing | 1085 | 5480 | 5720 | 987 | 0.165 | 0.524 | 0.513 |
Empirical threshold | SIFT | 706 | 1141 | 10,059 | 1366 | 0.382 | 0.341 | 0.811 |
Empirical threshold | Graph kernel | 569 | 927 | 10,273 | 1503 | 0.380 | 0.275 | 0.817 |
Empirical threshold | Graph entropy | 1159 | 3921 | 7279 | 913 | 0.228 | 0.559 | 0.636 |
Empirical threshold | KLD | 538 | 1105 | 10,095 | 1534 | 0.327 | 0.260 | 0.801 |
Empirical threshold | Likelihood | 751 | 2214 | 8986 | 1321 | 0.253 | 0.362 | 0.734 |
Empirical threshold | Hypothesis testing | 1085 | 5482 | 5718 | 987 | 0.165 | 0.524 | 0.513 |

Threshold | Method | NPrecision | NRecall |
---|---|---|---|
(Min + Max)/2 | SIFT | 0.876 | 0.874 |
(Min + Max)/2 | Graph kernel | 0.861 | 0.967 |
(Min + Max)/2 | Graph entropy | 0.861 | 0.939 |
(Min + Max)/2 | KLD | 0.856 | 0.962 |
(Min + Max)/2 | Likelihood | 0.856 | 0.961 |
(Min + Max)/2 | Hypothesis testing | 0.853 | 0.511 |
Mean | SIFT | 0.869 | 0.616 |
Mean | Graph kernel | 0.874 | 0.862 |
Mean | Graph entropy | 0.887 | 0.785 |
Mean | KLD | 0.871 | 0.781 |
Mean | Likelihood | 0.870 | 0.774 |
Mean | Hypothesis testing | 0.853 | 0.511 |
Empirical threshold | SIFT | 0.880 | 0.898 |
Empirical threshold | Graph kernel | 0.872 | 0.917 |
Empirical threshold | Graph entropy | 0.889 | 0.650 |
Empirical threshold | KLD | 0.868 | 0.901 |
Empirical threshold | Likelihood | 0.872 | 0.802 |
Empirical threshold | Hypothesis testing | 0.853 | 0.511 |
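
The scores in the two tables above can be reproduced from the confusion counts. The following is a minimal Python sketch, assuming the usual definitions of precision, recall, and accuracy, and assuming that NPrecision and NRecall are the same quantities computed for the negative (no-change) class, that is, TN/(TN + FN) and TN/(TN + FP). The `Confusion` class and its method names are illustrative and do not come from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion counts for one detector (hypothetical helper, not from the article)."""
    tp: int
    fp: int
    tn: int
    fn: int

    def precision(self) -> float:
        # Fraction of reported changes that are true changes.
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        # Fraction of true changes that were reported.
        return self.tp / (self.tp + self.fn)

    def accuracy(self) -> float:
        # Fraction of all decisions that were correct.
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total

    def n_precision(self) -> float:
        # Assumed definition: precision of the negative (no-change) class.
        return self.tn / (self.tn + self.fn)

    def n_recall(self) -> float:
        # Assumed definition: recall of the negative (no-change) class.
        return self.tn / (self.tn + self.fp)


# Counts taken from the first SIFT row of the overall results table.
sift = Confusion(tp=684, fp=1413, tn=9787, fn=1388)
scores = {
    "precision": round(sift.precision(), 3),
    "recall": round(sift.recall(), 3),
    "accuracy": round(sift.accuracy(), 3),
    "n_precision": round(sift.n_precision(), 3),
    "n_recall": round(sift.n_recall(), 3),
}
print(scores)
```

With the counts from the first SIFT row, the sketch gives values that round to the tabulated 0.326, 0.330, 0.789, 0.876, and 0.874, which supports the assumed definitions of NPrecision and NRecall.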

Threshold | Method | TP | FP | TN | FN | Precision | Recall | Accuracy |
---|---|---|---|---|---|---|---|---|
(Min + Max)/2 | SIFT | 610 (74) | 1401 (12) | 9643 (144) | 882 (506) | 0.303 (0.860) | 0.409 (0.128) | 0.818 (0.296) |
(Min + Max)/2 | Graph kernel | 208 (115) | 355 (13) | 10,689 (143) | 1284 (465) | 0.369 (0.898) | 0.139 (0.198) | 0.869 (0.351) |
(Min + Max)/2 | Graph entropy | 294 (75) | 672 (10) | 10,372 (146) | 1198 (505) | 0.304 (0.882) | 0.197 (0.129) | 0.851 (0.300) |
(Min + Max)/2 | KLD | 190 (74) | 442 (6) | 10,622 (150) | 1302 (506) | 0.310 (0.925) | 0.127 (0.128) | 0.862 (0.304) |
(Min + Max)/2 | Likelihood | 187 (73) | 426 (6) | 10,618 (150) | 1305 (507) | 0.305 (0.924) | 0.125 (0.126) | 0.862 (0.303) |
(Min + Max)/2 | Hypothesis testing | 819 (265) | 5401 (80) | 5643 (76) | 673 (315) | 0.132 (0.768) | 0.549 (0.457) | 0.515 (0.463) |
Mean | SIFT | 905 (126) | 4251 (55) | 6793 (101) | 587 (454) | 0.176 (0.696) | 0.607 (0.217) | 0.614 (0.308) |
Mean | Graph kernel | 480 (202) | 1505 (45) | 9539 (111) | 1012 (378) | 0.242 (0.818) | 0.322 (0.348) | 0.799 (0.425) |
Mean | Graph entropy | 774 (179) | 2379 (34) | 8665 (122) | 718 (401) | 0.245 (0.840) | 0.519 (0.309) | 0.753 (0.409) |
Mean | KLD | 577 (202) | 2409 (43) | 8635 (113) | 915 (378) | 0.193 (0.824) | 0.387 (0.348) | 0.735 (0.428) |
Mean | Likelihood | 582 (200) | 2485 (44) | 8559 (112) | 910 (380) | 0.190 (0.820) | 0.390 (0.345) | 0.729 (0.424) |
Mean | Hypothesis testing | 821 (265) | 5403 (80) | 5641 (76) | 671 (315) | 0.132 (0.768) | 0.550 (0.457) | 0.515 (0.463) |
Empirical threshold | SIFT | 629 (77) | 1127 (14) | 9917 (142) | 863 (503) | 0.358 (0.846) | 0.422 (0.133) | 0.841 (0.298) |
Empirical threshold | Graph kernel | 392 (177) | 891 (36) | 10,153 (120) | 1100 (403) | 0.306 (0.831) | 0.263 (0.305) | 0.841 (0.404) |
Empirical threshold | Graph entropy | 956 (203) | 3869 (52) | 7175 (104) | 536 (377) | 0.198 (0.796) | 0.641 (0.350) | 0.649 (0.417) |
Empirical threshold | KLD | 406 (131) | 1087 (18) | 9957 (138) | 1086 (449) | 0.272 (0.879) | 0.272 (0.226) | 0.827 (0.365) |
Empirical threshold | Likelihood | 549 (201) | 2175 (34) | 8869 (122) | 943 (379) | 0.202 (0.855) | 0.368 (0.347) | 0.751 (0.439) |
Empirical threshold | Hypothesis testing | 819 (265) | 5399 (80) | 5645 (76) | 673 (315) | 0.132 (0.768) | 0.549 (0.457) | 0.516 (0.463) |

Threshold | Method | TP | FP | TN | FN | Precision | Recall | Accuracy |
---|---|---|---|---|---|---|---|---|
(Min + Max)/2 | SIFT | 621 (63) | 1404 (9) | 9515 (272) | 1136 (252) | 0.307 (0.875) | 0.353 (0.200) | 0.800 (0.562) |
(Min + Max)/2 | Graph kernel | 254 (69) | 351 (17) | 10,568 (264) | 1503 (246) | 0.420 (0.802) | 0.145 (0.219) | 0.854 (0.559) |
(Min + Max)/2 | Graph entropy | 342 (27) | 661 (21) | 10,258 (260) | 1415 (288) | 0.341 (0.563) | 0.195 (0.086) | 0.836 (0.482) |
(Min + Max)/2 | KLD | 207 (57) | 415 (13) | 10,504 (268) | 1550 (258) | 0.333 (0.814) | 0.118 (0.181) | 0.845 (0.545) |
(Min + Max)/2 | Likelihood | 203 (57) | 420 (12) | 10,499 (269) | 1554 (258) | 0.326 (0.826) | 0.116 (0.181) | 0.844 (0.547) |
(Min + Max)/2 | Hypothesis testing | 983 (101) | 5436 (45) | 5483 (236) | 774 (214) | 0.153 (0.692) | 0.559 (0.321) | 0.510 (0.565) |
Mean | SIFT | 907 (124) | 4220 (86) | 6699 (195) | 850 (191) | 0.177 (0.590) | 0.516 (0.394) | 0.600 (0.535) |
Mean | Graph kernel | 585 (97) | 1510 (40) | 9409 (241) | 1172 (218) | 0.279 (0.708) | 0.333 (0.308) | 0.788 (0.567) |
Mean | Graph entropy | 862 (91) | 2385 (28) | 8534 (253) | 895 (224) | 0.265 (0.765) | 0.491 (0.289) | 0.741 (0.577) |
Mean | KLD | 680 (99) | 2407 (45) | 8512 (236) | 1077 (216) | 0.220 (0.688) | 0.387 (0.314) | 0.730 (0.562) |
Mean | Likelihood | 684 (98) | 2484 (45) | 8435 (236) | 1073 (217) | 0.216 (0.685) | 0.389 (0.311) | 0.720 (0.560) |
Mean | Hypothesis testing | 985 (101) | 5438 (45) | 5481 (236) | 772 (214) | 0.153 (0.692) | 0.561 (0.321) | 0.510 (0.565) |
Empirical threshold | SIFT | 640 (66) | 1129 (12) | 9790 (269) | 1117 (249) | 0.362 (0.846) | 0.364 (0.210) | 0.823 (0.562) |
Empirical threshold | Graph kernel | 473 (96) | 887 (40) | 10,032 (241) | 1284 (219) | 0.348 (0.706) | 0.269 (0.305) | 0.829 (0.565) |
Empirical threshold | Graph entropy | 1064 (95) | 3892 (29) | 7027 (252) | 693 (220) | 0.215 (0.766) | 0.606 (0.302) | 0.638 (0.582) |
Empirical threshold | KLD | 458 (79) | 1079 (26) | 9840 (255) | 1299 (236) | 0.298 (0.752) | 0.261 (0.251) | 0.812 (0.560) |
Empirical threshold | Likelihood | 649 (101) | 2164 (45) | 8755 (236) | 1108 (214) | 0.231 (0.692) | 0.369 (0.321) | 0.742 (0.565) |
Empirical threshold | Hypothesis testing | 983 (101) | 5434 (45) | 5485 (236) | 774 (214) | 0.153 (0.692) | 0.559 (0.321) | 0.510 (0.565) |

Real | Method | Estimated, (Min + Max)/2 | Estimated, Mean | Estimated, Empirical threshold |
---|---|---|---|---|
41.44 | SIFT | 41.94 | 106.74 | 36.94 |
41.44 | Graph kernel | 13.82 | 44.64 | 26.92 |
41.44 | Graph entropy | 21.02 | 67.32 | 101.6 |
41.44 | KLD | 13.84 | 64.62 | 32.84 |
41.44 | Likelihood | 13.84 | 66.22 | 59.18 |
41.44 | Hypothesis testing | 131.3 | 131.38 | 131.26 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Chae, H.; Kang, R.; Seok, H.-S. Unsupervised Detection of Changes in Usage-Phases of a Mobile App. Appl. Sci. 2020, 10, 3656. https://doi.org/10.3390/app10103656