Review
Peer-Review Record

Advanced Financial Fraud Malware Detection Method in the Android Environment

Appl. Sci. 2025, 15(7), 3905; https://doi.org/10.3390/app15073905
by Jaeho Shin 1,2, Daehyun Kim 1,2 and Kyungho Lee 1,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 12 February 2025 / Revised: 25 March 2025 / Accepted: 27 March 2025 / Published: 2 April 2025
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The comments are as follows:

  1. The study relies solely on data from a single South Korean bank, risking regional overfitting to local user behaviors or financial ecosystems. The lack of validation across diverse financial environments (e.g., payment habits, app designs in other countries) undermines the model’s generalizability.
  2. Malware samples are not classified by family or attack type (e.g., banking trojans, ransomware), preventing evaluation of the model’s effectiveness against specific advanced threats. This limits actionable insights for targeted defense strategies in real-world scenarios.
  3. The fixed 10% undersampling rate lacks theoretical justification, with no comparison to alternatives like SMOTE, ensemble learning, or cost-sensitive approaches. Random sampling may discard critical malicious patterns, compromising model robustness.
  4. While excluding app names due to multilingual complexity, the authors omit alternatives like Unicode pattern analysis or language-agnostic semantic extraction. This oversight may neglect critical features (e.g., spoofed bank names in app titles).
  5. No ablation study validates the independent impact of proposed features (e.g., user age/gender statistics). Performance gains could stem from data scale rather than feature innovation, leaving their necessity unproven.
  6. The model remains untested against financial malware evasion tactics (code obfuscation, dynamic loading, zero-day exploits). This overestimates real-world detection capability, particularly against APT-level threats.
  7. Partial dataset disclosure (GitHub subset) and incomplete feature engineering details (e.g., age/gender normalization) hinder independent verification. Full pipeline documentation and anonymization protocols are needed.
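To make the comparison requested in point 3 concrete, the following is a minimal sketch, not the authors' pipeline and not part of the reviewer's report. It assumes the "10% rate" means undersampling benign samples until malicious samples make up about 10% of the training set, and it contrasts that with a cost-sensitive alternative that keeps all data and reweights classes. The function names, the label encoding (1 = malicious, 0 = benign), and the toy class counts are all hypothetical.

```python
import random
from collections import Counter


def undersample_majority(labels, minority_fraction, seed=0):
    """Return sorted indices that keep every minority sample and a random
    subset of majority samples, so that the minority class makes up roughly
    `minority_fraction` of the result (assumed encoding: 1 = malicious)."""
    if not 0 < minority_fraction < 1:
        raise ValueError("minority_fraction must be in (0, 1)")
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    # Number of majority samples needed to reach the requested ratio.
    target = round(len(minority) * (1 - minority_fraction) / minority_fraction)
    keep = min(target, len(majority))
    rng = random.Random(seed)
    return sorted(minority + rng.sample(majority, keep))


def balanced_class_weights(labels):
    """Cost-sensitive alternative: weight = n_samples / (n_classes * count),
    so no sample is discarded and rare classes cost more to misclassify."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Hypothetical toy data: 20 malicious and 980 benign samples.
labels = [1] * 20 + [0] * 980
kept = undersample_majority(labels, minority_fraction=0.10)
weights = balanced_class_weights(labels)
print(len(kept), weights)
```

Undersampling discards most benign samples here, whereas the weighting approach retains them; evaluating both on the same held-out split would be one way to address the reviewer's concern.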
Comments on the Quality of English Language

none

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

1) Abstract – Please better clarify the meaning of the following statement: “Moreover, 92 datasets were compiled through daily training 16 to select the optimal model, with five ML algorithms used to evaluate the proposed approach.”

2) Please avoid anticipating too many numerical results in the abstract.

3) In my opinion, the authors should compress the statement of contributions into 3-4 main innovations provided by this work.

4) There is a typo in Fig. 3: “hyperprameter search”.

5) The following papers on dynamic analysis are missing from the literature review. It should also be discussed that incremental learning is essential in malware analysis, since new malware signatures must be continually incorporated into the ML model, e.g.:


Xu, Xiaohu, et al. "Advancing malware detection in network traffic with self-paced class incremental learning." IEEE Internet of Things Journal (2024). 

Cerasuolo, Francesco, et al. "Adaptable, incremental, and explainable network intrusion detection systems for internet of things." Engineering Applications of Artificial Intelligence 144 (2025): 110143.


6) Beyond malware detection, it would be very useful if the authors could at least discuss how the proposed framework could also be applied to classify malware types, so that appropriate countermeasures can be taken.


7) The authors should clarify whether the real data from Bank A will be released (in anonymized form) for reproducibility purposes.

8) In addition to the end-to-end malware detection pipeline, it would be useful if the authors could provide some statistical evaluation (e.g. histograms) of the features derived from static analysis, to give a snapshot of the challenges associated with the considered dataset.

9) In Sec. 5, as a potential avenue of research, the authors may also want to mention explainable AI for interpreting the results of the proposed ML-based detection pipeline, e.g. following:

Nascita, Alfredo, et al. "A Survey on Explainable Artificial Intelligence for Internet Traffic Classification and Prediction, and Intrusion Detection." IEEE Communications Surveys & Tutorials (2024).
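The per-class feature statistics requested in point 8 could be produced along the following lines. This is only an illustrative sketch: the feature (number of requested permissions per app), the sample values, and the function names are hypothetical and are not taken from the manuscript.

```python
from collections import Counter


def binned_counts(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    return Counter((v // bin_width) * bin_width for v in values)


def text_histogram(values, bin_width, bar_char="#"):
    """Render a plain-text histogram, one line per non-empty bin."""
    counts = binned_counts(values, bin_width)
    lines = []
    for low in sorted(counts):
        high = low + bin_width - 1
        lines.append(f"{low:>3}-{high:<3} {bar_char * counts[low]}")
    return "\n".join(lines)


# Hypothetical static feature: permissions requested by each app.
benign = [3, 5, 7, 8, 12, 4, 6]
malicious = [18, 22, 25, 9, 31]

print("benign")
print(text_histogram(benign, bin_width=5))
print("malicious")
print(text_histogram(malicious, bin_width=5))
```

Plotting the two classes side by side for each static feature would show how much their distributions overlap, which is the kind of dataset snapshot the comment asks for.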

Comments on the Quality of English Language

Can be further improved.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

No further comments.

Comments on the Quality of English Language

none

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have satisfactorily addressed my previous concerns.
