Article
Peer-Review Record

Wangiri Fraud Detection: A Comprehensive Approach to Unlabeled Telecom Data

Future Internet 2026, 18(1), 15; https://doi.org/10.3390/fi18010015 (registering DOI)
by Amirreza Balouchi 1, Meisam Abdollahi 1,*, Ali Eskandarian 2, Kianoush Karimi Pour Kerman 3, Elham Majd 1, Neda Azouji 1 and Amirali Baniasadi 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 26 October 2025 / Revised: 17 December 2025 / Accepted: 22 December 2025 / Published: 27 December 2025
(This article belongs to the Special Issue Cybersecurity in the Age of AI, IoT, and Edge Computing)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors
  1. The writers have chosen a good domain.
  2. The Introduction and related work section have been framed well. Gaps were identified very well.
  3. Figure 3 relates to how the data were processed, but it needs to be clarified to what extent these data are used by the proposed model. Content is needed on this point.
  4.  What are the features that were considered, and what are the correlated features that identify the fraud label?
  5. The entire content covers only data preprocessing and feature engineering.
  6. There is no proposed model. The model should be novel, and the paper should present the flow of the model, the algorithm, how it was implemented, and an explanation.
  7. The models used in the article are baseline models. To achieve the desired outcome, an optimized or hybrid model is required.
  8. On what basis were the values in Tables 5 to 11 generated? A proper explanation is needed of how the features in the data were used to reach these metrics.
  9. Figure 14 needs to be cross-checked, and it should be replaced.
  10. Table 12 needs more citations to establish that the proposed model is the best.
  11. There is too much unrelated data, and it is not shown well how the work achieves what its title promises. This needs substantial improvement.
  12. The conclusion section should be modified as per the results obtained.

Author Response

Responses to Reviewer 1

Comment 1: The writers have chosen a good domain.

Response: We sincerely thank the reviewer for recognizing the importance and relevance of the Wangiri fraud detection domain. We agree that this is a critical area for telecommunications security.

---------------------------------------------------------------------------------------------------------------------

Comment 2: The Introduction and related work section have been framed well. Gaps were identified very well.

Response: We appreciate the reviewer’s positive feedback regarding the framing of the Introduction and the identification of research gaps in the Related Work section.

---------------------------------------------------------------------------------------------------------------------

Comment 3: Figure 3 relates to how the data were processed, but it needs to be clarified to what extent these data are used by the proposed model. Content is needed on this point.

Response: We thank the reviewer for this valid observation. We have updated the text to explicitly link the feature correlations shown in Figure 3 to the architectural choices of our proposed model. We clarify that the weak correlations necessitated the use of non-linear tree-based models (XGBoost) rather than linear baselines.

Changes made (Page 19, Line 548): In Section 4 (Methodology), under the newly renamed subsection "Proposed Optimized Ensemble Framework," we added: "The correlations observed in Figure 3 were pivotal in defining the model architecture... the weak monotonic correlations of timing features suggest non-linear dependencies; therefore, our framework utilizes gradient boosting."

---------------------------------------------------------------------------------------------------------------------

Comment 4: What are the features that were considered, and what are the correlated features that identify the fraud label?

Response: We appreciate this question. To address this, we have explicitly highlighted the specific engineered features that were most predictive of the fraud label in both the Methodology and Conclusion sections.

Changes made (Page 41, Line 1079): In Section 6 (Conclusion), we added: "Our analysis identified specific correlations that distinguish fraudulent activity, most notably the high frequency of unique_calls_last_day combined with specific temporal patterns in acm_time and cpg_time."
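
To make this concrete, the snippet below is a minimal, hypothetical sketch of checking how such engineered features co-vary with the fraud label; the DataFrame contents are synthetic stand-ins, and only the column names (unique_calls_last_day, acm_time, cpg_time) follow those cited in this response.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10_000
# Synthetic stand-ins for the engineered CDR features named above (values are illustrative).
df = pd.DataFrame({
    "unique_calls_last_day": rng.poisson(3, n),
    "acm_time": rng.exponential(2.0, n),   # timing feature
    "cpg_time": rng.exponential(1.0, n),   # timing feature
})
# Illustrative labeling rule so the correlation check has something to find.
df["fraud_label"] = ((df["unique_calls_last_day"] > 7) & (df["acm_time"] < 1.0)).astype(int)

# Spearman correlation captures monotonic (possibly non-linear) association with the label.
print(df.corr(method="spearman")["fraud_label"].drop("fraud_label"))
```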

---------------------------------------------------------------------------------------------------------------------

Comment 5: The entire content covers only data preprocessing and feature engineering.

Response: We are grateful for this feedback. We acknowledge that the balance of the paper leaned too heavily on preprocessing. We have revised the Methodology section to place greater emphasis on the model architecture, optimization strategies, and the ensemble framework itself.

Changes made (Page 19, Line 548): We have renamed "Model Development" to "Proposed Optimized Ensemble Framework" and expanded the text to detail the three-stage pipeline (Labeling, Encoding, and Hyperparameter Optimization), shifting the focus toward the modeling aspect.

---------------------------------------------------------------------------------------------------------------------

Comment 6: There is no proposed model. The model should be novel, and the paper should present the flow of the model, the algorithm, how it was implemented, and an explanation.

Response: We thank the reviewer for highlighting this lack of clarity. We have restructured the methodology to present our entire pipeline (Unsupervised Labeling + Feature Engineering + Optimized Ensemble) as the novel "Proposed Framework," rather than simply presenting it as a list of standard classifiers.

Changes made (Page 19, Line 548): In Section 4, we added the subsection "Proposed Optimized Ensemble Framework" which explicitly outlines the flow of the model: "The framework operates in three distinct stages: Heuristic-Based Weak Labeling, Feature Interaction Encoding, and Hyperparameter-Optimized Ensemble."
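
As an illustration of this three-stage flow, the sketch below outlines one possible implementation; the labeling rules, feature names, and hyperparameters are assumptions made for illustration, not the paper's exact choices.

```python
import numpy as np
import pandas as pd
from xgboost import XGBClassifier

def stage1_weak_labels(cdr: pd.DataFrame) -> pd.Series:
    # Stage 1, heuristic-based weak labeling: very short calls fanned out to many callees.
    return ((cdr["call_duration"] < 5) & (cdr["unique_calls_last_day"] > 8)).astype(int)

def stage2_encode(cdr: pd.DataFrame) -> pd.DataFrame:
    # Stage 2, feature interaction encoding: combine timing and frequency attributes.
    X = cdr[["acm_time", "cpg_time", "unique_calls_last_day"]].copy()
    X["acm_cpg_gap"] = X["cpg_time"] - X["acm_time"]
    return X

def stage3_fit(X: pd.DataFrame, y: pd.Series) -> XGBClassifier:
    # Stage 3, hyperparameter-optimized ensemble (fixed illustrative settings here).
    spw = float((y == 0).sum()) / max(int((y == 1).sum()), 1)
    model = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1,
                          scale_pos_weight=spw, eval_metric="logloss")
    model.fit(X, y)
    return model

# Tiny synthetic CDR table so the sketch runs end to end.
rng = np.random.default_rng(0)
cdr = pd.DataFrame({
    "call_duration": rng.exponential(60.0, 5_000),
    "unique_calls_last_day": rng.poisson(5, 5_000),
    "acm_time": rng.exponential(2.0, 5_000),
    "cpg_time": rng.exponential(1.0, 5_000),
})
model = stage3_fit(stage2_encode(cdr), stage1_weak_labels(cdr))
```

In the actual framework, the third stage would be driven by the grid search described in the response to Comment 7 rather than by fixed settings.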

---------------------------------------------------------------------------------------------------------------------

Comment 7: The models used in the article are baseline models. To achieve the desired outcome, an optimized or hybrid model is required.

Response: We agree with the reviewer that standard baselines are insufficient. We have clarified that our approach involves a hybrid pipeline using "Weak Supervision" (heuristic labeling) combined with "Hyperparameter-Optimized" XGBoost/Random Forest models, moving beyond default settings.

Changes made (Page 20, Line 567): In the Methodology section, we explicitly stated: "We move beyond default baseline settings by employing a rigorous grid search (GridSearchCV) to optimize the regularization parameters... This validates that the 'Proposed Model' is not simply a comparison of classifiers, but a hybrid, imbalance-aware decision framework."
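
A minimal sketch of the kind of grid search this refers to is shown below; the parameter grid, synthetic data, and settings are illustrative assumptions rather than the paper's exact configuration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Synthetic, skewed data standing in for the telecom feature set (illustrative only).
X, y = make_classification(n_samples=10_000, n_features=10, weights=[0.98], random_state=0)

param_grid = {
    "max_depth": [4, 6, 8],
    "learning_rate": [0.05, 0.1],
    "reg_alpha": [0.0, 0.1, 1.0],      # L1 regularization
    "reg_lambda": [1.0, 5.0],          # L2 regularization
    "scale_pos_weight": [1, 25, 50],   # imbalance-aware weighting
}

search = GridSearchCV(
    estimator=XGBClassifier(n_estimators=300, eval_metric="logloss"),
    param_grid=param_grid,
    scoring="roc_auc",
    cv=3,
    n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 4))
```

---------------------------------------------------------------------------------------------------------------------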

Comment 8: On what basis were the values in Tables 5 to 11 generated? A proper explanation is needed of how the features in the data were used to reach these metrics.

Response: We thank the reviewer for requesting this clarification. We have added text explaining that the metrics were generated based on the balanced test set to ensure a fair assessment of the 9 engineered features' discriminatory power.

Changes made (Page 41, Line 1079): In the newly added "Comparison with State-of-the-Art" section, we state: "The values presented... were generated by evaluating the model on the balanced test set to ensure a fair assessment of class discrimination capabilities."
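
For illustration, the snippet below sketches the general idea of evaluating on a balanced test set by undersampling negatives down to the number of positives; the arrays are synthetic stand-ins, not the paper's data or results.

```python
import numpy as np
from sklearn.metrics import classification_report, roc_auc_score

rng = np.random.default_rng(0)
# Synthetic stand-ins for held-out labels, model scores, and predictions (illustrative only).
y_test = rng.binomial(1, 0.01, size=20_000)
y_score = np.clip(rng.normal(0.2 + 0.6 * y_test, 0.2), 0.0, 1.0)
y_pred = (y_score >= 0.5).astype(int)

# Balance the test set: keep all positives, sample an equal number of negatives.
pos = np.flatnonzero(y_test == 1)
neg = rng.choice(np.flatnonzero(y_test == 0), size=pos.size, replace=False)
idx = np.concatenate([pos, neg])

print(classification_report(y_test[idx], y_pred[idx], digits=3))
print("ROC-AUC:", roc_auc_score(y_test[idx], y_score[idx]))
```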

---------------------------------------------------------------------------------------------------------------------

Comment 9: Figure 14 needs to be cross-checked, and it should be replaced.

Response: We appreciate the reviewer’s attention to detail. We have cross-checked all figures, including Figure 14, to ensure the captions, axes, and data representations accurately reflect the results of the optimized XGBoost model.

---------------------------------------------------------------------------------------------------------------------

Comment 10: Table 12 needs more citations to establish that the proposed model is the best.

Response: We agree that a direct comparison was necessary. We have introduced a new comparison table with citations to prior art to contextualize our results.

Changes made (Page 40, Table 12): We added a new subsection "Comparison with State-of-the-Art" in the Results section, containing Table 8, which compares our ROC-AUC (99.7%) against the results reported in prior work such as Sahin et al. (2011) and Arafat et al. (2019), as well as other newly added references.

---------------------------------------------------------------------------------------------------------------------

Comment 11: There is too much unrelated data, and it is not shown well how the work achieves what its title promises. This needs substantial improvement.

Response: We thank the reviewer for this comment. We have revised the text to ensure that every data point and feature discussed is directly linked to the paper's title and the specific goal of detecting Wangiri fraud.

Changes made (Page 41, Line 1079): We refined the Conclusion to explicitly link the "signaling data" to "Wangiri fraud risks," reinforcing that the data analysis is tightly coupled with the research objective.

---------------------------------------------------------------------------------------------------------------------

Comment 12: The conclusion section should be modified as per the results obtained.

Response: We agree completely. We have rewritten the conclusion to be data-driven, citing specific performance metrics and clearly summarizing the contributions.

Changes made (Page 41, Line 1087): Section 6 (Conclusion) has been completely overhauled to include specific results (e.g., "ROC-AUC > 0.99"), limitations, and the specific impact of the proposed pipeline.

 

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Only the Conclusion section needs a little improvement.

Comments for author File: Comments.pdf

Author Response

Response to Reviewer 2

Comment 1: Only the Conclusion section needs a little improvement.

Response: We thank the reviewer for this helpful suggestion. We have significantly expanded the Conclusion section to provide a more comprehensive summary of the study.

Changes made (Page 41, Line 1079): We have rewritten Section 6 (Conclusion) to include a clear summary of findings and added two distinct subsections: "6.1 Limitations of the Study" and "6.2 Future Works" to provide a complete closure to the paper.

---------------------------------------------------------------------------------------------------------------------

Comment 2 (In the manuscript): Abstract is excellently written as it covers the standard abstract template correctly. Motivation / Background, Problem / Gap, Methods / Approach, Results, Conclusion / Significance.

Response: We sincerely thank the reviewer for the positive feedback on our abstract. We are delighted that the reviewer finds it excellently written and well-structured, properly covering the key elements.

---------------------------------------------------------------------------------------------------------------------

Comment 3 (In the manuscript): Introduction: also no comments from my end, because you have written a strong introduction covering Broad context / importance of the topic, Specific background of the phenomenon being studied, Statement of the problem or gap, Why the problem matters (impact), What previous work has done (briefly), What this paper contributes (thesis / purpose statement), and Structure of the paper.

Response: We sincerely thank the reviewer for the very positive evaluation of our Introduction section. We are grateful that the reviewer considers it strong and complete, successfully addressing all the essential components.

---------------------------------------------------------------------------------------------------------------------

Comment 4 (In the manuscript): Very well written Literature Review. You have covered all the below areas excellently, Context / Background, Clear statement of the core problem, Why the problem matters (consequences), What prior approaches lack, What is needed going forward.

Response: We sincerely thank the reviewer for the highly positive assessment of our Literature Review section. We are delighted that the reviewer finds it very well written and excellent in its coverage of all key elements.

---------------------------------------------------------------------------------------------------------------------

Comment 5 (In the manuscript): Excellently covered all the below for problem statement: Context / Background, Clear statement of the core problem, Why the problem matters (consequences), What prior approaches lack, What is needed going forward.

Response: We sincerely thank the reviewer for the excellent rating of our Problem Statement section. We are very pleased that the reviewer finds it comprehensively and excellently addresses all the essential components.

---------------------------------------------------------------------------------------------------------------------

Comment 6 (In the manuscript): For the conclusion, my review is below: Clear restatement of the problem – Present; Summary of methods – Present; Summary of key findings – Present; Evaluation of significance / contributions – Present; Limitations of the study – MISSING; Future work directions – Present.

Response: We thank the reviewer for this helpful suggestion. We have significantly expanded the Conclusion section to provide the limitations of the study.

Changes made (Page 41, Line 1079): We have rewritten Section 6 (Conclusion) to include a clear summary of findings and added two distinct subsections: "6.1 Limitations of the Study" and "6.2 Future Works" to provide a complete closure to the paper.

 

 

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors
  1. Explain how the unlabelled data challenge necessitates exploration of semi-supervised, weakly supervised, and self-supervised approaches.
  2. Need a clear explanation of how models exploit abundant negatives under skew and capture non-linear structure.
  3. Explain how models exploit abundant negatives under skew and capture non-linear structure. 
  4. The values in the confusion matrix do not match the collected data.

Author Response

Response to Reviewers

We would like to thank the editor and the reviewers for their time and insightful comments. We have carefully reviewed all suggestions and revised the manuscript accordingly. Below is a point-by-point response to the specific concerns raised.

 

Response to Reviewer 1

Comment 1: Explain how the unlabelled data challenge necessitates exploration of semi-supervised, weakly supervised, and self-supervised approaches.

Response: We thank the reviewer for this important observation. We have expanded our discussion and added a new subsection, "4.4.3. Learning Approach", to clarify the motivation for exploring semi-supervised, weakly supervised, and self-supervised approaches in the context of unlabeled data challenges.

Changes made (Page 21, Section 4.4.3, "Learning Approach"): We added: "A critical challenge in deploying AI for Wangiri fraud detection is the scarcity and unreliability of labeled data, which necessitates exploration beyond fully supervised learning toward semi-supervised, weakly supervised, and self-supervised approaches. High detection accuracy is essential in this context, as misclassifications directly translate into financial loss or degraded user trust; however, purely unsupervised methods primarily focus on identifying generic anomalies rather than discriminating Wangiri fraud from other irregular but legitimate calling behaviors. In practice, this limitation is significant because fraud detection requires class-specific reasoning rather than deviation-based detection alone.

Moreover, many unsupervised techniques, such as k-nearest neighbors or DBSCAN, impose substantial operational constraints by requiring access to the training data during inference, making them unsuitable for real-time, large-scale telecom environments. These methods also typically rely on heuristic or distance-based similarity measures, which struggle to capture the complex relational patterns and heterogeneous feature types present in telecom data, including categorical, temporal, and behavioral attributes. Our early experiments with Isolation Forest further highlighted these shortcomings, yielding suboptimal performance and high false-positive rates.

Consequently, these observations motivated a shift toward weakly supervised labeling strategies, which better align with the practical constraints of telecom fraud detection while enabling models to learn fraud-specific patterns from imperfect but informative supervisory signals.”
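
To illustrate the contrast drawn here, the snippet below sketches a deviation-based Isolation Forest baseline of the kind the response refers to; the data and parameters are synthetic and illustrative, not those of the early experiments described in the manuscript.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score

# Synthetic, highly imbalanced data standing in for CDR-derived features (illustrative only).
X, y = make_classification(n_samples=20_000, n_features=10, weights=[0.99], random_state=0)

# Deviation-based detector: it flags generic anomalies rather than Wangiri-specific behavior.
iso = IsolationForest(n_estimators=200, contamination=0.01, random_state=0).fit(X)
anomaly_score = -iso.score_samples(X)  # higher value = more anomalous
print("Unsupervised ROC-AUC:", roc_auc_score(y, anomaly_score))
```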

---------------------------------------------------------------------------------------------------------------------

Comment 2: Need a clear explanation of how models exploit abundant negatives under skew and capture non-linear structure.

Response: We appreciate the reviewer's request for clarification on these important aspects of our model. In response, we emphasize that the selected models are explicitly designed to leverage the characteristics of highly skewed telecom data while capturing complex fraud patterns. We have revised the manuscript to provide a more detailed explanation of these mechanisms.

Changes made (Page 39, Line 1030): We added: "The selected models are explicitly designed to leverage the characteristics of highly skewed telecom data while capturing complex fraud patterns. Wangiri fraud detection benefits from the abundance of negative (legitimate) samples, as accurately modeling normal calling behavior is essential for distinguishing fraudulent activity. Tree-based ensemble methods, particularly Random Forest and XGBoost, exploit this setting by learning hierarchical, non-linear decision rules that naturally incorporate interactions among heterogeneous features, including categorical and behavioral attributes. Unlike linear or distance-based methods, these models do not assume simple feature relationships and are therefore better suited to the structured and relational nature of call detail records. Moreover, their ensemble formulation improves robustness under severe class imbalance by reducing variance and limiting bias toward the majority class. To further assess this robustness, we applied SMOTE-based rebalancing and observed consistent performance, confirming that the models' effectiveness is not solely driven by class distribution. These properties explain why XGBoost and Random Forest achieved the best and most stable results in our experiments and justify their selection for Wangiri fraud detection under realistic, imbalanced conditions."
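
The SMOTE robustness check described in this passage could look roughly like the sketch below; the data, model settings, and comparison are illustrative assumptions, not the paper's actual experiment.

```python
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic, skewed data standing in for the telecom feature set (illustrative only).
X, y = make_classification(n_samples=20_000, n_features=10, weights=[0.99], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

# Train once on the original skewed data and once on SMOTE-rebalanced data.
base = XGBClassifier(eval_metric="logloss").fit(X_tr, y_tr)
X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
rebalanced = XGBClassifier(eval_metric="logloss").fit(X_res, y_res)

# Similar held-out ROC-AUC for both runs would support the robustness claim above.
for name, model in [("skewed", base), ("SMOTE", rebalanced)]:
    print(name, roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```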

---------------------------------------------------------------------------------------------------------------------

Comment 3: Explain how models exploit abundant negatives under skew and capture non-linear structure.

Response: This comment duplicates Comment 2 and has therefore already been addressed above.

---------------------------------------------------------------------------------------------------------------------

Comment 4: The values in the confusion matrix do not match the collected data.

Response: We sincerely thank the reviewer for this careful observation. As noted, the dataset was divided into 70% training (8,166,928 records), 10% validation (1,166,704 records), and 20% testing (2,333,408 records). In all confusion matrices, we initially displayed the True Negative (TN) values in scientific notation (E notation) due to their large magnitude. We recognize that this notation may have caused confusion, so we have updated all figures to display these values in standard notation. This change makes it easier to verify that the total number of records is consistent across all confusion matrices.

Changes made: The left subfigures in Figures 14, 17, 20, 26, and 29 now use standard value notation instead of E notation.
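
As an aside, this switch from E notation to full integer counts can be reproduced in scikit-learn by passing a values_format string to ConfusionMatrixDisplay; the counts below are illustrative placeholders that merely sum to the stated test-set size of 2,333,408 records.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# Illustrative placeholder counts that sum to the stated 2,333,408 test records.
cm = np.array([[2_300_000, 5_000],
               [3_000, 25_408]])

disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=["legitimate", "wangiri"])
# ",d" renders full integer counts with thousands separators instead of E notation.
disp.plot(values_format=",d")
plt.show()
```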
