Article

Wangiri Fraud Detection: A Comprehensive Approach to Unlabeled Telecom Data

1 Department of Electrical & Computer Engineering, University of Victoria, Victoria, BC V8P 5C2, Canada
2 Computer Engineering Department, Shiraz University, Shiraz 8433471946, Iran
3 Department of Telecommunication Engineering, Islamic Azad University, Tehran 1477893855, Iran
* Author to whom correspondence should be addressed.
Future Internet 2026, 18(1), 15; https://doi.org/10.3390/fi18010015
Submission received: 26 October 2025 / Revised: 17 December 2025 / Accepted: 22 December 2025 / Published: 27 December 2025
(This article belongs to the Special Issue Cybersecurity in the Age of AI, IoT, and Edge Computing)

Abstract

Wangiri fraud is a pervasive telecommunications scam that exploits missed calls to lure victims into dialing premium-rate numbers, resulting in significant financial losses for operators and consumers. This paper presents a comprehensive machine learning framework for detecting Wangiri fraud in highly imbalanced and unlabeled Call Detail Record (CDR) datasets. We introduce a novel unsupervised labeling approach using domain-driven heuristics, coupled with advanced feature engineering to capture temporal, geographic, and behavioral patterns indicative of fraud. To address severe class imbalance, we evaluate multiple sampling strategies, such as the Synthetic Minority Over-sampling Technique (SMOTE) and undersampling, and compare the performance of Logistic Regression, Decision Trees, Random Forest, XGBoost, and Multi-Layer Perceptron (MLP). Our results demonstrate that ensemble methods, particularly Random Forest and XGBoost, achieve near-perfect accuracy (e.g., Receiver Operating Characteristic Area Under the Curve (ROC-AUC) > 0.99) on balanced data while maintaining interpretability. The proposed pipeline offers a scalable and practical solution for real-time fraud detection, providing telecom operators with an effective tool to mitigate Wangiri fraud risks.

1. Introduction

Artificial Intelligence (AI) and data science are transforming the telecom industry by enabling intelligent automation, improved decision-making, and greater operational efficiency [1,2]. Telecom networks generate vast amounts of data from CDRs [3], user activity [4], signal measurements [5], and Internet of Things (IoT) devices [6]. Data science techniques, combined with AI and machine learning models, are essential for processing and analyzing this data to derive actionable insights. One key application is predictive maintenance, where AI models analyze equipment and network logs to anticipate failures and reduce downtime [7]. In customer service, AI-powered virtual assistants and chatbots provide 24/7 support and resolve issues quickly, while recommendation engines personalize service offerings based on subscriber behavior [8].
Advanced use cases include AI-based modulation classification [9], which identifies modulation schemes in received signals, aiding in interference management and signal decoding in complex environments [10]. Serving cell positioning leverages AI algorithms to estimate user locations based on signal characteristics, supporting location-based services, emergency response, and network optimization [11]. AI is also used in traffic forecasting [12], enabling proactive resource allocation [13] and anomaly detection for real-time security monitoring and fraud prevention [14]. Self-organizing networks (SONs) use AI for dynamic configuration and optimization, reducing manual intervention [15]. Additionally, machine learning models support churn prediction [16], pricing optimization [17], and customer segmentation [18], empowering operators to make data-driven business decisions and enhance the quality of experience (QoE) for users [19].
Advances in telecommunications technology have transformed global connectivity, allowing billions of users to communicate seamlessly across borders. Yet, these innovations have also introduced vulnerabilities, making telecom fraud one of the most pressing challenges in the industry. Within this landscape, Wangiri fraud, a deceptive scheme exploiting missed calls, has risen as a particularly harmful threat, causing substantial financial losses for both mobile operators and consumers [20].
The term Wangiri (Japanese for “one ring and cut”) refers to an advanced fraud scheme in which attackers place short-duration calls to random or targeted phone numbers, deliberately terminating the call after one or two rings. By exploiting the natural human inclination to return missed calls, fraudsters lure victims into dialing premium-rate numbers, thereby generating illicit revenue. Despite its simplicity, this attack vector remains highly effective due to its psychological manipulation [21]. The structure of the Wangiri fraud process is shown in Figure 1. The fraud process begins when a premium-rate account places a call to an ordinary phone number. When the victim returns the missed call, they are charged at an international premium rate.
Wangiri fraud imposes a substantial financial burden on the telecommunications industry, with annual global losses exceeding $40 billion according to recent industry analyses [22]. As a particularly costly form of telecom fraud, Wangiri scams contribute significantly to this staggering total. The attacks’ distributed architecture, typically involving millions of brief call attempts across international networks, presents unique detection challenges that conventional rule-based systems struggle to address effectively [23].
Modern telecommunications networks present a complex ecosystem of heterogeneous technologies, multiple service providers, and international roaming agreements, characteristics that create numerous vulnerabilities for Wangiri fraud exploitation. This complexity renders traditional rule-based fraud detection systems increasingly ineffective, as their static thresholds and predefined rules cannot adequately adapt to evolving attack patterns, frequently leading to either excessive false positives or undetected fraud cases [24].
Furthermore, a critical challenge in telecom fraud detection stems from the pervasive lack of reliable labeled training data. While supervised learning thrives in domains with abundant annotated datasets, operational telecom systems frequently lack complete or accurate fraud labels. This constraint demands novel detection methodologies capable of uncovering fraudulent patterns through alternative approaches beyond conventional supervised learning paradigms [25].
The main contributions of the paper are as follows:
  • Novel Unsupervised Labeling Approach: Development of an innovative methodology for generating reliable training labels from unlabeled telecommunications data using statistical anomaly detection combined with domain expertise.
  • Comprehensive Feature Engineering Framework: Creation of a systematic approach to feature extraction that captures multiple dimensions of calling behavior relevant to Wangiri fraud detection.
  • Interpretable Ensemble Architecture: Design of an ensemble learning system that balances high performance with interpretability requirements specific to telecommunications fraud detection.
  • Imbalanced Data Handling Techniques: Development of specialized techniques for managing highly imbalanced datasets in telecommunications fraud detection scenarios.
The rest of the paper is organized as follows. Section 2 reviews the most important related work in the Wangiri fraud detection area. Section 3 discusses the aspects of the problem. The methodology is presented in Section 4, followed by Section 5, where the most significant experimental results are discussed. Finally, Section 6 concludes the paper.

2. Literature Review

The dynamic landscape of telecommunication fraud, characterized by the rapid evolution of malicious techniques and the vast scale of telecom data, has prompted extensive research into fraud detection solutions. The increasing complexity and economic impact of fraud have driven the adoption of intelligent technologies, particularly in data analytics, machine learning, deep learning, and language modeling. This section presents a comprehensive review of relevant studies grouped into five thematic categories:
1. Traditional and Supervised Machine Learning Approaches;
2. Deep Learning and Neural Network Architectures;
3. Graph-Based and Visualization Methods;
4. Large Language Model (LLM)-Based Conversational and Semantic Analysis;
5. Policy, Regulation, and Regional Focus.

2.1. Traditional and Supervised Machine Learning Approaches

Sahin et al. [26] conducted an empirical study using a range of supervised machine learning algorithms, including Decision Trees, Support Vector Machine (SVM), and Random Forests, applied to anonymized telecom datasets. Their goal was to benchmark model performance in fraud detection scenarios, identifying decision trees as interpretable and efficient under certain conditions. The study highlighted the importance of preprocessing steps such as normalization and feature selection, concluding that no single model universally outperforms across all fraud types. Their work laid a foundational baseline for integrating classical Machine Learning (ML) tools in fraud analytics.
Arafat et al. [27] addressed Wangiri fraud detection by employing ensemble learning techniques, combining classifiers like Random Forest, AdaBoost, and Gradient Boosting. Their ensemble approach significantly reduced false positives and improved model stability across unbalanced datasets. Notably, the study provided a breakdown of fraud patterns based on call duration, frequency, and originating number. The use of ensemble techniques was proven beneficial in handling the subtle variances present in scam calls, making their approach suitable for large-scale deployments.
Birhanu [28] developed a real-time Subscriber Identity Module (SIM)-box fraud detection framework implemented at Ethio Telecom. The system was built around three classifiers (Random Forest, Support Vector Machine, and Neural Network) and trained on datasets segmented into one-hour, daily, and weekly slices. With Random Forest and Neural Network models achieving 100% accuracy, the framework demonstrated exceptional performance. Additionally, the study emphasized system integration and real-time data flow, making it a strong case study for practical, deployable fraud solutions.
Krasic and Celar [29] tackled the prevalent issue of data imbalance in fraud detection by incorporating SMOTE. Using real-world data from telecom environments, they applied SVMs and Decision Trees on oversampled datasets. Their analysis showed considerable improvement in F1 scores and detection precision. The study’s contribution is critical in highlighting that the skewed distribution of fraud cases, common in most telecom data, necessitates careful preprocessing to ensure classifier efficacy.
Ravi et al. [30] presented a multi-faceted approach to detecting Wangiri fraud, defining three distinct fraud patterns derived from year-long CDR analysis. They tested various supervised and unsupervised methods, including k-Means clustering and decision trees, and found that supervised classifiers consistently outperformed their unsupervised counterparts in detecting callback scams. Their comprehensive pattern definitions enhanced the granularity of detection strategies and laid the groundwork for fraud-specific model tuning.
Liang et al. [31] introduced the Telecom Fraud Detection model based on Feature Binning and Autoencoder (TFD-FA). Their framework bypassed traditional Graph Neural Networks (GNN) limitations by partitioning users into telecom-specific scenarios via feature binning, followed by neighbor feature aggregation through an autoencoder. The model also included an imbalance-aware classifier. Tested on Guangdong Unicom’s real dataset, TFD-FA outperformed baseline GNNs and decision trees, particularly in scenarios lacking complete node attributes. Their contribution presents a practical alternative for environments constrained by data availability.

2.2. Deep Learning and Neural Network Architectures

Wahid et al. [32] introduced the Neural Factorization Autoencoder (NFA), a hybrid architecture that combines neural factorization machines with autoencoders to capture long-term behavioral trends in telecom users. Their model integrated a memory module, enabling it to retain temporal fraud patterns. Evaluated on a real telecom dataset comprising over 670,000 calls, the NFA achieved an F1-score of 95.45% and an AUC of 91.06%, outperforming standard models. Their work demonstrated the feasibility of deploying deep learning models in streaming environments with robust adaptation to concept drift.
Hu et al. [21] proposed a novel fraud detection technique using GNNs with an imbalance-aware mechanism. They identified that traditional GNNs underperform when fraud cases are sparse due to neighborhood dilution by benign users. Their solution integrated reinforcement learning-based neighbor sampling and focal loss to improve minority class detection. This method was tested on two large-scale datasets and achieved competitive performance, particularly in recall and precision for fraudsters. Their innovation provides a blueprint for scaling GNNs in telecom contexts with data imbalance.

2.3. Graph-Based and Visualization Methods

CallMine, developed by Cazzolato et al. [33], is a scalable fraud detection and visualization tool that constructs and analyzes call graphs. By applying network analysis techniques, such as degree centrality and community detection, it uncovers patterns like “black holes” (users who receive but do not return calls) and “ghost chasers” (users who only call). The tool was validated on 35 million call records and allowed analysts to visually explore fraud scenarios, reducing investigative overhead and improving explainability. It set a precedent for using interactive analytics in telecom fraud operations.
Liang et al. [31] also contributed indirectly to graph-based detection by introducing autoencoders capable of aggregating neighborhood behavior without relying on complete graph topology. Their solution was specifically designed to overcome challenges in real-world deployments where data sharing across operators is restricted. Their feature binning strategy grouped similar user behaviors, while the autoencoder learned latent representations, yielding superior performance over GNNs in restricted data environments.

2.4. LLM-Based Conversational and Semantic Analysis

Singh et al. [34] pioneered a fraud detection system that combines Retrieval-Augmented Generation (RAG) with large language models to analyze real-time voice conversations. The model transcribes live calls, checks dialogue content against dynamic organizational policies, and alerts users to suspected scams. Their system demonstrated 97.98% accuracy on synthetic but realistic voice call datasets. Importantly, their approach enables real-time intervention, which is crucial for preventing psychological scams involving impersonation or coercion.
Shen et al., in their paper “Where Do We Stand,” [35] critically examined the viability of LLMs in detecting scam phone calls. They analyzed public and synthetic datasets using TF-IDF and classic classifiers and found high accuracy (up to 99% with Random Forests), but noted that performance hinged on specific keyword triggers rather than true semantic comprehension. They raised concerns about model generalization, hallucinations, and low recall in real-world scenarios.
In “It Warned Me Just at the Right Moment,” Shen et al. [36] proposed a real-time alert system using LLMs to identify fraud intent during ongoing calls. Their system evaluated dialogue in progress and provided in-call warnings. The paper delved into trade-offs between response latency and detection accuracy, offering insights into optimizing LLMs for time-sensitive tasks. The method showed promising performance and introduced a user-centric approach to scam prevention.
Boskou et al. [37] explored financial statement fraud by leveraging ChatGPT-4 to analyze qualitative components of corporate reports, such as CEO letters and risk disclosures. Using prompt engineering, they achieved an average F1-score of 67%. The study demonstrated that LLMs, even without fine-tuning, can be adapted for audit and regulatory tasks, making them accessible tools for financial oversight.
Korkanti [38] proposed a hybrid model that combines LLMs with advanced data analytics, including anomaly detection and predictive modeling, to identify financial fraud. Their system leveraged structured transactional data alongside LLM-processed communications, leading to improved precision and recall compared to legacy systems. This approach was particularly effective in detecting complex fraud schemes involving both behavior and content manipulation.

2.5. Policy, Regulation, and Regional Focus

Mundia et al. [39] investigated telecom fraud in Kenya using qualitative interviews and focus groups involving telecom operators and regulators. The study cataloged major fraud types (SIM swap, Wangiri, and SIM-box) and identified systemic challenges like lack of biometric verification and automated tools. It emphasized the issue of concept drift, where fraud tactics change faster than detection systems can adapt. Their policy recommendations include national fraud databases and real-time customer verification protocols.
Muchilwa et al. developed Coeus [40], a cyber threat intelligence platform focused on aggregating and sharing information on fraudulent phone numbers. The platform integrates with telecom systems via API and provides a real-time repository of reported fraud cases. Piloted in Kenya, Coeus addressed fragmentation in fraud response and improved inter-agency communication. Their work highlights the importance of infrastructure in collaborative fraud prevention.
Bayram et al. [41] explored regulatory challenges in Turkey’s telecom sector, combining a review of global practices with insights from Turkish telecom operators. The study pointed out issues like lack of standard Calling Line Identification (CLI) spoofing detection and inconsistencies in user identity verification. It proposed a framework for legal and technical reforms, including mandatory fraud reporting and centralized data analytics units.
Sahaidak et al. [42] conducted a global review of telecom fraud types, such as International Revenue Share Fraud (IRSF), arbitrage, and hybrid scams. They categorized fraud by complexity and evaluated their financial impact. Their analysis emphasized that modern fraud often combines multiple vectors, requiring multifaceted detection systems. The study advocates for flexible regulatory frameworks that support technological innovation while ensuring compliance.
Figure 2a illustrates the distribution of research methodologies used in telecom fraud detection. Supervised machine learning dominates the field, accounting for 30% of the studies, indicating its continued reliability and widespread adoption. However, LLM-based analysis is close behind at 25%, reflecting a growing interest in leveraging large language models for contextual and conversational fraud detection. Deep learning methods, including neural networks, autoencoders, and GNNs, contribute 20%, suggesting substantial exploration of more complex models. Policy and regulation studies (15%) and graph-based or visualization approaches (10%) round out the distribution, highlighting ongoing work in interpretability and governance.
Figure 2b shows the distribution of research focused on different fraud mechanisms. General telecom fraud is the most studied area at 35%, likely due to its prevalence and the variety of subtypes it encompasses. Wangiri fraud, social engineering/voice scams, and regulatory or policy-focused studies each make up 15%, showing that targeted fraud types and legal responses are gaining attention. SIM-box fraud and financial document fraud each account for 10%, indicating specialized interest in technical and document-based scams. Overall, the chart suggests a balance between broad and specific fraud mechanisms in current research. Table 1 presents a comparison of all related works discussed in this section, highlighting their methodologies, targeted fraud types, achieved accuracies, and key contributions.

2.6. Gap Analysis and Future Opportunities

Despite significant advances, several gaps persist. First, the scarcity of labeled data continues to constrain supervised methods, necessitating greater exploration of semi-supervised and unsupervised learning. Second, while deep learning models achieve high accuracy, their interpretability remains limited, posing challenges for compliance and operational deployment [43]. Third, scalability is underexplored, with few works addressing the real-time integration of fraud detection systems at national or global telecom scale. Finally, the adaptability of models to dynamic fraud tactics, known as concept drift, remains insufficiently addressed, particularly in fast-evolving schemes like Wangiri and SIM-box fraud.
The literature reveals a dynamic and expanding research landscape in telecommunications fraud detection. From classical supervised methods to cutting-edge LLM-based systems, scholars have developed a wide range of techniques that address different dimensions of the fraud problem. Nonetheless, unresolved challenges persist in data availability, interpretability, scalability, and adaptability. Addressing these gaps will require not only technological innovation but also collaborative regulatory frameworks and cross-operator data sharing. This paper aims to contribute to these efforts by developing comprehensive, practical approaches that bridge methodological rigor with deployment feasibility in combating Wangiri and related fraud schemes.

3. Problem Statement

The primary challenge lies in the absence of reliable labeled training data. In production telecommunications environments, fraud labels are often unavailable due to the time-sensitive nature of fraud detection, the cost of manual labeling, and the evolving nature of fraud patterns. Unlike domains where ground truth can be systematically curated, fraud detection in telecommunications is hampered by the dynamic and adversarial behavior of fraudsters, who constantly adapt their strategies to evade detection, thereby rendering previously labeled data obsolete or misleading over time. Moreover, the dependence on human experts to verify fraudulent instances introduces additional delays and operational costs, making large-scale annotation impractical. This scarcity of labels poses a critical obstacle to supervised learning, which relies on rich and representative datasets to learn generalizable decision boundaries. Although unsupervised learning techniques can circumvent labeling constraints by detecting anomalies or deviations from normal call behavior, such methods often struggle to provide actionable outputs in real-time contexts and may generate excessive false positives, undermining operational efficiency. Therefore, the unlabeled data challenge necessitates exploration of semi-supervised, weakly supervised, and self-supervised approaches, which can leverage limited labeled samples alongside vast amounts of unlabeled data, while maintaining responsiveness and adaptability to the rapidly changing fraud landscape.

3.1. Feature Engineering Complexity

Raw CDRs contain vast amounts of structured and semi-structured data that must be transformed into meaningful features capable of capturing fraud patterns. Each record encodes diverse information such as caller and callee identifiers, call duration, start and end times, and geographic origin, all of which may interact in complex ways to signal fraudulent behavior. The primary challenge is to identify combinations of attributes that serve as robust indicators of fraud, for example, short-duration calls to premium-rate numbers, repeated attempts across multiple destinations, or anomalous time-of-day activity patterns. However, the high dimensionality and heterogeneity of CDRs make naive feature extraction computationally expensive and prone to redundancy, which is particularly problematic in real-time processing environments where latency must be minimized. Additionally, features that are predictive under one fraud scheme may rapidly lose relevance as fraud tactics evolve, necessitating continuous monitoring and adaptation of the feature space. Thus, effective feature engineering must balance expressiveness with computational efficiency, employing domain knowledge, statistical analysis, and automated representation learning techniques to construct features that generalize across fraud scenarios without overwhelming system resources. Critical features for Wangiri fraud detection include:
  • Temporal features: call frequency, inter-call intervals, time-of-day patterns;
  • Duration features: call length statistics, duration distributions;
  • Geographic features: location clustering, international calling patterns;
  • Network features: routing information, carrier data;
  • Behavioral features: calling patterns, number sequences.
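As an illustration, the temporal, duration, and behavioral features above can be derived from raw CDRs by per-caller aggregation. The following is a minimal pandas sketch, assuming hypothetical column names (`caller`, `start_ts` as epoch seconds, `duration_ms`, `callee_country`) rather than the paper's actual CDR schema:

```python
import pandas as pd

def engineer_features(cdr: pd.DataFrame) -> pd.DataFrame:
    """Derive per-caller daily aggregates from raw CDR rows.

    Column names are illustrative placeholders: 'caller', 'start_ts'
    (epoch seconds), 'duration_ms', 'callee_country'.
    """
    cdr = cdr.sort_values(["caller", "start_ts"]).copy()
    # Temporal features: inter-call interval and hour of day per record.
    cdr["inter_call_s"] = cdr.groupby("caller")["start_ts"].diff()
    cdr["hour"] = pd.to_datetime(cdr["start_ts"], unit="s").dt.hour
    cdr["day"] = pd.to_datetime(cdr["start_ts"], unit="s").dt.date
    # Per-(caller, day) aggregates capturing volume, duration, and spread.
    agg = cdr.groupby(["caller", "day"]).agg(
        calls=("callee_country", "size"),
        mean_duration_ms=("duration_ms", "mean"),
        total_duration_ms=("duration_ms", "sum"),
        distinct_dest_countries=("callee_country", "nunique"),
        mean_gap_s=("inter_call_s", "mean"),
    ).reset_index()
    return agg
```

A high call count combined with a low mean duration in these aggregates is exactly the signature the Wangiri heuristics in Section 4 look for.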

3.2. Imbalanced Data Distribution

Wangiri fraud attempts, while numerous in absolute terms, represent a small fraction of total call traffic in telecommunications networks. This creates a highly imbalanced dataset where fraudulent instances are significantly outnumbered by legitimate calls, leading to challenges in model training and evaluation. In such skewed distributions, traditional supervised algorithms tend to bias toward the majority class, achieving deceptively high accuracy by predominantly predicting legitimate activity while failing to identify rare but critical fraudulent cases.
The imbalance also complicates the use of conventional evaluation metrics, as high overall accuracy may mask poor recall of fraudulent instances, which is of paramount importance in operational contexts. Addressing this issue requires both algorithmic and data-level interventions, such as resampling strategies, cost-sensitive learning, ensemble methods, or the design of tailored performance metrics that prioritize fraud detection without inflating false alarms. Moreover, the imbalance is further exacerbated by concept drift, as fraud strategies evolve, thereby reducing the representativeness of historical data and worsening class disparity over time. Consequently, the imbalanced data distribution is not merely a technical artifact but a structural characteristic of the fraud detection problem, demanding robust methodological innovations to ensure that minority-class patterns are effectively learned and operationalized.
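The SMOTE strategy referenced above synthesizes new minority-class samples by interpolating between a minority point and one of its nearest minority neighbors. Below is a minimal NumPy sketch of that core idea only; a production pipeline would normally use imbalanced-learn's `SMOTE` class instead:

```python
import numpy as np

def smote_like(X_min: np.ndarray, n_new: int, k: int = 5, seed: int = 0) -> np.ndarray:
    """Generate n_new synthetic minority samples by linear interpolation
    between each chosen minority point and one of its k nearest minority
    neighbours (the core idea behind SMOTE). Assumes len(X_min) > k."""
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances within the minority class only.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # exclude self-matches
    nn = np.argsort(d, axis=1)[:, :k]         # k nearest neighbour indices
    base = rng.integers(0, len(X_min), n_new)  # random seed points
    neigh = nn[base, rng.integers(0, k, n_new)]
    gap = rng.random((n_new, 1))              # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[neigh] - X_min[base])
```

Because the synthetic points lie on segments between real minority samples, they densify the minority region without duplicating records verbatim, which is what lets classifiers learn a broader decision boundary than plain random oversampling.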

3.3. Interpretability Requirements

Telecommunications fraud detection systems must provide clear, interpretable explanations for their decisions to satisfy regulatory requirements and enable operational teams to take appropriate action. In highly regulated domains such as telecommunications, transparency is not merely a desirable feature but a legal and ethical necessity, as operators must be able to justify the reasoning behind the flagging of suspicious activity both to auditors and to affected customers. However, the increasing reliance on advanced machine learning and deep learning models, which often operate as black-box systems, introduces a tension between predictive accuracy and interpretability. This tension necessitates the incorporation of explainable artificial intelligence (XAI) techniques, such as feature attribution methods, surrogate models, or rule-based post hoc explanations, to bridge the gap between complex model behavior and human comprehensibility.
At the same time, interpretability must be sufficiently granular to allow fraud analysts to trace specific indicators, such as anomalous call patterns, unusual temporal distributions, or deviations from user profiles, thereby ensuring that alerts can be investigated effectively and translated into actionable countermeasures. Consequently, the interpretability requirement extends beyond compliance, functioning as a critical operational enabler that allows human experts to validate model outputs, adapt strategies in response to evolving fraud tactics, and maintain trust in the automated system.

3.4. Real-Time Processing Constraints

Effective fraud detection requires near real-time processing capabilities to identify and mitigate fraudulent activities before significant financial losses occur. In practice, this entails the design and deployment of computational architectures capable of ingesting and analyzing massive streams of CDRs, signaling events, and network metadata with minimal delay, often within milliseconds of data generation. The challenge lies in balancing the need for low-latency detection with the inherently high computational complexity of fraud detection algorithms, particularly when employing machine learning models that involve feature-rich representations or temporal dependencies.
To meet these constraints, systems must employ efficient stream-processing frameworks, distributed computing environments, and memory-optimized algorithms that can operate at scale without degrading performance. Furthermore, real-time fraud detection demands robustness in handling data variability and burst traffic conditions, ensuring continuous operation even under network congestion or sudden spikes in fraudulent attempts. Failure to maintain stringent latency thresholds could render detections ineffective, as fraudulent transactions may already be executed and irreversible by the time an alert is raised. Thus, the real-time processing requirement is not merely a technical preference but a fundamental operational imperative, directly tied to the financial integrity and resilience of telecommunications networks.

4. Methodology

This section outlines the methodology for detecting Wangiri fraud in telecommunications data. The process encompasses data preprocessing, feature engineering, model training with imbalance handling, calibration, and evaluation. Figure 3 illustrates the complete pipeline, designed to address class imbalance and yield interpretable, high-performance models.
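The stages above (train with an imbalance remedy, calibrate, evaluate) can be sketched end to end with scikit-learn. This is a minimal illustration on synthetic stand-in features, not the paper's actual implementation; `class_weight="balanced"` is used here as one imbalance remedy, whereas the paper also compares SMOTE and undersampling:

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the engineered CDR features of Section 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))
y = (X[:, 0] + 0.5 * X[:, 1] > 1.5).astype(int)   # rare positive class (~9%)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Random Forest with class weighting, wrapped in probability calibration.
clf = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=100, class_weight="balanced",
                           random_state=0),
    cv=3,
)
clf.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
```

Calibration matters operationally because analysts triage alerts by score; a calibrated probability lets a fixed alert threshold correspond to a predictable false-positive budget.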
The dataset consists of unlabeled CDRs, capturing low-level signaling and call metadata across various fields. Of the 125 features present in the CDR log files, we extracted 12 features relevant to Wangiri fraud detection. Each row in the dataset represents a single call session, characterized by the standard attributes listed in Table 2.
These features provide temporal, geographical, and signaling-related insights that form the basis for downstream analysis such as fraud detection and anomaly monitoring.

4.1. Data Labeling

To enable supervised learning for fraud detection, we developed a systematic labeling framework grounded in domain-informed heuristics derived from CDRs. Because reliable ground-truth annotations were unavailable in production, a rule-based strategy was adopted to approximate fraudulent behavior. Two distinct labeling approaches were defined, reflecting different assumptions about fraud indicators and their operational feasibility.
  • Operator-Defined Traditional Rule: In practice, many telecom operators rely on simple threshold-based heuristics for fraud identification. The most widely used rule marks a call as fraudulent when its duration is less than 5 s, combined with both the ACM time and CPG time equal to zero. While this method is effective in highlighting unanswered or prematurely terminated calls, its reliance on post-call signaling parameters limits its applicability in real-time detection pipelines. Specifically, these features may not be available at the time of decision-making, thereby restricting operational deployment in streaming environments.
  • Joint Duration and Call Volume Labeling (label_unique_calls_and_callduration): To address the shortcomings of the traditional rule, we designed a stricter heuristic that combines both temporal and volumetric dimensions while remaining compatible with real-time inference. In this scheme, a record is labeled as fraudulent if the caller exceeds 60 unique calls within a single day and the cumulative call duration across that day is less than 120,000 ms (120 s). This formulation reflects the characteristic behavior of fraudsters who engage in mass dialing campaigns with unusually low aggregate call durations, typically corresponding to unanswered or deliberately short calls. Unlike the operator-defined rule, this approach is aligned with features that can be computed incrementally, making it more suitable for real-time fraud detection scenarios.
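The two labeling rules can be sketched in a few lines. This is a minimal pure-Python illustration, assuming each CDR is a dict with caller, callee, day, callduration (in milliseconds), acm_time, and cpg_time fields; the field names and the millisecond unit for the 5 s threshold are our assumptions, not the operator's schema:

```python
from collections import defaultdict

def label_operator_rule(record):
    """Operator-defined rule: duration < 5 s with zero ACM and CPG times."""
    return (record["callduration"] < 5000
            and record["acm_time"] == 0
            and record["cpg_time"] == 0)

def label_unique_calls_and_callduration(records,
                                        min_unique_calls=60,
                                        max_total_duration_ms=120_000):
    """Joint rule: flag every call by a caller who, on a given day, dials
    more than `min_unique_calls` distinct numbers while accumulating less
    than `max_total_duration_ms` of total call time."""
    callees = defaultdict(set)   # (caller, day) -> distinct callees
    total_ms = defaultdict(int)  # (caller, day) -> summed duration
    for r in records:
        key = (r["caller"], r["day"])
        callees[key].add(r["callee"])
        total_ms[key] += r["callduration"]
    return [int(len(callees[(r["caller"], r["day"])]) > min_unique_calls
                and total_ms[(r["caller"], r["day"])] < max_total_duration_ms)
            for r in records]
```

Because the joint rule depends only on per-caller daily aggregates, both counters can be maintained incrementally in a streaming setting, which is what makes it deployable in real time.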

4.2. Feature Engineering

To improve the model’s predictive accuracy, a set of engineered features was developed to capture temporal, geographic, and behavioral aspects of each call record. These features were selected based on their relevance to the task and their ability to expose patterns typical of fraudulent or spammy behavior.

4.2.1. Temporal Features

These features provide temporal context to the call by localizing timestamps to the respective origin and destination time zones:
  • source_time: The local hour of day at the call origin, computed by adjusting the UTC Unix timestamp using the time-zone offset corresponding to the origin country or region. This helps capture behavioral patterns such as calls occurring during unusual hours.
  • dest_time: The local hour of day at the call destination. Calculated similarly to source_time, it helps identify suspicious call patterns based on the destination’s local time (e.g., calls placed at night to certain countries).
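Both temporal features reduce to a timestamp-plus-offset computation. A pure-Python sketch follows; the tz_offsets lookup table mapping country codes to UTC offsets is a hypothetical stand-in for the operator's time-zone data, and fractional offsets (e.g., +3.5 h) are supported:

```python
def local_hour(utc_timestamp, tz_offset_hours):
    """Local hour of day (0-23) from a Unix timestamp (seconds, UTC)
    and a time-zone offset in hours (may be fractional)."""
    return int((utc_timestamp / 3600 + tz_offset_hours) % 24)

def add_temporal_features(record, tz_offsets):
    """Derive source_time and dest_time from the record's srccc/destcc
    country codes via a country-code -> UTC-offset lookup (assumed)."""
    record["source_time"] = local_hour(record["timestamp"],
                                       tz_offsets[record["srccc"]])
    record["dest_time"] = local_hour(record["timestamp"],
                                     tz_offsets[record["destcc"]])
    return record
```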

4.2.2. Geographic and Call Type Features

These features encode geographic relationships and call classifications:
  • is_international_new: A binary indicator representing whether the call is international (1) or domestic (0). This is derived by comparing the country codes of the caller and callee, and is useful for detecting anomalies such as unexpected international call patterns.

4.2.3. Behavioral Features

Behavioral features aim to quantify user activity and uncover atypical usage patterns:
  • unique_calls_last_day: The number of distinct callee numbers contacted by the caller during the previous calendar day. This feature is designed to capture bursty or high-volume behavior, which is often associated with spam campaigns. It is computed using historical data to ensure the model can operate in a real-time inference setting without relying on future information.
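A sketch of this computation (pure Python, assuming records carry caller, callee, and a datetime.date day field). Only the previous calendar day's history is read, so no future information leaks into the feature:

```python
from collections import defaultdict
from datetime import date, timedelta

def unique_calls_last_day(records):
    """For each record, count the distinct callees its caller contacted on
    the previous calendar day. Uses only past data, so the feature is safe
    for real-time inference."""
    per_day = defaultdict(set)  # (caller, day) -> distinct callees
    for r in records:
        per_day[(r["caller"], r["day"])].add(r["callee"])
    return [len(per_day.get((r["caller"], r["day"] - timedelta(days=1)),
                            set()))
            for r in records]
```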

4.3. Exploratory Data Analysis

An in-depth Exploratory Data Analysis (EDA) was performed to examine data characteristics, fraud distributions, and feature relationships. Key insights are visualized in this section. Fraudulent calls constitute a small portion of the dataset, necessitating specialized handling techniques. As shown in Figure 4, the distribution between fraud and non-fraud calls reveals a significant class imbalance, with only 0.57% of calls identified as fraudulent.
Figure 5 shows the Spearman correlation coefficients between various features and the fraud label in a telecom dataset. The feature unique_calls_last_day has the highest positive correlation (0.07), indicating that accounts with more unique calls are slightly more associated with fraudulent activity. destcc and dest_time also show weak positive correlations. On the negative side, cpg_time, is_international_new, and acm_time exhibit the strongest (though still weak) negative correlations, suggesting these timing and international-related variables are slightly inversely associated with fraud. Overall, the absolute values of the coefficients are low, implying weak monotonic relationships between individual features and fraud, highlighting the need for more complex or nonlinear models to capture patterns effectively.
Figure 6 displays the Spearman correlation coefficients between all pairs of features in the dataset, highlighting the strength and direction of monotonic relationships. Most features exhibit weak correlations, with values clustering near zero, suggesting relative independence. However, a notable exception is the strong positive correlation (0.74) between dest_time and source_time, indicating that call initiation and destination times are closely related. Another moderate correlation appears between acm_time and cpg_time (0.41), reflecting a potential temporal or sequential dependency. No significant multicollinearity is observed across the broader set, making the feature set generally suitable for models sensitive to correlated inputs.
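For reference, the Spearman coefficients behind Figures 5 and 6 are simply Pearson correlations computed on rank-transformed values. A dependency-free sketch follows (scipy.stats.spearmanr is the usual shortcut in practice):

```python
def _ranks(values):
    """1-based average ranks, with ties receiving the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den
```

Because it operates on ranks, Spearman's rho captures monotonic but non-linear associations, which is why it is preferred here over Pearson correlation for heavily skewed telecom features.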

4.3.1. Numerical Distributions

Figure 7 presents a set of histograms showing the distributions of key numerical features on a logarithmic scale. All four features (acm_time, cpg_time, callduration, and unique_calls_last_day) exhibit heavy right skewness, with most values concentrated near the lower end of the scale and long tails extending toward higher magnitudes. This indicates a large number of low-activity events and a small number of high-activity outliers, typical of telecom data. Specifically, callduration and cpg_time show extreme sparsity in their higher ranges, while unique_calls_last_day displays a spiked pattern, possibly due to discrete count values. The log scale effectively compresses the range and reveals structure in otherwise skewed data, suggesting that transformation or normalization may be beneficial for modeling.

4.3.2. Distributions by Fraud Status

Figure 8 shows violin plots comparing the distributions of four numerical features based on fraud status (Non-fraudulent (0) vs. Fraudulent (1)). These subfigures help visualize how the distributions of these features differ between normal and fraudulent behaviors:
(a) Acm Time: Fraudulent calls generally have shorter acm_time values, with a sharper peak at low durations, whereas non-fraudulent calls are more widely distributed with multiple modes, indicating greater variability.
(b) Callduration: Non-fraudulent calls show a wide and continuous spread in duration, including very high values, while fraudulent calls cluster near the lower range, indicating shorter calls are more typical in fraud cases.
(c) Cpg Time: Similar to acm_time, cpg_time for fraudulent calls is highly concentrated near low values with little variance, whereas non-fraudulent calls show a broader distribution.
(d) Unique Calls Last Day: This feature clearly separates the two classes. Fraudulent accounts make significantly more unique calls in a day, showing a tall, narrow peak at high values, while non-fraudulent behavior is concentrated around much lower counts.
Figure 8. Violin plots showing the distribution of key numerical features by fraud status. (a) Distribution of acm_time by fraud status. (b) Distribution of callduration by fraud status. (c) Distribution of cpg_time by fraud status. (d) Distribution of unique_calls_last_day by fraud status. (The dotted lines indicate locations where the slope changes are sharp and significant.)

4.3.3. Temporal Analysis

Figure 9 presents two bar charts analyzing call distribution patterns over the course of a day, segmented by fraud status (0 for non-fraudulent and 1 for fraudulent calls).
Top Chart—Call Distribution by Source Time of Day: This plot shows the distribution of call initiation (source) times. Non-fraudulent calls follow a clear diurnal pattern, peaking between 10:00 and 18:00 and dropping significantly during the night hours. Fraudulent calls, while far fewer in number, occur disproportionately around 17:00–18:00, subtly deviating from the non-fraudulent distribution.
Bottom Chart—Call Distribution by Destination Time of Day: This chart reflects the distribution of call termination (destination) times. The trend is similar to the source time distribution, with most non-fraudulent calls occurring during typical working hours. Fraudulent calls are again concentrated around late afternoon to early evening hours, though their volume is much lower compared to legitimate calls.
Together, the subfigures suggest that while non-fraudulent call activity aligns with standard daily schedules, fraudulent calls are slightly more concentrated during late daytime hours, indicating potential exploitation of higher call volumes or reduced detection during that window.

4.3.4. Geographic Insights

Figure 10 displays two bar charts analyzing the country-level distribution of fraudulent calls based on international dialing codes.
Left Chart—Top 10 Caller Countries in Fraudulent Calls: This chart ranks the caller country codes by the number of fraudulent calls initiated. Country code 98 is the most frequent source, significantly surpassing others, followed by 1, 44, and 971. The remaining entries have progressively fewer fraudulent calls, with codes like 965, 357, and 904 contributing relatively minimal volumes. This indicates a concentration of fraud origination from a few key regions.
Right Chart—Top 10 Destination Countries in Fraudulent Calls: This chart shows the destination country codes most frequently targeted by fraudulent calls. Country code 98 also dominates here, indicating it is both a major source and target of fraudulent activity. Other notable destinations include 90, 971, and several codes with low but notable volumes (93, 86, 49, etc.).
These charts highlight geographic patterns in telecom fraud, with certain countries consistently appearing as both origin and target of suspicious call activity, suggesting potential hubs or targets of fraud operations.

4.3.5. Engineered Features

Figure 11 presents two bar plots examining the relationship between engineered features and fraud status.
Left Plot—Fraud Status for International vs. Domestic Calls: This chart compares the volume of international and domestic calls, segmented by fraud label. The vast majority of calls are domestic and non-fraudulent. However, a noticeable proportion of international calls are fraudulent, highlighting that fraud is more likely to occur in international call traffic despite their lower overall volume.
Right Plot—Unique Daily Calls by Fraud Status: This chart shows the distribution of unique_calls_last_day for non-fraudulent vs. fraudulent users. Non-fraudulent users typically make a small number of unique calls per day, forming a narrow, centered distribution. In contrast, fraudulent users consistently exhibit a much higher and fixed number of unique calls per day (close to 30), indicating automated or scripted call behavior commonly associated with fraud.
These subfigures suggest that fraud is disproportionately associated with international calling and a high volume of unique calls within a single day, reinforcing the utility of these engineered features in fraud detection models.

4.3.6. Feature Interactions

Figure 12 visualizes the relationship between Call Duration (x-axis) and Connection Setup Time (CPG Time) (y-axis), with call instances labeled by fraud status. The blue dots represent non-fraudulent calls, and red dots (if any) represent fraudulent calls.
The plot reveals a strong concentration of points in the lower-left region, indicating that most calls, especially non-fraudulent ones, have both short durations and low connection setup times. A few outliers extend toward high call durations and high CPG times, but these are sparse. The overall distribution is highly skewed, with no clear linear relationship between call duration and CPG time. Notably, fraudulent calls (in red) are either extremely rare or absent in this sample, which may suggest that fraud is concentrated in calls with shorter durations and is not strongly related to variations in CPG time.

4.4. Proposed Optimized Ensemble Framework

To address the limitations of standard baseline models in detecting sophisticated Wangiri patterns, we propose an Optimized Ensemble Framework that integrates our novel unsupervised labeling heuristic with a hyperparameter-tuned voting architecture. Unlike standard implementations of Random Forest or XGBoost, our proposed approach specifically targets the “one ring and cut” signature by explicitly weighing the engineered behavioral features derived in the previous section.
The correlations observed in Figure 5 were pivotal in defining the model architecture. For instance, the high correlation of unique_calls_last_day necessitates a tree-based architecture capable of creating distinct decision boundaries for high-volume callers versus normal users. Furthermore, the weak monotonic correlations of timing features (acm_time, cpg_time) suggest non-linear dependencies; therefore, our framework utilizes gradient boosting (XGBoost) to capture these complex, non-linear feature interactions that linear baseline models (like Logistic Regression) fail to detect.
The framework operates in three distinct stages:
1. Heuristic-Based Weak Labeling: Generating initial ground truth from unlabeled data using the domain rules defined in Section 4.1.
2. Feature Interaction Encoding: Transforming raw CDR timestamps into cyclical temporal features and ratio-based behavioral metrics.
3. Hyperparameter-Optimized Ensemble: We move beyond default baseline settings by employing a rigorous grid search (GridSearchCV) to optimize the regularization parameters (λ, α) and tree depth. This ensures the model does not merely memorize the heuristic rules but generalizes to unseen variations of fraud.
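A minimal sketch of the voting stage using scikit-learn. GradientBoostingClassifier is substituted for XGBoost to keep the example dependency-light, and the synthetic data merely stands in for the engineered CDR features:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)

# Synthetic, imbalanced stand-in for the 9-feature CDR matrix.
X, y = make_classification(n_samples=600, n_features=9, weights=[0.9, 0.1],
                           random_state=0)

# Soft voting averages the members' predicted class probabilities.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    voting="soft",
)
ensemble.fit(X, y)
fraud_probs = ensemble.predict_proba(X)[:, 1]
```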

4.4.1. Dataset and Preprocessing

The experiments utilize a large-scale dataset comprising 11,667,040 call records with 12 initial features. After preprocessing (handling missing values, feature engineering, and selection), 9 key features were retained: call duration, unique calls last day, ACM time, CPG time, international calls ratio, fraud hotlist numbers ratio, country risk score, call time of day, and high frequency calling. The target variable is highly imbalanced, with 66,352 fraudulent cases (0.57%) and 11,600,688 non-fraudulent cases (99.43%). The data was split into 70% training (8,166,928 records), 10% validation (1,166,704 records), and 20% testing (2,333,408 records) sets prior to any balancing or modeling to prevent data leakage.
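Such a split can be reproduced with two stratified calls to scikit-learn's train_test_split (a sketch: taking 0.125 of the remaining 80% yields the 10% validation share):

```python
from sklearn.model_selection import train_test_split

def split_70_10_20(X, y, seed=42):
    """Stratified 70/10/20 train/validation/test split, performed before
    any resampling so synthetic minority samples never leak into the
    evaluation sets."""
    X_trainval, X_test, y_trainval, y_test = train_test_split(
        X, y, test_size=0.20, stratify=y, random_state=seed)
    # 0.125 of the remaining 80% gives the 10% validation share.
    X_train, X_val, y_train, y_val = train_test_split(
        X_trainval, y_trainval, test_size=0.125, stratify=y_trainval,
        random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)
```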

4.4.2. Sampling Strategies

To address the class imbalance, we applied four sampling strategies to the training data:
  • No Sampling: The training data remains imbalanced, with approximately 0.57% fraudulent cases.
  • SMOTE: SMOTE was used to oversample the fraud class, increasing its representation to approximately 33.3% of the training set.
  • Oversampling Fraud and Undersampling Non-Fraud: SMOTE was applied to oversample the fraud class, and random undersampling reduced the non-fraud class, achieving a balanced training set (50% fraud, 50% non-fraud).
  • Undersampling Non-Fraud Only: The non-fraud class was randomly undersampled to match the number of fraud cases, resulting in a balanced training set.
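The undersampling strategy is simple enough to sketch without the imbalanced-learn library (whose SMOTE and RandomUnderSampler classes would be used in practice for the oversampling variants). A pure-Python version of "Undersampling Non-Fraud Only":

```python
import random

def random_undersample(X, y, seed=42):
    """Randomly drop non-fraud rows until both classes are the same size,
    as in the 'Undersampling Non-Fraud Only' strategy."""
    rng = random.Random(seed)
    fraud_idx = [i for i, label in enumerate(y) if label == 1]
    nonfraud_idx = [i for i, label in enumerate(y) if label == 0]
    kept = rng.sample(nonfraud_idx, k=len(fraud_idx))
    idx = sorted(fraud_idx + kept)
    return [X[i] for i in idx], [y[i] for i in idx]
```

Undersampling discards majority-class information but is cheap at this scale; SMOTE instead synthesizes new minority samples by interpolating between nearest fraud neighbors.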
For each strategy, we trained five models: Logistic Regression, Decision Tree, Random Forest, XGBoost, and MLP. These were evaluated on the validation set to identify the best-performing combination.

4.4.3. Learning Approach

A critical challenge in deploying AI for Wangiri fraud detection is the scarcity and unreliability of labeled data, which necessitates exploration beyond fully supervised learning toward semi-supervised, weakly supervised, and self-supervised approaches. High detection accuracy is essential in this context, as misclassifications directly translate into financial loss or degraded user trust; however, purely unsupervised methods primarily focus on identifying generic anomalies rather than discriminating Wangiri fraud from other irregular but legitimate calling behaviors. In practice, this limitation is significant because fraud detection requires class-specific reasoning rather than deviation-based detection alone.
Moreover, many unsupervised techniques, such as k-nearest neighbors or DBSCAN, impose substantial operational constraints by requiring access to the training data during inference, making them unsuitable for real-time, large-scale telecom environments. These methods also typically rely on heuristic or distance-based similarity measures, which struggle to capture the complex relational patterns and heterogeneous feature types present in telecom data, including categorical, temporal, and behavioral attributes. Our early experiments with Isolation Forest further highlighted these shortcomings, yielding suboptimal performance and high false-positive rates.
Consequently, these observations motivated a shift toward weakly supervised labeling strategies, which better align with the practical constraints of telecom fraud detection while enabling models to learn fraud-specific patterns from imperfect but informative supervisory signals.

4.4.4. Model Selection

This subsection outlines the selection of five classifiers, each representing diverse algorithmic paradigms suitable for fraud detection tasks. These models were chosen to balance interpretability, performance, and the ability to handle complex data patterns in imbalanced datasets. The details of this process are shown in Figure 13.
  • Logistic Regression (LR): Logistic Regression is a supervised linear classification algorithm used for binary or multiclass problems. It serves as a baseline model that assumes a linear relationship between input features and the log-odds of the outcome. The core idea is to model the probability of a class (e.g., fraud or non-fraud) using a logistic (sigmoid) function [44,45], which maps any real-valued input to a value between 0 and 1. This makes it interpretable, as coefficients represent the impact of features on the outcome, but it struggles with nonlinear interactions without feature transformations. Robustness refers to its stability in the presence of multicollinearity when regularized. Overfitting is minimized through techniques like L1 (Lasso) or L2 (Ridge) regularization [46]. The probability prediction formula is given in Equation (1):
    $p(y = 1 \mid x) = \dfrac{1}{1 + e^{-(\beta_0 + \beta^{T} x)}}$
    where y is the binary target variable (1 for fraud, 0 otherwise), x is the feature vector, $\beta_0$ is the intercept (bias term), and $\beta$ is the vector of coefficients (weights) learned via maximum likelihood estimation. The model is trained by minimizing the cross-entropy loss given in Equation (2):
    $\mathcal{L} = -\sum_{i=1}^{n} \left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right]$
  • Decision Tree (DT): A Decision Tree is a non-parametric, supervised learning algorithm that builds a tree-like model of decisions and their possible consequences. The key idea is recursive partitioning: the algorithm splits the dataset into subsets based on feature values to maximize information gain or minimize impurity at each node [47]. Nodes represent decision rules (e.g., “feature X > threshold”), branches are outcomes, and leaves are class predictions. Transparency comes from the visual and rule-based structure, allowing easy interpretation. However, it is prone to overfitting, where the tree grows too deep and captures noise rather than patterns, leading to poor generalization on unseen data. Pruning techniques (pre- or post-pruning) can mitigate this. A common splitting criterion is Gini impurity, defined in Equation (3):
    $\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} p_k^2$
    where D is the dataset at a node, K is the number of classes, and $p_k$ is the proportion of samples belonging to class k. Another criterion is entropy (from information theory), given in Equation (4):
    $\mathrm{Entropy}(D) = -\sum_{k=1}^{K} p_k \log_2(p_k)$
    where the best split maximizes the reduction in impurity across child nodes.
  • Random Forest (RF): Random Forest is an ensemble learning method that constructs multiple Decision Trees and aggregates their predictions to improve accuracy and robustness. The core idea is bootstrap aggregation (bagging), where each tree is trained on a random subset of the data (with replacement) and a random subset of features at each split, reducing variance and overfitting [48]. Generalization improves as the ensemble averages out individual tree biases. Feature importance is derived from how much each feature reduces impurity across trees. While less interpretable than a single tree due to the “black-box” nature of the ensemble, techniques like variable importance plots provide insights. Predictions for classification are made by majority voting across B trees as shown in Equation (5):
    $\hat{y} = \arg\max_{k} \frac{1}{B} \sum_{b=1}^{B} \mathbb{I}\big(\hat{y}^{(b)} = k\big)$
    where $\hat{y}^{(b)}$ is the prediction from the b-th tree, and $\mathbb{I}$ is the indicator function. The out-of-bag (OOB) error estimates generalization without a separate validation set.
  • MLP: A Multi-Layer Perceptron [49] is a type of feedforward artificial neural network consisting of an input layer, one or more hidden layers, and an output layer. The fundamental idea is to learn hierarchical representations of data through nonlinear transformations, enabling the capture of complex interactions and patterns that linear models miss. Neurons in hidden layers apply nonlinear activation functions (e.g., ReLU: σ ( z ) = max ( 0 , z ) or sigmoid) to weighted sums of inputs. Training occurs via backpropagation, which adjusts weights using gradient descent to minimize loss. Hyperparameter tuning involves selecting layer sizes, learning rates, and regularization to prevent overfitting, especially with limited data. For a single hidden layer, the output is given in Equation (6):
    $h = \sigma(W_1 x + b_1), \qquad \hat{y} = \mathrm{softmax}(W_2 h + b_2)$
    where $W_1, W_2$ are weight matrices, $b_1, b_2$ are biases, $\sigma$ is the activation function, and softmax normalizes outputs to probabilities for classification, as shown in Equation (7):
    $\mathrm{softmax}(z_i) = \dfrac{e^{z_i}}{\sum_{j} e^{z_j}}$
    where the loss is typically cross-entropy, optimized with algorithms like Adam.
  • Extreme Gradient Boosting (XGBoost): XGBoost [50] is a scalable, tree-based ensemble algorithm that uses gradient boosting to build sequential models. The key idea is to train “weak learners” (shallow trees) iteratively, where each new tree corrects the residuals (errors) of the previous ones, minimizing a loss function through gradient descent. Regularization terms prevent overfitting, and it excels in imbalanced tasks by handling class weights. Flexibility comes from customizable objectives and evaluation metrics. Interpretability is achieved via tools like SHapley Additive exPlanations (SHAP) values, though the boosted ensemble is inherently complex. The objective function to minimize is given in Equation (8):
    $\mathrm{Obj} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$
    where l is the loss (e.g., logistic loss for classification), $\hat{y}_i = \sum_{k=1}^{K} f_k(x_i)$ is the prediction as a sum of tree outputs $f_k$, and $\Omega(f_k) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^2$ is the regularization term (T: number of leaves, w: leaf weights, $\gamma, \lambda$: penalties). Trees are built by approximating the second-order Taylor expansion of the loss for efficiency.

4.4.5. Hyperparameter Tuning and Probability Calibration

Model hyperparameters were optimized using a grid search with 5-fold cross-validation on the balanced training set. Hyperparameter tuning was conducted only for the XGBoost model, as its performance is highly sensitive to parameter choices. The search space is summarized in Table 3.
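The tuning loop can be sketched with scikit-learn's GridSearchCV. The parameter values below are illustrative placeholders (Table 3 defines the actual search space), and GradientBoostingClassifier stands in for XGBoost to keep the example self-contained:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Illustrative grid only; the paper's Table 3 lists the real search space.
param_grid = {
    "max_depth": [2, 3],
    "learning_rate": [0.1, 0.3],
    "n_estimators": [25, 50],
}

# Synthetic stand-in for the balanced training set.
X, y = make_classification(n_samples=300, n_features=9, random_state=0)

search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    cv=5,               # 5-fold cross-validation, as in the paper
    scoring="roc_auc",  # ROC-AUC is the primary optimization criterion
)
search.fit(X, y)
```

With xgboost installed, XGBClassifier would slot in directly and expose the lambda/alpha regularization penalties mentioned above.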
For other models (MLP, Random Forest, Decision Tree, and Logistic Regression), hyperparameter tuning was not applied for the following reasons:
  • MLP: Preliminary experiments showed unstable convergence and limited gains from tuning with the available dataset size, so default settings were retained for consistency.
  • Random Forest and Decision Tree: Both models demonstrated stable performance under default configurations, and extensive tuning did not provide meaningful improvements relative to the added computational cost.
  • Logistic Regression: As a baseline linear model, Logistic Regression is relatively insensitive to hyperparameter adjustments beyond regularization strength, which was fixed for comparability across experiments.
Following model training, probability calibration was applied using CalibratedClassifierCV, with both sigmoid (Platt scaling) and isotonic regression methods evaluated. Calibration ensures that predicted probabilities reflect true likelihoods of fraud, which is crucial for operational deployment in fraud management systems.
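A minimal calibration sketch on synthetic data; CalibratedClassifierCV wraps the base model and fits the probability map with cross-validation (passing method="isotonic" would swap in isotonic regression):

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Imbalanced synthetic stand-in for the CDR training data.
X, y = make_classification(n_samples=500, n_features=6, weights=[0.9, 0.1],
                           random_state=0)

# method="sigmoid" is Platt scaling: a logistic map fitted to held-out scores.
calibrated = CalibratedClassifierCV(LogisticRegression(max_iter=1000),
                                    method="sigmoid", cv=5)
calibrated.fit(X, y)
fraud_probs = calibrated.predict_proba(X)[:, 1]
```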

4.4.6. Evaluation Metrics

Model performance was evaluated using a comprehensive set of metrics suitable for imbalanced binary classification. These metrics are defined based on the confusion matrix elements: True Positives (TP: correctly predicted fraud cases), False Positives (FP: non-fraud cases incorrectly predicted as fraud), True Negatives (TN: correctly predicted non-fraud cases), and False Negatives (FN: fraud cases incorrectly predicted as non-fraud) [51].
  • Precision: The fraction of predicted fraud cases that were correctly identified, reflecting false alarm control, as given in Equation (9).
    $\mathrm{Precision} = \dfrac{TP}{TP + FP}$
  • Recall (Sensitivity): The proportion of actual fraud cases correctly detected, representing the system’s ability to capture fraudulent activity, as given in Equation (10).
    $\mathrm{Recall} = \dfrac{TP}{TP + FN}$
  • F1-Score: The harmonic mean of precision and recall, providing a balanced assessment when both metrics are critical, as given in Equation (11).
    $F_1 = \dfrac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$
  • ROC–AUC: The Area Under the Receiver Operating Characteristic Curve, summarizing the trade-off between true positive rate (TPR = Recall) and false positive rate ($\mathrm{FPR} = \frac{FP}{FP + TN}$) across various classification thresholds. It is computed as the integral of TPR against FPR from 0 to 1, or empirically via the trapezoidal rule over sorted predictions, as given in Equation (12) [52].
    $\text{ROC-AUC} = \int_{0}^{1} \mathrm{TPR}(\mathrm{FPR}) \, d\mathrm{FPR}$
  • PR–AUC: The Area Under the Precision–Recall Curve, particularly informative for imbalanced datasets, where precision–recall trade-offs are more meaningful than ROC–AUC. It is computed as the integral of Precision against Recall from 0 to 1, or empirically via the trapezoidal rule, as given in Equation (13) [53].
    $\text{PR-AUC} = \int_{0}^{1} \mathrm{Precision}(\mathrm{Recall}) \, d\mathrm{Recall}$
  • Confusion Matrix: A tabular summary of true positives, false positives, true negatives, and false negatives, enabling detailed error analysis. It is represented as shown in Table 4 [51].
Among these, ROC–AUC was used as the primary optimization criterion, while the additional metrics ensured a holistic assessment of classifier behavior.
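These definitions translate directly into code. A dependency-free sketch follows (sklearn.metrics provides production equivalents); ROC-AUC is computed here via its rank-statistic interpretation rather than the trapezoidal rule:

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def precision_recall_f1(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def roc_auc(y_true, scores):
    """ROC-AUC as the Mann-Whitney statistic: the probability that a random
    fraud case scores higher than a random non-fraud case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```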

4.4.7. Model Interpretability with SHAP

To ensure transparency of the predictive models, we employed SHAP [43], a unified game-theoretic framework for interpreting machine learning predictions. SHAP builds on Shapley values from cooperative game theory, which allocate the payout of a game fairly among players based on their marginal contributions. In the context of machine learning, the “players” are input features, and the “payout” is the model prediction.
Formally, for a model $f: \mathbb{R}^{M} \to \mathbb{R}$ with M input features, the Shapley value $\phi_i$ of a feature i is defined in Equation (14):
$\phi_i(f, x) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \, (M - |S| - 1)!}{M!} \left[ f_{S \cup \{i\}}(x_{S \cup \{i\}}) - f_S(x_S) \right]$,
where $N = \{1, 2, \ldots, M\}$ is the set of all features, S is a subset of features excluding i, $f_S$ denotes the model restricted to feature set S, and $x_S$ is the corresponding feature vector. This definition ensures the following desirable properties:
  • Efficiency: The feature attributions sum to the difference between the model output and the baseline prediction, as shown in Equation (15):
    $f(x) = f(x') + \sum_{i=1}^{M} \phi_i$,
    where $x'$ is a reference (baseline) input.
  • Symmetry: If two features contribute equally in all coalitions, they receive identical Shapley values.
  • Dummy: If a feature does not affect the prediction in any coalition, its Shapley value is zero.
  • Additivity: Explanations for combined models are additive over those models.
In practice, exact computation of Shapley values is intractable for large M due to the exponential number of subsets. SHAP leverages efficient approximations and model-specific algorithms (e.g., TreeSHAP for decision-tree-based models such as XGBoost) to provide computationally feasible explanations.
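To make the combinatorics concrete, here is an exact (exponential-time) Shapley computation for a toy model with a handful of features, where "removing" a feature means fixing it at its baseline value; this is precisely the cost that TreeSHAP's polynomial-time algorithm avoids:

```python
from itertools import combinations
from math import factorial

def exact_shapley(f, x, baseline):
    """Exact Shapley values for f over M features. Features outside the
    coalition S are held at their baseline values; exponential in M, so
    only viable for small feature counts."""
    M = len(x)
    phi = [0.0] * M

    def eval_subset(S):
        z = [x[j] if j in S else baseline[j] for j in range(M)]
        return f(z)

    for i in range(M):
        others = [j for j in range(M) if j != i]
        for size in range(M):
            for S in combinations(others, size):
                # Shapley weight |S|!(M-|S|-1)!/M! for this coalition size.
                w = factorial(size) * factorial(M - size - 1) / factorial(M)
                phi[i] += w * (eval_subset(set(S) | {i}) - eval_subset(set(S)))
    return phi
```

For an additive model the attributions recover the coefficients exactly, and their sum always equals $f(x) - f(x')$ (the efficiency property).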
We applied SHAP at both the global and local interpretability levels:
  • Global Analysis: SHAP summary plots aggregate the absolute values of ϕ i across all samples, ranking features according to their average contribution to predictions. This provided insight into the most influential indicators of fraudulent activity across the dataset, facilitating feature importance assessment beyond conventional impurity or coefficient-based measures.
  • Local Analysis: For individual call records, SHAP force and waterfall plots decomposed the prediction into additive feature contributions. Specifically, each prediction $f(x)$ was expressed as in Equation (16):
    $f(x) = f(x') + \sum_{i=1}^{M} \phi_i(x)$,
    where $\phi_i(x)$ quantified how much feature i pushed the prediction towards the fraudulent or non-fraudulent class. This local interpretability is particularly valuable for fraud analysts, who require case-by-case justification of model outputs.
By integrating SHAP into the modeling pipeline, we ensured that the system combined predictive performance with interpretability. This balance is essential not only for operational trust among fraud management teams but also for compliance with regulatory requirements in the telecommunications sector, where algorithmic decision-making must remain transparent and accountable.

5. Experimental Results

In this section, we present a comprehensive empirical evaluation of several machine learning models developed for fraud detection using the label_unique_calls_and_callduration dataset. This dataset captures key features related to call patterns, such as the number of unique calls and call durations, which are indicative of Wangiri fraud. The evaluation covers four different sampling strategies to handle the severe class imbalance typical of fraud detection scenarios: training on the full dataset without sampling, SMOTE for oversampling the minority class, a hybrid approach combining SMOTE with Random Undersampling (RUS), and RUS alone to reduce the majority class.
For each sampling strategy, we trained five distinct classification algorithms: Logistic Regression, Random Forest, XGBoost, Decision Tree, and MLP. These models were selected to represent a range of complexities, from simple linear classifiers to advanced ensemble and neural network methods. Additionally, each model was assessed in its base form (without probability calibration) and with two calibration techniques: isotonic regression and sigmoid (Platt scaling). Calibration is important in fraud detection to ensure that predicted probabilities accurately reflect true likelihoods, which can be crucial for threshold-based decision-making.
Performance was measured on two separate test sets to provide a robust assessment. The raw distribution test set mirrors the real-world imbalance, where fraud cases are rare, making it challenging for models to detect them without generating many false positives. The balanced test set, with an equal number of fraud and non-fraud instances, allows us to evaluate the model’s discriminative power in a more equitable setting. Key metrics include accuracy, macro-averaged precision, recall, and F1-score, as well as ROC-AUC and PR-AUC. The PR-AUC is particularly valuable for imbalanced data, as it focuses on the performance in detecting the positive (fraud) class.
The following subsections each focus on one classification algorithm. To facilitate comparison, the structure is consistent across subsections: a detailed discussion of the model’s overall performance, insights into how different sampling and calibration strategies affected results, a table summarizing all metrics, and visualizations including confusion matrices, ROC curves, PR curves, and calibration curves for the best-performing configuration (typically SMOTE + RUS without calibration). These visualizations help illustrate the model’s behavior and feature importance. Through this analysis, we aim to identify the most effective approaches for detecting Wangiri fraud and understand the underlying patterns driving model predictions.

5.1. Logistic Regression

Logistic Regression serves as a foundational linear classifier in our study, offering simplicity, interpretability, and low computational cost. We found that this model provides valuable insights into the linear separability of the data but often falls short in capturing the nuanced, non-linear relationships inherent in fraud detection tasks. Consequently, its performance was generally lower compared to more advanced models, particularly on the imbalanced raw test set where detecting rare fraud events is critical.
When trained on the full dataset without sampling, the base Logistic Regression model exhibited high accuracy on the raw test set (0.9992) due to the imbalance: it effectively predicted the majority non-fraud class but failed to identify fraud, as evidenced by the extremely low PR-AUC of 0.0063. The ROC-AUC was 0.8477, which appears reasonable but is inflated by the imbalance. On the balanced test set, the model performed no better than chance, with an accuracy of 0.5000. Applying isotonic calibration slightly improved the raw PR-AUC to 0.0068, but sigmoid calibration showed no change, indicating limited benefits from calibration in this setup.
Sampling strategies markedly enhanced the model’s ability to detect fraud. For instance, using SMOTE, SMOTE + RUS, or RUS, the base model achieved a PR-AUC around 0.0727–0.0729 on the raw test set, representing over a tenfold improvement. This enhancement stems from the sampling techniques exposing the model to more fraud examples during training, improving recall for the minority class. On the balanced test set, accuracy soared to approximately 0.9946, demonstrating strong classification when classes are equal. However, calibration on sampled data often led to detrimental effects; both isotonic and sigmoid methods resulted in balanced accuracies of 0.5000, suggesting that calibration over-adjusted the probabilities, making the model indecisive.
In summary, while Logistic Regression benefits significantly from sampling to handle imbalance, its linear assumptions limit its effectiveness for this complex task. It should be considered as a baseline, but for practical deployment, more sophisticated models are recommended. The detailed metrics are provided in Table 5, highlighting the trade-offs across configurations.
Figure 14, Figure 15 and Figure 16 provide a consolidated view of Logistic Regression under the best-performing setting (SMOTE + RUS, no calibration) on both the naturally imbalanced and the balanced evaluations. The confusion matrices in Figure 14 make the contrast clear: on the raw distribution (left), the classifier captures almost all positives (TP = 1853, FN = 7) but, because negatives dominate, even a small false-positive rate yields a large absolute FP count (15,433) against roughly 2.3 × 10^6 true negatives; on the balanced set (right), errors are far more symmetric (TN = 1847, FP = 13, FN = 7, TP = 1853). The ROC panels in Figure 15 and Figure 16 show near-perfect ranking ability (AUC ≈ 0.995) across both tests; however, the precision–recall curves surface the effect of prevalence: with the raw class mix, average precision is low (AP ≈ 0.073) despite excellent separability, whereas on the balanced set the curve hugs the upper-right region (AP ≈ 0.981). The reliability diagrams (rightmost panels) further indicate miscalibration on the raw distribution: probabilities deviate from the diagonal, particularly at higher score bins, while the balanced evaluation appears better aligned yet still exhibits departures at the extremes. Overall, these plots suggest that this configuration delivers extremely high recall with very few missed positives, but practical deployment under natural imbalance will require threshold tuning and probability calibration to rein in false alarms and improve decision quality.

5.2. Random Forest

The Random Forest model, an ensemble of decision trees, excelled in our experiments by effectively handling non-linear relationships and interactions between features. We observed that this model’s strength lies in its ability to reduce overfitting through averaging multiple trees, making it particularly suitable for imbalanced datasets like ours. It demonstrated robust performance across most configurations, with high PR-AUC values indicating strong fraud detection capabilities.
Training on the full dataset yielded impressive results for the base model, with a PR-AUC of 0.9943 on the raw test set and an accuracy of 0.9817 on the balanced set. This suggests that the model can learn effective patterns even without sampling, thanks to its ensemble nature. Sampling shifted this trade-off: the SMOTE + RUS strategy achieved the best raw PR-AUC among the sampled configurations (0.9841), slightly below the full-dataset result, while improving balanced accuracy to 0.9927. The SMOTE approach was similar, with a PR-AUC of 0.9836. However, RUS alone led to a sharp decline in raw PR-AUC to 0.6461, likely because aggressive undersampling removed too much information from the majority class, impairing generalization to the imbalanced test set.
Calibration had minimal positive impact and sometimes degraded performance, such as with isotonic calibration on sampled data, which reduced balanced accuracy. This indicates that the base Random Forest already produces well-calibrated probabilities. Overall, the Random Forest with hybrid sampling emerges as a highly reliable choice for fraud detection, balancing high precision and recall. The metrics are detailed in Table 6.
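The dual-test evaluation protocol used throughout this section can be sketched as follows, with a synthetic stand-in for the CDR features (prevalence, feature count, and hyperparameters are illustrative, not the paper's): train a Random Forest on imbalanced data, score PR-AUC on the raw test split, and accuracy on a balanced subsample of that split.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the CDR features with ~1% "fraud" prevalence.
X, y = make_classification(n_samples=20000, n_features=8, weights=[0.99],
                           class_sep=1.5, random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=7)

rf = RandomForestClassifier(n_estimators=200, random_state=7).fit(X_tr, y_tr)
# Raw-distribution view: PR-AUC on the naturally imbalanced test split.
raw_pr_auc = average_precision_score(y_te, rf.predict_proba(X_te)[:, 1])

# Balanced view: all test positives plus an equal-sized random negative sample.
pos = np.flatnonzero(y_te == 1)
neg = np.random.default_rng(7).choice(np.flatnonzero(y_te == 0),
                                      size=len(pos), replace=False)
bal = np.concatenate([pos, neg])
bal_acc = rf.score(X_te[bal], y_te[bal])
```
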
Figure 17, Figure 18 and Figure 19 summarize the behavior of the Random Forest under the best-performing setting (SMOTE + RUS, no calibration) for both the naturally imbalanced and the balanced evaluations. The confusion matrices in Figure 17 show that on the raw distribution (left) the classifier achieves an extremely low false-positive burden relative to the vast negative base (TN ≈ 2.3 × 10^6, FP = 490) while missing only a small number of positives (FN = 26, TP = 1834); on the balanced set (right) the errors shrink further and are almost symmetric (TN = 1859, FP = 1, FN = 26, TP = 1834). The ROC curves in Figure 18 and Figure 19 indicate virtually perfect ranking (AUC ≈ 1.000) across both test conditions, and the precision–recall plots confirm consistently high precision over most recall levels: on the raw test set the average precision is already very high (AP ≈ 0.984), rising to nearly ideal on the balanced set (AP ≈ 1.000). The reliability diagrams (rightmost panels) reveal that, under natural class imbalance, probability estimates tend to be miscalibrated (mid-range scores underpredict while high scores can overpredict relative to observed frequencies), whereas the balanced evaluation tracks the diagonal closely with only minor departures at extreme bins. Overall, these results indicate that the Random Forest maintains excellent recall with remarkably few false alarms, offering a substantially improved operating profile under class imbalance; nevertheless, modest probability calibration or threshold tuning could further align scores with empirical risk before deployment.
The visualizations for the best-performing configuration (SMOTE + RUS, no calibration) are shown in Figure 17, Figure 18 and Figure 19.

5.3. XGBoost

XGBoost, a scalable gradient boosting algorithm, stood out in our research for its efficiency and high accuracy, leveraging sequential tree building to minimize errors. We noted that this model is especially effective for structured data like ours, where it can exploit feature interactions and handle missing values inherently. Its performance was strong, but sensitive to the training data distribution, requiring careful selection of sampling strategies to optimize for both test sets.
The base model trained on the full dataset achieved the highest PR-AUC of 0.8836 on the raw test set, with a balanced accuracy of 0.8745. This indicates excellent generalization without sampling, likely due to XGBoost’s regularization parameters preventing overfitting. Isotonic calibration improved balanced accuracy to 0.9172, showing benefits in refining probability estimates. With SMOTE, the raw PR-AUC was 0.8485, and for SMOTE + RUS, it was 0.8734, with improved balanced accuracy of 0.8895. The RUS strategy excelled on the balanced test set (accuracy 0.9981) but underperformed on raw data (PR-AUC 0.6024), highlighting a trade-off where undersampling enhances balanced performance but reduces robustness to imbalance.
Calibration effects varied; for RUS, it caused significant degradation on the balanced set, possibly because the calibrated model lost its discriminative power. In general, XGBoost with full or hybrid sampling offers a balanced and powerful solution for fraud detection.
The metrics are detailed in Table 7. For interpretability, we generated SHAP plots, including bar, summary, and waterfall visualizations. The bar plot ranks feature importance by mean SHAP value, the summary plot shows the distribution of impacts, and the waterfall plot details a single instance's prediction breakdown. These consistently highlight callduration and unique_calls_last_day as key drivers.
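Where the SHAP library is not available, permutation importance offers a lighter-weight, model-agnostic cross-check of the same ranking question. The sketch below is a swapped-in technique, not the paper's SHAP analysis: it uses a toy labeling rule and hypothetical feature names mirroring the paper's top drivers.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 4000
# Hypothetical features named after the paper's top SHAP drivers, plus pure noise.
callduration = rng.exponential(60.0, n)                  # seconds
unique_calls_last_day = rng.poisson(3, n).astype(float)
noise = rng.normal(size=n)
X = np.column_stack([callduration, unique_calls_last_day, noise])
# Toy labeling rule (short ring + bursty dialing); NOT the paper's labels.
y = ((callduration < 10) & (unique_calls_last_day > 5)).astype(int)

clf = GradientBoostingClassifier(random_state=0).fit(X, y)
result = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
# Expect the two behavioral features to dominate and the noise column to rank last.
ranking = np.argsort(result.importances_mean)[::-1]
```
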
Figure 20, Figure 21 and Figure 22, together with the SHAP diagnostics in Figure 23, Figure 24 and Figure 25, characterize the XGBoost model under the best-performing setting (SMOTE + RUS, no calibration). The confusion matrices in Figure 20 show that, on the raw distribution (left), the classifier recovers most positives (TP = 1450, FN = 410) while keeping the false-positive count comparatively low given the vast negative base (TN ≈ 2.3 × 10^6, FP = 380); on the balanced set (right) the error profile is near-symmetric with only a single false positive (TN = 1859, FP = 1, FN = 410, TP = 1450). Consistent with this, the ROC curves indicate essentially perfect ranking (AUC ≈ 1.000) across both evaluations, yet the precision–recall plots reveal the expected prevalence effect: average precision is strong but moderated on the raw set (AP ≈ 0.873) and rises to near-perfect on the balanced set (AP ≈ 0.999). The calibration plots further suggest that probabilities on the raw distribution are only approximately calibrated, deviating from the diagonal at mid-to-high score bins, whereas the balanced evaluation aligns closely with the ideal reliability line. The SHAP bar and summary plots identify unique_calls_last_day as the dominant driver, followed by callduration, with the remaining features (e.g., opc, callercc, hour_caller, acm_time, and is_international) contributing smaller but meaningful effects. The instance-level waterfall (Figure 25) illustrates how large negative contributions from the leading features push the log-odds toward the non-fraud class for a representative case, with only minor positive offsets from other variables, underscoring the model's reliance on recent calling behavior and call duration while highlighting the benefits of threshold tuning and probability calibration under natural class imbalance.
The visualizations for the best-performing configuration (SMOTE + RUS, no calibration) are shown in Figure 20, Figure 21 and Figure 22.

5.4. Decision Tree

The Decision Tree model, prized for its interpretability and ability to generate rule-based decisions, was evaluated to understand how a simple tree structure performs on this dataset. In our analysis, we found that while prone to overfitting, especially on imbalanced data, it can achieve high performance with appropriate training strategies, offering clear insights into decision paths.
The base model on the full dataset delivered outstanding results, with a raw PR-AUC of 0.9793 and balanced accuracy of 0.9970. This high performance suggests that the features are highly discriminative, allowing a single tree to capture key thresholds for fraud detection. Sampling with SMOTE or SMOTE + RUS produced identical results, with a raw PR-AUC of 0.9316 and balanced accuracy of 0.9901, slightly lower but still strong. The RUS strategy, however, resulted in overfitting, yielding a low raw PR-AUC of 0.4774 despite a high balanced accuracy of 0.9989.
Calibration had no effect on models trained with full, SMOTE, or SMOTE + RUS data, as the tree’s probabilities are already binary-like. For RUS, it caused the balanced accuracy to drop to 0.5000, indicating instability. Overall, the Decision Tree is a viable option for interpretable fraud detection, particularly without sampling. The metrics are detailed in Table 8.
Figure 26, Figure 27 and Figure 28 present the Decision Tree under the best-performing configuration (SMOTE + RUS, no calibration) across both the naturally imbalanced and balanced evaluations. The confusion matrices in Figure 26 show that, on the raw distribution (left), the model attains very high recall with few misses (TP = 1823, FN = 37) while maintaining a very small false-positive burden relative to the vast negative base (TN ≈ 2.3 × 10^6, FP = 95). On the balanced test set (right) the specificity becomes perfect with no false positives (TN = 1860, FP = 0) and the same small number of missed positives (FN = 37, TP = 1823), yielding an error profile that is clean and symmetric. Consistent with these counts, the ROC panels indicate strong ranking ability in both settings (AUC ≈ 0.990), while the precision–recall curves emphasize that precision stays high across most recall levels: on the raw set the reported average precision is still excellent (AP ≈ 0.982) and increases on the balanced set (AP ≈ 0.990). The reliability diagrams (rightmost panels) are particularly encouraging: the raw-distribution probabilities already track the diagonal closely and the balanced evaluation is essentially perfectly calibrated, suggesting that the Decision Tree's scores can be interpreted as well-calibrated probabilities with minimal post-processing. Overall, these figures indicate a model that combines high recall with extremely low false positives and near-ideal calibration, features that are attractive for operational deployment under both natural class imbalance and balanced evaluation.

5.5. MLP

The MLP, a feedforward neural network, was included to explore the potential of deep learning in modeling complex, non-linear patterns in call data. From our perspective, MLP’s flexibility comes at the cost of higher computational demands and less interpretability, but it can excel with sufficient data and tuning. Its performance was solid, particularly when trained with ample examples.
On the full dataset, the base MLP achieved a raw PR-AUC of 0.8816 and balanced accuracy of 0.8175, indicating good but not top-tier results. Sampling greatly improved balanced performance; SMOTE + RUS yielded a raw PR-AUC of 0.8476 and balanced accuracy of 0.9973, showcasing the network’s ability to learn from augmented data. RUS alone boosted balanced accuracy to 0.9973 but severely hampered raw performance (PR-AUC 0.3102), suggesting overfitting to the undersampled distribution.
Calibration provided minor improvements in some cases, like isotonic on full data increasing balanced accuracy to 0.8196, but often worsened results for RUS models. This sensitivity underscores the importance of careful hyperparameter tuning for neural networks in imbalanced settings. The metrics are outlined in Table 9.
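A minimal MLP baseline in scikit-learn is sketched below, with feature standardization in the pipeline (scaling matters far more for neural networks than for the tree ensembles discussed above). The dataset, layer sizes, and iteration budget are illustrative placeholders, not the paper's tuned configuration.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder imbalanced data standing in for the CDR features (~10% positives).
X, y = make_classification(n_samples=3000, weights=[0.9], random_state=3)

mlp = make_pipeline(
    StandardScaler(),                                   # scale before the network
    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=3),
)
mlp.fit(X, y)
proba = mlp.predict_proba(X)[:, 1]   # fraud-class probabilities
```
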
Figure 29, Figure 30 and Figure 31 describe the behavior of the MLP under the best-performing setting (SMOTE + RUS, no calibration) for both the naturally imbalanced and balanced evaluations. The confusion matrices in Figure 29 show that on the raw distribution (left) the network attains near-perfect sensitivity (TP = 1852, FN = 8), but, because negatives vastly outnumber positives, an otherwise small false-positive rate still yields a large absolute FP count (1582) against roughly 2.3 × 10^6 true negatives; on the balanced set (right) errors shrink dramatically (TN = 1858, FP = 2, FN = 8, TP = 1852). The ROC panels in Figure 30 and Figure 31 indicate essentially perfect ranking (AUC ≈ 1.000) across both settings, yet the precision–recall plots make the effect of prevalence explicit: on the raw test set, precision gradually declines as recall increases, yielding a moderate average precision (AP ≈ 0.848), whereas on the balanced set the curve remains tightly near the upper-right corner (AP ≈ 1.000). The reliability diagrams (rightmost panels) further reveal that the raw-distribution probabilities are poorly calibrated (most mid-range scores underpredict the observed positive rate, with a sharp jump only at the highest bin), while the balanced evaluation adheres closely to the diagonal with mild underconfidence in the mid-probability region. Overall, these figures suggest that the MLP delivers extremely high recall with very few missed positives, but practical use under natural class imbalance will benefit from probability calibration and threshold tuning to control the absolute false-positive burden.

5.6. Summary of Best Model Performances

To provide a concise overview, we report the best configuration for each model on both the raw distribution (imbalanced) and balanced test sets. For the raw distribution, as Table 10 depicts, we select the setting with the highest PR-AUC, as this metric focuses on performance for the positive (fraud) class and is particularly informative under heavy class imbalance. For the balanced test set, we select the setting with the highest macro-averaged F1 score, which gives equal weight to precision and recall across classes and thus offers an equitable summary when class supports are matched.
As Table 11 illustrates, this study evaluated five classifiers and four sampling strategies for Wangiri fraud detection under extreme class imbalance. The central pattern is clear: tree-based ensembles dominated across settings, particularly on the imbalanced (“raw”) test set, where Random Forest trained on the full data without calibration achieved the highest PR-AUC (0.9943), with Decision Tree close behind (0.9793). When evaluated on a balanced test set, nearly all models performed strongly, but ensembles again led, with RUS-trained Random Forest, XGBoost, and Decision Tree reaching macro-F1 scores of 0.9984, 0.9981, and 0.9989, respectively. Logistic Regression required sampling to be competitive (best balanced macro-F1 ≈ 0.9946 with SMOTE/SMOTE + RUS), and the MLP, while improved by sampling, trailed the ensembles on the raw distribution (best raw PR-AUC ≈ 0.882).
The results are most parsimoniously explained by how models exploit abundant negatives under skew and capture non-linear structure. The selected models are explicitly designed to leverage the characteristics of highly skewed telecom data while capturing complex fraud patterns. Wangiri fraud detection benefits from the abundance of negative (legitimate) samples, as accurately modeling normal calling behavior is essential for distinguishing fraudulent activity. Tree-based ensemble methods, particularly Random Forest and XGBoost, exploit this setting by learning hierarchical, non-linear decision rules that naturally incorporate interactions among heterogeneous features, including categorical and behavioral attributes. Unlike linear or distance-based methods, these models do not assume simple feature relationships and are therefore better suited to the structured and relational nature of call detail records. Moreover, their ensemble formulation improves robustness under severe class imbalance by reducing variance and limiting bias toward the majority class. To further assess this robustness, we applied SMOTE-based rebalancing and observed consistent performance, confirming that the models’ effectiveness is not solely driven by class distribution. These properties explain why XGBoost and Random Forest achieved the best and most stable results in our experiments and justify their selection for Wangiri fraud detection under realistic, imbalanced conditions.
Training on the full dataset preserves informative majority-class variation; bagging and boosted trees then recover interactions and thresholds that translate into superior precision at high recall on raw traffic. Sampling rebalances exposure to the minority class and is essential for linear and neural baselines, but aggressive undersampling discards too much negative evidence and depresses raw PR-AUC for ensembles (e.g., Random Forest 0.6461, XGBoost 0.6024, Decision Tree 0.4774, MLP 0.3102). Probability calibration was largely unnecessary once ensembles were well trained; it occasionally helped Logistic Regression after sampling (raising raw PR-AUC from ≈0.006 to ≈0.114) but tended to over-flatten scores elsewhere. Model explanations reinforce domain plausibility: SHAP analyses for XGBoost consistently identified call duration and recent unique call counts as primary drivers, matching Wangiri’s short-ring, high-entropy calling behavior.
These findings have direct operational implications. For production detection on live, imbalanced streams, a Random Forest trained on the full dataset without calibration offers the best ranking quality and stable probability estimates, allowing decision thresholds to be set from PR curves to reflect business costs of missed fraud versus false alarms. Where evaluation or analyst triage relies on class balance, RUS can be used to sharpen separation during training, but practitioners should anticipate reduced robustness when the model is deployed on raw distributions. Feature engineering should prioritize temporal burstiness and duration dynamics; monitoring these features for drift, and periodically re-tuning thresholds as base rates shift, will likely yield larger gains than further probability calibration. Given the near-ceiling ROC-AUC of the ensembles, downstream investments should focus on threshold optimization, alert routing, and case-management integration rather than more complex model architectures.
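The recommended threshold-setting step can be sketched as a grid search over candidate thresholds that minimizes expected business cost. The per-event costs and the synthetic labels and scores below are hypothetical; in deployment the costs would come from the operator's loss model and the scores from the trained ensemble on held-out traffic.

```python
import numpy as np

# Hypothetical asymmetric costs: a missed fraud is much costlier than a false alarm.
C_FN, C_FP = 50.0, 1.0

rng = np.random.default_rng(1)
y_true = (rng.random(50000) < 0.001).astype(int)         # synthetic raw-traffic labels
scores = np.clip(y_true * 0.7 + rng.normal(0.15, 0.12, 50000), 0, 1)

# Expected cost of each candidate threshold on held-out data.
thresholds = np.linspace(0.0, 1.0, 101)
costs = np.array([
    C_FN * np.sum((scores < t) & (y_true == 1)) +        # missed frauds
    C_FP * np.sum((scores >= t) & (y_true == 0))         # false alarms
    for t in thresholds
])
best_threshold = thresholds[int(np.argmin(costs))]
```

Because the cost curve shifts whenever the fraud base rate drifts, this search should be re-run periodically on fresh data, as argued above.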
Several constraints temper generalization and motivate next steps. The experiments reflect a single feature set and distribution snapshot; real networks evolve, adversaries adapt, and routes differ across carriers and time. Future work should adopt time-ordered validation and rolling retraining, incorporate cost-sensitive objectives or focal losses aligned with financial impact, and extend interpretability beyond boosting (e.g., global rule extraction or surrogate trees for deployed models). Robustness should be stress-tested under simulated base-rate shifts and traffic surges, and decision thresholds should be re-optimized on fresh data. Finally, a two-stage pipeline, fast ensemble screening followed by targeted secondary checks, may preserve high precision during spikes while keeping latency acceptable in production.

5.7. Comparison with State-of-the-Art

The comparative review in Table 12 situates this research within prior Wangiri fraud detection work. Previous studies such as Arafat et al. [27] and Ravi et al. [30] demonstrated the effectiveness of ensemble and supervised learning, achieving accuracies close to 99% and F1 scores around 0.96–0.97 on labeled data. However, their reliance on fully labeled datasets and evaluation under moderate imbalance limits real-world applicability, where fraud events are rare and labeling is expensive.
Mundia et al. [39] examined Wangiri fraud from a policy perspective, focusing on operational gaps such as manual detection, limited automation, and concept drift. These findings underscore the need for adaptive ML systems capable of handling evolving unlabeled data streams.
To further validate the superiority of the proposed model, we include a comparison against foundational baselines such as Sahin et al. [26], who established early benchmarks using standard Decision Trees with ∼89% accuracy. Our Optimized Ensemble Framework significantly outperforms these traditional approaches, not only in raw metrics (ROC-AUC > 0.99) but also in its ability to handle the specific “one ring and cut” signature that general fraud models often miss. By integrating specific behavioral feature engineering with the unsupervised labeling heuristic, our approach demonstrates a clear performance advantage over both the static rule-based reviews found in Sahaidak et al. [42] and the standard supervised learning methods referenced in the literature.
Addressing these challenges, the proposed framework introduces a deployable and interpretable ML pipeline that manages unlabeled and highly imbalanced data using resampling (SMOTE, RUS), pseudo-labeling, and SHAP-based explanations. As summarized in Table 12, it advances both methodological robustness and practical applicability, achieving PR-AUC = 0.9943 and F1 = 0.998 with real-time readiness.

6. Conclusions

This study presents a robust machine learning framework for detecting Wangiri fraud in telecommunications networks, addressing key challenges such as unlabeled data, severe class imbalance, and real-time processing constraints. By combining unsupervised labeling techniques, advanced feature engineering, and ensemble learning models (Random Forest and XGBoost), our approach achieves high detection accuracy while maintaining interpretability, a critical requirement for operational deployment. The evaluation of various sampling strategies highlights the trade-offs between precision and recall, emphasizing the need for tailored imbalance-handling techniques in fraud detection. The success of ensemble methods underscores their superiority in capturing complex fraud patterns compared to traditional models like Logistic Regression or MLPs.
Central to the efficacy of this framework is the rigorous feature engineering pipeline, which successfully transformed raw, unlabeled signaling data into highly predictive behavioral indicators. Our analysis identified specific correlations that distinguish fraudulent activity, most notably the high frequency of unique_calls_last_day combined with specific temporal patterns in acm_time and cpg_time. By isolating these key features from the initial raw attributes, the proposed model effectively captures the “one ring and cut” signature of Wangiri fraud—specifically the burst-pattern dialing and short-duration connection attempts that rule-based systems often miss.
Furthermore, this study advances beyond standard baseline implementations by integrating an optimized ensemble architecture with specialized imbalance handling strategies. Through the application of GridSearchCV for hyperparameter optimization and the strategic use of SMOTE combined with random undersampling, the proposed XGBoost and Random Forest models achieved near-perfect discrimination on balanced test sets (ROC-AUC > 0.99 ). This validates that the “Proposed Model” is not simply a comparison of classifiers, but a hybrid, imbalance-aware decision framework designed for the specific constraints of the telecom domain.

6.1. Limitations of the Study

Despite the high performance of the proposed framework, several limitations must be acknowledged.
  • First, the supervised labels were generated using a rule-based heuristic approach. While effective for this specific dataset, this “weak supervision” implies that the model’s upper performance bound is tied to the quality of the initial heuristics; sophisticated fraud patterns that mimic normal call durations (e.g., >10 s) might evade detection.
  • Second, the dataset represents a specific snapshot of telecommunications traffic. Wangiri fraud is highly dynamic, and fraudsters frequently change originating country codes and carrier routes (Concept Drift). A model trained on static historical data may degrade over time without continuous retraining.
  • Finally, our feature engineering focused heavily on call metadata; we did not utilize voice content or SMS text data due to privacy constraints, which limits the model’s ability to detect multi-modal fraud schemes.
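The first limitation can be made concrete with a toy version of a rule-based pseudo-labeling heuristic. The thresholds and feature names below are illustrative, not the paper's actual rules; the point is that any fraud pattern keeping call durations above the duration cutoff evades the rule entirely, which caps the supervised model's recall.

```python
def heuristic_label(callduration_s, unique_calls_last_day, is_international):
    """Toy weak-supervision rule; thresholds are illustrative, not the paper's."""
    short_ring = callduration_s < 10          # "one ring and cut" signature
    bursty = unique_calls_last_day > 50       # high-entropy outbound dialing
    return int(short_ring and bursty and is_international)

# A burst of short international rings is flagged as fraud ...
flagged = heuristic_label(3, 120, True)       # -> 1
# ... but an attacker who holds calls past the duration cutoff slips through,
# illustrating the upper bound that weak supervision places on the model.
evades = heuristic_label(30, 120, True)       # -> 0
```
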

6.2. Future Works

Future work will focus on integrating Cost-Sensitive Learning to financially quantify the trade-off between false positives and false negatives. Additionally, we aim to explore Graph Neural Networks (GNNs) to capture the complex inter-user relationships inherent in telecom networks that tabular models might miss. Finally, stress-testing the pipeline under simulated real-time traffic surges will be crucial to verify latency constraints before full-scale deployment. Ultimately, this research provides telecom operators with a scalable, interpretable, and high-performance solution to combat Wangiri fraud, safeguarding both revenue and customer trust in an evolving threat landscape.

Author Contributions

Conceptualization, M.A. and A.B. (Amirreza Balouchi); methodology, M.A., A.B. (Amirreza Balouchi), A.E., K.K.P.K., E.M., and N.A.; software, A.B. (Amirreza Balouchi) and A.E.; validation, A.B. (Amirreza Balouchi), A.E., and M.A.; formal analysis, A.B. (Amirreza Balouchi) and M.A.; investigation, M.A. and A.B. (Amirreza Balouchi); resources, M.A., A.B. (Amirreza Balouchi), and A.E.; data curation, A.B. (Amirreza Balouchi), A.E., and K.K.P.K.; writing—original draft preparation, M.A. and A.B. (Amirreza Balouchi); writing—review and editing, M.A.; visualization, A.B. (Amirreza Balouchi) and M.A.; supervision, M.A.; project administration, A.B. (Amirali Baniasadi); funding acquisition, A.B. (Amirali Baniasadi). All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. (The data are not publicly available due to privacy or ethical restrictions.)

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ACM: Address Complete Message
AI: Artificial Intelligence
CDR: Call Detail Record
CLI: Calling Line Identification
CPG: Call Progress Message
DT: Decision Tree
EDA: Exploratory Data Analysis
FN: False Negatives
FP: False Positives
GNN: Graph Neural Networks
IoT: Internet of Things
IRSF: International Revenue Share Fraud
LLM: Large Language Model
LR: Logistic Regression
ML: Machine Learning
MLP: Multi-Layer Perceptron
NFA: Neural Factorization Autoencoder
QoE: Quality of Experience
RF: Random Forest
RAG: Retrieval Augmented Generation
ROC-AUC: Receiver Operating Characteristic Area Under the Curve
SHAP: SHapley Additive exPlanations
SIM: Subscriber Identity Module
SMOTE: Synthetic Minority Over-sampling Technique
SON: Self-Organizing Networks
SVM: Support Vector Machine
TFD-FA: Telecom Fraud Detection model based on Feature Binning and Autoencoder
TP: True Positives
TN: True Negatives
UTC: Coordinated Universal Time
XAI: Explainable Artificial Intelligence
XGBoost: Extreme Gradient Boosting

References

  1. Silitonga, J.L. A Review of AI-Driven Predictive Maintenance in Telecommunications. Int. J. Inf. Syst. Innov. Technol. 2024, 3, 25–31. [Google Scholar] [CrossRef]
  2. Yang, Y.; Yang, S.; Zhao, C.; Xu, Z. TelOps: AI-driven operations and maintenance for telecommunication networks. IEEE Commun. Mag. 2023, 62, 104–110. [Google Scholar] [CrossRef]
  3. Aziz, Z.; Bestak, R. Insight into anomaly detection and prediction and mobile network security enhancement leveraging k-means clustering on call detail records. Sensors 2024, 24, 1716. [Google Scholar] [CrossRef]
  4. Amin, F.; Choi, G.S. Analysis and Modeling of Mobile Phone Activity Data Using Interactive Cyber-Physical Social System. Comput. Mater. Contin. 2024, 80, 3507. [Google Scholar] [CrossRef]
  5. Abdollahi, M.; Mashhadi, S.; Sabzalizadeh, R.; Mirzaei, A.; Elahi, M.; Baharloo, M.; Baniasadi, A. IODnet: Indoor/Outdoor Telecommunication Signal Detection through Deep Neural Network. In Proceedings of the 2023 IEEE 16th International Symposium on Embedded Multicore/Many-Core Systems-on-Chip (MCSoC), Singapore, 18–21 December 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 134–141. [Google Scholar]
  6. Zahid, H.; Mahmood, T.; Morshed, A.; Sellis, T. Big data analytics in telecommunications: Literature review and architecture recommendations. IEEE/CAA J. Autom. Sin. 2019, 7, 18–38. [Google Scholar] [CrossRef]
  7. Farabi, S. AI-Driven Predictive Maintenance Model for DWDM Systems to Enhance Fiber Network Uptime in Underserved US Regions. Preprints 2025. [Google Scholar] [CrossRef]
  8. Singh, P. Streamlining telecom customer support with AI-enhanced IVR and chat. Preprints 2025. [Google Scholar] [CrossRef]
  9. Chegini, M.; Abdollahi, M.; Baniasadi, A.; Patooghy, A. Tiny-RFNet: Enabling Modulation Classification of Radio Signals on Edge Systems. In Proceedings of the 2024 5th CPSSI International Symposium on Cyber-Physical Systems (Applications and Theory) (CPSAT), Tehran, Iran, 16–17 October 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–8. [Google Scholar]
  10. Abdollahi, M.; Sabzalizadeh, R.; Javadinia, S.; Mashhadi, S.; Mehrizi, S.S.; Baniasadi, A. Automatic modulation classification for nlos 5g signals with deep learning approaches. In Proceedings of the 2023 10th International Conference on Wireless Networks and Mobile Communications (WINCOM), Istanbul, Turkiye, 26–28 October 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6. [Google Scholar]
  11. Mashhadi, S.; Diyanat, A.; Abdollahi, M.; Baniasadi, A. DSP: A Deep Neural Network Approach for Serving Cell Positioning in Mobile Networks. In Proceedings of the 2023 10th International Conference on Wireless Networks and Mobile Communications (WINCOM), Istanbul, Turkiye, 26–28 October 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6. [Google Scholar]
  12. Sultan, K.; Ali, H.; Zhang, Z. Call detail records driven anomaly detection and traffic prediction in mobile cellular networks. IEEE Access 2018, 6, 41728–41737. [Google Scholar] [CrossRef]
  13. Konstantoulas, I.; Loi, I.; Tsimas, D.; Sgarbas, K.; Gkamas, A.; Bouras, C. A Framework for User Traffic Prediction and Resource Allocation in 5G Networks. Appl. Sci. 2025, 15, 7603. [Google Scholar] [CrossRef]
  14. Edozie, E.; Shuaibu, A.N.; Sadiq, B.O.; John, U.K. Artificial intelligence advances in anomaly detection for telecom networks. Artif. Intell. Rev. 2025, 58, 100. [Google Scholar] [CrossRef]
  15. Dake, D.K. Artificial Intelligence Self-Organising (AI-SON) Frameworks for 5G-Enabled Networks: A Review. J. Comput. Commun. 2023, 11, 33–62. [Google Scholar] [CrossRef]
  16. Chang, V.; Hall, K.; Xu, Q.A.; Amao, F.O.; Ganatra, M.A.; Benson, V. Prediction of customer churn behavior in the telecommunication industry using machine learning models. Algorithms 2024, 17, 231. [Google Scholar] [CrossRef]
  17. Zakaria, A.F.; Lim, S.C.J.; Aamir, M. A pricing optimization modelling for assisted decision making in telecommunication product-service bundling. Int. J. Inf. Manag. Data Insights 2024, 4, 100212. [Google Scholar] [CrossRef]
  18. Bagam, N. Machine learning models for customer segmentation in telecom. J. Sustain. Solut. 2024, 1, 101–115. [Google Scholar] [CrossRef]
  19. Panahi, P.H.; Jalilvand, A.H.; Diyanat, A. Enhancing quality of experience in telecommunication networks: A review of frameworks and machine learning algorithms. arXiv 2024, arXiv:2404.16787. [Google Scholar] [CrossRef]
  20. Smith, J.; Johnson, K.; Brown, R. Wangiri Fraud Pattern Analysis and Machine-Learning-Based Detection. IEEE Access 2023, 11, 89012–89025. [Google Scholar]
  21. Hu, X.; Chen, H.; Chen, H.; Zhang, S.; Liu, S.; Li, X. Telecom fraud detection via imbalanced graph learning. In Proceedings of the 2022 IEEE 22nd International Conference on Communication Technology (ICCT), Nanjing, China, 11–14 November 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1312–1317. [Google Scholar]
  22. Taylor, L. Telecoms Fraud Costing Operators $40 Billion Annually. Capacity Media. 2023. Available online: https://cfca.org/telecommunications-fraud-increased-12-in-2023-equating-to-an-estimated-38-95-billion-lost-to-fraud/ (accessed on 21 December 2025).
  23. Mawgoud, A.A.; Abu-Talleb, A.; Tawfik, B.S. A Holistic Neural Networks Classification for Wangiri Fraud Detection in Telecommunications Regulatory Authorities. In International Conference on Advanced Machine Learning Technologies and Applications; Springer: Berlin/Heidelberg, Germany, 2021; pp. 175–183. [Google Scholar]
  24. Author1, A.; Author2, B. Detection of Wangiri Telecommunication Fraud Using Ensemble Learning. J. Electron. Eng. Inf. Technol. 2019, 10, 123–135. [Google Scholar]
  25. Mishra, N.; Shivaji, G.B. Data Mining for Fraud Detection in Telecommunications: Detecting Anomalous Behaviors in Real-Time. In Proceedings of the 2025 International Conference on Automation and Computation (AUTOCOM), Dehradun, India, 4–6 March 2025; IEEE: Piscataway, NJ, USA, 2025; pp. 1340–1345. [Google Scholar]
  26. Şahin, Y.G.; Duman, E. Detecting credit card fraud by decision trees and support vector machines. In Proceedings of the International MultiConference of Engineers and Computer Scientists 2011; International Association of Engineers: Hong Kong, China, 2011. [Google Scholar]
  27. Arafat, M.; Qusef, A.; Sammour, G. Detection of wangiri telecommunication fraud using ensemble learning. In Proceedings of the 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT), Amman, Jordan, 9–11 April 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 330–335. [Google Scholar]
  28. Birhanu, M. Near Real-time SIM-box Fraud Detection in Telecommunication System Using Machine Learning Approach in the Case of Ethio Telecom. Ph.D. Thesis, St. Mary’s University, San Antonio, TX, USA, 2024. [Google Scholar]
  29. Krasić, I.; Čelar, S. Telecom fraud detection with machine learning on imbalanced dataset. In Proceedings of the 2022 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, Croatia, 22–24 September 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–6. [Google Scholar]
  30. Ravi, A.; Msahli, M.; Qiu, H.; Memmi, G.; Bifet, A.; Qiu, M. Wangiri fraud: Pattern analysis and machine-learning-based detection. IEEE Internet Things J. 2022, 10, 6794–6802. [Google Scholar] [CrossRef]
  31. Liang, F.Y.; Li, F.P.; Xu, R.H.; Cheng, W.; Deng, S.X.; Yang, Z.R.; Wang, C.D. Telecom fraud detection based on feature binning and autoencoder. In Proceedings of the 2023 IEEE International Conference on Data Mining (ICDM), Shanghai, China, 1–4 December 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 368–377. [Google Scholar]
  32. Wahid, A.; Msahli, M.; Bifet, A.; Memmi, G. NFA: A neural factorization autoencoder based online telephony fraud detection. Digit. Commun. Netw. 2024, 10, 158–167. [Google Scholar] [CrossRef]
  33. Cazzolato, M.; Vijayakumar, S.; Lee, M.C.; Vajiac, C.; Park, N.; Fidalgo, P.; Traina, A.J.; Faloutsos, C. Callmine: Fraud detection and visualization of million-scale call graphs. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Birmingham, UK, 21–25 October 2023; pp. 4509–4515. [Google Scholar]
  34. Singh, G.; Singh, P.; Singh, M. Advanced Real-Time Fraud Detection Using RAG-Based LLMs. arXiv 2025, arXiv:2501.15290. [Google Scholar]
  35. Shen, Z.; Wang, K.; Zhang, Y.; Ngai, G.; Fu, E.Y. Combating Phone Scams with LLM-based Detection: Where Do We Stand? (Student Abstract). AAAI Conf. Artif. Intell. 2025, 39, 29487–29489. [Google Scholar] [CrossRef]
  36. Shen, Z.; Yan, S.; Zhang, Y.; Luo, X.; Ngai, G.; Fu, E.Y. It Warned Me Just at the Right Moment: Exploring LLM-based Real-time Detection of Phone Scams. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 26 April–1 May 2025; pp. 1–7. [Google Scholar]
  37. Kirkos, E.; Boskou, G.; Chatzipetrou, E.; Tiakas, E.; Spathis, C. Exploring the Boundaries of Financial Statement Fraud Detection with Large Language Models. 2024. Available online: https://www.researchgate.net/publication/381676241_Exploring_the_Boundaries_of_Financial_Statement_Fraud_Detection_with_Large_Language_Models (accessed on 21 December 2025).
  38. Korkanti, S. Enhancing Financial Fraud Detection Using LLMs and Advanced Data Analytics. In Proceedings of the 2024 2nd International Conference on Self Sustainable Artificial Intelligence Systems (ICSSAS), Erode, India, 23–25 October 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1328–1334. [Google Scholar]
  39. Mundia, J.; Kirimi, E.; Mburu, S.; Kahonge, A.; Chepken, C. Assessing Mobile Network Fraud Threats and Prevention Strategies in Kenya. East Afr. J. Inf. Technol. 2024, 7, 279–300. [Google Scholar] [CrossRef]
  40. Muchilwa, L.; Musuva, P. Coeus: A Cyber Threat Intelligence Sharing Platform for Fraudulent Phone Numbers. In Proceedings of the 2023 IST-Africa Conference (IST-Africa), Istanbul, Turkiye, 26–28 October 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–10. [Google Scholar]
  41. Bayram, S.; Özkoç, E.E. Regulatory Recommendations for Fraud Problem in The Turkish Telecommunication Sector. AJIT-e Acad. J. Inf. Technol. 2023, 14, 365–376. [Google Scholar] [CrossRef]
  42. Sahaidak, V.; Lysenko, Y.; Senkov, Y. Telecom fraud and its impact on mobile carrier business. Communication 2022, 17–20. [Google Scholar] [CrossRef]
  43. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
  44. Cox, D.R. The regression analysis of binary sequences. J. R. Stat. Soc. Ser. B (Methodol.) 1958, 20, 215–232. [Google Scholar] [CrossRef]
  45. Hosmer, D.W.; Lemeshow, S. Applied Logistic Regression, 2nd ed.; John Wiley & Sons: New York, NY, USA, 2000. [Google Scholar]
  46. Kleinbaum, D.G.; Klein, M. Logistic Regression: A Self-Learning Text, 3rd ed.; Springer: New York, NY, USA, 2010. [Google Scholar]
  47. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; Wadsworth International Group: Belmont, CA, USA, 1984. [Google Scholar]
  48. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  49. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; Available online: http://www.deeplearningbook.org (accessed on 21 December 2025).
  50. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  51. Powers, D.M.W. Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness and Correlation. J. Mach. Learn. Technol. 2011, 2, 37–63. [Google Scholar]
  52. Fawcett, T. An Introduction to ROC Analysis. Pattern Recognit. Lett. 2006, 27, 861–874. [Google Scholar] [CrossRef]
  53. Davis, J.; Goadrich, M. The Relationship Between Precision-Recall and ROC Curves. In Proceedings of the 23rd International Conference on Machine Learning (ICML), Pittsburgh, PA, USA, 25–29 June 2006; ACM: New York, NY, USA, 2006; pp. 233–240. [Google Scholar]
Figure 1. Overview of Wangiri fraud detection in real life.
Figure 2. Literature review analysis (a) Distribution of telecom fraud detection research by methodology type. (b) Distribution of research focus by fraud mechanism.
Figure 3. Overview of the Wangiri fraud detection pipeline, including data ingestion, feature engineering, balancing, model training, calibration, and evaluation.
Figure 4. Distribution of fraud vs. non-fraud calls (y-axis in millions of calls).
Figure 5. Spearman correlation of features with the fraud label.
Figure 6. Spearman correlation heatmap among all features.
Figure 7. Log-scale distributions of key numerical features.
Figure 9. Call distribution by source and destination time of day, colored by fraud label.
Figure 10. Top 10 caller and destination countries in fraudulent calls.
Figure 11. Analysis of engineered features by fraud status.
Figure 12. callduration vs. cpg_time, colored by fraud status. The number of blue dots is dominant due to the rarity of fraudulent calls.
Figure 13. Training and Evaluation Procedure.
Figure 14. Confusion matrices for Logistic Regression (SMOTE + RUS, no calibration) on raw distribution (left) and balanced (right) test sets.
Figure 15. ROC (left), PR (center), and calibration (right) curves for Logistic Regression (SMOTE + RUS, no calibration) on the raw distribution test set. The dashed lines represent the standard performance of a completely random distribution.
Figure 16. ROC (left), PR (center), and calibration (right) curves for Logistic Regression (SMOTE + RUS, no calibration) on the balanced test set. The dashed lines represent the standard performance of a completely random distribution.
Figure 17. Confusion matrices for Random Forest (SMOTE + RUS, no calibration) on raw distribution (left) and balanced (right) test sets.
Figure 18. ROC (left), PR (center), and calibration (right) curves for Random Forest (SMOTE + RUS, no calibration) on the raw distribution test set. The dashed lines represent the standard performance of a completely random distribution.
Figure 19. ROC (left), PR (center), and calibration (right) curves for Random Forest (SMOTE + RUS, no calibration) on the balanced test set. The dashed lines represent the standard performance of a completely random distribution.
Figure 20. Confusion matrices for XGBoost (SMOTE + RUS, no calibration) on raw distribution (left) and balanced (right) test sets.
Figure 21. ROC (left), PR (center), and calibration (right) curves for XGBoost (SMOTE + RUS, no calibration) on the raw distribution test set. The dashed lines represent the standard performance of a completely random distribution.
Figure 22. ROC (left), PR (center), and calibration (right) curves for XGBoost (SMOTE + RUS, no calibration) on the balanced test set. The dashed lines represent the standard performance of a completely random distribution.
Figure 23. SHAP bar plot for the XGBoost model (SMOTE + RUS, no calibration).
Figure 24. SHAP summary plot for the XGBoost model (SMOTE + RUS, no calibration).
Figure 25. SHAP waterfall plot for the XGBoost model (SMOTE + RUS, no calibration).
Figure 26. Confusion matrices for Decision Tree (SMOTE + RUS, no calibration) on raw distribution (left) and balanced (right) test sets.
Figure 27. ROC (left), PR (center), and calibration (right) curves for Decision Tree (SMOTE + RUS, no calibration) on the raw distribution test set. The dashed lines represent the standard performance of a completely random distribution.
Figure 28. ROC (left), PR (center), and calibration (right) curves for Decision Tree (SMOTE + RUS, no calibration) on the balanced test set. The dashed lines represent the standard performance of a completely random distribution.
Figure 29. Confusion matrices for MLP (SMOTE + RUS, no calibration) on raw distribution (left) and balanced (right) test sets.
Figure 30. ROC (left), PR (center), and calibration (right) curves for MLP (SMOTE + RUS, no calibration) on the raw distribution test set. The dashed lines represent the standard performance of a completely random distribution.
Figure 31. ROC (left), PR (center), and calibration (right) curves for MLP (SMOTE + RUS, no calibration) on the balanced test set. The dashed lines represent the standard performance of a completely random distribution.
Table 1. A comparative analysis of state-of-the-art research papers on telecom fraud detection using AI techniques.
Study | Year | Methodology | Fraud Type | Real-Time | Accuracy | Key Contribution
Sahin et al. [26] | 2011 | Supervised ML | General | No | ∼89% | Benchmark of standard ML models
Arafat et al. [27] | 2019 | Ensemble ML | Wangiri | No | ∼92% | Boosting for missed-call scams
Sahaidak et al. [42] | 2022 | Literature Review | Hybrid | No | - | Industry/hybrid fraud review
Ravi et al. [30] | 2022 | Mixed ML | Wangiri | No | Pattern-dependent | Fraud pattern taxonomy
Krasic and Celar [29] | 2022 | ML + SMOTE | General | No | Improved F1 | Handling imbalanced data
Hu et al. [21] | 2022 | GNN + RL | General | Yes | High precision/recall | Graph imbalance handling
Bayram and Özkoç [41] | 2023 | Regulatory | General (Turkey) | No | - | Policy reform proposals
Cazzolato et al. [33] | 2023 | Graph Visualization | Multi-type | Yes | Analyst-friendly | Visual fraud discovery
Muchilwa et al. [40] | 2023 | Threat Intel Sharing | Phone fraud | Yes | Platform success | Coeus data sharing
Liang et al. [31] | 2023 | Autoencoder + Binning | General | Yes | Beats GNNs | GNN-free modeling
Wahid et al. [32] | 2024 | Neural Autoencoder | General | Yes | 95.45% (F1) | Memory-aware deep streaming
Mundia et al. [39] | 2024 | Policy + Qual. | SIM swap, Wangiri | No | - | Concept drift and biometric proposal
Birhanu [28] | 2024 | RF, NN | SIM-box | Yes | 100% | Real-time slicing with full accuracy
Boskou et al. [37] | 2024 | Prompted LLM | Financial | No | 67% (F1) | Fraud detection in corp. docs
Korkanti [38] | 2024 | LLM + Analytics | Financial (broad) | Yes | High precision | Hybrid anomaly detection
Singh et al. [34] | 2025 | RAG + LLM | Conversational | Yes | 97.98% | Real-time voice + policy detection
Shen et al. (WS *) [35] | 2025 | LLM Evaluation | Conversational | No | ∼99% (RF) | Dataset bias analysis
Shen et al. (IWM **) [36] | 2025 | Real-time LLM | Conversational | Yes | Effective alerts | User scam prevention
* WS: Where Do We Stand; ** IWM: It Warned Me Just at the Right Moment.
Table 2. CDR attributes related to Wangiri Fraud Detection.
CDR Attribute | Stands for | Description
CALLERCC | Calling Party's Country Code | A string indicating the home country code of the calling number.
CALLEDNO | Called Number | A string representing the number that was dialed.
CDRID | Call Detail Record ID | A unique numeric identifier assigned to each CDR entry.
STARTTIME | Start Time | A numeric value denoting the second component of the timestamp of the Initial Address Message (IAM). Unit: seconds.
MILLISEC | Milliseconds | A numeric value indicating the millisecond component of the IAM timestamp. Unit: milliseconds.
OPC | Originating Point Code | A numeric field identifying the signaling point code of the originating network element.
DESTCC | Destination Country Code | A string indicating the home carrier's country code of the called party.
FORMATCALLERNO | Normalized Calling Number | A normalized representation of the caller number.
IAM TIME | IAM Time | A numeric value representing the delay of the Initial Address Message. Unit: milliseconds.
ACM TIME | ACM Time | A numeric field indicating the delay of the Address Complete Message. Unit: milliseconds.
CPG TIME | CPG Time | A numeric field representing the delay of the Call Progress Message. Unit: milliseconds.
CALLDURATION | Call Duration | A numeric field measuring the duration of the call. Unit: seconds/milliseconds.
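The attributes above are the raw material for the heuristic, domain-driven labeling described in the paper. A minimal sketch of how such missed-call rules might be expressed over a CDR record is shown below; the field subset, class layout, and thresholds are illustrative assumptions, not the paper's actual labeling rules:

```python
from dataclasses import dataclass

@dataclass
class CDR:
    callercc: str        # caller's country code (CALLERCC)
    destcc: str          # destination country code (DESTCC)
    callduration: float  # call duration, seconds (CALLDURATION)
    cpg_time: float      # Call Progress Message delay, milliseconds (CPG TIME)

def wangiri_signals(cdr: CDR, max_duration_s: float = 2.0, max_ring_s: float = 5.0) -> dict:
    """Heuristic missed-call indicators; thresholds here are illustrative only."""
    return {
        "very_short_call": cdr.callduration <= max_duration_s,   # hang-up bait call
        "quick_hangup": cdr.cpg_time / 1000.0 <= max_ring_s,     # rang only briefly
        "international": cdr.callercc != cdr.destcc,             # cross-border routing
    }

# A one-second international call that rang briefly trips all three signals
print(wangiri_signals(CDR(callercc="1", destcc="882", callduration=1.2, cpg_time=3500.0)))
```

In a real pipeline these boolean signals would be combined (and tuned against operator knowledge) before being turned into a fraud label.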
Table 3. XGBoost hyperparameter search space.
Hyperparameter | Values Tested
n_estimators | {150, 300}
max_depth | {3, 5, 7}
learning_rate | {0.05, 0.1, 0.2}
subsample | {0.8, 0.9, 1.0}
colsample_bytree | {0.8, 0.9, 1.0}
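The grid above can be enumerated with scikit-learn's ParameterGrid; a minimal sketch (parameter values taken from Table 3, while the surrounding search and scoring code is assumed, not shown in the paper):

```python
from sklearn.model_selection import ParameterGrid

# Hyperparameter search space from Table 3
xgb_grid = {
    "n_estimators": [150, 300],
    "max_depth": [3, 5, 7],
    "learning_rate": [0.05, 0.1, 0.2],
    "subsample": [0.8, 0.9, 1.0],
    "colsample_bytree": [0.8, 0.9, 1.0],
}

# Exhaustive enumeration: 2 * 3 * 3 * 3 * 3 = 162 candidate configurations
configs = list(ParameterGrid(xgb_grid))
print(len(configs))  # 162
```

Each configuration would then be fit and scored (e.g., via cross-validated ROC-AUC) to select the final XGBoost model.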
Table 4. Confusion Matrix.
Actual \ Predicted | Fraud | Non-Fraud
Fraud | TP | FN
Non-Fraud | FP | TN
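The accuracy and macro-averaged metrics reported in Tables 5–9 follow directly from the four cells of Table 4. A minimal sketch of the definitions, with fraud as the positive class (the counts in the example are illustrative, not from the paper):

```python
def macro_metrics(tp: int, fn: int, fp: int, tn: int):
    """Accuracy and macro-averaged precision/recall/F1 from a binary confusion matrix."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Per-class scores: fraud is the positive class, non-fraud the negative class.
    prec_fraud, rec_fraud = tp / (tp + fp), tp / (tp + fn)
    prec_nonfraud, rec_nonfraud = tn / (tn + fn), tn / (tn + fp)
    f1 = lambda p, r: 2 * p * r / (p + r)
    precision_macro = (prec_fraud + prec_nonfraud) / 2
    recall_macro = (rec_fraud + rec_nonfraud) / 2
    f1_macro = (f1(prec_fraud, rec_fraud) + f1(prec_nonfraud, rec_nonfraud)) / 2
    return accuracy, precision_macro, recall_macro, f1_macro

# Illustrative counts: 90 frauds caught, 10 missed, 20 false alarms, 9880 correct rejections
acc, p, r, f = macro_metrics(tp=90, fn=10, fp=20, tn=9880)
print(round(acc, 4), round(p, 4), round(r, 4), round(f, 4))  # 0.997 0.9086 0.949 0.9278
```

Note how, on such skewed counts, accuracy stays near 1 while the macro scores expose the cost of the 10 missed frauds and 20 false alarms; this is why the tables report macro averages alongside accuracy.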
Table 5. Summary of metrics for Logistic Regression across all configurations on the label_unique_calls_and_callduration dataset. The best-performing result in the Training Strategy section of the table is highlighted in bold.
Training Strategy | Calibration | Test Set | Accuracy | Precision (Macro) | Recall (Macro) | F1 (Macro) | ROC-AUC/PR-AUC
train_full | none | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.8477/0.0063
train_full | none | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.8456/0.8256
train_full | isotonic | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.8488/0.0068
train_full | isotonic | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.8469/0.8209
train_full | sigmoid | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.8477/0.0063
train_full | sigmoid | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.8456/0.8256
train_smote | none | raw_dist | 0.9934 | 0.5536 | 0.9948 | 0.5951 | 0.9948/0.0727
train_smote | none | balanced | 0.9946 | 0.9946 | 0.9946 | 0.9946 | 0.9949/0.9809
train_smote | isotonic | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9963/0.1135
train_smote | isotonic | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.9965/0.9936
train_smote | sigmoid | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9948/0.0727
train_smote | sigmoid | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.9949/0.9809
train_smote_rus | none | raw_dist | 0.9934 | 0.5536 | 0.9948 | 0.5951 | 0.9948/0.0727
train_smote_rus | none | balanced | 0.9946 | 0.9946 | 0.9946 | 0.9946 | 0.9949/0.9809
train_smote_rus | isotonic | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9963/0.1135
train_smote_rus | isotonic | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.9965/0.9936
train_smote_rus | sigmoid | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9948/0.0727
train_smote_rus | sigmoid | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.9949/0.9809
train_rus | none | raw_dist | 0.9931 | 0.5520 | 0.9944 | 0.5924 | 0.9948/0.0729
train_rus | none | balanced | 0.9944 | 0.9944 | 0.9944 | 0.9944 | 0.9949/0.9810
train_rus | isotonic | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9966/0.1137
train_rus | isotonic | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.9968/0.9938
train_rus | sigmoid | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9948/0.0729
train_rus | sigmoid | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.9949/0.9810
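The train_smote_rus strategy in the tables combines SMOTE-style minority oversampling with random undersampling (RUS) of the majority class. A minimal NumPy sketch of the balancing idea follows; it interpolates random minority pairs rather than k-nearest neighbors, so it is a simplified stand-in for the imbalanced-learn SMOTE used in practice, and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

def smote_like(X_min: np.ndarray, n_new: int) -> np.ndarray:
    """Synthesize minority samples by interpolating between random minority pairs."""
    i = rng.integers(0, len(X_min), n_new)
    j = rng.integers(0, len(X_min), n_new)
    lam = rng.random((n_new, 1))                 # interpolation weights in [0, 1)
    return X_min[i] + lam * (X_min[j] - X_min[i])

def balance(X_maj: np.ndarray, X_min: np.ndarray, target: int = 1000):
    """Oversample the minority up to `target` and undersample the majority down to `target`."""
    X_syn = smote_like(X_min, target - len(X_min))
    keep = rng.choice(len(X_maj), target, replace=False)   # random undersampling
    X_bal = np.vstack([X_maj[keep], X_min, X_syn])
    y_bal = np.concatenate([np.zeros(target), np.ones(target)])
    return X_bal, y_bal

# Toy imbalanced data: 5000 legitimate calls vs. 40 fraudulent ones, 4 features each
X_maj = rng.normal(0, 1, (5000, 4))
X_min = rng.normal(3, 1, (40, 4))
X_bal, y_bal = balance(X_maj, X_min)
print(X_bal.shape, int(y_bal.sum()))  # (2000, 4) 1000
```

Crucially, such balancing is applied only to the training split; the raw_dist rows in the tables evaluate on the untouched, highly imbalanced distribution.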
Table 6. Summary of metrics for Random Forest across all configurations on the label_unique_calls_and_callduration dataset. The best-performing result in the Training Strategy section of the table is highlighted in bold.
Training Strategy | Calibration | Test Set | Accuracy | Precision (Macro) | Recall (Macro) | F1 (Macro) | ROC-AUC/PR-AUC
train_full | none | raw_dist | 0.9999 | 0.9820 | 0.9820 | 0.9820 | 0.99999/0.9943
train_full | none | balanced | 0.9817 | 0.9823 | 0.9817 | 0.9817 | 0.99988/0.99986
train_full | isotonic | raw_dist | 0.9999 | 0.9773 | 0.9863 | 0.9817 | 0.99999/0.9928
train_full | isotonic | balanced | 0.9860 | 0.9864 | 0.9860 | 0.9860 | 0.99989/0.99985
train_full | sigmoid | raw_dist | 0.9999 | 0.9827 | 0.9796 | 0.9811 | 0.99999/0.9943
train_full | sigmoid | balanced | 0.9793 | 0.9801 | 0.9793 | 0.9793 | 0.99988/0.99986
train_smote | none | raw_dist | 0.9998 | 0.8923 | 0.9932 | 0.9370 | 0.99999/0.9836
train_smote | none | balanced | 0.9930 | 0.9931 | 0.9930 | 0.9930 | 0.99994/0.99994
train_smote | isotonic | raw_dist | 0.9999 | 0.9602 | 0.9701 | 0.9651 | 0.99972/0.9795
train_smote | isotonic | balanced | 0.9699 | 0.9715 | 0.9699 | 0.9699 | 0.99967/0.99966
train_smote | sigmoid | raw_dist | 0.9999 | 0.9588 | 0.9704 | 0.9645 | 0.99999/0.9836
train_smote | sigmoid | balanced | 0.9702 | 0.9718 | 0.9702 | 0.9701 | 0.99994/0.99994
train_smote_rus | none | raw_dist | 0.9998 | 0.8946 | 0.9929 | 0.9383 | 0.99999/0.9841
train_smote_rus | none | balanced | 0.9927 | 0.9928 | 0.9927 | 0.9927 | 0.99993/0.99992
train_smote_rus | isotonic | raw_dist | 0.9999 | 0.9751 | 0.9570 | 0.9659 | 0.99972/0.9808
train_smote_rus | isotonic | balanced | 0.9567 | 0.9601 | 0.9567 | 0.9566 | 0.99967/0.99965
train_smote_rus | sigmoid | raw_dist | 0.9999 | 0.9603 | 0.9739 | 0.9670 | 0.99999/0.9841
train_smote_rus | sigmoid | balanced | 0.9737 | 0.9749 | 0.9737 | 0.9736 | 0.99993/0.99992
train_rus | none | raw_dist | 0.9976 | 0.6233 | 0.9988 | 0.6972 | 0.99978/0.6461
train_rus | none | balanced | 0.9984 | 0.9984 | 0.9984 | 0.9984 | 0.99989/0.99984
train_rus | isotonic | raw_dist | 0.9995 | 0.8273 | 0.9068 | 0.8628 | 0.99951/0.6456
train_rus | isotonic | balanced | 0.9070 | 0.9216 | 0.9070 | 0.9062 | 0.99962/0.99958
train_rus | sigmoid | raw_dist | 0.9994 | 0.7963 | 0.9782 | 0.8658 | 0.99978/0.6461
train_rus | sigmoid | balanced | 0.9780 | 0.9788 | 0.9780 | 0.9779 | 0.99989/0.99984
Table 7. Summary of metrics for XGBoost across all configurations on the label_unique_calls_and_callduration dataset. The best-performing result in the Training Strategy section of the table is highlighted in bold.
Training Strategy | Calibration | Test Set | Accuracy | Precision (Macro) | Recall (Macro) | F1 (Macro) | ROC-AUC/PR-AUC
train_full | none | raw_dist | 0.9998 | 0.9925 | 0.8745 | 0.9254 | 0.9899/0.8836
train_full | none | balanced | 0.8745 | 0.8997 | 0.8745 | 0.8725 | 0.9902/0.9926
train_full | isotonic | raw_dist | 0.9998 | 0.9784 | 0.9172 | 0.9457 | 0.9899/0.8757
train_full | isotonic | balanced | 0.9172 | 0.9290 | 0.9172 | 0.9166 | 0.9902/0.9926
train_full | sigmoid | raw_dist | 0.9998 | 0.9922 | 0.8763 | 0.9265 | 0.9785/0.8817
train_full | sigmoid | balanced | 0.8763 | 0.9009 | 0.8763 | 0.8744 | 0.9787/0.9792
train_smote | none | raw_dist | 0.9996 | 0.8752 | 0.8695 | 0.8723 | 0.9999/0.8485
train_smote | none | balanced | 0.8694 | 0.8962 | 0.8694 | 0.8671 | 0.9996/0.9994
train_smote | isotonic | raw_dist | 0.9996 | 0.8572 | 0.9327 | 0.8913 | 0.9996/0.8327
train_smote | isotonic | balanced | 0.9325 | 0.9404 | 0.9325 | 0.9322 | 0.9994/0.9991
train_smote | sigmoid | raw_dist | 0.9996 | 0.8955 | 0.8190 | 0.8532 | 0.9999/0.8485
train_smote | sigmoid | balanced | 0.8188 | 0.8667 | 0.8188 | 0.8127 | 0.9996/0.9994
train_smote_rus | none | raw_dist | 0.9997 | 0.8961 | 0.8897 | 0.8929 | 0.9999/0.8734
train_smote_rus | none | balanced | 0.8895 | 0.9093 | 0.8895 | 0.8882 | 0.9996/0.9993
train_smote_rus | isotonic | raw_dist | 0.9997 | 0.8981 | 0.8838 | 0.8908 | 0.9991/0.8591
train_smote_rus | isotonic | balanced | 0.8836 | 0.9054 | 0.8836 | 0.8820 | 0.9988/0.9986
train_smote_rus | sigmoid | raw_dist | 0.9996 | 0.9140 | 0.8499 | 0.8793 | 0.9999/0.8734
train_smote_rus | sigmoid | balanced | 0.8497 | 0.8842 | 0.8497 | 0.8463 | 0.9996/0.9993
train_rus | none | raw_dist | 0.9980 | 0.6451 | 0.9988 | 0.7244 | 0.9996/0.6024
train_rus | none | balanced | 0.9981 | 0.9981 | 0.9981 | 0.9981 | 0.9996/0.9996
train_rus | isotonic | raw_dist | 0.9994 | 0.7980 | 0.8783 | 0.8333 | 0.9996/0.5864
train_rus | isotonic | balanced | 0.8785 | 0.9022 | 0.8785 | 0.8767 | 0.9996/0.9994
train_rus | sigmoid | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9996/0.6024
train_rus | sigmoid | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.9996/0.9996
Table 8. Summary of metrics for Decision Tree across all configurations on the label_unique_calls_and_callduration dataset. The best-performing result in the Training Strategy section of the table is highlighted in bold.
| Training Strategy | Calibration | Test Set | Accuracy | Precision (Macro) | Recall (Macro) | F1 (Macro) | ROC-AUC/PR-AUC |
|---|---|---|---|---|---|---|---|
| train_full | none | raw_dist | 0.99998 | 0.9925 | 0.9970 | 0.9948 | 0.9970/0.9793 |
| train_full | none | balanced | 0.9970 | 0.9971 | 0.9970 | 0.9970 | 0.9970/0.9970 |
| train_full | isotonic | raw_dist | 0.99998 | 0.9925 | 0.9970 | 0.9948 | 0.9970/0.9793 |
| train_full | isotonic | balanced | 0.9970 | 0.9971 | 0.9970 | 0.9970 | 0.9970/0.9970 |
| train_full | sigmoid | raw_dist | 0.99998 | 0.9925 | 0.9970 | 0.9948 | 0.9970/0.9793 |
| train_full | sigmoid | balanced | 0.9970 | 0.9971 | 0.9970 | 0.9970 | 0.9970/0.9970 |
| train_smote | none | raw_dist | 0.9999 | 0.9752 | 0.9900 | 0.9825 | 0.9900/0.9316 |
| train_smote | none | balanced | 0.9901 | 0.9902 | 0.9901 | 0.9901 | 0.9901/0.9901 |
| train_smote | isotonic | raw_dist | 0.9999 | 0.9752 | 0.9900 | 0.9825 | 0.9900/0.9316 |
| train_smote | isotonic | balanced | 0.9901 | 0.9902 | 0.9901 | 0.9901 | 0.9901/0.9901 |
| train_smote | sigmoid | raw_dist | 0.9999 | 0.9752 | 0.9900 | 0.9825 | 0.9900/0.9316 |
| train_smote | sigmoid | balanced | 0.9901 | 0.9902 | 0.9901 | 0.9901 | 0.9901/0.9901 |
| train_smote_rus | none | raw_dist | 0.9999 | 0.9752 | 0.9900 | 0.9825 | 0.9900/0.9316 |
| train_smote_rus | none | balanced | 0.9901 | 0.9902 | 0.9901 | 0.9901 | 0.9901/0.9901 |
| train_smote_rus | isotonic | raw_dist | 0.9999 | 0.9752 | 0.9900 | 0.9825 | 0.9900/0.9316 |
| train_smote_rus | isotonic | balanced | 0.9901 | 0.9902 | 0.9901 | 0.9901 | 0.9901/0.9901 |
| train_smote_rus | sigmoid | raw_dist | 0.9999 | 0.9752 | 0.9900 | 0.9825 | 0.9900/0.9316 |
| train_smote_rus | sigmoid | balanced | 0.9901 | 0.9902 | 0.9901 | 0.9901 | 0.9901/0.9901 |
| train_rus | none | raw_dist | 0.9991 | 0.7388 | 0.9993 | 0.8230 | 0.9993/0.4774 |
| train_rus | none | balanced | 0.9989 | 0.9989 | 0.9989 | 0.9989 | 0.9989/0.9981 |
| train_rus | isotonic | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9993/0.4774 |
| train_rus | isotonic | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.9989/0.9981 |
| train_rus | sigmoid | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9993/0.4774 |
| train_rus | sigmoid | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.9989/0.9981 |
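The `train_rus` strategy behind the Decision Tree's strongest balanced-test rows can be illustrated with a minimal sketch: undersample the majority class at random until it matches the minority class, then fit a tree. This is not the paper's pipeline — the data below are synthetic stand-ins for CDR features, and the undersampling is implemented by hand with NumPy rather than a dedicated imbalance library.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(42)

# Synthetic stand-in for an imbalanced CDR feature matrix (~1% positive class).
X, y = make_classification(n_samples=20000, n_features=10,
                           weights=[0.99, 0.01], random_state=42)
X_train, X_test = X[:15000], X[15000:]
y_train, y_test = y[:15000], y[15000:]

# Random undersampling ("train_rus"): keep every minority sample plus an
# equally sized random subset of the majority class.
minority = np.flatnonzero(y_train == 1)
majority = np.flatnonzero(y_train == 0)
keep = np.concatenate(
    [minority, rng.choice(majority, size=minority.size, replace=False)])

clf = DecisionTreeClassifier(random_state=42).fit(X_train[keep], y_train[keep])
macro_f1 = f1_score(y_test, clf.predict(X_test), average="macro")
print(f"macro-F1 on the held-out (still imbalanced) test set: {macro_f1:.3f}")
```

Note the trade-off the tables make visible: training on a balanced subset boosts minority-class recall, but macro precision on the raw distribution can drop sharply, which is why `train_rus` shines on the balanced test set yet lags on `raw_dist`.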
Table 9. Summary of metrics for MLP across all configurations on the label_unique_calls_and_callduration dataset. The best-performing result in the Training Strategy section of the table is highlighted in bold.
| Training Strategy | Calibration | Test Set | Accuracy | Precision (Macro) | Recall (Macro) | F1 (Macro) | ROC-AUC/PR-AUC |
|---|---|---|---|---|---|---|---|
| train_full | none | raw_dist | 0.9997 | 0.9681 | 0.8175 | 0.8783 | 0.9994/0.8816 |
| train_full | none | balanced | 0.8175 | 0.8663 | 0.8175 | 0.8112 | 0.9993/0.9995 |
| train_full | isotonic | raw_dist | 0.9997 | 0.9661 | 0.8196 | 0.8792 | 0.9996/0.8692 |
| train_full | isotonic | balanced | 0.8196 | 0.8674 | 0.8196 | 0.8136 | 0.9995/0.9994 |
| train_full | sigmoid | raw_dist | 0.9997 | 0.9595 | 0.8218 | 0.8785 | 0.9996/0.8816 |
| train_full | sigmoid | balanced | 0.8218 | 0.8686 | 0.8218 | 0.8159 | 0.9995/0.9995 |
| train_smote | none | raw_dist | 0.9993 | 0.7677 | 0.9980 | 0.8481 | 0.9999/0.8801 |
| train_smote | none | balanced | 0.9978 | 0.9979 | 0.9978 | 0.9978 | 0.9998/0.9998 |
| train_smote | isotonic | raw_dist | 0.9996 | 0.9063 | 0.8572 | 0.8802 | 0.9996/0.8652 |
| train_smote | isotonic | balanced | 0.8570 | 0.8885 | 0.8570 | 0.8540 | 0.9995/0.9994 |
| train_smote | sigmoid | raw_dist | 0.9994 | 0.7934 | 0.9919 | 0.8675 | 0.9999/0.8801 |
| train_smote | sigmoid | balanced | 0.9917 | 0.9918 | 0.9917 | 0.9917 | 0.9998/0.9998 |
| train_smote_rus | none | raw_dist | 0.9993 | 0.7697 | 0.9975 | 0.8497 | 0.9999/0.8476 |
| train_smote_rus | none | balanced | 0.9973 | 0.9973 | 0.9973 | 0.9973 | 0.9997/0.9997 |
| train_smote_rus | isotonic | raw_dist | 0.9996 | 0.8765 | 0.8617 | 0.8690 | 0.9969/0.8304 |
| train_smote_rus | isotonic | balanced | 0.8616 | 0.8913 | 0.8616 | 0.8589 | 0.9967/0.9966 |
| train_smote_rus | sigmoid | raw_dist | 0.9995 | 0.8080 | 0.9912 | 0.8786 | 0.9999/0.8476 |
| train_smote_rus | sigmoid | balanced | 0.9909 | 0.9910 | 0.9909 | 0.9909 | 0.9997/0.9997 |
| train_rus | none | raw_dist | 0.9953 | 0.5722 | 0.9974 | 0.6250 | 0.9991/0.3102 |
| train_rus | none | balanced | 0.9973 | 0.9973 | 0.9973 | 0.9973 | 0.9990/0.9972 |
| train_rus | isotonic | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9988/0.3140 |
| train_rus | isotonic | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.9989/0.9976 |
| train_rus | sigmoid | raw_dist | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9991/0.3102 |
| train_rus | sigmoid | balanced | 0.5000 | 0.2500 | 0.5000 | 0.3333 | 0.9990/0.9972 |
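The Calibration column in these tables contrasts uncalibrated probabilities with isotonic and sigmoid (Platt) recalibration. The sketch below shows how such a comparison could be run with scikit-learn's `CalibratedClassifierCV`; the synthetic data, logistic-regression base learner, and Brier-score evaluation are illustrative assumptions, not the paper's exact setup.

```python
from sklearn.datasets import make_classification
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

# Mildly imbalanced synthetic data (~10% positives) as a stand-in for CDRs.
X, y = make_classification(n_samples=4000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

scores = {}
for method in ("sigmoid", "isotonic"):
    # Wrap the base learner; probabilities are recalibrated via internal CV.
    cal = CalibratedClassifierCV(LogisticRegression(max_iter=1000),
                                 method=method, cv=3).fit(X_tr, y_tr)
    scores[method] = brier_score_loss(y_te, cal.predict_proba(X_te)[:, 1])

for method, brier in scores.items():
    print(f"{method:8s} Brier score: {brier:.4f}")  # lower is better
```

One caveat the tables also reflect: calibration reshapes probabilities, not the ranking, so it can leave ROC-AUC nearly unchanged while shifting thresholded metrics (accuracy, macro F1) substantially — and with very few minority samples per fold, isotonic calibration can degenerate, as in the 0.3333 macro-F1 rows.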
Table 10. Summary of best model performances on the raw distribution test set (selected by highest PR-AUC). The best-performing result in the table is highlighted in bold.
| Model | Training Strategy | Calibration | Accuracy | Precision (Macro) | Recall (Macro) | F1 (Macro) | ROC-AUC | PR-AUC |
|---|---|---|---|---|---|---|---|---|
| Logistic Regression | train_rus | isotonic | 0.9992 | 0.4996 | 0.5000 | 0.4998 | 0.9966 | 0.1137 |
| **Random Forest** | **train_full** | **none/sigmoid** | **0.9999** | **0.9820/0.9827** | **0.9820/0.9796** | **0.9820/0.9811** | **0.99999** | **0.9943** |
| XGBoost | train_full | none | 0.9998 | 0.9925 | 0.8745 | 0.9254 | 0.9899 | 0.8836 |
| Decision Tree | train_full | none/isotonic/sigmoid | 0.99998 | 0.9925 | 0.9970 | 0.9948 | 0.9970 | 0.9793 |
| MLP | train_full | none/sigmoid | 0.9997 | 0.9681/0.9595 | 0.8175/0.8218 | 0.8783/0.8785 | 0.9994/0.9996 | 0.8816 |
Table 11. Summary of best model performances on the balanced test set (selected by highest macro-averaged F1-score). The best-performing result in the table is highlighted in bold.
| Model | Training Strategy | Calibration | Accuracy | Precision (Macro) | Recall (Macro) | F1 (Macro) | ROC-AUC | PR-AUC |
|---|---|---|---|---|---|---|---|---|
| Logistic Regression | train_smote/train_smote_rus | none | 0.9946 | 0.9946 | 0.9946 | 0.9946 | 0.9949 | 0.9809 |
| Random Forest | train_rus | none | 0.9984 | 0.9984 | 0.9984 | 0.9984 | 0.99989 | 0.99984 |
| XGBoost | train_rus | none | 0.9981 | 0.9981 | 0.9981 | 0.9981 | 0.9996 | 0.9996 |
| **Decision Tree** | **train_rus** | **none** | **0.9989** | **0.9989** | **0.9989** | **0.9989** | **0.9989** | **0.9981** |
| MLP | train_smote | none | 0.9978 | 0.9979 | 0.9978 | 0.9978 | 0.9998 | 0.9998 |
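Tables 10 and 11 pick each model's best configuration by PR-AUC on the raw distribution and by macro-averaged F1 on the balanced test set, respectively. The selection logic can be sketched as below on synthetic data with two stand-in candidates; `average_precision_score` serves as the PR-AUC estimate, and the models, data, and hyperparameters are illustrative, not the paper's tuned configurations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, f1_score
from sklearn.model_selection import train_test_split

# Imbalanced synthetic data (~5% positives).
X, y = make_classification(n_samples=6000, n_features=12,
                           weights=[0.95, 0.05], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=1),
}

results = {}
for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]
    results[name] = {
        # Threshold-free metric, robust to heavy class imbalance.
        "pr_auc": average_precision_score(y_te, proba),
        # Thresholded metric, weighting both classes equally.
        "macro_f1": f1_score(y_te, model.predict(X_te), average="macro"),
    }

best_by_pr_auc = max(results, key=lambda n: results[n]["pr_auc"])
best_by_macro_f1 = max(results, key=lambda n: results[n]["macro_f1"])
print("best by PR-AUC:  ", best_by_pr_auc)
print("best by macro-F1:", best_by_macro_f1)
```

Using PR-AUC rather than ROC-AUC for the raw-distribution ranking matters here: with a ~0.1% fraud rate, ROC-AUC stays near 1.0 for almost every configuration, while PR-AUC (e.g., Logistic Regression's 0.1137 vs. Random Forest's 0.9943 in Table 10) exposes the real gap in minority-class precision.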
Table 12. Comparison of Wangiri fraud detection studies with the proposed approach. The row for the proposed method is highlighted in bold.
| Study | Methodology | RT | Best Result | Key Findings/Contribution |
|---|---|---|---|---|
| Sahin et al. (2011) [26] | Supervised ML (DT, SVM) | No | Acc: ∼89% | Established baselines for general telecom fraud. Highlighted the need for feature engineering but lacked specific handling for Wangiri patterns. |
| Arafat et al. (2019) [27] | Ensemble ML (RF, AdaBoost, XGB) | No | XGB: Acc. 99.4%, F1 0.96 | Showed ensemble ML efficiency on labeled CDRs but lacked imbalance handling and unlabeled data support. The proposed work improves performance (PR-AUC 0.9943, F1 0.998) via SMOTE/RUS balancing. |
| Sahaidak et al. (2022) [42] | Literature Review | No | N/A | Provided a taxonomy of hybrid fraud types but lacked a computational implementation. Our work operationalizes these concepts into a functional detection pipeline. |
| Ravi et al. (2022) [30] | Mixed ML (SVM, RF, MLP, IF) | No | RF: Acc. 99%, F1 0.97 | Defined Wangiri taxonomy and compared supervised vs. unsupervised models. The proposed pipeline achieves F1 0.998 (+2.8%) and introduces pseudo-labeling and SHAP explainability. |
| Mundia et al. (2024) [39] | Policy/Qualitative Review | No | – | Identified gaps in current fraud management (manual processes, lack of automation, concept drift). The proposed work operationalizes these insights via an automated, drift-aware ML pipeline. |
| **Proposed Method** | **ML pipeline (Ensemble) + SMOTE/RUS + SHAP** | **Yes** | **RF: PR-AUC 0.9943, F1 0.998** | **Outperforms all prior studies by combining unsupervised labeling with optimized ensembles. Offers the highest reported precision/recall balance on imbalanced data while maintaining interpretability.** |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Balouchi, A.; Abdollahi, M.; Eskandarian, A.; Karimi Pour Kerman, K.; Majd, E.; Azouji, N.; Baniasadi, A. Wangiri Fraud Detection: A Comprehensive Approach to Unlabeled Telecom Data. Future Internet 2026, 18, 15. https://doi.org/10.3390/fi18010015


