Future Internet
  • This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
  • Article
  • Open Access

27 December 2025

Wangiri Fraud Detection: A Comprehensive Approach to Unlabeled Telecom Data

1 Department of Electrical & Computer Engineering, University of Victoria, Victoria, BC V8P 5C2, Canada
2 Computer Engineering Department, Shiraz University, Shiraz 8433471946, Iran
3 Department of Telecommunication Engineering, Islamic Azad University, Tehran 1477893855, Iran
* Author to whom correspondence should be addressed.
Future Internet 2026, 18(1), 15; https://doi.org/10.3390/fi18010015
(registering DOI)
This article belongs to the Special Issue Cybersecurity in the Age of AI, IoT, and Edge Computing

Abstract

Wangiri fraud is a pervasive telecommunications scam that uses missed calls to lure victims into calling back premium-rate numbers, causing significant financial losses for operators and consumers. This paper presents a comprehensive machine learning framework for detecting Wangiri fraud in highly imbalanced, unlabeled Call Detail Record (CDR) datasets. We introduce a novel unsupervised labeling approach based on domain-driven heuristics, coupled with feature engineering that captures temporal, geographic, and behavioral patterns indicative of fraud. To address severe class imbalance, we evaluate multiple sampling strategies, including the Synthetic Minority Over-sampling Technique (SMOTE) and undersampling, and compare the performance of Logistic Regression, Decision Trees, Random Forest, XGBoost, and a Multi-Layer Perceptron (MLP). Our results show that ensemble methods, particularly Random Forest and XGBoost, achieve near-perfect performance on balanced data (e.g., Receiver Operating Characteristic Area Under the Curve (ROC-AUC) > 0.99) while maintaining interpretability. The proposed pipeline offers a scalable and practical solution for real-time fraud detection, giving telecom operators an effective tool to mitigate Wangiri fraud risks.
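To make the first two stages of the pipeline concrete, the following is a minimal sketch of heuristic labeling followed by SMOTE-style oversampling. It is not the authors' implementation: the CDR row layout, the per-caller features, the thresholds (`min_callees`, `min_short_ratio`, the 3-second cutoff), and all function names are assumptions chosen purely for illustration, since the abstract does not specify the actual heuristics.

```python
import random
from collections import defaultdict

# Hypothetical CDR row layout (an assumption, not taken from the paper):
# (caller, callee, duration_seconds, is_international)

def caller_features(cdrs):
    """Aggregate per-caller features from raw CDR rows."""
    stats = defaultdict(lambda: {"calls": 0, "short": 0, "callees": set(), "intl": 0})
    for caller, callee, duration, intl in cdrs:
        s = stats[caller]
        s["calls"] += 1
        s["short"] += duration <= 3  # assumed cutoff for a "missed-call style" ring
        s["callees"].add(callee)
        s["intl"] += intl
    # Feature tuple: (short-call ratio, distinct callees, international ratio)
    return {
        c: (s["short"] / s["calls"], len(s["callees"]), s["intl"] / s["calls"])
        for c, s in stats.items()
    }

def heuristic_label(features, min_callees=5, min_short_ratio=0.8):
    """Illustrative rule: many distinct callees reached with mostly very short calls."""
    short_ratio, n_callees, _intl_ratio = features
    return int(n_callees >= min_callees and short_ratio >= min_short_ratio)

def smote_like(minority, n_new, k=2, seed=0):
    """SMOTE-style oversampling: interpolate between a minority sample
    and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted(
            (m for m in minority if m is not x),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Toy data: caller "A" rings six numbers for one second each; "B" makes two long calls.
cdrs = [("A", f"n{i}", 1, True) for i in range(6)]
cdrs += [("B", "n1", 120, False), ("B", "n2", 300, False)]
feats = caller_features(cdrs)
labels = {caller: heuristic_label(f) for caller, f in feats.items()}
```

In this toy data, caller "A" is flagged and caller "B" is not. In practice one would more likely use an established SMOTE implementation and tune any labeling thresholds against domain knowledge; the sketch only shows the shape of the two steps.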
