Next-Generation Machine Learning in Healthcare Fraud Detection: Current Trends, Challenges, and Future Research Directions
Abstract
1. Introduction
- What are the recent advancements in using machine learning to detect fraud in the healthcare sector?
- What barriers or challenges do organisations face in implementing machine learning in healthcare fraud detection?
- In which ways can the efficiency of machine learning be improved in detecting healthcare fraud?
- Which datasets are most commonly used in healthcare fraud detection?
2. Background and Related Work
3. Navigating the Challenges of Healthcare Fraud Detection with ML
3.1. Data Imbalance and Quality
3.2. Privacy and Compliance Constraints
3.3. Interpretability and Trust
3.4. Resource Limitations
3.5. Adversarial Manipulation
4. Emerging Trends in ML-Driven Fraud Detection
5. The Art and Science of Feature Engineering and Selection for Fraud Detection
5.1. Importance of Feature Engineering
5.2. Common Features in Healthcare Fraud
5.3. Feature Selection Techniques
6. Datasets Used in Healthcare Fraud Detection Research
Dataset | Description | Strengths | Limitations | Reference |
---|---|---|---|---|
CMS Medicare Data (MPOP PS/P) | Detailed data on Medicare utilisation, payments, and charges. | Large scale, publicly available. | Can suffer from class imbalance. | [117,118,119] |
LEIE (List of Excluded Individuals/Entities) | List of providers excluded from federal healthcare programmes due to fraud. | Provides labels for overt fraud. | Significant class imbalance; labels are often associated only with blatant fraud. | [44,91,120] |
Kaggle Healthcare Datasets | Various datasets related to healthcare claims and provider information. | Publicly available, used in competitions. | Varies in quality and representativeness. | [86,121,122] |
NHCAA Healthcare Fraud Dataset | Consists of labelled instances of various fraud types and feature metadata. | Suitable for relative model benchmarking; works best when the input data are balanced. | Not publicly available; use is subject to ethical requirements. | [123,124,125] |
MIMIC-III & IV | Commonly used in advanced fraud detection, e.g., document falsification and insurance exploitation inference. | Large-scale clinical and temporal data. | Designed explicitly for clinical analytics; requires extensive preprocessing. | [126,127] |
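
The class imbalance listed as a limitation for the CMS and LEIE data can be made concrete with a small sketch. It uses hypothetical toy labels (1,000 providers, 1% fraudulent, not drawn from any dataset in the table) and shows why plain accuracy is a misleading metric in this setting: a baseline that never flags anyone scores 99% accuracy while detecting no fraud at all. The function names are illustrative only.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 = fraud, 0 = legitimate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of all predictions that are correct."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of truly fraudulent cases that were flagged."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy labels: 10 fraudulent providers out of 1,000 (1%).
y_true = [1] * 10 + [0] * 990
# A trivial baseline that never flags any provider.
y_pred_never_flag = [0] * 1000

print(accuracy(y_true, y_pred_never_flag))  # 0.99
print(recall(y_true, y_pred_never_flag))    # 0.0
```

For this reason, results on such datasets are usually easier to interpret when recall, precision, or F1 for the fraud class is reported alongside accuracy.
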
7. Interplay Between Cybersecurity and Machine Learning in Healthcare Fraud Prevention
8. Tools Used in Healthcare Fraud Detection
8.1. Core Machine Learning Tools
8.1.1. TensorFlow and Keras
8.1.2. PyTorch
8.1.3. Apache MXNet
8.1.4. Weka
8.2. Specialised Analytical Tools
8.2.1. spaCy + ScispaCy
8.2.2. Transformers
9. Discussion
9.1. Recent Advancements in Using Machine Learning for Healthcare Fraud Detection
9.2. Barriers Organisations Face in Implementing Machine Learning
9.2.1. Data Quality and Availability
9.2.2. Integration with Existing Systems
9.2.3. Regulatory Compliance
9.2.4. Resource Constraints
9.2.5. Adaptability to Emerging Threats
9.2.6. Organisational Resistance to Change
9.3. Improving Machine Learning Efficiency
9.4. Standard Datasets Used in Healthcare Fraud Detection
9.5. Ethical Concerns in ML-Based Healthcare Fraud Detection
9.6. Limitations of Machine Learning in Healthcare Systems
9.6.1. Data Privacy and Governance
9.6.2. Label Noise
9.6.3. Model Drift
9.6.4. Infrastructure Constraints
9.7. Operationalising Transparency and Bias Mitigation
9.7.1. Model Auditing
9.7.2. Explainability Metrics
9.7.3. Transparency Dashboards
10. Conclusions and Future Research Directions
10.1. Practical Implications
10.2. Future Research Directions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Paper | ML Techniques | ML Types | Applications | Datasets Used | Study Types | Benefits | Challenges | Mitigation Strategies |
---|---|---|---|---|---|---|---|---|
[44] | Decision Trees combined with a hybrid resampling technique (SMOTE-ENN) | Supervised ML | Fraudulent transaction detection in U.S. Medicare claims processing systems | Medicare dataset from the US | Experimental study | Exceptional accuracy, with F1 and recall of approximately 0.99; effective with imbalanced data | Severe class imbalance, risk of overfitting, SMOTE can introduce noise | Combining SMOTE-ENN with domain-related features such as “Provider Type”. |
[169] | ML algorithms: classification and regression | Supervised ML | Financial management and decision-making in healthcare organisations | Administrative data | Experimental study | Enhanced efficiency and accuracy in handling healthcare data, cost-effectiveness | Data quality issues, implementation challenges | Preprocessing and model tuning can usually resolve these challenges. |
[94] | Bagging and Stacking, and ML Integration with Blockchain Technology | Ensemble Learning Techniques | Fraud detection in healthcare insurance claims using decentralised blockchain systems | Healthcare insurance claims data | Experimental study | Advanced fraud detection process, Improved data security | Scalability issues between ML and blockchain | Integration of multiple classifiers and blockchains for securing data |
[157] | Deep Reinforcement Learning and Convolutional Autoencoders | Deep learning and Reinforcement learning | Automatic anomaly detection in clinical image datasets | Clinical CT images dataset | Experimental study | Efficient data processing, minimises manual assignments, increases diagnostic precision | Challenges in structural variations | Using mixed methods and autoencoders to reduce errors. |
[158] | Random Forest, Support Vector Machine and Federated Learning | Supervised and unsupervised ML | Detection of intrusions in Medical Internet of Things data to protect healthcare information | Medical Internet of Things (M-IoT) network data | Experimental study | Improved accuracy, energy efficiency, and resource optimisation | Computational resource constraints, high energy usage | Leveraging resource-efficient ML models and federated learning for model updating can mitigate the challenges. |
[170] | Fuzzy Closure Miner for Frequent Itemset (FCMFI) and Nucleotide Sequence Comprehension Engine (NSCE) | Unsupervised ML | SARS-CoV-2 analysis to detect transmutation prototypes and abnormalities | SARS-CoV-2 data | Experimental study | Improved pattern recognition | Complex genomic data patterns can be difficult to analyse | Anomaly identification by combining multiple techniques such as FCMFI and NSCE |
[171] | Dense Multi-Scale - Transnet DMSC and Multi-level Fusion | Deep Learning | Automatic Detection of Anomalies | Multiple Medical and Image Datasets | Experimental study | Enhanced feature mining, Advanced anomaly detection | Conventional CNNs are not effective, loss of features | Combining different transfer modules and DMSC for feature fusion. |
[160] | A mixture of AI and ML techniques | Supervised ML, Unsupervised ML and Deep Learning | Detection and prevention of fraudulent transactions in Nigeria | Different real-world datasets from Nigeria | Empirical study | Enhanced accuracy, efficient fraud detection | Scalability and regulatory compliance challenges | Continuous learning and adaptive systems using current datasets |
[172] | Patch-wise contrastive learning-based auto-encoder (PatchCL-AE) | Unsupervised ML and Deep Learning | Anomaly detection in medical image datasets to screen for diseases | Medical Imaging Datasets | Experimental study | Enhanced anomaly detection | Constraints of reconstruction-based techniques, over-dependence on pixel-level losses | Reduce noise and improve scalability. |
[173] | Weighted MultiTree (WMT) and Density-Based Clustering (DBC) | Unsupervised ML | Detection of fraud from healthcare insurance claims datasets | Health Insurance Claims Datasets | Experimental study | Improved fraud detection performance | Challenges in clustering claims | Applying two different approaches in two different stages, i.e., WMT and DBC. |
[174] | Bayesian Belief Network (BBN) | Supervised ML | Detection of fraud from healthcare insurance claims datasets | Health Insurance Claims Datasets | Experimental study | Enhanced model performance | Issues in Claims Analysis | Exploits relational structure of transaction attributes using BBN to integrate interconnections |
[175] | Multiple anomaly detection techniques | Unsupervised, Supervised ML and Deep Learning | Detection of Anomalies from medical image datasets | Multiple Medical imaging datasets | Benchmark study | Enhanced anomaly detection | Unreliable processes and dataset selections limit productivity | Using numerous datasets and implementing a single framework. |
[18] | Decision tree | Supervised ML | Detection of fraud from healthcare insurance claims datasets | Health Insurance Claims Datasets from Ghana | Experimental study | Efficient classification accuracy and improved security | Scalability issues between ML and blockchain, Challenges in fraud detection | Integrating ML into Ethereum smart contracts. |
[32] | CatBoost, XGBoost, LightGBM, Random Forest | Supervised ML | Detection and prediction of fraud from healthcare insurance claims datasets | Health Insurance Claims Datasets | Experimental study | Enhanced detection accuracy | Multi-dimensional and noisy data | Using ensemble learning with CatBoost, XGBoost, and LightGBM. |
[176] | Random Forest, KNN, SVM, and MLP | Supervised ML | Detection of Anomalies from the Internet of Medical Things Datasets | Different Internet of Things (IoT) datasets | Experimental study | Outstanding accuracy and efficient detection | Scalability issues with conventional models, Challenges in the implementation of IoMT | Use of multiple ML algorithms at once. |
[177] | Aggregated Mondrian Forests, Half-Space Trees, Bijective soft sets, Shannon entropy, and TOPSIS | Supervised and Unsupervised ML | Effective diagnosis in competent healthcare. | Medical Datasets | Experimental study | Enhanced Detection and accuracy | Reconstruction of the static model is required, Increased computational costs | Leveraging novel framework to minimise dimensionality and real-time processing. |
[178] | Decision tree, K-NN, Logistic Regression, Random Forest, AdaBoost, XGBoost | Supervised ML | Detection of Anomalies from body area networks (BANs) | Body Area Networks Data | Experimental study | Exceptionally high accuracy | Resource constraints in IoT BANs, Anomaly detection challenges | Integration of multiple classifiers and using standard data conversion tools. |
[179] | Meta-Reinforcement Learning (Meta-RL) | Reinforcement learning | Detection of fraud from healthcare insurance claims datasets | Medical Datasets | Experimental study | Improved performance and high accuracy | Substantial class imbalance, the threat of overfitting, Possibility of data loss | Using Meta Reinforcement Learning for task distribution. |
[34] | Multiple Anomaly Detection Techniques | Unsupervised ML, Deep learning | Detection of fraud from healthcare insurance claims datasets and anomaly detection. | Health Insurance Claims Dataset from Belgium | Experimental study | Advanced anomaly detection | Unavailability of labelled dataset, Challenges in identifying the difference between fraud and abuse | Using advanced anomaly detection techniques. |
[93] | Federated learning, Blockchain-based task scheduling | Supervised ML | Detection of Anomalies from the Internet of Medical Things Datasets | Healthcare Internet of Medical Things Data | Experimental study | Increased energy efficiency | Dispersed fraudulent transactions in IoMT data, Security and privacy challenges in cloud infrastructure | Integrating federated learning with blockchain-based task scheduling. |
[14] | Logistic Regression, Decision Tree, KNN, Naive Bayes, SVM, and Random Forest | Supervised ML | Fraud detection in healthcare insurance claims using decentralised (blockchain) systems | Different Healthcare Datasets | Experimental study | Increased Accuracy and improved fraud detection | Security and privacy challenges in blockchain implementation, Data may be compromised | Leveraging ML to examine data and check blockchain transactions. |
[88] | Black-box neural networks, XAI techniques, DT | Supervised ML | Detection of fraud from healthcare datasets | UCI datasets | Experimental study | Improved model Performance and accuracy | Black-box complexity, lack of conventional accuracy | Integrating XAI with traditional methods. |
[180] | CNN-LSTM, DNN, Modified SHA-256 encryption algorithm | Supervised ML, Deep learning | Detection of fraud from healthcare insurance claims datasets | Hospital Administrative Datasets | Experimental study | Improved accuracy | Data security challenges | Leveraging DL and modified SHA-256 encryption algorithm for security enhancement. |
[22] | Neural Networks, Focal-loss function | Unsupervised ML | Detection of fraud from healthcare insurance claims datasets | Health Insurance Claims Datasets from China | Experimental study | Improved fraud detection | Unavailability of labelled dataset, Noisy data | Use of unsupervised ML model and feature engineering, along with focal-loss function. |
[24] | Deep learning ensemble (EffiIncepNet), EfficientNet, Inception-ResNet-v2 architectures | Supervised ML and Deep learning | Fraud detection in healthcare insurance claims using decentralised blockchain systems | IEEE-CIS Fraud Detection Dataset | Experimental study | Increased accuracy and improved information security | Managing complex and high-dimensional data, Issues in blockchain data | Using deep learning ensemble model with blockchain. |
[181] | Multiple AI techniques and approaches | Supervised ML and Deep learning | Detection of Anomalies from medical image datasets | Clinical Dataset of MRI | Review article | Improved and efficient fraud detection | Manual explanation, Challenges in feature extraction | Different DL, ensemble learning and XAI approaches. |
[182] | Multi-channel heterogeneous graph neural networks (HGNNs) | Unsupervised ML and Deep Learning | Detection of fraud from healthcare insurance claims datasets | Health Insurance Claims Dataset from China | Experimental study | Improved fraud detection | Conventional methods’ limitations, Issues in graph creation | Convert data into multi-channel heterogeneous graphs and use advanced anomaly detection. |
[45] | Random Forest, Logistic Regression, ANNs, SMOTE Boruta | Supervised ML | Detection of fraud from healthcare insurance claims datasets from Saudi Arabia. | Health Insurance Claims Dataset of Saudi Arabia | Experimental study | Increased fraud detection | Data disparity issues, Vigorous fraudulent transactions | Use of SMOTE Boruta to improve model accuracy. |
[46] | Deep autoencoders | Unsupervised ML and Deep learning | Detection of fraud from healthcare insurance claims datasets | Health Insurance Claims Datasets | Experimental study | Improved fraud detection | Unavailability of labelled dataset, Noisy data | Using deep autoencoders and evaluating performance with density-based models. |
[13] | Association rule mining, Unsupervised classifiers, e.g., IF, CBLOF, ECOD, OCSVM | Unsupervised learning | Detection of fraud from healthcare insurance claims datasets | CMS DE-SynPUF dataset | Experimental study | Improved fraud detection | Complications in the data, i.e., data quality, Real-time fraud detection required | Use of association rule mining and unsupervised ML for anomaly detection. |
[34] | Multiple anomaly detection techniques, SHAP | Unsupervised ML | Detection of fraud from healthcare insurance claims datasets and anomaly detection. | Health Insurance Claims Datasets from Belgium | Experimental study | Improved anomaly detection | Unavailability of labelled dataset, Challenges in anomaly detection | Using advanced anomaly detection techniques along with SHAP. |
[183] | Feature ranking methods | Supervised learning | Detection of fraud from healthcare insurance claims datasets | Health Insurance Claims Datasets | Experimental study | Enhanced model performance | Substantial class imbalance, The threat of overfitting | Using feature ranking methods to identify the most relevant feature and minimising noise. |
[184] | Random Forest, Adaptive Boosting, Logistic Regression, Perceptron, and Deep NN | Supervised ML | Detection of Anomalies from medical image datasets | Canadian Institute for Cybersecurity (CIC) IoT Dataset | Experimental study | Increased accuracy and advanced anomaly detection | Substantial class imbalance, the threat of overfitting, effective threat detection is required | Implementation of feature-reduction methods. Use of SMOTE to prevent overfitting. |
[19] | Multiple Anomaly Detection techniques | Unsupervised ML and Deep Learning | Detection of fraud from healthcare insurance claims datasets and anomaly detection. | Smart home datasets | Systematic literature review | Automatic fraud detection | Substantial class imbalance, Issues due to differences in simulation and real-world datasets | Use of supervised ML to identify potential risks. |
[185] | Multiple ML classification techniques, Risk classification, Premium prediction | Supervised ML | Detection of fraud from healthcare insurance claims datasets and anomaly detection | Health Insurance Claims Datasets from the US | Experimental study | Minimises manual workload, Enhanced operational excellence | Managing complex and high-dimensional data, Data security | Integrating ensemble ML with feature engineering for fast processing. |
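
Several rows above ([44], [45], [184]) name SMOTE-based resampling as a mitigation for class imbalance. The following is a minimal sketch of the SMOTE oversampling idea only: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. It is not the implementation used in any cited study; the ENN cleaning step of SMOTE-ENN and the Boruta feature selection are omitted, and the function names, parameters, and toy data are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence

Point = List[float]


def k_nearest(index: int, points: Sequence[Point], k: int) -> List[int]:
    """Indices of the k points closest (Euclidean) to points[index], excluding itself."""
    distances = [
        (math.dist(points[index], other), j)
        for j, other in enumerate(points)
        if j != index
    ]
    distances.sort()
    return [j for _, j in distances[:k]]


def smote_like_oversample(
    minority: Sequence[Point], n_new: int, k: int = 3, seed: int = 0
) -> List[Point]:
    """Create n_new synthetic minority points by interpolating towards a near neighbour."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic: List[Point] = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))           # pick a minority sample
        j = rng.choice(k_nearest(i, minority, k))  # pick one of its k nearest neighbours
        gap = rng.random()                         # position along the segment, in [0, 1)
        synthetic.append(
            [a + gap * (b - a) for a, b in zip(minority[i], minority[j])]
        )
    return synthetic


# Hypothetical two-feature fraud samples (e.g. scaled claim amount and claim count).
fraud = [[0.9, 0.8], [1.0, 0.7], [0.8, 0.9], [0.95, 0.85]]
new_points = smote_like_oversample(fraud, n_new=6)
print(len(new_points))  # 6
```

Because every synthetic point is an interpolation between two existing minority samples, noisy or mislabelled minority samples are propagated into the synthetic data, which is consistent with the "SMOTE can introduce noise" challenge noted for [44]. In practice, a maintained library implementation would normally be preferred over a hand-written one.
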
References
- Najar, A.V.; Alizamani, L.; Zarqi, M.; Hooshmand, E. A global scoping review on the patterns of medical fraud and abuse: Integrating data-driven detection, prevention, and legal responses. Arch. Public Health 2025, 83, 1–24. [Google Scholar] [CrossRef]
- Dorsey. Healthcare Fraud: A World Beyond the Anti-Kickback Statute. 2024 [cited 2025 04/2025]. Available online: https://www.dorsey.com/newsresources/publications/client-alerts/2024/5/healthcare-fraud (accessed on 27 June 2025).
- Attorney. Man Pleads Guilty to Conspiracy to Launder Money in Connection with $100 Million Health Care Fraud Scheme. 2025 [cited 2025 2025]. Available online: https://www.justice.gov/usao-mdnc/pr/man-pleads-guilty-conspiracy-launder-money-connection-100-million-health-care-fraud (accessed on 27 June 2025).
- Sweeney, E. Predictive Analytics Saves Government $1.5B in Improper Payments. 2016. Available online: https://www.fiercehealthcare.com/antifraud/predictive-analytics-saves-government-1-5b-improper-payments (accessed on 27 June 2025).
- Vandenberg, O.; Martiny, D.; Rochas, O.; van Belkum, A.; Kozlakidis, Z. Considerations for diagnostic COVID-19 tests. Nat. Rev. Microbiol. 2020, 19, 171–183. [Google Scholar] [CrossRef]
- Duong, M.T.; Bruns, E.J.; Lee, K.; Cox, S.; Coifman, J.; Mayworm, A.; Lyon, A.R. Rates of Mental Health Service Utilization by Children and Adolescents in Schools and Other Common Service Settings: A Systematic Review and Meta-Analysis. Adm. Policy Ment. Health Ment. Health Serv. Res. 2020, 48, 420–439. [Google Scholar] [CrossRef]
- Arshed, M.A.; Mumtaz, S.; Gherghina, Ș.C.; Urooj, N.; Ahmed, S.; Dewi, C. A Deep Learning Model for Detecting Fake Medical Images to Mitigate Financial Insurance Fraud. Computation 2024, 12, 173. [Google Scholar] [CrossRef]
- Venkatesh, R.; Hanumantha, B.S. A Privacy-Preserving Quantum Blockchain Technique for Electronic Medical Records. IEEE Eng. Manag. Rev. 2023, 51, 137–144. [Google Scholar] [CrossRef]
- Joiner, K.A.; Lin, J.; Pantano, J. Upcoding in medicare: Where does it matter most? Health Econ. Rev. 2024, 14, 1. [Google Scholar] [CrossRef] [PubMed]
- Viriyathorn, S.; Witthayapipopsakul, W.; Kulthanmanusorn, A.; Rittimanomai, S.; Khuntha, S.; Patcharanarumol, W.; Tangcharoensathien, V. Definition, Practice, Regulations, and Effects of Balance Billing: A Scoping Review. Health Serv. Insights 2023, 16, 11786329231178766. [Google Scholar] [CrossRef]
- Branion-Calles, M.; Godfreyson, A.; Berniaz, K.; Arason, N.; Chan, H.; Erdelyi, S.; Winters, M.; Sengupta, J.; Essa, M.; Rajabali, F.; et al. Underreporting and selection bias of serious road traffic injuries in auto insurance claims and police reports in British Columbia, Canada. Transp. Res. Interdiscip. Perspect. 2025, 30, 101375. [Google Scholar] [CrossRef]
- Lukyanenko, R.; Maass, W.; Storey, V.C. Trust in artificial intelligence: From a Foundational Trust Framework to emerging research opportunities. Electron. Mark. 2022, 32, 1993–2020. [Google Scholar] [CrossRef]
- Hamid, Z.; Khalique, F.; Mahmood, S.; Daud, A.; Bukhari, A.; Alshemaimri, B. Healthcare insurance fraud detection using data mining. BMC Med. Inform. Decis. Mak. 2024, 24, 112. [Google Scholar] [CrossRef]
- Mohammed, M.A.; Boujelben, M.; Abid, M. A Novel Approach for Fraud Detection in Blockchain-Based Healthcare Networks Using Machine Learning. Futur. Internet 2023, 15, 250. [Google Scholar] [CrossRef]
- Kühl, N.; Schemmer, M.; Goutier, M.; Satzger, G. Artificial intelligence and machine learning. Electron. Mark. 2022, 32, 2235–2244. [Google Scholar] [CrossRef]
- Aminizadeh, S.; Heidari, A.; Toumaj, S.; Darbandi, M.; Navimipour, N.J.; Rezaei, M.; Talebi, S.; Azad, P.; Unal, M. The applications of machine learning techniques in medical data processing based on distributed computing and the Internet of Things. Comput. Methods Programs Biomed. 2023, 241, 107745. [Google Scholar] [CrossRef] [PubMed]
- Praveen, S.P.; Krishna, T.B.M.; Anuradha, C.H.; Mandalapu, S.R.; Sarala, P.; Sindhura, S. A robust framework for handling health care information based on machine learning and big data engineering techniques. Int. J. Health Manag. 2022, 1–18. [Google Scholar] [CrossRef]
- Amponsah, A.A.; Adekoya, A.F.; Weyori, B.A. A novel fraud detection and prevention method for healthcare claim processing using machine learning and blockchain technology. Decis. Anal. J. 2022, 4, 100122. [Google Scholar] [CrossRef]
- Galvão, Y.M.; Castro, L.; Ferreira, J.; Neto, F.B.D.L.; Fagundes, R.A.D.A.; Fernandes, B.J. Anomaly detection in smart houses for healthcare: Recent advances, and future perspectives. SN Comput. Sci. 2024, 5, 136. [Google Scholar] [CrossRef]
- Yan, J.; Wang, X. Unsupervised and semi-supervised learning: The next frontier in machine learning for plant systems biology. Plant J. 2022, 111, 1527–1538. [Google Scholar] [CrossRef]
- Ganaie, M.A.; Hu, M.; Malik, A.K.; Tanveer, M.; Suganthan, P.N. Ensemble deep learning: A review. Eng. Appl. Artif. Intell. 2022, 115, 105151. [Google Scholar] [CrossRef]
- Zhang, C.; Xiao, X.; Wu, C. Medical Fraud and Abuse Detection System Based on Machine Learning. Int. J. Environ. Res. Public Health 2020, 17, 7265. [Google Scholar] [CrossRef]
- Ali, A.; Razak, S.A.; Othman, S.H.; Eisa, T.A.E.; Al-Dhaqm, A.; Nasser, M.; Elhassan, T.; Elshafie, H.; Saif, A. Financial Fraud Detection Based on Machine Learning: A Systematic Literature Review. Appl. Sci. 2022, 12, 9637. [Google Scholar] [CrossRef]
- Almazroi, A.A. Innovative AI ensemble model for robust and optimized blockchain-based healthcare systems. Netw. Model. Anal. Health Inform. Bioinform. 2025, 14, 1–19. [Google Scholar] [CrossRef]
- Thundiyil; Picone, J.; McKenzie, S. Transformer Architectures in Time Series Analysis: A Review; Temple University: Philadelphia, PA, USA, 2014. [Google Scholar]
- Shafiei, A.; Tatar, A.; Rayhani, M.; Kairat, M.; Askarova, I. Artificial neural network, support vector machine, decision tree, random forest, and committee machine intelligent system help to improve performance prediction of low salinity water injection in carbonate oil reservoirs. J. Pet. Sci. Eng. 2022, 219, 111046. [Google Scholar] [CrossRef]
- Murorunkwere, B.F.; Ihirwe, J.F.; Kayijuka, I.; Nzabanita, J.; Haughton, D. Comparison of Tree-Based Machine Learning Algorithms to Predict Reporting Behavior of Electronic Billing Machines. Information 2023, 14, 140. [Google Scholar] [CrossRef]
- Chatterjee, P.; Das, D.; Rawat, D.B. Digital twin for credit card fraud detection: Opportunities, challenges, and fraud detection advancements. Futur. Gener. Comput. Syst. 2024, 158, 410–426. [Google Scholar] [CrossRef]
- Afriyie, J.K.; Tawiah, K.; Pels, W.A.; Addai-Henne, S.; Dwamena, H.A.; Owiredu, E.O.; Ayeh, S.A.; Eshun, J. A supervised machine learning algorithm for detecting and predicting fraud in credit card transactions. Decis. Anal. J. 2023, 6, 100163. [Google Scholar] [CrossRef]
- Razzaq, K.; Shah, M.; Fattahi, M.; Tang, J. Empowering machine learning for robust cyber-attack prevention in online retail: An integrative analysis. Humanit. Soc. Sci. Commun. 2025, 12, 1–15. [Google Scholar] [CrossRef]
- Niaz, N.U.; Shahariar, K.N.; Patwary, M.J.A. Class Imbalance Problems in Machine Learning: A Review of Methods and Future Challenges. In Proceedings of the ICCA 2022: 2nd International Conference on Computing Advancements, Dhaka, Bangladesh, 13–15 January 2022; pp. 485–490. [Google Scholar]
- Wang, Z.; Chen, X.; Wu, Y.; Jiang, L.; Lin, S.; Qiu, G. A robust and interpretable ensemble machine learning model for predicting healthcare insurance fraud. Sci. Rep. 2025, 15, 218. [Google Scholar] [CrossRef] [PubMed]
- Samara, M.A.; Bennis, I.; Abouaissa, A.; Lorenz, P. A survey of outlier detection techniques in IoT: Review and classification. J. Sens. Actuator Netw. 2022, 11, 4. [Google Scholar] [CrossRef]
- De Meulemeester, H.; De Smet, F.; van Dorst, J.; Derroitte, E.; De Moor, B. Explainable unsupervised anomaly detection for healthcare insurance data. BMC Med. Inform. Decis. Mak. 2025, 25, 14. [Google Scholar] [CrossRef]
- Li, D.; Qi, Z.; Zhou, Y.; Elchalakani, M. Machine Learning Applications in Building Energy Systems: Review and Prospects. Buildings 2025, 15, 648. [Google Scholar] [CrossRef]
- Yang, X.; Qi, X.; Zhou, X. Deep Learning Technologies for Time Series Anomaly Detection in Healthcare: A Review. IEEE Access 2023, 11, 117788–117799. [Google Scholar] [CrossRef]
- Li, G.; Yu, Z.; Yang, K.; Lin, M.; Chen, C.L.P. Exploring Feature Selection With Limited Labels: A Comprehensive Survey of Semi-Supervised and Unsupervised Approaches. IEEE Trans. Knowl. Data Eng. 2024, 36, 6124–6144. [Google Scholar] [CrossRef]
- Iqbal, A.; Amin, R. Time series forecasting and anomaly detection using deep learning. Comput. Chem. Eng. 2023, 182, 108560. [Google Scholar] [CrossRef]
- Abimbola, B.; Marin, E.d.L.C.; Tan, Q. Enhancing Legal Sentiment Analysis: A Convolutional Neural Network–Long Short-Term Memory Document-Level Model. Mach. Learn. Knowl. Extr. 2024, 6, 877–897. [Google Scholar] [CrossRef]
- Amarasinghe, S.C. Developing Robust Deep Learning Models for Intelligent Infrastructure: Addressing Scalability, Security, and Privacy Challenges. Appl. Res. Artif. Intell. Cloud Comput. 2024, 7, 1–10. [Google Scholar]
- Nesvijevskaia, A.; Ouillade, S.; Guilmin, P.; Zucker, J.-D. The accuracy versus interpretability trade-off in fraud detection model. Data Policy 2021, 3, e12. [Google Scholar] [CrossRef]
- Rajendran, R.M.; Vyas, B. Detecting APT Using Machine Learning: Comparative Performance Analysis with Proposed Model. In Proceedings of the SoutheastCon 2024, Atlanta, GA, USA, 15–24 March 2024; pp. 1064–1069. [Google Scholar]
- Verma, I.; Prasad, S.K. Exploring Ensemble Learning Techniques for Infant Mortality Prediction: A Technical Analysis of XGBoost Stacking AdaBoost and Bagging Models. Birth Defects Res. 2025, 117, e2443. [Google Scholar] [CrossRef]
- Bounab, R.; Zarour, K.; Guelib, B.; Khlifa, N. Enhancing Medicare Fraud Detection Through Machine Learning: Addressing Class Imbalance With SMOTE-ENN. IEEE Access 2024, 12, 54382–54396. [Google Scholar] [CrossRef]
- Nabrawi, E.; Alanazi, A. Fraud Detection in Healthcare Insurance Claims Using Machine Learning. Risks 2023, 11, 160. [Google Scholar] [CrossRef]
- Suesserman, M.; Gorny, S.; Lasaga, D.; Helms, J.; Olson, D.; Bowen, E.; Bhattacharya, S. Procedure code overutilization detection from healthcare claims using unsupervised deep learning methods. BMC Med. Inform. Decis. Mak. 2023, 23, 1–11. [Google Scholar] [CrossRef]
- Mahadevkar, S.V.; Patil, S.; Kotecha, K.; Soong, L.W.; Choudhury, T. Exploring AI-driven approaches for unstructured document analysis and future horizons. J. Big Data 2024, 11, 1–54. [Google Scholar] [CrossRef]
- Ocan, P. Enhancing Prescription Fraud and Error Detection in NHS Prescriptions Through Anomaly Detection. Reflective Prof. 2024, 4, 1–63. [Google Scholar]
- Darwish, D. Machine Learning and IoT in Health 4.0, in IoT and ML for Information Management: A Smart Healthcare Perspective; Springer: Berlin/Heidelberg, Germany, 2024; pp. 235–276. [Google Scholar]
- Wang, J.E.; Beaulieu-Jones, B.; Brat, G.A.; Marwaha, J.S. The role of artificial intelligence in helping providers manage pain and opioid use after surgery. Glob. Surg. Educ.-J. Assoc. Surg. Educ. 2024, 3, 1–5. [Google Scholar] [CrossRef]
- Nguyen, T.; Perez, V. Privatizing Plaintiffs: How Medicaid, the False Claims Act, and Decentralized Fraud Detection Affect Public Fraud Enforcement Efforts. J. Risk Insur. 2019, 87, 1063–1091. [Google Scholar] [CrossRef]
- Kapadiya, K.; Patel, U.; Gupta, R.; Alshehri, M.D.; Tanwar, S.; Sharma, G.; Bokoro, P.N. Blockchain and AI-Empowered Healthcare Insurance Fraud Detection: An Analysis, Architecture, and Future Prospects. IEEE Access 2022, 10, 79606–79627. [Google Scholar] [CrossRef]
- DeFulio, A.; Rzeszutek, M.J.; Furgeson, J.; Ryan, S.; Rezania, S. A smartphone-smartcard platform for contingency management in an inner-city substance use disorder outpatient program. J. Subst. Abus. Treat. 2021, 120, 108188. [Google Scholar] [CrossRef]
- Mahmud, M.A.I.; Talukder, A.T.; Sultana, A.; Bhuiyan, K.I.A.; Rahman, M.S.; Pranto, T.H.; Rahman, R.M. Toward news authenticity: Synthesizing natural language processing and human expert opinion to evaluate news. IEEE Access 2023, 11, 11405–11421. [Google Scholar] [CrossRef]
- Cord, M.; Cunningham, P. Machine Learning Techniques for Multimedia: Case Studies on Organization and Retrieval. J. Electron. Imaging 2007, 18, 039901-01-2. [Google Scholar] [CrossRef]
- Hastie, T. Overview of supervised learning. In The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: Berlin/Heidelberg, Germany, 2009; pp. 9–41. [Google Scholar]
- Nasteski, V. An overview of the supervised machine learning methods. Horizons B 2017, 4, 51–62. [Google Scholar] [CrossRef]
- Krishnan, R.; Rajpurkar, P.; Topol, E.J. Self-supervised learning in medicine and healthcare. Nat. Biomed. Eng. 2022, 6, 1346–1352. [Google Scholar] [CrossRef]
- Tiwari, A. Supervised learning: From theory to applications. In Artificial Intelligence and Machine Learning for EDGE Computing; Elsevier: Amsterdam, The Netherlands, 2022; pp. 23–32. [Google Scholar]
- Tyagi, K.; Rane, C.; Sriram, R.; Manry, M. Unsupervised learning. In Artificial Intelligence and Machine Learning for EDGE Computing; Elsevier: Amsterdam, The Netherlands, 2022; pp. 33–52. [Google Scholar]
- James, G.; Witten, D.; Hastie, T.; Tibshirani, R.; Taylor, J. Unsupervised learning. In An Introduction to Statistical Learning: With Applications in Python; Springer: Berlin/Heidelberg, Germany, 2023; pp. 503–556. [Google Scholar]
- Priyadarshi, R.; Ranjan, R.; Vishwakarma, A.K.; Yang, T.; Rathore, R.S. Exploring the Frontiers of Unsupervised Learning Techniques for Diagnosis of Cardiovascular Disorder: A Systematic Review. IEEE Access 2024, 12, 139253–139272. [Google Scholar] [CrossRef]
- He, M.; Cerna, J.; Mathew, R.; Zhao, J.; Zhao, J.; Espina, E.; Clore, J.L.; Sowers, R.B.; Hsiao-Wecksler, E.T.; Hernandez, M.E. Objective anxiety level classification using unsupervised learning and multimodal physiological signals. Smart Health 2025, 36, 100572. [Google Scholar] [CrossRef]
- Yang, X.; Song, Z.; King, I.; Xu, Z. A survey on deep semi-supervised learning. IEEE Trans. Knowl. Data Eng. 2022, 35, 8934–8954. [Google Scholar] [CrossRef]
- Song, Z.; Yang, X.; Xu, Z.; King, I. Graph-Based Semi-Supervised Learning: A Comprehensive Review. IEEE Trans. Neural Networks Learn. Syst. 2022, 34, 8174–8194. [Google Scholar] [CrossRef] [PubMed]
- Zhou, Z.-H. Semi-supervised learning. Mach. Learn. 2021, 1, 315–341. [Google Scholar]
- Taye, M.M. Understanding of Machine Learning with Deep Learning: Architectures, Workflow, Applications and Future Directions. Computers 2023, 12, 91. [Google Scholar] [CrossRef]
- Janiesch, C.; Zschech, P.; Heinrich, K. Machine learning and deep learning. Electron. Mark. 2021, 31, 685–695. [Google Scholar] [CrossRef]
- Dong, S.; Wang, P.; Abbas, K. A survey on deep learning and its applications. Comput. Sci. Rev. 2021, 40, 100379. [Google Scholar] [CrossRef]
- Matsuo, Y.; LeCun, Y.; Sahani, M.; Precup, D.; Silver, D.; Sugiyama, M.; Morimoto, J. Deep learning, reinforcement learning, and world models. Neural Netw. 2022, 152, 267–275. [Google Scholar] [CrossRef]
- Archana, R.; Jeevaraj, P.S.E. Deep learning models for digital image processing: A review. Artif. Intell. Rev. 2024, 57, 1–33. [Google Scholar] [CrossRef]
- Cha, Y.-J.; Ali, R.; Lewis, J.; Büyüköztürk, O. Deep learning-based structural health monitoring. Autom. Constr. 2024, 161, 105328. [Google Scholar] [CrossRef]
- Razzaq, K.; Shah, M. Machine Learning and Deep Learning Paradigms: From Techniques to Practical Applications and Research Frontiers. Computers 2025, 14, 93. [Google Scholar] [CrossRef]
- Ullah, F.; Ullah, I.; Khan, R.U.; Khan, S.; Khan, K.; Pau, G. Conventional to Deep Ensemble Methods for Hyperspectral Image Classification: A Comprehensive Survey. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 3878–3916. [Google Scholar] [CrossRef]
- Nobel, S.M.N.; Swapno, S.M.M.R.; Islam, R.; Safran, M.; Alfarhood, S.; Mridha, M.F. A machine learning approach for vocal fold segmentation and disorder classification based on ensemble method. Sci. Rep. 2024, 14, 14435. [Google Scholar] [CrossRef]
- Fathi, S.; Ahmadi, A.; Dehnad, A.; Almasi-Dooghaee, M.; Sadegh, M.; for the Alzheimer’s Disease Neuroimaging Initiative. A Deep Learning-Based Ensemble Method for Early Diagnosis of Alzheimer’s Disease using MRI Images. Neuroinformatics 2023, 22, 89–105. [Google Scholar] [CrossRef]
- Zamani, A.S.; Hashim, A.H.A.; Shatat, A.S.A.; Akhtar, M.; Rizwanullah, M.; Mohamed, S.S.I. Implementation of machine learning techniques with big data and IoT to create effective prediction models for health informatics. Biomed. Signal Process. Control. 2024, 94, 106247. [Google Scholar] [CrossRef]
- Razali, F.M.; Sulaiman, N.; Manan, D.I.A.; Said, J. Sustainability of Audit Profession in Digital Technology Era: The Role of Competencies and Digital Technology Capabilities to Detect Fraud Risk. SAGE Open 2025, 15, 21582440241304974. [Google Scholar] [CrossRef]
- Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; Galstyan, A. A Survey on Bias and Fairness in Machine Learning. ACM Comput. Surv. 2021, 54, 1–35. [Google Scholar] [CrossRef]
- Ekin, T.; Frigau, L.; Conversano, C. Health care fraud classifiers in practice. Appl. Stoch. Models Bus. Ind. 2021, 37, 1182–1199. [Google Scholar] [CrossRef]
- Saxena, S.; Singh, A.; Tiwari, S. Prediction model for digital image tampering using customised deep neural network techniques. Int. J. Syst. Assur. Eng. Manag. 2024, 1–9. [Google Scholar] [CrossRef]
- Li, C.; Ding, S.; Zou, N.; Hu, X.; Jiang, X.; Zhang, K. Multi-task learning with dynamic re-weighting to achieve fairness in healthcare predictive modeling. J. Biomed. Inform. 2023, 143, 104399. [Google Scholar] [CrossRef]
- D’Hondt, E.; Ashby, T.J.; Chakroun, I.; Koninckx, T.; Wuyts, R. Identifying and evaluating barriers for the implementation of machine learning in the intensive care unit. Commun. Med. 2022, 2, 162. [Google Scholar] [CrossRef]
- Karimian, G.; Petelos, E.; Evers, S.M.A.A. The ethical issues of the application of artificial intelligence in healthcare: A systematic scoping review. AI Ethics 2022, 2, 539–551. [Google Scholar] [CrossRef]
- Tazi, F.; Nandakumar, A.; Dykstra, J.; Rajivan, P.; Das, S. SoK: Analyzing Privacy and Security of Healthcare Data from the User Perspective. ACM Trans. Comput. Health 2024, 5, 1–31. [Google Scholar] [CrossRef]
- Gholampour, S. Impact of Nature of Medical Data on Machine and Deep Learning for Imbalanced Datasets: Clinical Validity of SMOTE Is Questionable. Mach. Learn. Knowl. Extr. 2024, 6, 827–841. [Google Scholar] [CrossRef]
- Alsamhi, S.H.; Myrzashova, R.; Hawbani, A.; Kumar, S.; Srivastava, S.; Zhao, L.; Wei, X.; Guizani, M.; Curry, E. Federated Learning Meets Blockchain in Decentralized Data Sharing: Healthcare Use Case. IEEE Internet Things J. 2024, 11, 19602–19615. [Google Scholar] [CrossRef]
- Khan, N.; Nauman, M.; Almadhor, A.S.; Akhtar, N.; Alghuried, A.; Alhudhaif, A. Guaranteeing Correctness in Black-Box Machine Learning: A Fusion of Explainable AI and Formal Methods for Healthcare Decision-Making. IEEE Access 2024, 12, 90299–90316. [Google Scholar] [CrossRef]
- Razzaq, K.; Shah, M. Barriers to Implementing ML for Cybercrime Prevention in Online Retailing. In Proceedings of SaudiCIS 2024, Dhahran, Saudi Arabia, 19–21 November 2024. [Google Scholar]
- Sarker, I.H. Multi-aspects AI-based modeling and adversarial learning for cybersecurity intelligence and robustness: A comprehensive overview. Secur. Priv. 2023, 6, e295. [Google Scholar] [CrossRef]
- Liang, Q.; Bauder, R.A.; Khoshgoftaar, T.M. Enhancing Medicare Fraud Detection: Random Undersampling Followed by SHAP-Driven Feature Selection with Big Data. In Proceedings of the 2024 IEEE 36th International Conference on Tools with Artificial Intelligence (ICTAI), Herndon, VA, USA, 28–30 October 2024; pp. 256–263. [Google Scholar]
- Mohale, V.Z.; Obagbuwa, I.C. A systematic review on the integration of explainable artificial intelligence in intrusion detection systems to enhancing transparency and interpretability in cybersecurity. Front. Artif. Intell. 2025, 8, 1526221. [Google Scholar] [CrossRef]
- Lakhan, A.; Mohammed, M.A.; Nedoma, J.; Martinek, R.; Tiwari, P.; Vidyarthi, A.; Alkhayyat, A.; Wang, W. Federated-Learning Based Privacy Preservation and Fraud-Enabled Blockchain IoMT System for Healthcare. IEEE J. Biomed. Health Inform. 2022, 27, 664–672. [Google Scholar] [CrossRef]
- Kapadiya, K.; Ramoliya, F.; Gohil, K.; Patel, U.; Gupta, R.; Tanwar, S.; Rodrigues, J.J.; Alqahtani, F.; Tolba, A. Blockchain-assisted healthcare insurance fraud detection framework using ensemble learning. Comput. Electr. Eng. 2024, 122, 109898. [Google Scholar] [CrossRef]
- Fetaji, B.; Fetaji, M.; Hasan, A.; Rexhepi, S.; Armenski, G. FRAUD-X: An Integrated AI, Blockchain, and Cybersecurity Framework with Early Warning Systems for Mitigating Online Financial Fraud—A Case Study from North Macedonia. IEEE Access 2025, 13, 48068–48082. [Google Scholar] [CrossRef]
- Cholevas, C.; Angeli, E.; Sereti, Z.; Mavrikos, E.; Tsekouras, G.E. Anomaly Detection in Blockchain Networks Using Unsupervised Learning: A Survey. Algorithms 2024, 17, 201. [Google Scholar] [CrossRef]
- Benedetti, H.; Nikbakht, E.; Sarkar, S.; Spieler, A.C. Blockchain and corporate fraud. J. Financ. Crime 2020, 28, 702–721. [Google Scholar] [CrossRef]
- Xu, C.; Zhang, C.; Xu, J.; Pei, J. SlimChain: Scaling blockchain transactions through off-chain storage and parallel processing. In Proceedings of the VLDB Endowment, Copenhagen, Denmark, 4 December 2021; Volume 14, pp. 2314–2326. [Google Scholar]
- Cirillo, F.; De Santis, M.; Esposito, C. Applications of Solid Platform and Federated Learning for Decentralized Health Data Management. In Artificial Intelligence Techniques for Analysing Sensitive Data in Medical Cyber-Physical Systems: System Protection and Data Analysis; Springer: Berlin/Heidelberg, Germany, 2025; pp. 95–111. [Google Scholar]
- Li, N.; Lewin, A.; Ning, S.; Waito, M.; Zeller, M.P.; Tinmouth, A.; Shih, A.W.; The Canadian Transfusion Trials Group. Privacy-preserving federated data access and federated learning: Improved data sharing and AI model development in transfusion medicine. Transfusion 2024, 65, 22–28. [Google Scholar] [CrossRef]
- Long, G.; Shen, T.; Tan, Y.; Gerrard, L.; Clarke, A.; Jiang, J. Federated learning for privacy-preserving open innovation future on digital health. In Humanity Driven AI: Productivity, Well-Being, Sustainability and Partnership; Springer: Berlin/Heidelberg, Germany, 2021; pp. 113–133. [Google Scholar]
- Joshi, M.; Pal, A.; Sankarasubbu, M. Federated learning for healthcare domain-pipeline, applications and challenges. ACM Trans. Comput. Healthc. 2022, 3, 1–36. [Google Scholar] [CrossRef]
- Paul, S.G.; Saha, A.; Hasan, Z.; Noori, S.R.H.; Moustafa, A. A Systematic Review of Graph Neural Network in Healthcare-Based Applications: Recent Advances, Trends, and Future Directions. IEEE Access 2024, 12, 15145–15170. [Google Scholar] [CrossRef]
- Islam, M.A.; Majumder, M.Z.H.; Miah, M.S.; Jannaty, S. Precision healthcare: A deep dive into machine learning algorithms and feature selection strategies for accurate heart disease prediction. Comput. Biol. Med. 2024, 176, 108432. [Google Scholar] [CrossRef]
- Rakhmatulin, I.; Dao, M.-S.; Nassibi, A.; Mandic, D. Exploring Convolutional Neural Network Architectures for EEG Feature Extraction. Sensors 2024, 24, 877. [Google Scholar] [CrossRef]
- Hassan, M.; Kaabouch, N. Impact of Feature Selection Techniques on the Performance of Machine Learning Models for Depression Detection Using EEG Data. Appl. Sci. 2024, 14, 10532. [Google Scholar] [CrossRef]
- Theng, D.; Bhoyar, K.K. Feature selection techniques for machine learning: A survey of more than two decades of research. Knowl. Inf. Syst. 2023, 66, 1575–1637. [Google Scholar] [CrossRef]
- Amiriebrahimabadi, M.; Mansouri, N. A comprehensive survey of feature selection techniques based on whale optimization algorithm. Multimedia Tools Appl. 2023, 83, 47775–47846. [Google Scholar] [CrossRef]
- Bounab, R.; Guelib, B.; Benzerogue, S.; Zarour, K. Optimizing Machine Learning for Healthcare Fraud Detection: A Framework Using Hybrid Feature Selection and Hyperparameter Tuning. In Proceedings of the 2024 International Conference on Advanced Aspects of Software Engineering (ICAASE), Constantine, Algeria, 9–10 November 2024; pp. 1–8. [Google Scholar]
- Puttelaar, R.V.D.; de Lima, P.N.; Knudsen, A.B.; Rutter, C.M.; Kuntz, K.M.; de Jonge, L.; Escudero, F.A.; Lieberman, D.; Zauber, A.G.; Hahn, A.I.; et al. Effectiveness and Cost-Effectiveness of Colorectal Cancer Screening With a Blood Test that Meets the Centers for Medicare & Medicaid Services Coverage Decision. Gastroenterology 2024, 167, 368–377. [Google Scholar] [CrossRef]
- Pennap, D.; Swain, R.S.; Akhtar, S.; Liao, J.; Wei, Y.; Li, J.; Wernecke, M.; MaCurdy, T.E.; Kelman, J.A.; Mosholder, A.D.; et al. Comparing the Centers for Medicare and Medicaid Services (CMS) enrollment data death dates to the National Death Index (NDI). Pharmacoepidemiol. Drug Saf. 2024, 33, e5772. [Google Scholar] [CrossRef] [PubMed]
- Hancock, J.T.; Wang, H.; Khoshgoftaar, T.M.; Liang, Q. Data reduction techniques for highly imbalanced Medicare Big Data. J. Big Data 2024, 11, 8. [Google Scholar] [CrossRef]
- Yang, X.; Zeng, Z.; Teo, S.G.; Wang, L.; Chandrasekhar, V.; Hoi, S. Deep learning for practical image recognition: Case study on Kaggle competitions. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018. [Google Scholar]
- Leone, N.; Greco, G.; Ianni, G.; Lio, V.; Terracina, G.; Eiter, T.; Staniszkis, W. The INFOMIX system for advanced integration of incomplete and inconsistent data. In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, Baltimore, MD, USA, 14–16 June 2005. [Google Scholar]
- Cai, L.; Li, J.; Lv, H.; Liu, W.; Niu, H.; Wang, Z. Integrating domain knowledge for biomedical text analysis into deep learning: A survey. J. Biomed. Inform. 2023, 143, 104418. [Google Scholar] [CrossRef]
- Murala, D.K.; Panda, S.K.; Dash, S.P. MedMetaverse: Medical Care of Chronic Disease Patients and Managing Data Using Artificial Intelligence, Blockchain, and Wearable Devices State-of-the-Art Methodology. IEEE Access 2023, 11, 138954–138985. [Google Scholar] [CrossRef]
- Warren, J.L.; Barrett, M.J.; White, D.P.; Banks, R.; Cafardi, S.; Enewold, L. Sensitivity of Medicare Data to Identify Oncologists. JNCI Monogr. 2020, 2020, 60–65. [Google Scholar] [CrossRef]
- Jacobs, J.P.; Shahian, D.M.; Grau-Sepulveda, M.; O’Brien, S.M.; Pruitt, E.Y.; Bloom, J.P.; Edgerton, J.R.; Kurlansky, P.A.; Habib, R.H.; Antman, M.S.; et al. Current Penetration, Completeness, and Representativeness of The Society of Thoracic Surgeons Adult Cardiac Surgery Database. Ann. Thorac. Surg. 2022, 113, 1461–1468. [Google Scholar] [CrossRef]
- Johnson, J.M.; Khoshgoftaar, T.M. Data-Centric AI for Healthcare Fraud Detection. SN Comput. Sci. 2023, 4, 1–14. [Google Scholar] [CrossRef]
- Duman, E. Implementation of XGBoost Method for Healthcare Fraud Detection. Sci. J. Mehmet Akif Ersoy Univ. 2022, 5, 69–75. [Google Scholar]
- Tariq, M.; Palade, V.; Ma, Y. Transfer learning based classification of diabetic retinopathy on the Kaggle EyePACS dataset. In International Conference on Medical Imaging and Computer-Aided Diagnosis; Springer: Singapore, 2022. [Google Scholar]
- Neto, E.C.P.; Dadkhah, S.; Sadeghi, S.; Molyneaux, H.; Ghorbani, A.A. A review of Machine Learning (ML)-based IoT security in healthcare: A dataset perspective. Comput. Commun. 2023, 213, 61–77. [Google Scholar] [CrossRef]
- Kumaraswamy, N.; Markey, M.K.; Barner, J.C.; Rascati, K. Feature engineering to detect fraud using healthcare claims data. Expert Syst. Appl. 2022, 210, 118433. [Google Scholar] [CrossRef]
- Haque, M.E.; Tozal, M.E. Identifying health insurance claim frauds using mixture of clinical concepts. IEEE Trans. Serv. Comput. 2021, 15, 2356–2367. [Google Scholar] [CrossRef]
- Mardani, S.; Moradi, H. Using Graph Attention Networks in Healthcare Provider Fraud Detection. IEEE Access 2024, 12, 132786–132800. [Google Scholar] [CrossRef]
- Xu, J.; Cai, H.; Zheng, X. Timing of vasopressin initiation and mortality in patients with septic shock: Analysis of the MIMIC-III and MIMIC-IV databases. BMC Infect. Dis. 2023, 23, 1–10. [Google Scholar] [CrossRef]
- Tian, J.; Cui, R.; Song, H.; Zhao, Y.; Zhou, T. Prediction of acute kidney injury in patients with liver cirrhosis using machine learning models: Evidence from the MIMIC-III and MIMIC-IV. Int. Urol. Nephrol. 2023, 56, 237–247. [Google Scholar] [CrossRef]
- Qayyum, A.; Qadir, J.; Bilal, M.; Al-Fuqaha, A. Secure and Robust Machine Learning for Healthcare: A Survey. IEEE Rev. Biomed. Eng. 2020, 14, 156–180. [Google Scholar] [CrossRef]
- Razzaq, K.; Shah, M. Advancing Cybersecurity Through Machine Learning: A Scientometric Analysis of Global Research Trends and Influential Contributions. J. Cybersecur. Priv. 2025, 5, 12. [Google Scholar] [CrossRef]
- Borky, J.M.; Bradley, T.H. Protecting information with cybersecurity. In Effective Model-Based Systems Engineering; Springer: Cham, Switzerland, 2019; pp. 345–404. [Google Scholar]
- Hassan, N.H.; Ismail, Z.; Maarop, N. A conceptual model for knowledge sharing towards information security culture in healthcare organization. In Proceedings of the 2013 International Conference on Research and Innovation in Information Systems (ICRIIS), Kuala Lumpur, Malaysia, 27–28 November 2013; pp. 516–520. [Google Scholar]
- Nifakos, S.; Chandramouli, K.; Nikolaou, C.K.; Papachristou, P.; Koch, S.; Panaousis, E.; Bonacina, S. Influence of Human Factors on Cyber Security within Healthcare Organisations: A Systematic Review. Sensors 2021, 21, 5119. [Google Scholar] [CrossRef]
- Georgiadou, A.; Mouzakitis, S.; Askounis, D. Assessing MITRE ATT&CK Risk Using a Cyber-Security Culture Framework. Sensors 2021, 21, 3267. [Google Scholar] [CrossRef]
- Papathanasiou, A.; Liontos, G.; Liagkou, V.; Glavas, E. Business email compromise (BEC) attacks: Threats, vulnerabilities and countermeasures—A perspective on the Greek landscape. J. Cybersecur. Priv. 2023, 3, 610–637. [Google Scholar] [CrossRef]
- Yang, X.; Zhang, C.; Sun, Y.; Pang, K.; Jing, L.; Wa, S.; Lv, C. FinChain-BERT: A High-Accuracy Automatic Fraud Detection Model Based on NLP Methods for Financial Scenarios. Information 2023, 14, 499. [Google Scholar] [CrossRef]
- Samariya, D.; Thakkar, A. A Comprehensive Survey of Anomaly Detection Algorithms. Ann. Data Sci. 2021, 10, 829–850. [Google Scholar] [CrossRef]
- Jabarulla, M.Y.; Lee, H.-N. A Blockchain and Artificial Intelligence-Based, Patient-Centric Healthcare System for Combating the COVID-19 Pandemic: Opportunities and Applications. Healthcare 2021, 9, 1019. [Google Scholar] [CrossRef] [PubMed]
- TensorFlow Developers. TensorFlow; Zenodo: Genève, Switzerland, 2022. [Google Scholar]
- Tolstoluzka, O.; Telezhenko, D. Development and training of LSTM models for control of virtual distributed systems using TensorFlow and Keras. Radioelectron. Comput. Syst. 2024, 2024, 27–37. [Google Scholar] [CrossRef]
- Abadi, Z.J.K.; Mansouri, N.; Javidi, M.M. Deep reinforcement learning-based scheduling in distributed systems: A critical review. Knowl. Inf. Syst. 2024, 66, 5709–5782. [Google Scholar] [CrossRef]
- Imambi, S.; Prakash, K.B.; Kanagachidambaresan, G. PyTorch. In Programming with TensorFlow: Solution for Edge Computing Applications; Springer: Cham, Switzerland, 2021; pp. 87–104. [Google Scholar]
- Kim, S.; Wimmer, H.; Kim, J. Analysis of deep learning libraries: Keras, PyTorch, and MXNet. In Proceedings of the 2022 IEEE/ACIS 20th International Conference on Software Engineering Research, Management and Applications (SERA), Las Vegas, NV, USA, 22–25 May 2022. [Google Scholar]
- Merlini, D.; Rossini, M. Text categorization with WEKA: A survey. Mach. Learn. Appl. 2021, 4, 100033. [Google Scholar] [CrossRef]
- Qamar, U.; Raza, M.S. Practical Data Science with WEKA. In Data Science Concepts and Techniques with Applications; Springer: Berlin/Heidelberg, Germany, 2023; pp. 393–448. [Google Scholar]
- Novac, O.-C.; Chirodea, M.C.; Novac, C.M.; Bizon, N.; Oproescu, M.; Stan, O.P.; Gordan, C.E. Analysis of the Application Efficiency of TensorFlow and PyTorch in Convolutional Neural Network. Sensors 2022, 22, 8872. [Google Scholar] [CrossRef]
- Joseph, F.J.J.; Nonsiri, S.; Monsakul, A. Keras and TensorFlow: A hands-on experience. In Advanced Deep Learning for Engineers and Scientists: A Practical Approach; Springer: Cham, Switzerland, 2021; pp. 85–111. [Google Scholar]
- Rivera-Escobedo, M.; López-Martínez, M.D.J.; Solis-Sánchez, L.O.; Guerrero-Osuna, H.A.; Vázquez-Reyes, S.; Acosta-Escareño, D.; Olvera-Olvera, C.A. Low-Scalability Distributed Systems for Artificial Intelligence: A Comparative Study of Distributed Deep Learning Frameworks for Image Classification. Appl. Sci. 2025, 15, 6251. [Google Scholar] [CrossRef]
- Ketkar, N.; Moolayil, J. Introduction to PyTorch. In Deep Learning with Python: Learn Best Practices of Deep Learning Models with PyTorch; Apress: New York, NY, USA, 2021; pp. 27–91. [Google Scholar]
- Shao, Y.; Zhang, C.; Xing, L.; Sun, H.; Zhao, Q.; Zhang, L. A new dust detection method for photovoltaic panel surface based on Pytorch and its economic benefit analysis. Energy AI 2024, 16, 100349. [Google Scholar] [CrossRef]
- Li, M.; Wen, K.; Lin, H.; Jin, X.; Wu, Z.; An, H.; Chi, M. Improving the Performance of Distributed MXNet with RDMA. Int. J. Parallel Program. 2019, 47, 467–480. [Google Scholar] [CrossRef]
- Stancato, G. Enhancing Parametric Design Education Through Rhinoceros/Grasshopper: Visual Perception Principles, Student Learning, and Future Integration with AI. In Advances in Representation: New AI-and XR-Driven Transdisciplinarity; Springer: Berlin/Heidelberg, Germany, 2024; pp. 813–824. [Google Scholar]
- Sachin, D.N.; Annappa, B.; Ambesange, S. Federated learning for digital healthcare: Concepts, applications, frameworks, and challenges. Computing 2024, 106, 3113–3150. [Google Scholar] [CrossRef]
- Bouh, M.M.; Hossain, F.; Paul, P.; Ahmed, A. Enhancing Medical Records Digitization Through a Post-OCR Processing Technique. In Proceedings of the 2024 IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBES), Penang, Malaysia, 11–13 December 2024; pp. 311–316. [Google Scholar]
- Hernandez, F.G.; Nguyen, Q.; Smith, V.C.; Cordero, J.A.; Ballester, M.R.; Duran, M.; Solé, A.; Chotsiri, P.; Wattanakul, T.; Mundin, G.; et al. Named entity recognition of pharmacokinetic parameters in the scientific literature. Sci. Rep. 2024, 14, 1–8. [Google Scholar] [CrossRef] [PubMed]
- Niu, H.; Omitaomu, O.A.; Langston, M.A.; Olama, M.; Ozmen, O.; Klasky, H.B.; Laurio, A.; Ward, M.; Nebeker, J. EHR-BERT: A BERT-based model for effective anomaly detection in electronic health records. J. Biomed. Inform. 2024, 150, 104605. [Google Scholar] [CrossRef] [PubMed]
- Purushothaman, S.; Shanmugam, G.S.; Nagarajan, S. Achieving Seamless Semantic Interoperability and Enhancing Text Embedding in Healthcare IoT: A Deep Learning Approach with Survey. SN Comput. Sci. 2023, 5, 1–28. [Google Scholar] [CrossRef]
- Diez, P.L.; Sundgaard, J.V.; Margeta, J.; Diab, K.; Patou, F.; Paulsen, R.R. Deep reinforcement learning and convolutional autoencoders for anomaly detection of congenital inner ear malformations in clinical CT images. Comput. Med. Imaging Graph. 2024, 113, 102343. [Google Scholar] [CrossRef]
- Ioannou, I.; Nagaradjane, P.; Angin, P.; Balasubramanian, P.; Kavitha, K.J.; Murugan, P.; Vassiliou, V. GEMLIDS-MIOT: A Green Effective Machine Learning Intrusion Detection System based on Federated Learning for Medical IoT network security hardening. Comput. Commun. 2024, 218, 209–239. [Google Scholar] [CrossRef]
- Imtiaz, M.A.; Razzaq, K.; Javed, M.A.; Masood, H.; Yousaf, H.F.; Siddique, H. An Enhanced Data Protection and Security based on Machine Learning: Deep Analysis on Threat Mitigation, Challenges in Internet of Medical Things (IoMTs). Spectr. Eng. Sci. 2025, 3, 496–521. [Google Scholar]
- Odufisan, O.I.; Abhulimen, O.V.; Ogunti, E.O. Harnessing artificial intelligence and machine learning for fraud detection and prevention in Nigeria. J. Econ. Criminol. 2025, 7, 100127. [Google Scholar] [CrossRef]
- Zhang, J.; Morley, J.; Gallifant, J.; Oddy, C.; Teo, J.T.; Ashrafian, H.; Delaney, B.; Darzi, A. Mapping and evaluating national data flows: Transparency, privacy, and guiding infrastructural transformation. Lancet Digit. Health 2023, 5, e737–e748. [Google Scholar] [CrossRef] [PubMed]
- Andrade, D. GDPR and Cross-Border Data Transfers in Clinical Trials. 2025. Available online: https://www.clinicaltrialvanguard.com/article/gdpr-and-cross-border-data-transfers-in-clinical-trials/ (accessed on 27 June 2025).
- Shaikh, T.A.; Rasool, T.; Verma, P.; Mir, W.A. A fundamental overview of ensemble deep learning models and applications: Systematic literature and state of the art. Ann. Oper. Res. 2024, 1–77. [Google Scholar] [CrossRef]
- Alahmadi, A.; Khan, H.A.; Shafiq, G.; Ahmed, J.; Ali, B.; Javed, M.A.; Alahmadi, A.H. A privacy-preserved IoMT-based mental stress detection framework with federated learning. J. Supercomput. 2024, 80, 10255–10274. [Google Scholar] [CrossRef]
- Kumar, R.; Garg, S.; Kaur, R.; Johar, M.G.M.; Singh, S.; Menon, S.V.; Kumar, P.; Hadi, A.M.; Hasson, S.A.; Lozanović, J. A comprehensive review of machine learning for heart disease prediction: Challenges, trends, ethical considerations, and future directions. Front. Artif. Intell. 2025, 8, 1583459. [Google Scholar] [CrossRef]
- Vyas, A.; Abimannan, S.; Hwang, R.H. Sensitive Healthcare Data: Privacy and Security Issues and Proposed Solutions. In Emerging Technologies for Healthcare: Internet of Things and Deep Learning Models; Wiley Online Library: Hoboken, NJ, USA, 2021; pp. 93–127. [Google Scholar]
- Sahoh, B.; Choksuriwong, A. The role of explainable Artificial Intelligence in high-stakes decision-making systems: A systematic review. J. Ambient. Intell. Humaniz. Comput. 2023, 14, 7827–7843. [Google Scholar] [CrossRef]
- Schreiber, M. New AI tool counters health insurance denials decided by automated algorithms. In The Guardian; American Medical Association: Chicago, IL, USA, 2025. [Google Scholar]
- Lebcir, I. Utilizing Machine Learning for Financial Management in Healthcare. South East. Eur. J. Public Health 2025, 26, 1529–1542. [Google Scholar]
- Dubey, S.; Verma, D.K.; Kumar, M. Severe acute respiratory syndrome Coronavirus-2 GenoAnalyzer and mutagenic anomaly detector using FCMFI and NSCE. Int. J. Biol. Macromol. 2023, 258, 129051. [Google Scholar] [CrossRef]
- Zhou, W.; Wu, S.; Wang, Y.; Zuo, L.; Yi, Y.; Cui, W. DMU-TransNet: Dense multi-scale U-shape transformer network for anomaly detection. Measurement 2024, 229, 114216. [Google Scholar] [CrossRef]
- Lu, S.; Zhang, W.; Guo, J.; Liu, H.; Li, H.; Wang, N. PatchCL-AE: Anomaly detection for medical images using patch-wise contrastive learning-based auto-encoder. Comput. Med. Imaging Graph. 2024, 114, 102366. [Google Scholar] [CrossRef]
- Settipalli, L.; Gangadharan, G. WMTDBC: An unsupervised multivariate analysis model for fraud detection in health insurance claims. Expert Syst. Appl. 2022, 215, 119259. [Google Scholar] [CrossRef]
- Kumaraswamy, N.; Ekin, T.; Park, C.; Markey, M.K.; Barner, J.C.; Rascati, K. Using a Bayesian Belief Network to detect healthcare fraud. Expert Syst. Appl. 2023, 238, 122241. [Google Scholar] [CrossRef]
- Cai, Y.; Zhang, W.; Chen, H.; Cheng, K.-T. MedIAnomaly: A comparative study of anomaly detection in medical images. Med. Image Anal. 2025, 102, 103500. [Google Scholar] [CrossRef] [PubMed]
- Alsalman, D. A Comparative Study of Anomaly Detection Techniques for IoT Security Using Adaptive Machine Learning for IoT Threats. IEEE Access 2024, 12, 14719–14730. [Google Scholar] [CrossRef]
- Sai, S.; Bhandari, K.S.; Nawal, A.; Chamola, V.; Sikdar, B. An IoMT-Based Incremental Learning Framework With a Novel Feature Selection Algorithm for Intelligent Diagnosis in Smart Healthcare. IEEE Trans. Mach. Learn. Commun. Netw. 2024, 2, 370–383. [Google Scholar] [CrossRef]
- Siddiqui, M.A.; Kalra, M.; Krishna, C.R. ADSBAN: Anomaly detection system for body area networks utilizing IoT and machine learning. Concurr. Comput. Pr. Exp. 2024, 36, e8075. [Google Scholar] [CrossRef]
- Seshagiri, S.; Prema, K.V. Efficient Handling of Data Imbalance in Health Insurance Fraud Detection Using Meta-Reinforcement Learning. IEEE Access 2025, 13, 23482–23497. [Google Scholar] [CrossRef]
- Mohanty, M.D.; Das, A.; Mohanty, M.N.; Altameem, A.; Nayak, S.R.; Saudagar, A.K.J.; Poonia, R.C. Design of smart and secured healthcare service using deep learning with modified SHA-256 algorithm. Healthcare 2022, 10, 1274. [Google Scholar] [CrossRef]
- Khosravi, P.; Mohammadi, S.; Zahiri, F.; Khodarahmi, M.; Zahiri, J. AI-Enhanced Detection of Clinically Relevant Structural and Functional Anomalies in MRI: Traversing the Landscape of Conventional to Explainable Approaches. J. Magn. Reson. Imaging 2024, 60, 2272–2289. [Google Scholar] [CrossRef]
- Hong, B.; Lu, P.; Xu, H.; Lu, J.; Lin, K.; Yang, F. Health insurance fraud detection based on multi-channel heterogeneous graph structure learning. Heliyon 2024, 10, e30045. [Google Scholar] [CrossRef]
- Hancock, J.T.; Bauder, R.A.; Wang, H.; Khoshgoftaar, T.M. Explainable machine learning models for Medicare fraud detection. J. Big Data 2023, 10, 154. [Google Scholar] [CrossRef]
- Khan, M.M.; Alkhathami, M. Anomaly detection in IoT-based healthcare: Machine learning for enhanced security. Sci. Rep. 2024, 14, 5872. [Google Scholar] [CrossRef]
- Devaguptam, S.; Gorti, S.S.; Akshaya, T.L.; Kamath, S.S. Automated Health Insurance Processing Framework with Intelligent Fraud Detection, Risk Classification and Premium Prediction. SN Comput. Sci. 2024, 5, 450. [Google Scholar] [CrossRef]
Category | Technique | Description | Algorithms | Strengths | References |
---|---|---|---|---|---|
Traditional ML | Supervised Learning | Trained on labelled data | DT, RF, LR, SVM | Effective with labelled data; detects known fraud patterns | [55,56,57,58,59] |
Traditional ML | Unsupervised Learning | Identifies hidden patterns | Clustering, Anomaly Detection | Detects novel fraud schemes; labels not needed | [60,61,62,63] |
Traditional ML | Semi-Supervised Learning | Uses labelled and unlabelled data | Mixed supervised/unsupervised | Works with limited labelled data | [64,65,66] |
Advanced ML | Deep Learning | Neural networks for complex pattern analysis | CNNs, LSTMs | Handles high-dimensional data; detects sophisticated fraud | [67,68,69,70,71,72,73] |
Advanced ML | Ensemble Methods | Combines multiple models | Boosting, Stacking | Robustness; high accuracy | [74,75,76] |
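
To make the unsupervised row of the table concrete, the following is a minimal sketch of label-free anomaly detection on claim amounts. It is not a method taken from the reviewed studies: the modified z-score (median and median absolute deviation), the 3.5 threshold, the function names, and the claim values are all illustrative assumptions.

```python
from statistics import median


def robust_z_scores(values):
    """Modified z-scores based on the median and the median absolute deviation (MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values sit at the median (or more than half do): nothing to scale by.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores
    # under a normal distribution.
    return [0.6745 * (v - med) / mad for v in values]


def flag_anomalies(values, threshold=3.5):
    """Return the indices whose modified z-score exceeds the (assumed) threshold."""
    scores = robust_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Hypothetical claim amounts for one procedure code; no fraud labels are used.
claims = [120.0, 135.0, 110.0, 128.0, 142.0, 117.0, 2500.0, 131.0]
flagged = flag_anomalies(claims)
print(flagged)  # indices of claims that deviate strongly from the rest
```

A flagged index marks a claim for manual review rather than confirming fraud, which reflects the trade-off summarised in the table: no labels are required, but unusual claims are not necessarily fraudulent.
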
Capability | Temporal Abstraction | Cross-Modal Learning | Interpretability | Deployment |
---|---|---|---|---|
LLMs (GPT, BERT) | Moderate | Strong | Moderate | High |
Graph-based models (TGAT, TGCN) | Strong | Moderate | Moderate | Moderate |
Computational Trade-offs | Graph-based models require more memory | LLMs need high computational power | GNNs are more transparent | LLMs are good in text domains, while GNNs are good in graph scenarios |
Framework | Key Functionalities | Strengths Relevant to Healthcare Fraud Detection | Limitations/Considerations | References |
---|---|---|---|---|
TensorFlow | Deep learning, numerical computation, large-scale ML, multi-language support, GPU/distributed processing, Keras integration. | Scalability, flexibility, strong industry adoption, and a comprehensive ecosystem for complex models. | It can have a steeper learning curve for beginners compared to Keras. | [138,139,145,146,147] |
PyTorch | Deep learning, dynamic computation graphs, strong GPU acceleration, and extensive libraries. | Flexibility, ease of use in research, strong community support, and rapid prototyping. | It may require more coding for basic tasks than Weka. | [141,148,149] |
MXNet | Scalable deep learning, multi-language support, and efficient training. | Scalability and support from major cloud providers. | No longer under active development as of September 2023. | [150,151] |
Weka | Data mining, machine learning algorithms, user-friendly GUI, data preprocessing, and visualisation. | Ease of use, intuitive interface, a wide range of algorithms, and suitable for non-programmers. | Less emphasis on deep learning compared to TensorFlow and PyTorch. | [143,144] |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Razzaq, K.; Shah, M. Next-Generation Machine Learning in Healthcare Fraud Detection: Current Trends, Challenges, and Future Research Directions. Information 2025, 16, 730. https://doi.org/10.3390/info16090730