Deep Learning for Credit Risk Prediction: A Survey of Methods, Applications, and Challenges
Abstract
1. Introduction
- We present a unified overview of credit risk prediction tasks and benchmark datasets, and trace the evolution from logistic regression to deep architectures, including MLPs, CNNs, RNNs, transformers, and GNNs, highlighting how each class aligns with different credit-risk objectives and data modalities.
- We provide a modality-aware synthesis of recent deep learning applications in credit risk, organising studies by tabular, sequential, transformer-based, and graph-based models, and collating their datasets, architectures, and reported performance.
- We critically analyse methodological and operational challenges for deploying DL-based credit risk models and derive concrete research directions for developing trustworthy, regulation-ready deep learning solutions.
2. Review Methodology
Search Strategy and Screening Protocol
3. Overview of Credit Risk Prediction and Datasets
3.1. Credit Risk Prediction Tasks
- Probability of Default (PD): The likelihood that a borrower will fail to meet contractual repayment obligations within a specified horizon. PD modelling forms the foundation of most credit scoring systems and risk-based pricing frameworks [28].
- Loss Given Default (LGD): The proportion of the exposure that is not recovered in the event of default, reflecting collateral values, recovery processes, and legal costs [29].
- Exposure at Default (EAD): The total outstanding amount a lender is exposed to when default occurs, which is particularly important for revolving facilities such as credit cards and overdrafts [30].
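The three quantities in the list above are conventionally combined into the expected loss (EL in the abbreviations list) as EL = PD × LGD × EAD. The following is a minimal Python sketch of that arithmetic only. The `Exposure` class, the `expected_loss` helper, and the two loan figures are hypothetical and are not taken from any surveyed study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exposure:
    """One credit exposure described by the three standard risk parameters."""
    pd: float   # probability of default over the horizon, in [0, 1]
    lgd: float  # loss given default: fraction of the exposure not recovered, in [0, 1]
    ead: float  # exposure at default, in currency units (non-negative)


def expected_loss(x: Exposure) -> float:
    """Return EL = PD * LGD * EAD for a single exposure."""
    if not (0.0 <= x.pd <= 1.0 and 0.0 <= x.lgd <= 1.0):
        raise ValueError("pd and lgd must lie in [0, 1]")
    if x.ead < 0.0:
        raise ValueError("ead must be non-negative")
    return x.pd * x.lgd * x.ead


# Hypothetical two-loan portfolio (illustrative numbers only).
portfolio = [
    Exposure(pd=0.02, lgd=0.45, ead=10_000.0),  # term loan: EL = 90
    Exposure(pd=0.10, lgd=0.80, ead=2_500.0),   # credit card: EL = 200
]

# Portfolio EL is the sum of the per-exposure expected losses.
total_el = sum(expected_loss(x) for x in portfolio)
print(f"{total_el:.2f}")  # prints 290.00
```

Keeping the three parameters separate mirrors the task split above: a PD, an LGD, and an EAD model can each be estimated and validated on its own before being multiplied together.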
3.2. Benchmark Datasets
4. Evolution of Models for Credit Risk
4.1. Logistic Regression and Classical Machine Learning
4.2. Early Deep Learning for Tabular Credit Risk
4.3. Sequential and Temporal Behaviour Modelling
4.3.1. Long Short-Term Memory Networks
4.3.2. Gated Recurrent Unit Networks
4.3.3. Temporal Convolutional Networks
4.4. Convolutional and Hybrid Architectures
4.5. Transformer
4.6. Graph Neural Networks
5. Notable DL Applications in Peer-Reviewed Credit Risk Studies
5.1. Tabular Deep Networks for Credit Risk
5.2. Sequential Models and Event–Time Targets
5.3. Transformer-Based Models for Credit Risk
5.4. Graph Neural Networks for Relational Credit Risk
6. Challenges, Limitations, and Future Research Directions
6.1. Challenges and Limitations
6.1.1. Evaluation Integrity
6.1.2. Imbalanced Learning and Reject Inference
6.1.3. Interpretability and Fairness
6.1.4. Robustness and Privacy
6.1.5. Operational Deployment and Governance
6.2. Future Research Directions
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations

| Abbreviation | Definition |
|---|---|
| AI | Artificial intelligence |
| AUC | Area under the receiver operating characteristic curve |
| AUPRC | Area under the precision–recall curve |
| CNNs | Convolutional neural networks |
| DL | Deep learning |
| EAD | Exposure at default |
| EL | Expected loss |
| GNNs | Graph neural networks |
| GRU | Gated recurrent unit |
| HDNN | High-dimensional deep neural network |
| I-DNN | Incremental DNN |
| LGD | Loss given default |
| LR | Logistic regression |
| LSTM | Long short-term memory |
| ML | Machine learning |
| MLOps | Machine learning operations |
| MLP | Multi-layer perceptron |
| MS-CGNN | Multi-structure cascaded GNN |
| NODE | Neural Oblivious Decision Ensembles |
| PD | Probability of default |
| PDNN | Penalised deep neural network |
| RNNs | Recurrent neural networks |
| RWA | Risk-weighted asset |
| SME | Small and medium enterprise |
| SVMs | Support vector machines |
| TCNs | Temporal convolutional networks |
| XAI | Explainable AI |
References
- Zhu, Y.; Wu, D. P2P credit risk management with KG-GNN: A knowledge graph and graph neural network-based approach. J. Oper. Res. Soc. 2025, 76, 866–880.
- Alagic, A.; Zivic, N.; Kadusic, E.; Hamzic, D.; Hadzajlic, N.; Dizdarevic, M.; Selmanovic, E. Machine learning for an enhanced credit risk analysis: A comparative study of loan approval prediction models integrating mental health data. Mach. Learn. Knowl. Extr. 2024, 6, 53–77.
- Karami, A.; Igbokwe, C. The impact of big data characteristics on credit risk assessment. Int. J. Data Sci. Anal. 2025, 20, 4239–4259.
- Talaat, F.M.; Aljadani, A.; Badawy, M.; Elhosseini, M. Toward interpretable credit scoring: Integrating explainable artificial intelligence with deep learning for credit card default prediction. Neural Comput. Appl. 2024, 36, 4847–4865.
- Aruleba, I.; Sun, Y. Effective credit risk prediction using ensemble classifiers with model explanation. IEEE Access 2024, 12, 115015–115025.
- Machado, M.R.; Chen, D.T.; Osterrieder, J.R. An analytical approach to credit risk assessment using machine learning models. Decis. Anal. J. 2025, 16, 100605.
- Tian, Z.; Xiao, J.; Feng, H.; Wei, Y. Credit risk assessment based on gradient boosting decision tree. Procedia Comput. Sci. 2020, 174, 150–160.
- Aruleba, I.; Sun, Y. Enhanced credit risk prediction using deep learning and SMOTE-ENN resampling. Mach. Learn. Appl. 2025, 21, 100692.
- Sun, P.; Jia, Y.; Shi, Y.; Ren, J.; Li, Z.; Li, X. Artificial Intelligence Credit Risk Assessment Model Based on MLP-Hybrid Clustering. Complexity 2025, 2025, 3308222.
- Mienye, I.D.; Esenogho, E.; Modisane, C. Deep Reinforcement Learning in the Era of Foundation Models: A Survey. Computers 2026, 15, 40.
- Bhatore, S.; Mohan, L.; Reddy, Y.R. Machine learning techniques for credit risk evaluation: A systematic literature review. J. Bank. Financ. Technol. 2020, 4, 111–138.
- Shi, S.; Tse, R.; Luo, W.; D’Addona, S.; Pau, G. Machine learning-driven credit risk: A systemic review. Neural Comput. Appl. 2022, 34, 14327–14339.
- Noriega, J.P.; Rivera, L.A.; Herrera, J.A. Machine learning for credit risk prediction: A systematic literature review. Data 2023, 8, 169.
- Montevechi, A.A.; de Carvalho Miranda, R.; Medeiros, A.L.; Montevechi, J.A.B. Advancing credit risk modelling with Machine Learning: A comprehensive review of the state-of-the-art. Eng. Appl. Artif. Intell. 2024, 137, 109082.
- Kim, H.; Cho, H.; Ryu, D. Corporate default predictions using machine learning: Literature review. Sustainability 2020, 12, 6325.
- Çallı, B.A.; Coşkun, E. A longitudinal systematic review of credit risk assessment and credit default predictors. Sage Open 2021, 11, 21582440211061333.
- Mhlanga, D. Financial inclusion in emerging economies: The application of machine learning and artificial intelligence in credit risk assessment. Int. J. Financ. Stud. 2021, 9, 39.
- Hayashi, Y. Emerging trends in deep learning for credit scoring: A review. Electronics 2022, 11, 3181.
- Peng, K.; Yan, G. A survey on deep learning for financial risk prediction. Quant. Financ. Econ. 2021, 5, 716–737.
- Hoyos Gutiérrez, S.P.; Santos López, F.M. Credit Risk Assessment System Based on Deep Learning: A Systematic Literature Review. In Proceedings of the International Conference on Computer Science, Electronics and Industrial Engineering (CSEI); Springer: Berlin/Heidelberg, Germany, 2023; pp. 395–413.
- Demma Wube, H.; Zekarias Esubalew, S.; Fayiso Weldesellasie, F.; Girma Debelee, T. Deep learning and machine learning techniques for credit scoring: A review. In Proceedings of the Pan African Conference on Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2024; pp. 30–61.
- Mienye, E.; Jere, N.; Obaido, G.; Mienye, I.D.; Aruleba, K. Deep Learning in Finance: A Survey of Applications and Techniques. AI 2024, 5, 2066–2091.
- Valdrighi, G.; M Ribeiro, A.; SB Pereira, J.; Guardieiro, V.; Hendricks, A.; Miranda Filho, D.; Nieto Garcia, J.D.; F Bocca, F.; B Veronese, T.; Wanner, L.; et al. Best practices for responsible machine learning in credit scoring. Neural Comput. Appl. 2025, 37, 20781–20821.
- Paz, Á.; Crawford, B.; Monfroy, E.; Barrera-García, J.; Peña Fritz, Á.; Soto, R.; Cisternas-Caneo, F.; Yáñez, A. Machine Learning and Metaheuristics Approach for Individual Credit Risk Assessment: A Systematic Literature Review. Biomimetics 2025, 10, 326.
- Alvi, J.; Arif, I.; Nizam, K. Advancing financial resilience: A systematic review of default prediction models and future directions in credit risk management. Heliyon 2024, 10, e39770.
- Bhattacharya, A.; Biswas, S.K.; Mandal, A. Credit risk evaluation: A comprehensive study. Multimed. Tools Appl. 2023, 82, 18217–18267.
- Gunnarsson, B.R.; Vanden Broucke, S.; Baesens, B.; Óskarsdóttir, M.; Lemahieu, W. Deep learning for credit scoring: Do or don’t? Eur. J. Oper. Res. 2021, 295, 292–305.
- Thomas, L.; Crook, J.; Edelman, D. Credit Scoring and Its Applications; SIAM: Philadelphia, PA, USA, 2017.
- Bandyopadhyay, A. Loan level loss given default (LGD) study of Indian banks. IIMB Manag. Rev. 2022, 34, 168–177.
- Wattanawongwan, S.; Mues, C.; Okhrati, R.; Choudhry, T.; So, M.C. Modelling credit card exposure at default using vine copula quantile regression. Eur. J. Oper. Res. 2023, 311, 387–399.
- Hofmann, H. Statlog (German Credit Data). UCI Machine Learning Repository. 1994. Available online: https://archive.ics.uci.edu/dataset/144/statlog+german+credit+data (accessed on 1 October 2025).
- Yeh, I.C.; Lien, C.H. The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients. Expert Syst. Appl. 2009, 36, 2473–2480.
- Lessmann, S.; Baesens, B.; Seow, H.V.; Thomas, L.C. Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research. Eur. J. Oper. Res. 2015, 247, 124–136.
- Montoya, A.; Inversion; KirillOdintsov; Kotek, M. Home Credit Default Risk. Kaggle. 2018. Available online: https://kaggle.com/competitions/home-credit-default-risk (accessed on 5 December 2025).
- Malekipirbazari, M.; Aksakalli, V. Risk assessment in social lending via random forests. Expert Syst. Appl. 2015, 42, 4621–4631.
- Nwafor, C.N.; Nwafor, O.; Brahma, S. Enhancing transparency and fairness in automated credit decisions: An explainable novel hybrid machine learning approach. Sci. Rep. 2024, 14, 25174.
- de Oliveira, N.A.; Basso, L.F.C. Advancing Credit Rating Prediction: The Role of Machine Learning in Corporate Credit Rating Assessment. Risks 2025, 13, 116.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Mienye, I.D.; Swart, T.G.; Obaido, G. Recurrent neural networks: A comprehensive review of architectures, variants, and applications. Information 2024, 15, 517.
- Mienye, I.D.; Esenogho, E.; Modisane, C. Detecting Imbalanced Credit Card Fraud via Hybrid Graph Attention and Variational Autoencoder Ensembles. AppliedMath 2025, 5, 131.
- Bai, S.; Kolter, J.Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv 2018, arXiv:1803.01271.
- Liu, M.; Xia, C.; Xia, Y.; Deng, S.; Wang, Y. TDCN: A novel temporal depthwise convolutional network for short-term load forecasting. Int. J. Electr. Power Energy Syst. 2025, 165, 110512.
- Dong, A.; Starr, A.; Zhao, Y. An interpretable temporal convolutional framework for Granger causality analysis. IEEE/CAA J. Autom. Sin. 2025, 13, 1–15.
- Vashishth, T.K.; Sharma, V.; Sharma, K.K.; Ahamad, S.; Kaushik, V. Financial Forecasting with Convolutional Neural Networks (CNNs): Trends and Challenges. In Shaping Cutting-Edge Technologies and Applications for Digital Banking and Financial Services; Taylor & Francis: Abingdon, UK, 2025; pp. 62–81.
- Mienye, I.D.; Swart, T.G. A comprehensive review of deep learning: Architectures, recent advances, and applications. Information 2024, 15, 755.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 6000–6010.
- Huang, X.; Khetan, A.; Cvitkovic, M.; Karnin, Z. Tabtransformer: Tabular data modeling using contextual embeddings. arXiv 2020, arXiv:2012.06678.
- Yang, M.; Lim, M.K.; Qu, Y.; Li, X.; Ni, D. Deep neural networks with L1 and L2 regularization for high dimensional corporate credit risk prediction. Expert Syst. Appl. 2023, 213, 118873.
- Lin, C.; Qiao, N.; Zhang, W.; Li, Y.; Ma, S. Default risk prediction and feature extraction using a penalized deep neural network. Stat. Comput. 2022, 32, 76.
- Asencios, R.; Asencios, C.; Ramos, E. Profit scoring for credit unions using the multilayer perceptron, XGBoost and TabNet algorithms: Evidence from Peru. Expert Syst. Appl. 2023, 213, 119201.
- Wang, S.; Zhang, X. Research on credit default prediction model based on TabNet-stacking. Entropy 2024, 26, 861.
- Hjelkrem, L.O.; Lange, P.E.d. Explaining deep learning models for credit scoring with SHAP: A case study using Open Banking Data. J. Risk Financ. Manag. 2023, 16, 221.
- Li, G.; Zhu, F.; Zhang, Y.; Li, M. A Data-Driven Incremental Deep Neural Network for Borrower Credit Scoring. SSRN 2023.
- Popov, S.; Morozov, S.; Babenko, A. Neural oblivious decision ensembles for deep learning on tabular data. arXiv 2019, arXiv:1909.06312.
- Shan, W.; Gao, B. Stacked Ensemble Model with Enhanced TabNet for SME Supply Chain Financial Risk Prediction. Systems 2025, 13, 892.
- Liang, L.; Cai, X. Forecasting peer-to-peer platform default rate with LSTM neural network. Electron. Commer. Res. Appl. 2020, 43, 100997.
- Ala’raj, M.; Abbod, M.F.; Majdalawieh, M.; Jum’a, L. A deep learning model for behavioural credit scoring in banks. Neural Comput. Appl. 2022, 34, 5839–5866.
- Zhang, L. The Evaluation on the Credit Risk of Enterprises with the CNN-LSTM-ATT Model. Comput. Intell. Neurosci. 2022, 2022, 6826573.
- Li, J.; Xu, C.; Feng, B.; Zhao, H. Credit risk prediction model for listed companies based on CNN-LSTM and attention mechanism. Electronics 2023, 12, 1643.
- Wang, H.; Bellotti, A.; Qu, R.; Bai, R. Discrete-Time Survival Models with Neural Networks for Age–Period–Cohort Analysis of Credit Risk. Risks 2024, 12, 31.
- Chen, B.; Long, S. A novel end-to-end corporate credit rating model based on self-attention mechanism. IEEE Access 2020, 8, 203876–203889.
- Han, D.; Guo, W.; Chen, Y.; Wang, B.; Li, W. Personal credit default prediction fusion framework based on self-attention and cross-network algorithms. Eng. Appl. Artif. Intell. 2024, 133, 107977.
- Shi, X.; Tang, D.; Yu, Y. Credit Scoring Prediction Using Deep Learning Models in the Financial Sector. IEEE Access 2025, 13, 130731–130746.
- Yang, Y.; Lin, Y.; Zhang, Y.; Su, Z.; Goh, C.C.; Fang, T.; Bellotti, A.G.; Lee, B.G. Transforming Credit Risk Analysis: A Time-Series-Driven ResE-BiLSTM Framework for Post-Loan Default Detection. arXiv 2025, arXiv:2508.00415.
- Zhang, Y. AI-Driven Framework for Financial Risk Management: Enhancing Decision-Making with LSTM Networks and Probabilistic Models. In Proceedings of the 2025 2nd International Conference on Intelligent Computing and Robotics (ICICR); IEEE: Piscataway, NJ, USA, 2025; pp. 176–181.
- Gorishniy, Y.; Rubachev, I.; Khrulkov, V.; Babenko, A. Revisiting deep learning models for tabular data. Adv. Neural Inf. Process. Syst. 2021, 34, 18932–18943.
- Wang, C.; Xiao, Z. A deep learning approach for credit scoring using feature embedded transformer. Appl. Sci. 2022, 12, 10995.
- Korangi, K.; Mues, C.; Bravo, C. A transformer-based model for default prediction in mid-cap corporate markets. Eur. J. Oper. Res. 2023, 308, 306–320.
- Li, J.; Zhou, Z.; Zhang, J.; Cheng, D.; Jiang, C. HFTCRNet: Hierarchical fusion transformer for interbank credit rating and risk assessment. IEEE Trans. Neural Netw. Learn. Syst. 2024, 36, 13006–13020.
- Kakadiya, R.; Khan, T.; Diwan, A.; Mahadeva, R. Transformer Models for Predicting Bank Loan Defaults a Next-Generation Risk Management. In Proceedings of the 2024 IEEE 6th International Conference on Cybernetics, Cognition and Machine Learning Applications (ICCCMLA); IEEE: Piscataway, NJ, USA, 2024; pp. 26–31.
- Zhang, Y.; Liang, X. Personal Credit Risk Prediction Based on Minimum Weight Value Error Combination Model. In Proceedings of the 2025 8th International Conference on Advanced Algorithms and Control Engineering (ICAACE); IEEE: Piscataway, NJ, USA, 2025; pp. 307–313.
- Hartomo, K.D.; Arthur, C.; Nataliani, Y. A novel weighted loss tabtransformer integrating explainable ai for imbalanced credit risk datasets. IEEE Access 2025, 13, 31045–31056.
- Wu, Y. Enterprise financial sharing and risk identification model combining recurrent neural networks with transformer model supported by blockchain. Heliyon 2024, 10, e32639.
- Stevenson, M.; Mues, C.; Bravo, C. The value of text for small business default prediction: A deep learning approach. Eur. J. Oper. Res. 2021, 295, 758–771.
- Lu, S.; Zhang, X.; Su, Y.; Liu, X.; Yu, L. Efficient multimodal learning for corporate credit risk prediction with an extended deep belief network. In Annals of Operations Research; Springer: Berlin/Heidelberg, Germany, 2025; pp. 1–38.
- Schwab, B.; Kriebel, J. Mitigating adversarial attacks on transformer models in credit scoring. Eur. J. Oper. Res. 2025, 328, 309–323.
- Wang, J.; Liu, G.; Xu, X.; Xing, X. Credit risk prediction for small and medium enterprises utilizing adjacent enterprise data and a relational graph attention network. J. Manag. Sci. Eng. 2024, 9, 177–192.
- Song, L.; Li, H.; Tan, Y.; Li, Z.; Shang, X. Enhancing enterprise credit risk assessment with cascaded multi-level graph representation learning. Neural Netw. 2024, 169, 475–484.
- Yuan, Q.; Liu, Y.; Tang, Y.; Chen, X.; Zheng, X.; He, Q.; Ao, X. Dynamic Graph Learning with Static Relations for Credit Risk Assessment. In Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025; Volume 39, pp. 13133–13141.
- Mojdehi, K.F.; Amiri, B.; Haddadi, A. A Novel Hybrid Model for Credit Risk Assessment of Supply Chain Finance Based on Topological Data Analysis and Graph Neural Network. IEEE Access 2025, 13, 13101–13127.
- Wang, D.; Zhang, Z.; Zhao, Y.; Huang, K.; Kang, Y.; Zhou, J. Financial default prediction via motif-preserving graph neural network with curriculum learning. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Long Beach, CA, USA, 6–10 August 2023; pp. 2233–2242.
- Liu, B.; Li, I.; Yao, J.; Chen, Y.; Huang, G.; Wang, J. Unveiling the Potential of Graph Neural Networks in SME Credit Risk Assessment. In Proceedings of the 2024 5th International Conference on Intelligent Computing and Human-Computer Interaction (ICHCI); IEEE: Piscataway, NJ, USA, 2024; pp. 562–566.
- Zhang, Z.; Shen, Q.; Hu, Z.; Liu, Q.; Shen, H. Credit risk analysis for SMEs using graph neural networks in supply chain. In Proceedings of the 2025 International Conference on Big Data, Artificial Intelligence and Digital Economy, Kunming, China, 18–20 July 2025; pp. 81–85.
- Cheng, C.; Luo, C. Enterprise Credit Rating Framework Based on Risk Contagion Graph Neural Network. In Proceedings of the International Conference on Machine Learning and Soft Computing; Springer: Berlin/Heidelberg, Germany, 2025; pp. 243–254.
- Bergmeir, C.; Benítez, J.M. On the use of cross-validation for time series predictor evaluation. Inf. Sci. 2012, 191, 192–213.
- Fonseca, P.G.; Lopes, H.D. Calibration of machine learning classifiers for probability of default modelling. arXiv 2017, arXiv:1710.08901.
- Guo, C.; Pleiss, G.; Sun, Y.; Weinberger, K.Q. On calibration of modern neural networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 1321–1330.
- Idwan, S.; Etaiwi, W.; Rafayia, H.; Matar, I. A comprehensive review of statistical variants and enhancements of SMOTE oversampling method. Int. J. Data Sci. Anal. 2025, 20, 6887–6904.
- Hu, X.; Chen, H.; Zhang, J.; Chen, H.; Liu, S.; Li, X.; Wang, Y.; Xue, X. GAT-COBO: Cost-sensitive graph neural network for telecom fraud detection. IEEE Trans. Big Data 2024, 10, 528–542.
- Andrae, S. Fairness and bias in machine learning models for credit decisions. In Machine Learning and Modeling Techniques in Financial Data Science; IGI Global Scientific Publishing: Hershey, PA, USA, 2025; pp. 1–24.
- Liao, J.; Wang, W.; Xue, J.; Lei, A.; Han, X.; Lu, K. Combating sampling bias: A self-training method in credit risk models. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 22 February–1 March 2022; Volume 36, pp. 12566–12572.
- Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215.
- Hardt, M.; Price, E.; Srebro, N. Equality of opportunity in supervised learning. Adv. Neural Inf. Process. Syst. 2016, 29, 3323–3331.
- Wachter, S.; Mittelstadt, B.; Russell, C. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. JL Tech. 2017, 31, 841.
- Greco, S.; Vacchetti, B.; Apiletti, D.; Cerquitelli, T. Unsupervised concept drift detection from deep learning representations in real-time. IEEE Trans. Knowl. Data Eng. 2025, 37, 6232–6245.
- Ximenes, R.; Alves, A.P.S.; Escovedo, T.; Spinola, R.; Kalinowski, M. Investigating Issues that Lead to Code Technical Debt in Machine Learning Systems. In Proceedings of the 2025 IEEE/ACM 4th International Conference on AI Engineering–Software Engineering for AI (CAIN); IEEE: Piscataway, NJ, USA, 2025; pp. 173–183.
- Bartlett, R.; Morse, A.; Stanton, R.; Wallace, N. Consumer-lending discrimination in the FinTech Era. J. Financ. Econ. 2022, 143, 30–56.
- Pradhan, R.; Alazzam, M.B.; Keswani, S.; Bhasin, N.K.; Jaff, N.A.; Muthuperumal, S. A Hybrid GRU-Transformer Model for Financial Forecasting and Risk Management. In Proceedings of the 2025 3rd International Conference on Integrated Circuits and Communication Systems (ICICACS); IEEE: Piscataway, NJ, USA, 2025; pp. 1–5.
- Amershi, S.; Begel, A.; Bird, C.; DeLine, R.; Gall, H.; Kamar, E.; Nagappan, N.; Nushi, B.; Zimmermann, T. Software engineering for machine learning: A case study. In Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP); IEEE: Piscataway, NJ, USA, 2019; pp. 291–300.






| Study | Year | Scope |
|---|---|---|
| Valdrighi et al. [23] | 2025 | Best practices for responsible ML in credit scoring, covering fairness, explainability, and governance. |
| Paz et al. [24] | 2025 | Systematic review of ML and metaheuristics for individual credit risk assessment. |
| Alvi et al. [25] | 2024 | Systematic review of default prediction ML models and their role in strengthening credit risk management. |
| Montevechi et al. [14] | 2024 | Comprehensive review of state-of-the-art ML models for credit risk. |
| Demma Wube et al. [21] | 2024 | Review of ML and DL techniques for credit scoring. |
| Noriega et al. [13] | 2023 | Systematic review of ML methods for credit risk prediction, with emphasis on algorithms, datasets, and performance metrics. |
| Bhattacharya et al. [26] | 2023 | Comprehensive study of credit risk evaluation methods, including statistical and ML models. |
| Hoyos et al. [20] | 2023 | Systematic review of DL-based credit risk assessment systems, summarising architectures, application settings, and evaluation measures. |
| Hayashi [18] | 2022 | Review of emerging trends in DL for credit scoring, focusing on neural network architectures, feature learning, and interpretability. |
| Shi et al. [12] | 2022 | A survey of ML-driven credit risk, organising algorithms, data sources, and evaluation methods. |
| Çallı and Coşkun [16] | 2021 | Longitudinal systematic review of credit risk assessment and default predictors. |
| Mhlanga [17] | 2021 | Review of ML for credit risk assessment in the context of financial inclusion in emerging economies. |
| Peng and Yan [19] | 2021 | Survey of DL for financial risk prediction across multiple tasks, with credit risk as one application area. |
| Gunnarsson et al. [27] | 2021 | Empirical study on when DL is beneficial for credit scoring, practical adoption, and comparison with traditional scoring. |
| Bhatore et al. [11] | 2020 | Systematic review of ML techniques for credit risk evaluation, covering classifiers, feature selection, and benchmark datasets. |
| Kim et al. [15] | 2020 | Literature review of corporate default prediction models, summarising statistical and ML approaches. |

| Dataset | Sample Size | Features | Description |
|---|---|---|---|
| German Credit | 1000 | 20 | Demographic and financial attributes with binary good/bad credit labels. |
| Australian Credit | 690 | 14 | Mixed categorical and numerical features for creditworthiness classification. |
| Taiwan Credit Card Default | 30,000 | 24 | Client payment history, billing amounts, and demographics. |
| Home Credit Default Risk | 300,000+ | 122 | Realistic industrial dataset combining behavioural and bureau information. |
| Lending Club | Millions | 100+ | Peer-to-peer lending records, including loan, borrower, and repayment details. |

| Model Class | Input Modality | Core Mechanism | Credit-Risk Strengths | Key Limitations |
|---|---|---|---|---|
| MLP | Tabular features | Feed-forward non-linear transformation | Learns complex interactions beyond linear scorecards; scalable and easy to deploy | No temporal or relational reasoning; feature engineering required |
| CNN | Sequential/behavioural time-series | Local shared-weight convolutions with pooling | Detects repayment and utilisation motifs; efficient training and parallelisable | Captures only local temporal patterns; struggles with long-range dependencies |
| Recurrent Models (RNN, LSTM, GRU) | Temporal behavioural sequences | Hidden state propagation with gating | Models long-term delinquency, repayment evolution, and behavioural drift | Training cost increases with long sequences; vanishing gradients; limited parallelism |
| Transformers | Tabular or sequential data | Global multi-head self-attention, contextual embeddings | Captures global temporal/feature interactions; scalable training; interpretability via attention | Requires larger data volume and tuning; less effective on tiny datasets |
| GNNs | Relational borrower networks | Iterative message passing and neighbour aggregation | Models contagion effects, systemic dependencies, and hidden risk propagation | Requires graph construction; sensitive to noise and missing relationships |
| Hybrid Architectures (e.g., CNN–LSTM) | Heterogeneous data (tabular, text, sequence, network) | Combined feature extractors and fusion layers | Balances interpretability, scalability, and multimodal learning; strongest performance on real-world data | Increased complexity; harder to interpret and validate for regulation |
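To make the "iterative message passing and neighbour aggregation" entry for GNNs in the table above concrete, the following is a minimal Python sketch of the aggregation skeleton only. It is not any surveyed model: practical GNNs apply learned weight matrices and non-linearities at each layer. The function name `message_passing_step`, the `self_weight` mixing parameter, and the three-borrower toy graph are illustrative assumptions.

```python
from typing import Dict, List


def message_passing_step(
    features: Dict[str, List[float]],
    neighbours: Dict[str, List[str]],
    self_weight: float = 0.5,
) -> Dict[str, List[float]]:
    """One round of mean-neighbour aggregation (no learned parameters).

    Each node's new vector is a convex mix of its own vector and the
    element-wise mean of its neighbours' vectors from the previous round.
    """
    updated = {}
    for node, h in features.items():
        nbrs = neighbours.get(node, [])
        if nbrs:
            # Element-wise mean over the neighbours' current vectors.
            agg = [
                sum(features[n][d] for n in nbrs) / len(nbrs)
                for d in range(len(h))
            ]
        else:
            # Isolated node: nothing to aggregate, keep its own vector.
            agg = list(h)
        updated[node] = [
            self_weight * own + (1.0 - self_weight) * msg
            for own, msg in zip(h, agg)
        ]
    return updated


# Hypothetical chain A - B - C with a one-dimensional "risk signal" per borrower.
features = {"A": [0.9], "B": [0.1], "C": [0.1]}
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

one_hop = message_passing_step(features, neighbours)
two_hop = message_passing_step(one_hop, neighbours)
```

In this toy chain, borrower C is unchanged after one round because it is not adjacent to A, and only picks up A's elevated signal after the second round. This is the sense in which stacking message-passing layers lets a model capture multi-hop contagion effects.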
| Modelling Focus | Reference | Year | Methods and Application |
|---|---|---|---|
| Tabular DL Models | Popov et al. [54] | 2019 | NODE differentiable tree ensembles; match gradient boosting on credit-style tabular tasks while remaining fully differentiable. |
| | Lin et al. [49] | 2022 | Penalised DNN survival model for P2P time-to-default; embedded penalties support feature selection and improve PD estimation. |
| | Yang et al. [48] | 2023 | HDNN with L1–L2 regularisation for corporate credit; Acc = 80.12% and outperforms LR, SVM, and baseline DNN. |
| | Asencios et al. [50] | 2023 | MLP, XGBoost, and TabNet for profit scoring; XGBoost best, TabNet slightly lower but more interpretable. |
| | Hjelkrem and Lange [52] | 2023 | MLP on open-banking transactions with SHAP; outperforms a BERT model and yields intuitive behavioural risk drivers. |
| | Li et al. [53] | 2023 | Incremental DNN for agricultural microloans under concept drift; 1.4–7.8 pp AUC gains over DNN, XGBoost, and RF. |
| | Wang and Zhang [51] | 2024 | TabNet–stacking ensemble on large-scale credit; Acc = 0.979, AUC = 0.941 on Tianchi. |
| | Shan and Gao [55] | 2025 | Stacked TabNet with multi-stage optimisation for SME supply-chain risk; AUC = 0.9616, Acc = 0.9277, above TabNet, LightGBM, and CatBoost. |
| Sequential Models | Liang and Cai [56] | 2020 | LSTM for Lending Club monthly default-rate forecasts; MAE 0.072 and RMSE 0.093, better than ARIMA, SVM, and ANN. |
| | Chen and Long [61] | 2020 | Self-attention end-to-end corporate rating; removes manual aggregation and stabilises ratings vs classical ML. |
| | Ala’raj et al. [57] | 2022 | Behavioural LSTM variants for credit cards; exceed SVM, RF, MLP, and LR in PD prediction and calibration. |
| | Zhang [58] | 2022 | CNN–LSTM–attention for enterprise credit; AUC = 0.92 and F1 = 0.91, outperforming CNN-only and LSTM-only models. |
| | Li et al. [59] | 2023 | CNN–LSTM for listed corporates; CNN captures short-term motifs, LSTM long-term behaviour, improving discrimination. |
| | Wang et al. [60] | 2024 | Deep discrete-time survival with age–period–cohort decomposition; yields smooth credit hazard curves and macro/maturity structure. |
| | Han et al. [62] | 2024 | Self-attention plus cross-network for default prediction; improves accuracy, precision, recall, and F1 over baseline DL. |
| | Shi et al. [63] | 2025 | Benchmark of CNN, RNN, and DNN for financial credit scoring; RNN best temporal sensitivity, CNN most parameter-efficient. |
| | Yang et al. [64] | 2025 | Residual-enhanced BiLSTM with multi-head attention on Freddie Mac data; AUC = 0.982, F1 = 0.958, beating BiLSTM, GRU, CNN, and RNN. |
| | Zhang [65] | 2025 | LSTM encoders with Bayesian calibration for corporate risk; Acc = 0.972, AUC = 0.981 and reduced uncertainty miscalibration vs RF and LR. |
| Transformer-Based Models | Huang et al. [47] | 2020 | TabTransformer with contextual embeddings for high-cardinality categorical features; improves over MLP and tree baselines. |
| Gorishniy et al. [66] | 2021 | FT-Transformer for tabular data; attention blocks match or surpass CatBoost/XGBoost on nonlinear financial tasks. | |
| Stevenson et al. [74] | 2021 | BERT embeddings from SME loan texts; text alone gives competitive default prediction without structured variables. | |
| Wang and Xiao [67] | 2022 | Feature-embedded transformer fusing behavioural sequences and static features for online lending; AUC = 0.72, KS = 0.32, better than LR, XGBoost, and LSTM. | |
| Korangi et al. [68] | 2023 | Transformer for mid-cap corporate multi-horizon default; multi-channel panel design yields higher AUC than statistical and LSTM baselines. | |
| Li et al. [69] | 2024 | HFTCRNet hierarchical fusion transformer for interbank ratings and systemic risk; temporal + graph transformers and contagion module outperform other models on 4548 banks. | |
| Kakadiya et al. [70] | 2024 | Transformer models for bank loan default; self-attention captures higher-order interactions and beats LR and tree ensembles. | |
| Wu [73] | 2024 | BiLSTM-Transformer on a blockchain financial sharing platform; multimodal text/visual enterprise risk identification with Acc > 94% and AUC > 0.95. | |
| Hartomo et al. [72] | 2025 | Weighted-loss TabTransformer with SHAP-based XAI for imbalanced MSME and consumer credit; increases accuracy and minority-class AUC/PR (e.g., 86.35%→89.27%). | |
| Zhang and Liang [71] | 2025 | Minimum weighted value error combination of BERT-based temporal encoder and DNN/MLP experts; dynamic weighting improves personal credit classification vs single models. | |
| Lu et al. [75] | 2025 | BERT plus residual blocks to fuse textual and numeric signals for corporate credit; improves classification vs. single-modality and non-residual baselines. | |
| Schwab and Kriebel [76] | 2025 | Analysis of transformer robustness in financial tasks; shows adversarial sensitivity and proposes gradient-regularised defences. | |
| GNN-based Models | Wang et al. [81] | 2023 | Motif-preserving GNN with curriculum learning for enterprise networks; improves accuracy and convergence stability across public and industrial datasets. |
| Wang et al. [77] | 2024 | RGAT on SME graphs from shared directors and interactions; multi-head RGAT achieves AUC = 0.799 and KS = 0.528, above non-graph baselines. | |
| Song et al. [78] | 2024 | MS-CGNN combining pairwise graphs and hypergraphs; Recall = 0.8863, Acc = 0.9442, F1 = 0.93, outperforming several GNN variants. | |
| Liu et al. [82] | 2024 | GraphSAGE on maximum-spanning-tree enterprise credit graphs; higher ROC than tree and neural baselines despite sparse connectivity. | |
| Yuan et al. [79] | 2025 | DGNN-SR fusing static fund-transfer and dynamic payment graphs with multi-view time encoders; gains 0.85–2.5 pp AUC over continuous-time GNNs. | |
| Mojdehi et al. [80] | 2025 | BM–GNN using topological data analysis and GNNs for supply-chain finance; max Acc = 93.56% with robust performance vs classical ML. | |
| Zhang et al. [83] | 2025 | Large-scale industrial GNN pipeline (23.4 M and 8.6 M nodes) for supply-chain mining and default; AUC = 0.995 (links) and 0.701 (default). | |
| Cheng and Luo [84] | 2025 | Metapath-driven RCGNN using heterogeneous paths (investment, geography, industry); improves multi-class enterprise credit classification vs. homogeneous GNNs. |
| Challenge | Description | Emerging Research Directions |
|---|---|---|
| Evaluation Integrity | Temporal leakage and weak calibration undermine external validity [18,23,58,87]. | Out-of-time and rolling validation; calibration-aware reporting; cost-sensitive and utility-aligned scoring. |
| Imbalance and Reject Inference | Rare defaults and missing counterfactual labels distort learning signals [88,89,90]. | Causal estimation, selective abstention, semi-supervised reject inference, and cost-sensitive objectives. |
| Interpretability and Fairness | Opaque deep models can conflict with explainability and anti-bias compliance requirements [92,97]. | Interpretable-by-design architectures, causal fairness constraints, certified explanation mechanisms. |
| Robustness and Privacy | Drift and privacy constraints limit long-term reliability [95,98]. | Drift-robust adaptive training, federated learning, synthetic financial digital twins, DP-SGD optimisation. |
| Operational Deployment and Governance | Insufficient deployment discipline increases regulatory risk [96,99]. | Automated monitoring frameworks, model-card pipelines, Basel-aligned documentation standards. |
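
To make the evaluation-integrity direction above concrete, the following is a minimal sketch of out-of-time validation with calibration-aware reporting, using the discrimination metrics (AUC, KS) that recur in the surveyed studies alongside the Brier score. It is illustrative only and is not drawn from any of the cited works: the record layout `(period, feature, label)`, the function names, the toy bucket-rate model, and the synthetic data are all assumptions made for this example.

```python
import random
from bisect import bisect_right


def out_of_time_split(records, cutoff):
    """Split (period, feature, label) records by time: fit on earlier periods, test on later ones."""
    train = [r for r in records if r[0] < cutoff]
    test = [r for r in records if r[0] >= cutoff]
    return train, test


def fit_bucket_model(train, n_bins=5):
    """Toy stand-in model (assumption): smoothed default rate per feature bucket, feature in [0, 1)."""
    counts = [[1, 2] for _ in range(n_bins)]  # Laplace smoothing: (defaults + 1) / (loans + 2)
    for _, x, y in train:
        b = min(int(x * n_bins), n_bins - 1)
        counts[b][0] += y
        counts[b][1] += 1
    rates = [d / n for d, n in counts]
    return lambda x: rates[min(int(x * n_bins), n_bins - 1)]


def auc(scores, labels):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(scores, labels):
    """Kolmogorov-Smirnov statistic: largest gap between the two classes' empirical score CDFs."""
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    return max(
        abs(bisect_right(pos, t) / len(pos) - bisect_right(neg, t) / len(neg))
        for t in set(scores)
    )


def brier_score(scores, labels):
    """Mean squared error between predicted PD and the realised outcome (lower is better)."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(labels)


if __name__ == "__main__":
    rng = random.Random(0)
    records = []
    for period in range(1, 13):
        for _ in range(200):
            x = rng.random()
            # Synthetic ground truth: risk rises with x and drifts upward over time.
            p = min(0.05 + 0.4 * x + 0.01 * period, 1.0)
            records.append((period, x, int(rng.random() < p)))

    train, test = out_of_time_split(records, cutoff=10)  # fit on periods 1-9, test on 10-12
    model = fit_bucket_model(train)
    scores = [model(x) for _, x, _ in test]
    labels = [y for _, _, y in test]
    print(f"AUC={auc(scores, labels):.3f}  KS={ks_statistic(scores, labels):.3f}  "
          f"Brier={brier_score(scores, labels):.3f}")
```

Because the synthetic default rate drifts upward after the cutoff, a model fitted on the earlier periods can retain its ranking ability (AUC, KS) while its predicted probabilities become miscalibrated, which is the kind of gap that a random train-test split would tend to hide.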
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Mienye, I.D.; Esenogho, E.; Modisane, C. Deep Learning for Credit Risk Prediction: A Survey of Methods, Applications, and Challenges. Information 2026, 17, 395. https://doi.org/10.3390/info17040395

