# Deep Neural Networks for Behavioral Credit Rating


## Abstract


## 1. Introduction

- Probability of Default (PD): the average percentage of defaulted obligors in a rating grade,
- Loss Given Default (LGD): share of the exposure the bank might lose in case of a default, and
- Exposure At Default (EAD): estimated outstanding amount in case of a default, i.e., total value to which the bank is exposed.
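Together, these three parameters determine the expected loss of an exposure, EL = PD × LGD × EAD. A minimal sketch of this relationship (the function name and the example values are illustrative, not taken from the paper):

```python
# Illustration of the Basel IRB expected-loss identity EL = PD * LGD * EAD.
# The function name and example figures are made up for the sketch.
def expected_loss(pd: float, lgd: float, ead: float) -> float:
    """Expected loss as the product of the three risk parameters."""
    return pd * lgd * ead

# e.g., a 2% PD, 45% LGD, and an EAD of 10,000 currency units:
el = expected_loss(pd=0.02, lgd=0.45, ead=10_000.0)  # 90.0 currency units
```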

## 2. The Data

- Tenure features, which contain data on the length and volume of the business relationship of the client and the bank,
- Data on the balance of current and business accounts, the balance of deposit, and regular income,
- Features that measure the average monthly obligations of the client, as well as the average monthly burden (debt burden ratio),
- The client’s utilization of an overdraft,
- Features that describe the credit history of the client (days past due and debt),
- The balance of the current account.

## 3. Models and Methods

#### 3.1. Logistic Regression

#### 3.2. Support Vector Machine

#### 3.3. Random Forest

#### 3.4. Gradient Boosting

Algorithm 1: Gradient boosting [34].
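Algorithm 1 is not reproduced here; as a sketch, its core idea (each new tree is fit to the pseudo-residuals, i.e., the negative gradient of the loss of the current ensemble) can be exercised with scikit-learn's `GradientBoostingClassifier`. The synthetic data and hyperparameters below are illustrative, not the paper's configuration:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                   # synthetic behavioral features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # synthetic default flag

# Each of the n_estimators trees is fit to the pseudo-residuals of the
# log-loss of the ensemble built so far, then added with a shrinkage factor
# (learning_rate), per the gradient-boosting scheme of Friedman [34].
model = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0
)
model.fit(X, y)
pd_hat = model.predict_proba(X)[:, 1]           # in-sample PD estimates
```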

#### 3.5. Feedforward Neural Network

## 4. Performance Measures

## 5. Results

#### 5.1. 2009–2013 Dataset

#### 5.2. 2014–2018 Dataset

#### 5.3. Long-Term Performance

#### 5.4. Impact of Reprogrammed Facilities

#### 5.5. Distribution of PD Estimates

#### 5.6. Results Summary

## 6. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Witzany, J. Credit Risk Management and Modeling; Oeconomica Prague: Prague, Czech Republic, 2010.
2. Basel Committee on Banking Supervision. An Explanatory Note on the Basel II IRB Risk Weight Functions; Bank for International Settlements: Basel, Switzerland, 2005.
3. Basel Committee on Banking Supervision. The Internal Ratings-Based Approach: Supporting Document to the New Basel Capital Accord; Bank for International Settlements: Basel, Switzerland, 2001.
4. Loterman, G.; Brown, I.; Martens, D.; Mues, C.; Baesens, B. Benchmarking regression algorithms for loss given default modeling. Int. J. Forecast. 2012, 28, 161–170.
5. Calabrese, R. Downturn loss given default: Mixture distribution estimation. Eur. J. Oper. Res. 2014, 237, 271–277.
6. Leow, M.; Crook, J. A new mixture model for the estimation of credit card Exposure at Default. Eur. J. Oper. Res. 2016, 249, 487–497.
7. Yao, X.; Crook, J.; Andreeva, G. Enhancing two-stage modeling methodology for loss given default with support vector machines. Eur. J. Oper. Res. 2017, 263, 679–689.
8. Dua, D.; Graff, C. UCI Machine Learning Repository. 2017. Available online: http://archive.ics.uci.edu/ml (accessed on 20 September 2020).
9. Kaggle. Give Me Some Credit. 2011. Available online: https://www.kaggle.com/c/GiveMeSomeCredit (accessed on 10 October 2020).
10. Louzada, F.; Ara, A.; Fernandes, G.B. Classification methods applied to credit scoring: Systematic review and overall comparison. Surv. Oper. Res. Manag. Sci. 2016, 21, 117–134.
11. Yu, L.; Yao, X.; Wang, S.; Lai, K.K. Credit risk evaluation using a weighted least squares SVM classifier with design of experiment for parameter selection. Expert Syst. Appl. 2011, 38, 15392–15399.
12. Yao, J.R.; Chen, J.R. A New Hybrid Support Vector Machine Ensemble Classification Model for Credit Scoring. J. Inf. Technol. Res. 2019, 12, 77–88.
13. Chaudhuri, A.; De, K. Fuzzy support vector machine for bankruptcy prediction. Appl. Soft Comput. 2011, 11, 2472–2486.
14. Khandani, A.E.; Kim, A.J.; Lo, A.W. Consumer credit-risk models via machine-learning algorithms. J. Bank. Financ. 2010, 34, 2767–2787.
15. Addo, P.M.; Guegan, D.; Hassani, B. Credit Risk Analysis Using Machine and Deep Learning Models. Risks 2018, 6, 38.
16. Chang, Y.C.; Chang, K.H.; Wu, G.J. Application of eXtreme gradient boosting trees in the construction of credit risk assessment models for financial institutions. Appl. Soft Comput. 2018, 73, 914–920.
17. Singh, B.E.R.; Sivasankar, E. Enhancing Prediction Accuracy of Default of Credit Using Ensemble Techniques. In First International Conference on Artificial Intelligence and Cognitive Computing; Springer: Berlin/Heidelberg, Germany, 2019; pp. 427–436.
18. Lessmann, S.; Baesens, B.; Seow, H.V.; Thomas, L.C. Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research. Eur. J. Oper. Res. 2015, 247, 124–136.
19. Li, Y.; Lin, X.; Wang, X.; Shen, F.; Gong, Z. Credit Risk Assessment Algorithm Using Deep Neural Networks with Clustering and Merging. In Proceedings of the 2017 13th International Conference on Computational Intelligence and Security (CIS), Hong Kong, China, 15–18 December 2017; pp. 173–176.
20. Sun, T.; Vasarhelyi, M.A. Predicting credit card delinquencies: An application of deep neural networks. Intell. Syst. Account. Financ. Manag. 2018, 25, 174–189.
21. Munkhdalai, L.; Munkhdalai, T.; Namsrai, O.E.; Lee, J.Y.; Ryu, K.H. An empirical comparison of machine-learning methods on bank client credit assessments. Sustainability 2019, 11, 699.
22. Luo, C.; Wu, D.; Wu, D. A deep learning approach for credit scoring using credit default swaps. Eng. Appl. Artif. Intell. 2017, 65, 465–470.
23. Ciampi, F.; Gordini, N. Small Enterprise Default Prediction Modeling through Artificial Neural Networks: An Empirical Analysis of Italian Small Enterprises. J. Small Bus. Manag. 2013, 51, 23–45.
24. du Jardin, P. Forecasting corporate failure using ensemble of self-organizing neural networks. Eur. J. Oper. Res. 2021, 288, 869–885.
25. Sirignano, J.; Sadhwani, A.; Giesecke, K. Deep learning for mortgage risk. arXiv 2016, arXiv:1607.02470.
26. Wang, C.; Han, D.; Liu, Q.; Luo, S. A Deep Learning Approach for Credit Scoring of Peer-to-Peer Lending Using Attention Mechanism LSTM. IEEE Access 2019, 7, 2161–2168.
27. Kvamme, H.; Sellereite, N.; Aas, K.; Sjursen, S. Predicting mortgage default using convolutional neural networks. Expert Syst. Appl. 2018, 102, 207–217.
28. Basel Committee on Banking Supervision. Basel III: Finalising Post-Crisis Reforms; Bank for International Settlements: Basel, Switzerland, 2017.
29. Nasrabadi, N.M. Pattern recognition and machine learning. J. Electron. Imaging 2007, 16, 049901.
30. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009.
31. Breiman, L.; Friedman, J.; Stone, C.J.; Olshen, R.A. Classification and Regression Trees; CRC Press: Boca Raton, FL, USA, 1984.
32. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140.
33. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
34. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
35. Rosenblatt, F. Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms; Spartan Books: Washington, DC, USA, 1962.
36. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
37. Hanley, J.A.; McNeil, B.J. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982, 143, 29–36.
38. Hand, D.J. Measuring classifier performance: A coherent alternative to the area under the ROC curve. Mach. Learn. 2009, 77, 103–123.
39. Brier, G.W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 1950, 78, 1–3.

**Figure 1.** Illustration of variance in experienced losses (left) and distribution of losses (right) [2].

**Figure 2.** End-of-year dataset statistics: (**a**) the number of examples and the default rate (with reprogrammed facilities labeled as both defaults and non-defaults); (**b**) the number of defaulted and reprogrammed loans for each snapshot date.

**Figure 3.** An example of a deep feedforward network; the input layer consists of two neurons, followed by two hidden layers with five and four neurons, respectively, and a single-neuron output layer (note: neurons’ bias units are omitted).
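The layer sizes from Figure 3 can be sketched as a plain NumPy forward pass. The ReLU hidden activations, sigmoid output, and random weights below are illustrative assumptions for the sketch, not the network configuration used in the paper:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)
# Layer sizes from Figure 3: 2 inputs -> 5 -> 4 -> 1 output.
sizes = [2, 5, 4, 1]
weights = [rng.normal(scale=0.5, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Forward pass through the 2-5-4-1 network; the activation choices
    (ReLU hidden layers, sigmoid output) are assumptions for illustration."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ W + b)
    return sigmoid(a @ weights[-1] + biases[-1])  # PD-like score in (0, 1)

score = forward(np.array([0.3, -1.2]))
```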

**Figure 5.** Deep neural network model: median Probability of Default (PD) of defaulted and reprogrammed loans, based on the number of months from the snapshot date to the opening of the default status or reprogram. Zero months represents a period between one and 30 days; one month is 31 to 60 days, etc. The width of the envelopes around the curves represents the Interquartile Range (IQR) for each data point.

**Figure 6.** Distribution of out-of-time PD estimates for XGBoost and the deep model on 2013-12-31 data; both models were trained on 2009–2012 data with reprograms labeled as defaults. The histogram for non-defaulted examples is shown in the left column, while defaulted examples are in the right one. The top row contains XGBoost PDs, while the deep model estimates are in the bottom row. The width of each histogram column is two percentage points.

**Figure 7.** Distribution of out-of-time PD estimates for XGBoost and the deep model on 2013-12-31 data; both models were trained on 2009–2012 data with reprograms labeled as non-defaults. The histogram for non-defaulted examples is shown in the left column, while defaulted examples are in the right one. The top row contains XGBoost PDs, while the deep model estimates are in the bottom row. The width of each histogram column is two percentage points.

**Table 1.** Performance of the models trained on 2009–2012 data; reprogrammed loans are labeled as defaults.

| Model | Validation Mean ROC AUC | Out-of-Time ROC AUC (2013-12-31) | Out-of-Time H-Measure | Out-of-Time Brier Score |
|---|---|---|---|---|
| Logistic regression | 0.896668 | 0.866566 | 0.414292 | 0.124405 |
| Linear SVM | 0.896090 | 0.865722 | 0.413397 | - |
| Random forest | 0.939872 | 0.878587 | 0.441497 | 0.037041 |
| XGBoost | 0.940979 | 0.886009 | 0.456540 | 0.039166 |
| Deep feedforward | 0.914695 | 0.886477 | 0.456309 | 0.116189 |
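The reported metrics can be computed with standard tooling; a minimal sketch using scikit-learn for ROC AUC [37] and the Brier score [39] (the H-measure [38] needs a separate package and is omitted; the labels and PD estimates below are illustrative, not the paper's data):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss

# Illustrative default labels and PD estimates (made up for the example).
y_true = np.array([0, 0, 0, 1, 1, 0, 1, 0])
pd_hat = np.array([0.05, 0.10, 0.20, 0.80, 0.65, 0.50, 0.40, 0.30])

auc = roc_auc_score(y_true, pd_hat)       # ranking quality of the PDs
brier = brier_score_loss(y_true, pd_hat)  # calibration: mean squared PD error
```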

**Table 2.** Performance of the models trained on 2009–2012 data; reprogrammed loans are labeled as non-defaults.

| Model | Validation Mean ROC AUC | Out-of-Time ROC AUC (2013-12-31) | Out-of-Time H-Measure | Out-of-Time Brier Score |
|---|---|---|---|---|
| Logistic regression | 0.916323 | 0.906946 | 0.543300 | 0.091685 |
| Linear SVM | 0.916179 | 0.908359 | 0.548104 | - |
| Random forest | 0.944355 | 0.917116 | 0.564419 | 0.018484 |
| XGBoost | 0.948748 | 0.921723 | 0.573775 | 0.019638 |
| Deep feedforward | 0.928784 | 0.920317 | 0.578402 | 0.108295 |

**Table 3.** Performance of the models trained on 2014–2017 data; reprogrammed loans are labeled as defaults.

| Model | Validation Mean ROC AUC | Out-of-Time ROC AUC (2018-12-31) | Out-of-Time H-Measure | Out-of-Time Brier Score |
|---|---|---|---|---|
| Logistic regression | 0.896018 | 0.909755 | 0.547925 | 0.073841 |
| Linear SVM | 0.895249 | 0.910149 | 0.553930 | - |
| Random forest | 0.951180 | 0.925821 | 0.604190 | 0.013256 |
| XGBoost | 0.953976 | 0.933554 | 0.618382 | 0.013580 |
| Deep feedforward | 0.917070 | 0.929786 | 0.612123 | 0.054511 |

**Table 4.** Performance of the models trained on 2014–2017 data; reprogrammed loans are labeled as non-defaults.

| Model | Validation Mean ROC AUC | Out-of-Time ROC AUC (2018-12-31) | Out-of-Time H-Measure | Out-of-Time Brier Score |
|---|---|---|---|---|
| Logistic regression | 0.922209 | 0.915283 | 0.570536 | 0.079378 |
| Linear SVM | 0.921126 | 0.914567 | 0.569720 | - |
| Random forest | 0.958128 | 0.925179 | 0.602819 | 0.013188 |
| XGBoost | 0.961277 | 0.933961 | 0.618473 | 0.013683 |
| Deep feedforward | 0.939366 | 0.933304 | 0.615086 | 0.084993 |

**Table 5.** Long-term ROC AUC score of the models trained on 2009–2012 data with reprogrammed loans labeled as defaults.

| Model | 2014-12-31 | 2015-12-31 | 2016-12-31 | 2017-12-31 | 2018-12-31 |
|---|---|---|---|---|---|
| Logistic regression | 0.873557 | 0.874864 | 0.869676 | 0.906084 | 0.905157 |
| Linear SVM | 0.874060 | 0.875146 | 0.871298 | 0.906778 | 0.906335 |
| Random forest | 0.886108 | 0.889226 | 0.878379 | 0.919961 | 0.917771 |
| XGBoost | 0.892793 | 0.896854 | 0.888610 | 0.926328 | 0.925317 |
| Deep feedforward | 0.893153 | 0.895266 | 0.885186 | 0.925506 | 0.921958 |

**Table 6.** Long-term ROC AUC score of the models trained on 2009–2012 data with reprogrammed loans labeled as non-defaults.

| Model | 2014-12-31 | 2015-12-31 | 2016-12-31 | 2017-12-31 | 2018-12-31 |
|---|---|---|---|---|---|
| Logistic regression | 0.914545 | 0.912980 | 0.907418 | 0.914217 | 0.906431 |
| Linear SVM | 0.915236 | 0.915165 | 0.908992 | 0.917692 | 0.909679 |
| Random forest | 0.925358 | 0.927714 | 0.912753 | 0.927178 | 0.922232 |
| XGBoost | 0.931065 | 0.932262 | 0.921962 | 0.932984 | 0.931460 |
| Deep feedforward | 0.928203 | 0.927673 | 0.920908 | 0.931293 | 0.924273 |

**Table 7.** Out-of-time (2013-12-31) Brier scores for all examples and individual classes; models were trained on 2009–2012 data, with reprograms labeled as defaults.

| Model | All Examples | Non-Defaults | Defaults |
|---|---|---|---|
| XGBoost | 0.039166 | 0.006279 | 0.675704 |
| Deep feedforward | 0.116189 | 0.113707 | 0.164224 |

**Table 8.** Out-of-time (2013-12-31) Brier scores for all examples and individual classes; models were trained on 2009–2012 data, with reprograms labeled as non-defaults.

| Model | All Examples | Non-Defaults | Defaults |
|---|---|---|---|
| XGBoost | 0.019638 | 0.004879 | 0.600952 |
| Deep feedforward | 0.108295 | 0.107429 | 0.142368 |
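The class-conditional Brier scores in Tables 7 and 8 restrict the mean squared PD error to one class at a time; a minimal sketch (the labels and PD estimates are illustrative, not the paper's data):

```python
import numpy as np

def brier_by_class(y_true, pd_hat):
    """Overall and class-conditional Brier scores (mean squared PD error)."""
    y_true = np.asarray(y_true, dtype=float)
    pd_hat = np.asarray(pd_hat, dtype=float)
    sq = (pd_hat - y_true) ** 2
    return {
        "all": sq.mean(),                       # Brier score over all examples
        "non_defaults": sq[y_true == 0].mean(), # restricted to non-defaults
        "defaults": sq[y_true == 1].mean(),     # restricted to defaults
    }

scores = brier_by_class([0, 0, 1, 1], [0.1, 0.2, 0.7, 0.4])
```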

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Merćep, A.; Mrčela, L.; Birov, M.; Kostanjčar, Z.
Deep Neural Networks for Behavioral Credit Rating. *Entropy* **2021**, *23*, 27.
https://doi.org/10.3390/e23010027
