Bayesian Logistic Regression for Credit Risk Modelling Among South African Loan Borrowers
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Collection
2.2. The Traditional Logistic Regression (TLR) Model
2.3. Bayesian Logistic Regression (BLR) Model
2.4. Likelihood Function
2.5. Prior Distribution
2.6. Posterior Distribution
2.7. Markov Chain Monte Carlo (MCMC) Algorithm
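- Step 1: Take the initial value for the parameter: $\beta^{(0)}$. The starting values can be obtained via MLE.
- Step 2: Generate a candidate value $\beta^{*}$ from a uniform proposal distribution $q(\beta^{*} \mid \beta^{(t)})$.
- Step 3: Compute the ratio $r = \dfrac{p(\beta^{*} \mid y)\, q(\beta^{(t)} \mid \beta^{*})}{p(\beta^{(t)} \mid y)\, q(\beta^{*} \mid \beta^{(t)})}$, where $p(\cdot \mid y)$ is the posterior density.
- Step 4: Compare $r$ with a random draw $u \sim U(0, 1)$. If $u \le r$, then set $\beta^{(t+1)} = \beta^{*}$. However, if $u > r$, set $\beta^{(t+1)} = \beta^{(t)}$.
- Step 5: Set $t = t + 1$ and repeat steps 2 to 4 until enough draws are obtained.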
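
Steps 1 to 5 above can be sketched in Python as a random-walk Metropolis sampler for a logistic regression posterior. This is an illustrative sketch only, not the authors' implementation. The function names (`log_posterior`, `metropolis_hastings`), the independent normal priors with standard deviation `prior_sd`, the symmetric uniform proposal of half-width `half_width`, and the simulated data are all assumptions introduced here. Because the assumed proposal is symmetric, the proposal densities cancel in the acceptance ratio.

```python
import numpy as np


def log_posterior(beta, X, y, prior_sd=10.0):
    """Unnormalised log posterior: Bernoulli-logit likelihood plus N(0, prior_sd^2) priors (assumed)."""
    eta = X @ beta
    # Log-likelihood sum(y * eta - log(1 + exp(eta))), evaluated in a numerically stable way.
    log_lik = np.sum(y * eta - np.logaddexp(0.0, eta))
    log_prior = -0.5 * np.sum((beta / prior_sd) ** 2)
    return log_lik + log_prior


def metropolis_hastings(X, y, beta_init, n_draws=5000, half_width=0.3, seed=0):
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta_init, dtype=float)  # Step 1: initial value
    current_lp = log_posterior(beta, X, y)
    draws = np.empty((n_draws, beta.size))
    n_accepted = 0
    for t in range(n_draws):
        # Step 2: candidate from a symmetric uniform random-walk proposal (assumed form).
        candidate = beta + rng.uniform(-half_width, half_width, size=beta.size)
        candidate_lp = log_posterior(candidate, X, y)
        # Step 3: log of the acceptance ratio; the symmetric proposal terms cancel.
        log_r = candidate_lp - current_lp
        # Step 4: accept the candidate if log(u) < log(r), otherwise keep the current value.
        if np.log(rng.uniform()) < log_r:
            beta, current_lp = candidate, candidate_lp
            n_accepted += 1
        # Step 5: store the current state and move to the next iteration.
        draws[t] = beta
    return draws, n_accepted / n_draws


# Simulated data for illustration only (not the study data).
data_rng = np.random.default_rng(1)
X = np.column_stack([np.ones(500), data_rng.standard_normal(500)])
true_beta = np.array([-0.5, 1.0])
y = data_rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))

draws, acceptance_rate = metropolis_hastings(X, y, beta_init=np.zeros(2))
posterior_mean = draws[1000:].mean(axis=0)  # discard the first 1000 draws as burn-in
```

Working on the log scale avoids overflow when the posterior densities are very small or very large. In practice the starting value could be the MLE, as Step 1 suggests, rather than the zero vector used here.
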
2.8. Convergence Assessment Using the $\hat{R}$ Diagnostic
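
As a rough illustration of the Gelman–Rubin diagnostic named in this heading, the sketch below computes the classic between-chain versus within-chain variance ratio for a single parameter. The function name `gelman_rubin_rhat` and the simulated chains are assumptions made here. The source does not say which variant of the diagnostic was used, and rank-normalised or split-chain versions are not shown.

```python
import numpy as np


def gelman_rubin_rhat(chains):
    """Classic Gelman-Rubin statistic for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n_chains, n_draws = chains.shape
    # W: mean of the within-chain sample variances.
    within = chains.var(axis=1, ddof=1).mean()
    # B: between-chain variance, scaled by the number of draws per chain.
    between = n_draws * chains.mean(axis=1).var(ddof=1)
    # Pooled estimate of the posterior variance.
    pooled = (n_draws - 1) / n_draws * within + between / n_draws
    return float(np.sqrt(pooled / within))


# Simulated chains for illustration only.
rng = np.random.default_rng(42)
well_mixed = rng.standard_normal((4, 2000))
# Shift two of the four chains so that they explore a different region.
poorly_mixed = well_mixed + np.array([[0.0], [0.0], [3.0], [3.0]])

rhat_well_mixed = gelman_rubin_rhat(well_mixed)
rhat_poorly_mixed = gelman_rubin_rhat(poorly_mixed)
```

Values close to 1 indicate that the chains agree with one another, while values well above 1 suggest that the chains have not converged to a common distribution.
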
2.9. Model Evaluation Metrics and Classification Performance Assessment
2.10. Data Analysis
3. Results
4. Discussion
Study Strengths and Limitations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AP | Average Precision |
| BLR | Bayesian Logistic Regression |
| CI | Confidence Interval/Credible Interval |
| IQR | Interquartile Range |
| MCMC | Markov Chain Monte Carlo |
| MH | Metropolis–Hastings |
| MLE | Maximum Likelihood Estimation |
| NCR | National Credit Regulator |
| PR | Precision–Recall |
| $\hat{R}$ (Rhat) | Gelman–Rubin Convergence Diagnostic |
| SE | Standard Error |
| TLR | Traditional Logistic Regression |



| Variable | Category | Total, n (%) | Default, % (95% CI) | Not Default, % (95% CI) |
|---|---|---|---|---|
| Credit Score origin, median (IQR) | | 601 (560–640) | 594 (555–632) | 633 (590–668) |
| Bank governance, median (IQR) | | 3.19 (3.11–3.22) | 3.19 (3.11–3.22) | 3.19 (3.11–3.22) |
| Original Interest rate, median (IQR) | | 6.25 (4.75–6.75) | 6.25 (4.75–6.75) | 6.25 (4.75–6.75) |
| Original Unemployment rate, median (IQR) | | 30.8 (29.1–33.9) | 30.8 (29.1–33.9) | 30.8 (29.1–33.9) |
| Original Inflation rate, median (IQR) | | 4.4 (4.1–4.5) | 4.4 (4.1–4.5) | 4.4 (4.1–4.5) |
| Interest rate at event, median (IQR) | | 6.25 (4.75–6.75) | 6.25 (4.75–6.75) | 6.25 (4.75–6.75) |
| Unemployment rate at event, median (IQR) | | 30.8 (29.1–33.9) | 30.8 (29.1–33.9) | 30.8 (29.1–33.9) |
| Inflation rate at event, median (IQR) | | 4.4 (4.1–4.5) | 4.4 (4.1–4.5) | 4.4 (4.1–4.5) |
| Bank | Bank A | 2217 (44.3) | 81.0 (79.4–82.6) | 19.0 (17.4–20.6) |
| | Bank B | 1807 (36.1) | 80.7 (78.9–82.5) | 19.3 (17.5–21.1) |
| | Bank C | 976 (19.5) | 81.9 (79.4–84.3) | 18.1 (15.7–20.6) |
| Gender | Female | 2487 (49.7) | 80.5 (79.0–82.1) | 19.5 (17.9–21.0) |
| | Male | 2424 (48.5) | 81.7 (80.1–83.2) | 18.3 (16.8–19.9) |
| | Other | 89 (1.8) | 78.7 (70.1–87.2) | 21.3 (12.8–29.9) |
| Age group (years) | 18–25 | 588 (11.8) | 80.3 (77.1–83.5) | 19.7 (16.5–22.9) |
| | 26–35 | 1605 (32.1) | 82.7 (80.9–84.6) | 17.3 (15.4–19.1) |
| | 36–45 | 1391 (27.8) | 80.3 (78.2–82.4) | 19.7 (17.6–21.8) |
| | 46–60 | 1033 (20.7) | 78.8 (76.3–81.3) | 21.2 (18.7–23.7) |
| | 60+ | 383 (7.7) | 84.1 (80.4–87.7) | 15.9 (12.3–19.6) |
| Product type | Credit Card | 1262 (25.2) | 82.4 (80.3–84.5) | 17.6 (15.5–19.7) |
| | Mortgage | 525 (10.5) | 81.7 (78.4–85.0) | 18.3 (15.0–21.6) |
| | Personal Loan | 1718 (34.4) | 79.7 (77.8–81.6) | 20.3 (18.4–22.2) |
| | Store Credit | 493 (9.9) | 82.4 (79.0–85.7) | 17.6 (14.3–21.0) |
| | Vehicle Finance | 1002 (20.0) | 80.7 (78.3–83.2) | 19.3 (16.8–21.7) |
| Income band (rands) | <5000 | 1030 (20.6) | 78.1 (75.5–80.6) | 21.9 (19.4–24.5) |
| | 5000–<10,000 | 1713 (34.3) | 80.6 (78.7–82.4) | 19.4 (17.6–21.3) |
| | 10,000–<20,000 | 1261 (25.2) | 80.6 (78.4–82.8) | 19.4 (17.2–21.6) |
| | 20,000–<50,000 | 728 (14.6) | 85.2 (82.6–87.7) | 14.8 (12.3–17.4) |
| | ≥50,000 | 268 (5.4) | 86.9 (82.9–91.0) | 13.1 (9.0–17.1) |
| Race | African | 3127 (62.5) | 81.0 (79.7–82.4) | 19.0 (17.6–20.3) |
| | Coloured | 870 (17.4) | 80.1 (77.5–82.8) | 19.9 (17.2–22.5) |
| | Indian | 424 (8.5) | 82.1 (78.4–85.7) | 17.9 (14.3–21.6) |
| | White | 579 (11.6) | 81.9 (78.7–85.0) | 18.1 (15.0–21.3) |
| Education level | High School | 497 (9.9) | 82.3 (78.9–85.6) | 17.7 (14.4–21.1) |
| | Some Colleges | 827 (16.5) | 80.3 (77.6–83.0) | 19.7 (17.0–22.4) |
| | Diploma | 1698 (34.0) | 80.8 (78.9–82.7) | 19.2 (17.3–21.1) |
| | Bachelor | 1529 (30.6) | 80.7 (78.7–82.7) | 19.3 (17.3–21.3) |
| | Postgraduate | 449 (9.0) | 83.3 (79.8–86.7) | 16.7 (13.3–20.2) |
| Marital status | Single | 1369 (27.4) | 82.3 (80.3–84.3) | 17.7 (15.7–19.7) |
| | Married | 2019 (40.4) | 80.9 (79.2–82.6) | 19.1 (17.4–20.8) |
| | Divorced | 1111 (22.2) | 80.3 (77.9–82.6) | 19.7 (17.4–22.1) |
| | Widowed | 501 (10.0) | 79.8 (76.3–83.4) | 20.2 (16.6–23.7) |
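
The Bank A interval in the table above can be reproduced with a normal-approximation (Wald) interval for a proportion, although the source does not state which interval method was used. The sketch below is a check under that assumption. The helper name `wald_proportion_ci` is hypothetical, and the row total is assumed to be the denominator of the row percentage.

```python
import math


def wald_proportion_ci(percent, n, z=1.96):
    """Approximate 95% Wald interval (in percent) for a proportion observed in n borrowers."""
    p = percent / 100.0
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return round(100.0 * (p - half_width), 1), round(100.0 * (p + half_width), 1)


# Bank A row: 2217 borrowers, 81.0% in default.
bank_a_default_ci = wald_proportion_ci(81.0, 2217)
```

Under these assumptions the Bank A default interval comes out as 79.4 to 82.6, matching the table. Other rows may differ in the last digit because the published percentages are themselves rounded.
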

| Variable | Category | TLR: SE | TLR: OR (95% CI) | BLR: SE | BLR: OR (95% CI) | BLR: $\hat{R}$ (Rhat) |
|---|---|---|---|---|---|---|
| Intercept | | 5.17 | 79.18 (0–1,944,304.75) | 2.77 | 1.70 (0.01–380.79) | 1 |
| Term of loan (months) | | $4.62 \times 10^{-3}$ | 1.10 (1.09–1.11) * | 0.04 | 2.32 (2.15–2.52) * | 1 |
| Loan amount (rands) | | $6.44 \times 10^{-7}$ | 1.00 (1.00–1.00) | 0.04 | 0.95 (0.88–1.02) | 1 |
| Credit Score at origin | | $7.29 \times 10^{-4}$ | 0.99 (0.99–0.99) * | 0.04 | 0.47 (0.44–0.51) * | 1 |
| Bank governance at origin | | 1.51 | 3.60 (0.19–70.38) | 0.86 | 1.48 (0.28–7.94) | 1 |
| Interest rate | | 0.10 | 0.89 (0.73–1.08) | 0.12 | 0.87 (0.68–1.11) | 1 |
| Unemployment rate | | 0.05 | 0.95 (0.87–1.04) | 0.13 | 0.88 (0.69–1.13) | 1 |
| Inflation rate | | 0.04 | 1.13 (1.05–1.22) * | 0.05 | 1.16 (1.06–1.27) * | 1 |
| Bank | A (Ref.) | | | | | |
| | B | 0.17 | 1.14 (0.82–1.59) | 0.12 | 1.05 (0.82–1.33) | 1 |
| | C | 0.13 | 1.00 (0.78–1.28) | 0.12 | 1.04 (0.83–1.30) | 1 |
| Gender | Female (Ref.) | | | | | |
| | Male | 0.08 | 1.10 (0.94–1.29) | 0.08 | 1.11 (0.94–1.29) | 1 |
| | Other | 0.30 | 0.70 (0.40–1.29) | 0.29 | 0.73 (0.43–1.32) | 1 |
| Age group (years) | 18–25 (Ref.) | | | | | |
| | 26–35 | 0.19 | 1.26 (0.87–1.83) | 0.20 | 1.30 (0.87–1.91) | 1 |
| | 36–45 | 0.26 | 0.94 (0.57–1.54) | 0.27 | 1.01 (0.59–1.66) | 1 |
| | 46–59 | 0.26 | 0.95 (0.58–1.57) | 0.27 | 1.03 (0.60–1.70) | 1 |
| | 60 and above | 0.26 | 1.57 (0.94–2.63) | 0.26 | 1.56 (0.93–2.61) | 1 |
| Product type | Credit Card (Ref.) | | | | | |
| | Mortgage | 0.15 | 0.89 (0.67–1.19) | 0.14 | 0.89 (0.68–1.19) | 1 |
| | Personal loan | 0.10 | 0.78 (0.64–0.96) * | 0.10 | 0.79 (0.64–0.96) * | 1 |
| | Store credit | 0.15 | 1.12 (0.83–1.52) | 0.15 | 1.13 (0.83–1.52) | 1 |
| | Vehicle finance | 0.12 | 0.87 (0.69–1.11) | 0.12 | 0.89 (0.71–1.13) | 1 |
| Income band (rands) | <5000 (Ref.) | | | | | |
| | 5000–<10,000 | 0.11 | 1.17 (0.95–1.44) | 0.21 | 2.03 (1.35–3.11) * | 1 |
| | 10,000–<20,000 | 0.11 | 1.19 (0.95–1.49) | 0.13 | 1.22 (0.94–1.56) | 1 |
| | 20,000–<50,000 | 0.14 | 1.72 (1.31–2.27) * | 0.15 | 1.75 (1.29–2.34) * | 1 |
| | >50,000 | 0.21 | 2.09 (1.39–3.21) * | 0.11 | 1.16 (0.93–1.43) | 1 |
| Race | African (Ref.) | | | | | |
| | Coloured | 0.11 | 0.92 (0.74–1.13) | 0.13 | 0.86 (0.67–1.11) | 1 |
| | Indian | 0.15 | 1.10 (0.83–1.48) | 0.18 | 1.14 (0.81–1.61) | 1 |
| | White | 0.13 | 1.00 (0.78–1.30) | 0.15 | 0.98 (0.72–1.32) | 1 |
| Education level | High school (Ref.) | | | | | |
| | Some college | 0.19 | 0.84 (0.58–1.22) | 0.10 | 0.92 (0.75–1.12) | 1 |
| | Diploma | 0.20 | 1.03 (0.69–1.54) | 0.21 | 0.88 (0.59–1.34) | 1 |
| | Bachelor | 0.22 | 1.13 (0.74–1.73) | 0.17 | 1.15 (0.82–1.63) | 1 |
| | Postgraduate | 0.26 | 1.30 (0.78–2.18) | 0.15 | 0.74 (0.56–1.00) | 1 |
| Marital status | Single (Ref.) | | | | | |
| | Married | 0.15 | 0.85 (0.64–1.13) | 0.12 | 0.93 (0.74–1.17) | 1 |
| | Divorced | 0.19 | 0.92 (0.64–1.33) | 0.18 | 1.07 (0.75–1.51) | 1 |
| | Widowed | 0.22 | 0.74 (0.48–1.13) | 0.17 | 0.81 (0.58–1.14) | 1 |
| AP | | 0.9368 | | 0.9381 | | |
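
For the traditional model, the reported odds ratio intervals look consistent with Wald intervals built on the log-odds scale, that is, exp(log OR ± 1.96 × SE), although the source does not confirm this. The sketch below checks the inflation rate row under that assumption. The helper name `wald_or_ci` is hypothetical, and the SE column is assumed to be the standard error of the log-odds coefficient. Bayesian credible intervals are normally taken from posterior quantiles, so this check is not meant for the BLR columns.

```python
import math


def wald_or_ci(odds_ratio, se_log_or, z=1.96):
    """Approximate 95% Wald interval for an odds ratio from the SE of its log."""
    log_or = math.log(odds_ratio)
    return math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or)


# Inflation rate, traditional logistic regression: OR 1.13 with SE 0.04.
lower, upper = wald_or_ci(1.13, 0.04)
```

Rounded to two decimals the result is about 1.04 to 1.22, close to the tabulated 1.05 to 1.22. Small differences in the last digit are expected because the published OR and SE are rounded.
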

| Variable | Category | Percentage (%) |
|---|---|---|
| Product type | Credit card | 95.7 |
| | Mortgage | 92.6 |
| | Personal loan | 93.0 |
| | Store credit | 96.0 |
| | Vehicle finance | 95.7 |
| Income band (rands) | <5000 | 93.8 |
| | 5000–<10,000 | 93.8 |
| | 10,000–<20,000 | 94.4 |
| | 20,000–<50,000 | 95.9 |
| | 50,000 and above | 97.6 |
| Bank | A | 94.1 |
| | B | 95.4 |
| | C | 93.7 |
| Overall | | 94.5 |

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Masekoameng, J.L.; Mbona, S.V.; Ananth, A.; Chifurira, R. Bayesian Logistic Regression for Credit Risk Modelling Among South African Loan Borrowers. J. Risk Financial Manag. 2026, 19, 358. https://doi.org/10.3390/jrfm19050358

