Modeling the Probability of Default Term Structure Using Different Methodologies Under IFRS 9
Abstract
1. Introduction and Literature Review
- (RQ1) Which evaluation metric can be used to systematically identify the best-performing modeling technique?
- (RQ2) Of the three methods implemented to model the PD, which tends to outperform the others?
- (RQ3) Of the three methods implemented to model the PD, which is the most time-efficient to use?
2. Methodology
2.1. Kaplan–Meier
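The Kaplan–Meier estimator is nonparametric: at each observed default time it multiplies the running survival probability by (1 − d/n), where d is the number of defaults at that time and n is the number of loans still at risk just before it. The following is a minimal sketch in plain Python, assuming one row per loan with a duration and an event flag; the function name and the toy data are illustrative and are not taken from the study.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one default.

    times  : observed durations (for example, loan age in months)
    events : 1 if default was observed at that time, 0 if right-censored
    """
    defaults = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)      # every loan leaving the risk set at time t
    at_risk = len(times)
    surv, curve = 1.0, []
    for t in sorted(exits):
        d = defaults.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk   # product-limit update
            curve.append((t, surv))
        at_risk -= exits[t]             # defaults and censored loans both exit
    return curve

# Toy data (hypothetical): five loans, two of them censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored loans reduce the risk set without triggering a drop in the curve, which is why the estimate changes only at default times.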
2.2. Cox Proportional Hazard (PH)
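Under the Cox PH model the hazard is h(t | x) = h0(t) exp(β'x), so exp(β_j) is the hazard ratio for a one-unit increase in covariate j with the others held fixed. As a quick worked check, the sketch below recomputes exp(coef) for three rows of the Cox PH coefficient table reported in this paper; the helper name `hazard_ratio` and the 50-point credit score example are illustrative assumptions.

```python
import math

# (coefficient, reported exp(coef)) copied from the paper's Cox PH results table.
reported = {
    "Loan Purpose": (-0.21220, 0.808800),
    "Number of Borrowers": (-0.35370, 0.702100),
    "Channel": (0.14910, 1.161000),
}

def hazard_ratio(coef, delta=1.0):
    """Multiplicative change in hazard for a `delta`-unit increase in a covariate."""
    return math.exp(coef * delta)

for name, (coef, exp_coef) in reported.items():
    hr = hazard_ratio(coef)
    print(f"{name}: HR={hr:.4f} (reported {exp_coef:.4f}), "
          f"hazard change {100 * (hr - 1):+.1f}% per unit")

# Small per-unit effects add up on the log scale: a 50-point rise in
# Credit Score (coef -0.00685 in the same table) scales the hazard by exp(-0.3425).
hr_50 = hazard_ratio(-0.00685, delta=50)
print(f"Credit Score +50 points: HR={hr_50:.3f}")
```

A ratio below 1 lowers the default hazard and a ratio above 1 raises it; the baseline hazard h0(t) cancels out of the ratio, which is what "proportional" refers to.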
2.3. Extended Cox Proportional Hazard
2.4. Machine Learning—Random Boosting Forest (RBF)
2.5. Software
3. Analysis and Results
3.1. Data Description
3.2. Kaplan–Meier
3.3. Cox Proportional Hazard
3.4. Extended Cox Proportional Hazard
3.5. Random Boosting Forest
| Algorithm 1. Steps implemented to conduct the RBF model |
|---|
| Input: dataset with variables: |
| CREDIT_SCORE, ORIGINAL_LOAN_TERM, NUMBER_OF_BORROWERS, Mortgage_Insurance%, Current_Actual_UPB%, LOAN_AGE, Default_Flag |
| Step 1: Build feature matrix X |
| - Select predictors: CREDIT_SCORE, ORIGINAL_LOAN_TERM, NUMBER_OF_BORROWERS, Mortgage_Insurance%, Current_Actual_UPB% |
| - Encode predictors into numeric design matrix |
| - Remove intercept column |
| Step 2: Define target variables |
| - Loan_Age ← Dataset.LOAN_AGE // survival time |
| - default_Flag ← Dataset.Default_Flag // event indicator (defined but not used here) |
| Step 3: Create training dataset object |
| - Training Dataset ← Matrix(data = X, label = Loan_Age) |
| Step 4: Set model parameters |
| - objective = "survival:cox" // Cox proportional hazards model |
| - eval_metric = "cox-nloglik" // Negative log-likelihood for Cox |
| - max_depth = 2 // Maximum depth of trees |
| - eta = 0.1 // Learning rate |
| - subsample = 0.8 // Fraction of rows sampled per tree |
| Step 5: Train XGBoost model |
| For round = 1 to 100: |
| Build a new decision tree using current parameters |
| Add tree to the ensemble |
| Update predictions from ensemble |
| Compute evaluation metric (cox-nloglik if survival:cox) |
| Adjust gradients and weights for next round |
| End loop |
| Output: trained survival model (ensemble of 100 boosted trees) |
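Algorithm 1 notes that the event indicator is defined but not passed to the learner. For reference, XGBoost's `survival:cox` objective reads censoring from the sign of the label: a positive value is an observed event time and a negative value is a right-censored time. The sketch below shows how the default flag could be folded into the label before Steps 3 to 5. It is not the study's implementation; `cox_labels` and `train_cox_booster` are hypothetical helper names, and `xgboost` is imported lazily so the label logic can be run without it.

```python
def cox_labels(loan_age, default_flag):
    """Signed survival times for XGBoost's survival:cox objective.

    +t : default observed at loan age t
    -t : loan right-censored at loan age t
    """
    labels = []
    for t, d in zip(loan_age, default_flag):
        if t <= 0:
            # A zero duration cannot carry a sign, so it must be shifted or dropped upstream.
            raise ValueError("durations must be strictly positive")
        labels.append(float(t) if d == 1 else -float(t))
    return labels

# Parameters as listed in Step 4 of Algorithm 1.
PARAMS = {
    "objective": "survival:cox",
    "eval_metric": "cox-nloglik",
    "max_depth": 2,
    "eta": 0.1,
    "subsample": 0.8,
}
NUM_ROUNDS = 100

def train_cox_booster(X, loan_age, default_flag):
    """Train the boosted Cox model; X is assumed to be a NumPy array or pandas DataFrame."""
    import xgboost as xgb  # third-party dependency, imported only when training
    dtrain = xgb.DMatrix(X, label=cox_labels(loan_age, default_flag))
    return xgb.train(PARAMS, dtrain, num_boost_round=NUM_ROUNDS)

# Toy example (hypothetical): one default at month 12, two censored loans.
labels = cox_labels([12, 30, 7], [1, 0, 0])
print(labels)  # [12.0, -30.0, -7.0]
```

If every label is left positive, the objective treats each loan as having defaulted at its last observed age, so whether to encode censoring this way is a modeling decision worth stating explicitly.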
3.6. Survival Probability Graphs
3.7. Cumulative Hazard Rate Graphs
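Survival curves and cumulative hazard curves carry the same information as a PD term structure. With S(0) = 1, the cumulative PD to period t is 1 − S(t), the marginal (unconditional) PD in period t is S(t − 1) − S(t), and the cumulative hazard is H(t) = −ln S(t). The sketch below applies these identities to a toy survival curve; the values and the function name are illustrative assumptions, not results from the study.

```python
import math

def term_structure(survival):
    """Convert S(1), S(2), ... (with S(0) = 1 implied, all values > 0) into
    rows of (period, cumulative PD, marginal PD, cumulative hazard)."""
    rows, prev = [], 1.0
    for t, s in enumerate(survival, start=1):
        rows.append((t, 1.0 - s, prev - s, -math.log(s)))
        prev = s
    return rows

# Toy survival probabilities (hypothetical).
rows = term_structure([0.98, 0.95, 0.93])
for t, cum_pd, marg_pd, cum_haz in rows:
    print(f"t={t}: cumulative PD={cum_pd:.4f}, "
          f"marginal PD={marg_pd:.4f}, H(t)={cum_haz:.4f}")
```

The marginal PDs sum to the final cumulative PD, which gives a simple consistency check on any term structure derived from a fitted survival model.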
3.8. Model Evaluation Metrics
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
| Position | Attribute Name | Type |
|---|---|---|
| 1 | 1_Credit Score | Numeric |
| 2 | 2_FIRST PAYMENT DATE | Date |
| 3 | 3_FIRST TIME HOMEBUYER FLAG | Alpha |
| 4 | 4_MATURITY DATE | Date |
| 5 | 5_METROPOLITAN STATISTICAL AREA (MSA) | Numeric |
| 6 | 6_MORTGAGE INSURANCE PERCENTAGE (MI %) | Numeric |
| 7 | 7_Number of Units | Numeric |
| 8 | 8_OCCUPANCY STATUS | Alpha |
| 9 | 9_ORIGINAL COMBINED LOAN-TO-VALUE (CLTV) | Numeric |
| 10 | 10_ORIGINAL DEBT-TO-INCOME (DTI) RATIO | Numeric |
| 11 | 11_Original UPB | Numeric |
| 12 | 12_ORIGINAL LOAN-TO-VALUE (LTV) | Numeric |
| 13 | 13_ORIGINAL INTEREST RATE | Numeric Literal decimal |
| 14 | 14_Channel | Alpha |
| 15 | 15_PREPAYMENT PENALTY MORTGAGE (PPM) FLAG | Alpha |
| 16 | 16_AMORTIZATION TYPE | Alpha |
| 17 | 17_PROPERTY STATE | Alpha |
| 18 | 18_PROPERTY TYPE | Alpha |
| 19 | 19_POSTAL CODE | Alpha |
| 20 | 20_LOAN SEQUENCE NUMBER | Alpha-numeric |
| 21 | 21_Loan Purpose | Alpha |
| 22 | 22_Original Loan Term | Numeric |
| 23 | 23_Number of Borrowers | Numeric |
| 24 | 24_SELLER NAME | Alpha-numeric |
| 25 | 25_SERVICER NAME | Alpha-numeric |
| 26 | 26_SUPER CONFORMING FLAG | Alpha |
| 27 | 27_PRE-RELIEF REFINANCE LOAN SEQUENCE NUMBER | Alpha-numeric |
| 28 | 28_SPECIAL ELIGIBILITY PROGRAM | Alpha-numeric |
| 29 | 29_RELIEF REFINANCE INDICATOR | Alpha |
| 30 | 30_PROPERTY VALUATION METHOD | Numeric |
| 31 | 31_INTEREST ONLY INDICATOR (I/O INDICATOR) | Alpha |
| 32 | 32_MI CANCELLATION INDICATOR | Alpha-numeric |

| Position | Attribute Name | Data Type and Format | Max Length |
|---|---|---|---|
| 1 | 1_Loan Sequence Number | Alpha Numeric | 12 |
| 2 | 2_Monthly Reporting Period | Date | 6 |
| 3 | 3_Current Actual UPB | Numeric—12,2 | 12 |
| 4 | 4_Current Loan Delinquency Status | Alpha Numeric | 3 |
| 5 | 5_Loan Age | Numeric | 3 |
| 6 | 6_Remaining Months to Legal Maturity | Numeric | 3 |
| 7 | 7_Defect Settlement Date | Date | 6 |
| 8 | 8_Modification Flag | Alpha | 1 |
| 9 | 9_Zero Balance Code | Numeric | 2 |
| 10 | 10_Zero Balance Effective Date | Date | 6 |
| 11 | 11_Current Interest Rate | Numeric—8,3 | 8 |
| 12 | 12_Current Deferred UPB | Numeric | 12 |
| 13 | 13_Due Date of Last Paid Installment (DDLPI) | Date | 6 |
| 14 | 14_MI Recoveries | Numeric—12,2 | 12 |
| 15 | 15_Net Sales Proceeds | Alpha-Numeric | 14 |
| 16 | 16_Non MI Recoveries | Numeric—12,2 | 12 |
| 17 | 17_Expenses | Numeric—12,2 | 12 |
| 18 | 18_Legal Costs | Numeric—12,2 | 12 |
| 19 | 19_Maintenance and Preservation Costs | Numeric—12,2 | 12 |
| 20 | 20_Taxes and Insurance | Numeric—12,2 | 12 |
| 21 | 21_Miscellaneous Expenses | Numeric—12,2 | 12 |
| 22 | 22_Actual Loss Calculation | Numeric—12,2 | 12 |
| 23 | 23_Modification Cost | Numeric—12,2 | 12 |
| 24 | 24_Step Modification Flag | Alpha | 1 |
| 25 | 25_Deferred Payment Plan | Alpha | 1 |
| 26 | 26_Estimated Loan-to-Value (ELTV) | Numeric | 4 |
| 27 | 27_Zero Balance Removal UPB | Numeric—12,2 | 12 |
| 28 | 28_Delinquent Accrued Interest | Numeric—12,2 | 12 |
| 29 | 29_Delinquency Due to Disaster | Alpha | 1 |
| 30 | 30_Borrower Assistance Status Code | Alpha | 1 |
| 31 | 31_Current Month Modification Cost | Numeric—12,2 | 12 |
| 32 | 32_Interest Bearing UPB | Numeric—12,2 | 12 |

| Variable | Frequency | Mean | Std Dev | Min | Max |
|---|---|---|---|---|---|
| Original Loan Term | 3,357,837 | 292.77 | 86.84 | 60 | 361 |
| Loan Purpose | 3,357,837 | 2.06 | 0.80 | 1 | 3 |
| Number of Borrowers | 3,357,837 | 1.63 | 1.59 | 1 | 99 |
| Credit Score | 3,357,837 | 733.02 | 268.92 | 300 | 9999 |
| Mortgage Insurance% | 3,357,837 | 3.14 | 8.65 | 0 | 999 |
| DTI | 3,357,837 | 49.88 | 127.01 | 1 | 999 |
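The maxima in the table above (9999 for Credit Score, 999 for Mortgage Insurance% and DTI, 99 for Number of Borrowers) look like placeholder codes for unavailable values rather than genuine observations, and placeholders of that size would inflate the reported means and standard deviations. That reading is an assumption: the table itself does not say so, and the codes should be checked against the data provider's documentation. Under that assumption, a minimal sketch of masking such codes before computing summary statistics could look like this (field names and toy records are hypothetical):

```python
# Assumed missing-value codes, inferred only from the maxima in the table above.
SENTINELS = {
    "credit_score": 9999,
    "mi_pct": 999,
    "dti": 999,
    "num_borrowers": 99,
}

def mask_sentinels(record):
    """Replace assumed placeholder codes with None, leaving other fields untouched."""
    return {k: (None if k in SENTINELS and v == SENTINELS[k] else v)
            for k, v in record.items()}

def mean_ignoring_missing(values):
    """Arithmetic mean over the non-missing values, or None if all are missing."""
    kept = [v for v in values if v is not None]
    return sum(kept) / len(kept) if kept else None

# Toy records (hypothetical, not from the dataset).
raw = [
    {"credit_score": 700, "dti": 35},
    {"credit_score": 9999, "dti": 40},
    {"credit_score": 760, "dti": 999},
]
clean = [mask_sentinels(r) for r in raw]
print(mean_ignoring_missing([r["credit_score"] for r in clean]))  # 730.0
print(mean_ignoring_missing([r["dti"] for r in clean]))           # 37.5
```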

| Variable | Coefficient | SE (coef) | Exp (coef) | p-Value | C-Index |
|---|---|---|---|---|---|
| Original Loan Term | 0.00602 | 0.000360 | 1.006000 | <0.001 *** | 0.757 (0.005) |
| Loan Purpose | −0.21220 | 0.027920 | 0.808800 | <0.001 *** | |
| Number of Borrowers | −0.35370 | 0.046480 | 0.702100 | <0.001 *** | |
| Channel | 0.14910 | 0.044500 | 1.161000 | 0.000806 *** | |
| Credit Score | −0.00685 | 0.000376 | 0.993200 | <0.001 *** | |
| Mortgage Insurance% | 0.00710 | 0.000501 | 1.007000 | <0.001 *** | |
| Original UPB | 0.00000 | 0.000000 | 1.000000 | 0.094 | |
| DTI | 0.00006 | 0.000182 | 1.000000 | 0.762 | |
| Number of Units | −0.10990 | 0.101400 | 0.895900 | 0.279 | |
| Current Non-Interest Bearing UPB | −0.00418 | 0.158000 | 0.995800 | 0.979 | |

| Variable | Coefficient | SE (coef) | Exp (coef) | p-Value | C-Index |
|---|---|---|---|---|---|
| Original Loan Term | 0.0070202 | 0.0003492 | 1.0070 | <0.001 | 0.639 (0.006) |
| Loan Purpose | −0.2576336 | 0.0275811 | 0.7729 | <0.001 | |

| Variable | Coefficient | SE (coef) | Exp (coef) | p-Value | C-Index |
|---|---|---|---|---|---|
| Original Loan Term | 0.00538 | 0.00032 | 1.00500 | <0.001 *** | 0.761 (0.005) |
| Loan Purpose | −0.22210 | 0.02561 | 0.80080 | <0.001 *** | |
| Number of Borrowers | −0.37220 | 0.04235 | 0.68920 | <0.001 *** | |
| Channel | 0.07893 | 0.04111 | 1.08200 | 0.0548 | |
| Credit Score | −0.00725 | 0.00033 | 0.99280 | <0.001 *** | |
| Mortgage Insurance% | 0.00657 | 0.00045 | 1.00700 | <0.001 *** | |
| Original UPB | 0.00000 | 0.00000 | 1.00000 | 0.6086 | |
| DTI | 0.00031 | 0.00015 | 1.00000 | 0.0443 * | |
| Number of Units | −0.03847 | 0.08296 | 0.96230 | 0.6428 | |
| Current Non-Interest Bearing UPB | 0.00000 | 0.00004 | 1.00000 | 0.9866 | |

| Variable | Coefficient | SE (coef) | Exp (coef) | p-Value | C-Index |
|---|---|---|---|---|---|
| Original Loan Term | 0.005474 | 0.000311 | 1.005489 | <0.001 *** | 0.761 (0.005) |
| Loan Purpose | −0.224787 | 0.025584 | 0.798686 | <0.001 *** | |
| DTI | 0.000310 | 0.000152 | 1.000311 | <0.001 *** | |
| Number of Borrowers | −0.371441 | 0.041322 | 0.689740 | <0.001 *** | |
| Credit Score | −0.007251 | 0.000328 | 0.992776 | <0.001 *** | |
| Mortgage Insurance% | 0.006602 | 0.000447 | 1.006624 | <0.001 *** | |

| Model | C-Index | CI (Lower) | CI (Upper) | AIC | BIC | CAIC | AUC |
|---|---|---|---|---|---|---|---|
| Cox PH | 0.639 | 0.626 | 0.653 | 38,286.27 | 38,297.44 | 38,305.46 | 0.613 |
| Ext Cox PH | 0.761 | 0.749 | 0.774 | 45,834.02 | 45,868.63 | 45,886.97 | 0.746 |
| RBF | 0.780 | 0.765 | 0.795 | 65,123.81 | 65,129.61 | 65,137.61 | 0.508 |
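The C-index in the table above measures discrimination: among comparable pairs of loans, it is the share in which the loan with the higher predicted risk defaults first. The excerpt does not state which estimator was used, so the sketch below assumes Harrell's definition, where a pair is comparable when the earlier time is an observed default; it is written in plain Python with a quadratic loop for clarity rather than speed, and the toy data are hypothetical.

```python
def concordance_index(times, events, risk):
    """Harrell's C for right-censored data.

    times  : observed durations
    events : 1 if default observed, 0 if right-censored
    risk   : predicted risk scores (higher means earlier expected default)
    """
    concordant = tied = comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue                    # a pair is anchored on an observed default
        for j in range(n):
            if times[i] < times[j]:     # loan i defaulted while loan j was still at risk
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    tied += 1           # tied scores count as half
    if comparable == 0:
        return float("nan")
    return (concordant + 0.5 * tied) / comparable

# Toy data (hypothetical): the third loan is censored.
t = [1, 2, 3, 4]
e = [1, 1, 0, 1]
print(concordance_index(t, e, [4, 3, 2, 1]))  # 1.0, perfectly ordered scores
print(concordance_index(t, e, [1, 2, 3, 4]))  # 0.0, perfectly reversed scores
print(concordance_index(t, e, [1, 1, 1, 1]))  # 0.5, uninformative scores
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the three models in the table are compared.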
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Moremoholo, K.R.; Shongwe, S.C.; Koning, F.F. Modeling the Probability of Default Term Structure Using Different Methodologies Under IFRS 9. Int. J. Financial Stud. 2026, 14, 62. https://doi.org/10.3390/ijfs14030062

