2.1. The Model
Let $Y_{t-1}$ be the principal amount in dollars lent to an applicant at time $t-1$, and let $AP_t$ be the amount in dollars paid toward the principal at time $t$. Then, in case of default, the charge-off amount at time $t$, $CO_t$, is given by:
$$CO_t = Y_{t-1} - AP_t \qquad (1)$$
where $Y_{t-1} = \sum_i y_i$ is the sum of future payments.
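As a worked illustration of the charge-off amount, here is a minimal Python sketch. It assumes that the charge-off is the principal lent less the principal already repaid; the function name `charge_off_amount`, its parameters, and the floor at zero are hypothetical choices made for illustration, not part of the model.

```python
def charge_off_amount(principal_lent: float, principal_paid: float) -> float:
    """Return the unpaid principal at default, in dollars.

    Assumption: the charge-off is the amount lent minus the principal
    already repaid, floored at zero so that an overpayment never yields
    a negative charge-off.
    """
    if principal_lent < 0 or principal_paid < 0:
        raise ValueError("amounts must be non-negative")
    return max(principal_lent - principal_paid, 0.0)


# Illustrative numbers only: $10,000 lent, $3,500 of principal repaid.
example_charge_off = charge_off_amount(10_000.0, 3_500.0)
```

The dollar figures are made up for the example and do not come from the bank's data.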
The risk that an individual account charges off is estimated from the credit score of the account:
$$p(CO_t) = f(S_{t-1}) \qquad (2)$$
where $S_{t-1}$ is the risk measure, or credit score, that the loan application received when it was accepted by the lender, and $p(CO_t)$ is the probability of a charge-off at time $t$. The probability of charge-off is determined by several predictive variables that, in turn, produce a credit score. Notice that the credit score $S_{t-1}$ was assigned at the same time the loan $Y_{t-1}$ was approved.
The risk of individual loan applications is measured through credit scoring. Every loan application $k$ receives a score $S_k$ when the borrower submits the application at time $t-1$. $S_k$ indicates the risk of the loan application:
$$S_k = f(Z_k, X_k, U_k, V_k, \ldots) \qquad (3)$$
where $Z_k$, $X_k$, $U_k$, $V_k$ represent predictive variables such as maximum potential exposure (MPE), expected revenues, personal bureau credit risk, commercial bureau credit risk, and so on.
An individual loan application is accepted or declined depending on its position relative to a cut-off value, $S_{co}$. A low value of $S_k$ implies that the probability of charge-off is high: if $S_k$ is lower than $S_{co}$, the loan is expected to be rejected. A high value of $S_k$ implies that the probability of charge-off is low: if $S_k$ is greater than $S_{co}$, the loan is expected to be accepted. Loan applications with a score below $S_{co}$ have a higher probability of becoming delinquent, since the credit score for each application is the same for the duration of the loan:
This implies that credit scoring is done once. So, even if the account becomes delinquent or is charged off, the score given when the loan application was approved remains the same throughout the life of the account. Whether the borrower pays back the whole amount or not is described by $PB_t$, such that:
where $PB_t$ is the payback amount at time $t$, and $\alpha$ measures the probability of repayment, so that $1 - \alpha$ is the probability of charge-off. Notice that if $\alpha$ is equal to 1, then the amount is going to be paid in full, that is:
Now, if $\alpha = 0$, then the amount lent is going to charge off, that is:
For example, $\alpha = 0.5$ implies that the loan is equally likely to be repaid or to default. In general, the lower the value of $\alpha$, the higher the probability that the loan will charge off.
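The cut-off rule and the role of α described above can be sketched in Python. This is only an illustration under stated assumptions: the `Application` class, the `decide` and `expected_split` functions, the treatment of a score exactly equal to the cut-off, and the expected-value split of the principal are all hypothetical and are not taken from the model's equations.

```python
from dataclasses import dataclass


@dataclass
class Application:
    score: float      # S_k, assigned once when the application is submitted
    principal: float  # dollars that would be lent if the loan is accepted


def decide(app: Application, cutoff: float) -> str:
    """Accept when the score is above the cut-off, decline otherwise.

    Assumption: a score exactly equal to the cut-off is declined; the
    text does not say how ties are handled.
    """
    return "accept" if app.score > cutoff else "decline"


def expected_split(principal: float, alpha: float) -> tuple[float, float]:
    """Split the principal into expected repaid and expected charged-off dollars.

    Assumption: alpha is read as the probability of repayment, so
    alpha = 1 means paid in full and alpha = 0 means fully charged off.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * principal, (1.0 - alpha) * principal


# Illustrative values only.
decision = decide(Application(score=700.0, principal=10_000.0), cutoff=650.0)
repaid, charged_off = expected_split(10_000.0, 0.5)
```

With α = 0.5 the sketch splits the principal evenly between the two outcomes, which mirrors the 50–50 example in the text.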
If we concentrate only on the charge-off side of Equation (4), then we can see that:
So, the probability of charge-off for an individual account can be shown as:
where $\varepsilon_t$ captures the unexpected loan risk of charge-off that could not be known to the lender when the loan was approved. This error term also reflects hidden information not disclosed by the borrower when the loan was processed. If the lender had known the information contained in $\varepsilon_t$, the credit score given to that application would have been lower than the cut-off value, which, in turn, would have led to the loan application being rejected.
If we assume that the bank has N accounts, then the portfolio charge-off can be obtained by:
where $PCO_t$ is the charge-off amount at time $t$ for the entire portfolio of the bank:
is the number of individual charge-offs in a given period t:
In the case that $\sum_i \alpha_i = 1$, the total sum of the charge-offs will be equal to the total payment of the loan or principal. The error term, $v_t$, captures the unexpected market loan risk of charge-off for the bank's portfolio. This error term also reflects hidden market information not disclosed by the borrowers and any other market risk, including information that was not available when the loans were processed.
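The aggregation from individual accounts to the portfolio can be illustrated with a short Python sketch. It assumes that each charged-off account contributes its unpaid principal to the portfolio total and it leaves out the error term; the `Account` class and the `portfolio_charge_off` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Account:
    principal_lent: float  # dollars lent at t - 1
    principal_paid: float  # dollars of principal repaid by time t
    charged_off: bool      # True if the account charged off in period t


def portfolio_charge_off(accounts: list[Account]) -> tuple[float, int]:
    """Return (total charge-off dollars, number of charged-off accounts).

    Assumption: a charged-off account contributes its unpaid principal;
    the unexpected market component (the error term) is not modelled.
    """
    losses = [
        a.principal_lent - a.principal_paid
        for a in accounts
        if a.charged_off
    ]
    return sum(losses), len(losses)


# Illustrative portfolio of N = 3 accounts, two of which charged off.
portfolio = [
    Account(10_000.0, 3_500.0, True),
    Account(5_000.0, 5_000.0, False),
    Account(4_000.0, 2_000.0, True),
]
total_loss, n_charge_offs = portfolio_charge_off(portfolio)
```

Returning the count alongside the dollar total keeps both portfolio quantities mentioned in the text available from one pass over the accounts.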
Since the probability of charge-off is measured by the credit score that each loan application receives, which, in turn, measures the risk of each borrower, we can estimate the risk of the entire bank portfolio by using the individual credit score assigned to each loan. Using the estimates for $\alpha_s$, we can then find the weights to assign to the risks observed in previous periods. We can then forecast the next period's risk level from the weighted average of previous risk levels using the following equation:
where $j = 1, \ldots, N_s$ indexes the individual scores, $N_s$ is the number of individual scores in each period $t$, and $\acute{\alpha}_{ts}$ are estimated parameters.
The following term indicates that all the loans are scored:
Notice that an interest rate spread, $r_t$, can be added to Equation (9) to measure the market expectation for the next period, such that:
To capture the immediate effect on the market, a short-term rate, $r_t$, may be used, such as money market rates or ninety-day treasury bills. If $r_t$ increases, it will imply that the market risk has increased, increasing the expected score for $S_{t+1}$. The estimated error term, $\acute{\varepsilon}_t$, provides information on the charge-off accounts for which the scores did not capture the risk involved in the underlying accounts.
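The forecasting idea described above (a weighted average of past periods' average scores, optionally shifted by a short-term rate) can be sketched in Python. This is a loose illustration and not the paper's estimated equation: the function `forecast_next_score`, the additive rate term, and the parameters `weights` and `rate_coeff` are hypothetical, and in practice the weights would be estimated rather than supplied by hand.

```python
from statistics import mean


def forecast_next_score(
    past_period_scores: list[list[float]],
    weights: list[float],
    rate: float = 0.0,
    rate_coeff: float = 0.0,
) -> float:
    """Forecast the next period's score level.

    past_period_scores holds one list of individual scores per past period.
    weights holds one weight per past period (assumed already estimated).
    rate is a short-term interest rate and rate_coeff its assumed
    coefficient; both default to zero, which drops the rate term.
    """
    if len(past_period_scores) != len(weights):
        raise ValueError("need exactly one weight per past period")
    # Average the individual scores within each period.
    averages = [mean(scores) for scores in past_period_scores]
    # Weighted sum of the period averages, plus the optional rate term.
    weighted = sum(w * a for w, a in zip(weights, averages))
    return weighted + rate_coeff * rate


# Illustrative scores for two past periods and hand-picked weights.
history = [[600.0, 700.0], [650.0, 750.0]]
base_forecast = forecast_next_score(history, weights=[0.25, 0.75])
rate_forecast = forecast_next_score(
    history, weights=[0.25, 0.75], rate=2.0, rate_coeff=1.5
)
```

The scores, weights, rate, and coefficient are invented for the example; only the structure (period averages combined with weights, plus a rate term) follows the description in the text.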
2.2. Sample Selection Procedure
The empirical analysis is based on loan-level primary data from two branches of the same commercial bank, located in different cities within the same regional market. The study includes only one commercial bank, with each branch treated as a separate observational unit for regional comparison. Restricting the sample to a single institution ensures homogeneity in underwriting standards, credit scoring methodology, and risk management practices.
At the time of data collection, the bank was a mid-sized regional institution with approximately $1.7 billion in total assets and an average annual net income growth rate of 22.8% over the study period. The bank was selected for its strong loan portfolio growth, superior regional performance, and the availability of detailed primary credit score data. Additionally, one of the authors, whose team created the credit scores, is familiar with the institution, which supported the accuracy of data validation and model specification.
The dataset spans a three-year period and initially included all small business loan credit score observations generated by the two branches. Observations were screened for completeness and internal consistency. Extreme values were identified and removed using standard outlier diagnostics to mitigate undue influence on coefficient estimates. The final sample consists of 196 observations for the northern branch and 129 observations for the southern branch.