Using Credit Scores to Capture Regional Banks’ Portfolio Credit Risk: The Case of East Texas, USA

Castro, Juan; Nguyen, James; Castro, Esther

doi:10.3390/jrfm19020152


by Juan Castro ¹, James Nguyen ²,* and Esther Castro ³

¹ Fred Hale School of Business, East Texas Baptist University, Marshall, TX 75670, USA

² Department of Economics and Finance, College of Business, Texas A&M University, Texarkana, TX 75503, USA

³ Marilyn Davies College of Business, University of Houston Downtown, Houston, TX 77002, USA

* Author to whom correspondence should be addressed.

J. Risk Financial Manag. 2026, 19(2), 152; https://doi.org/10.3390/jrfm19020152

Submission received: 31 December 2025 / Revised: 10 February 2026 / Accepted: 12 February 2026 / Published: 19 February 2026

(This article belongs to the Section Risk)


Abstract

Credit scoring is the industry-standard methodology for quantifying the creditworthiness and default risk of individual loan applicants. However, assessing the risk at the portfolio level—across different branches or regions—requires more than just aggregating individual scores. This paper presents a simple, pragmatic model for evaluating overall commercial bank portfolio risk by analyzing accumulated credit scores, facilitating effective inter-branch benchmarking. The proposed model is validated using credit score data from two distinct regions of the bank. Logistic regressions by region show that both northern and southern banks maintain low overall risk profiles due to strong portfolio credit scores. However, a nuanced analysis reveals regional discrepancies: the southern region appears riskier when segmented by credit score groupings (indicating a higher concentration of lower-tier borrowers), whereas the northern region exhibits higher risk when analyzed against a broader set of factors, such as approved amounts, maximum potential exposure, and approved versus book rates. This research suggests that portfolio risk is not one-dimensional; effective risk management requires analyzing both individual scores and the interaction of loan characteristics, particularly when comparing regional performance.

Keywords: credit scoring; bank loan default; logistic regression; credit worthiness assessment

1. Introduction

The management of credit risk underpins banking stability and guides institutional capital allocation, especially for regional banks facing growing scale and complexity. In 2024, mortgage applications totaled approximately 7.67 million, an increase of about 6% from the previous year (Urban Institute, 2025a, 2025b), making manual underwriting unmanageable. Automated credit scoring systems, therefore, are crucial for transforming individual borrower information into a portfolio-level risk view. Because credit risk assessment is so central to balance sheet health, miscalculations—especially across portfolios—directly threaten both institutional stability and the broader system, according to Tabachová et al. (2023). Consequently, modern risk management dictates that banks not only accurately price for expected loss (EL) through interest margins and provisions but also maintain robust capital buffers for unexpected loss (UL) based on systematic risk factors (Gordy, 2003; Basel Committee on Banking Supervision, 2005). However, as evidenced by historical defaults, this risk governance must extend beyond a reliance on scores alone; it requires ensuring data accuracy and accounting for the interplay of loan-to-value ratios to prevent the ‘offsetting’ of high-risk loan attributes with misleadingly stable credit scores (Avery et al., 2000; Mayer et al., 2009). From a theoretical perspective, modern credit risk management is grounded in the decomposition of portfolio credit losses into probability of default (PD), loss given default (LGD), and exposure at default (EAD), with expected losses priced into loan terms and provisions and unexpected losses absorbed through regulatory and economic capital buffers (Gordy, 2003; Basel Committee on Banking Supervision, 2005; Schuermann, 2004). This framework underpins both internal risk governance and Basel-style capital regulation.

Historically, the banking industry has experienced a fundamental shift in credit risk assessment. Smaller and regional banks once primarily used relationship lending, which relied on collecting “soft information”—qualitative insights into a borrower’s character, trustworthiness, and local reputation gathered through repeated interaction (Berger & Udell, 2002). Stein (2002) notes that small banks had a comparative advantage in handling soft information, since flatter structures allowed more discretion at the loan officer level. Over time, though, competitive pressures, regulatory demands, and advances in data processing have pushed banks toward “hard information” technologies, such as credit scoring, which depend on quantifiable, verifiable borrower attributes (Berger & Udell, 2006). Berger et al. (2005) demonstrate that adopting credit scoring increased small-business credit availability and shifted evaluation from relationship-based judgment to automated, model-driven assessment. According to the Federal Deposit Insurance Corporation, as banks increase in size and complexity, they tend to shift from relying on expert judgment to using quantitative scorecards or modeled approaches, often with qualitative adjustments. This shift is significant for regional banks, which, because they have fewer resources than national institutions, adopt standardized scoring models to connect traditional local lending practices with modern quantitative risk management. This hybrid approach lets them lend to lower-score borrowers (e.g., 620–660) whom a national bank might automatically reject. Despite these advances, credit scores remain an unresolved challenge; while efficient and widely adopted, they may obscure important sources of portfolio risk when used in isolation, particularly when loan contract characteristics and regional lending practices vary systematically across branches or markets.

A credit score is commonly defined as a numerical summary derived from statistical analysis of credit bureau data intended to represent an individual’s creditworthiness (Hand & Henley, 1997). In the United States, the FICO score has become the industry standard and is constructed from five primary components: payment history, amounts owed, length of credit history, credit mix, and new credit (Fair Isaac Corporation, n.d.). Incorporating these scores into underwriting decisions helps banks mitigate information asymmetry—a condition in which borrowers possess more accurate knowledge of their own risk profiles than lenders (Stiglitz & Weiss, 1981). However, the effectiveness of credit scoring depends critically on data quality and interpretation. Avery et al. (2000) identify several statistical limitations in credit bureau files, showing that situational factors, reporting errors, and incomplete histories can materially affect score reliability. As a result, portfolio-level risk cannot be fully understood from borrower credit scores alone; it must be evaluated jointly with loan structure, underwriting practices, and regional lending conditions.

The risks of overreliance on credit scores became clear during the late 2000s mortgage crisis (Demyanyk, 2008). While earlier studies found that credit scoring expanded credit access (Avery et al., 2000; Berger et al., 2005), later research revealed these models could obscure weakening lending standards. Mayer et al. (2009) show that rising default rates were largely driven by high loan-to-value (LTV) ratios and lax underwriting—risk factors not fully reflected by credit scores. Similarly, Sengupta and Bhardwaj (2015) document that, in the securitized subprime mortgage market, lenders relied on high credit scores to offset other risky loan traits, a strategy that proved unsustainable as macroeconomic conditions worsened. These findings show that credit scores, though useful, cannot fully explain portfolio risk without considering loan terms and lending practices. Taken together, the literature reveals a tension: early studies document the efficiency and inclusiveness of credit scoring, while later studies highlight its inability to capture contract-level and systemic risk factors, suggesting an unresolved gap at the portfolio level.

Recent years have seen a shift from traditional models to machine learning (ML) and AI for default prediction (Addo et al., 2018; Acharya & Upasan, 2022; Hayashi, 2022; Ala’raj et al., 2022; Galán & Lamas, 2025). While deep learning models reveal complex borrower patterns and ensemble techniques like XGBoost deliver high predictive accuracy (Addo et al., 2018; Sadhwani et al., 2021; Hayashi, 2022), regional banks face challenges due to regulatory requirements for transparency and interpretability. While recent machine learning studies emphasize predictive accuracy in default classification (Addo et al., 2018; Hayashi, 2022), they typically abstract from portfolio benchmarking and institutional lending practices, limiting their usefulness for internal risk governance in regional banks. Consequently, traditional methods like logistic regression remain crucial as regulatory benchmarks and internal risk tools (Hu et al., 2025). Schuermann (2004) situates loss given default (LGD) alongside probability of default (PD) and exposure at default (EAD) as a core component of portfolio credit risk measurement, emphasizing that LGD is stochastic and cyclical and materially affects both expected and unexpected losses.

For regional banks, the clearest operational challenge is internal benchmarking: systematically assessing risk profiles across branches or regions to ensure consistent lending standards (European Banking Authority, 2023). Heterogeneous economies amplify this challenge, as seen in Texas, where the average FICO score is 695 and regional divergences are marked (Experian, 2025). Critically, portfolio risk arises not only from borrower scores but also from loan contract features such as the interest rate and maximum potential exposure (MPE), so robust credit risk management must analyze both borrower data and institutional practices (Liu & Liang, 2025). Despite extensive research on credit scoring accuracy, loan-level default prediction, and machine learning models, relatively little work provides transparent tools for internal portfolio benchmarking that jointly evaluate borrower risk and lending practices across regions within the same institution.

This study contributes to the credit risk literature by offering an applied, interpretable framework that enables regional banks to benchmark portfolio risk internally and to distinguish whether observed risk differentials arise from borrower characteristics or institutional lending practices. This paper aims to test a practical, transparent framework that uses credit scores and loan characteristics to quantify and compare loan portfolio risk across two regions of a commercial bank. By leveraging a proprietary dataset and logistic regression, the research seeks to provide regional banks with an actionable method for internal benchmarking, allowing them to determine whether risk differentials stem from borrower characteristics or lending practices. In doing so, the study directly addresses its central argument: that strengthening internal portfolio risk governance through applied, interpretable tools will promote financial stability for regional banks amid economic volatility. The remainder of the paper is organized as follows: Section 2 describes the materials and empirical methodology; Section 3 presents the results; and Section 4 discusses implications for regional bank risk governance and financial stability.

2. Materials and Methods

2.1. The Model

Let Y_t−1 be the principal amount in dollars lent to an applicant at t − 1. Let AP_t be the amount in dollars paid to the principal at time t. Then, in case of default, the charge-off amount, CO_t at time t, is given by:

CO_t = Y_t−1 − AP_t

(1)

where Y_t−1 = Σ y_i = sum of future payments.

The individual risk of a given account to charge-off is estimated by the credit score of the account:

S_t−1 = p(CO_t)

(2)

where S_t−1 is the risk measure or credit score that the loan application received when it was accepted by the lender, and p(CO_t) measures the probability of a charge-off at time t. The probability of charge-off is determined by several predictor variables that, in turn, provide a credit score. Notice that the credit score, S_t−1, is assigned at the same time the loan, Y_t−1, is approved.

The risk of individual loan applications is measured through credit scoring. Every loan application gets a score S_k when the borrower submits their application at time t − 1. S_k measures or indicates the risk of the loan application:

S_k,t−1 = α_k,t−1(Z_k,t−1, X_k,t−1, U_k,t−1, V_k,t−1, …)

where Z_k, X_k, U_k, V_k, represent predictive variables such as maximum potential exposure (MPE), expected revenues, personal bureau credit risk, commercial bureau credit risk, and so on.

The individual loan applicant will be accepted or declined depending on his or her position with respect to a cut-off value, S_co. A low value of S_k implies the probability of charge-off is high. If S_k is lower than S_co, the loan is expected to be rejected. A high value of S_k implies the probability of charge-off is low. If S_k is greater than S_co, the loan is expected to be accepted. Loan applications with a score below S_co have a higher probability of becoming delinquent. The credit score for each application stays the same for the duration of the loan:

S_t−1 = S_t = S_t+1 = S_t+2 = S_t+3 …

(3)

This implies that credit scoring is done once. So, even if the account becomes delinquent or charged off, the score given when the loan application was approved will remain the same throughout the life of the account. The probability that a borrower is going to either pay back or not pay back the whole amount is given by PB_t such that:

PB_t = α Y_t−1 + (1 − α) CO_t

(4)

where PB_t is the payback amount at time t, and α measures the probability that the loan is repaid in full, so that 1 − α is the probability of charge-off. Notice that if α is equal to 1, then the amount is going to be paid in full, that is:

PB_t = Y_t−1.

(5)

Now, if α = 0, then the amount lent is going to charge-off, that is:

PB_t = CO_t.

(6)

For example, α = 0.5 implies a 50–50 chance that the loan is either paid in full or charged off. More generally, the lower the value of α, the higher the possibility that the loan will be charged off.
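Equations (4) to (6) can be checked numerically. The sketch below is only an illustration: the function name `expected_payback` and the dollar figures are assumptions made for this example and do not come from the bank's data.

```python
def expected_payback(alpha: float, principal: float, charge_off: float) -> float:
    """Equation (4): PB_t = alpha * Y_{t-1} + (1 - alpha) * CO_t."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * principal + (1.0 - alpha) * charge_off


# Hypothetical loan: $10,000 principal, $4,000 charged off in case of default.
principal, charge_off = 10_000.0, 4_000.0

at_one = expected_payback(1.0, principal, charge_off)   # Equation (5): PB_t = Y_{t-1}
at_zero = expected_payback(0.0, principal, charge_off)  # Equation (6): PB_t = CO_t
at_half = expected_payback(0.5, principal, charge_off)  # midpoint of the two outcomes
```

As α falls from 1 toward 0, the result moves from the principal toward the charge-off amount, which is the sense in which lower values of α signal a riskier account.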

If we concentrate only on the charge-off side of Equation (4), then we can see that:

(1 − α) CO_t or CO_t = αCO_t,

So, the probability of charge-off for an individual account can be shown as:

CO_t = αCO_t + ε_t

where ε_t provides the unexpected loan risk of the charge-off that could not be known to the lender when the loan was approved. This error term also measures hidden information not given by the borrower when the loan was processed. If the lender had known the information contained on ε_t, the credit score given to that application would have been lower than the cut-off value, which, in turn, would have gotten the loan application rejected.

If we assume that the bank has N accounts, then the portfolio charge-off can be obtained by:

PCO_t = α₁CO₁ + α₂CO₂ + α₃CO₃ + α₄CO₄ + α₅CO₅ + … + α_NCO_N + v_t

(7)

where PCO_t is the charge-off amount at time t for the entire portfolio of the bank, and:

i = 1 … N,

(8)

where N is the number of individual charge-offs in a given period t, with:

Σα_i = 1

In the case that Σα_i = 1, the total sum of the charge-off will be equal to the total payment of the loan or principal. The error term, v_t, provides the unexpected market loan risk of the charge-off for the bank portfolio. This error term also measures the hidden market information not given by the borrowers or any other market risk, including the information that was not available when the loans were processed.
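As a rough illustration of Equation (7), the portfolio charge-off is a weighted sum of the account-level charge-offs plus an unexpected component. The helper name `portfolio_charge_off`, the weights, and the amounts below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def portfolio_charge_off(weights, charge_offs, shock=0.0):
    """Equation (7): PCO_t = sum_i alpha_i * CO_i + v_t.

    `shock` stands in for the error term v_t and defaults to zero.
    """
    if len(weights) != len(charge_offs):
        raise ValueError("weights and charge_offs must have the same length")
    return sum(a * co for a, co in zip(weights, charge_offs)) + shock


# Hypothetical portfolio of three accounts; the weights sum to one.
weights = [0.5, 0.3, 0.2]
charge_offs = [4_000.0, 1_000.0, 2_500.0]

pco = portfolio_charge_off(weights, charge_offs)
```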

Since the probability of charge-offs is measured by the credit score that each loan application receives, which, in turn, measures the risk for each borrower, we can estimate the risk of the entire bank portfolio by using the individual credit score assigned to each loan. Using the estimates for α_s, we can then find the weights that can be assigned to the risks found in previous periods. We can then forecast the next period’s risk level by using the previous average weighted risk levels using the following equation:

S_t+1 = ά₁ S₁ + ά₂S₂ + ά₃S₃ + … ά_n S_n + έ_t

(9)

S_t = ΣS_j/N_s,

(10)

where j = 1 … N_s is the number of individual scores in each period t and ά_ts are estimated parameters.
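A minimal sketch of Equations (9) and (10), assuming the weights have already been estimated: each period's scores are averaged, and the forecast is the weighted sum of those averages. The function names, the weights, and the scores (on the bank's 0 to 350 scale) are hypothetical, and the error term is left out.

```python
def period_average(scores):
    """Equation (10): mean credit score of the loans scored in one period."""
    return sum(scores) / len(scores)


def forecast_next_score(weights, period_averages):
    """Equation (9) without the error term: weighted sum of past period averages."""
    if len(weights) != len(period_averages):
        raise ValueError("one weight is needed per period")
    return sum(w * s for w, s in zip(weights, period_averages))


# Hypothetical scores for three past periods.
history = [[200, 220], [210, 230], [220, 240]]
averages = [period_average(p) for p in history]

# Hypothetical weights that favor the most recent period.
weights = [0.2, 0.3, 0.5]
forecast = forecast_next_score(weights, averages)
```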

The following term indicates that all the loans are scored:

ΣN_{i co} = ΣN_{j s},

(11)

Notice that an interest rate spread, Δr_t, can be added to Equation (9) to measure the market expectation for the next period, such that:

Δr_t = r_t − r_t−1.

(12)

In order to capture the immediate effect on the market, a short-term rate, r_t, may be used, such as money market rates or ninety-day treasury bills. If r_t increases, it will imply that the market risk has increased, increasing the expected score for S_t+1. The estimated error term, έ_t, provides information on the charge-off accounts, for which the scores did not capture the risk involved in the underlying accounts.

2.2. Sample Selection Procedure

The empirical analysis is based on loan-level primary data from two branches of the same commercial bank, located in different cities within the same regional market. The study includes only one commercial bank, with each branch treated as a separate observational unit for regional comparison. Restricting the sample to a single institution ensures homogeneity in underwriting standards, credit scoring methodology, and risk management practices.

During the data collection period, the bank was a mid-sized regional institution with approximately $1.7 billion in total assets and an average annual net income growth rate of 22.8% over the study period. The bank was selected for its strong loan portfolio growth, superior regional performance, and availability of detailed primary credit score data. Additionally, one of the authors, whose team created the credit scores, was familiar with the institution, which helped verify the accuracy of data validation and model specification.

The dataset spans a three-year period and initially included all small business loan credit score observations generated by the two branches. Observations were screened for completeness and internal consistency. Extreme values were identified and removed using standard outlier diagnostics to mitigate undue influence on coefficient estimates. The final sample consists of 196 observations for the northern branch and 129 observations for the southern branch.

2.3. Variables

We use the following variables in our evaluation of the riskiness of a bank’s portfolio based on its clients’ credit scores: approved loan amount, maximum potential exposure, credit score, approved rate, and booked rate. These variables are explained below.

2.3.1. Approved Amount

The approved amount is the value of the loan given to the bank’s customers. Institutional analyses and research demonstrate a strong positive correlation between credit scores and approved loan amounts (Inspire Credit Union, 2024; Setiadi et al., 2024). This variable is therefore expected to have a positive relationship with the credit score, because riskier clients, who have lower credit scores, will be approved for lower amounts. Likewise, when the client’s risk rather than the score is the dependent variable, the approved amount is expected to have an inverse relationship with it.

2.3.2. Maximum Potential Exposure (MPE)

Maximum potential exposure (MPE) is the largest dollar amount that the bank could lose to a single client should that client default on his or her loan. Banks are willing to lend more and expose themselves to greater risk. The relationship between this variable and credit score is expected to be positive (Chen, 2025).

2.3.3. Credit Score

A credit score is a statistically derived numerical measure of an individual’s creditworthiness, used by lenders to assess the likelihood of timely debt repayment. Credit scores are based on a borrower’s historical credit behavior, including repayment history and other credit characteristics. For example, the widely used FICO credit score for consumer loans ranges from 300 to 850, with higher values indicating lower perceived credit risk. Following the general framework of the FICO model, which constructs scores based on borrower-specific characteristics, the bank in this study developed an internal, proprietary credit score. This internal score ranges from 0 to 350 and serves as the primary measure of borrower credit risk in our analysis.

2.3.4. Approved Rate

The approved rate is the interest rate that clients were given for their loans. This particular rate is the rate on the loan that the bank officially accepted. It was suggested that there is an inverse relationship between a borrower’s credit score and his or her approved loan rate (Abdymomunov et al., 2025). It is anticipated that this variable will have an inverse relationship with credit scores. Riskier clients are given loans at higher interest rates because this provides greater certainty that the bank will recover its money. Therefore, the higher the risk, the higher the approved rate and the lower the credit score.

2.3.5. Booked Rate

This variable represents the interest rate that was recorded for the client. The approved rate is the rate at which the loan was accepted, while the booked rate is the rate actually applied. In most cases, these two rates are the same or very similar. The relationship between the booked rate and the credit score is expected to be inverse, like the approved rate (Bank of America, 2025; Kern, 2017). This is because riskier clients are given loans with higher interest rates. Therefore, the higher the risk, the higher the booked rate and the lower the credit score.

2.4. Statistical Methodologies

We use multiple regression analysis to test this model and to see what effect the independent variables (approved amount, MPE, approved rate, and booked rate) have on the individual’s credit score, as follows:

S_{t} = B_{0} + B_{1} \text{ApprovedAmount} + B_{2} \text{MPE} + B_{3} \text{ApprovedRate} + B_{4} \text{BookedRate} + ε_{t}

(13)

where S_{t} = credit score, B_{0} = constant, and B_{i} = multiple regression coefficient used to relate the independent variables to the credit score. The expected sign of the approved amount is positive for the reasons discussed. The expected signs of MPE, approved rate, and booked rate are negative for the reasons discussed.

2.4.1. Logistic Regression Models

Based on the literature review by Kim et al. (2020), corporate default prediction models are traditionally categorized into three generations: discriminant analyses, binary response models, and hazard models. Our study utilizes a second-generation approach, focusing on binary response models, particularly logistic regression, for its baseline analysis. These models are advantageous because they do not require specific probability distributions for predictor variables and allow for testing the significance of individual independent variables. Representative examples of this generation include logit and probit models, which calculate the probability of default within a specific time. Unlike complex “black box” machine learning models, binary response models (specifically logistic regression) allow bankers and regulators to clearly understand the influence of each explanatory variable (e.g., debt-to-income ratio, credit score) on the probability of default. This is crucial for regulatory compliance and auditability. Also, financial decisions are often discrete (e.g., loan approved or denied). Binary models are structurally suited to model these binary outcomes, whereas linear regression can produce illogical, out-of-range probabilities. In a recent study (Nigmonov et al., 2024), the authors employ the same methodology to study the determinants of default risk in a peer-to-peer lending market and show that higher interest rates lead to higher default rates.

In order to evaluate the riskiness of both the northern and southern region banks, we use two different logistic regressions. The first logistic regression uses a dummy variable and data grouped by credit score. The second logistic regression uses a dummy variable and the same independent variables used in the multiple regression model: approved amount, MPE, approved rate, and booked rate. Each of these logistic regression models is explained below.

2.4.2. Logistic Regression (Credit Score Groups)

We performed a logistic regression analysis to test the model and to see the impact of the credit scores of the clients of a bank on the credit risk of the bank. For our independent variables, we divide the data into five different groups, as shown in Table 1.

These groups are defined in Table 2 for the northern bank and Table 3 for the southern bank.

We then establish the following binary regression model:

Z_{t} = B_{0} + B_{1} \text{Group1} + B_{2} \text{Group2} + B_{3} \text{Group3} + B_{4} \text{Group4} + B_{5} \text{Group5} + ε_{t}

(14)

where Z_{t} is the dummy variable, B_{0} is the constant, and B_{n} is the logistic regression coefficient used to relate the independent variable to the dependent variable. The expected sign for Group 1 is negative, with the largest negative coefficient. Each of the groups after the first should gradually decrease in their coefficient, change signs, and then increase in their coefficient. According to this logic, Group 5 should have a positive relationship with the dependent variable and the largest positive coefficient. To determine the overall riskiness of the bank, we take the average credit score for each group and plug it into the equation in place of their respective variable. We then take the Z_{t} output and plug it into the logistic function as follows:

f (z) = \frac{1}{1 + e^{- z}}

(15)

The output of this equation, which should be between 1 and 0, reveals the riskiness of the bank. An output of 1, or close to 1, means the bank has low risk, while an output of 0, or close to 0, means the bank has high risk.
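The two-step calculation described above (evaluate the linear index at the averages, then pass it through Equation (15)) can be sketched as follows. The intercept, coefficients, and group-average scores are hypothetical placeholders rather than estimates from this study, and the error term is omitted.

```python
import math


def logistic(z: float) -> float:
    """Equation (15): f(z) = 1 / (1 + e^(-z)), written to avoid overflow for very negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear_index(intercept, coefficients, averages):
    """Z_t evaluated at the average of each independent variable."""
    return intercept + sum(b * x for b, x in zip(coefficients, averages))


# Hypothetical values: negative coefficients for the low-score groups,
# positive coefficients for the high-score groups.
intercept = -2.0
coefficients = [-0.02, -0.01, 0.0, 0.01, 0.03]
group_averages = [100, 150, 200, 250, 300]

z = linear_index(intercept, coefficients, group_averages)
risk_indicator = logistic(z)  # close to 1 is read as low risk, close to 0 as high risk
```

The same two functions apply unchanged when the averages of approved amount, MPE, approved rate, and booked rate are used in place of the group averages.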

2.4.3. Logistic Regression

The last regression performed was another logistic regression, this time using the independent variables approved amount, MPE, approved rate, and booked rate. The resulting regression equation is shown below:

Z_{t} = B_{0} + B_{1} \text{ApprovedAmount} + B_{2} \text{MPE} + B_{3} \text{ApprovedRate} + B_{4} \text{BookedRate} + ε_{t}

(16)

where Z_{t} is the dummy variable, B_{0} is the constant, and B_{n} is the logistic regression coefficient used to relate the independent variable to the dependent variable. The expected sign of the approved amount is positive for the reasons discussed in Section 2. The expected signs of MPE, approved rate, and booked rate are negative for the reasons discussed in the previous section. We again take the average for each of the independent variables and plug them into the equation in place of their respective variables. We then take the Z_{t} output and insert it into the logistic function, as shown in Equation (15).

3. Results

3.1. Multiple Regression Model Results

3.1.1. Northern Branch

After running our multiple regression model using all the northern branch data for all the variables described in Section 2, we obtain the following equation for the northern branch:

Credit Score = 223 − 0.000085 Approved Amount + 0.000017 MPE + 0.96 Approved Rate − 1.34 Booked Rate

(17)
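To see how Equation (17) maps loan terms to a predicted score, the sketch below plugs in one hypothetical loan. The units are an assumption (amounts in dollars, rates in percentage points), since the text does not state them, and the function name is illustrative.

```python
def predicted_score_north(approved_amount, mpe, approved_rate, booked_rate):
    """Fitted northern-branch relation from Equation (17)."""
    return (223
            - 0.000085 * approved_amount
            + 0.000017 * mpe
            + 0.96 * approved_rate
            - 1.34 * booked_rate)


# Hypothetical loan: $100,000 approved, $150,000 MPE, both rates at 6.0.
score = predicted_score_north(100_000, 150_000, 6.0, 6.0)
```

When the approved and booked rates are equal, the two rate terms net to −0.38 per unit of rate, so the opposite signs on the two rates largely offset each other.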

Since several of the signs are counterintuitive, we evaluate the Variance Inflation Factor (VIF) for each variable, the r^{2} for the equation, and the Durbin–Watson statistic for the equation. The VIF for each variable was under two, which shows that multicollinearity is not a serious issue. The Durbin–Watson statistic is 2.02960, which is higher than both the DU of 1.81 and DL of 1.73. This is good because it shows that there is no significant serial correlation between errors. To test whether there was a linear relationship between at least one of the independent variables and the dependent variable, we conducted a test of overall significance and found that the p-value was 0.773, which, assuming an α of 0.05, shows that we should fail to reject the null hypothesis. This means that the independent variables do not show a linear relationship with the dependent variable. After running the regression and the four graphs, we notice that the errors are somewhat homoscedastic with a few outliers, and the histogram skews to the left. To test the regularity of the errors in the equation, we ran an autocorrelation on the residuals for the data. Overall, we determine that there is no consistent error. Because our equation seems a bit off, we employ Minitab (version 18) to find the best subset. The best subset based on the r^{2} value ends up being a tie between the regression equation we already have and a regression equation that only uses approved amount, MPE, and booked rate as the independent variables. Since we fail to reject H₀ in this test and the graphs examined were somewhat unsatisfactory, we rerun the regression without the approved rate. We chose this because it has the second highest VIF, and it was suggested in the best subset. The resulting equation is:

Credit Score = 225 − 0.000087 Approved Amount + 0.000019 MPE − 0.59 Booked Rate

We now see that the VIFs for the variables remain under two, decreasing only slightly. The Durbin–Watson statistic is 2.02049, which is higher than both the DU of 1.80 and DL of 1.74, an improvement. To test whether there exists a linear relationship between at least one of the independent variables and the dependent variable, we conduct a test of overall significance and obtain a p-value of 0.648, which, assuming an α of 0.05, shows that we fail to reject the null hypothesis. Overall, it is still unsatisfactory.
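For readers who want to reproduce the two diagnostics used here outside Minitab, the standard textbook definitions can be sketched as follows. The residuals are made up for illustration, and the R-squared passed to `vif` is assumed to come from regressing one predictor on the remaining predictors.

```python
def durbin_watson(residuals):
    """DW = sum_{t>=2} (e_t - e_{t-1})^2 / sum_t e_t^2.

    Values near 2 suggest no first-order serial correlation in the errors.
    """
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def vif(r_squared_j):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing predictor j on the others."""
    return 1.0 / (1.0 - r_squared_j)


# Hypothetical residuals from a fitted regression.
residuals = [0.5, -0.3, 0.2, -0.4, 0.1]
dw = durbin_watson(residuals)
```

Under this definition, a VIF below two means that less than half of a predictor's variance is explained by the other predictors.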

In running the regression and four graphs (not all are shown to conserve space), we see that the line of best fit is better than that with the first equation, the errors are somewhat homoscedastic with a few outliers, and the histogram is skewed less to the left. This is seen in Figure 1. The fact that the line of best fit is not great and the histogram is skewed to the left still concerns us. The errors being homoscedastic is a positive outcome. In order to test the regularity of the errors in the equation, we performed an autocorrelation on the new residuals for the data. This is seen in Figure 2. Overall, we conclude that there is no consistent error.

We again examine the best subset, which reveals that all the variables in our second equation yield the best R-square values. And since our second equation is an improvement, we use the second equation for our data analysis.

3.1.2. Southern Branch

After running our multiple regression model using all the southern branch data for all the variables described in Section 2, we obtain the following equation for the southern branch:

Credit Score = 155 + 0.00129 Approved Amount − 0.000040 MPE − 5.6 Approved Rate + 8.9 Booked Rate

Approved amount is now positive. The higher the approved amount, the lower the risk, and thus, the higher the credit score. MPE is negative. This makes sense because the greater the exposure to the company, the higher the individual’s risk, which should lead to a lower credit score for the individual. Approved rate is negative. This is intuitive because the higher the rate, the higher the individual’s risk and the lower their credit score should be. Booked rate is positive, which contradicts what we expected.

In order to determine how well the equation fits our data, we evaluate the VIF for each variable, the r² for the equation, and the Durbin–Watson statistic for the equation. The only variables with VIFs above 2 are the approved rate and booked rate, at 3.831 and 3.840, respectively. While these are higher VIFs than in the northern region data, this is still very good because it shows that there is little correlation between independent variables. Unfortunately, the r² for the equation was only 10.3%. The Durbin–Watson statistic was 1.98634, which is higher than both the DU of 1.79 and the DL of 1.68. To test whether there is a linear relationship between at least one of the independent variables and the dependent variable, we perform a p-test. The p-value found is 0.009, which, assuming an α of 0.05, shows that we should reject the null hypothesis. This means that the independent variables do show a linear relationship with the dependent variable. Examining the regressions and graphs, and performing analyses similar to those in the previous section, we remove the approved rate as an independent variable while keeping all the other independent variables, and we obtain the following equation:

Credit Score = 150 + 0.00130 Approved Amount − 0.000041 MPE + 3.49 Booked Rate

After this procedure, we notice that the VIFs for the variables are now all under two. The Durbin–Watson statistic was 1.97272, which is higher than both the DU of 1.77 and the DL of 1.69, a good sign. To test whether there was a linear relationship between at least one of the independent variables and the dependent variable, we conducted a p-test and obtained a p-value of 0.004, which shows that we should reject the null hypothesis. Finally, we searched again for the best subset for the new equation with Minitab and decided to use the first equation for our data analysis.
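The VIF figures quoted for the southern branch can be read through the standard identity VIF = 1 / (1 − R²), where R² comes from regressing one predictor on the remaining predictors. The sketch below simply inverts that identity; the function name is illustrative, and only the VIF values themselves come from the text.

```python
def implied_r_squared(vif):
    """Recover the auxiliary-regression R^2 from VIF = 1 / (1 - R^2)."""
    if vif < 1:
        raise ValueError("a VIF cannot be below 1")
    return 1.0 - 1.0 / vif


# VIFs reported for the first southern equation.
r2_approved_rate = implied_r_squared(3.831)
r2_booked_rate = implied_r_squared(3.840)

# The threshold of two used in the text corresponds to R^2 = 0.5.
r2_at_threshold = implied_r_squared(2.0)
```

Under this identity, the two rate variables in the first equation each have about 74% of their variance explained by the other predictors, while a VIF under two means less than half.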

3.2. Logistic Regression Model (Credit Score Groups) Results

3.2.1. Northern Region

In order to run our binary logistic regression model using the northern branch data for all the credit score groups described in Section 2, we determined the credit score group with the lowest number of data points, group 5, and used the number of data points for group 5 to dictate how many data points to analyze from the other groups. The purpose of this endeavor was to ensure that we would have equal numbers of data points in each credit score grouping. This resulted in 21 data points. We then took a random sample of 21 data points from each of the groups larger than group 5 to use as our data points when running our binary logistic regression.
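The balancing step can be sketched as follows. This is a minimal illustration, assuming each group is held as a list of credit scores; the group sizes and score values are placeholders, and only the size of group 5 (21 observations) comes from the text.

```python
import random


def balanced_sample(groups, seed=0):
    """Draw the same number of observations from every group.

    The common sample size is the size of the smallest group, so the
    smallest group is kept in full and larger groups are down-sampled.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    n = min(len(values) for values in groups.values())
    return {name: rng.sample(values, n) for name, values in groups.items()}


# Placeholder data: sizes are hypothetical except for group 5 (21 points).
toy_groups = {
    "group_1": list(range(150, 210)),  # 60 placeholder scores
    "group_2": list(range(160, 205)),  # 45 placeholder scores
    "group_3": list(range(170, 208)),  # 38 placeholder scores
    "group_4": list(range(180, 210)),  # 30 placeholder scores
    "group_5": list(range(200, 221)),  # 21 placeholder scores
}

sample = balanced_sample(toy_groups)
```

Sampling without replacement, as `random.sample` does, ensures that no loan is counted twice within a group.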

We also developed a dummy variable to use as our response when performing this logistic regression. This dummy variable is set to equal 0 when a bank’s risk of default is high and 1 when a bank’s risk of default is low. To determine the risk level of the different credit scores using this methodology, we find the average of the credit score values for all the credit score groupings described above (using only the 21 data points picked through random sampling from each group) and determine that this average would be our breakpoint. Any credit score above this average is considered low risk and given a 1, and any credit score below this average is considered high risk and given a 0. These dummy variables are shown in Table 4.
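The construction of the response variable can be sketched with a few rows of Table 4. The sketch assumes, based on how Table 4 is laid out, that each row's five group scores are averaged and compared with the overall breakpoint of 218 shown at the bottom of the table; the function name is illustrative.

```python
def risk_dummy(scores, breakpoint):
    """Return 1 (low risk) if the mean score exceeds the breakpoint, else 0 (high risk)."""
    mean_score = sum(scores) / len(scores)
    return 1 if mean_score > breakpoint else 0


BREAKPOINT = 218  # overall average reported in the last row of Table 4

rows = [
    [225, 188, 224, 256, 343],  # Table 4, row 1: mean 247.2
    [211, 0, 228, 109, 233],    # Table 4, row 2: mean 156.2
    [220, 167, 263, 211, 228],  # Table 4, row 18: mean 217.8
]

dummies = [risk_dummy(row, BREAKPOINT) for row in rows]
```

These three rows reproduce the dummy values 1, 0, and 0 listed for them in Table 4, including the borderline row whose average rounds to the breakpoint.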

Using the data described above, we perform a binary logistic regression and obtain the following equation:

Dummy Variable = −635.439 + 0.633013 Group 1 + 0.734241 Group 2 + 0.253490 Group 3 + 0.629671 Group 4 + 0.688884 Group 5

Logically, the coefficient by group should start negative at group 1 and increase to positive by group 5 because the approved amounts of group 1 are low and generally correlated to a low credit score, and by the time you get to group 5, the approved amounts are the highest and generally correlated to a high credit score. This is not the case, as group 1 is not negative and is indeed larger than group 3 or 4. The coefficient values seem a little random.

To test the goodness of fit for this equation, we employ the Pearson, Deviance, and Hosmer–Lemeshow goodness of fit tests. All three have a p-value of 1.000, which means, assuming α = 0.05, there is no significant evidence that the model does not fit the data. We also evaluate the Chi-Square for the Pearson and Deviance methods. The Pearson method shows 15 degrees of freedom, a Chi-Square value of 0.0000001, and a Critical Value of 24.996. Thus, we fail to reject the null hypothesis, which means there is no significant difference between the expected and observed results. The Deviance method shows 15 degrees of freedom, a Chi-Square value of 0.0000002, and a Critical Value of 24.996. This means we fail to reject H₀, which again means there is no significant difference between the expected and observed results. Because of these results, we accept the equation as a good fit for the data.
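The decision rule applied to the Pearson and Deviance statistics amounts to comparing each Chi-Square value with the critical value for the stated degrees of freedom. A minimal sketch, using only the figures reported above and a hypothetical helper name:

```python
def fails_to_reject(chi_square, critical_value):
    """True when the test statistic is below the critical value."""
    return chi_square < critical_value


# (Chi-Square value, Critical Value at 15 degrees of freedom), as reported.
northern_tests = {
    "Pearson": (0.0000001, 24.996),
    "Deviance": (0.0000002, 24.996),
}

decisions = {
    name: fails_to_reject(stat, crit)
    for name, (stat, crit) in northern_tests.items()
}
```

Both statistics fall far below the critical value, which is why the null hypothesis of adequate fit is not rejected.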

To test the riskiness of the northern region, we then take the average of all the data points, not just the 21 sampled data points, for each of the independent variables: Group 1, Group 2, Group 3, Group 4, and Group 5. These averages are seen in Table 5.

We then plug these into our equation to find a Z-value of −4.63:

Z = −635.439 + 0.633013 (208.48) + 0.734241 (215.5) + 0.253490 (223.76) + 0.629671 (223.03) + 0.688884 (208.24) = −4.63

Plugging this into the logistic function equation gives:

f(x) = \frac{1}{1 + e^{-z}} = \frac{1}{1 + e^{4.63}} = 0.009662 \times 100\% = 0.9662\%

Thus, there is a 0.9662% probability of default on the part of the northern bank.
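The arithmetic for the northern region can be reproduced with a few lines. This is a sketch under the assumption that the coefficients and Table 5 averages are used exactly as printed; the variable names are illustrative.

```python
import math


def logistic(z):
    """Standard logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


INTERCEPT = -635.439
COEFFS = [0.633013, 0.734241, 0.253490, 0.629671, 0.688884]  # Groups 1 to 5
GROUP_MEANS = [208.48, 215.5, 223.76, 223.03, 208.24]        # Table 5

# Linear predictor: intercept plus coefficient-weighted group averages.
z_north = INTERCEPT + sum(b * x for b, x in zip(COEFFS, GROUP_MEANS))

# Logistic transform of the linear predictor.
p_north = logistic(z_north)
```

Here `z_north` rounds to −4.63 and `p_north` to about 0.00966, in line with the values stated in the text.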

3.2.2. Southern Branch

We run the binary logistic regression model using the southern branch data for all the credit score groups described in the previous section. We first determine the credit score group with the lowest number of data points, group 2, and use the number of data points for group 2 to dictate how many data points to analyze from the other groups. The purpose of this step is to ensure that we have equal numbers of data points in each credit score grouping. This results in 16 data points. We then take a random sample of 16 data points from each of the groups larger than group 2 to use as our data points when running our binary logistic regression.

As done in the previous section, we create a dummy variable to use as our response when performing this logistic regression. This dummy variable is set to equal 0 when a bank’s risk of default is high and 1 when a bank’s risk of default is low. In order to determine the risk level of the different credit scores using this methodology, we compute the average of the credit score values for all the credit scores and divide it by the average of the approved amounts for all the credit scores to obtain our breakpoint. This helps eliminate any trouble in determining risk that comes with a high credit score and a low approved amount or vice versa. Following the procedure described previously, we obtain the results in Table 6.

Using the data described above, we obtain the following binary logistic regression:

Dummy Variable = −1350.11 + 1.34666 Group 1 + 3.08610 Group 2 − 0.708166 Group 3 + 1.07092 Group 4 + 1.48490 Group 5

To test the goodness of fit for this equation, we again use the Pearson, Deviance, and Hosmer–Lemeshow goodness of fit tests. All three have a p-value of 1.000, which means, assuming α = 0.05, there is no significant evidence that the model does not fit the data. We also evaluate the Chi-Square for the Pearson and Deviance methods. The Pearson method shows 10 degrees of freedom, a Chi-Square value of 0.0000001, and a Critical Value of 18.307. Because the Chi-Square value is far below the Critical Value, we fail to reject the null hypothesis. The Deviance method shows 10 degrees of freedom, a Chi-Square value of 0.0000002, and a Critical Value of 18.307. This means we again fail to reject H₀. Because of these results, we accept the equation as a good fit for the data.

In order to test the riskiness of the southern region, we then take the average of all the data points, not just the 16 sampled data points, for each of the independent variables: Group 1, Group 2, Group 3, Group 4, and Group 5. These averages are seen in Table 7.

We plug these values into our equation to find a Z-value of 172.4:

Z = −1350.11 + 1.34666 (201.09) + 3.08610 (224.56) − 0.708166 (230.55) + 1.07092 (223.36) + 1.48490 (325.14) = 172.4

We then insert this value into the logistic function equation to obtain:

f(x) = \frac{1}{1 + e^{172.4}} = 1.34 \times 10^{-75} \times 100\% = 1.34 \times 10^{-73}\%

That is, there is a 1.34 × 10⁻⁷³% probability of default on the part of the southern bank.
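The southern calculation can be retraced in the same way. The sketch below assumes the printed coefficients and the Table 7 averages; the variable names are illustrative. Because Z is positive here, the standard logistic form 1 / (1 + e^(−Z)) is close to 1, whereas a figure on the order of 10⁻⁷⁵ corresponds to 1 / (1 + e^Z), so the sketch evaluates both expressions for comparison.

```python
import math


def logistic(z):
    """Standard logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


INTERCEPT = -1350.11
COEFFS = [1.34666, 3.08610, -0.708166, 1.07092, 1.48490]  # Groups 1 to 5
GROUP_MEANS = [201.09, 224.56, 230.55, 223.36, 325.14]    # Table 7

z_south = INTERCEPT + sum(b * x for b, x in zip(COEFFS, GROUP_MEANS))

p_standard = logistic(z_south)    # 1 / (1 + e^(-Z)), essentially 1
p_reflected = logistic(-z_south)  # 1 / (1 + e^(Z)), on the order of 1e-75
```

Which of the two values is read as the probability of default depends on how the dummy variable is coded (0 for high risk and 1 for low risk in this study).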

3.3. Logistic Regression Model (Other Variables) Results

3.3.1. Northern Branch

A binary logistic regression model is fitted using the northern branch data for all the independent variables described previously: approved amount, MPE, credit score, approved rate, and booked rate. To predict the risk level for the bank, we use the data given for all these variables. As before, we create a dummy variable to use as our response when performing this logistic regression. This dummy variable is set to equal 0 when a bank’s risk of default is high and 1 when a bank’s risk of default is low. To determine the risk level for the bank using this methodology, we divide the average of the credit score values by the average of the approved amounts to arrive at our breakpoint. For each client, the value obtained by dividing the credit score by the approved amount determines whether the client is low or high risk. If this value is higher than the breakpoint, the individual is considered low risk and given a 1; if it is lower than the breakpoint, the individual is considered high risk and given a 0.
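The ratio-based breakpoint can be sketched as follows. The breakpoint uses the rounded northern averages from Table 8; the two clients are hypothetical and serve only to show how the rule separates a small loan from a large one at the same credit score.

```python
def ratio_breakpoint(mean_credit_score, mean_approved_amount):
    """Breakpoint: average credit score divided by average approved amount."""
    return mean_credit_score / mean_approved_amount


def ratio_dummy(credit_score, approved_amount, breakpoint):
    """Return 1 (low risk) if score per dollar approved exceeds the breakpoint, else 0."""
    return 1 if credit_score / approved_amount > breakpoint else 0


# Rounded northern averages from Table 8.
bp = ratio_breakpoint(218, 37_638)

# Hypothetical clients with the same credit score but different loan sizes.
small_loan = ratio_dummy(credit_score=230, approved_amount=8_000, breakpoint=bp)
large_loan = ratio_dummy(credit_score=230, approved_amount=150_000, breakpoint=bp)
```

Under this rule the small loan is coded 1 and the large loan 0, which shows that the classification is driven largely by loan size relative to the score.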

Using the data described above, we obtain the following equation:

Dummy Variable = 28.8645 − 0.0669884 Approved Amount + 0.0030134 MPE + 4.13756 Credit Score + 5.97883 Approved Rate − 7.61539 Booked Rate

To test the goodness of fit for this equation, we compute the p-values for the Pearson, Deviance, and Hosmer–Lemeshow goodness of fit tests. All three have a p-value of 1.000, which means, assuming α = 0.05, there is no significant evidence that the model does not fit the data. We also evaluate the Chi-Square for the Pearson and Deviance methods. The Pearson method shows 189 degrees of freedom, a Chi-Square value of 0.0006460, and a Critical Value of 222.076. We thus fail to reject H₀, which means there is no significant difference between the expected and observed results. The Deviance method shows 189 degrees of freedom, a Chi-Square value of 0.0012919, and a Critical Value of 222.076. Therefore, we fail to reject H₀. As a result, we accept the equation as a good fit for the data. Finally, to test the riskiness of the northern region, we take the average of each independent variable (Table 8) and plug these averages into our equation to find a Z-value of −1393:

Z = 28.8645 − 0.0669884 (37,638) + 0.0030134 (68,811) + 4.13756 (218) + 5.97883 (11) − 7.61539 (10) = −1393

Inserting this value into the logistic function equation yields:

f(x) = \frac{1}{1 + e^{-z}} = \frac{1}{1 + e^{1393}} = 1.06607 \times 10^{-605} \times 100\% = 1.06607 \times 10^{-603}\%
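Values this small cannot be represented as ordinary floating-point numbers, and evaluating e^1393 directly overflows. One way to retrace the calculation, offered here as a sketch rather than as the procedure used in the study, is to work with the base-10 logarithm of the logistic function. The function and variable names are illustrative.

```python
import math


def log10_logistic(z):
    """log10 of 1 / (1 + e^(-z)), arranged so that no exponential overflows."""
    if z >= 0:
        return -math.log1p(math.exp(-z)) / math.log(10)
    # For negative z: log(1 / (1 + e^(-z))) = z - log(1 + e^z)
    return (z - math.log1p(math.exp(z))) / math.log(10)


INTERCEPT = 28.8645
COEFFS = [-0.0669884, 0.0030134, 4.13756, 5.97883, -7.61539]
# Table 8 averages: approved amount, MPE, credit score, approved rate, booked rate.
MEANS = [37_638, 68_811, 218, 11, 10]

z_north_full = INTERCEPT + sum(b * x for b, x in zip(COEFFS, MEANS))

# Exponent for the rounded Z-value used in the text.
log10_p_rounded = log10_logistic(-1393)
```

For Z = −1393 the logarithm is about −604.97, that is, a value of roughly 1.07 × 10⁻⁶⁰⁵, which agrees with the figure in the equation above.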

3.3.2. Southern Branch

Following the steps outlined in Section 3.3.1, we run a binary logistic regression and obtain the following equation:

Dummy Variable = −29.4986 − 0.0204173 Approved Amount + 0.0001149 MPE + 1.85001 Credit Score − 3.52030 Approved Rate + 2.00018 Booked Rate

As before, some of the coefficients do not have the expected signs. For example, the maximum potential exposure is positive, which does not make sense. Similarly, the booked rate is positive, which is counterintuitive. To test the goodness of fit for this equation, we perform the Pearson, Deviance, and Hosmer–Lemeshow goodness of fit tests. All three have a p-value of 1.000, which means, assuming α = 0.05, there is no significant evidence that the model does not fit the data. We also evaluate the Chi-Square for the Pearson and Deviance methods. The Pearson method indicates 123 degrees of freedom, a Chi-Square value of 0.0000554, and a Critical Value of 149.885, so we fail to reject H₀. The Deviance method shows 123 degrees of freedom, a Chi-Square value of 0.0001108, and a Critical Value of 149.885, suggesting there is no significant difference between the expected and observed results. Consequently, we accept the equation as a good fit for the data. Next, to test the riskiness of the southern region, we take the average of each independent variable (Table 9) and plug these averages into our equation to find a Z-value of −497.653:

Z = −29.4986 − 0.0204173 (44,499) + 0.0001149 (165,312) + 1.85001 (236) − 3.52030 (10) + 2.00018 (10) = −497.653

We finally insert this number into the logistic function equation and get:

f(x) = \frac{1}{1 + e^{-z}} = \frac{1}{1 + e^{497.653}} = 7.452 \times 10^{-217} \times 100\% = 7.452 \times 10^{-215}\%

4. Discussion

The results of this study demonstrate that credit scores, when aggregated and analyzed using simple logistic regression models, provide a reliable and operationally efficient measure of portfolio-level credit risk for small- and medium-sized commercial banks. Across all model specifications, both regional portfolios exhibit low estimated probabilities of default, reflecting strong underlying borrower quality and effective underwriting practices. These findings confirm that widely available internal credit score data can be successfully leveraged to benchmark portfolio risk across bank branches or regions without the need for complex or data-intensive modeling approaches (Siddiqi, 2017; Jacobson et al., 2006).

Importantly, the analysis shows that alternative model specifications reveal complementary dimensions of risk. Credit score groupings highlight differences in borrower composition across regions, while models incorporating exposure and pricing variables capture variations in risk concentration and potential loss severity. This reinforces the value of combining borrower-level credit metrics with balance-sheet-oriented variables to obtain a more comprehensive assessment of portfolio risk.

From a practical standpoint, the proposed framework offers a scalable and transparent tool for risk monitoring, internal benchmarking, and early warning analysis. For community and regional banks that lack access to sophisticated machine learning infrastructure, the methodology provides a cost-effective, interpretable approach that aligns with regulatory expectations and internal risk governance. Overall, the results contribute positively to the credit risk literature by demonstrating that parsimonious models based on credit scores can meaningfully support portfolio risk assessment and strategic decision-making in smaller banking institutions.

5. Conclusions

Credit scoring remains the primary instrument for assessing applicant risk within financial institutions. A primary objective of this study is not to identify a single “true” risk ranking, but rather to evaluate whether credit scores alone can serve as a reliable diagnostic tool for portfolio-level risk assessment in small- and medium-sized banks that lack access to sophisticated, high-cost analytical systems. We utilize existing credit scores from a commercial bank’s portfolio to evaluate and compare overall institutional health. By introducing a streamlined test based on accumulated credit scores, this paper provides a framework for banks to benchmark risk across various branches. Validation using data from two distinct regions reveals that while both maintain low risk profiles due to high credit scores, their vulnerabilities differ: the southern region shows higher risk within specific credit score cohorts, while the northern region exhibits greater sensitivity to variables such as loan amount, exposure, and interest rates. This is consistent with the established dummy variables, which categorized many individual credit scores for both banks into the low-risk bracket.

Effectively managing loan portfolio risk is a critical banking function. By capturing the risk associated with mortgages, small business loans, and individual borrowers, financial institutions can adjust interest rates, lending policies, and reserve requirements accordingly. This paper also highlights the importance of credit scoring models for assessing loan portfolio risk and maintaining institutional stability. Using proprietary data to classify borrowers, our model offers a flexible approach for comparing risk between bank branches or across different periods. The capability to convert detailed borrower characteristics into useful probability-of-default measurements is crucial for managing exposure and reducing potential losses. The results of this study demonstrate that credit scores, when aggregated and analyzed using parsimonious logistic regression models, can provide a reliable, transparent, and operationally efficient measure of portfolio-level credit risk for small- and medium-sized commercial banks. Across all model specifications, both regional portfolios exhibited low estimated probabilities of default, reflecting strong underlying borrower quality and effective underwriting practices. These findings confirm that widely available internal credit score data can be successfully leveraged to benchmark portfolio risk across bank branches or regions without the need for complex, data-intensive machine learning infrastructure (Addo et al., 2018; Acharya & Upasan, 2022; Hayashi, 2022; Ala’raj et al., 2022; Galán & Lamas, 2025). Lastly, data constraints precluded the inclusion of a broader sample of banks; consequently, future studies would benefit significantly from more extensive datasets.

Author Contributions

Conceptualization, J.C.; methodology, J.C. and E.C.; software, J.C., E.C. and J.N.; validation, J.N. and E.C.; formal analysis, J.C., E.C. and J.N.; investigation, J.C.; resources, J.C., E.C. and J.N.; data curation, J.N. and E.C.; writing (original draft preparation), J.C.; writing (review and editing), J.N. and E.C.; visualization, J.C. and E.C.; supervision, J.C.; project administration, J.N.; funding acquisition, J.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data available from the authors upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

Abdymomunov, A., Elul, R., Ruffino, D., & Wang, J. (2025). Examining the relationship between loan pricing and credit risk. Available online: https://www.federalreserve.gov/econres/notes/feds-notes/examining-the-relationship-between-loan-pricing-and-credit-risk-20250924.html (accessed on 19 January 2026).
Acharya, T. A., & Upasan, P. V. (2022, December). Credit risk assessment: A machine learning approach. In International conference on intelligent systems and machine learning (pp. 39–54). Springer Nature.
Addo, P. M., Guegan, D., & Hassani, B. (2018). Credit risk analysis using machine learning and deep learning models. Risks, 6(2), 38.
Ala’raj, M., Abbod, M. F., Majdalawieh, M., & Jum’a, L. (2022). A deep learning model for behavioural credit scoring in banks. Neural Computing and Applications, 34(8), 5839–5866.
Avery, R. B., Bostic, R. W., Calem, P. S., & Canner, G. B. (2000). Credit scoring: Statistical issues and evidence from credit-bureau files. Real Estate Economics, 28(3), 523–547.
Bank of America. (2025). Better money habits study. Available online: https://newsroom.bankofamerica.com/content/newsroom/press-releases/2025/07/confronted-with-higher-living-costs--72--of-young-adults-take-ac.html (accessed on 17 January 2026).
Basel Committee on Banking Supervision. (2005). An explanatory note on the Basel II IRB risk weight functions. Bank for International Settlements.
Berger, A. N., Frame, W. S., & Miller, N. H. (2005). Credit scoring and the availability, price, and risk of small business credit. Journal of Money, Credit and Banking, 37(2), 191–222.
Berger, A. N., & Udell, G. F. (2002). Small business credit availability and relationship lending: The importance of bank organizational structure. The Economic Journal, 112(477), F32–F53.
Berger, A. N., & Udell, G. F. (2006). A more complete conceptual framework for SME finance. Journal of Banking & Finance, 30(11), 2945–2966.
Chen, J. (2025). Understanding credit exposure: Managing loan risks. Investopedia. Available online: https://www.investopedia.com/terms/c/credit-exposure.asp#:~:text=Credit%20exposure%20is%20the%20maximum,avoiding%20those%20with%20lower%20ratings (accessed on 19 January 2026).
Demyanyk, Y. (2008). Did credit scores predict the subprime crisis? Federal Reserve Bank of St. Louis Review, 90, 405–419.
European Banking Authority. (2023, March 10). EBA publishes annual assessment of banks’ internal approaches for the calculation of capital requirements. Available online: https://www.eba.europa.eu/publications-and-media/press-releases/eba-publishes-annual-assessment-banks-internal-approaches (accessed on 17 January 2026).
Experian. (2025). State of credit: U.S. consumer credit review. Experian Information Solutions. Available online: https://www.experian.com/blogs/insights/2025-state-of-credit-card-report/ (accessed on 16 January 2026).
Fair Isaac Corporation. (n.d.). What’s in my FICO® scores? myFICO. Available online: https://www.myfico.com/credit-education/whats-in-your-credit-score (accessed on 16 January 2026).
Galán, J. E., & Lamas, M. (2025). Beyond the LTV ratio: Lending standards, regulatory arbitrage, and mortgage default. Journal of Money, Credit and Banking, 57(1), 107–150.
Gordy, M. B. (2003). A risk-factor model foundation for ratings-based bank capital rules. Journal of Financial Intermediation, 12(3), 199–232.
Hand, D. J., & Henley, W. E. (1997). Statistical classification methods in consumer credit scoring: A review. Journal of the Royal Statistical Society: Series A (Statistics in Society), 160(3), 523–541.
Hayashi, Y. (2022). Emerging trends in deep learning for credit scoring: A review. Electronics, 11(19), 3181.
Hu, W., Shao, C., & Zhang, W. (2025). Predicting US bank failures and stress testing with machine learning algorithms. Finance Research Letters, 75, 106802.
Inspire Credit Union. (2024). Unlocking opportunities: Understanding the impact of credit scores on loan approvals. Available online: https://inspirefcu.org/unlocking-opportunities-understanding-the-impact-of-credit-scores-on-loan-approvals/ (accessed on 19 January 2026).
Jacobson, T., Lindé, J., & Roszbach, K. (2006). Internal ratings systems, implied credit risk and the consistency of banks’ risk classification policies. Journal of Banking & Finance, 30(7), 1899–1926.
Kern, A. M. (2017). Credit score analysis. Southern Illinois University Carbondale.
Kim, H., Cho, H., & Ryu, D. (2020). Corporate default predictions using machine learning: Literature review. Sustainability, 12, 6325.
Liu, Z., & Liang, H. (2025). Do Fintech lenders align pricing with risk? Evidence from a model-based assessment of conforming mortgages. FinTech, 4(2), 23.
Mayer, C., Pence, K., & Sherlund, S. M. (2009). The rise in mortgage defaults. Journal of Economic Perspectives, 23(1), 27–50.
Nigmonov, A., Shams, S., & Urbonas, P. (2024). Estimating probability of default via delinquencies? Evidence from European P2P lending market. Global Finance Journal, 63, 101050.
Sadhwani, A., Giesecke, K., & Sirignano, J. (2021). Deep learning for mortgage risk. Journal of Financial Econometrics, 19(2), 313–368.
Schuermann, T. (2004). What do we know about loss given default? Economic Policy Review, Federal Reserve Bank of New York, 10(2), 1–12.
Sengupta, R., & Bhardwaj, G. (2015). Credit scoring and loan default. International Review of Finance, 15(2), 139–167.
Setiadi, D. R. I. M., Muslikh, A. R., Iriananda, S. W., Warto, W., Gondohanindijo, J., & Ojugo, A. A. (2024). Outlier detection using Gaussian mixture model clustering to optimize XGBoost for credit approval prediction. Journal of Computing Theories and Applications, 2(2), 244–255.
Siddiqi, N. (2017). Intelligent credit scoring: Building and implementing better credit risk scorecards. John Wiley & Sons.
Stein, J. C. (2002). Information production and capital allocation: Decentralized versus hierarchical firms. The Journal of Finance, 57(5), 1891–1921.
Stiglitz, J. E., & Weiss, A. (1981). Credit rationing in markets with imperfect information. The American Economic Review, 71(3), 393–410.
Tabachová, Z., Diem, C., Borsos, A., Burger, C., & Thurner, S. (2023). Estimating the impact of supply chain network contagion on financial stability. arXiv, arXiv:2305.04865.
Urban Institute. (2025a). Housing finance chartbook (October v7). Available online: https://www.urban.org/sites/default/files/2025-10/October%20v7.pdf (accessed on 16 January 2026).
Urban Institute. (2025b). Mortgage market outlook. Urban Institute Housing Finance Policy Center.

Figure 1. Basic Statistics.

Figure 2. Autocorrelation Test of Residuals.

Table 1. Group Members and Associated Approval Amounts.

Group	Approval Amount
1	$1000–$10,000
2	$10,000–$20,000
3	$20,000–$30,000
4	$30,000–$50,000
5	$50,000–$500,000

Table 2. Northern Region Characteristics.

Northern Region
	Group 1	Group 2	Group 3	Group 4	Group 5
Approved Amount Mean	6907.30	15,701.42	26,121.46	45,308.41	164,404.33
Credit Score Mean	208.48	215.5	223.76	223.03	208.24

Table 3. Southern Region Characteristics.

Southern Region
	Group 1	Group 2	Group 3	Group 4	Group 5
Approved Amount Mean	7593.06	17,637.50	25,731.27	44,455.03	138,232.45
Credit Score Mean	201.09	224.56	230.55	223.36	325.14

Table 4. Northern branch group characteristics.

Group 1	Group 2	Group 3	Group 4	Group 5	Average	Dummy Variable
225.00	188.00	224.00	256.00	343.00	247	1
211.00	0.00	228.00	109.00	233.00	156	0
188.00	227.00	247.00	0.00	259.00	184	0
237.00	148.00	230.00	300.00	215.00	226	1
224.00	133.00	221.00	218.00	167.00	193	0
232.00	235.00	158.00	209.00	0.00	167	0
216.00	201.00	306.00	239.00	224.00	237	1
266.00	264.00	267.00	275.00	230.00	260	1
247.00	250.00	250.00	232.00	227.00	241	1
236.00	165.00	246.00	275.00	266.00	238	1
250.00	242.00	232.00	220.00	241.00	237	1
249.00	263.00	249.00	236.00	195.00	238	1
140.00	231.00	216.00	276.00	189.00	210	0
251.00	256.00	247.00	227.00	151.00	226	1
232.00	184.00	246.00	215.00	251.00	226	1
180.00	211.00	295.00	243.00	227.00	231	1
229.00	238.00	248.00	193.00	253.00	232	1
220.00	167.00	263.00	211.00	228.00	218	0
131.00	196.00	202.00	215.00	233.00	195	0
185.00	223.00	248.00	250.00	241.00	229	1
188.00	271.00	249.00	210.00	0.00	184	0
					218

Table 5. Northern region riskiness.

	Group 1	Group 2	Group 3	Group 4	Group 5
Averages	208.48	215.5	223.76	223.03	208.24

Table 6. Southern branch group characteristics.

Group 1	Group 2	Group 3	Group 4	Group 5	Average	Dummy Variable
242	201	203	275	228	230	1
262	250	240	234	184	234	1
228	214	220	242	201	221	1
222	235	207	238	294	239	1
153	239	254	256	222	225	1
206	273	231	239	267	243	1
219	255	203	229	262	234	1
215	250	0	0	179	129	0
253	212	241	203	220	226	1
183	221	239	298	224	233	1
227	218	248	245	238	235	1
209	206	215	251	237	224	1
228	177	200	241	268	223	1
0	251	278	192	259	196	0
194	185	238	246	203	213	0
196	206	231	236	232	220	0
					220

Table 7. Southern region risk by groups.

	Group 1	Group 2	Group 3	Group 4	Group 5
Averages	201.09	224.56	230.55	223.36	325.14

Table 8. Northern region risk levels.

	Approved Amount	MPE	Credit Score	Approved Rate	Booked Rate
Averages	37,638	68,811	218	11	10

Table 9. Southern region risk levels.

	Approved Amount	MPE	Credit Score	Approved Rate	Booked Rate
Averages	44,499	165,312	236	10	10

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Castro, J.; Nguyen, J.; Castro, E. Using Credit Scores to Capture Regional Banks’ Portfolio Credit Risk: The Case of East Texas, USA. J. Risk Financial Manag. 2026, 19, 152. https://doi.org/10.3390/jrfm19020152

