Article

When Models Fail: Credit Scoring, Bank Management, and NPL Growth in the Greek Recession

by Vasileios Giannopoulos * and Spyridon Kariofyllas
Department of Accounting and Finance, University of the Peloponnese, 24100 Kalamata, Greece
* Author to whom correspondence should be addressed.
Int. J. Financial Stud. 2025, 13(3), 152; https://doi.org/10.3390/ijfs13030152
Submission received: 27 June 2025 / Revised: 13 August 2025 / Accepted: 18 August 2025 / Published: 22 August 2025

Abstract

The significant increase in non-performing loans (NPLs) during the escalating recession of the Greek economy motivates us to study the predictive power of credit rating models in periods of economic shocks. In parallel, we examine the responsibilities of bank management in the expansion of NPLs in this adverse environment. Certain studies connect bad loans with turbulent conditions. Our paper weighs the relative significance of both economic shock and management effectiveness using data at an individual level, which constitutes the originality of our study. We use a unique dataset of small business loans that were granted during 2005 (expansion period) by a large commercial Greek bank, and we explore their performance between 2010 and 2012 (early recession period). In the context of a stepwise methodology, we compare the Bank’s credit scoring model with three other prediction models (binomial logistic regression, decision tree, and multilayer perceptron neural network) to check both the predictive ability of credit scoring models during recession and the effectiveness of bank management. The comparative analysis confirms the management’s responsibilities in granting loans that became NPLs, since the Bank’s model exhibited the worst predictive performance. Additionally, we find that adverse external conditions lead to an increase in NPLs and decrease the predictive performance of all credit scoring models. The study offers a reliable methodological tool for lending management in economic downturns.

1. Introduction

Micro and small enterprises (MSEs) represent up to 99% of all businesses operating in the European Union. Moreover, they are strongly dependent on the banking system, since 75–80% of all enterprises are bank-financed, especially through short-term bank loans (European Banking Authority [EBA], 2016). MSEs’ credit risk seems to be high, since their financial reporting quality is lower and their information asymmetry is higher than those of other businesses (Duarte et al., 2016). After the financial crisis of 2008, the level of non-performing loans (NPLs) for MSEs was the highest in most EU countries (18.5%, European Banking Authority [EBA], 2016), indicating that they are more vulnerable to adverse economic conditions than other enterprise groups. According to Christodoulou-Volos (2025), lower economic growth, higher inflation, and higher interest rates correlate with an increase in NPLs. Moreover, economic policy uncertainty, unemployment, capital adequacy, and liquidity risk during recession periods affect NPLs (Papadamou & Pitsilkas, 2025). Maztoul (2025) studied a dataset of 131 OECD commercial banks over 10 years. The empirical findings provide strong evidence that banks’ financial performance is negatively affected by NPLs and that institutions with higher ESG (Environmental, Social, and Governance) scores tend to have lower NPL ratios. The study concludes that enhancing sustainability practices contributes to improved financial performance by mitigating credit risk. Integrating sustainability into banking operations strengthens risk management, which in turn supports financial outcomes. Moreover, the results highlight that the Social and Governance components play a significant role in this relationship, while the Environmental component has a more ambiguous impact. These findings underscore the importance for banks of improving their ESG performance, given its positive influence on loan quality and overall financial health.
Accurately modeling NPLs is critically important for a wide range of stakeholders seeking to assess bank performance, including shareholders, bank managers, competitors, regulators, and credit rating agencies (Karagiannis & Kourtzidis, 2025).
In this paper, we focus on the evaluation process of bank financing, as we believe that the management of NPLs is crucial for financial stability, especially during recession periods. In this direction, we examine whether the remarkable increase in NPLs is a result of poor bank managerial ability in risk-taking behavior (Andreou et al., 2016) or of the country’s adverse macroeconomic conditions. When bank managers apply poor credit scoring models, or ignore the model rating in order to increase the number of loans granted and their market share during expansion periods, a greater increase in NPLs should be expected in periods of recession. On the other hand, we also argue that during economic downturns, the level of NPLs could rise significantly, strongly challenging management decisions. Manz (2019) presented a literature review of 44 studies on determinants of NPLs published between 1987 and 2017 in 30 peer-reviewed journals. The paper concluded that the interaction of loan- and asset-specific events with macroeconomic and bank-specific factors remains poorly understood and deserves additional empirical research.
Although there is some literature on poor bank managerial ability (Berger & DeYoung, 1997; Williams, 2004; Louzis et al., 2012; Belaid et al., 2017), this does not evaluate lending management during recessionary periods. In such periods, credit risk materializes and raises concerns for a further increase in the NPL ratio. In this way, external environmental deterioration fully reveals bad management decisions that are taken more easily in the growth period (in which loan demand booms), resulting in a continuously decreasing performance of marginally efficient loans. Although specific studies connect bad loans with turbulent conditions (Quagliariello, 2007; Konstantakis et al., 2016), to the best of our knowledge, no study weighs the relative significance of both dimensions of NPLs, systematically considering adverse external effects and using data at an individual level, providing our study’s originality.
In our paper, we compare the Bank’s credit scoring model with three other prediction models that the literature proposes, such as binomial logistic regression, decision tree, and multilayer perceptron neural network, to answer our research questions. The comparative analysis confirms the management’s responsibilities in granting NPLs, since the Bank’s model exhibited the worst predictive performance. Additionally, we find that adverse external conditions lead to an increase in NPLs and decrease the accuracy of all empirical methods used in the study.
The Greek economy is a bank-based system, similar to other European countries, such as Spain, Portugal, Italy, etc. (Duarte et al., 2016). So, the study of the responsibilities of banking management in the significant increase in NPLs during the recession is a typical case study that can lead to general conclusions. Although the EU average NPL ratio was up to 6% in September 2015, the NPL ratio in Greece was extremely high (43.5%), which is the second-highest NPL ratio in Europe after the NPL ratio of 50% in Cyprus (European Banking Authority [EBA], 2016). In our paper, we examine whether the increase in NPLs from August 2010 to July 2012 is due to bad management during the expansion period or is a matter of the escalating recession of the economy.
The Greek retail banking system is oligopolistic, consisting of four systemic institutions that provide similar services and products and have a similar structure in terms of retail network and organizational form. The data were collected manually and consist of 3294 loans to small businesses that were granted in 2005, a year during which the Greek economy was in a phase of redefinition after the 2004 Olympic Games. We explore the performance of small business loans from August 2010 to July 2012 (early recession period of the Greek economy). During the recession period, all systemic Greek banks were under the supervision of the European Central Bank (ECB) and the Bank of Greece. Accordingly, the high homogeneity of the Greek systemic banks gives our results high representativeness for the entire Greek banking sector.
Our paper contributes to the existing literature related to NPLs’ determinants (Louzis et al., 2012) and the literature of credit scoring models (Crook et al., 2007; Louzada et al., 2012) in diverse ways. First, it offers a reliable response to whether bad loan performance is the result of bad management or caused by an adverse external environment. In particular, the findings confirm the bad management hypothesis and demonstrate that poor decision-making is very vulnerable to economic fluctuations. Moreover, we observe that loans that were 90 days past due in the first year indicated a further deterioration in the second year. As an important implication, preventing loans from becoming more than 90 days past due should be an urgent focus for banks. Second, the paper considers, for the first time, a relatively large number of micro and small-sized borrowers’ idiosyncratic features, thus substantially differing from the current literature (e.g., Louzis et al., 2012; Anastasiou et al., 2016) that emphasizes both macroeconomic factors and bank-specific characteristics for NPL accumulation at the aggregated level. Moreover, the paper assesses the relative accuracy of several credit scoring models using data from emerging economies (Abdou et al., 2007; Louzada et al., 2012; Mileris & Boguslauskas, 2011), where banks’ credit scoring models record the worst accuracy compared to the models proposed in the literature. In this paper, we use three credit scoring models widely used in the literature, and we focus on the efficiency and the evolution of all credit scoring models over time to answer the bad management and bad luck hypotheses.
The rest of the paper is structured as follows. In Section 2, we present a literature review of bad management and credit scoring models. In Section 3, we provide the dataset and its descriptive statistics. In Section 4, we set the research methodology, and we provide a short presentation of the different credit scoring models. In Section 5, we present our main results. In Section 6, we discuss the main findings of our study and we conclude the paper.

2. Literature Review

2.1. Bad Management Hypothesis

According to the bad management hypothesis, introduced by Berger and DeYoung (1997), poor management skills in the underwriting process (credit scoring models), appraisal of pledged collaterals, and monitoring borrowers are associated with increases in future problem loans. Under these circumstances, “bad managers” follow a liberal credit policy to boost current earnings, increase market shares, and, in conjunction with the income smoothing activities by borrowers in expansionary phases, they conduct inadequate credit risk loan assessment. There are some studies on the bad management hypothesis at the aggregate level (Berger & DeYoung, 1997; Partovi & Matousek, 2019; Louzis et al., 2012; Williams, 2004) that provide interesting findings but ignore the substantial impact of crisis effects on bank lending decision-making.
Exogenous crisis effects can challenge bad management decisions further. In this framework, certain studies examine bad loans under turbulent conditions (Quagliariello, 2007; Konstantakis et al., 2016; Makri et al., 2014; Tmava & Spahiu, 2025; Undji & Sheefeni, 2025); however, they do not consider strategic firm-specific considerations, Bank lending managerial options, and micro-level data. Akuoko-Kanadu and Mahmud (2025) applied the Arellano–Bond Generalized Method of Moments dynamic panel estimation technique to a sample of 493 banks across 31 Sub-Saharan African (SSA) countries between 2011 and 2019. Their findings indicate that corruption and economic growth have had significant, but opposite, effects on NPLs in the region—corruption exerting a positive influence and economic development a negative one. In a turbulent environment, commercial banks face increased credit risk due to the reduced cash flows of their business borrowers. At the same time, household borrowers receive lower income payments because of wage cuts or unemployment. In this high-risk environment, the rise in problem loans and the decline in collateral values lead to a significant tightening of credit conditions, as banks become increasingly unwilling to provide new credit. Thus, Berger and DeYoung (1997) suggest that external events cause an increase in problem loans for the Bank, and consequently, a decrease in performance. The specific hypothesis suggests a circular relationship between the macroeconomic environment and loan quality, implying that in expansion years, borrowers’ incomes improve, and thus their capacity to repay their loans increases. In turn, when the economy enters a recessionary phase, NPLs augment as unemployment rises, disposable income declines, and borrowers face difficulties in repaying their debt obligations.

2.2. Credit Scoring Techniques

The global financial crisis of 2008 focused the attention of financial institutions on the control and management of credit risk. A good credit risk assessment method can help financial institutions lend to trusted borrowers, thereby increasing their profits, and refuse lending to unreliable borrowers, thereby reducing losses. The precision of the credit rating model of prospective borrowers is critical for the profitability of financial institutions (Wang et al., 2012). Even a 1% improvement in the accuracy of recognizing bad borrowers leads to a significant reduction in losses for financial institutions (Hand & Henley, 1997). The accuracy of scoring models is crucial for banks’ profitability, as credit management needs to identify potential bad clients and minimize the chance of default (Lee et al., 2002). Researchers have developed several credit scoring models that can be grouped into two main categories. In the first category, there are the statistical credit scoring models, such as Linear Discriminant Analysis (Altman & Saunders, 1998), Logistic Regression Analysis (Lee et al., 2002; Gordy, 2000), and Multivariate Adaptive Regression Splines (Friedman, 1991). The most popular technique, logistic regression, has been reported to achieve the best accuracy (Louzada et al., 2012). In the second category, there are the machine learning models, such as Artificial Neural Networks (Zopounidis & Doumpos, 2002), Decision Trees (Dilsha & Kiruthika, 2014), Case-Based Reasoning (Shin & Han, 2001), and Support Vector Machines (Trustorff et al., 2011). Many researchers claim that the artificial intelligence models are more accurate than the statistical credit scoring models, since they can capture non-linear relationships between variables (Lee et al., 2002).
To measure the predictive performance of a credit scoring model, we could use both statistical measures (average accuracy, precision, specificity, etc.) and information measures (entropy, etc.). In general, researchers choose the credit scoring models according to the dataset, the independent variables, and the purpose of the classification (Twala, 2010). Sun and Vasarhelyi (2018), for example, developed a deep neural network to evaluate the risk of credit card delinquency based on the client’s characteristics and spending behaviors, using a dataset of 711,397 credit card holders from a large bank in Brazil. Compared with logistic regression, naive Bayes, traditional artificial neural networks, and decision trees, they found that the proposed deep neural network has better overall predictive performance, with the highest F scores and area under the receiver operating characteristic curve.
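The statistical measures named above can all be derived from a confusion matrix. The following is a minimal sketch, assuming that a non-performing loan is coded as 1 (the positive class) and a performing loan as 0; the function names and the example labels are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (1 = non-performing, 0 = performing)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of failing when a class is absent."""
    return numerator / denominator if denominator else 0.0


def classification_scores(y_true, y_pred):
    """Accuracy, precision, recall (sensitivity), specificity and F1 score."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Hypothetical labels for ten loans, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = classification_scores(y_true, y_pred)
```

Because NPLs are the minority class, accuracy alone can look high even when few bad loans are detected, which is one reason to report precision, recall, and specificity alongside it.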
In many cases, banks are developing their credit scoring models to evaluate the probability of default of potential borrowers. Studying the literature, we found that in most cases, the models used by the banks performed worse than the models suggested by the relevant literature. Louzada et al. (2012) compared the performance of the model applied by a Brazilian bank with different scoring models, and they showed that the Bank’s model performed worst. Mileris and Boguslauskas (2011) suggested that the credit scoring they used had better accuracy than the Lithuanian Bank’s scoring model. Similarly, Abdou et al. (2007) found that the credit scoring model which an Egyptian bank uses scored the lowest accuracy compared to logistic regression and discriminant analysis models.
In our paper, we compare the Bank’s credit scoring model with three other credit scoring techniques (i.e., Logistic Regression, Decision Tree, and Multilayer Perceptron Neural Networks) and we examine their behavior during the worsening recession of the Greek economy.

3. The Evolution of the Greek Economy

Greece’s entrance into the Eurozone in 2002 and the Olympic Games of 2004 led the Greek economy to remarkable growth in the first years of the millennium. During this period, Greek banks pursued an expansionary policy in the field of private sector lending and participation in the financing of public debt. Greek banks also implemented an aggressive lending policy to households and MSEs because of intense competition. However, the global financial crisis and, most importantly, the structural problems of the Greek economy led to a prolonged recession period from September 2008 (Aggelopoulos & Georgopoulos, 2015).
As shown in Table 1, GDP growth stalled after the Olympic Games of 2004, but during 2006 and 2007, the Greek economy recorded remarkable growth. GDP declined for the first time in 2008, by 0.3% (after fifteen years of continuous growth). In 2011, the contraction of GDP deepened to −9.1%, but after the restructuring of sovereign debt in 2012 (Private Sector Involvement), the contraction of GDP was limited to −3.2%. The gross debt as a percentage of GDP increased from 103.1% in 2007 to 172.1% in 2011 and reached 177.7% in 2013, despite the sovereign debt’s restructuring. At the same time, private debt as a percentage of GDP increased from 114.6% in 2007 to 144.4% in 2011 and 147.8% in 2013. As a result, the creditworthiness of the Greek economy was downgraded in October 2009, and the yield spread between Greek and German bonds widened significantly. Given the deteriorating financial condition of the Greek economy, the percentage of NPLs dramatically increased, since both households and enterprises were unable to pay their loan obligations. The percentage of non-performing business loans of Greek banks increased significantly after the outbreak of the financial crisis (in September 2008), while there was a significant further increase from 2010 onwards. At the end of 2013, the NPL ratio had increased dramatically to 39.5%. Generally, the restructuring of NPLs is a significant concern for the ECB and EBA.

4. Research Methodology

In our paper, we set the loans’ performance as the dependent variable at three time points. We observe the performance of loans in August 2010, August 2011, and July 2012. Thereafter, to control the effectiveness of bank management, we check the accuracy of the Bank’s model compared with the performance of three other models (Binomial Logistic Regression, Decision Tree, and Multilayer Perceptron Neural Network). Furthermore, to check the bad luck hypothesis, we observe the evolution of the predicting ability of all models during the escalating recession of the Greek economy.

4.1. Loan Characteristics and Borrowers’ Features

Loan characteristics and borrowers’ features were used as independent variables to build a credit scorecard and measure the probability of default of business loans. In particular, we use qualitative information (such as bank relationship, residence status, etc.) to predict the credit risk of an MSE (Gupta et al., 2015). The independent variables are justified by the “Five Cs of Credit” (DeVaney & Lytton, 1995), which represent five general characteristics of the borrower used to estimate the probability of default (Character of the borrower, Capital, Collateral, Capacity, and Economic Conditions). In our analysis, we use the ten independent variables that were used in the Bank’s model (borrowers’ idiosyncratic features). To make the results of the models comparable with the scorecard applied by the Bank at the time the loans were granted, macroeconomic factors have not been added as independent variables in our research. The definition of each variable is summarized in Table 2.

4.2. Logistic Regression

According to Siddiqi (2006), logistic regression is a common technique used to develop scorecards in most financial industry applications where the predicted variable is categorical. In cases where the predicted variable is binary (good/bad), multiple logistic regression is used. Logistic regression uses a set of predictor characteristics to predict the probability of a specific outcome. The equation for the logit transformation of the probability of an event is shown by the following:
$$\mathrm{Logit}(p_i) = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki}$$
where
  • p = posterior probability of “event”, given inputs;
  • xi = independent variables;
  • β0 = intercept of the regression line;
  • βk = parameters.
Logit transformation is the log of the odds, that is, log(p(event)/p(non-event)), and is used to linearize posterior probability and limit the outcome of estimated probabilities in the model to between 0 and 1. Maximum likelihood is used to estimate parameters β1 to βk. These parameter estimates measure the rate of change of logit for a one-unit change in the independent variable (adjusted for the other inputs), that is, they are the slopes of the regression line between the target and their respective independent variables x1i to xki.
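The relationship between the logit, the odds, and the probability can be checked numerically. The sketch below is illustrative only: the intercept, the two coefficients, and the input values are hypothetical and are not estimates from the study.

```python
import math


def logit(p):
    """Log of the odds p / (1 - p); defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Map a linear score z back to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def event_probability(intercept, coefs, xs):
    """Posterior probability of the event for one observation with inputs xs."""
    z = intercept + sum(b * x for b, x in zip(coefs, xs))
    return inverse_logit(z)


# Hypothetical parameters, chosen only for illustration.
beta0 = -2.0
betas = [0.8, -0.5]

p_base = event_probability(beta0, betas, [1.0, 2.0])     # linear score -2.2
p_shifted = event_probability(beta0, betas, [2.0, 2.0])  # x1 raised by one unit

# A one-unit change in x1 moves the logit by exactly beta_1 = 0.8,
# while the change in probability depends on the starting point.
logit_change = logit(p_shifted) - logit(p_base)
```

This is why the coefficients are read as slopes on the logit scale rather than as direct changes in the probability of default.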
In our paper, we use as independent variables the loan characteristics and the borrowers’ features that are presented in Table 2. Appendix A presents the coefficients of the binary logistic regressions we used. So, the logistic regression of our model is formulated as follows:
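$$Y_i = \beta_0 + \beta_1 \mathrm{BH} + \beta_2 \mathrm{Ag} + \beta_3 \mathrm{BR} + \beta_4 \mathrm{Co} + \beta_5 \mathrm{LTT} + \beta_6 \mathrm{OF} + \beta_7 \mathrm{P} + \beta_8 \mathrm{R} + \beta_9 \mathrm{LT} + \beta_{10} \mathrm{Yr}$$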
where
  • Yi = A dummy variable taking the value 1 if a loan is non-performing or the value 0 if the loan is performing;
  • BH = Firm’s owner’s bad trading past;
  • Ag = Age of the firm’s owner;
  • BR = Relationship between the firm’s owner and the Bank;
  • Co = Type of collateral;
  • LTT = Loan to turnover ratio;
  • OF = Own facilities;
  • P = Mortgage-free property;
  • R = Residence status;
  • LT = Loan type;
  • Yr = Years of operation;
  • β0 = Intercept of the regression line;
  • βk = Parameters.
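As an illustration of how the parameters of such a specification are obtained by maximum likelihood, the following sketch fits a logistic regression by gradient ascent on the log-likelihood. It is not the estimation procedure used in the study: it uses only two of the ten variables (BH and LTT), the data are synthetic, and the "true" coefficients, learning rate, and number of iterations are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood of the labels y given the rows of X."""
    total = 0.0
    for row, label in zip(X, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, row)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total


def fit_logistic(X, y, learning_rate=0.1, epochs=500):
    """Maximize the average log-likelihood by plain gradient ascent."""
    beta = [0.0] * len(X[0])
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for row, label in zip(X, y):
            error = label - sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j, v in enumerate(row):
                grad[j] += error * v
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]
    return beta


# Synthetic loans: columns are [intercept, BH dummy, LTT ratio].
random.seed(0)
assumed_beta = [-2.0, 1.5, 1.0]  # hypothetical, not the study's estimates
X, y = [], []
for _ in range(500):
    row = [1.0, float(random.random() < 0.1), random.uniform(0.0, 1.0)]
    p_default = sigmoid(sum(b * v for b, v in zip(assumed_beta, row)))
    X.append(row)
    y.append(1 if random.random() < p_default else 0)

beta_hat = fit_logistic(X, y)
```

In practice a statistical package would be used and would also report standard errors; the sketch only shows the objective that the fitted coefficients maximize.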

4.3. Classification and Regression Trees

The term “classification” refers to the act of assigning an object to one of the predefined classes within a set of classes. Each object in a dataset has several attributes (X1, …, Xk), where Π(Xi) is the domain of attribute Xi. In addition, each object has an attribute C, which denotes the class to which it belongs, with Π(C) symbolizing the domain of the class attribute C. The categorization involves finding a function f: Π(X1) × … × Π(Xk) → Π(C), which is called the classification model. If we know the values of the attributes X1, …, Xk of an object, but not the value of the attribute C, then we apply the classification model and assign the object to the class f(X1, …, Xk). A decision tree is one of the most popular classification models (Figure 1).
The decision tree is a graph with the classic tree structure, where we distinguish the following: (a) an initial or decision node, which is the root, (b) the inner nodes, which are the edges or branches, and (c) the outer nodes, which are the leaves. At each node (inner or outer) other than the root, a directed edge enters from another node. Each inner node corresponds to a feature used to separate the tree further. At the edges coming out of the root or any inner node, there is a control condition based on the separator characteristic. The process of constructing a decision tree is iterative. It can be briefly described as follows: First, we select an attribute, which refers to the root of the tree, and then construct an edge and a node for each of the distinct values of the attribute. These two steps are repeated continuously until all the attributes are inserted into the nodes of the tree. In the literature, we found different algorithms that were used for constructing a decision tree (ID3, C5.0, and CART; Quinlan, 1986). The classification and regression trees (CART models) are a classification method that has been used successfully in credit scoring (Zopounidis & Doumpos, 2002).
In our paper, we use a CART model. Appendix C presents the information on our CART models. As “initial or decision” nodes, we set the loan characteristics, the categories of each characteristic are defined as “edges” or “branches,” and as “leaves,” we set the performance of each loan (performing or non-performing). We finally chose the minimum number of loan characteristics that maximizes the average accuracy and minimizes the estimated misclassification cost.
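The attribute-selection step of a classification tree can be illustrated with the Gini impurity, the splitting criterion commonly associated with CART. The text does not state which criterion or software settings were used, so the criterion, the function names, and the six example loans below are assumptions made for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 = performing, 1 = non-performing)."""
    n = len(labels)
    if n == 0:
        return 0.0
    share_bad = sum(labels) / n
    return 1.0 - share_bad ** 2 - (1.0 - share_bad) ** 2


def split_impurity(rows, labels, feature):
    """Weighted Gini impurity after splitting on a binary (0/1) feature."""
    left = [y for row, y in zip(rows, labels) if row[feature] == 0]
    right = [y for row, y in zip(rows, labels) if row[feature] == 1]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_split(rows, labels):
    """Pick the feature whose split leaves the lowest weighted impurity."""
    return min(rows[0].keys(), key=lambda f: split_impurity(rows, labels, f))


# Six hypothetical loans described by two dummies from the variable list:
# BH (bad trading past) and OF (own facilities).
rows = [
    {"BH": 1, "OF": 0},
    {"BH": 1, "OF": 1},
    {"BH": 0, "OF": 0},
    {"BH": 0, "OF": 1},
    {"BH": 0, "OF": 1},
    {"BH": 0, "OF": 0},
]
labels = [1, 1, 0, 0, 0, 0]

root_feature = best_split(rows, labels)
```

In this toy sample, splitting on BH separates the two classes perfectly, while splitting on OF leaves the impurity unchanged, so BH would be placed at the root. A full CART implementation repeats this choice recursively and then prunes the tree.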

4.4. Multilayer Perceptron Neural Network

A multilayer perceptron neural network consists of a set of input nodes that make up the input layer, one or more hidden layers made up of computing neurons, and an output layer also made up of computing neurons (Figure 2). The input signal (input pattern) moves forward through the network, i.e., from one layer to the next. Multilayer perceptron neural networks are usually trained with supervised learning rules. An algorithm that is very often used for this purpose is known as the Back-Propagation algorithm and is based on the error-correction learning rule.
Two types of signals are transmitted in the network: (1) Function signals. A function signal is an input signal (stimulus) that starts at the input nodes of the network, propagates forward, from neuron to neuron, and ends at the output neurons of the network. In each neuron of the network through which the signal passes, it is calculated as a function of all the incoming signals and the corresponding weights of the synapses that end up in that particular neuron. (2) Error signals. An error signal starts at the output neurons of the network and propagates backwards from level to level. In each neuron, this signal is calculated from an error-dependent function.
Each neuron, in the input layer (i = 1, …, n), yields the value of an estimator of the vector x. Each neuron in the hidden layer (j = 1, …, q) produces the so-called activation
$$a_j = g\left(\sum_i w_{ij} x_i\right)$$
Neurons in the output layer (k = 1, …, m) behave like the neurons in the hidden layer to produce the network output result:
$$y_k = f\left(\sum_j w_{jk} a_j\right) = f\left(\sum_j w_{jk} \, g\left(\sum_i w_{ij} x_i\right)\right)$$
where wij and wjk are weights.
The logistic (sigmoid) function:
$$f(x) = \frac{1}{1 + \exp(-x)}$$
or, alternatively, the hyperbolic tangent function:
$$f(x) = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)}$$
is commonly used in the network for the functions f and g. The logistic function is appropriate for the output layer in a binary classification problem, such as credit scoring, because the output can then be interpreted as a default probability. A neural network with a single hidden layer is capable of approximating any continuous, bounded, and integrable function.
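The forward pass of a network with one hidden layer can be sketched as follows. The example assumes a hyperbolic tangent activation in the hidden layer and a logistic activation at a single output neuron, with no bias terms, mirroring the formulas in this section; the weights and inputs are hypothetical values, not the fitted network reported in Appendix B.

```python
import math


def logistic(x):
    """Logistic (sigmoid) activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, hidden_weights, output_weights):
    """One forward pass through a single-hidden-layer perceptron.

    hidden_weights[j][i] is the weight from input i to hidden neuron j;
    output_weights[j] is the weight from hidden neuron j to the output.
    """
    activations = [
        math.tanh(sum(w * xi for w, xi in zip(weights_j, x)))
        for weights_j in hidden_weights
    ]
    output = logistic(sum(w * a for w, a in zip(output_weights, activations)))
    return activations, output


# Hypothetical network: two inputs, two hidden neurons, one output.
hidden_weights = [[0.5, -0.3], [-0.2, 0.8]]
output_weights = [1.0, -1.0]

activations, default_probability = forward([1.0, 0.0], hidden_weights, output_weights)
_, neutral_probability = forward([0.0, 0.0], hidden_weights, output_weights)
```

With all inputs at zero, every hidden activation is zero and the output is exactly 0.5, which shows how the logistic output can be read as a probability. Training by back-propagation would adjust the weights to reduce the error between this output and the observed loan status.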
Appendix B presents the information on the Multilayer Perceptron Neural Networks we used. Our model is as follows:
$$Y_i = \sum_j w_{jk} \, g\left(w_{1j} \mathrm{BH} + w_{2j} \mathrm{Ag} + w_{3j} \mathrm{BR} + w_{4j} \mathrm{Co} + w_{5j} \mathrm{LTT} + w_{6j} \mathrm{OF} + w_{7j} \mathrm{P} + w_{8j} \mathrm{R} + w_{9j} \mathrm{LT} + w_{10j} \mathrm{Yr}\right)$$
where
  • Yi = A dummy variable taking the value 1 if a loan is non-performing or the value 0 if the loan is performing;
  • BH = Firm’s owner’s bad trading past;
  • Ag = Age of the firm’s owner;
  • BR = Relationship between the firm’s owner and the Bank;
  • Co = Type of collateral;
  • LTT = Loan to turnover ratio;
  • OF = Own facilities;
  • P = Mortgage-free property;
  • R = Residence status;
  • LT = Loan type;
  • Yr = Years of operation;
  • wij and wjk are weights.

4.5. The Dataset

The dataset contains loans that were granted to micro and small enterprises (MSEs; under the EU definition of SMEs, a firm is micro if it has fewer than 10 employees and an annual turnover of no more than EUR 2 million, and small if it has fewer than 50 employees and an annual turnover of no more than EUR 10 million). The loans’ features and borrowers’ characteristics were collected manually from the Bank’s Management Information System (MIS). The final dataset contained 3294 loans of Greek MSEs that were granted during 2005 (expansion period). We study their performance from August 2010 to July 2012, two years after the onset of the financial crisis and the bankruptcy of Lehman Brothers Holdings Inc. (September 2008). During this period, Greek banks implemented two re-capitalizations to avoid collapse.
To obtain a loan, the prospective borrower fills in their details in an application. In particular, this provides information on the financial situation of the company, its primary demographic characteristics, and its loan and deposit cooperation with the Bank. The applications are registered in the credit scoring model of the Bank by a Small Business officer and receive an evaluation: “Approve”, “Referral”, and “Reject”. Then, the credit director decides whether or not to approve the loan, taking into account the SB officer’s suggestion and the Bank’s guidelines. Therefore, the final financing decision is based on the subjective judgment of credit directors, beyond the categorization of the credit scoring model.
Table 3 shows the evolution of NPLs by rating category based on the Bank’s scorecard. Loans that were granted despite being rejected by the Bank’s scorecard showed the worst performance throughout the studied period, followed by loans designated as “Referral”, while “Approve” loans recorded the lowest percentages of NPLs. The percentage of non-performing loans classified as “Reject” was already very high in August 2010 (23.92%) and increased dramatically to 42.11% in July 2012. However, the recession affected all rating categories. The percentage of non-performing “Approve” loans increased from 4.58% in August 2010 to 21.15% in July 2012. Moreover, the deep recession (and the corresponding rise in NPLs) significantly deteriorated the predictive power of all credit scoring models, thus further weakening the effectiveness of such lending evaluation tools and considerably challenging bank management.
The descriptive statistics are presented in Table 4. The loan’s performance was determined as the dependent variable. Specifically, a loan is classified as non-performing when the delay exceeds 90 days, according to the basic rules of Basel II. In the literature, the behavior of loans is observed either in a particular month or over a period, usually 12 months (Makri et al., 2014; Louzis et al., 2012). We observe the loans’ behavior at three points in time over two years (August 2010, August 2011, and September 2012). The percentage of NPLs rose from 7.7% in August 2010 to 27% in September 2012.
As regards the loan type, the dataset consists of 275 loan applications for business equipment, 428 loan applications for business property, 1884 applications for credit lines and overdrafts, and finally 707 applications related to fixed-term working capital loans. Moreover, 43.3% of the loans are unsecured and 56.7% are covered. Regarding the occurrence of existing cooperation with the Bank, 71.9% of borrowers had a previous relationship with the Bank, while 28.1% were new customers. Also, 41% of the loans were granted to businesses operating up to 5 years in the same field, while the remaining 59% were granted to businesses operating for more than 5 years.
Moreover, 49.6% of businesses had their own facilities, and only 9.3% of the business owners experienced a bad trading past. A total of 82.4% of the borrowers and the guarantors had mortgage-free property; 68.1% of the borrowers lived in their home, 17.9% lived with their parents, and only 14% lived in a rental home.
The mean years of operation of the businesses was 10.3 years, and the mean age of the businesses’ owners was 42.55 years. Finally, regarding the loan-to-turnover ratio, the mean LTT was 31.9% with a standard deviation of 0.640.
Figure 3 shows the evolution of loan arrears between the first period (08/2010–07/2011) and the second period (08/2011–07/2012). In particular, we study the maximum days of delay (bucket) that each loan showed during the first and second periods. Firstly, 88% of loans that were more than 90 days late during the first period were still more than 90 days late during the second period. On the other hand, only 67% of the loans without delays during the first period remained up to date (no delays) during the second period, while the remaining 33% showed some delay. At the same time, 52% of the loans that showed a delay of up to 30 days in the first period deteriorated in the second period, and 23% showed a delay of more than 90 days. The percentage of loans with a delay of more than 90 days during the second period increases further for loans with a maximum delay of up to 60 and up to 90 days, respectively, during the first period: 51% of loans with a maximum delay of up to 60 days and 59% of loans with a maximum delay of up to 90 days during the first period showed a delay of more than 90 days during the second period. We therefore observe that preventing a loan from moving beyond 30 days of delay is critical to reducing the likelihood that it is classified as non-performing (bucket 90+) within the next 12 months.
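The bucket-to-bucket shares discussed above can be read as a row-normalised transition table. The following is a minimal sketch of that calculation, assuming each loan is summarised by its maximum days of delay in each period; the bucket labels, the function names, and the six sample loans are hypothetical and are not taken from the Bank’s data.

```python
from collections import Counter, defaultdict

# Delay buckets in increasing severity (labels are illustrative).
BUCKETS = ["0", "1-30", "31-60", "61-90", "90+"]


def bucket(max_days_late):
    """Map the maximum days of delay in a period to a bucket label."""
    if max_days_late <= 0:
        return "0"
    if max_days_late <= 30:
        return "1-30"
    if max_days_late <= 60:
        return "31-60"
    if max_days_late <= 90:
        return "61-90"
    return "90+"


def transition_shares(first_period, second_period):
    """Share of loans in each second-period bucket, per first-period bucket."""
    counts = defaultdict(Counter)
    for d1, d2 in zip(first_period, second_period):
        counts[bucket(d1)][bucket(d2)] += 1
    shares = {}
    for b1, row in counts.items():
        total = sum(row.values())
        # Counter returns 0 for buckets that never occur in this row.
        shares[b1] = {b2: row[b2] / total for b2 in BUCKETS}
    return shares


# Hypothetical maximum days of delay for six loans in each period.
first = [0, 0, 20, 25, 95, 120]
second = [0, 15, 45, 100, 130, 200]
shares = transition_shares(first, second)
for b1 in BUCKETS:
    if b1 in shares:
        print(b1, shares[b1])
```

Each row sums to one, so the entry for a first-period bucket and the "90+" column is the share of those loans that became non-performing in the second period.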

5. Results

Table 5 presents the correlations between the independent variables. Specifically, the Pearson correlations appear below the diagonal and the Spearman correlations appear above it. We observe that most correlations are statistically significant at the 1% or 5% level. At the same time, the correlation coefficients are lower than 0.5, indicating low to moderate correlations between the independent variables. Therefore, multicollinearity does not appear to be a concern.
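A matrix that combines Pearson coefficients below the diagonal with Spearman coefficients above it can be assembled as in the sketch below. It is illustrative only: the three columns of sample values are hypothetical, the helper names are ours, and the 0.5 cut-off is the rule of thumb mentioned in the text rather than a formal multicollinearity test.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mixed_matrix(columns):
    """Pearson below the diagonal, Spearman above it, 1.0 on the diagonal."""
    names = list(columns)
    m = {a: {b: 1.0 for b in names} for a in names}
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i > j:
                m[a][b] = pearson(columns[a], columns[b])
            elif i < j:
                m[a][b] = spearman(columns[a], columns[b])
    return m


# Hypothetical values for three of the covariates.
data = {
    "Years": [2, 5, 8, 12, 20, 30],
    "Age": [28, 35, 33, 45, 50, 61],
    "LTT": [0.9, 0.5, 0.6, 0.3, 0.2, 0.25],
}
matrix = mixed_matrix(data)
# Pairs at or above the 0.5 rule-of-thumb threshold would merit a closer look.
flagged = [(a, b) for a in data for b in data
           if a != b and abs(matrix[a][b]) >= 0.5]
print(flagged)
```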
To check the robustness of the three proposed models and limit the risk of overfitting, we split the sample into a training sample (approximately 70% of the cases) and a test sample (approximately 30% of the cases). We then compared the efficiency of the models between the two subsamples. As we observe in Table 6, the differences between the training sample and the test sample are less than 2% for all models, which supports the reliability and robustness of the models. We then calculated the Average Accuracy, F1, and Estimated Misclassification Cost of the models, using the results of the test sample, and compared them with the Bank’s credit rating model.
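The split-sample check described above can be sketched as follows. This is a minimal illustration under stated assumptions: the random seed and the function names are ours, the split is an exact 70/30 cut (the paper reports only approximate shares), and the two accuracy figures are derived from the training and test risk estimates of the August 2010 decision tree in Appendix C (0.079 and 0.074).

```python
import random


def split_indices(n, train_share=0.7, seed=0):
    """Shuffle case indices and cut them into training and test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_share)
    return idx[:cut], idx[cut:]


def generalisation_gap(acc_train, acc_test):
    """Absolute difference between training and test accuracy."""
    return abs(acc_train - acc_test)


# 3294 loans, as in the dataset described in the paper.
train_idx, test_idx = split_indices(3294)

# Accuracy = 1 - risk, using the August 2010 decision-tree risk estimates.
gap = generalisation_gap(1 - 0.079, 1 - 0.074)
print(len(train_idx), len(test_idx), round(gap, 3))
```

A gap well below the two-percentage-point threshold used in the text suggests that the model is not merely memorising the training sample.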
The comparative analysis is presented in Table 7. To compare the predictive performance of the credit scoring models studied, we calculate the average accuracy, the F1 metric, and the estimated misclassification cost for the testing sample of the dataset. By average accuracy, we mean the percentage of correct predictions. The F1 metric is the harmonic mean of the percentage of all actual NPLs that are successfully identified (recall) and the percentage of actual NPLs among all predicted NPLs (precision). Finally, by estimated misclassification cost (EMC), we mean the total cost incurred by granting a bad loan and not granting a good loan.
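The three metrics can be computed from a confusion matrix, as in the sketch below. Several choices here are assumptions rather than the paper’s specification: NPL is treated as the positive class, F1 follows its conventional definition as the harmonic mean of precision and recall, EMC is expressed per loan, and the two unit costs (5 for a granted bad loan, 1 for a rejected good loan) are hypothetical placeholders.

```python
def confusion(y_true, y_pred):
    """Confusion counts with NPL (1) as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def emc(tp, tn, fp, fn, cost_bad_granted=5.0, cost_good_rejected=1.0):
    """Per-loan misclassification cost (cost weights are hypothetical).

    A false negative is a bad loan predicted as good (and so granted);
    a false positive is a good loan predicted as bad (and so rejected).
    """
    n = tp + tn + fp + fn
    return (cost_bad_granted * fn + cost_good_rejected * fp) / n


# Ten hypothetical loans: 1 = non-performing, 0 = performing.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
tp, tn, fp, fn = confusion(y_true, y_pred)
print(accuracy(tp, tn, fp, fn), f1_score(tp, fp, fn), emc(tp, tn, fp, fn))
```

Weighting a granted bad loan more heavily than a rejected good loan reflects the usual asymmetry in credit losses, but the actual ratio depends on the lender and is not stated here.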
Our first result is that, over time, the Bank’s credit scoring model performed worse than the three benchmark ones widely used in the literature. We observe that the Bank’s credit scoring model remains the least effective in each case regarding all metrics. The difference in average accuracy is considered statistically significant as it exceeds two percentage points from the average of the three proposed models. Moreover, the difference in EMC is up to 30% from the average of the proposed models in August 2010 (20% in July 2012), confirming the bank management’s responsibilities in granting problem loans.
In August 2010, the decision tree model performed better than the other models. On the other hand, the multilayer perceptron neural network model offered the best performance in August 2011 and July 2012.
The main result of the above analysis is that the performance of the Bank’s credit scoring model over time is worse than the performance of the three credit scoring models used in the literature. It is worth mentioning that the difference in the EMC between the Bank’s credit scoring model and the three models studied increases as the recession in the Greek economy deepens.
Regarding the three credit scoring models that are used in the literature, the average accuracy fell from a mean of over 92% in August 2010 (89% for the Bank’s credit scoring model) to a mean of 74% in July 2012 (72% for the Bank’s credit scoring model). At the same time, the F1 metric fell from a mean of 96% in August 2010 (94% for the Bank’s credit scoring model) to a mean of 84% in July 2012 (83.2% for the Bank’s credit scoring model). Finally, the estimated misclassification cost increased during the recession from a mean of 0.04 in August 2010 (0.057 for the Bank’s credit scoring model) to 0.13 in July 2012 (0.16 for the Bank’s credit scoring model).
The marked reduction in the accuracy of all models used in our paper as the recession of the Greek economy escalated confirms that the external environment can significantly affect the performance of a loan. In other words, the probability of default depends not only on the borrower’s features but also on the financial status of the domestic economy, which confirms the bad luck hypothesis.

6. Conclusions and Limitations

From the above analysis, we found that the poor effectiveness of bank management is related to the increase in NPLs in turbulent periods. More specifically, we found that the studied bank applied a low-performance prediction model compared with the three prediction models proposed by the relevant literature, namely the multilayer perceptron neural network, the binomial logistic regression, and the decision tree. This finding is in line with studies on emerging economies (such as those of Louzada et al., 2012, for Brazil; Mileris & Boguslauskas, 2011, for Lithuania; Abdou et al., 2007, for Egypt), which reveal that banks typically use relatively ineffective credit scoring models with questionable predictive ability. Furthermore, our analysis confirmed the ambiguous results of other studies (Twala, 2010; Wang et al., 2012; Mileris & Boguslauskas, 2011) as regards the relative predictive performance of the aforementioned models.
Our study has further important implications regarding bad decision-making in bank lending. In particular, we found that bad bank management worsened loan portfolio performance owing to its failure to properly assess the outcomes of its credit scoring model. This conclusion is consistent with the corresponding conclusions of the existing literature (Berger & DeYoung, 1997; Partovi & Matousek, 2019; Louzis et al., 2012; Williams, 2004), which attribute the increase in NPLs to ineffective bank management. So, management’s decision to approve loans that were classified as ‘Reject’ (despite the contrary outcome of the credit scoring model) was wrong and detrimental to the Bank’s results, as these borrowers were the first to default on the agreed payments from the beginning of the recession; consequently, this irrational management policy undermined regular loan repayment over time. This situation highlights other aspects of the problem, such as the pursuit of personal goals by management at the time of loan approval (moral hazard). However, our study does not focus on this dimension.
This study’s results strongly indicate that economic fluctuations inevitably lead to bad management decisions over time. More specifically, such decisions, driven by a more relaxed credit policy during the expansion (which benefited from low interest rates), dramatically worsened NPLs during the subsequent recession. Consequently, we found that external adverse effects not only revealed bad management decisions (which were taken in a previous optimistic era) but also strengthened their adverse effects throughout the recession. This result is expected, as in periods of recession the liquidity of businesses decreases, banks reduce financing to businesses, and obligations become difficult to repay. Therefore, businesses with low creditworthiness are unable to obtain additional financing to cover their obligations. At the same time, their revenues decrease, resulting in losses.
Our study concludes that bank management is responsible for both the granting of low-quality loans and the respective application of inefficient evaluation techniques that become even more ineffective during the recession. The inefficiency of the Bank’s scoring model is consistent with the findings of the existing literature. Based on published studies (Berger & DeYoung, 1997; Partovi & Matousek, 2019; Louzis et al., 2012; Williams, 2004), the credit scoring models applied by banks are less effective than the models recommended in the literature. It should be mentioned that banks even ignore the evaluation outcome of their prediction models, further undermining their professional management style. The above study’s findings strongly confirm the responsibilities of poor management.
On the other hand, we found a rapid accumulation of NPLs as the recession deepened, underlining the crucial role of external adverse effects as an additional factor in the formation of new NPLs. This conclusion is consistent with the findings of the existing literature (Quagliariello, 2007; Konstantakis et al., 2016; Makri et al., 2014; Tmava & Spahiu, 2025; Undji & Sheefeni, 2025). The predictive performance of all credit scoring models declined markedly during the prolonged recession period.
Consequently, our study has important implications for managers and policymakers. A central message is that preventing the creation of new NPLs via control mechanisms constitutes a major challenge for bank management. From this point of view, loans with a delay of more than 90 days in the first year indicated a further deterioration in the second year. It is therefore an urgent priority for banks to prevent loans from moving into delays of more than 90 days, since in times of recession new attractive loans are limited and cannot cover the performance damage from rising NPLs (contrary to periods of growth, in which losses from NPLs are easily outweighed by new funding).
This study presents certain limitations that should be mentioned. Specifically, our study focuses on the evolution of the predictive ability of credit scoring models and the effectiveness of bank management during the recession of the Greek economy. The loan data was collected manually from the Bank’s information system. Therefore, there is a possibility of incorrect entries.
For this research, only loan and borrower characteristics were used, and not macroeconomic data. In a subsequent phase, the models could be enriched with macroeconomic data in order to examine the increase in predictive ability during recessionary periods. In this case, there is a risk of a significant increase in rejections during periods of growth, which could lead to a decrease in market share.
At the same time, the collection of data from only one bank is a limitation of this research. However, the conclusions of the research can be generalized given that the banking market in Greece is oligopolistic and the bank under study is one of the four systemic banks operating in Greece. Finally, the time lag between the implementation of the banking model and the development of the three proposed models creates limitations regarding the evaluation of the efficiency of banking management.

Author Contributions

Conceptualization, V.G.; methodology, V.G. and S.K.; software, V.G. and S.K.; validation, V.G. and S.K.; formal analysis, V.G.; investigation, V.G. and S.K.; writing—original draft preparation, V.G. and S.K.; writing—review and editing, V.G. and S.K.; supervision, V.G. and S.K.; project administration, V.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is unavailable due to privacy restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Binary Logistic Regression

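Logistic Regression August 2010

| Variable | B | S.E. | Wald | df | Sig. | Exp(B) | 95% C.I. for EXP(B), Lower | 95% C.I. for EXP(B), Upper |
|---|---|---|---|---|---|---|---|---|
| Years | −0.031 | 0.013 | 5.888 | 1 | 0.015 | 0.969 | 0.945 | 0.994 |
| Age | 0.025 | 0.010 | 5.979 | 1 | 0.014 | 1.026 | 1.005 | 1.047 |
| Adverse | 0.628 | 0.256 | 6.021 | 1 | 0.014 | 1.874 | 1.135 | 3.095 |
| Property | −0.350 | 0.204 | 2.950 | 1 | 0.086 | 0.704 | 0.472 | 1.051 |
| LTT | 0.008 | 0.139 | 0.003 | 1 | 0.956 | 1.008 | 0.768 | 1.323 |
| Type | | | 6.862 | 3 | 0.076 | | | |
| Type(1) | −0.667 | 0.411 | 2.642 | 1 | 0.104 | 0.513 | 0.230 | 1.147 |
| Type(2) | 0.447 | 0.332 | 1.806 | 1 | 0.179 | 1.563 | 0.815 | 3.000 |
| Type(3) | −0.236 | 0.215 | 1.208 | 1 | 0.272 | 0.790 | 0.518 | 1.203 |
| Owfac | −0.646 | 0.178 | 13.227 | 1 | 0.000 | 0.524 | 0.370 | 0.742 |
| Residence | | | 0.743 | 2 | 0.690 | | | |
| Residence(1) | −0.196 | 0.231 | 0.724 | 1 | 0.395 | 0.822 | 0.523 | 1.291 |
| Residence(2) | −0.111 | 0.273 | 0.167 | 1 | 0.683 | 0.895 | 0.524 | 1.527 |
| Bankrel | | | 13.293 | 3 | 0.004 | | | |
| Bankrel(1) | −0.482 | 0.244 | 3.908 | 1 | 0.048 | 0.618 | 0.383 | 0.996 |
| Bankrel(2) | −1.130 | 0.423 | 7.155 | 1 | 0.007 | 0.323 | 0.141 | 0.739 |
| Bankrel(3) | 0.096 | 0.190 | 0.258 | 1 | 0.612 | 1.101 | 0.759 | 1.597 |
| Collat | | | 17.044 | 3 | 0.001 | | | |
| Collat(1) | −1.051 | 0.617 | 2.902 | 1 | 0.088 | 0.350 | 0.104 | 1.171 |
| Collat(2) | −0.060 | 0.250 | 0.057 | 1 | 0.811 | 0.942 | 0.577 | 1.538 |
| Collat(3) | −0.913 | 0.232 | 15.528 | 1 | 0.000 | 0.401 | 0.255 | 0.632 |
| Constant | −2.027 | 0.468 | 18.742 | 1 | 0.000 | 0.132 | | |

Model Summary

| Step | −2 Log likelihood | Cox and Snell R Square | Nagelkerke R Square |
|---|---|---|---|
| 1 | 1153.137 a | 0.037 | 0.088 |

a Estimation terminated at iteration number 6 because parameter estimates changed by less than 0.001.

Hosmer and Lemeshow Test

| Step | Chi-square | df | Sig. |
|---|---|---|---|
| 1 | 25.776 | 8 | 0.001 |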
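Logistic Regression August 2011

| Variable | B | S.E. | Wald | df | Sig. | Exp(B) | 95% C.I. for EXP(B), Lower | 95% C.I. for EXP(B), Upper |
|---|---|---|---|---|---|---|---|---|
| Years | −0.015 | 0.010 | 2.575 | 1 | 0.109 | 0.985 | 0.966 | 1.003 |
| Age | 0.016 | 0.008 | 4.127 | 1 | 0.042 | 1.017 | 1.001 | 1.033 |
| Adverse | 0.629 | 0.195 | 10.407 | 1 | 0.001 | 1.876 | 1.280 | 2.749 |
| Property | −0.951 | 0.153 | 38.871 | 1 | 0.000 | 0.386 | 0.286 | 0.521 |
| LTT | 0.151 | 0.103 | 2.149 | 1 | 0.143 | 1.163 | 0.950 | 1.423 |
| Type | | | 0.960 | 3 | 0.811 | | | |
| Type(1) | −0.066 | 0.258 | 0.066 | 1 | 0.798 | 0.936 | 0.565 | 1.551 |
| Type(2) | 0.117 | 0.238 | 0.243 | 1 | 0.622 | 1.125 | 0.705 | 1.793 |
| Type(3) | −0.097 | 0.164 | 0.352 | 1 | 0.553 | 0.907 | 0.658 | 1.251 |
| Owfac | −0.365 | 0.132 | 7.683 | 1 | 0.006 | 0.694 | 0.536 | 0.899 |
| Residence | | | 14.949 | 2 | 0.001 | | | |
| Residence(1) | −0.301 | 0.176 | 2.922 | 1 | 0.087 | 0.740 | 0.524 | 1.045 |
| Residence(2) | 0.332 | 0.199 | 2.792 | 1 | 0.095 | 1.394 | 0.944 | 2.059 |
| Bankrel | | | 30.985 | 3 | 0.000 | | | |
| Bankrel(1) | −0.480 | 0.183 | 6.916 | 1 | 0.009 | 0.619 | 0.433 | 0.885 |
| Bankrel(2) | −1.124 | 0.300 | 14.012 | 1 | 0.000 | 0.325 | 0.180 | 0.585 |
| Bankrel(3) | 0.224 | 0.147 | 2.327 | 1 | 0.127 | 1.251 | 0.938 | 1.669 |
| Collat | | | 1.738 | 3 | 0.629 | | | |
| Collat(1) | −0.088 | 0.343 | 0.065 | 1 | 0.799 | 0.916 | 0.468 | 1.795 |
| Collat(2) | −0.228 | 0.206 | 1.221 | 1 | 0.269 | 0.796 | 0.531 | 1.193 |
| Collat(3) | −0.140 | 0.158 | 0.784 | 1 | 0.376 | 0.869 | 0.637 | 1.186 |
| Constant | −1.167 | 0.362 | 10.357 | 1 | 0.001 | 0.311 | | |

Model Summary

| Step | −2 Log likelihood | Cox and Snell R Square | Nagelkerke R Square |
|---|---|---|---|
| 1 | 1773.509 a | 0.055 | 0.098 |

a Estimation terminated at iteration number 5 because parameter estimates changed by less than 0.001.

Hosmer and Lemeshow Test

| Step | Chi-square | df | Sig. |
|---|---|---|---|
| 1 | 38.183 | 8 | 0.000 |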
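Logistic Regression July 2012

| Variable | B | S.E. | Wald | df | Sig. | Exp(B) | 95% C.I. for EXP(B), Lower | 95% C.I. for EXP(B), Upper |
|---|---|---|---|---|---|---|---|---|
| Years | −0.035 | 0.008 | 20.196 | 1 | 0.000 | 0.966 | 0.951 | 0.981 |
| Age | 0.025 | 0.007 | 14.472 | 1 | 0.000 | 1.025 | 1.012 | 1.038 |
| Adverse | 0.352 | 0.168 | 4.394 | 1 | 0.036 | 1.422 | 1.023 | 1.976 |
| Property | −0.636 | 0.129 | 24.327 | 1 | 0.000 | 0.529 | 0.411 | 0.681 |
| LTT | 0.183 | 0.083 | 4.841 | 1 | 0.028 | 1.201 | 1.020 | 1.413 |
| Type | | | 11.725 | 3 | 0.008 | | | |
| Type(1) | −0.258 | 0.220 | 1.376 | 1 | 0.241 | 0.773 | 0.502 | 1.189 |
| Type(2) | 0.219 | 0.190 | 1.336 | 1 | 0.248 | 1.245 | 0.859 | 1.806 |
| Type(3) | 0.319 | 0.134 | 5.645 | 1 | 0.018 | 1.376 | 1.058 | 1.791 |
| Owfac | −0.242 | 0.103 | 5.526 | 1 | 0.019 | 0.785 | 0.642 | 0.961 |
| Residence | | | 8.770 | 2 | 0.012 | | | |
| Residence(1) | −0.370 | 0.140 | 6.979 | 1 | 0.008 | 0.691 | 0.525 | 0.909 |
| Residence(2) | −0.081 | 0.168 | 0.235 | 1 | 0.628 | 0.922 | 0.663 | 1.281 |
| Bankrel | | | 20.042 | 3 | 0.000 | | | |
| Bankrel(1) | 0.015 | 0.141 | 0.011 | 1 | 0.916 | 1.015 | 0.769 | 1.339 |
| Bankrel(2) | −0.332 | 0.199 | 2.779 | 1 | 0.095 | 0.717 | 0.486 | 1.060 |
| Bankrel(3) | 0.383 | 0.123 | 9.777 | 1 | 0.002 | 1.467 | 1.154 | 1.865 |
| Collat | | | 7.663 | 3 | 0.054 | | | |
| Collat(1) | −0.758 | 0.286 | 7.019 | 1 | 0.008 | 0.469 | 0.267 | 0.821 |
| Collat(2) | −0.158 | 0.163 | 0.942 | 1 | 0.332 | 0.854 | 0.620 | 1.175 |
| Collat(3) | −0.129 | 0.124 | 1.071 | 1 | 0.301 | 0.879 | 0.689 | 1.122 |
| Constant | −1.151 | 0.296 | 15.161 | 1 | 0.000 | 0.316 | | |

Model Summary

| Step | −2 Log likelihood | Cox and Snell R Square | Nagelkerke R Square |
|---|---|---|---|
| 1 | 2534.374 a | 0.059 | 0.086 |

a Estimation terminated at iteration number 5 because parameter estimates changed by less than 0.001.

Hosmer and Lemeshow Test

| Step | Chi-square | df | Sig. |
|---|---|---|---|
| 1 | 13.988 | 8 | 0.082 |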
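The Exp(B), confidence-interval, and Wald columns in the logistic regression tables follow from B and S.E. by standard formulas: Exp(B) is the exponential of B, the 95% interval is the exponential of B plus or minus about 1.96 standard errors, and the Wald statistic is the squared ratio of B to S.E. The sketch below applies them to the Years row of the August 2010 table (B = −0.031, S.E. = 0.013). The function name is ours, and because the published B and S.E. are rounded to three decimals, the recomputed Wald value differs slightly from the tabulated 5.888.

```python
from math import exp

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def odds_ratio_row(b, se):
    """Wald statistic, odds ratio and 95% CI implied by B and its S.E."""
    return {
        "wald": (b / se) ** 2,
        "exp_b": exp(b),
        "lower": exp(b - Z_95 * se),
        "upper": exp(b + Z_95 * se),
    }


# Years row, Logistic Regression August 2010.
row = odds_ratio_row(-0.031, 0.013)
for name, value in row.items():
    print(name, round(value, 3))
```

Read this way, an odds ratio of about 0.969 for Years means that each additional year of operation is associated with roughly 3% lower odds of the loan being non-performing, holding the other variables fixed.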

Appendix B. Neural Networks

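Multilayer Perceptron NN August 2010

| Sample | Measure | Value |
|---|---|---|
| Training | Cross Entropy Error | 557,311 |
| Training | Percent Incorrect Predictions | 7.4% |
| Training | Stopping Rule Used | 1 consecutive step(s) with no decrease in error a |
| Training | Training Time | 0:00:00.15 |
| Testing | Cross Entropy Error | 278,049 |
| Testing | Percent Incorrect Predictions | 8.5% |

Dependent Variable: August 2010
a Error computations are based on the testing sample.

Network Information

| Layer | Item | Value |
|---|---|---|
| Input Layer | Factors | 1 Bankrel; 2 Residence; 3 Collat; 4 Type |
| Input Layer | Covariates | 1 Years; 2 Age; 3 LTT; 4 Owfac; 5 Adverse; 6 Property |
| Input Layer | Number of Units a | 21 |
| Input Layer | Rescaling Method for Covariates | Standardized |
| Hidden Layer(s) | Number of Hidden Layers | 1 |
| Hidden Layer(s) | Number of Units in Hidden Layer 1 a | 8 |
| Hidden Layer(s) | Activation Function | Hyperbolic tangent |
| Output Layer | Dependent Variables | 1 Aug2010 |
| Output Layer | Number of Units | 2 |
| Output Layer | Activation Function | Softmax |
| Output Layer | Error Function | Cross-entropy |

a Excluding the bias unit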
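Multilayer Perceptron NN August 2011

| Sample | Measure | Value |
|---|---|---|
| Training | Cross Entropy Error | 946,002 |
| Training | Percent Incorrect Predictions | 15.2% |
| Training | Stopping Rule Used | 1 consecutive step(s) with no decrease in error a |
| Training | Training Time | 0:00:00.15 |
| Testing | Cross Entropy Error | 373,873 |
| Testing | Percent Incorrect Predictions | 13.3% |

Dependent Variable: August 2011
a Error computations are based on the testing sample.

Network Information

| Layer | Item | Value |
|---|---|---|
| Input Layer | Factors | 1 Bankrel; 2 Residence; 3 Collat; 4 Type |
| Input Layer | Covariates | 1 Years; 2 Age; 3 LTT; 4 Owfac; 5 Adverse; 6 Property |
| Input Layer | Number of Units a | 21 |
| Input Layer | Rescaling Method for Covariates | Standardized |
| Hidden Layer(s) | Number of Hidden Layers | 1 |
| Hidden Layer(s) | Number of Units in Hidden Layer 1 a | 6 |
| Hidden Layer(s) | Activation Function | Hyperbolic tangent |
| Output Layer | Dependent Variables | 1 August 2011 |
| Output Layer | Number of Units | 2 |
| Output Layer | Activation Function | Softmax |
| Output Layer | Error Function | Cross-entropy |

a Excluding the bias unit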
Multilayer Perceptron NN July 2012

| Sample | Measure | Value |
|---|---|---|
| Training | Cross Entropy Error | 1,133,768 |
| Training | Percent Incorrect Predictions | 24.2% |
| Training | Stopping Rule Used | 1 consecutive step(s) with no decrease in error a |
| Training | Training Time | 0:00:00.19 |
| Testing | Cross Entropy Error | 529,041 |
| Testing | Percent Incorrect Predictions | 25.0% |

Dependent Variable: July 2012
a Error computations are based on the testing sample.

Network Information

| Layer | Item | Value |
|---|---|---|
| Input Layer | Factors | 1 Bankrel; 2 Residence; 3 Collat; 4 Type |
| Input Layer | Covariates | 1 Years; 2 Age; 3 LTT; 4 Owfac; 5 Adverse; 6 Property |
| Input Layer | Number of Units a | 21 |
| Input Layer | Rescaling Method for Covariates | Standardized |
| Hidden Layer(s) | Number of Hidden Layers | 1 |
| Hidden Layer(s) | Number of Units in Hidden Layer 1 a | 8 |
| Hidden Layer(s) | Activation Function | Hyperbolic tangent |
| Output Layer | Dependent Variables | 1 July2012 |
| Output Layer | Number of Units | 2 |
| Output Layer | Activation Function | Softmax |
| Output Layer | Error Function | Cross-entropy |

a Excluding the bias unit
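The 21 input units reported in the network information appear consistent with one indicator unit per category of the four factors (4 + 3 + 4 + 4 = 15, using the category counts in Table 2) plus the six covariates. The sketch below checks that arithmetic and runs one forward pass through a network with the reported shape: a hyperbolic tangent hidden layer of 8 units and a two-unit softmax output. The weights are random placeholders and the input vector is hypothetical; nothing here reproduces the fitted networks.

```python
import math
import random

# Category counts per factor, taken from Table 2.
FACTOR_LEVELS = {"Bankrel": 4, "Residence": 3, "Collat": 4, "Type": 4}
COVARIATES = ["Years", "Age", "LTT", "Owfac", "Adverse", "Property"]


def n_input_units():
    """One unit per factor category plus one per covariate."""
    return sum(FACTOR_LEVELS.values()) + len(COVARIATES)


def one_hot(level, n_levels):
    """Indicator vector for a 1-based category level."""
    return [1.0 if i == level - 1 else 0.0 for i in range(n_levels)]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """tanh hidden layer followed by a softmax output layer."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w_out, b_out)]
    return softmax(logits)


rng = random.Random(0)
n_in, n_hidden, n_out = n_input_units(), 8, 2
# Placeholder weights; a fitted model would learn these from the data.
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
            for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
         for _ in range(n_out)]
b_out = [0.0] * n_out

# Hypothetical borrower: factor levels, then covariates on a standardised scale.
x = (one_hot(4, 4) + one_hot(3, 3) + one_hot(1, 4) + one_hot(4, 4)
     + [0.2, -0.1, 0.5, 1.0, 0.0, 1.0])
probs = forward(x, w_hidden, b_hidden, w_out, b_out)
print(n_in, probs)
```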

Appendix C. Classification and Regression Trees

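Decision Tree August 2010

| Section | Item | Value |
|---|---|---|
| Specifications | Growing Method | CHAID |
| Specifications | Dependent Variable | Aug2010 |
| Specifications | Independent Variables | Type, Years, Owfac, Bankrel, Residence, Age, Adverse, Collat, Property, LTT |
| Specifications | Validation | Split Sample: Training 2280, Test 1014 |
| Specifications | Maximum Tree Depth | 3 |
| Specifications | Minimum Cases in Parent Node | 100 |
| Specifications | Minimum Cases in Child Node | 50 |
| Results | Independent Variables Included | Collat, Owfac, Property, Age, Years, LTT, Type |
| Results | Number of Nodes | 20 |
| Results | Number of Terminal Nodes | 13 |
| Results | Depth | 3 |

Risk

| Sample | Estimate | Std. Error |
|---|---|---|
| Training | 0.079 | 0.006 |
| Test | 0.074 | 0.008 |

Growing Method: CHAID
Dependent Variable: August 2010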
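Decision Tree August 2011

| Section | Item | Value |
|---|---|---|
| Specifications | Growing Method | CHAID |
| Specifications | Dependent Variable | Aug2011 |
| Specifications | Independent Variables | Type, Years, Owfac, Bankrel, Residence, Age, Adverse, Collat, Property, LTT |
| Specifications | Validation | Split Sample |
| Specifications | Maximum Tree Depth | 3 |
| Specifications | Minimum Cases in Parent Node | 100 |
| Specifications | Minimum Cases in Child Node | 50 |
| Results | Independent Variables Included | Property, Residence, Owfac, Age, Years, Bankrel |
| Results | Number of Nodes | 15 |
| Results | Number of Terminal Nodes | 9 |
| Results | Depth | 3 |

Risk

| Sample | Estimate | Std. Error |
|---|---|---|
| Training | 0.149 | 0.007 |
| Test | 0.141 | 0.011 |

Growing Method: CHAID
Dependent Variable: August 2011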
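Decision Tree July 2012

| Section | Item | Value |
|---|---|---|
| Specifications | Growing Method | CHAID |
| Specifications | Dependent Variable | July2012 |
| Specifications | Independent Variables | Type, Years, Owfac, Bankrel, Residence, Age, Adverse, Collat, Property, LTT |
| Specifications | Validation | Split Sample |
| Specifications | Maximum Tree Depth | 3 |
| Specifications | Minimum Cases in Parent Node | 100 |
| Specifications | Minimum Cases in Child Node | 50 |
| Results | Independent Variables Included | Years, Owfac, Residence, Property, Type, LTT, Collat, Age |
| Results | Number of Nodes | 31 |
| Results | Number of Terminal Nodes | 18 |
| Results | Depth | 3 |

Risk

| Sample | Estimate | Std. Error |
|---|---|---|
| Training | 0.256 | 0.009 |
| Test | 0.270 | 0.014 |

Growing Method: CHAID
Dependent Variable: July 2012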
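The risk estimate in the tables above is the share of misclassified cases, and its standard error appears consistent with the usual binomial formula, the square root of risk × (1 − risk) / n. The sketch below checks this against the August 2010 figures, where the training and test sample sizes (2280 and 1014) are reported; the binomial formula is our assumption about how the software computed the column, and the function name is ours.

```python
from math import sqrt


def risk_std_error(risk, n):
    """Binomial standard error of a misclassification rate over n cases."""
    return sqrt(risk * (1 - risk) / n)


# Decision Tree August 2010: risk estimates and split-sample sizes.
train_se = risk_std_error(0.079, 2280)
test_se = risk_std_error(0.074, 1014)
print(round(train_se, 3), round(test_se, 3))
```

Because the test sample is smaller, its standard error is larger, which is why a small gap between training and test risk should be read against these margins.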

References

  1. Abdou, H., El-Masry, A., & Pointon, J. (2007). On the applicability of credit scoring models in Egyptian Banks. Banks and Bank Systems, 2, 4–20.
  2. Aggelopoulos, E., & Georgopoulos, A. (2015). The determinants of shareholder value in retail banking during crisis years: The case of Greece. Multinational Finance Journal, 19, 109–147.
  3. Akuoko-Kanadu, E., & Mahmud, A. (2025). Corruption, economic growth, and non-performing loans in Sub-Saharan Africa: An empirical analysis (2011–2019). Journal of Quantitative Economics, 23, 233–252.
  4. Altman, E., & Saunders, A. (1998). Credit risk measurement: Developments over the last 20 years. Journal of Banking and Finance, 21, 1721–1742.
  5. Anastasiou, D., Louri, H., & Tsionas, M. (2016). Determinants of non-performing loans: Evidence from Euro-area countries. Finance Research Letters, 18, 116–119.
  6. Andreou, P. C., Philip, D., & Robejsek, P. (2016). Bank liquidity creation and risk-taking: Does managerial ability matter? Journal of Business Finance and Accounting, 43(1–2), 226–259.
  7. Belaid, F., Boussaada, R., & Belguith, H. (2017). Bank-firm relationship and credit risk: An analysis on Tunisian firms. Research in International Business and Finance, 42, 532–543.
  8. Berger, A., & DeYoung, R. (1997). Problem loans and cost efficiency in commercial banks. Journal of Banking and Finance, 21, 849–870.
  9. Christodoulou-Volos, C. (2025). Determinants of non-performing loans in Cyprus: An empirical analysis of macroeconomic and borrower-specific factors. International Journal of Economics and Financial Issues, 15(1), 190–201.
  10. Crook, J., Edelman, D., & Lyn, T. (2007). Recent developments in consumer credit risk assessment. European Journal of Operational Research, 183, 1447–1465.
  11. DeVaney, S., & Lytton, R. (1995). Household insolvency: A review of household debt repayment, delinquency, and bankruptcy. Financial Services Review, 4(2), 137–156.
  12. Dilsha, M., & Kiruthika. (2014). A hybrid ensemble model for credit scoring of microfinance data. International Journal of Mathematics and Computer Applications Research, 4(6), 61–68.
  13. Duarte, F., Gama, M., Paula, A., & Esperança, J. P. (2016). The role of collateral in the credit acquisition process: Evidence from SME lending. Journal of Business Finance and Accounting, 43(5–6), 693–728.
  14. European Banking Authority [EBA]. (2016). EBA report on SMEs and SME supporting factors. European Banking Authority. Available online: https://www.eba.europa.eu/publications-and-media/press-releases/eba-publishes-report-smes-and-sme-supporting-factor (accessed on 10 May 2025).
  15. Friedman, J. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19, 1–67.
  16. Gordy, M. (2000). A comparative anatomy of credit risk models. Journal of Banking and Finance, 24, 119–149.
  17. Gupta, J., Gregoriou, A., & Healy, J. (2015). Forecasting bankruptcy for SMEs using hazard function: To what extent does size matter? Review of Quantitative Finance and Accounting, 45, 845–869.
  18. Hand, D. J., & Henley, W. E. (1997). Statistical classification methods in consumer credit scoring: A review. Journal of the Royal Statistical Society. Series A (Statistics in Society), 160(3), 523–541.
  19. Karagiannis, G., & Kourtzidis, S. (2025). On modelling non-performing loans in bank efficiency analysis. International Journal of Finance and Economics, 30(2), 1742–1757.
  20. Konstantakis, K., Michaelides, P., & Vouldis, A. (2016). Non-performing loans in a crisis economy: Long-run equilibrium analysis with a real time VEC model for Greece (2001–2015). Physica A, 451, 149–161.
  21. Lee, T., Chiu, C., Lu, C., & Chen, I. (2002). Credit scoring using the hybrid neural discriminant technique. Expert Systems with Applications, 23, 245–254.
  22. Louzada, F., Ferreira-Silva, P., & Diniz, C. (2012). On the impact of disproportional samples in credit scoring models: An application to a Brazilian Bank data. Expert Systems with Applications, 39, 8071–8078.
  23. Louzis, D., Vouldis, A., & Metaxas, V. (2012). Macroeconomic and bank specific determinants of non-performing loans in Greece: A comparative study of mortgage, business and consumer loan portfolios. Journal of Banking and Finance, 36, 1012–1027.
  24. Makri, V., Tsagkanos, A., & Bellas, A. (2014). Determinants of non-performing loans: The case of Eurozone. Panoeconomicus, 2, 193–206.
  25. Manz, F. (2019). Determinants of non-performing loans: What do we know? A systematic review and avenues for future research. Management Review Quarterly, 69, 351–389.
  26. Maztoul, S. (2025). Banks’ Sustainability and Financial Performance: The role of credit risk. Financial and Credit Activity: Problems of Theory and Practice, 2(61), 73–86.
  27. Mileris, R., & Boguslauskas, V. (2011). Credit risk estimation model development process: Main steps and model improvement. Engineering Economics, 22, 126–133.
  28. Papadamou, S., & Pitsilkas, K. (2025). Policy uncertainty and non-performing loans in Greece. American Journal of Economics and Sociology, 84, 231–252.
  29. Partovi, E., & Matousek, R. (2019). Bank efficiency and non-performing loans: Evidence from Turkey. Research in International Business and Finance, 48, 287–309.
  30. Quagliariello, M. (2007). Banks’ riskiness over the business cycle: A panel analysis on Italian intermediaries. Applied Financial Economics, 17, 119–138.
  31. Quinlan, J. (1986). Induction of decision trees. Machine Learning, 1, 81–106.
  32. Shin, K., & Han, I. (2001). A case-based approach using inductive indexing for corporate bond rating. Decision Support Systems, 32, 41–52.
  33. Siddiqi, N. (2006). Credit risk scorecard. John Wiley & Sons. ISBN 978-0-471-75451-0.
  34. Sun, T., & Vasarhelyi, M. (2018). Predicting credit card delinquencies: An application of deep neural networks. Intelligent Systems in Accounting, Finance and Management, 25, 174–189.
  35. Tmava, Q., & Spahiu, M. (2025). Banking and macroeconomic drivers effects on non-performing loans: Insights from Western Balkan countries. Financial and Credit Activity: Problems of Theory and Practice, 2(61), 40–53.
  36. Trustorff, J., Konrad, P., & Leker, J. (2011). Credit risk prediction using support vector machines. Review of Quantitative Finance and Accounting, 36, 565–581.
  37. Twala, B. (2010). Multiple classifier application to credit risk assessment. Expert Systems with Applications, 37, 3326–3336.
  38. Undji, J. V., & Sheefeni, P. S. J. (2025). Determinants of non-performing loans in Namibia’s banking sector using composite indices. African Journal of Business and Economic Research, 20(1), 391–420.
  39. Wang, G., Ma, J., Huang, L., & Xu, K. (2012). Two credit scoring models based on dual strategy ensemble trees. Knowledge-Based Systems, 26, 61–68.
  40. Williams, J. (2004). Determining management behaviour in European banking. Journal of Banking and Finance, 28(10), 2427–2460.
  41. Zopounidis, C., & Doumpos, M. (2002). Multicriteria classification and sorting methods: A literature review. European Journal of Operational Research, 138, 229–246.
Figure 1. A decision tree case.
Figure 2. Artificial neural network—multilayer perceptron.
Figure 3. Temporal scaling NPLs per bucket for the 3294 loans between the first and the second year of the studied period. Notes: Figure 3 presents an analysis of delays according to the days a loan is overdue. The vertical axis measures the number of days of overdue debt during the first period (August 2010 to July 2011). The horizontal axis measures the number of days of overdue loans during the second period (August 2011 to July 2012). The longer the delay of a loan during the first twelve-month period, the greater the possibility of displaying the same or more days of delay in the second 12-month period.
Table 1. Statistics of the Greek economy and the NPL ratio of the Greek banking sector over the period 2003 to 2013.
| Variable | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Real GDP growth | 5.8% | 5.1% | 0.6% | 5.7% | 3.3% | −0.3% | −4.3% | −5.5% | −9.1% | −7.3% | −3.2% |
| Gross debt (% of GDP) | 101.5% | 102.9% | 107.4% | 103.6% | 103.1% | 109.4% | 126.7% | 146.3% | 172.1% | 159.6% | 177.7% |
| Private debt (% of GDP) | 82.7% | 87.1% | 98.9% | 104.7% | 114.6% | 126.3% | 130.1% | 141.2% | 144.4% | 148.5% | 147.8% |
| Non-performing business loans ratio of the banking sector | 8.0% | 7.9% | 7.2% | 6.2% | 5.2% | 5.7% | 9.5% | 14.1% | 21.5% | 31.3% | 39.5% |

Notes: The table presents the main statistics of the Greek economy and the Greek banking sector during the period from 2003 to 2013 (IMF, Global Debt Database, World Economic Outlook Database April 2020/Bank of Greece).
Table 2. Loan characteristics and borrowers’ features.
| Number | Independent Variable | Definition | Type of Characteristic |
|---|---|---|---|
| 1 | Bad History (BH) | Dummy variable which takes the value 1 if the firm’s owner had a bad trading past at the time of assessing the application, or the value 0 otherwise. | Character |
| 2 | Age (Ag) | The age of the firm’s owner. | Character |
| 3 | Bank Relationship (BR) | The relationship between the firm’s owner and the bank takes the value 1 if the firm’s owner is not a customer, the value 2 if the firm’s owner has only a loan relationship, the value 3 if the firm’s owner has only a deposit relationship, and the value 4 if the firm’s owner has both deposit and loan relationship with the bank. | Capital |
| 4 | Collateral (Co) | The type of collateral takes the value 1 if there is no collateral, the value 2 if the loan is covered by securities (checks-exchange), the value 3 if the loan is covered by mortgage on the property, and the value 4 if the loan is covered by cash collateral (deposits, bancassurance, and investment savings products). | Collateral |
| 5 | LTT | Loan to turnover ratio. | Economic conditions |
| 6 | Own Facilities (OF) | Dummy variable which takes the value 1 if the firm’s owner had owned facilities and the value 0 otherwise. | Capacity |
| 7 | Property (P) | Dummy variable taking the value 1 if the firm’s owner and the guarantor had mortgage-free property, or the value 0 otherwise. | Capital |
| 8 | Residence Status (R) | The residence status takes the value 1 if the firm’s owner lives in a rented house, the value 2 if the firm’s owner lives with their parents, and the value 3 if the firm’s owner has a private residence. | Capacity |
| 9 | Loan Type (LT) | Four categories of loans depending on the purpose of lending: 1. equipment, 2. facilities, 3. working capital fixed term, 4. working capital limit-overdraft. | Capital |
| 10 | Years (Yr) | The years of operation of the company. | Economic conditions |
Table 3. Non-performing loans per decision of the Bank’s credit scoring model.
| Credit Scoring Decision | NPL (08/2010) | NPL (12/2010) | NPL (08/2011) | NPL (12/2011) | NPL (07/2012) |
|---|---|---|---|---|---|
| Referral | 7.78% | 9.68% | 15.20% | 20.97% | 28.60% |
| Approve | 4.58% | 5.49% | 10.26% | 11.81% | 21.15% |
| Reject | 23.92% | 24.40% | 32.54% | 33.97% | 42.11% |
| Total | 7.74% | 9.23% | 14.66% | 18.76% | 26.99% |
Notes: This table presents the percentage of non-performing loans for each loan category according to the bank’s scorecard decision (i.e., Referral, Approve, Reject).
Table 4. Structure of the dataset (3294 total loans).
| Loan Performance August 2010 | Freq. | Percent | Loan Performance August 2011 | Freq. | Percent | Loan Performance September 2012 | Freq. | Percent |
|---|---|---|---|---|---|---|---|---|
| Performing Loan | 3039 | 92.3 | Performing Loan | 2811 | 85.3 | Performing Loan | 2405 | 73.0 |
| Non-Performing Loan | 255 | 7.7 | Non-Performing Loan | 483 | 14.7 | Non-Performing Loan | 889 | 27.0 |
| Total | 3294 | 100.0 | Total | 3294 | 100.0 | Total | 3294 | 100.0 |

| Loan Type (LT) | Freq. | Percent | Collateral (Co) | Freq. | Percent | Bank Relationship (BR) | Freq. | Percent |
|---|---|---|---|---|---|---|---|---|
| Business Equipment | 275 | 8.3 | Cash collateral | 163 | 4.9 | Both deposit and loan relationship | 868 | 26.4 |
| Business property | 428 | 13.0 | Securities (checks-exchange) | 368 | 11.2 | Only deposit relationship | 355 | 10.8 |
| Credit Lines | 1884 | 57.2 | Mortgage on the property | 1337 | 40.6 | Only loan relationship | 1146 | 34.8 |
| Working Capital | 707 | 21.5 | No collateral | 1426 | 43.3 | No customer | 925 | 28.1 |
| Total | 3294 | 100.0 | Total | 3294 | 100.0 | Total | 3294 | 100.0 |

| Own Facilities (OF) | Freq. | Percent | Bad History (BH) | Freq. | Percent | Property (P) | Freq. | Percent |
|---|---|---|---|---|---|---|---|---|
| No own facilities | 1659 | 50.4 | No bad history | 2988 | 90.7 | No mortgage free property | 581 | 17.6 |
| Own facilities | 1635 | 49.6 | Bad history | 306 | 9.3 | Mortgage free property | 2713 | 82.4 |
| Total | 3294 | 100.0 | Total | 3294 | 100.0 | Total | 3294 | 100.0 |

| Residence Status (R) | Freq. | Percent |
|---|---|---|
| Home owner | 2243 | 68.1 |
| Rental home | 460 | 14.0 |
| Live with parents | 591 | 17.9 |
| Total | 3294 | 100.0 |

| Statistic | Years (Yr) | Age (Ag) | LTT |
|---|---|---|---|
| Mean | 10.32 | 42.55 | 0.319 |
| Median | 9.00 | 42.00 | 0.048 |
| Std. Deviation | 8.797 | 9.448 | 0.640 |
| Minimum | 0 | 20 | 0.001 |
| Maximum | 53 | 79 | 3.333 |
| Percentile 25 | 3.00 | 36.00 | 0.020 |
| Percentile 50 | 9.00 | 42.00 | 0.048 |
| Percentile 75 | 15.00 | 49.00 | 0.143 |
Table 5. Pearson and Spearman correlation matrix.
| | Yr | OF | BR | R | Ag | BH | Co | P | LTT |
|---|---|---|---|---|---|---|---|---|---|
| Years (Yr) | 1 | 0.168 ** | 0.064 ** | 0.281 ** | 0.492 ** | −0.011 | −0.053 ** | −0.016 | −0.303 ** |
| Own Facilities (OF) | 0.205 ** | 1 | 0.022 | 0.230 ** | 0.138 ** | −0.019 | 0.100 ** | 0.052 ** | 0.038 * |
| Bank Relationship (BR) | 0.038 * | 0.009 | 1 | 0.079 ** | 0.108 ** | −0.004 | 0.146 ** | −0.069 ** | −0.082 ** |
| Residence Status (R) | 0.226 ** | 0.237 ** | 0.056 ** | 1 | 0.369 ** | 0.004 | 0.042 * | 0.101 ** | −0.096 ** |
| Age (Ag) | 0.522 ** | 0.145 ** | 0.099 ** | 0.292 ** | 1 | −0.024 | −0.043 * | 0.017 | −0.156 ** |
| Bad History (BH) | −0.009 | −0.019 | −0.002 | 0.013 | −0.029 | 1 | 0.082 ** | 0.036 * | 0.022 |
| Collateral (Co) | −0.040 * | 0.099 ** | 0.158 ** | 0.049 ** | −0.048 ** | 0.081 ** | 1 | 0.119 ** | 0.292 ** |
| Property (P) | −0.005 | 0.052 ** | −0.057 ** | 0.110 ** | 0.016 | 0.036 * | 0.117 ** | 1 | 0.135 ** |
| Loan to Turnover (LTT) | −0.377 ** | −0.025 | −0.067 ** | −0.132 ** | −0.197 ** | 0.005 | 0.150 ** | 0.115 ** | 1 |
Note: The table presents the Spearman correlation coefficients (above the diagonal) and the Pearson correlation coefficients (below the diagonal). ** Correlation is significant at the 0.01 level (2-tailed). * Correlation is significant at the 0.05 level (2-tailed).
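The layout of Table 5, with Spearman coefficients above the diagonal and Pearson coefficients below it, can be illustrated with a minimal sketch. This is not the authors' code: the function names (`pearson`, `average_ranks`, `spearman`, `combined_matrix`) and the toy values for `Yr` and `LTT` are hypothetical, and Spearman is computed here as the Pearson correlation of average ranks, which is the standard definition when ties are present.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def combined_matrix(columns):
    """Spearman above the diagonal, Pearson below it, 1.0 on the diagonal."""
    names = list(columns)
    matrix = [[1.0] * len(names) for _ in names]
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i < j:
                matrix[i][j] = spearman(columns[a], columns[b])
            elif i > j:
                matrix[i][j] = pearson(columns[a], columns[b])
    return names, matrix


# Invented toy values, not taken from the dataset.
toy = {"Yr": [1, 2, 3, 4, 5], "LTT": [1, 4, 9, 16, 25]}
names, matrix = combined_matrix(toy)
for name, row in zip(names, matrix):
    print(name, [round(v, 3) for v in row])
```

Because the toy relationship is monotonic but not linear, the entry above the diagonal (Spearman) is 1 while the entry below it (Pearson) is smaller, which is why the two halves of such a matrix can differ.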
Table 6. Cross-validation results on predictive performance.
| Employed models | August 2010: Training set (in-sample) | August 2010: Testing set (out of sample) | August 2011: Training set (in-sample) | August 2011: Testing set (out of sample) | July 2012: Training set (in-sample) | July 2012: Testing set (out of sample) |
|---|---|---|---|---|---|---|
| Binomial Logistic Regression | 92.40% | 91.90% | 85.90% | 85.20% | 73.60% | 72.60% |
| Decision Tree | 92.10% | 92.60% | 85.10% | 85.90% | 74.40% | 73.00% |
| Multilayer Perceptron | 92.60% | 91.50% | 84.80% | 86.70% | 75.80% | 75.00% |
Note: This table presents the predictive performance of the proposed models using a cross-validation approach that splits the data into training and testing subsets. The full dataset, comprising 3294 loans, is randomly divided into a training set (approximately 70%) and a test set (approximately 30%). To evaluate the model’s predictive accuracy on the hold-out (test) sample, we use the average classification accuracy rate, which reflects the percentage of correctly identified performing and non-performing loans.
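The split and accuracy rate described in the note can be sketched as follows. This is an illustration under assumptions, not the procedure actually run for Table 6: the helper names `holdout_split` and `accuracy`, the fixed seed, and the exact 70% cut are hypothetical choices (the note only says "approximately 70%").

```python
import random


def holdout_split(n_items, train_share=0.7, seed=0):
    """Randomly partition item indices into a training and a testing set."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seed is an assumption, for repeatability
    cut = round(n_items * train_share)
    return indices[:cut], indices[cut:]


def accuracy(actual, predicted):
    """Share of loans whose predicted class equals the observed class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# 3294 loans, as in the dataset; the labels below are invented.
train_idx, test_idx = holdout_split(3294)
print(len(train_idx), len(test_idx))

observed = [1, 0, 1, 1]   # 1 = performing, 0 = non-performing (hypothetical coding)
predicted = [1, 0, 0, 1]
print(accuracy(observed, predicted))
```

A model would be fitted on the loans indexed by `train_idx` only, and the accuracy on `test_idx` would then correspond to the out-of-sample columns of the table.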
Table 7. Predictive performance of credit scoring models for the testing sample (average accuracy, F1 metric, estimated misclassification cost).
| Period | Metric | Binomial Logistic Regression | Decision Tree | Multilayer Perceptron | Bank’s Credit Scoring Model |
|---|---|---|---|---|---|
| August 2010 | Average Accuracy | 0.9189 | 0.9260 | 0.9153 | 0.8895 |
| August 2010 | F1 | 0.9578 | 0.9616 | 0.9558 | 0.9405 |
| August 2010 | Estimated Misclassification Cost | 0.0405 | 0.0370 | 0.0423 | 0.0566 |
| August 2011 | Average Accuracy | 0.8521 | 0.8586 | 0.8665 | 0.8312 |
| August 2011 | F1 | 0.9197 | 0.9239 | 0.9285 | 0.9057 |
| August 2011 | Estimated Misclassification Cost | 0.0720 | 0.0707 | 0.0667 | 0.0927 |
| July 2012 | Average Accuracy | 0.7264 | 0.7300 | 0.7495 | 0.7201 |
| July 2012 | F1 | 0.8383 | 0.8359 | 0.8463 | 0.8321 |
| July 2012 | Estimated Misclassification Cost | 0.1349 | 0.1374 | 0.1166 | 0.1606 |
Notes: The Average Accuracy is calculated as AAC = (TP + TN)/Total. The F1 metric is calculated as F1 = 2 × [(TP/(TP + FN)) × (TP/(TP + FP))]/[(TP/(TP + FN)) + (TP/(TP + FP))], that is, the harmonic mean of recall, TP/(TP + FN), and precision, TP/(TP + FP). The Estimated Misclassification Cost is calculated as EMC = C(predicted bad | actually good) × P(predicted bad | actually good) × π0 + C(predicted good | actually bad) × P(predicted good | actually bad) × π1. Here, AAC: Average Accuracy; EMC: Estimated Misclassification Cost; C(predicted bad | actually good): the cost of a good loan incorrectly identified as bad; C(predicted good | actually bad): the cost of a bad loan incorrectly identified as good; P(predicted bad | actually good): the probability that a good loan is incorrectly identified as bad; P(predicted good | actually bad): the probability that a bad loan is incorrectly identified as good; π0 is the probability that a loan is good and π1 is the probability that a loan is bad. PL: Performing Loans; NPL: Non-Performing Loans; FN: False Negative; TP: True Positive; FP: False Positive; TN: True Negative; Total: Total Loans.
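The three metrics in the notes can be computed from the cells of a confusion matrix, as in the minimal sketch below. Several things here are assumptions rather than facts from the paper: the positive class is taken to be the performing (good) loan, the `Confusion` class and function names are hypothetical, and both the counts and the cost values (1 and 5) are invented for illustration, since the notes do not state the costs used for Table 7.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    # Assumption: "positive" means a performing (good) loan.
    tp: int  # good loans predicted good
    fn: int  # good loans predicted bad
    fp: int  # bad loans predicted good
    tn: int  # bad loans predicted bad

    @property
    def total(self):
        return self.tp + self.fn + self.fp + self.tn


def average_accuracy(c):
    """AAC = (TP + TN) / Total."""
    return (c.tp + c.tn) / c.total


def f1_metric(c):
    """Harmonic mean of recall TP/(TP+FN) and precision TP/(TP+FP)."""
    recall = c.tp / (c.tp + c.fn)
    precision = c.tp / (c.tp + c.fp)
    return 2 * recall * precision / (recall + precision)


def estimated_misclassification_cost(c, cost_good_as_bad, cost_bad_as_good):
    """EMC = C(bad|good) * P(bad|good) * pi0 + C(good|bad) * P(good|bad) * pi1."""
    good, bad = c.tp + c.fn, c.fp + c.tn
    p_bad_given_good = c.fn / good
    p_good_given_bad = c.fp / bad
    pi0, pi1 = good / c.total, bad / c.total
    return (cost_good_as_bad * p_bad_given_good * pi0
            + cost_bad_as_good * p_good_given_bad * pi1)


# Invented counts and hypothetical costs, for illustration only.
example = Confusion(tp=900, fn=20, fp=50, tn=30)
print(round(average_accuracy(example), 4))
print(round(f1_metric(example), 4))
print(round(estimated_misclassification_cost(example, 1.0, 5.0), 4))
```

Because each conditional probability is multiplied by its class prior, the EMC reduces to the cost-weighted share of misclassified loans, (C(bad | good) × FN + C(good | bad) × FP)/Total, so a model can have a higher accuracy yet a worse EMC if its errors fall on the more expensive side.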
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Giannopoulos, V.; Kariofyllas, S. When Models Fail: Credit Scoring, Bank Management, and NPL Growth in the Greek Recession. Int. J. Financial Stud. 2025, 13, 152. https://doi.org/10.3390/ijfs13030152
