Predicting Bank Failures: A Synthesis of Literature and Directions for Future Research

Risk management was a topic of great interest to Michael McAleer; as recently as 2020, he published a paper on risk management for COVID-19. In his memory, this article focuses on bankruptcy risk in financial firms. Among financial institutions, banks are considered special, given that they perform risk management functions that are unique. Risks in banking arise from both internal and external factors. The global financial crisis (GFC) underlined the need for comprehensive risk management, and researchers have since been working to fulfil that need. Similarly, central banks across the world have begun periodic stress-testing of banks' ability to withstand shocks. This paper surveys the machine-learning and statistical techniques used in the literature on bank failure prediction. The study finds that although considerable progress has been made using advanced statistical and computational techniques, given the complex nature of banking risk, the ability of statistical techniques to predict bank failures remains limited. Machine-learning-based models are increasingly popular due to their significant predictive ability. The paper also suggests directions for future research.


Introduction
Financial institutions occupy an important position in any economy. Among these, banks in particular perform functions that are unique. The failure of a major bank would be disastrous for the entire economy due to the risk of contagion, as banks are connected with each other through payment systems. Accepting deposits repayable on demand and making loans and investments are the predominant functions that commercial banks perform, besides a host of other functions. Banks accept deposits of short maturity and make loans of long maturity. The unique functions that a bank performs expose it to several types of risk, such as interest-rate risk, market risk, credit risk, liquidity risk, off-balance-sheet risk, foreign-exchange risk, and others. Banks are major users of technology, and consequently they are exposed to technology risk as well as operational risk. Banks' international lending exposes them to country risk. The combined effect of all these risks can lead to insolvency risk.
Given the multifarious risks that banks face and the negative externalities they impose on the rest of the economy, banks are subject to strict prudential supervision and periodic stress-testing by regulatory agencies in all countries. The objective is to ensure that banks are prudently run, so that failures and the bailouts they require are avoided. A timely prediction of a possible bank failure would considerably help supervisory authorities, as it would identify the areas where a bank is vulnerable to failure risk and support risk-based on-site inspections and audits. Bank failure prediction models help in this respect, as they generate a better understanding of a bank's business. Supervisory authorities have also introduced early-warning systems towards this end.
Bank failure prediction has a long history. The CAMELS rating system used by the Federal Reserve in the United States, which reached its current form in the mid-1990s, is still in use, with revisions made from time to time. The components of the CAMELS system include capital adequacy, asset quality, management quality, earnings, and liquidity (FRBN 1997). CAMELS produces a composite rating. The difficult dimension, however, was how to measure management quality, since the other components could be measured from financial data. The statistical techniques of bank failure prediction that used financial data typically included discriminant analysis and logistic regression. Some researchers introduced data envelopment analysis to capture the management-efficiency component. However, any model is a simplification of reality, not reality itself. The banking world is quite complex, and predictions must be made in an ever-changing, dynamic environment. Towards that end, machine-learning techniques (MLTs) are increasingly being used. The major MLTs include the artificial neural network (ANN), support vector machines (SVMs), and the k-nearest neighbour algorithm (KNN) (Le and Viviani 2018).
Whether these techniques have helped in accurately predicting bank failures is a question that remains to be answered. It is this gap in the literature that the present paper addresses.
The paper is organised as follows: Section 2 presents a literature review of more than 60 key papers in the area and classifies them by methodology, database used, study period, country or district studied, conclusions drawn, and limitations; Section 3 discusses the findings; and Section 4 concludes the paper.

Literature on Bank Failure Prediction
The literature on business failures dates back to the 1960s, when Beaver (1966) applied a set of financial ratios to assess the likelihood of business failure. Similarly, Altman attempted to assess corporate bankruptcy using traditional ratio analysis as well as more rigorous statistical techniques (Altman 1968). Over the years, prediction models have become more sophisticated.
The review of studies was conducted from two perspectives: a methodological review and a predictive indicators review.

Review of Methodology
Among statistical techniques, the methods are covered in three categories: (1) logit/probit, discriminant analysis, and linear analysis; (2) artificial intelligence methods; and (3) machine-learning methods. Table 1 presents the prior studies in these categories. The papers are reviewed in chronological order.

Discriminant Analyses
The family of discriminant analyses includes linear discriminant analysis (LDA), multivariate discriminant analysis (MDA), and quadratic discriminant analysis (QDA). These remained the leading techniques for many years. The first application of discriminant analysis to explain corporate failure was by Altman (1968). Studies of specific corporate groups such as banking soon followed; for example, the Sinkey (1975) study on commercial banks. Bloch (1969) applied linear discriminant analysis in an exploratory study of savings and loan associations, and the encouraging results helped to initiate Altman's study in the same area. Altman (1977) adopted a quadratic discriminant analysis to predict performance in the savings and loan association industry. A later study in this stream finds that the power of the Z-score to predict bank defaults remains stable within a three-year forward window, and that the probability of distress is connected with macroeconomic conditions via regional grouping (clustering). Bank-level variables that were stable predictors of distress from one to four years prior to an event are the ratios of equity to total assets (leverage) and loans to funding (liquidity). Among macroeconomic factors, GDP growth is a reasonable variable, but with a differentiated impact, which shows the changing role of the macroeconomic environment and indicates the potential impact of favourable macroeconomic conditions on the accumulation of systemic problems in the banking sector.
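The discriminant analysis underlying these studies can be sketched in a few lines. Below is a minimal two-class LDA (shared covariance, midpoint threshold) on synthetic data; the "healthy"/"failed" banks and the two financial ratios are invented for illustration and are not drawn from any of the studies above.

```python
import numpy as np

def lda_fit(X, y):
    """Fit a two-class linear discriminant: class means plus a pooled
    within-class covariance (the LDA equal-covariance assumption)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    S = (np.cov(X0, rowvar=False) * (len(X0) - 1) +
         np.cov(X1, rowvar=False) * (len(X1) - 1)) / (len(X) - 2)
    w = np.linalg.solve(S, mu1 - mu0)   # discriminant direction
    c = w @ (mu0 + mu1) / 2             # midpoint threshold (equal priors)
    return w, c

def lda_predict(X, w, c):
    # Scores above the threshold fall on the "failed" side of the boundary.
    return (X @ w > c).astype(int)

rng = np.random.default_rng(0)
# Toy data: two ratios per bank (e.g., equity/assets and ROA) -- illustrative only.
healthy = rng.normal([0.10, 0.012], 0.02, size=(200, 2))
failed  = rng.normal([0.03, -0.010], 0.02, size=(200, 2))
X = np.vstack([healthy, failed])
y = np.array([0] * 200 + [1] * 200)

w, c = lda_fit(X, y)
acc = (lda_predict(X, w, c) == y).mean()
print(f"in-sample accuracy: {acc:.2f}")
```

QDA, used by Altman (1977), relaxes the shared-covariance assumption by estimating a separate covariance matrix per class, which yields a quadratic rather than linear boundary.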
A recent study in this area offers an analytical approach that selects the most significant bank-failure-specific indicators using lasso regression, converts the data from imbalanced to balanced form using SMOTE, and then chooses appropriate machine-learning techniques to predict bank failure; AdaBoost was found to have the maximum accuracy. Related results suggest that low levels of bank liquid assets and domestic financial liabilities, and high levels of foreign liabilities and financial leverage, increase the likelihood of a banking crisis. These results are robust to different dependent and control variables. The results also show that there is no single optimal lag length for all the indicators; combining all indicators, the best predictive power is obtained with a lag of 42 months.
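SMOTE, as used in the approach above, balances the classes by synthesizing new minority (failed-bank) observations as interpolations between real ones. A minimal sketch of the core idea follows; production work would typically use a library implementation such as imbalanced-learn, and the toy figures here are invented for illustration.

```python
import numpy as np

def smote(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE sketch: synthesize minority samples by interpolating
    between each chosen minority point and one of its k nearest minority
    neighbours. Illustrates the idea, not the full published algorithm."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Pairwise distances within the minority class only.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    neighbours = np.argsort(d, axis=1)[:, :k]
    new = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))      # pick a minority sample
        j = rng.choice(neighbours[i])     # one of its k nearest neighbours
        gap = rng.random()                # interpolation factor in [0, 1)
        new.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(new)

# Imbalanced toy setting: 5 "failed" banks described by 3 indicators.
rng = np.random.default_rng(1)
X_min = rng.normal(0.0, 1.0, size=(5, 3))
X_syn = smote(X_min, n_new=95, k=3, rng=rng)
print(X_syn.shape)
```

Because each synthetic point is a convex combination of two real minority points, the oversampled class stays inside the region spanned by the observed failures.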
Adopting these methods, researchers used U.S. bank data to identify the main explanatory contributors to bank failure (Cleary and Hebb 2016; Cox and Wang 2014; Jordan et al. 2010). To address the classification problem associated with discriminant methods, Lam and Moy (2002) presented a method that combined several discriminant methods to predict the classification of new observations; their simulation experiment demonstrated improved classification accuracy. Serrano-Cinca and Gutiérrez-Nieto (2013) performed an empirical study comparing partial least-squares discriminant analysis (PLS-DA) with eight other techniques widely used for classification tasks. The results showed that PLS-DA performed very well in the presence of multicollinearity, with satisfactory interpretability, and its results resembled those of linear discriminant analysis and support vector machines.

Logit/Probit and Linear Regression Analysis
When independent variables are not normally distributed, maximum-likelihood methods such as logit and probit models are used, and these appear in many studies on bank failure prediction. A logit model is a nonlinear model with a dichotomous outcome variable (failed/nonfailed bank). After Martin's (1977) application of a logit model to predict bank failures in the U.S., various studies adopted this model (univariate or multivariate) to predict bank failures in different countries and periods. These include, for example, Andersen (2008) in Norway; Arena (2008) in East Asia and Latin America; Ercan and Evirgen (2009) in Turkey; Zhao et al. (2009), Cole and White (2012), DeYoung and Torna (2013), Mayes and Stremmel (2014), and Berger et al. (2016) in the U.S.; Poghosyan and Čihak (2011), Betz et al. (2014), and Chiaramonte and Casu (2017) in most of the European Union countries and banks; Demirgüç-Kunt and Detragiache (1998) in 65 developing and developed economies; and de Haan et al. (2020) in 147 emerging and developing countries. The probit model is another binary model used in banking failure studies (Chiaramonte et al. 2015; Cipollini and Fiordelisi 2012; Kerstein and Kozberg 2013; Wong et al. 2010). Research in this area found its accuracy to be similar to that of logit models (Barr and Siems 1997).
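The logit model used throughout this stream fits P(failure | x) = 1 / (1 + e^(-x'β)) by maximum likelihood. A compact Newton-Raphson (equivalently, iteratively reweighted least squares) sketch on simulated data follows; the single "leverage" regressor and the coefficient values are invented for illustration, not estimates from any cited study.

```python
import numpy as np

def logit_fit(X, y, n_iter=25):
    """Maximum-likelihood logit fit via Newton-Raphson.
    X: (n, p) regressors (an intercept column is added); y in {0, 1}."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))   # fitted P(failure | x)
        W = p * (1 - p)                        # IRLS weights
        grad = Xb.T @ (y - p)                  # score vector
        hess = (Xb * W[:, None]).T @ Xb        # observed information
        beta += np.linalg.solve(hess, grad)    # Newton step
    return beta

rng = np.random.default_rng(2)
n = 500
leverage = rng.normal(0, 1, n)                 # hypothetical standardized ratio
true_beta = np.array([-1.0, 2.0])
prob = 1 / (1 + np.exp(-(true_beta[0] + true_beta[1] * leverage)))
y = (rng.random(n) < prob).astype(float)

beta_hat = logit_fit(leverage[:, None], y)
print(beta_hat)
```

With enough observations the estimates recover the data-generating coefficients up to sampling noise; a probit model differs only in replacing the logistic link with the normal CDF.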
The hazard model is another statistical model applied to predict bank failures; this stream includes Lane et al. (1986), Molina (2002), Hong et al. (2014), Maghyereh and Awartani (2014), and Chiaramonte et al. (2016). However, Cole and Wu (2009) compared the out-of-sample forecasting accuracy of a time-varying hazard model and a one-period probit model using data on U.S. bank failures from 1985 to 1992, and found that, from an econometric perspective, the hazard model was more accurate than the probit model in predicting bank failures when more recent information was incorporated into the hazard model.
Although standard discriminant analysis has been a popular technique for bankruptcy studies, it suffers from methodological and statistical problems that limit its practical usefulness (Ozkan-Gunay and Ozkan 2007). Violations of the normality assumptions may bias the tests of significance and the estimated error rates (Ohlson 1980). However, in an early application of the Cox model in the finance literature, empirical results from Lane et al. (1986) indicated that the total classification accuracy of the Cox model was similar to that of discriminant analysis. Lanine and Vennet (2006) and Kolari et al. (2002) used a logit model and a trait-recognition approach to predict bank failures in Russia and the U.S., respectively; both concluded that the trait-recognition approach outperformed the logit approach.
Prediction can be framed as a classification problem: in the context of bank failure prediction, banks are categorized into failed and nonfailed groups, which is exactly what data-mining models focus on. Data-mining models capture the relationships between dependent and independent variables by learning from the data, imposing fewer constraints on the distribution of the data than traditional statistical models such as the logit model (Jing and Fang 2018). The next subsection reviews the studies in this area.

Artificial Intelligence Method
The traditional approach to predicting business distress or failure has been criticized because the validity of its results hinges on restrictive assumptions (Coats and Fant 1993). To address the problems of linear analysis, researchers began applying neural network analysis to bankruptcy around 1990. Neural networks differ from the classical approach because these models allow nonlinear relations among variables (De Miguel et al. 1993). Tam (1991) describes a neural network as a learning process: given a collection of failed and nonfailed banks, a network is trained using a learning algorithm so that the resulting network not only represents a discriminant function for the sample banks, but also generalizes from the training sample. Atiya (2001) argued that there are saturation effects in the relationships between financial ratios and the prediction of default. The following bank failure prediction studies have applied the neural network approach.
One of the early studies adopting a neural network was that of Tam (1991), who examined banks that failed in the period 1985-1987. López-Iturriaga et al. (2010) applied the neural network method to U.S. commercial banks during the financial crisis period; the model showed high discriminant power and was able to differentiate healthy from distressed banks. López Iturriaga and Sanz (2015) developed a hybrid neural network model to study U.S. bank bankruptcies; based on data spanning 2002 to 2012, the model detected 96.15% of the failures and outperformed traditional bankruptcy prediction models. Constantin et al. (2018) studied the European bank network with a distress model that captured the external-dependence structure of listed European banks; the model could provide information on potential distress following an early-warning signal, and on the potential for financial contagion and a systemic banking crisis.
Similar studies have been conducted in emerging markets. Olmeda and Fernandez (1997) examined the bankruptcies of Spanish banks and found that the artificial neural network approach achieved 82.4% accuracy, compared with 61.8-79.4% for competing techniques. Ravisankar and Ravi (2010) adopted three neural network architectures not previously applied to bank distress, for four different countries. Ozkan-Gunay and Ozkan (2007) applied the artificial neural network approach to bank failures in the Turkish banking sector. A new principal component neural network (PCNN) architecture for commercial bank bankruptcy prediction was also proposed and examined in the Spanish and Turkish banking sectors, and hybrid models combining PCNN with several other models outperformed other classifiers used in the literature (Ravi and Pramodh 2008). The superiority of artificial-neural-network-related models was further documented and supported (Bell 1997; Boyacioglu et al. 2009; Swicegood and Clark 2001). Ecer (2013) compared the ability of an artificial neural network (ANN) and a support vector machine (SVM) to predict bank failures in Turkish banks; of the two, the neural network showed slightly better predictive ability. A similar comparative study was conducted by Jing and Fang (2018); however, that study favoured the logit model. Le and Viviani's (2018) comparative study found the artificial neural network and k-nearest neighbour methods to be the most accurate.
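The feed-forward networks used in these studies share the structure Tam (1991) describes: a trained discriminant function that generalizes from sample banks. A minimal one-hidden-layer network trained by batch gradient descent on synthetic two-ratio data is sketched below; the architecture, data, and hyperparameters are illustrative assumptions, not those of any cited study.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data: two standardized ratios per bank; label 1 = failed (illustrative).
healthy = rng.normal([0.5, 0.5], 0.15, size=(150, 2))
failed  = rng.normal([-0.5, -0.5], 0.15, size=(150, 2))
X = np.vstack([healthy, failed])
y = np.array([0.0] * 150 + [1.0] * 150)

# One hidden layer (tanh) + sigmoid output, trained on cross-entropy loss.
h = 4                                       # hidden units
W1 = rng.normal(0, 0.5, (2, h)); b1 = np.zeros(h)
W2 = rng.normal(0, 0.5, h);      b2 = 0.0
lr = 0.5

for _ in range(500):
    H = np.tanh(X @ W1 + b1)                # hidden activations
    p = 1 / (1 + np.exp(-(H @ W2 + b2)))    # predicted failure probability
    err = p - y                             # dLoss/dlogit for cross-entropy
    gW2 = H.T @ err / len(X); gb2 = err.mean()
    dH = np.outer(err, W2) * (1 - H**2)     # backpropagate through tanh
    gW1 = X.T @ dH / len(X); gb1 = dH.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

# Final forward pass with the trained weights.
H = np.tanh(X @ W1 + b1)
p = 1 / (1 + np.exp(-(H @ W2 + b2)))
acc = ((p > 0.5) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

The nonlinearity in the hidden layer is what frees the model from the linear-boundary restriction of discriminant analysis noted above.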

Machine-Learning Methods (Including Ensembles, Support Vector Machines, Generalized Boosting, AdaBoost, and Random Forests)
Recent statistical learning techniques such as generalized boosting, AdaBoost, and random forests are used to predict bank failure with the aim of improving prediction accuracy. Using a comprehensive dataset covering systemic banking crises in 15 advanced economies over the past 45 years, Beutel et al. (2019) concluded that machine learning helps predict banking crises. Tanaka et al. (2016) adopted a novel random-forests-based approach for predicting bank failures in OECD member countries; the experimental results showed that this method outperformed conventional methods in prediction accuracy. Momparler et al. (2016) found the boosted regression trees method to be a better model for identifying a set of key leading indicators and, further, for anticipating and averting bank financial distress. Ekinci and Erdal (2017) applied three common machine-learning models to bank failure prediction for 37 commercial banks operating in Turkey between 1997 and 2001; the experimental results indicated that hybrid ensemble machine-learning models outperformed conventional base and ensemble models. Erdogan (2013) found that the support vector machine with a Gaussian kernel applied well to bank bankruptcy. Gogas et al. (2018) found that a model trained with a support vector machine had an overall accuracy of 99.22%. Olson et al. (2012) applied a variety of data-mining tools to bankruptcy data to compare accuracy and the number of rules; decision trees were found to be more accurate than neural networks and support vector machines, albeit with an undesirably high number of rule nodes. Carmona et al. (2019) adopted an extreme gradient-boosting approach that did not have to be treated as a black box, and found that its predictive power was greater than that of most conventional methods. Kolari et al. (2019) studied a European bank stress test using an AdaBoost ensemble approach, and the model's accuracy was found to be 98.4%. A similar result was found by Shrivastava et al. (2020) for the banking sector in India.
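AdaBoost, the ensemble behind several of the results above, reweights the training banks after each round so that the next weak learner concentrates on the previous ensemble's mistakes. A from-scratch sketch with decision stumps on synthetic data follows; the data and the stump weak learner are illustrative choices, not the configuration of any cited study.

```python
import numpy as np

def fit_stump(X, y, w):
    """Best decision stump (feature, threshold, sign) under weights w; y in {-1, +1}."""
    best = (np.inf, None)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = np.where(X[:, j] > t, s, -s)
                err = w[pred != y].sum()      # weighted misclassification
                if err < best[0]:
                    best = (err, (j, t, s))
    return best

def adaboost(X, y, rounds=10):
    """Plain AdaBoost.M1 on decision stumps -- a minimal sketch."""
    n = len(X)
    w = np.full(n, 1 / n)                     # uniform initial sample weights
    ensemble = []
    for _ in range(rounds):
        err, (j, t, s) = fit_stump(X, y, w)
        err = max(err, 1e-10)
        alpha = 0.5 * np.log((1 - err) / err) # vote weight of this stump
        pred = np.where(X[:, j] > t, s, -s)
        w *= np.exp(-alpha * y * pred)        # up-weight misclassified banks
        w /= w.sum()
        ensemble.append((alpha, j, t, s))
    return ensemble

def predict(ensemble, X):
    score = sum(a * np.where(X[:, j] > t, s, -s) for a, j, t, s in ensemble)
    return np.sign(score)

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 3))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)    # toy "failure" label
model = adaboost(X, y, rounds=20)
acc = (predict(model, X) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

Random forests take the complementary route: rather than sequentially reweighting, they average many deep trees grown on bootstrap samples with random feature subsets.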
Overall, many studies have compared traditional approaches with machine-learning approaches, and it is well documented that machine-learning methods outperform the traditional models. However, further enhancements to machine learning are needed when we consider the performance metric, the definition of the crisis or distress event, preference parameters, sample length, and regulatory differences among countries.

Review of Predicting Indicators
In the empirical literature, the prediction of bank failure has primarily focused on identifying leading indicators that contribute to generating reliable early-warning systems (Chiaramonte et al. 2015). This group of indicators mostly comprises financial/accounting-based indicators, since Beaver (1966) pioneered the prediction of bankruptcy using financial statement data such as financial leverage, return on assets, and liquidity.
In the banking sector in particular, the Federal Reserve and the FDIC have, over the years, developed their own methodologies for identifying distress (Kerstein and Kozberg 2013). The initial CAMELS rating comprised five categories indicating the condition of a bank: capital adequacy, asset quality, management, earnings, and liquidity. In 1996, the system was expanded to include a sixth rating area. Bank-level fundamentals proxied by CAMELS-related variables have been extensively studied for particular countries or districts and at the cross-country level (Arena 2008; Chiaramonte et al. 2015; Iwanicz-Drozdowska and Ptak-Chmielewska 2019; Kerstein and Kozberg 2013; Kolari et al. 2002; Lane et al. 1986; Maghyereh and Awartani 2014; Männasoo and Mayes 2009; Molina 2002; Wheelock and Wilson 2000), and most of these studies used a statistical model such as the logit model.
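The CAMELS-related variables used in these studies are typically simple balance-sheet and income-statement ratios. A sketch with commonly used proxies follows; the figures, and the use of cost-to-income as a stand-in for the hard-to-measure management component, are illustrative assumptions rather than the definitions of any cited study.

```python
# Hypothetical balance-sheet and income items (all figures illustrative).
bank = {
    "equity": 8.0, "total_assets": 100.0,
    "nonperforming_loans": 2.1, "gross_loans": 60.0,
    "operating_costs": 1.8, "operating_income": 3.0,
    "net_income": 1.1, "liquid_assets": 20.0,
}

# CAMELS-style proxy ratios; cost-to-income is one frequent stand-in
# for management quality, the component hardest to measure directly.
camels = {
    "C_capital_adequacy": bank["equity"] / bank["total_assets"],
    "A_asset_quality":    bank["nonperforming_loans"] / bank["gross_loans"],
    "M_management":       bank["operating_costs"] / bank["operating_income"],
    "E_earnings_roa":     bank["net_income"] / bank["total_assets"],
    "L_liquidity":        bank["liquid_assets"] / bank["total_assets"],
}
for name, value in camels.items():
    print(f"{name}: {value:.3f}")
```

Vectors of such ratios, computed per bank per period, are what the logit, hazard, and machine-learning models surveyed above take as inputs.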
In the early seminal articles on bankruptcy prediction, Beaver (1966, 1968) examined individual financial ratios, while Altman (1968) combined five market- and/or accounting-based ratios into a Z-score to predict business failure. Subsequent articles adapted or expanded Z-score analysis to predict bank failure. Martin (1977) drew a set of 25 financial ratios from the database maintained by the Federal Reserve Bank of New York for research on bank surveillance programs, and used a logit analysis similar to Altman's study to examine bank failures in the period 1975-1976. Chiaramonte et al. (2016) examined U.S. commercial bank data from 2004 to 2012 and found that, on average, the Z-score can predict 76% of bank failures, and that an additional set of other bank- and macro-level variables did not increase this predictability. However, Bongini et al. (2018) found that the predictive power of the Z-score was weak, especially for developing economies.
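The bank-level Z-score used in this strand of the literature is commonly computed as (mean ROA + equity/assets) / σ(ROA): roughly, the number of standard deviations of returns that separate the bank from exhausting its equity. A sketch on invented figures (the ROA history and capital ratio below are hypothetical):

```python
import numpy as np

def bank_z_score(roa_series, equity, assets):
    """Bank Z-score = (mean ROA + equity/assets) / std(ROA).
    Higher values mean the bank is further from insolvency."""
    roa = np.asarray(roa_series, dtype=float)
    return (roa.mean() + equity / assets) / roa.std(ddof=1)

# Hypothetical bank: eight years of ROA and a 10% capital ratio.
roa_history = [0.011, 0.009, 0.013, 0.008, 0.012, 0.010, 0.007, 0.014]
z = bank_z_score(roa_history, equity=10.0, assets=100.0)
print(f"Z-score: {z:.1f}")
```

Note this accounting-based measure is distinct from Altman's (1968) Z-score, which is a weighted sum of five ratios estimated by discriminant analysis.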
Although traditional CAMELS indicators have been found successful in anticipating bank failures in the U.S., Canbas et al. (2005) found that these criteria did not maintain a one-to-one correspondence with the specific financial characteristics of Turkish commercial banks, owing to different applications of bank regulatory and supervisory actions. Kapinos and Mitnik (2016) proposed a simple method for stress-testing banks using a top-down approach that captured the heterogeneous impact of shocks to macroeconomic variables on banks' capitalization; they performed a principal component analysis on the selected variables and showed how the principal component factors can be used to make projections conditional on exogenous paths of macroeconomic variables. Ercan and Evirgen (2009) and Canbas et al. (2005) adopted the same approach (principal component analysis). Iwanicz-Drozdowska et al. (2018) also found it difficult to predict distress with a set of CAMELS-like variables in the European setting.
Meanwhile, researchers have sought other explanatory factors for the distress phenomena. These include macroeconomic and regulation variables (Cebula 2010; Männasoo and Mayes 2009; Schaeck 2008; Wong et al. 2010), accounting and audit quality (Jin et al. 2011), income from nontraditional banking activities (DeYoung and Torna 2013), market and macroeconomic variables (Cole and Wu 2009), commercial real-estate investments (Cole and White 2012), the information content of Basel III liquidity risk and capital measures (Chiaramonte and Casu 2017; Hong et al. 2014), and corporate governance (Al-Tamimi 2012; Berger et al. 2016).
Data envelopment analysis (DEA) has been widely applied in banking efficiency studies. Although DEA suffers from the usual statistical inefficiency problems of nonparametric estimation (Kneip et al. 1996), the efficiency variable generated by this method has also been used as an indicator to predict bank failure. Wheelock and Wilson (2000) adopted the DEA method, developing an operating-efficiency measure of management performance and using it, along with other CAMELS-related variables, to investigate the determinants of U.S. bank failures. Similar studies have been conducted for banking sectors in different countries (Avkiran and Cai 2012; Barr and Siems 1997; Cipollini and Fiordelisi 2012; Kao and Liu 2004; Tatom and Houston 2011). Barr and Siems (1997) found that their model outperformed many previous logistic models in predicting failure when DEA efficiency was used as a proxy for management quality alongside other CAMELS-related variables.
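A standard DEA efficiency score of the kind used as a management-quality proxy above solves one small linear program per bank. The sketch below implements the input-oriented, constant-returns-to-scale (CCR) model with scipy's `linprog`; the four banks, their single input (cost) and single output (loans), are invented for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def dea_ccr(X, Y):
    """Input-oriented CCR DEA efficiency for each decision-making unit.
    X: (n, m) inputs, Y: (n, s) outputs. Returns theta in (0, 1];
    theta = 1 means the unit lies on the efficient frontier."""
    n, m = X.shape
    s = Y.shape[1]
    theta = np.empty(n)
    for o in range(n):
        # Variables z = [theta, lambda_1, ..., lambda_n]; minimize theta.
        c = np.zeros(1 + n); c[0] = 1.0
        A, b = [], []
        for i in range(m):   # sum_j lambda_j * x_ij <= theta * x_io
            A.append(np.concatenate(([-X[o, i]], X[:, i]))); b.append(0.0)
        for r in range(s):   # sum_j lambda_j * y_rj >= y_ro
            A.append(np.concatenate(([0.0], -Y[:, r]))); b.append(-Y[o, r])
        res = linprog(c, A_ub=np.array(A), b_ub=np.array(b),
                      bounds=[(None, None)] + [(0, None)] * n)
        theta[o] = res.fun
    return theta

# Toy data: 4 banks, one input (operating cost) and one output (loans).
X = np.array([[2.0], [4.0], [3.0], [5.0]])
Y = np.array([[4.0], [6.0], [6.0], [5.0]])
eff = dea_ccr(X, Y)
print(np.round(eff, 3))
```

With a single input and output under constant returns to scale, each score reduces to the bank's output/input ratio relative to the best ratio in the sample, which makes the toy result easy to verify by hand.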

Discussion
We reviewed 24 papers in the artificial intelligence and machine-learning research areas and 41 papers that used regression models and discriminant analyses to assess bank failures, a total of 65 papers. Although regression models formed close to 50% of the papers on bank failures after the global financial crisis (GFC), the recent trend is towards machine-learning techniques for predicting bank distress. The accuracy rate of the machine-learning models reported above is generally around 95%. Almost half of the machine-learning papers used U.S. bank data; the other half was scattered across a few European countries. The use of artificial intelligence and machine-learning approaches requires solid skills in these areas, and few banks and regulators may have the necessary expertise. Statistical techniques, on the other hand, are commonly used, and the required data are easily available to banks. In addition, from a cost perspective, data and other associated costs are much higher if artificial intelligence or machine-learning techniques are to be used (Incze 2019). Overall, more research is required using banking data, regulation, macroeconomic conditions, and market structure in non-U.S. countries. Research on Asia-Pacific countries is woefully lacking, barring one paper that used Indian bank data.
However, we do not know whether regulatory agencies have adopted these models in practice, or whether banks, in their own interest, use these models to assess their vulnerability periodically. Future studies may survey banks to find out which techniques are used in practice and, where they are not, why; a similar survey of regulatory agencies could also be conducted. Only a few papers have compared regression models and machine-learning techniques, and these found that machine-learning models performed better in predicting bank distress.
Furthermore, the papers are overwhelmingly based on U.S. data. However, the regulatory set-up and banking laws in other countries may not be similar to those in the U.S.; accordingly, there is an inherent bias in the literature. In countries where banks are predominantly under public ownership, such as India or China, the conclusions of prior studies may not be relevant. Similarly, the macroeconomic environment and market structure in these countries differ, and this fact needs to be taken into consideration.
It is not surprising that corporate bankruptcy prediction models have been so intensively developed and studied. Researchers have found that each method has its pros and cons. For example, regarding the recent trend towards neural networks, Olson et al. (2012) argued that decision trees can be just as accurate while providing the transparency and transportability that neural networks are often criticized for lacking. Further, the breadth and depth of the recent financial crisis indicate that these methods must improve if they are to serve as a useful tool for regulators and managers of financial institutions (Carmona et al. 2019). While research on bankruptcy in the banking sector is well developed, studies on other financial institutions, such as fund management firms and insurance companies, are rather sparse.
The main stream of research predicting bank failures focuses on determinant factors or leading indicators, such as accounting and financial ratios, macroeconomic data, and regulation. A small set of studies applied different datasets, aligned with banking activities, to bank failures during the financial crisis. In light of the ongoing FinTech advancement, it would be beneficial to study further the different risks faced by large banks, such as trading risk (off-balance-sheet items), currency risk or crises (Kaminsky and Reinhart 1999), and technological risk; Kaminsky and Reinhart pointed out that little attention has been paid to the interaction between banking and currency problems, in either the older literature or the new models of self-fulfilling crises. Such work would be a logical extension of the bank early-warning literature.

Conclusions
The paper provided a synthesis of post-GFC studies on bank failures. A total of 39 studies published in reputed journals were compared. The emerging trend is towards the use of machine-learning techniques, although regression-model-based studies currently dominate. Directions for future research have also been identified.

Table 1 .
The literature study.
The bank failure rate was found to be an increasing function of the unemployment rate, the average cost of funds, the volatility of the S&P 500 stock index, and charge-offs as a percentage of outstanding loans, and a decreasing function of the mortgage rate on new 30-year fixed-rate mortgages.