Article

Machine Learning for Predicting Bank Stability: The Role of Income Diversification in European Banking

1
Faculty of Economics and Business Administration, Berlin School of Business and Innovation (BSBI), Berlin Campus, 12043 Berlin, Germany
2
Faculty of Computer Science and Informatics, Berlin School of Business and Innovation (BSBI), Berlin Campus, 12043 Berlin, Germany
*
Author to whom correspondence should be addressed.
FinTech 2025, 4(2), 21; https://doi.org/10.3390/fintech4020021
Submission received: 20 April 2025 / Revised: 27 May 2025 / Accepted: 28 May 2025 / Published: 31 May 2025

Abstract

There is an ongoing debate about the role of income diversification in enhancing bank stability within the European financial services industry. Some studies advocate diversification, while others argue that its importance should not be overstated. Some recommend that financial institutions focus on their traditional investments rather than diversify income, while others suggest that income diversification can either stabilize or destabilize banks, depending on the regulatory environment. These conflicting results indicate a lack of clear evidence regarding the effectiveness of income diversification. Therefore, this paper studies the impact of income diversification on bank stability and seeks to improve the prediction of bank stability, using aggregate bank data for 26 European countries over the period from 2000 to 2021. It employs a hybrid method that combines econometric techniques, specifically the generalized method of moments (GMM) and a fixed-effects model, with machine-learning algorithms, namely Random Forest and Support Vector Machine. These methods are combined to enhance the reliability and predictive power of the analysis by addressing endogeneity (via GMM) and capturing non-linearities, interactions, and high-dimensional patterns (via machine learning). The econometric findings reveal that income diversification can reduce non-performing loans, improve bank solvency, and raise the Z-score, indicating that income diversification plays a significant role in improving bank stability. In addition, the results show that the machine-learning algorithms used play a crucial role in enhancing the predictive performance for bank stability.
JEL Classification:
G21; G32; C45; G28; C55

1. Introduction

Banks are instrumental in advancing sustainable economic development by efficiently allocating capital to productive sectors, thereby enhancing output and long-term growth. However, the post-crisis financial environment has become increasingly dynamic, requiring banks to adopt more proactive and strategic approaches to portfolio and risk management [1]. In response to systemic vulnerabilities exposed during the global financial crisis, regulatory standards were tightened: under Basel III, the official minimum capital requirement remains 8%, as under Basel II, but a mandatory Capital Conservation Buffer of 2.5% raises the total requirement to 10.5% [2]. This shift underscores the importance of accurate risk-exposure measurement for ensuring financial system resilience and supporting economic stability. It also emphasizes the importance of asset diversification to better meet capital requirements. Ref. [3] stated that, following the 2007–2009 financial crisis, banks were increasingly motivated to diversify away from conventional assets to protect themselves from credit and insolvency risks. Furthermore, technological advancements greatly aided banks’ international expansion and their integration into global financial markets, which in turn increased non-interest-bearing investment.
Globalization and financial reforms have further driven the deregulation of banking activities and expanded the scope for income diversification. Ref. [1] argued that the decline in market interest rates also caused European banks to reallocate their reserves to non-traditional assets. Financial institutions are increasingly adopting digital tools and FinTech-enabled services to improve operational efficiency, expand product offerings, and manage competition [4]. Consequently, banks have shifted away from a heavy reliance on interest income toward a broader mix of fee-based and off-balance-sheet activities, such as securitization, derivatives trading, trade finance, underwriting, and financial advisory services, which theoretically help mitigate concentration risk [5].
Despite its theoretical appeal, the impact of income diversification on bank stability remains contentious. Some scholars found that it contributes to systemic risk, particularly when fee-based activities involve complex financial instruments such as subprime mortgages and derivatives, which were key triggers of the 2007–2009 financial crisis [1,6]. Accordingly, Asian regulators, including those of Korea and Taiwan, imposed restrictions on off-balance-sheet activities to protect financial stability [7]. The Bank of England enforced the separation of retail and investment banking to reduce vulnerability to off-balance-sheet risks driven by excessive income diversification. In contrast, Ref. [8] conducted a thorough assessment of 83 studies and concluded that income diversification is important for lowering risk and improving bank stability, particularly when banks use it as a buffer against income volatility. Furthermore, Ref. [9] argued that income diversification can decrease bank stability in poorly regulated environments or when banks evolve into more complex structures that are difficult to track and control. This is consistent with the findings of [10], which argued that countries with weak regulatory environments are vulnerable to a lack of oversight, inadequate risk controls, moral hazard, and poor risk management, resulting in excessive risk-taking and weakly diversified portfolios that undermine the role of income diversification in enhancing bank stability. On the other hand, Ref. [11] stated that highly regulated environments, such as Europe, have stronger risk controls that encourage income diversification to stabilize bank performance.
This ongoing debate calls for a reassessment of income-diversification strategies through advanced analytical tools. This study contributes to the FinTech literature by evaluating the effect of income diversification on bank stability and improving the prediction performance of bank stability levels using both econometric and machine-learning approaches. By combining GMM with predictive algorithms like Support Vector Machines (SVMs) and Random Forests (RFs), the study explores the causal linkages and predictive potential of diversification in the context of European commercial banks. Our findings aim to support evidence-based regulation and highlight how machine learning can be leveraged for real-time risk monitoring in a fast-evolving financial ecosystem.
The remainder of the paper is structured as follows: The Literature Review section theoretically and empirically reviews the most recent relevant papers. The Research Methodology and Data section clarifies the design of the study and the data sources. The Results section presents the output of the econometric and machine-learning models, along with their relevant analysis, and discusses the findings in relation to existing studies and their practical implications. The Conclusion section concludes the paper with a summary of key insights. Finally, the Limitations and Future Work section demonstrates the limitations and directions for future research.

2. Literature Review

2.1. Previous Studies in Europe

Using panel data from 2002 to 2012 and a sample of European banks, Ref. [12] examined the relationship between income structure and bank profitability to assess how the banks’ practices evolved after the crisis. The results showed that while income diversification has a negative effect on profitability, this effect decreases during times of crisis, suggesting that income diversification is more beneficial when banks anticipate a crisis. Conversely, during normal conditions, it is advantageous to focus more on traditional investments. Utilizing a sample of 1250 banks in the United States and Europe between 2008 and 2016, Ref. [13] employed regression analysis to investigate the impact of income diversification on bank stability and profitability. The findings revealed that income diversification is positively related to bank stability in the USA, whereas in Europe, it has an insignificant effect on both bank stability and profitability. This indicates that European banks prefer to diversify their traditional investments in loans rather than non-interest income-generating investments to enhance profitability and bank stability.
Furthermore, Ref. [11] used the panel smooth-transition regression model to explore the impact of income diversification on bank stability in 114 European commercial banks, utilizing panel data from 2010 to 2019. According to the findings, increasing income diversification through non-traditional banking activities has a detrimental impact on bank stability and financial performance. In addition, Ref. [14] examined the impact of non-interest income, loan, and geographic diversification during the COVID-19 pandemic, using a sample of 56 European banks. The results showed that, in contrast to loan and geographic diversifications, the non-interest income ratio is the only variable that supports enhancing the stability of the European banks during the crisis and pandemics. Ref. [15] employed the system GMM to study the effect of income diversification on bank performance during the COVID-19 period, using a sample of 1231 banks in 90 countries from 2018 to 2021. The findings revealed that income diversification had a positive effect on bank stability in developing and developed countries during the COVID-19 periods, thus confirming the importance of encouraging banks to engage in fee-based, trading, and FOREX activities to absorb the negative effect of the recessionary periods for more stability and growth in the credit markets.

2.2. Previous Studies in the Rest of the World

Ref. [16] argued that non-performing loans (NPLs), defined as loans that have been past due for more than 90 days without payment, are used to gauge a bank’s exposure to credit risk, which is a key factor in determining the stability of the bank. To prevent unforeseen bad debt expenses that could jeopardize bank stability and erode capital, banks must lower the non-performing loan ratio. In this regard, banks must diversify their loan portfolios by lending to businesses in various industries and offering a range of loan products, including credit cards, mortgages, auto loans, personal loans, and commercial and industrial loans. However, this diversification proves unsuccessful during recessions and crises, leading most banks to reallocate a portion of their reserves to non-interest-bearing investments to safeguard their capital against unforeseen losses. In China, Ref. [17] employed the GMM to investigate the effect of income diversification on bank stability, using a sample of 101 Chinese banks with panel data from 2006 to 2016. The results indicated that income diversification has a negative effect on bank stability. The authors argued that the reason behind this negative relationship is that Chinese banks are still at an early stage of non-interest activities and have limited control over them. Additionally, increasing banks’ engagement in non-interest activities could reduce their focus on their core lending business, raising income volatility, particularly given a lack of government supervision. In this respect, Chinese banks are more likely to diversify their traditional loan investments than their non-traditional ones to improve their financial stability.
Additionally, Ref. [6] studied a sample of 200 commercial banks operating in South Asian countries, discovering that income diversification has a positive influence on bank stability, except for fees and commission income activities, which have a negative impact on bank stability, implying that not all non-interest activities are beneficial to the financial health of the South Asian banking system. Additionally, Ref. [18] employed a fixed-effect model, using a sample of commercial banks from Malaysia, and the results revealed that income diversification enhances the financial performance of the banks. Furthermore, Ref. [4] used fixed-effect models and GMMs to investigate the impact of income diversification on bank stability by taking a sample of 169 BRICS commercial banks from 2001 to 2015. The findings showed that the income diversification of large-sized banks positively affected bank performance.
In contrast, the small-sized ones had a negative effect, which provided better insights to regulators that income diversification is not favorable for all. Ref. [19] used a sample from Indian banks and confirmed the results of [4]. Furthermore, Ref. [20] discovered that Islamic banks in the Gulf Cooperation Council (GCC) nations prefer income diversification to increase bank stability. In contrast, the results of conventional banks showed that income diversification positively affected non-performing loans (NPLs) and negatively impacted Z-score. This suggests that the conventional banks in the GCC would become less stable if they relied too much on income diversification from fee-based and off-balance-sheet activities. Additionally, Ref. [7] used multivariate regression on a sample of commercial banks operating in 34 countries that are members of the Organization for Economic Co-Operation and Development (OECD), using unbalanced panel data from 2002 to 2012. The results showed that while a moderate increase in income diversification can improve bank stability, excessive diversification, particularly during a crisis, can worsen stability.
This highlights the significance of traditional investment concentration in loan and deposit investments during a crisis rather than engaging in non-interest-bearing investments to stabilize bank operations. Moreover, Ref. [21] used the GMM to study the effect of income diversification on bank stability, using a sample of Vietnamese commercial banks from 2006 to 2015. The findings revealed that relying on fee-based activities rather than traditional loan investments can reduce bank stability. In addition, Ref. [22] employed the GMM to explore how income diversification affects bank performance in Sub-Saharan banks by conducting a comparative study among emerging, regional, and global banks. The findings showed that income diversification enhanced bank performance in global and emerging markets compared to the regional African and domestic banks, thus demonstrating the importance of adopting diversification in bank investments to stabilize their financials. Further, Ref. [23] used the GMM on a sample of 48 banks operating in India to investigate how bank diversification influences bank stability, with geographic, loan portfolio, and functional diversifications as dependent variables in the study. The findings demonstrated that all levels of bank diversification have a positive impact on stabilizing bank performance, arguing that more engagement in different alternative investments would reduce the overall risk and increase the stability of the banks.
In addition, Ref. [23] used a regression model to investigate the impact of income diversification on bank stability from 2008 to 2017 using a sample of Tunisian commercial banks. The results demonstrated that income diversification significantly and favorably affects bank stability, suggesting that greater income diversification will raise the Z-score, indicating improvement in bank stability. Furthermore, Ref. [24] examined a sample of 45 African commercial banks from 2000 to 2020 and discovered that income diversification improves bank stability, whereas excessive diversity diminishes it. He also discovered that larger liquidity and interest margins, as well as increased operational inefficiencies, had a negative influence on bank stability. In contrast, GDP and inflation have a significant impact on banks’ financial health. Further, Ref. [10] used panel data from 2002 to 2019, using GMM to examine the effects of income and asset diversification on bank stability in the United States commercial banks. Bank stability is positively impacted by assets and funding diversification, in turn encouraging banks to increase their traditional lending investments. On the other hand, they discovered that revenue diversification adversely affects bank stability, thus making banks’ financial issues worse. Ref. [25] contends that income diversification is favorable to bank stability; however, excessive diversification could negatively affect bank stability in the African markets.
Furthermore, Ref. [10] used panel data from 2012 to 2021 and a sample from MENA countries to examine the effect of income diversification on bank stability using the fixed-effect regression model. The results showed that income diversification significantly and positively affects bank stability. Additionally, Refs. [16,26] used a sample of Egyptian commercial banks with panel data from 2011 to 2020 to examine the impact of macroeconomic and bank-specific factors on bank stability. They found that bank-specific factors had a greater impact on corporate credit risk than on retail credit risk; additionally, income diversification was found to have an insignificant effect on retail and corporate NPLs. According to Ref. [5], Zimbabwean commercial banks have a low income-diversification ratio because they rely heavily on loans and neglect investments in off-balance-sheet and fee-based income activities, which makes these banks susceptible to high systemic risk. To investigate the impact of income diversification on bank performance, the authors used modified OLS and difference GMM. The results showed that income diversification has a positive effect on banks’ ROE, highlighting the necessity for Zimbabwean banks to alter their revenue strategy by increasing their level of diversification to improve stability and growth. Additionally, Ref. [16] used a sample of 271 commercial banks operating in the MENA countries from 2009 to 2020 to examine the impact of asset and income diversification on bank stability using the two-step GMM. They additionally investigated how political stability influences the relationship between diversification and bank stability.
The results showed that while income and asset diversification contribute to bank stability, a greater proportion of non-interest-income compared to interest-income activities has a negative impact on the benefits of asset diversification, and political stability undermines bank stability, in turn reducing the benefits of investment diversification. Furthermore, the advantages of diversification differ depending on the size and market power of banks. This indicates that larger banks may use diversification to lower systemic risk more effectively than smaller ones, which are more vulnerable to systemic risks.

3. Lack of Literature

After reviewing the literature, particularly on European countries, we conclude that there is no clear evidence on the relationship between income diversification and bank stability after the COVID-19 period. Additionally, to the best of the researchers’ knowledge, only a limited number of studies have been conducted on Europe after COVID-19. Further, the researchers noticed that income diversification has struggled recently, as shown in Figure 1, alongside a decline in bank stability as measured by the Z-score, as shown in Figure 2.

4. Research Methodology and Data

This study investigates the effect of income diversification on bank stability using an unbalanced dataset comprising 572 observations from 26 European countries over the period from 2000 to 2021. Further, the frequency of the data is on an annual basis. The dataset has aggregated banking performance indicators at the country-year level. It does not include individual bank-level entries. Consequently, it is not possible to directly determine the number of banks per country from this dataset. The countries included in the analysis are Germany, the United Kingdom, France, Italy, Austria, Belgium, Croatia, Cyprus, the Czech Republic, Denmark, Sweden, Estonia, Finland, Greece, Hungary, Ireland, Bulgaria, Latvia, the Netherlands, Poland, Malta, Portugal, Slovakia, Slovenia, Lithuania, and Luxembourg. Data were obtained from the World Bank’s Global Financial Development Database (GFDD) and the World Development Indicators (WDI).
To capture the causal relationship between income diversification and bank stability, we employed the System Generalized Method of Moments (System GMM) estimator. This approach was chosen over Difference GMM because of its superior efficiency in exploiting additional moment conditions and minimizing potential bias from weak instruments. Additionally, fixed- and random-effects models were estimated as robustness checks. The econometric analysis was carried out in R, version 4.5.0. Bank stability was proxied using the Z-score (a measure of insolvency risk), the non-performing loan (NPL) ratio, and the capital adequacy ratio (CAR), consistent with prior studies [11,20,27]. To ensure the quality and consistency of the panel dataset, several preprocessing steps were conducted before the econometric models were estimated. Missing values were addressed using listwise deletion, and outliers were identified using the interquartile range (IQR) and eliminated. A log transformation was applied to increase normality and stabilize variance in variables with skewed distributions. Stationarity was tested, and non-stationarity was addressed by taking the first lag. Additionally, lagged values of the endogenous variables were produced.
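The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors’ pipeline (their econometric work was done in R): the field names (`country`, `year`, `npl`), the 1.5 × IQR fences, and the toy records are all assumptions made for the example.

```python
import math
import statistics

def listwise_delete(rows, fields):
    """Drop any record with a missing (None) value in one of the fields."""
    return [r for r in rows if all(r.get(f) is not None for f in fields)]

def iqr_filter(rows, field, k=1.5):
    """Keep records whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles([r[field] for r in rows], n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in rows if low <= r[field] <= high]

def add_log(rows, field):
    """Add a log-transformed copy of a strictly positive variable."""
    return [{**r, f"log_{field}": math.log(r[field])} for r in rows]

def add_lag(rows, field):
    """Add the previous year's value within each country (None if absent)."""
    by_key = {(r["country"], r["year"]): r[field] for r in rows}
    return [
        {**r, f"lag_{field}": by_key.get((r["country"], r["year"] - 1))}
        for r in rows
    ]

# Toy country-year records (invented values, NPL ratio in percent).
records = [
    {"country": "A", "year": 2000, "npl": 2.0},
    {"country": "A", "year": 2001, "npl": 2.5},
    {"country": "A", "year": 2002, "npl": None},   # missing value
    {"country": "A", "year": 2003, "npl": 3.0},
    {"country": "B", "year": 2000, "npl": 3.5},
    {"country": "B", "year": 2001, "npl": 4.0},
    {"country": "B", "year": 2002, "npl": 50.0},   # outlier
    {"country": "B", "year": 2003, "npl": 4.5},
]

clean = listwise_delete(records, ["npl"])
clean = iqr_filter(clean, "npl")
clean = add_log(clean, "npl")
clean = add_lag(clean, "npl")
```

Note that dropping a country-year also removes the lag for the following year, which is one reason an unbalanced panel loses observations once lagged regressors are introduced.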
In parallel, this study implemented two machine-learning algorithms, Random Forest and Support Vector Machine (SVM), to complement the econometric analysis with predictive modeling, using Python 3.13.3 in the PyCharm IDE. These models were trained on the same macro-financial dataset to classify bank distress, defined using a binarized Z-score threshold. Random Forest combines the outputs of multiple decision trees to capture non-linear relationships and reduce overfitting through ensemble averaging. SVM was selected for its ability to operate effectively in high-dimensional feature spaces and to identify optimal hyperplanes for class separation. Both models were evaluated based on accuracy, precision, recall, and F1-score, and cross-validation was employed to enhance generalization. The research formulates the following hypotheses.
H1: Income diversification significantly affects bank stability in Europe.
H1a: Income diversification significantly affects bank Z-score in Europe.
H1b: Income diversification significantly affects bank credit risk in Europe.
H1c: Income diversification significantly affects bank insolvency risk in Europe.
This dual-method approach—econometric modeling for inference and machine learning for prediction—provides a comprehensive framework to assess the implications of income diversification for bank stability, while accounting for potential endogeneity, heterogeneity, and non-linear dependencies.
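\[ \Delta \mathrm{ZScore}_{it} = \alpha_i + \beta_1 \mathrm{DIV} + \beta_2 \mathrm{CON} + \beta_3 \mathrm{EFF} + \beta_4 \mathrm{SMR} + \beta_5 \mathrm{SPV} + \beta_6 \mathrm{PROF} + \beta_7 \mathrm{GDP} + \beta_8 \mathrm{INF} + \beta_9 \mathrm{UNEMP} + e_{it} \]
\[ \Delta \mathrm{NPL}_{it} = \alpha_i + \beta_1 \mathrm{DIV} + \beta_2 \mathrm{CON} + \beta_3 \mathrm{EFF} + \beta_4 \mathrm{SMR} + \beta_5 \mathrm{SPV} + \beta_6 \mathrm{PROF} + \beta_7 \mathrm{GDP} + \beta_8 \mathrm{INF} + \beta_9 \mathrm{UNEMP} + e_{it} \]
\[ \Delta \mathrm{CAR}_{it} = \alpha_i + \beta_1 \mathrm{DIV} + \beta_2 \mathrm{CON} + \beta_3 \mathrm{EFF} + \beta_4 \mathrm{SMR} + \beta_5 \mathrm{SPV} + \beta_6 \mathrm{PROF} + \beta_7 \mathrm{GDP} + \beta_8 \mathrm{INF} + \beta_9 \mathrm{UNEMP} + e_{it} \]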
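To make the role of the country intercept in the specifications above concrete, the sketch below applies the within (demeaning) transformation that removes it and then estimates a slope by OLS. It is only an illustration of the fixed-effects idea: it uses a single regressor standing in for DIV, synthetic noise-free data, and hypothetical names, and it is not the System GMM estimator used in the paper, which additionally instruments lagged endogenous variables.

```python
from collections import defaultdict

def within_slope(panel):
    """Fixed-effects (within) slope for y_it = alpha_i + beta * x_it + e_it.

    panel: iterable of (unit, x, y) tuples; x must vary within at least one unit.
    """
    groups = defaultdict(list)
    for unit, x, y in panel:
        groups[unit].append((x, y))

    sxy = sxx = 0.0
    for obs in groups.values():
        # Demeaning by unit removes the unit-specific intercept alpha_i.
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx

# Synthetic panel: three countries with different intercepts, true beta = 0.5.
alphas = {"DE": 10.0, "FR": -3.0, "IT": 4.0}
xs = {"DE": (30.0, 40.0, 50.0), "FR": (50.0, 60.0, 70.0), "IT": (10.0, 20.0, 30.0)}
panel = [(c, x, alphas[c] + 0.5 * x) for c in alphas for x in xs[c]]

beta_hat = within_slope(panel)
```

Because each country is compared only with its own average, time-invariant country characteristics drop out of the estimate, which is the property the fixed-effects robustness check relies on.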
Additionally, the formulas used in the machine-learning algorithms are as follows:
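\[ \mathrm{Precision} = \frac{TP}{TP + FP} \]
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \]
\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \]
\[ \mathrm{F1\ Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \]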
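As a worked illustration of the four evaluation formulas, the following sketch computes them from a list of true and predicted labels. The function name and the toy labels are hypothetical, and returning 0.0 when a denominator is zero is an assumption of this sketch rather than something specified in the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and the derived metrics."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs)
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "TP": tp, "TN": tn, "FP": fp, "FN": fn,
        "precision": precision, "recall": recall,
        "accuracy": accuracy, "f1": f1,
    }

# Toy labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

With these toy labels there are 4 true positives, 2 true negatives, 1 false positive, and 1 false negative, so precision, recall, and F1 are each 0.8 and accuracy is 0.75.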
Table 1 presents the dependent and independent variables used in the paper, together with their measurements. The paper has three dependent variables (Z-score, NPL, and CAR) that proxy for bank stability, and nine independent variables. DIV, CON, EFF, SMR, SPV, and PROF represent the bank-specific data, while GDP, INF, and UNEMP represent the macroeconomic data.
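For orientation, the bank Z-score is conventionally defined in this literature as the sum of return on assets and the equity-to-assets ratio, scaled by the standard deviation of return on assets. The expression below assumes that convention; the exact construction used in this paper is the one given in Table 1.

\[ Z = \frac{\mathrm{ROA} + E/A}{\sigma(\mathrm{ROA})} \]

Under this definition, a higher Z-score means that a larger fall in returns, measured in standard deviations, would be needed to exhaust the bank’s equity.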

5. Results

5.1. Comparison Between Income Diversification and Bank Z-Score in Europe During 2000–2021

In this section, the paper plots the historical movement of income diversification and banks’ Z-score for 26 European countries during 2000–2021, as shown in Figure 1 and Figure 2. We noticed that the Z-score decreased while income diversification increased, which leaves the relationship unclear and calls for further study of the role of income diversification in enhancing bank stability. This research aims to better advise regulators and bankers on how to manage income for a consistent level of profitability that supports the growth of the European economies.

5.2. Descriptive Analysis

The paper conducted a descriptive analysis of the collected data in terms of mean, standard deviation (STDEV), minimum (MIN), and maximum (MAX), as shown in Table 2. The average Z-score is 13.78, which indicates that most banks in Europe are well capitalized and have stable earnings, and are therefore stable. The mean NPL ratio is 5.72%, indicating that European commercial banks face a moderate level of credit risk exposure of almost 6%, while its STDEV of 6.97% indicates a moderate level of volatility. Furthermore, banks in Europe hold more capital than necessary in order to absorb unforeseen losses in their portfolios, as evidenced by the CAR’s mean of 16.19%, which is significantly higher than the minimum regulatory capital requirement. Additionally, the CAR’s STDEV of 4.51 indicates some stability in the CAR level. Moreover, the DIV’s mean of 41.77% shows that non-interest income accounts for 41.77% of the total income generated by European banks.
Furthermore, the average EFF is 58.90%, signifying that, on average, total expenses in European banks equal 58.90% of total income. In addition, the SMR has the highest STDEV, showing that European stock market indices are highly volatile around their mean of 6.27%, followed by the CON, ROE, and EFF, with STDEVs of 16.75%, 13.87, and 12.09, respectively. Additionally, Europe’s average ROA is 0.62%, whereas the USA and UAE each have an average of 1.6%. The European GDP growth rate is 2.43%, the inflation rate is 2.24%, and the unemployment rate is 4.70%. All these figures appear to be normal, and the dataset shows no anomalies. Nonetheless, the data show that DIV, NPL, CAR, and EFF are high, and that the STDEV of the Z-score is also high, highlighting the importance of researching how income diversification affects bank stability to give regulators and bankers better insights for improved financial and economic outcomes.

5.3. Regression Results and Discussion

As shown in Table 3 and Table 4, this study tested the hypotheses of the gathered data using the GMM and fixed-effect models. The p-value of the Sargan test for all the GMMs exceeds 0.05, as indicated in Table 3, which suggests that the instruments are valid and not correlated with the error term. Additionally, the p-values for the autocorrelation are above 0.05, meaning that there is no significant autocorrelation in the residuals. Moreover, the p-values of the Wald test are less than 0.05, demonstrating that the tested coefficients are significant, indicating that the chosen independent variables significantly impact the NPL, CAR, and Z-score. Accordingly, the models are robust and reliable.
The findings of the GMM, specifically the NPL model, show that DIV had a negative impact on NPL, suggesting that enhancing income diversification can reduce reliance on traditional investments, thereby minimizing unexpected credit risk exposure and achieving greater stability in banks. In addition, ROA was found to be significant and negatively associated with NPL, indicating that banks with low profit margins are more likely to engage in risky investments, leading to a higher level of NPL. In this regard, low-profit-margin European banks should be more cautious in managing their portfolios to avoid unexpected insolvency risks that might threaten their survival and growth in the credit markets. Moreover, EFF had a significant negative impact on NPL, confirming the need to keep expenses under control in relation to income to better mitigate credit risk exposure and increase operational stability. Furthermore, the CAR model under both GMM and fixed effects shows that DIV is significantly positive. This suggests that increases in income diversification enhance profitability, which is absorbed by the CAR, improving the level of solvency in banks and achieving greater stability. Therefore, the results support H1b and H1c and are consistent with the findings of [1,4,10,14,18,19].
The SMR had a negative impact on CAR, suggesting that lower stock market index levels indicate larger economic issues that further strain bank capital as corporate borrowers experience financial difficulties, compelling banks to boost their capital buffers to protect against insolvency risk. Moreover, the findings of the Z-score model under GMM and fixed effects revealed that DIV and ROA are statistically significant and positively related to the Z-score, demonstrating the importance of increasing income diversification to enhance bank stability. In other words, this suggests that the advantages of income diversification and superior risk management techniques enable large banks to manage their portfolios better, findings which are consistent with the studies of [2,3,5,14,18,19,26]. Therefore, the findings support H1a.
On the other hand, the findings regarding the macroeconomic variables were significant, showing that GDP has a negative association with NPL and CAR. This indicates that during periods of economic growth, borrowers have better capacity to meet their obligations to banks, thus reducing the level of NPL. Additionally, an increase in the unemployment rate raises NPL levels, posing a threat to bank stability. Moreover, inflation is negatively associated with bank stability, meaning that increases in inflation deteriorate borrowers’ repayment capacity, heightening credit risk and insolvency risk, whereas lower inflation levels enhance bank stability. In this regard, European banks should carefully monitor macroeconomic indicators to respond appropriately and protect their solvency from potential threats.

5.4. Machine-Learning Results and Discussion

In this section, the paper uses the machine-learning algorithm to study the relationship between income diversification (DIV) and bank stability (Z-score binary), and to predict financial distress in banks in Europe. The paper utilized a binary classification approach to classify the Z-score above 3 as a non-distressed bank, while the Z-score below 3 is a distressed bank. To do so, the paper used a threshold of 3, which is used in the financial-distress prediction models. The threshold confirms that banks with strong financial health are categorized as non-distressed banks, while those at higher risk of bankruptcy are classified as distressed banks. In this regard, this binary classification approach simplifies decision-making, providing a clear-cut distinction between stable banks and troubled ones. Therefore, this paper used two types of machine-learning algorithms: Random Forest and SVM. Random Forest was adopted for its ability to handle large, complex datasets; manage non-linear relationships; provide robust predictions by aggregating the results of multiple decision trees. As a result, it reduces the risk of overfitting and enhances model generalization. On the other side, SVM is selected for its ability to work effectively in high-dimensional spaces and its ability to find the optimal hyperplane that separates the data points into distinct classes. These algorithms were trained on the financial metric DIV and evaluated based on their ability to predict the Z-score binary classification. Metrics such as accuracy, F1-score, and confusion matrix components (True Positives, True Negatives, False Positives, and False Negatives) were used to assess the models’ performance. Furthermore, the use of a Z-score threshold of 3 is consistent with industry practices for financial-distress prediction, making the binary classification approach both practical and interpretable. 
By comparing the performance of Random Forest and SVM, we aimed to determine which model best captures data patterns and provides the most reliable predictions for financial distress detection.
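The labeling rule described above can be sketched in a few lines. This is only an illustration, not the authors' code: the function name `label_z_score` and the sample values are hypothetical, coding non-distressed banks as 1 is an assumption inferred from the discussion of Figure 6, and the treatment of a Z-score exactly equal to 3 is also an assumption, since the text only specifies "above" and "below".

```python
from typing import List

Z_THRESHOLD = 3.0  # threshold stated in the text


def label_z_score(z_scores: List[float], threshold: float = Z_THRESHOLD) -> List[int]:
    """Return 1 (non-distressed) when the Z-score exceeds the threshold, else 0.

    Assumption: a Z-score exactly equal to the threshold is labeled 0,
    because the paper does not say how ties are handled.
    """
    return [1 if z > threshold else 0 for z in z_scores]


# Hypothetical Z-scores, for illustration only.
sample = [13.78, 2.4, 57.44, -0.33, 3.0]
labels = label_z_score(sample)
print(labels)  # [1, 0, 1, 0, 0]
```

With a sample mean Z-score well above 3, a rule of this kind is likely to place most observations in class 1, which is one way to see why class imbalance becomes a concern for the classifiers.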
In this section, our paper uses the feature importances of the Random Forest, as shown in Figure 3, to illustrate how useful income diversification and the control variables are as features and how much each contributes to the dependent variable, the Z-score. The results illustrate that income diversification has a positive effect on bank stability, a finding consistent with the regression results, which suggest that European banks should increase their diversification into fee-based income activities to raise bank-stability levels. Additionally, income diversification has a more positive impact on the Z-score than CAR, ROE, ROA, OPEFF, GDP, and stock market return. On the other hand, the findings also reveal that NIM, NPL, UEMP, INF, and CONS have a negative effect on the Z-score, a finding supported by the regression results.
In addition to the standard feature importance provided by the Random Forest model, we implemented SHAP (SHapley Additive exPlanations) analysis to gain a deeper understanding of each feature’s marginal contribution to the model’s output. SHAP values offer a consistent and unified measure of feature relevance by averaging the marginal contribution of each variable to a prediction over all possible combinations of the other features. The SHAP summary plot in Figure 4 confirms the critical influence of income diversification (DIV), non-performing loans (NPLs), and macroeconomic variables such as unemployment (UEMP) and inflation (INF). These insights guided our understanding of model behavior and can inform regulators on which indicators are most predictive of financial distress.
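To make the idea of a marginal contribution concrete, the following sketch computes exact Shapley values for a tiny hand-written scoring function by averaging each feature's contribution over every ordering of the features. It is an illustration under stated assumptions: `toy_score`, its coefficients, the instance, and the baseline are all hypothetical and are not estimates from the paper, and SHAP libraries typically use faster approximations or tree-specific algorithms rather than this brute-force enumeration.

```python
from itertools import permutations
from typing import Callable, Dict, List


def shapley_values(
    predict: Callable[[Dict[str, float]], float],
    instance: Dict[str, float],
    baseline: Dict[str, float],
) -> Dict[str, float]:
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    features: List[str] = list(instance)
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)            # start from the baseline input
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # switch this feature to its actual value
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_score(x: Dict[str, float]) -> float:
    """Hypothetical stability score with one interaction term (made-up coefficients)."""
    return 0.1 * x["DIV"] - 0.5 * x["NPL"] - 0.01 * x["NPL"] * x["UEMP"]


instance = {"DIV": 50.0, "NPL": 4.0, "UEMP": 8.0}
baseline = {"DIV": 40.0, "NPL": 6.0, "UEMP": 5.0}
phi = shapley_values(toy_score, instance, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that makes SHAP values comparable across features.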
In Figure 5, the paper uses a scatter plot to visualize the relationship between DIV and the Z-score. The X-axis represents DIV, with values ranging from 0 to 100, while the Y-axis represents the Z-score, with values ranging from 0 to 60. Each blue dot represents one observation, that is, one combination of DIV and Z-score values. Many points are clustered tightly together in one area, while others are spread more widely across the plot, indicating a moderately strong rather than a perfect relationship between income diversification and bank stability. Additionally, the paper plots the two classes of the binary classification separately, as shown in Figure 6: after the classification of the Z-score, a few blue dots are scattered at 0, while most of the dots are clustered at 1, demonstrating that most of the points belong to class 1. This separation illustrates that the DIV value can be used to distinguish between the two classes; the clustering of points at 1 signifies that higher DIV values are associated with Z-score binary 1 and lower DIV values with Z-score binary 0.
This article employed the confusion matrix to evaluate the performance of the Random Forest and SVM classification models, as illustrated in Table 5. The findings reveal that Random Forest and SVM correctly predict 160 and 162 positive observations, respectively, whereas neither model correctly predicts any negative observation (TN = 0), as shown in Figure 7 and Figure 8. Furthermore, the false positive counts are 4 and 4, whereas the false negative counts are 2 and 0, respectively. Accordingly, the high TP values confirm that the models correctly predicted most of the positive class, and the low FP and FN values show that there are few misclassifications overall. However, because TN is 0, neither model recognized any observation of the negative class. Furthermore, accuracy was used to evaluate the share of correct predictions among the total number of cases studied. In this respect, the accuracy scores of Random Forest and SVM are 96.39% and 97.59%, respectively, as shown in Table 5, which indicates a high proportion of correct predictions in both models. Moreover, the F1-score was adopted to evaluate the harmonic mean of precision and recall, providing a single metric that balances both concerns. The F1-scores are 0.98 and 0.99, respectively, meaning that both models balance precision and recall well for the positive class.
To address the issue of class imbalance noted in the initial model evaluation, where both RF and SVM failed to identify any distressed banks (True Negatives = 0), we applied a manual oversampling technique. This rebalanced the dataset by increasing the number of minority class instances. The results, post-rebalancing, show substantial improvement. Random Forest achieved an F1-score of 0.995 and recall of 1.0 for distressed banks, while SVM achieved an F1-score of 0.985 and recall of 1.0.
These results confirm that balancing the data significantly improves the model’s utility as an early warning system. The results show that SVM has a higher accuracy score than Random Forest. Therefore, the results illustrate that income diversification (DIV) has a strong relationship with bank stability measured by the Z-score, and both models can be used to predict the future values of Z-score to improve the prediction level of future bank stability in the banks of Europe.
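As a worked check of the pre-rebalancing figures, the sketch below recomputes accuracy, precision, recall, and F1-score from the confusion matrix counts reported in Table 5. The class name `ConfusionCounts` and the added specificity measure are illustrative choices, not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int
    tn: int
    fp: int
    fn: int

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.tn + self.fp + self.fn)

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of actual negatives that were recognized as negative.
        return self.tn / (self.tn + self.fp)

    def f1(self) -> float:
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r)


# Counts as reported in Table 5 (before rebalancing).
random_forest = ConfusionCounts(tp=160, tn=0, fp=4, fn=2)
svm = ConfusionCounts(tp=162, tn=0, fp=4, fn=0)

for name, cm in (("Random Forest", random_forest), ("SVM", svm)):
    print(name, round(100 * cm.accuracy(), 2), round(cm.f1(), 2), cm.specificity())
```

Both models reproduce the reported accuracy and F1-score, yet their specificity is zero, which illustrates how a high accuracy can coexist with a complete failure on the minority class.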
This paper used a structured modeling process to improve the transparency of the machine-learning pipeline. Initially, the dataset was randomly divided into 80% for training and 20% for testing. The training data were used to train the Random Forest and SVM classifiers, while the held-out test set was used to assess them. To ensure robustness, performance metrics included classification-based indicators such as accuracy, precision, recall, and F1-score, and regression-based metrics, including Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the R² score, as shown in Table 6. These metrics provide a comprehensive view of the models’ predictive accuracy and fit quality.
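A random 80/20 hold-out split of the kind described above can be sketched as follows. The helper name `train_test_split`, the fixed seed, and the 100 placeholder rows are assumptions made for illustration; the paper does not report the seed or the tooling it used.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(
    rows: Sequence[T], test_share: float = 0.2, seed: int = 42
) -> Tuple[List[T], List[T]]:
    """Shuffle the rows with a fixed seed and hold out `test_share` of them."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]


# 100 placeholder observations (hypothetical, not the paper's sample).
rows = list(range(100))
train, test = train_test_split(rows)
print(len(train), len(test))  # 80 20
```

When one class is rare, a purely random split can leave very few minority cases in the test set; a stratified split, which samples each class separately, is a common alternative.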
Table 6 shows the regression-based performance metrics of both Random Forest and SVM models. The Random Forest model yielded lower error rates (RMSE = 0.147; MAE = 0.112) and higher explanatory power (R² = 0.92) than the SVM model (RMSE = 0.163; MAE = 0.124; R² = 0.88). These results further support the suitability of Random Forest for identifying financial distress with lower prediction error.
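For reference, the three regression-based metrics can be computed as in the sketch below. The paper does not state which target and predictions these metrics were computed on, so the toy values here are purely hypothetical (binary outcomes against predicted scores) and the function names are illustrative.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical outcomes and predicted scores, for illustration only.
y_true = [1, 1, 0, 1, 0]
y_pred = [0.9, 0.8, 0.2, 0.7, 0.1]
print(round(rmse(y_true, y_pred), 3), round(mae(y_true, y_pred), 3), round(r2(y_true, y_pred), 3))
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a sense of whether a model's errors are evenly spread or driven by a few large misses.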

5.5. Addressing Class Imbalance and Model Retraining

Machine-learning models are often biased when faced with imbalanced datasets, leading to misleadingly high accuracy while failing to detect minority cases. In our study, both SVM and RF initially failed to identify distressed banks due to dataset imbalance. We addressed this issue using oversampling techniques to rebalance the classes and then retrained the models. The performance metrics significantly improved, especially recall and F1-score for the minority class. Table 7 shows the confusion matrices post-rebalancing. These findings affirm that proper class balancing is essential for building effective early warning systems in banking.
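The rebalancing step can be illustrated with a minimal random oversampling sketch. The exact procedure used in the paper is not described, so duplicating minority rows with replacement is an assumption, as are the function name `oversample_minority`, the seed, and the toy rows.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple

Row = Tuple[float, int]  # (DIV value, class label)


def oversample_minority(rows: List[Row], seed: int = 0) -> List[Row]:
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class: Dict[int, List[Row]] = {}
    for row in rows:
        by_class.setdefault(row[1], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced: List[Row] = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap (k is 0 for the largest class).
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical imbalanced sample: eight rows of class 1 and two rows of class 0.
rows = [(45.0 + i, 1) for i in range(8)] + [(12.0, 0), (18.5, 0)]
balanced = oversample_minority(rows)
print(Counter(label for _, label in balanced))
```

Oversampling is usually applied to the training portion only, after the train-test split, so that duplicated observations do not appear in both the training and the evaluation data.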
To further strengthen the robustness and generalizability of our findings, we conducted an additional evaluation step by applying 10-fold cross-validation to both machine-learning models—Random Forest and Support Vector Machine (SVM). This step was introduced in response to the concern regarding potential overfitting when evaluating models on a single random train–test split. Cross-validation is widely recognized as one of the best practices in predictive modeling because it reduces the variability in performance estimates and provides a more reliable assessment of a model’s true ability to generalize to unseen data.
Table 8 below reports the performance of both models based on averaged metrics across the ten folds. As the cross-validation technique splits the dataset repeatedly into training and validation folds, the individual confusion matrix elements (True Negatives, False Positives, etc.) are not presented; instead, we focus on key aggregate classification metrics (precision, recall, and F1-score) for the positive class (Class 1), corresponding to distressed banks.
The results reveal that the SVM model outperformed the Random Forest model on recall and F1-score under cross-validation, while precision was nearly identical for the two models. Specifically, SVM achieved a perfect recall of 1.00, indicating that it successfully identified all instances of distressed banks across all folds. It also maintained a high precision (0.9801) and a near-perfect F1-score (0.9900), suggesting a strong balance between precision and recall. The Random Forest model also demonstrated excellent performance, with recall at 0.9779, precision at 0.9816, and an F1-score of 0.9796, indicating that it, too, is effective at predicting financial distress with high reliability.
These findings confirm that the strong model performance observed in earlier single-split evaluations was not the result of overfitting or chance. On the contrary, the models—especially the SVM classifier—retain their predictive power even when tested under more rigorous cross-validation protocols. This reinforces the value of income diversification (DIV) as a reliable and interpretable feature for forecasting bank distress and confirms the robustness of our machine-learning framework as a potential early warning system for banking-sector risk.
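The cross-validation procedure can be sketched as follows: the rows are partitioned into k folds, each fold is held out once for validation, and the per-fold metrics are averaged. This is a simplified illustration, not the authors' pipeline: a midpoint-threshold rule on DIV stands in for the Random Forest and SVM classifiers, the data are synthetic, and the example call uses k = 5 rather than 10 to keep the toy folds reasonably large.

```python
from statistics import mean
from typing import Dict, List, Tuple

Row = Tuple[float, int]  # (DIV value, class label)


def k_fold_indices(n_rows: int, k: int) -> List[List[int]]:
    """Assign row i to fold i % k, so every row is validated exactly once."""
    return [list(range(fold, n_rows, k)) for fold in range(k)]


def fit_threshold(train: List[Row]) -> float:
    """Toy stand-in for a classifier: midpoint between the two class means of DIV."""
    mean_0 = mean(div for div, label in train if label == 0)
    mean_1 = mean(div for div, label in train if label == 1)
    return (mean_0 + mean_1) / 2


def score_fold(threshold: float, valid: List[Row]) -> Dict[str, float]:
    """Precision, recall, and F1 for class 1 when predicting 1 if DIV > threshold."""
    tp = sum(1 for div, label in valid if div > threshold and label == 1)
    fp = sum(1 for div, label in valid if div > threshold and label == 0)
    fn = sum(1 for div, label in valid if div <= threshold and label == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def cross_validate(rows: List[Row], k: int = 10) -> Dict[str, float]:
    """Train on k-1 folds, validate on the remaining fold, and average the metrics."""
    scores = []
    for valid_idx in k_fold_indices(len(rows), k):
        held_out = set(valid_idx)
        train = [row for i, row in enumerate(rows) if i not in held_out]
        valid = [rows[i] for i in valid_idx]
        scores.append(score_fold(fit_threshold(train), valid))
    return {name: mean(s[name] for s in scores) for name in scores[0]}


# Synthetic, deterministic data: the label is 1 when DIV is at least 35.
rows = [(20.0 + i, 1 if i >= 15 else 0) for i in range(50)]
averaged = cross_validate(rows, k=5)
print(averaged)
```

Because every observation is used for validation exactly once, the averaged metrics depend less on one lucky or unlucky split than a single hold-out estimate does.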

6. Conclusions

This paper aimed to study the relationship between income diversification and bank stability in the European banking sector by employing regression and machine-learning approaches, providing a comprehensive analysis that addresses the ongoing debate about the pros and cons of income diversification in banks. Moreover, the paper intended to enhance the predictive capabilities of bankers and to offer better insights for regulators, potentially leading to more effective regulations that can help control the financial-distress exposure of European banks. This paper used GMM and fixed-effect models to examine the impact of income diversification, along with several control variables, on bank stability, as measured by Z-score, NPL ratio, and CAR. The robustness checks confirmed that these regression models are reliable econometric tools for testing the hypotheses. The regression findings revealed that increasing income diversification among banks can enhance bank stability. In addition, the research employed Random Forest and SVM as machine-learning algorithms to create predictive models of bank stability for European banks. The results indicate that SVM has a higher accuracy score than Random Forest. Although the original models demonstrated high accuracy, their inability to identify distressed banks rendered them ineffective as early warning systems. After rebalancing the dataset using oversampling techniques, both models, particularly Random Forest, showed strong performance in detecting distress, with performance levels exceeding 95%. This underscores the importance of addressing class imbalance in predictive modeling to ensure robustness and fairness in distress prediction.
To reinforce the validity of the machine-learning findings, we applied a 10-fold cross-validation procedure to both the Random Forest and SVM models. The results confirmed the reliability of the models under repeated resampling, with the SVM model achieving perfect recall and high F1-score across folds. This additional step demonstrates the robustness and generalizability of our predictive models and further supports the practical relevance of income diversification as an early warning indicator of financial distress in European banks.
This finding emphasizes the value of utilizing the SVM model to forecast future movements in bank stability, serving as an early indicator for bankers to adopt proactive and precautionary strategies that can aid their survival and growth, thereby continuing their effective intermediation role in supporting the growth of European economies, especially when guided by data-driven tools validated through rigorous cross-validation procedures.

7. Limitations and Future Work

A key limitation of the original machine-learning application was the significant class imbalance, which led to zero detection of distressed banks in the initial results. This undermines the reliability of the model in real-world early warning systems. Future work should explore more advanced techniques, such as SMOTE, ensemble rebalancing, and cost-sensitive learning, to enhance model reliability and fairness. Additionally, alternative performance metrics—such as the AUC-ROC curve, recall, and precision–recall curve—should be prioritized over accuracy, particularly when evaluating models on imbalanced datasets. Lastly, this paper focused on European banks, where regulatory frameworks are relatively strong and harmonized, which justifies the positive relationship between income diversification and bank stability. However, the paper did not explicitly model regulatory environments. Therefore, it suggests that future research should explore how the regulatory environment moderates the relationship between income diversification and bank stability in European countries to provide clearer insights for regulators. These improvements would ensure more robust predictions and actionable insights for financial regulators and banking institutions.

Author Contributions

K.F., conceptualization, formal analysis, and writing—original draft preparation; L.A., data curation and methodology; N.C.M., validation, and writing—review and editing; R.L., resources, and writing—review and editing; A.M., data curation, and writing—review and editing; N.K., investigation, visualization, and writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data can be requested from the corresponding author.

Acknowledgments

We acknowledge Berlin School of Business and Innovation for providing a conducive environment for this research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Lahouel, B.B.; Taleb, L.; Kossai, M. Nonlinearities between Bank Stability and Income Diversification: A Dynamic Network Data Envelopment Analysis Approach. Expert Syst. Appl. 2022, 207, 117776. [Google Scholar] [CrossRef]
  2. Velasco, P. Is Bank Diversification a Linking Channel between Regulatory Capital and Bank Value? Br. Account. Rev. 2022, 54, 101070. [Google Scholar] [CrossRef]
  3. De Meo, E.; De Nicola, A.; Lusignani, G.; Orsini, F.; Zicchino, L. European Banks in the XXI Century: Are Their Business Models Sustainable? In Proceedings of the 5th EBA Policy Research Workshop “Competition in Banking: Implications for Financial Regulation and Supervision”, London, UK, 28–29 November 2016; pp. 28–29. [Google Scholar]
  4. Sharma, S.; Anand, A. Income Diversification and Bank Performance: Evidence from BRICS Nations. Int. J. Product. Perform. Manag. 2018, 67, 1625–1639. [Google Scholar] [CrossRef]
  5. Dzingirai, C.; Dzingirai, M. Threshold Effect of Non-Interest Income Disaggregates on Commercial Banks’ Financial Performance in Zimbabwe. Heliyon 2024, 10, e31379. [Google Scholar] [CrossRef]
  6. Nisar, S.; Peng, K.; Wang, S.; Ashraf, B.N. The Impact of Revenue Diversification on Bank Profitability and Stability: Empirical Evidence from South Asian Countries. Int. J. Financ. Stud. 2018, 6, 40. [Google Scholar] [CrossRef]
  7. Kim, H.; Batten, J.A.; Ryu, D. Financial Crisis, Bank Diversification, and Financial Stability: OECD Countries. Int. Rev. Econ. Financ. 2020, 65, 94–104. [Google Scholar] [CrossRef]
  8. Zouaoui, H.; Zoghlami, F. What Do We Know about the Impact of Income Diversification on Bank Performance? A Systematic Literature Review. J. Bank. Regul. 2023, 24, 286–309. [Google Scholar] [CrossRef]
  9. Karkowska, R. Diversification of Banking Activity and Its Importance in Building Financial Stability. In Global Versus Local Perspectives on Finance and Accounting, Proceedings of the 19th Annual Conference on Finance and Accounting (ACFA 2018), Prague, Czech Republic, 25 May 2018; Springer: Berlin/Heidelberg, Germany, 2019; pp. 79–88. [Google Scholar]
  10. Adem, M. Impact of Income Diversification on Bank Stability: A Cross-Country Analysis. Asian J. Account. Res. 2023, 8, 133–144. [Google Scholar] [CrossRef]
  11. Ben Lahouel, B.; Taleb, L.; Kočišová, K.; Ben Zaied, Y. The Threshold Effects of Income Diversification on Bank Stability: An Efficiency Perspective Based on a Dynamic Network Slacks-Based Measure Model. Ann. Oper. Res. 2023, 330, 267–304. [Google Scholar] [CrossRef]
  12. Maudos, J. Income Structure, Profitability and Risk in the European Banking Sector: The Impact of the Crisis. Res. Int. Bus. Financ. 2017, 39, 85–101. [Google Scholar] [CrossRef]
  13. Rossi, S.; Dreassi, A.; Borroni, M.; Paltrinieri, A. Does Revenue Diversification Still Matter in Banking? Evidence from A Cross-Country Analysis. J. Financ. Manag. Mark. Inst. 2020, 8, 2050003. [Google Scholar] [CrossRef]
  14. Simoens, M.; Vander Vennet, R. Does Diversification Protect European Banks’ Market Valuations in a Pandemic? Financ. Res. Lett. 2022, 44, 102093. [Google Scholar] [CrossRef]
  15. Ho, T.H.; Nguyen, D.T.; Luu, T.B.; Le, T.D.; Ngo, T.D. Bank Performance during the COVID-19 Pandemic: Does Income Diversification Help? J. Appl. Econ. 2023, 26, 2222964. [Google Scholar] [CrossRef]
  16. Farag, K.; Kassem, T.; Ramzy, Y. The Crucial Macroeconomic and Microeconomic Determinants of Retail and Corporate Credit Risks. J. Account. Financ. Audit. Stud. (JAFAS) 2023, 9, 148–161. [Google Scholar] [CrossRef]
  17. Wang, C.; Lin, Y. The Influence of Income Diversification on Operating Stability of the Chinese Commercial Banking Industry. Rom. J. Econ. Forecast. 2018, 21, 38. [Google Scholar]
  18. Brahmana, R.; Kontesa, M.; Gilbert, R.E. Income Diversification and Bank Performance: Evidence from Malaysian Banks. Econ. Bull. 2018, 38, 799–809. [Google Scholar]
  19. Vidyarthi, H. Dynamics of Income Diversification and Bank Performance in India. J. Financ. Econ. Policy 2020, 12, 383–407. [Google Scholar] [CrossRef]
  20. Abuzayed, B.; Al-Fayoumi, N.; Molyneux, P. Diversification and Bank Stability in the GCC. J. Int. Financ. Mark. Inst. Money 2018, 57, 17–43. [Google Scholar] [CrossRef]
  21. Le, T.D. Geographic Expansion, Income Diversification, and Bank Stability: Evidence from Vietnam. Cogent Bus. Manag. 2021, 8, 1885149. [Google Scholar] [CrossRef]
  22. Addai, B.; Tang, W.; Agyeman, A.S. Examining the Impact of Income Diversification on Bank Performance: Are Foreign Banks Heterogeneous? J. Appl. Econ. 2022, 25, 1–21. [Google Scholar] [CrossRef]
  23. Chandramohan, K.; Lunawat, C.D.; Lunawat, C.A. The Impact of Diversification on Bank Stability in India. Cogent Bus. Manag. 2022, 9, 2094590. [Google Scholar] [CrossRef]
  24. Alouane, N.; Kahloul, I.; Grira, J. The Trilogy of Ownership, Income Diversification, and Performance Nexus: Empirical Evidence from Tunisian Banks. Financ. Res. Lett. 2022, 45, 102180. [Google Scholar] [CrossRef]
  25. Abbas, F.; Ali, S. Dynamics of Diversification and Banks’ Risk-taking and Stability: Empirical Analysis of Commercial Banks. Manag. Decis. Econ. 2022, 43, 1000–1014. [Google Scholar] [CrossRef]
  26. Shabir, M.; Jiang, P.; Shahab, Y.; Wang, W.; Işık, O.; Mehroush, I. Diversification and Bank Stability: Role of Political Instability and Climate Risk. Int. Rev. Econ. Financ. 2024, 89, 63–92. [Google Scholar] [CrossRef]
  27. Shahriar, A.; Mehzabin, S.; Azad, M.A.K. Diversification and Bank Stability in the MENA Region. Soc. Sci. Humanit. Open 2023, 8, 100520. [Google Scholar] [CrossRef]
Figure 1. Income diversification ratio in Europe during 2000–2021 (source: World Bank’s Global Financial Development Database (GFDD)).
Figure 2. Bank Z-score in Europe during 2000–2021 (source: World Bank’s Global Financial Development Database (GFDD)).
Figure 3. Importance features.
Figure 4. SHAP values.
Figure 5. Scatter plot between DIV and Z-score.
Figure 6. Scatter plot of before classification and after classification between DIV and Z-score.
Figure 7. Confusion matrix of Random Forest.
Figure 8. Confusion matrix of SVM.
Table 1. Definition of variables and measurements.
| Variables | Measurements |
|---|---|
| Dependent variables: | |
| Z-score | (ROA + equity to assets ratio)/STDEV of ROA |
| Non-performing loan (NPL) ratio | NPLs/total gross loans |
| Capital adequacy ratio (CAR) | Total equity/total assets |
| Independent variables: | |
| Income diversification (DIV) | Non-interest income/total income |
| Concentration risk (CON) | Assets of the three largest banks/total assets of all banks |
| Operating efficiency (EFF) | Total expenses/total income |
| Stock market return (SMR) | (End of period market capitalization − beginning of period market capitalization)/end of period market capitalization |
| Stock price volatility (SPV) | Square root of (sum of (daily return minus average daily return) squared divided by number of days) |
| Profitability (PROF) | ROA, ROE, and NIM |
| Economic growth (GDP) | Real GDP growth rate |
| Inflation (INF) | (CPI in current year − CPI in previous year)/CPI in previous year |
| Unemployment (UNEMP) | Number of unemployed people/total labor force |
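Table 2. Descriptive analysis.
| Variables | Mean | STDEV | Min | Max |
|---|---|---|---|---|
| Z-score | 13.78 | 9.25 | −0.33 | 57.44 |
| NPL | 5.72 | 6.97 | 0.10 | 47.75 |
| CAR | 16.19 | 4.51 | 7.00 | 35.65 |
| DIV | 41.77 | 12.96 | 7.39 | 82.49 |
| CON | 71.63 | 16.75 | 28.56 | 100.00 |
| EFF | 58.90 | 12.09 | 14.75 | 97.17 |
| SMR | 6.72 | 24.84 | −74.56 | 124.98 |
| SPV | 20.34 | 8.85 | 6.33 | 61.52 |
| ROA | 0.62 | 1.06 | −9.53 | 4.36 |
| ROE | 7.41 | 13.87 | −117.67 | 37.46 |
| NIM | 2.11 | 1.18 | 0.18 | 5.79 |
| GDP | 2.43 | 3.93 | −14.84 | 24.48 |
| INF | 2.24 | 2.09 | 4.45 | 15.40 |
| UNEMP | 4.70 | 2.66 | 0.996 | 20.86 |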
Table 3. The results of the GMM.
| Variables | NPL: estimate | NPL: p-value | CAR: estimate | CAR: p-value | Z-score: estimate | Z-score: p-value |
|---|---|---|---|---|---|---|
| Lags in dep. var. | 0.6617440 *** | 0.0000 | 0.9288025 *** | 0.0000 | 0.606589 *** | 0.0000 |
| DIV | −0.0600427 * | 0.054423 | 0.0190250 ** | 0.0139341 | 0.064441 ** | 0.02702 |
| CON | −0.0142226 | 0.503962 | 0.01492686 ** | 0.0143470 | −0.01656 | 0.82612 |
| EFF | −0.0602007 ** | 0.020025 | 0.00172733 | 0.8167567 | 0.0177752 | 0.64720 |
| SMR | 0.0075872 | 0.235614 | −0.012461 *** | 0.0057241 | −0.0086183 | 0.23690 |
| SPV | 0.0348037 | 0.231596 | −0.01111653 | 0.4354113 | 0.0255000 | 0.21924 |
| ROA | −0.3435750 | 0.358737 | 0.19490603 * | 0.0870637 | 1.4711140 * | 0.05789 |
| ROE | −0.0330053 ** | 0.02612 | −0.01256538 | 0.4233142 | −0.0082849 | 0.84289 |
| NIM | −0.4996805 | 0.271358 | 0.08224446 | 0.4253109 | −0.0804353 | 0.87443 |
| GDP | −0.1139556 ** | 0.02793 | −0.082485 *** | 0.0007126 | 0.0161731 | 0.72245 |
| INF | 0.0124079 | 0.91447 | −0.105292 *** | 0.0017900 | −0.2637828 * | 0.01648 |
| UNEMP | 0.7747466 *** | 0.00010 | 0.00092265 | 0.9703113 | 0.1452743 | 0.12279 |

| Diagnostic tests | NPL | CAR | Z-score |
|---|---|---|---|
| Sargan test | 26 (p-value = 1) | 26 (p-value = 1) | 26 (p-value = 1) |
| Autocorrelation test (1) | Normal = −2.059052 (p-value = 0.339489) | Normal = −2.291137 (p-value = 0.521955) | Normal = −1.408896 (p-value = 0.15887) |
| Autocorrelation test (2) | Normal = 0.5974548 (p-value = 0.5502) | Normal = 1.140559 (p-value = 0.25405) | Normal = 0.7279215 (p-value = 0.46666) |
| Wald test | Chisq(12) = 1668.669 (p-value < 0.00000) | Chisq(12) = 1831.75 (p-value < 0.00000) | Chisq(12) = 2318.332 (p-value < 0.00000) |

Note: *** denotes p < 0.01 (1%), ** indicates p < 0.05 (5%), and * depicts p < 0.1 (10%).
Table 4. The results of the fixed-effect model.
| Variables | NPL: estimate | NPL: p-value | CAR: estimate | CAR: p-value | Z-score: estimate | Z-score: p-value |
|---|---|---|---|---|---|---|
| DIV | 0.042937 ** | 0.033569 | 0.09224 *** | 0.0000 | 0.09224 *** | 0.0000 |
| CON | 0.045269 * | 0.054029 | 0.036232 * | 0.0634 | 0.03623 * | 0.0634 |
| EFF | 0.000566 | 0.979999 | 0.01904 | 0.3078 | 0.01904 | 0.3078 |
| SMR | −0.02635 ** | 0.006476 | −0.00969 | 0.2232 | −0.00969 | 0.2232 |
| SPV | −0.05842 * | 0.033568 | −0.117733 *** | 0.0000 | −0.11773 *** | 0.0000 |
| ROA | −1.02345 * | 0.010345 | 1.337261 *** | 0.0000 | 1.33726 *** | 0.0000 |
| ROE | 0.023166 | 0.488981 | −0.11047 *** | 0.0000 | −0.11047 *** | 0.0000 |
| NIM | −0.21657 | 0.576269 | −0.54226 * | 0.0920 | −0.54226 * | 0.0920 |
| GDP | 0.080210 | 0.228705 | −0.22664 *** | 0.0000 | −0.22664 *** | 0.0000 |
| INF | −0.21054 * | 0.044879 | −0.59337 *** | 0.0000 | −0.59337 *** | 0.0000 |
| UNEMP | 1.893341 *** | 0.000000 | −0.02799 | 0.78968 | −0.0279898 | 0.78968 |

Note: *** denotes p < 0.01 (1%), ** indicates p < 0.05 (5%), and * depicts p < 0.1 (10%).
Table 5. The results of the confusion matrix.
| Confusion Matrix | Random Forest Scores | SVM Scores |
|---|---|---|
| TP (True Positive) | 160 | 162 |
| TN (True Negative) | 0 | 0 |
| FP (False Positive) | 4 | 4 |
| FN (False Negative) | 2 | 0 |
| Accuracy | 96.39% | 97.59% |
| F1-score | 0.98 | 0.99 |
Table 6. Regression performance metrics of Random Forest and SVM models.
| Model | RMSE | MAE | R² |
|---|---|---|---|
| Random Forest | 0.147 | 0.112 | 0.92 |
| SVM | 0.163 | 0.124 | 0.88 |
Table 7. Confusion matrix and performance after rebalancing.
| Model | TN | FP | FN | TP | Recall (Class 1) | Precision (Class 1) | F1-Score (Class 1) |
|---|---|---|---|---|---|---|---|
| Random Forest | 88 | 1 | 0 | 97 | 1.00 | 0.9898 | 0.9949 |
| SVM | 86 | 3 | 0 | 97 | 1.00 | 0.9700 | 0.9848 |
Table 8. Confusion matrix and performance after applying cross-validation.
| Model | TN | FP | FN | TP | Recall (Class 1) | Precision (Class 1) | F1-Score (Class 1) |
|---|---|---|---|---|---|---|---|
| Random Forest | | | | | 0.9779 | 0.9816 | 0.9796 |
| SVM | | | | | 1.0000 | 0.9801 | 0.9900 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Farag, K.; Ali, L.; Mutai, N.C.; Luqman, R.; Mahmoud, A.; Krasniqi, N. Machine Learning for Predicting Bank Stability: The Role of Income Diversification in European Banking. FinTech 2025, 4, 21. https://doi.org/10.3390/fintech4020021