Classifying Two Banking Cultures: The Pragmatic Structure of Economic Revelations

Chen, James Ming; Chesini, Giusy

doi:10.3390/cmsf2025011033

Proceeding Paper

James Ming Chen ¹,* and Giusy Chesini ²

¹ College of Law, Michigan State University, East Lansing, MI 48824, USA
² Department of Management, Università degli Studi di Verona, 37129 Verona, VR, Italy
* Author to whom correspondence should be addressed.

Presented at the 11th International Conference on Time Series and Forecasting, Canaria, Spain, 16–18 July 2025.

Comput. Sci. Math. Forum 2025, 11(1), 33; https://doi.org/10.3390/cmsf2025011033

Published: 9 September 2025

(This article belongs to the Proceedings of The 11th International Conference on Time Series and Forecasting)

Abstract

This paper focuses on one specific aspect of a larger project evaluating three measures of banking risk. It emphasizes the overarching question of comparative regulatory policy: Do the European Union and the United States constitute two distinct and separate banking cultures? To answer such a question, conventional econometrics often prescribes fixed effects regression. This paper pursues an alternative approach. It directly asks whether banks on those separate continents can be distinguished using exactly the same design matrix to evaluate the proposed risk measures. The successful completion of that classification task permits the bifurcation of the overall dataset into distinct subsets, one for each continent. Parameter estimates and fitted values produced by separate regressions supply far more reliable and accurate insights into the distinct business and regulatory cultures of European and American banking.

Keywords: banking; banking regulation; classification; fixed effects; penalized regression

1. Introduction

This paper focuses on one specific aspect of a larger project exploring the sources of banking risk and their disparate treatment by regulators in the European Union and the United States [1]. That larger project seeks ultimately to predict three distinct ways to measure the risk of bank failure.

That larger project will explore substantive questions regarding financial regulation and the efficacy of diverse risk measures. This paper will focus instead on a comparison of the two great banking unions of the north Atlantic. The European Central Bank recognizes that return on equity for European banks has hovered around 5 percent, roughly half of the 10 percent return on equity realized by their American counterparts [2] (pp. 2, 6). This difference has led “to a structural profitability gap of around 5–6 percentage points over the period 2012–2021” [2] (p. 6). Mindful that Europe and the United States have followed distinct regulatory paths [3,4] (pp. 37–38), [5] (pp. 101–103), this paper emphasizes the overarching question of comparative law and policy: Do the European Union and the United States constitute two distinct and separate banking cultures?

We suspect that differences between banks in the European Union and the United States are substantial. Conventional econometrics ordinarily prescribes fixed effects regression when a dataset exhibits categorical differences attributable to geographic characteristics, especially where geography encodes distinct business practices or separate regulatory systems. This is the canonical use-case for fixed effects regression as a tool for exposing omitted variable bias [6].

This paper presents an alternative approach [7]. Rather than committing the question of regulatory differences to a single fixed effects test, this paper converts the dummy variable encoding the geographic distinction between Europe and the United States into the target variable in a binary classification exercise. The successful completion of that task permits the bifurcation of the overall dataset into distinct subsets, one for each continent. Parameter estimates and fitted values produced by separate exercises in multiple regression supply far more reliable and accurate insights into the distinct business and regulatory cultures of European and American banking.

This approach is a special case of iterative local regression. When conducted according to rolling windows along a single timeline, iterative local regression is arguably the foundational method for time-series forecasting [8,9]. As such, iterative local regression is hardly alien to econometrics. The ease, utility, and clarity of this article’s two-step process—binary classification confirming a natural form of clustering, followed by separate regression of each cluster—stand in sharp contrast with the awkward and ultimately uninformative process needed to conduct and evaluate fixed effects.

There are two readily imaginable reasons for the conventional preference for fixed effects tests. One is that the method is universally taught and has become nearly as deeply entrenched in the academic culture of economics as null hypothesis significance testing. The other reason is that fixed effects regression is encoded as an off-the-shelf, prêt-à-porter solution in Stata and other popular software packages.

Neither of the obvious explanations justifies the failure to extend the foundational method of time-series forecasting to panel data. The same design matrix used for an unconditional evaluation of panel data’s implicit drift term can be used to evaluate the accuracy of a natural clustering method based on geographical or political boundaries. Validation of that clustering method leads inexorably to distinct downstream exercises in local regression of a subset of data instances. Those exercises not only produce more accurate fitted values (in many cases); they also generate more reliable parameter estimates for each validated natural cluster.

Section 2 presents the materials and methods used in this article. Section 3 presents the results of its classification exercise alongside the results of both approaches to regression. Section 4 discusses methodological implications for other economic settings, especially (but not only) panel data containing geopolitical distinctions that conventionally invite fixed effects regression. Section 5 concludes that clustering or classification should augment, supplement, or even supplant fixed effects tests in diverse econometric settings.

2. Materials and Methods

The larger project from which this article is drawn assesses three different risk measures in banking: Return on assets, Altman’s z-score, and the Texas ratio. Return on assets is perhaps the simplest, most direct way to measure profitability in banking [10] (p. 213), [11,12]. Its use as a risk measure arises from the simple intuition that profitable banks face a lower risk of failure than their less profitable counterparts.

Altman’s z-score evaluates the probability of bankruptcy [13,14] according to “five ratios that measure a company’s liquidity, profitability, financial leverage, solvency and sales activity” [15] (p. 38). Z-score rating systems [16], a genre in which “Altman’s model is probably the classic” [17] (p. 52), are accepted in American courts as a predictor of bankruptcy [18] (p. 126); [19] (p. 1316); [20] (p. 15). Edward Altman himself has proposed the use of his scoring system in banking regulation [21].

The Texas ratio is “the ratio of non-performing loans to the sum of loan loss reserves and tangible common equity” [22] (p. 67). An artifact of American bank failures in the 1980s [23,24], the Texas ratio of late has attracted attention from European scholars [25,26]. It “has been … adopted in Europe by the [European Central Bank] as a high-level [nonperforming-loan] metric in its asset quality assessments” [22] (p. 67).

Both the principal project and this article rely on a dataset covering 164 banks, 75 in the European Union and 89 in the United States. The quarterly balance sheet data begins coverage from the end of 2005 and continues through June 2024. Data on profitability, capitalization, liquidity, asset quality, and efficiency facilitate a comparison of European and American banks. All data was split into (1) a training set comprising $\frac{e - 1}{e}$ out of 7555 total observations, or 4775 instances, and (2) a test set comprising the remaining $\frac{1}{e}$, or 2780 total observations. Parameter estimates and machine-learning “feature importances” are based on the training set. Test set measures of predictive accuracy are better indicators of a method’s generalizability to data not seen during training.
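The reported split sizes can be reproduced from the stated fractions. The following is a minimal sketch, assuming that the test share is rounded up and that the training set takes the remainder; that rounding rule is inferred from the reported counts and is not stated in the text.

```python
import math

N_TOTAL = 7555  # total quarterly observations reported in the text

# Assumption: the test share 1/e is rounded up; training takes the rest.
n_test = math.ceil(N_TOTAL / math.e)
n_train = N_TOTAL - n_test

print(n_train, n_test)  # 4775 2780
```

Under this assumption the two counts sum to 7555 and match the figures reported above.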

Among 7555 quarterly observations, 1996 come from European banks. The remaining 5559 observations come from banks based in the United States. A dummy variable, eu_dummy, indicates the geography, both political and physical, of the bank associated with each observation. Traditional fixed effects regression would place this dummy variable inside the design matrix. It would generate a parameter estimate for eu_dummy, as well as a p-value indicating its statistical significance.

In its classification exercise, this article will place eu_dummy on the left-hand side of the regression equation as the dependent variable. In all other respects, the original design matrix remains unchanged. If attained, success in classifying banks as either European or American will lead to the bifurcation of the dataset and the application of separate regression exercises to each cohort.
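The contrast between the two layouts can be illustrated with a toy example. This is only a sketch: the rows, the predictor names x1 and x2, and the risk column are hypothetical placeholders, and only eu_dummy is a name taken from the text.

```python
# Toy rows; every column except eu_dummy is a hypothetical placeholder.
rows = [
    {"x1": 0.8, "x2": 1.5, "risk": 0.011, "eu_dummy": 1},
    {"x1": 1.2, "x2": 0.9, "risk": 0.014, "eu_dummy": 0},
    {"x1": 0.7, "x2": 1.7, "risk": 0.009, "eu_dummy": 1},
    {"x1": 1.1, "x2": 1.0, "risk": 0.013, "eu_dummy": 0},
]
predictors = ["x1", "x2"]

# Fixed effects layout: the dummy sits inside the design matrix.
X_fe = [[r[p] for p in predictors] + [r["eu_dummy"]] for r in rows]
y_fe = [r["risk"] for r in rows]

# Classification layout: same predictors, but the dummy becomes the target.
X_clf = [[r[p] for p in predictors] for r in rows]
y_clf = [r["eu_dummy"] for r in rows]

# Bifurcation of the dataset for the downstream separate regressions.
eu_rows = [r for r in rows if r["eu_dummy"] == 1]
us_rows = [r for r in rows if r["eu_dummy"] == 0]
```

The predictor columns are identical in both layouts; only the role of eu_dummy changes.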

This article liberally applies various implementations of the ℓ₁ (or Lasso) penalty in linear regression [27]. Lasso regularization has the highly desirable property of driving irrelevant or redundant parameter estimates toward zero. In some cases, the ℓ₁ penalty assigns a zero coefficient and induces a sparser design matrix by removing a weakly predictive variable from the active set.
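The way an ℓ₁ penalty produces exact zeros can be seen in the soft-thresholding operator. In the special case of an orthonormal design matrix, the Lasso estimate equals the soft-thresholded least-squares estimate; general solvers iterate rather than apply this rule once. The sketch below is an illustration under that assumption, not the implementation used in this study, and the coefficient values and penalty strength are hypothetical.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; return exactly 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# Hypothetical unpenalized estimates and penalty strength (illustrative only).
ols_estimates = [2.0, -0.3, 0.05, -1.2]
lam = 0.5

lasso_estimates = [soft_threshold(b, lam) for b in ols_estimates]

# The active set holds the indices of predictors that survive the penalty.
active_set = [i for i, b in enumerate(lasso_estimates) if b != 0.0]
print(active_set)  # [0, 3]
```

Estimates whose absolute value falls below the penalty are removed from the active set, while the remaining estimates are shrunk toward zero.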

Binary classification proceeded according to four different algorithms: logistic regression [28,29], stochastic gradient descent (with a machine-calibrated ℓ₁ penalty) [30], random forests [31], and extra trees [32]. The former two methods are generalized linear methods that can express their results in closed form, with bidirectional parameter estimates. The latter two methods are supervised machine learning methods based on decision-tree ensembles. Although they cannot produce coefficients in the conventional sense, they do supply “feature importances” that express the contribution of each predictive variable to the final result.

Section 3 will now report results from fixed effects regression, binary classification, and iterative local regression.

3. Results

3.1. Unconditional Drift: OLS and Fixed Effects Regression of the Original Dataset

Consistent with econometric convention, the principal study conducted fixed effects regression of the entire dataset. A value of one identified an observation associated with a European bank. Observations associated with American banks received a zero.

Table 1 reports parameter estimates for fixed effects regressions for all three target variables: return on assets, Altman’s z-score, and the Texas ratio. The second column for each target variable reports the result of Lars, or least angle regression, which applies an ℓ₁/Lasso penalty in order to shrink the absolute value of parameter estimates for irrelevant or redundant variables [33]. In many instances, especially for the prediction of return on assets, Lars drove coefficients to zero.

For two of these target variables (return on assets and Altman’s z-score), the ℓ₁ penalty drove the coefficient on the eu_dummy variable to zero. For return on assets, the fixed entity regression using OLS had found eu_dummy statistically significant at p < 0.05, albeit at a minuscule effect size. In the Texas ratio model, Lars did not completely zero out the parameter estimate for eu_dummy. But the OLS version of the fixed effects regression had already assigned a small, inconsequential coefficient, and Lars reduced the effect size even further. Although the more conservative Lars method found numerous variables affecting all three risk measures, in no case did Lars assign the eu_dummy variable to the active set of predictors, which is also the statistically significant set to the extent that null hypothesis significance testing (NHST) still matters or should command any respect [34].

Unthinking adherence to NHST conventions might suggest that there is no reason to reject the null hypothesis that the fixed effect of European versus American identity has no statistically significant impact on any of the three risk measures. The modest analytical value of finding p < 0.05 for eu_dummy in the OLS model for return on assets is negated by the assignment of a zero coefficient to that dummy variable by the parallel Lars model.

This conclusion should be surprising to any observer of banking on both sides of the northern Atlantic Ocean. As binary classification will show, this inference is also incorrect.

3.2. Binary Classification

As a matter of regression methodology, a classification exercise involving data with identifiable entities inverts the approach of fixed effects regression. Instead of placing eu_dummy in the design matrix, binary classification—by any algorithm—converts that dummy variable into the target. The benefit of this exercise is that the design matrix for classification is identical to the design matrix for any numerical target in regression. For methods that report parameters for all predictors, causal inference for classification can be directly compared to causal inference for regression of any of the three risk measures.

Figure 1 reports the (test-set) results of four classification methods. All four are highly accurate by any conventional measure: the point-biserial coefficient, the area under the receiver operating characteristic curve, precision, recall, and the F₁ score. Indeed, extra trees attained perfection.
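Three of the accuracy measures named above follow directly from counts of true positives, false positives, and false negatives. The sketch below uses made-up labels purely for illustration; it does not reproduce any value in Figure 1.

```python
def binary_metrics(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Made-up labels: 1 stands for a European observation, 0 for an American one.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

precision, recall, f1 = binary_metrics(y_true, y_pred)
print(precision, recall, f1)  # 0.75 0.75 0.75
```

A perfect classifier, such as the extra trees result described above, would score 1.0 on all three measures.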

The internal workings of each classification method are readily visualized. In Figure 2a, logistic regression operates on a smooth gradient corresponding to the equation $f(x) = \frac{1}{1 + e^{-kx}}$. In Figure 2b, the extra trees model makes its predictions along a nonlinear gradient. Figure 3 shows the extra trees model in a flat, two-dimensional projection.
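The logistic function is simple to evaluate directly. In this sketch, k is treated as a steepness parameter with a default of 1, and the 0.5 decision threshold is the conventional choice rather than a setting confirmed by the text.

```python
import math

def logistic(x: float, k: float = 1.0) -> float:
    """Logistic curve f(x) = 1 / (1 + exp(-k * x))."""
    # Note: math.exp overflows for very large positive arguments, so this
    # plain form is meant for moderate values of k * x only.
    return 1.0 / (1.0 + math.exp(-k * x))

# Assumption: a score above 0.5 is read as "European" (eu_dummy = 1).
for x in (-2.0, 0.0, 2.0):
    score = logistic(x)
    print(x, round(score, 4), int(score > 0.5))
```

The curve passes through 0.5 at x = 0 and is symmetric in the sense that f(x) + f(-x) = 1.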

Accurate classification of European and American bank data enables the next step, which is the regression of all three risk measures for separate subsets of the data bifurcated according to the (now validated) geopolitical classification. But classification operates on the right-hand side as well as the left: In addition to accurately separating banks by geography, all four classification methods shed light on the quantitative factors distinguishing European from American banks.

Table 2 reports parameter estimates for logistic regression and stochastic gradient descent, alongside feature importances for random forest and extra trees. Feature importances in machine learning always add up to 1; they are correctly interpreted as a vector of the probability that any particular variable will prove decisive in tilting the classification in one direction or another. Since there are 16 predictive variables, the breakeven point for any single feature importance is $\frac{1}{16}$, or 0.0625.
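The breakeven test can be expressed in a few lines. The raw weights below are hypothetical and serve only to show the normalization and the comparison against 1/16; they are not the importances reported in Table 2.

```python
# Hypothetical raw weights for 16 predictors (illustrative values only).
raw_weights = [8.0, 4.0, 2.0] + [1.0] * 13

# Normalize so that the importances sum to 1.
total = sum(raw_weights)
importances = [w / total for w in raw_weights]

# Breakeven: the share each predictor would hold if all were equally important.
breakeven = 1 / len(importances)

above_breakeven = [i for i, w in enumerate(importances) if w > breakeven]
print(breakeven)        # 0.0625
print(above_breakeven)  # [0, 1, 2]
```

Only predictors whose normalized importance exceeds the breakeven share are flagged as contributing more than an equal split would imply.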

NHST conventions work very poorly for linear classification methods. Logistic regression reports p-values < 0.001 for every predictor, except p < 0.05 for growth_in_total_loans. Even at the modest level of 0.001413, the ℓ₁ penalty on SGD has the virtue of removing six out of 16 predictors from the active set. The p-value for SGD parameters is either indistinguishable from 0 or indistinguishable from 1 with ordinary 64-bit floating point precision. Therefore, as is true for almost every other analytical purpose, effect sizes provide more reliable guidance than p-values.
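The floating-point limits mentioned above are easy to demonstrate. The magnitudes in this sketch are hypothetical; they merely show how a sufficiently extreme p-value collapses to exactly 0 or exactly 1 in 64-bit arithmetic.

```python
import math
import sys

# A value this small underflows to exactly 0.0 in a 64-bit double.
tiny_p = math.exp(-750.0)

# A value this close to 1 rounds to exactly 1.0.
near_one = 1.0 - 1e-17

print(tiny_p == 0.0)           # True
print(near_one == 1.0)         # True
print(sys.float_info.epsilon)  # spacing between 1.0 and the next larger double
```

Once a p-value reaches either extreme, it no longer ranks predictors, which is one practical reason to consult effect sizes instead.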

The absolute value of parameter estimates for the two linear methods reaches an evident consensus that cash_near_cash_item, common_equity_over_total_assets, nonperforming_loans_over_total_assets, return_on_common_equity, t12_net_interest_margin, and total_loans_over_total_deposits are the six variables most likely to determine whether an observation is classified as European or American. Since eu_dummy is true if an observation comes from Europe, positive coefficients indicate a higher probability that the observation is European.

Machine learning feature importances do not necessarily agree with these causal inferences. Only three of the variables awarded substantial coefficients by the linear methods—cash_near_cash_item, nonperforming_loans_over_total_assets, and t12_net_interest_margin—are deemed by both random forests and extra trees to be more than 6.25 percent likely to affect a classification decision. Random forests credits the t12_net_interest_margin variable for more than half of that method’s decision weight. Extra trees is more democratic; it credits cash_near_cash_item and t12_net_interest_margin with roughly a quarter each of its decision weight. These are the three—or six—variables where the search for variables distinguishing European from American banking should focus.

In either event, the linear and machine learning classification methods vastly outperform fixed effects regression in providing information on causal inferences affecting geopolitical differences in bank risk and regulation. This study’s central research question asks whether European and American banks respond differently to distinct measures of risk, perhaps in large part because their regulators emphasize different legal criteria. Placing the eu_dummy variable on the left-hand side of the regression equation and making it the target variable in classification directly answers that question, and in the affirmative. By contrast, fixed effects regression found almost no effect. Including the eu_dummy variable within the design matrix provided no guidance on the variables that separate Europe from America, let alone the extent and direction of their impact.

Binary classification is hardly the final analytical step. Instead, it is analogous to the idea of a Zwischenzug in chess [35] (p. 460). Classification represents a response to econometric convention: Instead of accepting the inconsequential and even disappointing results of fixed effects regression, this study reassigns the eu_dummy variable to the left-hand side of the regression equation. Only after completing that classification assignment do we return to the task of evaluating each proposed measure of banking risk. Successfully distinguishing European from American banks, though not the ultimate object of our larger investigation of risk measures in banking, takes a crucial intermediate step toward that final goal.

Section 3.3 now tackles the task of estimating model parameters in a reliably predictive way for distinct European and American banking cohorts.

3.3. Separate Regression of the European and American Subsets

The successful separation of European from American banks validates the independent evaluation of these two cohorts. Binary classification has now obviated any occasion for including fixed effects in the design matrix. Subjecting the European and American subsets to their own regressions, using the original design matrix of 16 predictive variables, generates distinct parameter estimates for three measures of risk within each continent’s banking system (Table 3).

Although we leave a fuller discussion of the economic significance of these findings to our principal project [1], an important methodological point arises from the immediately visible differences between European and American parameter estimates for each of the three target risk measures. For purposes of this comparison, conventional statistical significance, as measured by p-values, will be taken into account. Anywhere from a quarter to a half of all statistically significant predictive variables either change signs between the two “continental” regressions or receive zero coefficients under the other regression’s application of the ℓ₁ penalty (Table 4).
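The comparison summarized in Table 4 amounts to counting predictors whose coefficients differ in sign, or are zeroed in only one cohort. The sketch below uses hypothetical predictor names and coefficients, and it omits the statistical significance filter described above.

```python
def sign(value: float) -> int:
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)

# Hypothetical coefficients (not the values reported in Tables 3 and 4).
eu_coefs = {"a": 0.40, "b": -0.25, "c": 0.00, "d": 0.10}
us_coefs = {"a": 0.35, "b": 0.15, "c": -0.20, "d": 0.00}

# Flag predictors that flip sign or are zeroed in exactly one cohort.
flagged = [name for name in eu_coefs
           if sign(eu_coefs[name]) != sign(us_coefs[name])]
share_flagged = len(flagged) / len(eu_coefs)

print(flagged)        # ['b', 'c', 'd']
print(share_flagged)  # 0.75
```

Comparing signs rather than magnitudes captures both cases at once, since a zero coefficient has a sign of 0 that differs from any nonzero estimate.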

An incontrovertible criticism of regressing subsets of a broader dataset is that each division reduces the number of instances available for evaluation. With 1996 European observations and 5559 observations from the United States, this study did not suffer from an insufficiency of data. Goodness of fit perceptibly suffered for exactly one subset: the American cohort in the model for Altman’s z-score. The reduction in the test-set accuracy from r² ≈ 0.438940 for the unconditional regression to r² ≈ 0.310657 for the regression of the American subset was offset by a dramatic improvement in the European subset’s test-set goodness of fit to r² ≈ 0.646172.

Test-set accuracy for all other regressions, at least in OLS, was no lower than r² = 0.834590 (the European subset for return on assets). Test-set accuracy for both continental subsets affirmatively improved relative to the unconditional r² value. Improvements in causal inference through separate regression, confirmed by the binary classification of European and American observations, did not require any meaningful sacrifice in predictive accuracy.
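For reference, test-set goodness of fit can be computed as follows. The text does not say whether r² denotes the coefficient of determination or a squared correlation; this sketch assumes the coefficient of determination.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy values: perfect predictions score 1.0; predicting the mean scores 0.0.
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # 0.0
```

Under this definition, a model that predicts worse than the test-set mean receives a negative score.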

4. Discussion

We will briefly discuss some implications of the results from the separate regression of European and American banking data. We then ponder this study’s broader implications.

4.1. Predictive Accuracy for Different Risk Measures

The striking difference in predictive accuracy for Altman’s z-score may reflect inconsistencies in its acceptance or application by European regulators vis-à-vis their American counterparts. The discretion that individual countries enjoy “in integrating the legally non-binding Basel Accords into their national regulation of banks lead[s] to great variability in national implementation” [4] (p. 8). According to the European Banking Authority, “accounting rules materially differ between the EU and the US” [36] (p. 4). In particular, American rules requiring the lifetime recognition of expected credit loss elevate “accumulated loan loss allowances and provisions” in the United States [36] (p. 12).

Regulatory differences can produce differences in business conduct and success. When regulators adopt or avoid a particular measure, private enterprises react accordingly. Banks respond, as all businesses do, to legal and regulatory constraints on their conduct [37]. Across many economic domains, “[a]ny observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes” [38] (p. 116). Even more generally: “When a measure becomes a target, it ceases to be a good measure” [39] (p. 308). This variant of “Goodhart’s law” applies to banking [40].

Despite recurring criticism [41] and a modern preference for market data over accounting ratios in models for predicting default or bankruptcy [42,43], Altman’s z-score is widely used by firms, auditors, and ratings systems [17] (p. 52). The longstanding criticism leveled at Altman’s z-score applies to all predictive models based on accounting ratios: “Altman demonstrates that failed and non-failed firms have dissimilar ratios, not that ratios have predictive power. But the crucial problem is to make an inference in the reverse direction, i.e., from ratios to failures. It must be demonstrated that stratified random samples of ratios’ values can imply failure and non-failure” [44] (p. 1168).

In his own application of the z-score method to bank failures in a comparative international study, Edward Altman has argued that a country-specific model might add “information provided even by simple additional variables” in order to “boost the classification accuracy to a much higher level” [21] (p. 167). For his part, Charles Goodhart urges a shift in regulatory emphasis “from levels” of assets and ratios “to their rates of growth,” in order better to safeguard “against both the bubble and the bust, both of the system and of individual institutions, by requiring additional capital and liquidity when bank lending and asset prices [are] rising fast, and relaxing such requirements in the downturn” [45] (p. 356).

Even accuracy in prediction does not foreclose meaningful differences between European and American banks. Nor do accurate predictions prevent the broader project of evaluating risk measures in banking from detecting meaningful differences across the north Atlantic’s two banking cultures. The otherwise scant literature does suggest that the Texas ratio is more sensitive to potential failures among smaller institutions [46] (pp. 203–204). The smaller size and greater diversity of American banks—the opposite of the typical American pattern in global comparisons of market structure and industrial organization—have long distinguished the American banking industry from its international counterparts [47]. These traits arise from the distinctively American commitment to state rather than federal law as the primary driver of banking regulation [5] (p. 101).

Further research with this dataset should address time effects across nearly 80 quarters spanning both the global financial crisis of 2008–09 and the COVID-19 pandemic. The nonlinear relationship of those crisis years to the rest of the years in the sample may give rise to an instance where time should be treated both discretely and continuously. In addition, the ample number of otherwise predictive variables that were omitted from the design matrix provides fertile ground for a two-stage least-squares (2SLS) correction of residuals. Although this study lacks the geodesic coordinates that inform the application of 2SLS in geospatial analysis [7], or even stylized “Carolingian distances” from one of Europe’s historical centers of cultural and political influence [48], standardized Euclidean distances based on those additional variables may facilitate a wholly distinct, non-geospatial approach to stochastic diffusion.

4.2. Fixed Effects and Categorical Differences as “Natural” Variations on the Theme of Clustering

Iterative local regression in geospatial analysis is often motivated by a desire to improve predictive accuracy [7]. By contrast, the separate regression of European and American banking observations primarily serves analytical purposes on the right-hand side of the regression equation. This study’s regression of distinct subsets is closer in spirit to walk-forward validation through rolling windows in time-series forecasting. It ensures that Europe and America, divided by differences in regulatory culture arguably as vast as the ocean between those continents, are evaluated in proper isolation from one another, and not together in a single undifferentiated mass.

Taking care to evaluate categorically distinct subsets of a seemingly comprehensive body of data proves pivotal to a nuanced understanding of Europe and America’s distinct banking cultures. This is the geopolitical analog of the overriding concern in time-series forecasting that a single, unconditional regression might fail to capture serial autocorrelation and other temporal effects.
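The rolling-window analogy can be made concrete with a small index generator. This is a generic sketch of walk-forward validation; the function name, window sizes, and step rule are illustrative assumptions rather than settings used in this study.

```python
def rolling_windows(n_obs: int, train_size: int, horizon: int = 1):
    """Yield (train_indices, test_indices) pairs for walk-forward validation."""
    start = 0
    while start + train_size + horizon <= n_obs:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + horizon))
        yield train, test
        start += horizon  # slide the window forward by one test block

windows = list(rolling_windows(n_obs=8, train_size=4, horizon=2))
for train, test in windows:
    print(train, test)
# [0, 1, 2, 3] [4, 5]
# [2, 3, 4, 5] [6, 7]
```

Each window fits a local regression on its own slice of the timeline, just as the separate continental regressions fit each geopolitical cohort on its own subset.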

This study is also readily harmonized with geospatial econometrics and panel data. Since classification and clustering both rely on some distance metric quantifying separation or dissimilarity in higher-dimensional space, the binary classification exercise in this study should be regarded as a special instance of a more general approach to the bifurcation of the drift and diffusion components latent in many data collections.

For its part, geospatial data should also be regarded as a special instance of a generalized approach to deterministic drift and stochastic diffusion. The distance metric in geospatial data is often unambiguous: Haversine distance between points on earth. At most, a practical substitute such as the time needed to complete a trip by land, sea, or air might replace Haversine distance.
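For readers unfamiliar with it, the Haversine distance can be computed as follows. The sketch assumes a spherical earth with a mean radius of 6371 km, which is a common approximation and not a value taken from the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical earth

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# A quarter of the equator: from (0, 0) to (0, 90 degrees east).
quarter_equator = haversine_km(0.0, 0.0, 0.0, 90.0)
print(round(quarter_equator, 1))
```

The quarter-equator case equals one quarter of the sphere's circumference, which offers a quick sanity check on the formula.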

This study adds the further complication of legal and regulatory differences not fully or accurately captured by geographic separation. Great Britain, for example, is much closer in strictly geospatial terms to France than it is to the United States. But the world’s Anglophone countries derive their legal and political traditions from Britain. The common law and a preference for first-past-the-post legislative districting over proportional parliamentary representation put Britain closer to its former colonies across the Atlantic Ocean and further away, culturally and sociologically, from its neighbor across the English Channel. Not for naught do the French use the phrase les pays anglo-saxons or its synonym le monde anglophone as shorthand for the global English-speaking alliance [49].

Fixed effects regression in conventional econometrics fares very poorly relative to this study’s methodological progression. That time-honored method forces a single dummy variable—or, more generally, a vector of dummy variables where the number of fixed entities is greater than two—to bear the analytical burden of overcoming omitted variable bias. This study shows that converting the dummy variable or variables representing a vector of fixed entities into the target variable for binary classification is a viable, even fruitful, alternative to fixed effects regression.

Fixed entity effects may be reimagined as the presence of “naturally occurring groups, or clusters, in the population” [50] (p. 144). Reliance on clusters as a sampling methodology can raise a problem “when the clusters are more internally homogeneous” and thus fail to reflect all of the heterogeneity of the sampled population [50] (p. 144). The econometric convention of converting fixed entities into dummy variables implicitly expects that this expedient will capture some statistical phenomenon not captured by any of the other variables within the design matrix. Warnings against the misapplication of cluster sampling suggest that internal homogeneity, relative to omitted variable bias, may pose an even greater risk to proper estimation and causal inference.

Clustering and classification as alternatives to fixed effects regression reflect a different set of assumptions about the data. Whether marked by “natural” entities defined by political boundaries or other artifacts of social organization, or else identified in a mathematically principled way, clusters within a dataset acknowledge far greater heterogeneity within a data sample than fixed effects can capture. Even an entire vector of multiple fixed effect dummy variables leaves intact the original collection of independent variables and expects that vector of predictors to describe the data in an unconditional, persistent way.

Clusters, by contrast, arise precisely because distances among the variables in a design matrix are heterogeneous. Classification works in those special instances where entities arising from institutions or other human interventions come close enough to mathematically defined cluster boundaries so that these lines provide sufficient guidance to downstream efforts to understand stochastic diffusion, in addition to the deterministic drift within a particular collection of data.

5. Conclusions

This study makes two distinct and important contributions to our understanding of the differences in commercial performance and regulatory policy between the two banking cultures of the European Union and the United States. First, it establishes that the EU and the U.S.A. are indeed separated by more than a geographical ocean. Successful confirmation of this categorical difference through an exercise in binary classification validates the separate regression of these two continental cohorts.

The second contribution has broader implications for data delineated along geographic or geopolitical lines. Binary classification, followed by the separate regression of the European and American cohorts, succeeds where fixed effects regression fails. There truly are distinct banking cultures on either side of the north Atlantic. Those differences, for all three proposed risk measures in banking, are not only susceptible to more accurate predictions if approached separately by continent. Separate regression supplies radically different parameter estimates throughout the design matrix, all of which are more reliable and more revealing than coefficients attributed to ultimately inconsequential fixed effects dummy variables.

Future work should compare clustering and classification as distinct ways to manage categorical distinctions within data. The two methods abandon the assumption that a single, unconditional regression can adequately capture all nuances within a dataset crossing legal or regulatory boundaries. At a bare minimum, these superior ways of bridging stochastic diffusion with deterministic drift expand an econometric toolkit otherwise starved by mechanical reliance on fixed effects regression. Clustering and classification, as it were, are to local regression as John the Baptist is to Jesus: One crying in the wilderness of data must make straight the way for more complete analytical understanding.

Author Contributions

Conceptualization, J.M.C. and G.C.; methodology, J.M.C.; software, J.M.C.; formal analysis, J.M.C.; investigation, G.C. and J.M.C.; resources, G.C.; data curation, G.C.; writing—original draft preparation, J.M.C.; writing—review and editing, J.M.C. and G.C.; visualization, J.M.C.; supervision, G.C.; project administration, G.C.; funding acquisition, G.C. All authors have read and agreed to the published version of the manuscript.

Funding

James Ming Chen thanks the Department of Management of the University of Verona for hosting him during the performance of this research with Giusy Chesini. The funding was obtained through the University of Verona’s Internationalization Program for 2024.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data were collected from Bloomberg. The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest. Although James Ming Chen received a grant to conduct this research at the University of Verona, the funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

LARS	Least angle regression
NHST	Null hypothesis significance testing
OLS	Ordinary least squares
SGD	Stochastic gradient descent

References

Chen, J.M.; Chesini, G. The Two Banking Cultures: Risk and Profitability in the European Union and the United States. 20 May 2025; Unpublished manuscript in preparation.
Di Vito, L.; Fuentes, N.M.; Leite, J.M. Understanding the Profitability Gap Between Euro Area and US Global Systemically Important Banks. ECB Occasional Paper Series, N. 327. 2023. Available online: https://op.europa.eu/en/publication-detail/-/publication/b1025886-4228-11ee-8548-01aa75ed71a1/language-en (accessed on 19 May 2025).
Clair, R.T.; O’Driscoll, G.P., Jr. Learning from one another: The U.S. and European banking experience. J. Multinatl. Fin. Mgmt. 1993, 2, 33–55.
Hutukka, P. Regulation of banks in the European Union, the United States, and China: Banking law in comparative context. Eur. Bus. Law Rev. 2025, 36, 1–48.
Majone, G. Cross-national sources of regulatory policymaking in Europe and the United States. J. Pub. Policy 1991, 11, 79–106. Available online: https://www.jstor.org/stable/4007339 (accessed on 20 May 2025).
Forbes, K.J. A reassessment of the relationship between inequality and growth. Am. Econ. Rev. 2000, 90, 869–887.
Chen, J.M. Drift and diffusion in geospatial econometrics: Implications for panel data and time-series. Comput. Sci. Math. Forum, May 2025; under review.
Livieris, I.E.; Stavroyiannis, S.; Pintelas, E.; Pintelas, P. A novel validation framework to enhance deep learning models in time-series forecasting. Neural Comput. Appl. 2020, 32, 17149–17167.
Wahyuddin, E.P.; Caraka, R.E.; Kurniawan, R.; Caesarendra, W.; Gio, P.U.; Pardamean, B. Improved LSTM hyperparameters alongside sentiment walk-forward validation for time series prediction. J. Open Innov. Technol. Mkt. Complex. 2025, 11, 100458.
Ferreira, C. Competition and stability in the European Union banking sector. Intl. Adv. Econ. Res. 2023, 29, 207–224.
Karadayi, N. Determinants of return on assets. Eur. J. Bus. Mgmt. Res. 2023, 8, 37–44.
Petersen, M.A.; Schoeman, I. Modeling of banking profit via return-on-assets and return-on-equity. In Proceedings of the World Congress on Engineering, London, UK, 2–4 July 2008; Volume 2, pp. 828–833.
Altman, E.I. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. J. Fin. 1968, 23, 589–609.
Altman, E.I.; Loris, B. A financial early warning system for over-the-counter broker dealers. J. Fin. 1976, 31, 1201–1217.
Bajaj, N.; Huffman, A.; Plastino, D.T. Solvency shortcuts: The use and misuse of simple tools for predicting financial distress. Am. Bankruptcy Inst. J. 2022, 41, 38–83. Available online: https://www.abi.org/abi-journal/solvency-shortcuts-the-use-and-misuse-of-simple-tools-for-predicting-financial-distress (accessed on 20 May 2025).
Mare, D.S.; Moreira, F.; Rossi, R. Nonstationary Z-Score measures. Eur. J. Oper. Res. 2017, 260, 348–358.
Eidleman, G.J. Z scores—A guide to failure prediction. CPA J. 1995, 65, 52–53.
U.S. Court of Federal Claims. Litman v. United States. In Federal Claims Reporter; District Court for the Southern District of Florida: Miami, FL, USA, 2007; pp. 90–146.
U.S. District Court for the Southern District of Florida. Figueroa v. Sharper Image Corp; Federal Supplement, 2d Series; U.S. District Court for the Southern District of Florida: Miami, FL, USA, 2007; Volume 517, pp. 1292–1329.
U.S. District Court for the Western District of Pennsylvania. In re SoClean, Inc., No. 2:22-cv-542; Westlaw: Pittsburgh, PA, USA, 2025; Document Number 1330539.
Altman, E.I.; Iwanicz-Drozdowska, M.; Laitinen, E.K.; Suvas, A. Financial distress prediction in an international context: A review and empirical analysis of Altman’s Z-score model. J. Intl. Fin. Mgmt. Account. 2017, 28, 131–171.
Velliscig, G.; Floreani, J.; Polato, M. Capital and asset quality implications for bank resilience and performance in the light of NPLs’ regulation: A focus on the Texas ratio. J. Bank. Regul. 2023, 24, 66–88.
Jesswein, K.R. An examination of the “Texas ratio” as a bank failure model. Acad. Bank. Stud. J. 2009, 8, 63–73.
Siems, T.F. The so-called Texas ratio. Fed. Reserve Bank Dallas Fin. Insights 2012, 1, 1–3.
Ferrarin, A.; Polato, M.; Velliscig, G. Disentangling the Texas ratio: The case of the Italian banking sector. Intl. J. Manag. Fin. Account. 2020, 12, 217–241.
Floreani, J.; Velliscig, G.; Stefano, P.; Polato, M. The effects of capital on bank risk-taking: New evidence for the European banking system. Theor. Econ. Lett. 2023, 13, 597–626.
Tibshirani, R. Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. B 1996, 58, 267–288.
Tolles, J.; Meurer, W.J. Logistic regression relating patient characteristics to outcomes. JAMA 2016, 316, 533–534.
Yu, H.-F.; Huang, F.-L.; Lin, C.-J. Dual coordinate descent methods for logistic regression and maximum entropy models. Mach. Learn. 2011, 85, 41–75.
Tsuruoka, Y.; Tsujii, J.; Ananiadou, S. Stochastic gradient descent training for L1-regularized log-linear models with cumulative penalty. In Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP, Singapore, 2–7 August 2009; pp. 477–485. Available online: https://aclanthology.org/P09-1054.pdf (accessed on 20 May 2025).
Ho, T.K. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada, 14–16 August 1995; Volume 1, pp. 278–282.
Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006, 63, 3–42.
Efron, B.; Hastie, T.; Johnstone, I.; Tibshirani, R. Least angle regression. Ann. Stat. 2004, 32, 407–499.
Wasserstein, R.L.; Schirm, A.L.; Lazar, N.A. Moving to a world beyond “p < 0.05”. Am. Stat. 2019, 73 (Suppl. 1), 1–19.
Hooper, D.; Whyld, K. The Oxford Companion to Chess, 2nd ed.; Oxford University Press: Oxford, UK, 1992.
European Banking Authority. Differences in Provisioning Practices in the United States and the European Union; Thematic Note EBA/REP/2021/13; European Banking Authority: Paris, France, 2021.
Beaver, W.H.; Engel, E.E. Discretionary behavior with respect to allowances for loan losses and the behavior of security prices. J. Account. Econ. 1996, 22, 177–206.
Goodhart, C. Problems of monetary management: The U.K. experience. In Inflation, Depression, and Economic Policy in the West; Courakis, A.S., Ed.; Rowman and Littlefield: Lanham, MD, USA, 1981; Volume 3, pp. 111–146.
Strathern, M. ‘Improving ratings’: Audit in the British University system. Eur. Rev. 1997, 5, 305–321.
Sheng, A.; Looi, T.G. Is there a Goodhart’s law in financial regulation? In Monetary History, Exchange Rates and Financial Markets: Essays in Honour of Charles Goodhart; Mizen, P., Ed.; Edward Elgar: Cheltenham, UK, 2003; Volume 2, pp. 234–249.
Moyer, R.C. Forecasting Financial Failure: A Re-Examination. Financ. Manag. 1977, 6, 11–17.
Li, L.; Faff, R. Predicting corporate bankruptcy: What matters? Intl. Rev. Econ. Fin. 2019, 62, 1–19.
Shumway, T. Forecasting bankruptcy more accurately: A simple hazard model. J. Bus. 2001, 74, 101–124. Available online: https://www.jstor.org/stable/10.1086/209665 (accessed on 20 May 2025).
Johnson, C.G. Ratio analysis and the prediction of firm failure. J. Fin. 1970, 25, 1166–1168. Available online: https://www.jstor.org/stable/2325590 (accessed on 20 May 2025).
Goodhart, C.A.E. The regulatory response to the financial crisis. J. Fin. Stab. 2008, 4, 351–358.
Acrey, J.C.; Lee, W.Y.; Yeager, T.J. Can Federal Home Loan Banks effectively self-regulate lending to influential banks? J. Bank. Regul. 2019, 20, 197–210.
Rhoades, S.A. The relative size of banks and industrial firms in the U.S. and other countries: A note. J. Bank. Fin. 1982, 6, 579–585.
Chen, J.M.; Poufinas, T.; Pagagopoulou, A. Drift and diffusion in panel data: Extracting geopolitical and temporal effects in a study of passenger rail traffic. Comput. Sci. Math. Forum, September 2025; forthcoming.
Chabal, E. The rise of the Anglo-Saxon: French perceptions of the Anglo-American world in the long twentieth century. Fr. Politics Cult. Soc. 2013, 31, 24–46.
Brown, R.S. Sampling. In International Encyclopedia of Education, 3rd ed.; Peterson, P., Baker, E., McGaw, B., Eds.; Elsevier Science: Amsterdam, The Netherlands, 2010; pp. 142–146.

Figure 1. Test-set accuracy for four methods of binary classification: logistic regression, stochastic gradient descent, random forest, and extra trees.

Figure 2. The internal workings of binary classification methods. (a) Logistic regression, a canonical linear method; (b) extra trees, an ensemble-based machine learning method.

Figure 3. The extra trees classifier, depicted in two dimensions. The darkening of the yellow-to-green color gradient indicates a higher probability that an observation comes from a European bank. The intersecting red lines indicate the inflection point between American and European banks along a 1-dimensional t-SNE manifold of the prediction space.

Table 1. Parameter estimates for fixed entity effects regression for all three target variables: return on assets, Altman’s z-score, and the Texas ratio.

Variables	RoA: OLS	RoA: Lars ¹	Altman: OLS	Altman: Lars	Texas: OLS	Texas: Lars
eu_dummy	−0.024343 * ²	0.000000	0.039521	0.000000	0.012592	0.010587
annual_net_interest_margin	−0.006112	0.000000	0.037261 *	0.000000	0.008885	0.008412
cash_near_cash_item	0.035992 **	0.000000	−0.276452 ***	−0.203883 ***	−0.084858 ***	−0.080918 ***
disclosed_intangibles	−0.001844	0.000000	0.073143 ***	0.029977	0.192894 ***	0.190362 ***
tier1_capital_ratio	−0.007557	0.000000	0.087373 **	0.079151 **	−0.047821 ***	−0.045256 ***
total_capital_over_risk_based_capital	0.005328	0.000000	0.090190 ***	0.084811 **	0.099128 ***	0.095523 ***
common_equity_over_total_assets	0.166054 ***	0.163208 ***	0.203412 ***	0.196500 ***	−0.272485 ***	−0.270965 ***
efficiency_ratio	−0.037198 ***	−0.029197 ***	0.025382 *	0.015931	0.042799 ***	0.042459 ***
eps_growth	0.020005 *	0.014881	−0.003541	0.000000	−0.034549 ***	−0.032629 ***
growth_in_total_deposits	0.006270	0.000000	0.034507 **	0.016388	0.000519	0.000000
growth_in_total_loans	0.004685	0.000000	−0.042596 **	−0.014259	−0.000519	0.000000
net_income_growth	0.009829	0.006175	−0.024967	−0.019372	0.031106 ***	0.029354 ***
nonperforming_loans_over_total_assets	−0.071650 ***	−0.073357 ***	−0.275734 ***	−0.250955 ***	0.837755 ***	0.837340 ***
return_on_common_equity	0.779113 ***	0.777358 ***	0.298634 ***	0.291045 ***	0.007467	0.006593
t12_net_interest_margin	0.086433 ***	0.075754 ***	−0.127733 ***	−0.094787 ***	0.066768 ***	0.065320 ***
total_loans_over_total_deposits	−0.013440 *	−0.006152	0.049391 ***	0.037200 **	0.025881 ***	0.025853 ***
ℓ₁ alpha	N/A	0.010945	N/A	0.007232	N/A	0.000349

¹ “Lars” indicates least angle regression. In this study, all instances of Lars determined the ℓ₁/Lasso penalty according to the Bayesian information criterion. ² Levels of statistical significance—***: p < 0.001; **: p < 0.01; *: p < 0.05; +: p < 0.10.

Table 2. Parameter estimates (logistic regression and SGD) and feature importances (random forest and extra trees) for the binary classification of European versus American banking data.

Variables	Logistic Regression	SGD (with a ℓ₁ Penalty) ³	Random Forest	Extra Trees
intercept	−2.378868 *** ⁴	−2.474330 ***	N/A	N/A
annual_net_interest_margin	−0.492577 ***	−0.084153 ***	0.088243	0.065691
cash_near_cash_item	1.351696 ***	1.637565 ***	0.148800	0.262538
disclosed_intangibles	0.204390 ***	0.000000	0.031807	0.078799
tier1_capital_ratio	0.281099 ***	0.238743 ***	0.005840	0.011768
total_capital_over_risk_based_capital	0.022195 ***	0.000000	0.004864	0.009598
common_equity_over_total_assets	−1.065244 ***	−0.953667 ***	0.026012	0.131921
efficiency_ratio	−0.107207 ***	0.000000	0.002366	0.005198
eps_growth	0.168614 ***	0.041315 ***	0.000829	0.001818
growth_in_total_deposits	0.063693 ***	0.000000	0.001586	0.002556
growth_in_total_loans	0.006979 *	0.000000	0.002355	0.004634
net_income_growth	−0.089263 ***	0.000000	0.000990	0.001596
nonperforming_loans_over_total_assets	1.457363 ***	1.513600 ***	0.093624	0.114464
return_on_common_equity	0.565925 ***	0.629780 ***	0.002358	0.007141
t12_net_interest_margin	−1.052515 ***	−1.438182 ***	0.547773	0.239952
total_loans_over_total_deposits	1.183871 ***	1.195250 ***	0.042551	0.062325

³ The ℓ₁ penalty for stochastic gradient descent was set through a grid search for hyperparameters and determined to be 0.001413. ⁴ Levels of statistical significance—***: p < 0.001; **: p < 0.01; *: p < 0.05; +: p < 0.10.
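As a reading aid for the logistic regression column of Table 2, the following minimal sketch converts a few of the reported coefficients into odds ratios and predicted probabilities. Two points are assumptions rather than statements from the table: that the positive class is a European bank, and that the predictors were standardized so that a value of 0 denotes the sample mean. The function names and the three-variable subset are hypothetical choices made for brevity.

```python
import math

# Intercept and three coefficients copied from the logistic regression
# column of Table 2. Other predictors are omitted, which is equivalent to
# holding them at 0 (the mean, under the standardization assumption).
INTERCEPT = -2.378868
COEF = {
    "cash_near_cash_item": 1.351696,
    "common_equity_over_total_assets": -1.065244,
    "t12_net_interest_margin": -1.052515,
}


def sigmoid(z):
    """Logistic function mapping log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio(name):
    """Multiplicative change in the odds for a one-unit rise in a predictor."""
    return math.exp(COEF[name])


def prob_positive_class(features):
    """Predicted probability of the positive class (assumed: European bank)."""
    z = INTERCEPT + sum(COEF[k] * v for k, v in features.items())
    return sigmoid(z)


baseline = prob_positive_class({})
more_cash = prob_positive_class({"cash_near_cash_item": 1.0})
print("baseline probability:", round(baseline, 4))
print("one unit more cash:", round(more_cash, 4))
print("odds ratio, cash:", round(odds_ratio("cash_near_cash_item"), 3))
```

Whatever the scaling of the predictors, the odds-ratio reading holds: a one-unit increase in a predictor multiplies the odds of the positive class by the exponential of its coefficient.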

Table 3. Parameter estimates for separate exercises in regression for all three target variables (return on assets, Altman’s z-score, and the Texas ratio) within Europe and within the United States. All regressions in this table proceeded under the Bayesian information criterion implementation of Lasso Lars least angle regression.

Variables	RoA: Europe	RoA: USA	Altman: Europe	Altman: USA	Texas: Europe	Texas: USA
annual_net_interest_margin	0.000000	−0.019170 *	−0.026492	0.000000	−0.010052	0.015696 *
cash_near_cash_item	0.000000	0.029691 **	0.000000	−0.366997 ***	0.033483 *	−0.105701 ***
disclosed_intangibles	0.000000	0.012955	0.179605 ***	0.057135 *	0.142307 ***	0.175521 ***
tier1_capital_ratio	0.000000	0.000000	0.000000	0.110124 ***	−0.034321 +	0.000000
total_capital_over_risk_based_capital	0.000000	0.006749	0.052463	0.000000	−0.028943	−0.014578 +
common_equity_over_total_assets	0.263023 *** ⁵	0.116345 ***	0.481305 ***	0.230170 ***	−0.319784 ***	−0.145884 ***
efficiency_ratio	−0.016034 +	−0.075350 ***	0.030215 *	−0.066502 **	0.000000	0.043169 ***
eps_growth	0.000000	0.020825 *	0.042796 +	−0.063479 *	−0.015779 +	−0.019164 *
growth_in_total_deposits	0.042873 **	−0.014543 *	0.047554 *	0.021917	0.000000	0.011883 *
growth_in_total_loans	0.074695 ***	−0.026108 ***	0.000000	−0.049490 **	−0.015144 +	0.024720 ***
net_income_growth	0.018685	0.019369 +	0.000000	0.000000	0.000000	0.014629 +
nonperforming_loans_over_total_assets	−0.107255 ***	−0.059011 ***	−0.451297 ***	−0.027021	0.658933 ***	1.180277 ***
return_on_common_equity	0.644823 ***	0.866887 ***	0.136747 ***	0.417484 ***	0.034315 ***	0.038409 ***
t12_net_interest_margin	0.000000	0.156754 ***	−0.163328 ***	−0.098905 **	0.000000	0.068231 ***
total_loans_over_total_deposits	0.000000	−0.014060 +	0.024022	0.064179 **	−0.051219 ***	0.016657 **
ℓ₁ alpha	0.011445	0.000075	0.011189	0.002772	0.004883	0.000059

⁵ Levels of statistical significance—***: p < 0.001; **: p < 0.01; *: p < 0.05; +: p < 0.10.

Table 4. The number of reversals and negations of statistically significant variables between regressions of the European and American subsets for each target variable.

Reversals and Negations	RoA	Altman	Texas
Flipped signs	2	1	3
Significant for one, sparse in the other	5	3	5
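The tallies in Table 4 can be approximated mechanically from Table 3. The sketch below does so for the return on assets columns only. The counting rule is an assumption, since the table does not state its threshold: a coefficient counts as significant if it carries any marker down to "+" (p < 0.10), a "flipped sign" requires significance on both continents with opposite signs, and a "negation" pairs a significant coefficient with an exact zero on the other continent. The function name `count_reversals` is hypothetical.

```python
# RoA columns of Table 3: variable -> ((Europe coef, marker), (USA coef, marker)).
ROA = {
    "annual_net_interest_margin": ((0.000000, ""), (-0.019170, "*")),
    "cash_near_cash_item": ((0.000000, ""), (0.029691, "**")),
    "disclosed_intangibles": ((0.000000, ""), (0.012955, "")),
    "tier1_capital_ratio": ((0.000000, ""), (0.000000, "")),
    "total_capital_over_risk_based_capital": ((0.000000, ""), (0.006749, "")),
    "common_equity_over_total_assets": ((0.263023, "***"), (0.116345, "***")),
    "efficiency_ratio": ((-0.016034, "+"), (-0.075350, "***")),
    "eps_growth": ((0.000000, ""), (0.020825, "*")),
    "growth_in_total_deposits": ((0.042873, "**"), (-0.014543, "*")),
    "growth_in_total_loans": ((0.074695, "***"), (-0.026108, "***")),
    "net_income_growth": ((0.018685, ""), (0.019369, "+")),
    "nonperforming_loans_over_total_assets": ((-0.107255, "***"), (-0.059011, "***")),
    "return_on_common_equity": ((0.644823, "***"), (0.866887, "***")),
    "t12_net_interest_margin": ((0.000000, ""), (0.156754, "***")),
    "total_loans_over_total_deposits": ((0.000000, ""), (-0.014060, "+")),
}

# Assumption: any marker down to "+" (p < 0.10) counts as significant.
SIGNIFICANT = frozenset({"***", "**", "*", "+"})


def count_reversals(rows, significant=SIGNIFICANT):
    """Return (flipped signs, significant-versus-sparse negations)."""
    flipped = negated = 0
    for (b_eu, m_eu), (b_us, m_us) in rows.values():
        sig_eu = m_eu in significant
        sig_us = m_us in significant
        if sig_eu and sig_us and b_eu * b_us < 0:
            flipped += 1  # significant on both continents, opposite signs
        elif (sig_eu and b_us == 0.0) or (sig_us and b_eu == 0.0):
            negated += 1  # significant on one continent, zeroed out on the other
    return flipped, negated


print(count_reversals(ROA))
print(count_reversals(ROA, significant={"***", "**", "*"}))
```

Under the p < 0.10 convention, this sketch returns 2 flipped signs and 5 negations for return on assets, in line with the RoA column of Table 4. Restricting significance to p < 0.05 lowers the negation count to 4, so the tallies are sensitive to the threshold, and the other columns may rest on a different convention.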

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
