Hierarchical Fuzzy Cognitive Maps for Financial Risk Monitoring Using Aggregated Financial Concepts

Krimpas, George A.; Thanasas, Georgios; Krimpas, Nikolaos A.; Rigou, Maria; Lampropoulou, Konstantina

doi:10.3390/jrfm19030219


by George A. Krimpas ¹,*, Georgios Thanasas ¹,*, Nikolaos A. Krimpas ¹, Maria Rigou ¹ and Konstantina Lampropoulou ²

¹ Department of Management and Technology, University of Patras, 26504 Rio-Patras, Greece

² Department of Intelligent Systems, Tilburg University, 5000 LE Tilburg, The Netherlands

* Authors to whom correspondence should be addressed.

J. Risk Financial Manag. 2026, 19(3), 219; https://doi.org/10.3390/jrfm19030219

Submission received: 4 February 2026 / Revised: 3 March 2026 / Accepted: 7 March 2026 / Published: 16 March 2026

(This article belongs to the Special Issue AI and Machine Learning for Credit Risk and Financial Distress Prediction)


Abstract

This study addresses the gap between predictive optimization and monitoring-oriented risk concentration by introducing a hierarchical Fuzzy Cognitive Map (FCM) framework for financial risk assessment. Financial distress prediction models are employed to estimate firm-level default probabilities and are required to comply with regulatory standards. IFRS 9 and Basel III/IV frameworks emphasize model explainability, scenario analysis and causal transparency, which are essential for compliance purposes. The methodology aggregates correlated financial ratios into financial concepts through unsupervised clustering. Concepts interact through a learned coupling matrix and a controlled multi-step propagation, which enables the amplification of risk signals. A small residual correction is applied at the final readout, preserving the interpretability of the proposed framework. The framework was applied to two severely imbalanced benchmark bankruptcy datasets. It achieved higher precision–recall performance than Logistic Regression (PR–AUC $\approx 0.32$ vs. $0.27$), improved calibration (Brier score $\approx 0.046$ vs. $0.089$) and maintained competitive Recall@Top–K under tight supervisory monitoring budgets. Hierarchical FCM achieved predictive performance comparable to nonlinear models while maintaining concept-level interpretability. Our findings demonstrate that structured concept aggregation combined with interaction-based propagation provides a transparent alternative to purely predictive black-box models in financial distress assessment and is aligned with regulatory frameworks.

Keywords:

financial risk monitoring; warning systems; fuzzy cognitive maps; financial distress; interpretability; unsupervised learning; decision making


1. Introduction

Financial risk management is one of the most complex tasks in the modern economy, whose interconnected markets are characterized by rapid information flows, systemic vulnerabilities, and unprecedented levels of uncertainty. Traditional quantitative risk models form the core of risk analysis and have proven their value, but they often struggle to capture the qualitative dimensions of risk, such as expert judgment, causal interdependencies, and dynamic feedback mechanisms, which are apparent in financial systems. The global financial crisis of 2008 revealed the need for risk assessment methodologies that can model systemic risk, contagion effects, and the complex interactions between financial institutions, markets, and regulatory frameworks (Mezei & Sarlin, 2016). Financial systems are complex, with multiple interacting components, nonlinear relationships, and behaviors that are difficult to capture (Bakhtavar et al., 2021).

Fuzzy Cognitive Maps (FCMs) are an alternative modeling approach that combines the qualitative reasoning of cognitive maps with the quantitative rigor of fuzzy logic (Azadeh et al., 2012; Glykas & Xirogiannis, 2005). FCMs were introduced by Kosko (1986) as an extension of Axelrod’s cognitive maps. They represent knowledge as a directed graph in which nodes denote concepts and edges represent causal relationships; the fuzzy weights attached to the edges indicate the strength and direction of influence. This representation makes it possible to model complex systems characterized by uncertainty and incomplete information, which are precisely the conditions common in financial risk management.

FCMs have found applicability (Papageorgiou, 2012; Papageorgiou & Salmeron, 2013) in a diverse spectrum of scientific fields. Financial risk analysis is one of those fields, and it has gained a lot of attention over the past two decades (Bakhtavar et al., 2021). Financial risk management entails knowledge that resides in the implicit expertise of professionals, regulators, and domain experts, while FCMs can systematically elicit and formalize such knowledge (Glykas & Xirogiannis, 2005; Mezei & Sarlin, 2016). The dynamic nature of financial risk requires models capable of scenario analysis and what-if simulations, which FCMs can naturally provide (Jalilian et al., 2019; Medina & Moreno, 2007).

FCM applications in the economic domain include forecasting (Azadeh et al., 2012), fund management in financial sector enterprises (Glykas & Xirogiannis, 2005), systemic risk measurement (Mezei & Sarlin, 2016), banking credit risk (Jalilian et al., 2019), corporate financial distress prediction (Hajek & Prochazka, 2018), reputational risk management (Trostianska & Semencha, 2020), electricity market risk (Medina & Moreno, 2007), and sustainable financial systems design (Ziolo et al., 2019). Such applications demonstrate the versatility, adaptability, and effectiveness of FCM methodologies across a variety of financial risk management challenges.

In financial distress monitoring, firms are described by a large collection of heterogeneous ratios such as profitability, liquidity, leverage, efficiency, and cash-flow characteristics (Altman et al., 2017). Two practical limitations are created by reasoning directly on raw financial ratios. First, interpretability deteriorates as the number of features grows and ratios become highly correlated, obscuring which financial dimensions drive the final risk signal (Qian et al., 2022). Second, purely data-driven models can exhibit degraded portability across datasets, periods, and institutional settings when they rely on unstable or opaque high-order feature interactions (Zhang et al., 2022).

The scalability limitations of Fuzzy Cognitive Maps in high-dimensional settings have also been explicitly acknowledged in the literature. Haritha et al. (2022) show that conventional FCM formulations are not inherently suited to large feature spaces, as the direct representation of all variables as concepts leads to increased computational complexity and reduced learning stability. To address this limitation, they propose a distributed FCM framework for feature selection, where causal influence is evaluated locally and aggregated into a global state representation. Their findings show that dimensionality reduction and aggregation are not merely computational optimizations, but structural necessities for applying FCMs to complex, high-dimensional datasets.

These limitations are even more pronounced for feature-level FCMs. Using each ratio as a node yields a large and typically dense interaction matrix, complicating learning, destabilizing dynamics, and undermining the core motivation of FCMs (transparent reasoning and stress testing). This observation supports the need for concept aggregation or hierarchical structuring when FCM-based models are employed in data-intensive risk assessment problems. Therefore, an intermediate concept layer is needed to bridge high-dimensional financial inputs and stable concept-level FCM reasoning.

The main purpose of this work is to depart from classical feature selection and retain the full set of available financial ratios. Rather than operating directly on raw, high-dimensional feature spaces, we aggregate all ratios into a reduced set of interpretable financial concepts, which serve as the building blocks of a hierarchical Fuzzy Cognitive Map (FCM). In this framework, aggregation is used for predictive convenience, to enable structured interaction modeling and risk propagation at the concept level. The resulting hierarchical FCM retains the informational content of the original features while remaining stable and suitable for monitoring financial distress.

Specifically, we propose a hierarchical Fuzzy Cognitive Map framework for financial risk monitoring. We aggregate several financial ratios into a reduced set of data-driven financial concepts using unsupervised clustering. Risk reasoning is then performed at the concept level through an unsigned FCM, which propagates vulnerability across related financial dimensions. The interaction structure is learned using weak Hebbian updates without end-to-end predictive optimization, and supervision is introduced only through a sparse residual readout. This corrects aggregation-induced information loss. Our framework was evaluated on two real-world bankruptcy datasets under severe class imbalance and limited monitoring budgets. We show that the proposed approach improves risk concentration among top-ranked firms compared to linear baselines, while remaining competitive with stronger non-linear models and providing concept-level risk explanations.

The rest of the paper proceeds as follows. Section 2 presents related work and research questions. Section 3 discusses the proposed methodology. Section 4 presents the results. Section 5 discusses the empirical findings, and Section 6 concludes.

2. Literature Review

2.1. Foundations of Ratio-Based Distress Prediction

The empirical study of corporate financial distress originated in the 1960s with the systematic use of accounting ratios as early warning indicators. Beaver (1966) was among the first to provide large-sample evidence that financial ratios carry information about the likelihood of bankruptcy. This line of work focused on cash-flow and several liquidity ratios and formalized issues related to measurement design, accrual versus cash-flow emphasis, and sampling biases in failure prediction studies (Beaver et al., 2011).

Building on this, Altman introduced the Z-score model, which combines multiple financial ratios via Multiple Discriminant Analysis (MDA) and classifies firms by their bankruptcy risk. The Z-score framework and its later extensions remain reference benchmarks in distress prediction research and practice (Altman, 2018). These studies established financial ratios as key factors for modeling and estimating bankruptcy and financial distress.

Subsequent developments extended discriminant-based modeling through the ZETA framework by refining multivariate ratio analysis (Altman et al., 1977). Moreover, Ohlson introduced a probabilistic approach by estimating a logistic regression that improved statistical inference in bankruptcy prediction (Ohlson, 1980).

International validations have provided substantial evidence of the predictive accuracy and relevance of ratio-based models across institutional settings (Altman et al., 2017). Moreover, SME-oriented adaptations have expanded their applicability beyond large listed firms (Altman & Sabato, 2007).

Recent studies have documented the predictive importance of financial ratios such as profitability, liquidity and leverage in diverse markets (Akil et al., 2024; Marsenne et al., 2023; Powell et al., 2023; Afgani et al., 2023). Ratio-based frameworks have also been extended with corporate governance indicators, yielding incremental explanatory and predictive gains (Liang et al., 2016).

2.2. Agency, Capital Structure, and Economic Mechanisms of Distress

During the 1980s and 1990s, research on financial distress expanded beyond statistical classification and incorporated structural and agency-based frameworks. Financial distress was examined as a contracting problem among shareholders, creditors, and other stakeholders, where incentive conflicts, renegotiation, and restructuring mechanisms influence the probability of default and its resolution (Chen et al., 1995; Senbet & Wang, 2012).

Capital structure theories have helped to disentangle distress risk by linking financing decisions to default incentives. The trade-off framework models leverage as a balance between tax benefits and expected bankruptcy and agency costs. Furthermore, pecking-order theory attributes financing choices to informational asymmetries between insiders and external investors (Muvingi et al., 2015). These frameworks have contributed to a better understanding of the relationship between financial distress probability and leverage, liquidity constraints, and cash-flow instability.

Recent research has emphasized the role of corporate governance mechanisms and institutional structures in distress prediction and bank failure analysis (Alzayed et al., 2023). Sustainability and resilience affect financial stability and risk assessment, which reflects the integration of ESG and systemic risk dimensions into distress modeling (Shcherbak et al., 2026; Ziolo et al., 2019).

Together, these strands have shifted the view of bankruptcy prediction away from a pure classification task and toward an understanding of financial distress as an economically grounded process influenced by financing decisions and contractual frictions.

2.3. Market-Based, Hazard, and Hybrid Prediction Models

In the 1990s, attention turned to market-based approaches, a strand of the literature rooted in contingent-claims theory and the Merton framework, in which equity is modeled as an option on firm assets. These structural models generated distance-to-default measures derived from market prices and asset volatility, providing forward-looking indicators of financial distress (Hendricks, 2004).

Empirical findings show that accounting and market-based models can capture complementary information. Accounting ratios often perform strongly in short-horizon prediction, whereas market-implied measures may incorporate real-time information and thereby improve longer-horizon forecasts (Agarwal & Taffler, 2008; Fejér-Király, 2015).

Hazard and duration models have provided a promising extension to the financial distress framework by incorporating time-varying covariates and dynamic default probabilities. Large-sample evidence indicates that hazard-based approaches can capture the information content of traditional discriminant and logit models (Bauer & Agarwal, 2014).

More recent hybrid frameworks use accounting ratios and market-based variables within unified credit risk models. For example, dynamic loading models show that accounting and market signals play different roles across credit quality segments. Furthermore, hybrid specifications outperform standalone approaches in both in-sample and out-of-sample predictive performance (Li & Miu, 2009).

2.4. Traditional Statistical Methods and Machine Learning Approaches

In his seminal work, Beaver (1966) analyzed the ability to predict bankruptcy through univariate ratio analysis and showed that specific ratios, such as the cash-flow-to-debt ratio, differentiate between bankrupt and healthy firms. Altman (1968), recognizing the limitations of the univariate model, introduced Multiple Discriminant Analysis (MDA), which combines financial ratios into a linear risk score (the Z-score). This made it possible to analyze the profile of a firm as a whole and to statistically consolidate the informative value of the accounting ratios.

Recent studies have extended classical models through penalization and structured variable selection. Penalized logistic regression models such as Lasso, SCAD and MCP improve stability and reduce overfitting in high-dimensional settings. These models are competitive in terms of accuracy and AUC performance. Moreover, they combine simplicity with interpretability by reducing weak predictors in order to retain only the informative variables (Fejér-Király, 2015; Guo & Xie, 2025). Dimensionality reduction techniques such as Principal Component Analysis (PCA) have also been integrated with the financial distress literature to mitigate multicollinearity and improve estimation stability (Nguyen et al., 2023; Zeng & Xu, 2023).

Ensemble learning techniques, such as gradient boosting and decision trees, demonstrate strong discriminatory power in bankruptcy prediction tasks (Qian et al., 2022; Zieba et al., 2016). Feature selection and importance-correction methods enhance model performance and interpretability in tree-based frameworks (Qian et al., 2022). Furthermore, hybrid combinations of boosting algorithms and logistic regression have shown significant improvements in predictive accuracy without losing the interpretability of the model (Duan, 2023). At the same time, the increasing complexity of ensemble and deep learning models has motivated the development of explainable artificial intelligence approaches. These frameworks are tailored to financial distress prediction and aim to reconcile predictive performance with transparency and regulatory requirements (Zhang et al., 2022).

Although these approaches often improve global discrimination metrics, they sacrifice structural transparency and economic interpretability. In particular, when feature interactions become opaque, they exhibit a trade-off between predictive optimization and interpretability, which is a central tension in contemporary financial distress modeling.

2.5. Fuzzy Cognitive Maps in Financial Risk Modeling

Fuzzy Cognitive Maps have been widely applied as a modeling tool in complex risk problems. Bakhtavar et al. (2021) conducted an analytical review of their use in risk analysis, decision making, and policy modeling in various domains and highlighted their ability to operate under uncertainty. FCMs have also been applied to the macroeconomic domain (Migkos et al., 2022). Specifically, Migkos et al. analyzed systemic interactions and policy propagation mechanisms to evaluate the impact of the 2010 Memorandum in Greece, developing an explanatory model that focused on the interactions of macroeconomic factors and analyzing its consequences for the economy.

Korol (2019), on the other hand, studied risk prediction modeling at the firm level and emphasized the temporal evolution of financial ratios. Relying on statistical and machine learning methodologies, the author constructed a rule-based fuzzy inference system in which financial ratios are converted to membership functions. The risk score is predicted by applying “If-Then” rules, which allow for smooth transitions between healthy and distressed states instead of relying on rigid thresholds.

Applications in credit and bankruptcy risk assessment exist (Hajek & Prochazka, 2018), but empirical benchmarking against large-scale data-driven baselines is limited. Another line of research has focused on modeling the causal relationships between different types of banking credit risk. Jalilian et al. (2019) ranked 36 credit risk types using intuitionistic fuzzy Failure Mode and Effects Analysis. Through scenario simulation, they identified the main risk factors that drive systemic credit risk. Hybrid and multi-stage FCM variants attempt to reduce structural complexity (Rezaee et al., 2017), yet they typically do not address the scalability challenges arising in high-dimensional financial ratio environments. Trostianska and Semencha (2020) studied how reputational risk modeling could serve as a qualitative diagnostic tool within risk management, supporting a strategic program for restoring confidence in the banking system as a whole.

Beyond financial distress analysis, FCMs have been applied in the broader financial domain, including housing market forecasting and geographically dispersed financial organizations (Azadeh et al., 2012; Glykas & Xirogiannis, 2005). Fuzzy-logic-based risk evaluation has also been employed in market and energy risk contexts (Medina & Moreno, 2007). Expert knowledge aggregation and systemic risk measurement within cognitive-map frameworks further illustrate the flexibility of FCM-based approaches (Mezei & Sarlin, 2016). Recent developments have explored distributed FCM architectures for feature selection in high-dimensional classification tasks, indicating increased scalability potential (Haritha et al., 2022).

2.6. Research Questions and Contributions

In this work, we examine how Fuzzy Cognitive Maps can be employed for financial risk monitoring by addressing the research questions below.

RQ1: Can we employ FCMs when financial risk assessment relies on high-dimensional and highly correlated financial ratios?

We propose a hierarchical construction of FCMs that separates raw financial ratios from concept-level reasoning. We construct an intermediate layer of aggregated financial concepts, which mitigates the instability and complexity issues associated with feature-level FCMs while preserving interpretability and dynamic reasoning capabilities.

RQ2: Can we aggregate financial ratios in an unsupervised manner into data-induced concepts, and improve risk monitoring, robustness and interpretability?

We introduce an unsupervised, cluster-induced mechanism for constructing FCM concepts directly from financial data. We group ratios according to their empirical co-movement structure, and we produce coherent and interpretable financial concepts that enhance robustness and reduce sensitivity to noise and feature redundancy.

RQ3: To what extent can propagation concentrate risk signals among top-ranked firms beyond linear scoring?

We develop a risk monitoring framework in which aggregated financial concepts interact through an unsigned FCM structure. By allowing information to propagate across related financial dimensions, the proposed approach captures risk accumulation effects that are not observable when financial ratios are analyzed in isolation.

While the literature on financial distress prediction has evolved from ratio-based diagnostics to structural, market-implied, hybrid and machine-learning models, certain methodological challenges remain relevant. High-dimensional and highly correlated financial ratios continue to pose stability and interpretability concerns, particularly in monitoring-oriented settings under severe class imbalance.

Although Fuzzy Cognitive Maps provide a natural framework for modeling interacting financial concepts, their application in firm-level distress prediction has largely remained expert-driven and has rarely been evaluated under large-scale empirical benchmarking and ranking-based monitoring criteria.

The present study contributes to this literature by proposing a hierarchical, data-driven FCM framework that aggregates financial ratios into empirically induced financial concepts and evaluates their interaction-driven propagation under monitoring constraints.

3. Materials and Methods

3.1. Materials

3.1.1. Dataset Description

The evaluation was conducted on two real-world financial distress datasets. The main dataset (Liang et al., 2016), the Taiwanese bankruptcy dataset, contains 6819 firm observations described by 95 financial ratios, covering multiple dimensions of firm performance, including profitability, leverage, liquidity, and operating efficiency. Each observation is associated with a binary distress label, resulting in a severely imbalanced setting with 220 distressed firms (base rate $220 / 6819 = 3.23\%$).

To evaluate robustness and provide external validation under a different accounting regime, we tested the proposed framework on the Polish bankruptcy dataset (5th Year) (Zieba et al., 2016). We followed an explicit early-warning design in which financial ratios are observed one year before bankruptcy. The dataset contains 5910 instances with 64 financial attributes and a binary outcome label, including 410 bankrupt firms and 5500 non-bankrupt firms (base rate $410 / 5910 = 6.94\%$).

The objective of the empirical evaluation was methodological validation across heterogeneous accounting environments rather than dominance within a specific market. The Taiwanese and Polish datasets represent widely used benchmark settings in financial distress prediction and differ substantially in accounting regimes and class imbalance structure. The consistent behavior of the proposed framework across these distinct environments supports its structural portability and robustness across heterogeneous accounting regimes, which is central to its intended monitoring application.

3.1.2. Data Splitting and Preprocessing

Each dataset was partitioned using a stratified $70 / 30$ split, with $70\%$ of the observations used for model development and $30\%$ reserved for final testing; the split was repeated under different random seeds. Within the development set, separate training and validation splits were employed for parameter learning and hyperparameter selection, respectively, while preserving class proportions across all subsets. All preprocessing, including scaling and normalization of financial ratios, was fitted exclusively on the training data and subsequently applied to the validation and test sets to prevent information leakage.
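As an illustration of the stratified 70/30 partition, a minimal sketch is given below. It is not the authors' code: the function name `stratified_split`, the seed value, and the toy label vector (built only from the Taiwanese base rate of 220 distressed firms among 6819) are assumptions made for this example.

```python
import random


def stratified_split(labels, test_frac=0.30, seed=0):
    """Return (dev_idx, test_idx), splitting each class separately so that
    class proportions are approximately preserved in both parts."""
    rng = random.Random(seed)
    dev_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test_idx.extend(idx[:n_test])
        dev_idx.extend(idx[n_test:])
    return sorted(dev_idx), sorted(test_idx)


# Toy labels reproducing only the class counts (hypothetical ordering).
labels = [1] * 220 + [0] * (6819 - 220)
dev_idx, test_idx = stratified_split(labels, test_frac=0.30, seed=42)
test_pos = sum(labels[i] for i in test_idx)
print(len(dev_idx), len(test_idx), test_pos)
```

The same routine could be applied again inside the development set to obtain the training and validation subsets; any scaler would then be fitted on the training indices only.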

3.1.3. Pipeline

In Figure 1 we depict the proposed framework, which follows a structured pipeline for interpretable financial risk monitoring. First, standardized financial ratios are aggregated into financially meaningful concepts using correlation-based clustering. Second, concept activations are propagated through the proposed hierarchical Fuzzy Cognitive Map with a learned interaction structure. Third, the concept interaction matrix is updated via weakly supervised Hebbian learning, guided by deviations from the average distress rate. Finally, firms are ranked according to a concept-based risk score and evaluated under limited monitoring budgets using Top-K performance metrics.
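The Top-K evaluation in the last pipeline step can be sketched as follows. We assume here that Recall@Top-K means the share of all distressed firms that appear among the K highest-scored firms; the exact definition used in the experiments is not stated in this passage, and the function name and toy scores are hypothetical.

```python
def recall_at_top_k(scores, labels, k):
    """Share of positive (distressed) firms found among the k highest scores.

    Assumed definition: hits in the top-k divided by the total number of
    positives in the evaluated set.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in order[:k])
    total = sum(labels)
    return hits / total if total else 0.0


# Toy example: five firms, three of them distressed.
scores = [0.9, 0.1, 0.8, 0.3, 0.7]
labels = [1, 0, 0, 1, 1]
print(recall_at_top_k(scores, labels, k=2))
print(recall_at_top_k(scores, labels, k=3))
```

A monitoring budget corresponds to the choice of K: a supervisor able to review only K firms recovers this fraction of the distressed cases.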

3.2. Methods

3.2.1. Fuzzy Cognitive Maps for Risk Reasoning

Fuzzy Cognitive Maps (FCMs) are graph-based dynamical systems that represent complex domains through interacting concepts and weighted interconnections. An FCM is defined by (i) a set of concept nodes, (ii) a weighted adjacency matrix encoding inter-concept influence, and (iii) a nonlinear update rule that propagates the activation through the graph over discrete time steps (Kosko, 1986; Papageorgiou, 2012). This makes FCMs attractive for financial risk monitoring, where understanding interactions among financial dimensions and enabling stress testing can be as important as predictive performance.

Let $a_{i}^{(t)} \in [0, 1]$ denote the activation of concept $i$ at iteration $t$, and let $A^{(t)} = (a_{1}^{(t)}, \dots, a_{m}^{(t)}) \in [0, 1]^{m}$ be the concept state vector. Interactions are encoded in a matrix $W \in \mathbb{R}^{m \times m}$ with entries $w_{ij}$, where $w_{ij}$ denotes the influence from concept $i$ to concept $j$. Accordingly, the net input received by concept $j$ at iteration $t$ is $\sum_{i = 1}^{m} a_{i}^{(t)} w_{ij}$.

Before introducing the FCM dynamics, we specify how concept activations are obtained from the observed financial ratios. Let $x \in \mathbb{R}^{d}$ denote the standardized feature vector of a firm. Financial ratios are first grouped into economically meaningful concepts via correlation-based clustering. Each concept activation is computed as an aggregation of the ratios assigned to the corresponding cluster, resulting in an initial concept state $A^{(0)} = g(x) \in [0, 1]^{m}$, where $g(\cdot)$ denotes the aggregation map. This mapping defines the interface between the raw feature space and the concept-level FCM representation.

Using a row-vector state $A^{(t)} \in [0, 1]^{m}$, the aggregation is written as $A^{(t)} W$. The FCM dynamics are implemented via an inertial update:

$$A^{(t + 1)} = \beta A^{(t)} + (1 - \beta)\, \sigma(A^{(t)} W), \tag{1}$$

where $\beta \in [0, 1)$ controls inertia and $\sigma(\cdot)$ is a bounded activation function (the sigmoid function is employed in our work). Learning in FCMs aims to estimate $W$ from data, expert knowledge, or hybrid schemes. In data-driven settings, associative mechanisms (e.g., Hebbian learning) are commonly employed to capture empirical co-activation patterns in an interpretable manner (Papageorgiou, 2012; Papageorgiou & Salmeron, 2013). In our setting, learning focuses exclusively on estimating the interaction matrix $W$, while the aggregation map $g(\cdot)$ remains fixed after concept construction.
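The inertial update of Equation (1) can be illustrated with a short sketch. The weight matrix, the initial state, the value of the inertia parameter, and the number of steps below are illustrative assumptions, not values from the experiments; plain Python lists are used so that the example is self-contained.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def fcm_step(state, W, beta):
    """One inertial update: A(t+1) = beta*A(t) + (1-beta)*sigmoid(A(t) W).

    `state` is a row vector; W[i][j] is the influence of concept i on j.
    """
    m = len(state)
    net = [sum(state[i] * W[i][j] for i in range(m)) for j in range(m)]
    return [beta * state[j] + (1.0 - beta) * sigmoid(net[j]) for j in range(m)]


def fcm_propagate(state, W, beta=0.5, steps=3):
    """Apply the update a fixed number of times and return the final state."""
    for _ in range(steps):
        state = fcm_step(state, W, beta)
    return state


# Hypothetical 3-concept map with zero self-loops.
W = [[0.0, 0.6, 0.4],
     [0.5, 0.0, 0.5],
     [0.3, 0.7, 0.0]]
A0 = [0.9, 0.2, 0.4]
A3 = fcm_propagate(A0, W, beta=0.5, steps=3)
print(A3)
```

Because each new state is a convex combination of the previous state and a sigmoid output, activations stay in $[0, 1]$ for any $\beta \in [0, 1)$.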

3.2.2. Cluster-Induced Hierarchical FCM

Hierarchical architecture.

We formalize the proposed methodology as a three-level construction:

Level 1 (Attribute Concepts): raw financial ratios $x \in \mathbb{R}^{p}$.
Level 2 (Aggregated Concepts): induced concept activations $c^{(0)} = g(x) \in [0, 1]^{m}$ obtained by unsupervised clustering and aggregation of ratios.
Level 3 (Risk Index Concept): a continuous risk score $r \in [0, 1]$ read out from the evolved concept state $c^{(T)}$.

The goal is to study how interactions among induced financial dimensions (Level 2) shape a firm-level vulnerability index (Level 3), while preserving interpretability through explicit concept memberships (Level 1 → Level 2 mapping).

Train-fitted preprocessing.

Given a dataset $D = \{(x^{(n)}, y^{(n)})\}_{n = 1}^{N}$ with $x^{(n)} \in \mathbb{R}^{p}$ and binary outcome $y^{(n)} \in \{0, 1\}$, all preprocessing steps are fitted exclusively on the training split and then frozen, meaning that the resulting scaling parameters are kept fixed and applied unchanged to validation and test observations to prevent information leakage. Each ratio is robustly scaled using the median and interquartile range (IQR):

$$z_{ij} = \frac{x_{ij} - \operatorname{median}(x_{j})}{\operatorname{IQR}(x_{j}) + \varepsilon}. \tag{2}$$

Robust scaling mitigates the impact of the heavy-tailed distributions and extreme values commonly observed in financial ratios, ensuring stable aggregation and comparability across features without imposing parametric distributional assumptions.
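A minimal sketch of the train-fitted robust scaling in Equation (2) follows. The quartile convention (the "inclusive" method of `statistics.quantiles`), the value of $\varepsilon$, and the toy data are assumptions; the paper does not specify them in this passage.

```python
import statistics


def fit_robust_scaler(train_rows):
    """Estimate per-ratio median and IQR on the training split only."""
    params = []
    for col in zip(*train_rows):
        q1, med, q3 = statistics.quantiles(col, n=4, method="inclusive")
        params.append((med, q3 - q1))
    return params


def apply_robust_scaler(rows, params, eps=1e-9):
    """Apply frozen training statistics: z = (x - median) / (IQR + eps)."""
    return [[(x - med) / (iqr + eps) for x, (med, iqr) in zip(row, params)]
            for row in rows]


# Toy training data: 5 firms x 2 ratios, the first ratio has an extreme value.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [100.0, 50.0]]
params = fit_robust_scaler(train)

# A held-out firm is scaled with the training parameters, never refitted.
z_test = apply_robust_scaler([[5.0, 30.0]], params)
print(params, z_test)
```

In this toy case the outlier of 100 does not affect the median (3) or the IQR (2) of the first ratio, which is the behavior the robust scaling is meant to provide.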

Unsupervised clustering of ratios into aggregated concepts.

We cluster ratios (features), treating each ratio $j$ as the vector $v_{j} = \tilde{Z}_{:j}$ of its preprocessed values across training firms. Pairwise similarity is measured via absolute Pearson correlation, converted to a distance $d_{ij} = 1 - |\operatorname{corr}(v_{i}, v_{j})|$. Hierarchical clustering with average linkage partitions the $p$ ratios into $m$ clusters $\{C_{1}, \dots, C_{m}\}$. This yields a fixed attribute-to-concept mapping matrix $A \in \mathbb{R}^{p \times m}$ defined by $A_{jk} = 1 / |C_{k}|$ if $j \in C_{k}$ and $A_{jk} = 0$ otherwise. For each firm, initial aggregated concept activations are computed as

$$c^{(0)} = \sigma(z^{\top} A) \in [0, 1]^{m}, \tag{3}$$

where $z$ is the preprocessed ratio vector of the firm.

Ratios are clustered based on absolute Pearson correlation, treating each ratio as a vector of firm-level observations. This choice reflects the fact that financial ratios often differ in scale and units but convey related economic information when they co-move across firms. Correlation-based similarity therefore captures shared financial dynamics rather than magnitude effects, making it suitable for aggregating ratios into economically coherent dimensions.
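The clustering and aggregation steps can be sketched as below. This is a naive, self-contained average-linkage implementation written for illustration; a library routine for hierarchical clustering would normally be preferred. The function names and the toy matrix are hypothetical.

```python
import math


def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def cluster_ratios(Z, m):
    """Average-linkage agglomeration of the columns of Z into m clusters,
    using the distance d_ij = 1 - |corr(v_i, v_j)|."""
    cols = list(zip(*Z))
    p = len(cols)
    dist = [[1.0 - abs(pearson(cols[i], cols[j])) for j in range(p)]
            for i in range(p)]
    clusters = [[j] for j in range(p)]
    while len(clusters) > m:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                d = sum(pairs) / len(pairs)  # average linkage
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


def mapping_matrix(clusters, p):
    """A[j][k] = 1/|C_k| if ratio j belongs to cluster k, else 0."""
    A = [[0.0] * len(clusters) for _ in range(p)]
    for k, members in enumerate(clusters):
        for j in members:
            A[j][k] = 1.0 / len(members)
    return A


def concept_activations(z, A):
    """c0 = sigmoid(z^T A), one activation per concept (Equation (3))."""
    m = len(A[0])
    nets = [sum(z[j] * A[j][k] for j in range(len(z))) for k in range(m)]
    return [1.0 / (1.0 + math.exp(-v)) for v in nets]


# Toy scaled training matrix: 6 firms x 4 ratios (two co-moving pairs).
Z = [[1.0, 1.1, 1.0, 2.0],
     [2.0, 2.0, -1.0, -2.0],
     [3.0, 3.2, 1.0, 2.5],
     [4.0, 3.9, -1.0, -2.0],
     [5.0, 5.1, 1.0, 2.0],
     [6.0, 6.0, -1.0, -2.5]]
clusters = cluster_ratios(Z, m=2)
A = mapping_matrix(clusters, p=4)
c0 = concept_activations([1.0, 1.0, 2.0, 2.0], A)
print(clusters, c0)
```

Each column of the mapping matrix averages the ratios of one cluster, so the concept memberships remain explicit and can be inspected directly.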

Learning inter-concept interactions $W_{cc}$ via correlation initialization and weak Hebbian refinement.

Let $C^{(0)} \in [0, 1]^{N_{tr} \times m}$ denote the matrix of initial concept activations on the training set, where each row $C_{n,:}^{(0)} = g(x^{(n)})$ corresponds to a firm and each column to an induced financial concept. We construct the inter-concept coupling matrix $W_{cc} \in \mathbb{R}_{\geq 0}^{m \times m}$ in two stages.

Stage 1 (Unsupervised correlation-based initialization). We initialize

W_{c c}

from the empirical co-activation structure of the training concepts by computing an unsigned correlation matrix. Specifically, each concept dimension of

C^{(0)}

is standardized across training firms, and we set

W^{(0)} = |\frac{1}{N_{tr} - 1} {(\frac{C^{(0)} - {\bar{C}}^{(0)}}{std (C^{(0)})})}^{⊤} (\frac{C^{(0)} - {\bar{C}}^{(0)}}{std (C^{(0)})})|, diag (W^{(0)}) = 0,

(4)

where the absolute value is applied elementwise to enforce unsigned interaction strengths. Optionally, row-wise top-k sparsification is applied to retain only the strongest interactions per concept, followed by row normalization to ensure stable propagation dynamics.
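A minimal sketch of the Stage 1 initialization in Equation (4), including the optional top-k sparsification and row normalization, is given below. The function name `init_coupling`, the random toy activations, and the small constant `eps` used to avoid division by zero are assumptions for illustration.

```python
import numpy as np


def init_coupling(C0, top_k=None, eps=1e-12):
    """Unsigned correlation initialization of W_cc (Equation (4)).

    top_k, if given, must be a positive integer: only the top_k strongest
    couplings per row are kept before row normalization.
    """
    Zc = (C0 - C0.mean(axis=0)) / (C0.std(axis=0, ddof=1) + eps)  # standardize concepts
    W = np.abs(Zc.T @ Zc) / (C0.shape[0] - 1)                     # |correlation|
    np.fill_diagonal(W, 0.0)                                      # remove self-loops
    if top_k is not None:
        weakest = np.argsort(W, axis=1)[:, :-top_k]               # all but the top_k per row
        np.put_along_axis(W, weakest, 0.0, axis=1)
    return W / (W.sum(axis=1, keepdims=True) + eps)               # row normalization


# Toy concept activations: 50 training firms, 4 concepts.
rng = np.random.default_rng(1)
C0 = rng.uniform(size=(50, 4))
W0 = init_coupling(C0, top_k=2)
```

Row normalization bounds the total incoming influence of each concept, which is one simple way to keep the subsequent propagation from diverging.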

Stage 2 (Weakly supervised Hebbian refinement around an average case). We incorporate limited supervisory information without performing end-to-end predictive optimization by refining

W_{c c}

using a weak Hebbian update centered around an average-case concept profile. Let

μ_{0} = E [C^{(0)} ∣ y = 0]

denote the mean concept activation vector of non-distressed firms in the training set. For each training firm with concept vector

c_{n}^{(0)}

and label

y_{n} \in {0, 1}

, we perform the update

W_{c c} \leftarrow (1 - δ) W_{c c} + η ω (y_{n}) (c_{n}^{(0)} - μ_{0}) {(c_{n}^{(0)} - μ_{0})}^{⊤},

(5)

where $η > 0$ is a learning rate, $δ \in [0, 1)$ is a decay factor, and $ω (y_{n}) = 1$ for $y_{n} = 1$ and $ω (y_{n}) = - ρ$ for $y_{n} = 0$, with a small $ρ > 0$ to mitigate class imbalance. After each update, non-negativity is enforced, self-loops are removed, and the same optional sparsification and row normalization as in the initialization stage are applied. The resulting $W_{c c}$ is held fixed during inference and used in the FCM-style propagation of concept activations. We use the FCM formalism as an interpretable dynamical aggregation mechanism; edges encode unsigned coupling strengths learned from co-activation patterns and not causal effects.
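The update in Equation (5), followed by the projection steps described above, can be sketched as below. The hyperparameter values (`eta`, `delta`, `rho`), the single pass over the training firms, and the omission of the optional sparsification are assumptions made for this illustration; the text does not report them.

```python
import numpy as np


def hebbian_refine(W, C0, y, eta=0.01, delta=0.001, rho=0.05, eps=1e-12):
    """One pass of the weak Hebbian update (Equation (5)) over the training firms.

    eta, delta and rho are illustrative values, not those of the original study.
    """
    mu0 = C0[y == 0].mean(axis=0)                 # average non-distressed profile
    W = W.copy()
    for c, label in zip(C0, y):
        d = c - mu0                               # deviation from the average case
        omega = 1.0 if label == 1 else -rho       # down-weight the majority class
        W = (1.0 - delta) * W + eta * omega * np.outer(d, d)
        W = np.maximum(W, 0.0)                    # enforce non-negativity
        np.fill_diagonal(W, 0.0)                  # remove self-loops
        W = W / (W.sum(axis=1, keepdims=True) + eps)  # row normalization
    return W


# Toy data: 60 firms, 4 concepts, roughly 20% distressed.
rng = np.random.default_rng(2)
C0 = rng.uniform(size=(60, 4))
y = (rng.uniform(size=60) < 0.2).astype(int)

W_init = np.full((4, 4), 1.0 / 3.0)
np.fill_diagonal(W_init, 0.0)
W_cc = hebbian_refine(W_init, C0, y)
```

Distressed firms strengthen the coupling between concepts that deviate jointly from the non-distressed average, while non-distressed firms apply a much weaker opposite adjustment.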

Inter-concept interactions are modeled as unsigned in order to represent influence intensity rather than causal direction. The objective of the proposed framework is monitoring-oriented risk concentration rather than structural causal inference. Consequently, the coupling matrix

W_{c c}

captures association strength derived from empirical co-movement patterns of financial ratios, without attributing directional economic causality.

This choice improves dynamical stability during multi-step propagation and prevents oscillatory effects that may arise from arbitrary sign assignments in high-dimensional settings. Interpretability is preserved through explicit concept membership and concept-to-risk weights, which provide transparent attribution independently of edge sign orientation. Importantly, this procedure does not minimize a global predictive loss. Instead, it combines unsupervised concept co-activation with a weak, average-case–centered associative signal, yielding a structured and interpretable concept interaction network.

FCM propagation on aggregated concepts.

Starting from

c^{(0)}

, the aggregated concepts evolve for T iterations according to an FCM-inspired update that preserves a direct injection of the data-induced state:

c^{(t + 1)} = β c^{(t)} + (1 - β) σ (c^{(0)} + c^{(t)} W_{c c}), t = 0, \dots, T - 1 .

(6)

This propagation allows coherent vulnerability patterns to diffuse across related financial dimensions while preventing degenerate dynamics driven solely by inter-concept interactions. The proposed framework does not reduce to simple feature aggregation followed by linear scoring nor to conventional dimensionality reduction techniques such as Principal Component Analysis (PCA). While aggregation constitutes the first stage of the methodology, the model subsequently introduces explicit inter-concept interaction through the learned coupling matrix

W_{c c}

and controlled multi-step propagation. In contrast, linear scoring or PCA-based approaches produce static projections that do not model interaction-driven amplification or risk accumulation effects across financial dimensions. Moreover, PCA components are variance-driven and orthogonal by construction, whereas the proposed clustering preserves economically interpretable ratio groupings and enables concept-level attribution throughout the propagation process.
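The propagation rule in Equation (6) can be sketched in a few lines. The values `T=3` and `beta=0.5` and the toy coupling matrix are assumptions chosen only to make the example concrete.

```python
import numpy as np


def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))


def propagate(c0, W_cc, T=3, beta=0.5):
    """Iterate Equation (6) for T steps, re-injecting c0 at every step."""
    c0 = np.asarray(c0, dtype=float)
    c = c0.copy()
    for _ in range(T):
        c = beta * c + (1.0 - beta) * sigmoid(c0 + c @ W_cc)
    return c


# Toy example with three concepts and a non-negative, zero-diagonal coupling matrix.
c0 = np.array([0.2, 0.8, 0.5])
W_cc = np.array([[0.0, 0.7, 0.3],
                 [0.5, 0.0, 0.5],
                 [0.4, 0.6, 0.0]])

c_T = propagate(c0, W_cc, T=3, beta=0.5)
c_uncoupled = propagate(c0, np.zeros((3, 3)), T=3, beta=0.5)   # no interactions
c_one_step = propagate(c0, np.zeros((3, 3)), T=1, beta=0.5)
```

With a non-negative coupling matrix, every activation after propagation is at least as large as in the uncoupled case, which is the amplification effect the text refers to.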

Risk index readout with a sparse residual channel.

Supervision is introduced only at the final risk readout stage and does not affect the learned FCM structure. This separation preserves the interpretability and stability of the concept interaction network while allowing limited alignment with observed distress outcomes.

The final risk score is computed from the evolved concept state via a concept-level readout, augmented with a low-magnitude residual correction to mitigate information loss due to aggregation. Let

s \in R

denote the latent risk score in logit space and

σ (\cdot)

the sigmoid. We define the risk score as

s = c^{(T) ⊤} w_{c r} + λ (b_{R} + x_{R}^{⊤} w_{R}), r = σ (s) .

(7)

where

w_{c r} \in R^{m}

links aggregated concepts to risk, and where

(b_{R}, w_{R})

is a sparse residual head defined over a small subset of raw ratios

x_{R} \in R^{k}

. The quantity

r \in [0, 1]

represents the model’s predicted probability of financial distress and is used for ranking firms and for all probabilistic evaluation metrics.
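Equation (7) can be written as a short function. The numerical values below are hypothetical and serve only to show how the residual term enters the logit and how setting λ to zero removes it.

```python
import numpy as np


def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))


def risk_score(c_T, w_cr, x_R, b_R, w_R, lam=0.1):
    """Risk readout of Equation (7): concept-level logit plus scaled residual head."""
    s = c_T @ w_cr + lam * (b_R + x_R @ w_R)
    return sigmoid(s)


# Hypothetical values: 3 evolved concepts and k = 2 residual ratios.
c_T = np.array([0.6, 0.9, 0.4])
w_cr = np.array([1.2, 0.8, -0.5])
x_R = np.array([0.3, -1.1])
b_R, w_R = -0.2, np.array([0.7, 0.4])

r = risk_score(c_T, w_cr, x_R, b_R, w_R, lam=0.1)
r_concept_only = risk_score(c_T, w_cr, x_R, b_R, w_R, lam=0.0)
```

Comparing `r` with `r_concept_only` for a given firm shows how much of its score is attributable to the residual channel.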

Residual feature selection. The residual subset R (with

| R | = k ≪ p

) is selected via mutual information ranking using the training split only, and then held fixed.

Residual head fitting. Given R,

(b_{R}, w_{R})

is fit on the TrainFull split using a class-balanced Logistic Regression on

x_{R}

.

Weak supervision control.

The scalar $λ \geq 0$ controls the magnitude of the supervised residual correction applied at the final readout stage. It is selected exclusively on the validation split within the development data and fixed prior to any test evaluation. We restrict $λ$ to small-magnitude values to ensure that concept-level propagation remains the dominant explanatory mechanism. The chosen value ($λ = 0.1$) balances calibration improvement with structural transparency. An ablation experiment with $λ = 0$ further confirms that the residual channel operates primarily as a low-magnitude corrective adjustment rather than as the main driver of ranking performance.

Overall, the proposed framework combines unsupervised structure learning with lightweight supervised calibration, prioritizing interpretability and monitoring-oriented warning over end-to-end predictive optimization.

Out-of-sample inference and interpretability.

For any unseen firm, all preprocessing parameters, the clustering-derived mapping A, and learned weights

(W_{c c}, w_{c r})

remain fixed. The firm is mapped to

c^{(0)}

, propagated to

c^{(T)}

, and assigned the risk score r. Interpretability is supported by (i) concept memberships (which ratios define each aggregated concept) and (ii) concept-level contributions, assessed through the magnitude of

w_{c r}

and scenario perturbations at the concept level.

High-dimensional financial ratios are grouped into latent risk concepts through unsupervised clustering applied to the standardized training data. Each cluster defines a concept node within the hierarchical FCM, whose activation represents an aggregated signal from its constituent ratios. Inter-concept connections are estimated in an unsigned manner to represent influence intensity (rather than causal direction), and they are refined through constrained learning to stabilize the dynamics while preserving interpretability.

The contribution of the proposed framework does not arise from global predictive optimization, but from its structured interaction mechanism. After aggregating highly correlated financial ratios into coherent financial concepts, the model introduces explicit inter-concept coupling and controlled multi-step propagation. This mechanism allows vulnerability signals in one financial dimension (e.g., leverage stress) to reinforce related dimensions (e.g., liquidity pressure), thereby capturing risk accumulation effects that are not observable in purely additive linear scoring frameworks. The resulting interaction-driven amplification explains the model’s ability to concentrate distress risk among the highest-ranked firms under fixed monitoring budgets.

3.2.3. Baseline Models

The proposed model is benchmarked against standard classification models that are widely used in financial risk prediction. Logistic Regression is employed as a linear baseline, while a Random Forest classifier serves as a nonlinear, tree-based reference model. All the baseline models are trained on the same training data and evaluated under identical experimental conditions.

3.2.4. Evaluation Metrics

Bankruptcy risk assessment in supervisory practice is inherently a monitoring problem under limited inspection capacity. Regulators and risk managers are typically able to review only a fixed fraction of firms within a given period. Under such operational constraints, the relevant objective is not threshold-based classification accuracy, but the ability to concentrate distressed firms among the highest-ranked observations.

For this reason, we adopt ranking-based evaluation metrics, and in particular Recall@Top-K, which measures the proportion of distressed firms captured within a predefined monitoring budget (e.g., Top-10%). Unlike threshold-dependent metrics, Recall@Top-K directly reflects resource-constrained inspection settings and is invariant to arbitrary classification cutoffs. This choice is consistent with the use of precision–recall analysis as a more informative evaluation framework than accuracy or ROC-based measures in severely imbalanced settings (Davis & Goadrich, 2006; Saito & Rehmsmeier, 2015).

Model performance is evaluated using metrics appropriate for highly imbalanced financial distress data and ranking-based warning systems. Overall discriminative ability is assessed through the area under the receiver operating characteristic curve (ROC–AUC) and the area under the precision–recall curve (PR–AUC), with particular emphasis on PR–AUC due to its sensitivity to minority-class performance. To reflect practical monitoring constraints and the effectiveness of the model, we further evaluate performance using recall for the top-K percent of ranked firms.

Let

y_{i} \in {0, 1}

denote the true distress status of firm i, and let

{\hat{s}}_{i} \in [0, 1]

be the corresponding predicted risk score, where higher values indicate greater financial distress risk. The receiver operating characteristic (ROC) curve plots the true positive rate

TPR = \frac{TP}{TP + FN}

(8)

against the false positive rate

FPR = \frac{FP}{FP + TN},

(9)

as the decision threshold varies. The ROC–AUC equals the probability that a randomly selected distressed firm receives a higher risk score than a randomly selected non-distressed firm. The precision–recall (PR) curve depicts precision

Precision = \frac{TP}{TP + FP}

(10)

as a function of recall

Recall = \frac{TP}{TP + FN} .

(11)

The PR–AUC corresponds to the average precision across recall levels and provides a more informative assessment than ROC–AUC in settings with severe class imbalance.
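Both summary measures can be computed directly from their definitions. The sketch below uses the pairwise-ranking form of ROC–AUC and the average-precision estimator of PR–AUC; the function names and toy scores are assumptions, and ties among scores in `average_precision` are broken by input order, which is a simplification.

```python
import numpy as np


def roc_auc(y, s):
    """Share of (distressed, non-distressed) pairs ranked correctly; ties count 0.5."""
    pos, neg = s[y == 1], s[y == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size


def average_precision(y, s):
    """Mean of the precision values observed at the rank of each distressed firm."""
    order = np.argsort(-s, kind="stable")
    hits = y[order]
    precision_at_rank = np.cumsum(hits) / np.arange(1, len(y) + 1)
    return (precision_at_rank * hits).sum() / hits.sum()


y = np.array([1, 0, 1, 0, 0])
s = np.array([0.9, 0.8, 0.7, 0.3, 0.1])

auc = roc_auc(y, s)           # 5 of the 6 distressed/non-distressed pairs are ordered correctly
ap = average_precision(y, s)  # precision is 1 at rank 1 and 2/3 at rank 3
```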

Furthermore, recall at top-K percent (Recall@Top-K) is defined by ranking firms in descending order of predicted risk score

{\hat{s}}_{i}

and computing

Recall @ Top - K = \frac{\sum_{i \in I_{K}} I (y_{i} = 1)}{\sum_{i = 1}^{N} I (y_{i} = 1)},

(12)

where

I_{K}

denotes the set of firms within the top-

K %

highest-ranked risk scores. This metric captures the proportion of distressed firms identified when monitoring capacity is limited to a fixed fraction of the population.
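Equation (12) can be illustrated with a small ranking example. The rule used to convert the percentage budget into a number of firms (rounding up) is an assumption, as the text does not specify it.

```python
import numpy as np


def recall_at_top_k(y, scores, k_pct):
    """Share of all distressed firms that fall within the top k_pct percent of scores."""
    n_top = int(np.ceil(len(y) * k_pct / 100.0))       # assumed rounding rule
    top = np.argsort(-scores, kind="stable")[:n_top]   # highest-ranked firms
    return y[top].sum() / y.sum()


# Ten firms, three of them distressed.
y = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0])
scores = np.array([0.90, 0.80, 0.10, 0.70, 0.20, 0.30, 0.05, 0.15, 0.40, 0.60])

r20 = recall_at_top_k(y, scores, 20)   # top 2 firms contain 1 of 3 distressed firms
r30 = recall_at_top_k(y, scores, 30)   # top 3 firms contain 2 of 3 distressed firms
```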

Lift@Top-K

To quantify the concentration of distressed firms within the monitored subset, we report the Lift at Top-K, defined as the ratio between the precision within the Top-K ranked firms and the base rate of distress:

Lift @ Top - K = \frac{Precision @ Top - K}{π},

(13)

where

π

denotes the base rate of distressed firms in the evaluation set. Values greater than one indicate improved identification of distressed firms relative to random selection.
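A corresponding sketch for Equation (13) is shown below, again with an assumed rounding rule for the monitoring budget and hypothetical toy scores.

```python
import numpy as np


def lift_at_top_k(y, scores, k_pct):
    """Precision within the top k_pct percent divided by the base rate of distress."""
    n_top = int(np.ceil(len(y) * k_pct / 100.0))       # assumed rounding rule
    top = np.argsort(-scores, kind="stable")[:n_top]
    precision_at_k = y[top].mean()
    base_rate = y.mean()
    return precision_at_k / base_rate


# Ten firms, three of them distressed (base rate 0.3).
y = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0])
scores = np.array([0.90, 0.80, 0.10, 0.70, 0.20, 0.30, 0.05, 0.15, 0.40, 0.60])

lift20 = lift_at_top_k(y, scores, 20)   # precision 1/2 against a base rate of 0.3
```

In this toy case the monitored subset contains distressed firms at roughly 1.67 times the rate expected under random selection.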

Brier Score

Finally, to assess the accuracy and calibration of probabilistic distress estimates, we report the Brier score, defined as the mean squared error between predicted probabilities and the binary outcomes:

BS = \frac{1}{n} \sum_{i = 1}^{n} {({\hat{s}}_{i} - y_{i})}^{2} .

(14)

Lower values indicate better probabilistic accuracy (and, in particular, improved calibration), complementing ranking-based measures such as PR–AUC and Recall@Top-K.
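Equation (14) amounts to a mean squared error on probabilities, as the following minimal sketch with hypothetical predictions shows.

```python
import numpy as np


def brier_score(y, p):
    """Mean squared difference between predicted probabilities and binary outcomes."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return np.mean((p - y) ** 2)


# Three firms: one distressed, two non-distressed.
bs = brier_score([1, 0, 0], [0.9, 0.2, 0.4])   # (0.01 + 0.04 + 0.16) / 3
```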

4. Empirical Evaluation and Results

4.1. Implementation Details and Hyperparameter Selection

For the hierarchical FCM, we fixed the number of aggregated concepts to

m = 40

. To mitigate information loss from aggregation, we employed a sparse residual feature channel comprising

k = 15

raw financial ratios. These ratios were selected using mutual information ranking on the training split only and were kept fixed thereafter. The residual contribution was controlled by a low-magnitude coefficient

λ = 0.1

, selected on the validation split to ensure that concept-level inference remained dominant.

The proposed model is not designed as a binary classifier optimized for a fixed decision threshold, but as a ranking-based early warning system that prioritizes firms according to their relative financial distress risk.

The number of aggregated concepts m, the size of the residual feature subset k, and the residual supervision strength

λ

were selected exclusively based on validation performance within the development split and were fixed prior to test evaluation.

The parameter m controlled the granularity of concept aggregation, where smaller values increased dimensionality reduction but entailed information loss risk, whereas larger values preserved detail at the cost of reduced compactness. The selected value represented a balance between interpretability and predictive stability.

The residual subset size

k ≪ p

was intentionally kept small to ensure that supervision remained lightweight and did not override concept-level reasoning.

Finally, the magnitude parameter

λ

acted as a calibration control, limiting the influence of the supervised residual channel so that risk propagation through aggregated concepts remained the dominant mechanism.

4.2. Cluster-Induced Concept Structure

Each aggregated financial concept produced by the clustering step represents a coherent group of empirically correlated ratios. The complete mapping of all 92 financial ratios to cluster-induced concepts is reported in Appendix A (Table A1), which lists the number of ratios per concept together with the ratios assigned to each concept. Singleton concepts correspond to ratios exhibiting weak correlation with the remaining feature set, while larger clusters capture shared financial dimensions. The resulting clusters group ratios that are commonly interpreted together in financial analysis, such as leverage-related indicators, liquidity measures, and profitability ratios, suggesting that correlation-based clustering recovers financially meaningful structures. Figure 2 depicts the absolute Pearson correlation matrix of financial ratios, reordered according to the induced concept memberships. The observed block-diagonal structure indicates strong within-concept co-movement and weaker or neutral associations between different groups of ratios. This pattern supports the use of correlation-based clustering for aggregating financial ratios into coherent concepts: ratios grouped within the same concept exhibit substantially higher mutual dependence than ratios belonging to different concepts.

4.3. Precision–Recall Performance

Figure 3 compares the precision–recall behavior of the proposed FCM model with that of the baseline models on the test set. As expected in a highly imbalanced setting, precision decreased as recall increased for all the approaches. Hierarchical FCM achieved a higher average precision than Logistic Regression across most recall levels, while remaining below Random Forest in overall PR–AUC. Importantly, the FCM exhibited competitive precision in the low-to-medium recall region, which is particularly relevant for early-stage risk identification.

Table 1 reports the overall discriminative performance on the test set. Random Forest achieved the highest ROC–AUC and PR–AUC, reflecting the advantage of flexible nonlinear models. Hierarchical FCM substantially outperformed Logistic Regression and attained strong precision–recall performance using a compact and interpretable concept-level representation. Moreover, the FCM exhibited improved probabilistic calibration relative to Logistic Regression, as reflected by a lower Brier score. All the metrics reported in Table 1 were computed according to the definitions in Section 3.2.4. PR–AUC was derived from the precision and recall definitions (Equations (10) and (11)), and the Brier score was computed according to Equation (14). All the probability estimates were obtained from the sigmoid-transformed risk score

{\hat{s}}_{i}

defined in Equation (7).

Sensitivity analysis.

We examined the sensitivity of the proposed framework to key hyperparameters controlling concept granularity (m), propagation depth (T), and the strength of weak supervision (

λ

). Performance remained stable across all the examined configurations with respect to the number of propagation steps, indicating that the proposed constraints effectively prevent unstable FCM dynamics. As shown in Table 2, increasing the number of concepts from

m = 32

to

m = 40

yielded consistently higher PR–AUC across all propagation depths, suggesting a favorable trade-off between dimensionality reduction and information preservation. In contrast, variation in the number of propagation steps had a limited effect on performance, supporting the robustness of the proposed propagation mechanism.

To assess the role of weak supervision, we evaluated the model for $λ = 0$ and $λ = 0.1$. The results show that supervision was light and did not dominate the risk propagation mechanism. With $λ = 0$ the residual correction is switched off, so Equation (7) reduces to $s = c^{(T) ⊤} w_{c r}$ and the model operates in a purely concept-based manner. In this configuration, the model achieved a PR–AUC of 0.325, compared to 0.275 for Logistic Regression, while maintaining comparable Recall@Top-K under realistic monitoring budgets. The $λ = 0$ analysis thus confirms that the concept-based propagation mechanism remained the primary driver of monitoring performance, with the residual channel operating as a low-magnitude corrective adjustment rather than as a dominant predictive component. In addition, probabilistic calibration improved substantially, with the Brier score decreasing from 0.089 to 0.046, as reported in Table 3.

These findings confirm that the warning signal is primarily driven by aggregation of financial ratios into interpretable concepts and interaction-based propagation within the FCM, with the residual channel acting as a low-magnitude calibration mechanism rather than a primary driver of ranking performance.

4.4. Warning Capability

In Figure 4 we show Recall@Top-K percent of ranked instances, in order to evaluate the ability of each model to identify high-risk firms under limited inspection capacity. The FCM consistently improved Recall@Top-K over Logistic Regression across monitoring budgets and captured a substantial proportion of distressed firms among the top-ranked observations. Although Random Forest achieved higher recall across most thresholds, the FCM offered a competitive trade-off between warning performance and interpretability through explicit concept-level risk attribution. The base rate of distressed firms, defined as the proportion of bankrupt firms in the test set, was

3.23 %

. Recall@Top-K values were computed according to Equation (12), where firms were ranked in descending order of predicted risk score

{\hat{s}}_{i}

(Equation (7)).

Table 4 reports performance under limited inspection capacity. At the strict Top-5% operating point, the hierarchical FCM matched Random Forest and outperformed Logistic Regression, capturing more than half of the distressed firms within a very limited inspection budget. At broader monitoring levels (Top-10% and Top-15%), the FCM consistently improved upon Logistic Regression and remained competitive with Random Forest, despite operating on a substantially lower-dimensional and interpretable concept-based representation.

This behavior suggests that interaction-driven propagation among aggregated financial concepts concentrates risk more effectively among the highest-ranked firms than linear feature-level scoring, even when overall discrimination is similar. While Random Forest achieves the highest global discrimination metrics, the objective of the present framework is not to compete in aggregate accuracy but to support monitoring-oriented decision making under limited inspection capacity. Under strict Top-K monitoring budgets, the hierarchical FCM achieves performance comparable to high-capacity nonlinear models while preserving a compact and economically interpretable concept structure. This positioning reflects a deliberate trade-off between maximal predictive optimization and structured transparency suitable for supervisory environments.

Table 5 reports the statistical significance of performance differences, assessed using paired tests on the common test set. Differences in ROC–AUC were evaluated using DeLong’s test (DeLong et al., 1988), while differences in PR–AUC and Recall@Top-K were assessed via a paired non-parametric bootstrap with 2000 resamples. The proposed FCM achieved a statistically significant improvement over Logistic Regression in PR–AUC ($Δ = + 0.115$, $95 \%$ CI $[+ 0.004, + 0.223]$, $p = 0.042$), indicating superior precision–recall trade-offs under class imbalance. Differences in ROC–AUC were not statistically significant, consistent with the fact that the FCM is not optimized for global discrimination. Compared to Random Forest, the hierarchical FCM exhibited comparable PR–AUC and warning recall, with no statistically significant differences at conventional levels. Recall@Top-K improvements over Logistic Regression were positive but not statistically significant, reflecting the limited number of distressed firms in the test set.
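A paired non-parametric bootstrap of a metric difference can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the percentile confidence interval, the two-sided p-value construction, the skipping of resamples without distressed firms, and the synthetic scores are all choices made for the example.

```python
import numpy as np


def average_precision(y, s):
    order = np.argsort(-s, kind="stable")
    hits = y[order]
    precision_at_rank = np.cumsum(hits) / np.arange(1, len(y) + 1)
    return (precision_at_rank * hits).sum() / hits.sum()


def paired_bootstrap(y, s_a, s_b, metric, n_boot=2000, seed=0):
    """Bootstrap the difference metric(A) - metric(B) on the same resampled firms."""
    rng = np.random.default_rng(seed)
    n = len(y)
    deltas = []
    while len(deltas) < n_boot:
        idx = rng.integers(0, n, size=n)          # resample firms with replacement
        if y[idx].sum() == 0:                     # metric undefined without positives
            continue
        deltas.append(metric(y[idx], s_a[idx]) - metric(y[idx], s_b[idx]))
    deltas = np.array(deltas)
    observed = metric(y, s_a) - metric(y, s_b)
    ci = np.percentile(deltas, [2.5, 97.5])       # percentile 95% interval
    p = min(1.0, 2.0 * min((deltas <= 0).mean(), (deltas >= 0).mean()))
    return observed, ci, p


# Synthetic test set: model A carries a strong signal, model B a weak one.
rng = np.random.default_rng(3)
y = (rng.uniform(size=300) < 0.1).astype(int)
s_a = 2.0 * y + rng.normal(size=300)
s_b = 0.2 * y + rng.normal(size=300)

delta, ci, p_value = paired_bootstrap(y, s_a, s_b, average_precision)
```

Resampling the same firms for both models preserves the pairing, so the interval reflects the variability of the difference rather than of each metric separately.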

4.5. False Negative Analysis Under Monitoring Budgets

While Recall@Top-K summarizes aggregate warning effectiveness, it does not reveal which distressed firms remain undetected under limited monitoring budgets. To better understand systematic blind spots shared across models, we analyzed false negatives at the firm level. Table 6 shows false negatives under a Top-10% monitoring budget.

Across all the models, a subset of 13 distressed firms was consistently missed, indicating structurally hard cases that were not captured by the available financial features and that effectively defined an empirical upper bound on achievable recall. Beyond this shared set, Random Forest missed only three additional cases, while the proposed hierarchical FCM missed seven, substantially fewer than the linear baseline. Despite relying on weak supervision and associative learning rather than end-to-end predictive optimization, the FCM did not introduce additional blind spots and remained competitive in terms of missed distressed firms, while providing a structured and interpretable concept-level representation.
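The overlap analysis described above reduces to set operations on the firms each model fails to rank within the budget. The firm identifiers, scores, and model names below are hypothetical and only illustrate the bookkeeping.

```python
import math


def missed_distressed(scores, distressed, k_pct):
    """Distressed firms that fall outside the top k_pct percent of one model's ranking.

    scores: dict mapping firm id to risk score; distressed: set of firm ids.
    """
    n_top = math.ceil(len(scores) * k_pct / 100.0)     # assumed rounding rule
    ranked = sorted(scores, key=scores.get, reverse=True)
    return distressed - set(ranked[:n_top])


# Hypothetical example: ten firms, three distressed, two models, Top-20% budget.
distressed = {"F0", "F3", "F7"}
model_a = {"F0": 0.90, "F1": 0.80, "F2": 0.10, "F3": 0.70, "F4": 0.20,
           "F5": 0.30, "F6": 0.05, "F7": 0.15, "F8": 0.40, "F9": 0.60}
model_b = {"F0": 0.50, "F1": 0.20, "F2": 0.30, "F3": 0.90, "F4": 0.10,
           "F5": 0.80, "F6": 0.05, "F7": 0.15, "F8": 0.40, "F9": 0.60}

missed_a = missed_distressed(model_a, distressed, 20)
missed_b = missed_distressed(model_b, distressed, 20)

shared = missed_a & missed_b      # missed by every model
only_a = missed_a - shared        # additional misses specific to model A
only_b = missed_b - shared        # additional misses specific to model B
```

The size of `shared` gives the number of distressed firms that no model captures under the budget, and the model-specific sets correspond to the additional misses reported per model.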

4.6. Concept-Level Interpretability Under Monitoring Constraints

Global concept importance.

Figure 5 illustrates some of the strongest positive and negative aggregated concept contributions to the bankruptcy risk node, as determined by the learned concept-to-risk weights of the hierarchical FCM. The concept importance profile shows that risk propagation was driven by a small subset of concepts with substantial absolute influence. The majority of concepts contributed marginally, with positive weights corresponding to risk-amplifying effects and negative weights corresponding to risk-mitigating ones. Such sparsity shows the effectiveness of concept induction in reducing dimensionality and enhances interpretability by allowing risk drivers to be traced back to financially reasonable groups of financial indicators rather than individual ratios.

We focused on two true-positive firms within the monitored set (Top-5%) to illustrate decision support under a fixed inspection budget.

This provides an auditable justification: supervisors can see which concepts drive the alert and whether propagation plays a marginal (clear case) or supportive (borderline case) role.

Firm-level interpretation under monitoring budgets.

Table 7 shows the behavior of the hierarchical FCM in two qualitatively different supervisory scenarios: a high-risk firm and a borderline (but risky) firm. In the high-risk case, multiple concepts exhibited saturated activations already at the aggregation stage, which led to a risk assessment dominated by a small number of primary drivers. Propagation-induced amplification was minimal, reflecting the fact that the warning signal was strong and unambiguous. In contrast, the borderline firm displayed a more heterogeneous activation profile, with several moderately activated concepts and fewer saturated signals. In this setting, interaction-driven propagation played a larger role because it reinforced secondary concepts through co-movement effects and thereby refined the firm’s final position relative to the monitoring cut-off. From a policy-making perspective, this behavior is desirable. The model escalates clear cases without relying on interaction effects, while employing controlled propagation to aggregate multiple weak signals in ambiguous cases. This allows supervisors to prioritize firms for inspection in a transparent and consistent manner given limited monitoring capacity.

4.7. Validation on the Polish Bankruptcy Dataset

To assess the robustness and the generalization capacity of the proposed FCM, we evaluated the model on the Polish bankruptcy dataset. The Polish dataset follows a different accounting regime and exhibits an explicit early-warning structure, where financial ratios are observed one year prior to bankruptcy (5th Year subset). This setting provides a challenging benchmark for monitoring-oriented risk assessment.

The test set contained 1773 firms, of which 123 were bankrupt (a base rate of $6.94 \%$), resulting in a highly imbalanced classification problem. Consistent with the main experiments, the firms were ranked by predicted risk and evaluated under fixed monitoring budgets. In addition to Recall@Top-K, we report the number of false negatives (FNs), corresponding to distressed firms that remained undetected within the monitored subset.

Table 8 shows the Top-K performance for $K \in \{5 \%, 10 \%, 15 \%\}$. As expected, Random Forest achieved the strongest overall ranking performance, while Logistic Regression provided a competitive linear benchmark operating directly on the raw financial ratios. The proposed hierarchical FCM exhibited performance consistently comparable to that of Logistic Regression across all the monitoring budgets. Lift values were computed according to Equation (13), with the base rate $π$ defined over the corresponding test set.

At a $10 \%$ monitoring budget, the hierarchical FCM captured 66 out of 123 bankrupt firms (Recall@10% = $0.537$), closely matching Logistic Regression (67 bankruptcies) and attaining over $75 \%$ of the maximum achievable recall under this budget. Importantly, this performance was obtained without dataset-specific feature selection or representation learning, supporting the portability and interpretability of the proposed aggregation-based framework.

Attainable recall and intrinsic detection limits.

To contextualize these results, we analyzed the overlap of false negatives across the models. As Table 9 shows, at a $10 \%$ monitoring budget, 36 out of 123 bankrupt firms were not ranked within the Top-K list of any model. These common false negatives were difficult bankruptcy cases that no model identified, and they define an empirical upper bound on achievable recall under Top-$10 \%$. Consequently, the maximum attainable Recall@$10 \%$ was $70.7 \%$, indicating that the hierarchical FCM captured approximately $76 \%$ of this ceiling despite operating under structural constraints and weak supervision. This analysis highlights that a non-negligible subset of bankruptcies remains fundamentally hard to detect using standard financial ratios alone.
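These figures follow directly from the counts reported for the Polish test set (123 bankrupt firms, 36 missed by every model, 66 captured by the hierarchical FCM), as the short worked calculation below shows.

```latex
\text{Recall}^{\max}@10\% = \frac{123 - 36}{123} = \frac{87}{123} \approx 0.707,
\qquad
\text{Recall}^{\text{FCM}}@10\% = \frac{66}{123} \approx 0.537,
\qquad
\frac{\text{Recall}^{\text{FCM}}@10\%}{\text{Recall}^{\max}@10\%} = \frac{66}{87} \approx 0.759 .
```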

4.8. Policy Implications

Bankruptcy risk assessment serves dual purposes, prediction and ongoing supervision, within constrained resource environments. Regulatory institutions and financial organizations lack the capacity to examine all firm entities comprehensively, requiring prioritization frameworks that flag high-risk firms for intensive review. Within this framework, ranking-based assessments conducted under fixed monitoring capacity constraints offer greater policy utility than classification approaches which are based on predictive accuracy only.

The hierarchical FCM facilitates monitoring-focused decision frameworks by generating risk rankings through intermediate concepts that carry financial meaning. The aggregation of raw financial indicators into higher-order dimensions enables decision-makers to connect high-risk signals to comprehensible financial factors. Such transparency serves both internal risk evaluation processes and external validation of regulatory interventions to auditors and supervised organizations.

A significant policy implication of this work is that, even when diverse predictive models are combined, a large number of bankrupt firms evade detection under fixed monitoring capacity. Within the Polish dataset, approximately 30% of bankruptcy cases remain unidentified by all models when examining only the top 10% of ranked entities. This finding establishes an upper bound on the recall achievable through conventional financial indicators alone and emphasizes the necessity for supplementary qualitative evaluation mechanisms.

Furthermore, the observed complementarity across models suggests that effective supervisory practice should not be considered as a competition between predictive algorithms. Instead, policy-relevant risk monitoring can benefit from the parallel use of diverse risk signals that capture different aspects of financial distress. Particularly, fuzzy cognitive models offer a valuable baseline that complements data-driven approaches without requiring complex ensemble deployment or opaque optimization procedures.

The proposed hierarchical FCM is consistent with these objectives and supports early-warning and monitoring-oriented decision making. The aggregation of high-dimensional financial ratios into financially meaningful concepts allows for risk assessment to be traced back to the underlying financial dimensions, such as liquidity pressure, leverage stress or profitability deterioration. It also promotes transparent internal reporting and model reviewing, which are central to supervisory processes under Basel Pillar 2 and the Supervisory Review and Evaluation Process (SREP), where institutions are expected to justify risk assessments not only in terms of model outputs but also with respect to their underlying economic rationale (Basel Committee on Banking Supervision, 2015).

Essentially, these findings underline that policy-driven bankruptcy risk evaluation should emphasize model interpretability, consistency across diverse accounting frameworks, and recognition of inherent detection boundaries rather than incremental improvements in aggregate discrimination measures. The proposed FCM establishes a transparent and transferable architecture for evidence-based decision-making within resource-limited monitoring contexts.

5. Discussion

In this section we discuss the empirical findings regarding the research questions posed in Section 2.6, emphasizing financial interpretation, warning implications, and the role of concept-based aggregation.

RQ1: Can hierarchical FCMs be employed under high-dimensional and correlated financial inputs?

The proposed hierarchical construction enables the use of FCMs in settings characterized by high-dimensional and strongly correlated financial ratios by separating feature-level information from concept-level reasoning. By aggregating raw ratios into a set of financial concepts, the model avoids the instability and complexity associated with feature-level FCMs (Hajek & Prochazka, 2018; Haritha et al., 2022) while preserving dynamic propagation and interpretability. The resulting hierarchical FCM exhibited robust warning performance on highly imbalanced bankruptcy data, as shown in Table 1 and Table 4 and in Figure 3 and Figure 4. On the Taiwanese test set, the FCM outperformed Logistic Regression at every monitoring budget and matched Random Forest at the strictest one (Recall@Top-5% of 0.5455 for both; Table 4). This behavior suggests that interaction-driven propagation among aggregated financial concepts effectively concentrates distress risk among the highest-ranked firms, which is the primary objective of warning systems rather than threshold-based classification (Altman, 1968; Grice & Dugan, 2003; Ohlson, 1980).

RQ2: Can we aggregate financial ratios in an unsupervised manner into data-induced concepts, and thereby improve risk monitoring, robustness and interpretability?

The unsupervised concept induction procedure yielded clusters of financially reasonable ratios, as shown in Appendix A and visually supported by the correlation heatmap in Figure 2. These results confirm that aggregated concepts are not arbitrary combinations of the raw ratios but can capture shared financial drivers. Although aggregation reduces dimensionality, the findings reveal that much of the warning signal is preserved, and in some cases enhanced, when risk is propagated at the concept level. This supports the claim that concept-based aggregation acts as a form of structural regularization (Nguyen et al., 2023; Zeng & Xu, 2023) and can retain economically meaningful information.
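To make the idea of grouping correlated ratios into concepts concrete, the following is a minimal sketch and not the clustering procedure used in the paper. It swaps in a simple stand-in: single-linkage grouping at a fixed absolute-correlation threshold, implemented with union-find. The function names, the threshold of 0.8, and the toy ratio values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def induce_concepts(columns, threshold=0.8):
    """Group ratios whose absolute correlation reaches `threshold`.

    `columns` maps a ratio name to its list of values. This is single-linkage
    grouping at a fixed cut (an assumed stand-in for the paper's clustering).
    """
    names = list(columns)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return list(groups.values())


# Hypothetical toy data: two leverage-type ratios and one liquidity ratio.
ratios = {
    "debt_ratio": [0.2, 0.4, 0.6, 0.8],
    "liability_to_equity": [0.25, 0.5, 1.5, 4.0],
    "quick_ratio": [1.0, 3.0, 2.0, 2.5],
}
concepts = induce_concepts(ratios, threshold=0.8)
```

On this toy input the two leverage-type ratios fall into one group and the liquidity ratio stays on its own, which mirrors the mix of multi-ratio and singleton concepts in Table A1. A fixed threshold is only one possible design choice; it does not let the analyst set the number of concepts m directly.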

RQ3: To what extent can propagation concentrate risk signals among top-ranked firms beyond linear scoring?

The results indicate that interaction-driven propagation among aggregated financial concepts helps concentrate distress risk among the highest-ranked firms, especially given strict monitoring budgets. This behavior supports warning objectives beyond what can be achieved through direct linear scoring of individual ratios (Altman, 1968; Fejér-Király, 2015), and interpretability arises directly from the model structure.
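The propagation step discussed here can be illustrated with a generic FCM update. The sketch below is an assumption-laden example rather than the paper's update rule, which is defined in Section 3.2 and may differ: it uses a sigmoid squashing function and a hypothetical damping factor `alpha`, and it reports the propagation-induced change Δ = c^{(T)} − c^{(0)} in the sense of Table 7. The coupling matrix `W` and the activations are invented toy values.

```python
from math import exp


def sigmoid(x):
    """Logistic squashing function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + exp(-x))


def propagate(c0, W, steps=3, alpha=0.1):
    """Run `steps` damped FCM updates and return (c_T, delta).

    c0    : initial concept activations in [0, 1]
    W     : coupling matrix, W[i][j] = influence of concept j on concept i
    alpha : assumed damping factor; alpha = 0 leaves activations unchanged
    """
    c = list(c0)
    for _ in range(steps):
        # Influence received by each concept from all concepts.
        drive = [sigmoid(sum(w * c_j for w, c_j in zip(row, c))) for row in W]
        # Convex combination keeps every activation inside [0, 1].
        c = [(1.0 - alpha) * c_i + alpha * d_i for c_i, d_i in zip(c, drive)]
    delta = [c_t - c_init for c_t, c_init in zip(c, c0)]
    return c, delta


# Hypothetical three-concept example.
c0 = [0.9, 0.2, 0.5]
W = [
    [0.0, 0.0, 0.0],
    [1.5, 0.0, 0.0],   # concept 0 pushes concept 1 upward
    [0.0, -1.0, 0.0],  # concept 1 dampens concept 2
]
c_T, delta = propagate(c0, W, steps=3, alpha=0.1)
```

With a small damping factor the activations move only slightly per step, which is one way to obtain the kind of small Δ values reported in Table 7. Whether the paper achieves controlled propagation this way is not stated in this section.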

6. Conclusions

Financial and practical implications.

This paper proposes a novel framework that supports risk-based monitoring and early intervention under limited inspection capacity. It enables analysts to prioritize a small subset of firms for closer review while maintaining high coverage of distressed cases. On the Taiwanese test set, the hierarchical FCM identifies more than half of bankrupt firms at a Top-5% monitoring budget (Recall@Top-5% of 0.5455; Table 4), relying on a compact and interpretable representation of financial conditions. Such behavior allows supervisory authorities, credit analysts and managers to operate under resource constraints. Furthermore, the interpretable nature of FCMs helps professionals examine the reasons behind the predicted risk of a given firm, making traceability and transparency practical tools.

Why aggregated concepts rather than flat or random representations?

Compared to flat feature-level FCMs, concept-level aggregation emphasizes the common financial structure among correlated ratios and reduces sensitivity to measurement noise. Random aggregation would fail to preserve this structure, resulting in concept representations that lack the economic coherence that underpins interpretability. Our results suggest that aggregating ratios into financially meaningful concepts provides a favorable trade-off between dimensionality reduction, interpretability and warning performance.

Limitations and future research.

The hierarchical FCM demonstrates strong warning capabilities, but it relies on unsupervised clustering to induce financial concepts, so the resulting concepts are data-driven rather than expert-defined. Future research may explore hybrid approaches that incorporate expert knowledge into concept formation or investigate extensions that model the evolution of concept interactions over time. Furthermore, supervised FCMs trained with optimization methods could raise predictive accuracy toward that of black-box models, but at the expense of interpretability and transparency.

Author Contributions

Conceptualization, G.A.K., N.A.K., G.T. and M.R.; methodology, G.A.K. and N.A.K.; software, G.A.K. and K.L.; validation, G.A.K., N.A.K., M.R. and K.L.; formal analysis, G.A.K.; investigation, G.A.K. and N.A.K.; resources, N.A.K.; data curation, G.A.K.; writing—original draft preparation, G.A.K. and N.A.K.; writing—review and editing, G.A.K., N.A.K. and G.T.; visualization, G.A.K. and K.L.; supervision, G.T. and M.R.; project administration, G.T. and M.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Taiwanese Bankruptcy Prediction [Dataset]. (2020). UCI Machine Learning Repository. https://doi.org/10.24432/C5004D.
Tomczak, S. (2016). Polish Companies Bankruptcy [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5F600.

Acknowledgments

The publication fees of this manuscript have been financed by the Research Council of the University of Patras.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

FCM	Fuzzy Cognitive Maps
LR	Logistic Regression
RF	Random Forest

Appendix A. Full Concept–Ratio Mapping

Table A1 reports the exact mapping between aggregated concepts and raw financial ratios as induced by the clustering procedure for m = 40 concepts. This mapping was automatically generated from the experimental run and ensures the full reproducibility of the reported results.

Table A1. Concept-to-ratio membership obtained via unsupervised clustering.

Concept	No. of Attributes	Financial Ratios
C01	1	Interest Expense Ratio
C02	1	Interest Coverage Ratio (interest expense to EBIT)
C03	1	Degree of Financial Leverage (DFL)
C04	7	Operating Profit Rate; Pre-tax Net Interest Rate; After-tax Net Interest Rate; Non-industry Income and Expenditure/Revenue; Continuous Interest Rate (After Tax); Working Capital Turnover Rate; Cash Flow to Sales
C05	1	Revenue Per Share
C06	2	Contingent Liabilities/Net Worth; Accounts Receivable Turnover
C07	1	Total Assets to GNP Price
C08	4	Operating Profit Growth Rate; After-tax Net Profit Growth Rate; Regular Net Profit Growth Rate; Total Asset Return Growth Rate Ratio
C09	2	Continuous Net Profit Growth Rate; Inventory/Working Capital
C10	1	Quick Ratio
C11	1	Cash/Current Liability
C12	1	Realized Sales Gross Profit Growth Rate
C13	1	Total Asset Growth Rate
C14	1	Research and Development Expense Rate
C15	1	Cash Turnover Rate
C16	4	Net Value Growth Rate; Quick Assets/Current Liability; Working Capital/Equity
C17	2	Current Liabilities/Liability; Current Liability to Liability
C18	5	Working Capital to Total Assets; Quick Assets/Total Assets; Current Assets/Total Assets; Cash/Total Assets; Current Liability to Current Assets
C19	1	Fixed Assets Turnover Frequency
C20	7	Borrowing Dependency; Inventory and Accounts Receivable/Net Value; Current Liabilities/Equity; Current Liability to Equity; Equity to Long-term Liability; Net Income to Stockholder’s Equity; Liability to Equity
C21	2	Total Asset Turnover; Net Worth Turnover Rate (times)
C22	5	Current Ratio; Debt Ratio %; Net Worth/Assets; Current Liability to Assets; Equity to Liability
C23	3	Cash Flow to Total Assets; Cash Flow to Liability; Cash Flow to Equity
C24	2	Revenue per Person; Operating Profit per Person
C25	3	Operating Gross Margin; Realized Sales Gross Margin; Gross Profit to Sales
C26	5	Cash Flow Rate; Cash Flow per Share; Cash Reinvestment %; Operating Funds to Liability; CFO to Assets
C27	14	ROA(C) before Interest and Depreciation; ROA(A) before Interest and After Tax; ROA(B) before Interest and Depreciation after Tax; Net Value per Share (B); Net Value per Share (A); Net Value per Share (C); Persistent EPS in the Last Four Seasons; Operating Profit per Share; Net Profit before Tax per Share; Operating Profit/Paid-in Capital; Net Profit before Tax/Paid-in Capital; Retained Earnings to Total Assets; Total Expense/Assets; Net Income to Total Assets
C28	1	Tax Rate (A)
C29	2	Current Asset Turnover Rate; Quick Asset Turnover Rate
C30	1	Operating Expense Rate
C31	1	Inventory Turnover Rate (times)
C32	1	Total Debt/Total Net Worth
C33	1	Interest-bearing Debt Interest Rate
C34	1	Inventory/Current Liability
C35	2	Average Collection Days; Total Income/Total Expense
C36	2	Long-term Fund Suitability Ratio (A); Fixed Assets to Assets
C37	1	Allocation Rate per Person
C38	1	No-credit Interval
C39	1	Long-term Liability to Current Assets
C40	1	Net Income Flag

References

Agarwal, V., & Taffler, R. (2008). Comparing the performance of market-based and accounting-based bankruptcy prediction models. Journal of Banking & Finance, 32(8), 1541–1551.
Akil, M., Perera, W. T. N. M., & Wijekoon, W. M. H. N. (2024). The use of financial ratios in predicting financial distress of listed entities in Sri Lanka. International Journal of Accounting and Business Finance, 10(2), 148–171.
Altman, E. I. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. The Journal of Finance, 23(4), 589–609.
Altman, E. I. (2018). A fifty-year retrospective on credit risk models, the Altman Z-score family of models and their applications to financial markets and managerial strategies. Journal of Credit Risk, 14(4), 1–34.
Altman, E. I., Haldeman, R. G., & Narayanan, P. (1977). ZETA™ analysis: A new model to identify bankruptcy risk of corporations. Journal of Banking & Finance, 1(1), 29–54.
Altman, E. I., Iwanicz-Drozdowska, M., Laitinen, E. K., & Suvas, A. (2017). Financial distress prediction in an international context: A review and empirical analysis of Altman’s Z-score model. Journal of International Financial Management & Accounting, 28(2), 131–171.
Altman, E. I., & Sabato, G. (2007). Modelling credit risk for SMEs: Evidence from the US market. Abacus, 43(3), 332–357.
Alzayed, N., Eskandari, R., & Yazdifar, H. (2023). Bank failure prediction: Corporate governance and financial indicators. Review of Quantitative Finance and Accounting, 61, 601–631.
Azadeh, A., Ziaei, B., & Moghaddam, M. (2012). A hybrid fuzzy regression–fuzzy cognitive map algorithm for forecasting and optimization of housing market fluctuations. Expert Systems with Applications, 39(1), 298–315.
Bakhtavar, E., Valipour, M., Yousefi, S., Sadiq, R., & Hewage, K. (2021). Fuzzy cognitive maps in systems risk analysis: A comprehensive review. Complex & Intelligent Systems, 7, 621–637.
Basel Committee on Banking Supervision. (2015). Guidelines for identifying and dealing with weak banks. Bank for International Settlements. Available online: http://www.bis.org/bcbs/publ/d330.htm (accessed on 3 February 2026).
Bauer, J., & Agarwal, V. (2014). Are hazard models superior to traditional bankruptcy prediction approaches? A comprehensive test. Journal of Banking & Finance, 40, 432–442.
Beaver, W. H. (1966). Financial ratios as predictors of failure. Journal of Accounting Research, 4, 71–111.
Beaver, W. H., Correia, M., & McNichols, M. F. (2011). Financial statement analysis and the prediction of financial distress. Foundations and Trends in Accounting, 5(2), 99–173.
Chen, Y., Weston, J. F., & Altman, E. I. (1995). Financial distress and restructuring models. Financial Management, 24(2), 57–75.
Davis, J., & Goadrich, M. (2006, June 25–29). The relationship between Precision-Recall and ROC curves. 23rd International Conference on Machine Learning (pp. 233–240), Pittsburgh, PA, USA.
DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics, 44(3), 837–845.
Duan, M. (2023). Analysis of corporate financial risk avoidance strategies based on Logistic Regression model. Applied Mathematics and Nonlinear Sciences, 9.
Fejér-Király, G. (2015). Bankruptcy prediction: A survey on evolution, critiques, and solutions. Acta Universitatis Sapientiae: Economics and Business, 3(1), 93–108.
Glykas, M., & Xirogiannis, G. (2005). A soft knowledge modeling approach for geographically dispersed financial organizations. Soft Computing, 9(8), 579–593.
Grice, J. S., & Dugan, M. T. (2003). Re-estimations of the Zmijewski and Ohlson bankruptcy prediction models. Advances in Accounting, 20, 77–93.
Guo, B., & Xie, M. (2025). Financial crisis early warning model combined with penalty Logistic Regression model. Journal of Computational Methods in Sciences and Engineering, 1–18.
Hajek, P., & Prochazka, O. (2018). Interval-valued fuzzy cognitive maps with genetic learning for predicting corporate financial distress. Filomat, 32(5), 1657–1662.
Haritha, K., Judy, M. V., Papageorgiou, K., Georgiannis, V. C., & Papageorgiou, E. (2022). Distributed fuzzy cognitive maps for feature selection in big data classification. Algorithms, 15(10), 383.
Hendricks, L. (2004). An assessment and comparison of bankruptcy prediction models in forecasting the financial distress of JSE-listed companies over a twenty-year period (2000 to 2020) [Master’s thesis, University of Cape Town]. Available online: http://hdl.handle.net/11427/39519 (accessed on 7 January 2026).
Jalilian, N., Zanjirchi, S. M., & Goh, M. (2019). Interactive scenario analysis of banking credit risks in intuitive fuzzy space. Journal of Modelling in Management, 15(1), 257–275.
Korol, T. (2019). Dynamic bankruptcy prediction models for European enterprises. Journal of Risk and Financial Management, 12(4), 185.
Kosko, B. (1986). Fuzzy cognitive maps. International Journal of Man–Machine Studies, 24(1), 65–75.
Li, M. Y. L., & Miu, P. (2009). A hybrid bankruptcy prediction model with dynamic loadings on accounting-ratio-based and market-based information: A binary quantile regression approach. Journal of Empirical Finance, 17(4), 818–833.
Liang, D., Lu, C. C., Tsai, C. F., & Shih, G. A. (2016). Financial ratios and corporate governance indicators in bankruptcy prediction: A comprehensive study. European Journal of Operational Research, 252(2), 561–572.
Marsenne, M., Ismail, T., Taqi, M., & Hanifah, I. A. (2024). Financial distress predictions with Altman, Springate, Zmijewski, Taffler and Grover models. Decision Science Letters, 13(1), 181–190.
Medina, S., & Moreno, J. (2007). Risk evaluation in Colombian electricity market using fuzzy logic. Energy Economics, 29(5), 999–1009.
Mezei, J., & Sarlin, P. (2016). Aggregating expert knowledge for the measurement of systemic risk. Decision Support Systems, 88, 38–50.
Migkos, S. P., Sakas, D. P., Giannakopoulos, N. T., Konteos, G., & Metsiou, A. (2022). Analyzing Greece 2010 memorandum’s impact on macroeconomic and financial figures through FCM. Economies, 10(8), 178.
Muvingi, J., Nkomo, D. J., Mazuruse, P., & Mapungwana, P. (2015). Default prediction models: A comparison between market-based models and accounting-based models. Journal of Finance and Investment Analysis, 4(1), 39–65.
Nguyen, M., Nguyen, B. D., & Lieu, M. L. (2023). Corporate financial distress prediction in a transition economy. Journal of Forecasting, 43(8), 3128–3160.
Ohlson, J. A. (1980). Financial ratios and the probabilistic prediction of bankruptcy. Journal of Accounting Research, 18(1), 109–131.
Papageorgiou, E. I. (2012). Learning algorithms for fuzzy cognitive maps—A review study. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 42(2), 150–163.
Papageorgiou, E. I., & Salmeron, J. L. (2013). A review of fuzzy cognitive maps research during the last decade. IEEE Transactions on Fuzzy Systems, 21(1), 66–79.
Powell, R., Dinh, D. V., Vu, N. T., & Vo, D. H. (2023). Accounting-based variables as an early warning indicator of financial distress in crisis and non-crisis periods. International Journal of Finance and Economics, 29(4), 4105–4124.
Qian, H., Wang, B., Yuan, M., Gao, S., & Song, Y. (2022). Financial distress prediction using a corrected feature selection measure and gradient boosted decision tree. Expert Systems with Applications, 190, 116202.
Rezaee, M. J., Yousefi, S., & Babaei, M. (2017). Multi-stage cognitive map for failures assessment of production processes: An extension in structure and algorithm. Neurocomputing, 232, 69–82.
Rivanda, A. K., Afgani, K. F., Purbayati, R., & Marzuki, M. M. (2023). The effect of liquidity, leverage, operating capacity, profitability, and sales growth as predictors of financial distress: (Property, real estate, and construction services companies listed on the IDX). Journal Integration of Management Studies, 1(1), 13–21.
Saito, T., & Rehmsmeier, M. (2015). The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE, 10(3), e0118432.
Senbet, L. W., & Wang, T. Y. (2012). Corporate financial distress and bankruptcy: A survey. Foundations and Trends in Finance, 5(4), 243–335.
Shcherbak, V., Dorokhov, O., Dorokhova, L., Vzhytynska, K., Yatsenko, V., & Yermolenko, O. (2026). Financial risk management and resilience of small enterprises amid the wartime crisis. Journal of Risk and Financial Management, 19(1), 37.
Trostianska, K., & Semencha, I. (2020). Reputational risk management in conditions of credibility gap in the banking system. Journal of Financial Economic Policy, 12(3), 327–343.
Zeng, X., & Xu, J. (2023). Research on financial early warning of listed companies based on factor analysis and logistic regression. Advances in Economics and Management Research, 4(1), 52.
Zhang, Z., Wu, C., Qu, S., & Chen, X. (2022). An explainable artificial intelligence approach for financial distress prediction. Information Processing & Management, 59(4), 102988.
Zieba, M., Tomczak, S. K., & Tomczak, J. M. (2016). Ensemble boosted trees with synthetic features generation in application to bankruptcy prediction. Expert Systems with Applications, 58, 93–101.
Ziolo, M., Filipiak, B. Z., Bąk, I., & Cheba, K. (2019). How to design more sustainable financial systems: The roles of environmental, social, and governance factors in the decision-making process. Sustainability, 11(20), 5604.

Figure 1. Overview of the proposed pipeline for hierarchical FCM-based financial risk assessment. Concept induction and FCM dynamics are data-driven; supervision is restricted to a sparse residual readout used to correct aggregation-induced information loss.

Figure 2. Absolute Pearson correlation matrix of financial ratios, reordered by cluster-induced concept membership. The structure shows strong within-concept and weaker cross-concept correlation.

Figure 3. Precision–recall curves on the test set for the proposed FCM and baseline models. The dashed line indicates the empirical base rate of the positive class.

Figure 4. Recall@Top-K performance (warning evaluation) on the test set. Higher values indicate that a larger fraction of distressed firms was captured among the top-ranked K% of firms.

Figure 5. Top-10 concept-level contributions to the bankruptcy risk node, as determined by the learned concept-to-risk weights of the hierarchical FCM. Positive values indicate risk-amplifying effects, while negative values correspond to mitigating influences.

Table 1. Overall discriminative performance on the test set.

Model	ROC–AUC	PR–AUC	Brier Score
Logistic Regression	0.88	0.27	0.089
Random Forest	0.95	0.43	0.023
Hierarchical FCM (m = 40)	0.91	0.39	0.042
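As a reading aid for the Brier Score column of Table 1, the sketch below computes the standard Brier score, that is, the mean squared difference between predicted probabilities and binary outcomes. The function name and the toy inputs are illustrative assumptions and are not taken from the paper's code.

```python
def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


# Hypothetical toy forecasts for four firms, one of which went bankrupt.
toy_probs = [0.10, 0.05, 0.70, 0.20]
toy_labels = [0, 0, 1, 0]
toy_score = brier_score(toy_probs, toy_labels)

# A constant forecast equal to a base rate pi scores pi * (1 - pi).
base_rate = 0.0323
constant_forecast_score = base_rate * (1.0 - base_rate)
```

Lower values are better. For orientation only, a constant forecast equal to the 3.23% base rate reported with Table 4 would score about 0.031, assuming the test set has exactly that share of distressed firms.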

Table 2. Sensitivity analysis w.r.t. concept granularity (m) and propagation depth (T) on the validation set. Results are reported for λ = 0.1.

m	T	ROC–AUC	PR–AUC
40	1	0.905	0.361
40	3	0.902	0.349
40	5	0.901	0.348
32	1	0.901	0.342
32	3	0.897	0.336
32	5	0.896	0.335

Table 3. Sensitivity analysis with respect to the residual supervision strength λ on the test set.

Model	ROC–AUC	PR–AUC	Brier Score
Logistic Regression	0.883	0.275	0.089
Hierarchical FCM ( $λ = 0$ )	0.897	0.325	0.046

Table 4. Warning performance on the test set measured by Recall@Top-K (Equation (12)). The base rate of distressed firms was 3.23%.

Model	Top-5%	Top-10%	Top-15%
Logistic Regression	0.5000	0.6515	0.7424
Random Forest	0.5455	0.7576	0.8788
Hierarchical FCM (m = 40)	0.5455	0.6970	0.7727

Table 5. Paired statistical significance tests on the test set. Differences are reported as FCM minus baseline. ROC–AUC differences were evaluated using DeLong’s test. PR–AUC and Recall@Top-K differences were evaluated using a paired nonparametric bootstrap with 2000 resamples.

Metric	Comparison	$Δ$	p-Value
ROC–AUC	FCM vs. Logistic	$+ 0.030$	$0.825$
ROC–AUC	FCM vs. RF	$- 0.035$	$0.815$
PR–AUC	FCM vs. Logistic	$+ 0.115$	$0.042$
PR–AUC	FCM vs. RF	$- 0.036$	$0.571$
Recall@Top-5%	FCM vs. Logistic	$+ 0.061$	$0.483$
Recall@Top-5%	FCM vs. RF	$+ 0.000$	$0.841$
Recall@Top-10%	FCM vs. Logistic	$+ 0.046$	$0.490$
Recall@Top-10%	FCM vs. RF	$- 0.061$	$0.222$
Recall@Top-15%	FCM vs. Logistic	$+ 0.030$	$0.697$
Recall@Top-15%	FCM vs. RF	$- 0.106$	$0.059$
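The paired nonparametric bootstrap mentioned in the caption of Table 5 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the two-sided p-value convention, the redraw of resamples without positives, the rounding of K, and all names and toy scores are assumptions. The same resampled firms are scored by both models in each replicate, which is what makes the comparison paired.

```python
import random


def recall_at_top(scores, labels, frac):
    """Share of positives found among the top `frac` of firms ranked by score."""
    k = max(1, round(frac * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in ranked[:k])
    return hits / sum(labels)


def paired_bootstrap(scores_a, scores_b, labels, frac, n_resamples=2000, seed=0):
    """Return (observed difference A - B, two-sided bootstrap p-value).

    Requires at least one positive label. The p-value is
    2 * min(P(diff <= 0), P(diff >= 0)), capped at 1 (an assumed convention).
    """
    rng = random.Random(seed)
    n = len(labels)
    observed = recall_at_top(scores_a, labels, frac) - recall_at_top(scores_b, labels, frac)
    diffs = []
    while len(diffs) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # same firms for both models
        y = [labels[i] for i in idx]
        if sum(y) == 0:
            continue  # recall is undefined without positives, so redraw
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(recall_at_top(a, y, frac) - recall_at_top(b, y, frac))
    below = sum(d <= 0 for d in diffs) / n_resamples
    above = sum(d >= 0 for d in diffs) / n_resamples
    return observed, min(1.0, 2.0 * min(below, above))


# Hypothetical toy scores for ten firms, three of which are distressed.
labels = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
scores_fcm = [0.9, 0.1, 0.2, 0.8, 0.3, 0.1, 0.2, 0.4, 0.1, 0.3]
scores_lr = [0.7, 0.2, 0.6, 0.3, 0.1, 0.2, 0.5, 0.4, 0.1, 0.3]
observed, p_value = paired_bootstrap(scores_fcm, scores_lr, labels, frac=0.3)
```

With few positives in the test set, each resample contains a small and variable number of distressed firms, so the bootstrap distribution of the difference is wide. This is consistent with the large p-values in Table 5 despite visible gaps in the point estimates.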

Table 6. False negatives (FNs) under a Top-10% monitoring budget. The table reports the total number of missed distressed firms (FNs), the subset of FNs shared across all models (structurally hard cases), and the additional FNs specific to each model.

Model	FNs (Total)	FNs (Common)	FNs (Additional)	Recall@10%
Logistic Regression	23	13	10	65.2%
Random Forest	16	13	3	75.8%
Hierarchical FCM	20	13	7	69.7%

Table 7. Concept-level interpretation given limited monitoring capacity. The table depicts a high-risk distressed firm identified within the Top-5% monitoring budget (rank = 4) and a borderline-risk firm ranked close to the cut-off (rank = 100). For each case, we report initial concept activation ($c^{(0)}$), propagation-induced amplification ($Δ = c^{(T)} - c^{(0)}$), and contribution to the final risk score.

High-Risk Firm				Borderline-Risk Firm
Concept	$c^{(0)}$	$Δ$	Contr.	Concept	$c^{(0)}$	$Δ$	Contr.
C32	1.00	$- 0.0001$	0.342	C32	0.89	$+ 0.001$	0.307
C20	1.00	$- 0.0002$	0.179	C20	0.84	$+ 0.002$	0.150
C33	0.81	$+ 0.0003$	0.086	C37	1.00	$+ 0.011$	0.143
C39	0.77	$+ 0.0001$	0.085	C19	1.00	$+ 0.000$	0.098
C29	1.00	$- 0.0004$	0.065	C33	0.83	$+ 0.001$	0.089
C07	0.67	$+ 0.0031$	0.041	C39	0.76	$+ 0.018$	0.053
C30	0.52	$+ 0.0032$	0.028	C07	0.80	$+ 0.007$	0.050

Table 8. Top-K bankruptcy detection on the Polish 5th Year test set (1773 firms; 123 bankrupt, base rate 6.94%). FNs denotes false negatives (bankrupt firms not captured within the monitoring budget).

Model	K	TP	FNs	Recall@K	Precision@K	Lift
Logistic Regression	5%	40	83	0.3252	0.4494	6.48
Hierarchical FCM	5%	43	80	0.3496	0.4831	6.96
Random Forest	5%	46	77	0.3740	0.5169	7.45
Logistic Regression	10%	67	56	0.5447	0.3785	5.46
Hierarchical FCM	10%	66	57	0.5366	0.3729	5.37
Random Forest	10%	75	48	0.6098	0.4237	6.11
Logistic Regression	15%	83	40	0.6748	0.3120	4.50
Hierarchical FCM	15%	79	44	0.6423	0.2970	4.28
Random Forest	15%	92	31	0.7480	0.3459	4.99
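The columns of Table 8 are linked by simple arithmetic, which the following sketch reproduces for one row. It assumes the usual definitions (recall = TP / bankrupt firms, precision = TP / K, lift = precision / base rate) and assumes that the monitoring budget K is obtained by rounding the fraction of firms to the nearest integer, a choice that happens to match the tabulated values. The function name is hypothetical.

```python
def metrics_from_counts(tp, n_firms, n_bankrupt, frac):
    """Derive Top-K metrics from the number of bankrupt firms captured (tp)."""
    k = round(frac * n_firms)          # assumed rounding of the monitoring budget
    base_rate = n_bankrupt / n_firms
    precision = tp / k
    return {
        "k": k,
        "fn": n_bankrupt - tp,         # bankrupt firms outside the budget
        "recall": tp / n_bankrupt,
        "precision": precision,
        "lift": precision / base_rate,
    }


# Logistic Regression at the Top-5% budget, using the counts from Table 8.
row = metrics_from_counts(tp=40, n_firms=1773, n_bankrupt=123, frac=0.05)
```

For this row the sketch gives K = 89 firms, 83 false negatives, recall 0.3252, precision 0.4494 and lift 6.48, in agreement with the first line of Table 8.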

Table 9. False negatives (FNs) under a Top-10% monitoring budget on the Polish test set (123 bankrupt firms). The table reports the total number of missed distressed firms (FNs), the subset of FNs shared across all models (structurally hard cases), and the additional FNs specific to each model.

Model	FNs (Total)	FNs (Common)	FNs (Additional)	Recall@10%
Logistic Regression	56	36	20	54.5%
Random Forest	48	36	12	61.0%
Hierarchical FCM	57	36	21	53.7%
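The decomposition in Table 9 (total, common and additional false negatives) amounts to a set intersection over the firms each model misses. The sketch below shows one way to compute it; the function names, the rounding of the monitoring budget and the toy scores are assumptions made for illustration.

```python
def missed_firms(scores, labels, frac):
    """Indices of distressed firms that fall outside the top `frac` of the ranking."""
    n = len(scores)
    k = max(1, round(frac * n))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:k])
    return {i for i, y in enumerate(labels) if y == 1 and i not in flagged}


def fn_decomposition(model_scores, labels, frac):
    """Split each model's false negatives into common and model-specific parts."""
    missed = {name: missed_firms(s, labels, frac) for name, s in model_scores.items()}
    common = set.intersection(*missed.values())  # missed by every model
    return {
        name: {"total": len(m), "common": len(common), "additional": len(m - common)}
        for name, m in missed.items()
    }


# Hypothetical toy example: ten firms, the first three are distressed.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
model_scores = {
    "LR": [0.9, 0.8, 0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
    "RF": [0.9, 0.3, 0.1, 0.8, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
    "FCM": [0.2, 0.9, 0.1, 0.1, 0.8, 0.1, 0.1, 0.1, 0.1, 0.1],
}
table = fn_decomposition(model_scores, labels, frac=0.2)
```

In Table 9 the common subset contains 36 of the 123 bankrupt firms, or about 29.3%, which is the figure summarized as approximately 30% in the policy discussion.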
