3.2.1. Fuzzy Cognitive Maps for Risk Reasoning
Fuzzy Cognitive Maps (FCMs) are graph-based dynamical systems that represent complex domains through interacting
concepts and weighted interconnections. An FCM is defined by (i) a set of concept nodes, (ii) a weighted adjacency matrix encoding inter-concept influence, and (iii) a nonlinear update rule that propagates activation through the graph over discrete time steps (Kosko, 1986; Papageorgiou, 2012). This makes FCMs attractive for financial risk monitoring, where understanding interactions among financial dimensions and enabling stress testing can be as important as predictive performance.
Let $c_i(t)$ denote the activation of concept $i$ at iteration $t$, and let $\mathbf{c}(t) = \big(c_1(t), \dots, c_m(t)\big)$ be the concept state vector. Interactions are encoded in a matrix $W \in \mathbb{R}^{m \times m}$ with entries $w_{ij}$, where $w_{ij}$ denotes the influence from concept $i$ to concept $j$. Accordingly, the net input received by concept $j$ at iteration $t$ is $s_j(t) = \sum_{i=1}^{m} w_{ij}\, c_i(t)$.
Before introducing the FCM dynamics, we specify how concept activations are obtained from the observed financial ratios. Let $\mathbf{z} \in \mathbb{R}^{p}$ denote the standardized feature vector of a firm. Financial ratios are first grouped into economically meaningful concepts via correlation-based clustering. Each concept activation is computed as an aggregation of the ratios assigned to the corresponding cluster, resulting in an initial concept state $\mathbf{c}(0) = \phi(\mathbf{z})$, where $\phi : \mathbb{R}^{p} \to \mathbb{R}^{m}$ denotes the aggregation map. This mapping defines the interface between the raw feature space and the concept-level FCM representation.
Using a row-vector state $\mathbf{c}(t) \in \mathbb{R}^{1 \times m}$, the aggregation is written as $\mathbf{c}(0) = \phi(\mathbf{z})$. The FCM dynamics are implemented via an inertial update:
$$\mathbf{c}(t+1) = \lambda\, \mathbf{c}(t) + (1 - \lambda)\, f\big(\mathbf{c}(t)\, W\big),$$
where $\lambda \in [0, 1]$ controls inertia and $f(\cdot)$ is a bounded activation function (the sigmoid function is employed in our work). Learning in FCMs aims to estimate $W$ from data, expert knowledge, or hybrid schemes. In data-driven settings, associative mechanisms (e.g., Hebbian learning) are commonly employed to capture empirical co-activation patterns in an interpretable manner (Papageorgiou, 2012; Papageorgiou & Salmeron, 2013). In our setting, learning focuses exclusively on estimating the interaction matrix $W$, while the aggregation map $\phi$ remains fixed after concept construction.
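As a concrete sketch, the inertial update above can be written in a few lines of NumPy. The coupling matrix `W`, the inertia value `lam`, and the toy state are illustrative placeholders, not values from our experiments:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fcm_step(c, W, lam=0.5):
    """One inertial FCM update: c(t+1) = lam * c(t) + (1 - lam) * f(c(t) @ W)."""
    return lam * c + (1.0 - lam) * sigmoid(c @ W)

# Toy example with m = 3 concepts (row-vector state, unsigned weights).
W = np.array([[0.0, 0.4, 0.1],
              [0.3, 0.0, 0.5],
              [0.2, 0.2, 0.0]])
c = np.array([0.8, 0.1, 0.3])   # current concept activations
c_next = fcm_step(c, W)
```

Because the activation is bounded and the update is a convex combination of the previous state and the activated net input, the state remains in the unit interval at every step.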
3.2.2. Cluster-Induced Hierarchical FCM
We formalize the proposed methodology as a three-level construction:
Level 1 (Attribute Concepts): raw financial ratios $z_1, \dots, z_p$.
Level 2 (Aggregated Concepts): induced concept activations $c_1, \dots, c_m$ obtained by unsupervised clustering and aggregation of ratios.
Level 3 (Risk Index Concept): a continuous risk score $r$ read out from the evolved concept state $\mathbf{c}(T)$.
The goal is to study how interactions among induced financial dimensions (Level 2) shape a firm-level vulnerability index (Level 3), while preserving interpretability through explicit concept memberships (Level 1 → Level 2 mapping).
Given a dataset $\{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$ with $\mathbf{x}_i \in \mathbb{R}^{p}$ and binary outcome $y_i \in \{0, 1\}$, all preprocessing steps are fitted exclusively on the training split and then frozen, meaning that the resulting scaling parameters are kept fixed and applied unchanged to validation and test observations to prevent information leakage. Each ratio $j$ is robustly scaled using the median and interquartile range (IQR):
$$z_{ij} = \frac{x_{ij} - \operatorname{med}(x_{\cdot j})}{\operatorname{IQR}(x_{\cdot j})}.$$
Robust scaling mitigates the impact of the heavy-tailed distributions and extreme values commonly observed in financial ratios, ensuring stable aggregation and comparability across features without imposing parametric distributional assumptions.
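The train-then-freeze scaling step can be sketched as follows; the heavy-tailed toy data and array names are illustrative, and a guard is added for degenerate (zero-IQR) ratios, an implementation detail not spelled out in the text:

```python
import numpy as np

def fit_robust_scaler(X_train):
    """Estimate median and IQR per ratio on the training split only."""
    med = np.median(X_train, axis=0)
    q1, q3 = np.percentile(X_train, [25, 75], axis=0)
    iqr = np.where((q3 - q1) > 0, q3 - q1, 1.0)  # guard: constant ratios
    return med, iqr

def robust_scale(X, med, iqr):
    """Apply frozen parameters unchanged to any split (train/val/test)."""
    return (X - med) / iqr

rng = np.random.default_rng(0)
X_train = rng.lognormal(size=(200, 5))   # heavy-tailed toy "ratios"
med, iqr = fit_robust_scaler(X_train)
Z_train = robust_scale(X_train, med, iqr)
```

After scaling, each training ratio has median zero and unit IQR, while held-out data simply reuses the frozen `med` and `iqr`.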
We cluster ratios (features), treating each ratio $j$ as the vector $\mathbf{z}_{\cdot j} \in \mathbb{R}^{n}$ of its preprocessed values across training firms. Pairwise similarity is measured via absolute Pearson correlation $|\rho_{jk}|$, converted to a distance $d_{jk} = 1 - |\rho_{jk}|$. Hierarchical clustering with average linkage partitions the $p$ ratios into $m$ clusters $C_1, \dots, C_m$. This yields a fixed attribute-to-concept mapping matrix $A \in \{0, 1\}^{p \times m}$ defined by $A_{jk} = 1$ if $j \in C_k$ and $A_{jk} = 0$ otherwise. For each firm, initial aggregated concept activations are computed as
$$c_k(0) = \frac{1}{|C_k|} \sum_{j \in C_k} z_j, \qquad k = 1, \dots, m,$$
where $\mathbf{z}$ is the preprocessed ratio vector of the firm.
Ratios are clustered based on absolute Pearson correlation, treating each ratio as a vector of firm-level observations. This choice reflects the fact that financial ratios often differ in scale and units but convey related economic information when they co-move across firms. Correlation-based similarity therefore captures shared financial dynamics rather than magnitude effects, making it suitable for aggregating ratios into economically coherent dimensions.
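The clustering step described above can be sketched with SciPy; the toy data, the number of concepts `m`, and the induced co-movement between the first two ratios are illustrative assumptions:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
n, p, m = 300, 8, 3
Z = rng.normal(size=(n, p))
Z[:, 1] = Z[:, 0] + 0.1 * rng.normal(size=n)   # make ratios 0 and 1 co-move

corr = np.corrcoef(Z, rowvar=False)            # p x p correlation of ratios
dist = 1.0 - np.abs(corr)                      # d = 1 - |Pearson correlation|
np.fill_diagonal(dist, 0.0)
condensed = squareform(dist, checks=False)     # condensed distance vector

link = linkage(condensed, method="average")    # average-linkage hierarchy
labels = fcluster(link, t=m, criterion="maxclust")  # cluster ids in 1..m

# Fixed attribute-to-concept membership matrix A (p x m):
# A[j, k] = 1 iff ratio j belongs to cluster k + 1.
A = np.eye(m)[labels - 1]
```

Strongly co-moving ratios (here, columns 0 and 1) end up in the same cluster regardless of their scale, which is exactly the behavior that motivates correlation-based distance.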
Let $C^{(0)} \in \mathbb{R}^{n_{\mathrm{tr}} \times m}$ denote the matrix of initial concept activations on the training set, where each row corresponds to a firm and each column to an induced financial concept. We construct the inter-concept coupling matrix $W$ in two stages.
Stage 1 (Unsupervised correlation-based initialization). We initialize $W$ from the empirical co-activation structure of the training concepts by computing an unsigned correlation matrix. Specifically, each concept dimension of $C^{(0)}$ is standardized across training firms, and we set
$$W_{ij} = \big|\operatorname{corr}\big(C^{(0)}_{\cdot i},\, C^{(0)}_{\cdot j}\big)\big|, \qquad i \neq j,$$
where the absolute value is applied elementwise to enforce unsigned interaction strengths. Optionally, row-wise top-$k$ sparsification is applied to retain only the strongest interactions per concept, followed by row normalization to ensure stable propagation dynamics.
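A minimal sketch of this initialization, with the optional top-$k$ sparsification and row normalization included; the matrix `C0` stands for the training concept activations, and all sizes are illustrative:

```python
import numpy as np

def init_coupling(C0, topk=None):
    """Unsigned co-activation initialization of the coupling matrix W."""
    W = np.abs(np.corrcoef(C0, rowvar=False))   # unsigned interaction strengths
    np.fill_diagonal(W, 0.0)                    # no self-loops
    if topk is not None:                        # optional row-wise top-k
        for i in range(W.shape[0]):
            keep = np.argsort(W[i])[::-1][:topk]
            mask = np.zeros_like(W[i])
            mask[keep] = 1.0
            W[i] *= mask
    row_sums = W.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    return W / row_sums                         # row normalization

rng = np.random.default_rng(1)
C0 = rng.uniform(size=(150, 4))                 # toy concept activations
W = init_coupling(C0, topk=2)
```

Row normalization bounds the total incoming influence per concept, which is what keeps multi-step propagation from blowing up.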
Stage 2 (Weakly supervised Hebbian refinement around an average case). We incorporate limited supervisory information without performing end-to-end predictive optimization by refining $W$ using a weak Hebbian update centered around an average-case concept profile. Let $\bar{\mathbf{c}}$ denote the mean concept activation vector of non-distressed firms in the training set. For each training firm with concept vector $\mathbf{c}$ and label $y$, we perform the update
$$W \leftarrow (1 - \gamma)\, W + \eta\, s(y)\, (\mathbf{c} - \bar{\mathbf{c}})^{\top} (\mathbf{c} - \bar{\mathbf{c}}),$$
where $\eta > 0$ is a learning rate, $\gamma \in (0, 1)$ is a decay factor, and $s(y) = 1$ for $y = 1$ and $s(y) = \varepsilon$ for $y = 0$, with a small $\varepsilon > 0$ to mitigate class imbalance. After each update, non-negativity is enforced, self-loops are removed, and the same optional sparsification and row normalization as in the initialization stage are applied. The resulting $W$ is held fixed during inference and used in the FCM-style propagation of concept activations. We use the FCM formalism as an interpretable dynamical aggregation mechanism; edges encode unsigned coupling strengths learned from co-activation patterns, not causal effects.
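A sketch of the Stage 2 refinement, assuming a decayed outer-product Hebbian step centered on the mean non-distressed profile with distressed firms up-weighted; the hyperparameter names `eta`, `gamma`, `eps` and the toy data are illustrative, not the values used in our experiments:

```python
import numpy as np

def hebbian_refine(W, C, y, c_bar, eta=0.01, gamma=0.001, eps=0.1):
    """Weak Hebbian refinement of W around the average-case profile c_bar."""
    W = W.copy()
    for c, label in zip(C, y):
        s = 1.0 if label == 1 else eps            # down-weight majority class
        d = c - c_bar                             # deviation from average case
        W = (1.0 - gamma) * W + eta * s * np.outer(d, d)
        W = np.maximum(W, 0.0)                    # enforce non-negativity
        np.fill_diagonal(W, 0.0)                  # remove self-loops
    row_sums = W.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    return W / row_sums                           # row normalization

rng = np.random.default_rng(2)
C = rng.uniform(size=(100, 4))                    # toy concept activations
y = (rng.uniform(size=100) < 0.1).astype(int)     # rare distress labels
c_bar = C[y == 0].mean(axis=0)                    # mean non-distressed profile
W0 = np.full((4, 4), 0.25)
np.fill_diagonal(W0, 0.0)
W = hebbian_refine(W0, C, y, c_bar)
```

The constraints (non-negativity, no self-loops, row normalization) are what distinguish this from unconstrained Hebbian learning and keep the refined matrix a valid unsigned coupling structure.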
Inter-concept interactions are modeled as unsigned in order to represent influence intensity rather than causal direction. The objective of the proposed framework is monitoring-oriented risk concentration rather than structural causal inference. Consequently, the coupling matrix captures association strength derived from empirical co-movement patterns of financial ratios, without attributing directional economic causality.
This choice improves dynamical stability during multi-step propagation and prevents oscillatory effects that may arise from arbitrary sign assignments in high-dimensional settings. Interpretability is preserved through explicit concept membership and concept-to-risk weights, which provide transparent attribution independently of edge sign orientation. Importantly, this procedure does not minimize a global predictive loss. Instead, it combines unsupervised concept co-activation with a weak, average-case–centered associative signal, yielding a structured and interpretable concept interaction network.
Starting from $\mathbf{c}(0)$, the aggregated concepts evolve for $T$ iterations according to an FCM-inspired update that preserves a direct injection of the data-induced state:
$$\mathbf{c}(t+1) = \lambda\, \mathbf{c}(t) + (1 - \lambda)\, f\big(\mathbf{c}(t)\, W + \mathbf{c}(0)\big), \qquad t = 0, \dots, T - 1.$$
This propagation allows coherent vulnerability patterns to diffuse across related financial dimensions while preventing degenerate dynamics driven solely by inter-concept interactions. The proposed framework reduces neither to simple feature aggregation followed by linear scoring nor to conventional dimensionality reduction techniques such as Principal Component Analysis (PCA). While aggregation constitutes the first stage of the methodology, the model subsequently introduces explicit inter-concept interaction through the learned coupling matrix $W$ and controlled multi-step propagation. In contrast, linear scoring or PCA-based approaches produce static projections that do not model interaction-driven amplification or risk-accumulation effects across financial dimensions. Moreover, PCA components are variance-driven and orthogonal by construction, whereas the proposed clustering preserves economically interpretable ratio groupings and enables concept-level attribution throughout the propagation process.
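The multi-step propagation can be sketched as below; the additive injection of `c0` inside the activation is our reading of "direct injection of the data-induced state", and `T`, `lam`, and the toy data are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def propagate(c0, W, T=3, lam=0.5):
    """Evolve concepts for T steps, re-injecting the data-induced state c0."""
    c = c0.copy()
    for _ in range(T):
        c = lam * c + (1.0 - lam) * sigmoid(c @ W + c0)
    return c

rng = np.random.default_rng(3)
W = rng.uniform(0.0, 0.3, size=(4, 4))
np.fill_diagonal(W, 0.0)
c0 = rng.uniform(size=4)        # initial aggregated concept activations
cT = propagate(c0, W)
```

Re-injecting `c0` at every step anchors the dynamics to the observed firm, so the evolved state cannot drift into a fixed point determined solely by `W`.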
Supervision is introduced only at the final risk readout stage and does not affect the learned FCM structure. This separation preserves the interpretability and stability of the concept interaction network while allowing limited alignment with observed distress outcomes.
The final risk score is computed from the evolved concept state $\mathbf{c}(T)$ via a concept-level readout, augmented with a low-magnitude residual correction to mitigate information loss due to aggregation. Let $u$ denote the latent risk score in logit space and $\sigma(u) = 1 / (1 + e^{-u})$ the sigmoid. We define the risk score as
$$r = \sigma(u), \qquad u = \mathbf{c}(T)\, \mathbf{v} + \alpha\, g_R(\mathbf{z}_R),$$
where $\mathbf{v} \in \mathbb{R}^{m}$ links aggregated concepts to risk, and where $g_R$ is a sparse residual head defined over a small subset $R$ of raw ratios $\mathbf{z}_R$. The quantity $r \in (0, 1)$ represents the model's predicted probability of financial distress and is used for ranking firms and for all probabilistic evaluation metrics.
Residual feature selection. The residual subset $R$ (with $|R| \ll p$) is selected via mutual information ranking using the training split only, and then held fixed.
Residual head fitting. Given $R$, the residual head $g_R$ is fit on the TrainFull split using a class-balanced logistic regression on the selected raw ratios $\mathbf{z}_R$.
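The residual channel can be sketched with scikit-learn; the toy data, the subset size of three ratios, and the variable names are illustrative assumptions:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
Z_train = rng.normal(size=(400, 10))                  # toy scaled ratios
y_train = (Z_train[:, 0] + 0.5 * rng.normal(size=400) > 1.2).astype(int)

# Mutual-information ranking on the training split only; R is then frozen.
mi = mutual_info_classif(Z_train, y_train, random_state=0)
R = np.argsort(mi)[::-1][:3]                          # top-|R| ratios

# Class-balanced logistic regression on the selected raw ratios.
g_R = LogisticRegression(class_weight="balanced", max_iter=1000)
g_R.fit(Z_train[:, R], y_train)
logits = g_R.decision_function(Z_train[:, R])         # residual logit channel
```

In the full model these residual logits would be scaled by the small coefficient controlling the correction and added to the concept-level readout before the sigmoid.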
Weak supervision control. The scalar $\alpha$ controls the magnitude of the supervised residual correction applied at the final readout stage. It is selected exclusively on the validation split within the development data and fixed prior to any test evaluation. We restrict $\alpha$ to small-magnitude values to ensure that concept-level propagation remains the dominant explanatory mechanism. The chosen value of $\alpha$ balances calibration improvement with structural transparency. An ablation experiment on $\alpha$ further confirms that the residual channel operates primarily as a low-magnitude corrective adjustment rather than as the main driver of ranking performance.
Overall, the proposed framework combines unsupervised structure learning with lightweight supervised calibration, prioritizing interpretability and monitoring-oriented warning over end-to-end predictive optimization.
For any unseen firm, all preprocessing parameters, the clustering-derived mapping $A$, and the learned weights $W$ and $\mathbf{v}$ remain fixed. The firm is mapped to $\mathbf{c}(0)$, propagated to $\mathbf{c}(T)$, and assigned the risk score $r$. Interpretability is supported by (i) concept memberships (which ratios define each aggregated concept) and (ii) concept-level contributions, through the magnitudes of the concept-to-risk weights $\mathbf{v}$ and scenario perturbations at the concept level.
High-dimensional financial ratios are grouped into latent risk concepts through unsupervised clustering applied to the standardized training data. Each cluster defines a concept node within the hierarchical FCM, whose activation represents an aggregated signal from its constituent ratios. Inter-concept connections are estimated in an unsigned manner to represent influence intensity (rather than causal direction), and they are refined through constrained learning to stabilize the dynamics while preserving interpretability.
The contribution of the proposed framework does not arise from global predictive optimization, but from its structured interaction mechanism. After aggregating highly correlated financial ratios into coherent financial concepts, the model introduces explicit inter-concept coupling and controlled multi-step propagation. This mechanism allows vulnerability signals in one financial dimension (e.g., leverage stress) to reinforce related dimensions (e.g., liquidity pressure), thereby capturing risk accumulation effects that are not observable in purely additive linear scoring frameworks. The resulting interaction-driven amplification explains the model’s ability to concentrate distress risk among the highest-ranked firms under fixed monitoring budgets.
3.2.4. Evaluation Metrics
Bankruptcy risk assessment in supervisory practice is inherently a monitoring problem under limited inspection capacity. Regulators and risk managers are typically able to review only a fixed fraction of firms within a given period. Under such operational constraints, the relevant objective is not threshold-based classification accuracy, but the ability to concentrate distressed firms among the highest-ranked observations.
For this reason, we adopt ranking-based evaluation metrics, in particular Recall@Top-$K$, which measures the proportion of distressed firms captured within a predefined monitoring budget (e.g., Top-10%). Unlike threshold-dependent metrics, Recall@Top-$K$ directly reflects resource-constrained inspection settings and is invariant to arbitrary classification cutoffs. This choice is consistent with the use of precision–recall analysis as a more informative evaluation framework than accuracy or ROC-based measures in severely imbalanced settings (Davis & Goadrich, 2006; Saito & Rehmsmeier, 2015).
Model performance is evaluated using metrics appropriate for highly imbalanced financial distress data and ranking-based warning systems. Overall discriminative ability is assessed through the area under the receiver operating characteristic curve (ROC–AUC) and the area under the precision–recall curve (PR–AUC), with particular emphasis on PR–AUC due to its sensitivity to minority-class performance. To reflect practical monitoring constraints and the effectiveness of the model, we further evaluate performance using recall for the top-K percent of ranked firms.
Let $y_i \in \{0, 1\}$ denote the true distress status of firm $i$, and let $r_i$ be the corresponding predicted risk score, where higher values indicate greater financial distress risk. The receiver operating characteristic (ROC) curve plots the true positive rate $\mathrm{TPR} = \mathrm{TP} / (\mathrm{TP} + \mathrm{FN})$ against the false positive rate $\mathrm{FPR} = \mathrm{FP} / (\mathrm{FP} + \mathrm{TN})$ as the decision threshold varies. The ROC–AUC summarizes the probability that a randomly selected distressed firm receives a higher risk score than a non-distressed firm. The precision–recall (PR) curve depicts precision $\mathrm{TP} / (\mathrm{TP} + \mathrm{FP})$ as a function of recall $\mathrm{TP} / (\mathrm{TP} + \mathrm{FN})$. The PR–AUC corresponds to the average precision across recall levels and provides a more informative assessment than ROC–AUC in settings with severe class imbalance.
Furthermore, recall at the top $K$ percent (Recall@Top-$K$) is defined by ranking firms in descending order of predicted risk score $r_i$ and computing
$$\mathrm{Recall@Top\text{-}}K = \frac{\sum_{i \in \mathcal{T}_K} y_i}{\sum_{i=1}^{n} y_i},$$
where $\mathcal{T}_K$ denotes the set of firms with the top $K\%$ highest-ranked risk scores. This metric captures the proportion of distressed firms identified when monitoring capacity is limited to a fixed fraction of the population.
Lift@Top-K
To quantify the concentration of distressed firms within the monitored subset, we report the lift at the top $K$ percent, defined as the ratio between the precision within the Top-$K$ ranked firms and the base rate of distress:
$$\mathrm{Lift@Top\text{-}}K = \frac{\mathrm{Precision@Top\text{-}}K}{\pi},$$
where $\pi$ denotes the base rate of distressed firms in the evaluation set. Values greater than one indicate improved identification of distressed firms relative to random selection.
Brier Score
Finally, to assess the accuracy and calibration of probabilistic distress estimates, we report the Brier score, defined as the mean squared error between predicted probabilities and the binary outcomes:
$$\mathrm{BS} = \frac{1}{n} \sum_{i=1}^{n} (r_i - y_i)^2.$$
Lower values indicate better probabilistic accuracy (and, in particular, improved calibration), complementing ranking-based measures such as PR–AUC and Recall@Top-$K$.
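The three metrics above can be sketched directly in NumPy; the toy labels and scores are illustrative:

```python
import numpy as np

def recall_at_top_k(y, r, k_frac=0.10):
    """Fraction of all distressed firms captured in the top k_frac of the ranking."""
    n_top = max(1, int(np.floor(k_frac * len(y))))
    top = np.argsort(r)[::-1][:n_top]       # highest-ranked firms
    return y[top].sum() / y.sum()

def lift_at_top_k(y, r, k_frac=0.10):
    """Precision within the top k_frac, divided by the base rate of distress."""
    n_top = max(1, int(np.floor(k_frac * len(y))))
    top = np.argsort(r)[::-1][:n_top]
    return y[top].mean() / y.mean()

def brier_score(y, r):
    """Mean squared error between predicted probabilities and binary outcomes."""
    return np.mean((r - y) ** 2)

# Toy evaluation set: 2 distressed firms out of 10.
y = np.array([1, 0, 0, 0, 1, 0, 0, 0, 0, 0])
r = np.array([0.9, 0.2, 0.1, 0.3, 0.8, 0.2, 0.1, 0.1, 0.2, 0.1])
```

On this toy set, monitoring the top 20% (two firms) captures both distressed firms, giving a recall of 1.0 and a lift of 5.0 over the 0.2 base rate.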