Article

Integrating Analyst-Forecasting Indicators into Business Intelligence Systems for Data-Driven Financial Distress Prediction

1
School of Management, Nanjing University of Posts and Telecommunications, No 66 Xinmofan Road, Gulou District, Nanjing 210003, China
2
Interdisciplinary Sciences Institute, Hefei University of Technology, No. 193 Tunxi Road, Baohe District, Hefei 230009, China
3
International Business School, Shandong Technology and Business University, No. 191, Binhai Middle Road, Laishan District, Yantai 264005, China
4
AbbVie Inc., North Chicago, IL 60064, USA
5
School of Finance, Nanjing University of Finance and Economics, No.3 Wenyuan Road, Qixia District, Nanjing 210023, China
6
Institute of Systems Engineering, Macau University of Science and Technology, Taipa Street, Macao 999078, China
*
Author to whom correspondence should be addressed.
Systems 2026, 14(1), 29; https://doi.org/10.3390/systems14010029 (registering DOI)
Submission received: 27 October 2025 / Revised: 17 December 2025 / Accepted: 22 December 2025 / Published: 26 December 2025
(This article belongs to the Special Issue Business Intelligence and Data Analytics in Enterprise Systems)

Abstract

Predictive analytics for financial distress plays an important role in enterprise risk management and everyday business decisions. Most past studies mainly use accounting indicators that come from standard financial reports. This study adds analyst-forecast financial indicators and places them in a data-driven business intelligence setup to improve how companies predict financial distress. We work with seven real datasets to test several predictive models and run statistical checks to see how analyst forecasts work with historical financial data. The results show that analyst-forecast indicators can clearly improve prediction accuracy and make the results easier to understand. From an enterprise systems view, this study pushes traditional financial distress prediction toward a smarter analytics setup that supports real-time, explainable, and data-based risk assessment. The findings provide useful ideas for both the theory and practice of designing business intelligence systems and financial decision-support tools for companies.

1. Introduction

As the capital market grows and the economy faces stronger downward pressure, companies are dealing with intense competitive challenges [1]. This environment brings more volatility and uncertainty, which makes it crucial for firms to adjust their strategies with care. When companies fail to build effective systems that help them adapt, they become more exposed to problems such as tight cash flow and rising debt levels [2,3]. These problems can grow over time and may eventually push a company into financial distress. This condition threatens not only the firm’s survival but also creates heavy costs for market participants and the wider economy [4].
Financial distress can appear in several ways. A company may face ongoing losses, fail to meet debt payments, or even enter bankruptcy. These situations are not sudden events; they show that financial risks have been building up and have reached dangerous levels [5]. When this happens, the company’s long-term health is put at risk, and the interests of investors, employees, suppliers, and customers are also affected [6,7]. Trouble in a single company can spread through the market, disrupt an entire industry, and weaken investor confidence. Because of these risks, it is important to build strong and reliable methods for predicting financial distress. These methods should use the financial information that companies release and look for early signs that show a company may be heading toward problems [8]. By using clear and advanced analytical tools, stakeholders can gain useful insight into weak points inside a company. This insight can help them act in time, reduce possible losses, and protect both the company’s future and the stability of the larger economic environment.
Current research mostly uses accounting features derived from financial statements, such as solvency, profitability, and operating capacity, as the primary indicators for predicting financial distress [9,10,11,12]. However, relying solely on financial information is no longer sufficient for accurately predicting a company’s potential for financial distress [13]. According to a previous study, during the 1950s and 1960s approximately 80 to 90% of the factors influencing companies’ market values (stock prices) stemmed from corporate earnings and book values; this figure has since declined to around 50% [14]. In addition, financial statements offer a historical view of a company’s finances and do not indicate its current or future operational status. Therefore, models that rely exclusively on historical financial statement information cannot respond promptly to a constantly changing economic environment, nor can they accurately predict a company’s future financial condition [15,16]. In response, researchers have begun to focus on the support that alternative data provides for financial decision-making and have applied alternative data across various domains to enhance decision-making quality [17,18,19,20].
Alternative data is defined as non-traditional data that can indicate the performance of firms or financial markets beyond traditional ones, such as company filings, text, and analyst forecasts. Company filings provide crucial insights into financial health, allowing for informed predictions of distress through analysis of trends in revenue, debt levels, and operational performance. Wu et al. (2024) proposed a deep learning technology to predict financial distress by integrating financial indicators, current report texts, and user responses [21]. Similarly, Che et al. (2024) used multimodal data for financial distress prediction (FDP) [22], including financial indicators, current reports, and interfirm networks. Their results showed that the proposed method outperformed all baseline models in both prediction accuracy and feature representation. Text data also strengthened the analysis because it captured qualitative signals that could reveal hidden issues, and it worked together with numerical data to create a more complete picture [23]. Liu and Jia (2025) also included text data in financial distress prediction and used text embedding with deep feature extraction in an ensemble model to improve prediction performance [24].
Compared with company filings, text data, and other fixed sources of information, analyst forecasts provide expert opinions and forward-looking insights that reflect current market conditions. These forecasts offer a shared view that brings together both numerical data and qualitative judgment. This type of information can reveal risks and trends that may not appear in traditional reports or other static data. The experience of analysts, along with their access to broad and timely information, makes these forecasting indicators valuable for understanding a company’s financial strength. Because of this, analyst-forecasting data from listed firms can serve as useful alternative signals that capture market views not shown in standard financial statements and give a clearer forward-looking warning. However, only a few studies treat analyst forecasts as an important factor in predicting financial distress, which limits our understanding of how market expectations shape corporate stability. A closer study of this link may help build a stronger framework for identifying early signs of financial distress.
From a theoretical view, the value of analyst-forecasting indicators can be explained through several basic ideas in financial economics. First, information asymmetry theory states that managers hold private knowledge about a firm’s future, while outside users cannot easily access timely or detailed information [25,26]. Analysts help narrow this gap by using private communication, industry experience, and market signals to form forward-looking estimates. These estimates reduce information gaps and improve the timeliness of prediction models. Second, the semi-strong form of market efficiency says that asset prices reflect all public information, including analysts’ shared forecasts. These forecasts show market views on future earnings quality, risk levels, and business trends. These market views often change faster than accounting data, which look backward and are updated only a few times each year. Third, behavioral finance suggests that analysts’ forecasts capture changes in market sentiment, views on uncertainty, and early reactions to new risks—factors that traditional financial statements cannot measure. Therefore, when analyst-forecasting indicators are added to financial distress prediction models, the models can include forward-looking, market-based, and sentiment-sensitive information that traditional data alone cannot provide.
This study aims to examine in a systematic way whether analyst-forecasting indicators can improve corporate financial distress prediction when they are added to a data-driven business intelligence framework. To guide this work, we set out two research questions:
(RQ1) Does adding analyst-forecasting financial indicators improve the predictive performance of benchmark models compared with using only historical accounting indicators?
(RQ2) How do analyst-forecasting indicators help with model interpretability, and how important are they in explaining financial distress compared with traditional historical indicators?
To address these questions, we approach the analysis from two angles: predictive performance and feature importance. For predictive performance, we compare evaluation metrics from models that use different sets of features. We also run statistical tests on these metrics to confirm whether analyst-forecasting indicators provide real performance gains. For feature importance, we use Shapley additive explanations (SHAP) to examine how much these indicators contribute to predicting financial distress. This study contributes to the existing literature by demonstrating, from multiple angles, that analyst-forecasting financial indicators, used as alternative data, enhance the ability to predict financial distress. Furthermore, it provides a comprehensive validation framework that future research can use to assess the effectiveness of other alternative information sources. The graphical abstract of our study is shown in Figure 1.
The remainder of the paper is structured as follows: Section 2 gives the literature review. Section 3 presents the methods. Section 4 outlines the datasets, the benchmark models and their corresponding parameter settings, and the evaluation metrics. Section 5 presents the experimental findings. Section 6 provides an in-depth discussion of the experimental findings. Finally, Section 7 offers concluding remarks for the paper.

2. Literature Review

This section reviews the development of FDP from three thematic perspectives: traditional prediction models, alternative and text-based data, and analyst-forecasting indicators.

2.1. Traditional Financial Distress Prediction Models

Current classification methods for FDP can be primarily divided into two categories: statistical and artificial intelligence techniques. Altman (1968) proposed the multiple discriminant analysis (MDA) model for bankruptcy prediction, which, despite its high accuracy, is limited by assumptions like multivariate normal distribution and linear separability [27]. Blum (1974) built a financial distress model using discriminant analysis with 12 variables, and the model reached classification rates above 70% [28]. Ohlson (1980) improved the field by introducing a more flexible logistic regression (LR) model that could capture nonlinear links between financial status and financial ratios [29]. Lin (2009) tested several models, including MDA, logit, probit, and artificial neural networks (ANNs), to predict financial distress in Taiwanese public firms and found that these methods showed higher accuracy and better generalization [30]. However, statistical methods often depend on assumptions, such as multivariate normality and independent variables, which may not fit the characteristics of financial data [31]. Because of these limits, many researchers have moved toward artificial intelligence methods [32]. Wu et al. (2022) combined a multilayer perceptron (MLP) with the classic Altman Z-Score model to provide early warning signs of a firm’s weakening financial health [9]. Aydin et al. (2022) created a new model that joined ANNs with decision trees (DTs) to classify financial failures in different sectors [33]. Recent studies also show that support vector machines (SVMs), fuzzy neural networks, and back-propagation neural networks deliver strong results in financial distress prediction [34,35,36,37]. Chen (2011) applied three decision tree methods (C5.0, classification and regression tree (CART), and CHAID) along with a logistic regression model to predict financial distress [38]. The results showed that the decision tree models performed better than logistic regression in short-term forecasts. These findings suggest that artificial intelligence methods may offer more effective predictive power than traditional statistical models in identifying financial distress.
In recent years, ensemble models have received strong interest in financial distress prediction because they offer better predictive performance [39]. These models combine the results of several individual models, and this process often produces more accurate and stable outcomes than relying on a single model. Climent et al. (2019) used extreme gradient boosting (XGBoost) to predict financial distress in Eurozone commercial banks by examining 25 yearly financial ratios from 2006 to 2016 [40]. Sun et al. (2020) combined SVM classifiers with adaptive boosting (AdaBoost) and applied synthetic minority oversampling and time weighting to create two dynamic FDP methods for imbalanced datasets [41]. Their findings showed that both combined models improved the detection of minority samples [41]. Zhao et al. (2023) used financial statement data and management analysis to predict financial distress in Chinese listed firms [42]. They applied logistic regression, SVM, XGBoost, and random forest (RF) models, and found that XGBoost reached the highest accuracy [42]. Papíková and Papík examined FDP for small and medium-sized enterprises using 27 financial ratios from a large, highly imbalanced dataset of 89,851 firms, and found that categorical boosting (CatBoost) consistently outperformed other classification models [43]. Liang et al. (2020) added corporate governance factors into a stacking ensemble framework and showed that stacking models performed strongly and reliably [10].
Overall, artificial intelligence methods and ensemble models can detect complex patterns that traditional statistical tools may miss. They also offer more stable predictions because they can process different types of data and reduce overfitting. Based on these strengths, this paper uses several models to study how different input dimensions affect prediction results. However, most existing methods still rely mainly on historical accounting information, showing the need to include more timely and forward-looking indicators.

2.2. Alternative and Text-Based Data Approaches

Earlier studies on financial distress prediction have mainly used market data and accounting data from financial statements [44,45]. However, these financial indicators only show the past financial condition of listed companies, and they often lack both timeliness and completeness. To address these limitations, researchers increasingly incorporate alternative data—especially textual information—to capture forward-looking signals and qualitative aspects of firm performance.
Unlike traditional financial metrics, non-financial features taken from text-based company disclosures can offer useful insight into a firm’s wider operating environment, including cost control, strategy, and management [46]. Research shows that adding textual features from sources such as the management discussion and analysis (MD&A) sections of annual reports, legal judgments, and current reports can greatly improve the accuracy of financial distress prediction. Qiu et al. (2024) introduced an unsupervised learning method for FDP that used MD&A text [15]. Huang et al. (2023) studied financial distress prediction through sentiment analysis of annual reports from Chinese listed companies and found that adding sentiment scores improved model performance [47]. Mai et al. (2019) built a deep learning method for bankruptcy prediction based on textual disclosures and showed that combining text data with accounting ratios and market variables increased prediction accuracy [5]. Using data from European firms, Borchert et al. (2023) showed that website text information served as a useful supplement to traditional models [48]. Other studies have shown the value of sentiment indicators in crisis prediction [49] and have used advanced language models, such as bidirectional encoder representations from transformers (BERT), to extract predictive signals from large-scale text [50]. In line with these results, Nguyen and Huynh (2022) found that adding textual sentiment to firm characteristics and financial indicators improved model performance [51]. However, MD&A and similar documents come from annual financial reports of listed companies, and these reports may be altered for reasons such as financing or tax advantages, which can lead to misleading information [52]. In addition, these text disclosures, like financial reports, often appear with delays, which can limit timely decision-making for stakeholders [53]. 
Therefore, it is important to find information that is both more accurate and more timely to improve financial distress prediction.

2.3. Analyst-Forecasting Indicators and Identified Research Gap

Compared with past financial statements and text disclosures that may be delayed or biased, analyst-forecasting indicators provide a forward-looking and expert-based source of information. These indicators reflect timely judgments about a firm’s financial outlook [54,55]. Analysts combine real-time market conditions, firm-specific details, and industry expectations, which helps them identify early signs of financial decline [56,57]. Their forecasts are updated throughout the year, so they respond more quickly to new risks than annual or quarterly reports.
Despite these strengths, the use of analyst-forecasting indicators in financial distress prediction is still limited. Most existing studies focus on accounting ratios, market data, or text-based disclosures, while only a few include analyst forecasts in a systematic way. In addition, earlier research has rarely examined how analyst-forecasting information can be integrated into an intelligent business intelligence system that provides real-time and interpretable financial risk assessment. This gap shows a need for a complete framework. Such a framework would combine analyst-forecasting indicators with historical financial data and assess both their predictive power and their contribution to explaining financial distress.

3. Method

In financial distress prediction, firms often show nonlinear, asymmetric, and multi-factor financial behaviors. Capturing these complex patterns requires flexible and powerful models. To provide a thorough and reliable evaluation, this study uses seven widely validated machine-learning models. These models come from three methodological families: tree-based learners, ensemble techniques, and artificial neural networks.
(1) Tree-based learner (CART)
CART is adopted in FDP as a baseline nonlinear classifier due to its strong interpretability and ability to highlight key explanatory financial ratios. CART constructs a binary recursive partitioning structure by selecting split points that minimize impurity. For classification, the impurity measure is the Gini index:
$$\mathrm{Gini}(t) = \sum_{k=1}^{K} p_k(t)\,\bigl(1 - p_k(t)\bigr)$$
where $p_k(t)$ is the proportion of class $k$ at node $t$. The optimal split $s^*$ for node $t$ is selected by minimizing the weighted impurity of the two resulting child nodes:
$$s^* = \arg\min_{s} \left[ \frac{N_{t_L}}{N_t}\,\mathrm{Impurity}(t_L) + \frac{N_{t_R}}{N_t}\,\mathrm{Impurity}(t_R) \right]$$
where $N_t$ is the total number of samples contained in the current node $t$, $N_{t_L}$ and $N_{t_R}$ are the numbers of samples assigned to the left and right child nodes after applying the split, and $\mathrm{Impurity}(t_L)$ and $\mathrm{Impurity}(t_R)$ denote the impurities of the left and right child nodes, respectively.
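The impurity and split rule above can be illustrated with a minimal pure-Python sketch. It handles a single numeric feature and uses midpoints between distinct values as candidate thresholds; the function names, the midpoint rule, and the toy leverage data are assumptions made for illustration and are not taken from the study.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: sum over classes of p_k * (1 - p_k)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_impurity) minimizing the weighted child impurity."""
    n = len(values)
    best = (None, float("inf"))
    candidates = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= thr]
        right = [y for x, y in zip(values, labels) if x > thr]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (thr, score)
    return best

# Toy data (hypothetical): a leverage ratio and a distress label (1 = distressed).
leverage = [0.2, 0.3, 0.4, 0.7, 0.8, 0.9]
distress = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(leverage, distress)
print(threshold, impurity)
```

On this toy sample the two classes are perfectly separable, so the selected split leaves both child nodes pure.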
(2) Ensemble techniques
Ensemble methods combine multiple weak or strong learners to enhance stability and predictive accuracy. They are widely recognized for outperforming single learners in noisy, high-dimensional financial environments. Ensemble methods can be broadly categorized into boosting (AdaBoost, categorical boosting (CatBoost), light gradient boosting machine (LightGBM)) and bagging (extremely randomized tree (ERT), RF) strategies.
AdaBoost focuses sequentially on misclassified firms, enabling the model to capture subtle signals of impending distress. It builds an ensemble by iteratively reweighting misclassified samples. At iteration $m$, the classifier weight $\alpha_m$ is:
$$\alpha_m = \frac{1}{2} \ln\!\left( \frac{1 - \epsilon_m}{\epsilon_m} \right)$$
where $\epsilon_m$ is the weighted error. The final prediction is obtained by combining multiple weak learners through a weighted voting mechanism. The ensemble classifier is defined as:
$$F(x) = \mathrm{sign}\!\left( \sum_{m=1}^{M} \alpha_m h_m(x) \right)$$
where $M$ is the total number of weak learners (iterations) used in the boosting process, $h_m(x)$ is the $m$-th weak learner, and $\mathrm{sign}(\cdot)$ is the sign function that converts the aggregated output into the final class label (distressed vs. non-distressed).
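As a small numerical illustration of the two AdaBoost formulas, the sketch below computes classifier weights from weighted errors and combines weak-learner votes. The error values, the votes, and the convention that a zero score maps to +1 are hypothetical choices for this example only.

```python
import math

def classifier_weight(weighted_error):
    """alpha_m = 0.5 * ln((1 - eps_m) / eps_m), defined for 0 < eps_m < 1."""
    return 0.5 * math.log((1 - weighted_error) / weighted_error)

def ensemble_predict(alphas, weak_predictions):
    """sign(sum_m alpha_m * h_m(x)) with labels in {-1, +1}; a zero score maps to +1."""
    score = sum(a * h for a, h in zip(alphas, weak_predictions))
    return 1 if score >= 0 else -1

errors = [0.1, 0.3, 0.45]                      # hypothetical weighted errors
alphas = [classifier_weight(e) for e in errors]
votes = [+1, -1, -1]                           # hypothetical weak-learner outputs
label = ensemble_predict(alphas, votes)
print(alphas, label)
```

Here the most accurate weak learner carries enough weight to outvote the two weaker ones, which is the behavior the weighted voting mechanism is designed to produce.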
CatBoost is a gradient boosting algorithm that updates the ensemble model iteratively by adding a new decision tree at each boosting step. Its prediction function at iteration $m$ is updated as:
$$f^{(m)}(x) = f^{(m-1)}(x) + \delta\, h_m(x)$$
where $f^{(m-1)}(x)$ is the model’s prediction from the previous iteration, $h_m(x)$ is the newly constructed regression tree, and $\delta$ is the learning rate controlling the contribution of each tree to the ensemble. A key innovation of CatBoost lies in its ordered boosting mechanism, which avoids prediction shift when processing categorical data. For each observation $i$, CatBoost computes an unbiased target statistic using only prior samples as:
$$\hat{y}_i = \frac{\sum_{j<i} y_j + \alpha}{i - 1 + \beta}$$
where $\sum_{j<i} y_j$ denotes the cumulative target values of samples preceding $i$ in a randomized order, and $\alpha$ and $\beta$ are smoothing hyperparameters preventing instability when few prior observations are available.
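The ordered target statistic can be traced by hand with a short sketch. It assumes the targets are already arranged in the randomized order and uses illustrative smoothing values (alpha = 0.5, beta = 1.0) that are not taken from the study.

```python
def ordered_target_statistics(targets, alpha=0.5, beta=1.0):
    """For each 1-based position i, return (sum_{j<i} y_j + alpha) / (i - 1 + beta)."""
    stats, running_sum = [], 0.0
    for n_prior, y in enumerate(targets):      # n_prior equals i - 1
        stats.append((running_sum + alpha) / (n_prior + beta))
        running_sum += y                       # the current target is visible only to later samples
    return stats

# Hypothetical distress labels, already in the randomized order.
ts = ordered_target_statistics([1, 0, 1, 1])
print(ts)  # [0.5, 0.75, 0.5, 0.625]
```

Because each statistic uses only earlier samples, no observation contributes its own label to its encoded value.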
LightGBM optimizes gradient boosting decision trees through histogram-based splitting. At each iteration, the tree is built by minimizing:
$$L = \sum_{i} l(y_i, \hat{y}_i) + \sum_{m=1}^{M} \Omega(f_m)$$
where $\hat{y}_i$ is the predicted probability for sample $i$, $f_m$ represents the $m$-th decision tree, $M$ is the total number of boosting iterations, and $\Omega(\cdot)$ is the regularization term penalizing tree complexity. Each candidate split is evaluated by the gain:
$$\mathrm{Gain} = \frac{1}{2} \left( \frac{\left(\sum_{i \in L} g_i\right)^2}{\sum_{i \in L} h_i + \lambda} + \frac{\left(\sum_{i \in R} g_i\right)^2}{\sum_{i \in R} h_i + \lambda} - \frac{\left(\sum_{i \in S} g_i\right)^2}{\sum_{i \in S} h_i + \lambda} \right)$$
where $S$ represents all samples at the current node, and $L$ and $R$ denote the left and right child nodes after the split. For each instance $i$, $g_i$ and $h_i$ are the first- and second-order gradients of the loss function. The term $\lambda$ is an $L_2$-regularization parameter that penalizes overly complex leaf weights and stabilizes tree growth. A positive gain indicates that the split improves model accuracy, and LightGBM selects the split with the largest gain during tree construction.
Through its use of gradient-based one-side sampling and leaf-wise tree growth, LightGBM can capture subtle nonlinear relationships in high-dimensional financial data while maintaining computational efficiency and strong performance on imbalanced datasets.
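The split gain used in gradient-boosted trees can be checked numerically with a minimal sketch. The gradient and Hessian values below are hypothetical, and lam = 1.0 is an arbitrary regularization strength chosen for the example.

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0):
    """Gain of splitting node S = L + R, given per-sample gradients g and Hessians h."""
    def score(g, h):
        return sum(g) ** 2 / (sum(h) + lam)
    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent)

# Hypothetical case 1: the children have gradients of opposite sign, so the split helps.
gain_good = split_gain([-0.5, -0.4], [0.25, 0.24], [0.6, 0.5], [0.24, 0.25])
# Hypothetical case 2: gradients cancel inside each child, so the split adds nothing.
gain_none = split_gain([-0.5, 0.5], [0.25, 0.25], [-0.5, 0.5], [0.25, 0.25])
print(gain_good, gain_none)
```

A split is informative when it separates samples whose gradients point in different directions; when gradients cancel within each child, the gain is zero.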
ERT introduces additional randomness by selecting split thresholds at random rather than searching for the optimal split, which can reduce variance and improve generalization. This property is advantageous in FDP, as the enhanced randomness helps the model better capture noisy, volatile, and highly nonlinear financial patterns. For trees with predictions $f_k(x)$, the ensemble output is:
$$F(x) = \frac{1}{K} \sum_{k=1}^{K} f_k(x)$$
where $K$ is the number of trees and $f_k(x)$ represents the prediction of the $k$-th randomized tree.
RF trains multiple DTs using bootstrapped subsets of data and random feature selection at each split, enhancing robustness and generalization. Prediction is obtained via majority voting:
$$F(x) = \mathrm{majority\_vote}\bigl(f_1(x), f_2(x), \ldots, f_K(x)\bigr)$$
where $K$ is the total number of trees in the forest and $f_k(x)$ is the prediction from tree $k$.
(3) ANNs (MLP)
MLP is a feedforward neural network composed of an input layer, one or more hidden layers, and an output layer, where each neuron performs a nonlinear transformation of its inputs. For a given layer $l$, the output of neuron $j$ is computed as:
$$a_j^{(l)} = \sigma\!\left( \sum_{i} w_{ij}^{(l)} a_i^{(l-1)} + b_j^{(l)} \right)$$
where $a_i^{(l-1)}$ represents the input from neuron $i$ in the previous layer, $w_{ij}^{(l)}$ and $b_j^{(l)}$ denote the connection weight and bias, and $\sigma(\cdot)$ is a nonlinear activation function. The network prediction is obtained by composing these transformations across layers,
$$\hat{y} = f(x; \theta)$$
where $\theta = \{W, b\}$ denotes the collection of all trainable parameters. During training, MLP minimizes a loss function $L(\hat{y}, y)$ via backpropagation, updating parameters through gradient descent:
$$\theta^{(t+1)} = \theta^{(t)} - \eta\, \nabla_{\theta} L(\hat{y}, y)$$
where $\eta$ is the learning rate and $\nabla_{\theta} L$ is the gradient of the loss with respect to the parameters. MLP can capture complex nonlinear relationships and uncover subtle interactions among financial indicators that traditional models often overlook.
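The layer transformation and the gradient-descent update can be made concrete with a small pure-Python sketch. To stay short, the update is shown for a single sigmoid output neuron trained with binary cross-entropy rather than for full backpropagation through hidden layers; the sigmoid activation, the weights, the inputs, and the learning rate are all hypothetical choices for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """a_j = sigmoid(sum_i w[i][j] * a_prev[i] + b[j]) for every neuron j in the layer."""
    return [
        sigmoid(sum(weights[i][j] * inputs[i] for i in range(len(inputs))) + biases[j])
        for j in range(len(biases))
    ]

def predict(x, w, b):
    """Output of a single sigmoid neuron."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def bce(x, y, w, b):
    """Binary cross-entropy loss for one sample."""
    y_hat = predict(x, w, b)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

def gradient_step(x, y, w, b, lr=0.1):
    """theta <- theta - lr * grad; for sigmoid + cross-entropy the gradient is (y_hat - y) * x."""
    err = predict(x, w, b) - y
    return [wi - lr * err * xi for wi, xi in zip(w, x)], b - lr * err

x, y = [0.5, -1.2], 1                      # hypothetical standardized ratios, distressed firm
hidden = layer_forward(x, [[0.1, -0.3], [0.4, 0.2]], [0.0, 0.1])
w0, b0 = [0.0, 0.0], 0.0
w1, b1 = gradient_step(x, y, w0, b0)
print(hidden, bce(x, y, w0, b0), bce(x, y, w1, b1))
```

A single update already lowers the loss on this sample, which is the local behavior the gradient-descent rule guarantees for a sufficiently small learning rate.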

4. Experimental Setup

This section details the datasets, the benchmark models and their respective parameter settings, and the performance evaluation metric used in the study.

4.1. Datasets

To validate whether analyst-forecasting financial indicators contribute to predicting corporate financial distress, we collected two sets of indicators: historical financial indicators and forecasting financial indicators. The specific features are detailed in Table 1. For historical financial indicators, we selected several indicators from Chinese listed companies. The forecasting financial indicators were aggregated from 732,334 analyst predictions, selected from a total of 1,194,137 predictions under the criterion that the forecast was made before the predicted year. For instance, to predict the financial condition of a given company in 2017, we used the financial indicator predictions made by analysts in 2016 for 2017. All features were collected from the China Stock Market & Accounting Research (CSMAR) Database.
The dataset was divided by year. For example, historical financial indicators from 2016 and analyst forecasts for 2017 were used to predict the financial condition in 2017. By averaging the predictions from numerous analysts, we obtained the mean forecasting financial indicators for each company. This study focuses on predicting the financial conditions of Chinese publicly traded companies from 2017 to 2023, resulting in the construction of seven datasets. Missing values were imputed using the mean of the respective feature, and numerical variables were standardized based on the training set. To ensure a more stable comparison of the contribution from each feature to predictive performance, we employed five-fold cross-validation on each of the seven datasets.
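The selection rule (keep only forecasts issued before the predicted year) and the per-company averaging used in this study can be sketched as follows. The record layout and field names (`firm`, `issue_year`, `target_year`, `eps`) are hypothetical and do not reflect the actual database schema.

```python
from collections import defaultdict

def mean_forecasts(forecasts, target_year):
    """Average analyst forecasts per firm, keeping only those issued before target_year."""
    buckets = defaultdict(list)
    for f in forecasts:
        if f["target_year"] == target_year and f["issue_year"] < target_year:
            buckets[f["firm"]].append(f["eps"])
    return {firm: sum(v) / len(v) for firm, v in buckets.items()}

# Hypothetical records; the third one is issued in the predicted year and is dropped.
records = [
    {"firm": "A", "issue_year": 2016, "target_year": 2017, "eps": 1.0},
    {"firm": "A", "issue_year": 2016, "target_year": 2017, "eps": 2.0},
    {"firm": "A", "issue_year": 2017, "target_year": 2017, "eps": 9.0},
    {"firm": "B", "issue_year": 2016, "target_year": 2017, "eps": 0.5},
]
means = mean_forecasts(records, 2017)
print(means)  # {'A': 1.5, 'B': 0.5}
```

Filtering on the issue year keeps information from the predicted year out of the features, which is the point of the selection criterion.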
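The preprocessing steps described here (mean imputation, standardization based on the training set, and five-fold cross-validation) can be sketched for a single feature column. Two details are assumptions of this sketch rather than statements from the study: the imputation mean is also learned from the training fold only, and the folds are formed by a simple unshuffled, unstratified interleaving.

```python
import math

def fit_preprocessor(train_col):
    """Learn the imputation mean and the scaling parameters from the training fold only."""
    observed = [v for v in train_col if v is not None]
    mean = sum(observed) / len(observed)
    filled = [mean if v is None else v for v in train_col]
    std = math.sqrt(sum((v - mean) ** 2 for v in filled) / len(filled))
    return mean, std or 1.0                   # guard against a constant column

def transform(col, mean, std):
    """Impute missing values with the training mean, then standardize."""
    return [((mean if v is None else v) - mean) / std for v in col]

def five_fold_indices(n, k=5):
    """Yield (train_idx, test_idx) pairs; fold f tests every k-th row starting at f."""
    for f in range(k):
        test = [i for i in range(n) if i % k == f]
        train = [i for i in range(n) if i % k != f]
        yield train, test

column = [1.0, None, 3.0, 5.0, 7.0, None, 9.0, 2.0, 4.0, 6.0]   # hypothetical ratio values
folds = list(five_fold_indices(len(column)))
train_idx, test_idx = folds[0]
mean, std = fit_preprocessor([column[i] for i in train_idx])
test_scaled = transform([column[i] for i in test_idx], mean, std)
print(test_idx, test_scaled)
```

Fitting the statistics on the training rows and reusing them on the held-out rows keeps the test fold from influencing the scaling.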

4.2. Benchmark Models and Parameter Settings

This study takes seven machine learning models as benchmark models: a tree model (CART), three boosting models (AdaBoost, CatBoost, LightGBM), two bagging models (ERT, RF), and an ANN (MLP). By employing a diverse range of models, this study seeks to thoroughly assess the performance of various machine learning techniques across different FDP scenarios, thereby evaluating the practicality of alternative datasets and feature configurations. The core parameters and their values are shown in Table 2.

4.3. Evaluation Metric

The area under the receiver operating characteristic curve (AUC) is employed to evaluate how well the FDP models discriminate between distressed and non-distressed firms [6]. AUC quantifies the area between the receiver operating characteristic (ROC) curve and the horizontal axis. To compute it, the predicted financial distress scores generated by the FDP model for each sample are first sorted [58]. The sorted scores are then used sequentially as thresholds, from high to low, to construct the ROC curve, and the area beneath the curve is calculated. Because computing this area directly can be cumbersome, the AUC is often calculated from ranks instead:
$$\mathrm{AUC} = \frac{\sum_{i \in \mathrm{positive}} r_i - n_+ (n_+ + 1)/2}{n_+ \times n_-}$$
where $n_+$ and $n_-$ are the numbers of positive and negative samples, respectively, and $r_i$ is the rank of the $i$-th positive sample when all samples are ranked by predicted score so that the highest score receives the largest rank. AUC takes values in [0, 1]; a value of 0.5 corresponds to random guessing, so informative models fall in the range (0.5, 1]. A higher AUC value signifies better forecasting performance, reflecting the model’s superior capacity to correctly differentiate between instances of financial distress and non-distress.
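The rank-based AUC formula can be verified on a toy example. The sketch below assumes that ranks increase with the score (the lowest score receives rank 1), which is the convention under which the formula equals the share of positive-negative pairs ordered correctly; tied scores share their average rank. The scores and labels are hypothetical.

```python
def auc_from_ranks(scores, labels):
    """AUC = (sum of positive ranks - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1      # mean of the 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg_rank
        start = end + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]     # hypothetical predicted distress scores
labels = [1, 0, 1, 1, 0, 0]                  # 1 = distressed
auc = auc_from_ranks(scores, labels)
print(auc)
```

In this example 7 of the 9 positive-negative pairs are ordered correctly, so the formula returns 7/9.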

5. Experiment Results from Two Perspectives

In this section, we conduct experiments from two perspectives—performance comparison and feature contribution—to validate the effectiveness of the analyst-forecasting financial indicators.

5.1. Validation Experiment from a Performance Comparison Perspective

To assess the enhancement of alternative information (forecasting indicators) to financial distress identification, we examine the AUC values of benchmark models under three prediction scenarios: using only historical indicators, using only forecasting indicators, and using both historical and forecasting indicators (all indicators), as detailed in Table 3. The standard deviation is given in brackets. In many instances, relying solely on forecasting indicators for FDP is suboptimal, as evidenced by the low AUC values observed in both the ANNs and ensemble models. These AUC values, which range from 0.6 to 0.8, show that these methods may not be strong enough for reliable prediction. These findings highlight the need to use a broader set of predictive factors to improve model performance. Our results also show that historical indicators provide better predictions than forecasting indicators alone. For instance, the CART model with historical indicators achieves AUC values of 0.8198, 0.8123, 0.7964, 0.7607, 0.7595, 0.7608, and 0.7733. Each of these values is at least 0.1 higher than the values based on forecasting indicators. This difference indicates that historical indicators contain key information about a firm’s operating condition and help the model achieve stronger predictive performance. Furthermore, combining forecasting indicators with historical indicators produces a more stable early-warning signal than using either type alone. The empirical results show that models using both types of indicators achieve higher AUC values in all seven datasets. This improvement suggests that forecasting information adds meaningful value and increases the model’s ability to detect financial risks, which improves the accuracy of financial distress prediction.
The prediction results are used to compute the average ranks of the AUC values for each set of input indicators, as required by the Friedman test (Table 4). A higher rank corresponds to higher prediction accuracy. Models using both historical and forecasting indicators achieve higher ranks than models using a single type of feature. Models with all indicators are treated as the control set, and the adjusted p-values from the Holm post hoc test are also presented in Table 4. According to the Friedman test, the performance differences among models using the three sets of indicators are significant, with all p-values below 0.01. The Holm post hoc test shows that the gap between models based on historical indicators and models based on all indicators is the smaller of the two comparisons. Nevertheless, models based on all indicators still outperform those based on either subset of indicators. Although the Holm post hoc results for certain models are significant only at the 80% confidence level, the results for the majority of models are significant at the 90% confidence level. Consequently, the contribution of forecasting indicators to the prediction of corporate financial distress is statistically significant. Despite some uncertainty in the evaluation of individual models, the overall trend indicates that forecasting indicators combined with historical indicators play a crucial role in identifying financial risks.
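The testing procedure can be sketched as follows, assuming the textbook rank-based Friedman statistic and a Holm step-down correction against a control. The AUC table is a toy example, not Table 3, and the function names are introduced here for illustration. With three scenarios the chi-square approximation has two degrees of freedom, so its tail probability has the closed form exp(-x/2). The approximation is rough for only seven datasets.

```python
import math
from typing import Dict, List, Tuple


def average_ranks(table: List[List[float]]) -> List[float]:
    """Rank columns within each row (rank 1 = lowest AUC, so a higher rank
    means higher accuracy), share ranks among ties, then average over rows."""
    n_rows, n_cols = len(table), len(table[0])
    totals = [0.0] * n_cols
    for row in table:
        order = sorted(range(n_cols), key=lambda j: row[j])
        i = 0
        while i < n_cols:
            j = i
            while j + 1 < n_cols and row[order[j + 1]] == row[order[i]]:
                j += 1
            mean_rank = (i + j) / 2 + 1  # ranks are 1-based
            for pos in range(i, j + 1):
                totals[order[pos]] += mean_rank
            i = j + 1
    return [t / n_rows for t in totals]


def friedman_three_groups(avg_ranks: List[float], n_rows: int) -> Tuple[float, float]:
    """Friedman chi-square and its p-value for exactly k = 3 groups (df = 2)."""
    k = len(avg_ranks)
    if k != 3:
        raise ValueError("the closed-form p-value used here holds only for k = 3")
    chi2 = 12 * n_rows / (k * (k + 1)) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4
    )
    return chi2, math.exp(-chi2 / 2)


def holm_vs_control(avg_ranks: List[float], n_rows: int, control: int) -> Dict[int, float]:
    """Holm-adjusted two-sided p-values comparing each group with the control."""
    k = len(avg_ranks)
    se = math.sqrt(k * (k + 1) / (6 * n_rows))
    raw = {
        j: math.erfc(abs(avg_ranks[control] - avg_ranks[j]) / se / math.sqrt(2))
        for j in range(k) if j != control
    }
    m = len(raw)
    adjusted: Dict[int, float] = {}
    running_max = 0.0
    for step, (j, p) in enumerate(sorted(raw.items(), key=lambda kv: kv[1])):
        running_max = max(running_max, min(1.0, (m - step) * p))
        adjusted[j] = running_max
    return adjusted


# Toy AUC table: rows = 7 datasets, columns = (historical, forecasting, all).
# Illustrative values only, NOT the results reported in Table 3.
scenarios = ["historical", "forecasting", "all"]
auc_table = [
    [0.82, 0.70, 0.85], [0.81, 0.69, 0.84], [0.80, 0.68, 0.83], [0.76, 0.65, 0.79],
    [0.76, 0.66, 0.78], [0.76, 0.64, 0.80], [0.77, 0.66, 0.81],
]
ranks = average_ranks(auc_table)
chi2, p_value = friedman_three_groups(ranks, len(auc_table))
adjusted = holm_vs_control(ranks, len(auc_table), control=2)
print("average ranks:", dict(zip(scenarios, ranks)))
print(f"Friedman chi2 = {chi2:.3f}, p = {p_value:.5f}")
for j, p in adjusted.items():
    print(f"{scenarios[j]} vs all: Holm-adjusted p = {p:.5f}")
```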
We also use Bayesian A/B testing to compare the prediction results from the three scenarios. Bayesian A/B testing is a statistical method that uses Bayesian inference to evaluate how well different versions perform, updating beliefs as more data becomes available [59]. Unlike frequentist tests, which rely on p-values and fixed significance thresholds, Bayesian testing provides a full probability distribution for the parameters of interest and can incorporate prior knowledge, which allows for more flexible interpretation. The main steps are as follows. First, we rank the performance of each model across the three scenarios. Then, we use these rankings in the Bayesian A/B test to calculate the probability that each scenario is the best and to estimate the expected loss if that scenario is chosen. Table 5 presents the results. Whether we examine all models together or individually, the scenario that includes all indicators has the highest probability of being the best, at over 99%. At the same time, the expected loss in this scenario is the lowest, remaining below 0.0050. These results demonstrate that using the full set of indicators is highly effective. The Bayesian A/B test confirms that adding alternative information not only raises the chance of selecting the best scenario but also reduces expected loss. This emphasizes the importance of combining multiple types of information in financial distress prediction.
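The two quantities reported by a Bayesian A/B test can be sketched with a simple Beta-Binomial model. The text does not specify how rankings are mapped into the test, so this sketch assumes each model-dataset pair is a trial in which a scenario "wins" if it is ranked first. It also treats the three win rates as independent, which is a simplification. The win counts are hypothetical, and `bayesian_ab` is a name introduced here.

```python
import random
from typing import Dict, Tuple


def bayesian_ab(wins: Dict[str, int], trials: int,
                draws: int = 20000, seed: int = 0) -> Dict[str, Tuple[float, float]]:
    """Monte Carlo estimate of (probability of being best, expected loss)
    for each variant under a uniform Beta(1, 1) prior on its win rate."""
    rng = random.Random(seed)
    names = list(wins)
    best_counts = {n: 0 for n in names}
    loss_sums = {n: 0.0 for n in names}
    for _ in range(draws):
        # One joint draw from the Beta(1 + wins, 1 + losses) posteriors.
        sample = {n: rng.betavariate(1 + wins[n], 1 + trials - wins[n]) for n in names}
        top = max(sample.values())
        best_counts[max(sample, key=sample.get)] += 1
        for n in names:
            # Loss of choosing n = shortfall relative to the best variant.
            loss_sums[n] += top - sample[n]
    return {n: (best_counts[n] / draws, loss_sums[n] / draws) for n in names}


# Hypothetical counts of first-place rankings out of 49 model-dataset pairs.
# These are illustrative assumptions, NOT the values behind Table 5.
wins = {"historical": 5, "forecasting": 1, "all": 43}
results = bayesian_ab(wins, trials=49)
for name, (p_best, exp_loss) in results.items():
    print(f"{name:12s} P(best) = {p_best:.4f}, expected loss = {exp_loss:.4f}")
```

A scenario is preferred when its probability of being best is high and its expected loss is low, which is how Table 5 is read.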

5.2. Validation Experiment from a Feature Contribution Perspective

In this paper, SHAP values are introduced to uncover the internal relationships between all indicators and the prediction results of ERT [60]. The ERT is selected to validate the contribution of analyst-forecasting financial indicators to predictions, because experimental outcomes demonstrate that ERT achieves the highest prediction accuracy across all datasets. The visualization results are presented in Figure 2, with the meanings of the variables provided in Table 1.
In Figure 2, the attributes are ranked from highest to lowest by their SHAP values, so the top attribute contributes the most to the model output [61]. This ranking shows which features are most influential in the analysis. Feature values are color-coded: blue represents lower feature values, while red denotes higher values. A positive SHAP value raises the model output for a sample, and a negative SHAP value lowers it.
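To make the attribution idea concrete, the sketch below computes exact Shapley values for a toy three-feature distress score by enumerating all feature coalitions, then ranks the features by absolute contribution. It illustrates the principle behind SHAP, not the estimator applied to ERT in this study. The scoring function, its coefficients, the feature values, and the baseline are all hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(model: Callable[[Sequence[float]], float],
                   x: Sequence[float], baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values of model(x) relative to model(baseline).
    Features outside a coalition are set to their baseline value."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def toy_distress_score(features: Sequence[float]) -> float:
    """Hypothetical score: falls with Lroe and Froe, rises with Lpb."""
    lroe, lpb, froe = features
    return 0.5 - 2.0 * lroe + 0.1 * lpb - 1.0 * froe - 0.5 * lroe * froe


feature_names = ["Lroe", "Lpb", "Froe"]
baseline = [0.08, 2.0, 0.09]  # assumed reference firm
firm = [0.15, 1.2, 0.12]      # assumed firm being explained

phi = shapley_values(toy_distress_score, firm, baseline)
ranked = sorted(feature_names, key=lambda name: -abs(phi[feature_names.index(name)]))
for name in ranked:
    print(f"{name}: {phi[feature_names.index(name)]:+.6f}")
```

The contributions sum to the difference between the firm's score and the baseline score. A summary plot such as Figure 2 aggregates per-sample values of this kind across the dataset.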
As Figure 2 shows, Lroe (return on equity in the previous year) contributes the most to the FDP results, followed by Leps (earnings per share in the previous year) and Lroa (return on assets in the previous year). Higher values of these three historical financial indicators reduce the chance that a listed company will face financial distress. Strong return on equity, earnings per share, and return on assets reflect higher profitability and signal that a company is likely to remain stable and continue growing, which lowers the risk of financial distress. When the values of these three indicators are low, their relationship with financial distress should be examined together with other related measures. Lpb (price-to-book ratio in the previous year) is another important factor. A lower price-to-book ratio usually indicates a lower chance of financial distress. This is because a low ratio reflects high book asset value relative to market price, suggesting a strong asset base and a stable financial position. When book value is high compared with market valuation, the company may also appear undervalued, which often signals stability and resilience. A solid asset base can help the firm absorb economic pressure and manage financial challenges. A favorable price-to-book ratio can also attract investors, as it indicates potential for value growth and lower risk. Conversely, a high price-to-book ratio may signal overvaluation or financial instability, making the company more vulnerable to market fluctuations.
Froe (forecasting return on equity) also plays a critical role in predicting financial health. A high Froe suggests that the company is likely to sustain steady profitability, which is essential for long-term success. This indicator shows the firm’s ability to generate earnings from shareholders’ equity and reflects efficient use of capital. A high Froe signals strong current performance and a positive outlook for future earnings. This supports resilience in uncertain markets and helps the company handle external risks. A strong Froe can also build investor confidence and reduce concerns about financial distress. In addition, Lnavs (net asset value per share in the previous year), Feps (forecasting earnings per share), Ltat (total asset turnover in the previous year), and Fnavs (forecasting net asset value per share) serve as valuable reference indicators for predicting financial distress.
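For reference, the standard textbook forms of the four ratios discussed above are given below. The exact variants used in the data (for example, year-end versus average equity) are not stated here, so these forms are an assumption.

```latex
\begin{aligned}
\mathrm{ROE} &= \frac{\text{net income}}{\text{shareholders' equity}}, &
\mathrm{ROA} &= \frac{\text{net income}}{\text{total assets}}, \\
\mathrm{EPS} &= \frac{\text{net income}}{\text{shares outstanding}}, &
\mathrm{P/B} &= \frac{\text{market price per share}}{\text{book value per share}}.
\end{aligned}
```

Under these definitions, a low P/B means that book value per share is high relative to the market price, which is the reading applied to Lpb.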
From the above analysis, we can see that many historical financial indicators play an important role in predicting a company’s future financial condition. Analyst-forecasting indicators also have strong predictive power. These forecasts rely on broad market analysis and industry trends, which allows them to provide insights beyond what historical data can reveal. They can highlight emerging patterns and risks that past indicators might overlook, offering a more complete understanding of a company’s future prospects. By combining historical indicators with forecasting indicators, decision-makers can build a stronger and more reliable system for judging financial stability. This combined approach helps support better planning and more effective allocation of resources.

6. Discussion

This study adds to the FDP literature by providing new evidence on the value of using analyst-forecasting indicators in a data-driven business intelligence framework. Earlier research has relied mainly on historical accounting ratios as the key source of predictive information. However, many recent studies have pointed out that the explanatory power of book-value-based indicators has weakened, and these indicators are limited because they only reflect past performance [13]. Our findings support this concern and show that accounting information alone cannot fully capture the changing risk conditions of firms. This conclusion aligns with earlier work showing that traditional financial statements offer only a delayed view of a company’s situation and do not include fast-emerging risks shaped by rapidly changing market environments [62].
In contrast to the studies of Wu et al. (2024) [21], Hajek and Munk (2024) [6], and Qiu et al. (2024) [15], which use company filings or textual disclosures as alternative data, this study introduces analyst-forecasting indicators as an expert-driven and forward-looking source of information and demonstrates their added value. Text-based sources, like MD&A sections or current reports, can provide useful risk signals. However, they often face reporting delays, carefully managed tone, or difficulties in text processing. Analyst forecasts are different because they reflect real-time market opinions, private communications, and industry insights. They also provide updated numerical estimates, giving a faster view of a company’s financial outlook. Our results show that models that add analyst-forecasting indicators to accounting data consistently outperform models that rely only on accounting data. These findings are consistent with theories of information asymmetry, semi-strong market efficiency, and behavioral finance. They suggest that analyst forecasts capture market sentiment, risk expectations, and forward-looking signals that traditional disclosures cannot fully reveal.
Unlike earlier studies that focus mainly on prediction accuracy [63,64], our model not only improves predictive performance but also adds interpretability through SHAP analysis. The interpretability results show that analyst-forecasting indicators strongly influence predicted distress probabilities. They complement profitability and solvency indicators from accounting data, rather than replacing them. This helps explain how forward-looking expert insights can support backward-looking financial information. It also addresses a key challenge in financial distress prediction: keeping models interpretable while incorporating new types of data. Compared with multimodal deep-learning FDP models [24,65], which often function as black boxes, our framework maintains high transparency. This makes it more practical for financial institutions, risk-management teams, and regulators who require clear and explainable decision support.
It is also important to consider the strengths and limitations of our approach. Using analyst-forecasting indicators takes advantage of their frequent updates, ability to combine market and firm-level information, and sensitivity to early signs of risk. These features allow the model to identify financial problems earlier than models that rely only on static disclosures. Their expert-driven nature also provides a concise summary of complex economic conditions and reduces noise often present in large, unstructured text data. However, analyst forecasts have some limitations. Coverage can vary across firms, especially for smaller or less visible companies. Analysts may also show optimism, follow herd behavior, or display other biases. Forecast updates can react sharply to market shocks, which may increase volatility in predictions. Despite these issues, our empirical results show that these drawbacks do not outweigh the clear improvements in predictive accuracy achieved when analyst-forecasting indicators are included in the model.

7. Conclusions

FDP is essential for detecting potential risks, allowing for proactive management strategies that can protect a company’s stability and promote long-term success. In this paper, the forecasting financial indicators of listed companies are used as alternative information, and an input system of forecasting indicators is constructed to predict financial distress in these companies. Seven machine learning models were applied to seven datasets under three prediction scenarios to evaluate the impact of different input features on prediction results. The experimental results demonstrate that a feature framework combining historical indicators with forecasting indicators provides more valuable insights into the operational status of enterprises and significantly enhances prediction accuracy. The effectiveness of analysts’ forecasting indicators in improving FDP is further validated through the Friedman test with the Holm post hoc test and through Bayesian A/B testing. Finally, SHAP values are used to measure the contribution of historical financial indicators and analyst-forecasting financial indicators to the predictions of financial distress, which provides further evidence that combining the two types of indicators enhances enterprise FDP.
In addition to its methodological contributions, this study demonstrates clear practical relevance for business and financial decision-making. Rather than treating analyst-forecasting indicators as purely academic variables, the empirical results indicate that these indicators can serve as effective and actionable inputs within enterprise-level business intelligence and financial risk monitoring systems [66]. When combined with internally generated accounting information, analyst-forecasting indicators enable firms to construct more adaptive and forward-looking early-warning mechanisms, particularly for identifying potential financial risks that are not yet evident in historical financial statements. In practical business intelligence settings, such information can be integrated into decision-support platforms and dynamic monitoring dashboards, allowing managers not only to track changes in financial risk exposure over time but also to interpret model outputs with the aid of SHAP-based explanations, thereby enhancing transparency and confidence in data-driven decisions.
From a risk management standpoint, the findings further suggest that exclusive reliance on backward-looking financial ratios may delay the detection of financial distress. Analyst-forecasting indicators, which incorporate market expectations regarding future profitability, asset valuation, and operational performance, provide earlier and more informative signals of potential deterioration. These forward-looking insights can be used by firms to strengthen internal control systems, refine credit evaluation processes, and support more informed decisions related to capital allocation and strategic planning. For investors and financial institutions, the proposed framework offers a complementary tool to traditional credit assessment and investment screening approaches by incorporating market-informed risk signals into existing evaluation practices.
The study also has implications at the regulatory and policy level. Supervisory authorities may benefit from encouraging greater standardization and accessibility of analyst-forecasting information, as well as from considering the adoption of machine-learning-based early-warning frameworks within regulatory monitoring systems. Such measures could enhance the timeliness and responsiveness of financial supervision and improve the early identification of firms facing elevated financial distress risk.
This study also has several limitations that deserve attention. First, although we focus on how alternative information can improve financial distress prediction, our analysis is shaped by the methods we use. The study relies on a specific set of machine-learning models and seven datasets. This may limit the applicability of the results to other markets, time periods, or modeling approaches. Future research could test the same ideas with a wider range of algorithms, such as deep time-series models or graph-based methods, to see if the improvements remain consistent across different techniques. Second, even though our results are statistically validated, the study would benefit from external validation. Using out-of-sample data or datasets from other markets would help confirm the robustness and reproducibility of the findings. Including more data environments would also provide a clearer understanding of how well forecasting indicators perform in practice. Third, the forecasting indicators used in this study depend on the availability and quality of analyst data, which varies across countries and industries. Because of this, future research should consider expanding the sources of forecasting indicators and examining how measurement differences may affect the results. These steps would improve the reliability and methodological strength of research in financial distress prediction.

Author Contributions

Conceptualization, Z.L., M.W. and L.Z.; methodology, Z.L.; software, Z.L. and M.W.; validation, Z.L., M.W., L.Z. and D.L.; formal analysis, Z.L.; investigation, D.L.; data curation, L.Z.; writing—original draft preparation, Z.L.; writing—review and editing, Z.L., M.W., L.Z., D.L., Z.D. and J.W.; visualization, D.L.; supervision, J.W.; project administration, Z.D.; funding acquisition, Z.L. and L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Talent Introduction Research Start-up Fund Project of Nanjing University of Posts and Telecommunications (Grant No. NYY223024), the Project of Natural Science Research in Colleges and Universities of Jiangsu Province (Grant No. 25KJB630018), the National Natural Science Foundation of China (Grant No. 72501135), and the Humanities and Social Sciences Youth Foundation of Ministry of Education of China (Grant No. 25YJC630096).

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

Author Zhiyuan Du was employed by AbbVie Inc. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Zhao, S.; Xu, K.; Wang, Z.; Liang, C.; Lu, W.; Chen, B. Financial Distress Prediction by Combining Sentiment Tone Features. Econ. Model. 2022, 106, 105709.
  2. Wang, G.; Chen, G.; Chu, Y. A New Random Subspace Method Incorporating Sentiment and Textual Information for Financial Distress Prediction. Electron. Commer. Res. Appl. 2018, 29, 30–49.
  3. Tsai, C.F.; Sue, K.L.; Hu, Y.H.; Chiu, A. Combining Feature Selection, Instance Selection, and Ensemble Classification Techniques for Improved Financial Distress Prediction. J. Bus. Res. 2021, 13, 200–209.
  4. Song, Y.; Li, R.; Zhang, Z.; Sahut, J.-M. ESG Performance and Predicting Financial Distress in Energy Companies. Financ. Res. Lett. 2024, 65, 105546.
  5. Mai, F.; Tian, S.; Lee, C.; Ma, L. Deep Learning Models for Bankruptcy Prediction Using Textual Disclosures. Eur. J. Oper. Res. 2019, 274, 743–758.
  6. Hajek, P.; Munk, M. Corporate Financial Distress Prediction Using the Risk-Related Information Content of Annual Reports. Inf. Process. Manag. 2024, 61, 103820.
  7. Prusak, B. Review of Research into Enterprise Bankruptcy Prediction in Selected Central and Eastern European Countries. Int. J. Financ. Stud. 2018, 6, 60.
  8. Zhang, Z.; Wu, C.; Qu, S.; Chen, X. An Explainable Artificial Intelligence Approach for Financial Distress Prediction. Inf. Process. Manag. 2022, 59, 102988.
  9. Wu, D.; Ma, X.; Olson, D.L. Financial Distress Prediction Using Integrated Z-Score and Multilayer Perceptron Neural Networks. Decis. Support Syst. 2022, 159, 113814.
  10. Liang, D.; Tsai, C.F.; Lu, H.Y.; Chang, L.S. Combining Corporate Governance Indicators with Stacking Ensembles for Financial Distress Prediction. J. Bus. Res. 2020, 120, 137–146.
  11. Yu, L.; Li, M. A Case-Based Reasoning Driven Ensemble Learning Paradigm for Financial Distress Prediction with Missing Data. Appl. Soft Comput. 2023, 137, 110163.
  12. Yu, L.; Li, M.; Liu, X. A Two-Stage Case-Based Reasoning Driven Classification Paradigm for Financial Distress Prediction with Missing and Imbalanced Data. Expert Syst. Appl. 2024, 249, 123745.
  13. Zhou, L.; Lu, D.; Fujita, H. The Performance of Corporate Financial Distress Prediction Models with Features Selection Guided by Domain Knowledge and Data Mining Approaches. Knowl. Based Syst. 2015, 85, 52–61.
  14. Lev, B.; Gu, F. The End of Accounting and the Path Forward for Investors and Managers; John Wiley & Sons: Hoboken, NJ, USA, 2016.
  15. Qiu, Y.; He, J.; Chen, Z.; Yao, Y.; Qu, Y. A Novel Semisupervised Learning Method with Textual Information for Financial Distress Prediction. J. Forecast. 2024, 43, 2478–2494.
  16. Muñoz-Izquierdo, N.; Laitinen, E.K.; Camacho-Miñano, M.d.M.; Pascual-Ezama, D. Does Audit Report Information Improve Financial Distress Prediction over Altman’s Traditional Z-Score Model? J. Int. Financ. Manag. Account. 2020, 31, 65–97.
  17. Zhang, Y.; He, M.; Liao, C.; Wang, Y. Climate Risk Exposure and the Cross-Section of Chinese Stock Returns. Financ. Res. Lett. 2023, 55, 103987.
  18. Zhang, Y.; Hu, A.; Wang, J.; Zhang, Y. Detection of Fraud Statement Based on Word Vector: Evidence from Financial Companies in China. Financ. Res. Lett. 2022, 46, 102477.
  19. Zhang, Y.; Zhang, Y.; Ren, X.; Jin, M. Geopolitical Risk Exposure and Stock Returns: Evidence from China. Financ. Res. Lett. 2024, 64, 105479.
  20. Citterio, A.; King, T. The Role of Environmental, Social, and Governance (ESG) in Predicting Bank Financial Distress. Financ. Res. Lett. 2023, 51, 103411.
  21. Wu, C.; Jiang, C.; Wang, Z.; Ding, Y. Predicting Financial Distress Using Current Reports: A Novel Deep Learning Method Based on User-Response-Guided Attention. Decis. Support Syst. 2024, 179, 114176.
  22. Che, W.; Wang, Z.; Jiang, C.; Abedin, M.Z. Predicting Financial Distress Using Multimodal Data: An Attentive and Regularized Deep Learning Method. Inf. Process. Manag. 2024, 61, 103703.
  23. Li, S.; Shi, W.; Wang, J.; Zhou, H. A Deep Learning-Based Approach to Constructing a Domain Sentiment Lexicon: A Case Study in Financial Distress Prediction. Inf. Process. Manag. 2021, 58, 102673.
  24. Liu, J.; Jia, M. Financial Distress Prediction with Annual Reports-Based Deep Textual Feature Extraction: A Hybrid Approach. Inf. Sci. 2025, 686, 121318.
  25. Ni, J.; Xu, Y.; Shi, J.; Li, J. Product Innovation in a Supply Chain with Information Asymmetry: Is More Private Information Always Worse? Eur. J. Oper. Res. 2024, 314, 229–240.
  26. Akerlof, G.A. The Market for “Lemons”: Quality Uncertainty and the Market Mechanism. Q. J. Econ. 1970, 84, 488–500.
  27. Altman, E.I. Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy. J. Finance 1968, 23, 589.
  28. Blum, M. Failing Company Discriminant Analysis. J. Account. Res. 1974, 12, 1–25.
  29. Ohlson, J.A. Financial Ratios and the Probabilistic Prediction of Bankruptcy. J. Account. Res. 1980, 18, 109–131.
  30. Lin, T.H. A Cross Model Study of Corporate Financial Distress Prediction in Taiwan: Multiple Discriminant Analysis, Logit, Probit and Neural Networks Models. Neurocomputing 2009, 72, 3507–3516.
  31. Wang, S.; Chi, G. Cost-Sensitive Stacking Ensemble Learning for Company Financial Distress Prediction. Expert Syst. Appl. 2024, 255, 124525.
  32. Korol, T. Early Warning Models against Bankruptcy Risk for Central European and Latin American Enterprises. Econ. Model. 2013, 31, 22–30.
  33. Aydin, N.; Sahin, N.; Deveci, M.; Pamucar, D. Prediction of Financial Distress of Companies with Artificial Neural Networks and Decision Trees Models. Mach. Learn. Appl. 2022, 10, 100432.
  34. Sun, J.; Fujita, H.; Zheng, Y.; Ai, W. Multi-Class Financial Distress Prediction Based on Support Vector Machines Integrated with the Decomposition and Fusion Methods. Inf. Sci. 2021, 559, 153–170.
  35. Kim, S.; Mun, B.M.; Bae, S.J. Data Depth Based Support Vector Machines for Predicting Corporate Bankruptcy. Appl. Intell. 2018, 48, 791–804.
  36. Sun, X.; Lei, Y. Research on Financial Early Warning of Mining Listed Companies Based on BP Neural Network Model. Resour. Policy 2021, 73, 102223.
  37. Zhao, M.; Song, Y.; Huang, H.; Kim, E.-H. Attention-Based Fuzzy Neural Networks Designed for Early Warning of Financial Crises of Listed Companies. Inf. Sci. 2025, 686, 121374.
  38. Chen, M.Y. Predicting Corporate Financial Distress Based on Integration of Decision Tree Classification and Logistic Regression. Expert Syst. Appl. 2011, 38, 11261–11272.
  39. Wang, W.; Liang, Z. Financial Distress Early Warning for Chinese Enterprises from a Systemic Risk Perspective: Based on the Adaptive Weighted XGBoost-Bagging Model. Systems 2024, 12, 65.
  40. Climent, F.; Momparler, A.; Carmona, P. Anticipating Bank Distress in the Eurozone: An Extreme Gradient Boosting Approach. J. Bus. Res. 2019, 101, 885–896.
  41. Sun, J.; Li, H.; Fujita, H.; Fu, B.; Ai, W. Class-Imbalanced Dynamic Financial Distress Prediction Based on Adaboost-SVM Ensemble Combined with SMOTE and Time Weighting. Inf. Fusion 2020, 54, 128–144.
  42. Zhao, Q.; Xu, W.; Ji, Y. Predicting Financial Distress of Chinese Listed Companies Using Machine Learning: To What Extent Does Textual Disclosure Matter? Int. Rev. Financ. Anal. 2023, 89, 102770.
  43. Papíková, L.; Papík, M. Effects of Classification, Feature Selection, and Resampling Methods on Bankruptcy Prediction of Small and Medium-Sized Enterprises. Intell. Syst. Account. Financ. Manag. 2022, 29, 254–281.
  44. Qian, H.; Wang, B.; Yuan, M.; Gao, S.; Song, Y. Financial Distress Prediction Using a Corrected Feature Selection Measure and Gradient Boosted Decision Tree. Expert Syst. Appl. 2022, 190, 116202.
  45. Kim, S.Y.; Upneja, A. Predicting Restaurant Financial Distress Using Decision Tree and AdaBoosted Decision Tree Models. Econ. Model. 2014, 36, 354–362.
  46. Wang, G.; Ma, J.; Chen, G.; Yang, Y. Financial Distress Prediction: Regularized Sparse-Based Random Subspace with ER Aggregation Rule Incorporating Textual Disclosures. Appl. Soft Comput. J. 2020, 90, 106152.
  47. Huang, B.; Yao, X.; Luo, Y.; Li, J. Improving Financial Distress Prediction Using Textual Sentiment of Annual Reports. Ann. Oper. Res. 2023, 330, 457–484.
  48. Borchert, P.; Coussement, K.; De Caigny, A.; De Weerdt, J. Extending Business Failure Prediction Models with Textual Website Content Using Deep Learning. Eur. J. Oper. Res. 2023, 306, 348–357.
  49. Gandhi, P.; Loughran, T.; McDonald, B. Using Annual Report Sentiment as a Proxy for Financial Distress in U.S. Banks. J. Behav. Financ. 2019, 20, 424–436.
  50. Stevenson, M.; Mues, C.; Bravo, C. The Value of Text for Small Business Default Prediction: A Deep Learning Approach. Eur. J. Oper. Res. 2021, 295, 758–771.
  51. Nguyen, B.-H.; Huynh, V.-N. Textual Analysis and Corporate Bankruptcy: A Financial Dictionary-Based Sentiment Approach. J. Oper. Res. Soc. 2022, 73, 102–121.
  52. Beatty, A.; Liao, S.; Yu, J.J. The Spillover Effect of Fraudulent Financial Reporting on Peer Firms’ Investments. J. Account. Econ. 2013, 55, 183–205.
  53. Breuer, M. How Does Financial-Reporting Regulation Affect Industry-Wide Resource Allocation? J. Account. Res. 2021, 59, 59–110.
  54. Li, L.; Li, G. Information Rigidity: Comparing Average and Individual Forecasts of Analysts of Chinese A-Share Listed Companies. Int. Rev. Econ. Financ. 2025, 104, 104732.
  55. Chourou, L.; Purda, L.; Saadi, S. Economic Policy Uncertainty and Analysts’ Forecast Characteristics. J. Account. Public Policy 2021, 40, 106775.
  56. Gu, Z.; Gu, C.; Zhang, C. Analyst Forecast Behavior under Trade Uncertainty: Evidence from China. Financ. Res. Lett. 2025, 85, 108245.
  57. Gavious, I.; Milo, O.; Weihs, H. Analysts’ Bankruptcy Prediction: Revisiting the Information Value Added by Financial Experts. Financ. Res. Lett. 2025, 85, 108138.
  58. Zhu, S.; Wu, H.; Ngai, E.W.T.; Ren, J.; He, D.; Ma, T.; Li, Y. A Financial Fraud Prediction Framework Based on Stacking Ensemble Learning. Systems 2024, 12, 588.
  59. Zhang, L.; Abedin, M.Z.; Liu, Z. Incorporating Media News to Predict Financial Distress: Case Study on Chinese Listed Companies. J. Forecast. 2024, 43, 1374–1398.
  60. Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 4768–4777.
  61. Sun, G.; Li, Y. Intraday and Post-Market Investor Sentiment for Stock Price Prediction: A Deep Learning Framework with Explainability and Quantitative Trading Strategy. Systems 2025, 13, 390.
  62. Nugroho, D.S.; Dewayanto, T. Application of Statistics and Artificial Intelligence for Corporate Financial Distress Prediction Models: A Systematic Literature Review. J. Model. Manag. 2025, 20, 1999–2023.
  63. Chen, X.; Liu, J.; Wu, C. Multi-Class Financial Distress Prediction Based on Hybrid Feature Selection and Improved Stacking Ensemble Model. Expert Syst. Appl. 2025, 282, 127832.
  64. Chen, P.; Ji, M. Deep Learning-Based Financial Risk Early Warning Model for Listed Companies: A Multi-Dimensional Analysis Approach. Expert Syst. Appl. 2025, 283, 127746.
  65. Tao, F.; Wang, W.; Lu, R. A Deep Learning-Based Sentiment Flow Analysis Model for Predicting Financial Risk of Listed Companies. Eng. Appl. Artif. Intell. 2025, 150, 110522.
  66. Durana, P.; Poliak, M.; Kovalova, E.; Blazek, R. Prediction Models Reloaded: Advanced Insights for SMEs in the Bucharest Nine Countries. Oeconomia Copernic. 2025, 16, 689–760.
Figure 1. The graphical abstract of our study.
Figure 2. SHAP value of the ERT (with all indicators).
Table 1. Indicators used in our study.
| Category | Indicator | Description |
| --- | --- | --- |
| Financial indicators in last year | Leps | Earnings per share in last year |
| | Lpe | Price to earnings ratio in last year |
| | Lnetpro | Net profit in last year |
| | Lebit | Earnings before interest and taxes in last year |
| | Lebitda | Earnings before interest, taxes, depreciation, and amortization in last year |
| | Lor | Operating revenue in last year |
| | Locfps | Operating cash flow per share in last year |
| | Lbm | Book to market ratio in last year |
| | Lnavs | Net asset value per share in last year |
| | Lroa | Return on assets in last year |
| | Lroe | Return on equity in last year |
| | Lpb | Price to book ratio in last year |
| | Ltat | Total asset turnover in last year |
| Forecasting financial indicators | Feps | Forecasting earnings per share |
| | Fpe | Forecasting price to earnings ratio |
| | Fnetpro | Forecasting net profit |
| | Febit | Forecasting earnings before interest and taxes |
| | Febitda | Forecasting earnings before interest, taxes, depreciation, and amortization |
| | For | Forecasting operating revenue |
| | Focfps | Forecasting operating cash flow per share |
| | Fnavs | Forecasting net asset value per share |
| | Froa | Forecasting return on assets |
| | Froe | Forecasting return on equity |
| | Fpb | Forecasting price to book ratio |
| | Ftat | Forecasting total asset turnover |
| | Fnpaopc | Forecasting net profit attributable to owners of the parent company |
| | Ftp | Forecasting total profit |
| | Fop | Forecasting operating profit |
Table 2. Candidate parameter settings of all classifiers (DT = decision tree).

| Model | Hyperparameter | Value |
|---|---|---|
| AdaBoost | Learning rate | [0.01, 0.25, 0.5, 0.75, 1] |
| AdaBoost | The number of DTs | [50, 100, 150, 200] |
| AdaBoost | The min number of samples required to be at a leaf in DTs | [1, 25, 50, 75, 100] |
| AdaBoost | The max depth of DTs | [1, 2, 3, 4, 5] |
| CART | The ratio of features considered for splitting in DTs | [0.01, 0.25, 0.5, 0.75, 1] |
| CART | The max depth of DTs | [1, 2, 3, 4, 5] |
| CART | The min number of samples required to split a leaf in DTs | [2, 25, 50, 75, 100] |
| CART | The min number of samples required to be at a leaf in DTs | [1, 25, 50, 75, 100] |
| CatBoost | The number of DTs | [50, 100, 150, 200] |
| CatBoost | Learning rate | [0.01, 0.25, 0.5, 0.75, 1] |
| CatBoost | The depth of DTs | [1, 2, 3, 4, 5] |
| ERT | The number of DTs | [50, 100, 150, 200] |
| ERT | The min number of samples required to split a leaf in DTs | [2, 25, 50, 75, 100] |
| ERT | The max depth of DTs | [1, 2, 3, 4, 5] |
| ERT | The min number of samples required to be at a leaf in DTs | [1, 25, 50, 75, 100] |
| LightGBM | The number of DTs | [50, 100, 150, 200] |
| LightGBM | Learning rate | [0.01, 0.25, 0.5, 0.75, 1] |
| LightGBM | The max depth of DTs | [1, 2, 3, 4, 5] |
| MLP | Max iteration | 100 |
| MLP | Activation function | “relu” |
| MLP | Solver | “adam” |
| MLP | The size of batch | [16, 32, 64, 128] |
| MLP | Initial learning rate | [0.01, 0.25, 0.5, 0.75, 1] |
| MLP | The size of hidden layer | [1, 25, 50, 75, 100] |
| RF | The number of DTs | [50, 100, 150, 200] |
| RF | The min number of samples required to split a leaf in DTs | [2, 25, 50, 75, 100] |
| RF | The max depth of DTs | [1, 2, 3, 4, 5] |
| RF | The min number of samples required to be at a leaf in DTs | [1, 25, 50, 75, 100] |
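To give a sense of the search space that Table 2 implies, the following is a minimal sketch that enumerates every candidate configuration for ERT. The dictionary keys are an assumption: they follow the argument names of scikit-learn's `ExtraTreesClassifier`, which this section does not name as the implementation, and `expand_grid` is a hypothetical helper written only for illustration.

```python
from itertools import product

# Candidate values copied from the ERT rows of Table 2.
# Key names are assumed (scikit-learn style); the paper does not state them here.
ert_grid = {
    "n_estimators": [50, 100, 150, 200],       # the number of DTs
    "min_samples_split": [2, 25, 50, 75, 100],  # min samples required to split
    "max_depth": [1, 2, 3, 4, 5],               # the max depth of DTs
    "min_samples_leaf": [1, 25, 50, 75, 100],   # min samples required at a leaf
}


def expand_grid(grid):
    """Return every parameter combination in the grid as a list of dicts."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


candidates = expand_grid(ert_grid)
print(len(candidates))  # 4 * 5 * 5 * 5 = 500 configurations
print(candidates[0])
```

An exhaustive search over this grid would therefore fit 500 ERT configurations per dataset and feature set, before any cross-validation folds are counted. How the candidates were actually searched is not specified in this section.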
Table 3. Prediction performance comparison.

| Dataset | Features | AdaBoost | CART | CatBoost | ERT | LightGBM | MLP | RF |
|---|---|---|---|---|---|---|---|---|
| Dataset 1 | Historical Indicators | 0.7252 (0.1067) | 0.8198 (0.0610) | 0.8311 (0.0493) | 0.8529 (0.0662) | 0.8332 (0.0975) | 0.6823 (0.1653) | 0.8552 (0.0708) |
| Dataset 1 | Forecasting Indicators | 0.7299 (0.1222) | 0.7192 (0.1028) | 0.7173 (0.1393) | 0.7711 (0.0608) | 0.6993 (0.1345) | 0.7102 (0.1061) | 0.7279 (0.0585) |
| Dataset 1 | All Indicators | 0.7957 (0.1039) | 0.8414 (0.1300) | 0.8736 (0.1182) | 0.9313 (0.0465) | 0.8667 (0.1324) | 0.7798 (0.0694) | 0.8969 (0.1027) |
| Dataset 2 | Historical Indicators | 0.7352 (0.1195) | 0.8123 (0.0594) | 0.8374 (0.0434) | 0.8671 (0.0508) | 0.8256 (0.0968) | 0.7019 (0.1816) | 0.8513 (0.0719) |
| Dataset 2 | Forecasting Indicators | 0.6985 (0.0993) | 0.6809 (0.0532) | 0.7190 (0.1405) | 0.7681 (0.0622) | 0.6588 (0.0820) | 0.7438 (0.0821) | 0.7505 (0.0671) |
| Dataset 2 | All Indicators | 0.7908 (0.1022) | 0.8238 (0.1289) | 0.8344 (0.1274) | 0.9087 (0.0543) | 0.8387 (0.1186) | 0.7814 (0.0680) | 0.8650 (0.0940) |
| Dataset 3 | Historical Indicators | 0.7346 (0.1193) | 0.7964 (0.0871) | 0.8308 (0.0513) | 0.8673 (0.0506) | 0.8279 (0.0948) | 0.6978 (0.1783) | 0.8604 (0.0571) |
| Dataset 3 | Forecasting Indicators | 0.7098 (0.1068) | 0.6926 (0.0367) | 0.7021 (0.1396) | 0.7471 (0.0489) | 0.6773 (0.0776) | 0.7010 (0.0467) | 0.7530 (0.0645) |
| Dataset 3 | All Indicators | 0.8043 (0.1111) | 0.8139 (0.1250) | 0.8432 (0.1337) | 0.9125 (0.0576) | 0.8549 (0.1287) | 0.7874 (0.0780) | 0.8721 (0.0970) |
| Dataset 4 | Historical Indicators | 0.7611 (0.1505) | 0.7607 (0.0637) | 0.8218 (0.0378) | 0.8471 (0.0202) | 0.8277 (0.0945) | 0.7245 (0.1493) | 0.8510 (0.0437) |
| Dataset 4 | Forecasting Indicators | 0.7292 (0.0680) | 0.6599 (0.0593) | 0.7209 (0.1303) | 0.7289 (0.0788) | 0.7025 (0.0277) | 0.6853 (0.0738) | 0.7471 (0.0670) |
| Dataset 4 | All Indicators | 0.8660 (0.0450) | 0.8429 (0.0754) | 0.8875 (0.0950) | 0.9224 (0.0480) | 0.8876 (0.0711) | 0.8265 (0.0722) | 0.8927 (0.0662) |
| Dataset 5 | Historical Indicators | 0.8044 (0.1212) | 0.7595 (0.0629) | 0.8143 (0.0456) | 0.8451 (0.0203) | 0.8521 (0.0691) | 0.7802 (0.0851) | 0.8473 (0.0451) |
| Dataset 5 | Forecasting Indicators | 0.7207 (0.0693) | 0.6579 (0.0598) | 0.7519 (0.0886) | 0.7439 (0.0864) | 0.7145 (0.0412) | 0.6831 (0.0732) | 0.7724 (0.0589) |
| Dataset 5 | All Indicators | 0.8860 (0.0482) | 0.8255 (0.0454) | 0.8841 (0.0908) | 0.9234 (0.0492) | 0.8871 (0.0704) | 0.8579 (0.0799) | 0.8892 (0.0602) |
| Dataset 6 | Historical Indicators | 0.8541 (0.0634) | 0.7608 (0.0640) | 0.8228 (0.0589) | 0.8455 (0.0210) | 0.8443 (0.0601) | 0.7762 (0.0831) | 0.8457 (0.0428) |
| Dataset 6 | Forecasting Indicators | 0.7105 (0.0591) | 0.6464 (0.0466) | 0.7127 (0.0494) | 0.7415 (0.0835) | 0.7225 (0.0511) | 0.6799 (0.0707) | 0.7546 (0.0526) |
| Dataset 6 | All Indicators | 0.8743 (0.0570) | 0.8111 (0.0431) | 0.8556 (0.1103) | 0.8997 (0.0677) | 0.8663 (0.0892) | 0.8409 (0.0953) | 0.8749 (0.0654) |
| Dataset 7 | Historical Indicators | 0.8629 (0.0692) | 0.7733 (0.0796) | 0.8392 (0.0671) | 0.8525 (0.0260) | 0.8541 (0.0586) | 0.7937 (0.1092) | 0.8517 (0.0417) |
| Dataset 7 | Forecasting Indicators | 0.6867 (0.0833) | 0.6138 (0.0656) | 0.6781 (0.0315) | 0.6986 (0.1277) | 0.7013 (0.0810) | 0.6314 (0.0860) | 0.7223 (0.0559) |
| Dataset 7 | All Indicators | 0.8986 (0.0480) | 0.8271 (0.0499) | 0.8845 (0.0865) | 0.9149 (0.0619) | 0.8672 (0.0888) | 0.8685 (0.0737) | 0.8912 (0.0590) |
Table 4. Friedman test with Holm post hoc test. Historical Indicators and Forecasting Indicators form the benchmark set; All Indicators is the control set. Post hoc cells give the rank, with the Holm-corrected p-value in parentheses.

| Model Group | Friedman Test: p-Value | Historical Indicators (Benchmark) | Forecasting Indicators (Benchmark) | All Indicators (Control) |
|---|---|---|---|---|
| All Models | 0.0000 | 1.9388 (0.0000) | 1.0816 (0.0000) | 2.9796 |
| AdaBoost | 0.0000 | 1.8571 (0.0325) | 1.1429 (0.0010) | 3.0000 |
| CART | 0.0000 | 2.0000 (0.0614) | 1.0000 (0.0004) | 3.0000 |
| CatBoost | 0.0000 | 2.1429 (0.1814) | 1.0000 (0.0010) | 2.8571 |
| ERT | 0.0000 | 2.0000 (0.0614) | 1.0000 (0.0004) | 3.0000 |
| LightGBM | 0.0000 | 2.0000 (0.0614) | 1.0000 (0.0004) | 3.0000 |
| MLP | 0.0002 | 1.5714 (0.0075) | 1.4286 (0.0066) | 3.0000 |
| RF | 0.0000 | 2.0000 (0.0614) | 1.0000 (0.0004) | 3.0000 |
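As a rough illustration of how the ranks and corrected p-values in Table 4 can arise, the sketch below recomputes them for ERT from the ERT column of Table 3. It rests on assumptions not confirmed in this section: that the seven datasets act as the blocks of the Friedman test, that the post hoc comparison uses the usual normal approximation on mean-rank differences, and that Holm's step-down correction is applied to the two benchmark-versus-control comparisons. All function names are hypothetical.

```python
import math

# ERT column of Table 3, one score per dataset (Dataset 1 to Dataset 7).
scores = {
    "Historical": [0.8529, 0.8671, 0.8673, 0.8471, 0.8451, 0.8455, 0.8525],
    "Forecasting": [0.7711, 0.7681, 0.7471, 0.7289, 0.7439, 0.7415, 0.6986],
    "All": [0.9313, 0.9087, 0.9125, 0.9224, 0.9234, 0.8997, 0.9149],
}


def mean_ranks(scores):
    """Rank the groups within each dataset (1 = lowest score) and average."""
    names = list(scores)
    n = len(scores[names[0]])
    totals = dict.fromkeys(names, 0.0)
    for i in range(n):
        ordered = sorted(names, key=lambda g: scores[g][i])
        for rank, g in enumerate(ordered, start=1):
            totals[g] += rank  # this data has no ties, so no tie handling
    return {g: totals[g] / n for g in names}, n


def friedman_statistic(ranks, n):
    """Friedman chi-square statistic computed from mean ranks."""
    k = len(ranks)
    sum_sq = sum(r * r for r in ranks.values())
    return 12 * n / (k * (k + 1)) * sum_sq - 3 * n * (k + 1)


def holm_vs_control(ranks, n, control):
    """Holm-corrected two-sided p-values for each group against the control."""
    k = len(ranks)
    se = math.sqrt(k * (k + 1) / (6 * n))
    raw = {}
    for g, r in ranks.items():
        if g != control:
            z = abs(r - ranks[control]) / se
            raw[g] = math.erfc(z / math.sqrt(2))  # two-sided normal p-value
    m = len(raw)
    adjusted, running_max = {}, 0.0
    for i, g in enumerate(sorted(raw, key=raw.get)):  # smallest p first
        running_max = max(running_max, min(1.0, (m - i) * raw[g]))
        adjusted[g] = running_max
    return adjusted


ranks, n_datasets = mean_ranks(scores)
stat = friedman_statistic(ranks, n_datasets)
adjusted = holm_vs_control(ranks, n_datasets, control="All")
print(ranks)
print(round(stat, 4))
print({g: round(p, 4) for g, p in adjusted.items()})
```

Under these assumptions the mean ranks come out as 2.0, 1.0, and 3.0, and the Holm-corrected p-values round to 0.0614 and 0.0004, in line with the ERT row of Table 4. The Friedman p-value itself is not recomputed here and need not match the reported value, since the exact variant of the test is not stated in this section.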
Table 5. Bayesian A/B testing results.

| Model Group | Historical Indicators | Forecasting Indicators | All Indicators |
|---|---|---|---|
| All Models | 0.00% (0.9818) | 0.00% (1.7888) | 100.00% (0.0000) |
| AdaBoost | 0.37% (0.7990) | 0.03% (1.2994) | 99.61% (0.0005) |
| CART | 0.73% (0.7000) | 0.00% (1.3988) | 99.28% (0.0009) |
| CatBoost | 3.47% (0.5067) | 0.02% (1.3041) | 96.52% (0.0045) |
| ERT | 0.73% (0.7000) | 0.00% (1.3988) | 99.28% (0.0009) |
| LightGBM | 0.73% (0.7000) | 0.00% (1.3988) | 99.28% (0.0009) |
| MLP | 0.09% (1.0003) | 0.07% (1.1003) | 99.84% (0.0002) |
| RF | 0.73% (0.7000) | 0.00% (1.3988) | 99.28% (0.0009) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Citation (MDPI and ACS Style): Liu, Z.; Wang, M.; Liu, D.; Du, Z.; Zhang, L.; Wang, J. Integrating Analyst-Forecasting Indicators into Business Intelligence Systems for Data-Driven Financial Distress Prediction. Systems 2026, 14, 29. https://doi.org/10.3390/systems14010029
