Abstract
Corporate bankruptcy prediction has become increasingly critical amid economic uncertainty. This study proposes a novel two-stage machine learning approach to enhance bankruptcy prediction accuracy, applied to Tokyo Stock Exchange-listed companies. First, models were trained using 173 financial indicators. Second, a wrapper-based feature selection process was employed to reduce dimensionality and eliminate noise, thereby identifying an optimal seven-feature set. Two ensemble learning methods were used: Random Forest and Light Gradient Boosting Machine (LightGBM), a gradient boosting–based method that employs a leaf-wise tree growth strategy, enabling fast computation and high predictive accuracy on large-scale, high-dimensional datasets. Using the reduced feature set, Random Forest correctly predicted 566 bankruptcies (88 more than when using all features), compared with 451 by LightGBM (31 more than when using all features). The study also addresses the challenges posed by imbalanced data by employing resampling techniques (SMOTE, SMOTE-ENN, and k-means). Additionally, the need for industry-specific modeling is addressed by constructing separate models for six industry sectors. These findings highlight the importance of feature selection and ensemble learning for improving model generalizability and uncovering industry-specific patterns. This study contributes to the field of bankruptcy prediction by providing a robust framework for accurate and interpretable predictions for both academic research and practical applications. Future work will focus on further enhancing prediction accuracy to identify more potential bankruptcies.
1. Introduction
In today’s increasingly uncertain environment, corporate management faces significant challenges due to the heightened risk of bankruptcy arising from deteriorating business performance. Bankruptcies impose substantial losses on stakeholders, including business partners, investors, and financial institutions. Accordingly, developing models that prevent or enable the early detection of bankruptcy has become essential. While traditional research has relied on statistical approaches, recent advances in machine learning have enabled more objective and accurate predictions.
This study builds on earlier work by applying ensemble learning methods—Random Forest and LightGBM—while addressing key challenges such as feature selection, imbalanced data, and industry-specific modeling. In particular, the study integrates resampling techniques with stepwise feature selection, thereby enhancing model generalization, interpretability, and the ability to uncover sector-specific bankruptcy patterns.
Corporate bankruptcy prediction has long been an important research topic in accounting, finance, and risk management. Early foundational studies, including (), (), (), and (), demonstrated the predictive power of financial ratios through discriminant and probit/logit models (). Subsequent refinements introduced hazard models (), the incorporation of industry effects (), and market-based indicators (; ). Altman and colleagues later expanded the Z-score model to non-manufacturing firms, private companies, and emerging markets, improving its general applicability (). These early frameworks established the theoretical basis for risk classification and default probability estimation in corporate finance.
As data availability and computational power increased, the field transitioned toward machine learning–based approaches. () and () applied data mining and genetic algorithms, while () introduced Bayesian networks for bankruptcy prediction. () compared support vector machines and neural networks for credit rating analysis, showing the potential of non-linear models in financial risk assessment. () advanced this further by applying multi-label learning to handle multiple bankruptcy indicators simultaneously. These studies highlighted both the interpretability and flexibility of machine learning in capturing complex financial patterns.
Since the 2000s, ensemble learning methods have become central to bankruptcy prediction. Breiman’s () Random Forest and Dietterich’s () ensemble strategies demonstrated the advantages of combining multiple classifiers for robustness. Boosting algorithms, including AdaBoost (), Gradient Boosting (), and LightGBM (), achieved high predictive performance and scalability. Hybrid ensemble frameworks integrating multiple learners further improved generalization (; ; ). CatBoost-based models also contributed to improved accuracy in recent studies (; ). Comparative research consistently confirmed that ensemble models outperform single classifiers in both stability and predictive capability (; ; ; ; ). Furthermore, stacking and meta-learning approaches have been proposed to exploit complementary model strengths (; ). Studies such as () emphasized that incorporating sectoral heterogeneity enhances predictive realism.
Another methodological challenge involves imbalanced data, as bankruptcies occur far less frequently than solvent cases. Class imbalance often inflates accuracy metrics while reducing sensitivity to true defaults (; ). To mitigate this, resampling strategies such as SMOTE (), SMOTE-ENN (; ), and SMOTE-IPF () were developed. Exploratory undersampling methods () and KMeans-based approaches offered alternatives to preserve data diversity. Subsequent studies validated these methods’ importance for balanced learning (; ). Neural network studies () demonstrated imbalance sensitivity, while () combined resampling and ensemble learning to enhance minority-class recognition. Recent work by () explored hybrid evolutionary algorithms with domain adaptation, further extending imbalance correction techniques.
Feature selection and dimensionality reduction also play essential roles in improving model interpretability and performance. While early research relied primarily on accounting ratios, later work incorporated broader information such as governance, market, and macroeconomic variables (; ). Feature selection methods reduce noise and overfitting (; ) and help identify the most influential predictors. Hybrid strategies combining multiple selection criteria (; ; ; ; ) further enhanced stability. Explainable AI techniques (; ; ) and advances in interpretable modeling (; ) improved transparency in financial applications. () even demonstrated the applicability of interpretable models in environmental risk, underscoring the cross-domain potential of explainable AI.
Deep learning has expanded the methodological toolkit for bankruptcy prediction. Neural networks were introduced by () and later refined by () and (). More recent architectures such as attention-based and sequential models capture complex temporal dependencies (; ). Ensemble deep learning reviews () highlight these models’ increasing importance. BERT-based adaptations extend predictive analysis to textual data (), while hybrid frameworks integrating neural networks with macroeconomic variables improve overall performance (). Comparative studies () confirmed the competitiveness of these deep architectures.
Finally, several meta-studies have synthesized and benchmarked decades of research. () summarized early findings; (), (), and () offered systematic reviews of predictive models; and () assessed recent advances in machine learning–based bankruptcy prediction. () applied DEA benchmarking, () quantified socio-economic impacts, and () reviewed deep learning for credit scoring. Additional contributions by () and () addressed robustness and bankruptcy resolution prediction. () compared alternative modeling techniques, while () examined credit card defaults as a parallel to corporate bankruptcy risk. () demonstrated the effectiveness of a two-stage classification strategy, directly motivating the two-stage framework adopted in this study.
Overall, the literature reveals a clear progression from interpretable statistical models to increasingly complex, data-driven, and explainable frameworks. Prediction success depends not only on algorithmic choice but also on handling imbalanced data, selecting informative features, and incorporating industry-specific determinants. Building on this understanding, the present study adopts a two-stage machine learning framework for bankruptcy prediction among Tokyo Stock Exchange–listed firms. In the first stage, comprehensive learning is performed using 173 financial indicators. In the second stage, wrapper-based feature selection is applied to gradually reduce dimensionality, eliminate noise, and arrive at an optimal seven-feature set. This approach enhances both predictive performance and interpretability.
To capture sector-specific heterogeneity, separate models are constructed for six industries—Construction, Real Estate, Services, Retail, Wholesale, and Electrical Equipment—thus uncovering sectoral bankruptcy determinants and patterns. In addition, three resampling techniques—SMOTE, SMOTE-ENN, and k-means clustering—are incorporated to address class imbalance. Empirical results reveal that Random Forest correctly predicts 566 bankruptcies and LightGBM predicts 451, both substantially outperforming models without feature reduction. By simultaneously addressing four key challenges—methodological choices, dataset design, class imbalance, and industry heterogeneity—this study underscores both its novelty and practical relevance.
2. Materials and Methods
2.1. Machine Learning Methods
This study employs Random Forest and LightGBM. Brief explanations of each method follow.
2.1.1. Random Forest
Random Forest is a machine learning framework based on decision-tree algorithms. It uses bagging, a type of ensemble learning, to create high-accuracy learners. Bagging trains multiple ‘weak learners’ in parallel using bootstrapping, which draws samples with replacement from the original dataset; because sampling is with replacement, the same data point may be selected multiple times when training a single weak learner. Each learner performs learning and prediction independently; for regression tasks, the learners’ predictions are averaged, and for classification tasks, a majority vote determines the final prediction.
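As an illustrative sketch of this bagging procedure, the snippet below uses scikit-learn and placeholder data, neither of which the study prescribes; the variable names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical inputs: X holds the financial indicators, y the bankruptcy labels
# (1 = bankrupt, 0 = non-bankrupt).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 173))           # placeholder for the 173-feature data
y = (rng.random(1000) < 0.02).astype(int)  # placeholder imbalanced labels

# Random Forest = bagging over decision trees: each tree is trained on a
# bootstrap sample drawn with replacement, and the final class is decided
# by a majority vote across the trees.
rf = RandomForestClassifier(
    n_estimators=500,   # number of weak learners trained in parallel
    bootstrap=True,     # sample the training data with replacement
    random_state=0,
)
rf.fit(X, y)
print(rf.predict(X[:5]))
```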
2.1.2. LightGBM
LightGBM is also a machine learning framework based on decision-tree algorithms. It employs gradient boosting, an ensemble learning method, to create high-accuracy learners. This approach constructs a ‘strong learner’ by sequentially combining individual learners, using the steepest descent method to minimize the error between predicted and actual values. XGBoost, a conventional gradient boosting algorithm, employs a level-wise strategy that grows a decision tree by expanding its levels (layers). In contrast, LightGBM employs leaf-wise growth, which expands the tree at its leaves. This enables LightGBM to achieve faster processing while maintaining accuracy comparable to that of XGBoost.
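A minimal configuration sketch using the lightgbm Python package is shown below; the parameter values are illustrative and are not the tuned settings reported in this study, and the training data names are assumptions.

```python
import lightgbm as lgb

# LightGBM grows each tree leaf-wise: at every step it splits the leaf that
# most reduces the loss (up to num_leaves), instead of expanding the tree
# level by level as in XGBoost's level-wise strategy.
clf = lgb.LGBMClassifier(
    boosting_type="gbdt",  # gradient boosting of decision trees
    num_leaves=31,         # caps leaf-wise growth to limit overfitting
    learning_rate=0.05,
    n_estimators=300,
)
# clf.fit(X_train, y_train)  # X_train, y_train: standardized features and labels (assumed names)
```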
2.2. Evaluation Metrics
Evaluation metrics are quantitative indicators used to assess model performance. Using evaluation metrics helps to determine how accurately the constructed model can predict bankruptcy from the input data. Furthermore, evaluation metrics play a crucial role in comparing model performance and facilitating improvements.
The following evaluation metrics are used in this study. We define true positives (TPs) as instances when bankrupt companies are correctly predicted as bankrupt; false negatives (FNs) as instances when bankrupt companies are incorrectly predicted as non-bankrupt; true negatives (TNs) as instances when non-bankrupt companies are correctly predicted as non-bankrupt; and false positives (FPs) as instances when non-bankrupt companies are incorrectly predicted as bankrupt. Using these TPs, FPs, TNs, and FNs, we calculate the following metrics: precision, accuracy, recall, false positive rate, and false negative rate. Given our focus on bankruptcy, TPs are our primary metric to represent correctly predicted bankrupt companies. Therefore, we consider the model that yielded the highest number of TPs as the best. We compute recall as the evaluation metric for industry-specific bankruptcy prediction, considering different dataset compositions and resampling methods. Since the actual number of bankrupt companies varies by industry, TPs alone cannot determine prediction accuracy. Assessing recall, therefore, provides a better understanding of bankruptcy prediction accuracy for each industry.
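These metrics follow directly from the confusion matrix; the short sketch below illustrates the calculations with hypothetical labels (scikit-learn is used only for convenience and is not prescribed by the study).

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = bankrupt, 0 = non-bankrupt.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]

# For binary labels {0, 1}, confusion_matrix returns [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = tp / (tp + fp)                  # predicted bankruptcies that are correct
recall = tp / (tp + fn)                     # actual bankruptcies that are detected
accuracy = (tp + tn) / (tp + tn + fp + fn)
false_positive_rate = fp / (fp + tn)
false_negative_rate = fn / (fn + tp)
print(tp, round(precision, 2), round(recall, 2), round(accuracy, 2))
```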
2.3. Numerical Experiment Design
2.3.1. Datasets
We construct models for six industries based on the Tokyo Stock Exchange’s sector classifications: construction, real estate, services, retail, wholesale, and electrical equipment. Data were obtained from the financial statements of companies listed on the Tokyo Stock Exchange, using the Nikkei NEEDS database. In this study, we define bankruptcy as a company being delisted owing to civil rehabilitation proceedings or similar events. We use data from 317 companies that filed for bankruptcy between 1991 and 2021.
Table 1 presents the industry-specific sample sizes for three analytical datasets, each constructed under a distinct feature configuration. Numbers in parentheses indicate the number of features included in each dataset. The significant disparity between the numbers of non-bankrupt and bankrupt companies indicates an imbalanced dataset. The first dataset configuration contains data from 52,950 companies and uses only financial indicators as features. The second configuration contains data from 26,674 companies. This dataset augments the short-term financial performance indicators from the first dataset with investment-financing network indicators representing financing diversity and long-term intercompany trust relationships. The third dataset, constructed for benchmarking against the first (financial) and second (investment-financing) datasets, includes data from 26,674 companies. This is derived from stripping the 12 investment-financing network indicators from the second dataset, leaving 161 purely financial features. Consequently, it differs from the first dataset in terms of size (26,674 vs. 52,950 companies) and from the second in terms of feature count (161 vs. 173 features).
Table 1.
The industry-specific sample sizes for three analytical datasets.
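As a hedged illustration of how such imbalance can be handled with the three resampling methods used in this study, the sketch below relies on the imbalanced-learn package; the library choice and the use of ClusterCentroids for the k-means-based method are assumptions, since the paper does not name its implementation.

```python
from imblearn.over_sampling import SMOTE
from imblearn.combine import SMOTEENN
from imblearn.under_sampling import ClusterCentroids  # k-means-based undersampling

def resample(X_train, y_train, method="smote"):
    """Rebalance the training split only; the test split keeps its natural class ratio."""
    samplers = {
        "smote": SMOTE(random_state=0),              # synthesize minority (bankrupt) samples
        "smote_enn": SMOTEENN(random_state=0),       # oversample, then clean noisy samples with ENN
        "kmeans": ClusterCentroids(random_state=0),  # replace the majority class with k-means centroids
    }
    return samplers[method].fit_resample(X_train, y_train)
```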
2.3.2. Indicators
We utilize all 161 financial indicators available from the Nikkei NEEDS-Financial QUEST (FQ), a comprehensive economic database service. Following the NEEDS-FQ classification system, these indicators are categorized into seven types: profitability, return on capital, margin-related, productivity, stability, growth, and cash flow indicators (Table 2).
Table 2.
Financial indicators and features.
2.3.3. Investment-Financing Network Indicators
We calculate investment-financing network indicators from networks representing corporate investment and financing relations. To construct these networks, we use data from the Nikkei NEEDS-Financial QUEST on major shareholders, corporate shareholdings, and loans. Specifically, we calculate six investment network indicators and six financing network indicators, totaling 12 indicators.
We define each indicator as follows (a brief computational sketch follows the list):
- Degree centrality: How connected a given node is to other nodes in a network.
- Betweenness centrality: The frequency with which a given node lies on the shortest paths between other pairs of nodes, indicating the extent to which it acts as an intermediary in the network.
- Network density: The degree of connection among nodes, expressed as a ratio. The denominator is the number of possible connections, and the numerator is the number of actual connections.
- Authority score: The extent to which a node is pointed to by other nodes, reflecting the strength of its incoming links.
- Hub score: The extent to which a node points to other nodes, reflecting the strength of its outgoing links.
- PageRank: The importance of a node inferred from the network’s link structure (originally developed to rank webpages).
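For illustration, these six indicators can be computed with standard graph analytics; the sketch below uses the networkx package on a toy directed graph (both the library and the example edges are assumptions, not the study’s actual implementation).

```python
import networkx as nx

# Toy directed financing network: an edge u -> v means firm u provides
# financing (or holds shares) to firm v.
G = nx.DiGraph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])

degree = nx.degree_centrality(G)            # how connected each node is
betweenness = nx.betweenness_centrality(G)  # how often a node lies on shortest paths between others
density = nx.density(G)                     # actual connections / possible connections
hubs, authorities = nx.hits(G)              # hub score (outgoing links) and authority score (incoming links)
pagerank = nx.pagerank(G)                   # link-based importance of each node
print(degree, density)
```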
2.3.4. Computational Environment
Before running the models with LightGBM and Random Forest, we standardize the data as a preprocessing step. Next, we conduct 10-fold cross-validation to optimize the hyperparameters using Optuna (version 3.0.3), a hyperparameter optimization framework, under Python 3.9.7. The prediction models are then constructed using the optimized parameters. Table 3 presents the corresponding hardware and software specifications.
Table 3.
Hardware and software specifications.
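A hedged sketch of this tuning step is given below, using Optuna with a LightGBM classifier and 10-fold cross-validation; the placeholder data, search space, scoring metric, and trial budget are illustrative assumptions rather than the settings used in the study.

```python
import numpy as np
import optuna
import lightgbm as lgb
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Placeholder standardized data; in the study these would be the preprocessed
# financial indicators and bankruptcy labels.
rng = np.random.default_rng(0)
X_std = rng.normal(size=(500, 173))
y = (rng.random(500) < 0.05).astype(int)

def objective(trial):
    # The search space below is illustrative, not the one tuned in the study.
    params = {
        "num_leaves": trial.suggest_int("num_leaves", 15, 255),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "n_estimators": trial.suggest_int("n_estimators", 100, 500),
    }
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    score = cross_val_score(lgb.LGBMClassifier(**params), X_std, y, cv=cv, scoring="recall")
    return score.mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)  # trial budget is arbitrary here
final_model = lgb.LGBMClassifier(**study.best_params).fit(X_std, y)
```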
3. Results
In this two-stage study, the results of the second stage are derived from a seven-feature dataset obtained through progressive feature reduction from the 173-feature set used in the first stage, which eliminates noise and prevents overfitting. We use the feature importance attribute to determine which features the models rely on most for bankruptcy prediction, quantifying each feature’s importance and visualizing the results.
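Both algorithms expose a feature_importances_ attribute; a minimal visualization sketch on placeholder data is shown below (the indicator names and values are hypothetical, not the study’s actual rankings).

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier

# Placeholder data standing in for one industry's financial dataset.
rng = np.random.default_rng(0)
feature_names = [f"indicator_{i}" for i in range(173)]
X = pd.DataFrame(rng.normal(size=(800, 173)), columns=feature_names)
y = (rng.random(800) < 0.05).astype(int)

model = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)

# Both RandomForestClassifier and LGBMClassifier expose feature_importances_;
# the top-ranked indicators are shown as a horizontal bar chart.
top7 = pd.Series(model.feature_importances_, index=feature_names).nlargest(7)
top7.sort_values().plot.barh()
plt.xlabel("Feature importance")
plt.tight_layout()
plt.show()
```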
3.1. First Stage Results (173 Features)
This section presents the results of the first stage of bankruptcy prediction. In the first stage, we construct bankruptcy prediction models using Random Forest and LightGBM with a dataset containing all 173 available financial indicators.
3.1.1. Random Forest
Table 4 presents the results obtained by dataset configuration using the 173-feature set. The total TP count across the financial, investment financing, and comparison datasets was 478. Among the datasets, the financial dataset, having the largest sample size, achieved the best prediction accuracy, with 284 TPs. Both the investment financing and comparison datasets yielded 97 TPs, indicating no apparent advantage in using investment financing network indicators.
Table 4.
Random Forest true positive count by dataset configuration.
Resampling: For the 317 bankrupt companies across the six industries (Table 5), k-means achieved the highest accuracy with 186 TPs across all three datasets, followed by SMOTE-ENN with 177 TPs, and SMOTE with 115 TPs.
Table 5.
Random Forest true positive count by resampling method and dataset configuration.
Industry-Specific Results: Table 6 presents the industry-specific prediction results by dataset configuration. The highest TP count (75) was recorded in the construction industry when the financial dataset was used. By contrast, the lowest TP count (0) was recorded for the wholesale industry when using the comparison dataset.
Table 6.
Random Forest true positive count by industry and dataset configuration.
Resampling Results by Industry: Table 7 presents the resampling results by industry. Applying k-means resampling to the construction industry data yielded the highest TP count (59), whereas applying SMOTE resampling to the wholesale industry data produced the lowest TP count (3).
Table 7.
Random Forest true positive count and recall rate by resampling method and industry.
Taken together, the results in Table 6 and Table 7 show that applying Random Forest to the data from the construction industry yields the highest total TP count (152), whereas applying this method to the wholesale industry data produces the lowest total TP count (21). Recall was highest in the real estate industry (58.33%) and lowest in the wholesale industry (26.92%).
Feature Importance: To pinpoint which features were significant for the model’s predictions, we visualized the feature importance scores assigned by the Random Forest algorithm. Figure 1 illustrates the top seven features (out of 173) for the electrical-equipment industry using the financial dataset with SMOTE-ENN resampling.
Figure 1.
Random Forest feature importance scores for 173 features under SMOTE-ENN resampling using the financial dataset for the electrical equipment industry.
3.1.2. LightGBM
Table 8 presents the results obtained using the 173-feature set. Across the financial, investment-financing, and comparison datasets, LightGBM correctly identified 420 bankrupt firms, 58 fewer than Random Forest. Broken down by dataset, LightGBM achieved its highest TP count (264) on the financial dataset, which had the largest sample size, and identified 58 bankrupt firms using the investment-financing dataset and 97 using the comparison dataset. Notably, whereas Random Forest showed no notable difference between the investment-financing and comparison datasets, a difference emerged with LightGBM: incorporating the network indicators resulted in worse performance.
Table 8.
LightGBM true positive count by dataset configuration.
Resampling: Looking at the resampling results across six industries covering 317 bankrupt companies in total, SMOTE-ENN achieved the highest accuracy with 149 TPs, followed by SMOTE with 141 TPs, and K-means with 129 TPs (Table 9).
Table 9.
LightGBM true positive count by resampling method and dataset configuration.
Industry-Specific Results: Table 10 presents the prediction results categorized by industry and dataset configuration. The financial dataset for the real estate industry produced the highest TP count with 62 companies, whereas the investment-financing dataset for the electrical equipment industry showed the lowest TP count of 0.
Table 10.
LightGBM true positive count and recall rate by industry and dataset configuration.
Resampling Results by Industry: Table 11 presents the resampling results for each industry. The highest TP count (49) was generated for the construction industry using SMOTE-ENN, whereas the lowest TP count (11) was observed for the service industry using both SMOTE-ENN and k-means.
Table 11.
LightGBM true positive count and recall rate by resampling method and industry.
The combined results in Table 10 and Table 11 show that, when using LightGBM, 139 TPs were generated for the real estate industry. By contrast, the lowest TP count (34) was observed in the service industry. Recall was highest in the real estate industry (60.96%) and lowest in the wholesale industry (24.30%).
Feature Importance: To identify the features that were significant for the model’s predictions, we visualized the feature importance scores assigned by LightGBM. Figure 2 illustrates the top seven features (out of 173) for the electrical-equipment industry using the financial dataset processed using SMOTE-ENN.
Figure 2.
LightGBM feature importance scores for 173 features under SMOTE-ENN resampling using the financial dataset for the electrical equipment industry.
3.2. Second-Stage Analysis Results (7 Features)
This study proposes a two-stage bankruptcy prediction model using financial data from listed Japanese companies. The first stage involves model training with all 173 available financial indicators. The second stage improves predictive accuracy through progressive dimensionality reduction to eliminate noise and prevent overfitting: feature selection with Random Forest and LightGBM iteratively reduces the feature set from 173 to a smaller optimal subset, thereby enhancing the model’s ability to identify bankrupt firms.
We explicitly define the threshold criteria for feature removal. Specifically, features with an importance ratio below a predefined level (e.g., 1% of cumulative importance) are sequentially removed in each iteration. To ensure reproducibility, the procedure terminates when the improvement in validation accuracy remains below 0.5% for three consecutive iterations. Furthermore, we clarify the evaluation of the trade-off between model complexity and predictive performance by monitoring both the number of selected features and the corresponding changes in the F1 score and Cohen’s Kappa value. These refinements render the selection logic transparent and empirically robust.
The feature selection process follows a wrapper-based approach involving iterative model training and evaluation based on the true positive (TP) count and recall to identify the optimal number of features for each industry category. The procedure is summarized as follows (a code sketch of these steps follows the list):
- Step 1: Train the model using all 173 features.
- Step 2: Based on performance (TP count), remove less important features and retain the top-ranked features for each industry.
- Step 3: Train a new model using the reduced feature set.
- Step 4: Repeat Steps 2 and 3 until the TP count reaches its maximum; this point determines the optimal feature set.
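A minimal sketch of Steps 1–4 is given below, using Random Forest as the wrapped learner; the removal threshold and patience-based stopping rule follow the criteria described above in simplified form, and all variable names and defaults are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

def wrapper_selection(X: pd.DataFrame, y: np.ndarray,
                      min_share: float = 0.01, patience: int = 3):
    """Steps 1-4: iteratively prune low-importance features, keep the set with most TPs."""
    features = list(X.columns)
    best_features, best_tp, stalled = features[:], -1, 0
    while len(features) > 2:
        model = RandomForestClassifier(n_estimators=300, random_state=0)
        preds = cross_val_predict(model, X[features], y, cv=10)  # Steps 1 and 3: train and evaluate
        tp = int(np.sum((preds == 1) & (y == 1)))                # TP count is the selection criterion
        if tp > best_tp:
            best_features, best_tp, stalled = features[:], tp, 0
        else:
            stalled += 1
        if stalled >= patience:                                  # Step 4: stop after repeated non-improvement
            break
        model.fit(X[features], y)                                # Step 2: rank features by importance
        share = model.feature_importances_ / model.feature_importances_.sum()
        kept = [f for f, s in zip(features, share) if s >= min_share]
        if 2 <= len(kept) < len(features):
            features = kept                                      # drop all features below the threshold
        else:
            # Fallback: drop only the single least important feature.
            features = [f for f, _ in sorted(zip(features, share), key=lambda t: t[1])[1:]]
    return best_features, best_tp
```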
With the seven selected features, Random Forest achieves 566 TPs, whereas LightGBM achieves 451; thus, the Random Forest model predicts 115 more bankruptcies. By contrast, when all 173 features are used, Random Forest predicts 478 bankruptcies and LightGBM predicts 420. These results demonstrate that the wrapper-based approach, the key methodological contribution of this study, effectively reduces dimensionality, eliminates noise, and improves model performance.
Table 12 and Figure 3 show how the TP count changes as the number of features is progressively reduced from 173 to two through feature selection, representing the sum of the SMOTE, SMOTE+ENN, and k-means results.
Table 12.
Changes in true positive count by feature count.
Figure 3.
Workflow of wrapper-based feature selection (Steps 1–4).
3.2.1. Random Forest
Table 13 presents the results generated using the 7-feature set. The total TP count across the financial, investment financing, and comparison datasets was 566. By contrast, as mentioned previously, when all 173 features were used, Random Forest achieved 478 TPs. This demonstrates that using the seven selected features predicted 88 more bankruptcies (566 vs. 478 TPs) with Random Forest than using all 173 features. Thus, we successfully achieved our study’s objective of improving bankruptcy prediction accuracy. Considering the results by dataset configuration, the financial dataset, which had the largest sample size, showed the best prediction accuracy, with 303 TPs. The model produced 142 TPs with the investment financing dataset and 121 TPs with the comparison dataset, which demonstrates the added value of incorporating investment financing network indicators.
Table 13.
Random Forest true positive counts by dataset configuration.
Resampling: SMOTE-ENN resampling achieved the highest accuracy, with 211 TPs across all three datasets combined (Table 14). K-means followed with 206 TPs and SMOTE with 149 TPs. In comparison, when all 173 features were used, SMOTE, SMOTE-ENN, and K-means achieved 115, 177, and 186 TPs, respectively, indicating improvements of 34, 34, and 20 additional TPs, respectively, with the seven-feature set compared with all 173 features.
Table 14.
Random Forest true positive counts by resampling method and dataset configuration.
Industry-Specific Results: As shown in Table 15, the Random Forest model achieved the highest number of TPs when applied to the financial dataset for the construction industry, correctly identifying 87 of the 126 actual bankruptcies. In contrast, the lowest performance was observed for the comparison dataset in the service industry, with only two TPs.
Table 15.
Random Forest true positive count and recall rate by industry and dataset configuration.
Resampling Results by Industry: Table 16 presents the resampling results for each industry. In the construction industry, SMOTE-ENN yields the highest TP count of 67. Conversely, for the wholesale industry, using SMOTE resulted in the lowest TP count at only 12.
Table 16.
Random Forest true positive count and recall rate by resampling method and industry.
The combined results in Table 15 and Table 16 indicate that the highest TP count (172) was recorded for the construction industry, whereas the lowest TP count (48) was recorded for the wholesale industry. In terms of recall, the highest rate (68.86%) was observed for the real estate industry data, whereas the lowest rate (54.29%) was observed for the service industry data.
Feature Importance: Table 17 compares the top seven features identified by Random Forest from the full 173-feature set with the seven features selected for the optimized model.
Table 17.
Comparison of the top seven features identified by Random Forest from the full set with the seven features selected for the optimized model.
Notably, no network indicators were selected among the final seven features, and all the selected features were financial indicators. Furthermore, Cash Flow to Net Interest-Bearing Debt Ratio and Net Interest Burden to Sales Ratio appear among the seven selected features and among the top seven indicators of the 173-feature set, demonstrating their importance for bankruptcy prediction with Random Forest. Moreover, when using the seven-feature set, two cash flow indicators are included: cash flow to net interest-bearing debt ratio and operating cash flow to sales ratio. When using all 173 features, four of the top seven features are cash flow-related: the Cash Flow to Net Interest-Bearing Debt Ratio, Dividends to Cash Flow Ratio, Cash Flow to Long-Term Debt Ratio, and Cash Flow to Fixed Liabilities Ratio. Given that cash shortages are often the primary cause of corporate bankruptcy, this finding indicates that the model successfully identifies the relevant features.
For the electrical equipment industry, Figure 4 presents the importance scores of the top seven features computed on the financial dataset following SMOTE-ENN resampling.
Figure 4.
Importance scores for the top seven features under SMOTE-ENN resampling using the financial dataset for the electrical equipment industry.
3.2.2. LightGBM
Table 18 presents the results obtained using these seven features. Among the datasets, the highest prediction accuracy (283 TPs) was achieved using the financial dataset, which had the largest sample size.
Table 18.
LightGBM true positive count by dataset configuration.
Resampling: Across all six industries and three dataset configurations (encompassing 317 bankrupt companies in total), SMOTE resampling achieved the highest accuracy with 199 TPs, followed by SMOTE-ENN with 191 TPs and k-means with 61 TPs (Table 19).
Table 19.
LightGBM true positive count by resampling method and dataset configuration.
Industry-Specific Results: As shown in Table 20, for LightGBM, the highest TP count (72) is achieved using the financial dataset for the real estate industry, whereas the lowest TP count of 2 is observed when using the investment-financing dataset for the retail industry.
Table 20.
LightGBM true positive count by industry and dataset configuration.
Resampling Results by Industry: Table 21 presents the LightGBM resampling results by industry. SMOTE-ENN yielded 60 TPs for the real estate industry, whereas k-means showed the worst performance, with zero TPs for the electrical equipment industry.
Table 21.
LightGBM true positive count and recall rate by resampling method and industry.
As the combined results in Table 20 and Table 21 show, when the seven-feature LightGBM model is applied across industries, the highest TP count (138) is predicted for the real estate industry. Conversely, the lowest TP count of 43 is predicted for the service industry. In terms of recall, the highest rate was recorded for the construction industry (68.57%), whereas the lowest rate was observed for the wholesale industry (33.70%).
Feature Importance: Table 22 presents a comparison of the top seven features identified by LightGBM from the full 173-feature set with the seven features selected for the optimized model. The Depreciation to Sales ratio feature appears in both the seven selected features and the top seven of the full set, underscoring its critical role in LightGBM-based bankruptcy prediction. In the seven-feature model, four cash flow metrics were selected (dividend-to-free cash flow ratio, instant coverage cash flow, cash flow to debt ratio, and cash flow to current liabilities ratio), whereas in the full 173-feature ranking, the Cash Flow to Net Interest-Bearing Debt Ratio also emerged among the top indicators. Given that insufficient cash flow is the leading cause of corporate bankruptcies, these results confirm that the proposed approach effectively identified relevant features.
Table 22.
Comparison of the top seven features identified by LightGBM from the full set with the seven features selected for the optimized model.
Figure 5 illustrates the top seven features (out of 173) for the electrical equipment industry when using a financial dataset processed with SMOTE-ENN.
Figure 5.
Importance scores for the top seven features under SMOTE-ENN resampling using the financial dataset for the electrical equipment industry.
4. Discussion
This study proposed a two-stage machine learning framework to address several challenges inherent in corporate bankruptcy prediction, including high-dimensional financial data, class imbalance, industry heterogeneity, and the need for interpretability. Through extensive analyses using Random Forest and LightGBM, our findings demonstrate that the proposed approach is both effective and practically applicable.
A key contribution of this research is the significant improvement achieved through feature selection. By reducing the original 173 financial indicators to an optimized subset of only seven features, the model not only became more interpretable but also achieved higher predictive performance. The research also demonstrated the superior performance of Random Forest over LightGBM, with the former achieving 566 true positives using the seven-feature set, an improvement of 88 cases compared with the full feature set. This result suggests that many of the original features may have introduced noise and that carefully removing low-importance variables helps strengthen the model’s ability to identify bankruptcy-prone firms. The selected features commonly reflected profitability, liquidity, leverage, and cash-flow stability, factors widely recognized as fundamental indicators of corporate financial health.
The industry-specific analysis revealed substantial differences in bankruptcy determinants across sectors. In construction and real estate, capital structure and fixed-asset intensity played more dominant roles, whereas in the wholesale and service sectors, short-term liquidity and working capital were more influential. These results show that bankruptcy mechanisms differ across industries, reinforcing the importance of evaluating each sector independently rather than applying a uniform model to all firms.
Our evaluation of resampling techniques further highlighted the interaction between algorithm choice and data-balancing strategy. For Random Forest, k-means undersampling provided the strongest performance, whereas SMOTE-ENN yielded the best results for LightGBM. This suggests that the effectiveness of resampling methods depends heavily on how each algorithm handles noise, boundary samples, and minority-class representation.
Regarding generalizability, the industry-specific models are designed explicitly to improve predictive accuracy and interpretability within each industry; they are not intended to be generalized across industries. Because financial structures, operational characteristics, and bankruptcy mechanisms differ substantially among industries, applying a model trained on one industry to another would be inappropriate. This heterogeneity is precisely why separate models were developed.
This study has several limitations. First, the dataset consists solely of Japanese listed firms, which may limit the applicability of the findings to firms in other countries or to small and medium-sized enterprises. Second, formal statistical significance tests, such as the McNemar test, were not conducted owing to data constraints; future research should investigate the statistical robustness of the observed performance improvements. Third, feature selection was conducted using ensemble tree-based models; comparisons with other model families, such as SVMs or neural networks, were beyond the scope of this study and remain an area for future work.
Despite these limitations, the proposed two-stage framework provides a practical and interpretable approach to bankruptcy prediction, combining the strengths of ensemble learning, feature selection, and industry-specific modeling. The findings offer valuable insights for academics and practitioners involved in credit risk assessment, financial monitoring, and early warning systems.
5. Conclusions
This study develops a two-stage machine learning framework for predicting corporate bankruptcy by integrating ensemble learning, feature selection, and resampling techniques. The proposed model effectively addresses key challenges in bankruptcy prediction, including high-dimensional financial data, class imbalance, and the need for industry-specific interpretability. By combining Random Forest and LightGBM with wrapper-based feature selection, the approach reduces the original 173 financial indicators to a smaller, optimal set of features, thereby improving both predictive accuracy and model transparency.
The results demonstrate that the proposed model not only enhances classification performance but also improves interpretability by identifying the most critical financial indicators associated with profitability, liquidity, leverage, and cash flow stability. Furthermore, the construction of industry-specific models based on the Tokyo Stock Exchange classification provides valuable insights into financial patterns that characterize bankruptcy risks across different sectors. These findings contribute to both academic research and practical applications by presenting a robust and interpretable predictive framework that bridges methodological rigor and real-world utility.
From a practical perspective, the model offers an efficient and transparent framework for credit risk evaluation and early warning systems. Financial institutions can utilize the reduced feature set to streamline credit screening and enhance decision-making. Investors and regulators can apply the model to detect early signs of financial distress and strengthen macroprudential oversight, thereby supporting more proactive risk management and corporate monitoring.
Future research can expand upon this study by incorporating a broader range of firms and time periods to test the generalizability of the model. Extending the analysis to include macroeconomic variables and external financial conditions may further improve predictive accuracy and explainability. In addition, future work could explore the integration of deep learning architectures or hybrid ensemble techniques to capture non-linear relationships in financial distress prediction. These extensions would further enhance the applicability, scalability, and robustness of the proposed framework, supporting continuous innovation in data-driven financial risk management.
Author Contributions
Conceptualization, M.M. and H.K.; methodology, M.M.; software, M.M.; validation, M.M. and H.K.; formal analysis, M.M.; investigation, M.M.; resources, M.M.; data curation, M.M.; writing—original draft preparation, M.M.; writing—review and editing, M.M. and H.K.; visualization, M.M.; supervision, M.M.; project administration, M.M.; funding acquisition, H.K. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.
Acknowledgments
Since no additional administrative, technical, or material support was received, we have no acknowledgments to declare. In addition, no GenAI tools were used in the preparation of the manuscript; therefore, the corresponding statement is not required.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| RF | Random Forest |
| LGBM | Light Gradient Boosting Machine |
| TP | True Positive |
| FP | False Positive |
| FN | False Negative |
| TN | True Negative |
| SMOTE | Synthetic Minority Oversampling Technique |
| ENN | Edited Nearest Neighbors |
| KMeans | K-Means Clustering |
| TSE | Tokyo Stock Exchange |
References
- Alaka, H. A., Oyedele, L. O., Owolabi, H. A., Kumar, V., Ajayi, S. O., Akinade, O. O., & Bilal, M. (2018). Systematic review of bankruptcy prediction models: Towards a framework for tool selection. Expert Systems with Applications, 94, 164–184. [Google Scholar] [CrossRef]
- Altman, E. I. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. The Journal of Finance, 23, 589–609. [Google Scholar] [CrossRef]
- Altman, E. I., Iwanicz-Drozdowska, M., Laitinen, E. K., & Suvas, A. (2017). Financial distress prediction in an international context: A review and empirical analysis of Altman’s Z-score model. Journal of International Financial Management and Accounting, 28, 131–171. [Google Scholar] [CrossRef]
- Ansah-Narh, T., Nortey, E. N. N., Proven-Adzri, E., & Opoku-Sarkodie, R. (2024). Enhancing corporate bankruptcy prediction via a hybrid genetic algorithm and domain adaptation learning architecture. Expert Systems with Applications, 258, 120654. [Google Scholar] [CrossRef]
- Atiya, A. F. (2001). Bankruptcy prediction for credit risk using neural networks: A survey and new results. IEEE Transactions on Neural Networks, 12, 929–935. [Google Scholar] [CrossRef] [PubMed]
- Bandyopadhyay, S., & Lang, M. (2013). Ensemble learning for financial default prediction. Journal of Finance and Data Science, 1, 69–81. [Google Scholar]
- Batista, G. E. A. P. A., Prati, R. C., & Monard, M. C. (2004). A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter, 6, 20–29. [Google Scholar] [CrossRef]
- Beaver, W. H. (1966). Financial ratios as predictors of failure. Journal of Accounting Research, 4, 71–111. [Google Scholar] [CrossRef]
- Bellovary, J. L., Giacomino, D. E., & Akers, M. D. (2007). A review of bankruptcy prediction studies: 1930 to present. Journal of Financial Education, 33, 1–42. [Google Scholar]
- Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32. [Google Scholar] [CrossRef]
- Buda, M., Maki, A., & Mazurowski, M. A. (2018). A systematic study of the class imbalance problem in convolutional neural networks. Neural Networks, 106, 249–259. [Google Scholar] [CrossRef] [PubMed]
- Chava, S., & Jarrow, R. A. (2004). Bankruptcy prediction with industry effects. Review of Finance, 8, 537–569. [Google Scholar] [CrossRef]
- Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321–357. [Google Scholar] [CrossRef]
- Chen, T., & Guestrin, C. (2016, August 13–17). XGBoost: A scalable tree boosting system. The 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794), San Francisco, CA, USA. [Google Scholar] [CrossRef]
- Choi, I., & Lee, J. (2018). Multi-label learning for corporate bankruptcy prediction. Decision Support Systems, 110, 87–98. [Google Scholar]
- Ciampi, F. (2015). Corporate governance characteristics and default prediction modeling for small enterprises: An empirical analysis of Italian firms. Journal of Business Research, 68, 1012–1025. [Google Scholar] [CrossRef]
- Dasilas, A., & Rigani, A. (2024). Machine learning techniques in bankruptcy prediction: A systematic literature review. Expert Systems with Applications, 255, 124761. [Google Scholar] [CrossRef]
- Dietterich, T. G. (2000). Ensemble methods in machine learning. In International workshop on multiple classifier systems (pp. 1–15). Springer. [Google Scholar]
- Dikshit, A., & Pradhan, B. (2021). Interpretable and explainable AI (XAI) model for spatial drought prediction. Science of the Total Environment, 801, 149797. [Google Scholar] [CrossRef]
- Du Jardin, P. (2016). A two-stage classification technique for bankruptcy prediction. European Journal of Operational Research, 254, 236–252. [Google Scholar] [CrossRef]
- Fernández, A., García, S., Galar, M., Prati, R. C., Krawczyk, B., & Herrera, F. (2018). Learning from imbalanced datasets. Springer. [Google Scholar] [CrossRef]
- Ganaie, M. A., & Hu, M. (2022). Ensemble deep learning: A review. Knowledge-Based Systems, 239, 108098. [Google Scholar] [CrossRef]
- García, V., Marqués, A. I., Sánchez, J. S., & Ochoa-Domínguez, H. J. (2019). Dissimilarity-based linear models for corporate bankruptcy prediction. Computational Economics, 53, 1019–1031. [Google Scholar] [CrossRef]
- Giudici, P., & Hadji-Misheva, B. (2022). Explainable ML for credit scoring and bankruptcy prediction. Risks, 10, 104. [Google Scholar]
- Guillén, M., & Salas, A. (2021). Bankruptcy prediction combining feature selection and machine learning classifiers. Sustainability, 13, 6436. [Google Scholar]
- Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182. [Google Scholar]
- He, H., & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21, 1263–1284. [Google Scholar] [CrossRef]
- Heo, J., & Yang, J. Y. (2014). AdaBoost-based bankruptcy forecasting of Korean construction companies. Applied Soft Computing, 24, 494–499. [Google Scholar] [CrossRef]
- Hillegeist, S. A., Keating, E. K., Cram, D. P., & Lundstedt, K. G. (2004). Assessing the probability of bankruptcy. Review of Accounting Studies, 9, 5–34. [Google Scholar] [CrossRef]
- Hossari, G., & Rahman, S. (2022). Artificial intelligence and bankruptcy prediction: The relevance of feature selection. International Journal of Finance and Economics, 27, 2103–2122. [Google Scholar]
- Huang, J., & Ling, C. X. (2005). Using AUC and accuracy in evaluating learning algorithms. IEEE Transactions on Knowledge and Data Engineering, 17, 299–310. [Google Scholar] [CrossRef]
- Huang, Z., Chen, H., Hsu, C.-J., Chen, W.-H., & Wu, S. (2004). Credit rating analysis using support vector machines and neural networks: A comparative market study. Decision Support Systems, 37, 543–558. [Google Scholar] [CrossRef]
- Iparraguirre-Villanueva, O., & Cabanillas-Carbonell, M. (2023). Predicting business bankruptcy: A comparative analysis with machine learning models. Economies, 11, 122. [Google Scholar] [CrossRef]
- Jabeur, S. B., Gharib, C., Mefteh-Wali, S., & Ben Arfi, W. (2021). CatBoost model and artificial intelligence techniques for corporate failure prediction. Technological Forecasting and Social Change, 166, 120658. [Google Scholar] [CrossRef]
- Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., & Liu, T.-Y. (2017). LightGBM: A highly efficient gradient boosting decision tree. In Advances in neural information processing systems (NeurIPS) (Vol. 30, pp. 3146–3154). ACM. [Google Scholar]
- Kim, A., & Yoon, S. (2023). Corporate bankruptcy prediction with domain-adapted BERT. arXiv, arXiv:2312.03194. [Google Scholar] [CrossRef]
- Kim, H., Cho, H., & Ryu, D. (2022). Corporate bankruptcy prediction using machine learning methodologies with a focus on sequential data. Computational Economics, 59, 1231–1249. [Google Scholar] [CrossRef]
- Kim, H. Y. (2011). Bankruptcy prediction using support vector machine with optimal choice of kernel function and regularization parameters. Expert Systems with Applications, 38, 511–517. [Google Scholar]
- Ko, B., Kim, D., & Kang, B. (2021). A hybrid feature selection approach for bankruptcy prediction. Applied Intelligence, 51, 111–128. [Google Scholar]
- Kotze, M., & Beukes, C. (2021). Feature extraction using hybrid genetic algorithms and XGBoost. Neurocomputing, 452, 111–122. [Google Scholar]
- Kraus, C., & Feuerriegel, S. (2019). Decision support for bankruptcy prediction using machine learning: A comparison of boosting and bagging. Decision Support Systems, 120, 113–126. [Google Scholar]
- Lahsasna, A., Ainon, R. N., & Wah, T. Y. (2018). Business failure prediction using ensemble machine learning. International Journal of Advanced Computer Science and Applications, 9, 45–52. [Google Scholar]
- Lee, C.-C., & Chen, M.-L. (2020). Ensemble models for predicting corporate financial distress: A performance comparison. Expert Systems with Applications, 139, 112–124. [Google Scholar]
- Lee, S., & Choi, W. S. (2013). A multi-industry bankruptcy prediction model using back-propagation neural network and multivariate discriminant analysis. Expert Systems with Applications, 40, 2941–2946. [Google Scholar] [CrossRef]
- Li, H., Sun, J., & Wu, J. (2022). Financial distress prediction using attention-based deep learning. Expert Systems with Applications, 199, 116–137. [Google Scholar]
- Lin, C. T., Chiu, C. C., & Tsai, C. Y. (2010). A hybrid neural network model for credit scoring. International Journal of Electronic Business Management, 8, 254–261. [Google Scholar]
- Lin, F. Y., & McClean, S. (2001). A data mining approach to the prediction of corporate failure. Knowledge-Based Systems, 14, 189–195. [Google Scholar] [CrossRef]
- Liu, X.-Y., Wu, J., & Zhou, Z.-H. (2009). Exploratory undersampling for class-imbalance learning. IEEE Transactions on Systems, Man, and Cybernetics—Part B, 39, 539–550. [Google Scholar] [CrossRef]
- Liu, Y., & Wu, H. (2021). Bankruptcy prediction using SMOTE-ENN and LightGBM. Sustainability, 13, 8021. [Google Scholar]
- Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. In Advances in neural information processing systems (NeurIPS) (Vol. 30, pp. 4765–4774). Curran Associates Inc. [Google Scholar]
- Mishra, D., & Singh, A. (2022). Predicting bankruptcy using XGBoost and SMOTE. Journal of Risk and Financial Management, 15, 142. [Google Scholar]
- Nam, J., & Jinn, T. (2000). Bankruptcy prediction: Evidence from Korean listed companies during the IMF financial crisis. Journal of International Financial Management and Accounting, 11, 178–197. [Google Scholar] [CrossRef]
- Ohlson, J. A. (1980). Financial ratios and the probabilistic prediction of bankruptcy. Journal of Accounting Research, 18, 109–131. [Google Scholar] [CrossRef]
- Pramodh, T. R., & Ravi, V. (2016). Rule extraction and feature selection techniques—A review. International Journal of Computers and Applications, 975, 8887. [Google Scholar]
- Radovanovic, J., & Haas, C. (2023). The evaluation of bankruptcy prediction models based on socio-economic costs. Expert Systems with Applications, 227, 120275. [Google Scholar] [CrossRef]
- Ravi Kumar, P., & Ravi, V. (2007). Bankruptcy prediction in banks and firms via statistical and intelligent techniques—A review. European Journal of Operational Research, 180, 1–28. [Google Scholar] [CrossRef]
- Razzak, I., Imran, M., & Xu, G. (2019). Deep learning for credit scoring: A review. Information Processing and Management, 56, 102–128. [Google Scholar]
- Sáez, J. A., Luengo, J., Stefanowski, J., & Herrera, F. (2015). SMOTE-IPF: A filtering method to pre-process data. Information Sciences, 291, 184–203. [Google Scholar] [CrossRef]
- Sánchez-Medina, A. J., Blázquez-Santana, F., Cerviño-Cortínez, D. L., & Pellejero, M. (2024). Ensemble methods for bankruptcy resolution prediction: A new approach. Computational Economics, 66(5), 3891–3926. [Google Scholar]
- Shumway, T. (2001). Forecasting bankruptcy more accurately: A simple hazard model. The Journal of Business, 74, 101–124. [Google Scholar] [CrossRef]
- Succurro, M., Arcuri, G., & Costanzo, G. D. (2019). A combined approach based on robust PCA to improve bankruptcy forecasting. Review of Accounting and Finance, 18, 296–320. [Google Scholar] [CrossRef]
- Sun, J., Li, H., Huang, Q.-H., & He, K.-Y. (2014). Predicting financial distress and corporate failure: A review from the state-of-the-art definitions, modeling, sampling, and featuring approaches. Knowledge-Based Systems, 57, 41–56. [Google Scholar] [CrossRef]
- Sun, L., & Shenoy, P. P. (2007). Using Bayesian networks for bankruptcy prediction: Some methodological issues. European Journal of Operational Research, 180, 738–753. [Google Scholar] [CrossRef]
- Tang, L., & Yan, H. (2021). Financial distress prediction based on stacking ensemble. Journal of Intelligent and Fuzzy Systems, 41, 3147–3159. [Google Scholar]
- Tang, Y., Zhang, Y.-Q., & Chawla, N. V. (2009). SVMs modeling for highly imbalanced classification. IEEE Transactions on Systems, Man and Cybernetics, 39, 281–288. [Google Scholar]
- Tsai, C.-F. (2009). Feature selection in bankruptcy prediction. Knowledge-Based Systems, 22, 120–127. [Google Scholar] [CrossRef]
- Tsai, C.-F., & Hsu, Y.-F. (2013). A meta-learning framework for credit scoring. Expert Systems with Applications, 40, 5124–5130. [Google Scholar]
- Varetto, F. (1998). Genetic algorithms applications in the analysis of insolvency risk. Journal of Banking and Finance, 22, 1421–1439. [Google Scholar] [CrossRef]
- Wang, G., Ma, J., Huang, L., & Xu, K. (2010). Two credit scoring models based on dual strategy ensemble trees. Knowledge-Based Systems, 23, 899–908. [Google Scholar] [CrossRef]
- Wu, J., & Wang, Y. (2022). Corporate bankruptcy prediction using explainable boosting machine. Expert Systems with Applications, 187, 115859. [Google Scholar]
- Wu, Y., Gaunt, C., & Gray, S. (2010). A comparison of alternative bankruptcy prediction models. Journal of Contemporary Accounting and Economics, 6, 34–45. [Google Scholar] [CrossRef]
- Xu, Y., & Ouenniche, J. (2018). Slacks-based DEA and cross-benchmarking framework to evaluate bankruptcy predictive models. Expert Systems with Applications, 104, 240–253. [Google Scholar]
- Yeh, C.-H., Chi, D.-J., & Lin, Y.-R. (2022). Improved bankruptcy prediction using feature selection and classification algorithms. Journal of Risk and Financial Management, 15, 329. [Google Scholar]
- Yeh, I.-C., & Lien, C.-H. (2009). The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients. Expert Systems with Applications, 36, 2473–2480. [Google Scholar] [CrossRef]
- Yu, L., & Zhang, Z. (2020). A novel stacking ensemble learning framework for bankruptcy prediction. IEEE Access, 8, 58828–58840. [Google Scholar]
- Zhang, Z., & Wang, J. (2021). Explainable deep learning model for financial distress prediction. Expert Systems with Applications, 185, 115655. [Google Scholar]
- Zhao, Y., & Wang, G. (2018). Predicting corporate bankruptcy using ensemble learning and data balancing techniques. Applied Soft Computing, 72, 362–375. [Google Scholar]
- Zmijewski, M. E. (1984). Methodological issues related to the estimation of financial distress prediction models. Journal of Accounting Research, 22, 59–82. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).