Artificial Intelligence Techniques for Bankruptcy Prediction of Tunisian Companies: An Application of Machine Learning and Deep Learning-Based Models

Abstract: The present paper compares the predictive performance of five models, namely Linear Discriminant Analysis (LDA), Logistic Regression (LR), Decision Trees (DT), Support Vector Machine (SVM) and Random Forest (RF), in forecasting the bankruptcy of Tunisian companies. A Deep Neural Network (DNN) model is also applied and compared with these statistical and machine learning algorithms. The data used for this empirical investigation cover 25 financial ratios for a large sample of 732 Tunisian companies from 2011–2017. Three performance measures are used to evaluate the predictions: the accuracy rate, the F1 score, and the Area Under Curve (AUC). The DNN shows higher accuracy in predicting bankruptcy than the conventional models, whereas the random forest performs better than the other machine learning and statistical methods.


Introduction
Predicting bankruptcy has always been of great importance and a major challenge for banks and lending institutions. Financial analysts and credit experts therefore look for the best techniques to support their decision making. For a long time, traditional approaches have been widely used for bankruptcy prediction. These techniques are based on financial ratio analysis, statistical models, and expert judgment. However, these models have limitations in predicting bankruptcy accurately (Hamdi 2012; Altman et al. 1994; Hamdi and Mestiri 2014).
In recent years, several research studies have focused on bankruptcy forecasting using artificial intelligence and machine learning models. Ravi Kumar and Ravi (2007) summarize the existing research on bankruptcy prediction using statistical and intelligent techniques during 1968–2005. With the same objective, Gergely (2015) presents a rich bibliographic review. He summarizes the short evolution of bankruptcy prediction and presents the main critiques of the modeling process for bankruptcy prediction. Furthermore, the author outlines the avenues of future research recommended in these studies. More recently, Clement (2020) presented a systematic literature review on bankruptcy prediction, based on papers published between 2016 and 2020. In the same context, Kuizinienė et al. (2022) present another systematic review covering 232 research studies, spanning from 2017 to February 2022, that use artificial intelligence techniques to identify financial distress.
A more advanced approach, deep learning, is applied in this study. For more details about deep learning approaches, refer to Deng and Yu (2014) and LeCun et al. (2015). Deep learning approaches have been extensively employed in computer vision (Kamruzzaman and Alruwaili 2022), speech recognition (Roy et al. 2021), natural language processing (Xie et al. 2018), and medical image analysis (Suganyadevi et al. 2022). However, few studies have focused on the use of deep learning in finance (Qu et al. 2019).
This study is organized as follows. Section 2 provides a literature review related to bankruptcy prediction. Section 3 presents the statistical and artificial intelligence techniques applied in this work. The data used are described in Section 4. Section 5 is devoted to the empirical investigation predicting the bankruptcy of Tunisian companies. Finally, Section 6 presents the conclusion of this research study.

Related Literature
In past decades, the discriminant approach (Beaver 1966; Altman 1968; Deakin 1972) and the logistic regression method (Ohlson 1980; Pang 2006) were the two best-known and most popular statistical methods for predicting corporate bankruptcy. More recently, Mestiri and Hamdi (2013) used logistic regression with random effects to predict the credit risk of Tunisian banks. Several more advanced methods have also been employed for bankruptcy prediction. Some authors apply the decision tree method (Aoki and Hosonuma 2004; Zibanezhad et al. 2011; Begović and Bonić 2020), while others use various machine learning techniques such as the genetic algorithm (Shin and Lee 2002; Kim and Han 2003; Davalos et al. 2014), the support vector machine (Shin et al. 2005; Härdle et al. 2005; Dellepiane et al. 2015) and the random forest (Joshi et al. 2018; Ptak-Chmielewska and Matuszyk 2020; Gurnani et al. 2021). Recently, several comparative analyses of machine learning models have been carried out to predict bankruptcy (Narvekar and Guha 2021; Park et al. 2021; Bragoli et al. 2022; Máté et al. 2023; Martono and Ohwada 2023).
With the spread of artificial intelligence modeling algorithms into diverse domains since the 1990s, artificial neural networks became the best-known and most widely used machine learning tool for predicting financial distress (Odom and Sharda 1990; Atiya 2001; Anandarajan et al. 2004; Hamdi 2012; Aydin et al. 2022). However, despite the good forecasting results obtained with this tool, deep learning models are the most widely applied today. This is due to the ability of the deep learning approach to overcome some limitations of training a neural network with a large number of hidden layers, such as the vanishing gradient, overfitting and the computational load (Kim 2017).
Until now, few works have focused on applying deep learning models to predict bankruptcy. Addo et al. (2018) used seven methods (LR, RF, a boosting approach and four deep learning models) to predict loan default probability. Based on the AUC and RMSE performance criteria, they concluded that the gradient boosting model outperforms the other models in solving the binary classification problem. In another study, Hosaka (2019) proposed a convolutional neural network to forecast the bankruptcy of Japanese firms. Because this model is particularly effective for image recognition, the author converted the financial ratios into images in order to train and test the network. The results showed higher prediction performance for the deep neural network than for the other tools employed.
For the same purpose, Noviantoro and Huang (2021) used machine learning as well as deep learning approaches to predict the bankruptcy of Taiwanese companies between 1999 and 2009. They compared the prediction performance of the decision tree, random forest, k-nearest neighbour algorithm, support vector machine, artificial neural network, naïve Bayes, logistic regression, rule induction and deep neural network. To evaluate the classification performance of these models, they computed the accuracy rate, F score and AUC of each technique. They found that the random forest achieved the highest accuracy, AUC and F score, followed by the deep learning approach.
Very recently, Shetty et al. (2022) used a deep neural network, an extreme gradient boosted tree and a support vector machine to predict the bankruptcy of 3728 Belgian firms for the period from 2002 to 2012. The authors concluded that these techniques yield roughly the same bankruptcy prediction accuracy, approximately 82–83%. Elhoseny et al. (2022) applied an adaptive whale optimization algorithm combined with deep learning (AWOA-DL) to predict bankruptcy. They evaluated the ability of the proposed approach to predict the failure of a company compared to logistic regression, the RBF network, teaching-learning-based optimization-DL (TLBO-DL) and the deep neural network. The empirical results show that the new deep learning-based approach (AWOA-DL) gives better predictions. More recently, Ben Jabeur and Serret (2023) proposed a Fuzzy Convolutional Neural Network (FCNN) to predict corporate financial distress. They used eight evaluation measures to compare the performance of the new method with other traditional and machine learning techniques, and found that the combined approach outperforms traditional methods. In another study, Noh (2023) tested the accuracy of Long Short-Term Memory (LSTM), Logistic Regression (LR), K-Nearest Neighbour (k-NN), Decision Tree (DT), and Random Forest (RF) models for corporate bankruptcy prediction. On the basis of five performance measures, the author concluded that the proposed technique can enhance prediction accuracy when using a small sample from an unbalanced financial dataset.
Table 1 provides a literature review summary of the main research studies that apply deep learning to predict bankruptcy.

Statistical, Machine Learning and Deep Learning Techniques
Linear Discriminant Analysis (LDA)
Ronald Fisher (1933) pioneered work on discriminant analysis. He developed a statistical technique for default prediction based on a linear combination of quantitative predictor variables. The output of LDA is a score that classifies observations into the good and bad classes:
$$Z = a_1 X_1 + a_2 X_2 + \dots + a_p X_p$$
where $a_i$ are the weights associated with the quantitative input variables $X_i$.
The study of Altman (1968) is considered the reference work that uses LDA to classify defaulting and healthy companies based on five financial ratios.
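The linear score can be illustrated with a minimal sketch of Fisher's rule for two ratios. The two illustrative ratios, the toy values and the helper names (`fisher_weights`, `score`) are assumptions made for this example and are not taken from the study; the cutoff is placed halfway between the projected class means, which is one common convention.

```python
def mean_vector(rows):
    """Column means of a list of equal-length tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def scatter_2x2(rows, mu):
    """Sum of (x - mu)(x - mu)^T over the rows, for two features."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def score(a, x):
    """Discriminant score Z = a_1 X_1 + a_2 X_2."""
    return sum(ai * xi for ai, xi in zip(a, x))

def fisher_weights(healthy, bankrupt):
    """Weights a = S_w^{-1} (mu_healthy - mu_bankrupt) and a midpoint cutoff."""
    mu_h, mu_b = mean_vector(healthy), mean_vector(bankrupt)
    s_h, s_b = scatter_2x2(healthy, mu_h), scatter_2x2(bankrupt, mu_b)
    sw = [[s_h[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_h[0] - mu_b[0], mu_h[1] - mu_b[1]]
    a = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    cutoff = 0.5 * (score(a, mu_h) + score(a, mu_b))
    return a, cutoff

# Toy data (hypothetical): (profitability ratio, liquidity ratio) per firm.
healthy = [(0.30, 1.8), (0.25, 2.1), (0.35, 1.6), (0.28, 2.4)]
bankrupt = [(-0.10, 0.7), (0.02, 0.9), (-0.05, 1.1), (0.05, 0.6)]

weights, cutoff = fisher_weights(healthy, bankrupt)
# A firm is assigned to the healthy class when its score exceeds the cutoff.
print([score(weights, x) > cutoff for x in healthy + bankrupt])
```

With more than two ratios the same rule applies, but the 2x2 inverse would be replaced by a general linear solver.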

Logistic Regression (LR)
LR is a statistical method used for binary classification tasks (e.g., 0 or 1, bad or good, healthy or default). Following Ohlson (1980), the outcome of the LR model can be written as:
$$P(y = 1 \mid X) = \frac{1}{1 + e^{-z}}$$
where $P(y = 1 \mid X)$ is the probability of $y$ being 1 given the input variables $X$, and $z$ is a linear combination of the input variables:
$$z = a_0 + a_1 X_1 + a_2 X_2 + \dots + a_p X_p$$
where $a_0$ is the intercept term, $a_1, a_2, \dots, a_p$ are the weights, and $X_1, X_2, \dots, X_p$ are the inputs.
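As a small illustration of how the LR output maps a linear combination of ratios to a probability, the sketch below evaluates the logistic function for given coefficients. The function name and the coefficient values are hypothetical and are not estimates from the study.

```python
import math

def logistic_probability(x, intercept, weights):
    """P(y = 1 | X) for inputs x, intercept a_0 and weights a_1..a_p."""
    z = intercept + sum(a * xi for a, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for two ratios.
a0, a = -0.5, [2.0, 0.3]
print(logistic_probability([0.10, 1.2], a0, a))
```

A firm would then be assigned to class 1 when this probability exceeds a chosen cutoff, commonly 0.5.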

Decision Trees (DT)
DTs proceed by recursively partitioning the data into subsets based on the values of the input variables, with each partition represented by a branch in the tree (Quinlan 1986). DTs aim to learn a sequence of binary decisions that can be used to forecast the value of the output for a new observation. In the tree, each decision node corresponds to a test on the value of one of the input variables, and the branches correspond to the possible outcomes of the test. The leaves of the tree denote the predicted values of the output variable for each combination of input values. At each step, the algorithm identifies the input variable that provides the best split of the data into two subsets that are as homogeneous as possible with respect to the output variable. The quality of a split is typically measured using information gain or Gini impurity, which quantifies the reduction in uncertainty about the output variable achieved by the split.
Decision trees are typically not formulated in terms of mathematical equations, but rather as a sequence of logical rules that describe how the input variables are used to predict the output variable. However, the splitting criterion used to select the best split at each decision node can be expressed mathematically. Suppose we have a dataset with $n$ observations and $p$ input variables, denoted by $X_1, X_2, \dots, X_p$, and a binary output variable $y$ that takes values in $\{0, 1\}$. Let $S$ be a subset of the data at a particular decision node, and let $p_i$ be the proportion of observations in $S$ that belong to class $i$. The Gini impurity of $S$ is calculated as follows:
$$G(S) = 1 - \sum_{i} p_i^2$$
The Gini impurity measures the probability of misclassifying an observation in $S$ if it is randomly assigned to a class according to the class proportions in $S$ (Gelfand et al. 1991). A small value of $G(S)$ indicates that the observations in $S$ mostly belong to a single class, i.e., that they are well separated by the input variables.
To split the data at a decision node, consider all possible splits of each input variable into two subsets, and choose the split that minimizes the weighted sum of the Gini impurities of the resulting subsets. The weighted sum is given by:
$$\Delta G = \frac{|S_1|}{|S|} G(S_1) + \frac{|S_2|}{|S|} G(S_2)$$
where $S_1$ and $S_2$ are the subsets of $S$ resulting from the split, and $|S_1|$ and $|S_2|$ are their respective sizes. The split with the smallest value of $\Delta G$ is chosen as the best split. The decision tree algorithm proceeds recursively, splitting the data at each decision node based on the best split, until a stopping criterion is met, such as reaching a maximum depth or a minimum number of observations at a leaf node.
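The split search at a single node can be sketched as follows. This is a minimal illustration on toy data, assuming candidate thresholds are taken halfway between consecutive observed values; the function names and the values are hypothetical.

```python
def gini(labels):
    """Gini impurity G(S) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 1.0 - (p1 ** 2 + (1.0 - p1) ** 2)

def weighted_gini(left, right):
    """Size-weighted sum of the impurities of the two subsets."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def best_split(rows, labels):
    """Return (weighted Gini, feature index, threshold) of the best split."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds: midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            s = weighted_gini(left, right)
            if best is None or s < best[0]:
                best = (s, j, t)
    return best

# Toy node: two ratios per firm, label 1 = bankrupt (hypothetical values).
rows = [(0.30, 1.8), (0.25, 2.1), (0.05, 0.6), (-0.10, 0.7)]
labels = [0, 0, 1, 1]
print(gini(labels))            # impurity before splitting
print(best_split(rows, labels))
```

A full tree would call `best_split` recursively on the two resulting subsets until the stopping criterion is met.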

Support Vector Machine (SVM)
SVM is a supervised learning model used for classification, regression, and outlier detection, developed by Vapnik and Vapnik (1998). The basic idea of this technique is to determine the best separating hyperplane between two classes in a given dataset. The mathematical formulation of SVM is divided into two parts: the optimization problem and the decision function (Hearst et al. 1998).
Given a training set $(x_i, y_i)$, $i = 1, \dots, n$, where $x_i$ is the $i$th input vector and $y_i \in \{-1, 1\}$ is the corresponding output, SVM seeks the best separating hyperplane, defined by:
$$w^T x + b = 0$$
where $w$ is the weight vector, $b$ is the bias term, and $x$ is the input vector. The SVM algorithm aims to determine the optimal $w$ and $b$ that maximize the margin between the two classes. The margin is the distance between the hyperplane and the nearest data point from either class. The SVM optimization problem can then be formulated as:
$$\min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i$$
$$\text{subject to } y_i (w^T x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, n$$
where $\|w\|^2$ is the squared L2-norm of the weight vector, $C$ is a hyperparameter that controls the tradeoff between maximizing the margin and minimizing the classification error, $\xi_i$ is the slack variable that allows for some misclassifications, and the two constraints enforce that all data points lie on the correct side of the hyperplane with a margin of at least $1 - \xi_i$.
The optimization problem can be solved using convex optimization methods, for example quadratic programming. Once the optimization problem is solved, the decision function can be defined as:
$$f(x) = \operatorname{sign}(w^T x + b)$$
where sign is the sign function that returns +1 or −1 depending on the sign of the argument.
The decision function takes an input vector x and returns its predicted class label based on whether the output of the hyperplane is positive or negative. For more details about the optimization process, refer to Chang and Lin (2011), Cristianini and Shawe-Taylor (2000) and Gunn (1998).
In summary, SVM finds the best separating hyperplane by solving an optimization problem that maximizes the margin between the two classes, subject to constraints that ensure every data point is classified with a margin of at least 1 − ξ i . The decision function then predicts the class label of new data points based on the output of the hyperplane.
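To make the roles of the slack variables and of C concrete, the sketch below evaluates the soft-margin objective and the decision function for a given hyperplane. It does not solve the optimization problem; the weights, bias and data points are hypothetical, and returning +1 when the decision value is exactly zero is a convention chosen for this example.

```python
def decision_value(w, b, x):
    """Signed output of the hyperplane, w^T x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Class label in {-1, +1}; a value of exactly 0 is mapped to +1 here."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def slack(w, b, x, y):
    """Smallest slack satisfying y (w^T x + b) >= 1 - slack and slack >= 0."""
    return max(0.0, 1.0 - y * decision_value(w, b, x))

def soft_margin_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * (sum of slacks) for a candidate (w, b)."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    slack_term = sum(slack(w, b, xi, yi) for xi, yi in zip(X, y))
    return margin_term + C * slack_term

# Hypothetical hyperplane and three labelled points.
w, b = [1.0, -1.0], 0.0
X = [(2.0, 0.0), (0.0, 2.0), (0.5, 0.0)]
y = [1, -1, 1]
print([slack(w, b, xi, yi) for xi, yi in zip(X, y)])  # third point is inside the margin
print(soft_margin_objective(w, b, X, y, C=1.0))
```

A larger C penalizes the slack term more heavily, so the optimizer tolerates fewer margin violations at the cost of a narrower margin.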

Random Forests (RF)
RF is an ensemble learning algorithm, developed by Breiman (2001), which combines multiple decision trees to make predictions. The algorithm is called "random" because it uses random subsets of the features and random samples of the data to build the individual decision trees. The data are split into training and testing sets: the training set is used to build the model, and the testing set is used to evaluate its performance. At each node of a decision tree, the algorithm selects a random subset of the features to consider when making a split. This helps to reduce overfitting and increase the diversity of the individual decision trees.
Each decision tree is built using the selected features and a subset of the training data. The tree is grown until it reaches a pre-defined depth or until all the data in a node belong to the same class. Suppose we have a dataset with n observations and p features. Let X be the matrix of predictor variables and Y be the vector of target values.
To build an RF model, start by creating multiple decision trees, each from a bootstrap sample of the original data. That is, we randomly sample n observations from the dataset with replacement to create a new dataset, and repeat this process k times to create k bootstrap samples. For each bootstrap sample, we then grow a decision tree using random subsets of the p features. At each node of the tree, we select the optimal feature and threshold value to split the data according to a criterion such as information gain or Gini impurity. This yields k decision trees. To make a prediction for a new observation, we pass it through each of the k decision trees, obtain k predictions, and take the majority class as the prediction of the forest. For more details about the technical analysis of random forests, see Biau (2012).
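The bootstrap, random feature selection and voting steps can be sketched in a few lines. As a simplification made for brevity, each tree here is a depth-one tree (a stump) rather than a fully grown tree, and the function names, toy data and parameter values (k = 25, one feature per stump) are assumptions for illustration only.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 1.0 - (p1 ** 2 + (1.0 - p1) ** 2)

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, features):
    """Best single split among the allowed features, by weighted Gini."""
    best = None  # (score, feature, threshold, left label, right label)
    for j in features:
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            s = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or s < best[0]:
                best = (s, j, t, majority(left), majority(right))
    if best is None:  # no usable split: always predict the majority class
        label = majority(labels)
        return (None, None, label, label)
    return best[1:]

def predict_stump(stump, x):
    j, t, left_label, right_label = stump
    if j is None:
        return left_label
    return left_label if x[j] <= t else right_label

def fit_forest(rows, labels, k, n_features, seed=0):
    """Grow k stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(k):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        features = rng.sample(range(len(rows[0])), n_features)
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], features))
    return forest

def predict_forest(forest, x):
    """Majority vote over the k individual predictions."""
    return majority([predict_stump(s, x) for s in forest])

# Toy data (hypothetical): label 1 = bankrupt.
rows = [(0.9, 1.0), (0.8, 0.9), (1.0, 0.8), (0.85, 0.95),
        (0.1, 0.0), (0.2, 0.1), (0.0, 0.2), (0.15, 0.05)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels, k=25, n_features=1)
print(predict_forest(forest, (0.95, 0.9)), predict_forest(forest, (0.05, 0.1)))
```

Using an odd number of trees avoids ties in the vote for a binary output.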

Deep Neural Network (DNN)
A DNN is an enhanced version of the conventional artificial neural network with at least two hidden layers (Schmidhuber 2015). Figure 1 illustrates the standard architecture of a deep neural network.
To fully understand how a DNN works, a thorough knowledge of the basics of artificial neural networks is necessary. For more information, readers can refer to Walczak and Cerpa (2003) and Zou et al. (2008). According to Addo et al. (2018), the DNN output is computed from the weight matrix $W_k$ of each layer, the sequence $X_k$ ($k = 1, \dots, L$) of real values, called events, during an epoch, and the activation function $f$.
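As a generic illustration of how weight matrices and activation functions combine layer by layer, the sketch below runs a forward pass through a small fully connected network with two hidden layers. It is not the formulation of Addo et al. (2018): the bias terms, the layer sizes and all numeric values are assumptions made for this example.

```python
import math

def relu(v):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in v]

def sigmoid(v):
    """Element-wise logistic function."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def dense(W, b, x):
    """Affine map W x + b, with W given as a list of rows."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def forward(layers, x):
    """Apply each (W, b, activation) layer in turn to the input x."""
    h = x
    for W, b, activation in layers:
        h = activation(dense(W, b, h))
    return h

# Hypothetical network: 2 inputs -> 2 hidden -> 1 hidden -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0, 1.0]], [-2.0], relu),
    ([[2.0]], [0.0], sigmoid),
]
print(forward(layers, [1.0, 3.0]))
```

The sigmoid on the last layer turns the network output into a value between 0 and 1 that can be read as a class probability.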


Data
A series of financial ratios was calculated using the balance sheets and income statements of 732 firms from different sectors of activity for the period 2011–2017. A total of 4925 credit files, provided by a private Tunisian bank, constitute the database used in this empirical study. Table 2 presents the input ratios. We use the same financial ratios as previous works (Hamdi 2012; Mestiri and Hamdi 2013; Hamdi and Mestiri 2014), in which they demonstrated high accuracy in predicting the bankruptcy of Tunisian firms. Only one non-significant ratio (Raw stock/Total assets) was excluded from our empirical investigation.
The estimated output (Y), on the other hand, is coded as a binary variable that distinguishes healthy from bankrupt companies. Following this classification criterion, the out-of-sample test is composed of 488 healthy companies and 244 bankrupt companies.

Predictive Performance Measures
Several criteria can be used to compare and evaluate the predictive ability of the employed techniques, including the accuracy rate, the F1 score and the AUC.

Accuracy Rate
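The accuracy rate is the best-known performance metric. It is deduced from the confusion matrix (see Table 3) and calculated as follows:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, respectively.

Table 3. Confusion matrix.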

F1 Score
The F1 score is also computed from the confusion matrix, as the harmonic mean of precision and recall. Its value varies between 0 and 1, with 1 being the best possible score. A high F1 score means that the model has both high precision and high recall, i.e., it identifies the positive cases correctly while raising few false alarms.
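Both the accuracy rate and the F1 score can be derived from the four cells of the confusion matrix, as the following sketch shows. The label coding (1 as the positive class), the function names and the toy predictions are assumptions for illustration; the study does not state which class is treated as positive.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Share of correctly classified observations."""
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall (0.0 if undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Toy labels and predictions (hypothetical).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn), f1_score(tp, fp, fn))
```

Unlike accuracy, the F1 score ignores the true negatives, which makes it more informative when one class is much rarer than the other.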

AUC
Area Under Curve (AUC) is a synthetic indicator derived from the ROC curve, a graphical tool used to assess a model's forecasting accuracy (Pepe 2000; Vuk and Curk 2006). Sensitivity and specificity are the two indicators on which the ROC curve is based (see Zweig and Campbell 1993 and Mestiri and Hamdi 2013 for further details). The curve plots 1 − specificity on the x axis against sensitivity on the y axis, where
$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}$$
with TP, TN, FP and FN the true positive, true negative, false positive and false negative counts of the confusion matrix. Moreover, the AUC reflects the quality of the model's classification between healthy and default firms. In the ideal case, the AUC is equal to 1, i.e., the model completely separates the positives from the negatives, without false positives or false negatives.

Results & Discussion
According to Table 4, the deep neural network outperforms the other techniques. The DNN shows the highest accuracy rate, 93.6%, compared with 88.2% for RF and 85.8% for LR. The lowest prediction accuracy was obtained with DT (74.3%). Turning to the second criterion, an F1 score of 0.964 indicates the DNN's ability to distinguish healthy companies from bankrupt companies with great precision. With 1 being the best possible F1 score, the DNN reaches the highest score, while the F1 scores of RF, LR, SVM, LDA and DT are 0.933, 0.922, 0.910, 0.890 and 0.838, respectively.
Another graphical indicator used to evaluate the classification quality of the models under study is the ROC curve (see Figure 2), from which the AUC measure is deduced. A model with an AUC value close to one shows high quality of classification between healthy and default firms. Based on Table 4, the AUC of the DNN is 0.888. RF ranks second with an AUC of 0.815. The LR and LDA models present the worst classification results, with AUCs of 0.633 and 0.574, respectively, in the testing sample.
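For readers who want to see how an AUC value is obtained from a ROC curve, the sketch below sweeps a decision threshold over predicted scores, records (1 − specificity, sensitivity) at each step, and integrates the curve with the trapezoidal rule. The scores, the labels and the choice of 1 as the positive class are hypothetical and unrelated to the values in Table 4.

```python
def roc_points(y_true, scores):
    """ROC points (false positive rate, true positive rate), thresholds high to low."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))  # (1 - specificity, sensitivity)
    return points

def auc(points):
    """Area under a piecewise-linear curve, by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Toy scores (hypothetical): higher score = more likely positive.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc(roc_points(y_true, scores)))
```

The resulting area equals the share of positive-negative pairs in which the positive observation receives the higher score, which is why a value of 1 corresponds to perfect separation.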
J. Risk Financial Manag. 2024, 17

Similar conclusions were provided by Hosaka (2019), whose findings indicate that the convolutional neural network has better prediction performance than statistical and conventional machine learning methods. Furthermore, the work of Efron (1975) proved the robustness of the LR model compared to LDA. Barboza et al. (2017) obtained similar results in predicting the bankruptcy of North American firms. Their empirical findings indicate that RF is the most accurate prediction model compared to LR and LDA: RF reaches 87% accuracy, whereas LR reaches 69% and LDA reaches 50%.
In conclusion, the DNN outperforms the traditional statistical models and the conventional machine learning techniques in forecasting bankruptcy. RF ranks second, with higher prediction accuracy than the other techniques employed. Based on our empirical investigation, the DNN can be considered the best technique for detecting a company's financial distress and can therefore support managerial decisions.
In our empirical study, 20% of the sample (985 of the 4925 credit files) was used as a test data set in order to check the prediction accuracy and classification quality of the models. The deep neural network used in our study is a recurrent neural network with three hidden layers. The numbers of nodes per layer are 200, 100, 40 and 1 (output layer). The activation function is ReLU and the loss function is binary cross-entropy. The output unit is sigmoid. The backpropagation training algorithm was used, with a stopping criterion of $10^{-3}$.
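Two elements of this setup are easy to check numerically: the size of the 80/20 split and the binary cross-entropy loss applied to a sigmoid output. The sketch below is a minimal illustration under those assumptions; the helper names are hypothetical, and the clipping constant `eps` is an implementation detail added here to avoid taking the logarithm of zero.

```python
import math

def split_sizes(n, test_fraction):
    """Number of training and test observations for a given test share."""
    n_test = round(n * test_fraction)
    return n - n_test, n_test

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean of -[y log p + (1 - y) log(1 - p)] over the observations."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

print(split_sizes(4925, 0.20))                          # training and test sizes
print(binary_cross_entropy([1, 0, 1], [0.9, 0.2, 0.6]))  # toy predictions
```

The loss is small when the predicted probabilities are close to the true labels and grows without bound as a confident prediction turns out to be wrong.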

Conclusions
A company's financial default has considerable consequences for several financial and economic actors, such as investors, creditors, managers, shareholders, financial analysts, auditors, employees and the government. Bankruptcy prediction has therefore become a matter of great importance and concern. Accurate bankruptcy prediction techniques bring many benefits, such as lower costs, faster recovery and credit file analysis, time savings and better monitoring of loan reimbursement. Machine learning models are widely applied in the bankruptcy prediction literature. Their performance in terms of prediction accuracy explains our choice to adopt these models and compare them with the deep learning approach. The main contribution of the present work is to identify the model best able to predict financial distress with high precision in the Tunisian context.
Statistical, machine learning and deep learning models, namely LDA, LR, DT, SVM, RF and DNN, are applied to predict the financial distress of 732 Tunisian companies from different activity sectors. The empirical findings show that the DNN is a highly suitable tool for studying financial distress in Tunisian credit institutions. Compared to past work on bankruptcy prediction, this study is distinguished by a substantial number of input features (25 ratios) as well as a large sample in the training phase (3940 observations, about 80% of the total sample). Wilson and Sharda (1994) used only five ratios (the same input ratios employed by Altman 1968) to predict the bankruptcy of 169 firms. The models applied in their work are the shallow neural network and multiple discriminant analysis. In a related study, Chen (2011) used a set of eight selected features as inputs to machine learning models and an evolutionary computation approach to predict the business failure of 200 Taiwanese companies. To forecast the bankruptcy of Korean construction companies, Heo and Yang (2014) used a total of 2762 samples and 12 ratios to train several models, such as adaptive boosting with DT, SVM, DT and ANN.

For future research, hybrid learning techniques that combine the DNN with another machine learning model could be applied, as they may provide higher performance than a single model. In this context and for the same purpose of forecasting bankruptcy, Ben Jabeur and Serret (2023) used fuzzy convolutional neural networks. The present work, as well as previous research, supports the idea that artificial intelligence models perform better than traditional methods. However, it would be interesting for further research to diversify the data sources beyond standard financial ratio data by adding miscellaneous textual data (e.g., news, companies' public reports, notes and comments from experts, auditors' reports and management statements) that can enhance the forecasting accuracy of financial distress (Mai et al. 2019; Matin et al. 2019). Furthermore, it is of great interest to integrate sector diversification as an input variable to predict company default and to subsequently study the impact of changing industry on the accuracy of predictions. Another issue that should be studied in the future is the occurrence of several recent crises, such as the COVID-19 crisis. It would be interesting to apply artificial intelligence models to investigate the impact of crises on the performance of financial distress prediction methods (Sabir et al. 2022).

Figure 1. The standard architecture of DNN.


Figure 2. ROC curve for the five machine learning models and DNN.


Table 1. A summary of literature review on bankruptcy prediction using deep learning.

Table 2. The series of financial ratios.
Table 4 presents the empirical results of the accuracy rate, F1 score and AUC criteria used to judge the classifier's performance of the applied methods.

Table 4. Prediction results and models accuracy.