Predicting Financial Distress of Slovak Enterprises: Comparison of Selected Traditional and Learning Algorithms Methods

Abstract: Predicting the risk of financial distress of enterprises is an inseparable part of financial-economic analysis, helping investors and creditors reveal the performance stability of any enterprise. The acceptance of national conditions, proper use of financial predictors and statistical methods enable achieving relevant results and predicting the future development of enterprises as accurately as possible. The aim of the paper is to compare models developed by using three different methods (logistic regression, random forest and neural network models) in order to identify a model with the highest predictive accuracy of financial distress when it comes to industrial enterprises operating in the specific Slovak environment. The results indicate that all models demonstrated high discrimination accuracy and similar performance; neural network models yielded better results measured by all performance characteristics. The outputs of the comparison may contribute to the development of a reputable prediction model for industrial enterprises, which has not been developed yet in the country, which is one of the world’s largest car producers.


Introduction
The economy is built on the successful functioning of enterprises. In current conditions, however, an increasing number of corporate defaults occur, caused by various factors. The financial distress of business entities is closely connected with unpleasant consequences, which are the main motivation for managers and financial analysts to search for methods that can predict possible financial problems or bankruptcy in advance. Financial analysis may help solve these problems, as it focuses on determining the factors (and their intensity) that form the financial stability of enterprises, reveals corporate strengths and weaknesses, and thus becomes a necessary and effective diagnostic tool for predicting corporate financial health. Since the development of the first prediction models in 1930 [1], hundreds of bankruptcy prediction models have been introduced worldwide (e.g., Alaka et al. [2]). However, the results of many studies confirm that the reliability and predictive accuracy of the models decrease if they are used in national environments and time horizons other than those in which they were originally formed [3][4][5]. The development of prediction models in unique national conditions is of vital importance if the financial risks are to be estimated correctly. One-country studies therefore play a significant role in the research of bankruptcy prediction.
Considering the eastern European countries, especially the Visegrad Group (a political and cultural alliance of countries for the purpose of social, energy and economic cooperation), most of them predict the financial health of enterprises using generally known bankruptcy models that were formed in their national environment. For instance, in Hungary, the most reputable model is the prediction model of Virag and Hajdu [6] developed for industrial enterprises; in the Czech Republic, the models of the Neumaiers [7][8][9][10] focus either on industrial enterprises or on all sectors of the economy; the widely used models in Poland are the bankruptcy models for industrial enterprises introduced by Maczynska [11], Gajdki and Stosa [12] and Hamrol et al. [13], the Poznanski model by Prusak [14] and the general model of Gruszynski [15]; in the conditions of Slovakia, the most significant are the models of Chrastinova [16] and Gurcik [17], both developed for agricultural entities. All four countries are high-income industrial countries; thus, predicting the future financial situation with a high level of accuracy is essential. However, the situation in Slovakia, the world's largest per-capita car producer [18], has not been addressed properly. Although several models have been developed in the national conditions, e.g., Binkert [19], Hurtosova [20], Delina and Packova [21], Rohacova and Kral [22], Gulka [23] and Boda and Uradnicek [24], they specialize in the bankruptcy prediction of agricultural enterprises or do not have any sectoral orientation; none of them is focused on the industrial sector (the industrial sector includes the enterprises operating in sectors B-E according to the statistical classification of economic activities in the European Community).
The need for the development of a bankruptcy model for industrial enterprises in the Slovak environment is indisputable. The Slovak Republic has a small, open economy driven mainly by automobile and electronics exports, accounting for more than 80% of the GDP. It is also one of the fastest growing economies in Europe and the fifth largest car producer in the European Union. Slovakia continues exhibiting robust economic performance, with strong growth backed by a sound financial sector, low public debt and high international competitiveness drawing on large inward investment.
Several different statistical methods may be used to form the model; thus, the main aim of the paper is to compare selected traditional and machine learning methods (logistic regression, random forest and neural network) in the conditions of the Slovak environment when predicting the financial health of enterprises. Identifying the most relevant and accurate method is useful for forming a model that predicts the financial distress of industrial enterprises in the specific national environment, and the results may also be applied in other countries with a similar economic structure and business orientation. The originality of the paper is based on the presentation of different methods of bankruptcy prediction applied to a dataset of about 50,000 industrial enterprises (on average) in each analyzed period (2016-2018). This period was chosen because new legislation on entities in financial crisis entered into force in 2016; the last period corresponds to the newest available data (2018). The importance of the study is underlined by the fact that information about the stable future development of industrial enterprises can contribute to the elimination of potential financial risks and thus to the improvement of the decision-making process of investors and creditors.
Following the amended Slovak legislation (Law No. 513/1991 Coll. Commercial Code and Law No. 7/2005 Coll. On Bankruptcy and Restructuring, as amended), a company is in default if it has liabilities to at least two entities and the value of the liabilities exceeds the value of its assets or if it is unable to pay at least two financial liabilities to at least two creditors 30 days after the due date. An enterprise is at risk of imminent default when it has a low equity-to-liability ratio, defined as less than 4 to 100 for the year 2016, 6 to 100 for the year 2017 and 8 to 100 for the year 2018 and every following year [25]. According to available statistical data (financial reports of industrial enterprises were obtained from the register of the financial statements, www.registeruz.sk), if the law had been in force between 2011 and 2015, 22,591, 38,413, 41,952, 41,905 and 40,636 enterprises would have been in crisis in the respective years. Moreover, an enterprise is in financial distress if the low value of the equity-to-liability ratio is accompanied by negative profit after taxes and a ratio of current assets to current liabilities (current ratio) of less than 1. Furthermore, enterprises whose value of assets is lower than the value of liabilities (negative equity) are also entities with financial problems [26,27]. Although the determination of an enterprise in default is based on the Slovak legislation, the same limitation is also relevant in the context of other countries, as negative equity, negative profit after taxes and a low equity-to-liability ratio are general indicators of non-prosperity. The optimal value of the current ratio is defined divergently in the literature; however, a value below 1 shows that an enterprise may not be able to meet its obligations in the short run [28].
It should be emphasized that the equity-to-liabilities ratio varies across industries [29]; the Slovak legislation, however, does not consider the economic sector and sets the same limit value for all enterprises on the market. Based on the presented information, it can be assumed that research on the use of traditional and learning algorithms for bankruptcy prediction in the conditions of the Slovak Republic can produce interesting findings.
The paper is divided into the following sections. The literature review depicts the most important as well as recent studies and tries to connect the research aim and the literature's previous findings. The Data and Methodology section highlights the methods used and determines the data used for the analysis. The outputs of calculations, crucial findings and comparison of results with other studies are portrayed in the Results and Discussion section.

Literature Review
Bankruptcy prediction, i.e., the prediction of financial distress, has been a highly discussed topic for several decades. The first studies on bankruptcy forecasting date to the beginnings of the 20th century, with the most significant studies being those of Beaver [30], Altman [31] and Ohlson [32]. However, in Europe, this phenomenon assumed its importance in the 1990s when the economic systems started to change [33]. A comprehensive list of models of European transition economies, with the specification of their sample size, economic sector, type of statistical method used and prediction accuracy, is presented in the work of Kliestik et al. [34]. Different prediction models were developed worldwide, helping business entities to forecast their financial stability in the upcoming period, which is important not only for the enterprise itself but also for its business partners [35,36]. Kliestik et al. [37] confirm in their research that bankruptcy prediction supports business continuity and sustainable and ethically responsible economic development.
Chou et al. [38] add that when forming the bankruptcy prediction model, the financial ratios selection and the classifier design play major roles. The importance of the financial ratios used as predictors of financial health is depicted in the renowned studies of Sharifabadi et al. [39], Tian et al. [40], Bellovary et al. [41], Ravi Kumar and Ravi [42], Calderon et al. [43], Dimitras et al. [44], O'Leary [45] and Scott [46]. However, two studies are especially important in the identification of crucial financial ratios. The first one is the research of Bellovary et al. [41], where the authors analyze 165 prediction models. They state that 752 different variables were used in the models, with up to 674 of these variables being used in only one or two models. At the conclusion of the study, they present 42 variables that were used in more than five models (the most frequently used were earnings-after-taxes-to-total-assets ratio, current-assets-to-current-liabilities ratio, working-capital-to-total-assets ratio, retained-earnings-to-total assets ratio and earnings-before-interest-and-taxes-to-total-assets ratio, appearing in more than 30 models). The authors of the second important study [42] analyzed 62 prediction models and put in order the 20 most relevant financial ratios based on their frequency of occurrence, i.e., earnings-after-taxes-to-total-assets ratio, retained-earnings-to-total assets ratio, sales-to-total-assets ratio, earnings-before-interest-and-taxes-to-total-assets ratio and current-assets-to-current-liabilities ratio. As declared in the study of Kovacova et al. [47], each country prefers different explanatory variables when developing a bankruptcy prediction model. Their results reveal that prediction models being developed in the Slovak Republic prefer the current ratio, liabilities-to-total-assets ratio, equity-to-total assets ratio, return on assets (ROA) and cash ratio. 
By contrast, the weakest predictors are macroeconomic variables, analyst recommendations and industry variables [48], which is at odds with other studies confirming the significance of macroeconomic variables in predicting financial distress, e.g., Jacobson et al. [49], Bruneau et al. [50] and Nam et al. [51]. Nonetheless, the research of Zikovic [52] underlines that the probability of financial distress is influenced by both firm-specific and macroeconomic variables. The utility of combining accounting and macroeconomic data in financial distress prediction is also confirmed in the research of Tinoco and Wilson [53] and Giriuniene et al. [54]. The study of Filipe et al. [55] concludes that the same firm-specific factors are essential in predicting the financial distress of small and medium-sized enterprises across Europe, whereas the relevant macroeconomic variables differ by region. Kacer et al. [56] find that the classification performance of the prediction models is improved when non-financial variables are included in the model, but they do not recommend the use of macroeconomic variables. Wilson et al. [57] confirm the usefulness of non-financial information in the prediction of financial distress of enterprises and find that the transition process variables, along with financial and non-financial variables, influence the probability of failure. Du Jardin [58] highlights that the time horizon also plays an important role in bankruptcy prediction and that the optimal forecasting horizon is usually one year.
Undoubtedly, an essential role is played by the statistical methods and models used to predict the future development of enterprises [59]. Mattsson and Steinert [60] state that the quality of the model is given by the statistical method that is applied; the results of their research show that in recent years artificial intelligence and machine learning methods have achieved promising results in corporate bankruptcy prediction compared to the traditional methods (logistic regression or multiple discriminant analysis). The assessment of the bankruptcy risk of large companies by Barbatu-Misu and Madaleno [61] confirms that the estimation of bankruptcy risk is important for managers in decision-making and in improving corporate financial performance. Their findings show that principal component analysis based on discriminant analysis indices is more effective when used to determine corporate financial risks. In addition, the methodological aspects of designing a scoring model for the early prediction of bankruptcy using ensemble classifiers are examined by Pisula [62]. Oliveira et al. [63] aim to develop a multiple criteria system to predict bankruptcy in small and medium-sized enterprises, combining cognitive mapping and the categorical-based evaluation technique MACBETH. Tsai [64] demonstrates that assessing credit risk and the possibility of bankruptcy is an important issue before investment and that data mining and machine learning techniques are increasingly used to solve credit scoring problems. Le et al. [65] highlight the importance of bankruptcy prediction for financial institutions, fund managers, lenders, governments and economic stakeholders. However, they stress that the imbalance between bankrupt and healthy companies may cause classification errors and advise the use of cluster-based boosting algorithms for effective bankruptcy prediction.
In addition, their study on a sample of Korean companies shows that the use of oversampling methods to balance the dataset of analyzed enterprises enhances the performance of bankruptcy prediction [66]. Le et al. [67] present a new machine learning model (a GPU-based extreme gradient boosting machine), which outperforms the current machine learning approaches for bankruptcy forecasting in terms of geometric mean and area under the receiver operating characteristic curve (AUC). The findings of Wang et al. [68] and of Mai et al. [69] confirm the superiority of machine learning algorithms (support vector machines and random forest) in terms of classification ability, type I and II errors and AUC. Hosaka [70] shows that convolutional neural networks achieve higher discrimination accuracy than conventional methods. Qu et al. [71] claim that the development of modern information technologies has reduced the use of traditional prediction methods, namely logistic regression (LR) and multiple discriminant analysis (MDA), and has instead promoted the use of machine learning for prediction. Moreover, several authors tried to compare the predictive ability of traditional prediction methods [72]. Affes and Hentati-Kaffel [73] found that the logit model outperforms the discriminant analysis model in terms of correct classification rate. Using data from 1985 to 2013 on North American enterprises, Barboza et al. [74] report that, comparing the best prediction models, random forest led to 87% accuracy, while logistic regression and discriminant analysis led to 69% and 50% accuracy, respectively. A study of 236 enterprises operating in Slovakia showed that the model based on a logit function outperforms the discriminant model in classification accuracy [75]. According to the results of other studies focused on the comparison of traditional and machine learning models, e.g., Cho et al. [76], Van Gestel et al. [77], Kim [78], Chen [79] and Nyitrai and Virag [80], the models based on the principles of discriminant analysis achieve the weakest prediction ability (the logistic regression models achieve higher prediction accuracy in almost all cases); thus, we decided not to include this method in the comparison of statistical methods in the conditions of Slovak industrial enterprises. However, the findings of all the cited studies indicate that the best predictive accuracy is achieved by learning algorithms. Altman et al. [81] also compare the predictive accuracy of different estimation methods used to assess the financial health of small and medium-sized enterprises up to 10 years before default in an open European market. Their findings affirm that logistic regression and neural networks are superior to other approaches. The importance of logistic regression in bankruptcy prediction modeling is underlined in the research of Ben Jabeur [82]. Olson et al. [83] found that decision trees are relatively more accurate compared to neural networks and support vector machines. The research of Klepac and Hampel [84] focuses on medium-sized enterprises in Europe that went bankrupt in 2014, and the importance of business risks of small and medium-sized enterprises for their operation is portrayed in the paper of Hudakova et al. [85]. They found that learning algorithms achieve much better results compared to other methods, especially in predictive accuracy. Garcia et al. [86] point out that both advanced statistical and machine learning models may demonstrate their effectiveness when assessing financial data, which are often affected by various imperfections. However, there are also researchers who do not recommend the use of machine learning in the field of business [87] because its prediction accuracy does not far exceed that of the statistical models and the results are not interpretable.
As stated by Svabova and Durica [88], the proper use of statistical methods ensures the correct use of statistical tools and may lead to the creation of a strong prediction model with a statistically significant level of bankruptcy prediction.

Data and Methodology
To compare the prediction accuracy of the models in the conditions of Slovak enterprises, we created logistic regression (LR), random forests (RF) and neural networks (NN) models to predict whether a company will be in financial crisis in the following year. Each of the models has both advantages and disadvantages, and our goal was to find the most suitable model [89][90][91][92].

Data and Conditions of Classification
The data were obtained from the register of the financial statements (www.registeruz.sk) using its API (application programming interface, which allows an application to communicate with another application or an operating system) and C# (C-Sharp, a programming language). Financial statements from 2016, 2017 and 2018 were analyzed. To calculate the predictors for the year 2016, the financial data of 2015 were analyzed. To determine if an enterprise was in financial crisis in 2016, the data of 2016 were used. The same procedure was used for the statements from the years 2017 and 2018. To divide the enterprises into the groups of financially sound enterprises and enterprises in financial crisis, the legislative criteria were applied. Financial reports of enterprises do not include information about the number of creditors and payment delays; thus, a different procedure was used to identify an enterprise in crisis and the following criteria were set: (i) the equity-to-liability ratio does not exceed the given value of 0.4; (ii) the current ratio is less than 1; and (iii) earnings after taxes are not positive. If an enterprise meets all three conditions, it is classified as an enterprise in financial crisis. These criteria address the potential and real indebtedness of a company due to the accumulation of losses from previous years, and they indicate: (i) an inability to generate profit in the longer term; (ii) insolvency; and (iii) the current inability to make a profit.
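The three-part classification rule can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used C# for data collection and R for modeling); the `Statement` fields, the function name and the handling of enterprises without liabilities are assumptions made for illustration, and the equity-to-liability limit is passed in as a parameter rather than fixed.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    """Hypothetical container for the balance-sheet items the rule needs."""
    equity: float
    liabilities: float
    current_assets: float
    current_liabilities: float
    earnings_after_taxes: float


def in_financial_crisis(s: Statement, equity_to_liability_limit: float) -> bool:
    """Return True only if all three criteria from the text hold."""
    # Assumption: an enterprise without liabilities cannot meet criteria (i)/(ii).
    if s.liabilities <= 0 or s.current_liabilities <= 0:
        return False
    low_equity = s.equity / s.liabilities <= equity_to_liability_limit  # (i)
    low_liquidity = s.current_assets / s.current_liabilities < 1.0      # (ii)
    no_profit = s.earnings_after_taxes <= 0                             # (iii)
    return low_equity and low_liquidity and no_profit


# Illustrative figures only, not taken from the dataset.
distressed = Statement(10, 100, 40, 80, -5)
healthy = Statement(200, 100, 150, 80, 20)
print(in_financial_crisis(distressed, 0.4), in_financial_crisis(healthy, 0.4))
```

Passing the limit as an argument keeps the sketch usable with year-specific legal thresholds as well as with the single value quoted in the criteria.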
The data of 2016 were divided in the ratio 75:25 into training and validation samples. Both parts preserve the same proportion of companies in the response class as the original data. All the financial predictors from 2016 and 2017 were used to test the model. The comparison of training data and testing data is summarized in Table 1. The financial stability of enterprises is also influenced by the development of the values of crucial financial indicators. We decided to choose the most relevant financial ratios [90] of profitability, liquidity, activity, indebtedness and capital structure (Table 2). Measuring and assessing the financial ratios of profitability, activity, liquidity and indebtedness helps create a competitive advantage for an enterprise [93]. However, symptoms of financial distress do not all occur at the same time but in successive phases: first, there is a decrease in output volume, a decrease in profitability, an increased need for working capital and a deterioration of the capital structure, and finally, persistent insolvency sets in. Summary statistics of these predictors are shown in Table 3.
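A 75:25 split that preserves the class proportions (a stratified split) can be sketched as follows. This is an illustrative Python version under our own assumptions, not the procedure used in the study: the function name, the seed and the toy label vector are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_share=0.75, seed=0):
    """Split indices so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, valid = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # random order within the class
        cut = round(len(indices) * train_share)   # 75% of this class to training
        train.extend(indices[:cut])
        valid.extend(indices[cut:])
    return sorted(train), sorted(valid)


# Toy data: 20 enterprises in crisis (1) and 80 sound enterprises (0).
labels = [1] * 20 + [0] * 80
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))
```

Splitting within each class separately is what keeps the share of enterprises in crisis identical in the training and validation parts.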

Methods Used for Bankruptcy Prediction
As mentioned in the literature review, several methods may be used in bankruptcy prediction models, but to analyze the industrial enterprises in Slovak conditions we selected three popular statistical methods (logistic regression, random forest and neural networks), which were proven to be the most accurate in other studies.
Logistic regression models the one-sided dependence between variables in a situation where the dependent variable is binary, ordinal or categorical and the explanatory variables may be of any type, continuous or categorical. Logistic regression is often compared to multiple discriminant analysis; however, its assumptions are less strict, e.g., it does not require normality of variables or homoscedasticity of individual groups. Additionally, its classification ability tends to be better than that of models based on discriminant analysis [94]. When modeling the financial distress of enterprises using logistic regression, two categories are recognized: prosperous and non-prosperous enterprises. Each enterprise belongs to only one category depending on the value of the dependent variable. The modeling of prosperity/non-prosperity is based on the conditional probability of the dependent variable (Y) given the independent variables, the predictors (X). All predictors used should be independent of each other, as mutual dependence (multicollinearity) can affect the stability of the model [95]. The relationship between the probability and the vector of independent variables for a non-prosperous enterprise is calculated as follows:
$$\pi = \frac{e^{\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k}}{1 + e^{\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k}}$$
where $\pi$ is the conditional probability that an enterprise is non-prosperous, $X_1, \dots, X_k$ are the predictors and $\beta_0, \dots, \beta_k$ are the estimated coefficients. Thus, logistic regression classifies an enterprise as non-prosperous if the predicted probability is greater than the cut-off value (most often 0.5); conversely, if the predicted probability is below the cut-off value, the enterprise belongs to the group of prosperous enterprises.
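The classification step of logistic regression can be sketched in a few lines of Python. The sketch assumes the standard logistic link; the coefficient values and predictor values are hypothetical and serve only to show how a probability is turned into a class with the 0.5 cut-off.

```python
import math


def predict_probability(intercept, coefficients, predictors):
    """Probability of non-prosperity under the standard logistic model."""
    z = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    # Numerically stable logistic function for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify(probability, cutoff=0.5):
    """Non-prosperous if the probability exceeds the cut-off, else prosperous."""
    return "non-prosperous" if probability > cutoff else "prosperous"


# Hypothetical coefficients for two financial ratios (illustration only).
p = predict_probability(0.0, [-2.0, 1.5], [0.5, 1.0])
print(round(p, 4), classify(p))
```

Lowering the cut-off below 0.5 would flag more enterprises as non-prosperous, which trades specificity for sensitivity.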
The random forest technique was developed for datasets containing a large number of predictors. Random forests can combine multiple categorical and numeric variables in one analysis [96]. This method consists of a set of simple trees $T_1, \dots, T_N$ whose classification or regression functions can be expressed as $h(X, \Theta_1), \dots, h(X, \Theta_N)$, where $h$ is a function, $X$ is the vector of predictors and $\Theta_1, \dots, \Theta_N$ are independent, identically distributed random vectors. For the random forest method, C&RT (classification and regression tree) binary trees are used. As in the formation of individual trees, in the RF method the dataset is split into training and test sets. The training sets for the individual trees $T_i$ are bootstrap selections from the data set $L$. Bootstrap selections are random samples of size $n$ drawn with replacement. Observations that are in the i-th bootstrap selection $L_i$ are used to create the tree $T_i$ (training set), and the observations that were not selected (test set) are used to estimate the error. Error estimates on this test set are called out-of-bag estimates. Approximately one third of the observations are out of bag for each tree. When using RF to classify enterprises, each tree provides a classification of each observation into the resulting category. The result of the forest classification is given by the majority decision of all the trees. The RF method increases accuracy (reduces bias) by letting the trees grow, while maintaining an acceptable variance by combining the results of the individual trees (majority vote/averaging). Compared to other tree-based ensembles, however, there is an effort to ensure a low level of correlation between individual trees. Decreasing the correlation between the trees is achieved by a random selection of a certain number of predictors. The random forest method thus uses both a random selection of observations and a random selection of predictors [97].
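Two building blocks of the method, bootstrap selection with out-of-bag observations and the majority vote, can be illustrated with the short Python sketch below. It does not grow any trees; the function names and the seed are our own, and the sample size is arbitrary. For large n the out-of-bag share tends to about 0.368, which is the "roughly one third" referred to in the text.

```python
import random
from collections import Counter


def bootstrap_sample(n, rng):
    """Draw n indices with replacement; the indices never drawn are out of bag."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    out_of_bag = sorted(set(range(n)) - set(in_bag))
    return in_bag, out_of_bag


def majority_vote(tree_predictions):
    """Forest decision: the class predicted by most trees (ties: first seen)."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(42)
n = 10_000
in_bag, out_of_bag = bootstrap_sample(n, rng)
oob_share = len(out_of_bag) / n
print(round(oob_share, 3), majority_vote(["crisis", "sound", "crisis"]))
```

Because each tree sees a different bootstrap selection, its out-of-bag observations act as a built-in test set without a separate hold-out sample.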
A neural network is a set of connected input-output units (artificial neurons), with each connection having a certain weight. Artificial neurons are based on the principle of the biological neurons that make up the human nervous system. The input information is weighted. The threshold value is subtracted from the sum of the weighted input signals, and the activation function transforms the result into an output signal that is sent to the inputs of the neurons to which the given neuron is connected. There are several types of neural networks and algorithms. The most commonly used type of neural network, which was used to analyze the Slovak industrial enterprises, is the multilayer feedforward neural network. It consists of several layers of neurons: the input layer, several hidden layers and the output layer [98]. This type of neural network is used for classification and for the prediction of a continuous function (numerical prediction). Provided that they have enough hidden layers and training examples, such networks can approximate any function. The use of a neural network for bankruptcy prediction is recommended as this method is tolerant of data noise and is able to model complex relationships between inputs and outputs. Algorithms of neural networks can be parallelized, which reduces the calculation time [99].
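The signal flow through a single neuron and through successive layers can be sketched in Python as follows. The sigmoid activation, the weights and the thresholds are assumptions chosen for illustration; the text does not state which activation function or network size was used in the study.

```python
import math


def sigmoid(z):
    """Assumed activation function (numerically stable logistic)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def neuron_output(inputs, weights, threshold):
    """Weighted sum of inputs minus the threshold, passed through the activation."""
    signal = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return sigmoid(signal)


def forward(inputs, layers):
    """Propagate a signal through the layers; each layer is (weight_rows, thresholds)."""
    signal = inputs
    for weight_rows, thresholds in layers:
        signal = [neuron_output(signal, w, t) for w, t in zip(weight_rows, thresholds)]
    return signal


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2]),
    ([[1.2, -0.7]], [0.05]),
]
output = forward([0.6, 0.2], layers)
print(output)
```

The output of the last layer can be read as a score between 0 and 1 and compared with a cut-off in the same way as a logistic regression probability.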

Metrics of the Prediction Models Comparison
The quality of the prediction model can be quantified by several measures. In this study, different types of models are compared using multiple metrics.
The accuracy is the ratio between the number of correct predictions and the total number of predictions. If the number of occurrences in classes varies greatly, accuracy is biased. The per class accuracy is the average accuracy for each class. It should be used when the classes are imbalanced.
The error rate is calculated as 1 − Accuracy.
The mean per class error is an average of the error rate for each class.
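The accuracy-based metrics can be computed directly from the four cells of a confusion matrix. The Python sketch below uses hypothetical counts and our own function name; it assumes both classes are present, so that no division by zero occurs.

```python
def class_metrics(tp, fn, tn, fp):
    """Accuracy, error rate and mean per class error from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    error_positive = fn / (tp + fn)   # share of actual positives misclassified
    error_negative = fp / (tn + fp)   # share of actual negatives misclassified
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "mean_per_class_error": (error_positive + error_negative) / 2,
    }


# Hypothetical imbalanced sample: 10 enterprises in crisis, 90 sound ones.
m = class_metrics(tp=5, fn=5, tn=85, fp=5)
print(m)
```

In this example the accuracy is 0.90 although half of the enterprises in crisis are missed, while the mean per class error of about 0.28 reflects that weakness. This is why per-class measures are preferred for imbalanced data.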
The coefficient of determination (R-squared) is calculated as:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
where $y_i$, $\bar{y}$ and $\hat{y}_i$ are the original data values, their mean and the predicted values, respectively. R-squared measures the percentage of variability of the dependent variable that is explained by the independent variables in the model.
The logarithmic loss (LogLoss) measures the uncertainty of the probabilities (p) of a model by comparing them with the true labels. To calculate log loss, the following formula is used:
$$\mathrm{LogLoss} = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \ln p_i + (1 - y_i) \ln (1 - p_i) \right]$$
where $y_i \in \{0, 1\}$ is the true label and $p_i$ is the predicted probability of class 1. Thus, log loss quantifies the accuracy of a model by penalizing false classifications.
The receiver operating characteristic (ROC) is a curve with points [x,y], where x = 100 − Specificity and y = Sensitivity (both in percent) for different cut-off (threshold) points. The closer the ROC curve is to the upper left corner, the higher the overall accuracy of the model. The ROC curve reveals a trade-off between sensitivity and specificity: increasing sensitivity implies decreasing specificity and vice versa. The biggest benefit of using the ROC curve is that it is independent of changes in the proportion of outcomes.
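R-squared and log loss can be illustrated with a short Python sketch. It assumes the usual definitions (one minus the ratio of the residual to the total sum of squares, and the binary cross-entropy); the clipping constant `eps` and the toy data are our own choices.

```python
import math


def r_squared(actual, predicted):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


def log_loss(actual, probabilities, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(actual, probabilities):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(actual)


# Toy labels (1 = crisis) and predicted probabilities of crisis.
actual = [1, 0, 0, 1]
probs = [0.9, 0.2, 0.1, 0.6]
r2 = r_squared(actual, probs)
ll = log_loss(actual, probs)
print(round(r2, 2), round(ll, 4))
```

A confident but wrong probability is penalized much more heavily by log loss than a cautious one, which is the sense in which the metric measures the uncertainty of the predicted probabilities.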
The area under the curve (AUC) is one of the most frequently used metrics. An AUC value in the range [0.97, 1.0] characterizes a perfect classification ability of a model. AUC values in [0.92, 0.97) represent excellent prediction results, [0.75, 0.92) good classification and [0.6, 0.75) acceptable classification ability, and an AUC below 0.6 indicates a model that is inappropriate for the prediction of a financial crisis [100]. As with other metrics, a high AUC value does not necessarily guarantee the top quality of the model. For example, there are situations where the sensitivity is only a few hundredths and the specificity is over 90%, and the AUC is still above 0.8.
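The construction of the ROC curve and the AUC can be sketched in Python as follows. The sketch works with rates between 0 and 1 rather than percentages, treats a score at or above the cut-off as a predicted crisis, and integrates with the trapezoidal rule; these conventions, the function names and the toy scores are assumptions for illustration.

```python
def roc_points(actual, scores):
    """(1 - specificity, sensitivity) for every distinct cut-off, from high to low."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(actual, scores) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(actual, scores) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy labels (1 = crisis) and model scores.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
points = roc_points(actual, scores)
print(round(auc(points), 4))
```

For these toy data the AUC is 8/9, which equals the share of (crisis, sound) pairs in which the enterprise in crisis receives the higher score.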
The Gini coefficient is derived from the AUC and measures the inequality among the values of a frequency distribution. The Gini coefficient is calculated using the formula:
$$\mathrm{Gini} = 2 \cdot \mathrm{AUC} - 1$$
A Gini coefficient of more than 60% shows a good predictive ability of the model.
The mean squared error (MSE) measures the average squared difference between the estimated values and the actual values:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
where $y_i$ is the observed value of the variable being predicted and $\hat{y}_i$ is the predicted value. MSE is highly affected by outlier values [101].
The root mean squared error (RMSE) is defined as the square root of the MSE. It aggregates the magnitudes of the errors in predictions and measures the accuracy, which is used to compare forecasting errors of different models for a particular dataset [102].
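The Gini coefficient, MSE and RMSE follow from a few arithmetic steps, sketched here in Python. The relation Gini = 2·AUC − 1 is the usual way of deriving the Gini coefficient from the AUC and is assumed here; the function names and toy values are illustrative.

```python
import math


def gini_from_auc(auc_value):
    """Gini coefficient derived from the AUC (assumed relation: 2 * AUC - 1)."""
    return 2 * auc_value - 1


def mse(actual, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of the MSE, expressed in the units of the predicted variable."""
    return math.sqrt(mse(actual, predicted))


actual = [1, 0, 0, 1]
probs = [0.9, 0.2, 0.1, 0.6]
mse_value = mse(actual, probs)
rmse_value = rmse(actual, probs)
print(round(gini_from_auc(0.8), 2), round(mse_value, 3), round(rmse_value, 4))
```

Under this relation, the 60% Gini threshold mentioned in the text corresponds to an AUC of 0.8.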

Results and Discussion
All the selected enterprises are considered using the logistic regression, random forests and neural networks models, which are assessed by the determined metrics. The result is the determination of the most accurate and relevant prediction model in the conditions of Slovak industrial enterprises, enabling the prediction of financial distress in the upcoming period.
We started the analysis with fourteen predictors; however, in the process of model development, some of the predictors were removed. Models were created and tested in R with the H2O package. The H2O package allows setting the maximum number of predictors used when creating the LR model [103]. When developing the model, all predictors were offered and the maximum allowed number was varied (the max_active_predictors option). Switching from five to six predictors showed that using six or more predictors does not improve the resulting model enough to justify the added complexity, so the simpler model is preferable. The max_active_predictors option cannot be used in RF and NN models; the model development process for them was similar but more time-consuming. Table 4 shows the final use of predictor variables in each considered model. A plus sign indicates that a financial predictor was used in the final model; predictors with a minus sign were not included. The combination of plus and minus signs indicates that different financial predictors are important for each model.
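The stopping rule described above (stop adding predictors once an extra predictor no longer improves the model) can be sketched as follows. This is not the authors' procedure or an H2O call: the function name, the tolerance and the AUC values are invented purely for illustration.

```python
def smallest_sufficient_size(metric_by_size, tolerance=0.001):
    """Smallest predictor count after which one more predictor
    improves the metric by no more than `tolerance`."""
    sizes = sorted(metric_by_size)
    for current, following in zip(sizes, sizes[1:]):
        if metric_by_size[following] - metric_by_size[current] <= tolerance:
            return current
    # Every step still improved the metric: keep the largest size tried.
    return sizes[-1]

# Made-up validation AUC per maximum number of active predictors.
illustrative_auc = {3: 0.90, 4: 0.93, 5: 0.95, 6: 0.9502, 7: 0.9503}

print(smallest_sufficient_size(illustrative_auc))
```

With these made-up values the rule selects five predictors, because moving to six adds only 0.0002 to the metric.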
In Table 5, confusion matrices for the models are portrayed; real yes/no information about the corporate financial crisis is on the left side of the table, and predicted yes/no information is on the top. The max min per class accuracy criterion specifies the threshold at which the accuracy of the worse-classified class is as high as possible. Performance characteristics are shown in Table 6. The symbols ↑ (↓) indicate that the larger (smaller) number is better. The comparison of the methods in 2017 reveals that better results were achieved by the new learning algorithms (NN and RF) according to all selected metrics. In terms of prediction accuracy, the neural network model has the best predictive ability measured by almost all performance characteristics (except for mean per class error).
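The threshold criterion and the mean per class error mentioned above can be made concrete with a short Python sketch. It assumes that the criterion picks the threshold maximizing the lower of the two per-class accuracies and that mean per class error is the average of the two per-class error rates; the helper names and the sample data are hypothetical.

```python
def confusion_counts(y_true, scores, threshold):
    """Return (tp, fn, fp, tn) when scores >= threshold are predicted as 1."""
    tp = fn = fp = tn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn

def per_class_accuracy(y_true, scores, threshold):
    """Accuracy on the positive class (sensitivity) and the negative class (specificity)."""
    tp, fn, fp, tn = confusion_counts(y_true, scores, threshold)
    return tp / (tp + fn), tn / (tn + fp)

def max_min_per_class_threshold(y_true, scores):
    """Threshold at which the worse of the two class accuracies is highest."""
    return max(sorted(set(scores)),
               key=lambda t: min(per_class_accuracy(y_true, scores, t)))

def mean_per_class_error(y_true, scores, threshold):
    """Average of the two per-class error rates."""
    sens, spec = per_class_accuracy(y_true, scores, threshold)
    return 1.0 - (sens + spec) / 2.0

# Hypothetical labels (1 = distressed) and model scores.
labels = [1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2, 0.1]

best = max_min_per_class_threshold(labels, scores)
print(best, confusion_counts(labels, scores, best),
      mean_per_class_error(labels, scores, best))
```

For this sample the selected threshold is 0.6, where all three distressed enterprises and three of the four healthy ones are classified correctly, giving a mean per class error of 0.125.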
The situation is similar in 2018, with slightly different results. The NN and RF models predict financial distress more accurately according to almost all performance characteristics. Neural networks outperform the other two methods in almost all analyzed metrics except for mean per class error, where the best result was achieved by RF (followed by LR).
Comparing the results of the neural network and logistic regression models in the analyzed period, NN models show 2%-22% better results (depending on the performance metric); the biggest differences are in the results of LogLoss and R². These findings correspond with the results of Barboza et al. [71], who conducted intensive research evaluating bankruptcy using traditional statistical techniques and early artificial intelligence models and found that machine learning models show 10% better accuracy than traditional models (LR and MDA).
The difference between the two learning algorithms, NN and RF, does not usually exceed 5%, and the differences in the metrics achieved are even smaller (about 2%). Naidu and Govinda [104] affirm that artificial neural networks and random forests have proven to be more efficient than the traditional algorithms. Thus, the neural network models have the advantage of being able to detect non-linear relationships and show better performance, describing the blatant information in corporate failure prediction problems [105].
The results portrayed in Tables 5 and 6 indicate that, measured by the selected statistical performance characteristics, the neural network model ranks first, the random forest model second and the logistic regression model third. Additionally, the models work more accurately with data from 2018 than with the data from 2017, even though they are built on the data from 2016. The most likely reason is that the class proportions of the response variable are more similar between the 2016 and 2018 data than between the 2016 and 2017 data.
Our findings correspond with the results of Lee et al. [106], who analyzed the performance of discriminant analysis, logistic regression and neural networks in the context of Korean enterprises and confirmed the importance of neural networks in predicting bankruptcy. Bagheri et al. [107] affirm, based on a dataset of 80 Tehran enterprises, that artificial neural networks have higher accuracy than logistic regression models in bankruptcy prediction. The comparative analysis of logit and probit models, random forests and artificial neural networks by Karminsky and Burekhin [108] revealed that neural networks outperform other methods in predictive power measured by Gini and AUC coefficients. As this study was conducted on a sample of Russian industrial companies, its agreement with our findings is particularly relevant. Moreover, the authors added that there is no significant impact of non-financial indicators on the probability of bankruptcy. Chaudhuri and De [109] stress that artificial neural networks have become a dominant modeling paradigm. Their study of the 50 largest bankrupt organizations with capitalization of no less than $1 billion underlines the relevance of neural network models used for bankruptcy prediction. However, they claim that the choice of appropriate parameters plays an essential role in the performance of the model.
The results of our analysis are contrary to the study of Chen [110], who reported that traditional statistical methods are better suited to handling large samples without sacrificing prediction performance, while learning algorithms achieve better predictive ability with a smaller dataset. However, our dataset of more than 50,000 enterprises shows the opposite.
Each model has its advantages and all models have the potential to be used in practice as decision support tools. Neural network models provide better results, but logistic regression is more convenient to use in practice. It is noteworthy that the models preferred slightly different predictors. This results from the fact that the models have different abilities to detect relationships between predictors and outcome, as well as interactions among predictor variables.

Conclusions
The vast majority of enterprises assume that their lifetime is unlimited and that they will bring continuous benefits to their owners, creditors and stakeholders in the form of profit, rising market value of the enterprise and a growing or, at least, non-declining number of employees. However, due to the entropy and turbulence of the economic and political environment in which companies operate, they may face the loss of key customers, changes in crucial macroeconomic fundamentals, a reduction of expected returns, an increase in costs or the emergence of new, unexpected expenses. The cardinal question is whether the financial distress of enterprises can be predicted sufficiently far in advance and with appropriate accuracy. This question is addressed by models of bankruptcy prediction. The models differ in several aspects, following the economic and legislative principles of the country in which they were formed. Moreover, they use various financial ratios as potential predictors of financial distress. The most important role is played by the model's statistical principle, which is used to predict the financial crisis of enterprises.
Until recently, the dominant bankruptcy prediction methods were based on statistical modeling; the most frequently used were multiple discriminant analysis and logistic regression models, having much better prediction accuracy. However, lately, models based on machine learning have been proposed and have been successfully used for many classification and regression problems. Moreover, machine learning models often outperform traditional classification methods. The purpose of bankruptcy prediction is to reveal the future financial development and perspective of enterprises. Therefore, in this study, a comparison of traditional (logistic regression) and new learning algorithm methods (random forest and neural network) was conducted to reveal their prediction ability and accuracy in the conditions of Slovak industrial enterprises. Comparing the methods across a range of metrics, the new machine learning models show higher predictive performance; in particular, the neural network model yielded better results measured by almost all performance characteristics. The accurate prediction of corporate bankruptcy for enterprises operating in specific industries is crucial for creditors and stakeholders as a means of reducing potential risk. The results of Lee and Choi [111] indicate that prediction using industry samples outperforms prediction using the entire sample of enterprises and that the best predictive accuracy is achieved by the neural network model. Our results underline the importance of the neural network model for bankruptcy prediction and highlight its relevance for assessing the financial distress of industrial enterprises.
Identification of the most relevant and accurate method is useful for forming a model that predicts the financial distress of industrial enterprises in the specific national environment of Slovakia, which has not been developed yet. Despite the wide range of performance characteristics used to compare the models, several more methods used for bankruptcy prediction could be included in the comparison, e.g., multiple discriminant analysis, probit regression, rough sets, linear programming, principal component analysis, data envelopment analysis and survival analysis. This limitation can be addressed by further research that includes other models, not only the most frequently used ones, and investigates the prediction accuracy of the models over a longer time horizon.