The possibility to predict the development of the company’s financial situation for the future is based on the analysis and evaluation of the financial results achieved by the company at present. These are the symptoms of its further development [1
]. Correct identification and subsequent interpretation of the indicators of future financial development of the company is the essence of prediction financial analysis. It aims to point out possible threats in the current development of the company in advance to enable it to take action to prevent serious problems. Financial analysis should, therefore, be carried out in every company regardless of its size, business activity or other specifics. These should, however, be taken into account in the content, extent and depth of the analysis carried out.
Predicting the financial situation of companies is a relatively young field of economic research. Its origins date back to the 1930s, but the first prediction models did not appear until the 1960s. The first study that sought the main differences between companies with and without financial problems, based on an analysis of financial ratios, is the work of Fitzpatrick [2
]. Since then prediction financial analysis has undergone significant development, from one-dimensional and multidimensional discriminant analysis through logistic regression to artificial intelligence. The method of discriminant analysis was used for the first time by Beaver [3
], who also formed the basis for prediction models. Based on his research in 1968, Altman used multivariate discriminant analysis to develop probably the most famous bankruptcy prediction model [4
]. Ohlson [5
] was the first to use logistic regression to build a model predicting the probability of company failure.
At present, experts hold different views on the various methods for predicting the financial situation of companies. Many of them strive to find a reliable tool that can predict a possible threat of corporate bankruptcy [6
]. Some authors deal with the possibility of using models developed in the last century for predicting bankruptcy of current companies. This results in different adjustments and recalculations in the original models. Other authors focus on creating new models using new ratios and modern methods.
As a result of the development of artificial intelligence, new methods such as machine learning techniques, neural networks and genetic algorithms are being introduced into prediction financial analysis [7
]. These authors argue that artificial intelligence methods, in comparison with traditional mathematical and statistical methods, can more accurately classify companies as prosperous and non-prosperous [7
]. For this reason, according to these authors, more research should be directed at applying these methods to the prediction of the financial health of companies [9
]. On the other hand, some experts point out the complexity of artificial intelligence methods. They claim that traditional mathematical and statistical methods are comparable to artificial intelligence methods in terms of the accuracy of company classification. As a result, many prediction models based on traditional techniques are still being developed around the world. Given the different opinions of experts on the various prediction methods, it can be argued that every method has its advantages, disadvantages, and limitations of use [10
]. The ongoing research in this area shows that the topic remains relevant. In any case, predicting the financial situation of a company will remain topical because of its importance not only for the company itself but also for all entities that come into contact with it [11
The relevance of the topic analyzed in this study follows from the interest of scientists and economists in many countries in predicting the financial difficulties of companies. History has shown that prediction models lose classification ability when applied in another period and especially in another country [12
]. Models created under the conditions of one country's economy may not predict the financial difficulties of companies in another country well enough. Therefore, this study aims to create a prediction model that accurately captures the economic situation of real Slovak companies. One benefit of the study is its use of a database of real companies operating in Slovakia in 2016–2018, the period of interest. The study focuses on small and medium-sized Slovak enterprises (SMEs), which make up 98% of all enterprises in Slovakia. The main contribution of the article is the combination of logistic regression and discriminant analysis, taking into account not only the financial ratios of the companies but also their size and the economic activity in which they operate. Combining the methods and using more detailed company characteristics is intended to achieve a high ability to predict a company's future financial troubles. This classification performance is achieved by adding the estimated probability of financial difficulties as a new variable, i.e., additional information on the enterprise, to the discriminant model. We consider the use of both methods in a two-step model creation process to be innovative in this area. A further benefit of the study is the creation of two prediction models for one year in advance and one model for two years in advance, all using the combination of the mentioned methods.
The rest of the paper is organized as follows. The literature review highlights some interesting studies in the field of prediction of bankruptcy or financial difficulties of companies in various countries in the world and also in Slovakia. The second part of the article describes the base of the statistical methods used and the main characteristics of the database. The third part of the article states the result of the prediction models for Slovak SMEs based on the combination of logistic regression and discriminant analysis. The discussion compares the prediction ability of the models with other models created before and also states the main strengths and weaknesses and further possible continuation of the study. The conclusion part summarizes the results.
The prediction of bankruptcy is a topic that has been analyzed in recent years by economists in different countries of the world. Different methods of creating the prediction models are used, whether historically known methods of discriminant analysis and logistic regression, or even more modern methods of neural networks, genetic algorithms, classification trees, and random forests. For example, the study [15
] compares three models of predicting corporate financial distress of French companies created by the discriminant analysis, logistic regression and random forests. The results show that the best classification results are given by the random forest method. In a study [16
], the authors compare the logit model and data mining models in the field of prediction of bank failures in the USA. They found that the logit model predicts bank failures less precisely than data mining models, but on the other hand, produces fewer missed failures predictions. In [17
], the author analyzes how time-varying variables and changes in the macroeconomic environment affect the probability of companies’ financial distress in Croatia. In his study, discrete-time hazard models using logit were applied to demonstrate that the probability of distress is influenced not only by firm-specific variables but also by macroeconomic variables. In [18
], the author developed models of crisis diagnostics for Russian companies renting commercial real estate. These models are created by discriminant analysis and logistic regression. In a study [19
], the author uses the bootstrapping method and the multivariate discriminant analysis to compare the prediction ability of models created for specific industry with general models created in Poland. Another aim of the research was to define the determinants of joint-stock company bankruptcy in three particular industries of the Polish economy. In [20
], the authors apply a multivariate discriminant analysis to differentiate defaulted nations from the non-defaulted ones. Their results indicated discriminant analysis as the most suitable method for insolvency prediction compared to other methods like probit and logit model. In a study [21
], the authors prepared a model to predict the risk of encountering financial difficulties in Estonia by using discriminant analysis and logit analysis. The authors of [22
] focus their study on a credit risk modelling for prediction of default of companies operating in Romania. The estimations were performed first using logistic regression, and then by artificial neural networks method. A linear discriminant analysis, logistic regression, classification trees and the method of nearest neighbors are applied in [23
] to analyze possible bankruptcy signals and to evaluate the financial condition of the selected sector consisting of transport, spedition, and logistics for entities from Poland and Slovakia. In their study [24
], the authors deal with sovereign credit ratings (CR) and the development of the financial market in the European region and state that ratings serve two main purposes: they verify the financial condition of a country and signal a change in the prevailing financial condition. Using a non-linear panel cointegration approach, the authors investigated whether there is a long-run relationship between sovereign CR and financial market development (FMD) in the European region and found an asymmetric effect in both the long and the short run. In [14
], the authors verify five traditional bankruptcy models on the sample of Czech construction SME companies and then create a new model for construction companies. The comparison of the prediction accuracy of the verified models shows that the correct classification fluctuates between 50% and 94% for non-failed companies and between 10% and 86% for failed companies one year in advance. Their new prediction model achieves 77% accuracy for non-failed companies and 86% for failed ones. In a study [25
], the authors compare 5 different methods for creation of the bankruptcy prediction model, considering up to 10 years before the event in an open, European economy. The authors found that logistic regression and neural networks achieve better results than other approaches. Several authors have published a systematic review of articles in recent years with the issue of models predicting the financial difficulties of companies, for example [13
Some prediction models have also been created in Slovakia recently. In addition to already known models of Chrastinova [30
], Gurcik [31
], Hurtosova [32
] and Gulka [33
], several authors have tried to create a prediction model with the best classification ability. In [34
], the authors developed models for bankruptcy prediction of Slovak companies using logit and probit methods and provided the comparison of overall prediction ability of the two developed models. The logit model for prediction of non-prosperity of Slovak companies one year in advance has the overall prediction accuracy of 86.7%. The study [35
] is focused on the design of bankruptcy models, specifically the selection of suitable predictors to verify whether bankruptcy predictors are industry-specific. Another objective of their research was to determine indicators that can detect signs of bankruptcy earlier than one period before bankruptcy. They found that the application of industry-specific determinants and indicators that can detect signs of bankruptcy for more than one year before can help to increase the prediction accuracy of bankruptcy models for Slovak companies. In [36
], the authors analyze the impact of trend variables on the prediction ability of the models constructed using discriminant analysis and decision trees. They developed a new model for Slovak companies using the decision tree technique. The study [37
] was also dedicated to the development of bankruptcy prediction models in the Slovak Republic. The study is focused on the comparison of the overall prediction performance of the two developed models: the first one is estimated via discriminant analysis, the other one is based on logistic regression. Also, the authors developed new prediction models for various economic sectors in the studies [36
]. In [37
], the author creates the discriminant analysis model with the overall prediction accuracy of 64.41% and the model of logistic regression with the overall prediction accuracy of 68.64%. The study [38
] contains data on Slovak companies from the years 2002–2005. One of the resulting models, a logit model, achieved an overall correct classification of 90.24%. In a study [39
], the authors created their model using discriminant analysis, covering the years 2002 and 2003 and achieved the overall prediction accuracy of 87.1%. The study [40
] was based on a dataset of companies from 2002–2004. The final model achieved 85.6% correct classification for all companies in the test sample. The study [41
] covers the years 2011 and 2012; its discriminant analysis model achieved 88% correct classification of all companies, and its logistic regression model 82%. In this context, we consider the application of a combination of two methods to be a novelty of this study. Such an approach has not yet been applied in the studies of Slovak authors and, as it turned out, helps to improve the classification results of models. This is the scientific gap that this study seeks to fill. The use of an extensive database of real Slovak companies is another contribution toward this goal.
Also, several authors in Slovakia have dealt with the evaluation and applicability of already created models in the conditions of Slovakia. The study [42
] examines the application of the models of multivariate discriminant analysis for companies. In [10
], the authors summarize the advantages and disadvantages of the most used methods in this field: Logistic regression, discriminant analysis, decision trees, and neural networks. The study [43
] examines the relationship between the prediction ability of the models of various authors and the number of variables in them. The study [44
] deals with the comparison between the results of prediction models and the results of the method for the determination of the value of the company. In [45
], the authors apply the well-known Taffler model and the Springate model to validate companies from the electrical engineering industry and conclude that the sector appears to be financially healthy, although the models predict expected difficulties for about a quarter of the companies analyzed. In [46
], the authors also created a prediction model for Slovak SMEs using data from the years 2008–2011 and achieved an overall prediction power of 88%.
2. Materials and Methods
Financial analysis enables a company to evaluate its financial health by identifying strengths that it can use to its advantage and weaknesses that are not yet very visible but could cause significant problems in the future. The results of the financial analysis give managers the information they need to make optimal choices [13
]. Basically, the most sophisticated and the most used financial analysis tools are financial ratios that will be used as predictors in the model for Slovak SMEs also in this study. In [47
], the authors state that financial ratio models are one of four classes of models: financial ratio models, cash flow models, market-adjusted returns models, and standard deviation models. They find that financial ratio models perform best in the year before bankruptcy. Financial indicators, and models based on them, are generally considered the primary and most important predictors of financial stability and economic activity when predicting a company's financial health. Other models are, in fact, also based on adjusted financial indicators, so we regard them as complementary. In addition, models based on market-adjusted returns are not well applicable in countries such as Slovakia because of the low number of publicly traded companies [1
Eleven suitable predictors were selected; their names and calculation formulas are listed in Table 1.
When selecting appropriate predictors of financial health, we used the research of several authors who discussed this issue, for example, [13
]. The predictors in Table 1 were selected according to these analyses, followed by logical controls and accuracy checks of the values of the financial ratios, economic controls, and a selection of variables from a mathematical-statistical point of view, with emphasis on the economic importance of the selected ratios. Table 2 shows the characteristics of the variables used in the analysis.
Multivariate discriminant analysis and logistic regression are undoubtedly among the most widely used methods for predicting the financial difficulties of companies. Discriminant analysis is the oldest method used to discriminate objects [51
]. Discriminant analysis generally identifies the ability of the variables to distinguish existing, previously known, groups of statistical units in the base file and to formulate the classification rule for their allocation to these groups. This results in the two functions of discriminant analysis—descriptive function and predictive function [52
]. If we want to identify the differences between the known groups and quantify them with a set of relevant explanatory variables, we talk about a descriptive discriminant analysis [53
]. If we use discriminant analysis to create a classification rule by which we can predict the inclusion of a new object into one of the existing groups based on the values of its explanatory variables, we talk about a predictive discriminant analysis [52
]. In assessing the financial health of a company, multivariate discriminant analysis classifies the company into one of two groups: prosperous (financially stable, healthy) or non-prosperous (unstable, in financial trouble). The classification is carried out by estimating the value of a variable Y as a function of selected financial and economic indicators, which act as explanatory variables [55
]. Therefore, the main goal of applying discriminant analysis is to construct a discriminant function as a linear combination of the discriminating variables that ensures the best separation of companies into the two groups. Despite the numerous assumptions that discriminant analysis places on the data, this method is very popular and widely used. This is due to the relatively simple interpretability of its findings, even though some statistical techniques can provide better results [54
Logistic regression (or simply “logit”) belongs to the group of generalized linear models. It is appropriate when the dependent variable is categorical rather than continuous. Its main aim is to find a model that describes the relationship between the explained variable and a group of explanatory variables. The simplest case is a binary dependent variable, which takes only two values. Such a variable corresponds to the task of predicting the probability of non-prosperity of a company from its financial and economic ratios or other characteristics. Logistic regression can be used in similar situations as discriminant analysis. Compared with discriminant analysis, it has the advantage of less restrictive input assumptions. The classification ability of a logistic regression model is usually better than that of a discriminant analysis model. The role of logistic regression is to estimate the probability that a company will be non-prosperous given the known values of its explanatory variables [52
]. In this study, we applied a combination of both of the abovementioned methods. In the first step of the analysis, we used logistic regression to predict the probability of non-prosperity for each company, given the values of its financial indicators. We then used this probability as one explanatory variable in the discriminant model. We expected this combination to improve the classification results because the probability from the first step is estimated using all relevant data about the company: all the relevant variables at our disposal contributed to the estimated probability of financial difficulties. We did not use a threshold (usually 0.5) to classify the companies based on their probability. Instead, the estimated probabilities were added as a further variable to the discriminant model to improve its classification ability, because the linear combination of all significant variables carries additional information on the enterprise. We expected this additional information to be important and influential, with a significant impact on the resulting discriminant score, i.e., on the classification of the company into one of the groups. In both models, a stepwise method was used to search for a set of significant variables, with the contribution of each variable tested at a significance level of 0.05. Other tests were also performed at a significance level of 0.05. Among the step-based methods of model creation, we consider the stepwise method to be the best choice. Unlike backward elimination and forward selection, it retests the significance of previously included variables at each step, since that significance can change when other variables are added. The inclusion of a variable is therefore not definitive, which is an advantage of the method.
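The two-step procedure described above can be written compactly as follows. This is only a notational sketch: the symbols $x_{ij}$, $\beta_j$, $a_j$, $\hat{p}_i$, $Z_i$ and the index sets $S_1$, $S_2$ (the variables retained by the stepwise selection in each step) are introduced here for illustration and are not taken from the original text.

```latex
% Step 1: logistic regression estimates the probability of non-prosperity of company i
\hat{p}_i = \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{j \in S_1} \beta_j x_{ij}\right)\right)}

% Step 2: the estimated probability enters the discriminant function as an additional variable
Z_i = a_0 + \sum_{j \in S_2} a_j x_{ij} + a_p\,\hat{p}_i
```

In this notation, the probability $\hat{p}_i$ is never compared with a threshold such as 0.5; the final classification is based on the discriminant score $Z_i$ alone.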
The classification ability of the models is evaluated by a classification table and by the ROC curve with its AUC measure. The classification table contains the absolute and relative frequencies of companies correctly and incorrectly classified into the groups of prosperous and non-prosperous companies. The ROC curve indicates the quality of the prediction model: the closer the AUC value is to 1, the better the prediction power of the model [16
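To make the two evaluation measures concrete, the following minimal Python sketch builds a classification table and computes the AUC on a toy sample. The data, the function names, and the 0.5 cut-off used to obtain predicted classes are illustrative assumptions only; the study itself performed these calculations in SPSS.

```python
def classification_table(y_true, y_pred):
    """Count (actual, predicted) pairs; 0 = prosperous, 1 = non-prosperous."""
    table = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for actual, predicted in zip(y_true, y_pred):
        table[(actual, predicted)] += 1
    return table


def overall_accuracy(table):
    """Share of companies on the diagonal of the classification table."""
    return (table[(0, 0)] + table[(1, 1)]) / sum(table.values())


def auc(y_true, scores):
    """AUC as the share of (non-prosperous, prosperous) pairs in which the
    non-prosperous company has the higher score; ties count as one half."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): actual status and a model score for six companies.
y_true = [0, 0, 0, 1, 1, 0]
scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.05]
y_pred = [1 if s > 0.5 else 0 for s in scores]  # illustrative cut-off

table = classification_table(y_true, y_pred)
accuracy = overall_accuracy(table)
area = auc(y_true, scores)
```

The pairwise definition of the AUC used here is equivalent to the area under the ROC curve and keeps the example free of external libraries.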
To create prediction models for SMEs in Slovakia, we used data from the financial statements of real Slovak companies. The source of the data was the Amadeus database. We focused on the years 2016–2018: the financial indicators of the companies are measured in 2016 and 2017, and the outcome variables, indicating the prosperity or non-prosperity of the company, are set in 2017 and 2018. After preparation of the dataset (correctness checks, missing data analysis, and analysis of outliers), the database contains a total of 75,652 companies. The numbers and frequencies of prosperous and non-prosperous companies are given in Table 3.
A company has the status “non-prosperous” if it fulfils the following individually determined criteria, based on the valid legislation of the Slovak Republic. From 1 January 2016, the amendment to Act no. 513/1991 Coll., the Commercial Code, as amended, introduced the institute of a “company in crisis” in §67a to 67i. According to this amendment, a company is in crisis if it is in the state of bankruptcy or is threatened by bankruptcy. The bankruptcy of a company is governed by Act no. 7/2005 Coll. on Bankruptcy and Restructuring, as amended. Under §3 of this Act, a company is in bankruptcy if it is indebted or unable to pay. This means that it has more than one creditor and the value of its liabilities exceeds the value of its assets (i.e., it has negative equity), or that it is unable to pay at least two financial obligations to more than one creditor for 30 days after the due date. We approximated this fact by a variable from Table 1, determining the breakpoint of this indicator at level 1. Companies are at risk of bankruptcy if the solvency ratio (Solvency ratio (liability based) = Shareholders funds/(Non-current liabilities + Current liabilities) × 100) is less than 0.06 in 2017, or less than 0.08 in 2018 (Act no. 513/1991 Coll.) [13
]. If at least one of the three conditions was not met, the company was considered “prosperous”. The companies were thus classified into two groups, non-prosperous (unhealthy, in financial trouble) and prosperous (non-default, healthy), in the context of legislative adjustments [56
]. We take into account all standardized legal forms of companies with non-consolidated financial statements. The sample does not contain self-employed persons and general partnerships, as the legislative stipulations about a company in crisis do not apply to these two forms.
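The labelling rule described above can be illustrated by a short Python sketch. It rests on several assumptions that are not confirmed by the text: the name `payment_indicator` stands for the unnamed Table 1 variable with breakpoint 1, values below 1 are assumed to signal inability to pay, the solvency ratio is assumed to be expressed as a fraction so that it can be compared with 0.06 and 0.08, and liabilities are assumed to be positive. The rule that all three conditions must hold follows the statement that a company is prosperous if at least one condition is not met.

```python
# Solvency-ratio limits by outcome year, as quoted in the text.
SOLVENCY_LIMIT = {2017: 0.06, 2018: 0.08}


def company_status(equity, liabilities, payment_indicator, year):
    """Return 1 (non-prosperous) only if all three conditions hold, else 0.

    equity            -- shareholders' funds
    liabilities       -- non-current + current liabilities (assumed > 0)
    payment_indicator -- hypothetical stand-in for the Table 1 variable
    year              -- year in which the status is set (2017 or 2018)
    """
    # Condition 1: liabilities exceed assets, i.e. negative equity.
    negative_equity = equity < 0
    # Condition 2 (ASSUMPTION): indicator below the breakpoint 1.
    unable_to_pay = payment_indicator < 1.0
    # Condition 3 (ASSUMPTION): ratio taken as a fraction, not a percentage.
    solvency_ratio = equity / liabilities
    at_risk = solvency_ratio < SOLVENCY_LIMIT[year]
    return 1 if (negative_equity and unable_to_pay and at_risk) else 0


# Hypothetical companies.
status_a = company_status(equity=-10.0, liabilities=100.0,
                          payment_indicator=0.5, year=2017)
status_b = company_status(equity=50.0, liabilities=100.0,
                          payment_indicator=0.5, year=2017)
status_c = company_status(equity=-10.0, liabilities=100.0,
                          payment_indicator=1.5, year=2018)
```

The returned value matches the coding of the binary output variable, 0 for a prosperous and 1 for a non-prosperous company.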
Thus, more than 89% (in 2017) and 87% (in 2018) of Slovak SMEs are prosperous, and the rest, nearly 11% and 13% respectively, are in financial trouble. This status is given in the database by a binary output variable that takes the value 0 for a prosperous company and 1 for a non-prosperous company.
Together, three prediction models using the combination of methods were created:
1-year prediction model using the data from 2016 (and prosperity of the company from 2017);
1-year prediction model using the data from 2017 (and prosperity of the company from 2018);
2-year prediction model using the data from 2016 (and prosperity of the company from 2018).
First, we created the logistic regression models using the forward method (likelihood ratio based), with a significance level of 0.05 for the variables in the model. As predictors, we used all the mentioned financial ratios, the type of economic activity of the company given by its NACE code, and the size of the company (small or medium-sized). All calculations in this study were performed using SPSS software. The Nagelkerke R-Square, which is an alternative to the R-Square from linear regression, is 0.393, 0.583, and 0.437 for the three models, respectively (Table 4
). This value is the share of the “pseudo-variability” of the dependent variable explained by the model. From previous experience with creating bankruptcy prediction models, we know that such a level is common, so from this perspective we consider it good enough. The classification power of the models was evaluated by the classification table and by the AUC criterion (Table 4).
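For readers unfamiliar with the Nagelkerke R-Square, the following Python sketch shows how it is obtained from the log-likelihoods of the intercept-only model and the fitted model. The function name and the example numbers are hypothetical and are not values from the study; the formula is the standard rescaling of the Cox and Snell measure so that its maximum is 1.

```python
import math


def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke R-Square from two log-likelihoods and the sample size n."""
    # Cox and Snell R-Square: 1 - (L_null / L_model) ** (2 / n).
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    # Largest value Cox and Snell can reach: 1 - L_null ** (2 / n).
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell


# Hypothetical example: 100 companies, half of them non-prosperous,
# so the intercept-only model has log-likelihood 100 * ln(0.5).
n = 100
loglik_null = n * math.log(0.5)
r2_example = nagelkerke_r2(loglik_null, loglik_model=-40.0, n=n)
```

A model that is no better than the intercept-only model gives 0, and a model that fits perfectly gives 1, which is why the measure can be read similarly to the R-Square from linear regression.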
According to the percentage of successful classification of enterprises and according to the AUC criterion, the best one seems to be the 1-year model from 2018, which achieved 90.9% of the correct classification of all enterprises and its AUC is 0.890. However, the other two models also achieved high success.
The resulting logistic models were used to predict the probability of the non-prosperity of every company. Subsequently, we applied this probability as one of the predictors in the discriminant analysis model.
To verify the assumption of equal covariance matrices, we used Box’s test, whose null hypothesis is that the covariance matrices of the groups of companies are equal. The results of Box’s test are in Table 5.
Based on the result of Box’s test, we reject the null hypothesis of equal covariance matrices for the groups of prosperous and non-prosperous companies in all the models. In such a case, the use of quadratic discriminant analysis instead of linear discriminant analysis has to be considered, but it is not widely used in economic practice. It is usually mentioned in theory as a better variant when this assumption is violated, but it is sensitive to deviations from normality and rarely gives better results than linear discriminant analysis. Its results are also more difficult to interpret [52
]. Therefore, we handled this violation within discriminant analysis by using the corresponding option in SPSS, that is, by using the separate-groups covariance matrices instead of the within-groups matrix. Table 6 shows the resulting discriminant equations for the Z-score (discriminant score) of each company. These models were created using the stepwise method of selecting significant variables from the set of financial ratios, NACE categories in the form of dummy variables, the size of the enterprises in the form of dummy variables, and the probabilities obtained from the first step of the procedure. The significance of the contribution of each variable to discrimination was verified at a significance level of 0.05 by Wilks’ Lambda and the F-test of its significance. The stepwise discriminant analysis led to the following resulting equations.
The three probability variables in these equations are the probabilities predicted by the logistic regression models in the first step of the analysis. All three probabilities were statistically significant in the discriminant models at the significance level of 0.05.
Using these equations, we can predict the classification of a company into the group of prosperous or the group of non-prosperous companies. Because the models include constants, the resulting discriminant score is always compared with the weighted average of the group centroids, which is zero in this case. A negative Z-score therefore implies classification into the group of prosperous companies, and a positive value into the group of non-prosperous ones.
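The sign rule for the Z-score can be sketched in Python as follows. The coefficients, the constant, and the variable names (`ratio_a`, `ratio_b`, `p_hat`) are purely hypothetical placeholders and are not the estimated equations of the study; assigning a score of exactly zero to the prosperous group is likewise an assumption, since the text does not cover that case.

```python
def z_score(constant, coefficients, values):
    """Discriminant score: constant plus the weighted sum of the variables."""
    return constant + sum(coefficients[name] * values[name]
                          for name in coefficients)


def classify(z):
    """Positive score: non-prosperous; otherwise prosperous
    (a score of exactly 0 is assigned to prosperous by assumption)."""
    return "non-prosperous" if z > 0 else "prosperous"


# Hypothetical discriminant equation with the first-step probability p_hat.
constant = -1.0
coefficients = {"ratio_a": -1.2, "ratio_b": 0.8, "p_hat": 3.0}

# Hypothetical company.
company = {"ratio_a": 0.10, "ratio_b": 0.50, "p_hat": 0.70}

z = z_score(constant, coefficients, company)
group = classify(z)
```

Because the cut-off is fixed at zero, the constant of the equation carries the shift that would otherwise have to be applied to the threshold.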
The quality of the models and the impact of the predictors on the Z-score of a company are evaluated by the canonical correlation coefficients (which describe the correlation between the discriminant function and the explanatory variables used in the models), the significance of these correlations, the values of the standardized canonical discriminant function coefficients, and the correlation coefficients between the discriminant function and the explanatory variables (used to assess the discrimination ability of the variables in the resulting model). Table 7 shows these characteristics of the models and the three variables with the highest correlation with the discriminant function for each model.
Based on past experience, the canonical correlation coefficient in bankruptcy prediction models based on discriminant analysis is usually at this or a very similar level. On this basis, we consider the discriminant function good enough. Moreover, all of these characteristics are statistically significant. The probabilities of non-prosperity predicted in the first step of the analysis have among the highest coefficients (the highest in the 1-year prediction model in 2017 and in the 2-year model, and the third highest in the 1-year model in 2018). They are also strongly and positively correlated with the discriminant function. Among the other variables, two have a significant correlation with the discriminant function and are repeated in two of the models: a rentability ratio, whose correlation is negative, and an indebtedness indicator, whose correlation is positive. These signs have a logical interpretation given the nature of the two ratios.
The classification ability of the models is quantified in Table 8 by the percentage of correct classification and by the AUC criterion.
A comparison of the percentage of correct classification and the AUC criterion of the logistic regression model created in the first step of the analysis with those of the subsequent discriminant model shows that combining the methods improved both the prediction ability and the AUC of the model of non-prosperity of Slovak SMEs.
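For readers less familiar with the two criteria, the following minimal Python sketch shows how the percentage of correct classification and the AUC can be computed. The labels and scores are invented for illustration and are not data from this study. Here the AUC is computed as the share of pairs in which a non-prosperous company (label 1) receives a higher score than a prosperous one (label 0), with ties counted as one half.

```python
def percent_correct(labels, predictions):
    """Percentage of companies whose predicted group equals the actual group."""
    hits = sum(1 for actual, predicted in zip(labels, predictions) if actual == predicted)
    return 100.0 * hits / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the pairs (non-prosperous, prosperous) in which the non-prosperous
    company has the higher score; ties contribute one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented example: 1 = non-prosperous, 0 = prosperous.
labels = [0, 0, 0, 1, 1, 0, 1, 1]
scores = [0.10, 0.20, 0.60, 0.40, 0.70, 0.30, 0.80, 0.90]

# Turn the scores into group predictions with a 0.5 cut-off (illustrative choice).
predictions = [1 if s > 0.5 else 0 for s in scores]

accuracy = percent_correct(labels, predictions)
area = auc(labels, scores)
print(accuracy, area)
```

The percentage of correct classification depends on the chosen cut-off, whereas the AUC summarizes the ranking quality of the scores over all possible cut-offs, which is why both are reported side by side.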
Prediction models created using one of the two statistical methods applied in this study have been emerging worldwide in recent years. As mentioned above, several authors have also created new prediction models in Slovakia. Comparing our results with other studies, we can say that we achieved very similar or even slightly better classification results. For example, the model in [34] achieved slightly lower classification power than our combined model in 2017. In [38], the model achieved results very similar to those of our models, and the model in [40] has a prediction power similar to that of the logistic regression model that was the first step of our combined models. Both models in [46] have lower classification results than our models in this study.
The combination of logistic regression and discriminant analysis gives the resulting models a very good classification ability with a high AUC score. The classification result of the logistic regression created in the first step of the analysis was improved by the subsequent discriminant analysis, which achieves much better classification results. All three models achieved more than 90% correct classification. We explain the improvement obtained by using the predicted probability of non-prosperity by the fact that this probability incorporates all the relevant data about the company that we had at our disposal at the given moment. All relevant variables contributed to the best possible estimate of the probability of financial difficulties. However, we did not classify the enterprises in the usual way based on the predicted probability, where the cut-off is usually a probability of 0.5. Under that rule, enterprises with a probability of non-prosperity higher than 0.5 are classified as non-prosperous and enterprises with a probability below 0.5 are classified as prosperous. This threshold is very "thin" because, for example, an enterprise with a probability of 0.495 will be classified as prosperous, whereas an enterprise with a probability of 0.505 will be classified as non-prosperous. We therefore did not apply this classification but instead added the probability of every company as another relevant variable to the discriminant model. This discriminant model is finally used to categorize companies into the two groups. The probability, included as one of its explanatory variables, improves its classification ability because it adds information in the form of a linear combination of all the significant variables available for the enterprise. 
The individual variables may prove insignificant in the discriminant model on their own, but in combination with other variables their interactions may be significant. This is also shown in the analysis of the resulting models, where the variable “probability” is among the three most influential variables in all three models. The additional information is therefore important and influential and affects the resulting classification of the company into one of the groups.
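The difference between the usual 0.5 cut-off and the use of the predicted probability as an explanatory variable can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weights, the constant and the second predictor ("other_ratio") are hypothetical and are not the coefficients estimated in this study. The two probabilities, 0.495 and 0.505, are the ones used as an example in the text.

```python
def hard_cutoff(probability, cutoff=0.5):
    """Usual rule: classify only by whether the probability exceeds the cut-off."""
    return "non-prosperous" if probability > cutoff else "prosperous"


def discriminant_score(probability, other_ratio, w_prob, w_ratio, constant):
    """Discriminant score in which the probability is one predictor among others."""
    return constant + w_prob * probability + w_ratio * other_ratio


def classify_by_score(z):
    """Sign rule of the discriminant model: positive score -> non-prosperous."""
    return "non-prosperous" if z > 0.0 else "prosperous"


# Hypothetical weights and constant, chosen only for illustration.
W_PROB, W_RATIO, CONSTANT = 4.0, -2.0, -2.0

# Two companies with almost identical probabilities and the same other ratio.
p_a, p_b = 0.495, 0.505
other_ratio = 0.1

label_cutoff_a = hard_cutoff(p_a)
label_cutoff_b = hard_cutoff(p_b)

score_a = discriminant_score(p_a, other_ratio, W_PROB, W_RATIO, CONSTANT)
score_b = discriminant_score(p_b, other_ratio, W_PROB, W_RATIO, CONSTANT)
label_score_a = classify_by_score(score_a)
label_score_b = classify_by_score(score_b)

print(label_cutoff_a, label_cutoff_b)
print(label_score_a, label_score_b)
```

Under the hard cut-off the two nearly identical companies end up in different groups. In the sketched discriminant score the probability enters as a continuous quantity, so the small difference between 0.495 and 0.505 shifts the score only slightly, and the final group depends on the probability together with the other predictors.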
The achieved classification result is, of course, satisfactory, but it remains true that this total percentage reflects not only correctly classified non-prosperous enterprises but also correctly classified prosperous ones. To predict financial difficulties, however, it is necessary to focus mainly on the correct classification of non-prosperous enterprises, so that the model can correctly predict the threat of problems. In this part of the classification, we see a weaker side of the created models. Nevertheless, a comparison with other studies shows that several authors achieved similar or even worse results with their models, some of which are well known.
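The point that a high total percentage can hide a weaker classification of non-prosperous enterprises can be shown with a small Python sketch. The ten labels and predictions are invented for illustration and do not come from the database used in this study.

```python
def class_rates(labels, predictions):
    """Overall and per-group percentages of correct classification.

    Labels: 1 = non-prosperous, 0 = prosperous.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # non-prosperous found
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # prosperous found
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    return {
        "overall": 100.0 * (tp + tn) / len(labels),
        "non_prosperous_correct": 100.0 * tp / n_pos,
        "prosperous_correct": 100.0 * tn / n_neg,
    }


# Invented example: 8 prosperous and 2 non-prosperous companies.
labels = [0] * 8 + [1] * 2
# All prosperous companies are classified correctly, but only one of the
# two non-prosperous companies is detected.
predictions = [0] * 8 + [1, 0]

rates = class_rates(labels, predictions)
print(rates)
```

In this invented sample the total percentage of correct classification is 90%, although only half of the non-prosperous companies are identified, because the prosperous group dominates the sample.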
We consider it a strength of this analysis that we used an extensive database of real Slovak companies and their actual data from the financial statements for 2016–2018. The database was also subjected to thorough cleaning, preparation, and checking of the accuracy of the records. A weakness of this study, on the other hand, is that many companies had to be removed from the database because of a large amount of missing data. The same applied to several variables that could have been very useful for the analysis but could not be included in the models because of their large percentage of missing data. In the continuation of this research, we will focus on creating prediction models for Slovak SMEs using data mining and artificial intelligence methods.