An Optimal Model of Financial Distress Prediction: A Comparative Study between Neural Networks and Logistic Regression

Abstract: In the face of rising defaults and limited studies on the prediction of financial distress in Morocco, this article aims to determine the most relevant predictors of financial distress and identify its optimal prediction models in a normal Moroccan economic context over two years. To achieve these objectives, logistic regression and neural networks are used based on financial ratios selected by lasso and stepwise techniques. Our empirical results highlight the significant role of two predictors, namely interest to sales and return on assets, in predicting financial distress. The results show that logistic regression models obtained by stepwise selection outperform the other models with an overall accuracy of 93.33% two years before financial distress and 95.00% one year prior to financial distress. Results also show that our models classify distressed SMEs better than healthy SMEs, with type I errors lower than type II errors.


Introduction
Work on financial distress is a topical issue that has attracted the attention of researchers for several decades. Financial distress occurs when a company's current assets can no longer meet its current liabilities (Malécot 1981). The process of financial distress is continuous and dynamic, lasting from a few months to several years, and can ultimately lead to bankruptcy (Sun et al. 2014).
Financial distress can have devastating effects on the company itself and all of its stakeholders (Hafiz et al. 2015). Financial distress prediction studies help companies detect financial difficulties earlier, understand the process of financial distress, and prevent the occurrence of bankruptcy (Crutzen and Van Caillie 2007).
Since the Z-score model proposed by Altman (1968), a great deal of research has focused on the prediction of corporate financial distress using different prediction models. However, most models used are either statistical or based on artificial intelligence (Balcaen and Ooghe 2006). In general, the objective of these predictive tools is to use financial ratios to differentiate between non-distressed and distressed firms and build an explanatory model of business failure (Refait-Alexandre 2004).
The ability of financial ratios to detect early warning signals of business failure has been highlighted by several empirical studies (Bellovary et al. 2007; Altman et al. 2017; Mselmi et al. 2017; Svabova et al. 2020; Kliestik et al. 2020). Predictors of business failure can be classified into two broad categories, namely ratios related to the firm's ability to generate profits (profitability ratios) and those associated with the firm's ability to meet its short-, medium-, or long-term obligations (liquidity and solvency ratios) (Back et al. 1996; Bunn and Redwood 2003; Sharifabadi et al. 2017; Lukason and Laitinen 2019; Valaskova et al. 2018; Kamaluddin et al. 2019).
Even though the study of SMEs is intriguing because their management style is generally focused on the short term and on reaction rather than forecasting, applying predictive techniques to SMEs is more difficult than for large firms because of the lack of available data (Van Caillie 1993; Psillaki 1995; Bellanca et al. 2015). In Morocco, business failure is a growing phenomenon, with an increase of 244% between 2009 and 2019. Despite the preponderant weight of Very Small and Medium-sized Businesses (VSMB) in the Moroccan economic fabric, they are the most affected by business failure (99.7%). The first cause of VSMB mortality is long payment delays (Inforisk 2020; Haut-Commissariat au Plan 2019).
Nevertheless, studies on predicting SMEs' business failure in the Moroccan context are limited. There is a need to determine the relevant ratios of financial distress as well as the development of its prediction models in the Moroccan regions. The development of predictive models of financial distress under unique regional and national conditions allows for a better estimation of financial risks since the accuracy and reliability of these models can vary if they are used in a different context than the one in which they were originally developed. Indeed, empirical works conducted on a single region or country play a crucial role in predicting financial distress (Gregova et al. 2020).
This article aims to determine the most relevant predictors of financial distress and identify its optimal prediction models in Morocco. In particular, this study is conducted to answer the following questions: What are the most relevant ratios of financial distress? Consequently, what are the optimal prediction models of financial distress?
To do so, we use logistic regression and neural networks to develop financial distress prediction models based on financial ratios selected by LASSO (Least Absolute Shrinkage and Selection Operator) and stepwise techniques. The models are built on a sample of 180 SMEs during 2017-2018 including 123 healthy SMEs and 57 distressed SMEs. To address the problem of unbalanced data, we use SMOTE (Synthetic Minority Over-sampling Technique). Our study focuses on the Fez-Meknes region, one of the 12 Moroccan regions. This region is characterized by a high concentration of companies operating in the construction sector (11.2% of companies in the sector are located in this region). It contributes 8.4% of the national Gross Domestic Product (GDP) and is ranked second in terms of contribution to the primary sector by 14.5% (Haut-Commissariat au Plan 2018 2019).
Our contributions to the literature can be listed as follows. First, the estimation results of our models identify predictors that have a significant impact on financial distress, namely interest to sales and return on assets. Second, to the best of our knowledge, no study has ever attempted to apply the lasso technique in the selection of financial distress discriminant variables in Morocco. Indeed, the findings reveal that the lasso technique performs better with neural networks than logistic regression. Third, the results show that logistic regression is a powerful and robust tool for Moroccan SMEs' financial distress prediction. Finally, our models classify distressed SMEs better than healthy SMEs with type I errors lower than type II errors and can be effective to Moroccan creditors.
The rest of the article is organized as follows. Section 2 consists of a literature review on the prediction of business failure as well as the main works that have used neural networks and logistic regression to predict business failure. Section 3 presents the data collection, the variables considered, the methodology used for feature selection and model construction, and the performance metrics used to evaluate our models. Section 4 presents the empirical results. Finally, Sections 5 and 6 present the discussion and conclusions, respectively.

Literature Review
Over the past five decades, numerous studies on the prediction of corporate financial distress have been developed. In the early research of business failure prediction, Beaver (1966) proposed a one-dimensional dichotomous classification based on a single ratio. This method was rarely exploited afterward because of the lack of robustness linked to the uniqueness of the ratio used (Deakin 1972;Gebhardt 1980).
Through multiple discriminant analysis, Altman (1968) was the first to use several ratios simultaneously to predict the failure of firms. The author developed a Z-score model, a linear combination of the selected ratios, which makes it possible to assign the firm to the group to which it is closest (failing firms or non-failing firms). From a sample of 66 firms, the author retained 5 ratios out of 22 potential ratios to construct the Z-score function, namely working capital to total assets, retained earnings to total assets, earnings before interest and taxes to total assets, market value of equity to book value of total debt, and sales to total assets. However, multiple discriminant analysis requires statistical conditions that are generally not satisfied in financial data: the explanatory variables must follow a normal distribution and their variance-covariance matrices must be identical for the samples of non-failing and failing firms. Furthermore, the Z-score model is suitable only for linear classification.
Because the statistical conditions required by multiple discriminant analysis are rarely met in practice, several statistical models have been developed that assume a different distribution of the explanatory ratios, particularly the widely used logistic regression. Logistic regression is a probabilistic method used to treat two-class classification problems such as the prediction of business failure. In the United States, Ohlson (1980) was the first to use logistic regression to predict business failure. Since then, logistic regression has gained popularity and is considered one of the most used methods in predicting business failure worldwide (Shi and Li 2019). Amor et al. (2009) developed a logistic regression model to anticipate the financial difficulties of Quebec SMEs, which are known for their particularities. Based on solvency, liquidity, and profitability ratios, the model achieved an accuracy of 63.63% two years prior to default and 72.84% one year prior to default. Charalambakis and Garrett (2019) employed a multi-period logit model on a sample of 31,000 Greek private firms between 2003 and 2011. The model classified 88% of firms that went bankrupt during the Greek debt crisis as likely to fail. The results showed that the model retains its predictive ability over different time horizons.
In Morocco, Kherrazi and Ahsina (2016) used a binomial logistic regression model to identify the determinants of SME failure in the Gharb-Chrarda-Beni-Hssen region. The results of the model showed that the failure of SMEs in the region is related to the lack of commercial profitability and the lack of permanent funds. On a sample of 2,032 borrowing SMEs and large firms, Khlifa (2017) built a logistic regression model to predict the risk of default of Moroccan firms. The model yielded a classification rate of 88.2% over two years.
Several studies have shown that logistic regression models provide better accuracy than multiple discriminant analysis. In a sample of U.S. banks, Iturriaga and Sanz (2015) obtained 81.73% accuracy by logistic regression one year prior to bankruptcy versus 77.88% for discriminant analysis. This finding is confirmed by Du Jardin (2015) and Affes and Hentati-Kaffel (2019), who showed that logistic regression outperforms multiple discriminant analysis in terms of prediction accuracy.
Given the advancement of computer technology and the dynamism and complexity of real-world financial problems, machine learning techniques have been used for the prediction of corporate failure, including Artificial Neural Network (ANN).
The principle of neural networks is to develop an algorithm that replicates the way the human brain processes information. The use of neural networks in the field of business failure prediction was introduced by Odom and Sharda (1990). Subsequently, neural network models have been successfully used by several authors to predict business failure since they are characterized by nonlinear and nonparametric adaptive learning properties. During the last three decades, neural networks have shown promising results in terms of predicting business failure and they can be considered as one of the machine learning techniques with the highest predictive capability (Jeong et al. 2012).
Based on a matched sample of 220 U.S. firms, Zhang et al. (1999) found that neural networks outperform logistic regression models in terms of classification rate estimation. Chen and Du (2009) used neural networks on 68 companies listed on the Taiwan Stock Exchange Corporation (TSEC) with 37 ratios. The results indicated that neural networks are a suitable technique for predicting corporate financial distress with an accuracy of 82.14% two seasons before financial distress. Paule-Vianez et al. (2020) used a hidden layer artificial neural networks model to predict financial distress in Spain. The authors obtained an accuracy of more than 97% on a sample of 148 Spanish credit institutions and demonstrated that neural networks have a better prognostic capacity than multivariate discriminant analysis. In a large-scale study, Altman et al. (2020) compared the performance of five failure prediction methods, namely logistic regression, neural networks with multi-layer perceptron, support vector machine, decision tree, and gradient boosting. The results showed that neural networks and logistic regression outperform other techniques in terms of efficiency and accuracy in an open European economic zone. In order to identify the best financial distress prediction model for Slovakian industrial firms, Gregova et al. (2020) confirmed the superiority of neural networks over other techniques, namely random forest and logistic regression. Despite the good performances of the last two techniques, neural networks yield better results for all metrics combined.
Machine learning techniques can give better performance in classifying companies as failing or non-failing compared to statistical methods. For this reason, new studies should be directed to apply these classification techniques in predicting financial distress (Jones et al. 2017). However, statistical techniques for predicting business failure are still used worldwide and are comparable to machine learning techniques in terms of accuracy and predictive performance. Indeed, each classification method has its advantages and disadvantages and the performance of the financial distress prediction models depends on the particularities of each country, the methodology, and the variables used to build these models (Kovacova et al. 2019). Given the reliability and predictive accuracy of logistic regression and neural networks in different contexts, we use these techniques to predict the financial distress of Moroccan SMEs.

Data Collection
Before predicting corporate financial distress, we need first to define when financial distress occurs and which firms enter financial distress. A firm is considered to be in financial distress if it is unable to meet a credit deadline after 90 days from the due date (Circular n°19/G/2002 of Bank Al-Maghrib 2002).
Using this definition, we contacted the major banks in the Fez-Meknes region to obtain the financial statements of SMEs. Constrained by the availability of information, we selected an initial sample of 218 SMEs. A total of 38 SMEs were eliminated for the following reasons: young firms less than three years old, absence of financial statements for at least two consecutive years, lack of business continuity, and firms with specific characteristics such as financial and agricultural firms. Thus, the final sample comprises 180 SMEs, of which 123 are non-distressed and 57 are distressed. The financial distress occurred in 2019 and the data used in the study correspond to the financial statements of the years 2017 and 2018. Our final sample covers the following sectors: trade (45.55%), construction (42.23%), and industry (12.22%).

Data Balancing
When collecting data, an unbalanced classification problem can be encountered. This can lead to inefficiency in the prediction models. To avoid this problem, we can use one of the methods to deal with unbalanced data such as the oversampling method or the undersampling method.
In this article, we use the oversampling method. This method is a resampling technique, which works by increasing the number of observations of minority class(es) in order to achieve a satisfactory ratio of minority class to majority class.
To generate synthetic samples automatically, we use the SMOTE (Synthetic Minority Over-sampling Technique) algorithm. This technique works by creating synthetic samples from the minority class instead of creating simple copies. For more details on the SMOTE algorithm, we refer the reader to Chawla et al. (2002).
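The interpolation idea behind SMOTE can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `smote`, the neighbour count `k`, the seed, and the toy minority points are all introduced here for illustration.

```python
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority points by interpolating between a randomly
    chosen minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority observations")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical two-ratio minority class (values are illustrative only).
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(toy_minority, n_synthetic=6, k=2, seed=1)
```

Each synthetic observation lies on a segment joining two existing minority observations, which is why the technique produces new points rather than simple copies.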
Table 1 shows the results obtained by applying the SMOTE algorithm to the data.

Training-Test Set Split
We divide the sample into two sub-samples, the first called training sample (in this paper, we take 75% of the sample for training) and the second called validation or test sample (25% of the sample). The prediction models that we present next are built on the training sample and validated on the test sample.
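A 75%/25% split of this kind can be sketched as follows. The example is illustrative only: the function name, the random seed, and the 20 placeholder firm identifiers are assumptions, and the sketch uses a plain random shuffle since the text does not say whether the split was stratified by class.

```python
import random

def train_test_split(items, train_share=0.75, seed=0):
    """Shuffle the observations and cut them into training and test parts."""
    rng = random.Random(seed)
    shuffled = list(items)      # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = round(train_share * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

firm_ids = list(range(20))      # hypothetical identifiers
train_ids, test_ids = train_test_split(firm_ids)
```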

Variable Analysis
Financial distress as defined in the previous subsection is the variable to be explained in the study. It is a qualitative, dichotomous, and binary variable. In this paper, it takes the value of 1 when the SME is in arrears of more than 90 days. Thus, it is considered to be in a distressed situation. Otherwise, it takes the value of 0 when the SME is not in arrears or is in arrears for less than 90 days and is considered normal.
The selection of financial ratios as initial features for predicting financial distress is based on their predictive and discriminative ability between non-distressed and distressed firms in previous works (Jabeur 2017; Kliestik et al. 2020; Mselmi et al. 2017; Kovacova et al. 2019; Kisman and Krisand 2019; Valaskova et al. 2018; Zizi et al. 2020).
As shown in Table 2, the explanatory variables are divided into four categories: liquidity, solvency and capital structure, profitability, and management. The management ratios are used to take into account the long customer and supplier payment delays that characterize the context of the study (Inforisk 2020).

Table 2. Financial ratios used as initial features.
R1: Current Ratio
R2: Quick Ratio
R3: Working Capital to Total Assets = Working Capital / Total Assets

Stepwise and Lasso Selection Techniques
In applied studies, a large number of variables can lead to greater variance in the performance of the predictive models and decrease their accuracy. Eliminating redundant and insignificant variables prevents models from underfitting or overfitting. Therefore, it is necessary to look for the best nested model composed only of the most pertinent variables that explain the endogenous variable (output variable) well.
In empirical studies, selection techniques based on Wald or likelihood ratio (LR) tests are tedious and sometimes impossible to apply. For this reason, it is better to use numerical selection techniques such as stepwise logistic regression selection, or regularization techniques based on cross-validation, to obtain the most pertinent variables that explain the endogenous variable well.
In this paper, we use two selection techniques: Stepwise logistic regression selection and lasso logistic regression selection.

Stepwise Logistic Regression Selection
In step-by-step numerical selection techniques, we evaluate a succession of nested models, either by adding variables one at a time (FORWARD) or by removing them one at a time (BACKWARD).
The stepwise selection technique consists of alternating between FORWARD and BACKWARD, i.e., checking that each addition of a variable does not cause the removal of another variable. The principle of the stepwise method is to minimize one of the following criteria:
• Akaike Information Criterion (AIC):
$$\mathrm{AIC} = -2 \ln L + 2K$$
• Bayesian Information Criterion (BIC):
$$\mathrm{BIC} = -2 \ln L + K \ln(n)$$
where:
• L is the likelihood of the logit model;
• K is the number of variables in the model;
• n is the number of observations.
The stopping criterion: The addition or removal of a variable does not improve the criterion used anymore.
In our article, we use the BIC criterion for selection, as it penalizes complexity more; therefore, this criterion selects fewer variables.
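To see why BIC tends to retain fewer variables than AIC, the two criteria can be compared on a pair of candidate models. The sketch below assumes the standard forms AIC = -2 ln L + 2K and BIC = -2 ln L + K ln(n); the log-likelihood values and variable counts are hypothetical and chosen only to illustrate the comparison.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion (smaller is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion (smaller is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

n = 180                                   # sample size, used as an example
small = {"loglik": -60.0, "k": 3}         # hypothetical 3-variable model
large = {"loglik": -57.5, "k": 5}         # hypothetical 5-variable model

aic_prefers_large = aic(large["loglik"], large["k"]) < aic(small["loglik"], small["k"])
bic_prefers_large = bic(large["loglik"], large["k"], n) < bic(small["loglik"], small["k"], n)
```

With these illustrative numbers, AIC favours the larger model while BIC favours the smaller one, because each extra variable costs ln(180) ≈ 5.19 under BIC against 2 under AIC.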

Lasso Logistic Regression Selection
Least Absolute Shrinkage and Selection Operator (LASSO) is a method that shrinks regression coefficients. It has been extended to many statistical models such as generalized linear models, M-estimators, and proportional hazards models.
The lasso method has the advantage of a parsimonious and consistent selection. It selects a restricted subset of variables that allows a better interpretation of a model. Thus, the selected subset of variables is used for the prediction.

Formal presentation:
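Let $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^T$ be a vector containing the explanatory variables associated with individual i, $y_i$ the associated response, and $\beta = (\beta_1, \beta_2, \ldots, \beta_p)^T$ the coefficients to be estimated. We denote by X the matrix containing the individuals in rows, $X_{i,.} = x_i^T$, and $y = (y_1, y_2, \ldots, y_n)$.
The log-likelihood associated with the lasso logistic regression is defined as:
$$L(\beta) = \sum_{i=1}^{n} \left[ y_i\, x_i^T \beta - \ln\left(1 + e^{x_i^T \beta}\right) \right]$$
Considering centered variables, the lasso is generally written in vector form by the following minimization problem:
$$\hat{\beta} = \underset{\beta \in \mathbb{R}^p}{\arg\min} \left\{ -L(\beta) + \lambda \sum_{j=1}^{p} |\beta_j| \right\}$$
where λ is the penalty coefficient.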
To select the best variables explaining the endogenous variable and to choose a minimum penalty coefficient λ, k-folds cross-validation is used.
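The way the penalty coefficient λ drives coefficients exactly to zero can be illustrated with a small proximal-gradient (soft-thresholding) sketch. This is not the estimation routine used in the study: the function names, step size, iteration count, and toy data are assumptions, there is no intercept (centered variables), and the log-likelihood is averaged over observations, which only rescales λ.

```python
import math

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_logistic(X, y, lam, step=0.1, n_iter=500):
    """Minimize -(1/n) * log-likelihood + lam * sum(|beta_j|) by proximal gradient."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            resid = sigmoid(sum(b * v for b, v in zip(beta, xi))) - yi
            for j in range(p):
                grad[j] += resid * xi[j] / n
        # Gradient step on the smooth part, then soft-thresholding for the penalty.
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return beta

# Toy data: the first column separates the classes, the second is nearly noise.
X_toy = [[1.0, 0.2], [2.0, -0.1], [-1.0, 0.1], [-2.0, -0.2]]
y_toy = [1, 1, 0, 0]
beta_moderate = lasso_logistic(X_toy, y_toy, lam=0.1)  # keeps only the first variable
beta_strong = lasso_logistic(X_toy, y_toy, lam=1.0)    # shrinks every coefficient to 0
```

In practice, the fit would be repeated over a grid of λ values inside a k-fold cross-validation loop, and the variables with non-zero coefficients at the retained λ form the selected subset.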
Prediction Models

Logistic Regression Model
Logistic regression, or the logit model, is a binomial regression model from the family of generalized linear models. It is widely used in many fields. For example, it is used in banking to detect risk groups when granting credit. In econometrics, the model is used to explain a discrete variable, while in medicine it is used to find the factors characterizing a group of sick subjects compared to healthy subjects.
Let Y be the variable to be predicted (variable to be explained) and $X = (X_1, X_2, \ldots, X_J)$ the predictors (explanatory variables).
In the framework of binary logistic regression, the variable Y takes two possible values {1, 0}. The variables $X_j$ are exclusively continuous or binary.
The a posteriori probability of obtaining the modality 1 of Y (resp. 0) knowing the value taken by X is noted p(1|X) (resp. p(0|X)).
The logit of p(1|X) is given by the following expression:
$$\ln\left(\frac{p(1|X)}{1 - p(1|X)}\right) = b_0 + b_1 X_1 + \cdots + b_J X_J$$
The equation above is a "regression", as it reflects a dependency relationship between the variable to be explained and a set of explanatory variables. This regression is "logistic" because the probability distribution is modeled from a logistic distribution. Indeed, after converting the above equation, we find:
$$p(1|X) = \frac{e^{b_0 + b_1 X_1 + \cdots + b_J X_J}}{1 + e^{b_0 + b_1 X_1 + \cdots + b_J X_J}}$$

Neural Networks Model: Multi-Layer Perceptron
An artificial neural network is a system whose concept was originally inspired, schematically, by the functioning of biological neurons. It is a set of interconnected formal neurons that can solve complex problems such as pattern recognition or natural language processing through the adjustment of weighting coefficients in a learning phase.
The formal neuron is a model that is characterized by an internal state s ∈ S, input signals $X = (X_1, X_2, \ldots, X_J)^T$, and an activation function g:
$$s = g\left(\alpha_0 + \sum_{j=1}^{J} \alpha_j X_j\right) = g(\alpha_0 + \alpha^T X)$$
The activation function performs a transformation of an affine combination of the input signals, where $\alpha_0$ is a constant term called the bias of the neuron. This affine combination is determined by a vector of weights $[\alpha_0, \alpha_1, \ldots, \alpha_J]$ associated with each neuron, whose values are estimated in the learning phase. These elements constitute the memory or distributed knowledge of the network.
The different types of neurons are distinguished by the nature of their activation function g. The main types are linear, threshold, sigmoid, ReLU, softmax, stochastic, radial, etc.
In this article, we use the sigmoid activation function, which is given by:
$$g(x) = \frac{1}{1 + e^{-x}}$$
The advantage of the sigmoid is that it works well with learning algorithms involving gradient back-propagation because it is differentiable.
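A single formal neuron with a sigmoid activation can be written directly from its definition. The following is a minimal sketch; the function names and the example weights are illustrative assumptions rather than values from the study.

```python
import math

def sigmoid(x):
    """Sigmoid activation g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    """Derivative g'(x) = g(x) * (1 - g(x)), used by back-propagation."""
    gx = sigmoid(x)
    return gx * (1.0 - gx)

def neuron(inputs, weights, bias):
    """Output of a formal neuron: g(bias + sum_j weights[j] * inputs[j])."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))

# Hypothetical neuron with two inputs.
output = neuron(inputs=[0.3, -1.2], weights=[0.8, 0.5], bias=0.1)
```

The closed-form derivative is what makes the sigmoid convenient for gradient-based learning: it is obtained from the activation value itself, without any further evaluation of the exponential.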
For supervised learning, we focus in this paper on an elementary network structure, the so-called static one without feedback loops.
The multilayer perceptron (MLP) is a network composed of successive layers. A layer is a set of neurons with no connection between them. An input layer reads the incoming signals, one neuron per input $X_j$. An output layer provides the system response.
One or more hidden layers participate in the transfer. In a perceptron, a neuron in a hidden layer is connected as an input to each neuron in the previous layer and as an output to each neuron in the next layer. Therefore, a multi-layer perceptron realizes a transformation of input variables:
$$Y = f(X_1, \ldots, X_J; \alpha)$$
where α is the vector containing each parameter $\alpha_{jkl}$ of the jth input of the kth neuron in the lth layer; the input layer (l = 0) is not parameterized and it only distributes the inputs to all the neurons of the next layer.
In regression with a single hidden layer perceptron of q neurons and an output neuron, this function is written:
$$Y = f(X; \alpha, \beta) = \beta_0 + \beta^T z$$
where $z_k = g(\alpha_{0k} + \alpha_k^T X)$, k = 1, . . . , q.
Let us assume that we have a database with n observations $(X_i^1, \ldots, X_i^J, Y_i)$ (i = 1, . . . , n) of the explanatory variables $X^1, \ldots, X^J$ and of the variable to be predicted Y. We consider the simplest case of regression, with a network consisting of a linear output neuron and a layer of q neurons whose parameters are optimized by least squares.
Learning is the estimation of the parameters $\alpha_{jk}$ (j = 0, . . . , J; k = 1, . . . , q) and $\beta_k$ (k = 0, . . . , q) by minimization of the quadratic loss function, or of an entropy function in classification:
$$Q(\alpha, \beta) = \sum_{i=1}^{n} Q_i = \sum_{i=1}^{n} \left[ Y_i - f(X_i; \alpha, \beta) \right]^2$$

Error back-propagation:
Back-propagation aims to evaluate the derivative of the cost function at an observation and with respect to the various parameters.
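Let $z_{ki} = g(\alpha_{0k} + \alpha_k^T x_i)$ and $z_i = (z_{1i}, z_{2i}, \ldots, z_{qi})$. With the conventions $z_{0i} = 1$ and $x_{i0} = 1$ for the bias terms, the partial derivatives of the quadratic loss function are written:
$$\frac{\partial Q_i}{\partial \beta_k} = -2\,\left(y_i - f(x_i)\right) z_{ki} = \delta_i\, z_{ki}$$
$$\frac{\partial Q_i}{\partial \alpha_{jk}} = -2\,\left(y_i - f(x_i)\right) \beta_k\, g'(\alpha_{0k} + \alpha_k^T x_i)\, x_{ij} = s_{ki}\, x_{ij}$$
The terms $\delta_i$ and $s_{ki}$ are the error terms of the current model at the output and on each hidden neuron, respectively. These error terms verify the so-called back-propagation equations:
$$s_{ki} = \beta_k\, g'(\alpha_{0k} + \alpha_k^T x_i)\, \delta_i$$
These terms are evaluated in two passes. In the forward pass, with the current values of the weights, the application of the different inputs $x_i$ to the network allows us to determine the fitted values $\hat{f}(x_i)$. The return pass then determines the $\delta_i$ that are back-propagated in order to calculate the $s_{ki}$ and thus obtain the gradient evaluations.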
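The two passes can be sketched for one observation of a single-hidden-layer network with a linear output neuron and quadratic loss. The code below is a minimal illustration under stated assumptions: the data layout (each `alpha[k]` stores the bias followed by the input weights of hidden neuron k, and `beta[0]` is the output bias), the function names, and all numeric values are hypothetical.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def forward(x, alpha, beta):
    """Forward pass: hidden outputs z_k and fitted value f(x)."""
    z = [sigmoid(a[0] + sum(w * v for w, v in zip(a[1:], x))) for a in alpha]
    f = beta[0] + sum(b * zk for b, zk in zip(beta[1:], z))
    return z, f

def loss(x, y, alpha, beta):
    """Quadratic loss Q_i for one observation."""
    return (y - forward(x, alpha, beta)[1]) ** 2

def gradients(x, y, alpha, beta):
    """Backward pass: gradients of Q_i with respect to beta and alpha."""
    z, f = forward(x, alpha, beta)
    delta = -2.0 * (y - f)                       # error term at the output
    grad_beta = [delta] + [delta * zk for zk in z]
    grad_alpha = []
    for k, zk in enumerate(z):
        # Error term on hidden neuron k; for the sigmoid, g' = g * (1 - g).
        s_k = beta[k + 1] * zk * (1.0 - zk) * delta
        grad_alpha.append([s_k] + [s_k * v for v in x])
    return grad_beta, grad_alpha

# Hypothetical network: 2 inputs, q = 2 hidden neurons.
x_obs, y_obs = [0.5, -1.0], 1.0
alpha0 = [[0.1, 0.2, -0.3], [0.0, -0.4, 0.5]]
beta0 = [0.05, 0.7, -0.6]
g_beta, g_alpha = gradients(x_obs, y_obs, alpha0, beta0)
```

A useful sanity check for any such implementation is to compare the analytic gradients with finite-difference approximations of the loss, perturbing one weight at a time.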

Optimization algorithms:
To exploit the gradients, different algorithms can be implemented. The most elementary one is an iterative use of the gradient: at any point in the parameter space, the gradient vector of Q points in a direction of increasing error. To make Q decrease, it is sufficient to move in the opposite direction. This is an iterative algorithm modifying the weights of each neuron according to:
$$\beta_k^{(r+1)} = \beta_k^{(r)} - \tau \sum_{i=1}^{n} \frac{\partial Q_i}{\partial \beta_k^{(r)}}$$
$$\alpha_{jk}^{(r+1)} = \alpha_{jk}^{(r)} - \tau \sum_{i=1}^{n} \frac{\partial Q_i}{\partial \alpha_{jk}^{(r)}}$$
The proportionality coefficient τ is called the learning rate. It can be fixed (determined by the user) or variable (according to certain heuristics). It seems intuitively reasonable that this rate, high at the beginning to go faster, decreases to achieve a finer adjustment as the system approaches a solution. For more details on machine learning techniques, we refer to Friedman et al. (2017).
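The effect of a decreasing learning rate can be illustrated on a deliberately simple problem. The sketch below fits a single weight w by least squares with the update w ← w − τ ∂Q/∂w; the one-weight linear model, the decay schedule τ_r = τ_0 / (1 + decay · r), and the toy data are assumptions made for illustration, not the network or schedule used in the study.

```python
def gradient_descent(xs, ys, tau0=0.05, decay=0.01, n_iter=200):
    """Minimize Q(w) = sum_i (y_i - w * x_i)^2 with a decaying learning rate."""
    w = 0.0
    for r in range(n_iter):
        # Gradient of the quadratic loss with respect to w.
        grad = sum(-2.0 * x * (y - w * x) for x, y in zip(xs, ys))
        tau = tau0 / (1.0 + decay * r)   # larger steps early, finer steps later
        w -= tau * grad                  # move against the gradient
    return w

# Toy data generated by y = 2x, so the least-squares solution is w = 2.
w_hat = gradient_descent([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```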

Metrics
In this paper, the performance of prediction models is measured by the common evaluation metrics of machine learning, namely confusion matrix, accuracy, precision, sensitivity, specificity, F1-score, and Area Under the Curve (AUC).
Confusion matrix: It represents the basis for calculating the performance of the prediction models. Each column of the table indicates the instances of the predicted class and each row indicates the instances of a real class, or vice versa.
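Accuracy: It measures the percentage of cases correctly classified.
$$\text{Accuracy} = \frac{\text{True Positive} + \text{True Negative}}{\text{True Positive} + \text{True Negative} + \text{False Positive} + \text{False Negative}}$$
Precision: It is the proportion of true positive cases to the total number of cases predicted as positive.
$$\text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}}$$
Sensitivity (also known as recall or True Positive Rate): It is the proportion of true positive cases to the total number of positive cases.
$$\text{Sensitivity} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}}$$
Specificity (also known as True Negative Rate): It is the proportion of true negative cases to the total number of negative cases.
$$\text{Specificity} = \frac{\text{True Negative}}{\text{True Negative} + \text{False Positive}}$$
F1-score: It is the harmonic mean of recall and precision. It is calculated as follows:
$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
Area Under the Curve (AUC): It is a measure introduced to characterize the ROC curve numerically. The closer the area value is to 1, the better the discrimination quality of the model (Long and Freese 2006).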
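These metrics all derive from the four cells of the confusion matrix, as the short sketch below shows. It assumes that the positive class is the distressed SME (coded 1), so that a type I error is a distressed firm classified as healthy; the function name and the counts are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute the evaluation metrics from confusion-matrix counts.

    tp: distressed firms predicted distressed, fn: distressed predicted healthy,
    tn: healthy firms predicted healthy,       fp: healthy predicted distressed.
    """
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "type_I_error": fn / (tp + fn),    # equals 1 - sensitivity
        "type_II_error": fp / (tn + fp),   # equals 1 - specificity
    }

# Hypothetical test set of 60 firms, 30 per class.
metrics = classification_metrics(tp=29, fn=1, tn=28, fp=2)
```

With these illustrative counts, the accuracy is 57/60 = 95% and the sensitivity is 29/30 ≈ 96.67%, and the type I error (1/30) is lower than the type II error (2/30).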

Results
In this section, we present the main results obtained with the R 4.0.5 software. Table 3 shows the ratios selected by the stepwise and lasso techniques. The stepwise logistic technique is based on minimizing the BIC criterion to select the relevant variables, while the lasso logistic technique is based on the optimal choice of the penalty coefficient to select the relevant ratios. In our case, the optimal BIC value is 132.1 in 2017 and 123.67 in 2018, and the optimal penalty coefficient is 0.05867105 in 2017 and 0.0311904 in 2018. We note that interest to sales (R14), return on assets (R15), and days in accounts receivable (R21) remain discriminant one and two years before financial distress for both techniques. These variables belong to the profitability and management categories. Interest to sales (R14) represents the weight of interest in relation to sales. A healthy financial situation is generally characterized by a level of interest not exceeding 2.5% or 3% of sales. Return on assets (R15) measures the net income earned per unit invested in assets. This profitability ratio plays an important role in the early prediction of business failure, and a higher value can reduce the probability of failure (Geng et al. 2014; Zizi et al. 2020). Days in accounts receivable (R21) relates accounts receivable (multiplied by 360) to sales and is expressed in the number of days of sales. Long payment terms can lead to business failure.

Descriptive Statistics
The main results of the descriptive statistics of selected variables by the two selection techniques (stepwise and lasso) are illustrated in Tables A1-A4 (Appendix A), namely descriptive statistics for selected variables, normality tests, correlation matrices, and multicollinearity tests.
We note from the descriptive statistics that failing SMEs are more indebted than their non-failing peers. SMEs in financial distress are more dependent on external funds with high means of debt to equity ratio (R4) and autonomy ratio (R7). Thus, the use of debt favors the increase in interest (R14). In addition, distressed SMEs are less solvent and they find it difficult to repay their debts with low average interest coverage (R5) compared to healthy SMEs. The results of the descriptive statistics also show that distressed SMEs are less profitable with negative return on assets (R15) and retained earnings to total assets (R17) means. Concerning management ratios, days in accounts receivable (R21) and duration of trade payables (R22) are longer for defaulting SMEs. Contrary to what was expected, liquidity expressed by the quick ratio (R2) is higher for distressed SMEs.
Based on the p-values of the Shapiro-Wilk and Lilliefors (adapted Kolmogorov-Smirnov) normality tests, we reject the hypothesis of normality of the explanatory variables (the p-values of both tests are below 0.05).
To ensure that significant correlations in absolute value close to 0.7 (such as the correlations between R6-R14 and R16-R21) do not give rise to a multicollinearity problem that can affect the results, we test the degree of multicollinearity by Variance Inflation Factor (VIF) and we calculate the tolerance coefficient (TOL). If the TOL is close to 0, then it can be considered that there is a significant collinearity for the variable. If it is close to 1 with a VIF value between 1 and 5, then it can be considered that the collinearity generated by the variable is not important and does not influence the reliability. Problematic multicollinearity exists if the VIF is greater than 10 or if the TOL is less than 0.1 (Zhang et al. 2010).
The VIF values of the selected ratios are all below 5 and their tolerances are close to 1. Therefore, we do not have a multicollinearity problem.
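The link between a pairwise correlation of about 0.7 and these thresholds can be checked with a short calculation. The sketch assumes the usual definitions TOL = 1 − R² and VIF = 1 / TOL, where R² comes from regressing one predictor on the others, and it takes the simplified case of only two predictors, in which R² equals the squared correlation; the function names are illustrative.

```python
def tolerance(r_squared):
    """Tolerance of a predictor: share of its variance not explained by the others."""
    return 1.0 - r_squared

def vif(r_squared):
    """Variance Inflation Factor, the reciprocal of the tolerance."""
    return 1.0 / tolerance(r_squared)

# Two-predictor case (an assumption): R^2 is the squared pairwise correlation.
correlation = 0.7
r2 = correlation ** 2
vif_value = vif(r2)        # about 1.96, well below the thresholds of 5 and 10
tol_value = tolerance(r2)  # about 0.51, well above the threshold of 0.1
```

Under this simplification, a correlation of 0.7 in absolute value yields a VIF of roughly 2, which is consistent with the conclusion that such correlations do not create a multicollinearity problem here.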

Estimation Results of the Stepwise and Lasso Logistic Regression Models
Tables 4 and 5 present the estimation results of the stepwise logistic regression models. One year before financial distress, all variables in the model are significant at the 1% level. Interest coverage (R5), autonomy ratio (R7), interest to sales (R14), and days in accounts receivable (R21) have a positive effect on financial distress, while return on assets (R15) has a negative effect. Interest to sales (R14) has the largest impact on the probability of financial distress: an increase of one unit in this ratio raises the probability of financial distress by 79.59%. Two years prior to financial distress, all ratios are significant at the 5% level except for the repayment capacity (R8). Variables already selected by the stepwise method in 2017 retain the same sign in 2018. Interest to sales (R14) keeps the largest marginal effect and may increase the probability of default by 66.91%, while increasing return on assets (R15) by one unit may decrease the probability of financial distress by 35.78%. Table 6

Performance of Logit Models
The results obtained from the confusion matrices are based on the test sample. As shown in Table 7, two years before the occurrence of financial distress, the stepwise logistic regression model correctly classifies 93.33% of the SMEs. One year before the occurrence of financial distress, the accuracy improves to 95.00% and the sensitivity is 96.67% (29 of the 30 failing SMEs are correctly classified). Regarding the lasso logistic regression models, the accuracy improves to 86.67% in 2018 from 80.00% in 2017. The type I error (when a model classifies a failing company as healthy) falls from 16.67% in 2017 to 13.33% in 2018, showing that the quality of the model improves as financial distress becomes imminent.
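As a rough illustration of how these metrics follow from a confusion matrix, the sketch below treats distressed SMEs as the positive class and uses the definition of type I error given above (a failing company classified as healthy). Only the 29 correctly classified and 1 misclassified failing SMEs come from the text. The counts for healthy SMEs are an assumption, chosen to be consistent with a 95.00% accuracy on a hypothetical balanced test sample of 60 firms, and are not taken from Table 7.

```python
def classification_metrics(tp, fn, fp, tn):
    """Metrics from confusion-matrix counts, with distressed SMEs as positives.

    tp: distressed classified as distressed, fn: distressed classified as healthy,
    fp: healthy classified as distressed,    tn: healthy classified as healthy.
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "type_I_error": fn / (tp + fn),   # failing firm classified as healthy
        "type_II_error": fp / (tn + fp),  # healthy firm classified as failing
    }


# tp and fn follow the 29/30 figure in the text; fp and tn are ASSUMED values.
metrics = classification_metrics(tp=29, fn=1, fp=2, tn=28)
```

With these counts, accuracy is 57/60 and sensitivity is 29/30, and the type I error is the complement of the sensitivity.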

Performance of Neural Networks Models
To find the best neural networks models for stepwise logistic selection and lasso logistic selection, we vary the network parameters, namely the number of hidden layers from 0 to 10 and the number of nodes per hidden layer from 0 to 10. We find that the best neural networks models for both stepwise and lasso logistic selection are composed of a single hidden layer containing three nodes.
According to Table 8, in 2017 the lasso neural networks model performs better than the stepwise neural networks model with an accuracy of 83.33%. In addition, the type I error of the lasso neural networks model is 6.67% against 13.33% for the stepwise neural networks model, a difference of 6.66 percentage points. As for 2018, the stepwise neural networks model has a higher overall accuracy of 88.33% versus 86.67% for the lasso neural networks model.
In general, the performance of neural networks models improves one year before financial distress. Furthermore, these models achieve lower type I errors than type II errors.
As shown in Appendix B, the architecture of the neural networks consists of three layers (an input layer, one hidden layer, and an output layer). The nodes of the input layer correspond to the ratios selected by the lasso and stepwise techniques. The solution to the dichotomous problem (distressed SME or healthy SME) is provided by the output layer.
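The architecture described above can be sketched as a forward pass through one hidden layer of three nodes. This is a minimal illustration rather than the fitted model: the logistic activation, the 0.5 decision threshold, the five inputs, and all weights below are assumptions introduced for the example.

```python
import math


def logistic(z):
    """Logistic activation (assumed here for both hidden and output nodes)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: input layer -> one hidden layer -> single output node."""
    hidden = [
        logistic(bias + sum(w * x for w, x in zip(weights, inputs)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return logistic(output_bias + sum(w * h for w, h in zip(output_weights, hidden)))


def classify(probability, threshold=0.5):
    """Map the output node's value to one of the two classes."""
    return "distressed" if probability >= threshold else "healthy"


# Hypothetical inputs (five selected ratios) and weights for three hidden nodes.
ratios = [0.2, 0.5, 0.1, -0.3, 0.4]
hidden_weights = [
    [0.5, -0.2, 1.0, -0.8, 0.3],
    [-0.4, 0.1, 0.7, -0.5, 0.2],
    [0.3, 0.3, -0.6, 0.9, -0.1],
]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [1.2, 0.8, -1.0]
output_bias = -0.2

probability = forward(ratios, hidden_weights, hidden_biases, output_weights, output_bias)
label = classify(probability)
```

The number of input nodes changes with the selection technique, since each selected ratio feeds one input node, while the hidden layer keeps three nodes in both cases.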

Discussion
The performance metrics of our prediction models are summarized in Tables 9 and 10. In addition to those used in Tables 7 and 8, we add precision, F1-score, and AUC. The precision and F1-scores of our models improve one year before financial distress, as do the other metrics. For the AUC metric, the values obtained vary between 0.833 and 0.959, thus showing an excellent discrimination capacity of the models (Long and Freese 2006). Furthermore, our models classify distressed SMEs better than healthy SMEs. That is, our models have lower type I errors than type II errors. Indeed, type I errors are considered by the literature as the most costly for all stakeholders (Bellovary et al. 2007). These findings are in contrast with those of Shrivastav and Ramudu (2020) and Durica et al. (2021). On a sample of 59 Indian banks, Shrivastav and Ramudu (2020) obtained, using a support vector machine with a linear kernel, a type I error of 25% and a type II error of 0%. One year before the default, Durica et al. (2021) obtained with the CART algorithm a better classification of healthy Slovak companies, at 94.93%, compared to 81.48% for Slovak companies in financial distress.
Regarding the performance of the models based on lasso selection, neural networks perform at least as well as logistic regression, with an accuracy of 83.33% in 2017 and 86.67% in 2018 against 80.00% and 86.67% for logistic regression, respectively. However, our best results are obtained by stepwise selection with an accuracy of 93.33% in 2017 and 95.00% in 2018 for logistic regression and an accuracy of 88.33% in 2018 for neural networks. In general, our results show the superior performance of logistic regression over neural networks. These findings are in line with the works of Du Jardin and Séverin (2012), Islek and Oguducu (2017), Kim et al. (2018), Lukason and Andresson (2019), and Malakauskas and Lakštutienė (2021).
For example, Du Jardin and Séverin (2012) reached an accuracy of 81.6% with logistic regression against 81.3% with neural networks, using data collected over one year. Similarly, in Lukason and Andresson (2019), logistic regression scored first on the test sample with 90.2% accuracy, followed by the multilayer perceptron with 87.60%.
Comparing our logistic regression results obtained by the stepwise selection technique with the literature, we find that they are well above the average obtained by other studies on the prediction of financial distress (Bateni and Asghari 2020; Cohen et al. 2017; Vu et al. 2019; Guan et al. 2020; Ogachi et al. 2020; Tong and Serrasqueiro 2021; Rahman et al. 2021; Park et al. 2021). On a sample of 64 companies listed on the Nairobi Securities Exchange, Ogachi et al. (2020) correctly classified 83% of the companies through logistic regression with the following significant ratios: working capital ratio, current ratio, debt ratio, total asset, debtors turnover, debt-equity ratio, asset turnover, and inventory turnover. Tong and Serrasqueiro (2021) used logistic regression to predict the financial distress of Portuguese small and mid-sized enterprises operating in technology manufacturing sectors. Their logistic regression models correctly classified 79.60% of the financial distress group in 2013, 80.40% in 2014, and 79.20% in 2015. Based on a sample of U.S. publicly traded companies, Rahman et al. (2021) achieved an overall accuracy of 79.2% in the holdout sample. Shrivastava et al. (2018), however, achieved better performance with a Bayesian logit model, reaching an accuracy of 98.9% on a sample of Indian firms extracted from Capital IQ.
For neural networks, our best results outperform those of Kim et al. (2018), Lukason and Andresson (2019), Papana and Spyridou (2020), and Malakauskas and Lakštutienė (2021). For instance, using neural networks with 42 nodes in the hidden layer, Kim et al. (2018) found an accuracy of 71.9% through 41 financial ratios selected from 1548 Korean heavy industry companies. To predict bankruptcy in the Greek market, Papana and Spyridou (2020) achieved with neural networks a correct classification rate of 65.7% two years before bankruptcy and 70% one year before bankruptcy. However, our results are lower than those of Islek and Oguducu (2017) and Paule-Vianez et al. (2020). As an example, the Paule-Vianez et al. (2020) model achieved an overall success rate of 97.3% in predicting the financial distress of Spanish credit institutions.
In the Moroccan context, our results are better than those of Azayite and Achchab (2017), Khlifa (2017), Idrissi and Moutahaddib (2020), and Zizi et al. (2020) for both logistic regression and neural networks. Using logistic regression, Khlifa (2017) correctly classified 88.2% of Moroccan firms and Zizi et al. (2020) achieved an overall accuracy of 84.44% two years and one year before the default, whereas our best logistic regression models correctly classify 93.33% of firms two years before financial distress and 95.00% of firms one year before financial distress. The same observation holds for neural networks, where our best model achieves an accuracy of 88.33% against 80.76% for Idrissi and Moutahaddib (2020) and 85.6% for Azayite and Achchab (2017).

Conclusions
The lack of consensus on predictors of financial distress, the limited studies on the prediction of financial distress in Morocco, and the crucial role that the prediction of financial distress plays in a specific context led us to conduct this study. The objectives of this article were to determine the most relevant predictors of financial distress and identify its optimal prediction models.
To achieve these objectives, we used logistic regression and neural networks on a sample of 180 SMEs in the Fez-Meknes region, including 123 healthy SMEs and 57 distressed SMEs. The SMOTE technique was used to solve the problem of unbalanced data. Because the study focuses on Morocco, financial distress is defined according to Bank Al Maghrib's circular n°19/G/2002. Following the literature review on the topic and the context of the study, we used a battery of 23 financial ratios as initial predictors. Our models were based on the discriminant ratios selected by the lasso and stepwise techniques.
Our results highlighted the importance of variables such as interest to sales (R14) and return on assets (R15) in predicting financial distress. Interest to sales (R14) has a positive impact on financial distress and retains the largest marginal effect over two years for both selection techniques, while return on assets (R15) reduces the probability of financial distress.
Empirical results on test samples showed the superiority of logistic regression over neural networks with accuracies obtained by stepwise selection of 93.33% two years before financial distress and 95.00% one year before financial distress. In addition, our results showed that performance metrics improved one year before financial distress. As an example, the accuracies ranged from 80.00% (logistic regression with lasso selection) to 93.33% (logistic regression with stepwise selection) in 2017 while in 2018 they ranged from 86.67% (neural networks with lasso selection) to 95.00% (logistic regression with stepwise selection). Furthermore, our models classified distressed SMEs better than healthy SMEs with type I errors lower than type II errors.
The results have practical implications for creditors, academics, and managers. Our proposed models can be effective for creditors who should assess the financial condition of borrowing firms and make low-risk credit-granting decisions to avoid capital loss. From an academic point of view, this paper suggests that logistic regression is a robust and more accurate tool in predicting the financial distress of Moroccan SMEs. As far as managers are concerned, our results will allow them to take corrective actions upstream through the proposed variables representing early warning signals.
Our study was constrained by the availability of information; the results could be improved by increasing the sample size and introducing qualitative and macroeconomic variables into our models. Finally, future studies on business failure prediction in Morocco can consider comparing the results of our models with other machine learning techniques such as random forests or decision trees.

Notes: Std indicates standard deviation; *** significance level at 0.001; ** significance level at 0.01; * significance level at 0.05; . significance level at 0.1.

Appendix B. Architectures of Neural Networks Models
Figure: neural network architecture plot (reported error: 3.689385; steps: 458).
Notes
1 According to Maroc PME, SMEs are companies with a turnover of less than or equal to 200 million dirhams.
2 A graph that relates true positive rates and false positive rates. By varying the threshold S (threshold used for the assignment rule) over the interval [0, 1], the ROC curve is constructed and the true positive and false positive rates are calculated.
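The construction described in note 2 can be sketched as follows: the threshold S is swept over [0, 1], the true positive and false positive rates are computed at each value, and the area under the resulting curve (AUC) is approximated with the trapezoidal rule. The labels, scores, and threshold grid below are hypothetical and serve only to illustrate the procedure.

```python
def roc_points(y_true, scores, thresholds):
    """(false positive rate, true positive rate) pairs, one per threshold S.

    An observation is assigned to the positive (distressed) class when its
    score is greater than or equal to S.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    points = []
    for s in thresholds:
        tp = sum(1 for y, p in zip(y_true, scores) if y == 1 and p >= s)
        fp = sum(1 for y, p in zip(y_true, scores) if y == 0 and p >= s)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule, with endpoints added."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical test observations: 1 = distressed, 0 = healthy.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
thresholds = [i / 100 for i in range(101)]

auc = auc_trapezoid(roc_points(y_true, scores, thresholds))
```

In this toy example, 8 of the 9 distressed-healthy pairs are ranked correctly by the scores, and the trapezoidal area equals the same fraction.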