Symmetry-Based Comparison of Logit and Probit Models for Financial Distress Prediction in the Automotive Industry

Trebuňa, Peter; Kronová, Jana; Kliment, Marek; Pekarčíková, Miriam

doi:10.3390/sym18060973

Department of Industrial and Digital Engineering, Faculty of Mechanical Engineering, Technical University of Kosice, Park Komenského 9, 042 00 Kosice, Slovakia

Author to whom correspondence should be addressed.

Symmetry 2026, 18(6), 973; https://doi.org/10.3390/sym18060973 (registering DOI)

Submission received: 23 March 2026 / Revised: 25 May 2026 / Accepted: 27 May 2026 / Published: 4 June 2026

(This article belongs to the Special Issue Computer Vision, Pattern Recognition, Machine Learning, and Symmetry, 3rd Edition)

Abstract

This study investigates the role of symmetric probabilistic models in predicting financial distress in the automotive industry, with a focus on companies operating in the Slovak Republic. Financial distress prediction represents a binary classification problem characterized by an inherent symmetry between healthy and distressed firms. To capture this structure, two widely used symmetric models—logit and probit—are applied and systematically compared. The modeling framework incorporates LASSO regression for variable selection, enabling dimensionality reduction while preserving the most informative financial indicators. The empirical analysis is conducted on a dataset of 351 manufacturing enterprises. The results indicate that both models achieve comparable predictive performance, with the logit model reaching an accuracy of 78.9% and the probit model 77.8%. The area under the ROC curve further confirms the strong discriminatory power of both approaches. The findings highlight that the symmetric nature of the applied link functions contributes to model stability, interpretability, and balanced classification behavior. This study extends existing research by explicitly linking symmetry concepts with financial distress prediction in a sector-specific context. The proposed approach provides a transparent and practically applicable framework for early risk identification in industrial enterprises.

Keywords:

symmetry; financial distress; logit model; probit model; LASSO

1. Introduction

Predictive models designed to forecast the financial condition of companies have been developed by various economists and analysts and have undergone several stages of development. They play a crucial role in financial risk management, as they enable early identification of potential financial distress and support the decision-making of firms and investors.

All predictive models are based on the assumption that, for a certain period prior to the onset of a crisis, a company begins to exhibit anomalies and symptoms indicating a deterioration of its financial health.

Predictive methods classify the analyzed company—with a certain degree of reliability—as either prosperous or non-prosperous (i.e., at risk of bankruptcy) [1,2,3].

The pioneer in the field of bankruptcy prediction was Fitzpatrick, whose initial study on the subject was published in 1932, followed by Merwin’s contribution in 1935. It was only in the subsequent period that statistical methods began to be applied to assess corporate financial health. In 1966, American professor Beaver was the first to use univariate discriminant analysis for this purpose, and his publication is considered the foundation of bankruptcy prediction models [4,5].

A more advanced technique—multivariate discriminant analysis—was introduced by Altman in 1968, who developed the well-known Z-Score model. Until 1980, discriminant analysis remained the dominant method for constructing predictive models [6].

Other notable authors who developed prediction models based on discriminant analysis include Taffler (1974), Loris (1976), Springate (1983), Neumaier and Neumaierová (1995, 1999, 2000, 2005), Virág and Hajdu (1996), Chrastinová (1998), Binkert (2000), Gurčík (2002), and Sharita (2003) [7].

In 1980, Ohlson introduced a predictive model based on the logit method, and in 1984, Zmijewski presented a model based on the probit method [7]. Later authors who developed prediction models using logit and probit methods include Skogsvik (1990), Boritz and Kennedy (1995), and Lennox (1999) [8].

In the 1990s, neural networks began to be applied to the prediction of corporate financial health, with Odom and Sharda (1990) recognized as the founding figures in this area [9].

Despite the extensive development of predictive models, most studies primarily focus on improving predictive accuracy or comparing different modeling approaches. However, limited attention has been devoted to the structural properties of these models, particularly the role of symmetry in probabilistic classification.

Importantly, in the context of predictive modeling, the concept of symmetry plays a crucial role not only in data distribution but also in the formulation of binary classifiers.

In particular, symmetric link functions assume a balanced response to positive and negative deviations of explanatory variables around the decision boundary, which may influence classification stability and interpretability.

Both the logit and probit models are grounded in symmetric functions that map linear combinations of predictors to probabilities within the interval (0, 1). This mathematical symmetry results in balanced classification thresholds and enhances model interpretability. Such properties are especially important for financial health prediction, where the binary decision between “healthy” and “distressed” companies reflects an inherent structural symmetry in business status evaluation [9].

Furthermore, many existing studies do not consider sector-specific characteristics, although financial behavior and risk patterns may differ across industries. The automotive industry represents a key sector of the Slovak economy, making it a relevant context for financial distress prediction.

The aim of this study is to investigate the role of symmetry in probabilistic models by comparing the performance of logit and probit models in predicting financial distress in Slovak automotive companies.

To achieve this aim, the study addresses the following research questions:

Does the symmetry of link functions influence the classification performance of logit and probit models?
Are logit and probit models comparable in terms of predictive accuracy and classification behavior?

The main contribution of this study lies in the explicit examination of symmetry in probabilistic models within a sector-specific context and its impact on financial distress prediction.

2. Literature Review

Research on financial distress prediction has evolved significantly over time, reflecting both methodological advancements and the increasing availability of financial data. Early studies focused primarily on identifying key financial indicators associated with corporate failure [10].

One of the first systematic analyses was conducted by Fitzpatrick (1932), followed by Merwin (1935), who examined differences between prosperous and failing firms using financial ratios. A major breakthrough occurred with the work of Beaver (1966), who introduced univariate discriminant analysis and demonstrated that individual financial ratios can effectively distinguish between healthy and distressed firms.

Subsequently, Altman (1968) developed the multivariate Z-score model based on discriminant analysis, which became one of the most influential tools in bankruptcy prediction. This approach was later extended by several authors, including Taffler (1974), Springate (1983), and Neumaier and Neumaierová (1995, 1999, 2000, 2005), who adapted the methodology to different economic environments and datasets.

Despite their popularity, discriminant analysis models rely on relatively strong assumptions, such as normality and linear separability, which may limit their applicability in more complex settings. This limitation led to the development of probabilistic models based on logistic and probit regression.

Ohlson (1980) introduced the logit model for bankruptcy prediction, providing a flexible framework that does not require strict distributional assumptions. Similarly, Zmijewski (1984) proposed a probit-based model, which shares many properties with the logit approach but differs in the underlying distribution function. Later studies, such as Skogsvik (1990), Boritz and Kennedy (1995), and Lennox (1999), further explored the application of these probabilistic models in financial distress prediction.

In parallel, advances in computational methods enabled the application of machine learning techniques. Neural networks, introduced by Odom and Sharda (1990), represent one of the earliest nonlinear approaches used for predicting corporate financial health. These models are capable of capturing complex relationships between variables; however, they often lack interpretability compared to traditional statistical models.

From a theoretical perspective, both logit and probit models are based on symmetric link functions, which map linear combinations of explanatory variables to probabilities in a balanced manner around a central threshold. While this property is mathematically well understood, its implications for classification performance and model behavior in financial distress prediction have not been sufficiently explored.

Moreover, many existing studies focus on general datasets without considering sector-specific characteristics, despite the fact that financial structures and risk patterns may vary significantly across industries. This highlights the need for further research that combines methodological analysis with sector-oriented applications.

3. Materials and Methods

3.1. Data and Sample

The empirical analysis is based on a dataset of companies operating in the automotive industry in Slovakia. This sector represents a key pillar of the Slovak economy and is therefore particularly relevant for the analysis of financial distress.

The dataset consists of firms classified according to their financial condition as either financially healthy or financially distressed. The classification is based on predefined criteria reflecting the financial stability of the companies.

The data were collected for the year 2021 and include financial statements and selected financial indicators of the analyzed firms.

Although the dataset is cross-sectional and limited to a single year, the use of homogeneous sector-specific data reduces variability caused by structural differences across industries and allows for a more consistent comparison of model performance.

Furthermore, financial statements for the selected period provide a sufficiently detailed snapshot of the financial condition of firms, which is suitable for the application of binary classification models.

The final sample consists of 351 companies, of which 153 are classified as financially distressed and 198 as financially healthy. The data were obtained from the CRIBIS Universal Register provided by CRIF—Slovak Credit Bureau, Ltd.

3.2. Variables

The selection of variables used in the analysis is based on commonly applied financial indicators in the field of financial distress prediction. These indicators reflect key aspects of a company’s financial condition, including liquidity, profitability, activity, and indebtedness.

The dependent variable is defined as a binary indicator of financial condition, where a value of 1 represents a financially distressed company and 0 denotes a financially healthy company. The independent variables consist of selected financial ratios derived from the companies’ financial statements.

The selected variables capture different dimensions of financial performance. Liquidity ratios reflect the ability of a firm to meet its short-term obligations, profitability indicators measure the efficiency of generating earnings, and leverage ratios provide insight into the level of financial risk associated with debt financing.

These indicators are widely used in prior studies on financial distress prediction and have been identified as significant predictors of bankruptcy risk.

Given the potentially large number of financial indicators and the presence of multicollinearity among them, a variable selection procedure is required. For this purpose, LASSO regression is applied to identify the most relevant predictors and reduce model complexity.

3.3. LASSO Regression

LASSO regression is one of the most widely used methods belonging to the group of regularization techniques. It is a regression analysis method that performs both variable selection and coefficient regularization in order to improve model accuracy and interpretability [11].

The LASSO (Least Absolute Shrinkage and Selection Operator) method extends the least squares approach by adding a penalty term controlled by the constant λ. The magnitude of this penalty determines the degree of shrinkage applied to the regression coefficients. The optimal value of λ is typically selected using cross-validation, often as the value λ_min that minimizes the cross-validated prediction error. LASSO regression has strong predictive ability because it can shrink regression coefficients toward zero and in some cases set them exactly to zero, thereby effectively performing variable selection [12].

Unlike traditional inferential regression approaches, LASSO performs variable selection through coefficient penalization rather than statistical hypothesis testing. Therefore, variable relevance is determined by shrinkage behavior and predictive contribution instead of conventional p-values [13].

The use of the LASSO method has many advantages. Primarily, it can provide very good prediction accuracy, as the removal of coefficients may reduce variance without substantially increasing bias. This is especially useful when there is a small number of observations and a large number of variables. Additionally, LASSO helps improve model interpretability by eliminating irrelevant variables (Hastie, Tibshirani, Friedman, 2009).

In general form, the LASSO optimization problem can be expressed as [11,12]:

\min_{β_{0}, β} \left\{ \sum_{i=1}^{n} \left( y_{i} - β_{0} - \sum_{j=1}^{p} x_{ij} β_{j} \right)^{2} + λ \sum_{j=1}^{p} |β_{j}| \right\},

(1)

where λ ≥ 0 is the regularization parameter controlling the strength of the penalty.
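The shrinkage behavior of the penalty in Equation (1) can be illustrated with a minimal sketch for a single predictor without intercept, where the minimizer has a closed form based on soft-thresholding. The function names and the toy data are hypothetical and are not taken from the study; the sketch only shows how a sufficiently large λ sets a coefficient exactly to zero.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_objective(xs, ys, beta, lam):
    """Residual sum of squares plus L1 penalty (Equation (1) with p = 1, no intercept)."""
    rss = sum((y - x * beta) ** 2 for x, y in zip(xs, ys))
    return rss + lam * abs(beta)


def lasso_single_predictor(xs, ys, lam):
    """Closed-form minimizer of lasso_objective for one predictor.

    Setting the derivative of the objective to zero gives
    beta = soft_threshold(sum(x*y), lam / 2) / sum(x*x).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return soft_threshold(sxy, lam / 2.0) / sxx


# Toy data (hypothetical): y = 2x exactly.
xs = [-1.0, 0.0, 1.0]
ys = [-2.0, 0.0, 2.0]
for lam in (0.0, 4.0, 8.0):
    print(lam, lasso_single_predictor(xs, ys, lam))
```

On this toy data the estimate equals the least squares value 2.0 at λ = 0, shrinks to 1.0 at λ = 4, and is exactly 0.0 at λ = 8, which is the mechanism by which LASSO removes variables.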

In the context of this study, LASSO regression is applied as a variable selection technique prior to model estimation. The variables selected using LASSO are subsequently used as inputs in both logit and probit models. By reducing the number of predictors and mitigating multicollinearity, this approach contributes to more robust and interpretable classification models.

3.4. Logit and Probit

Logistic regression gradually replaced multiple discriminant analysis, which was dominant until the 1980s. It is considered one of the most reliable methods for distinguishing between bankrupt and financially healthy firms, with Ohlson recognized as a key pioneer [14,15,16].

The outputs of logistic regression are values in the interval (0,1), representing the probability of an event occurring. The method is suitable when the dependent variable is binary, while independent variables can be continuous, discrete, or categorical.

In multiple logistic regression, the probability cannot be modeled directly as a linear function of the explanatory variables. A naive linear specification would take the form:

P(Y = 1) = α + β_{1} X_{1} + \dots + β_{K} X_{K},

(2)

where

α—is the intercept,

β—regression parameters,

X—explanatory variables.

Since a linear combination of variables may produce values outside (0,1), a logit transformation is applied. First, the probability is converted into odds [17,18,19]:

\mathrm{odds}(Y = 1) = \frac{P(Y = 1)}{1 - P(Y = 1)}

(3)

Then, the odds are transformed using the natural logarithm:

\mathrm{logit}(Y) = \ln \frac{P(Y = 1)}{1 - P(Y = 1)}

(4)

After transformation, the logistic regression equation becomes:

\mathrm{logit}(Y) = α + β_{1} X_{1} + \dots + β_{K} X_{K}

(5)

The coefficient β_k indicates the change in the log-odds for a one-unit change in X_k, holding the other variables constant. The logit can be converted back into odds as follows [17,18,19]:

\mathrm{odds}(Y = 1) = e^{\mathrm{logit}(Y)} = e^{α + β_{1} X_{1} + \dots + β_{K} X_{K}}

(6)

The logit can be transformed back into probability using the inverse logistic function [17]:

P(Y = 1) = \frac{\mathrm{odds}(Y = 1)}{1 + \mathrm{odds}(Y = 1)} = \frac{e^{α + β_{1} X_{1} + \dots + β_{K} X_{K}}}{1 + e^{α + β_{1} X_{1} + \dots + β_{K} X_{K}}}

(7)
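The chain of transformations in Equations (3) to (7) can be checked numerically with a short sketch. The function names are illustrative and not part of the original study; the example only verifies that converting a probability to odds, then to the logit, and back again returns the starting probability.

```python
import math


def odds_from_probability(p):
    """Equation (3): odds = p / (1 - p), for 0 < p < 1."""
    return p / (1.0 - p)


def logit(p):
    """Equation (4): natural logarithm of the odds."""
    return math.log(odds_from_probability(p))


def inverse_logit(z):
    """Equation (7): exp(z) / (1 + exp(z)), written in a numerically stable form."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


p = 0.8                       # hypothetical probability
z = logit(p)                  # log-odds
print(odds_from_probability(p))
print(z)
print(inverse_logit(z))       # recovers p up to floating-point error
```

A probability of 0.8 corresponds to odds of about 4 to 1, and a logit of zero maps back to a probability of exactly 0.5.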

The probit model represents an alternative approach that differs mainly in its assumption of a normally distributed error term [18,19,20,21,22]. In the probit model, the probability is calculated using the cumulative distribution function of the standard normal distribution:

P_{i} = Φ(α + β x_{i})

(8)

where Φ is the cumulative distribution function of the standard normal distribution, defined by the following expression:

Φ(α + β x_{i}) = \int_{-\infty}^{α + β x_{i}} \frac{1}{\sqrt{2π}} \exp\left(-\frac{1}{2} t^{2}\right) dt

(9)

In contrast to the logit model, which is based on the logistic distribution, the probit model assumes a normal distribution of a latent variable. To evaluate both models, a cut-off value (typically 0.5) is applied. If p > 0.5, the company is considered at high risk of bankruptcy; otherwise, it is considered financially healthy [20,21,22]. This threshold corresponds to the central point of the symmetric link functions used in both models.
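As a minimal sketch of the probit probability and the 0.5 cut-off rule described above, the standard normal cumulative distribution function can be evaluated through the error function from the Python standard library. The parameter values alpha = -1.0 and beta = 2.0 are hypothetical and serve only to illustrate the calculation.

```python
import math


def standard_normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(alpha, beta, x):
    """Probit probability for a single explanatory variable x."""
    return standard_normal_cdf(alpha + beta * x)


def classify(p, cutoff=0.5):
    """Return 1 (at risk) if p exceeds the cut-off, otherwise 0 (healthy)."""
    return 1 if p > cutoff else 0


# Hypothetical parameters and inputs, not estimates from the study.
alpha, beta = -1.0, 2.0
for x in (0.0, 0.5, 1.0):
    p = probit_probability(alpha, beta, x)
    print(x, round(p, 4), classify(p))
```

With these illustrative values, x = 0.5 gives a linear predictor of zero and therefore a probability of exactly 0.5, which sits on the decision boundary and is classified as healthy under the strict inequality p > 0.5.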

3.5. Symmetry in Probabilistic Models

The concept of symmetry plays an important role in the formulation and interpretation of probabilistic classification models, particularly in binary decision problems such as financial distress prediction. In this context, symmetry can be understood both in terms of the distributional assumptions of the models and the mathematical properties of their link functions.

The prediction of financial distress represents a binary classification problem, where firms are categorized as either financially healthy or distressed. This dichotomous structure naturally motivates the application of symmetric probabilistic models, in which the decision boundary is typically defined by a threshold of 0.5.

Both logit and probit models are based on symmetric link functions that transform a linear combination of explanatory variables into probabilities within the interval (0,1). In the case of the logit model, this transformation is represented by the logistic function, which is symmetric around zero. Similarly, the probit model relies on the cumulative distribution function of the standard normal distribution, which is symmetric around its mean.

This symmetry implies that equal deviations in opposite directions from the decision boundary are transformed in a mathematically consistent manner. However, this property applies to the link function itself and does not necessarily imply identical empirical classification performance.
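The symmetry property can be stated compactly. In the following sketch, σ denotes the logistic function of Equation (7) written in terms of the linear predictor z, and Φ denotes the standard normal cumulative distribution function of Equation (9); the symbols σ and z are introduced here for illustration only.

```latex
\sigma(z) = \frac{e^{z}}{1 + e^{z}}, \qquad
\sigma(-z) = \frac{e^{-z}}{1 + e^{-z}} = \frac{1}{1 + e^{z}} = 1 - \sigma(z),
\qquad
\Phi(-z) = 1 - \Phi(z),
\qquad
\sigma(0) = \Phi(0) = \tfrac{1}{2}.
```

Both functions are therefore point-symmetric about (0, 1/2), so a cut-off of 0.5 on the probability scale corresponds to a linear predictor of zero in both models.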

From a modeling perspective, the assumption of symmetry simplifies coefficient interpretation and provides a consistent probabilistic framework for a binary classification. This property is particularly valuable in financial applications, where transparency and interpretability are essential for decision-making.

Recent studies have further emphasized the importance of symmetry-oriented mathematical structures and interpretable analytical frameworks in predictive and decision-support models [23].

Nevertheless, real-world financial data may exhibit asymmetries, such as skewed distributions or class imbalance. In such cases, symmetric models may not fully capture complex nonlinear relationships. Despite this limitation, their robustness, interpretability and theoretical transparency make logit and probit models suitable baseline approaches for financial distress prediction.

In this study, the concept of symmetry provides a theoretical framework for comparing logit and probit models and supports the interpretation of their predictive performance in financial risk assessment.

The empirical analysis presented in the following Section 4 evaluates whether these theoretical symmetric properties are reflected in model classification outcomes.

4. Results

4.1. Input Variables

The input dataset for the analysis of the financial situation and subsequent bankruptcy prediction consisted of 351 Slovak enterprises from the year 2021 operating within SK NACE C—Manufacturing sector/29—Manufacture of motor vehicles, trailers and semi-trailers. The financial statements database was obtained from the Universal Register CRIBIS, provided by CRIF—Slovak Credit Bureau, s.r.o. The analyzed sector is a key industry in the Slovak economy in terms of both employment and GDP contribution.

For the purpose of analysis and model development, the selected companies were classified as prosperous (0) or non-prosperous (1). The classification criteria for non-prosperous companies were based on the methodology in [20] and included the following:

Equity-to-liabilities ratio lower than 0.08,
Current ratio below 1,
Negative EAT (Earnings After Tax).

These criteria reflect the currently valid legislation in the respective area. According to them, the dataset included 198 prosperous and 153 non-prosperous companies.
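The labeling rule can be sketched as a small function. The text lists the three criteria but does not state explicitly whether they must hold simultaneously, so the sketch below assumes by default that all three are required and exposes this choice as the hypothetical parameter `require_all`. The function and argument names are illustrative.

```python
def is_non_prosperous(equity_to_liabilities, current_ratio, eat, require_all=True):
    """Return 1 for a non-prosperous firm and 0 for a prosperous one.

    ASSUMPTION: require_all=True treats the three criteria as jointly
    required; the source text does not specify how they are combined.
    """
    flags = [
        equity_to_liabilities < 0.08,  # equity-to-liabilities ratio lower than 0.08
        current_ratio < 1.0,           # current ratio below 1
        eat < 0.0,                     # negative earnings after tax
    ]
    return int(all(flags) if require_all else any(flags))


# Hypothetical firms, not drawn from the dataset.
print(is_non_prosperous(0.05, 0.8, -120000.0))
print(is_non_prosperous(0.50, 1.6, 35000.0))
print(is_non_prosperous(0.05, 1.6, 35000.0, require_all=False))
```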

The dataset exhibits a moderate class imbalance, with a slightly higher proportion of financially healthy companies, which may influence classification performance, particularly in terms of sensitivity to the minority class.

4.2. Application of LASSO Regression for Variable Selection

To assess the financial health of the selected companies, 20 financial indicators were applied. The LASSO regression method was used to select the indicators that served as inputs for the logit and probit models. The optimal value of the penalty constant λ was chosen based on the minimum prediction error of the model, λ_min, which corresponds to the lowest point on the curve shown in Figure 1.

The value of λ_min is 0.00329. Out of the 20 initial indicators, the application of LASSO regression reduced the number of indicators with non-zero coefficients to 13 (see Table 1). Coefficients close to zero are considered less significant and were therefore excluded.

Figure 2 illustrates the coefficients of the indicators as a function of λ.

The LASSO regression method tends to select only one variable from a group of highly correlated variables while excluding the others. Therefore, no further analysis of the input variables was performed, and the output of the LASSO regression is taken as the set of input indicators for the logit and probit models.

The variables retained by the LASSO procedure were subsequently subjected to logit and probit estimation, from which only the most statistically and practically relevant predictors were included in the final model specifications.

The optimal value of the regularization parameter was determined using cross-validation, ensuring the robustness and generalizability of the selected model.

4.3. Application of Logit in the Prediction of Financial Health

The calculation of the logit model was performed using STATISTICA 13.6 software, into which the input data of financial indicators selected by the LASSO regression method for the assessed sample of companies were entered. The financial indicators represent the independent variables, while the data on the financial health status of the companies represent the binary dependent variable.

The resulting logit equation representing the predictive model of financial health for companies in the automotive industry operating in Slovakia has the form:

π_{M} = \frac{1}{1 + e^{- (3.169 + 0.951 x_{1} + 3.781 x_{2} + 0.385 x_{3} - 8.133 x_{4} - 0.717 x_{5} + 3.881 x_{6} + 8.584 x_{7})}}

(10)

where

x₁—third-degree liquidity,
x₂—accounts payable turnover period,
x₃—asset turnover,
x₄—total asset indebtedness,
x₅—equity-to-liabilities ratio,
x₆—long-term asset indebtedness,
x₇—return on assets (ROA).
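To show how Equation (10) is evaluated for a single firm, the following sketch applies the reported coefficients to a vector of indicator values. The units and scaling of the indicators are not specified in the text, so the input values are purely hypothetical, and the function name is illustrative.

```python
import math

# Intercept and coefficients of x1..x7 as reported in Equation (10).
LOGIT_INTERCEPT = 3.169
LOGIT_COEFS = [0.951, 3.781, 0.385, -8.133, -0.717, 3.881, 8.584]


def logit_pi(x):
    """Evaluate pi_M from Equation (10) for indicator values x = [x1, ..., x7]."""
    if len(x) != len(LOGIT_COEFS):
        raise ValueError("expected 7 indicator values")
    z = LOGIT_INTERCEPT + sum(b * v for b, v in zip(LOGIT_COEFS, x))
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical indicator values (assumed scaling, not from the dataset).
firm = [1.2, 0.15, 1.1, 0.65, 0.5, 0.2, 0.04]
print(round(logit_pi(firm), 4))
```

Because the coefficient of x₄ is negative and that of x₇ is positive, increasing total asset indebtedness lowers π_M while increasing ROA raises it, all else being equal.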

The following table (Table 2) presents the significance tests of the developed logit model. The Likelihood Ratio test (LR test) is particularly important, as it identifies the variables whose inclusion increases the model's likelihood.

When developing a predictive model, it is essential to ensure that the model explains the highest possible degree of pseudo-variability (the quality of the model’s explanatory power, demonstrating the relationship between the dependent and independent variables). This can be assessed using statistical metrics such as Cox-Snell R², Nagelkerke R², and Log-likelihood (Table 3).

Nagelkerke R² is generally considered the most indicative metric because it best reflects the model’s explanatory power and ranges between 0 and 1, where values closer to 1 indicate better model performance.

The classification performance of the developed logit model was validated using the confusion matrix (Table 4). The confusion matrix expresses the classification accuracy of the model at 81.82% for prosperous firms and 75.16% for non-prosperous (bankrupt) firms. The overall predictive accuracy of the developed logit model is 78.92%.

The classification accuracy of the logit model was also evaluated using the ROC curve, shown in Figure 3. The overall predictive performance of the model is expressed by the area under the ROC curve (AUC). In our case, the AUC is 0.8767, indicating that the predictive ability of the logit model is good.

To provide a more comprehensive evaluation of classification performance, additional metrics were considered. The logit model achieved a precision of 76.1%, recall of 75.2%, and an F1-score of 75.6% for the distressed class. These results indicate a balanced trade-off between precision and recall, which is particularly important given the moderate class imbalance in the dataset.
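The relationship between these metrics and the confusion matrix can be sketched as follows. The cell counts used here are not copied from Table 4; they are inferred from the reported class sizes (198 prosperous, 153 non-prosperous) and the reported per-class accuracies (81.82% and 75.16%), and should be read as an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute standard metrics, treating the distressed class as positive."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # sensitivity for distressed firms
    specificity = tn / (tn + fp)     # accuracy for prosperous firms
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Inferred counts (assumption): 115 of 153 distressed and 162 of 198
# prosperous firms classified correctly.
m = classification_metrics(tp=115, fn=38, tn=162, fp=36)
for name, value in m.items():
    print(f"{name}: {100 * value:.2f}%")
```

Under these inferred counts the sketch reproduces the reported overall accuracy of 78.92% and comes close to the reported precision, recall, and F1-score; small differences in the last digit may stem from rounding.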

Furthermore, sensitivity analysis of the classification threshold was performed. While the standard cut-off value of 0.5 provides balanced classification, lower threshold values slightly improve recall for distressed firms at the expense of overall accuracy. This confirms that model performance can be adjusted depending on the risk preferences of decision-makers.
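The effect of the cut-off value can be illustrated with a small sketch. The predicted probabilities and labels below are invented for illustration and do not come from the study; they are chosen so that lowering the threshold raises recall for the distressed class while reducing overall accuracy.

```python
def recall_and_accuracy(probs, labels, cutoff):
    """Return (recall for class 1, overall accuracy) at the given cut-off."""
    preds = [1 if p > cutoff else 0 for p in probs]
    tp = sum(1 for yhat, y in zip(preds, labels) if yhat == 1 and y == 1)
    fn = sum(1 for yhat, y in zip(preds, labels) if yhat == 0 and y == 1)
    correct = sum(1 for yhat, y in zip(preds, labels) if yhat == y)
    return tp / (tp + fn), correct / len(labels)


# Hypothetical predicted probabilities of distress and true labels (1 = distressed).
probs = [0.92, 0.75, 0.55, 0.45, 0.20, 0.60, 0.48, 0.44, 0.41, 0.10]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

for cutoff in (0.5, 0.4):
    recall, accuracy = recall_and_accuracy(probs, labels, cutoff)
    print(cutoff, recall, accuracy)
```

On this toy data, moving the cut-off from 0.5 to 0.4 increases recall from 0.6 to 0.8 but lowers accuracy from 0.7 to 0.5, because more healthy firms are flagged as distressed.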

4.4. Application of Probit in the Prediction of Financial Health

A similar approach was applied in developing the probit prediction model, with the input variables remaining unchanged.

The resulting probit equation, constructed from the estimated parameters of statistically significant variables and representing the predictive model of the financial health of automotive industry enterprises operating in the Slovak Republic, is given by:

π_{M} = Φ(2.187 + 0.359 x_{1} + 2.308 x_{2} + 0.254 x_{3} - 5.154 x_{4} - 0.416 x_{5} + 2.619 x_{6} + 4.752 x_{7})

(11)

where

x₁—third-degree liquidity,
x₂—accounts payable turnover period,
x₃—asset turnover,
x₄—total asset indebtedness,
x₅—equity-to-liabilities ratio,
x₆—long-term asset indebtedness,
x₇—return on assets (ROA).

The following table (Table 5) presents the significance tests for the developed probit model. The Likelihood Ratio test (LR test) is significant, indicating that the included variables improve the model's fit.

Table 6 presents the results of the statistical metrics Cox-Snell R², Nagelkerke R², and Log-likelihood. The Nagelkerke R² value indicates that 48.55% of the variability of the dependent variable is explained by the probit model.

The classification ability of the developed probit model was evaluated using the confusion matrix (Table 7) and the ROC curve (Figure 4).

The results shown in the confusion matrix indicate that the model correctly classified 273 companies and misclassified 78 companies. The classification accuracy of the probit model is 81.31% for prosperous companies and 73.2% for non-prosperous companies. The overall predictive accuracy of the probit model is 77.78%.

The classification accuracy of the probit model was also evaluated using the ROC curve (Figure 4), where the AUC is 0.8653, indicating that the predictive ability of the probit model is good.

The symmetric nature of both models is reflected in their comparable classification performance across both classes.

Similarly, the probit model achieved a precision of 74.3%, recall of 73.2%, and an F1-score of 73.7% for the distressed class. These results are consistent with those of the logit model and confirm comparable classification behavior across both approaches.

The slightly lower recall of the probit model suggests a marginally reduced sensitivity in detecting financially distressed firms. However, the differences between the models remain minimal, supporting their functional equivalence in practical applications.

4.5. Comparative Analysis and Symmetry Interpretation

The comparative analysis of the logit and probit models reveals a high degree of similarity in their predictive performance, as reflected in accuracy, AUC, and class-specific evaluation metrics. This empirical finding supports the theoretical assumption that both models, based on symmetric link functions, exhibit comparable classification behavior.
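One way to see the similarity of the two fitted models is to compare their coefficients directly. A common rule of thumb is that logit coefficients are roughly 1.6 to 1.8 times the corresponding probit coefficients, because the standard logistic distribution has a larger standard deviation (π/√3 ≈ 1.81) than the standard normal. The following sketch computes the ratios from the coefficients reported in Equations (10) and (11); the dictionary names are illustrative.

```python
# Coefficients as reported in Equation (10) (logit) and Equation (11) (probit).
logit_coefs = {
    "intercept": 3.169, "x1": 0.951, "x2": 3.781, "x3": 0.385,
    "x4": -8.133, "x5": -0.717, "x6": 3.881, "x7": 8.584,
}
probit_coefs = {
    "intercept": 2.187, "x1": 0.359, "x2": 2.308, "x3": 0.254,
    "x4": -5.154, "x5": -0.416, "x6": 2.619, "x7": 4.752,
}

# Ratio of each logit coefficient to the matching probit coefficient.
ratios = {name: logit_coefs[name] / probit_coefs[name] for name in logit_coefs}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

All coefficients share the same sign across the two models, and most ratios fall between about 1.4 and 1.8, with the liquidity indicator x₁ as the main exception, which is in line with the two models describing nearly the same relationship on different scales.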

From the perspective of symmetry, both models apply a balanced transformation of the linear predictor into probability space, ensuring that deviations in explanatory variables are treated consistently around the decision boundary. This property is reflected in the relatively even classification performance across both classes.

However, it is important to emphasize that symmetry is primarily a theoretical property of the link functions and does not guarantee identical empirical results. Minor differences observed in recall and precision suggest that data-specific characteristics, such as distribution shape and class imbalance, may influence classification outcomes.

Although the primary objective of this study was not to provide an exhaustive comparison between regularized and non-regularized approaches, preliminary estimations using standard logit and probit models without LASSO regularization were also considered. The obtained predictive performance was comparable; however, the non-regularized models retained a larger number of variables and exhibited lower interpretability. Therefore, the LASSO-based approach was preferred due to its ability to reduce model complexity while preserving predictive performance.

Overall, the results confirm that symmetric probabilistic models provide stable, interpretable, and robust performance in financial distress prediction, while their differences remain negligible in practical applications.

5. Conclusions and Discussion

This study addressed the prediction of financial distress in automotive industry enterprises in the Slovak Republic using selected interpretable classification models, specifically logit and probit regression. Leveraging LASSO (Least Absolute Shrinkage and Selection Operator) regression for variable selection on a dataset of 351 firms allowed for the extraction of the most informative financial indicators, thereby improving model performance and enhancing interpretability.

The use of a single-year dataset limits the ability to capture temporal dynamics and cyclical effects in financial distress development.

Both models exhibited strong predictive capabilities, with the logit model achieving an accuracy of 78.9% and the probit model 77.8%. The applied methodology aligns with current trends in machine learning that emphasize sparsity, dimensionality reduction, and model transparency—crucial aspects in financial decision-making environments where explainability is paramount.

The comparative analysis revealed minimal performance differences between the models, which is consistent with the literature that treats them as functionally equivalent for binary classification tasks.

The findings suggest that while symmetry contributes to model stability and interpretability, it does not lead to significant differences in predictive performance. This indicates that the practical relevance of symmetry lies primarily in the theoretical consistency of the models rather than in measurable performance gains.

The logit model slightly outperformed the probit model in terms of accuracy and AUC, which may be attributed to the heavier tails of the logistic distribution and the resulting lower sensitivity to extreme values.
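
One way to illustrate the point about extreme values is to compare the tails of the two underlying distributions once both are scaled to unit variance. The sketch below is an illustration under that scaling assumption and is not taken from the study; the function names are hypothetical.

```python
import math

def normal_lower_tail(z: float) -> float:
    """P(Z <= z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def unit_variance_logistic_lower_tail(z: float) -> float:
    """P(Z <= z) for a logistic variable rescaled to variance 1.

    The standard logistic distribution has variance pi^2 / 3, so the
    argument is multiplied by pi / sqrt(3) to match the normal scale.
    """
    return 1.0 / (1.0 + math.exp(-z * math.pi / math.sqrt(3.0)))

# Compare lower-tail probabilities far from the centre of the distribution.
for z in (-3.0, -4.0, -5.0):
    print(z, normal_lower_tail(z), unit_variance_logistic_lower_tail(z))
```

For these values the logistic tail probability is larger than the normal one, so an observation with an extreme linear predictor is treated as less surprising under the logit model. Whether this explains the small accuracy difference in this dataset remains a conjecture.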

Despite these contributions, several limitations must be acknowledged. First, the analysis is based on cross-sectional data from a single year (2021), which may limit the generalizability of the results. Although the use of a sector-specific dataset ensures consistency, future research should incorporate longitudinal data to capture dynamic changes in financial conditions over time.

Second, the dataset exhibits moderate class imbalance (198 versus 153 firms in the two classes), which may influence classification sensitivity, particularly for distressed firms. Techniques such as oversampling (e.g., SMOTE) could further improve model performance in this regard.
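
The idea behind SMOTE-type oversampling can be sketched in a few lines. The code below is a simplified illustration, not the reference SMOTE implementation and not part of the study: it creates synthetic observations by interpolating between a sample of the smaller class and one of its nearest neighbours from the same class. The function name, the toy data, and the choice of k are assumptions made for illustration only.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (Euclidean distance)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical two-indicator observations for the smaller class (not study data).
toy_minority = [(0.9, 0.2), (1.1, 0.3), (0.8, 0.25), (1.0, 0.15), (1.2, 0.35)]

# With 198 and 153 firms in the two classes, 45 synthetic points would balance them.
new_points = smote_like(toy_minority, n_new=198 - 153)
print(len(new_points))
```

Synthetic points should be generated from the training data only, so that the evaluation set still reflects the original class proportions.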

In addition, the study focuses on probabilistic models that are linear in the predictors, which may not fully capture complex nonlinear relationships between financial indicators. Future research could extend the analysis by incorporating advanced machine learning methods, such as decision trees, random forests, or support vector machines, and comparing their performance with interpretable models.

The presented approach contributes to the application of machine learning in finance and risk management, offering a practical, interpretable framework for early detection of financial distress. These findings have implications not only for financial analysts and regulators, but also for the broader machine learning community exploring transparent models in structured tabular data environments.

Future studies may also compare alternative regularization approaches such as Elastic Net or Ridge regression. Recent research has increasingly highlighted the importance of mathematically interpretable, symmetry-oriented, and data-driven approaches in predictive modeling and intelligent decision systems. These studies emphasize the growing role of advanced analytical frameworks, optimization methods, and explainable models in complex industrial and financial environments. In this context, the application of interpretable probabilistic models combined with regularization techniques represents a promising direction for future research [24,25].

Author Contributions

Conceptualization, P.T., M.P., M.K. and J.K.; methodology, P.T., M.P., M.K. and J.K.; software, J.K., M.P., P.T. and M.K.; validation, P.T., M.P., M.K. and J.K.; formal analysis, M.P., P.T., M.K. and J.K.; investigation, P.T., M.P., M.K. and J.K.; resources, M.K. and P.T.; data curation, M.K., J.K., P.T. and M.P.; writing—original draft preparation, J.K., M.P., M.K. and P.T.; writing—review and editing, J.K.; project administration, P.T. and M.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

This article was created through the implementation of the grant projects APVV-19-0418 “Intelligent solutions to enhance business innovation capability in the process of transforming them into smart businesses”, APVV-17-0258 “Digital engineering elements application in innovation and optimization of production flows”, VEGA 1/0383/25 “Optimizing the activities of manufacturing enterprises and their digitization using advanced virtual means and tools”, KEGA 003TUKE-4/2024 “Innovation of the profile of industrial engineering graduates in the context of required knowledge and specific capabilities for research and implementation of intelligent systems of the future”, VEGA 1/0160/26 “Integration of Digital Technologies into Business Processes in the Context of Industry 4.0/5.0”, and KEGA 037TUKE-4/2026 “EduSmart 5.0: Innovative Education for Competence Development in the Era of Industry 4.0 and 5.0”.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Klieštik, T.; Valášková, K.; Lázaroiu, G. Remaining financially healthy and competitive: The role of financial predictors. J. Compet. 2020, 12, 74–92.
2. Klieštik, T.; Kočišová, K.; Mišanková, M. Logit and probit model used for prediction of financial health of company. Procedia Econ. Financ. 2015, 23, 850–855.
3. Meloun, M.; Militký, J. Compendium of Statistical Data Processing: Methods and Solved Problems; Academia: Prague, Czech Republic, 2006.
4. Synek, M. Managerial Economics; Grada: Prague, Czech Republic, 2011.
5. Zalai, K.; Kalafutová, E.; Šnircová, J. Financial and Economic Analysis of the Company; Elita: Bratislava, Slovakia, 2001.
6. Altman, E.I. The success of business failure prediction models—An international survey. J. Bank. Financ. 1984, 8, 171–198.
7. Valášková, K.; Ďurana, P.; Adamko, P.; Jaros, J. Financial compass for Slovak enterprises: Modeling economic stability. J. Risk Financ. Manag. 2020, 13, 92.
8. Valášková, K.; Gajdošíková, D.; Belas, J. Bankruptcy prediction in the post-pandemic period: A case study of Visegrad Group countries. Oecon. Copernic. 2023, 14, 253–293.
9. Kováčová, M.; Klieštik, T.; Valášková, K.; Radišić, M.; Borocki, J. Bankruptcy models: Verifying their validity as a predictor of corporate failure. Pol. J. Manag. Stud. 2018, 18, 167–179.
10. Kang, L. Statistical analysis and case investigation of fatal fall-from-height accidents in the Chinese construction industry. Int. J. Ind. Eng. Theory Appl. Pract. 2022, 29, 1–10.
11. Fahrmeir, L.; Kneib, T.; Lang, S. Regression—Models, Methods and Applications; Springer: Berlin/Heidelberg, Germany, 2013.
12. Fonti, V.; Belitser, E. Feature Selection Using LASSO; VU Amsterdam Research Paper in Business Analytics; Vrije Universiteit Amsterdam: Amsterdam, The Netherlands, 2017; pp. 1–25.
13. Kipruto, E.; Sauerbrei, W. Evaluating Prediction Performance: A Simulation Study Comparing Penalized and Classical Variable Selection Methods in Low-Dimensional Data. Appl. Sci. 2025, 15, 7443.
14. Herzog, I.; Grabowska, M. Quality cost account as a framework of continuous improvement at operational and strategic level. Manag. Prod. Eng. Rev. 2021, 12, 122–132.
15. Hiadlovský, V.; Kráľ, P. Possibilities of predicting the financial situation of enterprises in Slovakia using SPSS. Forum Stat. Slovacum 2006, 2, 4–10.
16. Straka, M.; Šofranko, M.; Glova Vegsoova, O.; Kovalcik, J. Simulation of homogeneous production processes. Int. J. Simul. Model. 2022, 21, 214–225.
17. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.; Springer: New York, NY, USA, 2009.
18. Gundová, P. Verification of the selected prediction methods in Slovak companies. Acta Acad. Karv. 2014, 14, 26–38.
19. Grznar, P.; Gregor, M.; Gaso, M.; Gabajova, G.; Schickerle, M.; Burganova, N. Dynamic simulation tool for planning and optimisation of supply process. Int. J. Simul. Model. 2021, 20, 441–452.
20. Horváthová, J.; Mokrišová, M. Risk of bankruptcy, its determinants and models. Risks 2018, 6, 117.
21. Glova, J.; Mrazkova, S.; Dancakova, D. Measurement of intangibles and knowledge: An empirical evidence. Ad Alta J. Interdiscip. Res. 2018, 8, 76–80.
22. Grunwald, R.; Holečková, J. Financial Analysis and Planning of the Company; Ekopress: Prague, Czech Republic, 2007.
23. Akhtar, N.; Alharthi, M.F. Enhanced ridge estimators to effectively address multicollinearity challenges. AIP Adv. 2025, 15, 035143.
24. Alharthi, M.F.; Akhtar, N. Modified Two-Parameter Ridge Estimators for Enhanced Regression Performance in the Presence of Multicollinearity: Simulations and Medical Data Applications. Axioms 2025, 14, 527.
25. Akhtar, N.; Alharthi, M.F.; Khan, M.S. Mitigating Multicollinearity in Regression: A Study on Improved Ridge Estimators. Mathematics 2024, 12, 3027.

Figure 1. Cross-validation error as a function of lambda.

Figure 2. Coefficients of indicators as a function of lambda.

Figure 3. ROC curve for the logit model.

Figure 4. ROC curve for the probit model.

Table 1. Variable Selection Using LASSO Regression.

Sign	Indicators	Coefficients (model lambda = 0.00329)	Status
	Constant	2.77465506
x₈	Liquidity II. degree		excluded
x₁	Liquidity III. degree	0.756475535	selected
x₉	Stock turnover time		excluded
x₁₀	Maturity period of receivables	1.74585723	selected
x₁₁	Maturity period of short-term receivables	−0.433764117	selected
x₁₂	Maturity period of obligations		excluded
x₂	Maturity period of short-term liabilities	2.93061848	selected
x₃	Asset turnover	0.353950083	selected
x₄	Total indebtedness of assets	−7.12882394	selected
x₅	Equity to liabilities ratio	−0.558660517	selected
x₆	Long-term indebtedness of assets	3.46935374	selected
x₁₃	Credit indebtedness of assets	−0.591160135	selected
x₁₄	Interest coverage	0.000552017394	excluded
x₁₅	Current indebtedness		excluded
x₁₆	Return on equity	−0.000619657135	excluded
x₇	Return on assets-gross	6.22307735	selected
x₁₇	Operating profitability of sales	−2.32028663	selected
x₁₈	Share of newly created value in sales		excluded
x₁₉	Share of added value in sales	−0.765868794	selected
x₂₀	Share of EBITDA in sales	−2.36252006	selected

Table 2. Significance tests of the logit model.

	Chi-Square	Df
Likelihood Ratio	162.768071	13
Score	118.855559	13
Wald	76.488593	13

Table 3. Assessment of the quality of the logit model.

Measure	Value
Cox-Snell R²	0.371065
Nagelkerke R²	0.497509
Log-likelihood	−159.018055

Table 4. Confusion matrix of the logit model.

	Predicted Value: No	Predicted Value: Yes	Prediction Accuracy (%)
Actual: No	162	36	81.8181818
Actual: Yes	38	115	75.1633987

Table 5. Significance tests of the probit model.

	Chi-Square	Df
Likelihood Ratio	157.788203	13
Score	118.855559	13
Wald	92.066123	13

Table 6. Assessment of the quality of the probit model.

Measure	Value
Cox-Snell R²	0.362078
Nagelkerke R²	0.485460
Log-likelihood	−161.507989
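
The pseudo-R² values in Tables 3 and 6 can be reproduced from the likelihood-ratio statistics and log-likelihoods reported for the two models. The sketch below assumes the standard Cox-Snell and Nagelkerke definitions and n = 351 firms. The null log-likelihood is not reported in the article and is recovered here from the identity LR = 2(ℓ_model − ℓ_null); the function name is illustrative.

```python
import math

def pseudo_r2(lr_chi2: float, loglik_model: float, n: int):
    """Return (Cox-Snell R^2, Nagelkerke R^2) from a likelihood-ratio
    statistic, the fitted model's log-likelihood, and the sample size."""
    # Recover the intercept-only log-likelihood from LR = 2 * (ll_model - ll_null).
    loglik_null = loglik_model - lr_chi2 / 2.0
    cox_snell = 1.0 - math.exp(-lr_chi2 / n)
    # Upper bound of Cox-Snell R^2, used by Nagelkerke to rescale to [0, 1].
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell, cox_snell / max_cox_snell

# Values taken from the significance and quality tables of the two models.
logit_r2 = pseudo_r2(lr_chi2=162.768071, loglik_model=-159.018055, n=351)
probit_r2 = pseudo_r2(lr_chi2=157.788203, loglik_model=-161.507989, n=351)
print(logit_r2, probit_r2)
```

Both models imply the same null log-likelihood, as expected for an intercept-only benchmark that depends only on the class proportions in the sample.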

Table 7. Confusion matrix of the probit model.

	Predicted Value: No	Predicted Value: Yes	Prediction Accuracy (%)
Actual: No	161	37	81.3131313
Actual: Yes	41	112	73.2026144
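
The accuracy figures quoted in the text follow directly from the confusion matrices in Tables 4 and 7. The sketch below recomputes them together with the class-wise rates. It assumes that "Yes" denotes the positive class, which is an interpretation of the table labels rather than something the tables state; the function name is illustrative.

```python
def classification_metrics(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Summary rates from a 2x2 confusion matrix with 'Yes' as the positive class."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,   # share of all firms classified correctly
        "specificity": tn / (tn + fp),   # share of actual "No" predicted as "No"
        "sensitivity": tp / (tp + fn),   # share of actual "Yes" predicted as "Yes" (recall)
        "precision": tp / (tp + fp),     # share of predicted "Yes" that are actual "Yes"
    }

# Counts copied from the confusion matrices of the logit and probit models.
logit = classification_metrics(tn=162, fp=36, fn=38, tp=115)
probit = classification_metrics(tn=161, fp=37, fn=41, tp=112)

for name, metrics in (("logit", logit), ("probit", probit)):
    print(name, {key: round(100 * value, 1) for key, value in metrics.items()})
```

The two matrices differ by one firm in the "No" row and three firms in the "Yes" row, which accounts for the difference of roughly one percentage point in overall accuracy.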

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Trebuňa, P.; Kronová, J.; Kliment, M.; Pekarčíková, M. Symmetry-Based Comparison of Logit and Probit Models for Financial Distress Prediction in the Automotive Industry. Symmetry 2026, 18, 973. https://doi.org/10.3390/sym18060973
