3.1. Dataset and General Setting of the Study
The dataset of this paper includes all Estonian firms that went bankrupt in 2013–2017, subject to the following restrictions. First, all firms must have the information available to calculate the variables outlined in Section 3.2 and Section 3.3. Second, we require the financial report of a bankrupted firm to be no older than two years at the moment of bankruptcy declaration. With this restriction, we ensure that the annual report portrays the pre-bankruptcy financial situation homogeneously for the firms included in the analysis and is available for comparison with payment defaults. On average, the financial report in the dataset portrays the financial situation one year before bankruptcy declaration. In total, 512 bankrupted firms are included in the analysis, all of which are SMEs.
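The report-age restriction described above can be illustrated with a minimal sketch. The field names, the two example firms, and the approximation of "two years" as 730 days are assumptions made for illustration only; they are not taken from the study's data.

```python
from datetime import date

# Assumption: "two years" is approximated as 730 days.
MAX_REPORT_AGE_DAYS = 2 * 365


def report_is_recent(report_date: date, bankruptcy_date: date) -> bool:
    """True if the report precedes the bankruptcy declaration by at most two years."""
    age_days = (bankruptcy_date - report_date).days
    return 0 <= age_days <= MAX_REPORT_AGE_DAYS


# Hypothetical firms, not real observations.
firms = [
    {"id": "A", "report_date": date(2014, 12, 31), "bankruptcy_date": date(2015, 11, 15)},
    {"id": "B", "report_date": date(2012, 12, 31), "bankruptcy_date": date(2015, 6, 1)},
]

# Keep only firms whose last report is recent enough.
kept = [f["id"] for f in firms if report_is_recent(f["report_date"], f["bankruptcy_date"])]
```

In this sketch, firm B would be dropped because its last report is more than two years older than the bankruptcy declaration.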
Concerning survived firms, we use 4003 firms that were operating at the time of the analysis. All firms which have financial information available from 2011 to 2015 are chosen, irrespective of how well they perform. The latter is important to avoid the bias of discriminating only between bankrupt and “successful” survived firms. The period 2011–2015 is determined by the fact that the reports of bankrupted firms originate from the same time interval. In the viewed period, Estonia had recovered from the consequences of the global financial crisis and these years were characterized by stable economic growth. Thus, the viewed period is not subject to any abnormal performance of firms due to economic recession.
For calculating the financial ratios of bankrupt firms, we use the last available annual financial report before bankruptcy. In case of survived firms, we calculate the financial ratios for all firms for all five years included in the analysis. In Estonia, firms are required to submit an annual report within six months after the end of the fiscal year, which for the vast majority of firms coincides with the calendar year.
Concerning taxes, firms need to submit tax reports and pay taxes twice in the month following the month in which the taxes were incurred: on the 10th day of the month for taxes related to salaries and on the 20th day of the month for value added tax. Estonia is among the few countries in the world where profit is not taxed on an accrual basis, but only when dividends are paid. When dividends are paid, the respective income tax is subject to the same principles as salaries. When tax arrears (i.e., unpaid tax debt due) occur, this is observable in real time in the Estonian Tax and Customs Board database. From this database, we have obtained the values of tax arrears for the whole population for each month end in the viewed period of 2011–2017. Using the month end is a more suitable option than, for instance, a one day delay in paying taxes. This is because delays of a few days in paying taxes are common in Estonia and are more likely to stem from administrative or diligence reasons than to point to a temporary liquidity crisis. Thus, tax arrears’ information can be used dynamically to view the emergence of problems up to the exact month when bankruptcy occurred. As the annual reports are up to 2 years old, in case of tax arrears data, we consider a 24 month long period before bankruptcy is declared. For survived firms, we use multiple 24 month long periods within the years from 2011 to 2016.
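The construction of a 24 month long series of month-end tax arrears can be sketched as follows. This is only an illustration: the function names, the dictionary layout keyed by (year, month), the treatment of missing months as zero arrears, and the choice of which month closes the window are all assumptions rather than details given in the study.

```python
def window_months(end_year, end_month, length=24):
    """Return (year, month) pairs for `length` consecutive months ending at
    (end_year, end_month), oldest first."""
    end_index = end_year * 12 + (end_month - 1)
    months = []
    for index in range(end_index - length + 1, end_index + 1):
        year, month0 = divmod(index, 12)
        months.append((year, month0 + 1))
    return months


def arrears_window(month_end_arrears, end_year, end_month, length=24):
    """Extract month-end arrears for the window.
    Assumption: a month without a record means no arrears (0.0)."""
    return [month_end_arrears.get(ym, 0.0)
            for ym in window_months(end_year, end_month, length)]


# Hypothetical firm with arrears in the last two months of the window.
arrears = {(2015, 2): 1200.0, (2015, 3): 3500.0}
series = arrears_window(arrears, 2015, 3)
months_in_arrears = sum(1 for value in series if value > 0)
```

For a survived firm, the same function could be called with several different end months to obtain multiple 24 month long periods.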
We do not use other payment defaults (i.e., to private creditors, such as banks and suppliers) in this study for multiple reasons. First, in Estonia no single database incorporates all payment defaults to private creditors. Second, some such defaults might not be documented, for instance because of their small size or because creditors pursue their claims in a different way (e.g., suing managers who have guaranteed the credit). Third, such defaults might not be documented precisely with respect to their start or end period, e.g., because creditors could be delaying the execution of a claim due to groundless promises by debtors to pay the debt.
To determine in which period tax arrears’ information is more useful than financial ratios, we consider different pre-bankruptcy periods concerning tax arrears. The usage of financial information in this study is summarized in Table 1.
3.2. Financial Ratios Portraying Different Domains
The financial ratios for this study have been chosen based on their previous usage in bankruptcy prediction, ensuring that all important financial ratio domains are covered (see the formulas and ratio domains in Table 2). Firm leverage is reflected by the total debt to total assets ratio (DA). This ratio in its different forms (e.g., total equity to total assets or total equity to total debt) might be the most common and useful failure predictor. The ratio has a strong intersection with legislation, as business and insolvency codes in different countries often set minimum requirements for firms’ equity. Profitability is captured with two ratios, i.e., net income to total assets and net income to operating revenue. The former is the more common profitability ratio in bankruptcy prediction and was already used in the Altman model, although with EBIT instead of net income in the numerator. Static liquidity is portrayed with two ratios, namely the quotient of cash minus current liabilities to total assets and the quotient of current assets minus current liabilities to total assets. These ratios have frequently been used in the form of cash to current liabilities (quick ratio) and current assets to current liabilities (current ratio), but the usage of such ratios is problematic. Namely, as among survived firms there might be a fair number of companies with no or a very low level of current liabilities, such ratios would obtain extreme values or could not be calculated at all. Moreover, dividing by total assets gives a better overview of how large the surplus or deficit of cash or current assets is in comparison to all assets a firm possesses. A firm’s cash flow creation is portrayed with two ratios reflecting the quotient of operating cash flow to either operating revenue or total assets. The productivity (efficiency) of a firm’s assets is reflected by the quotient of operating revenue to total assets. Finally, the burden of interest paid on debt is proxied with two ratios, specifically the quotient of total financial revenues minus total financial expenses to either total assets or operating revenue. The latter two variables (with similar, but not necessarily identical formulas) have often been classified as solvency (solidity) ratios.
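The ratio definitions given in the text above can be sketched in code. Apart from DA, the dictionary keys, the input field names, and the example figures are hypothetical labels chosen for this illustration and are not the abbreviations or data used in the study. Returning `None` for a zero denominator is likewise an assumption.

```python
def safe_div(numerator, denominator):
    """Return the quotient, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def financial_ratios(r):
    """Compute the ten ratios described in the text from one annual report `r`."""
    ta = r["total_assets"]
    rev = r["operating_revenue"]
    net_financial = r["financial_revenues"] - r["financial_expenses"]
    return {
        "DA": safe_div(r["total_debt"], ta),                          # leverage
        "net_income_to_assets": safe_div(r["net_income"], ta),        # profitability
        "net_income_to_revenue": safe_div(r["net_income"], rev),      # profitability
        "cash_surplus_to_assets": safe_div(r["cash"] - r["current_liabilities"], ta),  # liquidity
        "working_capital_to_assets": safe_div(
            r["current_assets"] - r["current_liabilities"], ta),      # liquidity
        "cash_flow_to_revenue": safe_div(r["operating_cash_flow"], rev),  # cash flow
        "cash_flow_to_assets": safe_div(r["operating_cash_flow"], ta),    # cash flow
        "revenue_to_assets": safe_div(rev, ta),                       # productivity
        "net_financial_to_assets": safe_div(net_financial, ta),       # interest burden
        "net_financial_to_revenue": safe_div(net_financial, rev),     # interest burden
    }


# Hypothetical report figures, for illustration only.
report = {
    "total_assets": 1000.0, "total_debt": 600.0, "net_income": 50.0,
    "operating_revenue": 2000.0, "cash": 100.0, "current_assets": 400.0,
    "current_liabilities": 300.0, "operating_cash_flow": 80.0,
    "financial_revenues": 5.0, "financial_expenses": 25.0,
}
ratios = financial_ratios(report)
```

Because both liquidity ratios are scaled by total assets, a firm with zero current liabilities still yields a finite value, which is the motivation stated in the text for avoiding the quotient forms.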
The ten applied financial ratios reflect the most usual domains used in previous bankruptcy prediction studies, i.e., profitability, cash flow creation, leverage, liquidity, solidity, and productivity. We acknowledge that many more financial ratios have been applied in previous studies, but they are mostly very similar to (or mere modifications of) the ones used, and thus would likely provide only a marginal improvement (if any) in classification accuracies. In addition, the calculation of very specific financial ratios is constrained by the availability of financial information, as the financial reports of SMEs are often quite brief. Because of the latter, we can for instance use only the difference of financial revenues and financial expenses, rather than specific types of those revenues/expenses. For all applied financial ratios, the general rule is that higher values should reduce the bankruptcy probability on a univariate basis. The exception is DA, where the situation is the reverse.
We apply one classical statistical (i.e., logistic regression, noted as LR) and one machine learning (i.e., multilayer perceptron with two hidden layers, noted as MP) tool for composing the prediction models. If only one method were used, the results could be biased towards that specific method, and therefore not generalizable. These two methods are probably the most exploited classical and novel methods in bankruptcy prediction, thus their choice is justified by the developments in previous research. We acknowledge that there is nowadays a myriad of different methods (especially in the area of machine learning) available for failure prediction. Still, as the first and foremost aim of the paper is to show whether and in what context the information about tax arrears can be exploited in bankruptcy prediction, we find the usage of two methods a sufficient choice. In addition, based on the results in the empirical section, we thoroughly explain why the usage of additional methods would probably not have improved the obtained results.
In bankruptcy prediction, there are different streams concerning how to use observations in the analysis. The classical studies have used (roughly) equal samples of bankrupted and survived firms. This guarantees that the analysis reaches a clear conclusion about how accurately bankrupted and survived firms can be discriminated from each other. Still, such a selection of survived firms should be avoided, as there is a serious risk of creating a bias, i.e., the sample of survived firms does not represent the population it originates from. Moreover, when for instance a credit analyst is solving a practical classification problem, a firm under consideration originates from the whole population without any preselection. Thus, if available, the population of survived firms should be used irrespective of their characteristics. Therefore, our dataset (see Section 3.1) incorporates all bankrupted firms and all their survived counterparts for which the respective annual reports were available.
There are different options on how to use LR and MP. When the frequencies in the two groups (i.e., bankrupted and non-bankrupted firms) are very imbalanced (which is the usual case and also applies to this study), algorithms can end up classifying the majority group (i.e., non-bankrupt firms) as correctly as possible, while at the same time creating (huge) misclassification errors for the minority group (i.e., bankrupt firms). Therefore, we apply a procedure frequently used in bankruptcy prediction research (see e.g., Altman et al. 2017) by weighting the two groups of firms to be equal in the analysis. In case of LR, the weights for observations are calculated as 0.5 divided by the share of the respective group in the population used. In case of MP, we achieve the same by making synthetic observations. Such a method, i.e., a synthetic minority oversampling technique (SMOTE), has been frequently used in machine learning classification applications for bankruptcy prediction (Kim et al. 2015). SMOTE is achieved by repeating the observations of bankrupt firms until their population size equals that of non-bankrupt firms. We acknowledge that different weights could be applied in this study, but this depends specifically on how large the misclassification costs of (non-)bankrupt firms are (in practice). As in the majority of previous studies in the area, we do not incorporate misclassification costs in the analysis.
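The two balancing procedures described above can be sketched as follows. The group counts reuse the firm numbers from Section 3.1 purely as an illustration; the actual number of observations per group in the models may differ, and the function names are hypothetical. Note also that SMOTE in its canonical form generates new minority observations by interpolating between neighbours; the sketch follows the replication-based description given in the text and should not be read as the canonical algorithm.

```python
from itertools import cycle, islice


def balanced_weights(group_counts):
    """Weight per observation = 0.5 / share of its group in the population."""
    total = sum(group_counts.values())
    return {group: 0.5 / (count / total) for group, count in group_counts.items()}


def oversample_by_replication(minority, target_size):
    """Repeat minority observations in order until `target_size` is reached."""
    return list(islice(cycle(minority), target_size))


# Illustrative counts (assumption: one observation per firm).
counts = {"bankrupt": 512, "survived": 4003}
weights = balanced_weights(counts)

# After weighting, both groups carry the same total weight.
weighted_bankrupt = counts["bankrupt"] * weights["bankrupt"]
weighted_survived = counts["survived"] * weights["survived"]

# Replication of a toy minority group up to a majority size of 7.
resampled = oversample_by_replication(["a", "b", "c"], 7)
```

With these weights, each group contributes half of the total weighted sample, so the weighted sample size equals the original number of observations.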
In order to understand the prediction abilities of individual variables, we first provide the LR results obtained by using only single variables from Table 2 and Table 3. After that, we conduct three types of analyses: (a) using all financial ratios together for LR and MP, (b) using all tax arrears’ variables together for LR and MP, (c) using financial ratios and tax arrears’ variables together for LR and MP. While the comparison of (a) and (b) enables us to outline the individual prediction abilities of the specific variables through the two applied methods, (c) introduces a joint analysis. Results are provided for both test and hold-out samples.