3.1. A Review of Statistical Techniques in Financial Distress and Bankruptcy Prediction Models
A review of the literature shows that the statistical techniques most commonly used to develop accounting-based bankruptcy prediction models are multiple discriminant analysis, logistic regression, the probabilistic model, and the linear probability model (
Altman and Saunders 1998). Of these four approaches, multiple discriminant analysis is the most widely used technique, followed by logistic regression (
Altman and Saunders 1998;
Bandyopadhyay 2006). The Z-Score and Zeta models were developed using multiple discriminant analysis, whereas the O-Score is a logistic regression model.
However, discriminant analysis requires the independent variables to follow a multivariate normal distribution (
Angelini et al. 2008). According to
Lennox (
1999), cash flow and leverage have non-linear effects on a firm's probability of bankruptcy. Thus, for bankruptcy prediction models that incorporate cash flow and leverage variables, logistic regression is more suitable than discriminant analysis, as the logistic model does not require the variables to satisfy the normality assumption.
Begley et al. (
1996),
Lennox (
1999), and
Mihalovič (
2016) also showed that logistic and probabilistic models can predict bankruptcy more accurately than discriminant analysis.
Begley et al. (
1996),
Lennox (
1999), and
Pongsatat et al. (
2004) concluded that the O-Score, which was developed using logistic regression, generally has higher predictive accuracy than the Z-Score. The strength of logistic models is further supported by Adnan
Aziz and Dar (
2006), who concluded that logistic regression models are among the most reliable models in bankruptcy prediction because of their stability and low Type I and Type II error rates. Based on this evidence, logistic regression was chosen as the most appropriate statistical technique for this study.
3.2. Sample Description and Statistical Technique
The sample of financially distressed firms for this study comprises companies that filed for Chapter 11 bankruptcy protection in the United States between January 2009 and September 2017. The list of companies was extracted from the UCLA-LoPucki Bankruptcy Research Database, and the original list consists of 306 large public companies.
Companies in the financial sector were deemed unsuitable for this study, as their performance metrics differ from those of other industries. In addition, the F-Score does not take into account variables that are considered important in measuring the performance of financial institutions, such as bank capital, loan ratio (
Cleary and Hebb 2016), and net interest margin (
Goyal and Bhatia 2016). In times of distress, firms in the financial sector also tend to receive government intervention in the form of bailouts or capital support (
Cabrera et al. 2016;
Dieckmann and Plank 2011;
Kirshner 2015;
Berger et al. 2016). Thus, companies in financial industries, such as banks and insurance companies, were excluded from this study. Subsequently, the financial data for the remaining companies were extracted from Bloomberg, and companies with incomplete financial information were excluded. A company was excluded if:
Its financial data were unavailable for the period from one to three years prior to bankruptcy.
Its available financial data were insufficient to calculate all the ratios included in the F-Score.
As a result, the final sample of financially distressed firms comprises 81 firms.
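The screening steps described above can be sketched in code. This is only an illustrative sketch, not the procedure actually run for the study: the `Firm` fields are hypothetical names, and treating SIC codes 6000 to 6999 as the financial sector is an assumption made here for illustration (the text only states that financial firms such as banks and insurance companies were excluded).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Firm:
    name: str
    sic: int                     # SIC industry code
    years_of_data: int           # pre-bankruptcy years with data (hypothetical field)
    has_all_fscore_inputs: bool  # True if every F-Score ratio can be computed


def is_financial(sic: int) -> bool:
    # Assumption: SIC 6000-6999 is used as the financial-sector range.
    return 6000 <= sic <= 6999


def screen(firms: List[Firm]) -> List[Firm]:
    """Keep non-financial firms with three years of complete data."""
    return [
        f for f in firms
        if not is_financial(f.sic)
        and f.years_of_data >= 3
        and f.has_all_fscore_inputs
    ]
```

Each condition in `screen` corresponds to one exclusion rule, so the rules can be audited or relaxed one at a time.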
Table 1 shows the year-wise distribution of the sample of financially distressed firms. The financial data for the variables used in this study were mainly collected from Bloomberg and the EDGAR database on the website of the U.S. Securities and Exchange Commission. The industry-wise distribution of the sample is shown in
Table 2. In the past three years, many firms in the Crude Petroleum and Natural Gas industry were negatively affected by the decline in global oil prices. Therefore, there is a high concentration of firms from this industry in our sample of financially distressed firms. The high number of bankrupt companies in 2016 consisted mainly of companies in the Crude Petroleum and Natural Gas industry. Moreover, the worldwide decline in oil prices made this industry more prone to financial distress and thus an interesting case to examine (
Ward 1994).
The sample of non-distressed firms was formed using the matched-pair sampling technique. This technique has been used in many studies of financial distress, such as those by
Agrawal (
2015),
Beaver (
1966),
Altman (
1968), and
Begley et al. (
1996). Matching non-distressed firms were chosen based on the same industry, the nearest total assets, and the same financial year of reporting. Industries were matched using the Standard Industrial Classification (SIC) code in the United States. As such, the overall sample comprised 162 firms, made up of 81 distressed firms and 81 non-distressed firms.
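The matching rule described above (same SIC code, same reporting year, nearest total assets) can be illustrated with a short sketch. The dictionary keys are hypothetical, and matching without replacement, so that each non-distressed firm is used at most once, is an assumption; the text does not state how ties or reuse were handled.

```python
def match_pairs(distressed, candidates):
    """Pair each distressed firm with the closest-sized non-distressed firm.

    Each firm is a dict with the (hypothetical) keys
    'name', 'sic', 'year' and 'total_assets'.
    """
    available = list(candidates)  # copy so the caller's list is untouched
    pairs = []
    for d in distressed:
        # Candidates must share the industry code and the reporting year.
        pool = [
            c for c in available
            if c["sic"] == d["sic"] and c["year"] == d["year"]
        ]
        if not pool:
            continue  # no eligible match; the firm stays unmatched
        # Choose the candidate whose total assets are nearest.
        best = min(pool, key=lambda c: abs(c["total_assets"] - d["total_assets"]))
        available.remove(best)  # assumption: matching without replacement
        pairs.append((d["name"], best["name"]))
    return pairs
```

Because the loop is greedy, the order of the distressed firms can affect which pairs are formed when several firms compete for the same candidate.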
For the comparison to be meaningful, the difference in total assets between the two samples of companies must not be statistically significant (
Agrawal 2015). To confirm this, an independent samples t-test was performed. The results of the t-test and the descriptive statistics for the total assets of both the distressed and non-distressed samples are presented in
Table 3. Because the p-value of 0.944 is greater than 0.05, the difference in total assets between the distressed and non-distressed samples is not statistically significant.
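For readers who wish to see how such a test works, the following is a minimal sketch of an independent samples t-test using only the Python standard library. It assumes the pooled-variance (equal variances) form of the test, which the text does not specify, and obtains the two-sided p-value by numerically integrating the Student t density. The function names are illustrative, and no study data are reproduced here.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def pooled_t_test(x, y):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    # Pooled variance: assumes both groups share a common variance.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(sp2 * (1 / nx + 1 / ny))
    return t, df, two_sided_p(t, df)
```

A p-value above the chosen significance level, as reported in the text, indicates that the matched samples do not differ significantly in size.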
In addition, the overall sample was split into an estimation sample and a hold-out sample. Firms from 2015 to 2017 formed the estimation sample, whereas the remaining firms from 2009 to 2014 formed the hold-out sample. The estimation sample therefore comprised 57 distressed firms and 57 non-distressed firms, and the hold-out sample comprised 24 distressed firms and 24 non-distressed firms. As in a similar study by
Agrawal (
2015), this study also used logistic regression to predict a firm's probability of default based on its aggregate Piotroski F-Score and on the nine individual components that make up the aggregate F-Score. Under logistic regression, the probability of an event occurring is given by:
$$P(Y) = \frac{1}{1 + e^{-z}}$$
where
P(Y) = probability of event Y happening
z = linear combination of the independent variables, given by:
$$z = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$$
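As an illustration of how a fitted logistic model turns a firm's variables into a probability, the sketch below assumes the standard logistic form, in which z is an intercept plus a weighted sum of the independent variables and P(Y) = 1/(1 + e^(-z)). The function names and any coefficient values passed to them are hypothetical and are not estimates from this study.

```python
import math


def linear_predictor(intercept, coefficients, values):
    """z = intercept + sum of coefficient * value over the independent variables."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))


def logistic_probability(z):
    """P(Y) = 1 / (1 + e^(-z)), computed in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # avoids overflow for large negative z
    return ez / (1.0 + ez)
```

With a single predictor such as the aggregate F-Score, a negative coefficient would mean that a higher score lowers the predicted probability of distress.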
3.3. Variable Descriptions
Model 1 uses the aggregate Piotroski’s F-Score (
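Piotroski 2000), calculated as the sum of the following nine binary variables:
F-Score = F_ROA + F_∆ROA + F_CFO + F_Accrual + F_∆Margin + F_∆Turnover + F_∆Leverage + F_∆Liquidity + F_Eq_Offer
where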
F_ROA = 1 if Net Income/Total Assets is positive, 0 otherwise
F_∆ROA = 1 if current year’s ROA (Net Income/Total Assets) is higher than previous year’s ROA, 0 otherwise
F_CFO = 1 if Cash Flows From Operations/Total Assets is positive, 0 otherwise
F_Accrual = 1 if CFO (Cash Flows from Operations/Total Assets) > ROA (Net Income Before Extraordinary Items/Total Assets), 0 otherwise
F_∆Margin = 1 if current year’s Gross Margin Ratio is higher than previous year’s Gross Margin Ratio, 0 otherwise
F_∆Turnover = 1 if current year’s Asset Turnover Ratio is higher than previous year’s Asset Turnover Ratio, 0 otherwise
F_∆Leverage = 1 if current year’s Leverage Ratio is lower than previous year’s Leverage Ratio, 0 otherwise
F_∆Liquidity = 1 if current year’s Current Ratio is higher than previous year’s Current Ratio, 0 otherwise
F_Eq_Offer = 1 if the number of shares outstanding in the current year is not greater than the number of shares outstanding in the prior year, 0 otherwise.
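The nine binary signals defined above can be computed mechanically from two consecutive years of data. The sketch below is one possible implementation; the `YearData` field names are hypothetical, and the four ratios (gross margin, asset turnover, leverage, current ratio) are assumed to be precomputed.

```python
from dataclasses import dataclass


@dataclass
class YearData:
    net_income: float
    net_income_before_extra: float  # net income before extraordinary items
    cfo: float                      # cash flows from operations
    total_assets: float
    gross_margin: float
    asset_turnover: float
    leverage: float
    current_ratio: float
    shares_outstanding: float


def f_score_signals(cur: YearData, prev: YearData) -> dict:
    """Return the nine binary signals for the current year."""
    roa = cur.net_income / cur.total_assets
    prev_roa = prev.net_income / prev.total_assets
    cfo_ratio = cur.cfo / cur.total_assets
    roa_before_extra = cur.net_income_before_extra / cur.total_assets
    return {
        "F_ROA": int(roa > 0),
        "F_dROA": int(roa > prev_roa),
        "F_CFO": int(cfo_ratio > 0),
        "F_Accrual": int(cfo_ratio > roa_before_extra),
        "F_dMargin": int(cur.gross_margin > prev.gross_margin),
        "F_dTurnover": int(cur.asset_turnover > prev.asset_turnover),
        "F_dLeverage": int(cur.leverage < prev.leverage),
        "F_dLiquidity": int(cur.current_ratio > prev.current_ratio),
        "F_Eq_Offer": int(cur.shares_outstanding <= prev.shares_outstanding),
    }


def f_score(cur: YearData, prev: YearData) -> int:
    """Aggregate F-Score: the sum of the nine signals (0 to 9)."""
    return sum(f_score_signals(cur, prev).values())
```

Keeping the individual signals in a dictionary makes it straightforward to inspect which components drive a firm's aggregate score.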
Although the F-Score comprises nine components, only seven were used in Model 2, as the ROA and Accruals variables were removed because of high multicollinearity. In addition, the linearity of the remaining seven components with respect to the logit of the outcome was assessed via the Box-Tidwell procedure (
Box and Tidwell 1962). A Bonferroni correction was applied using all 15 terms in the model, resulting in statistical significance being accepted when
p < 0.00333 (
Tabachnik and Fidell 2007). Based on this assessment, all components except CFO were found to be linearly related to the logit of the outcome. Therefore, the CFO variable was recoded into a nominal variable. Thus, the components of Piotroski’s F-Score used in Model 2 are ∆ROA, CFO, ∆Margin, ∆Turnover, ∆Leverage, ∆Liquidity and Eq_Offer.
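As a worked check of the threshold quoted above, and assuming the conventional overall significance level of 0.05 (consistent with the level used elsewhere in this section), the Bonferroni-adjusted level for m = 15 terms is:

```latex
\alpha_{\text{adj}} = \frac{\alpha}{m} = \frac{0.05}{15} \approx 0.00333
```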
Model 2 uses the individual components of Piotroski’s F-Score, that is, ∆ROA, CFO, ∆Margin, ∆Turnover, ∆Leverage, ∆Liquidity, and Eq_Offer. These are measured as follows:
∆ROA = Current year’s ROA less previous year’s ROA
CFO = 1 if (Cash Flows from Operations/Total Assets) < 0, 0 otherwise.
∆Margin = Current year’s Gross Margin Ratio less previous year’s Gross Margin Ratio
∆Turnover = Current year’s Asset Turnover Ratio less prior year’s Asset Turnover Ratio
∆Leverage = Current year’s Leverage Ratio less prior year’s Leverage Ratio
∆Liquidity = Current year’s Current Ratio less prior year’s Current Ratio
Eq_Offer = (Number of shares outstanding in the current year less number of shares outstanding in the prior year)/Number of shares outstanding in the prior year
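The Model 2 variables defined above can be derived from two consecutive years of data as in the following sketch. The dictionary keys are hypothetical names, the ratios are assumed to be precomputed, and Eq_Offer is read as the proportional change in shares outstanding, that is, the difference divided by the prior-year figure.

```python
def model2_variables(cur, prev):
    """Build the seven Model 2 regressors from current and prior-year data.

    `cur` and `prev` are dicts with the (hypothetical) keys 'net_income',
    'cfo', 'total_assets', 'gross_margin', 'asset_turnover', 'leverage',
    'current_ratio' and 'shares'.
    """
    roa = cur["net_income"] / cur["total_assets"]
    prev_roa = prev["net_income"] / prev["total_assets"]
    return {
        "dROA": roa - prev_roa,
        # CFO is recoded as a nominal variable: 1 if operating cash flow is negative.
        "CFO": 1 if cur["cfo"] / cur["total_assets"] < 0 else 0,
        "dMargin": cur["gross_margin"] - prev["gross_margin"],
        "dTurnover": cur["asset_turnover"] - prev["asset_turnover"],
        "dLeverage": cur["leverage"] - prev["leverage"],
        "dLiquidity": cur["current_ratio"] - prev["current_ratio"],
        # Proportional change in shares outstanding.
        "Eq_Offer": (cur["shares"] - prev["shares"]) / prev["shares"],
    }
```

Unlike the binary signals in Model 1, six of these seven regressors are continuous, which is why their linearity with respect to the logit had to be assessed.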