The theoretical underpinnings of the logistic and the Bayesian logistic model are discussed below.
2.1. The Binomial Distribution
Let us consider a response variable that assumes only two values, zero or one. The random variable $Y_i$ takes the value one with probability $\pi_i$ and zero with probability $1-\pi_i$. The distribution of $Y_i$ is called the Bernoulli distribution with parameter $\pi_i$, written as
$$\Pr(Y_i = y_i) = \pi_i^{y_i}(1-\pi_i)^{1-y_i}, \qquad y_i \in \{0, 1\}.$$
The expected value (mean) and variance of $Y_i$ are given by
$$E(Y_i) = \pi_i, \qquad \operatorname{Var}(Y_i) = \pi_i(1-\pi_i).$$
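As a quick numerical check of these formulas, the following sketch simulates Bernoulli draws and compares the empirical mean and variance with $\pi$ and $\pi(1-\pi)$; the value $\pi = 0.3$ is an arbitrary illustration, not taken from the data discussed here.

```python
# Empirical check of the Bernoulli mean and variance formulas:
# E(Y) = pi and Var(Y) = pi * (1 - pi).  pi = 0.3 is an arbitrary choice.
import random

random.seed(42)
pi = 0.3
draws = [1 if random.random() < pi else 0 for _ in range(100_000)]

mean = sum(draws) / len(draws)                          # empirical E(Y)
var = sum((y - mean) ** 2 for y in draws) / len(draws)  # empirical Var(Y)
```

With 100,000 draws, both empirical moments agree with the theoretical values to roughly two decimal places.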
Here, it may be observed that both the mean and the variance depend on the probability $\pi_i$. Let $n_i$ denote the number of observations in group $i$, and let $y_i$ denote the number of companies that have failed in group $i$.
Here, $y_i$ is the realization of a random variable $Y_i$ that takes the values $0, 1, \dots, n_i$. If the observations in each group are independent with exactly the same probability $\pi_i$ of being distressed, then the distribution of $Y_i$ is binomial with parameters $n_i$ and $\pi_i$, written $Y_i \sim B(n_i, \pi_i)$. The probability distribution function of $Y_i$ is given by
$$\Pr(Y_i = y_i) = \binom{n_i}{y_i}\pi_i^{y_i}(1-\pi_i)^{n_i - y_i}, \qquad y_i = 0, 1, \dots, n_i.$$
The mean and variance of the random variable $Y_i$ are given by
$$E(Y_i) = n_i\pi_i, \qquad \operatorname{Var}(Y_i) = n_i\pi_i(1-\pi_i).$$
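The binomial pmf and its moments can be verified numerically. In the sketch below, the parameter values $n = 10$ and $\pi = 0.2$ are illustrative assumptions only; the pmf is evaluated over its full support and the mean and variance are computed directly from it.

```python
# Binomial pmf Pr(Y = y) = C(n, y) * pi^y * (1 - pi)^(n - y),
# with mean n*pi and variance n*pi*(1 - pi).  n = 10, pi = 0.2 are illustrative.
from math import comb

def binom_pmf(y, n, pi):
    return comb(n, y) * pi**y * (1 - pi) ** (n - y)

n, pi = 10, 0.2
pmf = [binom_pmf(y, n, pi) for y in range(n + 1)]

# Moments computed directly from the pmf.
mean = sum(y * p for y, p in zip(range(n + 1), pmf))
var = sum((y - mean) ** 2 * p for y, p in zip(range(n + 1), pmf))
```

The probabilities sum to one, and the computed moments match $n\pi = 2$ and $n\pi(1-\pi) = 1.6$.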
2.2. Logistic Regression Model
Suppose that we have $k$ independent observations $y_1, \dots, y_k$, which are realizations of random variables $Y_1, \dots, Y_k$. Further, we assume that $Y_i$ follows the binomial distribution given by
$$Y_i \sim B(n_i, \pi_i).$$
Let the logit of the underlying probability $\pi_i$ be a linear function of the predictors:
$$\operatorname{logit}(\pi_i) = \log\frac{\pi_i}{1-\pi_i} = x_i'\beta.$$
Here, $x_i$ is a vector of covariates and $\beta$ is a vector of regression coefficients. The odds of the $i$th firm are given by
$$\frac{\pi_i}{1-\pi_i} = \exp(x_i'\beta).$$
This formula shows the multiplicative form of the model on the odds scale. For example, if we change the $j$th predictor by one unit while keeping the other variables constant, the odds are multiplied by $\exp(\beta_j)$. Solving for the probability $\pi_i$ in the logit model, the above equation becomes
$$\pi_i = \frac{\exp(x_i'\beta)}{1+\exp(x_i'\beta)}.$$
The probability $\pi_i$ varies from 0 to 1, and the right-hand side of the above expression is a non-linear function of the predictors. On differentiating with respect to $x_{ij}$, we obtain
$$\frac{\partial \pi_i}{\partial x_{ij}} = \beta_j\,\pi_i(1-\pi_i).$$
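Both the multiplicative effect on the odds and the derivative formula can be checked numerically. In the sketch below, the coefficient vector and covariate values are hypothetical illustrations; the derivative is verified against a finite-difference approximation.

```python
# Numerical check of the logit relationships: odds = exp(x'beta),
# pi = exp(x'beta) / (1 + exp(x'beta)), and d pi / d x_j = beta_j * pi * (1 - pi).
# The coefficients and covariate values below are hypothetical illustrations.
import math

beta = [0.5, -1.2, 0.8]   # hypothetical regression coefficients
x = [1.0, 0.4, 1.5]       # covariate vector (first entry is the intercept term)

eta = sum(b * v for b, v in zip(beta, x))   # linear predictor x'beta
odds = math.exp(eta)
pi = odds / (1 + odds)

# Changing x_2 by one unit multiplies the odds by exp(beta_2).
x_shift = x[:]
x_shift[2] += 1.0
eta_shift = sum(b * v for b, v in zip(beta, x_shift))
odds_ratio = math.exp(eta_shift) / odds

# Finite-difference check of the derivative d pi / d x_2.
h = 1e-6
x_h = x[:]
x_h[2] += h
eta_h = sum(b * v for b, v in zip(beta, x_h))
pi_h = math.exp(eta_h) / (1 + math.exp(eta_h))
deriv_numeric = (pi_h - pi) / h
deriv_formula = beta[2] * pi * (1 - pi)
```

The numeric derivative agrees with $\beta_j\,\pi_i(1-\pi_i)$, and the odds ratio equals $\exp(\beta_j)$ exactly, independent of the other covariate values.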
Thus, the effect of the $j$th predictor on the probability depends on both the coefficient $\beta_j$ and the value of the probability itself. However, due to the pooled cross-sectional nature of the logistic model, observations might be correlated over the years.
2.3. Bayesian Logistic Model
Let us consider the following logistic model,
$$y = X\beta + \varepsilon.$$
Here, $y$ is the vector of observations, each taking the value 0 or 1; $X$ is the matrix of covariates; and $\beta$ and $\varepsilon$ are the vectors of regression coefficients and disturbances, respectively. In the logistic model, the dichotomous responses 0 and 1 are represented through probabilities: for any given observation, $\Pr(y_i = 1) = \pi_i$ is the probability of failure (success of the event) and $\Pr(y_i = 0) = 1 - \pi_i$ its complement, and the regression coefficients represent the change in these probabilities when a covariate changes while the others are held fixed.
For the above non-linear model, maximum likelihood estimation can be used to obtain reasonable estimates of the parameter $\beta$. For large samples, the resulting estimator possesses the properties of a good estimator, namely consistency, asymptotic unbiasedness, and efficiency. The likelihood function is given by
$$L(\beta) = \prod_{i=1}^{k} \pi_i^{y_i}(1-\pi_i)^{1-y_i}, \qquad \pi_i = \frac{\exp(x_i'\beta)}{1+\exp(x_i'\beta)}.$$
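Maximizing this likelihood has no closed-form solution, but the standard Newton–Raphson iteration (equivalent to iteratively reweighted least squares) converges quickly. The sketch below applies it to simulated data; the sample size, covariate design, and "true" coefficient vector are assumptions of the illustration, not estimates from any real data set.

```python
# Maximum-likelihood estimation of beta in the logistic model by
# Newton-Raphson.  The data are simulated, so the "true" beta below
# is an assumption of this sketch, not a result from the paper's data.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one covariate
beta_true = np.array([-0.5, 1.0])
pi = 1 / (1 + np.exp(-X @ beta_true))
y = rng.binomial(1, pi)

beta = np.zeros(2)
for _ in range(25):                        # Newton-Raphson iterations
    p = 1 / (1 + np.exp(-X @ beta))
    grad = X.T @ (y - p)                   # score vector
    W = p * (1 - p)
    hess = X.T @ (X * W[:, None])          # observed information matrix
    beta = beta + np.linalg.solve(hess, grad)
```

With 2000 observations the estimate lands close to the simulating coefficients, illustrating the consistency property mentioned above.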
The estimation of the parameters in the logistic model uses only the information available in the observed data set. The Bayesian approach, in contrast, also updates our prior knowledge about the unknown parameters, encoded in a prior distribution. Using the well-known Bayes theorem, the posterior distribution of the parameters $\beta$ and $\sigma^2$ is proportional to the product of the likelihood function and the joint prior distribution of $\beta$ and $\sigma^2$:
$$p(\beta, \sigma^2 \mid y) \propto L(y \mid \beta, \sigma^2)\, p(\beta, \sigma^2).$$
The selection of an appropriate prior plays a critical role in the estimation process. Researchers and experimenters set the prior based on their experience of, and beliefs about, the subject. Typically, priors are assumed to follow a parametric distribution, such as the Gaussian, Gamma, or Beta, and are then known as informative priors. Alternatively, one can choose non-informative priors, which contribute little information beyond that contained in the data.
Zellner and Ando (2010) showed that Bayesian methods generate consistent and efficient estimates. Researchers must accept some variation in results arising from the choice of a reasonable prior within the space of distribution functions. However, to incorporate more prior flexibility into the model, one can assume that the coefficient vector $\beta$ follows a multivariate normal distribution and that the variance $\sigma^2$ follows an inverse-Gamma distribution. If the prior distribution and the likelihood belong to the same family, then the posterior distribution also belongs to that family, which is then called a conjugate family of distributions. Recently, good prediction accuracy for financially distressed companies using decision trees and survival analysis models has been evidenced by Gepp and Kumar (2015).
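As a minimal sketch of posterior computation for the Bayesian logistic model, the random-walk Metropolis sampler below targets the posterior of $\beta$ under a multivariate normal prior $N(0, 10^2 I)$. The simulated data, prior scale, proposal step size, and burn-in length are all illustrative assumptions, not choices made in the study itself.

```python
# Random-walk Metropolis sampler for the posterior of beta in a logistic
# model with a multivariate normal N(0, 10^2 I) prior.  All tuning
# constants and the simulated data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, -1.0])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))

def log_post(beta):
    eta = X @ beta
    loglik = np.sum(y * eta - np.log1p(np.exp(eta)))  # Bernoulli log-likelihood
    logprior = -0.5 * np.sum(beta**2) / 100.0         # N(0, 10^2 I) prior
    return loglik + logprior

beta = np.zeros(2)
lp = log_post(beta)
samples = []
for t in range(6000):
    prop = beta + 0.15 * rng.normal(size=2)           # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:           # Metropolis accept/reject
        beta, lp = prop, lp_prop
    if t >= 1000:                                     # discard burn-in draws
        samples.append(beta)

post_mean = np.mean(samples, axis=0)
```

With a weak prior and a moderate sample size, the posterior mean lies close to the maximum likelihood estimate, consistent with the large-sample behaviour noted above.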