A Logit Model for Bivariate Binary Responses

: This article provides a bivariate binary logit model and statistical inference procedures for parameter estimation and hypothesis testing. The bivariate binary logit (BBL) model is an extension of the binary logit model that has two correlated binary responses. The BBL model responses were formed using a 2 × 2 contingency table, which follows a multinomial distribution. The maximum likelihood and Berndt–Hall–Hall–Hausman (BHHH) methods were used to obtain the BBL model. Hypothesis testing of the BBL model contains the simultaneous test and the partial test. The test statistics of the simultaneous test and the partial test were determined using the maximum likelihood ratio test method. The likelihood ratio statistics of the simultaneous test and the partial test were approximately asymptotically chi-square distributed with 3 p degrees of freedom. The BBL model was applied to a real dataset, and the BBL model with the single covariate was better than the BBL model with multiple covariates. valida-tion, P.P.; formal analysis, P.P. and M.F.; investigation, P.P. and M.F.; resources, P.P. and M.F.; data curation, M.F.; writing—original draft preparation, M.F.; writing—review and editing, P.P. and M.F.; visualization, M.F.; supervision, P.P.; project administration, P.P. authors


Introduction
The logit model is a model that is often used for modeling categorical data in various research fields. Several studies have recently developed logit models for multiple correlated responses. McCullagh and Nelder [1] introduced a multivariate logistic transform used to construct logit models with two or more correlated responses. A multivariate logistic transform, including the numerical optimization methods, has been proposed in [2][3][4][5][6]. Lipsitz, Laird, and Harrington [7] examined the maximum likelihood (ML) method for binary data models, which connects the probability of success at each time point to a set of covariates. Liang, Zeger, and Qaqish [8] discussed the regression modeling of the marginal means of the responses using the generalized estimating equation approach, wherein there are dependencies between responses. An alternating logit model for jointly regressing the responses on covariates and modeling the dependencies among responses in the framework of pairwise odds ratios was proposed by [9]. Cessie and Houwelingen [10] modeled regression for correlated binary responses, in which the form of marginal response probabilities is the logit link function. Lang and Agresti [11] considered the model-fitting methods for analyzing the parameters simultaneously and parsimoniously. The ML estimator properties of the kappa coefficient in the bivariate binary logistic model using the small and moderate sized samples through Monte Carlo simulation were investigated by [12,13]. Molenberghs and Lesaffre [14] presented a simple generalized linear model formulation for marginal and association modeling of multivariate categorical data. The specifications of the association models in [15], in which the dependence ratios contrast with other models for a multivariate binary response that is specified by odds ratios or correlation coefficients, were employed by [16].
Studies have also proposed both the conditional and marginal models. A flexible conditional model for a multivariate binary response vector with covariates was examined

Bivariate Binary Logit Model
Bivariate binary logit (BBL) models are one of the families of multivariate logit models and are used to model the relationships between two correlated binary responses with one or more covariates. Let Y 1 and Y 2 be two bivariate binary responses and T be a vector of responses. The elements of y have the probabilities of γ 11 , γ 10 , γ 01 , and γ 00 , respectively, which are presented in Table 1. Table 1. Probabilities for the responses. 2 1 According to Fathurahman, Purhadi, Sutikno, and Ratnasari [27], the BBL model responses in Table 1 follow a multinomial distribution. Therefore, the joint probability function of the responses can be defined as follows: P(Y 11 = y 11 , Y 10 = y 10 , Y 01 = y 01 , Y 00 = y 00 ) = where 0 < γ qr < 1; q, r = 0, 1; y qr = 0, 1; y 00 = 1 − y 11 − y 10 − y 01 ; and γ 00 = 1 − γ 11 − γ 10 − γ 01 . q and r are the values of the responses. y qr is the value of Y qr , which represents the elements of the vector of responses. γ qr = P(Y 1 = q, Y 2 = r) is the joint probability of the responses. γ 1 = P(Y 1 = 1) and γ 2 = P(Y 2 = 1) are the marginal probabilities of Y 1 and Y 2 , respectively. Let x = 1 X 1 X 2 · · · X p T be the vector of covariates, which is (p + 1)dimensional. Then the BBL model is expressed as follows: where θ 1 , θ 2 , and θ 3 are vectors of parameters, γ 1 (x) and γ 2 (x) are marginal probabilities of responses, and ψ(x) is the odds ratio of responses depending on covariates, which shows that the responses are correlated. The vectors of parameters are symbolized by The marginal probabilities of responses are defined as follows: The joint probability of γ 11 (x) in Equation (2) is defined by where then the responses are independent [30]. Based on Table 1 and Equation (5), the probabilities of γ 10 (x), γ 01 (x), and γ 00 (x) in Equation (2) are as follows:

Estimation of the BBL Model
The estimation of the BBL model's parameters is one of the main results of this study. The BBL model in Equation (2) has 3(p + 1) parameters, where (p + 1) parameters show the dependencies among responses, and 2(p + 1) parameters describe the relationships between responses and covariates. The BBL model's parameters are denoted by θ and expressed as where θ T 1 , θ T 2 , and θ T 3 are given by Equation (3). To obtain the parameters estimator of the BBL model in Equation (7), the ML method was employed. Based on the ML method, the estimator ofθ is the value of θ, maximized by the likelihood function and the log-likelihood function. The ML estimator can be obtained by determining the first partial derivatives of the log-likelihood function, then equating them to zero.
Based on Equation (2), the likelihood equation contains the interdependence equations, which have a non-explicit form. Therefore, the ML estimator of the BBL model's parameters was not obtained analytically. The ML estimator was approximated by the likelihood equation's roots, which were obtained via an iterative process using the BHHH method. Determining the ML estimator of the BBL model's parameters using the BHHH method needs the gradient vector and the Hessian matrix. In the following, we present Lemmas 1 and 2 for the gradient vector and Hessian matrix, respectively.
. . , n be a random vector sample that is mutually independent and identical with a multinomial distribution denoted by , and Y 00i that contain the parameter θ. If the likelihood function of the BBL model is denoted by L(θ), where θ is as in Equation (7), then the gradient vector is where ∂ (θ) ) is a vector of the random sample that is independently and identically multinomial distributed; then the joint probability is defined by 00i (x i ).
As in Equation (9), the likelihood function is as follows: For simplicity, let γ y qr qr (x i ) = γ y qri qri for q, r = 0, 1; then the likelihood function in Equation (10) can be rewritten as To obtain the log-likelihood function of the BBL model, both sides of the likelihood function in Equation (11) were transformed by the natural logarithm, which gives (y 11i log γ 11i + y 10i log γ 10i + y 01i log γ 01i + y 00i log γ 00i ).
The log-likelihood function in Equation (12) is that the vector of θ has 3(p + 1)− dimensions. Following the definition in Greene [31], the gradient vector of the log-likelihood function in Equation (12) is where the vector of θ is given by Equation (7). Regarding the BBL model in Equation (2), we define the vector of τ, which is denoted by τ = τ 1 τ 2 τ 3 T , where τ 1 = τ 1 (x), τ 2 = τ 2 (x), and τ 3 = τ 3 (x). The vector of the joint probability of y is defined by γ = γ 11 γ 10 γ 01 γ 00 T . Furthermore, the derivative of τ with respect to γ is denoted by ∂τ/∂γ. To get a symmetrical matrix of ∂τ/∂γ, Thus, the matrix of ∂τ/∂γ is ∂τ ∂γ The inverse matrix of ∂τ/∂γ in Equation (14) is as follows: where The gradient vector of the log-likelihood function in Equation (12) can be written as In relation to Equations (13)- (15) and the chain rule of derivatives, the elements of the gradient vector in Equation (16) can be obtained as follows: where ∆ 1i and ∆ 2i , for i = 1, 2, . . . , n, given in Equation (15).

Lemma 2.
If the log-likelihood function of the BBL model is (θ) and the vector of θ is the BBL model's parameters, then the Hessian matrix of (θ) is where n is the sample size.
Proof of Lemma 2. The BBL model's parameters (θ) and the log-likelihood function ( (θ)) were given in Equations (7) and (12), respectively. Based on Lemma 1, the gradient vector of the log-likelihood function (θ) is g(θ). According to Greene [31], the Hessian matrix can be obtained by the Berndt-Hall-Hall-Hausman (BHHH) method. On the other hand, the Hessian matrix depends on the gradient vector [31], which is shown below: Meanwhile, the gradient vector and the Hessian matrix associated with the information matrix and can be expressed by The information matrix in Equation (22) is also referred to as the Fisher information matrix [32]. Based on Equations (21) and (22), the Hessian matrix is Regarding Lemmas 1 and 2, an iteration process can be carried out using the BHHH method. Following [33], the BHHH algorithm in this study is as follows: • Determine the tolerance value (ε) for the BHHH iteration process stopping. • Start the BHHH iteration process using the formula: • The iteration stops at the T-th iteration if the condition of convergence is satisfied, The estimator values of the parameters are obtained in the last iteration.
Akaike's information criterion (AIC) and the Bayesian information criterion (BIC) determine the best model in this study. The AIC and BIC values can be obtained by where θ is the log-likelihood value of the parameter's estimate, p is the number of covariates, and n is the sample size. The best model is the BBL model, which has the smallest values of AIC and BIC.

Hypothesis Testing of the BBL Model
Hypothesis testing of the BBL model contains the simultaneous test and the partial test. The simultaneous test and the partial test obtain the significance of the BBL model's parameters jointly and individually, respectively. The simultaneous test and the partial test in this study were done using the maximum likelihood ratio test (MLRT) method.
In the following we present a lemma used to determine the likelihood ratio (LR) statistic, the distribution of the LR statistic, and the rejection region of the simultaneous test.

Lemma 3.
If θ is the BBL model's parameter andθ is the ML estimator of θ, then: a) The LR statistic of the simultaneous test is G 2 1 = 2 L θ − L θ * , whereθ * is the ML estimator of the parameter space under the null hypothesis andθ is the ML estimator of the parameter space under the population. b) The distribution of the LR statistic follows an asymptotic chi-square distribution, which is The rejection region at the significance level of α is G 2 Proof of Lemma 3. (a) The first step to obtain the statistical test of the hypothesis in Equation (27) is to define the parameter space under the null hypothesis, denoted by Ω 01 = {θ 01 , θ 02 , θ 03 }. Furthermore, we determine the likelihood function formulated by Letθ * = θ 01θ02θ03 T be the ML estimator that maximizes the likelihood function of L(Ω 01 ). The ML estimator of θ * can be obtained by using Lemmas 1 and 2. Therefore, the maximum likelihood function of L(Ω 01 ) is where:γ * The parameters under the population are Ω 11 = θ gh , g = 0, 1, . . . , p; h = 1, 2, 3 , and the likelihood function is Suppose the ML estimator that maximizes the likelihood function of L( T . However, the ML estimator of θ was obtained using Lemmas 1 and 2. Therefore, the maximum likelihood function of L(Ω 11 ) is where:γ The likelihood ratio (LR) statistic of the hypothesis in Equation (27) is With regard to Equations (29) and (31), the LR statistic in Equation (32) can be written as However, the form of the LR statistic in Equation (33) is complicated, and we cannot do the calculation analytically. Therefore, to simplify the calculation, the LR statistic is transformed in a form equivalent to The LR statistic in Equation (34) is also transformed using the natural logarithm. Thus, the formula of the LR statistic is where L θ = log L Ω 11 and L θ * = log L Ω 01 . (b) Suppose the ML estimator under the population is partitioned bŷ The ML estimator and the known parameter in the null hypothesis are partitioned bŷ The hypothesis in Equation (27) can be rewritten as The LR statistic in Equation (35) can be specified by The function of L(θ * ) is approximated using Taylor's second-order expansion aroundθ; we have Analogously, in the previous step, the function of L(θ * ) was approximated using Taylor's second-order expansion aroundθ * , which is or it can be written as Following Equations (38) and (39), the LR statistic in Equation (37) can be rewritten as Suppose the forms of the partition of the Fisher information matrix and its inverse are as follows.
Following the concept of the conditional distribution, given as θ 11 = θ * 01 ,θ 11 , andθ 12 , we haveθ * Since θ * − θ * = 0,θ * 02 − θ 12 and θ * 02 − θ 12 =θ 12 − θ 12 The LR statistic in Equation (40) can be formulated as Therefore, the LR statistic is obtained as follows: The LR statistic in Equation (44) has an asymptotic chi-square distribution with v 1 degrees of freedom. v 1 is the difference of the parameter sets under the population and the null hypothesis.
(c) Determining the rejection region or the critical region of the null hypothesis in Equation (27) requires the MLRT method, where the null hypothesis is rejected when Λ < c, 0 < c ≤ 1, and where Λ is given in Equation (32) and c is a constant. Let α be the significance level for 0 < α < 1, and 0 < c α ≤ 1 be a constant; the c α value depends on the significance level of α and satisfies P θ Ω 01 (Λ < c α ) = α. Based on the definition of the significance level of α, we have where G 2 1 is the LR statistic, which has an asymptotic chi-square distribution with v 1 degrees of freedom. The value of the constant c in Equation (45) is the probability density function of the chi-square distribution with v 1 degrees of freedom. Therefore, the rejection region at the significance level of α is G 2 1 > χ 2 (α,v 1 ) . The last one is a partial test. This test aims to obtain covariates that have a significant effect on the responses individually. The procedures of the partial test in this study follow the simultaneous test. The hypothesis of the partial test is H 0 : θ 11 = θ 12 = θ 13 = 0, H 1 : at least one of θ 1h = 0, h = 1, 2, 3.

Application
The BBL model was applied to model the factors influencing the status of the human development index (HDI) and public health development index (PHDI) of regencies/municipalities in Kalimantan, Indonesia, in 2018. The HDI is an index measured from four components of the essential dimensions of human development: life expectancy, the average length of schooling, expected length of schooling, and adjusted per-capita income. Life expectancy represents an indicator of health, the average length of schooling and the expected length of schooling represent educational indicators, and adjusted per capita income represents an economic indicator [34]. The PHDI is an index that measures the health of the regencies/municipalities and provinces in the Republic of Indonesia [35].
The HDI status data and covariates' data were collected from the National Bureau of Statistics of the Republic of Indonesia, whereas the PHDI data were collected from the Republic of Indonesia's Ministry of Health. The variables in this study consist of two responses and five covariates. The responses are the HDI status and the PHDI status of regencies/municipalities, denoted by Y 1 and Y 2 . The covariates are the economic growth (X 1 ), the net enrollment rate of the junior high school (X 2 ), the percentage of people that have the minimum level of education in junior high school (X 3 ), the number of doctors per 1000 people (X 4 ), and the number of public health centers (X 5 ). The regencies and municipalities' HDI status has four categories: low HDI, medium HDI, high HDI, and very high HDI [34]. Regencies/municipalities in Kalimantan, Indonesia, in 2018, had HDI in the medium and high categories. Therefore, the HDI status (Y 1 ) has two categories: the medium HDI coded by 0 and the high HDI coded by 1.
Meanwhile, the Ministry of Health of the Republic of Indonesia classifies regencies/ municipalities' health status based on the PHDI into two categories. Regencies/municipalities with a low PHDI have health problems, and vice versa [36]. Therefore, the PHDI status (Y 2 ) has two categories: the regencies/municipalities with low PHDI values are coded by 0, and the regencies/municipalities with high PHDI values are coded by 1. This study's observation unit is the regency/municipality. Five provinces in Kalimantan, Indonesia were used (2018 data), including 47 regencies and nine municipalities. Therefore, the sample size is 56.
The descriptive statistics of the responses HDI status (Y 1 ) and PHDI status (Y 2 ), consisting of observed frequencies, are presented in Table 2.   Table 2 shows that 20 regencies/municipalities had high HDI and PHDI, and six regencies/municipalities had high HDI and low PHDI. We also see that three regencies/municipalities had medium HDI and high PHDI. Finally, 27 regencies/municipalities had medium HDI and low PHDI. The HDI status (Y 1 ) and PHDI status (Y 2 ) of regencies/municipalities are displayed in Figure 1. are coded by 0, and the regencies/municipalities with high PHDI values are coded by 1. This study's observation unit is the regency/municipality. Five provinces in Kalimantan, Indonesia were used (2018 data), including 47 regencies and nine municipalities. Therefore, the sample size is 56.
The descriptive statistics of the responses HDI status ( ) and PHDI status ( ), consisting of observed frequencies, are presented in Table 2.  Table 2 shows that 20 regencies/municipalities had high HDI and PHDI, and six regencies/municipalities had high HDI and low PHDI. We also see that three regencies/municipalities had medium HDI and high PHDI. Finally, 27 regencies/municipalities had medium HDI and low PHDI. The HDI status ( ) and PHDI status ( ) of regencies/municipalities are displayed in Figure 1. is the number of regencies/municipalities that had medium HDI and low PHDI; is the number of regencies/municipalities that had medium HDI and high PHDI; is the number of regencies/municipalities that had high HDI and low PHDI; is the number of regencies/municipalities that had high HDI and PHDI.
The descriptive statistics of the responses show that the majority of regencies/municipalities in Kalimantan, Indonesia, in 2018, had medium HDI and low PHDI. The descriptive statistics of the covariates are summarized in Table 3.  . nY 00 is the number of regencies/municipalities that had medium HDI and low PHDI; nY 01 is the number of regencies/municipalities that had medium HDI and high PHDI; nY 10 is the number of regencies/municipalities that had high HDI and low PHDI; nY 11 is the number of regencies/municipalities that had high HDI and PHDI.
The descriptive statistics of the responses show that the majority of regencies/municipalities in Kalimantan, Indonesia, in 2018, had medium HDI and low PHDI. The descriptive statistics of the covariates are summarized in Table 3. The HDI status (Y 1 ) and PHDI status (Y 2 ) are correlated. Based on the observed frequencies in Table 2, the odds ratio (OR) value of HDI status (Y 1 ) and PHDI status (Y 2 ) was 30 with a 95% confidence interval of 6.6826 ≤ OR ≤ 134.6783. This result indicates that the responses are highly positively correlated. Meanwhile, we also employed the dependence test of the responses HDI status (Y 1 ) and PHDI status (Y 2 ), provided in Table 4. Three statistical tests demonstrated a dependence test of HDI status (Y 1 ) and PHDI status (Y 2 ). The result in Table 4 shows that all of the statistical test values had greater than the chi-square table value (i.e., χ 2 (0.05,1) = 3.8415) and p-values less than the significance level value (i.e., α = 0.05). Therefore, the conclusion was to reject the null hypothesis (H 0 ), and the HDI status (Y 1 ) and PHDI status (Y 2 ) are dependencies. Based on the OR value and the dependence test, the HDI status (Y 1 ) and PHDI status (Y 2 ) are appropriate for the BBL model.
The variance inflation factor detected the multicollinearity of the covariates. The variance inflation factor values of all covariates in Table 5 are less than ten, which indicates that the covariates are independent of each other (i.e., no multicollinearity). Therefore, all covariates can be used in the BBL model. The estimation of the BBL model's parameters using the ML and BHHH methods was employed. Table 6 provides the bias values and the numbers of BHHH iterations of the parameter estimation process for the BBL model with the single and multiple covariates. 3.8014 × 10 −6 * 9 X 5 5.6987 × 10 −5 1000 X 2 X 3 X 4 8.5186 × 10 −8 * 146 * Convergent at tolerance limit value (i.e., ε = 1 × 10 −5 ).
The BBL model with the single covariate of economic growth (X 1 ) and public health centers (X 5 ) in Table 6 was not convergent. Therefore, both covariates, economic growth (X 1 ) and public health centers (X 5 ), were not used in the BBL model. Based on Table  6, the BBL model for modeling the factors that affect the HDI status and PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018 was obtained. Table 7 displays the ML estimates of the BBL model with multiple covariates (i.e., X 2 , X 3 , X 4 ), giving the parameter estimates, the LR statistic of the simultaneous test (G 2 1 ), the degrees of freedom (df), and the p-value. The LR statistic value in Table 7 is 99.739, and the p-value is 1.7685 × 10 −21 (p < 0.001). Meanwhile, the chi-square table's value with nine degrees of freedom and a 5% significance level was 16.919. The LR statistic value is greater than the chi-square table's value, and the p-value is less than the 5% significance level. Therefore, the null hypothesis was rejected, and we conclude that the net enrollment rate of the junior high school, the percentage of people that have the minimum level of education in junior high school, and the number of doctors per 1000 people were jointly significantly affecting the HDI status and the PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018. The BBL model for the HDI status and the PHDI status of regencies/municipalities can be written as follows: The partial test using the MLRT method was used to obtain the covariates that individually affect the HDI status and the PHDI status of regencies/municipalities. Table 8 describes the BBL model with the single covariate, which covers the parameter estimates, the LR statistic value (G 2 2 ), the degrees of freedom (df), and the p-value. The LR statistic's value of the estimated parameter for each covariate (the net enrollment rate of the junior high school, the percentage of people that have the minimum level of education in junior high school, and the number of doctors per 1000 people; Table 8) was greater than the chi-square table's value; the chi-square table's value with three degrees of freedom and 5% significance level was 7.8147. Meanwhile, the p-value of each covariate was less than the 5% significance level. Therefore, we concluded that the net enrollment rate of the junior high school, the percentage of people that have the minimum level of education in junior high school, and the number of doctors per 1000 people individually significantly influenced the HDI status and the PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018.
The BBL model with a single covariate (e.g., X 4 ) for the HDI status and the PHDI status of regencies/municipalities can be expressed as follows: The AIC and BIC methods in Equations (25) and (26) were used for the evaluation of the BBL model's performance. The AIC and BIC values of the BBL models are shown in Table 9. The BBL model with the single covariate in Table 9 has the smallest AIC and BIC values compared to the BBL model with the multiple covariates. Therefore, the BBL model with the single covariate is the best model for modeling the relationships between the responses (i.e., the HDI status and the PHDI status) and the covariates (i.e., the net enrollment rate of the junior high school, the percentage of people that have the minimum level of education in junior high school, and the number of doctors per 1000 people) of regencies/municipalities in Kalimantan, Indonesia, in 2018. Furthermore, the net enrollment rate of the junior high school, the percentage of people that have the minimum level of education in junior high school, and the number of doctors per 1000 people individually significantly affected the HDI status and the PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018.
However, some recommendations and future research from this work are possible. Firstly, the logit models in this research are limited to two responses. The BBL model, with more than two responses, should be considered for future research. Secondly, other numerical optimization methods that improve the performance of the BBL model should also be considered for future research.

Conclusions
The BBL model is the development of the binary logit model. It was constructed using the multinomial distribution. The ML method was applied to get the BBL model's parameter estimator. The ML estimator of the BBL model's parameters does not have a closed-form, and it needs an iterative numerical procedure. The BHHH iterative method was used. Hypothesis testing of the BBL model includes the simultaneous test and the partial test. The simultaneous and partial tests were done by using the MLRT method. The LR statistics of the simultaneous test and the partial test were asymptotically chi-square distributed. The BBL model was applied to model the factors affecting the HDI status and the PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018. The BBL model with the single covariate performed better than the BBL model with multiple covariates. The factors significantly affecting the HDI status and the PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018, were the net enrollment rate of junior high schools, the percentage of people who have the minimum level of education in junior high school, and the number of doctors per 1000 people.