# A Logit Model for Bivariate Binary Responses

## Abstract

## 1. Introduction

## 2. Bivariate Binary Logit Model

## 3. Estimation of the BBL Model

**Lemma 1.**

**Proof of Lemma 1.**

**Lemma 2.**

**Proof of Lemma 2.**

- Determine the initial value ${\widehat{\mathit{\theta}}}^{\left(0\right)}={\left[\begin{array}{ccc}{\widehat{\mathit{\theta}}}_{1}^{T\left(0\right)}& {\widehat{\mathit{\theta}}}_{2}^{T\left(0\right)}& {\widehat{\mathit{\theta}}}_{3}^{T\left(0\right)}\end{array}\right]}^{T}$.
- Determine the tolerance value $\epsilon$ at which the BHHH iteration process stops.
- Run the BHHH iteration process using the formula: $${\widehat{\mathit{\theta}}}^{\left(t+1\right)}={\widehat{\mathit{\theta}}}^{\left(t\right)}-{\mathit{H}}^{-1}\left({\widehat{\mathit{\theta}}}^{\left(t\right)}\right)\mathit{g}\left({\widehat{\mathit{\theta}}}^{\left(t\right)}\right),\quad t=0,1,2,\dots ,T.$$
- The iteration stops at the $T$-th iteration if the convergence condition $\left\Vert {\widehat{\mathit{\theta}}}^{\left(T+1\right)}-{\widehat{\mathit{\theta}}}^{\left(T\right)}\right\Vert \le \epsilon $ is satisfied. The parameter estimates are the values obtained at the last iteration.
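The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: $\mathit{H}$ is taken to be the usual BHHH approximation, the negative sum of outer products of the per-observation score vectors, and an ordinary logistic regression on simulated data stands in for the BBL log-likelihood, which is not reproduced here. The names `bhhh` and `logit_scores` are hypothetical.

```python
import numpy as np

def bhhh(score_i, theta0, eps=1e-8, max_iter=1000):
    """Iterate theta <- theta - H^{-1} g, with H = -sum_i g_i g_i^T (assumed)."""
    theta = np.asarray(theta0, dtype=float)
    for t in range(max_iter):
        S = score_i(theta)                 # (n, p): one score vector per observation
        g = S.sum(axis=0)                  # gradient of the log-likelihood
        H = -S.T @ S                       # outer-product approximation of the Hessian
        theta_new = theta - np.linalg.solve(H, g)
        if np.linalg.norm(theta_new - theta) <= eps:   # convergence condition
            return theta_new, t + 1
        theta = theta_new
    return theta, max_iter

def logit_scores(theta, X, y):
    """Per-observation scores of a plain logistic regression (stand-in model)."""
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    return (y - p)[:, None] * X

# Simulated data, for illustration only.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
p_true = 1.0 / (1.0 + np.exp(-X @ np.array([-0.5, 1.0])))
y = (rng.random(500) < p_true).astype(float)

theta_hat, n_iter = bhhh(lambda th: logit_scores(th, X, y), np.zeros(2))
print(theta_hat, n_iter)
```

The sketch takes a full step at every iteration. A practical implementation would often add a step-size search, since the outer-product approximation can be poor far from the maximum.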

## 4. Hypothesis Testing of the BBL Model

**Lemma 3.**

- a) The LR statistic of the simultaneous test is ${G}_{1}^{2}=2\left(L\left(\widehat{\mathit{\theta}}\right)-L\left({\widehat{\mathit{\theta}}}^{*}\right)\right)$, where ${\widehat{\mathit{\theta}}}^{*}$ is the ML estimator over the parameter space under the null hypothesis and $\widehat{\mathit{\theta}}$ is the ML estimator over the full (unrestricted) parameter space.
- b) The LR statistic is asymptotically chi-square distributed with ${v}_{1}$ degrees of freedom: ${G}_{1}^{2}\stackrel{d}{\to}{\chi}_{{v}_{1}}^{2}$ as $n\to \infty $.
- c) The rejection region at the significance level $\alpha $ is ${G}_{1}^{2}>{\chi}_{\left(\alpha ,{v}_{1}\right)}^{2}$.

**Proof of Lemma 3.**
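The decision rule in Lemma 3 can be illustrated with a small standard-library sketch. It computes the LR statistic from two log-likelihood values, the chi-square upper-tail probability for integer degrees of freedom, and the critical value by bisection. The function names are hypothetical, and the closed-form tail formula is a standard identity for the incomplete gamma function rather than something taken from the paper.

```python
import math

def lr_statistic(loglik_full, loglik_null):
    """G^2 = 2 (L(theta_hat) - L(theta_hat*))."""
    return 2.0 * (loglik_full - loglik_null)

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable, integer df >= 1."""
    h = x / 2.0
    if df % 2:                                  # odd df: start from Q(1/2, h)
        a, q = 0.5, math.erfc(math.sqrt(h))
    else:                                       # even df: start from Q(1, h)
        a, q = 1.0, math.exp(-h)
    while a < df / 2.0:                         # recurrence Q(a+1) = Q(a) + h^a e^-h / Gamma(a+1)
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chi2_crit(alpha, df, tol=1e-10):
    """Critical value c with P(X > c) = alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

crit = chi2_crit(0.05, 9)        # critical value for nine degrees of freedom
reject = 99.7390 > crit          # simultaneous-test statistic from the application
print(round(crit, 3), reject)
```

With nine degrees of freedom and $\alpha = 0.05$, the bisection gives a critical value of about 16.919, the value used in the application section.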

## 5. Application

Table 7 reports an LR statistic of ${G}_{1}^{2}=99.7390$ for the simultaneous test, with a p-value of 1.7685 × 10^{−21} (p < 0.001). The chi-square critical value with nine degrees of freedom at the 5% significance level is 16.919. Because the LR statistic exceeds the critical value and the p-value is below the 5% significance level, the null hypothesis was rejected. We conclude that the net enrollment rate of junior high school, the percentage of people with at least a junior high school education, and the number of doctors per 1000 people jointly had a significant effect on the HDI status and the PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018. The BBL model for the HDI status and the PHDI status of regencies/municipalities can be written as follows:

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Nomenclature

| Abbreviation | Definition |
|---|---|
| AIC | Akaike’s information criterion |
| BHHH | Berndt–Hall–Hall–Hausman |
| BIC | Bayesian information criterion |
| HDI | Human development index |
| LR | Likelihood ratio |
| ML | Maximum likelihood |
| MLRT | Maximum likelihood ratio test |
| MULT | Multinomial distribution |
| PHDI | Public health development index |
| VIF | Variance inflation factor |


**Figure 1.** The proportions of the responses for each combination of HDI status (${Y}_{1}$) and PHDI status (${Y}_{2}$). $n{Y}_{00}$ is the number of regencies/municipalities that had medium HDI and low PHDI; $n{Y}_{01}$ is the number that had medium HDI and high PHDI; $n{Y}_{10}$ is the number that had high HDI and low PHDI; $n{Y}_{11}$ is the number that had high HDI and high PHDI.

| ${Y}_{1}$ | ${Y}_{2}=1$ | ${Y}_{2}=0$ | Total |
|---|---|---|---|
| ${Y}_{1}=1$ | ${\gamma}_{11}$ | ${\gamma}_{10}$ | ${\gamma}_{1}$ |
| ${Y}_{1}=0$ | ${\gamma}_{01}$ | ${\gamma}_{00}$ | $1-{\gamma}_{1}$ |
| Total | ${\gamma}_{2}$ | $1-{\gamma}_{2}$ | $1$ |

| ${Y}_{1}$ | ${Y}_{2}=1$ | ${Y}_{2}=0$ | Total |
|---|---|---|---|
| ${Y}_{1}=1$ | 20 | 3 | 23 |
| ${Y}_{1}=0$ | 6 | 27 | 33 |
| Total | 26 | 30 | 56 |
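As a rough illustration of how these counts relate to the joint probabilities ${\gamma}_{11}$, ${\gamma}_{10}$, ${\gamma}_{01}$, ${\gamma}_{00}$ and their margins, the sketch below computes the sample proportions and the sample odds ratio. The variable names are illustrative, and the odds ratio is shown only as a familiar association measure. It is an assumption that it corresponds to the association parameter used in the model.

```python
# Observed counts for (Y1 = j, Y2 = k), taken from the contingency table above.
counts = {(1, 1): 20, (1, 0): 3, (0, 1): 6, (0, 0): 27}
n = sum(counts.values())

# Sample joint proportions, estimating gamma_11, gamma_10, gamma_01, gamma_00.
gamma = {cell: c / n for cell, c in counts.items()}

# Marginal proportions P(Y1 = 1) and P(Y2 = 1) are row and column sums.
gamma_1 = gamma[(1, 1)] + gamma[(1, 0)]
gamma_2 = gamma[(1, 1)] + gamma[(0, 1)]

# Sample odds ratio between Y1 and Y2.
odds_ratio = (counts[(1, 1)] * counts[(0, 0)]) / (counts[(1, 0)] * counts[(0, 1)])

print(n, round(gamma_1, 4), round(gamma_2, 4), odds_ratio)
```

An odds ratio well above 1 indicates that high HDI and high PHDI tend to occur together in these data.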

| Covariates | Minimum | Maximum | Mean | Standard Deviation |
|---|---|---|---|---|
| ${X}_{1}$ | −4.10 | 7.99 | 5.08 | 1.83 |
| ${X}_{2}$ | 68.37 | 98.82 | 81.19 | 8.14 |
| ${X}_{3}$ | 35.58 | 81.37 | 54.25 | 11.05 |
| ${X}_{4}$ | 1.00 | 46.40 | 10.23 | 9.47 |
| ${X}_{5}$ | 5.00 | 33.00 | 17.57 | 6.93 |

| Statistical Tests | ${\chi}^{2}$ | df | p-Value |
|---|---|---|---|
| Pearson | 25.7750 | 1 | 3.8370 × 10^{−7} |
| Pearson with Yates’ continuity correction | 23.0840 | 1 | 1.5510 × 10^{−6} |
| Likelihood ratio | 28.2420 | 1 | 1.0708 × 10^{−7} |
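The three statistics above can be recomputed from the 2 × 2 table of HDI and PHDI status (cell counts 20, 3, 6, 27). The following sketch uses the textbook formulas for the Pearson, Yates-corrected, and likelihood ratio statistics. It is an assumption that the paper computed them in exactly this way, and the names are illustrative.

```python
import math

counts = [[20, 3], [6, 27]]                    # rows: Y1 = 1, 0; columns: Y2 = 1, 0
row = [sum(r) for r in counts]
col = [sum(c) for c in zip(*counts)]
n = sum(row)

# Expected counts under independence of Y1 and Y2.
expected = [[row[i] * col[j] / n for j in range(2)] for i in range(2)]
cells = [(counts[i][j], expected[i][j]) for i in range(2) for j in range(2)]

pearson = sum((o - e) ** 2 / e for o, e in cells)
yates = sum((abs(o - e) - 0.5) ** 2 / e for o, e in cells)
lr = 2.0 * sum(o * math.log(o / e) for o, e in cells)

def p_value_df1(x):
    """Chi-square upper-tail probability for one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

for name, stat in [("Pearson", pearson), ("Yates", yates), ("LR", lr)]:
    print(name, round(stat, 4), p_value_df1(stat))
```

All three statistics are far above the 5% critical value for one degree of freedom, so independence of the two responses is rejected, which motivates a bivariate model.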

| Covariates | VIF |
|---|---|
| ${X}_{1}$ | 1.0404 |
| ${X}_{2}$ | 1.4948 |
| ${X}_{3}$ | 2.3002 |
| ${X}_{4}$ | 2.5617 |
| ${X}_{5}$ | 1.5165 |

**Table 6.** The bias values and the numbers of BHHH iterations for the BBL model with single and multiple covariates.

| Covariates | Bias | Iterations |
|---|---|---|
| ${X}_{1}$ | 6.2266 × 10^{−5} | 1000 |
| ${X}_{2}$ | 1.9227 × 10^{−8} * | 23 |
| ${X}_{3}$ | 2.0283 × 10^{−8} * | 25 |
| ${X}_{4}$ | 3.8014 × 10^{−6} * | 9 |
| ${X}_{5}$ | 5.6987 × 10^{−5} | 1000 |
| ${X}_{2}{X}_{3}{X}_{4}$ | 8.5186 × 10^{−8} * | 146 |

\* Bias below the convergence threshold (10^{−5}).

**Table 7.** Parameter estimates and the LR statistic value of the simultaneous test for the BBL model with multiple covariates.

| Parameter | Estimate | ${G}_{1}^{2}$ | df | p-Value |
|---|---|---|---|---|
| ${\theta}_{01}$ | −0.0034 | 99.7390 | 9 | 1.7685 × 10^{−21} |
| ${\theta}_{11}$ | −0.1916 | | | |
| ${\theta}_{21}$ | 0.0449 | | | |
| ${\theta}_{31}$ | 0.0013 | | | |
| ${\theta}_{02}$ | −0.0016 | | | |
| ${\theta}_{12}$ | −0.1403 | | | |
| ${\theta}_{22}$ | 0.0395 | | | |
| ${\theta}_{32}$ | 0.0011 | | | |
| ${\theta}_{03}$ | −0.0023 | | | |
| ${\theta}_{13}$ | −0.1440 | | | |
| ${\theta}_{23}$ | −0.1071 | | | |
| ${\theta}_{33}$ | 0.0002 | | | |

**Table 8.** Parameter estimates and the LR statistic value of the partial test for the BBL model with a single covariate.

| Covariate | Parameter | Estimate | ${G}_{2}^{2}$ | df | p-Value |
|---|---|---|---|---|---|
| ${X}_{2}$ | ${\theta}_{01}$ | −0.0010 | 90.2309 | 3 | 1.9542 × 10^{−19} |
| | ${\theta}_{11}$ | −0.0446 | | | |
| | ${\theta}_{02}$ | −0.0026 | | | |
| | ${\theta}_{12}$ | −0.2251 | | | |
| | ${\theta}_{03}$ | 0.0008 | | | |
| | ${\theta}_{13}$ | 0.0931 | | | |
| ${X}_{3}$ | ${\theta}_{01}$ | −0.0104 | 111.4570 | 3 | 5.3304 × 10^{−24} |
| | ${\theta}_{11}$ | −0.1630 | | | |
| | ${\theta}_{02}$ | −0.0079 | | | |
| | ${\theta}_{12}$ | −0.2145 | | | |
| | ${\theta}_{03}$ | 0.0004 | | | |
| | ${\theta}_{13}$ | 0.0483 | | | |
| ${X}_{4}$ | ${\theta}_{01}$ | −15.3744 | 174.2092 | 3 | 1.5700 × 10^{−37} |
| | ${\theta}_{11}$ | 12.5080 | | | |
| | ${\theta}_{02}$ | −17.0178 | | | |
| | ${\theta}_{12}$ | 9.3722 | | | |
| | ${\theta}_{03}$ | 4.2464 | | | |
| | ${\theta}_{13}$ | 4.1612 | | | |

| Covariates | AIC | BIC |
|---|---|---|
| ${X}_{2}$ | 512.9107 | 525.0628 |
| ${X}_{3}$ | 546.2378 | 558.3899 |
| ${X}_{4}$ | 546.1944 | 558.3465 |
| ${X}_{2}{X}_{3}{X}_{4}$ | 552.6286 | 576.9328 |
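The AIC and BIC values can be cross-checked against each other. Assuming the standard definitions $\mathrm{AIC}=-2L+2k$ and $\mathrm{BIC}=-2L+k\ln n$, the difference $\mathrm{BIC}-\mathrm{AIC}=k\left(\ln n-2\right)$ depends only on the number of parameters $k$ and the sample size $n$. The sketch below takes $n=56$ from the contingency table, $k=6$ for the single-covariate models (six parameters each in Table 8), and $k=12$ for the three-covariate model (twelve parameters in Table 7). These pairings are inferred from the tables rather than stated explicitly.

```python
import math

n = 56  # number of regencies/municipalities

# model -> (number of parameters k, AIC, BIC), values copied from the table above
models = {
    "X2": (6, 512.9107, 525.0628),
    "X3": (6, 546.2378, 558.3899),
    "X4": (6, 546.1944, 558.3465),
    "X2X3X4": (12, 552.6286, 576.9328),
}

def bic_minus_aic(k, n):
    """Difference implied by AIC = -2L + 2k and BIC = -2L + k ln(n)."""
    return k * (math.log(n) - 2.0)

for name, (k, aic, bic) in models.items():
    print(name, round(bic - aic, 4), round(bic_minus_aic(k, n), 4))

# Model with the smallest AIC among those listed.
best_by_aic = min(models, key=lambda m: models[m][1])
print(best_by_aic)
```

For every row the tabulated difference agrees with $k\left(\ln 56-2\right)$ to four decimals, which supports the assumed definitions. Both criteria are smallest for the model with ${X}_{2}$ alone.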

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Purhadi, P.; Fathurahman, M.
A Logit Model for Bivariate Binary Responses. *Symmetry* **2021**, *13*, 326.
https://doi.org/10.3390/sym13020326
