Abstract
Under the Bayesian framework, this study proposes a Tweedie compound Poisson partial linear mixed model for longitudinal semicontinuous data in the presence of nonignorable missing covariates and responses, in which the nonparametric function is approximated by Bayesian P-splines. A logistic regression model is used to specify the missing data mechanisms for both the responses and the covariates. A hybrid algorithm combining the Gibbs sampler and the Metropolis–Hastings algorithm is employed to produce joint Bayesian estimates of the unknown parameters, the random effects, and the nonparametric function. Several simulation studies and a real example based on the Osteoarthritis Initiative data illustrate the proposed methodology.
1. Introduction
Semicontinuous data, characterized by nonnegative continuous values with a point mass at zero, appear frequently in many fields, such as medicine, health, economics, and ecology. Models for longitudinal semicontinuous data have received considerable attention and follow two main approaches. The first is the two-part mixed model, in which a mixture of a Bernoulli distribution and a distribution with positive support is used to model the zero and positive components separately (Olsen and Schafer [1]; Berk and Lachenbruch [2]; Tooze et al. [3]; Su et al. [4,5]; Liu et al. [6]; Zhou et al. [7]). However, Hasan et al. [8] and Yan and Ma [9] pointed out that this artificial separation breaks down the serial patterns in the analysis of time series and longitudinal data. The second is the compound Poisson mixed model, which models longitudinal, repeated measurement, or clustered data in an integral way. For example, Zhang [10] investigated several statistical inference methods for Tweedie compound Poisson linear mixed models from the frequentist and Bayesian perspectives. Swallow et al. [11] developed a Bayesian hierarchical Tweedie regression model by incorporating serial temporal and spatial correlation into the Tweedie distribution in the analysis of longitudinal semicontinuous ecological data. Ye et al. [12] conducted a sensitivity analysis for priors in Tweedie compound Poisson random effect models under a Bayesian framework. In particular, Yan and Ma [9] incorporated serially dependent distribution-free random effects into the compound Poisson regression model for longitudinal semicontinuous data. However, all of the abovementioned compound Poisson mixed models are limited in that they either do not consider nonlinear smooth effects of covariates, such as time and age, or do not deal with missing responses and covariates.
Handling missing data is an active research field in data analysis, and many methods have been proposed for statistical inference in regression models with nonignorable missing responses or covariates. For example, Ibrahim et al. [13,14] used the EM algorithm to estimate unknown parameters in generalized linear models with nonignorable missing covariates and in generalized linear mixed models with nonignorable missing responses, respectively. Building on these frequentist approaches, Bayesian analogues have been developed for various regression models: see Huang et al. [15] for generalized linear models with nonignorably missing covariates, Lee and Tang [16] for nonlinear structural equation models with nonignorable missing data, Tang and Zhao [17] for nonlinear reproductive dispersion mixed models for longitudinal data with nonignorable missing covariates, Tang et al. [18] for a nonlinear dynamic factor analysis model with nonparametric prior and possible nonignorable missingness, Zhou et al. [7] for two-part hidden Markov models for semicontinuous longitudinal data with nonignorable missing covariates, Wang and Tang [19] for Bayesian quantile regression with mixed discrete and nonignorable missing covariates, and Wang et al. [20] for Bayesian latent factor on image regression with nonignorable missing data. Motivated by these works, we propose a fully Bayesian method to simultaneously estimate the unknown parameters, random effects, and nonparametric function in a Tweedie compound Poisson partial linear mixed model with nonignorable missing covariates and responses, where the nonparametric function is approximated by Bayesian P-splines and the nonignorable missing data mechanism is specified by a logistic regression model.
For brevity and readability, the main mathematical symbols used in the rest of the paper and their descriptions are summarized in Table 1.
Table 1.
Symbols and description.
The paper is organized as follows. Section 2 describes the data. Section 3 describes the Tweedie compound Poisson partial linear mixed model in the presence of nonignorable missing covariates and responses, presents the Bayesian P-spline used to model the nonparametric function, specifies the missing response and covariate mechanisms through logistic regression models, and models the joint probability function of the missing covariates through a sequence of one-dimensional conditional distributions. Section 4 presents the prior and posterior distributions of the unknown parameters and latent variables. Section 5 gives two simulation studies and a real example to illustrate the proposed methodology. Section 6 gives some conclusions. Appendix A and Appendix B give the conditional distributions for Gibbs sampling and the Metropolis–Hastings algorithm, respectively.
2. Data Description
In this section, we describe the Osteoarthritis Initiative (OAI) database, which is available at https://www.oai.ucsf.edu (accessed on 4 April 2017). The OAI cohort study investigated the causes of knee osteoarthritis in 4796 patients aged 45 and older, and collected information such as age, sex, and body mass index (BMI) at baseline, 12 months, 24 months, 36 months, and 48 months. Each patient is therefore measured at most five times, and fewer when data are missing. In addition, the OAI study adopted the Western Ontario and McMaster Universities Arthritis Index (WOMAC) disability scores to assess pain intensity in patients with hip and/or knee osteoarthritis. Higher WOMAC scores indicate worse pain, stiffness, and functional limitations. A sample of two patients (ID 9019406 and ID 9025191) from the OAI study is presented in Table 2.
Table 2.
Sample data from the OAI study (M denotes the missing data).
The missing rates for the longitudinal WOMAC scores outcome at baseline, 12 months, 24 months, 36 months, 48 months are , , , , and , respectively. Moreover, the missing rates for covariate BMI at five different time points are , , , , and , respectively. It can be seen from Figure 1 that the observed WOMAC numeric score at 12 months, 24 months, 36 months, and 48 months are right-skewed with a large numerical proportion of zeros, where the bold line on the left of each histogram denotes the frequency for zero. Specifically, more than of the observations of all time points are zeros; thus we consider the WOMAC numeric score as a longitudinal semicontinuous response with missing data in this article.
Figure 1.
Histogram for the observed WOMAC numeric score in the OAI dataset.
3. Statistical Models
3.1. Tweedie Compound Poisson Distribution
As in Ma and Jørgensen [21], the probability density function of the Tweedie compound Poisson distribution has the following form,
where p is the power parameter satisfying , and are the mean parameter and dispersion parameter, respectively, and the expression for is not analytically tractable when . If a nonnegative random variable Y is distributed as a Tweedie compound Poisson distribution, then we simply denote in the rest of paper. Moreover, we have and . Furthermore, the random number Y of the Tweedie compound Poisson distribution is readily generated from the following stochastic representation
where U is distributed as a Poisson distribution with mean , is the independent and identically distributed gamma distribution with mean and variance , and U and are assumed to be independent. After some calculations, the relationship between the two sets of parameters in Equations (1) and (2) are derived as
It follows from Equation (2) that the joint probability distribution of is given by
Thus, the marginal distribution of has the abovementioned form given in Equation (1).
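The stochastic representation above can be checked numerically with a minimal sketch. It assumes the standard compound Poisson-gamma relations for a Tweedie distribution with power parameter between 1 and 2; the names `lam`, `shape`, and `scale` are illustrative labels and need not match the paper's notation, and the parameter values are hypothetical.

```python
import math
import random


def tweedie_to_compound_poisson(mu, phi, p):
    """Map (mu, phi, p), with 1 < p < 2, to the Poisson mean and gamma shape/scale."""
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))      # Poisson mean
    shape = (2.0 - p) / (p - 1.0)                  # gamma shape
    scale = phi * (p - 1.0) * mu ** (p - 1.0)      # gamma scale
    return lam, shape, scale


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small to moderate lam."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_tweedie(mu, phi, p, rng):
    """Draw Y as a sum of U i.i.d. gamma variables, with Y = 0 when U = 0."""
    lam, shape, scale = tweedie_to_compound_poisson(mu, phi, p)
    u = sample_poisson(lam, rng)
    return sum(rng.gammavariate(shape, scale) for _ in range(u))


rng = random.Random(2023)
mu, phi, p = 2.0, 1.5, 1.6                         # hypothetical values
draws = [sample_tweedie(mu, phi, p, rng) for _ in range(50_000)]
lam, _, _ = tweedie_to_compound_poisson(mu, phi, p)
mean = sum(draws) / len(draws)
var = sum((y - mean) ** 2 for y in draws) / (len(draws) - 1)
zero_share = sum(y == 0.0 for y in draws) / len(draws)
print(mean, mu)                       # sample mean vs. mu
print(var, phi * mu ** p)             # sample variance vs. phi * mu^p
print(zero_share, math.exp(-lam))     # share of exact zeros vs. exp(-lam)
```

The share of exact zeros is governed by the Poisson count alone, which is how a single distribution accommodates both the point mass at zero and the positive continuous part.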
3.2. The Model
For modeling, we first introduce some notations. Let be the longitudinal semicontinuous outcome with missing data of the ith patient with osteoarthritis measured at time (). In the OAI study, is the number of patients with denoting the number of repeated observations per patient. Given random effects , are conditionally independent and each is assumed to be the Tweedie compound Poisson distribution, that is
where is the conditional expectation of the response , is the dispersion parameter to be estimated and . Inspired by GLMM method, the conditional expectation is modeled by
where is a vector of unknown regression parameter of interest, is a vector of covariates in the presence of missing data, is distributed as , is a vector of covariates relating to the random effects , and denotes an unknown nonparametric function satisfying the twice-differentiable property in term of time effects . In this article, the model defined in Equations (1) and (2) is referred to as a Tweedie compound Poisson partial linear mixed model.
Inspired by Lang and Brezger [22], we use the Bayesian P-spline method, which approximates the unknown nonparametric function by a linear combination of B-spline basis functions; that is,
where is the hth B-spline basis function, H is the number of B-spline basis function, and is the B-spline coefficients to be estimated. Under the Bayesian framework, is treated as a random variable, and defined by the following first-order random walk; that is, , where for and the diffuse prior is proportional to constant. The variance parameter is viewed as a global smoothing parameter. Although it is easy to estimate the global smoothing parameter, this global smoothing parameter is difficult to characterize in terms of the highly oscillating features for the underlying nonparametric functions . To overcome this issue, we introduce the additional hyperparameters as local smoothing parameters, which can improve the estimation of a function with significantly different curvatures at different points . Thus, is assumed to be the normal distribution with heterogeneous variance; that is, for . Furthermore, let and . The prior distribution for is derived in the matrix form
where the penalty matrix Q is given by
Here, the prior distribution of smooth parameter is distributed as an inverse gamma distribution; that is, .
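To make the role of the penalty matrix concrete, the following sketch builds the matrix implied by a first-order random walk on the spline coefficients. It assumes the convention of Lang and Brezger [22], in which each increment has variance equal to the global smoothing parameter divided by a local weight, so that the penalty is Q = D'WD with D the first-difference matrix and W the diagonal matrix of local weights. The exact form of Q in the paper is not reproduced in this text, so the function names and weights below are hypothetical.

```python
def first_difference_matrix(h):
    """(h-1) x h matrix D whose k-th row gives b[k+1] - b[k] when applied to b."""
    return [[(-1.0 if j == k else 1.0 if j == k + 1 else 0.0) for j in range(h)]
            for k in range(h - 1)]


def penalty_matrix(weights):
    """Q = D^T diag(weights) D for H = len(weights) + 1 spline coefficients."""
    h = len(weights) + 1
    d = first_difference_matrix(h)
    return [[sum(weights[k] * d[k][r] * d[k][c] for k in range(h - 1))
             for c in range(h)] for r in range(h)]


def quadratic_form(q, b):
    """b^T Q b, the weighted sum of squared first differences of b."""
    n = len(b)
    return sum(b[r] * q[r][c] * b[c] for r in range(n) for c in range(n))


# Equal weights reproduce the usual global first-order random-walk penalty.
Q = penalty_matrix([1.0, 1.0, 1.0, 1.0])
for row in Q:
    print(row)

# Unequal (hypothetical) local weights penalize some increments more than others.
Q_local = penalty_matrix([0.5, 2.0, 1.0, 4.0])
beta = [0.0, 1.0, 1.5, 1.0, 3.0]
print(quadratic_form(Q_local, beta))
```

Every row of Q sums to zero, so a constant coefficient vector is not penalized; this is the rank deficiency that makes the random-walk prior improper, consistent with the diffuse prior on the first coefficient.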
3.3. Missing Data Mechanism Assumptions
In this article, let be a vector of response (), and be a vector of covariates in the presence of missing data, respectively, whereas are completely observed. In what follows, we assume that the missing data mechanism for response and covariates are nonignorable. Let and , where and are vectors of the observed and missing components of responses in satisfying , respectively; and are vectors of the observed and missing covariate in satisfying , respectively. Let be an indicator variable which indicates whether is missing; that is,
Inspired by Ibrahim et al. [14], it is common to specify a Bernoulli distribution for the following nonignorable missing mechanism. Thus, given and unknown parameter , the conditional probability function of is distributed as
where is specified by a logistic regression model,
in which .
Similarly, let be an indicator variable, which indicates whether is missing, and each is defined as follows:
For conditional probability density , we consider the following nonignorable data mechanisms,
in which is defined by a logistic regression model
where .
In this article, we also consider a second type of nonignorable missing mechanism for the response and covariates. Specifically, in this type, the nonignorable missing mechanism for the response is specified by a logistic regression model,
where are all missing covariables. For missing covariate, is given by a logistic regression model,
where .
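The effect of a nonignorable logistic mechanism can be illustrated with a toy simulation. The sketch below assumes that the indicator equals 1 when a value is missing and that the logit is linear in one covariate and in the response itself; the coefficient names `phi0`, `phi_x`, `phi_y`, their values, and the toy data-generating distributions are hypothetical and are not taken from the paper.

```python
import math
import random


def missing_probability(y, x, phi0, phi_x, phi_y):
    """logit Pr(r = 1 | x, y) = phi0 + phi_x * x + phi_y * y, where r = 1 means missing."""
    eta = phi0 + phi_x * x + phi_y * y
    return 1.0 / (1.0 + math.exp(-eta))


rng = random.Random(7)
n = 20_000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [rng.expovariate(1.0) for _ in range(n)]     # toy nonnegative response

# phi_y != 0 makes the mechanism nonignorable: missingness depends on y itself.
r = [1 if rng.random() < missing_probability(yi, xi, -1.5, 0.3, 0.8) else 0
     for xi, yi in zip(x, y)]

observed = [yi for yi, ri in zip(y, r) if ri == 0]
missing_rate = sum(r) / n
full_mean = sum(y) / n
observed_mean = sum(observed) / len(observed)
print(missing_rate, full_mean, observed_mean)
```

Because larger responses are more likely to be missing in this toy setting, the mean of the observed responses falls below the mean of the full data, which is why an analysis of the observed cases alone would be biased and the mechanism has to be modelled jointly with the response.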
In what follows, we assume that the covariate is continuous, with missingness in its first m dimensions and complete observations in the remaining dimensions. According to Ibrahim et al. [13], the joint probability function of the missing covariates is simplified by a sequence of one-dimensional conditional distributions as follows,
where and , , . Here, covariates do not need to be modelled because they are always observed. In addition, continuous missing covariates are generally assumed to follow the normal distribution. For example,
where mean parameter is given by
Here,
4. Bayesian Inference
To investigate the Bayesian inference on parameters of interest, we first introduce the following notations. Let and be the sets of observed and missing values of response variables, respectively. Similarly, and are the sets of observed and missing values corresponding to covariates, respectively. Let denote the latent variable. Let and denote the vector of random effects and the vector of covariates relating to random effects. Let be the vector of time effects relating to the nonparametric part. Denote the vector of indicator variables and parameters relating to missing data mechanism by and , where and . On the whole, let , , and be all the parameters to be estimated in our considered model. Given the observed data , the joint posterior distribution of is given by
where and .
Clearly, it is difficult to generate the random sample from the posterior distribution because Equation (14) has high-dimensional integration. Thus, inspired by the data augmentation method (Tanner and Wong [23]), we adopt the following posterior distribution, , to solve the high-dimensional integration issue. Meanwhile, it is easy to generate the random sample from via the Gibbs sampler (Geman and Geman [24]). That is, random samples are iteratively generated by means of the following conditional distributions , ,, , ,,, , , and . To derive the abovementioned conditional distributions, we adopt the following joint logarithmic likelihood function of
Moreover, the prior distributions of , p, , , , , , , and are given by
where is the pregiven hyperparameter, N is the normal distribution, is the k-dimensional inverse Wishart distribution, is the gamma distribution, IG is the inverse gamma distribution, and . As for the choices of hyperparameters with regard to the Bayesian P-spline method, Lang and Brezger [22] pointed out that and a small value for for example, or , leading to an almost diffuse prior for . Moreover, the hyperparameters and are simultaneously taken to be , which can characterize the highly oscillating features for some nonparametric functions. As for the power parameter p, Ye et al. [12] adopted the following priors to conduct the sensitivity analysis: , , and . As a result, Ye et al. [12] chose the prior as the optimal for p in the Tweedie compound Poisson distribution based on the sensitivity analysis. The choices of hyperparameters for other prior distributions are discussed in Section 5. The conditional distributions, Gibbs sampling and Metropolis–Hastings algorithm are shown in the Appendix A and Appendix B.
Bayesian Estimates
Let be random samples from the joint posterior distribution . The Bayesian estimates of parameters , , p, , ,, and can be obtained by
Similarly, consistent estimates of the posterior covariance matrices of parameters , , p, , ,, and can be obtained from the sample covariance matrices of their random samples. For example, the posterior covariance matrix can be consistently estimated by
In addition, the corresponding standard deviations can be estimated by the square roots of the diagonal elements of the sample covariance matrix of the random sample sequence.
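The estimators described in this subsection amount to averaging the retained draws and taking their sample covariance. A minimal sketch follows; the draws are hypothetical numbers rather than output from the paper's sampler, and the divisor T − 1 in the covariance is an assumption, since the paper's formula is not reproduced in this text.

```python
def posterior_mean(draws):
    """Average the retained MCMC draws component-wise."""
    t = len(draws)
    dim = len(draws[0])
    return [sum(d[k] for d in draws) / t for k in range(dim)]


def posterior_covariance(draws):
    """Sample covariance matrix of the draws, using divisor T - 1."""
    t = len(draws)
    mean = posterior_mean(draws)
    dim = len(mean)
    return [[sum((d[a] - mean[a]) * (d[b] - mean[b]) for d in draws) / (t - 1)
             for b in range(dim)] for a in range(dim)]


# Hypothetical draws of a two-dimensional parameter.
draws = [[1.0, 2.0], [2.0, 4.0], [3.0, 9.0]]
est = posterior_mean(draws)
cov = posterior_covariance(draws)
sd = [cov[k][k] ** 0.5 for k in range(len(est))]   # square roots of the diagonal
print(est)
print(cov)
print(sd)
```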
5. Numerical Examples
In this section, two simulation studies and a real example relating to the OAI data are conducted to investigate the performance of our proposed Bayesian methodologies.
5.1. Simulation Studies
In the first simulation study, we assume that the longitudinal semicontinuous datasets with and are simulated from the Tweedie compound Poisson distribution and the conditional mean is given by
where covariate is generated from the standard normal distribution, and and are independently simulated from the normal distribution and , respectively. In addition, the random effects are independent and identically distributed as , and the true curve of nonparametric function is given by with . The true values of the abovementioned parameters are taken to be , , , , , , , . In what follows, it is assumed that covariate is completely observed, while response and covariates , are subject to missingness. Thus, the nonignorable missing mechanism for these three variables are modelled by the following logistic regression model,
where the truth values of , , and are given by , , . The missing data for , and were generated by (18), and the average proportion of missing data for , and on the basis of 50 replications are , , and , respectively.
To investigate the effect of different prior information on the Bayesian estimate for unknown parameters, three types of prior information are considered as follows.
Type I: The hyperparameters , , , , and are taken to be the true values of their corresponding parameters; , , , , and ; A, B, , , , and are taken to be , , , , , , where denotes the identity matrix. This scenario represents good prior information.
Type II: The hyperparameters , , , , and are taken to be twice the true values of their corresponding parameters; A, B, , , , and are taken to be , , , , while other hyperparameters are taken to be the same as those given in Type I. This scenario represents inaccurate prior information.
Type III: The hyperparameters , , , , , and are taken to be zero vectors; A, B, , , , and are taken to be , , , , while other hyperparameters are taken to be the same as those given in Type I. This scenario represents noninformative prior information.
For each of the 50 generated datasets, the hybrid algorithm combining the block Gibbs sampler and the Metropolis–Hastings algorithm is used to produce joint Bayesian estimates of the unknown parameters, random effects, and nonparametric function. To allow the hybrid algorithm to converge in each replication, we discarded the first 5000 iterations and collected the next 5000 observations to calculate the Bayesian estimates, which are reported in Table 3. There, “Bias” is the difference between the mean of the estimates over the 50 replications and the true value, “SD” is the standard deviation of the estimates over the 50 replications, and “RMS” is the root mean square error between the estimates over the 50 replications and the true value. Table 3 shows that (i) the Bayesian estimates of the unknown parameters are reasonably accurate under all three priors because all Bias values are less than , and (ii) the values of SD and RMS are less than and differ little from each other regardless of the prior. Thus, the Bayesian estimates are not sensitive to the three priors considered. In addition, Figure 2 indicates that the proposed Bayesian P-spline approximation of the nonparametric function is feasible because the estimated curves match the true curve well in the simulation studies considered.
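The three summary measures defined above can be written out as a short sketch. The function name and the example estimates are hypothetical; the divisor n − 1 for SD and n for RMS are assumptions about conventions the text does not state.

```python
import math


def summarize_replications(estimates, true_value):
    """Return (bias, sd, rms) of one parameter's estimates across replications."""
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - true_value
    sd = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (n - 1))
    rms = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return bias, sd, rms


# Hypothetical estimates from four replications of a parameter whose true value is 1.0.
bias, sd, rms = summarize_replications([0.9, 1.1, 1.2, 1.0], 1.0)
print(bias, sd, rms)
```

When the bias is small, SD and RMS are close to each other, which is the pattern the comparison of these two columns is meant to reveal.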
Table 3.
Bayesian estimates of parameters in the first simulation study.
Figure 2.
The estimated function and true function of for three priors: type I (left panel), type II (middle panel), and type III (right panel) in the first simulation.
In the second simulation study, the simulated setup is the same as the first simulation study except for the missing mechanism. Here, the other nonignorable missing mechanism model for , and is given by
where the true values of unknown parameters are taken to be , , . The average proportion of missing data for , and is , , and , respectively. Similar to the first simulation study, we also considered three different prior pieces of information for their corresponding hyperparameters. These findings in Table 4 and Figure 3 show that (i) all Bias values corresponding to unknown parameters are less than except that the Bias value for is under the Type II prior and the Bias values for and are and under the Type III prior, respectively, and (ii) the estimated curves of nonparametric function also matched well with the true curve regardless of three different priors. Clearly, Bayesian estimates under the first two priors are better than those obtained from the Type III prior. All in all, our proposed Bayesian approach is feasible in our considered missing mechanism.
Table 4.
Bayesian estimation of parameters in the second simulation study.
Figure 3.
The estimated function and true function of for three priors: type I (left panel), type II (middle panel), type III (right panel) in the second simulation.
5.2. Real Example
In this section, we illustrate the proposed semiparametric Bayesian approach by analyzing the longitudinal semicontinuous data from the OAI described in Section 2. The OAI longitudinal data have been analyzed by various approaches, such as those of Chen and Wehrly [25,26]. However, these authors considered only the observed data, which reduced the 4796 patients to 1499 patients, and assumed that the log transformation of the WOMAC score plus 1 is approximately normally distributed. In this study, our scientific interest is to link covariates such as age, sex, and BMI with the WOMAC score outcome, which has a point mass at zero, while accounting for nonignorable missing responses and covariates. In addition, we treat age as an individual-level covariate modeled nonparametrically, with the other covariates modeled parametrically. Let the outcome represent the WOMAC numeric score for the right knee of the ith () patient recorded at the jth time point ( corresponding to 0, 12, 24, 36, and 48 months). As discussed in Section 2, we regard the WOMAC numeric score as a longitudinal semicontinuous outcome in this real example.
Here, given random effects , follows the Tweedie compound Poisson distribution; that is . The conditional mean is simultaneously linked to covariates, random effects, and nonparametric function as follows,
where the covariates (1 for male or 2 for female) and are completely observed, while the outcome and covariate are missing and their corresponding missing rates are and , respectively. Furthermore, we consider the following missing data mechanisms for covariate and outcome ,
where , and . In addition, we assume that the missing covariate follows the normal distribution and that the random effect is distributed as the normal distribution . Bayesian estimates of the unknown parameters and their standard errors, as well as the estimated nonparametric function, are displayed in Table 5 and Figure 4. Table 5 indicates that the covariates BMI and Sex have significant positive effects on the WOMAC score at the significance level of . The WOMAC score increases as BMI increases: a patient with a higher BMI tends to suffer greater intensity of knee osteoarthritis. The significant positive effect of Sex indicates that the average WOMAC score is higher for females than for males; that is, women are more vulnerable to greater pain intensity than men. Chen and Wehrly [25,26] modeled a linear age effect on the WOMAC score parametrically, but found it to be insignificant. Figure 4 suggests that the Bayesian estimate of the nonparametric function based on the P-spline method has a marked nonlinear trend. Specifically, there is a sharp decrease from age 45 to approximately age 49 and from age 60 to approximately age 73. Moreover, the curve appears to stabilize from age 73 onward. In the missing mechanism model, we found that the Bayesian estimates of unknown parameters and significantly deviated from zero. Thus, it is reasonable to model the missing data in our proposed semiparametric Bayesian analysis of the OAI dataset because the missing data mechanisms for and are nonignorable.
Table 5.
Bayesian estimates and standard errors in the real example.
Figure 4.
Nonparametric estimate of effects of age on the WOMAC numeric score in the OAI dataset.
6. Conclusions
In this paper, we have introduced a new Tweedie compound Poisson partial linear mixed model with nonignorable missing covariates and responses, in which the random effect follows a multivariate normal distribution and the nonparametric function is modelled by Bayesian P-splines. A logistic regression model is used to specify the missing response and covariate mechanisms. This article makes the following contributions: (i) the proposed Bayesian semiparametric mixed effects model handles the zero and positive components of longitudinal semicontinuous data in an integral way while accounting for nonignorable missing responses and covariates; (ii) the proposed partial linear mixed model based on Bayesian P-splines can characterize nonlinear smooth effects of covariates in the analysis of longitudinal semicontinuous data; (iii) the conditional distributions required by the Gibbs sampling and Metropolis–Hastings algorithms for the proposed model are derived; and (iv) two simulation studies and a real example illustrate the effectiveness and feasibility of the approach under the missing mechanisms considered.
Author Contributions
Conceptualization, Z.W. and X.D.; methodology, X.D. and Z.W.; software, Z.W., X.D. and W.Z.; validation, Z.W., X.D. and W.Z.; formal analysis, Z.W. and X.D.; investigation, Z.W., X.D. and W.Z.; preparation of the original work draft, X.D. and Z.W.; visualization, Z.W. and W.Z.; supervision, funding acquisition, X.D. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the National Natural Science Foundation of China (12161014), the National Statistical Science Research Project (2021LY011), the Guizhou Provincial Science and Technology Project ([2020]1Y009), and the Innovative Exploration and New Academic Seedling Project of Guizhou University of Finance and Economics (No. 2022XSXMA18).
Institutional Review Board Statement
Not applicable.
Data Availability Statement
The research data are available on the website: https://www.oai.ucsf.edu, accessed on 4 April 2017.
Acknowledgments
We are grateful to Zhixian Yang for careful English editing during the preparation of the revision.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Conditional Distributions for the Gibbs Sampling Algorithm
First, the conditional distribution of the missing part used in the Gibbs sampling algorithm is as follows.
- (1) The logarithmic joint conditional distribution of ( is
  where or ( are all missing covariables), , , and .
- (2)
- (3) The conditional distribution of is
  where or ( are all missing covariables).
- (4) The conditional distribution of is
  where or .
- (5) The conditional distribution of is
- (6) According to (13), we know , and . Then the conditional distribution of is
  Clearly, , where is the inverse gamma distribution. In addition, the conditional distributions of are the same as that of .
The conditional distribution of the nonparametric part used in the Gibbs sampling algorithm is as follows.
- (1) The logarithmic conditional distribution of is
- (2) The conditional distribution of is
  Clearly, , where is the inverse gamma distribution.
- (3) The conditional distribution of is
  Clearly, , where is the gamma distribution.
Finally, the conditional distributions of other parameters used in the Gibbs sampling are as follows.
- (1) The logarithmic conditional distribution of is
- (2) The logarithmic conditional distribution of p is
- (3) The logarithmic conditional distribution of is
- (4) The logarithmic conditional distribution of is
  Thus, is proportional to
- (5) The conditional distribution of is
  Clearly, , where is the k-dimensional inverse Wishart distribution.
- (6) The logarithmic conditional distribution of is
  Thus,
Appendix B. Metropolis–Hastings Algorithm
To implement the Metropolis–Hastings algorithm, we assume that the current iteration values of , , p, , , and are , , , , , and , and the proposal distributions of the new random samples , , , , , and were selected as zero truncated Poisson distribution , multivariate normal distribution , normal distribution , normal distribution , normal distribution , and multivariate normal distribution , respectively, where denotes the mean parameter of the Poisson distribution, N denotes the normal distribution, and , , , , , and are the tuned parameters, respectively. Furthermore, , , and are derived as
where . Finally, the acceptance probabilities of , , p, , , and used in the Metropolis–Hastings algorithm are given as follows:
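As a generic illustration of the accept/reject step (not the specific acceptance probabilities of this appendix, whose formulas are not reproduced in this text), the following sketch runs a random-walk Metropolis–Hastings sampler with a normal proposal centered at the current value. The standard normal target stands in for a full conditional distribution, and the step size, chain length, and burn-in are hypothetical tuning choices.

```python
import math
import random


def metropolis_hastings(log_target, start, step_sd, n_iter, rng):
    """Random-walk MH with a symmetric normal proposal centered at the current value."""
    current, current_lp = start, log_target(start)
    chain, accepted = [], 0
    for _ in range(n_iter):
        proposal = rng.gauss(current, step_sd)
        proposal_lp = log_target(proposal)
        # Symmetric proposal: accept with probability min(1, target ratio).
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain.append(current)
    return chain, accepted / n_iter


def log_standard_normal(z):
    """Toy log density up to an additive constant."""
    return -0.5 * z * z


rng = random.Random(11)
chain, rate = metropolis_hastings(log_standard_normal, 3.0, 2.4, 20_000, rng)
kept = chain[5_000:]                     # discard burn-in draws
mean = sum(kept) / len(kept)
var = sum((z - mean) ** 2 for z in kept) / (len(kept) - 1)
print(rate, mean, var)
```

With a proposal that is not symmetric, such as a zero-truncated Poisson proposal, the ratio of proposal densities would also enter the acceptance probability; the sketch omits that term because its proposal is symmetric.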
References
1. Olsen, M.K.; Schafer, J.L. A two-part random-effects model for semicontinuous longitudinal data. J. Am. Stat. Assoc. 2001, 96, 730–745.
2. Berk, K.; Lachenbruch, P.A. Repeated measures with zeros. Stat. Methods Med. Res. 2002, 11, 303–316.
3. Tooze, J.A.; Grunwald, G.K.; Jones, R.H. Analysis of repeated measures data with clumping at zero. Stat. Methods Med. Res. 2002, 11, 341–355.
4. Su, L.; Tom, B.D.; Farewell, V.T. Bias in 2-part mixed models for longitudinal semicontinuous data. Biostatistics 2009, 10, 374–389.
5. Su, L.; Tom, B.D.; Farewell, V.T. A likelihood-based two-part marginal model for longitudinal semicontinuous data. Stat. Methods Med. Res. 2015, 24, 194–205.
6. Liu, L.; Strawderman, R.L.; Johnson, B.A.; O’Quigley, J.M. Analyzing repeated measures semi-continuous data, with application to an alcohol dependence study. Stat. Methods Med. Res. 2016, 25, 133–152.
7. Zhou, X.X.; Kang, K.; Song, X.Y. Two-part hidden Markov models for semicontinuous longitudinal data with nonignorable missing covariates. Stat. Med. 2020, 39, 1801–1816.
8. Hasan, M.T.; Yan, G.H.; Ma, R.J. Analysis of periodic patterns of daily precipitation through simultaneous modeling of its serially observed occurrence and amount. Environ. Ecol. Stat. 2014, 21, 811–824.
9. Yan, G.H.; Ma, R.J. Modelling occurrence and quantity of longitudinal semicontinuous data simultaneously with nonparametric unobserved heterogeneity. Can. J. Stat. 2023, in press.
10. Zhang, Y.W. Likelihood-based and Bayesian Methods for Tweedie Compound Poisson Linear Mixed Models. Stat. Comput. 2013, 23, 743–757.
11. Swallow, B.; Buckland, S.T.; King, R.; Toms, M.P. Bayesian hierarchical modelling of continuous non-negative longitudinal data with a spike at zero: An application to a study of birds visiting gardens in winter. Biom. J. 2016, 58, 357–371.
12. Ye, T.; Lachos, V.H.; Wang, X.J.; Dey, D.K. Comparisons of zero-augmented continuous regression models from a Bayesian perspective. Stat. Med. 2021, 40, 1073–1100.
13. Ibrahim, J.G.; Lipsitz, S.R.; Chen, M.H. Missing covariates in generalized linear models when the missing data mechanism is non-ignorable. J. R. Stat. Soc. Ser. B 1999, 61, 173–190.
14. Ibrahim, J.G.; Chen, M.H.; Lipsitz, S.R. Missing responses in generalised linear mixed models when the missing data mechanism is nonignorable. Biometrika 2001, 88, 551–564.
15. Huang, L.; Chen, M.H.; Ibrahim, J.G. Bayesian analysis for generalized linear models with nonignorably missing covariates. Biometrics 2005, 61, 767–780.
16. Lee, S.Y.; Tang, N.S. Bayesian analysis of nonlinear structural equation models with nonignorable missing data. Psychometrika 2006, 71, 541–564.
17. Tang, N.S.; Zhao, H. Bayesian analysis of nonlinear reproductive dispersion mixed models for longitudinal data with nonignorable missing covariates. Commun. Stat. Simul. Comput. 2014, 43, 1265–1287.
18. Tang, N.S.; Chow, S.M.; Ibrahim, J.G.; Zhu, H.T. Bayesian sensitivity analysis of a nonlinear dynamic factor analysis model with nonparametric prior and possible nonignorable missingness. Psychometrika 2017, 82, 875–903.
19. Wang, Z.Q.; Tang, N.S. Bayesian quantile regression with mixed discrete and nonignorable missing covariates. Bayesian Anal. 2020, 15, 579–604.
20. Wang, X.Q.; Song, X.Y.; Zhu, H.T. Bayesian latent factor on image regression with nonignorable missing data. Stat. Med. 2021, 40, 920–932.
21. Ma, R.; Jørgensen, B. Nested generalized linear mixed models: An orthodox best linear unbiased predictor approach. J. R. Stat. Soc. Ser. B 2007, 69, 625–641.
22. Lang, S.; Brezger, A. Bayesian P-splines. J. Comput. Graph. Stat. 2004, 13, 183–212.
23. Tanner, M.A.; Wong, W.H. The Calculation of Posterior Distributions by Data Augmentation. J. Am. Stat. Assoc. 1987, 82, 528–540.
24. Geman, S.; Geman, D. Stochastic relaxation, Gibbs distribution and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 1984, 6, 721–741.
25. Chen, H.C.; Wehrly, T.E. Assessing correlation of clustered mixed outcomes from a multivariate generalized linear mixed model. Stat. Med. 2015, 34, 704–720.
26. Chen, H.C.; Wehrly, T.E. Approximate uniform shrinkage prior for a multivariate generalized linear mixed model. J. Multivar. Anal. 2016, 145, 148–161.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).