Bayesian Analysis of Tweedie Compound Poisson Partial Linear Mixed Models with Nonignorable Missing Response and Covariates

Wu, Zhenhuan; Duan, Xingde; Zhang, Wenzhuan

doi:10.3390/e25030506

by Zhenhuan Wu, Xingde Duan * and Wenzhuan Zhang

Department of Mathematics and Statistics, Guizhou University of Finance and Economics, Guiyang 550025, China

* Author to whom correspondence should be addressed.

Entropy 2023, 25(3), 506; https://doi.org/10.3390/e25030506

Submission received: 28 January 2023 / Revised: 24 February 2023 / Accepted: 13 March 2023 / Published: 15 March 2023

(This article belongs to the Special Issue Statistical Methods for Modeling High-Dimensional and Complex Data)

Abstract

Under the Bayesian framework, this study proposes a Tweedie compound Poisson partial linear mixed model for longitudinal semicontinuous data in the presence of nonignorable missing covariates and responses, in which the nonparametric function is approximated by Bayesian P-splines. Logistic regression models are used to specify the missing response and missing covariate mechanisms simultaneously. A hybrid algorithm combining the Gibbs sampler and the Metropolis–Hastings algorithm is employed to produce joint Bayesian estimates of the unknown parameters, the random effects, and the nonparametric function. Several simulation studies and a real example based on the Osteoarthritis Initiative data are presented to illustrate the proposed methodology.

Keywords: random effects; Tweedie compound Poisson distribution; Bayesian P-spline; longitudinal semicontinuous data; logistic regression model

1. Introduction

Semicontinuous data, characterized by nonnegative continuous values with a point mass at zero, appear frequently in many fields, such as medicine, health, economics, and ecology. Models for longitudinal semicontinuous data have received considerable attention and follow two main approaches. The first approach is the two-part mixed model, wherein a mixture of a Bernoulli distribution and a distribution with positive support is used to model the zero and positive components separately (Olsen and Schafer [1]; Berk and Lachenbruch [2]; Tooze et al. [3]; Su et al. [4,5]; Liu et al. [6]; Zhou et al. [7]). However, Hasan et al. [8] and Yan and Ma [9] pointed out that the artificial separation imposed by two-part modeling breaks down the serial patterns in the analysis of time series and longitudinal data. The second approach is the compound Poisson mixed model, which models longitudinal, repeated measurement, or clustered data in an integrated way. For example, Zhang [10] investigated several statistical inference methods for Tweedie compound Poisson linear mixed models from the frequentist and Bayesian perspectives. Swallow et al. [11] developed a Bayesian hierarchical Tweedie regression model by incorporating serial temporal and spatial correlation into the Tweedie distribution in the analysis of longitudinal semicontinuous ecological data. Ye et al. [12] carried out a sensitivity analysis for priors in Tweedie compound Poisson random effect models under a Bayesian framework. In particular, Yan and Ma [9] incorporated serially dependent distribution-free random effects into the compound Poisson regression model for longitudinal semicontinuous data. However, all the abovementioned compound Poisson mixed models are limited in that they either do not consider nonlinear smooth effects of covariates, such as time and age, or do not deal with missing responses and covariates.

Handling missing data has become an active research field in data analysis. Many methods have been proposed for statistical inference on various regression models with nonignorable missing responses or covariates. For example, Ibrahim et al. [13,14] proposed two EM-based methods for estimating unknown parameters in generalized linear models with nonignorable missing covariates and in generalized linear mixed models with nonignorable missing responses, respectively. In addition, Bayesian analogues of these frequentist approaches have been extended to various regression models. For example, from a Bayesian perspective, see Huang et al. [15] for generalized linear models with nonignorably missing covariates, Lee and Tang [16] for nonlinear structural equation models with nonignorable missing data, Tang and Zhao [17] for nonlinear reproductive dispersion mixed models for longitudinal data with nonignorable missing covariates, Tang et al. [18] for a nonlinear dynamic factor analysis model with nonparametric prior and possible nonignorable missingness, Zhou et al. [7] for two-part hidden Markov models for semicontinuous longitudinal data with nonignorable missing covariates, Wang and Tang [19] for Bayesian quantile regression with mixed discrete and nonignorable missing covariates, and Wang et al. [20] for Bayesian latent factor on image regression with nonignorable missing data. Motivated by these works, we propose a fully Bayesian method to simultaneously estimate the unknown parameters, random effects, and nonparametric function in a Tweedie compound Poisson partial linear mixed model, using a Bayesian P-spline approximation of the nonparametric function in the presence of nonignorable missing covariates and responses, where the nonignorable missing data mechanism is specified by a logistic regression model.

For brevity and readability, the main mathematical symbols used in the rest of the paper and their descriptions are summarized in Table 1.

The paper is organized as follows. In Section 2, we describe the data. In Section 3, we describe a Tweedie compound Poisson partial linear mixed model in the presence of nonignorable missing covariates and responses. We present the Bayesian P-spline used to model the nonparametric function. Logistic regression models are used to specify the missing response and covariate mechanisms simultaneously, and a sequence of one-dimensional conditional distributions is used to model the joint probability function of the missing covariates. In Section 4, the prior and posterior distributions of the unknown parameters and latent variables are presented. In Section 5, two simulation studies and a real example are given to illustrate the proposed methodology. In Section 6, we give some conclusions. In Appendix A and Appendix B, the conditional distributions for Gibbs sampling and the Metropolis–Hastings algorithm are given.

2. Data Description

In this section, we describe the Osteoarthritis Initiative (OAI) database, which is available at https://www.oai.ucsf.edu (accessed on 4 April 2017). The OAI cohort study investigated the causes of knee osteoarthritis in 4796 patients aged 45 and older, and collected information such as age, sex, and body mass index (BMI) for these patients at baseline, 12 months, 24 months, 36 months, and 48 months. Because of missing data, this information is available at most five times per patient. In addition, the OAI study adopted the Western Ontario and McMaster Universities Arthritis Index (WOMAC) disability scores to assess pain intensity in patients with hip and/or knee osteoarthritis. Higher WOMAC scores indicate worse pain, stiffness, and functional limitations. A sample of two patients (denoted by ID 9019406 and ID 9025191) from the OAI study is presented in Table 2.

The missing rates for the longitudinal WOMAC score outcome at baseline, 12 months, 24 months, 36 months, and 48 months are 0.3%, 7.1%, 10.9%, 12.1%, and 12.3%, respectively. Moreover, the missing rates for the covariate BMI at the five time points are 0%, 11.5%, 16.0%, 18.9%, and 21%, respectively. It can be seen from Figure 1 that the observed WOMAC numeric scores at 12 months, 24 months, 36 months, and 48 months are right-skewed with a large proportion of zeros, where the bold line on the left of each histogram denotes the frequency of zero. Specifically, more than 36.9% of the observations over all time points are zeros; thus, we treat the WOMAC numeric score as a longitudinal semicontinuous response with missing data in this article.

3. Statistical Models

3.1. Tweedie Compound Poisson Distribution

As in Ma and Jørgensen [21], the probability density function of the Tweedie compound Poisson distribution has the following form,

$$f_{p}(y; μ, ϕ) = c_{p}(y; ϕ) \exp\left\{\frac{1}{ϕ}\left(\frac{y μ^{1-p}}{1-p} - \frac{μ^{2-p}}{2-p}\right)\right\}, \tag{1}$$

where $p$ is the power parameter satisfying $1 < p < 2$, $μ$ and $ϕ$ are the mean parameter and dispersion parameter, respectively, and the expression for $c_{p}(y; ϕ)$ is not analytically tractable when $y > 0$. If a nonnegative random variable $Y$ follows a Tweedie compound Poisson distribution, we write $Y \sim Tw_{p}(μ, ϕ)$ in the rest of the paper. Moreover, we have $E(Y) = μ$ and $Var(Y) = ϕ μ^{p}$. Furthermore, a random number $Y$ from the Tweedie compound Poisson distribution is readily generated from the following stochastic representation

$$Y = \sum_{i=1}^{U} X_{i}, \tag{2}$$

where $U$ follows a Poisson distribution with mean $λ$, the $X_{i}$ are independent and identically distributed gamma random variables with mean $αγ$ and variance $αγ^{2}$, and $U$ and the $X_{i}$ are assumed to be independent. After some calculations, the relationships between the two sets of parameters in Equations (1) and (2) are derived as

$$\begin{aligned} μ &= λ α γ, & λ &= \frac{μ^{2-p}}{ϕ(2-p)}, \\ p &= \frac{α+2}{α+1}, & α &= \frac{2-p}{p-1}, \\ ϕ &= \frac{λ^{1-p}(αγ)^{2-p}}{2-p}, & γ &= ϕ(p-1)μ^{p-1}. \end{aligned} \tag{3}$$

It follows from Equation (2) that the joint probability distribution of $(Y, U)$ is given by

$$p_{Y,U}(y, u | λ, α, γ) = p_{Y|U}(y | u, α, γ) \times p_{U}(u | λ) = \begin{cases} \exp(-λ), & (y, u) = (0, 0), \\ \dfrac{y^{uα-1}\exp(-y/γ)}{Γ(uα)\, γ^{uα}} \times \dfrac{λ^{u}}{u!}\exp(-λ), & (y, u) \in R^{+} \times Z^{+}. \end{cases} \tag{4}$$

Thus, summing Equation (4) over $u$ gives the marginal distribution of $Y$, which has the form given in Equation (1).
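The stochastic representation in Equation (2) and the parameter mapping in Equation (3) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names, the seed, and the values $μ = 2$, $ϕ = 0.5$, $p = 1.5$ are illustrative assumptions rather than part of the original analysis, and the Poisson sampler is a simple multiplication method that is adequate only for moderate $λ$.

```python
import math
import random


def tweedie_to_compound(mu, phi, p):
    """Map (mu, phi, p) to (lam, alpha, gamma) using Equation (3)."""
    lam = mu ** (2 - p) / (phi * (2 - p))
    alpha = (2 - p) / (p - 1)
    gamma = phi * (p - 1) * mu ** (p - 1)
    return lam, alpha, gamma


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small-to-moderate lam."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_tweedie(mu, phi, p, rng):
    """Draw Y = X_1 + ... + X_U as in Equation (2)."""
    lam, alpha, gamma = tweedie_to_compound(mu, phi, p)
    u = sample_poisson(lam, rng)
    if u == 0:
        return 0.0  # the point mass at zero
    # A sum of u i.i.d. Gamma(shape=alpha, scale=gamma) variables
    # is Gamma(shape=u * alpha, scale=gamma).
    return rng.gammavariate(u * alpha, gamma)


rng = random.Random(2023)           # illustrative seed
mu, phi, p = 2.0, 0.5, 1.5          # illustrative parameter values
draws = [sample_tweedie(mu, phi, p, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((y - sample_mean) ** 2 for y in draws) / (len(draws) - 1)
print(sample_mean, mu)              # the two values should be close
print(sample_var, phi * mu ** p)    # the two values should be close
```

The empirical mean and variance should approach $μ$ and $ϕ μ^{p}$, in line with the moments stated after Equation (1).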

3.2. The Model

For modeling, we first introduce some notation. Let $y_{ij}$ be the longitudinal semicontinuous outcome, possibly missing, of the $i$th patient with osteoarthritis measured at time $t_{ij}$ ($i = 1, \dots, n$, $j = 1, \dots, n_{i}$). In the OAI study, $n = 4796$ is the number of patients, and $n_{i} = 5$ is the number of repeated observations per patient. Given the random effects $b_{i}$, $Y_{i1}, \dots, Y_{in_{i}}$ are conditionally independent, and each $Y_{ij} | b_{i}$ is assumed to follow the Tweedie compound Poisson distribution, that is

$$Y_{ij} | b_{i} \sim Tw_{p}(μ_{ij}, ϕ), \tag{5}$$

where $μ_{ij}$ is the conditional expectation of the response $Y_{ij}$, $ϕ$ is the dispersion parameter to be estimated, and $1 < p < 2$. Following the generalized linear mixed model (GLMM) approach, the conditional expectation $μ_{ij}$ is modeled by

$$\log(μ_{ij}) = η_{ij} = x_{ij}^{T} β + z_{ij}^{T} b_{i} + g(t_{ij}), \tag{6}$$

where $β$ is a $q \times 1$ vector of unknown regression parameters of interest, $x_{ij}$ is a $q \times 1$ vector of covariates subject to missingness, $b_{i}$ is distributed as $N_{r}(0, Σ)$, $z_{ij}$ is an $r \times 1$ vector of covariates relating to the random effects $b_{i}$, and $g(t_{ij})$ denotes an unknown, twice-differentiable nonparametric function of time $t_{ij}$. In this article, the model defined in Equations (5) and (6) is referred to as a Tweedie compound Poisson partial linear mixed model.

Inspired by Lang and Brezger [22], we use the Bayesian P-spline method, based on a linear combination of B-spline basis functions, to approximate the unknown nonparametric function, that is

$$g(t_{ij}) = \sum_{h=1}^{H} ξ_{h} B_{h}(t_{ij}), \tag{7}$$

where $B_{h}(\cdot)$ is the $h$th B-spline basis function, $H$ is the number of B-spline basis functions, and $ξ_{h}$ is the B-spline coefficient to be estimated. Under the Bayesian framework, $ξ_{h}$ is treated as a random variable and defined by the first-order random walk $ξ_{h} = ξ_{h-1} + v_{h}$, where $v_{h} \sim N(0, τ_{ξ}^{2})$ for $h = 2, \dots, H$, and $ξ_{1}$ has a diffuse prior proportional to a constant. The variance parameter $τ_{ξ}^{2}$ is viewed as a global smoothing parameter. Although the global smoothing parameter is easy to estimate, it cannot adequately capture highly oscillating features of the underlying nonparametric function $g(t)$. To overcome this issue, we introduce additional hyperparameters $δ_{h}$ as local smoothing parameters, which can improve the estimation of a function whose curvature differs markedly across points $t_{ij}$. Thus, $v_{h}$ is assumed to follow a normal distribution with heterogeneous variance; that is, $v_{h} \sim N(0, τ_{ξ}^{2}/δ_{h})$ for $h = 2, \dots, H$. Furthermore, let $ξ = (ξ_{1}, \dots, ξ_{H})^{T}$ and $δ = (δ_{2}, \dots, δ_{H})^{T}$. The prior distribution for $ξ$ is written in matrix form as

$$ξ | τ_{ξ}^{2} \propto \exp\left(-\frac{1}{2τ_{ξ}^{2}} ξ^{T} Q ξ\right),$$

where the penalty matrix $Q$ is given by

$$Q = \begin{pmatrix} δ_{2} & -δ_{2} & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\ -δ_{2} & δ_{2}+δ_{3} & -δ_{3} & 0 & \cdots & 0 & 0 & 0 & 0 \\ 0 & -δ_{3} & δ_{3}+δ_{4} & -δ_{4} & \cdots & 0 & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & -δ_{H-2} & δ_{H-2}+δ_{H-1} & -δ_{H-1} & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & -δ_{H-1} & δ_{H-1}+δ_{H} & -δ_{H} \\ 0 & 0 & 0 & 0 & \cdots & 0 & 0 & -δ_{H} & δ_{H} \end{pmatrix}.$$

Here, the smoothing parameter $τ_{ξ}^{2}$ is assigned an inverse gamma prior; that is, $p(τ_{ξ}^{2}) \sim IG(a_{τ}, b_{τ})$.
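The link between the random-walk prior with local smoothing parameters and the penalty matrix can be made concrete: the quadratic form $ξ^{T} Q ξ$ equals the weighted sum of squared first differences $\sum_{h=2}^{H} δ_{h}(ξ_{h} - ξ_{h-1})^{2}$. The sketch below builds $Q$ from $δ$ and checks this identity; the function names and the numerical values of $δ$ and $ξ$ are hypothetical and chosen only for illustration.

```python
def penalty_matrix(delta):
    """Build the H x H penalty matrix Q from delta = (delta_2, ..., delta_H)."""
    H = len(delta) + 1
    Q = [[0.0] * H for _ in range(H)]
    for h, d in enumerate(delta, start=1):
        # d weights the squared difference (xi[h] - xi[h-1])^2 (0-based indices),
        # which contributes to a 2 x 2 block of Q.
        Q[h - 1][h - 1] += d
        Q[h][h] += d
        Q[h - 1][h] -= d
        Q[h][h - 1] -= d
    return Q


def quadratic_form(Q, xi):
    """Compute xi^T Q xi."""
    n = len(xi)
    return sum(xi[a] * Q[a][b] * xi[b] for a in range(n) for b in range(n))


delta = [1.0, 2.0, 0.5]        # hypothetical values of delta_2, delta_3, delta_4
xi = [0.3, -0.1, 0.4, 1.0]     # hypothetical spline coefficients, H = 4
Q = penalty_matrix(delta)
weighted_diffs = sum(d * (xi[h] - xi[h - 1]) ** 2
                     for h, d in enumerate(delta, start=1))
print(quadratic_form(Q, xi), weighted_diffs)   # the two values should agree
```

Each row of $Q$ sums to zero, which reflects the fact that the prior is flat with respect to a constant shift of all coefficients, consistent with the diffuse prior on $ξ_{1}$.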

3.3. Missing Data Mechanism Assumptions

In this article, let $y_{i} = (y_{i1}, \dots, y_{in_{i}})^{T}$ be an $n_{i} \times 1$ vector of responses ($i = 1, \dots, n$), and let $x_{ij}$ be a $q \times 1$ vector of covariates, both subject to missingness, whereas the $z_{ij}$ are completely observed. In what follows, we assume that the missing data mechanisms for the response and covariates are nonignorable. Let $y_{i} = (y_{oi}^{T}, y_{mi}^{T})^{T}$ and $x_{ij} = (x_{oij}^{T}, x_{mij}^{T})^{T}$, where $y_{oi}$ ($n_{1i} \times 1$) and $y_{mi}$ ($n_{2i} \times 1$) are the vectors of observed and missing components of the responses in $y_{i}$, respectively, satisfying $n_{1i} + n_{2i} = n_{i}$, and $x_{oij}$ ($q_{1ij} \times 1$) and $x_{mij}$ ($q_{2ij} \times 1$) are the vectors of observed and missing covariates in $x_{ij}$, respectively, satisfying $q_{1ij} + q_{2ij} = q$. Let $r_{yi} = (r_{yi1}, \dots, r_{yin_{i}})^{T}$ be a vector of indicator variables that records which components of $y_{i} = (y_{i1}, \dots, y_{in_{i}})^{T}$ are missing; that is,

$$r_{yij} = \begin{cases} 1, & y_{ij} \text{ is missing}, \\ 0, & y_{ij} \text{ is observed}. \end{cases}$$

Following Ibrahim et al. [14], it is common to specify a Bernoulli distribution for the nonignorable missing mechanism. Thus, given $y_{i}$ and the unknown parameter $φ_{y}$, the conditional probability function of $r_{yij}$ is

$$p(r_{yij} | y_{i}, φ_{y}) = \{\Pr(r_{yij} = 1 | y_{i}, φ_{y})\}^{r_{yij}} \{1 - \Pr(r_{yij} = 1 | y_{i}, φ_{y})\}^{1 - r_{yij}},$$

where $\Pr(r_{yij} = 1 | y_{i}, φ_{y})$ is specified by a logistic regression model,

$$\operatorname{logit}\{\Pr(r_{yij} = 1 | y_{i}, φ_{y})\} = φ_{y0} + φ_{y1} y_{ij} + φ_{y2} y_{i,j-1} = u_{yij}, \tag{8}$$

in which $\operatorname{logit}\{\Pr(r_{yij} = 1 | y_{i}, φ_{y})\} = \log\left\{\dfrac{\Pr(r_{yij} = 1 | y_{i}, φ_{y})}{1 - \Pr(r_{yij} = 1 | y_{i}, φ_{y})}\right\}$.
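To illustrate how Equation (8) turns responses into missingness probabilities, the following sketch evaluates the inverse logit of $u_{yij}$ and draws the indicators $r_{yij}$ for one subject. The helper names and the response values are hypothetical; the coefficients $(-2.4, 0.1, 0.1)$ are those used for $φ_{y}$ in the first simulation study, and setting the previous response to zero at the first visit is an assumption, since the text does not specify $y_{i,0}$.

```python
import math
import random


def inv_logit(u):
    """Inverse of the logit link: exp(u) / (1 + exp(u))."""
    return 1.0 / (1.0 + math.exp(-u))


def response_missing_prob(y_curr, y_prev, phi_y):
    """Pr(r_yij = 1 | y_i, phi_y) under Equation (8)."""
    u = phi_y[0] + phi_y[1] * y_curr + phi_y[2] * y_prev
    return inv_logit(u)


def draw_missing_indicators(y, phi_y, rng):
    """Return r_y for one subject; 1 marks a missing response."""
    indicators = []
    for j, y_curr in enumerate(y):
        # Assumption: no previous response exists at the first visit, so use 0.
        y_prev = y[j - 1] if j > 0 else 0.0
        prob = response_missing_prob(y_curr, y_prev, phi_y)
        indicators.append(1 if rng.random() < prob else 0)
    return indicators


phi_y = (-2.4, 0.1, 0.1)            # values of phi_y in the first simulation study
rng = random.Random(1)              # illustrative seed
y_subject = [0.0, 1.3, 4.2, 0.0]    # hypothetical responses for one subject
r_y = draw_missing_indicators(y_subject, phi_y, rng)
print(r_y)
```

Because the probability depends on $y_{ij}$ itself, which is unobserved whenever $r_{yij} = 1$, the mechanism is nonignorable: larger responses are more likely to be missing when $φ_{y1} > 0$.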

Similarly, let $r_{xij} = (r_{xij1}, \dots, r_{xijq})^{T}$ be a vector of indicator variables that records which components of $x_{ij}$ are missing, where each $r_{xijk}$ is defined as follows:

$$r_{xijk} = \begin{cases} 1, & x_{ijk} \text{ is missing}, \\ 0, & x_{ijk} \text{ is observed}. \end{cases}$$

For the conditional probability $\Pr(r_{xij} | x_{ij}, φ_{x})$, we consider the following nonignorable missing data mechanism,

$$\begin{aligned} \Pr(r_{xij} | x_{ij}, φ_{x}) = {} & \Pr(r_{xijq} | x_{ij1}, \dots, x_{ijq}, r_{xij1}, \dots, r_{xij,q-1}, φ_{xq}) \\ & \times \Pr(r_{xij,q-1} | x_{ij1}, \dots, x_{ij,q-1}, r_{xij1}, \dots, r_{xij,q-2}, φ_{x,q-1}) \\ & \times \dots \times \Pr(r_{xij2} | x_{ij1}, x_{ij2}, r_{xij1}, φ_{x2}) \Pr(r_{xij1} | x_{ij1}, φ_{x1}), \end{aligned}$$

in which $\Pr(r_{xijk} | x_{ij1}, \dots, x_{ijk}, r_{xij1}, \dots, r_{xij,k-1}, φ_{xk})$ is defined by a logistic regression model

$$\begin{aligned} & \operatorname{logit}\{\Pr(r_{xijk} = 1 | x_{ij1}, \dots, x_{ijk}, r_{xij1}, \dots, r_{xij,k-1}, φ_{xk})\} \\ & \quad = φ_{xk0} + φ_{xk1} x_{ij1} + \dots + φ_{xkk} x_{ijk} + φ_{xk,k+1} r_{xij1} + \dots + φ_{xk,2k-1} r_{xij,k-1} = v_{xijk}, \end{aligned} \tag{9}$$

where $φ_{xk} = (φ_{xk0}, φ_{xk1}, \dots, φ_{xk,2k-1})^{T}$.

In this article, we also consider another type of nonignorable missing mechanism for the response and covariates, in which the missingness of each variable may depend on the others. Specifically, under this type, the nonignorable missing mechanism for the response is specified by a logistic regression model,

$$\operatorname{logit}\{\Pr(r_{yij} = 1 | y_{i}, φ_{y}, x_{ij})\} = φ_{y0} + φ_{y1} y_{ij} + φ_{y2} x_{ij1} + φ_{y3} x_{ij2} + \dots + φ_{y,k+1} x_{ijk}, \tag{10}$$

where $x_{ij1}, \dots, x_{ijk}$ are the covariates subject to missingness. For the missing covariates, $\Pr(r_{xijk} | x_{ij1}, \dots, x_{ijk}, y_{ij}, φ_{xk})$ is given by a logistic regression model,

$$\operatorname{logit}\{\Pr(r_{xijk} = 1 | x_{ij1}, \dots, x_{ijk}, y_{ij}, φ_{xk})\} = φ_{xk0} + φ_{xk1} x_{ij1} + \dots + φ_{xkk} x_{ijk} + φ_{xk,k+1} y_{ij} = v_{xijk}, \tag{11}$$

where $φ_{xk} = (φ_{xk0}, φ_{xk1}, \dots, φ_{xk,k+1})^{T}$.

In what follows, we assume that the covariate vector $x_{ij} = (x_{ij1}, x_{ij2}, \dots, x_{ijq})^{T}$ is continuous, with missingness in the first $m$ components and complete observation in the remaining $q - m$ components. According to Ibrahim et al. [13], the joint probability function of the missing covariates is simplified into a product of one-dimensional conditional distributions as follows,

$$\begin{aligned} p(x_{ij1}, x_{ij2}, \dots, x_{ij,m-1}, x_{ijm} | x_{0}, α) = {} & p(x_{ijm} | x_{0}, x_{ij1}, x_{ij2}, \dots, x_{ij,m-1}, α_{m}) \\ & \times \dots \times p(x_{ij2} | x_{0}, x_{ij1}, α_{2}) \, p(x_{ij1} | x_{0}, α_{1}), \end{aligned} \tag{12}$$

where $i = 1, \dots, n$ and $j = 1, \dots, n_{i}$, $x_{0} = (x_{ij,m+1}, x_{ij,m+2}, \dots, x_{ijq})$, and $α = (α_{1}, α_{2}, \dots, α_{m})$. Here, the covariates $x_{0}$ do not need to be modelled because they are always observed. In addition, continuous missing covariates are generally assumed to follow the normal distribution. For example,

$$p(x_{ijk} | x_{0}, x_{ij1}, x_{ij2}, \dots, x_{ij,k-1}, α_{k}) \sim N(μ_{ijk}, σ_{xk}^{2}), \quad k = 1, 2, \dots, m, \tag{13}$$

where the mean parameter $μ_{ijk}$ is given by

$$\begin{aligned} μ_{ijk} = {} & α_{k,0} + α_{k1} x_{ij1} + α_{k2} x_{ij2} + \dots + α_{k,k-1} x_{ij,k-1} + α_{kk} x_{ij,k+1} \\ & + α_{k,k+1} x_{ij,k+2} + \dots + α_{k,q-1} x_{ijq}. \end{aligned}$$

Here, $α_{k} = (α_{k,0}, α_{k1}, \dots, α_{k,q-1})$.

4. Bayesian Inference

To carry out Bayesian inference on the parameters of interest, we first introduce the following notation. Let $Y_{o} = \{y_{o1}, \dots, y_{on}\}$ and $Y_{m} = \{y_{m1}, \dots, y_{mn}\}$ be the sets of observed and missing values of the response variables, respectively. Similarly, $X_{o} = \{x_{o11}, \dots, x_{o1n_{1}}, \dots, x_{on1}, \dots, x_{onn_{n}}\}$ and $X_{m} = \{x_{m11}, \dots, x_{m1n_{1}}, \dots, x_{mn1}, \dots, x_{mnn_{n}}\}$ are the sets of observed and missing values of the covariates, respectively. Let $U = \{u_{ij} : i = 1, \dots, n, j = 1, \dots, n_{i}\}$ denote the latent variables. Let $b = \{b_{1}, \dots, b_{n}\}$ and $Z = \{z_{ij} : i = 1, \dots, n, j = 1, \dots, n_{i}\}$ denote the random effects and the covariates relating to the random effects, respectively. Let $T = \{t_{ij} : i = 1, \dots, n, j = 1, \dots, n_{i}\}$ be the time points relating to the nonparametric part. Denote the indicator variables and parameters relating to the missing data mechanism by $r = \{r_{y}, r_{x}\}$ and $φ = \{φ_{x}, φ_{y}\}$, where $r_{y} = \{r_{y1}, \dots, r_{yn}\}$ and $r_{x} = \{r_{x11}, \dots, r_{x1n_{1}}, \dots, r_{xn1}, \dots, r_{xnn_{n}}\}$. Overall, let $θ = \{β, p, ϕ, Σ, ξ, τ_{ξ}^{2}, δ, φ\}$, $α$, and $σ_{m}^{2} = \{σ_{xk}^{2} : k = 1, \dots, m\}$ be all the parameters to be estimated in the model under consideration. Given the observed data $\{Y_{o}, X_{o}, Z, T, r\}$, the joint posterior distribution of $(θ, α, σ_{m}^{2})$ is given by

$$\begin{aligned} & p(θ, α, σ_{m}^{2} | Y_{o}, X_{o}, Z, T, r) \\ & \quad \propto p(Y_{o} | X_{o}, Z, T, θ) \, p(r | Y_{o}, X_{o}, φ) \, p(X_{o} | α, σ_{m}^{2}) \, p(θ) \, p(α) \, p(σ_{m}^{2}), \end{aligned} \tag{14}$$

where $p(Y_{o} | X_{o}, Z, T, β, p, ϕ, ξ) = \int p(Y, U, b | X, Z, T, β, p, ϕ, ξ) \, dY_{m} \, dX_{m} \, dU \, db$ and $p(r | Y_{o}, X_{o}, φ) = \int p(r | Y, X, φ) \, dY_{m} \, dX_{m}$.

Clearly, it is difficult to draw random samples from the posterior distribution $p(θ, α, σ_{m}^{2} | Y_{o}, X_{o}, Z, T, r)$ because Equation (14) involves high-dimensional integrals. Thus, following the data augmentation method (Tanner and Wong [23]), we work with the augmented posterior distribution $p(θ, α, σ_{m}^{2} | Y, X, T, Z, r, b, U)$ to avoid the high-dimensional integration. It is then easy to draw random samples from $p(Y_{m}, X_{m}, b, U, θ, α, σ_{m}^{2} | Y_{o}, X_{o}, Z, T, r)$ via the Gibbs sampler (Geman and Geman [24]). That is, random samples $\{Y_{m}, X_{m}, U, b, θ, α, σ_{m}^{2}\}$ are iteratively generated from the following conditional distributions: $p(Y_{m} | Y_{o}, X, U, Z, T, b, r, θ)$, $p(X_{m} | Y, X_{o}, U, Z, T, b, r, φ_{x}, α, σ_{m}^{2})$, $p(U | Y, X, Z, T, p, ϕ)$, $p(b | Y, X, U, Z, T, θ)$, $p(β, p, ϕ, Σ, ξ | Y, X, U, Z, T)$, $p(τ_{ξ}^{2} | ξ, δ)$, $p(δ_{h} | ξ, τ_{ξ}^{2})$, $p(φ | Y, X, r)$, $p(α | X, σ_{m}^{2})$, and $p(σ_{m}^{2} | X, α)$. To derive these conditional distributions, we use the following joint log-likelihood function of $(Y, U, b)$,

$$\begin{aligned} l(θ; Y, U, b) = {} & \log\left\{\prod_{i=1}^{n}\left[p(b_{i}) \prod_{j=1}^{n_{i}} p(y_{ij}, u_{ij} | b_{i}, θ)\right]\right\} \\ = {} & \sum_{y_{ij} = 0} \log p(y_{ij}, u_{ij} | b_{i}, θ) + \sum_{y_{ij} > 0} \log p(y_{ij}, u_{ij} | b_{i}, θ) + \sum_{i=1}^{n} \log p(b_{i}) \\ = {} & -\frac{1}{ϕ} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \frac{μ_{ij}^{2-p}}{2-p} - \sum_{y_{ij} > 0}\left(\frac{y_{ij}}{ϕ(p-1)μ_{ij}^{p-1}} + \log Γ\left(u_{ij}\frac{2-p}{p-1}\right) + \log u_{ij}! + \log y_{ij}\right) \\ & + \sum_{y_{ij} > 0} u_{ij}\left(\frac{2-p}{p-1}\log\frac{y_{ij}}{p-1} - \frac{\log ϕ}{p-1} - \log(2-p)\right) + \sum_{i=1}^{n} \log p(b_{i}). \end{aligned} \tag{15}$$
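As a sanity check on Equation (15), the contribution of a single observation $(y_{ij}, u_{ij})$ can be evaluated in two ways: directly from the joint distribution in Equation (4) with $(λ, α, γ)$, and from the summands of Equation (15) with $(μ, ϕ, p)$, where the two parameter sets are linked by Equation (3). The sketch below compares the two; the function names and the numerical inputs are hypothetical, and the random-effects term $\log p(b_{i})$ is left out because it does not depend on the parameterization.

```python
import math


def log_joint_eq4(y, u, lam, alpha, gamma):
    """log p(y, u) in the compound Poisson parameters of Equation (4)."""
    if u == 0:
        return -lam  # only meaningful together with y == 0
    return ((u * alpha - 1) * math.log(y) - y / gamma
            - math.lgamma(u * alpha) - u * alpha * math.log(gamma)
            + u * math.log(lam) - math.lgamma(u + 1) - lam)


def log_joint_eq15(y, u, mu, phi, p):
    """One summand of Equation (15) in (mu, phi, p), without log p(b_i)."""
    value = -mu ** (2 - p) / (phi * (2 - p))
    if y > 0:
        value -= (y / (phi * (p - 1) * mu ** (p - 1))
                  + math.lgamma(u * (2 - p) / (p - 1))
                  + math.lgamma(u + 1) + math.log(y))
        value += u * ((2 - p) / (p - 1) * math.log(y / (p - 1))
                      - math.log(phi) / (p - 1) - math.log(2 - p))
    return value


mu, phi, p = 2.0, 0.5, 1.5                  # hypothetical parameter values
lam = mu ** (2 - p) / (phi * (2 - p))       # Equation (3)
alpha = (2 - p) / (p - 1)
gamma = phi * (p - 1) * mu ** (p - 1)

positive_gap = log_joint_eq4(1.7, 3, lam, alpha, gamma) - log_joint_eq15(1.7, 3, mu, phi, p)
zero_gap = log_joint_eq4(0.0, 0, lam, alpha, gamma) - log_joint_eq15(0.0, 0, mu, phi, p)
print(positive_gap, zero_gap)   # both differences should be numerically zero
```

Agreement of the two forms confirms that the terms involving $μ_{ij}$ cancel inside the coefficient of $u_{ij}$, so that $μ_{ij}$ enters the augmented log-likelihood only through $μ_{ij}^{2-p}$ and $y_{ij}/μ_{ij}^{p-1}$.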

Moreover, the prior distributions of $β$, $p$, $ϕ$, $Σ$, $ξ$, $τ_{ξ}^{2}$, $δ_{h}$, $φ$, $α$, and $σ_{xk}^{2}$ are given by

$$\begin{aligned} & p(β) \sim N(β^{0}, A), \quad \operatorname{logit}(p-1) \sim N(0, 10000), \quad \log(ϕ) \sim N(0, 10000), \quad p(Σ) \sim IW_{k}(ρ_{0}, R_{0}), \\ & ξ | τ_{ξ}^{2} \propto \exp\left(-\frac{1}{2τ_{ξ}^{2}} ξ^{T} Q ξ\right), \quad p(τ_{ξ}^{2}) \sim IG(a_{τ}, b_{τ}), \quad p(δ_{h}) \sim Γ(a_{δ}, b_{δ}), \quad p(φ_{y}) \sim N(φ_{y}^{0}, B), \\ & p(φ_{x}) \sim N(φ_{x}^{0}, C), \quad p(α) \sim N(α_{x}^{0}, D), \quad p(σ_{xk}^{2}) \sim IG(a_{xk}, b_{xk}), \end{aligned} \tag{16}$$

where $β^{0}$, $φ_{y}^{0}$, $φ_{x}^{0}$, $α_{x}^{0}$, $A$, $B$, $C$, $D$, $ρ_{0}$, $R_{0}$, $a_{τ}$, $b_{τ}$, $a_{δ}$, $b_{δ}$, $a_{xk}$, and $b_{xk}$ are pregiven hyperparameters, $N$ is the normal distribution, $IW_{k}(\cdot, \cdot)$ is the $k$-dimensional inverse Wishart distribution, $Γ$ is the gamma distribution, $IG$ is the inverse gamma distribution, and $\operatorname{logit}(a) = \log\left(\frac{a}{1-a}\right)$. As for the choice of hyperparameters in the Bayesian P-spline method, Lang and Brezger [22] pointed out that $a_{τ} = 1$ together with a small value for $b_{τ}$, for example $b_{τ} = 0.005$ or $b_{τ} = 0.0005$, leads to an almost diffuse prior for $τ_{ξ}^{2}$. Moreover, the hyperparameters $a_{δ}$ and $b_{δ}$ are both taken to be $0.5$, which can capture the highly oscillating features of some nonparametric functions. As for the power parameter $p$, Ye et al. [12] adopted the following priors to conduct their sensitivity analysis: $p \sim Uniform(1, 2)$, $\operatorname{logit}(p-1) \sim N(0, 100)$, $p - 1 \sim Beta(0.1, 0.1)$, and $p - 1 \sim Beta(0.01, 0.01)$. On the basis of this sensitivity analysis, Ye et al. [12] chose the $\operatorname{logit}(p-1) \sim N(0, 100)$ prior as the best choice for $p$ in the Tweedie compound Poisson distribution. The choices of hyperparameters for the other prior distributions are discussed in Section 5. The conditional distributions, the Gibbs sampler, and the Metropolis–Hastings algorithm are given in Appendix A and Appendix B.

Bayesian Estimates

Let $\{β^{(l)}, ϕ^{(l)}, p^{(l)}, Σ^{(l)}, α^{(l)}, φ_{y}^{(l)}, φ_{x}^{(l)}, σ_{xk}^{2(l)} : l = 1, \dots, L\}$ be random samples from the joint posterior distribution $p(β, ϕ, p, Σ, α, φ_{y}, φ_{x}, σ_{xk}^{2} | Y, X, Z, r, b)$. The Bayesian estimates of the parameters $β$, $ϕ$, $p$, $Σ$, $α$, $φ_{y}$, $φ_{x}$, and $σ_{xk}^{2}$ can be obtained by

$$\begin{aligned} & \hat{β} = \frac{1}{L}\sum_{l=1}^{L} β^{(l)}, \quad \hat{ϕ} = \frac{1}{L}\sum_{l=1}^{L} ϕ^{(l)}, \quad \hat{p} = \frac{1}{L}\sum_{l=1}^{L} p^{(l)}, \quad \hat{Σ} = \frac{1}{L}\sum_{l=1}^{L} Σ^{(l)}, \\ & \hat{α} = \frac{1}{L}\sum_{l=1}^{L} α^{(l)}, \quad \hat{φ}_{y} = \frac{1}{L}\sum_{l=1}^{L} φ_{y}^{(l)}, \quad \hat{φ}_{x} = \frac{1}{L}\sum_{l=1}^{L} φ_{x}^{(l)}, \quad \hat{σ}_{xk}^{2} = \frac{1}{L}\sum_{l=1}^{L} σ_{xk}^{2(l)}. \end{aligned}$$

Similarly, consistent estimates of the posterior covariance matrix $var(β, ϕ, p, Σ, α, φ_{y}, φ_{x}, σ_{xk}^{2} | Y, X, Z, r, b)$ for the parameters $β$, $ϕ$, $p$, $Σ$, $α$, $φ_{y}$, $φ_{x}$, and $σ_{xk}^{2}$ can be obtained from the sample covariance matrix of their random samples. For example, the posterior covariance matrix $var(β | Y, X, Z, r, b)$ can be consistently estimated by

$$\widehat{var}(β | Y, X, Z, r, b) = \frac{1}{L-1}\sum_{l=1}^{L} (β^{(l)} - \hat{β})(β^{(l)} - \hat{β})^{T}.$$

In addition, the corresponding standard deviations can be estimated by the square roots of the diagonal elements of the sample covariance matrix of the random sample sequence.
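The posterior summaries above amount to simple averages over the retained draws. As a minimal sketch, the code below computes $\hat{β}$, the sample covariance matrix with divisor $L - 1$, and the standard deviations, which by the usual convention are the square roots of the diagonal elements. The function names and the three two-dimensional draws are hypothetical; in practice the draws would come from the sampler after burn-in.

```python
import math


def posterior_mean(draws):
    """Average the retained draws component by component."""
    n_draws, dim = len(draws), len(draws[0])
    return [sum(d[k] for d in draws) / n_draws for k in range(dim)]


def posterior_covariance(draws):
    """Sample covariance matrix of the draws with divisor L - 1."""
    n_draws, dim = len(draws), len(draws[0])
    mean = posterior_mean(draws)
    return [[sum((d[a] - mean[a]) * (d[b] - mean[b]) for d in draws) / (n_draws - 1)
             for b in range(dim)]
            for a in range(dim)]


def posterior_sd(draws):
    """Square roots of the diagonal of the sample covariance matrix."""
    cov = posterior_covariance(draws)
    return [math.sqrt(cov[k][k]) for k in range(len(cov))]


# Hypothetical draws beta^(1), beta^(2), beta^(3) of a two-dimensional beta.
beta_draws = [[1.0, 2.0], [3.0, 2.0], [2.0, 5.0]]
beta_hat = posterior_mean(beta_draws)
beta_cov = posterior_covariance(beta_draws)
beta_sd = posterior_sd(beta_draws)
print(beta_hat, beta_cov, beta_sd)
```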

5. Numerical Examples

In this section, two simulation studies and a real example relating to the OAI data are conducted to investigate the performance of our proposed Bayesian methodologies.

5.1. Simulation Studies

In the first simulation study, the longitudinal semicontinuous datasets $\{y_{ij} : i = 1, \dots, n, j = 1, \dots, n_{i}\}$ with $n = 150$ and $n_{i} = 4$ are simulated from the Tweedie compound Poisson distribution $Tw_{p}(μ_{ij}, ϕ)$, where the conditional mean $μ_{ij}$ is given by

$$\log(μ_{ij}) = x_{ij1} β_{1} + x_{ij2} β_{2} + x_{ij3} β_{3} + b_{i} + g(t_{ij}), \tag{17}$$

where the covariate $x_{ij3}$ is generated from the standard normal distribution, and $x_{ij1}$ and $x_{ij2}$ are simulated from the normal distributions $N(α_{10} + α_{11} x_{ij3}, σ_{x1}^{2})$ and $N(α_{20} + α_{21} x_{ij1} + α_{22} x_{ij3}, σ_{x2}^{2})$, respectively. In addition, the random effects $b_{i}$ are independent and identically distributed as $N(0, Σ)$, and the true nonparametric function is $g(t) = \sin(2πt)$ with $t_{ij} \sim U(0, 1)$. The true values of the abovementioned parameters are taken to be $σ_{x1}^{2} = 0.25$, $σ_{x2}^{2} = 0.36$, $p = 1.5$, $ϕ = 0.5$, $Σ = 0.64$, $β = (β_{1}, β_{2}, β_{3})^{T} = (1, 1, -1)^{T}$, $α_{1} = (α_{10}, α_{11})^{T} = (0.05, 0.5)^{T}$, and $α_{2} = (α_{20}, α_{21}, α_{22})^{T} = (-0.9, 0.05, 0.9)^{T}$. In what follows, it is assumed that the covariate $x_{ij3}$ is completely observed, while the response $y_{ij}$ and the covariates $x_{ij1}$, $x_{ij2}$ are subject to missingness. Thus, the nonignorable missing mechanisms for these three variables are modelled by the following logistic regression models,

$$\begin{aligned} \operatorname{logit}\{\Pr(r_{yij} = 1 | y_{ij}, y_{i,j-1}, φ_{y})\} &= φ_{y0} + φ_{y1} y_{ij} + φ_{y2} y_{i,j-1}, \\ \operatorname{logit}\{\Pr(r_{xij1} = 1 | x_{ij1}, φ_{x1})\} &= φ_{x10} + φ_{x11} x_{ij1}, \\ \operatorname{logit}\{\Pr(r_{xij2} = 1 | x_{ij1}, x_{ij2}, r_{xij1}, φ_{x2})\} &= φ_{x20} + φ_{x21} x_{ij1} + φ_{x22} x_{ij2} + φ_{x23} r_{xij1}, \end{aligned} \tag{18}$$

where the true values of $φ_{y}$, $φ_{x1}$, and $φ_{x2}$ are given by $φ_{y} = (φ_{y0}, φ_{y1}, φ_{y2})^{T} = (-2.4, 0.1, 0.1)^{T}$, $φ_{x1} = (φ_{x10}, φ_{x11})^{T} = (-2.5, 0.1)^{T}$, and $φ_{x2} = (φ_{x20}, φ_{x21}, φ_{x22}, φ_{x23})^{T} = (-1.9, 0.05, 0.05, 0.3)^{T}$. The missing data for $x_{ij1}$, $x_{ij2}$, and $y_{ij}$ were generated by Equation (18), and the average proportions of missing data for $x_{ij1}$, $x_{ij2}$, and $y_{ij}$ over 50 replications are 8.7%, 13.3%, and 7.5%, respectively.
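The data-generating process of the first simulation study can be sketched as follows, using the true parameter values stated above together with Equations (17) and (18). This is only an illustrative reconstruction with the Python standard library, not the authors' code: the function names and the seed are hypothetical, the previous response at the first visit is assumed to be zero because $y_{i,0}$ is not specified, and the missingness indicators are returned alongside the complete data rather than used to mask values.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for moderate lam."""
    threshold, count, product = math.exp(-lam), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_tweedie(mu, phi, p, rng):
    """Poisson sum of gamma variables, Equations (2) and (3)."""
    lam = mu ** (2 - p) / (phi * (2 - p))
    alpha = (2 - p) / (p - 1)
    gamma = phi * (p - 1) * mu ** (p - 1)
    u = sample_poisson(lam, rng)
    return rng.gammavariate(u * alpha, gamma) if u > 0 else 0.0


def inv_logit(u):
    return 1.0 / (1.0 + math.exp(-u))


def simulate_study_one(rng, n=150, n_i=4):
    beta = (1.0, 1.0, -1.0)
    p, phi, sigma_b2 = 1.5, 0.5, 0.64
    alpha1, alpha2 = (0.05, 0.5), (-0.9, 0.05, 0.9)
    sd_x1, sd_x2 = math.sqrt(0.25), math.sqrt(0.36)
    phi_y, phi_x1 = (-2.4, 0.1, 0.1), (-2.5, 0.1)
    phi_x2 = (-1.9, 0.05, 0.05, 0.3)
    rows = []
    for i in range(n):
        b_i = rng.gauss(0.0, math.sqrt(sigma_b2))
        y_prev = 0.0  # assumption: y_{i,0} = 0 at the first visit
        for j in range(n_i):
            t = rng.random()
            x3 = rng.gauss(0.0, 1.0)
            x1 = rng.gauss(alpha1[0] + alpha1[1] * x3, sd_x1)
            x2 = rng.gauss(alpha2[0] + alpha2[1] * x1 + alpha2[2] * x3, sd_x2)
            eta = (beta[0] * x1 + beta[1] * x2 + beta[2] * x3
                   + b_i + math.sin(2 * math.pi * t))          # Equation (17)
            y = sample_tweedie(math.exp(eta), phi, p, rng)
            # Missingness indicators from Equation (18); 1 means missing.
            r_x1 = int(rng.random() < inv_logit(phi_x1[0] + phi_x1[1] * x1))
            r_x2 = int(rng.random() < inv_logit(
                phi_x2[0] + phi_x2[1] * x1 + phi_x2[2] * x2 + phi_x2[3] * r_x1))
            r_y = int(rng.random() < inv_logit(
                phi_y[0] + phi_y[1] * y + phi_y[2] * y_prev))
            rows.append({"i": i, "j": j, "t": t, "x1": x1, "x2": x2, "x3": x3,
                         "y": y, "r_x1": r_x1, "r_x2": r_x2, "r_y": r_y})
            y_prev = y
    return rows


data = simulate_study_one(random.Random(2023))   # illustrative seed
zero_share = sum(row["y"] == 0.0 for row in data) / len(data)
print(len(data), zero_share)
```

The share of exact zeros in `y` illustrates the semicontinuous nature of the simulated response; the realized missing proportions depend on the seed and on the assumption made for the first visit.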

To investigate the effect of different prior information on the Bayesian estimate for unknown parameters, three types of prior information are considered as follows.

Type I: The hyperparameters $β^{0}$, $φ_{y}^{0}$, $φ_{x1}^{0}$, $φ_{x2}^{0}$, $α_{x1}^{0}$, and $α_{x2}^{0}$ are set to the true values of their corresponding parameters; $ρ_{0} = 8$, $R_{0} = 2$, $a_{τ} = 1$, $b_{τ} = 0.005$, $a_{δ} = 0.5$, and $b_{δ} = 0.5$; and $A$, $B$, $C_{x1}$, $C_{x2}$, $D_{x1}$, and $D_{x2}$ are taken to be $0.25 I_{3}$, $0.25 I_{3}$, $0.25 I_{2}$, $0.25 I_{4}$, $0.25 I_{2}$, and $0.25 I_{3}$, respectively, where $I_{d}$ denotes the $d \times d$ identity matrix. This scenario is viewed as good prior information.

Type II: The hyperparameters $β^{0}$, $φ_{y}^{0}$, $φ_{x1}^{0}$, $φ_{x2}^{0}$, $α_{x1}^{0}$, and $α_{x2}^{0}$ are set to twice the true values of their corresponding parameters; $A$, $B$, $C_{x1}$, $C_{x2}$, $D_{x1}$, and $D_{x2}$ are taken to be $0.75 I_{3}$, $0.75 I_{3}$, $0.75 I_{2}$, $0.75 I_{4}$, while the other hyperparameters are the same as those given in Type I. This scenario is viewed as inaccurate prior information.

Type III: The hyperparameters $β^{0}$, $φ_{y}^{0}$, $φ_{x1}^{0}$, $φ_{x2}^{0}$, $α_{x1}^{0}$, and $α_{x2}^{0}$ are each set to the zero vector; $A$, $B$, $C_{x1}$, $C_{x2}$, $D_{x1}$, and $D_{x2}$ are taken to be $100 I_{3}$, $100 I_{3}$, $100 I_{2}$, $100 I_{4}$, while the other hyperparameters are the same as those given in Type I. This scenario is viewed as noninformative prior information.

For each of the 50 generated datasets, the hybrid algorithm combining the block Gibbs sampler and the Metropolis–Hastings algorithm is used to produce the joint Bayesian estimates of the unknown parameters, random effects, and nonparametric function. To ensure convergence of the hybrid algorithm in each replication, we discarded the first 5000 iterations as burn-in and collected the subsequent 5000 observations to calculate the Bayesian estimates, which are reported in Table 3, where “Bias” is the difference between the mean of the estimates over the 50 replications and the true value, “SD” is the standard deviation of the estimates over the 50 replications, and “RMS” is the root mean square error of the estimates over the 50 replications relative to the true value. It can be seen from Table 3 that (i) the Bayesian estimates of the unknown parameters are reasonably accurate under all three priors because all absolute Bias values are less than $0.1$, and (ii) the SD and RMS values are all less than $0.5$ and differ little from each other under every prior. Thus, the Bayesian estimates are not sensitive to the three priors considered. In addition, Figure 2 indicates that the proposed Bayesian P-spline approximation of the nonparametric function is feasible because the estimated curves of the nonparametric function $g (t)$ match the true curve well in the simulation studies considered.
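The three summary measures can be computed from the replicate estimates of a single parameter as in the following minimal Python sketch. The function name `summarize` and the example values are hypothetical, and the divisors (R - 1 for SD, R for RMS) are assumptions rather than something stated in the text.

```python
import math


def summarize(estimates, true_value):
    """Bias, SD and RMS of replicate estimates of one scalar parameter."""
    r = len(estimates)
    mean_est = sum(estimates) / r
    bias = mean_est - true_value
    # Sample standard deviation across replications (divisor r - 1).
    sd = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / (r - 1))
    # Root mean square deviation from the true value (divisor r).
    rms = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / r)
    return bias, sd, rms


# Hypothetical replicate estimates of a parameter whose true value is 1.0.
est = [0.9, 1.1, 1.05, 0.95, 1.2]
bias, sd, rms = summarize(est, 1.0)
```

With these divisors the identity RMS² = Bias² + (R − 1)/R · SD² holds. For R = 50 it gives, for example, √(0.0007² + 0.98 × 0.0795²) ≈ 0.0787 for $β_{1}$ under the Type I prior, which agrees with the RMS value reported in Table 3.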

In the second simulation study, the setup is the same as in the first simulation study except for the missing-data mechanism. Here, an alternative nonignorable missing-data mechanism model for $y_{i j}$, $x_{i j 1}$, and $x_{i j 2}$ is given by

\begin{matrix} logit \{\Pr (r_{y i j} = 1 | y_{i j}, x_{i j 1}, x_{i j 2}, φ_{y})\} = φ_{y 0} + φ_{y 1} y_{i j} + φ_{y 2} x_{i j 1} + φ_{y 3} x_{i j 2}, \\ logit \{\Pr (r_{x i j 1} = 1 | x_{i j 1}, y_{i j}, φ_{x 1})\} = φ_{x 10} + φ_{x 11} x_{i j 1} + φ_{x 12} y_{i j}, \\ logit \{\Pr (r_{x i j 2} = 1 | x_{i j 1}, x_{i j 2}, y_{i j}, φ_{x 2})\} = φ_{x 20} + φ_{x 21} x_{i j 1} + φ_{x 22} x_{i j 2} + φ_{x 23} y_{i j} \end{matrix},

(19)

where the true values of the unknown parameters are taken to be $φ_{y} = {(φ_{y 0}, φ_{y 1}, φ_{y 2}, φ_{y 3})}^{T} = {(- 2.2, 0.1, 0.1, 0.1)}^{T}$, $φ_{x 1} = {(φ_{x 10}, φ_{x 11}, φ_{x 12})}^{T} = {(- 2.5, 0.2, 0.1)}^{T}$, and $φ_{x 2} = {(φ_{x 20}, φ_{x 21}, φ_{x 22}, φ_{x 23})}^{T} = {(- 1.9, 0.05, 0.05, - 0.3)}^{T}$. The average proportions of missing data for $x_{i j 1}$, $x_{i j 2}$, and $y_{i j}$ are $10\%$, $11\%$, and $8.5\%$, respectively. As in the first simulation study, we considered three different priors for the corresponding hyperparameters. The results in Table 4 and Figure 3 show that (i) all absolute Bias values of the unknown parameters are less than $0.1$, except that the Bias value for $φ_{x 10}$ is $- 0.1013$ under the Type II prior and the Bias values for $φ_{y 0}$ and $φ_{x 23}$ are $- 0.1102$ and $- 0.1029$ under the Type III prior, respectively, and (ii) the estimated curves of the nonparametric function $g (t)$ also match the true curve well under all three priors. The Bayesian estimates under the first two priors are better than those obtained under the Type III prior. Overall, our proposed Bayesian approach is feasible under the missing-data mechanism considered.
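To make the missing-data mechanism of the second simulation study concrete, the following minimal Python sketch evaluates the three logistic probabilities at the stated true parameter values and draws the corresponding indicators. The function names and the input values used below are hypothetical. The sketch also assumes that an indicator equal to 1 marks a missing value, which is consistent with the negative intercepts and the reported missing proportions of roughly 10%.

```python
import math
import random

# True parameter values of the second simulation study.
PHI_Y = (-2.2, 0.1, 0.1, 0.1)
PHI_X1 = (-2.5, 0.2, 0.1)
PHI_X2 = (-1.9, 0.05, 0.05, -0.3)


def expit(v):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-v))


def missing_probabilities(y, x1, x2):
    """Pr(r = 1) for y_ij, x_ij1 and x_ij2 under the three logistic models."""
    p_y = expit(PHI_Y[0] + PHI_Y[1] * y + PHI_Y[2] * x1 + PHI_Y[3] * x2)
    p_x1 = expit(PHI_X1[0] + PHI_X1[1] * x1 + PHI_X1[2] * y)
    p_x2 = expit(PHI_X2[0] + PHI_X2[1] * x1 + PHI_X2[2] * x2 + PHI_X2[3] * y)
    return p_y, p_x1, p_x2


def draw_indicators(y, x1, x2, rng):
    """Draw (r_y, r_x1, r_x2); 1 is assumed to mean 'missing'."""
    return tuple(int(rng.random() < prob)
                 for prob in missing_probabilities(y, x1, x2))


# Hypothetical observation, for illustration only.
r_example = draw_indicators(1.0, 0.5, -0.5, random.Random(0))
```

Because each probability depends on the possibly unobserved value itself (for example, $y_{i j}$ enters the model for $r_{y i j}$), the mechanism is nonignorable.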

5.2. Real Example

In this section, our proposed semiparametric Bayesian approach is illustrated by an analysis of the longitudinal semicontinuous data from the OAI, which was discussed in Section 2. The OAI longitudinal data have been analyzed by various approaches, such as those of Chen and Wehrly [25,26]. However, these authors considered only the completely observed data, reducing the 4796 patients to 1499 patients, and assumed that the log transformation of the WOMAC score plus 1 approximately follows the normal distribution. In this study, our scientific interest is to link covariates such as age, sex, and BMI with the WOMAC score outcome, which has a point mass at zero, while accounting for nonignorable missing responses and covariates. In addition, we treated age as the individual-level covariate modeled nonparametrically, with the other covariates modeled parametrically. Let the outcome $Y_{i j}$ represent the WOMAC numeric score for the right knee of the $i$th ($i = 1, 2, \dots, 4796$) patient recorded at the $j$th time point ($j = 1, 2, \dots, 5$, corresponding to 0, 12, 24, 36, and 48 months). As discussed in Section 2, we regarded the WOMAC numeric score as a longitudinal semicontinuous outcome in this real example.

Here, given the random effect $b_{i}$, $Y_{i j} | b_{i}$ follows the Tweedie compound Poisson distribution; that is, $Y_{i j} | b_{i} \sim {Tw}_{p} (μ_{i j}, ϕ)$. The conditional mean $μ_{i j}$ is simultaneously linked to the covariates, random effect, and nonparametric function as follows:

log (μ_{i j}) = β_{0} + β_{1} {BMI}_{i j} + β_{2} {SEX}_{i j} + b_{i} + g ({AGE}_{i j}),

where the covariates ${SEX}_{i j}$ (1 for male, 2 for female) and ${AGE}_{i j}$ are completely observed, while the outcome $Y_{i j}$ and the covariate ${BMI}_{i j}$ are subject to missingness, with missing rates of $8.5\%$ and $13.5\%$, respectively. Furthermore, we consider the following missing-data mechanisms for the covariate ${BMI}_{i j}$ and the outcome $Y_{i j}$:

\begin{matrix} logit \{\Pr (r_{{BMI}_{i j}} = 1 | {BMI}_{i j}, φ_{B M I})\} = φ_{B M I 0} + φ_{B M I 1} {BMI}_{i j}, \\ logit \{\Pr (r_{Y_{i j}} = 1 | Y_{i j}, Y_{i, j - 1}, φ_{Y})\} = φ_{Y 0} + φ_{Y 1} Y_{i j} + φ_{Y 2} Y_{i, j - 1}, \end{matrix}

where $φ_{BMI} = {(φ_{BMI 0}, φ_{BMI 1})}^{T}$ and $φ_{Y} = {(φ_{Y 0}, φ_{Y 1}, φ_{Y 2})}^{T}$. In addition, we assume that the covariate ${BMI}_{i j}$ follows the normal distribution $N (α_{10} + α_{11} {SEX}_{i j}, σ_{BMI}^{2})$ and that the random effect $b_{i}$ follows the normal distribution $N (0, Σ)$. Bayesian estimates of the unknown parameters and their standard errors, as well as the estimated nonparametric function, are displayed in Table 5 and Figure 4. Table 5 indicates that the covariates BMI and Sex have significant positive effects on the WOMAC score at the $0.05$ significance level. The WOMAC score increases as BMI increases: the higher a patient's BMI, the greater the intensity of knee osteoarthritis the patient suffers. The significant positive effect of Sex indicates that the average WOMAC score is higher for females than for males; that is, women tend to experience greater intensity of knee osteoarthritis than men. Chen and Wehrly [25,26] assumed a parametric linear age effect on the WOMAC score, but found this effect to be insignificant. Figure 4 shows that the Bayesian estimate of the nonparametric function $g ({AGE}_{i j})$ based on the P-spline method has a marked nonlinear trend. Specifically, there is a sharp decrease from age 45 to approximately age 49 and again from age 60 to approximately age 73, after which the curve appears to stabilize. In the missing-data mechanism models, the Bayesian estimates of $φ_{BMI 1}$ and $φ_{Y 2}$ deviate significantly from zero. Thus, it is reasonable to incorporate the missing-data mechanisms into our proposed semiparametric Bayesian model in the analysis of the OAI dataset because the missing-data mechanisms for $Y_{i j}$ and ${BMI}_{i j}$ are nonignorable.
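For readers who want to experiment with the outcome model, the following minimal Python sketch draws from a Tweedie compound Poisson distribution with $1 < p < 2$ through its standard representation as a Poisson sum of gamma variables, and evaluates the log-linear conditional mean used in this example. All function names and numerical values are hypothetical, the nonparametric term is passed in as an already evaluated number `g_age`, and the simple Poisson generator is only meant for small rates.

```python
import math
import random


def cp_parameters(mu, phi, p):
    """Poisson rate, gamma shape and gamma scale implied by Tw_p(mu, phi)."""
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))
    shape = (2.0 - p) / (p - 1.0)
    scale = phi * (p - 1.0) * mu ** (p - 1.0)
    return lam, shape, scale


def rpoisson(lam, rng):
    """Knuth's multiplication method; adequate for small rates only."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def rtweedie(mu, phi, p, rng):
    """One draw: a sum of N gamma variables with N Poisson; 0.0 when N = 0."""
    lam, shape, scale = cp_parameters(mu, phi, p)
    n = rpoisson(lam, rng)
    return math.fsum(rng.gammavariate(shape, scale) for _ in range(n))


def conditional_mean(beta, bmi, sex, b_i, g_age):
    """mu_ij = exp(beta0 + beta1 * BMI + beta2 * SEX + b_i + g(AGE))."""
    return math.exp(beta[0] + beta[1] * bmi + beta[2] * sex + b_i + g_age)


rng = random.Random(1)
draws = [rtweedie(2.0, 1.0, 1.5, rng) for _ in range(20000)]
```

The exact zeros produced when the Poisson count is zero are what gives the distribution its point mass at zero, with probability $exp (- μ^{2 - p} / (ϕ (2 - p)))$.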

6. Conclusions

In this paper, we have introduced a new Tweedie compound Poisson partial linear mixed model with nonignorable missing covariates and responses, in which the random effects follow a multivariate normal distribution and the nonparametric function is modelled by Bayesian P-splines. Logistic regression models are used to specify the missing response and missing covariate mechanisms. This article makes the following contributions: (i) our proposed Bayesian semiparametric mixed effects model can model both the zero and positive components of longitudinal semicontinuous data in an integral way while accounting for nonignorable missing responses and covariates simultaneously; (ii) our proposed partial linear mixed model based on Bayesian P-splines can characterize the nonlinear smooth effects of covariates in the analysis of longitudinal semicontinuous data; (iii) the conditional distributions required by the Gibbs sampler and the Metropolis–Hastings algorithm for the proposed model are derived; and (iv) two simulation studies and a real example are used to illustrate the effectiveness and feasibility of the proposed approach under the several missing-data mechanisms considered.

Author Contributions

Conceptualization, Z.W. and X.D.; methodology, X.D. and Z.W.; software, Z.W., X.D. and W.Z.; validation, Z.W., X.D. and W.Z.; formal analysis, Z.W. and X.D.; investigation, Z.W., X.D. and W.Z.; preparation of the original work draft, X.D. and Z.W.; visualization, Z.W. and W.Z.; supervision, funding acquisition, X.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (12161014), the National Statistical Science Research Project (2021LY011), the Guizhou Provincial Science and Technology Project ([2020]1Y009), and the Innovative Exploration and New Academic Seedling Project of Guizhou University of Finance and Economics (No. 2022XSXMA18).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The research data are available on the website: https://www.oai.ucsf.edu, accessed on 4 April 2017.

Acknowledgments

We are grateful to Zhixian Yang for careful English editing during the preparation of the revision.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Conditional Distributions for the Gibbs Sampling Algorithm

First, the conditional distribution of the missing part used in the Gibbs sampling algorithm is as follows.

(1): The logarithmic joint conditional distribution of ( $y_{m i}, u_{i})$ is

$\begin{matrix} log p (y_{m i}, u_{i} | y_{o i}, x_{i}, z_{i}, t_{i}, b_{i}, Σ, β, p, ϕ, ξ, r_{y i}, φ_{y}) \\ \propto log p (y_{i}, u_{i} | x_{i}, z_{i}, t_{i}, b_{i}, θ) + log p (r_{y i} | y_{i}, φ_{y}) \\ = log \prod_{j = 1}^{n_{i}} p (y_{i j}, u_{i j} | x_{i j}, z_{i j}, t_{i j}, b_{i}, θ) + log \prod_{j = 1}^{n_{i}} p (r_{y i j} | y_{i}, φ_{y}) \\ = \sum_{j = 1}^{n_{i}} log p (y_{i j}, u_{i j} | x_{i j}, z_{i j}, t_{i j}, b_{i}, θ) + \sum_{j = 1}^{n_{i}} log p (r_{y i j} | y_{i}, φ_{y}) \\ = - \frac{1}{ϕ} \sum_{j = 1}^{n_{i}} \frac{μ_{i j}^{2 - p}}{2 - p} - \sum_{j = 1}^{n_{i}} (\frac{y_{i j}}{ϕ (p - 1) μ_{i j}^{p - 1}} + log Γ (u_{i j} \frac{2 - p}{p - 1}) + log u_{i j}! + log y_{i j}) I \{y_{i j} > 0\} \\ + \sum_{j = 1}^{n_{i}} u_{i j} (\frac{2 - p}{p - 1} log \frac{y_{i j}}{p - 1} - \frac{log ϕ}{p - 1} - log (2 - p)) I \{y_{i j} > 0\} \\ + \sum_{j = 1}^{n_{i}} log \{{(\frac{exp (u_{y i j})}{1 + exp (u_{y i j})})}^{r_{y i j}} {(1 - \frac{exp (u_{y i j})}{1 + exp (u_{y i j})})}^{1 - r_{y i j}}\} \\ \propto - \sum_{j = 1}^{n_{i}} (\frac{y_{i j}}{ϕ (p - 1) μ_{i j}^{p - 1}} + log y_{i j}) I \{y_{i j} > 0\} + \sum_{j = 1}^{n_{i}} u_{i j} (\frac{2 - p}{p - 1} log \frac{y_{i j}}{p - 1}) I \{y_{i j} > 0\} \\ + \sum_{j = 1}^{n_{i}} \{r_{y i j} u_{y i j} - log [1 + exp (u_{y i j})]\}, \end{matrix}$

where $u_{y i j} = φ_{y 0} + φ_{y 1} y_{i j} + φ_{y 2} y_{i, j - 1}$ or $u_{y i j} = φ_{y 0} + φ_{y 1} y_{i j} + φ_{y 2} x_{i j 1} + φ_{y 3} x_{i j 2} + \dots + φ_{y, k + 1} x_{i j k}$ (where $x_{i j 1}, x_{i j 2}, \dots, x_{i j k}$ are all covariates subject to missingness), $x_{i} = \{x_{i 1}, \dots, x_{i n_{i}}\}$ , $z_{i} = \{z_{i 1}, \dots, z_{i n_{i}}\}$ , $t_{i} = \{t_{i 1}, \dots, t_{i n_{i}}\}$ and $u_{i} = \{u_{i 1}, \dots, u_{i n_{i}}\}$ .

(2): The logarithmic conditional distribution of $x_{m i}$ is

$\begin{matrix} log p (x_{m i} | x_{o i}, u_{i}, y_{i}, z_{i}, t_{i}, b_{i}, Σ, β, p, ϕ, ξ, α, φ_{x}) \\ \propto log p (y_{i}, u_{i} | x_{i}, z_{i}, t_{i}, b_{i}, θ) + log p (r_{x i} | x_{i}, φ_{x}) + log p (x_{m i} | α) \\ = log \prod_{j = 1}^{n_{i}} p (y_{i j}, u_{i j} | x_{i j}, z_{i j}, t_{i j}, b_{i}, θ) + log \prod_{k = 1}^{q} p (r_{x i j k} | x_{i}, φ_{x}) + log p (x_{m i} | α) \\ = \sum_{j = 1}^{n_{i}} log p (y_{i j}, u_{i j} | x_{i j}, z_{i j}, t_{i j}, b_{i}, θ) + \sum_{k = 1}^{q} log p (r_{x i j k} | x_{i}, φ_{x}) + log p (x_{m i} | α) \\ = - \frac{1}{ϕ} \sum_{j = 1}^{n_{i}} \frac{μ_{i j}^{2 - p}}{2 - p} - \sum_{j = 1}^{n_{i}} (\frac{y_{i j}}{ϕ (p - 1) μ_{i j}^{p - 1}} + log Γ (u_{i j} \frac{2 - p}{p - 1}) + log u_{i j}! + log y_{i j}) I \{y_{i j} > 0\} \\ + \sum_{j = 1}^{n_{i}} u_{i j} (\frac{2 - p}{p - 1} log \frac{y_{i j}}{p - 1} - \frac{log ϕ}{p - 1} - log (2 - p)) I \{y_{i j} > 0\} \\ + \sum_{k = 1}^{q} log \{{(\frac{exp (v_{x i j k})}{1 + exp (v_{x i j k})})}^{r_{x i j k}} {(1 - \frac{exp (v_{x i j k})}{1 + exp (v_{x i j k})})}^{1 - r_{x i j k}}\} + log p (x_{m i} | α) \\ \propto - \frac{1}{ϕ} \sum_{j = 1}^{n_{i}} \frac{μ_{i j}^{2 - p}}{2 - p} - \sum_{j = 1}^{n_{i}} (\frac{y_{i j}}{ϕ (p - 1) μ_{i j}^{p - 1}}) I \{y_{i j} > 0\} \\ + \sum_{k = 1}^{q} \{r_{x i j k} v_{x i j k} - log [1 + exp (v_{x i j k})]\} + log p (x_{m i} | α) \end{matrix},$

where $v_{x i j k} = φ_{x k 0} + φ_{x k 1} x_{i j 1} + \dots + φ_{x k k} x_{i j k} + φ_{x k, k + 1} r_{x i j 1} + \dots + φ_{x k, 2 k - 1} r_{x i j, k - 1}$ or $v_{x i j k} = φ_{x k 0} + φ_{x k 1} x_{i j 1} + \dots + φ_{x k k} x_{i j k} + φ_{x k, k + 1} y_{i j}$ , $p (x_{m i} | α)$ is specified by (12) and (13).
(3): The conditional distribution of $φ_{y}$ is

$\begin{matrix} p (φ_{y} | Y, r_{y}) \propto p (r_{y} | Y, φ_{y}) p (φ_{y}) \\ = \prod_{i = 1}^{n} \prod_{j = 1}^{n_{i}} p (r_{y i j} | y_{i}, φ_{y}) p (φ_{y}) \\ \propto exp \{\sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} (r_{y i j} u_{y i j} - log (1 + exp (u_{y i j}))) - \frac{1}{2} {(φ_{y} - φ_{y}^{0})}^{T} B^{- 1} (φ_{y} - φ_{y}^{0})\} \end{matrix},$

where $u_{y i j} = φ_{y 0} + φ_{y 1} y_{i j} + φ_{y 2} y_{i, j - 1}$ or $u_{y i j} = φ_{y 0} + φ_{y 1} y_{i j} + φ_{y 2} x_{i j 1} + φ_{y 3} x_{i j 2} + \dots + φ_{y, k + 1} x_{i j k}$ (where $x_{i j 1}, x_{i j 2}, \dots, x_{i j k}$ are all covariates subject to missingness).
(4): The conditional distribution of $φ_{x k}$ is

$\begin{matrix} p (φ_{x k} | X, r_{x}) \\ \propto exp \{\sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} (r_{x i j k} v_{x i j k} - log (1 + exp (v_{x i j k}))) - \frac{1}{2} {(φ_{x k} - φ_{x k}^{0})}^{T} C_{x k}^{- 1} (φ_{x k} - φ_{x k}^{0})\} \end{matrix},$

where $v_{x i j k} = φ_{x k 0} + φ_{x k 1} x_{i j 1} + \dots + φ_{x k k} x_{i j k} + φ_{x k, k + 1} r_{x i j 1} + \dots + φ_{x k, 2 k - 1} r_{x i j, k - 1}$ or $v_{x i j k} = φ_{x k 0} + φ_{x k 1} x_{i j 1} + \dots + φ_{x k k} x_{i j k} + φ_{x k, k + 1} y_{i j}$ .
(5): The conditional distribution of $α$ is

$\begin{matrix} p (α | X, r_{x}) \propto p (X | α, r_{x}) p (α) \\ \propto \prod_{i = 1}^{n} \prod_{j = 1}^{n_{i}} p {(x_{i j q} | x_{i j 1}, \dots, x_{i j, q - 1}, α_{q})}^{r_{x i j q}} \dots p {(x_{i j 2} | x_{i j 1}, α_{2})}^{r_{x i j 2}} \times p {(x_{i j 1} | α_{1})}^{r_{x i j 1}} p (α) \\ = \prod_{i = 1}^{n} \prod_{j = 1}^{n_{i}} \prod_{k = 1}^{q} p {(x_{i j k} | x_{i j 1}, \dots, x_{i j, k - 1}, α_{k})}^{r_{x i j k}} p (α) \\ \propto \prod_{i = 1}^{n} \prod_{j = 1}^{n_{i}} \prod_{k = 1}^{q} p {(x_{i j k} | x_{i j 1}, \dots, x_{i j, k - 1}, α_{k})}^{r_{x i j k}} exp \{- \frac{1}{2} {(α - α^{0})}^{T} D^{- 1} (α - α^{0})\} \end{matrix} .$
(6): According to (13), we know $x_{i j 1} \sim N (μ_{i j 1}, σ_{x 1}^{2})$ , and $σ_{x 1}^{2} \sim I G (a_{x 1}, b_{x 1})$ . Then the conditional distribution of $σ_{x 1}^{2}$ is

$\begin{matrix} p (σ_{x 1}^{2} | x_{i j 1}, α_{1}) \\ \propto {(σ_{x 1}^{2})}^{- a_{x 1} - 1} exp (- \frac{b_{x 1}}{σ_{x 1}^{2}}) \prod_{i = 1}^{n} \prod_{j = 1}^{n_{i}} \frac{1}{\sqrt{2 π σ_{x 1}^{2}}} exp \{- \frac{{(x_{i j 1} - μ_{i j 1})}^{2}}{2 σ_{x 1}^{2}}\} \\ \propto {(σ_{x 1}^{2})}^{- a_{x 1} - \frac{n n_{i}}{2} - 1} exp \{- \frac{1}{σ_{x 1}^{2}} [b_{x 1} + \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} {(x_{i j 1} - μ_{i j 1})}^{2}]\} \end{matrix} .$

Clearly, $σ_{x 1}^{2} | x_{i j 1}, α_{1} \sim I G (a_{x 1} + \frac{n n_{i}}{2}, b_{x 1} + \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} {(x_{i j 1} - μ_{i j 1})}^{2})$ , where $I G$ is the inverse gamma distribution. In addition, the conditional distributions of $σ_{x 2}^{2}, σ_{x 3}^{2}, \dots, σ_{x m}^{2}$ have the same form as that of $σ_{x 1}^{2}$ .
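The inverse gamma update in item (6) is the only step in this list that can be sampled directly. The following minimal Python sketch shows one way to do so with the standard library. The function names are hypothetical, and `x` and `mu` are assumed to be flat lists holding all $x_{i j 1}$ and $μ_{i j 1}$, so that `len(x)` plays the role of $n n_{i}$ in a balanced design.

```python
import random


def ig_posterior(a, b, x, mu):
    """Shape and scale of the inverse gamma full conditional of the variance."""
    n_obs = len(x)
    ss = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return a + n_obs / 2.0, b + ss / 2.0


def draw_inverse_gamma(shape, scale, rng):
    """If G ~ Gamma(shape, rate = scale), then 1 / G ~ IG(shape, scale)."""
    # random.gammavariate takes (shape, scale), so the rate is passed as 1 / rate.
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)


# Hypothetical data and hyperparameters, for illustration only.
shape_post, scale_post = ig_posterior(1.0, 1.0, [1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
sigma2_draw = draw_inverse_gamma(shape_post, scale_post, random.Random(0))
```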

The conditional distribution of the nonparametric part used in the Gibbs sampling algorithm is as follows.

(1): The logarithmic conditional distribution of $ξ$ is

$\begin{matrix} log p (ξ | U, Y, X, T, β, p, ϕ, b, Σ) \\ \propto - \frac{1}{ϕ} \sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} \frac{μ_{i j}^{2 - p}}{2 - p} - \sum_{y_{i j} > 0} (\frac{y_{i j}}{ϕ (p - 1) μ_{i j}^{p - 1}}) - \frac{1}{2 τ_{ξ}^{2}} ξ^{T} Q ξ \end{matrix} .$
(2): The conditional distribution of $τ_{ξ}^{2}$ is

$\begin{matrix} p (τ_{ξ}^{2} | ξ, δ) \propto {(τ_{ξ}^{2})}^{- \frac{r a n k (Q)}{2} - a_{τ} - 1} exp \{- \frac{1}{2 τ_{ξ}^{2}} ξ^{T} Q ξ - \frac{b_{τ}}{τ_{ξ}^{2}}\} \end{matrix} .$

Clearly, $τ_{ξ}^{2} | ξ, δ \sim I G (\frac{r a n k (Q)}{2} + a_{τ}, \frac{1}{2} ξ^{T} Q ξ + b_{τ})$ , where $I G$ is the inverse gamma distribution.
(3): The conditional distribution of $δ_{ρ}$ is

$\begin{matrix} p (δ_{ρ} | ξ, τ_{ξ}^{2}) \propto {(δ_{ρ})}^{a_{δ} - 1 / 2} exp \{- δ_{ρ} (\frac{{(ξ_{ρ} - ξ_{ρ - 1})}^{2}}{2 τ_{ξ}^{2}} + b_{δ})\} \end{matrix} .$

Clearly, $δ_{ρ} | ξ, τ_{ξ}^{2} \sim Γ (a_{δ} + \frac{1}{2}, \frac{{(ξ_{ρ} - ξ_{ρ - 1})}^{2}}{2 τ_{ξ}^{2}} + b_{δ})$ , where $Γ$ is the gamma distribution with shape and rate parameters.
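Items (2) and (3) are conjugate updates and can be drawn directly. The following minimal Python sketch illustrates both, under the assumption that the penalty is a first-order difference penalty with local weights, so that $ξ^{T} Q ξ = \sum_{ρ = 2}^{H} δ_{ρ} {(ξ_{ρ} - ξ_{ρ - 1})}^{2}$ and $rank (Q) = H - 1$. This form of $Q$, the function names, and the numerical values are assumptions made for illustration.

```python
import random


def quad_form(xi, delta):
    """xi' Q xi for a first-order difference penalty with weights delta."""
    return sum(d * (xi[k + 1] - xi[k]) ** 2 for k, d in enumerate(delta))


def draw_tau2(xi, delta, a_tau, b_tau, rng):
    """Draw the global smoothing parameter from its inverse gamma conditional."""
    shape = (len(xi) - 1) / 2.0 + a_tau  # rank(Q) = H - 1 (assumed)
    scale = 0.5 * quad_form(xi, delta) + b_tau
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)


def draw_delta(xi, tau2, a_delta, b_delta, rng):
    """Draw each local weight from its gamma conditional (shape, rate)."""
    out = []
    for k in range(len(xi) - 1):
        rate = (xi[k + 1] - xi[k]) ** 2 / (2.0 * tau2) + b_delta
        out.append(rng.gammavariate(a_delta + 0.5, 1.0 / rate))
    return out


# Hypothetical spline coefficients and hyperparameters.
rng = random.Random(0)
xi_example = [0.0, 1.0, 3.0]
delta_example = [1.0, 2.0]
tau2_draw = draw_tau2(xi_example, delta_example, 1.0, 0.005, rng)
delta_draw = draw_delta(xi_example, tau2_draw, 0.5, 0.5, rng)
```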

Finally, the conditional distributions of other parameters used in the Gibbs sampling are as follows.

(1): The logarithmic conditional distribution of $β$ is

$\begin{matrix} log p (β | U, Y, X, T, ϕ, p, b, Σ, ξ) \\ \propto - \frac{1}{ϕ} \sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} \frac{μ_{i j}^{2 - p}}{2 - p} - \sum_{y_{i j} > 0} (\frac{y_{i j}}{ϕ (p - 1) μ_{i j}^{p - 1}}) - \frac{1}{2} {(β - β^{0})}^{T} A^{- 1} (β - β^{0}) \end{matrix} .$
(2): The logarithmic conditional distribution of p is

$\begin{matrix} log p (p | U, Y, X, T, β, ϕ, b, Σ, ξ) \\ \propto - \frac{1}{ϕ} \sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} \frac{μ_{i j}^{2 - p}}{2 - p} - \sum_{y_{i j} > 0} (\frac{y_{i j}}{ϕ (p - 1) μ_{i j}^{p - 1}} + log Γ (u_{i j} \frac{2 - p}{p - 1})) \\ + \sum_{y_{i j} > 0} u_{i j} (\frac{2 - p}{p - 1} log \frac{y_{i j}}{p - 1} - \frac{log ϕ}{p - 1} - log (2 - p)) - \frac{{(log \frac{p - 1}{2 - p})}^{2}}{2 \times 10{,}000} \end{matrix} .$
(3): The logarithmic conditional distribution of $ϕ$ is

$\begin{matrix} log p (ϕ | U, Y, X, T, β, p, b, Σ, ξ) \\ \propto - \frac{1}{ϕ} \sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} \frac{μ_{i j}^{2 - p}}{2 - p} - \sum_{y_{i j} > 0} \frac{y_{i j}}{ϕ (p - 1) μ_{i j}^{p - 1}} + \sum_{y_{i j} > 0} u_{i j} (- \frac{log ϕ}{p - 1}) - \frac{{(log ϕ)}^{2}}{2 \times 10{,}000} \end{matrix} .$
(4): The logarithmic conditional distribution of $b$ is

$\begin{matrix} log p (b | U, Y, X, T, β, p, ϕ, ξ, Σ) \\ \propto - \frac{1}{ϕ} \sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} \frac{μ_{i j}^{2 - p}}{2 - p} - \sum_{y_{i j} > 0} (\frac{y_{i j}}{ϕ (p - 1) μ_{i j}^{p - 1}}) - \sum_{i = 1}^{n} \frac{1}{2} b_{i}^{T} Σ^{- 1} b_{i} \end{matrix} .$

Thus, $log p (b_{i} | U, Y, X, T, β, p, ϕ, ξ, Σ)$ is proportional to

$- \frac{1}{ϕ} \sum_{j = 1}^{n_{i}} \frac{μ_{i j}^{2 - p}}{2 - p} - \sum_{j = 1}^{n_{i}} \frac{y_{i j}}{ϕ (p - 1) μ_{i j}^{p - 1}} I \{y_{i j} > 0\} - \frac{1}{2} b_{i}^{T} Σ^{- 1} b_{i} .$
(5): The conditional distribution of $Σ$ is

$p (Σ | U, Y, X, T, β, p, ϕ, b, ξ) \propto {|Σ|}^{\frac{- (ρ_{0} + n + d + 1)}{2}} exp \{- \frac{1}{2} t r [Σ^{- 1} (\sum_{i = 1}^{n} b_{i} b_{i}^{T} + R_{0})]\} .$

Clearly, $Σ | U, Y, X, T, β, p, ϕ, b, ξ \sim I W_{d} (n + ρ_{0}, R_{0} + \sum_{i = 1}^{n} b_{i} b_{i}^{T})$ , where $I W_{k} (\cdot, \cdot)$ is the k-dimensional inverse Wishart distribution.
(6): The logarithmic conditional distribution of $U$ is

$\begin{matrix} log p (U | Y, ϕ, p) \propto - \sum_{y_{i j} > 0} (log Γ (u_{i j} \frac{2 - p}{p - 1}) + log u_{i j}! + log y_{i j}) \\ + \sum_{y_{i j} > 0} u_{i j} (\frac{2 - p}{p - 1} log \frac{y_{i j}}{p - 1} - \frac{log ϕ}{p - 1} - log (2 - p)), \end{matrix}$

thus,

$\begin{matrix} log p (u_{i j} | Y, ϕ, p) \propto - (log Γ (u_{i j} \frac{2 - p}{p - 1}) + log u_{i j}! + log y_{i j}) I \{y_{i j} > 0\} \\ + u_{i j} (\frac{2 - p}{p - 1} log \frac{y_{i j}}{p - 1} - \frac{log ϕ}{p - 1} - log (2 - p)) I \{y_{i j} > 0\} . \end{matrix}$
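The expressions in this appendix all rest on the augmented log density of $(y_{i j}, u_{i j})$. As a sanity check, the following minimal Python sketch evaluates that log density in the form written above and, separately, as the log of a Poisson probability times a gamma density, and then normalizes the conditional distribution of $u_{i j}$ given $y_{i j} > 0$ from item (6). The function names, the truncation point `u_max`, and the numerical values are hypothetical choices made for illustration.

```python
import math


def log_joint_paper(y, u, mu, phi, p):
    """Augmented log density of (y, u) in the form used in this appendix."""
    out = -mu ** (2.0 - p) / (phi * (2.0 - p))
    if y > 0:
        a = (2.0 - p) / (p - 1.0)
        out -= y / (phi * (p - 1.0) * mu ** (p - 1.0))
        out -= math.lgamma(u * a) + math.lgamma(u + 1.0) + math.log(y)
        out += u * (a * math.log(y / (p - 1.0))
                    - math.log(phi) / (p - 1.0) - math.log(2.0 - p))
    return out


def log_joint_direct(y, u, mu, phi, p):
    """Same quantity for y > 0: log Poisson(u; lam) + log Gamma(y; u * a, scale)."""
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))
    a = (2.0 - p) / (p - 1.0)
    scale = phi * (p - 1.0) * mu ** (p - 1.0)
    log_pois = -lam + u * math.log(lam) - math.lgamma(u + 1.0)
    k = u * a
    log_gam = (k - 1.0) * math.log(y) - y / scale - math.lgamma(k) - k * math.log(scale)
    return log_pois + log_gam


def u_conditional(y, phi, p, u_max=200):
    """Normalized pmf of u = 1, ..., u_max given y > 0 (truncated at u_max)."""
    a = (2.0 - p) / (p - 1.0)
    c = a * math.log(y / (p - 1.0)) - math.log(phi) / (p - 1.0) - math.log(2.0 - p)
    logw = [u * c - math.lgamma(u * a) - math.lgamma(u + 1.0)
            for u in range(1, u_max + 1)]
    m = max(logw)
    w = [math.exp(v - m) for v in logw]
    total = sum(w)
    return [v / total for v in w]


pmf = u_conditional(1.3, 1.0, 1.5)
```

Because the two forms of the log density agree, the conditional distribution of $u_{i j}$ does not depend on $μ_{i j}$, which is why only $Y$, $ϕ$, and $p$ appear in the conditioning set. When $y_{i j} = 0$, $u_{i j}$ equals zero with probability one.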

Appendix B. Metropolis–Hastings Algorithm

To implement the Metropolis–Hastings algorithm, we assume that the current iteration values of $u_{i j}$, $β$, $p$, $ϕ$, $b_{i}$, and $ξ$ are $u_{i j}^{(l)}$, $β^{(l)}$, $p^{(l)}$, $ϕ^{(l)}$, $b_{i}^{(l)}$, and $ξ^{(l)}$, and that the proposal distributions of the new random samples $u_{i j}^{*}$, $β^{*}$, $p^{*}$, $ϕ^{*}$, $b_{i}^{*}$, and $ξ^{*}$ are selected as the zero-truncated Poisson distribution $f (u_{i j}^{(l)}; σ_{u}^{2}, λ | u_{i j}^{(l)} > 0)$, the multivariate normal distribution $N (β^{(l)}, σ_{β}^{2} Ω_{β})$, the normal distribution $log \frac{p - 1}{2 - p} \sim N (p^{(l)}, σ_{p}^{2})$, the normal distribution $log ϕ \sim N (ϕ^{(l)}, σ_{ϕ}^{2})$, the normal distribution $N (b_{i}^{(l)}, σ_{b_{i}}^{2} Ω_{b_{i}})$, and the multivariate normal distribution $N (ξ^{(l)}, σ_{ξ}^{2} Ω_{ξ})$, respectively, where $λ$ denotes the mean parameter of the Poisson distribution, $N$ denotes the normal distribution, and $σ_{u}^{2}$, $σ_{β}^{2}$, $σ_{p}^{2}$, $σ_{ϕ}^{2}$, $σ_{b_{i}}^{2}$, and $σ_{ξ}^{2}$ are tuning parameters. Furthermore, $Ω_{β}$, $Ω_{b_{i}}$, and $Ω_{ξ}$ are derived as

\begin{matrix} Ω_{β} = {(\frac{2 - p}{ϕ} \sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} μ_{i j}^{2 - p} x_{i j}^{T} x_{i j} + \frac{p - 1}{ϕ} \sum_{y_{i j} > 0} y_{i j} μ_{i j}^{1 - p} x_{i j}^{T} x_{i j} + A^{- 1})}^{- 1}, \\ Ω_{b_{i}} = {(\frac{2 - p}{ϕ} \sum_{j = 1}^{n_{i}} μ_{i j}^{2 - p} + \frac{p - 1}{ϕ} \sum_{j = 1}^{n_{i}} y_{i j} μ_{i j}^{1 - p} I \{y_{i j} > 0\} + Σ^{- 1})}^{- 1}, \\ Ω_{ξ} = {(\frac{2 - p}{ϕ} \sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} μ_{i j}^{2 - p} B^{T} (t_{i j}) B (t_{i j}) + \frac{p - 1}{ϕ} \sum_{y_{i j} > 0} y_{i j} μ_{i j}^{1 - p} B^{T} (t_{i j}) B (t_{i j}) + \frac{Q}{τ_{ξ}^{2}})}^{- 1}, \end{matrix}

where $i = 1, \dots, n$ and $j = 1, \dots, n_{i}$. Finally, the acceptance probabilities for $u_{i j}$, $β$, $p$, $ϕ$, $b_{i}$, and $ξ$ used in the Metropolis–Hastings algorithm are as follows:

\begin{matrix} min \{\frac{p (u_{i j}^{*} | Y, p, ϕ) f (u_{i j}^{(l)}; σ_{u}^{2}, λ | u_{i j}^{(l)} > 0)}{p (u_{i j}^{(l)} | Y, p, ϕ) f (u_{i j}^{*}; σ_{u}^{2}, λ | u_{i j}^{*} > 0)}, 1\}, \\ min \{\frac{p (β^{*} | Y, X, Z, T, ϕ, p, b, Σ, ξ)}{p (β^{(l)} | Y, X, Z, T, ϕ, p, b, Σ, ξ)}, 1\}, \\ min \{\frac{p (p^{*} | U, Y, X, Z, T, β, ϕ, b, Σ, ξ)}{p (p^{(l)} | U, Y, X, Z, T, β, ϕ, b, Σ, ξ)}, 1\}, \\ min \{\frac{p (ϕ^{*} | U, Y, X, Z, T, β, p, b, Σ, ξ)}{p (ϕ^{(l)} | U, Y, X, Z, T, β, p, b, Σ, ξ)}, 1\}, \\ min \{\frac{p (b_{i}^{*} | U, Y, X, Z, T, β, p, ϕ, ξ, Σ)}{p (b_{i}^{(l)} | U, Y, X, Z, T, β, p, ϕ, ξ, Σ)}, 1\}, \\ min \{\frac{p (ξ^{*} | U, Y, X, Z, T, β, p, ϕ, b, Σ)}{p (ξ^{(l)} | U, Y, X, Z, T, β, p, ϕ, b, Σ)}, 1\} \end{matrix} .
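As an illustration of these updates, the following minimal Python sketch performs the Metropolis–Hastings step for a scalar random intercept $b_{i}$, using $Ω_{b_{i}}$ to scale a normal random-walk proposal. It is a sketch under stated assumptions rather than the authors' implementation: the random effect is taken to be one-dimensional with covariate 1, `eta[j]` is assumed to hold the linear predictor without $b_{i}$, and $Ω_{b_{i}}$ is evaluated once at a reference value so that the proposal stays symmetric and the ratio of target densities is the full acceptance ratio. All names and numerical values are hypothetical.

```python
import math
import random


def log_cond_b(b, eta, y, phi, p, sigma2):
    """Log full conditional of b_i up to a constant; mu_ij = exp(eta[j] + b)."""
    out = -b * b / (2.0 * sigma2)
    for e, yj in zip(eta, y):
        mu = math.exp(e + b)
        out -= mu ** (2.0 - p) / (phi * (2.0 - p))
        if yj > 0:
            out -= yj / (phi * (p - 1.0) * mu ** (p - 1.0))
    return out


def omega_b(b, eta, y, phi, p, sigma2):
    """Omega_{b_i}: inverse of the negative second derivative of log_cond_b."""
    h = 1.0 / sigma2
    for e, yj in zip(eta, y):
        mu = math.exp(e + b)
        h += (2.0 - p) / phi * mu ** (2.0 - p)
        if yj > 0:
            h += (p - 1.0) / phi * yj * mu ** (1.0 - p)
    return 1.0 / h


def mh_step_b(b, prop_sd, eta, y, phi, p, sigma2, rng):
    """One random-walk step with acceptance probability min(ratio, 1)."""
    prop = rng.gauss(b, prop_sd)
    log_ratio = (log_cond_b(prop, eta, y, phi, p, sigma2)
                 - log_cond_b(b, eta, y, phi, p, sigma2))
    if rng.random() < math.exp(min(log_ratio, 0.0)):
        return prop, True
    return b, False


# Hypothetical data for one subject and hypothetical parameter values.
eta, y = [0.1, -0.2, 0.3], [0.0, 1.2, 2.5]
phi, p, sigma2, tune = 1.0, 1.5, 0.5, 1.0
rng = random.Random(0)
prop_sd = math.sqrt(tune * omega_b(0.0, eta, y, phi, p, sigma2))
b, accepted, n_iter = 0.0, 0, 2000
for _ in range(n_iter):
    b, ok = mh_step_b(b, prop_sd, eta, y, phi, p, sigma2, rng)
    accepted += ok
rate = accepted / n_iter
```

In practice the tuning parameter (`tune` here, $σ_{b_{i}}^{2}$ in the text) would be adjusted until the acceptance rate falls in a reasonable range.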

References

1. Olsen, M.K.; Schafer, J.L. A two-part random-effects model for semicontinuous longitudinal data. J. Am. Stat. Assoc. 2001, 96, 730–745.
2. Berk, K.; Lachenbruch, P.A. Repeated measures with zeros. Stat. Methods Med. Res. 2002, 11, 303–316.
3. Tooze, J.A.; Grunwald, G.K.; Jones, R.H. Analysis of repeated measures data with clumping at zero. Stat. Methods Med. Res. 2002, 11, 341–355.
4. Su, L.; Tom, B.D.; Farewell, V.T. Bias in 2-part mixed models for longitudinal semicontinuous data. Biostatistics 2009, 10, 374–389.
5. Su, L.; Tom, B.D.; Farewell, V.T. A likelihood-based two-part marginal model for longitudinal semicontinuous data. Stat. Methods Med. Res. 2015, 24, 194–205.
6. Liu, L.; Strawderman, R.L.; Johnson, B.A.; O’Quigley, J.M. Analyzing repeated measures semi-continuous data, with application to an alcohol dependence study. Stat. Methods Med. Res. 2016, 25, 133–152.
7. Zhou, X.X.; Kang, K.; Song, X.Y. Two-part hidden Markov models for semicontinuous longitudinal data with nonignorable missing covariates. Stat. Med. 2020, 39, 1801–1816.
8. Hasan, M.T.; Yan, G.H.; Ma, R.J. Analysis of periodic patterns of daily precipitation through simultaneous modeling of its serially observed occurrence and amount. Environ. Ecol. Stat. 2014, 21, 811–824.
9. Yan, G.H.; Ma, R.J. Modelling occurrence and quantity of longitudinal semicontinuous data simultaneously with nonparametric unobserved heterogeneity. Can. J. Stat. 2023, in press.
10. Zhang, Y.W. Likelihood-based and Bayesian Methods for Tweedie Compound Poisson Linear Mixed Models. Stat. Comput. 2013, 23, 743–757.
11. Swallow, B.; Buckland, S.T.; King, R.; Toms, M.P. Bayesian hierarchical modelling of continuous non-negative longitudinal data with a spike at zero: An application to a study of birds visiting gardens in winter. Biom. J. 2016, 58, 357–371.
12. Ye, T.; Lachos, V.H.; Wang, X.J.; Dey, D.K. Comparisons of zero-augmented continuous regression models from a Bayesian perspective. Stat. Med. 2021, 40, 1073–1100.
13. Ibrahim, J.G.; Lipsitz, S.R.; Chen, M.H. Missing covariates in generalized linear models when the missing data mechanism is non-ignorable. J. R. Stat. Soc. Ser. B 1999, 61, 173–190.
14. Ibrahim, J.G.; Chen, M.H.; Lipsitz, S.R. Missing responses in generalised linear mixed models when the missing data mechanism is nonignorable. Biometrika 2001, 88, 551–564.
15. Huang, L.; Chen, M.H.; Ibrahim, J.G. Bayesian analysis for generalized linear models with nonignorably missing covariates. Biometrics 2005, 61, 767–780.
16. Lee, S.Y.; Tang, N.S. Bayesian analysis of nonlinear structural equation models with nonignorable missing data. Psychometrika 2006, 71, 541–564.
17. Tang, N.S.; Zhao, H. Bayesian analysis of nonlinear reproductive dispersion mixed models for longitudinal data with nonignorable missing covariates. Commun. Stat. Simul. Comput. 2014, 43, 1265–1287.
18. Tang, N.S.; Chow, S.M.; Ibrahim, J.G.; Zhu, H.T. Bayesian sensitivity analysis of a nonlinear dynamic factor analysis model with nonparametric prior and possible nonignorable missingness. Psychometrika 2017, 82, 875–903.
19. Wang, Z.Q.; Tang, N.S. Bayesian quantile regression with mixed discrete and nonignorable missing covariates. Bayesian Anal. 2020, 15, 579–604.
20. Wang, X.Q.; Song, X.Y.; Zhu, H.T. Bayesian latent factor on image regression with nonignorable missing data. Stat. Med. 2021, 40, 920–932.
21. Ma, R.; Jørgensen, B. Nested generalized linear mixed models: An orthodox best linear unbiased predictor approach. J. R. Stat. Soc. Ser. B 2007, 69, 625–641.
22. Lang, S.; Brezger, A. Bayesian P-splines. J. Comput. Graph. Stat. 2004, 13, 183–212.
23. Tanner, M.A.; Wong, W.H. The Calculation of Posterior Distributions by Data Augmentation. J. Am. Stat. Assoc. 1987, 82, 528–540.
24. Geman, S.; Geman, D. Stochastic relaxation, Gibbs distribution and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 1984, 6, 721–741.
25. Chen, H.C.; Wehrly, T.E. Assessing correlation of clustered mixed outcomes from a multivariate generalized linear mixed model. Stat. Med. 2015, 34, 704–720.
26. Chen, H.C.; Wehrly, T.E. Approximate uniform shrinkage prior for a multivariate generalized linear mixed model. J. Multivar. Anal. 2016, 145, 148–161.

Figure 1. Histogram for the observed WOMAC numeric score in the OAI dataset.

Figure 2. The estimated function and true function of $g (t)$ for three priors: type I (left panel), type II (middle panel), and type III (right panel) in the first simulation.

Figure 3. The estimated function and true function of $g (t)$ for three priors: type I (left panel), type II (middle panel), and type III (right panel) in the second simulation.

Figure 4. Nonparametric estimate of effects of age on the WOMAC numeric score in the OAI dataset.

Table 1. Symbols and description.

Symbols	Description
U	A Poisson-distributed random variable
$μ$	The mean parameter of the Tweedie compound Poisson distribution
$ϕ$	The dispersion parameter of the Tweedie compound Poisson distribution
p	The power parameter of the Tweedie compound Poisson distribution
$β$	A $q \times 1$ vector of unknown regression parameters
$b_{i}$	A $d \times 1$ vector of random effects $(b_{i} \sim N_{d} (0, Σ))$
$Σ$	A $d \times d$ covariance matrix
$g (\cdot)$	An unknown nonparametric function
$ξ$	An $H \times 1$ vector of B-spline coefficients $(ξ = {(ξ_{1}, \dots, ξ_{H})}^{T})$
$B (\cdot)$	The B-spline basis function
$τ_{ξ}^{2}$	A global smoothing parameter
Q	The $H \times H$ penalty matrix with elements $δ = {(δ_{2}, \dots, δ_{H})}^{T}$
$Y_{o}$	Set of observed values of response variable Y $(Y_{o} = \{y_{o 1}, \dots, y_{o n}\})$
$Y_{m}$	Set of missing values of response variable Y $(Y_{m} = \{y_{m 1}, \dots, y_{m n}\})$
$X_{o}$	Set of observed values of covariates X $(X_{o} = \{x_{o 11}, \dots, x_{o 1 n_{1}}, \dots, x_{o n 1}, \dots, x_{o n n_{n}}\})$
$X_{m}$	Set of missing values of covariates X $(X_{m} = \{x_{m 11}, \dots, x_{m 1 n_{1}}, \dots, x_{m n 1}, \dots, x_{m n n_{n}}\})$
Z	Vector of covariates relating to random effects $(Z = \{z_{i j} : i = 1, \dots, n, j = 1, \dots, n_{i}\})$
T	Vector of time effects $(T = \{t_{i j} : i = 1, \dots, n, j = 1, \dots, n_{i}\})$
r	Vector of indicator variables relating to missing data mechanism $(r = \{r_{y}, r_{x}\})$
$φ$	Vector of parameters relating to missing data mechanism $(φ = \{φ_{x}, φ_{y}\})$
$α$	The parameter in the covariates' distribution $(α = (α_{1}, \dots, α_{m}))$
$σ_{m}^{2}$	The variance parameters in the covariates' distribution $(σ_{m}^{2} = \{σ_{x k}^{2} : k = 1, \dots, m\})$

Table 2. Sample data from the OAI study (M denotes the missing data).

ID	Month	WOMAC Score (response variable)	BMI (covariate)	SEX (covariate)	AGE (covariate)
9019406	0	0	23.5	Male	71
9019406	12	1	22.6	Male	72
9019406	24	0	22.9	Male	73
9019406	36	M	M	Male	74
9019406	48	M	M	Male	75
⋮	⋮	⋮	⋮	⋮	⋮
9025191	0	1	24.2	Female	55
9025191	12	0	24.7	Female	56
9025191	24	1.06	24.7	Female	57
9025191	36	1	25.4	Female	58
9025191	48	0	25.4	Female	59
⋮	⋮	⋮	⋮	⋮	⋮

Table 3. Bayesian estimates of parameters in the first simulation study.

Par.	Type I Bias	Type I SD	Type I RMS	Type II Bias	Type II SD	Type II RMS	Type III Bias	Type III SD	Type III RMS
$β_{1}$	0.0007	0.0795	0.0787	−0.0068	0.0664	0.0661	0.0042	0.0763	0.0756
$β_{2}$	−0.0011	0.0509	0.0504	0.0026	0.0567	0.0562	0.0045	0.0553	0.0549
$β_{3}$	0.0004	0.0723	0.0716	−0.0051	0.0678	0.0673	−0.0068	0.0815	0.0810
p	−0.0143	0.0265	0.0299	−0.0180	0.0227	0.0288	−0.0205	0.0206	0.0289
$ϕ$	−0.0247	0.0473	0.0529	−0.0355	0.0408	0.0538	−0.0371	0.0406	0.0547
$Σ$	−0.0355	0.0974	0.1028	−0.0195	0.0826	0.0841	−0.0354	0.0871	0.0932
$α_{10}$	0.0193	0.0653	0.0675	0.0159	0.0673	0.0685	0.0293	0.0749	0.0797
$α_{11}$	−0.0031	0.0866	0.0858	−0.0148	0.0650	0.0660	0.0023	0.0813	0.0805
$α_{20}$	0.0195	0.0677	0.0698	0.0290	0.0724	0.0773	−0.0021	0.0733	0.0726
$α_{21}$	0.0152	0.1205	0.1203	−0.0075	0.1125	0.1116	0.0041	0.1479	0.1464
$α_{22}$	−0.0171	0.0963	0.0968	0.0144	0.0827	0.0831	−0.0123	0.1036	0.1033
$φ_{y 0}$	0.0016	0.1453	0.1439	−0.0672	0.1680	0.1794	−0.0565	0.1603	0.1685
$φ_{y 1}$	−0.0156	0.0592	0.0607	0.0003	0.0468	0.0463	0.0027	0.0539	0.0535
$φ_{y 2}$	−0.0005	0.0640	0.0633	0.0090	0.0520	0.0522	0.0054	0.0569	0.0566
$φ_{x 10}$	−0.0119	0.1308	0.1301	−0.0802	0.1660	0.1828	−0.0622	0.1568	0.1673
$φ_{x 11}$	−0.0275	0.1552	0.1561	0.0095	0.2089	0.2070	0.0126	0.1889	0.1874
$φ_{x 20}$	−0.0139	0.1512	0.1503	−0.0461	0.2133	0.2162	−0.0886	0.1723	0.1921
$φ_{x 21}$	−0.0323	0.1915	0.1923	−0.0283	0.1914	0.1915	0.0616	0.2577	0.2624
$φ_{x 22}$	0.0088	0.1319	0.1309	0.0270	0.1426	0.1437	−0.0375	0.1528	0.1558
$φ_{x 23}$	−0.0452	0.2707	0.2717	0.0018	0.3127	0.3096	−0.0170	0.4563	0.4521
$σ_{x 1}^{2}$	0.0200	0.0170	0.0262	0.0245	0.0182	0.0304	0.0215	0.0198	0.0291
$σ_{x 2}^{2}$	0.0237	0.0233	0.0330	0.0275	0.0214	0.0347	0.0299	0.0220	0.0370
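
As a reading aid for the Bias, SD, and RMS columns, the following minimal Python sketch shows one common way such summaries are computed from repeated simulation fits. It assumes that Bias is the mean estimate minus the true value, SD is the sample standard deviation of the estimates across replications, and RMS is the root mean square error about the true value. The function name `summarize_replications`, the true value 0.7, and the simulated estimates are hypothetical and are not taken from the study.

```python
import numpy as np


def summarize_replications(estimates, true_value):
    """Return (bias, sd, rms) for one parameter across replications.

    Assumed definitions (not confirmed by the source):
      bias = mean(estimates) - true_value
      sd   = sample standard deviation of the estimates (ddof=1)
      rms  = sqrt(mean((estimates - true_value)^2))
    """
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - true_value
    sd = estimates.std(ddof=1)
    rms = np.sqrt(np.mean((estimates - true_value) ** 2))
    return bias, sd, rms


# Hypothetical replicate estimates for illustration only.
rng = np.random.default_rng(0)
true_beta = 0.7  # assumed true value, not from the paper
estimates = true_beta + rng.normal(0.0, 0.08, size=100)

bias, sd, rms = summarize_replications(estimates, true_beta)
print(f"Bias={bias:.4f}  SD={sd:.4f}  RMS={rms:.4f}")
```

Under these assumed definitions, RMS² = Bias² + ((n − 1)/n)·SD² for n replications, so RMS can fall slightly below SD when the bias is small.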

Table 4. Bayesian estimates of parameters in the second simulation study.

Parameter	Type I Bias	Type I SD	Type I RMS	Type II Bias	Type II SD	Type II RMS	Type III Bias	Type III SD	Type III RMS
$β_{1}$	−0.0073	0.0794	0.0790	−0.0050	0.0730	0.0724	0.0029	0.0717	0.0711
$β_{2}$	−0.0123	0.0551	0.0559	0.0119	0.0560	0.0567	−0.0044	0.0577	0.0573
$β_{3}$	0.0121	0.0764	0.0766	−0.0124	0.0786	0.0788	0.0003	0.0711	0.0704
p	−0.0182	0.0274	0.0327	−0.0132	0.0255	0.0285	−0.0140	0.0237	0.0273
$ϕ$	−0.0298	0.0444	0.0531	−0.0262	0.0406	0.0480	−0.0267	0.0416	0.0491
$Σ$	−0.0343	0.0737	0.0806	−0.0423	0.0904	0.0990	−0.0545	0.0870	0.1019
$α_{10}$	0.0885	0.0785	0.1177	0.0837	0.0595	0.1023	0.0852	0.0748	0.1129
$α_{11}$	0.0057	0.0775	0.0770	0.0180	0.0627	0.0646	0.0106	0.0781	0.0780
$α_{20}$	−0.0466	0.0834	0.0948	−0.0603	0.0661	0.0889	−0.0458	0.0711	0.0840
$α_{21}$	−0.0126	0.1330	0.1322	−0.0493	0.1282	0.1362	−0.0392	0.1376	0.1418
$α_{22}$	0.0065	0.1015	0.1007	0.0071	0.1039	0.1031	0.0001	0.0993	0.0983
$φ_{y 0}$	0.0094	0.1394	0.1383	−0.0769	0.1987	0.2112	−0.1102	0.1917	0.2194
$φ_{y 1}$	0.0070	0.0509	0.0508	0.0108	0.0498	0.0505	0.0167	0.0453	0.0479
$φ_{y 2}$	−0.0251	0.1807	0.1806	−0.0055	0.1977	0.1958	0.0114	0.2291	0.2271
$φ_{y 3}$	0.0317	0.1131	0.1164	−0.0203	0.1385	0.1386	−0.0355	0.1435	0.1464
$φ_{x 10}$	−0.0540	0.1440	0.1524	−0.1013	0.1761	0.2016	−0.0449	0.1860	0.1895
$φ_{x 11}$	−0.0042	0.1922	0.1903	0.0322	0.2089	0.2093	−0.0226	0.2019	0.2011
$φ_{x 12}$	0.0085	0.0505	0.0507	0.0094	0.0698	0.0697	0.0061	0.0563	0.0561
$φ_{x 20}$	−0.0337	0.2063	0.2070	−0.0246	0.2234	0.2225	−0.0077	0.2564	0.2540
$φ_{x 21}$	−0.0478	0.2089	0.2123	0.0187	0.2069	0.2057	0.0169	0.2182	0.2167
$φ_{x 22}$	0.0007	0.1472	0.1457	−0.0201	0.1322	0.1324	0.0045	0.1744	0.1727
$φ_{x 23}$	−0.0221	0.1341	0.1346	−0.0780	0.1768	0.1916	−0.1029	0.2064	0.2288
$σ_{x 1}^{2}$	0.0273	0.0219	0.0349	0.0230	0.0194	0.0300	0.0251	0.0216	0.0330
$σ_{x 2}^{2}$	0.0353	0.0306	0.0465	0.0336	0.0297	0.0447	0.0364	0.0260	0.0446

Table 5. Bayesian estimates and standard errors in the real example.

Parameter	Estimate	SD
$β_{0}$	−0.492835	0.0778
$β_{1}$	0.058405	0.0024
$β_{2}$	0.141541	0.0370
p	1.258745	0.0031
$ϕ$	3.146945	0.0245
$Σ$	1.840797	0.0550
$α_{10}$	21.004105	0.3076
$α_{11}$	4.520023	0.1779
$φ_{Y 0}$	−2.694960	0.0196
$φ_{Y 1}$	0.000522	0.0024
$φ_{Y 2}$	0.042958	0.0023
$φ_{B M I 0}$	−1.338632	0.1937
$φ_{B M I 1}$	−0.017045	0.0067
$σ_{B M I}^{2}$	30.067984	0.5088

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
