Bayesian Semiparametric Regression Analysis of Multivariate Panel Count Data

Panel count data often occur in long-term recurrent event studies, where the exact occurrence times of the recurrent events are unknown and only the occurrence counts between adjacent observation time points are recorded. Most traditional methods handle panel count data for only a single type of event. In this paper, we propose a Bayesian semiparametric approach to analyze panel count data for multiple types of events. For each type of recurrent event, the proportional mean model is adopted to model the mean count of the event, with its baseline mean function approximated by monotone I-splines. The correlation between multiple types of events is modeled by common frailty terms and scale parameters. Unlike many frequentist estimating equation methods, our approach is based on the observed likelihood and makes no assumption about the relationship between the recurrent event process and the observation process. Under the Poisson counting process assumption, we develop an efficient Gibbs sampler based on a novel data augmentation for the Markov chain Monte Carlo sampling. Simulation studies show good estimation performance for the baseline mean functions and the regression coefficients; they also demonstrate the importance of including the scale parameters to flexibly accommodate the correlation between events. Finally, a skin cancer data example is fully analyzed to illustrate the proposed methods.


Introduction
Panel count data often arise in epidemiological and medical studies in which the events of interest can recur and subjects are monitored periodically. Since the subjects are not under continuous monitoring, the exact time of each recurrent event is not observed, but the count of such events between adjacent observation times is known. Often, several types of related recurrent events are of interest in a study. For example, the recurrent events can be different types of infections, tumors, or social behaviors such as drinking and drug use. The motivating example is bivariate panel count data on skin cancers from the literature [1]. The data arose from a skin cancer chemoprevention trial conducted by the University of Wisconsin Comprehensive Cancer Center. This was a double-blinded, placebo-controlled, randomized phase III clinical trial. The main objective of the study was to evaluate the effectiveness of 0.5/m²/day PO difluoromethylornithine (DFMO) in reducing the recurrence rate of skin cancers in a population of patients with a history of non-melanoma skin cancers: basal cell carcinoma and squamous cell carcinoma. Two hundred and ninety-one patients were randomized into either a placebo group or a DFMO group. During the study, the patients were scheduled for regular reviews every 6 months to check the development of both skin cancers. At each visit, the numbers of occurrences of both basal cell carcinoma and squamous cell carcinoma since the previous visit were recorded. For these bivariate panel count data, it is possible to analyze the two skin cancers separately to evaluate the effectiveness of DFMO. However, a joint analysis is better practice, both to investigate the correlation between the two cancers and to improve estimation efficiency.
Stats 2022, 5

Extensive methods have been established to analyze panel count data for a single type of recurrent event; see, for example, Sun and Kalbfleisch [2], Sun and Zhao [1], Wellner and Zhang [3,4], Lu et al. [5,6], and Wang and Lin [7], among others. For multivariate panel count data, He et al. [8] considered regression analysis and first proposed a class of marginal mean models, which leave the dependence structures for related recurrent events completely unspecified. Zhang et al. [9] then improved their model and provided a robust joint modeling approach for the regression analysis of multivariate panel count data with an informative observation process. Li et al. [10] proposed semiparametric transformation models that allow the recurrent event processes to depend on the observation process. Along the same line, Zhao et al. [11] proposed a semiparametric additive model to analyze multivariate panel count data with dependent observation processes and a terminal event. This branch of work emphasizes the dependence of the recurrent event processes on the observation process and requires the model for the observation process to be explicitly specified. These methods usually derive estimating equations to estimate the regression coefficients without estimating the baseline mean functions.
Another line of research for panel count data considers varying or nonlinear covariate effects. He et al. [12] proposed a class of partially linear models with varying coefficients for the mean function of the counting processes to explore nonlinear interactions between covariates. Zhao et al. [13] adopted B-splines to model the regression function. Wang and Yu [14] proposed a time-varying coefficient model via the local linear expansion method. For multivariate panel count data, Li et al. [15] presented a joint model for the recurrent event processes, the observation process, and the censoring time by using time-dependent random effects. Wang and Yu [16] proposed a varying coefficient mean model for multivariate panel count data to describe possible nonlinear interaction effects between covariates. In this paper, instead of investigating varying covariate effects, we focus on traditional constant covariate effects.
Frequentist approaches dominate the analysis of panel count data, and there are only a limited number of papers describing Bayesian approaches. Chib et al. [17] investigated estimation and model comparison for panel count data models with multiple random effects. Sinha and Maiti (2004) proposed a Bayesian approach for the analysis of panel count data with dependent termination. Wang and Lin (2020) developed a Bayesian semiparametric approach under the proportional mean model with the baseline mean function approximated by monotone I-splines [18]. Dimitrakopoulos [19] discussed how Bayesian techniques can be used to estimate panel count data models. Liang et al. [20] proposed a bivariate Gaussian Cox process model to jointly model the recurrent event process and the observation process. To the best of our knowledge, Ref. [20] is the only Bayesian paper analyzing multivariate panel count data.
In this paper, we contribute a new Bayesian semiparametric approach to analyze panel count data for multiple types of recurrent events. For each type of recurrent event, we adopt the modeling in Wang and Lin (2020) by using the proportional mean model for the mean count of the recurrent event, with its baseline mean function approximated by monotone I-splines [18]. The proposed approach is based on the observed likelihood using only the observed counts and observation times, which does not require us to specify a model for the observation process. Instead of building up the correlation among multiple types of events through their dependence on the observation process, we model the correlation between different types of events directly by introducing common subject-specific gamma frailty terms and additional scale parameters. The resulting pairwise correlations can be calculated in closed form and flexibly accommodate different correlation situations, including positive, negative, strong, and weak correlations. Based on Poisson latent variable augmentation, our MCMC algorithm can estimate both the regression coefficients and the baseline mean functions simultaneously.
The remainder of this paper is organized as follows. In Section 2, primary models and the correlation derivation and interpretation are introduced. In Section 3, a detailed description of monotone I-splines, augmented likelihood function construction, and prior specification and posterior computation are presented. Section 4 evaluates our proposed methods via extensive simulation studies. The skin cancer data are used to demonstrate the performance of the proposed methods in Section 5. Lastly, Section 6 summarizes our findings and discusses some future work directions.

Model Construction
Consider that n subjects participate in a long-term study involving K types of related recurrent events. Each subject is not under continuous monitoring but is instead observed at discrete time points. Specifically, for each subject i and event type k, let $J_i^{(k)}$ denote the number of observation time points, $0 < t_{i1}^{(k)} < \cdots < t_{iJ_i^{(k)}}^{(k)}$ the observation times, and $X_i^{(k)} = (X_{i1}^{(k)}, \ldots, X_{ip}^{(k)})'$ the $p \times 1$ covariate vector associated with subject i for event k. For simplicity, in this paper we assume the covariates for subject i are identical for all K events and denote them by $X_i$. Finally, we assume there is a latent subject-specific positive frailty term $w_i$ for each subject i, which affects the occurrence rates and connects the multiple events. The frailties $w_i$ are assumed to follow a gamma distribution $G(\eta, \eta)$ with mean 1 and variance $1/\eta$. Model identifiability is ensured by fixing the mean of the frailties at 1 [21,22]. Given the covariates $X_i$ and frailty $w_i$, we assume the counting process $\{N_i^{(k)}(t), t > 0\}$ for event k has a proportional mean function of the following form:

$$E\big[N_i^{(k)}(t) \mid X_i, w_i\big] = w_i^{\alpha_k}\, U_0^{(k)}(t)\, \exp\big(X_i'\beta^{(k)}\big), \qquad (1)$$

where $\alpha_k$ is a scale parameter introduced to more flexibly accommodate the correlation between events, $U_0^{(k)}(t)$ is the baseline mean function for event k, and $\beta^{(k)}$ is a $p \times 1$ vector of regression coefficients for event k. By default, the model allows different covariate effects for different recurrent events, but it is easy to extend to the situation where the covariate effects are identical across all K events. More details about the role of $\alpha_k$ in the correlation are discussed in Section 2.2.
Let $Z_{ij}^{(k)}$ denote the number of occurrences of event k in the jth time interval $(t_{i,j-1}^{(k)}, t_{ij}^{(k)}]$, with $t_{i0}^{(k)} = 0$ for $i = 1, \ldots, n$. Through this transformation, the whole set of observed panel count data can be expressed as $\{(Z_{ij}^{(k)}, t_{ij}^{(k)}, X_i) : k = 1, \ldots, K;\ i = 1, \ldots, n;\ j = 1, \ldots, J_i^{(k)}\}$. Under the Poisson process assumption, given the covariates $X_i$ and latent frailty terms $w_i$, the $Z_{ij}^{(k)}$ independently follow Poisson distributions, which can be written as:

$$Z_{ij}^{(k)} \mid X_i, w_i \sim \mathrm{Poisson}\Big(w_i^{\alpha_k}\,\big[U_0^{(k)}(t_{ij}^{(k)}) - U_0^{(k)}(t_{i,j-1}^{(k)})\big]\,\exp\big(X_i'\beta^{(k)}\big)\Big). \qquad (2)$$

This form is particularly useful for constructing the likelihood function in Section 3.2.
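As a minimal illustration of this conditional Poisson structure, the following Python sketch (numpy assumed; all function and variable names are ours, not the paper's) computes the interval means implied by the proportional mean model and draws the corresponding panel counts:

```python
import numpy as np

def interval_means(t, U0, alpha_k, w_i, x_i, beta_k):
    """Conditional Poisson means for the interval counts Z_ij^(k):
    w_i^alpha_k * [U0(t_j) - U0(t_{j-1})] * exp(x_i' beta_k)."""
    t = np.asarray(t, dtype=float)           # observation times t_1 < ... < t_J
    cum = U0(np.concatenate(([0.0], t)))     # baseline mean at 0, t_1, ..., t_J
    increments = np.diff(cum)                # U0(t_j) - U0(t_{j-1}), nonnegative
    return (w_i ** alpha_k) * increments * np.exp(x_i @ beta_k)

rng = np.random.default_rng(1)
eta = 2.0
w = rng.gamma(eta, 1.0 / eta)                # frailty with mean 1, variance 1/eta
mu = interval_means([0.5, 1.0, 2.0], lambda t: 2.0 * t,   # toy linear baseline
                    alpha_k=1.0, w_i=w, x_i=np.array([1.0, 0.3]),
                    beta_k=np.array([0.5, -0.2]))
z = rng.poisson(mu)                          # simulated interval counts
```

With frailty and covariate effects switched off (w = 1, beta = 0), the means reduce to the bare baseline increments, which is a convenient sanity check.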
Note that when integrating out the frailty effect $w_i$, the counts $Z_{ij}^{(k)}$ are no longer independent: the shared frailty induces correlation both within each event type and across the K event types.

Correlation Expression
An advantage of our proposed model is its straightforwardness in deriving the correlation formula between two events. Among the K events, we predesignate one event of main interest, say event j, as the reference event and let $\alpha_j = 1$. Then, given the covariates $X_i$ and the unobservable frailty $w_i$, the cumulative counts of event j and event k ($k \neq j$) for subject i at any time point t are conditionally independent Poisson variables with means $w_i\,\mu_i^{(j)}(t)$ and $w_i^{\alpha_k}\,\mu_i^{(k)}(t)$, respectively, where $\mu_i^{(j)}(t) = U_0^{(j)}(t)\exp(X_i'\beta^{(j)})$ and $\mu_i^{(k)}(t) = U_0^{(k)}(t)\exp(X_i'\beta^{(k)})$. Using the law of total covariance and writing $m(a) = E(w_i^a) = \Gamma(\eta + a)/\{\Gamma(\eta)\,\eta^a\}$, we explicitly derive

$$\mathrm{Cov}\big(N_i^{(j)}(t), N_i^{(k)}(t)\big) = \mu_i^{(j)}(t)\,\mu_i^{(k)}(t)\,\big\{m(1+\alpha_k) - m(\alpha_k)\big\}, \qquad (3)$$

and hence the correlation formula, where $\eta > 0$ and $\alpha_k > -\eta/2$. For details on the derivation of Equation (3), see Appendix A. Clearly, $\alpha_k = 0$ implies that event j and event k are independent; $\alpha_k > 0$ implies that event j and event k are positively related; $\alpha_k < 0$ implies that event j and event k are negatively related. When the covariate effects are removed, the baseline correlation between the two events is

$$\mathrm{Corr}_0\big(N^{(j)}(t), N^{(k)}(t)\big) = \frac{U_0^{(j)}(t)\,U_0^{(k)}(t)\,\{m(1+\alpha_k) - m(\alpha_k)\}}{\sqrt{\big\{U_0^{(j)}(t) + U_0^{(j)}(t)^2/\eta\big\}\,\big\{U_0^{(k)}(t)\,m(\alpha_k) + U_0^{(k)}(t)^2\,[m(2\alpha_k) - m(\alpha_k)^2]\big\}}}.$$
The benefit of this form is that it abstracts away from the individual level and provides a broader view of the correlation between two events. Letting the baseline mean functions equal 1, we can explore the pure effect of $\alpha_k$ and $\eta$ on the baseline correlation $\mathrm{Corr}_0(N^{(j)}(t), N^{(k)}(t))$. Note that the correlation between any two events is controlled by the parameters $\eta$ and $\alpha_k$. Figure A1 shows the baseline correlation as a function of $\eta$ and $\alpha_k$ for $\alpha_k \in [-1, 1]$. It shows that as the variance of the frailty increases ($\eta$ decreases), the correlation increases, and as the magnitude of $\alpha_k$ increases, the correlation also increases. When $\alpha_k > 0$, any $\eta > 0$ is a legitimate choice, while for $\alpha_k < 0$ the condition $\eta > -2\alpha_k$ must be satisfied for $\Gamma(\cdot)$ to be well defined. In fact, the interpretation of the pairwise correlation is not limited to pairs involving the predesignated reference event; a similar derivation yields the correlation between any two types of events. Theoretically, $\alpha_k$ can be any value greater than $-\eta/2$. We restrict $\alpha_k$ to $[-1, 1]$ only because, in practice, we can always choose the event with the largest frailty variability as the reference event; for the other events, $\alpha_k$ between $-1$ and $1$ then suffices.
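The moment structure behind this derivation is easy to check numerically. The sketch below (Python with numpy/scipy assumed; function names are ours) evaluates the baseline correlation through the gamma moments $E(w^a) = \Gamma(\eta+a)/\{\Gamma(\eta)\,\eta^a\}$ and reproduces the sign behavior described above:

```python
import numpy as np
from scipy.special import gammaln

def gamma_moment(a, eta):
    """E[w^a] for w ~ Gamma(shape=eta, rate=eta); requires eta + a > 0."""
    return np.exp(gammaln(eta + a) - gammaln(eta) - a * np.log(eta))

def baseline_corr(u_j, u_k, alpha, eta):
    """Corr(N^(j)(t), N^(k)(t)) given baseline means u_j = U0^(j)(t) and
    u_k = U0^(k)(t), via the law of total covariance for conditionally
    Poisson counts sharing a Gamma(eta, eta) frailty (alpha_j = 1)."""
    m = lambda a: gamma_moment(a, eta)
    cov = u_j * u_k * (m(1 + alpha) - m(1) * m(alpha))
    var_j = u_j * m(1) + u_j ** 2 * (m(2) - m(1) ** 2)
    var_k = u_k * m(alpha) + u_k ** 2 * (m(2 * alpha) - m(alpha) ** 2)
    return cov / np.sqrt(var_j * var_k)

rho_zero = baseline_corr(1.0, 1.0, 0.0, 1.0)    # alpha = 0: independence
rho_pos = baseline_corr(1.0, 1.0, 1.0, 1.0)     # alpha > 0: positive
rho_neg = baseline_corr(1.0, 1.0, -0.3, 1.0)    # alpha < 0, eta > -2*alpha: negative
```

A Monte Carlo draw of conditionally Poisson pairs sharing the same frailty agrees with the closed form, which is a useful check when experimenting with other frailty distributions.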

Modeling $U_0^{(k)}(t)$ with Monotone I-Splines
To accommodate the nondecreasing nature of the baseline mean functions in the proposed model, we choose monotone I-splines to model them. I-splines were first introduced by Ramsay [18] and have since been widely applied in many semiparametric models. Briefly, I-splines are integrated M-splines, a set of non-negative spline functions [18]. In our proposed model, each baseline mean function of event k is modeled as a linear combination of I-splines:

$$U_0^{(k)}(t) = \sum_{l=1}^{L} r_l^{(k)}\, I_l(t \mid d), \qquad (4)$$

where $I_l(\cdot \mid d)$ is the lth I-spline basis function with degree d; L is the number of I-spline basis functions, which equals the number of interior knots plus the degree d; and the $r_l^{(k)}$ are nonnegative spline coefficients. For more detailed information on the formula of I-splines, refer to Ramsay [18] and Lin et al. [23]. The degree d and the placement of knots are the two chief components that determine the basis functions: the former controls the smoothness and the latter controls the shape of the spline function. In general, degree 2 or 3 is enough to provide adequate smoothness, and 10 to 30 knots provide enough flexibility for a regression incorporating thousands of observations, according to Cai et al. [24] and Wang and Dunson [25]. Equally spaced knots and quantile-based knots are two commonly used knot-selection methods. Ref. [7] showed that the deviance information criterion (DIC) can be used to facilitate choosing the setup of the I-spline functions. In addition, the reversible jump MCMC technique [26] may also help adaptively choose the number and locations of knots. In this paper, to avoid additional computational burden, we uniformly use degree 3 and 20 equally spaced knots (18 interior knots), which provides sufficient flexibility for modeling the unknown baseline mean functions.

Likelihood Augmentation with Poisson Latent Variables
Consider that n subjects participate in a long-term study involving K types of related recurrent events. The data structure and model are defined in Section 2.1. Under the Poisson process assumption, following Equation (2), the observed likelihood function, conditional on the frailties, can be written in the following form:

$$L_{\mathrm{obs}} \propto \prod_{k=1}^{K}\prod_{i=1}^{n}\prod_{j=1}^{J_i^{(k)}} \exp\big\{-w_i^{\alpha_k}\,\mu_{ij}^{(k)}\big\}\,\big\{w_i^{\alpha_k}\,\mu_{ij}^{(k)}\big\}^{Z_{ij}^{(k)}},$$

where $\mu_{ij}^{(k)} = [U_0^{(k)}(t_{ij}^{(k)}) - U_0^{(k)}(t_{i,j-1}^{(k)})]\exp(X_i'\beta^{(k)})$. Taking the baseline function $U_0^{(k)}(t)$ in the form of (4), the likelihood can further be written with $\mu_{ij}^{(k)} = \sum_{l=1}^{L} r_l^{(k)}\,[I_l(t_{ij}^{(k)}) - I_l(t_{i,j-1}^{(k)})]\exp(X_i'\beta^{(k)})$, where $\theta$ represents the vector of all unknown parameters, including $\beta^{(k)}$ and $(r_1^{(k)}, \ldots, r_L^{(k)})'$ for $k = 1, \ldots, K$, $\eta$, and $\alpha = (\alpha_1, \ldots, \alpha_K)'$. Note that, for simplicity of notation, the degree d is omitted from the I-spline basis functions. With this likelihood format, the sampling for $r_l^{(k)}$ is especially difficult because the spline coefficients appear inside a sum. To solve this problem, we decompose each $Z_{ij}^{(k)}$ into latent components $Z_{ij}^{(k)} = \sum_{l=1}^{L} Z_{ijl}^{(k)}$, where the $Z_{ijl}^{(k)}$ independently follow Poisson distributions as follows:

$$Z_{ijl}^{(k)} \mid X_i, w_i \sim \mathrm{Poisson}\Big(w_i^{\alpha_k}\, r_l^{(k)}\,b_{ijl}^{(k)}\,\exp\big(X_i'\beta^{(k)}\big)\Big), \quad b_{ijl}^{(k)} = I_l(t_{ij}^{(k)}) - I_l(t_{i,j-1}^{(k)}).$$

The convolution property of the Poisson distribution makes this decomposition valid. Then, with the augmented Poisson latent variables, the likelihood function can be expressed as

$$L_{\mathrm{aug}} \propto \prod_{k=1}^{K}\prod_{i=1}^{n}\prod_{j=1}^{J_i^{(k)}}\prod_{l=1}^{L} \exp\big\{-w_i^{\alpha_k} r_l^{(k)} b_{ijl}^{(k)} e^{X_i'\beta^{(k)}}\big\}\,\big\{w_i^{\alpha_k} r_l^{(k)} b_{ijl}^{(k)} e^{X_i'\beta^{(k)}}\big\}^{Z_{ijl}^{(k)}}. \qquad (5)$$

Based on this augmented likelihood (5), we develop the Bayesian computation algorithm in Section 3.3.
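The convolution property also gives the standard conditional distribution for imputing the latent components: given their sum, independent Poisson variables are jointly multinomial with probabilities proportional to their rates. A small Python check (numpy assumed; names are ours):

```python
import numpy as np

def split_count(z_ij, rates, rng):
    """Augmentation step: given the observed interval count Z_ij = sum_l Z_ijl
    and the component Poisson rates, the latent vector (Z_ij1, ..., Z_ijL) is
    multinomial with probabilities proportional to the rates."""
    p = np.asarray(rates, dtype=float)
    return rng.multinomial(z_ij, p / p.sum())

rng = np.random.default_rng(42)
rates = np.array([0.4, 1.1, 0.5])
parts = rng.poisson(rates, size=(100000, 3))   # independent Poisson components
total = parts.sum(axis=1)                      # Poisson with mean rates.sum()
z_latent = split_count(7, rates, rng)          # one augmentation draw for Z_ij = 7
```

The drawn components always sum back to the observed count, so the augmentation never changes the observed data, only introduces latent structure that makes the spline coefficients conditionally conjugate.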

Prior Specification and Posterior Computation
For Bayesian computation, we first specify prior distributions for the unknown parameters. When we do not have much prior information about the parameters, we assign vague priors. For $\beta_m^{(k)}$, $m = 1, \ldots, p$; $k = 1, \ldots, K$, we assign $N(0, \sigma^2)$ priors, where $\sigma^2$ takes a large value such as 100. For the nonnegative $r_l^{(k)}$, $l = 1, \ldots, L$; $k = 1, \ldots, K$, we assign exponential priors with rate parameter $\lambda_k$, where $\lambda_k$ itself follows a gamma prior $G(a_\lambda, b_\lambda)$. This prior specification is appealing from the computational perspective because it leads to conjugate forms for each of the conditional posterior distributions of $r_l^{(k)}$ and $\lambda_k$. Theoretically, such a prior specification is closely related to the Bayesian Lasso [27] and is equivalent to the penalized likelihood approach with an L1 penalty on the spline coefficients, in which the $\lambda_k$ serve as tuning parameters. Our simulation studies show that our approach is robust to the choice of hyperparameters, so we simply choose $a_\lambda = 1$ and $b_\lambda = 1$. We adopt a Gibbs sampling algorithm for posterior computation. Basically, we derive the full conditional distribution of each parameter component-wise from the joint distribution of the likelihood function in (5) and the specified prior distributions. If the full conditional distribution of a parameter has a closed form, the sampling is straightforward. When the closed form is intractable, adaptive rejection sampling (ARS) [28] is adopted if the full conditional posterior distribution preserves log-concavity. Even if log-concavity is not satisfied, we can still use adaptive rejection Metropolis sampling (ARMS) [29] to draw samples. Specifically, we use the function arms() in the R package HI [30] to carry out the ARMS. The full conditional distributions of the Gibbs sampler are summarized as below.

5. Sample $w_i$, for $i = 1, \ldots, n$, by using the ARS. The log full conditional distribution of each $w_i$ is proportional to

$$\Big(\eta - 1 + \sum_{k=1}^{K}\alpha_k \sum_{j=1}^{J_i^{(k)}} Z_{ij}^{(k)}\Big)\log w_i \;-\; \eta\, w_i \;-\; \sum_{k=1}^{K} w_i^{\alpha_k} \sum_{j=1}^{J_i^{(k)}} \mu_{ij}^{(k)},$$

where $\mu_{ij}^{(k)} = [U_0^{(k)}(t_{ij}^{(k)}) - U_0^{(k)}(t_{i,j-1}^{(k)})]\exp(X_i'\beta^{(k)})$.

6. Sample $\eta$ by using the ARMS. The log full conditional distribution of $\eta$ is proportional to

$$n\,\eta\log\eta - n\log\Gamma(\eta) + (\eta - 1)\sum_{i=1}^{n}\log w_i - \eta\sum_{i=1}^{n} w_i + \log\pi(\eta),$$

where $\pi(\eta)$ denotes the prior for $\eta$.

7. Sample $\alpha_k$, for $k = 1, \ldots, K$, by using the ARMS. The log full conditional distribution of $\alpha_k$ is proportional to

$$\sum_{i=1}^{n}\Big\{\alpha_k \Big(\sum_{j=1}^{J_i^{(k)}} Z_{ij}^{(k)}\Big)\log w_i - w_i^{\alpha_k}\sum_{j=1}^{J_i^{(k)}} \mu_{ij}^{(k)}\Big\} + \log\pi(\alpha_k),$$

where $\mu_{ij}^{(k)} = [U_0^{(k)}(t_{ij}^{(k)}) - U_0^{(k)}(t_{i,j-1}^{(k)})]\exp(X_i'\beta^{(k)})$ and $\pi(\alpha_k)$ denotes the prior for $\alpha_k$. In the R function arms(), we set the lower bound of the support of $\alpha_k$ as $-\eta/2$.
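Steps for the latent counts, the regression coefficients, and the spline coefficients are not reproduced in this excerpt. As an illustration of the conjugacy noted above — an exponential prior on $r_l^{(k)}$ combined with the augmented Poisson likelihood yields a gamma full conditional — here is a hedged Python sketch (numpy assumed; the function name and argument layout are ours):

```python
import numpy as np

def sample_r_l(z_ijl, w_alpha, dI_l, exb, lam_k, rng):
    """Gibbs update for one spline coefficient r_l^(k). With an Exponential(lam_k)
    prior and latent counts Z_ijl ~ Poisson(w_i^alpha_k * r_l * dI_lij * exp(X_i'beta)),
    the full conditional is Gamma(1 + sum Z_ijl, lam_k + sum of the multipliers)."""
    shape = 1.0 + np.sum(z_ijl)
    rate = lam_k + np.sum(w_alpha * dI_l * exb)
    return rng.gamma(shape, 1.0 / rate)        # numpy uses (shape, scale)

rng = np.random.default_rng(0)
r_draw = sample_r_l(z_ijl=np.array([2, 0, 1]),          # latent counts over (i, j)
                    w_alpha=np.array([1.0, 0.8, 1.2]),  # w_i^alpha_k per cell
                    dI_l=np.array([0.3, 0.2, 0.5]),     # I-spline increments
                    exb=np.ones(3), lam_k=1.0, rng=rng)
```

All arrays are flattened over the (i, j) cells in which basis function l is active; the draw is always positive, preserving the monotonicity of the fitted baseline mean.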

Simulation Studies
Simulation studies are conducted to evaluate the proposed methods. For demonstration purposes, we only consider K = 2 types of events; it is straightforward to extend to three or more types. By default, $\alpha_1 = 1$, and for notational simplicity we denote $\alpha_2$ by $\alpha$. We particularly assess the performance of estimating the covariate coefficients $\beta^{(k)}$ and baseline mean functions $U_0^{(k)}$ for different $\eta$ and $\alpha$ values. We also compare the estimation results between models with and without the scale parameter $\alpha$.

Data Generation
For the first type of event, the baseline mean function $U_0^{(1)}(t)$ is approximately linear, and for the second type of event, $U_0^{(2)}(t) = t^{0.5} + \log(1 + t)$, which is curvilinear. Two covariates are involved for each subject, where $X_1$ is from a Bernoulli distribution with success probability 0.5 and $X_2$ is from a standard normal distribution. Each subject has the same observation times for the two types of events. The number of observation times for each subject is generated from a Poisson distribution with mean 7, and the length of time between every two adjacent observation times follows an exponential distribution with mean 0.5. Given the observation times and the frailty $w_i$ generated from $G(\eta, \eta)$, the event count for each time interval and each type of event is generated from a Poisson distribution according to Equation (2). Each set of simulations consists of 500 data replicates.
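This data-generation scheme can be sketched in Python as follows (numpy assumed; the exact form of the first baseline mean function is not shown in this excerpt, so an approximately linear $U_0^{(1)}(t) = 2t$ is used here purely as a placeholder):

```python
import numpy as np

def generate_panel_data(n, eta, alpha2, beta1, beta2, U01, U02, rng):
    """Generate bivariate panel count data following the simulation design:
    Poisson(7) visits per subject, Exp(mean 0.5) gaps between visits, a shared
    Gamma(eta, eta) frailty, and interval counts drawn with alpha_1 = 1,
    alpha_2 = alpha2. Returns one record per (subject, event type)."""
    data = []
    for i in range(n):
        J = max(rng.poisson(7), 1)                 # at least one observation time
        times = np.cumsum(rng.exponential(0.5, size=J))
        x = np.array([rng.binomial(1, 0.5), rng.standard_normal()])
        w = rng.gamma(eta, 1.0 / eta)              # frailty, mean 1, variance 1/eta
        grid = np.concatenate(([0.0], times))
        for U0, beta, alpha in ((U01, beta1, 1.0), (U02, beta2, alpha2)):
            mu = (w ** alpha) * np.diff(U0(grid)) * np.exp(x @ beta)
            data.append((i, times, x, rng.poisson(mu)))
    return data

rng = np.random.default_rng(2022)
dat = generate_panel_data(
    n=200, eta=1.0, alpha2=0.5,
    beta1=np.array([1.0, 1.0]), beta2=np.array([1.0, 1.0]),
    U01=lambda t: 2.0 * t,                         # placeholder linear baseline
    U02=lambda t: t ** 0.5 + np.log1p(t), rng=rng)
```

Each record holds the subject index, the shared observation times, the covariates, and the interval counts for one event type, which matches the data layout described in Section 2.1.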

Simulation Results
We implemented the Gibbs sampler in Section 3.3 for each simulated dataset. Good mixing and fast convergence of the chains of the key parameters were observed. The convergence assessment was carried out using various convergence criteria in the R package coda [31]. The following results are summarized based on 5000 iterations of the Gibbs sampler after discarding the first 2000 iterations as burn-in.
The first set of simulations aims at assessing the performance of estimating the regression coefficients $\beta^{(k)}$. For this set of simulations, data are generated with different true $\beta$ values but $\alpha$ fixed at 1. Table A1 shows the estimation bias (Bias), defined as the average of the posterior means minus the true value; the mean of the posterior standard deviations (SD); the standard deviation of the posterior means (SE); and the 95% coverage probability (CP95) for four combinations of $\beta$ values. Clearly, the estimation of the regression coefficients is good, with small biases, SDs close to SEs, and coverage probabilities around 95%. When $\eta$ goes from 1 to 5 (i.e., the variance of the frailty goes from 1 to 0.2), the Bias, SD, and SE uniformly become smaller. It is noted that the estimation of $\beta_2$ is consistently better than that of $\beta_1$, with smaller biases, SDs, and SEs. This is because $X_2$ is a normal covariate providing more information than the binary covariate $X_1$, and the variance of $X_2$ is four times the variance of $X_1$. Our simulation study shows that when we add one more possible value to $X_1$ so that $X_1$ follows a discrete uniform distribution on $\{0, 1, -1\}$, the estimation of $\beta_1$ improves and is close to that of $\beta_2$.
The second set of simulations focuses on assessing the estimation performance of our proposed methods when data are generated with different $\alpha$ values. The true $\beta$ values are held fixed (all equal to 1). Table A2 summarizes the estimation results. The estimation results for the regression coefficients are similar to those in Table A1. Overall, the estimation is good, with small biases, SDs close to SEs, and CP95s close to 95%. The estimation is more precise for the larger $\eta$ value, and the estimation of $\beta_2$ is overall better than that of $\beta_1$. A new observation is that as the magnitude of $\alpha$ approaches 0, the estimation of the regression coefficients for the second type of event becomes more precise, with smaller biases, SDs, and SEs. This is reasonable, as a smaller value of $\alpha$ means the occurrence of the second type of event is less affected by the frailties, so more precise estimation of the regression coefficients is expected. Furthermore, the estimated $\alpha$ and $\eta$ recover the truth remarkably well. In contrast to the estimation of the regression coefficients, a small value of $\eta$ leads to better estimation of $\alpha$. A glance at the table may suggest that the bias of $\eta$ is a little high when the true $\eta$ is 5; however, the difference between $G(5, 5)$ and $G(6, 6)$ is trivial, so the estimates still accurately capture the shape of the frailty distribution. Figure A2 displays the estimated baseline mean functions, including $U_0^{(2)}(t) = t^{0.5} + \log(1 + t)$ for the second type of event. We can clearly see that all the estimated curves track the true curves well, and the plots on the right side ($\eta = 5$) show even better convergence to the truth than their counterparts on the left side ($\eta = 1$). Figure A3 shows the estimation of the baseline correlation functions corresponding to different setups of $\alpha$ and $\eta$ values. It is clear that for each $\alpha$ and $\eta$ setup, the estimated baseline correlation function overlays the true baseline correlation function.
Finally, we evaluate the effect of the scale parameter $\alpha$ on the estimation. Frailty models are well accepted for univariate panel count data due to their flexibility and robustness. A naive extension to multivariate panel count data is to treat the frailty as exactly the same for all types of events, i.e., each subject shares the common frailty to its full extent for all events. Our simulations show that this practice leads to incorrect estimation results when the true $\alpha$ is not 1. Table A3 presents the estimation results for the same simulated datasets as in Table A2, but fitted with the naive model, i.e., model (1) with $\alpha$ omitted. The last block of rows of the table shows that the naive model does a comparable job in estimating the regression coefficients and $\eta$, which is reasonable because the naive model is the true model when the true value of $\alpha$ is 1. However, for the other data generated with $\alpha$ not equal to 1, the misspecified naive model provides poor estimation results, which is especially clear when the true $\eta = 1$. Compared with the results in Table A2, all the biases for the regression coefficients slightly increase. For the regression coefficients for the first type of event, $\beta_1^{(1)}$ and $\beta_2^{(1)}$, the SEs are quite close to those in Table A2, but the SDs become smaller (the further the true $\alpha$ is from 1, the smaller the SDs become), leading to CP95 values much smaller than 95%. For the regression coefficients for the second type of event, $\beta_1^{(2)}$ and $\beta_2^{(2)}$, for data generated with $\alpha = 0$ or 0.5, the naive method inflates the SDs more than the SEs, leading to coverage probabilities higher than 95%. When $\alpha = -0.3$, the inflated SEs are actually larger than the inflated SDs, leading to lower coverage probabilities. The bias, SD, and SE of $\eta$ all increase as the true $\alpha$ moves away from 1.
The estimation results for η = 5 have a similar pattern to those for η = 1 but much better, which is reasonable because the large η value implies a small variance of frailties and thus less correlation between two types of events. This set of simulations shows that including the scale parameter of α is important to correctly estimate the regression coefficients.

Real Data Analysis
In this section, we apply our proposed methods to analyze the motivating skin cancer dataset introduced in Section 1 and compare the results with those in the literature. Two hundred and ninety patients are analyzed, with one removed from the original group because of no observations. The observation times were recorded in days. The covariate of major interest, denoted $X_1$, equals 1 if the patient was assigned to the DFMO group and 0 for the placebo group. The other three covariates of interest are $X_2$, the number of cancers prior to the trial; $X_3$, the patient's age; and $X_4$, gender (1 for male, 0 for female). Table A4 displays the estimated covariate coefficients for the two types of skin cancer from the proposed model. Clearly, gender ($X_4$) has no significant effect, but the number of cancers prior to the trial ($X_2$) and the patient's age ($X_3$) are significant for both skin cancers. The positive value of $\beta_2$ implies that the number of prior cancers has a positive relationship with the rate of new cancers. The negative value of $\beta_3$ suggests that older patients tend to have a lower recurrence rate of new cancers, which makes biological sense as older people have a slower metabolism. However, DFMO has different effects on the recurrence rates of the two skin cancers. For basal cell carcinoma, the recurrence rate is decreased by a factor of 1.209, but for squamous cell carcinoma, the recurrence rate is increased by a factor of 1.111. However, neither of these effects is significant due to the large posterior standard deviations.
The skin cancer data have also been analyzed by He et al. [8] and Zhang et al. [9], assuming the same covariate effects for both skin cancers. For comparison, we also modify step 4 in the Gibbs sampler in Section 3.3 and reanalyze the data with the same covariate coefficients for both cancers. The results are presented in Table A5. Overall, all the estimation results of the covariate coefficients for the three methods assuming common covariate effects are quite similar, with the Bayesian method producing smaller posterior standard deviations of the regression coefficients than the standard errors from the other two frequentist methods. The different effects of DFMO on the two skin cancers are completely hidden compared to the method where different covariate effects are assumed. The effect of the number of cancers prior to trial (X 2 ) is significant for all three methods. Our Bayesian method shows the significant effect of patient's age (X 3 ), while the other two frequentist methods do not. Our method and Zhang et al. [9]'s method show no significant effect of gender (X 4 ), but He et al. [8]'s method does.
Our methods can estimate the baseline mean functions simultaneously with the regression coefficients. Figure A4 presents the estimated baseline mean functions for basal cell carcinoma and squamous cell carcinoma, respectively. The solid line on the top and the broken line at the bottom represent the baseline mean functions of basal cell carcinoma and squamous cell carcinoma from the model, assuming different covariate effects. The two broken lines in the middle represent the baseline mean functions of the two cancers when assuming the covariate effects are the same. Clearly, under the common covariate effects assumption, the baseline mean functions are both pulled towards the middle, which blurs the difference of the mean functions between these two skin cancers.
Finally, Figure A5 displays the estimated baseline correlation function when the covariate effects are assumed different for the two skin cancers. It shows that the correlation between the two recurrent skin cancers is positive and strengthens over time.

Discussion
In this paper, we have proposed a Bayesian estimation approach for the semiparametric regression analysis of multivariate panel count data. For each type of event, the proportional mean model is adopted to model the cumulative mean count of the event, with its baseline mean function approximated by monotone I-splines. The correlation between multiple types of events is modeled by common frailty terms and scale parameters. Based on the novel Poisson data augmentation, an efficient and easy-to-implement Gibbs sampler is developed for the MCMC computation. Through the MCMC samples, the regression coefficients, the baseline mean functions, and the baseline correlation function can be estimated simultaneously. Simulation studies have shown that the proposed approach provides accurate estimation of the regression coefficients and the baseline mean functions, and have also demonstrated the importance of including the scale parameter in the model: it provides substantial flexibility for modeling the correlation while also improving the estimation performance.
Instead of using a single common frailty $w_i$ and scale parameters to model the correlation between counts of multiple events, another possible approach is to use a frailty vector $(w_{i1}, w_{i2}, \ldots, w_{iK})$ for each subject. The correlation between counts of multiple events is then modeled through the dependence structure of these $w_{ik}$. Bedair et al. [32] used copula-frailty models to analyze correlated recurrent events of different types. We may also borrow the copula-frailty idea to more flexibly model the correlation between counts of multiple events. In this paper, we assume the frailties follow a gamma distribution $G(\eta, \eta)$. The gamma frailty is the simplest and most popular frailty distribution in survival analysis. Under the mean-equal-to-1 restriction, there is only one parameter $\eta$ representing the frailty variance, which is easy to interpret when connecting to the correlation. Other frailties may also be used, such as the log-normal frailty and the inverse Gaussian frailty; the formula for the correlation and the sampling for the frailty would then be different. Other possible choices of frailties are discussed by Balan and Putter [33], who provide an excellent tutorial on frailty models.
Regarding choosing which event as the reference event, in this paper, we choose the event with the largest frailty variability as the reference event. One possible way to choose the reference event is to first conduct univariate panel count data analysis for each type of event, and then to choose the one with the largest frailty variability as the reference event.
For real data analysis, gamma frailty assumption may not hold for the reference event. For this, nonparametric frailties without assuming any specific distribution may be a more flexible choice.
Our approach can be slightly modified to accommodate common covariate effects for different types of events. However, the real data analysis shows that doing this would hide significant covariate effects for individual types of events. Therefore, we suggest assuming different covariate effects for different events first for real data analysis and then using the common covariate effect model later if similar effects are observed for different events.
Unlike the existing frequentist methods, our approach does not require model assumptions for the observation or censoring processes. Our approach is solely based on the observed likelihood and only needs the observed counts and observation times for the analysis, which makes it generic enough to deal with panel count data arising from different observation schemes, including dependent or independent censoring and/or observation processes. On the other hand, the proposed approach may lose a certain amount of efficiency by not incorporating the information in the observation or censoring processes when it is actually available.

Acknowledgments: The authors would like to thank Lianming Wang for the initial conceptualization of the paper.

Conflicts of Interest:
The authors declare no conflicts of interest.

Appendix A. Derivation of $\mathrm{Cov}(N_i^{(j)}(t), N_i^{(k)}(t))$
… $\beta_1^{(2)} = 1$, and $\beta_2^{(2)} = 1$. Bias refers to the difference between the average of the 500 posterior means and the true value; SD refers to the mean of the 500 posterior standard deviations; SE refers to the standard deviation of the 500 posterior means; and CP95 refers to the 95% coverage probability.

Table A4. Estimation results (posterior mean, posterior standard deviation and 95% credible interval) of the covariate effects for the two types of skin cancers. * indicates statistically significant.