Improved Small Sample Inference Methods for a Mixed-Effects Model for Repeated Measures Approach in Incomplete Longitudinal Data Analysis

The mixed-effects model for repeated measures (MMRM) approach has been widely applied in longitudinal clinical trials. Many of the standard inference methods for MMRM can inflate the type I error rates of tests of the treatment effect when the longitudinal dataset is small and involves missing measurements. We propose two improved inference methods for MMRM analyses: (1) the Bartlett correction with the adjustment term approximated by bootstrap, and (2) the Monte Carlo test using a null distribution estimated by bootstrap. These methods can be implemented regardless of model complexity and missing patterns via a unified computational framework. In simulation studies, the proposed methods maintained the type I error rate properly, even for small and incomplete longitudinal clinical trial settings. Applications to a postnatal depression clinical trial are also presented.


Introduction
Clinical trials for new drug development are often longitudinal trials in which outcome variables are repeatedly measured. In these trials, the primary analyses usually compare the treatment efficacy with a comparator at the end of a follow-up period. However, during the follow-up period, dropouts or missing outcome variables usually occur, and may seriously influence the validity and precision of the statistical inference. In addition, regulatory guidelines for preventing and treating the missing data in clinical trials have been issued [1][2][3], and adequate practices have been strongly pursued in recent years. Following these discussions, the mixed-effects model for repeated measures (MMRM) [4][5][6] has been widely applied for primary analyses of clinical trials in drug development. This type of model allows for valid statistical inference under incomplete longitudinal repeated measurements based on the direct likelihood approach.
Stats 2019, 2, 174-188; doi:10.3390/stats2020013; www.mdpi.com/journal/stats
MMRM is a type of linear mixed model (LMM) [7][8][9] that directly models the variance-covariance matrix of the longitudinal multivariate outcome variables [5], in which random effects are included as part of the marginal covariance matrix. One of the advantages of MMRM is that it enables flexible modelling of the correlation structure between time points to ensure the validity of inference in treatment efficacy. However, the ordinary inference methods for the LMM (e.g., the restricted maximum likelihood (REML) method) [10] are based on large sample approximations, and their validity breaks down under small or moderate sample settings [11,12]. For the MMRM, Gosho et al. [13] also showed the invalidity of the ordinary inference methods under small or moderate sample settings. To resolve these problems, several related works have been conducted regarding the conventional LMM. One solution is to adopt a higher-order asymptotic theory. Zucker et al. [14] studied the Bartlett correction [15] and the Cox-Reid adjusted likelihood [16], as well as their combination, for the likelihood ratio (LR) test. Lyons and Peters [17] and Guolo et al. [18] proposed a higher-order asymptotic approach by adapting Skovgaard's improved modified signed log-likelihood ratio [19] introduced by Barndorff-Nielsen [20]. Stein et al. [21] investigated the modified profile likelihood approach of Barndorff-Nielsen [20] based on the approximation method of Severini [22]. Although the improved methods by Stein et al. [21] performed well in their simulation studies, they compared their methods only with the naïve LR test. In addition, their improved methods require complicated case-by-case analytical calculations involving higher-order derivatives of the log-likelihood.
In particular, when applying the MMRM in longitudinal clinical trials, the marginal covariance structure is usually assumed to be a complicated form to circumvent model misspecifications, and calculations by Stein et al. [21] would not be realistic in practical use. Stein et al. [21] also investigated bootstrap-based approximation inferences, but their discussions and numerical evaluations were also limited within the conventional LMM framework, and the applications to MMRM for incomplete longitudinal studies were not discussed.
In this study, we proposed and investigated two improved inference methods involved in MMRM under small sample size and with incomplete data for longitudinal clinical trials. To circumvent the practical difficulties in implementing the analytical calculations, we adopted numerical approximations using bootstrap inferences [23][24][25]. The first method was the Bartlett correction [15] with the adjustment term approximated by bootstraps [23]. This approach can effectively circumvent case-by-case complicated analytical calculations and can be generally applied regardless of model complexity and missing patterns via a unified computational framework. In addition, the second involved the Monte Carlo test using empirical distribution constructed by the bootstrap; this was a straightforward approach, and is widely known to be an effective method under these situations. The resampling schemes of both methods allowed outcome variables to be incomplete, and we evaluated their validities and performances under practical situations of longitudinal clinical trials with missing data. In addition, we compared these methods with those of standard methods such as REML using Kenward-Roger's (KR) [11] method and unstructured covariance structure. We also assessed their practical effectiveness, illustrating their application to a postnatal depression clinical trial [26]. This paper is organized as follows: we first review the MMRM for longitudinal data analyses in Section 2. We then provide our approaches to improve the statistical inferences of MMRM in Section 3. We provide simulation evaluations in Section 4, and we apply our methods to the postnatal depression clinical trial data in Section 5. Lastly, we conclude with some discussion in Section 6.

Mixed-Effects Model for Repeated Measures (MMRM) for Longitudinal Data Analysis
Suppose subjects were randomized to two treatment groups (e.g., active drug vs. placebo), and a continuous outcome was measured repeatedly over $n$ time points. Let $N$ be the total number of subjects in the two groups, and let $n_i$ ($n_i \le n$) be the number of time points at which the outcome was observed for the $i$th individual. The primary analysis of interest was to compare the mean difference of the primary endpoint at the final ($n$th) time point. In this study, we supposed only monotone missing data for simplicity, but our discussions can be straightforwardly extended to non-monotone missingness.
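Let $Y_i$ denote the $n_i \times 1$ vector of outcomes observed for the $i$th subject. As a type of LMM, the model can be written as

$$Y_i = X_i \beta + Z_i b_i + \varepsilon_i,$$

where $X_i$ and $Z_i$ were the design matrices for the fixed effects $\beta$ and the random effects $b_i$, $b_i$ was distributed as $MVN(0, D)$, and $\varepsilon_i$ was the $n_i \times 1$ random error vector distributed independently as $MVN(0, \Sigma_i)$. $\Sigma_i$ was the $n_i \times n_i$ variance-covariance matrix. The random effect $b_i$ and the error $\varepsilon_i$ were independent, and all data between different subjects were also assumed to be independent. $Y_i$ marginally follows

$$Y_i \sim MVN(X_i \beta, V_i), \qquad V_i = Z_i D Z_i^{T} + \Sigma_i.$$

In the MMRM method, the variations explained by random effects were included as part of the marginal covariance matrix $V_i$ rather than being explicitly modelled as random effects [5]. An unstructured covariance matrix is often preferred as the structure of $V_i$ because no assumptions are made on the covariance structure [13]. For the statistical inferences of regression parameters, the restricted maximum likelihood (REML) method [10] has been routinely used in practice. In addition, although missing data are a common problem in longitudinal clinical trials, the validity of the inferences is assured under the missing at random (MAR) mechanism because MMRM adopts likelihood-based methods [27].

Likelihood Ratio (LR) test
To develop the improved inference methods, we first introduced the LR test for MMRM. Let $\beta = (\beta_1, \beta_c^{T})^{T}$, where $\beta_1$ was the parameter of interest and $\beta_c$ was the vector of the remaining regression coefficients, and let $\upsilon$ denote the parameters of the covariance matrix $V_i$. The LR test statistic for the null hypothesis $\beta_1 = \beta_1^{null}$ was

$$T(\beta_1^{null}) = 2\{ l(\hat{\beta}_1, \hat{\beta}_c, \hat{\upsilon}) - l(\beta_1^{null}, \tilde{\beta}_c, \tilde{\upsilon}) \},$$

where $(\hat{\beta}_1, \hat{\beta}_c, \hat{\upsilon})$ were the maximum likelihood (ML) estimates of $(\beta_1, \beta_c, \upsilon)$ and $(\tilde{\beta}_c, \tilde{\upsilon})$ were the constrained ML estimates under the null hypothesis. Also, $l(\beta_1, \beta_c, \upsilon)$ was the log-likelihood function for MMRM,

$$l(\beta_1, \beta_c, \upsilon) = -\frac{1}{2} \sum_{i=1}^{N} \left\{ n_i \log(2\pi) + \log|V_i| + (Y_i - X_i \beta)^{T} V_i^{-1} (Y_i - X_i \beta) \right\}.$$

The ML and the constrained ML estimates were computed by using this log-likelihood function.
Asymptotically, the LR test statistic $T(\beta_1^{null})$ followed the chi-squared distribution with 1 degree of freedom under the null hypothesis [28].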
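As an illustration of the naïve LR test described above, the following minimal Python sketch turns two maximized log-likelihood values into the LR statistic and its asymptotic p-value. The function names and the numerical log-likelihood values are hypothetical and are not taken from the paper; the chi-squared tail probability with 1 degree of freedom is computed through the identity P(X > t) = erfc(sqrt(t / 2)).

```python
import math


def lr_statistic(loglik_full: float, loglik_null: float) -> float:
    """LR statistic: twice the gap between the unconstrained and the
    constrained maximized log-likelihoods (floored at 0 against rounding)."""
    return max(0.0, 2.0 * (loglik_full - loglik_null))


def chi2_1df_pvalue(t: float) -> float:
    """Upper-tail probability of a chi-squared(1) variable at t."""
    return math.erfc(math.sqrt(t / 2.0))


# Hypothetical maximized log-likelihoods, for illustration only.
t_obs = lr_statistic(loglik_full=-250.0, loglik_null=-252.5)
p_naive = chi2_1df_pvalue(t_obs)
print(f"T = {t_obs:.2f}, asymptotic p-value = {p_naive:.4f}")
```

The small-sample methods discussed in the following sections keep this statistic and change only the reference distribution used to judge it.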

Bartlett-Type Adjustment by Bootstrap Resampling
Conventionally, it is widely known that the large sample approximation of the LR test statistic $T(\beta_1^{null})$ to the chi-squared distribution is not accurate under small sample settings [23]. To improve the approximation, several higher-order approaches have been developed, and the Bartlett correction [15] is one of the effective solutions. The Bartlett correction improves the approximation to the reference chi-squared distribution by dividing the LR test statistic by a correction term. The correction term is an estimate $\hat{\xi}$ of the first moment of the null distribution of the LR test statistic, $\xi = E[T(\beta_1^{null})]$, and the corrected LR test statistic is given by $T(\beta_1^{null})/\hat{\xi}$. Intuitively, if the estimate $\hat{\xi}$ is accurate, the null distribution of the corrected statistic approaches the chi-squared distribution. Theoretically, Barndorff-Nielsen and Hall [29] showed that the Bartlett correction reduces the error of the chi-squared approximation from $O(N^{-1})$ to $O(N^{-2})$.
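The rescaling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the correction term is taken to be the sample mean of a set of LR statistics generated under the null hypothesis, and both the list of null statistics and the observed statistic are hypothetical values, not results from the paper.

```python
import math
from statistics import fmean


def bartlett_corrected_statistic(t_obs, null_stats):
    """Divide the observed LR statistic by an estimate of its null mean.

    null_stats: LR statistics computed on data generated under the null
    (hypothetical input here). Returns (corrected statistic, xi_hat).
    """
    xi_hat = fmean(null_stats)
    return t_obs / xi_hat, xi_hat


def chi2_1df_pvalue(t):
    """Upper-tail probability of a chi-squared(1) variable at t."""
    return math.erfc(math.sqrt(t / 2.0))


# Hypothetical values, for illustration only.
null_stats = [0.1, 0.4, 0.9, 1.3, 2.8]
t_corr, xi_hat = bartlett_corrected_statistic(4.4, null_stats)
p_naive = chi2_1df_pvalue(4.4)
p_corrected = chi2_1df_pvalue(t_corr)
print(xi_hat, t_corr, p_naive, p_corrected)
```

When the estimated null mean exceeds 1, the corrected statistic is smaller than the original one, so the corrected p-value is larger than the naïve one.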
In this study, we proposed a practical procedure to apply the Bartlett correction to MMRM effectively for incomplete longitudinal data under small sample size. Many previous studies attempted to derive the Bartlett correction term analytically [14,30,31]. However, analytical forms of the correction term were not necessarily obtainable when complicated models were assumed and missing data were involved. As an alternative, Rocke [32] proposed a resampling approach, which adopted the parametric bootstrap method to estimate the Bartlett correction term $\xi$. Here, we proposed to apply this resampling approach to improve the inferences of MMRM. The resampling approach may involve a computational burden, but it has the advantage that it can be implemented using generic algorithms regardless of the complexity of the regression model and covariance structure. The resampling-based procedure was formulated as the following Algorithm 1.

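Algorithm 1 Bartlett correction using bootstrap resampling technique.
(1) For the MMRM model, compute the constrained ML estimates $(\tilde{\beta}_c, \tilde{\upsilon})$ under $\beta_1 = \beta_1^{null}$.
(2) Resample $Y_1^{(b)}, \ldots, Y_N^{(b)}$ from the estimated null distribution of the MMRM model, with the parameters substituted with $(\beta_1^{null}, \tilde{\beta}_c, \tilde{\upsilon})$, via parametric bootstrap reflecting the missing patterns of the original data.
(3) Compute the LR test statistic $T^{(b)}(\beta_1^{null})$ for the $b$th bootstrap sample.
(4) Repeat processes 2 and 3 $B$ times to obtain the bootstrap LR test statistics $T^{(1)}(\beta_1^{null}), \ldots, T^{(B)}(\beta_1^{null})$.
(5) Estimate the correction term by the bootstrap mean, $\hat{\xi} = B^{-1} \sum_{b=1}^{B} T^{(b)}(\beta_1^{null})$, and compare the corrected statistic $T(\beta_1^{null})/\hat{\xi}$ with the chi-squared distribution with 1 degree of freedom.

The second proposed method was the Monte Carlo test based on a parametric bootstrap technique. This method used the Monte Carlo estimate of the null distribution as the reference distribution of the LR test, instead of the chi-squared distribution. This approach would be an alternative to the former proposed method, with the same advantages for inference in small sample settings. With processes 1-4 of Algorithm 1, we had the bootstrap LR test statistics $T^{(1)}(\beta_1^{null}), \ldots, T^{(B)}(\beta_1^{null})$. The Monte Carlo estimate of the null distribution was obtained as the empirical distribution of these bootstrap statistics. Also, the bootstrap-based critical value of the nominal $\alpha$ level ($0 < \alpha < 1$) corresponded to the upper $\alpha$th quantile of the empirical distribution function. The Monte Carlo test can be constructed by the following Algorithm 2.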
Algorithm 2 Bootstrap-based adjustment of LR test.
(1) Perform processes 1-4 of Algorithm 1 to obtain the bootstrap LR test statistics.
(2) Calculate the p-value by the following formula [24].
Here, I(x) is an indicator function, and it returns 1 if x is true and 0 otherwise.
Also, $100 \times (1 - \alpha)\%$ confidence intervals can be constructed as the set of $\beta_1^{null}$ values that fulfill $T(\beta_1^{null}) \le \hat{q}_{bs,(1-\alpha)}$ [36], where $\hat{q}_{bs,(1-\alpha)}$ denotes the upper $\alpha$th quantile of the estimated null distribution. According to Rocke [32], more than 1000 resamplings are recommended when estimating the tail of a distribution, such as the upper $\alpha$th quantile of the null distribution.
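The Monte Carlo test and the bootstrap critical value can be sketched as follows, assuming that the bootstrap LR statistics have already been computed. Several points are assumptions of this sketch rather than statements from the paper: the p-value uses the common convention of adding 1 to both the numerator and the denominator, the quantile is taken as an order statistic without interpolation, and the bootstrap statistics are replaced by squared standard normal draws purely as a stand-in for statistics that would come from refitting the MMRM on each resample.

```python
import math
import random


def monte_carlo_pvalue(t_obs, boot_stats):
    """Proportion of bootstrap statistics at least as large as t_obs,
    using the (1 + count) / (B + 1) convention (an assumption here)."""
    exceed = sum(1 for t in boot_stats if t >= t_obs)
    return (1 + exceed) / (len(boot_stats) + 1)


def upper_quantile(boot_stats, alpha):
    """Empirical upper alpha-th quantile (order statistic, no interpolation)."""
    ordered = sorted(boot_stats)
    k = math.ceil((1.0 - alpha) * len(ordered)) - 1
    return ordered[max(0, min(k, len(ordered) - 1))]


# Stand-in for bootstrap LR statistics: squared N(0, 1) draws, which follow
# a chi-squared(1) distribution. Real statistics would come from the model.
rng = random.Random(2019)
boot_stats = [rng.gauss(0.0, 1.0) ** 2 for _ in range(3000)]

t_obs = 4.2  # hypothetical observed LR statistic
p_mc = monte_carlo_pvalue(t_obs, boot_stats)
q95 = upper_quantile(boot_stats, alpha=0.05)
print(f"Monte Carlo p-value = {p_mc:.4f}, bootstrap critical value = {q95:.3f}")
```

A confidence interval would then collect the candidate null values whose LR statistic does not exceed the bootstrap critical value, which in practice requires repeating the procedure over a grid of candidate values.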

Design and Setting
We conducted a series of simulation studies to assess the performances of the two methods, the Bartlett-type correction for the LR test (LR Bart) and the bootstrap adjustment for the LR test (LR Boot), under practical situations of small longitudinal clinical trials. We compared these methods with the ordinary LR test and the Kenward-Roger (KR) method [11], which is the current standard inference method in MMRM analyses. We considered the same simulation scenarios as Gosho et al. [13], who conducted extensive simulations to evaluate the performances of MMRM for longitudinal clinical trials. We supposed two-group comparative longitudinal clinical trials (e.g., active drug group vs. placebo group) with 7 post-baseline visits ($n = 7$). The total number of subjects was $N = 20$ (i.e., 10 subjects per group). The outcome variables $Y_{it}$ were generated from the following model,

$$Y_{it} = \mathrm{mean}_{it} + \mathrm{subject}_{i} + \mathrm{error}_{it},$$

where $\mathrm{mean}_{it}$ was a fixed effect, $\mathrm{subject}_{i}$ was a subject effect, and $\mathrm{error}_{it}$ was a random error ($i = 1, 2, \ldots, N$; $t = 1, 2, \ldots, 7$). The mean values of $Y_{it}$ followed the four scenarios illustrated in Figure 1. Here, we were interested in evaluating the mean difference between the two groups at the final (7th) time point. Scenarios 1 and 2 corresponded to the null hypothesis that the mean values of the outcome variables were the same between the two groups at the final time point. By contrast, in scenarios 3 and 4, the mean value at the final time point differed between the two groups, which corresponded to the alternative hypothesis.

Correlation Structures
For the variance-covariance structure of the error terms, a first-order heterogeneous autoregressive (ARH(1)) structure was adopted, of which the $(t, t')$ element is defined as $\sigma_t \sigma_{t'} \rho^{|t' - t|}$, where $\sigma_t$ and $\sigma_{t'}$ are the standard deviations at the $t$th and $t'$th time points and $\rho$ is the correlation coefficient between adjacent time points. Following Gosho et al. [13], the diagonal elements $\sigma_t^2$ were set to $9\{1 + 3(t - 1)/6\}$ and the correlation coefficient $\rho$ was set to 0.7. Also, the subject effect was generated from $N(0, 3^2)$.
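To make the covariance setting concrete, the following Python sketch builds the ARH(1) matrix of the error terms from the values stated above and adds the subject-effect variance to obtain the implied marginal covariance of the seven outcomes. The function names are hypothetical, and the marginal covariance follows from assuming that the subject effect and the errors are independent.

```python
import math

N_VISITS = 7
RHO = 0.7
SUBJECT_VAR = 3.0 ** 2  # subject effect generated from N(0, 3^2)


def arh1_covariance(n_visits=N_VISITS, rho=RHO):
    """ARH(1) matrix with element (t, t') = sd_t * sd_t' * rho^|t' - t|
    and variances sd_t^2 = 9 * {1 + 3 * (t - 1) / 6}."""
    sd = [math.sqrt(9.0 * (1.0 + 3.0 * (t - 1) / 6.0))
          for t in range(1, n_visits + 1)]
    return [[sd[s] * sd[t] * rho ** abs(s - t) for t in range(n_visits)]
            for s in range(n_visits)]


def marginal_covariance(subject_var=SUBJECT_VAR):
    """Covariance of (Y_i1, ..., Y_i7): the shared subject effect adds
    its variance to every element of the error covariance matrix."""
    return [[v + subject_var for v in row] for row in arh1_covariance()]


sigma = arh1_covariance()
v_marginal = marginal_covariance()
print([round(sigma[t][t], 2) for t in range(N_VISITS)])
```

The error variance grows linearly from 9 at the first visit to 36 at the seventh, so the outcome at the final time point is the noisiest one.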

Missing-data Mechanism
In this simulation, we considered two missing-data mechanisms, missing completely at random (MCAR) and missing at random (MAR). Only monotone missingness was assumed, i.e., once missingness occurred, all outcome values after that time point were missing for the corresponding individual. We denoted the probability of missingness of $Y_{it}$ as $\lambda_{it}$, which was assumed to follow a logistic regression model. The regression coefficient $\gamma_1$ of this model was defined as $\gamma_1 = 0$ for MCAR and $\gamma_1 = -1$ for MAR. Table 1 summarizes the missing-data mechanisms and the coefficients of the logistic regression model that gave the defined dropout rate for each treatment group at the final time point. The total dropout rates for the two groups were set to 0%, 20% or 40% for the four scenarios.
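The monotone dropout mechanism can be sketched as below. The exact linear predictor of the logistic model is not reproduced in the text above, so this sketch assumes, purely for illustration, that the dropout probability at a visit depends on the outcome observed at the previous visit through an intercept `gamma0` and a slope `gamma1`. Under that assumption, a zero slope gives dropout that ignores the data (MCAR) and a non-zero slope gives dropout that depends on observed outcomes only (MAR). All names and values are hypothetical.

```python
import math
import random


def dropout_probability(prev_outcome, gamma0, gamma1):
    """Assumed form: logit(lambda) = gamma0 + gamma1 * previous outcome."""
    return 1.0 / (1.0 + math.exp(-(gamma0 + gamma1 * prev_outcome)))


def apply_monotone_dropout(outcomes, gamma0, gamma1, rng):
    """Return a copy of outcomes where, after the first dropout,
    every later value is set to None (monotone missingness)."""
    observed = list(outcomes)
    for t in range(1, len(observed)):
        if rng.random() < dropout_probability(outcomes[t - 1], gamma0, gamma1):
            for s in range(t, len(observed)):
                observed[s] = None
            break
    return observed


rng = random.Random(7)
complete = [1.0, 0.5, -0.2, 0.8, 1.4, 0.3, -0.6]  # hypothetical outcomes
with_dropout = apply_monotone_dropout(complete, gamma0=-2.0, gamma1=-1.0, rng=rng)
print(with_dropout)
```

In a simulation, the intercept would be tuned so that the proportion of subjects without a final-visit outcome matches the target dropout rate.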

Analysis Methods
The simulated data were analysed using the four methods mentioned above. We adopted the standard MMRM that included a group variable, time variables as dummy variables, and the group-by-time interactions in the regression function. An unstructured covariance structure was adopted for the outcome variables. Parametric bootstraps for the two proposed methods were performed with 3000 resamplings. The results concerning convergence of the MMRM analyses are reported in Appendix A. The number of simulations was 1000 for each scenario. All computations were performed with SAS ver. 9.4, and the significance level was set to 0.05.

Results
Figure 2 shows the type I error rates for scenarios 1 and 2 under N = 20 (i.e., 10 subjects per group). The blue dashed lines correspond to the 95% intervals of the Monte Carlo errors. First, the type I error rates of the LR test increased well above 5% as the dropout rate increased. In scenario 1 with a 40% dropout rate, the type I error rates of the LR test were 11.3% under MCAR and 9.8% under MAR. In contrast, the type I error rates of LR Bart and LR Boot were maintained at around 5% irrespective of the missing-data mechanism and dropout rate. For example, the type I error rates under scenario 1 with a 40% dropout rate under MAR were 5.5% and 5.6% for LR Bart and LR Boot, respectively. On the other hand, the type I error rates of the KR method were not maintained at around 5% under MAR and were too conservative. Under MAR with a 40% dropout rate, the type I error rates of the KR method were 3.6% and 3.5% for scenarios 1 and 2, respectively. Under the MCAR scenarios, the type I error rates of the KR method were maintained at around 5%. The convergence rates of these methods were not significantly different (see Appendix A). Note that the type I errors for the LR, LR Bart and LR Boot were inflated under scenario 1 with dropout rate 40%, but they fell within the ranges of Monte Carlo errors.
The results of the convergence for scenarios 1 and 2 appear in the appendix as Figure A1.
Figure 3 shows the powers in scenarios 3 and 4 for N = 20 (i.e., 10 subjects per group). First, the power of the LR test was higher than that of the other methods, ranging from approximately 14% to 20% depending on the dropout rate. However, since the type I error rates of the LR test were not maintained at 5% under scenarios 1 and 2, it should be considered to have liberal properties in general. For the three methods other than the LR test, the power decreased as the dropout rate increased, due to the reduction of available statistical information. In scenario 3, with 10 subjects per group and a 40% dropout rate under MAR, the powers of LR Bart, LR Boot and KR were 7.1%, 7.4% and 6.5%, respectively. The overall trends in the powers of the four methods agreed with the rejection rates under scenarios 1 and 2, although they depended on the sample size and effect sizes.
The results of the convergence for scenarios 3 and 4 appear in the appendix as Figure A2.
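To help read the reported rejection rates against the Monte Carlo error, the sketch below computes a 95% interval around the nominal 5% level for 1000 simulations. It assumes the usual normal approximation to the binomial proportion; whether the intervals drawn in Figure 2 were computed in exactly this way is an assumption, and the function name is hypothetical.

```python
import math


def monte_carlo_interval(p, n_sim, z=1.959964):
    """Normal-approximation interval for a rejection rate whose true
    value is p, estimated from n_sim independent simulated trials."""
    half_width = z * math.sqrt(p * (1.0 - p) / n_sim)
    return p - half_width, p + half_width


lower, upper = monte_carlo_interval(p=0.05, n_sim=1000)
print(f"95% Monte Carlo interval: {100 * lower:.2f}% to {100 * upper:.2f}%")

# Reported type I error rates (scenario 1, MAR, 40% dropout) from the text.
for label, rate in [("LR Bart", 0.055), ("LR Boot", 0.056), ("KR", 0.036)]:
    inside = lower <= rate <= upper
    print(label, "inside interval" if inside else "outside interval")
```

Under this approximation the interval runs from roughly 3.65% to 6.35%, so the rates of 5.5% and 5.6% for the proposed methods lie inside it, while the rates of 3.6% and 3.5% for the KR method fall slightly below its lower limit.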

Application: Postnatal Depression Trial
Postnatal depression is commonly treated with antidepressants and counselling. Transdermal administration of estrogen has also been shown to be effective, and Gregoire et al. [26] conducted a double-blind, placebo-controlled study in 61 women within 3 months of giving birth [26,37]. Although the study planned to enroll 100 subjects, eventually 61 women were randomly assigned to the placebo group (27 subjects) or the estrogen group (34 subjects). The women were assessed twice prior to treatment and then monthly for 6 months after treatment using the Edinburgh postnatal depression scale (EPDS), with higher scores indicating more severe depression. Approximately 37.0% (10/27) of subjects in the placebo group and 17.6% (6/34) of subjects in the estrogen group had missing EPDS scores at the final time point. All data had monotone missing patterns.
The baseline EPDS score was defined as the average of the scores at the 1st and 2nd months in this study. The outcome variables were measured on the visits between the 3rd and 8th months. We considered analysing this longitudinal dataset using MMRM with a regression function for $Y_{it}$, the EPDS score for participant $i$ ($i = 1, 2, \ldots, 61$) on the $t$th occasion ($t = 1, 2, \ldots, 8$), that included $G_i$, a dummy variable that equals 1 if participant $i$ belongs to the estrogen group and 0 otherwise. For the covariance structure of the outcome variables, we assumed the unstructured structure. Here, our primary interest was the evaluation of the mean difference of the outcome variables at the final time point. In addition, we considered a subgroup analysis for participants with clinically severe depressive symptoms, defined as baseline EPDS > 21 [38]. There were 30 participants in the subgroup (15 participants in each of the placebo and estrogen groups). At the final month, the proportions of dropout were 40.0% (6/15) and 20.0% (3/15) for the placebo and estrogen groups, respectively. At baseline, the mean EPDS scores of the placebo and estrogen groups were 21.26 (3.11) and 21.59 (3.06), respectively. Table 2 summarizes the mean difference estimates of EPDS scores at the final month, as well as their 95% confidence intervals and p-values by the conventional and proposed methods. We added the t-test on the single time point analysis at the final month as a reference. The number of resamplings for the proposed methods was set to 3000. In the whole population analysis, all five methods showed significant differences and provided similar estimates. However, the p-value of the LR test was slightly smaller than those of the proposed methods, and that of KR was slightly larger. These trends might correspond to the liberal and conservative properties of these methods.
These trends became clearer in the subgroup analysis for the participants with severe symptoms. Only the LR test showed a significant difference, and the other four methods provided non-significant results. The p-values of LR Bart and LR Boot were 0.0586 and 0.0583, but that of KR was 0.1288. These results might reflect the conservative property of KR, which was possibly improved by the two proposed methods. In addition, the t-test for the subgroup analysis provided a larger p-value (0.2349) with a considerably smaller estimate and larger standard error. Previous numerical evidence (e.g., Ashbeck and Bell [39]) showed possible bias and information reduction of the single time point analysis by t-test, and this result might correspond to this evidence. With LR Bart and LR Boot, the computational times were 55 and 38 minutes for the whole population and the subgroup, respectively (we used a general laptop computer with an Intel (R) Core (TM) i7-6500U and SAS 9.4). The computational times would be dramatically improved by applying parallel computation techniques. Figure 4 shows the histograms of the empirical distributions of the LR test statistics resampled by the parametric bootstrap method under the null effect hypothesis. The mean values of the empirical distributions are designated by the vertical blue dashed line in each histogram and were 1.09 and 1.20 in the whole population and the subgroup, respectively. In addition, the 95th percentiles of the empirical distributions were 4.14 and 4.55 for the whole population and the subgroup, respectively. If the chi-squared approximation were accurate, the mean and 95th percentile of the null distribution would be expected to be 1.0 and 3.84. These results suggest that the null distribution of the LR test statistic is shifted in incomplete longitudinal data with a small sample size and that adequate corrections are needed. The proposed methods would improve the approximations and enable improvements of the inferences as shown in the simulations.
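The reference values quoted above can be checked with a short Python sketch. It uses the fact that a chi-squared variable with 1 degree of freedom is a squared standard normal, so its 95th percentile is the square of the 97.5th normal percentile. The empirical means and percentiles are the values reported in the text; the variable names, and the comparison of a Bartlett-rescaled critical value with the bootstrap percentile, are additions of this sketch.

```python
from statistics import NormalDist

# 95th percentile of chi-squared(1): P(Z^2 <= q) = 0.95 means
# P(|Z| <= sqrt(q)) = 0.95, so sqrt(q) is the 97.5th normal percentile.
chi2_q95 = NormalDist().inv_cdf(0.975) ** 2
chi2_mean = 1.0  # mean of chi-squared(1)

# Empirical (bootstrap) mean and 95th percentile reported in the text.
reported = {"whole population": (1.09, 4.14), "subgroup": (1.20, 4.55)}

for label, (boot_mean, boot_q95) in reported.items():
    # Rejecting when T / xi_hat > chi2_q95 is the same as T > xi_hat * chi2_q95.
    bartlett_critical = boot_mean * chi2_q95
    print(f"{label}: mean ratio = {boot_mean / chi2_mean:.2f}, "
          f"Bartlett-scaled critical value = {bartlett_critical:.2f}, "
          f"bootstrap 95th percentile = {boot_q95:.2f}")
```

On the original scale, the Bartlett-type rescaling corresponds to critical values of about 4.19 and 4.61, close to the bootstrap percentiles of 4.14 and 4.55, which is in line with the similar p-values of the two proposed methods.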

Discussion
MMRM with the KR method has been widely applied as a standard analysis method for longitudinal clinical trials. If a sufficiently large sample is available, statistical tests and confidence intervals based on large sample theory pose no problems. However, the asymptotic approximations may not be appropriate with small sample sizes. In addition, most clinical trials involve missing data. To improve the validity of statistical inference in such settings, we proposed resampling-based approaches. Through the simulations and real data applications, we demonstrated the effectiveness of the proposed methods compared with existing standard methods.
In the simulation experiments, the KR method and our proposed methods maintained almost the same type I error rate under MCAR, which was close to 5%. However, under MAR scenarios with large dropout rates, the KR method had a conservative type I error rate compared with our proposed methods. Our proposed methods might therefore have an advantage over KR when the missing-data mechanism is MAR. In addition, it should be noted that Algorithm 1 uses bootstrap samples to estimate the mean of the null distribution, whereas Algorithm 2 uses them to estimate a quantile of the null distribution. In general, the latter is a more unstable quantity for Monte Carlo inference and thus requires a larger number of resamplings [23]. In our simulation studies, 3000 resamplings were performed for both the LR Bart and LR Boot methods, and we obtained similar results. The number 3000 was determined considering Monte Carlo errors and would be sufficient. Although this might impose a large computational burden, it would not be problematic in a modern computational environment, in which parallel computations are available for standard statistical software.
In addition, another possible approach to be considered in future research is the Bayesian approach. The Bayesian method might also accommodate small sample sizes if the prior distributions are chosen appropriately. The advantages and potential drawbacks are discussed in Van De Schoot et al. [40]. Another concern is the extension to multi-parameter inference. However, the proposed methods are quite general and could be straightforwardly extended to multi-parameter inference.
The effectiveness of our two proposed resampling approaches for MMRM was clearly shown through simulation studies and real data applications. To assure scientific validity in the development of new drugs and health technologies, accurate statistical inference methods are essential tools. The proposed methods can be applied as effective options in statistical analyses for small and incomplete longitudinal clinical trials.