A Brief Overview of Restricted Mean Survival Time Estimators and Associated Variances

: Restricted Mean Survival Time ( RMST ) experiences a renaissance and is advocated as a model-free, easy to interpret alternative to proportional hazards regression and hazard rates with implication in causal inference. Estimation of RMST and associated variance is mainly done by numerical integration of Kaplan–Meier curves. In this paper we brieﬂy review the two main alternatives to the Kaplan–Meier method; analysis based on pseudo-observations, and the ﬂexible parametric survival method. Using computer simulations, we assess the efﬁcacy of the three methods compared to a fully parametric approach where the distribution of survival times is known. Thereafter, the three methods are directly compared without any distributional assumption for the survival data. Generally, ﬂexible parametric survival methods outperform both competitors, however the differences are small.


Introduction
In 1958, Kaplan and Meier [1] published their paper on the product limit estimator for survival probabilities, an estimator that can handle censored and/or truncated survival data. The easy to use and interpretable nature of the method, made it an instant success. The Kaplan-Meier estimator have since then become the dominant method to summarize survival data and parametric methods have lost relevance. This prompted Miller [2] to criticize this over-reliance of applied scientist on the Kaplan-Meier curve. Miller established, both theoretically and numerically, the loss of efficacy of the Kaplan-Meier curve compared to parametric methods. Jullum and Hjort [3] further explored this aspect and corroborated Miller's findings. Meier et al. [4] have addressed some of Miller's critique, by highlighting the difficulties in choosing the right parametric form. In addition to the Kaplan-Meier curve, Meier et al. [4] have addressed the efficacy of a functional of the Kaplan-Meier curve: the Restricted Mean Survival Time (RMST), the average survival time up to a given time point. Theoretical aspects of RMST (or Kaplan-Meier integrals) are well studied [5][6][7] and currently RMST is experiencing a renaissance [8][9][10] being hailed as a model-free, easy to interpret statistic with implications in causal inference [11]. Not needing to establish the proper parametric model has certain advantages. However, generally non-parametric methods have lower efficacy than parametric ones. Generally, practical applications do not consider parametric estimators for RMST and use either numerical integration of the Kaplan-Meier curve [12], pseudo-observations [13][14][15] or flexible parametric survival methods [16]. In this paper, we briefly summarize parametric and non-parametric Stats 2020, 3 estimators for RMST and associated variance estimators. We will use Monte Carlo simulations to examine their efficacy.

Restricted Survival Times
We assume that survival times to an event of interest X 1 , . . . , X n are independently and identically distributed (iid) according to the cumulative distribution function F(x) and survival function S(x).
Similarly we assume C 1 , . . . , C n to be iid censoring times according to the distribution function G(c). Additionally, we assume that follow-up times are restricted, by an upper threshold τ set by the researcher. Thus the actual observed time for subject j is T j = min X j , C j , τ . Additionally, we have δ j = I X j ≤ min(C j , τ) as event indicator that takes a value of 1 if the event of interest is recorded at the given time, 0 otherwise. We assume independence between failure and censoring time. We let t (1) ≤ ... ≤ t (n) denote the ordered observed survival times and δ (1) , ..., δ (n) their associated indicator values. Survival times are summarized with the help of the survival function S(t) = P(T > t) and the τ-restricted mean survival times ar given by τ 0 S(t) dt [12]. From Andersen et al. [14] we know that In the next sections we review, then evaluate, different estimators for τ 0 S(t) dt and σ 2 .

Estimators of RMST and Associated Variance
The statistical literature describes several parametric and non-parametric ways of estimating τ 0 S(t) dt and its associated variance.

Parametric Methods
For parametric estimation, we need to make assumptions about the distribution of the unobserved survival times, X. Once the distribution function is established, the Restricted Mean Survival Time (µ τ ) is estimated as where θ is the parameter vector for the assumed distribution function.
In the following, we outline three possibilities for parametric estimation 1. Likelihood and δ-method based approach. Estimation of τ 0 S(θ, t) dt requires knowledge of θ. Estimates for θ under censoring can be obtained with maximum likelihood. Given the censoring indicator I {X j ≤C j } and Y j = min(X j , C j ) we can define the likelihood function for θ as For most cases there are no closed-form solutions forθ and Var(θ), however, numerical estimates are enough. Once estimates for θ are available, τ 0 S(θ, t) dt can be estimated either by usingθ in closed form equations, or as plug-ins in numerical integration. The variance for τ 0 S(θ, t) dt by the δ-method is given by where θ i is the i th element of the population parameter. 2. M-estimation (or estimating equation) M-estimation seeks solution to the vector equation ∑ n i=1 ψ(T i ,θ) = 0. Here, T i are independently and identically distributed restricted survival times,θ is a p-dimensional parameter and ψ is a known (p × 1)-function that does not depend on i or n [17]. M-estimates are asymptotically normal with variance n −1 V(θ 0 ). V(θ 0 ) is a sandwich variance estimator given by where, and Wang [18] discusses alternative formulations for ψ with regards to the strictly non-negative and often skewed nature of survival data. For an in-depth discussion on the M-estimators the reader is referred to [17,19].

The second cumulant
The variance equivalent to the second cumulant of a probability distribution of T j . If there is no censoring present Var(T) = E τ [T 2 ] − E τ [T] 2 and can be estimated as This estimator is not practical due to censoring. When we have censoring in the data Rosyton & Parmar [20] suggested multiplying n −1 Var(T) with a positive scaling factor φ, so that φ = 1 if no censoring and φ > 1 otherwise. The scaling factor φ can be estimated with help of Monte Carlo simulation. Alternatively, the variance can be estimated with the Stute estimator [6]. Adopting the notation of Stute [21], (i.e., xdF( Both approaches require assumptions about the distribution of the survival times and censoring times. These two estimators of variance will not be further evaluated, however they do represent an important aspect of RMST.

Flexible Parametric Survival Methods
In 2002, Royston and Parmar [22] introduced flexible parametric survival models as a simple way to deal with non-proportional hazards, and as a way of easy visualization of the hazard function. In a later paper the authors exemplified the utility of the model in estimation of RMST [16]. The model is defined through the log cumulative hazard function as where the baseline (H 0 ) is modeled as a restricted cubic spline in log time, ln H 0 (t) = s(ln t), and where z is a covariate vector. The splines s(ln t) are linear combination of basis functions and regression parameters. The model requires a number of knots and a probability distribution function. This later is often assumed to be the Weibull distribution, which we used in this article, with 2 knots. RMST is estimated by predicting the log cumulative hazard function using Equation (4) at a suitable grid of time points equally spaced between 0 and τ. Thereafter the cumulative hazard is transformed into the survival functionŜ RP (t), and µ τ = τ 0Ŝ RP (t) dt. The variance is estimated with the δ-method or by bootstrapping.
While this approach is fully parametric and the survival function is completely specified, the model is ought to be robust against distribution misspecification. Integration proceeds with the formula of the survival function of the assumed distribution, but with numerical integration for the area underŜ RP (t) between 0 and τ. Additionally, the splines and the piece-wise modeling between the knots offer great flexibility to the model.

The Kaplan-Meier-Method
The Kaplan-Meier method is perhaps the most well-known and used method for estimation of RMST. In 1975, Meier [12] proved that the substitution of the Kaplan-Meier estimator in τ 0 S(t) dt provides an (approximately) unbiased estimate that is asymptotically normal.
The Kaplan-Meier estimator is given bŷ where T i is the ordered follow up time, R i is the number of subjects at risk prior to T i and δ i the number of events that happened at time T i . The RMST is estimated as As far as we know, Irwin in 1949 [23], was the first to provide an estimator for the variance based on life tables. Later, Meier [12] in 1975 proved that reducing the data to the life table format was unnecessary and established the following variance estimator is the sum of the squared coefficient of variation for S(u). Klein & Moeschberger [24] provided a variant with the Greenwood variance estimator of survival probabilities as plug-ins and popularized Meier's variance estimator in the following form: The Aalen -Johansen variance estimator of survival probabilities could also serve as plug-in in Meier's estimator, or re-sampling methods can be used as well. However, as Eaton et al. [25] noted based on Monte Carlo simulations, the estimator with the Greenwood plug-in give variance estimates closest to empirical and asymptotic variances.

Pseudo-observations
Pseudo-observations is a method that is based on the pseudo-values from a jackknife statistic constructed from simple summary statistic estimates which are then used in a generalized estimating equation to obtain estimates of the model parameters [13,15].
The parameter of interest is µ τ . If there is no censoring, then µ τ = n −1 ∑ i T i . Under censorship, an approximately unbiased estimator for µ τ is τ 0Ŝ (t)dt. The pseudo-observation T * j associated with the j th observed time is estimated as where S −j is the survival curve estimated withouth the j th observation. The assumption is that The approach preferred by Andersen et al. [13] is based on M-estimators. The usual assumption is that the underlying distribution is symmetric about the location parameter and the corresponding function ψ is an odd function for the location parameter (mean in our case, θ 1 ) and an even function for the targeted scale parameter (variance, θ 2 ) defined as Solving for V (θ 0 ) (Equation (1)) starts with solving A(θ 0 ) (Equation (2)) and B(θ 0 ) (Equation (3)). In this case A(θ 0 ) = I(θ 0 ), the identity matrix. Solving for B(θ 0 ) gives For asymmetric distributions (i.e., µ 3 > 0) there is an asymptotic variance inflation due to estimating variance in addition to the mean [19].

Simulation Studies
We used Monte Carlo simulations to gain insight of the small sample proprieties of the estimators. We used the R computing environment (R 3.6.0 ) [26] for calculations and illustrations. Codes are available from the corresponding author.

Relative Efficacy under Parametric Assumption
A first set of simulation studies was used to assess the relative efficacy of non-parametric methods and flexible parametric survival methods compared to the parametric estimates of variance. The relative efficiency (eff ) of the estimators was assessed as the ratio between the estimated variances of the different estimators and the estimator with the minimum variance where M is the model considered for estimation. We were interested in seeing how relative efficacy varies as a function of (i) sample size, (ii) censoring percentage and (iii) restriction time. To facilitate calculation and possibility of obtaining closed form solutions for the parametric variance estimates, we assumed that survival times were exponentially distributed with rate, λ = 1/365. Similarly, we assumed that censoring times were exponentially distributed with rate γ.
In the first set of simulations we set the sample size to 50 and 100 with censoring percentages at 50 % and 75 %. Given the assumed censoring percentages (cp) we estimated γ, the rate parameter for the censoring distribution, as the solution for the equation [27]. Every setting was simulated 1000 times and we estimated µ τ and Var(µ τ ) with the δ-method, M-estimation, the Kaplan-Meier method, flexible parametric survival and pseudo-observations with τ = 365. Table 1 presents the results of the simulation study. The expected µ 365 = 230 days. Generally, all methods provided unbiased estimates of µ 365 , with small deviations.  To gain further insights of the relative efficacy of the variance estimator a second set of simulation studies was run. First we simulated survival times with λ = 1/365 and γ = 1/365 with a sample size of 100 and a censoring percentage of 50 %. We varied τ between 50 and 650 days and estimated µ τ and Var(µ τ ) with M-estimation, the Kaplan-Meier method, flexible parametric survival and pseudo-observations. All estimators returned unbiased µ τ estimates. As Figure 1a shows, the Kaplan-Meier method, flexible parametric survival and pseudo-observations have lower efficacy compared to parametric estimate of the variance. The estimated increases with longer restriction times and declines after a maximum is reached approximately at 600 days, when around 80 % of the subjects should have experienced the event of interest.
Similarly, we have assessed how the censoring percentage affects efficacy when everything else is kept constant. We simulated survival times with λ = 1/365, sample size of 100 and set the restriction time at 365 days.  We estimated γ, the censoring rate with Equation (5). The pattern was similar to the previous setting. The efficacy increased up to around 65 % censoring rate, thereafter declined again.

Estimation with Unknown Distribution Function
Practitioners rarely, if ever, know the distribution of survival times, which makes parametric estimation difficult, if not impossible. Here, we aim to assess the precision, efficacy and confidence internal coverage, for the Kaplan-Meier method, pseudo-observations and flexible parametric survival methods. For the former two, distributional assumptions are less important. For flexible parametric survival methods with an assumed Weibull survival function, deviations from the assumptions might affect the outcome. In a first set of simulations, we simulated survival times with the log-logistic distribution, a distribution used in accelerated failure models. The hazard of the log-logistic distribution can take non-monotonic forms depending on the shape parameter. We assumed shape of 1.5 and a scale of 151, giving a mean survival time of approximately 365 days. Thereafter, we simulated survival times with a mixture distribution with f (t) = 0.5 exp(1/265) + 0.5 exp(1/465). In both cases we assumed exponential censoring with a rate of γ = 1/365. Table 2 summarizes the results of the simulation studies with 1000 runs. Generally, flexible parametric survival methods outperform both the Kaplan-Meier method and pseudo-observations without compromising the coverage probability of the 95 % confidence intervals. Table 2. Efficacy and 95 % confidence interval probabilities of the different RMST estimators at the mean survival time of 365 days, with the expected µ τ of 182 days for the Log-logistic distribution and 225 days for the exponential distribution.

Parametric Estimation under Model Misspecification
As we have seen in the previous subsection, parametric estimation outperforms non-parametric estimation, at least when the correct model is specified. However, this is rarely possible. In this subsection, we assess the parametric estimators and associated variances under model misspecification. In a first set of simulations, we simulated survival times with the log-logistic distribution as described above (Section 4.2). When the distribution of survival times is unknown, it is common to assume an Exponential distribution. While the Kaplan -Meier method was practically unbiased, the parametric estimator overestimated µ τ . Of the two variance estimator the δ-method had the smallest estimates. This is expected as sandwich variance estimates are robust against model misspecification at the expanse of efficacy. See Table 3. See Table 3. In a second set of simulation, we simulated survival times according to a Weibull distribution with shape (k) and scale (λ) parameters. We assumed exponential censoring with rate 1/365. We varied the scale parameter between 0.5 and 3. The Shape parameter was set so that the mean of Weibull survival times should equal 365, the mean of exponential survival time with rate 1/365. We estimated λ with numerical root finding, by solving the equation λΓ(1 + k −1 ) = 365. At k = 1, λ = 365 and the Weibull distribution simplifies to an Exponential distribution with rate 1/365. At these values parametric RMTS estimation assuming Exponential survival distribution returned, as expected, nearly unbiased estimates. However, as k deviated from 1, the bias increased. The sign and magnitude of bias is dependent on the direction and magnitude of deviation from k = 1 (Figure 2).

Discussion
In this paper we presented the main estimators for RMST and associated variances. Additionally, with the help of simulation studies, we examined their small sample properties. As expected, the results of our simulations corroborated that parametric methods of estimation are more effective than non-parametric estimation. This is in line with previous findings about the Kaplan-Meier curve [2] or semi-parametric Cox-regression [3]. Naturally, just as Meier et al. [4] concluded, establishing the proper parametric form is difficult. Since the work of Meier et al., important advancements have been made in the field. The Focused Information Criterion (FIC) have been proposed for comparing general non-nested parametric models with a non-parametric alternative [28]. Application of FIC requires advanced statistical/technical knowledge and it is possible that survival times do not follow a known and well-established distribution. Thus, it is likely that in practical applications, parametric estimators for RMST will be rare.
The numerical results and the patterns of efficacy as a function of censoring percentage are similar to the ones reported by Meier et al. [4] and by Jullum and Hjort [3]. Independently of the employed method, we could see that a small sample size and a high percentage of censoring, result in genuine departure from nominal coverage values of the approximate Wald-type confidence interval. This departure from the nominal coverage of the 95 % confidence intervals is not a fallacy of the RMST and associated variance estimators, more it is a sign of the inappropriateness of Wald-type intervals and statistics. Lawrence et al. [29] described this deviation from the normal distribution in small samples and proposed an approximation to the correct null distribution using the Cornish-Fisher expansion. In medium to large samples the normal approximation is expected to hold. Our simulation suggests that in samples ≥ 100 statistical inference will be correct. This value of ≥ 100 should not be taken as a proposed threshold. This is data dependent. Schall [30] proposed a bootstrap based framework to assess if the coverage probability of confidence intervals approximates the intended nominal coverage. Alternatively, one could either correct null distribution as Lawrence et al. [29] proposed or employ computer intensive methods. Jackknife estimate, that is behind numerous important theoretical results about RMST [21,31,32] could be a natural option or bootstrapping that has a straightforward application in survival analysis [33,34]. Nevertheless, bootstrapping RMST requires extra care. The parameter space for RMST is (0, τ] and RMST can be close to τ. Bootstrapping parameters on the boundary of the parameter space can return inconsistent results [35,36]. This is not a surprising result. While we assume normal distribution for µ τ , its parameter space is restricted to (0, τ]. A Wald-type 95 % confidence interval would require that 2.5 % of the distribution would be at either side of the interval. As µ τ → τ the right tail of the assumed normal distribution will include more and more values that are larger than τ. Thus the confidence interval might include values outside the parameter space. If µ τ ∼ N( µ τ , Var(µ τ )), then P(µ τ ≥ τ) determines the values of the µ τ under the assumed distribution that falls outside of (0, τ]. P(µ τ ≥ τ) is directly influenced by Var(µ τ )) and decreases with rate √ n as the sample size increases. In either case, further studies are needed to establish routines of statistical inference. Extra care is also needed in inferring from RMST based studies with Wald-type confidence intervals or Z-scores. Applied studies, while they do present RMST and associated confidence intervals, are interested in comparing different treatment regimens. This is done by calculating difference between two RMST estimates D, or the ratio between them R. Neither D or R is affected by the boundary value problems in realistic situations. The parameter space for D is (−τ, τ) and the D approaches the boundaries if and only if RMST for one arm is close to zero and the RMST for the competing treatment should be close to τ. Under similar circumstances R approaches to its boundary values of (0, ∞).
One aspect that practitioners face when planing a clinical trial with RMST as their method of choice is deciding upon τ. The choice of the time window (0, t] may be prespecified at the design stage of the study based on clinical considerations or the choice of time point τ could be data-dependent after the data is collected [37]. The choice of τ has direct implications on statistical power [38] and as we highlighted statistical power can be maximized by considering the intricate relationship between the choice of τ and the censoring percentage.
Although we observed differences in the magnitude of the variance between the Kaplan -Meier method, pseudo-observations and flexible parametric survival methods the practical implications are likely limited. The differences between the variance estimators might be numerically large however when compared to the magnitude of the RMST estimate, these differences will have limited practical implications. Thus, we refrain from suggesting one or the other method as the preferred one. All three methods have arguments that spoke for them. The Kaplan-Meier method is probably the most used and best known. Its application is simple and straightforward and require limited programming knowledge to implement on the top of Kaplan-Meier curve estimator, if not readily available in the preferred software. The flexible parametric survival method takes the idea behind the Kaplan-Meier method one step further by assuming a parametric form and fully define a smoothed version of the survival curve. Estimation assumes definition of the survival distribution. Additionally, to increase flexibility and model fit, the survival function can be modeled as a natural cubic spline function of log time.
Standard statistical software provide readily available routines for estimation [39,40]. An approach similar in philosophy was recently proposed by Luo et al. [41] based on piecewise exponential survival and censoring times. This approach does not require smoothing of the survival curve, however the exponential assumption facilitates design and monitoring of survival trials based on RMST. Both estimation with the Kaplan-Meier method and flexible parametric survival method allows adjustment of the survival curves for covariates. However, interpretation for continuous variables is difficult. The third approach, pseudo-observations, is also readily available in most statistical software [42][43][44]. With pseudo-observation covariates can be regressed on restricted survival times with linear regression. While the estimates of the Cox-regression or flexible parametric survival regression are hazard rates, regression on pseudo-observation returns regression coefficients on the time scale, i.e. they directly quantify the change in restricted survival times as a function of change in the covariates.
In conclusion, we suggest researchers interested in using RMST to analyse survival data to adopt the method that best fit their purpose. If ease of implementation is paramount, then the Kaplan-Meier method is preferred. To increase statistical power, flexible parametric survival models could be the proper option as they show superior efficacy compared to the other practical applicable methods. Clinical studies that aim to assess the effect of treatments/exposures adjusted for relevant covariates could take benefit from considering pseudo-observations as their tool of preference. Naturally, parametric method of estimation and inference can be considered. However researchers should be extra cautious about violation of assumptions.