Abstract
In some clinical studies, assessing covariate effect types indicating whether a covariate is predictive and/or prognostic is of interest, in addition to the study endpoint. Recently, for a case with a binary outcome, Chiba (Clinical Trials, 2019; 16: 237–245) proposed the new concept of covariate effect type, which is assessed in terms of four response types, and showed that standard subgroup or regression analysis is applicable only in certain cases. Although this concept could be useful for supplementing conventional standard analysis, its application is limited to cases with a binary outcome. In this article, we aim to generalize Chiba’s concept to continuous and time-to-event outcomes. We define covariate effect types based on four response types. It is difficult to estimate the response types from the observed data without making certain assumptions, so we propose a simple method to estimate them under the assumption of independent potential outcomes. Our approach is illustrated using data from a clinical study with a time-to-event outcome.
1. Introduction
In certain clinical studies, it is important to assess covariate effect types to indicate whether a covariate is predictive and/or prognostic in the context of treatment effectiveness. Clark et al. [1] defined a predictive factor as a factor associated with the response or lack of response to a particular therapy, while a prognostic factor is a factor associated with the clinical outcome in the absence of therapy, or with the application of a standard therapy. To assess whether a covariate is predictive and/or prognostic, subgroup or regression analysis is often used [2]. For a case with a binary outcome, on the difference scale, a binary covariate is predictive if
is not zero, where is the treatment arm ( for the experimental treatment and for the control treatment) and is the outcome ( for response and for no response). If is close to zero, is prognostic if
is not zero.
Recently, for cases with a binary outcome, Chiba [3] proposed assessing covariate effect types based on four response types [4] defined in terms of the potential outcome of if is , , rather than and as above. The four response types are defined as follows:
- Activated subjects, who would show a response regardless of the treatment they received; that is, ;
- Causative subjects, who would show a response only if they received the experimental treatment; that is, ;
- Preventive subjects, who would show a response only if they received the control treatment; that is, ;
- Inert subjects, who would not show a response regardless of the treatment they received; that is, .
Based on the four response types, measures of the covariate effect types are defined as follows [3]:
for . Then, can be used instead of a current common measure for predicting the effectiveness of the experimental treatment, i.e., in (1). indicates that the proportion of subjects who would show a response only if they received the experimental treatment is higher in the subgroup where than in the subgroup where . In addition, we can consider predictability for the effectiveness of the control treatment (or the harm caused by the experimental treatment) by . Instead of a current common measure for a prognostic factor, i.e., in (2), can be used. indicates that the proportion of subjects who would show a response regardless of the treatment they received is higher in the subgroup where than in the subgroup where .
Applying the potential outcome to (1) under the assumption of , which indicates that is independent from given , the following relationship between (1) and (3) is obtained:
Similarly, the following relationship holds between (2) and (3):
Obviously, if , the currently used definition of covariate effect types does not correspond to the new definition in (3). Except for the special case of , under the current definition, the two characteristics of covariate effect cannot be separated from each other. In actual clinical studies, the aim is to assess covariate effect type on the basis of (3) rather than (1) and (2). However, (3) can be applied only when an outcome is binary and not when an outcome is continuous or time-to-event-based. In this article, we aim to generalize (3) to continuous and time-to-event outcomes. We provide a definition of measures of covariate effect types in Section 2 and propose a simple method to estimate those measures in Section 3. In Section 4, our approach is illustrated using data from a clinical study with a time-to-event outcome. Finally, in Section 5, the performance of the proposed approach and directions for future work in this area are discussed.
2. Definition of Measures of Covariate Effect Types
In the following, we mainly discuss a case with a time-to-event outcome . However, this discussion can also be applied to a case with a continuous outcome as a special case without censored data in which a larger value represents a better response. and are the same as in Section 1; i.e., is a binary treatment arm and is a binary covariate. is the potential outcome of if had been set to . Unfortunately, it is not possible to observe the values of both and ; i.e., [5]. We use the stable unit treatment value assumption under which a single version of each treatment is available and there is no interference among subjects [6,7].
When an outcome is time-to-event, the response type cannot be classified into four types unlike in the case of a binary outcome described in Section 1. However, if we set a cut-off point , the time-to-event outcome can be regarded as a binary variable; then, we can consider the following four response types.
Definition 1
(Response type).Using a cut-off point, the response type can be classified into the following four types:
- Activated subjects (type 11 subjects), whose outcomes would be larger than or equal to, regardless of the treatment they received; that is,;
- Causative subjects (type 10 subjects), whose outcomes would be larger than or equal toif they received the experimental treatment, but would be smaller thanif they received the control treatment; that is,;
- Preventive subjects (type 01 subjects), whose outcomes would be smaller thanif they received the experimental treatment, but would be larger or equal toif they received the control treatment; that is,;
- Inert subjects (type 00 subjects), whose outcomes would be smaller than, regardless of the treatment they received; that is,.
All subjects belong to one of these four types, although we cannot know which type a subject belongs to. We denote a proportion of the subjects of the total subjects as ; i.e.,
and the proportion in the stratum with as . Then, corresponds to the survival probability under the experimental condition and is that under the control condition. For a continuous outcome, they correspond to one minus the cumulative density function.
Using , we can simply define measures of covariate effect types at the cut-off point by
This is essentially the same as in (3). Unfortunately, with a fixed cannot be applied as a general definition of measures of covariate types. If there is a medically meaningful cut-off point, the outcome is no longer time-to-event (continuous); rather, it is binary. If the cut-off point has no medical significance, it is not valuable to use a fixed cut-off point. and with a fixed cut-off point may also not be useful. Using these probabilities with changed on the interval, we define the restricted mean probability (RMP) as follows.
Definition 2
(Restricted mean probability).The restricted mean probability for response type on the interval,, is defined by the following formula:
for.
In this definition, and will usually be set to and , respectively, where is the truncation time; for a continuous outcome, these values will be set to the minimum and maximum values of the observed outcome, respectively. in the stratum with is defined in the same manner. is the restricted expectation of on the interval . Then, is interpreted as the mean proportion of type subjects when the cut-off point is changed from to .
It is important to note that is related to , which is the restricted expectation of the potential outcome on the interval .
Lemma 1.
We have the following relationships betweenand:
The proof is given in Appendix A. The difference between these two equations indicates that the restricted average causal effect on the interval can be expressed as . Notably, when and , and correspond to the restricted mean survival times (RMSTs) [8,9,10], which are equal to the areas under the survival curves on the interval under the experimental and control conditions, respectively. Similarly, and correspond to the restricted mean time lost [9] under the experimental and control conditions, respectively.
We give a general definition of measures of covariate effect types using , which is the RMP of a given response type in the stratum with .
Proposition 1.
We define a measure of covariate effect type asfor.
The interpretation of is similar to that of in (3) for a case with a binary outcome. As a prediction measure, we use and . implies the subgroup where contains a higher proportion of subjects who would survive longer by receiving the experimental treatment than the subgroup where . Then, predicts the effectiveness of the experimental treatment, and we say that is “augmented-causative”. If , we say that is “depleted-causative”. In a similar sense, is a prediction measure of the effectiveness of the control treatment (or for the harm associated with proceeding with the experimental treatment).
We note that for a case with a time-to-event (continuous) outcome, we can obtain results similar to (4) and (5) when the outcome is binary. Using Proposition 1 and Lemma 1, the prediction and prognosis measures based on conventional standard analysis, and , can be expressed as functions of on the difference scale.
Corollary 1.
We have the following equations:
where.
On the basis of this corollary, in Proposition 1 can be related to the prognosis under the current standard analysis. Similar to the currently used standard analysis, if is close to , which implies that is close to zero, is prognostic if is not zero. implies that the mean survival time is longer for subjects in the subgroup where than the subgroup where , regardless of the treatment they received. In this case, we say that is “augmented-activated”. In contrast, when , we say that is “depleted-activated”.
Finally, we emphasize that Corollary 1 indicates that the results of the current standard analysis can be properly interpreted when ; however, this may not be the case when . On the other hand, our measure of covariate effect type can assess whether a covariate is predictive and/or prognostic, even when .
3. Estimation of Measures of Covariate Effect Types
Unfortunately, in Proposition 1 cannot be identified based on the observed data without making certain assumptions, because the joint probability of and cannot be estimated [11,12]. Therefore, we use two assumptions that are often applied in the context of the pairwise comparison based on [13,14,15].
Assumption 1
(Ignorable treatment assignment).The potential outcomeis independent of the treatment arm; i.e.,. It is also assumed that.
Assumption 2
(Independent potential outcome).Two potential outcomes are independent of each other; i.e.,.
Assumption 1 is often made in randomized trials. Although some authors have discussed methods to infer without Assumption 2 [16,17], the methods are impractical as they tend to be complex and/or require considerable computational effort.
Corollary 2.
in (6) is identified under Assumptions 1 and 2 as follows:
in the stratum with is also identified under Assumptions 1 and 2, thus allowing in Definition 2 and in Proposition 1 to be identified.
We propose a method to estimate in Definition 2 based on the formula in Corollary 2. Let us denote the observed event occurrence or censoring time as in which the time points in both arms are included. Then, we can use the cut-off point as a discrete value taking only the same value as ; that is, it is sufficient to consider as the cut-off points. We also set the interval to , where is the truncation time. These settings yield the following formula to estimate .
Proposition 2.
For a time-to-event outcome, we estimatewith truncation timeunder Assumptions 1 and 2 as follows:
whereand is the survival probability estimated using the Kaplan–Meier method. Then, we estimate by
Censoring is considered when estimating by the Kaplan–Meier method. In a similar manner, and are estimated by applying Proposition 2 to the stratum with Z = z. Thus, , in (7) and in Proposition 1 can be estimated based on the observed data. It is obvious from Proposition 2 that and are consistent with the estimators of the RMST in the arm with X = x.
To derive the formulas to estimate and for a continuous outcome without censoring, let us suppose that t1 < ⋯ < tj < ⋯ tD are the observed values of the continuous outcome. The interval [ta,tb] is set to [t1,tD]. Here, we denote the number of subjects taking the value tj in the experimental arm as nj and that in the control arm as mj. Then, and are estimated by the following formulas.
Proposition 3.
For a continuous outcome, we estimate under Assumptions 1 and 2 by
where and .
Let us denote the outcome values as () for subjects in the experimental arm and for those in the control arm. Then, in relation to Lemma 1, we have Lemma 2 below.
Lemma 2.
When in Proposition 3 are used, the following equations hold:
The proof is given in Appendix B. In this lemma, the right sides correspond to the arithmetic means in the arm with . As the arithmetic mean is a plausible estimate of E{T|X = x}, the left sides are plausible estimates of unrestricted rather than restricted expectation of , , under Assumption 1.
4. Illustration
We illustrate our measure of covariate effect type using the clinical data set of Ohashi and Hamada [18]. The purpose of that study was to determine whether the survival time is longer with use of radiation during surgery () than without it () in subjects with pancreatic cancer. As it was an observational study, Assumption 1 would not hold. Nevertheless, we analyzed the data under this assumption, so the following analyses are for illustrative purposes only.
We explored the covariate effect type in the context of radiation effectiveness in the pancreatic cancer site, which was subclassified as pancreatic head () or “other” (). In the pancreatic head subgroup, 51 subjects received radiation during surgery and 14 subjects did not. In the other subgroup, nine subjects (including one censored subject) received radiation and nine did not. The survival curves in both subgroups are shown in Figure 1. Table 1 summarizes the RMST estimates and differences between arms in both subgroups, with 95% confidence intervals (CIs). For the estimation, the truncation time was set to the maximum observed time, 21.6 (months), because the final survival probability was zero in both arms in both subgroups.
Figure 1.
Survival curves for the arms with and without radiation during surgery: (a) pancreatic head subgroup; (b) “other” subgroup.
Table 1.
Restricted mean survival times (RMSTs) for the arms with and without radiation, for the pancreatic head and “other” subgroups, and the RMST differences between the two arms.
Before applying our approach, we first applied the current standard analysis given by and in Corollary 1. Using the RMST estimates in Table 1, (95% CI: –8.687, –0.011) and (95% CI: –2.635, 2.422). As , it was concluded that the cancer site would be a predictive factor, i.e., if the cancer site is not the pancreatic head, surgery is expected to be more effective with radiation than without it.
Next, we applied the approach proposed in this article. Figure 2 shows in each subgroup when the cut-off point was changed from 0 to 21.6 (months). The proportions of activated and inert subjects decreased and increased monotonically over time, respectively, as shown in Figure 2a,d; this is to be expected given Proposition 2. Figure 2b shows that the proportion of causative subjects is lower in the pancreatic head subgroup than in the other subgroup. Figure 2c shows that the proportion of preventive subjects was small, especially in the other subgroup. Figure 3 shows the differences in the estimated proportions between the two subgroups; i.e., in (7). In Figure 3, we can clearly see that the cancer site is “depleted-activated” until approximately six months. Then, over the period from 6 to 12 months, the cancer site is “depleted-causative.”
Figure 2.
Response type proportions in the pancreatic head and “other” subgroups: (a) activated (type 11) subjects; (b) causative (type 10) subjects; (c) preventive (type 01) subjects; (d) inert (type 00) subjects.
Figure 3.
Differences in the proportions of type subjects between the pancreatic head and “other” subgroups.
The estimates of in Definition 2 and in Proposition 1 are summarized in Table 2. In the table, implies that the mean proportion of causative subjects in the pancreatic head subgroup on the interval [0, 21.6] (12.9%) was 14.5% lower than that in the other subgroup (27.4%). This indicates that the cancer site is “depleted-causative”. However, for some subjects, the cancer site might be “activated-preventive” because , although the absolute value is smaller than that of (−0.145). The cancer site might also be considered as “depleted-activated” because . In the context of prediction and prognosis, as , it is concluded that the cancer site would be a predictive factor of the effectiveness of radiation, and possibly also of the harm caused by radiation, albeit to a lesser extent.
Table 2.
Restricted mean probabilities (RMPs) of the four response types by subgroups of cancer site and the RMP differences between the two subgroups.
The results generated using our approach were similar to those obtained via the current standard analysis; however, the magnitude was different. The difference is attributed to , where our approach could not rule out, based on the data, a harmful effect of radiation in some subjects, especially in the pancreatic head subgroup. The value of estimated by the current standard analysis was −4.439, while it is converted to −0.201 (); this is 0.057 less than the value of (−0.145) derived by our approach, as seen in Corollary 1. Similarly, is converted to −0.005 (); this is 0.057 larger than the value of (−0.062). As a result, is closer to zero than .
5. Conclusion
In this article, we extended the concept of the covariate effect type proposed by Chiba [3] from the case with a binary outcome to cases with time-to-event and continuous outcomes. This was achieved by using all cut-off points on the interval , where the respective values of and are zero and the truncation time in a case with a time-to-event outcome and the minimum and maximum observed values in a case with a continuous outcome. A proportion of the subjects for each cut-off point is in (6), and the mean of is the RMP, , in Definition 2. As discussed in Section 2 and Section 3, the RMP is related to the arithmetic mean, which corresponds to the RMST for a case with a time-to-event outcome.
In some studies, a covariate of interest may be a continuous variable, rather than a binary variable discussed in this article. Unfortunately, our approach cannot be applied for a continuous covariate. However, a continuous covariate can be partitioned into two subgroups by finding the optimal threshold value to split it using a popular method such as receiver operating characteristic (ROC) curve analysis [19]. Recently, partitioning methods based on a combination of multivariate covariates have also been discussed [20]. These methods will facilitate the use of our approach.
Our measures of covariate effect types can easily be calculated by merging two data sets for the experimental and control arms including the rows of times and survival probabilities, which are automatically generated using commercial software such as SAS (SAS Institute, Cary, NC, USA) and R (R Foundation for Statistical Computing, Vienna, Austria). For example, to derive , we can use a procedure to generate a Kaplan–Meier plot. Then, we obtain respective data sets including and . is derived by merging these two data sets. The RMP estimate, , is calculated by summing the areas of rectangles, which are calculated by applying a lag function. This process to estimate can be applied to a case with a continuous outcome by setting the minimum value of the outcome to zero, i.e., by using . It is also easy to extend the method to observational studies by using a weighted survival curve [21] derived based on the propensity score [22].
A limitation of our approach is the requirement that the assumption of independent potential outcomes (Assumption 2) is used when estimating the measures of covariate effect types. Unfortunately, we cannot verify whether this assumption holds in actual studies based on the observed data. Additional work is necessary to develop a simple method to estimate our measures of covariate effect types without using the assumption of independent potential outcomes.
Current standard analysis is suitable for assessing prediction and prognosis of a covariate when , which implies that a covariate is not predictable for the effectiveness of the control treatment (or for the harm caused by the experimental treatment). However, when , whether the current approach is appropriate is somewhat questionable unlike for our approach. Thus, our approach can supplement the current standard analysis, despite the limitation of requiring the assumption of independent potential outcomes.
Funding
This research was funded by Grant-in-Aid for Scientific Research from Japan Society for the Promotion of Science, grant number 19K11871.
Conflicts of Interest
The author declares no conflict of interest.
Appendix A. Proof of Lemma 1
Let us denote the probability density function of as , and the cumulative density function as . Then, the left side of the first equation in Lemma 1 can be expressed as follows:
The second equation in Lemma 1 is derived in a similar manner.
Appendix B. Proof of Lemma 2
Using Proposition 3, holds. This implies that the value of does not change when applying the cut-off point for the observed outcome value in the control group. Therefore, for , it is sufficient to consider only the observed outcome values in the experimental arm. Then, the left side of the first equation in Lemma 2 can be expressed as follows:
The second equation in Lemma 2 is derived in a similar manner.
References
- Clark, G.M.; Zborowski, D.M.; Culbertson, J.L.; Whitehead, M.; Savoie, M.; Seymour, L.; Shepherd, F.A. Clinical utility of epidermal growth factor receptor expression for selecting patients with advanced non-small cell lung cancer for treatment with erlotinib. J. Thorac. Oncol. 2006, 1, 837–846. [Google Scholar] [CrossRef][Green Version]
- Ballman, K.V. Biomarker: Predictive or prognostic? J. Clin. Oncol. 2015, 33, 3968–3972. [Google Scholar] [CrossRef] [PubMed]
- Chiba, Y. Inference on covariate effect types for treatment effectiveness in a randomized trial with a binary outcome. Clin. Trials 2019, 16, 237–245. [Google Scholar] [CrossRef] [PubMed]
- Hernán, M.A.; Robins, J.M. Causal Inference: What If; Chapman & Hall/CRC: Boca Raton, FL, USA, 2020; pp. 58–65. [Google Scholar]
- Holland, P.W. Statistics and causal inference. J. Am. Stat. Assoc. 1986, 81, 945–960. [Google Scholar] [CrossRef]
- Rubin, D.B. Bayesian inference for causal effects: The role of randomization. Ann. Stat. 1978, 6, 34–58. [Google Scholar] [CrossRef]
- Rubin, D.B. Randomization analysis of experimental data: The Fisher randomization test comment. J. Am. Stat. Assoc. 1980, 75, 591–593. [Google Scholar] [CrossRef]
- Royston, P.; Parmar, M.K. Restricted mean survival time: An alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome. BMC Med. Res. Methodol. 2013, 13, 152. [Google Scholar] [CrossRef] [PubMed]
- Uno, H.; Claggett, B.; Tian, L.; Inoue, E.; Gallo, P.; Miyata, T.; Schrag, D.; Takeuchi, M.; Uyama, Y.; Zhao, L.; et al. Moving beyond the hazard ratio in quantifying the between-group difference in survival analysis. J. Clin. Oncol. 2014, 32, 2380–2385. [Google Scholar] [CrossRef] [PubMed]
- A’Hern, R.P. Restricted mean survival time: An obligatory end point for time-to-event analysis in cancer trials? J. Clin. Oncol. 2016, 34, 3474–3476. [Google Scholar] [CrossRef] [PubMed]
- Ding, P.; Dasgupta, T. A potential tale of two-by-two tables from completely randomized experiments. J. Am. Stat. Assoc. 2016, 111, 157–168. [Google Scholar] [CrossRef]
- Lu, J.; Ding, P.; Dasgupta, T. Treatment effects on ordinal outcomes: Causal estimands and sharp bounds. J. Educ. Behav. Stat. 2018, 43, 511–539. [Google Scholar] [CrossRef]
- Buyse, M. Generalized pairwise comparisons of prioritized outcomes in the two-sample problem. Stat. Med. 2010, 29, 3245–3257. [Google Scholar] [CrossRef] [PubMed]
- Péron, J.; Roy, P.; Ozenne, B.; Roche, L.; Buyse, M. The net chance of a longer survival as a patient-oriented measure of treatment benefit in randomized clinical trials. JAMA Oncol. 2016, 2, 901–905. [Google Scholar] [CrossRef] [PubMed]
- Chiba, Y. Sharp nonparametric bounds and randomization inference for treatment effects on an ordinal outcome. Stat. Med. 2017, 36, 3966–3975. [Google Scholar] [CrossRef] [PubMed]
- Huang, E.J.; Fang, E.X.; Hanley, D.F.; Rosenblum, M. Inequality in treatment benefits: Can we determine if a new treatment benefits the many or the few? Biostatistics 2017, 18, 308–324. [Google Scholar] [CrossRef] [PubMed][Green Version]
- Chiba, Y. Bayesian inference of causal effects for an ordinal outcome in randomized trials. J. Causal Infer. 2018, 6, 20170019. [Google Scholar] [CrossRef]
- Ohashi, Y.; Hamada, C. Survival Analysis: Biostatistics with SAS; University of Tokyo Press: Tokyo, Japan, 1995; pp. 247–250. (In Japanese) [Google Scholar]
- Heagerty, P.J.; Lumley, T.; Pepe, M.S. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics 2000, 56, 337–344. [Google Scholar] [CrossRef] [PubMed]
- Wang, J.; Li, J.; Li, Y.; Wong, W.K. A model-based multithreshold method for subgroup identification. Stat. Med. 2019, 38, 2605–2631. [Google Scholar] [CrossRef] [PubMed]
- Cole, S.R.; Hernán, M.A. Adjusted survival curves with inverse probability weights. Comput. Methods Programs Biomed. 2004, 75, 45–49. [Google Scholar] [CrossRef] [PubMed]
- Rosenbaum, P.R.; Rubin, D.B. The central role of the propensity score in observational studies for causal effects. Biometrika 1983, 70, 41–55. [Google Scholar] [CrossRef]
© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).