Article

A Surrogate Measure for Time-Varying Biomarkers in Randomized Clinical Trials

1
Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
2
Department of Medicine, Stanford University, Palo Alto, CA 94305, USA
*
Author to whom correspondence should be addressed.
Mathematics 2022, 10(4), 584; https://doi.org/10.3390/math10040584
Submission received: 31 December 2021 / Revised: 1 February 2022 / Accepted: 2 February 2022 / Published: 13 February 2022

Abstract

Clinical trials with rare or distant outcomes are usually designed to be large and long-term. Their demands on resources and time limit the feasibility and efficiency of such studies. This motivates replacing rare or distal clinical endpoints with reliable surrogate markers that can be collected earlier and more easily. However, evaluating and ranking potential surrogate markers remains statistically challenging. In this paper, we define a generalized proportion of treatment effect for survival settings. The measure’s definition and estimation do not rely on any model assumption. It is equipped with a consistent and asymptotically normal non-parametric estimator. Under proper conditions, the measure reflects the proportion of the average treatment effect mediated by the surrogate marker among the group that would survive to the marker measurement time under both the intervention and control arms.

1. Introduction

The HPTN (HIV Prevention Trial Network) 052 study is an HIV prevention trial conducted across several continents. The primary clinical endpoint of interest, HIV infection, is estimated to occur at a rate of 3–5% in modern clinical trial settings. In addition, the time from viral exposure to infection is long: the median infection time is more than one year. It is therefore desirable to replace the clinically meaningful endpoint with an earlier and more easily accessible alternative endpoint.
A surrogate marker in clinical trials is considered to be “a laboratory measurement or physical sign used as a substitute for a clinically meaningful endpoint that measures directly how a patient feels, functions, or survives and that is expected to predict the effect of the therapy” [1]. It is considered to be valid if one could correctly conclude the treatment effect on the clinical endpoint by using the marker [2,3]. In this context, how to validate surrogate markers for a clinically meaningful endpoint is especially important. Zhuang and Chen [4] review the strengths and limitations of surrogate measures in clinical research in detail.
In the language of hypothesis testing, validity means that a departure from the null hypothesis $P(T \mid Z) = P(T)$ is captured by a departure from the null hypothesis $P(S \mid Z) = P(S)$, where $Z$, $S$, and $T$ represent the intervention, marker, and clinical endpoint, respectively. Prentice [5] operationalized the idea as a test of $P(T \mid S, Z) = P(T \mid S)$. The conditional independence of $T$ and $Z$ given $S$ represents an ideal situation in which the marker fully mediates the treatment effect. However, many candidate markers capture only part of the treatment effect, so the extent to which a marker captures the treatment effect is of great practical importance. Freedman et al. [6] extended Prentice’s criterion and evaluated the strength of surrogate markers by comparing the treatment effect with and without adjusting for the marker. For a binary endpoint $T$ and the logistic models:
$$\mathrm{logit}(T \mid Z) = \mu_1 + \beta Z, \qquad \mathrm{logit}(T \mid Z, S) = \mu_2 + \beta_s Z + \phi_z S,$$
the proportion of treatment effect explained (PTE) is defined as $\mathrm{PTE} = 1 - \beta_s / \beta$. However, the adjusted and unadjusted models do not hold simultaneously in general model classes [7,8,9], and the assumption of no interaction in the adjusted model is not necessarily true. To avoid model dependence, Wang and Taylor [9] proposed the F-measure in a general setting as $F = (AA - AB) / (AA - BB)$, where $AA = h\left(\int_s g_A(s)\, dP_A(s)\right)$, $AB = h\left(\int_s g_A(s)\, dP_B(s)\right)$, and $BB = h\left(\int_s g_B(s)\, dP_B(s)\right)$. Here, $P_A(s)$ and $P_B(s)$ are the distributions of the surrogate marker $S$ in the treatment group A and the control group B, respectively. The functions $g_A(s)$ and $g_B(s)$ are functions of the conditional distribution of the primary endpoint given $S$ in the two groups. The functions $h(\cdot)$, $g_A(s)$, and $g_B(s)$ are chosen such that $AA - BB$ is the desired measure of the treatment effect on the primary endpoint. The F-measure framework is flexible while preserving the flavor of comparing the marginal treatment effect with the adjusted treatment effect.
In this paper, we bring in the time dimension and define a generalized F-measure for time-to-event outcomes and time-varying internal surrogate markers explicitly. In Section 2, we introduce the time-varying F-measure. We show the measure can be estimated using a non-parametric estimator that is consistent and asymptotically normal. In Section 3, we give examples to visualize the change of F-measure with time and conduct Monte Carlo simulation studies to evaluate the proposed non-parametric estimation and inference. In Section 4, we apply the time-varying F-measure to an HIV prevention trial for illustration. Finally, we conclude the paper with a discussion in Section 5 and a conclusion in Section 6.

2. The Time-Varying F-Measure

2.1. Definition

We introduce the time-varying F-measure in this section. The new measure does not rely on any model assumption. In addition, it reflects Prentice’s criterion and describes the degree to which a marker captures the treatment effect on the clinical endpoints.
We consider intervention groups $Z = 1$ and $Z = 0$. Let $T$ represent the time-to-event outcome and $X_t$ the value of a candidate marker measured at time point $t$ (after randomization). The time-varying F-measure is formulated to evaluate the marker when survival status at time point $c$ ($c > t$) is of primary interest. We choose $h(u) = u$, $g_z(x) = P(T \ge c \mid X_t = x, T \ge t, Z = z)$, and $P_z(x) = P(X_t = x \mid T \ge t, Z = z)$. Then, the F-measure for a time-to-event outcome $T$ is:
$$F(c, t) = \frac{AA - AB}{AA - BB},$$
where
$$AA = P(T \ge c \mid T \ge t, Z = 1), \quad BB = P(T \ge c \mid T \ge t, Z = 0), \quad AB = \sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 0).$$
It is a function of time point c when the survival status is of primary interest, and time point t when the surrogate marker is measured. The definition and estimation do not necessarily rely on any model assumption and are exempt from model misspecification.
The time-varying F-measure reflects Prentice’s criterion. Namely, the scenarios of perfect markers, in which a marker mediates all of the treatment effect, lead to $F(c, t) = 1$; the scenarios of useless markers, in which a marker does not mediate any treatment effect or is independent of the intervention in the group of interest, lead to $F(c, t) = 0$. In addition, when the treatment effect mediated by the marker is in the same direction as the direct treatment effect, the F-measure for a partial marker is guaranteed to lie within (0,1). A value outside this range indicates that the treatment effects via different pathways are not in the same direction, so that the marker is not an appropriate surrogate. (Theoretical results are deferred to Section 2.3.)
In summary, the time-varying F-measure evaluates the relative position of the survival probability adjusted by eliminating the treatment effect on a biomarker. It serves as a model-free metric for assessing the proportion of treatment effect explained by the marker.
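As a minimal numerical sketch of the definition above, the following Python function evaluates $F(c, t)$ for a discrete marker from known population quantities. The function name `f_measure` and its argument names are illustrative assumptions, not part of the paper; the inputs are the conditional survival probabilities and marker distributions among subjects at risk at time $t$.

```python
import numpy as np

def f_measure(s1x, s0x, p1, p0):
    """Population F(c, t) for a discrete marker (illustrative helper).

    s1x[k] = P(T >= c | X_t = x_k, T >= t, Z = 1)
    s0x[k] = P(T >= c | X_t = x_k, T >= t, Z = 0)
    p1[k]  = P(X_t = x_k | T >= t, Z = 1)
    p0[k]  = P(X_t = x_k | T >= t, Z = 0)
    """
    s1x, s0x, p1, p0 = (np.asarray(a, dtype=float) for a in (s1x, s0x, p1, p0))
    aa = np.sum(s1x * p1)  # survival on the treatment arm
    bb = np.sum(s0x * p0)  # survival on the control arm
    ab = np.sum(s1x * p0)  # treatment-arm survival under the control marker distribution
    return (aa - ab) / (aa - bb)

# Perfect marker: conditional survival does not depend on Z given X_t.
print(f_measure([0.9, 0.5], [0.9, 0.5], [0.8, 0.2], [0.3, 0.7]))
# Useless marker: conditional survival does not depend on X_t.
print(f_measure([0.8, 0.8], [0.6, 0.6], [0.8, 0.2], [0.3, 0.7]))
# Partial marker.
print(f_measure([0.9, 0.6], [0.8, 0.4], [0.8, 0.2], [0.3, 0.7]))
```

The three calls correspond to the perfect, useless, and partial marker scenarios; the numbers are made up for illustration only.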

2.2. Estimation and Inference

In the time-varying F-measure, survival probabilities can be estimated by the non-parametric Kaplan–Meier estimator [10]. Under the assumption of random censoring, the conditional probability $p_{x|0} := P(X_t = x \mid T \ge t, Z = 0)$ can be estimated by the empirical distribution. Naturally, we propose a plug-in estimator for the defined time-varying F-measure:
$$\hat F = \frac{\hat s_1 - \sum_x \hat s_{1x} \cdot \hat p_{x|0}}{\hat s_1 - \hat s_0}, \tag{1}$$
where $\hat s_z$ ($z = 0, 1$) and $\hat s_{1x}$ are the Kaplan–Meier estimators of $P(T \ge c \mid T \ge t, Z = z)$ and $P(T \ge c \mid X_t = x, T \ge t, Z = 1)$, respectively. Let $u_{z1} < u_{z2} < \cdots$ be the ordered, distinct times observed on arm $z$; $n_z(\tau)$ be the number of subjects at risk at time $\tau$ on arm $z$; and $d_z(\tau)$ be the number of events at time $\tau$ on arm $z$. The Kaplan–Meier estimator of the survival probabilities reads:
$$\hat s_z = \prod_{t < u_{zk} \le c} \left(1 - \frac{d_z(u_{zk})}{n_z(u_{zk})}\right).$$
Similarly, $s_{1x} := P(T \ge c \mid X_t = x, T \ge t, Z = 1)$ can be estimated by the Kaplan–Meier estimator in the stratum defined by $Z = 1$ and $X_t = x$:
$$\hat s_{1x} = \prod_{t < u_{1xk} \le c} \left(1 - \frac{d_{1x}(u_{1xk})}{n_{1x}(u_{1xk})}\right).$$
Under the assumption of random censoring, $p_{x|0}$ is estimated by the empirical distribution:
$$\hat p_{x|0} = \frac{n_{0x}(t)}{n_0(t)},$$
where $n_{0x}(t) = \sum_{i=1}^{n} I(X_{ti} = x, T_i \ge t, Z_i = 0)$.
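The plug-in estimator can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' software: the names `km_conditional` and `f_hat` are hypothetical, `time` is the observed follow-up time (event or censoring), `event` is 1 for an observed event, and a marker stratum with no treatment-arm subjects simply contributes a survival estimate of 1.

```python
import numpy as np

def km_conditional(time, event, t, c):
    """Kaplan-Meier product over distinct event times u with t < u <= c."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    for u in np.unique(time[event & (time > t) & (time <= c)]):
        n_at_risk = np.sum(time >= u)        # n(u): subjects still at risk at u
        d = np.sum(event & (time == u))      # d(u): events observed at u
        surv *= 1.0 - d / n_at_risk
    return surv

def f_hat(time, event, z, x_t, t, c):
    """Plug-in estimate of F(c, t) for a discrete marker x_t measured at t."""
    time, event, z, x_t = (np.asarray(a) for a in (time, event, z, x_t))
    at_risk = time >= t                      # risk set at the marker time t
    arm1 = at_risk & (z == 1)
    arm0 = at_risk & (z == 0)
    s1 = km_conditional(time[arm1], event[arm1], t, c)
    s0 = km_conditional(time[arm0], event[arm0], t, c)
    ab = 0.0
    for x in np.unique(x_t[arm0]):
        p_x0 = np.mean(x_t[arm0] == x)       # empirical p_{x|0}
        stratum = arm1 & (x_t == x)
        ab += km_conditional(time[stratum], event[stratum], t, c) * p_x0
    return (s1 - ab) / (s1 - s0)

# Toy, fully observed data set (values invented for illustration).
time  = [6, 7, 8, 3, 2, 9,  2, 6, 3, 4, 7, 8]
event = [1] * 12
z     = [1] * 6 + [0] * 6
x_t   = [0, 0, 0, 0, 1, 1,  0, 0, 1, 1, 1, 1]
print(f_hat(time, event, z, x_t, t=1, c=5))
```

For real analyses, an established survival library would normally replace the hand-written Kaplan–Meier loop; the loop is kept here only so that the correspondence with $d_z(u)$ and $n_z(u)$ is visible.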
We show the proposed estimator converges weakly to a Gaussian process under the following regularity assumptions (Proof of the theorem is deferred to Appendix A).
Assumption A1.
The time $c$ lies in the range $(t, \tau)$ for some constants $t > 0$ and $0 < \tau < \infty$ such that $s_1(\tau)\, s_0(\tau) > 0$ and $1 - (1 - H(\tau))(1 - G(\tau)) < 1$, where $H$ is the distribution function of the time-to-event $T$ and $G$ is the distribution function of the censoring time $U$.
Assumption A2.
Survival probabilities $s_1(\cdot) \ne s_0(\cdot)$ on $(0, \tau)$.
Assumption A3.
Random censoring: The censoring time $U$ is independent of both the failure time $T$ and the time-varying covariates $X_t$ on $(0, \tau)$.
Theorem 1.
Under regularity Assumptions A1–A3, given a time $t$, $\sqrt{n_t}\,\{\hat F(c) - F(c)\}$ converges weakly to a zero-mean Gaussian process with covariance function $E\{\zeta(c)\zeta(c')\}$ between time points $c$ and $c'$, where:
$$\zeta(c) = \frac{s_1 - \sum_x p_{x|0}\, s_{1x}}{(s_1 - s_0)^2} \cdot \eta_0 + \frac{\sum_x p_{x|0}\, s_{1x} - s_0}{(s_1 - s_0)^2} \cdot \eta_1 + \frac{1}{s_0 - s_1} \sum_x p_{x|0} \cdot \eta_{1x} + \frac{1}{s_0 - s_1} \sum_x s_{1x}\, \eta_{0x}^{p},$$
and
$$\begin{aligned}
\eta_0 &= -s_0(c)\, I(Z = 0 \mid T \ge t) \int_t^c \frac{dN(u) - Y(u)\, d\Lambda_0(u)}{E\{I(Z = 0 \mid T \ge t)\, Y(u)\}}, \\
\eta_1 &= -s_1(c)\, I(Z = 1 \mid T \ge t) \int_t^c \frac{dN(u) - Y(u)\, d\Lambda_1(u)}{E\{I(Z = 1 \mid T \ge t)\, Y(u)\}}, \\
\eta_{1x} &= -s_1(c \mid X_t = x, T \ge t)\, I(Z = 1, X_t = x \mid T \ge t) \int_t^c \frac{dN(u) - Y(u)\, d\Lambda_{1x}(u)}{E\{I(Z = 1, X_t = x \mid T \ge t)\, Y(u)\}}, \\
\eta_{0x}^{p} &= \frac{1}{p_0}\left\{I(X_t = x, Z = 0 \mid T \ge t) - p_{0x}\right\} - \frac{p_{0x}}{p_0^2}\left\{I(Z = 0 \mid T \ge t) - p_0\right\}.
\end{aligned}$$
In the above equations, $N(u) := I(T \le U, T \le u)$ denotes the observed counting process and $Y(u) := I(T \ge u, U \ge u)$ the at-risk process; the leading minus signs follow from the standard martingale representation of the Kaplan–Meier estimator used in Appendix A. The covariance function $E\{\zeta(c)\zeta(c')\}$ can be consistently estimated by $\frac{1}{n_t} \sum_{i=1}^{n_t} \hat\zeta_i(c)\, \hat\zeta_i(c')$, where the $\hat\zeta_i(\cdot)$ are the sample versions of $\zeta(\cdot)$.
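Theorem 1 suggests a Wald-type confidence interval at a fixed $c$: the variance of $\hat F(c)$ is approximately $E\{\zeta(c)^2\} / n_t$. The sketch below assumes that the per-subject values $\hat\zeta_i(c)$ have already been computed; the function name `wald_interval` and its arguments are illustrative, not from the paper.

```python
import math
import numpy as np

def wald_interval(f_hat, zeta_hat, z_crit=1.959964):
    """Wald-type interval for F(c) from estimated influence-function values.

    zeta_hat holds one value per subject at risk at time t (n_t values);
    z_crit is the standard normal quantile (1.959964 for a 95% interval).
    """
    zeta_hat = np.asarray(zeta_hat, dtype=float)
    n_t = zeta_hat.size
    var_hat = np.mean(zeta_hat ** 2)      # (1 / n_t) * sum_i zeta_i(c)^2
    se = math.sqrt(var_hat / n_t)         # standard error of F-hat(c)
    return f_hat - z_crit * se, f_hat + z_crit * se

# Made-up influence values, only to show the call.
print(wald_interval(0.5, [1.0, -1.0, 1.0, -1.0]))
```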

2.3. Ranges of F-Measure

2.3.1. Perfect Marker

When the marker mediates the entire treatment effect, we have $P(T \ge c \mid X_t = x, T \ge t, Z = 1) = P(T \ge c \mid X_t = x, T \ge t, Z = 0)$. It implies $\sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 0) = \sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 0)\, P(X_t = x \mid T \ge t, Z = 0)$, that is, $AB = BB$, and furthermore, $F(c, t) = 1$.

2.3.2. Useless Marker

When the marker does not mediate any treatment effect, we have $P(T \ge c \mid X_t = x_1, T \ge t, Z = 1) = P(T \ge c \mid X_t = x_2, T \ge t, Z = 1)$ for any $x_1$ and $x_2$; when the intervention is independent of $X_t$ in the risk set at time point $t$, we have $P(X_t = x \mid T \ge t, Z = 1) = P(X_t = x \mid T \ge t, Z = 0)$. Either of these useless-marker conditions leads to $\sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 1) = \sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 0)$, that is, $AA = AB$, and furthermore, $F(c, t) = 0$.

2.3.3. Partial Marker

Without loss of generality, we consider the case $AA - BB > 0$. Theorem 2 stated below and its proof in Appendix B extend naturally to the counterpart case $AA - BB < 0$. To aid interpretation and link to common situations in clinical trials, we impose three mild assumptions:
Assumption A4.
$X_t$ in the treatment group and that in the control group are stochastically ordered: $P(X_t \le x \mid T \ge t, Z = 1) \le P(X_t \le x \mid T \ge t, Z = 0)\ \forall x$, or $P(X_t \le x \mid T \ge t, Z = 1) \ge P(X_t \le x \mid T \ge t, Z = 0)\ \forall x$.
Assumption A5.
$P(T \ge c \mid X_t = x, T \ge t, Z = z)$ is monotone in $x$ in the same direction for any given $z$.
Assumption A6.
$P(T \ge c \mid X_t = x, T \ge t, Z = z)$ is monotone in $z$ in the same direction for any given $x$.
In addition, we formulate three conditions:
C1.
$P(T \ge c \mid X_t = x, T \ge t, Z = 1) - P(T \ge c \mid X_t = x, T \ge t, Z = 0) > 0$.
C2.
$P(X_t \le x \mid T \ge t, Z = 1) \le P(X_t \le x \mid T \ge t, Z = 0)\ \forall x$ (the marker is stochastically greater in the treatment group) and $P(T \ge c \mid X_t = x, T \ge t, Z = 1)$ is increasing in $x$.
C3.
$P(X_t \le x \mid T \ge t, Z = 1) \ge P(X_t \le x \mid T \ge t, Z = 0)\ \forall x$ (the marker is stochastically greater in the control group) and $P(T \ge c \mid X_t = x, T \ge t, Z = 1)$ is decreasing in $x$.
Theorem 2.
With Assumptions A4–A6, if Condition C1 is satisfied, then $F < 1$; if either Condition C2 or C3 is satisfied, then $F > 0$.

2.4. Causal Interpretation

The F-measure is closely related to the concept of the natural indirect effect, which is defined in the counterfactual framework [11,12]. Theorem 3 states the link; a detailed proof is given in Appendix C.
Assumption A7.
$\{T(Z = 1, X_t = x), X_t(Z = 1)\} \perp \{T(Z = 0, X_t = x), X_t(Z = 0)\}$.
Assumption A8.
$Z \perp \{T(Z = z, X_t = x), X_t(Z = z)\}$.
Assumption A9.
$X_t(Z = 1) \perp T(Z = 1, X_t = x) \mid T \ge t, Z = 1$.
Theorem 3.
Under Assumptions A7–A9, it holds that:
$$F = \frac{P(T(1) \ge c \mid T(1) \ge t, T(0) \ge t) - P(T(1, X_t = X_t(0)) \ge c \mid T(1) \ge t, T(0) \ge t)}{P(T(1) \ge c \mid T(1) \ge t, T(0) \ge t) - P(T(0) \ge c \mid T(1) \ge t, T(0) \ge t)}.$$
For the subgroup with $T(1) \ge t$ and $T(0) \ge t$, the numerator of the F-measure describes the natural indirect effect mediated by the surrogate marker, while the denominator describes the average treatment effect. The ratio reflects the proportion of the average treatment effect mediated by the surrogate marker (in the sense of the natural indirect effect) for the subgroup that would survive to the marker measurement time under either arm. However, we also note that the causal interpretation does not apply in general [13].

3. Numerical Studies

To assess the proposed surrogate measure, we conduct numerical studies motivated by the HIV Prevention Trial Network. The plasma HIV-1 viral load represents the degree of viral burden and is believed to play a crucial role in mediating the benefit of antiretroviral therapy (ART) on HIV-related disease progression and transmission. We consider a viral load measurement dichotomized at a threshold of 1000 copies per milliliter as the biomarker of interest. In a hypothetical scenario, participants have some HIV-1 exposure at enrollment. The viral load level may increase quickly during follow-up, while an effective intervention could delay viral proliferation and thereby delay the failure time. We express this scenario in the following mathematical models. The dichotomous viral load level at time $t$ is modeled as $X_t = I(t \ge t_s)$, where $t_s$ denotes the time at which a subject’s viral load shifts from level 0 to level 1 after enrollment. We assume that $t_s$ follows an exponential distribution with mean $\mu_z$ in intervention group $z$, and a time-varying Cox–Weibull model:
$$h(t \mid X_t, Z) = h_0(t) \exp(b_1 Z + b_2 X_t), \tag{2}$$
$$h_0(t) = \lambda v t^{v - 1},$$
where $Z$ is Bernoulli with success probability 0.5.
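For a subject with a given shift time $t_s$, the hazard in the Cox–Weibull model above is piecewise Weibull, so the cumulative hazard and survival function have simple closed forms. The Python sketch below writes them out; the function names `cumulative_hazard` and `survival` are illustrative and not taken from the paper.

```python
import math

def cumulative_hazard(t, z, t_s, lam, v, b1, b2):
    """Integral over (0, t] of lam * v * u**(v - 1) * exp(b1 * z + b2 * I(u >= t_s))."""
    base = lam * math.exp(b1 * z)
    if t <= t_s:
        # The marker has not shifted yet, so X_u = 0 on (0, t].
        return base * t ** v
    # Before the shift X_u = 0; after the shift X_u = 1 and the hazard scales by exp(b2).
    return base * t_s ** v + base * math.exp(b2) * (t ** v - t_s ** v)

def survival(t, z, t_s, lam, v, b1, b2):
    """P(T > t | Z = z, shift time t_s) under the model."""
    return math.exp(-cumulative_hazard(t, z, t_s, lam, v, b1, b2))

# Example call with arbitrary parameter values.
print(survival(5.0, z=1, t_s=2.0, lam=0.2, v=1.0, b1=-1.0, b2=1.0))
```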

3.1. Numerical Examples

In Figure 1, we explore the numerical behavior of the time-varying F-measure under the motivating scenario described above. In particular, we assume Model (2) with a constant baseline hazard $h_0(t) = 0.2$. Without loss of generality, we assume $b_1 \le 0$, $b_2 \ge 0$, and $c = 5$. Under these model assumptions, the F-measure has a closed-form expression, detailed in Appendix D. Figure 1a fixes $b_1 = -1$, $t_0 = 0.5$, and $t_1 = 2$ and varies $b_2$, showing F-measure curves that range from a useless marker with $b_2 = 0$ to partial markers with $b_2 > 0$. Figure 1b fixes $b_2 = 1$, $t_0 = 0.5$, and $t_1 = 2$ and varies $b_1$, showing F-measure curves that range from a perfect marker with $b_1 = 0$ to partial markers with $b_1 < 0$.

3.2. Monte–Carlo Simulation

In this section, we describe our Monte–Carlo simulation to evaluate the proposed non-parametric estimator. We generate failure times for each $z$-group based on a closed-form approach described in Austin [14]. First, a random value $u$ is generated from the Uniform(0, 1) distribution and the subject-specific shift time $t_s$ is generated from an exponential distribution with mean $\mu_z$. Second, if $-\log(u) < \lambda \exp(b_1 z)\, t_s^{v}$, we let the failure time be
$$T = \left\{\frac{-\log(u)}{\lambda \exp(b_1 z)}\right\}^{1/v};$$
otherwise,
$$T = \left\{\frac{-\log(u) - \lambda \exp(b_1 z)\, t_s^{v} + \lambda \exp(b_2) \exp(b_1 z)\, t_s^{v}}{\lambda \exp(b_2) \exp(b_1 z)}\right\}^{1/v}.$$
In addition, we generate the censoring times from Uniform$(0, \tau)$, in which $\tau$ is chosen to give a censoring rate of 20%. The censoring is independent of the failure time $T$ and of the covariates $Z$ and $X_t$.
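The two-branch generation rule above can be sketched with NumPy as follows. This is an illustrative reading of the described procedure, not the authors' code: the function name `simulate_failure_times`, the tuple `mu = (mu_0, mu_1)`, and the seed are assumptions, and the sketch presumes $b_2 \ge 0$ as in the scenarios considered here.

```python
import numpy as np

def simulate_failure_times(z, mu, lam, v, b1, b2, rng):
    """Draw failure times under the time-varying Cox-Weibull model.

    z   : array of arm indicators (0 or 1)
    mu  : (mu_0, mu_1), mean shift time in the control and intervention arms
    rng : a numpy.random.Generator
    Returns the failure times and the shift times t_s.
    """
    z = np.asarray(z)
    u = rng.uniform(size=z.size)
    t_s = rng.exponential(scale=np.where(z == 1, mu[1], mu[0]))
    neg_log_u = -np.log(u)
    base = lam * np.exp(b1 * z)
    # Branch 1: the event occurs before the marker shifts.
    before = neg_log_u < base * t_s ** v
    t_before = (neg_log_u / base) ** (1.0 / v)
    # Branch 2: the event occurs after the shift, when the hazard is scaled by exp(b2).
    t_after = ((neg_log_u - base * t_s ** v + base * np.exp(b2) * t_s ** v)
               / (base * np.exp(b2))) ** (1.0 / v)
    return np.where(before, t_before, t_after), t_s

rng = np.random.default_rng(2022)
z = rng.integers(0, 2, size=10)  # Bernoulli(0.5) arm assignment
T, t_s = simulate_failure_times(z, mu=(0.25, 2.5), lam=0.2, v=1.0,
                                b1=-0.5, b2=0.5, rng=rng)
print(np.round(T, 2))
```

Censoring times could then be drawn with `rng.uniform(0, tau, size=z.size)`, with `tau` tuned to reach the target censoring rate.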
Table 1 summarizes the simulation results. We consider the shape parameter $v$ to be 0.8, 1, and 1.2, representing a hazard that is decreasing, constant, and increasing with time, respectively. We are interested in the surrogacy level of the marker measured at year $t$ for the treatment effect on the survival probability at year $c$. For each setting of $v$, we choose $c = 5$ (years) and $t = 0.25, 0.5, 1, 2$ (years). We show typical scenarios in which a surrogate marker is perfect, useless, or partial. A perfect surrogate marker explains all of the treatment effect on the clinical endpoint, i.e., $b_1 = 0$; a useless marker is conditionally independent of the failure time given $Z$, i.e., $b_2 = 0$; a partial marker, the most common scenario in practice, lies between these two extremes. Without loss of generality, we consider a treatment that delays the failure time both by directly affecting the clinical endpoint and by suppressing a harmful marker, that is, $b_1 \le 0$, $b_2 \ge 0$, and $\mu_0 \le \mu_1$. Specifically, the configurations for the three scenarios in Table 1 are: (1) a perfect marker: $\lambda = 0.02$, $b_1 = 0$, $b_2 = 3$, $t_0 = 3$ months, $t_1 = 30$ months; (2) a useless marker: $\lambda = 0.3$, $b_1 = -1$, $b_2 = 0$, $t_0 = 3$ months, $t_1 = 30$ months; and (3) a partial marker: $\lambda = 0.2$, $b_1 = -0.5$, $b_2 = 0.5$, $t_0 = 3$ months, $t_1 = 30$ months. We replicate each setting 1000 times with 20,000 subjects. With a large sample size, the estimator is approximately unbiased, its estimated variance accurately reflects the sampling variation, and the coverage of the 95% Wald-type confidence intervals is close to the nominal level. One limitation of the non-parametric estimator is its lack of efficiency, which is the price for avoiding model misspecification.

4. Data Analysis

We apply the proposed time-varying F-measure to the HIV Prevention Trial Network (HPTN) 052 study [15]. The study enrolled 1763 serodiscordant couples in which one participant was HIV-positive, and the other was HIV-negative. The HIV-positive patients were randomly assigned to receive either immediate or delayed ART in a 1:1 ratio. Patients on the delayed arm started ART when two consecutive CD4+ cell count measurements fell below 250 per cubic millimeter or an indicator of AIDS developed. The study monitored the earlier occurrence of severe clinical outcomes in HIV-positive patients or HIV transmission to HIV-negative partners as a key endpoint. It is believed that plasma viral load mediates the effect of ART on HIV-associated disease progression and transmission [16].
In this application, we consider the plasma viral load as a candidate marker and evaluate its surrogacy level for the composite monitoring endpoint over a 3-year follow-up. For simplicity, we dichotomize the viral load at a threshold of 1000 copies per milliliter. More specifically, we set the marker value to 1 for a viral load greater than 1000. We estimate the time-varying F-measure for the viral load measured at each of the 2nd to 7th quarters after randomization. Table 2 shows the results of the application. Comparing the prevalence of a high viral load between the two arms reveals that ART was very effective in suppressing viral proliferation. In addition, a low viral load significantly decreases the hazard of the composite endpoint on the immediate arm before the treatment effect kicks in on the delayed arm. The time-varying F-measure gradually increases until reaching its maximum at the 6th quarter. This temporal pattern reflects the fact that the surrogacy level is a combination of the treatment effect on the marker and the marker effect on the clinical endpoint. On the one hand, it takes time to realize the effect of viral load suppression. On the other hand, as an increasing number of patients on the delayed arm begin ART, the difference in marker distribution between the two arms becomes smaller. The time-varying F-measure correctly reflects the temporal pattern and the biological mechanism of ART.

5. Discussion

In this paper, we consider a definition of the time-varying F-measure based on three questions. First, is there a sound interpretation for the comparison? Second, do the typical marker types, such as perfect or useless markers, correspond to reasonable values? Third, is the defined F-measure model-free and equipped with a non-parametric estimator? Guided by these three questions, we define the time-varying F-measure in Section 2. In addition, we explore two alternative definitions, neither of which conducts an appropriate comparison. With $g_z(x) = P(T \ge c \mid X_t = x, Z = z)$, the F-measure can be defined as:
$$F(c, t) = \frac{P(T \ge c \mid Z = 1) - \sum_x P(T \ge c \mid X_t = x, Z = 1)\, P(X_t = x \mid Z = 0)}{P(T \ge c \mid Z = 1) - P(T \ge c \mid Z = 0)}.$$
When the availability of an internal marker depends on the failure time (e.g., the event is death-related), $X_t$ should include “not applicable” as a possible value for subjects with $T < t$. In this case, $P(X_t = x \mid Z = 1) - P(X_t = x \mid Z = 0)$ is determined by both the treatment effect on the marker and that on the primary endpoint. Compared to $P(T \ge c \mid Z = 1)$, the adjusted survival probability actually removes a portion of the direct treatment effect. This definition does not reflect the proportion of treatment effect explained by the surrogate marker in general. With $g_z(x) = h(c \mid X_t = x, Z = z)$, the F-measure can be defined as:
$$F(c, t) = \frac{h(c \mid Z = 1) - \sum_x h(c \mid X_t = x, Z = 1)\, P(X_t = x \mid T \ge c, Z = 0)}{h(c \mid Z = 1) - h(c \mid Z = 0)}.$$
A closer look at the marker distribution $P(X_t = x \mid T \ge c, Z = z)$ reveals that:
$$P(X_t = x \mid T \ge c, Z = z) = \left\{1 + \sum_{y \ne x} \frac{P(T \ge c \mid X_t = y, T \ge t, Z = z)\, P(X_t = y \mid T \ge t, Z = z)}{P(T \ge c \mid X_t = x, T \ge t, Z = z)\, P(X_t = x \mid T \ge t, Z = z)}\right\}^{-1}.$$
If there is no interaction between the marker and the intervention, independence of $X_t$ and $Z$ in the risk set at time point $t$ translates to independence at time point $c$. In other words, only if $P(T \ge c \mid X_t = y, T \ge t, Z = 1) / P(T \ge c \mid X_t = x, T \ge t, Z = 1) = P(T \ge c \mid X_t = y, T \ge t, Z = 0) / P(T \ge c \mid X_t = x, T \ge t, Z = 0)$ is $P(X_t = x \mid T \ge c, Z = 1) = P(X_t = x \mid T \ge c, Z = 0)$ equivalent to the ratio $P(X_t = y \mid T \ge t, Z = z) / P(X_t = x \mid T \ge t, Z = z)$ being constant in $z$. The no-interaction assumption is therefore, unfortunately, necessary for the definition based on hazard functions to be appropriate. In contrast, the time-varying F definition introduced in Section 2 has a sound interpretation, reasonable ranges, and a model-free definition and estimation. Moreover, the numerical studies and the practical data analysis support the measure’s numerical behavior.
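The identity for $P(X_t = x \mid T \ge c, Z = z)$ is Bayes' rule written in ratio form, which can be checked numerically. The short Python sketch below uses made-up probabilities and hypothetical function names; it also illustrates the no-interaction remark: when the ratios of conditional survival probabilities across marker levels agree between arms, equal marker distributions at $t$ give equal marker distributions among survivors at $c$.

```python
import numpy as np

def marker_dist_bayes(s_x, p_x):
    """P(X_t = x | T >= c, Z = z) by direct normalization of the joint probabilities."""
    joint = np.asarray(s_x, dtype=float) * np.asarray(p_x, dtype=float)
    return joint / joint.sum()

def marker_dist_ratio_form(s_x, p_x):
    """Same quantity via {1 + sum over y != x of joint[y] / joint[x]}^(-1)."""
    joint = np.asarray(s_x, dtype=float) * np.asarray(p_x, dtype=float)
    out = np.empty_like(joint)
    for k in range(joint.size):
        others = np.delete(joint, k).sum()
        out[k] = 1.0 / (1.0 + others / joint[k])
    return out

p_at_t = [0.3, 0.7]      # same marker distribution at t in both arms
s_arm1 = [0.9, 0.6]      # P(T >= c | X_t = x, T >= t, Z = 1)
s_arm0 = [0.6, 0.4]      # same ratio 0.9 / 0.6 = 0.6 / 0.4 as in arm 1
print(marker_dist_bayes(s_arm1, p_at_t), marker_dist_ratio_form(s_arm1, p_at_t))
print(marker_dist_bayes(s_arm0, p_at_t))
```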
The time-varying F-measure is a generalization of the PTE [6] and the F-measure [9]. All three measures are quantitative ones based on the qualitative Prentice criterion [5]. While the Prentice criterion tests $P(T \mid S, Z) = P(T \mid S)$ and requires a surrogate marker to capture the treatment effect fully, the three quantitative measures compare the treatment effects unadjusted and adjusted by the marker distribution on the treatment arm. Beyond the similarities, the PTE is defined for binary endpoints and relies on logistic regressions for definition and estimation; the F-measure is a model-free version of the PTE, but it does not cover how to assess surrogate markers for time-to-event outcomes. The time-varying F-measure brings in the time dimension and extends the measure to time-to-event outcomes in survival settings.

6. Conclusions

This paper introduces a generalized proportion of treatment effect for survival settings, called the time-varying F-measure. Without relying on any model assumption, the measure reflects the proportion of the average treatment effect mediated by the surrogate marker. In addition, the paper introduces a non-parametric estimator to maximize the measure’s model-free characteristics. One limitation of the current estimation method is its lack of efficiency, which can be a future research direction. We applied the generalized F-measure to assess the viral load as a surrogate marker for HIV progression and transmission in the HPTN052 study. The time-varying F-measure increased from 0.18 in the 2nd quarter after randomization and reached 1.12 in the 6th quarter. It correctly captured the temporal pattern and biological mechanism of how ART regulates HIV progression and transmission by suppressing viral replication.

Author Contributions

Conceptualization, R.Z. and Y.-Q.C.; Formal analysis, R.Z.; Investigation, R.Z., F.X. and Y.W.; Methodology, R.Z.; Resources, Y.-Q.C.; Software, R.Z.; Supervision, Y.-Q.C.; Validation, R.Z.; Writing—original draft, R.Z.; Writing—review & editing, Y.-Q.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partly supported by the grants from NIH/NICHD R01 HD094682 and NIH/NIAID R56 AI140953.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Theorem 1

Proof. 
The proof consists of three steps. First, we decompose $\sqrt{n_t}\,(\hat F - F)$ into multiple empirical processes. Second, the convergence of each empirical process is derived. Third, we combine the asymptotic results and conclude the proof.
Step 1: We first write $\sqrt{n_t}\,(\hat F - F)$ as:
$$\sqrt{n_t}\,(\hat F - F) = \sqrt{n_t}\,\{\hat F(\hat p_{x|0}) - F(\hat p_{x|0})\} + \sqrt{n_t}\,\{F(\hat p_{x|0}) - F(p_{x|0})\},$$
and tackle the parts one by one. For the first part, we plug in the estimator (1) and obtain:
$$\sqrt{n_t}\,\{\hat F(\hat p_{x|0}) - F(\hat p_{x|0})\} = \sqrt{n_t}\left\{\frac{\hat s_1 - \sum_x \hat s_{1x} \cdot \hat p_{x|0}}{\hat s_1 - \hat s_0} - \frac{s_1 - \sum_x s_{1x}\, \hat p_{x|0}}{s_1 - s_0}\right\}.$$
Rearranging the terms yields:
$$\sqrt{n_t}\,\{\hat F(\hat p_{x|0}) - F(\hat p_{x|0})\} = \frac{\sqrt{n_t}}{(\hat s_1 - \hat s_0)(s_1 - s_0)} \Big( s_1(\hat s_0 - s_0) - s_0(\hat s_1 - s_1) - \sum_x \hat p_{x|0}\,(s_1 - s_0)(\hat s_{1x} - s_{1x}) - \sum_x \hat p_{x|0}\, s_{1x}(\hat s_0 - s_0) + \sum_x \hat p_{x|0}\, s_{1x}(\hat s_1 - s_1) \Big).$$
Then we collect the terms and write the equation in terms of $\sqrt{n_t}\,(\hat s_0 - s_0)$, $\sqrt{n_t}\,(\hat s_1 - s_1)$, and $\sqrt{n_t}\,(\hat s_{1x} - s_{1x})$ as:
$$\sqrt{n_t}\,\{\hat F(\hat p_{x|0}) - F(\hat p_{x|0})\} = \frac{s_1 - \sum_x \hat p_{x|0}\, s_{1x}}{(\hat s_1 - \hat s_0)(s_1 - s_0)} \cdot \sqrt{n_t}\,(\hat s_0 - s_0) + \frac{\sum_x \hat p_{x|0}\, s_{1x} - s_0}{(\hat s_1 - \hat s_0)(s_1 - s_0)} \cdot \sqrt{n_t}\,(\hat s_1 - s_1) + \sum_x \frac{\hat p_{x|0}\,(s_0 - s_1)}{(\hat s_1 - \hat s_0)(s_1 - s_0)} \cdot \sqrt{n_t}\,(\hat s_{1x} - s_{1x}).$$
Similarly, we write the second part as:
$$\sqrt{n_t}\,\{F(\hat p_{x|0}) - F(p_{x|0})\} = \sqrt{n_t}\left\{\frac{s_1 - \sum_x s_{1x}\, \hat p_{x|0}}{s_1 - s_0} - \frac{s_1 - \sum_x s_{1x}\, p_{x|0}}{s_1 - s_0}\right\} = \frac{1}{s_0 - s_1} \sum_x s_{1x} \cdot \sqrt{n_t}\,(\hat p_{x|0} - p_{x|0}).$$
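The algebra in Step 1 is an exact identity, so it can be sanity-checked numerically by comparing the difference of the two ratios with the collected-term expression (the common factor $\sqrt{n_t}$ is dropped). The Python sketch below does this for arbitrary made-up values; the function names `lhs` and `rhs` and all numbers are illustrative.

```python
import numpy as np

def lhs(s1h, s0h, s1xh, s1, s0, s1x, ph):
    """F-hat(p-hat) - F(p-hat), written as the difference of the two ratios."""
    return (s1h - s1xh @ ph) / (s1h - s0h) - (s1 - s1x @ ph) / (s1 - s0)

def rhs(s1h, s0h, s1xh, s1, s0, s1x, ph):
    """The same difference after collecting terms in (s0h - s0), (s1h - s1), (s1xh - s1x)."""
    denom = (s1h - s0h) * (s1 - s0)
    return ((s1 - ph @ s1x) * (s0h - s0)
            + (ph @ s1x - s0) * (s1h - s1)
            + (s0 - s1) * (ph @ (s1xh - s1x))) / denom

args = dict(
    s1h=0.82, s0h=0.60, s1xh=np.array([0.90, 0.70]),  # "estimates"
    s1=0.80, s0=0.62, s1x=np.array([0.88, 0.72]),      # "true values"
    ph=np.array([0.4, 0.6]),                            # p-hat_{x|0}
)
print(lhs(**args), rhs(**args))
```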
Step 2: In this step, we derive the convergence of each empirical process. Under the assumption of random censoring, the Kaplan–Meier estimator $\hat S(c)$ satisfies [17]:
$$\sqrt{n_t}\,\{\hat S(c) - S(c)\} = -\sqrt{n_t}\,(P_n - P)\, S(c) \int_0^c \frac{dM(u)}{E\,Y(u)} + o_p(1),$$
where $M(u) := N(u) - \int_0^u Y(a)\, d\Lambda(a)$ represents the counting process martingale. Therefore, the Kaplan–Meier estimators $\hat s_0$, $\hat s_1$, and $\hat s_{1x}$ satisfy:
$$\begin{aligned}
\sqrt{n_t}\,\{\hat s_0(c) - s_0(c)\} &\overset{d}{=} -\sqrt{n_t}\,(P_n - P)\, s_0(c)\, I(Z = 0 \mid T \ge t) \int_t^c \frac{dN(u) - Y(u)\, d\Lambda_0(u)}{E\{I(Z = 0 \mid T \ge t)\, Y(u)\}}, \\
\sqrt{n_t}\,\{\hat s_1(c) - s_1(c)\} &\overset{d}{=} -\sqrt{n_t}\,(P_n - P)\, s_1(c)\, I(Z = 1 \mid T \ge t) \int_t^c \frac{dN(u) - Y(u)\, d\Lambda_1(u)}{E\{I(Z = 1 \mid T \ge t)\, Y(u)\}}, \\
\sqrt{n_t}\,\{\hat s_{1x}(c) - s_{1x}(c)\} &\overset{d}{=} -\sqrt{n_t}\,(P_n - P)\, s_{1x}(c)\, I(Z = 1, X_t = x \mid T \ge t) \int_t^c \frac{dN(u) - Y(u)\, d\Lambda_{1x}(u)}{E\{I(Z = 1, X_t = x \mid T \ge t)\, Y(u)\}},
\end{aligned}$$
where $I(\cdot)$ is an indicator function for the subjects’ group.
Next we write $\sqrt{n_t}\,(\hat p_{x|0} - p_{x|0})$ in the form of:
$$\sqrt{n_t}\,(\hat p_{x|0} - p_{x|0}) = \sqrt{n_t}\,\{\hat P(X_t = x \mid T \ge t, Z = 0) - P(X_t = x \mid T \ge t, Z = 0)\} = \sqrt{n_t}\left\{\frac{\hat P(X_t = x, Z = 0 \mid T \ge t)}{\hat P(Z = 0 \mid T \ge t)} - \frac{P(X_t = x, Z = 0 \mid T \ge t)}{P(Z = 0 \mid T \ge t)}\right\}.$$
It can be readily shown that:
$$\sqrt{n_t}\,(\hat p_{x|0} - p_{x|0}) = \frac{1}{\hat P(Z = 0 \mid T \ge t)} \sqrt{n_t}\,\{\hat P(X_t = x, Z = 0 \mid T \ge t) - P(X_t = x, Z = 0 \mid T \ge t)\} - \frac{P(X_t = x, Z = 0 \mid T \ge t)}{\hat P(Z = 0 \mid T \ge t)\, P(Z = 0 \mid T \ge t)} \sqrt{n_t}\,\{\hat P(Z = 0 \mid T \ge t) - P(Z = 0 \mid T \ge t)\}.$$
Let $p_0 := P(Z = 0 \mid T \ge t)$ and $p_{0x} := P(X_t = x, Z = 0 \mid T \ge t)$. Since $\hat p_0$ is a consistent estimator of $p_0$, we obtain:
$$\sqrt{n_t}\,(\hat p_{x|0} - p_{x|0}) \overset{d}{=} \sqrt{n_t}\,(P_n - P)\left[\frac{1}{p_0}\left\{I(X_t = x, Z = 0 \mid T \ge t) - p_{0x}\right\} - \frac{p_{0x}}{p_0^2}\left\{I(Z = 0 \mid T \ge t) - p_0\right\}\right]$$
as a consequence of Slutsky’s theorem.
Step 3: In this step, we combine the above results and conclude the convergence of $\sqrt{n_t}\,(\hat F - F)$. We first introduce some notation:
$$\eta_0 = -s_0(c)\, I(Z = 0 \mid T \ge t) \int_t^c \frac{dN(u) - Y(u)\, d\Lambda_0(u)}{E\{I(Z = 0 \mid T \ge t)\, Y(u)\}}, \tag{A1}$$
$$\eta_1 = -s_1(c)\, I(Z = 1 \mid T \ge t) \int_t^c \frac{dN(u) - Y(u)\, d\Lambda_1(u)}{E\{I(Z = 1 \mid T \ge t)\, Y(u)\}}, \tag{A2}$$
$$\eta_{1x} = -s_{1x}(c)\, I(Z = 1, X_t = x \mid T \ge t) \int_t^c \frac{dN(u) - Y(u)\, d\Lambda_{1x}(u)}{E\{I(Z = 1, X_t = x \mid T \ge t)\, Y(u)\}}, \tag{A3}$$
$$\eta_{0x}^{p} = \frac{1}{p_0}\left\{I(X_t = x, Z = 0 \mid T \ge t) - p_{0x}\right\} - \frac{p_{0x}}{p_0^2}\left\{I(Z = 0 \mid T \ge t) - p_0\right\}. \tag{A4}$$
Combining the above results and applying Slutsky’s theorem, we can write:
$$\sqrt{n_t}\,\{\hat F(c) - F(c)\} = \sqrt{n_t}\,(P_n - P)\left( \frac{s_1 - \sum_x p_{x|0}\, s_{1x}}{(s_1 - s_0)^2} \cdot \eta_0 + \frac{\sum_x p_{x|0}\, s_{1x} - s_0}{(s_1 - s_0)^2} \cdot \eta_1 + \frac{1}{s_0 - s_1} \sum_x p_{x|0} \cdot \eta_{1x} + \frac{1}{s_0 - s_1} \sum_x s_{1x}\, \eta_{0x}^{p} \right) + o_p(1).$$
It follows that $\sqrt{n_t}\,\{\hat F(c) - F(c)\}$ converges weakly to a zero-mean Gaussian process with covariance function $E\{\zeta(c)\zeta(c')\}$ between time points $c$ and $c'$, where:
$$\zeta(c) = \frac{s_1 - \sum_x p_{x|0}\, s_{1x}}{(s_1 - s_0)^2} \cdot \eta_0 + \frac{\sum_x p_{x|0}\, s_{1x} - s_0}{(s_1 - s_0)^2} \cdot \eta_1 + \frac{1}{s_0 - s_1} \sum_x p_{x|0} \cdot \eta_{1x} + \frac{1}{s_0 - s_1} \sum_x s_{1x}\, \eta_{0x}^{p}.$$
The covariance function $E\{\zeta(c)\zeta(c')\}$ can be consistently estimated by $\frac{1}{n_t} \sum_{i=1}^{n_t} \hat\zeta_i(c)\, \hat\zeta_i(c')$ with:
$$\hat\zeta_i(c) = \frac{s_1 - \sum_x p_{x|0}\, s_{1x}}{(s_1 - s_0)^2} \cdot \eta_{0i} + \frac{\sum_x p_{x|0}\, s_{1x} - s_0}{(s_1 - s_0)^2} \cdot \eta_{1i} + \frac{1}{s_0 - s_1} \sum_x p_{x|0} \cdot \eta_{1xi} + \frac{1}{s_0 - s_1} \sum_x s_{1x}\, \eta_{0xi}^{p},$$
where $\eta_{0i}$, $\eta_{1i}$, $\eta_{1xi}$, and $\eta_{0xi}^{p}$ are subject $i$’s realizations of (A1)–(A4), respectively. The specific forms are:
$$\begin{aligned}
\eta_{0i} &= -s_0(c)\, I(Z_i = 0 \mid T_i \ge t) \int_t^c \frac{dN_i(u) - Y_i(u)\, d\Lambda_0(u)}{E\{I(Z_i = 0 \mid T_i \ge t)\, Y_i(u)\}}, \\
\eta_{1i} &= -s_1(c)\, I(Z_i = 1 \mid T_i \ge t) \int_t^c \frac{dN_i(u) - Y_i(u)\, d\Lambda_1(u)}{E\{I(Z_i = 1 \mid T_i \ge t)\, Y_i(u)\}}, \\
\eta_{1xi} &= -s_{1x}(c)\, I(Z_i = 1, X_{ti} = x \mid T_i \ge t) \int_t^c \frac{dN_i(u) - Y_i(u)\, d\Lambda_{1x}(u)}{E\{I(Z_i = 1, X_{ti} = x \mid T_i \ge t)\, Y_i(u)\}}, \\
\eta_{0xi}^{p} &= \frac{1}{p_0}\left\{I(X_{ti} = x, Z_i = 0 \mid T_i \ge t) - p_{0x}\right\} - \frac{p_{0x}}{p_0^2}\left\{I(Z_i = 0 \mid T_i \ge t) - p_0\right\}.
\end{aligned}$$

Appendix B. Proof of Theorem 2

Proof. 
The proof consists of two steps. First, we show the sufficiency of Condition C1 for F < 1 . Second, we show the sufficiency of either Condition C2 or C3 for F > 0 .
Step 1: We first expand $AB - BB$ as $\sum_x \{ P(T \ge c \mid X_t = x, T \ge t, Z = 1) - P(T \ge c \mid X_t = x, T \ge t, Z = 0) \}\, P(X_t = x \mid T \ge t, Z = 0)$. If Condition C1 is satisfied, we have $AB - BB > 0$. Simple algebra reveals $(AA - AB) - (AA - BB) < 0$. Given $AA - BB > 0$, we conclude:
$$F = \frac{AA - AB}{AA - BB} < 1.$$
Step 2: If $X_t$ in the treatment group is stochastically greater than that in the control group, then $E_A f(X_t) > E_B f(X_t)$ for any bounded and increasing function $f$. When $P(T \ge c \mid X_t = x, T \ge t, Z = 1)$ is increasing in $x$, we have $\sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 1) > \sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 0)$, that is, $AA > AB$. By the same argument, if $X_t$ in the control group is stochastically greater than that in the treatment group and $P(T \ge c \mid X_t = x, T \ge t, Z = 1)$ is decreasing in $x$, then its negative is increasing in $x$ and has the larger expectation under the control-group distribution, so again $\sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 1) > \sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 0)$, that is, $AA > AB$. Given $AA - BB > 0$, we conclude:
$$F = \frac{AA - AB}{AA - BB} > 0.$$
In summary, if Conditions C1 and C2 (or C3) are satisfied, the F-measure lies within (0, 1). □
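The bound can be checked numerically on a toy example. The sketch below assumes a discrete marker with three levels and hypothetical conditional survival probabilities and marker distributions, chosen so that the treatment arm has higher conditional survival at every marker level and the marker ordering matches the monotonicity of survival, which is how the proof uses the conditions. The function name `f_measure` and all numbers are illustrative assumptions.

```python
def f_measure(surv_trt, surv_ctl, p_trt, p_ctl):
    """F = (AA - AB) / (AA - BB) for a discrete marker.

    surv_trt[x], surv_ctl[x]: P(T >= c | X_t = x, T >= t, Z = 1 or 0)
    p_trt[x], p_ctl[x]:       P(X_t = x | T >= t, Z = 1 or 0)
    """
    aa = sum(s * p for s, p in zip(surv_trt, p_trt))
    ab = sum(s * p for s, p in zip(surv_trt, p_ctl))
    bb = sum(s * p for s, p in zip(surv_ctl, p_ctl))
    return (aa - ab) / (aa - bb)


# Case 1: survival increasing in x, marker stochastically greater under treatment.
f_increasing = f_measure(
    surv_trt=[0.5, 0.7, 0.9], surv_ctl=[0.4, 0.6, 0.8],
    p_trt=[0.2, 0.3, 0.5], p_ctl=[0.5, 0.3, 0.2],
)

# Case 2: survival decreasing in x, marker stochastically greater under control.
f_decreasing = f_measure(
    surv_trt=[0.9, 0.7, 0.5], surv_ctl=[0.8, 0.6, 0.4],
    p_trt=[0.5, 0.3, 0.2], p_ctl=[0.2, 0.3, 0.5],
)
```

In both cases AA = 0.76, AB = 0.64, and BB = 0.54, so F = 0.12 / 0.22, which is strictly between 0 and 1.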

Appendix C. Proof of Theorem 3

Proof. 
Step 1. In this step, we show $P(T(1) \ge c \mid T(1) \ge t, T(0) \ge t) = P(T \ge c \mid T \ge t, Z = 1)$. We first write:
$$P(T(1) \ge c \mid T(1) \ge t, T(0) \ge t) = \frac{P(T(1) \ge c, T(1) \ge t \mid T(0) \ge t)}{P(T(1) \ge t \mid T(0) \ge t)}.$$
By Assumption A7, we have:
$$P(T(1) \ge c \mid T(1) \ge t, T(0) \ge t) = \frac{P(T(1) \ge c, T(1) \ge t)}{P(T(1) \ge t)}.$$
Further, Assumption A8 yields:
$$P(T(1) \ge c \mid T(1) \ge t, T(0) \ge t) = \frac{P(T(1) \ge c, T(1) \ge t \mid Z = 1)}{P(T(1) \ge t \mid Z = 1)} = \frac{P(T \ge c \mid Z = 1)}{P(T \ge t \mid Z = 1)} = P(T \ge c \mid T \ge t, Z = 1).$$
In a similar way, we can show $P(T(0) \ge c \mid T(1) \ge t, T(0) \ge t) = P(T \ge c \mid T \ge t, Z = 0)$.
Step 2. In this step, we show $P(T(1, X_t = X_t(0)) \ge c \mid T(1) \ge t, T(0) \ge t) = \sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 0)$. By definition,
$$\begin{aligned}
P(T(1, X_t = X_t(0)) \ge c \mid T(1) \ge t, T(0) \ge t) &= \sum_x P(T(1, X_t = x) \ge c, X_t(0) = x \mid T(1) \ge t, T(0) \ge t) \\
&= \sum_x P(T(1, X_t = x) \ge c \mid X_t(0) = x, T(1) \ge t, T(0) \ge t)\, P(X_t(0) = x \mid T(1) \ge t, T(0) \ge t).
\end{aligned}$$
In the following, we work on the two probability components one by one.
$$P(X_t(0) = x \mid T(1) \ge t, T(0) \ge t) = \frac{P(X_t(0) = x, T(0) \ge t \mid T(1) \ge t)}{P(T(0) \ge t \mid T(1) \ge t)}.$$
The cross-world independence described in Assumption A7 leads to:
$$P(X_t(0) = x \mid T(1) \ge t, T(0) \ge t) = \frac{P(X_t(0) = x, T(0) \ge t)}{P(T(0) \ge t)}.$$
Furthermore, Assumption A8 yields:
$$P(X_t(0) = x \mid T(1) \ge t, T(0) \ge t) = \frac{P(X_t(0) = x, T(0) \ge t \mid Z = 0)}{P(T(0) \ge t \mid Z = 0)} = P(X_t = x \mid T \ge t, Z = 0).$$
The remaining probability component is:
$$\begin{aligned}
P(T(1, X_t = x) \ge c \mid X_t(0) = x, T(1) \ge t, T(0) \ge t) &= \frac{P(T(1, X_t = x) \ge c, T(1) \ge t \mid X_t(0) = x, T(0) \ge t)}{P(T(1) \ge t \mid X_t(0) = x, T(0) \ge t)} \\
&= \frac{P(T(1, X_t = x) \ge c, T(1) \ge t)}{P(T(1) \ge t)} \\
&= \frac{P(T(1, X_t = x) \ge c, T(1) \ge t \mid Z = 1)}{P(T(1) \ge t \mid Z = 1)} \quad (\text{by Assumption A8}).
\end{aligned}$$
Then we have:
$$P(T(1, X_t = x) \ge c \mid X_t(0) = x, T(1) \ge t, T(0) \ge t) = P(T(1, X_t = x) \ge c \mid T \ge t, Z = 1).$$
Assumption A9 gives $X_t(1) \perp T(1, X_t = x) \ge c \mid T \ge t, Z = 1$, so that:
$$P(T(1, X_t = x) \ge c \mid X_t(0) = x, T(1) \ge t, T(0) \ge t) = P(T(1, X_t = x) \ge c \mid X_t(1) = x, T \ge t, Z = 1) = P(T \ge c \mid X_t = x, T \ge t, Z = 1).$$
Collecting the equations for the two probability components, we show:
$$P(T(1, X_t = X_t(0)) \ge c \mid T(1) \ge t, T(0) \ge t) = \sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 0).$$
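The identified expression on the right-hand side involves only observable quantities, so it can be computed from trial data. The sketch below is a deliberately simplified illustration that assumes a discrete marker and no censoring, so that empirical proportions can stand in for the survival estimators; the function name `identified_probability` and the toy records are hypothetical.

```python
def identified_probability(records, t, c):
    """Empirical sum_x P(T >= c | X_t = x, T >= t, Z = 1) * P(X_t = x | T >= t, Z = 0).

    records: iterable of (z, x_t, T) tuples with fully observed failure times T
    (no censoring, a simplifying assumption of this sketch).
    """
    at_risk = [(z, x, T) for z, x, T in records if T >= t]
    ctl_markers = [x for z, x, T in at_risk if z == 0]
    total = 0.0
    for x in set(ctl_markers):
        trt_times = [T for z, xx, T in at_risk if z == 1 and xx == x]
        if not trt_times:
            raise ValueError(f"no treated subjects at risk with marker value {x}")
        surv = sum(T >= c for T in trt_times) / len(trt_times)
        weight = ctl_markers.count(x) / len(ctl_markers)
        total += surv * weight
    return total


# Toy data: (arm, marker at time t, failure time). The last subject fails before t.
records = [
    (1, 1, 6.0), (1, 1, 3.0), (1, 0, 7.0), (1, 0, 8.0),
    (0, 0, 2.0), (0, 0, 6.0), (0, 1, 4.0), (0, 0, 0.5),
]
value = identified_probability(records, t=1.0, c=5.0)
```

Here the control arm at risk has marker weights 2/3 (x = 0) and 1/3 (x = 1), and the treated conditional survival is 1.0 and 0.5 respectively, giving 5/6.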

Appendix D. F-Measure under the Time-Varying Cox–Weibull Model

To facilitate the exploration and understanding of the F-measure, we calculate its true value under a time-varying Cox–Weibull model as an illustrative example. We follow the notation described in the main paper: $Z$ denotes treatment assignment (control = 0; treatment = 1); $X_t$ denotes the value of the marker at time $t$; $T$ denotes the failure time; and $c$ denotes the pre-specified time of interest for survival.
We consider the time-varying Cox–Weibull model:
$$h(t) = h_0(t) \exp(b_1 Z + b_2 X_t), \qquad h_0(t) = \lambda v t^{v-1},$$
where the marker value satisfies $X_t = I(t \ge t_s)$, and $t_s$ follows an exponential distribution with mean $\mu_z$. The definition of the F-measure reads:
$$F(c, t) = \frac{P(T \ge c \mid T \ge t, Z = 1) - \sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 0)}{P(T \ge c \mid T \ge t, Z = 1) - P(T \ge c \mid T \ge t, Z = 0)}.$$
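To build intuition for the model, failure times can be simulated by inverting the piecewise cumulative hazard implied by the marker switch at $t_s$. The sketch below is one possible way to do this; the function names and the parameter values are hypothetical and are not the settings used in the simulation study.

```python
import math
import random


def cumulative_hazard(tau, t_s, a, b2, v):
    """H(tau) when the marker switches from 0 to 1 at t_s; a = lam * exp(b1 * z)."""
    if tau <= t_s:
        return a * tau ** v
    return a * t_s ** v + a * math.exp(b2) * (tau ** v - t_s ** v)


def invert_cumulative_hazard(e, t_s, a, b2, v):
    """Solve H(T) = e for T by inverting the two pieces of H."""
    h_switch = a * t_s ** v  # cumulative hazard accrued up to the switch time
    if e <= h_switch:
        return (e / a) ** (1.0 / v)
    return ((e - h_switch) / (a * math.exp(b2)) + t_s ** v) ** (1.0 / v)


def draw_subject(z, lam, v, b1, b2, mu_z, rng):
    """Return (T, t_s) for one subject in arm z."""
    t_s = rng.expovariate(1.0 / mu_z)  # switch time with mean mu_z
    e = rng.expovariate(1.0)           # H(T) follows a unit exponential
    a = lam * math.exp(b1 * z)
    return invert_cumulative_hazard(e, t_s, a, b2, v), t_s


rng = random.Random(2022)
# Hypothetical parameter values, chosen only for illustration.
sample = [draw_subject(1, 0.1, 1.2, -0.5, 0.8, 2.0, rng) for _ in range(5)]
```

The marker value at any time $t$ then follows from the drawn switch time as $X_t = I(t \ge t_s)$.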
By Bayes' rule, the conditional survival probability can be written as:
$$P(T \ge c \mid T \ge t, Z = z) = \frac{P(T \ge c \mid Z = z)}{P(T \ge t \mid Z = z)}.$$
In general, for $\tau \in (0, \infty)$,
$$P(T \ge \tau \mid Z = z) = P(T \ge \tau, X_\tau = 1 \mid Z = z) + P(T \ge \tau, X_\tau = 0 \mid Z = z). \tag{A5}$$
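The first term of (A5) is:
$$\begin{aligned}
P(T \ge \tau, X_\tau = 1 \mid Z = z) &= \int_0^\tau P(T \ge \tau \mid t_s, Z = z)\, f(t_s \mid Z = z)\, dt_s \\
&= \int_0^\tau \exp\left\{ -\int_0^{t_s} \exp(b_1 z)\, h_0(u)\, du - \int_{t_s}^{\tau} \exp(b_1 z + b_2)\, h_0(u)\, du \right\} \cdot f(t_s \mid Z = z)\, dt_s.
\end{aligned}$$
Plugging in $\int_a^b h_0(u)\, du = \lambda (b^v - a^v)$ and the density $f(t_s \mid Z = z) = \frac{1}{\mu_z} \exp(-t_s / \mu_z)$ leads to:
$$P(T \ge \tau, X_\tau = 1 \mid Z = z) = \int_0^\tau \exp\left\{ -e^{b_1 z} \lambda t_s^v - e^{b_1 z + b_2} \lambda (\tau^v - t_s^v) \right\} \frac{1}{\mu_z} \exp\left( -\frac{t_s}{\mu_z} \right) dt_s. \tag{A6}$$
For a general Cox–Weibull model with $v > 0$, this integral has no closed-form expression, and we resort to numerical evaluation. Similarly, the second term of (A5) is:
$$\begin{aligned}
P(T \ge \tau, X_\tau = 0 \mid Z = z) &= \int_\tau^\infty P(T \ge \tau \mid t_s, Z = z)\, f(t_s \mid Z = z)\, dt_s \\
&= \int_\tau^\infty \exp\left\{ -\int_0^\tau \exp(b_1 z)\, h_0(u)\, du \right\} \cdot f(t_s \mid Z = z)\, dt_s \\
&= \int_\tau^\infty \exp\left\{ -e^{b_1 z} \lambda \tau^v \right\} \frac{1}{\mu_z} \exp\left( -\frac{t_s}{\mu_z} \right) dt_s \\
&= \exp\left\{ -\tau^v \lambda e^{b_1 z} - \tau / \mu_z \right\}.
\end{aligned} \tag{A7}$$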
Thus far, with Equations (A6) and (A7), we can calculate $P(T \ge c \mid T \ge t, Z = 1)$ and $P(T \ge c \mid T \ge t, Z = 0)$.
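As a rough sketch of the numerical evaluation mentioned above, the two terms of (A5) can be computed with a basic quadrature rule. The code below uses a hand-written composite Simpson rule to stay dependency-free; the function names and the choice of quadrature are assumptions of this sketch, and for $v < 1$ a finer grid or an adaptive integrator may be preferable because the integrand is less smooth near zero.

```python
import math


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def surv_marker_on(tau, z, lam, v, b1, b2, mu_z):
    """P(T >= tau, X_tau = 1 | Z = z): integral over switch times in [0, tau]."""
    a = lam * math.exp(b1 * z)

    def integrand(ts):
        cum_haz = a * ts ** v + a * math.exp(b2) * (tau ** v - ts ** v)
        return math.exp(-cum_haz - ts / mu_z) / mu_z

    return simpson(integrand, 0.0, tau)


def surv_marker_off(tau, z, lam, v, b1, mu_z):
    """P(T >= tau, X_tau = 0 | Z = z), which has a closed form for any v."""
    return math.exp(-lam * math.exp(b1 * z) * tau ** v - tau / mu_z)


def conditional_survival(c, t, z, lam, v, b1, b2, mu_z):
    """P(T >= c | T >= t, Z = z) as a ratio of marginal survival probabilities."""
    def marginal(tau):
        return (surv_marker_on(tau, z, lam, v, b1, b2, mu_z)
                + surv_marker_off(tau, z, lam, v, b1, mu_z))
    return marginal(c) / marginal(t)
```

A convenient sanity check is the case $b_2 = 0$: the marker then has no effect on the hazard, and the conditional survival reduces to $\exp\{-\lambda e^{b_1 z}(c^v - t^v)\}$.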
Next, we turn to the adjusted probability in the numerator of the F-measure, $\sum_x P(T \ge c \mid X_t = x, T \ge t, Z = 1)\, P(X_t = x \mid T \ge t, Z = 0)$. Using Bayes' rule, we obtain:
$$P(T \ge c \mid X_t = 1, T \ge t, Z = 1) = \frac{P(T \ge c, X_t = 1 \mid Z = 1)}{P(T \ge t, X_t = 1 \mid Z = 1)}. \tag{A8}$$
The denominator of Equation (A8) is shown in (A6). The numerator of Equation (A8) satisfies:
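$$\begin{aligned}
P(T \ge c, X_t = 1 \mid Z = 1) &= P(T \ge c, t_s \le t \mid Z = 1) \\
&= \int_0^t P(T \ge c \mid t_s, Z = 1)\, f(t_s \mid Z = 1)\, dt_s \\
&= \int_0^t \exp\left\{ -\int_0^{t_s} \exp(b_1)\, h_0(u)\, du - \int_{t_s}^{c} \exp(b_1 + b_2)\, h_0(u)\, du \right\} \cdot f(t_s \mid Z = 1)\, dt_s.
\end{aligned}$$
Plugging in $\int_a^b h_0(u)\, du = \lambda (b^v - a^v)$ and the density $f(t_s \mid Z = z) = \frac{1}{\mu_z} \exp(-t_s / \mu_z)$ leads to:
$$P(T \ge c, X_t = 1 \mid Z = 1) = \int_0^t \exp\left\{ -e^{b_1} \lambda t_s^v - e^{b_1 + b_2} \lambda (c^v - t_s^v) \right\} \cdot \frac{1}{\mu_1} \exp\left( -\frac{t_s}{\mu_1} \right) dt_s.$$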
Following the same lines, we work on:
$$P(T \ge c \mid X_t = 0, T \ge t, Z = 1) = \frac{P(T \ge c, X_t = 0 \mid Z = 1)}{P(T \ge t, X_t = 0 \mid Z = 1)}. \tag{A10}$$
The numerator of Equation (A10) reads:
$$P(T \ge c, X_t = 0 \mid Z = 1) = P(T \ge c \mid Z = 1) - P(T \ge c, X_t = 1 \mid Z = 1),$$
and the denominator reads:
$$P(T \ge t, X_t = 0 \mid Z = 1) = P(T \ge t \mid Z = 1) - P(T \ge t, X_t = 1 \mid Z = 1).$$
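To simplify the math, we introduce the following notation:
$$\begin{aligned}
A(\tau, z) &:= P(T \ge \tau, X_\tau = 1 \mid Z = z) = \int_0^\tau \exp\left\{ -e^{b_1 z} \lambda t_s^v - e^{b_1 z + b_2} \lambda (\tau^v - t_s^v) \right\} \frac{1}{\mu_z} \exp\left( -\frac{t_s}{\mu_z} \right) dt_s, \\
B(\tau, z) &:= P(T \ge \tau, X_\tau = 0 \mid Z = z) = \exp\left\{ -\tau^v \lambda e^{b_1 z} - \tau / \mu_z \right\}, \\
C &:= P(T \ge c, X_t = 1 \mid Z = 1) = \int_0^t \exp\left\{ -e^{b_1} \lambda t_s^v - e^{b_1 + b_2} \lambda (c^v - t_s^v) \right\} \frac{1}{\mu_1} \exp\left( -\frac{t_s}{\mu_1} \right) dt_s.
\end{aligned}$$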
With the above notation, Equation (A5) reads:
$$P(T \ge \tau \mid Z = z) = A(\tau, z) + B(\tau, z),$$
Equation (A8) reads:
$$P(T \ge c \mid X_t = 1, T \ge t, Z = 1) = \frac{C}{A(t, 1)},$$
and Equation (A10) reads:
$$P(T \ge c \mid X_t = 0, T \ge t, Z = 1) = \frac{A(c, 1) + B(c, 1) - C}{A(t, 1) + B(t, 1) - A(t, 1)} = \frac{A(c, 1) + B(c, 1) - C}{B(t, 1)}.$$
The remaining parts in the F-measure definition are:
$$\begin{aligned}
P(X_t = 1 \mid T \ge t, Z = 0) &= \frac{P(X_t = 1, T \ge t \mid Z = 0)}{P(T \ge t \mid Z = 0)} = \frac{A(t, 0)}{A(t, 0) + B(t, 0)}, \\
P(X_t = 0 \mid T \ge t, Z = 0) &= 1 - P(X_t = 1 \mid T \ge t, Z = 0) = \frac{B(t, 0)}{A(t, 0) + B(t, 0)}.
\end{aligned}$$
Gathering all the pieces together, we have:
$$\begin{aligned}
P(T \ge c \mid T \ge t, Z = 1) &= \frac{A(c, 1) + B(c, 1)}{A(t, 1) + B(t, 1)}, \\
P(T \ge c \mid T \ge t, Z = 0) &= \frac{A(c, 0) + B(c, 0)}{A(t, 0) + B(t, 0)}, \\
P(T \ge c \mid X_t = 1, T \ge t, Z = 1)\, P(X_t = 1 \mid T \ge t, Z = 0) &= \frac{C\, A(t, 0)}{A(t, 1)\, \{A(t, 0) + B(t, 0)\}}, \\
P(T \ge c \mid X_t = 0, T \ge t, Z = 1)\, P(X_t = 0 \mid T \ge t, Z = 0) &= \frac{\{A(c, 1) + B(c, 1) - C\}\, B(t, 0)}{B(t, 1)\, \{A(t, 0) + B(t, 0)\}}.
\end{aligned}$$
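The pieces gathered above can be assembled into a single routine for the true F-measure. The sketch below evaluates $A$, $B$, and $C$ numerically for general $v$ with a composite Simpson rule and then combines them; the function names, the quadrature choice, and the parameter values in the usage lines are assumptions made for illustration and are not the settings behind Table 1.

```python
import math


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def joint_on(c, t, z, lam, v, b1, b2, mu_z):
    """P(T >= c, X_t = 1 | Z = z) for c >= t: integrate switch times over [0, t]."""
    a = lam * math.exp(b1 * z)

    def integrand(ts):
        cum_haz = a * ts ** v + a * math.exp(b2) * (c ** v - ts ** v)
        return math.exp(-cum_haz - ts / mu_z) / mu_z

    return simpson(integrand, 0.0, t)


def f_measure(c, t, lam, v, b1, b2, mu0, mu1):
    """True F(c, t) under the time-varying Cox-Weibull model of this appendix."""
    mu = {0: mu0, 1: mu1}

    def A(tau, z):
        return joint_on(tau, tau, z, lam, v, b1, b2, mu[z])

    def B(tau, z):
        return math.exp(-lam * math.exp(b1 * z) * tau ** v - tau / mu[z])

    C = joint_on(c, t, 1, lam, v, b1, b2, mu1)
    s1 = (A(c, 1) + B(c, 1)) / (A(t, 1) + B(t, 1))
    s0 = (A(c, 0) + B(c, 0)) / (A(t, 0) + B(t, 0))
    p1 = A(t, 0) / (A(t, 0) + B(t, 0))  # P(X_t = 1 | T >= t, Z = 0)
    adjusted = (C / A(t, 1)) * p1 + ((A(c, 1) + B(c, 1) - C) / B(t, 1)) * (1.0 - p1)
    return (s1 - adjusted) / (s1 - s0)


# Hypothetical parameters: a marker with no effect on the hazard (b2 = 0).
f_no_marker_effect = f_measure(5.0, 1.0, 0.1, 1.2, -0.5, 0.0, 2.0, 1.0)
# Hypothetical parameters: both a direct effect (b1) and a marker effect (b2).
f_mixed = f_measure(5.0, 1.0, 0.1, 1.2, -0.5, 0.8, 2.0, 1.0)
```

When $b_2 = 0$ the conditional survival in the treatment arm does not depend on the marker, so the numerator vanishes and the routine returns a value that is zero up to numerical error.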
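When $v = 1$ (i.e., the baseline hazard is constant), the terms $A$ and $C$ admit closed-form expressions:
$$\begin{aligned}
A(\tau, z) &:= \frac{\exp\left( -\tau \lambda e^{b_1 z + b_2} \right) - \exp\left\{ -\tau \left( \lambda e^{b_1 z} + 1 / \mu_z \right) \right\}}{1 + \lambda \mu_z e^{b_1 z} (1 - e^{b_2})}, \\
B(\tau, z) &:= \exp\left\{ -\tau \left( \lambda e^{b_1 z} + 1 / \mu_z \right) \right\}, \\
C &:= \exp\left( -c \lambda e^{b_1 + b_2} \right) \frac{1 - \exp\left\{ -\frac{t}{\mu_1} + \lambda t e^{b_1} (-1 + e^{b_2}) \right\}}{1 + \lambda \mu_1 e^{b_1} (1 - e^{b_2})}.
\end{aligned}$$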
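The closed-form expressions for $v = 1$ can be cross-checked against direct numerical integration of the defining integrals. The sketch below does this with a composite Simpson rule; the function names and the parameter values are hypothetical, and the closed forms require the denominator $1 + \lambda \mu_z e^{b_1 z}(1 - e^{b_2})$ to be nonzero.

```python
import math


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def joint_on_numeric(c, t, z, lam, b1, b2, mu_z):
    """Numerical P(T >= c, X_t = 1 | Z = z) with v = 1 (integral over [0, t])."""
    a = lam * math.exp(b1 * z)

    def integrand(ts):
        return math.exp(-a * ts - a * math.exp(b2) * (c - ts) - ts / mu_z) / mu_z

    return simpson(integrand, 0.0, t)


def a_closed(tau, z, lam, b1, b2, mu_z):
    """Closed-form A(tau, z) for v = 1."""
    a = lam * math.exp(b1 * z)
    num = math.exp(-tau * a * math.exp(b2)) - math.exp(-tau * (a + 1.0 / mu_z))
    return num / (1.0 + mu_z * a * (1.0 - math.exp(b2)))


def c_closed(c, t, lam, b1, b2, mu1):
    """Closed-form C for v = 1 (treatment arm, z = 1)."""
    a = lam * math.exp(b1)
    inner = math.exp(-t / mu1 + t * a * (-1.0 + math.exp(b2)))
    denom = 1.0 + mu1 * a * (1.0 - math.exp(b2))
    return math.exp(-c * a * math.exp(b2)) * (1.0 - inner) / denom


# Hypothetical parameter values for the comparison.
lam, b1, b2, mu1 = 0.2, -0.5, 0.7, 2.0
gap_a = abs(a_closed(3.0, 1, lam, b1, b2, mu1)
            - joint_on_numeric(3.0, 3.0, 1, lam, b1, b2, mu1))
gap_c = abs(c_closed(5.0, 1.0, lam, b1, b2, mu1)
            - joint_on_numeric(5.0, 1.0, 1, lam, b1, b2, mu1))
```

Both gaps should be at the level of quadrature error, which supports the signs and exponents in the closed forms.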

References

  1. FDA. New drug, antibiotic and biological drug product regulations: Accelerated approval. Fed. Regist. 1992, 57, 13234–13242.
  2. Baker, S.G.; Kramer, B.S. A perfect correlate does not a surrogate make. BMC Med. Res. Methodol. 2003, 3, 16.
  3. Fleming, T.R.; Powers, J.H. Biomarkers and surrogate endpoints in clinical trials. Stat. Med. 2012, 31, 2973–2984.
  4. Zhuang, R.; Chen, Y.Q. Measuring Surrogacy in Clinical Research. Stat. Biosci. 2020, 12, 295–323.
  5. Prentice, R.L. Surrogate endpoints in clinical trials: Definition and operational criteria. Stat. Med. 1989, 8, 431–440.
  6. Freedman, L.S.; Graubard, B.I.; Schatzkin, A. Statistical validation of intermediate endpoints for chronic diseases. Stat. Med. 1992, 11, 167–178.
  7. Lin, D.Y.; Fleming, T.R.; De Gruttola, V. Estimating the proportion of treatment effect explained by a surrogate marker. Stat. Med. 1997, 16, 1515–1527.
  8. Bycott, P.W.; Taylor, J.M. An evaluation of a measure of the proportion of the treatment effect explained by a surrogate marker. Control. Clin. Trials 1998, 19, 555–568.
  9. Wang, Y.; Taylor, J.M. A measure of the proportion of treatment effect explained by a surrogate marker. Biometrics 2002, 58, 803–812.
  10. Kaplan, E.L.; Meier, P. Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 1958, 53, 457–481.
  11. Robins, J.M.; Greenland, S. Identifiability and exchangeability for direct and indirect effects. Epidemiology 1992, 3, 143–155.
  12. Pearl, J. Direct and indirect effects. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, Seattle, WA, USA, 2–5 August 2001; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2001; pp. 411–420.
  13. Taylor, J.M.; Wang, Y.; Thiébaut, R. Counterfactual Links to the Proportion of Treatment Effect Explained by a Surrogate Marker. Biometrics 2005, 61, 1102–1111.
  14. Austin, P.C. Generating survival times to simulate Cox proportional hazards models with time-varying covariates. Stat. Med. 2012, 31, 3946–3958.
  15. Cohen, M.S.; Chen, Y.Q.; McCauley, M.; Gamble, T.; Hosseinipour, M.C.; Kumarasamy, N.; Hakim, J.G.; Kumwenda, J.; Grinsztejn, B.; Pilotto, J.H.; et al. Antiretroviral therapy for the prevention of HIV-1 transmission. N. Engl. J. Med. 2016, 375, 830–839.
  16. Murray, J.S.; Elashoff, M.R.; Iacono-Connors, L.C.; Cvetkovich, T.A.; Struble, K.A. The use of plasma HIV RNA as a study endpoint in efficacy trials of antiretroviral drugs. AIDS 1999, 13, 797–804.
  17. Fleming, T.R.; Harrington, D.P. Counting Processes and Survival Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2011; Volume 169.
Figure 1. F-measure curves describing the surrogacy level for survival status at Year 5.
Table 1. Simulation results under Cox–Weibull distribution. The sample size of the study is 20,000 subjects and the coverage probability is obtained by 1000 replicates.

c = 5, v = 0.8

| Scenario | Marker time a | True value | Bias | Sampling SE | Mean of SE | Coverage |
|---|---|---|---|---|---|---|
| Perfect | 0.25 | 0.747 | 0.003 | 0.052 | 0.051 | 0.941 |
| | 0.5 | 0.932 | 0.002 | 0.054 | 0.055 | 0.956 |
| | 1 | 0.995 | 0.003 | 0.059 | 0.059 | 0.953 |
| | 2 | 1.000 | 0.007 | 0.081 | 0.080 | 0.948 |
| Useless | 0.25 | 0.000 | 0.001 | 0.034 | 0.034 | 0.948 |
| | 0.5 | 0.000 | 0.001 | 0.031 | 0.030 | 0.951 |
| | 1 | 0.000 | 0.001 | 0.023 | 0.023 | 0.949 |
| | 2 | 0.000 | 0.001 | 0.017 | 0.016 | 0.944 |
| Partial | 0.25 | 0.197 | 0.003 | 0.051 | 0.051 | 0.955 |
| | 0.5 | 0.229 | 0.001 | 0.047 | 0.047 | 0.953 |
| | 1 | 0.213 | 0.002 | 0.038 | 0.037 | 0.952 |
| | 2 | 0.167 | 0.003 | 0.030 | 0.030 | 0.957 |

c = 5, v = 1

| Scenario | Marker time a | True value | Bias | Sampling SE | Mean of SE | Coverage |
|---|---|---|---|---|---|---|
| Perfect | 0.25 | 0.743 | 0.002 | 0.043 | 0.044 | 0.951 |
| | 0.5 | 0.931 | 0.002 | 0.044 | 0.046 | 0.954 |
| | 1 | 0.995 | 0.002 | 0.047 | 0.048 | 0.949 |
| | 2 | 1.000 | 0.004 | 0.062 | 0.063 | 0.945 |
| Useless | 0.25 | 0.000 | 0.001 | 0.033 | 0.034 | 0.964 |
| | 0.5 | 0.000 | 0.000 | 0.030 | 0.030 | 0.958 |
| | 1 | 0.000 | 0.000 | 0.022 | 0.022 | 0.954 |
| | 2 | 0.000 | 0.001 | 0.015 | 0.016 | 0.954 |
| Partial | 0.25 | 0.204 | 0.002 | 0.051 | 0.051 | 0.951 |
| | 0.5 | 0.241 | 0.000 | 0.046 | 0.046 | 0.953 |
| | 1 | 0.228 | 0.000 | 0.036 | 0.036 | 0.960 |
| | 2 | 0.181 | 0.001 | 0.028 | 0.029 | 0.951 |

c = 5, v = 1.2

| Scenario | Marker time a | True value | Bias | Sampling SE | Mean of SE | Coverage |
|---|---|---|---|---|---|---|
| Perfect | 0.25 | 0.742 | 0.002 | 0.036 | 0.036 | 0.949 |
| | 0.5 | 0.930 | 0.002 | 0.036 | 0.037 | 0.956 |
| | 1 | 0.995 | 0.001 | 0.037 | 0.038 | 0.952 |
| | 2 | 1.000 | 0.003 | 0.049 | 0.048 | 0.940 |
| Useless | 0.25 | 0.000 | 0.001 | 0.037 | 0.037 | 0.953 |
| | 0.5 | 0.000 | 0.000 | 0.032 | 0.032 | 0.955 |
| | 1 | 0.000 | 0.000 | 0.023 | 0.023 | 0.943 |
| | 2 | 0.000 | 0.000 | 0.016 | 0.016 | 0.953 |
| Partial | 0.25 | 0.219 | 0.003 | 0.055 | 0.054 | 0.948 |
| | 0.5 | 0.262 | 0.000 | 0.049 | 0.049 | 0.956 |
| | 1 | 0.252 | 0.000 | 0.037 | 0.038 | 0.947 |
| | 2 | 0.203 | 0.001 | 0.030 | 0.030 | 0.952 |

a The time point (year) measuring the marker.
Table 2. Application to an HIV prevention trial HPTN 052. The proposed time-varying F-measure captures the proportion of treatment effect explained by the plasma HIV-1 viral load.

| Marker Time a | Delayed ART Arm: Prevalence of Viral Load ≥ 1000 | Delayed ART Arm: Hazard Ratio b | Immediate ART Arm: Prevalence of Viral Load ≥ 1000 | Immediate ART Arm: Hazard Ratio b | F-Measure: Estimator | F-Measure: 95% CI |
|---|---|---|---|---|---|---|
| 2 | 0.88 | 1.39 | 0.08 | 2.20 | 0.18 | −0.03, 0.39 |
| 3 | 0.88 | 0.94 | 0.08 | 3.21 * | 0.41 | 0.13, 0.70 |
| 4 | 0.87 | 1.00 | 0.09 | 4.49 * | 0.52 | 0.09, 0.95 |
| 5 | 0.85 | 1.59 | 0.08 | 5.59 * | 0.72 | 0.10, 1.34 |
| 6 | 0.81 | 2.51 * | 0.07 | 4.49 * | 1.12 | −0.42, 2.67 |
| 7 | 0.75 | 2.11 | 0.08 | 6.55 * | 0.81 | −0.92, 2.54 |

a The time point (quarter) when plasma HIV-1 viral load was measured. b Hazard ratio between groups with a viral load higher and lower than 1000 copies per cubic millimeter. Significant results at the level of 0.05 are marked with *.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Zhuang, R.; Xia, F.; Wang, Y.; Chen, Y.-Q. A Surrogate Measure for Time-Varying Biomarkers in Randomized Clinical Trials. Mathematics 2022, 10, 584. https://doi.org/10.3390/math10040584
