1. INTRODUCTION
In late 2010, internationally respected Canadian authorities in ovarian cancer convened in Toronto, Ontario, to discuss the relevance of progression-free survival (PFS) as an endpoint in treatment and also the importance of PFS and its impact on the development of clinical trials. This pan-Canadian workshop was led by Dr. Amit Oza. The expert panel participants included Drs. Marc Buyse, Mark Brady, Al Covens, Corneel Coens, Prafull Ghatage, Jean Gregoire, Hal Hirte, Paul Hoskins, Helen Mackay, Jean Maroun, Dianne Miller, Marie Plante, Diane Provencher, Barry Rosen, Gavin Stuart, Katia Tonkin, Johanne Weberpals, and Stephen Welch. The findings presented here represent ideas generated during the workshop with respect to
an evaluation of the relevance of PFS as a valid endpoint in ovarian cancer.
a Canadian consensus on the relevance of PFS in ovarian cancer.
an attempt to address how PFS translates into clinical benefit in ovarian cancer.
2. BACKGROUND
Ovarian cancer is the leading cause of gynecologic cancer mortality in Canada, responsible for an estimated 2600 new cases and 1750 deaths in 2010 [1]. Most ovarian cancers are epithelial tumours, and the present review focuses on treatment and outcomes for the group of patients with that tumour type.
Currently, standard first-line treatment of advanced disease involves cytoreductive surgery combined with platinum-based chemotherapy [2]. Most patients will initially respond well to treatment, but unfortunately, approximately three quarters of all women treated will develop recurrent disease and will no longer be considered curable [3]. Treatment after recurrence focuses on prolonging life and improving quality of life (QOL) [4]. To that end, phase III clinical trials are designed to ascertain whether new treatments are superior to standard therapy or whether they positively affect patient QOL (or both); to understand toxicity profiles; and to evaluate the economic consequences of implementing the new treatments. Importantly, study endpoints must be clearly defined at the outset, the “ultimate endpoint” being to afford significant clinical benefit to the patient. To date, overall survival (OS) has been considered the most clinically relevant endpoint in oncology trials; it remains the “gold standard” because of its relevance and objectivity.
3. OVARIAN CANCER AS A CHRONIC DISEASE AND THE CHALLENGES OF USING OS
It is widely recognized that OS has become a challenging endpoint in oncology trials because of the prolonged time required to adequately assess survival and the resulting delay in designing subsequent trials. Ultimately, this strategy is very costly. In addition, when OS is used, confounding effects of post-study therapies and trial crossover (whereby patients who fail on standard-arm therapy are subsequently permitted to switch to the experimental treatment) are seen. The pattern of increasing sequential salvage therapies in ovarian cancer makes the measurement of OS, and of the individual effect of a specific therapy on that endpoint, more challenging. Because of biologic complexity in ovarian cancer, collection of OS data is particularly challenging. Since about the end of the 1950s, the median survival of women with advanced ovarian cancer has improved from less than 1 year to more than 3 years in most recent publications [5, 6, 7]. Although this progress may be related in part to patient selection, it has transformed ovarian cancer from an acute condition with limited treatment options into a chronic disease in which numerous treatments are used in sequence. As a result, to show a difference in OS, contemporary trials must follow large cohorts of patients for prolonged periods.
In most trials, a significant consideration is that the analysis might be confounded by therapy received after progression. Data concerning OS are especially likely to be obscured when trials are designed with a planned crossover to the experimental arm or when the likelihood is high that patients will subsequently receive the experimental treatment once they progress on the standard therapy. Paradoxically, for ethical reasons and because of pressure from patients and physicians, crossover designs are more often used in the study of drugs presumed to be effective, thereby increasing the risk that any OS benefit present in those newly developed effective treatments will be overshadowed.
Taken together, the foregoing factors have made a statistical proof of survival advantage derived from a single line of therapy difficult to attain even for active treatments. That difficulty has led to the adoption of more convenient—and in some cases, arguably more relevant—surrogate endpoints expected to correlate clinically and statistically with the original endpoint.
4. PFS AS A SURROGATE ENDPOINT
The most commonly used surrogate endpoint in oncology is PFS. The advantages of this endpoint are an earlier and more sensitive assessment of antitumour efficacy, a lower likelihood of influence by competing risks (especially in elderly subjects), and a lesser chance of confounding because of treatments received after progression. However, the clinical significance of PFS remains unclear. For PFS to be universally accepted as a surrogate for improved survival, delayed progression must be shown to correlate with improved survival, or alternatively, prolonged asymptomatic periods must be shown to translate into an improvement in QOL. Conversely, absence of a PFS benefit must also be demonstrated to mean that the experimental therapy is unlikely to yield a survival advantage. Those proofs should come in the form of statistical correlations between PFS and OS data (where both are available) and of scores on validated QOL tools incorporated into the study design. Also, in contrast to OS, which relies on date of death (a bias-free time point), PFS is based on predefined progression endpoints, which depend on bias-prone variables such as the methods of assessment, the moment of assessment, and the definition of progression. It can also be argued that statistical correlation between PFS and OS is, in itself, not proof or indication of surrogacy [8] and that formal validation requires detailed observations and analysis. Yet, despite those limitations, PFS has been validated as correlating with OS in some settings such as that of metastatic colorectal cancer [9, 10].
We therefore convened the workshop reported here to discuss the existing evidence about the validity of PFS as a surrogate for OS in the various clinical settings of ovarian cancer, with the aim of developing a consensus statement to guide those who ultimately determine drug approval policy.
5. PFS AND OS IN THE SETTING OF FIRST-LINE TREATMENT OF OVARIAN CANCER: THE EVIDENCE
A significant number of phase III clinical trials assessing first-line treatment for ovarian cancer have been published. Many include data pertaining to PFS and OS as primary or secondary trial endpoints, which allows for a direct comparison and assessment of the correlation between those outcomes. Our review of the literature revealed a significant number of trials illustrating correlated results for PFS and OS, reflecting both positive and negative findings. Some of those results are summarized in the next subsection and in Table 1.
In the setting of adjuvant treatment for women with early-stage ovarian cancer, the combined analysis of ICON1 (International Collaborative Ovarian Neoplasm 1) and ACTION (Adjuvant Chemotherapy in Ovarian Neoplasm) observed a statistically significant advantage in recurrence-free survival and OS for patients receiving platinum-based adjuvant chemotherapy [12]. In 1996, McGuire and colleagues [11] established the superiority of the paclitaxel and cisplatin combination compared with the accepted standard combination of cyclophosphamide and cisplatin. Their study showed a highly significant 5-month PFS advantage and a 14-month survival advantage for the experimental combination [11]. An Arbeitsgemeinschaft Gynäkologische Onkologie (AGO)/Groupe d’Investigateurs Nationaux pour l’Etude des Cancers Ovariens Intergroup study examined the use of triple combination therapy consisting of paclitaxel, carboplatin, and epirubicin compared with paclitaxel and carboplatin therapy. Based on analysis of PFS and OS, no superiority for the triplet therapy was found. Because of its more favourable toxicity profile, the doublet therapy therefore remained the standard of care [13]. In a separate phase III study examining the utility of topotecan consolidation therapy after standard chemotherapy, findings showed that PFS and OS were not significantly affected by that addition, and the trial was concluded to be negative [14]. Interestingly, a recent investigation by Katsumata et al., which compared the efficacy of a dose-dense combination of paclitaxel and carboplatin with the standard combination regimen, showed concordant PFS and OS results favouring the dose-dense regimen [15].
Thus, there appears to be a correlation between PFS and OS in the first-line setting for trials of cytotoxic chemotherapy, but it is unclear if the same holds true for trials incorporating targeted agents. The first two trials to report results for such agents were GOG 218 (Gynecologic Oncology Group 218) and ICON7. Those studies both examined the addition of bevacizumab, with or without a maintenance phase, to front-line chemotherapy. Preliminary results from both studies showed a clear PFS advantage favouring arms in which bevacizumab was administered both concurrently with chemotherapy and as maintenance therapy afterward [16]. Survival data from those trials are not yet mature, but interim analysis has so far failed to show a survival benefit. Those results highlight the importance of consensus with regard to the interpretation of data pertaining to PFS improvements and the acceptance of PFS as a surrogate endpoint when new classes of agents are introduced in randomized trials.
Recently, a review article considered statistical approaches that may be of use in validating surrogate endpoints. It concluded that meta-analysis based on individual patient data from multiple trials is an accurate method of validation. That approach was used to assess the validity of PFS as an accurate surrogate endpoint for OS in ovarian cancer. The meta-analysis assessed four trials that all addressed the same treatment-related question in advanced ovarian cancer patients and concluded that an acceptably high statistical correlation between PFS and OS had been observed [17]. However, all existing data reflect the use of cytotoxic treatments rather than targeted agents. Because targeted agents differ in their mechanisms of action and are given for prolonged periods, the validity of PFS as an endpoint for cytotoxic therapy cannot be presumed to extend to targeted therapies in the same clinical setting.
5.1. PFS In Platinum-Sensitive Relapse
The data presented so far demonstrate significant correlation between PFS and OS in chemotherapy-naïve patients. However, such a relationship is not yet well-defined in relapsed disease. Moreover, an important distinction should be made between platinum-sensitive and platinum-resistant patients.
It has been consistently demonstrated that ovarian cancer sensitive to platinum agents (>6-month platinum-free interval) shows higher response rates to chemotherapy (including platinum agents) [18, 19]. That observation is important, because it translates into a longer time to disease progression, a higher survival rate, and longer treatment-free intervals [19]. Thus, because of a lower rate of mortality events, trials of platinum-sensitive relapsed ovarian cancer (compared with trials of platinum-resistant disease) have to accrue more patients and require longer follow-up to obtain meaningful survival data. Moreover, correlating a QOL benefit with PFS in platinum-sensitive disease can be challenging, because the incremental benefit in QOL is likely to occur during treatment-free intervals, and experience has shown high, often nonrandom, dropout rates for QOL questionnaires in the latter phases of randomized clinical trials [20, 21, 22]. This latter issue is not as likely to be a concern in studies of platinum-refractory disease, in which the time on treatment is shorter and the treatment-free interval is generally short-lived or nonexistent.
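The link between a low event rate and a larger, longer trial can be made concrete with a rough calculation. The sketch below uses Schoenfeld's approximation for the number of events needed by a log-rank comparison with 1:1 allocation. That is a standard textbook formula and not a method taken from the trials discussed here, and the hazard ratio, power, and event probabilities are hypothetical values chosen only for illustration.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Approximate number of events needed to detect a given hazard ratio.

    Schoenfeld's approximation for a two-sided log-rank test with
    equal allocation to the two arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)


def required_patients(events, event_probability):
    """Patients to enroll so that the expected number of events is reached.

    event_probability is the assumed chance that an enrolled patient has
    the event (for example, death) during the planned follow-up.
    """
    return math.ceil(events / event_probability)


# Hypothetical target effect: hazard ratio of 0.75 for OS.
events = required_events(0.75)

# Hypothetical event probabilities during follow-up (illustrative only).
for setting, probability in [("higher event rate", 0.8), ("lower event rate", 0.5)]:
    print(setting, required_patients(events, probability))
```

For a fixed target effect, the number of events is fixed, so a setting in which fewer patients die during follow-up needs more patients, longer follow-up, or both.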
In contrast to the first-line setting, few studies in platinum-sensitive recurrent ovarian cancer help to establish whether PFS correlates with QOL or OS. The largest of the available trials, and the only one sufficiently powered to assess OS, is the pooled analysis of two similarly designed phase III trials: ICON4 and AGO-OVAR 2.2. Those trials randomized patients either to conventional platinum-based chemotherapy or to platinum plus paclitaxel.
The final analysis showed an advantage in PFS (13 months vs. 10 months, p = 0.0004) that translated into an OS advantage (29 months vs. 24 months, p = 0.02) [23]. Another smaller trial comparing single-agent paclitaxel with cyclophosphamide, doxorubicin, and cisplatin combination chemotherapy also showed a concordant benefit in PFS and OS, with advantage to the platinum-based arm [24].
Many phase III trials that are well designed to demonstrate a PFS advantage are not powered or conducted to report a survival outcome. For example, the AGO-led GCIG (Gynecologic Cancer Intergroup) study in platinum-sensitive women (>6 months) compared single-agent carboplatin with a carboplatin–gemcitabine combination and showed an advantage for PFS, but not for OS, favouring gemcitabine [25]. The second GCIG study, led by the Grupo Español de Investigación de Cáncer de Ovario, compared single-agent carboplatin with a carboplatin and pegylated doxorubicin combination. The study was powered for PFS equivalence, but demonstrated that the median PFS was significantly superior for combination therapy. The difference in OS was not statistically significant, however [26]. It is important to emphasize that both trials were underpowered to assess OS and that survival data may have been confounded by contamination of the control arm through use of the experimental drug in the community.
Table 2 summarizes selected trials.
For studies in which QOL information was collected, improvement in PFS did not translate into better QOL. It should be noted that the quality of the data at later time points was limited [25, 26, 27], and as pointed out previously, was perhaps affected by nonrandom dropout, possibly obscuring QOL benefits.
To summarize, current data concerning whether PFS correlates with OS or QOL in the setting of platinum-sensitive recurrent ovarian cancer are inconclusive. We emphasize, however, that OS differences can be challenging to demonstrate in this patient population because of previously described limitations such as a low event rate and contamination of the control arm. We also argue that extended time off therapy can, in itself, be a clinically relevant endpoint, although it has been difficult to prove by standardized QOL questionnaires.
5.2. Platinum-Resistant Disease
Unfortunately, median survival is poor in platinum-resistant ovarian cancer [28, 29, 30], and as a result, OS data are easier to obtain and less likely to be confounded by post-trial therapy. In this setting, patients rarely enjoy prolonged treatment-free intervals and frequently move from one treatment to another upon progression. However, data generated to date are insufficient to establish a definitive relationship between PFS and OS or QOL in patients with platinum-resistant disease. Only one randomized trial, which compared either topotecan or pegylated doxorubicin with the experimental drug canfosfamide, demonstrated a survival benefit (for the control arm) [28]. However, in a retrospective analysis of eleven completed GOG phase II trials in platinum-refractory patients, PFS at 6 months in each trial correlated with OS (Pearson r = 0.661, p = 0.027, and Kendall tau-b r = 0.514, p = 0.029) [29]. That observation is particularly intriguing, but the absence of randomization and the retrospective nature of the analysis make its interpretation difficult.
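For readers less familiar with the two statistics quoted above, the following sketch shows how a Pearson correlation and a Kendall tau-b rank correlation can be computed at the trial level. The data are hypothetical and are not the GOG results; the function names are ours. With only a handful of trials, as in the analysis cited, such coefficients carry wide uncertainty.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kendall_tau_b(x, y):
    """Kendall tau-b rank correlation, which adjusts for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from both terms
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs_untied_x = concordant + discordant + tied_y_only
    pairs_untied_y = concordant + discordant + tied_x_only
    return (concordant - discordant) / math.sqrt(pairs_untied_x * pairs_untied_y)


# Hypothetical trial-level data (one value per trial, illustrative only):
# proportion progression-free at 6 months and median OS in months.
pfs_6_months = [0.10, 0.15, 0.12, 0.22, 0.18, 0.25]
median_os = [8.0, 9.5, 10.0, 12.0, 10.5, 13.0]

print("Pearson r:", round(pearson_r(pfs_6_months, median_os), 3))
print("Kendall tau-b:", round(kendall_tau_b(pfs_6_months, median_os), 3))
```

Pearson's coefficient measures linear association, whereas tau-b depends only on the ordering of the trials, which is why the two values reported for the same data can differ.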
Hence, our opinion is that an improvement in OS or a clearly demonstrated improvement in QOL scored using standardized QOL questionnaires is the preferred endpoint in platinum-refractory ovarian cancer. That opinion holds particularly true if a trial is not likely to be hampered by contamination or crossover. Contamination or crossover is likely to occur in some trials, and in such cases, PFS remains a relevant endpoint that will need to be interpreted within the global context of the trial, taking into account frequency of crossover or contamination, magnitude of benefit, toxicity, QOL, and costs associated with therapy.
As has been discussed, considerable support is available for the correlation between PFS and OS in the first-line treatment of ovarian cancer. However, the relationship between PFS and OS remains unclear in the context of relapsed disease and in the setting of maintenance therapy. The limitations of OS as an endpoint should be recognized, and yet we believe that it may be appropriate to use PFS as the primary endpoint when
crossover or contamination is expected.
the absolute gain in PFS is clinically relevant.
PFS is supported by one or more other endpoints (for example, OS, QOL).
It is important to note that the use of PFS as an endpoint must be validated before it will be accepted by regulatory authorities. Because the PFS-to-OS correlation is not clear in most situations, it is essential to ensure that current studies are adequately powered for both OS and PFS so that data supporting a strong correlation between these two endpoints can be accumulated.
There are pitfalls in the use of PFS as a surrogate for OS. Surrogacy of PFS is highly tumour-dependent, and it may also be treatment-dependent, meaning that surrogacy may have to be re-established for every new class of treatment. Because individual trials may allow crossover or may have similar issues that may impair the establishment of a PFS-to-OS correlation, evidence must come from meta-analyses of completed trials, including completed trials that did not address contemporary questions and that used older therapies and various definitions of PFS.
6. PFS IN THE SETTING OF MAINTENANCE THERAPY
The relationship between PFS and OS for patients undergoing maintenance therapy may be different from that for patients undergoing first-line therapy.
First, the current definition of maintenance chemotherapy leaves several issues unresolved:
Does “maintenance therapy” refer solely to patients in complete remission after induction treatment, or does it include those with residual disease?
What do we call patients receiving “maintenance” after second- or third-line treatment?
How are trials that treat until progression to be interpreted compared with those that continue “maintenance therapy” for a predetermined period, and is the relationship the same for both scenarios?
The answers to those questions are currently unknown. Moreover, because each subsequent progression-free period may be shorter than the earlier ones, comparisons between trials may be difficult unless the patient populations are clearly defined.
The 2006 Workshop on Endpoints for Regulatory Approval (a joint effort of the U.S. Food and Drug Administration, the American Society of Clinical Oncology, and the American Association for Cancer Research) concluded that OS is the most significant endpoint for studies involving maintenance chemotherapy, but that improvement in PFS might be acceptable if the agent under investigation had few toxicities, if the potential for assessment bias were to be reduced (blinded designs were encouraged), and if biologic agents were not going to affect subsequent cytotoxic treatment [30]. A Cochrane review of 902 women in six trials of maintenance therapy failed to show a PFS or OS benefit; however, the authors believed that they could not comment on the clinical benefit because of insufficient data [31]. In a meta-analysis by Hess [3] in which the results of twenty consolidation and nine maintenance therapy trials were pooled, consolidation therapies were associated with improved PFS [hazard ratio (HR): 0.79; p = 0.003] and OS (HR: 0.68; p = 0.0008), and maintenance therapies were associated with improved PFS (HR: 0.82; p = 0.02) and OS (HR: 0.68; p = 0.007). This relationship remained statistically significant when sensitivity analyses were completed. In the maintenance studies reported to date, issues arise from statistical design, crossover, lack of QOL data, and censoring. As a result, the question of whether PFS can serve as a surrogate in maintenance therapy cannot be answered at this time. In the trials to date, it remains unclear whether the observed benefit reflects “early” second-line therapy or a true benefit of maintenance. Questions then arise about whether “time to third-line” therapy might be more relevant or whether maintenance strategies might be more effective after second- or third-line treatment than in the front line. Once again, the relationship might not be the same for cytotoxic agents (for example, paclitaxel) as it is for the biologic agents. Following on from the presentation of data from ICON7 and GOG 218, the mature OS, QOL, and health economic analyses from those studies are eagerly awaited.
7. PFS STANDARDIZATION AND THE ROLE OF CANCER ANTIGEN 125
The group acknowledged that the lack of standardized assessment of progression across ovarian cancer trials is problematic. The criteria used to define progression, the time points defined for the assessment of progression, and the methods of assessment can all affect the measured PFS. Although these issues are less likely to be problematic in randomized trials in which there is a comparator arm, they continue to hamper the comparison of PFS data from various single-arm trials.
One of the confounding factors in an evaluation of the use of PFS as an endpoint in clinical trials in ovarian cancer is the role of cancer antigen 125 (CA125) in identifying progression. Many recent protocols assign a “progressive disease” date based on whichever occurs first: CA125 progression or objective progressive disease. However, how that approach may affect the PFS-to-OS relationship is unknown. Although not consistently recognized by regulatory authorities, an increase in CA125 is well recognized by clinicians as an indicator of progression. However, an increase in CA125 is not always adequately confirmed [32], suggesting that identification of progression in clinical trials could be inaccurate. Rustin et al. [32] showed that starting treatment based on an increase in CA125 does not improve survival, and CA125 levels may therefore not be a useful indicator for a determination of completion of treatment. Given that CA125 has not been validated as an indicator of progression with new treatments, it is possible that the correlation between CA125 and tumour progression might not be robust for every class of agents, warranting prudence in clinical trials.
In past years, some clinical trials have incorporated the GCIG criteria of CA125 progression, and some have relied solely on tumour measurement criteria such as the Response Evaluation Criteria in Solid Tumors. The result has been great heterogeneity in clinical trial results and confusion in interpretation. The use of CA125 in clinical trials of ovarian cancer warrants careful scrutiny and discussion so as to better assess this parameter’s validity and to attempt to standardize its use.
8. SUMMARY
Progression-free and overall survival are important for understanding the full impact of any new treatment, and thus either one may be designated as the primary endpoint for clinical trials in ovarian cancer. Although OS is an important endpoint, PFS is usually preferred as the primary endpoint for clinical trials because of the confounding effect on OS of postprogression therapy. Despite the promising results of meta-analyses, the issue of whether PFS can be used as a surrogate for OS is not yet resolved. An improvement in PFS is generally accepted as a good outcome, but treatment toxicities and QOL must also be considered. These latter considerations hold particularly true for cytotoxic therapies in which the toxicities are more severe, making smaller gains in PFS or OS less attractive.
Not only must PFS be validated as an endpoint for metastatic ovarian cancer, but it must also be accepted by regulatory authorities as valid when approving new treatments. The ability to validate PFS as a surrogate for OS is currently limited because clinical trials are using older regimens as comparators, different definitions of progressive disease, and conflicting statistical interpretations. There is a growing need for standardization of methods to assess progression so that biases are limited and separate clinical trials are made comparable and thus eventually amenable to pooling of data.
The PFS-to-OS relationships may be different for different patient groups (such as platinum-refractory compared with platinum-sensitive, or first-line compared with third-line), possibly leading to contamination of results when all patient types are pooled. Future research should attempt to determine if the PFS-to-OS relationship varies with the patient group and should ensure that the full range of patient demographics is made available so that data from the various patient groups can be pooled and analyzed separately. The most appropriate statistical analysis for evaluating clinical trials, particularly when PFS is being used as the primary endpoint, should also be determined. Analyses such as restricted-means analysis, which analyzes all data and the areas under the curve for the standard and the experimental arm alike, may provide better sensitivity than is achieved by assessing just the maximal differences between arms and should be explored to determine if more appropriate methods of assessing benefit can be developed.
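As an illustration of what a restricted-means analysis involves, the sketch below estimates the restricted mean survival time for each arm as the area under a Kaplan-Meier curve up to a fixed time horizon and then takes the difference between arms. It is a minimal sketch, not a validated implementation: the patient data and the 12-month horizon are hypothetical, the function names are ours, and a real analysis would also report confidence intervals.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) steps.

    events: 1 if the event (progression or death) was observed at that
    time, 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            steps.append((t, survival))
        at_risk -= leaving
    return steps


def restricted_mean(times, events, horizon):
    """Area under the Kaplan-Meier curve from time 0 to the horizon."""
    area = 0.0
    previous_time, previous_survival = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= horizon:
            break
        area += previous_survival * (t - previous_time)
        previous_time, previous_survival = t, s
    return area + previous_survival * (horizon - previous_time)


# Hypothetical PFS data in months (illustrative only).
standard_times = [3, 5, 6, 8, 9, 12]
standard_events = [1, 1, 1, 0, 1, 1]
experimental_times = [4, 7, 9, 11, 12, 14]
experimental_events = [1, 1, 0, 1, 0, 1]

horizon = 12  # assumed restriction time, in months
difference = (
    restricted_mean(experimental_times, experimental_events, horizon)
    - restricted_mean(standard_times, standard_events, horizon)
)
print("Difference in restricted mean PFS (months):", round(difference, 2))
```

Because the estimate integrates over the whole curve up to the horizon, it uses information from every time point rather than from a single summary such as the median or the largest separation between the curves.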
Recommendations for future investigations include these:
Ensure that trials are designed to evaluate PFS, OS, and other clinically relevant endpoints such as disease-related symptoms or QOL.
Incorporate interim futility analyses intended to stop accrual early when the experimental regimen is not active.
Stop trials early to declare superiority only when compelling evidence suggests that a new treatment provides benefit for a pre-specified clinically relevant endpoint such as OS or symptom relief.
Importantly, discourage early release of secondary endpoint results when such a release might increase the frequency of crossover to the experimental intervention.
9. CONFLICT OF INTEREST DISCLOSURES
AO has participated on advisory boards for AstraZeneca, Celgene, and Sanofi–Aventis. JW has participated on the drug advisory board for AstraZeneca. AC has been on advisory boards and speakers’ bureaus for Amgen, Roche, GlaxoSmithKline, and Merck. MB holds stock in IDDI.