Relationship between Progression-free Survival and Overall Survival in Chronic Lymphocytic Leukemia: a Literature-based Analysis

Background The endpoints of progression-free survival (pfs) and time-to-progression (ttp) are frequently used to


INTRODUCTION
Chronic lymphocytic leukemia (cll) is a lymphoproliferative disorder defined by an accumulation of incompetent clonal B lymphocytes 1. There are several risk factors for cll, including family history, male sex, white race, and advanced age 1. In Europe and North America, cll represents the most common form of adulthood leukemia, and it accounts for approximately one third of all leukemia cases [2][3][4]. In the United States, 15,680 new cases of cll were recorded in 2013, with 4580 deaths 5. In Canada, 1345 new cases in men and 850 new cases in women were recorded in 2010, and deaths among men and women in 2011 were 372 and 228 respectively 6. Median survival from a first diagnosis of cll is estimated to range from 18 months to more than 10 years depending on disease severity 7. The prognosis for patients with cll depends on several factors, including clinical stage, leucocyte count at diagnosis, leucocyte doubling time, levels of serum lactate dehydrogenase, mutation status of the immunoglobulin heavy-chain variable region (IgHV) genes, and CD38 expression level 3.
In oncology, an appropriate and universally recognized measure for evaluating clinical benefit is improvement in overall survival (os), which is defined as the time from randomization to the time of death from any cause. Because of its objectivity, clinical relevance, and ease of interpretation, os has historically been considered the "gold standard" for measuring the clinical efficacy of a new anticancer drug. However, trials have encountered difficulties in demonstrating clinical benefit in terms of os, because use of that endpoint is associated with several limitations. Indeed, os can be influenced by the use of subsequent-line treatments after disease progression, making it difficult to assess the impact on survival of just one treatment. Overall survival can also be affected by the confounding effect of crossover therapy, because for ethical reasons, many trials allow patients in the control arm to receive the experimental treatment after disease progression. Moreover, large sample sizes and extended periods of follow-up are required to detect a significant difference in os, often resulting in long and expensive trials 8.
More recently, intermediate clinical endpoints such as progression-free survival (pfs, defined as the time from randomization to objective tumour progression or death) and time to progression (ttp, defined as the time from randomization to objective tumour progression only) have been clinically accepted for anticancer drug approvals 8,9. The validity of pfs and ttp as surrogate endpoints for os has been assessed in several cancer settings, including advanced colorectal cancer, advanced breast cancer, and advanced non-small-cell lung cancer. Furthermore, the relationship of pfs or ttp with os has also been explored in the context of hematologic malignancies. More specifically, Lee et al. used a literature review to evaluate the correlation between pfs and os in non-Hodgkin lymphoma (nhl) 10. Those authors concluded that improvements in 3-year pfs were highly correlated with 5-year os in aggressive nhl (r = 0.90; 95% confidence interval: 0.73 to 0.96), but that no correlation in indolent nhl was evident.
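The difference between the two intermediate endpoints defined above can be sketched in code. This is a minimal illustration, not a procedure taken from any cited trial: the function names, the use of months as the time unit, and the choice to censor ttp at the time of a death without progression are assumptions made for the example.

```python
from typing import Optional, Tuple


def pfs_event(progression: Optional[float], death: Optional[float],
              last_follow_up: float) -> Tuple[float, bool]:
    """Return (time, event_observed) for progression-free survival.

    Times are months from randomization (assumed unit). Progression or
    death from any cause counts as an event, whichever comes first.
    """
    observed = [t for t in (progression, death) if t is not None]
    if observed:
        return min(observed), True
    return last_follow_up, False  # censored at last follow-up


def ttp_event(progression: Optional[float], death: Optional[float],
              last_follow_up: float) -> Tuple[float, bool]:
    """Return (time, event_observed) for time to progression.

    Only progression counts as an event. A death without prior
    progression is treated as censoring at the time of death
    (an assumption of this sketch).
    """
    if progression is not None and (death is None or progression <= death):
        return progression, True
    if death is not None:
        return death, False
    return last_follow_up, False
```

For a patient who dies at 14 months without documented progression, `pfs_event` records an event at 14 months, whereas `ttp_event` records a censored observation at 14 months.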
Until now, the association between these endpoints has never been assessed in the specific context of cll. The objective of the present study was therefore to use a trial-based approach to evaluate the relationship of median pfs or ttp with median os in the context of cll.

Literature Search Strategy
A systematic review of the literature identified studies of cll therapy that reported median pfs or ttp and median os. The review question was established using the pico (population, interventions, comparators, outcomes) method 11: the population consisted of patients with cll; the interventions and comparators (when applicable) were standard therapies for cll; and the outcomes were median pfs or ttp and median os. The systematic search was conducted using the electronic databases medline (1950-2011), embase (1980-2011), All EBM Reviews (including the Cochrane Database of Systematic Reviews, the American College of Physicians Journal Club, the Database of Abstracts of Reviews of Effects, the Cochrane Central Register of Controlled Trials, the Cochrane Methodology Register, the Health Technology Assessment Database, and the NHS Economic Evaluation Database), and Current Contents (1993-2011). The keywords used for the search were "B-cell chronic lymphocytic leukemia," "survival," "disease progression," "cancer survival," "survival time," "survival rate," "progression," "progression-free survival," "event-free survival," "cause specific survival," and "survival analysis." To limit the introduction of publication bias, the grey literature was also searched. More specifically, abstracts from annual meetings were searched on the Web sites of the American Society of Clinical Oncology and the American Society of Hematology. Furthermore, retrieved articles were cross-referenced to identify additional publications.

Study Selection
Studies were first selected based on title and abstract; full-text articles were then reviewed using a predefined eligibility form. The included studies were randomized or nonrandomized clinical studies (phase ii or iii) or observational studies (retrospective or prospective) published in English or French between 1990 and 2011 (14 December). Each treatment arm had to include at least 30 patients, and the endpoints of median pfs or ttp and median os both had to be reported. The only definitions of the former endpoints that were accepted were these:
■ pfs: the time from study entry until objective tumour progression or death (all causes)
■ ttp: the time from study entry until objective tumour progression or death (cll-related)
Studies were excluded if full-text articles were not available, if fewer than 80% of the patients in the sample had cll, or if the treatments under investigation included surgery, radiotherapy, or hematopoietic stem-cell transplantation without a conditioning regimen. All eligibility criteria were defined a priori. To avoid bias in study selection, the selection was performed by two independent reviewers. Disagreement between the reviewers was discussed and resolved by consensus. When more than one publication was retrieved for the same trial, the most recent article was selected.
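The eligibility criteria above can be read as a simple screening predicate. The sketch below is only an illustration of that logic: the `CandidateStudy` record and its field names are hypothetical, the start of the publication window is assumed to be 1 January 1990, and the check on study design and on endpoint definitions is reduced to boolean flags that a reviewer would fill in by hand.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class CandidateStudy:
    # All field names are hypothetical; they mirror the criteria in the text.
    language: str
    published: date
    arm_sizes: List[int]              # patients per treatment arm
    reports_median_pfs_or_ttp: bool   # with an accepted endpoint definition
    reports_median_os: bool
    full_text_available: bool
    cll_fraction: float               # proportion of the sample with cll
    excluded_treatment: bool          # surgery, radiotherapy, or transplantation
                                      # without a conditioning regimen


def is_eligible(study: CandidateStudy) -> bool:
    """Apply the inclusion and exclusion criteria to one candidate study."""
    return (
        study.language in {"English", "French"}
        # Window start of 1 January 1990 is an assumption of this sketch.
        and date(1990, 1, 1) <= study.published <= date(2011, 12, 14)
        and bool(study.arm_sizes)
        and all(n >= 30 for n in study.arm_sizes)
        and study.reports_median_pfs_or_ttp
        and study.reports_median_os
        and study.full_text_available
        and study.cll_fraction >= 0.80
        and not study.excluded_treatment
    )
```

Encoding the criteria as a single predicate makes it easy for two reviewers to apply the same rules independently and compare their decisions.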

Data Extraction and Quality Assessment
The general information and outcome measures extracted from selected studies were author, year of publication, number of patients included, definitions of pfs and ttp, median pfs and ttp and median os, and possibility for patients in the control arm to cross over to the experimental arm after progression (where applicable). Patient characteristics (sex, age, median follow-up, type of treatment under investigation, line of treatment, median number of prior treatments, Eastern Cooperative Oncology Group performance status, and median time between diagnosis and study entry) were also extracted. The extraction also focused on the risk profile of the included patients and on clinical disease staging by Binet stage and Rai classification. The predefined criteria equated Rai class 0 or Binet stage A (or both) with low-risk cll; Rai class i-ii or Binet stage B (or both) with intermediate-risk cll; and Rai class iii-iv or Binet stage C (or both) with high-risk cll 12. Other prognostic factors extracted were 17p deletion, 11q deletion, 13q deletion, mutation status of the immunoglobulin heavy-chain variable region genes, CD38 expression level, zap70 deficiency, level of β2-microglobulin, and trisomy 12 syndrome.
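The predefined staging-to-risk mapping described above can be written as a small lookup. This is an illustrative sketch only: the function name is hypothetical, and returning `None` when the two systems disagree or when no stage is reported is an assumption, because the text does not say how discordant staging was handled.

```python
from typing import Optional

# Mapping stated in the text: Rai 0 / Binet A = low risk,
# Rai I-II / Binet B = intermediate risk, Rai III-IV / Binet C = high risk.
RAI_RISK = {"0": "low", "I": "intermediate", "II": "intermediate",
            "III": "high", "IV": "high"}
BINET_RISK = {"A": "low", "B": "intermediate", "C": "high"}


def risk_group(rai: Optional[str] = None,
               binet: Optional[str] = None) -> Optional[str]:
    """Map a Rai class and/or Binet stage to a risk category.

    Returns None if neither stage is given or if the two systems point
    to different categories (handling of that case is an assumption).
    """
    groups = set()
    if rai is not None:
        groups.add(RAI_RISK[rai.upper()])
    if binet is not None:
        groups.add(BINET_RISK[binet.upper()])
    if len(groups) == 1:
        return groups.pop()
    return None
```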
The included studies were assessed for quality using the Jadad scale 13 for randomized studies and the strobe statement 14 for nonrandomized studies. The Jadad scale includes three items associated with reduction of bias (description of the methods used for randomization and for double-blinding, and description of withdrawals and dropouts). The strobe statement uses a 22-item checklist relating to the study title, abstract, and introduction, methods, results, and discussion sections to evaluate the quality of reporting in cohort, case-control, and cross-sectional studies. For validation purposes, data extraction and quality assessment were performed by two independent reviewers.

Statistical Analyses
Descriptive analyses were performed first, to illustrate the characteristics of the included studies. Correlation analyses subsequently assessed the relationship of median pfs or ttp with median os. For the latter analysis, each treatment arm provided one observation. All data were tested for normality using the Kolmogorov-Smirnov test. To examine the degree of association of pfs or ttp with os, the Pearson product-moment or Spearman rank correlation coefficient was calculated, depending on whether the data were or were not normally distributed. Degrees of association were defined a priori: by range, correlation coefficients were considered to represent a very weak (0.00-0.19), weak (0.20-0.39), moderate (0.40-0.59), strong (0.60-0.79), or very strong (≥0.80) association 15. To explore possible reasons for heterogeneity, subgroup correlation analyses were also separately conducted according to the characteristics of selected studies.
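The correlation step described above can be sketched with the standard library alone. This is not the software used in the study, which the text does not name: the function names are hypothetical, the normality test is not implemented here, and treating a coefficient of exactly 0.80 as "very strong" is an assumption about the boundary. The Spearman coefficient is computed as the Pearson coefficient of the ranks, with tied values given their average rank.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def strength(r: float) -> str:
    """Classify |r| using the ranges defined a priori in the text."""
    a = abs(r)
    if a < 0.20:
        return "very weak"
    if a < 0.40:
        return "weak"
    if a < 0.60:
        return "moderate"
    if a < 0.80:
        return "strong"
    return "very strong"
```

In this setting, `x` would hold the median pfs/ttp of each treatment arm and `y` the corresponding median os, one pair per arm.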

Trials Included in the Analysis
The literature search identified 1263 potentially relevant studies, of which 268 duplicates were excluded. After the screening by title and abstract, 235 full-text articles were assessed according to the eligibility criteria. The nineteen studies that met the criteria were included, and four studies found by cross-reference were added, for a total of twenty-three articles. No relevant study was found during the grey literature search (Figure 1).

Descriptive Analyses
Table i details the characteristics of the twenty-three included studies. Of those studies, seventeen were nonrandomized [16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31][32] and six were randomized [33][34][35][36][37][38]. The studies included a total of 27 treatment arms and a mean of 118 patients (minimum 30, maximum 724). On average, the median age of the patients was 63.0 years [standard deviation (sd): 3.6 years], the median follow-up period was 40.0 months (sd: 18.3 months), and the median time between diagnosis and study entry was 44.3 months (sd: 26.2 months). Most included studies used pfs rather than ttp as their primary or secondary outcome. In some cases, studies that used ttp as an outcome also included all-cause mortality, which by definition should accompany only a pfs outcome. Considering the heterogeneity of reported definitions, pfs and ttp outcomes were thus combined as pfs/ttp for analysis. Furthermore, the averages of the median pfs/ttp and the median os were, respectively, 16.0 months (sd: 12.4 months) and 43.5 months (sd: 31.2 months). The treatment under investigation in most of the studies was chemotherapy, generally as second- and subsequent-line therapy. Most of the included studies were conducted in patients with high-risk or intermediate-risk cll, according to Rai class and Binet stage. Other prognostic factors such as gene expression, mutation, and deletion were mostly not reported. Among the randomized studies, 1 (16.7%) had a Jadad quality score of 1/5, 2 (33.3%) had a score of 2/5, and 3 (50%) had a score of 3/5 or more. Among the nonrandomized studies, 6 (35.4%) had a strobe score of 17/22 or less, and 11 (64.8%) had a score of 17/22 or more.

Correlation Analyses of Median PFS/TTP with Median OS
The Spearman correlation coefficient was used to evaluate the relationship between median pfs/ttp and median os. The estimated coefficient of 0.813 (p ≤ 0.001) represents a very strong association according to the pre-defined criteria (Figure 2).
Results of the subgroup analyses indicate a higher correlation in studies with patients whose median age exceeded 65 years (r = 0.964, p ≤ 0.001) and with a median follow-up of 30 months or less (r = 0.917, p = 0.001, Table ii). Analysis of the potential effect of line of treatment showed a statistically significant association of pfs/ttp with os in studies assessing second- and subsequent-line therapies. However, no statistically significant correlation was observed in studies of previously untreated patients receiving a first-line treatment. Type of therapy under investigation also had a significant effect on correlation between the endpoints. A statistically significant correlation was observed in studies assessing chemotherapy and immunotherapy agents, but no such correlation was found in studies assessing combinations of treatments. That observation might be the result of a lack of statistical power.
Subgroup correlation analyses according to cll stage showed that the relationship of pfs/ttp with os applied specifically to studies in patients with intermediate- or high-risk profiles (r = 0.797, p ≤ 0.001). Moreover, a very strong correlation was observed for studies in which 50% or more of the included patients had Rai class iv cll (r = 0.900, p = 0.037).
Because most prognostic factors were not reported in the included studies, correlation analyses involving only a few of those variables were performed. One factor that could be evaluated was median β2-microglobulin (>3.5 mg/L), which did not demonstrate a statistically significant correlation. However, a very strong correlation was observed for studies in which 50% or more of patients had unmutated IgHV (r = 0.857, p = 0.014, Table ii).

DISCUSSION
The objective of the present study was to use a trial-based approach to evaluate the relationship of median pfs/ttp with median os in the context of cll. The results demonstrated that pfs/ttp is highly correlated with os (correlation coefficient: 0.813, p ≤ 0.001). Moreover, age, a median follow-up period of 30 months or less, a study population in which 50% or more of the patients had unmutated IgHV or Rai class iv disease, use of chemotherapy and immunotherapy agents, median number of prior treatments, and second- and subsequent-line therapies were determinants of a statistically significant relationship. Correlation was statistically significant only in studies in which second- and subsequent-line therapies were being investigated and not in studies assessing a first-line treatment.
In the past, the surrogacy of pfs for os has been assessed in various advanced cancer settings, including advanced colorectal cancer, advanced breast cancer, advanced non-small-cell lung cancer, advanced ovarian cancer, advanced gastric cancer, glioblastoma multiforme, and metastatic prostate cancer 39. In the context of hematologic malignancies, Lee et al. 10 combined thirty-eight randomized controlled studies with at least 100 patients per arm to evaluate the relationship of pfs with os in nhl, finding a statistically significant correlation in aggressive nhl (correlation coefficient: 0.90; 95% confidence interval: 0.73 to 0.96). However, in the same evaluation, a combination of twenty studies did not show a statistically significant correlation in indolent nhl.
The present study differs from the former one in several respects. For instance, the study by Lee et al. included trials from 1978 to 2005; it also included event-free survival as an endpoint. Findings from the study by Lee et al. 10 support the tendency of pfs to correlate with os in the context of advanced or high-risk cancers. In the present analysis, most of the studies fulfilling the eligibility criteria were conducted in patients with refractory or progressive cll. The correlation between pfs and os found in our study would thus especially apply to advanced forms of cll. Accordingly, subgroup analyses using studies that included patients with intermediate- or high-risk profiles showed a statistically significant correlation between those endpoints. Even if the results of analyses by risk profile did not lead to statistically significant correlation coefficients, the discrepancy might be a result of a lack of statistical power only.
According to Broglio et al. 40, who evaluated the impact of post-progression survival on the surrogacy of pfs for os, the correlation between pfs and os becomes less reliable as post-progression survival lengthens. The availability of effective treatments subsequent to disease progression therefore plays an important role in the association between endpoints, because a long post-progression period adds randomness that attenuates the ability to detect os benefits. In the context of cll, studies in previously untreated patients receiving a first-line treatment often show a statistically significant improvement of pfs, but not of os 37,[41][42][43][44][45][46][47][48]. In fact, in studies assessing first-line treatment, the time from first therapy to final endpoint is often long enough to introduce confounding factors such as crossover and subsequent-line therapies, leading to a statistically nonsignificant difference in os. Because the time from second- or subsequent-line therapy to the final endpoint is shorter, the probability of assessing the true effect of a treatment on os, without misinterpretation, is higher. That effect was observed in the present analysis as a statistically significant correlation of pfs/ttp with os in studies assessing second- and subsequent-line therapies, but not in studies assessing a first-line treatment.
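The attenuation described above can be illustrated with a simple decomposition. This is a sketch under stated assumptions, not the model used by Broglio et al.: it treats os at the patient level as the sum of pfs and post-progression survival (written PPS here, a symbol introduced for this illustration) and assumes that the two components are uncorrelated.

```latex
\mathrm{OS} = \mathrm{PFS} + \mathrm{PPS}, \qquad
\operatorname{Cov}(\mathrm{PFS}, \mathrm{PPS}) = 0
\;\Longrightarrow\;
\operatorname{Corr}(\mathrm{PFS}, \mathrm{OS})
= \frac{\operatorname{Var}(\mathrm{PFS})}
       {\sqrt{\operatorname{Var}(\mathrm{PFS})\,
              \bigl[\operatorname{Var}(\mathrm{PFS}) + \operatorname{Var}(\mathrm{PPS})\bigr]}}
= \frac{\sigma_{\mathrm{PFS}}}
       {\sqrt{\sigma_{\mathrm{PFS}}^{2} + \sigma_{\mathrm{PPS}}^{2}}}
```

Under those assumptions, the correlation approaches 1 when post-progression survival varies little relative to pfs and falls toward 0 as its variability grows, which is consistent with a weaker association in first-line settings where the post-progression period is long.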
The method used to reach the results presented here was an exhaustive and rigorous systematic literature review that provided an adequate and transparent overview of the relationship between median pfs/ttp and median os. However, our study has some limitations, such as reduced statistical power in the analyses. Indeed, just twenty-three studies were included in the review, which limited the ability to conduct further analyses. Another limitation is the inclusion of nonrandomized studies. Indeed, according to the Guidance for Industry prepared by the U.S. Food and Drug Administration 8, time-to-event endpoints (such as pfs and ttp) should be evaluated in randomized trials. Such measures are rarely considered reliable for historical data or single-arm trials, which were included in the present study. Because the analyses included only a small number of studies with two treatment arms, it was impossible to evaluate the correlation of treatment effect on pfs/ttp with treatment effect on os. That limitation is important because the "demonstration across randomized comparisons that differences in the effect of randomized treatments on the surrogate endpoint are associated with the corresponding differences in the effects on the clinical endpoint of interest" is essential to validate a surrogate 49. Nevertheless, despite the small number of included studies, a significant relationship between the endpoints was observed, suggesting that the observed association is real.
Moreover, definitions of pfs and ttp were not consistent throughout the included publications. For instance, some authors defined pfs or ttp as the time from response to objective tumour progression. Studies with inconsistent definitions of pfs and ttp were excluded from our literature review, which could have affected external validity. In addition, because the present analysis included only studies reporting both the endpoints of median pfs/ttp and median os, several large and well-designed randomized trials were excluded. Indeed, many main trials in the cll field reported only 3-year or 5-year survival rates, without reporting median pfs or median os, and were therefore not included in our study.
Another limitation is that the literature search included the keywords "B-cell lymphocytic leukemia," because that term is the most frequent in the Western world. "T-cell cll" was not explicitly included in the keywords of the literature search, even though T-cell cll is prevalent in Asia 50.
Overall, the quality of included studies was good. Applying the strobe statement, most of the included nonrandomized studies were of acceptable quality. However, the strobe statement is limited to an evaluation of the quality of reporting; it does not address the quality of the study itself. That approach might lead to a misperception of quality, because a study can be well performed, but not well written. Moreover, of the six randomized studies included, only three (50%) had a Jadad score of 3/5 or better, which might partly be a result of the choice of instrument. Indeed, the Jadad quality assessment scale can be disadvantageous for research areas in which blinding is rarely feasible, such as in oncology.

CONCLUSIONS
The present results demonstrate a very strong correlation of median pfs/ttp with median os in the context of second- and subsequent-line therapies in cll, which reinforces the hypothesis that pfs or ttp can be an adequate surrogate endpoint for os in this cancer setting.

FIGURE 1 Flow chart of studies included in the systematic review of the literature. PFS = progression-free survival; TTP = time to progression; OS = overall survival.

a For patients receiving second- and subsequent-line treatments.
b Based on Binet stage and Rai classification. Rai class 0 or Binet stage A, or both, corresponds with low-risk disease; Rai class I-II or Binet stage B, or both, corresponds with intermediate-risk disease; Rai class III-IV or Binet stage C, or both, corresponds with high-risk disease.
IgHV = immunoglobulin heavy-chain variable region; ZAP70 = zeta-chain-associated protein kinase 70; ECOG = Eastern Cooperative Oncology Group.

FIGURE 2 Scatterplot of the association between the combination of median progression-free survival (PFS) or time to progression (TTP) and median overall survival (OS). Each circle corresponds to a treatment arm. The Spearman correlation coefficient was estimated to be 0.813 (p ≤ 0.001).

TABLE I
Characteristics of the selected studies, including 27 treatment arms

TABLE II
Correlation analyses according to the characteristics of the selected studies
Based on Binet stage and Rai classification. Rai class 0 or Binet stage A, or both, corresponds with low-risk disease; Rai class I-II or Binet stage B, or both, corresponds with intermediate-risk disease; Rai class III-IV or Binet stage C, or both, corresponds with high-risk disease. NS = statistically nonsignificant; ECOG PS = Eastern Cooperative Oncology Group performance status.