Progression-Free Survival Early Assessment Is a Robust Surrogate Endpoint of Overall Survival in Immunotherapy Trials of Hepatocellular Carcinoma

Simple Summary
Surrogate radiology-based endpoints such as progression-free survival (PFS) and objective response rate (ORR) are commonly used in oncology. However, their surrogacy with overall survival (OS) has not been evaluated in immunotherapy trials for hepatocellular carcinoma (HCC). We found that the surrogacy of PFS with OS is highly variable depending on treatment class (immune-checkpoint inhibitors or multikinase inhibitors) and evaluation time-point. Early PFS is a robust surrogate endpoint for OS in immunotherapy trials, while the surrogacy relationship between ORR and OS is weak. Early assessment of PFS could be useful for allowing analyses with small sample sizes and short accrual times, enhancing the interpretability of immunotherapy trials in HCC.

Abstract
Background: Radiology-based outcomes, such as progression-free survival (PFS) and objective response rate (ORR), are used as surrogate endpoints in oncology trials. We aimed to assess the surrogacy relationship of PFS with overall survival (OS) in clinical trials of systemic therapies targeting advanced hepatocellular carcinoma (HCC) by novel meta-regression methods. Methods: A search of databases (PubMed, American Society of Clinical Oncology (ASCO) and European Society for Medical Oncology (ESMO) Meeting Libraries, Clinicaltrials.gov) for trials of systemic therapies for advanced HCC reporting both OS and PFS was performed. Individual patient data were extracted from PFS and OS Kaplan–Meier curves. Summary median PFS and OS data were obtained from a random-effects model. The surrogate relationships of median PFS, first quartile (Q1), third quartile (Q3), and restricted mean survival time (RMST) for OS were evaluated by the coefficient of determination R2. Heterogeneity was explored by meta-regression. Results: We identified 49 trials, 11 assessing immune-checkpoint inhibitors (ICIs) and 38 assessing multikinase inhibitors (MKIs). Overall, the correlation between median PFS and median OS was weak (R2 = 0.20, 95% confidence interval [CI] −0.02 to 0.42). Surrogacy robustness varied between treatment classes and PFS endpoints. In ICI trials only, the correlations between Q1-PFS and Q1-OS and between 12-month PFS-RMST and 12-month OS-RMST were high (R2 = 0.89, 95% CI 0.78–0.98, and 0.80, 95% CI 0.63–0.96, respectively). Interaction p-values obtained by meta-regression confirmed the robustness of results. Conclusions: In trials of systemic therapies for advanced HCC, the surrogate relationship of PFS with OS is highly variable depending on treatment class (ICI or MKI) and evaluation time-point. In ICI trials, Q1-PFS and 12-month PFS-RMST are robust surrogate endpoints for OS.


Introduction
Hepatocellular carcinoma (HCC) is often diagnosed at an advanced stage that is not amenable to curative treatments [1]. In recent years, there has been a surge of progress in HCC treatment, leading to the development of several systemic therapies [2,3]. Given the rapid ongoing evolution in this area, a careful evaluation of trial designs and outcomes is needed to optimize health benefits to patients.
Overall survival (OS) is a universally recognized, easy-to-assess endpoint for determining clinical benefit in oncology trials [4]. However, the interpretation of OS can be confounded by post-progression survival and treatment crossover [5]. Surrogate radiology-based endpoints, such as progression-free survival (PFS), time-to-progression (TTP), and objective response rate (ORR), are commonly used in oncology, especially when sequential post-progression treatments are available, as is now the case for HCC [6]. Their relevance remains debated, and aggregate-data meta-analyses have shown a modest correlation with OS, with substantial variability according to cancer type and stage and according to the class of drug(s) administered [7]. Specifically, PFS is a composite endpoint that might provide an early assessment of treatment efficacy, independent of post-progression survival [5]. However, PFS is limited by the subjectivity inherent in the radiological evaluation of progression and by the use of different response criteria [8]. A recent meta-analysis of aggregated data from randomized controlled trials (RCTs) of systemic therapies for HCC, not including immunotherapy drugs, showed only a moderate correlation between PFS and OS [9]. Meanwhile, for immunotherapy trials involving patients with different types of cancer, a weak association was found between PFS and OS at both the individual level and the trial level [10,11].
Here, we report a meta-regression of clinical trials of systemic therapies, including immunotherapies, for advanced HCC. The aim of this meta-regression was to evaluate the surrogate relationship between radiology-based endpoints (PFS and ORR) and OS.

Trial Selection and Characteristics
The trial selection process is shown in Figure S1. Based on the full-text reviews, we determined that 49 clinical trials fulfilled the inclusion criteria, and these trials were selected for the main analysis.
Restricted mean survival times (RMSTs) for each trial are reported in Table 2. Six-month OS and PFS RMSTs were 5.5 and 4.0 months for ICI trials and 5.3 and 3.9 months for MKI trials, respectively. Twelve-month OS and PFS RMSTs were 9.5 and 5.8 months for ICI trials and 8.6 and 5.2 months for MKI trials, respectively.
Non-proportionality of hazards between PFS and OS was present in 66.7% of the treatment arms (71.4% of ICI arms; 65.2% of MKI arms) (Table S3). Time-dependent Cox modeling confirmed the non-proportionality of hazards between PFS and OS in pooled reconstructed survival curves, showing that hazards vary over time, following a different trend in ICIs than in MKIs (Figure 2, Table S4, Figure S7).
Surrogacy analysis between ORR and OS rate at the end of follow-up yielded overall, ICI-trial, and MKI-trial R2 values of 0.005 (95% CI −0.03;0.04), 0.60 (95% CI 0.33;0.88), and 0.002 (95% CI −0.0008;0.0009), respectively (Figure S16). OS rates at the end of follow-up and the results of subgroup analyses for ORR are reported in Tables S7 and S8, respectively. The treatment arms from which the individual data for overall survival (OS) and progression-free survival (PFS) were extracted are shown in bold. All the included trials employed Response Evaluation Criteria in Solid Tumours (RECIST) 1.

Discussion
To the best of our knowledge, this is the first systematic quantitative study that assessed the correlation between surrogate and true treatment endpoints in trials of systemic therapies for advanced HCC by using innovative methods. Median PFS and median OS were found to be weakly correlated. Surrogacy relationships among outcomes varied according to treatment class (MKI or ICI) and PFS evaluation time-point. In ICI trials, but not MKI trials, the surrogacies between Q1-PFS and Q1-OS and between 12-month PFS-RMST and 12-month OS-RMST were high. ORR could not be confirmed as a robust surrogate endpoint for OS.
Innovative methodologies aimed at validating the role of radiology-based outcomes (TTP, PFS, and ORR) as surrogate endpoints of OS are becoming increasingly relevant in oncology. In particular, the advent of ICIs for the treatment of HCC has raised several questions regarding the most appropriate surrogate endpoints for early capture of survival benefit. Consequently, validated and consistent new methodological criteria for defining response to treatment are urgently needed. PFS is a composite endpoint that is not influenced by post-progression survival and that avoids crossover treatment bias. When sequential treatments are modeled, PFS represents the primary endpoint for first-line therapy in ICI trials, as demonstrated by a recently published decision model [40]. Although efforts have been made to improve the surrogacy between PFS and OS in immunotherapy trials (by modifying the threshold percentage used to define PFS or by adopting the Immune Response Evaluation Criteria in Solid Tumors [iRECIST]), PFS surrogacy for OS remains weak at both the trial and individual levels [10,11,41]. Therefore, the potential of alternative treatment effect measures (Q1-PFS, RMST, and milestone analysis) to capture survival benefits early, where traditional statistical methods (medians, hazard ratios [HRs], log-rank tests) cannot, is a key issue in the era of immunotherapy [42]. Although HR is the most commonly used comparative measure, its validity is limited by the requirement to assume a proportional hazard over the entire follow-up period [43,44]. Upon demonstrating that this assumption did not hold between PFS and OS for both ICI and MKI trials, we explored whether this surrogate relationship can be improved by adopting new robust statistical procedures. First quartile analysis is a cross-sectional assessment of treatment benefit at a meaningful time-point that does not depend on the proportional hazards assumption.
Our analysis showed that the surrogacy between Q1-PFS and Q1-OS was high in ICI trials only. However, these time-based outcomes do not reflect the entire survival history. Overcoming this limitation, RMST represents an innovative methodology that has the advantage of being valid under any time-to-event distribution, regardless of the proportional hazard assumption [42–44]. Unlike HR, RMST is an absolute measure of survival time: it can be used in all models, and it does not change with extended follow-up, enabling clinically meaningful interpretation of a treatment effect [44,45]. Although RMST was intended to increase the interpretability of immunotherapy trials, it is not routinely reported.
In ICI trials, our analyses showed that PFS surrogacy of OS was robust with the use of 12-month RMSTs. In particular, we further confirmed the significant benefit of atezolizumab plus bevacizumab compared to sorafenib, both for 12-month RMST OS and PFS, when we reanalyzed data from the recently published RCT [2]. Moreover, our data may have important implications also for trial design and for sample size calculation of future ICI trials.
Importantly, our results suggest that this surrogate relationship changes over time, and that these changes follow different trends for ICI trials than for MKI trials. In the MKI setting, caution must be taken when interpreting PFS in the absence of OS. The reasons underlying this finding are not fully understood, but they could plausibly be related to the different pharmacodynamics of MKIs (fast) and ICIs (slow but durable). Therefore, we can hypothesize that the durable radiological response to ICIs better correlates with OS [46].
It is important to consider that the line of treatment could have an impact on the surrogacy between PFS and OS, because patients on first-line treatment are more likely to have a chance to receive subsequent post-progression treatments compared to patients on second-line treatment. Unfortunately, the small number of first-line ICI trials [2,17] hampered this subgroup analysis.
Our meta-regression demonstration of a weak correlation between ORR and OS underscores that researchers should exercise caution when using the ORR as the primary endpoint in a phase III trial of an immunotherapy, with preference being given to time-to-event outcomes (e.g., PFS, TTP, time to response, and duration of response). The correlation value obtained is consistent with the results of a prior meta-analysis of immunotherapy trials conducted in other cancer types, such as melanoma, lung cancer, and renal cell carcinoma [10], and with an aggregate-data meta-analysis including only MKI RCTs of HCC [9]. Together, this convergence of evidence does not support the use of the ORR as a primary endpoint in immunotherapy trials. Accordingly, treatment effects based solely on time-fixed surrogate outcomes, such as ORRs, should be interpreted with caution.
Limitations: Although we extracted IPD for OS and PFS from Kaplan-Meier curves, the association between PFS and OS could not be evaluated at the individual level. Moreover, we were unable to assess other potentially relevant patient-level covariates, such as duration of response, treatment-related toxicity, and hepatic decompensation. The survival of patients with advanced HCC has been shown to be influenced by hepatic decompensation, which, together with HCC progression, represents a competitive mortality risk [47]. Finally, we agree fully with Finn that an IPD meta-analysis could better evaluate the surrogacy between PFS and OS [6].

Literature Search and Study Selection
Details of the literature search are reported in the Supplementary Materials. The inclusion criteria for retrieved studies were: being a clinical trial of systemic therapy for advanced HCC; and data reported for OS and at least one surrogate radiology-based endpoint (PFS or ORR). Review articles, letters, interim analyses, subgroup analyses of previously reported trials, trials including only conventional chemotherapy, duplicate reports, and trials in which the systemic therapy of interest was used in an adjuvant or neoadjuvant setting or with concomitant locoregional treatments were excluded. Each trial was evaluated by three independent investigators (Ci.C., G.E.M.R., A.B.). Discrepancies among reviewers were infrequent (interobserver variation < 10%) and were resolved by discussion.

Trial-Level Data Extraction
OS/PFS median times and HRs with corresponding 95% CIs and ORRs were assessed as measures of treatment effect. We also obtained the following covariates: ICI or MKI treatment; single-agent or combination therapy; trial phase; publication year; number of trial arms; number of patients in each arm; type of control arm; treatment line; timing of first radiological assessment; follow-up duration; and treatment-response radiological evaluation criteria.

Individual Patient Survival Data Extraction
We used Engauge Digitizer software [48] to extract IPD from OS and PFS Kaplan-Meier curves and used the Guyot algorithm [49] to reconstruct the data. Given digitized data on survival probabilities and times, together with the total numbers of patients and events, this algorithm assembles a patient-level dataset in which each patient has a predicted survival time and a predicted event status (i.e., alive or dead; progression or no progression). Each reconstructed survival curve was inspected for accuracy and compared with the originally published curves.
We used the nonparametric approach of Combescure [50] to obtain summary survival curves, which enabled assessments of pooled reconstructed survival probabilities of trials separately according to drug class (ICI or MKI). A random-effects model was used to account for between-study heterogeneity. The multivariate extension of DerSimonian and Laird's method was used to estimate a between-study covariance matrix [51,52]. Heterogeneity was assessed by the I2 statistic.
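To make the random-effects idea concrete, the sketch below implements the standard univariate DerSimonian and Laird estimator together with the I2 statistic. It is a simplification and not the procedure used in this study, which relied on the multivariate extension applied to survival probabilities; the function name and the inputs (one effect estimate and one within-study variance per trial) are illustrative assumptions.

```python
from math import sqrt
from typing import List, Tuple


def dersimonian_laird(effects: List[float],
                      variances: List[float]) -> Tuple[float, float, float, float]:
    """Univariate DerSimonian-Laird random-effects pooling (illustrative only).

    Returns (pooled effect, standard error, tau^2, I^2 in percent).
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures the observed dispersion around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Method-of-moments estimate of the between-study variance tau^2.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = sqrt(1.0 / sum(w_re))

    # I^2: share of total variability attributable to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    return pooled, se, tau2, i2
```

For example, `dersimonian_laird([0.0, 2.0], [1.0, 1.0])` pools two equally precise but discordant hypothetical studies; the estimated tau^2 widens the standard error relative to a fixed-effect analysis.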

Restricted Mean Survival Time (RMST)
RMSTs, reflecting average survival from time 0 to a specified time-point t, were determined from Kaplan-Meier estimates of survival functions. RMST can be interpreted readily as the area under the survival curve within a specific time window. For each trial, we reanalyzed the reconstructed IPD and then assessed RMSTs for OS and PFS at two pre-specified time horizons: 6 and 12 months [53].
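The area-under-the-curve interpretation can be illustrated with a short sketch. The code below is not the software used in this study; it is a minimal standard-library example, with hypothetical function names, that builds a Kaplan-Meier step function from (time, event) pairs such as reconstructed IPD and integrates it from 0 to a horizon tau.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return the KM estimate as (event time, survival just after it) pairs.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    steps: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all subjects sharing the same time (ties).
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / at_risk
            steps.append((t, surv))
        at_risk -= n_removed
    return steps


def rmst(steps: List[Tuple[float, float]], tau: float) -> float:
    """Area under the KM step function between time 0 and tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in steps:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle up to the next drop
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)    # final rectangle up to the horizon
    return area


# Hypothetical follow-up times in months; 1 = event, 0 = censored.
example_times = [2.0, 3.0, 4.0, 7.5, 9.0, 12.0]
example_events = [1, 0, 1, 1, 0, 0]
km_curve = kaplan_meier(example_times, example_events)
rmst_6 = rmst(km_curve, 6.0)
rmst_12 = rmst(km_curve, 12.0)
```

Because the curve is integrated only up to tau, the result is always bounded by tau, which is why 6-month and 12-month RMSTs can be compared across trials with different follow-up durations.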

Statistical Analysis
We used a two-step process to evaluate the surrogate relationship between PFS and OS.

Step 1: Assessing Proportional Hazards Assumption
We first checked if the proportional hazards (PH) assumption between PFS and OS was valid in each trial and in the pooled PFS and OS curves for each drug class (ICI or MKI) using Schoenfeld residual statistics. When the PH assumption was not verified, we generated time-dependent Cox models, including an interaction term between survival time and the fixed covariate, to overcome the non-proportionality [54]. The best model was chosen based on Akaike's information criterion values.
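The time-dependent Cox model described here can be written out as a worked formula. This is a sketch under assumptions: the source does not state the functional form of the time interaction, so g(t) is a placeholder (g(t) = t and g(t) = log t are common choices), and x denotes the fixed covariate, taken here to be binary.

```latex
h(t \mid x) = h_0(t)\,\exp\bigl(\beta x + \gamma\, x\, g(t)\bigr),
\qquad
\mathrm{HR}(t) = \frac{h(t \mid x = 1)}{h(t \mid x = 0)} = \exp\bigl(\beta + \gamma\, g(t)\bigr)
```

Under this form, proportional hazards corresponds to \gamma = 0, in which case the hazard ratio reduces to the constant \exp(\beta); a non-zero \gamma lets the hazard ratio drift over time.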

Step 2: Surrogacy Endpoint Validation
A linear meta-regression model, with sample-size weighting of the trial arms from which the data were extracted, was employed to quantify the relationship between PFS and OS. Surrogacy was evaluated between median times, between different time-based endpoints [first quartile (Q1) and third quartile (Q3)], between 6-month RMSTs, and between 12-month RMSTs. For ORR surrogacy validation, we assessed the relationship between OS rate at the end of follow-up and ORR (because ORR is not a time-to-event endpoint). The strength of each association was assessed by calculating R2 (the proportion of OS variance that is predictable from the surrogate endpoints), with values near 1 implying surrogacy and values close to zero suggesting no association [55].
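A sample-size-weighted linear regression and its R2 can be sketched as follows. This is an illustrative standard-library implementation, not the software used in this study; the function name is hypothetical, and the numbers in the example are invented for demonstration and are not trial results.

```python
from typing import Sequence, Tuple


def weighted_linear_r2(x: Sequence[float],
                       y: Sequence[float],
                       w: Sequence[float]) -> Tuple[float, float, float]:
    """Weighted least-squares fit of y = a + b*x.

    x: surrogate summary per trial arm (e.g. a PFS measure)
    y: true-endpoint summary per trial arm (e.g. the matching OS measure)
    w: weights (e.g. number of patients in each arm)
    Returns (intercept a, slope b, weighted R^2).
    """
    if not (len(x) == len(y) == len(w)) or len(x) < 2:
        raise ValueError("x, y and w must have equal length >= 2")
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum_w

    # Weighted sums of squares and cross-products around the weighted means.
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary across arms")

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # For a single predictor, R^2 is the squared weighted correlation.
    r2 = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r2


# Invented illustrative values (months) and arm sizes, not study data.
pfs = [2.1, 3.0, 3.8, 4.5, 5.2]
os_ = [7.9, 9.5, 10.2, 12.8, 13.1]
n_patients = [120, 85, 240, 60, 310]
a, b, r2 = weighted_linear_r2(pfs, os_, n_patients)
```

Weighting by arm size lets larger arms, whose summaries are estimated more precisely, contribute more to the fitted line and to R2 than small arms.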

Subgroup Analyses
We performed the following subgroup analyses: (1) drug class (ICI or MKI); (2) presence of control arm (controlled or not controlled); (3) trial phase (phase I/II or III); (4) line of treatment (first or second); and (5) duration of follow-up. For each subgroup analysis, we calculated an interaction p-value using a meta-regression model.

Conclusions
In trials of systemic therapies for advanced HCC, the surrogacy relationship of PFS with OS is highly variable depending on treatment class (ICI or MKI) and evaluation time-point. In ICI trials, Q1-PFS and 12-month PFS-RMST are robust surrogate endpoints for OS. Therefore, PFS RMSTs should be reported routinely in ICI trials for advanced HCC. Although caution must be taken when interpreting PFS in the absence of OS data, PFS could be useful for allowing analyses with small sample sizes and short accrual times in clinical trials, ultimately enhancing the interpretability of immunotherapy clinical trials.
Table S8: Results of subgroup analyses of surrogacy between objective response rate (ORR) and OS rate at the end of follow-up; Figure S1: Study flow chart.
Institutional Review Board Statement: Ethical review and approval were waived for this study because it included non-identifiable data from published clinical trials.
Informed Consent Statement: Patient consent was waived because the study included non-identifiable data from published clinical trials.

Data Availability Statement: The data presented in this study are available on request from the corresponding author.