Tumor Volume Dynamics as an Early Biomarker for Patient-Specific Evolution of Resistance and Progression in Recurrent High-Grade Glioma

Recurrent high-grade glioma (HGG) remains incurable with inevitable evolution of resistance and high inter-patient heterogeneity in time to progression (TTP). Here, we evaluate if early tumor volume response dynamics can calibrate a mathematical model to predict patient-specific resistance to develop opportunities for treatment adaptation for patients with a high risk of progression. A total of 95 T1-weighted contrast-enhanced (T1post) MRIs from 14 patients treated in a phase I clinical trial with hypo-fractionated stereotactic radiation (HFSRT; 6 Gy × 5) plus pembrolizumab (100 or 200 mg, every 3 weeks) and bevacizumab (10 mg/kg, every 2 weeks; NCT02313272) were delineated to derive longitudinal tumor volumes. We developed, calibrated, and validated a mathematical model that simulates and forecasts tumor volume dynamics with rate of resistance evolution as the single patient-specific parameter. Model prediction performance is evaluated based on how early progression is predicted and the number of false-negative predictions. The model with one patient-specific parameter describing the rate of evolution of resistance to therapy fits untrained data (R2=0.70). In a leave-one-out study, for the nine patients that had T1post tumor volumes ≥1 cm3, the model was able to predict progression on average two imaging cycles early, with a median of 9.3 (range: 3–39.3) weeks early (median progression-free survival was 27.4 weeks). Our results demonstrate that early tumor volume dynamics measured on T1post MRI has the potential to predict progression following the protocol therapy in select patients with recurrent HGG. Future work will include testing on an independent patient dataset and evaluation of the developed framework on T2/FLAIR-derived data.

In our clinical trial, patients with recurrent HGG were treated with hypo-fractionated stereotactic radiotherapy (HFSRT) combined with pembrolizumab and bevacizumab (NCT02313272) [9]. Adult patients with recurrent glioblastoma (GBM) or anaplastic astrocytoma (maximum diameter of target lesion ≤3.5 cm) were eligible. Eligible patients received HFSRT to the recurrent tumor (6 Gy × 5) combined with pembrolizumab (100 or 200 mg, intravenously based on dose level, every 3 weeks) and bevacizumab (10 mg/kg, intravenously every 2 weeks). After determination of the recommended phase II dose of pembrolizumab, an additional 26 patients were enrolled in an expansion cohort and were treated with HFSRT, pembrolizumab (200 mg every 3 weeks), and bevacizumab. Response was assessed every 6 weeks per Response Assessment in Neuro-Oncology (RANO) criteria.
The RANO criteria define radiographic progression as 25% or greater increase in the sum of the products of perpendicular diameters of the enhancing lesion, when compared with baseline or smallest tumor measurement (nadir). Additionally, progression may be observed by a significant increase in a T2-weighted/fluid-attenuated inversion recovery (T2/FLAIR) non-enhancing lesion on stable or increasing doses of corticosteroids compared with nadir [10]. The present RANO criteria for HGGs do not include volumetric assessment. While contrast-enhancing volumetric changes were found to be predictive of progression-free survival (PFS), volumetric measures have not yet been incorporated into the RANO criteria [11][12][13].
Despite the advances of drug development, recurrent HGGs remain incurable as patients inevitably develop resistance and progress on treatment. However, time to progression varies significantly between patients, and no reliable biomarkers exist to predict when resistance will develop. Prediction of time to progression requires temporally resolved biomarkers and identification of highly patient-specific therapy response dynamics.
Here, we evaluate if early tumor volume dynamics in T1-weighted contrast-enhanced (T1post) measurements can be used to calibrate and validate a mathematical model of patient-specific tumor volume dynamics to predict response dynamics following therapy. Predicting progression prior to radiographic manifestation allows clinicians to modify therapy before selection for and proliferation of treatment-resistant tumor subpopulations. We identify a subset of recurrent HGG patients for whom treatment response dynamics can accurately predict progression.

Experimental Section
A total of 32 patients with recurrent HGG were treated with HFSRT (6 Gy × 5) plus pembrolizumab (100 or 200 mg, intravenously based on dose level, every 3 weeks) and bevacizumab (10 mg/kg, intravenously every 2 weeks) until progression in a phase I clinical trial (NCT02313272) at Moffitt Cancer Center between August 2015 and March 2018. T1post and T2-weighted/fluid-attenuated inversion recovery (T2/FLAIR) MRIs were taken approximately every 6 weeks ( Figure 1A). The protocol and its amendments were reviewed and approved by the institutional review board (IRB study #: Pro00014674, continuing review approval IRB# 00000971, 26 August 2019). Written consent was provided by all patients. Included in the present analysis were pembrolizumab-naïve and bevacizumab-naïve patients with at least four MRI-derived T1post volumes and who progressed with non-zero T1post volume ( Figure 1B). The median time to progression for these patients was 27.4 weeks ( Figure 1C). progressed due to the development of a new lesion, and two progressed due to clinical deterioration ( Figure 1D). Notice that two of our patients developed progression under multiple criteria. Patient-specific longitudinal gross tumor volumes were delineated from a total of 95 T1post MRI data sets (median of 5.5 images per patient, median of 42 days between images). Briefly, for a given patient, consecutive MRI scans were evaluated, and T1post volumes were manually delineated on each slice with commercially available software (Mirada Medical, Denver, CO, USA) by an experienced radiation oncologist (GDG) and were verified by a neuro-radiologist (JAA). For patients who underwent surgery for recurrent disease prior to HFSRT, the volume only included the residual T1post volume excluding the surgical cavity. For patients who did not undergo salvage surgery prior to HFSRT, the T1post volume was delineated from the epicenter of the recurrence. For a given patient, recurrent T1post volumes were concordant in the region of delineation across MRI scans, except when the recurrence was multi-focal. T1post volumes were delineated if they were measurable (i.e., if they were ≥0.1 cm 3 ). Non-measurable T1post volumes were set to 0.0 cm 3 .
To evaluate T1post MRI tumor volume dynamics as an early biomarker for progression in recurrent HGG, we followed the rigorous Brady pipeline to develop, calibrate, and validate a mathematical model and evaluate model prediction performance [14]. Simulation of the evolution of tumor volume during the course of therapy traditionally involves the development of a mathematical model that describes the mechanism of each therapeutic agent. This requires numerous parameters Of the 14 patients included in the present analysis, three progressed due to a greater than 25% increase in the sum of the products of perpendicular diameters of T1post enhancing lesions compared with nadir, nine progressed due to significant increase in T2/FLAIR non-enhancing lesions, three progressed due to the development of a new lesion, and two progressed due to clinical deterioration ( Figure 1D). Notice that two of our patients developed progression under multiple criteria.
Patient-specific longitudinal gross tumor volumes were delineated from a total of 95 T1post MRI data sets (median of 5.5 images per patient, median of 42 days between images). Briefly, for a given patient, consecutive MRI scans were evaluated, and T1post volumes were manually delineated on each slice with commercially available software (Mirada Medical, Denver, CO, USA) by an experienced radiation oncologist (GDG) and were verified by a neuro-radiologist (JAA). For patients who underwent surgery for recurrent disease prior to HFSRT, the volume only included the residual T1post volume excluding the surgical cavity. For patients who did not undergo salvage surgery prior to HFSRT, the T1post volume was delineated from the epicenter of the recurrence. For a given patient, recurrent T1post volumes were concordant in the region of delineation across MRI scans, except when the recurrence was multi-focal. T1post volumes were delineated if they were measurable (i.e., if they were ≥0.1 cm 3 ). Non-measurable T1post volumes were set to 0.0 cm 3 .
To evaluate T1post MRI tumor volume dynamics as an early biomarker for progression in recurrent HGG, we followed the rigorous Brady pipeline to develop, calibrate, and validate a mathematical model and evaluate model prediction performance [14]. Simulation of the evolution of tumor volume during the course of therapy traditionally involves the development of a mathematical model that describes the mechanism of each therapeutic agent. This requires numerous parameters that will be difficult to learn from the limited number of patients in the analysis (n = 14) and sparse longitudinal data (median, 5.5 MRIs per patient).
We deployed a simple tumor growth inhibition (TGI) model [15] to describe tumor volume dynamics V(t): where λ (day −1 ) is the net tumor growth rate in the absence of therapy and γ(t) (day −1 ) is the rate at which the tumor volume decays in response to therapy. We assume an exponential growth as the tumor volume is likely far from carrying capacity after surgery and HFSRT, supported by observed dynamics. Bevacizumab and pembrolizumab were administered every 2 and 3 weeks until progression; thus, we approximated a continuous and constant drug concentration and ignored pharmacokinetics and pharmacodynamics for simplicity. To simulate the evolution of resistance to therapy, we modeled the decay rate to be exponentially declining with time, such that where ε (day −1 ) is the rate at which resistance develops. We express the analytic solution of the coupled system of Equations (1) and (2) as: Note that time to tumor growth (TTG), defined as time of volume nadir and resistance to therapy, is t = t * , such that λ = γ(t * ). Analytically, this occurs at t * = ln(γ 0 )−ln(λ) ε . A summary of model parameters is given in Table 1. The parameter bounds are defined from the patient dataset. To make predictions over sparse data, we need to minimize the number of free parameters. Patient-specific tumor proliferation and invasion have been thoroughly investigated in mathematical models of glioma [16][17][18][19][20][21][22][23][24]. Here, sensitivity analysis of Equations (1) and (2) (Supplementary Information, [25]), identified the rate of evolution of resistance as the most sensitive parameter. Therefore, we modeled ε to be patient-specific and set less-sensitive parameters (net tumor growth rate λ and treatment response rate γ 0 ) to be uniform across the patient cohort (Table 1). This model has the advantage of being very simple (only one parameter to be trained per patient), as well as being able to explain a variety of tumor volume dynamics ( Figure 2).

Figure 2.
The model can explain a variety of tumor volume dynamics by varying the speed of evolution of resistance to therapy. A low resistance rate (left) yields slow evolution of resistance to therapy and long-term response with large tumor volume regression. A medium (middle) yields medium evolution of resistance to therapy and short-term response with medium tumor volume regression. A high (right) yields fast evolution of resistance to therapy and immediate progression with small tumor volume regression. Resistance to therapy and volume nadir occur at = * , such that the net growth rate and treatment sensitivity coincide ( = ( * )). The legend in top left panel applies to all top panels.
We also performed an identifiability analysis, detailed in the Supplementary Information [26,27]. In case of non-identifiability, we fixed the least sensitive uniform parameter to some nominal value. Model parameters are derived by fitting the analytic solution of the mathematical model to the clinical data by minimizing the sum of squared relative errors = ( )/ [28], where is simulated volume at the time of MRI, and is the actual volume at the time of MRI, over all MRIs. This relative error was chosen to avoid fitting artefacts to non-measurable tumor volumes. To best fit to resistance dynamics, the model solution was fixed to the final measurable tumor volume and simulated back in time, giving double weight to the penultimate tumor volume. Optimal parameters are derived by implementing a nested particle swarm algorithm in MATLAB R2020a.
To validate the calibrated model with untrained data, we performed a leave-one-out crossvalidation (LOOCV) study. We calibrated the uniform model parameters using the training dataset and applied them to the left-out patient to learn the patient-specific parameter. We performed this for all patients and aggregate results. Model calibration and validation were evaluated based on coefficient of determination (R 2 ).
We then predicted tumor volume and response dynamics for each individual patient. We decided to make predictions only when the following conditions were met: 1. There are at least three T1post tumor volumes while on the present treatment; 2. The two most recent T1post tumor volumes are measurable; 3. At least one T1post tumor volume is greater than 1 cm 3 , which is a threedimensional approximation of the 1 cm longest diameter criteria in RANO; 4. There is no prior progression; and 5. There is no prior prediction of progression ( Figure 3A). Once those conditions are met, we set the initial condition to the most recent tumor volume and back fit to the first data point, learning the patient-specific resistance rate (day −1 ) using the uniform parameters learned from the training cohort. Tumor volume was then predicted for the next time point. Progression or no progression is evaluated based on the predicted tumor volume relative to nadir. A variety of thresholds relative to nadir were tested (0-300% increase in tumor volume from nadir) to determine the optimal progression criterion. The threshold that maximizes the negative predictive value (NPV) Figure 2. The model can explain a variety of tumor volume dynamics by varying the speed of evolution of resistance to therapy. A low resistance rate ε (left) yields slow evolution of resistance to therapy and long-term response with large tumor volume regression. A medium ε (middle) yields medium evolution of resistance to therapy and short-term response with medium tumor volume regression. A high ε (right) yields fast evolution of resistance to therapy and immediate progression with small tumor volume regression. Resistance to therapy and volume nadir occur at t = t * , such that the net growth rate and treatment sensitivity coincide (λ = γ(t * ) ). The legend in top left panel applies to all top panels.
We also performed an identifiability analysis, detailed in the Supplementary Information [26,27]. In case of non-identifiability, we fixed the least sensitive uniform parameter to some nominal value. Model parameters are derived by fitting the analytic solution of the mathematical model to the clinical data by minimizing the sum of squared relative errors E = V sim −V data (V sim +V data )/2 [28], where V sim is simulated volume at the time of MRI, and V data is the actual volume at the time of MRI, over all MRIs. This relative error was chosen to avoid fitting artefacts to non-measurable tumor volumes. To best fit to resistance dynamics, the model solution was fixed to the final measurable tumor volume and simulated back in time, giving double weight to the penultimate tumor volume. Optimal parameters are derived by implementing a nested particle swarm algorithm in MATLAB R2020a.
To validate the calibrated model with untrained data, we performed a leave-one-out cross-validation (LOOCV) study. We calibrated the uniform model parameters using the training dataset and applied them to the left-out patient to learn the patient-specific parameter. We performed this for all patients and aggregate results. Model calibration and validation were evaluated based on coefficient of determination (R 2 ).
We then predicted tumor volume and response dynamics for each individual patient. We decided to make predictions only when the following conditions were met: 1. There are at least three T1post tumor volumes while on the present treatment; 2. The two most recent T1post tumor volumes are measurable; 3. At least one T1post tumor volume is greater than 1 cm 3 , which is a three-dimensional approximation of the 1 cm longest diameter criteria in RANO; 4. There is no prior progression; and 5. There is no prior prediction of progression ( Figure 3A). Once those conditions are met, we set the initial condition to the most recent tumor volume and back fit to the first data point, learning the patient-specific resistance rate ε (day −1 ) using the uniform parameters learned from the training cohort. Tumor volume was then predicted for the next time point. Progression or no progression is evaluated based on the predicted tumor volume relative to nadir. A variety of thresholds relative to nadir were tested (0-300% increase in tumor volume from nadir) to determine the optimal progression criterion. The threshold that maximizes the negative predictive value (NPV) was selected as optimal for predicting response dynamics. Model predictive performance was evaluated based on minimizing false-negative (FN) predictions, as well as minimizing the number of weeks for which progression is predicted early for each patient. Model calibration, validation, and predictions were performed in 20 replicates to average numerical stochasticity. Statistics are reported as mean and standard deviation unless otherwise stated. J. Clin. Med. 2020, 9, x FOR PEER REVIEW 6 of 11 was selected as optimal for predicting response dynamics. Model predictive performance was evaluated based on minimizing false-negative (FN) predictions, as well as minimizing the number of weeks for which progression is predicted early for each patient. Model calibration, validation, and predictions were performed in 20 replicates to average numerical stochasticity. Statistics are reported as mean and standard deviation unless otherwise stated. No predictions were made for 2 patients (not shown) due to no T1post tumor volume ≥1 cm 3 . Of the 9 patients predicted to progress, four were predicted to progress one scan early, four were predicted to progress two scans early, and one was predicted to progress six scans early. Filled markers identify the radiological data from which the model forecast has been performed. Blue and red curves show the model prediction relative to the progression threshold (dashed line). Asterisks mark patients who progressed due to T2/FLAIR non-enhancing lesions or clinical deterioration. All three patients whose progression failed to be predicted progressed despite accurately predicted diminished T1post No predictions were made for 2 patients (not shown) due to no T1post tumor volume ≥1 cm 3 . Of the 9 patients predicted to progress, four were predicted to progress one scan early, four were predicted to progress two scans early, and one was predicted to progress six scans early. Filled markers identify the radiological data from which the model forecast has been performed. Blue and red curves show the model prediction relative to the progression threshold (dashed line). Asterisks mark patients who progressed due to T2/FLAIR non-enhancing lesions or clinical deterioration. All three patients whose progression failed to be predicted progressed despite accurately predicted diminished T1post volumes. Two of these patients progressed due to significant increase in T2/FLAIR non-enhancing lesions, and one progressed due to clinical deterioration. (C) Distribution of early predictions. Markers correspond to individual patients in panel (B).

Model Fits to Patient Data
The tumor volume growth and decay model exhibits a variety of dynamics dependent on the resistance rate ε (Figure 2). A low ε yields slow evolution of resistance and long-term response and volume regression. A high ε yields fast evolution of resistance and short-term response or even immediate progression. The model with three patient-specific parameters (net growth rate λ initial treatment sensitivity γ 0 , and resistance rate ε) fits the volumetric data of the analyzed 14 patients with recurrent HGG treated with HFSRT with concurrent pembrolizumab and bevacizumab with median R 2 = 0.76. The resistance rate ε was identified as the most sensitive parameter; thus, we kept ε to be patient-specific, and we trained λ and γ 0 to be uniform across all patients ( Figure S1). While this significantly simplifies the model, it remains practically non-identifiable. Because γ 0 was found to be the least sensitive parameter, we set it to the nominal value γ 0 = 0.4608 day −1 , which maximizes R 2 and optimized for λ = 0.4465 ± 0.0023 day −1 (Figure S2). Model validation via the LOOCV study was performed with median R 2 = 0.66 on this patient dataset (Figure 4).

Model Fits to Patient Data
The tumor volume growth and decay model exhibits a variety of dynamics dependent on the resistance rate (Figure 2). A low yields slow evolution of resistance and long-term response and volume regression. A high yields fast evolution of resistance and short-term response or even immediate progression. The model with three patient-specific parameters (net growth rate , initial treatment sensitivity , and resistance rate ) fits the volumetric data of the analyzed 14 patients with recurrent HGG treated with HFSRT with concurrent pembrolizumab and bevacizumab with median R = 0.76. The resistance rate was identified as the most sensitive parameter; thus, we kept to be patient-specific, and we trained and to be uniform across all patients ( Figure S1). While this significantly simplifies the model, it remains practically non-identifiable. Because was found to be the least sensitive parameter, we set it to the nominal value = 0.4608 day −1 , which maximizes R 2 and optimized for = 0.4465 ± 0.0023 day −1 (Figure S2). Model validation via the LOOCV study was performed with median R = 0.66 on this patient dataset (Figure 4).

Model Predicts Early Progression in 9 of 14 Patients
We evaluated a variety of T1post volumetric progression criteria and selected the one that maximized the negative predictive value (NPV). NPV minimized predicting no progression when there was in fact progression (false negative, FN) and maximized predicting no progression correctly to minimize the number of MRI scans that we predicted early (true negative, TN). Of all the progression criteria sampled, a 200% increase in tumor volume from nadir maximizes the NPV at 0.84 with a sensitivity of 0.56 and a specificity of 0.76 (Table 2). Table 2. Prediction results of various progression criteria. Progression is defined based on predicted T1post tumor volume relative to nadir. Each progression criterion is defined by a threshold (x% increase in T1post tumor volume from nadir) above which progression is called. The optimal progression criterion is found to be 200% based on maximizing the negative predictive value (NPV). We also report the average number of false negatives (FNs) and how early we predict progression based on the average number of scans and weeks before actual progression.

Progression
Criterion NPV FN Mean Scans Predicted Early

Model Predicts Early Progression in 9 of 14 Patients
We evaluated a variety of T1post volumetric progression criteria and selected the one that maximized the negative predictive value (NPV). NPV minimized predicting no progression when there was in fact progression (false negative, FN) and maximized predicting no progression correctly to minimize the number of MRI scans that we predicted early (true negative, TN). Of all the progression criteria sampled, a 200% increase in tumor volume from nadir maximizes the NPV at 0.84 with a sensitivity of 0.56 and a specificity of 0.76 (Table 2). Table 2. Prediction results of various progression criteria. Progression is defined based on predicted T1post tumor volume relative to nadir. Each progression criterion is defined by a threshold (x% increase in T1post tumor volume from nadir) above which progression is called. The optimal progression criterion is found to be 200% based on maximizing the negative predictive value (NPV). We also report the average number of false negatives (FNs) and how early we predict progression based on the average number of scans and weeks before actual progression. The chosen 200% increase in tumor volume from nadir also minimizes the time that progression is predicted early. In line with RANO, no predictions were made for two patients because of no T1post tumor volumes ≥1 cm 3 prior to progression. Our model predicts progression early in 9 of the remaining 12 patients. Of these, four were correctly predicted to progress one scan early, four were predicted to progress two scans in advance, and one patient was predicted to progress six scans early ( Figure 3B). Progression was predicted with a median of 9.3 (range: 3-39.3) weeks early ( Figure 3C), despite the fact that 6 of these 9 patients (66%) progressed due to significant increase in T2/FLAIR non-enhancing lesions or due to clinical deterioration.

Progression
For three patients, the model accurately forecasted the evolution of the target lesion but failed to predict progression early due to significant increase in T2/FLAIR non-enhancing lesions or clinical deterioration ( Figure 3B).

Discussion
We have developed a simple mathematical model to simulate T1post tumor volume dynamics during therapy as a predictor for progression to subsequent therapy. As temporally resolved volumetric data are only collected every 6 weeks, the model has to be very simple, and to our knowledge, this is the first attempt to simulate treatment resistance as a patient-specific exponential decline of treatment sensitivity. Using this model, we identified recurrent HGG patients who were treated with HFSRT combined with pembrolizumab and bevacizumab without previous drug failure and with enhancing tumor volume greater than 1 cm 3 , for whom predictions can be made with high confidence.
To this extent, we have developed a novel volumetric progression classifier of 200% above nadir. However, predicted tumor volumes near the progression threshold can potentially skew the results. This may be mediated by assigning prediction confidence based on how far the predicted tumor volume is away from the progression threshold. Additionally, this figure seems surprisingly high, given that the equivalent volumetric threshold to the current RANO bidirectional criterion assuming a spherical tumor is 40%, warranting further investigations into tumor volume dynamics. However, tumor contouring was done by a radiologist or radiation oncologist and remains a manual process as the ability of auto-contouring the tumor on T1post images is not yet fully commercially available. Current software allows auto-contours of normal structures to a high-degree precision but may still require manual adjustment.
The developed model may provide medical oncologists with an additional tool to discuss treatment options on a per patient basis. One limitation of the presented study is the limited patient dataset (n = 14 patients with 95 MRIs in total). The inclusion of more patients would further refine model calibration and progression criterion selection, with the hope to identify different progression models based on additional patient biomarkers, such as lymphocyte and neutrophil dynamics, MGMT promoter methylation, IDH mutation, and EGFR vIII mutation.
Our results demonstrate that early tumor volume dynamics in T1post measurements have the potential to predict progression. However, using T1post volumetric measurements to predict progression has limitations with therapies that include anti-angiogenic agents, which have a propensity to exhibit pseudo-response in T1post MRIs. This is illustrated by 3 of 14 patients in the discussed cohort with shrinking T1post tumor volumes and thus predicted to continue therapy without disease progression, yet who progressed due to significant increase in T2/FLAIR non-enhancing lesions or clinical deterioration. Exploration of T2/FLAIR-derived dynamics as a treatment response predictor, including evaluation of the developed framework, is currently ongoing. By excluding patients whose progression was unrelated to the T1post MRI signal, the model achieves an NPV of 1.0, with a sensitivity of 1.0 and a specificity of 0.75 without false positives. These results motivate prospective evaluation of early tumor volume dynamics in T1post measurements to predict progression. As demonstrated, model parameters are not uniquely independently identifiable given the clinical data. Additional pre-treatment tumor growth data may help to constrain the model parameter space and help to further develop the model. Herein, we chose parameter pairs based on statistical considerations to maximize model predictive power and reported these values for reproducibility. However, modification of the functional form of the developed model terms or inclusion of additional mechanisms will change each parameter value. Therefore, herein, estimated parameter values should neither be taken as biological ground truth, nor are they translatable to different mathematical models.