Predicting Regions of Local Recurrence in Glioblastomas Using Voxel-Based Radiomic Features of Multiparametric Postoperative MRI

Simple Summary In this study, we developed a predictive model that employs data from multiparametric structural MRI to predict local recurrence in glioblastoma, providing a practical solution to an issue clinicians face in our daily practice: discriminating edema from tumor infiltration. Predicting the location of these areas at high risk of recurrence will potentially allow for personalizing and optimizing the local treatment of glioblastomas, creating new surgical resection limits and radiotherapy targets. Our findings could potentially improve the survival rate of these patients and open a new line of research that permits a better understanding of the mechanisms of glioma invasion. In addition, we evaluated our results in an external multicenter cohort of patients, thus demonstrating the applicability of the model despite the MRI acquisition protocols and scanner manufacturers. The model will be publicly available through a repository for its implementation by any institution. Abstract The globally accepted surgical strategy in glioblastomas is removing the enhancing tumor. However, the peritumoral region harbors infiltration areas responsible for future tumor recurrence. This study aimed to evaluate a predictive model that identifies areas of future recurrence using a voxel-based radiomics analysis of magnetic resonance imaging (MRI) data. This multi-institutional study included a retrospective analysis of patients diagnosed with glioblastoma who underwent surgery with complete resection of the enhancing tumor. Fifty-five patients met the selection criteria. The study sample was split into training (N = 40) and testing (N = 15) datasets. Follow-up MRI was used for ground truth definition, and postoperative structural multiparametric MRI was used to extract voxel-based radiomic features. Deformable coregistration was used to register the MRI sequences for each patient, followed by segmentation of the peritumoral region in the postoperative scan and the enhancing tumor in the follow-up scan. Peritumoral voxels overlapping with enhancing tumor voxels were labeled as recurrence, while non-overlapping voxels were labeled as nonrecurrence. Voxel-based radiomic features were extracted from the peritumoral region. Four machine learning-based classifiers were trained for recurrence prediction. A region-based evaluation approach was used for model evaluation. The Categorical Boosting (CatBoost) classifier obtained the best performance on the testing dataset with an average area under the curve (AUC) of 0.81 ± 0.09 and an accuracy of 0.84 ± 0.06, using region-based evaluation. There was a clear visual correspondence between predicted and actual recurrence regions. We have developed a method that accurately predicts the region of future tumor recurrence in MRI scans of glioblastoma patients. This could enable the adaptation of surgical and radiotherapy treatment to these areas to potentially prolong the survival of these patients.


Introduction
Glioblastoma is the most common primary neoplasm of the central nervous system, with an overall survival of approximately 15 months from diagnosis in patients who undergo resection and postoperative adjuvant treatment [1]. Despite active research in recent years, including multiple clinical trials, the current therapeutic regimen has not undergone substantial changes in the last decade, and the life expectancy of these patients has not been extended. The current treatment protocol consists of pursuing complete resection of the contrast-enhancing tumor component, followed by adjuvant treatment with chemo-and radiotherapy [2]. Nonetheless, the recurrence and mortality rates inevitably reach 100% in all patients [3].
At the time of diagnosis, glioblastoma is considered widespread because neoplastic cells infiltrate the non-enhancing peritumoral area, as demonstrated by several anatomopathological studies [4,5]. Nevertheless, the non-enhancing peritumor is rarely included as a surgical or radiotherapy treatment target, although it is well-known that more than 80% of recurrences occur near the margins of the resection cavity [6]. The main reason is that, using conventional magnetic resonance imaging (MRI), it is often impossible to visually distinguish a non-enhancing tumor from vasogenic edema, despite specific radiological criteria [7]. The entire peritumoral region has the same contrast on MRI and is shown as a hyperintense signal in T2-weighted (T2w) and fluid-attenuated inversion recovery (FLAIR) images. There is increasing evidence for extending the surgical resection of glioblastoma beyond the margins of the contrast-enhancing region since it could improve patient survival [8]. However, expanding surgical margins is not always feasible because the peritumor can extend to eloquent areas, thus increasing the risk of postoperative neurological deficits. The Response Assessment in Neuro-Oncology (RANO) group recently concluded that less than 5 mL residual non-enhancing tumor volume is prognostically better than complete resection of contrast-enhancing volume alone [9]. However, the nonenhancing tumor varies in size and location, and a more tailored approach toward the non-enhancing tumor burden could be beneficial.
Previous studies have attempted to characterize tumor infiltration into the noncontrastenhancing region through MRI and in combination with stereotactic biopsies [10] or by applying machine learning and deep learning models [11][12][13][14]. Moreover, several authors have proposed methods to predict regions of future tumor recurrence using MRI-based radiomic features [15][16][17][18][19]. Several of these studies show promising results, but they often require a great variability of image preprocessing, ground truth definitions, feature extraction, and data handling, as well as a need for advanced MRI sequences, which often hinder the generalization and applicability in a clinical setting. Furthermore, most of these studies are based on preoperative MRI, which contains no information about the resection margins, something that is highly important for recurrence predictions. Finally, none of these publications have been validated using external, multi-institutional data.
In the current study, we aimed to develop a method that may not only distinguish between the areas of edema and those of tumor infiltration but also identify sites of possible future tumor recurrence in the peritumor. As a secondary objective, we sought to develop a method that can be used in any institution, regardless of MRI acquisition protocols, without using advanced neuroimaging modalities and usable for both pre-and postoperative scans. This is accomplished using the follow-up MRI for ground truth definition and the postoperative structural multiparametric MRI to extract voxel-based radiomic features as input into a machine learning-based prediction model of glioblastoma local recurrence.

Study Population and Data Description
We conducted a multi-institutional, retrospective analysis of patients who underwent surgery with de novo-diagnosed glioblastoma. The inclusion criteria were as follows: patients with complete resection of the enhancing tumor component, having a preoperative, early postoperative (less than 72 h), and follow-up MRI study, which all included T1weighted (T1w), T2-weighted (T2w), FLAIR, and T1w contrast-enhanced (T1ce) sequences, as well as diffusion-weighted imaging-derived apparent diffusion coefficient (ADC) maps. The details of the MRI acquisition protocols are shown in Supplementary material Table S1. In addition, all patients received adjuvant treatment with temozolomide and radiotherapy after surgery according to the Stupp protocol [2]. The exclusion criteria were as follows: pathological diagnoses other than IDH wild-type WHO grade 4 astrocytoma (glioblastoma), partial and subtotal resection, distant recurrences, MRI studies severely distorted due to hemorrhages, postsurgical infarcts or artifacts produced during the acquisition that conditioned images of poor quality. The diagnosis of tumor progression in the follow-up studies was made using the modified RANO criteria [20]. In cases that raised diagnostic doubts, additional follow-up studies were evaluated to discriminate between true tumor recurrence or pseudoprogression. Patients with uncertain progression and cases where the coregistration between the postoperative and follow-up MRIs failed were also excluded.
The initial study population comprised 127 patients from five different institutions (Table 1, column A). Of these, 55 patients met the selection criteria (Table 1, column B). The 40 patients from the Spanish institutions were used for model training. The remaining 15 patients (a Norwegian institution and 2 institutions from the Radiomics Signatures for PrecisiON Diagnostics (ReSPOND) database) [21] were used as the test dataset (Table 1, column C).

Image Preprocessing
The MRI studies were exported from the imaging archive of the respective institutions in the format of Digital Imaging and Communication in Medicine (DICOM) before they were converted to the format of the Neuroimaging Informatics Technology Initiative (NIfTI). The scans for every subject were rigidly registered to the SRI24 anatomical atlas space [22]. Intensity nonuniformity correction was applied as a temporary step to facilitate optimal registration [23].
T1w, T2w, ADC maps, and FLAIR scans were rigidly registered to the transformed T1ce scan for each individual, resulting in coregistered and resampled volumes of 1 × 1 × 1 mm isotropic voxels. The brain was extracted from all coregistered scans using a pre-trained deep learning-based model [24], followed by intensity normalization using Z-scoring.

Ground Truth Segmentation
Follow-up and postoperative MRI scans were coregistered using the deformable mode available in CaPTk through the Greedy registration technique [26]. Subsequently, the enhancing tumor was semiautomatically segmented in the T1ce images of the recurrence scan using an edge-based snake evolution method (ITK-SNAP version 3.0 [27]). Likewise, the peritumoral region was segmented in the postoperative MRI using the T1w, T1ce, T2w, and FLAIR scans as input and a hybrid generative-discriminative tumor segmentation method named the boosted glioma image segmentation and registration (GLISTRboost) algorithm [28]. Finally, the overlapping region between the segmented enhancing tumor in the follow-up scan and the peritumor of the postoperative scan was formed. Here, the peritumoral region was manually divided into two subregions to be used as ground truth labels: recurrence, or tumor infiltration, labeled with 1, and nonrecurrence, or edema, labeled with 0, using the intersection and subtraction tools of LifEx version 6.0 [29]. An example of segmentation is shown in Figure 1A,B. The automatic segmentations were reviewed visually, and manual corrections were introduced where the algorithms failed. All segmentations were performed by two neurosurgeons (SC, SG) with more than five years of experience in imaging applied in neuro-oncological surgery. Subsequently, a senior neuroradiologist (MV) with more than 15 years of experience reviewed and adjusted all segmentations of all patients.

Voxel-Based Radiomic Feature Extraction
A total of 4730 voxel-based radiomic features were computed from the peritumoral region of the five multiparametric MRI modalities using the open-source Python package Pyradiomics [30] version 3.0.1. The radiomic features followed the definitions according to the Image biomarker standardization initiative (IBSI) [31] and were divided according to first-order statistical features (19 features) and textural features (75 features). In addition, features were calculated from filtered images using wavelets, Laplacian of Gaussian filters, and local binary patterns. A detailed description of these characteristics and the parameters used in the extraction is provided in Supplementary Material Table S2.

Data Management and Model Training
Inevitably, the number of recurrence voxels was much smaller than the number of peritumor voxels. This class imbalance was handled by randomly undersampling the majority class (nonrecurrence) to match the size of the minority class (recurrence). Four different machine learning-based classifiers were trained: Categorical Boosting (Cat-Boost), Extreme gradient boosting (XGBoost), Random Forest (RF), and the Light Gradient boosting machine (LightGBM), using Python version 3.9.

Probability Maps and Predicted Recurrence Labels
The output from the machine learning classifiers were voxelwise probabilities of future tumor recurrence, which were represented as a graphical color overlay on the MRI images ( Figure 1C). In addition, probabilities were dichotomized into recurrence and nonrecurrence labels, as shown in Figure 1D. The voxels closest to the edge of the surgical cavity had a greater probability of being infiltrated than those located in the most distant regions of the peritumor. Therefore, a correction factor was introduced, which strongly reduced the predicted probability for voxels further than 15 mm from the edge of the segmented surgical cavity. This correction factor was implemented based on previously published anatomopathological findings from [4,5]. Then, the Otsu method [32] automatically determined the threshold used to define the recurrence label. Thus, the probabilities predicted and corrected by the distance factor that exceeded the Otsu threshold were labeled as recurrence.

Model Evaluation
The recurrence prediction performance of the trained model on the external test data was evaluated using two approaches: voxelwise and regionwise. In the first approach, the predicted label of each voxel in the test data was compared to the ground truth using the area under the receiver operating characteristic (ROC) curve (AUC), precision, recall, accuracy, F1 score, and Cohen's Kappa. However, since it is not a simple segmentation task and due to the biological implications of predicting an infiltrated area subject to evolutionary changes, a second, region-based evaluation approach was developed, taking into account the overall distribution in three-dimensional space. The peritumor was automatically divided into sectors, with the postsurgical cavity as the reference center. The sectors were anterior, posterior, superior, inferior, right, and left, and their angular combinations ( Figure 2). This allowed a comparison of whether the predicted recurrent voxels were located within the same sectors as the ground truth recurrence segmentations. Subsequently, the sector predictions were evaluated using the same metrics as the voxel-based approach. Figure 3 shows a schematic representation of the predictive model development process. area under the receiver operating characteristic (ROC) curve (AUC), precision, recall, accuracy, F1 score, and Cohen's Kappa. However, since it is not a simple segmentation task and due to the biological implications of predicting an infiltrated area subject to evolutionary changes, a second, region-based evaluation approach was developed, taking into account the overall distribution in three-dimensional space. The peritumor was automatically divided into sectors, with the postsurgical cavity as the reference center. The sectors were anterior, posterior, superior, inferior, right, and left, and their angular combinations ( Figure 2). This allowed a comparison of whether the predicted recurrent voxels were located within the same sectors as the ground truth recurrence segmentations. Subsequently, the sector predictions were evaluated using the same metrics as the voxel-based approach. Figure 3 shows a schematic representation of the predictive model development process.

Recurrence Prediction in Preoperative MRI Scans
One of the potential implications of obtaining recurrence probability maps of the peritumor region is to adapt the surgical strategy by extending the resection to these areas. The resulting best-trained model was applied to the preoperative MRI scans of the test cohort following the same preprocessing steps mentioned above. The results were quali-

Recurrence Prediction in Preoperative MRI Scans
One of the potential implications of obtaining recurrence probability maps of the peritumor region is to adapt the surgical strategy by extending the resection to these areas. The resulting best-trained model was applied to the preoperative MRI scans of the test cohort following the same preprocessing steps mentioned above. The results were qualitatively reviewed by an experienced neurosurgeon (SC).
Once the classifier was trained, the average time required for processing a new patient, including image preprocessing, segmentation, extraction of radiomic features, and model application, was approximately 45 min. A computer with a 2.20 GHz Intel Core i7 processor, 32 GB of RAM, and a 16 GB NVIDIA GeForce RTX 3070 graphics card was used.

Results
The clinical characteristics of the patients in the training and test cohorts are summarized in Table 2. There were no statistically significant differences between the training and test cohorts in overall and progression-free survival. Of the 40 patients in the training cohort, the radiomic features were extracted using a total of 1,569,490 voxels, of which 160,366 corresponded to the recurrence label and 1,409,124 to no recurrence label. After random undersampling, a total of 320,732 training voxels were obtained.
Patients from St. Olavs University Hospital had 324,391 test voxels, of which 8475 were recurrence and 315,916 were nonrecurrence. From the ReSPOND database, there were 259,202 test voxels, of which 13,828 were recurrence and 245,374 were nonrecurrence.
The performance metrics of all ML classifiers on the testing dataset are shown in Table 3. The best-performing classification model was the one using the CATboost algorithm. The results of the validation of the model in the external cohort applying the CATBoost algorithm and using sector-based evaluation were as follows: AUC of 0.81 ± 0.09, accuracy of 0.84 ± 0.06, precision of 0.48 ± 0.24, recall of 0.76 ± 0.22, F1 score of 0.53 ± 0.17, and Cohen's Kappa of 0.45 ± 0.18. Figure 4 shows the individual ROC curves obtained after the evaluation by voxels and sectors. Figure 5 shows the estimated recurrence probability maps in all cases of the test group using the CATBoost algorithm and the follow-up scans where recurrence was diagnosed. Ranking features based on predictive and cumulative importance are shown in Figure 6.   After applying the model in preoperative studies of the testing dataset, recurrence probability maps were obtained in all cases with well-differentiated areas of high infiltration probability and low probability of edema. Despite being unable to make a quantitative evaluation due to the absence of an adequate coregistration system, these predictions revealed a spatial distribution very similar to the sites of future recurrence, proving the existence of an underlying pattern in the image that can be revealed after applying our methodology (Figure 7).   After applying the model in preoperative studies of the testing dataset, recurrence probability maps were obtained in all cases with well-differentiated areas of high infiltration probability and low probability of edema. Despite being unable to make a quantitative evaluation due to the absence of an adequate coregistration system, these predictions

Discussion
In this retrospective study, we evaluated a machine learning-based approach for predicting tumor recurrence in patients with glioblastoma using radiomic features from postoperative MRI. We found that the predicted recurrence regions were highly correlated with the areas of future recurrence. This suggests that there is a group of features in the multiparametric MRI that reveal a pattern imperceptible to the naked eye. This pattern allows the classifier to distinguish two well-differentiated regions within the peritumoral zone. Therefore, using our methodology, it is possible to predict which areas of the peritumoral region will become tumor-enhancing zones. Thus, the main contribution of our study is the development of a reproducible prediction model with great potential for the application of personalized therapies for this type of neoplasia.
To the best of our knowledge, our study is the first that combines the follow-up MRI to define the ground truth labels and the voxelwise radiomic feature extraction from the peritumoral region of the early postoperative MRI (<72 h). This has the advantage of similar morphology between postoperative and follow-up scans. Early postoperative MRI allows us to define the presence of residual contrast enhancements and quantifies the extent of tumor resection.
Applying the region-based model evaluation, our model achieved mean AUC values of 0.81 and an accuracy of 0.84 in the external testing cohort. These results are superior to those reported by Yan et al. [17] (2020), in which the authors developed a predictive model of recurrence using the voxel-based radiomic features of preoperative MRI (structural, perfusion, and diffusion tensor imaging (DTI)). Similar to our study, the ground truth labels were created through the coregistration of the follow-up scans, but the authors employed preoperative MRI instead of postoperative MRI, as we suggest. The authors reported an overall accuracy in the validation group (n = 20) of 0.78. In the study by Chougule et al. [18] (2021), an accuracy of 0.71 was reported to predict local recurrence in the test group (n = 6). The authors trained a predictive recurrence model using voxel-based radiomic features of the T1ce, FLAIR, and ADC maps. Although retrospective longitudinal data from each subject were collected, only one postoperative scan was used to define the ground truth labels, averaging 143 ± 42 days before recurrence. It is well known that during surgery, some peritumour areas are inadvertently or intentionally resected. Therefore, using preoperative MRI for feature extraction would imply predicting potentially nonexistent regions. Instead, an early postoperative MRI, as suggested by the present work, represents a more robust alternative to predict future tumor recurrence.
The performance of our model also exceeds the results published by Dasgupta et al. [19] (2021). The authors obtained an accuracy of 0.79 and an AUC of 0.61 in the test group (n = 10) using voxel-based radiomic feature extraction from T1ce, T2w, and ADC maps. To train the model, the authors used the features extracted from the peritumoral region of a group of patients with brain metastasis (n = 45) as recurrence labels and the data extracted from the tumor segmentation of a low-grade glioma dataset (n = 36) as recurrence labels. Although this method is innovative, it is not possible to affirm that the composition of low-grade gliomas is the same as that of the non-enhancing tumor areas of glioblastoma.
Our results are slightly lower than those reported by Akbari et al. in 2016, who obtained a mean AUC of 0.84, a sensitivity of 91%, and a specificity of 93% in the testing cohort (n = 34) [16]. The authors used the voxel-based intensity features of the structural, DTI, and dynamic susceptibility contrast-enhanced MRI. Additionally, their model uses preoperative MRI for recurrence prediction, which has limitations, as discussed above.
In the work of Rathore et al. [15] (2018), using previously published methodology [16], the authors included only patients with pathology-proven recurrence diagnoses and added texture features during model training. In the test cohort (n = 31), they obtained an AUC of 0.91 and an accuracy of 0.89, somewhat better than our results. This study used preoperative MRI for ground truth definition, with the limitations discussed earlier. Furthermore, to evaluate predictions, experts manually drew recurrence regions on the preoperative MRI using the follow-up scans as a reference. In contrast, our model does not need expert knowledge to define the true labels since coregistration was used with the follow-up scans.
Only two previous studies mentioned having applied the inclusion criterion that the patients underwent complete resection of the contrast-enhancing tumor [15,16]. This was imperative in our study to ensure that there were no remnants of the enhancing tumor component that could interfere with the analysis.
We acknowledge that the sensitivity of our model is greater than its specificity due to the presence of false positive predicted regions. However, high sensitivity is undoubtedly necessary because we intend to detect a severe event (areas of tumor recurrence) since they are potentially amenable to treatment.
A significant difference in our work compared to previous publications is our method of evaluating predictions. Although the final output of the model is a segmentation integrated by voxels exceeding a certain threshold, predicting areas of recurrence is not an ordinary segmentation task. Two facts with a biological basis must be considered. The first is that the areas of the peritumor are infiltrated, and that will evolve into an enhancing tumor unlikely to have the same size or shape as the enhancing tumor in the follow-up scan. Because these regions undergo evolutionary changes, they naturally tend to grow and invade. The second fact adding disparity between the predicted tumor areas and the ground truth segmentations is that a coregistration of the follow-up and postoperative images has been used for the estimation. As already discussed, this coregistration has limitations, which add uncertainty to the shape and size that the predicted infiltrated peritumor may have. For these reasons, our segmentations cannot be evaluated in a classical way using measures based on overlapping or distance. Instead, we sought to determine whether our predicted recurrence regions were spatially correlated with the site where the tumor would genuinely begin to regrow. By realizing that predicted segmentations need not have high spatial accuracy, we proposed a new regionwise evaluation that brings an intuitive interpretation and evaluation of our predictions. Using this novel approach, we compared whether the regions predicted as recurrence were found in a similar location to the actual site of the enhancing tumor in the follow-up scan.
We highlight that our predictive model has been evaluated in a multi-institutional cohort of patients, which tests the reproducibility of the radiomic features used by the model between different manufacturers of MR scanners and acquisition protocols. Furthermore, our predictive model was trained using the basic or structural sequences of MRI. Since glioblastoma tends to grow predominantly along white matter tracts [33], advanced sequences such as DTI and perfusion MRI could improve our model performance further, as shown in earlier studies [15,16]. However, these sequences are not available in all centers, which would make the model less attractive. In addition, despite the complexity of the methods used here, the final application in a clinical scenario can be carried out with basic computer science and image processing knowledge. Thus, an infiltration probability map of a patient diagnosed with glioblastoma can be obtained in less than 1 h in the DICOM format. It can be easily transferred and used in any neuronavigation or radiotherapy workstation.
We are aware of the limitations of our study. A potential drawback is the lack of anatomopathologic confirmation of regions labeled as recurrences on follow-up scans. As previously mentioned, the diagnosis of recurrence was made according to preestablished radiological criteria. Therefore, we cannot completely rule out the inclusion of a misdiagnosed case of pseudoprogression.

Conclusions
We have developed and evaluated a model that can predict the location of tumor recurrence from MRI of patients with glioblastoma with high accuracy and sensitivity. Further research focused on the molecular and pathological characteristics of these areas of potential recurrence will allow clinicians to adapt the surgical and radiotherapy treatment to prolong the survival of these patients.

Supplementary Materials:
The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/cancers15061894/s1, Table S1: MRI acquisition parameters used by participating institutions. Table S2: Pyradiomics settings used for voxel-wise feature extraction. Informed Consent Statement: Patient consent was waived due to the study's retrospective nature and the fact that the patients were deceased. All patient data were anonymized to protect patient confidentiality.
Data Availability Statement: Data generated or analyzed during the study are available from the corresponding author by request. All code written in support of this publication is publicly available at: https://github.com/smcch/Predicting_Glioblastoma_Recurrence_MRI (accessed on 19 March 2023).