Prediction of Rapid Early Progression and Survival Risk with Pre-Radiation MRI in WHO Grade 4 Glioma Patients

Simple Summary Rapid early progression (REP) has been defined as increased nodular enhancement at the border of the resection cavity, the appearance of new lesions outside the resection cavity, or increased enhancement of the residual disease after surgery and before radiation. Patients with REP have worse survival compared to patients without REP (non-REP). Therefore, a reliable method for differentiating REP from non-REP is hypothesized to assist in personlized treatment planning. A potential approach is to use the radiomics and fractal texture features extracted from brain tumors to characterize morphological and physiological properties. We propose a random sampling-based ensemble classification model. The proposed iterative random sampling of patient data followed by feature selection and classification with radiomics, multi-resolution fractal, and proteomics features predicts REP from non-REP using radiation-planning magnetic resonance imaging (MRI). Our results further show the efficacy of pre-radiation image features in the analysis of survival probability and prognostic grouping of patients. Abstract Recent clinical research describes a subset of glioblastoma patients that exhibit REP prior to the start of radiation therapy. Current literature has thus far described this population using clinicopathologic features. To our knowledge, this study is the first to investigate the potential of conventional radiomics, sophisticated multi-resolution fractal texture features, and different molecular features (MGMT, IDH mutations) as a diagnostic and prognostic tool for prediction of REP from non-REP cases using computational and statistical modeling methods. The radiation-planning T1 post-contrast (T1C) MRI sequences of 70 patients are analyzed. An ensemble method with 5-fold cross-validation over 1000 iterations offers an AUC of 0.793 ± 0.082 for REP versus non-REP classification. In addition, copula-based modeling under dependent censoring (where a subset of the patients may not be followed up with until death) identifies significant features (p-value < 0.05) for survival probability and prognostic grouping of patient cases. The prediction of survival for the patients’ cohort produces a precision of 0.881 ± 0.056. The prognostic index (PI) calculated using the fused features shows that 84.62% of REP cases fall under the bad prognostic group, suggesting the potential of fused features for predicting a higher percentage of REP cases. The experimental results further show that multi-resolution fractal texture features perform better than conventional radiomics features for prediction of REP and survival outcomes.


Introduction
Brain and other central nervous system (CNS) tumors are associated with the highest mortality and morbidity across different malignancies in the United States [1]. Glioblastoma (GB) is one of the most aggressive brain tumors, representing nearly half of brain gliomas [2]. For newly diagnosed GB patients, the standard of care includes maximal safe surgical resection followed by radiation therapy with concurrent and adjuvant temozolomide (TMZ) after which tumor-treating fields are often recommended (i.e., the Novo-TTF system, renamed Optune) [3]. Radiation therapy should ideally begin six weeks following surgery [4]. During this time frame, GB may regrow significantly due to its highly proliferative nature [5]. Several institutional series have been published evaluating postoperative REP with pre-radiation MRI [6][7][8]. Specifically, REP is assessed by comparing early post-operative MRI scans with radiation-planning MRI [8]. There is almost a 50% prevalence rate for the development of REP, even if radiation is initiated earlier than six weeks post-operatively [6][7][8].
MRI plays a crucial role in the evaluation of post-operative and post-treatment effects. The post-operative MRI is acquired within 3 days (preferably within 24 h) both to assess extent of resection and to minimize the effect of enhancement due to surgery [5]. The radiation-planning MRI scan is obtained 1 to 3 weeks prior to the start of radiation therapy to assist with target delineation. According to the guidelines [9], another MRI (post-radiation) scan is obtained 2 to 6 weeks after the completion of treatment with radiation therapy +/− TMZ followed by surveillance imaging every 2-4 months. The comparison of the first post-radiation MRI (approximately after 1 month from the completion of radiation therapy) to baseline scan (post-operative MRI, more commonly being radiation-planning scan), is often made to evaluate tumor progression as well as radiationinduced changes (collectively termed as pseudo progression) [4]. MRI offers a detailed characterization of the morphological, physiological, and metabolic properties of brain tumors, particularly brain gliomas, which are complex and heterogenous malignancies [10]. Quantitative radiomics features [11,12] extracted from MRI (texture, intensity, shape, area, and geometric features) followed by statistical and machine learning analyses have been shown to be effective [13][14][15] in brain tumor volume segmentation, and classification of normal/tumor tissues.
Patients with REP have worse survival compared to non-REP patients [7]. Overall survival (OS) analysis refers to the time to death from the day of surgery. A patient is censored if they are lost to follow-up prior to observing time to death [16]. Censoring may introduce bias into statistical analysis results if censoring processes involve dropout or withdrawal due to tumor progression, treatment toxicity, or the start of second-line therapy [16]. Because a patient may die soon after being lost to follow-up, total survival and dropout time may be positively associated [16]. Dependent censoring occurs when the relationship between censoring time and survival time cannot be explained by observable factors [16,17]. In statistical analysis, copula-based modeling is a state-of-the-art method to model the dependency between survival and censoring time.
As alluded to above, several retrospective reviews correlated clinical and pathologic features with REP, and found it to be an independent negative prognostic factor [4,[6][7][8]. It remains unclear whether patients with REP have distinct molecular or radiographic features. To the best of our knowledge, there has been no research that utilizes MRI features for prediction of REP using radiation-planning MRI scans, which may act as a quantitative imaging-based biomarker to stratify REP patients from non-REP patients.
The first objective of this study is to evaluate the predictive efficacy of conventional radiomics and sophisticated multi-resolution fractal texture features extracted from radiationplanning MRI for predicting REP. A second objective is to examine the survival probability of patients using copula modeling applied to radiation-planning MRI radiomics features within a context of dependent censoring scenario. Third objective is the binary prediction of patients' survival status (patients expired/dead or not) based on selected significant (p-value < 0.005) T1C features, employing copula modeling. To assess the significance of the selected features, a prognostic index (PI) is calculated, derived as a linear combination of the selected features. By utilizing this calculated PI, patients are categorized into prognostic groups (good or bad). Additionally, within these prognostic groups, the distribution of REP cases versus non-REP cases is constructed. The distribution of REP cases versus non-REP cases within the defined prognostic groups further demonstrates the predictive capability of radiomics features in identifying REP as a bad prognostic factor or indicator of a high-risk group.

Patient Data
This is an institutional review board (IRB, reference #22-057)-approved retrospective chart review of a cohort of patients treated at OhioHealth between 1 January 2015 and 1 March 2021. Relevant clinical and radiographic data have been abstracted from the electronic health record system. Patients with a biopsy-confirmed diagnosis of World Health Organization (WHO) grade 3 or 4 anaplastic astrocytoma or grade 4 glioblastoma have been included in this study, with a minimum of three MRI scans (pre-operative, early post-operative, radiation planning). Imaging studies are reviewed by board-certified neuroradiologists. All patients have undergone surgery (biopsy, subtotal, or gross total resection), followed by radiation therapy with or without adjuvant TMZ.
A total of 95 patient cases have been included in this study. Among these, twenty-five cases do not have a (T1C) sequence in at least one of the MRI studies. Seventy patients with complete radiographic data have been included in the analysis, with thirteen of them being clinically identified as having REP and the remaining fifty-seven as non-REP. REP has been classified as such in line with previous literature [18,19]. A detailed summary of patient cases is presented in Table 1. Table 1. Summary of data between the patient group.

Algorithm Pipeline for Prediction of Rapid Early Progression (REP)
The overall pipeline for prediction of rapid early progression (REP) is depicted in Figure 1. The specifics of each step are described in the subsequent sub-sections. The predictive model is an ensemble tree-based Cat Boost classification model implemented in

MRI Preprocessing
All radiation-planning MRI images are co-registered to the same T1 anatomic template using affine registration and resampled to 1 mm 3 voxel resolution using the Oxford Center for Functional MRI of the Brain's (FM-RIB's) Linear Image Registration Tool (FLIRT) of the FMRIB Software(version 6.0) Library (FSL) [20]. The FSL's Brain Extraction Tool (BET) [21] is utilized to skull-strip each patient's volumetric image. For obtaining improved skull-stripped volumetric images, manual intervention is employed. Additionally, all images are smoothed using the Smallest Unvalued Segment Assimilating Nucleus (SUSAN) [22], a low-level image processing technique, in order to reduce high frequency intensity changes (i.e., noise) in regions with a uniform intensity profile while maintaining the underlying structure. The intensity histograms of all modalities for all patients are then matched to the relevant modality of a single reference patient using the implemented version of the Insight Toolkit (ITK) [23].

Tumor Volume Segmentation
The utilization of transfer learning in our study is motivated by the inherently challenging nature of limited patient data for tumor volume segmentation. Given the scarcity of available data, we employ transfer learning as a strategic approach to enhance the performance of our segmentation model. To achieve this, we leverage a substantial dataset comprising 1251 GB patient cases collected from the well-established BRATS 2021 challenge [24][25][26]. This dataset serves as the foundation for training our segmentation model. By initially training the model on this larger dataset, we enable it to learn intricate features and patterns associated with tumor tissue segmentation. Once the model is adequately trained using this extensive dataset, it is subsequently deployed to segment tumor tissue regions in a more restricted 70 patient cohort with radiation-planning MRI volumetric images. This process allows the model to generalize its learned knowledge and effectively adapt to the specific characteristics of our target cohort, even in the presence of limited data. For the task of segmenting tumor tissue regions, encompassing edema, enhancing tumor, and necrosis, we employ a 3D UNet model specifically designed for GB patients'  All radiation-planning MRI images are co-registered to the same T1 anatomic template using affine registration and resampled to 1 mm 3 voxel resolution using the Oxford Center for Functional MRI of the Brain's (FM-RIB's) Linear Image Registration Tool (FLIRT) of the FMRIB Software(version 6.0) Library (FSL) [20]. The FSL's Brain Extraction Tool (BET) [21] is utilized to skull-strip each patient's volumetric image. For obtaining improved skullstripped volumetric images, manual intervention is employed. Additionally, all images are smoothed using the Smallest Unvalued Segment Assimilating Nucleus (SUSAN) [22], a low-level image processing technique, in order to reduce high frequency intensity changes (i.e., noise) in regions with a uniform intensity profile while maintaining the underlying structure. The intensity histograms of all modalities for all patients are then matched to the relevant modality of a single reference patient using the implemented version of the Insight Toolkit (ITK) [23].

Tumor Volume Segmentation
The utilization of transfer learning in our study is motivated by the inherently challenging nature of limited patient data for tumor volume segmentation. Given the scarcity of available data, we employ transfer learning as a strategic approach to enhance the performance of our segmentation model. To achieve this, we leverage a substantial dataset comprising 1251 GB patient cases collected from the well-established BRATS 2021 challenge [24][25][26]. This dataset serves as the foundation for training our segmentation model. By initially training the model on this larger dataset, we enable it to learn intricate features and patterns associated with tumor tissue segmentation. Once the model is adequately trained using this extensive dataset, it is subsequently deployed to segment tumor tissue regions in a more restricted 70 patient cohort with radiation-planning MRI volumetric images. This process allows the model to generalize its learned knowledge and effectively adapt to the specific characteristics of our target cohort, even in the presence of limited data. For the task of segmenting tumor tissue regions, encompassing edema, enhancing tumor, and necrosis, we employ a 3D UNet model specifically designed for GB patients' T1C MRI scans [13,14].This approach capitalizes on the model's inherent ability to capture complex spatial relationships within the volumetric images, further enhancing the precision and accuracy of tumor tissue segmentation. The tumor regions are rigorously validated by two expert radiation oncologists specializing in primary and secondary brain tumor treatment, who achieve consensus.

Feature Extraction
A total of 600 features have been extracted from the tumor tissue volume segments in this study. These features include texture, volume, and the area of the tumor and its sub-regions (edema, enhancing tumor, and necrosis). Forty-one texture characteristics have been derived from the whole tumor volume in the raw MRI (T1C) sequence, as well as the tumor sub-regions. The conventional texture features are extracted using a grey-tone spatial dependence matrix (GTSDM), neighborhood grey-tone difference matrix (NGTDM), and grey level size zone matrix (GLZSM). The fractal texture features encompass the piecewise triangular prism surface area (PTPSA) for fractal characterization, multi-resolution Brownian motion (mBm) analysis, and tumor region characterization with Holder Exponent (HE) modeling, termed as generalized multi-resolution Brownian motion (GmBm). The PTPSA, mBm, and HE computational algorithms are detailed in [27][28][29]. Multi-resolution fractal features depict textural variation in tumor tissue across various image resolutions [30]. Six histogram-based statistics (mean, variance, skewness, kurtosis, energy, and entropy) are also derived from the distinct tumor sub-regions. We further extract volumetric features: the volume of the entire tumor, the volume of the whole tumor in relation to the brain, and the volume of sub-regions. Vallières et al.'s [31] MATLAB-based software (version R2019b) is utilized for analyzing texture features. For fractal characterization, multi-resolution fractal characterization, HE characterization, and volumetric characteristics, MATLAB-based in-house software is employed.

Selection of Radiomics Features and Model Building
To evaluate the efficacy of fractal features and conventional radiomics features, we consider two model configurations. These model configurations include (a) a non-fractal model (a model containing only conventional volume, area, and texture features), and (b) a fractal model (a model incorporating multi-resolution fractal features along with conventional volume, area, and texture features) [32]. For feature selection, a two-step feature selection process (details of feature selection and statistical analysis can be found in Appendix A) results in three significant (p-value < 0.05) features for both the fractal and non-fractal models, respectively. Radiomic modeling of REP classification using the selected features is implemented using a nested paradigm involving 1000 iterations and 5-fold cross-validation with random sampling (details of the algorithms are provided in Appendix A). The objective of cross-validation is to offer a robust estimate of a model's performance on unseen data [33]. In a k-fold cross-validation, the data are partitioned into k-subsets, of approximatively equal sizes. Moreover, typically calculated from the formula: n = k × m, where n is the sample size, k is the number of folds, and m the number of observations in each fold or subset. For 70 patient cases, where n = 70, k = 5, and m = 14, 70 = 5 × 14, we see that 5-fold may be a reasonable choice for the fold numbers.

Survival Analysis Modeling under Dependent Censoring
The Kaplan-Meier estimator and the Cox proportional hazard model are typical methods for survival analysis and feature selection [17]. These techniques manage censoring under the presumption that overall survival time and censoring time are statistically independent. Therefore, a copula-based approach [17] is employed to estimate the dependence parameter by utilizing cross-validation to select significant genes or features for survival prediction. The dependency between survival and censoring time is modeled by copula functions.
Considering the censoring scenario as presented in Table 2, we propose the copula method (details are presented in Appendix B) for feature selection in survival analysis. First, we identify the dependency parameter utilizing the survival data and the extracted radiomics features matrix. In addition to imaging features, molecular features are also included in the feature matrix. The significant (p-value < 0.05) features are then utilized to make binary predictions regarding whether the patient has expired or is alive. 2.6. Survival Prediction Figure 2 illustrates the comprehensive pipeline for predicting overall survival (OS). The objective is to predict the status of patients who have expired. Given that the copula-based method is computationally demanding [33], we perform feature selection in two steps. In the first step, we use our proposed algorithm (shown in Algorithm A1 in Appendix A). Subsequently, based on the F1-score range (0.2-0.84), we initially select features with F1scores greater than 0.7. The number of selected features in the first step is 124 and 87 for the non-fractal and fractal models, respectively. These initially selected features serve as input for a second-step feature selection using the copula model, with the aim of identifying significant features (p-value < 0.05). These final selected features are associated with patients' OS probabilities. Additionally, we examine whether the inclusion of molecular features (MGMT status, IDH status) is statistically significant in relation to patient survival probability. Considering the censoring scenario as presented in Table 2, we propose the copula method (details are presented in Appendix B) for feature selection in survival analysis. First, we identify the dependency parameter utilizing the survival data and the extracted radiomics features matrix. In addition to imaging features, molecular features are also included in the feature matrix. The significant (p-value < 0.05) features are then utilized to make binary predictions regarding whether the patient has expired or is alive.  2.6. Survival Prediction Figure 2 illustrates the comprehensive pipeline for predicting overall survival (OS). The objective is to predict the status of patients who have expired. Given that the copulabased method is computationally demanding [33], we perform feature selection in two steps. In the first step, we use our proposed algorithm (shown in Algorithm A1 in Appendix A). Subsequently, based on the F1-score range (0.2-0.84), we initially select features with F1-scores greater than 0.7. The number of selected features in the first step is 124 and 87 for the non-fractal and fractal models, respectively. These initially selected features serve as input for a second-step feature selection using the copula model, with the aim of identifying significant features (p-value < 0.05). These final selected features are associated with patients' OS probabilities. Additionally, we examine whether the inclusion of molecular features (MGMT status, IDH status) is statistically significant in relation to patient survival probability.

Predictive Performance of Rapid Early Progression (REP) Classification
The comparative performance of the two model configurations is illustrated in Table  3. When we compare the performance of the fractal model to the non-fractal model, the fractal model attains an AUC of 0.793, while the non-fractal model attains an AUC of 0.673. A significant difference (determined through ANOVA tests, p-value < 0.001) exists in

Predictive Performance of Rapid Early Progression (REP) Classification
The comparative performance of the two model configurations is illustrated in Table 3. When we compare the performance of the fractal model to the non-fractal model, the fractal model attains an AUC of 0.793, while the non-fractal model attains an AUC of 0.673. A significant difference (determined through ANOVA tests, p-value < 0.001) exists in terms of accuracy, AUC, PPV, and FPR between the two model configurations, as presented in Figure 3. Table 3. Comparison between model configurations for 5-fold cross--validation over 1000 iterations with subject independent random sampling for REP classification.  The statistically significant features (please refer to Table A1 in Appendix A) in t non-fractal model are the eccentricity in the edema region, the second axis (y-axis) leng in the necrosis region and the autocorrelation of GTSDM from T1C, respectively. For o statistical analysis, a p-value ≤ 0.05 is considered significant. We observe the significa differences in the features between the two groups. For instance, the median and me value of the necrosis region for the REP group are higher than those of the non-REP similar trend is also observed in the autocorrelation of GTSDM from T1C between the tw groups (please refer to Table A3 and Figure A2 in Appendix A). For the fractal model, the statistically significant features (please refer to Table A2 Appendix A) are GmBm of the NGTDM of T1C, the strength of NGTDM from the 37 direction of T1C, and the strength of NGTDM. We observe that the feature distribution not normal in each group. Therefore, the Wilcoxon-Mann-Whitney test is performed determine the significant difference (p-value < 0.05) in the feature distributions betwe the two groups. In the fractal model, the significant selected features comprise textu features. The median value of the selected features is higher in the non-REP group co pared to the REP group (please refer to Table A4 and Figure A3 in Appendix A).

Survival Probability Analysis under Dependent Censoring
First, we analyze the impact of dependent censoring on feature selection for survi probability. For this purpose, we evaluate the significant (p-value < 0.05) features usi the Cox proportional hazard model with independent censoring (please refer to Table in Appendix B). The features selected with independent censoring are compared w those selected with dependent censoring. Table 4 presents the significant features (p-va < 0.05) obtained with dependent censoring using copula modeling. In the case of no fractal models, no significant features are selected when applying Cox modeling. To amine the effect of dependent censoring on feature selection, we compare the survi probability curves utilizing the top two features in the fractal model. Since only t The statistically significant features (please refer to Table A1 in Appendix A) in the non-fractal model are the eccentricity in the edema region, the second axis (y-axis) length in the necrosis region and the autocorrelation of GTSDM from T1C, respectively. For our statistical analysis, a p-value ≤ 0.05 is considered significant. We observe the significant differences in the features between the two groups. For instance, the median and mean value of the necrosis region for the REP group are higher than those of the non-REP. A similar trend is also observed in the autocorrelation of GTSDM from T1C between the two groups (please refer to Table A3 and Figure A2 in Appendix A).
For the fractal model, the statistically significant features (please refer to Table A2 in Appendix A) are GmBm of the NGTDM of T1C, the strength of NGTDM from the 37th direction of T1C, and the strength of NGTDM. We observe that the feature distribution is not normal in each group. Therefore, the Wilcoxon-Mann-Whitney test is performed to determine the significant difference (p-value < 0.05) in the feature distributions between the two groups. In the fractal model, the significant selected features comprise texture features. The median value of the selected features is higher in the non-REP group compared to the REP group (please refer to Table A4 and Figure A3 in Appendix A).

Survival Probability Analysis under Dependent Censoring
First, we analyze the impact of dependent censoring on feature selection for survival probability. For this purpose, we evaluate the significant (p-value < 0.05) features using the Cox proportional hazard model with independent censoring (please refer to Table A5 in Appendix B). The features selected with independent censoring are compared with those selected with dependent censoring. Table 4 presents the significant features (p-value < 0.05) obtained with dependent censoring using copula modeling. In the case of non-fractal models, no significant features are selected when applying Cox modeling. To examine the effect of dependent censoring on feature selection, we compare the survival probability curves utilizing the top two features in the fractal model. Since only two features are selected with independent censoring, we proceed to analyze the survival marginal curves with and without censoring dependency. For a patient case with feature vector x = (x 1 ,x 2 , . . . ., x p ) , survival prediction is analyzed using the PI defined asβ(α) x, whereβ(α) = β 1 (α),. . ..,β p (α) [16,34]. If α = 0, then PI =β(0) x which is based on Cox modeling under independent censoring (α = 0). Therefore, the PI for the fractal modeling with the cox model with two significant features is the following.
(1) However, considering dependent censoring (α = 18) and utilizing the copula modeling the top two selected features as shown in Table 4 and the PI for the fractal model is the following: PI (with dependent censoring) = (−1.58 × ET2) + (0.74 × L2_Orientation). (2) Using the PI, we randomly divide the 67 patient cases into two groups of equal sample size (n 1 = 33, n 2 = 34). Patients in the good(low-risk) prognostic group have low PIs, and patients in the bad (high-risk) prognostic group have high PIs [16,17,35]. The two survival curves are determined by the copula graphic (CG) estimator [36,37] with the Clayton copula as presented in Figure 4. The difference between the two curves is calculated by the average vertical difference [16,17,38]. The p-value is calculated between the two groups using 1000 permutation tests [16,17,39]. From Figure A5a, we observe that the vertical distance (D = 0.128) between the two groups with independent censoring (α = 0), is not significant (p-value = 0.1422). However, considering the dependent censoring in Figure A5b (α = 18, c-index= 0.519), the distance (D = 0.185) between two prognostic groups is significant (p-value = 0.0047). Clayton copula as presented in Figure 4. The difference between the two curves is calculated by the average vertical difference [16,17,38]. The p-value is calculated between the two groups using 1000 permutation tests [16,17,39]. From Figure A5a, we observe that the vertical distance (D = 0.128) between the two groups with independent censoring (α = 0), is not significant (p-value = 0.1422). However, considering the dependent censoring in

Binary Prediction of Survival
Through the analysis, we observe the effect of dependent censoring on feature selection and survival probability (please refer to Figure A4 in Appendix B). Therefore, for binary survival prediction, we utilize the features selected through copula modeling. Table 4 presents the significant (p-value < 0.05) features for the fractal and non-fractal models. For the experimental analysis of the selected features, we consider the top three, five, seven, and nine features for binary classification of survival (whether the patient is expired or not). In the case of the fractal model, 10 features are significant for survival probability analysis while for the non-fractal model, 8 features are significant as presented in Table 4. First, we analyze the vertical distance between the good prognostic and bad prognostic groups with 3, 5, 7, 9, and 10 features (please refer to Table A6 in Appendix B). Subsequently, based on the significance of the feature combinations, we compute binary predictions of survival.
Using the selected top three, five, seven, and nine features, we apply our proposed algorithm (refer to Algorithm A2 in Appendix A) for binary prediction of survival. The predictive results for 5-fold cross-validation with 1000 iterations are presented in Table 5. In the case of binary survival prediction, we emphasize the precision of the models for the selected number of features. Higher precision corresponds to a lower false positive rate, as indicated in Table 5. Moreover, following the algorithm (refer to Algorithm A2 in Appendix A), we consider balanced number of expired/dead (n = 25, randomly sampled from 54 dead cases) and not dead (n = 13) cases in each iteration. It is observed (please refer to Appendix B) that when utilizing seven features, both model configurations (fractal, non-fractal) achieve a higher PPV or precision. Therefore, the performance of the model configurations is based on the top seven features. While comparing the fractal and non-fractal model performance for binary survival prediction, a significant difference in

Binary Prediction of Survival
Through the analysis, we observe the effect of dependent censoring on feature selection and survival probability (please refer to Figure A4 in Appendix B). Therefore, for binary survival prediction, we utilize the features selected through copula modeling. Table 4 presents the significant (p-value < 0.05) features for the fractal and non-fractal models. For the experimental analysis of the selected features, we consider the top three, five, seven, and nine features for binary classification of survival (whether the patient is expired or not). In the case of the fractal model, 10 features are significant for survival probability analysis while for the non-fractal model, 8 features are significant as presented in Table 4. First, we analyze the vertical distance between the good prognostic and bad prognostic groups with 3, 5, 7, 9, and 10 features (please refer to Table A6 in Appendix B). Subsequently, based on the significance of the feature combinations, we compute binary predictions of survival.
Using the selected top three, five, seven, and nine features, we apply our proposed algorithm (refer to Algorithm A2 in Appendix A) for binary prediction of survival. The predictive results for 5-fold cross-validation with 1000 iterations are presented in Table 5.
In the case of binary survival prediction, we emphasize the precision of the models for the selected number of features. Higher precision corresponds to a lower false positive rate, as indicated in Table 5. Moreover, following the algorithm (refer to Algorithm A2 in Appendix A), we consider balanced number of expired/dead (n = 25, randomly sampled from 54 dead cases) and not dead (n = 13) cases in each iteration. It is observed (please refer to Appendix B) that when utilizing seven features, both model configurations (fractal, non-fractal) achieve a higher PPV or precision. Therefore, the performance of the model configurations is based on the top seven features. While comparing the fractal and nonfractal model performance for binary survival prediction, a significant difference in the area under the curve (AUC) is observed with a p-value of 0.005 from ANOVA analysis. Furthermore, in terms of PPV (p-value < 0.01), accuracy (p-value < 0.001),and false positive rate (p-value < 0.01), a significant difference exists between the performance of the fractal and non-fractal models. In addition to radiomics features, we analyze the survival probability and binary prediction of survival using molecular information (MGMT methylation status, IDH status). Therefore, with the top seven selected features we include MGMT status and IDH mutation status as additional features. In both the fractal and non-fractal models, the MGMT status does not hold significance (p-value = 0.9651) under dependent censoring copula modeling. However, the IDH mutation status is significant (p-value = 0.04). Consequently, we added the IDH mutation status to the top seven features and computed model performance, as presented in Table 6. With the inclusion of additional molecular features, the distance between marginal survival curves remains almost the same. The vertical distance is D = 0.171, with p-value = 0.0085, as depicted in Figure 5 for the fractal model. In the non-fractal model, the vertical distance is D = 0.173, with p-value of 0.0084. For both the fractal-molecular and non-fractal-molecular models, there exists a significant difference between the models in terms of accuracy, PPV and FPR. However, there is no significant difference between the model configurations regarding AUC. The reason for this could be illustrated in Figure 5, where the survival probability demonstrates almost the same vertical distance and p-value for both models with the addition of the molecular features to the seven radiomics features. Moreover, with inclusion of molecular features, there is no increase in model performance in terms of PPV and FPR compared to model configurations without molecular feature. mutation) of mean test performance across 5-fold with 1000 iterations for binary prediction of patient survival (patient expired (dead) or not).

Model Configurations
Area under Curve Accuracy (%)  For both the fractal-molecular and non-fractal-molecular models, there exists a significant difference between the models in terms of accuracy, PPV and FPR. However,

Analysis of Prognostic Groups and Its Association with REP Status
Based on Table 4, it is observed that for the non-fractal model, a total of eight features are found to be significant (p-value < 0.05), whereas for the fractal model, the number is 10.
To analyze the distribution of REP patients in prognostic (good or bad) groups, we consider the top eight features in both the fractal and non-fractal models. As a result, the PI for the fractal and non-fractal models are as follows: The PI is computed using the selected radiomics and multi-resolution fractal features from radiation-planning MRI. The only difference between the PI of the fractal and non-fractal models lies in the inclusion of multi-resolution fractal texture features in the fractal models, in addition to conventional texture features. The sixty-seven patients are divided into good prognostic and bad prognostic groups. Using a CG estimator, two marginal survival curves are determined. In the fractal model, the distance between the two marginal curves is D = 0.157 (p-value = 0.014), while in the non-fractal model, it is D = 0.128 (p-value = 0.038). Figure 6 displays the distribution of REP cases within the prognostic groups. The scatter points depict the prognostic index for each patient. The darker point indicates the mean PI within each group, while the bars represent the corresponding standard deviation. For instance, in Figure 6a, within the scatter plot, the group labeled "c" corresponds to the patients in the bad prognostic group, which also includes REP cases. As indicated by Figure 6, we can observe that patients with a low prognostic index (PI)/lower risk are placed in the good prognostic group while those with a higher PI/higher risk are placed in the bad prognostic group. Furthermore, in the case of fractal model, based on the PI, 84.62% (11 out of 13 cases) of REP cases fall under the bad prognostic group, as depicted in the matrix representation of Figure 6a. In the non-fractal model, 76.92% (10 out of 13 of REP cases belong to the bad prognostic group, as presented in in the matrix representation of Figure 6b. The percentage of REP cases in the bad prognostic group is higher for the fractal model compared to the non-fractal model. Therefore, we have calculated whether there exists a significant difference between groups in terms of survival time, as illustrated in Table 7. In both groups, there is a significant (p-value < 0.05) difference in survival time. Test of normality and the appropriate test (ANOVA/Wilcoxon-Mann-Whitney) have been conducted between the two groups. From Table 7, the median number of survival days for REP is 172 days, in contrast to the non-REP group's 474.50 days, highlighting the significant difference between their survival times. Moreover, within each prognostic group (good or bad), there is also a significant difference in survival days between the groups. The median survival days for the bad prognostic group is 329 days, which is significantly different from the median survival days of 511 days for the good prognostic group. The percentage of REP cases in the bad prognostic group is higher for the fractal model compared to the non-fractal model. Therefore, we have calculated whether there exists a significant difference between groups in terms of survival time, as illustrated in Table 7. In both groups, there is a significant (p-value < 0.05) difference in survival time. Test of normality and the appropriate test (ANOVA/Wilcoxon-Mann-Whitney) have been conducted between the two groups. From Table 7, the median number of survival days for REP is 172 days, in contrast to the non-REP group's 474.50 days, highlighting the significant difference between their survival times. Moreover, within each prognostic group (good or bad), there is also a significant difference in survival days between the groups. The median survival days for the bad prognostic group is 329 days, which is significantly different from the median survival days of 511 days for the good prognostic group. In addition, we have analyzed each individual significant feature of the fractal model in relation to the REP status. For each individual feature, a test of normality, specifically the Shapiro-Wilk test, has been performed, followed by an appropriate test (ANOVA/Wilcoxon-Mann-Whitney) to determine the significance of each feature with respect to the REP status. As depicted in Figure 7, it can be observed that two features from the fractal survival probability exhibit a significant association (p-value < 0.05) with the REP status. Among these two features, the fractal features belong to the top eight features of the fractal model. Therefore, the higher percentage of REP cases in the fractal model can be associated with the significance of this feature with the REP status. The multifractal feature from T1C is significantly linked to the dependent censoring survival probability and REP status.  In addition, we have analyzed each individual significant feature of the fractal model in relation to the REP status. For each individual feature, a test of normality, specifically the Shapiro-Wilk test, has been performed, followed by an appropriate test (ANOVA/Wilcoxon-Mann-Whitney) to determine the significance of each feature with respect to the REP status. As depicted in Figure 7, it can be observed that two features from the fractal survival probability exhibit a significant association (p-value < 0.05) with the REP status. Among these two features, the fractal features belong to the top eight features of the fractal model. Therefore, the higher percentage of REP cases in the fractal model can be associated with the significance of this feature with the REP status. The multifractal feature from T1C is significantly linked to the dependent censoring survival probability and REP status.

Discussion
This study proposes the feasibilty of radiomics and sophisticated multi-resolutional fractal texture features for prediction of REP status in GB patients from a radiationplanning T1C sequence MRI. Two models (non-fractal and fractal) are constructed utilizing random sampling 5-fold crossvalidation as presented in Table 3. The predictive performance of the fractal model is an AUC of 0.793 ± 0.082, with an FPR of 0.145 ± 0.107, while that of the non-fractal model is an AUC of 0.673 ± 0.082 with an FPR of 0.262 ± 0.177. There is a significant difference (p-value < 0.001) between the fractal and the non-fractal models for the prediction of REP status.
Furthermore, copula-based modeling for survival analysis of dependent censoring has been obtained using survival time, censoring time, and radiomics features. For binary prediction of patient survival, the selected significant features (p-value < 0.05) from survival analysis have been incorporated. The predictive precision performance for

Discussion
This study proposes the feasibilty of radiomics and sophisticated multi-resolutional fractal texture features for prediction of REP status in GB patients from a radiation-planning T1C sequence MRI. Two models (non-fractal and fractal) are constructed utilizing random sampling 5-fold crossvalidation as presented in Table 3. The predictive performance of the fractal model is an AUC of 0.793 ± 0.082, with an FPR of 0.145 ± 0.107, while that of the non-fractal model is an AUC of 0.673 ± 0.082 with an FPR of 0.262 ± 0.177. There is a significant difference (p-value < 0.001) between the fractal and the non-fractal models for the prediction of REP status.
Furthermore, copula-based modeling for survival analysis of dependent censoring has been obtained using survival time, censoring time, and radiomics features. For binary prediction of patient survival, the selected significant features (p-value < 0.05) from survival analysis have been incorporated. The predictive precision performance for patient survival in the fractal model is 0.881 ± 0.056, with an FPR of 0.311 ± 0.109, while that of the nonfractal model is 0.872 ± 0.054, with an FPR of 0.339 ± 0.106. Moreover, the inclusion of IDH mutation status in addition to radiomics features holds significance (p-value = 0.04) for the survival probability analysis of patients.
Note a direct comparison of our proposed method with the existing literature may not be feasible due to the differences in patient dataset and the analysis methods. The methods in the existing literature primarily focus on describing REP and assessing its impact on disease outcomes using pre-radiation MRI. Additionally, a limited number of studies [4,35] have concentrated on determining the significance of radiation-planning MRI for assessing pseudo-progression. Previous studies [7] investigated the intergration of REP and MGMT status for the overall survival of patients. Patients with both REP and unmethylated MGMT status exhibit a worse prognosis compared to non-REP and methylated patients (10.2 months versus 16.5 months, p-value = 0.033).
In comparison to the existing literature on REP in pre-radiation MRI, our method focuses on evaluating the diagnostic and prognostic ability of radiomics features in both identifying REP in patients with glioblastoma, and predicting their survival outcomes under dependent censoring. We first extract radiomics features from radiation-planning T1C MRI and assess their ability to predict REP. Our analysis shows that the inclusion of multi-resolution fractal features improves model performace significantly (p-value < 0.05) when compared to the non-fractal feature model. We then carry out survival anlysis utlizing copula-based modeling of the occurrence of dependent censoring. The CG estimator is used to straify good and bad prognostic groups based on radiomics and multi-resolution fractal feature-based PI. Median survival in days is higher for the good progostic group compared to the poor prognostic group (511.0 versus 329.0; p-value = 0.02). Similarily, median survival is higher in the non-REP versus the REP group (474.5 versus 172.0; p-value 0.006). Moreover, a fractal texture feature (T1C_mBm_GLZSM_LargeZoneLowGrayEmphasis) is found significant (p-value < 0.05), along a with histogram mean (edema tumor region) feature, for the stratification of prognostic groups as presented in Figure 7. As for incorporating molecular data, the inclusion of IDH mutation status with the radiomics features is significantly associated with survival as shown in Figure 5. Despite this association being reported in the literature [15,40,41], our analysis shows that MGMT promoter methylation status is not associated with survival outcomes (p-value = 0.9651), similar to that in the literature [8]. This may be explained by the fact that MGMT promoter methylation status is missing in 14 patients.
Our findings are consistent with earlier research employing sophisticated imaging features, including conventional shape, volumetric, histogram statistics, texture, and fractalbased texture features, as well as radiogenomics, in brain tumor segmentation, classification, survival prediction, and molecular mutation characterization [10,28,42]. Several studies focus on the application of machine or deep learning models for survival predicition and psudeoprogression prediction with radimics features extracted from structuralor advanced MRI [10,[43][44][45][46]. The inclusion of fractal-based features with conventional radiomics features in the fractal model increases predictive performance significantly compared to using only conventional radiomics features in the non-fractal model. The feasibility of using fractal-based features is also observed in prognostic grouping using the CGestimator. The percentage of REP cases is higher (84.62%, 11 out of 13) with fractal features-based PI compared to non-fractal feature-based PI (76.92%, 10 out of 13). Futhermore, a multi-fractalbased texture feature extracted from T1C is signifiant in prognostic grouping and REP status stratification.
Limitations of our study include the indeterminate status of molecular markers for some patients. The molecular marker of 1p/19q co-deletion status is diagnostic for oligodendroglioma which is a different type of glioma, so it is not included in our analysis. The indeterminate status of MGMT methylation and IDH mutation may also impact the analysis of these molecular markers. Moreover, the sample size of our study is rather small and may restrict the generalizability of this study. Attempts have been made to address these challenges. Random sampling with 5-fold cross-validation has been performed for feature selection and model evaluation to balance the number of patient cases in each iteration. In addition, only statistically significant features have been included in REP classification and survival modeling. For survival analysis, copula modeling has been utilized to circumvent the assumption of independent censoring.
Future studies with a larger sample size and MRIs from different institutions/scanners are needed for improved generalizability of our model. Including additional sequences (i.e, T2/FLAIR) in addition to T1C may improve REP modeling, both in pre-operative and radiation-planning scans. The ability to predict who will develop rapid progression and where spatially using pre-operative scans can provide valuable insights for guiding surgical and radiation-planning decisions.

Conclusions
The most aggressive glioma in adults is GB, and REP is a negative prognostic factor. Our study demonstrates that utilizing both conventional and sophisticated multi-resolution fractal image features from the radiation-planning T1C MRI sequence provides a useful tool for predicting REP in GB patients. Additionally, the copula-based feature selection modeling and survival analysis under dependent censoring indicate the feasibility of fractal and radiomics features for predicting REP in GB patients utilizing radiation-planning MRIs. Informed Consent Statement: Informed consent was waved because this is a retrospective study.
Data Availability Statement: Data may be available subject to appropriate IRB approval.

Acknowledgments:
The authors thank Jay Anderson, Amanda Magee and Ben Purugganan for their assistance in patients' data collection and administrative support. We extend our appreciation to Megan A. Witherow for generously offering her expertise to review and edit this article.

Conflicts of Interest:
The authors declared no conflict of interest.

Appendix A
The complete algorithm for the initial step of subject-independent random feature selection is depicted in Algorithm A1.
The group D P presents the data frame of 13 patients (shown in Table 1) with their REP class (indicated as 1) and D N denotes non-REP patients (presented as 0). Since there is class size imbalance, a subsample of group D N (lines 4-5 in Algorithm A1) is selected, repetitively under a bootstrap algorithm and compared with group D P . Using the bootstrap algorithm [47], we repeatedly sample data from non-REP patients in each iteration to create a data frame called D. This data frame is then saved during each iteration (lines 4-8 in Algorithm A1). We create resampled datasets D to build a sampling distribution of prediction results along with their standard deviation.
For each iteration i, 20 patients are randomly sampled from 57 non-REP cases. The reason for selecting 20 samples is to have approximately 40-50% cases from the REP group and the rest from the non-REP group in each ith iteration. Under each ith iteration, we perform n-fold cross-validation (line 10 in Algorithm A1). Under each nth fold, there is a feature matrix X j j=1 to k , where k denotes 600 extracted features (lines 11-12 in Algorithm A1). For each of the 600 features, we apply a decision tree (DT) classifier to the training portion of data frame D. This classifier is then tested on the corresponding test portion of D to obtain an F1-score (lines 12-17 in Algorithm A1). The F1-score represents how well a feature performs in categorizing the positive (minority) data, making it an important indicator when the underlying data distribution is unbalanced [48].
After completing total, I iterations, the average F1-score is computed for all 600 features to establish their rankings (lines 19-21 in Algorithm A1). From our analysis, the range of mean F1-scores for the 600 features is 0.0-0.74. The features that score greater or equal to 0.6 (threshold value) are chosen. The threshold value of 0.6 is chosen because it is close to 1.00, which represents the highest F1-score and indicates better performance. There are seven features after applying this threshold for the non-fractal and fractal models as presented in Figure A1.
In the second step of our feature selection approach, the statistically significant (p-value < 0.05) feature(s) among the selected seven features are depicted in Table A1 for the non-fractal model, and in Table A2 for the fractal model. Finally, there are three features that are statistically significant for the fractal and non-fractal models. A detailed statistical analysis of the selected features is presented below.
Algorithm A1: Subject Independent random sampling with i iteration for feature ranking. . For each of the 600 features, we apply a decision tree (DT) classifier to the training portion of data frame D. This classifier is then tested on the corresponding test portion of D to obtain an F1-score (lines 12-17 in Algorithm A1). The F1-score represents how well a feature performs in categorizing the positive (minority) data, making it an important indicator when the underlying data distribution is unbalanced [48]. After completing total, I iterations, the average F1-score is computed for all 600 features to establish their rankings (lines 19-21 in Algorithm A1). From our analysis, the range of mean F1-scores for the 600 features is 0.0-0.74. The features that score greater or equal to 0.6 (threshold value) are chosen. The threshold value of 0.6 is chosen because it is close to 1.00, which represents the highest F1-score and indicates better performance. There are seven features after applying this threshold for the non-fractal and fractal models as presented in Figure A1.
In the second step of our feature selection approach, the statistically significant (pvalue < 0.05) feature(s) among the selected seven features are depicted in Table A1 for the non-fractal model, and in Table A2 for the fractal model. Finally, there are three features that are statistically significant for the fractal and non-fractal models. A detailed statistical analysis of the selected features is presented below.
Algorithm A1: Subject Independent random sampling with i iteration for feature ranking. Save D after each ith iteration 9: within D split the target variable vector y and feature variable matrix as { }j=1 to k 10: for fold = 1 to n do 11: enumerate train and test indices for nth-fold in { }j=1 to k and y 12: for j = 1 to k do 13: fit a DTon the train indices of { } 14: predict ŷ on the test indices of { } 15: calculate F1-score based on true y and predicted ŷ of the test indices of { } 16: assign the score as the feature score of each { } 17: end for 18: end for 19: Output: Cumulative F1-score after n-fold cross-validation 20: end for 21: Output: Ranked features based on cumulative score after I iterations model performance evaluation metric P denotes the accuracy, AUC, PPV and FPR while R denotes the ranking score of features.
In each of the i iterations, dataset D (as shown in Algorithm A2, line 4) is initialized through random sampling, following the process described in Algorithm A1. The feature matrix X S and the target vector y are set up using the data from dataset D (Algorithm A2, line 5).
Following the n-fold cross-validation, the average value of the performance metric P is computed (Algorithm A2, line 12). Finally, the mean and standard deviations of the performance metrics after I iterations are calculated. (Algorithm A2, lines [13][14].
Algorithm A2: Subject independent n-fold cross-validation with i iteration for model evaluation.
1: Input: D after each i iterations, ranking score R for each feature in matrix X j j=1 to k 2: Define {X S } selected feature matrix based on R which is a subset of X j j=1 to k , Classification model C, Model performance evaluation metrics P 3: for iteration = 1 to I do 4: Initialize D = {D N , D P } 5: Initialize {X S } based on R and target vector y from D 6: for fold = 1 to n do 7: enumerate train and test index for n-fold in {X S } and y 8: fit classifier model C to the train index of {X S } and y 9: evaluate C on the test index of {X S } 10: Save P after each n 11: end for 12: Output: Cumulative P after n fold 13: end for 14: Output: Mean values of P after I iterations In each of the i iterations, dataset D (as shown in Algorithm A2, line 4) is initialized through random sampling, following the process described in Algorithm A1. The feature matrix XS and the target vector y are set up using the data from dataset D (Algorithm A2, line 5).
Following the n-fold cross-validation, the average value of the performance metric P is computed (Algorithm A2, line 12). Finally, the mean and standard deviations of the performance metrics after I iterations are calculated. (Algorithm A2, lines [13][14].
Algorithm A2: Subject independent n-fold cross-validation with i iteration for model evaluation. for fold = 1 to n do 7: enumerate train and test index for n-fold in {XS} and y 8: fit classifier model C to the train index of {XS} and y 9: evaluate C on the test index of {XS} 10: Save P after each n 11: end for 12: Output: Cumulative P after n fold 13: end for 14: Output: Mean values of P after I iterations

. Detailed Statistical Analysis of Selected Features
To identify the significant difference in features between REP and non-REP groups, we first observe the distribution of features within each group. The Shapiro-Wilk test is performed to analyze the underlying distribution of each feature vector. In the case of eccentricity in edema region the distribution is not normal. Therefore, the Wilcoxon-Mann-Whitney test is performed. However, for the other two features, the distribution is normal and ANOVA tests are performed. For statistical analysis, a p-value ≤ 0.05 is considered significant.
In the fractal model, the significant selected features are texture features. The median value of the selected features is higher in the non-REP group compared to the REP group as shown in Table A4 and Figure A3. In Figure A3, GmBm is represented as the holder exponent (HE) as discussed in Section 2.3.3.

Appendix B
Considering random variables, if T is the survival time and U is the censoring time and X i = (X i1 , X i2 ,. . ., X ip ) are the p vectors of features for each patient [16], the response or the target is (t i , δ i , X i ) where t i = min{T, U} and δ i = I {T i ≤ U i }, where I {.} is the indicator function which indicates whether the time is survival time or censoring time. The univariate cox hazard model [49] is represented as, where p indicates the number of predictors. The estimatorβ j is used to obtain the p-value from the Wald test under the null hypothesis H oj : B j = 0. The features or genes that are selected have p-values under a certain threshold level of significance. The estimatorβ j can correctly estimate β j if the independence assumption is satisfied [34]. A copula model for dependent censoring [35] is presented as, where C a is a bivariate copula function and S T (t) = Pr(T > t) and S U (u) = Pr(U > u) indicate marginal survival. Under the independence assumption, α = 0 which leads to C α (u, v) = uv [35]. The resultant p-values are the same as univariate cox model p-values. Equation (2) under the independence assumption can be written as the following: However, there are bivariate copula functions that may be considered for potential correlation between the censoring and the survival times. We select the Clayton copula because of its mathematical simplicity [16]. It is suitable for statistical analysis because of the positive dependence structure property and captured by parameter alpha (α) described in Equation (A2). Moreover, the Clayton copula may be particularly applicable in cancer survival because it concentrates on the reliance in the lower tail of the bivariate density function, which represents the prevalent circumstance in which a rapid time to progression leads to a rapid time to death [50].
Under dependence censoring [35], the Clayton copula C α (u, v) can be represented as the following, The parameter α depicts the degree of dependence [17] and can be converted to Kendall's tau (τ). Under the Clayton copula, τ = α/(α + 2). Under this assumption, all the observed times are unique (t i = t j ,where i = j) among patients. The number of patients at-risk at time t i , n i = ∑ n l=1 I(t l ≥ t i ), under Clayton copula, has the graphic estimator [35], which is presented as, The CGestimator is equivalent to the Kaplan-Meier estimator under the independence copula [16]. Given survival data t i, δ i , x ij ; i = 1, . . . , n for j-th feature (j = 1, 2,. . . p), the data are modeled using the following copula equation.
Pr T > t, U > u x j = C α Pr T > t x j , Pr U > u x j , where Pr T > t x j ) = exp −∧ 0j (t) exp β j x j )}, Pr U > u x j ) = exp −Γ oj (u)exp Y j x j )}, and the same copula C α is used for every j. Here, β j , Y j , β j = (β oj ,β 1j ,. . .., β pj ; are the regression co-efficients and ∧ 0j , Γ oj are the cumulative baseline hazard functions which are unspecified [16]. For α, the semiparametric maximum likelihood estimator (β j (α),Ŷ j (α), ∧ oj (α),Γ oj (α)) determined using the dependCox.reg() function in R [17]. Harrell's concordance c-index [51,52] is a robust predictive measure to calculate α. Therefore, α is selected by maximizing the cross-validated c-index. The c-index is a measure of concordance between the outcome or target (t i , δ i ) and the predictors ∑ j∈Ωβ (−i) j (α)x j . For a given x j , the bivariate survival function [16] is the following: The dependency between T and U for a given x j , is described as in Equation (A6) under Sklar's theorem [16,53]. The copula reduces the restrictive independent condition of C α (u, v) = uv. Using dependCox.reg.CV( ) [34], we determineβ 1 (α),. . .,β p (α).    From Table A5, we observe the two selected features by the Cox model under independent censoring assumptions. Both features refer to fractal texture features extracted from MRI. Table A6. The average vertical difference between the two survival curves is based on the top 3, 5, 7, 9, and 10 features for the fractal and non-fractal models. The p-value is calculated from the permutation test of randomly splitting the total samples into equal portions.