A Multiparametric MR-Based RadioFusionOmics Model with Robust Capabilities of Differentiating Glioblastoma Multiforme from Solitary Brain Metastasis

Simple Summary Glioblastoma multiforme (GBM) and solitary brain metastasis (SBM) are common brain tumors in adults. The two tumors often pose a diagnostic dilemma owing to their similar features on conventional magnetic resonance imaging (MRI). Ability to discriminate the two tumors is critical as it informs clinical treatment strategies. This pilot study attempts to employ the machine learning technique to identify GBM and SBM by fusing radiomics features of multiple MRI sequences and multiple models. A multiparametric MR-based RadioFusionOmics (RFO) model was developed and has demonstrated promising prediction accuracy for the identifications of GBM and SBM. Abstract This study aimed to evaluate the diagnostic potential of a novel RFO model in differentiating GBM and SBM with multiparametric MR sequences collected from 244 (131 GBM and 113 SBM) patients. Three basic volume of interests (VOIs) were delineated on the conventional axial MR images (T1WI, T2WI, T2_FLAIR, and CE_T1WI), including volumetric non-enhanced tumor (nET), enhanced tumor (ET), and peritumoral edema (pTE). Using the RFO model, radiomics features extracted from different multiparametric MRI sequence(s) and VOI(s) were fused and the best sequence and VOI, or possible combinations, were determined. A multi-disciplinary team (MDT)-like fusion was performed to integrate predictions from the high-performing models for the final discrimination of GBM vs. SBM. Image features extracted from the volumetric ET (VOIET) had dominant predictive performances over features from other VOI combinations. Fusion of VOIET features from the T1WI and T2_FLAIR sequences via the RFO model achieved a discrimination accuracy of AUC = 0.925, accuracy = 0.855, sensitivity = 0.856, and specificity = 0.853, on the independent testing cohort 1, and AUC = 0.859, accuracy = 0.836, sensitivity = 0.708, and specificity = 0.919 on the independent testing cohort 2, which significantly outperformed three experienced radiologists (p = 0.03, 0.01, 0.02, and 0.01, and p = 0.02, 0.01, 0.45, and 0.02, respectively) and the MDT-decision result of three experienced experts (p = 0.03, 0.02, 0.03, and 0.02, and p = 0.03, 0.02, 0.44, and 0.03, respectively).


Introduction
Primary glioblastoma multiforme (GBM; WHO grade IV glioma) and intracranial metastases are commonly identified malignant brain tumors in adults [1]. Noninvasive discrimination between the two tumors helps in the selection of appropriate treatment options and clinical management strategies [2,3]. However, in cases of solitary brain metastasis (SBM), differentiating the GBM from SBM solely based on conventional characteristics on magnetic resonance imaging (MRI) remains challenging. GBM and SBM often share similar radiological manifestations, such as an intratumoral necrotic center and heterogeneous enhancement of the component surrounded by peritumoral edema regions [4,5].
To define a better diagnostic tool, previous studies have explored various advanced quantitative MR techniques, such as diffusion weighted imaging (DWI), diffusion tensor imaging (DTI), dynamic contrast-enhanced (DCE), dynamic susceptibility contrast (DSC) perfusion, arterial spin labeling (ASL) perfusion, and spectroscopy, aimed towards excavating more hidden imaging information that would help characterize the physiological and metabolic differences between GBM and SBM [6][7][8][9][10][11][12]. Whereas the data were inspiring, their attempts failed to provide adequate diagnostic confidence due to the inconsistencies in the MR-derived parameters. Besides, these alternatives were associated with extra expense and more scanning time required for implementation, which impeded the application of these advanced MR techniques in routine diagnostic practices.
The advent of radiomics in recent years has tremendously permeated clinical applications in terms of diagnosis, prognosis, as well as prediction of treatment response in tumors [13,14]. In particular, many studies have employed radiomics for the differentiation of GBM from SBM based on conventional MR sequences or/and more advanced MR technologies such as DWI, DTI, APT [15][16][17][18][19][20]. These investigations have shed light on mining more concealed image textures barely captured by the naked eyes in distinguishing GBM from SBM. Nevertheless, these techniques are limited by the fact that the tumor VOI under investigation is tentatively chosen from either the peritumoral edema region [12,15,21] or the enhanced tumor area [4,11]. In addition, there is no definitive data on the tumor VOI (necrotic centers, enhancing margins, or peritumoral edema) that better characterizes GBM and SBM. Besides, radiomics analysis was performed on a single MR sequence, such as contrast-enhanced T 1 WI [22,23], ADC [20], DTI [17], or multi-sequences [18,19,24]. The MR sequence selection is often random and heuristic, and superiority among the MR sequences remains unclear. In fact, the different MR sequences can be viewed as multimodality images that depict the tumor from different perspectives. Fusion of imaging information from multiple MR sequences gives better characterization of the tumor properties and would theoretically enhance the classification performance of a radiomics model. To date, no attempts have been made to fully exploit the potential of integrating radiomics features acquired from multiple MR sequences for the discrimination of GBM from SBM. In addition, the reported radiomics models are often built on a single classifier. A model's classification capability is linked with the classifier used. Indeed, different classifiers might yield inconsistent results even when applied on the same task [25]. An ensemble of classifiers, a process analogous to disease diagnosis by a multi-disciplinary team (MDT), usually offers more reliable and reproducible performance compared to a single classifier [15,26].
Here, we hypothesized that integrating information from multiple MR sequences and various classifiers could yield a more robust radiomics system. We proposed and used a novel hierarchical fusion radiomics model, referred to as RadioFusionOmics (RFO), to fuse multimodality radiomics features from multiple MR sequences, and combine different

Delineation of Volume of Interest (VOI)
All the images were stored in the Digital Imaging and Communications in Medicine (DICOM) file format. The three-dimensional VOI of the tumors were manually delineated, slice-by-slice, using ITK-SNAP software (http://www.itksnap.org; assessed on 15 January 2020) on the axial plane on the T 2 _FLAIR images. The axial T 1 WI, T 2 WI, and post-contrast axial T 1 WI were used to cross-check the extent of the tumor or peritumor edema and to fine-tune the contour. The exclusion criteria resulted in a patient cohort with comparable MR images presenting similar pathology-related components (i.e., intratumoral necrotic center, enhancing component, and peritumoral edema that simultaneously presented in GBMs or SBMs). Specifically, visible tumor margins were manually delineated to obtain the volume of non-enhanced tumor (nET), enhanced tumor (ET), and peritumoral edema (pTE) (Figure 2). The procedure was conducted by two investigators (J. Wu and R.M Yang, with 4 and 15 years of experience in radiological diagnosis, respectively). The conformity of the delineated VOIs were evaluated by the Dice similarity coefficient. Intersection of two VOIs was used as the final VOI based on whether the Dice index was greater than 0.9. Discrepancies on the lesion boundary (Dice < 0.9) were resolved by further discussions until there was mutual consensus. Besides, morphological union operation was carried to combine two/three of the three VOIs, which resulted in a total of seven VOIs (VOI nET , VOI ET , VOI pTE , VOI nET+ET , VOI nET+pTE , VOI ET+pTE , and VOI nET+ET+pTE ).

RadioFusionOmics
The proposed RadioFusionOmics (RFO) methodology is a novel two-level fusion scheme that integrates radiomics information from different MRI sequences ( Figure 3). It exploits the ensemble learning to fully explore the strengths of various classifiers.

Level 1: Feature Fusion
We proposed a novel feature fusion method that integrates features extracted from different MRI sequences (Figure 3b). We used a data matrix X (p×n) for a particular feature, where p (p = 2, 3 or 4) denotes the number of MRI sequences, and n is the training sample size. To incorporate the class structure (i.e., memberships of the training samples in class) into the feature fusion, a between-class scatter matrix S = ∑ c i=1 n i (x i − x)(x i − x) T was constructed by dividing matrix X into c separate groups (c is the number of classes, c = 2); where n i columns belong to the ith class (n = ∑ c i=1 n i ), and the x i and x represent the means of the x i,j vectors (j th sample in the i th class) in the ith class and the whole set. Unitizing the between-class matrix S generated a transformation matrix W, which was used to reduce the dimensionality of matrix X.
Feature fusion was accomplished by mapping W T X to compress matrix X to a row vector x (1×n) (see Appendix A for algorithm details). The fusion was repeated for all the m (m = 109) features and stacking the m row vectors x (1×n) yielded the fused feature matrix F (m×n) . Unlike the serial or parallel fusion, the proposed method not only fuses the features from different MR sequences but also eliminate the between-class correlations and restricts the within-class correlations during the feature fusion process [28]. More technical details can be found in Appendix A.
With features F 1 seq , F 2 seq , F 3 seq , and F 4 seq , respectively, extracted from the four MRI sequences-T 1 WI, CE_T 1 WI, T 2 WI, and T 2 _FLAIR-the feature level fusion was done by either combining any two ( ). In the training stage, feature level fusion would yield two outputs: the fused feature of the training cohort and the transformation matrix W.

Level 2: Model Fusion
The previously described fused features were used to train fifteen classification models built with different combinations of five feature selection methods and three classifiers (Table S1). A stratified five-fold cross-validation was performed on the training cohort to rank all the fifteen models and then the top 3 models were screened out.
In addition, the ensemble learning was applied to fuse the top 3 models via a multidisciplinary team (MDT)-like fusion method. Each of the models was regarded as a clinical specialist providing a classification prediction c i , i = {1, 2, 3}. Their decisions were integrated by the weighted fusion (WF) method, which assigns different weights to the i th models according to their accuracies acc i . The accuracies were averaged over the five-fold cross-validation, and reaches a consensus classification by a linear weighted sum ∑ 3 i w i c i (x). In the training stage, the model level fusion provided a final consensus classification model by fusion of the screened top three trained models coupled with their associated weights w i .

Independent Testing
All the patients were divided into a training cohort (n = 121), an independent testing cohort 1 (n = 62), and an independent testing cohort 2 (n = 61). The transformation matrix W acquired from the feature level fusion in the training stage was directly applied to fuse features of the MRI sequences in the testing cohort. The fused testing features were fed into the final classification model for independent testing.
2.6. Evaluation of the Model 2.6.1. Study 1: Comparison of Lesion VOI We compared the discriminative power of the features extracted from the seven VOIs (VOI nET , VOI ET , VOI pTE , VOI nET+ET , VOI nET+pTE , VOI ET+pTE , and VOI nET+ET+pTE ) on each MRI sequence. This was used to verify the lesion VOI that could provide better discrimination between the GBM and SBM.

Study 2: Comparisons of the Different Combinations of Mri Sequences Used in the Fusion
Using the best lesion VOI, different MRI sequences were fused via the proposed methodology; F 1;2 seq , F 1;3 seq , . . . , F 1;2;3;4 seq ). We then compared the various combinations of MRI sequences for fusion.

Study 3: Comparison with Radiologist Performance
The constructed RFO model was finally built with features from the best lesion VOI through the best combination of MRI sequences that resulted from Study 1 and 2. The performance was measured by the area under the receiver operating characteristic (ROC) curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE), and were compared with human diagnosis by three radiologists (W.L Zhang, J.L Wu, and R.M Yang, with 3, 5, and 15 years of experience in neuroradiology, respectively), without prior knowledge of the patients' pathology. For further illustration, the performance of our proposed RFO model was also compared with the clinical diagnosis result by a multi-disciplinary team (MDT), which was composed by three specialists (R.M Yang with 15 years of experience in radiology, X.H Ye with 15 years of experience in radiation oncology, and Y Wu with 19 years of experience in oncology). By counting the number of decisions made by the three specialists in each class, the class with the highest number of votes was regarded as the result of MDT.

Study 4: Top Features
The top-ranked features associated with the discrimination of GBM or SBM were also sieved by the proposed RFO model and their discriminative capabilities were analyzed.

Statistical Analysis
All statistical analyses were performed in SPSS 21.0 software (SPSS, Inc.). Continuous variables were reported as the means ± standard deviation (SD) while categorical variables were presented as numbers and proportions. Normality of the data distribution was evaluated by the Shapiro-Wilk test. The demographic characteristics in the GBMs and SBMs were compared using the chi-square test for categorical variables, Student's t test for normally distributed continuous variables, while the Mann-Whitney U test was used for non-normally distributed continuous variables. In addition, we used the paired samples Wilcoxon signed rank test for the comparisons. A p-value < 0.05 was considered to be statistically significant.

Patient Characteristics
This study included a total of 244 patients (131 GBMs and 113 SBMs). There were no significant sex differences between the GBM and SBM in the training and independent testing cohorts (Table 1). Our data showed that the SBM patients were older (58.18 ± 9.83 years for the SBM cohort vs. 52.22 ± 15.59 years for the GBM cohort, p = 0.012) and more likely to be in the infratentorial region (22/113 for the SBM cohort vs. 3/131 for the GBM cohort, p = 0.001).

Study 1: Comparison of the Lesion VOIs
For each MRI sequence, we extracted the radiomics features from seven VOIs. Each feature (n = 4 × 7 types in total) was independently applied in the fifteen models and evaluated by the five-fold cross-validation. The mean AUC values over for all the models and folds were reported ( Figure 4). In all the four MRI sequences, image features from the VOI ET were the most superior compared with those from VOI ET , VOI nET , and VOI pTE and their possible combinations. The best performance in terms of the VOI ET features was seen in the T 2 _FLAIR (AUC = 0.92), followed by CE_T 1 WI (0.90), T 2 WI (0.88), and then T 1 WI (0.87). Furthermore, the VOI combinations did not provide significant aid in enhancing the model's discriminative capability. Even though it was still inferior to VOI ET , the VOI nET+ET combination showed better performance on all the MRI sequence except for the CE_T 1 WI. This data might suggest that the enhanced tumor region in MRI provided the most essential diagnostic information in characterizing and differentiating GBM vs. SBM.

Study 2: Combination of the MRI Sequences for Fusion
Using the best lesion VOI (VOI ET ), the features of the VOI ET from the four MRI sequences (F 1 seq , F 2 seq , F 3 seq , and F 4 seq ) were fused by combining any two, three, or all the four sequences via the proposed methodology, and their classification performance was compared ( Figure 5). Our data showed that, without feature fusion, the best performance (mean AUC) was observed in T 2 _FLAIR (F 4 seq , AUC = 0.94), followed by CE_T 1  ) did not necessarily further enhance the accuracy of the fusion. Nevertheless, feature fusion generally outperformed the use of features of a single MRI sequence (mean AUC 0.91 vs. 0.93, p = 0.007). The detailed statistical differences between these feature types were also reported (see Table S2 in Supplementary Materials). Thus, F 1;4 seq emerged as the best feature fusion combination and was used for subsequent RFO modeling. represents the fused feature of sequence num 1 and sequence num 2 . Boxplots of the AUC distributions achieved by the 15 discrimination models for each of the 15 feature types. The boxes run from the 25th to 75th percentile; the two ends of the whiskers represent the 5% and 95% percentiles of the data; the horizontal line and the square in the box shows the median and mean values, respectively. The diamonds represent outliers. The letters above each box (i.e., 'a, b, c, d') are the symbols to indicate whether a statistically significant difference exists between any two feature types. No common letters indicate that the two feature types are significantly different.

Study 3: The RFO Model vs. Radiologist Performance
Features extracted from the T 1 WI and T 2 _FLAIR sequences on VOI ET were fused to yield the F 1;4 seq feature level fusion in the RFO model. The fusion was used to train 15 classification models. The top 3 models were singled out and then integrated in the model level fusion to define a consensus model, which was validated using the independent testing cohorts. For comparison purposes, the three top models were individually tested against the two testing cohorts, and their mean AUCs were 91.6% and 86.4%, respectively ( Table 2). The data showed that the proposed RFO model improved the classification accuracy to an AUC 92.5% (p = 0.04), ACC 85.5% (p = 0.10), and SPE 85.3% (p = 0.02) on the independent testing cohort 1, while the ACC and SPE on the independent testing cohort 2 were also increased to 83.6% (p = 0.04) and 91.9% (p = 0.03). For almost all outcomes (except for the SPE of the testing cohort 2), the proposed RFO model significantly preceded all the data from the three board-certified neuroradiologists, who obtained the best performance with AUC 75.4%, ACC 75.8%, SEN 75.8%, and SPE 75.9% on testing cohort 1 and AUC 78.2%, ACC 77.0%, SEN 83.3%, and SPE 73.0% on testing cohort 2 (Table 2). Furthermore, it was interesting to compare our results with the MTD discrimination to mimic daily clinical practice. Our results that the RFO performance was on average about 15% higher than MDT-decision voted by three specialists (one radiologist, one radiation oncologist, and one medical oncologist) further demonstrated the superiority of RFO ( Table 2). The statistical differences between the data obtained by the proposed RFO model and the expert diagnosis are shown in Table 3.  In addition, we evaluated the classification performance following the addition of two statistically significant clinical characteristics ("lesion location" or/and "age") to the radiomics features in the RFO model ( Table 2). The data demonstrated that adding a patient's clinical characteristics showed different discrimination potentials on two independent test cohorts, as revealed by the statistical comparisons shown in Figure 6. Taking the performance on testing cohort 1 as an example, addition of patient's age into RFO would decrease the AUC to 92.2% (p = 0.01), while addition of the lesion location would significantly improve the classification accuracy to AUC 92.9% (p = 0.01), ACC 87.1% (p = 0.01), and SEN 89.3% (p = 0.00). Interestingly, addition of both clinical characteristics did not further enhance the performance (AUC 92.7%, p = 0.02). In contrast, adding age improved the AUC on testing cohort 2 to 86.6% (p = 0.00), while there was no performance improvement after location was added (AUC 85.8%, p = 0.05). It was worthy to stress that adding both clinical characteristics has a significant performance improvement of AUC 86.5% (p = 0.03), ACC 85.2% (p = 0.03), and SEN 75.0% (p = 0.00). Figure 6. The statistical comparisons of AUC of the RFO model with or without clinical features on two independent testing cohort 1 (left) and 2 (right); Asterisks (** for p ≤ 0.01 and * for 0.01 ≤ p ≤ 0.05) are marked between any two groups with significant differences.

Study 4: Highly Correlated Radiomics Markers
The high correlation features (extracted on VOI ET ) associated with the discrimination of GBM and SBM discrimination also can be selected by the fifteen classification models within the RFO framework. Since each of the 15 models was equipped with a feature selection procedure, we counted and ranked the occurrence of each selected feature (only in models with AUC > 0.8). The ten most frequently selected features included four firstorder-based features and six shape-based features (Table 4). Except for the 'Median' in T 1 WI, all the features showed statistically significant differences between GBM and SBM. To differentiate between GBM and SBM, the mean feature values of the two groups (i.e., "M" in Table 4) were calculated and used as the threshold. The discrimination capabilities of the top 10 features in T 2 _FLAIR were superior to their counterparts in T 1 WI. Three first-order features ('90th percentile', 'Median', and 'Maximum') in T 2 _FLAIR, as well as one shape feature ('Maximum 2D DiameterColumn') in both T 1 WI and T 2 _FLAIR, demonstrated satisfactory discriminative capabilities. The data showed that~70% of the GBM group had larger feature values, while~70% of the SBM group had smaller values.
The fused feature of T 1 WI and T 2 _FLAIR (i.e., F 1;4 seq ) via the proposed RFO model (generated in the feature level fusion) was shown to further enhance the discriminative power in differentiating GBM vs. SBM, especially when compared with the corresponding features from either T 1 WI or T 2 _FLAIR (Figure 7).

Discussion
Accurate distinction between GBM and SBM is critical for accurate radiological assessment, which paves the way for precision medicine and personalized treatment options [29]. Our strict exclusion criteria (excluding the "pure solid or cystic") left us a "clean" patient cohort of GBMs and SBMs both presenting an intratumoral necrotic center and heterogeneous enhancing component surrounded by peritumoral edema regions on MR. Such cases exhibit the most common imaging manifestations encountered in our daily practice, posing the biggest challenge for radiologists. The proposed RFO model fusing image information from multiple MR sequences was crafted to deal with these tricky and tangled cases that may confound the radiologists in clinics. The RFO model demonstrated satisfactory and superior discrimination capability compared with the experienced radiologists and MDT decision. In addition, we explored the discriminative capabilities of the radiomics features from different tumor volumetric components on different MRI sequences.
Radiomics convert digital medical images into mineable high-dimensional data and have been widely recognized as a practical alternative for noninvasive diagnosis/prognosis in oncology [13,30]. Previous studies demonstrated the feasibility of applying radiomics models to differentiate GBM from SBM based on either conventional MR images or advanced MR technologies [4,17,20,22,31]. Notably, one crucial and defining step for radiomics modeling is the delineation of the lesion VOI. The lesion VOI defines the image context associated with tumor pathological heterogeneity and exerts a direct impact on the extracted radiomics features. Tumor heterogeneity from different VOI components has been associated with poor survival rates in GBM [16,32]. Besides, a radiomics model's predictive performance also depends on the image features as previously demonstrated [15,21,33]. The previous studies did not, however, resolve the dilemma of whether to derive the lesion VOI from the peritumoral edema region [12,15,21], or from the enhanced tumor area as a whole [19,34]. Evaluation of the features such as necrotic centers, enhancing margins, or peritumoral edema would lead to a more precise and reproducible discriminative model. Indeed, previous reports noted the positive role played by the peritumoral region based on pathological distinctions, such as vasogenic edema, in SBM against a combination with scattered neoplastic cells in GBM [8]. Besides, the infiltrative characteristics of tumor cells in GBMs were also found to exhibit different diffusion, spectroscopic, and perfusion features compared to the peritumoral edema in SBM [8,11]. In contrast, our results showed that features from the VOI ET (enhancing tumor) yielded the most discriminative capability. This observation aligns with a previous study that employed a different imaging (APT) technology [16]. We speculate that the microscopic tumor cell infiltration characteristics are mostly exhibited in the enhancing tumor component, and macroscopically present as intra-tumoral heterogeneity reflected in image texture differences between GBM and SBM.
On the other hand, selection of an appropriate MR sequence presents another challenge in MR-based radiomics modeling. Previous analyses were mostly performed on a single MR sequence, such as CE_T 1 WI and T 2 WI [19,22,23,34]. However, such preferences were presumably random because no selection criteria were provided. Choosing a proper MR sequence for radiomics modeling strengthens comprehensive comparative studies. For example, Tateishi et al. compared the CE_T 1 WI, T 2 WI, and ADC sequences and showed that the intratumor textures from T 2 WI produced the highest AUC (0.78) in differentiating GBM from SBM [19]. However, the study excluded the T 2 _FLAIR sequences in the analysis. In our study, we extensively compared the four conventional MR sequences (T 1 WI, T 2 WI, T 2 _FLAIR, and CE_T 1 WI) and their possible combinations and demonstrate the positive role of the T 2 _FLAIR sequences in GBM and SBM discrimination.
In addition, we hypothesized that managing the MR sequences in a collaborative way would offer more multifaceted imaging information that is usually absent in a single MR sequence. The novel feature fusion method proposed in the RFO framework performed effective feature fusion through incorporating class structure information, which unitized the constructed between-class matrix in the fusion process. Thus, features from different MR sequences were not only integrated, but the fused features were more representative and discriminative between the GBM and SBM. With this novel feature fusion, we showed that the fused features associated with the T 2 _FLAIR sequences (T 1 WI + T 2 _FLAIR) demonstrated superior discriminative capabilities as compared with other MR sequences. The best performance in terms of the VOI ET features was in T 2 _FLAIR (AUC = 0.92) followed by CE_T 1 WI (0.90) and T 1 WI (0.87). However, feature fusion of 'T 2 _FLAIR+ CE_T 1 WI' did not outperform feature fusion of 'T 2 _FLAIR+ T 1 WI', as we expected. This phenomenon was associated with the fact that the image features from the T 2 _FLAIR and CE_T 1 WI sequences are universally similar and somewhat overlapping, as opposed to T 2 _FLAIR and T 1 WI where the image features might be complementary.
Furthermore, the novel MDT-like model fusion in the proposed RFO framework was able to mimic the multi-disciplinary decision-making process. The fusion treated the top 3 models as three independent experts, whose opinions were weighted by their experiences (performances in five-fold cross-validations of the training/validation dataset) and in combination to yield a final consensus prediction. This weighted fusion approach outperformed the majority voting method that simply counts the number of decisions from each specialist and then predicts based on the highest number of votes [15] (see Table S3 in Supplementary Materials).
Our data showed that four first-order-based features and six shape-based features of the VOI ET -specific T 1 WI and T 2 _FALIR were the top 10 features associated with the discrimination of the GBM vs. SBM. In particular, '90th percentile', 'Median', and 'Maximum' in T 2 _FLAIR, and 'Maximum 2D DiameterColumn' in both T 1 WI and T 2 _FLAIR exhibited satisfactory diagnostic capabilities [35]. A higher percentile might imply high tumor heterogeneity. For example, Tozer et al. underscored the role of the 90th percentile in the identification of glioma subtypes [36]. Our results showed that the '90th percentile' feature (higher in GBM than in SBM) on the T 2 _FLAIR sequence demonstrated the highest discriminative capabilities. This could be ascribed to the fact that GBM is characterized by poorly differentiated neoplastic astrocytes and extensive microvascular proliferation and/or necrosis [37]. Besides, the T 2 _FLAIR sequence highlighted the changes in signal intensity within the tumor by suppressing the cerebrospinal fluid signal, thus resulting in a higher '90th percentile' value in GBM. Our results also showed that the 'Median' value of GBM was higher than that of SBM. However, whereas its role in distinguishing GBM and SBM was not defined, a previous study had demonstrated its merit in ccRCC and pRCC differentiation [20].
The 'Maximum 2D diameter' feature, the largest Euclidean distance between the vertices of the tumor surface mesh, is used to describe the tumor size. Our data showed that the mean diameter of GBM was larger than that of the SBM (with 'M' value of 6 cm: Table 4). These data were consistent with the radiological and clinical characteristics of the two tumors. The GBM pathology is characterized by angiogenesis and highly infiltrative growth, with enhanced tumor volume. On the contrary, SBM usually exhibits limited expansile growth and tends to compress the surrounding brain tissue with a spherical shape and smaller volume [38]. In addition, SBM usually occurs in the subcortex of the watershed area, which is adjacent to the brain draining veins. The draining veins are easily compressed by the tumor mass effect. Collectively, the factors give rise to the obstruction of venous reflux, leading to obvious peritumoral edema, and triggering of a series of clinical symptoms. Therefore, small size SBM is easier to detect in daily diagnostic practice.
In agreement with the previous observation, the demographics of the enrolled patients showed that SBMs were preferentially located in the infratentorial region compared to GBMs (p = 0.001) [31]. In addition, patients with SBM were older than the those with GBM (p = 0.001), which is also consistent with previous data [20,23]. Thus, addition of the lesion location to the RFO model can further enhance its performance.
Nevertheless, our study is limited by the retrospective nature of the design and the fact that the sample size was not large enough due to the strict inclusion criteria. It is worthy to stress that the current enrolled cohort might not necessarily reflect the real distribution of the imaging manifestations in GBMs and SBMs. We found that most of the excluded GBMs (n = 11) or SBMs (n = 19) manifested a pure solid appearance, and only n = 6 GBM cases (but no SBMs cases) presented a pure cystic appearance or without enhancements. Although the RFO model was employed to differentiate the solid GBMs from the solid SBMs, and also exhibited satisfactory performance (see Table S4 in Supplementary Materials), unbalanced case distributions, either universal across different institutions or is specific in our center, might still probably introduce prediction bias and serve as a limitation of a radiomics-based model. Therefore, there is need for further studies with a larger patient population to validate the proposed model's capability. Besides, our analysis included SBM of different histopathological origins. It is still unknown whether the different origins exhibit different image textures and their possible influences on the model. In addition, an unclear correlation between the radiomics features with pathological manifestations in the enhanced tumor component was also considered a limitation.

Conclusions
In summary, we successfully built a high-performing RadioFusionOmics model with combined multiple classifiers integrated with multi-sequences feature fusion. Our findings demonstrated that the model outperformed experienced radiologists and might be a promising tool for computer-aided diagnosis of GBM and SBM.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/cancers13225793/s1, Table S1: Feature selection methods and classifiers, Table S2: Statistical pairwise comparison of 15 feature types (in terms of AUC), Table S3: Performance comparison of majority voting and weighted fusion, Table S4: Performance comparison using the excluded GBMs (n = 11) or SBMS (n = 19) as an independent testing cohort. the author contribution or funding sections. This may include administrative and technical support, or donations in kind (e.g., materials used for experiments).

Conflicts of Interest:
The authors declare no conflict of interest.

Feature Level Fusion:
Specifically, let us define a data matrix X (p×n) for a particular feature, where p (p = 2, 3 or 4) denotes the number of MRI sequences, and n is the training sample size. To incorporate the class structure (i.e., memberships of the training samples in class) in feature fusion, the n columns of matrix X were divided into c separate groups (c is the number of classes, c = 2), where n i columns belong to the ith class (n = ∑ c i=1 n i ). Let x i,j ∈ X denote the vector corresponding to the jth sample in the ith class, x i and x denote the means of the x i,j vectors in the ith class and the whole set. We define a between-class scatter matrix If the classes were well-separated, S T (c×c) (i.e., the covariance matrix φ T φ) would be a diagonal matrix. Since S T is symmetric positive semidefinite, it can be diagonalized as P T S T P = Λ , where P is the matrix of orthogonal eigenvectors and Λ is the diagonal matrix of real non-negative eigenvalues sorted in decreasing order. Let Q (c×r) consist of the first r eigenvectors from P, then (φQ) T S(φQ) = Λ (r×r) can be formulated with mapping Q → φQ . Thus, the matrix W (p×r) = φQΛ −1/2 unitizes S by W T SW = I and reduces the dimensionality of matrix X from p to r.
Feature fusion can be accomplished by setting r = 1 to compress matrix X (p×n) to a row vector x (1×n) with transformation x (1×n) = W T (1×p) X (p×n) .