Machine Learning Model Based on Optimized Radiomics Feature from 18F-FDG-PET/CT and Clinical Characteristics Predicts Prognosis of Multiple Myeloma: A Preliminary Study

Objectives: To evaluate the prognostic value of radiomics features extracted from 18F-FDG-PET/CT images, integrated with clinical characteristics and conventional PET/CT metrics, in newly diagnosed multiple myeloma (NDMM) patients. Methods: We retrospectively reviewed baseline clinical information and 18F-FDG-PET/CT imaging data of MM patients who underwent 18F-FDG-PET/CT. Multivariate Cox regression models involving different combinations were constructed, and stepwise regression was performed: (1) radiomics features of PET/CT alone (Rad Model); (2) clinical data (including clinical/laboratory parameters and conventional PET/CT metrics) only (Cli Model); (3) a combination of radiomics features and clinical data (Cli-Rad Model). Model performance was evaluated by C-index and Net Reclassification Index (NRI). Results: Ninety-eight patients with NDMM who underwent 18F-FDG-PET/CT between 2014 and 2019 were included in this study. Combining radiomics features from PET/CT with clinical data showed higher prognostic performance than models with radiomics features or clinical data alone (C-index 0.790 vs. 0.675 vs. 0.736 in training cohort; 0.698 vs. 0.651 vs. 0.563 in validation cohort; AUC 0.761, sensitivity 56.7%, specificity 85.7%, p < 0.05 in training cohort and AUC 0.650, sensitivity 80.0%, specificity 78.6%, p < 0.05 in validation cohort). When clinical data were combined with radiomics, an increase in the performance of the model was observed (NRI > 0). Conclusions: Radiomics features extracted from the PET and CT components of baseline 18F-FDG-PET/CT images may become an effective complement to provide prognostic information; therefore, radiomics features combined with clinical characteristics may provide clinical value for MM prognosis prediction.


Introduction
Multiple myeloma (MM), the second most frequent hematologic tumor, is an incurable malignancy of the plasma cells. Over the past decade, the prognosis of MM has notably improved owing to the emergence of new therapeutic options. However, the improvement has not been uniform, and 15% to 20% of all patients have a predicted overall survival (OS) of less than 3 years [1]. Early identification of patients with high-risk features is needed to develop individualized and risk-adapted treatment strategies in newly diagnosed MM. Currently, several prognostic models are used to stratify myeloma patients into subgroups with distinct risk profiles [2][3][4]. However, the performance of these models for identifying high-risk MM is not satisfactory. 18F-FDG PET/CT (18F-fluoro-deoxy-glucose positron emission tomography/computed tomography) is a useful diagnostic imaging procedure providing both tomographic and functional information in patients with MM. It may be regarded as a useful tool in the workup at diagnosis and the follow-up of MM, especially for the detection of paramedullary and extramedullary disease or solid organ involvement. Various studies have demonstrated that the image-based standardized uptake value (SUV), extramedullary disease (EMD), and the number of focal bone lesions (FLs) serve as prognostic factors [5][6][7][8]. 18F-FDG PET/CT was recommended by the International Myeloma Working Group (IMWG) as the current "gold standard" method for evaluating and monitoring response to anti-myeloma therapy [9]. As MM is highly heterogeneous, quantitative description of inter-tumoral and intra-tumoral heterogeneity might have significant potential for improving prognostication in MM. Consequently, it is necessary to develop more effective and feasible methods to assist in image analysis and to mine more valuable prognostic information. Radiomics is an emerging area that shows promising prospects in the domain of radiological evaluation.
Radiomics is a sophisticated image analysis technique that captures high-throughput characteristics of tissues and lesions, providing complementary information about tumor heterogeneity across the entire tumor volume to improve prognosis prediction, and may therefore prove useful for patient stratification [10]. An increasing number of studies are being published owing to the encouraging results of radiomics-based machine-learning models. Most of these studies showed the value of radiomics features extracted from PET for solid tumors, such as lung cancer, head and neck cancer, and gastric cancer [11][12][13]. A recent study showed that a radiomics feature model based on magnetic resonance imaging may predict high-risk cytogenetic status in multiple myeloma [14]. Some studies demonstrated that radiomic analysis of standard CT or 18F-FDG-PET/CT images of patients with MM strongly improves accuracy in differentiating focal from diffuse patterns at diagnosis [15]. Radiomics has also shown value in disease follow-up, treatment selection, and prognosis prediction. Bone marrow radiomics features extracted from 18F-FDG PET/CT may provide some information on minimal residual disease (MRD) [16]. In a study with a small sample size, radiomics models based on MRI could also predict the response to bortezomib-based therapy in MM patients [17]. MRI-based textural features proved to correlate well with the clinical and hematological response (CR, VGPR, and PR) in MM patients undergoing systemic treatment [18]. In this sense, a radiomics approach may extract and mine more medical imaging features as reliable prognostic biomarkers of MM. We hypothesized that a model incorporating radiomic features extracted from baseline PET/CT would improve outcome prediction in MM.
Although radiomics and machine learning have been widely used in disease diagnosis, the combined application of radiomics and multiple machine learning algorithms to predicting the prognosis of MM has rarely been reported. The aim of this study was to evaluate the prognostic value of a machine learning model based on optimized radiomics features from 18F-FDG-PET/CT and clinical characteristics in NDMM patients.

Study Design and Patients
We retrospectively reviewed the medical records of 98 NDMM patients who underwent 18F-FDG-PET/CT between 2014 and 2019 in Renji Hospital. Inclusion criteria were active MM, age of ≥18 years at the time of diagnosis, and availability of a pre-chemotherapy PET/CT scan and complete clinical data. Patients with a history of other tumors were excluded. This retrospective study was approved by the institutional ethics committee of our hospital, and the informed consent requirement was waived.

Data Collection
Baseline features of patients were used to characterize the disease at the beginning of the study period. We gathered the initial PET/CT results and the biomarkers measured before treatment in order to analyze the correlation between these characteristics and the prognosis of myeloma. The data set was divided by date into a training set and a test set at a ratio of 70:30.
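A date-based 70:30 split of this kind can be sketched as follows. This is a minimal illustration only: the record layout, the field name `scan_date`, and the rounding rule are assumptions, not details reported by the study.

```python
from datetime import date

def temporal_split(records, train_fraction=0.7):
    """Sort records by scan date; the most recent ones form the test set."""
    ordered = sorted(records, key=lambda r: r["scan_date"])
    n_train = round(len(ordered) * train_fraction)
    return ordered[:n_train], ordered[n_train:]

# Hypothetical patient records (IDs and dates are made up for illustration).
patients = [
    {"id": i, "scan_date": date(2014 + i % 6, 1 + i % 12, 1)}
    for i in range(10)
]
train, test = temporal_split(patients)
print(len(train), len(test))
```

Splitting by date rather than at random means the held-out patients are all scanned later than the training patients, which mimics applying the model to future cases.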

PET/CT Image Acquisition
According to the guidelines of the European Association of Nuclear Medicine (EANM), all patients underwent whole-body 18F-FDG positron emission tomography on a Siemens Biograph-64 mCT scanner. All patients fasted for at least 6 h before acquisition, and blood glucose levels were controlled below 150 mg/dL. FDG-PET/CT was performed 60 min (60 ± 3 min) after injection of 3.7-5.55 MBq 18F-FDG per kg of body weight. PET images were reconstructed with a 3-dimensional (3D) ordered-subset expectation maximization (OSEM) algorithm: 3 iterations, 24 subsets; 2.75 mm × 3.12 mm × 3.12 mm voxel size. The field of view (FOV) was 700 mm. Before PET scanning, CT was performed for attenuation correction, yielding images with a matrix size of 512 × 512 (80 mA, 120 kV). PET and CT results were reviewed on the workstation, where the fused images were displayed frame by frame. Then, the PET image (voxel size 3.12 mm, slice thickness 2.75 mm) was interpolated to the same resolution as the CT image (voxel size 0.98 mm, slice thickness 2 mm) (Supplementary Materials Table S1).

Image Preprocessing
18F-FDG-PET/CT images were read and interpreted by two independent board-certified nuclear medicine physicians with more than 10 years of experience. Osteolytic lesions were identified given a standard PET spatial resolution limit of about 5 mm. The maximum standardized uptake value (SUVmax) of the lesions, obtained from the region of interest (ROI), is the standard semi-quantitative index considered for image interpretation. If there was no focal FDG metabolism on visual analysis, an ROI with a diameter of 10 mm was drawn at the first sacral vertebra to obtain SUVmax. Focal lesions (FLs) at diagnosis were defined as focally increased FDG uptake greater than the physiologic bone marrow or liver uptake on at least two consecutive slices, with or without any underlying lytic lesion. The number of FLs was dichotomized with the threshold set at 3.
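For readers unfamiliar with the SUV, the commonly used body-weight-normalized definition divides the tissue activity concentration by the injected activity per unit body weight. The sketch below assumes a tissue density of 1 g/mL and that the injected dose is already decay-corrected to scan time; the function name and the numbers are illustrative, not values from this study.

```python
def suv_body_weight(tissue_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """Body-weight SUV = tissue concentration / (injected dose / body weight).

    Assumes 1 mL of tissue weighs 1 g and a decay-corrected injected dose.
    """
    injected_kbq = injected_dose_mbq * 1000.0   # MBq -> kBq
    weight_g = body_weight_kg * 1000.0          # kg -> g
    return tissue_kbq_per_ml / (injected_kbq / weight_g)

# Hypothetical 70 kg patient injected with 4 MBq/kg (280 MBq in total).
suv = suv_body_weight(16.8, injected_dose_mbq=280.0, body_weight_kg=70.0)
print(round(suv, 2))
```

SUVmax is then simply the largest such value among the voxels of the ROI.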

Radiomics Features Extraction and Selection
Skeleton volume of interest (VOI) segmentation was mainly based on Slicer Radiomics (V2.10, https://github.com/Radiomics/SlicerRadiomics, accessed on 12 March 2022), a 3D Slicer extension that enables processing and extraction of radiomics features. To ensure the repeatability of PET/CT image features, we used a fixed bin width to build the gray-level histogram and discretize the image gray levels. Finally, a total of 1702 radiomics features were extracted from the original PET and CT images and their wavelet-filtered versions, including 18 first-order features, 13 shape features, 23 gray-level co-occurrence matrix (GLCM) features, 16 gray-level scale matrix (GLSM) features, 16 gray-level size zone matrix (GLSZM) features, 5 neighborhood gray-tone difference matrix (NGTDM) features, and 14 gray-level dependence matrix (GLDM) features. The workflow is shown in Figure 1. All radiomics features were extracted from VOIs of PET and CT images.
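Fixed-bin-width discretization can be illustrated with a short sketch. The bin width of 0.5 and the intensity values are assumptions chosen for the example (the bin width used in the study is not stated here), and the formula follows the common convention of aligning bin edges to multiples of the bin width.

```python
import math

def discretize_fixed_bin_width(values, bin_width):
    """Map intensities to integer gray levels using bins of constant width.

    Bin edges sit at multiples of bin_width, so a given intensity always
    falls in the same bin regardless of the intensity range of the VOI.
    Levels are shifted so that the lowest occupied bin is level 1.
    """
    lowest_bin = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - lowest_bin + 1 for v in values]

# Hypothetical SUV values inside a VOI.
suv_values = [0.9, 1.4, 2.6, 2.9, 7.3]
levels = discretize_fixed_bin_width(suv_values, bin_width=0.5)
print(levels)  # [1, 2, 5, 5, 14]
```

In contrast to a fixed number of bins, a fixed width keeps the meaning of each gray level comparable between patients, which is why it is often preferred for PET data expressed in SUV units.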

Predictive Model Establishment and Statistical Analysis
The data set was divided by date into a training set and a test set at a ratio of 70:30, with the latest 30% used as the test set. Optimal features were screened by univariate Cox regression together with the least absolute shrinkage and selection operator (Lasso) algorithm and 10-fold cross-validation [19] (Figures 1 and 2). Different combinations were then constructed, and stepwise regression was performed: (1) radiomics features of PET/CT alone (Rad-Mod); (2) clinical data (including clinical/laboratory parameters and conventional PET/CT metrics) only (Cli-Mod); (3) a combination of radiomics features and clinical data (Cli-Rad-Mod). Receiver operating characteristic (ROC) curves were used to test the predictive performance of each model. The discriminative ability of each model was assessed by the concordance index (C-index). To evaluate the improvement in prediction performance gained by adding radiomics features to the baseline model, we calculated the net reclassification index (NRI) in the training cohort and validation cohort at the first and third year.

Baseline Clinical Characteristics
The baseline clinical characteristics of the 98 patients are summarized in Table 1. The median age of all patients was 65 years (range, 41-86 years). The most prevalent type of MM was the IgG type (49.0%), and the proportion of patients with light chain disease was 17.3%. The consensus of the International Myeloma Working Group defines high-risk [...]
SPSS Statistics (version 26.0; IBM Corp., Armonk, NY, USA) and R software (version 3.6.3, http://www.r-project.org, accessed on 10 February 2022) were used for statistical analysis and model construction. The Mann-Whitney U test and Chi-square test were used for comparisons between groups for continuous variables and categorical variables, respectively. Progression-free survival (PFS) was calculated from the beginning of treatment until disease progression or death from any cause. PFS was evaluated using Kaplan-Meier estimates. We used the Cox regression model to confirm the independent predictors of survival by univariate and multivariate analyses (see Figure 2). The relative risk of an event and the 95% confidence interval (CI) were estimated using a Cox proportional hazards model. p < 0.05 was considered statistically significant. The significance of the difference between two C-indices was tested using the Hmisc R package.
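The Kaplan-Meier estimate mentioned above multiplies, at each observed event time, the fraction of at-risk patients who remain event-free. The following is a minimal pure-Python sketch with made-up follow-up data; it is not the SPSS or R routine used in the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time of each patient (e.g. months)
    events : 1 if progression or death was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= n_removed
    return curve

# Hypothetical PFS data: four observed events and two censored patients.
times = [6, 7, 10, 15, 19, 25]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients do not lower the curve, but they shrink the risk set, so later events produce larger drops.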

Feature Selection and Model Performance
In this study, a total of 1702 radiomics features were extracted, including morphological features, intensity features, texture features, and high-order features based on wavelet filters. The optimal radiomics features screened by the Lasso algorithm, namely LHL_Idmn_glcm, LHL_LDLGLE_gldm, and LHL_LALGLE_glszm (Table 3), were retained as prognostic factors for models involving radiomics features. For models involving clinical parameters, elevated LDH (HR 1.004, 95% CI 1.000-1.008, p = 0.034) and SUVmax > 4.2 (HR 1.114, 95% CI 1.043-1.189, p = 0.001) were consistently found to be significant predictors. After weighting the selected features by their regression coefficients, a score was calculated for each patient. The optimal cut-off value of each model was taken as the point with the highest Youden index on a time-dependent ROC curve. Patients were divided into a high-risk group and a low-risk group according to the cut-off value. The nomogram was constructed based on the above independent prognostic factors (Figure 3).
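The scoring and cut-off steps can be sketched as below. This is a simplified illustration: the coefficients are hypothetical (chosen close to the natural logarithms of the hazard ratios quoted above, and treating both variables as continuous is an assumption), the patient data are made up, and the cut-off search uses plain event status at a fixed time point rather than the time-dependent ROC analysis used in the study.

```python
def risk_score(features, coefficients):
    """Linear predictor: sum of regression coefficient * feature value."""
    return sum(coefficients[name] * features[name] for name in coefficients)

def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A patient is called high-risk when score >= cutoff.
    labels: 1 if the event occurred by the chosen time point, else 0.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Hypothetical coefficients and one hypothetical patient.
coefficients = {"ldh": 0.004, "suvmax": 0.108}
score = risk_score({"ldh": 250.0, "suvmax": 5.0}, coefficients)

# Hypothetical scores and event labels for seven patients.
scores = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.3]
labels = [0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(scores, labels)
print(round(score, 2), cutoff, j)
```

Patients with a score at or above the returned cut-off would be assigned to the high-risk group and the rest to the low-risk group.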
In this study, the model performance was evaluated by the concordance index (C-index). A C-index of 0.5 indicates no discriminative ability and a value of 1 indicates perfect discrimination; the higher the C-index, the more accurate the prediction. The C-index for each model is listed in Table 4.
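As an illustration of what the C-index measures, the sketch below implements Harrell's concordance index in plain Python on made-up data. It is not the Hmisc routine used in the study, and it ignores tied follow-up times for simplicity.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by risk score.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event. Ties in risk score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for k in range(n):
            if times[i] < times[k] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[k]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[k]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical data: follow-up in months, event indicator, model risk score.
times = [5, 8, 12, 20, 30]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.6, 0.5, 0.1]
c_index = harrell_c_index(times, events, risk)
print(c_index)  # 0.75
```

Here 6 of the 8 comparable pairs are ordered correctly, that is, the patient who progressed earlier also received the higher risk score.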

NRI > 0 indicates that the predictive ability of the new model is improved compared with the old model (positive improvement); NRI < 0 indicates that the predictive ability of the new model is decreased (negative improvement); NRI = 0 indicates that the new model shows no improvement. Table 6 and Figure 4 summarize the AUC (area under the ROC curve) of each combination. Compared with the AUC of the clinical model, a significant improvement was seen with the combination of clinical data and radiomics features (p < 0.05). The Cli-Rad model yielded the best performance (AUC 0.761, sensitivity 56.7%, specificity 85.7%, p < 0.05 in training cohort and AUC 0.650, sensitivity 80.0%, specificity 78.6%, p < 0.05 in validation cohort).
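The sign convention of the NRI can be made concrete with a two-category sketch. The data are made up, and censoring before the time horizon is ignored, which is a simplification relative to the survival-based NRI evaluated at the first and third year in this study.

```python
def net_reclassification_index(old_high, new_high, events):
    """Two-category NRI for moving from an old model to a new model.

    old_high, new_high : 1 if the model classifies the patient as high-risk
    events             : 1 if the patient had the event by the time horizon
    NRI = (up - down among events) / n_events
        + (down - up among non-events) / n_nonevents
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    for old, new, event in zip(old_high, new_high, events):
        if new > old:            # reclassified upward (low -> high risk)
            if event:
                up_event += 1
            else:
                up_nonevent += 1
        elif new < old:          # reclassified downward (high -> low risk)
            if event:
                down_event += 1
            else:
                down_nonevent += 1
    n_events = sum(events)
    n_nonevents = len(events) - n_events
    return ((up_event - down_event) / n_events
            + (down_nonevent - up_nonevent) / n_nonevents)

# Hypothetical classifications for eight patients.
events   = [1, 1, 1, 1, 0, 0, 0, 0]
old_high = [1, 0, 0, 1, 1, 0, 1, 0]   # e.g. clinical model
new_high = [1, 1, 0, 1, 0, 0, 1, 0]   # e.g. clinical + radiomics model
nri = net_reclassification_index(old_high, new_high, events)
print(nri)  # 0.5
```

In this example one patient with an event is correctly moved up and one patient without an event is correctly moved down, so both components contribute 0.25 and the NRI is positive.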

Discussion
MM is a condition with a heterogeneous presentation and prognosis, with survival ranging from months to decades. A priori identification of patients with high-risk profiles is important for prognostication and personalized treatment strategies [21,22]. An increasing number of clinical prognostic markers for MM reflecting various aspects of the patients' clinical status and disease behavior have been described in the literature [23]; however, risk stratification is still a challenge because of spatial intra-tumoral heterogeneity. The imaging phenotype, which potentially contains extensive information on tumor characteristics and susceptibility to treatments, can be partly acquired through medical image analysis, especially using PET-based images [24]. FDG-PET/CT enables the detection of sites of metabolically active plasma cells (PCs) and the assessment of changes in tumor cell metabolism after induction treatment. This study evaluated the potential prognostic performance of radiomics features extracted from FDG-PET/CT in MM integrated with clinical data. We have identified a model for predicting progression in newly diagnosed MM. Among the 13 clinical features initially considered in this study, LDH and SUVmax were selected in the final model. Patients with elevated LDH and SUVmax > 4.2 had significantly worse PFS. In this study, the optimal radiomics features screened from PET were LHL_Idmn_glcm, LHL_LDLGLE_gldm, and LHL_LALGLE_glszm. Idmn is a measure of the local homogeneity of the image. LALGLE (large area low gray level emphasis) reflects the proportion of larger areas with lower gray values in the image. Our model incorporated six of the most highly predictive PET/CT radiomics and clinical parameters. The model combining clinical data with radiomics features showed higher C-index than the models with clinical data alone (training cohort: C-index 0.
Radiomics, as a data-driven analysis of radiologic images, might enable efficient mining of image features that provide valuable clinical information. Yet, few studies have addressed the value of radiomics features in MM. Radiomics features may quantify structural characteristics of bone marrow changes in MRI images and may be implemented as a complementary prognosis evaluation tool [25]. Some studies have shown that MRI-based or PET/CT-based radiomics features may provide valuable information for image-based assessment of MRD and prediction of the therapy response [16][17][18]26]. Jamet et al. [27] evaluated the potential prognostic value of textural features extracted from FDG-PET/CT in MM in addition to conventional PET-derived metabolic features and the usual clinical/biological/genetic parameters. Though FDG-PET/CT has been considered a valuable tool in the work-up of patients with newly diagnosed MM, differentiation between focal and diffuse patterns on PET/CT is difficult. Therefore, some studies attempted to apply radiomic approaches to improve standard radiological evaluation with implications for prognosis. Tagliafico et al. [28] found that 15% of radiomics features (16/104) differed between diffuse and focal patterns. Mesguich et al. [21] found that a radiomic signature based on five different features extracted from PET and CT images was accurate for the diagnosis of diffuse disease in MM patients. In this study, we found that radiomics features extracted from the baseline PET/CT combined with clinical parameters provided valuable information for identifying patients who progress early. Nevertheless, the limited literature does not yet give enough evidence of the value of radiomics features for predicting outcomes in MM patients. Prior work on radiomics in myeloma has explicitly shown that feature stability between different scanners is very limited in vivo, even after application of a simple image normalization.
Radiomics features selected by a repeatability experiment alone are not necessarily suited to building radiomics models for multicenter clinical application. One of the main reasons hindering the translation to clinical application is presumably the low external generalizability of radiomics models [29]. Accordingly, standardization of image acquisition or advanced computational approaches for image normalization or radiomics feature (RF) compensation might help to improve the external generalizability of radiomics prediction models. Further investigations that fully explore the potential prognostic value of PET/CT radiomics features for predicting the outcome of MM patients are warranted.
Our study has several limitations. First, our findings are based on a small, retrospective cohort from one institution. Second, the follow-up duration may not be long enough; therefore, we established a predictive model based on a single survival endpoint (PFS). Thus, a prospective multicenter study with a large cohort is necessary to confirm the results. Third, the whole spine including the intervertebral discs was segmented in our study. Segmenting only the "clean" bone without discs in a further study might provide more valuable information on MM prognosis. Another research group combined automatic bone marrow (BM) segmentation with a subsequent radiomics analysis to automatically perform comprehensive, bone-by-bone phenotyping of the BM from whole-body MR images, which correctly excludes intervertebral discs [26]. This also provides inspiration for our future work.

Conclusions
Early identification of high-risk myeloma would help the development of precise treatment strategies. Radiomics features extracted from the baseline PET/CT quantitatively characterized intratumor heterogeneity and provided complementary prognostic information for myeloma patients. In our study, the combination of radiomics features with clinical data showed improved performance relative to models with radiomics features or clinical parameters alone. The multivariate Cox model containing the radiomics information stratified patients into different risk groups for PFS, and thereby may capture more information on intratumor heterogeneity and may further improve prognostic performance. Further studies with external test data will be needed to investigate the final, realistic performance of the model.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jcm12062280/s1, Table S1: IBSI reporting structure of the study as of the Imaging Biomarker Standardization Initiative (IBSI) guidelines. References [30][31][32] are cited in the supplementary materials.
Institutional Review Board Statement: This research study was conducted retrospectively from data obtained for clinical purposes. Renji Hospital Ethics Committee has confirmed that no ethical approval is required.