Prognostic Value of Baseline Radiomic Features of 18F-FDG PET in Patients with Diffuse Large B-Cell Lymphoma

This study investigates whether baseline 18F-FDG PET radiomic features can predict survival outcomes in patients with diffuse large B-cell lymphoma (DLBCL). We retrospectively enrolled 83 patients diagnosed with DLBCL who underwent 18F-FDG PET scans before treatment. The patients were divided into the training cohort (n = 58) and the validation cohort (n = 25). Eighty radiomic features were extracted from the PET images for each patient. Least absolute shrinkage and selection operator regression was used to reduce the dimensionality within radiomic features. A Cox proportional hazards model was used to determine the prognostic factors for progression-free survival (PFS) and overall survival (OS). A prognostic stratification model was built in the training cohort and validated in the validation cohort using Kaplan–Meier survival analysis. In the training cohort, run length non-uniformity (RLN), extracted from a gray level run length matrix (GLRLM), was independently associated with PFS (hazard ratio (HR) = 15.7, p = 0.007) and OS (HR = 8.64, p = 0.040). The International Prognostic Index was an independent prognostic factor for OS (HR = 2.63, p = 0.049). A prognostic stratification model was devised based on both risk factors, which allowed identification of three risk groups for PFS and OS in the training (p < 0.001 and p < 0.001) and validation (p < 0.001 and p = 0.020) cohorts. Our results indicate that the baseline 18F-FDG PET radiomic feature, RLN GLRLM, is an independent prognostic factor for survival outcomes. Furthermore, we propose a prognostic stratification model that may enable tailored therapeutic strategies for patients with DLBCL.


Introduction
Diffuse large B-cell lymphoma (DLBCL) is the most common type of lymphoma, accounting for approximately one-third of non-Hodgkin lymphomas [1]. DLBCL is a heterogeneous group of lymphomas with variable survival rates. The cure rate of DLBCL has improved substantially due to advances in disease management, and the addition of rituximab immunotherapy to conventional cyclophosphamide, hydroxydaunorubicin (doxorubicin or epirubicin), oncovin (vincristine), and prednisolone chemotherapy (R-CHOP) is effective in 60-70% of patients [2]. However, approximately 30-40% of patients still suffer relapse or refractory disease [3]. New prognostic factors for personalized risk-adapted treatment are currently an unmet clinical need, and may improve the outcomes of patients with DLBCL.
The International Prognostic Index (IPI) has been the basis for determining prognosis for DLBCL in clinical practice for the past 20 years [4,5]. In addition to IPI, 18 F-fluorodeoxyglucose (18 F-FDG) positron emission tomography/computed tomography (PET/CT) is a standard imaging modality for patients with DLBCL. 18 F-FDG PET is highly sensitive for detecting lymphoma, and plays a crucial role in disease staging and therapy monitoring, which has allowed personalized therapeutic decision making [6]. The total metabolic tumor volume (MTV) derived from baseline 18 F-FDG PET has been shown to be associated with survival outcomes in patients with DLBCL [7][8][9][10][11][12], and novel PET imaging-derived biomarkers may further individualize the treatment of lymphoma.
Tumor heterogeneity is a pivotal prognostic factor in cancer progression, recurrence, and therapeutic resistance [13]. Moreover, tumor heterogeneity plays an important role in patient outcomes, and is correlated with tumor aggressiveness, metastasis, and molecular profiles [14,15]. Radiomic analysis can be used to assess tumor heterogeneity, and may assist with clinical outcome prognostication [16]. High-throughput radiomic features are extracted from medical images, and can reveal complex mathematical patterns in the spatial distribution of signal intensity values that are not observed visually. Radiomic analysis promotes diagnostic, predictive, and prognostic power to facilitate better clinical decision making [17]. Radiomic features have been widely explored to pursue personalized medicine in various oncology studies [18][19][20][21][22][23]; however, there is limited evidence relating to their role as prognostic factors in DLBCL [24,25].
Therefore, this study aimed to assess the prognostic value of radiomic features derived from baseline 18 F-FDG PET in terms of survival outcomes. Moreover, we investigated the feasibility of combining clinical variables and radiomic features for the prognostic stratification of patients with DLBCL.

Patient Population
This study was conducted according to the Declaration of Helsinki guidelines, and approved by the Institutional Review Board and Research Ethics Committee of Hualien Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation (IRB108-251-B; 10 December 2019). The need for informed consent was waived given the retrospective nature of the study. Between September 2004 and June 2019, 83 patients with a pathological diagnosis of DLBCL who underwent pre-treatment 18 F-FDG PET/CT were retrospectively enrolled. All patients received either R-CHOP chemotherapy or R-CVP (rituximab, cyclophosphamide, vincristine, prednisone) chemotherapy, or rituximab monotherapy in patients with a low tumor burden. Electronic charts were carefully reviewed for each patient, and data regarding patient demographics, disease characteristics, clinical course, therapy modalities, and patient outcomes were collected. All patients underwent a complete medical history, physical examination, laboratory tests, bone marrow aspiration, CT scan, and 18 F-FDG PET/CT. The patient's age at disease onset, Ann Arbor stage, Eastern Cooperative Oncology Group performance status, lactate dehydrogenase (LDH) level, and extranodal involvement were recorded for calculation of the IPI score [5]. Bulky disease was defined as a nodal mass larger than 10 cm in diameter.
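The five recorded variables map onto the IPI score as one point per adverse factor. The sketch below is illustrative only: it assumes the standard IPI thresholds (age above 60 years, Ann Arbor stage III or IV, performance status of 2 or more, LDH above the upper limit of normal, more than one extranodal site), which are not restated in this section, and the `Patient` class and its field names are hypothetical. It deliberately does not encode a cut-off for "high-risk" IPI, because the text does not state one.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical container for the variables recorded for the IPI."""
    age_years: int
    ann_arbor_stage: int    # 1 to 4
    ecog_status: int        # 0 to 4
    ldh_elevated: bool      # LDH above the laboratory's upper limit of normal
    extranodal_sites: int   # number of extranodal sites involved


def ipi_score(p: Patient) -> int:
    """Count adverse factors (0 to 5) using the standard IPI thresholds (assumed)."""
    return sum([
        p.age_years > 60,
        p.ann_arbor_stage >= 3,
        p.ecog_status >= 2,
        p.ldh_elevated,
        p.extranodal_sites > 1,
    ])


# Made-up example patient, not taken from the study cohort.
example = Patient(age_years=72, ann_arbor_stage=4, ecog_status=1,
                  ldh_elevated=True, extranodal_sites=0)
example_score = ipi_score(example)
```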

Patient Follow-Up Evaluation
Initial treatment with rituximab-based chemotherapy, with or without involved-field radiotherapy, was administered to patients with DLBCL in accordance with the National Comprehensive Cancer Network Clinical Practice Guidelines in Oncology. Disease status was evaluated by CT or 18 F-FDG PET/CT scan following treatment. Follow-up assessment was performed every 3 months for the first 2 years, and every 6 to 12 months thereafter. The enrolled patients were followed up until disease progression or death, and these cases were counted as an event. Progression-free survival (PFS) was defined as the time from the date of diagnosis to the date of the first relapse, progression, or death from any cause. Overall survival (OS) was defined as the time from diagnosis until death from any cause [26]. Patients who did not suffer an event were censored at the date of the last known follow-up.

18 F-FDG PET/CT Scan
Patients fasted for at least 4 h before the examination and had blood glucose levels less than 150 mg/dL. Patients were injected intravenously with 5 MBq/kg of 18 F-FDG, and PET/CT scans were performed 45 min after administration using a GE Discovery ST scanner (GE Healthcare, Milwaukee, WI, USA). PET images were acquired from the mid-thigh to the vertex in a static 3-dimensional mode. A CT scan without intravenous contrast medium enhancement was performed immediately prior to the PET imaging for attenuation correction. PET images were reconstructed with an ordered-subset expectation maximization algorithm (2 iterations, 21 subsets, and a 2.14-mm full width at half maximum Gaussian post-filter). The reconstructed PET images had a matrix size of 128 × 128, a pixel size of 5.47 × 5.47 mm, and a slice thickness of 3.27 mm.

Feature Extraction and Selection
18 F-FDG PET images were interpreted by an expert nuclear medicine physician. To avoid interobserver variations, all images were analyzed by the same reviewer using OsiriX software (Pixmeo, Geneva, Switzerland) [27]. The results were confirmed by a second experienced nuclear medicine physician. 18 F-FDG-avid lesions were segmented on PET images by applying the region-growing algorithm with a standardized uptake value (SUV) threshold above 2.5 for target delineation [28]. The SUV-based volumes of interest were used to compute quantitative radiomic features in PET images.
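As a rough illustration of threshold-based region growing, the sketch below flood-fills a small SUV volume from a seed voxel and converts the voxel count to a volume using the reconstructed voxel size (5.47 × 5.47 × 3.27 mm). It is not the OsiriX implementation: the 6-connected neighborhood, the inclusive comparison (SUV ≥ 2.5), the function names, and the toy volume are all assumptions made for the example.

```python
from collections import deque

VOXEL_MM = (5.47, 5.47, 3.27)  # in-plane pixel size and slice thickness

# Offsets of the six face-adjacent neighbors (z, y, x).
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def grow_region(suv, seed, threshold=2.5):
    """Return the set of voxels connected to `seed` whose SUV is >= threshold."""
    nz, ny, nx = len(suv), len(suv[0]), len(suv[0][0])
    if suv[seed[0]][seed[1]][seed[2]] < threshold:
        return set()
    region, queue = {seed}, deque([seed])
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in NEIGHBORS:
            n = (z + dz, y + dy, x + dx)
            inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
            if inside and n not in region and suv[n[0]][n[1]][n[2]] >= threshold:
                region.add(n)
                queue.append(n)
    return region


def volume_cm3(region, voxel_mm=VOXEL_MM):
    """Convert a voxel count to cm^3 (1 cm^3 = 1000 mm^3)."""
    voxel_mm3 = voxel_mm[0] * voxel_mm[1] * voxel_mm[2]
    return len(region) * voxel_mm3 / 1000.0


# Toy 2 x 2 x 3 volume indexed as suv[z][y][x]; values are made up.
toy_suv = [
    [[0.5, 3.0, 3.1], [0.4, 1.0, 2.6]],
    [[0.2, 0.3, 4.0], [3.5, 0.1, 0.2]],
]
lesion = grow_region(toy_suv, seed=(0, 0, 1))
lesion_volume = volume_cm3(lesion)
```

In this toy volume the voxel with SUV 3.5 is above the threshold but not connected to the seed, so it is left out of the grown region; summing such regions over all lesions would give a total MTV.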
The radiomic features included 19 first-order features and 61 textural features. The first-order parameters were calculated on the basis of SUV statistics. The textural features were computed from a gray level co-occurrence matrix, gray level run length matrix (GLRLM), gray level size zone matrix (GLSZM), and neighboring gray tone difference matrix using a fixed bin width of 0.25. A total of 80 radiomic features (Supplementary Table S1) were extracted from PET images using the Pyradiomics open-source software package version 2.2.0 (Harvard Medical School, Boston, MA, USA) [29]. Radiomic features calculated by this package comply with the feature definitions described by the Imaging Biomarker Standardization Initiative (IBSI) [30,31].
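To make the fixed-bin-width discretization and the run length non-uniformity concrete, the following sketch bins SUVs with a width of 0.25 and computes RLN along a single direction of a 2D patch. It is a simplified illustration, not the Pyradiomics code path: the package works on 3D volumes and aggregates over multiple directions, and the helper names here are hypothetical. The binning rule and the formula RLN = Σ_j (Σ_i P(i, j))² / N_r, where P(i, j) counts runs of gray level i and length j and N_r is the total number of runs, follow the definitions documented for Pyradiomics as far as we are aware.

```python
import math
from collections import Counter
from itertools import groupby


def discretize(values, bin_width=0.25):
    """Fixed-bin-width discretization: floor(v / w) - floor(min / w) + 1."""
    lowest = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - lowest + 1 for v in values]


def run_length_nonuniformity(rows):
    """RLN along one direction (left to right) for rows of gray levels."""
    runs_per_length = Counter()  # run length j -> number of runs, summed over gray levels
    for row in rows:
        for _level, run in groupby(row):
            runs_per_length[len(list(run))] += 1
    n_runs = sum(runs_per_length.values())
    return sum(count * count for count in runs_per_length.values()) / n_runs


# Made-up SUVs and a made-up discretized patch, for illustration only.
binned = discretize([2.5, 2.6, 2.75, 3.1])
patch = [[1, 1, 2, 2],
         [3, 3, 3, 1]]
patch_rln = run_length_nonuniformity(patch)
```

The patch contains two runs of length 2, one of length 3, and one of length 1, so the numerator is 2² + 1² + 1² = 6 over 4 runs.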
To reduce dimensionality within the radiomic features, we first retained reliable features with low sensitivity to the intraclass correlation coefficient (ICC), following a previous report [32]. Subsequently, the least absolute shrinkage and selection operator (LASSO) regression algorithm [33] was applied to the retained features. A five-fold cross-validation scheme was applied to tune the regularization parameter lambda. The optimal lambda value was identified using the minimum cross-validated criterion and the criterion within one standard error of the minimum. Using this method, the regression coefficients of irrelevant features were shrunk to zero, and the radiomic features with the remaining nonzero coefficients were selected.
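The two lambda criteria mentioned above can be sketched as follows, given per-fold cross-validation errors for each candidate lambda. This is an assumption-laden illustration rather than the authors' code: it follows the common convention in which the "one standard error" choice is the largest lambda whose mean error lies within one standard error of the minimum, and the function name and the error values are invented for the example.

```python
from math import sqrt
from statistics import mean, stdev


def select_lambdas(lambdas, fold_errors):
    """Return (lambda_min, lambda_1se).

    fold_errors[k] holds the per-fold cross-validation errors for lambdas[k].
    """
    means = [mean(errors) for errors in fold_errors]
    std_errors = [stdev(errors) / sqrt(len(errors)) for errors in fold_errors]
    k_min = min(range(len(lambdas)), key=means.__getitem__)
    limit = means[k_min] + std_errors[k_min]
    # Largest (most regularizing) lambda still within one SE of the minimum.
    lambda_1se = max(lam for lam, m in zip(lambdas, means) if m <= limit)
    return lambdas[k_min], lambda_1se


# Made-up five-fold errors for five candidate lambdas.
candidate_lambdas = [0.01, 0.05, 0.10, 0.20, 0.40]
cv_errors = [
    [1.10, 1.20, 1.15, 1.25, 1.30],
    [1.00, 1.10, 1.05, 0.95, 1.10],
    [1.01, 1.11, 1.06, 0.97, 1.10],
    [1.20, 1.30, 1.25, 1.15, 1.35],
    [1.50, 1.60, 1.55, 1.65, 1.70],
]
lambda_min, lambda_1se = select_lambdas(candidate_lambdas, cv_errors)
```

Choosing the larger of the two lambdas trades a small increase in cross-validated error for a sparser model, which is the usual motivation for the one-standard-error rule.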

Statistical Analysis
The primary endpoints of this study were PFS and OS. Clinical variables and image features from the radiomic analysis were tested as potential prognostic factors. Two independent datasets were needed to build and validate the model. The data of 83 patients were randomly divided into two cohorts: 58 patients (70%) to the training dataset, and the remaining 25 patients (30%) to the validation dataset. Chi-square tests were used to compare the categorical variables between the training and validation cohorts. Receiver operating characteristic (ROC) curves were used to define the optimal cut-off values of the radiomic features by maximizing the sensitivity and specificity based on the Youden index. Cox proportional hazards regression models were used to identify the prognostic factors of PFS and OS in the training dataset. The statistically significant variables in the univariate Cox analysis were included in the stepwise multivariate Cox regression models. In both training and validation datasets, the survival curve was plotted using the Kaplan-Meier method, and the survival difference between the subgroups was assessed using a log-rank test. All statistical tests were two-sided, with a significance level of 0.05. Statistical analyses were performed using MedCalc statistical software version 19.4.1 (MedCalc Software, Ostend, Belgium) and R open-source statistical software version 3.5.2 (R Foundation, Vienna, Austria).
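A cut-off based on the Youden index can be sketched as below: each observed feature value is tried as a threshold, and the one maximizing J = sensitivity + specificity − 1 is kept. The example treats the outcome as a binary event indicator and classifies a patient as "high" when the feature value is at or above the threshold; both choices, the function name, and the data are assumptions for illustration.

```python
def youden_cutoff(values, events):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    events[i] is 1 if patient i had the event and 0 otherwise; a value is
    called positive when it is >= the candidate cutoff.
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, e in zip(values, events) if e and v >= cutoff)
        true_neg = sum(1 for v, e in zip(values, events) if not e and v < cutoff)
        j = true_pos / n_pos + true_neg / n_neg - 1
        if j > best_j:  # ties keep the lowest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Made-up feature values and event indicators, not study data.
feature_values = [300, 800, 1200, 1600, 2000, 2600, 3100, 4000]
event_flags = [0, 0, 1, 0, 1, 1, 0, 1]
cutoff, j_stat = youden_cutoff(feature_values, event_flags)
```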

Patient Characteristics
A total of 83 patients met the criteria for enrolment in the study; of these, 65 patients were treated with the R-CHOP chemotherapy regimen, 13 with R-CVP, and 5 with rituximab monotherapy. In addition, 18 patients received involved-field radiotherapy. The median follow-up period was 41.7 months; at the time of the analysis, 35 patients (42%) had suffered disease relapse or progression at a median of 9.8 months after diagnosis, and 29 patients (35%) had died of the disease at a median of 10.7 months. The 5-year PFS rate was 52.3%, and the 5-year OS rate was 60.3% in the entire study population. The clinical characteristics of the patients are outlined in Table 1. No significant differences were found between the training and validation datasets (p = 0.111-0.755).

Feature Selection in the Training Cohort
Twelve radiomic features (Supplementary Table S2) with low sensitivity to the ICC (<1.10) were chosen according to the literature report [32]. These reliable features were chosen for further LASSO analysis. Based on the LASSO results (Supplementary Figure S1), MTV, gray level non-uniformity (GLN), and run length non-uniformity (RLN), the latter two both derived from the GLRLM, had nonzero regression coefficients and were selected as potential prognostic factors for PFS and OS. From ROC curves, the cut-off value of MTV was 137 cm³, that of GLN GLRLM was 68, and that of RLN GLRLM was 1449. These cut-off values were used to stratify patients into those with good or poor survival outcomes.

Survival Analyses in the Training Cohort
The results of univariate and multivariate Cox regression analyses for the clinical variables and PET parameters are presented in Tables 2 and 3, respectively. In the univariate analysis, the clinical variables disease stage, LDH, IPI score, and bulky disease, and the radiomic features MTV, GLN GLRLM, and RLN GLRLM, were associated with PFS. Meanwhile, LDH, IPI, MTV, GLN GLRLM, and RLN GLRLM were related to OS. These variables were entered into the multivariate Cox regression model. After multivariate analysis, RLN GLRLM remained a prognostic factor for PFS, whereas the IPI and RLN GLRLM maintained their prognostic significance for OS. Kaplan-Meier survival analysis confirmed that the IPI score and RLN GLRLM were predictive factors for both PFS and OS (Figure 1). The 5-year estimate of PFS was 35.8% in the high-risk IPI group compared to 69.8% in the low-risk IPI group. Patients with high-risk IPI scores had a 5-year OS of 35.5%, while patients with low-risk IPI scores had a 5-year OS of 74.6%. Patients with a high RLN GLRLM had more aggressive disease, a greater risk of relapse or progression, and a lower survival rate compared to patients with a low RLN GLRLM. Patients with a high RLN GLRLM had a 5-year PFS of 37.2%, whereas patients with a low RLN GLRLM had a 5-year PFS of 91.7%. Moreover, patients with a high RLN GLRLM had a 5-year OS of 41.1%, whereas patients with a low RLN GLRLM had a 5-year OS of 91.7%.
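The 5-year survival estimates quoted above come from Kaplan-Meier curves. As a reminder of how such an estimate is formed from event times and censoring indicators, here is a minimal product-limit sketch; the function names and the follow-up data are invented, and a real analysis would rely on an established statistics package rather than this code.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events[i] is 1 if the event occurred at times[i] and 0 if the patient
    was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival, curve, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        n_events = n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def survival_at(curve, t):
    """Read the step function S(t) at time t."""
    s = 1.0
    for event_time, surv in curve:
        if event_time <= t:
            s = surv
    return s


# Made-up follow-up times in months; 0 marks a censored patient.
months = [6, 10, 10, 24, 40, 62, 70]
status = [1, 1, 0, 1, 0, 0, 1]
km_curve = kaplan_meier(months, status)
five_year_survival = survival_at(km_curve, 60)
```

With these toy data the estimate at 60 months is (6/7) × (5/6) × (3/4) = 15/28, because censored patients leave the risk set without changing the curve.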

Prognostic Model Development and Validation
A prognostic stratification model was built based on the independent risk factors identified in the multivariate Cox regression analysis for OS. The risk factors were a high-risk IPI score (clinical variable) and a high RLN GLRLM (radiomic feature). To combine the two factors, the presence or absence of each risk factor was given a score of 1 or 0, respectively, resulting in total scores from 0 to 2. All patients were stratified into three risk groups: group I, with a score of 0 (none of the risk factors); group II, with a score of 1 (one risk factor); and group III, with a score of 2 (two risk factors). In the training dataset, Kaplan-Meier analyses of PFS and OS demonstrated the ability of the prognostic stratification model (Figure 2a,b). Survival curves revealed that the three risk groups were significantly different with regard to PFS and OS. The 5-year PFS rates of patients in groups I to III were 90.0%, 54.2%, and 30.6% (p < 0.001), respectively, and the 5-year OS rates were 90.0%, 64.7%, and 30.3% (p < 0.001), respectively.
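The scoring rule described above is simple enough to write down directly. In the sketch below the two risk factors are passed in as booleans, because this section does not state the IPI cut-off that defines "high-risk" or whether the RLN GLRLM cut-off is inclusive; the function name is hypothetical.

```python
def risk_group(ipi_high_risk: bool, rln_high: bool) -> str:
    """Map the two binary risk factors to group I, II, or III."""
    score = int(ipi_high_risk) + int(rln_high)  # 0, 1, or 2
    return ("I", "II", "III")[score]


# All four combinations of the two risk factors.
groups = {
    (ipi, rln): risk_group(ipi, rln)
    for ipi in (False, True)
    for rln in (False, True)
}
```

Note that group II contains two different patient profiles (high-risk IPI only, or high RLN GLRLM only), which the score treats as equivalent.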
Diagnostics 2020, 10, x FOR PEER REVIEW 7 of 12
In the validation dataset, survival curves generated through Kaplan-Meier analysis indicated that the prognostic stratification model identified three risk groups for survival outcomes (Figure 2c,d). The patients in group I had significantly higher 5-year PFS (100% vs. 43.3% vs. 0%, p < 0.001) and OS (100% vs. 67.5% vs. 33.3%, p = 0.020) rates than those in groups II and III.

Discussion
The present study investigated the use of radiomic analysis of 18 F-FDG PET for predicting survival outcomes in patients with DLBCL. Our results demonstrate that baseline 18 F-FDG PET radiomics have prognostic value, and that RLN GLRLM is an independent prognostic factor for both PFS and OS. RLN GLRLM provides a way of characterizing tumor heterogeneity, which is driven by the genomic diversity that enables the tumor to evolve and adapt to anticancer treatments [15,34]. Therefore, it can be reasoned that the assessment of tumor heterogeneity allows us to anticipate patient outcomes. Moreover, a prognostic stratification model was devised to identify the risk groups of patients by integrating clinical and imaging prognostic factors. The proposed model showed the complementary roles of clinical information and tumor heterogeneity, and allowed the stratification of three risk groups according to survival outcomes in patients with DLBCL.
Many PET radiomic features are currently under investigational use, and different studies have reported different radiomic features for predicting the survival outcome of lymphoma [35][36][37][38][39]. To keep the data dimensionality low and avoid overfitting, only 12 radiomic features with low ICC sensitivity were evaluated for clinical endpoints in this study. The cohort was split into a training dataset (70%) and an internal validation dataset (30%). A LASSO algorithm was further used for feature selection in order to achieve the best accuracy for PFS and OS prognostication. The radiomic feature identified in the study, RLN GLRLM, was a valuable imaging biomarker after multivariable analyses. RLN GLRLM estimates the similarity of run lengths throughout the image, where a lower value indicates higher homogeneity. A higher RLN GLRLM was associated with a worse prognosis, suggesting that the measurement of tumor heterogeneity of 18 F-FDG PET distribution is an essential biomarker in patients with DLBCL.
The literature on molecular imaging radiomics for DLBCL is limited. A few studies have investigated the usefulness of PET radiomic features in determining survival in DLBCL. Parvez et al. [38] found that GLN GLSZM correlated with disease-free survival, and that kurtosis correlated with OS. Moreover, Aide et al. [35,40] found that skewness of skeletal heterogeneity was a prognostic factor for PFS, and long-zone high gray level emphasis from GLSZM was a prognostic parameter for 2-year event-free survival. Recently, Cottereau et al. [41] reported that the radiomic feature characterizing lesion dissemination was associated with PFS and OS. Our findings are in line with those of studies indicating that PET-derived radiomic features are useful for patient outcome prognostication in DLBCL. Previous studies have indicated that MTV can be used to determine the prognosis of patients with DLBCL [7][8][9][10][11][12]. Our results do not contradict those studies. In univariate analysis, MTV demonstrated prognostic significance; however, in multivariate analysis, MTV did not correlate with PFS and OS, presumably due to the small sample size. In lymphoma, a few reports have indicated that the performance of PET metabolic parameters for survival prognostication is poor compared to that of PET radiomic features [42,43]. In contrast, Wang et al. [39] reported that radiomics are not superior to traditional imaging parameters. Nevertheless, our data suggest that features of tumor heterogeneity may serve as a complementary indicator to MTV. Further external validation in a larger cohort is required to confirm our findings.
Tumor heterogeneity has the potential to impact the prognosis of patients with DLBCL [44]. Lymphoma is a systemic malignancy, which lacks a primary tumor in the majority of cases. A biopsy is generally performed for a single lesion site in routine clinical practice. Thus, it might be more relevant to explore tumor heterogeneity across the entire tumor volume than with a single-site biopsy in DLBCL. In this study, a tumor heterogeneity feature from the entire tumor volume was combined with the clinical IPI to construct a prognostic stratification model. Our findings highlight the benefit of an integrated approach that includes IPI and radiomics for evaluating patients with DLBCL at initial diagnosis. Currently, a qualitative assessment of response using 18 F-FDG PET has been implemented into the clinical management of DLBCL. However, patients with DLBCL failed to achieve significant survival improvement with a treatment strategy directed by the qualitative 18 F-FDG PET response [45]. Radiomics provides a more sophisticated quantitative measure of 18 F-FDG PET. We further combined radiomics with the clinical IPI system into a survival prediction model. Because radiomics portrays tumor heterogeneity, which is different from the clinical information provided by the IPI score, these two features may have complementary roles. A combination of the two risk factors may more comprehensively depict the survival risk of DLBCL. Future clinical trials are warranted to test the ability of our proposed model to guide tailored treatment strategies.
Despite the usefulness of radiomics, it does have certain limitations. First, radiomic features are extracted from the volume used for MTV measurement, and the method of MTV measurement is inconsistent among different working groups. A recent report [28] indicated that different methods predicted prognosis, but those with an SUV ≥ 2.5 had the best interobserver agreement and were easiest to apply in DLBCL; this was the method we selected in the current study. Moreover, the threshold used to divide patients into high- and low-risk groups depends on the method of MTV measurement. Thus, common criteria for standardizing the MTV calculation are warranted [46]. Second, the SUV discretization step in computing textural features can influence repeatability [47]. In our work, a reliable discretization using a fixed bin size was adopted, which was shown to be more appropriate in clinical cases [48]. However, the optimal bin size value could not be identified (i.e., the extraction of reliable radiomic features has not been thoroughly investigated). Further investigation of the optimal bin size for survival prognostication should be considered. Third, the reliability of radiomic features and their ability to predict clinical outcomes is highly dependent on the choice of feature extraction platform [49]. Future radiomic studies should still ensure platforms are IBSI-compliant, as was the platform that we adopted in the current study. Finally, radiomic features can be sensitive to the imaging acquisition and reconstruction settings [50]. Therefore, a radiomic-based model might not be directly applicable to different imaging centers, which limits its usefulness in clinical practice. Further research is necessary to validate our findings using a post-reconstruction harmonization [51] approach in multicenter trials.
We acknowledge that our research is exploratory and that there are several limitations. Like most radiomic studies, selection bias could not be avoided due to the retrospective nature of the study. Furthermore, since our analysis was based on a small number of patients, the lack of statistically significant differences should be interpreted with caution, as a statistical difference may be evident with a larger population. In addition, interobserver variability may arise with different image readers. Moreover, current molecular genetic studies have identified DLBCL subtypes with less favorable survival outcomes, such as the activated B-cell subtype or MYC oncogene rearrangement [11,52]. However, only 12 patients in our cohort underwent subtyping. Whether the radiomic features derived from 18 F-FDG PET are associated with the different subtypes of DLBCL requires further investigation. Finally, the rituximab-based regimens and the radiotherapy doses varied throughout the study. This study demonstrated that the identified radiomic feature has prognostic value in DLBCL, but the underlying biological meaning remains to be further explored in larger, multi-institutional cohorts before it can be applied to clinical decision making.

Conclusions
Our results indicate that the baseline 18 F-FDG PET radiomic feature, RLN GLRLM, serves as an independent prognostic factor for survival outcomes. Furthermore, a prognostic stratification model combining the IPI and RLN GLRLM can be useful for risk stratification of patients with DLBCL. Our findings may be clinically helpful in guiding personalized therapeutic strategies.
Informed Consent Statement: Patient consent was waived given the retrospective nature of the study.

Data Availability Statement: The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy and ethical restrictions.