A Simultaneous Multiparametric 18F-FDG PET/MRI Radiomics Model for the Diagnosis of Triple Negative Breast Cancer

Simple Summary
In this study, we aimed to build a machine-learning predictive model for the identification of triple negative breast cancer, the most aggressive subtype, using quantitative parameters and radiomics features extracted from tumor lesions on hybrid PET/MRI. The good performance of the model supports the hypothesis that hybrid PET/MRI can provide quantitative data that allow artificial intelligence software to non-invasively detect tumor biological characteristics, and it encourages additional studies for this purpose.

Abstract
Purpose: To investigate whether a machine learning (ML)-based radiomics model applied to 18F-FDG PET/MRI is effective in molecular subtyping of breast cancer (BC) and specifically in discriminating triple negative (TN) from other molecular subtypes of BC.
Methods: Eighty-six patients with 98 BC lesions (Luminal A = 10, Luminal B = 51, HER2+ = 12, TN = 25) were included and underwent simultaneous 18F-FDG PET/MRI of the breast. A 3D segmentation of each BC lesion was performed on T2w, DCE, DWI and PET images. Quantitative diffusion and metabolic parameters were calculated and radiomics features extracted. Features were selected using LASSO regression and used by a fine Gaussian support vector machine (SVM) classifier with 5-fold cross validation for the identification of TNBC lesions.
Results: Eight radiomics models were built based on different combinations of quantitative parameters and/or radiomic features. The best performance (AUROC 0.887, accuracy 82.8%, sensitivity 79.7%, specificity 86%, PPV 85.3%, NPV 80.8%) was found for the model combining first order, neighborhood gray level dependence matrix and size zone matrix-based radiomics features extracted from ADC and PET images.
Conclusion: An ML-based radiomics model applied to 18F-FDG PET/MRI is able to non-invasively discriminate TNBC lesions from other BC molecular subtypes with high accuracy. In the future, a "virtual biopsy" might be performed with radiomics signatures.


Introduction
Breast cancer (BC) is a heterogeneous disease with a multifactorial etiology (e.g., hormone- and genetics-related) affecting the capability of cells to repair DNA damage [1][2][3]. In the course of cancer development, cells progressively accumulate mutations and acquire new cancer hallmark capabilities, i.e., sustaining proliferative signaling, evading growth suppressors, resisting cell death, enabling replicative immortality, inducing/accessing vasculature, activating invasion and metastasis, reprogramming cellular metabolism, and avoiding immune destruction [4,5]. Depending on the expression of molecular biomarkers, including estrogen receptor (ER), progesterone receptor (PgR), and human epidermal growth factor receptor 2 (HER2), BC can be categorized into different subtypes [6]. The knowledge of such molecular features led to the development of targeted treatments and improved outcomes [7]. An exception is triple negative (TN) BC, which does not express any of these molecular biomarkers. TN is the most aggressive BC subtype, with a propensity for tissue invasion and distant metastases [8], and has the poorest prognosis as no targeted treatment is currently available [9].
The assessment of molecular subtypes is, therefore, the "sine qua non" for treatment planning of BC. In patients with TNBC, it is especially important as they usually require upfront systemic treatment before surgery. Currently, the assessment of molecular subtypes is performed through invasive tissue sampling using core needle biopsy, an approach that is inherently limited by sampling bias and provides only a snapshot of the biology of the entire tumor. Indeed, molecular biomarkers are likely to differ within the tumor, a phenomenon known as intratumor heterogeneity [10]. The development of a non-invasive, pre-operative approach for the molecular characterization of BC in its entirety that can inform whether a tumor really has no actionable treatment targets, i.e., is TNBC entirely, is still an unmet clinical need.
Developments have led to sophisticated imaging techniques that can depict the functional properties of breast tumors and their underlying biology. Currently, 18F-fluorodeoxyglucose positron emission tomography (18F-FDG PET) as well as magnetic resonance imaging (MRI) diffusion-weighted imaging (DWI) and dynamic contrast-enhanced (DCE) techniques are the gold standard for the in vivo assessment of tumor metabolism (18F-FDG PET), cellularity (DWI), and neoangiogenesis (DCE) [11][12][13]. For the assessment of TNBC in particular, both 18F-FDG PET and MRI have shown high sensitivity [14,15]. More recently, simultaneous 18F-FDG PET/MRI has been shown to be promising for the accurate and non-invasive biological characterization of BC [16].
Initial studies have demonstrated the potential of radiomics analysis coupled with machine learning (ML) based on either PET or MRI (both DCE and DWI) for the comprehensive assessment of tumor phenotypes and for the development of predictive models [17][18][19]. Most recently, the value of artificial intelligence (AI)-enhanced simultaneous 18F-FDG PET/MRI for BC phenotyping, specifically for the identification of hormone receptor-positive (luminal) BC, has been demonstrated [20]; nevertheless, its potential for the identification of TNBC remains unclear.
We hypothesized that AI-enhanced simultaneous 18F-FDG PET/MRI can identify the radiomic signature of TNBC. Therefore, we aimed to build an ML-based predictive model using both quantitative imaging parameters and radiomic features extracted from simultaneous 18F-FDG PET/MRI to distinguish TNBC from other molecular BC subtypes.

Patient Sample
This prospective single-institution study was approved by the institutional review board, and written informed consent was obtained from all participants. All patients underwent simultaneous multiparametric 18F-FDG PET/MRI of the breast between June 2016 and June 2020. Inclusion criteria were: age >18 years; histologically verified BC lesions; and not pregnant or breastfeeding. Exclusion criteria were: patients with standard contraindications for performing MRI examinations (e.g., metal implants, metallic foreign bodies, renal failure with eGFR < 30 mL/min); patients for whom histological proof of malignancy was not available; patients with malignant lesions other than BC; tumor recurrence; incomplete 18F-FDG PET/MRI examinations; and patients with PET, DCE, or DWI images that were not suitable for subsequent multiparametric and radiomics analyses. Patients included in this study have been investigated in a previous study with a different purpose and results [21].

Image Analysis

Quantitative Parameters
A board-certified breast radiologist and a nuclear medicine physician with 7 and 11 years of experience, respectively, independently evaluated all PET/MR images, using a previously described method that proved to be highly reproducible [21].
MR images were analyzed for both DWI and perfusion-weighted imaging (PWI) quantitative parameters using free, open-source software (Horos v.3.3.5, distributed under the LGPL license at Horosproject.org, sponsored by Nimble Co LLC d/b/a Purview in Annapolis, MD, USA). In detail, 2D circular ROIs were placed on apparent diffusion coefficient (ADC) maps to calculate the ADCmean of the tumor lesion and of the contralateral breast parenchyma. Thereafter, 2D ROIs were drawn over tumor lesions on first post-contrast DCE images and then pasted onto the DCE perfusion maps for the extraction of quantitative perfusion parameters, including mean transit time (MTT), plasma flow (PF), and volume distribution (VD), according to previous evidence [26]. Details of DWI and DCE image analysis are reported in Supplementary Materials S2.
PET images were analyzed for the quantification of tumor uptake using the Hermes Hybrid Viewer (Hermes Medical Solutions, Stockholm, Sweden). Maximum, mean, and minimum standardized uptake values (SUVmax, SUVmean, and SUVmin) were calculated by placing a 3D volume of interest (VOI) with a fixed threshold at the level of tumor lesions; care was taken to exclude surrounding background parenchymal uptake. The same approach was used for the extraction of SUVmean of the ipsilateral and contralateral normal appearing breast parenchyma, away from the nipple and areola.

Tumor Segmentation
Whole BC lesions were segmented on T2-weighted, DCE, DWI, and PET images using dedicated software (ITK-SNAP v. 3.6.0, itksnap.org, University of Pennsylvania, Philadelphia, PA, USA; University of Utah, Salt Lake City, UT, USA). DCE (first post-contrast timepoint), DWI, and PET images were annotated using a semi-automated method based on selecting a lower boundary of signal intensity, while a slice-by-slice approach was used for segmenting BC lesions on T2-weighted images. In all cases, VOIs were placed within the margins of the lesions and care was taken to exclude macroscopic necrosis as well as cystic and hemorrhagic areas or biopsy markers. Figure 1 illustrates ROI placement on DWI, PWI and PET images, as well as the BC lesion segmentation process.

Radiomic Feature Extraction
Prior to the extraction of radiomic features, data for all images were reduced to 16 grey levels. The Computational Environment for Radiological Research (CERR) platform, which is compatible with the Image Biomarker Standardization Initiative (IBSI), was used to extract radiomic features [27] from DCE, T2-weighted, ADC and PET images. For the extraction of ADC-derived radiomic features, BC segmentation was first performed on DWI images to better define tumor margins and leverage their intrinsic high contrast, and ROIs on DWI images were subsequently pasted onto ADC maps for the calculation of radiomic features. For non-isotropic images (T2-weighted images and ADC maps), feature extraction was performed in a slice-by-slice 2D fashion and subsequently aggregated over the whole lesion (BTW3 as defined by IBSI) [28]. Because of the large class imbalance, adaptive synthetic sampling was used to mitigate its effect. A total of 101 features were computed per image; all features are detailed in Supplementary Materials S3.
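The reduction to 16 grey levels described above can be sketched as a fixed-bin-number discretization, the scheme described by the IBSI. The following Python sketch is illustrative only: the function name and the toy intensities are hypothetical, and it is an assumption that CERR was configured with this particular scheme, which the text does not state.

```python
import numpy as np


def discretize_fixed_bin_number(values, n_levels=16):
    """Map ROI intensities to integer grey levels 1..n_levels.

    Assumption: fixed-bin-number discretization over the ROI's own
    intensity range; the text only states that 16 grey levels were used.
    """
    values = np.asarray(values, dtype=float)
    v_min, v_max = values.min(), values.max()
    if v_max == v_min:
        # A perfectly flat ROI collapses to a single grey level.
        return np.ones(values.shape, dtype=int)
    scaled = (values - v_min) / (v_max - v_min)          # range 0..1
    levels = np.floor(n_levels * scaled).astype(int) + 1  # range 1..n_levels+1
    # The maximum intensity would land in bin n_levels + 1; clip it back.
    return np.minimum(levels, n_levels)


# Hypothetical ROI intensities, not study data.
roi_intensities = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
grey_levels = discretize_fixed_bin_number(roi_intensities)
print(grey_levels)
```

Texture matrices such as the neighborhood gray level dependence matrix are then computed on these integer levels rather than on the raw intensities.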

Radiomic Feature Selection and Machine Learning
As our study dataset consisted of a limited number of cases relative to a high number of extracted features, radiomic feature selection was performed using Least Absolute Shrinkage and Selection Operator (LASSO) regression [29], and subsequently, the five most important features were selected for each radiomic model to avoid overfitting. With insufficient cases to fine-tune the LASSO hyperparameter (Lambda), a pragmatic approach was taken, wherein the algorithm employed automatically selected the largest value of Lambda that resulted in a nonnull model. Thereafter, a fine Gaussian support vector machine (SVM) was employed for radiomic model building using MATLAB 2017b (The MathWorks Inc., Natick, MA, USA). SVM is a widely used supervised ML method that has demonstrated good performance in small datasets and also provides memory-efficient models that are able to solve both linear and non-linear problems [30]. Briefly, SVM works by identifying a hyperplane that best separates two classes (e.g., TNBC vs. other BC subtypes). The choice of the best hyperplane is made based on how many cases are correctly classified and with which margins. The larger the margin, the more robust the model and the lower the risk of misclassification. Due to the limited number of cases in our study dataset, it was not possible to define separate training and validation sets. Therefore, five-fold cross validation was employed, wherein five groups of data were generated, so that each model was trained on four groups and tested on the remaining one, providing information on model generalizability. Data were initially standardized (z-score calculation with mean 0 and standard deviation 1) to prevent dependence on any individual parameter, especially those parameters containing high values. The whole process was repeated 1000 times, for each of the 8 datasets, to provide final accuracy metrics.
Different models were built using various combinations of DCE, T2-weighted, ADC, and PET-derived radiomic features as well as quantitative PET/MRI parameters, to evaluate their ability to accurately distinguish TNBC from other BC subtypes.
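As a rough illustration of the workflow described above (z-score standardization, a Gaussian-kernel SVM, and repeated five-fold cross validation), the following Python sketch uses scikit-learn on synthetic data. It is not the MATLAB implementation used in the study. Several choices are assumptions made for illustration: the "fine Gaussian" preset is approximated by a kernel scale of sqrt(P)/4 for P predictors, folds are stratified by class, standardization is fitted inside each training fold rather than once on the whole dataset, and only 3 repeats are run instead of 1000. The feature matrix is random and only mimics the study's dimensions (98 lesions, 25 TNBC, 5 selected features).

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_fine_gaussian_svm(n_features):
    """Z-score scaling followed by an RBF-kernel SVM.

    Assumption: kernel scale s = sqrt(P) / 4, which corresponds to
    gamma = 1 / s**2 in scikit-learn's exp(-gamma * ||x - y||^2) kernel.
    """
    kernel_scale = np.sqrt(n_features) / 4.0
    gamma = float(1.0 / kernel_scale**2)
    return make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma=gamma))


# Synthetic stand-in data, not study data.
rng = np.random.default_rng(0)
n_lesions, n_features = 98, 5
y = np.zeros(n_lesions, dtype=int)
y[:25] = 1                       # 25 lesions in the minority (TNBC) class
X = rng.normal(size=(n_lesions, n_features))
X[y == 1] += 1.0                 # shift the minority class to create signal

# Five folds, repeated with different random partitions (3 here, 1000 in the text).
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
scores = cross_val_score(
    build_fine_gaussian_svm(n_features), X, y, cv=cv, scoring="roc_auc"
)
print(f"mean AUROC over {scores.size} test folds: {scores.mean():.3f}")
```

Placing the scaler inside the pipeline keeps the test fold out of the mean and standard deviation estimates; this is a design choice of the sketch and may differ from the order of operations used in the study.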

Reference Standard
Malignant tumor samples from core biopsy and/or surgical specimens were analyzed to define tumor histology, grade, and immunohistochemical status, including ER, PgR, and Ki-67 expression and overexpression and/or amplification of HER2, of each breast cancer lesion. The St. Gallen surrogate molecular subtype definitions were used to classify breast lesions [31].

Statistical Analysis
The Kolmogorov-Smirnov test was performed to assess whether data were normally distributed. Accordingly, the Mann-Whitney test or the independent t-test was performed to assess differences in terms of lesion size, quantitative parameters and radiomics parameters between TN and non-TN breast cancer subtypes. McNemar's test was used to assess differences in terms of diagnostic performance among the different radiomics models. p values ≤ 0.05 were considered statistically significant. Confidence intervals for diagnostic metrics were calculated using a bootstrapping approach. Statistical analysis was conducted using SPSS, Version 25.0. 2017 (IBM Corp, Armonk, NY, USA).
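To make the model comparison concrete, McNemar's test considers only the lesions on which two models disagree. The Python sketch below implements the exact (binomial) form of the test; whether the study used the exact or the chi-square form is not stated, so this choice is an assumption, and the function names and example labels are hypothetical.

```python
from math import comb


def discordant_counts(y_true, pred_a, pred_b):
    """Count lesions classified correctly by one model only."""
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p value from the discordant counts b and c.

    Under the null hypothesis, each discordant lesion is equally likely
    to favor either model, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # the two models never disagree
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2.0 * tail)


# Hypothetical labels and predictions, not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
model_a = [1, 1, 1, 0, 0, 0, 1, 1]
model_b = [0, 1, 0, 0, 1, 0, 1, 0]
b, c = discordant_counts(y_true, model_a, model_b)
print(b, c, mcnemar_exact(b, c))
```

Lesions that both models classify correctly, or both misclassify, do not enter the statistic, which is why the test suits paired comparisons on the same lesions.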

Patient Sample
According to the inclusion and exclusion criteria, 144 patients were initially enrolled in the study. Of these, 86 female patients (mean age 52 ± 13 years) were included in the final study sample, with 98 histologically proven BC lesions (mean size: 28.31 ± 16.

Feature Selection and Machine Learning Analysis
Eight radiomic models (Models 1-8) were developed for TNBC identification, using the following combinations of quantitative parameters and radiomic features: quantitative parameters alone (Model 1); radiomic features extracted from ADC (Model 2), DCE (Model 3), PET (Model 4) and T2-weighted (Model 5) images; combinations of radiomic features extracted from different 18F-FDG PET/MR images, namely DCE-MRI and ADC (Model 6), and DCE-MRI, ADC, and PET (Model 7); and quantitative parameters combined with radiomic features (Model 8). Quantitative parameters and/or radiomic features selected for each model are reported in Table 1.

Combinations of radiomic features
Among the models built using radiomic features extracted from individual 18F-FDG PET/MR images, the best performing one was Model 2, based on features extracted from ADC maps, which yielded an AUC of 0.826 (95% CI: 0.758-0.984). On the other hand, the worst performing one was Model 5, based on features extracted from T2-weighted images, which yielded an AUC of 0.725 (95% CI: 0.679-0.765). Accuracy metrics of all radiomic models are reported in Table 2.
A statistically significant difference in terms of diagnostic performance was found using McNemar's test between the best performing model (Model 7) and the worst performing one (Model 5) (p = 0.005). No differences were observed between Model 7 and the remaining radiomic models. Comparisons in terms of diagnostic performance among all models are reported in Supplementary Materials S6, while univariable results for the image features and quantitative parameters are presented in Supplementary Materials S7.

Discussion
In the present study, we evaluated AI-enhanced simultaneous 18F-FDG PET/MRI to distinguish TNBC from other molecular BC subtypes. To this end, we built ML-based predictive models employing radiomic features and/or quantitative parameters extracted from simultaneous 18F-FDG PET/MRI to non-invasively identify TNBC. Model 7, the best performing one (AUC of 0.887; 95% CI: 0.847-0.916), specifically included features extracted from functional images, i.e., ADC and PET, supporting the hypothesis that functional data such as tumor cellularity and metabolism may better depict biological tumor features compared to morphologic sequences. It is worth noting that no significant differences in terms of diagnostic performance were found between Model 7 and all other models except for Model 5, which was based solely on radiomic features extracted from T2-weighted images.
Such findings support the expectation that, in the near future, molecular data could be non-invasively obtained by imaging through the application of artificial intelligence tools. This issue is particularly relevant if we consider that 18F-FDG PET and MRI are already indicated for both local and global staging of locally advanced breast cancer as well as for treatment monitoring. As such, it is conceivable that, with a single imaging examination, tumor diagnosis, staging, and phenotyping could be obtained non-invasively at the same time. In this light, "virtual biopsies" could be performed once radiomic profiles specific to molecular subtypes have been defined, with the aim of capturing genetic and phenotypic alterations that are representative of the whole tumor and comprehensively describe tumor heterogeneity. Furthermore, the extraction of quantitative imaging data from the whole tumor could allow the spatio-longitudinal monitoring of biomarker heterogeneity changes during treatment and the early identification of clonal dynamics and genetic modifications related to the occurrence of drug resistance.
PET and MRI represent the most promising imaging modalities for this purpose, due to their ability to non-invasively inform on cancer metabolism, cellularity, and neoangiogenesis. Such properties are reported to be different between BC subtypes which exhibit different biological aggressiveness and behavior [32]. TNBC is characterized by higher glucose metabolism, which is reflected in its higher SUV values compared with other BC subtypes as seen in other studies [32,33] as well as in this investigation. However, quantitative parameters (e.g., DWI-ADC values) alone do not seem to be accurate enough to discriminate molecular BC subtype; thus, more sophisticated AI-enhanced approaches are necessary [34].
So far, several attempts have been made to non-invasively define BC molecular subtype through the use of radiomics and machine learning applied to PET and MRI [17][18][19]. Previous studies explored the feasibility and usefulness of radiomics applied to PET/CT or MRI (both DCE and DWI) features for the prediction of molecular subtype [17][18][19][20][35]. Only recently have radiomics and AI techniques been applied to simultaneous 18F-FDG PET/MRI for a comprehensive analysis of molecular subtype, Ki67 expression, nodal status, and presence of distant metastasis in a population of 124 BC patients [20]. For the prediction of molecular BC subtypes, Umutlu et al. built two accurate radiomic models to discriminate Luminal A vs. Luminal B BC (AUC: 0.978, 95% CI: 0.950-1.000) and Luminal BC vs. other subtypes (AUC: 0.950, 95% CI: 0.922-0.979), based on MRI and PET-derived radiomics features, respectively. In contrast, our best performing radiomics model included radiomic features extracted from both MRI and PET images; furthermore, our model was geared towards specifically identifying TNBC. This choice was driven by the particularly aggressive biological behavior of TNBC and the different biological development of TNBC compared with other BC subtypes, supposed to originate from luminal progenitors and breast epithelial stem cells, respectively, according to a cell-of-origin hypothesis [36]. However, according to a more recent hypothesis, Luminal and TNBC could have a common luminal progenitor and the latter, through a dedifferentiation process, could acquire a basal-like phenotype [37]. Furthermore, both types of tumor cells are supposed to be present in the same BC lesions, even if in different percentages, which also contributes to both intra- and intertumor heterogeneity [6]. Others have attempted to diagnose TNBC using radiomic signatures extracted from 300 pre-treatment post-contrast CT examinations [38].
Five radiomic features were selected, showing AUC values of 0.881 (95% CI: 0.781-0.921) and 0.851 (95% CI: 0.761-0.961) in the training and validation group, respectively. While this single study yielded comparable results, it has to be noted that CT is not an imaging modality that is recommended for breast imaging and has no clinical standing in breast cancer diagnosis and treatment monitoring.
Our study has several limitations to be acknowledged, the first being the relatively small sample size, as accessibility to simultaneous 18F-FDG PET/MRI was limited due to high demand from clinical needs. Furthermore, TNBC is relatively rare compared to other subtypes, thus larger numbers can only be recruited in a multi-centric setting within a reasonable time period. Due to this limitation, we refrained from using a subset of cases as a held-out test set. Instead, five-fold cross-validation was used, as done in previous studies, to provide a preliminary assessment of how the model would apply to an unseen population [21,39]. Potentially, fewer parameters for each model might have been preferable but, considering the 25 cases included in the minority class, this equates to 5 cases per feature, which was deemed acceptable. A known limitation of LASSO is that in case of highly correlated features, the selection of features among those that are correlated can be random, or at least noisy. By employing LASSO in a cross-validated fashion, this effect is reduced to a certain extent. However, it is noted that rerunning the LASSO process may well result in different sets of selected features. Furthermore, findings from this single-institution study have to be further tested and validated on external cohorts of patients, preferably in the setting of multi-center investigations, which are currently being planned to overcome the above-mentioned drawbacks.

Conclusions
AI-enhanced simultaneous 18F-FDG PET/MRI can non-invasively identify TNBC, the most aggressive tumor type requiring intensified treatment, with high accuracy. Additional investigations on larger cohorts of patients are necessary to validate our model and fully assess its generalizability.

Informed Consent Statement:
Written informed consent has been obtained from the patients to publish this paper.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.