Measurement of Perfusion Heterogeneity within Tumor Habitats on Magnetic Resonance Imaging and Its Association with Prognosis in Breast Cancer Patients

Simple Summary A habitat analysis reflects intratumoral heterogeneity more accurately than does a whole-tumor analysis. Perfusional heterogeneity using a habitat analysis is a rarely explored option and can affect patient outcomes. From two hospitals, 308 and 147 patients with invasive breast cancer who underwent preoperative MRI were included as development and validation cohorts, respectively. In our study, five habitats with distinct perfusion patterns were identified based on early and delayed phases of dynamic contrast material-enhanced MR images. A habitat risk score (HRS) was an independent risk factor for predicting worse disease-free survival outcomes in the HRS-only risk model (hazard ratio = 3.274 [95% CI = 1.378–7.782]; p = 0.014) and combined habitat risk model (hazard ratio = 4.128 [95% CI = 1.744–9.769]; p = 0.003) in the validation cohort. Abstract The purpose of this study was to identify perfusional subregions sharing similar kinetic characteristics from dynamic contrast-enhanced magnetic resonance imaging (MRI) using data-driven clustering, and to evaluate the effect of perfusional heterogeneity based on those subregions on patients’ survival outcomes in various risk models. From two hospitals, 308 and 147 women with invasive breast cancer who underwent preoperative MRI between October 2011 and July 2012 were retrospectively enrolled as development and validation cohorts, respectively. Using the Cox-least absolute shrinkage and selection operator model, a habitat risk score (HRS) was constructed from the radiomics features from the derived habitat map. An HRS-only, clinical, combined habitat, and two conventional radiomics risk models to predict patients’ disease-free survival (DFS) were built. Patients were classified into low-risk or high-risk groups using the median cutoff values of each risk score. Five habitats with distinct perfusion patterns were identified. An HRS was an independent risk factor for predicting worse DFS outcomes in the HRS-only risk model (hazard ratio = 3.274 [95% CI = 1.378–7.782]; p = 0.014) and combined habitat risk model (hazard ratio = 4.128 [95% CI = 1.744–9.769]; p = 0.003) in the validation cohort. In the validation cohort, the combined habitat risk model (hazard ratio = 4.128, p = 0.003, C-index = 0.760) showed the best performance among five different risk models. The quantification of perfusion heterogeneity is a potential approach for predicting prognosis and may facilitate personalized, tailored treatment strategies for breast cancer.


Introduction
Solid tumors are genomically, immunologically, and physiologically heterogeneous. These heterogeneous tumors are less likely to have durable responses to targeted and immune therapies [1][2][3] because treatment response is not uniform across the tumor, and therapy resistance occurs in different tumor regions. The complex vasculature within the tumor would lead to intratumoral heterogeneity. Tumors are supposed to have chaotic vasculature which leads to high permeability, and varying degrees of perfusion and oxygenation, and which has been proposed to be a major driver of the evolution of tumor heterogeneity at the genomic level and causes different microenvironment [4,5].
Since radiomics have been commonly used to measure intratumoral heterogeneity, radiomics analyses have conventionally been conducted for the whole tumor, and this approach assumes that the tumor is heterogeneous; however, the heterogeneity is evenly distributed throughout the tumor, thus neglecting regional phenotypic variations within a tumor [6]. Current tumor analyses using histograms [7] or radiomics analysis [8] focus on quantifying the heterogeneity and complexity by calculating the relationship between voxels [9]. In contrast to prior methodology, an emerging approach explicitly divides tumors into subregions containing clusters of voxels with similar characteristics, often called habitats, which allow better quantification of the intratumoral heterogeneity [10][11][12]. Habitat imaging is based on the speculation that identified subregions comprising voxels sharing similar imaging characteristics would share a common tumor biology [9,13]. Under this concept, Wu et al. [11] quantified intratumoral heterogeneity using dynamic contrast material-enhanced (DCE) magnetic resonance imaging (MRI) of breast cancers in neoadjuvant chemotherapy (NAC) settings to predict recurrence-free survival. They applied four metrics of DCE time-activity curves in each voxel and then used consensus clustering to divide the tumor into subregions.
Recently, Kim et al. [14] reported that higher values of kinetic heterogeneity, and peak enhancement determined using a commercially available computer-aided diagnosis (CAD) of preoperative MRI on the whole-tumor, were associated with worse recurrence outcomes in women with invasive breast cancer. Consequently, we hypothesized that intratumoral perfusion heterogeneity based on subregions of breast cancer derived from the kinetic features of DCE MRI maps would exhibit a better correlation with patient outcomes than that obtained from the conventional whole tumor. Therefore, the purpose of our study was to identify perfusional subregions sharing similar kinetic characteristics from DCE MRI using data-driven clustering, and to evaluate the effect of perfusional heterogeneity based on those subregions on patients' survival outcomes in various risk models.

Patients
This retrospective multicenter study was conducted in accordance with the Declaration of Helsinki and was approved by the institutional review board of Samsung Medical Center (SMC 2017-08-136) and Gil Hospital (GDIRB 2016-088). The requirement for informed consent was waived due to the retrospective nature of the study and the analysis used anonymous clinical data.
Our cohort comprised breast cancer patients who had undergone surgery for invasive breast cancer from two hospitals between October 2011 and July 2012. Patients from Samsung Medical Center (SMC) were used as the development cohort, and patients from the Gil Hospital (GH) were used as the validation cohort. The flowchart in Figure 1 provides an overview of the datasets used in this study. formed in patients after the diagnosis of cancer by vacuum-assisted or excisional biopsy (n = 42); (b) MRI performed in patients treated with NAC (n = 92); (c) patients with a preexisting malignancy in another organ (metastasis or primary malignancy) (n = 9); (d) involvement of any resection margin at final pathology (n = 16); (e) non-visualization of known breast cancer (n = 21); and (f) MRI quality inadequate for further processing (n = 14). Finally, 455 cancers in 455 women (mean age, 51 years; range, 24-85 years) were included. Figure 1. Flow chart of the study population. In total, 455 patients were included according to the inclusion and exclusion criteria from two hospitals.

MRI Protocol
For the SMC cohort, all MRI scans were performed on a 1.5 T scanner from Philips (Achieva, Philips Healthcare, Best, The Netherlands). The MRI examination comprised turbo spin-echo T1-and T2-weighted sequences and a fat-suppressed 3-dimensional dynamic contrast-enhanced (DCE) sequence. Image subtraction was performed after the dynamic series. The DCE-MRI scans were acquired using the following parameters: TR/TE, 6.5/2.5; slice thickness, 3 mm; flip angle, 10°; matrix size, 376 × 374; and field of view, 32 × 32 cm. DCE-MRI was performed using axial imaging with one pre-contrast and six postcontrast dynamic series. After contrast injection, contrast-enhanced images were acquired at 0.5, 1.5, 2.5, 3.5, 4.5, and 5.5 min. A 0.1 mmol/kg bolus of gadobutrol (Gadovist; Bayer Healthcare Pharmaceutical, Berlin, Germany) was injected, followed by a 20 mL saline flush.
For the Gil cohort, all MRI scans were performed using a 3.0 T Philips scanner (Achieva, Philips Healthcare, Best, The Netherlands). The MRI examination consisted of turbo spin-echo T1-and T2-weighted sequences and a fat-suppressed 3-dimensional DCE sequence. Image subtraction was performed after the dynamic series. The DCE-MRI scans were acquired using the following parameters: TR/TE, 5.5/2.8; slice thickness, 3 mm; flip Our inclusion criteria were as follows: (a) preoperative DCE MRI, (b) initial unilateral breast malignancy with a final pathologic diagnosis of invasive breast cancer, and (c) lesion presenting as a mass on MRI. The exclusion criteria were as follows: (a) MRI performed in patients after the diagnosis of cancer by vacuum-assisted or excisional biopsy (n = 42); (b) MRI performed in patients treated with NAC (n = 92); (c) patients with a pre-existing malignancy in another organ (metastasis or primary malignancy) (n = 9); (d) involvement of any resection margin at final pathology (n = 16); (e) non-visualization of known breast cancer (n = 21); and (f) MRI quality inadequate for further processing (n = 14). Finally, 455 cancers in 455 women (mean age, 51 years; range, 24-85 years) were included.

MRI Protocol
For the SMC cohort, all MRI scans were performed on a 1.5 T scanner from Philips (Achieva, Philips Healthcare, Best, The Netherlands). The MRI examination comprised turbo spin-echo T1-and T2-weighted sequences and a fat-suppressed 3-dimensional dynamic contrast-enhanced (DCE) sequence. Image subtraction was performed after the dynamic series. The DCE-MRI scans were acquired using the following parameters: TR/TE, 6.5/2.5; slice thickness, 3 mm; flip angle, 10 • ; matrix size, 376 × 374; and field of view, 32 × 32 cm. DCE-MRI was performed using axial imaging with one pre-contrast and six post-contrast dynamic series. After contrast injection, contrast-enhanced images were acquired at 0.5, 1.5, 2.5, 3.5, 4.5, and 5.5 min. A 0.1 mmol/kg bolus of gadobutrol (Gadovist; Bayer Healthcare Pharmaceutical, Berlin, Germany) was injected, followed by a 20 mL saline flush.
For the Gil cohort, all MRI scans were performed using a 3.0 T Philips scanner (Achieva, Philips Healthcare, Best, The Netherlands). The MRI examination consisted of turbo spinecho T1-and T2-weighted sequences and a fat-suppressed 3-dimensional DCE sequence. Image subtraction was performed after the dynamic series. The DCE-MRI scans were acquired using the following parameters: TR/TE, 5.5/2.8; slice thickness, 3 mm; flip angle, 18 • ; matrix size, 424 × 424; and field of view, 34 × 34 cm. The DCE-MRI was performed using axial imaging, with one pre-contrast and five post-contrast dynamic series. Contrastenhanced axial images were acquired at 1.5, 3, 4.5, and 6 min after contrast injection. A delayed sagittal image was obtained 8 min after the contrast injection. A 0.1 mmol/kg bolus of gadoterate meglumine (Dotarem; Guerbet, Villepinte, France and Clariscan; GE Healthcare, Oslo, Norway) was injected for dynamic contrast enhancement, followed by a 20 mL saline flush.

Clinicopathological Evaluation
The MRI findings were retrospectively evaluated according to the American College of Radiology Breast Imaging Reporting and Data System MR Lexicon [15] by two boardcertified radiologists (S.Y.N. and E.S.K., with 12 and 15 years of experience in breast MRI, respectively). The radiologists assessed the shape (oval, round, irregular), margin (circumscribed, irregular, spiculated), and internal enhancement characteristics (homogeneous, heterogeneous, rim, dark internal septation) of each mass.
The final histopathological results of surgical specimens were reviewed to determine the following: pathologic diagnosis, histologic grade, presence of an extensive intraductal component (EIC), presence of lymphovascular invasion, estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), p53, and Ki-67 expression status. Tumors with HER2 scores of 3+ (strong homogeneous staining) were considered positive. In the case of 2+ scores (moderate complete membranous staining in ≥1% of tumor cells), silver in situ hybridization was used to determine HER2 amplification. For convenience, the pathologic diagnoses were divided into three groups: invasive ductal carcinoma, invasive lobular carcinoma, and others.
The endpoint of our study was disease-free survival (DFS), which was defined as the time from the date of surgery to that of the first recurrence of the disease, of death, of the last-known evidence of the absence of disease, or of the most recent follow-up. Disease recurrence was defined as the outcome of breast cancer recurrence (local, regional, or distant) or new primary contralateral breast cancer (invasive or ductal carcinoma in situ). Patient medical records were used to obtain information regarding patient follow-up and recurrence status. Patients who did not have recurrence at the last follow-up or were lost to follow-up were treated as censored observations in the analyses. The last follow-up date was 1 September 2020.

Tumor Segmentation and Preprocessing
The pre-enhanced T1-weighted, early (1 min 30 s after contrast injection, respectively) and delayed (5 min 30 s for SMC, 6 min for GH after contrast injection, respectively) phases of contrast-enhanced T1-weighted MR images were retrieved from the Picture Archiving Communication System and loaded onto a workstation for further analysis. A region of interest (ROI) was manually drawn around the entire visible tumor on the early phase of contrast-enhanced T1-weighted images by a radiologist with 15 years of experience in breast MRI (E.S.K.) who was blinded to the clinical and pathological findings but was aware of the diagnosis of invasive carcinoma. The defined ROI was co-registered onto three other MRI series with a nine-parameter affine transform using mutual information as the similarity measure. The co-registration process allowed the researcher to define the ROI once and apply it to other imaging series of the same patient in a consistent manner. The ROI was drawn to be as large as possible but did not include edge voxels to avoid a partial volume effect. In the case of multifocal or multicentric disease, the largest tumor was selected as the index cancer for the analysis. Another radiologist (H.K.) drew another set of ROIs for a randomly selected 48 patients to assess interobserver reproducibility in terms of the intra-class correlation coefficient (ICC).
Each original imaging data were resampled to a 1 × 1 × 1 mm 3 isotropic resolution using B-splice interpolation for accounting of the resolution differences in imaging resolutions. The ROIs defined in the original imaging data were resampled using nearest-neighbor interpolation on the isotropic resolution data. To harmonize the MRI intensity characteristics between the two cohorts, histogram-matching was applied to the validation cohort for conformance with the development cohort.

Generation of Perfusion Maps
Three perfusion parametric maps were constructed using pre-contrast, early, and delayed-phase images of the DCE MRI of the development cohort. The wash-in map (E in ) was generated by subtracting the pre-contrast image from the early phase image. The washout map (E out ) was generated by subtracting the delayed phase image from the early phase image. The washout ratio map (R WO ) was measured as the ratio between the signal intensity difference from the early phase to the delayed phase for the early phase. If the signal intensity of the delayed phase was increased compared to that of the early phase, it was considered as no washout. Each perfusion map was calculated on a voxel-by-voxel basis using the following equation:

Identifying Distinct Subregions Based on Perfusion Features with Population-Level Clustering
A vector encompassing three perfusion features of each voxel (wash-in, washout, and washout ratio values) was defined as the perfusion feature vector (PFV). Each feature was quantized using a 256-bin histogram covering from the minimum to the maximum of each feature. For the development cohort, the PFVs of all patients were collected. We applied the k-means clustering algorithm with k values of 2 to 32 to identify a group of voxels with similar perfusion characteristics. Clustering was applied at the cohort level, not at the patient level, to ensure that clustering assignment remained consistent across patients. The Euclidean distance was used as the clustering cost measure. To select the optimal number of clusters (i.e., habitats), the clustering results were evaluated using the averaged Calinski-Harabasz score and Silhouette coefficient for each k value through 100 repetitions. The cluster centers from the development cohort were propagated to the validation cohort to ensure the application of the same clustering.
To investigate the characteristics of each subregion, we used box-and-whisker plots of three perfusion features and illustrated the kinetic profile of each subregion using the PFV centroid in a time-intensity curve. The portion of each subregion was also measured in relation to the total tumor volume.

Habitat Risk Score Building
To measure perfusion heterogeneity, we constructed a habitat map in which the intensity values of voxels were replaced with the previously identified habitat index. For example, if we identified five habitats, the habitat index varied from one to five. We adopted the concept of quantitative imaging features from radiomics studies to measure the spatial heterogeneity between the identified subregions. In total, 58 radiomics features were calculated using PyRadiomics from the habitat map.
To evaluate the effect of perfusion heterogeneity measured from the habitat map using the radiomics technique for predicting patients' survival outcomes, a habitat risk score (HRS) was constructed from the calculated features, which also included the proportions of each habitat.
For this study, 4histograms, 24 gray-level co-occurrence matrix (GLCM), and 16 graylevel size zone matrix (GLSZM) features were calculated using PyRadiomics from the habitat map. We also computed the proportion of each habitat (i.e., the volume of each habitat divided by the tumor volume). Because the habitat map has only a few unique index values, the bin size was set to one for all feature calculation processes. Fourteen shape-based features from the entire tumor ROI were also computed using PyRadiomics. Each feature was z-score normalized based on the development cohort's mean and standard Cancers 2022, 14, 1858 6 of 18 deviation (SD). The L1-norm regularized Cox proportional hazard model (Cox-LASSO) was used to select features to build the HRS for DFS. The optimal coefficients were determined using a nested 10-fold cross-validation and grid search process. The HRS was defined as the relative risk at the initial time according to the following equation: where h(X i , 0) denotes the initial hazard of the ith patient whose habitat feature vector is X i , x ij denotes the jth prognostic habitat feature of the ith patient, n denotes the number of selected features, and β j denotes the corresponding Cox-LASSO coefficient of the jth habitat feature. The same HRS model was applied to the validation cohort, fixing the model parameters and using the feature values from the validation cohort to obtain the HRS for the validation cohort. We further conducted a Wilcoxon rank-sum test to compare the habitat risk score of the original set of ROIs with the additional set of ROIs which were randomly selected from 48 patients to assess interobserver reproducibility.

Validation of HRS and Comparison of Performances among Different Risk Models
The Cox-least absolute shrinkage and selection operator (LASSO) model was used to analyze the effects of clinicopathological variables (age, histologic grade, pathologic type, T stage, N stage, EIC, lymphovascular invasion, ER, PR, HER2, p53, Ki-67, adjuvant chemotherapy, adjuvant radiation therapy, and adjuvant endocrine therapy); radiological variables (MRI findings of mass shape, mass margin, and internal enhancement pattern); and HRS on DFS.
To demonstrate the value of the HRS, the HRS-only, clinical, and combined habitat risk models were constructed. To evaluate the additive effect of HRS for predicting survival outcome in the clinical risk model, the combined habitat risk model included 17 clinicopathological and radiological variables and HRS. To compare the predictive ability of the habitat-based method with the whole tumor-based method, we conducted two additional radiomics analyses on the DCE MRI of three phases and three perfusion maps generated from the DCE MRI. For these analyses, a total of 72 features were extracted. The radiomics risk score was also calculated using the Cox-LASSO model in the development cohort.
The potential association of each risk score with DFS was first assessed in the development cohort and then validated in the validation cohort. Patients were classified into low-risk or high-risk groups using the median values of each risk score as cutoff values, which were also used for the validation cohort. The hazard ratio, p-value, and concordance index (C-index) were measured for all risk models, and we compared them among different models.

Statistical Analysis
Patient characteristics in the development and validation cohorts were compared using a Student's t-test for continuous variables and a chi-squared test or Fisher's exact test for categorical variables. The characteristics of patients according to risk groups in the development cohort were also compared using Student's t-test for continuous variables and chi-squared or Fisher's exact tests for categorical variables. Different risk prediction models were compared with C-index values using the paired t-test [16]. All statistical analyses were conducted using the Statistics and Machine Learning Toolbox in MATLAB (R2019b).

Patient Characteristics and Outcomes
Characteristics of the development and validation cohorts were compared, and the results are given in Table 1.
Three-hundred and fifty-five (355/455, 78.0%) patients underwent breast-conserving surgery, and mastectomy was performed in 100 (100/455, 22.0%). There were forty-nine recurrences (twenty-three local-regional, nine contralateral breast, and seventeen distant recurrences) after a mean follow-up period of 84.1 months (range, 5-108 months). The mean time to recurrence was 39.3 months (range, 6-91 months). One patient had a recurrence within the first 6 months of follow-up, possibly due to residual disease.

Identified Subregions Based on Perfusion Features and Habitat Risk Score
The interobserver reliability in ROIs measured in intra-class correlation coefficient was on average 0.9237 over the habitat radiomics features.
The optimal number of clusters was determined to be five based on the Calinski-Harabasz score and Silhouette coefficient ( Figure 2). Resultantly, five subregions were determined in the development cohort, which were also applied in the validation cohort ( Figure 3). Table 2 shows the cluster center and the proportion of each subregion among all voxels in the development cohort. Figure 4 shows the detailed distribution of the three perfusion features used (wash-in, washout, and washout ratio) in the development cohort and the characteristic illustration of each habitat using a time-intensity curve. In the early phase of contrast enhancement, habitats 2 and 3 showed strong wash-in, whereas habitats 1, 4, and 5 showed less wash-in. In the delayed phase of enhancement, habitat 1 showed a persistent pattern; habitats 2, 4, and 5 showed a washout pattern; and habitat 3 showed a plateau pattern. For each subregion, habitat 1 was the most predominant (41.9%), followed by habitats 3 (25.6%) and 4 (23.5%) ( Table 2). The HRS was constructed using selected variables, mostly from texture features (Table 3). Notably, the proportion of habitats was not selected as a prognostic factor.          The median HRS in the development cohort was 1.067 (range, 0.322-1.691; interquartile range, 0.823-1.255). Using this threshold value, the patients were classified into a highrisk (HRS ≥ 1.067) and a low-risk group (HRS < 1.067). The patient characteristics in the development cohort according to risk groups are shown in Table 4. In the development cohort, a higher T stage (p < 0.001), higher N stage (p < 0.001), higher histologic grade (p <  The median HRS in the development cohort was 1.067 (range, 0.322-1.691; interquartile range, 0.823-1.255). Using this threshold value, the patients were classified into a highrisk (HRS ≥ 1.067) and a low-risk group (HRS < 1.067). The patient characteristics in the development cohort according to risk groups are shown in Table 4. In the development cohort, a higher T stage (p < 0.001), higher N stage (p < 0.001), higher histologic grade (p < 0.001), presence of lymphovascular invasion (p < 0.001), ER negativity (p < 0.001), PR negativity (p < 0.001), and HER2 positivity (p = 0.001) were associated with the high-risk group (Figures 5 and 6). The HRS of the original set of ROIs and the additional set were comparable with a p-value of 0.7278.
Cancers 2022, 14, x FOR PEER REVIEW 11 of 18 0.001), presence of lymphovascular invasion (p < 0.001), ER negativity (p < 0.001), PR negativity (p < 0.001), and HER2 positivity (p = 0.001) were associated with the high-risk group (Figures 5 and 6). The HRS of the original set of ROIs and the additional set were comparable with a p-value of 0.7278.        Table 3 lists the features selected in various risk models using the Cox-LASSO model, where the selected features had non-zero coefficients. The coefficients represented the relative strength of the selected features and were reported in the third column. Higher N stage, presence of lymphovascular invasion, ER negativity, PR negativity, and Ki-67 ≥ 20% were selected for the clinical risk model as risk factors significantly associated with worse survival outcomes. The results of the combined habitat risk model showed that a higher HRS was associated with worse outcomes. The cutoff values of all risk models, including the two radiomics risk models, are presented in Table 5.

Performance and Validation of the HRS
The risk stratification performance of each risk model is presented in Table 6. Higher radiomics risk scores calculated from two conventional radiomics risk models were independent risk factors for worse survival outcomes in the development cohort, but they were not reproducible in the validation cohort. The HRS was an independent risk factor for predicting worse outcomes in the HRS-only (hazard ratio = 3.274 [95% CI = 1.378-7.782]; p = 0.014) and combined habitat risk models (hazard ratio = 4.128 [95% CI = 1.744-9.769]; p = 0.003). When comparing the performance of the HRS-only risk model with that of the radiomics risk model based on perfusion maps, the C-index of the HRS-only risk model was 0.699, which was better than that of the radiomics_DCE_MR (C-index = 0.537) and radiomics_perfusion risk models (C-index = 0.640). The clinical risk model showed better performance than that of the HRS-only and the two radiomics risk models. When the HRS was combined with the clinical risk model, there was an improvement (C-index for combined habitat risk model = 0.760 vs. C-index for clinical risk model = 0.748, respectively), although it failed to show a statistical significance in both the development and validation cohorts (p = 0.342 and 0.456, respectively). The p-value in each row refers to the p-value of the hazard ratio model. The p-value * is the model comparison between each model with the combined habitat model using the C-index values.

Discussion
Unlike previous radiomics studies on whole tumors, an emerging approach explicitly aimed at identifying distinct tumor areas or cell subpopulations is commonly called habitat imaging, which has been reported to reveal aggressive subregions that are important for determining prognosis and treatment response [6,[17][18][19]. The use of complex signatures from multi-dimensional information and a radiomics analysis is a common framework for habitat imaging. Relative habitat volumes derived from clustering have been reported as predictors of survival [20,21]. Regarding histologic validation of the radiological habitat, some authors have reported detailed preclinical results through per-pixel spatial co-registration of the images and corresponding histologic findings of hypoxia, necrosis, and other conditions [22].
In this study, we focused on the spatial heterogeneity of kinetic profiles and hypothesized that intratumoral spatial heterogeneity might be reflected in the tumor enhancement kinetics of DCE MRI, and that this characteristic could be quantified using data-driven clustering analyses. As a relatively easy approach to measure perfusion heterogeneity, some authors have utilized CAD because it is popularly used and automatically provides quantitative values. They have revealed significant associations between peak enhancement or washout components as determined by CAD on preoperative MRI and recurrencefree survival in breast cancer patients [23,24]. Furthermore, a recent study conducted by Kim et al. [14] adopted the radiomics concept to evaluate the effect of kinetic patterns on survival outcomes. They found significant differences in early peak enhancement and delayed enhancement profiles, as determined by the CAD of preoperative breast MRIs, between the distant and non-distant metastasis groups. Importantly, they also noted a higher degree of kinetic heterogeneity in the distant metastasis group. However, they only used the kinetic characteristics of the delayed phase of enhancement (persistent, plateau, washout) and calculated the entropy measured from the amount of each delayed kinetic pattern to measure kinetic heterogeneity. As we believe that kinetic heterogeneity should include the characteristics of the early phase of enhancement as well, and consider not only the magnitude of each pattern but also their distribution, we proposed the HRS, which was defined as a measurement of heterogeneity in the habitat map that considered three perfusion features reflecting spatial heterogeneity among five distinct habitats. In our study, we identified five distinct intratumoral subregions combining kinetic profiles of both early and delayed phases of DCE MRI, and a higher HRS was an independent risk factor for predicting poor survival outcomes in the validation cohort. As far as we know, this study is the first to divide early and delayed phases by grouping them together. If the amount of washout is large, the prognosis is likely to be poor; however, it was not, and the most common type was type 1.
Interestingly, the proportion of each habitat did not affect the survival outcomes, which means that the distribution heterogeneity of habitats may be more important than the amount of each habitat. Our results could partially explain the previous results using CAD reporting inconsistent effects of kinetic patterns (i.e., washout component) with relatively low hazard ratios [14,23,24]. For example, in a study conducted by Kim et al. [23], a multivariate Cox analysis showed that a higher peak enhancement (hazard ratio = 1.001; p = 0.004) and a higher washout component (hazard ratio = 1.029; p = 0.017) were associated with poorer DFS. Conversely, in a study by Nam et al. [24], although the mean value of the washout component was higher in the recurrence group than in the non-recurrence group (39.19% vs. 38.08%, respectively), there was no significant difference in the DFS between the two groups (hazard ratio = 1.001; p = 0.834). A multivariate analysis revealed that a higher peak enhancement (hazard ratio = 1.004; p = 0.013) was independently associated with worse DFS outcomes. A more recent study by Kim et al. [14] reported that greater kinetic heterogeneity (hazard ratio = 19.2; p < 0.001) and higher peak enhancement (hazard ratio = 1.001; p = 0.045) were associated with worse distant metastasis-free survival in women with invasive breast cancer. Compared with these studies, our results showed that HRS was consistently an independent risk factor for poorer survival in the validation cohort (hazard ratio = 3.274; p = 0.014). We can confirm our hypothesis that spatial heterogeneity as well as magnitude of heterogeneity are important.
Several key aspects differentiate our study from previous studies. First and foremost, this was the first study to identify distinct patterns of enhancement profiles both in the early and delayed phases of enhancement, to quantify them from derived maps based on perfusional features, and to identify the relationship between perfusion heterogeneity and survival outcomes in the preoperative setting. Second, we applied well-grounded statistical principles to derive our habitat results. Third, we performed meticulous analyses to prove the better performance of the habitat-based method than that of the whole-tumor-based method. For this purpose, we conducted two additional radiomics analyses and compared the performances of the five different models. Consequently, radiomics analyses based on features calculated from the whole tumor (both on DCE MRI and perfusion maps) did not consistently enable the prediction of DFS in our study. Through these processes, our results provided evidence that a habitat-based analysis more robustly and consistently reflected the tumor characteristics than did the whole tumor-based analysis. One possible reason for the better performance of the habitat-based method could be the limitation of the voxel-based quantification of the whole-tumor approach, which can be easily affected by scan-related circumstances such as inhomogeneous fat suppression or blurring. However, habitat-based quantification lumps together similar voxels and is thus more robust in such circumstances.
Our study had several limitations. First, although we called our five distinct subregions perfusion habitats, strict pathological correlations with image-based segmentations were lacking. However, such correlations are difficult to achieve. It is also close to a radiologic subgroup because the amount of each habitat did not affect the prognosis. Second, although the study results were validated in a separate cohort, only MRIs from the same vendor were utilized in this study. Technical factors such as field strength, repetition time, echo time, and flip angle might have influenced the results despite normalized imaging. Therefore, additional studies are required to confirm and validate our findings. However, 1.5T and 3T were mixed even though it was the same vendor. Due to the difference in scanners (1.5T in the development and 3T in the validation cohorts), the raw intensities could be different between scanners/cohorts. Thus, we mitigated this issue by matching the intensity histogram of the validation cohort to that of the development cohort for three phases (i.e., pre-contrast, early, and delay phases). This effectively normalized the intensity distribution of the validation cohort. Thirdly, patients' characteristics between the two hospitals were very different. We speculate that it was probably due to the different nature of each hospital. However, our hypothesis was proven with statistical significance despite the inhomogeneous patient cohort. Finally, the technology used in this study seems very complex, making this technique seem impractical. To obtain greater clinical relevance, the development of easy-to-use software is warranted. However, our study provided a proofof-concept. Similar to the developmental process of artificial intelligence for mammography, accumulating large-scale data using various machines or scanning parameters would make this technique practical.

Conclusions
In conclusion, we quantified the spatial heterogeneity of the perfusional features of breast cancer on a derived habitat map from preoperative MRI scans. We identified five intratumoral subregions with distinct perfusion characteristics in breast cancer and showed that the spatial heterogeneity among the subregions was more important than the amount of each subregion. In addition to the clinical and pathologic factors, perfusion heterogeneity, defined by the spatial heterogeneity between perfusion habitats, was an independent predictor of DFS. The quantification of perfusion heterogeneity is a potential approach for predicting prognosis and can potentially lead to personalized, tailored treatment strategies for breast cancer.