Machine Learning Predicts Pathologic Complete Response to Neoadjuvant Chemotherapy for ER+HER2- Breast Cancer: Integrating Tumoral and Peritumoral MRI Radiomic Features

Background: This study aimed to predict pathologic complete response (pCR) to neoadjuvant chemotherapy in ER+HER2- locally advanced breast cancer (LABC), a subtype with limited treatment response. Methods: We included 265 ER+HER2- LABC patients (2010–2020) with pre-treatment MRI, neoadjuvant chemotherapy, and confirmed pathology. Patients were divided into training and validation cohorts according to the date of the pretreatment MRI, with January 2016 as the cutoff. Volumes of interest (VOI) for the tumoral and peritumoral regions were segmented on preoperative MRI from three sequences: T1-weighted early and delayed contrast-enhanced sequences and T2-weighted fat-suppressed sequence (T2FS). We constructed seven machine learning models using tumoral, peritumoral, and combined texture features within and across the sequences, and evaluated their pCR prediction performance using AUC values. Results: The best single sequence model was SVM using a 1 mm tumor-to-peritumor VOI in the early contrast-enhanced phase (AUC = 0.9447). Among the combinations, the top-performing model was K-Nearest Neighbor, using 1 mm tumor-to-peritumor VOI in the early contrast-enhanced phase and 3 mm peritumoral VOI in T2FS (AUC = 0.9631). Conclusions: We suggest that a combined machine learning model that integrates tumoral and peritumoral radiomic features across different MRI sequences can provide a more accurate pretreatment pCR prediction for neoadjuvant chemotherapy in ER+HER2- LABC.


Introduction
Breast cancer is the most common female cancer worldwide [1,2]. Among the varied subtypes, ER+HER2- breast cancer has consistently increased in number since its incidence surpassed that of ER- breast cancer in 1950 [2]. Overall, ER+HER2- breast cancer has a good prognosis compared with other breast cancer subtypes. However, owing to its high incidence, it is the subtype that accounts for the highest proportion of breast cancer mortality [2].
The treatment of ER+HER2- breast cancer can be broadly divided into that for early-stage breast cancer with no lymph node (LN) involvement and that for advanced-stage breast cancer with LN involvement. Currently, for early-stage breast cancer without LN involvement, the Oncotype DX breast recurrence score, a quantitative measurement based on real-time PCR, is used to predict the response to chemotherapy in patients requiring adjuvant chemotherapy [3]. For locally advanced breast cancer (LABC) with LN involvement, the standard protocol has so far been surgery following the completion of neoadjuvant chemotherapy (NAC) [4,5]. With NAC, the lesion size is first reduced to improve operability, and notably, the pathologic complete response (pCR) at surgery after NAC has been proven to be a powerful prognostic factor for patients' long-term outcomes [6][7][8].
Nonetheless, ER+HER2- breast cancer responds poorly to NAC compared with other molecular subtypes of breast cancer [4,5]. According to a meta-analysis, the pCR rate of LABC at surgery following NAC varied from 26.5% to 39.0% in other molecular subtypes, whereas the ER+HER2- subtype showed a significantly lower rate of 7.2% to 13.0% [9]. Given the low NAC response in ER+HER2- LABC patients, it is crucial, from a precision medicine perspective, to selectively administer NAC to the approximately 10% of patients who respond favorably to it. This approach can help reduce unnecessary suffering for the 90% of patients who do not respond well to NAC, mitigating issues such as drug toxicity and delayed surgery. However, to date, there has been no reliable method for predicting this response, leading to uniform treatment strategies for all patients. Therefore, this study focused on identifying, at an early stage, the minority of patients with ER+HER2- LABC who respond favorably to NAC and have the potential to achieve a pCR at surgery.
We aimed to use magnetic resonance imaging (MRI), a representative modality for evaluating treatment response, to classify patients and potentially provide clinical assistance [6,7,10]. Initially, MRI assessed treatment response through lesion characteristics [11], but technological advances have expanded predictive techniques, such as multiparametric MRI, magnetic resonance spectroscopy (MRS), and FDG-PET [7,12]. Nevertheless, designing accurate and reproducible parameters remains a challenge. Recent research has actively explored MRI radiomics, utilizing texture features [3,10]. MRI texture analysis (TA) offers objective assessment by quantifying tissue heterogeneity that is often imperceptible visually [13]. This has led to efforts to enhance reproducibility through machine learning models trained using a selected set of key texture features [14].
Most previous studies that used MRI to predict pCR among patients with breast cancer share two general aspects regarding the imaging used and the region of interest. First, many studies have attempted to explain treatment responses based on temporal changes in lesions between the initial MRI and the early or mid-term MRI after NAC [15,16]. However, such a prediction based on a comparison between the initial lesion and the residual lesion in a follow-up MRI has been reported to over- or underestimate the lesion due to various changes associated with the treatment response [17]. More importantly, the decreased quality of life experienced by patients who receive unnecessary treatment or whose appropriate treatment is delayed should be considered. Thus, this study focused on refining the prediction of pCR using pretreatment MRI. Second, many previous studies evaluating lesions on MRI focused mainly on the tumor region [18][19][20]; in this study, the peritumoral region was also evaluated and analyzed. Several studies have shown that the peritumoral region can be critical to the response to NAC by reflecting angiogenic or lymphangiogenic activity [21,22]. This study therefore placed particular emphasis on confirming the importance of the peritumoral region. Thus far, a few recent studies have attempted to use pretreatment MRI only or have included the peritumoral region as a consideration [21,23]. However, these studies considered the ER+HER2- subtype, the focus of this study, only as a part of the study population, and no study has yet investigated the pretreatment prediction of NAC response with a focus on the ER+HER2- subtype.
Therefore, this study aimed to develop and validate a reproducible, practical machine learning model with texture features incorporating both tumoral and peritumoral regions across initial pretreatment MRI sequences in patients with ER+HER2- LABC, whose NAC response is notably low. Through this study, we hope to provide practical help to clinicians in establishing tailored therapeutic strategies by stratifying this patient population prior to treatment.

Patient Population and Study Design
This retrospective study was approved by the institutional review board of our hospital, which waived the requirement for informed consent.
Between January 2010 and December 2020, 2349 patients with advanced breast cancer received NAC at our institution. Among these, 818 were diagnosed with the ER+HER2- LABC subtype. First, 403 patients were excluded because of a lack of raw data for the dynamic study in the Picture Archiving and Communication System (PACS). Patients were also excluded if they had a history of previous treatment, no cytology report on the initial axillary LN metastasis, no report on the final pathological results, no pretreatment MRI, no verifiable information on the four MRI sequences essential to this study (T1-weighted fat-suppressed pre-contrast, early and delayed post-contrast subtraction sequences, and T2-weighted fat-suppressed sequence), or insufficient image quality for lesion segmentation. Finally, the inclusion criteria were patients who (1) had a pretreatment MRI performed at our center, (2) completed all cycles of NAC and had surgery with a final pathologic report on whether pCR was achieved, and (3) had all four sequences with sufficient quality for segmentation, resulting in a total of 265 enrolled patients (stage IIB through IIIC according to the 8th edition of the AJCC cancer staging system).
Based on the date of the pretreatment MRI scans, patients were divided into training and validation cohorts. A total of 195 patients who underwent MRI between 2010 and 2015 were included in the training cohort. Another 70 patients who underwent MRI between 2016 and 2020 were included in the temporal validation cohort. The patient selection process is illustrated in Figure 1.
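The temporal split described above can be illustrated with a minimal sketch. The patient records and the `temporal_split` helper are hypothetical; only the cutoff (MRI before 2016 for training, from 2016 onward for validation) comes from the text.

```python
from datetime import date

# Hypothetical records: (patient_id, date of pretreatment MRI).
patients = [
    ("P001", date(2012, 5, 14)),
    ("P002", date(2015, 12, 30)),
    ("P003", date(2016, 1, 4)),
    ("P004", date(2019, 8, 21)),
]

CUTOFF = date(2016, 1, 1)  # MRI on or after this date goes to validation


def temporal_split(records, cutoff=CUTOFF):
    """Split records into (training, validation) lists by MRI date."""
    training = [r for r in records if r[1] < cutoff]
    validation = [r for r in records if r[1] >= cutoff]
    return training, validation


train, valid = temporal_split(patients)
train_ids = [pid for pid, _ in train]
valid_ids = [pid for pid, _ in valid]
print(train_ids, valid_ids)
```

Splitting by acquisition date rather than at random means the validation cohort consists of patients imaged later than every training patient, which is what makes it a temporal validation.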

MRI Acquisition
Breast MRI examinations were performed with the patients in the prone position using a 3.0 T scanner (MR750, GE Healthcare, Milwaukee, WI, USA or TrioTim, Siemens Healthcare, Erlangen, Germany) with a dedicated eight- or four-channel breast coil. The following images were obtained after the localizer images on either of the two scanners: T2-weighted fast spin echo axial images (TR/TE, 9100/100 ms;

Volume of Interest (VOI) Segmentation
The VOI segmentation of tumors was first performed semi-automatically along the margin of the tumor on the axial scan of T1-weighted fat-suppressed early post-contrast subtraction sequences (Ph2) by a radiologist (J.P., with 5 years of experience in radiology) using 3D-Slicer (version 5.0.2) software, and the accuracy of the 3D margin was checked on the coronal and sagittal planes, with modifications where necessary. For peritumoral VOI segmentation, the tumor mask was dilated in 3D by 1 mm and 3 mm, and the existing tumor mask was then subtracted from the dilated mask (Figure 2). The same process was applied to T1-weighted fat-suppressed delayed post-contrast subtraction sequences (Ph6) and T2-weighted fat-suppressed sequences (T2FS). Thus, 15 VOIs of tumoral, peritumoral (1 mm, 3 mm), and tumoral + peritumoral (1 mm, 3 mm) regions were obtained for Ph2, Ph6, and T2FS in each patient's pretreatment MRI. Another senior radiologist (M.J.K., with 23 years of experience in radiology) then assessed and revised the tumoral and peritumoral VOI segmentations to reconfirm the entire procedure.
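The dilate-then-subtract construction of the peritumoral VOI can be sketched as follows. This is an illustration only, not the 3D-Slicer procedure used in the study: it assumes 1 mm isotropic voxels (so 1 mm equals one voxel), represents a mask as a set of voxel coordinates, and uses a spherical structuring element. All function names are hypothetical.

```python
from itertools import product


def ball_offsets(radius_vox):
    """Integer offsets lying within a Euclidean ball of the given radius."""
    r = int(radius_vox)
    return [
        (dx, dy, dz)
        for dx, dy, dz in product(range(-r, r + 1), repeat=3)
        if dx * dx + dy * dy + dz * dz <= radius_vox * radius_vox
    ]


def dilate(mask, radius_vox):
    """3D dilation of a voxel set by a spherical structuring element."""
    offsets = ball_offsets(radius_vox)
    return {
        (x + dx, y + dy, z + dz)
        for (x, y, z) in mask
        for dx, dy, dz in offsets
    }


def peritumoral_shell(tumor, radius_vox):
    """Peritumoral VOI: dilated mask minus the original tumor mask."""
    return dilate(tumor, radius_vox) - tumor


# Toy tumor: a 3 x 3 x 3 voxel cube.
tumor = set(product(range(3), repeat=3))
peri1 = peritumoral_shell(tumor, 1)   # 1 mm peritumoral shell
peri3 = peritumoral_shell(tumor, 3)   # 3 mm peritumoral shell
tumor_peri1 = tumor | peri1           # tumor-to-peritumor 1 mm VOI
print(len(tumor), len(peri1), len(tumor_peri1))
```

The five VOIs per sequence then correspond to `tumor`, the two shells, and the two unions of the tumor with each shell.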

MRI Preprocessing and Radiomic Texture Feature Extraction
For the segmented VOIs, N4ITK MRI bias correction was applied to reduce the intensity nonuniformity of MR images between different patients [24], and the variation between data was minimized by normalizing the gray-level values, as shown in the following formula [25]:

$$f(x) = \frac{s\,(x - \mu_x)}{\sigma_x}$$

Here, $x$ is the amplitude of the image, $\mu_x$ is the average of the image values, $\sigma_x$ is the standard deviation of the image, and $s$ is an optional scaling value set to 10 to prevent errors in the calculation of radiomic features that may occur due to a relatively large standard deviation. After resampling the image with a 1 × 1 × 1 mm iso-voxel, 863 radiomic features were extracted from each VOI in the three sequences. Among the extracted features, diagnostic features (n = 12), which describe the entire image rather than the VOI, and shape features among the original features (n = 14), which relate to tumor size or volume measurable on conventional MRI, were excluded. The final feature set incorporated 2511 features for each sequence, and 7533 features were extracted from each patient.
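The gray-level normalization described above (subtract the image mean, divide by the image standard deviation, multiply by the scaling value s = 10) can be written as a short sketch. The function name and the toy intensities are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import fmean, pstdev


def normalize_intensity(values, scale=10.0):
    """Return scale * (x - mean) / std for every intensity x."""
    mu = fmean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("constant image: standard deviation is zero")
    return [scale * (v - mu) / sigma for v in values]


# Toy voxel intensities standing in for one MR image.
voxels = [100.0, 120.0, 140.0, 160.0, 180.0]
normalized = normalize_intensity(voxels)
print(fmean(normalized), pstdev(normalized))
```

After this step every image has mean 0 and standard deviation s, so intensities from different patients and scanners are placed on a comparable scale before texture features are computed.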

Dimension Reduction
Python 3.8 was used for data handling in the machine learning steps, and key feature selection on the radiomic features extracted from each VOI was performed in two steps. First, the Mann-Whitney U test was used to retain features that differed significantly between the pCR and non-pCR groups (p < 0.05). Second, using the random forest (RF) algorithm, the top 30 features were selected according to their importance in pCR prediction. Prior to data training, a standard scaler was applied to adjust the differing scales of the radiomic features and reduce the influence of outliers. Additionally, the synthetic minority oversampling technique (SMOTE) was applied to reduce the problem of overfitting toward non-pCR caused by the numerical imbalance between the pCR and non-pCR groups, even though this imbalance reflected the actual clinical pCR rate of ER+HER2- LABC.
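The idea behind SMOTE can be illustrated with a simplified sketch: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbors. This is not the implementation used in the study (which is not specified in the text); the function name, the value of `k`, and the toy feature vectors are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base sample (excluding itself)
        neighbors = sorted(
            (s for s in minority if s is not base),
            key=lambda s: math.dist(base, s),
        )[:k]
        mate = rng.choice(neighbors)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + gap * (m - b) for b, m in zip(base, mate))
        )
    return synthetic


# Toy minority class (e.g., pCR cases) with two scaled features each.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority, n_new=6)
print(new_points)
```

Because every synthetic point is a convex combination of two real minority samples, the oversampled class stays within the region already occupied by the observed pCR cases.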

Development of pCR Prediction Model in the Training Cohort
First, a pCR prediction model was developed individually for each sequence. Seven representative machine learning models were created with the key radiomic features for each of the five VOIs (tumor, peritumor 1 mm, peritumor 3 mm, area from tumor to peritumor 1 mm, and area from tumor to peritumor 3 mm) in the MRI sequences of the training cohort: binary classification model, K-Nearest Neighbor (KNN) model, Support Vector Machine (SVM), Decision Tree classifier, AdaBoost classifier, Random Forest (RF) classifier, and Light Gradient-Boosting Machine (LightGBM). A grid search approach was used to find the best hyperparameters for each of these seven models. This method systematically explores various combinations of hyperparameters to identify the optimal configuration for each model. To evaluate the performance of each model and its hyperparameter combination, we employed five-fold cross-validation. The dataset was divided into five subsets. During each iteration, four subsets were designated for training, while the remaining subset was allocated for validation. This cycle was repeated five times, ensuring that each subset served as the validation set once. The optimal model and its hyperparameters for each VOI were selected based on the area under the curve (AUC) of the plot of the true positive rate (sensitivity) against the false positive rate (1 - specificity) [26], which measures the model's ability to differentiate between the pCR and non-pCR groups.
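The five-fold scheme described above can be sketched as an index generator. The helper name is hypothetical, and the sketch shuffles without stratifying by pCR status, which is an assumption because the text does not state whether the folds were stratified.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


# 195 patients in the training cohort, five folds of 39 patients each.
splits = list(k_fold_indices(195, k=5))
for fold_no, (train_idx, val_idx) in enumerate(splits, start=1):
    print(fold_no, len(train_idx), len(val_idx))
```

In a grid search, each candidate hyperparameter combination would be fitted on the four training folds and scored on the held-out fold, and the mean score over the five folds would be used to rank the candidates.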
Next, to construct a more sophisticated pCR prediction model, seven machine learning models were created with sets of selected key radiomic features in combination for tumoral, peritumoral, and tumoral + peritumoral VOIs across sequences from the training cohort. The training and testing processes were identical to those described above, and AUC values were used to select the optimal model. Finally, a model incorporating clinical factors instead of radiomic features was created as a comparison group, and its performance was evaluated. Excluding the molecular subtype and axillary LN metastasis of breast cancer, which were fixed in this study, patient age, tumor size, and estrogen receptor (ER) and progesterone receptor (PR) expression levels were selected as clinical factors potentially associated with disease prognosis.

Assessment of pCR Prediction Model Performance with the Validation Cohort
We validated, in the validation cohort, the predictive performance of the optimal models developed using radiomic features extracted from the VOIs of each sequence, radiomic features combined from the VOIs across different sequences, and clinical factors. In addition to the AUC of the receiver operating characteristic (ROC) curve, which was used to evaluate the predictive performance of each model [27], we calculated precision (the ratio of correctly predicted positive observations to the total predicted positives), recall (the ratio of correctly predicted positive observations to the total actual positives), and the F1 score (the harmonic mean of precision and recall, representing both in one metric).
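The four metrics named above can be computed as in the following sketch. The labels and scores are toy values, the 0.5 decision threshold is an assumption, and the AUC is computed through its rank interpretation (the probability that a randomly chosen pCR case scores higher than a randomly chosen non-pCR case, counting ties as one half).

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = pCR, 0 = non-pCR)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy validation data: true labels and predicted pCR probabilities.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.2, 0.8, 0.1, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(precision, recall, f1, auc)
```

Unlike precision, recall, and F1, the AUC does not depend on the chosen threshold, which is one reason it is suited to comparing models trained on different VOIs.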
The process of this study is summarized in Figure 3.

Statistical Analysis
Statistical analyses of clinical factors, including patient age, tumor size, and ER and PR expression levels, were performed as follows: continuous variables were expressed as means and standard deviations, while categorical variables were expressed as frequencies and percentages. Continuous variables were tested using the independent samples t-test or the Mann-Whitney U test based on the results of the Shapiro-Wilk test for normality, and categorical variables were compared using the χ² test. Statistical significance was accepted when p values were <0.05.

Patient Characteristics
This study included pretreatment MRI scans of 265 patients with ER+HER2- LABC with axillary LN metastasis. The clinical and histological factors of the pCR and non-pCR groups, considering pCR as the endpoint in this study, are shown in Table 1. Among the patients, 238 (89.8%) had non-pCR and 27 (10.2%) reached pCR. The mean tumor size differed significantly between the pCR and non-pCR groups (37.9 ± 21.3 mm and 22.1 ± 8.8 mm, respectively). However, there were no significant differences in patient age or estrogen and progesterone receptor expression levels between the pCR and non-pCR groups. A comparison of the training and validation cohorts is shown in Table 2. The two cohorts showed no significant difference in pCR rate (9.7% and 11.4%, respectively). Table 2 also shows the other characteristics of the two cohorts, including patient age, tumor size, and estrogen and progesterone receptor expression levels.

Radiomic Texture Feature Composition and Dimension Reduction
As previously mentioned, excluding diagnostic and shape features, 837 radiomic texture features per VOI were extracted from each patient's pretreatment MRI. The 837 radiomic texture features included 93 original features (first-order, gray-level co-occurrence matrix [GLCM], gray-level dependence matrix [GLDM], gray-level run-length matrix [GLRLM], gray-level size-zone matrix [GLSZM], and neighboring gray tone difference matrix [NGTDM]) and 744 wavelet features (Table S1). The Mann-Whitney U test was used to remove 16 features showing no significant difference between pCR and non-pCR. The remaining 821 features were ranked by importance values from the Random Forest (RF) algorithm, and the top 30 features were chosen.
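The feature counts reported here and in the preprocessing section can be cross-checked with simple arithmetic. All numbers below are taken from the text; the reading of the 744 wavelet features as 8 wavelet sub-bands of 93 features each is an assumption that happens to match the reported total.

```python
extracted_per_voi = 863   # features extracted from each VOI
diagnostic = 12           # diagnostic features, excluded
shape = 14                # shape features, excluded

texture_per_voi = extracted_per_voi - diagnostic - shape

original = 93             # original (non-wavelet) texture features
wavelet = 744             # wavelet texture features
assumed_subbands = 8      # assumption: 8 sub-bands of 93 features each

removed_by_u_test = 16    # not significant in the Mann-Whitney U test
remaining = texture_per_voi - removed_by_u_test
top_k = 30                # features kept after random forest ranking

print(texture_per_voi, original + wavelet, remaining, top_k)
```

The two routes to the per-VOI total agree (863 - 12 - 14 = 93 + 744 = 837), and removing 16 features leaves the 821 candidates that were ranked by random forest importance.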

Performance of the pCR Prediction Model in Each Sequence
Table S2 presents the final pCR prediction performance in the validation cohort, which was confirmed by sequentially applying the optimal machine learning models developed from each of the five types of VOIs. Overall, the models derived from Ph2 and T2FS showed relatively high AUC values, whereas even the best-performing models in Ph6 did not exceed an AUC value of 0.9. The best single-sequence model for pCR prediction of NAC in ER+HER2- LABC across the three sequences was the SVM model of tumor-to-peritumor 1 mm on Ph2 (AUC = 0.9447, recall = 91%, precision = 91%, and F1 score = 91%). The ROC curves and AUCs of the 15 models in the validation cohort are shown in Figure 4, which again shows that AUC values were highest overall in Ph2.

Performance of the pCR Prediction Model with Combination of Sequences
We confirmed, in the validation cohort, the pCR predictive performance of the optimal machine learning models developed from 75 VOI combinations, each pairing tumoral and peritumoral regions from two different sequences. The KNN model with key radiomic features derived from a combination of the tumor-to-peritumor 1 mm VOI in Ph2 and the peritumor 3 mm VOI in T2FS exhibited the best pCR prediction performance, with an AUC of 0.96. The pCR prediction performance based on the combination of tumoral and peritumoral regions of different sequences is shown in Table S3. Additionally, Figure 5 compares the ROC curve of the best combination model with those of the optimal models developed using the tumoral VOI, peritumoral 1 mm VOI, and tumor-to-peritumoral 1 mm VOI of Ph2 and the peritumoral 3 mm VOI of T2FS, which are components of the combination model. Cochran's Q test verified that there was a significant difference between these five models (p < 0.001).
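Cochran's Q statistic for comparing several classifiers on the same patients can be computed as in the sketch below. The text does not state which binary outcome was tested, so the sketch assumes a table in which each row is a patient, each column is a model, and an entry is 1 when that model classified that patient correctly. The toy table and function name are hypothetical.

```python
def cochran_q(table):
    """Cochran's Q statistic and degrees of freedom for a binary table.

    table: list of rows (subjects), each a list of 0/1 outcomes,
    one per model (column).
    """
    k = len(table[0])                                   # number of models
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)                                 # grand total of 1s
    denom = k * n - sum(r * r for r in row_totals)
    if denom == 0:
        raise ValueError("all models agree on every subject")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, k - 1


# Toy example: 6 patients (rows) evaluated by 3 models (columns).
table = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
]
q_stat, dof = cochran_q(table)
print(q_stat, dof)
```

Under the null hypothesis that all models perform equally, Q approximately follows a chi-square distribution with k - 1 degrees of freedom, so comparing five models as in the text would use 4 degrees of freedom.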

Diagnostic Performance of Clinical Model
Furthermore, we used the same process to confirm the predictive performance of pCR for clinical factors that could be associated with patient prognosis in breast cancer, such as age, tumor size, and estrogen and progesterone expression levels.
In the validation cohort, the AUC values were generally low for pCR prediction performance compared to the radiomics models. The AUC values for patient age, tumor size, and the combination model of patient age and tumor size were 0.63, 0.81, and 0.67, respectively. The AUC values for estrogen and progesterone expression levels and their combination model were 0.68, 0.64, and 0.53, respectively. The results are summarized in Table S4 and Figure 6.

Discussion
ER+HER2- locally advanced breast cancer (LABC) has a poor pathologic complete response (pCR) rate of approximately 10%, compared with the 26.5% to 39.0% pCR rates of other molecular subtypes, at surgery following neoadjuvant chemotherapy (NAC) [9]. Therefore, this study aimed to identify ER+HER2- LABC patients with a high probability of responding effectively to NAC using pretreatment MRI, which is a key modality for the non-invasive assessment of breast cancer [6,7,10]. Several recent studies have attempted to create a prognosis prediction model for breast cancer using radiomic texture features extracted from pretreatment MRI, as in this study [10,21,28]. However, all these studies were conducted on heterogeneous molecular subtypes, and patients with ER+HER2- LABC, the focus of this study, made up only a small part of their populations.
To construct a sophisticated model for pCR prediction after NAC in patients with ER+HER2- LABC, the radiomic texture features of MRI were extracted from the tumor, peritumor 1 mm, peritumor 3 mm, area from tumor-to-peritumor 1 mm, and area from tumor-to-peritumor 3 mm for early post-contrast sequences, delayed post-contrast sequences, and T2-weighted fat-saturated sequences. In line with previous studies, the single-sequence model evaluation further established that early post-contrast images contain the most useful texture features for machine learning models [29][30][31]. The inclusion of the delayed post-contrast image in this study was based on a previous study by Jin et al., who claimed that texture heterogeneity is better reflected in the delayed enhanced phase for breast tumors [32]. However, the model incorporating the texture features of the tumor in the delayed phase did not produce more potent information in comparison to other sequences in our study.
Another strength of this study is that the VOI for radiomic feature extraction in MRI was based not only on the tumoral region but also on the peritumoral region, which is reported to form a microenvironment that affects the NAC response [21,22]. Until recently, studies that included the peritumoral region investigated an extended area from the tumoral to the peritumoral region on a single MRI sequence [22]. In this study, on the other hand, the tumoral and peritumoral regions were examined in combination across sequences to construct a more advanced model that reflects the most informative sequence for each region. As a result, the model with a combination of the extended tumoral region in the early enhanced phase and the peritumoral region in T2FS exhibited the highest AUC. This finding is consistent with the general MRI principle that T2FS images exhibit a wider range of signal alterations compared to T1-weighted images, even including contrast-enhanced T1-weighted images [33,34]. In the future, more elaborate models need to be developed by combining tumoral and peritumoral regions across different sequences and validated for other molecular subtypes of breast cancer.
This study had several limitations. First, there was a possibility of selection bias because this was a retrospective study conducted at a single tertiary referral center, and certain patients had to be excluded because raw data for MRI dynamic studies were unavailable for patients accumulated over a substantial ten-year period. Second, this study intentionally focused on ER+HER2- LABC, which is a molecular subtype with relatively poor NAC response. Because the results were based on a single molecular subtype, it may be difficult to generalize the findings to all patients with breast cancer. We hope that follow-up studies will be conducted in the future. Third, regarding the potential clinical utility, immediate clinical application of the findings will require rapid and reliable automatic segmentation, which seems to need more time. An accurate VOI segmentation process for a tumor is a prerequisite for providing accurate key texture features to constitute machine learning models. Although this study used 3D-Slicer to produce VOIs in a semi-automatic manner, the reliability of the VOI produced by the program decreased as the irregularity of the tumor margin increased, which required modification by a radiologist and reconfirmation by a senior radiologist to refine the VOI segmentation. Lastly, the most fundamental limitation concerns the revision of treatment plans for patients with ER+HER2- LABC. Despite a mere 10% pCR rate after NAC, NAC is still administered to patients with ER+HER2- LABC, mainly because more effective and specific treatments for this patient group are still under development. However, if we accumulate evidence supporting the accurate classification of patients who respond favorably to NAC before treatment initiation, we can significantly influence practical treatment decisions made by oncologists and surgeons by instilling confidence grounded in sound reasoning. This approach could potentially shift the focus for ER+HER2- LABC patients expected to exhibit a poor response to NAC towards earlier surgical interventions and the determination of the scope of post-surgery treatment, leveraging tools such as Oncotype DX for adjuvant chemotherapy decisions. Consequently, we anticipate that these efforts will assist in the establishment of tailored therapy plans that prioritize benefits over risks, ultimately improving patient quality of life.
To assess the NAC response of patients with ER+HER2- LABC on pretreatment MRI, this study applied radiomic texture features to the tumoral and peritumoral regions across MRI sequences. We suggest that a combined machine learning model incorporating tumoral and peritumoral texture features across different MRI sequences can provide a more accurate prediction of pCR for NAC response in these patients. These results are also expected to make a potential contribution to the development of novel clinical therapeutic strategies.

Informed Consent Statement: The requirement for informed consent was waived in this retrospective study.

Diagnostics 2023, 13

Figure 4. The ROC curves for the predictive performance of pCR in the validation cohort using the optimal machine learning models in each sequence: (a) AUC on Ph2; (b) AUC on Ph6; (c) AUC on T2FS. Abbreviations: Ph2, T1-weighted fat-suppressed early post-contrast subtraction sequence.

Figure 5. Comparison of the pCR prediction performance in the validation cohort: the best combination model of the VOI from tumor-to-peritumor 1 mm in Ph2 and the peritumor 3 mm VOI in T2FS, as well as the respective component VOI models. Abbreviations: Ph2, T1-weighted fat-suppressed early post-contrast subtraction sequence; T2FS, T2-weighted fat-suppressed sequence; Peri1, peritumoral region, 1 mm; Tumor_peri1, tumoral + 1 mm peritumoral region; Peri3, peritumoral region, 3 mm; FPR, false positive rate; TPR, true positive rate.

Figure 6. The ROC curves for the pCR prediction performance of the clinical models in the validation cohort. Abbreviations: ER, estrogen; PR, progesterone; FPR, false positive rate; TPR, true positive rate.

Funding: This work was supported by the Korea Medical Device Development Fund Grant funded by the Korea government (the Ministry of Science and ICT, the Ministry of Trade, Industry and Energy, the Ministry of Health & Welfare, Republic of Korea, the Ministry of Food and Drug Safety) (Project Number: KMDF202011A01-04), and by a National Research Foundation of Korea (NRF) Grant funded by the Korean government (MSIT) (NRF-2021R1A2C3006264).

Institutional Review Board Statement: This study was approved by the institutional review board of our hospital (IRB 4-2021-1418).

Table 1. Comparison of the patient characteristics between non-pCR and pCR groups.

Table 2. Comparison of the patient characteristics between the training and validation cohorts.