Comparing Genetic Risk and Clinical Risk Classification in Luminal-like Breast Cancer Patients Using a 23-Gene Classifier

Simple Summary
Multi-gene expression assays have been advocated for treatment decisions in breast cancer management. The most commonly used assays, such as Oncotype DX and MammaPrint, were developed in Western populations and were designed specifically for the prognostication of early-stage luminal-type breast cancer. The cross-tabulation of multi-gene expression assay results against clinical risk has recently become a research interest. The 23-gene signature was proposed for the Asian population and has been validated for its discriminative ability regarding 5-year relapse-free survival. The objective of this study was to evaluate its performance across distinct clinical risk groups.

Abstract
Background: A 23-gene classifier has been developed based on gene expression profiles of Taiwanese luminal-like breast cancer. We aimed to stratify the risk of relapse and identify patients who may benefit from adjuvant chemotherapy based on the genetic model among distinct clinical risk groups. Methods: The study included 248 luminal (hormone receptor-positive and human epidermal growth factor receptor II-negative) breast cancer patients with 23-gene classifier results. Using the modified Adjuvant! Online definition, clinical high-/low-risk groups were cross-tabulated with the genetic model. The primary endpoint was the recurrence-free interval (RFI) at 5 years. Results: There was a significant difference in 5-year recurrence between the high- and low-risk groups defined by the 23-gene classifier (16 recurrences in high-risk and 3 recurrences in low-risk; log-rank test: p < 0.0001). Within the clinically high-risk group, the 5-year RFI of patients classified as high risk by the 23-gene classifier was significantly worse than that of patients classified as low risk (15 recurrences in high-risk and 2 recurrences in low-risk; log-rank test: p < 0.0001). Conclusion: This study showed that the 23-gene classifier can stratify clinically high-risk patients into distinct survival patterns based on genomic risk and has the potential to guide adjuvant chemotherapy. The 23-gene classifier can provide a better estimation of breast cancer prognosis, which can help physicians make better treatment decisions.


Introduction
Breast cancer is the most common female malignancy and ranks fourth in cancer mortality in Taiwan [1]. Clinical outcomes of early breast cancers treated with curative intent have improved enormously, mainly due to screening mammography for early detection and adjuvant systemic therapy for high-risk patients. Breast cancer is subdivided into immunohistochemistry (IHC) subtypes based on the status of estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor II (HER2). The IHC assays, in combination with anatomical stage (tumor size, regional nodal and distant organ metastasis), pathological features such as histological type and nuclear grade, and an IHC-based proliferative marker, Ki67 (coded by the gene MKI67), not only determine which systemic therapy should be prescribed (predictive markers), but also serve as prognostic biomarkers forecasting long-term treatment outcomes [2,3].
Among all breast cancer molecular subtypes, luminal breast cancers (defined by ER and/or PR-positive and HER2-negative) enjoy the best survival rate and are the only subtype which may benefit from long-term endocrine therapy and may be spared from cytotoxic chemotherapy if residual risk following curative surgery is low enough and endocrine therapy alone can counteract the risk of recurrence [4,5].
Multi-gene expression assays (MGAs) initially adopted microarrays as the platform for interrogating the whole transcriptome and were then commercialized with reverse transcription-polymerase chain reaction (RT-PCR), digital RNA counting (NanoString Technologies, Inc, South Lake Union, Seattle, WA), and next-generation sequencing (NGS)-based RNA sequencing. They have been advocated for risk stratification in hormone receptor (HR)-positive and HER2-negative early breast cancers, and serve as a decision-making tool to avoid chemotherapy. Important examples include the Amsterdam (70-gene) and Rotterdam (76-gene) signatures, the Genomic Grade Index (GGI), intrinsic subtypes (with the latest version being PAM50 ® ) and the Recurrence Score (21-gene) [6]. It deserves notice that both Oncotype DX ® (21-gene, Exact Sciences Corporation, Madison, WI) and MammaPrint ® (70-gene, Agendia Precision Oncology, NT Amsterdam, Netherlands) have been endorsed by guidelines from international societies such as the American Society of Clinical Oncology (ASCO) and the National Comprehensive Cancer Network (NCCN), with periodic focused updates, as being prognostic (70-gene signature) or both prognostic and predictive (21-gene) for early-stage luminal breast cancers [7,8].
Our published 23-gene signature (RecurIndex ® , Amwise Diagnostics PTE. LTD, Taipei, Taiwan) has been proposed to classify breast cancers into high- and low-risk groups following curative surgery, and significantly different 5-year relapse-free survival patterns were observed among 473 luminal Taiwanese breast cancers. Gene expression scores with accompanying clinical variables (age at diagnosis, tumor size and nodal stage) were used to construct the risk-predictive model. Hazard ratios of 5.63 (95% confidence interval 2.77-11.5) and 8.02 (3.52-18.3) for high-risk subjects were reported for the genetic and clinical-genetic models, respectively [9].
Recently, with the publication of the large randomized controlled trials MINDACT and TAILORx, the cross-tabulation of MGA and clinical risk groups has become a major interest among clinicians and scientists engaged in breast cancer management and research [10-12]. Clinical risk grouping is largely based on the modified Adjuvant! Online criteria or the Dutch clinical risk scale [10,13]. The aim of this study was to evaluate the prognostic performance of the 23-gene signature among Taiwanese breast cancer patients with clinical high and low risk.

Study Population
The Amwise database (Amwise Diagnostics PTE. LTD., Singapore) comprised patients with breast cancer from multiple medical centers in Taiwan. All patients enrolled in the Amwise database received standard of care, including breast-conserving surgery or mastectomy. The CONSORT (Consolidated Standards of Reporting Trials) diagram shows the workflow for subject selection (Figure 1). The inclusion criteria were: (i) luminal-like (HR-positive, HER2-negative) breast cancer, (ii) complete clinical information (age, tumor grade, nodal status, and tumor size), and (iii) complete genetic data (23-gene). The exclusion criteria were: (i) pre-operative chemotherapy or radiotherapy, and (ii) no follow-up information. In this study, 40% of the samples from the Amwise database had been involved in building the 23-gene classifier. Clinical data on treatment, follow-up, and personal information were collected from the electronic medical records (EMR) of each collaborating medical center. For the gene-expression data, reverse-transcriptase (RT) quantitative polymerase chain reaction (qPCR) was used to measure the expression of the 23 target genes in total RNA isolated from formalin-fixed paraffin-embedded (FFPE) tumor tissue.

The 23-Gene Classifier
Development of the 23-gene classifier has been described elsewhere [9,14]. This classifier is an MGA that interrogates functions associated with cell cycle and proliferation, oncogenic processes, inflammation and immune response, apoptosis and metabolism. The 23-gene panel comprises BLM, BUB1B, CCR1, CKAP5, CLCA2, DDX39, DTX2, ERBB2, ESR1, MKI67, OBSL1, PGR, PHACTR2, PIM1, PTI1, RCHY1, SF3B5, STIL, TPX2, and YWHAB, along with three housekeeping genes: ACTB, RPLP0, and TFRC. A comparison of the gene list with those of other MGAs can be found in Supplementary Table S1. Figure 2 summarizes the 23-gene signature. In the previous study, logistic regression with leave-one-out cross-validation (LOOCV) was performed to build a prognostic classifier [9]. First, the Ct values of the 23 genes are normalized to gene expression by Equation (1). Second, the 23-gene signature, without the housekeeping genes, is the input to the trained 23-gene classifier, and the output is the probability of recurrence given by Equation (2). The 23-gene classifier is intended to predict recurrence of luminal breast cancers within 5 years after surgery.
where p is the probability of recurrence.
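As a rough illustration of the two steps described above, and assuming the conventional ΔCt normalization against the mean of the three housekeeping genes followed by a standard logistic model, the computation would typically take the form below. The symbols $x_i$, $\beta_0$ and $\beta_i$ are illustrative placeholders; the published normalization and coefficients are those of [9] and are not restated here.

```latex
% Assumed general form of Equation (1): normalization of the Ct value of
% target gene i against the mean Ct of the housekeeping genes
x_i = Ct_i - \frac{1}{3}\left(Ct_{ACTB} + Ct_{RPLP0} + Ct_{TFRC}\right),
\qquad i = 1, \dots, 20

% Assumed general form of Equation (2): logistic model over the 20
% non-housekeeping genes, giving the probability of recurrence p
p = \frac{1}{1 + \exp\left[-\left(\beta_0 + \sum_{i=1}^{20} \beta_i x_i\right)\right]}
```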

Prognostic and Statistical Analysis
The 23-gene classifier was used to determine the risk of breast cancer recurrence and provided genomic information beyond clinical risk provided by the modified Adjuvant! Online [10], which is a web tool incorporating patients' characteristics and tumor features for estimation of relapse and survival.
The primary endpoint of the current study was the relapse-free interval (RFI) at 5 years, estimated with the Kaplan-Meier method and compared between the defined high- and low-risk groups with the log-rank test. Univariate and multivariate Cox proportional hazards models were fitted to evaluate the performance of the clinical and genetic models, adjusted for covariates including age group (<40, 40-60, >60), lymphovascular invasion (LVI, prominent/present versus focal/absent) and chemotherapy (with versus without).
In Model 1 and Model 2, Cox regression was conducted for the 23-gene classifier and the clinical risk groups, respectively, while Model 3 evaluated both. To investigate the incremental predictive power of the 23-gene classifier across clinical risk groups, an interaction term between clinical risk groups and the 23-gene classifier was added in Model 4. Furthermore, a subgroup analysis was conducted within each clinical risk group to evaluate the prognostic power of the 23-gene classifier among patients with and without chemotherapy. All statistical analyses were conducted with R version 4.0.2, and a p-value < 0.05 was considered statistically significant.
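For readers unfamiliar with the Kaplan-Meier method named above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the R code used in the study; the function name `kaplan_meier` and the toy cohort are hypothetical and serve only to show how a 5-year recurrence-free probability is obtained from follow-up times and event indicators.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, recurrence-free probability) at each distinct event time.

    times  : follow-up time per patient (e.g. in years)
    events : 1 if a recurrence was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0  # recurrences observed exactly at time t
        c = 0  # all patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            c += 1
            i += 1
        if d > 0:
            # Product-limit step: multiply by the conditional probability
            # of remaining recurrence-free through time t
            surv *= 1.0 - d / n_at_risk
            curve.append((t, surv))
        n_at_risk -= c
    return curve


# Hypothetical toy cohort (NOT study data): six patients, follow-up in years
toy_times = [1.0, 2.0, 2.0, 3.5, 5.0, 5.0]
toy_events = [1, 0, 1, 1, 0, 0]
km_curve = kaplan_meier(toy_times, toy_events)
```

The last entry of `km_curve` with time at or before 5 years gives the estimated 5-year recurrence-free probability for the toy cohort; censored patients reduce the risk set without lowering the curve.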

Baseline Demography of Enrolled Population
A total of 248 patients were included in this study. Tables 2 and 3 showed that tumor stage (p < 0.001), nodal stage (p < 0.001) and tumor grade (p = 0.008) were worse in patients with chemotherapy than in those without. Regarding radiotherapy, only tumor stage (p = 0.024) and follow-up time (p < 0.001) differed significantly between those with and without radiotherapy. Table 4 showed that the 23-gene classifier partitioned patients well in the prediction of relapse: both accuracy and negative predictive value (NPV) exceeded 80% (accuracy, 0.855; NPV, 0.815). The modified Adjuvant! Online criteria (Table 5) partitioned this population poorly (accuracy, 0.355; NPV, 0.294). The F1 score of the 23-gene classifier was 0.913, much higher than that of the modified Adjuvant! Online criteria (F1 score, 0.448).
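The metrics quoted above follow directly from a 2 × 2 confusion matrix. The sketch below shows the standard definitions; the function name `classification_metrics` and the counts are hypothetical, and which outcome is treated as the positive class is an assumption, since the text does not state it.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, negative predictive value and F1 score from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)  # positive predictive value
    recall = tp / (tp + fn)     # sensitivity
    return {
        "accuracy": (tp + tn) / total,
        "npv": tn / (tn + fn),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for illustration only (not the study's Table 4 or 5)
toy = classification_metrics(tp=8, fp=2, fn=2, tn=88)
```

Because F1 depends only on the positive class, it can differ sharply from accuracy when the two outcome classes are imbalanced, as they are when recurrences are rare.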

RFI Analysis and Cox Proportional Hazards Regression Model
RFIs stratified by the 23-gene classifier and clinical risk groups are shown in Figures 3 and 4. The 5-year recurrence-free probability differed significantly between the high- and low-risk groups defined by the genetic classifier (p < 0.0001): 0.67 (95% CI: 0.55, 0.82) for the high-risk group and 0.98 (95% CI: 0.96, 1.00) for the low-risk group. On the other hand, the 5-year recurrence-free probability was 0.90 (95% CI: 0.85, 0.95) and 0.97 (95% CI: 0.92, 1.00) for the clinical high- and low-risk groups, respectively. Among clinically high-risk patients, the recurrence-free probability was 0.62 (95% CI: 0.48, 0.80) for the genetic high-risk group and 0.98 (95% CI: 0.96, 1.00) for the genetic low-risk group (Figure 5). Figure 6 shows the 5-year RFI in the clinically low-risk group. Among patients without chemotherapy (Figures 7 and 8), the recurrence-free probability of the clinically high-risk group was 0.91 (95% CI: 0.85, 0.98), lower than that of the clinically low-risk group (0.98 [95% CI: 0.94, 1.00]), although the difference was not statistically significant (log-rank test: p = 0.21). In contrast, among patients without adjuvant chemotherapy, the 5-year recurrence-free probability was 0.71 (95% CI: 0.55, 0.92) for the genetic high-risk group and 0.99 (95% CI: 0.97, 1.00) for the genetic low-risk group.
Table 6 summarizes the results of Cox proportional hazards regression for the 23-gene classifier and clinical risk groups. In the univariate analysis, only the 23-gene classifier had a significant effect on recurrence within 5 years (hazard ratio: 20.9 [95% CI: 6.04, 72.1]); the effect of the clinically high-risk group did not reach statistical significance (hazard ratio: 2.92 [95% CI: 0.67, 12.7]).
In Model 1, after controlling for potential confounders, the 23-gene classifier remained a significant predictor of recurrence within 5 years (hazard ratio: 10.5 [95% CI: 2.65, 41.8]). In Model 2, the hazard ratio for the clinical risk groups was 1.59, with a 95% CI that included 1 (0.30, 8.35). In Model 3, a multivariate Cox regression model comprising both the 23-gene classifier and clinical risk groups, the 23-gene classifier remained an independent predictor of 5-year recurrence (hazard ratio: 10.5 [95% CI: 2.63, 42.2]). In Model 4, the interaction term between clinical risk groups and the 23-gene classifier was not statistically significant (p = 0.6).
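A hazard ratio from a Cox model is the exponential of a regression coefficient, and its confidence interval is usually symmetric on the log scale. The sketch below, which assumes Wald-type 95% intervals (an assumption, not something stated in the text), recovers the log hazard ratio and its standard error from a reported interval; the helper name `log_hr_and_se` is hypothetical.

```python
import math


def log_hr_and_se(ci_low: float, ci_high: float, z: float = 1.96):
    """Recover the log hazard ratio and its standard error from a Wald-type CI.

    Assumes the interval was computed as exp(beta +/- z * SE).
    """
    log_hr = (math.log(ci_low) + math.log(ci_high)) / 2.0
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return log_hr, se


# Univariate interval for the 23-gene classifier as reported in the text
log_hr, se = log_hr_and_se(6.04, 72.1)
hr_midpoint = math.exp(log_hr)  # geometric midpoint of the interval
```

Under this assumption the geometric midpoint of the reported interval lies close to the reported hazard ratio of 20.9, and the wide interval reflects the small number of recurrence events.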

Discussion
In the past two decades, gene expression profiling has redefined breast cancer as a molecularly heterogeneous disease entity that displays a broad spectrum of alterations in the transcriptome, with sub-classifications that not only enhance molecular taxonomy but also provide prognostic information pertaining to survival after curative therapy [15]. In addition to the well-known published MGAs, the 23-gene signature has been validated for its discriminating ability beyond pathological prognostic factors such as tumor size and nodal status [9]. A prognostic difference in 5-year relapse-free survival was demonstrated between the predicted high- and low-risk groups [9,14]. Chronologically and collectively, the proposed signature has been validated across microarray and RT-PCR platforms [9,14].
It is not a coincidence that several MGAs have been used in combination with clinical risk factors such as tumor size and nodal status. The EPclin score combines a 12-gene molecular score with clinical features, while the PAM50-based risk of recurrence (ROR) score adopts a 50-gene signature as well as clinical features [16,17]. These second-generation MGAs are capable of predicting 10-year distant recurrence [18,19]. On the other hand, for the two signatures with purely genetic scores, the latest clinical trials all incorporated clinical risk groups to stratify the targeted populations; both MINDACT (clinically high- and genomic low-risk group) and RxPONDER identified a subset of post-menopausal (age > 50 years and recurrence score < 25) pN1 patients who may be safely spared from cytotoxic chemotherapy [10,20]. For pre-menopausal (age < 50 years) patients, the situation is much more complicated, as there is still a substantial benefit from chemotherapy (~5%) for the clinically high and genomic low MINDACT population, as well as for those with a recurrence score of 16-20 and clinically high risk, and for all risk groups with a recurrence score > 21 in the TAILORx trial [6,10,11]. The Dutch clinical risk criteria (low-risk definition: age > 35 years and [grade 1 with tumor ≤ 3 cm, grade 2 with tumor ≤ 2 cm, or grade 3 with tumor ≤ 1 cm]) and the modified Adjuvant! Online criteria have been used in clinical risk stratification for both MammaPrint ® and Oncotype DX ® [21,22].
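The Dutch low-risk definition quoted above is a simple rule and can be written out as follows. This is a sketch of the rule exactly as stated in this paragraph, not of the modified Adjuvant! Online criteria used in the present study; the function name `dutch_clinical_risk` is hypothetical, and any additional conditions in the original criteria (for example nodal status) are not modeled.

```python
def dutch_clinical_risk(age_years: float, grade: int, tumor_cm: float) -> str:
    """Return 'low' or 'high' using the low-risk definition quoted in the text.

    Low risk: age > 35 years AND (grade 1 with tumor <= 3 cm,
    grade 2 with tumor <= 2 cm, or grade 3 with tumor <= 1 cm).
    """
    max_size_by_grade = {1: 3.0, 2: 2.0, 3: 1.0}  # size threshold in cm per grade
    if grade not in max_size_by_grade:
        raise ValueError("grade must be 1, 2 or 3")
    is_low = age_years > 35 and tumor_cm <= max_size_by_grade[grade]
    return "low" if is_low else "high"


# Hypothetical patients for illustration only
example_a = dutch_clinical_risk(age_years=50, grade=1, tumor_cm=2.5)
example_b = dutch_clinical_risk(age_years=50, grade=2, tumor_cm=2.5)
```

The two hypothetical patients differ only in grade, which is enough to move a 2.5 cm tumor from the low-risk to the high-risk group.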
In the current study, we evaluated the prognostic value of the 23-gene signature in an unselected Taiwanese early breast cancer cohort across distinct clinical risk groups. Both genetic and clinical risk groups were prognostic, as shown in Figures 3 and 4 and Table 4, while the smaller p-value indicated better discriminative ability of the proposed genetic score than of the clinical risk groups (univariate Cox model and multivariate Models 1 and 2). Figures 5 and 6 show that among clinically high-risk patients (n = 179), the 23-gene signature remained prognostic, which did not hold true for their clinically low-risk counterparts (n = 69). In the clinically high-risk sub-population, there were 14 events among 42 genetically high-risk subjects defined by the 23-gene classifier, while only 2 events were observed during the 5-year follow-up period among 137 genetically low-risk subjects, resulting in a highly significant p-value of 0.0001 (Figure 5). Prognostic discrimination of the 23-gene classifier diminished for patients with a low clinical risk, indicating that these subjects might not be the target population of the proposed signature.
We further examined the impact of risk prediction on 5-year recurrence-free intervals among patients not receiving adjuvant chemotherapy (n = 156). As noted in Figures 7 and 8, worse survival was observed for patients who were predicted as high risk by the clinical risk groups or the 23-gene classifier but did not receive chemotherapy. Table 4 shows that patients with a larger tumor, a more advanced pN stage and a higher nuclear grade tended to receive chemotherapy, while the presence of LVI was only of borderline statistical significance. Among all patients not receiving chemotherapy, prognostic discrimination was more pronounced for the 23-gene-defined risk groups (7 of 30 genetically high-risk patients experienced events, compared with 7 of 101 clinically high-risk patients). In other words, among patients without chemotherapy, the genetic model assigned fewer patients to the high-risk group than the clinical model did, yet captured the same number of events. It deserves notice that patients still experienced relapse even after chemotherapy, indicating an unmet need in the form of residual risk not covered by adjuvant systemic therapy. A larger proportion of patients in the genetically high-risk group received chemotherapy (30% versus 20%), but the difference did not reach statistical significance (Table 4).
In summary, both the genetic and clinical risk groups were prognostic in terms of the 5-year recurrence-free interval, while the 23-gene signature was more prognostic than the clinical risk groups. Among clinically high-risk patients, the prognostic power of the 23-gene signature remained, further indicating that these patients are the target population for the use of the MGA. Among patients not receiving adjuvant chemotherapy, both the 23-gene classifier and the clinical risk groups were prognostic, while the genetic risk group was more predictive of 5-year recurrence events.
There were some limitations to the current study. First, the observational rather than interventional design limited the ability of the proposed signature to predict adjuvant chemotherapy benefit, although the 23-gene classifier was prognostic among patients not receiving chemotherapy. The allocation of chemotherapy in the current study was determined by clinicians and might be correlated with factors defining clinical risk groups, such as tumor size and nodal status. Second, the retrospective study design lowers the level of evidence of the conclusions. Third, the modest sample size limited subgroup analyses for each combination of genetic and clinical risk groups, due to the paucity of cases in each stratum.

Conclusions
In conclusion, this study showed that the 23-gene classifier could stratify early breast cancer patients with clinical high risk into distinct survival patterns and has the potential to support decision making on adjuvant chemotherapy. This MGA can provide a better estimation of breast cancer prognosis, which can help physicians with the precise management of luminal breast cancers.

Patents
Amwise holds the patent related to the content of this manuscript (Taiwan patent application number: 109132402; China patent application number: 202011103766.4).

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cancers14246263/s1, Table S1.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The original contributions presented in the study are included in the article and supplementary material; further inquiries can be directed to the corresponding author(s).

Conflicts of Interest:
The authors declare no conflicts of interest.