In Silico Evaluation of Natural Compounds for an Acidic Extracellular Environment in Human Breast Cancer

The survival rates for breast cancer (BC) have improved in recent years, but resistance, metastasis, and recurrence still remain major therapeutic challenges for BC. The acidic tumor microenvironment (TME) has attracted attention because of its association with tumorigenesis, metastasis, drug resistance, and immune surveillance. In this study, we evaluated natural compounds from traditional herbal medicine used to treat cancer that selectively target genes regulated by extracellular acidosis. We integrated four transcriptomic data including BC prognostic data from The Cancer Genome Atlas database, gene expression profiles of MCF-7 cells treated with 102 natural compounds, patterns of gene profiles by acidic condition, and single-cell RNA-sequencing from BC patient samples. Bruceine D (BD) was predicted as having the highest therapeutic potential, having an information gain (IG) score of 0.24, to regulate reprogrammed genes driven by acidosis affecting the survival of BC patients. BD showed the highest IG on EMT (IG score: 0.11) and invasion (IG score: 0.1) compared to the other phenotypes with the CancerSEA database. BD also demonstrated therapeutic potential by interfering with the tumor cell–TME interactions by reducing the amyloid beta precursor protein and CD44 expression. Therefore, BD is a potential candidate to target the acidic TME induced metastatic process in BC.


Introduction
In 2020, breast cancer (BC) was the most diagnosed cancer at 24.5% and was the leading cause of death at 15.5% among women with cancer [1]. Advances in surgical techniques, radiation therapy, and systemic therapies for BC contributed to an increase in the 5-year relative survival rate [2,3]. Although BC mortality has continued to improve over the past decades, radiation resistance, drug resistance, and metastatic recurrence still remain major therapeutic challenges for BC [4][5][6]. Solid tumors produce and export excessive levels of lactic acid due to reprogrammed cancer cell metabolism, and when combined with poor vascular perfusion, it eventually leads to an acidic extracellular pH (pH e ) microenvironment [7,8]. The adaptation of cancer cells to acidic pH e within the tumor microenvironment (TME) is an important factor in increasing tumor aggressiveness such as invasion and metastasis [9][10][11][12]. In acidic regions, tumor cells perform niche engineering that results in extracellular matrix degradation, normal cell death, and local invasiveness [13,14]. Tumors with higher metastatic potential showed lower pH values, dispersedly, in peritumoral regions [15]. Additionally, distant metastasis was associated with enhanced tumor acidity [16]. In BC, exposure to acidic pH e increased cell migration and drug resistance in MCF-7 cells [17]. Drug-resistant MCF-7 cells showed lowered pH e compared to parent BC cells [18]. Comparison between metastatic 4T1 and less metastatic TUBO cells showed that pH e correlated with distant lung metastasis [19].
to parent BC cells [18]. Comparison between metastatic 4T1 and less metastatic TUBO cells showed that pHe correlated with distant lung metastasis [19].
Traditional herbal medicine has long been used to treat cancer, as some herbs contain natural compounds with anticancer effects that target proliferation, angiogenesis, metastasis, and apoptosis [20]. Natural compounds, such as epigallocatechin gallate, curcumin, berberine, artemisinins, ginsenosides, and silibinin, have been reported to regulate autophagy, drug resistance, immunity balance, and chemosensitization in vitro and in vivo [20]. In this study, we analyzed the transcriptomic expression patterns of genes by acidic condition, the BC prognostic data from The Cancer Genome Atlas (TCGA) database, and the gene expression profiles of MCF-7 cells treated with 102 natural compounds from herbal medicine (Figure 1). Moreover, we predicted potential cancer-associated pathways that could be improved by compounds with therapeutic potential for acidic pHe. Finally, we investigated how compounds would affect receptor-ligand interactions in the TME using single-cell sequencing of BC patient samples ( Figure 1).

Data Collection
We downloaded GSE152345 transcriptomic (Illumina HiSeq platform) and survival data of a BC database from Xena TCGA (https://xenabrowser.net/hub/, Accessed on 27 May 2021). The GSE152345 with a previous study contains whole gene expression levels through high-throughput sequencing with BGISEQ-500 between pH 6.5 and pH 7.6 conditions of the MCF-7 cell line [21]. In addition, we downloaded GSE85871, which contains gene expression profiles of the MCF-7 cell line microarray (Affymetrix human genome U133A 2.0 array) treated with 102 traditional Chinese medicine-related compounds [22]. The gene list with 14 different functional states of tumor cells was downloaded from Can-cerSEA (http://biocc.hrbmu.edu.cn/CancerSEA/home.jsp, Accessed on 10 August 2021) [23]. Finally, we downloaded GSE161529, which contains single-cell sequencing data of tissues from each ER + , HER2 + , and triple-negative (TN) BC patient [24].

Data Collection
We downloaded GSE152345 transcriptomic (Illumina HiSeq platform) and survival data of a BC database from Xena TCGA (https://xenabrowser.net/hub/, accessed on 27 May 2021). The GSE152345 with a previous study contains whole gene expression levels through high-throughput sequencing with BGISEQ-500 between pH 6.5 and pH 7.6 conditions of the MCF-7 cell line [21]. In addition, we downloaded GSE85871, which contains gene expression profiles of the MCF-7 cell line microarray (Affymetrix human genome U133A 2.0 array) treated with 102 traditional Chinese medicine-related compounds [22]. The gene list with 14 different functional states of tumor cells was downloaded from CancerSEA (http://biocc.hrbmu.edu.cn/CancerSEA/home.jsp, accessed on 10 August 2021) [23]. Finally, we downloaded GSE161529, which contains single-cell sequencing data of tissues from each ER + , HER2 + , and triple-negative (TN) BC patient [24].

Acidosis-Dependent Prognosis-Related Genes
To analyze acidosis-dependent prognosis-related genes, we performed survival analysis with Cox-regression and a t-test for extracting differential expressed genes with TCGA BC and GSE152345 data, respectively. After that, genes that were commonly significant in both analyses were classified into 4 types according to hazard ratio and fold change. The 4 types were risk effect to prognosis and up-regulated in acidic condition (RU), protective effect to prognosis and up-regulated in acidic condition (PU), risk effect to prognosis and down-regulated in acidic condition (RD), and protective effect to prognosis and downregulated in acidic condition (PD).

Selection Method to Identify Compounds with Therapeutic Potential against Acidosis-Dependent Prognosis-Related Genes
Information gain (IG) was considered as an evaluation parameter to select compounds with therapeutic potential in association with expression patterns of the acidosis-dependent prognosis-related genes. According to the fold change value 0, among the up-or downregulation genes, the proportion of RU or PD was calculated based on Shannon's entropy, and the IG score was calculated.
Let us assume probability (P) and i contain a number of RU (ru) and PD (pd); therefore, Shannon's entropy was calculated as follows: Let us assume T as the population of targets such as RU or PD before splitting according to the fold change 0, and s contains up-regulated (u) and down-regulated (d). The IG of each compound was calculated as follows: We performed logistic regression and calculated IG values between those up-or downregulated by the compound treated group and directions such as positive or negative against 14 different functional states from the CancerSEA database, including angiogenesis, apoptosis, DNA damage, DNA repair, epithelial mesenchymal transition (EMT), invasion, differentiation, proliferation, cell cycle, metastasis, hypoxia, stemness, inflammation, and quiescence. The IG was obtained by the calculation formula described above.
Let us assume probability (P) and i contain a number of positives (p) and negatives correlated (n) with each 14 different functional states; the Shannon's entropy was calculated as follows: Let us assume the T as a population of the targets such as RU or PD before splitting according to the fold change 0, where s contains up-regulated (u) and down-regulated (d). The IG of each compound is calculated as follows: The logistic regression was performed with directions for each 14 different functional states as a dependent variable and up-and down-regulated in compound treated states as an independent variable.

Cell-to-Cell Interaction Analysis with scRNA-seq
We downloaded scRNA-seq data from normal (GSM4909262) samples, and ER + tumor (GSM4909313) paired samples were downloaded from GSE161529. Clustering and annotation of cell types of normal and ER + tumors were performed by Seurat [26] with default options, and cell-to-cell communication analysis was conducted with the CellChat (http://www.cellchat.org/, accessed on 19 August 2021) package of R [27]. The interaction database was retrieved from CellChatDB, which contains a total of 2,021 molecular interactions including paracrine/autocrine (60%), ECM-related (21%), and cell-cell contact (19%) interactions [27].

Statistical Analysis
We performed t-test, Cox-regression, and logistic regression analyses in this study. The threshold of the p-value is < 0.05 with a t-test and Cox-regression without adjusting for multiple comparisons. With logistic regression, we applied Bonferroni corrections for adjusting multiple comparisons. All of the statistical analyses were performed by Rstudio (Version 1.41106) and Python (Version 3.9).

Identification of Acidosis-Dependent Prognosis-Related Genes
To identify acidosis-dependent prognosis-related genes, a Cox-regression and t-test were performed between acidic (pH 6.5) and normal (pH 7.6) conditions with GSE152345 based on whole gene expression for the overall survival from TCGA, respectively ( Figure 2A, Tables S1 and S2). According to the hazard ratio and fold change, we classified the data into four types such as RU, PU, RD, and PD ( Figure 2B). As a result, 2148 (Table S1) and 2328 (Table S2) significant genes were extracted with the Cox-regression for extracting prognosisrelated genes and the t-test to identify differential expressed genes (DEGs), respectively ( Figure 2C). Among these significant genes, 307 genes were significant in both statistical analyses ( Figure 2C and Table S3). According to the hazard ratio and fold change, 35, 70, 163, and 39 genes were defined as RU, PD, PU, and RD, respectively ( Figure 2C).

Evaluation of Therapeutic Compounds against an Acidosis-Dependent Manner in BC
For this study, 102 traditional herbal medicine-related compounds were evaluated against acidic conditions. We focused on RU-or PD-type genes that cause an acidosisrelated risk effect by up-regulating risk genes or down-regulating protective genes, respectively. Among the 102 compounds, Bruceine D (BD) had the highest IG score (0.24) with RU and PD ( Figure 3A). Among the 27 up-regulated probes in the BD-treated group, 24 probes were PD type, which indicates a better prognosis and is down-regulated by acidic (pH 6.5) conditions (PDU) ( Figure 3B,C). In addition, among the 15 down-regulated probes in

Evaluation of Therapeutic Compounds against an Acidosis-Dependent Manner in BC
For this study, 102 traditional herbal medicine-related compounds were evaluated against acidic conditions. We focused on RU-or PD-type genes that cause an acidosisrelated risk effect by up-regulating risk genes or down-regulating protective genes, respectively. Among the 102 compounds, Bruceine D (BD) had the highest IG score (0.24) with RU and PD ( Figure 3A). Among the 27 up-regulated probes in the BD-treated group, 24 probes were PD type, which indicates a better prognosis and is down-regulated by acidic (pH 6.5) conditions (PDU) ( Figure 3B,C). In addition, among the 15 down-regulated probes in the BD-treated group, 10 probes were RU type, which indicates a poor prognosis and is up-regulated by acidic (pH 6.5) conditions (RUD) ( Figure 3B,C).

Effect of BD on 14 Different Cancer-Related Functional States with the CancerSEA Database
To evaluate the effect of BD on tumor cell-related functional states, we downloaded genes and directions with each of the 14 different BC-related functional states including angiogenesis, apoptosis, DNA damage, DNA repair, EMT, invasion, differentiation, proliferation, cell cycle, metastasis, hypoxia, stemness, inflammation, and quiescence ( Figure 4). In addition, with the nine total patients, including ER + , HER2 + , and TN, cancer cells from a single cell population were segmented ( Figures S1-S3), and percentages of positive cells from each of the 14 different tumorigenesis-related functional states were calculated ( Figure 4B,D and Figure S4). As a result, EMT and invasion showed higher IG values > 0.1 compared to the other phenotypes ( Figure 4A). Interestingly, the genes that positively correlated with EMT and invasion were highly expressed in the cancer cell populations ( Figure 4B,D). Our data showed that BD up-regulated genes had negative correlations with EMT, whereas down-regulated genes had positive correlations with invasion( Figure 4C,E).

Effect of BD on 14 Different Cancer-Related Functional States with the CancerSEA Database
To evaluate the effect of BD on tumor cell-related functional states, we downloaded

Discussion
Solid tumors adapt and survive in acidic TME through pH-regulating proteins such as the sodium/proton exchanger 1 (NHE1), sodium bicarbonate cotransporter (NBC), monocarboxylate transporters (MCT), and carbonic anhydrase [28,29]. The acidic TME has received attention due to its association with tumor development, progression, metastasis, drug resistance, and escape from immune surveillance [28,30]. The advances in modern technology have increased our understanding of the pharmacology and molecular mechanisms of traditional herbal medicine and its active compounds [31]. Natural compounds exert anti-cancer effects on the TME that consists of tumor cells, stromal cells, immune cells, and non-cellular components such as collagen and fibronectin either directly or indirectly [20,32]. In this study, we used a computational approach to evaluate 102 major TCM-related natural compounds for an acidic pH e environment in human BC.
After evaluation between 102 compounds and BC cells under an acidic pH e , BD was suggested as the candidate with the most potential to treat BC against an experimental acidic condition with the highest IG value (IG value: 0.24) ( Figure 3A and Table S5). BD is a major active quassinoid in Brucea javanica (L.) Merr., which has anti-cancer properties such as anti-proliferative and pro-apoptotic effects via c-Jun N-terminal kinase (JNK), mitogen-activated protein kinases (MAPK), phosphatidylinositol 3-kinase (PI3K)/protein kinase B (AKT)/mammalian target of Rapamycin (mTOR), and canonical Wnt signaling pathways against many cancers including BC [33,34]. Our data showed that 10 probes defined as RUD and 24 probes defined as PDU in BC cells were potential therapeutic targets of BD. Moreover, the number of genes defined by PDUs was higher than those defined by RUDs ( Figure 3C). This suggests that BD exhibits a pattern to prevent cancer progression by up-regulating the protective genes (NFKBIE, LIFR, SERPINB9, etc.), which were down-regulated due to acidification, rather than down-regulating the risk genes (FAM114A1, PLAC8, SCUBE2, etc.), which were up-regulated due to acidification.
We used the CancerSEA database aimed at identifying the correlation between BD and 14 functional states. The DEGs for BD were evaluated by the IG method, with the highest discrimination powers being against EMT and invasion with an IG score of >0.1 ( Figure 4A). Our segmented scRNA-seq data originating from ER + , HER2 + , and TN BC further supported that higher proportions were seen in cell populations that express positiverelated genes compared to negative-related genes against EMT or invasion ( Figure 4B,D). Prior studies have proposed the regulatory effect of BD on EMT, invasion, and apoptosis. BD inhibited STAT3 (Signal Transducer and Activator of Transcription 3) activation that attenuated the cell proliferation, migration, invasion, and stem cell-like properties of osteosarcoma cells [35]. BD dose-dependently increased E-cadherin, whereas BD decreased vimentin and β-catenin expression that resulted in the reduced migration and invasion abilities of MDA-MB-231 cells [36]. BD increased oxidative stress and inhibited the PI3K/Akt signaling pathway, inducing apoptosis in human pancreatic cancer cells [37]. Interestingly, we found that BD regulated genes with positive (ANXA2, HSP90B1, TGFBI, FN1) and negative (FRK) correlation with EMT, invasion, and metastasis. Overexpressed ANXA2 (Annexin A2) exhibits poor prognosis and correlates positively with invasion and metastasis in BC [38]. Annexin A2 interacts with STAT3 and mediates EGF (Epidermal growth factor) induced EMT [38]. ANXA2 knockdown suppressed β-catenin expression and inhibited EMT and invasion in ovarian cancer cells [39]. GRP94 (Glucose-regulated protein 94, HSP90B1) expression highly correlates with brain metastatic BC, and autophagy mediated the adaption and survival at metastatic sites [40]. FN1 (Fibronectin 1) is upregulated in various tumors and mediates cell proliferation and migration [41]. FN1 knockdown showed decreased cell migration and invasion by modulating EMT-related proteins such as E-cadherin, N-cadherin, and vimentin in MCF-7 cells [42]. In silico analysis showed that TGFBI (Transforming growth factor beta induced protein) is associated with poor prognosis and aggressive BC subtypes [43]. An in vivo study showed that TGFBI affected hypoxia and metastasis in BC [43]. Previous studies have shown the tumor suppressive role of FRK (Fyn-related kinase) by inhibiting cell proliferation, PTEN (Phosphatase and tensin homolog) degradation, and EGF receptor signaling in BC [44][45][46]. Overexpression of FRK inhibited STAT3 activation, which suppressed the EMT process and cell migration in BC cells [47]. BD treatment may induce ANXA2, TGFI, FN1, HSP90B1 down-regulation, and FRK up-regulation to inhibit tumor metastasis by regulating autophagy, hypoxia, β-catenin, and STAT3 signaling pathways ( Figure 6). suppressive role of FRK (Fyn-related kinase) by inhibiting cell proliferation, PTEN (Phosphatase and tensin homolog) degradation, and EGF receptor signaling in BC [44][45][46]. Overexpression of FRK inhibited STAT3 activation, which suppressed the EMT process and cell migration in BC cells [47]. BD treatment may induce ANXA2, TGFI, FN1, HSP90B1 down-regulation, and FRK up-regulation to inhibit tumor metastasis by regulating autophagy, hypoxia, β-catenin, and STAT3 signaling pathways ( Figure 6). We next determined whether BD interferes with the tumor cell-TME interactions. The cellular heterogeneity in the TME includes immune cells, fibroblasts, and stromal cells [48]. For this reason, we performed cell-to-cell interaction analysis with single-cell resolution. As a result, among the various ligand-receptor relationships in ER + TME, BD decreased APP and CD44 expression ( Figure 5G). The APP as a ligand interacted with CD74, which was mainly expressed in CD68 + macrophages in the cell-to-cell interaction analysis ( Figures 5C and S6A). APP encodes an amyloid beta precursor protein and is strongly linked to Alzheimer's disease [49]. Interaction between APP and CD74 reduces the production of beta amyloid in Alzheimer's disease [50]. Unfortunately, there is no evidence of a connection between APP and CD74 in cancer. However, APP is up-regulated in BC cells and tissues, which promotes tumor formation and progression [51,52]. CD44 is a member of the cell adhesion molecules that have been proposed as having conflicting functions, i.e., either being tumor-promoting or tumor-suppressing in BC [53]. CD44 promotes cancer cell migration and invasion by directly interacting with MMP-9, which degrades collagens [54,55]. CD44 stimulates multidrug resistance protein (P-glycoprotein) expression and leads to chemoresistance in BC cells [56]. Our results demonstrated that CD44 within ER + tumor cells interacted with SPP1, also known as osteopontin (OPN) with macrophages ( Figures 5E and S6B). A previous study revealed that the tumor-associated macrophages interacted with CD44 + cancer stem cells and enhanced tumorigenesis via the OPN/CD44 axis [57]. Additionally, OPN induced radiation resistance by activating the CD44 signaling pathway in glioma [58].
In summary, we used in silico methodology to select the most potent compound to target genes characterized by acidic pHe that influence the survival of BC patients. Among We next determined whether BD interferes with the tumor cell-TME interactions. The cellular heterogeneity in the TME includes immune cells, fibroblasts, and stromal cells [48]. For this reason, we performed cell-to-cell interaction analysis with single-cell resolution. As a result, among the various ligand-receptor relationships in ER + TME, BD decreased APP and CD44 expression ( Figure 5G). The APP as a ligand interacted with CD74, which was mainly expressed in CD68 + macrophages in the cell-to-cell interaction analysis ( Figure 5C and Figure S6A). APP encodes an amyloid beta precursor protein and is strongly linked to Alzheimer's disease [49]. Interaction between APP and CD74 reduces the production of beta amyloid in Alzheimer's disease [50]. Unfortunately, there is no evidence of a connection between APP and CD74 in cancer. However, APP is up-regulated in BC cells and tissues, which promotes tumor formation and progression [51,52]. CD44 is a member of the cell adhesion molecules that have been proposed as having conflicting functions, i.e., either being tumor-promoting or tumor-suppressing in BC [53]. CD44 promotes cancer cell migration and invasion by directly interacting with MMP-9, which degrades collagens [54,55]. CD44 stimulates multidrug resistance protein (P-glycoprotein) expression and leads to chemoresistance in BC cells [56]. Our results demonstrated that CD44 within ER + tumor cells interacted with SPP1, also known as osteopontin (OPN) with macrophages ( Figure 5E and Figure S6B). A previous study revealed that the tumor-associated macrophages interacted with CD44 + cancer stem cells and enhanced tumorigenesis via the OPN/CD44 axis [57]. Additionally, OPN induced radiation resistance by activating the CD44 signaling pathway in glioma [58].
In summary, we used in silico methodology to select the most potent compound to target genes characterized by acidic pH e that influence the survival of BC patients. Among the 102 natural compounds, BD showed the highest potential to regulate reprogrammed genes driven by acidosis, affecting the prognosis of BC. BD down-regulated genes that positively correlate with EMT and invasion, while it up-regulated genes that negatively correlate with EMT and invasion. BD showed therapeutic potential by changing the TME condition by reducing APP and CD44 expression. Taken together, BD may be an effective natural compound for treating BC metastasis driven by extracellular acidity. More research is needed to fully understand the mechanism of the action of BD in acidic TME.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/cells10102673/s1, Figure S1: Segmentation of ER+ breast cancer cells from single cell data with patients, Figure S2: Segmentation of HER+ breast cancer cells from single cell data with patients, Figure S3: Segmentation of TN breast cancer cells from single cell data with patients, Figure S4: Heatmap of proportion of positive cells within segmented tumor cells from scRNA-seq with 12 different functional states of cancer cells and gene expression status by BD treated or not, Figure S5: Analysis of normal scRNA-seq data (GSM4909262) which is paired with GSM4909313 ER+ BC, Figure S6: Dot plots of expression of ligands receptor with percent of positively expressed and average of expression in ER+ sample GSM (GSM4909313), Table S1: Total genes for survival analysis with breast cancer patients from TCGA database, Table S2: Total genes for DEG analysis between pH 6.5 and pH 7.6, Table S3: Significant genes with prognosis and differential expressed between acidic (pH 6.5) and normal (pH 7.6) condition, Table S4: Significant genes with differential expressed between each compound treated and control group according to acidosis dependent prognosis-related genes, Table S5: Evaluation of effectiveness of compound to acidosis dependent prognosis-related manner, Table S6: Whole significantly differential expressed genes between BD treated and control group.

Data Availability Statement:
The datasets presented in this study are available from the corresponding author upon request.

Conflicts of Interest:
The authors declare no conflict of interest.