In Silico Investigation of the Anti-Tumor Mechanisms of Epigallocatechin-3-Gallate

The EGCG, an important component of polyphenol in green tea, is well known due to its numerous health benefits. We employed the reverse docking method for the identification of the putative targets of EGCG in the anti-tumor target protein database and these targets were further uploaded to public databases in order to understand the underlying pharmacological mechanisms and search for novel EGCG-associated targets. Similarly, the pharmacological linkage between tumor-related proteins and EGCG was manually constructed in order to provide greater insight into the molecular mechanisms through a systematic integration with applicable bioinformatics. The results indicated that the anti-tumor mechanisms of EGCG may involve 12 signaling transduction pathways and 33 vital target proteins. Moreover, we also discovered four novel putative target proteins of EGCG, including IKBKB, KRAS, WEE1 and NTRK1, which are significantly related to tumorigenesis. In conclusion, this work may provide a useful perspective that will improve our understanding of the pharmacological mechanism of EGCG and identify novel potential therapeutic targets.

EGCG prevents cancer by suppressing tumor-associated proteins and modulating other signaling pathways involved in tumor development [4,6,13,14]. Furthermore, previous research demonstrated that EGCG effectively induces apoptosis and cell cycle arrest during tumorigenesis through regulation of NF-kB [15], cyclin D1 [4], p21/WAF1/CIP1 [16], p27/KIP1 [16] and cyclin-dependent kinases (CDK) [4,17]. Moreover, in vitro investigations have indicated that EGCG suppresses AKT and ERK phosphorylation, whereas it also enhances the activation of FOXO transcription factors, which results in cell cycle arrest and apoptosis [18]. In addition, EGCG binds to the 67-kDa laminin receptor (67LR) and hinders its expression during carcinogenesis [19,20]. Similarly, it also inhibits MAPK (mitogen-activated protein kinase) activity the 67-kDa laminin receptor (67LR) and hinders its expression during carcinogenesis [19,20]. Similarly, it also inhibits MAPK (mitogen-activated protein kinase) activity and is involved in the regulation of various important signaling pathways, including JNK and RAS signaling [4,16]. On the other hand, it suppresses the activity of Pin1 in order to modulate multiple signaling pathways, such as Wnt-β-catenin and NF-kB signaling [21][22][23]. A number of preclinical studies suggested that EGCG treatment decreases the incidence and diversity of tumors in various organs, such as liver, stomach, skin, lung, breast and colon [4,24]. Therefore, the discovery of the novel EGCG-related targets and the illustration of their molecular mechanisms of action would be very meaningful [4,6,13].
Despite gaining such remarkable scientific attention over the last years, little is known about the potential targets of EGCG and their molecular mechanisms [12]. With the advent of computational chemistry, structural biology and bioinformatics, the reverse docking method, a powerful tool for drug repositioning and rescue, has been frequently applied to facilitate the discovery of new targets or receptors [25,26]. Traditional docking involves one target-many ligands, while the reverse docking is one ligand-many targets, in which the ligand is docked against an array of relevant targets and is ranked according to 'some score' [27]. Therefore, several reverse docking tools, such as TarFisDock, idTarget, INVDOCK or conventional docking softwares (Autodock, Autodock Vina, DOCK, Glide and Ledock), have been successfully developed and explored in the last few years [28]. Some researchers have applied INVDOCK to predict the potential targets of the broad-spectrum anticancer drug BBR [29,30]. Zhang et al. employed TarFisDock for the rapid identification of bioactive compounds and their targets from medicinal plants [31]. Investigators have identified the potential antineoplastic targets of tea polyphenols using Autodock 4 and TarFisDock based on a comparative reverse docking strategy [12]. Wang et al. adopted a reverse docking method to predict human nuclear receptors of environmental contaminants by means of Autodock Vina software [32]. Chen et al. engaged Ledock and Autodock Vina for the prediction of the potential anti-tumor activity of marine compounds [26]. Park et al. recognized potential targets for ginsenosides by applying the Glide software [33]. Li et al. implemented a reverse docking approach for the demonstration of the pharmacological mechanism of G. biloba in the case of Alzheimer's disease [25]. Additionally, various scientific databases, such as KEGG [34], DAVID [35] and STRING database [36], have been generally used to explore the targets related to the possible molecular mechanisms.

Target Proteins Database and Reverse Virtual Screening
All the 299 proteins were installed in the anti-tumor target proteins database and annotated into the GAD (Genetic Association Database) in the DAVID 6.8 package [35]. The corresponding disease of the set of target proteins contained 360 diseases entries (p-value ≤ 0.05) and the top 15 entries are shown in Figure 2A and listed in Table S1, respectively. Diseases related to the tumor and cancer in our target database are lung cancer, breast cancer, bladder cancer, esophageal adenocarcinoma, colorectal cancer, prostate cancer, ovarian cancer, pancreatic neoplasms, thyroid cancer, neoplasms, stomach cancer, head and neck cancer, stomach neoplasms and leukemia. Simultaneously, 271 target proteins are annotated into GAD at an enrichment rate of 91.25%, which suggests that most target proteins are enriched in the disease databases.
including the utilization of public databases, reverse docking and MD simulations, in order to describe a novel paradigm for constructing easily interpretable networks. In short, our finding provides new insight into our understanding of the pharmacological mechanism of EGCG and its role in the prevention of carcinogenesis. Figure 1 demonstrates the schematic and overall result from each step in this scientific search.

Target Proteins Database and Reverse Virtual Screening
All the 299 proteins were installed in the anti-tumor target proteins database and annotated into the GAD (Genetic Association Database) in the DAVID 6.8 package [35]. The corresponding disease of the set of target proteins contained 360 diseases entries (p-value ≤ 0.05) and the top 15 entries are shown in Figure 2A and listed in Table S1, respectively. Diseases related to the tumor and cancer in our target database are lung cancer, breast cancer, bladder cancer, esophageal adenocarcinoma, colorectal cancer, prostate cancer, ovarian cancer, pancreatic neoplasms, thyroid cancer, neoplasms, stomach cancer, head and neck cancer, stomach neoplasms and leukemia. Simultaneously, 271 target proteins are annotated into GAD at an enrichment rate of 91.25%, which suggests that most target proteins are enriched in the disease databases. Before applying the formal reverse virtual screening, redocking was carried out to validate the accuracy of the docking program [26]. As seen in Figure 2B, the number of RMSD values was 8 (less than 0.5 Å), 12 (between 0.5 and 1 Å), 46 (between 1 and 2 Å), 13 (between 2 and 5) and 21 (greater than 5 Å), respectively. As the RMSD value for the heavy atoms of the ligand is less than 2 Å, this suggests that the parameters and the scoring algorithms are reasonable [26,37]. Moreover, according to the Autodock Vina, the RMSD value of 66 ligands was less while that of 34 ligands was greater than 2 Å. The similarity of the results to that of previous studies suggests that the parameters and the scoring algorithms are relatively reliable while using the Autodock Vina software [38].
For the identification of EGCG anti-tumor efficiency, we performed reverse virtual screening with a database of 299 target proteins. The detailed results are listed in Table S2. The top 40 target proteins were selected based on their docking score (Table 1). In addition, 10 target proteins were identified as an inhibitor in previous investigations [2,4,14,24]. The binding modes of the 10 reported targets with EGCG, including FYN-EGCG [39], NOS2-EGCG [40,41], CDK2-EGCG [42,43], ABL1-EGCG [44], SYK-EGCG [8], AKT2-EGCG [1,45], MAPK8-EGCG [46], IRAK4-EGCG [47], AKT1-EGCG [1,45] and APAF1-EGCG [48], are shown in Figure 3. As observed, all target EGCG compounds were posed into the binding site and form extensive interactions with the key residues. Before applying the formal reverse virtual screening, redocking was carried out to validate the accuracy of the docking program [26]. As seen in Figure 2B, the number of RMSD values was 8 (less than 0.5 Å), 12 (between 0.5 and 1 Å), 46 (between 1 and 2 Å), 13 (between 2 and 5) and 21 (greater than 5 Å), respectively. As the RMSD value for the heavy atoms of the ligand is less than 2 Å, this suggests that the parameters and the scoring algorithms are reasonable [26,37]. Moreover, according to the Autodock Vina, the RMSD value of 66 ligands was less while that of 34 ligands was greater than 2 Å. The similarity of the results to that of previous studies suggests that the parameters and the scoring algorithms are relatively reliable while using the Autodock Vina software [38].
According to Figure 3A, EGCG interacts with the key residues N19, M86, S89 and D148 and forms four hydrogen bonds in the binding sites of proto-oncogene tyrosine-protein kinase FYN (PDB ID: 2DQ7). Similarly, the intermolecular interactions between EGCG and nitric oxide synthase 2 (NOS2, PDB ID: 4NOS) is involved in hydrophobic interactions with the key residues N370, G371, W372, E377 and Y489 and forms four hydrogen bonds ( Figure 3B). Meanwhile, EGCG enters the hydrophobic pocket of cyclin-dependent kinase 2 (CDK2, PDB ID: 2IW9) and forms six hydrogen bonds with the key residues E12, H84, Q131, N132 and D145 ( Figure 3C). The key residues E316, N322, N368 and D381 interact in a hydrophobic manner with EGCG and develop four stable hydrogen bonds in tyrosine-protein kinase ABL1 (ABL1, PDB ID: 2V7A, Figure 3D). At the same time, EGCG relocates to the active sites of spleen tyrosine kinase (SYK, PDB ID: 1XBC) and it establishes nine hydrogen bonds with the important residues S379, K402, E420, A451, R498, S511 and D512 ( Figure 3E). The hydroxyl groups of EGCG forms two hydrogen bonds with the critical residues E279 and D4400 and it is involved in the hydrophobic interactions with the residues L158 and F163 in serine-threonine protein kinase AKT2 (PDB ID: 2JDR, Figure 3F). Moreover, the hydroxyl groups of EGCG also can form six hydrogen bonds by interacting with the residues I32, G38, K55, N114 and S155 in mitogen-activated protein kinase 8 (MAPK8 or JNK1, PDB ID: 3ELJ, Figure 3G). EGCG forms only one hydrogen bond with the residue Y264 and hydrophobic interactions with the residues M192, M265 and N319 in interleukin-1 receptor-associated kinase 4 (IRAK4, PDB ID: 2NRU, Figure 3H). EGCG inserts into the binding site of serine-threonine protein kinase AKT1 (PDB ID: 3MVH) with a hydroxyl group and forms four hydrogen bonds with the important residues L156, G159, E234 and D292 ( Figure 3I). Similarly, the interactions between EGCG and apoptotic protease activating factor 1 (APAF1, PDB ID: 1Z6T) involves the formation of extensive hydrogen bonds with the residues Q121, V125, V127, S161, V162 and H438 ( Figure 3J).  According to Figure 3A, EGCG interacts with the key residues N19, M86, S89 and D148 and forms four hydrogen bonds in the binding sites of proto-oncogene tyrosine-protein kinase FYN (PDB ID: 2DQ7). Similarly, the intermolecular interactions between EGCG and nitric oxide synthase 2 (NOS2, PDB ID: 4NOS) is involved in hydrophobic interactions with the key residues N370, G371, W372, E377 and Y489 and forms four hydrogen bonds ( Figure 3B). Meanwhile, EGCG enters the hydrophobic pocket of cyclin-dependent kinase 2 (CDK2, PDB ID: 2IW9) and forms six hydrogen bonds with the key residues E12, H84, Q131, N132 and D145 ( Figure 3C). The key residues E316, Note: Lig_score and EGCG_score represented the docking scores of original ligand and EGCG, respectively. Uniprot_ID and Ligand_ID represented the identification (ID) of the target proteins from uniport database and original ligand from PDB database, respectively.

GO Analysis and KEGG Pathway Enrichment
In the present research, the top 40 putative target proteins were annotated according to the GO term and KEGG pathway [25,26]. The GO term was divided into three functional parts: molecular function, biological process and cellular component [49]. As shown in Figure 4A, the molecular function mainly contains protein binding, ATP binding, protein kinase activity, protein homodimerization activity and magnesium ion binding; the cellular component is related to cytosol, nucleus, cytoplasm, nucleoplasm and intracellular space; and the biological program is involved in protein phosphorylation, regulation of transcription, apoptosis process, cell division and innate immune response.
term and KEGG pathway [25,26]. The GO term was divided into three functional parts: molecular function, biological process and cellular component [49]. As shown in Figure 4A, the molecular function mainly contains protein binding, ATP binding, protein kinase activity, protein homodimerization activity and magnesium ion binding; the cellular component is related to cytosol, nucleus, cytoplasm, nucleoplasm and intracellular space; and the biological program is involved in protein phosphorylation, regulation of transcription, apoptosis process, cell division and innate immune response. Based on the p-value ≤ 0.05, the top 40 putative target proteins were mapped to the reference pathway in the KEGG database [34] and were involved in 98 pathways (Table S3). The bubble diagram was employed in Figure 4B to reveal the result of the complicated KEGG pathway enrichment. We observed that these signaling pathways are evidently related with human tumors or cancer, including small cell lung cancer as they are involved in related processes, such as apoptosis, neurotrophin, FoxO, MAPK, PI3K/Akt, ErbB, TNF and Toll-like receptor signaling pathway. Previous literature demonstrated that EGCG can effectively suppress carcinogenesis by modulating various signaling pathways, such as apoptosis, FoXO, ErbB, JAK/STAT, MAPK, PI3K/AKT, Wnt, AMPK and Notch [2,4,6,13,24]. Combining this information with the above KEGG analysis, we can conclude that the anti-tumor mechanism of EGCG is extremely complicated and involves multiple signaling pathways. Based on the p-value ≤ 0.05, the top 40 putative target proteins were mapped to the reference pathway in the KEGG database [34] and were involved in 98 pathways (Table S3). The bubble diagram was employed in Figure 4B to reveal the result of the complicated KEGG pathway enrichment. We observed that these signaling pathways are evidently related with human tumors or cancer, including small cell lung cancer as they are involved in related processes, such as apoptosis, neurotrophin, FoxO, MAPK, PI3K/Akt, ErbB, TNF and Toll-like receptor signaling pathway. Previous literature demonstrated that EGCG can effectively suppress carcinogenesis by modulating various signaling pathways, such as apoptosis, FoXO, ErbB, JAK/STAT, MAPK, PI3K/AKT, Wnt, AMPK and Notch [2,4,6,13,24]. Combining this information with the above KEGG analysis, we can conclude that the anti-tumor mechanism of EGCG is extremely complicated and involves multiple signaling pathways.

Analysis of Pharmacological Network
To further explore the relationship of complex pathways, the PPI network was prepared by uploading the top 40 putative target proteins to the STRING database [25,36]. A total of 37 putative target proteins were mapped to the STRING database and the results were displayed by using Cytoscape tools ( Figure 5A). By observing the PPI (protein-protein interaction) network, we can conclude that AKT1, CDK2, MAPK8, KRAS, TOP2A, MAP3K5 play a significant role in relative signaling pathways, such as MAPK, AKT/PI3K, Ras and cell cycle. Additionally, IKBKB, AURKA, BUB1, WEE1, APAF1, ABL1, FYN and PRKCQ are also important and this is consistent with their molecular function.

Exploration and Identification of the Potential Target Proteins
After connecting PPI with TP network, we observed that the target proteins, such as AKT1/2, KRAS, MAPK8, CDK2, IRAK4, IKBKB, MAP3K5, APAF1, JAK3, WEE1 and NTRK1, play a relatively important role in relative anti-tumor pathways. However, six relative targets, including IKBKB, MAP3K5, KRAS, JAK3, WEE1 and NTRK1, were not verified by the scientific methods. MD simulations are a powerful and effective method to validate relative target-EGCG compounds [50]. Therefore, based on the conformation of reverse docking, the MD simulations of six target-EGCG compounds were carried out in this research.
The assessment of the backbone Cα-RMSD value for each target-EGCG system is used to measure the structural stability in the MD simulation [51,52] as shown in Figure 6. As shown, most of the systems reached equilibrium after 5 ns when running a MD simulation with an average backbone Cα-RMSD of about 3 Å. Furthermore, we also observed that the backbone Cα-RMSD values of target-EGCG systems are less or equal to the targets and target-original ligand complexes during last 10 ns MD simulation, suggesting that EGCG are tightly bound to these targets and block the enzymatic activities.

Exploration and Identification of the Potential Target Proteins
After connecting PPI with TP network, we observed that the target proteins, such as AKT1/2, KRAS, MAPK8, CDK2, IRAK4, IKBKB, MAP3K5, APAF1, JAK3, WEE1 and NTRK1, play a relatively important role in relative anti-tumor pathways. However, six relative targets, including IKBKB, MAP3K5, KRAS, JAK3, WEE1 and NTRK1, were not verified by the scientific methods. MD simulations are a powerful and effective method to validate relative target-EGCG compounds [50]. Therefore, based on the conformation of reverse docking, the MD simulations of six target-EGCG compounds were carried out in this research.
The assessment of the backbone Cα-RMSD value for each target-EGCG system is used to measure the structural stability in the MD simulation [51,52] as shown in Figure 6. As shown, most of the systems reached equilibrium after 5 ns when running a MD simulation with an average backbone Cα-RMSD of about 3 Å. Furthermore, we also observed that the backbone Cα-RMSD values of target-EGCG systems are less or equal to the targets and target-original ligand complexes during last 10 ns MD simulation, suggesting that EGCG are tightly bound to these targets and block the enzymatic activities.  Based on the above MD results, the potential binding models of these target-EGCG compounds were further analyzed (Figure 7). Figure 7A shows that EGCG is stabilized in the binding site of mitosis inhibitor protein kinase WEE1 (PDB ID: 1X8B) through six hydrogen bonds with C89, N86, S140 and S140. On the other hand, EGCG interacted with tyrosine-protein kinase JAK3 (PDB ID: 1YVJ) and constructs five hydrogen bonds with the key residues K17, L92, R140 and D154 ( Figure 7B). Similar to the binding mode of KRAS-EGCG compounds (KRAS, kirsten rat sarcoma viral oncogene; PDB ID: 3GFT), EGCG is stabilized in the binding site through five hydrogen bonds with the key residues G15, K16, D33, D1189 and A146 ( Figure 7C). Meanwhile, EGCG enters the hydrophobic pocket of high affinity nerve growth factor receptor (NTRK1, PDB ID: 4AOJ) and forms four hydrogen bonds with the critical residues D97, R100, G168 and S173 ( Figure 7D). EGCG enters the hydrophobic pocket of mitogen-activated protein kinase kinase kinase 5 (MAP3K5, PDB ID: 2CLQ) and forms five hydrogen bonds with the key residues K40, V88, S92, S152 and F154 ( Figure 7E). Likewise, as shown in Figure 7F, EGCG interacts with the key residues T15, K36, C91, D95 and N142 and constructs four hydrogen bonds in the binding sites of the inhibitor of nuclear factor kappa-B kinase subunit beta (IKBKB, PDB ID: 4KIK).  Based on the above MD results, the potential binding models of these target-EGCG compounds were further analyzed (Figure 7). Figure 7A shows that EGCG is stabilized in the binding site of mitosis inhibitor protein kinase WEE1 (PDB ID: 1X8B) through six hydrogen bonds with C89, N86, S140 and S140. On the other hand, EGCG interacted with tyrosine-protein kinase JAK3 (PDB ID: 1YVJ) and constructs five hydrogen bonds with the key residues K17, L92, R140 and D154 (Figure

Enzymatic Activity Assay In Vitro
To further verify the above-described results, enzymatic activity assays were implemented ( Figure 8). We observed that EGCG effectively inhibits IKBKB, KRAS and NTRK1 activity while moderately suppresses WEE1 activity. Furthermore, it only slightly affects the JAK3 and MAP3K5 activity. Taken together, the above-mentioned results suggest that IKBKB, KRAS, NTRK1 and WEE1 might be possible novel potential targets of EGCG.
To further verify the above-described results, enzymatic activity assays were implemented (Figure 8). We observed that EGCG effectively inhibits IKBKB, KRAS and NTRK1 activity while moderately suppresses WEE1 activity. Furthermore, it only slightly affects the JAK3 and MAP3K5 activity. Taken together, the above-mentioned results suggest that IKBKB, KRAS, NTRK1 and WEE1 might be possible novel potential targets of EGCG.

Discussion
EGCG, the green tea polyphenol, suppresses tumors through the inhibition of various dynamic tumor-associated proteins and the modulation of relative signaling pathways [4,6,13,14]. Several clinical trials have declared that EGCG is an important anti-tumor compound but very little is known about its potential targets and its molecular mechanisms [12]. Therefore, in the present work, we integrated bioinformatics and computational chemistry methodologies for the discovery of novel EGCG targets and tried to explain its anti-tumor mechanism.

Discussion
EGCG, the green tea polyphenol, suppresses tumors through the inhibition of various dynamic tumor-associated proteins and the modulation of relative signaling pathways [4,6,13,14]. Several clinical trials have declared that EGCG is an important anti-tumor compound but very little is known about its potential targets and its molecular mechanisms [12]. Therefore, in the present work, we integrated bioinformatics and computational chemistry methodologies for the discovery of novel EGCG targets and tried to explain its anti-tumor mechanism.
Previous research illustrated that EGCG induces apoptosis and cell cycle arrest during tumorigenesis through the regulation of NF-kB [15,53,54], cyclin D1 [4] and cyclin-dependent kinases (CDK) [4,17]. Similarly, our reverse docking results indicated that EGCG targets CDK4 and CDK7, which are among the top 40 target proteins. Furthermore, Masuda et al. and Aggarwal et al. showed that EGCG inhibits the activation of NF-kB in head and neck cancer, breast cancer and colon cancer and it is involved in the NF-kB signaling pathway [55,56]. Conversely, we obtained a lower docking score of NFKB1 (nuclear factor NF-kappa B, only -6.2 KJ/mol, Table S2) was detected in our reverse docking research. Interestingly, IKBKB that is involved in NF-kB signaling pathway possesses a high docking score and inhibition rate (Table 1 and Figure 8) so based on this information, we can say that IKBKB may be a novel potential target of EGCG. Similarly, Shankar et al. confirmed that EGCG suppressed AKT and ERK phosphorylation and improves the activation of FOXO transcription factors that results in cell cycle arrest and apoptosis [57]. We can easily assume that EGCG targets AKT1/2 due to the high docking scores (Table 1 and Figure 9), whereas FOXO family proteins (FOXO3A and FOXO4) have extremely low docking scores (Table S3). These results demonstrate that EGCG not only modulates the FOXO signaling pathway by the activation of FOXO transcription factors but also regulates the PI3K-AKT signaling pathway through the inhibition of AKT1/2 activity. Previous research illustrated that EGCG induces apoptosis and cell cycle arrest during tumorigenesis through the regulation of NF-kB [15,53,54], cyclin D1 [4] and cyclin-dependent kinases (CDK) [4,17]. Similarly, our reverse docking results indicated that EGCG targets CDK4 and CDK7, which are among the top 40 target proteins. Furthermore, Masuda et al. and Aggarwal et al. showed that EGCG inhibits the activation of NF-kB in head and neck cancer, breast cancer and colon cancer and it is involved in the NF-kB signaling pathway [55,56]. Conversely, we obtained a lower docking score of NFKB1 (nuclear factor NF-kappa B, only -6.2 KJ/mol, Table S2) was detected in our reverse docking research. Interestingly, IKBKB that is involved in NF-kB signaling pathway possesses a high docking score and inhibition rate (Table 1 and Figure 8) so based on this information, we can say that IKBKB may be a novel potential target of EGCG. Similarly, Shankar et al. confirmed that EGCG suppressed AKT and ERK phosphorylation and improves the activation of FOXO transcription factors that results in cell cycle arrest and apoptosis [57]. We can easily assume that EGCG targets AKT1/2 due to the high docking scores (Table 1 and Figure 9), whereas FOXO family proteins (FOXO3A and FOXO4) have extremely low docking scores (Table S3). These results demonstrate that EGCG not only modulates the FOXO signaling pathway by the activation of FOXO transcription factors but also regulates the PI3K-AKT signaling pathway through the inhibition of AKT1/2 activity.
Recently, EGCG has been reported to be involved in the inhibition of MAP kinase activity and plays a role in various important signaling pathways, including JNK, RAS and apoptosis [4,16]. Our results indicated that EGCG may suppress JNK1 (MAPK8), MAP3K5 and MAP2K4 according to the docking scores, whereas JNK1 merely has been verified by scientific experiments [46]. Therefore, we also deduced that MAP3K5 and MAP2K4 might be novel potential targets for EGCG. However, the enzymatic activity assay indicated that EGCG can slightly affect MAP3K5 activity. In addition, Senggunprai et al. showed that EGCG suppresses the JAK-STAT signaling pathway activation via modulating the phosphorylation of STAT1 and STAT3 and inhibiting the expression level of iNOS (inducible nitric oxide synthase) and ICAM-1 (intercellular cell adhesion molecule-1) in cholangiocarcinoma cells [58]. Thus, according to our findings, EGCG may regulate the JAK-STAT signaling pathway by targeting NOS2 and JAK3 (Figure 9). However, the enzymatic activity assays indicated that JAK3 was not an ideal target for EGCG ( Figure 8). Unluckily, we did not detect the docking scores of STAT2 and STAT3 due to the unavailability of their crystal structure in the PDB database. Recently, EGCG has been reported to be involved in the inhibition of MAP kinase activity and plays a role in various important signaling pathways, including JNK, RAS and apoptosis [4,16]. Our results indicated that EGCG may suppress JNK1 (MAPK8), MAP3K5 and MAP2K4 according to the docking scores, whereas JNK1 merely has been verified by scientific experiments [46]. Therefore, we also deduced that MAP3K5 and MAP2K4 might be novel potential targets for EGCG. However, the enzymatic activity assay indicated that EGCG can slightly affect MAP3K5 activity. In addition, Senggunprai et al. showed that EGCG suppresses the JAK-STAT signaling pathway activation via modulating the phosphorylation of STAT1 and STAT3 and inhibiting the expression level of iNOS (inducible nitric oxide synthase) and ICAM-1 (intercellular cell adhesion molecule-1) in cholangiocarcinoma cells [58]. Thus, according to our findings, EGCG may regulate the JAK-STAT signaling pathway by targeting NOS2 and JAK3 ( Figure 9). However, the enzymatic activity assays indicated that JAK3 was not an ideal target for EGCG ( Figure 8). Unluckily, we did not detect the docking scores of STAT2 and STAT3 due to the unavailability of their crystal structure in the PDB database.
In addition, Byun et al. revealed that EGCG modulates Toll-like signaling pathway via suppressing the expression of 67-kDa laminin receptor (67LR) and downregulating TLR2 and TLR4 on dendritic and leukemia cells [19,20]. In the same way, our reverse docking experiments indicated that EGCG also modulated Toll-like signaling pathway via targeting IRAK4 (Figure 9). Therefore, IRAK4 also may be considered as a potential target for EGCG and the enzymatic activity assays further verified this hypothesis. Excitingly, two novel signaling pathways were regulated via SYK and PRKCQ in B and T cell receptor signaling pathway, respectively ( Figure 9). The results suggest that EGCG may suppress SYK and PRKCQ to modulate these important pathways.
In conclusion, we applied a series of computational chemistry, bioinformatics and enzymatic activity assay approaches to explore the anti-tumor mechanisms of EGCG and discovered several potential targets and novel pathways. However, most of the proteins had an unknown crystal structure. In addition, the set of anti-tumor targets was relatively small and it only contained 299 tumor-associated proteins in our works. However, Lauro et al. and Chen et al. established a similar set of targets to implement the reverse docking and it also contained only 126 and 150 different proteins, respectively [26,59]. Regrettably, our work had some shortcomings as it only applied a single docking program, Autodock Vina, to perform reverse docking rather than multiple programs. In addition, our work considered only the top 40 targets in the results of reverse docking, but the excluded proteins may also be potential or known targets of EGCG. The MD simulations were performed to assess the binding stability of compounds and the enzymatic activity assays were further carried out to validate the possible targets. However, the simulation times were not long enough. Similarly, the enzymatic activity was measured to validate the six possible targets, but biological assays were not long and deep enough. In short, the present research provides great inspiration and highly encourages the study of the pharmacological mechanism of natural products, identification of therapeutic targets and discovery of novel signaling pathways.

Construction of Anti-tumor Target Proteins Database
The database of anti-tumor target proteins was constructed based on related scientific studies and literature [26,[59][60][61] and was also downloaded from the protein databank (PDB). The undefined binding sites and repeated targets were removed from the database of anti-tumor target proteins. Finally, 299 different target proteins were constructed from PDB and installed in the database of anti-tumor targets.

Reverse Virtual Screening
The crystal structures of the targets were prepared based on the following strategies [26]: (1) removal of the water molecules, heteroatoms, ions, co-crystallized ligands and repeated chains using Pymol 1.3.X software; and (2) completing the missing residues using SWISS-MODELING software. The binding site was determined based on the co-crystallized ligand in crystal structures. The geometry center of the co-crystallized ligand was denoted as the active center box and the size_x, size_y, size_z was set to 22.5, 22.5 and 22.5 using the Autodock Vina plugin in Pymol 1.3.X software [62] (Table S4). The target proteins and small molecule epigallocatechin-3-gallate (EGCG) were automatically processed, including adding polar hydrogens and charges, by using the protein and ligand preparation python script in Autodock tools 1.4.5. Reverse virtual screening was performed by using Autodock Vina software and all parameters were adopted as their defaults [38]. Finally, the conformation of each target-EGCG compound was generated and the binding scores were calculated.
The root means square deviation (RMSD) is used to validate the accuracy of the docking program to replicate the ligand conformation observed in the crystal structure. Therefore, a total of 100 original ligands were selected randomly and redocked into the binding pocket of targets as a test set. The VMD 1.9.3 tool was used to calculate the RMSD of ligand heavy atoms between docking and crystal conformation [26,63].

GO Analysis and KEGG Pathway Enrichment
GO (Gene Ontology) enrichment analysis of the interesting targets was carried out using the DAVID 6.8 tools [35]. The KEGG (Kyoto encyclopedia of genes and genomes) enrichment analysis of the objective targets was assigned using the online KOBAS 3.0 tools [64]. The p-value was used to identify the significance of the GO terms and KEGG pathways and p-value ≤ 0.05 was regarded as being significant and interesting. The ggplot2 package was installed in the R 3.5.2 software for visualizing terms [65].

Pharmacological Network Analysis
The analysis of PPI (protein-protein interaction) was established according to the STRING database [36]. Based on the results of KEGG pathway, the analysis of TP (Target-pathway) was realized by using Cytoscape 3.7.1 software [66]. The NetworkAnalyzer plugin was applied to obtain the parameters of the network topology, such as nodes, degree and edge betweenness [25].

Molecular Dynamic Simulation
The MD (molecular dynamic) simulations were performed using Gromacs 4.6.5 tools [67,68]. Based on the Autodock Vina, the initial configurations of the EGCG-target complex were considered as a confirmation of MD simulation. The topology files of target proteins were obtained by using the pdb2gmx [37,52]. The topology files of EGCG were generated by the antechamber and tleap tools from Ambertools 18 software using the AM1/BCC charge method [69]. The force field of proteins and ligands were Amberff99SB and GAFF (a general Amber force field), respectively [69]. The complex system was preprocessed by the following steps: (1) the TIP3P water molecules solving the system with the dodecahedron box; (2) Na + and Cl − ions neutralizing the system; and (3) the periodic boundary condition with a minimal distance of 1.2 nm [37]. The systems first had their energy minimized using the steepest descent minimization method to limit the energy to 1000 Kcal/mol/nm [69]. Before the formal simulation, the complex system was carried out at both 100 ps NVT and NPT ensemble with 300 K temperature and 1 atm pressure [22,51]. The long-range electrostatic interactions were calculated using the PME (Particle Mesh Ewald) method and the hydrogen bonds were constrained using the LINCS (Linear Constraint Solver) algorithm. The non-bonded cutoff distance was set to 12 Å and the SETTLE algorithm was applied to restrict water molecules. The 20 ns MD simulations were performed with a time step of 2 fs and the atomic coordinates were recorded every 5000 fs. Pymol 1.3.X was further used to generate visual conformation.

Enzymatic Activity Assay In Vitro
EGCG (HPLC ≥ 99%) was purchased from Alfa biotechnology corporation (Chengdu, China). The kinase kits of six interesting targets were obtained from Mlbio biotechnology corporation (Shanghai, China). The assays were carried out according to the instructions of the kinase kits. Firstly, 50 µL of the compound solution was added to 96-well plates and incubated for 30 min at 37 • C. Second, the compound solution was removed and 50 µL of the HRP-conjugate reagent was added and incubated for 30 min at 37 • C. Third, HRP-conjugate reagent was removed before chromogen solution A and B was added and incubated for 10 min at 37 • C. Finally, the 50 µL stop solution was added and Bio-Rad iMark microplate reader was used to measure the activity at 450 nm.

Conclusions
We integrated the computational chemistry and bioinformatics techniques to analyze the anti-tumor mechanism of EGCG and explore possible targets and pathways. According to our findings the mechanism involves 12 signaling transduction pathways, such as apoptosis, PI3K-AKT, MAPK, FOXO and KRAS, and 33 vital target proteins. Furthermore, we also discovered that six tumor-associated proteins may be novel potential targets for EGCG but four targets were only moderately inhibited, including IKBKB, KRAS, WEE1 and NTRK1. Moreover, the establishment of networks between tumor-related proteins and EGCG may have important implications for understanding the underlying anti-tumor mechanisms of EGCG. Convincingly, a new therapeutic strategy for tumor may be developed from the identified targets and pathways. In summary, this study provided important information for studying the pharmacological mechanism of EGCG and identifying novel potential therapeutic targets.
Supplementary Materials: The following are available online. Table S1: The top15 terms for genetic association database. Author Contributions: L.D. conceived the idea and supervised the project. W.W., X.X. and X.L. performed experiments and wrote the manuscript. W.W., X.X. and X.L. analyzed the data. Q.Z., L.D. and W.Y. reviewed and edited the manuscript. All authors read and approved the final manuscript.