In-Silico Characterization of Estrogen Reactivating β-Glucuronidase Enzyme in GIT Associated Microbiota of Normal Human and Breast Cancer Patients

Estrogen circulating in blood has been shown to be a strong biomarker for breast cancer. β-Glucuronidase (GUS) enzymes from human gastrointestinal tract (GIT) microbiota, including probiotics, contribute significantly to raising the estrogen concentration in blood through deconjugation of glucuronidated estrogens. The present project was designed to explore the repertoire of GIT microbiome-encoded GUS enzymes (GUSOME) in normal humans and breast cancer patients. For this purpose, a total of nineteen GUS enzymes from human GIT microbes, i.e., seven from healthy individuals and twelve from breast cancer patients, were examined. Protein sequences of the enzymes retrieved from the UniProt database were subjected to ProtParam, CELLO2GO, SOPMA (secondary structure prediction method), PDBsum (Protein Database summaries), PHYRE2 (Protein Homology/AnalogY Recognition Engine), SAVES v6.0 (Structure Validation Server), MEME version 5.4.1 (Multiple Em for Motif Elicitation), Caver Web server v 1.1, Interproscan and the Predicted Antigenic Peptides tool. Analysis revealed the number of amino acids, isoelectric point, extinction coefficient, instability index and aliphatic index of GUS enzymes in the range of 586–795, 4.91–8.92, 89,980–155,075, 25.88–40.93 and 71.01–88.10, respectively. Sub-cellular localization of the enzyme was restricted to the cytoplasm and inner membrane in the bacteria of breast cancer patients, as compared to the periplasmic space, outer membrane and extracellular space in normal GIT bacteria. The 2-D structure analysis showed α helix, extended strand, β turn and random coil in the range of 22.66–27.42%, 22.04–25.91%, 5.39–8.30% and 41.75–47.70%, respectively. The druggability score was found to be 0.05–0.45 and 0.06–0.80 in the GIT of normal and breast cancer patients, respectively. The radius, length and curvature of catalytic sites were observed to be 1.1–2.8 Å, 1.4–15.9 Å and 0.65–1.4, respectively. Ten conserved protein motifs with p < 0.05 and width 25–50 were found. Antigenic propensity-associated sequences were 20–29. The findings of the present study hint at the use of bacterial GUS enzymes against breast cancer tumors after modification, via site-directed mutagenesis of the catalytic sites involved in the activation of estrogens and through destabilization of these enzymes.


Introduction
A variety of microbes exist in the human gastrointestinal tract (GIT), and these may have a positive or negative impact on the human body. Microbes with a positive impact are usually known as probiotics. Probiotics are living microorganisms that confer benefits to human health when administered in sufficient concentrations [1]. They are available as live microbial feed supplements [2]. They exert positive effects on human health such as digestion of lactose, normalization of small bowel-associated microbes, conferring resistance to enteric pathogens, immune system regulation, anticancer (mL1, 2), No Loop (NL) and no coverage groups [25,26]. Three types of GUS, i.e., BuGUS-1, BuGUS-2 and BuGUS-3, have been reported in Bacteroides uniformis [27].
In 2019, the role of β-glucuronidase in the reactivation of estrogen was demonstrated experimentally [28]. Estrogen is found in two circulating forms, i.e., estradiol and estrone in pre- and postmenopausal women, respectively [29]. During estrogen metabolism, both these forms are conjugated with glucuronic acid in the presence of UDP-glucuronosyl transferase enzymes (UGTs), leading to the formation of estrone 3-glucuronide and estradiol-17-glucuronide [30]. Owing to their high polarity and hydrophilicity, glucuronidated estrogens tend to dissolve in blood and are excreted via urine. However, a major proportion of the conjugated forms enters the GI tract via bile and is metabolized further [31]. Once in the intestine, glucuronidated estrogens are deconjugated in the presence of GUSOME into the aglycones estrone and estradiol. The activated estrogen is absorbed in the mucosa and re-enters blood circulation via the portal vein, thus contributing to breast cancer (Figure 1) [32].
Breast cancer has been found to be reduced in human females on stopping estrogen replacement therapy (ERT) [33]. Estrogen has been reported to be a potential biomarker for breast cancer [34]. This is due to its contribution to enhanced proliferation of cancerous cells, stimulation of angiogenesis and metastasis, and resistance to chemotherapy [35–38]. Keeping in view the association of estrogen with breast cancer and the significant role of the microbial GUS enzyme in reactivation of this hormone, we designed the present study. This study targets the GUS enzymes in bacteria inhabiting the GIT of normal and breast cancer patients. Characterization of the GUS enzyme might help in the manipulation of GIT-associated bacteria, including probiotics, to reduce estrogen-related cancer risk. Manipulation can be performed to reduce the stability of the enzyme and to alter its 3-D configuration and catalytic site, thus reducing the breast cancer risk associated with the activity of these enzymes.

Protein Sequences
To retrieve the protein sequences of the bacteria documented in the present project, the UniProt database (https://www.uniprot.org, accessed on 21–24 July 2022) was explored [39]. The sequences retrieved, their accession numbers and the bacterial species selected for this study are listed in Supplementary Data Table S1.

Phylogeny Analysis
To construct the phylogenetic tree, the protein sequences of the nineteen bacteria documented in the present study were first aligned using the Clustal Omega Multiple Sequence Alignment Tool [40]. The aligned file was then subjected to MEGA version 7 [41]. The evolutionary history was inferred using the Neighbor-Joining method [42]. The evolutionary distances were computed using the Poisson correction method and are in units of the number of amino acid substitutions per site [43]. All positions containing gaps and missing data were eliminated. There were a total of 586 positions in the final dataset.
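As an illustration of the Poisson correction mentioned above, the distance between two aligned protein sequences is commonly written as d = −ln(1 − p), where p is the proportion of sites at which the two sequences differ. The following is a minimal sketch under that assumption; the function names are hypothetical, gapped columns are dropped pairwise (MEGA's complete-deletion option removes them across the whole alignment), and this is not MEGA's implementation.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored
    (pairwise simplification; an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p), in substitutions per site."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Toy aligned fragments (illustrative only, not sequences from this study).
    print(poisson_distance("MKVLA", "MKVLG"))
```

The correction matters most for divergent pairs: as p grows, d rises faster than p because multiple substitutions at the same site become more likely.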

Prediction of Physicochemical Properties
To explore the physicochemical properties of the bacterial GUS proteins, the ProtParam tool (https://web.expasy.org/protparam/, accessed on 24 July 2022) was employed. The computed attributes of the bacterial proteins include molecular weight, theoretical isoelectric point (pI), half-life, instability index and aliphatic index.

Sub-Cellular Localization and Ontology Analysis
To predict the sub-cellular localization and ontology of uidA encoded GUS protein in the bacteria addressed in present study, CELLO2GO tool (cello.life.nctu.edu.tw/cello2go/, accessed on 25 July 2022) was employed.

2D Structure
For secondary structure prediction, the SOPMA tool from Network Protein Sequence Analysis (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_sopma.html, accessed on 30 July 2022) was employed [44]. To predict the secondary motif map, the PDBsum tool (http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=index.html, accessed on 30 July 2022) was used. To predict the catalytic site of the GUS enzyme, Caver Web server v 1.1 (https://loschmidt.chemi.muni.cz/caverweb/, accessed on 4 August 2022) was used [45]. Secondary structure building blocks make up 25 to 75% of a protein [46]. α helix, extended strand, β turn and bends are the basic elements of secondary conformation. It is important to analyze the impact of SNPs on these elements in order to gain an insight into the deleteriousness of SNPs.
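The percentage composition reported for a secondary structure prediction can be recomputed from the per-residue state string. The sketch below assumes a four-state encoding with the letters h (α helix), e (extended strand), t (β turn) and c (random coil); the letter codes, function name and example string are illustrative assumptions rather than output of the present study.

```python
from collections import Counter

# Assumed one-letter state codes for a four-state prediction.
STATE_NAMES = {
    "h": "alpha helix",
    "e": "extended strand",
    "t": "beta turn",
    "c": "random coil",
}


def secondary_structure_composition(states: str) -> dict:
    """Return the percentage of residues assigned to each state."""
    states = states.lower()
    if not states:
        raise ValueError("empty state string")
    counts = Counter(states)
    unknown = set(counts) - set(STATE_NAMES)
    if unknown:
        raise ValueError(f"unexpected state codes: {sorted(unknown)}")
    total = len(states)
    return {
        name: 100.0 * counts.get(code, 0) / total
        for code, name in STATE_NAMES.items()
    }


if __name__ == "__main__":
    # Made-up 10-residue prediction string.
    for name, pct in secondary_structure_composition("hhhheettcc").items():
        print(f"{name}: {pct:.2f}%")
```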

3D Structure
The three-dimensional structures of the GUS enzyme for all the microbes were explored using the Phyre2 tool (www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index, accessed on 26–28 July 2022). The 3D models were visualized using PyMOL. To measure the accuracy of the protein models and predict the stereochemical characteristics of the protein structures, the Ramachandran plot was used [47]. To generate the Ramachandran plot, SAVES v6.0 (https://saves.mbi.ucla.edu/, accessed on 30 July 2022) was used. Structure quality was estimated on the basis of peptide bond planarity, hydrogen bond energy, and backbone phi and psi angles. The analysis is based on 118 structures of resolution of at least 2.0 angstrom (Å) and R-factor no greater than 20%.

Conserved Protein Motifs Analysis
To predict the conserved motifs in the protein sequences of probiotics, MEME version 5.4.1 (http://meme.sdsc.edu/meme/meme.html, accessed on 26 July 2022) was used. This tool usually finds three motifs by default; however, in the present study we tried to find up to ten motifs. All other parameters were set according to default settings. To estimate the ontology of each individual conserved domain, Interproscan (http://www.ebi.ac.uk/interpro/search/sequence/, accessed on 4 August 2022) was employed.

Predicted Antigenic Peptides Tool
To predict the antigenic determinants of the GUS enzymes of all the bacteria included in the study, the Kolaskar and Tongaonkar method, i.e., the Predicted Antigenic Peptides Tool, was used (https://imed.med.ucm.es/Tools/antigenic.pl, accessed on 26 August 2022). This prediction algorithm depends on the occurrence of amino acids in experimentally determined epitopes.

Phylogeny Prediction Based on uidA Gene Sequence
Because this study explores the GUS enzyme among the bacteria inhabiting the GIT of normal and breast cancer patients, it is important to gain insight into the evolution of the GUS-coding uidA gene of these bacteria. For this purpose, a phylogenetic tree was constructed using the GUS enzyme sequences (Figure 2). According to this phylogeny, F. prausnitzii 1 and S. suis share a clade and are therefore closely related to each other. E. coli strain K12 and S. enterica also originate from the same branch point. M. bacterium does not share a clade with any of the other bacteria. S. xylosus and S. caeli are closely related to each other and also share a clade with S. hemolyticus. These three bacteria are also related to R. intestinalis. P. acnes and E. gallinarum 2 are closely related. S. aquatilis, F. prausnitzii 3 and C. amalonaticus are more closely related to each other than to the other bacteria. C. comes and Bacillus sp. show common ancestry, as they originate from a common branch point. L. rhamnosus and F. prausnitzii 2 share a clade with each other. E. gallinarum 1 also originates from the same branch point as L. rhamnosus and F. prausnitzii 2.

Figure 2. Phylogenetic tree based on GUS protein sequences of the bacteria in the present study, constructed using the neighbor-joining method. The optimal tree with the sum of branch length = 19.62208961 is shown. The tree is drawn to scale, with branch lengths (next to the branches) in the same units as those of the evolutionary distances used to infer the phylogenetic tree.

Physicochemical Attributes
The GUS enzymes in most of the microbes comprise different numbers of amino acids (Table 1). The highest number is 795 and the lowest is 586, in C. comes and M. bacterium, respectively. The half-life was observed to be the same in all the bacteria, i.e., 30 h. The highest isoelectric point (8.92) was observed in the case of S. aquatilis NBRC 16722, while the lowest (4.91) was found in S. caeli. Extinction coefficient, instability index and aliphatic index also showed variability. The highest (155,075) and lowest (89,980) values of the extinction coefficient were observed in E. gallinarum 1 and S. xylosus, respectively. The instability index was highest (40.93) in C. amalonaticus and lowest (25.88) in E. gallinarum 1. As far as the aliphatic index is concerned, the largest value (88.10) was found in S. suis and the smallest (71.01) in L. rhamnosus.

Sub-Cellular Localization
In normal tissue-associated bacteria, GUS was found to be present in the cytoplasm (Table 2, Figure 3). Meanwhile, in C. comes, L. rhamnosus, F. prausnitzii 1 and F. prausnitzii 2, the protein was additionally localized in the extracellular space, outer membrane, periplasmic space and inner membrane. The protein in the majority of the breast cancer-associated bacteria was found to be localized in the cytoplasm, with the exception of M. bacterium and S. aquatilis NBRC 16722, in which it was additionally localized in the inner membrane.

2D Structure Prediction
Secondary structure composition analysis based on the SOPMA tool revealed that the α helices, extended strands, β turns and random coils of the GUS proteins comprised 22.66 to 27.81%, 22.04 to 25.91%, 5.39 to 8.30% and 41.75 to 47.70% of the amino acids, respectively (Table 3). Secondary motif maps predicted using the PDBsum tool are shown in Supplementary Data Figure S1. The catalytic site properties were predicted using Caver Web. The catalytic sites with the most reliable starting points and 100% relative scores, together with tunnel druggability, bottleneck radius, length and curvature, are shown in Figure 4 and Table 4. The druggability, bottleneck radius, length and curvature of the predicted catalytic sites were found to be in the range of 0.052–0.80, 1.1–2.8 Å, 1.4–15.9 Å and 0.65–1.4, respectively, for GUS enzymes. The highest and lowest tunnel bottleneck radii were observed in E. gallinarum 2 and S. aquatilis, respectively. The highest and lowest tunnel lengths were found in E. gallinarum and L. rhamnosus, respectively. The highest and lowest tunnel curvature was observed in F. prausnitzii and P. acnes and S. aquatilis, respectively.

3D Structure Prediction
Three-dimensional configurations of the GUS enzymes were obtained through the PHYRE2 tool and visualized with PyMOL (Figure 5). According to verification by Ramachandran plot, close to 90% of residues fell in the most favored region, which reflects the accuracy of the GUS protein structures for all bacteria (Supplementary Data Figure S2, Table 5).

Table 5. Column headings recovered from the source: Residues in Additional Allowed Regions (%); Residues in Generously Allowed Regions (%); Dihedrals; Covalent; Overall. Row groups: normal tissue-associated bacteria; breast cancer patient-associated bacteria.

Conserved Protein Motifs Prediction
In total, ten conserved protein motifs were identified in the GUS proteins of the nineteen bacteria through MEME. The number of amino acids was found to be 29 (motifs 1, 5, 6, 8, 9 and 10), 50 (motif 2), 41 (motifs 3 and 4) and 25 (motif 7). The locations of these motifs with their p-values are shown in Figure 6. The motif results were found to be significant, with p-value < 0.05, in the case of all bacteria except S. caeli. Sequences, E-values, site count, width, relative entropy and bayes threshold are also given (Figure 7). The E-value is an estimate of the expected number of motifs with the given log likelihood ratio (or higher), and with the same width and site count, that one would find in a similarly sized set of random sequences. The E-values for all ten motifs were found to be significant, i.e., <0.05. Site count is the number of sites contributing to the construction of the motif. The maximum site count, i.e., 19, was observed for the third, fifth and sixth motif sequences. The lowest site count, i.e., 13, was observed for the second and ninth motifs. The width of the motif describes a pattern of a fixed width, as no gaps are allowed in MEME motifs. The width was observed in the range of 25–50. The maximum was predicted in the second motif and the minimum in the case of the first, fifth, sixth, eighth, ninth and tenth motifs. The conserved protein motifs explored via MEME were subjected to Interproscan to predict their molecular and biological functions. This revealed an association of these motifs with carbohydrate metabolism and hydrolysis of O-glycosyl compounds.

Figure 6. Location of motif sites with their corresponding combined match p-value for uidA of bacteria, predicted by MEME suite. Each block represents the position and strength of a site. The height of a block gives an indication of the significance of the site, as taller blocks are more significant. The height is calculated to be proportional to the negative logarithm of the p-value of the site, truncated at the height for a p-value of 1 × 10^−10. Combined match p-value is defined as the probability that a random sequence (with the same length and conforming to the background) would have position p-values such that the product is smaller or equal to the value calculated for the sequence under test. Ten different colors are used to depict the ten different motifs.
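The combined match p-value described in the Figure 6 caption is the probability of observing a product of position p-values at least as small as the one obtained. If the n position p-values are assumed to be independent and uniformly distributed under the null model, this probability has the closed form P = x · Σ_{i=0}^{n−1} (−ln x)^i / i!, where x is the product. The sketch below implements that textbook formula; the function name is hypothetical, and the exact computation inside the MEME suite may differ.

```python
import math


def product_pvalue(position_pvalues):
    """P-value of the product of n independent, uniform p-values.

    Computes P(product <= x) = x * sum_{i=0}^{n-1} (-ln x)^i / i!,
    where x is the product of the supplied p-values.
    """
    n = len(position_pvalues)
    if n == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in position_pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    # Work with logarithms so that very small products do not underflow early.
    log_product = sum(math.log(p) for p in position_pvalues)
    neg_log = -log_product  # always >= 0

    # Accumulate sum_{i=0}^{n-1} (neg_log^i / i!) term by term.
    term = 1.0
    total = 1.0
    for i in range(1, n):
        term *= neg_log / i
        total += term
    return math.exp(log_product) * total


if __name__ == "__main__":
    # Illustrative p-values only, not values from this study.
    print(product_pvalue([0.5, 0.5]))
```

With a single p-value the function returns that p-value unchanged, and the combined value is never smaller than the raw product, which is why the raw product alone would overstate significance.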

Figure 7. Sequences, E-values, site count, width, relative entropy and bayes threshold of conserved motifs of uidA protein predicted in the probiotics documented in the present study. The E-value shows the statistical significance of the motif. It is an estimate of the expected number of motifs with the given log likelihood ratio (or higher), and with the same width and site count, that one would find in a similarly sized set of random sequences. Site count is the number of sites contributing to the construction of the motif. The width of the motif describes a pattern of a fixed width, as no gaps are allowed in MEME motifs.

Antigenic Peptide Prediction
Positions and number of sequences that might be involved in antigenic propensity were different in all proteins i.e., Lacticaseibacillus rhamnosus (24), Roseburia intestinalis (27)

Discussion
The first verification of microbiota presence and dysbiosis in breast cancer tissue was reported through next generation sequencing (NGS) analysis of breast tumor tissue as well as the normal tissue adjacent to the tumor. Qualitative analysis revealed the presence of Methylobacterium radiotolerans and Sphingomonas yanoikuyae in tumor and normal tissues, respectively. Quantitative PCR-based analysis showed a reduced load of bacterial DNA in cancerous tissue, supporting the association of breast cancer with dysbiosis [11]. Multiple lines of experimental evidence for the role of the microbial GUS enzyme in breast cancer have been reported in the literature. In one study, the potential of 35 GUS enzymes to reactivate glucuronidated estrogen was explored using in-vivo, in-vitro and in-fimo techniques. It was found that GUS enzymes belonging to classes L1, ML1 and FMN were very active in the activation of conjugated estrogens [28]. The association of gmGUS with the estrobolome has also been studied by inspecting the impact of estrogen replacement therapy on the GUS enzymes of GIT microbiota and on the microbial composition. Long-term exposure to ERT altered the microbial composition of the GIT, accompanied by reduced GUS enzyme activities. ERT induced dysbiosis by reducing the number of L. rhamnosus and F. prausnitzii and enhancing R. gnavus.
The gmGUS enzymes play their role in the estrobolome by reversing the glucuronidation process, catalyzing estrogen activation by cleaving the glucuronic acid moiety. Estrogen and estrone glucuronides that might be the substrates for gmGUS include 17-α estradiol 17 This reaction releases aglycones. The process of deconjugation occurs once estrogens enter the GIT via bile [24]. Estrogens that are not deconjugated, owing to their high polarity and hydrophilicity, are dissolved in blood and removed through the kidneys in urine. However, on deconjugation, estrogens are reabsorbed in the mucosa and enter the portal vein [32]. Owing to the association of high estrogen concentration with breast cancer, the gut microbe and breast cancer axis has become an emerging research area.
With reference to sub-cellular localization and physicochemical properties, bacteria in breast cancer patients were found to be more diverse than those reported in normal tissues. The diversity of the GUS enzyme found in the present study is consistent with the literature [49]. This protein in Ruminococcus gnavus has been reported to show similarity of 69%, 61%, 59% and 58% with L. gasseri, E. coli, C. perfringens and S. aureus, respectively [50]. Therefore, this study further supports the diverse nature of the GUS enzyme.
The GUS protein in S. xylosus, S. suis, S. aquatilis NBRC 16722, C. amalonaticus, S. caeli and E. gallinarum 1 exhibited extreme values of pI, extinction coefficient, instability index and aliphatic index. Among the probiotics of normal tissue, only the L. rhamnosus GUS protein showed an extreme value for the aliphatic index. The total number of amino acids was found to be in the range of 586–795, with variable molecular weights. The isoelectric point analysis revealed that the GUS of S. aquatilis NBRC 16722 was alkaline, with a pI value of 8.92, while in all other cases GUS showed an acidic pI. The pI value for S. caeli is consistent with the earlier reported pI value for E. coli HGU-3 [51]. The literature reports that GUS enzyme activity increases at high pH and causes cancer [52]. In line with this, the GUS enzyme of S. aquatilis in the present study was alkaline according to its pI, whereas the enzymes of all other bacteria except S. caeli exhibited a mildly acidic pI, which might have some association with the cancer-causing potential of this enzyme.
The aliphatic index reflects the relative volume of a protein occupied by amino acids with aliphatic side chains, and a higher value indicates increased thermostability [53]. As all the bacterial proteins in the present study showed values in the range of 71.01–88.10, the GUS protein is found to be highly thermostable. The instability index is a measure of protein stability in a test tube [54]. A value below 40 indicates a stable protein, so the GUS enzyme of all bacteria in the present study except C. amalonaticus is considered to be stable.
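For illustration, the aliphatic index is commonly computed as AI = X(Ala) + 2.9 · X(Val) + 3.9 · (X(Ile) + X(Leu)), where X is the mole percent of each residue. The sketch below applies that formula together with the instability-index cut-off of 40 mentioned above; the function names and the toy sequence are assumptions for illustration, and this is not the ProtParam code.

```python
def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu.

    AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu))
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")

    def mole_percent(residue: str) -> float:
        return 100.0 * seq.count(residue) / n

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


def is_predicted_stable(instability_index: float, threshold: float = 40.0) -> bool:
    """Apply the conventional rule: an instability index below 40 suggests a stable protein."""
    return instability_index < threshold


if __name__ == "__main__":
    # Toy peptide, not a sequence from this study.
    print(aliphatic_index("AVIL"))
    # Extreme instability index values reported in this study.
    print(is_predicted_stable(25.88), is_predicted_stable(40.93))
```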
The 3-D configuration was generated using the PHYRE2 tool and further verified by Ramachandran plots generated with the SAVES server. Except in R. intestinalis and F. prausnitzii 1, the GUS enzyme showed marked variation in 3D structure among the bacteria. GUS enzymes from the nineteen bacteria showed high variation with regard to physicochemical properties, sub-cellular localization, 3D configuration and antigenic sites, so this is a highly diverse protein.
The catalytic pockets were identified along with different parameters of the tunnels. Druggability is the potential of a molecule to be controlled by therapeutic drugs [55]. It is a measure of the binding affinity of the catalytic site for a drug-like organic molecule of the host [56]. The druggability values for the catalytic sites of the GUS enzyme reflect their challenging nature and do not establish them as excellent drug targets. Bottleneck radius is the maximum size of probe that can fit in the narrowest portion of the tunnel. Tunnel length measures the distance between the starting point and the protein surface. In the present study, tunnel length was measured in the range of 1.4 to 15.9 Å. Curvature describes the shape of the tunnel and is the ratio of the tunnel length to the shortest distance between the starting and ending points of the tunnel. In GUS enzymes, tunnel curvature was measured in the range of 0.65 to 1.4. Length and curvature also give an idea of the substrate specificity of the enzyme. Tunnel geometry may affect catalysis via channeling of the substrate [57]. However, these catalytic sites can be engineered for targeting by drugs to inhibit estrogen reactivation in breast cancer patients. The attributes of the tunnels can be useful in future drug design experiments.
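The definitions of tunnel length and curvature given above can be made concrete with a short calculation. The sketch below treats a tunnel centerline as an ordered list of 3D points, sums the segment lengths, and divides by the straight-line distance between the first and last point. The point representation and function names are assumptions for illustration; Caver Web's own calculation may sample or define the centerline differently.

```python
import math


def tunnel_length(points):
    """Sum of distances between consecutive centerline points (same units as input)."""
    if len(points) < 2:
        raise ValueError("at least two points are required")
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def tunnel_curvature(points):
    """Ratio of tunnel length to the straight-line distance between its end points."""
    straight = math.dist(points[0], points[-1])
    if straight == 0.0:
        raise ValueError("start and end points coincide")
    return tunnel_length(points) / straight


if __name__ == "__main__":
    # Hypothetical centerline with one right-angle bend, coordinates in angstroms.
    centerline = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    print(tunnel_length(centerline), tunnel_curvature(centerline))
```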
Conserved protein motifs are those sequences of proteins that undergo small variations with time. The variations might involve substitutions of fewer amino acids and replacement with amino acid having similar biochemical properties. These motifs play crucial roles in the stability and formation of catalytic site of protein. Conserved protein motifs identified in present study bacteria also show their relatedness [58].
Antigenic peptides are bacterial proteins that interact directly with the host immune system; therefore, they can be good candidates for the development of vaccines [59]. The GUS enzyme in all nineteen bacteria showed antigenic peptides in the range of 22–29. The presence of such large numbers of antigenic peptides gives a clue about the possible immunomodulatory role of these bacteria. Owing to the variety in antigenic peptides, catalytic sites and the adjacent loop structures, anti-cancer medication therapy can be managed via GUS enzyme inhibition, as also reported in the literature [49]. The high concentration and deglucuronidation potential of the GUS enzyme in breast cancer patients can also be used for bioactivation of glucuronide anticancer prodrugs [60].

Conclusions
Identification of the structural properties of the estrogen-reactivating protein (GUS enzyme) in bacteria found in normal and breast cancer patients, together with the other findings of the present study, might provide multiple directions for modification of this enzyme. As it is far easier to perform manipulations successfully at the level of probiotics than at the level of human genes, manipulation at the probiotic level might be helpful in reducing breast cancer risk by inhibiting the estrogen-reactivating activity of this estrobolome-associated protein. Additionally, the active sites explored in the present study can be inspected further for their possible role in deglucuronidation of glucuronidated estrogens. Exploration of these catalytic sites might help in modifying the GUS enzyme to prevent its estrogen deconjugation potential. Modifications may include alteration of the structure of the active sites participating in deglucuronidation, by inducing mutations at specific points in the uidA gene, and deletion of conserved protein motifs. For these modifications, site-directed mutagenesis can be performed, which may lead to destabilization of the protein, thus inhibiting its estrogen reactivation potential.