TrpNet: Understanding Tryptophan Metabolism across Gut Microbiome

Crosstalk between the gut microbiome and the host plays an important role in animal development and health. Small compounds are key mediators in this host–gut microbiome dialogue. For instance, tryptophan metabolites, generated by biotransformation of tryptophan through complex host–microbiome co-metabolism, can trigger immune, metabolic, and neuronal effects at local and distant sites. However, the origin of tryptophan metabolites and the underlying tryptophan metabolic pathway(s) are not well characterized in the current literature. Many of the microbial contributors to tryptophan metabolism remain unknown, and there is growing interest in predicting tryptophan metabolites for a given microbiome. Here, we introduce TrpNet, a comprehensive database and analytics platform dedicated to tryptophan metabolism within the context of host (human and mouse) and gut microbiome interactions. TrpNet contains data on tryptophan metabolism involving 130 reactions, 108 metabolites, and 91 enzymes across 1246 human gut bacterial species and 88 mouse gut bacterial species. Users can browse, search, and highlight the tryptophan metabolic pathway, as well as predict tryptophan metabolites on the basis of a given taxonomy profile using a Bayesian logistic regression model. We validated our approach using two gut microbiome metabolomics studies and demonstrated that TrpNet was able to better predict alterations in indole derivatives compared to other established methods.


Introduction
The gut microbiome is a community of metabolically active microorganisms that inhabits all niches along the intestines and coevolves with its host. Growing evidence has shown that the gut microbiome plays a critical role in animal development and health [1]. Disruptions in microbiome composition, termed dysbiosis, are implicated in various diseases including gastrointestinal diseases [2], infectious diseases [3], metabolic diseases [4,5], and neurological disorders [6]. Dysbiosis leads to a shift in the production of various microbial metabolites, which then influence the physiology and immune status of the host [7]. Among these bioactive metabolites, short-chain fatty acids (SCFAs, produced by bacteria fermenting dietary fibers), secondary bile acids (originating in the liver and transformed by the gut microbiome), and tryptophan-derived metabolites are the best known.

Literature Search and Intestinal Tryptophan Metabolism
We define the tryptophan metabolism pathway as the set of reactions that transform tryptophan into an end product with no further documented reactions, or that break tryptophan down until it enters energy metabolism. To enumerate the metabolic reactions and metabolites involved in tryptophan biotransformation in microbes and their mammalian hosts, we manually searched and compared over 300 biochemical and metabolomic research papers on tryptophan metabolites, 37 reviews published in the last 5 years, and 14 public databases.
As shown in Figure 2, tryptophan metabolism generates 29 bioactive metabolites via three major pathways: the indole pathway, the serotonin pathway, and the kynurenine pathway. The indole pathway, which converts tryptophan into indole derivatives including AhR ligands, predominates in gut microbes, while the serotonin and kynurenine pathways predominate in mammalian hosts. However, the origins of some tryptophan metabolites, whether microbe-derived or host-derived, are inconsistent across previously published reviews. In addition, most reports have focused on the indirect roles of the gut microbiome in modulating kynurenine and serotonin production through non-tryptophan metabolites such as butyrate, an important SCFA derived from gut microbes [25]. The direct production of kynurenines and serotonin by gut microbes is not well characterized. For instance, serotonin has been described as a host-limited metabolite [26], yet our investigation showed that several species such as Lactococcus lactis, Lactobacillus plantarum, and Klebsiella pneumoniae produce serotonin in a similar way to their mammalian host, via aromatic amino-acid decarboxylase (AAAD) [27–29]. Another important neurotransmitter, tryptamine, was traditionally regarded as a microbial metabolite produced by Clostridium, Ruminococcus, Blautia, and Lactobacillus through tryptophan decarboxylases [13]. However, it is also reported to be produced by brain cells in certain cases and may play specific roles in the mammalian brain [30]. In addition, some gut microbes can degrade tryptophan through the kynurenine pathway in a different way. For instance, the gut species Burkholderia cepacia was reported to convert tryptophan to 2-amino-3-carboxymuconate semialdehyde, which was further enzymatically degraded to pyruvate and acetate via the intermediates 2-aminomuconate and 4-oxalocrotonate, rather than via the known mammalian pathway, which transforms 2-aminomuconate to 2-ketoadipate and, ultimately, glutaryl-coenzyme A [31].
Compared with the most recent reviews [26,32–34] on the bioactive tryptophan metabolites, we updated the origin of all collected tryptophan metabolites, including three inconsistent annotations of the kynurenines, according to the current literature searches. The results were further cross-validated and enhanced with the information obtained from mining the GEMs, as described below.
Metabolites 2022, 11, x FOR PEER REVIEW

Curation of Genome-Scale Metabolic Models
Genome-scale metabolic models (GEMs) are knowledge-based, stoichiometrically balanced metabolic networks containing the entire set of metabolic reactions, genes, and metabolites in the target organism [35]. Current developments in systems biology allow for the large-scale reconstruction of GEMs for numerous microorganisms. For instance, AGORA (assembly of gut organisms through reconstruction and analysis) is a set of semiautomatically generated GEMs for 818 gut bacteria [36], and EMBL_GEMs is another large collection (5584 bacteria) covering all reference and representative bacterial genomes of NCBI RefSeq [37], built using CarveMe [38]. The reconstruction tools for both AGORA and EMBL_GEMs were rated as outstanding among the general tools, especially in gap-filling the network [39].
A total of 6402 GEMs covering 41 phyla were collected from AGORA and EMBL_GEMs. Most GEMs are at the strain level, except for 73 at the species level and 333 models belonging to the same strains shared between the two datasets. GEMs were manually annotated according to literature searches [40–44]; 2114 models were labeled as human gut microbes, covering 1380 species of 30 phyla, and 177 were labeled as part of the mouse gut microbiome, from 98 species of 10 phyla. The reactions, metabolites, and enzymes involved in microbial tryptophan metabolism were extracted from the GEMs. The models involved in tryptophan metabolism include nearly 5000 species belonging to 39 phyla, covering 1246 species in the human gut and 88 species in the mouse gut (Figure 3). The results were corroborated by literature searches to reconcile inconsistencies between AGORA and EMBL_GEMs, as well as to make the GEM data as complete as possible.
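The extraction step above can be pictured as a graph walk that starts at tryptophan and follows substrate-to-product edges while skipping currency compounds. The following is only a minimal sketch under stated assumptions: the toy reaction list, the reaction identifiers, the `CURRENCY` set, and the function name `trp_subnetwork` are all hypothetical and are not taken from AGORA, EMBL_GEMs, or TrpNet.

```python
from collections import deque

# Hypothetical toy reactions: (reaction_id, substrates, products).
# Identifiers are illustrative only and do not come from any real GEM.
TOY_REACTIONS = [
    ("R1", {"tryptophan", "h2o"}, {"indole", "pyruvate", "nh3"}),
    ("R2", {"tryptophan"}, {"tryptamine", "co2"}),
    ("R3", {"tryptamine", "o2", "h2o"}, {"indole-3-acetaldehyde", "nh3", "h2o2"}),
    ("R4", {"glucose", "atp"}, {"glucose-6-phosphate", "adp"}),
]

# Assumed currency compounds; they are ignored so that ubiquitous
# cofactors do not link tryptophan to unrelated parts of the network.
CURRENCY = {"h2o", "o2", "co2", "nh3", "h2o2", "atp", "adp", "h"}


def trp_subnetwork(reactions, seed="tryptophan", currency=CURRENCY):
    """Breadth-first walk from the seed along substrate -> product edges.

    Returns the reaction ids reached (in discovery order) and the set of
    non-currency metabolites reached.
    """
    reached = {seed}
    kept = []
    seen_reactions = set()
    queue = deque([seed])
    while queue:
        metabolite = queue.popleft()
        for rid, substrates, products in reactions:
            if rid in seen_reactions or metabolite not in substrates:
                continue
            seen_reactions.add(rid)
            kept.append(rid)
            # sorted() keeps the traversal order deterministic
            for product in sorted(products - currency):
                if product not in reached:
                    reached.add(product)
                    queue.append(product)
    return kept, reached


kept, reached = trp_subnetwork(TOY_REACTIONS)
print(kept)
print(sorted(reached))
```

In this toy network the glucose reaction is never visited because none of its substrates is reachable from tryptophan; a real curation would additionally need rules for where the walk stops once energy metabolism is reached.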

Development of a Database for Tryptophan Metabolism and Functional Prediction
Following the two major procedures described above, the final tryptophan metabolism pathway contains entries for 130 reactions and 108 metabolites (excluding currency compounds such as water, hydrogen, oxygen, etc.) linked to 91 enzymes and more than 5000 GEMs. We developed a user-friendly web-based database and visual analytics tool, TrpNet (https://www.trpnet.ca/, accessed on 21 December 2021), to share this resource with the community. Users can browse, search, and filter reactions, metabolites, or microbes involved in tryptophan metabolism and visualize more detailed information and summary tables in multiple formats. Whenever possible, entries are hyperlinked to PubMed, KEGG [14], BioCyc [15], and ModelSEED [21].
A main motivation for developing TrpNet is to help understand the relationship between gut microbiome composition and the capacity for tryptophan metabolism. We designed the interface and functions to allow users to easily obtain the distribution of tryptophan metabolite production at different taxonomy levels. Figure 4 shows the pairwise distance between the phylogenetic tree of the dominant genera in the host gut and the corresponding metabolic clusters, according to the presence or absence of tryptophan metabolite production. It can be observed that phylogenetically close species may differ in their capacity for metabolite production. These data will help to resolve some inconsistencies between microbiome and metabolome divergence and the coexistence of specific species [22,45]. For instance, Bacteroides were found to be relatively conserved, while Lactobacillus fluctuated in tryptophan metabolite production depending on whether they produced indole derivatives. Human- and mouse-specific gut microbes differed in the production of several AhR ligands such as IA, IAA, ILA, and IPA. This may help explain the different affinities of human AhR and mouse AhR in selecting exogenous ligands, as reported in several studies [46,47] and shown in Figure S1.
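To make the idea of clustering taxa by presence or absence of metabolite production concrete, the sketch below computes a pairwise distance between binary production profiles. The distance measure behind Figure 4 is not stated in the text, so the Jaccard distance used here is an assumption, and the genus names and profiles are hypothetical placeholders rather than TrpNet data.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets of produced metabolites."""
    union = a | b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical presence/absence profiles (not real TrpNet annotations).
PROFILES = {
    "GenusA": {"indole", "IAA"},
    "GenusB": {"indole", "IAA", "ILA"},
    "GenusC": {"tryptamine"},
}


def distance_matrix(profiles):
    """Return {(name_x, name_y): distance} for every ordered pair of taxa."""
    names = sorted(profiles)
    return {
        (x, y): jaccard_distance(profiles[x], profiles[y])
        for x in names
        for y in names
    }


distances = distance_matrix(PROFILES)
for pair in [("GenusA", "GenusB"), ("GenusA", "GenusC")]:
    print(pair, round(distances[pair], 3))
```

A matrix of this kind could then be compared with phylogenetic distances to spot closely related taxa whose metabolite repertoires diverge.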

Several computational tools such as PICRUSt2 [18] and Tax4Fun2 [19] are available for predicting functional profiles from 16S rRNA gene sequence data. Their performance is inherently limited by the known annotated enzyme groups, which may not reflect metabolite generation. Specifically, the public databases used in current tools are not tailored for tryptophan metabolism, and this may lead to bias due to incomplete information. TrpNet provides a more complete picture of tryptophan metabolism, based on literature curation and GEMs that describe metabolism at the strain level, with the potential to predict unknown enzymatic reactions.
Here, we explored whether we could better predict the microbial tryptophan metabolism using the TrpNet database.
One constraint is that 16S rRNA data usually cannot reach strain-level resolution and instead identify the microbiome at the genus level. To address this issue, we used a logistic regression model to estimate the tryptophan metabolite production potential of a genus of interest from the metabolite distribution collected in TrpNet. This approach has been used in previous studies [48–50] to model microbiome compositional data and to identify informative microbiome features. To obtain more accurate models for our prediction, we fit a Bayesian logistic regression model for each tryptophan metabolite according to its distribution across the taxonomy levels. In these models, the human/mouse gut origin was included as a nonrandom covariate because tryptophan metabolite production differs by niche. Tables 1 and 2 show the estimated odds ratios for the prevalent genera in producing bioactive indoles, generated from the mouse model and the human model, respectively. The models were first validated by randomly splitting the TrpNet database, with 80% used for training and 20% used to evaluate model performance. We found that the genus level provided relatively reliable results for the different metabolites in general. Figure 5 shows the ROC curves of the prediction models comparing different taxonomic levels in predicting IAA production. Please note that the performance measures are likely to be inflated because the same database was used for calculating the parameters of the regression models.
A network visualization page was implemented to allow users to search metabolites of interest in the network or to customize the tryptophan metabolism network according to a user-specified list of microbes (Figure 6). The result can be highlighted against the whole network or downloaded as a table. Another key feature of TrpNet lies in the annotation of origin beyond the enzyme level. Reactions and metabolites were individually checked against the literature to label them as host-derived or microbe-derived, to help decipher host–microbe interactions and co-metabolism.
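The genus-level prediction model described above can be illustrated with a small, self-contained sketch. The text does not specify the priors or the software used, so everything here is an assumption: independent zero-mean Gaussian priors on all coefficients, a maximum a posteriori (MAP) point estimate obtained by plain gradient ascent instead of full posterior sampling, and a hypothetical toy dataset in which each row stands for one strain-level model. The odds ratio for a term is the exponential of its coefficient, and the AUC summarizes the ROC curve.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_map_logistic(X, y, prior_sd=2.0, lr=0.1, n_iter=5000):
    """MAP estimate for logistic regression with N(0, prior_sd^2) priors.

    X: feature rows, first column fixed at 1.0 for the intercept.
    y: 0/1 labels (1 = the model is annotated as producing the metabolite).
    The prior scale, learning rate, and iteration count are assumed values.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        # gradient of the log-prior, then add the log-likelihood gradient
        grad = [-b / prior_sd ** 2 for b in beta]
        for row, label in zip(X, y):
            err = label - sigmoid(sum(b * x for b, x in zip(beta, row)))
            for j in range(p):
                grad[j] += err * row[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: [intercept, in_genus_of_interest, is_mouse_gut]
X = [
    [1, 1, 0], [1, 1, 0], [1, 1, 1], [1, 1, 1], [1, 1, 0],
    [1, 0, 0], [1, 0, 0], [1, 0, 1], [1, 0, 1], [1, 0, 0],
]
y = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

beta = fit_map_logistic(X, y)
odds_ratios = [math.exp(b) for b in beta]
train_scores = [sigmoid(sum(b * x for b, x in zip(beta, row))) for row in X]
print("odds ratio for the genus term:", round(odds_ratios[1], 2))
print("training AUC (optimistic):", round(auc(train_scores, y), 2))
```

Scoring the training rows, as done here for brevity, gives an optimistic AUC; a held-out 20% split, as in the validation described above, would be the fairer check.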

Case Studies

Myocardial Infarction (MI) Case Study
Disturbed tryptophan metabolism is known to alter the host inflammation status and affect many diseases, including heart diseases such as myocardial infarction (MI), which is associated with an increased KYN/TRP ratio [51]. To understand how the gut microbiome and host MI status relate to tryptophan metabolism, we collected 16 cecal samples from 16 mice (8 with MI and 8 controls) on day 3 post MI. Each sample was processed for 16S rRNA bacterial sequencing and untargeted metabolomics based on LC-MS and MS/MS. As females and males have been reported to differ in the risk of MI, we included data only from male mice in our case study to exclude any additional effects of sex [52].
DADA2 [53] was used to assign taxonomy to amplicon sequence variants (ASVs). After filtering out the 712 low-quality features, the remaining 304 ASVs were attributed to 69 genera dominated by Lachnospiraceae spp. and Ruminococcaceae spp. For metabolomics data, XCMS [54] and metID [55] were used for spectrum processing and peak annotation. A total of 24 microbial tryptophan metabolites were detected by LC-MS/MS, of which nine were significantly different, including IAA, IAM, IAld, and serotonin. Statistical analyses of microbiome data were performed using MicrobiomeAnalyst [56]. Principal component analysis (PCA) showed that male mice without MI differed from male mice post MI in microbiome composition. This was caused by a lack of Proteobacteria and Verrucomicrobia, which are active producers of tryptophan metabolites, in the no-MI group (Figure S2).
Prediction models built on the TrpNet database were used to predict tryptophan metabolite production as a function of the genus-level data. Figure 7 shows the prediction result from the gut microbiome, as well as the comparison with metabolomics data and related enzymes predicted by PICRUSt2. Tryptophan degradation in the MI group differed significantly from that in the group without MI, which may be explained by their differing gut microbiome composition. According to our prediction, the MI group is more likely to produce greater amounts of tryptophan metabolites, including AhR ligands such as indole, IAM, and IAA, supporting the metabolomics data. Previous evidence has suggested that AhR activity is a critical modulator in the development and pathogenesis of the cardiovascular system [57]. AhR knockout mice were reported to be more susceptible to cardiac hypertrophy, vascular remodeling, and systemic hypertension [58]. However, AhR activation can also contribute to the formation and promotion of atherosclerosis by inducing vascular inflammation [59]. Further studies are necessary to elucidate the effects on MI progression triggered by microbial AhR ligands from tryptophan metabolism.
Our prediction also shows increased activation of the kynurenine pathway post MI. This suggests that gut microbes may directly contribute to the increased KYN/TRP ratio, leading to a decreased level of beneficial serotonin and an accumulation of neurotoxic KYN metabolites during the disease process. In parallel, we performed analyses using PICRUSt2. Only enzyme EC 4.1.99.1, which relates to indole production, was identified as significantly increased in the MI group, similar to the prediction by TrpNet and the metabolomics data. Thus, TrpNet can serve as a better resource for exploring intestinal tryptophan metabolism.
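The text does not state how genus-level production estimates are combined into a per-sample prediction. One simple possibility, shown purely as an illustrative assumption, is to weight each genus-level production probability by the relative abundance of that genus in the sample. The function name `sample_scores`, the genus names, the read counts, and the probabilities below are all hypothetical.

```python
def sample_scores(abundance, genus_prob):
    """Abundance-weighted production score per metabolite for one sample.

    abundance: {genus: read count} for the sample.
    genus_prob: {genus: {metabolite: estimated production probability}}.
    Genera without an entry in genus_prob contribute nothing to any score.
    """
    total = sum(abundance.values())
    scores = {}
    for genus, count in abundance.items():
        weight = count / total  # relative abundance of this genus
        for metabolite, prob in genus_prob.get(genus, {}).items():
            scores[metabolite] = scores.get(metabolite, 0.0) + weight * prob
    return scores


# Hypothetical inputs, for illustration only.
abundance = {"GenusA": 60, "GenusB": 40}
genus_prob = {
    "GenusA": {"indole": 0.9, "IAA": 0.2},
    "GenusB": {"indole": 0.1},
}

scores = sample_scores(abundance, genus_prob)
for metabolite in sorted(scores):
    print(metabolite, round(scores[metabolite], 3))
```

Per-sample scores of this kind could then be compared between groups, for example MI versus control, alongside the measured metabolite levels.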

IBD Case Study
Previous studies have demonstrated the key role of the gut microbiome in IBD, and some have highlighted the potential link to gut tryptophan metabolism [60]. The 16S rRNA and metabolomics data were collected from 26 participants between 6 and 19 years of age, randomly selected from the Inflammatory Bowel Disease Multi-omics Database (http://ibdmdb.org, accessed on 17 October 2021) [61]. For each data type, 20 samples from pediatric Crohn's disease (CD) patients and 20 from pediatric healthy controls were also included.
From the metabolomics data annotation, nine tryptophan-derived metabolites were observed, of which seven could be produced by the microbiome. IPA was significantly decreased in the CD group. Regarding the ASV sequencing data, 147 ASVs were annotated to 44 genera after filtering out the low-abundance features. However, no significant differences in microbiome composition were observed between the CD and control groups (Figure S3). TrpNet was then used to predict tryptophan metabolites for each sample using the established model. Figure 8 shows the predicted distribution of tryptophan metabolites in comparison with metabolomics data and the enzymes (EC numbers) identified by PICRUSt2. Our prediction captured the alteration of IPA, which was validated by metabolomics, as well as that of IA, the obligatory intermediate in producing IPA from tryptophan. In contrast, PICRUSt2 did not contain information for IPA, and the enzyme for IA was not significantly different between CD patients and healthy controls. Indole derivatives were predicted by TrpNet to be more abundant in healthy people than in CD patients, which is consistent with a previous report showing a reduction in AhR ligands produced by the microbiota in IBD patients. Most metabolites did not show significant differences between the two groups, probably due to the dispersed microbiome structure. Interestingly, although indolepyruvate (IPY), which can improve intestinal epithelial barrier function during challenges with inflammatory stimuli, was not annotated in the metabolomics data, our prediction shows its decrease in CD patients, replicating previous results [60]. Although metabolomics analysis found serotonin to be increased in the CD group, possibly due to the decreased expression of SERT in the ileum and colon [62], our prediction showed no significant difference. Similarly, the kynurenines did not differ between the two groups with any of the methods. Consequently, we can envision that gut microbes may affect IBD progression through tryptophan-derived AhR ligands such as IPA and IPY.
Previous studies have demonstrated the key role of the gut microbiome in IBD, and some highlighted the potential link to gut tryptophan metabolism [60]. The 16S rRNA and metabolomics data were collected from 26 participants between age 6 and 19 randomly selected from the Inflammatory Bowel Disease Multi-omics Database (http://ibdmdb.org, accessed on 17 October 2021) [61]. For each data type, 20 samples from pediatric Crohn's disease (CD) patients and 20 from pediatric healthy controls were also included.
From the metabolomics data annotation, nine tryptophan-derived metabolites were observed among which seven could be produced by the microbiome. IPA was significantly decreased in the CD group. Regarding the ASV sequencing data, 147 ASVs were annotated to 44 genera after filtering out the low-abundance features. However, there were no significant differences observed regarding microbiome composition between the CD and control group ( Figure S3). TrpNet was then used for tryptophan metabolite prediction for each sample using the established model. Figure 8 shows the predicted distribution of tryptophan metabolites in comparison with metabolomics data and EC identified by PICRUSt2. Our prediction found the alteration of IPA validated by metabolomics and the obligatory intermediate IA in producing IPA from tryptophan. In contrast, PIC-RUSt2 did not contain the information for IPA, and the enzyme for IA was not significantly different between CD patients and healthy controls. Indole derivatives were predicted by TrpNet to be more abundant in healthy people than in CD patients, which is consistent with a previous report showing a reduction in AhR ligands by the microbiota in IBD patients. Most metabolites did not show significant differences between the two groups, probably due to the disperse microbiome structure. Interestingly, although indolepyruvate, which can improve intestinal epithelial barrier function during challenges with inflammatory stimuli, was not annotated by the metabolomics data, our prediction shows its decrease in CD patients, replicating previous results [60]. Despite serotonin being found increased in the CD group by metabolomics analysis, which was possibly due to the decreased expression of SERT in the ileum and colon [62], there was no significant difference according to our prediction. Similarly, the kynurenines were not different between the two groups using all the methods. 
Consequently, we can envision that gut microbes may affect IBD progression through tryptophan-derived AhR ligands such as IPA and IPY.

Discussion
Tryptophan metabolism plays a central role in host physiologic and pathologic processes. The balance among microbial tryptophan metabolism, supplementation, and microbial modulation exerts a great impact on local gastrointestinal and circulating tryptophan availability for the host and ultimately contributes to host health and disease. Hence, it is important to fully characterize tryptophan metabolism within a host and within its resident gut microbes. TrpNet, a first step toward addressing this gap, collects all currently known reactions and metabolites relating to tryptophan on the basis of comprehensive literature reviews and large-scale data mining across >5000 GEMs. However, despite our intensive curation efforts, several reactions and metabolites still lack supporting literature reports. For example, no reaction details are currently available for several tryptophan metabolites, such as Iald, an important AhR ligand.
One of the major challenges in microbiome studies is to determine the causal role that gut microbiome composition plays in specific phenotypes. This is difficult because of the complexity of host-microbe and microbe-microbe interactions. TrpNet can help decipher this co-metabolism by providing detailed tryptophan metabolism within specific microbial species according to GEMs and literature annotation. Many current studies are based on 16S rRNA sequencing, making it essential to improve functional prediction and maximize the information gained from these relatively low-resolution taxonomic profiles. Here, we made an initial attempt to predict tryptophan metabolism from genus-level bacterial identification using a logistic regression model based on the TrpNet database. It should be noted that this prediction is limited by current knowledge of tryptophan metabolism, as well as by the algorithms used for GEM construction or function prediction. Optimized methods are needed to improve annotation from the microbe level to the metabolite level for mechanistic and therapeutic insights. For instance, an increased KYN/Trp ratio has been reported as a potential biomarker for inflammation status, and supplementation with gut species that naturally produce AhR ligands, such as Lactobacillus spp., could help recover AhR signaling. This microbe-based therapeutic approach was successfully applied in a mouse model of colitis [11]. As the gut microbiome can also modulate tryptophan metabolism indirectly by producing other small molecules such as bile acids, it will be useful to gather information on the microbes involved in these processes to further improve TrpNet.

Literature Review
Review papers published since 2017 were searched in PubMed, Web of Science, and bioRxiv (www.biorxiv.org/, accessed on 8 October 2021) using the search term "tryptophan metabolism AND gut microbiome". Studies providing a global view of tryptophan metabolism and focusing on host-microbe interaction were included. Furthermore, for each tryptophan metabolite, research papers were surveyed to determine its origin. Papers showing at least one source of genetic, enzymatic, or metabolic evidence in specific microbial species were prioritized.

GEMs Collection
A total of 818 GEMs in AGORA were collected from the Virtual Metabolic Human (VMH) database (http://vmh.life, accessed on 3 September 2021), and EMBL GEMs were downloaded from the EMBL BioModels website (https://www.ebi.ac.uk/biomodels, accessed on 3 September 2021). SBML files were parsed using RStudio (version 4.1.1). GEMs were first annotated to human and/or mouse gut microbes on the basis of several large-scale studies and public gut microbiome databases. Models without records were then manually searched in PubMed to annotate their habitat.
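To illustrate what extracting reactions from a GEM involves, the following is a minimal sketch in Python using only the standard library. It is not the authors' pipeline, which was run in R; the function names, the toy SBML fragment, and the identifiers `R_TRPAS`, `M_trp`, `M_indole`, and `M_pyr` are hypothetical and chosen for illustration only. The toy reaction is a simplified tryptophan-to-indole conversion and is not stoichiometrically complete.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def list_reactions(sbml_text: str) -> Dict[str, Tuple[List[str], List[str]]]:
    """Map each reaction id to (reactant species ids, product species ids)."""
    root = ET.fromstring(sbml_text.strip())
    reactions: Dict[str, Tuple[List[str], List[str]]] = {}
    for elem in root.iter():
        if local_name(elem.tag) != "reaction":
            continue
        reactants: List[str] = []
        products: List[str] = []
        for child in elem:
            kind = local_name(child.tag)
            if kind == "listOfReactants":
                target = reactants
            elif kind == "listOfProducts":
                target = products
            else:
                continue
            for ref in child:
                if local_name(ref.tag) == "speciesReference":
                    target.append(ref.get("species"))
        reactions[elem.get("id")] = (reactants, products)
    return reactions


# Hypothetical toy model, not taken from AGORA or EMBL.
EXAMPLE_SBML = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfReactions>
      <reaction id="R_TRPAS" reversible="false">
        <listOfReactants>
          <speciesReference species="M_trp" stoichiometry="1"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="M_indole" stoichiometry="1"/>
          <speciesReference species="M_pyr" stoichiometry="1"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

if __name__ == "__main__":
    for rid, (subs, prods) in list_reactions(EXAMPLE_SBML).items():
        print(rid, subs, "->", prods)
```

Matching on local tag names rather than a fixed namespace is a design choice made here so that the same sketch tolerates different SBML levels and versions; a dedicated SBML library would be the more robust option for real models.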

TrpNet Implementation
The web-based database was developed on the basis of the JavaServer Faces (JSF) technology using the PrimeFaces framework (v11). The network visualization was implemented using D3 (version 5.0).

Sample Collection for MI Case Study
The murine experiments were approved by the Lady Davis Facility Animal Care Committee and followed the guidelines described by the Canadian Council on Animal Care. Retired breeder male mice were purchased from Charles River, St. Constant, PQ, Canada. Mice were housed in single cages on irradiated corn cob bedding in a vented rack, fed an irradiated Harlan Teklad Global 2018 diet, which contains no animal protein, given acidified tap water, and acclimated to the facility for 1 month before use. Surgery to create an MI was performed by the Surgery Core of the Lady Davis Institute [63,64]. Samples of cecal contents were collected on day 3 post MI from a total of eight male mice, as well as from eight male mice that did not undergo MI surgery. DNA for 16S rRNA sequencing was isolated using a Qiagen QIAamp PowerFecal DNA kit according to the manufacturer's instructions. DNA samples were quantified, their purity was determined, and they were sent to the McGill Genome Center. There, the bacterial V4 region was PCR-amplified using the 515F and 806R primers, prepared with a MiSeq Reagent Kit v3 (600-cycle), and sequenced on an Illumina MiSeq. Data were processed and returned as amplicon sequence variants (ASVs). The R package DADA2 v1.20.0 [53] was used to determine the abundance and gut bacterial species assignment. Gut metabolomics data from the same cecal contents were acquired using an Orbitrap Q-Exactive LC-MS system in both positive and negative modes with a C18 column. MS/MS spectra were collected using data-independent acquisition (DIA). Raw LC-MS spectra were processed by MetaboAnalyst v5.0 [65] to generate a peak list table. About 150 MS1 peaks were found to be from potential tryptophan metabolites. According to the MS/MS data, 24 tryptophan metabolites were identified using the metID package [52] (Table S1).

Sample Collection for IBD Case Study
The dataset of pediatric IBD stool samples was downloaded from the Integrative Human Microbiome Project Consortium (iHMP) [66]. For evaluation purposes, we randomly selected individuals aged 6 to 19 for the disease group (diagnosed with Crohn's disease) and the control group. Sample information is listed in Table S2, and the original data can be found at https://ibdmdb.org/ (accessed on 17 October 2021). The tryptophan metabolites were extracted on the basis of annotation information provided by the authors (Table S3).

Logistic Regression Model for Predicting Metabolite Profiles
A logistic regression model was used to infer the metabolic profile from known taxonomic compositions. This method belongs to the generalized linear model family and learns a probabilistic model that predicts the outcome of a binary variable from one or more categorical or continuous predictor variables. In our case, we aimed to predict tryptophan metabolite production from the taxonomy profile and host type. The algorithm involved four key steps, as described below.

1. Different taxonomy levels and their combinations were evaluated for their predictive values. Models were ranked by Akaike information criterion (AIC). The genus level combined with the host type was selected as the best predictor;
2. The models were further optimized by Bayesian logistic regression coupled with a fast Pareto smoothed leave-one-out cross-validation for the penalized likelihood estimation [67]. These models capture the metabolite production potential ($P_{M,G}$) for the underlying metabolite ($M$) of interest in every genus ($G$) for a given host type;
3. The predicted probability ($P_{M,G}$) was multiplied by the genus abundance table obtained from 16S rRNA sequencing data to compute the accumulated production potential for each metabolite of interest for each sample;
4. The results of all samples were normalized by total sum scaling to be comparable with metabolomics data.
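Steps 3 and 4 amount to a weighted sum over genera followed by a per-sample normalization. The sketch below illustrates this in Python under stated assumptions: the function name `production_potential`, the nested-dictionary layout, and all numeric values are hypothetical and are not taken from TrpNet, and the actual implementation may differ. Total sum scaling is interpreted here as dividing each sample's metabolite scores by their sum.

```python
from typing import Dict


def production_potential(
    abundance: Dict[str, Dict[str, float]],
    prob: Dict[str, Dict[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Accumulate P_{M,G} * abundance over genera, then scale each sample to sum to 1.

    abundance: sample -> genus -> abundance (e.g. 16S read counts)
    prob:      metabolite -> genus -> predicted production probability P_{M,G}
    Genera missing from a metabolite's probability table contribute zero.
    """
    result: Dict[str, Dict[str, float]] = {}
    for sample, genera in abundance.items():
        # Step 3: accumulated production potential per metabolite.
        raw = {
            metabolite: sum(p.get(genus, 0.0) * count for genus, count in genera.items())
            for metabolite, p in prob.items()
        }
        # Step 4: total sum scaling within the sample.
        total = sum(raw.values())
        result[sample] = {
            metabolite: (value / total if total > 0 else 0.0)
            for metabolite, value in raw.items()
        }
    return result


if __name__ == "__main__":
    # Invented illustrative values, not fitted model output.
    example_prob = {
        "IPA": {"Clostridium": 0.9, "Lactobacillus": 0.1},
        "Iald": {"Clostridium": 0.2, "Lactobacillus": 0.6},
    }
    example_abundance = {"sample_1": {"Clostridium": 10, "Lactobacillus": 30}}
    print(production_potential(example_abundance, example_prob))
```

With the invented values above, the raw potentials for `sample_1` are 12 for IPA and 20 for Iald, so the scaled profile is 0.375 and 0.625. The scaled values are relative within a sample, which is why they are compared with normalized metabolomics data rather than with absolute concentrations.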

Conclusions
Understanding molecular dialogues between the gut microbiome and the host is critical for developing microbiome-based diagnostic and therapeutic approaches. In this manuscript, we focused on improving our knowledge of tryptophan metabolism by integrating information from >5000 GEMs, 14 databases, and >300 literature reports. Through its user-friendly interface and interactive visualization, TrpNet provides the most up-to-date information for researchers to study tryptophan metabolism within the context of host and microbiome interactions. On the basis of this information, we further developed an algorithm for predicting microbial tryptophan metabolism from 16S rRNA abundance profiles. Our two case studies demonstrated that our approach gives more accurate results than other established methods. We hope that TrpNet will be a useful resource that allows researchers to better understand gut microbial tryptophan metabolism for translational applications.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/metabo12010010/s1. Figure S1. Observed distribution of tryptophan metabolite production across the predominant genera in the host gut. Microbial tryptophan metabolite production differs according to host niche; Figure S2. Comparison of microbiome composition between the MI and NO.MI groups by (a) PCA plot and (b) bar plot; Figure S3. Comparison of microbiome composition between CD patients and healthy controls by (a) PCA plot and (b) stacked bar plot; Table S1. Metabolite annotation of the MI case study; Table S2. IBD sample information; Table S3. Metabolite annotation of the IBD case study.

Data Availability Statement:
The IBD data is available from http://ibdmdb.org (accessed on 23 November 2021). The MI data is available upon request.