Analyzing Predominant Bacterial Species and Potential Short-Chain Fatty Acid-Associated Metabolic Routes in Human Gut Microbiome Using Integrative Metagenomics

Simple Summary
The human gut microbiome plays an important role in health. This study therefore aimed to analyze the predominant species and metabolic routes involved in short-chain fatty acid (SCFA) production in the human gut microbiome after treatment with copra meal hydrolysate (CMH). Using integrative metagenomics, key predominant bacterial species and metabolic routes involved in cooperative microbiome networks in relation to SCFA biosynthesis were identified. This suggests that CMH is a potential prebiotic diet for modulating and maintaining the gut microbiome implicated in human health.

Abstract
The gut microbiome plays an essential role in host health, and there is interest in utilizing diet to modulate the composition and function of microbial communities. Copra meal hydrolysate (CMH) is commonly used as a natural additive to enhance health. However, the gut microbiome remains largely uncharacterized at the species level, as do its associated metabolic routes involving short-chain fatty acids (SCFAs). In this study, we aimed to analyze, using integrative metagenomics, the predominant species and metabolic routes involved in SCFA production in the human gut microbiome after treatment with CMH. The effect of CMH treatment on the Thai gut microbiome was examined using 16S rRNA gene and whole-metagenome shotgun (WMGS) sequencing technology. The results revealed that CMH has potentially beneficial effects on the gut microbiome. Twelve predominant bacterial species, as well as their potential metabolic routes, were involved in cooperative microbiome networks under sugar utilization (e.g., glucose, mannose, or xylose) and energy supply (e.g., NADH and ATP) in relation to SCFA biosynthesis. These findings suggest that CMH may be used as a potential prebiotic diet for modulating and maintaining the gut microbiome. To our knowledge, this is the first study to reveal the predominant bacterial species and metabolic routes in the Thai gut microbiome after treatment with potential prebiotics.


Introduction
The human gut microbiome contains highly diverse microbial communities that play essential roles in human health. The gut microbiome is influenced by several factors, such as host genetics, diet, lifestyle, medication, and environment [1,2]. In addition, consumption of specific nutritional substances, such as prebiotics, can modulate gut microbial composition [3,4]. Prebiotics include diverse types of oligosaccharides that are subjected to microbial fermentation in the gastrointestinal (GI) tract, and they affect microbial communities and their functions, offering energy and maintaining gut homeostasis [5,6].
Manno-oligosaccharides (MOS) consist of a linear chain of mannose units and have gained great interest as prebiotics. MOS can be derived from mannan-rich plant materials, such as copra meal, a by-product of coconut milk and coconut oil processing. In Thailand, 25 million metric tons of copra meal are produced annually [7]. Enzymatic hydrolysis of copra meal with β-mannanase yields copra meal hydrolysate (CMH), a source for further MOS production. CMH is stable under human gastrointestinal tract conditions, including those of the small intestine, and is easily fermented by lactobacilli and bifidobacteria [8]. Very recently, a study using 16S rRNA gene sequencing demonstrated that the impact of CMH on gastrointestinal symptoms and the gut microbiome was positively related to health [9]. In terms of taxonomy and metabolic function-associated pathways, however, the human gut microbiome remains largely unknown at the species level, as do the associated metabolic routes involved in short-chain fatty acid (SCFA) production.
Therefore, this study aimed to analyze the predominant species and metabolic routes involved in SCFA production in the human gut microbiome after treatment with CMH using integrative metagenomics. The effects of different conditions (baseline, placebo, and CMH) on the Thai gut microbiome were examined using 16S rRNA gene and whole-metagenome shotgun (WMGS) sequencing technology. DNA was first extracted from fecal samples, and the sequencing data were then analyzed using bioinformatics and systems biology tools and databases to obtain taxonomic profiles and metabolic functions of the gut microbiome. Integrative analysis of the 16S rRNA gene and WMGS data revealed a number of predominant bacterial species, as well as a list of potential metabolic functions and associated routes involved in SCFA production after treatment with CMH. This is the first study to reveal that CMH is a potential prebiotic capable of modulating and maintaining the gut microbiome implicated in human health.

Participants and Fecal Sample Collection
For participants, thirty-seven Thai adults from Bangkok and its vicinity, Thailand, aged 18-45 years with a BMI of 18.5-24.0 kg/m², were enrolled in a double-blinded, placebo-controlled trial. Participants were recruited and informed at King Chulalongkorn Memorial Hospital, Bangkok, Thailand, under stringent inclusion and exclusion criteria (i.e., dietary intake, age, and health status). Notably, the participants had no intestinal diseases or diarrheal symptoms in the months prior to sampling, and none had a family history of colorectal cancer. In addition, the participants must not have received antibiotics for at least three months, or probiotics, prebiotics, or synbiotics for at least one month, before sampling. Participants with allergies to coconuts or food intolerance were excluded. The clinical and demographic characteristics of the study participants in the cohort were described by Sathitkowitchai et al. (2021) [9] (see Supplementary File S1).
For fecal sample collection, a total of 74 samples were assigned to four groups: baseline CMH (bCMH, 17 samples), baseline placebo (bPB, 20 samples), treatment with CMH (tCMH, 17 samples), and treatment with placebo (tPB, 20 samples). Baseline means that the participants had not yet been treated with either CMH or placebo (PB), whereas treatment means that they had received CMH or PB for 21 days. Fecal samples (20 g) were collected immediately at the time of defecation and placed into a collection tube in a cooler bag. The fecal samples were stored at -80 °C for further analysis.

DNA Extraction and Metagenome Sequencing
All 74 samples were subjected to DNA extraction using a modified version of the method described by Nakphaichit et al. (2014) [10]. Subsequently, 16S rRNA gene sequencing was performed according to the method described by Sathitkowitchai et al. (2021) [9]. Of the 74 samples, 20 were prepared for WMGS sequencing, annotation, and analysis. Initially, fecal samples were centrifuged for 2 min at 13,000 × g. The pellet was washed twice, centrifuged in phosphate-buffered saline solution (PBS, 1 mL) at 13,000 × g for 5 min, and then suspended in 900 µL PBS. The supernatant was then discarded. Total DNA purification with quality control assessments and metagenome sequencing were performed according to the method described by Raethong et al. (2021) [11]. Clean reads that did not map to the human genome were retained as the metagenomic dataset of the gut microbiome.

Microbial Taxonomic Analysis and Functional Annotation of 16S rRNA Gene Sequencing Data and WMGS Datasets
Initially, the 16S rRNA gene sequence datasets (i.e., 17 bCMH samples, 20 bPB samples, 17 tCMH samples, and 20 tPB samples) were analyzed for microbial taxonomy and function using QIIME 2, the PICRUSt2 pipeline package (v2.1.0_b) [12], and the MetGEMs toolbox (v1.0) [13]. Within the MetGEMs toolbox, Core Function was used to investigate the metabolic functions: the KO IDs in each sample were rank-transformed, and the geometric means of the KO IDs in each group were then computed.
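The rank transformation and per-group geometric mean described above can be illustrated with a minimal Python sketch. This is not the MetGEMs implementation; the function names, the handling of ties by average ranks, and the toy abundance values are assumptions made purely for illustration.

```python
import math

def rank_transform(values):
    """Return average ranks (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def group_geometric_mean_ranks(samples):
    """samples: list of {ko_id: abundance} dicts sharing the same KO IDs.

    Each sample is rank-transformed on its own; the geometric mean of the
    ranks is then taken per KO ID across the samples of the group.
    """
    ko_ids = sorted(samples[0])
    per_sample = [
        dict(zip(ko_ids, rank_transform([s[k] for k in ko_ids])))
        for s in samples
    ]
    n = len(per_sample)
    return {
        k: math.exp(sum(math.log(ps[k]) for ps in per_sample) / n)
        for k in ko_ids
    }

# Hypothetical abundances for two samples of one group (illustrative only).
toy_group = [
    {"K00001": 5, "K00002": 1, "K00003": 3},
    {"K00001": 2, "K00002": 2, "K00003": 9},
]
group_scores = group_geometric_mean_ranks(toy_group)
```

Ranking within each sample before averaging makes the group score insensitive to differences in total abundance between samples, which is one plausible motivation for the rank transformation.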
To analyze the WMGS datasets, 20 of the 74 samples were randomly selected (5 bCMH samples, 5 bPB samples, 5 tCMH samples, and 5 tPB samples). Alpha- and beta-diversity analyses were initially performed using the vegan package (version 2.5-6) [14] in R. For alpha-diversity, the observed species and Shannon diversity indices were used to calculate species richness and abundance. Statistical differences in diversity indices between all possible pairwise comparison groups were identified using the Wilcoxon rank-sum test. Beta-diversity was computed using Bray-Curtis distances with the metaMDS function in the vegan R package [14] to investigate differences between microbial communities across all possible pairwise comparison groups. Differences in beta-diversity were visualized through non-metric multidimensional scaling (NMDS) ordination using the ggplot2 R package [15]. Beta-diversity was tested by permutational multivariate analysis of variance (PERMANOVA), as implemented in the ADONIS function of the vegan R package [14], with 999 permutations. To test the difference in microbial composition between two or more groups, analysis of similarities (ANOSIM) based on the Bray-Curtis dissimilarity was also performed using the vegan R package [14].
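As a rough illustration of the diversity measures named above, the following Python sketch computes the observed species count, the Shannon index (natural logarithm), and the Bray-Curtis dissimilarity for abundance vectors. The study itself used the vegan R package; the function names and toy counts here are assumptions for illustration only.

```python
import math

def observed_species(counts):
    """Species richness: number of taxa with non-zero abundance."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with p_i > 0."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator

# Hypothetical species-level counts for two samples (illustrative only).
sample_1 = [6, 4, 0]
sample_2 = [2, 8, 0]
richness = observed_species(sample_1)
shannon = shannon_index(sample_1)
dissimilarity = bray_curtis(sample_1, sample_2)
```

A Bray-Curtis value of 0 means the two samples have identical abundances, while 1 means they share no taxa; the pairwise matrix of these values is what NMDS, PERMANOVA, and ANOSIM operate on.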
For microbial taxonomic analysis, MetaPhlAn (v3.0) [16] was used to detect and quantify individual species with a library of clade-specific markers (mpa_v30_CHOCOPhlAn_201901) as the database and to further generate overall metagenome profiles at, for example, the phylum, family, genus, species, and strain levels. The relative abundance at each taxonomic level was plotted using the ggplot2 package [15] in R (v.3.5.3). The Wilcoxon rank-sum test was used to identify statistically significant differences (p < 0.1) between all possible pairwise comparison groups.
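To make the pairwise comparison concrete, the sketch below implements an exact two-sided Wilcoxon rank-sum test by enumerating every way of assigning the pooled ranks to the first group, which is feasible for groups of five samples (252 assignments). It is a stand-in for whichever statistical routine the authors used; the function names and relative abundance values are hypothetical.

```python
from itertools import combinations

def midranks(values):
    """Rank values from 1 upward; tied values receive their average rank."""
    return [
        sum(u < v for u in values) + (sum(u == v for u in values) + 1) / 2
        for v in values
    ]

def rank_sum_exact(x, y):
    """Exact two-sided Wilcoxon rank-sum test.

    Returns (W, p), where W is the rank sum of x in the pooled sample and
    p is the share of all group assignments whose rank sum deviates from
    the null expectation at least as much as the observed one.
    """
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    n = len(x)
    observed = sum(ranks[:n])
    expected = n * (len(pooled) + 1) / 2
    deviation = abs(observed - expected)
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), n):
        total += 1
        # Small tolerance guards against floating-point noise in midranks.
        if abs(sum(ranks[i] for i in idx) - expected) >= deviation - 1e-12:
            hits += 1
    return observed, hits / total

# Hypothetical relative abundances (%) of one species in two groups.
tcmh = [1.0, 2.0, 3.0, 4.0, 5.0]
tpb = [6.0, 7.0, 8.0, 9.0, 10.0]
w_stat, p_value = rank_sum_exact(tcmh, tpb)
is_significant = p_value < 0.1  # threshold stated in the text
```

With five samples per group, the smallest attainable two-sided p-value is 2/252, so complete separation of the two groups is needed to reach the most extreme result.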
The relative abundance of gene families and pathways was determined using HUMAnN 3. The gene profile was summed into KO IDs and normalized using the built-in script in HUMAnN 3 [17]. To further determine the metabolic function of the gut microbiome, the KO IDs obtained from the 16S rRNA gene sequence and WMGS datasets were searched against the KEGG database using KEGG Mapper in the KEGG mapping tools [18-20].
Once KEGG Orthology (KO) IDs were identified, they were classified into six functional categories: metabolism, genetic information processing, cellular processes, environmental information processing, organismal systems, and human diseases. Further, functional and pathway enrichment analysis was performed, with a distinct up-directional p-value < 0.05 taken as the criterion for the group of interest, for example, in the pairwise comparison between tCMH and tPB.

Identification of Predominant Bacterial Species, Potential Metabolic Functions and Associated Routes Involved in Treatment with CMH Using Integrative Analysis
To identify the predominant bacterial species, potential metabolic functions, and associated routes upon treatment with CMH, the taxonomic profiles and metabolic functions obtained from the 16S rRNA gene sequences and WMGS datasets were integrated. After the taxonomic analysis and the metabolic function and pathway enrichment analyses, the predominant bacterial species and enriched KO IDs under the positive median mode with a p-value < 0.05 were considered. Based on the KEGG database, the targets of the enriched KO IDs across the predominant species were mapped to metabolic pathways, for example, SCFA biosynthesis, using KEGG Mapper in the KEGG mapping tools. Furthermore, literature mining and manual curation were performed to uncover a number of predominant bacterial species, as well as a list of potential metabolic functions and associated routes.
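The selection step can be sketched as a simple filter. The sketch below assumes that "positive median mode" means a higher median in the treated group than in the comparison group, which is an interpretation rather than something the text states explicitly; the record layout, function name, and values are likewise hypothetical.

```python
from statistics import median

def select_enriched(records, alpha=0.05):
    """Keep features with a higher treated median and p-value below alpha.

    records: list of dicts with keys "feature", "treated", "control",
    and "p_value" (the p-value is assumed to be computed beforehand).
    """
    selected = []
    for rec in records:
        shift = median(rec["treated"]) - median(rec["control"])
        if shift > 0 and rec["p_value"] < alpha:
            selected.append(rec["feature"])
    return selected

# Hypothetical features (species or KO IDs) with toy abundances.
toy_records = [
    {"feature": "feature_A", "treated": [0.30, 0.42, 0.35],
     "control": [0.10, 0.12, 0.20], "p_value": 0.01},
    {"feature": "feature_B", "treated": [0.05, 0.07, 0.06],
     "control": [0.20, 0.25, 0.22], "p_value": 0.02},
    {"feature": "feature_C", "treated": [0.50, 0.55, 0.60],
     "control": [0.40, 0.45, 0.41], "p_value": 0.30},
]
enriched = select_enriched(toy_records)
```

Here feature_B is dropped because its median decreases and feature_C because its p-value exceeds the threshold, leaving only feature_A for downstream pathway mapping.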

Assessment of 16S rRNA Gene Datasets on Microbial Composition and Metabolic Function
After treatment of the Thai gut microbiome with CMH (tCMH), the microbial composition in the human gut was assessed using the 16S rRNA gene datasets. Four phyla (Firmicutes, Bacteroidetes, Actinobacteria, and Proteobacteria) were most commonly identified. Firmicutes showed the highest relative abundance in the bacterial community, accounting for 80.6%. At the family level, the top five bacterial families in tCMH under the positive median mode were Lachnospiraceae, Bacteroidaceae, Erysipelotrichaceae, Prevotellaceae, and Rikenellaceae (Figure 1A). Ruminococcaceae was reduced in the tCMH group. These results are consistent with those reported by Sathitkowitchai et al. (2021) [9]. Across the other pairwise groups, the changes in the gut microbiome, for example, at the phylum level, were similar (see Supplementary File S1).
Metabolic function prediction of the gut microbiome focused on metabolism (Figure 1B). The major enrichments were the top three pathways involved in carbohydrate metabolism. Considering the positive median mode under statistical significance (p ≤ 0.05), glycolysis/gluconeogenesis, propanoate metabolism, and C5-branched dibasic acid metabolism were targeted in relation to central carbon metabolism and energy supply. The target enzymes included, e.g., phosphoenolpyruvate carboxykinase (ATP) (EC:4.

Assessment of Gut Microbial Diversity and Composition from WMGS Datasets
The raw WMGS datasets totaled 1141.74 megabases (Mb) across all samples (Table 1). After discarding low-quality reads, adapters, and human genome contaminants, a total of 1132.41 Mb of clean reads was retrieved, with an average effective rate of 99.18% (see Supplementary File S2), and used for further analysis. Alpha- and beta-diversity analyses were performed to determine the microbial diversity across all possible groups. The observed species and Shannon diversity indices showed that the groups were comparable, as shown in Figure 2A,B. Based on the Wilcoxon rank-sum test, there were no statistically significant differences in alpha-diversity among the groups. Regarding beta-diversity, an NMDS plot of the Bray-Curtis dissimilarity index is shown in Figure 2C. The results also showed that beta-diversity was not significantly different across the groups (ANOSIM analysis: R = -0.236, p > 0.05; ADONIS analysis: R² = 0.127, p > 0.05). This indicates that the diversity of the baseline and treatment groups was comparable in terms of taxonomic richness and abundance.
Figure 2. (C) Beta-diversity plot of non-metric multidimensional scaling (NMDS) ordination based on Bray-Curtis dissimilarity among the groups. Taxonomic diversity was assessed at the species level. bPB, bCMH, tPB, and tCMH represent baseline placebo, baseline CMH, treatment with placebo, and treatment with CMH, respectively.
To initially assess the taxonomic profiles of the gut microbiome, the WMGS datasets were analyzed for all taxa. As presented in Figure 3A,B, four phyla (Firmicutes, Actinobacteria, Proteobacteria, and Bacteroidetes) were comparable across the baselines (bPB and bCMH) and treatments (tPB and tCMH). Among these four phyla, Firmicutes showed the highest relative abundance in the bacterial community for all groups, accounting for an average of 60.99%. The family Lachnospiraceae, which includes the genus Roseburia and belongs to the phylum Firmicutes, was dominant at a high abundance (23.73%). Considering the top 15 bacterial families across all groups (see Supplementary File S2), five dominant families (Figure 3C,D) were found: Bifidobacteriaceae, Enterobacteriaceae, Eubacteriaceae, Lachnospiraceae, and Ruminococcaceae. These results are consistent with those of La-ongkham et al. (2020) [21] in the context of the core taxonomy of the Thai gut microbiome.

Assignment of Metabolic Function Underlying KO IDs from WMGS Datasets
For all bacterial families obtained from the taxonomic profiles (see Supplementary File S2), metabolic functional analysis was performed using KEGG. A total of 5421 KO identifiers (KO IDs) were identified. Across the six functional categories, metabolism had the highest number of KO IDs (1712 KO IDs), followed by environmental information processing (550 KO IDs), genetic information processing (365 KO IDs), cellular processes (177 KO IDs), human diseases (56 KO IDs), and organismal systems (17 KO IDs), as shown in Figure 5A.
Metabolic function and pathway enrichment analyses were performed for carbohydrate metabolism. Interestingly, we found that the TCA cycle and pentose and glucuronate interconversions were significantly enriched (p < 0.05) (Figures 5B and 6A; see Supplementary File S2).
Altogether, our results agree well with earlier reports on propionate/butyrate-producing microbes, such as A. intestini, E. coli, Bacteroides spp., Anaerostipes, Eubacterium, and Roseburia [27]. The enzymes shown in Figure 6A (see Supplementary File S2) are essential for 2-oxoglutarate and succinate formation, leading to the biosynthesis of SCFAs. Succinate is a key precursor and plays an important role in the formation of either propionate or acetyl-CoA, which can then be converted to acetate or butyrate [22-24].

Identifying Predominant Bacterial Species and Potential Metabolic Routes Using Integrative Metagenomics
To identify the predominant bacterial species, the results obtained from the 16S rRNA gene sequence datasets (Figure 1B) and the WMGS datasets (Figure 6B) were integrated. After CMH treatment (tCMH), we found 12 predominant species: A. intestini, A. butyriciproducens, A. hadrus, B. dorei, B. massiliensis, B. vulgatus, C. saccharolyticum, D. piger, E. coli, E. siraeum, R. hominis, and R. intestinalis. Interestingly, according to the literature, all of the identified species are SCFA-producing species [28]. The results of mapping the enriched KO IDs identified from these SCFA-producing species onto SCFA biosynthesis are shown in Figure 7. The mapping revealed five metabolic pathways related to SCFA biosynthesis: glycolysis/gluconeogenesis, the TCA cycle, pentose and glucuronate interconversions, C5-branched dibasic acid metabolism, and propanoate metabolism. Among SCFAs, the metabolic routes involved in acetate, propionate, and butyrate production were identified (Figure 7), as these are commonly found in the human gut [23].
In glycolysis/gluconeogenesis, we found 11 KO IDs across five key enzymes involved in nutrient utilization and ATP supply, including pyruvate kinase (EC:2.7.1.40), phosphoenolpyruvate carboxykinase (ATP) (EC:4.1.1.49), and pyruvate, phosphate dikinase (EC:2.7.9.1). These enzymes are important for generating key precursors, such as oxaloacetate and pyruvate, for acetyl-CoA and subsequently acetate and butyrate formation [4,28]. Acetate is the most abundant SCFA, and it is essential for the growth of other microbes in the human gut. Butyrate is the main energy source for human colonocytes and enterocytes and can activate intestinal gluconeogenesis, with beneficial effects on glucose and energy homeostasis.
In the TCA cycle, only one KO ID and one key enzyme, 2-oxoglutarate synthase (EC:1.2.7.3), were found to be involved in NADH supply, across A. intestini, B. dorei, B. massiliensis, B. vulgatus, D. piger, and E. siraeum. This enzyme is of interest because it produces succinyl-CoA, which is a precursor for propionate formation [22,29]. Consistent with earlier reports on other populations, propionate is commonly produced by Bacteroidetes, including Bacteroides, for example, B. vulgatus, through the succinate pathway [30] upon treatment with CMH [31].
For pentose and glucuronate interconversions, we observed three KO IDs and three key enzymes involved in NADH supply, including L-iditol 2-dehydrogenase (EC:1.1.1.14). This enzyme is widely distributed, has been described in archaea and bacteria, and was found here across five bacterial species (i.e., A. hadrus, C. saccharolyticum, E. coli, R. hominis, and R. intestinalis). It acts on a number of sugar alcohols, such as D-xylitol, which increases the concentration of propionate, as seen in E. coli [32].
Through integrative metagenomics, this study revealed that the predominant species and potential metabolic routes form cooperative microbiome networks under different sugar utilizations (e.g., glucose, mannose, or xylose) in relation to SCFA biosynthesis. Taken together with the tCMH results, these findings suggest that CMH is a potential prebiotic for human gut microbiome modulation and maintenance.

Conclusions
Our integrative metagenomics approach, applied under different treatment regimens, combined 16S rRNA gene and WMGS sequencing technologies. The data were analyzed using bioinformatics and systems biology tools. Upon treatment with CMH, we identified twelve predominant bacterial species, together with their potential metabolic functions and associated routes. Cooperative microbiome networks under sugar utilization (e.g., glucose, mannose, or xylose) and energy supply (i.e., ATP and NADH) engage in central carbon metabolism to biosynthesize SCFAs, such as propionate, butyrate, and acetate. These findings suggest that CMH may be a potential prebiotic for modulating the gut microbiome. Therefore, this study sheds light on the gut microbiome-metabolism axis implicated in human health.