Milk Formula Diet Alters Bacterial and Host Protein Profile in Comparison to Human Milk Diet in Neonatal Piglet Model

The metaproteome profile of cecal contents collected from neonatal piglets fed pasteurized human milk (HM) or a dairy-based infant formula (MF) from postnatal day (PND) 2 to 21 was assessed. At PND 21, a subset of piglets from each group (n = 11/group) was euthanized, and cecal contents were collected for metaproteome analysis. Cecal microbiota composition showed predominantly more of the Firmicutes phylum and the Lachnospiraceae family in the cecal lumen of HM-fed piglets in comparison to the MF-fed group. Ruminococcus gnavus was the most abundant species from the Firmicutes phylum in the cecal contents of the HM-fed piglets at 21 days of age. A greater number of expressed proteins were identified in the cecal contents of the HM-fed piglets relative to the MF-fed piglets. Greater abundances of proteins potentially expressed by Bacteroides spp., such as glycoside enzymes, were noted in the cecal lumen of HM-fed piglets relative to the MF-fed piglets. Additionally, lyases associated with the Lachnospiraceae family were abundant in the cecum of the HM group relative to the MF group. Overall, our findings indicate that neonatal diet impacts the gut bacterial taxa and microbial proteins prior to weaning. The metaproteomics data were deposited into PRIDE (PXD025432; 10.6019/PXD025432).


Introduction
Studies have demonstrated positive health outcomes in human milk-fed infants in comparison to formula-fed infants. There is a considerable amount of evidence showing that a human milk diet minimizes the risk of necrotizing enterocolitis in preterm infants [1][2][3][4]. Additionally, gut microbiota colonization can be influenced by neonatal diet. The literature suggests that breastfed infants have a higher abundance of Bifidobacteria and Bacteroides than formula-fed infants [5][6][7][8][9][10]. Previously, we reported that human milk-fed (HM) piglets had higher fecal Bacteroides abundance relative to a formula-fed group (MF), as well as a stronger immune response, shown by enhanced T cell proliferation in the mesenteric lymph nodes of HM-fed animals [11]. Gut and immune health are functions of both diet and the gut microbes that respond to diet. Several metabolites are known to derive from microbial metabolism throughout the intestinal regions [12,13]. For instance, indigestible carbohydrates can be fermented by distal gut bacteria (cecum and colon) to short-chain fatty acids [14], and complex human milk oligosaccharides (HMOs) are broken down by microbes in the distal gut, serving as substrates for commensal bacteria among other functions reviewed elsewhere [15]. In addition, the distal gut microbiota produces derivatives of tryptophan metabolism (i.e., indoles) [16] and converts primary bile acids to secondary bile acids [17]. Metabolomics analysis of the large intestinal contents of these piglets revealed that HM feeding resulted in greater abundance of fatty acids, polyamine derivatives, glutamic acid, and tryptophan metabolites in the distal gut. In contrast, MF-fed piglets had greater abundance of cholesterol, bile acids, and amino acids in the distal colon at 21 days of age relative to the HM-fed group [18]. These findings might result from the interaction between neonatal diets and gut microbial activity.
Several approaches have demonstrated that microbiota composition can be altered in response to diet [19][20][21][22][23][24][25][26][27]. However, these studies were limited in determining the functional relevance of the microbial changes and which components of the microbiota play a role in the positive health outcomes observed in human milk-fed infants. Newer technologies such as metaproteomics might help to determine which microbial proteins are present, their abundance, and the microbial community they come from. This allows us to understand the functional role of the microbiota and its interactions with the host and with other microbial species in an ecosystem. In addition, host proteins can be identified from the sloughed-off cells of the gastrointestinal tract. The proteins provide a measure of the activity of the cells, and their abundances provide a phenotype at the molecular level. Metaproteomics was often used to study environmental samples in the 2000s [28]. The first shotgun metaproteomics study of human samples was conducted by Verberkmoes et al. in 2009 [29]. They found that 30% of protein hits were associated with the host, and several microbial pathways related to carbohydrate and energy metabolism were observed. The literature is limited in its understanding of microbial protein/peptide functions, especially in neonates.
We hypothesized that cecal bacterial community and bacterial proteins act as signaling molecules to promote gut homeostasis and immune function in HM-fed piglets relative to MF-fed piglets. Thus, a metaproteomics approach was used to determine the bacterial protein expression in piglets fed either human milk or cow's milk formula.

Experimental Design
The animal experiment was conducted as approved by the Institutional Animal Care and Use Committee at the University of Arkansas for Medical Sciences (UAMS Institutional Animal Care and Use Committee protocols 3727 and 3471). Diet composition and the experimental design were published previously [11,30]. Piglets were obtained from 4-6 sows. At 2 days of age, White Dutch Landrace Duroc male piglets were randomized into two dietary groups (n = 11/group): pasteurized human milk (HM) provided by the Mother's Milk Bank of North Texas, or a dairy-based infant formula (MF) (milk formula; Similac Advance powder; Ross products, Abbott Laboratories, Columbus, OH). Piglets were fed to meet the nutrient requirements of growing pigs as per National Research Council (NRC) guidelines [31]. Piglets were euthanized at PND 21 to collect cecal contents. Samples were immediately frozen in liquid nitrogen and transferred to a −80 °C freezer.
Preliminary taxonomy analysis was conducted to evaluate the reproducibility of protein extraction. De novo peptide tags obtained by PEAKS for the raw MS/MS spectra were filtered using the average local confidence score (ALC ≥ 80), and the filtered peptide lists were supplied to the online metaproteomics tool Unipept [34], with the following settings: equate I and L, checked; filter duplicate peptides, unchecked; advanced missed cleavage handling, checked. The taxonomy information was visualized using a tree diagram provided by Unipept. Bacterial-to-host ratios were calculated using the number of de novo sequences matched to each of the two taxonomic kingdoms (Table S1). This step is required to assess the quality of protein extraction and to ensure that a sufficient number of bacterial proteins was extracted. After this quality control step, we proceeded with protein identification using the multi-step database strategy as implemented in PEAKS Studio.
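The quality control step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tag records, their field names (`alc`, `group`), and the example peptides are hypothetical, and the taxonomic assignment that Unipept performs is represented here by a precomputed label rather than reproduced.

```python
from collections import Counter

ALC_THRESHOLD = 80  # ALC cutoff stated in the text for the Unipept step


def filter_by_alc(tags, threshold=ALC_THRESHOLD):
    """Keep de novo tags whose ALC score is at or above the threshold."""
    return [t for t in tags if t["alc"] >= threshold]


def bacterial_to_host_ratio(tags):
    """Ratio of tags assigned to bacteria versus tags assigned to the host.

    Returns None when no host-assigned tag is present, to avoid dividing by zero.
    """
    counts = Counter(t["group"] for t in tags)
    if counts["host"] == 0:
        return None
    return counts["bacteria"] / counts["host"]


# Hypothetical tags: peptide string, ALC score (%), and assigned group
example_tags = [
    {"peptide": "PEPTIDEA", "alc": 92, "group": "bacteria"},
    {"peptide": "PEPTIDEB", "alc": 85, "group": "bacteria"},
    {"peptide": "PEPTIDEC", "alc": 80, "group": "bacteria"},
    {"peptide": "PEPTIDED", "alc": 81, "group": "host"},
    {"peptide": "PEPTIDEE", "alc": 60, "group": "bacteria"},  # below cutoff
    {"peptide": "PEPTIDEF", "alc": 75, "group": "host"},      # below cutoff
]

kept = filter_by_alc(example_tags)
ratio = bacterial_to_host_ratio(kept)
print(len(kept), ratio)
```

A low ratio in one sample compared with the others would flag a possible extraction problem before any database search is run.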
Multi-step database search strategy for protein identification: The high-quality de novo tags (average local confidence score ≥ 50%) were searched against a series of protein databases using the multi-step database strategy. The false discovery rate estimation as implemented in PEAKS is compatible with the multi-step searches [35].
Step 1: The UniProt/TrEMBL protein database (downloaded on 13 April 2020) was searched using Homo sapiens and Sus scrofa as taxonomic filters (310,501 entries were searched). Unmatched de novo tags from this step were passed on to Step 2, wherein the UniProt database was searched using bacteria, archaea, and fungi as taxonomic filters (142,741,860 entries searched). No filters were applied to the search results in these first two steps, apart from the de novo quality score (ALC ≥ 50%). All of the identified entries from the first two steps (≈10% estimated FDR at this point, 0 unique peptides allowed) were used to compile a sequence database for the final search.
Step 3: The de novo tags were re-searched against the final sequence database derived from the results of the previous two steps (172,464 entries), applying stringent FDR criteria to the final result: 1% false discovery rate for peptide-to-spectrum matches (corresponding average −10lgP ≈ 25 across samples) and a minimum of 1 unique peptide per protein. Proteins identified by a single unique peptide were further required to have −10lgP = 30 in order to be considered identified. Additional filters were applied at the next step for comparative analysis.
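The protein-level part of the Step 3 criteria can be written as a simple filter. The sketch below is not the PEAKS implementation; the record layout and protein identifiers are hypothetical, and it assumes that the stated −10lgP value of 30 for single-unique-peptide hits is meant as a minimum score. The 1% FDR cutoff on peptide-to-spectrum matches is taken as already applied upstream.

```python
MIN_UNIQUE_PEPTIDES = 1        # minimum unique peptides per protein (from the text)
SINGLE_PEPTIDE_MIN_SCORE = 30  # assumption: the stated -10lgP of 30 is read as a minimum


def passes_final_filter(protein):
    """Return True if a protein record meets the Step 3 protein-level criteria."""
    n_unique = protein["unique_peptides"]
    if n_unique < MIN_UNIQUE_PEPTIDES:
        return False
    if n_unique == 1:
        # Single-unique-peptide hits need the extra score requirement
        return protein["score"] >= SINGLE_PEPTIDE_MIN_SCORE
    return True


# Hypothetical protein records: identifier, unique peptide count, -10lgP score
example_proteins = [
    {"id": "protein_1", "unique_peptides": 3, "score": 26.0},
    {"id": "protein_2", "unique_peptides": 1, "score": 41.5},
    {"id": "protein_3", "unique_peptides": 1, "score": 27.0},  # single peptide, low score
    {"id": "protein_4", "unique_peptides": 0, "score": 55.0},  # no unique peptide
]

identified = [p["id"] for p in example_proteins if passes_final_filter(p)]
print(identified)
```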
Differential abundance of proteins and bacteria: Spectral counts (the number of tandem MS spectra that match a given protein sequence via the database search) were used to infer differentially abundant (DA) proteins and taxonomic units. At the taxonomic unit level, the spectral counts of proteins were grouped using the taxonomic information in the sequence database and then summed to obtain total spectral counts for each species in each sample. If species were not identifiable, higher taxonomic levels were used. Moreover, the identified organism had to be present in at least 4 of the independent biological replicates in either of the two conditions compared. The counts were filtered so that species with fewer than 10 counts in all samples but one were removed. Then, counts were normalized to the trimmed mean of M values, a method frequently employed in RNA-Seq analysis [36]. The differential abundance analysis was performed with the Poisson-Tweedie family of distributions using the tweeDEseq package in R [37]. Initially, data analysis for microbiota and for microbial and host proteins was conducted with the edgeR and DESeq2 methods using different statistical tests (i.e., Wald and LRT for DESeq2; LRT and exactTest for edgeR). Finally, the Benjamini-Hochberg correction for multiple testing was used to define differentially abundant proteins and bacterial species (FDR < 0.05).
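The bookkeeping around the statistical test (summing counts per taxon, the two filters, and the Benjamini-Hochberg adjustment) can be illustrated as follows. This is a minimal sketch, not the authors' R workflow: the sample names, species labels, and counts are hypothetical, the low-count filter is read as "keep a taxon only if at least two samples reach 10 counts", and the TMM normalization and Poisson-Tweedie test are left to the packages named above.

```python
from collections import defaultdict


def sum_counts_by_taxon(protein_counts, protein_taxon):
    """Sum protein-level spectral counts into taxon-level counts per sample."""
    totals = defaultdict(lambda: defaultdict(int))
    for protein, per_sample in protein_counts.items():
        taxon = protein_taxon[protein]
        for sample, count in per_sample.items():
            totals[taxon][sample] += count
    return {taxon: dict(samples) for taxon, samples in totals.items()}


def keep_taxon(counts, groups, min_replicates=4, min_count=10):
    """Apply the presence filter and the low-count filter to one taxon."""
    present = defaultdict(int)
    for sample, group in groups.items():
        if counts.get(sample, 0) > 0:
            present[group] += 1
    # Must be detected in at least min_replicates samples of one condition
    if max(present.values(), default=0) < min_replicates:
        return False
    # Assumed reading: drop taxa below min_count in all samples but one
    n_high = sum(1 for s in groups if counts.get(s, 0) >= min_count)
    return n_high >= 2


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical data: four replicates per diet group
groups = {f"HM{i}": "HM" for i in range(1, 5)}
groups.update({f"MF{i}": "MF" for i in range(1, 5)})
protein_counts = {
    "P1": {"HM1": 8, "HM2": 10, "HM3": 2, "HM4": 1},
    "P2": {"HM1": 4, "HM2": 5, "HM3": 1},
    "P3": {"HM1": 20, "MF1": 1, "MF2": 1, "MF3": 1},
}
protein_taxon = {"P1": "Species A", "P2": "Species A", "P3": "Species B"}

taxon_counts = sum_counts_by_taxon(protein_counts, protein_taxon)
kept_taxa = [t for t, c in taxon_counts.items() if keep_taxon(c, groups)]
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(kept_taxa, adjusted)
```

In this toy input, Species B is dropped because it is not detected in four replicates of either group, even though one sample has a high count.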

Data Accessibility
The mass spectrometry proteomics data were deposited to the ProteomeXchange Consortium via the PRIDE [38] partner repository with the dataset identifier PXD025432 and 10.6019/PXD025432. Reviewer login details: Username: reviewer_pxd025432@ebi.ac.uk; password: qvFTwXRs.

Results
Figures S1 and S2 detail these data as Venn diagrams (Bacteria Venn pairwise and protein Venn pairwise, respectively). Due to zero inflation and overdispersion observed with the data, we used the Poisson-Tweedie method, which enables direct fitting of data with heavy-tails and/or zero-inflation [32]. Heat maps show the most abundant bacterial taxa and proteins altered in cecal lumen due to neonatal feeding (Figures 1 and 2, respectively). The microbial abundance and metaproteome data are listed in Tables 1 and 2. All the bacteria and proteins identified in this study are presented in the Supplemental Datasets S1 and S2, respectively.

Microbial Taxonomy Identification in the Cecal Lumen
The bacterial abundance in the cecal lumen of HM- and MF-fed piglets is shown in Table 1. The cecal profile of the HM-fed piglets was predominantly composed of the Firmicutes phylum and the Lachnospiraceae family, including the species Ruminococcus lactaris, Ruminococcus gnavus, and Lachnospiraceae bacterium, while the cecal lumen of the MF-fed piglets had a higher abundance of the Bacteroides genus relative to HM-fed piglets, including Bacteroides clarus and Bacteroides stercoris. Additionally, the cecum of MF-fed piglets had greater abundance of Clostridium clostridioforme (fold-change (FC) = 2.9) compared to the HM-fed group.
Table 1 footnotes: 1 The raw spectral counts matching to the identified bacterial species were analyzed using the tweeDEseq package in Bioconductor. 2 HM and MF columns indicate the mean value of the total spectral counts followed by the standard deviation of the mean (SD). 3 Log2 FC is the log2 of the HM-to-MF ratio. 4 Benjamini-Hochberg correction for multiple testing was applied to adjust p-values.
Table 2 footnotes: 1 The raw spectral counts matching to the identified proteins were analyzed using the tweeDEseq package in Bioconductor. 2 HM and MF columns indicate the mean value of the total spectral counts. 3 Log2 FC is the log2 of the HM-to-MF ratio. 4 Benjamini-Hochberg correction for multiple testing was applied to adjust p-values.
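Because the tables report Log2 FC as the log2 of the HM-to-MF ratio, a taxon that is more abundant in the MF group has a negative value. The short sketch below makes the sign convention explicit; the function name is hypothetical, and the two ratios are the fold changes quoted in this article (2.9 toward MF for Clostridium clostridioforme, 5.4 toward HM for Ruminococcus gnavus), used here only as illustrative inputs.

```python
import math


def log2_fold_change(mean_hm, mean_mf):
    """Log2 of the HM-to-MF ratio of mean spectral counts."""
    if mean_hm <= 0 or mean_mf <= 0:
        raise ValueError("both means must be positive to take a log ratio")
    return math.log2(mean_hm / mean_mf)


# 5.4-fold higher in HM gives a positive value (about 2.43)
higher_in_hm = log2_fold_change(5.4, 1.0)
# 2.9-fold higher in MF gives a negative value (about -1.54)
higher_in_mf = log2_fold_change(1.0, 2.9)
print(round(higher_in_hm, 2), round(higher_in_mf, 2))
```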

Bacterial Proteins Impacted by Diet Groups in the Lumen of Cecum at PND 21
The bacterial peptide profile of the cecal contents of HM- or MF-fed piglets at PND 21 is shown in Table 2. A greater number of bacterial proteins were identified in the HM-fed group relative to the MF-fed piglets. The top 10 bacterial proteins identified in the cecal lumen of the MF group were from the phylum Bacteroidetes, including species from the Bacteroides and Phocaeicola genera. Peptides derived from Phocaeicola vulgatus (Bacteroides vulgatus) included RagB/SusD family nutrient uptake outer membrane proteins as well as malate dehydrogenase. Proteins associated with Phocaeicola vulgatus were also identified in the cecal contents of the HM-fed piglets; however, a more diverse pool of peptides was observed relative to the MF group. For instance, galactose oxidase, sialidase, tetracycline resistance protein, and chaperonin were peptides associated with Phocaeicola vulgatus that had higher abundance in the cecum of the HM group compared to the MF group. Additionally, the LacI family transcriptional regulator associated with Firmicutes bacterium was greater in the cecal lumen of the HM group (FC = 3) relative to the MF group. L-fucose isomerase, D-ribose pyranase, and chaperonin, all associated with Firmicutes bacterium, were greater in the cecal contents of HM-fed compared to MF-fed piglets. The aldehyde-lyase fructose-1,6-bisphosphate aldolase had greater abundance in the cecum of the HM group relative to the MF group. Additionally, this enzyme was associated with different species in the cecum of the HM group, such as Lachnospiraceae bacterium, Ruminococcus gnavus, and uncultured Ruminococcus sp. The abundance of the phosphotransferase acetate kinase was also greater in the cecal contents of the HM group, and it was associated with both Lachnospiraceae bacterium and Clostridium sp. D5.

Host Proteins Identified in the Cecal Contents at PND 21
Host proteins identified in the cecal contents of HM-fed versus MF-fed piglets at PND 21 are shown in Table S2. Briefly, the human proteins N-sulphoglucosamine sulphohydrolase, epididymis secretory sperm binding protein, alpha-1-antitrypsin, and lactotransferrin were greater (FC > 5) in the cecum of HM-fed piglets compared to the MF group. In contrast, the MF-fed piglets had greater abundance of porcine proteins such as secreted folate binding protein, folate_rec domain-containing protein, and transthyretin relative to the HM-fed group.

Discussion
This study used a porcine model due to the similarities in the anatomy and physiology of the digestive tract between pigs and humans [39,40]. Previous studies found that different protein sources such as bovine milk, hydrolyzed bovine milk, and soybean formula did not change intestinal trypsin and chymotrypsin or the absorption of nitrogen in the small and large intestine in 3-week-old piglets, similar to human infants [39]. Furthermore, it has been demonstrated that 3-week-old piglets are suitable for studying parameters of digestion and absorption relative to 3-month-old infants [40]. In our previous study, we observed that MF-fed piglets had increased microbial diversity and richness across the luminal regions compared to the HM-fed group [26], which is in agreement with microbiota composition findings in infants that have shown higher microbial richness in formula-fed infants [41,42]. Thus, the gut-related outcomes from the current study have the potential to be translated to infants consuming human milk or formula.
Metaproteome analyses of the gut microbiota are typically conducted with fecal samples; the microbiota constitutes a significant share of the biomass in feces, which can reflect intestinal conditions. However, fecal samples are a mixture of microbiota from all intestinal regions, and the piglet model provided the opportunity to measure a specific bioregion of the gut (i.e., cecal contents). In addition, it has been demonstrated that the main microbial fermentation of both carbohydrate and protein occurs in the cecum, suggesting a microbiota role in putrefaction [43]; thus, cecal luminal contents were considered for this study. Future studies are needed to determine bioregional differences in bacterial protein expression and their impact on gut health.
Bifidobacterium and Bacteroides are the most abundant genera observed in breastfed infants [24,44], while in formula-fed infants, Bifidobacterium and Bacteroides have been identified at similar levels [9]. Bacteroides vulgatus had persistent abundance from birth up to 4 months of age in the infant gut [45]. Bacteroides vulgatus and Bacteroides dorei abundances have been reported to increase in the feces of infants at 6 months of age [46], while in the gut microbiota of healthy adults, these species within the Bacteroides genus are the most predominant [47]. Additionally, Bacteroides abundance in the human gut has been associated with the maintenance of a healthy gut [48]. In line with these observations, we previously reported a higher abundance of Bacteroides in the feces of HM-fed piglets relative to the formula-fed group [11]. In the current study, metaproteomic analysis revealed greater abundance of specific bacterial peptides belonging to Bacteroides vulgatus in the cecal contents of HM-fed piglets relative to the MF-fed group at 21 days of age. Interestingly, studies have shown that Bacteroides vulgatus can grow in the presence of human milk oligosaccharides (HMOs), as well as metabolize these complex carbohydrates [49,50]. Moreover, proteins associated with Bacteroides vulgatus have been identified in stool samples of breastfed infants at 2-3 months of age [51]. Bacteroides spp. also promote Treg cell development [52,53], and it has been shown that infants with decreased allergic colitis had increased Bacteroides spp. in their stool [54], suggesting a role for these species in promoting immune responses and homeostasis in the gut. This further suggests a role for Bacteroides spp. in cell-mediated immunity, but it is yet to be determined how this impacts antibody and humoral immune responses.
Recently, Bifidobacterium abundance has been reported to decrease in the feces of infants from 6 to 12 months of age, while Lachnospiraceae abundance increased [46]. Interestingly, in this study, alongside the Bacteroides vulgatus-associated proteins, a greater number of enzymes related to the Lachnospiraceae family were identified in the cecal lumen of HM-fed piglets relative to the MF group. Studies have demonstrated that bacteria within the Lachnospiraceae family, in particular Ruminococcus gnavus, have the ability to produce iso-bile acids, and such metabolites can favor the growth of Bacteroides [55,56]. Recently, the PreventADALL cohort study evaluated the microbial composition and the metaproteome of 100 mother-child pairs from Norway and Sweden [57]. Within the Firmicutes phylum, the predominant species identified in the feces of 12-month-old infants was Ruminococcus gnavus, and glycoside hydrolases were the enzymes associated with this species [57]. These findings are in agreement with the greater abundance of enzymes involved in the degradation of sialic acids observed in our study, including fructose-1,6-bisphosphate aldolase potentially expressed by Ruminococcus gnavus in the cecal lumen of HM-fed piglets relative to the MF group at PND 21. Ruminococcus gnavus and such aldolase enzymes have been reported to metabolize sialic acid to N-acetylmannosamine [58]. Indeed, sialic acids are a family of nine-carbon sugar acids, and human milk is a rich source of oligosaccharide-bound sialic acid [59].
Furthermore, human milk feeding enhanced the expression of glycoside hydrolases, nutrient uptake proteins, and transporters, which are commonly involved in HMO consumption by Bacteroides and Bifidobacteria [60,61]. Therefore, it is plausible that milk glycans such as HMOs promote the growth of beneficial bacteria in HM-fed piglets. Additionally, the different hydrolases identified in the HM-fed piglets and potentially expressed by Bacteroides might benefit the immune system, pending mechanistic data.
We acknowledge that in the current study, the numbers of microbial taxa and proteins identified were smaller than in a typical metagenome analysis. Kleiner et al. described very elegantly the limitations of metaproteomics data acquisition with current mass spectrometry and how that limits in-depth community analysis [62]. In addition, metaproteomics analysis requires a well-curated sequence database to assign the proteins to individual microbial species. This potential mismatch between identified proteins and their assignment to bacterial species is evident when one compares hierarchical clustering using bacterial abundance (Figure 1) and protein abundance (Figure 2). The clustering using protein abundance shows clear separation between the HM and MF groups, while clustering using bacterial abundance has one outlier from the HM group (the right-most HM column in the middle of the heatmap, Figure 1). We argue that the limited number of microbial species and hits to microbial proteins in the current study results from a combination of technology and data acquisition. This is expected to improve in the next few years with better mass spectrometry technology and metaproteome database tools.
The current study was conducted under controlled environmental conditions (piglets were housed at the vivarium) with isocaloric diets for both the HM and MF groups. The human milk used in this study was composed of milk samples ranging from 2 to 12 months of lactation, and different components were added to the diet to meet the nutrient requirements of growing piglets.
Additionally, the piglets were enrolled in the study at 2 days of age. Thus, this study lacks data on colostrum intake. These limitations might introduce variation in the milk composition and might affect the luminal microbiota composition and the protein expression of the gut microbiota. Moreover, piglets were from 4-6 sows, so genetic differences could cause some variation.

Conclusions
In summary, we observed a 5.4-fold increase in the relative abundance of Ruminococcus gnavus in the cecal microbiota of HM-fed piglets relative to the MF-fed piglets at 21 days of age. This bacterial abundance was also associated with the expression of glycolytic enzymes in the cecal lumen of HM-fed piglets compared to the MF group. Furthermore, the greater number of proteins potentially expressed by Bacteroides vulgatus observed in the cecal contents of HM-fed piglets relative to the MF-fed group at 21 days of age might be associated with the ingestion of bioactive components of human milk (i.e., HMOs metabolized by this gut bacterium) and possibly promotes immune function. Overall, our findings highlight the association between gut microbiota composition under different neonatal diets and the peptides and enzymes originating from this interaction. We believe that further research in the field of metaproteomics might be crucial to understanding the establishment of key gut colonizers and their overall effect on host metabolism and the immune system.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/nu13113718/s1. Figure S1: Venn diagram representing the bacterial pairwise comparisons. Figure S2: Venn diagram representing the protein pairwise comparisons. Dataset S1: All bacteria identified with the current approach. Dataset S2: All proteins identified in the experimental samples. Table S1: Host/bacteria ratios identified in this study. Table S2: Host protein expression identified in the cecal contents of piglets fed HM or MF at postnatal day 21.

Informed Consent Statement: Not applicable.
Data Availability Statement: The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD025432 and 10.6019/PXD025432. Reviewer login details: Username: reviewer_pxd025432@ebi.ac.uk; password: qvFTwXRs.