The Stool Volatile Metabolome of Pre-Term Babies

The fecal metabolome in early life has seldom been studied. We investigated its evolution in pre-term babies during their first weeks of life. Multiple (n = 152) stool samples were studied from 51 babies, all <32 weeks gestation. Volatile organic compounds (VOCs) were analyzed by headspace solid phase microextraction gas chromatography mass spectrometry. Data were interpreted using the Automated Mass Spectral Deconvolution and Identification System (AMDIS) with the National Institute of Standards and Technology (NIST) reference library. Statistical analysis was based on linear mixed modelling. The number of VOCs increased over time; the rise was mainly observed between day 5 and day 10. The shift at day 5 was associated with products of branched-chain fatty acids. Prior to this, the metabolome was dominated by aldehydes and acetic acid. Caesarean delivery showed a modest association with molecules of fungal origin. This study shows how the metabolome changes in early life in pre-term babies. The shift in the metabolome 5 days after delivery coincides with the establishment of enteral feeding and the transition from meconium to feces. Greater diversity of metabolites was associated with being fed greater volumes of milk.


Introduction
The intestinal metabolome is shaped by the interactions between the microbiota and diet. Before birth, mammals ingest amniotic fluid which contains amino acids (notably taurine), some proteins (including growth factors and hormones), phospholipids [1], and, potentially, bacteria [2] and volatile organic compounds, from the mother [3]. Soon after birth, bacteria and other microbes that will eventually form the microbiota begin to colonize the intestine. During the neonatal period, there is a huge switch in the enteral intake from amniotic fluid, to colostrum and then milk, in the majority of babies. Colostrum and milk also contain microbes which may seed to the baby [4,5]. Babies that are born significantly pre-term are cared for in Neonatal Intensive Care Units (NICUs) where they receive expressed colostrum and breast milk, if possible.
It has been proposed that the study of feces from neonates may be useful in the early identification of necrotizing enterocolitis (NEC) [6-8] and late onset sepsis (LOS), for which preterm babies are at risk. There is a paucity of research on the metabolome in early life, and we hypothesize that disease signals may be obscured because the metabolome is changing rapidly.
Here, we have analyzed the metabolome of a new cohort of preterm babies, who did not develop NEC or late onset sepsis, and explore factors that might have an impact on the metabolome. The paper describes the 'normal metabolome of the preterm neonate' as a reference document for others interested in the health of the newborn.

Patient Demographics
Fifty-one healthy infants (not affected by NEC or LOS), all <32 weeks gestation at birth and participating in both the Enteral LactoFerrin In Neonates (ELFIN) and mechanisms affecting the gut of preterm infants in enteral feeding trials (MAGPIE) [9] studies, were included in this sub-study. A total of 152 samples were analysed (distribution of age and samples shown in Table 1). Of the 51 infants, 46 were twins and 7 were singletons; their key neonatal features are summarised below.
In Figure 2, we focus on a selection of compounds, showing that some of these, specifically aldehydes and acetic acid, were present from birth, whereas others (acids, esters, ketones and alcohol) increased after day 5.

Figure 2.
Boxplots for a selection of compounds (abundance/age group). Each boxplot represents a compound, and these are grouped according to the type of molecule (i.e., aldehydes, methyl aldehydes, acids, alcohol, esters, and ketone). All samples were included, n = 152.
Linear mixed-effects (LME) analysis was used to identify compounds that changed over time; the results are shown in Table 4. All of these increased over time (positive slope value). Three other factors were considered in the analysis: batch, gestational age (weeks) and delivery mode. Patient ID was a random effect in the LME analysis. Esters were slightly increased over time in babies with a higher gestational age, whereas an alcohol and a ketone showed a weak increase in babies born earlier in the pregnancy. Interestingly, the alcohol, 1-octen-3-ol (Table 4 and Figure 3), can be related to fungal metabolism [10,11]. This metabolite and 2-pentylfuran, another compound related to fungal metabolism [12], were also slightly increased in babies born by caesarean section (Table 4 and Figure 3). Most of the compounds that showed a significant association with delivery mode were increased in babies born by caesarean section, except for ethyl acetate, which was increased in babies born by vaginal delivery.
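The model structure described above can be written out explicitly. The following is a sketch in our own notation rather than the authors' stated formula: $y_{ij}$ denotes the log-transformed abundance of one compound in sample $j$ from infant $i$, GA is gestational age in weeks, and CS is an indicator equal to 1 for caesarean delivery (matching the sign convention of the Table 4 legend). A random intercept per infant is assumed, since the text states only that patient ID was a random effect.

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1\,\mathrm{age}_{ij} + \beta_2\,\mathrm{batch}_{ij}
        + \beta_3\,\mathrm{GA}_{i} + \beta_4\,\mathrm{CS}_{i} + u_i + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{align}
```

Under this reading, a positive $\beta_1$ corresponds to a compound that increases with postnatal age, a positive $\beta_3$ to higher levels in babies born later, and a positive $\beta_4$ to higher levels after caesarean delivery.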
A positive slope for infant age (days) indicates an increase in the compound over time; a positive value for gestational age indicates that babies born later had more of that compound; a positive value for delivery mode means that babies born by caesarean section had more of that compound, whereas a negative value means that the compound was more prevalent in babies born by vaginal delivery. Values that were not significant are not shown (−). Significance codes: − p not significant, * p < 0.05, ** p < 0.01, *** p < 0.001. All samples were included, n = 152.

Figure 3.
Boxplots for a selection of compounds (abundance/gestational age and delivery mode). Each boxplot represents a compound, and these are grouped according to the variable of interest (gestational age and delivery mode). All samples were included, n = 152.


Discussion
This is the largest study of the fecal metabolome in the neonatal period. Samples from the first few days after birth are characterized by a limited range of VOCs and the predominance of acetic acid and aldehydes. Acetic acid was found in the majority of these samples, but propionic acid and butanoic acid were not. Studies on the fermentation of taurine have shown that acetic acid is the most common short-chain fatty acid (SCFA) derived from this amino acid [13]; it is plausible that the taurine-rich amniotic fluid is responsible for this pattern of SCFA in the meconium.
The presence of aldehydes was striking. There were four medium-chain aldehydes (C6-C9) and two further branched aldehydes. Aldehydes are a consequence of lipid peroxidation [14,15]. Branched-chain aldehydes arise from amino acids (for example, leucine and isoleucine [16,17]) and are metabolites of lactic acid bacteria, which are abundant in the vagina and are likely to seed to the neonate during delivery.
There was a steady increase in the range (median 13 to 24, ANOVA p < 0.00001) of VOCs in faecal samples during the first few weeks of life. The lack of esters was striking. Esters are common in adult faeces and may arise from foods (as flavours in fruit [18]) but may also arise from the condensation of fatty acids and alcohols [19].
The previous study of VOCs in preterm newborns [6] reported on 36 samples obtained from seven babies over 14 days. The same analytical laboratory methods were used, although the present study had more consistent stool weights (80.6 mg, range 32.5-100 mg, SD 12.3 mg) than the earlier one (890 mg, range 300-2400 mg, SD 460 mg). The main difference between these two studies was the temporal sampling employed here: the earlier report did not consider the influence of the age of the babies. As a result, no conclusions could be drawn about the evolution of the metabolome. Costello noted that 7 of the 15 most abundant compounds were aldehydes. Acetone and ethanol were also prevalent. 2-ethylhexanol was also common (97%), but it was considered to be a contaminant arising from plasticware; it was found in 61% of samples in the present study, even though samples were collected into glass vials. The three short-chain fatty acids that are common in the stool of adults (>95%) [19] each had a low prevalence (<10%) in the Costello study.
The paper reports the evolution of the faecal metabolome in the first weeks of life in preterm babies. There is a marked change that occurs in association with the introduction of first milk feeds. The lack of SCFA in the first week of life suggests they are not a requirement for the intestine in utero or early after birth; their appearance when milk is introduced suggests that the faecal microbiota contains bacteria able to ferment carbohydrates and amino acids to synthesize SCFA.
Gestational age and delivery mode were included in our LME model as these factors are known to influence the gut microbiota of infants. A weak increase in fungal metabolites was observed in babies born earlier in the pregnancy and in those delivered by caesarean section. In full-term infants, mode of delivery is known to influence the microbiota, and it has been shown that babies born by caesarean delivery are more susceptible to being colonized by opportunistic pathogens acquired from the hospital environment rather than by commensal bacteria that are transmitted by the mother during vaginal delivery [20]. This effect may increase in babies spending a long time in NICU and may explain the increase in signal of fungal volatiles (1-octen-3-ol and 2-pentylfuran), as yeasts may colonize the gut in an opportunistic fashion and NICUs are a source of yeasts [21]. Similarly, earlier preterm babies showed a weak increase in fungal metabolites. A recent study on interkingdom relationships (bacteria, fungi and archaea) in preterm infants [22] found a defined succession of bacterial genera; however, the evolution of the fungal community was less predictable. The authors found a negative correlation between fungal and bacterial load, and that Candida colonization was inhibited by Staphylococcus, a pioneer in the establishment of gut microbiota in early life [23].

Patients
Patients in this sub-study were part of a large cohort recruited to the MAGPIE study. This study focuses on the children without necrotising enterocolitis or late onset sepsis who gave at least two stool samples during the first 70 days of life. The overarching study was the ELFIN study. Preterm infants at one of 12 participating NHS hospital trusts (13 separate NICUs) were eligible if they met the enrolment criteria for ELFIN, which included <32 weeks gestation and <72 h postnatal age. Potential infants meeting the eligibility criteria for MAGPIE were identified and recruited by the local healthcare team. Parents were approached for written informed consent after they had received a verbal and written explanation of MAGPIE. The study protocol was approved by East Midlands-Nottingham 2 Research Ethics Committee 16/EM/0042.

Extraction of VOCs
Faecal samples were collected in glass vials and stored at −80 °C in Newcastle for up to 12 months before being shipped on dry ice to the Liverpool laboratory, where they were stored again at −20 °C. Prior to analysis, samples were weighed, and aliquots were transferred to 10 mL glass headspace vials with magnetic septum caps (Sigma-Aldrich, Dorset, UK) in a hood: a mean of 80.6 mg stool (SD 12.3 mg) was used for the analysis. During aliquoting, an empty vial remained unsealed in the hood to collect circulating air; this vial was then re-sealed in the hood and stored with the prepared samples. These air samples were analysed alongside the stool samples to determine whether there were contaminants in the air when the samples were aliquoted.
Volatile organic compound analysis was performed using gas chromatography mass spectrometry on a PerkinElmer Clarus 500 GC-MS quadrupole benchtop system (Beaconsfield, UK) and Combi PAL auto-sampler (CTC Analytics, Zwingen, Switzerland). VOCs were extracted using solid phase micro-extraction with a divinylbenzene-carboxen-polydimethylsiloxane (DVB-CAR-PDMS) coated fibre (Sigma-Aldrich, Dorset, UK); otherwise, the protocol and GC-MS conditions were the same as published by Reade et al. (2014) [24]. Samples were heated to 60 °C for 30 min prior to fibre exposure; the fibre was then exposed to the headspace gases at 60 °C for 20 min and thermally desorbed for 5 min at 220 °C.
The GC column used was a 60 m Zebron ZB-624 (inner diameter 0.25 mm, length 60 m, film thickness 1.4 µm; Phenomenex, Macclesfield, UK). The carrier gas used was 99.996% pure helium (BOC, Sheffield, UK), which was passed through a helium purification system, Excelasorb™ (Supelco, Bellefonte, PA, USA), at 1 mL/min. The initial temperature of the GC oven was set at 40 °C and held for 2 min before increasing to 220 °C at a rate of 5 °C/min and held for 4 min, with a total run time of 41 min. The MS was operated in electron impact ionization (EI+) mode, scanning from 10 to 300 m/z with an interscan delay of 0.1 s and a resolution of 1000 at FWHM (Full Width at Half Maximum). Samples were run in two batches: the first batch had 36 samples and the second 116.

Downstream Data Processing and Analysis
The GC-MS data were processed as CDF files using the Automated Mass Spectral Deconvolution and Identification System software (AMDIS, version 2.73, 2017, Gaithersburg, MD, USA), the NIST mass spectral library (version 2.0, 2011, purchased from PerkinElmer, Beaconsfield, UK) and the R package Metab [25]. AMDIS and NIST software were used to build a compound library; VOCs were added based on a match criterion greater than 700, then a probability of a true match greater than 70%, and finally inspection of fragment patterns. This compound library was then applied, with AMDIS, to deconvolute chromatograms and identify metabolites. VOCs are referred to by their common names; the International Union of Pure and Applied Chemistry (IUPAC) [26] names and PubChem CID numbers are provided in Appendix A, Table A2.
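The two numeric criteria for adding a VOC to the compound library can be illustrated with a short filter. This is a minimal sketch in Python, not the workflow used in the study (which relied on AMDIS and the NIST software directly); the `LibraryHit` record, its field names and the example values are hypothetical. The final step, inspection of fragment patterns, is a manual judgement and is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class LibraryHit:
    """One candidate identification for a deconvoluted peak (hypothetical record)."""
    compound: str
    match_score: int    # library match score reported for the hit
    probability: float  # probability of a true match, in percent


def passes_automatic_criteria(hit, min_match=700, min_probability=70.0):
    """Apply the two numeric thresholds; fragment patterns still need manual review."""
    return hit.match_score > min_match and hit.probability > min_probability


# Made-up hits for illustration only; these are not results from the study.
hits = [
    LibraryHit("acetic acid", 912, 88.5),
    LibraryHit("hexanal", 845, 64.0),        # fails the probability threshold
    LibraryHit("2-pentylfuran", 690, 91.2),  # fails the match threshold
]

candidates = [hit.compound for hit in hits if passes_automatic_criteria(hit)]
print(candidates)
```

Both comparisons are strict, following the wording "greater than" in the text, so a hit scoring exactly 700 or exactly 70% would not pass.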
VOC data were analyzed with R (version 3.6.3, Vienna, Austria) [27] in RStudio (version 1.2.5033, Boston, MA, USA) [28,29]. First, the VOC table was adjusted as follows: only compounds observed in at least 25% of samples were kept, natural log transformation was performed using the log() function, and missing values were set to 0. A generalized linear mixed-effects model, fitted with the glmer() function of the lme4 package [30], was used to assess whether the number of VOCs was associated with postnatal age (days). Finally, LME model analysis was performed with the lmer() function of the lme4 package [30]. Patient ID was used as a random factor, while baby age (days), GC-MS run batch, gestational age and delivery mode were the fixed factors. The ggplot2 package [31] was used to produce the charts.
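The three adjustments to the VOC table can be sketched as follows. The study performed these steps in R; this Python version is illustrative only and rests on stated assumptions: non-detected compounds are represented as `None`, the log transform is applied only to detected (positive) abundances, and the function name, table layout and example values are hypothetical.

```python
import math


def preprocess_voc_table(table, min_prevalence=0.25):
    """Filter by prevalence, log-transform detected values, set missing values to 0.

    `table` maps a compound name to a list of abundances, one per sample,
    with None marking a compound that was not detected in that sample.
    """
    processed = {}
    for compound, abundances in table.items():
        n_detected = sum(1 for value in abundances if value is not None)
        # Keep only compounds observed in at least `min_prevalence` of samples.
        if n_detected / len(abundances) < min_prevalence:
            continue
        # Natural log of detected abundances; missing values become 0.
        processed[compound] = [
            math.log(value) if value is not None else 0.0
            for value in abundances
        ]
    return processed


# Made-up abundances for five samples; these are not data from the study.
example = {
    "acetic acid": [1200.0, 950.0, None, 850.0, 430.0],  # detected in 4/5
    "1-octen-3-ol": [None, 300.0, None, 120.0, None],    # detected in 2/5
    "ethyl acetate": [None, None, None, None, 75.0],     # detected in 1/5, dropped
}

result = preprocess_voc_table(example)
print(sorted(result))
```

Note that, with this ordering, a value of 0 on the log scale is equivalent to a raw abundance of 1, which is far below typical detected peak areas.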

Conclusions
This study shows the evolution of the metabolome in early life in pre-term babies. We observed a clear shift in the metabolome 5 days after birth that coincides with the establishment of enteral feeding and the transition from meconium to faeces.

Data Availability Statement: CDF files, VOCs and metadata tables are available upon reasonable request from the corresponding author.

Conflicts of Interest: The authors declare no conflict of interest.
Sample Availability: Human samples are not available from the authors; in accordance with the Human Tissue Act, they have been used and the remainder destroyed.