#### *2.3. Sample Collection and Preparation*

Urine samples were obtained from the infants during the first week of life and thereafter approximately every other week until discharge (Figure 1). We collected 0.5–1.8 mL of urine non-invasively using cotton pads and transferred it to Nunc® CryoTubes® (Nalge Nunc International, Penfield, NY, USA) for storage at −80 °C. Before the metabolic NMR profiling, the urine samples had been thawed once, when they were acidified for electrolyte analysis by mixing 400 µL of sample with 5 µL of 1 M HCl, resulting in a pH of approximately 3.

Metabolite profiling in the present study largely followed a protocol described earlier [26]. Briefly, 150 µL of distilled water and 50 µL of a buffer at pH 7.4 containing D<sub>2</sub>O and trimethylsilylpropionate-d4 (TSP) were added to 350 µL of each sample. The mixtures were then centrifuged at 13,400× *g* for 5 min and transferred to 5 mm NMR tubes (Wilmad LabGlass, Vineland, NJ, USA). One-dimensional, water-suppressed proton NMR spectra were acquired at 300.0 K on a Bruker AVI-600 spectrometer (Bruker Biospin GmbH, Rheinstetten, Germany) equipped with a TCI cryoprobe and a BACS-60 automatic sample changer, under full automation of D<sub>2</sub>O locking, tuning and matching, and gradient shimming using TopSpin 2.1pl6 and iconNMR. For each sample, 32 scans and 4 dummy scans were collected into 64 k data points using the Bruker "noesygppr1d.comp" sequence with a spectral width of 20.6 ppm, an acquisition time of 2.65 s and a 25 Hz water presaturation during the 4 s relaxation delay. An exponential line broadening of 0.3 Hz was applied. The TSP signal achieved a full width at half maximum of less than 1 Hz after apodization and served as the spectral and concentration reference. The spectra were phase-corrected, a smooth baseline was removed, and the spectra were binned with a bin width of 0.01 ppm. Signals were assigned to known metabolites using a reference database [27] and the software Chenomx NMR Suite 7.5 professional (Chenomx Inc., Edmonton, AB, Canada). Two example spectra are shown in Figure 2. Pseudo-concentrations were extracted by integrating manually defined spectral regions corresponding to both known and unknown substances, and were arranged in a table. Pseudo-concentrations are proportional to absolute concentrations and can be used as such in the statistical analysis. Both the spectra and the table of metabolite pseudo-concentrations were subsequently normalized to the total intensity of the respective spectra, and the metabolite table was log-transformed.
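The last processing steps (region integration, normalization to total intensity and log transformation) can be illustrated with a minimal Python sketch. This is not the processing code used in the study: the function names, the toy spectra and the two integration windows are hypothetical, and the natural logarithm is an assumption, since the base of the log transformation is not stated.

```python
import numpy as np

def integrate_regions(ppm, spectra, regions):
    """Sum intensities inside each (low, high) ppm window.

    spectra has shape (n_samples, n_points); the result has shape
    (n_samples, n_regions) and plays the role of the pseudo-concentration table.
    """
    columns = []
    for low, high in regions:
        mask = (ppm >= low) & (ppm < high)
        columns.append(spectra[:, mask].sum(axis=1))
    return np.column_stack(columns)

def normalize_and_log(spectra, table):
    """Divide spectra and table by each spectrum's total intensity,
    then log-transform the table (natural log is an assumption)."""
    total = spectra.sum(axis=1, keepdims=True)
    return spectra / total, np.log(table / total)

# Toy data: two "samples" with identical composition, the second twice as concentrated.
ppm = np.arange(10, dtype=float)
spectra = np.vstack([np.ones(10), 2.0 * np.ones(10)])
regions = [(1.0, 3.0), (5.0, 8.0)]  # hypothetical integration windows

table = integrate_regions(ppm, spectra, regions)
norm_spectra, log_table = normalize_and_log(spectra, table)
print(np.exp(log_table))
```

In this toy case both samples end up with the same normalized values, which is the purpose of total-intensity normalization for urine of varying dilution. The log transformation requires strictly positive integrals, so regions with zero or negative intensity would need separate handling.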

**Figure 1.** (**a**) Available urine samples by infants' age in days, one infant per line, one sample per symbol. Grouped by intervention and control (red and gray lines, respectively), color-coded by week of life. Age in days was imputed for eight samples where only the week was recorded; (**b**) Available urine samples by infants' week of life. Bars divided by nutritional intervention *vs.* control (left half of bar; red and gray, respectively) and further subdivided by small for gestational age (SGA) or appropriate for gestational age (AGA) infants (right half of bar; SGA white, AGA black).

**Figure 2.** Selected regions of two NMR spectra (black for week 1 and red for week 1) of an SGA infant in the intervention group.

#### *2.4. Statistical Analysis*

We used Student's *t*-test, the Mann-Whitney U test or Fisher's exact test to evaluate differences in baseline characteristics, clinical outcomes and nutrient supplies between the two study groups [9,10]. Results are presented as frequencies (%) for categorical data, and as means (ranges or standard deviations) or medians (interquartile ranges) for continuous data [9,10]. For the metabolomics study, PCA was applied to mean-centered and unit-variance scaled spectra to explore the major variations in the dataset [28,29]. By definition, a PCA score plot arranges samples based on the similarity of their spectra, thus enabling the identification of natural groupings of samples and of systematic changes between them. The corresponding loadings reveal which spectral regions, i.e., which metabolites, contribute to the scores. Multivariate PLS regression was used to associate the endpoints in our study with the urine spectra. Again, the spectral variables were mean-centered and scaled to unit variance, and 7-fold cross-validation was applied to evaluate the quality of the resulting statistical models by considering the diagnostic measures R<sup>2</sup> and Q<sup>2</sup> [30], which describe the endpoint variation captured by the regression model and the variation reproduced in cross-validation, respectively. Whereas R<sup>2</sup> and Q<sup>2</sup> measure the strength of a multivariate relationship between profiles and endpoints, their ratio Q<sup>2</sup>/R<sup>2</sup> is a measure of cross-validation reproducibility. In the present study, Q<sup>2</sup>/R<sup>2</sup> ratios above 0.5 were considered indicative of relevant associations, which were then studied further.
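As a hedged illustration of these multivariate steps, the following Python sketch autoscales a data matrix, computes PCA scores and loadings by singular value decomposition, and derives R<sup>2</sup> and Q<sup>2</sup> for a single-endpoint PLS model with 7-fold cross-validation. It is not the code used in the study, which relied on the R packages named in this section. The function names, the simulated data, the choice of two components, the NIPALS-style PLS1 fit and the decision to recompute the scaling inside each cross-validation fold are all assumptions made for illustration.

```python
import numpy as np

def autoscale(X):
    """Mean-center each column and scale it to unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return (X - mean) / std, mean, std

def pca_scores_loadings(X, n_components=2):
    """PCA of autoscaled data via singular value decomposition."""
    Xs, _, _ = autoscale(X)
    U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    return U[:, :n_components] * S[:n_components], Vt[:n_components].T

def pls1_fit(X, y, n_components):
    """PLS1 regression coefficients for scaled X and centered y."""
    Xk, yk = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w                      # scores of this component
        tt = t @ t
        p = Xk.T @ t / tt               # X loadings
        c = (yk @ t) / tt               # y loading
        Xk = Xk - np.outer(t, p)        # deflate X
        yk = yk - c * t                 # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

def r2_q2(X, y, n_components=2, n_folds=7, seed=0):
    """R2 from the fit to all samples, Q2 from k-fold cross-validation."""
    def predict(X_train, y_train, X_new):
        Xs, mean, std = autoscale(X_train)
        y_mean = y_train.mean()
        beta = pls1_fit(Xs, y_train - y_mean, n_components)
        return ((X_new - mean) / std) @ beta + y_mean

    ss_total = np.sum((y - y.mean()) ** 2)
    r2 = 1 - np.sum((y - predict(X, y, X)) ** 2) / ss_total

    order = np.random.default_rng(seed).permutation(len(y))
    press = 0.0
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        y_pred = predict(X[train_idx], y[train_idx], X[test_idx])
        press += np.sum((y[test_idx] - y_pred) ** 2)
    return r2, 1 - press / ss_total

# Simulated data (not study data): 42 samples, 6 variables, one endpoint.
rng = np.random.default_rng(1)
X = rng.normal(size=(42, 6))
y = X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=42)

scores, loadings = pca_scores_loadings(X)
r2, q2 = r2_q2(X, y)
print(f"R2 = {r2:.2f}, Q2 = {q2:.2f}, Q2/R2 = {q2 / r2:.2f}")
```

A Q<sup>2</sup>/R<sup>2</sup> ratio close to 1 means that the model predicts held-out samples almost as well as it fits the training samples, whereas a ratio far below 1 points to overfitting.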

Univariate response approaches were applied to the log-transformed data in the pseudo-concentration table to expand on the results from the multivariate analyses. A linear mixed model for repeated measures (first-order autoregressive covariance structure) was used to study the impact of the two diets on the metabolite pseudo-concentrations over time (weeks 1, 3, 5 and 7), adjusted for gestational age at birth and SGA status. Linear regression was used to quantify the relations of the metabolite pseudo-concentrations at week 1 with SGA status and PMA in weeks, and also the relations between metabolite levels, growth velocity and PMA. Results are reported as fold-change ratios (FC) with respect to back-transformed metabolite levels; ratios below 1 are presented as −1/ratio. Bonferroni correction for multiple testing was applied. The analyses were carried out on a Windows PC using SPSS version 20 (SPSS Inc., Chicago, IL, USA) and R 2.12.1, 64-bit (R Foundation, Vienna, Austria), with packages pls 2.2-0 and pcaMethods 1.32.0.
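The fold-change reporting convention and the Bonferroni correction can be sketched as follows. This is an illustrative Python sketch, not the SPSS or R code used in the study. The function names are hypothetical, and the natural logarithm is an assumption, since the base of the log transformation is not stated.

```python
import math

def signed_fold_change(log_difference):
    """Back-transform a difference of natural-log levels into a fold-change.

    Ratios below 1 are reported as -1/ratio, so a halving becomes -2.
    """
    ratio = math.exp(log_difference)
    return ratio if ratio >= 1 else -1.0 / ratio

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]

doubling = signed_fold_change(math.log(2.0))
halving = signed_fold_change(math.log(0.5))
adjusted = bonferroni([0.01, 0.04, 0.5])
print(doubling, halving, adjusted)
```

With this convention a doubling and a halving have the same magnitude and differ only in sign, which makes increases and decreases directly comparable in a results table.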
