Metabolomics Discovers Early-Response Metabolic Biomarkers that Can Predict Chronic Reproductive Fitness in Individual Daphnia magna

Chemical risk assessment remains entrenched in chronic toxicity tests that set safety thresholds based on animal pathology or fitness. Chronic tests are resource expensive and lack mechanistic insight. Discovering a chemical’s mode-of-action can in principle provide predictive molecular biomarkers for a toxicity endpoint. Furthermore, since molecular perturbations precede pathology, early-response molecular biomarkers may enable shorter, more resource efficient testing that can predict chronic animal fitness. This study applied untargeted metabolomics to attempt to discover early-response metabolic biomarkers that can predict reproductive fitness of Daphnia magna, an internationally-recognized test species. First, we measured the reproductive toxicities of cadmium, 2,4-dinitrophenol and propranolol to individual Daphnia in 21-day OECD toxicity tests, then measured the metabolic profiles of these animals using mass spectrometry. Multivariate regression successfully discovered putative metabolic biomarkers that strongly predict reproductive impairment by each chemical, and for all chemicals combined. The non-chemical-specific metabolic biomarkers were then applied to metabolite data from Daphnia 24-h acute toxicity tests and correctly predicted that significant decreases in reproductive fitness would occur if these animals were exposed to cadmium, 2,4-dinitrophenol or propranolol for 21 days. While the applicability of these findings is limited to three chemicals, they provide proof-of-principle that early-response metabolic biomarkers of chronic animal fitness can be discovered for regulatory toxicity testing.


Introduction
Chemical safety regulations are designed to protect human and environmental health. Yet around the globe, they fail to guarantee safe products and safety from exposure to pollutants. The disease burden associated with environmental chemical exposures likely exceeds 10% of the global gross domestic product [1]. Improvements in prognostic chemical safety-that is, early testing before adverse impacts on humans and the environment occur [2][3][4]-remains a critical need of society. Yet prognostic chemical risk assessment remains entrenched in experimental procedures that fail to benefit from the genomics revolution, i.e., fail to understand the underpinning molecular mechanisms that can provide molecular diagnostic tools [5,6]. In short, there remains almost total reliance on animal testing to measure 'apical endpoints', i.e., safety thresholds that are based on the concentrations at which a chemical induces pathological observations in the whole organism [7].
For ecological risk assessment, chemicals are routinely tested to measure adverse impacts on acute lethality (e.g., OECD Test No. 202: Daphnia sp. Acute Immobilization [8]) and, of greater relevance to low dose environmental exposures, the impacts of chemicals on reproduction over chronic timescales. For example, OECD Test No. 211 (Daphnia magna Reproduction Test [9]) evaluates the reproductive fitness of a keystone freshwater invertebrate over 21 days, equating to approximately 5 cycles of off-spring production through one-third of this species' life span [10]. The ecological importance of such a chronic (or long term) test is undeniably high, with reproductive fitness essential for species continuation and used as a measure in population ecology and the assessment of ecosystem health [11,12]. Yet measuring chronic effects such as reproductive toxicity by practically observing impacts on species over long timescales (i.e., 21 days for Daphnia; typically, 30 days for fish toxicity, OECD Test No. 210; 90 days for rodent toxicity, OECD Test No. 408 [7,9,13]) is highly resource intensive and hence expensive [14].
Furthermore, crudely counting the number of living offspring produced from Daphnia in OECD Test 211 conveys no knowledge of the underlying mode of action (MoA) of a chemical. Yet knowledge of MoA can have major benefits for chemical safety testing; for example, it can be used to group chemicals into similarly acting substances and more extensive animal testing can then be focused on a few representative members of that chemical group, reducing the number of animal tests (termed biological 'read-across' [15]. Knowledge of MoA can also serve as a source of (molecular) biomarkers for that toxicity pathway [16]. Introduced in 2010, Adverse Outcome Pathways (AOP) have provided a framework to enable a more mechanistic and causal understanding of how chemicals perturb normal biological functions, from a molecular initiating event (MIE) through to an adverse outcome (AO; i.e., apical endpoint) that affects individual organisms and, subsequently, populations [17][18][19]. One of the major challenges for AOPs is how to discover the molecular key events (KEs) that are causally predictive of these higher levels of biological organization. Indeed, this is a long-standing challenge of molecular biomarker research, to predict whole organism fitness from molecular measurements [20].
Having introduced the relevance of reproductive fitness as a measure in ecological risk assessment, yet the inadequacies of current chronic testing regimes in terms of cost and lack of mechanistic insights, we need to ask what could be a viable solution for modern chemical testing strategies. One solution would be the application of early molecular predictors of chronic reproductive toxicity, where those molecular predictors have a mechanistic basis, i.e., are biologically plausible. Practically, this would greatly reduce the need for chronic, long term animal testing, thereby reducing costs. In AOP terminology, this proposed solution is to use molecular KEs to predict individual effects such as reproductive fitness. Given this, the grand challenge is how to discover a small panel of molecular KEs for regulatory testing purposes from the tens of thousands of expressed genes, proteins and metabolites within cells and to ensure that panel of molecular KEs is predictive of the adverse outcome. Untargeted 'omics technologies may provide one solution, which in combination with multivariate statistics are designed specifically to discover changes within the molecular landscape without bias [21]. Indeed, 'omics approaches are increasingly being applied to a range of applications in toxicology [22][23][24], including for read-across [25] and deriving points of departure [26,27]. For the specific challenge of discovering molecular KEs that are predictive of a whole organism response-the focus of the current work-the experimental design is, however, just as critical as the measurement technology.
Practically, two experimental designs can be envisioned to discover early-response biomarkers: first, to seek to discover putative molecular KEs occurring at the same chronic time point that the adverse outcome occurs (i.e., decreased reproductive fitness), and then to evaluate whether those putative KEs can also predict decreased fitness when measured early in the exposure period. This approach benefits from a likely greater association of the molecular changes to the apical endpoint, but the successful application of the putative KEs to early exposure times rests on the assumption that these early molecular changes persist through the exposure period. The second practical strategy would be to discover the molecular changes at an early time following chemical exposure, where that exposure concentration is known to decrease reproductive fitness on a longer time scale. This approach benefits from knowing that the molecular changes do indeed occur on an early timescale, but assumes that these early molecular KEs are causally associated with the longer term apical endpoint. In the current study we have investigated the first of these two strategies. Furthermore, we have focused specifically on utilizing metabolomics to discover early metabolic KEs. This is because changes in metabolic biochemistry, as measured using untargeted metabolomics, are the most downstream of gene expression and protein translation, and represent the most functional measure of an organism's physiology and response to toxic stress [22][23][24]28,29].
In previous work, we and others have sought to discover associations across levels of biological organization, specifically linking metabolism with whole organism apical endpoints. For example, using prior knowledge of the MoA of eight model toxicants, energy-related biomarkers (cellular energy allocation, lipid content, anaerobic metabolic activity) and parameters related to oxidative stress (measured after 48-96 h) were shown to predict population level responses in D. magna (after 21 days), although no attempt was made to discover the most predictive molecular KEs using an 'omics approach [30]. Hines et al. [31] reported the first successful application of metabolomics to discover novel metabolic biomarkers that could predict the energetic fitness of marine mussels following a 7-day exposure to two model toxicants. More recently, statistical models were derived from chronic metabolic profiles that could predict reproductive impairment in adult D. pulex-pulicaria following exposure to both single and binary mixtures of copper and nickel, and these metabolic profiles were shown to be predictive of chronic impairment when applied to animals that had experienced only an acute exposure [32]. The metabolic profiles were not identified and the authors acknowledged that further work is required to validate this exciting finding, including evaluating this strategy to discover early-response biomarkers using more diverse chemicals and MoAs.
The overarching aim of the current study was to discover early-response metabolic biomarkers that could predict the chronic reproductive fitness of individual D. magna, using mass spectrometry metabolomics. To build on our earlier studies, we investigated three further model toxicants that have different MoAs: cadmium (Cd), primarily causing oxidative stress; 2,4-dinitrophenol (DNP), an uncoupler of oxidative phosphorylation; and propranolol, a non-selective beta adrenoceptor blocking drug. Daphnia was selected as the study species due to its importance in regulatory toxicity testing; indeed, this species is the most commonly used test organism in aquatic toxicology worldwide [33], enhancing the potential translation and impact of this research. Daphnia have increasingly been studied using metabolomics [24,[34][35][36][37], including for elucidating chemical MoAs and building predictive models for chronic effects, both with the potential to contribute to constructing AOPs [29,32]. The current study was comprised of four specific objectives, built around a workflow summarized in Figure 1 and comprising of an OECD chronic reproductive study, an acute exposure study, untargeted metabolomics, targeted metabolite analysis and multivariate modelling. First, to ensure that all biomarker discovery was rigorously anchored to a regulatory-relevant adverse outcome, range-finding studies were conducted to determine appropriate exposure concentrations for each chemical. Second, we sought to discover putative metabolic biomarkers of reproductive fitness in D. magna by applying untargeted metabolomics on the final day of OECD Test No. 211 (Daphnia chronic reproduction) conducted for Cd, DNP and propranolol. In addition to building predictive models of reproductive fitness for each chemical individually, the third objective was to integrate the discoveries from these exposure studies of Cd, DNP and propranolol to reduce the list of putative metabolic KEs to non-chemical-specific predictors of reproductive fitness. Finally, we sought to evaluate these biomarkers, which were discovered in chronic exposure studies, for their ability to predict Daphnia reproductive fitness (i.e., after 21 days) from an independent metabolic dataset measured using 24-h exposures to Cd, DNP and propranolol. That is, to determine whether we had discovered early-response metabolic biomarkers that could predict the chronic reproductive fitness of individual D. magna.

Chronic Chemical Exposures Reduced Daphnia Reproductive Fitness
The first objective of the study was to ensure that all (subsequent) biomarker discovery would be rigorously anchored to a regulatory-relevant adverse outcome, specifically, a decrease in reproductive fitness ( Figure 1). Initial dose range-finding studies were performed using 24-h exposures to derive the lethal exposure concentrations (neonatal LC50) of Cd, DNP and propranolol (see Supplementary Materials) to guide the dose selection for the 21-day D. magna reproduction tests (OECD Test No. 211 [9]). Using these data, chronic 21-day exposures were conducted using three concentrations of each chemical, a reduced-food group (50% feed level), and untreated controls: the chemical exposure concentrations corresponded to 0.05-10% of the neonatal LC50 (for each chemical), see Table S1, and all chemicals included an exposure concentration of 1% of LC50 for comparative purposes. Importantly, each biological replicate comprised of a single female D. magna in an individual exposure vessel, which had implications for the metabolomics assays described below. This design was necessary to enable the reproductive fitness of individual animals to be measured in terms of the total number of offspring they each produced during the 21-day exposure period (following the OECD test guideline). Figure 2 depicts the reproductive output for all surviving animals (each female Daphnia corresponds to one vertical bar) for each of the three exposure chemicals, the reduced-food, and untreated controls. Visual interpretation of the plots shows a distinct reduction in reproductive output with increasing concentrations of Cd ( Figure 2A) and propranolol ( Figure 2C) exposures, yet a relatively smaller effect was observed following exposure to DNP ( Figure 2B). Univariate statistical analyses determined significant effects on D. magna reproductive output for all three chemicals: Cd (ANOVA, p = 1.73 × 10 −20 ), DNP (p = 1.49 × 10 −6 ) and propranolol (p = 1.52 × 10 −15 ) with Tukey-Kramer post-hoc tests revealing significant differences in reproductive fitness as follows: Cd-all treatment groups and untreated control group were significantly different except between the reduced-food and low dose group; DNP-control, low and medium dose groups were significantly different to the reduced-food and high dose groups; propranolol-untreated control group was significantly different from all other treatment groups, with the reduced-food and low dose groups also different from the medium and high dose groups. The observations that some of the chemical exposure groups reduced reproductive fitness to a greater extent than that caused by reducing the food availability suggests a direct impact of the chemicals,

Chronic Chemical Exposures Reduced Daphnia Reproductive Fitness
The first objective of the study was to ensure that all (subsequent) biomarker discovery would be rigorously anchored to a regulatory-relevant adverse outcome, specifically, a decrease in reproductive fitness ( Figure 1). Initial dose range-finding studies were performed using 24-h exposures to derive the lethal exposure concentrations (neonatal LC 50 ) of Cd, DNP and propranolol (see Supplementary Materials) to guide the dose selection for the 21-day D. magna reproduction tests (OECD Test No. 211 [9]). Using these data, chronic 21-day exposures were conducted using three concentrations of each chemical, a reduced-food group (50% feed level), and untreated controls: the chemical exposure concentrations corresponded to 0.05-10% of the neonatal LC 50 (for each chemical), see Table S1, and all chemicals included an exposure concentration of 1% of LC 50 for comparative purposes. Importantly, each biological replicate comprised of a single female D. magna in an individual exposure vessel, which had implications for the metabolomics assays described below. This design was necessary to enable the reproductive fitness of individual animals to be measured in terms of the total number of offspring they each produced during the 21-day exposure period (following the OECD test guideline). Figure 2 depicts the reproductive output for all surviving animals (each female Daphnia corresponds to one vertical bar) for each of the three exposure chemicals, the reduced-food, and untreated controls. Visual interpretation of the plots shows a distinct reduction in reproductive output with increasing concentrations of Cd ( Figure 2A) and propranolol ( Figure 2C) exposures, yet a relatively smaller effect was observed following exposure to DNP ( Figure 2B). Univariate statistical analyses determined significant effects on D. magna reproductive output for all three chemicals: Cd (ANOVA, p = 1.73 × 10 −20 ), DNP (p = 1.49 × 10 −6 ) and propranolol (p = 1.52 × 10 −15 ) with Tukey-Kramer post-hoc tests revealing significant differences in reproductive fitness as follows: Cd-all treatment groups and untreated control group were significantly different except between the reduced-food and low dose group; DNP-control, low and medium dose groups were significantly different to the reduced-food and high dose groups; propranolol-untreated control group was significantly different from all other treatment groups, with the reduced-food and low dose groups also different from the medium and high dose groups. The observations that some of the chemical exposure groups reduced reproductive fitness to a greater extent than that caused by reducing the food availability suggests a direct impact of the chemicals, i.e., the observed decreases in reproductive fitness following chemical exposure were not simply caused by reduced food intake. The metabolomics data, below, offers further insights into the reduced-food effects. Overall, the observations of significantly decreased reproductive fitness for all three chemicals provided the 'adverse outcome' to which the molecular biomarker discovery could be anchored. i.e., the observed decreases in reproductive fitness following chemical exposure were not simply caused by reduced food intake. The metabolomics data, below, offers further insights into the reduced-food effects. Overall, the observations of significantly decreased reproductive fitness for all three chemicals provided the 'adverse outcome' to which the molecular biomarker discovery could be anchored.

Metabolomics Discovers Biomarker Signatures that Predict Daphnia Reproductive Fitness Following Chronic Exposures to Individual Chemicals
We next sought to discover putative metabolic biomarkers of reproductive fitness in D. magna by applying untargeted metabolomics to the same individual animals used in the reproduction tests (from above; Figure 1). It was therefore essential that our metabolomics assays were sufficiently sensitive to measure samples of ca. 1.5 mg biomass, so that associations could be discovered between metabolic signatures and reproductive fitness in the same individuals. We selected nanoelectrospray direct infusion mass spectrometry (DIMS) as it provided the necessary analytical sensitivity [38], yielding mass spectra comprising of >3500 peaks (after removal of noise and peaks from the extract blank).
Univariate and unsupervised multivariate analyses of the metabolomics data were applied to provide an initial assessment of any metabolic perturbations induced by each of the three chemicals. FDR-corrected univariate analyses reveal that significant metabolic perturbations were caused by chemical exposure (Table S2). Visual interpretation of principal components analysis (PCA) scores plot for each chemical shows a clear separation between the treatment groups following exposure to Cd and propranolol ( Figure 2D,F, respectively), consistent with the univariate analyses (Table S2). Exposure to DNP results in less well separated groups using PCA ( Figure 2E), consistent with the univariate analyses (Table S2) and the modest reduction of reproductive fitness ( Figure 2B). For all three chemicals, the reduced-food group was significantly separated from the chemical-exposed groups confirming that the chemicals were not inducing a starvation phenotype in the Daphnia (see Supplementary Materials; Table S3). These molecular insights allowed us to conclude that the observed decreases in reproductive fitness following chemical exposure (Figure 2A-C) were not simply caused by reduced food intake. The reduced-food group could therefore be removed from the subsequent biomarker discovery analyses.
Next, we employed supervised multivariate analyses to attempt to discover metabolic signatures that could predict the reproductive fitness of D. magna (Figure 1). Partial least squares regression (PLS-R) was applied to each of the three metabolomics datasets (Cd, DNP and propranolol), and the models built were internally cross-validated (i.e., evaluated as to how well they predict reproductive output for samples left out of the original model building). Strong associations between the measured and predicted reproductive outputs were obtained: cross-validated R 2 value of 0.822 for Cd (4056 peaks in PLS-R model); R 2 = 0.768 for DNP (4112 peaks); R 2 = 0.735 for propranolol (3647 peaks). Permutation testing was applied to assess the statistical significance of the capability of each PLS-R model to predict the reproductive output; for all chemicals, p < 0.001, i.e., none of the 1000 permuted PLS models generated R 2 values that were higher than those from the models built with the real metabolomics data. While these initial PLS-R models were robust, they generated long lists of putative metabolic biomarkers. We sought a reduced list of the most predictive metabolites to translate towards practical regulatory applications. Therefore, next we employed a 'forward selection' approach in our PLS-R model building to derive the most reduced list of putative metabolic biomarkers that built an optimally predictive model for each chemical, i.e., the model with the highest R 2 value. The optimal individual-chemical PLS-R models are shown in Figure 3, with cross-validated R 2 = 0.935 for Cd (561 peaks in model); R 2 = 0.945 for DNP (306 peaks); R 2 = 0.893 for propranolol (606 peaks); and p < 0.001 for all models (permutation testing). These results provide very strong evidence for the existence of metabolic predictors of reproductive fitness, i.e., metabolic predictors of the number of offspring from individual female Daphnia produced in a 21-day OECD Test No. 211, for each of Cd, DNP and propranolol.

Metabolomics Discovers a 49-Biomarker Signature that Predicts Reproductive Fitness Following Chronic Exposure to All Three Chemicals
The analyses above discovered three highly predictive models of Daphnia reproductive fitness. Yet the domains of applicability of each of these three models should be viewed as limited to each of the three exposure chemicals, Cd, DNP and propranolol; i.e., we cannot assume that the model built with Cd is valid for all chemicals inducing oxidative stress, etc. However, the domain of applicability of this study can be increased significantly by determining if there are any putative metabolic biomarkers that are common to all three chemicals. If these exist, they can be viewed as a non-chemical-specific (or non-MoA-specific) biomarker signature of reproductive fitness in Daphnia. Therefore, the metabolic features from the three optimal individual-chemical PLS-R models were compared (Figures 1 and 4A), revealing that 194 metabolic features were common to at least two of the three chemicals (using an m/z tolerance of 1.5 ppm to combine the lists). Next, to attempt to reduce this putative metabolic biomarker list still further, one further PLS-R model was built using forward selection applied to just these 194 metabolic features. The optimal, non-chemical-specific PLS-R model is shown in Figure 4B, with a cross-validated R 2 of 0.915 and only 49 peaks in the model. Considering that >4000 peaks were detected using the DIMS metabolomics approach, the strategies applied here have focused on the ca. 1% most important metabolic features only. Table S4 lists the 49 peaks in ranked order of importance in the PLS-R model as well as their m/z values, average intensities and the fold-change in intensities between the high dose and control groups (for each of the three chemicals).
Biological plausibility is one of the criteria for robust molecular KEs within AOPs, which means that molecular biomarkers should be identified. As is widely recognized in metabolomics, metabolite identification can be extremely challenging. For example, to confidently identify a metabolite to the Metabolomics Standards Initiative (MSI) level 1, the experimental measurements of the metabolite in the biological sample need to be validated against the same type of measurements using an authentic chemical standard [39]. Yet for most metabolites, such standards are not commercially available.
De novo synthesis of a single or a few metabolites is an option for specialist laboratories [40], but this does not scale even to a few tens of metabolites. In the current study, we putatively annotated as many as possible of the 49 metabolic features to at least MSI level 2, as described in Materials and Methods. Table S5 summarizes these annotations including the measured m/z values, selected correlations (only those r > 0.9) between the intensities of pairs of peaks, the (putative) empirical formula(e), ion form (i.e., in which adduct form and isotope the metabolite was measured), m/z error compared to theoretical mass (ppm), the observation of related adduct forms and naturally-occurring isotopes to support the peak annotations, and where known the putative metabolite name. Metabolite identification to MSI level 1 was conducted by comparing MS/MS fragmentation spectra to those measured using a commercial authentic metabolite standard. Table 1 summarizes the annotations and identifications for the sub-group of metabolic features that could be assigned a unique empirical formula and, where possible, (putative) metabolite name, which accounted for 11 of the 49 peaks in the biomarker signature. from each of the optimal PLS-R models following chronic exposure of Daphnia to Cd, DNP or propranolol; (B) Correlation between measured and predicted reproductive output for individual D. magna, the latter derived from the non-chemical-specific forward selected PLS-R model, following chronic exposure to Cd, DNP and propranolol, including line of best fit. The cross-validated R 2 value is 0.915 (49 peaks, 3LVs); p = < 0.001. Classes comprise of Cd exposed animals: control (), low (), medium () and high dose (); DNP exposed: control (), low (), medium () and high dose (); propranolol exposed: control (), low (), medium () and high dose (). (B) Correlation between measured and predicted reproductive output for individual D. magna, the latter derived from the non-chemical-specific forward selected PLS-R model, following chronic exposure to Cd, DNP and propranolol, including line of best fit. The cross-validated R 2 value is 0.915 (49 peaks, 3LVs); p = < 0.001. Classes comprise of Cd exposed animals: control ( ), low ( ), medium ( ) and high dose ( ); DNP exposed: control ( ), low ( ), medium ( ) and high dose ( ); propranolol exposed: control ( ), low ( ), medium ( ) and high dose ( ).

Chronic Reproductive Fitness Can Be Predicted from Early-Response Metabolic Measurements
Considering this study within the context of the international 21-day Daphnia reproductive toxicity test, the capability of the 49-biomarker signature to predict the number of offspring using metabolite data measured on day 21 is at best interesting, but of no practical value to chemical regulators. As introduced above, however, this biomarker signature would have considerably greater value if it could predict chronic reproductive toxicity based upon metabolic measurements derived from a (much lower cost) short term study. If successful, this would greatly reduce the need for chronic, long term animal testing, thereby reducing costs. We therefore applied the newly discovered 49-biomarker signature to our previously published metabolomics datasets for D. magna, corresponding to early-response (24-h) exposure studies of Cd, DNP and propranolol [29] (Figure 1). These 24-h exposures were conducted at a single dose of 10% LC 50 , which we know would cause a statistically significant decrease in reproductive output if the studies had been run for 21 days ( Figure 2B-C for DNP and propranolol). Note that for the case of Cd in Figure 2A, reproductive output was measured for exposure concentrations up to 1% LC 50 (Table S1), which caused a significant decrease; here we assume that an even higher exposure concentration of 10% LC 50 for Cd would also result in a significant decrease in reproductive output. In fact, this is critical knowledge for Cd, DNP and propranolol, which we rely upon for determining whether the predictions of chronic adverse outcome from the early-response metabolic measurements are correct or not. Of the 49 metabolic features in the biomarker signature discovered in this study, 39 of these were observed in the early-response Cd data, 36 in the early-response DNP and 35 in the early-response propranolol datasets. The intensities of any missing peaks in these datasets were set to zero. Next, each of the three early-response toxicity datasets were submitted to the optimal, non-chemical-specific PLS-R model, which predicted the number of Daphnia offspring produced following hypothetical 21-day exposures. Figure 5 summarizes the predicted reproductive output (with control animals normalized to 100%) for Cd, DNP and propranolol. Univariate statistical analyses of these predictions (see section 4.5) confirmed that they correctly predict a decrease in reproductive fitness: Cd (t-test, p = 6.93 × 10 −6 ), DNP (p = 0.0154), propranolol (p = 8.12 × 10 −3 ).

Chronic Reproductive Fitness Can Be Predicted from Early-Response Metabolic Measurements
Considering this study within the context of the international 21-day Daphnia reproductive toxicity test, the capability of the 49-biomarker signature to predict the number of offspring using metabolite data measured on day 21 is at best interesting, but of no practical value to chemical regulators. As introduced above, however, this biomarker signature would have considerably greater value if it could predict chronic reproductive toxicity based upon metabolic measurements derived from a (much lower cost) short term study. If successful, this would greatly reduce the need for chronic, long term animal testing, thereby reducing costs. We therefore applied the newly discovered 49-biomarker signature to our previously published metabolomics datasets for D. magna, corresponding to early-response (24-h) exposure studies of Cd, DNP and propranolol [29] (Figure 1). These 24-h exposures were conducted at a single dose of 10% LC50, which we know would cause a statistically significant decrease in reproductive output if the studies had been run for 21 days ( Figure  2B-C for DNP and propranolol). Note that for the case of Cd in Figure 2A, reproductive output was measured for exposure concentrations up to 1% LC50 (Table S1), which caused a significant decrease; here we assume that an even higher exposure concentration of 10% LC50 for Cd would also result in a significant decrease in reproductive output. In fact, this is critical knowledge for Cd, DNP and propranolol, which we rely upon for determining whether the predictions of chronic adverse outcome from the early-response metabolic measurements are correct or not. Of the 49 metabolic features in the biomarker signature discovered in this study, 39 of these were observed in the earlyresponse Cd data, 36 in the early-response DNP and 35 in the early-response propranolol datasets. The intensities of any missing peaks in these datasets were set to zero. Next, each of the three earlyresponse toxicity datasets were submitted to the optimal, non-chemical-specific PLS-R model, which predicted the number of Daphnia offspring produced following hypothetical 21-day exposures. Figure 5 summarizes the predicted reproductive output (with control animals normalized to 100%) for Cd, DNP and propranolol. Univariate statistical analyses of these predictions (see section 4.5) confirmed that they correctly predict a decrease in reproductive fitness: Cd (t-test, p = 6.93 × 10 −6 ), DNP (p = 0.0154), propranolol (p = 8.12 × 10 −3 ). Figure 5. Predictions of the reproductive outputs for individual D. magna following hypothetical 21-day exposures to Cd, DNP and propranolol, relative to untreated controls, derived by submitting three previously published acute toxicity metabolomics datasets [29] to the non-chemical-specific forward selected PLS-R model from this study. Reproductive outputs are normalized for each chemical, with the control group set at 100%. Error bars represent ±SEM. Figure 5. Predictions of the reproductive outputs for individual D. magna following hypothetical 21-day exposures to Cd, DNP and propranolol, relative to untreated controls, derived by submitting three previously published acute toxicity metabolomics datasets [29] to the non-chemical-specific forward selected PLS-R model from this study. Reproductive outputs are normalized for each chemical, with the control group set at 100%. Error bars represent ±SEM.

Discussion
Given the importance of Daphnia as a test organism in regulatory toxicology [41], and as a model organism in freshwater ecology [42] and more recently human health research through its adoption as a US National Institutes of Health model organism [43], the relative lack of knowledge of the identities of the polar metabolites and lipids in this species is arguably surprising. Researchers from different fields may use different terms: toxicologists may ask what is known of Daphnia's metabolic biochemistry; ecologists may enquire about its chemical ecology; while "omics" investigators may ask what is known of Daphnia's metabolome? Yet these questions all refer to the same natural compounds, i.e., the primary and secondary metabolites that are ingested, synthesized and/or excreted by Daphnia.
To date, only a few hundred metabolites in Daphnia have been reported in the literature, for example see references [37,44]. The challenges of annotating and/or identifying the peaks measured in a metabolomics study are clear from the current work. While an array of analytical measurements (accurate m/z with <1 ppm accuracy [45], MS/MS) and data analysis approaches (interpretation of adduct patterns, isotope patterns, correlation of peak intensities [46], etc.,) have been applied, only 11 of the 49 peaks in the biomarker signature of reproductive fitness have been identified. Clearly more sophisticated approaches are required to identify metabolomes. One strategy that is currently being developed is 'Deep Metabolome Annotation', an integrated analytical, computational and informatics workflow that seeks to extract, characterize and store knowledge of both polar and lipophilic metabolites from a cell, tissue or whole organism. This approach combines solid phase extraction, non-targeted LC-MS(/MS), GC-MS, NMR and nanoelectrospray ionization DIMS n with an array of data analysis algorithms and informatics resources and is currently being used to extend the annotations and identification of the D. magna metabolome [47].
We argue that applying such large-scale annotation and identification is critical for regulatory toxicology, as meaningful biological inferences may only be drawn from metabolomics datasets where peaks can be definitively named. Practically, to translate discoveries from a metabolomics study in toxicology into a practical metabolic KE that could be deployed in a targeted high throughput toxicity screening assay is going to require identification of that or those metabolites. One further strategy to address this challenge was recognized in 2015 by the formation of a scientific task group of the international Metabolomics Society to progress the characterization of metabolomes, specifically focusing on a small number of model organisms [48]. The value of model organisms across biology and medicine is very well established [49]. Motivated by the challenges we encountered in the current discovery study, we strongly advocate for the deep metabolome annotation and identification of the most widely used test species in (regulatory) toxicology including in vitro models.
The annotation and identification of the 49-biomarker metabolic signature, albeit limited in completeness, has in fact produced some intriguing findings. Ascorbate (ascorbic acid or more commonly, vitamin C) is one of the major water-soluble antioxidants present in animals and plants [50,51]. Ascorbate decreased in concentration in response to all chemical treatments (average fold change from control to high dose: 0.498 (Cd), 0.636 (DNP) and 0.571 (propranolol); Tables S4 and S5), most likely indicating the Daphnia experienced oxidative stress. Glutathione (GSH), the second common water-soluble antioxidant, is not a component of the final 49-biomarker signature but was also detected in our metabolomics study. Table S6 shows the effects of chemical exposure on Daphnia GSH levels and confirms that oxidative stress was occurring following Cd exposure, as would be expected based on its primary MoA (average fold change from control to high dose Cd of 0.652). GSH levels were increased following DNP exposure (1.55-fold) and unchanged by propranolol (0.969-fold; Table S6). The differing response of GSH across the three chemicals is the most likely cause for its exclusion from the final biomarker signature by the PLS-R feature selection algorithm.
While this perturbation of antioxidants was expected [52], given the inclusion of a heavy metal in the study, the observation of four sulfonated lipids in the list of putatively annotated metabolites was not (Table 1). Although the structures of the C c H h SO 4 , C c H h SO 5 and C c H h SO 6 metabolites could not be determined unambiguously, the MS/MS fragmentation studies revealed that these molecules contain lipophilic hydrocarbon backbones and a sulfate group. The functions of such compounds can be hypothesized from the literature. Amphipathic structures (i.e., polar and non-polar regions within the same molecule) are consistent with these molecules acting as anionic surfactants, and similar endogenous surfactants have been reported previously within the guts of aquatic invertebrates to induce micelle formation and enhance the solubilization of food [53][54][55]. The digestive juice of the land snail H. pomatia L. has also been reported to contain alkyl sulfates [56]. This suggests that part of the 49-biomarker signature of reproductive fitness could be linked to early changes in gut digestive physiology in Daphnia. While speculative, there is a logical basis for associations between digestion and reproduction, namely the allocation of energy resources. Chemical exposure can affect the metabolic processes involved in energy production, storage and metabolism, processes which are fundamental to organism growth and reproduction. It has been shown in Daphnia that a toxic assault can lead to changes in energy allocation that is highly correlated with impaired reproduction [30,57]. Therefore, further investigations of this potentially important class of sulfonated lipids and their ability to predict reproductive dysfunction in Daphnia is warranted.
For the strategy of metabolomics biomarker discovery presented in the current study to be translated successfully to regulatory toxicology, not only do we need to improve our ability to identify metabolites, but we also require biological validation of any putative biomarker signatures that are discovered. Obviously, the metabolomics measurements and statistical analyses need to be conducted rigorously, including for example the use of cross-validation and permutation testing (both used here) when building predictive multivariate models. In addition, more rigorous biological validation would require that this whole study is repeated independently, and the new biomarker signature checked for consistency with the one reported here. This level of biological validation was not conducted in the current study. Following that additional study, for putative biomarkers to advance towards becoming established metabolic KEs they should be validated using independent toxicity studies. The strategy that we would recommend is to develop targeted assays to measure just the metabolic biomarkers of interest, using LC-MS/MS. If the physico-chemical properties of the biomarkers within the signature are quite different, this might require two or more LC separation methods. Targeted LC-MS/MS measurements of the selected biomarker signature would first require analytical validation of the method (e.g., linearity, accuracy, precision, limits of detection and quantification, etc.) as well as biological validation by measuring the biomarkers in multiple new (i.e., independent) samples. What is noteworthy from the current study is the number of metabolic features in the biomarker signature. The most predictive biomarker signature contained 49 metabolic features, i.e., it was not a signature with only one or a few metabolites (arguably a failing of earlier biomarker studies to rely on single or few endpoints), nor did the discovered biomarker signature contain many hundreds of metabolites. From an analytical perspective this is highly encouraging as targeted LC-MS/MS measurements of a few 10 s of metabolites is achievable with existing mass spectrometry platforms, meaning that translation of putative biomarker signatures is not dependent on future technology solutions.
In the current study we sought to extend the domain of applicability of the biomarker signature beyond individual chemicals by integrating the discoveries from three separate chemicals and then building a multivariate predictive model to determine the subset of metabolic features that best predicted Daphnia reproductive output. Again, considering strategies for how the current study could be translated to regulatory toxicology, it would be critical to evaluate a much broader toxicological space, i.e., to discover putative metabolic KEs for the majority of "toxicity pathways", and related to this to explore a much broader chemical space. To power such a study correctly, and to cover several toxicity pathways, would require at least a 10-fold increase in the scale of the experimentation, arguably several hundred chemicals would be required to cover the MoA space. From a practical perspective, this is technologically feasible, although a very high attention to quality assurance and control as well as method performance criteria would be critical to extract knowledge from a protracted metabolomics investigation. A tripartite group of scientists, spanning academic, government and industry laboratories, are now defining best practice, performance standards and reporting standards for the applications of metabolomics in regulatory toxicology [58]. This group aims to provide regulators and other stakeholders with the first practical guidelines for reporting and interpreting the quality of metabolomics data in toxicology, including for large scale investigations as would be needed to expand on the work presented here.
Yet such a significant expansion of the study described here leads directly to a further logistical need: who would be willing to pay for this? Currently, within Europe, the onus is on industry to pay for the risk assessments of the chemicals that they synthesize and/or use, so is there a business case for industry to utilize metabolomics (and other 'omics modalities) in regulatory toxicology to a much greater degree than at present? In the context of the study presented here, replacing a standard OECD chronic Daphnia reproduction test with an acute Daphnia toxicity test design (compatible with deploying the new early-response metabolic predictors of chronic reproductive fitness) would amount to a 7-to 10-fold cost saving for industry [59]. Could this provide sufficient motivation to the chemical industry? The company BASF SE have already strongly endorsed the application of metabolomics within their chemical risk assessment paradigm, including as a tool for biological read-across [15]. Similar to the approaches presented here, the insights into the MoAs of chemicals provided by metabolomics have allowed a chronic 90-day study in rats to be reduced to only 28-days [25], with significant resource savings. In summary, we and others believe strongly in the capability of metabolomics to discover early-response biomarker signatures that are predictive of chronic adverse outcomes (i.e., apical endpoints) because metabolic measurements are the most downstream of all molecular changes and closely represent an organism's phenotype. The discovery of a panel of metabolic markers that can predict the 21-day reproductive fitness of Daphnia exposed to Cd, DNP and propranolol from metabolic measurements made after only 24-h exposures adds further weight to that argument.

Daphnia Magna Culturing and Chemical Exposures
Cultures of D. magna were maintained as previously reported [38]. Individual D. magna neonates (3rd brood, <24-h old) in 250 mL media were exposed for 21 days to a range of concentrations of each chemical tested (n = 8 exposure vessels per concentration). Animals were fed daily with Chlorella vulgaris (no supplements) as per normal culturing conditions, with media and chemical renewal every 48 h. Three chronic toxicity studies were conducted using Cd, DNP or propranolol as the test chemicals (Sigma Aldrich, UK) prepared as described previously [29]. Nominal concentrations were: 0, 0.35, 1.4, 3.5 and 7 µgL −1 cadmium (measured as Cd 2+ ions); 0, 0.15, 0.75 and 1.5 mgL −1 DNP; and 0, 0.14, 0.7 and 1.4 mgL −1 propranolol. As a means of normalizing to biological effect, the exposures of 7 µgL −1 cadmium, 0.15 mgL −1 DNP and 0.14 mgL −1 propranolol all corresponded to 1% of the previously determined neonatal LC 50 (see Supplementary Materials). For each of the three chronic studies, a further control group was set up with D. magna fed only a half ration of algae (termed "reduced-food control"), in order to determine if observed metabolic changes were a direct effect of chemical exposure or instead a chemical-induced reduction in feeding. Throughout each of the 21-day studies, the reproductive output of individual D. magna was monitored and recorded, specifically the number of offspring produced per adult per brood and the day on which each brood was released. Neonates were removed from the test vessels and discarded daily; any mortality during the exposure period was also recorded. At the end of the 21-day test period, the now adult daphniids were captured and flash frozen as previously detailed [29], and then stored at −80 • C until metabolite extraction.

Metabolite Extraction and Mass Spectrometry Metabolomics
Metabolites were extracted from individual D. magna using a well-established bead-based homogenization and methanol/water/chloroform extraction protocol [29,38], including the preparation of an extract blank (same analytical procedure but no biological material). For this study, only the polar extracts were analyzed using Fourier transform ion cyclotron resonance mass spectrometry. Specifically, each daphniid extract was analyzed in triplicate on a hybrid 7-T LTQ FT mass spectrometer (Thermo Scientific, Bremen, Germany) with a chip-based direct infusion nanoelectrospray (nESI) ionization source (Triversa, Advion Biosciences, Ithaca, NY, USA), utilizing the SIM-stitching approach in negative ion mode. All data acquisition parameters were as previously detailed [38]. For each of the three metabolomics datasets corresponding to the three chemicals studied, mass spectra were processed using a three-stage peak picking and noise filter, missing value imputation, normalization and g-log transformation, as previously described [29] with some modifications. In this study the sample filter was set to retain in the final datasets only those peaks that occurred in >50% of all samples using an m/z tolerance of 1.8 ppm, and peaks in the extract blank were retained only if they were at least three times more intense in the biological samples. This workflow yielded three separate processed data matrices, one each for Cd, DNP and propranolol. Putative peak annotation and identification utilized a range of approaches, including analytical measurements (accurate m/z with <1 ppm accuracy [45] and collision-induced dissociation MS/MS) and informatics approaches. Specifically, we used custom written MI-Pack software version 1 [60] as previously detailed [29], which in addition to providing information on the Daphnia endogenous metabolome also enabled us to identify (and subsequently remove prior to statistical analysis) peaks directly arising from the test chemicals. Further methods to annotate and identify peaks included the interpretation of adduct patterns, isotope patterns, and highly correlated pairs of peak intensities that indicated the peaks likely arose from the same metabolite [46]). Where possible, metabolite identification to MSI level 1 [39] was conducted by comparing MS/MS fragmentation spectra to those measured using a commercial authentic metabolite standard.

Statistical Analysis of Reproductive Output
The total number of offspring produced per individual D. magna over the duration of each 21-day study was used as a measure of reproductive output. For each chemical, Grubbs statistical tests were employed for each dose group to identify any potential outliers, which were subsequently removed from any further analysis. ANOVA followed by a Tukey-Kramer post-hoc testing was then applied across all control and dose groups, per chemical.

Statistical Analysis of Metabolomics Datasets from 21-Day Exposures
To visualize the effects of each chemical on the metabolome of D. magna, principal components analyses (PCA) were conducted on the three processed and normalized (including g-log) data matrices (one per chemical). In addition, t-tests adjusted for a false discovery rate (FDR) <5% were applied across all peaks (metabolic features) between the untreated control and high dose group to provide an initial indication of the relative sizes of the metabolic perturbations induced by each chemical; this utilized the three processed and normalized (excluding g-log) data matrices. The most important algorithm for this study was partial least squares regression (PLS-R), a supervised multivariate method for relating two data matrices, in our case metabolomics peak intensities and Daphnia reproductive fitness. PLS-R derives its usefulness from its ability to analyze data with many noisy, collinear and even incomplete variables in both data matrices [61]. We utilized PLS-R modelling to discover metabolic predictors of reproductive output, for each of the three chemicals separately and also after merging the three datasets. All PLS-R models were internally cross validated, using Venetian blinds with 6 splits, to assess for any over-fitting of the data based on cross-validated R 2 values [61]. In addition, permutation testing was applied, using 1000 random permutations per model built, to determine the significance of the capability of each PLS-R model to predict the reproductive output [62]. A forward selection strategy was applied to PLS-R model building to derive the minimum number of metabolic features that best predict the reproductive fitness.

Statistical Analysis of Metabolomics Datasets from 24-h Exposures
Metabolomics data recorded using the same DIMS metabolomics workflow, but for Daphnia exposed to Cd, DNP and propranolol (each n = 10 biological replicates) for only 24 h, were published by us previously [29] (see further Materials and Methods in Supplementary Materials) and re-examined in this study. Here we treated these early-response metabolomics datasets as 'targeted' metabolic measurements, i.e., we only extracted and used the intensities of the 49 peaks corresponding to the newly discovered biomarker signature (described in Results), for each chemical. This required some alignment of the m/z values across the datasets and applying a 3 ppm m/z tolerance; the intensities of any of the 49 peaks that were missing in the early-response datasets were set to zero. Then each of these 'targeted' metabolic measurements, one for each chemical, were submitted to the optimal, non-chemical-specific PLS-R model to predict the reproductive output of Daphnia following hypothetical 21-day exposures (described in Results). Predicted reproductive outputs for untreated controls and chemical-exposed groups were normalized and compared using Student's t-tests on a chemical-by-chemical basis. Specifically, using Cd as the example, the predicted reproductive outputs from the PLS-R model for the Cd-treated and Cd-control groups were normalized by scaling the mean of the Cd-control group to 100%. Next a Student's t-test was applied to evaluate the statistical significance of the difference in the means of the predicted reproductive outputs of the Cd-treated versus Cd-control groups.

Conclusions
The overarching aim of this study was to discover early-response metabolic biomarkers that could predict the chronic reproductive fitness of individual D. magna, which has been achieved successfully. Using the newly discovered non-chemical-specific biomarker signature of 49 metabolic features in conjunction with metabolic measurements of Daphnia exposed for only 24-h, we were able to correctly predict that a significant decrease in reproductive fitness of Daphnia would occur after 21 days. This prediction was correct for all three of the chemicals investigated, cadmium, 2,4-dinitrophenol and propranolol. Further work is needed to extend the applicability of this metabolomics discovery approach to many more chemicals and chemical classes, though our findings provide encouragement that early-response metabolic biomarkers of established apical endpoints can be discovered and translated towards regulatory toxicology.
Supplementary Materials: The following are available online at http://www.mdpi.com/2218-1989/8/3/42/s1. Supplementary methods and results for Article: Metabolomics discovers early-response metabolic biomarkers that can predict chronic reproductive fitness in individual Daphnia magna (includes Tables S1-S4 and S6), Table S5: Summary of all annotations for the 49 metabolic features (or peaks) that constitute the non-chemical-specific metabolic biomarker signature.
Author Contributions: N.S.T. and M.R.V. conceived of and designed the study; N.S.T. conducted all chemical exposure studies and metabolomics assays and the majority of the statistical analyses; A.G. helped with the PLS-R predictions using the early-response toxicity data; and N.S.T. and M.R.V. co-wrote the paper. All authors contributed to the final version of the manuscript.