Metabolomics, a Powerful Tool for Agricultural Research

Metabolomics relies mainly on nuclear magnetic resonance (NMR) and on gas chromatography (GC) or liquid chromatography (LC) coupled to mass spectrometry (MS) to systematically acquire qualitative and quantitative information on low-molecular-mass endogenous metabolites, and thereby provides a direct snapshot of the physiological condition of biological samples. As a complement to transcriptomics and proteomics, it has played pivotal roles in agricultural and food science research. In this review, we discuss the capacities of NMR and GC/LC-MS in the acquisition of the plant metabolome, and address the promise and diverse applications of metabolomics, particularly lipidomics, in investigating the responses of Arabidopsis thaliana, a primary plant model for agricultural research, to environmental stressors including heat, freezing, drought, and salinity.


Introduction
Metabolomics, defined as a platform for the global identification and quantification of low-molecular-weight metabolites in a biological sample, is rapidly evolving into a principal tool for the functional annotation of genes, and provides a high-resolution snapshot of the physiological and biological aspects of cellular responses to specific environmental stimuli [1].
The field of metabolomics relies mainly on nuclear magnetic resonance (NMR) and mass spectrometry (MS), with or without chromatography, which allow the detection of an organism's metabolite pool, characterized by a plethora of chemical structures with an enormous diversity of chemical and physical properties [1][2][3][4][5]. Since its introduction by Nicholson et al. in 1999, metabolomics has been extensively applied to various fields of science in the post-genomic era, including agricultural research [6][7][8][9][10].
Arabidopsis thaliana was one of the first established model organisms worldwide [11] and has been extensively studied, rendering it an ideal research model. Metabolomics has been broadly applied to studies of the metabolic responses of Arabidopsis to various abiotic or biotic stresses, including heat, freezing, drought, and salinity [12,13]. Lipidomics, a comparatively young branch of metabolomics, has expanded rapidly within the last few years [14] and has been applied to various analyses to elucidate the physiological mechanisms of Arabidopsis [15][16][17][18][19][20].
In this review, we discuss the capacity of NMR and gas-chromatography/liquid-chromatography coupled to mass spectrometry (GC/LC-MS) in acquiring qualitative and quantitative information on carbohydrates, amino acids, organic acids, and lipids in plants, as well as the application of metabolomics and lipidomics in providing the mechanistic details of the physiological responses of Arabidopsis to heat, freezing, drought, and salt stresses. The technical routes of this review are illustrated in Figure 1.

Techniques of Plant Metabolome Acquisition
Quantitative plant metabolomics provides an in-depth understanding of plant metabolism and helps to improve crop yields [21]. At present, NMR and GC/LC-MS dominate data acquisition in metabolomics studies, enabling the identification of a myriad of species belonging to the three major classes of nutrient components in plants (i.e., carbohydrates, amino acids, and lipids) [6,21]. Other separation techniques, including capillary electrophoresis (CE) and supercritical fluid chromatography (SFC), can also be coupled to MS for metabolomics studies. However, CE and SFC are less widely applied than conventional GC/LC analyses, owing to poor migration time reproducibility and the lack of reference libraries for CE [22], and to fluid compressibility for SFC [23].

Metabolite Coverage of NMR in Plant Metabolomics
NMR spectroscopy, one of the two leading analytical techniques in metabolome research, is characterized by its reproducibility in quantification, its capacity for structure identification, and its non-biased detection of metabolites [24]. NMR can quantify metabolites in large batches of samples with higher reproducibility and greater accuracy, and over a wider time span and dynamic range, than GC/LC-MS. Measurements in untargeted MS-based metabolomics, in particular, are semi-quantitative. NMR maintains stable sensitivity because samples and instrument do not come into contact during detection, eliminating the concern of gradual contamination by residual metabolites that may compromise sensitivity in MS analyses. In addition, NMR provides the same signal sensitivity for all metabolites regardless of the complexity of the biological matrix, and is independent of the chemical properties of the metabolites [25]. NMR is a powerful technique for the analysis of metabolite structures, as it can differentiate compounds with identical masses and two-dimensional structures that differ only in spatial configuration [26,27]. NMR is also preferred in metabolomics studies for its simple detection requirements, as intact bio-specimens can be used without prior separation.
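The uniform signal response described above is what makes quantification against a single internal standard possible: in a proton spectrum, the integrated area of a signal scales with the number of contributing nuclei and the molar concentration. The following is a minimal sketch of that relation, assuming fully relaxed 1H spectra and well-resolved, non-overlapping signals. The function name and the numbers in the example are hypothetical and are not taken from any of the cited studies.

```python
def qnmr_concentration(analyte_integral, analyte_protons,
                       standard_integral, standard_protons,
                       standard_conc_mM):
    """Estimate analyte concentration (mM) from 1H NMR signal integrals.

    Assumes integral area is proportional to (number of protons) x
    (molar concentration) for both analyte and internal standard.
    """
    values = (analyte_integral, analyte_protons,
              standard_integral, standard_protons, standard_conc_mM)
    if min(values) <= 0:
        raise ValueError("integrals, proton counts and concentration must be positive")
    # Normalize each integral to a per-proton response.
    analyte_per_proton = analyte_integral / analyte_protons
    standard_per_proton = standard_integral / standard_protons
    return standard_conc_mM * analyte_per_proton / standard_per_proton


# Hypothetical example: a 3-proton analyte signal with integral 6.0, measured
# against a 9-proton internal standard signal with integral 9.0 at 1.0 mM.
print(qnmr_concentration(6.0, 3, 9.0, 9, 1.0))
```

Because the same proportionality holds for every metabolite, one standard suffices for all resolved signals, in contrast to MS, where response factors differ between compounds.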

A major drawback of the NMR technique, however, lies in its low sensitivity [24,25,28], which restricts its application to the detection of low-abundance metabolites in plants. In addition, as molecular weight increases, for example in lipids comprising long fatty chains, its identification capacity weakens rapidly because the extended hydrocarbon chains produce more complex and overlapping signals. Long-carbon-chain lipids, such as fatty acids (FAs) and phospholipids carrying one or more long fatty chains, can be further differentiated into hundreds of thousands of species according to the lengths of the fatty chains, the number of double bonds, and the functional groups. As NMR can only provide classification based on the characteristic signals of functional groups, it falls short of specific compound identification because of the overlapping methylene signals in long fatty chains. As can be seen from Table 1, only hundreds of metabolites can be identified by NMR, far fewer than by GC/LC-MS.

Metabolite Coverage of GC/LC-MS in Plant Metabolomics
MS coupled to GC or LC is by far the most frequently applied analytical technique in plant metabolomics studies, owing to its unparalleled sensitivity and extensive coverage of biological information relevant to the metabolism of the organism [29,30]. The plant metabolome reported to date is composed of approximately 30,000 endogenous metabolites that mainly comprise various lipid classes, the majority of which can be readily characterized and quantified via MS.
GC-MS is excellent for the analysis of biological samples with very complex matrices, offering highly efficient separation and resolution. Not only can it analyze many of the aforementioned carbohydrates, amino acids, and organic acids detectable by NMR, but it also accurately identifies a plethora of volatile and thermally stable lipids, or volatile derivatized metabolites, such as FAs [9,31,32]. More importantly, a large number of well-curated compound reference libraries, including NIST [33], FiehnLib [34], and the Golm Metabolome Database (GMD) [35], are available for peak identification and prediction across different models of mass spectrometers, and have proven very useful for analyzing metabolome data. The major limitation of GC-MS, however, lies in its inability to analyze thermolabile metabolites, such as di- and triphosphates, lysophosphatidylcholine (LPC), lysophosphatidylethanolamine (LPE), or the higher-molecular-mass phosphatidylcholine (PC) and phosphatidylethanolamine (PE), which remain non-volatile even after derivatization. This restricts its application to global metabolic profiling in plants and narrows the GC-MS-derived metabolome, in terms of both metabolite number and subtypes, compared to that obtained using LC-MS (Table 1).
In comparison to GC-MS, LC-MS is useful for handling thermolabile, polar, and high-molecular-weight compounds without derivatization, such as phosphatidylinositol (PI), PE, phosphatidic acid (PA), phosphatidylglycerol (PG), phosphatidylserine (PS), sulfoquinovosyldiacylglycerol (SQDG), PC, monogalactosyldiacylglycerol (MGDG), and digalactosyldiacylglycerol (DGDG) [15,16,36]. More importantly, with advances in ionization techniques, increasing scan speeds, and improvements in instrument sensitivity, the metabolite coverage of LC-MS can be expanded into a greater array of metabolite classes traditionally dominated by GC-MS [37][38][39]. For example, volatile metabolites involved in the tricarboxylic acid (TCA) cycle, which are generally detected by GC-MS, can now also be analyzed by LC-MS [40]. Even though GC-MS exhibits higher sensitivity than LC-MS for these volatile compounds, their detectability in LC-MS is sufficient for the quantitative analysis of targeted metabolites in plants.

The Application of Metabolomics to Arabidopsis Model-Based Research
Metabolomics has been widely applied to studies of the metabolic responses of Arabidopsis to various abiotic or biotic stresses. In this part, we summarize the promise and diverse applications of metabolomics, particularly lipidomics, in investigating the responses of Arabidopsis to environmental stressors including heat, freezing, drought, and salinity (Table 2).
Table 2 notes: "↑" and "↓" represent up- and down-regulated concentrations in response to stress, respectively; GABA: gamma-aminobutyric acid; AA: amino acid; BCAA: branched-chain amino acid; AAA: aromatic amino acid; GIPC: glycosylinositolphosphoceramide; ASG: acylated steryl glycoside; SG: steryl glycoside.

Temperature Stress-Induced Alterations of Arabidopsis Metabolome
Environmental stressors, including heat, freezing, drought, and salinity, are detrimental to the normal growth of plants [51][52][53]. The sessile nature of plants makes them particularly susceptible to such stressors, and plants have thus evolved various physiological and metabolic responses to deal effectively with environmental extremes [54]. Temperature extremes can lead to oxidative stress, which is extremely harmful to various biomolecules in plants, including lipids, nucleic acids, and proteins [55,56].
Temperature extremes can lead to severe crop-yield losses [52,53,57,58], and elevated temperature is predicted to result in severe food crises in the future with global warming on the rise [59,60]. To improve thermotolerance in crops, it is essential to understand the molecular basis of thermotolerance adaptation in plants.
On the other hand, low temperature is a key determinant of the geographical distribution of plants worldwide. Cold acclimation is an important mechanism that enables plant species to survive the low temperatures of harsh winters in their natural habitats [61]. Complex changes in the plant transcriptome, proteome, and metabolome occur during the cold acclimation process [61,62], initiating specific mechanisms to protect plants against freezing [63].
Cold stress (CS) influenced plant metabolism far more profoundly than heat stress (HS). Analysis of the temperature-stressed metabolome of Arabidopsis revealed that, with regard to metabolite markers, HS and CS shared many comparable responses [13]. Increased concentrations of branched-chain amino acids (BCAA), such as isoleucine (Ile), leucine (Leu), and valine (Val), and of aromatic amino acids (AAA), such as tyrosine (Tyr), were detected during both temperature stresses. Increased concentrations of tryptophan (Trp) and phenylalanine (Phe) were, however, observed only under CS [8].
Salicylic acid, which plays a crucial role in guarding plant hosts against pathogen intrusion [13,64,65,66,67,68], was elevated under both HS and CS, suggesting that salicylic acid could be the initial signaling molecule of plant tolerance to various stresses, and could possibly help plants prepare for combat against pathogens.
The promoters of a number of genes induced by temperature stresses contain sugar-responsive elements [13,69,70], indicating that sugar signaling may be important in the establishment and maintenance of both acquired heat and freeze tolerance. A metabolic profiling study on Arabidopsis showed that gluconapin is characteristic in CS accessions, while kaempferol-3,7-O-dirhamnoside and kaempferol-neohesperidoside-7-rhamnoside are specific to highly cold-tolerant accessions, which can facilitate the screening of cold tolerance in Arabidopsis accessions [71].
Secondary metabolism-generated metabolites, such as Ile, Leu, Val, Tyr, Trp, and Phe, can defend against pests, pathogenic fungi, and bacteria [72]. It is plausible that BCAA accumulation functions to enhance secondary metabolism, facilitating the development of resistance against pathogens during stress conditions. In plants, the cold response and day-night cycles share many regulatory genes [73]. Metabolic and transcriptional analyses of Arabidopsis exposed to CS revealed that the expression of day-regulated genes was profoundly influenced by CS, indicating a disrupted clock function and the importance of understanding the mechanism of cold acclimation in the correct day-night context.
Environmental conditions, especially temperature, directly affect cell membrane properties and hence membrane fluidity. Plants can counterbalance these effects by regulating the saturation indices of membrane glycerolipids, as the presence of unsaturated bonds decreases the phase-transition temperature [74]. In this area, lipidomics has been broadly applied to study the influence of temperature on Arabidopsis [36]. In the initial stage of adaptation to HS (32 °C), increases in PG 32:0 and SQDG 36:5 were observed in Arabidopsis, while PE 36:6, PG 36:4, and PG 36:5 decreased, indicating that membrane lipid composition can be fine-tuned to counter temperature changes, principally by modulating desaturation, leading to compensatory decreases in membrane fluidity at high temperatures. Furthermore, SQDG, PG, and PE represent anionic glycerolipids reported to aid in stabilizing membrane proteins [75].
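The species shorthand used above (for example, PG 32:0 or SQDG 36:5) encodes the lipid class, the total number of acyl carbons, and the total number of double bonds, so a saturation index can be computed directly from a lipidomics table. The following is a minimal sketch of one such index, the abundance-weighted mean number of double bonds per lipid class. The function names are hypothetical, and the abundances in the example are invented for illustration only; they are not values from the cited studies.

```python
import re
from collections import defaultdict

# Matches sum-composition names such as "PG 32:0" or "SQDG 36:5".
SPECIES_PATTERN = re.compile(r"^([A-Za-z]+)\s+(\d+):(\d+)$")


def parse_species(name):
    """Split a name like 'PG 36:4' into (class, carbons, double_bonds)."""
    match = SPECIES_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized lipid species name: {name!r}")
    lipid_class, carbons, double_bonds = match.groups()
    return lipid_class, int(carbons), int(double_bonds)


def unsaturation_index(abundances):
    """Abundance-weighted mean double bonds per molecule, per lipid class."""
    weighted = defaultdict(float)
    totals = defaultdict(float)
    for name, amount in abundances.items():
        lipid_class, _, double_bonds = parse_species(name)
        weighted[lipid_class] += double_bonds * amount
        totals[lipid_class] += amount
    return {c: weighted[c] / totals[c] for c in totals if totals[c] > 0}


# Hypothetical relative abundances (arbitrary units), for illustration only.
control = {"PG 32:0": 1.0, "PG 36:4": 2.0, "PG 36:5": 1.0}
heat = {"PG 32:0": 2.0, "PG 36:4": 1.0, "PG 36:5": 1.0}
print(unsaturation_index(control)["PG"], unsaturation_index(heat)["PG"])
```

In this invented example, a shift toward PG 32:0 lowers the PG index, which is the direction of change the text associates with reduced membrane fluidity under HS.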
With advances in analytical methods in the field of lipidomics, an LC-MS lipidomic platform was developed that extended the coverage of the plant lipidome [43]. In particular, the concentration of PG (16:0/18:3) was up-regulated upon cold acclimation. In addition, the degree of unsaturation in the long-chain bases of sphingolipids was observed to increase during CS [43], presumably enhancing CS tolerance in Arabidopsis [76].
Moreover, lipid changes were reported during all phases of freezing/cold exposure, including cold acclimation, the freezing process itself, and recovery from low temperature [77][78][79][80]. CS-induced lipid alterations can limit detrimental lipid phase changes that may result in cell membrane leakage [77]. Indeed, CS- or freezing-induced alterations in plant lipid metabolism can modulate subsequent damage during stress exposure [20,77,78,79]. For instance, desaturases producing trienoic fatty acids are required for effective photosynthesis under cold conditions [80]. Much still remains to be explored on the role of lipids in cold and freezing responses, which necessitates the development of more sensitive and extensive lipidomics and metabolomics analytical strategies tailored to Arabidopsis.

Drought-Stress (DS) Induced Alterations of Arabidopsis Metabolome
DS can result in significant losses of plant productivity [81][82][83]. In the 2012 drought in the United States, the most severe experienced in the past 25 years, the production and yields of corn and soybean were severely affected, leading to huge economic repercussions [84]. Such weather extremes call for a better understanding of the metabolic mechanisms of stress response, which would be crucial for improving crop tolerance to ensure agricultural output and economic stability [85].
Flavonoids represent a major component of specialized/secondary metabolites in plants [86] and are thought to be defense metabolites against environmental stresses [87]. Accumulation of flavonoids in Arabidopsis has been previously observed in response to various stresses [88][89][90][91][92][93][94]. It was reported that flavonoid accumulation, which was shown to participate in radical scavenging activity, increases the oxidative and drought tolerance of plants, thereby preventing water loss [95].
Increased concentrations of three types of flavonoids, namely glycosides of kaempferol (f1, f2, and f3), quercetin (f6 and f8), and cyanidin (A5, A8, A9, A10, and A11), were noted in Arabidopsis during DS, indicating that all flavonoids are DS-responsive metabolites that can be used as positive markers and potential mitigators of DS [44]. Nonetheless, the signaling/regulation mechanisms of flavonoids and the individual role of each molecule in stress mitigation remain unclear.
The diversity of secondary metabolites is critical for their roles in the plant stress response [96]. Metabolomics can aid transcriptomics in elucidating the cellular machinery of metabolic tolerance towards stress. The characterization of the flavonoid pathway transcription factor TRANSPARENT TESTA8 (TT8) in Arabidopsis using an integrative omics strategy revealed that the biosynthesis pathways of two phytohormones, jasmonic acid and brassinosteroids, which are implicated in the stress response [96], are directly regulated by TT8 [97]. Furthermore, at least eight stress response proteins, which are implicated in tolerance against salt and drought stress [98][99][100], are directly regulated in a TT8-dependent manner, thus implicating TT8 in reprogramming the defense response. In addition, TT8 has a direct role in increasing the diversity of core metabolites, particularly by regulating the glycosylation of brassinosteroids and flavonoids.
Mitochondrial metabolism was reported to be highly active during DS responses [45]. DS-induced metabolic reprogramming leads to up-regulated concentrations of amino acids and intermediates from the TCA cycle, including cis-aconitate, isocitrate, citrate, fumarate, 2-oxoglutarate, succinate, malate, glycolate, putrescine, spermidine, gamma-aminobutyric acid (GABA), guanidine, fructose, galactose, glucose, maltose, mannose, raffinose, ribose, sucrose, trehalose, dehydroascorbate, alanine (Ala), aspartate (Asp), glutamate (Glu), glutamine (Gln), Ile, Leu, lysine (Lys), methionine (Met), ornithine (Orn), Phe, proline (Pro), serine (Ser), threonine (Thr), Try, and Val, as well as decreased concentrations of protein, starch, and nitrate. The increased concentrations of the BCAA appear to be associated with their increased utilization as TCA cycle substrates [101,102], and appear to function in short-term DS, most likely by retarding stress initiation. A multiplicity of primary metabolites (osmolytes, osmoprotectants) accumulate under stress conditions, which could serve as building blocks for macromolecules to stabilize membranes and therefore contribute to cell osmotic pressure [9,13]. Indeed, this may be why it is often infeasible to generate highly stress-tolerant varieties by engineering the overproduction of merely a single compatible solute in plants.
Lipids, as major membrane components, function to preserve membrane integrity during DS [47]. Concentrations of glycosylinositolphosphoceramide (GIPC), steryl glycoside (SG), acylated steryl glycoside (ASG), DGDG, and PA in Arabidopsis were reported to be altered by DS [43,47,48]. In response to DS, increased DGDG 18:3 and PC 18:3, and diminished levels of the triunsaturated fatty acids 16:3 and 18:3, were observed in leaves of Arabidopsis [47]. These results indicate that Arabidopsis has a strong capacity for tolerating DS at the cellular level by modulating membrane lipid proportions.
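Calls such as "up-regulated" or "decreased" are commonly derived by comparing metabolite intensities between stressed and control samples. The following is a minimal sketch of one common convention, the log2 fold change of replicate means with a symmetric cut-off. The two-fold threshold, the function names, and all intensities are assumptions chosen for illustration; they do not reproduce the criteria or data of the cited studies, which would normally also apply a statistical test.

```python
import math
from statistics import mean


def log2_fold_change(stress, control):
    """log2 of the ratio of mean stressed intensity to mean control intensity."""
    stress_mean, control_mean = mean(stress), mean(control)
    if stress_mean <= 0 or control_mean <= 0:
        raise ValueError("mean intensities must be positive")
    return math.log2(stress_mean / control_mean)


def direction(stress, control, threshold=1.0):
    """Label a metabolite using the arrow notation of a summary table.

    threshold=1.0 corresponds to a two-fold change (an assumed cut-off).
    """
    change = log2_fold_change(stress, control)
    if change >= threshold:
        return "↑"
    if change <= -threshold:
        return "↓"
    return "no change"


# Hypothetical replicate intensities (arbitrary units), for illustration only.
samples = {
    "proline": ([4.2, 3.8], [1.0, 1.0]),
    "starch": ([0.9, 1.1], [4.0, 4.0]),
    "citrate": ([1.1, 1.3], [1.0, 1.0]),
}
for metabolite, (stress, control) in samples.items():
    print(metabolite, direction(stress, control))
```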

Salt Stress-Induced Alterations in Arabidopsis Metabolome
Soil and water salinity is pervasive throughout the globe and may adversely impact crop yield [103]. Plants have evolved tolerance against saline stress, which may involve osmotic adjustment and the sequestration of ions into respective cellular compartments [104]. Numerous metabolomic studies that elucidate the tolerance mechanisms of Arabidopsis in response to salt stress have been reported [105,106].
Using global transcriptional and metabolomic analyses, a study on the regulatory mechanism of polyamines in Arabidopsis salinity tolerance revealed the activation of abscisic acid and jasmonate biosynthesis, and the accumulation of important compatible solutes as well as TCA cycle intermediates [49]. Expression analyses indicated that thermospermine regulates the transcript levels of certain target genes associated with the biosynthesis and signaling of jasmonate, some of which have been proven to enhance salinity tolerance.
The osmotic potential in the plant cell is increased upon hyperosmolarity challenge. Plant cells reduce the osmotic potential of the cytosol by building up compatible osmolytes, which maintains the activity of enzymes [46]. Concentrations of Pro, histidine (His), glutathione, and GABA in Arabidopsis were strongly affected by salt stress. Pro, polyamines, organic acids, and glycine betaine, as compatible osmolytes, can greatly decrease stress-induced destruction of plant cells [107][108][109][110][111][112]. The compatible osmolytes are synthesized by shifting basic intermediary metabolites towards stress-activated biochemical reactions. Cytochrome P450s generally participate in primary and secondary metabolism and may be implicated in the biosynthesis of some osmolytes. Many other biochemical compounds potentially associated with salt tolerance can also be uncovered by metabolic analysis [113][114][115].

Future Perspectives
Although metabolomics is a relatively new field that emerged no more than two decades ago, it has already exhibited considerable potential in agricultural and food science research, especially in elucidating metabolome and lipidome changes in response to environmental or pathophysiological stimuli, thereby contributing to the improvement of crop production. Despite the rapid advances in NMR and in MS coupled to GC or LC, and their broad applications in metabolomics, no single analytical platform can thus far offer holistic coverage of the plant metabolome by itself. While NMR has the advantage of capturing the structural information of metabolites as well as their abundances, LC-MS emerges as the probable optimal choice for the acquisition of the global plant metabolome, and the rapid advances in MS instrumentation are greatly driving its diverse applications in agricultural sciences.
In the face of adverse environmental conditions, a series of primary metabolites (osmolytes, osmoprotectants) and secondary metabolites (defense metabolites) accumulate in plants to enhance their stress tolerance. It is therefore impossible to generate plants with high levels of stress adaptation by engineering the levels of just a few selected metabolites. Future studies will focus on increasing the resolution and coverage of the metabolome to obtain a comprehensive understanding of how plants adapt to environmental stresses, providing new avenues to increase crop production.
Apart from the identification of critical metabolic pathways to raise crop production, metabolomics also provides a powerful tool to assess food crop quality. For example, global profiling of the plant/crop metabolome could facilitate the identification of genetic manipulations that may serve to increase the nutritional value and quality of crops. For instance, previous lipidomic analysis has revealed that GmMYB73 manipulation promotes lipid accumulation in soybeans, thus providing a potential avenue for increasing oil production in legume crop plants [116]. Another example is the establishment of an integrated lipidomic approach comprising multiple analytical arms specifically tailored to the analysis of the fine lipidomic fingerprints of palm oil, currently the leading edible oil consumed around the globe, which could be applied to the evaluation of oil quality and the health benefits/risks associated with different oil refinement techniques [117]. Again, the resolution of the metabolome (lipidome) is instrumental in such applications, since a highly sensitive analytical methodology is indispensable for detecting trace amounts of critical components, such as oxidized lipids including oxidized triacylglycerols, that could help determine oil quality.
In conclusion, while metabolomics holds great promise in advancing agricultural research, a pressing need exists to expand its analytical capacity in order to derive fully integrated functional networks of metabolites that have true biological meaning. The systematic construction of metabolome libraries with sufficient resolution is therefore expected to further broaden the translational applications of metabolomics in various sub-areas of agricultural research. Finally, the plethora of metabolomics data also calls for the development of competent information processing tools that allow data to be processed, integrated, interpreted, and verified alongside proteomics and transcriptomics strategies in order to fully unravel the various intricate biological networks under study.
Acknowledgments: This work was financially supported by grants from the National Natural Science Foundation of China 31371515 and 3150040263.
Author Contributions: All authors contributed to varying degrees to the production of the review. He Tian and Sin Man Lam: literature search and screening, writing the paper, references. Guanghou Shui: literature search and screening, writing the paper, study conception and supervision.

Conflicts of Interest: The authors declare no conflict of interest.