The Application of Metabolomics for the Study of Cereal Corn (Zea mays L.)

Corn (Zea mays L.) is an important cereal crop indigenous to the Americas, where its genetic biodiversity is still preserved, especially among native populations from Mesoamerica and South America. The use of metabolomics in corn has mainly focused on understanding how corn metabolomes differ under biotic and abiotic stresses and on evaluating the influence of genetic and environmental factors. The rise of diet-linked non-communicable diseases has increased interest in optimizing the content of bioactive secondary metabolites in current corn breeding programs to produce novel functional foods. This review provides perspectives on the role of metabolomics in the characterization of health-relevant metabolites in corn biodiversity and emphasizes the integration of metabolomics in breeding strategies targeting the enrichment of phenolic bioactive metabolites such as anthocyanins in corn kernels.

based on crossed inbred lines. The current corn germplasm is often described in terms of teosinte, landraces, and inbred lines, which are important genetically diverse resources for crop improvement and food security [5].
Feeding a growing population that will approach 10 billion over the next 30 years creates the need to improve existing crops and to develop new crops that are higher yielding, more nutritious, and more resilient to climate change [12]. On the other hand, the worldwide rise of diet-linked chronic diseases, such as type 2 diabetes and derived complications including cancer and certain degenerative diseases, has increased the interest in plant-derived foods from global biodiversity [13][14][15]. Fruits and whole grains have been found to be sources of bioactive compounds with the potential to decrease the risk of chronic diseases when consumed on a regular basis. According to several epidemiological and intervention studies, the intake of whole grains rich in different health-relevant functional metabolites is related to a reduction in the risk factors of major chronic diseases [16][17][18]. The research and sustainable use of corn genetic diversity may help to counter such challenges.
Metabolomics has emerged as a valuable technology for the comprehensive profiling and comparison of metabolites in biological systems, and a diversity of applications in plant sciences has been reported [19,20]. Depending on the purpose of the study and the type of scientific information required, metabolomics can follow different approaches such as targeted analysis, metabolomic profiling, and metabolomic fingerprinting [21,22]. In the case of corn, several metabolomic studies have focused on understanding the complex biochemical mechanisms involved in the response of the corn plant to biotic [23][24][25][26][27] and abiotic [28][29][30][31][32] environmental stress factors. Other metabolomic studies have been aimed at comparing genetically modified and non-genetically modified corn lines [33] or at evaluating the influence of genetic factors and the growing location on corn kernel and plant function [34][35][36][37].
The current review summarizes the application of metabolomics for the characterization of health-promoting metabolites in corn kernels and emphasizes the importance of corn genetic diversity, especially native landraces, as a potential source of bioactive metabolites. The role of metabolomics integrated with novel phenotype-genotype studies targeting the improvement of health-relevant metabolites in corn, with a focus on anthocyanins, is also presented. Future perspectives on the application of metabolomics for the valorization and understanding of corn genetic diversity and its impact on food security and health are also highlighted.

The Diversity of Corn Germplasm and the Importance of Native Landraces
The preservation of genetic resources, including crop landraces and wild relatives, is crucial for future crop improvement [38]. Landraces have been found to possess important allelic diversity, contributing, for example, useful genes for more efficient plant nutrient uptake and utilization, resistance or tolerance to biotic and abiotic stresses, and superior nutritional and health-linked bioactive compounds [39].
Ex situ and in situ strategies have been applied to conserve corn genetic diversity worldwide. The first international initiative for the ex situ conservation of corn germplasm was directed by the Latin American Maize Program (LAMP) (1987-1996). A group of 11 countries from Latin America and the United States evaluated the status of local corn germplasm, compiled information about this germplasm, and promoted access to this information for research purposes and for the creation of superior varieties and hybrids by breeders [40]. Table 1 shows the genetic diversity of corn germplasm in terms of the number of native landraces reported by country in the Americas and in some countries from other continents. Corn genetic diversity is very high in Latin America, where around 300 landraces comprise 90 percent of the world's corn diversity [40,41]. Mexico and Peru have around 30 percent of Latin American corn landrace diversity. The phenotypic and molecular variability of corn landraces from Mexico, the center of corn origin, has been reported by Prasanna [8] and Prasanna [42]. The Andean region has wide ecological diversity, and its corn landraces are currently grown from sea level to different highland regions. These adaptations have been developed by native farmers since ancient times. This may explain why the Andean region is possibly the geographical area with the highest corn phenotypic diversity in the world (Figure 1) [43]. However, Andean corn landraces have been poorly investigated at phenotypic and molecular levels compared to the research focused on Mexican landraces and teosinte varieties.
In several places in Mesoamerica and Latin America, local farmers maintain the cultivation of corn landraces for subsistence under traditional agricultural practices (in situ conservation). However, the increasing introduction of modern corn hybrids is leading to a steady genetic erosion of native germplasm. This situation will compromise food security, potentially affecting indigenous food systems in the medium term and possibly in the long term.
Table 1 (fragment recovered from extraction): Asia, India, 15 [60], 1300 [60].
Metabolites 2019, 9, x FOR PEER REVIEW
Germplasm banks from the International Maize and Wheat Improvement Center (CIMMYT-Mexico City, Mexico) and the United States Department of Agriculture-National Plant Germplasm System (USDA-NPGS-Washington, DC, United States) currently preserve the world's major corn collections [61,62]. However, modern breeding programs are mainly based on corn germplasm with a narrow genetic base, as reported by Prasanna [8]. Considering the current demand for healthier plant-based foods due to the increase of metabolic chronic diseases worldwide, the integrated research of native corn landraces may broaden the genetic base for crop improvement and allow the targeting of functional bioactive metabolites beyond yield and common agronomic traits.

The Application of Metabolomics for the Characterization of Health-Relevant Metabolites in Corn Genetic Diversity
Plant secondary metabolites are small molecules that are considered non-essential for the general survival of the organism but are key components of overall plant adaptation to, and protection against, biotic and abiotic stress conditions [63]. Several epidemiological and intervention studies have shown an inverse correlation between the consumption of whole grains and the prevalence of major chronic diseases including cardiovascular disease, type 2 diabetes, and cancer [16,17,64]. This is attributed to the presence of a wide range of phytochemical compounds, mainly secondary metabolites, that also play a significant role in human health [64].
The composition of corn in terms of primary metabolites with major nutritional importance (proteins, carbohydrates, fat), along with its mineral composition, has been shown to be comparable to or even better than that of common cereal crops such as rice and wheat [65]. Some comprehensive reviews have also shown that the corn kernel is a good source of important health-relevant phytochemicals such as phenolic compounds, carotenoids, tocopherols, and phytosterols [65,66]. However, research on the phenolic and carotenoid composition of the corn kernel has received more attention due to the wide chemical diversity and remarkable health-promoting properties associated with both metabolite groups [67,68].

Metabolomic Analysis of Phenolic Compounds in Corn Genetic Diversity
The characterization of phenolic bioactives in corn kernel diversity was first carried out by applying spectrophotometric methods to Mexican landraces of variable kernel pigmentation [69][70][71][72]. This has allowed the rapid selection of promising corn accessions and genotypes for further breeding strategies. The specific chemical structure of phenolic compounds has been associated with different functional properties based on their antioxidant capacity [73]. In addition, phenolic compounds in cereals such as corn can be classified into free, esterified (covalently bound to other molecules), and insoluble-bound forms. Insoluble-bound phenolics form covalent bonds with cell wall macromolecules including pectin, cellulose, arabinoxylan, and structural proteins and generally represent the major phenolic fraction in cereals [74]. Bound phenolics are not significantly affected by gastrointestinal digestion but are released through colonic fermentation, exerting a range of positive biological effects in the colon [75].
The study of phenolic profiles (free and bound phenolic fractions) in corn kernel germplasm from different origins and types (landraces, inbred lines, hybrids, teosinte varieties) has mainly relied on targeted metabolomic platforms (Tables 2 and 3). Corn landraces and hybrids with different phenotypes (kernel color) from American germplasm banks (Mexico and the United States) have been extensively screened as important sources of phenolic compounds using liquid chromatography with diode array or ultraviolet-visible detection (LC-DAD/UV-Vis) and liquid chromatography coupled to mass spectrometry (LC-MS). The interest in the characterization of high-anthocyanin corn landraces and hybrids is increasing due to the growing demand for alternative sources of natural pigments for nutraceutical and food industry applications. Hong et al. [76] and Hong et al. [77] have recently combined different mass analyzers, such as quadrupole-Orbitrap and triple quadrupole, for the detection of at least 18-20 different types of anthocyanins in purple corn germplasm from Australia. These authors optimized the method for the extraction and detection of anthocyanins, revealing that some anthocyanins previously identified in purple corn, such as succinyl and ethyl-malonyl derivatives, are likely not endogenous but are esterification products formed during the extraction process [76,77].
The most comprehensive survey for the identification of high-anthocyanin corn germplasm was done by Paulsmeyer et al. [78]. Around 398 genetically diverse pigmented corn accessions from different origins were analyzed, and 167 accessions were selected due to their capability to produce anthocyanins. Selected germplasm was classified as blue aleurone, pink aleurone, purple pericarp with condensed anthocyanin forms, purple pericarp without condensed forms, and reduced acylation accessions. Accessions with purple pigmented pericarp and condensed forms of anthocyanins showed the highest anthocyanin contents (a Peruvian Andean purple corn had the highest levels). The same study also revealed that anthocyanin biosynthesis shows considerable heritability, so this trait can be optimized through breeding strategies [78].
Peniche-Pavia [79] recently showed the potential of direct injection electrospray ionization mass spectrometry (DIESI-MSQD) as a cost-effective and statistically robust method for high-throughput phenotypic characterization and targeting of high-anthocyanin corn germplasm. Furthermore, Fourier-transform near-infrared reflectance spectroscopy (FT-NIRS) has been used for the quantitative measurement of anthocyanin contents in a wide array of corn germplasm [80]. Vibrational spectroscopy (infrared and NIR) includes non-destructive methods that can provide high-throughput analysis of a large number of samples with minimal sample preparation. However, their major drawbacks are poor resolution and limited information about the chemical structure of targeted metabolites [81].

Metabolomic Analysis of Carotenoid Compounds in Corn Genetic Diversity
Carotenoids are natural pigments found in most fruits, vegetables, and grains. These metabolites have shown a range of beneficial functional properties for human health. Besides their antioxidant effects, some individual carotenoids such as β-carotene, β-cryptoxanthin, and other derivatives have a provitamin A function [68]. Lutein and zeaxanthin are essential macular pigments in the eye and have been linked to a reduction of macular eye disease and of the risk of cataracts [97]. LC coupled to detectors such as ultraviolet-visible (UV-Vis) and diode array detectors (DADs) has been widely used for the identification of provitamin A and non-provitamin A carotenoids in different corn germplasm (Table 4). Liquid chromatography tandem mass spectrometry (LC-MS/MS) has generated more information about the variability of carotenoid isomers present in corn kernels [98].
In contrast to the observations on corn phenolic compounds, several investigations from different continents have focused on the characterization of carotenoids in worldwide corn biodiversity. Corn has been targeted as one of the major crops for provitamin A enrichment, and important global initiatives such as the HarvestPlus and CIMMYT programs have focused on obtaining high-provitamin-A corn cultivars through conventional and molecular breeding strategies [99,100]. The main objective of the HarvestPlus program (2003-2016) was to develop high-yielding provitamin-A-enriched corn cultivars that are proven to reduce vitamin A deficiency, profitable to farmers, and acceptable to consumers [100,101]. Targeted metabolomics has revealed that carotenoids are mostly concentrated in the endosperm and that, overall, orange-pigmented varieties show the highest carotenoid levels (Table 4). This germplasm has potential for applications in biofortification-linked breeding programs.
Carotenoids listed in Table 4 (fragment recovered from extraction): all-trans-α-carotene, 9-cis-α-carotene, 9'-cis-α-carotene, all-trans-β-carotene, 9-cis-β-carotene, 13-cis-β-carotene, all-trans-β-cryptoxanthin, 9-cis-β-cryptoxanthin, 9'-cis-β-cryptoxanthin, 13 or 13'-cis-β-cryptoxanthin, 15-cis-β-cryptoxanthin, all-trans-α-cryptoxanthin, 9-cis-α-cryptoxanthin

Use of Non-Targeted Metabolomic Platforms for the Research of Corn Kernel Metabolome
Untargeted metabolomics, alone or integrated with other high-throughput platforms such as transcriptomics and proteomics, has recently been used to characterize the composition of corn kernels and to understand the molecular regulation of metabolic pathways underlying the biosynthesis of phenolic antioxidants. Rao et al. [112] identified 210 metabolites (199 primary metabolites, nine secondary metabolites, and two phytohormones) in mature kernels of 14 corn lines from China using LC-MS/MS and gas chromatography mass spectrometry (GC-MS). A total of 32 of the identified metabolites corresponded to basic and essential macronutrients (17 amino acids, five carbohydrates, seven lipids, and three cofactors, prosthetic groups, or electron carriers). Other molecules such as vitamin E, stigmasterol, campesterol, β-sitosterol, phospholipids, antioxidants (dihydro-kaempferol and the lactone costunolide), and the antinutrient inositol hexaphosphate were also detected. An integrated metabolic map based on transcriptomic, proteomic, and metabolomic data was built and included seven important pathways and 23 sub-pathways that represented the regulatory mechanisms of corn kernel metabolism [112].
Hu et al. [113] studied the metabolome of purple pigmented and non-pigmented kernels from China at different maturity stages (11, 16, and 21 days after pollination (DAP)) using a combination of LC-MS/MS and GC-MS. A total of 247 metabolites were identified, and metabolomes differed according to the maturity stage. The integration of transcriptomic data revealed that kinetic trends in the transcriptome and metabolome were similar across kernel development. An important metabolic shift from primary to secondary metabolism occurred at 16 DAP only in the purple pigmented kernel, targeting the biosynthesis of anthocyanins and precursors of phlobaphenes [113]. The biochemical information generated in the above studies would be valuable in molecular breeding strategies aimed at increasing health-promoting metabolites such as anthocyanins in the corn kernel.
Unlike high-throughput analytical platforms such as LC/GC-MS for metabolomic studies, mass spectrometry imaging (MSI) is a relatively recent non-targeted method used for the simultaneous analysis of both the composition and spatial distribution of many compounds [114]. MSI is a two-dimensional analysis method that can detect intact molecules within tissue sections without requiring extraction, purification, separation, or labeling while allowing the detection of a wide range of molecules [114,115]. MSI has the capability to image thousands of metabolites such as lipids, peptides, proteins, and secondary metabolites (including phenolic compounds) in a single experiment [116][117][118]. Depending on the type of ionization, the most common MSI configurations are matrix-assisted laser desorption ionization mass spectrometry (MALDI-MSI), secondary ion mass spectrometry (SIMS), and desorption electrospray ionization mass spectrometry (DESI) [114]. In the case of corn, MSI has been applied to study kernel tissue sections, the distribution of lysophospholipids within starch granules in corn endosperm, and the distribution of cellulose and hemicellulose structures in corn stems [119][120][121]. MSI can provide valuable information about the localization of important bioactive compounds such as anthocyanins for breeding purposes. The integration of these untargeted platforms with targeted methods might help to increase the analytical power for the comprehensive characterization of corn kernel diversity.

Common Methods Used for the Study of Corn Genetic Diversity
Genetic diversity has been broadly defined as any variation in the nucleotides, genes, chromosomes, or genomes of a species at a level of the individual, population, species, or region for a given time [122]. This diversity is highly desirable in crop breeding programs since it offers a wide array of critical gene pools for further development of improved corn seeds relevant for nutrition and health. As mentioned above, corn genetic diversity (native landraces) may represent an important source of candidate genes for breeding applications.
Genetic diversity in corn germplasm has been assessed using different types of markers (Table 5). Morphological markers, such as those related to plant traits (genetically determined characteristics) [5] of agronomic interest, have traditionally been used in several studies to elucidate potential clustering patterns among different corn landraces. Ortiz et al. [123] and Ortiz et al. [124], using plant and internal ear traits, respectively, determined that several Andean corn accessions can be classified in different groups corresponding mainly to the original corn race classification. Likewise, Salazar et al. [125] highlighted the wide range of phenotypic variability based on different plant qualitative and quantitative traits at an intra-racial level in the Chilean Choclero landrace. Turkish corn genotypes can be classified into flint, pop, and dent races using ear and tassel traits [126]. Other authors have used molecular markers such as microsatellites (SSR) to assess the genetic diversity in germplasm from Brazil and Argentina [127,128]. Porta et al. [129] confirmed the genetic diversity and representativeness of the core collection (90 accessions) with respect to the whole Uruguayan corn landrace collection by applying SSR markers in combination with morphological characteristics. Advances in high-throughput sequencing techniques have allowed the genotyping-by-sequencing (GBS) of 349 commercial corn inbred lines based on single-nucleotide polymorphism (SNP) markers [130]. The same study revealed the presence of three different heterotic groups corresponding to the Stiff Stalk, Non-Stiff Stalk, and Iodent corn types among North American corn germplasm [130].
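As an illustration of how morphological traits can reveal clustering patterns among accessions, the following minimal sketch standardizes a small trait table and groups accessions by average-linkage hierarchical clustering. It is not the method of any study cited here: the accession names, the three traits, and all trait values are hypothetical, and the function names are introduced only for this example.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each trait (column) so traits on different scales weigh equally."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant traits
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def euclidean(a, b):
    """Euclidean distance between two standardized trait vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage_clusters(names, rows, k):
    """Merge the two closest clusters (average linkage) until k clusters remain."""
    z = standardize(rows)
    clusters = [[i] for i in range(len(names))]

    def dist(c1, c2):
        return mean(euclidean(z[i], z[j]) for i in c1 for j in c2)

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(names[i] for i in c) for c in clusters]

# Hypothetical accessions with made-up values for three traits:
# plant height (cm), ear length (cm), number of kernel rows.
accessions = ["acc_A", "acc_B", "acc_C", "acc_D"]
traits = [
    [250.0, 18.0, 14.0],
    [245.0, 17.5, 14.0],
    [150.0, 10.0, 8.0],
    [155.0, 11.0, 8.0],
]
groups = average_linkage_clusters(accessions, traits, k=2)
print(groups)
```

Standardizing first matters because traits such as plant height and kernel row number are measured on very different scales; real diversity studies typically use many more traits and accessions and dedicated statistical software.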

Metabolomic-Assisted Molecular Breeding Strategies for the Increase of Phenolic Antioxidants in Corn Kernels
Metabolomics enables the comprehensive high-throughput quantification of a wide array of metabolites and is becoming an important tool for corn phenotyping. Since the metabolic phenotype provides a link between gene sequence and visible phenotypes, metabolites can be used as markers for trait prediction relevant for crop genetic improvement [132,133]. Mapping strategies, including genome-wide association studies (GWAS) and linkage studies, have been developed for the identification of genes underlying quantitative trait loci (QTL) (a QTL is a region of the genome associated with the control of a quantitative trait) [5,134,135]. GBS profiling combined with different omics technologies is currently applied to understand the genetic and biochemical regulation of metabolism linked to relevant agronomic and nutritional traits in corn.
Secondary metabolites such as anthocyanins are receiving more attention for breeding purposes due to their well-known health-relevant bioactivity and potential for use as natural pigments at the industrial level. Metabolite-based genome-wide association studies (mGWAS) are used to decipher the genetic basis of plant metabolite biosynthesis and regulation. This strategy is allowing the development of metabolic markers and the identification of genetic loci for metabolome-assisted biofortification [136]. Anthocyanin biofortification through mGWAS techniques has already been achieved in tomato and rice [137,138]. Rhodes et al. [139] evaluated a wide sorghum diversity panel (n = 381) using NIRS and identified novel QTL for sorghum polyphenols through a GWAS with 404,628 SNP markers.
Conventional marker-assisted breeding has been used for the production of a colored polenta corn kernel and a purple popcorn rich in anthocyanins in Italy [140,141]. GWAS strategies have been more extensively used for the study of complex traits, such as yield and stress resistance in corn plants [142]. Novel metabolites and genes involved in the biosynthesis of flavonoids have been identified in mature kernels from a diverse corn inbred panel (n = 368) through mGWAS using different LC-MS/MS platforms (electrospray ionization-triple quadrupole-linear ion trap and electrospray ionization-high-resolution quadrupole time-of-flight mass spectrometry, respectively) [133]. However, the application of mGWAS to the biofortification of the corn kernel with anthocyanin metabolites has not been well studied. This technique, along with other new GWAS methods and new population designs, would accelerate the discovery and use of allelic diversity in corn landraces to produce novel corn-based functional foods [12,142].
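The core idea behind an mGWAS, testing each marker for association with a metabolite level, can be sketched as a single-marker scan. The example below is a simplified illustration under stated assumptions, not the procedure used in the cited studies: the SNP names, genotype dosages (0, 1, or 2 copies of an allele), and anthocyanin values are invented, and the scan omits the population-structure and kinship corrections and the multiple-testing control that real GWAS pipelines require.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between allele dosage and metabolite level."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker or constant trait carries no signal
    return sxy / math.sqrt(sxx * syy)

def single_marker_scan(genotypes, metabolite):
    """Rank markers by |t|, where t tests r against zero with n - 2 degrees of freedom."""
    n = len(metabolite)
    results = []
    for marker, dosages in genotypes.items():
        r = pearson_r(dosages, metabolite)
        r_safe = max(min(r, 0.999999), -0.999999)  # avoid division by zero at |r| = 1
        t = r_safe * math.sqrt((n - 2) / (1 - r_safe ** 2))
        results.append((marker, r, t))
    return sorted(results, key=lambda row: abs(row[2]), reverse=True)

# Hypothetical toy data: eight lines, three SNPs, one metabolite trait.
genotypes = {
    "snp_1": [0, 0, 1, 1, 2, 2, 0, 2],
    "snp_2": [0, 1, 2, 0, 1, 2, 1, 1],
    "snp_3": [1, 1, 1, 1, 1, 1, 1, 1],  # monomorphic marker
}
anthocyanin = [1.0, 1.2, 2.1, 1.9, 3.2, 2.9, 0.9, 3.1]  # arbitrary units

ranked = single_marker_scan(genotypes, anthocyanin)
for marker, r, t in ranked:
    print(f"{marker}: r = {r:.3f}, t = {t:.2f}")
```

In this toy data set the first marker tracks the metabolite closely and ranks first, while the monomorphic marker carries no information; with hundreds of thousands of SNPs, as in the sorghum study mentioned above, the same test would be repeated per marker and per metabolite.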

Future Perspectives
Metabolomics is an important technique for the characterization of relevant bioactive phytochemicals such as phenolic and carotenoid compounds in corn biodiversity. The integration of metabolomic information with data from other high-throughput omics technologies is critical for better understanding the molecular mechanisms underlying the biosynthesis of functional metabolites. The application of novel phenotype-genotype molecular strategies for corn breeding, such as mGWAS, would speed up the production of innovative functional corn-derived foods with improved levels of health-relevant compounds such as anthocyanins and carotenoids.
Corn landrace genetic diversity is a key source of candidate genes with potential applications in modern breeding programs aimed at producing healthier foods in resilient and sustainable ecosystems. A major challenge is the comprehensive molecular characterization of worldwide corn landrace diversity using an integrated multi-omics approach that includes metabolomics. High-quality phenotypic data are essential to identify and use relevant corn germplasm for different breeding applications.
Extensive research has focused on corn diversity from North America and Mesoamerica, as highlighted in this review. The Andean region also represents an important hotspot of corn diversity, but the scientific information about germplasm from this area is limited. In fact, some breeding initiatives based on Peruvian corn with high anthocyanin contents have been carried out in the United States and Australia [77,78]. The effective in situ and ex situ preservation of corn diversity at local and international levels is critical to ensure adequate research on corn germplasm, especially in areas with high corn genetic diversity. This requires clear rules and policies for its accession and exchange involving not only researchers but also critical stakeholders such as farmers, government, and industry.
The next challenge that requires a multidisciplinary and intersectoral approach is the transition from the laboratory or greenhouse to the field. All strategies should consider the specific environmental conditions of the target region for corn cultivation. Applications in geographical areas where corn diversity is high may be strategic for improving corn in situ conservation while allowing the development of local agriculture within indigenous food systems.
The holistic integration of metabolomics with complementary omic-technologies may open the opportunity for the application of omic-wide association studies targeting the valorization, understanding, and sustainable use of worldwide corn genetic diversity.