Advances in Cereal Crop Genomics for Resilience under Climate Change

Adapting to climate change, meeting human food and nutritional needs, and securing sufficient energy supplies will call for a radical shift from the current conventional adaptation approaches to more broad-based and transformative alternatives. This entails diversifying the agricultural system and boosting the productivity of major cereal crops through the development of climate-resilient cultivars that can sustainably maintain higher yields under climate change conditions, expanding our focus to crop wild relatives, and better exploiting underutilized crop species. These efforts are facilitated by recent developments in plant genomics, such as advances in genome sequencing, assembly, and annotation, as well as gene editing technologies, which have increased the availability of high-quality reference genomes for various model and non-model plant species. This has enabled genomics-assisted breeding of crops, including underutilized species, consequently broadening the genetic variation of the available germplasm; improving the discovery of novel alleles controlling important agronomic traits; and enhancing the creation of new crop cultivars with improved tolerance to biotic and abiotic stresses and superior nutritive quality. Here, we summarize these recent developments in plant genomics and their application, with particular reference to cereal crops (including underutilized species). In particular, we discuss genome sequencing approaches, quantitative trait loci (QTL) mapping and genome-wide association studies (GWAS), directed mutagenesis, plant non-coding RNAs, precise gene editing technologies such as CRISPR-Cas9, and the complementation of crop genotyping by crop phenotyping. We conclude with the outlook that high-throughput phenotyping, pan-genomics, transposable element analysis, and machine learning hold much promise for crop improvement related to climate resilience and nutritional superiority.


Introduction
Combating global climate change, meeting human nutritional needs, and securing sufficient energy supplies are the formidable challenges confronting humankind in the current era [1][2][3][4]. Cereal crops, which are characteristically grasses cultivated for their edible grains [5], are the major suppliers of food energy to human beings and livestock and hence are produced in greater quantities than any other crop species [6]. Among cereal crops, wheat (Triticum aestivum L.), rice (Oryza sativa L.), and maize (Zea mays L.) are the three most important in terms of production [6][7][8][9]. Nearly 40% of our daily calories depend on those three cereals which require more resources than

De Novo Domestication of Crop Wild Relatives and Better Exploitation of Orphan Crop Species
Development of new germplasm resources with novel allelic diversity in useful backgrounds is critical in assisting the identification of genes that contribute to important adaptive traits [26]. De novo domestication of crop wild relatives (CWRs) becomes central if crop breeders are to create new varieties with novel climate-adaptive traits. CWRs are plant species that are phylogenetically close to domesticated crops, from any geographical location in the world. Examples of CWRs include landraces, crop progenitors, and some closely related plant taxa without known agricultural significance [47,48]. Long periods of evolution and selective pressure (natural selection) have allowed CWRs to accumulate several important genes enabling them to survive harsh biotic and abiotic environments [20]. Compared with modern, elite cultivars, which are normally selected for high-input environments with minimal limitations, CWRs exhibit morpho-physiological features for survival and adaptation under extreme conditions [49]. Therefore, CWRs and landraces serve as potential reservoirs of beneficial alleles for tolerance to abiotic stresses such as drought, heat, salt, and cold, which can be introduced into crop lines via traditional or molecular breeding [26,50-52]. For instance, grain sorghum can be improved by exploiting the as yet untapped potential of the extensive gene pool of CWRs in its genus [47]. Sorghum is a genus within the tribe Andropogoneae, which includes other plant genera such as Saccharum (sugarcane) and Miscanthus that are important biomass crops. Sugarcane and sorghum are closely related phylogenetically and have been observed to intercross [53,54]. Therefore, CWRs in the Sorghum genus may be a valuable genetic resource for new crop development across the tribe, either via incorporation of important genes into genera such as Saccharum or by domestication of additional Sorghum species [35,47,55]. Additionally, cultivated barley (Hordeum vulgare L.) can be enhanced for drought tolerance by crossing it with its wild relative Hordeum spontaneum L., which harbors alleles for drought tolerance [20,48,49]. Cultivated maize can be enhanced by exploiting its wild relatives, such as the teosinte Zea parviglumis and Tripsacum [20,56-59]. Interestingly, new advances in gene editing technologies such as the CRISPR-Cas9 system may help to overcome the strong reproductive and genetic barriers to gene transfer between cultivated crop species and CWRs [47].
Interestingly, the utility of CWRs to unlock marginalized areas for agriculture has been gathering attention in recent years [60]. Moreover, the current technological advances in plant genomics and genome editing offer a window for accelerated domestication of CWRs and improvement of neglected, semi-domesticated crop species by targeting a few key genes and metabolic pathways [42,60]. Genomics have the capacity to expand the diversity of alleles in crop breeders' toolkit by digging into the gene pools of CWRs [23]. Genomics may be used to characterize CWR populations and help their conservation and utilization or can be used to create CWR reference genomes that are useful in comparative genomics analyses [20,50]. However, CWR materials present challenges to crop breeders by requiring genetic selection, to become useful tools for agronomic and breeding programs, through pre-breeding [26].
Orphan crops, also referred to as underutilized crop species (UCSs), are envisaged as the future climate-smart crops and are now gaining global recognition [44]. UCSs comprise neglected species, cultivated mainly in their centers of origin or centers of diversity by native inhabitants, where they are central to the sustenance of local communities through their special ecological, production, and consumption roles, and underutilized species, which were once widely grown but have since fallen into disuse for a range of agronomic, genetic, and economic reasons [61][62][63][64][65]. Nevertheless, these crops are nutrient dense [31,34,66,67] and highly adapted to marginal and complex environments, have contributed immensely to the diversification and resilience of agroecological niches [4,65,68], and already occupy special positions with regard to the region's socio-economic status and the preferences of local farmers or consumers [32]. These crops include millets (pearl, foxtail, finger, fonio, barnyard, etc.), tef, and Amaranthus hypochondriacus L., a pseudo-cereal crop, among others [12,34,44].
Millets can be used as valuable genetic breeding tools. For instance, foxtail millet harbors novel genes, alleles, and QTL for the genetic improvement of major cereal crops and bioenergy grasses [23,69]. Moreover, pearl millet is a climate-smart crop with superior nutritive value (greater amounts of zinc and iron) compared with wheat and other major cereals, in addition to being more resilient to climate stressors [27,46,70-72]. It is an alternative to major cereal crops because it can provide nutrition without requiring much water and has a much greater resilience to heat and drought than wheat, rice, and maize [10,27]. The climate-adaptive reproductive, phenotypic, and physiological characteristics of pearl millet enable it to thrive in marginal conditions characterized by limited soil water availability, poor soil fertility, high salt content, high temperatures, and scant rainfall, where major cereal crops perform dismally [72]. Fortunately, the pearl millet genome has been sequenced [73]. Other orphan cereal crops that have been whole-genome sequenced include foxtail millet [74,75], tef [32], fonio millet [60], and finger millet, among others [34,44]. Therefore, harnessing CWRs and better utilizing these UCSs will increase our genetic resource base and diversify our available gene pools, since these species have been identified as sources of novel abiotic-stress-related and nutrition-related traits that can be incorporated into major cereal crops [2,27,44,50,72,73]. However, to achieve global food security, research programs and political efforts will be necessary to make these UCSs widely available, rather than focusing solely on a few major staple food crops [10].

Advances in DNA Sequencing Technologies Accelerating Traits Discovery and Decoding Crop Species' Whole Genomes
Reference genome sequences, as the basis of crop genetic and genomic studies, provide insights into gene content, genomic variation, and genetic foundation for agronomic traits [2,76]. High quality reference genome assemblies are important in elucidating complex traits and fast-tracking crop improvement by facilitating easy identification of favorable genes harboring better agronomic traits [77,78]. Particularly, whole genome sequences lay bare detailed genomic features, encompassing coding and noncoding genes, repetitive elements, GC content, and regulatory sequences that are valuable resources for deciphering plant genes' functional roles [12].
The ground-breaking work of Sanger and his co-workers [79,80] initiated sequencing of DNA and genomes [12]. Particularly for the plant science, construction of the first complete genome sequence for Arabidopsis thaliana L. (Arabidopsis) in the year 2000 ushered in major strides for plant functional and comparative genomics studies [78,81]. Crop genomes of several major crops such as rice [82], maize [83], and sorghum [84] got decoded by the bacterial artificial chromosome (BAC)-based physical maps of the Sanger strategy [35,78]. The BAC physical maps used in Sanger sequencing offered a good template for completing gaps and errors, although the genome coverage of physical maps was sometimes non-representative due to cloning bias [35]. Thus, although the Sanger sequencing technology boasted of long read length and high assembly accuracy [85] and enabled the construction of 'standard' reference genomes for maize, rice, sorghum, and Arabidopsis [78], its widespread adoption suffered from its low throughput capacity and high cost of acquisition and operation [2].
Since 2010, however, there has been tremendous progress in whole genome sequencing of several plant species, including CWRs and orphan crops (Table 1, [33,50,86-88]). Crucially, the evolution of sequencing platforms has allowed the generation of large volumes of sequencing data within a short period of time and at reduced cost compared to the first-generation sequencing technologies [12]. In particular, the rapid development of second-generation or next-generation sequencing (NGS) technologies from around 2010 onwards has facilitated the assembly of hundreds of plant genomes (Table 1, [20,86]). Prominent among the NGS technologies has been the Illumina platform, with its high throughput (HTP) and lower cost [89].
With the aid of NGS technologies, re-sequencing of the plant genome and the whole transcriptome in greater depth has been made possible [35,90,91]. For example, McCormick et al. [90] used deep whole-genome sequencing, coupled with a high-density genetic map and transcriptome data, to update the sorghum reference genome sequence (ver. 1) and its annotation as well as to characterize additional features of the sorghum reference genome. They produced a resequenced high-quality sorghum reference genome (ver. 3) with improved sequence coverage (~29.6 Mb of additional sequence), a 24% increase in the number of annotated genes (to 34,211), increased average gene length, and a ten-fold reduction in the error rate (down to ~1 per 100 kbp) [90]. Moreover, sequencing of hundreds of related genomes within and between germplasm pools has facilitated the deciphering of genetic diversity [12,47]. However, NGS technologies suffer from short read lengths, which limit their ability to span long stretches of repetitive sequence, causing misassemblies in long repetitive regions and gaps in assemblies [2,20].
Table 1 notes: 3 Public year, publication year of the genome sequence information. 4 The wheat species Triticum urartu L. (einkorn wheat) is the progenitor species of the A genome. It is a diploid wild wheat which resembles cultivated bread wheat (AABBDD) more extensively than any other wheat species.
Encouragingly, third-generation sequencing (TGS) platforms such as the Oxford Nanopore MinION (Nanopore) and Pacific Biosciences (PacBio) have offered the best way to resolve transposon repeats: they generate long reads that span transposon regions, enabling distinct contiguous sequences to bridge the unknown locations and thereby facilitating the production of high-quality assemblies for complex genomes [78]. The PacBio and Nanopore long-read sequencing approaches are single-molecule real-time (SMRT) methods, producing long reads (in real time) of several thousand bases which can span complex and repetitive regions [2]. Such TGS technologies can yield single-molecule reads of ~60-200 Kb, with an average length of 10-20 Kb [78,98]. However, these TGS approaches suffer from higher error rates (approximately 13-18%) [81,99]. Various genome sequencing technologies across generations have been reviewed and compared in several papers [78,100,101]. The long-read sequencing technologies are usually coupled with optical mapping and conformation capture to generate draft genomes of unparalleled contiguity [20,102]. Overall, the introduction of long-read sequencing approaches has opened a window for de novo assembly (scaffolding) of genomes, resolution of sequence assembly ambiguities, and gap filling. Moreover, the enhanced genome assembly improves the spanning of regulatory sequences, consequently raising annotation efficiency and our capacity to identify functional genetic variations [78,103]. Further, this progress in genome sequencing has facilitated comparisons between related species and the identification of subtle genetic variations that may be key to the improvement of elite crops. For instance, the PacBio long-read single-molecule sequencing strategy has been successfully used to explore the subtle genomic variations between sweet and grain sorghum reference genomes, where it was observed that, among sucrose-metabolism-related genes, three sucrose transporters were either entirely eliminated or severely truncated in the sweet sorghum variety Rio. In addition, several other sucrose transporters and sucrose synthases showed differential expression between sweet and grain sorghum [91].

Approaches in Mapping of Genomic Regions Controlling Variation of Quantitatively Inherited Traits
Crop improvement largely depends on the availability and identification of genetic variation for the target traits and their utilization via breeding and transformation [35]. Meanwhile, the underlying gene regulatory processes governing crop biotic and abiotic stress responses are quite intricate, with several gene networks and stress signaling pathways being involved, and morpho-physiological traits affecting these crop responses are quantitatively inherited [104]. Beneficial alleles for various traits are located at specific chromosomal positions called QTL [49]. Therefore, the precise discovery of QTLs plays a crucial role in crop improvement through their manipulation via marker assisted selection (MAS) [105,106]. Fortunately, innovations in genomics-based methods offer access to these agronomically desirable alleles present at QTLs, and analysis of genome sequencing data and gene products facilitates the identification and cloning of genes at target QTLs [49]. For example, in maize crop improvement for drought tolerance, MAS has been employed to introgress QTL alleles for shortening the anthesis-silking interval [107].
Generally, the allelic variations of QTLs can be statistically linked with the value of a quantitative trait in two ways: across mapping populations (QTL or linkage mapping) or across suitable panels of accessions characterized by the presence of linkage disequilibrium (LD) (association mapping) [49,108]. QTL or linkage mapping is the more traditional method and has been widely applied to identify genomic regions (QTL) controlling target traits [2,108,109]. This family-based mapping analysis relies on the genetic recombination and segregation that occur in the progenies of bi-parental crosses during the construction of mapping populations, which ultimately determine the genetic mapping resolution and allele richness [104,109]. QTL mapping has proven to be, and remains, a powerful tool to identify loci that co-segregate with the trait of interest in the research population. The utility of QTL mapping is that it can be applied to different population types, such as F2 populations, doubled-haploid populations, and backcross or recombinant inbred line families, using different types of molecular markers [104,110]. Analysis of QTLs has identified several climate-related and nutrient-related QTLs in major cereal crops, as extensively reviewed [23,39,109-112].
Worryingly, compared to other crops, research in millets is still lagging behind [46,113]. This is despite the fact that millets are considered predominantly climate resilient crops [11,34,72] and could serve as valuable source of novel genes, alleles, and QTLs for tolerance to climate-change-induced abiotic stresses [11,23]. Moreover, despite the high number of studies on QTL mapping for complex traits such as drought tolerance in major cereal crops over the past decade, there has been little success in introgression of those QTLs, and the number of causal genes that have been confirmed within these QTL regions remains relatively small as compared to Arabidopsis and rice [114,115]. Therefore, the identification and functional characterization of those stress-tolerance genes, alleles, and QTLs in millets is critical for their introgression and improvement of climate change resilience in cereal crops [23]. Promisingly, it is envisaged that within the next decade, necessitated by rapid improvements in high-throughput genome sequencing, crop phenotyping, and gene transfer techniques, QTL cloning will increasingly become feasible, whereas MAS will remain a useful tool for major QTL screening [23,72,116]. Cloned QTL facilitate a more targeted search for novel alleles and will offer novel insights for genetic engineering of climate resilient cereal crops [109].
Recent advances in NGS have enabled the identification of major QTLs regulating specific plant phenotypes, via the development and deployment of enormous numbers of genetic markers such as single nucleotide polymorphisms (SNPs) and insertion-deletions (InDels), thereby providing an efficient way to enhance crop agronomic traits of economic importance, including in orphan crops. These developments in NGS facilitate the discovery of novel alleles/genes for various agronomic traits by the genotyping-by-sequencing (GBS) approach [23]. The greater abundance of SNPs means that they cover a greater number of loci; hence, they are located in huge pools across the genome and can be used to classify sets of polymorphic markers [12]. Additionally, SNPs are amenable to high-throughput and automated profiling [49], therefore allowing for quick, high-density SNP-marker-based genotyping [12,114]. As a result, SNPs have now overtaken other genetic markers such as simple sequence repeat (SSR) markers as the preferred markers for marker-assisted breeding, GWAS, genomic selection (GS), identification of disease-related alleles, or map-based cloning [12,114,117].
Genome-wide distributed high-density SNPs have greatly supported GWAS in delineating the slightest possible genome region associated with phenotypic variation in wide germplasm pools [42]. As a result, in the last few years, large-scale GWAS has become a key approach for mapping quantitative traits and studying the natural variation; GWAS is a powerful tool for performing effective and efficient genome-phenotype association analysis and identification of causative loci/genes for quantitative traits [81,104]. GWAS analysis approach involves evaluating statistical associations between DNA polymorphisms and trait variations across distantly related and heterogeneous individuals from a diverse collection that are genotyped and phenotyped for traits of interest [23,118]. Through screening large and diverse collections with ample genetic marker density, GWAS can effectively detect causal loci/gene underlying natural phenotypic variation; GWAS approach boasts robustness, high resolution, and effectiveness in the dissection of complex traits in crops [104]. Coupled with improved genome sequencing technology, GWAS enhances the mapping resolution for accurate location of allele/QTL/genes [23]. GWAS incorporates past recombination events in diverse association panels, and larger allele numbers, to identify genes linked to phenotypic traits at higher resolution than QTL analysis [2,104,108,119]. An increasing number of papers highlighting the use of SNPs in GWAS to detect genomic regions and candidate genes for various agronomic traits, including abiotic stress tolerance, in cereal crops are available for rice [120][121][122], pearl millet [72], barley [104,123], foxtail millet [124], sorghum [125], and several crops [12,42].
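As a rough illustration of the association testing described above, the following Python sketch regresses a phenotype on each SNP's allele dosage (0, 1, or 2 copies of the alternate allele) and ranks SNPs by the absolute t-statistic of the regression slope. The toy genotypes, phenotype values, and function names are hypothetical. Real GWAS pipelines additionally correct for population structure and kinship and apply multiple-testing thresholds, all of which this sketch omits.

```python
import math

def snp_t_statistic(dosages, phenotypes):
    """t-statistic of the slope from a simple linear regression of phenotype on dosage."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        return 0.0  # monomorphic SNP: no genotype variation to test
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(dosages, phenotypes))
    if rss == 0:
        # Perfect fit: the statistic is unbounded unless the slope itself is zero
        return 0.0 if slope == 0 else math.copysign(math.inf, slope)
    standard_error = math.sqrt(rss / (n - 2) / sxx)
    return slope / standard_error

def rank_snps(genotypes, phenotypes):
    """genotypes maps a SNP id to a list of dosages, one per accession."""
    scores = {snp: snp_t_statistic(d, phenotypes) for snp, d in genotypes.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical panel of eight accessions (illustrative values only)
phenotypes = [4.1, 5.0, 6.2, 3.9, 5.1, 6.0, 4.0, 6.1]
genotypes = {
    "snp_A": [0, 1, 2, 0, 1, 2, 0, 2],  # dosage tracks the trait closely
    "snp_B": [1, 0, 2, 2, 0, 1, 1, 0],  # weak relationship with the trait
    "snp_C": [0, 0, 0, 0, 0, 0, 0, 0],  # monomorphic
}

ranking = rank_snps(genotypes, phenotypes)
for snp, t_value in ranking:
    print(f"{snp}: t = {t_value:.2f}")
```

In this toy panel, snp_A ranks first because its dosage explains most of the trait variation, while the monomorphic snp_C carries no information and ranks last.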
Meanwhile, GS has also become one of the innovations holding promise in genomics-assisted breeding (GAB) [126,127], facilitating quick crop improvement without detailed study of individual loci [2,117]. GS enables crop breeders to increase the genetic gain per breeding cycle, consequently enhancing the speed and efficiency of breeding programs [72,128]. In GS, several cycles of selection are used to accumulate favorable alleles that are associated with desired phenotypes, although no causal association between a specific gene and a phenotype is established [35,129]. Genome-wide HTP markers that are in LD with QTL are used to estimate marker effects through optimum statistical models, and genomic estimated breeding values (GEBVs) are then calculated for each individual to select potential elite lines [72,117,127,129]. Two population types are needed in GS: a training population (also known as the reference population), composed of a cohort of individuals with both genotypic and phenotypic data, and a testing (or breeding) population, consisting of candidate breeding lines with genotypic data only [127,129-131]. The predicted GEBVs are then used for selection, without the need for further phenotyping [72,114,132]. In this way, GS significantly shortens the breeding cycle compared to conventional breeding methods. The utility of GS is that it can facilitate selection for complex traits, including tolerance to drought, heat, cold, flooding, etc. [2]. Additionally, with GS, selection decisions can be made during the off-season, resulting in genetic gain improvements on an annual basis [133]. Already, GS has been shown to be an economical and viable alternative to MAS and phenotypic selection for quantitative traits and has fast-tracked breeding programs in cereal crops [72,127]. For these reasons, GS is suggested to hold great promise for adapting cereal crops to climate change [129].
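The two-population workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recommended GS model: marker effects are estimated on the training population by ridge regression (one of several statistical models used for GS; the source does not prescribe one), and GEBVs are then predicted for candidate lines that have genotypes only. The marker matrix, phenotypes, penalty value, and function names are all hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_marker_effects(genotypes, phenotypes, ridge=1.0):
    """Estimate marker effects on the training population (ridge regression)."""
    n, p = len(genotypes), len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    y_mean = sum(phenotypes) / n
    xc = [[row[j] - col_means[j] for j in range(p)] for row in genotypes]
    yc = [y - y_mean for y in phenotypes]
    # Normal equations with a ridge penalty on the diagonal: (X'X + ridge*I) e = X'y
    xtx = [[sum(xc[i][a] * xc[i][b] for i in range(n)) + (ridge if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    xty = [sum(xc[i][a] * yc[i] for i in range(n)) for a in range(p)]
    return solve_linear_system(xtx, xty), col_means, y_mean

def predict_gebv(genotypes, effects, col_means, y_mean):
    """GEBV = training mean + sum of (centered dosage x estimated marker effect)."""
    return [y_mean + sum((g - mu) * e for g, mu, e in zip(row, col_means, effects))
            for row in genotypes]

# Training population: genotypes (allele dosages at 3 markers) AND phenotypes.
# Toy phenotypes were generated as 10 + 2*marker1 - 1*marker2 (hypothetical).
train_genotypes = [[0, 0, 1], [1, 0, 0], [2, 0, 2], [0, 2, 1],
                   [1, 1, 0], [2, 2, 2], [0, 1, 2], [2, 1, 0]]
train_phenotypes = [10, 12, 14, 8, 11, 12, 9, 13]

# Breeding population: genotypes only
candidate_genotypes = [[2, 0, 1], [0, 2, 1]]

effects, col_means, y_mean = fit_marker_effects(train_genotypes, train_phenotypes)
gebvs = predict_gebv(candidate_genotypes, effects, col_means, y_mean)
for line, gebv in zip(candidate_genotypes, gebvs):
    print(line, round(gebv, 2))
```

The first candidate carries the favorable allele at marker 1 and lacks the unfavorable allele at marker 2, so it receives the higher GEBV and would be the line selected without any phenotyping of the candidates.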

Broadening Crop Genetic Diversity through Mutagenesis
The rigorous screening applied by crop breeders in the crop domestication process and eventual breeding of elite cultivars has resulted in considerable decline in natural genetic diversity. Consequently, this genetic erosion has become a bottleneck in further crop improvement efforts [134]. However, it is well known that the availability of heritable genetic variation is a prerequisite for any crop improvement program [30,135,136]. Therefore, broadening crop genetic base by induced mutations has become a common tool for creating genetic variation for use in crop improvement programs [108].
Aside from recombination, plant mutation induction by physical mutagens (ionizing radiation such as X-rays, gamma rays, fast neutrons, etc.) and chemical mutagens (ethyl methane sulphonate, methyl methane sulphonate, sodium azide, etc.) is the most common approach for generating novel variations [136]. The resulting populations generated by physical and chemical mutagenesis are then screened for mutants with desirable phenotypes, and the genetic basis underlying those phenotypes is deciphered through mutant characterization [137]. Mutations induced by ionizing radiation can increase the natural mutation rate by approximately 1 × 10^3 to 1 × 10^6-fold (www.iaea.org, accessed on 28 March 2021) and have been widely used to generate heritable genetic variability in the development of novel crop cultivars for the past century, generating billions of additional dollars in the process [108].
Whereas physical mutagens such as gamma rays and neutrons often result in large-scale deletions of DNA and alterations of chromosomal structure, chemical mutagens often affect single nucleotide pairs, with the extent of mutation depending on the tissue targeted and on the time and degree of exposure [136,138]. The aim in mutagenesis breeding is to cause maximal genetic variation with minimal decline in viability [134]. Therefore, mutations at single nucleotide pairs are of particular interest to crop breeders, since large-scale alterations in chromosome structure usually have serious deleterious consequences [138]. Application of mutation induction techniques has generated a considerable amount of genetic variability, thereby playing an important role in plant breeding, genetics, and advanced genomics studies (www.iaea.org, accessed on 28 March 2021). Thus, mutagenesis-based crop improvement programs have benefited from new genetic variation and novel traits [108]. Consequently, the generated mutant crop cultivars have contributed considerably to global food and nutrition security [139]. For instance, several mutant cultivars with enhanced productivity, abiotic stress tolerance, biotic stress resistance, and improved nutritive value have been developed in different cereal crops (Table 2, [136,140]).
Meanwhile, modern innovations in reverse genetics and gene discovery tools, coupled with recent advances in genome sequencing, bioinformatics, and HTP technologies, have brought heightened interest in the use of chemically induced mutations for crop improvement [145,146]. Particularly, the advent of TILLING (targeting induced local lesions in genomes) technology has revolutionized crop breeding based on chemically induced mutagenesis [134,147]. TILLING is a reverse genetics tool combining traditional chemical mutagenesis with a high-throughput mismatch detection technique to identify series of single base pair allelic variations within a gene of interest [138,147]. This approach generates a wide range of mutant alleles, is fast and automatable, and is applicable to any organism that is amenable to chemical mutagenesis [148]. TILLING has been successfully applied in different cereal crop species, including maize, wheat, rice, and sorghum, as comprehensively reviewed in previous articles [134,145,147].
Although TILLING populations have been conventionally used for reverse genetic approaches, current innovations in whole-genome sequencing have opened new opportunities for the identification of mutants in candidate genes, and the availability of sequence information from entire mutant population will permit the use of TILLING populations for forward genetic methodologies such as starting from an interesting phenotype detected in certain mutant lines to clone the underlying genes via association mapping [149]. For instance, the utility of TILLING technology has been proven in the cloning of wheat gene Stb6 [150]. The gene Stb6 encodes a conserved wall-associated receptor kinase (WAK) and confers pathogen resistance to the fungal Septoria tritici blotch (STB) disease. In a cultivar (Cadenza) harboring Stb6 gene, Saintenac et al. [150] identified nonsense and mis-sense mutations for two of the candidate genes they had observed via genetic mapping. Mutations in a gene encoding WAK showed increased susceptibility compared to the wild-type control, confirming that this WAK gene is Stb6 [150]. In the face of the changing global climate, TILLING technology holds new prospects for gene cloning disease resistance and abiotic stress genes in cereal crops [149]. For a detailed, and more recent, excellent review on TILLING in cereal crops for allele expansion and mutation detection, we refer you to Irshad et al. [151].
In cereal crops such as rice, insertional mutagenesis (IM) is an important tool for large-scale functional genomics analyses and gene discovery, using molecular tags such as T-DNA, activator/dissociation (Ac/Ds) insertions, transposons, or retrotransposons [152,153]. Where the biological functions of redundant and vital genes cannot be identified with knockout mutants generated through chemical or physical mutagens [154], IM coupled with the gene activation tagging technique plays a key role in circumventing the problem. The IM approach has been widely used to generate mutant libraries that allow for the easy identification of tagged genes by PCR-based techniques such as TAIL-PCR (thermal asymmetric interlaced PCR) or inverse PCR [152]. Generation of large mutant populations using the IM approach is hindered by its stringent need for plant transformation methods and its low mutation frequency. However, in crops such as rice, efficient plant transformation protocols and the availability of a wide range of transformation vectors have facilitated the increased use of the IM approach in functional genetic studies [152,155,156]. In particular, the IM approach has been successfully used in japonica cultivars, aided by the availability of reliable and reproducible Agrobacterium tumefaciens-mediated transformation techniques [155,157,158]. In this technique, the mutagen performs the role of a molecular tag, facilitating the identification of disrupted genes. When coupled with the desirable phenotype, the insertion tag will facilitate the isolation of the gene of interest [159]. For a detailed review on the use of the IM approach in cereal crop functional genomics, we refer you to Ram et al. [152].
In relation to the future of plant mutagenesis, we hold the view that with the space explorations getting increasingly frequent, cosmic radiation induced mutations will become more important in generating novel allelic variations essential for strengthening crop breeders' toolbox. It will be interesting to see how the new crop cultivars generated from such technologies will impact crop nutritional quality and human safety concerns.

Use of Sequence Specific Nucleases for Precise Gene Editing for Crop Improvements
Recent progress in genome engineering has revolutionized the precise editing of DNA sequences in living cells, thereby prompting targeted plant genetic manipulations for our benefit [160]. DNA can now be altered in various ways, such as the introduction of specific nucleotide substitutions in a gene that alter a protein's amino acid sequence, deletion of genes or chromosomal segments, and insertion of foreign DNA at specific genomic sites [161]. Such programmed DNA sequence modifications are facilitated by sequence-specific nucleases (SSNs), which are directed to modify target genes by creating double-strand breaks (DSBs) at the specific genomic loci to be altered [162,163]. The double-stranded (ds) DNA is cleaved at particular loci mainly by three programmable SSNs, viz., zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and CRISPR-Cas9 [161,164,165]. The DNA DSBs then undergo natural repair, via either homologous recombination (HR) or error-prone nonhomologous end joining (NHEJ) [166][167][168]. Repair of these DSBs can be controlled to achieve the desired sequence modifications, such as DNA deletions or insertions of large arrays of transgenes [160]. For instance, the NHEJ repair mechanism causes mutation-like random insertions or deletions (InDels) or substitutions [169]. If such an InDel occurs in the coding region of a gene, it may cause a frameshift mutation, resulting in a target gene knockout [170]. On the other hand, the HR mechanism accurately repairs DNA DSBs by integrating a DNA donor containing homologous overhangs at the target site [166,171]. For detailed reviews on DNA DSB repair mechanisms, we refer you to previous articles [164,167,172,173].
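To make the frameshift point concrete, the short Python sketch below classifies an NHEJ-style InDel in a coding sequence by its net length change: a change that is not a multiple of three nucleotides shifts the reading frame, whereas a multiple of three removes or adds whole codons. The sequences and the function name are hypothetical, and the sketch deliberately ignores other ways an edit can disrupt a gene (for example, an in-frame premature stop codon or a disrupted splice site).

```python
def classify_indel(reference_cds, edited_cds):
    """Classify an edit in a coding sequence by its net change in length."""
    delta = len(edited_cds) - len(reference_cds)
    if delta == 0:
        return "no length change"  # e.g. a substitution; the frame is preserved
    kind = "insertion" if delta > 0 else "deletion"
    if delta % 3 == 0:
        return f"in-frame {kind} ({abs(delta)} bp)"
    return f"frameshift {kind} ({abs(delta)} bp)"

reference = "ATGGCCAAGTAA"  # hypothetical 12-nt coding sequence (4 codons)

print(classify_indel(reference, "ATGGCAAGTAA"))    # 1-bp deletion
print(classify_indel(reference, "ATGAAGTAA"))      # 3-bp deletion
print(classify_indel(reference, "ATGGCCTTAAGTAA")) # 2-bp insertion
```

Only the 1-bp deletion and the 2-bp insertion shift the reading frame, which is why such edits are the ones typically sought when the goal is a knockout.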
ZFNs are chimeric proteins in which a non-specific DNA cleavage domain is fused to a synthetic zinc finger-based DNA-binding domain [164,171]. ZFNs facilitate the creation of site-specific breaks in ds DNA, although the cleavage domain itself has no pre-determined target site. When an array of zinc fingers specific to the target sequence is fused with the endonuclease domain, usually the non-specific cleavage domain from FokI, a type IIS restriction endonuclease, the break can be directed to a desired site [171]. FokI has an N-terminal DNA-binding domain and a C-terminal domain possessing non-specific DNA cleavage activity. In a ZFN, the DNA-binding domain consists of an array of zinc fingers (3-6 in number), with each finger capable of recognizing a target DNA sequence of three consecutive nucleotides (3 base pairs) [165]. With three or four fingers, 9-12 nucleotides of the DNA sequence are recognized per protein monomer [173]. Two ZFN monomers recognize their respective target sequences, which lie in close proximity on opposite strands and are separated by a 5-8 bp spacer, and their FokI domains dimerize (aligning in reverse orientation to each other) [165]. This results in a staggered cut in the ds DNA; the spacing between the two zinc finger binding arrays is a critical design component of the ZFN, as it allows the FokI monomers to dimerize and generate the required DSB in the target DNA sequence. Consequently, a functional ZFN dimer has a recognition site of 18-24 nucleotides in length (excluding the spacer region) [165,173]. The high specificity of ZFNs has promoted their wide application in targeted gene editing in plants and animals [164,165].
Similar to a ZFN, a TALEN is a chimeric protein in which the non-specific DNA cleavage domain of the FokI nuclease is fused to a DNA-binding domain [164]. However, in TALENs, the DNA-binding domain is a transcription activator-like effector (TALE) array [165,171,174]. In nature, TALEs are type III effector proteins synthesized by Xanthomonas bacterial species to promote virulence in plants such as rice and cotton [175,176]. A TALE DNA-binding domain is an array (of up to 30 copies) of highly conserved tandem repeats of 34 amino acids each [173], with each repeat recognizing one nucleotide in the target sequence [165]. The tandem repeats are followed by a truncated repeat of 20 amino acids, commonly known as the half repeat [165]. Within each repeat, variability is confined to two positions (12 and 13), known as the repeat variable di-residue (RVD), which determines the nucleotide that the repeat recognizes [165,170]. By choosing the series of RVDs to match the target sequence, TALENs can be designed for targeted, site-specific DNA cleavage [170]. The mechanism of action of TALENs is the same as that of ZFNs, but designing TALENs is considerably easier [165]. In terms of utility, TALENs offer greater specificity and efficiency and are considerably cheaper and quicker to assemble than ZFNs [177,178]. Consequently, the TALEN approach has been used for precise genome editing and has shown great potential for various applications in biotechnology, synthetic biology, and crop improvement programs [164,170,173,178].
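Because each repeat reads one nucleotide through its RVD, designing a TALE array amounts to translating the target sequence base by base. The sketch below assumes the commonly reported RVD code (NI for A, HD for C, NN for G, NG for T), which is not given in this review; NN is also reported to bind A, so the mapping should be treated as a simplification. The function name is hypothetical.

```python
# Assumed canonical RVD code (not taken from this review).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target: str) -> list:
    """Translate a DNA target into one RVD per nucleotide, in order."""
    rvds = []
    for base in target.upper():
        if base not in RVD_FOR_BASE:
            raise ValueError(f"unsupported base: {base!r}")
        rvds.append(RVD_FOR_BASE[base])
    return rvds


# Hypothetical 4-nt target, far shorter than a real TALE binding site.
print(rvds_for_target("TCGA"))
```

A real design would also account for the preferred base preceding the site and for the final half repeat, which this sketch leaves out.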
Among the three programmable SSN-based approaches used for genome editing, CRISPR-Cas9 represents a significant advance within the field of genetics and molecular biology [30,179] and has garnered much attention because of its accuracy, speed, adaptability, and simplicity [162,180,181]. The CRISPR-Cas9 system is a prokaryotic immune system, originally found in bacteria and archaea [164,177], that destroys targeted DNA to defend the organism against invading viral and plasmid DNA [166]. CRISPRs are short repeats (around 24-48 nucleotides in length) interspaced by spacers, which are fragments of foreign DNA from earlier invaders against which protection is to be deployed. The corresponding sequences in the invading DNA, known as protospacers, are always associated with a protospacer adjacent motif (PAM) [177]. Among the Cas effectors, class 2 effectors, such as Cas9 (type II) and Cpf1 (type V), are the most widely used [182]. The principal components of the CRISPR-Cas9 system are the Cas9 protein, the trans-activating CRISPR RNA (tracrRNA), and the CRISPR RNA (crRNA) [166]. For genome editing, the tracrRNA and crRNA are normally engineered into a single guide RNA (sgRNA) [177,184], whose target-specifying sequence is usually about 20 nucleotides long [183].
In its native role, the CRISPR-Cas9 system uses RNA-directed cleavage of foreign DNA to offer protection against invading plasmids and viruses [177]. In simple terms, when bacteria detect the presence of viral DNA, they produce RNA corresponding to that of the invading virus. This RNA then recruits the Cas9 protein and guides it to the complementary viral DNA, which the Cas9 protein then cleaves, eliminating it from the bacterium [185]. Thus, co-expression of Cas9 and an engineered sgRNA forms a sequence-homology-dependent endonuclease that generates DNA DSBs within a specified target sequence in the genome [166,184]. The DNA DSBs can then be repaired by the cell's endogenous DNA repair machinery, either via NHEJ, possibly resulting in mutations, or through homology-directed repair [179,183].
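To make the targeting rule concrete, the following Python sketch scans a sequence for candidate Cas9 sites, that is, a stretch of about 20 nucleotides immediately followed by a PAM. It assumes the NGG PAM commonly associated with Streptococcus pyogenes Cas9, a detail not stated in this review, and it searches only the strand supplied. The function name and the test sequence are hypothetical, and no off-target or efficiency scoring is attempted.

```python
import re


def find_cas9_targets(seq: str, guide_len: int = 20) -> list:
    """Return (start, protospacer, pam) for every site followed by NGG.

    Only the supplied strand is scanned; the NGG PAM is an assumption.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical sequence: 20 A's followed by a TGG PAM.
for start, protospacer, pam in find_cas9_targets("A" * 20 + "TGG"):
    print(start, protospacer, pam)
```

A fuller tool would also scan the reverse complement and rank candidates, which is outside the scope of this sketch.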
Since its description as 'genetic scissors' by Emmanuelle Charpentier of the Max Planck Unit (Berlin, Germany) and Jennifer A. Doudna of the University of California (Berkeley, CA, USA) in 2012 [186], which eventually won them the 2020 Nobel Prize in Chemistry for 'the development of a method for genome editing' [185], CRISPR-Cas9 technology has evolved into the most powerful tool worldwide for precise genome editing and generation of genetic models for both fundamental and applied research [161,183,187,188]. In particular, the ability of the CRISPR-Cas9 system to localize a protein to a specific DNA sequence has opened up several new opportunities, and the improvement of Cas9 has spawned several other potentially remarkable DNA manipulation tools and techniques [165,179]. The system is now routinely employed for gene knockout, gene insertion, and gene replacement in different model genetic organisms [164,189-191]. Thus, CRISPR-Cas9 technology has been widely used to generate nutrition-improved and climate-resilient cultivars in cereal crops (Table 3, [180,181,192-195]). For comparison amongst different SSNs, we refer the reader to previous detailed reviews [161,165,195].
Table 3. Selected examples of significant gene targeting studies in cereal crops using the CRISPR-Cas9 system [164,165,180,183].
However, despite the promise of advanced gene editing technologies for crop improvement, global food security, and climate adaptation, public and scientific concerns related to ethics, together with unsubstantiated concerns about the human and environmental health and safety of genetically engineered crop cultivars (GECs), have been raised [164]. The resultant government regulatory frameworks aimed at safeguarding human and environmental health have imposed major cost barriers to the swift, widespread adoption of newly developed GECs [140,206].
Going forward, the extent of the government regulations imposed on GECs will have a huge bearing on the cost of further developing GECs and on how rapidly they will be adopted for food and feed [68]. Additionally, the general public's readiness to embrace GEC-derived food products will determine the extent to which these gene editing approaches are adopted in crop improvement programs, especially in least developed countries, where cereal crops are major providers of staple diets [207,208].

Double Haploid Technique as a Tool for Accelerated Crop Breeding for Climate Resilience
Speed breeding of climate-resilient and nutritionally superior crops aims to optimize and integrate the parameters that affect plant growth and reproduction so as to shorten generation times and the period taken to observe phenotypes, especially in the context of climate change [209]. Double haploid (DH) technology has made these speed breeding targets achievable. Double haploids (DHs) are plants derived from a single pollen grain whose chromosomes are doubled artificially to form homozygous diploids [210], with a DH individual possessing two identical copies of each chromosome/gene [211], so that the amount of recombination information is equivalent to that of a backcross [210]. The advantage of DH technology over conventional breeding approaches is that it achieves complete homozygosity in a single generation, thereby significantly shortening the time required to produce pure lines [211]. Consequently, DH technology has had a significant impact on reducing time, labor, and cost in crop improvement programs [212]. Additionally, because all individuals are homozygous, DHs can be transferred between different labs and environments for assessing the effect of the environment on gene expression [210]. Therefore, DHs are ideal for estimating QTL × environment interactions, as complete homozygosity allows better estimates of trait means and facilitates more accurate selection across locations and years [211].
The development of DH plants allows crop breeders to achieve homozygosity in segregating populations in a single generation, as compared to 5-7 generations by conventional breeding methods. This permits selection of stable lines to start much earlier [210]. Therefore, DHs provide a time advantage for incorporating quantitative traits that cannot be readily selected in the early segregating generations arising from conventional crosses [213], and they significantly reduce the size of the populations needed to find a desired genotype [214]. Moreover, DHs provide efficient screening material for desired mutants and for complex traits [214]. Therefore, combining the DH breeding approach with MAS and other new techniques, such as directed in vitro mutagenesis and in vitro screening, can be a vital tool for effective selection and efficient incorporation of complex traits such as drought, cold, and salinity tolerance [213,214]. Further, complete homozygosity facilitates more precise phenotyping and allows accurate gene-trait association in genetic mapping and gene function studies [211,215].
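The generation counts above can be related to the textbook expectation that selfing halves heterozygosity at a locus each generation, so that expected homozygosity after n generations of selfing from a fully heterozygous start is 1 - (1/2)^n. This single-locus Mendelian model is an assumption brought in for illustration and is not derived in this review; the function name is hypothetical.

```python
def expected_homozygosity(selfing_generations: int) -> float:
    """Expected homozygous fraction at one locus after repeated selfing,
    starting from a fully heterozygous parent (simple Mendelian model)."""
    if selfing_generations < 0:
        raise ValueError("number of generations must be non-negative")
    return 1.0 - 0.5 ** selfing_generations


for n in (1, 5, 7):
    print(f"{n} generation(s) of selfing: {expected_homozygosity(n):.4f}")
# A doubled haploid is homozygous at every locus after a single step.
```

Under this model, 5-7 generations of selfing approach but never reach full homozygosity, which is the gap that chromosome doubling of a haploid closes in one generation.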
Although DH technology suffers from the limitations associated with anther/microspore culture (including low regeneration rate, highly genotype-specific response, high frequency of callogenesis, and low recovery of DH plants) [214], significant crop-specific protocols [140] and technology improvements have facilitated its increased application in cultivar improvement, genetic mapping, mutagenesis, and gene functional analyses [211,216]. Thus, DHs have been applied in genetic research and crop improvement efforts in cereal crops, including barley, maize, wheat, rice, and rye [211,213,217-220]. For instance, DHs created from haploid wheat plants developed by anther culture or by fertilization with maize pollen have been used for wheat genetic research and breeding. In durum wheat, maize-pollen-derived haploids have been more feasible than anther culture, and the rate of DH plant production in the durum × maize system has been significantly improved [213].
Another practical example is the successful use of DH technology in commercial maize breeding, where maternal haploids are produced and their chromosomes doubled to generate DHs (instant inbred lines), while paternal haploids produced through the indeterminate gametophyte1 mutation are utilized to convert male-fertile lines into cytoplasmic male-sterile lines [221,222]. This has significantly reduced the breeding time from about 7-8 generations/seasons to two generations, thereby making the breeding of maize more efficient and economical [221]. Notably, since 2010, DHs have been progressively adopted in CIMMYT maize breeding programs, steadily replacing pedigree breeding. CIMMYT has since developed more than 200,000 maize DH lines from diverse source populations and identified several maize DH lines harboring superior traits for use in maize breeding programs across Sub-Saharan Africa, Latin America, and Asia [223]. Going forward, the successful production of DHs on a routine basis would shorten the cultivar development period and provide excellent recombinant inbred lines for molecular mapping applications [210].

The Integral Role of Crop Phenotyping in Complementing Crop Genotyping
Phenotyping encompasses the measurement of observable traits that reflect the biological functioning of gene variants or alleles as influenced by the environment; in general, phenotyping for crop improvement via breeding calls for the assessment of hundreds or thousands of genetic lines [224]. In other words, phenotyping refers to the application of methodologies and protocols to quantify specific traits related to plant structure or function, with these traits ranging from the cellular to the whole-plant level [225,226]. A trait subject to phenotyping can therefore be any physiological, morphological, or phenological feature, from the cell to the whole-plant level [227]. Combining these definitions, Yang et al. [228] have described crop phenomics as the multidisciplinary study of the high-throughput, accurate acquisition and analysis of multidimensional phenotypes on an organism-wide scale throughout crop development.
Phenotyping has become an integral component of crop improvement programs by helping to dissect the genetic bases of crop phenotypic traits, particularly those related to yield and stress tolerance [229-231]. For instance, in crop drought tolerance improvement programs, yield and yield-attributing factors (the primary traits) are targeted for direct selection, whilst secondary traits (traits other than yield, such as root architecture, anthesis-silking interval, stay-green, and leaf rolling) are vital in conferring drought tolerance and contribute to final yield indirectly [232-234]. Conventionally, phenotyping of secondary traits has involved field assessments of easily observed and scored morphological characters such as plant height, flowering date, and leaf number. However, researchers have since discovered that tolerance to abiotic stresses, such as drought, involves metabolic and regulatory functions, for which measurements of targeted processes are more likely to offer valuable information about the underlying biology, and they have developed better methods for assaying such traits [224].
Several component traits of plant acclimation to environmental stresses are controlled quantitatively. Therefore, enhancing phenotyping accuracy has become imperative for improving the heritability of these traits. Additionally, the target quantitative traits require rapid and precise measurement [114]. Researchers have since used physiological phenotyping approaches in controlled-environment studies to determine the mechanistic basis of abiotic stress responses [224]. More importantly, the phenotypic data generated from physiological phenotyping have been utilized to identify QTLs or genes through QTL mapping, association mapping, and GWAS, in support of GAB for crop improvement [228]. However, despite their positive contributions, many of these physiological phenotyping approaches are detailed, time-consuming, and expensive, and can only be effectively applied to a limited number of genotypes [224,226,235]. Consequently, there has been a critical limitation in applying physiological information in crop improvement programs [236].
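The link between phenotyping accuracy and heritability can be illustrated with the standard entry-mean expression for broad-sense heritability in a single environment, H2 = Vg / (Vg + Ve / r), where Vg is the genotypic variance, Ve the residual (measurement and micro-environment) variance, and r the number of replicates. This formula is a textbook assumption rather than something stated in this review, and the variance values below are hypothetical.

```python
def entry_mean_heritability(var_g: float, var_e: float, n_reps: int) -> float:
    """Broad-sense heritability on an entry-mean basis (single environment)."""
    if n_reps < 1:
        raise ValueError("at least one replicate is required")
    return var_g / (var_g + var_e / n_reps)


# Hypothetical variances: halving the residual variance through more
# precise phenotyping raises heritability at the same replication level.
noisy = entry_mean_heritability(var_g=2.0, var_e=6.0, n_reps=3)
precise = entry_mean_heritability(var_g=2.0, var_e=3.0, n_reps=3)
print(f"noisy assay: {noisy:.2f}, more precise assay: {precise:.2f}")
```

The same expression shows why adding replicates and reducing measurement error are interchangeable routes to a more heritable trait estimate.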
Precise physiological phenotyping of specific plant traits is crucial in enhancing crop breeding programs, although characterization of non-visual (physiological) traits is difficult and complicated [225,230]. Accurate phenotyping and its effective integration into crop breeding programs will entail clearly specifying and differentiating the physiological phenotyping methodology at every breeding step. More specifically, there will be a need to (i) find evidence that a hypothesized plant trait will lead to crop improvement, (ii) understand the underlying basic process of the trait to guide the development of physiological screens, and (iii) develop multi-tier phenotypic screens that provide insight into trait expression at various phases of the breeding process [225].
Meanwhile, in the last decade, the rapid advances in HTP crop genotyping techniques have not been matched by a corresponding pace of progress in crop phenotyping methods [225,229]. Without high-quality phenotypic data, the enormous volume of genotyping data cannot be effectively used for crop improvement [237]. Promisingly, recent advances in robotics, information technology, and data extraction and analysis, coupled with systems integration, have revolutionized plant phenotyping [228], with HTP phenotyping platforms (HTPPs) being developed to keep pace with the significant advancement in genotyping techniques and to enhance the efficiency of crop improvement programs [238]. Crop morphology and physiology can now be assessed non-destructively and repeatedly across whole plant populations and throughout development, more quickly and at lower cost [228]. Specifically, we can now record trait data via sophisticated non-invasive imaging, remote sensing, spectroscopy, image analysis, robotics, high-performance computing facilities, and phenomics tools and databases. These tools include red-green-blue (RGB) imaging, magnetic resonance imaging (MRI) and positron emission tomography (PET), near-infrared (NIR) spectroscopy, canopy spectral reflectance (SR) and infrared thermography (IRT), nuclear magnetic resonance, hyperspectral imaging, laser imaging, 3D imaging, and geographical information systems (GIS), among others [228,230,239-242].
These new phenomics methodologies aim to significantly reduce the time required to gather essential data on traits such as plant architecture, photosynthesis, growth, or biomass productivity on thousands of individual plants, from weeks or days to hours [230,238]. For instance, the use of robotics for measuring large numbers of plants means that large numbers of genotypes can be readily phenotyped [243-245]. Such advances in crop phenotyping are anticipated to provide crop researchers with the tools and knowledge essential for unlocking the information coded in plant genomes [238]. Therefore, HTP phenotyping now provides an essential link in translating laboratory research to the field. This is vital in developing novel genotypes that incorporate gene(s) expressing promising trait(s) into breeding lines adapted to target field environments [246]. For example, 3D visual modelling can be used to determine the plasticity of the canopy architecture and to evaluate the architectural and physiological characteristics that contribute to the higher productivity of super rice varieties under drought stress conditions [247].
Thus, image-based phenotyping is currently being deployed for crop growth and disease monitoring in the main cereals using HTPPs [240,248,249]. Such HTPPs are capable of acquiring quantitative plant information from large populations by minimally invasive or non-invasive methods integrated into screening protocols. Moreover, HTPP methodologies are amenable to deployment for phenotyping in both field and controlled (greenhouse and growth chamber) environments [225,250]. Although currently very expensive, upscaling the utilization of these HTPPs will eventually enhance our understanding of crop growth kinetics and help us improve crop models for systems biology and the breeding of climate-resilient cereal crop cultivars.
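As a minimal illustration of how an RGB image can be turned into a quantitative trait, the sketch below estimates canopy cover as the fraction of pixels classified as green vegetation. It assumes the excess green index (2g - r - b on normalized channels), a widely used vegetation index that is not described in this review, and an arbitrary threshold of 0.1. The pixel values and function names are hypothetical, and pixels are plain (R, G, B) tuples rather than a real image file.

```python
def excess_green(r: int, g: int, b: int) -> float:
    """Excess green index computed on brightness-normalized channels."""
    total = r + g + b
    if total == 0:
        return 0.0  # a black pixel carries no colour information
    return (2 * g - r - b) / total


def canopy_cover(pixels, threshold: float = 0.1) -> float:
    """Fraction of pixels whose excess green index exceeds the threshold."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    green = sum(1 for r, g, b in pixels if excess_green(r, g, b) > threshold)
    return green / len(pixels)


# Hypothetical four-pixel "image": two leaf-like and two soil-like pixels.
image = [(30, 120, 40), (120, 100, 80), (20, 200, 30), (90, 90, 90)]
print(canopy_cover(image))
```

In practice the threshold would need calibration for each camera, crop, and lighting condition.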

Unlocking the Roles of Plant Long Non-Coding RNAs (lncRNAs) in Regulating Plant Stress Responses and Adaptation
Technical innovations in genomics and bioinformatics, particularly the extensive use of high-throughput sequencing technology, have facilitated the discovery of more transcriptional units lacking protein-coding potential [251,252]. Such RNA units are now known as non-protein-coding RNAs (ncRNAs), and they include small RNAs (sRNAs, 18-30 nucleotides; nt), medium-sized ncRNAs (31-200 nt), and long non-coding RNAs (lncRNAs, >200 nt) [253,254]. In particular, microRNAs (miRNAs) are sRNAs, usually 21 nt long, which direct the degradation or translational inhibition of their mRNA targets, thereby suppressing the target genes [254]. The growing interest in studying plant ncRNAs has led to the development of several databases and tools that support the identification and annotation of plant ncRNAs [254,255].
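The length classes given above can be written down as a simple lookup, which is often the first filter applied to candidate transcripts. The sketch below encodes only the nucleotide ranges quoted in the text; the function name is hypothetical, and a real annotation pipeline would additionally test coding potential, which length alone cannot establish.

```python
def classify_ncrna(length_nt: int) -> str:
    """Assign a non-coding transcript to a size class by length in nt."""
    if length_nt < 18:
        return "below sRNA range"
    if length_nt <= 30:
        return "sRNA"
    if length_nt <= 200:
        return "medium-sized ncRNA"
    return "lncRNA"


for length in (21, 150, 200, 201):
    print(length, classify_ncrna(length))
```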
Among the several ncRNAs, lncRNAs have attracted much attention in genomics and stress response studies [252,256,257]. The lncRNAs have been defined as a diverse class of RNA transcripts containing >200 nt, with little or no significant protein-coding ability, but possessing critical roles in diverse cellular processes and plant abiotic stress responses [252,253,256]. Generally, lncRNAs are transcribed by RNA polymerase II or III, as well as polymerase IV/V in plants [255,258]; they are processed with or without splicing and polyadenylation, and can be localized in the nucleus or cytoplasm [251].
An increasing body of evidence has shown that plant lncRNAs play vital roles 'as biological regulators' in diverse cellular processes, including cell differentiation, genomic imprinting, epigenetic modification, and stress response, among others [252,259-263]. Some plant lncRNAs act as primary transcripts of small regulatory RNAs such as miRNAs and siRNAs, and also play roles in phosphate homeostasis and protein re-localization [251]. Additionally, lncRNAs have recently been functionally characterized in plant stress response mechanisms [264-267]. For instance, using a deep transcriptomic sequencing approach, researchers identified 584 lncRNAs responsive to simulated drought stress in foxtail millet [268]. In another study, researchers used a deep RNA sequencing approach to identify lncRNAs responsive to combined salinity and boron stress in a hyper-arid Lluteño maize landrace from the Atacama Desert, finding 1710 lncRNAs that were putatively responsive to the combined stresses [252]. Similarly, 98 drought-responsive lncRNAs were observed to regulate drought-responsive regulatory genes involved in various metabolic processes in rice [269], whereas 77 heat-responsive lncRNAs were identified as regulating cellular responses to heat stress in wheat [270]. Taken together, various lncRNAs play vital roles in modulating stress responses, either by acting as target mimics for different miRNAs that control the expression of stress-responsive target genes and transcription factors via up- and down-regulation, or by acting as regulatory hubs that control several hormonal signaling pathways at the transcriptional, post-transcriptional, and epigenomic levels [253,256]. For extensive reviews on lncRNAs and their functional roles in plant biology and stress responses, we refer the reader to previous articles [253,256,262,271].
Since field crops are continuously exposed to a combination of different biotic and abiotic stresses, plants mount elaborate adaptive responses by reprogramming their gene expression at the transcriptional, post-transcriptional, and post-translational levels [252,270]. Therefore, unlocking the exact roles played by lncRNAs in specific abiotic and biotic stresses will facilitate the design of lncRNA biomarkers relevant for engineering climate-resilient crop cultivars [256,272].
Improving the productivity and climate resilience of major cereal crops requires understanding the causal processes, exploring the extent, and exploiting the maximum possible abundance of genetic variation within the gene pools [273]. Generally, crop reference genome sequences have been the basis of crop genomic and genetic studies, providing insights into gene content, genomic variation, and the genetic foundation of most agronomic traits [2,118,274]. Traditionally, researchers have employed reference genomes predominantly to target SNPs in investigations of crop genomic diversity. Specifically, the availability of reference genomes for the major crops has facilitated genome-wide analyses of SNPs and subsequent marker-trait linkage studies that connect genetic variation with phenotypic variation [273].
Meanwhile, genome structural variation (SV) has become increasingly acknowledged as a fundamental aspect of genomic diversity [2,275-277]. Genetic variation, especially SV, can cause considerable variation in functional gene complement and gene content among individuals within the same species [273]. The main SVs comprise copy number variants (CNVs) and presence-absence variants (PAVs). CNVs are sequences that are present in a different number of copies between individuals, whereas PAVs are sequences that exist in one genome and are absent in another [277,278].
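The distinction between the two SV classes can be expressed as a comparison of copy counts for a given sequence in two genomes. The sketch below is a deliberately simplified illustration of the definitions above; the function name and copy counts are hypothetical, and real SV calling works from read alignments or assemblies rather than known copy numbers.

```python
def classify_variant(copies_in_a: int, copies_in_b: int) -> str:
    """Label a sequence as PAV, CNV, or unchanged between two genomes."""
    if copies_in_a < 0 or copies_in_b < 0:
        raise ValueError("copy numbers cannot be negative")
    if copies_in_a == copies_in_b:
        return "no copy-number difference"
    if copies_in_a == 0 or copies_in_b == 0:
        return "PAV"  # present in one genome, absent from the other
    return "CNV"      # present in both genomes, in different copy numbers


print(classify_variant(1, 0))  # present versus absent
print(classify_variant(2, 5))  # differing copy numbers
```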
Unfortunately, the reliance on resequencing approaches premised on a single reference genome has limited our capacity to detect genomic SVs and constrained our understanding of the genetic diversity in major crop species [273]. It is now widely accepted that a single crop reference genome is incapable of capturing the full landscape of genetic diversity of a species and hence cannot offer full insight into the crop's diversity [279,280]. On the other hand, since pan-genomes usually contain within-species CNVs and PAVs, a pan-genome offers a complete repertoire of genomic variation for a species [280]. Therefore, pan-genome analysis is a more robust and comprehensive approach that provides a platform to capture gene content variation and evaluate the genetic diversity of a species by investigating its entire genome repertoire through the sequencing of multiple individuals of the same species [273]. This presents unprecedented opportunities for crop improvement going forward.
A pan-genome has been defined as the sum total of genes of a biological clade, such as a species, comprising a set of core genes that are shared by all individuals and a set of dispensable (or variable) genes that are partially shared or individual-specific [280,281]. Initially coined for prokaryotes and popular in microbiological studies [281], pan-genome analysis is becoming increasingly common in plant genome studies as well [2,280]. Recently, crop pan-genomes have been published for maize [282], rice [283-285], wheat [286], and Brassica species [287,288], among others. For extensive reviews on crop pan-genomes, we refer the reader to recent excellent papers [273,279,280,289]. Therefore, a paradigm shift from a single reference genome to a pan-genome analysis approach for detecting genetic diversity within species will eliminate single-sample bias and allow for a better representation of crop genetic diversity [2,280,290].
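The definition above maps directly onto set operations: the pan-genome is the union of the gene sets of all individuals, the core genome is their intersection, and the dispensable genome is the remainder. The sketch below assumes that gene presence has already been called per accession; the accession names, gene identifiers, and function name are hypothetical.

```python
def partition_pangenome(gene_sets):
    """Split per-accession gene sets into pan, core, and dispensable genes."""
    accessions = [set(genes) for genes in gene_sets.values()]
    if not accessions:
        return set(), set(), set()
    pan = set().union(*accessions)        # every gene seen in any accession
    core = set.intersection(*accessions)  # genes shared by all accessions
    return pan, core, pan - core


# Hypothetical presence calls for three accessions.
genomes = {
    "accession_1": {"g1", "g2", "g3"},
    "accession_2": {"g1", "g2", "g4"},
    "accession_3": {"g1", "g2"},
}
pan, core, dispensable = partition_pangenome(genomes)
print(sorted(pan), sorted(core), sorted(dispensable))
```

Real pan-genome construction must first decide which genes in different accessions count as the same gene, a step this sketch takes as given.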
SVs such as PAVs and CNVs play crucial roles in influencing important climate-relevant crop agronomic traits [273]. In particular, the dispensable genome has been observed to harbor genes responsible for crop adaptation and survival under different biotic and abiotic conditions [280], including head smut resistance in maize [291], phosphorus starvation tolerance in rice [292], and tolerance to temperature extremes, among others [273]. Therefore, pan-genomic studies will facilitate dissection of the genetic basis of these major agronomic traits, thereby helping to link genetic variation with agronomic traits via QTL studies or GWAS, which is critical for crop improvement [273,279]. Moreover, understanding pan-genomics will accelerate the exploitation of CWRs for increasing diversity within gene pools, thereby expanding the toolbox available for plant breeding and crop improvement efforts [52,289]. Collectively, pan-genomics, coupled with advanced genome sequencing techniques, will facilitate a better understanding of crop genetic diversity and the identification of novel crop alleles [279], ultimately broadening genetic resources for accelerated crop improvement for stable higher yields and climate resilience.

Transposable Elements as Research Target for Decoding Crop Genomes and Understanding Crop Responses to Biotic and Abiotic Stresses
With modern advances in genome sequencing technology and assembly algorithms, our capacity to decode the complexity and structure of genomes has significantly improved [289]. As a result, previously poorly characterized genome features such as transposable elements (TEs) are becoming clearer. TEs, also known as 'jumping genes', are ubiquitous mobile DNA sequences found in both eukaryotic and prokaryotic genomes [293]; they comprise large portions of plant genomes (Table 1, [86]). Examples of TEs include long terminal repeat (LTR) retrotransposons, miniature inverted repeat TEs (MITEs), Ac/Ds elements, helitrons, Enhancer/Suppressor of mutation (En/Spm) elements, and mutator (Mu) elements [293-295]. Plant genomes, including those of crop species such as maize, are rich in TEs [296,297]. TE activity mediates large-scale chromosomal reorganizations [298,299], creates the majority of insertions and deletions in crop genomes [300], and modifies the architecture and amount of gene product that is transcribed [273,301,302]. Notably, TE transposition may influence the transcriptional activity of adjoining genes by controlling the epigenomic profile of the region or by altering the relative location of regulatory elements [303]. It is therefore not surprising that TEs are major contributors to genome size variation among different species [295,304,305] and important causes of SVs [273,289].
The importance of TEs to crop phenotypes has been repeatedly shown. For instance, in maize, TEs were shown to provide/activate important allelic regulatory variation in gene response to several abiotic stresses [306]. Similarly, Gypsy retrotransposon-mediated aluminum tolerance was achieved in rice, through enhanced expression of the citrate transporter OsFRDL4 [307]. As pan-genomes become widely available for crop species, TEs will receive increasing attention in crop improvement programs [289]. More crucially, development of new tools for analyzing complex pan-genomes, encompassing comprehensive TE annotation, will facilitate our further understanding of TEs' varied roles within crop genomes and connecting TE variation to phenotypes of agronomic importance [279,303].

Machine Learning as a Powerful Tool for Gene Function Prediction and High-Throughput Field and Stress Phenotyping
Crop genomics research is not simply about acquiring molecular phenotypes, but also about leveraging powerful data mining and bioinformatics tools to predict and interpret these phenotypes [308]. Fortunately, machine learning (ML), including deep learning, has proved effective in accomplishing these tasks in recent years [308]. ML refers to a group of computerized modelling approaches that learn patterns from data and make decisions automatically, without explicitly programmed rules [309]. In other words, ML allows algorithms to interpret data by learning patterns through experience [310].
When using ML, problems in the field are categorized as either supervised or unsupervised [311,312]. Supervised learning aims to obtain a model that maps its predictors, such as DNA sequences, to target variables, such as histone marks [308]. Predicting regulatory and non-regulatory regions in the maize genome [313], plant stress phenotyping [314], and predicting diseases and nutritional deficiencies in soybean [315] are examples of supervised learning applications. On the other hand, if the data set contains no specification of the outcome, then the problem is one of unsupervised learning [308], with clustering and feature extraction being examples [316].
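The contrast between the two problem types can be shown on a toy data set in which each DNA sequence is reduced to a single feature, its GC content. In the supervised part, labelled examples are used to fit class centroids and predict the label of a new sequence; in the unsupervised part, the same feature values are split into two clusters without any labels. Everything here is an illustrative assumption: the sequences, the labels "high_gc" and "low_gc", the choice of GC content as the feature, and the nearest-centroid and two-means methods, none of which are taken from the cited studies.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def fit_centroids(features, labels):
    """Supervised: average the feature value of each labelled class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))


def two_means(features, iterations=10):
    """Unsupervised: split unlabelled values into two clusters (0 and 1)."""
    low, high = min(features), max(features)
    assignment = []
    for _ in range(iterations):
        assignment = [0 if abs(x - low) <= abs(x - high) else 1 for x in features]
        group0 = [x for x, a in zip(features, assignment) if a == 0]
        group1 = [x for x, a in zip(features, assignment) if a == 1]
        if group0:
            low = sum(group0) / len(group0)
        if group1:
            high = sum(group1) / len(group1)
    return assignment


# Hypothetical training sequences with toy labels.
train = ["GCGCGCGC", "GGCCGCGA", "ATATATAT", "AATTATGA"]
labels = ["high_gc", "high_gc", "low_gc", "low_gc"]
features = [gc_content(s) for s in train]

model = fit_centroids(features, labels)
print(predict(model, gc_content("GCGCATGC")))  # supervised prediction
print(two_means(features))                     # unsupervised grouping
```

The supervised model can name the class of a new sequence because it was shown labels, whereas the clustering step only reports which sequences group together.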
Using ML to analyze enormous, varied, and unstructured datasets (such as those generated by imaging or sequencing) may offer considerable advantages over conventional analytical methods [310,317]. ML has been applied in many areas of genomics and phenomics research, including genome assembly and genome annotation [318]; large-scale data analysis to resolve complex biological problems in genomics, metabolomics, transcriptomics, proteomics, and systems biology [319,320]; the inference of gene regulatory networks [321,322]; identification of true SNPs in polyploid plant species [323]; and high-throughput field and crop stress phenotyping [309,318,324,325].
Crop scientists can use ML to model the flow of information from genomic DNA sequences to molecular phenotypes, and to identify functional variants in natural populations [308,312]. Additionally, the power of ML can be harnessed in synthetic biology to create novel genomic elements with desirable functions [308]. Particularly for crop breeders, ML will aid the identification of functional genomic regions of agronomic value by facilitating functional annotation of genomes and permitting real-time high-throughput phenotyping of agronomic traits in both controlled and open-field environments [310,312]. Moreover, ML can be integrated with genome sequencing and bioinformatics to predict transcription factor binding sites [326]. ML methodologies have already been successfully employed to detect several genomic features, including protein-coding genes, miRNAs, lncRNAs [327], polyadenylation sites [328], and cis-regulatory elements (CREs) [312,329].
Several crop databases that integrate the enormous volume of heterogeneous and unstructured genotypic and phenotypic data (Big Data) now provide a valuable resource for crop breeders and opportunities to unravel novel trait-associated candidate genes [308]. However, retrieval, analysis, and interpretation of such Big Data are challenging [309]. Fortunately, ML offers promising computational and analytical solutions for the integrative analysis of these enormous datasets at the Big Data scale [131,312,319]. Additionally, using ML to infer the relationships between CREs and genes is a promising avenue for identifying previously unknown candidates for crop improvement [330]. Further, ML approaches will become increasingly important for crop yield prediction, high-throughput crop stress phenotyping, and climate change impact evaluation in agriculture [131,309,318,331].
Collectively, the recent advances in crop genomics that are being applied to enhance cereal crop resilience to climate change are summarized in Figure 1.

Conclusions
Modern developments in genome sequencing, assembly, and annotation, coupled with sophisticated bioinformatics and computational tools, have improved our understanding of the structure of crop genomes and the information they contain. Consequently, mapping of genomic regions controlling variation in target agronomic traits has improved. Additionally, exploitation of an increased number of CWRs and orphaned species, together with the use of induced mutations, is adding novel allelic diversity to the crop breeders' toolbox. This has facilitated genomics-assisted breeding of climate-resilient crop cultivars. Further, the application of new gene editing techniques such as CRISPR-Cas9, together with DH technology, is accelerating the development of climate-resilient and nutritionally superior crop cultivars. Integration of ML and high-throughput crop phenotyping has become central to enhancing gene function prediction and linking genotypes to phenotypes. Going forward, unlocking the exact roles played by lncRNAs in specific abiotic and biotic stresses will facilitate the design of lncRNA biomarkers relevant for engineering climate-resilient crop cultivars. Moreover, pan-genomics will enable better decoding of crop genetic diversity and identification of novel crop alleles, whereas comprehensive TE annotation and analysis will help us understand the varied roles of TEs within crop genomes and link TE variation to phenotypes of agronomic importance. All these genomics strategies will be critical for breeding high-yielding, climate-resilient, and highly nutritive cereal crops for the rising human population.

Conflicts of Interest:
The authors declare no conflict of interest. Additionally, the funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.