Genomic Selection—Considerations for Successful Implementation in Wheat Breeding Programs

In order to meet the goal of doubling wheat yield by 2050, breeders must work to improve breeding program efficiency while also implementing new and improved technologies in order to increase genetic gain. Genomic selection (GS) is an expansion of marker-assisted selection which uses a statistical model to estimate all marker effects for an individual simultaneously to determine a genome estimated breeding value (GEBV). Breeders are thus able to select for performance based on GEBVs in the absence of phenotypic data. In wheat, genomic selection has been successfully implemented for a number of key traits including grain yield, grain quality and quantitative disease resistance, such as that for Fusarium head blight. For this review, we focused on the ways to modify genomic selection to maximize prediction accuracy, including prediction model selection, marker density, trait heritability, linkage disequilibrium, the relationship between training and validation sets, population structure, and training set optimization methods. Altogether, the effects of these different factors on the accuracy of predictions should be thoroughly considered for the successful implementation of GS strategies in wheat breeding programs.


Introduction
According to the United Nations Department of Economic and Social Affairs, the global population is currently expected to grow from 7.7 to approximately 9.7 billion people by 2050 and 10.9 billion by 2100 [1]. This rapid population growth is expected to exceed the rate of global food production, posing the risk of a potential food crisis in the next few decades. While wheat (Triticum aestivum L.) yield has increased rapidly since the 1950s, largely due to improved breeding and agronomic practices, such as the introduction of improved pesticides and synthetic fertilizers, the current rate is insufficient in meeting future caloric demands and in some cases remains stagnant [2]. It has been estimated that an approximate yield improvement of 2.4 percent per year, compared to the current 0.9 percent, is needed in order to double production of wheat by 2050 [3].
To meet future demands, breeders have looked toward new technologies, such as marker-assisted selection (MAS), phenomics, transgenic and cisgenic approaches, and the implementation of genomic selection (GS) [4][5][6]. Genomic selection is a modified form of MAS. The GS process begins with a panel of genotypes, referred to as a training population, that have been genotyped using genomewide markers and phenotyped for trait(s) of interest. The training population is then used to train a model to calculate genome estimated breeding values (GEBVs) for a second panel of genotypes, the validation population or new breeding lines, that have only been genotyped. The breeder can then use the calculated GEBVs to make selections from the validation population without the need for phenotyping [5,7].
One of the technologies breeders have used to increase genetic gain is marker-assisted selection (MAS). In MAS, molecular markers closely linked or co-segregating with a desired trait can be used to differentiate breeding lines based on the allelic variation underlying a trait [11,12]. Practitioners of MAS have reported genetic gains twice those of phenotypic selection [13][14][15]. MAS has several advantages compared to phenotypic selection, some of which include (1) the ability to select for traits that are difficult, expensive, or time consuming to phenotype, (2) the ability to select for traits that have low heritability and low expression and are therefore difficult to phenotype, (3) the ability to select for traits where phenotyping is dependent on specific environmental conditions or growth stages [16], and (4) the ability to select for monogenic or qualitative traits, making it an ideal selection strategy for introgression and backcrossing of single genes into germplasm along with pyramiding of disease resistance genes [4,16]. Strategies have also been developed so that multiple traits can be selected simultaneously via MAS [16,17].
While MAS has helped to revolutionize plant breeding and provide a source for increasing genetic gain, it also has downsides. One of the primary downsides is that MAS is less effective when screening for more complex or multi-genic quantitative traits [5,18,19]. In order to discover new quantitative trait loci (QTL) associated with a trait of interest, biparental mapping populations must be developed. The problem with biparental mapping populations is that they rarely account for the allelic diversity and genetic background of a full breeding program, and QTL effects tend to vary across environments, particularly for quantitative traits. Therefore, multiple mapping populations must be developed for specific environments and breeding programs in order to validate the position and allelic effects of new QTL, which is both expensive and time consuming [5,16]. Other mapping approaches such as genomewide association studies (GWAS), which rely on linkage disequilibrium to identify marker-trait associations in diverse germplasm, could serve as an alternative to biparental analyses; however, they are also limited by the fact that complex traits are controlled by multiple loci with small effects. If a breeder attempts to adopt small-effect QTL into their breeding program through MAS prior to validating said QTL using local germplasm, they could achieve genetic gains that are lower than those of conventional phenotypic selection [20]. Cost can also limit the widespread use of MAS in a breeding program, especially when considering a genome as complex as that of wheat [21]. Altogether, the genetic architecture of complex traits and the costs and resources needed to implement genetic mapping methods make other approaches such as GS an ideal complementary tool for breeding and selection.

Impact of Genomic Selection on the Breeder's Equation
When considering the breeder's equation, GS has the potential to reduce the length of a breeding cycle by selecting on the basis of GEBVs as opposed to measured phenotypes, allowing for greater genetic gain in one cycle of GS compared to one cycle of phenotypic selection [5]. For example, two rounds of GS were performed in the time it took to perform one round of phenotypic selection when selecting for resistance to stem rust in wheat [22]. When GS was compared with MAS in maize (Zea mays L.), genetic response was between 18 and 43 percent greater in the GS population compared to the MAS population [19]. Differences in response to selection per cycle between GS, MAS, and phenotypic selection were minimal in oats (Avena sativa L.); however, the ability of GS to complete two generations of selection in the time it took to complete one generation of phenotypic selection or MAS made GS favorable for maximizing genetic gain [23]. The standard way to determine prediction accuracy in GS is to calculate the Pearson correlation between the GEBVs and the true breeding values or phenotypic values. In the context of the breeder's equation, genetic gain (R) is proportional to the selection accuracy (r), which in GS corresponds to the prediction accuracy [9]. The accuracy of GS is affected by a combination of one or more factors as discussed below.
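The relationship between accuracy, cycle length, and gain can be made concrete with a small numerical sketch. It assumes the commonly used per-unit-time form of the breeder's equation, R = i · r · σ_A / L, where the selection intensity (i), additive genetic standard deviation (σ_A), and cycle length in years (L) are symbols introduced here for illustration; only R and r are named in the text above. All numeric inputs are hypothetical.

```python
import numpy as np


def prediction_accuracy(gebv, observed):
    """Pearson correlation between GEBVs and observed (or true) values."""
    gebv = np.asarray(gebv, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.corrcoef(gebv, observed)[0, 1])


def gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Breeder's equation per unit time: R = i * r * sigma_A / L."""
    return intensity * accuracy * sigma_a / cycle_years


# Hypothetical inputs, chosen only to illustrate the trade-off between
# a lower accuracy and a shorter breeding cycle.
r_demo = prediction_accuracy([1.2, 0.4, 2.1, 1.7], [1.0, 0.7, 2.4, 1.5])
gain_ps = gain_per_year(intensity=1.75, accuracy=0.60, sigma_a=1.0, cycle_years=4.0)
gain_gs = gain_per_year(intensity=1.75, accuracy=0.40, sigma_a=1.0, cycle_years=2.0)
print(f"accuracy={r_demo:.2f}, PS gain/year={gain_ps:.3f}, GS gain/year={gain_gs:.3f}")
```

Under these assumed values, halving the cycle length more than offsets the lower accuracy of GS, which is the argument made for the stem rust and oat examples above.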

Marker Density
Since GS requires dense marker coverage across the entire genome, a proper genotyping platform must be considered [7]. Several genomewide platforms have been implemented in wheat. The first of these were restriction fragment length polymorphism (RFLP) markers [24], followed by amplified fragment length polymorphism (AFLP) markers [25], simple sequence repeat (SSR) markers [26], diversity arrays technology (DArT) [27], and single nucleotide polymorphisms (SNP), as part of the 9K and 90K Infinium arrays [28,29]. More recently, genotyping has been performed using SNP markers through genotyping by sequencing (GBS) platforms [30,31]. GBS has become the dominant genotyping platform for GS in wheat, particularly due to its low cost, high coverage, and reduced sampling bias compared to SNP arrays [32,33].
There is evidence that marker density only marginally impacts prediction accuracy and that the use of low to medium density marker datasets could be more cost-effective for GS for Fusarium head blight resistance in wheat [34]. Large numbers of markers can also result in model overfitting, resulting in lower prediction accuracies when predicting independent datasets [35]. In other plant species such as Asian rapeseed (Brassica napus L.), the use of lower density panels in populations with strong linkage disequilibrium (LD) enabled prediction accuracies comparable to those achieved using high-density marker sets [36].
Higher marker density can also be favorable for GS, as a larger number of markers can increase the probability that markers will be in LD with QTL [5]. Higher marker density can also result in lower LD, significantly improving prediction accuracies [37]. In most cases lower LD can result in a larger recombination frequency and more accurate estimates of QTL effects [38]. Low LD combined with higher marker densities and larger training population sizes can strongly improve prediction accuracy as well [39].
In a study predicting the performance of agronomic traits in biparental maize and barley (Hordeum vulgare L.) populations, increasing the number of markers improved prediction accuracy. However, once the genome was sufficiently saturated with one marker for every 12.5 centimorgan (cM), the gain in prediction accuracy plateaued. This was also observed in mixed wheat and barley populations, where prediction accuracy plateaued after reaching a moderate marker density (2 cM in barley and 4.5 cM in wheat) [39]. While lower marker density might be adequate for GS, the cost of genotyping and phenotyping large training populations is not always sustainable for breeding programs. As a result, smaller training populations with higher marker densities are appropriate, especially in the case of biparental populations [40].

Prediction Models
Several statistical approaches have been proposed for predicting GEBVs, and these methods differ in their assumptions and treatment of marker effects. GS models include parametric, semi-parametric and non-parametric approaches [5,7,41]. The parametric methods include ridge regression-best linear unbiased prediction (RR-BLUP) and Bayesian methods which use a shrinkage procedure [5]. In RR-BLUP, the most commonly utilized model, all the marker effects are estimated simultaneously assuming that all marker effects are normally distributed (Gaussian distribution) with an equal variance [7]. The RR-BLUP approach treats marker effects as random effects and shrinks all of them equally towards zero; it can therefore capture many minor-effect QTL using a large number of markers. Because RR-BLUP shrinks all marker effects equally when calculating GEBVs, it can underestimate large-effect QTL. To overcome this limitation, Bayesian methods have been proposed using marker-specific shrinkage effects [7]. A Bayesian approach assumes that the variance of the marker effects differs between markers and is estimated by using a prior distribution for this variance [7].
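The equal shrinkage applied by RR-BLUP can be illustrated with a minimal ridge regression sketch. This is not the implementation used in the cited studies: the function names are hypothetical, the data are simulated, and the shrinkage parameter `lam` is fixed by hand at the ratio of residual to marker-effect variance used in the simulation, whereas in practice it is usually estimated from the data (for example by REML).

```python
import numpy as np


def fit_rr_blup(Z, y, lam):
    """Estimate all marker effects jointly with one shared ridge penalty."""
    col_means = Z.mean(axis=0)
    Zc = Z - col_means                      # center marker scores
    mu = y.mean()                           # intercept
    p = Zc.shape[1]
    # Solve (Z'Z + lam * I) u = Z'(y - mu); lam shrinks every effect equally.
    u = np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ (y - mu))
    return mu, col_means, u


def predict_gebv(Z_new, mu, col_means, u):
    """GEBV = intercept + centered marker scores times estimated effects."""
    return mu + (Z_new - col_means) @ u


# Simulated data (hypothetical): 400 lines, 100 markers coded 0/1/2.
rng = np.random.default_rng(1)
Z = rng.integers(0, 3, size=(400, 100)).astype(float)
true_u = rng.normal(0.0, 0.2, size=100)
y = Z @ true_u + rng.normal(0.0, 1.0, size=400)

train, valid = np.arange(300), np.arange(300, 400)
mu, col_means, u_hat = fit_rr_blup(Z[train], y[train], lam=25.0)
gebv = predict_gebv(Z[valid], mu, col_means, u_hat)
accuracy = np.corrcoef(gebv, y[valid])[0, 1]
print(f"prediction accuracy in the validation set: {accuracy:.2f}")
```

Increasing `lam` pulls every estimated effect further towards zero by the same rule, which is why a single large-effect QTL tends to be underestimated under this model.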
Bayes A (Bayesian shrinkage regression) uses all markers and places a scaled inverse chi-square prior on the marker variances, which are assumed to be unequal. Bayes B (Bayesian variable selection) also uses an inverse chi-square prior for the marker variances, but some markers are assumed to have a variance of zero. Bayes B presumes that a small number of large-effect loci controls the trait. Another alternative Bayesian approach is Bayes Cπ, which combines all markers with non-zero effects and estimates a common variance for them [42]. Bayesian LASSO (least absolute shrinkage and selection operator) has an exponential prior on the marker variances, giving a double exponential (Laplace) distribution for the marker effects, which results in less shrinkage on large-effect markers and more shrinkage on small-effect markers [43]. The semi-parametric and non-parametric methods used in GS include kernel regression, reproducing kernel Hilbert spaces (RKHS), and random forest regression (RF) methods. Kernel regression is based on genetic distance, which regresses marker effects to a smoothing parameter to control the distribution of the QTL effects [44]. Random forest [45] is a machine learning method that captures nonlinear relations between phenotypes and marker genotypes by building a nonlinear prediction model. The advantage of semi- and non-parametric methods is that both can capture non-additive effects.
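As a rough illustration of the kernel-based approach, the sketch below fits a kernel ridge regression with a Gaussian kernel built from marker-based Euclidean distances, which is one simple RKHS-type model. The bandwidth, the scaling of distances by their mean, the penalty `lam`, and the simulated trait with an interaction between two markers are all assumptions made for this example and are not taken from the cited studies.

```python
import numpy as np


def squared_distances(A, B):
    """Squared Euclidean distances between rows of A and rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(sq, 0.0)              # guard against negative round-off


def gaussian_kernel(A, B, bandwidth, scale):
    """K[i, j] = exp(-bandwidth * d_ij^2 / scale)."""
    return np.exp(-bandwidth * squared_distances(A, B) / scale)


def fit_kernel_ridge(K, y, lam):
    """Solve (K + lam * I) alpha = y - mean(y)."""
    mu = y.mean()
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), y - mu)
    return mu, alpha


# Simulated data (hypothetical): the trait includes a marker-by-marker interaction.
rng = np.random.default_rng(2)
M = rng.integers(0, 3, size=(120, 200)).astype(float)
y = 0.5 * M[:, 0] + M[:, 1] * M[:, 2] + rng.normal(0.0, 1.0, size=120)

M_tr, y_tr, M_va = M[:90], y[:90], M[90:]
scale = squared_distances(M_tr, M_tr).mean()        # puts distances on a unit scale
K_tt = gaussian_kernel(M_tr, M_tr, bandwidth=1.0, scale=scale)
K_vt = gaussian_kernel(M_va, M_tr, bandwidth=1.0, scale=scale)

mu, alpha = fit_kernel_ridge(K_tt, y_tr, lam=1.0)
pred = mu + K_vt @ alpha                            # predictions for unphenotyped lines
```

Because the kernel depends on the whole marker profile rather than on a sum of per-marker effects, a model of this kind can in principle pick up non-additive signal, at the cost of having the bandwidth as an extra tuning parameter.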
When using smaller training sets, the RR-BLUP model outperformed the Bayes Cπ model [40]. This is likely due to the fact that a larger number of allelic observations are needed in order to predict small-effect QTL, which is not required for RR-BLUP since all marker effects are assumed to have the same variance [46,47]. Epistatic models improved predictions in maize, thus RKHS and RF may hold an advantage in this situation [48]. When compared with 10 other Bayesian and BLUP GS models using wheat, maize, and barley datasets, the RKHS model often had much higher accuracy than other models. However, in the same analysis, RKHS overfitted more than the other models, in that too many irrelevant, low-impact loci were incorporated in the model, leading to exaggerated genetic variances and trait heritability [49,50]. Another study in maize showed that RKHS outperformed Bayesian LASSO for traits where epistasis was more relevant, whereas it was outperformed for more additive traits [51]. When RF was compared with RR-BLUP, Bayes A, Bayes B, and RKHS models, it consistently had the lowest prediction accuracies for seven drought tolerance traits in maize [52]. When the RF model was compared with 10 other models for wheat, maize, and barley predictions, it performed fairly well with good prediction accuracies and low computing time [49]. The RF model was also compared with Bayesian LASSO, Bayes B, Bayes Cπ, and RR-BLUP while predicting the performance of four agronomic traits in chickpea (Cicer arietinum L.). While the RF model was the best fitting model during three location-years, there was not a significant difference between any of the models [53].

Training Population Size
One of the important factors for implementing GS in a breeding program is determining the appropriate size of a training population. The goal for a breeding program would be to develop a training population that can maximize prediction accuracy while limiting population size as much as possible, largely due to expense of genotyping and phenotyping a large number of genotypes for the training population [35].
When training population size was tested for four different traits in wheat, prediction accuracy increased as the number of genotypes increased from 250 to 2000: accuracy for glaucousness increased from 0.78 to 0.92, grain yield from 0.72 to 0.85, thousand-kernel weight from 0.66 to 0.86, and relative maturity from 0.59 to 0.82 [49]. Even though prediction accuracy increased as the training population size increased, the gain in accuracy plateaued after 2000 genotypes [37]. When evaluating prediction accuracies for nine different end-use quality traits in biparental wheat populations, increasing the population size from 24 to 96 genotypes increased prediction accuracies as well [40]. It has also been observed that deviations between prediction accuracies for different traits are larger under smaller training population sizes, likely due to the effect of relatedness between the training population and validation set [33]. Smaller training populations also run the risk of overestimating the genotypic effect when predicting larger validation sets [35]. Training population sizes between 25 and 300 were evaluated for both wheat and rice (Oryza sativa L.) for five and four agronomic traits, respectively. In wheat, prediction accuracies were highest at 300 genotypes for all five traits; however, there was evidence of a plateau after 300. In rice, prediction accuracies plateaued after 175 genotypes for florets per panicle and protein content, while prediction accuracy for plant height and flowering time actually decreased after 150 genotypes [54]. More diverse training populations usually need to be larger in order to account for the larger genetic diversity, particularly with low heritability traits [55].

Trait Heritability
Generally, as heritability increases for a trait, so does the predictive ability for the trait [35]. The impact of heritability on prediction accuracy for training and validation sets was evaluated for 30 agronomic, quality, and stress tolerance traits in rice and three quality traits in maize. Overall, prediction accuracy improved as heritability increased in both the training and validation sets, with a greater impact in training sets [56]. Heritability was also evaluated for mixed populations of barley and wheat, along with biparental populations of maize and barley. In all cases, prediction accuracy increased as heritability in the training populations increased [39]. A strong correlation between high heritability and stronger prediction accuracies was also observed when predicting quality traits in biparental wheat populations [40]. Even with higher heritability, certain traits may have lower prediction accuracies because they are controlled by a larger number of small-effect genes, such as grain yield, as opposed to a small number of large-effect genes, such as grain dry weight, as evidenced when evaluating a maize test-cross population [57].

Genetic Relationship between Training and Validation Sets
The genetic similarity between training populations and their validation sets can significantly influence prediction accuracy [35,58]. More closely related individuals will often share a common ancestor a relatively small number of generations back, allowing for fewer recombination events and preserving QTL and marker linkage phases. Closely related individuals are also more likely to share polymorphic loci that generate genetic variation, leaving fewer opportunities for genetic drift or mutations [59]. There are also interactions that can occur between QTL and genetic backgrounds, as closely related individuals are more likely to share a portion of their genetic backgrounds compared to distant individuals [59][60][61]. When evaluating training populations for Fusarium head blight (FHB) resistance in barley, it was observed that prediction accuracy is lower when a training population is used to predict the performance of barley genotypes from outside breeding programs [62]. The same has also been observed when predicting across different full-sib families in biparental maize populations [63]. Training populations do not need to be as large when they are predicting the performance of closely related genotypes in the validation population. In addition, genotypes in the training population do not need as high a marker density when predicting the performance of closely related genotypes [64].
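One way to quantify relatedness between a training population and candidate lines is a marker-based genomic relationship matrix. The sketch below uses the widely used VanRaden-style formulation, G = WW' / (2 Σ p(1 − p)), with markers coded 0/1/2; this formulation, the helper names, and the simulated two-family data are assumptions for illustration and are not described in the studies cited above.

```python
import numpy as np


def genomic_relationship(M):
    """VanRaden-style G = W W' / (2 * sum(p * (1 - p))) for 0/1/2 marker codes."""
    p = M.mean(axis=0) / 2.0                # allele frequency per marker
    W = M - 2.0 * p                         # center each marker
    return (W @ W.T) / (2.0 * np.sum(p * (1.0 - p)))


def mean_relatedness_to_training(G, train_idx, cand_idx):
    """Average genomic relationship of each candidate to the training lines."""
    return G[np.ix_(cand_idx, train_idx)].mean(axis=1)


# Simulated data (hypothetical): two families with independent allele frequencies.
rng = np.random.default_rng(4)
freq_a = rng.uniform(0.1, 0.9, size=300)
freq_b = rng.uniform(0.1, 0.9, size=300)
fam_a = rng.binomial(2, freq_a, size=(40, 300))
fam_b = rng.binomial(2, freq_b, size=(40, 300))
M = np.vstack([fam_a, fam_b]).astype(float)

G = genomic_relationship(M)
train_idx = np.arange(0, 30)                # training lines drawn from family A
rel_a = mean_relatedness_to_training(G, train_idx, np.arange(30, 40))
rel_b = mean_relatedness_to_training(G, train_idx, np.arange(40, 80))
print(f"family A candidates: {rel_a.mean():.2f}, family B candidates: {rel_b.mean():.2f}")
```

Candidates with a low average relationship to the training set, such as the family B lines here, are the ones for which the studies above would lead one to expect lower prediction accuracy.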

Population Structure
Population structure refers to the division of a population into subgroups by genetic background, geography, or natural selection, and it plays an important role in the estimation of GEBVs. Differences in allele frequencies between subpopulations can result in erroneous associations between markers and traits, leading to biased prediction accuracies for GS. A common indicator of population structure interference is when small training populations have high prediction accuracies [65]. Several methods can be used to reduce the influence of population structure. These include the separation of breeding lines based on their origins or traits, inclusion of kinship or genotype matrices, inclusion of marker fixed effects, or inclusion of principal components in an analysis based on subpopulations [56,58,62]. Another way to control for population structure is with epistatic genomic prediction models that can account for both additive and non-additive effects, such as the RKHS, Gaussian kernel, or exponential kernel models [66]. Another method for population structure partitioning is the K-means algorithm, where partitioning is based on genetic similarity [37]. In the case of large training populations, another common source of population structure is LD, particularly when two loci with differences in allele frequencies are in LD across subpopulations. This can result in spurious associations with multiple QTL [67]. When the effects of population structure were evaluated in a historic winter wheat nursery, two major subpopulations emerged based on the presence or absence of the t2BS:2GS·2GL:2BL translocation derived from Triticum timopheevii. However, when this translocation was used to design an optimal training population, it did not have significantly higher predictive ability than a randomly selected training population [68].
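The use of principal components to capture population structure can be sketched as follows. The example computes principal component scores from a centered marker matrix by singular value decomposition; the simulated subpopulations, their allele frequencies, and the function name are hypothetical, and the number of components to retain is a choice left to the analyst.

```python
import numpy as np


def marker_pcs(M, n_components):
    """Principal component scores of a centered marker matrix via SVD."""
    Mc = M - M.mean(axis=0)
    U, s, _ = np.linalg.svd(Mc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s**2 / np.sum(s**2))[:n_components]
    return scores, explained


# Simulated structure (hypothetical): two subpopulations of 30 lines each
# with strongly contrasting allele frequencies at 200 markers.
rng = np.random.default_rng(3)
freq_1 = np.full(200, 0.1)
freq_2 = np.full(200, 0.9)
M = np.vstack([rng.binomial(2, freq_1, size=(30, 200)),
               rng.binomial(2, freq_2, size=(30, 200))]).astype(float)

scores, explained = marker_pcs(M, n_components=2)
pc1 = scores[:, 0]
print(f"variance explained by PC1: {explained[0]:.2f}")
# The leading PC scores could then enter a prediction model as
# fixed-effect covariates alongside the marker effects.
```

In this deliberately extreme simulation the first component separates the two subpopulations by sign; real breeding populations typically show weaker and more continuous structure.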

Retraining and Training Population Composition
Retraining of the training population is important because as multiple generations of selections are made, the generational difference between the training population and the validation set grows larger. Since the training population and the validation set become more genetically distant, the prediction accuracy is reduced. Therefore, it is important to add new genotypes to the training population in order to maintain a genetic relationship to the validation set as new germplasm is introduced to the breeding program and more recombination events occur [59]. New germplasm must be added to breeding programs implementing GS because the increased selection intensity from GS reduces the effective population size in a breeding program and thus leads to a loss in genetic variability [5].
Several different methods are used to optimize training populations in order to maximize prediction accuracy while also minimizing training population size, allowing breeders to better allocate resources. Among these are PEVmean, which identifies the optimal training set by minimizing the prediction error variance; CDmean, which maximizes the coefficient of determination; Gmean, which maximizes genetic relationship using only marker information; Stratified Sampling (SS), which uses random sampling to maximize the variance explained; and Selection of Training Populations by Genetic Algorithm (STPGA), which utilizes both phenotypic and marker information [69].
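The idea behind a PEVmean-type criterion can be sketched with a simplified ridge-regression form of the prediction error variance, diag(X_target (X_train'X_train + λI)^(-1) X_target'), averaged over the target lines. The exact formula, the fixed λ, and the plain random search used in place of a genetic algorithm are all assumptions made to keep the example short; published implementations differ in these details.

```python
import numpy as np


def pev_mean(M_train, M_target, lam):
    """Mean ridge-based prediction error variance (up to a constant) for target lines."""
    X_tr = np.column_stack([np.ones(len(M_train)), M_train])
    X_ta = np.column_stack([np.ones(len(M_target)), M_target])
    A = X_tr.T @ X_tr + lam * np.eye(X_tr.shape[1])
    # diag(X_ta A^{-1} X_ta') computed row by row, without the full matrix
    pev = np.einsum("ij,ij->i", X_ta, np.linalg.solve(A, X_ta.T).T)
    return float(pev.mean())


def random_search_training_set(M_cand, M_target, n_train, lam, n_iter, rng):
    """Keep the candidate subset with the lowest PEVmean among random draws."""
    best_idx, best_score, first_score = None, np.inf, None
    for _ in range(n_iter):
        idx = rng.choice(len(M_cand), size=n_train, replace=False)
        score = pev_mean(M_cand[idx], M_target, lam)
        if first_score is None:
            first_score = score             # score of one purely random set
        if score < best_score:
            best_idx, best_score = np.sort(idx), score
    return best_idx, best_score, first_score


# Simulated data (hypothetical): 100 candidate lines, 20 target lines, 50 markers.
rng = np.random.default_rng(6)
M_cand = rng.integers(0, 3, size=(100, 50)).astype(float)
M_target = rng.integers(0, 3, size=(20, 50)).astype(float)

best_idx, best_score, first_score = random_search_training_set(
    M_cand, M_target, n_train=30, lam=1.0, n_iter=50, rng=rng)
print(f"random set: {first_score:.4f}, selected set: {best_score:.4f}")
```

Only marker data enter this criterion, so a training set can be chosen before any candidate line has been phenotyped.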
When the CDmean, SS, Gmean, and STPGA algorithms were tested on a barley training population in order to compare prediction accuracies for grain yield and deoxynivalenol (DON) accumulation, all four significantly improved prediction accuracy for both traits compared to a randomly selected training set [69]. The Gmean algorithm outperformed the other algorithms when predicting grain yield, while the SS algorithm was the top performing algorithm for predicting DON accumulation [69]. PEVmean and CDmean were compared with each other in two empirical studies using two different maize diversity panels. CDmean was the more reliable algorithm as it took into account the reduction of variance due to relatedness between individuals [70]. The PEVmean algorithm, as described by Akdemir et al., was compared with random and clustered training population selection methods in winter wheat [71]. The PEVmean algorithm outperformed the random and clustered methods across all population sizes, ranging from 50 to 350 genotypes [68].

Genomic Selection in Multiple Environments
The vast majority of GS applications are in single environments, and most genomic prediction models do not have the predictive power to make selections across multiple environments or take genotype by environment interactions into consideration [72]. Even so, genotype by environment interactions play a large role in plant breeding [73]. The first multiple environment models incorporated a factor analytic structure in order to model genotype by environment interactions. Multiple environment models with pedigree and marker information performed better than models without pedigree or with only one of the components when making predictions for multiple environments in wheat [72]. The incorporation of weather data and weather covariates has also been beneficial for multiple environment GS, improving prediction accuracy by nearly 11 percent in winter wheat [74]. The problem with incorporating marker and environmental covariate interactions in a multiple environment model is that the number of interactions becomes so large that modeling them explicitly becomes infeasible, especially when using a large number of markers and environmental covariates. The use of variance components can help to reduce the computational demand of modeling marker and environmental covariate interactions [75]. When comparing marker by environment GBLUP models with models that ignored genotype by environment interaction, the marker by environment GBLUP model significantly outperformed the naïve model over three wheat datasets in seven different irrigated and dryland environments [76]. In another study, a genotype by environment (GxE) GBLUP model and a standard GBLUP model were tested over 35 location years. The standard GBLUP model outperformed the GxE model when predicting the performance of new environments; however, modeling genotype by environment interactions improved the overall predictive ability [77].

Genomic Selection for Multiple Traits
Most GS models only predict single traits; however, predicting multiple traits can be advantageous, especially when looking at multiple yield components, quality traits, disease resistance traits, and abiotic stress tolerance traits. In the past, multi-trait methods have been developed for QTL mapping [78][79][80]. Genomic prediction models have also been developed to predict multiple traits in dairy bulls. Three models, which were variations on the GBLUP, Bayes SSVS, and Bayes Cπ models, were compared, and the Bayes SSVS model outperformed the Bayes Cπ and GBLUP models [81]. When comparing the predictive ability of GBLUP, Bayes A, and Bayes Cπ multivariate models on a pine (Pinus L.) breeding dataset, the Bayesian models outperformed GBLUP under a major QTL structure and the multi-trait models strongly outperformed the single trait models. Under a polygenic genetic architecture, the three multi-trait models performed roughly the same and barely outperformed the single trait models. Multiple trait models tend to work better when predicting traits that are genetically correlated with each other [82]. When predicting cassava (Manihot esculenta Crantz) performance, multiple-trait models outperformed single trait models by nearly 40 percent in prediction accuracy [83]. In American cranberry (Vaccinium macrocarpon Ait), a multi-trait GBLUP model outperformed a standard, single trait GBLUP model in scenarios with medium to high genetic correlation between traits; however, there was little difference between models when genetic correlation between traits was low [84]. A training population of 557 wheat genotypes was evaluated for grain yield and three proximal or remote sensing traits, and genomic prediction models were tested using a univariate model for only grain yield and a multivariate model using all four traits [80]. Multivariate model prediction accuracies for grain yield outperformed univariate models by 70 percent, on average. This indicated that the use of proximal and remote sensing data as secondary traits in genomic prediction models could improve prediction accuracy for grain yield [85].

Use of GWAS Results in Genomic Selection
Traditionally, QTL were identified through linkage mapping studies based on biparental mapping populations consisting of recombinant inbred lines, F2 lines, or backcrosses. All of the individuals within the mapping population were genotyped and phenotyped for a trait of interest, and the resulting data were analyzed using linkage mapping in order to identify QTL associated with the trait. The problem with QTL mapping is that the resolution is often low, due to low recombination within the population and small population sizes. Once QTL have been successfully identified within a single biparental mapping population, markers in LD with the QTL can be used to select for the trait of interest. However, it is difficult to validate QTL across mapping populations, often making it nearly impossible to implement MAS using markers developed for another breeding program [5,86].
An alternative to QTL mapping is the genomewide association study (GWAS), which can be used to identify significant marker trait associations (MTA) between genomewide SNP markers and individual traits. This method relies on LD and ancestral recombination events in natural populations [87,88]. GWAS can be performed on a wide array of diverse genotypes already existing in a population [86,89]. The problem with some GWAS models is that they can often fail to account for false positives due to population structure and relationships between individuals in the association-mapping panel [90,91]. Using population structure and other relationship parameters as covariates can reduce these false positives; however, this can result in more false negatives and longer computational times [92]. In that regard, many GWAS models that can help control for false positives or negatives when identifying MTAs have been implemented. Some of these models include the generalized linear model (GLM), mixed linear model (MLM), efficient mixed-model association (EMMA), compressed mixed linear model (CMLM), multiple locus mixed linear model (MLMM), fixed and random model circulating probability unification (FarmCPU), and Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK) models [92][93][94].
Most QTL discovered through QTL mapping and GWAS related to quantitative traits, such as grain yield, disease resistance, and quality, have not been successful for traditional MAS. This is largely due to the fact that many small-effect genes control variation in these traits [95]. While significant MTAs identified through GWAS for these traits may not be helpful regarding MAS, they can still be applied for GS [96][97][98]. Most simple genomic prediction models, such as RR-BLUP or GBLUP, assume that most markers have small genetic effects, whereas GWAS assumes that there are multiple markers each contributing a larger amount of genetic variation. Neither assumption is strictly correct [96,99]. More complex Bayesian and non-parametric models were developed in order to account for major-effect QTL using shrinkage algorithms and variable selection, with multiple distributions for marker effects and tuning parameters [7,42]. The problem with more complex models is that they are often more computationally intensive and do not work well with larger marker datasets. Parameter estimates can also be overly sensitive to priors and often do not translate to other studies [100].
An alternative is to use markers linked to major genes associated with traits of interest as fixed effects. Several GS experiments have shown that this strategy can improve prediction accuracies in wheat and maize [68,101,102,103,104]. Another proposed method is to use results from previously published GWAS as marker covariates; the inclusion of GWAS data with a GBLUP model improved prediction accuracies for two out of three traits in cattle and nine out of 11 traits in rice over a Bayes B model [105]. Other studies have included GWAS data from the training populations themselves, otherwise referred to as GS + de novo GWAS (GS+GWAS) [97].
Spindel et al. previously performed GS+GWAS in a tropical rice breeding program using the RR-BLUP model with significant markers from a GWAS as fixed effects. The GS+GWAS model outperformed the RR-BLUP with historic GWAS data, RR-BLUP, Bayesian LASSO, RKHS, Random Forest, and multiple linear regression models for all three traits, and its prediction accuracy was significantly higher for plant height [97]. A GS+GWAS model was also used on a maize nested association mapping (NAM) population to make predictions for three traits; one was a highly polygenic trait, plant height, whereas the other two were moderately polygenic disease resistance traits. The GS+GWAS model performed significantly better for the moderately polygenic disease resistance traits, whereas there was no significant difference in prediction accuracy between the GS+GWAS models and the standard models for plant height [96]. When predicting powdery mildew resistance in winter wheat, seven different training populations were developed, ranging in size from 50 to 350 genotypes. A GWAS was performed on each training population, and only the most significant MTA was selected as a fixed effect for the RR-BLUP model. The GS+GWAS model significantly outperformed the standard RR-BLUP model for prediction accuracy at each training population size [68]. The inclusion of significant markers in a model, however, does not always result in improved predictions. In a recent simulation study in maize and sorghum (Sorghum bicolor (L.) Moench), for instance, either no significant increase or a decrease in prediction accuracy was observed for a majority of the traits evaluated when GWAS-derived SNPs were included as fixed effects in the GS model. Model performance should therefore be explored on a trait-by-trait basis before its implementation in the breeding program [106].
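The GS+GWAS idea described above can be sketched in Python as a ridge regression in which the GWAS-selected marker is left unpenalized (a fixed effect) while all remaining markers are shrunk (random effects). This is a minimal illustration under several assumptions, not the procedure of any cited study: the data are simulated, a marginal-correlation scan stands in for a real GWAS, all markers are assumed polymorphic, the helper names (`top_marker`, `fit_ridge`, `predict`) are hypothetical, and the shrinkage parameter `lam` is fixed at 100 rather than estimated (for example by REML) as it normally would be.

```python
import numpy as np

def top_marker(Z, y):
    """Index of the marker with the largest absolute marginal correlation with y
    (a stand-in for picking the most significant MTA from a GWAS)."""
    Zc = Z - Z.mean(axis=0)
    yc = y - y.mean()
    r = (Zc.T @ yc) / (np.linalg.norm(Zc, axis=0) * np.linalg.norm(yc))
    return int(np.argmax(np.abs(r)))

def fit_ridge(Z, y, fixed_idx=(), lam=100.0):
    """Solve the mixed model equations with a known ratio lam = var_e / var_u.
    Intercept and markers in fixed_idx are unpenalized; other markers are shrunk."""
    n, p = Z.shape
    fixed = np.asarray(list(fixed_idx), dtype=int)
    rand = np.setdiff1d(np.arange(p), fixed)
    X = np.column_stack([np.ones(n), Z[:, fixed]])   # fixed-effect design
    W = Z[:, rand]                                   # random-effect design
    C = np.hstack([X, W])
    q = X.shape[1]
    penalty = np.diag(np.r_[np.zeros(q), np.full(W.shape[1], lam)])
    sol = np.linalg.solve(C.T @ C + penalty, C.T @ y)
    return {"b": sol[:q], "u": sol[q:], "fixed": fixed, "rand": rand}

def predict(model, Z):
    """Predicted values for new genotypes from fixed plus random marker effects."""
    b, u = model["b"], model["u"]
    return b[0] + Z[:, model["fixed"]] @ b[1:] + Z[:, model["rand"]] @ u

# Hypothetical trait: one major QTL plus 30 small-effect markers.
rng = np.random.default_rng(1)
n, p, qtl = 600, 100, 7
Z = rng.integers(0, 3, size=(n, p)).astype(float) - 1.0   # -1/0/1 coding
effects = np.zeros(p)
effects[rng.choice(p, size=30, replace=False)] = rng.normal(0, 0.1, size=30)
effects[qtl] = 2.0
y = Z @ effects + rng.normal(0, 1.0, size=n)
train, test = np.arange(400), np.arange(400, n)

gwas_hit = top_marker(Z[train], y[train])            # scan on training set only
standard = fit_ridge(Z[train], y[train])             # all markers shrunk
gs_gwas = fit_ridge(Z[train], y[train], fixed_idx=[gwas_hit])
acc_standard = np.corrcoef(predict(standard, Z[test]), y[test])[0, 1]
acc_gs_gwas = np.corrcoef(predict(gs_gwas, Z[test]), y[test])[0, 1]
print("standard:", round(acc_standard, 3), "GS+GWAS:", round(acc_gs_gwas, 3))
```

The design choice to highlight is that the fixed-effect estimate of the major marker escapes shrinkage, which is why the approach tends to help when a trait has a large-effect locus and may offer little or nothing for highly polygenic traits.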

Putting the Pieces Together: Genomic Selection for Wheat Grain Yield
Improving grain yield is a major goal for any wheat breeder, as it is an economic trait directly affecting farmers while also contributing to global food security. As genetic gains in wheat grain yield have stagnated recently, wheat breeders have looked to GS to increase genetic gain [3]. Previously, GS was implemented on 2325 European winter wheat genotypes, wherein four genomic prediction models (RR-BLUP, Bayes Cπ, RKHS, and EGBLUP) were compared for prediction accuracy. The models that capture epistasis, such as RKHS, outperformed the additive models, indicating that accounting for epistasis improved prediction accuracy, particularly in larger populations. Forward selection was also successfully implemented with limited decrease in prediction accuracy from one year to the next [107]. In another study, prediction accuracy was evaluated for grain yield using RR-BLUP over five independent breeding cycles in a winter wheat breeding program using 659 inbred lines. Over the five cycles, grain yield prediction accuracy was r = 0.38; after outlier cycles were removed, accuracy improved to r = 0.41. The removal of outlier environments did not have a significant impact on prediction accuracy [108]. Genomic selection was also implemented in the preliminary yield trial (PYT) stage of a wheat breeding program and compared with traditional phenotypic selection and genomic assisted selection (GAS), where breeding values from the PYT stage were combined with GEBVs using a heritability index. The resulting prediction accuracies for grain yield for the three methods were 0.39, 0.33, and 0.48, respectively. This indicated that the use of heritability indices to supplement GS can increase prediction accuracy for grain yield in wheat [109]. Different training population sizes were also evaluated for the prediction of grain yield in winter wheat. Predictive ability using RR-BLUP was high when the training population was selected using PEVmean. There was no significant improvement in prediction accuracy as training population size increased above 50 genotypes, with the exception of a reduction in prediction accuracy at 100 genotypes. Prediction accuracies for grain yield averaged 0.64 with a training population size of 350 genotypes, selected using PEVmean [68].

Putting the Pieces Together: Genomic Selection for Fusarium Head Blight Resistance
Fusarium head blight (FHB) is a destructive fungal disease that negatively affects wheat production worldwide, resulting in heavy losses in grain yield and quality. In the United States, the most common causal pathogen is Fusarium graminearum, which is part of the phylum Ascomycota [110,111]. Symptoms on infected wheat plants include premature bleaching of individual spikelets shortly after flowering, which eventually progresses through the entire spike and results in a fully bleached spike. Signs of F. graminearum are usually present during warm and moist conditions; they appear as salmon-colored sporodochia on the rachis and glumes of the spikelets. Blue-black spherical perithecia, which serve as sexual structures for the pathogen, will often appear later in the growing season. As symptoms progress, the fungal mycelium colonizes the wheat kernels as they develop, resulting in shriveled grain with a pink to light-pink coloration. These damaged kernels are often referred to as 'tombstone kernels' or 'Fusarium damaged kernels' [112,113].
The F. graminearum pathogen produces the mycotoxin deoxynivalenol (DON) in order to disable natural plant defenses. DON is considered a vomitoxin because it disrupts the digestive function of humans and animals that consume infected grain, resulting in nausea, headaches, vomiting, and in extreme cases, death. DON levels in human food should not exceed 1 ppm; however, grain infected with FHB can exceed 20 ppm [114]. Grain that does not meet the DON limits set by either the USDA or individual buyers can be docked in value or rejected entirely, causing significant economic damage for farmers [115].
Wheat breeders have worked to develop varieties that have genetic resistance to FHB. FHB resistance is a quantitative trait that is significantly influenced by genotype by environment interactions [116]. There are generally two major sources of FHB resistance in the United States: 'exotic' and 'native' germplasm. One of the most influential exotic genotypes, 'Sumai-3', was identified in China [117]. Several major-effect QTL have been validated for FHB resistance. The first three identified were Fhb1, Fhb2, and Qfhs.ifa-5A, found in 'Sumai-3' [118][119][120][121][122]. The next two major QTL, Fhb4 and Fhb5, were found in the genotype 'Wangshuibai' [123,124]. Qfhs.nau-2DL was then found in the breeding line CJ9306 [125,126]. The most recently discovered QTL was Fhb7, found in Thinopyrum ponticum [127]. The two primary forms of FHB resistance that exist in field conditions are Type I resistance, or resistance to initial infection by F. graminearum, and Type II resistance, or resistance to fungal spread within the infected spike [116]. The Fhb1 and Qfhs.nau-2DL QTL primarily confer Type II resistance while also reducing DON accumulation, whereas Qfhs.ifa-5A primarily confers Type I resistance and reduces DON accumulation [128][129][130][131].
As FHB resistance in wheat is a complex, quantitative trait, it is an ideal candidate for GS. Previous studies explored the potential of GS to predict resistance to FHB in wheat under different subsets of markers, mapping populations, and prediction models and scenarios. A wheat panel of 322 genotypes was previously used to predict the performance of six traits associated with FHB resistance. Four genomic prediction models (RR-BLUP, Bayesian-LASSO, RKHS, and Random Forest) across two different marker sets (whole-genome versus a marker subset associated with FHB resistance) were compared. Random Forest and RKHS models had the highest prediction accuracies for most of the traits. In the case of DON, the Random Forest model plus the targeted and whole-genome markers had the highest prediction accuracy [132]. Hoffstetter et al. found that prediction accuracy for FHB resistance improved with the use of marker and training subsets [133]. Another study predicted FHB resistance for six traits using three models (RR-BLUP, elastic net, and LASSO), with marker densities between 500 and 4500 SNPs and training population sizes between 96 and 2018 genotypes. Overall, there was a high prediction accuracy for FHB resistance. The top performing model was RR-BLUP, whereas the marker sets having the highest accuracies ranged between 1500 and 3000 SNPs. Depending on the trait, the optimum training population was less than 192 genotypes [134]. Another study on spring wheat genotypes in the Pacific Northwest United States used a population of 170 genotypes. FHB incidence was observed to have the highest prediction accuracies, and genotypes that were similarly related to the initial training population had a prediction accuracy of 0.60. The prediction accuracy was fairly high for FHB resistance overall, suggesting that GS could work well for FHB resistance [135].

Conclusions
In GS, all markers, including those with minor effects, are included in the model simultaneously, allowing breeders to use GEBVs to make selections on performance when limited or no phenotypic data are available. This is especially important for quantitative traits conferred by a large number of genes each with a minor effect, which, in the workflow of a traditional breeding program, are not evaluated until the later generations. The availability of inexpensive genotyping technology makes GS a feasible and attractive approach. In terms of influencing the breeder's equation (R = irσ_A/t), GS can increase genetic gain by decreasing the breeding cycle time and by increasing selection accuracy and intensity in early generations. In wheat, studies have shown the potential for positive genetic gains from GS for quantitative traits such as grain yield, quality and disease resistance. Future work should focus on empirical studies that validate prediction accuracy in new breeding material in order to determine the efficacy of GS as a replacement for phenotypic selection, where applicable. As different factors affect the accuracy of GS, thoughtful consideration of these parameters, the traits being evaluated, and the available resources should result in the successful implementation of these approaches in wheat breeding programs.
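As a small worked illustration of the breeder's equation, the Python sketch below computes the expected response per year, taking i as selection intensity, r as selection accuracy, σ_A as the additive genetic standard deviation, and t as cycle length in years. The two scenarios and every number in them are hypothetical values chosen only to show how a shorter cycle can offset a lower accuracy; they are not estimates from any study cited here.

```python
def gain_per_year(i, r, sigma_a, t):
    """Breeder's equation: expected genetic gain per year, R = i * r * sigma_A / t."""
    if t <= 0:
        raise ValueError("cycle time t must be positive")
    return i * r * sigma_a / t

# Hypothetical scenarios (illustrative values only).
# Phenotypic selection: higher accuracy, but a long cycle.
phenotypic = gain_per_year(i=1.4, r=0.7, sigma_a=1.0, t=5)
# Genomic selection: lower accuracy, but selection in early generations
# shortens the cycle.
genomic = gain_per_year(i=1.4, r=0.5, sigma_a=1.0, t=2)

print(f"phenotypic: {phenotypic:.3f} per year")
print(f"genomic:    {genomic:.3f} per year")
print(f"ratio:      {genomic / phenotypic:.2f}")
```

Under these assumed values the shorter cycle more than compensates for the drop in accuracy, which is the mechanism by which GS is expected to raise gain per unit time.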

Conflicts of Interest:
The authors declare no conflict of interest.