Article

Multivariate Genomic Hybrid Prediction with Kernels and Parental Information

by Osval A. Montesinos-López (1), José Crossa (2,3), Carolina Saint Pierre (2), Guillermo Gerard (2), Marco Alberto Valenzo-Jiménez (4), Paolo Vitale (2), Patricia Edwigis Valladares-Cellis (5), Raymundo Buenrostro-Mariscal (1), Abelardo Montesinos-López (6,*) and Leonardo Crespo-Herrera (2,*)
(1) Facultad de Telemática, Universidad de Colima, Colima 28040, Colima, Mexico
(2) International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco 52640, México, Mexico
(3) Colegio de Postgraduados, Montecillos 56230, México, Mexico
(4) Universidad Michoacana de San Nicolas de Hidalgo (UMSNH), Avenida Francisco J. Mujica S/N Ciudad Universitaria, Morelia 58030, Michoacán, Mexico
(5) Bachillerato 22, Universidad de Colima, Cuauhtémoc 28510, Colima, Mexico
(6) Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara 44430, Jalisco, Mexico
(*) Authors to whom correspondence should be addressed.
Int. J. Mol. Sci. 2023, 24(18), 13799; https://doi.org/10.3390/ijms241813799
Submission received: 22 July 2023 / Revised: 28 August 2023 / Accepted: 1 September 2023 / Published: 7 September 2023
(This article belongs to the Special Issue Plant Population Genomics)

Abstract

Genomic selection (GS) plays a pivotal role in hybrid prediction. It can enhance the selection of parental lines, accurately predict hybrid performance, and harness hybrid vigor. Likewise, it can optimize breeding strategies by reducing field trial requirements, expediting hybrid development, facilitating targeted trait improvement, and enhancing adaptability to diverse environments. Leveraging genomic information empowers breeders to make informed decisions and significantly improve the efficiency and success rate of hybrid breeding programs. To improve genomic prediction performance, we explored the incorporation of parental phenotypic information as covariates under a multi-trait framework. Approach 1, referred to as Pmean, directly used the parental phenotypic information without any preprocessing, while approach 2, denoted as BV, replaced the phenotypic values of both parents with their respective breeding values. Prediction performance improved under both approaches, with a reduction of at least 4.24% in the normalized root mean square error (NRMSE), and the direct incorporation of parental phenotypic information in the Pmean approach slightly outperformed the BV approach. We also compared these two approaches using linear and nonlinear kernels, but no relevant gain was observed. Finally, our results add to the empirical evidence that integrating parental phenotypic information helps increase the prediction performance of hybrids.

1. Introduction

Meeting the increasing global demand for food is an imperative challenge, especially in the face of climate change and its subsequent impact on natural resources. Plant breeding plays a crucial role in the human food chain by contributing to high and stable production yields with minimal external inputs and environmental impact. In the last twenty years, plant breeding has been revolutionized by Genomic Selection (GS), a predictive methodology proposed by Meuwissen et al. (2001) [1], which has enabled the selection of superior candidates based solely on genotypic information.
GS is important because it allows breeders to make more accurate and efficient selection decisions in plant and animal breeding programs. It involves the use of genomic data, such as DNA markers, to predict the genetic value of an individual or a population for a given trait of interest. This prediction is based on the association between the genomic data and the phenotype (observable characteristics) of the individuals [2,3,4,5]. Compared to traditional breeding methods, GS has several advantages. Firstly, individuals with desired traits can be selected earlier, reducing the time and cost of breeding programs. Likewise, traits that are difficult to measure or are expressed late in the life cycle can also be selected. Additionally, GS can increase the genetic gain per unit of time and cost, leading to the development of more productive and resilient crops and livestock. Overall, GS has revolutionized the field of plant and animal breeding, making it more efficient and effective [6,7,8].
Nonetheless, the successful implementation of GS in plant breeding for predicting the performance of hybrid combinations based on the genetic makeup of their parents poses several challenges, which include: (1) Non-additive effects: Hybrid prediction assumes additive genetic effects of parental lines, but in reality, non-additive effects, such as dominance and epistasis, significantly influence hybrid performance. Accurately modeling non-additive effects requires larger sample sizes for estimation; (2) Genotype-by-environment (G × E) interaction: Hybrid performance varies across different environments due to G × E interaction. This variation makes it challenging to predict hybrid performance accurately across diverse environments; (3) Limited data: The availability of limited data poses difficulties in accurately predicting hybrid performance, especially for new or untested hybrid combinations. Insufficient data hinders accurate predictions of hybrid performance; (4) Heterogeneous parental populations: Genetic diversity among parental lines used in hybrid creation complicates the accurate prediction of hybrid performance. This challenge is particularly prominent when dealing with open-pollinated populations or composite crosses; and (5) Complex trait architecture: The accurate modeling of complex traits, such as yield or disease resistance, is difficult, posing challenges in predicting hybrid performance for these traits. In conclusion, hybrid prediction in plant breeding is a complex and challenging task that requires careful consideration of the underlying genetic and environmental factors influencing hybrid performance [9].
In this regard, a valuable alternative to genomic prediction in plant breeding is multi-trait hybrid prediction, which permits the simultaneous prediction of hybrid performance for multiple traits, reducing both time and cost in breeding programs. Multi-trait hybrid prediction is advantageous for several reasons, including (1) Improved accuracy: By considering trait correlations, multi-trait hybrid prediction enhances the accuracy of performance prediction. Leveraging similarities in the genetic basis of traits, information from one trait improves predictions for another; (2) Reduced bias: Traditional genomic prediction methods may exhibit bias due to an uneven distribution of phenotypic data or low heritability of traits. Multi-trait hybrid prediction mitigates such bias by integrating genetic information from multiple traits to estimate hybrid genetic value; (3) Enhanced hybrid selection: Multi-trait hybrid prediction facilitates the selection of hybrids with desired combinations of traits, leading to more productive and resilient crops. This is especially valuable for complex traits that are difficult to measure or expressed late in the life cycle; and (4) Improved management of G × E interaction: Multi-trait hybrid prediction assists breeders in effectively managing genotype-by-environment (G × E) interaction. It considers trait correlations and their interaction with environmental factors, aiding the identification of hybrids with consistent performance across diverse environments. Overall, multi-trait hybrid prediction shows promise in improving the accuracy and efficiency of plant breeding programs, resulting in the development of highly productive and resilient crops [10,11].
Montesinos-López et al. (2022) [12,13] showed that multi-trait prediction using the kernel method has the potential to enhance prediction accuracy as kernels are able to capture nonlinear patterns when they are in the data. Kernel methods are important for genomic prediction because they efficiently model complex, nonlinear relationships between genetic variants and phenotypes. In genomic prediction, the goal is to predict phenotypic traits based on genetic information, and kernel methods provide a powerful framework for achieving this. Kernel methods rely on the concept of a kernel function, which can be thought of as a measure of similarity between two objects. In the context of genomic prediction, the objects are typically genetic variants, and the kernel function measures the similarity between two variants based on their genetic similarity [13]. By using kernel methods, genomic prediction models can capture complex interactions between genetic variants that may not be easily captured by linear models. This is important because many phenotypic traits of interest are influenced by numerous genetic variants, each of which may have a small effect. Kernel methods allow for these subtle interactions to be captured, leading to more accurate predictions. Overall, kernel methods are a powerful tool for genomic prediction, as they efficiently model complex, nonlinear relationships between genetic variants and phenotypes [1,14,15].
Since the prediction of hybrids in plant breeding is very challenging, many approaches have been studied to increase prediction accuracy. For example, some authors, such as Xu et al. (2021) [16] and Wang et al. (2012) [17], proposed the incorporation as an input of parental information to increase the prediction accuracy of hybrids because it provides additional information about the genetic background of the offspring. The genetic makeup of the offspring is a combination of the genetic makeup of the two parents, and this information is used to improve predictions of the offspring’s performance. Other studies proposed to incorporate phenotypic data or environmental data, among others.
For this reason, to evaluate if incorporating phenotypic parental information as an additional input improves prediction accuracy in the context of multi-trait with kernel methods, we utilized a wheat dataset provided by the International Maize and Wheat Improvement Center (CIMMYT). We hypothesize that incorporating the phenotypic information of the parents as covariates in the multivariate genomic prediction model with kernels will improve the prediction accuracy of the GS methodology. In this sense, we aim to broaden our understanding of the genetics of the studied traits and maximize the efficiency of hybrid prediction by exploring the multivariate kernel approach.

2. Results

The results for each trait and each year compare the three strategies of incorporating the parental phenotypic information (NO_Cov, BV, and Pmean) in each of the six kernels: AC_1, AC_2, AC_3, AC_4, GK, and Linear. These comparisons were carried out for each year and across years.

2.1. Trait DTF

Figure 1 shows the prediction performance of each kernel in each year and across years (Global). For each kernel in each year, we compared the three strategies of incorporating the parental phenotypic information (NO_Cov, BV, and Pmean).
Figure 1 shows that, in each year and across years, the best prediction performance was obtained when the parental phenotypic information was incorporated (BV and Pmean). The differences between kernels were small. In year 1 under the BV strategy, the linear kernel gave the best prediction performance, outperforming the remaining kernels by only 1.534%; under the Pmean strategy, also in year 1, the linear kernel was again the best, but only by 0.706%. The same pattern was observed in years 2 and 3: under the BV strategy, the linear kernel outperformed the remaining kernels by 2.125% and 2.33%, respectively, while under the Pmean strategy it outperformed the others by 0.062% and 0.106%, respectively (Figure 1).
Finally, across years (Global), the linear kernel was also the best under both strategies of incorporating the phenotypic parental information (Figure 1). However, its gain in terms of NRMSE over the other kernels was only 1.982% under the BV strategy and 0.304% under the Pmean strategy. Under the linear kernel, the best strategy for incorporating the phenotypic information (both for each year and across years) was Pmean, with an NRMSE = 0.626, followed by BV, with an NRMSE = 0.632. The NO_Cov strategy, which does not incorporate parental phenotypic information, was the worst, with an NRMSE = 0.666, meaning that the Pmean and BV strategies were better than NO_Cov by 6.4% and 5.4%, respectively. See details in Table A1.

2.2. Trait DTH

Figure 2 demonstrates that including parental phenotypic information (BV and Pmean) consistently resulted in the most accurate predictions across different years. Although there were slight differences among various kernels, the linear kernel stood out in year 1, outperforming others by 1.577% using the BV strategy and 0.468% using the Pmean strategy. This trend continued in years 2 and 3 under the BV strategy, with the linear kernel leading by 1.875% and 2.448%, respectively. In year two, under the Pmean strategy, all six kernels performed equally, while in year three, the linear kernel was slightly better, with a 0.142% advantage.
When considering all years together (Global), the linear kernel was the top performer under both the BV and Pmean strategies, improving NRMSE by 1.955% and 0.202%, respectively, compared to the other kernels. Under the linear kernel, the best approach for integrating phenotypic information was the Pmean strategy, with an NRMSE of 0.610. The BV strategy followed closely with an NRMSE of 0.617. In contrast, the strategy of not including parental phenotypic information (NO_Cov) performed the worst, with an NRMSE of 0.651. This means that the Pmean and BV strategies outperformed the NO_Cov strategy by 6.7% and 5.5%, respectively. For more detailed information, please refer to Table A2.

2.3. Trait YIELD

Figure 3 illustrates that including parental phenotypic data (BV and Pmean) consistently led to the best predictive performance across different years. Nevertheless, there were subtle differences among the various kernels. Specifically, in year one, the linear kernel performed slightly worse than the other kernels by only 0.099% when using the BV strategy. Meanwhile, under the Pmean strategy, the linear kernel performed equally as well as the other kernels. This pattern persisted in years 2 and 3 under the BV strategy, where the linear kernel showed slightly lower performance than the other kernels by 0.075% and 0.194%, respectively. Similarly, under the Pmean strategy, the linear kernel exhibited slightly lower performance than the other five kernels, but the difference was minimal, with only 0.013% and 0.071% variations in years two and three, respectively.
When considering all years collectively (Global), the linear kernel showed a slight performance disadvantage compared to the other kernels under both the BV and Pmean strategies, with its NRMSE being worse by 0.120% and 0.026%, respectively (Figure 3). Under the top-performing kernels (AC_1, AC_2, AC_3, AC_4, and GK), the best strategy for incorporating phenotypic information was Pmean, which achieved an NRMSE of 0.766. The BV strategy followed closely with an NRMSE of 0.778. In contrast, the strategy of not incorporating parental phenotypic information (NO_Cov) showed the poorest performance, with an NRMSE of 0.796 (Figure 3). This demonstrates that the Pmean and BV strategies outperformed the NO_Cov strategy by 3.9% and 2.3%, respectively. For more detailed information, please refer to Table A3.
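The percentage gains quoted for the three traits can be reproduced from the reported Global NRMSE values. The sketch below assumes the gain is computed as the ratio of the NO_Cov NRMSE to the NRMSE obtained with covariates, minus one; this direction of the ratio is inferred from the reported numbers rather than stated in the text, and the names `reported_nrmse` and `percent_gain` are illustrative only.

```python
# Global NRMSE values quoted in Sections 2.1-2.3 (linear kernel for DTF and
# DTH; best-performing kernels for YIELD).
reported_nrmse = {
    "DTF": {"NO_Cov": 0.666, "BV": 0.632, "Pmean": 0.626},
    "DTH": {"NO_Cov": 0.651, "BV": 0.617, "Pmean": 0.610},
    "YIELD": {"NO_Cov": 0.796, "BV": 0.778, "Pmean": 0.766},
}


def percent_gain(nrmse_no_cov, nrmse_with_cov):
    """Percentage by which the NO_Cov error exceeds the error with covariates.

    Assumption: gain = (NRMSE_NO_Cov / NRMSE_strategy - 1) * 100.
    """
    return (nrmse_no_cov / nrmse_with_cov - 1.0) * 100.0


gains = {
    trait: {
        strategy: round(percent_gain(values["NO_Cov"], values[strategy]), 1)
        for strategy in ("Pmean", "BV")
    }
    for trait, values in reported_nrmse.items()
}

for trait, by_strategy in gains.items():
    print(trait, by_strategy)
```

Under this reading, the computed gains agree with the percentages reported for DTF, DTH, and YIELD to one decimal place.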

2.4. Across Traits

Since the linear kernel was the best in two out of three of the traits under study, in this section, we present the results across traits only for this linear kernel.
Across traits and years, Figure 4 shows that omitting the parental phenotypic information as a covariate in the modeling process worsens prediction performance by 4.24% relative to incorporating it as breeding values (BV) estimated from the parents, and by 5.59% relative to incorporating it directly as the BLUEs of the parents (Pmean). In other words, these results indicate that incorporating the parental phenotypic information improves the prediction performance for hybrids by at least 4.24% under a multi-trait framework.

3. Discussion

Hybrid prediction is of utmost importance in plant breeding as it harnesses the benefits of hybrid vigor, improves yield potential, enhances genetic diversity, facilitates trait selection, optimizes resource utilization, accelerates breeding progress, increases crop productivity, promotes sustainable agriculture, addresses consumer preferences, and provides economic advantages to stakeholders. However, as mentioned in the introduction, many factors affect the efficient development of highly productive hybrids.
For these reasons, the GS methodology is crucial for hybrid development as it enhances breeding efficiency, improves prediction accuracy, enables early-stage selection, aids in complex trait prediction, manages genetic diversity, facilitates information transferability, complements marker-assisted selection, enables selection for novel traits, supports data-driven breeding decisions and can be integrated with other breeding approaches [1,9]. By leveraging genomic information, breeders can optimize hybrid breeding programs and accelerate the development of high-performing and genetically superior hybrids.
However, for a successful implementation of the GS methodology, high accuracy is key for an efficient selection, increasing genetic gain, saving costs and time, utilizing genetic resources effectively, selecting for complex traits, facilitating precision breeding, building confidence in breeding decisions, adapting to changing years, managing genetic diversity, and promoting industry acceptance and adoption. Achieving high accuracy in genomic selection enhances the effectiveness and efficiency of plant breeding programs, ultimately leading to the development of improved and high-performing hybrids.
For this reason, the prediction performance of the GS methodology was explored under a multivariate framework with two strategies for incorporating the parental phenotypic information in the modeling process as covariates. The first approach, denoted as Pmean, directly used the parental phenotypic information without any preprocessing, while the second, denoted as BV, used only the breeding values of the parents ($g_{M,t}$ and $g_{F,t}$) instead of the phenotypic values of both parents ($P_{M,t}$ and $P_{F,t}$). Under both approaches, an improvement in prediction performance (a reduction in NRMSE) of at least 4.24% was observed, and the direct approach of incorporating the parental phenotypic information (Pmean) was slightly better than the BV approach. However, we do not have elements to say that the Pmean approach is statistically better than the BV approach; the small difference between the two approaches can be attributed in part to Monte Carlo error, since we implemented both approaches under a Bayesian framework.
Likewise, under both strategies (Pmean and BV) of incorporating the parental phenotypic information, we explored the use of nonlinear inputs using different kernels (AC_1, AC_2, AC_3, AC_4, and GK), which were compared to the conventional linear kernel. We did not find relevant differences in prediction performance between the kernels using the NRMSE as a metric. These findings suggest that, for this data set, the linear kernel is sufficient, since no significant gain in prediction performance was observed when the nonlinear kernels were evaluated. However, even though the nonlinear kernels were not better in terms of NRMSE than the conventional linear kernel, in many data sets the use of these nonlinear kernels still helps to increase prediction performance, since they can efficiently capture nonlinear patterns in the input data when they are present.
In general, genomic prediction models under a multivariate context with nonlinear kernels have the potential to capture complex relationships, improve prediction accuracy, consider trait correlations, account for genetic pleiotropy and interactions, uncover hidden patterns, offer flexibility and adaptability, allow for cross-species applications, support better breeding decisions, and contribute to advancements in data science. By incorporating these models, breeders can enhance the accuracy and efficiency of genomic prediction, leading to improved plant breeding outcomes and the development of superior hybrids.
However, the efficacy of the multivariate model in comparison to the single-trait analysis varies depending on the specific problem at hand. Conventional multivariate models presuppose a uniform covariance of effects throughout the genome. In genomic regions where the alignment of effects correlates closely with the average effect correlation across the genome, leveraging information sharing among traits can enhance statistical power. Conversely, in regions where the correlation of effects significantly deviates from the genome’s average correlation pattern, the multivariate model might diminish statistical power. This reduction can be attributed to the tendency of multivariate models to constrict the magnitude of effects towards a shared covariance pattern [18].
Finally, our findings add to the empirical evidence that integrating parental phenotypic information as covariates improves prediction performance. By incorporating this information, prediction models can make more accurate predictions and support more effective breeding decisions in plant breeding programs. However, we did not find any improvement when adding this information under a multi-trait framework versus a uni-trait framework, nor did we find an improvement in integrating it as BV as opposed to integrating it as Pmean.

4. Materials and Methods

4.1. Phenotypic Data

Field experiments were conducted at CIMMYT’s Campo Experimental Norman E. Borlaug (CENEB) near Ciudad Obregon, Sonora, Mexico, over a period of three years. A total of 1888 hybrids resulting from crosses between 667 females and 18 males were evaluated. Specifically, the number of hybrids assessed during the winter growing seasons of 2014–2015 (Year 1), 2015–2016 (Year 2), and 2016–2017 (Year 3) were 703, 655, and 1197, respectively. Among these, 225 hybrids were common between consecutive years (Years 1 and 2), while 383 hybrids were common between Years 2 and 3. The selection of elite female and male parents was based on their performance for desired traits, ability to produce hybrids, and ancestral diversity, which was determined using a coefficient of parentage.
In order to produce the hybrids, a chemical hybridizing agent provided by Syngenta Inc. (Wilmington, DE, USA) was utilized in alternate male and female strip plots measuring 6.4 m. The parents and hybrids were evaluated in α-lattice trials, with two replications conducted over a span of two years. Each 4.8 m yield trial plot consisted of 1000 seeds to ensure uniform plant density. Standard agronomic practices, including four supplementary irrigations, were employed in a high-yield-potential environment. All male parents involved in the hybrids and the set of two advanced checks tested each year were planted in all trials. The hybrids and female parents were planted side by side in all the trials. For each entry, data on days to flowering (DTF), days to heading (DTH), days to maturity (DTM), grain yield (GY), and plant height (PHT) per plot were recorded. Phenotypic data were analyzed using a mixed linear model implemented in META-R software (V5.0). Best linear unbiased estimates (BLUEs) were estimated by fitting the model with trial (as a random effect), genotype (as a fixed effect), replication nested within trials (as a random effect), and sub-blocks nested within trials and replications (as random effects). The obtained BLUEs for each hybrid and parent were utilized for subsequent analyses. This paper focuses on the analysis of three traits: grain yield (GY), days to flowering (DTF), and days to heading (DTH).

4.2. Genotypic Data

In the first year, 18 male and 667 female parents underwent genotyping using the Illumina iSelect 90K Infinium SNP genotyping array. In the second and third years, genotyping was performed using the Illumina Infinium 15K wheat SNP array (TraitGenetics GmbH, Gatersleben, Germany). After combining the data from all three years, a total of 13,005 single-nucleotide polymorphisms (SNPs) remained. SNPs with less than 15% missing values were retained, and any remaining missing markers were imputed using the naive method based on observed allele frequencies. Following imputation, markers with a minor allele frequency below 0.05 were excluded from the analysis. A total of 10,250 markers were ultimately used for further analysis. Although a larger set of hybrids and parents was assessed in the field experiments, only hybrids derived from parents that had undergone SNP genotyping were considered for genomic predictions. The number of hybrids included varied in each year of evaluation.
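The marker quality-control steps described above (missing-value filter, naive imputation, minor allele frequency filter) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes markers are coded as allele dosages 0/1/2 with `None` for missing calls, and it reads the "naive method based on observed allele frequencies" as replacing a missing call with the mean observed dosage of that marker. The function name `filter_and_impute` and the toy data are hypothetical.

```python
def filter_and_impute(markers, max_missing=0.15, min_maf=0.05):
    """Filter and impute a marker matrix given as a list of rows (one per line).

    Assumptions: dosages coded 0/1/2, None for missing; missing calls are
    imputed with the marker's mean observed dosage.
    Returns (indices of kept markers, cleaned matrix as a list of rows).
    """
    n_lines = len(markers)
    n_markers = len(markers[0])
    kept, columns = [], []
    for j in range(n_markers):
        column = [row[j] for row in markers]
        observed = [v for v in column if v is not None]
        missing_rate = 1.0 - len(observed) / n_lines
        # Retain only markers with less than 15% missing values.
        if not observed or missing_rate >= max_missing:
            continue
        mean_dosage = sum(observed) / len(observed)
        imputed = [mean_dosage if v is None else float(v) for v in column]
        # Minor allele frequency computed after imputation.
        allele_freq = sum(imputed) / (2.0 * n_lines)
        if min(allele_freq, 1.0 - allele_freq) < min_maf:
            continue
        kept.append(j)
        columns.append(imputed)
    cleaned = [[col[i] for col in columns] for i in range(n_lines)]
    return kept, cleaned


# Toy data: 10 lines, 4 markers (written column-wise for readability).
toy_columns = [
    [0, 2, 0, 2, 0, 2, 0, 2, 0, None],        # 10% missing: kept
    [0, None, 2, None, 0, 2, None, 0, 2, 0],  # 30% missing: dropped
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],           # monomorphic: dropped by MAF
    [2, 2, 0, 0, 2, 0, 2, 0, 2, 0],           # complete: kept
]
toy_markers = [list(row) for row in zip(*toy_columns)]
kept, cleaned = filter_and_impute(toy_markers)
print(kept)
```

Filtering on missingness before imputation and on minor allele frequency afterwards follows the order given in the text.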

4.3. Multivariate Statistical Model

This model is given by
$$Y = Z_E \beta_E + Z_M g_M + Z_F g_F + Z_H h + u_M + u_F + u_H + X_{AC} \beta_{AC} + \epsilon \qquad (1)$$
where $Y$ is the matrix of response variables of order $n \times n_T$ (with $n_T = 3$ since the traits under study were GY, DTF, and DTH); $n$ denotes the total number of observations; $Z_E$ is the design matrix for environments (years); $\beta_E$ is the matrix of year effects of order $I \times n_T$, where $I$ denotes the number of years, and these effects are assumed to be random since model (1) was implemented under a Bayesian framework, with $\beta_E \sim MN_{I \times n_T}(0, \sigma_E^2 I_I, I_{n_T})$, that is, a matrix-variate normal distribution with parameters $M = 0$, $U = \sigma_E^2 I_I$, and $V = I_{n_T}$; $g_M$ is the matrix of random effects due to the general combining ability (GCA) of markers from paternal lines (males, M); $g_F$ is the matrix of random effects due to the GCA of markers for maternal lines (females, F); and $h$ is the matrix of specific combining ability (SCA) random effects for the crosses (hybrids, H). The incidence matrices $Z_M$, $Z_F$, and $Z_H$ relate $Y$ to $g_M$, $g_F$, and $h$, with $g_M \sim MN_{M \times n_T}(0, G_M, \Sigma_M)$, where $M$ denotes the male parents; $g_F \sim MN_{F \times n_T}(0, G_F, \Sigma_F)$, where $F$ denotes the female parents; and $h \sim MN_{H \times n_T}(0, H, \Sigma_H)$ with $H = Z_M G_M Z_M^T \odot Z_F G_F Z_F^T$, where $H$ denotes the hybrids resulting from combining the M males and F females and $\odot$ denotes the Hadamard product. Here, $\Sigma_M$, $\Sigma_F$, and $\Sigma_H$ are variance-covariance components associated with GCA and SCA, and $G_M$, $G_F$, and $H$ are relationship matrices for paternal and maternal lines and hybrids, respectively. Likewise, $u_M \sim MN_{ME \times n_T}(0, V_M, \Sigma_{ME})$ denotes the random effects of male-year combinations, $u_F \sim MN_{FE \times n_T}(0, V_F, \Sigma_{MF})$ denotes the random effects of female-year combinations, and $u_H \sim MN_{HE \times n_T}(0, V_H, \Sigma_{MH})$ denotes the random effects of hybrid-year combinations; $\Sigma_{ME}$, $\Sigma_{MF}$, and $\Sigma_{MH}$ are variance-covariance matrices of components associated with the male × year, female × year, and hybrid × year interactions, respectively; and $V_M$, $V_F$, and $V_H$ are the associated variance-covariance matrices.
These variance-covariance matrices are given by $V_M = Z_M G_M Z_M^T \odot Z_E Z_E^T$, $V_F = Z_F G_F Z_F^T \odot Z_E Z_E^T$, and $V_H = Z_H H Z_H^T \odot Z_E Z_E^T$. Finally, $\epsilon$ is the residual matrix of dimension $n \times n_T$ distributed as $\epsilon \sim MN_{n \times n_T}(0, I_n, R)$, where $R$ is the residual variance-covariance matrix of order $n_T \times n_T$.
The relationship matrices $G_M$ and $G_F$ were computed using markers (VanRaden, 2008). Let $X_m$, $m \in$ {Male, Female}, be the matrix of markers and let $W_m$ be the matrix of centered and standardized markers. Then $G_m = W_m W_m^T / p$ [5,19], where $p$ is the number of markers. $X_{AC}$ is the matrix that contains the parental covariates of the trait to be predicted and of correlated traits. The parental information was used under two approaches. The first one (Pmean) directly used the parental phenotypic information without any preprocessing. Under this approach, for each trait, we computed two covariates using the parental phenotypic information. The covariate that captures the additive part is computed as
$$X_{AC,t,a} = \frac{P_{M,t} + P_{F,t}}{2} \qquad (2)$$
where $P_{M,t}$ denotes the male parental phenotype for trait $t$, $P_{F,t}$ denotes the female parental phenotype for trait $t$, with $t$ = GY, DTF, and DTH, while $a$ denotes the additive effects. The other covariate captures the dominance part and is computed as
$$X_{AC,t,d} = \frac{|P_{M,t} - P_{F,t}|}{2} \qquad (3)$$
where $d$ denotes dominance. The matrix $X_{AC}$ contains six columns, since two covariates (one for $a$ and the other for $d$) were computed for each of the three traits under study. The second approach is denoted as BV. Here, instead of directly using the phenotypic values of both parents ($P_{M,t}$ and $P_{F,t}$), we used the breeding values of the parents ($g_{M,t}$ and $g_{F,t}$), which were computed with the following predictor:
$$P = 1\mu + g + e \qquad (4)$$
where $P$ is the vector of the response variable for each trait of order $n_p \times 1$, $1$ is a vector of ones of order $n_p \times 1$, $\mu$ denotes a general mean, $g$ is the vector of random effects of both parental lines distributed as $g \sim N_{n_p \times 1}(0, G)$, with $G$ representing the genomic relationship matrix, and $e$ is the vector of residual errors distributed as $e \sim N_{n_p \times 1}(0, I_{n_p} \sigma_e^2)$. After fitting model (4), we estimated the breeding values, $g$, which contain the breeding values of males ($g_{M,t}$) and females ($g_{F,t}$). Then, with these breeding values, we computed the covariates $X_{AC,t,a}$ and $X_{AC,t,d}$ with Equations (2) and (3), but using $g_{M,t}$ and $g_{F,t}$ instead of $P_{M,t}$ and $P_{F,t}$. Finally, to assess both approaches for using the phenotypic information, Pmean and BV were compared with a model without the phenotypic information as covariates. This resulted in three strategies for incorporating or ignoring the parental phenotypic information: Pmean, BV, and the NO_Cov method, which ignores the parental phenotypic information. The implementation of these models was carried out in the R statistical software (V.6) using the BGLR library [20].
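As an illustration of how one row of the covariate matrix could be assembled from Equations (2) and (3), the sketch below computes the additive and dominance covariates for a single hybrid. It assumes Equation (3) is half the absolute difference between the two parental values; the function name `parental_covariates` and the parental values are hypothetical and are not taken from the data set. Passing BLUEs gives the Pmean covariates, while passing estimated breeding values gives the BV covariates.

```python
TRAITS = ("GY", "DTF", "DTH")


def parental_covariates(male, female, traits=TRAITS):
    """Build the six covariates (additive and dominance per trait) for one hybrid.

    male, female: dicts mapping trait name to the parental value
    (a BLUE for the Pmean approach, a breeding value for the BV approach).
    """
    row = {}
    for t in traits:
        # Equation (2): mid-parent value (additive part).
        row[f"{t}_a"] = (male[t] + female[t]) / 2.0
        # Equation (3), read as half the absolute parental difference
        # (dominance part); this reading is an assumption.
        row[f"{t}_d"] = abs(male[t] - female[t]) / 2.0
    return row


# Hypothetical parental values for one cross.
male = {"GY": 6.0, "DTF": 80.0, "DTH": 75.0}
female = {"GY": 5.0, "DTF": 84.0, "DTH": 78.0}
x_ac_row = parental_covariates(male, female)
print(x_ac_row)
```

Stacking one such row per hybrid yields a six-column matrix, matching the dimensions stated for the covariate matrix.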

4.4. Evaluation of Prediction Performance

In each of the three methods evaluated (Pmean, BV, and NO_Cov), we employed a type of cross-validation that emulates real breeding strategies, referred to as untested lines in tested years, using a seven-fold cross-validation [13]. In this approach, the training set was allocated to 7-1 folds, while the remaining fold was assigned to the testing set. This process was repeated until each of the seven folds had been utilized at least once in the testing set. The average performance across the seven folds was then reported as the prediction performance, using the normalized root mean square error (NRMSE) as the evaluation metric. In order to compare the prediction accuracies between models of the same type (Pmean and BV), the relative efficiencies in terms of NRMSE were computed as follows:
$$RE_{NRMSE} = \frac{NRMSE_{Mx}}{NRMSE_{Mx\_z}}$$
where $NRMSE_{Mx}$ and $NRMSE_{Mx\_z}$ denote the NRMSE of model $x$ = Pmean or BV and of $z$ = NO_Cov, respectively. Because a lower NRMSE indicates better prediction, when $RE_{NRMSE} > 1$ the best prediction performance in terms of NRMSE was obtained using method $Mx\_z$, whereas when $RE_{NRMSE} < 1$ the best method was $Mx$. When $RE_{NRMSE} = 1$, both methods were equally efficient.
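As a hedged illustration of the evaluation scheme described above, the following Python sketch partitions lines into seven folds, computes the NRMSE for each held-out fold, and forms the relative efficiency ratio. The function names (`k_fold_indices`, `nrmse`, `relative_efficiency`), the toy phenotypes, the training-mean placeholder predictor, and the normalization of the RMSE by the mean of the observed values are all assumptions made for this example; the models in this study were fitted with BGLR in R, not with this code.

```python
import math
import random


def k_fold_indices(n, k=7, seed=0):
    """Randomly split indices 0..n-1 into k disjoint folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nrmse(observed, predicted):
    """RMSE divided by the mean of the observed values (assumed normalization)."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse) / (sum(observed) / len(observed))


def relative_efficiency(nrmse_mx, nrmse_mx_z):
    """Ratio NRMSE_Mx / NRMSE_Mx_z; values below 1 favor method Mx."""
    return nrmse_mx / nrmse_mx_z


# Toy phenotypes for 21 hypothetical lines (not data from the study).
y = [50.0 + (i % 5) for i in range(21)]
folds = k_fold_indices(len(y), k=7)

fold_scores = []
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(y)) if i not in held_out]
    # Placeholder predictor: the training mean stands in for a fitted model.
    train_mean = sum(y[i] for i in train_idx) / len(train_idx)
    observed = [y[i] for i in test_idx]
    fold_scores.append(nrmse(observed, [train_mean] * len(test_idx)))

cv_nrmse = sum(fold_scores) / len(fold_scores)  # average over the seven folds

# Global DTF values with the linear kernel from Table A1: Pmean 0.626, NO_Cov 0.666.
re_pmean = relative_efficiency(0.626, 0.666)
```

With the Table A1 values used in the last line, the ratio falls below 1, which under the rule above means that the Pmean model (Mx) predicted better than the model without covariates.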

4.5. Kernel Methods

Kernel functions (also referred to as kernel methods) are mathematical functions used in various machine learning algorithms. These functions enable the algorithms to operate in a high-dimensional feature space without explicitly computing the coordinates of the data points in that space [15]. Kernel functions can uncover complex, nonlinear relationships between data points by implicitly projecting them into a higher-dimensional space, where the relationships become more apparent and easier to separate. This process is called the “kernel trick”; it avoids the explicit computation of the higher-dimensional feature space, saving computational resources and memory.
Kernel functions represent the data efficiently: they take pairs of data points in the original space and return the inner product (similarity) between them in a higher-dimensional space. A popular kernel function is the linear one, which in genomic prediction corresponds to the Genomic Best Linear Unbiased Predictor (GBLUP). The nonlinear Gaussian kernel (GK) function, also known as the radial basis function kernel, depends on the Euclidean distance between the original attribute value vectors rather than on their dot product: $K(x_i, x_j) = e^{-\gamma \lVert x_i - x_j \rVert^2}$. The Gaussian kernel method is very popular, but it is sensitive to the choice of the $\gamma$ parameter and may be prone to overfitting.
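The two kernels just described can be sketched in a few lines of plain Python. This is only an illustrative example: the function names, the toy marker matrix `X`, the value of `gamma`, and the scaling of the linear kernel by the number of markers are assumptions, not the settings used in this study.

```python
import math


def linear_kernel(X):
    """K[i][j] = (x_i . x_j) / p, with p the number of markers (assumed scaling)."""
    p = len(X[0])
    return [[sum(a * b for a, b in zip(xi, xj)) / p for xj in X] for xi in X]


def gaussian_kernel(X, gamma=1.0):
    """K[i][j] = exp(-gamma * ||x_i - x_j||^2), the Gaussian (RBF) kernel."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj))) for xj in X]
        for xi in X
    ]


# Three hypothetical lines scored at three markers (toy values).
X = [[1.0, 0.0, -1.0], [1.0, 1.0, -1.0], [-1.0, 0.0, 1.0]]
K_lin = linear_kernel(X)
K_gk = gaussian_kernel(X, gamma=0.5)
```

In the Gaussian kernel, every line has similarity 1 with itself and the similarity decays toward 0 as the squared distance grows; larger values of `gamma` make that decay faster, which is one way to see why the kernel is sensitive to this parameter.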
The arc-cosine (AC) kernel function emulates artificial neural networks with more than one hidden layer ($l$). Cho and Saul [21] proposed a recursive relationship that repeats the inner product $l$ times. It is important to point out that this kernel method resembles a deep neural network, since more than one hidden layer can be used. In this study, the AC kernels with 1, 2, 3, and 4 hidden layers are denoted AC_1, AC_2, AC_3, and AC_4, respectively.
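The recursion can be sketched as follows, assuming the degree-1 arc-cosine kernel of Cho and Saul, in which each layer maps the current kernel to $\frac{1}{\pi}\sqrt{K_{ii}K_{jj}}\,[\sin\theta + (\pi - \theta)\cos\theta]$ with $\theta = \arccos\!\left(K_{ij}/\sqrt{K_{ii}K_{jj}}\right)$. The function name `arc_cosine_kernel`, the toy input, and the choice of the degree-1 form are assumptions made for illustration; rows of `X` are assumed to be nonzero.

```python
import math


def _angular_term(theta):
    """J(theta) = sin(theta) + (pi - theta) * cos(theta) for the degree-1 kernel."""
    return math.sin(theta) + (math.pi - theta) * math.cos(theta)


def arc_cosine_kernel(X, layers=1):
    """Apply the arc-cosine recursion `layers` times, starting from inner products."""
    n = len(X)
    K = [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]
    for _ in range(layers):
        diag = [K[i][i] for i in range(n)]
        new_K = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                norm = math.sqrt(diag[i] * diag[j])
                # Clamp to [-1, 1] to guard against rounding before acos.
                cos_theta = max(-1.0, min(1.0, K[i][j] / norm))
                new_K[i][j] = norm * _angular_term(math.acos(cos_theta)) / math.pi
        K = new_K
    return K


# Toy inputs: two orthogonal unit vectors and one opposite to the first.
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
K_ac1 = arc_cosine_kernel(X, layers=1)  # analogous to AC_1
K_ac2 = arc_cosine_kernel(X, layers=2)  # analogous to AC_2
```

Under this form the diagonal is preserved from layer to layer, orthogonal unit vectors receive similarity $1/\pi$ after one layer, and exactly opposite vectors receive similarity 0.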

5. Conclusions

In this paper, we explored the integration of parental phenotypic information under a multi-trait framework with different kernels. We evaluated two approaches for integrating the parental phenotypic information. Integrating this information improved prediction performance, reducing the normalized root mean square error (NRMSE) by at least 4.24%, but no relevant differences were observed between the two approaches. Furthermore, we did not find a significant increase in prediction performance using nonlinear kernels relative to linear kernels. Finally, our findings add to the empirical evidence that integrating parental phenotypic information as covariates helps to increase the performance of hybrid prediction.

Author Contributions

Conceptualization, O.A.M.-L., A.M.-L., J.C. and L.C.-H.; Data curation, P.V.; Formal analysis, A.M.-L., C.S.P., G.G. and M.A.V.-J.; Investigation, J.C., O.A.M.-L., A.M.-L., P.E.V.-C. and R.B.-M.; Methodology, O.A.M.-L. and A.M.-L.; Resources, Validation, J.C., O.A.M.-L. and A.M.-L.; Writing—original draft, O.A.M.-L., A.M.-L. and J.C.; Writing—review & editing, O.A.M.-L., A.M.-L., J.C., C.S.P., G.G., M.A.V.-J., P.V., P.E.V.-C., R.B.-M. and L.C.-H. All authors have read and agreed to the published version of the manuscript.

Funding

We are thankful for the financial support provided by the Bill & Melinda Gates Foundation [INV-003439, BMGF/FCDO, Accelerating Genetic Gains in Maize and Wheat for Improved Livelihoods (AG2MW)], the USAID projects [USAID Amend. No. 9 MTO 069033, USAID-CIMMYT Wheat/AGGMW, AGG-Maize Supplementary Project, AGG (Stress Tolerant Maize for Africa)], and the CIMMYT CRP (maize and wheat). We are also thankful for the financial support provided by the Foundation for Research Levy on Agricultural Products (FFL) and the Agricultural Agreement Research Fund (JA) through the Research Council of Norway for grants 301835 (Sustainable Management of Rust Diseases in Wheat) and 320090 (Phenotyping for Healthier and more Productive Wheat Crops).

Data Availability Statement

Phenotypic and genomic data can be downloaded from the link: http://hdl.handle.net/11529/10548129.

Acknowledgments

The authors are thankful for the administrative, technical field support and Lab assistance that established the different experiments in the field as well as in the Laboratory at the different institutions that generated the data used in this study.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

This appendix presents the data used to produce the figures.
Table A1. Prediction performance of the DTF trait measured by NRMSE by type, kernel, year, and predictive ability gain measured by Relative Efficiency.
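| Type | Kernel | Year | NRMSE | RE_Ker | RE_Env |
|---|---|---|---|---|---|
| BV | AC_1 | 1 | 0.675 | 0.000 | 0.000 |
| BV | AC_2 | 1 | 0.675 | 0.000 | 0.000 |
| BV | AC_3 | 1 | 0.675 | 0.000 | 0.000 |
| BV | AC_4 | 1 | 0.675 | 0.000 | 0.000 |
| BV | GK | 1 | 0.675 | 0.000 | 0.000 |
| BV | Linear | 1 | 0.665 | 1.534 | 0.000 |
| NO_Cov | AC_1 | 1 | 0.703 | 0.000 | 0.000 |
| NO_Cov | AC_2 | 1 | 0.703 | 0.000 | 0.000 |
| NO_Cov | AC_3 | 1 | 0.703 | 0.000 | 0.000 |
| NO_Cov | AC_4 | 1 | 0.703 | 0.000 | 0.000 |
| NO_Cov | GK | 1 | 0.703 | 0.000 | 0.000 |
| NO_Cov | Linear | 1 | 0.693 | 1.428 | 0.000 |
| Pmean | AC_1 | 1 | 0.670 | 0.000 | 0.000 |
| Pmean | AC_2 | 1 | 0.670 | 0.000 | 0.000 |
| Pmean | AC_3 | 1 | 0.670 | 0.000 | 0.000 |
| Pmean | AC_4 | 1 | 0.670 | 0.000 | 0.000 |
| Pmean | GK | 1 | 0.670 | 0.000 | 0.000 |
| Pmean | Linear | 1 | 0.666 | 0.706 | 0.000 |
| BV | AC_1 | 2 | 0.654 | 0.000 | 3.321 |
| BV | AC_2 | 2 | 0.654 | 0.000 | 3.321 |
| BV | AC_3 | 2 | 0.654 | 0.000 | 3.321 |
| BV | AC_4 | 2 | 0.654 | 0.000 | 3.321 |
| BV | GK | 2 | 0.654 | 0.000 | 3.321 |
| BV | Linear | 2 | 0.640 | 2.125 | 3.922 |
| NO_Cov | AC_1 | 2 | 0.683 | 0.234 | 3.048 |
| NO_Cov | AC_2 | 2 | 0.683 | 0.234 | 3.048 |
| NO_Cov | AC_3 | 2 | 0.683 | 0.234 | 3.048 |
| NO_Cov | AC_4 | 2 | 0.683 | 0.234 | 3.048 |
| NO_Cov | GK | 2 | 0.683 | 0.234 | 3.048 |
| NO_Cov | Linear | 2 | 0.684 | 0.000 | 1.359 |
| Pmean | AC_1 | 2 | 0.645 | 0.000 | 3.955 |
| Pmean | AC_2 | 2 | 0.645 | 0.000 | 3.955 |
| Pmean | AC_3 | 2 | 0.645 | 0.000 | 3.955 |
| Pmean | AC_4 | 2 | 0.645 | 0.000 | 3.955 |
| Pmean | GK | 2 | 0.645 | 0.000 | 3.955 |
| Pmean | Linear | 2 | 0.644 | 0.062 | 3.290 |
| BV | AC_1 | 3 | 0.606 | 0.000 | 11.382 |
| BV | AC_2 | 3 | 0.606 | 0.000 | 11.382 |
| BV | AC_3 | 3 | 0.606 | 0.000 | 11.382 |
| BV | AC_4 | 3 | 0.606 | 0.000 | 11.382 |
| BV | GK | 3 | 0.606 | 0.000 | 11.382 |
| BV | Linear | 3 | 0.592 | 2.330 | 12.255 |
| NO_Cov | AC_1 | 3 | 0.617 | 0.616 | 13.950 |
| NO_Cov | AC_2 | 3 | 0.617 | 0.616 | 13.950 |
| NO_Cov | AC_3 | 3 | 0.617 | 0.616 | 13.950 |
| NO_Cov | AC_4 | 3 | 0.617 | 0.616 | 13.950 |
| NO_Cov | GK | 3 | 0.617 | 0.616 | 13.950 |
| NO_Cov | Linear | 3 | 0.621 | 0.000 | 11.659 |
| Pmean | AC_1 | 3 | 0.567 | 0.000 | 18.139 |
| Pmean | AC_2 | 3 | 0.567 | 0.000 | 18.139 |
| Pmean | AC_3 | 3 | 0.567 | 0.000 | 18.139 |
| Pmean | AC_4 | 3 | 0.567 | 0.000 | 18.139 |
| Pmean | GK | 3 | 0.567 | 0.000 | 18.139 |
| Pmean | Linear | 3 | 0.567 | 0.106 | 17.434 |
| BV | AC_1 | Global | 0.645 | 0.000 | 4.688 |
| BV | AC_2 | Global | 0.645 | 0.000 | 4.688 |
| BV | AC_3 | Global | 0.645 | 0.000 | 4.688 |
| BV | AC_4 | Global | 0.645 | 0.000 | 4.688 |
| BV | GK | Global | 0.645 | 0.000 | 4.688 |
| BV | Linear | Global | 0.632 | 1.982 | 5.149 |
| NO_Cov | AC_1 | Global | 0.668 | 0.000 | 5.337 |
| NO_Cov | AC_2 | Global | 0.668 | 0.000 | 5.337 |
| NO_Cov | AC_3 | Global | 0.668 | 0.000 | 5.337 |
| NO_Cov | AC_4 | Global | 0.668 | 0.000 | 5.337 |
| NO_Cov | GK | Global | 0.668 | 0.000 | 5.337 |
| NO_Cov | Linear | Global | 0.666 | 0.225 | 4.088 |
| Pmean | AC_1 | Global | 0.627 | 0.000 | 6.822 |
| Pmean | AC_2 | Global | 0.627 | 0.000 | 6.822 |
| Pmean | AC_3 | Global | 0.627 | 0.000 | 6.822 |
| Pmean | AC_4 | Global | 0.627 | 0.000 | 6.822 |
| Pmean | GK | Global | 0.627 | 0.000 | 6.822 |
| Pmean | Linear | Global | 0.626 | 0.304 | 6.395 |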
Table A2. Prediction performance of the DTH trait measured by NRMSE by type, kernel, year and predictive capability gain measured by Relative Efficiency.
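| Type | Kernel | Year | NRMSE | RE_Ker | RE_Env |
|---|---|---|---|---|---|
| BV | AC_1 | 1 | 0.651 | 0.000 | 0.000 |
| BV | AC_2 | 1 | 0.651 | 0.000 | 0.000 |
| BV | AC_3 | 1 | 0.651 | 0.000 | 0.000 |
| BV | AC_4 | 1 | 0.651 | 0.000 | 0.000 |
| BV | GK | 1 | 0.651 | 0.000 | 0.000 |
| BV | Linear | 1 | 0.641 | 1.577 | 0.000 |
| NO_Cov | AC_1 | 1 | 0.678 | 0.000 | 0.000 |
| NO_Cov | AC_2 | 1 | 0.678 | 0.000 | 0.000 |
| NO_Cov | AC_3 | 1 | 0.678 | 0.000 | 0.000 |
| NO_Cov | AC_4 | 1 | 0.678 | 0.000 | 0.000 |
| NO_Cov | GK | 1 | 0.678 | 0.000 | 0.000 |
| NO_Cov | Linear | 1 | 0.670 | 1.224 | 0.000 |
| Pmean | AC_1 | 1 | 0.645 | 0.000 | 0.000 |
| Pmean | AC_2 | 1 | 0.645 | 0.000 | 0.000 |
| Pmean | AC_3 | 1 | 0.645 | 0.000 | 0.000 |
| Pmean | AC_4 | 1 | 0.645 | 0.000 | 0.000 |
| Pmean | GK | 1 | 0.645 | 0.000 | 0.000 |
| Pmean | Linear | 1 | 0.642 | 0.468 | 0.000 |
| BV | AC_1 | 2 | 0.630 | 0.000 | 3.204 |
| BV | AC_2 | 2 | 0.630 | 0.000 | 3.204 |
| BV | AC_3 | 2 | 0.630 | 0.000 | 3.204 |
| BV | AC_4 | 2 | 0.630 | 0.000 | 3.204 |
| BV | GK | 2 | 0.630 | 0.000 | 3.204 |
| BV | Linear | 2 | 0.619 | 1.875 | 3.507 |
| NO_Cov | AC_1 | 2 | 0.659 | 0.410 | 2.991 |
| NO_Cov | AC_2 | 2 | 0.659 | 0.410 | 2.991 |
| NO_Cov | AC_3 | 2 | 0.659 | 0.410 | 2.991 |
| NO_Cov | AC_4 | 2 | 0.659 | 0.410 | 2.991 |
| NO_Cov | GK | 2 | 0.659 | 0.410 | 2.991 |
| NO_Cov | Linear | 2 | 0.661 | 0.000 | 1.331 |
| Pmean | AC_1 | 2 | 0.625 | 0.000 | 3.103 |
| Pmean | AC_2 | 2 | 0.625 | 0.000 | 3.103 |
| Pmean | AC_3 | 2 | 0.625 | 0.000 | 3.103 |
| Pmean | AC_4 | 2 | 0.625 | 0.000 | 3.103 |
| Pmean | GK | 2 | 0.625 | 0.000 | 3.103 |
| Pmean | Linear | 2 | 0.625 | 0.000 | 2.607 |
| BV | AC_1 | 3 | 0.607 | 0.000 | 7.201 |
| BV | AC_2 | 3 | 0.607 | 0.000 | 7.201 |
| BV | AC_3 | 3 | 0.607 | 0.000 | 7.201 |
| BV | AC_4 | 3 | 0.607 | 0.000 | 7.201 |
| BV | GK | 3 | 0.607 | 0.000 | 7.201 |
| BV | Linear | 3 | 0.592 | 2.448 | 8.120 |
| NO_Cov | AC_1 | 3 | 0.619 | 0.679 | 9.667 |
| NO_Cov | AC_2 | 3 | 0.619 | 0.679 | 9.667 |
| NO_Cov | AC_3 | 3 | 0.619 | 0.679 | 9.667 |
| NO_Cov | AC_4 | 3 | 0.619 | 0.679 | 9.667 |
| NO_Cov | GK | 3 | 0.619 | 0.679 | 9.667 |
| NO_Cov | Linear | 3 | 0.623 | 0.000 | 7.611 |
| Pmean | AC_1 | 3 | 0.565 | 0.000 | 14.048 |
| Pmean | AC_2 | 3 | 0.565 | 0.000 | 14.048 |
| Pmean | AC_3 | 3 | 0.565 | 0.000 | 14.048 |
| Pmean | AC_4 | 3 | 0.565 | 0.000 | 14.048 |
| Pmean | GK | 3 | 0.565 | 0.000 | 14.048 |
| Pmean | Linear | 3 | 0.564 | 0.142 | 13.678 |
| BV | AC_1 | Global | 0.629 | 0.000 | 3.385 |
| BV | AC_2 | Global | 0.629 | 0.000 | 3.385 |
| BV | AC_3 | Global | 0.629 | 0.000 | 3.385 |
| BV | AC_4 | Global | 0.629 | 0.000 | 3.385 |
| BV | GK | Global | 0.629 | 0.000 | 3.385 |
| BV | Linear | Global | 0.617 | 1.955 | 3.770 |
| NO_Cov | AC_1 | Global | 0.652 | 0.000 | 4.065 |
| NO_Cov | AC_2 | Global | 0.652 | 0.000 | 4.065 |
| NO_Cov | AC_3 | Global | 0.652 | 0.000 | 4.065 |
| NO_Cov | AC_4 | Global | 0.652 | 0.000 | 4.065 |
| NO_Cov | GK | Global | 0.652 | 0.000 | 4.065 |
| NO_Cov | Linear | Global | 0.651 | 0.067 | 2.876 |
| Pmean | AC_1 | Global | 0.612 | 0.000 | 5.384 |
| Pmean | AC_2 | Global | 0.612 | 0.000 | 5.384 |
| Pmean | AC_3 | Global | 0.612 | 0.000 | 5.384 |
| Pmean | AC_4 | Global | 0.612 | 0.000 | 5.384 |
| Pmean | GK | Global | 0.612 | 0.000 | 5.384 |
| Pmean | Linear | Global | 0.610 | 0.202 | 5.106 |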
Table A3. Prediction performance of the YIELD trait measured by NRMSE by type, kernel, year, and predictive capability gain measured by Relative Efficiency.
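| Type | Kernel | Year | NRMSE | RE_Ker | RE_Env |
|---|---|---|---|---|---|
| BV | AC_1 | 1 | 0.810 | 0.099 | 0.000 |
| BV | AC_2 | 1 | 0.810 | 0.099 | 0.000 |
| BV | AC_3 | 1 | 0.810 | 0.099 | 0.000 |
| BV | AC_4 | 1 | 0.810 | 0.099 | 0.000 |
| BV | GK | 1 | 0.810 | 0.099 | 0.000 |
| BV | Linear | 1 | 0.811 | 0.000 | 0.000 |
| NO_Cov | AC_1 | 1 | 0.828 | 0.000 | 0.000 |
| NO_Cov | AC_2 | 1 | 0.828 | 0.000 | 0.000 |
| NO_Cov | AC_3 | 1 | 0.828 | 0.000 | 0.000 |
| NO_Cov | AC_4 | 1 | 0.828 | 0.000 | 0.000 |
| NO_Cov | GK | 1 | 0.828 | 0.000 | 0.000 |
| NO_Cov | Linear | 1 | 0.827 | 0.169 | 0.000 |
| Pmean | AC_1 | 1 | 0.793 | 0.000 | 0.757 |
| Pmean | AC_2 | 1 | 0.793 | 0.000 | 0.757 |
| Pmean | AC_3 | 1 | 0.793 | 0.000 | 0.757 |
| Pmean | AC_4 | 1 | 0.793 | 0.000 | 0.757 |
| Pmean | GK | 1 | 0.793 | 0.000 | 0.757 |
| Pmean | Linear | 1 | 0.793 | 0.000 | 0.769 |
| BV | AC_1 | 2 | 0.800 | 0.075 | 1.162 |
| BV | AC_2 | 2 | 0.800 | 0.075 | 1.162 |
| BV | AC_3 | 2 | 0.800 | 0.075 | 1.162 |
| BV | AC_4 | 2 | 0.800 | 0.075 | 1.162 |
| BV | GK | 2 | 0.800 | 0.075 | 1.162 |
| BV | Linear | 2 | 0.801 | 0.000 | 1.186 |
| NO_Cov | AC_1 | 2 | 0.818 | 0.281 | 1.333 |
| NO_Cov | AC_2 | 2 | 0.818 | 0.281 | 1.333 |
| NO_Cov | AC_3 | 2 | 0.818 | 0.281 | 1.333 |
| NO_Cov | AC_4 | 2 | 0.818 | 0.281 | 1.333 |
| NO_Cov | GK | 2 | 0.818 | 0.281 | 1.333 |
| NO_Cov | Linear | 2 | 0.820 | 0.000 | 0.878 |
| Pmean | AC_1 | 2 | 0.799 | 0.013 | 0.000 |
| Pmean | AC_2 | 2 | 0.799 | 0.013 | 0.000 |
| Pmean | AC_3 | 2 | 0.799 | 0.013 | 0.000 |
| Pmean | AC_4 | 2 | 0.799 | 0.013 | 0.000 |
| Pmean | GK | 2 | 0.799 | 0.013 | 0.000 |
| Pmean | Linear | 2 | 0.799 | 0.000 | 0.000 |
| BV | AC_1 | 3 | 0.722 | 0.194 | 12.162 |
| BV | AC_2 | 3 | 0.722 | 0.194 | 12.162 |
| BV | AC_3 | 3 | 0.722 | 0.194 | 12.162 |
| BV | AC_4 | 3 | 0.722 | 0.194 | 12.162 |
| BV | GK | 3 | 0.722 | 0.194 | 12.162 |
| BV | Linear | 3 | 0.723 | 0.000 | 12.056 |
| NO_Cov | AC_1 | 3 | 0.742 | 0.054 | 11.659 |
| NO_Cov | AC_2 | 3 | 0.742 | 0.054 | 11.659 |
| NO_Cov | AC_3 | 3 | 0.742 | 0.054 | 11.659 |
| NO_Cov | AC_4 | 3 | 0.742 | 0.054 | 11.659 |
| NO_Cov | GK | 3 | 0.742 | 0.054 | 11.659 |
| NO_Cov | Linear | 3 | 0.742 | 0.000 | 11.410 |
| Pmean | AC_1 | 3 | 0.706 | 0.071 | 13.128 |
| Pmean | AC_2 | 3 | 0.706 | 0.071 | 13.128 |
| Pmean | AC_3 | 3 | 0.706 | 0.071 | 13.128 |
| Pmean | AC_4 | 3 | 0.706 | 0.071 | 13.128 |
| Pmean | GK | 3 | 0.706 | 0.071 | 13.128 |
| Pmean | Linear | 3 | 0.707 | 0.000 | 13.063 |
| BV | AC_1 | Global | 0.777 | 0.120 | 4.164 |
| BV | AC_2 | Global | 0.777 | 0.120 | 4.164 |
| BV | AC_3 | Global | 0.777 | 0.120 | 4.164 |
| BV | AC_4 | Global | 0.777 | 0.120 | 4.164 |
| BV | GK | Global | 0.777 | 0.120 | 4.164 |
| BV | Linear | Global | 0.778 | 0.000 | 4.142 |
| NO_Cov | AC_1 | Global | 0.796 | 0.054 | 4.079 |
| NO_Cov | AC_2 | Global | 0.796 | 0.054 | 4.079 |
| NO_Cov | AC_3 | Global | 0.796 | 0.054 | 4.079 |
| NO_Cov | AC_4 | Global | 0.796 | 0.054 | 4.079 |
| NO_Cov | GK | Global | 0.796 | 0.054 | 4.079 |
| NO_Cov | Linear | Global | 0.796 | 0.000 | 3.847 |
| Pmean | AC_1 | Global | 0.766 | 0.026 | 4.296 |
| Pmean | AC_2 | Global | 0.766 | 0.026 | 4.296 |
| Pmean | AC_3 | Global | 0.766 | 0.026 | 4.296 |
| Pmean | AC_4 | Global | 0.766 | 0.026 | 4.296 |
| Pmean | GK | Global | 0.766 | 0.026 | 4.296 |
| Pmean | Linear | Global | 0.766 | 0.000 | 4.281 |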
Table A4. Phenotypic, genetic, and residual covariance and correlation matrix between traits.
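Phenotypic covariance

| | yield_blue | DTH_blue | DTF_blue |
|---|---|---|---|
| yield_blue | 0.943 | 3.318 | 2.981 |
| DTH_blue | 3.318 | 35.301 | 33.398 |
| DTF_blue | 2.981 | 33.398 | 31.872 |

Phenotypic correlation

| | yield_blue | DTH_blue | DTF_blue |
|---|---|---|---|
| yield_blue | 1.000 | 0.575 | 0.544 |
| DTH_blue | 0.575 | 1.000 | 0.996 |
| DTF_blue | 0.544 | 0.996 | 1.000 |

Genetic covariance

| | yield_blue | DTH_blue | DTF_blue |
|---|---|---|---|
| yield_blue | 2.787 × 10^−10 | 1.726 × 10^−10 | 1.142 × 10^−10 |
| DTH_blue | 1.726 × 10^−10 | 5.383 × 10^−9 | 5.147 × 10^−9 |
| DTF_blue | 1.142 × 10^−10 | 5.147 × 10^−9 | 5.037 × 10^−9 |

Genetic correlation

| | yield_blue | DTH_blue | DTF_blue |
|---|---|---|---|
| yield_blue | 1.000 | 0.141 | 0.096 |
| DTH_blue | 0.141 | 1.000 | 0.988 |
| DTF_blue | 0.096 | 0.988 | 1.000 |

Residual covariance

| | yield_blue | DTH_blue | DTF_blue |
|---|---|---|---|
| yield_blue | 0.144 | −0.017 | −0.009 |
| DTH_blue | −0.017 | 1.575 | 1.531 |
| DTF_blue | −0.009 | 1.531 | 1.619 |

Residual correlation

| | yield_blue | DTH_blue | DTF_blue |
|---|---|---|---|
| yield_blue | 1.000 | −0.035 | −0.019 |
| DTH_blue | −0.035 | 1.000 | 0.958 |
| DTF_blue | −0.019 | 0.958 | 1.000 |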

References

  1. Meuwissen, T.H.E.; Hayes, B.J.; Goddard, M.E. Prediction of total genetic value using genome-wide dense marker maps. Genetics 2001, 157, 1819–1829.
  2. Heffner, E.L.; Sorrells, M.E.; Jannink, J.-L. Genomic selection for crop improvement. Crop Sci. 2009, 49, 1–12.
  3. Hickey, J.M.; Gorjanc, G. Simulated data for genomic selection and cross-validation trials in plant and animal breeding. G3 Genes Genomes Genet. 2020, 10, 1925–1931.
  4. Spindel, J.E.; Begum, H.; Akdemir, D.; Collard, B.; Redoña, E.; Jannink, J.-L.; McCouch, S. Genome-wide prediction models that incorporate de novo GWAS are a powerful new tool for tropical rice improvement. Heredity 2016, 116, 395–408.
  5. Van Raden, P.M. Genomic measures of relationship and inbreeding. Interbull Bull. 2007, 52, 11–16.
  6. Jannink, J.L.; Lorenz, A.J.; Iwata, H. Genomic selection in plant breeding: From theory to practice. Brief. Funct. Genom. 2010, 9, 166–177.
  7. Meuwissen, T.H.E.; Hayes, B.J.; Goddard, M.E. Genomic selection: A paradigm shift in animal breeding. Anim. Front. 2016, 6, 6–14.
  8. Poland, J.; Endelman, J.; Dawson, J.; Rutkoski, J.; Wu, S.; Manes, Y.; Dreisigacker, S. Genomic selection in wheat breeding using genotyping-by-sequencing. Plant Genome 2012, 5, 103–113.
  9. Crossa, J.; Pérez, P.; Hickey, J.; Burgueño, J.; Ornella, L. Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics 2014, 198, 483–495.
  10. Riedelsheimer, C.; Czedik-Eysenberg, A.; Grieder, C.; Lisec, J.; Technow, F.; Sulpice, R.; Altmann, T.; Stitt, M.; Willmitzer, L.; Melchinger, A.E. Genomic and metabolic prediction of complex heterotic traits in hybrid maize. Nat. Genet. 2012, 44, 217–220.
  11. Hernandez, M.V.; Crossa, J.; Singh, P.K.; Bains, N.S.; Singh, K.; Sharma, I. Multi-trait and multi-environment QTL analyses for resistance to maize lethal necrosis disease and grain yield. PLoS ONE 2012, 7, e38008.
  12. Montesinos-López, O.A.; Montesinos-López, J.C.; Montesinos-López, A.; Ramírez-Alcaraz, J.M.; Poland, J.; Singh, R.; Dreisigacker, S.; Crespo, L.; Mondal, S.; Govidan, V.; et al. Bayesian multitrait kernel methods improve multi-environment genome-based prediction. G3 Genes Genomes Genet. 2022, 12, jkab406.
  13. Montesinos-López, O.A.; Montesinos-López, A.; Crossa, J. Multivariate Statistical Machine Learning Methods for Genomic Prediction; Springer International Publishing: Cham, Switzerland, 2022; ISBN 978-3-030-89010-0.
  14. Gianola, D.; van Kaam, J.B. Reproducing kernel Hilbert spaces regression methods for genomic assisted prediction of quantitative traits. Genetics 2008, 178, 2289–2303.
  15. Habier, D.; Fernando, R.L.; Dekkers, J.C. The impact of genetic relationship information on genome-assisted breeding values. Genetics 2007, 177, 2389–2397.
  16. Xu, Y.; Zhao, Y.; Wang, X.; Ma, Y.; Li, P.; Yang, Z.; Zhang, X.; Xu, C.; Xu, S. Incorporation of parental phenotypic data into multi-omic models improves prediction of yield-related traits in hybrid rice. Plant Biotechnol. J. 2021, 19, 261–272.
  17. Wang, D.; Shi, J.; Zhu, J.; Wu, R. The use of parental information to improve genomic prediction in plant breeding. Crop Sci. 2012, 52, 1476–1487.
  18. Pérez-Rodríguez, P.; de Los Campos, G. Multitrait Bayesian shrinkage and variable selection models with the BGLR-R package. Genetics 2022, 222, iyac112.
  19. Technow, F.; Riedelsheimer, C.; Schrag, T.A. Genomic prediction of hybrid performance in maize with models incorporating dominance and population specific marker effects. Theor. Appl. Genet. 2012, 125, 1181–1194.
  20. Pérez, P.; de los Campos, G. Genome-Wide Regression and Prediction with the BGLR Statistical Package. Genetics 2014, 198, 483–495.
  21. Cho, Y.; Saul, L. Kernel Methods for Deep Learning. In Proceedings of the NIPS’09 the 22nd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 7–10 December 2009; pp. 342–350.
Figure 1. Prediction performance for trait DTF in each strategy of incorporating the parental phenotypic information (NO_Cov, BV, and Pmean) with six different kernel methods (AC_1, AC_2, AC_3, AC_4, Gaussian kernel (GK), and linear kernel), measured by NRMSE in each year (1, 2, 3) and across years (Global).
Figure 2. Prediction performance for trait DTH in each strategy of incorporating the parental phenotypic information (NO_Cov, BV, and Pmean) with six different kernel methods (AC_1, AC_2, AC_3, AC_4, Gaussian kernel (GK), and linear kernel), measured by NRMSE in each year (1, 2, 3) and across years (Global).
Figure 3. Prediction performance for trait GY in each strategy of incorporating the parental phenotypic information (NO_Cov, BV, and Pmean) with six different kernel methods (AC_1, AC_2, AC_3, AC_4, Gaussian kernel (GK), and linear kernel), measured by NRMSE in each year (1, 2, 3) and across years (Global).
Figure 4. Prediction performance across traits and years for each strategy of incorporating the parental phenotypic information (NO_Cov, BV, and Pmean) with the linear kernel in terms of normalized root mean square error (NRMSE).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Montesinos-López, O.A.; Crossa, J.; Saint Pierre, C.; Gerard, G.; Valenzo-Jiménez, M.A.; Vitale, P.; Valladares-Cellis, P.E.; Buenrostro-Mariscal, R.; Montesinos-López, A.; Crespo-Herrera, L. Multivariate Genomic Hybrid Prediction with Kernels and Parental Information. Int. J. Mol. Sci. 2023, 24, 13799. https://doi.org/10.3390/ijms241813799
