Accuracy of Selection in Early Generations of Field Pea Breeding Increases by Exploiting the Information Contained in Correlated Traits

Accuracy of predicted breeding values (PBV) for low heritability traits may be increased in early generations by exploiting the information available in correlated traits. We compared the accuracy of PBV for 10 correlated traits with low to medium narrow-sense heritability (h2) in a genetically diverse field pea (Pisum sativum L.) population after univariate or multivariate linear mixed model (MLMM) analysis with pedigree information. In the contra-season, we crossed and selfed S1 parent plants, and in the main season we evaluated spaced plants of S0 cross progeny and S2+ (S2 or higher) self progeny of parent plants for the 10 traits. Stem strength traits included stem buckling (SB) (h2 = 0.05), compressed stem thickness (CST) (h2 = 0.12), internode length (IL) (h2 = 0.61) and angle of the main stem above horizontal at first flower (EAngle) (h2 = 0.46). Significant genetic correlations of the additive effects occurred between SB and CST (0.61), IL and EAngle (−0.90) and IL and CST (−0.36). The average accuracy of PBVs in S0 progeny increased from 0.799 to 0.841 and in S2+ progeny increased from 0.835 to 0.875 in univariate vs MLMM, respectively. An optimized mating design was constructed with optimal contribution selection based on an index of PBV for the 10 traits, and predicted genetic gain in the next cycle ranged from 1.4% (SB), 5.0% (CST), 10.5% (EAngle) and −10.5% (IL), with low achieved parental coancestry of 0.12. MLMM improved the potential genetic gain in annual cycles of early generation selection in field pea by increasing the accuracy of PBV.


Introduction
It is important to increase the rate of genetic gain in self-pollinating crops to meet future demand for grains, and the most promising way to increase the rate of gain is to shorten selection cycles [1]. This may be achieved through selection on early generation non-inbred progeny [1]. However, selection on non-inbred progeny is less accurate than on inbred lines [2][3][4], and therefore crop breeders normally prefer to self to purity before choosing parents for crossing. This extends selection cycles and reduces the potential rate of genetic gain. In this study, we explore methods which increase the accuracy of predicted if the correlation is unfavourable [39]. However, a multivariate model based on many traits with low or no correlation may have no benefits over univariate models [2,38].
Significant gains were made in ascochyta blight resistance across two cycles of recurrent selection based on non-inbred progeny of field pea in an animal model analysis across cycles [18]. Here, we re-analyze the data from Cowling et al. [18] to select parents to begin the next cycle of this experimental pea population, and we apply MLMM to select non-inbred progeny for several low to moderate heritability traits. We combine random and/or fixed spatial effects and covariances of additive, non-additive and residual effects in the analysis of ascochyta blight resistance, flowering time, stem strength traits, grain yield, biomass, and other important agronomic traits.
The aim of this paper is to improve accuracy of PBV in non-inbred progeny of field pea by exploiting the information available in correlated low to moderate heritability traits. This is one of the first studies to use MLMM in crop breeding for this purpose. We hypothesize that accuracy of PBVs for these correlated low to medium heritability traits will be higher in MLMM than in a univariate linear mixed model (LMM), and compare the accuracy of PBVs in S 0 progeny with self progeny of parent plants (S 2 or higher selfing level) evaluated in the same experiment. In a novel approach to test the value of the PBV generated by MLMM, we predict the genetic gain of traits and achieved parental coancestry in the next cycle with optimal contribution selection (OCS) [44] based on an index of MLMM-derived PBVs weighted to achieve desired genetic gains.

Results
For definition of abbreviations, please see Section 4 Materials and Methods.

Re-Analysis of 2015 Data (Cycle 2) and Mating Design for Cycle 3
The bivariate LMM analysis of ABS and DTF in Cycle 2 2015 revealed a negative genetic correlation of additive effects (r = −0. 64) between ABS and DTF (Supplemental Table S1A,B). Narrow-sense heritability (h 2 ) for ABS was 0.377 and average accuracy ± standard deviation of PBV for ABS in S 0 progeny was 0.826 ± 0.04, which is slightly higher than 0.805 reported from univariate analysis in Cowling et al. [18]. h 2 for DTF was 0.555 and average accuracy in S 0 progeny was 0.850 ± 0.04. The PBVs obtained from this model were given appropriate economic weights in a selection index, and the index was used to generate an optimised mating design for Cycle 3 based on OCS (Supplemental Table S1C,D).   stem buckling (SB) in dry field pea stems. In both settings the force gauge was actuated downwards, applying a load over the pea stem. In (a) a displacement meter was used to measure the displacement of the stem from its natural stem diameter (SD) to the CST after applying a load of 10 N. For (b) the bottom of the stem was restricted with a clamp and the top was fixed under the flat end of the force gauge. SB was measured as the peak force recorded at stem failure.   stem buckling (SB) in dry field pea stems. In both settings the force gauge was actuated downwards, applying a load over the pea stem. In (a) a displacement meter was used to measure the displacement of the stem from its natural stem diameter (SD) to the CST after applying a load of 10 N. For (b) the bottom of the stem was restricted with a clamp and the top was fixed under the flat end of the force gauge. SB was measured as the peak force recorded at stem failure. In both settings the force gauge was actuated downwards, applying a load over the pea stem. In (a) a displacement meter was used to measure the displacement of the stem from its natural stem diameter (SD) to the CST after applying a load of 10 N. For (b) the bottom of the stem was restricted with a clamp and the top was fixed under the flat end of the force gauge. SB was measured as the peak force recorded at stem failure. Single traits were analysed first in univariate LMM without pedigree information, providing estimates of the total genotype and residual variance for each trait, and then with pedigree information to estimate additive (and potentially non-additive) variance components and narrow-sense heritability for each trait (Table 1, Supplemental Table S3). In all univariate models, linear fixed effects for range and row were not significant. h 2 for GY, BM, SB, SD, CST, and ABS was low (h 2 ≤ 0.15), low to moderate for Br and EAngle (h 2 ≤ 0.36) and moderate for DTF and IL (h 2 > 0.50) ( Table 1). The analysis of all 10 traits in the MLMM in ASReml-R achieved convergence with a US variance structure for additive genetic effects and residual effects, while omitting any other form of random effect. However, US variance structures were modified in all iterations of the run to maintain positive definite values (Supplemental Table S4A). Hence, the optimised MLMM was completed in the stand-alone version of ASReml by first adding the significant components for each trait identified in the univariate LMM and later removing those components that became non-significant under the new variance structure in the MLMM. In the optimized MLMM, variance components for additive genetic effects increased for GY, BM, Br, DTF, EAngle, and IL (Table 2) compared to the estimates of the univariate model (Table 1). ABS had a slightly smaller variance component for the additive genetic effect. The variance components of residual effects increased for all traits in the optimized MLMM, except for EAngle (Table 2). In the optimized MLMM, non-additive genetic effects were significant only for GY and DTF, and these effects were assumed independent (DIAG variance structure was used for non-additive effects). Compared to the univariate models (Table 1), h 2 in the MLMM (Table 2) improved considerably for DTF (increased by 0.10 compared to univariate LMM), EAngle (increased by 0.23) and IL (increased by 0.08) and increased slightly for GY (increased by 0.01). In contrast, h 2 decreased slightly for the other traits (−0.01 to −0.02) (Tables 1 and 2). The standard errors of h 2 were similar for all traits in univariate LMM and MLMM except for EAngle, where the standard error of h 2 increased from 0.04 to 0.12 (full model components can be seen in Supplemental Table S4B).

Trait Correlations
The optimized MLMM allowed for covariance of additive genetic effects and residual effects between traits. Covariances of additive effects were converted to correlations ( Figure 3). Correlations of residual effects were also significant between many traits and are presented in Supplemental Figure S1.
LMM and MLMM except for EAngle, where the standard error of ℎ 2 increased from 0.04 to 0.12 (full model components can be seen in Supplemental Table S4B).

Trait Correlations
The optimized MLMM allowed for covariance of additive genetic effects and residual effects between traits. Covariances of additive effects were converted to correlations ( Figure 3). Correlations of residual effects were also significant between many traits and are presented in Supplemental Figure S1.  Strong positive additive genetic correlations were found among traits BM, Br, and DTF (≥0.54) and among the stem strength traits SB, CST, and SD (≥0.59), whereas the strongest negative correlations occurred between EAngle and IL (−0.90) and between IL and SD (−0.58), that is, shorter internodes were associated more upright growth and with broader stems (Figure 3). Lower ABS (higher resistance to ascochyta blight disease) was associated with higher basal branching (Br) (−0.53). ABS showed moderate positive correlations with SB and CST (≥0.47). GY had a moderate correlation with BM (0.44) (Figure 3).
The positive genetic correlations between ABS and SB, and ABS and CST, indicate that stronger stems were associated with higher disease levels, which hinders selection for both strong stems and ABS resistance. SB was positively correlated with DTF which indicates that stronger stems were associated with later flowering.

Accuracy of Predicted Breeding Values
The average accuracy of PBV across all traits in S 0 progeny significantly increased from 0.799 in the univariate LMM to 0.841 in the optimized MLMM (Table 3) (t = −189.9, p < 0.001), which was slightly higher than accuracy of PBV in S 2+ progeny in univariate LMM (0.835) and demonstrates a significant benefit of MLMM over univariate LMM for S 0 progeny. Furthermore, the average accuracy of PBV for S 2+ progeny significantly increased to 0.875 in the optimized MLMM (Table 3).

Prediction of Genetic Gain in the Next Cycle with OCS
PBVs for each trait from the optimized MLMM were weighted for desired genetic gains and summed to generate a selection index for each individual. Trait weightings were chosen to increase predicted genetic gains in GY, BM, SB, CST, SD, and EAngle, to reduce DTF and ABS, and to maintain Br and IL ( Table 4). The index, pedigree information and list of genotypes available for crossing were submitted to MateSel for OCS with conservative parameters for improving genetic gain in Cycle 4 while minimising the loss in genetic diversity in the population (Table 4). Fine-tuning of the trait weightings occurred in MateSel for ABS and CST to decrease ABS and increase CST. The run achieved predicted genetic gain in GY, BM, and stem strength traits of 0.9 to 9.4% (Table 4) while reducing DTF by 2.5% and ABS by 3.8%. Despite a weighting to maintain IL, the predicted PBV of IL decreased by 10.5% in the next cycle ( Table 4). The achieved parental coancestry in Cycle 4 was very low at 0.12. Table 4. Predictions of genetic gains in index and PBVs in the next cycle of the experimental field pea population, from optimal contribution selection using MateSel [45]. The population phenotypic means for each trait are shown in Supplemental Table S2B.

Discussion
Multivariate linear mixed models improved the accuracy of predicted breeding values in non-inbred progeny of an experimental field pea population for several traits with low to moderate heritability compared to univariate models. This research shows the value of exploiting the information in correlated traits to improve the accuracy of PBV in non-inbred progeny. Accurate selection in early generations of field pea breeding can accelerate annual genetic gain for these traits.
A bivariate LMM of ABS and DTF in Cowling et al. [18] showed a strong negative correlation between the additive genetic effects for ABS and DTF in Cycle 2 (Supplemental Table S1B), which confirmed a previous report that late flowering field peas tended to have lower ABS (i.e., were more resistant to Ascochyta blight) than early types [9]. The bivariate analysis improved the h 2 values and accuracy of PBV for ABS over the univariate analysis presented in Cowling et al. [18], and therefore it was used to generate an optimised crossing design to being Cycle 3.
In 2019, the Cycle 3 progeny were evaluated as spaced single S 0 and S 2+ plants sown into a weed-suppression mat. This is clearly different from the conditions in which field pea crops are normally grown, but this design permitted an unbiased assessment of trait values on individual progeny plants. The sowing of spaced single plants in a weed mat may affect the growth of the root system and stem due to higher soil temperatures. However, the monthly average temperature of the region ranged from 14 to 20 • C during the winterspring months in which the trial was grown, and therefore it is unlikely that the weed mat resulted in detrimental effects on root systems due to high soil temperatures. Similarly, sowing of blueberries into a weed mat or sawdust mulch had no negative impact on growth and yield in most years [46]. Wide spacing of single plants was useful to record EAngle and to achieve an unbiased estimate of all traits on individual progeny plants. Similar designs were used in previous studies where high levels of ascochyta blight disease and low to moderate heritability for ABS resistance were recorded [9,18].
Previously, improvements in lodging resistance in field peas resulted from the breeding of dwarf semi-leafless varieties [7,15,22] which have better standability due to tendrils which hold the crop together [23]. Semi-leafless pea plants have weak stems and there has been little or no genetic progress in improving their stem strength. We measured several traits related to stem strength (SB, CST, SD, EAngle, and IL), all of which contribute to stem structure and standability of individual plants, with the goal of making significant genetic progress for stronger stems.
In previous studies, CST and SD were shown to be correlated with physical measures of stem strength such as flexion and force at breaking point [47] and both traits responded to selection [9]. Smitchger and Weeden [20] also found a genetic relationship between lodging resistance and CST and SD, and Smitchger et al. [48] suggested that SD should be the major focus of breeding to increase lodging tolerance. We improved the reliability of CST measurement method over previous methods [20,47] by standardizing the displacement in the stem at a fixed load of 10 N (Figure 2a).
We included a new measurement of stem strength, stem buckling (SB), where the base of the stem was fixed and the top was pressed downwards by the flat end of the force gauge, following the methods in Niklas [49]. SB has been proposed as a good stem strength predictor in plants [50], however, Smitchger et al. [48] considered that pea stems normally do not fail by stem buckling since plants do not usually grow in a vertical position, and attributed an increased lodging tendency to stem angles closer to the horizontal. Hence, we also measured EAngle on individual plants. We found that shorter IL was strongly correlated with higher EAngle (more upright growth), but extremely short internodes are not desirable as this results in short plants and poor harvestability [48]. Our results suggest that it should be possible to breed a more compact pea crop with slightly shorter internodes, with stronger stems and better lodging resistance as predicted by higher EAngle, SB, CST, and SD.
Another component of theoretical column buckling strength is the second moment of area [49]. Future work could include this measurement as an additional indicator of stem strength. It will likely be correlated with SB and indicate the expected buckling type (i.e., Euler or Brazier), which is associated with the wall thickness ratio [51]. However, CST is expected to be a good proxy of second moment of area due to its relationship with stem wall thickness.
In this population, SB, SD, and CST had low narrow-sense heritability ( Table 2), but there were moderate to high additive genetic correlations between these traits in the optimized MLMM ( Figure 3). Hence, the average accuracy of PBVs in S 0 progeny from 0.799 to 0.841 across all traits in the optimized MLMM over the corresponding univariate LMM (Table 3). To put this into perspective, the accuracy of PBVs for S 0 progeny in multivariate models was higher than the accuracy of PBVs in selfed (S 2+ ) progeny in univariate models. We conclude that selfing to S 2 or higher within breeding cycles is not essential to achieve significant genetic gain for these low heritability traits. We achieved annual cycles of S 0 recurrent selection with high accuracy of PBV based on an optimized MLMM of several correlated low to medium heritability traits. Recurrent selection on S 0 progeny avoids the cost of selfing and longer breeding cycles when selection occurs on inbred lines. This supports the suggestion of Cobb et al. [1] that shorter cycles and selection on non-inbred progeny should be explored as an option for improving genetic gain in crop breeding.
The optimised MLMM benefited from correlations among additive effects and residuals across traits which improved the data structure in the model and increased the accuracy of PBV, as originally shown by Thompson and Meyer [34]. Similar benefits were obtained with MLMM in Eucalyptus tree breeding [35], where self-pollination does not occur (or is very rare) and all progeny were S 0 or clones thereof. Benefits of MLMM were also shown in genomic prediction in inbred wheat lines [52].
Our study is one of the first applications of the single step MLMM with pedigree information in self-pollinating crops, and our results support the use of MLMM for several low to medium heritability traits measured on non-inbred progeny to help achieve rapid genetic gain. In this study, we achieved a predicted genetic gain in the next cycle of −3.8% in ABS, +5.0% in CST, +5.1% in SD, and +7.8% in EAngle, while controlling DTF and IL and achieving low parental coancestry of 0.12 (Table 4). Rapid and sustained genetic gain for several traits was predicted in stochastic models of S 0,1 family selection [53,54], and validated in the field over several cycles of recurrent selection in spring canola for grain yield and several other low to moderate heritability traits [55].
S 0 recurrent selection [3] may be augmented by including self progeny of parent plants at S 2 or higher selfing generations (S 2+ ). Augmenting S 0 recurrent selection with S 2+ selfs of parent plants improved connectivity of genetic relationships between and within cycles, and improved the accuracy of PBV [55]. It is also of practical and commercial value to include S 2+ selfs of parent plants in augmented S 0 recurrent selection, as this provides inbred lines ready for commercial evaluation after two cycles. This simplifies the proposed two-part strategy (population improvement and product development) for breeding self-pollinating crops [56] by combining the two parts into a single breeding program.
Our study was based on measurements made in the field, but the accuracy of PBV in non-inbred progeny could also be increased by including additional traits from highthroughput phenotyping in controlled environments to improve the accuracy of PBV for all traits [57].
Genomic relationship information is expected to improve accuracy of PBV especially when combined with pedigree information [52,56,58,59]. It will be relatively simple to combine genomic analysis (GBLUP) and pedigree analysis (ABLUP) in single-step or hybrid BLUP (HBLUP) analysis [60][61][62][63]. GBLUP was used for selection of non-inbred progeny of wheat by Bonnett et al. [64], and we expect that accuracy of PBV on non-inbred progeny will be improved further when pedigree and genomic information are combined in HBLUP.
The inclusion of multiple correlated traits increases the complexity and number of calculations required to converge the MLMM. In this study, we ran our models outside the R environment to overcome the CPU limitations that appeared in the 10-trait model with ASReml-R. Attempting to fit a model with more traits will likely result in convergence errors, so it is worth exploring different approaches of multivariate analysis for large datasets. One alternative would be to divide the analysis into two stages, where estimates of the fixed effects of genotypes (BLUES) are obtained in univariate analyses and are later analyzed in a multivariate BLUP analysis e.g., [65]. In animal breeding studies, correlations between traits in large datasets are sometimes estimated through pair-wise bivariate analyses of all trait combinations e.g., [66]. In theory, if correlations are known beforehand, it would be possible to adjust the multivariate model so that it includes all traits but assumes a value of zero for low and non-significant covariances between traits. In this way, fewer parameters need to be estimated in the model, which should result in easier computation and potentially could achieve convergence in a challenging dataset.
Animal breeders increased the rate of genetic gain by exploiting BLUP models which improved the accuracy of PBV in highly heterozygous individuals [27], but BLUP-based breeding also increased the rate of population inbreeding over cycles. As a result, OCS was implemented in animal breeding to maximize genetic gain whilst controlling the rate of inbreeding [67]. Recently, crop breeders have adopted OCS or similar optimized selection frameworks that retain genetic variability of the population under selection to improve the long-term genetic response [1,55,58,65,[68][69][70][71]. OCS was valuable in this study to predict the outcome of MLMM on genetic response for several traits in the next generation based on an optimised mating design.
The MLMM that we implemented here is relatively simple to adopt and involves little additional cost to the breeding program. Pedigree information can be recorded and recalled in an appropriate data base at low cost. Genomic data can be added in the form of a genomic relationship matrix if and when available. The challenge remains to implement MLMM in small plot trials grown at many sites per cycle across multiple cycles of selection, with measurements of GY, lodging, and other commercial traits. So far, this has only been achieved with univariate analyses, one trait at a time, across all sites and cycles [55].

Crossing to Begin Cycle 3
Progeny evaluated in the previous cycle (Cycle 2) of this field pea experimental population [18] were selfed, harvested and stored for later use as parent plants to generate Cycle 3 progeny for evaluation in this study. Cycle 2 data for ascochyta blight score (ABS) [18] and days from sowing to first flower (DTF) (Supplemental Table S1A) were re-analyzed in a bivariate model with pedigree information to generate predicted breeding values (PBVs) for each trait (Supplemental Table S1B). A selection index was constructed with economic weights on PBVs which aimed to reduce ABS while maintaining DTF (Supplemental Table S1C).
Optimal contribution selection (OCS) based on this selection index was implemented in software MateSel [45] (Supplemental Table S1C), which is based on an evolutionary algorithm with constraints easily invoked to ensure practical relevance and precise control of selection response and other outcomes. MateSel dictates which individuals to select and the actual crossing allocations and/or selfings to be made in a crossing design that balances genetic gain for the selection index under the constraint of a maximum permissible target inbreeding rate [44,45,67]. Retained S 1 seed harvested from S 0 individuals was used for crossing and selfing of each parent plant, and each crossing and selfing was recorded explicitly in the pedigree at the individual plant level [18].
The resulting mating list generated by MateSel comprised 150 crosses and was based on 60 Cycle 2 parental genotypes (Supplemental Table S1D). Crosses for Cycle 3 were carried out in a glasshouse at The University of Western Australia (UWA) Field Station, Shenton Park, Western Australia, during the summer and autumn months (December 2018 to April 2019). Cycle 3 cross progeny seed (S 0 ) along with selfs of the parent plants at S 2 or higher selfing generations (S 2+ ) were harvested and prepared for the Cycle 3 field trial, which forms the basis of this study.

Field Trial and Trait Assessment
In early June 2019, Cycle 3 progeny were grown in a field trial at UWA Field Station. Single plants were sown 90 cm apart in a square grid design 20 rows wide by 80 ranges (columns) deep, based on a spatially-optimized partially replicated design using the R package DiGGer [72]. The population included 668 S 0 genotypes (from crosses), 612 S 2+ genotypes (selfs of parents used in crosses) and 320 control plants (replicates of 15 control varieties), resulting in a trial with 1600 single plants (Figure 1a). The field was covered with black plastic weed mat to assist with weed control. One week after germination, infected pea straw with ascochyta blight disease from the previous cycle was spread across the trial to promote even disease infection.
Plants were scored for several traits during the growing season from June to November 2019. Ascochyta blight score (ABS) was measured as the number of nodes on the main stem from the base of the plant with stipules or leaves showing ascochyta blight disease symptoms at first flower, with lower values indicating higher resistance. Other recorded traits included days to first flower (DTF) and number of basal branches (Br). Stem angle above horizontal (EAngle) was assessed before the flowering stage (Figure 1b). After harvest, plants were collected, dried, and weighed for above-ground biomass (BM) and single plant grain yield (GY).
To assess stem strength, the first 15 cm of each main stem was detached at harvest and measured for mean internode length of the first four nodes (IL) and stem diameter (SD) at the center of the third stem internode. Compressed stem thickness (CST) was measured using a digital force gauge (Starr FGD-100, Starr Instruments, Melbourne, VIC, Australia) to apply a force of 10 N at the center of the third stem internode, recording CST as SD less displacement (mm) at 10 N (Figure 2a). Lastly, the top 10 cm of each stem piece was used to assess the stem buckling critical load (SB) in a setup where both ends of the stem were essentially fixed by restricting the base of the stem to the bottom of the digital force gauge with a clamp and pressing the top with the flat attachment of the device (Figure 2b). SB was recorded as the peak force (N) required to produce the structure failure when the force gauge was actuated downwards [49].

Univariate Linear Mixed Model
First, all traits were analyzed in a univariate linear mixed model (LMM) where the baseline model accounts for the randomization and is further extended to account for spatial variability in the field trial. Design blocks and genotypes were treated as random effects, and the residual model assumed independent and identically distributed error effects. To account for spatial variability within the trial, in particular the local stationary trend, a first-order separable autoregressive correlation process (AR1 × AR1) for ranges and rows was included, if significant, in the model, following Gilmour et al. [73]. In the model used here, we did not fit the AR1 × AR1 structure in the residual term but as a random effect to maintain a comparable model structure to the multivariate models, where it is not possible to fit the AR1 × AR1 structure in the residual term. In addition, as concluded by De Faveri et al. [74], the separable AR1 × AR1 structure in the residual term may not be adequate for multivariate models, and as a random term this can be dropped if it is found to be not significant. Additional linear fixed and random range and/or row effects were considered and included in each model if the term was significant and did not overfit the data.
The initial LMM was defined as: where y is the vector of individual plot observations for a given trait, β is the vector of fixed effects with associated design matrix X, u g is the vector of random genotype effects with associated design matrix Z g , u o is the vector of significant random effects other than genotype (AR1 × AR1 structure, range and/or row), with associated design matrix Z o , and e corresponds to the vector of plot residual effects. Vectors u g , u o , and e, representing random effects, are assumed to follow a Gaussian distribution with zeromean vector and pairwise independence, so that u g ∼ N 0, I G g , u o ∼ N(0, I G o ), and e ∼ N(0, I R), where I is an identity matrix and G g , G o and R are diagonal type variance-covariance matrices. A complete pedigree file of the population back to founders was based on the pedigree in Cowling et al. [18] and updated with new crosses and progeny in Cycle 3 (Supplemental Table S2A). A pedigree relationship matrix (A-matrix) was included in the analysis to estimate additive genetic effects (PBV) in each model as previously described [18]. In addition, it was possible to fit a variance component for the non-additive or residual genetic effects. This non-additive term was included only if it was significant and if inclusion increased the log-likelihood of the model. Thus, the LMM with pedigree information was defined as follows: where the model is equivalent to Equation (1), following the same assumptions, but the component u g is partitioned into u ga as the vector of random additive genetic effects and u gn as the vector of random non-additive genetic effects with u ga ∼ N 0, A G g and u gn ∼ N 0, I G g , where A corresponds to the A-matrix of pedigree relationship information. Each successive model was evaluated based on changes in log-likelihood, Akaike information criterion (AIC), and Bayesian information criterion (BIC). Comparisons between models were made through a log-likelihood ratio test. The significance of fixed effects was assessed by a Wald test. Random terms were retained when they significantly improved the log-likelihood of the model. Narrow-sense heritability (h 2 ) for each trait was calculated from the estimated variance components from the respective LMM with pedigree information, where h 2 equals the additive variance component divided by the sum of the additive, non-additive and residual variance components.

Multivariate Linear Mixed Model
The second approach was to fit a multivariate linear mixed model (MLMM) across traits, constructed by specifying a matrix of combined traits as the response [75]. Different variance structures can be fitted for each random and residual term specified inside the MLMM, thus, it is possible to obtain independent variances for the different effects of each trait and covariances between trait effects. This allows for a greater level of fine-tuning when modelling because the covariance between trait effects can be allowed or omitted independently for the different terms in the model.
In MLMM, the process began by fitting a baseline multi-trait model, using diagonal variance structures (DIAG) for the genotype and residual terms, which assumes that traits are independent (covariance between traits is zero) and have separate variances. This is comparable to fitting parallel univariate models. Second, the MLMM was improved by assuming an unstructured covariance model (US), instead of a DIAG variance structure, for the genotype and residual terms, permitting the estimation of covariances for these terms between traits [75]. Finally, pedigree relationship information in the form of an A-matrix was added to the MLMM, enabling the estimation of PBV, as in the univariate step.
For each trait present in the model, the AR1 × AR1 structure and other range or row random effects for each trait were fitted separately and retained in the final model only if significant. This means that it was possible to test and fit these terms in a trait-by-trait case. For example, the AR1 × AR1 structure was modelled only for ABS, DTF had a random row effect and GY had both a random row and range effect. Likewise, the non-additive genetic term was included in the MLMM or omitted depending on its significance for each trait and was included as either a US or DIAG structure, based on the significance of the covariance term (meaning that it was possible to create a model with covariance between additive genetic effects, but non-additive genetic effects were assumed independent). The final multivariate model was defined as: where terms are as per the univariate model in Equation (2) but expanded for additional traits, so that for N traits and Y is the combined vector of line observations for N traits included in the model, B is the matrix of combined vectors of significant fixed effects for each trait identified in the univariate analyses with associated design matrix X, U ga is the combined vector of random additive genetic effects with associated design matrix Z g , with U ga ∼ N(0, A T a ), U gn is the combined vector of random non-additive genetic effects with associated design matrix Z g , with U gn ∼ N(0, I T n ), U o is the combined vector of significant random effects other than genotype for each trait, with associated design matrix Z o , and E corresponds to the matrix of residual effects with E ∼ N(0, I R). A is the A-matrix of pedigree relationship information, I is an identity matrix and T a , T n , and R are the trait variance-covariance matrices of additive, non-additive, and residual effects, respectively. T a and R were modeled with a US variance structure resulting in the additive genetic and residual covariance of trait effects such as However, T n assumed a US variance structure only if the covariance of non-additive genetic effects between traits was significant, otherwise a DIAG variance structure was specified and non-additive genetic effects were assumed independent, resulting in either for US variance structure and DIAG variance structure, respectively. The covariances of trait effects estimated in the model were transformed to correlations using the variance value for each trait and their covariance: where ρ XY is the correlation coefficient between the effect of traits X and Y, σ XY is the covariance between the effects of traits X and Y, and σ 2 X and σ 2 Y , are the variances of traits X and Y, respectively.
As in the univariate process, all fitted MLMM were evaluated in terms of log-likelihood, AIC, and BIC, using a log-likelihood ratio test to compare models. The significance of fixed effect terms was assessed by a Wald test. Random terms were retained if they produced a significant improvement in the log-likelihood of the model. h 2 for each trait was calculated using the estimated variance components of the MLMM as the additive variance component divided by the sum of the additive, non-additive (if present), and residual variance components for each trait.
The models presented in this study were fitted using the R package ASReml-R version 4 (VSN International Ltd., Hemel Hempstead, UK) [75]. For the more complex models, the MLMM was optimized in the stand-alone version of ASReml version 4.2 (VSN International Ltd., Hemel Hempstead, UK) [76] due to its ability to use more CPU cores in the run.

Model Accuracy
The accuracy of PBV obtained from the final models was calculated as the correlation between the true and predicted breeding values based on the approach of Mrode [77]: where the accuracy r pi of prediction for the individual I, s 2 i is the squared standard error that accompanies the breeding value of individual i obtained from the model prediction, f i is the inbreeding coefficient calculated based on the pedigree for individual i and σ 2 A is the trait additive genetic variance estimated from the model [76].
The average accuracy of PBV of groups of progeny S 0 and S 2+ were compared across models.

Prediction of Progeny Performance in Cycle 4
PBVs for each trait were obtained from the optimized MLMM. An optimized selection index for each individual was calculated from the sum of weighted PBVs across traits, where the weights for each trait were based on the tactical desired gains approach [78] and calculated in the program DESIRE [79]. The selection index and the PBVs for each trait for each individual were submitted to software MateSel to generate an optimized mating design with 170 crosses to begin the fourth cycle. The simulation ran with a weight against progeny inbreeding of −0.01 and a conservative target of 50 degrees in the response frontier of the population to promote a moderate increase in index while minimizing achieved population coancestry [45]. The output summary included predicted changes in mean index value, PBVs and achieved parental coancestry in the matings to begin the fourth cycle.

Conclusions
The accuracy of predicted breeding values for 10 low to medium heritability traits in non-inbred progeny of field pea increased in an optimized multivariate linear mixed model over a univariate model, making it feasible to undertake annual cycles of S 0 recurrent selection on non-inbred progeny. Significant genetic gains in several low heritability traits were predicted in the next cycle with low rates of population inbreeding based on a mating design generated with optimal contribution selection. The results show that the information contained in correlated traits can be exploited to increase accuracy of selection in early generations of field pea breeding.  Figure S1. Correlation of residuals from Cycle 3 optimized multivariate model.