Article

Can We Teach Machines to Select Like a Plant Breeder? A Recommender System Approach to Support Early Generation Selection Decisions Based on Breeders’ Preferences

by Sebastian Michel 1,*, Franziska Löschenberger 2, Christian Ametz 2, Herbert Bistrich 2 and Hermann Bürstmayr 1

1 Institute of Biotechnology in Plant Production, University of Natural Resources and Life Sciences Vienna, Konrad-Lorenz-Str. 20, 3430 Tulln, Austria
2 Saatzucht Donau GesmbH & CoKG, Saatzuchtstrasse 11, 2301 Probstdorf, Austria
* Author to whom correspondence should be addressed.
Crops 2025, 5(3), 31; https://doi.org/10.3390/crops5030031
Submission received: 28 February 2025 / Revised: 27 April 2025 / Accepted: 8 May 2025 / Published: 20 May 2025

Abstract

Plant breeding is considered to be the science and art of genetically improving plants according to human needs. Breeders in this context oftentimes face the difficult task of selecting among thousands of genotypes for dozens of traits simultaneously. Using a breeder’s selection decisions from a commercial wheat breeding program as a case study, this study investigated the possibility of implementing a recommender system based on the breeder’s preferences to support early-generation selection decisions in plant breeding. The target trait was the retrospective binary classification of selected versus non-selected breeding lines during a period of five years, while the selection decisions of the breeder were predicted by various machine learning models. The explained variance of these selection decisions was of moderate magnitude ($\rho^2_{SNP} = 0.45$), and the models’ precision suggested that the breeder’s selection decisions were to some extent predictable (~20%), especially when some of the pending selection candidates were part of the training population (~30%). Training machine learning algorithms with breeders’ selection decisions can thus aid breeders in their decision-making processes, particularly when integrating human and artificial intelligence in the form of a recommender system to potentially reduce a breeder’s effort and the required time to find interesting selection candidates.

1. Introduction

Statistical and machine learning, both fields of study in artificial intelligence, have opened up new avenues for agriculture in recent years, where plant breeders and quantitative geneticists have, for example, exploited the power of these methods to optimize and accelerate genetic improvement in various crop species [1,2,3]. Statistical learning methods have thus gained large popularity for phenomics and genomics applications in plant breeding [4,5], giving rise, amongst others, to manifold predictive breeding approaches during the last decade [6,7,8,9]. In particular, the utilization of big data coming from vast multi-environment trials in combination with genome-wide distributed SNP markers to support selection decisions has become a routine procedure in many cereal breeding programs [10,11,12,13]. The foundation for conducting such a genomic selection lies in the quality and comprehensiveness of the underlying dataset, while the model training represents the heart of the genomic selection process [14]. Aside from classical statistical learning methods, Bayesian models or approaches like machine and deep learning algorithms, which are part of the field of artificial intelligence [15], are thereby employed to establish the relationship between genetic markers and phenotypic traits [16]. Regularly updating training populations with selection candidates that are routinely phenotyped across multiple years in multi-location trials in the framework of variety development [17,18,19] or with genotypes that are chosen by dedicated algorithms [20,21,22] is thereby pivotal in order to maintain a high accuracy of these prediction models [23,24]. The resulting genomic estimated breeding values can subsequently be used to guide a breeder’s selection decisions among early-generation non-phenotyped selection candidates, aiding in the selection of individuals with the most desirable genetic profiles [25]. Genomic selection has accordingly been successfully used for targeting major agronomic traits in cereal breeding to improve quantitatively inherited traits like grain yield [26,27], baking quality in wheat [28,29], and malting quality in barley [30,31]. In addition to the usage of genomic estimated breeding values on a continuous scale for the mentioned traits, some successes have also been reported for classifying selection candidates by machine learning algorithms [32,33,34], especially for certain characteristics, such as rust resistant versus non-resistant lines in wheat [4,35].
Aside from these data-driven approaches, a breeder’s intuition and knowledge, sometimes referred to as the breeder’s eye, still remains an important aspect of modern plant breeding, as it has been for more than 100 years [36]. Although this breeder’s eye has sometimes been romanticized in the form of a personal relationship between breeders and plants, it is without doubt the habit of successful plant breeders to constantly visit field trials during the season and systematically take note of their germplasm [37]. The breeder’s eye thereby makes use of objective traits like phenology and disease resistance but also considers more subjective traits, like general field impressions, with a certain ideotype in mind [38] as well as breeder’s knowledge of the crop germplasm [36]. Based on their knowledge about the demands of the seed market and sometimes direct interaction with the farmers [37], breeders nowadays possess a strong expertise in integrating both knowledge-driven [39] as well as data-driven inputs based on statistical approaches [40] in their selection decisions in order to develop new and improved crop varieties. These selection decisions are thus oftentimes reached by a breeder after implicitly combining phenotypic observations made in the field, knowledge about the germplasm, and the mentioned data-driven inputs from statistical and machine learning models, such as genomic estimated breeding values. However, this expertise of bridging human and artificial intelligence by breeders has not been adequately taken into consideration by statistical modelling approaches for combining machine learning and expert knowledge in plant breeding.
Nevertheless, machine learning algorithms trained on human decision making are an inherent part of other disciplines, like online commerce and marketing, where they are frequently used in the form of recommender systems on shopping, entertainment, and learning platforms [41]. Although many such platforms make use of vast amounts of consumer data and highly sophisticated algorithms, the simplest form of recommender system is given by training a simple machine learning model like random forest [42] with the preferences of just one user. The result of applying this so-called content-based approach for implementing a recommender system is the suggestion of personalized items or information based on the characteristics of items or information previously chosen by this user [43]. The merit of a recommender system is accordingly given by the fact that this person does not have to screen all the available content on a platform to find a suitable item or piece of information [41,43]. The usage of such a content-based recommender system thus has the advantage of providing highly personalized recommendations that are most relevant to one particular user [43]. Using such a recommender system in the context of plant breeding might likewise have some potential, since breeders oftentimes have to select among thousands of genotypes for dozens of traits simultaneously during product development. Training a statistical or machine learning model with a breeder’s preferences to obtain recommendations for selection would accordingly allow a breeder to screen the available data more systematically without being overwhelmed by the vast amount of information about a plethora of agronomic traits from all selection candidates at once. Hence, the application of such a recommender system in plant breeding would potentially reduce a breeder’s effort and the required time to find the most interesting selection candidates when conducting selection decisions. The derived recommendations would furthermore be highly personalized, reflecting a breeder’s preferences and breeding goals, which might lead to a high acceptance of such a recommender system as an additional selection tool. The aim of this case study was thus to investigate the possibility of predicting the selection decisions of a plant breeder by various methods of statistical and machine learning and to assess the potential of implementing a recommender system based on breeders’ preferences as a tool for supporting breeders in their selection decisions.

2. Materials and Methods

2.1. Plant Material, Classification of Breeder’s Decisions, and Genotypic Data

This study focuses on a set of 4674 winter wheat (Triticum aestivum L.) breeding lines from the commercial breeding program of Saatzucht Donau GesmbH & CoKG (Probstdorf, Austria), which were tested in preliminary yield trials between 2015 and 2019 in Austria. The lines were part of five distinct cohorts, each comprising 532–1966 lines that were part of 64–342 different families. The plant material within each cohort was classified into non-selected lines ($sel = 0$) and lines that were selected and advanced to the first year of multi-environment trials ($sel = 1$), based on the decisions by the breeder (Franziska Löschenberger) in the respective preliminary yield trials (Figure 1). These selection decisions served as the target trait in this study, and they were reached by the breeder after implicitly combining phenotypic observations made in the field, knowledge about the germplasm, and data-driven inputs such as genomic estimated breeding values.
All breeding lines were genotyped with the DArTcap targeted genotyping-by-sequencing approach [44]. Alleles of the SNP markers were coded as “−1” for homozygous minor, “+1” for homozygous major, and “0” for heterozygous. SNP markers with more than 10% missing data were filtered out, as were SNP markers with a minor allele frequency smaller than 5%, which resulted, after imputation with the missForest algorithm [45], in a final dataset of 2219 SNP markers as predictors for the below-described statistical and machine learning models, i.e., recommender systems. A principal component analysis with these SNP markers did not reveal a clear population structure in the studied panel of winter wheat breeding lines (Figure S1).
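The marker quality control and imputation described above could, for instance, be reproduced along the following lines; this is a minimal sketch in R, assuming a numeric marker matrix named geno coded as described (−1/0/+1 with NA for missing calls), and the object name is an assumption rather than taken from the original analysis.

library(missForest)

## geno: lines x markers matrix coded -1 (homozygous minor), 0 (heterozygous),
## +1 (homozygous major), with NA for missing calls (assumed object)
callrate_ok <- colMeans(is.na(geno)) <= 0.10            # drop markers with > 10% missing data
p_major     <- colMeans((geno + 1) / 2, na.rm = TRUE)   # major allele frequency per marker
maf_ok      <- pmin(p_major, 1 - p_major) >= 0.05       # keep minor allele frequency >= 5%
geno_filt   <- geno[, callrate_ok & maf_ok]

## non-parametric imputation of the remaining missing calls with missForest
geno_imp <- missForest(geno_filt)$ximp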
Genomic estimated breeding values for grain yield, protein content, protein yield, extensogram dough energy, and Fusarium head blight (FHB) severity of the 4674 breeding lines additionally served as potential predictors for the recommender systems. These genomic estimated breeding values were available to the breeder and extensively used when conducting selection decisions as part of the routine implementation of genomic selection in the wheat breeding program. Hence, they were chosen as predictors for the target trait, i.e., the breeder’s selection decisions, as they represented the major breeding targets, i.e., yield, quality, and disease resistance in the investigated wheat breeding program. The genomic estimated breeding values were computed based on genomic best linear unbiased prediction (GBLUP) models, which were trained on numerous multi-environment trial series with advanced generation breeding lines, as outlined in previous studies [46,47,48].
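As a rough illustration only, genomic estimated breeding values of this kind can be obtained with a GBLUP-equivalent model, for example via the RKHS interface of the BGLR package; the actual models of the breeding program were trained on multi-environment trial data as described in [46,47,48], and the objects pheno_yield and geno_train below are assumptions for the sketch.

library(BGLR)

## genomic relationship matrix from centred and scaled marker scores (assumed training set)
G <- tcrossprod(scale(geno_train)) / ncol(geno_train)

## GBLUP-equivalent fit via the RKHS model in BGLR
fit_gy <- BGLR(y = pheno_yield, ETA = list(list(K = G, model = "RKHS")),
               nIter = 12000, burnIn = 2000, verbose = FALSE)
gebv_yield <- fit_gy$ETA[[1]]$u              # genomic estimated breeding values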

2.2. Estimating the Explained Variance of Breeders’ Decisions

Bayesian logistic models were fitted with the BGLR package v1.1.0 [49] for the R statistical environment [50] separately for the individual cohorts as well as for the combination of all five cohorts in order to estimate the variance of the breeder’s selection decisions that can be explained by the different predictor sets. Aside from the SNP markers, the above-mentioned genomic estimated breeding values for the five major agronomic traits served as predictors in these models. The genomic estimated breeding values for grain yield, protein content, protein yield, extensogram dough energy, and FHB severity of each individual cohort were for this purpose standardized by
$$sgebv_{ijk} = \frac{gebv_{ijk} - \mu_{jk}}{\sigma_{jk}}$$
where $gebv_{ijk}$ is the genomic estimated breeding value of the ith line for the jth trait in the kth cohort, $\mu_{jk}$ is the average of the genomic estimated breeding values for the jth trait in the kth cohort, and $\sigma_{jk}$ is the standard deviation of the genomic estimated breeding values for the jth trait in the kth cohort. A Gaussian prior and a Gibbs sampler algorithm with a burn-in of 2000 followed by an additional 10,000 MCMC (Markov chain Monte Carlo) samples and a thinning factor of five were used in the model:
$$y = \mu + Zu + e$$
where $y$ is the vector of labels classifying the lines as non-selected ($sel = 0$) or selected ($sel = 1$), $\mu$ is the grand mean, and $u$ is the vector of random marker effects with variance $\sigma_u^2$. $Z$ is the matrix of predictors that contained either the SNP markers or the standardized genomic estimated breeding values, and $e$ is the vector of residuals. The proportion of the explained variance of this binary classification was estimated as suggested by Nakagawa and Schielzeth (2010), de Villemereuil et al. (2013), and Heuer et al. (2016) [51,52,53]:
$$\rho^2 = \frac{\sigma_a^2}{\sigma_a^2 + 1 + 1}$$
where the additive genetic variance $\sigma_a^2$ was estimated as $\sigma_a^2 = m \sigma_u^2$, with $m$ being the number of predictors; the first 1 in the denominator accounts for the probit link variance and the second 1 for the residual variance $\sigma_e^2$, as implemented in the BGLR package for the probit link function [49].
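A minimal sketch of this estimation with BGLR, assuming the 0/1 decisions in a vector sel and a standardized predictor matrix Z (both assumed object names), might look as follows; the threshold (probit) model is requested via response_type = "ordinal", consistent with the probit link variance in the formula above.

library(BGLR)

fit <- BGLR(y = sel,                                    # 0/1 breeder's decisions
            response_type = "ordinal",                  # probit threshold model
            ETA = list(list(X = Z, model = "BRR")),     # Gaussian prior on predictor effects
            nIter = 12000, burnIn = 2000, thin = 5, verbose = FALSE)

m        <- ncol(Z)                                     # number of predictors
sigma2_a <- m * fit$ETA[[1]]$varB                       # sigma_a^2 = m * sigma_u^2
rho2     <- sigma2_a / (sigma2_a + 1 + 1)               # probit link + residual variance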

2.3. Forward Prediction of Breeder’s Decisions by Machine Learning Algorithms

Training populations for fitting machine learning models to predict the breeder’s decisions were built by repeatedly (50 times) sampling 400 lines from each individual cohort in 2015–2019. Following the scheme of a multi-stage selection in a line breeding program, 350 of these 400 lines were sampled based on their labelling from the non-selected class ($sel = 0$) and 50 lines from the selected class ($sel = 1$), assuming a selection intensity of 12.5% from preliminary yield trials to multi-environment trials. For the purpose of a forward prediction, validation populations were created by analogously sampling 80 different lines (50 times) from each individual cohort in 2017–2019, where 70 lines belonged to the non-selected class ($sel = 0$) and 10 lines to the selected class ($sel = 1$), likewise reflecting a selection intensity of 12.5%. This resampling scheme was implemented in order to equalize the training and validation population sizes when assessing the potential to predict the breeder’s decisions in two forward prediction schemes:
(1)
Training populations coming from two cohorts preceding the cohort that was used for validation were used for fitting prediction models (800 lines) (Figure 2A).
(2)
Training populations coming from two cohorts preceding the cohort used for validation were used for fitting prediction models, while this set was augmented by the training population coming from the same cohort that was used for validation (1200 lines) (Figure 2B).
The former scheme thus represented a scenario in which none of the pending selection candidates was labeled, and the latter scheme a scenario in which a certain proportion of the pending selection candidates was already pre-labeled by the breeder. The individual validation populations sampled from the cohorts in 2017–2019 thereby served to measure the precision of the predicted classification in both forward prediction schemes. Additionally, the training population size was varied between 400 and 800 lines by equally sampling lines from the two cohorts preceding the cohort that was used for validation (Figure 2A) or from three cohorts including the cohort that was used for validation (Figure 2B), in order to investigate whether a generally larger training population size or, specifically, the inclusion of lines from the validation cohort would lead to a higher predictive performance of the below-described machine learning models.
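The stratified sampling of one training/validation draw could be sketched as follows; lines_df, with columns cohort and sel, is an assumed data frame, not an object from the original study, and the years are given only as an example.

## sample n_nonsel non-selected and n_sel selected lines from one cohort
sample_cohort <- function(df, year, n_nonsel, n_sel) {
  pool <- df[df$cohort == year, ]
  rbind(pool[sample(which(pool$sel == 0), n_nonsel), ],
        pool[sample(which(pool$sel == 1), n_sel), ])
}

## scheme (A): training population from the two cohorts preceding the validation cohort
train <- rbind(sample_cohort(lines_df, 2015, 350, 50),
               sample_cohort(lines_df, 2016, 350, 50))          # 800 lines, 12.5% selected

## validation population: 80 lines from the pending cohort
valid <- sample_cohort(lines_df, 2017, 70, 10)

## scheme (B): augment the training population with lines from the validation cohort
## (in practice these lines are kept disjoint from the validation lines)
train_aug <- rbind(train, sample_cohort(lines_df, 2017, 350, 50))   # 1200 lines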
For the purpose of obtaining predictions of the breeder’s decisions, both SNP markers as well as the standardized genomic estimated breeding values for grain yield, protein content, protein yield, extensogram dough energy, and FHB severity served again as predictors. Bayesian logistic models were firstly trained with BGLR [49] and a Gibbs sampler algorithm with a burn-in of 2000 followed by an additional 10,000 MCMC samples and a thinning factor of five. A Gaussian prior was used when implementing the Bayesian logistic models, equally shrinking all predictor effects towards zero and assuming purely additive effects of the predictors. Alternatively, the R package glmnet v4.1-6 [54] was used to predict the breeders’ decisions with elastic nets, where a sequence of 1000 values was used to search for an optimum of the penalty parameter λ based on the lowest deviance. The elastic nets also assumed purely additive effects, while the predictor effects u ^ were estimated by
$$\hat{u} = \underset{u}{\operatorname{argmin}} \left( \lVert y - Zu \rVert^2 + \lambda \left[ \alpha \lVert u \rVert_1 + \tfrac{1}{2}(1 - \alpha) \lVert u \rVert_2^2 \right] \right)$$
Setting the hyperparameter $\alpha = \frac{1}{3}$ ensured an equal weight of the $L_1$ and $L_2$ regularization, combining the shrinkage properties for handling multicollinearity with the predictor selection properties for reducing the influence of noisy signals. Neural networks were subsequently tested in order to also model interactions between the predictors, i.e., both additive as well as non-additive effects. The R package neuralnet v1.44.2 [55] was used to fit multi-layer perceptrons based on resilient backpropagation with backtracking, one hidden layer, and $\frac{1}{3}m$ neurons in this hidden layer, where $m$ is the total number of available predictors. Random forest models [42] were finally fitted with the R package randomForest v4.7-1.1 [56], which likewise included modeling additive as well as non-additive effects among the predictors. Half the number of lines in the respective training population were sampled to build 500 classification trees with a minimum terminal node size of 1, while the number of sampled predictors for building each tree was set to $p = \sqrt{m}$, where $m$ is the total number of available predictors.
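A hedged sketch of fitting the three machine learning models with the settings given above is shown below (the Bayesian logistic model follows the BGLR call shown earlier); X_train (a numeric predictor matrix) and y_train (0/1 labels) are assumed objects, not from the original analysis.

library(glmnet); library(neuralnet); library(randomForest)

## elastic net with equal weight of the L1 and L2 penalties (alpha = 1/3),
## lambda chosen from a sequence of 1000 values by lowest deviance
enet <- cv.glmnet(X_train, y_train, family = "binomial",
                  alpha = 1/3, nlambda = 1000)

## multi-layer perceptron: one hidden layer with m/3 neurons, resilient backpropagation
nn_dat  <- data.frame(y = y_train, X_train)                      # check.names gives valid column names
nn_form <- reformulate(setdiff(names(nn_dat), "y"), response = "y")
mlp <- neuralnet(nn_form, data = nn_dat, hidden = round(ncol(X_train) / 3),
                 algorithm = "rprop+", linear.output = FALSE)

## random forest: 500 trees, sqrt(m) predictors per tree, terminal node size 1,
## half of the training lines sampled per tree
rf <- randomForest(x = X_train, y = factor(y_train), ntree = 500,
                   mtry = floor(sqrt(ncol(X_train))), nodesize = 1,
                   sampsize = floor(nrow(X_train) / 2))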
The predictions from each of the models were expressed in terms of the probability for each of the 80 breeding lines in the validation population to fall into the class of selected lines ($sel = 1$). Aiming to reflect a recommender system where the most desirable selection candidates are suggested to a breeder, the 10 lines in the validation population with the highest probability were accordingly labeled to fall into the mentioned class of the selected lines ($sel = 1$), and the other 70 lines were labeled to fall into the class of non-selected lines ($sel = 0$). The precision of this classification was estimated by
$$Prec = \frac{TP}{TP + FP}$$
where $TP$ is the number of true positive and $FP$ is the number of false positive classified lines in the validation population based on a confusion matrix. The true positives refer to lines that were predicted as being selected and were actually selected by the breeder, whereas the false positives refer to lines that were wrongly classified by the statistical and machine learning algorithms to fall into the class of selected lines ($sel = 1$). This metric can, in the study at hand, also be interpreted as the percentage overlap between the lines recommended by the different statistical and machine learning algorithms and the actual breeder’s choice. Aside from the precision, the Matthews correlation coefficient ($\phi$) was chosen as an additional comprehensive measure of model performance and estimated by
$$\phi = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$
where $TP$ is the number of true positive, $TN$ is the number of true negative, $FP$ is the number of false positive, and $FN$ is the number of false negative classified lines in the validation population based on a confusion matrix. The true positives again refer to lines that were predicted as being selected and were actually selected by the breeder, whereas the true negatives refer to lines that were predicted as being non-selected and were actually not selected by the breeder. The false positives and false negatives accordingly refer to lines that were wrongly classified by the machine learning algorithms. The Matthews correlation coefficient ($\phi$) was chosen as an additional metric as it is similar to the Pearson correlation coefficient $\rho$ in its interpretation in the case of a binary classification. Notably, it can be shown that $\phi$ is identical to $\rho$ when coding non-selected lines as “0” and selected lines as “1” and calculating the Pearson correlation coefficient between the predicted and observed selection decisions.
The precision and Matthews correlation coefficient obtained when training the different machine learning models and choosing the lines with the highest probability to fall into the class of selected lines ($sel = 1$) were lastly compared to a random choice among the lines in the validation population in the above-outlined forward prediction schemes (Figure 2).
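A minimal sketch of turning predicted selection probabilities into recommendations and scoring them against the breeder’s decisions is given below; prob (predicted probabilities for the 80 validation lines) and obs (the observed 0/1 decisions) are assumed objects.

## recommend the 10 lines with the highest predicted probability of selection
pred <- as.integer(rank(-prob, ties.method = "random") <= 10)

TP <- sum(pred == 1 & obs == 1); FP <- sum(pred == 1 & obs == 0)
TN <- sum(pred == 0 & obs == 0); FN <- sum(pred == 0 & obs == 1)

precision <- TP / (TP + FP)                                      # overlap with the breeder's choice
mcc <- (TP * TN - FP * FN) /
  sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))            # Matthews correlation coefficient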

2.4. Retrospective Assessment of the Machine Learning Algorithms’ Recommendations as a Selection Decision Support Tool

The merit of using the recommendations made by the different machine learning models trained with the breeder’s preferences as a selection decision support tool was assessed by using a reciprocal fine-tuning algorithm (Figure 3). The algorithm was initiated by obtaining recommendations for the population of selection candidates in 2017, 2018, and 2019 based on model training with the respective two previous years, 2015–2016, 2016–2017, and 2017–2018. Initial training populations of 400 lines were sampled from each of these years, where 350 of these 400 lines were sampled based on their labelling from the non-selected class ($sel = 0$) and 50 lines from the selected class ($sel = 1$). The populations of selection candidates were constructed by sampling 420 lines from the non-selected class ($sel = 0$) and 60 lines from the selected class ($sel = 1$) from each of the validation years, 2017, 2018, and 2019, individually. A retrospective perspective was taken in this study to test the suggested algorithm, where the aim was to identify as high a proportion as possible of the $l_{target} = 60$ lines that were actually chosen by the breeder among a population of 480 selection candidates:
(1)
In the first iteration ($i = 1$), the $l_{i=1} = 60$ lines with the highest probability of selection were recommended and labeled to fall into the class of selected lines ($sel = 1$).
(2)
Based on the actual breeder’s choice among these recommended lines, this labeling of $sel = 1$ was retained for a number of $tl_{i=1}$ lines corresponding to the true positives, while it was changed to $sel = 0$ for a number of $fl_{i=1}$ lines corresponding to the false positives.
(3)
The $tl_{i=1}$ true positive lines were then added to the pool of chosen lines ($l_{chosen}$).
(4)
After (re-)labeling the lines in the population of selection candidates in this way, they were used to augment the initial training population of $c_{i=1} = 800$ lines to a size of $c_{i=2} = c_{i=1} + tl_{i=1} + fl_{i=1}$.
(5)
Machine learning models were then re-trained with this augmented training population to obtain recommendations for the remaining not-yet-labeled lines in the population of selection candidates.
(6)
In the second iteration ($i = 2$), the $l_{i=2} = l_{target} - l_{chosen}$ lines with the highest probability of selection were recommended and labeled to fall into the class of selected lines ($sel = 1$), reducing the number of recommended lines to the remaining difference towards the target of $l_{target} = 60$ lines.
(7)
Steps (2)–(6) were subsequently repeated for up to $i = 20$ iterations or until the target number of $l_{target} = 60$ lines was reached in the pool of chosen lines ($l_{chosen} = 60$).
The initial recommendations for a population of selection candidates were thus based on model training with previous years, followed by feedback from the breeder and re-training of the models with an augmented training population. The performance of the above-outlined machine learning models in this scheme was finally compared to a random choice among the lines in terms of the average percentage of found target lines as well as the average percentage of totally labeled lines in each iteration across 50 repetitions of randomly sampling the training populations and the populations of selection candidates. The latter comparison was performed by choosing lines randomly from the selection candidates in each iteration instead of choosing the lines with the highest probability to fall into the class of selected lines ($sel = 1$) based on the recommendations made by the different machine learning models. It should lastly be noted that the first iteration of the suggested algorithm corresponds to the above-described scenario in which the training populations come from two cohorts preceding the cohort that was used for validation, i.e., the cohort that contained the pending selection candidates (Figure 2A). All subsequent iterations, on the other hand, correspond to the above-described scenario in which this training population is augmented with lines coming from the same cohort that was used for validation (Figure 2B).
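The reciprocal fine-tuning loop could be sketched as follows; candidates is an assumed data frame of the 480 pending selection candidates with their retrospective 0/1 labels in candidates$sel, and fit_model() and predict_prob() are hypothetical placeholders standing for any of the learners above.

chosen  <- integer(0)                        # pool of confirmed target lines
labeled <- integer(0)                        # indices of all candidates labeled so far
train   <- initial_train                     # 800 lines from the two previous cohorts (assumed object)

for (i in 1:20) {
  n_rec <- 60 - length(chosen)               # remaining lines towards the target
  if (n_rec == 0) break
  fit  <- fit_model(train)
  open <- setdiff(seq_len(nrow(candidates)), labeled)
  prob <- predict_prob(fit, candidates[open, ])
  rec  <- open[order(prob, decreasing = TRUE)[seq_len(n_rec)]]

  ## breeder feedback: true positives keep sel = 1, false positives are re-labeled as sel = 0
  chosen  <- c(chosen, rec[candidates$sel[rec] == 1])
  labeled <- c(labeled, rec)
  train   <- rbind(train, candidates[rec, ]) # augment the training population and re-train
}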

3. Results

The estimated explained variance of the breeder’s decisions based on SNP markers with the combination of all five cohorts was of moderate magnitude, with $\rho^2_{SNP} = 0.45$, while it was substantially lower when the genomic estimated breeding values of the individual traits served as predictors, with $\rho^2_{TRAIT} = 0.25$. Similar observations were made when estimating the explained variance of the individual cohorts, suggesting that the SNP markers were able to explain a larger proportion of the breeder’s selection decisions than the genomic estimated breeding values of the employed five agronomic traits (Table 1). The estimated explained variance for the individual cohorts ranged accordingly between $\rho^2_{SNP} = 0.53$–$0.77$ and $\rho^2_{TRAIT} = 0.22$–$0.34$ for selection decisions made at the stage of preliminary yield trials concerning lines that were selected for the first year of multi-environment trials. These results thus suggested that the selection decisions taken by the breeder can to some extent be explained by genome-wide distributed SNP markers or genomic estimated breeding values. Hence, in a second step, it was investigated whether these selection decisions are also predictable and can be used in the form of a recommender system.
The usage of SNP markers for model training with the two cohorts preceding the individual cohorts in 2017–2019 that were used for validation resulted in a similar precision of 16.2–17.6% for all four tested statistical and machine learning models (Figure 4), which slightly exceeded the precision of 12.5% that would be expected from a random recommendation of lines for selection. Employing genomic estimated breeding values of the above-mentioned major agronomic traits led to a marked increase in average precision for the Bayesian logistic models (22.2%), the elastic nets (21.3%), and the neural networks (20.1%) in this scenario (Figure 2A). Augmentation of the training population with lines coming from the same cohort as used for validation (Figure 2B) led to a marked increase in precision both when SNP markers and when genomic estimated breeding values were used as predictors (Figure 4). The highest average precision of 32.2% was achieved by the random forest models when SNP markers were used as predictors in this scenario, while the elastic net performed best when genomic estimated breeding values served as predictors, with a precision of 24.4%. The Matthews correlation coefficient revealed a similar pattern (Figure 4), with the Bayesian logistic models ($\phi = 0.22$) and the random forest models ($\phi = 0.23$) showing the best performance, especially when training models with SNP markers and including lines coming from the same cohort as was used for validation.
The retrospective testing of the recommendations made by the different machine learning models as a selection decision support tool revealed that, on average, a higher percentage of the target lines selected by the breeder was found by all models in comparison to a non-systematic random choice among the selection candidates (Figure 5 and Figure S2). The usage of the machine-based recommendations showed, for example, that in most cases more than 60% of the target lines were already identified in the fifth iteration of the reciprocal fine-tuning algorithm, while less than 50% of these lines were found by randomly choosing among the selection candidates. This trend continued in subsequent iterations, and in iteration 15 more than 90% of the target lines were identified on average by using the machine-based recommendations, while only 70% of all lines had to be labeled for this purpose. Using a random selection, 86% of all lines had to be labeled to achieve this goal, which corresponded to a relative advantage of 23% and underpinned the merit of the suggested recommender system approach as a selection decision support tool.

4. Discussion

Plant breeding is considered to be the science and art of genetically improving plants according to human needs. Scientific breeding is thereby driven by big data from various omics disciplines, while the art of breeding is determined by breeder’s intuition, experience, and the so-called breeder’s eye [39]. The importance of the art and science aspects therefore changes gradually throughout the multi-stage selection in line breeding programs for crops like bread wheat. A higher weight can probably be ascribed to the art aspect in earlier generations, where selection takes place on a head-row basis, since phenotypic information from precise field trials for agronomic traits such as grain yield is not available. A turning point in this continuum from art to science is certainly the stage of preliminary yield trials and the first assessment of the selection candidates in yield plots, while this is nowadays quite often also the stage in which genotyping data and genomic estimated breeding values are available to breeders [14]. Hence, breeders are usually integrating the science and art aspects when making selection decisions in preliminary yield trials with regard to the material that is advanced to a first year of multi-environment trials.
These selection decisions are inherently complex, and many factors can play a role, for example, field impression, the occurrence of diseases, handling negative trade-offs such as between grain yield and protein content, and the aim to develop potential varieties for specific market segments [37]. Taking the vast additional information coming from genomic estimated breeding values and methods like independent culling with certain thresholds for multiple traits into account renders the final binary classification into selected versus non-selected lines into an array of many small decisions. Notwithstanding, the proportion of explained variance based on genome-wide distributed SNP markers or genomic estimated breeding values of a few major agronomic traits was relatively high, considering the complexity of these selection decisions in the study at hand. Some variation in the explained variance was, however, found when considering individual cohorts, which might be attributed to some cognitive bias by the breeder, who has a certain ideotype in mind when making field observations in early generations; these observations can be influenced by environmental effects, which in this way indirectly introduce a non-genetic trend into the estimation of the explained variance of the selection decisions. However, the medium-scaled proportion of explained variance when using the combination of all five cohorts indicated a common trend in the selection decisions in preliminary yield trials, reflecting the fact that the breeder’s selection decisions ideally considered both the current as well as the future performance of the selection candidates [39].
For the purpose of predicting the complex patterns of these selection decisions during variety development, machine learning algorithms were directly trained with the breeder’s selection decisions. This revealed that the breeder’s selection decisions were to some extent predictable, especially when some lines of the validation cohort, i.e., the cohort containing the pending selection candidates, were included in the training population. Since the increase in training population size from 800 to 1200 lines might have some influence on this observation, it was additionally tested whether a larger training population size or, specifically, the inclusion of the lines from the validation cohort would lead to a higher predictive performance of the machine learning models. Although the predictive performance of the models generally increased when enlarging the training population from 400 to 800 lines, it was evident that the inclusion of lines from the validation cohort resulted in a markedly higher precision and correlation coefficient in comparison to model training based solely on past cohorts at the same training population size (Figures S3 and S4). Although the consensus between the breeder’s selection and the machine-based recommendations was only of low to moderate magnitude in the study at hand, it suggested that a breeder can find at least some desired selection candidates among the machine-based recommendations. Hence, one possible strategy to exploit the increase in precision when adding lines from the current cohort to the training population is to first obtain machine-based recommendations from models trained with decisions from previous cohorts, whereby a proportion of the selection candidates is labeled to fall into the class of selected genotypes. These recommendations can subsequently be used by breeders for a human analysis, providing feedback to the machine learning algorithm by keeping the selection recommendations for certain lines or by re-labeling recommended lines to fall into the “non-selected” class. The pending selection candidates labeled in this way can then be added to the training population, followed by refitting the prediction model with this extended training population and obtaining updated recommendations for the other pending selection candidates. A final selection decision might, in this way, be reached after several rounds of reciprocal fine-tuning based on the synthesis of artificial and human intelligence [57,58] by repeating this procedure until the desired number of lines is labeled as “selected”. Testing such a reciprocal fine-tuning algorithm in a retrospective manner showed that lines preferred by the breeder can be more readily and systematically identified in comparison to a non-systematic approach. Following such a strategy would most likely require the usage of models with a low computational burden in order to provide rapid results to a breeder and allow the implementation of a flexible feedback system, even if their precision is not as high as that of more sophisticated models and predictor sets. It should, however, also be stressed that the results of the content-based recommender system approach that was employed in the study at hand are specific to the investigated breeding program, while the recommender system methodology itself might generally serve as an interesting selection decision support tool for breeders.
Using such a recommender system would moreover exploit the combined power of machine learning and the human capability of finding solutions in unexpected and changing circumstances, where methods of artificial intelligence are still inferior at the moment [57,59]. Hence, humans can be considered as essential in cases where it is necessary to bridge the gap between science and art in disciplines like plant breeding, while artificial intelligence can be considered as a supporting tool in the decision-making process. Breeders are accordingly able to reach selection decisions that involve thorough considerations about multiple aspects simultaneously, including objective phenotypic data from field trials and more subjective observations like field impressions. It should be noted, however, that a breeder’s selection decisions are not perfect due to the limited information that is available at a certain stage of selection; also, oftentimes the heuristic decision making by breeders might lead to the advancement of some selection candidates through the product development pipeline that do not ultimately fulfill the strict requirements of variety registration trials. Nevertheless, breeders’ selection decisions based on phenotypic observations made in the field, knowledge about the germplasm, and data-driven inputs such as genomic estimated breeding values can still be considered the gold standard for finding genotypes with favorable trait combinations and boosting crop genetic improvement.
Another way to aid breeders in the inherently difficult task of finding such promising genotypes, aside from recommending individual selection candidates by statistical and machine learning algorithms, is given by selection indices that feature a continuous scaling. Although selection indices such as grain yield deviations and grain protein deviations, involving the two major agronomic traits grain yield and protein content, have gained popularity, e.g., in wheat breeding [60,61], finding objective weights for selection indices that involve all relevant traits is difficult in plant breeding. Such selection indices might moreover be defined solely based on statistical aspects, leaving out breeders’ preferences, which can lead to a low acceptance of these indices by breeders as a selection tool. Bernardo (1991) [62] suggested a retrospective index based on the selection differentials applied by a breeder in order to derive index weights. Even though this approach is limited to a certain array of assessed traits, such a retrospective index will clearly reflect a breeder’s preferences, giving it a higher chance of being applied in practice. Extending this approach, a related index can be obtained by training machine learning algorithms with breeders’ selection decisions and deriving the probabilities of new selection candidates to fall into the class of selected genotypes. The continuous scaling of these probabilities as a breeder index would make it possible to rank young selection candidates and ease the identification of genotypes with favorable trait combinations. The most convenient option for implementing such a strategy is probably to derive such a breeder index once with a model and predictor set that provides high precision, irrespective of the computational burden. The breeder index might in this way be routinely obtained back-to-back with genomic estimated breeding values, as used in combination with other index selection approaches [60,63,64,65], representing a complement or alternative to the here-suggested recommender system with a binary classification into selected and non-selected genotypes.
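A brief sketch of how such a continuous breeder index could be derived from the class probabilities of a trained model (here the random forest rf from the earlier sketch) is given below; X_candidates is an assumed predictor matrix for new selection candidates, not an object from the original study.

## class probabilities of falling into the selected class as a continuous breeder index
breeder_index <- predict(rf, newdata = X_candidates, type = "prob")[, "1"]
ranking       <- order(breeder_index, decreasing = TRUE)     # rank young selection candidates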

5. Conclusions

This study investigated the possibility of implementing a recommender system based on breeders’ preferences to support early-generation selection decisions in plant breeding by using a breeder’s selection decisions from a commercial wheat breeding program as a case study. It was found that a moderate proportion of the variance of the breeder’s selection decisions could be explained by genome-wide distributed SNP markers as well as by the genomic estimated breeding values of a few major agronomic traits, even when considering the combination of different cohorts from multiple years. Training machine learning algorithms with the selection decisions made by this experienced breeder revealed that they were, to some extent, predictable across and within cohorts. These predictions might accordingly be used in the form of a recommender system, and it is suggested that a final selection decision is reached after several rounds of reciprocal fine-tuning, based on recommendations by a statistical or machine learning algorithm and feedback given to this prediction model by the breeder. Deriving a breeder’s index based on the mentioned algorithms might alternatively be useful both for young and senior breeders for checking up on their breeding goals and potentially as a supporting tool to guide their selection decisions. Training statistical or machine learning algorithms with expert knowledge has great potential to aid breeders in their decision-making processes, especially when integrating human and artificial intelligence in the form of a recommender system to reduce a breeder’s effort and the required time to find interesting selection candidates.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/crops5030031/s1, Figure S1: Principal component analysis of the 4674 winter wheat (Triticum aestivum L.) breeding lines of the cohorts in 2015–2019; Figure S2: Boxplots for the percentage of chosen target lines and the total percentage of labeled lines among all selection candidates in each iteration of the suggested reciprocal fine-tuning algorithm. A random choice among the selection candidates (RANDOM) was compared with elastic nets (ENET), neural nets (NNET), Bayesian logistic (BAYES), and random forest (FOREST) models based on SNP array data or genomic estimated breeding values as predictors for recommending specific lines; Figure S3: Distribution of the precision (%) with varying training population size for the prediction of breeders’ selection decisions across the validation populations of 80 lines from the individual cohorts in 2017–2019 based on either genome-wide distributed SNP markers (SNP marker-based recommender) or genomic estimated breeding values for grain yield, protein content, protein yield, extensogram dough energy, and Fusarium head blight severity (trait-based recommender). Training populations for fitting elastic nets (ENET), neural nets (NNET), Bayesian logistic model (BAYES), and random forest (FOREST) were built either by solely using two cohorts preceding the cohort that was used for validation or by including a proportion of lines from the same cohort used for validation; Figure S4: Distribution of the Matthews correlation coefficient ($\phi$) with varying training population size for the prediction of breeders’ selection decisions across the validation populations of 80 lines from the individual cohorts in 2017–2019 based on either genome-wide distributed SNP markers (SNP marker-based recommender) or genomic estimated breeding values for grain yield, protein content, protein yield, extensogram dough energy, and Fusarium head blight severity (trait-based recommender). Training populations for fitting elastic nets (ENET), neural nets (NNET), Bayesian logistic model (BAYES), and random forest (FOREST) were built either by solely using two cohorts preceding the cohort that was used for validation or by including a proportion of lines from the same cohort that was used for validation.

Author Contributions

Conceptualization, S.M. and F.L.; methodology, S.M.; software, S.M.; validation, S.M. and C.A.; formal analysis, S.M.; investigation, S.M. and F.L.; resources, F.L., H.B. (Herbert Bistrich), and C.A.; data curation, F.L.; writing—original draft preparation, S.M.; writing—review and editing, S.M. and F.L.; visualization, S.M.; supervision, H.B. (Hermann Bürstmayr); project administration, H.B. (Hermann Bürstmayr) and F.L.; funding acquisition, F.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the “Frontrunner” FFG project TRIBIO (35412407).

Data Availability Statement

The datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Acknowledgments

We would like to thank Maria Bürstmayr and her team for their tremendous work in extracting the DNA of the hundreds of wheat lines. We would also like to thank the anonymous reviewers for their comments and suggestions for improving the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Tong, H.; Nikoloski, Z. Machine learning approaches for crop improvement: Leveraging phenotypic and genotypic big data. J. Plant Physiol. 2021, 257, 153354. [Google Scholar] [CrossRef] [PubMed]
  2. de Beukelaer, H.; de Meyer, G.; Fack, V. Heuristic exploitation of genetic structure in marker-assisted gene pyramiding problems. BMC Genet. 2015, 16, 2. [Google Scholar] [CrossRef] [PubMed]
  3. Akdemir, D.; Beavis, W.; Fritsche-Neto, R.; Singh, A.K.; Isidro-Sánchez, J. Multi-objective optimized genomic breeding strategies for sustainable food improvement. Heredity 2019, 122, 672–683. [Google Scholar] [CrossRef] [PubMed]
  4. González-Camacho, J.M.; Ornella, L.; Pérez-Rodríguez, P.; Gianola, D.; Dreisigacker, S.; Crossa, J. Applications of Machine Learning Methods to Genomic Selection in Breeding Wheat for Rust Resistance. Plant Genome 2018, 11, 170104. [Google Scholar] [CrossRef]
  5. Montesinos-López, O.A.; Montesinos-López, A.; Pérez-Rodríguez, P.; Barrón-López, J.A.; Martini, J.W.R.; Fajardo-Flores, S.B.; Gaytan-Lugo, L.S.; Santana-Mancilla, P.C.; Crossa, J. A review of deep learning applications for genomic selection. BMC Genom. 2021, 22, 19. [Google Scholar] [CrossRef]
  6. Robert, P.; Auzanneau, J.; Goudemand, E.; Oury, F.-X.; Rolland, B.; Heumez, E.; Bouchet, S.; Le Gouis, J.; Rincent, R. Phenomic selection in wheat breeding: Identification and optimisation of factors influencing prediction accuracy and comparison to genomic selection. Theor. Appl. Genet. 2022, 135, 895–914. [Google Scholar] [CrossRef]
  7. Krause, M.R.; Mondal, S.; Crossa, J.; Singh, R.P.; Pinto, F.; Haghighattalab, A.; Shrestha, S.; Rutkoski, J.; Gore, M.A.; Sorrells, M.E.; et al. Aerial high-throughput phenotyping enables indirect selection for grain yield at the early generation, seed-limited stages in breeding programs. Crop Sci. 2020, 60, 3096–3114. [Google Scholar] [CrossRef]
  8. Zhao, Y.; Thorwarth, P.; Jiang, Y.; Philipp, N.; Schulthess, A.W.; Gils, M.; Boeven, P.H.G.; Longin, C.F.H.; Schacht, J.; Ebmeyer, E.; et al. Unlocking big data doubled the accuracy in predicting the grain yield in hybrid wheat. Sci. Adv. 2021, 7, eabf9106. [Google Scholar] [CrossRef]
  9. Juliana, P.; Poland, J.; Huerta-Espino, J.; Shrestha, S.; Crossa, J.; Crespo-Herrera, L.; Toledo, F.H.; Govindan, V.; Mondal, S.; Kumar, U.; et al. Improving grain yield, stress resilience and quality of bread wheat using large-scale genomics. Nat. Genet. 2019, 51, 1530–1539. [Google Scholar] [CrossRef]
  10. Borrenpohl, D.; Huang, M.; Olson, E.; Sneller, C. The value of early-stage phenotyping for wheat breeding in the age of genomic selection. Theor. Appl. Genet. 2020, 133, 2499–2520. [Google Scholar] [CrossRef]
  11. Raffo, M.A.; Sarup, P.; Guo, X.; Liu, H.; Andersen, J.R.; Orabi, J.; Jahoor, A.; Jensen, J. Improvement of genomic prediction in advanced wheat breeding lines by including additive-by-additive epistasis. Theor. Appl. Genet. 2022, 135, 965–978. [Google Scholar] [CrossRef] [PubMed]
  12. Sneller, C.; Ignacio, C.; Ward, B.; Rutkoski, J.; Mohammadi, M. Using Genomic Selection to Leverage Resources among Breeding Programs: Consortium-Based Breeding. Agronomy 2021, 11, 1555. [Google Scholar] [CrossRef]
  13. Juliana, P.; Singh, R.P.; Braun, H.-J.; Huerta-Espino, J.; Crespo-Herrera, L.; Govindan, V.; Mondal, S.; Poland, J.; Shrestha, S. Genomic Selection for Grain Yield in the CIMMYT Wheat Breeding Program—Status and Perspectives. Front. Plant Sci. 2020, 11, 564183. [Google Scholar] [CrossRef]
  14. Robertsen, C.; Hjortshøj, R.; Janss, L. Genomic Selection in Cereal Breeding. Agronomy 2019, 9, 95. [Google Scholar] [CrossRef]
  15. Farooq, M.A.; Gao, S.; Hassan, M.A.; Huang, Z.; Rasheed, A.; Hearne, S.; Prasanna, B.; Li, X.; Li, H. Artificial intelligence in plant breeding. Trends Genet. 2024, 40, 891–908. [Google Scholar] [CrossRef] [PubMed]
  16. Desta, Z.A.; Ortiz, R. Genomic selection: Genome-wide prediction in plant improvement. Trends Plant Sci. 2014, 19, 592–601. [Google Scholar] [CrossRef]
  17. Michel, S.; Löschenberger, F.; Sparry, E.; Ametz, C.; Bürstmayr, H. Multi-Year Dynamics of Single-Step Genomic Prediction in an Applied Wheat Breeding Program. Agronomy 2020, 10, 1591. [Google Scholar] [CrossRef]
  18. Schrag, T.A.; Schipprack, W.; Melchinger, A.E. Across-years prediction of hybrid performance in maize using genomics. Theor. Appl. Genet. 2019, 132, 933–946. [Google Scholar] [CrossRef]
  19. Sleper, J.A.; Sweet, P.K.; Mukherjee, S.; Li, M.; Hugie, K.L.; Warner, T.L. Genomewide selection utilizing historic datasets improves early stage selection accuracy and selection stability. Crop Sci. 2020, 60, 772–778. [Google Scholar] [CrossRef]
  20. Akdemir, D.; Sanchez, J.I.; Jannink, J.-L. Optimization of genomic selection training populations with a genetic algorithm. Genet. Sel. Evol. 2015, 47, 38. [Google Scholar] [CrossRef]
  21. Fernández-González, J.; Akdemir, D.; Isidro y Sánchez, J. A comparison of methods for training population optimization in genomic selection. Theor. Appl. Genet. 2023, 136, 30. [Google Scholar] [CrossRef] [PubMed]
  22. Bustos-Korts, D.; Malosetti, M.; Chapman, S.; Biddulph, B.; van Eeuwijk, F. Improvement of Predictive Ability by Uniform Coverage of the Target Genetic Space. G3 Genes|Genomes|Genet. 2016, 6, 3733–3747. [Google Scholar] [CrossRef]
  23. Neyhart, J.L.; Tiede, T.; Lorenz, A.J.; Smith, K.P. Evaluating Methods of Updating Training Data in Long-Term Genomewide Selection. G3 Genes|Genomes|Genet. 2017, 7, 1499–1510. [Google Scholar] [CrossRef]
  24. Heffner, E.L.; Sorrells, M.E.; Jannink, J.-L. Genomic Selection for Crop Improvement. Crop Sci. 2009, 49, 1–12. [Google Scholar] [CrossRef]
  25. Meuwissen, T.H.E.; Hayes, B.J.; Goddard, M.E. Prediction of total genetic value using genome-wide dense marker maps. Genetics 2001, 157, 1819–1829. [Google Scholar] [CrossRef]
  26. Tsai, H.; Cericola, F.; Edriss, V.; Andersen, J.R.; Orabi, J.; Jensen, J.D.; Jahoor, A.; Janss, L.; Jensen, J. Use of multiple traits genomic prediction, genotype by environment interactions and spatial effect to improve prediction accuracy in yield data. PLoS ONE 2020, 15, e0232665. [Google Scholar] [CrossRef] [PubMed]
  27. Belamkar, V.; Guttieri, M.J.; Hussain, W.; Jarquín, D.; El-basyoni, I.; Poland, J.; Lorenz, A.J.; Baenziger, P.S. Genomic Selection in Preliminary Yield Trials in a Winter Wheat Breeding Program. G3 Genes|Genomes|Genet. 2018, 8, 2735–2747. [Google Scholar] [CrossRef]
  28. Ben-Sadoun, S.; Rincent, R.; Auzanneau, J.; Oury, F.X.; Rolland, B.; Heumez, E.; Ravel, C.; Charmet, G.; Bouchet, S. Economical optimization of a breeding scheme by selective phenotyping of the calibration set in a multi-trait context: Application to bread making quality. Theor. Appl. Genet. 2020, 133, 2197–2212. [Google Scholar] [CrossRef]
  29. Fradgley, N.S.; Bentley, A.R.; Gardner, K.A.; Swarbreck, S.M.; Kerton, M. Maintenance of UK bread baking quality: Trends in wheat quality traits over 50 years of breeding and potential for future application of genomic-assisted selection. Plant Genome 2023, 16, e20326. [Google Scholar] [CrossRef]
  30. Schmidt, M.; Kollers, S.; Maasberg-Prelle, A.; Großer, J.; Schinkel, B.; Tomerius, A.; Graner, A.; Korzun, V. Prediction of malting quality traits in barley based on genome-wide marker data to assess the potential of genomic selection. Theor. Appl. Genet. 2016, 129, 203–213. [Google Scholar] [CrossRef]
  31. Charmet, G.; Pin, P.A.; Schmitt, M.; Leroy, N.; Claustres, B.; Burt, C.; Genty, A. Genomic prediction of agronomic and malting quality traits in six-rowed winter barley. Euphytica 2023, 219, 63. [Google Scholar] [CrossRef]
  32. Montesinos-López, O.A.; Kismiantini; Montesinos-López, A. Two simple methods to improve the accuracy of the genomic selection methodology. BMC Genom. 2023, 24, 220. [Google Scholar] [CrossRef] [PubMed]
  33. González-Camacho, J.M.; Crossa, J.; Pérez-Rodríguez, P.; Ornella, L.; Gianola, D. Genome-enabled prediction using probabilistic neural network classifiers. BMC Genom. 2016, 17, 208. [Google Scholar] [CrossRef] [PubMed]
  34. Xu, Z.; Kurek, A.; Cannon, S.B.; Beavis, W.D. Predictions from algorithmic modeling result in better decisions than from data modeling for soybean iron deficiency chlorosis. PLoS ONE 2021, 16, e0240948. [Google Scholar] [CrossRef]
  35. Ornella, L.; Pérez, P.; Tapia, E.; González-Camacho, J.M.; Burgueño, J.; Zhang, X.; Singh, S.; Vicente, F.S.; Bonnett, D.; Dreisigacker, S.; et al. Genomic-enabled prediction with classification algorithms. Heredity 2014, 112, 616–626. [Google Scholar] [CrossRef]
  36. Tschermak, E. Ein Leben fuer die Zuechtung. Aus der Werkstatt eines alten Pflanzenzuechter. Odal 1941, 10, 768–777. [Google Scholar]
  37. Duvick, D.N. Theory, empiricism and intuition in professional plant breeding. In Farmers, Scientists and Plant Breeding: Integrating Knowledge and Practice; CABI Publishing: Wallingford, UK, 2002; pp. 189–211. [Google Scholar]
  38. Bueren, E.T.L.; Struik, P.C.; Tiemens-Hulscher, M.; Jacobsen, E. Concepts of Intrinsic Value and Integrity of Plants in Organic Plant Breeding and Propagation. Crop Sci. 2003, 43, 1922–1929. [Google Scholar] [CrossRef]
  39. Timmermann, M. The Breeder’s Eye—Theoretical Aspects of the Breeder’s Decision-Making. In Proceedings of the COST SUSVAR workshop on Cereal Crop Diversity: Implications for Production and Products; Ostergard, H., Fontaine, L., Eds.; ITAB: Paris, France, 2006; pp. 118–123. [Google Scholar]
  40. Bernardo, R. Reinventing quantitative genetics for plant breeding: Something old, something new, something borrowed, something BLUE. Heredity 2020, 125, 375–385. [Google Scholar] [CrossRef] [PubMed]
  41. Burke, R.; Felfernig, A.; Göker, M.H. Recommender Systems: An Overview. AI Mag. 2011, 32, 13–18. [Google Scholar] [CrossRef]
  42. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  43. Roy, D.; Dutta, M. A systematic review and research perspective on recommender systems. J. Big Data 2022, 9, 59. [Google Scholar] [CrossRef]
  44. Diversity Arrays Technology Pty Ltd. DArT P/L, 2020. Available online: https://www.diversityarrays.com/ (accessed on 15 May 2025).
  45. Stekhoven, D.J.; Bühlmann, P. Missforest-Non-parametric missing value imputation for mixed-type data. Bioinformatics 2012, 28, 112–118. [Google Scholar] [CrossRef] [PubMed]
  46. Michel, S.; Löschenberger, F.; Ametz, C.; Pachler, B.; Sparry, E.; Bürstmayr, H. Simultaneous selection for grain yield and protein content in genomics-assisted wheat breeding. Theor. Appl. Genet. 2019, 132, 1745–1760. [Google Scholar] [CrossRef]
  47. Michel, S.; Kummer, C.; Gallee, M.; Hellinger, J.; Ametz, C.; Akgöl, B.; Epure, D.; Löschenberger, F.; Buerstmayr, H. Improving the baking quality of bread wheat by genomic selection in early generations. Theor. Appl. Genet. 2018, 131, 477–493. [Google Scholar] [CrossRef]
  48. Moreno-Amores, J.; Michel, S.; Löschenberger, F.; Buerstmayr, H. Dissecting the Contribution of Environmental Influences, Plant Phenology, and Disease Resistance to Improving Genomic Predictions for Fusarium Head Blight Resistance in Wheat. Agronomy 2020, 10, 2008. [Google Scholar] [CrossRef]
  49. Pérez, P.; de los Campos, G. Genome-Wide Regression and Prediction with the BGLR Statistical Package. Genetics 2014, 198, 483–495. [Google Scholar] [CrossRef]
  50. R Core Team R: A Language and Environment for Statistical Computing 2022. Available online: https://www.r-project.org/ (accessed on 15 May 2025).
  51. Heuer, C.; Scheel, C.; Tetens, J.; Kühn, C.; Thaller, G. Genomic prediction of unordered categorical traits: An application to subpopulation assignment in German Warmblood horses. Genet. Sel. Evol. 2016, 48, 13. [Google Scholar] [CrossRef]
  52. Nakagawa, S.; Schielzeth, H. Repeatability for Gaussian and non-Gaussian data: A practical guide for biologists. Biol. Rev. 2010, 85, 935–956. [Google Scholar] [CrossRef]
  53. de Villemereuil, P.; Gimenez, O.; Doligez, B. Comparing parent–offspring regression with frequentist and Bayesian animal models to estimate heritability in wild populations: A simulation study for Gaussian and binary traits. Methods Ecol. Evol. 2013, 4, 260–275. [Google Scholar] [CrossRef]
  54. Tay, J.K.; Narasimhan, B.; Hastie, T. Elastic Net Regularization Paths for All Generalized Linear Models. J. Stat. Softw. 2023, 106, 1–31. [Google Scholar] [CrossRef]
  55. Günther, F.; Fritsch, S. Neuralnet: Training of Neural Networks. R J. 2010, 2, 30. [Google Scholar] [CrossRef]
  56. Liaw, A.; Wiener, M. Classification and Regression by randomForest. R News 2002, 2, 18–22. [Google Scholar]
  57. Lichtenthaler, U. Substitute or Synthesis: The Interplay between Human and Artificial Intelligence. Res. Technol. Manag. 2018, 61, 12–14. [Google Scholar] [CrossRef]
  58. Jiang, Y.; Li, X.; Luo, H.; Yin, S.; Kaynak, O. Quo vadis artificial intelligence? Discov. Artif. Intell. 2022, 2, 4. [Google Scholar] [CrossRef]
  59. Korteling, J.E.; van de Boer-Visschedijk, G.C.; Blankendaal, R.A.M.; Boonekamp, R.C.; Eikelboom, A.R. Human- versus Artificial Intelligence. Front. Artif. Intell. 2021, 4, 1–13. [Google Scholar] [CrossRef]
  60. Rapp, M.; Lein, V.; Lacoudre, F.; Lafferty, J.; Müller, E.; Vida, G.; Bozhanova, V.; Ibraliu, A.; Thorwarth, P.; Piepho, H.P.; et al. Simultaneous improvement of grain yield and protein content in durum wheat by different phenotypic indices and genomic selection. Theor. Appl. Genet. 2018, 131, 1315–1329. [Google Scholar] [CrossRef]
  61. Thorwarth, P.; Liu, G.; Ebmeyer, E.; Schacht, J.; Schachschneider, R.; Kazman, E.; Reif, J.C.; Würschum, T.; Longin, C.F.H. Dissecting the genetics underlying the relationship between protein content and grain yield in a large hybrid wheat population. Theor. Appl. Genet. 2019, 132, 489–500. [Google Scholar] [CrossRef] [PubMed]
  62. Bernardo, R. Retrospective Index Weights Used in Multiple Trait Selection in a Maize Breeding Program. Crop Sci. 1991, 31, 1174–1179. [Google Scholar] [CrossRef]
  63. Michel, S.; Löschenberger, F.; Ametz, C.; Pachler, B.; Sparry, E.; Bürstmayr, H. Combining grain yield, protein content and protein quality by multi-trait genomic selection in bread wheat. Theor. Appl. Genet. 2019, 132, 2767–2780. [Google Scholar] [CrossRef]
  64. Ceron-Rojas, J.J.; Crossa, J.; Arief, V.N.; Basford, K.; Rutkoski, J.; Jarquín, D.; Alvarado, G.; Beyene, Y.; Semagn, K.; DeLacy, I. A Genomic Selection Index Applied to Simulated and Real Data. G3 Genes|Genomes|Genet. 2015, 5, 2155–2164. [Google Scholar] [CrossRef]
  65. Schulthess, A.W.; Wang, Y.; Miedaner, T.; Wilde, P.; Reif, J.C.; Zhao, Y. Multiple-trait- and selection indices-genomic predictions for grain yield and protein content in rye for feeding purposes. Theor. Appl. Genet. 2016, 129, 273–287. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Number of non-selected versus selected lines of each cohort in 2015–2019, based on the breeder’s decisions in the respective preliminary yield trials on which lines were advanced to the first year of multi-environment trials.
Figure 2. Forward prediction schemes for assessing the precision of the predicted classification for the individual validation populations (VPs) sampled from the cohorts in 2017–2019 based on different combinations of training populations (TPs). In scheme (A), prediction models were fitted on training populations from the two cohorts preceding the validation cohort. In scheme (B), this set was additionally augmented by the training population sampled from the same cohort that was used for validation.
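For readers who prefer code to diagrams, the short R sketch below illustrates how the two schemes could be assembled. The function name build_training_sets, the data layout, and the sizes n_validation = 80 (mirroring the validation populations of 80 lines in Figure 4) and n_augment = 400 are illustrative assumptions and not taken from the study.

```r
# Hypothetical sketch of the two forward prediction schemes (A) and (B);
# 'lines' is assumed to be a data.frame with one row per breeding line and
# columns: year (cohort), selected (0/1 breeder's label), plus predictors.
build_training_sets <- function(lines, validation_year,
                                n_validation = 80, n_augment = 400) {
  cohort <- subset(lines, year == validation_year)
  vp_idx <- sample(nrow(cohort), n_validation)
  vp   <- cohort[vp_idx, ]                                  # pending selection candidates
  rest <- cohort[-vp_idx, ]
  tp_a <- subset(lines, year %in% (validation_year - 2:1))  # two preceding cohorts (scheme A)
  aug  <- rest[sample(nrow(rest), min(n_augment, nrow(rest))), ]
  tp_b <- rbind(tp_a, aug)                                   # scheme B: augmented by lines of the validation cohort
  list(VP = vp, TP_A = tp_a, TP_B = tp_b)
}

# e.g., sets <- build_training_sets(all_lines, validation_year = 2018)
```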
Figure 3. Schematic representation of the suggested reciprocal fine-tuning algorithm, in which the breeder actively provides feedback to the machine learning algorithms. The initial recommendations of selected (light green) and non-selected lines (light red), based on a training population of 800 labeled lines, are screened by the breeder, who either retains the labeling (dark green) or re-labels the recommended lines (dark red). The initial training population of 800 lines is subsequently augmented by these labeled and re-labeled lines to update the prediction models and screen the next batch of recommended lines among the set of not-yet-labeled crossing combinations. The recommendations in the subsequent iterations are likewise screened by the breeder, followed by further augmentation of the training population. The pool of chosen lines is enlarged in each iteration of the algorithm until the target number of lines in the pool is reached and thus the final selection decision is made.
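A compact R sketch of this loop is given below. It assumes a random forest recommender and represents the breeder’s interactive feedback by a placeholder function breeder_oracle; the function names, the batch size, and the size of the target pool are illustrative and not taken from the study.

```r
library(randomForest)

# Minimal sketch of the reciprocal fine-tuning loop; 'breeder_oracle' stands in
# for the breeder's confirmation or re-labelling of the recommended batch.
finetune_loop <- function(X_labeled, y_labeled, X_unlabeled, breeder_oracle,
                          batch_size = 50, target_pool = 120) {
  chosen <- character(0)
  while (length(chosen) < target_pool && nrow(X_unlabeled) > 0) {
    fit    <- randomForest(x = X_labeled, y = factor(y_labeled, levels = c(0, 1)))
    prob   <- predict(fit, X_unlabeled, type = "prob")[, "1"]
    batch  <- order(prob, decreasing = TRUE)[seq_len(min(batch_size, nrow(X_unlabeled)))]
    labels <- breeder_oracle(rownames(X_unlabeled)[batch])   # breeder keeps or re-labels
    chosen <- c(chosen, rownames(X_unlabeled)[batch][labels == 1])
    X_labeled   <- rbind(X_labeled, X_unlabeled[batch, ])    # augment the training population
    y_labeled   <- c(y_labeled, labels)
    X_unlabeled <- X_unlabeled[-batch, , drop = FALSE]
  }
  chosen
}
```

The rationale of the loop is that each confirmed or corrected recommendation immediately enlarges the training population, so later iterations should require fewer screened lines to identify the remaining target lines.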
Figure 4. Precision (%) and Matthews correlation coefficient (ϕ) for the prediction of breeders’ selection decisions across the validation populations of 80 lines from the individual cohorts of 2017–2019 based either on genome-wide distributed SNP markers (SNP marker-based recommender) or on genomic estimated breeding values for grain yield, protein content, protein yield, extensogram dough energy, and Fusarium head blight severity (Trait-based recommender). Training populations for fitting elastic nets (ENET), neural nets (NNET), the Bayesian logistic model (BAYES), and random forest (FOREST) were built either by solely using the two cohorts preceding the cohort that was used for validation (800 lines) or by including a proportion of lines from the same cohort that was used for validation (1200 lines). The horizontal dashed lines represent the expected precision and correlation coefficient of a random recommendation of lines for selection.
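Both evaluation metrics shown in the figure can be computed from the confusion matrix of predicted versus actual selection decisions. The short R helper below uses the standard definitions; the function name and the 0/1 coding are illustrative.

```r
# Precision and Matthews correlation coefficient (phi) from 0/1 vectors of
# observed and predicted selection decisions; standard textbook definitions.
classification_metrics <- function(observed, predicted) {
  tp <- sum(predicted == 1 & observed == 1)
  fp <- sum(predicted == 1 & observed == 0)
  tn <- sum(predicted == 0 & observed == 0)
  fn <- sum(predicted == 0 & observed == 1)
  precision <- tp / (tp + fp)
  phi <- (tp * tn - fp * fn) /
    sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  c(precision = precision, MCC = phi)
}
```

For a random recommendation, the expected precision equals the proportion of selected lines in the respective cohort and the expected ϕ is zero, which is what the dashed baselines in the figure represent.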
Figure 5. Percentage of chosen target lines and the total percentage of labeled lines among all selection candidates in each iteration of the suggested reciprocal fine-tuning algorithm. A random choice among the selection candidates (RANDOM) was compared with elastic nets (ENET), neural nets (NNET), the Bayesian logistic model (BAYES), and random forest (FOREST) models based on SNP array data or genomic estimated breeding values as predictors for recommending specific lines.
Table 1. Proportion of explained variance based either on SNP array data (ρ²_SNP) or on the genomic estimated breeding values of grain yield, protein content, protein yield, extensogram dough energy, and Fusarium head blight severity (ρ²_TRAIT) for the individual cohorts, as well as for the combination of all five cohorts in 2015–2019, for the selection decisions made by the breeder at the stage of preliminary yield trials on which lines were advanced to the first year of multi-environment trials.
Year        Lines   Non-Selected Lines   Selected Lines   ρ²_SNP   ρ²_TRAIT
2015          653                  495              158      0.77       0.33
2016          640                  513              127      0.57       0.32
2017          883                  770              113      0.53       0.24
2018          532                  454               78      0.55       0.22
2019         1966                 1743              223      0.54       0.34
2015–2019    4674                 3975              669      0.45       0.25
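One common way to express such a proportion of explained variance for a binary target, following the latent-scale approach for non-Gaussian traits [52,53], is

ρ² = σ²_g / (σ²_g + σ²_e), with σ²_e fixed at 1 for a probit link or π²/3 for a logit link,

where σ²_g denotes the variance captured by the SNP markers or by the genomic estimated breeding values used as predictors. Whether exactly this parameterisation underlies the values in Table 1 is an assumption of this illustration rather than a statement of the authors’ implementation.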
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
