Agronomic Evaluation of Bread Wheat Varieties from Participatory Breeding: A Combination of Performance and Robustness

Abstract: Participatory plant breeding (PPB) is based on the decentralization of selection in farmers' fields and their involvement in decision-making at all steps of the breeding scheme. Despite the evidence of its benefits for developing population varieties adapted to diversified and local practices and conditions, such as organic farming, PPB is still not widely used. There is a need to share more broadly how the different programs have overcome scientific, practical, and organizational issues and produced a large number of positive outcomes. Here, we report on a PPB program that started on bread wheat in France in 2006 and has achieved a range of outcomes, from the emergence of new organization among actors, to specific experimental designs and statistical methods, and to population varieties developed and cultivated by farmers. We present the results of a two-year agronomic evaluation of the first population varieties developed within this PPB program compared to two commercial varieties currently grown in organic agriculture. We found that several PPB varieties were of great agronomic interest, combining relatively good performance even under the most favorable conditions of organic agriculture and good robustness, i.e., the ability to maintain productivity under more constraining conditions. The PPB varieties also tended to show good temporal dynamic stability and appeared promising for the farmers involved.
Table 1 (fragment): [...] populations derived from crosses, and more recent varieties; Japhabelle (JFB, Lot-et-Garonne): mix of around 25 populations derived from crosses and selected on farm; Renan (INRA): line, 1989.


Introduction
Participatory plant breeding (PPB) is based on the decentralization of selection in farmers' fields and their involvement in decision making at all steps of the breeding scheme. It allows the development of varieties that may be finely adapted to local pedo-climatic conditions, to farmers' agronomic practices, and to the type of products and marketing. Decentralization of evaluation and selection is most critical when genotype by environment (G × E) interactions are important when growing genotypes in farmers' fields compared to in experimental stations [1][2][3].
PPB has been used worldwide (in 10 developed and 59 developing countries) with 47 different crops, as reviewed by Ceccarelli and Grando [4]. While PPB was initially mostly reported in developing countries, with many programs conducted by CGIAR (Consultative Group on International Agricultural Research) centers, the contribution of other research institutes from developed countries to publications on PPB increased after the mid-2000s and remained higher afterwards (see Figure 2 in Ceccarelli and Grando [4]). However, despite the evidence of its benefits, institutional plant breeding is still centralized and non-participatory. Several articles have sought to address this lack of widespread use in developing countries by detailing the outcomes of PPB, such as the cases of maize (Zea mays L.) and rice (Oryza sativa L.) in India, which led to the official registration of one maize variety and two rice varieties [5]. PPB's benefits are also expected to be high in countries where industrialized agriculture has become the dominant model, as it can be a way to finely target the diversified and specific local practices and conditions that arise under organic or other agroecological farming practices, and therefore support the agroecological transition of agriculture. For PPB to be more widely used under these conditions, there are several scientific, practical, and organizational issues, but also a need to share more broadly how the different programs have overcome these problems and produced a large number of diverse positive outcomes.
Indeed, on-farm evaluation and selection of heterogeneous populations lead to methodological challenges in the set-up of adapted experimental designs and in the analyses of data, because each on-farm trial is often of small size, with a limited number of plots, and a low level of replication [6,7]. Several PPB programs have therefore developed or adapted original designs and methods that allow farmers to compare several populations on their farms, to select the most adapted and to pursue the process over several years [6][7][8][9]. To facilitate the implementation of experimental designs suitable for on farm evaluation and selection and the analysis of data generated by PPB programs, an R package, PPBstats, is under development to implement many of these statistical analyses for agronomic and molecular data, organoleptic tests, and seed circulation networks [10].
Moreover, most often in such PPB programs, the varieties developed are population varieties, i.e., genetically heterogeneous varieties derived from one or several crosses, or from the mixture of crosses and/or of landraces, where diversity has been maintained at a certain level determined by farmers' selection practices [11]. This raises questions on the way to manage crop diversity over time. Indeed, a balance must be found between mass selection between plants within the population in order to improve certain traits, individual vigor, or performance, and the maintenance of genetic diversity to allow a response to longer-term selection and further adaptation.
Few publications report on all aspects of PPB programs, from the emergence of new organization among actors, to specific experimental designs and statistical methods developed, and to populations or varieties developed and cultivated by farmers in their fields for production. Such a PPB approach has been applied to bread wheat in France since 2006 in a partnership between the research team DEAP (Diversity, Evolution, and Adaptation of Population) at the Quantitative Genetics and Evolution Lab (GQE-Le Moulon) in France and groups of the farmers' organization Réseau Semences Paysannes (RSP) [12,13]. In this program, specific protocols, experimental designs, and statistical methods have been developed for on-farm trials [6,7]. Moreover, a collective organization among farmers, facilitators, and the research team has been set up [13]. Both the relatively long-term character of this program (2006 to 2019) and the relatively large number of farmers involved (from 1 in 2006 to around 80 in 2018) make it a case of interest for assessing the potential value of PPB to develop new and original population varieties of interest to the farmers involved. Here, we present results of a two-year agronomic evaluation of the first population varieties that have been developed within this PPB program.

Description of the PPB Populations and Commercial Varieties Studied
Ten population varieties developed within the French bread wheat PPB program (hereafter called PPB varieties) were proposed by five farmers involved in the project. They cover a wide range of the possible types of population varieties that are usually derived from PPB: a landrace grown and selected on farm for several years, a mass selection of one particular plant within a landrace, a single-cross derived (not fixed) population, a mix of several landraces, a mix of several (up to 20) single-cross populations, and a mix of both landraces and single-cross populations (Table 1). In addition, two commercial French varieties (Renan, widely used in France by organic farmers, and Hendrix, more recently released and bred for organic agriculture) were used as references to represent the classical pure line varieties. In the following, both PPB populations and commercial varieties will be referred to as varieties.
[...] but varied depending on the local usual practices of the farmers (input of organic manure or not, preceding crop, sowing date, harrowing or manual weeding, or no intervention). Soil fertility and quality varied drastically among farms, with some very superficial soils and some deeper and more fertile ones. Climates were also contrasting, with one farm located in a dry and hot area in the south of France (FRC) while others in the Alps (CHD, RAB) were very rainy, with quite cold temperatures in winter (Table 2). Finally, farm size ranged from 20-25 ha for the FRC farm, to 45-65 ha for the FLM, CHD, and JFB farms, and 70-90 ha for the RAB and JSG farms.
The traits to be measured were chosen in consultation between farmers, facilitators, and researchers. Grain yield (GY, qx/ha), thousand kernel weight (TKW, g), and protein content (PC, %) were measured at the plot level, while plant height (PH, mm), distance between the last leaf and spike (LLSD, mm), spike weight (SW, g), number of spikelets per spike (NSPK), number of sterile spikelets per spike (NSPK_st), spike length (SL, mm), awness (AW, semi-quantitative scale ranging from 0 to 20), spike color (color, semi-quantitative scale ranging from 0 to 20), and spike curve (curve, semi-quantitative scale ranging from 0 to 20) were measured on individual plants (25 plants/plot).

Statistical Analyses
Data were analyzed using an AMMI (additive main effects and multiplicative interaction) model, following Gauch [14], with the R package PPBstats. In the first step, an ANOVA model (called model 1) was run with population, farm, and year as main effects, block within farm and year as a nested effect, and all three second-order interaction effects among population, farm, and year:
$$Y_{ijkl} = \mu + \alpha_i + \theta_j + \beta_l + (\alpha\theta)_{ij} + (\alpha\beta)_{il} + (\theta\beta)_{jl} + \mathrm{rep}_k(\theta\beta)_{jl} + \varepsilon_{ijkl}$$
where $Y_{ijkl}$ is the phenotypic value for replication $k$, population $i$, farm $j$, and year $l$, $\mu$ is the general mean, $\alpha_i$ is the effect of variety $i$, $\theta_j$ is the effect of farm $j$, $\beta_l$ is the effect of year $l$, $(\alpha\theta)_{ij}$ is the variety × farm interaction effect, $(\alpha\beta)_{il}$ is the variety × year interaction effect, $(\theta\beta)_{jl}$ is the farm × year interaction effect, $\mathrm{rep}_k(\theta\beta)_{jl}$ is the effect of replication $k$ nested in farm $j$ in year $l$, and $\varepsilon_{ijkl}$ is the residual.
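As a rough illustration of how the main effects of model 1 are defined, the following Python sketch estimates the general mean and the variety, farm, and year effects as deviations of marginal means from the grand mean. It assumes a balanced design with one mean per variety × farm × year cell; the function name `main_effects` and the toy values are hypothetical, and the analyses reported in this paper were run with the R package PPBstats, not with this code.

```python
from itertools import product
from statistics import mean


def main_effects(y):
    """Estimate mu and sum-to-zero main effects from balanced cell means.

    y maps (variety, farm, year) -> mean phenotypic value of that cell.
    """
    varieties = sorted({k[0] for k in y})
    farms = sorted({k[1] for k in y})
    years = sorted({k[2] for k in y})
    mu = mean(list(y.values()))
    # Each effect is the marginal mean of a level minus the grand mean.
    alpha = {v: mean(y[v, f, t] for f, t in product(farms, years)) - mu
             for v in varieties}
    theta = {f: mean(y[v, f, t] for v, t in product(varieties, years)) - mu
             for f in farms}
    beta = {t: mean(y[v, f, t] for v, f in product(varieties, farms)) - mu
            for t in years}
    return mu, alpha, theta, beta


# Toy, purely additive data (made-up values, not taken from the study).
true_alpha = {"V1": -2.0, "V2": 0.5, "V3": 1.5}
true_theta = {"F1": -5.0, "F2": 5.0}
true_beta = {2014: 1.0, 2015: -1.0}
y = {(v, f, t): 40.0 + a + th + b
     for (v, a), (f, th), (t, b) in product(
         true_alpha.items(), true_theta.items(), true_beta.items())}

mu, alpha, theta, beta = main_effects(y)
print(mu, alpha, theta, beta)
```

With purely additive toy data the recovered effects match the values used to build the table; in real data, the interaction and residual terms of model 1 absorb what the additive part cannot explain.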
In the second step, a principal component analysis (PCA) was run on the variety × farm interaction term, as follows:
$$(\alpha\theta)_{ij} = \sum_{n=1}^{N} \lambda_n \gamma_{in} \omega_{jn}$$
which can also be written as:
$$Y_{ijkl} = \mu + \alpha_i + \theta_j + \beta_l + \sum_{n=1}^{N} \lambda_n \gamma_{in} \omega_{jn} + (\alpha\beta)_{il} + (\theta\beta)_{jl} + \mathrm{rep}_k(\theta\beta)_{jl} + \varepsilon_{ijkl}$$
where $N$ is the number of dimensions (PCA components), which has a maximum value of the number of farms, $\lambda_n$ is the eigenvalue for component $n$, $\gamma_{in}$ is the eigenvector for population $i$ and component $n$, and $\omega_{jn}$ is the eigenvector for farm $j$ and component $n$. The data were double centered on farm and population means. In this analysis, the PCA studied the structure of the interaction matrix, with the farms as the variables and the populations as the individuals. On this matrix, Wricke's ecovalences [15] were estimated for each individual. This parameter provided indications on the contribution of each variety to the interaction term and therefore on its stability on the different farms in relation to the productivity potential of each farm. It is generally described as a dynamic stability indicator [16]. The PCA was also applied symmetrically to the variety × year interaction, and the "temporal" ecovalence was calculated for each variety.
Sustainability 2020, 12, 128
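To make the ecovalence concrete, here is a minimal Python sketch that double-centers a variety × farm table of means and sums the squared interaction residuals per variety, which is the usual definition of Wricke's ecovalence. The function names and the toy yields are hypothetical, and the multiplicative terms of the AMMI model (which would come from a singular value decomposition of the same double-centered matrix) are deliberately left out.

```python
def interaction_matrix(cell_means):
    """Double-center a variety x farm table of means.

    cell_means[i][j] is the mean of variety i on farm j. The result is the
    interaction residual: value - variety mean - farm mean + grand mean.
    """
    n_var, n_farm = len(cell_means), len(cell_means[0])
    grand = sum(map(sum, cell_means)) / (n_var * n_farm)
    row_means = [sum(r) / n_farm for r in cell_means]
    col_means = [sum(r[j] for r in cell_means) / n_var for j in range(n_farm)]
    return [[cell_means[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_farm)]
            for i in range(n_var)]


def wricke_ecovalence(cell_means):
    """Sum of squared interaction residuals for each variety."""
    return [sum(x * x for x in row) for row in interaction_matrix(cell_means)]


# Made-up grain yields (qx/ha) on a low-potential and a high-potential farm.
gy = [
    [30, 50],  # very responsive variety
    [38, 42],  # nearly flat variety
    [34, 46],  # variety that follows the average farm response exactly
]
print(wricke_ecovalence(gy))  # [32.0, 32.0, 0.0]
```

The toy output shows why the ecovalence is called a dynamic stability indicator: the variety that tracks the average response of all varieties across farms has an ecovalence of zero, whereas the nearly flat variety deviates from that average response as much as the very responsive one does.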
The ANOVA was also run on the set of populations restricted to the PPB populations, i.e., excluding the two commercial varieties Renan and Hendrix, in order to evaluate whether their particular variability did influence the relative importance of the different effects and interactions. This will be referred to as dataset b while the complete dataset will be referred to as a.
Finally, the results of the statistical analyses were discussed among all the actors involved, i.e., the farmers, facilitators, technicians, and researchers, in order to obtain a common understanding based on the academic knowledge and on the more experiential knowledge shared within the group.

Effects and Interactions in the ANOVA
Results from the analysis of variance with model 1 on dataset a showed that for all traits measured on individual plants (i.e., all traits except TKW, GY, and PC), all effects and interaction effects, except the block effect, were highly significant. This was also the case for TKW, GY, and PC, except for the variety × year interaction for GY (Table S1). However, variance components varied strongly in magnitude according to the trait considered (Figure 1). First, it should be noted that for all traits except plant height (PH), thousand kernel weight (TKW), grain yield (GY), and protein content (PC), although most effects in the model were significant, the residual part of the variation remained quite high. This was because the within-population, among-plant variation was large for most populations, owing to both genetic and environmental heterogeneity. This could not be observed for TKW, GY, and PC, as only a global measure at the plot level was available, and it was not true for PH, as differences among farms and/or varieties appeared to be relatively much larger. The amount of variance due to the variety effect was large for PH, LLSD, and spike traits such as awness and color, moderate for TKW, and small for the rest of the traits, while the amount of variation due to the farm effect was large for GY and TKW and moderate for PH, LLSD, and NSPK, and the amount due to the year effect was large mainly for PC. When looking at the differences among farms, a clear pattern of responses from the varieties emerged (data not shown). The FRC farm led to significantly shorter plants, lower grain yields, shorter and lighter spikes with fewer spikelets per spike, and smaller thousand kernel weight, which reflects the stressful conditions encountered on the farm (high temperature and severe drought in spring (Table 2), shallow and poor soil).
The FLM farm also led to significantly lower grain yields, shorter spikes with fewer spikelets, and a lower protein content, indicating limiting conditions (in particular, low nitrogen availability) but a less stressful climate (Table 2). The CHD farm led to significantly shorter plants and to intermediate values for the other traits, but to the largest thousand kernel weight and highest protein content. The JFB farm led to intermediate values for all traits except that spikes were significantly heavier and protein content lower. Finally, the RAB and JSG farms led to significantly taller plants, higher grain yields, and longer spikes, while only the RAB farm led to heavier spikes. These results are consistent with the good soil quality and fertility and with the rather favorable, although different, climate conditions on these two farms (Table 2).
In general, the variety × year contribution to variation was very low, while variety × farm was larger, in particular for TKW, GY, and PC, indicating that genetic variability among populations, although limited, was rather more specific to the farm than to the year, which makes it available for local selection. When removing the two commercial varieties (model 1 on dataset b, Figure 1), the relative contributions of the different effects and interactions to the overall variation changed drastically for PH and LLSD, with the variety effect much reduced due to the fact that the commercial varieties were much shorter than all PPB populations. For the other traits, there was, at most, a marginal decrease in the variety contribution. In addition, it can be noted that for PC, which was quite sensitive to the year effect, removing the two commercial varieties slightly reduced the year and year × variety contribution, indicating that the two commercial varieties might be even more sensitive to these effects.
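The effect of dropping the two commercial varieties on the relative contributions can be sketched as follows. This Python example expresses each term as a share of the total sum of squares in a balanced variety × farm table and compares a full table (in the spirit of dataset a) with the same table restricted to PPB-like rows (dataset b). The share-of-sum-of-squares measure is an assumption made for illustration, since the exact computation behind Figure 1 is not restated here; the function name `ss_shares` and all plant heights are made up.

```python
def ss_shares(table):
    """Share of the total sum of squares due to variety, farm, and interaction.

    table[i][j] is the mean of variety i on farm j (balanced, one value per cell).
    """
    n_var, n_farm = len(table), len(table[0])
    grand = sum(map(sum, table)) / (n_var * n_farm)
    row_means = [sum(r) / n_farm for r in table]
    col_means = [sum(r[j] for r in table) / n_var for j in range(n_farm)]
    ss_variety = n_farm * sum((m - grand) ** 2 for m in row_means)
    ss_farm = n_var * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in table for x in r)
    ss = {
        "variety": ss_variety,
        "farm": ss_farm,
        "variety_x_farm": ss_total - ss_variety - ss_farm,
    }
    return {term: value / ss_total for term, value in ss.items()}


# Made-up plant heights (mm) on two farms; the "commercial" rows mimic short lines.
heights = {
    "commercial_1": [800, 900],
    "commercial_2": [780, 880],
    "ppb_1": [1200, 1300],
    "ppb_2": [1250, 1350],
    "ppb_3": [1220, 1330],
}
dataset_a = list(heights.values())
dataset_b = [v for name, v in heights.items() if not name.startswith("commercial")]

shares_a = ss_shares(dataset_a)
shares_b = ss_shares(dataset_b)
print(round(shares_a["variety"], 3), round(shares_b["variety"], 3))
```

In this toy table, most of the variety share in the full dataset comes from the height gap between the two groups, so it shrinks sharply once the short rows are removed, which is the pattern described above for PH and LLSD.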

Comparison of PPB Populations and Commercial Varieties over Farms and Years
Only for some morphological traits, such as PH and LLSD, were the responses of populations over farms and years quite parallel due to the low population × farm interaction. For the other traits, in particular for those related to grain yield and quality, the ranking changed a lot from one farm to the other (Figure 2). For SW, TKW, and PC, the interaction between variety and farm seemed more important in 2014 than in 2015 and, in general, differences among farms were larger in 2014 than in 2015.
Except for PH and LLSD, for which commercial varieties exhibited shorter straw than PPB varieties, there was no contrasting pattern between PPB varieties and the commercial ones (Figure 2). For GY, depending on the farm and on the year, Renan, Hendrix, or some PPB varieties performed best. Each year, Renan had the largest grain yield on two farms, Hendrix had the largest yield on one farm, and PPB varieties did better on the three others (though not on the same farms in both years). Renan and Rouge du Roc had consistently large kernels (TKW) over farms and years, while Hendrix and Saint-Priest had small kernels, in general. For protein content, some PPB varieties, such as Rouge du Roc and Saint-Priest, had the highest level, while the commercial varieties were more unstable over years and farms. In particular, in 2015, the general level of PC was very low (below 12%) and the two commercial varieties showed drastically reduced levels of protein (below 9%).

Overall Variety Means
Only for plant height and LLSD were both commercial varieties significantly different from all PPB varieties (Figure 3), reflecting the strong selection for shorter straws and peduncles that has taken place in modern plant breeding. The two commercial varieties also tended to have shorter spikes (significant only for Hendrix) (Table S2). In general, PPB varieties tended to have rather long spikes (e.g., Rouge-du-Roc, Pop-Dynamique-2, and Savoysone), with a larger number of spikelets (e.g., Japhabelle, Pop-Dynamique-2, and Saint-Priest), and sometimes also more sterile spikelets (e.g., Japhabelle and Mélange-1-13-Pops) (Table S2). However, the PPB variety Savoysone had long spikes with fewer spikelets but also far fewer sterile spikelets per spike. In addition, PPB varieties tended to have more colored spikes while the two commercial varieties had white spikes. Based on these results, a morphological trait syndrome could be identified that separates commercial varieties on one side from PPB varieties on the other. For the other traits, the picture was much less clear-cut. For thousand kernel weight, the two commercial varieties had opposite behaviors, with Renan showing the largest seeds, while Hendrix had among the smallest.
When comparing the overall grain yield per variety, only two PPB varieties were significantly less productive than the two commercial varieties; the eight others did not differ significantly (Figure 3). This is probably due to the high variability of the yield value over farms and years for each variety, associated with a large variety × farm interaction. Hendrix had the lowest protein content (PC), although it was significantly different from only five PPB varieties, while Renan had an average PC value which was significantly lower than only two PPB varieties, Rouge du Roc and Saint-Priest. Rouge du Roc was significantly higher in protein than all varieties except Saint-Priest, which may be related to its low level of GY (Figure 3), although this was still true for protein yield (i.e., PC × GY, data not shown).

Contribution of the Varieties to the Interaction
While the first PCA axes were able to explain large parts of the model variability (around 75%), the projection of the varieties and farms did not provide additional information beyond that obtained by analyzing the interaction matrices and Wricke's ecovalences [15]. Therefore, they are not presented below. Figure 4 represents the matrices of variety × farm and variety × year interaction terms for PC and GY, together with Wricke's variety "spatial" and "temporal" ecovalences. Wricke's "spatial" (respectively "temporal") ecovalence is a dynamic stability indicator [16] which quantifies the way varieties deviate from the value that is predicted in each farm (respectively each year) based on the additive model for variety and farm (respectively year) effect. This is also their contribution to the variety × farm (respectively variety × year) interaction.

At the farm level, for the protein content, three PPB varieties (Rocaloex, Mélange du Sud-ouest, and Mélange1-13-Pops) had the lowest spatial ecovalences, while two others (Saint-Priest and Dauphibois) had the highest, with Renan and Hendrix having intermediate values (Figure 4a). Some PPB varieties showed a large positive interaction contribution (blue square) on some farms, such as Dauphibois in the RAB farm or Saint-Priest in the JSG farm, possibly indicating that they are better able to use the nitrogen available in these environments.
Renan and Hendrix showed similar patterns of responses to the six farms for GY (but not for PC), for which they tended to benefit more from favorable conditions at the RAB and JSG farms, but were more penalized in difficult conditions at the CHD farm. No particular pattern could be observed for the PPB varieties.
At the temporal level, for PC, Renan and Hendrix had the largest temporal ecovalences, showing similar patterns of response, while the two PPB varieties Rouge du Roc and Saint-Priest had very low values. For GY, Renan and Mélange1-5 Pops had the largest over-year ecovalence values. The two commercial varieties thus tended to be more responsive over farms for GY and over years for PC; Renan alone was also more responsive over years for GY.

The Whole Range of Outcomes of PPB Programs
Experimental demonstrations of the ability of PPB programs to produce new interesting population varieties with performances corresponding to farmers' expectations are still scarce and there is a need to accumulate more results obtained from medium- to long-term PPB programs covering a wide range of crop species, environmental and social contexts, and methods used. In order to scale up the decentralized participatory approach in a greater number of plant breeding programs, as these approaches are particularly adapted to support seed and food sovereignty and agroecological transition [17][18][19], it is also very important to show the whole range of outcomes that can be obtained with PPB: new experimental and statistical methods for on-farm trials, new collective organizations, farmers' empowerment, new population varieties used by farmers in production, the dynamic management of crop diversity over time, and changes in the seed regulation to better include PPB [20]. This is the case of the bread wheat PPB program that was studied here. In previous studies, we showed that on-farm evaluation and selection of populations can be organized through specific experimental designs that have been adapted to fit a large number of small trials with limited replication of entries within and among trials [6,7]. This is very flexible and convenient for the farmers, as only a small number of common controls are replicated within and among farms, and the rest of the entries are populations chosen by each farmer on their farm. The comparison of populations within environments and the estimation of overall population effects, environment effects, and sensitivity of populations over environments can be obtained through a hierarchical Bayesian model adapted for the purpose [6,7].
PPB is also a learning process, where the organization of tasks and the roles of actors can evolve over time to better respond to the objectives of both farmers (and the other actors involved) and researchers. It took some years for this bread wheat PPB project to develop a collective organization that is efficient [12,13], but it can be used as a source of inspiration when starting a new PPB project. Finally, we also showed that farmers' mass selection within heterogeneous populations could be efficient, in particular, if it is associated with a scientific assessment of selection response in order to provide farmers with information on the impact of their practices [12][13][14][15][16][17][18][19][20][21]. Farmers' mass selection and choosing the parents of crosses are two key steps that illustrate farmers' empowerment in crop diversity management.
Characterizing the agronomic performance, under organic farming conditions, of the first PPB population varieties derived from this project in comparison with two commercial varieties provides complementary evidence of the concrete value of the approach for farmers.

Agronomic Performance and Robustness of PPB Varieties Compared to the Commercial Varieties
Several of the wheat population varieties derived from PPB proved to be of great agronomic interest, as they combine relatively good performance even under the most favorable conditions of organic agriculture, where commercial varieties were highly productive, and good robustness, i.e., the ability to maintain productivity under more constraining conditions. In particular, some PPB varieties offered good compromises between grain production and protein content, but also in straw biomass and weed competition, due to their taller straw. Although this was not recorded here, some PPB varieties had improved lodging resistance compared to the landraces used as the genetic basis for the breeding project. Such plant types, with tall straw but high lodging resistance, are highly sought after by farmers growing wheat under organic farming conditions, as they provide good weed competitiveness, contribute to soil fertility maintenance, and can be used for livestock. Farmers select tall plants with long peduncles (large LLSD here) because the long distance between the flag leaf and the base of the spike is known to be important to escape disease and to provide favorable micro-climate conditions [22], and it is also thought to promote better storage and late transfer of carbohydrates to spikes under drought conditions, although it is difficult to find evidence of this in the literature. Figure 3 shows that PPB varieties have average performances as good as (for grain yield) or even better than (for protein content) those of the two commercial varieties. Although overall variety means (estimated from the measurements made on the six farms over the two years) are not a relevant criterion for judging the interest of the PPB varieties, since these are developed by farmers to fit their own environmental conditions and farming practices, they are worth considering, as they are a classical indicator in variety evaluation trials.
Such good grain yield performances of PPB varieties compared to commercial varieties have also been found for evolutionary participatory barley populations evaluated in Italy under low-input and organic farming conditions [23].
Moreover, for these two important traits, grain yield and protein content, the two commercial varieties appeared somewhat unstable from a dynamic point of view (high Wricke's ecovalence), considering the response over years, and to a lesser extent over farms. This may seem surprising for Renan, as this variety has been continuously used by organic farmers since the time of its registration (1989), based on its good rusticity. It may be that the variety found its limits in this experiment because of the particularly contrasting and constraining pedo-climatic conditions and farming practices on these six farms and during those two seasons.
In contrast, several PPB varieties, such as Savoysone and Japhabelle, appeared much less responsive to moving from high potential farms to lower potential ones, or to year-to-year variation, while showing satisfactory performance. These findings are very much in line with the results of the evolutionary participatory barley breeding experiment where populations were as productive and had a higher dynamic stability over years and environments than commercial varieties [23]. This may be due to their intrinsic genetic and phenotypic variability, which provides a more stable performance whatever the conditions, but for the wheat PPB varieties this remains to be proven. Indeed, while for protein content, the four varieties which were most stable over farms were mixtures of landraces or of populations derived from crosses, i.e., varieties that are expected to be among the most diverse due to their creation and selection process (Figure 4, Table 1), it was not so clear-cut for temporal stability, and for grain yield, there was no particular relationship between the expected diversity due to the creation and selection process and the stability over farms and years. Nevertheless, it should be noted that while stability over time is of major interest for the farmers, stability over farms is, in general, not particularly desirable, because the principle of PPB is to select each population within a particular farm to target adaptation to the local conditions. Moreover, the dynamic stability over time as it was estimated here might not be the most relevant indicator for farmers, as their objective is to minimize crop failure. Another indicator of temporal within-farm stability, the coefficient of variation, was estimated in the companion paper of van Frank et al. (this issue) and results obtained with this static stability indicator confirmed our findings.
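For readers unfamiliar with the static indicator mentioned above, the coefficient of variation of a variety on a given farm is simply the standard deviation of its values over years divided by their mean. The Python sketch below uses made-up yields for two hypothetical varieties; the exact computation used in the companion paper is not described here, so the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return stdev(values) / mean(values)


# Made-up grain yields (qx/ha) of two hypothetical varieties on one farm over two years.
yields_by_variety = {
    "variety_A": [30.0, 50.0],  # large swing between years
    "variety_B": [38.0, 42.0],  # small swing between years
}
cv = {name: coefficient_of_variation(v) for name, v in yields_by_variety.items()}
for name, value in cv.items():
    print(f"{name}: CV = {value:.3f}")
```

Unlike the ecovalence, this indicator rewards a variety whose own values change little from year to year, regardless of how the other varieties respond, which is closer to a farmer's concern with avoiding crop failure.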
These results are consistent with the finding that increased stability and resilience are provided by within-field crop genetic diversity, which has been described in the case of mixtures of varieties, landraces, and composite cross populations [23][24][25][26]. The good performance of some of these wheat PPB varieties observed on farms with more limiting conditions may also be related to the efficiency of the participatory approach in identifying and selecting plants and varieties adapted to these more irregular and difficult conditions.
In general, the variety × year contribution to variation was very low, while the variety × farm contribution was larger, in particular for TKW, GY, and PC, indicating that genetic variability among populations specific to the farms could be available for local selection. The experimental design was not intended to study local adaptation: the 10 PPB varieties were grown on the six farms over only two consecutive years, so we could not expect to observe local adaptation at such a short time scale. However, the relative magnitude of the variety × farm and variety × year interactions gave us an indication of the stability of the varieties' behavior over time compared to their stability from farm to farm.
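To make the comparison of the two interaction terms concrete, the sketch below computes the share of the total sum of squares attributable to variety × farm and to variety × year in a balanced variety × farm × year table with one value per cell. This is the standard balanced-design decomposition and is given only as an illustration: it is an assumption that it matches "contribution to variation" as used here, the function name and data are hypothetical, and it does not reproduce the ANOVA model fitted in this study.

```python
def interaction_shares(cells):
    """Share of total sum of squares due to two-way interactions.

    cells[v][f][y]: trait value for variety v, farm f, year y.
    Assumes a balanced table with no missing cells and a non-zero
    total sum of squares.
    """
    nv, nf, ny = len(cells), len(cells[0]), len(cells[0][0])
    all_values = [x for v in cells for f in v for x in f]
    grand = sum(all_values) / len(all_values)

    # Marginal means for each factor
    m_v = [sum(x for f in cells[v] for x in f) / (nf * ny) for v in range(nv)]
    m_f = [sum(cells[v][f][y] for v in range(nv) for y in range(ny)) / (nv * ny)
           for f in range(nf)]
    m_y = [sum(cells[v][f][y] for v in range(nv) for f in range(nf)) / (nv * nf)
           for y in range(ny)]

    # Variety x farm: variety-by-farm means averaged over years
    ss_vf = ny * sum(
        (sum(cells[v][f]) / ny - m_v[v] - m_f[f] + grand) ** 2
        for v in range(nv) for f in range(nf)
    )
    # Variety x year: variety-by-year means averaged over farms
    ss_vy = nf * sum(
        (sum(cells[v][f][y] for f in range(nf)) / nf - m_v[v] - m_y[y] + grand) ** 2
        for v in range(nv) for y in range(ny)
    )
    ss_total = sum((x - grand) ** 2 for x in all_values)
    return {"variety_x_farm": ss_vf / ss_total, "variety_x_year": ss_vy / ss_total}


# Hypothetical illustration: two varieties swap ranks between two farms,
# identically in both years, so all variation is variety x farm.
example = [[[5.0, 5.0], [3.0, 3.0]],
           [[3.0, 3.0], [5.0, 5.0]]]
print(interaction_shares(example))
```

A large variety × farm share combined with a small variety × year share corresponds to the pattern described above: rankings that differ between farms but remain consistent from one year to the next.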
While the benefits of farmers' and actors' participation have been acknowledged in numerous PPB experiments (e.g., [17,27]), they have also recently been recognized in a context somewhat different from PPB by van Etten et al. [28]. The authors showed that farmers' evaluation of varieties could generate additional insights into variety adaptation and recommendation in the context of climate change.
Overall, these results, obtained after fewer than 10 years of on-farm participatory breeding, seemed promising to the farmers involved in the PPB process and attractive to new actors not already involved. In this experiment, the agronomic results were also complemented with nutritional and sensory characterization of the varieties, which is of primary importance for the farmers [29]. The most comprehensive possible characterization of these varieties, including their agronomic behavior, bread-making, nutritional, and organoleptic qualities, as well as their level of diversity and stability over time, will be critical to better communicate the benefits of the PPB approach.

Conclusions
We are aware that comparing 10 population varieties derived from PPB with only two commercial varieties does not make it possible to draw general conclusions about the commercial varieties. However, we think the design is relevant to describe the potential of these PPB varieties, as the two commercial varieties (Renan and Hendrix) were chosen to represent both a variety widely used and appreciated by organic farmers and a new one selected in the recently implemented INRA (Institut National de la Recherche Agronomique) breeding program for organic farming. In France, there is still a critical lack of varieties adapted to organic agriculture due to a lack of investment by breeders in this sector. PPB could be a good complement by making it possible to develop a much wider spectrum of varieties with a range of performance under the different organic conditions and practices and a generally good robustness under constraining conditions.

Supplementary Materials: The following are available online at http://www.mdpi.com/2071-1050/12/1/128/s1, Table S1: ANOVA table with model 1 on dataset a (PPB+CV) and on dataset b (PPB) for all traits studied.