Prediction of Winter Wheat Cultivar Performance Using Mixed Models and Environmental Mean Regression from Multi-Environment Trials for Cultivar Recommendation to Reduce Yield Gap in Poland

Iwańska, Marzena; Paderewski, Jakub; Stępień, Michał

doi:10.3390/agronomy15102309

by Marzena Iwańska ¹, Jakub Paderewski ¹ and Michał Stępień ²,*

¹ Institute of Agriculture, Warsaw University of Life Science, 02-787 Warszawa, Poland

² Independent Researcher, 02-760 Warsaw, Poland

* Author to whom correspondence should be addressed.

Agronomy 2025, 15(10), 2309; https://doi.org/10.3390/agronomy15102309

Submission received: 25 August 2025 / Revised: 25 September 2025 / Accepted: 29 September 2025 / Published: 30 September 2025

(This article belongs to the Special Issue The Revision of Production Potentials and Yield Gaps in Field Crops)

Abstract

Accurate prediction of cultivar performance across diverse environments is crucial for breeding and recommendation systems, helping to reduce the yield gap, the difference between potential and actual yields, which is often widened by poor cultivar selection. This study assessed the adaptability of winter wheat (Triticum aestivum L.) cultivars using a linear mixed-model framework combined with environmental mean regression. The model was trained on yield data from 19 locations over nine years (2015–2023) and validated independently using 2024 data. To ensure robustness, outliers were removed and cultivars with fewer than 30 observations excluded. The model accounted for genotype-by-environment (G×E) interactions and produced adjusted means for each location–year–management combination. These were used in cultivar-specific regressions to estimate yield response across environments. The approach showed strong predictive performance, with a Pearson correlation of 0.958 between predicted and observed yields in the validation year. Results highlight the model’s potential to inform cultivar recommendations, including for less-tested cultivars. This framework offers a practical tool for data-driven decision-making in plant breeding and agronomy, especially under variable growing conditions.

Keywords: cultivar adaptation; environmental regression; G×E interaction; mixed models; statistical analysis; winter wheat; yield stability

1. Introduction

The main purpose of plant breeding is the development of new cultivars superior to those already cultivated in terms of yield potential and stability, product quality, tolerance to abiotic and biotic stress, and other properties [1]. The development of new cultivars is one of the most important drivers of increased crop production. For example, the introduction of new corn cultivars accounted for approximately 47–62% of the yield increase in China between 1980 and 2013 [2].

Winter wheat (Triticum aestivum L.) is among the most extensively cultivated cereal crops in temperate regions, including Poland, where it represents a significant component of national grain production [3,4,5]. The yield gap—defined as the difference between potential yields under rain-fed conditions and realized farm yields—remains larger in Poland than in several other European countries and can be narrowed, among other levers, through appropriate cultivar choice [6]. Therefore, reliable assessment of cultivar performance across heterogeneous environments is essential for improving yield stability and formulating well-targeted recommendations. However, substantial environmental variability frequently induces genotype-by-environment (G×E) interactions that complicate breeding and advisory decisions [7,8,9].

Within this context, an adaptation-related trait often overlooked in empirical assessments is cultivar intensity, a genotype’s relative increase in yield with improving environmental productivity, which, if ignored, may lead to underestimation of potential performance in high-input or otherwise favorable conditions [10,11].

Classical G×E methods such as AMMI or GGE biplots offer insight into interaction structure and stability but typically assume balanced designs, limiting applicability to large, irregular datasets [12,13]. In contrast, linear mixed models (LMMs) effectively handle unbalanced multi-environment trials by jointly modeling fixed and random effects and borrowing strength across cells [14,15,16]. Yet, only a few studies have validated mixed-model–based regression using an independent year in large, structurally unbalanced networks. To address this gap, we integrate mixed-model analysis with cultivar-specific regression on adjusted environmental means and evaluate performance using independent 2024 data.

Quantifying cultivar responsiveness to environmental productivity provides actionable criteria for breeding objectives, separating broadly adapted, stable genotypes from those with specific adaptation (e.g., high-input systems), in line with classic adaptation concepts and stability analysis in plant breeding [10,11]. This alignment of selection with agroecological diversity supports risk management under climate variability and enhances the translational value of testing networks for variety recommendation. Drawing on multi-environment trials from 2015 to 2024 in Poland, we assess whether this framework (i) captures G×E, (ii) ranks cultivars across realistic productivity ranges, and (iii) yields data-driven recommendations to help reduce the yield gap.

2. Materials and Methods

To ensure high methodological quality and the reliability of analytical results, a systematic process of data preparation and statistical modeling was implemented (Figure 1). This approach enables robust yield prediction and provides a practical decision-support tool for cultivar recommendations across diverse farming conditions. All analyses were conducted in the R statistical environment [17], encompassing data curation, mixed model fitting, and genotype-specific regression modeling.

2.1. Data Source and Trial Design

This study utilized grain yield data from multi-environment trials conducted by the Research Centre for Cultivar Testing (COBORU) at 19 locations across Poland between 2015 and 2024. Trial sites were established by COBORU to represent diverse agroecological zones in key winter wheat production regions. Environmental conditions at the sites were generally favorable for wheat cultivation, with more detailed descriptions available in [18].

Each trial implemented two management regimes: (1) a standard input system, reflecting commercial-level practices including baseline fertilization and conventional crop protection; and (2) an intensive input system, featuring enhanced nitrogen fertilization (20–40% above standard) and additional fungicide and plant growth regulator applications [18,19].

Each cultivar’s performance was evaluated under both management regimes. Management intensity was included as a fixed effect in the linear mixed-effects model, allowing for the estimation of mean yields at each input level and enabling the analysis of genotype × management interactions. This modeling approach supports the identification of cultivars exhibiting either stable or variable responses to production intensity.

Each cultivar was typically evaluated in four replications per environment. For the purposes of this study, an environment is defined as a unique combination of location, year, and management regime. All field operations adhered to the national guidelines for official cultivar trials issued by [19], ensuring methodological consistency.

This standardized protocol, combined with the dual-management framework, ensures consistency across trial locations and seasons. As a result, each environment provides a reliable estimate of mean cultivar performance under specific input conditions. These environment-specific yield means serve as the basis for calculating adjusted environmental means in the mixed model and are later used as predictor variables in the regression models. This structure enhances the practical relevance of the findings by linking statistical outcomes directly to typical on-farm conditions and enabling precise, site- and input-specific cultivar recommendations.

2.2. Development of Model

2.2.1. Data Preprocessing

The preprocessing workflow comprised the following steps:

Outlier Detection and Removal

Yield observations showing anomalous deviations at the plot level were excluded based on standardized residuals relative to the environment-specific mean. The exclusion threshold was selected empirically to minimize the influence of extreme values and enhance model robustness.
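The residual-based screening described above can be illustrated with a minimal sketch. It assumes each plot record is an (environment, yield) pair and that the standardized residual is the deviation from the environment mean divided by the environment standard deviation. The function name `flag_outliers` and the default threshold of 3.0 are hypothetical; the text states only that the threshold was selected empirically.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_outliers(records, threshold=3.0):
    """Return one boolean per record; True marks a suspected outlier.

    records   : list of (environment, yield_t_ha) tuples
    threshold : hypothetical cut-off on the absolute standardized residual
                (the source does not report the value actually used)
    """
    # Group plot yields by environment
    by_env = defaultdict(list)
    for env, y in records:
        by_env[env].append(y)

    # Environment-specific mean and (population) standard deviation
    stats = {env: (mean(ys), pstdev(ys)) for env, ys in by_env.items()}

    flags = []
    for env, y in records:
        m, s = stats[env]
        z = 0.0 if s == 0 else (y - m) / s  # standardized residual
        flags.append(abs(z) > threshold)
    return flags
```

With few plots per environment, a single extreme value inflates the standard deviation, so a lower threshold or a robust scale estimate may be needed in practice.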

Dataset Partitioning for Training and Validation

To ensure independent model validation, the dataset was partitioned into training and test subsets. The training dataset comprised cultivars evaluated between 2015 and 2023 and was used for model fitting and parameter estimation. The results of cultivar evaluation performed in the 2024 season were used for independent model validation. This separation allowed for robust benchmarking of predictive accuracy in previously unseen genotypes and environments, ensuring the generalizability of model performance.

Minimum Data Threshold for Cultivars in the Modelling Dataset

Cultivars with fewer than 30 yield observations across all environments contained in the training dataset were excluded from analysis. This threshold is supported by classical literature on yield stability and adaptability [10,11]. Recent empirical work [20] has shown that predictive accuracy (R²) tends to plateau beyond this sample size, justifying its use for model reliability.
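The partitioning and minimum-data steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: rows are assumed to be dictionaries with hypothetical keys `cultivar` and `year`, and validation cultivars are assumed to be limited to those retained in the training set, because a cultivar-specific model is needed to predict them.

```python
from collections import Counter

MIN_OBS = 30            # threshold stated in the text
VALIDATION_YEAR = 2024  # independent validation season stated in the text


def split_and_filter(rows, min_obs=MIN_OBS, validation_year=VALIDATION_YEAR):
    """Split rows into training and validation sets and drop sparse cultivars.

    rows : list of dicts with (hypothetical) keys 'cultivar' and 'year'
    Returns (train_rows, test_rows, kept_cultivars).
    """
    train = [r for r in rows if r["year"] < validation_year]
    test = [r for r in rows if r["year"] == validation_year]

    # Count training observations per cultivar
    counts = Counter(r["cultivar"] for r in train)
    kept = {c for c, n in counts.items() if n >= min_obs}

    # Assumption: only cultivars with a fitted model can be validated
    train = [r for r in train if r["cultivar"] in kept]
    test = [r for r in test if r["cultivar"] in kept]
    return train, test, kept
```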

2.2.2. Mixed Model Analysis

A linear mixed-effects model (LMM) was applied to account for both the hierarchical structure and unbalanced nature of the multi-environment dataset. The model, formally defined in Equation (1), included the following components:

Fixed effects: year (Y), location (L), and management intensity (MIM), along with all two-way and three-way interactions among these factors (Y × L, Y × MIM, L × MIM, and Y × L × MIM).
Random effects: cultivar (G), the interaction cultivar × location × year (G × L × Y), and cultivar × management intensity (G × MIM).

The model can be formally represented as follows:

Y_{ijkgm} = μ + Y_{i} + L_{j} + M_{k} + (Y × L)_{ij} + (Y × M)_{ik} + (L × M)_{jk} + (Y × L × M)_{ijk} + G_{g} + (G × L × Y)_{gij} + (G × M)_{gk} + ε_{ijkgm}

(1)

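where Y_ijkgm is the observed yield; μ is the overall mean; Y_i, L_j, and M_k are the fixed effects of year, location, and management intensity (M corresponds to MIM in the text); the bracketed terms in Y, L, and M are their fixed two-way and three-way interactions; G_g is the random effect of cultivar; (G × L × Y)_gij and (G × M)_gk are the random interaction effects; and ε_ijkgm is the residual error term.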

This modeling structure enabled the estimation of adjusted mean yields for each unique combination of location, year, and management intensity (L × Y × M), while appropriately accounting for random genotype effects and G×E interactions [15,21]. Such a model is parsimonious in its parameters, which is important given the relatively small size of the dataset compared to the complexity of the estimation problem. At the same time, it preserves the various sources of variability and does not homogenize the yields of individual cultivars.

2.2.3. Cultivar Specific Regression Modelling

Each cultivar-specific regression followed the structure shown in Equation (2), using the adjusted environmental means as predictors to estimate genotype responsiveness to environmental productivity:

\hat{Y}_{g} = β_{0g} + β_{1g} \cdot \bar{E} + ε

(2)

where \hat{Y}_{g} is the predicted yield of cultivar g; β_{0g} is the intercept (baseline yield under low environmental productivity); β_{1g} is the regression slope (responsiveness to environmental improvement); \bar{E} is the adjusted environmental mean; and ε is the error term.

The slope β_{1g} was interpreted as the cultivar’s yield responsiveness, while model fit was evaluated using the coefficient of determination (R²).

To avoid extrapolation, predictions were restricted to the 10th–90th percentile range of the environmental means (~7–11 t/ha), as recommended by [10].
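As an illustration of Equation (2) and the restricted prediction range, the sketch below fits an ordinary least-squares line for one cultivar and refuses to predict outside a given interval. The study itself used R's lm function; this Python version, the function names `fit_line` and `predict`, and the default bounds of 7 and 11 t/ha (the approximate percentiles quoted above) are illustrative assumptions.

```python
def fit_line(env_means, yields):
    """Least-squares fit of yield on environmental mean for one cultivar.

    Returns (intercept, slope, r_squared). Needs at least two distinct
    environmental means and non-constant yields.
    """
    n = len(env_means)
    mx = sum(env_means) / n
    my = sum(yields) / n
    sxx = sum((x - mx) ** 2 for x in env_means)
    sxy = sum((x - mx) * (y - my) for x, y in zip(env_means, yields))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(env_means, yields))
    ss_tot = sum((y - my) ** 2 for y in yields)
    return intercept, slope, 1.0 - ss_res / ss_tot


def predict(intercept, slope, env_mean, lo=7.0, hi=11.0):
    """Predict cultivar yield, rejecting values outside [lo, hi] t/ha.

    lo and hi are assumed bounds taken from the approximate 10th and 90th
    percentiles quoted in the text.
    """
    if not lo <= env_mean <= hi:
        raise ValueError("environmental mean outside the supported range")
    return intercept + slope * env_mean
```

Raising an error rather than clipping makes any attempted extrapolation visible to the caller; clipping would be an equally plausible design.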

All statistical analyses were performed using R software [17] (version 4.4.2). Mixed-model fitting was implemented via the lme4 package [22], linear regressions were conducted using base R’s lm function, and significance tests were carried out using the lmerTest package [23].

2.2.4. Simplified Reference Model for Yield Prediction

In this model, the yield of each cultivar in the 2024 season was predicted based solely on the long-term average yield recorded at a specific trial site under a given management regime (standard or intensive). These site-level averages were computed using data collected between 2015 and 2023 and included all cultivars tested under each unique location × management combination.

Formally, the model can be expressed as follows:

\hat{Y}_{glm} = \bar{Y}_{lm}

(3)

where \hat{Y}_{glm} is the predicted yield of cultivar g at location l under management regime m in 2024, and \bar{Y}_{lm} is the long-term average yield at location l and management level m (based on 2015–2023 data).

This approach assumes that all cultivars perform identically under the same location × management conditions. It does not include genotype effects, responsiveness to environmental variation, or genotype-by-environment (G×E) interactions. As such, it ignores any potential shifts in cultivar rankings across environments or differences in cultivar-specific adaptation.
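A minimal sketch of the reference model in Equation (3) is given below. It assumes training rows are dictionaries with hypothetical keys `location`, `management`, and `yield`; every cultivar receives the same prediction for a given location × management cell, which is exactly the simplification described above.

```python
from collections import defaultdict


def site_management_means(train_rows):
    """Long-term mean yield per (location, management) cell.

    train_rows : list of dicts with (hypothetical) keys
                 'location', 'management', 'yield'
    """
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum, count]
    for r in train_rows:
        cell = (r["location"], r["management"])
        totals[cell][0] += r["yield"]
        totals[cell][1] += 1
    return {cell: s / n for cell, (s, n) in totals.items()}


def predict_reference(means, location, management):
    """Reference prediction: identical for every cultivar in the cell."""
    return means[(location, management)]
```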

Despite their simplicity, such models are frequently used in agricultural research as practical reference baselines. They provide a neutral benchmark for evaluating whether more advanced modeling strategies, such as mixed-effects or regression-based approaches, deliver meaningful improvements in predictive performance [24,25].

The predictive performance of this simplified model is compared to that of the full regression-based model in Section 3.3.4, allowing for a direct evaluation of the added value of incorporating genotype-specific responses and genotype × environment interactions.

2.2.5. Model Validation

The predictive performance of the developed yield modeling framework was assessed using an independent dataset from the 2024 trial season. The aim of the validation procedure was to evaluate the model’s ability to generate reliable yield predictions under real-world field conditions not included in the training data (2015–2023).

For each 2024 environment—defined as a unique combination of location and management regime—cultivar-specific regression models were used to generate yield predictions. The arithmetic mean yield observed in each environment (based on all cultivars tested at a given site and management level) was used as the explanatory variable in the regression models. Adjusted means were not available for 2024 due to the exclusion of this season from the mixed-model fitting; however, prior analyses indicated that arithmetic and adjusted means were closely aligned, supporting the use of unadjusted values for this purpose.

To examine predictive performance in greater detail, two types of predictions were evaluated: environmental-level prediction, assessing the model’s ability to estimate average yield for each location × management setting in 2024 using regression models fitted on historical data from 2015 to 2023; and cultivar-level prediction, assessing the accuracy of predicted yields for individual cultivars in each environment, based on their genotype-specific regression coefficients.

Prediction accuracy was quantified using two standard metrics: Pearson’s correlation coefficient to evaluate the strength of association between predicted and observed yields; and root mean square error (RMSE) to estimate the average prediction error.
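Both metrics follow their standard definitions and can be written out as a short sketch. The function names are illustrative; the inputs are assumed to be equal-length sequences of predicted and observed yields in t/ha.

```python
from math import sqrt


def pearson_r(pred, obs):
    """Pearson correlation between predicted and observed yields."""
    n = len(pred)
    mp = sum(pred) / n
    mo = sum(obs) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / sqrt(var_p * var_o)


def rmse(pred, obs):
    """Root mean square error, in the same units as the yields."""
    n = len(pred)
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / n)
```

The two metrics are complementary: a constant offset between predicted and observed yields leaves the correlation at 1 while the RMSE equals the size of the offset, so reporting both guards against a biased but well-correlated model.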

This multi-level evaluation framework was designed to separately characterize the model’s capacity to generalize environmental productivity as well as to capture genotype-specific responses across a wide range of field conditions.

2.3. Application of Model for Cultivar Recommendation

2.3.1. Evaluation of Cultivar Adaptability Across Diverse Environmental Conditions

To evaluate how different cultivars respond to varying levels of environmental productivity, yield predictions were generated for three representative levels of environmental productivity. These levels corresponded to the 10th (~7 t/ha), 50th (~9 t/ha), and 90th (~11 t/ha) percentiles of the adjusted environmental mean yields based on the training dataset from 2015 to 2023. The use of deciles rather than quartiles allowed for finer resolution at the extremes, ensuring better representation of both low- and high-yielding scenarios while retaining a reference for average performance.

For each of these three productivity levels, predicted yields were calculated for all cultivars using the cultivar-specific regression models described in Section 2.2.3. Cultivars were then ranked separately at each productivity level. To identify genotypes with robust and consistently high performance, a fixed selection threshold was applied: cultivars that ranked among the top 20 at all three productivity levels were designated as consistently top-performing cultivars (broadly adapted).

This “Top 20” criterion reflects a selection intensity of approximately 13%, based on a total of 156 cultivars in the training dataset. The threshold is consistent with selection intensities commonly used in national cultivar recommendation systems, which typically focus on the top 10–15% of entries with regard to yield and agronomic stability [19,26].
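The ranking rule can be sketched as follows, assuming each cultivar is represented by its fitted (intercept, slope) pair from Equation (2). The function name `consistently_top` and the data layout are hypothetical; the three productivity levels and the cut-off of 20 come from the text.

```python
def consistently_top(models, levels=(7.0, 9.0, 11.0), k=20):
    """Cultivars ranked in the top k at every productivity level.

    models : dict mapping cultivar name -> (intercept, slope)
    levels : environmental means (t/ha) at which yields are predicted
    k      : rank cut-off (20 in the study)
    """
    top_sets = []
    for env_mean in levels:
        # Rank cultivars by predicted yield at this productivity level
        ranked = sorted(
            models,
            key=lambda c: models[c][0] + models[c][1] * env_mean,
            reverse=True,
        )
        top_sets.append(set(ranked[:k]))
    # Keep only cultivars present in every top-k set
    return set.intersection(*top_sets)


# Selection intensity implied by the study's numbers: 20 of 156 cultivars
selection_intensity = 20 / 156  # about 0.128, i.e. roughly 13%
```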

While the classical term “broad (wide) adaptation” [10] is often used to describe cultivars performing well across environments, this study applies the more descriptive and cautious label “consistently top-performing.” This terminology avoids conflation with the formal genetic definition of adaptability and reflects the empirical nature of the analysis, which is based on observed ranking consistency across predicted yield scenarios, rather than statistical stability metrics such as regression slope. The term consistently top-performing is used descriptively and should not be confused with statistical definitions of yield stability.

2.3.2. Recommendation Scenarios Based on Cultivar Responsiveness

Beyond predictive performance, the developed framework offers practical insights for cultivar selection tailored to on-farm conditions. Based on regression parameters (intercept, slope, and R²), the following recommendation scenarios were defined:

Stable Cultivars for Heterogeneous Fields. Cultivars with high predictive accuracy (R² ≥ 0.90) and slope values close to 1.0 are considered stable. Stable cultivars that also rank among the top 20 across the full range of environmental productivity (7 to 11 t/ha) are considered consistently top-performing (broadly adapted). These genotypes maintain top-tier performance in diverse growing conditions and suit farms seeking uniform results across variable years and locations, especially on fields with heterogeneous soils and variable input regimes. Such field conditions are common across many regions of Poland, where intra-field variability significantly affects yield outcomes [27,28,29]. Choosing a single, stable cultivar for such variable fields avoids the need for separate cultivar selection per soil patch and simplifies crop management.

Optimization in High-Productivity Systems. Cultivars with a slope > 1.1 and high predicted yields at 11 t/ha (90th percentile) exhibit strong responsiveness to favorable conditions and are ideal for intensive production systems. These cultivars are likely to make the most of optimal growing environments with good soils, favorable weather, and high input availability.

Recommendations for Low-Input or Marginal Systems. Under lower-productivity conditions (~7 t/ha), cultivars with slopes < 1.0 but good baseline yield performance (intercept) are recommended. These genotypes are well-suited to stress-prone environments or farms with limited input resources. Their limited responsiveness reduces the risk of poor performance under suboptimal conditions.

Role of Management Intensity. Although trials were conducted under both standard and intensive management regimes, separate regression models were not developed for each system. Instead, management intensity was modeled as a fixed effect in the mixed model, used to estimate adjusted environmental means.

This approach allowed the full productivity gradient—encompassing variation from site, season, and input intensity—to be captured holistically, without requiring separate regression structures for each management level. As a result, cultivar recommendations are based on general responsiveness to environmental productivity, regardless of whether yield differences arose from soil quality, climatic conditions, or agronomic inputs.

This strategy supports broad applicability of the results in variety evaluation and agricultural advisory systems, providing a flexible tool for yield optimization under real-world field conditions.
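The slope and R² criteria behind the three scenarios can be condensed into a small decision rule. This is a sketch under assumptions rather than the authors' procedure: the tolerance that defines "close to 1.0" (0.05 here), the order in which the rules are checked, and the function name `recommend` are all hypothetical. The ranking and baseline-yield conditions mentioned in the scenarios are not checked here.

```python
def recommend(slope, r2, slope_tol=0.05, min_r2=0.90, responsive_slope=1.1):
    """Assign a cultivar to a recommendation scenario from slope and R².

    slope_tol        : assumed tolerance for 'slope close to 1.0'
    min_r2           : R² threshold for stable cultivars (from the text)
    responsive_slope : slope above which a cultivar is treated as
                       responsive to high productivity (from the text)
    """
    if r2 >= min_r2 and abs(slope - 1.0) <= slope_tol:
        return "stable"
    if slope > responsive_slope:
        return "high-productivity"
    if slope < 1.0:
        # The text also requires good baseline yield; not checked here
        return "low-input candidate"
    return "unclassified"
```

For instance, a cultivar with a slope of 0.97 and R² of 0.94 falls into the stable group under these assumed settings, because the stability rule is checked before the slope < 1.0 rule.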

3. Results and Discussion

The COBORU trial network was established on arable soils of varying quality, which reflect the actual agronomic conditions of wheat cultivation in Poland. According to estimates from national research institutes (IUNG-PIB and IHAR-PIB), over 70% of arable land used for winter wheat cultivation in Poland falls within soil classes II–IVb [30,31]. The same soil classes—especially IIIa, IIIb, IVa, and IVb—are the most common across COBORU trial sites, supporting the relevance and practical applicability of the results for breeders and agricultural advisors.

3.1. Yield Range and Representativeness of Trial Environments

The trials were conducted on soils with varying quality classifications (Table 1), most commonly classes IIIa and IIIb (good and moderately good arable soils), but also including class II (very good) and classes IVa–IVb (medium quality: better and poorer). According to COBORU, each trial site was assigned a land suitability group, ranging from group 1 (very good suitability for wheat) to group 5 (good suitability for rye). Notably, 13 of the 19 locations fell into groups 1 or 2, indicating high suitability for wheat and reflecting areas representative of Poland’s most productive cereal-growing regions.

This wide range of environmental conditions enhances the generalizability of the model, making it applicable to both challenging and highly productive situations. Such an approach is consistent with current recommendations for predictive modeling in precision agriculture and cultivar performance assessment [20,24,25].

To evaluate the representativeness of the trials in relation to real-world production conditions, yield data from 19 locations collected between 2015 and 2024 were analyzed. For each site, key summary statistics—including mean and standard deviation of yields—were calculated by aggregating data across all cultivars, management levels, and replications (Table 1).

During the training years (2015–2023), mean yields ranged from 6.30 t/ha in Tomaszów Bolesławiecki to 10.93 t/ha in Głubczyce, indicating that the trial network encompassed a wide range of yield environments. Locations such as Głubczyce, Radostowo, and Zybiszów consistently recorded average yields exceeding 10.0 t/ha, reflecting highly favorable growing conditions. In contrast, sites such as Tomaszów Bolesławiecki, Nowa Wieś Ujska, Głębokie, and Cicibór Duży showed lower average yields, indicative of more constrained production environments.

In several locations (Cicibór Duży, Nowa Wieś Ujska, Słupia, Tomaszów Bolesławiecki, and Świebodzin), minimum yields fell below 5.0 t/ha, likely due to drought, poor soil quality, or disease pressure. Conversely, maximum yields in the top-performing sites (Krościna Mała, Radostowo, Skołoszów, Słupia, and Zybiszów) exceeded 14.0 t/ha, demonstrating that the trials captured both ends of the yield distribution. In some locations, such as Radostowo and Słupia, both very low and very high yields were recorded, suggesting substantial intra-site variability in weather or soil conditions.

In 2024, average yields in most locations remained within the typical range observed during the previous years (2015–2023). However, a few sites, including Słupia and Świebodzin, recorded substantially lower yields compared to their earlier averages. Similar downward trends were also observed in Tomaszów Bolesławiecki and Węgrzce, indicating adverse seasonal or site-specific conditions that negatively affected yield performance in these locations.

For comparison, national statistics from Statistics Poland (GUS) [33] indicate that average winter wheat yields ranged from 5.5 to 6.8 t/ha between 2015 and 2023. In contrast, analysis of COBORU trial data shows that average annual yields ranged from 7.4 to 10.1 t/ha over the same period, reflecting the favorable conditions under which the experiments were conducted and the yield gap between potential and current yields. However, by incorporating both high- and low-yielding sites, the dataset offers a realistic representation of wheat production environments across Poland. This enhances the reliability of the model and supports its applicability to both intensive and low-input farming systems.

Importantly, this wide variability also supports robust model training and validation by facilitating the development of cultivar recommendations tailored to diverse growing conditions. Sites with extreme values—whether exceptionally high or low—contribute to capturing the full range of genotype × environment interactions, which is critical for effective predictive modeling and generalization beyond the training data. This principle has been widely emphasized in the literature on multi-environment trials, where environmental diversity is considered essential for identifying both broadly and specifically adapted genotypes [8,12,15].

3.2. Data Preparation

The full dataset consisted of 19,641 winter wheat yield observations and 212 cultivars collected from 2015 to 2024 across 19 official trial locations in Poland. These data included all cultivar replications under both standard and intensive management regimes. Data collection protocols were largely consistent across years and sites, though occasional trial cancellations or data reporting issues resulted in isolated gaps (see Table 2).

Table 2 presents the number of raw yield observations per location and year, before data cleaning. These counts reflect the total number of cultivar-level observations per site, summed across management levels and repetitions. The data were provided by COBORU, the Polish Research Centre for Cultivar Testing.

To ensure analytical robustness and minimize bias, two quality filters were applied:

Outlier removal based on environment-specific standardized residuals.
Exclusion of cultivars with fewer than 30 observations across the 2015–2023 period.

After filtering, the final training dataset (2015–2023) consisted of 156 cultivars (16,582 yield observations), each with sufficient representation across environments (≥30 observations). The test dataset (2024) initially included 73 cultivars (2703 yield observations), of which 52 cultivars (1957 yield observations) met the inclusion criteria for validation.

Figure 2 shows the distribution of yield observations per cultivar. Panel (a) corresponds to the training dataset, while panel (b) includes the full dataset with 2024 added. The distribution is strongly right-skewed, with the highest frequency observed in the 70–79 range and a long tail of cultivars tested in over 200 environments. Importantly, the inclusion of 2024 observations increased the dataset size but did not alter the overall distribution pattern.

This structure reflects the dynamic turnover of cultivars in official trials and supports the use of a 30-observation threshold to ensure model stability and reliable adaptability assessments. Similar thresholds have been supported in previous multi-environment trial (MET) studies, where prediction accuracy and stability metrics improve with increasing sample size up to a saturation point [11,14].

Additionally, classic studies by [10,12] emphasized the importance of representative environmental sampling for robust estimation of genotype-specific response functions. More recently, ref. [20] confirmed that predictive accuracy for cultivar performance (measured as R²) stabilizes once a cultivar has been tested in approximately 30–40 environments. Thus, the applied filtering strategy is consistent with best practices for modeling genotype × environment interactions [8,15].

3.3. Statistical Models

3.3.1. Performance of the Linear Mixed Model

A linear mixed-effects model was applied to analyze winter wheat yield data from 2015 to 2023 (excluding the validation year 2024). This model structure enabled the simultaneous assessment of three fixed factors—year, location, and management intensity (MIM)—along with all two-way and three-way interactions among them. Random effects accounted for variation due to cultivar, cultivar × location × year, and cultivar × management intensity interactions.

Analysis of variance (Table 3) revealed that all main effects—year, location, and management intensity—significantly influenced yield (p < 0.001). Additionally, all interaction terms were statistically significant, demonstrating that the effect of each factor was context-dependent and varied across environments.

The model effectively captured environmental variability, including differences between locations, seasons, and agronomic inputs. High F-statistics indicate that the model was sensitive to diverse field conditions and capable of quantifying their influence on cultivar performance.

Importantly, these findings align with previous multi-environment trial studies emphasizing the need to model both fixed and random effects to accurately capture genotype × environment interactions in crop performance analysis [8,15,22].

The model’s prediction error, expressed as root mean square error (RMSE), was low (0.116 t/ha), indicating high predictive precision. Adjusted environmental mean yields, estimated for each location × year × management combination, ranged from 3.24 to 13.32 t/ha, with an overall mean of 9.07 t/ha, a median of 9.21 t/ha, and a standard deviation of 1.90 t/ha. This wide distribution confirms that the trial network effectively captured both marginal and highly productive environments.

3.3.2. Performance and Prediction Accuracy of Cultivar-Specific Regression Models

To gain deeper insight into how individual winter wheat cultivars respond to varying environmental conditions, cultivar-specific linear regression models were applied, as described in Section 2.2.3. In each case, grain yield was modeled as a function of the environmental mean yield—defined by unique combinations of location, year, and management level. These environmental means were derived from the linear mixed model described earlier.

Model Fit and Predictive Accuracy

The regression models showed strong agreement with observed yield data across most cultivars. The coefficient of determination (R²), reflecting the proportion of yield variance explained by the model, ranged from 0.49 to 0.97 (Supplementary Table S1). A total of 93 cultivars achieved R² values ≥ 0.90, and an additional 48 exceeded 0.80, confirming the high predictive accuracy of the modeling framework. Among the best-modeled cultivars were LG Egmont (R² = 0.97), Riposta (0.96), Nordkap (0.96), Jannis (0.96), SU Geometry (0.96), Gimantis (0.95), and Adrenalin (0.94), all showing excellent model fit and stable performance across environments.

This level of predictive performance is consistent with, or even better than, findings from previous METs. For example, [12,14] reported R² values typically ranging from 0.60 to 0.90 in genotype-specific yield regressions, depending on data structure and environmental heterogeneity. Similarly, ref. [7] observed cultivar-wise R² values between 0.55 and 0.92 in winter wheat trials across diverse agroecological zones. The relatively large number of cultivars with R² ≥ 0.90 in our study reflects the robustness of the regression framework and the high quality of the dataset.

The strong predictive accuracy likely stems from the use of adjusted environmental means derived from linear mixed models, along with the enforcement of a minimum threshold of 30 observations per cultivar. This minimized estimation noise and ensured model stability, consistent with methodological recommendations by [8,11]. Overall, the observed R² distribution falls within the expected range reported in the literature on multi-environment yield modeling and confirms good model fit across a wide range of modern winter wheat cultivars.

Cultivar Responsiveness and Model Robustness

The regression slope (β₁) in cultivar-specific models represents the yield responsiveness of a genotype to environmental productivity. Values near 1.0 indicate a proportional response and are commonly interpreted as indicative of wide adaptability and performance stability [8]. Cultivars with slopes > 1.0—such as KWS Dacanto (β₁ = 1.29)—exhibited increased responsiveness under favorable conditions, suggesting high yield potential but possibly lower stability. In contrast, cultivars like Ceres (β₁ = 0.72) showed limited responsiveness, making them more suitable for low-input or stress-prone environments. Stable cultivars with slopes close to 1.0—such as Adrenalin (0.97), Artist (0.99), and LG Keramik (1.00)—demonstrated consistent yield responses across environments, reinforcing their utility as check cultivars in registration or breeding trials (Supplementary Table S1).

The range of slope estimates observed aligns with prior multi-environment trial (MET) studies in wheat. Ref. [9] reported cultivar-specific slopes ranging from 0.70 to 1.30 across diverse environments. Refs. [11,12] highlighted the joint interpretation of slope and R² as a powerful tool in cultivar selection. As shown in ref. [20], model reliability was not significantly affected by the number of observations, provided that the 30-trial threshold was met. This threshold, as adopted in Section 2.2.1, proved sufficient to yield robust and reproducible estimates across cultivars.

To avoid extrapolation, model predictions were confined to environments with adjusted mean yields between 7 and 11 t/ha—corresponding to the 10th to 90th percentiles of the environmental productivity distribution. This ensured agronomic relevance and prevented parameter distortion due to extreme values. These constraints follow methodological best practices recommended by [8,10,12], who emphasize the importance of restricting genotype performance analysis to observed environmental ranges.
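The restriction to the central part of the environmental productivity distribution can be sketched as follows. This is an illustrative assumption about how such bounds might be computed, not the procedure used in the study: the helper names `prediction_range` and `within_range` are hypothetical, the data are made up, and NumPy's default linear-interpolation percentile is assumed.

```python
import numpy as np

def prediction_range(env_means, lower_pct=10, upper_pct=90):
    """Return the (lower, upper) percentile bounds of the environmental means."""
    values = np.asarray(env_means, dtype=float)
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    return float(lo), float(hi)

def within_range(env_value, lo, hi):
    """True if a target environment lies inside the supported range."""
    return lo <= env_value <= hi

# Hypothetical environmental means of 1, 2, ..., 11 t/ha (not study data).
env_means = np.arange(1.0, 12.0)
lo, hi = prediction_range(env_means)
print(lo, hi, within_range(5.0, lo, hi), within_range(11.0, lo, hi))
```

Predictions would then be reported only for environments where `within_range` holds, which keeps the regression lines from being read outside the range in which they were well supported by data.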

3.3.3. Comparative Validation of Predictive Accuracy Using 2024 Data

To assess the added value of incorporating genotype-specific responses in yield prediction, we compared the predictive accuracy of the full cultivar-specific regression model with that of the simplified reference model described in Section 2.2.4. While the full model accounts for both environmental variation and genotype × environment (G×E) interactions through individual cultivar response functions (Section 2.2.2, Section 2.2.3, Section 2.2.4, Section 2.2.5 and Section 2.3), the simplified reference model uses only long-term mean yields per location × management regime and assumes equal performance across cultivars.

The cultivar-specific regression model was trained on data from 2015 to 2023 and used to predict grain yields for each cultivar–environment combination in the 2024 trial season. Predicted values were then compared to actual observed yields in 2024. The model demonstrated high predictive accuracy, with a Pearson correlation coefficient of r = 0.958 and a root mean square error (RMSE) of 0.45 t/ha (Table 4). These values confirm that the model effectively captured both the environmental structure of the trial network and cultivar-specific responsiveness. The corresponding scatterplot (Figure 3) shows that most points lie close to the 1:1 identity line, indicating strong agreement between predicted and observed yields. The prediction covered a broad range of yield outcomes (approximately 3 to 14 t/ha), suggesting robustness across environments and performance levels. A small number of outliers likely reflect extreme local effects or unmodeled residual variation.
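For readers who wish to reproduce the two validation metrics on their own data, a minimal sketch is given below. The function names and the four observed/predicted pairs are hypothetical and serve only to show the arithmetic; they are not values from the 2024 validation set.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error between observed and predicted yields."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((o - p) ** 2)))

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted yields."""
    return float(np.corrcoef(observed, predicted)[0, 1])

# Made-up yields in t/ha; every prediction is off by exactly 0.5 t/ha.
observed = [4.0, 6.0, 8.0, 10.0]
predicted = [4.5, 5.5, 8.5, 9.5]
error = rmse(observed, predicted)
r = pearson_r(observed, predicted)
print(error, round(r, 3))
```

Note that the two metrics answer different questions: r measures how well the ordering and spread of yields are reproduced, whereas RMSE measures the typical size of the error in t/ha.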

To further illustrate the difference in performance between the two models, we present a direct comparison in Table 4 below. The simplified reference model—which does not account for genotypic differences or G×E interactions—yielded substantially lower predictive performance, with a Pearson correlation coefficient of r = 0.502 (between predicted environment-level means and observed cultivar yields) and an RMSE of 2.08 t/ha. These results highlight the limitations of genotype-agnostic models and underscore the predictive advantage offered by incorporating cultivar-specific parameters.
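The logic of a genotype-agnostic baseline of this kind can be sketched in a few lines of Python. The sketch assumes that the reference prediction is simply the mean training yield for each location × management combination, applied identically to every cultivar; the function names, the row layout, and the data are hypothetical.

```python
from collections import defaultdict

def fit_reference_model(training_rows):
    """Average yield per (location, management) over all training observations.

    training_rows: iterable of (location, management, yield_t_ha) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for location, management, yield_t_ha in training_rows:
        sums[(location, management)] += yield_t_ha
        counts[(location, management)] += 1
    return {key: sums[key] / counts[key] for key in sums}

def predict_reference(model, location, management):
    """Same prediction for every cultivar grown at this location and management level."""
    return model[(location, management)]

# Made-up training rows (not study data).
rows = [
    ("A", "standard", 8.0),
    ("A", "standard", 10.0),
    ("A", "intensive", 11.0),
    ("B", "standard", 6.0),
]
model = fit_reference_model(rows)
print(predict_reference(model, "A", "standard"))
```

Because the prediction ignores both the cultivar and the season, such a baseline cannot follow year-to-year shifts in site productivity or differences between cultivars, which is consistent with its weaker performance in Table 4.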

The improvement in predictive accuracy observed here is consistent with findings from prior studies. For example, ref. [9] reported cross-validation R² values between 0.55 and 0.92 in winter wheat yield regressions across multiple agroecological zones. Similarly, refs. [12,14] emphasized that R² values above 0.85 in multi-environment trials indicate strong predictive fit and model robustness. Importantly, these prior works, together with the current results, demonstrate that genotype responsiveness to environmental productivity is a key component in accurate and stable prediction systems.

Furthermore, our findings confirm conclusions drawn by earlier studies [24,25], which showed that omitting G×E interactions reduces prediction quality, especially under real-world field variability. Thus, the application of cultivar-specific regression approaches, as implemented in this study, offers tangible gains in predictive performance and should be considered a standard practice in variety evaluation and recommendation systems.

Validation of Cultivar-Specific Regression Models Using 2024 Data

While Supplementary Table S1 provides insights into model fit through R² and regression slope, Table 5 and Table 6 evaluate the external predictive accuracy of cultivar-specific regression models using independent data from the 2024 trial season. This dual approach allows for a more complete assessment of model quality: internal fit (R²) versus real-world prediction (RMSE).

To quantify predictive performance, we used the RMSE/SD ratio, comparing the root mean square error (RMSE) of model predictions against the standard deviation (SD) of observed yields for each cultivar. This normalized indicator evaluates whether the prediction error remains within the range of natural yield variability, offering a standardized metric for comparing cultivars with different yield dynamics.
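The RMSE/SD ratio can be computed as in the sketch below. The use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not state which estimator was used, and the function name is hypothetical. The last lines recompute the ratio from the rounded RMSE and SD values reported for SU Banatus (Table 5) and LG Nida (Table 6).

```python
import numpy as np

def rmse_sd_ratio(observed, predicted, ddof=1):
    """Prediction RMSE divided by the SD of observed yields (ddof=1 is an assumption)."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((o - p) ** 2))
    sd = np.std(o, ddof=ddof)
    return float(rmse / sd)

# Made-up yields in t/ha, used only to exercise the function.
ratio = rmse_sd_ratio([4.0, 6.0, 8.0, 10.0], [4.5, 5.5, 8.5, 9.5])

# Ratios recomputed from the rounded RMSE and SD values in Tables 5 and 6.
su_banatus = round(0.39 / 2.31, 2)
lg_nida = round(1.93 / 2.76, 2)
print(round(ratio, 3), su_banatus, lg_nida)
```

A ratio well below 1 means that the prediction error is small compared with the spread of yields the cultivar actually showed across environments.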

Across the 156 cultivars tested, most displayed RMSE/SD values well below the 0.30 threshold, confirming that prediction errors were small relative to yield variation. Notably, cultivars such as SU Banatus (0.17), Comandor (0.17), Chevignon (0.18), and Symetria (0.17) demonstrated the highest prediction precision, with both low RMSE/SD and R² ≥ 0.88 (Table 5). These genotypes can be considered highly predictable and stable under varying environmental conditions.

Cultivars like Adrenalin (0.35) and LG Keramik (0.36) showed slightly elevated RMSE/SD ratios but still maintained R² values of at least 0.90 (Table 6). This suggests that the regression models captured yield responses accurately even in more variable environments.

In contrast, LG Nida (0.70) and KWS Donovan (0.58) exhibited the highest RMSE/SD values, indicating that their 2024 yield outcomes deviated substantially from model expectations despite high R² (0.92). This discrepancy likely reflects strong G×E interactions or environment-specific responses not fully captured by the linear framework. Other cultivars with RMSE/SD ≥ 0.44 include Bosporus (0.52), SU Mangold (0.44), and Bright (0.44), all maintaining R² between 0.90 and 0.94 (Table 6).

The comparative analysis of RMSE/SD highlights significant variation in predictive reliability across cultivars. Genotypes with the lowest RMSE/SD not only demonstrated excellent model fit but also high agreement between predicted and actual yields, confirming their robustness across environments. Such cultivars are ideal candidates for use as check cultivars in official variety testing or as donors of yield stability in breeding programs.

Conversely, cultivars with high RMSE/SD may still show strong R², indicating that model fit alone is not sufficient for predicting performance under new conditions. These cases underline the importance of accounting for non-linear or environment-specific G×E effects, especially when models are applied beyond the calibration range.

Our findings align with recent literature emphasizing the dual use of fit and error metrics in model evaluation. Technow et al., 2015 [24] and Heslot et al., 2014 [25] advocated the use of normalized prediction errors (e.g., RMSE/SD or MAE/SD) as essential indicators of model utility under real-world variability. Similarly, ref. [8] highlighted the role of external validation in detecting model overfitting and in guiding genotype selection under complex agroecological scenarios.

Thus, the combined use of R², slope, and RMSE/SD provides a robust and interpretable framework for discriminating between statistically well-fitted and practically predictable cultivars—a distinction that is critical for successful deployment in breeding, variety registration, and farmer decision-support systems.

3.3.4. Cultivar Adaptation to Diverse Environmental Productivity

To evaluate how winter wheat cultivars respond to different levels of environmental productivity, we calculated predicted yields for each cultivar at three representative yield levels (7, 9 and 11 t/ha). Yields predicted at each level were calculated using cultivar-specific regression models, as described in Section 2.2.3, based on model intercepts and slopes. Each cultivar was then ranked independently at each yield level, and the rankings were aggregated by summing ranks across the three environments to produce a composite adaptation score.
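The ranking procedure can be illustrated with a short sketch. The intercepts and slopes are those listed for Bulldozer and SY Cellist in Table 7; everything else (function names, ranking only two cultivars, no tie handling) is a simplifying assumption, so the resulting rank sums are not comparable to the ranks reported for the full set of 156 cultivars.

```python
LEVELS = (7.0, 9.0, 11.0)  # environmental mean yields in t/ha

def predicted_yields(intercept, slope, levels=LEVELS):
    """Predicted cultivar yield at each environmental productivity level."""
    return [intercept + slope * level for level in levels]

def rank_sums(params, levels=LEVELS):
    """Sum of per-level ranks; rank 1 is the highest predicted yield.

    params: {cultivar: (intercept, slope)}. Ties are not handled in this sketch.
    """
    totals = {name: 0 for name in params}
    for level in levels:
        ordered = sorted(
            params,
            key=lambda name: params[name][0] + params[name][1] * level,
            reverse=True,
        )
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank
    return totals

# Regression parameters as listed in Table 7.
params = {"Bulldozer": (1.75, 0.91), "SY Cellist": (0.10, 1.08)}
totals = rank_sums(params)
print(predicted_yields(*params["Bulldozer"]), totals)
```

With these two cultivars alone, Bulldozer leads at 7 and 9 t/ha and SY Cellist leads at 11 t/ha, which mirrors the crossover implied by their slopes below and above 1.0.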

To facilitate practical decision-making, cultivars were also assigned to one of three recommendation categories (consistently top-performing or broadly adapted, high-productivity environments, or low-productivity environments) based on their slope values (see Table 7 and its accompanying note). This allows agronomists and farmers to align cultivar choice not only with general performance, but also with specific environmental conditions and production goals.

Table 7 presents the top 20 broadly adapted cultivars based on cumulative rank, along with their individual ranks at low, medium, and high productivity levels. The table also includes regression parameters—intercept, slope, and coefficient of determination (R²)—to illustrate model fit. A lower cumulative rank indicates more stable and superior yield performance across the three environments.

Noteworthy examples include SY Cellist, Chevignon, and LG Mondial, which combine strong model fit with high yield consistency; Bulldozer, which was the top performer under low-input conditions; SU Tarroca, which excelled in favorable, high-yielding environments; and cultivars such as Knut, KWS Donovan, and LG Keramik, which showed both adaptability and yield stability across the full productivity spectrum. These cultivars are especially valuable for national and regional recommendation systems that prioritize consistent performance under diverse conditions.

In contrast, several cultivars exhibited specific adaptation patterns. For instance, Bright and SU Tarroca performed particularly well in medium- to high-yield environments. RGT Ritter was competitive under low and medium productivity but not among the best performers in high-input scenarios. Meanwhile, KWS Talium demonstrated strong performance exclusively under low-yield conditions, indicating narrow adaptation and a possible lack of robustness under more intensive management.

This stratified analysis offers a practical framework for identifying cultivars suited either for general deployment across variable environments or for targeted use in specific systems—such as resource-limited farms or high-intensity cropping. Full criteria for recommendation classification and regression-based responsiveness are detailed in the note accompanying Table 7. A comprehensive overview of cultivar rankings at each yield level for all 156 tested cultivars is available in Supplementary Table S2, enabling further exploration and data-driven decision-making in cultivar selection.

3.3.5. Implications for Cultivar Recommendation

The results underscore the importance of distinguishing between cultivars with broad versus narrow environmental adaptation when designing effective variety recommendation strategies, as previously emphasized by [8,10]. Broadly adapted cultivars, such as SY Cellist, Chevignon, LG Mondial, and Knut, consistently ranked among the top performers across low-, medium-, and high-productivity scenarios (see Section 3.3.4 and Table 7). Their stable performance, indicated by slope values close to 1.0, combined with strong model fit and high predictive accuracy (R² > 0.90, RMSE/SD < 0.30), makes them particularly suitable for general deployment in heterogeneous or risk-prone farming systems, as supported by earlier work [27,29].

Conversely, cultivars with narrower adaptation profiles, such as KWS Talium, which excelled under low-yield conditions but underperformed in high-productivity environments, may still offer strong performance but only when matched carefully to specific field conditions. These cultivars are better suited for targeted recommendations in low-input systems, marginal environments, or uniform, resource-limited farms, and failure to account for this specificity can result in suboptimal outcomes in more intensive systems [20].

To support practical decision-making, cultivars were also assigned to one of three recommendation categories (consistently top-performing or broadly adapted, low-productivity environments, or high-productivity environments) based on their slope values, reflecting responsiveness to environmental potential (see Table 7 and accompanying note). This classification is consistent with variety recommendation protocols adopted in official systems [19,26]. By integrating multiple layers of model evaluation, including predicted yields at defined productivity levels (Section 3.3.4), regression fit parameters (Section 3.3.2), and predictive error metrics such as RMSE/SD (Section 3.3.3), this approach offers a robust and interpretable decision-support tool. It enables breeders, advisors, and farmers to base variety recommendations not only on potential yield but also on yield stability and predictability under diverse and real-world growing conditions. Such stratified deployment strategies align with the principles of precision agriculture by improving resource use efficiency, enhancing agroclimatic resilience, and ultimately narrowing the yield gap (the discrepancy between potential and actual farm-level yields) through better cultivar-environment matching [4,5,6,34,35].

4. Conclusions

This study demonstrates the utility of a regression-based framework for cultivar-specific yield prediction and recommendation, using multiyear, multilocation trial data. By explicitly modeling cultivar responsiveness to environmental productivity levels, we evaluated and ranked winter wheat cultivars based on both average yield performance and yield stability across a broad spectrum of agronomic conditions.

Several cultivars exhibited broad adaptation, maintaining high performance across low-, medium-, and high-productivity environments. Other cultivars showed more specific adaptation, excelling under low-input or high-input conditions. External validation with 2024 data confirmed the robustness of the model, with RMSE/SD ratios below 0.30 for most cultivars and a Pearson correlation of 0.958 between predicted and observed yields in the validation dataset. The classification of cultivars based on regression slope values allowed for their practical assignment to recommendation categories (low-productivity environments, high-productivity environments, or broad adaptation), enhancing the precision and usability of variety advisory systems.

Moreover, this cultivar ranking and recommendation framework offers a data-driven solution to narrowing the yield gap—the difference between potential and realized on-farm yields. A significant portion of this gap arises from inadequate cultivar–environment matching. By aligning cultivar choice with site-specific productivity potential, this approach contributes to closing the yield gap.

Although developed for winter wheat, the methodology is readily transferable to other crops, provided that structured multilocation trial data are available. As such, it opens new opportunities for application in national testing networks, plant breeding pipelines, and precision agriculture platforms, promoting more resilient, efficient, and sustainable crop production systems.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/agronomy15102309/s1.

Author Contributions

Conceptualization, M.I., J.P. and M.S.; methodology, M.I. and J.P.; software, J.P.; validation, M.I. and J.P.; formal analysis, J.P. and M.S.; data curation, J.P.; writing—original draft preparation, M.I., J.P. and M.S.; writing—review and editing, M.I., J.P. and M.S.; visualization, M.I., J.P. and M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Dataset available on request from the authors.

Acknowledgments

The authors acknowledge the Research Centre for Cultivar Testing (COBORU), Poland for providing the data used in this research.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AMMI	Additive Main Effects and Multiplicative Interaction (analysis method)
COBORU	Centralny Ośrodek Badania Odmian Roślin Uprawnych (Research Centre for Cultivar Testing, Poland)
IHAR-PIB	Instytut Hodowli i Aklimatyzacji Roślin – Państwowy Instytut Badawczy (Institute of Plant Breeding and Acclimatization – National Research Institute, Poland)
IUNG-PIB	Instytut Uprawy Nawożenia i Gleboznawstwa – Państwowy Instytut Badawczy (Institute of Soil Science and Plant Cultivation – State Research Institute, Poland)
GGE	Genotype + Genotype × Environment (biplot analysis method)
G×E	Genotype-by-environment (interaction)
L × Y × M	Location × Year × Management (combined trial factors)
MIM	Management intensity (treatment factor in trials)
RMSE	Root mean square error
SD	Standard deviation
R²	Coefficient of determination
RMSE/SD	Ratio of prediction error to standard deviation

References

1. Federizzi, L.C.; Carbonell, S.A.M.; Pacheco, M.T.; Nava, I.C. Breeders’ Work after Cultivar Development: The Stage of Recommendation. Crop Breed. Appl. Biotechnol. 2012, 12, 67–74.
2. Qian, J.; Zhao, Z. Estimating the Contribution of New Seed Cultivars to Increases in Crop Yields: A Case Study for Corn. Sustainability 2017, 9, 1282.
3. FAOSTAT. Statistical Database. Food and Agriculture Organization of the United Nations. 2023. Available online: https://www.fao.org/faostat/ (accessed on 5 August 2025).
4. Dziechciarz, M.; Kaczmarek, J.; Rolbiecki, R. Production and Economic Importance of Winter Wheat in Poland. Sci. J. Warsaw Univ. Life Sci. Probl. World Agric. 2020, 20, 55–63. (In Polish)
5. Shiferaw, B.; Smale, M.; Braun, H.-J.; Duveiller, E.; Reynolds, M.; Muricho, G. Crops That Feed the World 10: Past Successes and Future Challenges to the Role Played by Wheat in Global Food Security. Food Secur. 2013, 5, 291–317.
6. Wójcik-Gront, E.; Iwańska, M.; Wnuk, A.; Oleksiak, T. The Analysis of Wheat Yield Variability Based on Experimental Data from 2008–2018 to Understand the Yield Gap. Agriculture 2022, 12, 32.
7. Yan, W.; Tinker, N.A. Biplot Analysis of Multi-Environment Trial Data: Principles and Applications. Can. J. Plant Sci. 2006, 86, 623–645.
8. van Eeuwijk, F.A.; Bustos-Korts, D.V.; Malosetti, M. What Should Students in Plant Breeding Know about the Statistical Aspects of Genotype × Environment Interactions? Crop Sci. 2016, 56, 2119–2140.
9. Li, X.; Bai, G.; Carver, B.; Chao, S. Genotype-by-Environment Interaction and Stability Analysis in Multi-Environment Trials of Wheat. Field Crops Res. 2020, 255, 107866.
10. Finlay, K.W.; Wilkinson, G.N. The Analysis of Adaptation in a Plant-Breeding Programme. Aust. J. Agric. Res. 1963, 14, 742–754.
11. Becker, H.C.; Leon, J. Stability Analysis in Plant Breeding. Plant Breed. 1988, 101, 1–23.
12. Gauch, H.G. Statistical Analysis of Yield Trials by AMMI and GGE. Crop Sci. 2006, 46, 1488–1500.
13. Yan, W.; Holland, J.B. A Heritability-Adjusted GGE Biplot for Test Environment Evaluation. Euphytica 2010, 171, 355–369.
14. Piepho, H.P.; Büchse, A.; Emrich, K. A Stage-Wise Approach for the Analysis of Multi-Environment Trials. Biometr. J. 2014, 56, 761–777.
15. Smith, A.; Cullis, B.; Thompson, R. The Analysis of Crop Cultivar Breeding and Evaluation Trials: An Overview of Current Mixed Model Approaches. J. Agric. Sci. 2005, 143, 449–462.
16. Crossa, J.; Pérez-Rodríguez, P.; Cuevas, J.; Montesinos-López, O.; Jarquín, D.; de Los Campos, G.; Burgueño, J.; González-Camacho, J.M.; Pérez-Elizalde, S.; Beyene, Y.; et al. Genomic Selection in Plant Breeding: Methods, Models, and Perspectives. Trends Plant Sci. 2017, 22, 961–975.
17. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2024; Available online: https://www.R-project.org/ (accessed on 5 August 2025).
18. Iwańska, M.; Paderewski, J.; Stępień, M.; Rodrigues, P.C. Winter Wheat Cultivar Recommendation Based on Expected Environment Productivity. Agriculture 2021, 11, 522.
19. COBORU. Results of Winter Wheat Cultivar Trials 2015–2024; Centralny Ośrodek Badania Odmian Roślin Uprawnych (COBORU): Słupia Wielka, Poland, 2024.
20. Iwańska, M.; Paderewski, J.; Žukovskis, J.; Wnuk, A.; Oleksiak, T.; Rodrigues, P.C. Evaluating Cultivar Intensity and Dataset Size for Reliable Cultivar Recommendation in Winter Wheat: A Systematic Research of Environmental and Genotype Factors. Crop Sci. 2024, 64, 1666–1677.
21. Piepho, H.P.; Möhring, J.; Melchinger, A.E.; Büchse, A. BLUP for Phenotypic Selection in Plant Breeding and Variety Testing. Euphytica 2003, 161, 209–228.
22. Bates, D.; Maechler, M.; Bolker, B.; Walker, S. Fitting Linear Mixed-Effects Models Using lme4. J. Stat. Softw. 2015, 67, 1–48.
23. Kuznetsova, A.; Brockhoff, P.B.; Christensen, R.H.B. lmerTest Package: Tests in Linear Mixed Effects Models. J. Stat. Softw. 2017, 82, 1–26.
24. Technow, F.; Messina, C.D.; Totir, L.R.; Cooper, M. Integrating Crop Growth Models with Whole Genome Prediction through Approximate Bayesian Computation. PLoS ONE 2015, 10, e0130855.
25. Heslot, N.; Yang, H.-P.; Sorrells, M.E.; Jannink, J.-L. Genomic Selection in Plant Breeding: A Comparison of Models. Crop Sci. 2014, 54, 89–106.
26. Leśniowska, K.; Wanic, M. Procedures for the Evaluation of Cereal Cultivars in Poland—Criteria and Intensity of Recommendation Selection. Biul. COBORU 2019.
27. Gozdowski, D.; Stępień, M.; Samborski, S.; Dobers, E.S.; Szatyłowicz, J.; Chormański, J. Determination of the Most Relevant Soil Properties for the Delineation of Management Zones in Production Fields. Commun. Soil Sci. Plant Anal. 2014, 45, 2289–2304.
28. Gozdowski, D.; Leszczyńska, E.; Stępień, M.; Rozbicki, J.; Samborski, S. Within-Field Variability of Winter Wheat Yield and Grain Quality versus Soil Properties. Commun. Soil Sci. Plant Anal. 2017, 48, 1029–1041.
29. Panek, E.; Gozdowski, D.; Stępień, M.; Samborski, S.; Ruciński, D.; Buszke, B. Within-Field Relationships between Satellite-Derived Vegetation Indices, Grain Yield and Spike Number of Winter Wheat and Triticale. Agronomy 2020, 10, 1842.
30. IUNG-PIB. Characteristics of Arable Soils in Poland; IUNG-PIB: Puławy, Poland, 2021.
31. Czarnowski, F.; Truszkowska, R. Commentary on the Soil Classification Table for the Evaluation of Arable Land in Plains, Highlands, and Lowlands, Including Regional Instructions for Mountainous Areas and Commentary on the Evaluation of Grassland and Forest Soils for Soil Classifiers and Cartographers; Institute of Soil Science and Plant Cultivation IUNG-PIB: Puławy, Poland, 1963. (In Polish)
32. UTKG. Official Table of Land Classes; Attachment to a Regulation of the Council of Ministers on Soil Classification, Poland. 2012. Available online: https://eli.gov.pl/eli/DU/2012/1246/ogl (accessed on 7 August 2025). (In Polish)
33. GUS. Statistical Yearbook of Agriculture 2024; Statistics Poland: Warsaw, Poland, 2024.
34. Lobell, D.B.; Cassman, K.G.; Field, C.B. Crop Yield Gaps: Their Importance, Magnitudes, and Causes. Annu. Rev. Environ. Resour. 2009, 34, 179–204.
35. van Ittersum, M.K.; Cassman, K.G.; Grassini, P.; Wolf, J.; Tittonell, P.A.; Hochman, Z. Yield Gap Analysis with Local to Global Relevance—A Review. Field Crops Res. 2013, 143, 4–17.

Figure 1. Schematic flowchart of all procedures.

Figure 2. Frequency distribution of the number of yield observations per winter wheat cultivar. (a) Training dataset (2015–2023), including all combinations of location × year × management level; (b) Full dataset (2015–2024), including the independent validation year. The distribution remains right-skewed, with most cultivars evaluated in fewer than 100 environments.

Figure 3. Observed vs. predicted yields for the 2024 season based on cultivar-specific regression models using adjusted environmental means. Each point represents a single cultivar × environment observation. The close alignment along the identity line (y = x) indicates strong agreement and confirms the model’s ability to accurately capture yield variation under real-world field conditions.

Table 1. Summary of winter wheat yield distributions by location (training: 2015–2023; validation: 2024).

Location	Period	Soil Quality Class	Land Suitability Group	Land Suitable Mainly for	Mean ± SD Yield (t/ha)
Cicibór Duży	2015–2023	IIIb	4	rye and wheat	7.99 ± 1.69
Cicibór Duży	2024	IIIb	4	rye and wheat	6.50 ± 0.54
Czesławice	2015–2023	IIIa	2	wheat	9.53 ± 1.40
Czesławice	2024	IIIa	2	wheat	11.57 ± 0.74
Głębokie	2015–2023	IIIa	2	wheat	7.98 ± 2.25
Głębokie	2024	IIIa	2	wheat	7.71 ± 0.53
Głubczyce	2015–2023	II	1	wheat	10.93 ± 1.16
Głubczyce	2024	II	1	wheat	10.54 ± 1.34
Krościna Mała	2015–2023	IIIa, IIIb, IVa, IVb	2 and 4	wheat; rye and wheat	9.36 ± 1.54
Krościna Mała	2024	IIIa	2	wheat	9.68 ± 1.09
Marianowo	2015–2023	IIIb, IVa, IVb	4 and 5	rye and wheat; rye	9.24 ± 1.91
Marianowo	2024	IIIb	4	rye and wheat	10.54 ± 0.65
Masłowice	2015–2023	IIIb, IVb	4 and 5	rye and wheat; rye	9.25 ± 1.51
Masłowice	2024	IIIb	4	rye and wheat	9.12 ± 0.95
Nowa Wieś Ujska	2015–2023	IIIa, IIIb, IVa	2 and 4	wheat; rye and wheat	7.22 ± 1.77
Nowa Wieś Ujska	2024	IVa	4	rye and wheat	7.32 ± 1.27
Pawłowice	2015–2023	IIIb	2	wheat	8.80 ± 2.28
Pawłowice	2024	IIIb	2	wheat	9.02 ± 0.72
Radostowo	2015–2023	II	1	wheat	10.28 ± 2.82
Radostowo	2024	II	1	wheat	10.46 ± 0.89
Rarwino	2015–2023	IIIb, IVa, IVb	4 and 5	rye and wheat; rye	9.03 ± 1.61
Rarwino	2024	IVa	5	rye	9.95 ± 0.93
Rychliki	2015–2023	IIIb, IVa	2	wheat	9.79 ± 1.58
Rychliki	2024	IIIb	2	wheat	9.78 ± 0.93
Seroczyn	2015–2023	IIIb, IVa	4 and 5	rye and wheat; rye	8.42 ± 1.74
Seroczyn	2024	IIIb	4	rye and wheat	9.64 ± 0.81
Skołoszów	2015–2023	II	1	wheat	9.43 ± 1.82
Skołoszów	2024	II	1	wheat	10.98 ± 1.29
Słupia	2015–2023	IIIa	2	wheat	10.83 ± 1.85
Słupia	2024	IIIa	2	wheat	4.42 ± 0.98
Tomaszów Bol.	2015–2023	IVa, IVb	3 and 5	wheat; rye	6.30 ± 1.46
Tomaszów Bol.	2024	IVb	5	rye	4.23 ± 0.56
Węgrzce	2015–2023	II	1	wheat	9.40 ± 1.34
Węgrzce	2024	II	1	wheat	7.89 ± 0.99
Zybiszów	2015–2023	II, IIIa	1 and 2	wheat	10.23 ± 1.53
Zybiszów	2024	IIIa	2	wheat	11.92 ± 0.99
Świebodzin	2015–2023	IIIa, IIIb, IVa	2, 3 and 4	wheat; rye and wheat	9.23 ± 3.07
Świebodzin	2024	IIIa	4	rye and wheat	5.73 ± 0.85

Source: Information from COBORU. Note: Yields are expressed in t/ha and calculated across all cultivar × management × replicate observations. Soil quality class definitions [31,32]: II: very good arable soils; IIIa: good arable soils; IIIb: moderately good arable soils; IVa: arable soils of medium quality (better); IVb: arable soils of medium quality (worse). Land suitability group definitions: 1: very good suitability for wheat; 2: good suitability for wheat; 3: limited suitability for wheat; 4: very good suitability for rye; 5: good suitability for rye.

Table 2. Number of winter wheat yield observations per location and year (2015–2024), before data filtering.

Location	Training Dataset									Testing Dataset	Sum
Location	2015	2016	2017	2018	2019	2020	2021	2022	2023	2024	2015–2024
Cicibór Duży	102	98	108	68	82	100	110	108	130	144	1050
Czesławice	102	98	108	68	82	100	110	108	130	144	1050
Głębokie	98	96	106	70	86	108	110	108	130	144	1056
Głubczyce	106	106	106	72	86	138	110	108	130	144	1106
Marianowo	98	0	100	64	82	106	110	108	130	144	942
Nowa Wieś Ujska	104	100	104	74	90	107	110	-	130	144	963
Pawłowice	98	98	112	78	84	98	110	108	130	144	1060
Radostowo	105	100	106	76	90	102	110	108	130	144	1071
Rarwino	104	-	120	68	86	100	110	108	130	144	970
Rychliki	100	96	102	68	84	100	110	108	130	144	1042
Seroczyn	98	94	100	70	86	107	110	108	130	144	1047
Skołoszów	96	94	104	66	84	100	110	108	130	144	1036
Słupia	98	96	106	72	94	134	110	108	130	144	1090
Świebodzin	96	98	-	66	82	100	110	-	130	144	826
Węgrzce	98	94	106	74	84	132	110	108	130	144	1076
Zybiszów	102	96	108	76	94	106	110	108	130	144	1074
Krościna Mała	98	96	108	76	94	106	110	108	130	144	1070
Masłowice	100	94	100	70	84	102	110	108	130	144	1042
Tomaszów Bol.	98	96	108	76	94	106	110	108	130	144	1070
Sum	1901	1650	1912	1352	1642	2052	2090	1836	2470	2736	19,641

Source: Information from COBORU. Note: Each value represents the total number of cultivar-level yield observations at a given trial location in a specific year, summed across all replications and both management intensity levels (standard and intensive). These are raw counts prior to the removal of outliers and the exclusion of cultivars with fewer than 30 observations.

Table 3. Analysis of variance (ANOVA) for fixed effects in the linear mixed model of winter wheat yield (training dataset, 2015–2023).

Source	SS	MS	NumDF	DenDF	F	p-Value
Location	3582	199	18	7977	1717	<0.001
Management Intensity (MIM)	1088	1088	1	135	9386	<0.001
Year	1170	146	8	7728	1263	<0.001
Location × MIM	428	24	18	7973	205	<0.001
Location × Year	3226	23	139	7976	200	<0.001
MIM × Year	87	11	8	3820	94	<0.001
Location × MIM × Year	769	6	139	7965	48	<0.001

Note: SS—sum of squares; MS—mean square; NumDF—numerator degrees of freedom; DenDF—denominator degrees of freedom; F—F-statistic; p—significance level.

Table 4. Predictive performance comparison: Full model vs. Simplified reference model.

Metric	Full Model (Regression-Based)	Simplified Reference Model
Pearson correlation (r)	0.958	0.502
RMSE [t/ha]	0.45	2.08
Cultivar-specific effects	✓ Yes	✗ No
G×E interactions modeled	✓ Yes	✗ No
Data used for prediction	Cultivar × environment (2015–2023)	Location × management (2015–2023)
Suitability for recommendations	High (individualized)	Low (aggregated)

Table 5. Top 10 winter wheat cultivars with the highest prediction accuracy (lowest RMSE/SD), validated on 2024 season data.

Rank	Cultivar	Predicted yield (t/ha) at 7 t/ha	Predicted yield (t/ha) at 9 t/ha	Predicted yield (t/ha) at 11 t/ha	Rank at 7 t/ha	Rank at 9 t/ha	Rank at 11 t/ha	Group	RMSE (t/ha)	SD (t/ha)	RMSE/SD	R²
1	SU Banatus	7.36	9.50	11.64	24	13	12	Top Prediction Accuracy	0.39	2.31	0.17	0.95
2	Comandor	6.88	8.94	11.00	99	94	85		0.42	2.42	0.17	0.95
3	Symetria	7.16	9.21	11.27	51	45	42		0.42	2.41	0.17	0.88
4	Chevignon	7.61	9.72	11.84	9	4	3		0.45	2.55	0.18	0.95
5	Callistus	7.28	9.22	11.16	37	44	60		0.49	2.40	0.20	0.94
6	Asory	6.94	9.11	11.28	90	59	39		0.53	2.49	0.21	0.94
7	RGT Bilanz	7.245	9.28	11.32	40	37	34		0.52	2.43	0.22	0.94
8	Bulldozer	8.11	9.92	11.74	1	2	8		0.50	2.25	0.22	0.80
9	Revolver	7.49	9.61	11.72	16	9	9		0.54	2.40	0.23	0.94
10	Knut	7.44	9.49	11.54	20	17	17		0.55	2.36	0.23	0.93

Table 6. Top 10 cultivars with the lowest model performance (highest RMSE/SD and lowest R²).

Rank	Cultivar	Predicted yield (t/ha) at 7 t/ha	Predicted yield (t/ha) at 9 t/ha	Predicted yield (t/ha) at 11 t/ha	Rank at 7 t/ha	Rank at 9 t/ha	Rank at 11 t/ha	Group	RMSE (t/ha)	SD (t/ha)	RMSE/SD	R²
1	LG Nida	6.79	9.00	11.20	118	73	51	Lowest Prediction Accuracy	1.93	2.76	0.70	0.92
2	KWS Donovan	7.62	9.69	11.76	8	7	6		1.43	2.47	0.58	0.92
3	Bosporus	7.01	9.10	11.18	78	64	57		1.14	2.18	0.52	0.91
4	SU Mangold	6.82	9.18	11.54	113	48	16		1.11	2.55	0.44	0.94
5	Bright	7.66	9.54	11.42	6	11	26		0.99	2.23	0.44	0.90
6	Adrenalin	7.51	9.46	11.41	14	20	28		0.85	2.40	0.35	0.94
7	Arevus	7.41	9.51	11.61	21	12	14		0.75	2.12	0.35	0.93
8	Tonnage	7.18	9.47	11.76	48	19	5		0.76	2.17	0.35	0.90
9	LG Keramik	7.55	9.55	11.56	10	10	15		0.88	2.44	0.36	0.90
10	SU Willem	7.00	9.20	11.40	82	46	29		0.88	2.24	0.39	0.86
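
The RMSE/SD column in Tables 5 and 6 scales the prediction error by the spread of yields, so that cultivars can be compared on a unitless basis. The sketch below recomputes the ratio for the first row of each table. It assumes that SD is the standard deviation of the cultivar's observed yields, which this section does not state explicitly, and the helper name `rmse_sd_ratio` is hypothetical.

```python
# (RMSE, SD) pairs in t/ha, transcribed from Table 5 (SU Banatus) and Table 6 (LG Nida).
ROWS = {
    "SU Banatus": (0.39, 2.31),
    "LG Nida": (1.93, 2.76),
}


def rmse_sd_ratio(rmse, sd):
    """Unitless ratio of prediction error to yield spread; lower is better."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return rmse / sd


if __name__ == "__main__":
    for cultivar, (rmse, sd) in ROWS.items():
        print(f"{cultivar}: RMSE/SD = {rmse_sd_ratio(rmse, sd):.2f}")
```

Ratios recomputed from the rounded table values can differ from the tabulated ratio in the second decimal place, because the published ratios were presumably calculated from unrounded RMSE and SD.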

Table 7. Top 20 broadly adapted winter wheat cultivars based on cumulative ranking across productivity levels.

Cultivar	Rank at 7 t/ha	Rank at 9 t/ha	Rank at 11 t/ha	Sum of Ranks	Intercept	Slope	R²	Recommendation
Bulldozer	1	2	8	11	1.75	0.91	0.80	Low-productivity environments
SY Cellist	7	3	2	12	0.10	1.08	0.95	High-productivity environments
Chevignon	9	4	3	16	0.20	1.06	0.95	High-productivity environments
LG Mondial	3	5	11	19	1.00	0.97	0.91	Consistently top-performing
SU Tarroca	19	1	1	21	−1.53	1.28	0.93	High-productivity environments
KWS Donovan	8	7	6	21	0.37	1.03	0.92	Consistently top-performing
Hyvega	5	6	10	21	0.89	0.98	0.88	Consistently top-performing
Revolver	16	9	9	34	0.10	1.06	0.94	Consistently top-performing
RGT Ritter	2	8	24	34	1.59	0.90	0.85	Consistently top-performing
LG Keramik	10	10	15	35	0.52	1.00	0.90	Consistently top-performing
Bright	6	11	26	43	1.08	0.94	0.90	Consistently top-performing
Arevus	21	12	14	47	0.06	1.05	0.93	Consistently top-performing
Venecja	13	14	21	48	0.62	0.99	0.93	Consistently top-performing
SU Banatus	24	13	12	49	−0.13	1.07	0.95	Consistently top-performing
Knut	20	17	17	54	0.26	1.03	0.93	Consistently top-performing
Adrenalin	14	20	28	62	0.69	0.97	0.94	Consistently top-performing
SU Geometry	44	15	4	63	−0.78	1.14	0.96	Consistently top-performing
LG Egmont	33	18	13	64	−0.28	1.08	0.97	Consistently top-performing
LG Mocca	41	16	7	64	−0.66	1.13	0.86	Consistently top-performing
Elektra	25	21	19	65	0.13	1.03	0.91	Consistently top-performing

Note: Cultivars were ranked at three representative environmental productivity levels: low (~7 t/ha), medium (~9 t/ha), and high (~11 t/ha), based on predicted yields from cultivar-specific regression models (see Section 2.2.3). Rankings at each level were summed to calculate cumulative performance, with lower totals indicating broader adaptation and greater yield stability. Regression parameters (intercept, slope, and R²) describe model fit and cultivar responsiveness. The “Recommendation” column was assigned based on slope values: cultivars with slope values ≥ 1.10 were classified as suited for high-productivity environments due to their strong responsiveness; those with slope values ≤ 0.95 were identified as suitable for low-productivity or marginal environments, reflecting their performance stability under suboptimal conditions; cultivars with slope values between 0.96 and 1.09 were considered broadly adapted, performing consistently across diverse agroecological settings.
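
The ranking procedure described in the note can be sketched as follows, using intercept and slope values transcribed from Table 7 for three cultivars. The function names are hypothetical, and the sketch assumes that predicted yield is a simple linear function of the environmental mean (intercept + slope × environmental mean). Ranks computed within this three-cultivar subset will not match the table, where ranks were assigned among all cultivars. The slope-based labels apply only the thresholds stated in the note and are not guaranteed to reproduce the Recommendation column.

```python
# (intercept, slope) pairs transcribed from Table 7, rounded to two decimals.
PARAMS = {
    "Bulldozer": (1.75, 0.91),
    "SU Tarroca": (-1.53, 1.28),
    "LG Mondial": (1.00, 0.97),
}
LEVELS = (7.0, 9.0, 11.0)  # low, medium and high productivity in t/ha


def predict_yield(intercept, slope, env_mean):
    """Predicted cultivar yield (t/ha) at a given environmental mean yield."""
    return intercept + slope * env_mean


def classify_by_slope(slope):
    """Label a cultivar using the slope thresholds given in the table note."""
    if slope >= 1.10:
        return "high-productivity environments"
    if slope <= 0.95:
        return "low-productivity environments"
    return "broadly adapted"


def rank_sums(params, levels=LEVELS):
    """Sum each cultivar's rank (1 = highest predicted yield) over all levels."""
    totals = {name: 0 for name in params}
    for level in levels:
        ordered = sorted(
            params,
            key=lambda name: predict_yield(*params[name], level),
            reverse=True,
        )
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank
    return totals


if __name__ == "__main__":
    for name, (intercept, slope) in PARAMS.items():
        yields = [round(predict_yield(intercept, slope, lv), 2) for lv in LEVELS]
        print(name, yields, classify_by_slope(slope))
    print(rank_sums(PARAMS))
```

With the rounded Table 7 parameters, Bulldozer at 7 t/ha gives 1.75 + 0.91 × 7 = 8.12 t/ha, close to the 8.11 t/ha listed in Table 5. The small gap is consistent with rounding of the published intercept and slope.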

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
