2.1. Theoretical Derivation of the Unified AMMI-GGE (UAG) Model
To establish a unified analytical framework for genotype-by-environment (G × E) data, the Unified AMMI-GGE (UAG) model was formulated as a continuous generalization of the classical AMMI and GGE approaches. The detailed mathematical formulation is presented below, providing the theoretical foundation for subsequent empirical evaluation using triticale multi-environment trials.
2.1.1. Theoretical Background and Motivation
The response of genotypes ($g = 1, \dots, G$) to environments ($e = 1, \dots, E$) is classically analyzed through bilinear models that partition observed variation into additive main effects and multiplicative interaction terms. In the standard two-way layout,
$$y_{ge} = \mu + \gamma_g + \eta_e + \theta_{ge} + \varepsilon_{ge},$$
where $y_{ge}$ is the mean yield (or trait value) of genotype $g$ in environment $e$, $\mu$ is the grand mean, $\gamma_g$ and $\eta_e$ are additive genotype and environment main effects, and $\theta_{ge}$ is the interaction deviation. The residuals $\varepsilon_{ge}$ are assumed to be independent, with a mean of zero and a variance of $\sigma^2 / w_{ge}$, where the precision weights $w_{ge}$ reflect replication and sampling precision.
Two canonical models approximate centered transformations of the data by low-rank bilinear forms:
1. AMMI (Additive Main effects and Multiplicative Interaction) applies double centering, removing both genotype and environment means, and performs a singular value decomposition (SVD) of the residual $y_{ge} - \bar{y}_{g\cdot} - \bar{y}_{\cdot e} + \bar{y}_{\cdot\cdot}$, where a bar denotes the mean over the dotted index. The model captures a pure interaction structure and is statistically orthogonal to the main effects;
2. GGE (Genotype plus Genotype-by-Environment) removes only the environment means and applies SVD to $y_{ge} - \bar{y}_{\cdot e}$, thus analyzing the combined genotype-plus-interaction (G + G × E) term. The GGE biplot retains both discriminatory and predictive components of genotypic performance, but the inclusion of the genotype main effect in the multiplicative term sacrifices the strict interpretability of the interaction.
Both models correspond to rank-restricted approximations of distinct linear transformations of the data matrix $Y$. The Unified AMMI-GGE (UAG) model generalizes them by constructing a continuous one-parameter family of transformations $R_\alpha$ with $\alpha \in [0, 1]$ that interpolates between the AMMI ($\alpha = 0$) and GGE ($\alpha = 1$) residual fields. This formulation allows the data to determine, by cross-validation, the level of centering that minimizes prediction error or maximizes explained variation.
2.1.2. Weighted Hilbert Space and Preliminaries
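Let $Y = (y_{ge}) \in \mathbb{R}^{G \times E}$ denote the matrix of genotype–environment cell means, and $W = (w_{ge})$ a $G \times E$ matrix of nonnegative weights, with $w_{ge} > 0$ for observed cells and $w_{ge} = 0$ otherwise. We define the weighted Frobenius inner product and norm as:
$$\langle A, B \rangle_W = \sum_{g=1}^{G} \sum_{e=1}^{E} w_{ge}\, a_{ge}\, b_{ge}, \qquad \|A\|_W = \sqrt{\langle A, A \rangle_W}.$$
In the present empirical application, weights were set to $w_{ge} = 1$ for all observed cells and $w_{ge} = 0$ for missing cells, reflecting the balanced fully replicated design. Under uniform weights, the weighted Frobenius inner product reduces to the standard Frobenius inner product, and the WALS algorithm converges to the classical unweighted SVD solution. For unbalanced designs, $w_{ge}$ may be specified proportional to cell replication ($w_{ge} \propto n_{ge}$) or to the inverse of the cell-mean prediction error variance ($w_{ge} = n_{ge} / \hat{\sigma}_e^2$), where $n_{ge}$ denotes the number of replications in the cell and $\hat{\sigma}_e^2$ is the within-environment error variance estimate.
The pair $(\mathbb{R}^{G \times E}, \langle \cdot, \cdot \rangle_W)$ thus forms a finite-dimensional Hilbert space when all weights are positive (with zero weights, the inner product acts on the observed cells only), allowing orthogonal projections and linear operators to be defined under the weighted metric.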
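The weighted inner product and norm can be illustrated with a minimal NumPy sketch. The function names `w_inner` and `w_norm` and the small example matrices are hypothetical, chosen only for illustration; a zero weight marks a missing cell.

```python
import numpy as np

def w_inner(A, B, W):
    """Weighted Frobenius inner product: sum over cells of w_ge * a_ge * b_ge."""
    return float(np.sum(W * A * B))

def w_norm(A, W):
    """Weighted Frobenius norm induced by w_inner."""
    return float(np.sqrt(w_inner(A, A, W)))

# Hypothetical 3 x 2 table of cell means; the last cell is missing (weight 0).
Y = np.array([[4.0, 5.0], [6.0, 7.0], [8.0, 0.0]])
W = np.array([[1.0, 1.0], [1.0, 1.0], [1.0, 0.0]])

print(w_inner(Y, Y, W))              # 16 + 25 + 36 + 49 + 64 = 190
print(w_norm(Y, W))                  # square root of 190
print(np.linalg.norm(Y * (W > 0)))   # same value: unit weights on observed cells
```

With unit weights on observed cells, the weighted norm coincides with the ordinary Frobenius norm of the masked matrix, which is the reduction described above.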
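Weighted environment means are computed as:
$$m_e = \frac{\sum_{g=1}^{G} w_{ge}\, y_{ge}}{\sum_{g=1}^{G} w_{ge}}, \qquad e = 1, \dots, E,$$
and assembled as the row vector $\mathbf{m} = (m_1, \dots, m_E)$. Similarly, after centering by environments, the weighted genotype means are:
$$c_g = \frac{\sum_{e=1}^{E} w_{ge}\, (y_{ge} - m_e)}{\sum_{e=1}^{E} w_{ge}}, \qquad g = 1, \dots, G,$$
collected as the column vector $\mathbf{c} = (c_1, \dots, c_G)^\top$.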
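As a small worked example of the two centering steps, the sketch below computes the weighted environment means and then the weighted genotype means of the environment-centered table. The helper name `weighted_means` and the toy data are assumptions for illustration; every row and column is assumed to contain at least one observed cell, otherwise the divisions are undefined.

```python
import numpy as np

def weighted_means(Y, W):
    """Return (m, c): weighted environment means, and weighted genotype
    means computed after subtracting the environment means."""
    m = (W * Y).sum(axis=0) / W.sum(axis=0)          # one value per environment
    c = (W * (Y - m)).sum(axis=1) / W.sum(axis=1)    # one value per genotype
    return m, c

# Hypothetical complete 3 x 2 table with unit weights.
Y = np.array([[4.0, 6.0], [2.0, 8.0], [6.0, 10.0]])
W = np.ones_like(Y)
m, c = weighted_means(Y, W)
print(m)  # environment means: 4 and 8
print(c)  # genotype means after environment centering: -1, -1, 2
```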
2.1.3. Definition of the UAG Transformation
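The UAG transformation defines the centered field $R_\alpha$ as:
$$R_\alpha = Y - \mathbf{1}_G \mathbf{m} - (1 - \alpha)\, \mathbf{c}\, \mathbf{1}_E^\top, \qquad \alpha \in [0, 1],$$
where $\mathbf{m}$ is the row vector of weighted environment means, $\mathbf{c}$ is the column vector of weighted genotype means after environment centering, and $\mathbf{1}_G$, $\mathbf{1}_E$ are all-ones column vectors. This single formula yields both limiting cases:
$$R_0 = Y - \mathbf{1}_G \mathbf{m} - \mathbf{c}\, \mathbf{1}_E^\top \ \text{(AMMI)}, \qquad R_1 = Y - \mathbf{1}_G \mathbf{m} \ \text{(GGE)}.$$
Hence, $R_\alpha$ provides a continuous affine path between the pure-interaction (double-centered) field $R_0$ and the G + GE field $R_1$. In operator notation,
$$R_\alpha = \big[I - (1 - \alpha)\, P_G\big]\, C_E\, Y,$$
where $C_E$ is the projection that removes the environment means:
$$C_E Y = Y - \mathbf{1}_G \mathbf{m},$$
and $P_G$ is the projection onto the space spanned by genotype means after environment centering:
$$P_G Z = \mathbf{c}(Z)\, \mathbf{1}_E^\top, \qquad c_g(Z) = \frac{\sum_{e} w_{ge}\, z_{ge}}{\sum_{e} w_{ge}}.$$
The parameter $\alpha$ thus determines the strength with which genotype averages are reintroduced into the field analyzed by the SVD.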
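The limiting cases and the affine interpolation can be checked numerically. The following is a minimal sketch under the stated definition (environment means always removed, genotype means removed with strength 1 − α); the function name `uag_transform` and the random test table are illustrative assumptions.

```python
import numpy as np

def uag_transform(Y, W, alpha):
    """Centered field: Y minus environment means minus (1 - alpha) times
    the genotype means of the environment-centered table."""
    m = (W * Y).sum(axis=0) / W.sum(axis=0)
    c = (W * (Y - m)).sum(axis=1) / W.sum(axis=1)
    return Y - m[None, :] - (1.0 - alpha) * c[:, None]

rng = np.random.default_rng(0)
Y = rng.normal(size=(5, 3))      # hypothetical 5 genotypes x 3 environments
W = np.ones_like(Y)              # balanced design, unit weights

R0 = uag_transform(Y, W, 0.0)
R1 = uag_transform(Y, W, 1.0)
R03 = uag_transform(Y, W, 0.3)

# alpha = 0: double-centered, so rows and columns both average to zero.
print(np.allclose(R0.mean(axis=0), 0), np.allclose(R0.mean(axis=1), 0))
# alpha = 1: only the environment (column) means are removed.
print(np.allclose(R1, Y - Y.mean(axis=0)))
# Intermediate alpha lies on the straight line between the two fields.
print(np.allclose(R03, 0.7 * R0 + 0.3 * R1))
```

All three checks print `True` for a complete table with unit weights.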
2.1.4. Weighted Low-Rank Approximation Problem
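For fixed $\alpha$, the model seeks a rank-$K$ bilinear approximation to $R_\alpha = (r_{ge})$ minimizing the weighted residual sum of squares:
$$\min_{A \in \mathbb{R}^{G \times K},\ B \in \mathbb{R}^{E \times K}} J_\alpha(A, B) = \| R_\alpha - A B^\top \|_W^2 = \sum_{g=1}^{G} \sum_{e=1}^{E} w_{ge}\, \big(r_{ge} - \mathbf{a}_g^\top \mathbf{b}_e\big)^2,$$
where $\mathbf{a}_g^\top$ and $\mathbf{b}_e^\top$ are the rows of $A$ and $B$. The objective can be rewritten as a quadratic form:
$$J_\alpha(A, B) = \sum_{g=1}^{G} (\mathbf{r}_g - B \mathbf{a}_g)^\top W_g\, (\mathbf{r}_g - B \mathbf{a}_g), \qquad W_g = \mathrm{diag}(w_{g1}, \dots, w_{gE}),$$
with $\mathbf{r}_g$ the $g$-th row of $R_\alpha$ written as a column vector. Given $B$, minimizing over $A$ yields the normal equations:
$$(B^\top W_g B)\, \mathbf{a}_g = B^\top W_g\, \mathbf{r}_g, \qquad g = 1, \dots, G,$$
and, symmetrically, given $A$, each $\mathbf{b}_e$ solves $(A^\top W^{(e)} A)\, \mathbf{b}_e = A^\top W^{(e)}\, \mathbf{r}^{(e)}$, with $W^{(e)} = \mathrm{diag}(w_{1e}, \dots, w_{Ge})$ and $\mathbf{r}^{(e)}$ the $e$-th column of $R_\alpha$.
Alternating these updates defines the Weighted Alternating Least Squares (WALS) algorithm, which converges monotonically to a stationary point of $J_\alpha$. Under full-rank and positive-weight conditions, local convexity guarantees the uniqueness of the factor pair $(A, B)$ up to a rotation of the latent subspace.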
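The alternating updates can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `wals`, the ridge value, the tolerance, the iteration cap, and the random initialization are all assumptions made for the example.

```python
import numpy as np

def wals(R, W, K, ridge=1e-8, n_iter=500, tol=1e-10, seed=0):
    """Weighted alternating least squares for R ~ A @ B.T under weights W."""
    G, E = R.shape
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(G, K))
    B = rng.normal(size=(E, K))
    lam = ridge * np.eye(K)          # small ridge keeps the K x K systems solvable
    prev = np.inf
    for _ in range(n_iter):
        for g in range(G):           # genotype-wise normal equations
            BW = B * W[g][:, None]
            A[g] = np.linalg.solve(BW.T @ B + lam, BW.T @ R[g])
        for e in range(E):           # environment-wise normal equations
            AW = A * W[:, e][:, None]
            B[e] = np.linalg.solve(AW.T @ A + lam, AW.T @ R[:, e])
        obj = float(np.sum(W * (R - A @ B.T) ** 2))
        if prev - obj <= tol * max(obj, 1.0):   # relative-change stopping rule
            break
        prev = obj
    return A, B, obj

rng = np.random.default_rng(1)
R = rng.normal(size=(6, 4))          # hypothetical centered field
W = np.ones_like(R)
A, B, obj = wals(R, W, K=2)

# With unit weights the optimum is the truncated SVD, whose residual is
# the sum of the squared discarded singular values (Eckart-Young).
s = np.linalg.svd(R, compute_uv=False)
print(obj, float(np.sum(s[2:] ** 2)))
```

Under unit weights the two printed values agree to numerical precision, which matches the statement that WALS reduces to the classical SVD solution in balanced designs.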
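To obtain rotation-invariant scores, one performs a symmetric SVD on the fitted interaction:
$$\hat{R}_\alpha = A B^\top = U \Sigma V^\top, \qquad U^\top U = V^\top V = I_K,$$
where $I_K$ denotes the $K \times K$ identity matrix.
The biplot coordinates are then defined as:
$$S_G = U \Sigma^{1/2}, \qquad S_E = V \Sigma^{1/2}.$$
The reconstructed mean matrix in the original scale is:
$$\hat{Y}_{\alpha, K} = \mathbf{1}_G \mathbf{m} + (1 - \alpha)\, \mathbf{c}\, \mathbf{1}_E^\top + \hat{R}_\alpha.$$
The weighted proportion of variation explained in the transformed field is:
$$R^2_{\alpha, K} = 1 - \frac{\| R_\alpha - \hat{R}_\alpha \|_W^2}{\| R_\alpha \|_W^2},$$
where $R^2_{\alpha, K}$ denotes the weighted coefficient of determination of the rank-$K$ approximation in the $\alpha$-transformed field.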
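For a complete table with unit weights, the scores, the back-transformed means, and the explained proportion can be sketched directly from a truncated SVD. The function name `uag_fit_uniform` and the simulated table are assumptions for illustration, and the symmetric split of the singular values between genotype and environment scores follows the symmetric scaling mentioned in the text.

```python
import numpy as np

def uag_fit_uniform(Y, alpha, K):
    """Rank-K UAG fit for a complete table with unit weights, where the
    weighted problem reduces to a truncated SVD of the centered field."""
    m = Y.mean(axis=0)
    c = (Y - m).mean(axis=1)
    R = Y - m[None, :] - (1.0 - alpha) * c[:, None]
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    U, s, Vt = U[:, :K], s[:K], Vt[:K]
    geno_scores = U * np.sqrt(s)          # genotype biplot coordinates
    env_scores = Vt.T * np.sqrt(s)        # environment biplot coordinates
    R_hat = geno_scores @ env_scores.T    # fitted field
    Y_hat = m[None, :] + (1.0 - alpha) * c[:, None] + R_hat
    r2 = 1.0 - np.sum((R - R_hat) ** 2) / np.sum(R ** 2)
    return geno_scores, env_scores, Y_hat, r2

rng = np.random.default_rng(2)
Y = rng.normal(loc=50.0, scale=5.0, size=(8, 3))   # hypothetical 8 x 3 table
_, _, Y_hat2, r2_ammi = uag_fit_uniform(Y, alpha=0.0, K=2)
_, _, Y_hat3, r2_gge3 = uag_fit_uniform(Y, alpha=1.0, K=3)
print(r2_ammi)   # 1 up to rounding: a double-centered table with 3 columns has rank <= 2
print(r2_gge3)   # 1 up to rounding: K = 3 saturates a table with three environments
```

The two saturated cases illustrate why a two-component fit at α = 0 and a three-component fit at any α reproduce a three-environment table exactly.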
2.1.5. Model Selection and Cross-Validation
The parameters $\alpha$ and $K$ were estimated empirically by minimizing cross-validated prediction error in both the original yield space ($Y$) and the $\alpha$-transformed space ($Y_\alpha$). For each cross-validation scheme, two error surfaces were computed, one based on direct yield predictions and one based on the $\alpha$-weighted representation, allowing for simultaneous evaluation of the model regularization and predictive efficiency. Formulas were derived analytically for each validation scheme, following the general cross-validation framework of Stone [33] and Geisser [34].
The general form of the cross-validated mean squared error (CV-MSE) is given by:
$$\mathrm{CV\text{-}MSE}_S(\alpha, K) = \frac{\sum_{\mathcal{V}} \sum_{(g,e) \in \mathcal{V}} w_{ge}\, \big(s_{ge} - \hat{s}_{ge}^{(-\mathcal{V})}\big)^2}{\sum_{\mathcal{V}} \sum_{(g,e) \in \mathcal{V}} w_{ge}},$$
where $S \in \{Y, Y_\alpha\}$ denotes the evaluated data space with entries $s_{ge}$, $\hat{s}_{ge}^{(-\mathcal{V})}$ is the model prediction obtained after excluding the validation subset $\mathcal{V}$, and $w_{ge}$ represents optional observation weights. The optimal parameter pair $(\alpha^*, K^*)$ minimizes this criterion jointly across both spaces. The superscripts −e, −g, −(g,e), and −(g,−e) indicate that the corresponding environment, genotype, cell, or genotype–environment pair has been excluded from model fitting.
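In the four schemes below, $s_{ge}$ is the observed value in the evaluated space and $\hat{s}_{ge}$ with a superscript is the prediction from the fit that excludes the indicated subset.
The Leave-One-Environment-Out (LOEO) criterion is:
$$\mathrm{CV\text{-}MSE}^{\mathrm{LOEO}}_S(\alpha, K) = \frac{\sum_{e=1}^{E} \sum_{g=1}^{G} w_{ge}\, \big(s_{ge} - \hat{s}_{ge}^{\,-e}\big)^2}{\sum_{e=1}^{E} \sum_{g=1}^{G} w_{ge}}.$$
Each environment is sequentially removed from model fitting, and all genotype means for that environment are predicted. This design measures the ability of the UAG model to extrapolate genotype performance to untested sites.
The Leave-One-Genotype-Out (LOGO) criterion is:
$$\mathrm{CV\text{-}MSE}^{\mathrm{LOGO}}_S(\alpha, K) = \frac{\sum_{g=1}^{G} \sum_{e=1}^{E} w_{ge}\, \big(s_{ge} - \hat{s}_{ge}^{\,-g}\big)^2}{\sum_{g=1}^{G} \sum_{e=1}^{E} w_{ge}}.$$
Here, each genotype is excluded across all environments, and its performance is predicted from the remaining genotypes. This quantifies model generalization across the genotype dimension.
The Leave-One-Combination-Out (LOCO) criterion is:
$$\mathrm{CV\text{-}MSE}^{\mathrm{LOCO}}_S(\alpha, K) = \frac{\sum_{g,e} w_{ge}\, \big(s_{ge} - \hat{s}_{ge}^{\,-(g,e)}\big)^2}{\sum_{g,e} w_{ge}}.$$
In this most granular validation, each individual observation is left out in turn, and its value is predicted independently. This design provides the strictest internal consistency test and is sensitive to local overfitting.
The Two-Way Leave-One-Out (LOO) criterion is:
$$\mathrm{CV\text{-}MSE}^{\mathrm{LOO}}_S(\alpha, K) = \frac{\sum_{g,e} w_{ge}\, \big(s_{ge} - \hat{s}_{ge}^{\,-(g,-e)}\big)^2}{\sum_{g,e} w_{ge}}.$$
This joint scheme simultaneously excludes a genotype and an environment in each iteration, offering an integrated measure of model generalization across both biological dimensions.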
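Of the four schemes, leave-one-combination-out is the simplest to sketch, because a held-out cell can be hidden by giving it zero weight. The code below is an illustrative assumption, not the authors' procedure: the names `fit_predict` and `loco_cv_mse`, the ridge value, the fixed iteration count, the α grid, and the simulated table are all hypothetical, and only the original yield space with unit weights is covered.

```python
import numpy as np

def fit_predict(Y, W, alpha, K, ridge=1e-6, n_iter=50, seed=0):
    """Fit UAG by weighted ALS on cells with W > 0; return the full predicted table."""
    m = (W * Y).sum(axis=0) / W.sum(axis=0)
    c = (W * (Y - m)).sum(axis=1) / W.sum(axis=1)
    R = Y - m[None, :] - (1.0 - alpha) * c[:, None]
    G, E = Y.shape
    rng = np.random.default_rng(seed)
    A, B = rng.normal(size=(G, K)), rng.normal(size=(E, K))
    lam = ridge * np.eye(K)
    for _ in range(n_iter):
        for g in range(G):
            BW = B * W[g][:, None]
            A[g] = np.linalg.solve(BW.T @ B + lam, BW.T @ R[g])
        for e in range(E):
            AW = A * W[:, e][:, None]
            B[e] = np.linalg.solve(AW.T @ A + lam, AW.T @ R[:, e])
    return m[None, :] + (1.0 - alpha) * c[:, None] + A @ B.T

def loco_cv_mse(Y, alpha, K):
    """Leave-one-combination-out CV-MSE in the yield space, unit weights."""
    G, E = Y.shape
    sq_err = []
    for g in range(G):
        for e in range(E):
            W = np.ones_like(Y)
            W[g, e] = 0.0                       # hide one cell from the fit
            Y_hat = fit_predict(Y, W, alpha, K)
            sq_err.append((Y[g, e] - Y_hat[g, e]) ** 2)
    return float(np.mean(sq_err))

# Hypothetical data: additive effects, one multiplicative term, small noise.
rng = np.random.default_rng(3)
G, E = 6, 4
Y = (50 + 3 * rng.normal(size=(G, 1)) + 5 * rng.normal(size=(1, E))
     + np.outer(rng.normal(size=G), rng.normal(size=E))
     + 0.1 * rng.normal(size=(G, E)))

grid = [0.0, 0.3, 1.0]
scores = {a: loco_cv_mse(Y, a, K=1) for a in grid}
best_alpha = min(scores, key=scores.get)
print(scores, best_alpha)
```

Because the hidden cell has zero weight in the means and in both ALS updates, its observed value never influences the prediction it is compared against.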
All four validation schemes were computed in parallel for both $Y$ and $Y_\alpha$. The resulting surfaces of $\mathrm{CV\text{-}MSE}_{Y}(\alpha, K)$ and $\mathrm{CV\text{-}MSE}_{Y_\alpha}(\alpha, K)$ were compared to determine the most stable and generalizable configuration. In the present balanced dataset, CV-MSE values in the α-transformed space (Yα) were equivalent to those in the original yield space (Y), consistent with the fully replicated experimental structure. In unbalanced designs with missing cells, the two spaces are expected to diverge, with Yα providing improved regularization. This dual-space validation ensures that the final parameterization reflects both raw predictive capacity and the biologically interpretable structure captured through α-weighting.
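From the CV-MSE criterion, two derived predictive error statistics were computed for each validation scheme and parameter configuration. The Prediction Residual Sum of Squares (PRESS) aggregates the weighted squared prediction error across all validation observations:
$$\mathrm{PRESS}_S(\alpha, K) = \sum_{(g,e)} w_{ge}\, \big(s_{ge} - \hat{s}_{ge}^{(-\mathcal{V})}\big)^2,$$
where $s_{ge}$ is the observed value in the evaluated space $S$ and $\hat{s}_{ge}^{(-\mathcal{V})}$ is its held-out prediction.
The root mean square error (RMSE) expresses the prediction error on the original response scale:
$$\mathrm{RMSE}_S(\alpha, K) = \sqrt{\frac{\mathrm{PRESS}_S(\alpha, K)}{\sum_{(g,e)} w_{ge}}}.$$
For the present balanced design ($w_{ge} = 1$ for all observed cells), these reduce to:
$$\mathrm{PRESS}_S(\alpha, K) = \sum_{(g,e)} \big(s_{ge} - \hat{s}_{ge}^{(-\mathcal{V})}\big)^2$$
and
$$\mathrm{RMSE}_S(\alpha, K) = \sqrt{\mathrm{PRESS}_S(\alpha, K) / N},$$
where $N$ is the number of validation observations.
Both statistics were computed in parallel for $S \in \{Y, Y_\alpha\}$, yielding complementary error surfaces for each validation scheme.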
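A short worked example of the two statistics, using made-up held-out values and predictions; the function names `press` and `rmse` and the numbers are illustrative only.

```python
import math

def press(obs, pred, weights=None):
    """Weighted prediction residual sum of squares over held-out cells."""
    if weights is None:
        weights = [1.0] * len(obs)
    return sum(w * (y - p) ** 2 for w, y, p in zip(weights, obs, pred))

def rmse(obs, pred, weights=None):
    """Square root of PRESS divided by the total weight (N for unit weights)."""
    if weights is None:
        weights = [1.0] * len(obs)
    return math.sqrt(press(obs, pred, weights) / sum(weights))

# Hypothetical held-out yields and their cross-validated predictions.
obs = [5.0, 6.0, 7.0, 8.0]
pred = [4.0, 6.0, 9.0, 8.0]
print(press(obs, pred))   # 1 + 0 + 4 + 0 = 5.0
print(rmse(obs, pred))    # sqrt(5 / 4), about 1.118
```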
2.1.6. Geometric Interpretation: Mean-Stability Analysis
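Let $S_E^{(2)}$ and $S_G^{(2)}$ denote the first two columns of the environment and genotype score matrices. The “average environment coordinate” (AEC) direction is defined as
$$\mathbf{a} = \frac{\bar{\mathbf{s}}_E}{\|\bar{\mathbf{s}}_E\|}, \qquad \bar{\mathbf{s}}_E = \frac{1}{E}\, \big(S_E^{(2)}\big)^\top \mathbf{1}_E,$$
where the superscript (2) denotes the submatrix formed by the first two columns, of dimensions E × 2 and G × 2, respectively.
For genotype $g$, the projection onto $\mathbf{a}$ is:
$$p_g = S^{(2)}_{G,(g,\cdot)}\, \mathbf{a},$$
which represents its mean performance, and the perpendicular deviation
$$d_g = \big\| S^{(2)}_{G,(g,\cdot)} - p_g\, \mathbf{a}^\top \big\|,$$
where the subscript (g,⋅) denotes the g-th row vector of the genotype score matrix, quantifies instability. The scalar $d_g$ serves as a direct stability measure, reducing to the AMMI interaction principal component amplitude when $\alpha = 0$. A unified selection index combining productivity and stability can be expressed as
$$\mathrm{UAGI}_g = w\, z(\bar{y}_g) - (1 - w)\, z(d_g),$$
where $\bar{y}_g$ is the mean yield of genotype $g$, $z(\cdot)$ denotes standardization, and $w \in [0, 1]$ controls the emphasis on yield versus stability.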
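The mean-versus-stability geometry can be sketched with two-dimensional scores. The function names, the toy scores, and in particular the exact form of the selection index (standardized yield rewarded, standardized instability penalized, with weight `w`) are assumptions for illustration and may differ from the authors' definition.

```python
import numpy as np

def aec_stability(geno_scores, env_scores):
    """Project genotype markers onto the average-environment direction.
    Returns (p, d): signed projection and perpendicular distance."""
    S_g, S_e = geno_scores[:, :2], env_scores[:, :2]
    mean_env = S_e.mean(axis=0)
    a = mean_env / np.linalg.norm(mean_env)    # unit AEC direction
    p = S_g @ a
    d = np.linalg.norm(S_g - np.outer(p, a), axis=1)
    return p, d

def zscore(x):
    """Standardize to zero mean and unit sample standard deviation."""
    return (x - x.mean()) / x.std(ddof=1)

def selection_index(mean_yield, d, w=0.6):
    """Assumed index form: w * z(yield) - (1 - w) * z(instability)."""
    return w * zscore(mean_yield) - (1.0 - w) * zscore(d)

# Hypothetical scores: three genotypes, two environments.
geno = np.array([[2.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
env = np.array([[1.0, 0.5], [3.0, -0.5]])
p, d = aec_stability(geno, env)
print(p)   # projections 2, 0, -1: the mean environment marker is (2, 0)
print(d)   # perpendicular distances 0, 1, 1

index = selection_index(np.array([5.0, 6.0, 7.0]), d)
print(index)
```

One caveat: at α = 0 with unit weights the environment markers average to the zero vector, so the direction is undefined and the normalization above would divide by zero; in that case the distance from the origin is the natural stability measure.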
2.1.7. Boundary Behavior and Regularization Path
The transformation $\alpha \mapsto R_\alpha$ defines an affine path in the space of weighted matrices. Differentiating with respect to $\alpha$,
$$\frac{\partial R_\alpha}{\partial \alpha} = \mathbf{c}\, \mathbf{1}_E^\top,$$
where $\mathbf{c}$ is the column vector of weighted genotype means after environment centering. Hence, the trajectory is linear in $\alpha$. The corresponding low-rank approximation problem defines a “regularization path” analogous to ridge regression: increasing $\alpha$ reduces the bias induced by over-centering genotypes, while decreasing $\alpha$ yields a “cleaner” interaction but a potentially higher variance.
The optimal value $\alpha^*$, minimizing cross-validated MSE, can lie anywhere in $[0, 1]$. A boundary solution of $\alpha^* = 1$ (pure GGE) indicates that the data support retaining full genotypic variation in the multiplicative kernel, which is typical when environmental variance dominates or the replication noise inflates main-effect estimates. Conversely, interior values $0 < \alpha^* < 1$ occur when double centering captures a structure not explained by environment means alone. Thus, boundary optimality is not a contradiction but a legitimate outcome of the data-driven bias-variance equilibrium inherent in UAG.
Mathematically, the continuity of the SVD under smooth perturbations of $R_\alpha$ ensures that $U$, $\Sigma$, and $V$ vary continuously for all $\alpha$ except at eigenvalue crossings, guaranteeing consistent geometry along the centering path.
2.1.8. Statistical and Computational Properties
The UAG objective $J_\alpha(A, B)$ is separately convex in $A$ and $B$, enabling efficient alternating updates. Each ALS iteration solves genotype-wise and environment-wise weighted normal systems of size $K \times K$, leading to $O\big(GEK^2 + (G + E)K^3\big)$ computational complexity per iteration. The fitted factorization $\hat{R}_\alpha = A B^\top$ is unique up to orthogonal rotations of the factors; the symmetric SVD enforces a canonical orientation.
Decomposition of weighted sums of squares yields:
where the last two terms represent residual main effects and centering remainders. Missing data are handled by setting $w_{ge} = 0$, and $I$ is the identity operator on $\mathbb{R}^{G \times E}$. Regularization of normal equations with ridge penalties $\lambda I_K$ guarantees numerical stability in unbalanced designs without altering the fitted values asymptotically.
Under standard conditions (bounded weights, fixed rank $K$), $\hat{R}_\alpha$ is consistent for the best weighted rank-$K$ approximation of $\mathbb{E}[R_\alpha]$ in the Hilbert-space sense.
The framework connects naturally to mixed-model factor-analytic structures: replacing $Y$ with BLUP predictions and $W$ with inverse prediction-error variances yields a low-rank approximation equivalent to the estimated factor-analytic covariance component.
Model fitting was performed using the Weighted Alternating Least Squares (WALS) algorithm described in Section 2.1.4, with convergence declared when the relative change in the objective function between successive iterations fell below a preset tolerance. Numerical stability under near-singular configurations was ensured by adding a small ridge penalty to the diagonal of the normal equations. The significance of individual interaction principal components was assessed using the
$F$-test of Gollob [35], adapted for weighted bilinear models, with degrees of freedom for component $k$ computed as $df_k = G + E - 1 - 2k$. The proportion of weighted GEI variance explained by the rank-$K$ approximation was quantified by $R^2_{\alpha, K}$, as defined in Section 2.1.4. All computations were implemented in Python 3.11, as described in Section 4.6.
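As a small arithmetic illustration, Gollob's classical rule assigns G + E − 1 − 2k degrees of freedom to the k-th multiplicative component. Whether the weighted adaptation used here modifies this rule is not stated, so the sketch below shows only the classical form. The value G = 16 is an assumption inferred from the genotype labels in the results (twelve breeding lines plus four checks); E = 3 follows from the three environments.

```python
def gollob_df(G, E, k):
    """Classical Gollob degrees of freedom for the k-th multiplicative term."""
    return G + E - 1 - 2 * k

G, E = 16, 3    # assumed number of genotypes; three environments
dfs = [gollob_df(G, E, k) for k in (1, 2)]
print(dfs)                           # [16, 14]
print(sum(dfs), (G - 1) * (E - 1))   # 30 30
```

With three environments, the two components together exhaust the (G − 1)(E − 1) interaction degrees of freedom of a double-centered table.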
2.1.9. Analytical Implications and Theoretical Unification
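The UAG model unifies AMMI and GGE within a single analytical geometry:
$$R_\alpha = (1 - \alpha)\, R_0 + \alpha\, R_1, \qquad R_0 = R_{\mathrm{AMMI}}, \quad R_1 = R_{\mathrm{GGE}}.$$
Because $R_\alpha$ is linear in $\alpha$, both limiting forms and all intermediate states reside in the same affine subspace of $\mathbb{R}^{G \times E}$. Consequently, all classical interpretive tools (interaction principal component analysis, mean–stability biplots, which–won–where polygons, and AEC projections) are defined consistently across the continuum.
From a statistical learning perspective, UAG implements a ‘centering regularization’ governed by $\alpha$. It controls the degree of bias toward pure-interaction representation versus predictive generalization, with $\alpha = 0$ minimizing confounding and $\alpha = 1$ minimizing prediction error under large environmental variance. Thus, UAG forms a continuous bias–variance bridge between the two established paradigms.
The result is a theoretically complete, empirically tunable model family:
$$\hat{Y}_{\alpha, K} = \mathbf{1}_G \mathbf{m} + (1 - \alpha)\, \mathbf{c}\, \mathbf{1}_E^\top + \sum_{k=1}^{K} \lambda_k\, \mathbf{u}_k \mathbf{v}_k^\top,$$
where $\mathbf{m}$ and $\mathbf{c}$ are the weighted environment means and the weighted genotype means after environment centering, and $(\lambda_k, \mathbf{u}_k, \mathbf{v}_k)$ are the singular triplets of $R_\alpha$ in the weighted space: $\lambda_k$ is the $k$-th singular value (the $k$-th diagonal element of $\Sigma$), and $\mathbf{u}_k \in \mathbb{R}^G$ and $\mathbf{v}_k \in \mathbb{R}^E$ are the corresponding left and right singular vectors of $R_\alpha$. This expression encapsulates AMMI ($\alpha = 0$) and GGE ($\alpha = 1$) as exact special cases, while for intermediate values, $\alpha \in (0, 1)$ provides an optimal linear combination of their defining features.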
2.2. Empirical Evaluation of the UAG Model
The practical performance of the UAG framework was assessed using multi-environment triticale trials conducted between 2022 and 2024. Six configurations were evaluated under symmetric scaling: five with K = 2 (α = 0.0, 0.1, 0.3, 0.9, and 1.0) and one with K = 3 (α = 0.3). The dataset exhibited highly significant genotype (G), environment (E), and G × E effects (p < 0.001). Intermediate α values, particularly α = 0.3, yielded the most balanced and interpretable representation of yield performance and stability, forming the empirical basis for subsequent analyses.
2.2.1. Variance Partitioning and Model Performance
The partitioning of total phenotypic variation was first assessed through a classical two-way ANOVA with replication and, subsequently, through the UAG decomposition applied to the α-transformed matrices. This dual approach enabled comparison between the traditional additive partitioning of AMMI and the α-weighted unified structure of UAG.
Table 1 summarizes the ANOVA of the studied triticale genotypes (2022–2024). The environment effect was highly significant (F = 144.06; p < 0.001) and accounted for 77.20% of the total sum of squares (SS = 6,548,325.08). Genotypic differences were significant (F = 2.54; p = 0.015), explaining 10.20% of the total variation (SS = 865,432.58). The genotype-by-environment (G × E) interaction was also highly significant (F = 11.29; p < 0.001) and contributed 8.04% (SS = 681,839.79). The residual variance represented 4.56% of the total (SS = 386,445.43), confirming high experimental precision. This variance structure follows the typical pattern for replicated multi-environment trials, where environmental effects dominate but the G × E term remains decisive for cultivar adaptability and stability.
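The reported shares follow directly from the four sums of squares, as the short check below shows. The degrees of freedom in the second half are not stated in this section: they are assumptions (16 genotypes, 3 environments, main effects tested against the G × E mean square) under which the quoted F-ratios for environment and genotype are reproduced.

```python
ss = {
    "Environment": 6_548_325.08,
    "Genotype": 865_432.58,
    "G x E": 681_839.79,
    "Residual": 386_445.43,
}
total = sum(ss.values())
share = {name: round(100 * value / total, 2) for name, value in ss.items()}
print(share)
# {'Environment': 77.2, 'Genotype': 10.2, 'G x E': 8.04, 'Residual': 4.56}

# Assumed degrees of freedom, inferred for illustration only.
df_gen, df_env = 15, 2
df_ge = df_gen * df_env
ms_ge = ss["G x E"] / df_ge
f_env = (ss["Environment"] / df_env) / ms_ge
f_gen = (ss["Genotype"] / df_gen) / ms_ge
print(round(f_env, 2), round(f_gen, 2))   # 144.06 2.54
```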
The subsequent UAG decomposition (Table 2) refined this structure by projecting each α-transformed matrix (Yα) into orthogonal multiplicative components. At α = 0.0 (AMMI2), the first two principal components (PC1 and PC2) explained 86.47% and 13.53%, respectively, jointly capturing the entire modeled interaction variance (SS = 136,367.96). At α = 0.1, PC1 and PC2 accounted for 85.89% and 13.48% (99.37% in total; SS = 138,098.82), while at α = 0.3, the respective shares were 82.03% and 13.58% (95.61%; SS = 151,945.74), with a residual sum of squares of 6677.67. Increasing the model rank to K = 3 at the same α raised the explained variance to 100%, as expected with three environments, where the transformed matrix has at most three nonzero singular values; the third component therefore contributed only the remaining 4.39%, that is, marginal information.
At higher α levels (0.9–1.0, corresponding to GGE-type structures), the proportion of variance explained by PC1–2 decreased slightly because environment-only centering retains the genotype main effects and thereby inflates the total variance. For α = 0.9, PC1 and PC2 captured 76.52% and 18.25% (94.77%; SS = 276,568.04) and, for α = 1.0 (GGE2), 77.75% and 17.52% (95.27%; SS = 309,454.47). Despite this decline, both components remained statistically significant (p < 0.001) across all α configurations, confirming that both AMMI- and GGE-like extremes describe genuine G × E structure.
Formal F-tests based on pooled mean squares (Table 2) demonstrated that PC1 and PC2 were highly significant in every α–K setting. The configuration α = 0.3, K = 2 yielded the most favorable balance between explanatory power (95.61%) and residual mean square (MSres = 238.49), excluding degenerate solutions at α = 0.0 and the partially saturated configuration at α = 0.1 (MSres = 30.85).
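The figures quoted for α = 0.3, K = 2 can be cross-checked with simple arithmetic. The sketch assumes that the quoted SS is the total sum of squares of the transformed field; the implied residual degrees of freedom are derived here and are not stated in the text.

```python
ss_total = 151_945.74   # SS of the alpha = 0.3 field (assumed to be the total)
ss_resid = 6_677.67     # residual SS after two components
ms_resid = 238.49       # reported residual mean square

explained_pct = 100 * (1 - ss_resid / ss_total)
implied_df = ss_resid / ms_resid
print(round(explained_pct, 2))   # 95.61
print(round(implied_df, 1))      # 28.0
```

The explained share matches the 95.61% reported for PC1 and PC2 combined.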
From a breeding perspective, the UAG model effectively concentrates systematic interaction within a limited number of components, thereby enhancing interpretability and analytical precision. Low α values emphasize stability-oriented (AMMI-like) patterns, intermediate α ≈ 0.3 provides the best compromise between yield and stability, and high α (≥0.9) highlights productivity and mega-environment delineation.
Overall, the combined ANOVA (Table 1) and UAG decomposition (Table 2) confirm a well-structured dataset with statistically significant and biologically interpretable G × E effects captured primarily by the first two UAG components.
2.2.2. UAG Biplot Patterns and Genotype Clustering
The biplots derived from the Unified AMMI-GGE (UAG) model (Figure 1) provided an integrative and geometrically coherent representation of the genotype–environment interaction (G × E) structure across the α continuum (0.0–1.0) under symmetric scaling.
The joint ordination of genotypic and environmental scores on the first two interaction principal components (PC1 and PC2) delineated the dominant multiplicative patterns within the dataset, offering a direct visualization of how the relative contribution of additive and interactive effects changes along the AMMI-GGE continuum.
In this configuration, the absolute magnitude of PC1 and PC2 coordinates quantifies the intensity of genotype responsiveness, while proximity to the origin indicates a more stable, environment-insensitive behavior.
At α = 0.0 (AMMI2), the ordination displayed an approximately isotropic geometry, characteristic of purely interactive models where the additive main effects are fully removed.
The genotypes were symmetrically distributed around the biplot center, reflecting balanced interaction dispersion without the dominance of any specific environment. Several lines, notably G8, G5, and G3, clustered close to the origin (|PC| ≤ 2), indicating low interaction and high general adaptability. By contrast, G4, G10, and G11 occupied distal positions along the first axis (|PC1| = 6–11), expressing pronounced environment-specific responses. Among the local check varieties, Rakita (R) was positioned relatively centrally (|PC| ≈ 3.9), while AD-7291 (A) and Kolorit (K) exhibited large interaction amplitudes (5.9–7.6), suggesting considerable environmental sensitivity. This pattern contradicts the expectation of complete neutrality among checks and confirms that some reference varieties contribute substantially to the interaction term, thereby enhancing the discriminative potential of the model.
At α = 0.1, the general structure remained similar but with a subtle elongation along PC1, indicating that a small fraction of the genotype main effects began to influence the multiplicative term. The relative positions of the genotypes were largely conserved, with G8 maintaining the lowest interaction amplitude (|PC| ≈ 0.4), confirming its exceptional stability.
Lines G3 and G5 remained near the origin, while G4, G10, and G11 continued to express high responsiveness to favorable conditions. The check varieties retained their relative ranking in interaction magnitude, with R the least and A the most variable.
At α = 0.3 (K = 2), the UAG ordination reached its highest interpretive resolution. The inclusion of a moderate portion of the genotype main effects increased the separation along PC1, resulting in a clear differentiation between stable and responsive genotypes. G3, G5, and G8 remained close to the origin, reflecting consistent performance across test sites and minimal interaction deviation. Conversely, G4, G10, G11, and G12 showed large loadings on both PC1 and PC2 (|PC| ≈ 6–8), indicating strong positive associations with high-yielding environments.
G4, with the highest overall amplitude (|PC| ≈ 10.6), exhibited the most unstable pattern, suggesting a highly environment-dependent response. Among the checks, A and K remained distinctly peripheral, while R was again positioned closest to the center, confirming its relatively balanced behavior.
When the model rank was increased to K = 3 at the same α level, the spatial configuration changed only marginally, showing that the essential G × E structure is already captured by the first two components.
At higher α values (0.9 and 1.0, corresponding to GGE-type configurations), the biplots became more elongated and environment-oriented. The first axis dominated the variation, integrating the influence of mean yield into the interaction structure. Genotypes G2, G8, and G5 retained near-central positions (|PC| < 4), maintaining balanced adaptability, while G10, G11, G12, and G4 remained at the outer margins, expressing pronounced environmental responsiveness. The check varieties continued to show moderate-to-high interaction magnitudes (|PC| = 5–8), confirming that they cannot be regarded as stability benchmarks but rather as comparative standards defining the interaction range.
Figure 2 presents the mean vs. stability (AEC) biplots derived from the same configurations. The vector of the average environment (AEC abscissa) represents the direction of increasing mean yield, while the projection distances from this axis quantify instability. In this representation, G10 and G11 were aligned with the positive direction of the yield vector, indicating high productivity combined with moderate to low stability. G3, G5, and G12 occupied intermediate positions, expressing balanced adaptability and consistent performance across environments.
G2 and G8 were located near the origin, confirming their stable, yet average-yielding, character.
The check varieties (A, V, K, R) clustered outside the central sectors of the biplots, reflecting their role as reference points for assessing genotype dispersion rather than as models of stability.
Across the α continuum, the UAG biplots clearly delineated three major interpretive domains:
(1) Stable and broadly adapted genotypes—G5, and particularly G8, which maintained minimal PC1-PC2 deviations in all configurations;
(2) High-yield, environment-responsive genotypes—G10, G11, G12, and G4, characterized by large absolute PC loadings and strong alignment with high-yield environments;
(3) Reference check varieties—A, V, R, and K, which, despite variable interaction amplitudes, consistently provided an internal comparative baseline.
The gradual transformation of biplot geometry from α = 0.0 to α = 1.0 demonstrates the analytical flexibility of the UAG framework: lower α values emphasize stability and balanced G × E effects, whereas higher α values highlight productivity gradients and environment-specific responses. This continuum provides a unified, quantitative basis for interpreting both stability and responsiveness patterns in triticale genotypes, ensuring coherence between graphical and analytical assessments within multi-environment testing programs.
2.2.3. Stability and Unified Selection Indices
The evaluation of genotype performance within the Unified AMMI-GGE (UAG) framework integrates two complementary analytical dimensions: the UAG_IPCA_Stability, which quantifies the magnitude of genotype-by-environment interaction (G × E), and the Unified AMMI-GGE Index (UAGI), which combines stability and productivity into a single selection criterion. Together, these parameters provide a coherent picture of adaptability and yield potential across the α continuum.
The UAG_IPCA_Stability parameter expresses the Euclidean distance of each genotype from the origin in the α-transformed interaction space, effectively summarizing the magnitude of its interactive deviation. Smaller values denote consistent performance across environments (wide adaptability), whereas large values indicate high responsiveness or instability. Because the metric is computed directly from the same α-dependent decomposition used in the biplots, it provides a numerical analogue of their geometric interpretation.
Across the six configurations (α = 0.0, 0.1, 0.3, 0.9, and 1.0 with K = 2, and α = 0.3 with K = 3), UAG_IPCA_Stability ranged from 0.28 to 11.5, confirming the presence of both highly stable and highly sensitive genotypes in the population. The maximum values occurred at higher α levels (0.9–1.0), where yield-associated variance inflates the interaction scores. At α = 0.0 (AMMI), values ranged from 0.40 (G8) to 10.99 (G4), with an average of 4.86. The same pattern persisted at α = 0.1 (0.36–10.94) and α = 0.3 (0.28–10.56), with G8 consistently showing the lowest instability scores. These results identify G8 as the most stable genotype at low and intermediate α, exhibiting minimal G × E variation. Several other genotypes (G1, G2, G6, and G12) maintained intermediate stability (3–5), typical of broadly adapted material.
Conversely, G4 was the most unstable across all configurations, exceeding 10.0 in every model, indicating strong crossover interactions and environment-specific adaptation. Among the check varieties, Rakita (R) and Vihren (V) exhibited moderate stability (4–6), whereas Kolorit (K) and AD-7291 (A) were less stable, with values near the upper quartile of the distribution.
When α increased to 0.9 and 1.0, the inclusion of yield-related variance elevated overall instability (mean ≈ 6.0). The stability hierarchy partially shifted: G2 emerged as the most stable line (UAG_IPCA_Stability = 1.1–1.2), while G8 remained within the top quartile but with slightly increased variability. This transition reflects the theoretical behavior of the UAG model, where higher α values amplify yield-associated differences, thus reducing apparent stability. Nonetheless, the internal consistency of genotype rankings demonstrates that UAG_IPCA_Stability is a robust indicator of interaction behavior across the AMMI-GGE continuum.
While stability alone provides critical insight into genotype reliability, practical breeding decisions require a simultaneous consideration of yield performance. For this purpose, the Unified AMMI-GGE Index (UAGI) was computed.
At α = 0.0 (AMMI), where stability dominates, the highest UAGI values corresponded to genotypes with low instability and high yields: G11, G12, and G8. The check variety Kolorit (K) also ranked high, supporting its interpretation as a performance benchmark under interaction-focused models. At α = 0.3 (K = 2), the index reached its most balanced expression. Here, G8 maintained the top position, combining low instability (0.28) with above-average yield, followed closely by G12 and G6, both displaying favorable trade-offs between productivity and resilience. Increasing the model rank to K = 3 at the same α did not change the ranking, confirming that the two-component UAG model sufficiently captures the relevant structure. At higher α values (0.9 and 1.0), the UAGI values increasingly reflected the yield potential. Genotypes G10 and G12 rose to dominance due to their high productivity, while G8 retained a competitive position because of its inherent stability. Meanwhile, G4 and K, characterized by high instability, consistently recorded the lowest UAGI values.
The graphical representation of these relationships (Figure 3 and Figure 4) illustrates the dynamic interplay between stability and yield. In the UAG_IPCA_Stability vs Yield plots, a negative correlation is observed at low α levels: genotypes with high stability generally have moderate yields. As α increases, this correlation weakens, and some genotypes (notably G10 and G12) achieve high yields without proportional penalties in UAGI, demonstrating specific adaptation to favorable environments. The UAGI biplots further highlight this balance: genotypes in the upper-right sector (G8, G12) combine high yield and stability, while those in the lower-left (G4, K) are both unstable and low-yielding. Stable but modest performers (G2, G6, G3) cluster near the center, representing reliable but not exceptional lines.
Taken together, the combined analysis of UAG_IPCA_Stability and UAGI delineates three functional genotype groups across the α continuum:
(1) Highly stable and broadly adapted genotypes: G8 and G2, which consistently display low instability (≤1.5) and moderate to high UAGI values;
(2) High-yield, environment-responsive genotypes: G10, G11, and G12, which attain high UAGI under yield-oriented models (α ≥ 0.9) despite moderate instability;
(3) Unstable or inconsistent genotypes: G4 and Kolorit (K), whose large UAG_IPCA_Stability values and low UAGI indicate strong environment dependency.
This classification aligns well with the biplot interpretations and variance partitioning results, confirming the internal coherence of the unified analytical framework. From a breeding perspective, low-to-intermediate α configurations (0.1–0.3) are most informative for selecting genotypes with balanced performance, whereas high α values (0.9–1.0) are better suited for identifying site-specific yield leaders. The adopted weighting (w = 0.6) reflects realistic breeding priorities, where yield maximization and environmental resilience are pursued simultaneously.
Overall, the joint use of UAG_IPCA_Stability and UAGI provides a rigorous and flexible approach for evaluating genotype adaptability. By quantifying both stability and productivity within the same model, the unified index avoids methodological inconsistencies and allows breeders to visualize and select genotypes that combine high yield potential with reliable environmental performance—essential criteria for modern triticale breeding under variable agro-climatic conditions.
2.2.4. Which–Won–Where Patterns and Mega-Environment Identification
The which–won–where (WWW) representation (Figure 5) derived from the Unified AMMI-GGE (UAG) framework provides a graphical basis for identifying the dominant genotypes across environments by examining the vertices of the interaction polygon. In the present dataset, three environments (E1, E2, and E3) were evaluated under six UAG configurations (α = 0.0, 0.1, 0.3, 0.9, 1.0, and α = 0.3 with K = 3), allowing for a detailed comparison of how the contribution of genotype main effects reshapes genotype–environment relationships.
At α = 0.0 (AMMI-type representation), the WWW polygon is relatively wide and irregular, indicating the presence of substantial crossover interaction. The environments are clearly separated in the biplot space: E1 is positioned distinctly on the positive side of PC1, E3 lies in the upper-left quadrant, and E2 is located in the lower-left quadrant. This spatial separation suggests strong differences in genotype ranking among environments. The polygon vertices are defined mainly by genotypes such as G4, G10, G8, and G6, which are far from the origin and, therefore, contribute strongly to the interaction structure. Each of these vertex genotypes defines a sector, but no single genotype dominates all environments. Instead, different genotypes are associated with different environmental directions, reflecting clear crossover responses.
At α = 0.1, the general geometry of the polygon is preserved, but a slight reorientation of axes is observed. The separation among environments remains pronounced: E1 continues to occupy the positive PC1 region, while E2 and E3 remain on the negative side but in distinct vertical directions. This confirms persistent environmental heterogeneity. Genotypes G4 and G11 are oriented toward E1, suggesting a higher relative performance in that environment. In contrast, genotypes positioned toward the upper-left sector (e.g., G1, G2, G6) show closer alignment with E3, while those closer to the lower-left region exhibit affinity to E2. The dispersion of genotypes indicates strong interaction effects rather than uniform adaptability.
At α = 0.3 (K = 2), the polygon becomes more structured, and the sector delineation is clearer. The three environments remain distinctly separated into different sectors, confirming the presence of three interaction patterns rather than two. E1 is consistently associated with the positive PC1 axis, while E2 and E3 occupy contrasting positions along PC2 on the negative side of PC1.
The vertices are primarily defined by genotypes A, K, G2, G4, G6, G9, and G11, each representing potential winners in different sectors. Genotypes near the origin, such as G3, G5, and G8, exhibit relatively low interaction and, therefore, greater stability across environments. The clearer separation at this α level improves interpretability without substantially distorting the interaction pattern.
The additional configuration α = 0.3 with K = 3 does not fundamentally change the structure of the biplot but enhances the resolution of sector boundaries. The same environmental pattern is retained, with each environment occupying a distinct region of the plot. The vertex genotypes remain largely consistent, confirming that the interaction structure is stable and adequately captured by two principal components. The introduction of a third component refines the geometry slightly but does not alter biological interpretation.
At higher α values (0.9 and 1.0), corresponding to GGE-like formulations, the polygon becomes more compressed along PC1, reflecting the increasing influence of genotype main effects. Despite this compression, the relative positions of environments remain stable. E1 continues to be clearly separated on the positive PC1 side, while E2 and E3 remain differentiated on the negative side. The same sets of genotypes continue to define the polygon vertices, particularly G4 and G10 on the positive side and G6 and G1 on the negative side. This consistency indicates that the main patterns of genotype–environment interaction are robust across model parameterization, although their visual expression becomes more aligned with overall performance at higher α values.
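The shift from an AMMI-type to a GGE-type picture as α increases can be illustrated with a short sketch. It assumes an unweighted, linear interpolation between the double-centered (α = 0) and environment-centered (α = 1) fields, which is one reading of the description in Section 2.1; the function name `uag_residual` and the synthetic data are hypothetical, and the precision weights of the full model are ignored.

```python
import numpy as np


def uag_residual(Y, alpha):
    """Assumed UAG field: environment-centered data minus (1 - alpha) x genotype effect."""
    Y = np.asarray(Y, dtype=float)
    grand = Y.mean()
    env_eff = Y.mean(axis=0, keepdims=True) - grand  # column (environment) effects
    gen_eff = Y.mean(axis=1, keepdims=True) - grand  # row (genotype) effects
    gge_field = Y - grand - env_eff                  # alpha = 1: G + GE retained
    return gge_field - (1.0 - alpha) * gen_eff       # alpha = 0: GE only


# Synthetic 16 x 3 table of yields (illustrative only)
rng = np.random.default_rng(1)
Y_demo = rng.normal(loc=5000.0, scale=400.0, size=(16, 3))

for a in (0.0, 0.3, 1.0):
    s = np.linalg.svd(uag_residual(Y_demo, a), compute_uv=False)
    print(f"alpha={a:.1f}  singular values: {np.round(s, 2)}")
```

Under this assumption the α = 0 field has at most two non-zero singular values for three environments, whereas for α > 0 a third component can appear because the genotype main effect re-enters the decomposition.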
In summary, the UAG which–won–where analysis reveals a strong and consistent crossover interaction among the three environments. Unlike a scenario with clustered environments, E1, E2, and E3 form three clearly distinct sectors across all model configurations, indicating that genotype performance is highly environment-specific. Genotypes located at the polygon vertices (notably G4, G10, G6, and G1) act as sector winners depending on environmental conditions, while centrally located genotypes display greater stability but lower responsiveness. The α = 0.3 configuration provides the most balanced representation, combining clear sector definition with stable geometry. From a breeding perspective, these results emphasize the importance of environment-specific selection, as no single genotype shows universal superiority, and targeted adaptation strategies are required for each production environment.
2.2.5. Cross-Validation and Predictive Performance
A comprehensive cross-validation analysis was carried out to evaluate the predictive accuracy and generalization capacity of the Unified AMMI-GGE (UAG) model across α (0–1) and K (1–3) configurations. Four complementary resampling schemes were applied: Leave-One-Environment-Out (LOEO), Leave-One-Genotype-Out (LOGO), Leave-One-Combination-Out (LOCO), and a Two-Way Leave-One-Out (LOO) design. These procedures assessed distinct aspects of model robustness—environmental extrapolation, genotypic prediction, local interpolation, and overall data recovery. Prediction accuracy was quantified using PRESS (Prediction Residual Sum of Squares) and Cross-Validated Mean Square Error (CV-MSE) in both the original (Y) and α-transformed (Yα) spaces.
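The resampling schemes and error summaries described above can be sketched as follows. The split generators and the `cv_metrics` helper are hypothetical names; the sketch also assumes that CV-MSE is PRESS divided by the number of held-out cells, which the text does not state explicitly. The model-fitting step is deliberately omitted.

```python
import numpy as np


def cv_metrics(observed, predicted):
    """PRESS, CV-MSE and RMSE for a set of held-out predictions."""
    resid = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    press = float(np.sum(resid ** 2))
    cv_mse = press / resid.size  # assumption: average over held-out cells
    return {"PRESS": press, "CV-MSE": cv_mse, "RMSE": float(np.sqrt(cv_mse))}


def loeo_splits(Y):
    """Leave-One-Environment-Out: hold out one column at a time."""
    for j in range(Y.shape[1]):
        yield j, np.delete(Y, j, axis=1), Y[:, j]


def logo_splits(Y):
    """Leave-One-Genotype-Out: hold out one row at a time."""
    for i in range(Y.shape[0]):
        yield i, np.delete(Y, i, axis=0), Y[i, :]


def loco_splits(Y):
    """Leave-One-Combination-Out: mask a single genotype-environment cell."""
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            train = Y.astype(float)  # astype returns a copy
            train[i, j] = np.nan     # the masked cell must be imputed by the model
            yield (i, j), train, Y[i, j]


# Placeholder 16 x 3 table (16 genotypes, 3 environments)
Y_demo = np.arange(48, dtype=float).reshape(16, 3)
n_splits = {
    "LOEO": sum(1 for _ in loeo_splits(Y_demo)),
    "LOGO": sum(1 for _ in logo_splits(Y_demo)),
    "LOCO": sum(1 for _ in loco_splits(Y_demo)),
}
demo = cv_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

For a 16 × 3 table this gives 3 LOEO folds, 16 LOGO folds, and 48 LOCO folds, which is why LOEO estimates in a three-environment trial rest on very few held-out units.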
The resulting heatmaps (Figure 6) revealed consistent, yet interpretable, variation in model performance along the α continuum. Across all tests, the error surfaces showed clear minima concentrated within the α range 0.1–0.3 and generally lower values for higher ranks (K = 2–3). The PRESS statistic exhibited a well-defined trough at α ≈ 0.3 and K = 2, indicating that moderate weighting between additive and multiplicative components yields the most parsimonious fit without overfitting noise. Increasing K from two to three reduced PRESS marginally (<3%), suggesting that two multiplicative components already capture the essential G × E structure.
The LOGO cross-validation displayed a numerical minimum at α = 0.0 and K = 2. However, this corresponds to a degenerate solution with zero retained components (r = 0) and, therefore, does not represent genuine predictive capacity. The lowest valid RMSE was observed at α = 0.1 and K = 2 (RMSE = 33.52). This pattern reflects the advantage of AMMI-like parameterization when the genotypic main effects are more influential than the environmental shifts.
Conversely, the LOEO procedure, evaluating extrapolation to untested environments, showed the lowest CV-MSE values at α ≥ 0.1, where environmental means dominate the α-transformed space. These contrasting behaviors confirm that different α values favor distinct prediction objectives: a smaller α improves genotype generalization, while a larger α enhances environmental extrapolation.
The LOCO validation, which removes individual genotype–environment combinations, revealed a broad low-error plateau between α = 0.3 and 1.0, with its minimum at α = 0.3. This confirms that intermediate α values provide the most balanced predictive capacity, neither overfitting environment-specific variance (as at α = 1) nor oversmoothing genotypic patterns (as at α = 0). Similarly, the two-way LOO (macro-averaged between LOEO and LOGO) exhibited its minimum at α = 0.3 and K = 3, with α = 0.1 and K = 2 close behind, indicating that partial weighting of environmental effects improves joint prediction across both axes.
When evaluated in the Yα space, the pattern of minima remained essentially unchanged, and error magnitudes were numerically identical to those in the original Y space for the present balanced dataset, confirming that the α-transformation does not introduce additional bias under full replication. This demonstrates that model regularization through the unified transformation improves predictive precision across the entire α continuum.
Collectively, these results highlight two complementary optima within the UAG framework. First, a predictive optimum at α ≈ 0.1–0.3, K = 2–3, which minimizes PRESS and global CV-MSE, represents the most balanced and generalizable configuration. Second, a task-specific optimum at α ≈ 0.9–1.0 enhances extrapolation to new environments, especially under the LOEO criterion. In practice, this dual structure allows breeders to tailor model selection according to experimental objectives: AMMI-oriented settings for genotype-focused predictions and GGE-oriented ones for environmental forecasting.
From an applied perspective, the α = 0.3, K = 2 model emerges as the most efficient general-purpose configuration. It achieves near-minimal predictive errors across all validation schemes while maintaining interpretative simplicity and low computational cost. Compared to the AMMI baseline (α = 0), this setting reduces cross-validated MSE by approximately 10–15%, a statistically and biologically significant gain corresponding to higher stability and ranking consistency among genotypes across environments.
Importantly, the comparative analysis across K levels confirmed that adding a third multiplicative component does not yield substantial improvement in predictive accuracy, despite increased complexity. Thus, K = 2 suffices for the accurate and robust representation of G × E patterns in this dataset.
In summary, cross-validation of the UAG model under symmetric scaling validates its capacity to flexibly integrate AMMI and GGE perspectives within a single predictive framework. The intermediate α configurations (≈0.1–0.3) ensure the best compromise between fit quality, stability, and interpretability, while the α ≈ 1.0 models remain optimal for environment-specific prediction. This adaptive behavior reinforces the practical advantage of the unified approach for modern plant breeding, where prediction accuracy across diverse and incomplete multi-environment trials is a critical determinant of selection efficiency.
2.2.6. Optimal Model Parameters
Determining the optimal parameterization of the UAG model required balancing predictive accuracy, interpretability, and biological realism. The comprehensive cross-validation results (LOEO, LOGO, LOCO, and Two-way LOO) revealed a consistent pattern across both yield representations (Y and Yα). Although error magnitudes varied among schemes, the overall minima concentrated around α = 0.1–0.3 and K = 2, indicating that moderate blending between AMMI and GGE components yields the most stable and predictive configuration.
Under the Leave-One-Environment-Out (LOEO) validation, RMSE values were identical across α = 0.1–1.0 (RMSE = 3474.67), with only α = 0.0 yielding a higher error (RMSE = 4000.58), indicating that any degree of partial environment centering suffices to capture the cross-environment signal in this three-environment dataset. The Leave-One-Genotype-Out (LOGO) analysis exhibited a clear global minimum at α = 0.1, K = 2 (RMSE ≈ 33.5), highlighting the advantage of partial environment centering for generalizing across genotypes. The Leave-One-Combination-Out (LOCO) results showed the smallest error at α = 0.3, K = 2 (RMSE ≈ 324), confirming that this configuration offers optimal local predictive consistency. Likewise, among the K = 2 configurations, the Two-Way Leave-One-Out (LOO) approach reached its lowest RMSE at α = 0.1 (RMSE = 1754.10); a marginally lower value occurred at α = 0.3, K = 3 (RMSE = 1737.34), but that configuration corresponds to a fully saturated model.
Across all criteria (PRESS, RMSE, and CV–MSE), models with K = 2 performed as well as, or within a few percent of, higher-rank alternatives while maintaining interpretability and parsimony. Increasing K to three slightly raised explained variance but reduced prediction error only marginally, indicating diminishing returns from additional interaction components. Therefore, K = 2 represents the most efficient dimensionality for applied breeding analyses, capturing the dominant G × E interaction structure without overparameterization.
For certain parameter combinations—most notably at α = 0 and low K—the singular value decomposition produced zero retained components (r = 0), yielding numerical zero PRESS and RMSE values. These do not indicate perfect prediction but rather model degeneracy, emphasizing the sensitivity of low-rank UAG configurations to α–K calibration and the necessity of avoiding overly constrained parameterizations.
From a breeding perspective, the α = 0.1–0.3, K = 2 configuration provides the optimal compromise between yield responsiveness and stability. It reproduces genotype rankings observed under field conditions, ensures reliable cross-site prediction, and maintains biologically coherent ordinations. Consequently, this parameterization defines the statistically optimal and operationally practical form of the UAG model for genotype evaluation and selection across variable environments.
2.2.7. Correspondence Between the UAG Framework and Factor-Analytic Mixed Models
To evaluate the relationship between the UAG framework and factor-analytic (FA) mixed models, the current methodological standard for GEI analysis in plant breeding, an FA(2) decomposition was applied to the double-centered GEI matrix derived from the triticale multi-environment dataset. The analysis established a mathematically exact equivalence between FA(2) and UAG (α = 0, K = 2): the maximum absolute reconstruction error between the two fitted matrices was 1.30 × 10⁻¹³, confirming numerical identity. This equivalence is not coincidental but arises necessarily from the rank structure of the data: with three environments, the double-centered GEI matrix has rank at most min(G − 1, E − 1) = 2, meaning that both FA(2) and UAG (α = 0, K = 2) perform a complete rank-2 approximation of the same matrix within the same weighted Hilbert space, yielding identical fitted values and scores.
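The rank argument can be checked numerically. The sketch below uses synthetic data rather than the triticale trial and an unweighted SVD in place of the weighted decomposition, so it illustrates the rank bound only and does not reproduce the reported reconstruction error.

```python
import numpy as np

# Synthetic 16 genotypes x 3 environments table (illustrative, not the trial data)
rng = np.random.default_rng(0)
Y = rng.normal(loc=5000.0, scale=400.0, size=(16, 3))

# Double centering: remove the genotype and environment means, add back the grand mean
gei = Y - Y.mean(axis=1, keepdims=True) - Y.mean(axis=0, keepdims=True) + Y.mean()

# A rank-2 truncated SVD should reproduce the matrix up to floating-point error
U, s, Vt = np.linalg.svd(gei, full_matrices=False)
rank2 = (U[:, :2] * s[:2]) @ Vt[:2]
max_abs_err = float(np.abs(gei - rank2).max())

print("singular values:", s)
print("max |GEI - rank-2 fit|:", max_abs_err)
```

Because every row of the double-centered matrix sums to zero across the three environments, the third singular value vanishes up to rounding, and any method that fits a full rank-2 bilinear structure to this matrix must return the same fitted values.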
The FA(2) decomposition partitioned the total GEI variance into two factors: Factor 1 accounted for 86.47% of GEI (S₁ = 343.39) and Factor 2 for the remaining 13.53% (S₂ = 135.82), jointly explaining 100% of the structured interaction. The third singular value was exactly zero (S₃ = 0.000), confirming that the GEI space is fully spanned by two dimensions and that no residual interaction structure remains unaccounted for. This complete decomposition further validates the choice of K = 2 identified through cross-validation in Section 2.2.5, demonstrating that the two-component UAG model is not an approximation but an exact representation of the underlying GEI in this dataset.
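The reported percentages are consistent with each factor's share being its squared singular value divided by the sum of squared singular values. The short check below assumes that definition and uses only the three singular values quoted in the text.

```python
# Singular values as reported for the FA(2) decomposition
singular_values = [343.39, 135.82, 0.0]

# Share of GEI variation per factor, assuming share_k = S_k^2 / sum(S^2)
squared = [s ** 2 for s in singular_values]
total = sum(squared)
shares = [100.0 * q / total for q in squared]

for k, share in enumerate(shares, start=1):
    print(f"Factor {k}: {share:.2f}% of GEI")
```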
Genotype factor loadings on FA Factors 1 and 2 were fully consistent with the UAG biplot ordinations (Figure 7A). Genotypes with large absolute loadings on both factors, notably G4 (FA communality = 41347.88), G11 (18159.50), K (17990.32), and G10 (13816.19), exhibited pronounced environment-specific responses, corresponding to peripheral positions in the UAG biplots. In contrast, G8 (FA communality = 43.99), G5 (489.26), and G3 (1011.58) showed minimal loadings on both factors, reflecting consistent performance across environments and confirming their classification as broadly adapted genotypes. The correspondence between FA communality and UAG_IPCA_Stability was mathematically exact: the FA communality equaled the squared UAG_IPCA_Stability (UAG_IPCA_Stability²) for all genotypes, producing a Spearman rank correlation of ρ = 1.000 (p < 0.001) across all 16 entries (Figure 7B). G8 retained the lowest instability score under both frameworks (UAG_IPCA_Stability = 6.63), with G5 ranking second (22.12), a difference of more than threefold, confirming the exceptional stability of G8 relative to the remainder of the population.
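Because squaring is a monotone transformation of a non-negative quantity, a perfect rank correlation follows directly from communality being the square of the stability score. The sketch below checks this with the seven communalities quoted in the text (the remaining nine genotypes are not reported here); the helper `spearman_rho` is a hypothetical name and assumes no tied values.

```python
import numpy as np

# FA communalities as quoted in the text (7 of the 16 entries)
communality = {
    "G4": 41347.88, "G11": 18159.50, "K": 17990.32, "G10": 13816.19,
    "G3": 1011.58, "G5": 489.26, "G8": 43.99,
}

# Stability score implied by communality = stability^2
implied_stability = {g: float(np.sqrt(c)) for g, c in communality.items()}


def spearman_rho(x, y):
    """Spearman correlation as the Pearson correlation of ranks (no ties assumed)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


labels = list(communality)
rho = spearman_rho(
    [communality[g] for g in labels],
    [implied_stability[g] for g in labels],
)
ratio_g5_g8 = implied_stability["G5"] / implied_stability["G8"]
```

The implied values for G8 and G5 round to the reported stability scores of 6.63 and 22.12, and their ratio is slightly above three, in line with the stated threefold difference.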
The GEI correlation structure between environments, derived from the double-centered interaction field, revealed a strong negative correlation between E1 and E3 (r = −0.893), a moderate negative correlation between E1 and E2 (r = −0.568), and a near-zero association between E2 and E3 (r = 0.135). This pattern quantifies the pronounced crossover interaction between E1 and E3 that was visually apparent in the Which–Won–Where analysis (Section 2.2.4), where these two environments consistently occupied opposing sectors of the UAG polygon across all α configurations. The near-zero E2–E3 correlation indicates that these environments, despite their spatial separation along PC2, share limited genotype-ranking overlap in the pure interaction space, further supporting the three-sector mega-environment structure identified by the UAG framework.
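Between-environment GEI correlations of this kind can be obtained as Pearson correlations between the columns of the double-centered matrix. The following sketch assumes that unweighted definition and uses synthetic data, so its output does not reproduce the coefficients reported above; the helper name `gei_environment_correlations` is hypothetical.

```python
import numpy as np


def gei_environment_correlations(Y):
    """Pearson correlations between environments in the double-centered field."""
    Y = np.asarray(Y, dtype=float)
    gei = Y - Y.mean(axis=1, keepdims=True) - Y.mean(axis=0, keepdims=True) + Y.mean()
    # rowvar=False treats each column (environment) as a variable
    return np.corrcoef(gei, rowvar=False)


# Synthetic 16 x 3 table (illustrative only)
rng = np.random.default_rng(2)
Y_demo = rng.normal(loc=5000.0, scale=400.0, size=(16, 3))
corr = gei_environment_correlations(Y_demo)
print(np.round(corr, 3))
```

A strongly negative off-diagonal entry marks a pair of environments in which genotypes with positive interaction in one tend to have negative interaction in the other, which is the numerical counterpart of opposing sectors in the biplot.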
Collectively, the FA(2) analysis confirms the theoretical soundness and empirical validity of the UAG framework. The exact equivalence at α = 0 demonstrates that UAG recovers the full fixed-effects FA solution without requiring iterative REML estimation, while the α continuum extends beyond what classical FA models offer by enabling continuous tuning between stability-oriented and productivity-oriented representations within a single unified decomposition.