Article

PCA-Driven Multivariate Trait Integration in Alfalfa Breeding: A Selection Model for High-Yield and Stable Progenies

1 College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
2 Institute of Grassland Science, Yangzhou University, Yangzhou 225009, China
3 College of Agro-Grassland Science, Nanjing Agricultural University, Nanjing 210095, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Plants 2025, 14(18), 2906; https://doi.org/10.3390/plants14182906
Submission received: 27 July 2025 / Revised: 31 August 2025 / Accepted: 17 September 2025 / Published: 18 September 2025
(This article belongs to the Section Crop Physiology and Crop Production)

Abstract

Breeding improvement in alfalfa (Medicago sativa L.) is often constrained by the complexity of agronomic traits and trade-offs among yield-related characteristics. Conventional single-trait selection rarely captures the full range of phenotypic variation or the interactions among traits. To address this, we developed a principal component analysis (PCA)-based framework for multivariate selection in hybrid breeding. Six yield-related traits—plant height, branch number, fresh/hay yield ratio (FHR), leaf/stem ratio (LSR), multifoliolate leaf frequency, and dry weight per plant—were quantified in two parental lines and their F1/F2 generations. PCA identified three principal components (PC1–PC3) with eigenvalues >1, explaining 71.14% of the total phenotypic variance: PC1 (32.43% variance) was predominantly loaded with positive contributions from dry weight per single plant, height, and branches, biologically representing overall plant vigor and biomass accumulation; PC2 (21.77% variance) showed strong negative loadings for LSR, capturing architectural trade-offs between stem dominance and leaf production; PC3 (16.94% variance) had positive loadings on multifoliolate leaf rate and fresh/dry ratio, embodying quality and physiological resilience traits. Based on PCA scores, a composite selection index was constructed, and the top 31.1% of F1 hybrids were selected. Their F2 progeny showed significant improvements in dry weight (+15.56%, p < 0.01), multifoliolate leaf frequency (+74.78%, p < 0.001), and reduced FHR (–8.2%, p < 0.05), accompanied by lower yield decline (−7.2% versus −14.1% in controls). These results show that PCA-based multivariate selection effectively balances trait trade-offs, enhances intergenerational stability, and improves selection efficiency. This framework offers a practical tool for alfalfa breeding.

1. Introduction

Principal component analysis (PCA), a widely used dimensionality reduction technique, is extensively applied in crop breeding for integrative analysis of multiple traits within phenomics. It effectively handles high-throughput phenotypic data such as near-infrared spectroscopy (NIRS) [1] or hyperspectral imaging [2], accelerating breeding cycles by extracting key patterns from complex datasets. PCA converts high-dimensional phenotypic data (such as spectral absorption values) into a small number of principal components (PCs), capturing the major sources of variation and thereby reducing computational burden; numerous studies have further demonstrated its effectiveness in consolidating yield-related traits and in identifying genotypes associated with desirable comprehensive agronomic profiles [3,4]. For instance, PCA has been utilized to integrate key agronomic traits—such as plant height, panicle number, and yield per plant—across a wide range of crops. This approach has been successfully applied in the breeding of rice (Oryza sativa L.) [5], fingerroot (Boesenbergia rotunda L.) [6], sweet potato (Ipomoea batatas [L.] Lam) [7], sorghum (Sorghum bicolor [L.] Moench) [8], cotton (Gossypium hirsutum L.) [9], and maize (Zea mays L.) [10], leading to improved selection efficiency and breeding outcomes.
PCA is particularly powerful in revealing synergistic effects and trade-offs among multiple traits. Some crops exhibit positive correlations, such as plant height and dry weight [11,12]. In contrast, others display negative associations, for example between cyanidin content in alfalfa and neutral detergent fiber content [13]. Through PCA, these interactions can be quantitatively characterized, providing breeders with insights into complex trait networks and helping to minimize conflicts among traits during selection. Beyond trait integration, PCA has also been shown to enhance predictive capacity in breeding, thereby shortening breeding cycles. By evaluating multiple traits simultaneously, PCA effectively elucidates genetic structure and reproductive barriers among populations with varying ploidy levels, thereby revealing patterns of isolation and differentiation [14]. Moreover, by removing redundant information and reducing variable dimensionality [15], PCA facilitates the rapid identification of promising interspecific hybrids from phenomic markers, reducing the need for extensive field trials and accelerating breeding cycles by prioritizing individuals with desirable trait combinations [16].
Despite its advantages, PCA-based breeding approaches still face certain limitations and challenges. One key issue is balancing the number of traits with an adequate sample size. Although PCA reduces dimensionality, its performance depends on sufficient input data. Inadequate sample sizes may prevent the model from capturing critical trait variation, thus undermining reliability [17]. To address this, some researchers have proposed integrating genomic data to increase the effective sample size and improve predictive accuracy. Moreover, PCA is inherently linear and may not capture nonlinear relationships present in biological data [18].
Another challenge lies in the interpretability and potential bias of principal components. While PCA summarizes data through the extraction of components, these components are often abstract linear combinations of the original variables and may lack clear biological meaning [19,20]. This makes it difficult to directly relate them to breeding targets such as yield or disease resistance. Improving the interpretability of principal components and aligning them with practical breeding objectives is a pressing research need. Additionally, selection bias may occur if important but seemingly insignificant traits are discarded during dimensionality reduction, potentially leading to overfitting, especially in models relying on fixed trait weights.
Environmental effects also play a crucial role in trait expression. Variability in environmental conditions can significantly influence phenotypic traits and reduce the stability and reliability of PCA models. As such, incorporating environmental data into PCA can help identify which environmental factors are most closely associated with genetic variation and allow breeders to track how such variation responds across different environmental contexts [21].
In summary, PCA is a powerful and widely adopted tool in plant breeding, particularly effective for multi-trait integration and breeding cycle reduction. Nonetheless, current applications still face challenges such as optimizing trait-to-sample ratios, improving interpretability, and accounting for environmental influences. Future research that combines PCA with genomic data, high-dimensional phenotyping, and environmental adaptability analysis may enhance its accuracy and robustness, making it an even more effective tool for crop improvement. In this study, we propose a PCA-driven selection model aimed at optimizing hybrid breeding in alfalfa (Medicago sativa L.). Traditional breeding approaches often focus on individual traits—such as plant height or dry weight—while overlooking interactions among agronomic characteristics. The goal of our research is to develop an integrated framework based on PCA to capture inter-trait relationships, improve biomass yield and stability, and ultimately enhance selection efficiency in alfalfa breeding.

2. Results

2.1. Agronomic Trait Characterization in Parental Lines and F1 Hybrids

Six yield-related traits (absolute plant height, branch number, FHR, LSR, multifoliolate leaf frequency, and dry weight per plant) were quantified at the initial flowering stage in the parental lines (PL34HQ and Huaiyin) and in the F1 hybrids (n = 90) (Table S1). The parents differed significantly (p < 0.01) in five traits (plant height, branch number, FHR, multifoliolate trait frequency, and dry weight), whereas LSR showed no significant divergence (p > 0.05). F1 hybrids exhibited trait values intermediate between the two parents (Table 1).

2.2. Correlations Among Agronomic Traits in the F1 Generation of Alfalfa Crosses

Trait correlations in F1 hybrids revealed significant positive associations between dry weight and plant height (ρ = 0.35, p < 0.001), branch number (ρ = 0.30, p < 0.01), and LSR (ρ = 0.45, p < 0.001), alongside a significant negative correlation with FHR (ρ = −0.26, p < 0.05). Plant height and branch number were also positively correlated (ρ = 0.49, p < 0.001) (Figure 1).
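As a rough check on the significance levels quoted above, a Pearson coefficient can be converted into a t statistic with n − 2 degrees of freedom. The sketch below assumes that all 90 F1 plants entered each correlation (the per-pair sample size is not stated separately); the helper name `pearson_t` is illustrative only.

```python
import math


def pearson_t(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation r against zero (df = n - 2)."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Assumption: every correlation used all 90 F1 hybrids.
N_F1 = 90

# Correlations with dry weight, as reported in this section.
reported = {
    "plant height": 0.35,
    "branch number": 0.30,
    "LSR": 0.45,
    "FHR": -0.26,
}

t_values = {trait: pearson_t(r, N_F1) for trait, r in reported.items()}

for trait, t in t_values.items():
    r = reported[trait]
    print(f"dry weight vs {trait}: r = {r:+.2f}, t = {t:+.2f}, df = {N_F1 - 2}")
```

Larger absolute t values correspond to smaller p-values; exact p-values would come from the t distribution in whichever statistics package is used.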

2.3. Principal Component Analysis (PCA) for Dimensionality Reduction

PCA of six agronomic traits extracted three principal components (PC1–PC3) with eigenvalues >1, cumulatively explaining 71.14% variance (PC1: 32.43%, PC2: 21.77%, PC3: 16.94%) (Table 2, Figure 2). Component coefficients were derived by dividing initial factor loadings (Table 3) by the square root of corresponding eigenvalues (Table 2). The scree plot (Figure S1) further confirmed the appropriateness of retaining these three components based on the eigenvalue >1 criterion and the elbow method.
The final three principal components obtained are as follows:
F1 = 0.523X1 + 0.463X2 − 0.295X3 + 0.294X4 − 0.176X5 + 0.556X6
F2 = 0.216X1 + 0.543X2 − 0.049X3 − 0.704X4 + 0.346X5 − 0.200X6
F3 = −0.395X1 − 0.070X2 − 0.450X3 + 0.101X4 + 0.705X5 + 0.360X6
where X1–X6 represent standardized values of plant height, branch number, FHR, LSR, multifoliolate trait frequency, and dry weight, respectively.
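The relationship between coefficients, loadings, and eigenvalues described above can be checked numerically. The sketch below is not the authors' Origin workflow. It assumes the PCA was run on the correlation matrix of the six standardized traits, so that total variance equals 6 and each eigenvalue equals its variance share times 6. Under that assumption it checks that the three coefficient vectors behave like unit-length, mutually orthogonal eigenvectors, and it back-calculates the loadings they imply. These implied loadings are not a substitute for Table 3.

```python
import math

# Coefficients of X1..X6 in F1, F2 and F3, copied from the equations above.
COEFFS = {
    "F1": [0.523, 0.463, -0.295, 0.294, -0.176, 0.556],
    "F2": [0.216, 0.543, -0.049, -0.704, 0.346, -0.200],
    "F3": [-0.395, -0.070, -0.450, 0.101, 0.705, 0.360],
}
# Percentage of variance explained, as reported in this section.
VARIANCE_PCT = {"F1": 32.43, "F2": 21.77, "F3": 16.94}
# Assumption: correlation-matrix PCA on six traits, so total variance = 6.
N_TRAITS = 6


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# Eigenvalue implied by each variance share (Kaiser criterion: keep if > 1).
eigenvalues = {pc: pct / 100 * N_TRAITS for pc, pct in VARIANCE_PCT.items()}

# Squared length of each coefficient vector (about 1 for a unit eigenvector).
norms_sq = {pc: dot(c, c) for pc, c in COEFFS.items()}

# Dot products between different components (about 0 if orthogonal).
cross = {
    ("F1", "F2"): dot(COEFFS["F1"], COEFFS["F2"]),
    ("F1", "F3"): dot(COEFFS["F1"], COEFFS["F3"]),
    ("F2", "F3"): dot(COEFFS["F2"], COEFFS["F3"]),
}

# Reverse the stated derivation: coefficient = loading / sqrt(eigenvalue),
# hence loading = coefficient * sqrt(eigenvalue).
loadings = {
    pc: [c * math.sqrt(eigenvalues[pc]) for c in COEFFS[pc]] for pc in COEFFS
}

for pc in COEFFS:
    print(pc, f"eigenvalue = {eigenvalues[pc]:.3f}", f"|c|^2 = {norms_sq[pc]:.3f}")
```

All three implied eigenvalues exceed 1, which agrees with the stated Kaiser criterion, and the coefficient vectors are orthonormal to within rounding error.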
The loading plot illustrates the contribution of six agronomic traits to the first three principal components (PC1–PC3), which collectively explain 71.14% of the total variance.
Trait weights were obtained by multiplying the loading coefficient of each trait in F1–F3 (Formulas (1)–(3)) by the variance contribution of the corresponding principal component, summing the products across the three components, and normalizing the sum by the total variance explained by these components (Table 2). For example, the weighting factor for plant height was calculated as
(0.523 × 32.435 + 0.216 × 21.771 − 0.395 × 16.937)/71.143 = 0.210
which was then incorporated into the final selection index:
Y = 0.210ZX1 + 0.360ZX2 − 0.257ZX3 − 0.057ZX4 + 0.194ZX5 + 0.278ZX6
(ZX1, ZX2, ZX3, ZX4, ZX5, and ZX6 represent the standardized values of plant height, branch number, FHR, LSR, multifoliolate trait frequency, and dry weight, respectively.)
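The weighting scheme can be reproduced from the published coefficients. The sketch below recomputes all six weights and wraps the index in a small function. The variable names are illustrative. The PC3 variance share is taken as 71.143 − 32.435 − 21.771 = 16.937, which matches the 16.94% reported in this section. Recomputed weights can differ from the published ones in the third decimal place because the coefficients are themselves rounded.

```python
TRAITS = [
    "plant height", "branch number", "FHR",
    "LSR", "multifoliolate frequency", "dry weight",
]

# Rows: coefficients of X1..X6 in F1, F2, F3 (from the component equations).
COEFFS = [
    [0.523, 0.463, -0.295, 0.294, -0.176, 0.556],
    [0.216, 0.543, -0.049, -0.704, 0.346, -0.200],
    [-0.395, -0.070, -0.450, 0.101, 0.705, 0.360],
]

# Variance explained (%) by PC1..PC3; PC3 inferred as 71.143 - 32.435 - 21.771.
VARIANCE_PCT = [32.435, 21.771, 16.937]
TOTAL_PCT = sum(VARIANCE_PCT)

# Weight of trait j = sum_k(coefficient[k][j] * variance[k]) / total variance.
weights = [
    sum(COEFFS[k][j] * VARIANCE_PCT[k] for k in range(len(COEFFS))) / TOTAL_PCT
    for j in range(len(TRAITS))
]

# Weights as printed in the selection index, for comparison.
REPORTED = [0.210, 0.360, -0.257, -0.057, 0.194, 0.278]


def selection_index(z_values):
    """Composite score Y for one plant from its six standardized trait values."""
    if len(z_values) != len(weights):
        raise ValueError("expected six standardized trait values")
    return sum(w * z for w, z in zip(weights, z_values))


for name, w, rep in zip(TRAITS, weights, REPORTED):
    print(f"{name:26s} recomputed = {w:+.3f}  reported = {rep:+.3f}")
```

Branch number and dry weight carry the largest positive weights, while FHR is the main negatively weighted trait, so a plant scores highly when it combines many branches and high dry weight with a low fresh/hay ratio.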

2.4. Selection of Elite Hybrids Using a Multivariate Approach

The model ranked 90 F1 hybrids by composite scores (Table 4), selecting the top 28 genotypes (combined scores >1) for subsequent crosses.
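The ranking step can be sketched as follows: standardize each trait across the population, apply the index weights, sort, and keep plants whose composite score exceeds 1. The five plants below are made-up values, not data from the study. The use of the sample standard deviation for standardization is also an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev

# Weights of the selection index, in trait order X1..X6.
WEIGHTS = [0.210, 0.360, -0.257, -0.057, 0.194, 0.278]


def zscores(column):
    """Standardize one trait across the population (sample SD: an assumption)."""
    m, s = mean(column), stdev(column)
    return [(x - m) / s for x in column]


def composite_scores(rows):
    """rows: one list of six raw trait values per plant, ordered X1..X6."""
    z_columns = [zscores(col) for col in zip(*rows)]
    return [
        sum(w * z for w, z in zip(WEIGHTS, z_row)) for z_row in zip(*z_columns)
    ]


def select_elite(plant_ids, scores, threshold=1.0):
    """Rank plants by score (descending) and keep those above the threshold."""
    ranked = sorted(zip(plant_ids, scores), key=lambda p: p[1], reverse=True)
    return [(pid, s) for pid, s in ranked if s > threshold]


# Hypothetical plants: PH (cm), BN, FHR, LSR, MF (%), DW (g).
demo_ids = ["P1", "P2", "P3", "P4", "P5"]
demo_rows = [
    [60.0, 10, 4.5, 0.90, 5.0, 20.0],
    [75.0, 14, 4.0, 0.95, 20.0, 32.0],
    [55.0, 8, 5.0, 0.85, 0.0, 15.0],
    [68.0, 12, 4.2, 1.00, 12.0, 26.0],
    [62.0, 11, 4.6, 0.92, 8.0, 22.0],
]

demo_scores = composite_scores(demo_rows)
elite = select_elite(demo_ids, demo_scores)

# Selection intensity reported in the study: 28 of 90 F1 hybrids.
selected_share = round(28 / 90 * 100, 1)

print(elite)
print(f"selected share in the study: {selected_share}%")
```

Because every trait is standardized before weighting, the composite scores of a population sum to zero, so a fixed threshold such as 1 selects only plants well above the population average.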

2.5. Validation of Selection Efficacy in F2 Progenies

Compared to unselected populations, F2 progeny from elite F1 hybrids exhibited a 15.56% higher dry weight (p < 0.01), a 74.78% increase in the frequency of the multifoliolate trait (p < 0.001), an 8.2% reduction in FHR (p < 0.05), and an attenuated yield decline (−7.2% vs. −14.1% in controls; Table 5).

3. Discussion

This study demonstrates that a principal component analysis (PCA)-driven selection framework can effectively integrate multiple agronomic traits in alfalfa breeding and translate this integration into tangible genetic gains. Traditional single-trait selection often underestimates the complexity of yield-related traits, which are shaped by multiple, interacting morphological and physiological factors [22,23,24]. Specifically, as illustrated in the 3D PCA biplot (Figure 2), PC1 (explaining 32.43% of the variance) was predominantly loaded with positive contributions from dry weight per single plant, height, and branches, biologically representing overall plant vigor and biomass accumulation. This component likely reflects integrated physiological processes such as enhanced carbon assimilation and resource allocation toward vegetative growth, which are critical in alfalfa for maximizing forage yield under varying environmental conditions [25,26]. In contrast, PC2 (21.77% variance) showed strong negative loadings for leaf/stem ratio, suggesting it captures architectural trade-offs between stem dominance and leaf production—potentially linked to light interception efficiency and mechanical stability, enabling breeders to select for balanced plant morphology that reduces lodging risks [27,28]. PC3 (16.94% variance), with positive loadings on multifoliolate leaf rate on the main stem and fresh/dry ratio, appears to embody quality and physiological resilience traits, such as leaf complexity for improved digestibility and water retention for stress tolerance. For breeders, this is meaningful as it highlights genotypes with enhanced nutritional value and yield stability, addressing common challenges in forage legumes where multifoliate leaves correlate with higher protein content [29,30]. 
By selecting based on favorable scores across these PCs, the approach prioritizes holistic trait networks, which explains the superior performance of F2 progeny from high-ranking F1 hybrids: reinforced beneficial interactions led to immediate agronomic gains and intergenerational consistency. By constructing a composite scoring model based on PCA, we were able to rank and select elite F1 hybrids, and subsequently advance them to F2 populations. The results provide strong empirical evidence for the power of this multivariate approach: compared with unselected populations, F2 progeny derived from elite F1 hybrids exhibited 15.56% higher dry weight (p < 0.01), a 74.78% increase in multifoliolate trait frequency (p < 0.001), a reduced FHR (−8.2%, p < 0.05), and attenuated yield decline (−7.2% versus −14.1% in controls). These improvements highlight the ability of PCA-based selection to both enhance agronomic performance and stabilize yield-related traits across breeding cycles.
The F1 population provided important insights into the relationships among traits. Correlation analyses revealed that dry weight per plant was positively associated with plant height, branch number, and LSR, but negatively associated with FHR. Such synergistic relationships suggest that biomass accumulation in alfalfa is not determined by any single attribute, but rather emerges from the interaction of multiple traits [31,32]. The PCA model effectively captured these interactions by assigning greater weight to traits contributing positively to yield and down-weighting antagonistic factors [33,34]. This may explain why the F2 progeny derived from high-ranking F1 individuals outperformed unselected controls: selection favored genotypes in which beneficial trait networks were reinforced, leading to both immediate improvement and potential intergenerational stability.
Interestingly, while most traits differed significantly between parents, LSR remained relatively stable across parental lines, F1 hybrids, and F2 progeny. Because LSR is a recognized quality indicator in alfalfa [35,36], its stability suggests that this trait may be less sensitive to environmental variation and largely under genetic control [28]. The lack of significant divergence in LSR across generations therefore indicates both heritable consistency and a limited degree of genetic diversity for this trait in the materials studied. This observation underscores the importance of combining stable traits like LSR with more variable, yield-contributing traits when designing composite selection indices [37,38,39].
Our findings also resonate with broader trends in plant breeding. Multi-trait selection strategies, including PCA-based indices and genome-wide association studies based on multivariate linear mixed models (mvLMM), have been shown to outperform single-trait approaches in crops such as maize and wheat [40]. These strategies not only capture trait correlations more effectively but also enhance the detection of true genetic signals. By applying a similar framework to alfalfa, our study provides one of the first empirical validations that multivariate composite scoring can accelerate breeding progress in forage legumes [41], crops in which yield stability and quality improvement are both essential but often difficult to achieve simultaneously [42]. Previous work suggests that in forage grass breeding programs, yield traits are often selected first because of their direct impact on economic viability, with quality traits addressed in later stages to balance overall performance [43,44].
Looking forward, the framework established here offers a flexible and scalable approach for future breeding programs. While this study relied on field-based phenotyping of six traits, the integration of high-throughput phenotyping platforms and genomic selection tools [45,46,47] could further enhance the precision and efficiency of multivariate selection. Such integration would allow breeders to simultaneously capture complex trait interactions at both the phenotypic and molecular levels, thereby increasing selection power and reducing breeding cycles. Ultimately, by combining PCA-based indices with emerging technologies, alfalfa breeding can move toward more systematic, data-driven strategies that accelerate genetic gain, stabilize key agronomic traits, and contribute to sustainable forage crop improvement.

4. Materials and Methods

4.1. Research Design and Materials

This study aimed to establish a principal component analysis (PCA)-based multi-trait comprehensive evaluation model to screen alfalfa (M. sativa L.) hybrid progenies with both high yield and synergistic trait advantages, and to elucidate patterns of intergenerational trait transmission.
The Australian multifoliate alfalfa (Medicago sativa L.) line ‘PL34HQ’ [Source: China-Australia Alfalfa Cooperation Project (ASI/1998/026)] was used as the maternal parent, and the local cultivar ‘Huaiyin alfalfa’ (Source: National Animal Husbandry Station, Beijing, China) served as the paternal parent. F1 progeny (90 plants) from a cross between PL34HQ and Huaiyin were evaluated, with 28 plants (top 31.1%) selected by the PCA-based index and the remaining 62 classified as unselected. Each group was intermated separately to generate corresponding F2 populations (n = 90), which were compared to validate the selection model.
The experiment was conducted at the Yang-tzu-chin Experimental Base for Grassland Science, Yangzhou University, using a strip trial design in which every four rows formed an identical strip group, with a row spacing of 15 cm and plant spacing of 10 cm. In early September 2023, seeds were germinated in a controlled growth chamber under conditions of 25 °C with 16 h of light and 22 °C with 8 h of darkness. Once seedlings reached 3–5 cm in height, they were transplanted into trays filled with a 1:1 vermiculite–peat substrate (Pindstrup, Ryomgaard, Denmark). After establishment, seedlings were transplanted row by row onto outdoor ridges (10 cm high). A basal application of compound NPK fertilizer (18-18-18; Stanley San’an, Linyi, China) was made before transplantation. The field trial relied on natural rainfall, with irrigation applied only once immediately after transplantation to promote seedling survival; no further watering was provided. All materials were individually harvested and assessed at the initial flowering stage to ensure uniformity, with each plant in the population measured once to minimize the impact of growth period variation on trait measurements. During the experimental period, January was the coldest month and July the hottest in Yangzhou. In the first year, the average maximum temperature in July was 30.2 °C, and the average minimum temperature was −1.2 °C; in the second year, these values were 35.2 °C and 26.1 °C, respectively. Monthly precipitation ranged from 11.5 to 166.1 mm in the first year and from 35.3 to 625.1 mm from January to July in the second year. Soil properties were as follows: organic matter, 11.89 g kg−1; available nitrogen, 88.26 mg kg−1; available phosphorus, 6.04 mg kg−1; available potassium, 42.33 mg kg−1; and pH, 7.34.

4.2. Measurement and Analytical Methods for Agronomic Traits

4.2.1. Agronomic Trait Quantification

Measured parameters included:
  • Plant height (PH): Distance from the ground (the cotyledonary node in seedlings) to the top of the main stem (growing point), measured after straightening the plant (cm) [48].
  • Branch number (BN): Total primary branches above root crown (parallel to ground) [48].
  • Multifoliolate trait frequency (MF): A species-specific indicator used in alfalfa to evaluate the occurrence of compound leaves with an increased number of leaflets. To determine MF, the total number of compound leaves on a representative branch was recorded, and leaves with four or more leaflets were classified as multifoliolate [48]. MF was then calculated as the proportion of multifoliolate leaves among all compound leaves on the branch, using the following formula:
MF (%) = (Number of compound leaves with ≥ 4 leaflets/Total number of compound leaves) × 100
This trait is commonly used to assess the expression frequency of the multifoliolate phenotype, which is often associated with potential yield improvement and forage quality enhancement.
  • Fresh weight (FW): Fresh biomass per plant after cutting (g) [49].
  • The leaf/stem ratio (LSR) was determined as the ratio of leaf dry weight to stem dry weight. After harvest, plant samples were manually separated into leaf and stem components. Each component was oven-dried at 65 °C to a constant weight. The LSR was calculated using the following formula:
LSR = Leaf dry weight (g)/Stem dry weight (g)
This ratio serves as an important indicator of forage quality, with higher values generally reflecting improved digestibility and nutritional content [50].
  • Dry weight (DW): Dry biomass per plant (g), measured after enzyme deactivation at 105 °C for 30 min followed by drying at 65 °C to a constant weight [51].
  • The fresh/hay yield ratio (FHR) was calculated as the ratio of fresh biomass weight to dry biomass weight. Fresh weight was measured immediately after harvesting each plant. To determine dry weight, the same plant samples were oven-dried at 65 °C until a constant weight was achieved (typically 48–72 h). The FHR was then calculated using the formula:
FHR = Fresh weight (g)/Dry weight (g)
This ratio reflects the water content of the biomass and serves as an important indicator of forage moisture characteristics and drying efficiency.
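The three derived traits defined above reduce to simple ratios. The sketch below uses illustrative function names and a hypothetical plant. The moisture line follows from the FHR definition alone, because water fraction = (fresh − dry)/fresh = 1 − 1/FHR.

```python
def multifoliolate_frequency(leaflets_per_leaf):
    """MF (%): share of compound leaves on a branch with four or more leaflets."""
    if not leaflets_per_leaf:
        raise ValueError("at least one compound leaf is required")
    multifoliolate = sum(1 for n in leaflets_per_leaf if n >= 4)
    return multifoliolate / len(leaflets_per_leaf) * 100


def leaf_stem_ratio(leaf_dry_g, stem_dry_g):
    """LSR: leaf dry weight divided by stem dry weight."""
    if stem_dry_g <= 0:
        raise ValueError("stem dry weight must be positive")
    return leaf_dry_g / stem_dry_g


def fresh_hay_ratio(fresh_g, dry_g):
    """FHR: fresh weight divided by dry weight."""
    if dry_g <= 0:
        raise ValueError("dry weight must be positive")
    return fresh_g / dry_g


# Hypothetical plant (values are NOT from the study).
leaflets = [3, 3, 4, 5, 3, 3, 4, 3]  # leaflet count of each compound leaf
mf = multifoliolate_frequency(leaflets)  # 3 of 8 leaves have >= 4 leaflets
lsr = leaf_stem_ratio(12.0, 15.0)
fhr = fresh_hay_ratio(108.0, 27.0)

# Water content of the fresh biomass implied by FHR.
moisture_pct = (1 - 1 / fhr) * 100

print(f"MF = {mf:.1f}%, LSR = {lsr:.2f}, FHR = {fhr:.1f}, moisture = {moisture_pct:.0f}%")
```

A lower FHR therefore means drier fresh biomass, which is why the selection index weights FHR negatively.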

4.2.2. Statistical Analysis

Data processing: Raw data cleaning and descriptive statistics using spreadsheet software (Excel 2021, Microsoft, Redmond, WA, USA).
Statistical analyses: Independent-samples t-tests were used to compare two groups, and one-way analysis of variance (ANOVA) followed by Duncan’s multiple range test (DMRT) was used to assess differences among multiple groups (α = 0.05). Both analyses were performed in SPSS 27 (IBM, Armonk, NY, USA).
Trait correlations: Inter-trait associations evaluated using Pearson correlation coefficients (ρ) with α = 0.05.
Principal component analysis (PCA): Dimension reduction analysis of six standardized traits conducted in Origin 2024 (OriginLab, Northampton, MA, USA), with principal components extracted via Kaiser criterion (eigenvalue > 1). Comprehensive selection model constructed using factor loading matrix: Y = Σ (weight × standardized trait value).
Visualization: Correlation heatmaps and PCA biplots generated using GraphPad Prism 9.0 (Graphpad Software, Boston, MA, USA).

5. Conclusions

This study establishes a robust, PCA-driven framework for multidimensional trait selection in alfalfa (Medicago sativa L.) breeding, effectively addressing key limitations of conventional single-trait approaches. By quantifying six yield-related agronomic traits across parental lines and F1/F2 hybrid generations, we identified three principal components that together explained 71.14% of the total phenotypic variance. The resulting composite selection model (Y = 0.210ZX1 + 0.360ZX2 − 0.257ZX3 − 0.057ZX4 + 0.194ZX5 + 0.278ZX6) enabled the identification of superior F1 hybrids, whose F2 progenies exhibited significant improvements in dry weight, multifoliolate trait frequency, and yield stability. Notably, the model mitigated generational yield decline and effectively balanced trade-offs among complex traits.
The validated framework not only enhances selection accuracy and breeding efficiency but also demonstrates strong scalability for integration with genomic and environmental datasets. These findings underscore the transformative potential of multivariate selection models in modern forage crop improvement, particularly in response to increasing global demand for livestock feed and the escalating challenges of climate variability.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/plants14182906/s1, Figure S1: Scree Plot of Principal Components for Six Agronomic Traits; Table S1: Raw Phenotypic Dataset of Medicago sativa - Plant Height and Field Trials.

Author Contributions

Conceptualization, Z.C. and Z.W.; Data curation, Z.C.; Formal analysis, Z.C.; Funding acquisition, Z.S. and Z.W.; Investigation, Z.C.; Methodology, Z.C. and X.M.; Project administration, Z.W.; Resources, X.M. and Z.W.; Software, J.L.; Supervision, Z.W.; Validation, Z.C., J.L., H.L., M.Y., Q.W., R.J. and S.Z.; Visualization, Z.C.; Writing—original draft, Z.C. and J.L.; Writing—review and editing, Z.S., X.M. and Z.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Biological Breeding—National Science and Technology Major Project, grant number 2023ZD04060; Jiangsu Key R&D Program, “Key Technology Research on Breeding and Production Application of ‘Huaiyang No. 4’ Alfalfa Cultivar”, grant number BE2023383; National Extension Program for Scientific and Technological Achievements in Forestry and Grassland, State Forestry and Grassland Administration, China, grant number 2023133122; and Central Finance Forestry Science and Technology Extension Demonstration Fund Project, grant number Jiangsu [2023] TG12.

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Materials. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Dayananda, B.; Chahwala, P.; Cozzolino, D. The Ability of Near-Infrared Spectroscopy to Discriminate Plant Protein Mixtures: A Preliminary Study. AppliedChem 2023, 3, 428–436.
  2. Gill, T.; Gill, S.K.; Saini, D.K.; Chopra, Y.; de Koff, J.P.; Sandhu, K.S. A Comprehensive Review of High Throughput Phenotyping and Machine Learning for Plant Stress Phenotyping. Phenomics 2022, 2, 156–183.
  3. da Piedade, G.N.; Vieira, L.V.; dos Santos, A.R.P.; Amorim, D.J.; Zanotto, M.D.; Sartori, M.M.P. Principal Component Analysis for Identification of Superior Castor Bean Hybrids. J. Agric. Sci. 2019, 11, 179.
  4. Zhao, Y.; Li, X.; Chen, Z.; Lu, H.; Liu, Y.; Zhang, Z.; Liang, C. An Overview of Genome-wide Association Studies in Plants. Chin. Bull. Bot. 2020, 55, 715–732.
  5. Varanasi, Y.V.P.; Isetty, S.R.; Revadi, P.; Balakrishnan, D.; Hajira, S.; Prasad, M.S.; Laha, G.S.; Perraju, P.; Singh, U.M.; Singh, V.K.; et al. Molecular and Morphological Characterization of Introgression Lines with Resistance to Bacterial Leaf Blight and Blast in Rice. Plants 2023, 12, 3012.
  6. Phahom, T.; Mano, J. Integration of Multiple Linear Regression, Principal Component Analysis, and Hierarchical Cluster Analysis for Optimizing Dried Fingerroot (Boesenbergia rotunda) Extraction Process. J. Appl. Res. Med. Aromat. Plants 2023, 36, 100511.
  7. Placide, R.; Shimelis, H.; Laing, M.; Gahakwa, D. Application of Principal Component Analysis to Yield and Yield Related Traits to Identify Sweet Potato Breeding Parents. Trop. Agric. 2015, 92, 1.
  8. Wang, M.; Ahmad, I.; Qin, B.; Chen, L.; Bu, W.; Zhu, G.; Zhou, G. Identification and Comprehensive Evaluation of Drought Tolerance in Sorghum During Germination and Seedling Stages. Plants 2025, 14, 1793.
  9. Baha, J.; Liu, W.; Ma, X.; Li, Y.; Zhao, X.; Zhai, X.; Cao, X.; Guo, W. Comprehensive Evaluation of 202 Cotton Varieties (Lines) and Their Physiological Drought Resistance Response During Seedling Stage. Plants 2025, 14, 1770.
  10. Shojaei, S.H.; Bihamta, M.R.; Mousavi, S.M.N.; Qasemi, S.H.; Keshavarzi, M.H.B.; Omrani, A. Application of Graphical Analysis and Principal Components to Identify the Effect of Genotype × Trait in Maize Hybrids. Agrosystems Geosci. Environ. 2024, 7, e20548.
  11. Ferreira, E.A.; de Abreu, J.G.; Martinez, J.C.; dos Santos Braz, T.G.; Ferreira, D.P. Cutting Ages of Elephant Grass for Chopped Hay Production. Pesqui. Agropecuária Trop. 2018, 48, 245–253. (In Portuguese)
  12. Alatürk, F. Effects of Harvest Height and Time on Hay Yield and Quality of Some Sweet Sorghum and Sorghum Sudangrass Hybrid Varieties. PeerJ 2024, 12, e17274.
  13. Cao, Z.; Li, J.; Wang, C.; Min, X.; Wei, Z. Stem Coloration in Alfalfa: Anthocyanin Accumulation Patterns and Nutrient Profiles of Red- and Green-Stemmed Variants. Agronomy 2025, 15, 862.
  14. Kloda, J.M.; Dean, P.D.G.; Maddren, C.; MacDonald, D.W.; Mayes, S. Using Principle Component Analysis to Compare Genetic Diversity across Polyploidy Levels within Plant Complexes: An Example from British Restharrows (Ononis spinosa and Ononis repens). Heredity 2008, 100, 253–260.
  15. Fatima, S.; Rashid, M.; Hameed, A.; Fiaz, S.; Rebouh, N.Y.; Zaman, Q.U. Development of Rice Mutants with Enhanced Resilience to Drought and Brown Spot (Bipolaris oryzae) and Their Physiological and Multivariate Analysis. BMC Plant Biol. 2025, 25, 1040.
  16. Rodriguez, D.F.C.; Urban, M.O.; Santaella, M.; Gereda, J.M.; Contreras, A.D.; Wenzl, P. Using Phenomics to Identify and Integrate Traits of Interest for Better-Performing Common Beans: A Validation Study on an Interspecific Hybrid and Its Acutifolii Parents. Front. Plant Sci. 2022, 13, 1008666.
  17. Filho, A.C.; Toebe, M. Sample size for principal component analysis in corn. Pesqui. Agropecuária Bras. 2021, 56, e02510. (In Portuguese)
  18. Qaraei, M.; Abbaasi, S.; Ghiasi-Shirazi, K. Randomized Non-Linear PCA Networks. Inf. Sci. 2021, 545, 241–253.
  19. Björklund, M. Be Careful with Your Principal Components. Evolution 2019, 73, 2151–2158.
  20. Lever, J.; Krzywinski, M.; Altman, N. Principal Component Analysis. Nat. Methods 2017, 14, 641–642.
  21. Yao, Y.; Ochoa, A. Limitations of Principal Components in Quantitative Genetic Association Models for Human Studies. eLife 2023, 12, e79238.
  22. Semagn, K.; Crossa, J.; Cuevas, J.; Iqbal, M.; Ciechanowska, I.; Henriquez, M.A.; Randhawa, H.; Beres, B.L.; Aboukhaddour, R.; McCallum, B.D.; et al. Comparison of Single-Trait and Multi-Trait Genomic Predictions on Agronomic and Disease Resistance Traits in Spring Wheat. Theor. Appl. Genet. 2022, 135, 2747–2767.
  23. Shahi, D.; Guo, J.; Pradhan, S.; Khan, J.; Avci, M.; Khan, N.; Mcbreen, J.; Bai, G.; Reynolds, M.; Foulkes, J.; et al. Multi-Trait Genomic Prediction Using in-Season Physiological Parameters Increases Prediction Accuracy of Complex Traits in US Wheat. BMC Genom. 2022, 23, 298.
  24. Osterman, J.; Gutiérrez, L.; Öhlund, L.; Ortiz, R.; Hammenhag, C.; Parsons, D.; Geleta, M. Comparison of Single-Trait and Multi-Trait GBLUP Models for Genomic Prediction in Red Clover. Agronomy 2024, 14, 2445.
  25. Jing, F.; Shi, S.; Kang, W.; Guan, J.; Lu, B.; Wu, B.; Wang, W. The Physiological Basis of Alfalfa Plant Height Establishment. Plants 2024, 13, 679.
  26. Zhao, J.; Huang, R.; Yang, K.; Ma, C.; Zhang, Q. Effects of Nitrogen and Phosphorus Fertilization on Photosynthetic Properties of Leaves and Agronomic Characters of Alfalfa over Three Consecutive Years. Agriculture 2022, 12, 1187.
  27. Ames, N.; McElroy, A.R.; Akin, D.E.; Lyon, C.E. Evaluation of Stem Strength of Alfalfa (Medicago sativa L.) Genotypes. Anim. Feed. Sci. Technol. 1995, 54, 267–274.
  28. Annicchiarico, P. Alfalfa Forage Yield and Leaf/Stem Ratio: Narrow-Sense Heritability, Genetic Correlation, and Parent Selection Procedures. Euphytica 2015, 205, 409–420.
  29. Claessens, A.; Thériault, M.; Bertrand, A.; Lajeunesse, J.; Rocher, S.; Biligetu, B. High-Energy Alfalfa (Medicago sativa L.) Developed by Recurrent Phenotypic Selection for Nonfiber Carbohydrate Concentration in Stems. Crop Sci. 2025, 65, e70054. [Google Scholar] [CrossRef]
  30. Min, X.; Luo, K.; Liu, W.; Zhou, K.; Li, J.; Wei, Z. Molecular Characterization of the miR156/MsSPL Model in Regulating the Compound Leaf Development and Abiotic Stress Response in Alfalfa. Genes 2022, 13, 331. [Google Scholar] [CrossRef] [PubMed]
  31. Lin, S.; Medina, C.A.; Boge, B.; Hu, J.; Fransen, S.; Norberg, S.; Yu, L.-X. Identification of Genetic Loci Associated with Forage Quality in Response to Water Deficit in Autotetraploid Alfalfa (Medicago sativa L.). BMC Plant Biol. 2020, 20, 303. [Google Scholar] [CrossRef]
  32. Mnafgui, W.; Jabri, C.; Jihnaoui, N.; Maiza, N.; Guerchi, A.; Zaidi, N.; Basson, G.; Keyster, E.M.; Djébali, N.; Pecetti, L.; et al. Discovering New Genes for Alfalfa (Medicago sativa) Growth and Biomass Resilience in Combined Salinity and Phoma Medicaginis Infection through GWAS. Front. Plant Sci. 2024, 15, 1348168. [Google Scholar] [CrossRef]
  33. Aruna, C.; Rakshit, S.; Shrotria, P.K.; Pahuja, S.K.; Jain, S.K.; Kumar, S.S.; Modi, N.D.; Deshmukh, D.T.; Kapoor, R.; Patil, J.V. Assessing Genotype-by-Environment Interactions and Trait Associations in Forage Sorghum Using GGE Biplot Analysis. J. Agric. Sci. 2016, 154, 73–86. [Google Scholar] [CrossRef]
  34. Leitão, S.T.; Alves, M.L.; Pereira, P.; Zerrouk, A.; Godinho, B.; Barradas, A.; Vaz Patto, M.C. Towards a Trait-Based Approach to Potentiate Yield under Drought in Legume-Rich Annual Forage Mixtures. Plants 2021, 10, 1763. [Google Scholar] [CrossRef]
  35. Volenec, J.J. Physiological Control of Alfalfa Growth and Yield. In Crop Yield: Physiology and Processes; Smith, D.L., Hamel, C., Eds.; Springer: Berlin/Heidelberg, Germany, 1999; pp. 425–442. ISBN 978-3-642-58554-8. [Google Scholar]
  36. Volenec, J.J.; Cherney, J.H.; Johnson, K.D. Yield Components, Plant Morphology, and Forage Quality of Alfalfa as Influenced by Plant Population. Crop Sci. 1987, 27, 321–326. [Google Scholar] [CrossRef]
  37. Lorenzo, C.D.; García-Gagliardi, P.; Antonietti, M.S.; Sánchez-Lamas, M.; Mancini, E.; Dezar, C.A.; Vazquez, M.; Watson, G.; Yanovsky, M.J.; Cerdán, P.D. Improvement of Alfalfa Forage Quality and Management through the Down-regulation of MsFTa1. Plant Biotechnol. J. 2019, 18, 944–954. [Google Scholar] [CrossRef] [PubMed]
  38. Yun, A.; Shi, S.; Gong, W.; Zhang, J.; Zhang, X.; Zhang, J. Cross-Breeding Improvement and Performance Analysis of Dominant Production Traits in Grazing-Type Alfalfa (Medicago sativa L.). BioMed Res. Int. 2022, 2022, 1252310. [Google Scholar] [CrossRef]
  39. Behera, P.P.; Singode, A.; Bhat, B.V.; Ronda, V.; Borah, N.; Verma, H.; Gogoi, L.R.; Borah, J.L.; Majhi, P.K.; Saharia, N.; et al. Genetic Gains in Forage Sorghum for Adaptive Traits for Non—Conventional Area through Multi-Trait-Based Stability Selection Methods. Front. Plant Sci. 2024, 15, 1248663. [Google Scholar] [CrossRef] [PubMed]
  40. Zhou, X.; Stephens, M. Efficient Multivariate Linear Mixed Model Algorithms for Genome-Wide Association Studies. Nat. Methods 2014, 11, 407–409. [Google Scholar] [CrossRef]
  41. Pan, X.; Wang, P.; Wei, X.; Zhang, J.; Xu, B.; Chen, Y.; Wei, G.; Wang, Z. Exploring Root System Architecture and Anatomical Variability in Alfalfa (Medicago sativa L.) Seedlings. BMC Plant Biol. 2023, 23, 449. [Google Scholar] [CrossRef]
  42. Hakl, J.; Mofidian, S.M.A.; Kozová, Z.; Fuksa, P.; Jaromír, Š. Estimation of Lucerne Yield Stability for Enabling Effective Cultivar Selection under Rainfed Conditions. Grass Forage Sci. 2019, 74, 687–695. [Google Scholar] [CrossRef]
  43. McDonagh, J.; O’Donovan, M.; McEvoy, M.; Gilliland, T.J. Genetic Gain in Perennial Ryegrass (Lolium perenne) Varieties 1973 to 2013. Euphytica 2016, 212, 187–199. [Google Scholar] [CrossRef]
  44. Caradus, J.R.; Chapman, D.F. Evaluating Pasture Forage Plant Breeding Achievements: A Review. New Zealand J. Agric. Res. 2025, 68, 1146–1220. [Google Scholar] [CrossRef]
  45. Crain, J.; Mondal, S.; Rutkoski, J.; Singh, R.P.; Poland, J. Combining High-Throughput Phenotyping and Genomic Information to Increase Prediction and Selection Accuracy in Wheat Breeding. Plant Genome 2018, 11, 170043. [Google Scholar] [CrossRef]
  46. Juliana, P.; Montesinos-López, O.A.; Crossa, J.; Mondal, S.; González Pérez, L.; Poland, J.; Huerta-Espino, J.; Crespo-Herrera, L.; Govindan, V.; Dreisigacker, S.; et al. Integrating Genomic-Enabled Prediction and High-Throughput Phenotyping in Breeding for Climate-Resilient Bread Wheat. Theor. Appl. Genet. 2019, 132, 177–194. [Google Scholar] [CrossRef]
  47. Wang, W.; Guo, W.; Le, L.; Yu, J.; Wu, Y.; Li, D.; Wang, Y.; Wang, H.; Lu, X.; Qiao, H.; et al. Integration of High-Throughput Phenotyping, GWAS, and Predictive Models Reveals the Genetic Architecture of Plant Height in Maize. Mol. Plant 2023, 16, 354–373. [Google Scholar] [CrossRef]
  48. Wu, Z. Research of Alfalfa Multifoliate traits and anther culture. Ph.D. Thesis, Yangzhou University, Yangzhou, China, 2013. [Google Scholar]
  49. Jabborova, D.; Abdrakhmanov, T.; Jabbarov, Z.; Abdullaev, S.; Azimov, A.; Mohamed, I.; AlHarbi, M.; Abu-Elsaoud, A.; Elkelish, A. Biochar Improves the Growth and Physiological Traits of Alfalfa, Amaranth and Maize Grown under Salt Stress. PeerJ 2023, 11, e15684. [Google Scholar] [CrossRef]
  50. Liu, J.; Lu, S.; Liu, C.; Hou, D. Nutrient Reallocation between Stem and Leaf Drives Grazed Grassland Degradation in Inner Mongolia, China. BMC Plant Biol. 2022, 22, 505. [Google Scholar] [CrossRef]
  51. GB/T 6435-2014; Determination of Moisture in Feed Stuffs. Standardization Administration of PRC (SAC) Standards Press of China: Beijing, China, 2014.
Figure 1. Multi-variate relationships among agronomic traits in alfalfa F1 hybrids. Correlation matrix of six agronomic traits in alfalfa F1 hybrids derived from reciprocal crosses. Pairwise relationships were assessed using Pearson’s correlation analysis. Asterisks indicate significance levels: ** p < 0.01, * p < 0.05.
Figure 2. Principal Component Loading Plot of Six Agronomic Traits Based on PCA (PC1–PC3).
Table 1. Agronomic Performance of Alfalfa Parents and Progeny.
| Traits ¹ | Parents ², Paternal Mean (♂) | Parents ², Maternal Mean (♀) | p-Value ³ | F1 Mean |
|---|---|---|---|---|
| Height (cm) | 80.85 ± 14.92 | 72.48 ± 16.23 | 0.036 | 75.66 ± 17.64 |
| Branches | 7.31 ± 1.90 | 9.16 ± 3.28 | 0.008 | 8.34 ± 3.08 |
| FHR | 4.33 ± 0.24 | 4.04 ± 0.29 | 0.001 | 4.16 ± 0.29 |
| LSR | 1.64 ± 0.35 | 1.50 ± 0.29 | 0.134 | 1.57 ± 0.31 |
| MF of individual plants (%) | 0.00 | 68.87 ± 4.29 | 0.000 | 29.03 ± 5.29 |
| Dry weight of individual plants (g) | 119.69 ± 29.17 | 142.26 ± 43.59 | 0.034 | 124.34 ± 47.46 |

¹ FHR: fresh/hay yield ratio. LSR: leaf/stem ratio. MF: multifoliolate trait frequency. ² Results are presented as mean ± standard deviation (SD). ³ Statistical significance was determined using two-tailed independent-sample t-tests with a significance level of α = 0.05.
Table 2. Principal Component Analysis Summary: Variance Explained and Extraction Sums of Squared Loadings (PC1–PC3).
Extraction Sums of Squared Loadings ¹

| Component ² | Total | Percentage of Variance (%) | Cumulative % |
|---|---|---|---|
| 1 | 1.946 | 32.435 | 32.435 |
| 2 | 1.306 | 21.771 | 54.206 |
| 3 | 1.016 | 16.937 | 71.143 |

¹ PCA of six agronomic traits identified three principal components (PC1–PC3) with eigenvalues >1, explaining a total of 71.14% of the variance (PC1: 32.43%; PC2: 21.77%; PC3: 16.94%). ² Total (Total Variance Explained): the eigenvalue of each factor, indicating the total amount of variance explained by that factor. Percentage of Variance: the percentage contribution of each factor to the total variance. Cumulative (Cumulative Percentage): the cumulative percentage of total variance explained by the first several factors.
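The percentages in Table 2 follow from the eigenvalues alone: with six standardized traits the total variance is 6, so each component's share is its eigenvalue divided by 6. The following is a minimal Python sketch of that arithmetic using the eigenvalues as printed in Table 2. The function and variable names are illustrative and are not taken from the study's analysis software.

```python
# Eigenvalues as printed in Table 2 (rounded to three decimals).
EIGENVALUES = {"PC1": 1.946, "PC2": 1.306, "PC3": 1.016}
N_TRAITS = 6  # six standardized traits, so the total variance is 6


def variance_explained(eigenvalues, n_traits):
    """Return {component: (percent, cumulative percent)} of total variance."""
    result = {}
    cumulative = 0.0
    for name, eig in eigenvalues.items():
        percent = 100.0 * eig / n_traits
        cumulative += percent
        result[name] = (percent, cumulative)
    return result


table = variance_explained(EIGENVALUES, N_TRAITS)

# Kaiser criterion: retain components whose eigenvalue exceeds 1.
retained = [name for name, eig in EIGENVALUES.items() if eig > 1.0]

for name, (percent, cumulative) in table.items():
    print(f"{name}: {percent:.2f}% (cumulative {cumulative:.2f}%)")
```

Because the printed eigenvalues are rounded, the recomputed percentages agree with Table 2 only to within about 0.01 percentage points.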
Table 3. Component Loading Matrix for Six Agronomic Traits Based on PCA.
| Trait (Z-score) | Component 1 | Component 2 | Component 3 |
|---|---|---|---|
| Height | 0.729 | 0.247 | −0.398 |
| Branches | 0.645 | 0.621 | −0.071 |
| FHR | −0.412 | −0.057 | −0.454 |
| LSR | 0.410 | −0.805 | 0.102 |
| Multifoliolate trait frequency of individual plants | −0.245 | 0.395 | 0.711 |
| Dry weight of individual plants | 0.775 | −0.229 | 0.363 |
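The loadings in Table 3 can be cross-checked against Table 2: for each component, the sum of squared loadings over the six traits should equal its "Total" (eigenvalue). Summing across components instead gives each trait's communality, the share of its variance captured by PC1–PC3. A short Python sketch of both checks, using the values as printed (the names are illustrative):

```python
# Component loadings as printed in Table 3 (columns: PC1, PC2, PC3).
LOADINGS = {
    "Height": (0.729, 0.247, -0.398),
    "Branches": (0.645, 0.621, -0.071),
    "FHR": (-0.412, -0.057, -0.454),
    "LSR": (0.410, -0.805, 0.102),
    "MF": (-0.245, 0.395, 0.711),
    "Dry weight": (0.775, -0.229, 0.363),
}

# Column-wise sum of squared loadings: variance explained by each component.
component_ss = [
    sum(row[k] ** 2 for row in LOADINGS.values()) for k in range(3)
]

# Row-wise sum of squared loadings: communality of each trait, i.e. the
# share of that trait's variance captured by the three retained components.
communality = {
    trait: sum(value ** 2 for value in row) for trait, row in LOADINGS.items()
}

print([round(v, 3) for v in component_ss])
for trait, h2 in communality.items():
    print(f"{trait}: {h2:.3f}")
```

The column sums reproduce the eigenvalues in Table 2 to within rounding. FHR has the lowest communality (about 0.38), so the three retained components represent it less completely than the other traits.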
Table 4. Combined Scores of Agronomic Traits in F1 Individuals 1.
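| Overall Ranking | Combined Score | Overall Ranking | Combined Score | Overall Ranking | Combined Score |
|---|---|---|---|---|---|
| 1 | 35.40 | 31 | 0.65 | 61 | −2.54 |
| 2 | 18.89 | 32 | 0.62 | 62 | −2.73 |
| 3 | 15.28 | 33 | 0.46 | 63 | −2.84 |
| 4 | 13.97 | 34 | 0.44 | 64 | −2.92 |
| 5 | 10.82 | 35 | 0.33 | 65 | −2.97 |
| 6 | 10.70 | 36 | 0.10 | 66 | −3.06 |
| 7 | 10.46 | 37 | 0.08 | 67 | −3.15 |
| 8 | 10.31 | 38 | −0.32 | 68 | −3.67 |
| 9 | 10.05 | 39 | −0.34 | 69 | −4.09 |
| 10 | 8.79 | 40 | −0.41 | 70 | −4.26 |
| 11 | 7.79 | 41 | −0.52 | 71 | −4.49 |
| 12 | 7.49 | 42 | −0.53 | 72 | −4.89 |
| 13 | 7.30 | 43 | −0.63 | 73 | −5.55 |
| 14 | 7.10 | 44 | −0.65 | 74 | −5.99 |
| 15 | 6.77 | 45 | −0.66 | 75 | −6.60 |
| 16 | 6.42 | 46 | −0.72 | 76 | −6.78 |
| 17 | 6.15 | 47 | −0.78 | 77 | −7.06 |
| 18 | 6.06 | 48 | −1.02 | 78 | −7.16 |
| 19 | 5.69 | 49 | −1.05 | 79 | −7.71 |
| 20 | 4.88 | 50 | −1.17 | 80 | −7.97 |
| 21 | 4.75 | 51 | −1.44 | 81 | −8.41 |
| 22 | 4.55 | 52 | −1.70 | 82 | −9.37 |
| 23 | 4.01 | 53 | −1.78 | 83 | −9.46 |
| 24 | 3.93 | 54 | −1.83 | 84 | −9.57 |
| 25 | 3.14 | 55 | −1.99 | 85 | −10.26 |
| 26 | 2.86 | 56 | −2.06 | 86 | −10.45 |
| 27 | 1.82 | 57 | −2.17 | 87 | −13.84 |
| 28 | 1.54 | 58 | −2.18 | 88 | −14.19 |
| 29 | 0.95 | 59 | −2.42 | 89 | −14.67 |
| 30 | 0.83 | 60 | −2.52 | 90 | −15.83 |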
1 The combined score represents a composite metric derived from the original variables through PCA. It reduces data dimensionality and captures essential patterns by applying specified weighting coefficients according to Formula (5). This integrated score serves as a single indicator for evaluating the overall characteristics of each sample, enabling comparative ranking, assessment, and prediction of yield performance.
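To make the selection step concrete, the sketch below applies the reported top 31.1% cut to the ranking in Table 4. It assumes that selection was made strictly by combined score and that the 90 ranked entries are the full F1 population. The resulting count of 28 plants is inferred from those two figures and is not stated explicitly in this section. All names are illustrative.

```python
# Combined scores for overall ranks 1-30, copied from Table 4. Ranks 31-90
# all score below 0.83 and are omitted because they fall outside the cut.
TOP_SCORES = [
    35.40, 18.89, 15.28, 13.97, 10.82, 10.70, 10.46, 10.31, 10.05, 8.79,
    7.79, 7.49, 7.30, 7.10, 6.77, 6.42, 6.15, 6.06, 5.69, 4.88,
    4.75, 4.55, 4.01, 3.93, 3.14, 2.86, 1.82, 1.54, 0.95, 0.83,
]
N_PLANTS = 90               # number of ranked F1 individuals in Table 4
SELECTION_FRACTION = 0.311  # reported selection intensity (top 31.1%)

# Assumption: the selected set is the top-ranked plants by combined score.
n_selected = round(SELECTION_FRACTION * N_PLANTS)
selected = TOP_SCORES[:n_selected]
cutoff = selected[-1]  # lowest combined score that is still selected

print(f"{n_selected} plants selected; cutoff combined score = {cutoff}")
```

Under these assumptions the cut falls between ranks 28 and 29, so every selected plant has a combined score of at least 1.54.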
Table 5. Comparative Results of Agronomic Traits of Screened Plants and Hybrid Progeny of The Integrated Evaluation Model 1.
| Trait | F1 Generation Selected Plants | All F1 Generation Hybrid Plants | Selected Single Natural Cross F2 Generation Plants | Non-Selected Single Natural Cross F2 Generation Plants | All F2 Generation Plants |
|---|---|---|---|---|---|
| Height (cm) | 86.45 ± 12.04 a | 75.66 ± 17.64 c | 80.68 ± 9.77 b | 69.27 ± 11.09 d | 74.98 ± 11.90 c |
| Branches | 11.11 ± 3.02 a | 8.34 ± 3.08 b | 8.57 ± 1.64 b | 7.92 ± 1.81 b | 8.24 ± 1.75 b |
| Fresh/hay yield ratio (FHR) | 3.99 ± 0.26 b | 4.16 ± 0.29 a | 3.83 ± 0.30 c | 4.14 ± 0.35 a | 3.97 ± 0.36 b |
| Leaf/stem ratio (LSR) | 1.60 ± 0.19 | 1.57 ± 0.31 | 1.62 ± 0.27 | 1.59 ± 0.24 | 1.61 ± 0.25 |
| Multifoliolate trait frequency of individual plants (%) | 34.68 ± 7.86 c | 29.03 ± 5.29 c | 50.74 ± 9.92 a | 42.71 ± 6.35 b | 46.73 ± 8.13 ab |
| Dry weight of individual plants (g) | 161.21 ± 45.34 a | 124.34 ± 19.46 cd | 143.69 ± 44.67 b | 112.80 ± 27.65 d | 128.24 ± 40.15 c |
1 Values are presented as means ± standard errors. Different lowercase letters indicate statistically significant differences at p < 0.05 based on multiple comparison tests. Identical letters indicate no significant difference (p > 0.05). Statistical significance was determined using Duncan’s test at the 95% confidence level.
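The headline gains reported for the selected F2 progeny can be recomputed from the means in Table 5. The sketch below uses the mean of all F1 hybrid plants as the baseline. That baseline is an assumption: it is the choice that reproduces the reported +15.56% (dry weight) and +74.78% (multifoliolate frequency). The names are illustrative.

```python
# Means from Table 5: dry weight per plant (g) and multifoliolate
# trait frequency (%).
ALL_F1 = {"dry_weight": 124.34, "mf": 29.03}
SELECTED_F2 = {"dry_weight": 143.69, "mf": 50.74}


def percent_change(new, baseline):
    """Relative change of `new` with respect to `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Assumed baseline: the mean of all F1 hybrid plants.
gains = {
    trait: percent_change(SELECTED_F2[trait], ALL_F1[trait])
    for trait in ALL_F1
}

for trait, gain in gains.items():
    print(f"{trait}: {gain:+.2f}%")
```

The same helper can be pointed at other column pairs in Table 5, for example selected versus non-selected F2 plants, to compare groups within one generation.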
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Cao, Z.; Li, J.; Lei, H.; Yan, M.; Wang, Q.; Ji, R.; Zhang, S.; Min, X.; Sun, Z.; Wei, Z. PCA-Driven Multivariate Trait Integration in Alfalfa Breeding: A Selection Model for High-Yield and Stable Progenies. Plants 2025, 14, 2906. https://doi.org/10.3390/plants14182906
