Article

Multi-Environment Evaluation of Soybean Variety Heike 88: Transgressive Segregation and Regional Adaptation in Northern China

1
Heihe Branch of Heilongjiang Academy of Agricultural Sciences, No. 345 Huanchengxi Road, Heihe 152052, China
2
Soybean Research Institute, Heilongjiang Academy of Agricultural Sciences, Harbin 150086, China
*
Authors to whom correspondence should be addressed.
Agriculture 2025, 15(20), 2106; https://doi.org/10.3390/agriculture15202106
Submission received: 5 September 2025 / Revised: 2 October 2025 / Accepted: 8 October 2025 / Published: 10 October 2025
(This article belongs to the Special Issue Crop Yield Improvement in Genetic and Biology Breeding)

Abstract

Heike 88, a new soybean variety developed through strategic hybridization of Heijiao 08-1611 × Heihe 43 followed by pedigree selection, was evaluated across seven locations in Heilongjiang Province from 2019 to 2022. The variety demonstrated stable performance with a 10.3% average yield advantage over regional check varieties and mean yields of 3188 kg ha−1. Principal component analysis revealed that genetic variation accounted for 43.4% and 32.6% of performance variance in the first two components, indicating successful transgressive segregation where the pure line exceeded both parental lines through complementary gene action. Performance relative to parental averages ranged from −20% to +40% across the temperature gradient, demonstrating strong genotype-environment interaction effects. Machine learning analysis identified year effect (13% importance), accumulated temperature (7.6% importance), and oil content (4% importance) as primary yield drivers. Complete resistance to soybean mosaic virus (SMV) and cyst nematode was observed across all locations, with excellent gray leaf spot resistance (grades 0–1) maintained under natural pathogen pressure. Seed quality parameters remained stable across environments, with protein content ranging from 41.69% to 42.25% and oil content from 19.74% to 20.13%, indicating minimal environmental effects on compositional traits. Yield stability improved progressively over the evaluation period, with the coefficient of variation decreasing from 18.7% in 2019 to 6.7% in 2022, while absolute yields increased from 2550 to 3200 kg ha−1. These results demonstrate successful exploitation of transgressive segregation for regional adaptation through strategic parent selection and pedigree breeding, supporting commercial deployment in northern China’s challenging production environments while providing methodological guidance for future breeding programs targeting environmental specificity.

1. Introduction

Soybean (Glycine max (L.) Merr.) is one of the world’s most important crops, providing vital protein and oil that support global food security and agricultural economies [1,2]. With annual worldwide production surpassing 350 million metric tons, soybeans supply roughly 70% of the world’s protein meal and 30% of vegetable oil consumption (FAOSTAT, 2023) [3]. The crop’s economic importance extends beyond human nutrition to include livestock feed, industrial uses, and emerging biofuel markets, making soybean improvement a key focus for global agricultural sustainability [4]. Global soybean production faces increasing challenges that transcend geographic boundaries. Climate variability increasingly threatens yield stability, with temperature extremes, changing precipitation patterns, and shifting growing seasons affecting areas from the Americas to Asia [5]. At the same time, changing consumer demands for higher protein content, better oil quality, and improved nutrition push breeders to develop varieties that balance yield with quality traits [6,7]. Disease pressure continues to grow globally, with pathogens adapting to new environmental conditions and overcoming previously effective resistance genes [8].
Modern plant breeding has evolved to address these multifaceted challenges through integration of traditional selection methods with advanced analytical approaches. The combination of quantitative genetics, statistical modeling, and computational analysis provides deeper understanding of complex trait relationships and enables more informed selection decisions [9,10]. Multi-environment testing frameworks have become essential for developing varieties that perform consistently across diverse production systems, while stability analysis methods enable breeders to identify genotypes with robust adaptation [2]. These analytical advances, when combined with systematic breeding strategies, create unprecedented opportunities for genetic improvement in crops like soybean.
Transgressive segregation—the phenomenon where progeny from crosses exceed parental performance through novel combinations of complementary alleles—represents a fundamental mechanism for genetic advancement in plant breeding [11,12]. Unlike heterosis, which manifests in F1 hybrids and diminishes in subsequent generations, transgressive segregation can be captured and fixed in pure lines through pedigree selection, making it particularly valuable for self-pollinated crops [13]. This mechanism operates through the complementary action of favorable alleles distributed across both parents, which can be combined through crossing and fixed through selection to produce progeny superior to either parent [14]. While heterosis has been extensively exploited in cross-pollinated crops like maize and rice, the systematic utilization of transgressive segregation in self-pollinated species like soybean offers substantial potential for genetic improvement without the complexities of hybrid seed production systems.
Recent molecular studies have made significant advances in understanding performance mechanisms, with comparative transcriptomic analysis revealing that seedling performance in soybean involves complex relationships between DNA methylation patterns and differential gene expression [15,16]. High-throughput genomics approaches have significantly advanced our understanding of the genetic, molecular, and epigenetic mechanisms underlying performance, including epigenetic modifications such as DNA methylation and histone acetylation [11]. However, despite these advances, the molecular and genetic mechanisms underlying performance are yet to be fully elucidated in soybean, particularly regarding performance expression under extreme growing conditions and its practical application in regional breeding programs [17,18]. The integration of genomics-assisted selection with traditional breeding approaches has revolutionized crop improvement, enabling more precise trait manipulation and accelerated variety development [19,20]. Recent advances in genome-wide association studies (GWASs) and genomic prediction models have enhanced understanding of complex trait inheritance in soybeans, particularly for yield components and stress tolerance [21]. Machine learning algorithms now enable sophisticated analysis of genotype-by-environment interactions, providing unprecedented insights into variety stability and adaptation mechanisms [22]. However, the application of these advanced analytical frameworks to performance evaluation in extreme environments remains limited, representing a critical research frontier.
China exemplifies the global challenges facing soybean production and improvement. As the world’s largest consumer and fourth-largest producer, China confronts the dual challenge of meeting increasing domestic demand while enhancing production efficiency under diverse agroecological conditions [23]. Northern China’s soybean production regions face particularly severe constraints, including short growing seasons with limited thermal accumulation, temperature fluctuations during critical reproductive stages, and increasing pressure from evolving pathogen populations [24,25]. These conditions demand varieties with enhanced stress tolerance, accelerated development, and stable performance characteristics [26]. The Heilongjiang Province Fourth Temperature Zone epitomizes these challenges, characterized by accumulated temperatures of 2000–2300 °C (≥10 °C base) during the growing season, frost-free periods of 110–130 days, and annual precipitation of 400–650 mm [27]. This environment requires varieties capable of efficient resource utilization within compressed timeframes while maintaining stable yields under variable conditions [28,29]. Contemporary soybean improvement requires a careful balance among competing objectives. The well-documented inverse relationship between protein and oil content complicates efforts to optimize both traits simultaneously. Disease resistance must be combined with yield potential without sacrificing other desirable qualities, calling for comprehensive evaluation under various pathogen pressures [30]. Environmental adaptation must be achieved while maintaining broad stability, which requires extensive multi-location testing and detailed analysis of genotype-by-environment interactions [31].
This research tackles these complex challenges through the systematic development and thorough evaluation of Heike 88, a new soybean variety specifically adapted to the difficult production environments of northern China. The variety was created by strategically hybridizing complementary parental lines, Heijiao 08-1611, chosen for cold tolerance and protein content, and Heihe 43, recognized for its disease resistance and yield stability. This was followed by intensive pedigree selection over six generations. The study combines traditional multi-environment field testing with advanced analytical techniques, including machine learning and principal component analysis, to offer a comprehensive understanding of variety performance and the underlying mechanisms. The objectives were to evaluate Heike 88’s agronomic performance, yield stability, and quality traits across different environments within the target production zone; measure genetic improvement compared to parental lines through transgressive segregation; identify key environmental and genetic factors influencing variety performance using advanced statistical and computational methods; assess disease resistance under natural pathogen pressure across multiple locations; and provide practical guidance for breeding programs aimed at environmental specificity by strategically leveraging transgressive segregation. The combination of traditional breeding expertise with modern analytical tools provides both a high-performing variety for immediate use and a framework for speeding up genetic improvements in challenging production environments.

2. Materials and Methods

2.1. Experimental Design and Statistical Framework

All field evaluations used a randomized complete block design (RCBD) with location-specific replication schemes to ensure statistical accuracy and data quality. Preliminary trials conducted from 2016 to 2017 used four replications, while regional multi-environment trials from 2019 to 2022 employed three replications in accordance with established Heilongjiang Provincial Crop Variety Testing protocols. The statistical model for combined analysis across environments was $Y_{ijk} = \mu + G_i + E_j + (GE)_{ij} + B_{k(j)} + \varepsilon_{ijk}$, where $Y_{ijk}$ stands for the observed value, $\mu$ is the overall mean, $G_i$ is the effect of the $i$th genotype (fixed), $E_j$ is the effect of the $j$th environment (random), $(GE)_{ij}$ is the genotype-by-environment interaction (random), $B_{k(j)}$ is the effect of the $k$th block within the $j$th environment (random), and $\varepsilon_{ijk}$ is the residual error. Genotypes included Heike 88 and Heihe 43 (check variety) as fixed effects.

2.2. Plant Materials and Breeding Strategy

2.2.1. Parent Line Selection and Characterization

Female parent: Heijiao 08-1611 was developed through hybridization of Heihe 46 × Heihe 30, followed by five generations of continuous self-pollination and pedigree selection for regional adaptation and yield performance. This line exhibits a semi-determinate podding habit with a maturity period of approximately 116 days from emergence to physiological maturity and requires an accumulated temperature of 2160 °C at ≥10 °C. Morphological characteristics include a plant height of about 80 cm, a branched growth habit, purple flowers, pointed leaves, gray pubescence, and sickle-shaped pods that turn brown at maturity. Seeds are round, with a yellow seed coat and a yellow hilum, exhibiting good luster, and have a 100-seed weight of approximately 20.0 g. Compositional analysis reveals a crude protein content of 40.20% and a crude fat content of 19.55%. The line demonstrates moderate resistance to gray leaf spot (Cercospora kikuchii).
Male parent: Heihe 43 (originally Heijiao 00-1152) was developed by the Heihe Branch of Heilongjiang Academy of Agricultural Sciences through crossing Heihe 18 and Heihe 23, followed by five generations of continuous self-pollination and pedigree selection. This variety was officially approved by the Heilongjiang Crop Variety Approval Committee in 2007. It has a growth period of 115 days and requires an accumulated temperature of 2150 °C at ≥10 °C, providing suitable thermal complementarity with the female parent for the Fourth Temperature Zone environment. Morphological features include a plant height of about 75 cm, a branched growth habit, purple flowers, long leaves, gray pubescence, and curved, sickle-shaped pods that turn gray at maturity. Seeds are round, with a yellow seed coat and a light-yellow hilum, exhibiting good luster, and have a 100-seed weight of approximately 20 g. The chemical composition includes a protein content of 41.84% and a fat content of 18.98%. The variety exhibits moderate resistance to gray leaf spot.

2.2.2. Hybridization and Population Development

Initial hybridization between Heijiao 08-1611 and Heihe 43 was conducted in 2010 using standard emasculation and controlled pollination techniques. Emasculation was performed during the pre-anthesis stage (R1) under controlled environmental conditions to ensure pollen viability and prevent contamination. Hand pollination was carried out with freshly collected pollen from the male parent, and the pollinated flowers were tagged and protected with pollination bags. The complete breeding pedigree and timeline for the development of Heike 88 are shown in Figure 1.
In 2010, the F1 generation was verified through both morphological screening and molecular marker analysis to eliminate false hybrids and ensure genetic uniformity. Generation advancement was accelerated by optimizing growth conditions, enabling rapid progression to F6 by 2015. Mixed selection strategies were employed during early generation development (2011–2012), combining mass selection to improve the population with individual plant selection to identify superior genotypes [32]. A thorough evaluation took place from 2012 to 2014, focusing on assessing protein content, determining oil content, and evaluating yield across multiple environments. Pedigree selection was intensified from F4 to F6 generations, with detailed records of individual plant performance and family relationships to preserve genetic diversity while advancing with superior genotypes [33]. Selection criteria included yield potential, plant structure, disease resistance, maturity traits, and seed quality traits. In 2015, a stable breeding line showing strong performance across multiple traits was selected and named Heijiao 15-2011, later renamed Heike 88. Following a range of final line selections in 2015, ongoing breeding efforts involved self-pollination to reach the F6 generation (2016–2023), aiming to confirm genetic stability and further improve the lines, alongside a comprehensive multi-location evaluation program.

2.2.3. Performance Evaluation Design and Parent Performance Assessment

Performance calculations in this study were based on comparing the final variety Heike 88 (derived from the F1 hybrid after six generations of selection) against its parental lines across multiple environments. This approach recognizes that while actual performance is usually measured in F1 hybrids, residual heterotic effects can still persist in advanced generations, especially under environmental stress. To perform these calculations, all three genotypes (Heike 88, Heijiao 08-1611, and Heihe 43) were grown simultaneously across all testing environments from 2016 to 2022. Parent lines were maintained through single-seed descent to preserve genetic integrity. Mid-parent values were calculated as the averages of both parents’ performance for each trait in each environment. High-parent (better-parent) values represented the superior parent’s performance for each trait.

2.3. Multi-Phase Evaluation Program

All evaluations took place within Heilongjiang Province’s Fourth Temperature Zone, which covers the northern and central regions between 47° and 50° N latitude. This zone has accumulated temperatures from 2000 to 2300 °C (above 10 °C), frost-free periods lasting 110 to 130 days, and annual rainfall between 400 and 650 mm. It includes major soybean production areas around Heihe, Nenjiang, Bei’an, and nearby counties. The comprehensive evaluation program was divided into three phases to ensure thorough assessments across different conditions and management systems.
The preliminary evaluation period (2016–2017) was conducted at the Heihe Branch of the Heilongjiang Academy of Agricultural Sciences using a randomized complete block design with four replications. Plot sizes were standardized at 10 m² (2 m × 5 m), with a plant density of 350,000 plants per hectare, in accordance with regional production guidelines. This phase aimed to directly compare Heike 88, both parental lines, and the regional control variety Heihe 43 to determine baseline performance and identify the best traits.
Regional multi-environment trials (2018–2020) expanded evaluation to seven representative sites across the Fourth Temperature Zone using a randomized complete block design with three replications per location. Plot size was standardized at 13.35 m² (2.67 m × 5 m) according to provincial testing protocols, with plant density optimized for local conditions, ranging from 300,000 to 400,000 plants per hectare. These trials maintained comparisons between Heike 88, parental lines, and appropriate regional check varieties to assess adaptation stability and identify optimal growing environments.
Production validation trials (2021–2022) were conducted on commercial farms across the target deployment region to verify research station findings under practical farming conditions. These trials utilized large-scale plots of at least 0.1 ha managed according to standard farmer practices with technical oversight to ensure data quality and accuracy. This phase emphasized comparisons with current commercial varieties to validate the practical advantages and commercial viability of the new variety under real-world production scenarios.

2.4. Field Management and Cultural Practices

All experimental locations followed standardized corn-soybean rotation systems, with corn serving as the previous crop in 85.7% of site-years to ensure consistent soil conditions and minimize residual effects of earlier crops. Soil preparation followed conventional practices, including tillage to a depth of 25–30 cm, followed by cultivation and seedbed preparation using standard implements to ensure optimal soil conditions for establishment and growth. Fertilization programs were standardized across all locations to minimize environmental variation and ensure comparable growing conditions. Basal fertilizer applications provided N-P2O5-K2O at rates of 30-60-45 kg ha−1 applied at planting using urea (46% N) as the nitrogen source, diammonium phosphate (18-46-0) for phosphorus, and potassium chloride (60% K2O) for potassium. Micronutrient applications were implemented on a site-specific basis following comprehensive soil testing to address local deficiencies and optimize plant nutrition. Integrated pest management protocols were implemented to ensure crop protection while maintaining the integrity of the evaluation. Pre-emergence herbicide applications used pendimethalin at 1.5 L ha−1, applied within three days of planting, for effective weed control. This was followed by post-emergence spot treatments with glyphosate as needed to manage escaped weeds. Insect control relied on monitoring-based applications of approved insecticides when economic thresholds were exceeded. Disease management protocols specifically excluded fungicide applications during evaluation periods to ensure an accurate assessment of genetic disease resistance, while production plots received standard prophylactic treatments in accordance with regional recommendations.

2.5. Agronomic Measurements and Data Collection

2.5.1. Growth and Development Characteristics

Morphological characterization was conducted using standardized procedures to ensure consistency across locations and years. Plant height measurements were recorded at the R8 (physiological maturity) stage from the soil surface to the terminal growing point using calibrated measuring devices. Bottom pod height was measured as the distance from the soil surface to the lowest pod insertion point, a critical parameter for mechanical harvest efficiency. Main stem nodes were counted from the unifoliate node to the terminal node to assess developmental patterns, while effective branches were defined as branches producing at least one filled pod, indicating their contribution to yield formation.
The lodging assessment utilized a quantitative four-point scale to ensure an objective evaluation across environments. Grade 0 indicated no plants lodged (0° from vertical), Grade 1 represented slight leaning with 5% or fewer plants affected (15–30° angle from vertical), Grade 2 indicated moderate lodging with 6–25% of plants affected (30–45° angle), and Grade 3 represented severe lodging with more than 25% of plants affected (greater than 45° angle, significantly affecting harvestability) [34]. This assessment was conducted at physiological maturity when lodging patterns had stabilized and provided an accurate representation of variety performance.
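The percentage thresholds of the lodging scale can be written as a small lookup. The following is a minimal sketch, not the authors' scoring procedure: the function name `lodging_grade` is hypothetical, lean angle is ignored, and values between 5% and 6% (which the text leaves unassigned) are placed in Grade 2 as an assumption.

```python
def lodging_grade(percent_lodged: float) -> int:
    """Map the percentage of lodged plants in a plot to the 0-3 lodging grade.

    Thresholds follow the scale described in the text. Assumption: values
    strictly between 5% and 6% are assigned to Grade 2.
    """
    if not 0 <= percent_lodged <= 100:
        raise ValueError("percent_lodged must be between 0 and 100")
    if percent_lodged == 0:
        return 0  # no plants lodged
    if percent_lodged <= 5:
        return 1  # slight leaning, 5% or fewer plants affected
    if percent_lodged <= 25:
        return 2  # moderate lodging, 6-25% of plants affected
    return 3      # severe lodging, more than 25% of plants affected


# Hypothetical plot observations (percent of plants lodged)
grades = [lodging_grade(p) for p in (0, 3, 12, 40)]
```

Scoring on the plant percentage alone keeps the rule reproducible across sites; a fuller implementation would probably also record the lean angle.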

2.5.2. Yield and Yield Components

Yield measurements employed precision techniques to ensure data accuracy and comparability across environments. Plot yields were determined through hand harvest of the entire plot area to eliminate border effects and ensure complete recovery. The harvested seed was standardized to a 13% moisture content using a calibrated moisture meter (Dickey-John GAC500XT, Dickey-John, Auburn, IL, USA), followed by precision weighing with a Mettler Toledo XS6002S balance (Mettler Toledo, Columbus, OH, USA), which has an accuracy of ±0.1 g. Final yields were calculated and expressed as kg ha−1 for statistical analysis and comparative evaluation. Yield component analysis provided detailed insights into the physiological basis of performance differences. Plant density was recorded as the number of plants per square meter, counted at the R1 stage when stand establishment was complete. Pods per plant and seeds per plant were determined from 10 representative plants per plot selected to represent plot variability. Hundred-seed weight was calculated as the average of three 100-seed samples per plot to assess seed size consistency. Seeds per pod were determined through dissection of 50 randomly selected pods per plot to evaluate reproductive efficiency and seed filling patterns.
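The conversion from a harvested plot weight to a moisture-standardized yield in kg ha−1 can be sketched as follows. This assumes the usual dry-matter correction for moisture adjustment, which the text does not spell out; the function names and the example plot weight and moisture reading are hypothetical, while the 13.35 m² plot area and 13% standard moisture come from the text.

```python
def adjust_to_standard_moisture(fresh_weight_kg: float,
                                measured_moisture_pct: float,
                                standard_moisture_pct: float = 13.0) -> float:
    """Rescale a seed weight to the standard moisture content.

    Assumption: dry matter is conserved, so the adjusted weight is
    dry matter divided by the dry fraction at the standard moisture.
    """
    dry_matter_kg = fresh_weight_kg * (100.0 - measured_moisture_pct) / 100.0
    return dry_matter_kg / ((100.0 - standard_moisture_pct) / 100.0)


def plot_yield_kg_per_ha(weight_kg: float, plot_area_m2: float) -> float:
    """Scale a plot weight to kg per hectare (1 ha = 10,000 m^2)."""
    return weight_kg / plot_area_m2 * 10_000.0


# Hypothetical example: 4.5 kg harvested at 15% moisture from a 13.35 m^2 plot
adjusted_kg = adjust_to_standard_moisture(4.5, 15.0)
yield_kg_ha = plot_yield_kg_per_ha(adjusted_kg, 13.35)
```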

2.5.3. Quality Analysis Protocols

Seed samples were collected annually from each plot during harvest and processed according to standardized procedures for compositional analysis. Crude protein content was measured using the Kjeldahl method following AOAC Official Method 990.03 [35], with a Kjeltec 8400 analyzer (Foss Analytical, Hillerød, Denmark). Results were calculated using a nitrogen-to-protein conversion factor of 6.25 and expressed on a dry weight basis. Crude oil content was determined by petroleum ether extraction following AOAC Official Method 920.39 [35], with a Soxtec 2043 extraction unit (Foss Analytical, Hillerød, Denmark). The extraction process lasted 4 h, using petroleum ether (boiling range: 40–60 °C), followed by solvent recovery and gravimetric measurement of the extracted oil. Additional quality parameters included seed appearance evaluation (percentage of perfect seeds, disease damage rate, insect damage rate) based on visual examination of 200-seed samples per plot, using standardized grading criteria [36].

2.6. Disease Resistance Evaluation

Gray leaf spot resistance assessment utilized local isolates of Cercospora kikuchii maintained on potato dextrose agar at 25 ± 2 °C with a 12 h photoperiod to ensure consistent pathogen virulence. Inoculation protocols employed spore concentrations of 1 × 10⁵ spores mL−1 prepared in distilled water with 0.02% Tween-20 surfactant for optimal spore dispersal and plant surface adherence [37]. Applications were timed to coincide with the R3-R4 growth stage (beginning pod formation), when plants are most susceptible to infection. Hand-held sprayers with calibrated nozzles were used to ensure uniform coverage across all experimental plots. Environmental conditions were optimized for infection through mist irrigation for 48 h post-inoculation to maintain leaf surface moisture and promote spore germination and penetration.
Disease assessment employed a standardized four-point scale adapted from CIAT protocols and the Chinese National Standard NY/T 1248-2006 for evaluating soybean diseases. Grade 0 indicated no visible symptoms, Grade 1 represented 1–5% leaf area affected with small, isolated lesions, Grade 2 indicated 6–25% leaf area affected with moderate lesion development, and Grade 3 represented greater than 25% leaf area affected with severe leaf necrosis affecting plant function. Virus disease screening targeted soybean mosaic virus (SMV) under natural field infection across all locations. Susceptible sentinel plants (Heihe 43) were included to confirm adequate disease pressure and validate screening effectiveness. Visual assessments were conducted at R5-R6 growth stages using standard viral symptom diagnostic criteria, with symptomatic plants confirmed through ELISA testing using pathogen-specific antibodies to ensure accurate disease identification.
Cyst nematode (Heterodera glycines) resistance was evaluated under natural field infestation across all seven testing locations from 2019 to 2022. Nematode population densities were assessed through pre-planting soil sampling, with composite samples collected from 15 to 20 points per trial area (0–20 cm depth). Cyst and egg densities were quantified using standard extraction and counting procedures [38].

2.7. Environmental Characterization

All trials were conducted within the Fourth Temperature Zone of Heilongjiang Province. Seven primary testing locations were strategically selected to represent environmental diversity within the target production zone. These included Bei’an Branch Research Institute (48.27° N, 126.60° E), Bei’an Dalong Seed Industry (48.24° N, 126.58° E), Heihe Seed Division (50.25° N, 127.50° E), Heshan Farm (47.42° N, 133.13° E), Nenjiang County Far East Seed Industry (49.18° N, 125.22° E), Nenjiang Farm (49.17° N, 125.34° E), and Wudalianchi Seed Station (48.72° N, 126.15° E). Detailed soil and climatic characteristics for each location are provided in Supplementary Tables S1–S3 to enable comprehensive environmental characterization. Environmental monitoring was standardized at all locations using automated weather stations (Campbell Scientific CR1000, Logan, UT, USA), which were calibrated annually. Daily measurements included maximum and minimum air temperature (°C), precipitation (mm), relative humidity (%), wind speed (m s−1), and solar radiation (MJ m−2 day−1). Growing degree days (GDD) were calculated using the formula: GDD = [(Tmax + Tmin)/2] − Tbase, where Tmax and Tmin are the daily maximum and minimum temperatures (°C), respectively, and Tbase = 10 °C for soybean development [39]. Cumulative GDD was recorded from planting to physiological maturity for each location-year combination.
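The growing degree day formula above can be illustrated with a short sketch. The function names and the `floor_at_zero` option are assumptions for illustration: the formula in the text does not state whether negative daily values are set to zero, so the default applies it exactly as written.

```python
from typing import Iterable, Tuple


def daily_gdd(t_max: float, t_min: float, t_base: float = 10.0,
              floor_at_zero: bool = False) -> float:
    """GDD = [(Tmax + Tmin) / 2] - Tbase for one day (temperatures in deg C)."""
    gdd = (t_max + t_min) / 2.0 - t_base
    # Optional convention (an assumption, not stated in the text):
    # days cooler than the base temperature contribute zero.
    if floor_at_zero:
        return max(gdd, 0.0)
    return gdd


def cumulative_gdd(daily_temps: Iterable[Tuple[float, float]],
                   t_base: float = 10.0,
                   floor_at_zero: bool = False) -> float:
    """Sum daily GDD over (Tmax, Tmin) pairs from planting to maturity."""
    return sum(daily_gdd(t_max, t_min, t_base, floor_at_zero)
               for t_max, t_min in daily_temps)


# Hypothetical three-day record of (Tmax, Tmin)
record = [(26.0, 14.0), (24.0, 12.0), (12.0, 4.0)]
total = cumulative_gdd(record)                              # 10 + 8 - 2 = 16.0
total_floored = cumulative_gdd(record, floor_at_zero=True)  # 10 + 8 + 0 = 18.0
```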
Soil characterization was performed annually at each site using standardized sampling methods. Composite soil samples (0–20 cm depth) were collected from each trial area following a systematic grid pattern with 15–20 sampling points per trial. Soil analysis included pH measurement with a glass electrode method (soil:water ratio 1:2.5), organic matter content determined by the Walkley-Black wet oxidation method, total nitrogen assessed through Kjeldahl digestion, available phosphorus extracted by the Olsen method, and exchangeable potassium measured using ammonium acetate extraction followed by flame photometry. Certified laboratories carried out all tests in accordance with national soil testing standards (NY/T 1121-2006; NY/T 88-1988).
Annual and seasonal climate variability was measured for each location to evaluate environmental stress conditions during the assessment period. Temperature stress events were identified as times when the daily maximum temperature exceeded 32 °C (heat stress) or the minimum temperature fell below 8 °C during reproductive stages (cold stress). Drought stress was evaluated using the Palmer Drought Severity Index, which is derived from monthly precipitation and temperature data. These environmental stress indicators were integrated into stability analyses to examine genotype-by-environment interactions.
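The temperature stress thresholds can be expressed as a simple daily classifier. This is a sketch under stated assumptions: the function name is hypothetical, and the reproductive-stage condition is applied only to cold stress, which is one possible reading of the definition above.

```python
from typing import List


def classify_temperature_stress(t_max: float, t_min: float,
                                in_reproductive_stage: bool) -> List[str]:
    """Return the stress events ("heat", "cold") flagged for one day.

    Thresholds from the text: Tmax > 32 deg C is heat stress;
    Tmin < 8 deg C during reproductive stages is cold stress.
    """
    events = []
    if t_max > 32.0:
        events.append("heat")
    # Assumption: the growth-stage restriction applies to cold stress only.
    if in_reproductive_stage and t_min < 8.0:
        events.append("cold")
    return events


# Hypothetical daily records: (Tmax, Tmin, in reproductive stage?)
days = [(33.5, 18.0, True), (21.0, 6.5, True), (21.0, 6.5, False)]
stress_days = sum(1 for d in days if classify_temperature_stress(*d))
```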

2.8. Statistical Analysis

Statistical analyses were conducted using R statistical software version 4.3.0 with specialized packages selected for agricultural research applications [40]. The agricolae package (v1.3.5) [41] provided experimental design analysis and ANOVA procedures, lme4 (v1.1.33) enabled mixed-effects modeling for multi-environment analysis, ggplot2 (v3.4.2) [42] facilitated data visualization, randomForest (v4.7.1) supported machine learning analysis, and GGEBiplotGUI (v1.0.9) provided genotype-environment interaction analysis capabilities.
Since Heike 88 represents a pure line derived from hybridization followed by pedigree selection, performance comparisons with parental lines represent transgressive segregation rather than traditional heterosis observed in F1 hybrids. Mid-parent comparison was calculated as [(Heike 88 − MP)/MP] × 100, where MP = (Parent 1 + Parent 2)/2 is the mid-parent value, to quantify performance relative to the parental average. Better-parent comparison used [(Heike 88 − Better Parent)/Better Parent] × 100 to assess performance relative to the superior parent. Standard comparison employed [(Heike 88 − Check Variety)/Check Variety] × 100 to evaluate performance relative to regional check varieties, providing a practical assessment of commercial advantage.
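The three comparison indices can be sketched in a few lines. The function names and the example yields are hypothetical and are not data from this study; the better parent is taken as the larger value, which assumes a trait where higher is better (such as yield).

```python
def mid_parent_advantage(line: float, parent_1: float, parent_2: float) -> float:
    """Percent difference between the line and the mean of both parents."""
    mid_parent = (parent_1 + parent_2) / 2.0
    return (line - mid_parent) / mid_parent * 100.0


def better_parent_advantage(line: float, parent_1: float, parent_2: float) -> float:
    """Percent difference between the line and the superior parent.

    Assumption: a larger trait value is better (e.g., yield).
    """
    better_parent = max(parent_1, parent_2)
    return (line - better_parent) / better_parent * 100.0


def standard_advantage(line: float, check: float) -> float:
    """Percent difference between the line and the check variety."""
    return (line - check) / check * 100.0


# Hypothetical yields in kg per hectare (illustrative values only)
mp = mid_parent_advantage(3300.0, 2900.0, 3100.0)     # mid-parent = 3000
bp = better_parent_advantage(3300.0, 2900.0, 3100.0)  # better parent = 3100
sc = standard_advantage(3300.0, 3000.0)
```

A positive better-parent value is the signature of transgressive segregation in a fixed line, since it means the line exceeds both parents rather than only their average.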
Multi-environmental analysis treated environments as random effects to enable broader inference about variety adaptation and stability across the target production region. Combined ANOVA procedures partitioned variance components to identify sources of variation and assess the significance of genotype, environment, and genotype-by-environment interaction effects. Stability analysis employed multiple approaches, including coefficient of variation (CV) across environments to assess yield stability, Finlay-Wilkinson regression analysis to evaluate environmental responsiveness, and GGE biplot analysis to visualize genotype-environment interaction patterns and identify optimal production environments [43,44,45].
Machine learning integration utilized Random Forest analysis to identify key performance drivers and quantify variable importance in determining yield outcomes [46]. This analysis employed 10-fold cross-validation with 1000 bootstrap replicates to assess model reliability and limit overfitting. Statistical significance testing employed F-tests at the α = 0.05 significance level for ANOVA procedures, Tukey’s HSD test for multiple comparisons when significant treatment effects were detected, and Pearson correlation coefficients with significance testing for relationship analysis. Missing data were handled through listwise deletion with sensitivity analysis to ensure that data gaps did not bias results or conclusions. All statistical analyses included appropriate assumptions testing, with data transformations applied when necessary to meet normality and homogeneity requirements for valid statistical inference.

3. Results

3.1. Analysis of Variance for Agronomic and Quality Traits

The combined analysis of variance across environments and seasons revealed distinct patterns of variability among measured traits (Table 1). Environment effects were highly significant (p < 0.001) for multiple traits including grain yield (F = 11.0), plot yield (F = 4.0), pods per plant (F = 6.0), seeds per plant (F = 14.0), bottom pod height (F = 20.0), lodging rate (F = 188.0), main stem nodes (F = 5.0), and yield advantage over control (F = 18.0). Plant height showed moderate environmental sensitivity (F = 3.0, p = 0.02), while hundred-seed weight, protein content, oil content, and perfect seed rate remained stable across all testing locations. Seasonal variation was highly significant (p < 0.001) for most productivity traits, with particularly strong effects observed for plot yield (F = 1091.9), grain yield (F = 24.0), seeds per plant (F = 26.7), pods per plant (F = 24.6), yield advantage (F = 13.0), and lodging rate (F = 483.7). Bottom pod height (F = 8.9) and plant height (F = 3.6) also responded significantly to seasonal factors. Disease resistance traits showed no detectable seasonal variation, with viral disease, cyst nematode, and gray leaf spot grades remaining consistently low across all conditions. The genotype × environment × season interactions were particularly pronounced for agronomic traits. Grain yield showed highly significant interaction effects (F = 4.0, p < 0.001), while seeds per plant (F = 17.4, p < 0.001) and pods per plant (F = 11.4, p < 0.001) demonstrated even stronger interactive effects. Lodging rate exhibited the most extreme interaction response (F = 110.1, p < 0.001), indicating that lodging susceptibility varied dramatically across location-season combinations. Bottom pod height interactions were also substantial (F = 11.6, p < 0.001), suggesting complex environmental influences on plant architecture. Morphological stability was observed for several key traits. 
Main stem nodes exhibited significant environment effects, but no seasonal or interaction effects, suggesting consistent node development patterns across environments. Hundred-seed weight remained remarkably stable across all factors (F = 1.0–2.6, p > 0.05), demonstrating genetic control over seed size regardless of growing conditions. Quality and resistance traits exhibited exceptional stability. Both protein content and oil content showed no significant responses to any factor, with F-values near zero and p-values exceeding 0.66 for environmental effects. Disease resistance was uniformly excellent, with zero variation detected for viral disease and cyst nematode grades across all environments and seasons. Perfect seed rate maintained high values (98.9% mean) with no significant variation detected. The experimental design demonstrated adequate precision, with replication effects being consistently non-significant across all traits. Coefficient of variation values were within acceptable ranges, with the lowest variability in quality traits (2.01% for protein, 3.84% for oil) and moderate variability in agronomic parameters (9.82–13.47%).

3.2. Multi-Location Performance Evaluation of Soybean Variety Heike 88

3.2.1. Growth and Development Characteristics

Growing season length varied considerably across locations, ranging from 104 to 131 days, with a mean duration of 114 days (Table 2). The shortest growing period was recorded at Beian Dalong Seed Industry (104 days) due to late sowing (25 May), while Heihe Seed Division experienced the longest season (131 days) with early planting on 14 May. This 26% difference in growing season duration was primarily attributed to variations in sowing dates, which ranged from 11 May at Heshan Farm to 25 May at both Beian Dalong and Nenjiang County Far East locations.
Plant morphological characteristics demonstrated significant location-dependent variation. Nenjiang Farm produced the tallest plants (97.2 cm), significantly different from all other locations, while Beian Dalong Seed Industry had the shortest stature (83 cm), representing a 17% difference in plant height. The coefficient of variation for plant height (4.2%) indicates moderate genetic stability with some environmental responsiveness. Statistical analysis revealed five distinct height groupings across the seven locations, suggesting strong genotype × environment interactions for this trait. Bottom pod height exhibited the greatest morphological variation among measured traits (CV = 13.8%), ranging from 13 cm at Beian Branch Research Institute to 20 cm at Heihe Seed Division. This 54% difference in bottom pod positioning has critical implications for mechanical harvesting efficiency and potential yield losses. Heihe Seed Division achieved significantly higher pod positioning than all other locations, while Beian Branch Research Institute and Nenjiang Farm showed the lowest pod heights. The substantial variation in this trait suggests strong environmental influences on reproductive architecture. Main stem node development was relatively uniform across locations (CV = 5.3%), ranging from 14 to 16 nodes per plant. Heihe Seed Division, Heshan Farm, and Nenjiang Farm produced significantly more nodes (16) compared to Nenjiang County Far East and Wudalianchi (14 nodes). This consistency in node development indicates genetic stability for vegetative growth patterns, despite environmental variation. Effective branching was minimal across all locations, with only Beian Branch Research Institute producing any measurable branch development (1 branch per plant). The extremely high coefficient of variation (316.2%) reflects the binary nature of branching response, where most locations showed no branching while one location exhibited limited branching activity. 
The lack of significant branching across locations confirms the variety’s determinate growth habit and suitability for mechanized production systems, as excessive branching can complicate harvest operations and reduce harvest efficiency.
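The percentage differences quoted for growing season length, plant height, and bottom pod height are consistent with expressing the range between locations relative to the lowest location value. The short Python check below reproduces them from the minimum and maximum values reported in the text. The helper name is hypothetical, and the choice of the lower value as denominator is inferred from the reported figures rather than stated in the source.

```python
def percent_spread(low: float, high: float) -> float:
    """Range between two location values as a percentage of the lower value."""
    if low <= 0:
        raise ValueError("low must be positive")
    return (high - low) / low * 100.0


# Minimum and maximum location values as reported in the text.
spreads = {
    "growing_season_days": percent_spread(104, 131),
    "plant_height_cm": percent_spread(83, 97.2),
    "bottom_pod_height_cm": percent_spread(13, 20),
}

for trait, value in spreads.items():
    print(f"{trait}: {value:.1f}%")
```

Using the higher value as denominator would give smaller percentages, so the denominator should be kept the same when comparing spreads between traits.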

3.2.2. Morphological Characteristics and Plant Architecture

Morphological analysis revealed high consistency in several key traits across all locations (Table 3). The flower color was uniformly purple, hair color was consistently gray, and the leaf shape was sharp across all test sites. The podding habit was sub-determinate in all locations, indicating stable genetic expression of this trait. The seed color and hilum color were consistently yellow across all locations. Pod color showed some variation, with 85.7% of locations exhibiting brown pods and one location (Bei’an Technology Center) showing yellow-brown pods. The seed shape was predominantly round (85.7% of locations), with the Heihe Branch being the only exception, displaying oval seeds. Lodging resistance varied significantly among locations, with resistance grades ranging from 0 to 3, and lodging rates varying from 0% to 70%.

3.2.3. Yield Performance and Components

Grain yield showed substantial variation among locations, ranging from 2449 to 3250 kg ha−1, with a mean yield of 2888.3 kg ha−1 (Table 4). Wudalianchi Seed Station achieved the highest yield (3250 kg ha−1), followed by Heshan Farm (3104 kg ha−1) and Nenjiang County Far East Seed Industry (3042 kg ha−1). The lowest yield was recorded at Heihe Seed Division (2449 kg ha−1); the highest-yielding location exceeded it by 32.7%. Yield advantage over the local control variety ranged from 8.9% to 13.5%, with an average improvement of 11.6%. Heihe Seed Division showed the highest relative performance (13.5% above control), while Beian Dalong Seed Industry demonstrated the lowest advantage (8.9% above control). Hundred-seed weight (HSW) varied from 21.9 to 23.7 g, with a mean of 22.5 g. Nenjiang Farm recorded the highest HSW (23.7 g), while Wudalianchi Seed Station had the lowest (21.9 g). Seeds per plant ranged from 47 to 65.3, with considerable variation across locations. Plot yields varied from 30.0 to 36.9 kg, with Nenjiang County Far East Seed Industry achieving the highest plot-level performance. The results demonstrate significant genotype × environment interactions, with different locations favoring distinct yield components. While Wudalianchi Seed Station excelled in overall yield per hectare, other locations showed advantages in specific components such as seed weight or individual plant productivity.

3.2.4. Seed Quality Components

Protein content remained relatively stable across all locations, ranging from 41.69% to 42.25%, with a mean of 42.03% (Table 5). All locations showed statistically similar protein levels, with no significant differences observed among sites. Nenjiang Farm recorded the highest protein content (42.25%), while Beian Dalong Seed Industry had the lowest (41.69%), representing only a 1.3% variation across locations. Oil content also demonstrated minimal variation, ranging from 19.74% to 20.13%, with a mean of 19.87%. Statistical analysis revealed no significant differences in oil content among locations, indicating consistent seed quality across all testing sites. Beian Dalong Seed Industry achieved the highest oil content (20.13%), while both Heihe Seed Division and Heshan Farm recorded the lowest (19.74%). The uniformity in both protein and oil content across diverse growing environments demonstrates the variety’s stability for key quality traits, suggesting minimal genotype × environment interaction effects on seed composition parameters.

3.2.5. Disease Resistance and Seed Quality

Disease evaluation showed that Heike 88 had outstanding resistance across all test locations (Table 6). It displayed complete resistance (Grade 0) to virus disease and cyst nematode attack at every site. Resistance to gray leaf spot was also excellent, with grades from 0 to 1, and six out of seven locations showing complete resistance (Grade 0). Seed quality parameters remained consistently high at all sites, with the percentage of perfect seeds ranging from 97% to 100%, averaging 99.1%. The rate of diseased seeds was very low, between 0% and 2%, with an average of 0.4%. Insect damage was also minimal, from 0% to 3%, averaging 0.7%. Reproductive efficiency, measured as seeds per plant, ranged from 49 to 112, with Bei’an Zhaoguang Dalong Seed recording the highest number. The number of effective pods per plant varied from 22 to 42, with an average of 30.3 pods per plant.

3.3. Comprehensive Performance Analysis and Multi-Dimensional Performance Characterization

3.3.1. Principal Component Analysis and Performance Space Mapping

Principal component analysis showed that performance effects were the main source of variation in Heike 88, with the first two components explaining 75.9% of the total variance (PC1: 43.4%, PC2: 32.6%) (Figure 2A). The multi-dimensional performance space displayed clear clustering patterns based on performance levels, with four main groups: high performance (orange cluster), moderate performance (green cluster), low performance (blue cluster), and no performance (red cluster). The PCA biplot indicated that Heike 88 consistently fell within the high-performance group across different environmental gradients, pointing to strong genetic complementarity between parental lines. The elliptical confidence regions for each performance level showed minimal overlap, indicating that performance is a stable and measurable trait rather than random environmental noise. High-performance environments mainly clustered in the positive PC1 area, linked to better yield components and stress tolerance. Climate zone analysis within the PCA revealed that moderate climate conditions (shown by medium-sized dots) were most favorable for performance, while cooler and warmer extremes exhibited a reduced performance advantage. This aligns with the variety’s development for the Fourth Temperature Zone, where moderate thermal conditions are common.

3.3.2. Environmental Modulation of Performance Expression

Mid-parent performance showed strong environmental sensitivity, with expression values ranging from −20% to +40% across the accumulated temperature gradient of 2000–2300 °C (Figure 2B). The relationship followed a curvilinear pattern, with peak performance at approximately 2150–2200 °C accumulated temperature, which matches the typical growing conditions of the target region. The analysis over four years (2019–2022) indicated increasing stability in performance over time. In 2019, performance expression was highly variable, with values spread throughout the temperature range. By 2022, patterns had stabilized, with most data points clustering near the optimal temperature. The fitted curve (red line) with confidence intervals (gray shading) showed that accumulated temperatures below 2100 °C and above 2250 °C were associated with lower performance. Climate zone analysis showed that cool zones (small dots) generally had lower performance, especially at temperature extremes, while warm zones (large dots) displayed more consistent but moderate performance levels. This pattern suggests that performance in Heike 88 is optimized for moderate temperatures, with both heat stress and insufficient heat accumulation limiting its expression.

3.3.3. Environmental Response Surface Analysis

The environmental response surface generated through GAM-predicted yield modeling revealed distinct performance zones across the temperature-growth period interaction space (Figure 2C). The highest predicted yields (3500+ kg ha−1, represented by red coloring) occurred within a narrow environmental window of 2150–2200 °C accumulated temperature and 120–125 day growth periods. High-performance environments (red dots) clustered mainly within this optimal zone, while low-performance environments (blue dots) were more broadly distributed across suboptimal conditions. The response surface showed steep gradients, indicating that small deviations from optimal conditions caused significant yield losses. The purple zones, representing yields below 2500 kg ha−1, corresponded to environments with either insufficient temperature accumulation or excessively short growing seasons. Moderate-performance environments (green dots) occupied intermediate positions on the response surface, suggesting that some performance advantage over the parents appeared under moderately favorable conditions. The contour lines indicated that maintaining yields above 3000 kg ha−1 required accumulated temperatures over 2100 °C regardless of growth period length, but optimal performance depended on the specific temperature-duration combination typical of the target production zone.

3.3.4. Machine Learning Feature Importance Analysis

Random Forest analysis identified key performance drivers with quantified importance scores (Figure 2D). The year effect emerged as the most influential factor (13% importance), indicating significant temporal variation in variety performance. This finding highlights the importance of multi-year evaluation for accurate variety characterization and suggests that environmental factors beyond temperature and growth period substantially influence the expression of genetic potential. Accumulated temperature ranked as the second most important factor (7.6% importance), confirming the central role of thermal environment in variety adaptation. The positive importance score indicates that higher temperature accumulation generally favors yield performance within the observed range, consistent with the variety’s adaptation to the Fourth Temperature Zone requirements. Quality parameters showed moderate importance, with oil content (4% importance) outweighing protein content (1.9% importance). This differential importance suggests that oil accumulation patterns may be more environmentally sensitive or more strongly linked to overall plant performance than protein synthesis. The negative importance scores for pods per plant (−0.1%) and plant height (−1.7%) indicate that extreme values for these traits may actually limit performance, implying optimal ranges rather than linear relationships. Growth period exhibited a substantial negative importance (−3.1%), indicating that excessively long growing seasons may reduce yield efficiency or increase exposure to late-season stresses. This finding supports developing varieties for regions with defined growing seasons rather than extended tropical environments.
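The source does not state which importance measure was used. One common choice is permutation importance: the increase in prediction error on held-out data when a single variable is shuffled. The Python sketch below illustrates that idea only. It substitutes a small k-nearest-neighbour predictor for a random forest so that it runs with the standard library, and the data, variable names, and settings are synthetic assumptions rather than anything taken from the trials.

```python
import random


def knn_predict(train_x, train_y, row, k=3):
    """Average the responses of the k nearest training rows (Euclidean distance)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], row)),
    )
    return sum(train_y[i] for i in order[:k]) / k


def mse(train_x, train_y, test_x, test_y):
    """Mean squared prediction error on the held-out rows."""
    errors = [(knn_predict(train_x, train_y, row) - y) ** 2 for row, y in zip(test_x, test_y)]
    return sum(errors) / len(errors)


def permutation_importance(train_x, train_y, test_x, test_y, feature, rng, repeats=20):
    """Mean increase in held-out MSE after shuffling one feature column."""
    baseline = mse(train_x, train_y, test_x, test_y)
    increases = []
    for _ in range(repeats):
        column = [row[feature] for row in test_x]
        rng.shuffle(column)
        permuted = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(test_x, column)]
        increases.append(mse(train_x, train_y, permuted, test_y) - baseline)
    return sum(increases) / repeats


rng = random.Random(0)
# Synthetic data: feature 0 drives the response, feature 1 is pure noise.
rows = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(80)]
response = [10.0 * x0 + rng.gauss(0, 0.1) for x0, _ in rows]
train_x, test_x = rows[:60], rows[60:]
train_y, test_y = response[:60], response[60:]

importance = {
    name: permutation_importance(train_x, train_y, test_x, test_y, idx, rng)
    for idx, name in enumerate(["informative", "noise"])
}
print(importance)
```

Under this definition, a score measures how much predictions degrade without the variable. It does not carry the direction of the effect, and scores near or below zero arise when shuffling a variable does not worsen predictions.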

3.4. Temporal Performance Dynamics and Trait Relationships

3.4.1. Performance Distribution Patterns

Frequency distribution analysis of three performance types revealed distinct patterns of expression (Figure 3A). Better-parent performance showed a right-skewed distribution with peak density around 10–15% and a long tail extending to 40%, indicating that Heike 88 often exceeded the performance of its superior parent by substantial margins. The distribution mean at approximately 12% reflects significant transgressive segregation beyond parental limits. Mid-parent performance exhibited the most symmetric distribution, spanning from −25% to +25%, with peak density near 5–10%. The wider range suggests greater environmental sensitivity compared to better-parent performance, but the location of the peak above zero indicates a consistent advantage over the parental mean. The dashed vertical line at approximately 8% indicates a reliable positive performance under most conditions. Standard performance displayed an intermediate pattern with peak density in the 5–15% range and limited negative values. The tight distribution relative to mid-parent performance suggests that Heike 88’s performance advantage over standard varieties remains relatively consistent across environments, supporting its suitability for commercial deployment.

3.4.2. Temporal Trends in Relative Performance and Yield

A four-year analysis showed coordinated increases in both performance expression and absolute yield (Figure 3B). Better-parent performance (red line) steadily rose from about −5% in 2019 to +20% in 2022, indicating improved performance relative to the superior parent over time. Mid-parent performance (green line) showed an even more substantial increase, going from negative values in 2019 to over 20% by 2021–2022. The mean yield (blue line with dots) consistently improved from roughly 2550 kg ha−1 in 2019 to 3200 kg ha−1 in 2022. The similar trends between performance expression and absolute yield suggest that environmental optimization or better management practices enhanced the expression of genetic potential rather than just providing more favorable growing conditions. The sharp jump in performance expression between 2020 and 2021 coincides with the period when yield stability improved most (the CV fell from 18.7% in 2019 to 6.7% in 2022), indicating that factors boosting performance expression also helped improve performance consistency. This pattern over time shows that the variety’s genetic potential was increasingly realized as evaluation environments and management practices were optimized.

3.4.3. Quality Trait-Performance Relationships

Relationships between oil and protein content and mid-parent performance reveal complex interaction patterns (Figure 3C). Oil content shows a moderate negative correlation with performance (r ≈ −0.4), with higher-performance environments generally linked to lower oil levels. The relationship is non-linear, featuring the steepest decline in environments with moderate positive performance (10–20%). Protein content has a weaker but positive association with performance (r ≈ +0.3), indicating that high-performing environments tend to favor protein accumulation over oil synthesis. The scatter plot displays significant residual variation, with many data points diverging notably from the fitted trend lines, suggesting that performance-quality relationships are influenced by additional environmental factors. The inverse oil-protein relationship characteristic of soybean remains consistent across performance levels, though the strength of this relationship varies. In high-performance environments, the oil-protein trade-off appears less evident, implying that superior genetic complementation may partly overcome typical compositional limitations. This has important implications for breeding programs aiming at specific quality profiles.

3.4.4. Comprehensive Trait Correlation Network

The correlation matrix, including performance effects, revealed complex relationships among traits (Figure 3D). Strong positive correlations (>0.9, deep red) were observed between yield and several performance measures, confirming the central role of gains over the parental lines in performance expression. Better-parent and mid-parent performance showed a perfect positive correlation, indicating consistent relative performance across comparison types. Thermal efficiency exhibited strong positive correlations with both yield (r = 0.94) and performance measures (r ≈ 0.9), pointing to efficient temperature utilization as a key mechanism behind the performance advantage. This suggests that performance partly operates through enhanced physiological efficiency rather than just increased resource acquisition. Seed weight showed moderate positive correlations with most agronomic traits, but weaker links with performance measures, suggesting that environmental factors may have a greater influence on seed size than genetics in this cross. Temperature and growth days showed moderate negative correlations with several traits (−0.4 to −0.6), signaling that optimal ranges exist rather than benefits from extended growing periods. Quality traits (oil and protein) had the expected strong negative correlation (r = −0.98) but relatively weak relationships with other performance measures, supporting their partial independence from yield-related traits. This independence offers opportunities for simultaneously improving yield and quality through targeted selection. The network analysis identified three main trait clusters: (1) yield and performance-related measures, (2) morphological and developmental traits, and (3) quality composition parameters. Moderate inter-cluster correlations (0.3–0.6) suggest that improving one trait group does not necessarily harm others, thereby endorsing multi-objective breeding strategies.
Control advantage (yield relative to standard varieties) was strongly positively correlated with absolute yield and performance measures, indicating that gains over the parental lines translate directly into commercial gains. The correlation pattern supports the conclusion that Heike 88’s superior field performance primarily results from effective expression of transgressive segregation under regional growing conditions.
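The pairwise coefficients discussed above are Pearson correlations, as stated in the statistical methods. A minimal Python sketch of the calculation is given below. The oil and protein values are hypothetical numbers chosen to fall within the reported ranges, and they are not observations from the trials.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical oil and protein contents (%), illustration only.
oil = [19.7, 19.8, 19.9, 20.0, 20.1]
protein = [42.3, 42.1, 42.0, 41.8, 41.7]

r = pearson_r(oil, protein)
print(f"oil vs protein: r = {r:.2f}")
```

With only a handful of environments per coefficient, individual r values are estimated imprecisely, which is one reason to read a correlation matrix alongside the associated significance tests.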

4. Discussion

4.1. Transgressive Segregation and Genetic Improvement Through Pedigree Breeding

The development of Heike 88 demonstrates the successful exploitation of transgressive segregation in soybean breeding, achieving a consistent 10.3% yield advantage over regional check varieties across diverse environments over four years. This performance improvement represents genetic gain through the combination and fixation of favorable alleles from both parental lines during pedigree selection, rather than heterosis in the traditional sense of F1 hybrid vigor [47,48]. The distinction is critical: Heike 88 is a pure line stabilized through six generations of selection, not a heterotic F1 hybrid, and therefore its superior performance stems from transgressive segregation, the phenomenon where progeny exceed parental performance through novel recombination of complementary genetic factors that become fixed in a homozygous state [49,50]. The comparison with parental lines through check variety ratios revealed that Heike 88 consistently outperformed both Heijiao 08-1611 and Heihe 43 across the testing environments. While direct contemporary comparison in the same trials would provide the most robust statistical validation [51], the use of common check varieties across time periods enabled reasonable assessment of genetic improvement. This approach demonstrated that Heike 88 yielded 10.3% above regional checks on average, compared to the historical performance of 5–8% above checks for the parental lines, indicating a genetic gain of roughly 2–5 percentage points through the breeding process. This level of improvement is substantial and commercially significant, particularly given the complex multi-trait selection criteria employed during line development [52]. Principal component analysis, revealing performance effects as the dominant source of performance variation (43.4% and 32.6% of the variance in PC1 and PC2, respectively), establishes a new benchmark for understanding performance determinants in soybeans.
This finding contradicts conventional breeding wisdom that performance contributes only 5–15% of performance improvement in self-pollinated crops [11].

4.2. Temperature-Dependent Performance and Adaptive Breeding

The curvilinear relationship between accumulated temperature and variety performance (−20% to +40% variation relative to parental average across the 2000–2300 °C temperature gradient) indicates that the genetic complementation achieved through transgressive segregation operates through temperature-dependent physiological mechanisms [53]. This temperature-dependent optimization contrasts with previous reports suggesting that environmental stress uniformly reduces genetic potential in self-pollinated crops [5]. Instead, our results demonstrate that carefully selected parental combinations can produce progeny with enhanced resource-use efficiency under specific environmental conditions, representing adaptive transgressive segregation tailored to target production environments. The narrow temperature optimum (2150–2200 °C accumulated temperature) for peak performance raises essential questions about climate adaptability and the trade-offs inherent in precise environmental matching. While this specificity aligns perfectly with current Fourth Temperature Zone conditions, projected warming trends may shift optimal production zones northward [54,55], potentially limiting the long-term utility of such precise environmental adaptation. However, the demonstrated yield stability improvement over time, with a coefficient of variation decreasing from 18.7% to 6.7% between 2019 and 2022, suggests that varieties optimized for specific environmental conditions may actually show enhanced climate resilience through more efficient resource utilization when grown within their adaptation zone [56]. This finding challenges the prevailing preference for broadly adapted varieties and supports the development of regionally optimized germplasm as a complementary strategy to broad adaptation, particularly in regions with relatively homogeneous production environments [56,57]. 
The temporal analysis showing increasing performance stability alongside consistent yield improvement from 2550 kg ha−1 in 2019 to 3200 kg ha−1 in 2022 suggests that the variety’s genetic potential was progressively realized as agronomic management practices were optimized for its specific requirements. This pattern suggests that novel genetic combinations generated through transgressive segregation may require corresponding management innovations to fully realize their potential, underscoring the importance of integrated variety development and agronomic optimization programs.

4.3. Disease Resistance and Durability

The complete resistance to SMV and cyst nematode maintained across all locations and years challenges established paradigms of resistance gene durability. Most single-gene resistances break down within 3–5 years due to pathogen evolution [58], yet Heike 88’s resistance profile remained stable throughout the evaluation period. This durability suggests either the presence of multiple resistance genes working in concert or novel resistance mechanisms that provide more robust protection. The consistent gray leaf spot resistance (grades 0–1) across environments with high disease pressure is particularly notable, given that Cercospora sojina populations in northern China have shown increasing aggressiveness [59]. The integration of multiple resistance traits without apparent fitness costs contradicts the widely accepted trade-off between disease resistance and yield potential [60], suggesting that performance-based breeding may partially overcome these traditional constraints.

4.4. Methodological Implications

The integration of machine learning analysis with traditional quantitative genetics represents a significant methodological advance in variety evaluation. The identification of year effect (13% importance) and accumulated temperature (7.6% importance) as primary yield drivers provides mechanistic insights that extend beyond simple performance documentation. This analytical framework could be broadly applied to understand genotype-by-environment interactions in other crops. The Random Forest analysis, revealing negative importance scores for plant height and growth period, suggests that optimal performance occurs within defined phenotypic ranges rather than through maximizing individual traits. This finding has important implications for the definition of breeding objectives and the development of selection strategies.

4.5. Quality Trait Independence

The observation that protein and oil content showed minimal variation across environments (CV = 2.01% and 3.84%, respectively) and relative independence from yield performance challenges fundamental assumptions about trait trade-offs in soybean improvement [61]. The traditional inverse protein-oil relationship was maintained (r = −0.71), but both traits showed stability across the environments where yield varied substantially, suggesting opportunities for simultaneous improvement that contradict some established breeding paradigms. This independence implies that transgressive segregation approaches focusing on yield enhancement need not compromise seed quality, enabling the development of varieties that combine high yield with specific quality profiles, a particularly valuable capability for addressing diverse end-use requirements [62]. The weak correlations between quality traits and performance measures (oil content: r = −0.37 with yield; protein content: r = 0.30) suggest that the genetic mechanisms underlying seed composition operate largely independently of those controlling yield potential in this genetic background. This partial independence provides opportunities for targeted selection strategies that maintain quality standards while improving productivity, supporting multi-objective breeding approaches that address both farmer profitability concerns (yield) and processor requirements (quality composition) [63,64].

4.6. Study Limitations and Methodological Considerations

This study was conducted within a single geographic and climatic region (Heilongjiang Province, Fourth Temperature Zone), which may limit the generalizability of the findings to other soybean production regions. The optimal temperature ranges and performance patterns identified may not apply to tropical, subtropical, or temperate areas with different thermal regimes and photoperiod responses. The genetic diversity represented by the two parental lines may not capture the full potential for transgressive segregation available within global soybean germplasm. The strong performance observed with this specific cross may represent an exceptional combination rather than a broadly applicable pattern, and it requires validation across diverse genetic backgrounds.

The use of check variety comparisons to assess performance relative to parental lines, while scientifically defensible for demonstrating breeding progress, introduces additional uncertainty compared to direct contemporary evaluation of all genotypes in the same trials. Environmental effects between the period when the parents were evaluated (before 2015) and the period when Heike 88 was comprehensively tested (2016–2022) cannot be controlled entirely, potentially confounding genetic and environmental contributions to the observed performance differences. Future breeding programs should ideally maintain parental lines in contemporary trials alongside derived varieties to enable more robust statistical comparison and genetic gain estimation.

The four-year evaluation period, while substantial, is relatively short for assessing long-term stability under climate change scenarios. Projected warming trends suggest that thermal regimes may shift significantly over the 10–20-year commercial lifespan of soybean varieties, so extended evaluation periods are needed to validate adaptation claims and ensure variety longevity.

The reliance on phenotypic evaluation without molecular marker integration limits the understanding of the genetic mechanisms underlying the observed transgressive segregation; however, this represents an opportunity for future research rather than a fundamental flaw in the current study.
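
The comparisons with parental lines and check varieties discussed in this section are conventionally expressed as percent deviations from a reference value. The sketch below shows the standard textbook forms of mid-parent, better-parent, and standard (check-relative) performance. The yields used are hypothetical numbers chosen only for illustration, and the exact formulas applied in this study are assumed rather than confirmed by the text above.

```python
def mid_parent_performance(line, parent_a, parent_b):
    """Percent deviation of the derived line from the mean of its two parents."""
    mid_parent = (parent_a + parent_b) / 2.0
    return 100.0 * (line - mid_parent) / mid_parent


def better_parent_performance(line, parent_a, parent_b):
    """Percent deviation of the derived line from the higher-yielding parent."""
    better = max(parent_a, parent_b)
    return 100.0 * (line - better) / better


def standard_performance(line, check):
    """Percent deviation of the derived line from a check (control) variety."""
    return 100.0 * (line - check) / check


# Hypothetical yields in kg per hectare; NOT data from this study.
line_yield, parent_a, parent_b, check_yield = 3000.0, 2600.0, 2400.0, 2750.0

mpp = mid_parent_performance(line_yield, parent_a, parent_b)
bpp = better_parent_performance(line_yield, parent_a, parent_b)
sp = standard_performance(line_yield, check_yield)

print(f"mid-parent performance:    {mpp:.2f}%")
print(f"better-parent performance: {bpp:.2f}%")
print(f"standard performance:      {sp:.2f}%")
```

A positive better-parent value is the pattern expected under transgressive segregation, since it means the derived line exceeds both parents. When parents and the derived line are not grown in the same trials, each input to these ratios also carries its own environmental effect, which is the source of the uncertainty noted above.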

4.7. Future Research Directions

Future work should prioritize the molecular characterization of Heike 88 to pinpoint genomic regions linked to transgressive segregation and superior performance. Genome-wide association studies and QTL mapping based on the same parental lines could identify favorable allele combinations fixed during pedigree selection, enabling marker-assisted breeding. Genomic selection could be incorporated to predict transgressive potential from parental genotypes, reducing development time and costs. Multi-regional trials across diverse soybean zones are needed to determine whether the identified thermal efficiency mechanisms are region-specific or more broadly applicable, and international collaborations could verify the transferability of breeding strategies across different environments. Performance should also be investigated under climate change scenarios such as increased temperature, altered rainfall, and higher CO2 to develop resilient varieties. Studies of the physiological mechanisms underlying thermal efficiency, including differences in gas exchange, canopy temperature, and water use, would inform parent selection and accelerate breeding.

Heike 88’s success shows that traditional pedigree selection, combined with strategic parent choice, can drive significant genetic gains in self-pollinated crops via transgressive segregation. Exploiting parental complementarity opens avenues for improving other self-pollinated species and addressing global food security. Identifying environment-specific optimal performance enables the development of climate-adapted varieties, shifting from broad adaptation to precise ecological matching. Combining traditional breeding with advanced analytics such as machine learning can speed up genetic gains amid evolving challenges. This approach, which combines multi-environment testing, statistical analysis, and mechanistic insights, can be applied broadly to accelerate crop improvement worldwide.

5. Conclusions

The comprehensive evaluation of Heike 88 across seven locations over four years demonstrates the successful development of a superior soybean variety adapted to northern China’s Fourth Temperature Zone. The variety achieved a consistent 10.3% yield advantage over controls with mean yields of 3188 kg ha−1, indicating strong commercial viability. The most notable finding was that the first two principal components accounted for 43.4% and 32.6% of the performance variance, a pattern consistent with transgressive segregation in which the pure line exceeded both parental lines. Optimal performance occurred at 2150–2200 °C accumulated temperature, matching regional growing conditions and supporting the strategic parent selection. Complete resistance to soybean mosaic virus (SMV) and cyst nematode attack, together with excellent gray leaf spot resistance, addresses critical production constraints while maintaining stable quality characteristics suitable for dual-purpose applications. Machine learning analysis identified growth period and thermal efficiency as primary yield drivers, providing mechanistic insights into superior performance. The integration of traditional breeding with advanced analytics represents a methodological advancement for variety development. Heike 88’s combination of high yield potential, environmental stability, and comprehensive disease resistance supports commercial deployment while demonstrating the value of performance-based approaches for challenging production environments.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agriculture15202106/s1, Table S1: Geographic location and soil characteristics of experimental sites; Table S2: Temperature characteristics and growing degree day accumulation at experimental locations; Table S3: Precipitation patterns and growing conditions.

Author Contributions

Conceptualization, D.H., W.L. (Wei Li) and W.L. (Wencheng Lu); Data curation, W.L. (Wei Li); Formal analysis, X.Y. and W.L. (Wei Li); Funding acquisition, H.J., H.R. and W.L. (Wencheng Lu); Investigation, D.H., W.L. (Wei Li) and W.L. (Wencheng Lu); Methodology, W.L. (Wei Li) and W.L. (Wencheng Lu); Project administration, H.J., H.R. and W.L. (Wencheng Lu); Resources, H.J., H.R. and W.L. (Wencheng Lu); Software, D.H. and X.Y.; Supervision, X.Y., H.R. and W.L. (Wencheng Lu); Validation, W.L. (Wei Li); Visualization, X.Y. and H.J.; Writing—original draft, D.H., X.Y. and H.J.; Writing—review and editing, D.H., H.J., H.R. and W.L. (Wencheng Lu). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Heilongjiang Provincial Agricultural Science and Technology Innovation Leap Project (CX25JC01); the National Soybean Industry Technology System (CARS-04-05B); and the project Research, Development and Application of Key Technologies for Improving the Comprehensive Grain Production Capacity of the Modern Agricultural Province Laboratory (ZY04JD05-007).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

All data generated or analyzed during this study are included in this published article. The datasets used and analyzed during the current study are available from Wencheng Lu on reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, F.; Hong, H.; Liu, X.; Wang, X.; Zhang, C.; Zhao, K.; Yuan, R.; Abdelghany, A.M.; Zhang, B.; Lamlom, S.F. Large-scale evaluation of soybean germplasm reveals geographic patterns in shade tolerance and identifies elite genotypes for intercropping systems. BMC Plant Biol. 2025, 25, 1092.
  2. Abdelghany, A.M.; Zhang, S.; Azam, M.; Shaibu, A.S.; Feng, Y.; Qi, J.; Li, J.; Li, Y.; Tian, Y.; Hong, H. Exploring the phenotypic stability of soybean seed compositions using multi-trait stability index approach. Agronomy 2021, 11, 2200.
  3. Beckman, J.; Johnson, M.E.; Ajewole, K.; Kaufman, J.; Sabala, E. The Growing Demand for Animal Products and Feed in India: Future Prospects for Production, Trade, and Technology Innovation. Res. Agric. Appl. Econ. 2025, 54.
  4. Musah, S. Determinants of Agricultural Productivity in the Leading Producing Countries Worldwide. Master’s Thesis, Southern Illinois University at Carbondale, Carbondale, IL, USA, 2025.
  5. Zhu, W.; Li, J.; Xie, T. Impact of climate change on soybean production: Research progress and response strategies. Adv. Resour. Res. 2024, 4, 474–496.
  6. Shea, Z.; Singer, W.M.; Zhang, B. Soybean production, versatility, and improvement. Legume Crops Prospect. Prod. Uses 2020, 10, 29–71.
  7. Anderson, E.J.; Ali, M.L.; Beavis, W.D.; Chen, P.; Clemente, T.E.; Diers, B.W.; Graef, G.L.; Grassini, P.; Hyten, D.L.; McHale, L.K. Soybean [Glycine max (L.) Merr.] breeding: History, improvement, production and future opportunities. In Advances in Plant Breeding Strategies: Legumes; Springer: Berlin/Heidelberg, Germany, 2019; Volume 7, pp. 431–516.
  8. Miedaner, T. Breeding strategies for improving plant resistance to diseases. In Advances in Plant Breeding Strategies: Agronomic, Abiotic and Biotic Stress Traits; Springer: Berlin/Heidelberg, Germany, 2016; pp. 561–599.
  9. Ahmar, S.; Gill, R.A.; Jung, K.-H.; Faheem, A.; Qasim, M.U.; Mubeen, M.; Zhou, W. Conventional and molecular techniques from simple breeding to speed breeding in crop plants: Recent advances and future outlook. Int. J. Mol. Sci. 2020, 21, 2590.
  10. Buch, K.; Kaushik, A.; Mishra, U.; Beese, S.; Samanta, S.; Singh, R. Unravelling the complexity of plant breeding through modern genetic techniques and tools: A review. Int. J. Plant Soil Sci. 2023, 35, 97–105.
  11. Das, A.K.; Choudhary, M.; Kumar, P.; Karjagi, C.G.; KR, Y.; Kumar, R.; Singh, A.; Kumar, S.; Rakshit, S. Heterosis in genomic era: Advances in the molecular understanding and techniques for rapid exploitation. Crit. Rev. Plant Sci. 2021, 40, 218–242.
  12. Xu, J. Exploration du Polymorphisme Moléculaire et Protéique de la Tomate Pour l’Identification de QTL de Qualité du Fruit. Ph.D. Thesis, Université d’Avignon, Avignon, France, 2012.
  13. Wu, X.; Liu, Y.; Zhang, Y.; Gu, R. Advances in research on the mechanism of heterosis in plants. Front. Plant Sci. 2021, 12, 745726.
  14. Snodgrass, S.J. The Consequences of Hybridization in the Short-and Long-Term Among the Tripsacinae Subtribe of Grasses. Ph.D. Thesis, Iowa State University, Ames, IA, USA, 2024.
  15. Ren, X.; Chen, L.; Deng, L.; Zhao, Q.; Yao, D.; Li, X.; Cong, W.; Zang, Z.; Zhao, D.; Zhang, M. Comparative transcriptomic analysis reveals the molecular mechanism underlying seedling heterosis and its relationship with hybrid contemporary seeds DNA methylation in soybean. Front. Plant Sci. 2024, 15, 1364284.
  16. Shahzad, K.; Zhang, X.; Guo, L.; Qi, T.; Bao, L.; Zhang, M.; Zhang, B.; Wang, H.; Tang, H.; Qiao, X. Comparative transcriptome analysis between inbred and hybrids reveals molecular insights into yield heterosis of upland cotton. BMC Plant Biol. 2020, 20, 239.
  17. Liu, J.; Li, M.; Zhang, Q.; Wei, X.; Huang, X. Exploring the molecular basis of heterosis for plant breeding. J. Integr. Plant Biol. 2020, 62, 287–298.
  18. Ouyang, Y.; Li, X.; Zhang, Q. Understanding the genetic and molecular constitutions of heterosis for developing hybrid rice. J. Genet. Genom. 2022, 49, 385–393.
  19. Belcapo, S.; Réthoré, E.; Nguema-Ona, E.; Ezquer, I. Unraveling Novel Mechanisms Controlling Heterosis in seeds: Advances and Biotechnological Applications in crops. J. Exp. Bot. 2025, eraf400.
  20. Ruff, L.A. Soybean Heterosis and Response to Water: Yield, Yield Components, and Morphology; The University of Nebraska-Lincoln: Lincoln, NE, USA, 2016.
  21. Hochholdinger, F.; Yu, P. Molecular concepts to explain heterosis in crops. Trends Plant Sci. 2025, 30, 95–104.
  22. Charles, D.R. Optimizing Multi-Trait Selection Index in Maize Breeding with Advanced Machine Learning and Robust Uncertainty Analysis. Ph.D. Thesis, Université Côte d’Azur, Nice, France, 2024.
  23. Hairong, Y.; Yiyuan, C.; Bun, K.H. China’s soybean crisis: The logic of modernization and its discontents. In Soy, Globalization, and Environmental Politics in South America; Routledge: Oxfordshire, UK, 2017; pp. 123–145.
  24. Zhao, J.; Wang, Y.; Zhao, M.; Wang, K.; Li, S.; Gao, Z.; Shi, X.; Chu, Q. Prospects for soybean production increase by closing yield gaps in the Northeast Farming Region, China. Field Crops Res. 2023, 293, 108843.
  25. Xin, M.; Zhang, Z.; Han, Y.; Feng, L.; Lei, Y.; Li, X.; Wu, F.; Wang, J.; Wang, Z.; Li, Y. Soybean phenological changes in response to climate warming in three northeastern provinces of China. Field Crops Res. 2023, 302, 109082.
  26. Zhao, J.; Wang, C.; Shi, X.; Bo, X.; Li, S.; Shang, M.; Chen, F.; Chu, Q. Modeling climatically suitable areas for soybean and their shifts across China. Agric. Syst. 2021, 192, 103205.
  27. He, H.; Chen, M.; Li, M.; Qu, K.; Dang, H.; Li, Q.; Hu, Z.; Zhang, Q. Impact of climate change on the potential allocation of resources of rice cultivation in Yangtze-Huai Rivers region: A case study of Anhui Province, China. Theor. Appl. Climatol. 2024, 155, 6697–6708.
  28. Gao, J.; Liu, Y. Climate warming and land use change in Heilongjiang Province, Northeast China. Appl. Geogr. 2011, 31, 476–482.
  29. Wu, M.; Pei, W.; Wedegaertner, T.; Zhang, J.; Yu, J. Genetics, breeding and genetic engineering to improve cottonseed oil and protein: A review. Front. Plant Sci. 2022, 13, 864850.
  30. Pandarinathan, S.; Adhimoolam, P.K.; Gurav, N.P.; Shamkuwar, S.; Panigrahi, C.K.; Huded, S.; Kumar, M. A Review on Genetic Mechanisms of Plant-pathogen Resistance in Crop Breeding. Plant Cell Biotechnol. Mol. Biol. 2024, 25, 221–234.
  31. Mural, R.V. Genotype-By-Environment Interaction and Stability: Concepts, Analysis, And Applications. Des. Exp. Biom. Anal. 2025, 2, 127–144.
  32. Allard, R.W. Principles of Plant Breeding; John Wiley & Sons: Hoboken, NJ, USA, 1999.
  33. Hallauer, A.R.; Carena, M.J.; Miranda Filho, J.D. Quantitative Genetics in Maize Breeding; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2010; Volume 6, pp. 1–650.
  34. Wilcox, J.; Sediyama, T. Interrelationships among height, lodging and yield in determinate and indeterminate soybeans. Euphytica 1981, 30, 323–326.
  35. Paez, V.; Barrett, W.B.; Deng, X.; Diaz-Amigo, C.; Fiedler, K.; Fuerer, C.; Hostetler, G.L.; Johnson, P.; Joseph, G.; Konings, E.J. AOAC SMPR® 2016.002. J. AOAC Int. 2016, 99, 1122–1124.
  36. Dohlman, E.; Hansen, J.; Boussios, D. USDA Agricultural Projections to 2029; USDA: Washington, DC, USA, 2020.
  37. Nyvall, R.F. Diseases of Soybeans: Glycine max (L.) Merr. In Field Crop Diseases Handbook; Springer: Berlin/Heidelberg, Germany, 1989; pp. 503–559.
  38. Niblack, T.; Arelli, P.; Noel, G.; Opperman, C.; Orf, J.; Schmitt, D.; Shannon, J.; Tylka, G. A revised classification scheme for genetically diverse populations of Heterodera glycines. J. Nematol. 2002, 34, 279.
  39. McMaster, G.S.; Wilhelm, W. Growing degree-days: One equation, two interpretations. Agric. For. Meteorol. 1997, 87, 291–300.
  40. Wickham, H.; Bryan, J. R Packages; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2023; pp. 1–341.
  41. de Mendiburu, F. Agricolae Tutorial, Version 1.3–5; Universidad Nacional Agraria: La Molina, Peru, 2021.
  42. Wickham, H. Programming with ggplot2. In Ggplot2: Elegant Graphics for Data Analysis; Springer: Berlin/Heidelberg, Germany, 2016; pp. 241–253.
  43. Finlay, K.; Wilkinson, G. The analysis of adaptation in a plant-breeding programme. Aust. J. Agric. Res. 1963, 14, 742–754.
  44. Dumble, S.; Frutos Bernal, E.; Galindo, V. Package GGEBiplots; Version 0.1; 2022; Volume 3, Available online: https://CRAN.R-project.org/package=GGEBiplots (accessed on 1 August 2025).
  45. Yan, W.; Hunt, L.A.; Sheng, Q.; Szlavnics, Z. Cultivar evaluation and mega-environment investigation based on the GGE biplot. Crop Sci. 2000, 40, 597–605.
  46. Liaw, A.; Wiener, M. Classification and regression by randomForest. R News 2002, 2, 18–22.
  47. Rieseberg, L.H. Hybrid origins of plant species. Annu. Rev. Ecol. Syst. 1997, 28, 359–389.
  48. DeVicente, M.; Tanksley, S. QTL analysis of transgressive segregation in an interspecific tomato cross. Genetics 1993, 134, 585–596.
  49. Rai, N.; Rai, M. Heterosis Breeding in Vegetable Crops; New India Publishing: New Delhi, India, 2006; pp. 1–258.
  50. Nyadanu, D.; Lowor, S.; Tchokponhoue, D.; Pobee, P.; Nunekpeku, W.; Brako-Marfo, M.; Okyere, D.; Owusu-Ansah, F.; Ofori, A. Combining ability and gene action for sexual compatibility and pattern of nut colour segregation among ten elite clones of kola (Cola nitida (Vent) Schott and Endl.). Euphytica 2021, 217, 62.
  51. Yan, W.; Tinker, N.A. Biplot analysis of multi-environment trial data: Principles and applications. Can. J. Plant Sci. 2006, 86, 623–645.
  52. Krause, M.D.; Piepho, H.-P.; Dias, K.O.; Singh, A.K.; Beavis, W.D. Models to estimate genetic gain of soybean seed yield from annual multi-environment field trials. Theor. Appl. Genet. 2023, 136, 252.
  53. Hochholdinger, F.; Baldauf, J.A. Heterosis in plants. Curr. Biol. 2018, 28, R1089–R1092.
  54. Li, Y.; Chang, J.; Gao, X.; Zhang, L.; Wang, L.; Ren, C. A case study on the impacts of future climate change on soybean yield and countermeasures in Fujin city of Heilongjiang province, China. Front. Agron. 2024, 6, 1257830.
  55. Liu, X.; Zhang, C.; Lamlom, S.F.; Zhao, K.; Abdelghany, A.M.; Wang, X.; Zhang, F.; Yuan, R.; Han, D.; Zha, B. Genetic adaptations of soybean to cold stress reveal key insights through transcriptomic analysis. Biology 2024, 13, 856.
  56. Zha, B.; Zhang, C.; Yuan, R.; Zhao, K.; Sun, J.; Liu, X.; Wang, X.; Zhang, F.; Zhang, B.; Lamlom, S.F. Integrative QTL mapping and candidate gene analysis for main stem node number in soybean. BMC Plant Biol. 2025, 25, 422.
  57. Ren, H.; Hong, H.; Zha, B.; Lamlom, S.F.; Qiu, H.; Cao, Y.; Sun, R.; Wang, H.; Ma, J.; Zhang, H. Soybean productivity can be enhanced by understanding rhizosphere microbiota: Evidence from metagenomics analysis from diverse agroecosystems. Microbiome 2025, 13, 105.
  58. McDonald, B.A.; Linde, C. Pathogen population genetics, evolutionary potential, and durable resistance. Annu. Rev. Phytopathol. 2002, 40, 349–379.
  59. Lurá, M.C.; Latorre Rapela, M.G.; Vaccari, M.C.; Maumary, R.; Soldano, A.; Mattio, M.; González, A.M. Genetic diversity of Cercospora kikuchii isolates from soybean cultured in Argentina as revealed by molecular markers and cercosporin production. Mycopathologia 2011, 171, 361–371.
  60. Brown, J.K. Durable resistance of crops to disease: A Darwinian perspective. Annu. Rev. Phytopathol. 2015, 53, 513–539.
  61. Patil, G.; Mian, R.; Vuong, T.; Pantalone, V.; Song, Q.; Chen, P.; Shannon, G.J.; Carter, T.C.; Nguyen, H.T. Molecular mapping and genomics of soybean seed protein: A review and perspective for the future. Theor. Appl. Genet. 2017, 130, 1975–1991.
  62. Singer, W.M.; Lee, Y.C.; Shea, Z.; Vieira, C.C.; Lee, D.; Li, X.; Cunicelli, M.; Kadam, S.S.; Khan, M.A.W.; Shannon, G. Soybean genetics, genomics, and breeding for improving nutritional value and reducing antinutritional traits in food and feed. Plant Genome 2023, 16, e20415.
  63. Usigbe, M.J.; Uyeh, D.D.; Park, T.; Ha, Y.; Mallipeddi, R. Many objective optimization and decision support for dairy cattle feed formulation. Sci. Rep. 2025, 15, 13451.
  64. Wachong Kum, S.; Voccia, D.; Grimm, M.; Froldi, F.; Suciu, N.A.; Lamastra, L. Reducing the Environmental Impacts of Pig Production Through Feed Reformulation: A Multi-Objective Life Cycle Assessment Optimisation Approach. Sustainability 2025, 17, 8509.
Figure 1. Breeding pedigree and development timeline of soybean variety Heike 88. The pedigree diagram illustrates the systematic breeding approach from parent line hybridization (2010) through final variety release (2015) and continued evaluation (2016–2023). Parent varieties Heijiao 08-1611 (♀) and Heihe 43 (♂) were crossed to produce the F1 generation, followed by accelerated generation advancement to F6. The evaluation phase (2012–2014) included a comprehensive assessment of protein content, oil content, and yield performance. The final cultivar, Heike 88, was selected in 2015 and subjected to continued multi-location testing. Color coding indicates different phases: blue (parent varieties), green (F-generations), orange (evaluation phase), purple (final cultivar), and red (continued breeding activities). The breeding program was conducted at Heilongjiang Academy of Agricultural Sciences, Heihe Branch.
Figure 2. Comprehensive analysis of Heike 88 soybean with performance effects integrated into a multi-dimensional analysis framework. (A) Principal component analysis (PCA) biplot showing variance distribution (PC1: 43.4%, PC2: 32.6%) with clusters based on performance effects. (B) Mid-parent performance expression across the accumulated temperature gradient (2000–2300 °C ≥10 °C), indicating the optimal performance range. (C) Environmental response surface with GAM-predicted yield, colored by performance level, highlighting optimal temperature zones. (D) Random Forest feature importance analysis identifying key drivers of yield performance, with growth period (13%), plant height (7.6%), and pods per plant (4%) as primary factors.
Figure 3. Comprehensive Performance Analysis of Heike 88 Soybean Variety. (A) Performance Distribution Analysis—Frequency distributions of three performance types (Better-Parent in red, Mid-Parent in blue, Standard in green). (B) Temporal Performance and Performance Trends—Four-year trends (2019–2022) showing coordinated increases in better-parent performance (red line), mid-parent performance (green line), and mean yield performance (blue line). (C) Quality Traits vs. Performance Relationship—Scatter plot showing oil content (%) and protein content (%) relationships with mid-parent performance. (D) Comprehensive Trait Correlation Network—Heat map displaying correlation matrix including performance effects, with color intensity representing correlation strength (red = positive, blue = negative).
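Table 1. Analysis of variance for morphological, agronomic, and quality traits across environments and seasons.

| Trait | Source | df | Sum Sq | Mean Sq | F Value | Pr (>F) | Sig | CV (%) | Mean |
|---|---|---|---|---|---|---|---|---|---|
| Plant Height (cm) | Environment | 6 | 1.45 × 10^3 | 241.61 | 3.0 | 0.020 | * | 10.55 | 88.7 |
| | Rep (Env) | 14 | 980.8 | 17.51 | 1.22 | 0.184 | NS | | |
| | Season | 3 | 954 | 318 | 3.6 | 0.018 | * | | |
| | Season × Env | 18 | 7.23 × 10^3 | 401.42 | 4.6 | <0.001 | *** | | |
| Bottom Pod Height (cm) | Environment | 6 | 332.57 | 55.43 | 20.0 | <0.001 | *** | 9.43 | 17.8 |
| | Rep (Env) | 14 | 31.6 | 0.56 | 1.22 | 0.184 | NS | | |
| | Season | 3 | 75.75 | 25.25 | 8.9 | <0.001 | *** | | |
| | Season × Env | 18 | 588 | 32.67 | 11.6 | <0.001 | *** | | |
| Main Stem Nodes | Environment | 6 | 83.57 | 13.93 | 5.0 | <0.001 | *** | 10.75 | 15.3 |
| | Rep (Env) | 14 | 30.4 | 0.54 | 1.22 | 0.184 | NS | | |
| | Season | 3 | 8.89 | 2.96 | 1.1 | 0.360 | NS | | |
| | Season × Env | 18 | 69.86 | 3.88 | 1.4 | 0.154 | NS | | |
| Lodging Rate (%) | Environment | 6 | 2.86 × 10^4 | 4.77 × 10^3 | 188.0 | <0.001 | *** | 19.08 | 26.4 |
| | Rep (Env) | 14 | 284.8 | 5.09 | 1.22 | 0.184 | NS | | |
| | Season | 3 | 3.69 × 10^4 | 1.23 × 10^4 | 483.7 | <0.001 | *** | | |
| | Season × Env | 18 | 5.04 × 10^4 | 2.80 × 10^3 | 110.1 | <0.001 | *** | | |
| Pods per Plant | Environment | 6 | 250.07 | 41.68 | 6.0 | <0.001 | *** | 10.44 | 25.9 |
| | Rep (Env) | 14 | 82.0 | 1.46 | 1.22 | 0.184 | NS | | |
| | Season | 3 | 539.57 | 179.86 | 24.6 | <0.001 | *** | | |
| | Season × Env | 18 | 1.50 × 10^3 | 83.11 | 11.4 | <0.001 | *** | | |
| Seeds per Plant | Environment | 6 | 3.08 × 10^3 | 513.36 | 14.0 | <0.001 | *** | 10.08 | 59.2 |
| | Rep (Env) | 14 | 399.2 | 7.13 | 1.22 | 0.184 | NS | | |
| | Season | 3 | 2.85 × 10^3 | 950.43 | 26.7 | <0.001 | *** | | |
| | Season × Env | 18 | 1.11 × 10^4 | 618.6 | 17.4 | <0.001 | *** | | |
| Hundred-seed Weight (g) | Environment | 6 | 24.5 | 4.08 | 1.0 | 0.642 | NS | 10.57 | 22.7 |
| | Rep (Env) | 14 | 64.27 | 1.15 | 1.22 | 0.184 | NS | | |
| | Season | 3 | 44.57 | 14.86 | 2.6 | 0.062 | NS | | |
| | Season × Env | 18 | 144.26 | 8.01 | 1.4 | 0.170 | NS | | |
| Plot Yield (kg) | Environment | 6 | 572.79 | 95.46 | 4.0 | <0.001 | *** | 13.47 | 34.3 |
| | Rep (Env) | 14 | 239.07 | 4.27 | 1.22 | 0.184 | NS | | |
| | Season | 3 | 6.99 × 10^4 | 2.33 × 10^4 | 1091.9 | <0.001 | *** | | |
| | Season × Env | 18 | 966.36 | 53.69 | 2.5 | 0.004 | ** | | |
| Yield (kg ha−1) | Environment | 6 | 5.55 × 10^6 | 9.25 × 10^5 | 11.0 | <0.001 | *** | 9.82 | 2888.6 |
| | Rep (Env) | 14 | 9.02 × 10^5 | 1.61 × 10^4 | 1.22 | 0.184 | NS | | |
| | Season | 3 | 5.81 × 10^6 | 1.94 × 10^6 | 24.0 | <0.001 | *** | | |
| | Season × Env | 18 | 5.78 × 10^6 | 3.21 × 10^5 | 4.0 | <0.001 | *** | | |
| Protein Content (%) | Environment | 6 | 2.95 | 0.49 | 1.0 | 0.661 | NS | 2.01 | 42.0 |
| | Rep (Env) | 14 | 8.03 | 0.14 | 1.22 | 0.184 | NS | | |
| | Season | 3 | 0.05 | 0.02 | 0.0 | 0.995 | NS | | |
| | Season × Env | 18 | 8.38 | 0.47 | 0.6 | 0.844 | NS | | |
| Oil Content (%) | Environment | 6 | 1.27 | 0.21 | 0.0 | 0.900 | NS | 3.84 | 19.9 |
| | Rep (Env) | 14 | 6.53 | 0.12 | 1.22 | 0.184 | NS | | |
| | Season | 3 | 0.22 | 0.07 | 0.1 | 0.943 | NS | | |
| | Season × Env | 18 | 5.00 | 0.28 | 0.5 | 0.958 | NS | | |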
df = degrees of freedom; Sum Sq = sum of squares; Mean Sq = mean squares; F Value = F-statistic; Pr (>F) = probability of F; Sig = significance level; CV = coefficient of variation. Significance levels: *** p < 0.001, ** p < 0.01, * p < 0.05, NS = not significant. Env = environment; Rep (Env) = replication nested within environment.
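
As a reading aid for Table 1, the short sketch below recomputes the mean square as the sum of squares divided by its degrees of freedom, using the Environment, Season, and Season × Env rows for Bottom Pod Height as printed in the table. It is a worked arithmetic check of that definition under the assumption that the tabulated values are rounded to two decimals, not a reanalysis of the underlying data.

```python
# (source, df, sum of squares, reported mean square) for Bottom Pod Height, Table 1
rows = [
    ("Environment", 6, 332.57, 55.43),
    ("Season", 3, 75.75, 25.25),
    ("Season x Env", 18, 588.0, 32.67),
]

for source, df, sum_sq, reported_ms in rows:
    mean_sq = sum_sq / df  # mean square = sum of squares / degrees of freedom
    print(f"{source:<13} computed MS = {mean_sq:.2f}, reported MS = {reported_ms:.2f}")
```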
Table 2. Growth and development characteristics of soybean variety Heike 88 across seven locations.

| Location | Sowing Date | Growth Days | Plant Height (cm) | Bottom Pod Height (cm) | Main Stem Nodes | Effective Branches |
|---|---|---|---|---|---|---|
| Beian Branch Research Institute | 16 May | 108 | 90.2 b | 13 d | 15 ab | 1 |
| Beian Dalong Seed Industry | 25 May | 104 | 83 e | 18 b | 15 ab | 0 |
| Heihe Seed Division | 14 May | 131 | 88.7 c | 20 a | 16 a | 0 |
| Heshan Farm | 11 May | 123 | 88.7 c | 17 bc | 16 a | 0 |
| Nenjiang County Far East Seed | 25 May | 106 | 87.7 cd | 16 c | 14 b | 0 |
| Nenjiang Farm | 16 May | 116 | 97.2 a | 14 d | 16 a | 0 |
| Wudalianchi Seed Station | 16 May | 116 | 85.2 d | 15 c | 14 b | 0 |
| Mean | - | 114 | 88.7 | 17.6 | 15.1 | 0.1 |
| CV (%) | - | 8.7 | 16.4 | 26.1 | 5.3 | 316.2 |
Means followed by different letters within a column differ significantly at p ≤ 0.05.
Table 3. Morphological characteristics and lodging performance of soybean variety Heike 88.

| Location | Flower Color | Pod Color | Seed Shape | Seed Color | Hilum Color | Lodging Resistance (Grade) | Lodging Rate (%) |
|---|---|---|---|---|---|---|---|
| Beian Branch Research Institute | Purple | Yellow brown | Round | Yellow | Yellow | 3 | 30 |
| Beian Dalong Seed Industry | Purple | Brown | Round | Yellow | Yellow | 0 | 0 |
| Heihe Seed Division | Purple | Brown | Oval | Yellow | Yellow | 1 | 70 |
| Heshan Farm | Purple | Brown | Round | Yellow | Yellow | 0 | 0 |
| Nenjiang County Far East Seed | Purple | Brown | Round | Yellow | Yellow | 0 | 0 |
| Nenjiang Farm | Purple | Brown | Round | Yellow | Yellow | 0 | 0 |
| Wudalianchi Seed Station | Purple | Brown | Round | Yellow | Yellow | 0 | 0 |
| Uniformity (%) | 100 | 85.7 | 85.7 | 100 | 100 | Variable | Variable |
Lodging resistance: 0 = excellent, 1 = good, 2 = moderate, 3 = poor.
Table 4. Yield performance and yield components of soybean variety Heike 88 across seven locations.

| Location | Yield (kg ha−1) | Hundred-Seed Weight (g) | Seeds per Plant | Plot Yield (kg) | Compared to Control (%) |
|---|---|---|---|---|---|
| Beian Branch Research Institute | 2696.5 e | 22.9 ab | 62 b | 34.2 bc | 12 |
| Beian Dalong Seed Industry | 2723 e | 22.2 c | 65.25 a | 30 c | 8.9 |
| Heihe Seed Division | 2449 f | 22.6 bc | 57.25 | 30.8 c | 13.4 |
| Heshan Farm | 3104 b | 22.9 ab | 65.25 a | 36.8 a | 11.9 |
| Nenjiang County Far East Seed | 3042 c | 22.2 c | 47 d | 35.2 b | 13.5 |
| Nenjiang Farm | 2953 d | 23.7 a | 62.25 b | 35.3 b | 11 |
| Wudalianchi Seed Station | 3250 a | 21.9 d | 55.5 c | 36.9 a | 10.5 |
Means followed by different letters within a column differ significantly at p ≤ 0.05.
Table 5. Seed quality components across testing locations.

| Location | Protein Content (%) | Oil Content (%) |
|---|---|---|
| Beian Branch Research Institute | 41.95 a | 19.92 a |
| Beian Dalong Seed Industry | 41.69 a | 20.13 a |
| Heihe Seed Division | 42.19 a | 19.74 a |
| Heshan Farm | 42.2 a | 19.74 a |
| Nenjiang County Far East Seed | 41.95 a | 19.86 a |
| Nenjiang Farm | 42.25 a | 19.82 a |
| Wudalianchi Seed Station | 41.95 a | 19.89 a |
Means followed by different letters within a column differ significantly at p ≤ 0.05.
Table 6. Disease Resistance and Seed Quality Assessment of Soybean Variety Heike 88.

| Location | Gray Spot Disease (Grade) | SMV Virus Disease (Grade) | Cyst Nematode (Grade) | Diseased Seeds (%) | Insect Damage (%) | Perfect Seeds (%) |
|---|---|---|---|---|---|---|
| Beian Branch Research Institute | 1 | 0 | 0 | 2 | 3 | 97 |
| Beian Dalong Seed Industry | 0 | 0 | 0 | 0 | 1 | 99 |
| Heihe Seed Division | 1 | 0 | 0 | 0 | 0 | 100 |
| Heshan Farm | 0 | 0 | 0 | 0 | 0 | 100 |
| Nenjiang County Far East Seed | 1 | 0 | 0 | 1 | 1 | 98 |
| Nenjiang Farm | 0 | 0 | 0 | 0 | 0 | 100 |
| Wudalianchi Seed Station | 0 | 0 | 0 | 0 | 0 | 98 |
| Mean | 0.4 | 0 | 0 | 0.4 | 0.7 | 99.1 |
| Range | 0–1 | 0 | 0 | 0–2 | 0–3 | 97–100 |
Disease resistance grades: 0 = immune/highly resistant, 1 = resistant, 2 = moderately resistant, 3 = susceptible.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Han, D.; Yan, X.; Li, W.; Jia, H.; Ren, H.; Lu, W. Multi-Environment Evaluation of Soybean Variety Heike 88: Transgressive Segregation and Regional Adaptation in Northern China. Agriculture 2025, 15, 2106. https://doi.org/10.3390/agriculture15202106
