Allelic Variation of Soybean Maturity Genes E1–E4 in the Huang-Huai-Hai River Valley and the Northwest China

Abstract: Soybean is planted across a wide span of the world, and flowering and maturity time is an important trait determining soybean yield formation and adaptation. The maturity loci E1, E2, E3 and E4 are frequently reported as the most influential genetic loci for soybean flowering and maturity. To understand the allelic variation and assess the phenological traits of cultivars with different E allelic combinations in natural environments, 251 cultivars of maturity groups (MGs) I–V were field tested in 42 locations across four sub-regions in the Huang-Huai-Hai and Northwest regions of China and genotyped with KASP markers for the E1–E4 loci. Mutant alleles were found only at the E1 and E2 loci, whereas all of the cultivars carried functional alleles at the E3 and E4 loci; the frequency of mutant alleles was higher in early MGs than in late MGs. Among the nine E allelic combinations in this area, those with one photoperiod-insensitive mutation at the E2 locus (E1/e2-ns/E3-Ha/E4 and E1/e2-ns/E3-Mi/E4) made up the largest proportions (25.10 and 18.33%, respectively), while the combination with photoperiod-insensitive mutations at both the E1 and E2 loci (e1-as/e2-ns/E3-Ha/E4) made up the lowest proportion (1.20%) of this panel. The major E combinations for MG I, MG II and MG III in this area were E1/E2-dl/E3-Mi/E4, E1/e2-ns/E3-Mi/E4 and E1/e2-ns/E3-Ha/E4, respectively. Cultivars carrying the e1-as/e2-ns/E3-Ha/E4 genotype flowered earliest (34 days) on average, 7.6 days earlier than the latest-flowering E haplotype (E1/e2-ns/E3-Ha/E4). This study identified the E allelic combinations present in the Huang-Huai-Hai River Valley and Northwest China, which should facilitate the improvement of soybean adaptation in the future.


Introduction
Soybean originated in Southeastern Asia and expanded to tropical and high-latitude zones [1,2]. Flowering time and maturity are the key traits determining the adaptation zone of soybean varieties. Soybean cultivars with different photo-thermal sensitivities adapt to different geographic zones [3]. Understanding the adaptive performance of soybean cultivars therefore plays an important role in breeding.
Allelic variations of the E1 to E4 genes in cultivars from different geographic zones and maturity groups have been studied in China [16-18], Europe [19] and North America [20]. Allelic variation of the E1 to E4 genes can explain 62-66% of the variation in soybean flowering, indicating that allelic combinations of E genes play major roles in determining soybean flowering and adaptation zone [21]. Functional markers for E1-E4 were developed in our previous studies to genotype unknown variants in the E1-E4 maturity genes, and these markers are useful for germplasm screening and molecular marker-assisted selection [15,18,21-23]. Kompetitive allele-specific PCR (KASP) genotyping, which combines high throughput with low cost, has been used extensively for identifying SNPs [24,25].
The Huang-Huai-Hai River Valley is the second largest soybean production region in China and a very important area for high-protein soybean production. Northwest China is where the national soybean high-yield records have been set. In the current study, 251 soybean cultivars of MG I to MG V with nine E allelic combinations were planted in four sub-regions. The aims of our study were to (1) analyze the allelic variation of E1 to E4 in the collection of cultivars from the Huang-Huai-Hai region and Northwest China and (2) analyze the phenological performance of cultivars with different E combinations.

Plant Material and Locations
The experiment comprised 251 soybean (Glycine max (L.) Merr.) cultivars from five maturity groups (MGs): MG I (5), MG II (42), MG III (193), MG IV (10) and MG V (1). The cultivars originated from four sub-regions: 47 were from the north sub-region, 64 from the middle sub-region, 82 from the south sub-region and 25 from the northwest sub-region, and 33 cultivars were included in the tests of multiple sub-regions (Table S1). The soybean seeds were obtained from the breeders.

Experimental Design and Data Collection
The data were downloaded from the report of the multiple-site soybean variety test in the Huang-Huai-Hai region in 2017-2018, which did not include the replication data (http://www.soybreeding.com/download/download.php?lang=cn&class2=77, accessed on 15 May 2019). The experiment was carried out in 42 locations in four sub-regions, with 11, 12, 11 and 11 locations in the north, middle, south and northwest sub-regions, respectively; three of these locations (Fenyang, Zhengzhou and Taiyuan) hosted the tests of two sub-regions each, so the 45 location-by-sub-region tests correspond to 42 distinct locations. The locations ranged in latitude from 32.93° N to 40.17° N and in longitude from 102.61° E to 119.15° E (Table S2). Cultivars originating from each sub-region were planted in the locations of the corresponding sub-region. One hundred cultivars were tested only in 2017, 95 only in 2018, and 56 in both years.
The experiment was arranged in a randomized complete block design with three replications. Each plot consisted of 6 rows that were 6 m long with an inter-row spacing of 0.5 m. After emergence, the plants were thinned to a uniform stand of 22 plants m⁻² in the north and middle sub-regions and 18 plants m⁻² in the south and northwest sub-regions. A basal fertilizer (150 kg ha⁻¹ (NH₄)H₂PO₄, 75 kg ha⁻¹ urea and 40 kg ha⁻¹ K₂SO₄) was applied prior to planting. Weeds and pests were controlled following normal practice.
The phenological stages of emergence (VE), beginning bloom (R1) and full maturity (R8) were recorded according to the soybean growth stages described by Fehr and Caviness [26]. The days to flowering (DTF) and days to maturity (DTM) were calculated as the periods from VE to R1 and from VE to R8, respectively. Two further variables were derived: the reproductive period (RP), which is the difference between DTM and DTF (the period from R1 to R8), and R/V, which is the ratio of the reproductive period to the vegetative period (the period from VE to R1). Soybean varieties with the same maturity time may differ in growth period structure; the R/V ratio captures this difference by comparing the post-flowering duration with the pre-flowering duration.
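The derived variables defined above can be illustrated with a minimal Python sketch. The function name `phenology_traits` and the dates are hypothetical and chosen only for illustration; they are not taken from the variety test report. The sketch shows how two plots with the same DTM can still differ in R/V.

```python
from datetime import date


def phenology_traits(ve: date, r1: date, r8: date) -> dict:
    """Derive DTF, DTM, RP and R/V from the dates of the VE, R1 and R8 stages."""
    dtf = (r1 - ve).days  # days to flowering: VE to R1
    dtm = (r8 - ve).days  # days to maturity: VE to R8
    if dtf <= 0 or dtm <= dtf:
        raise ValueError("expected VE < R1 < R8")
    rp = dtm - dtf  # reproductive period: R1 to R8
    return {"DTF": dtf, "DTM": dtm, "RP": rp, "R/V": rp / dtf}


# Two hypothetical plots with the same emergence and maturity dates
# but different flowering dates (invented values, for illustration only).
late_flowering = phenology_traits(date(2017, 6, 20), date(2017, 7, 30), date(2017, 10, 7))
early_flowering = phenology_traits(date(2017, 6, 20), date(2017, 7, 22), date(2017, 10, 7))

for name, traits in (("late", late_flowering), ("early", early_flowering)):
    print(name, traits["DTF"], traits["DTM"], traits["RP"], round(traits["R/V"], 2))
```

Both hypothetical plots mature in 109 days, yet the earlier-flowering plot has the larger R/V because more of its growth period falls after R1.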

Genotyping of Maturity Loci
The SNPs between different alleles of E1, E2, E3 and E4 were described in our previous study [18]. KASP allele-specific primers were designed to genotype the SNP variants in E1 and E2, and an InDel assay was designed for the E3 locus; the primer sequences are shown in Supplementary Table S3. Genomic DNA was extracted from fresh trifoliate leaves of each cultivar with a modified cetyl trimethyl ammonium bromide (CTAB) method. The KASP and InDel assays were then used to genotype E1-E3 in this panel.

Statistical Analysis
Analyses of variance for DTF, DTM, RP and R/V were conducted across sub-regions and within each sub-region separately, using a linear mixed-effects model fitted with the lme function of the nlme package in R (Version 3.6.1). The genotypes at the E1, E2 and E3 loci, location, and the interactions between the genotypes at the E1, E2 and E3 loci and location were treated as fixed effects, and year was treated as a random effect, given the lack of replication data. The E4 locus was excluded from the ANOVA because only one genotype, the dominant E4, was found. Best linear unbiased estimates (BLUEs) of DTF, DTM, RP and R/V of the varieties across locations in each sub-region were calculated and used as the mean phenotype in each sub-region.

Genotyping of E1-E4 in the Huang-Huai-Hai River Valley and the Northwest China
At the E1 locus, 84% of the 251 cultivars carried the wild-type allele (E1) and 16% carried the mutant allele (e1-as). At the E2 locus, three alleles were found, with frequencies of 28, 28 and 44% for E2-dl, E2-in and e2-ns, respectively. At the E3 locus, E3-Ha was identified in 58% of the cultivars and E3-Mi in 42%. Only the dominant allele (E4) was found at the E4 locus.
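Per-locus allele frequencies of this kind can be tallied directly from the slash-separated E combinations. The following Python sketch is only an illustration under stated assumptions: the helper `allele_frequencies` and the five-cultivar panel are hypothetical and do not reproduce the 251-cultivar data set.

```python
from collections import Counter

LOCI = ("E1", "E2", "E3", "E4")


def allele_frequencies(genotypes):
    """Return {locus: {allele: percent}} for slash-separated E1/E2/E3/E4 genotypes."""
    counts = {locus: Counter() for locus in LOCI}
    for genotype in genotypes:
        alleles = genotype.split("/")
        if len(alleles) != len(LOCI):
            raise ValueError(f"expected {len(LOCI)} alleles, got {genotype!r}")
        for locus, allele in zip(LOCI, alleles):
            counts[locus][allele] += 1
    n = len(genotypes)
    return {
        locus: {allele: 100 * count / n for allele, count in counter.items()}
        for locus, counter in counts.items()
    }


# Hypothetical five-cultivar panel (invented for illustration).
panel = [
    "E1/e2-ns/E3-Ha/E4",
    "E1/e2-ns/E3-Mi/E4",
    "E1/E2-dl/E3-Mi/E4",
    "e1-as/E2-in/E3-Ha/E4",
    "E1/E2-in/E3-Ha/E4",
]
freqs = allele_frequencies(panel)
for locus in LOCI:
    print(locus, freqs[locus])
```

Counting whole combinations instead of single alleles, for example with `Counter(panel)`, would give the frequencies of the E allelic combinations in the same way.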

The Performance of Flowering and Maturity Time with Different E Allelic Combinations
Analyses of variance showed that E genotype, location and their interaction were significant for all phenological traits (DTF, DTM, RP and R/V), with the exception of the E3 locus, suggesting that E genotype, environment and E genotype-by-environment interaction all contributed to the phenotypic variation. Location was the largest contributor to the phenotypic variation, 4.53 to 49.29 times that of the E genotype effect, and E genotype-by-environment interaction was the least important source of variation; this may be due to the large number of locations (42) in the current study and to the variation contributed by flowering genes other than E1 to E4. Analyses of variance within each sub-region showed that the E3 × Loc interaction and E3 were non-significant in all sub-regions, with the exception of the E3 genotype in the south sub-region. The interaction between E genotype and location was non-significant in most sub-regions (Table 2). E1 contributed a larger effect to the variation of DTF, DTM, RP and R/V than E2 and E3.
Table notes: R/V, ratio of the reproductive phase to the vegetative phase; ***, ** and * indicate significance at the 0.001, 0.01 and 0.05 levels, respectively; Loc, location.
To compare the durations of vegetative growth, reproductive growth and the total growth phase, as well as the ratio of the reproductive to the vegetative growth phase, among cultivars with different E combinations, we calculated the BLUEs of DTF, DTM, RP and R/V across locations for cultivars with the same E combination. The haplotype with mutations at both the E1 and E2 loci (e1-as/e2-ns/E3-Ha/E4) flowered earliest (34.0 days), 7.6 days earlier than the latest-flowering E haplotype (E1/e2-ns/E3-Ha/E4). However, the e1-as/e2-ns/E3-Ha/E4 haplotype had a relatively long reproductive stage (75.06 days), 5.25 days longer than that of the haplotype with the shortest reproductive stage (E1/E2-dl/E3-Ha/E4; 69.81 days) (Figure 3). Cultivars with similar maturity may have different growth period structures; for instance, Andou109 (E1/e2-ns/E3-Ha/E4; R/V = 1.7) has a 7-day longer pre-flowering phase and a 7-day shorter post-flowering period than Cangdou 09Y1 (e1-as/E2-in/E3-Ha/E4; R/V = 2.4), but the two have a similar growth period (109 days). We also observed variation in flowering and maturity within the same E haplotype, indicating that additional loci participate in regulating flowering and maturity.
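The comparison of haplotypes can be sketched in Python as a group-wise summary. This is a simplification: plain arithmetic means stand in for the BLUEs used in the study, and the helper `haplotype_means` and the DTF values are hypothetical, invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def haplotype_means(records):
    """Average a trait over cultivars that share the same E combination."""
    groups = defaultdict(list)
    for haplotype, value in records:
        groups[haplotype].append(value)
    return {haplotype: mean(values) for haplotype, values in groups.items()}


# Hypothetical cultivar-level DTF values in days (not data from the study).
records = [
    ("e1-as/e2-ns/E3-Ha/E4", 33.0),
    ("e1-as/e2-ns/E3-Ha/E4", 35.0),
    ("E1/e2-ns/E3-Ha/E4", 41.0),
    ("E1/e2-ns/E3-Ha/E4", 43.0),
    ("E1/E2-dl/E3-Mi/E4", 38.0),
]
means = haplotype_means(records)
earliest = min(means, key=means.get)  # haplotype with the smallest mean DTF
latest = max(means, key=means.get)    # haplotype with the largest mean DTF
gap = means[latest] - means[earliest]
print(earliest, latest, gap)
```

With unbalanced multi-location data, simple means can be biased by which locations each cultivar was tested in, which is one reason a model-based estimate such as a BLUE is preferable in practice.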

Discussion
The Huang-Huai-Hai region is the second largest soybean production region and was the origin of domesticated soybean [27]. Northwest China is where the national soybean high-yield records have been set. As soybean expanded to a wider geographic range, artificial selection on maturity genes matched cultivars to the natural photo-thermal conditions [3]. In the current study, natural variation of the E1 to E4 genes was found to be limited in the cultivars of MG I-V in the Huang-Huai-Hai and Northwest regions; only partial non-functional mutations were identified at the E1 and E2 loci, and no mutant alleles were found at the E3 and E4 loci, indicating that E3 and E4 are fixed in this zone. The results are consistent with previous reports from America and China in which, given no mutations at the E3 and E4 loci, the double mutant e1-as/e2 occurred in MG I-III, the single mutants e1-as/E2 and E1/e2 in MG II-IV and MG II-V, respectively, and the wild type E1/E2 in MG III-VIII [16,18,20]. Cultivars carrying the allelic combination E1/e2-ns/E3/E4 occupied the highest proportion in the Huang-Huai-Hai River Valley, where soybean is mostly summer-planted, and this agrees with other studies [18]. The cultivars with E1/e2-ns/E3/E4 accounted for 45.8% of the cultivars of MGs I-VI, indicating that this combination plays an important role in adaptation to a wide range of geographical regions and multiple cropping systems.
The cultivars carrying mutations at both the E1 and E2 loci had the earliest flowering time compared with cultivars of other E combinations, which is consistent with the result of another study on 308 Chinese cultivars from 12 maturity groups [18]. The effect of E1 differs among E2-E4 genetic backgrounds. Cultivars carrying E1/e2-ns/e3-tr/E4 reached flowering and maturity very early, similar to e1-as/e2-ns/e3-tr/E4, whereas cultivars carrying E1/e2-ns/E3/E4 showed delayed flowering and maturity compared with e1-as/e2-ns/E3/E4 [18].
With early-maturing "Harosoy" near-isogenic lines, Cober et al. [28] found that the E1 allele delayed both flowering and maturity by 16 days under natural day length compared with early-maturing alleles. The flowering and maturity of lines with the E1 allele did not differ much between long-day conditions and natural day length. Under the 12 h short-day condition, there were no differences in flowering and maturity among the near-isogenic lines, indicating that E genes inhibit flowering under natural and long-day conditions. Compared with wild-type alleles, mutant alleles reduce photoperiod sensitivity and shorten the growth period. However, the effects of individual E genes could not be isolated in the current study, because the cultivars have different and unknown genetic backgrounds, unlike the near-isogenic lines in Cober's study, which share a consistent genetic background apart from the target loci; therefore, genetic effects and gene-by-environment interaction were not studied here.
Phenotypic plasticity determines the adaptability of plants to environmental stimuli, particularly in the context of climate change. For instance, in the current study, the e1-as/E2-in/E3-Ha/E4 genotype in the genetic backgrounds of Ji1708 and Lu0126 ranked differently for different traits across sub-regions, demonstrating that the genetic background besides the E loci also affects phenotypic variation and contributes to genotype-by-environment interaction (GEI). This study provides knowledge for germplasm evaluation and lays the foundation for selecting and designing elite cultivars with suitable E combinations in the Huang-Huai-Hai River Valley and Northwest China.

Conclusions
Nine E allelic combinations were identified in the Huang-Huai-Hai River Valley and Northwest China. Genotypes with one photoperiod-insensitive mutation at the E1 or E2 locus were the most frequent, while genotypes with photoperiod-insensitive mutations at both the E1 and E2 loci were the least frequent and flowered earliest on average. This study identified the adaptable E allelic combinations in the Huang-Huai-Hai River Valley and Northwest China, which should facilitate the improvement of soybean adaptation in the future.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/agriculture11060478/s1. Table S1: Soybean cultivars analyzed in the current study and their maturity groups (MG), regions, E allele combinations and the tested year; Table S2: The latitude and longitude of locations in the current study; Table S3: Primers for E1-E3 genotyping.