Quantitative Trait Locus Mapping for Drought Tolerance in Soybean Recombinant Inbred Line Population

Improving drought stress tolerance of soybean could be an effective way to minimize the yield reduction in the drought prevailing regions. Identification of drought tolerance-related quantitative trait loci (QTLs) is useful to facilitate the development of stress-tolerant varieties. This study aimed to identify the QTLs for drought tolerance in soybean using a recombinant inbred line (RIL) population developed from the cross between a drought-tolerant ‘PI416937’ and a susceptible ‘Cheonsang’ cultivar. Phenotyping was done with a weighted drought coefficient derived from the vegetative and reproductive traits. The genetic map was constructed using 2648 polymorphic SNP markers that distributed on 20 chromosomes with a mean genetic distance of 1.36 cM between markers. A total of 10 QTLs with 3.52–4.7 logarithm of odds value accounting for up to 12.9% phenotypic variance were identified on seven chromosomes. Five chromosomes—2, 7, 10, 14, and 20—contained one QTL each, and chromosomes 1 and 19 harbored two and three QTLs, respectively. The chromosomal locations of seven QTLs overlapped or located close to the related QTLs and/or potential candidate genes reported earlier. The QTLs and closely linked markers could be utilized in maker-assisted selection to accelerate the breeding for drought tolerance in soybean.


Introduction
Soybean (Glycine max [L.] Merr.) is one of the major commodity crops worldwide for food and feed sources (http://faostat.fao.org/). Increment in the production of major crops is crucial for global food security. However, the yield of many crops, including soybean, is challenged by global climate change [1]. Climate changes exacerbate the incidence of extreme weather patterns, such as erratic rainfall, elevated temperature, and the consequent drought stress, causing significant reductions in crop production [2]. Drought stress is a major abiotic stress that may cause more than 50% yield reduction in soybean [3]. Sensitivity of soybean plants to drought stress affects the global soybean yield because nearly 41% of the world's land is dryland [4], and unpredictable climatic variability, including increased drought events, is experienced in many parts [5,6]. Although the negative influence of drought on soybean depends on the severity, duration, and timing of the stress about the growth stage, the most susceptible stage to drought stress is the reproductive stage [7,8]. Therefore, acquisition of genetic information on drought tolerance at the reproductive stages of soybean is of great importance.
Low soil water availability brings several physiological and biochemical changes in soybean plants that may induce a wide range of injury symptoms, such as reduced photosynthesis [9,10], increased oxidative stress [11], and alterations in metabolism [12]. These changes are reflected in various visible traits, including reduced plant height, the number of nodes, branches and pods, biomass, and leaf area in soybean [13][14][15]. As drought tolerance is a complex quantitative trait controlled by multiple genes [16], it can be expected that several traits and loci are associated with the ability to tolerate water-deficit stress in soybean. Therefore, the quantitative trait locus (QTL) studies for drought tolerance comprising traits like plant height, the number of nodes, branches and pods, biomass, and leaf area could be of high significance.
Identification of the genomic regions associated with drought tolerance can help accelerate soybean genetic research and varietal improvement. A few linkage mapping studies have been carried out to identify QTLs related to drought tolerance in soybean considering different traits. For instance, QTLs have been detected using seed yield and drought susceptibility [17], leaf wilting coefficient, excised leaf water loss, relative water content and seed yield [18], the conditioning of fibrous roots that is related to drought avoidance [19], water use efficiency and leaf ash [20,21], beta and carbon isotope discrimination [22], canopy wilting [23], and plant height and seed yield [24]. Recently, Wang et al. [25] used a genome-wide association study to identify QTL for drought tolerance considering the relative plant height and plant weight.
One of the major limiting factors in the genetic study of drought tolerance was the availability of low-density markers, thereby reducing the efficiency and accuracy of QTL mapping. However, the rapid development of sequencing techniques has provided powerful tools like single nucleotide polymorphism (SNP) genotyping, enabling the development of the highest map resolution compared to other marker systems [26,27]. SNP markers have been used to discover QTL in many crops, including rice, maize, wheat, soybean, canola, barley, sugar beet, and cowpea [28]. Similarly, selection and measurements of relevant traits are equally important to precisely identify QTLs for stress tolerance. In this study, we considered a few vegetative as well as reproductive traits, such as plant height (PH), the number of nodes on the main stem (NN), branches (BN) and pods (PN), biomass (BM), and leaf area (LA) for phenotyping and SNP markers for genotyping the RIL population to identify QTL for drought tolerance. As these six traits are regarded as highly affected traits due to drought stress [13][14][15], this study provides valuable information on genetic understanding and breeding for drought tolerance in soybean.

Soil Moisture Content
The soil moisture content of the control and treatment plots differed across three years according to the irrigation applied to the plots. On average, the control plots had 10-13% and the drought treatment plots had 3-10% soil moisture content. In 2017, the control plot showed an average of 11% and the treated plot showed an average of 7% soil moisture content. In 2018, the soil moisture content was 12.7 and 9.7% in the control and drought-treated plots, respectively. Similarly, the control plot showed 10% and the treated plot showed 3% moisture content in 2019.

Phenotypic Analysis of The Parents and 140 RILs
The drought-tolerant parent 'PI416937' had consistently higher weighted drought coefficient (WDC) than the susceptible parent 'Cheonsang' for all three combinations of traits ( Table 1). The mean WDC, calculated using two, three, and six traits, of 'PI416937' was 0.76, 0.80, and 0.79 and that of 'Cheonsang' was 0.42, 0.52, and 0.57, respectively. The highest WDC for 'PI416937' and 'Cheonsang' was found in 2019 and 2018, respectively. On the other hand, the highest WDC for the RILs was found in 2017. RIL distribution for WDC over three years showed normal distribution with transgressive segregation (Figure 1). highest WDC for 'PI416937′ and 'Cheonsang' was found in 2019 and 2018, respectively. On the other hand, the highest WDC for the RILs was found in 2017. RIL distribution for WDC over three years showed normal distribution with transgressive segregation (Figure 1).

Linkage Mapping and QTL Analysis
The 19,259 polymorphic markers were binned (segregation distortion p < 0.001 and missing data with >15%) to eliminate the redundant markers. After binning, 2702 markers remained, out of which 54 markers with high map intervals and recombination frequencies were also eliminated. The 54 removed markers had as high as 63.34 cM map intervals and/or 0.6712 recombination frequencies. A total of 2648 SNPs were used to construct the linkage maps of 20 chromosomes (Supplementary Table S1) and QTL analysis. The total linkage maps spanned 3608.4 cM with a mean of 1.36 cM between markers. Chromosomes 13 (262.44 cM) and 15 (145.71 cM) had the largest and shortest linkage maps, respectively.

Candidate Gene Prediction
The potential candidate genes that resided within 200 kb of the QTLs were searched in Soybase (www.soybase.org, accessed on 20 April 2021), NCBI (https://www.ncbi.nlm. nih.gov/, accessed on 20 April 2021), and Phytozyme (https://phytozome.jgi.doe.gov, accessed on 20 April 2021).Twelve potential candidate genes were found within the 200 kb of the QTL regions (Table 3). Four genes-Glyma07g10321, Glyma07g10340, Glyma07g10440, and Glyma07g11470-reside in one of the major stable QTL qWDC7-1. They are related to myeloblastosis (MYB) transcription factor family, a leucine-rich repeat receptor-like protein kinase, calmodulin binding protein-like, and mitogen-activated protein kinase, respectively. Gene Glyma01g04710 is related to glutathione S-transferase (GST). A few genes, such as Glyma19g33750, Glyma19g34210, and Glyma20g22311 are found to be directly associated with a stress response.

Discussion
The drought tolerance mechanism in plants is highly complex and is an outcome of complicated networks of multiple genes. Various physiological and biochemical alterations, due to drought stress, have been identified in soybean plants [9][10][11][12] that may visibly reflect in traits like PH, NN, BN, PN, BM, and LA [13][14][15]. Qi et al. [29] found a significant correlation between comprehensive drought resistance coefficient and WDC which was calculated by considering 35 morphological, physiological, and biochemical indicators including plant height and aboveground dry weight (biomass), which were also considered in the present study. These two traits (plant height and aboveground dry weight (biomass) incorporated in the previous report [29] were significantly correlated with other traits considered in the present study. As most of these six traits were significantly correlated (Supplementary Table S2), an integrated parameter WDC, derived from these traits, could appropriately represent them whilst analyzing the QTL for drought tolerance. Similarly, positive correlations of the number of nodes and pods with seed yield [30] as well as the associations of leaf area distribution with biomass and thereby with the number of pods, seed number, and seed yield [31] have been reported in soybean under low water availability, indicating the potential application of the QTL results of the present study in the soybean seed yield under drought condition.
The consistently higher WDC (Table 1) value of 'PI416937' than that of 'Cheonsang' over three years showed the former parent is better drought-tolerant than the latter one. Wide range and continuous variations in WDC value of RILs across different environments (year) indicated a quantitative nature of WDC, suggesting the appropriateness of choosing these parents to develop the RIL population for QTL analysis. The transgressive segregation of the genotypes having WDC beyond either parent could be exploited in breeding for drought tolerance [32]. Although high broad-sense heritabilities for six traits were observed in individual years (up to 0.90), the mean year data showed relatively low heritability (up to 0.42) (Supplementary Table S3), suggesting a substantial influence of growing environment on the traits. The highly significant (p < 0.0001) genotype × year interaction also indicated the major influence of environment on the traits (Supplementary Table S4).
Several biochemical mechanisms and genes might be involved in stress tolerance in soybean [37]. Glyma01g04710 related to GST was found to be resided in the QTL qWDC1-1. GSTs play multiple roles in plants including drought stress response in Arabidopsis [38], rice [39], and soybean [40]. Over-expression of a GST gene, GsGST, from wild soybean (Glycine soja) enhances drought and salt tolerance in transgenic tobacco [41]. Overexpression of soybean BiP (binding protein), a molecular chaperon, similar to Glyma01g04750 in QTL qWDC1-1, can enhance drought tolerance in soybean [42].
The products of four genes-Glyma07g10321, Glyma07g10340, Glyma07g10440, and Glyma07g11470-in the QTL region of chromosome 7 are related to the regulation of drought stress in soybean and other plants. For instance, Arabidopsis calmodulin-binding transcription factor CAMTA1 is involved in drought stress response [43]. GmMYB84, a novel MYB confers drought tolerance in soybean [44]. Overexpression of the leucine-rich receptor-like kinase gene LRK2 increases drought tolerance and tiller number in rice [45]. Expression of a truncated ERECTA (a gene family encoding leucine-rich repeat receptor-like kinase) protein modified the growth and abiotic stress tolerance in soybean [46]. Morever, mitogen-activated protein kinase positively regulates drought stress in tomato [47].
In the QTL region of chromosome 19, four candidate genes were found. Glyma19g33750 is associated with salt stress response and Glyma19g34210 is related to a heat shock transcription factor. The other two genes-Glyma19g33650 and Glyma19g34550-are linked with glutathione peroxidase and Golgi SNARE Bet1-related, respectively. Heat stress transcription factors play a crucial role in plants' response to several abiotic stresses by regulating the expression of stress-responsive genes, such as heat shock proteins [48]. Overexpression of a glutathione peroxidase 5 (RcGPX5) gene increases drought tolerance in Salvia miltiorrhiza [49]. Furthermore, reactive oxygen species scavenging activities, including glutathione peroxidase, increased in soybean plants and were positively correlated with seed yield under drought stress [50]. Similarly, SNAREs are found to play a role in plant drought tolerance [51].
The QTLs for drought tolerance, which were identified considering up to six traits, were either colocalized or positioned adjacent to the previously reported QTLs and/or potential candidate genes associated with stresses and/or the traits of consideration. It increased the reliability of the QTL and the results could provide a valuable reference for the molecular marker-assisted selection and further fine-mapping of genes for drought tolerance.

Plant Material and Growing Conditions
A RIL population developed through the single seed descent method from a cross between a drought-tolerant 'PI416937' and susceptible 'Cheonsang' cultivar was used to analyze the QTL for drought tolerance. The parents and 140 RILs of F 6:7 , F 6:8 , and F 6:9 were grown in plastic houses at the Department of Southern Area Crop Science, Daegu (35 •  30 cm row to row and plant to plant distance in two replications for control and drought stress each. Irrigation was applied through drip irrigation and drought stress was imposed from the V4 to R4 stages by withholding irrigation during the period. The plants in the control plots were regularly irrigated to avoid drought stress.

Measurement of Soil Moisture Content
The soil moisture content of the control and drought-stressed plots was measured using a soil moisture meter (TDR 300, Spectrum Technologies, Plainfield, IL, USA).

Measurement of Traits and Phenotyping
The plant height, number of nodes and branches on the main stem, number of pods, and leaf area were measured at the R6 stage, whereas the biomass (including seeds) was measured when plant was harvested at the R8 stage. The traits were measured in three to five plants of each replication. Leaf area was measured using the Easy Leaf Area software [52].
Each of drought coefficient (DC) value of six traits was calculated as the ratio of individual trait under the drought to control conditions as shown in the equation below.

DC = Trait Drought /Trait Control
The weighted drought coefficient (WDC) was calculated as follows [29]. This is one of the methods of comprehensive evaluation of drought tolerance in soybean that were identified from eight yield-related agronomic traits, and rigorous studies of different evaluation methods by establishing a relative correlation with the traits.
where DC is mean drought coefficient of the traits considered, r is the correlation coefficient of the mean DC of the traits considered and the DC of individual traits.
The QTLs for drought tolerance were analyzed by considering the WDC values calculated from the combination of two (biomass and leaf area), three (plant height, biomass, and leaf area), and six (plant height, number of nodes, number of branches, number of pods, biomass, and leaf area) traits.

DNA Extraction and Genotyping
Genomic DNA was extracted from the young trifoliate leaves using a kit (Exgene TM Plant SV Miniprep Kit, GeneAll, Seoul, Korea) as described in a previous report [53]. The parents and RILs were genotyped using a 180K Axiom ® SoyaSNP array [54].

Construction of Linkage Map and QTL Analysis
The polymorphic markers between the parents were separated from the 180K SNPs and subjected to screen for redundancy. In the genetic study, the redundant markers can make no additional information because they have identical segregation in the genetic population and show clustering at one genetic position in the linkage map construction [55]. Therefore, the redundant markers were separated out using the Bin function before the linkage map construction using the Map function in IciMapping V4.1 [56]. The algorithms set for the Bin function were as follows: significant distortion of p < 0.001 and missing data with >15%. The linkage map was constructed using the Kosambi mapping function following the manufacturer's instruction with the adjusted parameters: grouping by 3.0 logarithm of odds (LOD) threshold, ordering by nnTwoOpt, and rippling by the sum of adjacent recombination fractions. The SNPs with high map intervals and recombination frequencies were further removed.
QTLs were analyzed with the composite interval mapping (CIM) using QTL Cartographer V2.5 (available at https://brcwebportal.cos.ncsu.edu/qtlcart/WQTLCart.htm, 5 March 2021) following the manufacturer's instructions with adjusted parameters: Model 6, forward and backward regression, walk speed of 1.0 cM, and putative QTL with a window size of 10 cM. The number of control markers was 5, which was a default parameter. The LOD threshold for each trait was determined using a 1000 permutation test at p < 0.05. After the completion of the analysis, the QTL information was extracted by adjusting a minimum of 10 cM between QTL and 2-LOD support intervals. The graphical presentation of linkage maps with QTLs was done using MapChart 2.32 [57].
The QTLs were named by combing abbreviated letters q for QTL and WDC for weighted drought coefficient followed by the name of chromosome and nth QTL on the chromosome. For instance, qWDC1-2 denotes the second QTL identified on chromosome 1.

Potential Candidate Genes Prediction
Potential candidate genes were searched within 200 kb regions of QTLs. The genes, which were directly linked to drought stress response and/or associated with the stress, were considered candidate genes. The name and function of drought stress-related potential candidate genes that resided in the QTLs were searched in Soybase (www.soybase.org), NCBI (https://www.ncbi.nlm.nih.gov/), and Phytozyme (https://phytozome.jgi.doe.gov). The Glyma1.1 gene version was used to collect the gene information.

Data Analysis
Analysis of variance (ANOVA) and Pearson's correlation were calculated in SAS9.4 using PROC GLM and PROC CORR, respectively. Broad-sense heritability (h 2 ) was determined as the ratio of genotypic variance (σ 2 G ) to phenotypic variance (σ 2 P ) as described earlier [58]. The genotypic variance (σ 2 G ) component was estimated as: M 3 −M 2 /rY where M 3 is the mean square of genotype, M 2 is the mean square of genotype × year, r is the number of replications, and Y is the number of years. The phenotypic variance (σ 2 P ) component was estimated using the equation σ 2 P = σ 2 G + σ 2 GY /Y + σ 2 e/rY where σ 2 GY and σ 2 e are the components of genotype × year and error variances, respectively. The component of genotype × year variance (σ 2 GY ) was estimated as: M 2 −M 1 /r where M 1 is the mean square of error (σ 2 e ).