Genome-Wide Association Study of Morpho-Physiological Traits in Aegilops tauschii to Broaden Wheat Genetic Diversity

Aegilops tauschii, the D-genome donor of bread wheat, is a storehouse of genetic diversity that can be used for wheat improvement. This species consists of two main lineages (TauL1 and TauL2) and one minor lineage (TauL3). Its morpho-physiological diversity is large, with adaptations to a wide ecological range. Identification of allelic diversity in Ae. tauschii is of utmost importance for efficient breeding and widening of the genetic base of wheat. This study aimed at identifying markers or genes associated with morpho-physiological traits in Ae. tauschii, and at understanding the difference in genetic diversity between the two main lineages. We performed genome-wide association studies of 11 morpho-physiological traits of 343 Ae. tauschii accessions representing the entire range of habitats using 34,829 DArTseq markers. We observed a wide range of morpho-physiological variation among all accessions. We identified 23 marker–trait associations (MTAs) in all accessions, 15 specific to TauL1 and eight specific to TauL2, suggesting independent evolution in each lineage. Some of the MTAs could be novel and have not been reported in bread wheat. The markers or genes identified in this study will help reveal the genes controlling the morpho-physiological traits in Ae. tauschii, and thus in bread wheat even if the plant morphology is different.


Introduction
Aegilops tauschii Coss. (syn. Ae. squarrosa auct. non L.), a wild diploid self-pollinating species (2n = 2x = 14, DD), is the D-genome donor of hexaploid bread wheat (Triticum aestivum L.; 2n = 6x = 42, AABBDD). It is native to Central Asia throughout the Caspian Sea region and China. About 10,000 years ago, natural hybridization between tetraploid wheat and Ae. tauschii [1][2][3] led to the formation of hexaploid wheat [4,5]. Only a few Ae. tauschii lines from a limited area were involved in this hybridization [6]. This has resulted in a narrow genetic base of the wheat D-genome during the evolution of bread wheat. This fact has been confirmed by various studies, and indicates that the D-genome of wheat has narrow genetic diversity compared with the A and B genomes [3,7,8]. However, much greater genetic diversity is present in the wild D-genome donor [9]. It is believed that Ae. tauschii is an excellent source of genes to widen the narrow genetic base of bread wheat, such as for drought and heat-stress tolerance [10,11]. To use the genetic diversity in Ae. tauschii effectively, a precise genomic and morpho-physiological analysis is needed.
The genome-wide association study (GWAS) is a leading approach to the dissection of complex traits and the detection of novel and superior alleles for crop breeding. GWAS has Plants 2021, 10, 211 2 of 16 been used to untangle the genetic architecture of numerous traits in different crops [12,13]. Many studies have focused on understanding the genetic and morphological diversity of Ae. tauschii germplasm [9,[14][15][16][17][18][19][20]. However, only a few studies in Ae. tauschii have used GWAS, focusing on cadmium stress [21], phosphorus deficiency [22], grain architecture [23], grain micronutrient concentrations [24], or other morphological traits [25]. Here, we investigated marker-trait associations (MTAs) of morpho-physiological traits that could contribute greatly to improving yield and stress adaptation in bread wheat through GWAS, and sought specific MTAs to define the sources of evolution in two of its three lineages, TauL1 and TauL2.

Morpho-Physiological Variation
We studied eight morphological traits (FLL, FLW, SPL, SPW Bio) and three physiological traits (NDVI, SPAD, and CT). measurement methodology shown in Figure 1. ANOVA reveal among all accessions in all traits (Table 1 and Figure 2).   The effect of seasonal difference (S) was significant (p < 0.05) for all traits except for FLW and DH. The effect of genotype × seasonal difference interaction (G × S) was significant for DH, Bio, NDVI, SPAD, and CT. Morpho-physiological variations among accessions in each trait were confirmed by range, mean, standard deviation, and coefficient of variation. The coefficient of variation ranged from 4.6% to 35.5% in S1 and from 4.4% to 57.9% in S2. Heritability values were higher in morphological traits (>0.90; FLL, FLW, SPL, and SPW) than in physiological traits (<0.60; NDVI, SPAD, and CT; Table 1). The effect of seasonal difference (S) was significant (p < 0.05) for all traits except for FLW and DH. The effect of genotype × seasonal difference interaction (G × S) was significant for DH, Bio, NDVI, SPAD, and CT. Morpho-physiological variations among accessions in each trait were confirmed by range, mean, standard deviation, and coefficient of variation. The coefficient of variation ranged from 4.6% to 35.5% in S1 and from 4.4% to 57.9% in S2. Heritability values were higher in morphological traits (>0.90; FLL, FLW, SPL, and SPW) than in physiological traits (<0.60; NDVI, SPAD, and CT; Table 1).

Correlation of Morpho-Physiological Traits in TauL1, TauL2, and All Accessions
In TauL1 and TauL2, we analyzed correlations among morpho-physiological traits (Tables 2 and 3   The correlations between spike-related traits (SPL, SPW, SN/SP, and SPWg) were slightly higher in TauL2 accessions than in TauL1 accessions.
The desirable allele is that with a greater contribution to phenotypic variation.
R 2 values ranged from 0.10 to 0.15 and were higher than those of the significant markers in all accessions combined (0.05-0.09; Table 6). TauL2 had 1 MTA for each of SPL, SPW, SN/SP, SPWg, DH, Bio, SPAD, and CT, with R 2 from 0.12 to 0.17 ( Figure 4 and Table 5). The desirable allele is that with a greater contribution to phenotypic variation.
Among the MTAs detected for DH in all accessions combined, marker 32782428|F|0-17, on chromosome 5D, was detected in TauL1 also, where it had pleiotropic effects on DH and Bio (Tables 5 and 6). All other significant MTAs differed between all accessions combined, TauL1 and TauL2. Marker 32740588, detected in TauL2, had a pleiotropic effect on SPW and SPWg. An MTA for CT was detected only in TauL2 (Figure 4 and Table 5). TauL1 and TauL2 had no MTAs in common. TauL2 had fewer MTAs than TauL1.

GWAS in All Accessions of Aegilops tauschii
GWAS in all 343 accessions identified 23 MTAs: three each for FLL, SPL, SPW, NDVI; four for SN/SP; six for DH; and one for SPAD ( Figure 5 and Table 6). R 2 values ranged from 0.05 to 0.09. Most of these MTAs were different from those in TauL1 and TauL2. The one exception, 32782428|F|0-17, for DH, appeared also in TauL1 as an MTA for DH and Bio. Most of the MTAs contributed less to variability (R 2 ) than those in TauL1 and TauL2.

Candidate Gene Identification
We searched for candidate genes for the MTAs in TauL1 and TauL2 (Supplementary  Tables S1 and S2) and identified the possible functions. The functions show that the MTAs found here play an important role in plant adaptation and survival.

Morpho-Physiological Variation in Aegilops tauschii
Among the wild species in the tribe Triticeae, Ae. tauschii is considered the most suitable for the genetic enhancement of wheat. The diversity of the D-genome of Ae. tauschii is much larger than that of hexaploid wheat's D genome. The Ae. tauschii genome contains many useful genes for resistance to biotic and abiotic stresses and for seed storage proteins [26][27][28][29]. The 343 Ae. tauschii accessions analyzed showed significant variation in most traits studied. Spike and leaf traits had higher heritabilities than physiological traits (CT, SPAD, and NDVI) ( Table 1), indicating that environmental factors greatly influence physiological traits. As spike and leaf traits are genetically determined, they are less influenced by the environment ( Table 1). Selection of highly heritable traits will be effective for widening the genetic base of wheat diversity [30]. Highly correlated traits are likely to be inherited together, widening the genetic base. A positive correlation between SPW and SPWg (r = 0.781 in TauL1, r = 0.907 in TauL2, r = 0.843 in all accessions; Tables 2-4) indicates that an increase in SPW increases SPWg. SPW had a greater effect on grain weight than SPL. On average, grains in TauL2 were heavier and larger. Moderate to strong correlations between grain weight and size in wheat have been reported [31]. A mutation in TaGW2-A1 increased both grain width and length in tetraploid and hexaploid wheat, which increased 1000-grain weight [32]. The correlation between SPW and SPWg was highest in TauL2 (r = 0.907; Table 3), indicating that TauL2 is a more suitable source for improving grain weight. A positive correlation between SPL and SN/SP indicates that an increase in SPL increases SN/SP. SPL thus affects kernel number per spike and plays an essential role in improving wheat yield [33]. Moreover, the number of grains per m 2 and grain weight are the most important traits for determining grain yield [23].
Among physiological traits, a significant positive correlation of NDVI with Bio indicates that an increase in NDVI enhances Bio production and subsequently plant production and adaptation. The negative correlation between CT and Bio indicates that a decrease in CT increases Bio. In other words, plants with better cooling capacity will maintain better Bio. A positive correlation of DH with Bio indicates that a longer vegetative period is preferable for a higher Bio, if the environment is favorable (Tables 2-4).

GWAS of Morpho-Physiological Traits in TauL1 and TauL2
GWAS revealed that MTAs of morpho-physiological traits differed in both chromosome name and location between TauL1 and TauL2 (Table 5). These findings indicate that the traits have evolved independently in each lineage.
TauL1 had more MTAs for SPL, DH, and Bio than TauL2 (Figures 3 and 4), indicating higher variation in these traits in TauL1. We found candidate genes in TauL1, but not in TauL2, that increase Bio and promote flowering, indicating that TauL1 is a better source for mining genes related to Bio, DH, and SPL (Supplementary Tables S1 and S2).
MTAs for CT and SPAD were found only in TauL2. As most of the accessions in TauL2 originated from Northern Iran, which has a warm and mild environment, we can speculate that these two traits contribute to the adaptation of these accessions to their habitats. Conversely, NDVI was found only in TauL1. TauL1 could be a source for NDVI gene mining, whereas TauL2 could be a source for CT and SPAD gene mining.
Mahjoob et al.'s unpublished study found that spike traits are potentially useful for differentiating between TauL1 and TauL2: SPL, SPW, and SPWg all differed significantly. In TauL1, no significant MTA was detected for SPW, and the marker R 2 for SPWg was lower in TauL1 than in TauL2. These results support our conclusion that TauL2 has more diversity in SPW and SPWg than TauL1. Moreover, the SPW and SPWg candidate genes TraesCS5D02G042200 and TraesCS5D02G041500, identified in TauL2, are orthologous to Arabidopsis thaliana AT2G03590, which encodes a transmembrane transporter that increases nitrogen fixation and promotes seed development [34]. Thus, TauL2 could be an essential source of genes related to these two traits.

GWAS of Morpho-Physiological Traits in All Accessions
The phenotypic contribution of markers revealed by GWAS was lower in all accessions than in TauL1 and TauL2 (Table 6). These may relate to the difference in population structures, which reduced the contribution of markers to phenotypic variation (R 2 ).

Candidate Genes Revealed by GWAS in Aegilops tauschii
We found several MTAs and candidate genes associated with specific functions that play an important role in plant growth and survival. This study is the first study to use GWAS analysis of many morphological and physiological traits in Ae. tauschii of important agronomic value to wheat breeding, though Liu et al. [25] conducted GWAS in Ae. tauschii in which traits, SPL, FLL, and FLW are common. Liu et al. [25] identified 18 MTAs for only 10 of the 29 traits studied. Our study identified more MTAs, with higher R 2 values (0.5-0.17) than most of those studied before [25] because we used GWAS for two lineages independently with more molecular markers.

Marker Traits Revealed in Wheat from Aegilops tauschii
To study the usefulness of the markers revealed in Ae. tauschii and their appearance in wheat, we reviewed previous GWAS studies of wheat (Table 7)  We found MTAs for DH on chromosomes 1D, 5D, and 7D also found by Lie et al. [35]. We identified novel MTAs on chromosomes 3D and 6D for DH; on 2D, 3D, and 5D for FLL; on 1D, 2D, 5D, and 6D for SN/SP; and on 1D, 2D, 3D, 5D, and 6D for SPL. In TauL1, we found novel MTAs on 1D and 5D for DH; on 4D for SN/SP; and on 1D, 2D, 3D, and 5D for SN/SP. In TauL2 (which supplied the D-genome of hexaploid wheat [38]), we identified three novel MTAs: two on 2D associated with SN/SP and SPL, and one on 7D associated with DH. Those MTAs can be easily transferred to the D-genome of wheat where they would be expected to increase yield. Markers on 7D associated with DH can be transferred to improve early flowering in later-flowering variants, especially in drylands.

Plant Materials
We used 343 Ae. tauschii accessions representing the entire range of natural habitats (Supplementary Table S3

Morpho-Physiological Evaluation
Details of the morpho-physiological evaluations and data collection are summarized in Table 8. Spike length and width were measured using ruler as shown in Figure 1. All accessions were characterized in the research field of the Arid Land Research Center, Tottori University (Tottori, Japan; 35 • 32 N, 134 • 13 E), during the winter-spring seasons of 2016-17 (S1) and 2017-18 (S2), in an augmented complete block design with three checks selected randomly. We measured 11 morpho-physiological traits: flag leaf length (FLL), flag leaf width (FLW), spike length (SPL), spike width (SPW), seed number per spike (SN/SP), spike weight (SPWg), days to heading (DH), biomass (Bio), normalized difference vegetative index (NDVI), canopy temperature (CT), and chlorophyll content (SPAD).

Statistical Analysis of Agronomic Traits
ANOVA was conducted in Plant Breeding Tools (PBTools) v. 1.4 software (International Rice Research Institute, http://bbi.irri.org/products). Using genetic variance (V g ) and environmental variance (V e ), we calculated broad-sense heritability [H 2 = V g /(V g + V e )] of each trait [39]. Because genotype × season interactions were significant, we estimated best linear unbiased predictions (BLUPs) for each trait. We used BLUP data for trait correlation analysis in TauL1, TauL2, and all accessions in SPSS v. 25 software [40].

Genotyping and Marker-Trait Association (MTA) Analysis
Genomic DNA was extracted from young leaves by using the CTAB method [41]. The DNA samples (30 µL; 50-100 ng µL −1 ) were sent to Diversity Arrays Technology Pty Ltd., Australia (http://www.diversityarrays.com), for a whole-genome scan on the DArTseq platform (DArT P/L, Canberra, Australia)). DArTseq is a genotyping-by-sequencing method which utilizes a Next-Generation Sequencing approach to sequence the most informative representations of genomic DNA samples to aid marker discovery. In total, DArTseq generates 59,193 silico and 55,390 SNP markers. We selected the markers with a call rate of 90% (10% missing data) and obtained 3117 SNP and 47,072 Sillco markers. The Fisher exact test was applied to determine if the two alleles were independent SNP markers. Single nucleotide polymorphisms (SNPs) or Silico DArT markers with a minor allele frequency of <5% were removed from the analysis. The remaining 34,829 SNPs and Silico DArT markers were used for genomic analysis.
We performed GWAS with BLUP values for each phenotype using a Mixed Linear Model (MLM) in TASSEL v. 5 software [42]. For all traits, the Bonferroni-Holm correction for multiple testing (α = 0.05) was too stringent. Thus, markers with an adjusted -log10 (p-value) ≥ 4.0 were regarded as significant. To search for candidate genes, we performed a BLAST search of the sequence of each significant marker against the Chinese Spring RefSeq v. 1.0 wheat reference genome (IWGSC 2020). The position where the tag hit the best match was extended by 0.5 Mb in both directions, and that sequence was then used in a BLAST search of the Ensembl T. aestivum database ( http://plants.ensembl.org/Triticum_aestivum/Info/Index) to find predicted genes or proteins within this region. To study and validate the usefulness of the MTAs revealed in Ae. tauschii to wheat breeding, we compared it with previous MTAs revealed in bread wheat.

Conclusions
We conducted GWAS analysis of morpho-physiological traits in a diverse panel of Ae. tauschii accessions and identified several MTAs and corresponding candidate genes. Some of the candidate genes had exact functions related to the trait studied. Morphological traits are more stable and less affected by environmental factors than physiological traits. GWAS analysis revealed that morphological traits had higher number of MTAs compared to physiological traits (Tables 5 and 6). This facilitates the use of morphological trait selection in wheat breeding through marker-assisted selection. Comparing our findings with other studies in wheat suggested that some of the MTAs and genes identified here are not present in bread wheat. Our results reveal some of the hidden diversity in Ae. tauschii and provide a basis for its use in wheat breeding through direct and indirect crossing [43].
The information presented here could also help explain the mechanisms controlling the morpho-physiological traits in Ae. tauschii, which will pave the way to a better understanding of the mechanisms in bread wheat. Multiple-synthetic-derivative wheat lines incorporate a wide range of genetic diversity of Ae. tauschii were developed, and heat-and drought-resistant lines were identified through the use of such lines [11,44,45]. These facts support the indispensable role of the D-genome of Ae. tauschii in wheat breeding for high productivity and stress adaptation.