Distribution, Genetic Diversity and Population Structure of Aegilops tauschii Coss. in Major Wheat-Growing Regions in China

Abstract: Aegilops tauschii Coss. is a noxious grass weed that seriously affects wheat quality and yield. To investigate its present occurrence in wheat fields and its genetic diversity in China, a field survey covering the major wheat production regions was conducted during 2017–2019. Seeds of the Ae. tauschii populations collected during the survey were analyzed with Simple Sequence Repeat (SSR) markers. Ae. tauschii occurred in every province surveyed, with occurrence frequency ranging from 0.91% in Sichuan Province to 92.85% in Henan Province. Eighty alleles, ranging in size from 98 bp to 277 bp, were detected in the 192 collected Ae. tauschii populations with 17 SSR markers. In this study, Ae. tauschii exhibited a moderately high level of genetic diversity, high differentiation, deficient heterozygosity and limited gene flow. Compared with other provinces, Hubei populations possessed relatively low genetic diversity. Dendrogram analysis showed that genetic distance did not seem to be related to geographic distribution. Additionally, STRUCTURE analysis suggested that Ae. tauschii populations in wheat fields of China can be divided into three groups, which was further supported by cluster analysis. Only 7% of the total variation was detected among the three groups, whereas the majority of the variation (67%) occurred among populations within the same group. This information will help to better understand the population relationships and spread of Ae. tauschii in China and will provide a new perspective for its integrated management.


Introduction
Aegilops tauschii Coss., an alien invasive species, has become widely distributed in the major wheat production areas of China in recent years. Ae. tauschii was reported to occur in eight wheat-growing provinces of China by 2007, with a damaged area of approximately 3.3 × 10⁵ ha [1]. As a member of the Poaceae family, Ae. tauschii is a self-pollinated, annual diploid plant [2]. Ae. tauschii is extremely prolific: an individual plant can produce 10-40 tillers, with 100-800 seeds, on average [1,3]. Usually, Ae. tauschii emerges from September to November and matures from May to June of the following year, competing with wheat for sunlight, water and fertilizer [4]. It can cause a 50-80% wheat yield loss when its ear density reaches 457 m⁻² [1]. As the donor of the wheat D genome, Ae. tauschii shares a highly homologous genetic background with wheat, and only a few selective herbicides are available for this weed. In addition, the upper spikelets of Ae. tauschii are easily detached from the rachis and fall into the soil before reaching complete maturity, while the basal spikelets remain on the rachis and spread with wheat seeds. Knowledge of a weed's distribution is essential for its monitoring and management. Descriptive studies of the genetic diversity in weed populations can also be extremely important, because they provide essential background for further focused research, such as understanding the ability of weeds to adapt to different environments and the impact of herbicide selection on weed populations [5,6]. It is generally believed that Ae. tauschii originated from Azerbaijan and the southern coastal region of the Caspian Sea [7]. From its origin, it spread westward to central Syria through the valleys of mountainous southeastern Turkey and eastward to western China through the Kopet Dag Mountains of Turkmenistan [7]. Ae. tauschii was first spotted in Henan Province, central China, as early as 1955. Before the 1990s, Ae. tauschii only sporadically infested wheat in the provinces of Henan, Shanxi and Shaanxi [8]. Since the 2000s, it has escaped control and expanded rapidly into new habitats. However, the precise distribution and occurrence status of Ae. tauschii in wheat fields in China remain unclear.
Although the genetic diversity of Ae. tauschii in China has been described previously, the materials in those studies were limited to populations from the provinces of Henan, Shaanxi and Xinjiang [9,10]. Meanwhile, the distribution range of the species has been gradually expanding across the wheat fields in China, particularly in recent years. Thus, more populations from a wider range of locations are required to comprehensively evaluate its genetic diversity and understand its spread in China.
Plant phenotypes and biochemical or molecular markers can be used to evaluate genetic diversity. Molecular markers play an increasingly important role in estimating genetic distance, due to their predictable genetic basis and their ability to distinguish specific segments of DNA or their products. Furthermore, they are less susceptible to environmental fluctuations than morphological features [11]. Among them, Simple Sequence Repeats (SSRs) are tandemly repeated motifs of 1-6 nucleotides per unit that are widely spread throughout a genome. Compared with other molecular markers, SSRs are often preferred, due to their high polymorphism, reproducibility, high mutation rates and relative ease of scoring [12-14]. It is reported that the Ae. tauschii genome contains a large number of SSRs [15]. Recently, Yang et al. [5] used SSR markers to assess the genetic diversity of Commelina communis, a hard-to-control weed in summer crops. In other studies, microsatellite markers were also used to evaluate the genetic diversity of Olea europaea and Eleusine coracana L. [16,17].
Currently, the latest information regarding the occurrence and infestation of Ae. tauschii in China is not well known. It is necessary to comprehensively understand its genetic diversity and population structure. The aims of this study were to (i) ascertain the occurrence state and infestation of Ae. tauschii in wheat fields in China, (ii) assess its genetic diversity and population structure and (iii) clarify the genetic relationships among different populations.

Aegilops tauschii Coss. Field Survey in China
The survey was conducted in 12 major wheat production provinces, municipalities and autonomous regions (Anhui, Beijing, Hebei, Henan, Hubei, Jiangsu, Shaanxi, Shandong, Shanxi, Sichuan, Tianjin, Xinjiang; hereafter referred to as "province") during 2017-2019. The assessments in different provinces were carried out when wheat was at the ripening stage. Detailed survey time is shown in Table 1. County was considered as the basic survey unit, and all counties with wheat planting in each province were surveyed. The sampling locations were randomly selected within 10-km intervals, and within each location, three representative sites of 1 ha wheat field were identified. Sites were surveyed by two people walking in an inverted "W" pattern for approximately 200 m, and the presence or absence of Ae. tauschii was recorded. In addition, random countywide surveys were conducted by Plant Protection and Quarantine Stations of each county as supplement data of the local occurrence state of Ae. tauschii. The occurrence frequency of Ae. tauschii in each province = counties with Ae. tauschii infestation/the total counties surveyed in the province.
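The occurrence frequency defined above is a simple ratio, and a minimal Python sketch may make it concrete. The function name is illustrative only; the Hubei figures (2 infested counties among 63 surveyed) are taken from the survey results reported in this paper, and the result is expressed as a percentage.

```python
def occurrence_frequency(infested_counties: int, surveyed_counties: int) -> float:
    """Occurrence frequency (%) = infested counties / total counties surveyed."""
    if surveyed_counties <= 0:
        raise ValueError("surveyed_counties must be positive")
    if not 0 <= infested_counties <= surveyed_counties:
        raise ValueError("infested_counties must lie between 0 and surveyed_counties")
    return 100.0 * infested_counties / surveyed_counties


# Hubei Province: Ae. tauschii found in 2 of 63 surveyed counties.
print(round(occurrence_frequency(2, 63), 2))  # 3.17
```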

Collection of Aegilops tauschii Coss. Seeds
Surveyed sites with serious Ae. tauschii infestation were selected to estimate spike density. On each selected site, five 1-m² quadrats were randomly chosen, and the spikes in each were counted. Sites with an Ae. tauschii density over 100 spikes m⁻² were chosen for seed collection; seeds were randomly collected from at least 500 individual plants and pooled as a population. In total, 192 Ae. tauschii populations were obtained (Tables 1 and S1). In 2019, seeds of each population were planted in pots containing a 3:1 (v/v) mixture of soil and organic fertilizer. The pots were kept in a greenhouse (25/20 °C, day/night) with a 12-h photoperiod at 320 µmol m⁻² s⁻¹ and watered as necessary to maintain optimum soil moisture. Ten plants from each population were marked, and their leaf tissue was sampled and stored at −80 °C.
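The site selection rule can be sketched as follows. This assumes that the density of a site is the mean spike count of its five 1-m² quadrats, which the text does not state explicitly; the quadrat counts below are hypothetical.

```python
from statistics import mean

SPIKE_THRESHOLD = 100  # spikes per m^2, the collection threshold given in the text


def site_density(quadrat_counts):
    """Mean spike density (spikes per m^2) over five 1-m^2 quadrats (assumed rule)."""
    if len(quadrat_counts) != 5:
        raise ValueError("expected counts from exactly five quadrats")
    return mean(quadrat_counts)


def qualifies_for_collection(quadrat_counts):
    """True when the mean density exceeds the 100 spikes per m^2 threshold."""
    return site_density(quadrat_counts) > SPIKE_THRESHOLD


counts = [120, 95, 140, 110, 88]  # hypothetical counts for one site
print(site_density(counts), qualifies_for_collection(counts))  # 110.6 True
```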

DNA Extraction
One hundred milligrams of leaf tissue was ground into a fine powder in liquid nitrogen for each population. DNA was isolated using a Tiangen DNAsecure Plant Kit (catalogue no. DP 320; Tiangen Biotech Co., Ltd., Beijing, China). The quality of DNA was determined visually using 1% agarose gels. To ensure consistent quantities among the samples, the DNA solution was diluted to 50 ng µL⁻¹ after the concentration was measured using a Nanodrop 2000 (Thermo Scientific, Waltham, MA, USA).

SSR Primer Pairs Screening
More than 60 microsatellite primers isolated and mapped from the bread wheat D genome [18,19] were used for PCR amplification. In total, 17 primer pairs with specific amplicons were selected as candidate primers (Table 2). The PCR mixture contained 50 ng genomic DNA, 0.5 µL each of the forward and reverse primers (1.0 × 10⁻⁸ mol L⁻¹), 12.5 µL 2× Taq PCR Master Mix (catalogue no. KT 201; Tiangen, Beijing, China) and 10.5 µL ddH2O, in a total volume of 25 µL. Amplification began with a denaturation step of 4 min at 94 °C, followed by 32 cycles of 94 °C for 30 s, the annealing temperature of each primer for 30 s and 72 °C for 30 s, and a final extension at 72 °C for 40 min. PCR products were evaluated on 1.8% agarose gels.

SSR Detection
The selected candidate primers were then used to amplify all samples a second time. In this round, each forward primer was labeled with a fluorophore (HEX or FAM) at the 5′ end. PCR products with different fluorophores were mixed and analyzed on an ABI PRISM 3730xl DNA Sequencer with GS500 (Applied Biosystems, Foster City, CA, USA) as an internal size standard. Alleles at each locus were determined with GeneMarker (version 2.2.0).

Data Analyses
To generate a binary data matrix for further analysis, each allele was scored as 0 when absent and 1 when present. The effective number of alleles (Ne), the observed number of alleles (Na), the observed heterozygosity (Ho), the expected heterozygosity (He), Shannon's information index (I), the proportion of differentiation among populations (Fst), Nei's gene diversity (H) [20], Wright's fixation index (F = 1 − Ho/He) and the gene flow (Nm = [(1/Fst) − 1]/4) were calculated with the POPGENE 32-bit version program. Polymorphism information content (PIC) was calculated according to PIC_i = 1 − Σ_j P_ij², where P_ij was the frequency of the jth genotype at the ith locus. Pairwise genetic similarity among different populations was calculated based on the Dice similarity coefficient, and a dendrogram was then generated with the unweighted pair group method using arithmetic means (UPGMA) in NTsys 2.10e and MEGA 7.0 software (Tokyo, Japan). The population structure was analyzed under a Bayesian model using STRUCTURE 2.3.4 [21]. The K value was set from 2 to 10. Under an admixture model using correlated allele frequencies, each K value was run ten times independently. The burn-in period and the number of Markov chain Monte Carlo (MCMC) steps were set to 100,000 and 200,000 iterations, respectively. The most appropriate K value was determined with Structure Harvester. Analysis of molecular variance (AMOVA) was performed using GenAlex 6.1 to further investigate the population structure of Ae. tauschii. GenAlex 6.1 was also used to obtain pairwise Fst and Nm among the genetic groups identified by the population structure analysis.
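As an illustration of the per-locus statistics named above, the following Python sketch computes Ne, Nei's gene diversity H, Shannon's index I and PIC from a vector of allele frequencies. It uses the textbook definitions and assumes the simplified PIC form (one minus the sum of squared frequencies); POPGENE's own estimators may apply sample-size corrections that are not reproduced here. The function name and the example frequencies are hypothetical.

```python
import math


def locus_diversity(freqs):
    """Per-locus diversity statistics from frequencies that sum to 1."""
    if any(p < 0 for p in freqs) or abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("frequencies must be non-negative and sum to 1")
    sum_sq = sum(p * p for p in freqs)  # expected homozygosity
    return {
        "Ne": 1.0 / sum_sq,  # effective number of alleles
        "H": 1.0 - sum_sq,   # Nei's gene diversity
        "I": -sum(p * math.log(p) for p in freqs if p > 0),  # Shannon's index
        "PIC": 1.0 - sum_sq,  # simplified PIC (assumed form)
    }


stats = locus_diversity([0.5, 0.3, 0.2])  # hypothetical three-allele locus
print({name: round(value, 3) for name, value in stats.items()})
```

Under this simplified form, PIC and H coincide by construction.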

The Distribution and Occurrence of Aegilops tauschii Coss. in China
Although all the surveyed provinces had Ae. tauschii infestations, the incidence differed among provinces. Ae. tauschii was most commonly found in Henan, Hebei and Shandong provinces, with occurrence frequencies over 70%. In Henan Province, Ae. tauschii occurred in all counties except eight counties in Xinyang. No Ae. tauschii plants were found in 19 counties belonging to Yantai, Qingdao, Weihai and Rizhao in Shandong Province. In Hebei Province, infestations of Ae. tauschii in wheat fields were observed in all counties except 25 counties in Zhangjiakou, Chengde and Tangshan. In addition, in Shaanxi Province, Ae. tauschii infested wheat fields in over half of the counties (56.96%). By comparison, Shanxi, Jiangsu, Beijing and Tianjin exhibited relatively low occurrence frequencies of Ae. tauschii (10-40%). Among the 16 surveyed districts in Beijing, Ae. tauschii was only found in Haidian, Tongzhou, Fangshan and Daxing. In Shanxi Province, the weed was mainly found in counties in the southern part of the wheat-growing area, such as Hongtong, Huozhou and Linyi counties. By contrast, Hubei, Anhui, Sichuan and Xinjiang provinces showed very low infestations of Ae. tauschii (<10%). For example, in Hubei Province, Ae. tauschii was only found in Zaoyang and Xiangzhou counties among 63 surveyed counties (Table 1).

Polymorphism of the SSR Markers
In total, 80 polymorphic alleles were amplified using 17 SSR markers, with sizes varying from 98 bp to 277 bp (Table 2). Among the 192 Ae. tauschii populations, the observed number of alleles (Na) per locus ranged from 2 to 11, with an average of 4.706. The mean value of the effective number of alleles (Ne) was 1.684, varying from 1.011 to 3.534. The highest value of Shannon's information index (I) was 1.385, and the lowest value was 0.032. The expected heterozygosity (He) and observed heterozygosity (Ho) ranged from 0.010 to 0.719 (mean = 0.285) and from 0 to 1 (mean = 0.181), respectively. The average of Nei's gene diversity (H) was 0.284, with a range from 0.010 to 0.717. Polymorphic information content (PIC), with a mean of 0.284, varied from 0.010 to 0.717 (Table 3).
Wright's fixation index (F), a measure of heterozygote deficiency or excess, varied between −0.560 and 1.000, with an average of 0.695 per marker. Two SSR loci (Xgwm383-3D and Xgwm469-6D, F < 0) demonstrated excess heterozygosity, while the other 15 loci (F > 0) displayed deficient heterozygosity, so the 17 SSR markers overall revealed a heterozygote deficiency. The proportion of differentiation among populations (Fst) and gene flow (Nm) ranged from 0.220 to 1.000 (mean = 0.848) and 0.000 to 0.886 (mean = 0.117), respectively, indicating a high level of differentiation among populations and low gene flow in Ae. tauschii (Table 3).
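The two formulas given in the Data Analyses section, F = 1 − Ho/He and Nm = [(1/Fst) − 1]/4, can be checked against the reported extremes with a short Python sketch. The function names are illustrative, and the Ho and He values passed to `fixation_index` are hypothetical; the Fst values are the reported minimum and maximum.

```python
def fixation_index(ho: float, he: float) -> float:
    """Wright's fixation index, F = 1 - Ho/He."""
    if he <= 0:
        raise ValueError("He must be positive")
    return 1.0 - ho / he


def gene_flow(fst: float) -> float:
    """Gene flow estimated from Fst, Nm = [(1/Fst) - 1] / 4."""
    if not 0 < fst <= 1:
        raise ValueError("Fst must lie in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


print(round(gene_flow(0.220), 3))  # 0.886, the highest Nm reported
print(gene_flow(1.000))            # 0.0, the lowest Nm reported
print(fixation_index(0.0, 0.5))    # 1.0: no heterozygotes observed (hypothetical locus)
```

A negative F arises whenever Ho exceeds He, as for the two loci with excess heterozygosity.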

Comparative Genetic Diversity Analysis of Aegilops tauschii Coss. from Different Provinces
Ae. tauschii from different provinces displayed variable levels of genetic diversity (Table 3).

Cluster Analysis
Based on the Dice similarity coefficient, a dendrogram was constructed with UPGMA (Figure 2). The value of the Dice similarity coefficient ranged from 0.31 to 1.00. At a Dice similarity coefficient of 0.75, the 192 Ae. tauschii populations could be mainly divided into four clusters. Some clusters included populations from different provinces, while some populations from the same province were dispersed into different clusters. Cluster 1 contained most populations from Shandong and Hebei, approximately half of the populations from Henan and very few populations from Shaanxi. Hubei and Tianjin populations were scattered into this cluster as well. Cluster 2 included Jiangsu and Shanxi populations, together with the majority of Shaanxi populations and a few populations from Shandong and Hebei. Very few populations from Henan were also dispersed into cluster 2. The remaining populations from Henan and Shaanxi, as well as a few populations from Shandong and Hebei, formed cluster 3, and cluster 4 only contained the Xinjiang population (Table S1).
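For readers unfamiliar with the Dice coefficient, the sketch below computes it for two binary allele profiles (1 = allele present, 0 = absent), following the scoring scheme described in the Data Analyses section. The two six-allele profiles are hypothetical, and the function is an illustration rather than the NTsys implementation.

```python
def dice_similarity(profile_a, profile_b):
    """Dice coefficient 2a / (2a + b + c) for two binary allele profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    shared = sum(1 for x, y in zip(profile_a, profile_b) if x == 1 and y == 1)
    total = sum(profile_a) + sum(profile_b)  # equals 2a + b + c
    if total == 0:
        raise ValueError("at least one allele must be present")
    return 2.0 * shared / total


pop_a = [1, 0, 1, 1, 0, 1]  # hypothetical population profile
pop_b = [1, 1, 1, 0, 0, 1]  # hypothetical population profile
print(dice_similarity(pop_a, pop_b))  # 0.75
```

Here three alleles are shared and each profile carries four alleles, so the coefficient is 6/8 = 0.75, which happens to equal the cut-off used to define the four clusters.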

Population Structure Analysis
To illustrate the underlying population structure of the studied populations, the number of hypothetical groups for the 192 populations was evaluated by the Bayesian model-based clustering method. The structure simulation by Structure Harvester indicated that K = 3 is optimal (Figure 3A-D); in other words, it is reasonable to divide the given populations into three groups: group 1 (green), group 2 (red) and group 3 (purple) (Figure 3E). The result was supported by the UPGMA analysis, except that populations 13, 116, 166, 188 and 189, belonging to cluster 1, and populations 72 and 128, belonging to cluster 3, were assigned to group 2. Additionally, the Xinjiang population belonging to cluster 4 was included in group 2 as well.
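The text does not state which criterion was used to select K. One criterion reported by Structure Harvester is the ΔK statistic of Evanno et al., and the sketch below assumes that approach: ΔK is taken as the absolute second-order difference of the mean log-likelihood across K, divided by the standard deviation of the replicate log-likelihoods at K. All log-likelihood values are hypothetical, and only four K values with three replicates each are shown for brevity.

```python
from statistics import mean, stdev


def delta_k(ln_prob):
    """Delta K for every K that has a neighbor on both sides.

    ln_prob maps K to the list of log-likelihoods from replicate runs.
    """
    ks = sorted(ln_prob)
    avg = {k: mean(ln_prob[k]) for k in ks}
    result = {}
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        second_diff = abs(avg[next_k] - 2.0 * avg[k] + avg[prev_k])
        result[k] = second_diff / stdev(ln_prob[k])
    return result


runs = {  # hypothetical replicate log-likelihoods
    2: [-5000.0, -5010.0, -4990.0],
    3: [-4500.0, -4505.0, -4495.0],
    4: [-4450.0, -4480.0, -4420.0],
    5: [-4430.0, -4500.0, -4360.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

With these made-up values, ΔK peaks at K = 3. The statistic is undefined at the smallest and largest K tested, because both neighbors are needed.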
The AMOVA showed that variance among the three hypothetical groups accounted for 7% of the total variation. Most of the variation (67%) occurred among populations within the same group, and 26% was attributed to variation within populations (Table 4). Additionally, the lowest mean value of Fst and the highest mean value of Nm were detected between group 2 and group 3 (Table 5), indicating relatively low genetic differentiation and high gene flow between these two groups. However, the reverse was observed between group 1 and group 3.

Discussion
By 2007, Ae. tauschii was reported to be mainly distributed in Hebei, Shanxi, Chongqing, Shandong, Jiangsu, Henan, Inner Mongolia and Shaanxi provinces in China [1]. Our field survey showed that the grass weed has further spread into Anhui, Sichuan, Hubei, Xinjiang, Beijing and Tianjin in recent years. The varied occurrence frequency of Ae. tauschii in the surveyed provinces indicated different infestation levels across wheat fields in China. Thus, effective control of Ae. tauschii is urgently needed. The survey was conducted during 2017-2019, and Ae. tauschii spreads easily across wheat fields, owing to the limited number of effective herbicides against it, its broad ecological adaptability and its high propagation coefficient [3,22]. Therefore, the present occurrence frequency of Ae. tauschii may be higher than our survey results.
Given the expanding distribution ranges and serious infestations, the genetic diversity and population structure of Ae. tauschii from surveyed sites were studied to understand its population relationships and spreading across wheat fields in China. SSR markers are generally considered as effective tools to explain genetic diversity and population structure [16,17,23]. Among the strictly screened SSR markers, five markers were highly informative, with PIC > 0.5; three markers were reasonably informative, with 0.25 < PIC < 0.5; and nine markers were minimally informative, with PIC < 0.25, according to a criterion published previously [24]. The mean value of PIC per marker was between 0.25 and 0.5, indicating a reasonably discriminating and indicative property of the selected markers. Therefore, these markers are reliable in illuminating the population structure and assessing the genetic diversity of Ae. tauschii. In another study, however, the PIC value (0.55) for Ae. tauschii was higher than the present results, which may be explained by the different microsatellite markers and Ae. tauschii populations used in the two studies [10].
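The PIC criterion applied above can be written as a small classifier. The thresholds come from the text; how a value of exactly 0.25 or 0.5 is assigned is not specified there, so the boundary handling below is an assumption, and the function name is illustrative.

```python
def classify_pic(pic: float) -> str:
    """Informativeness class of a marker from its PIC value."""
    if not 0.0 <= pic <= 1.0:
        raise ValueError("PIC must lie between 0 and 1")
    if pic > 0.5:
        return "highly informative"
    if pic > 0.25:  # assumption: exactly 0.5 falls into this class
        return "reasonably informative"
    return "minimally informative"  # assumption: exactly 0.25 falls here


# Reported maximum, mean and minimum PIC values across the 17 markers.
for value in (0.717, 0.284, 0.010):
    print(value, classify_pic(value))
```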
Eighty alleles were collectively identified by the 17 SSR markers, with a mean of 4.706 alleles per marker. This observation was largely similar to previous studies [25-27]. Beyond that, the mean values of H and I implied moderately high genetic diversity in Ae. tauschii, which is consistent with the high level of genetic variance detected in other research [9,10]. Genetic diversity plays an instrumental role in the capability to survive in different environments, as well as in population evolution [16,28]. The high genetic diversity of Ae. tauschii is therefore likely to make it more difficult to manage. Furthermore, Hubei populations, with narrow genetic backgrounds, are closely related to each other. Genetic diversity partly depends on the species' evolutionary history [28]. The invasion of Ae. tauschii into wheat fields in Hubei Province was reported recently [3]. Therefore, it was assumed that the low genetic diversity of Hubei populations was related to the short history of infestation; alternatively, the small number of populations found in the province may also be responsible for this phenomenon.
In the dendrogram, genetic distance did not seem to be related to geographic distribution, since some populations, especially those collected from different regions of China, were not grouped in a way that exactly relied on geographic location in the cluster analysis. That may be attributed to the long-distance seed distribution mediated by harvesting activities [1].
Our findings demonstrated that Xinjiang populations showed a clear separation from Yellow River Basin populations (Henan and Shaanxi), which is consistent with previous studies [9,10,29,30]. Based on the cluster analysis, a closer genetic relationship was found among some Shandong, Hebei and Henan populations, which come from geographically adjacent provinces. Historically, Ae. tauschii was first found in Henan Province [8]. Given that, it was speculated that Shandong and Hebei populations were introduced from Henan. Fang et al. [31] reported that the transregional operation (from the south to the north of China) of combine harvesters could facilitate Ae. tauschii seed dispersal among different provinces. In the future, more attention should be paid to the spread of Ae. tauschii across wheat fields. For example, combine harvesters should be cleaned promptly during the wheat harvesting season so that they do not carry Ae. tauschii seeds. Generally, Shanxi and Shaanxi populations exhibited a distant genetic relationship with populations from other provinces. This might be ascribed to long-term evolution or to genetic changes associated with local climate conditions. However, a few Shandong and Hebei populations were clustered together with the above populations, which is probably related to wheat seed exchange among these provinces or to seed dispersal by the Yellow River to cities located downstream. Therefore, the purity of wheat seed and the quality of irrigation water should be monitored strictly. UPGMA and STRUCTURE analyses showed that Hubei and Tianjin populations were closely related to Shandong, Hebei and Henan populations. Ae. tauschii was found in wheat fields in Hubei and Tianjin only recently [3]. In view of the geographical locations of these provinces, it was assumed that Tianjin and Hubei populations were introduced from Hebei and Henan provinces, respectively, suggesting that the infestation of Ae. tauschii has gradually expanded to the north and south in China.
Cluster and admixture analyses based on the Bayesian model are regarded as methods to clarify a species' origin, to shed light on genetic relationships and to determine the gene flow among different populations [32,33]. STRUCTURE analysis suggested that the 192 Ae. tauschii populations in China could be divided into three groups, which was supported by UPGMA analysis. To some extent, most populations of Shandong and Hebei, which were mainly distributed in the east of the Ae. tauschii-infested areas, were grouped together. Group 2 was mainly composed of Shanxi and Shaanxi populations spreading west of the Yellow River Basin. Group 3 was a mixture of Shandong, Henan, Hebei and Shaanxi populations. Based on that, future work to evaluate Ae. tauschii control strategies should be conducted at sites representing the three groups in China. Some populations were composed of differently colored segments, indicating a certain degree of genetic admixture in the tested populations. This might have resulted from gene flow among the populations. Human activity-mediated seed spread appeared to accelerate gene recombination and movement among different populations.

Conclusions
Ae. tauschii was found in all provinces surveyed. Shandong, Henan and Hebei provinces had very high infestations of the weed. A moderately high level of genetic diversity, high differentiation, deficient heterozygosity and limited gene flow were discovered in this species. Closer genetic relationships were found among some geographically distant populations, which is most likely associated with long-distance seed dispersal. The STRUCTURE and UPGMA analyses grouped the 192 populations into three main groups: roughly, the first group was dominated by populations from the east of the Yellow River Basin, the second group involved populations mainly located in the west of the Yellow River Basin and the last group was a mixture of populations from Henan, Shaanxi, Hebei and Shandong. These results will be helpful for the selection of representative sites to evaluate and develop control techniques and to further establish diverse management strategies aimed at different geographic areas or genetic groups of Ae. tauschii.

Data Availability Statement:
The data presented in this study are available in Table S1: Supplementary material.dotx.