Molecular Diversity within a Mediterranean and European Panel of Tetraploid Wheat ( T. turgidum subsp.) Landraces and Modern Germplasm Inferred Using a High-Density SNP Array

: High-density single-nucleotide polymorphism (SNP) molecular markers are widely used to assess the genetic variability of plant varieties and cultivars, which is nowadays recognized as an important source of well-adapted alleles for environmental stresses. In our study, the genetic diversity and population genetic structure of a collection of 265 accessions of eight tetraploid Triticum turgidum L. subspecies were investigated using 35,143 SNPs screened with a 35K Axiom®array. The neighbor-joining algorithm, discriminant analysis of principal components (DAPC), and the Bayesian model-based clustering algorithm implemented in STRUCTURE software revealed clusters in accordance with the taxonomic classiﬁcation, reﬂecting the evolutionary history of the Triticum turgidum L. subspecies and the phylogenetic relationships among them. Based on these results, a clear picture of the population structure within a collection of tetraploid wheats is given herein. Moreover, the genetic potential of landraces and wild relatives for the research of speciﬁc traits of interest is highlighted. This research provides a great contribution to future phenotyping and crossing activities. In particular, the recombination efﬁciency and gene selection programs aimed at developing durum wheat composite cross populations that are adapted to Mediterranean conditions could be improved.


Introduction
Wheat represents the third most important cereal grain and the most widely grown crop in the world [1]. Bread wheat (Triticum aestivum L.) and durum wheat (T. turgidum L. ssp. durum) are the two subspecies predominantly cultivated, used for bread-making or leavened products (cookies, cakes, and pizza) and for semolina products and pasta, respectively. In addition, both wheat species' byproducts are used for animal feed production.
While bread wheat (T. aestivum) is hexaploid (2n = 6x = 42 chromosomes, AABBDD genomes), durum wheat belongs to the T. turgidum tetraploid subspecies group (2n = 4x = 28 chromosomes, AABB genomes) which includes six other subspecies (Triticum carthlicum, Triticum dicoccum, Triticum dicoccoides, Triticum paleocolchicum, Triticum polonicum, and Triticum turgidum) rarely grown commercially [2,3]. Many studies based on cytological and molecular analysis ascribe tetraploid wheat's origin to two different evolutionary steps, which started around 10,000 years ago in the Fertile Crescent [4,5]. The first divergent evolution, of which the original progenitor is unknown, gave rise to diploid species including Triticum urartu (A genome), Aegilops tauschii (D genome), Hordeum vulgare (barley), and Secale cereale (rye) [6]. The second evolutionary process was a natural hybridization between T. urartu (the A genome donor) and an unknown Triticum species, often identified as Aegilops speltoides (the B genome donor); this created the wild emmer T. dicoccoides (2n = 4x = 28, BBAA genomes), the progenitor of durum wheat [7]. The history of durum evolution is the result of domestication starting from wild emmer genotypes and of a transition process from a naked emmer type to durum type [8]. Around 7000 years Before Present (BP), durum genotypes reached the Iberian Peninsula, followed by a rapid spread from the East to the West of the Mediterranean Basin [9,10]. Natural and human selection through thousands of years led to the establishment of wheat landraces characterized by strong adaption to the environmental conditions and cultivation practices of different geographic areas [11]. Local traditional farming communities contributed to the maintenance of these landraces that were characterized by different qualitative and quantitative traits until the first decades of the twentieth century [12].
At the beginning of the 20th century, breeders imposed a strong selection based on commercial purposes: local landrace cultivation was progressively abandoned and replaced with improved, widely adapted, and more productive semi-dwarf varieties, resulting in a reduced level of genetic diversity, especially compared to the wild ancestors [13][14][15]. Today, this lack of diversity is widely recognized as a limiting factor in the breeding of highyielding and stress-resistant varieties [16]. Moreover, under the current climate change events (irregular rainfall, high temperatures during the growing season, rainstorms, and drought) that negatively affect wheat cultivation, the development of new resilient varieties or composite cross populations (CCPs) adapted to different cultivation environments and low-input agriculture has become necessary [17][18][19]. Novel genetic diversity selected by breeders may be introduced into modern genotypes by the introgression of useful alleles from landraces, ancestors, or wild relatives through specific breeding programs [20][21][22]. Durum wheat landraces and other Turgidum subspecies usually show a lower yield when compared to modern varieties [11]; nevertheless, they exhibit reduced productive performance compared to elite germplasm (modern varieties), but their higher genetic variability could be useful, allowing them to cope with environmental stress conditions, and to increase resilience to climate change. They are thus a potential source of favorable alleles to improve grain yield or pest resistance and to give other favorable agronomic traits to new varieties [23,24].
Recent breeding programs have studied and assessed genetic variability or different germplasm panels using different research approaches [25][26][27][28][29][30][30][31][32]. Morphological and agronomical markers have been considerably used [25,26], with variable reproducibility depending on environmental conditions. Nevertheless, this has been overtaken with the use of molecular markers that guarantee the opportunity of studying wheat phenotypes, providing reproducible and environment-independent results [27]. Several DNA markers have been developed and largely used to assess genetic diversity in tetraploid wheats [28][29][30][31], but the high-density genome coverage provided in recent years by singlenucleotide polymorphism (SNP) markers has made them the best choice for wheat genetic analysis [32].
A few years ago, a novel plant breeding approach-evolutionary plant breeding (EB)relying on human selection acting on a heterogeneous population (i.e., CCPs) started to represent a valuable method for developing populations adaptable to different agricultural contexts [33,34]. Cultivation conditions can drive the selection of more adaptable genotypes that present increased fitness [35,36]. After several years of cultivation and multiplication in the same area under isolated conditions, these populations may reach equilibrium with stable yields, and the genetic diversity among such populations represents a trait resilient to climate and environmental stress [37].
In this study, we investigated the genetic diversity and population structure of a panel of 265 accessions from seven tetraploid T. turgidum subspecies originating from different Mediterranean and European areas using the 35K Wheat Breeders' Axiom®SNP array. This work will prove to be a groundwork for phenotypic analysis, both in the field and in the lab, aimed at identifying the best lines that could be used in a cross-breeding program for the selection of resilient and nutritionally improved wheat CCPs.
Seeds were sown in peat-based soil in single pots and maintained in a climatic chamber at 15 • C during the night and 25 • C during the day, with a cycle of 16 h light and 8 h dark. Six weeks after germination, leaf tissue (5-6 cm section of a true leaf) was harvested from plants, immediately frozen on liquid nitrogen, and then stored at −80 • C prior to nucleic acid extraction. All plants were then transplanted in the field and grown until maturity in order to collect seeds for single-seed line constitution to be used in future field studies.

DNA Extraction and Genotyping
Frozen leaf tissues were ground in a TissueLyzer II bead mill (Qiagen, Hilden, Germany), with the tissue and plastic adapter having previously been dipped into liquid nitrogen to avoid sample warming. Genomic DNA was extracted from the leaf powder using a standard cetyltrimethylammonium bromide (CTAB) protocol [38] and then treated with RNase-A (New England Biolabs UK Ltd., Hitchin, UK) according to the manufacturer's instructions. DNA was checked for quality and quantity by electrophoresis on 1% agarose gel and Qubit™ fluorimetric assay (Thermofisher), respectively. The 35K Ax-iom®Wheat breed Genotyping Array (Affymetrix, Santa Clara, US) was used to genotype 265 samples for 35,143 SNPs using the Affymetrix GeneTitan®system at Bristol Genomics Facility (Bristol, UK) according to the procedure described in Axiom®2.0 Assay Manual Workflow User Guide Rev3 (https://assets.thermofisher.com/TFS-Assets/LSG/manuals/70 2991_6-Axiom-2.0-96F-Man-WrkFlw-SPG.pdf (accessed on 3 December 2020)). This array contains a range of probes that are located on chromosomes belonging to the A, B, and D genomes [39]. Since in tetraploid wheat the D genome is lacking, the effective number of markers that can be investigated is lower, corresponding to 24,240 SNPs. Allele calling was carried out using the Axiom Analysis suite software [40], and a variant call rate threshold of 92% was used instead of the default value (97%) to account for the great heterogeneity of the set analyzed [41]. The number of monomorphic and polymorphic SNP markers, the heterozygosity level, and the types of nucleotide substitution for each accession were evaluated using the same software. Monomorphic SNP markers and those with missing data points were excluded from analysis. SNP markers were then filtered for minimum allele frequency (MAF) greater than 1% and failure rate lower than 20%.

Statistical Analysis
The levels and patterns of genetic diversity among accessions were investigated starting from the data obtained from SNP genotyping. The Tamura-Nei method [42] for genetic distance evaluation was applied to obtain a matrix of pairwise distances among accessions. An unrooted Bayesian tree was computed by applying the neighbor-joining algorithm [43], implemented in the ape 3.1 package of R software [44].
To obtain a clear picture of the genetic structure of the tetraploid wheat genotypes, we applied the Bayesian model-based clustering algorithm implemented in STRUCTURE software version 2.3.4 [45]. An admixtured and shared allele frequency model was used to determine the number of clusters (K), assumed to be in the range between 2 and 15, with five replicate runs for each assumed group. For each run, the initial burn-in period was set to 10,000 with 10,000 MCMC (Markov chain Monte Carlo) iterations, with no prior information on the origin of individuals. The best fit for the number of clusters, K, was determined using the Evanno method [46] as implemented in the program STRUCTURE HARVESTER [47]. Structure results were then elaborated using the R package pophelper to align cluster assignments across replicate analyses and produce visual representations of the cluster assignments. Discriminant analysis of principal components (DAPC) was used to infer the number of clusters of genetically related individuals [48] using the adegenet package in R-project [49]. The first step of DAPC was data transformation using principal component analysis (PCA), while the second step was discriminant analysis performed on the retained principal components (PCs). Groups were identified using k-means, a clustering algorithm that finds a given number (k) of groups maximizing the variation between them. k-means was run sequentially with increasing values of k to identify the optimal number of clusters, and different clustering solutions were compared using the Bayesian Information Criterion (BIC). The optimal clustering solution should present the lowest BIC [50].

Results
After SNP dataset filtering, 21,051 SNP markers were identified and used in the statistical analysis to evaluate the genetic diversity of the 265 tetraploid wheat accessions. The genetic relationships in the panel were assessed through three different approachesneighbor-joining tree, discriminant analysis of principal components (DAPC), and STRUC-TURE software-in order to better detail and define the genetic relationship variability among the tetraploid accessions.
The Bayesian tree obtained by applying the neighbor-joining algorithm revealed groups in the population that highly agreed with the subspecies classification and origin ( Figure 1A). Most of the T. turgidum ssp. durum (shown in yellow in Figure 1A) were placed in a large clade together, with modern varieties that appeared separated from the other accessions. Landraces and old varieties were distributed in branches close together, mostly according to their geographical origin, such as the Syrian and Sicilian accessions. Two other clusters were identified, consisting, respectively, of T. turgidum spp. dicoccon (shown in orange) and T. turgidum ssp. turgidum (blue), while T. turgidum spp. turanicum (brown) was clustered into two groups separated by the set of T. turgidum ssp. polonicum accessions (grey). The two T. turgidum ssp. paleocolchicum accessions (light blue) and their cross seemed to be close, while the few accessions belonging to T. turgidum carthlicum and dicoccoides ssp. appeared to be spread amongst the tree branches. The wheat genotype arrangement obtained with the Bayesian tree was subsequently confirmed by the DAPC results ( Figure 1B, Table S2). Seven clusters ( Figure 2) were detected in coincidence with the lowest Bayesian information criterion (BIC) value ( Figure S1), and 100 PCs (80% of variance conserved) from PCA were retained. As reported in Figure 1B, the Syrian T. turgidum spp. durum wheats were pooled in Group 5 and clustered separately in the genetic tree. Most of the old varieties and landraces of the same subspecies were collected in Group 3, while Group 4 was formed by approximately half of the T. turgidum spp. turanicum accessions, which belonged to the same genetic cluster in the tree. The remaining genotypes of this last subspecies were grouped together with T. turgidum spp. polonicum wheats which were also clustered in Group 2. Group 1 was entirely composed of T. turgidum ssp. diccocon accessions, while Group 7 identified the modern varieties of T. turgidum ssp. durum.
Moreover, the Bayesian tree and the DAPC analysis largely agreed with the accessions' geographic origins. In particular, Syrian (Cluster 5), French (part of the Cluster 7), Moroccan (Cluster 6), and Italian and Algerian (Cluster 3) wheats were almost entirely pooled within the same cluster. Iranian (Clusters 3 and 4) and Portuguese and American (Clusters 2 and 6) accessions were equally divided into two clusters. The optimum number of subpopulations, K, estimated using STRUCTURE software ( Figure 3, Table S2) and according to the Evanno method results was 7 (K = 7). This indicated the presence of seven subpopulations, as previously found by the Bayesian tree and DAPC analysis, although characterized mostly by different accessions.

Discussion
Genetic diversity represents the basis for crop improvement, providing plant breeders with the germplasm necessary to develop cultivars with adaptive traits and good quality characteristics [51]. To better target their crossing schemes, the genetic structure and variability of 265 tetraploid wheats accessions were assessed.
Clustering done via a Bayesian tree and clusters obtained via DAPC revealed a clear classification of genotypes in accordance with their geographical origin, strengthening the results of previous studies of phylogenetic relationships between cultivated wheats and their wild relatives [52,53].
Concerning T. turgidum ssp. durum accessions, which represented the largest number of genotypes in the panel, their first and second geographical origin centers-Syria and Ethiopia [54]-appeared to be clearly identified in Clusters 5 and 3, respectively. This result agreed with the molecular assessment by Kabbaj et al. [55] regarding a durum wheat collection of cultivars. More interestingly, the Bayesian tree highlighted the proximity between North African (Morocco, Algeria, and Tunisia) and Italian germplasm; this could be linked to the geographical expansion of Romans during the Imperial Period and consequent wheat genotype introduction and cultivation on the African continent, as suggested by Rickman [56].
In addition, the positions of the accessions "Ciceredda", "Bufala rossa lunga", "Bufala nera corta", and "Paola" on the Bayesian tree deserve attention: although they belong to Cluster 3, which grouped almost all the other T. turgidum ssp. durum genotypes, they were gathered in a distant cluster between T. turgidum turgidum and polonicum ssp. The proximity of these accessions could be due to a taxonomic problem, traceable thanks to work by De Cillis [57], which classified these accessions under T. turgidum turgidum ssp. turgidum.
Finally, another relevant observation on the T. turgidum ssp. durum accession arrangement concerns the low genetic variability detected in the modern Italian varieties, different from landraces and old varieties. Through the second half of the 20th century, national breeding programs aimed at increasing wheat yield started to establish new varieties characterized by small size, limited sprouting, reduced leaf area, and shorter crop cycle [58]. Due to genetic improvement only, De Vita et al. [59] confirmed in their work a 44% increase in productivity for the main varieties of durum wheat grown in Italy during the 20th century; however, this resulted in pure line selection and the development of varieties with low genetic variability [60]. Our study reflects this strong selection activity: Italian modern varieties were gathered in the same cluster ( Figure 1B) and along neighbor branches, highlighting genetic homogeneity.
On the contrary, the subspecies dicoccon showed the highest genetic variability, as Laidò [61] et al. verified in their research, confirming this wild germplasm as a powerful source of genes.
Today, the unpredictable climate, characterized by irregular rainfall and long dry periods, results in a rather unstable crop production. Under marginal environments, landraces and old varieties show higher stability in low-input agriculture [62,63]; thus, they could represent valuable genetic resources for breeders in order to develop new cultivars or CCP populations with specific qualitative traits such as resistance to biotic and abiotic stress, ability to efficiently use organic nitrogen and better nutritional qualities [64]. With this aim, our results showed the genetic diversity among accessions belonging to eight tetraploid wheat subspecies and identified the correct numbers of genotypes that explain the screened genetic variability well.

Conclusions
The genetic diversity of domesticated wheat accessions has been significantly reduced from that of their wild progenitors through a prolonged selection process for those phenotypic traits that better satisfy human needs. On the contrary, landraces' genetic variability represents a precious source of valuable agronomic traits that could be used for interspecific hybridization and for the introgression of genes and/or alleles into cultivated species. In our work, the genetic diversity and the population structure of 265 tetraploid wheats were investigated in order to understand the genetic relationships between domesticated wheats and their close wild relatives. The results obtained from this research could be used in future phenotyping studies in both field and laboratory tests to select the best lines to be intercrossed for the creation of improved and more resilient durum wheat CCP populations adapted to Mediterranean areas.
Supplementary Materials: The following are available online at https://www.mdpi.com/2073 -4395/11/3/414/s1. Figure S1. Statistical determination of the optimum number of clusters by discriminant analysis of principal components (DAPC). The elbow in the curve matches the smallest BIC and clearly indicates that seven cluster should be retained. Table S1. List of wheat accessions used in the experiment. Table S2. Clusters Membership for each accession defined with DAPC and Structure analyses. For Structure, cluster membership probability was reported for each accession.