Genome-Wide Genetic Diversity and Population Structure of Tunisian Durum Wheat Landraces Based on DArTseq Technology

Tunisia, being part of the secondary center of diversity for durum wheat, has rich unexploited landraces that are being continuously lost and replaced by high yielding modern cultivars. This study aimed to investigate the genetic diversity and population structure of 196 durum wheat lines issued from landraces collected from Tunisia using Diversity Array Technology sequencing (DArTseq) and to understand possible ways of introduction in comparing them to landraces from surrounding countries. A total of 16,148 polymorphic DArTseq markers covering equally the A and B genomes were effective to assess the genetic diversity and to classify the accessions. Cluster analysis and discriminant analysis of principal components (DAPC) allowed us to distinguish five distinct groups that matched well with the farmer’s variety nomenclature. Interestingly, Mahmoudi and Biskri landraces constitute the same gene pool while Jenah Zarzoura constitutes a completely different group. Analysis of molecular variance (AMOVA) showed that the genetic variation was among rather than within the landraces. DAPC analysis of the Tunisian, Mediterranean and West Asian landraces confirmed our previous population structure and showed a genetic similarity between the Tunisian and the North African landraces with the exception of Jenah Zarzoura being the most distant. The genomic characterization of the Tunisian collection will enhance their conservation and sustainable use.


Introduction
Durum wheat is the tenth most important crop worldwide, grown on 13 M ha with a production of 39.1 Mt in 2017 (International Grain Council, Grain Market reports, 2017) and is mainly used for pasta production. It is mainly grown in the countries around the Mediterranean basin and its cultivation has extended to India, Mexico, North America, and Russia [1,2]. North Africa grows around 2.5 M ha of durum wheat with around 1 M ha in Tunisia. Landraces are still grown under traditional farming systems and are highly appreciated for local dishes such as frikeh, burghul, couscous, but their acreage is reduced significantly with the large adoption of new cultivars released since the 1970s [3,4]. polymorphism information content (PIC) value of the DArTseq markers was 0.165, and the median was 0.105. The distribution of the majority of MAFs was between 0.05 and 0.15 ( Figure 1). The DArTseq markers are well distributed across all the 14 chromosomes of durum wheat genomes based on the consensus bread wheat genetic map obtained from the International Wheat Genome Sequencing Consortium database (https://urgi.versailles.inra.fr/download/iwgsc/). The distribution is almost equal between A (4201 markers) and B (4659 markers) genomes with the highest number of markers observed on the chromosomes 7A and 2B. However, 7324 markers are still unassigned to any of the chromosomes (Table 1). A similar range of genetic diversity values (He) is observed for both A and B genomes with an average of around 0.25. The maximum value of the expected heterozygosity (He) was observed at chromosome 6B (0.28) and the minimum value was seen for chromosomes 3A, 4A, 5A, and 5B (0. 24). For all chromosomes, the expected heterozygosity values (He) were higher than the observed heterozygosity values (H0).  The DArTseq markers are well distributed across all the 14 chromosomes of durum wheat genomes based on the consensus bread wheat genetic map obtained from the International Wheat Genome Sequencing Consortium database (https://urgi.versailles.inra.fr/download/iwgsc/). The distribution is almost equal between A (4201 markers) and B (4659 markers) genomes with the highest number of markers observed on the chromosomes 7A and 2B. However, 7324 markers are still unassigned to any of the chromosomes (Table 1). A similar range of genetic diversity values (H e ) is observed for both A and B genomes with an average of around 0.25. The maximum value of the expected heterozygosity (H e ) was observed at chromosome 6B (0.28) and the minimum value was seen for chromosomes 3A, 4A, 5A, and 5B (0. 24). For all chromosomes, the expected heterozygosity values (H e ) were higher than the observed heterozygosity values (H 0 ).

Genetic Distance and Clustering of the Tunisian Landraces
The allele sharing distance (ASD) among the 196 Tunisian landraces lines ranged from 0 to 0.79 with an average of 0.46 ( Figure 2). However, different distance patterns were observed for different landraces. Jenah Zarzoura shows the lowest distance with an average of 0.1 and the lowest variability among its lines ranging from 0 to 0. 16  Cluster analysis of the 196 lines derived from the six Tunisian landraces was performed using the allele sharing distance (ASD) method and the results allowed us to group them into five clusters and to identify lines wrongly assigned to some landraces. The first cluster comprised all the lines of Jenah Zarzoura (Zar). The second cluster contained mainly Bidi lines (Bid). The third cluster is subdivided into two sub-groups, one containing Rommani lines (Rom) and the other having a mixture of lines from other landraces (seven from Biskri (Bis), three from Jenah Khotifa (jkf), and two from Bidi (Bid)). Cluster 4 had the majority of Jenah Khotifa lines (jkf) but contained also four lines of Rommani, two of Biskri, and one of Mahmoudi. The last cluster constitutes the largest one and gathered almost all lines of Biskri (Bis) and Mahmoudi (Mah) (Figure 3). Cluster analysis of the 196 lines derived from the six Tunisian landraces was performed using the allele sharing distance (ASD) method and the results allowed us to group them into five clusters and to identify lines wrongly assigned to some landraces. The first cluster comprised all the lines of Jenah Zarzoura (Zar). The second cluster contained mainly Bidi lines (Bid). The third cluster is subdivided into two sub-groups, one containing Rommani lines (Rom) and the other having a mixture of lines from other landraces (seven from Biskri (Bis), three from Jenah Khotifa (jkf), and two from Bidi (Bid)). Cluster 4 had the majority of Jenah Khotifa lines (jkf) but contained also four lines of Rommani, two of Biskri, and one of Mahmoudi. The last cluster constitutes the largest one and gathered almost all lines of Biskri (Bis) and Mahmoudi (Mah) (Figure 3).

Population Structure of the Tunisian Landraces
The DAPC (discriminant analysis of principal components) results showed that 88 Principal Components (PCs) explain 82% of the total molecular variance. The optimum number of clusters was obtained with K = 6 using the Bayesian Information Criterion (BIC), which divided the lines into six sub-populations ( Figure 4). The first subdivision level using the hierarchical population structure K = 2 separated Jenah Zarzoura lines (Zar) clearly from all the lines of the other landraces ( Figure 5A). When K = 3, in addition to the separate sub-population of Jenah Zarzoura, two other sub-populations were identified, one combining most Mahmoudi (Mah) and Biskri (Bis) lines and the other included lines of landraces Bidi (Bid), Jenah Khotifa (jkf), and Rommani (Rom) ( Figure 5B). With K = 4, a separate sub-population containing Jenah Khotifa lines appeared ( Figure 5C); when K = 5, all the landraces were assigned to different sub-populations, except for Biskri and Mahmoudi, which remained grouped in the same sub-population ( Figure 5D). The number of sub-population K = 5 was then chosen to differentiate between different landraces and the sub-populations were given the following names: BID for Bidi, BIS+MAH for Biskri and Mahmoudi, JKF for Jenah Khotifa, ROM for Rommani, and ZAR for Jenah Zarzoura.
Increasing K to 6 resulted into the formation of an additional small group composed by 13 lines mainly of Biskri (7) and Jenah Khotifa (4) showing that these lines could be off-types within their respective landraces, which need further characterization ( Figure 5E). The global population structure image was maintained for the values of K equal to 7 and 8 with all different landraces assigned to different sub-populations except the landraces Biskri and Mahmoudi that are still

Population Structure of the Tunisian Landraces
The DAPC (discriminant analysis of principal components) results showed that 88 Principal Components (PCs) explain 82% of the total molecular variance. The optimum number of clusters was obtained with K = 6 using the Bayesian Information Criterion (BIC), which divided the lines into six sub-populations ( Figure 4). The first subdivision level using the hierarchical population structure K = 2 separated Jenah Zarzoura lines (Zar) clearly from all the lines of the other landraces ( Figure 5A). When K = 3, in addition to the separate sub-population of Jenah Zarzoura, two other sub-populations were identified, one combining most Mahmoudi (Mah) and Biskri (Bis) lines and the other included lines of landraces Bidi (Bid), Jenah Khotifa (jkf), and Rommani (Rom) ( Figure 5B). With K = 4, a separate sub-population containing Jenah Khotifa lines appeared ( Figure 5C); when K = 5, all the landraces were assigned to different sub-populations, except for Biskri and Mahmoudi, which remained grouped in the same sub-population ( Figure 5D). The number of sub-population K = 5 was then chosen to differentiate between different landraces and the sub-populations were given the following names: BID for Bidi, BIS+MAH for Biskri and Mahmoudi, JKF for Jenah Khotifa, ROM for Rommani, and ZAR for Jenah Zarzoura.
Increasing K to 6 resulted into the formation of an additional small group composed by 13 lines mainly of Biskri (7) and Jenah Khotifa (4) showing that these lines could be off-types within their respective landraces, which need further characterization ( Figure 5E). The global population structure image was maintained for the values of K equal to 7 and 8 with all different landraces assigned to different sub-populations except the landraces Biskri and Mahmoudi that are still grouped in the same sub-population ( Figure 5F).   The results of analysis of molecular variance (AMOVA) for K = 2 to K = 7 showed that the variance between populations was high for K = 2 (31.92%), and it decreased for K = 3 and K = 4 (28.75% and 31.29%, respectively) to increase and reach a plateau for K = 5 with a slightly higher value (31.77%). The variance between lines within populations decreased from 11.32% for K = 2 to reach values close to 0% for K = 5. Comparing AMOVA results between populations with K = 5 and the real groups using the landrace's name clearly support the hypothesis that K = 5 better differentiates genetically the lines than the characterization of landraces by name, resulting in higher variance between populations (31.77% for K = 5 and 28.79% for real groups), lower variance between lines   The results of analysis of molecular variance (AMOVA) for K = 2 to K = 7 showed that the variance between populations was high for K = 2 (31.92%), and it decreased for K = 3 and K = 4 (28.75% and 31.29%, respectively) to increase and reach a plateau for K = 5 with a slightly higher value (31.77%). The variance between lines within populations decreased from 11.32% for K = 2 to reach values close to 0% for K = 5. Comparing AMOVA results between populations with K = 5 and the real groups using the landrace's name clearly support the hypothesis that K = 5 better differentiates genetically the lines than the characterization of landraces by name, resulting in higher variance between populations (31.77% for K = 5 and 28.79% for real groups), lower variance between lines 68.23% for K = 5 and 70.39% for real groups, and lower variance between lines within populations ( Table 2). Thus, these findings indicated a higher genetic variation among rather than within Tunisian landraces. The results of analysis of molecular variance (AMOVA) for K = 2 to K = 7 showed that the variance between populations was high for K = 2 (31.92%), and it decreased for K = 3 and K = 4 (28.75% and 31.29%, respectively) to increase and reach a plateau for K = 5 with a slightly higher value (31.77%). The variance between lines within populations decreased from 11.32% for K = 2 to reach values close to 0% for K = 5. Comparing AMOVA results between populations with K = 5 and the real groups using the landrace's name clearly support the hypothesis that K = 5 better differentiates genetically the lines than the characterization of landraces by name, resulting in higher variance between populations (31.77% for K = 5 and 28.79% for real groups), lower variance between lines 68.23% for K = 5 and 70.39% for real groups, and lower variance between lines within populations (Table 2). Thus, these findings indicated a higher genetic variation among rather than within Tunisian landraces. Results from hierarchical AMOVA using hierarchical subdivision strata from K = 2 up to K = 5 indicated that most of the molecular variance was explained with the first level K = 2 separating Jenah Zarzoura from the other landraces (17.18%) and the fifth level K = 5 within the four groups (26.62%) separating Bidi, Rommani, and Jenah Khotifa and the large group constituted by Biskri and Mahmoudi. The remaining molecular variance (variance between lines) was also low compared to non-hierarchical AMOVA results (60.55%) using K = 5 (Table 3). Table 3. Hierarchical AMOVA results from K = 2 to K = 5 using hierarchical subdivision strata of 196 lines derived from the six durum wheat Tunisian landraces.

Subdivision Strata Variance Components Percentage
Variations between K = 2 271. 63 17. Finally, the resulted variance components from AMOVA analysis with K = 5 were significant when they were tested using permutations ( Figure 6). These results showed that the structure found by number of groups with K = 5 was valid and not a result of a random effect. Results from hierarchical AMOVA using hierarchical subdivision strata from K = 2 up to K = 5 indicated that most of the molecular variance was explained with the first level K = 2 separating Jenah Zarzoura from the other landraces (17.18%) and the fifth level K = 5 within the four groups (26.62%) separating Bidi, Rommani, and Jenah Khotifa and the large group constituted by Biskri and Mahmoudi. The remaining molecular variance (variance between lines) was also low compared to non-hierarchical AMOVA results (60.55%) using K = 5 (Table 3). Finally, the resulted variance components from AMOVA analysis with K = 5 were significant when they were tested using permutations ( Figure 6). These results showed that the structure found by number of groups with K = 5 was valid and not a result of a random effect.

Genetic Diversity and Genetic Distance between Tunisian Landraces
The genetic distance between landrace populations showed that Jenah Zarzoura sub-population is the farthest from all the other landraces, from Rommani (0.820), from Biskri and Mahmoudi (0.730),

Genetic Diversity and Genetic Distance between Tunisian Landraces
The genetic distance between landrace populations showed that Jenah Zarzoura sub-population is the farthest from all the other landraces, from Rommani (0.820), from Biskri and Mahmoudi (0.730), from Jenah Khotifa (0.682), and from Bidi (0.816) (Table 4). Furthermore, the Jenah Khotifa sub-population was equally distant to the landraces Bidi and Rommani (0.557). These results confirmed the population subdivision found by DAPC, which separated the Jenah Zarzoura sub-population from the other landraces when K = 2 and grouped the sub-populations Jenah Khotifa, Bidi, and Rommani when K = 3. Results of genetic diversity estimates in each sub-population obtained based on DAPC with K = 5 show that the highest genetic diversity was observed within the Jenah Zarzoura and Rommani populations (H e = 0.27). The lowest genetic diversity was observed within the Bidi population (H e = 0.12). Biskri and Mahmoudi forming the same group with 73 lines showed a moderate genetic diversity (H e = 0.23). Based on F ST values, Jenah Zarzoura showed a low F ST value of 0.05. Thus, this population presents a genetic drift compared to the others, having a fixation of alternate alleles with F ST values superior to 0.25 ( Table 5). The results from genetic diversity confirmed the results obtained using the allele sharing distance (ASD) between lines within the same population as shown in Figure 1.

Linking the Mis-Classified Lines to Local Landraces and Improved Varieties or/and to the ICARDA/CIMMYT Elite Lines
Cluster analysis based on the ASD method was performed using a set of 27 Tunisian durum wheat landrace populations, six local improved varieties, and seven ICARDA/CIMMYT inbred elite lines, along with lines which formed the additional sub-population when K = 6 from the previous set. This study aimed at showing the relationships among a larger number of Tunisian landraces, the differences with improved varieties, and germplasm and at shedding more light on the mis-classified lines included in the last sub-population where K = 6.

Comparison of Tunisian Landraces to Landraces from Mediterranean and West Asia Regions
The population structure of the Mediterranean and West Asian landraces along with four lines representing each of six Tunisian landraces included in Set 1 was assessed using the DAPC method. The optimum number of groupings is determined with K = 12 based on the BIC criterion using 66 PCs, which explained 82% of the total molecular variation. Sub-population 6 showed that the Tunisian landraces Rommani (Rom), Jenah Khotifa (jkf), and Bidi (Bid) were grouped with the majority of the Algerian landraces (8); Sub-population 12 included all Ethiopian landraces (11), along with one accession from Afghanistan and two accessions from Yemen; Sub-population 9 included the Tunisian landraces Biskri (Bis) and Mahmoudi (Mah) that were grouped with the remaining five landraces from Tunisia, with the majority of Algerian and Lebanese landraces (6), and with some Moroccan landraces (3). Sub-population 11 included exclusively the Tunisian landrace Jenah Zarzoura (Zar) with two accessions from Jordan ( Figure 8). The population structure results confirmed that the newly collected landrace Jenah Zarzoura constitutes a new gene pool and that the Tunisian landraces are genetically closer to North African landraces. PCs, which explained 82% of the total molecular variation. Sub-population 6 showed that the Tunisian landraces Rommani (Rom), Jenah Khotifa (jkf), and Bidi (Bid) were grouped with the majority of the Algerian landraces (8); Sub-population 12 included all Ethiopian landraces (11), along with one accession from Afghanistan and two accessions from Yemen; Sub-population 9 included the Tunisian landraces Biskri (Bis) and Mahmoudi (Mah) that were grouped with the remaining five landraces from Tunisia, with the majority of Algerian and Lebanese landraces (6), and with some Moroccan landraces (3). Sub-population 11 included exclusively the Tunisian landrace Jenah Zarzoura (Zar) with two accessions from Jordan ( Figure 8). The population structure results confirmed that the newly collected landrace Jenah Zarzoura constitutes a new gene pool and that the Tunisian landraces are genetically closer to North African landraces.

Discussion
The molecular markers techniques are important tools for better understanding genetic diversity, undertaking association mapping and ensuring efficient conservation and management of genetic resources. This study demonstrated the relevance of DArTseq technology as a reliable and cost-effective tool for assessing the diversity within and between landraces and for comparing them with other germplasm of durum wheat. This technique yielded a large number of polymorphic and informative markers equally spread in the A and B genomes allowing high coverage of the genomes of the Tunisian durum wheat germplasm compared to other molecular techniques used previously. Similar genomes coverage was found on a panel of 170 durum wheat entries [27]. The large coverage of the genomes can serve to undertake association mapping in the studied germplasm, including finding new allelic variations for major breeders sought traits such as QTLs found by other studies

Discussion
The molecular markers techniques are important tools for better understanding genetic diversity, undertaking association mapping and ensuring efficient conservation and management of genetic resources. This study demonstrated the relevance of DArTseq technology as a reliable and cost-effective tool for assessing the diversity within and between landraces and for comparing them with other germplasm of durum wheat. This technique yielded a large number of polymorphic and informative markers equally spread in the A and B genomes allowing high coverage of the genomes of the Tunisian durum wheat germplasm compared to other molecular techniques used previously. Similar genomes coverage was found on a panel of 170 durum wheat entries [27]. The large coverage of the genomes can serve to undertake association mapping in the studied germplasm, including finding new allelic variations for major breeders sought traits such as QTLs found by other studies in chromosomes 7A and 2B linked to protein content [36], gluten strength and yellow pigment [37,38], and salinity tolerance and yield components [39,40]. DArTseq technology along with other high throughput and genotyping by sequencing molecular techniques are increasingly used to study the genetic diversity of different crops as they allow to study the genetic diversity of large number entries and complex genomes [41,42].

DArTseq Polymorphism of Tunisian Durum Wheat Landraces
Polymorphic information content (PIC) values revealed by DArTseq markers averaged 0.165 with an asymmetric distribution skewed towards low values. The same distribution tendency was found using the same approach for 91 durum wheat landraces from the Fertile Crescent and for 138 wheat germplasm from Southwest China [32,41]. Ren et al. [43] have shown the same PIC value for North African durum accessions (0.168) using 946 SNP markers as part of a genetic diversity study in a worldwide durum wheat germplasm collection. Recent studies using DArTseq markers reported higher PIC values for worldwide durum wheat accessions (0.35) and for accessions originating from central Fertile Crescent (0.26) [27,32]. A previous study on 34 Tunisian durum wheat old varieties reported a PIC value of 0.68 [25]. This difference is explained by the bi-allelic nature of DArT markers for which the maximum value for PIC is 0.5 compared to multi-allelic SSR markers with the maximum PIC value of 1 [44].
The lines of the six Tunisian durum wheat landraces showed a moderate level of genetic diversity (H e = 0.25), which is higher than the one exhibited by a set of world-wide durum wheat collection (H e = 0.224) using SNP markers [43]. Our results confirmed previous findings showing that the Fertile Crescent and the eastern Mediterranean durum landraces are more diverse than those from the Western Mediterranean regions [45,46]. Although several studies revealed high genetic diversity in the Tunisian landraces based on phenotypic characterization [15,16], the moderate level of genetic diversity could be explained by the fact that the Tunisian durum wheat lines included in this study are derived from six landraces collected from a limited geographic area in the center and the south parts of Tunisia.

Genetic Diversity and Population Structure of the Tunisian Durum Wheat Landraces
Allele sharing distance using DArTseq markers allowed us to differentiate among the six landraces Jenah Zarzoura, Biskri, Jenah Khotifa, Mahmoudi, Bidi, and Rommani by showing different distance patterns. Two methods, clustering analysis based on ASD (allele sharing distance) and DAPC (discriminant analysis of principal components) were used for depicting the genetic relationship and structure of the Tunisian collection. The first method classified the panel into five groups matching mostly with the farmer's landraces names, with Jenah Zarzoura being the most distant and Mahmoudi and Biskri included in the same cluster. The closeness of Mahmoudi and Biskri is for a long time reported by Boeuf [11] based on glume and spike color, which could be due to the exchange of these landraces among farmers from Algeria and Tunisia [47]. The clear distinction of Jenah Zarzoura could be due to its confinement to a geographically limited area of the Mareth oasis or to a different pattern of introduction.
The use of the multivariate method DAPC to evaluate the population structure showed better performance and allowed for a population subdivision similar to other studies [24,48]. Several molecular approaches were used to assess the population structure in durum wheat landraces from SSRs to DArTs [49,50]. More recently, GBS-SNPs and DArTseq approaches were mainly used for hexaploid wheat population structure studies [28,51], and some reports have described a durum wheat population structure based on the DArTseq technique that allowed to classify the Turkish and Syrian durum wheat landraces in the same gene pool [32]. The DAPC analysis results were in concordance with those of clustering analysis, despite minor differences. Both showed a good fit between the grouping and the names of the varieties, which reflects the ability of farmers to differentiate among the landraces. However, a small group composed of a mixture of landraces appeared, and some lines of the differentiated landraces are included in different clusters, which could be due mainly to mis-naming of the landraces during the collecting missions and to possible mixtures in the landraces. These findings confirm the multiline nature of landraces that is also found through morphological and agronomic characterization [52]. This heterogeneity offers an important buffering capacity of landraces in drought-prone and fluctuating environments.
Hierarchical AMOVA analysis based on hierarchical subdivision strata from K = 2 up to K = 5 agreed with DAPC analysis results and supported the high level of molecular variance to K = 2 (17.18%) and to K = 5 (26.62%). AMOVA analysis results on the basis of the landraces' names indicated a higher genetic variation among (28.79%) rather than within (0.82%) Tunisian landraces. Taking in consideration the structure of the population based on the optimal number of grouping K = 5, AMOVA results showed an increase in percent of the explained genetic variance among landraces (~31.77%) and a decrease of the genetic variance within them. Soriano et al. [49] revealed much variability within sub-populations (83%) than between them (17%) and Mangini et al. [53] found higher genetic diversity within the two Italian durum wheat landraces "Bianchetta" and "Grano Ricco" (9.5 and 9.4%, respectively) and low genetic diversity within "Dauno III.".
This study allowed us to assess for the first time the genetic diversity within and among the Tunisian durum wheat landraces using the DArTseq technique, which allowed us to show different levels of genetic diversity within landraces. Jenah Zarzoura and Rommani populations showed the highest genetic diversity (H e = 0.27), and the Bidi landrace showed the lowest genetic diversity (H e = 0.12). The low genetic variation within the landraces could be explained by the selection from farmers for desirable traits and/or from environmental conditions pressure. Compared to the other landraces, the Jenah Zarzoura landrace showed a high expected heterozygosity (H e = 0.268) and a low fixation index (F ST = 0.05). Thus, this small differentiation could be explained by the confinement of this landrace to a specific environment in the oasis of Mareth, which reflects the geographic isolation of the oasis of Mareth, limited seed exchange, and selection pressure by farmers. Most often, farmers are selecting the best representative spikes from a landrace to form the seed lots.

Origin of Tunisian Durum Wheat Landraces
The cluster analysis extended to 27 other Tunisian landraces, improved varieties, and germplasm along with the lines included in the mixed cluster when K = 6 from Set 1 showed large genetic diversity among the germplasm studied. The six lines of landrace Biskri mis-classified in the additional sub-population when K = 6 in the Set 1 analysis were grouped with the modern cultivar Khiar, and the two mis-classified lines of Mahmoudi and Jenah Khotifa were grouped with the elite germplasm (MCHCB-102). These results confirm the possibility of mixture in some landraces, which could be due to seed exchange and threshing practices as suggested by other studies [52,54]. This extended genetic study confirms the uniqueness of the Jenah Zarzoura landrace and the classification of Jenah Khotifa, Rommani, and Bidi lines with their respective landraces, but not with the other populations with the same name. This could be due to the mis-naming of landraces during seed exchange among farmers or during the collecting missions undertaken by the genebank teams.
Tunisia is considered as a secondary center of diversity for durum wheat, and the introduction patterns of durum wheat landraces into Tunisia and North Africa are still under discussion. The DAPC analysis including landraces from Tunisia and landraces from the countries around the Mediterranean and West Asia regions allowed us to define 12 distinct groups, which can be used to highlight the relationships between Tunisian landraces and other landraces. Most Tunisian landraces held at the ICARDA genebank as well as the lines derived from landraces collected recently in Tunisia were grouped with landraces from North Africa neighboring countries and with landraces from Greece, Italy, and Lebanon. Jenah Zarzoura remained distinct from all landraces studied except for the two landraces from Jordan. These results suggest that most Tunisian landraces could be obtained through Lebanon via Greece and Italy, while Jenah Zarzoura was obtained through another introduction pathway. Previous reports have demonstrated two dispersal patterns of durum wheat in the Mediterranean Basin from the Fertile Crescent: over the north side via Turkey, Greece, and Italy and the south side via North Africa [55]. Moragues et al. [56] supported this hypothesis and classified a collection of durum wheat landraces originating from different Mediterranean countries in two groups: (i) landraces from the North and East Mediterranean basin and (ii) landraces from North Africa and the Iberian Peninsula. More recently, Soriano et al. [49] showed an eastern-western dispersal of the Mediterranean durum landraces, which have been classified into four sub-populations: (i) Eastern Mediterranean, (ii) the Eastern Balkans and Turkey, (iii) the Western Balkans, and (iv) Egypt and the Western Mediterranean. The grouping of landraces from North Africa with Italy and Greece was also confirmed by Olivera et al. [57] and Nazco et al. [58], and could be explained by the Roman influence on durum wheat cultivation in North Africa. The early development of Carthage trade maritime activity in the Mediterranean sea enhanced seed exchanges between Tunisia and the Mediterranean countries [59], which might explain the grouping of the Bidi, Jenah Khoutifa, and Rommani lines with the majority of Algerian landraces (Sub-population 6) and that of the Biskri and Mahmoudi lines with Lebanese and Moroccan landraces (Sub-population 9). Moreover, our work confirmed that Biskri and Mahmoudi lines constitute the same gene pool and that Jenah Zarzoura lines constitute a new gene pool distant from the other Tunisian and foreign landraces, which were grouped with only two accessions from Jordan (Sub-population 11). A possible explanation is that the Jenah Zarzoura population, which was collected from the oasis of Mareth, located in the south of Tunisia, near the Mediterranean Sea, might have been introduced from the east through different paths, probably from Egypt to neighboring countries and possibly received from Palestine and Jordan [60,61], or through the introduction by the Phoenicians coming from Lebanon to Carthage between the 9th and 2nd centuries B.C [59]. During the Roman period (7th to 3rd centuries B.C.), Tunisia became the breadbasket of the Italian peninsula and the source of the excellent semolina quality from durum wheat grown in North-African countries [62]. The landraces of Tunisia and North Africa have also be traced to the introductions by Romans, who contributed to the modernization of the irrigation systems and extended wheat cultivation to Southern Tunisia [59].

Implications on Conservation and Use of Genetic Resources
The results of this study can be used to direct future activities of collection, conservation, and the use of durum wheat genetic resources in Tunisia. In terms of adding new diversity to the existing collections, more collection is needed in Tunisia mainly in the oasis areas to collect different landraces like Jenah Zarzoura. Future studies of this kind of germplasm will shed more light on the specific nature of this germplasm as in the case of durum wheat germplasm from Ethiopia [24]. Additionally, more landraces from other regions, even if they have the same local names, should be collected and given different identifiers within the genebank database and considered as different accessions in the ex situ collection. When collecting, the team should avoid plants with characteristics of improved varieties to avoid mixtures. For ex situ conservation, the genebank in Tunisia should conserve the bulk seeds of each landrace instead of seeds of many individual lines constituting each landrace. This will reduce the cost of conservation and avoid conserving several copies of the same line. DArTseq markers have also allowed us to identify outlier lines within the landraces and can therefore be eliminated during multiplication and characterization. For landraces still prevailing under traditional farming systems and under harsh conditions, on-farm conservation could be promoted to conserve a larger genetic base and the associated local knowledge. The need for ensuring long-term conservation of Tunisian durum wheat landraces is dictated by the on-going genetic erosion due to the spread of newly released cultivars and by their richness in genes for tolerance to drought, heat, and salinity and their quality attributes for different end uses [63].
The seeds for Sets 1 and 2 were taken from single plant spikes of each line, landrace, and germplasm grown at the Mornag INRA-Tunisia experiment station during the 2014-2015 growing season, while the seeds for Set 3 were send to CIMMYT from the ICARDA genebank within the joint effort to genotype wheat genetic resources. Table 6. List and sample size of the Tunisian durum wheat landraces collected from the center, the south and the Oasis of Tunisia (Set 1).    Afghanistan  16  Cyprus  10  Algeria  14  Egypt  12  Ethiopia  11  Greece  18  Iran  12  Iraq  11  Israel  10  Italy  12  Jordan  11  Lebanon  10  Libya  9  Morocco  17  Syria  11  Tunisia 13 Yemen 10

Genotypic Characterization Using the DArtseq™ Method
Fresh young leaves from a single individual plant per accession have been used for genomic DNA extraction performed through a modified CTAB (cetyltrimethylammonium bromide) method [64]. The DNA quality was determined by electrophoresis using 1% agarose gel and quantified with NanoDrop 8000 spectrophotometer V 2.1.0.
A high-throughput genotyping method using DArTseq TM technology was employed to generate a genomic profile of the germplasm at the Genetic Analysis Service for Agriculture (SAGA) facility at CIMMYT, Mexico. This method used a combination of two restriction enzymes (PstI and HpaII) to reduce the genome complexity and to generate a genomic representation of the samples [31]. The genomic DNA was submitted to a process of digestion and ligation with a specific PstI-RE site-adapter tagged with 96 different barcodes, which allow to multiplex 96 DNA samples in a single lane of Illumina HiSeq2500 instrument (Illumina Inc., San Diego, CA, USA). The amplified fragments were sequenced up to 77 bases, generating around 500,000 unique reads per sample. A FASTQ files (full reads of 77 bp) were quality filtered using a Phred quality score of 30, which represents a 90% base call accuracy for at least 50% of the bases. An additional filter has been applied on barcode sequences. DArTsoft 14 was used to generate Silico-DArT score tables as data (1/0), indicating the presence/absence variation (PAV) markers and SNP markers.
DArTseq raw data was filtered according to markers criterion; minor allele frequency >5% and missing data ≤20%.
For cluster analysis of the collection, allele sharing distance matrix was computed as described by Goa et al. [66]. The distance between individuals i and j was defined as where L is the total number of markers, and dij(l) is 0, 1, or 2 if individuals i and j have zero, one, or two allele(s) in common at Locus l. Classification of the individuals into groups was performed using the allele sharing matrix and Ward's minimum variance algorithm [67]. The clustering algorithm was run using the hclust function within the R package [68].

Population Structure and Genetic Differentiation
Discriminant analysis of principal components (DAPC) was used to infer the number of clusters of genetically related individuals [48], using the adegenet package and popr in R-project [68]. DAPC is a multivariate analysis requiring three steps; first the data is transformed using principal component analysis, sub-groups are then identified using k-mean clustering, and discrimination between the sub-groups is then optimized using discriminant analysis. For the k-mean clustering, the optimal number of groups was identified using the Bayesian information criterion (BIC) as a measure of goodness of fit. The number of sub-groups (K) was set from 2 to 8 and the K-value with the lowest BIC was retained as the optimal number of clusters. A discriminant analysis was then implemented using the groups found by k-mean clustering [69].
For detecting the genetic variation among and within population(s) and supporting the hypothesis of the population structure, analysis of molecular variance (AMOVA) was performed for different hierarchical subdivision levels as well as for the full population structure strata from K = 2 to K = 8. Significance levels for variance component were estimated based on 10,000 permutations using the randtest function in the R-project as described by Excoffier et al. [70].
For genetic differentiation and relationships among the sub-populations issued from the population structure analysis, the genetic distance between the sub-populations using Reynolds genetic distance was computed [71], and the genetic indices, such as the observed heterozygosity (H o , the proportion of loci that are heterozygote for a population), the expected heterozygosity or genetic diversity (He, the fraction of all landraces which would be heterozygote for any randomly chosen locus), and the F-statistics (F ST ) as developed by Wright [72], were calculated.