Unlocking the patterns of the Tunisian durum wheat landraces genetic structure based on phenotypic characterization in relation to farmer’s vernacular name

: During the 1970s, Tunisian durum wheat landraces were replaced progressively by modern cultivars. These landraces are nowadays maintained by smallholder farmers in some ecological niches and are threatened gradually by extinction resulting on the narrowing of the genetic diversity. This study aims to investigate patterns of phenotypic variability using twelve quantitative traits in a panel of 189 durum wheat landraces and seven checks, based on farmer’s population name attribution and genetic structure. Our results showed high phenotypic variability among and within landraces and checks for the majority of the studied traits. The principal components analysis showed similar grouping using farmers name attribution and genetic structure using K = 6, which confirmed the identification of a new gene pool in the oases of Tunisia, represented by the sub-population Jenah Zarzoura and the robustness and high relationships between phenotypic and genome wide genetic structure using DArTseq method. These findings will enhance the conservation efforts of these landraces and their use in breeding efforts at national and international levels to adapt to dry conditions. 60%) and low (< 30%). In our study, we showed high heritability for almost all the studied agro - morphological and phenological traits (H2 >60%), except for


Introduction
Tetraploid wheats are among the first crops that were domesticated in the Fertile Crescent around 10.000 years ago [1,2], the period that coincided to the human civilization emergence, marked by several events such as the development of agricultural practices and the shift from the hunter-gatherer to a sedentary and cultivator life style [3,4]. Domestication of durum wheat from its wild progenitor have undergone a series of genetic modifications, known as "the domestication syndrome" [5], involving changes in some morpho-physiological key traits through natural and ancient farmer's selection, such as non-shattering seeds, non-brittle rachis, bigger seed size, seed dormancy reduction, reproductive strategy [4,5,6,7]; which enabled easier harvest and use.
The tetraploid wheat spread is associated with human migration, from the Fertile Crescent towards remote geographical regions and continents to reach North Africa. The oldest described way of this migration is terrestrial, which started from Egypt and continued south to Sudan and Ethiopia and north to eastern Libya [8]. The second route is maritime, first from Greece and Crete to Libya and then from the Sicilian peninsula to reach the coasts of Tunisia, Algeria and Morocco [9]. The process of domestication and genetic evolution of the tetraploid wheat in the Mediterranean Basin was strongly influenced by environmental conditions and different farmers' strategies selection for desirable agronomic and end use traits, which induced the development of diverse and well adapted durum wheat landraces to their agro-ecological zones of origin [10].
Several studies have described a wide genetic diversity in durum wheat populations from different geographical origins, based on qualitative and quantitative traits [11,12,13]. Particularly, the Mediterranean landraces showed higher level of polymorphism and allelic richness for some quality traits compared to those from Southwest Europe and Southwest Asia [14,15]. The West Mediterranean landraces have shown their resistance to drought [16,17] and diseases [18], their phenotypic plasticity and their adaptability to harsh environmental conditions and low in-put farmers agro-systems [20,21]. However, these landraces including the Tunisian ones are continually lost, due to farmer's adoption of new high yielding and homogenous cultivars released since the period of the Green Revolution in 1970s, which resulted in narrowing the genetic diversity of durum wheat [10,22] . Boeuf [23] has described a large number of Tunisian durum wheat landraces, which have been crossed later since the 1970s with foreign landraces and have given rise to a large diversity of local populations [24], which erected Tunisia within the secondary centers of diversity for durum wheat [25]. Nowadays, only Tunisian smallholder farmers under traditional farming systems maintain some of these landraces with traditional agricultural practices and methods of conservation, in some ecological niches, to meet their food needs. The vernacular name of the Tunisian local traditional varieties was attributed by traditional farmers based on the color and shape of the spike as well as the geographical origin or the name of the oldest maintainers to recognize their innovative role [24].
Farmer's management of these landraces has shaped the genetic diversity between and within landraces and their genetic structure. Several studies have reported high phenotypic diversity among the Tunisian durum wheat germplasm through evaluating agronomic and adaptative traits [19,26], and through phenotypic characterization based on the international standards descriptors of the International Plant Genetic Resources Institute (IPGRI) and the International Union for the Protection of New Varieties of Plants (UPOV) [26,27]. Besides the phenotypic diversity of durum wheat landraces, genetic population structure is also an important criterion to buffer the effects of climate change, and biotic and abiotic stresses. Several molecular tools were used to assess the population structure of durum wheat collections [28,29,30]. To date, few studies were done on Tunisian germplasm [31,32]. Kehel et al. [33] demonstrated a structure of the genetic diversity of Moroccan and Syrian durum wheat landraces in relation to their spatial distribution using Bayesian and Eigen approaches. Soriano et al. [34] described a relationship between phenotypic performance and genetic structure. Our previous studies also described different patterns and a genetic structure of 196 lines issued from durum wheat landraces, collected in the south of Tunisia, based on Diversity Array Technology sequencing (DArTseq) [31]. However, no investigations up to now have been done on the structure based on agro-morphological diversity among these landraces in relation to farmers' nomenclature and management. Towards better understanding of the genetic population structure of this collection, the present study aims to 1) assess the phenotypic diversity based on nine agro-morphologic and phenological traits evaluated across six environments, 2) unlock the genetic structure based on agro-morphological characterization related to farmers' vernacular name or/and to the genetic groups described on our previous study [31], and 3) characterize the newly identified Tunisian population Jenah Zarzoura.

Plant material
In this study, we used a collection of 189 lines issued from six Tunisian durum wheat landraces collected from the Center, the South and the Oases of Tunisia (Table 1): Bidi (30), Biskri (37), Jenah Zarzoura (30), Jenah Khoutifa (32), Mahmoudi (30) and Rommani (30), which seven lines of Biskri were identified in our previous work as local improved varieties [31] and will be used along with seven ICARDA elite lines as checks ( Table 2).

Field experimental trials
Field trials were conducted in two contrasted zones of Tunisia: i) Mornag locality, belonging to the semi-arid bioclimatic zone with soft winter (latitude: 36,63; longitude: 10,32), where the experiments were carried out under rainfed conditions during two cropping seasons 2015 and 2016 (MOR_RF), and ii) Kairouan locality belonging to the arid bioclimatic zone with cool winter (latitude: 35,62; longitude: 9,93), where the experiments were carried out under two controlled water regimes during two cropping seasons 2016 and 2017 (full irrigated regime (KAIR_IR) using 100% of the evapotranspiration (ETR) and stressed regime (KAIR_STR) using 25% of the evapotranspiration (ETR)). The 196 genotypes were sown at a seeding rate of 350 seeds / m2 in a two (2) m2 plot (with 4 rows of 2 m long and 0.25 m apart) and arranged respectively on 14×14 alpha lattice design with two replications. Optimal agronomic management including soil preparation, fertilization, weeding and diseases control were applied in each site.

Phenotypic characterization
Five randomly sampled plants from the two central rows of each plot were used to record five agro-morphological traits: Plant height (H; cm) measured from tillering node to the top of the spike (excluding awns) of the main stem at maturity stage, flag leaf area (FLA; cm2) estimated by a planimeter (AM 300 Field Portable Leaf Area Meter -Opti-Sciences), spike length (MLB; cm), awn length ( MLE, cm) and number of spikelet per spike (NEE) were scored based on wheat descriptors of the International Board for Plant Genetic Resources (IBPGR). The remaining grain morphological traits were recorded after harvesting the two central rows and analyzed in CIMMYT Wheat Chemistry and Quality laboratory, including thousand kernel weight (TKW; g) seed width (SW, mm), seed thickness (ST, mm), seed length (SL, mm) and seed total area (TA, mm2), which were measured by digital image analyzer (SeedCount SC5000, Next Instruments, Australia).
The phenological data were recorded according to Zadoks scale, on plot basis, for days to heading (HD, days) and days to maturity (DM, days).

Genetic Landraces patterns analysis
The panel of 196 lines was analyzed in a previous study [31] for genetic diversity and population structure using DArTseq method. Two outcomes were highlighted in this study. First, a population structure with number of groups K = 5, which was efficient to differentiate most of the landraces related to their Farmer's nomenclature. Second, on the basis of number of groups K = 6, the genetic structure showed a mixture of landraces lines separated from all other landraces and were considered as local improved varieties (MIX_VAR).
Therefore, the lines used in this study were categorized based on their names attributed by farmers, and to their respective groups K = 5 and K = 6 issued from the genetic population structure.

Statistical analysis
Phenotypic data from separate environments were analyzed based on a linear mixed model below: Where μ and r are the grand mean and replication effects considered to be as fixed. ICB and G are incomplete blocks within a replica and the genotypes effects considered to be random. ε are the residuals from the model assumed to be randomly and normally distributed around 0.
Broad sense Heritability (H2) of a trait was computed as follow [35]: Where Var (G) and Var (e) are genotypic and residual variances and n is the replication number. Percentage genotypic variance was computed the percentage of variance explained by the genotype from the total phenotypic variance.
Three other linear mixed models were fitted for the traits as below: Where μ and r are the grand mean and replication effects considered to be as fixed. ICB, G and Grp are incomplete blocks within a replica, the genotypes effects and the group effects considered to be random. The group effects were based on landrace's name, landraces grouping using K = 5 and K = 6 [31]. ε are the residuals from the model assumed to randomly and normally distributed around 0. Variation explained by the groups was computed as Var (Grp) / Total variance x 100 which reflects the reduction from the genotypic variance due to the group covariate in the model. All mixed modeling was done using AsReml-R 3.0 package [36,37] in R (R Core Team 2018).
Principal component analysis PCA using adjusted means values of all the traits across environments from the first mixed model was executed using pcaMethods package in R [38]. We used singular value decomposition (svd) and data was scaled and centered.
Finally, we tested statistically the significant difference between different group of landraces in a multi-trait level framework using the three different grouping: Name, genetic group assessed with K = 5 and with K = 6 by analysis of the partitioning sums of squares using dissimilarities [39].

Heritability of agro -morphological and phenological traits in Tunisian durum wheat germplasm collection
In this study, the broad sense heritability (H2) estimated for ten agronomical and two phenological traits across six environments, in a collection of 196 accessions, showed high values for almost all the traits (Table 3). Across the different environments, the highest values were observed for the two phenological traits, heading date (HD) and maturity date (MD), with a range from 0.94 to 0.98 and from 0.80 to 0.96, respectively.

Analysis of variance
The average percentage of variance explained by the genotypes (G) across the six environments (Figure 1), showed the lowest value for the total seed area trait (TA; 38%) compared to the highest one observed for heading date trait (HD; 91%). The genotypic variance (G) explained more than 50% of the total phenotypic variation for 10 out of 12 studied traits (Figure 1). The percentage variance explained by the farmer's population name (Name) as showed in figure 1 had a large contribution of the total phenotypic variation, which was more than 50% for 8 traits. However, we observed a different level of the genotypic variance reduction due to the farmer's name attribution. This reduction was low for TA (14%), ST (17%), TKW (13%) and MNEE (15%), but it was higher for FLA (25%), SL and HD (23%).
Based on our previous study which showed a genetic population structure through DAPC (discriminant analysis of principal components) method using DArTseq (Diversity array Technology sequencing), the collection of 196 accessions used in this study was structured into five sub-populations based on Bayesian Information Criterion (BIC) K = 5 and into six sub-populations as the optimal number of sub -populations K = 6 [31]. Landraces assignment to the genetic groups using the number of sub-populations K = 5, explained a slightly higher portion of the total phenotypic variation compared to that explained by farmer's name attribution. This increase was observed mainly for SL (11%), ST (5%) and SW (5%), followed by MNE (1%) to almost 0% for FLA. The observed explained variation for the eight traits, as that due to the farmer's name attribution, was almost conserved. The highest explained variation was observed for HD trait (70%), followed by MD and SL (64%), SW (62%) and H (51%) traits and the lowest one was seen for TA (27%), MNEE (30%) and FLA (31%) traits.
For landraces assignment to the genetic groups using the number of sub-populations K = 6, we noticed in figure 1 an increase of the explained variance compared to that explained by farmer's name attribution and also to that explained by the landraces assignment with K = 5. This increase ranged from 2 to 4 % for TKW, FLA, MLE, ST, SW and TA to 16% and 20% for H and HD, respectively.
A complementary analysis of partitioning sum of squares of all the traits used in this study, based on different landraces grouping levels: "Names" for farmers population name attribution, "K = 5 and K = 6" for sub -populations structured by Robbana et al. [31], revealed that all the grouping levels explained a large portion of the trait variance (> 64%) ( Table 4) and showed a highly significant differences (p = 0,001) among the subpopulations. The highest variance was explained by the genetic population structure assessed with K = 6 (76%), followed by the grouping level with K = 5 (65.06 %), in comparison to the one based on farmers population name attribution (64.81%). Our results showed a slight difference between the farmer's population name and the level grouping with K = 5.

Phenotypic characterization based on the genetic structure using the optimal number of sub-population (K = 6)
All studied traits showed high genetic correlations between environments (data non shown). Based on the genetic population structure using K= 6, means comparison across the six environments of all the traits between the seven sub-populations are presented in figure 2. It was confirmed that the additional group composed of mixture lines (MIX_VAR) included improved varieties, which presents the same variability and characteristics as the ICARDA elite lines, such as earliness for heading (HD) and maturity (MD) dates (around 110 and 160, respectively), short status with a plant height (H) around 100 cm, higher thousand kernel weight (TKW) with a mean around 40 g and number of spikelets / spike (MNEE) around 20. For almost all the studied traits, the boxplots analysis showed the presence of phenotypic diversity within the different landraces and improved varieties patterns. Compared to the other landraces, the sub-population Jenah Zarzoura possessed higher variability for most of the traits and distinctive characteristics. This landrace exhibited lower HD and MD than the five other landraces, but these traits were slightly higher than the mixture lines grouping (MIX_VAR) and the checks (Check). Jenah Zarzoura was shorter than the other landraces and taller than the mixture lines grouping (MIX_VAR) and the checks (Check). According to the morphometric traits related to the seed size, Jenah Zarzoura sub-population showed the lowest seed length (SL), seed thickness (ST) and seed width (SW).

Phenotypic structure and relationships among the Tunisian germplasm
Results of principal components analysis (PCA) based on mean of the ten agro-morphological and phenological traits across six environments showed that the first five principal eigen values explained cumulative variance respectively of 61, 76, 88, 94 and 95% from the total phenotypic variance. For depicting phenotypic structure and relationships among the different sub-populations, PCA biplots analysis using principal component 1 versus principal component 2 were able to discriminate between contrasting sub-populations and showed in figure 3A

Discussion
Understanding the genetic diversity, population structure and proper characterization of the Tunisian germplasm is essential for its better management in gene bank and for efficient use of superior lines in the breeding programs. Several studies showed the efficiency of the phenotypic characterization using the descriptors to assess the genetic diversity of different durum wheat collections. Other studies found that molecular markers from RFLP to high throughput technology using SNP and DArT were powerful tools for wheat genetic diversity and population structure studies [31,40 ,41]. However, Royo et al. [42] demonstrated weak relationships between genetic similarities using SSR markers among Mediterranean accessions and their overall agronomic traits performance across nine environments. In contrast, Soriano et al. [12] showed strong relationships between phenotypic and genotypic structures in a collection of Mediterranean durum wheat landraces and modern cultivars using as well agronomic traits and SSR markers. In the present study, we demonstrated that field phenotyping combined with high throughput DArTseq genotyping are important and complementary tools for an efficient assessment of the genetic diversity and population structure of a panel of 189 accessions collected from the Center, the South and the Oasis of Tunisia and seven checks.

Strong genetic effect of the agro -morphological and phenological traits in the Tunisian durum wheat collection
The broad sense heritability (H2) is an estimation value of the part of total variance due to the genotypic variability [43], which was classified by Johnson et al. [40] as high (>60%), medium (30-60%) and low (< 30%). In our study, we showed high heritability for almost all the studied agro -morphological and phenological traits (H2 >60%), except for the TA trait in two environments out of six, which could be due to the severe drought in Tunisia in 2016 that affected seed size (http://www.agriculture.tn/documents/opendata/specifiques/dgre/SITUATIONDEMAI2 016.pdf).
The highest heritability of the two phenological (HD and MD) and the plant height (H) traits were comparable to those reported in Ethiopian durum wheat farmers varieties and improved varieties evaluated in two zones in Northern Ethiopia for two years [45]. Our findings showed higher heritability of the agro-morphological traits, particularly for the thousand kernel weight (TKW) in comparison to those described using Ethiopian durum wheat landraces (H2 = 0.6) and a collection of Tunisian and Algerian durum germplasm (H2 = 0.53) [19,45]. Relatively similar heritability values to our findings were reported for the majority of the agronomic traits in a previous study using Chinese wheat landraces collected from different zones in China and evaluated for 23 agro-morphological traits in six environments [46]. However, they found lower heritability for the seed morphological traits with a range of 0.43 and 0.76 for seed width and seed length, respectively [46]. Our results allowed us to reveal using the present Tunisian germplasm collection that all the studied traits with high heritability across the six environments are highly influenced by genetic effects rather than environmental effects since no or low genotype by environment interaction was found in all studied traits (data not shown). According to Singh [47], high heritability values reflect a big correlation between genotypes and phenotypes based on low environmental contribution, and will enhance eventually the selection of superior genotypes with targeted agro-morphological or phenological traits which could be used in national and / or international breeding programs.

Phenotypic variability inferred from genetic patterns and importance of the population Jenah Zarzoura
Tunisia is considered among the secondary centers of durum wheat diversity and includes a large collection of landraces [23], which are described by several experts as a reservoir of useful genes with high allelic richness [10,24]. Previous studies reported high genetic diversity in several collections of Tunisian germplasm using agro-morphological traits [19,26]. Belhaj et al. [27] described high and different levels of genetic diversity in 930 accessions collected from different localities in the South of Tunisia using twentytwo qualitative and three quantitative traits. Recent study using SSR markers showed the importance of the Centre and the South of Tunisia in maintaining some valuable landraces, which are well adapted to low precipitations and agricultural inputs, and described a genetic stratification between the North, which is characterized by substantial precipitations, with a predominance of highly productive improved varieties developed since the 70s until our days, and the Centre / South of the country, which are characterized by erratic and more reduced precipitations, with the presence of old durum wheat varieties and landraces [32]. Nowadays, very little knowledge about Tunisian oases local populations richness is described and characterized, even though some studies showed a wide and interesting diversity of bread and durum wheat landraces in the Algerian Saharan oases [17], as well as in Libyan and Moroccan oases [48,49].
According to these previous findings, we focused on durum wheat landraces, which were collected in Tunisia from the Centre, the South and the Oases of Mareth. Our results showed high variability between genotypes (G) for almost all the traits across the six environments (percentage of variance > 50%), except for seed total area and number of spikelets per spike (percentage of variance < 40%). In comparison to other studies, our work showed higher variability than that reported for nine agro-morphologic and three phenological traits among Tunisian durum wheat genotypes with a range of 5.38 % for heading date to 24.07 % for thousand kernel weight [26], to that reported for four agronomical traits among Tunisian and Algerian germplasm, particularly for thousand kernel weight with a value of 9.45% [19] and to that showed for fourteen agronomic and phenological traits evaluated across six environments in a large collection of Mediterranean durum wheat landraces, with the largest variation registered for plant height (78.2%) [12]. The Percentage of variance of each trait showed that the genetic patterns based on the number of sub-populations K = 6 explained more the phenotypic variability than Farmer`s population nomenclature, and the partitioning sum of squares analysis of all the traits across the six studied environments indicated a highly significant phenotypic diversity among the present Tunisian landraces (> 64.80%). These results are in agreement with our previous findings using DArTseq genotyping [31], which reported an optimal number of sub -populations K = 6, allowing an appropriate classification of the lines and a good identification of the different landraces and improved varieties. However, a high genetic diversity among these landraces was described based on the number of sub -population K = 5 based on analysis of molecular variance (AMOVA) results.
In addition, our present observed results showed high phenotypic variability for almost all the traits for the different patterns of landraces and improved varieties (MIX_VAR and Check). Interestingly, the four landraces Bidi (BD), Jenah Khotifa (JFK), Biskri / Mahmoudi (BIS_MAH) and Rommani (ROM) show the same patterns of variability for the phenological traits, which are characterized as late booting and maturing landraces, as described by Deghaïs et al. [24]. Furthermore, these landraces share similar patterns of variability for the majority of the morphologic traits, as high plant height (H), big flag leaf (FLA) and high seed size (SW and ST). Deghaïs et al. [24] reported that Jenah Khotifa and Jenah Zarzoura are the same landrace based on the black colour of the spike and glumes. However, our work showed that Jenah Zarzoura (ZAR) is different and distant from all the landraces including Jenah Khotifa. This sub -population collected from the Tunisian oases is characterized by early booting and maturing dates (HD and MD, respectively), semi dwarf status (H) and low flag leaf area (FLA). These characteristics confirmed previous studies showing relationships between climatic factors at durum wheat landraces collecting sites and phenotypic variation [50], and high tolerance to drought, salinity and heat stresses in oases landraces [17,49], which reflects oases farmers selection for particular traits to ensure the growing season until harvest and to meet their needs under inadequate climatic conditions. Recently, studies related to seed morphometric traits demonstrated that seed width and more specifically seed thickness were associated to predict the domesticated status and could be applied for the archaeological identification [7]. This finding helped us to suggest the early introduction of Jenah Zarzoura comparing to the other landraces, based on the lowest values of seed width and thickness and support our previous findings that showed similarity between this subpopulation and two Jordanian landraces [31].

Phenotypic structure and relationships between the Tunisian durum wheat landraces
In the present study, we used ten agro-morphological and two phenological traits for understanding the genetic structure and relationships between the Tunisian durum wheat populations and improved varieties. Indeed, many combinations of phenotypic markers were shown efficient in several studies using different collections of durum wheat [11,26,51]. Both PCA performed on all the studied traits based on farmer´s landraces vernacular name attribution and the sub-population structured with DAPC using K = 6, confirmed previous findings, which reported the valuable farmer´s knowledge in distinguishing the local populations [48,52], and a clear differentiation between the improved varieties and the different local populations as described by Fiore et al. [51]. However, some mis-classified lines in each population were identified as improved varieties and others were assigned to their respective population. In addition, the phenotypic structure based on farmer´s landraces vernacular name attribution showed that Biskri and Mahmoudi landraces constitute the same group. Previous findings using a collection of Tunisian germplasm identified through PCA based on morpho-agronomical and phenological traits four groups and showed that Biskri, Mahmoudi and Jenah Khotifa landraces belonged to the same group, on the other hand Bidi landrace constituted a distant and separate group [26], which agreed partially to our results for Biskri and Mahmoudi landraces. In a contrary we demonstrated that Jenah Khotifa and Bidi land-races are very close and could not be differentiated very well. Interestingly, the PCA results showed that the oases landace Jenah Zarzoura is distant and different from all the other landraces, checks and improved varieties. This sub-population constitutes a new gene pool with particular characteristics. These results agreed with our previous study using DArTseq genotyping method for assessing the genetic population structure through DAPC with K = 6 of the same set of accessions, which explained the mixture lines origin in each sub-population was due to farmer ´s seed exchange and / or to the mis-naming of the landraces during the collecting missions, confirmed the identification of a new gene pool represented by Jenah Zarzoura landrace and demonstrated that Biskri and Mahmoudi landraces belong to the same group [31].

Conclusions
Farmers´ vernacular landraces name attribution, their seed maintenance and their traditional agricultural practices are valuable information for gene bank management, conservation and genetic diversity analysis studies [21,48]. Despite the overuse of homogenous, semi dwarf and high yielding varieties, and the loss of durum wheat genetic diversity, nowadays looking for new source of genetic variability from the existing collection of landraces became urgent to face the climate change and to ovoid the high agricultural input systems.
Based on the present study, we demonstrated using the Tunisian collection that both phenotypic markers and DArTseq genotyping showed similar structure of the genotypes and were equally efficient to show high genetic diversity among the landraces, large variability within them and allowed the differentiation and the characterization of different genepools, which constitute a valuable source of useful genes. Our work highlighted the importance of the Tunisian oases by including a new gene pool represented by Jenah Zarzoura landrace, which could be a valuable source of climate change associated adaptive genes to be used in breeding programs for developing new high yielding varieties for dry areas. This study supports then the need for collecting more the landraces mainly the ones in the oasis of Tunisia.
Phenotypic and molecular characterization in this study and our previous study confirmed the need of a clearer strategy when making plans of genotyping Genebank collections and using outcomes in linking phenotype to genotype. Genotyping and phenotyping one seed per a Genebank accession, most of the time not the same seed, do not allow a tight relationship between traits and alleles considering the levels of phenotypic and genetic diversity within an accession. Genotyping and phenotyping several lines per accession are resource consuming process and have a direct implication on conservation strategies within a Genebank.