Genetic Diversity in Jatropha curcas L. Assessed with SSR and SNP Markers

Jatropha curcas L. (jatropha) is an undomesticated plant that has recently received great attention for its use in biofuel production, rehabilitation of wasteland, and rural development. Knowledge of genetic diversity and marker-trait associations is urgently needed for the design of breeding strategies. The main goal of this study was to assess the genetic structure and diversity of jatropha germplasm with co-dominant markers (Simple Sequence Repeats (SSR) and Single Nucleotide Polymorphisms (SNP)) in a diverse, worldwide germplasm panel of 70 accessions. We found a high level of homozygosity in the germplasm that does not correspond to the purely outcrossing mating system assumed to be present in jatropha. We hypothesize that the prevalent mating system of jatropha comprises a high level of self-fertilization and that the outcrossing rate is low. Genetic diversity in accessions from Central America and Mexico was higher than in accessions from Africa, Asia, and South America. We identified markers associated with the presence of phorbol esters. We think that the use of molecular markers in jatropha breeding will significantly accelerate the development of improved cultivars.


Introduction
Jatropha is an important undomesticated plant, which has received great attention in recent years for its use in biodiesel production, rehabilitation of wasteland, and rural development [1]. Jatropha is a small tree or large shrub, which belongs to the Euphorbiaceae family, and has a life expectancy of up to 50 years [2]. The plant has a high potential for greening and rehabilitation of wastelands, and the seeds have a high oil concentration with excellent quality for conversion into biodiesel [3]. Although various efforts have been made to develop jatropha as an industrial crop [4,5], the absence of improved cultivars and the lack of agronomic knowledge represent the main bottleneck that limits the full exploitation of this plant's potential [6].
Understanding the genetic structure (genetic diversity within and among individuals) of jatropha germplasm is of significant importance for establishing breeding strategies and designing breeding programs. In jatropha, the genetic diversity among individuals has been investigated with molecular markers in various studies [7]. A consensus seems to exist that the highest genetic diversity is present in germplasm originating from Central America and Mexico. The genetic diversity within individuals has been investigated with co-dominant markers [8,9], and the high level of homozygosity in the germplasm investigated differed significantly from the homozygosity level expected in outcrossing plants [10]. These results raise questions about the mating system of jatropha.
Jatropha is a monoecious plant with inflorescences bearing separate female and male flowers, and pollen transport by wind is limited because of the pollen's weight and adhesiveness [11]. There is evidence that insects play an important role in the pollination of jatropha [12]. In addition, a shift in opening time between male and female flowers has been reported. Together, these observations lead to the hypothesis that jatropha should have a high level of heterozygosity, as observed in other outcrossing crops [9]. The results of various studies reporting a high level of homozygosity in jatropha call for a revision of this hypothesis about the mating system. Recent results indicate that selfing within an inflorescence and/or within a plant occurs at a much higher frequency than outcrossing [13], and that the behavior of insect pollinators might influence the mating system significantly.
Phorbol esters are the principal chemical compounds that cause toxicity in the oil and seed cake, and they are not heat labile [14]. Non-toxic germplasm from Mexico exists, but its field performance in terms of grain yield is generally low in comparison with the toxic germplasm, as observed in multi-location scientific field experiments (personal communication). This indicates a clearly different adaptation pattern. Therefore, it is important to understand the genetic basis of phorbol ester production in order to design breeding strategies for developing high-yielding, non-toxic cultivars with a broader adaptation.
Molecular markers associated with phorbol ester production have been developed [7], but there is little information on the number of genes controlling the trait or on gene action (dominance level). To understand the genetic basis of phorbol ester production, it will be necessary to cross toxic and non-toxic germplasm, genotype the progeny with molecular markers, and measure the expression of the trait in the progeny. A recent study indicates that phorbol ester biosynthesis is monogenic and dominant [15].
The main goal of this study was to assess the genetic structure and diversity of a large panel of jatropha germplasm with co-dominant markers (Simple Sequence Repeats (SSR) and Single Nucleotide Polymorphisms (SNP)). Our objectives were to (i) examine the internal genetic structure within accessions; (ii) assess the polymorphism level among accessions; (iii) estimate the level of molecular variance among and within geographical regions (world regions and countries); and (iv) evaluate the potential of markers to classify accessions according to the presence of phorbol esters.

Experimental Section
The plant material comprised 70 accessions of jatropha from different geographical origins (Table 1). The number of plants included in the DNA sample of each accession is indicated as plants in pool (Table 1). Plants in pool are the progeny of a single mother tree. The complete material was examined with 54 SSR markers (100% public) selected to cover the eleven jatropha chromosomes. A subset of 16 accessions was examined with 120 SNP markers (100% proprietary). Genotyping of SSR and SNP markers was performed by Trait Genetics GmbH (Gatersleben, Germany). Phorbol ester determination was performed as described elsewhere [14]. The phorbol ester content ranged from 0.0 to 10.3 mg·g⁻¹.
Table 1. Internal genetic structure in accessions of Jatropha curcas L. analyzed with 54 SSR and 120 SNP markers. PE: phorbol esters (P = present; A = absent); Mda: percentage of markers with different alleles at a locus; Msa: percentage of markers with the same allele at a locus; Mis: percentage of missing marker data. Plants in pool refers to the number of plants contained in the DNA sample analyzed. SAM: South America; CNAM: Central and North America.

Statistical Analysis
For each accession we computed the percentage of markers for which (1) only one allele was present, (2) multiple alleles were present, and (3) data were missing. This was done separately for SSR and SNP markers. The polymorphic information content (PIC) was calculated for each marker as described by Botstein et al. [16]. We computed, for each marker, the percentage of accessions in which multiple alleles were present. For each marker allele, we computed the percentage of accessions in which it was present (allele proportion), regardless of whether other alleles of the same marker were present in the accession.
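The per-accession summary and the PIC calculation can be illustrated with a minimal Python sketch. It is not the code used in this study (the analysis was done in R). The function names and the data layout (a set of observed alleles per marker, with an empty set for missing data) are assumptions made for illustration. The sketch also assumes that percentages are taken over all markers and that an accession showing several alleles contributes them with equal weight. The PIC expression is the standard one attributed to Botstein et al.: PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j².

```python
from collections import Counter


def accession_summary(calls):
    """Percentages of markers with one allele (Msa), several alleles (Mda),
    and missing data (Mis) for one accession.

    calls: dict mapping marker name -> set of observed alleles
           (an empty set means missing data; hypothetical layout).
    """
    n = len(calls)
    single = sum(1 for a in calls.values() if len(a) == 1)
    multiple = sum(1 for a in calls.values() if len(a) > 1)
    missing = sum(1 for a in calls.values() if len(a) == 0)
    return {
        "Msa": 100.0 * single / n,
        "Mda": 100.0 * multiple / n,
        "Mis": 100.0 * missing / n,
    }


def allele_frequencies(marker_calls):
    """Allele frequencies for one marker across accessions.

    Assumption: each accession with data has total weight 1, split
    equally among the alleles it shows; missing accessions are skipped.
    """
    weights = Counter()
    scored = 0
    for alleles in marker_calls:
        if not alleles:
            continue
        scored += 1
        for allele in alleles:
            weights[allele] += 1.0 / len(alleles)
    return {allele: w / scored for allele, w in weights.items()}


def pic(freqs):
    """Polymorphic information content from allele frequencies summing to 1."""
    p = list(freqs.values())
    sum_sq = sum(x * x for x in p)
    cross = sum(
        2.0 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1.0 - sum_sq - cross
```

For a marker with two equally frequent alleles this gives PIC = 1 - 0.5 - 0.125 = 0.375, and a monomorphic marker gives PIC = 0.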
Based on the modified Rogers' distances (MRD) between the individuals, we performed a principal component analysis (PCA), a cluster analysis, and an analysis of molecular variance (AMOVA). The MRD was computed separately for SSR and SNP data, according to Reif et al. [17]. For samples in which multiple alleles were present, we assumed equal frequencies for all alleles. The AMOVA was used to partition molecular variance into variance between world regions, between countries within world regions, and between accessions within countries. All computations were performed in the R statistical software environment [18]. In particular, the R functions princomp and hclust were used for PCA and cluster analysis, respectively. The AMOVA was performed with the packages vegan [19] and pegas [20]. To facilitate visualization, accessions 49 to 70, which all cluster together and do not provide additional information, were omitted from the dendrogram.
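As an illustration of the distance measure, the following Python sketch computes the modified Rogers' distance between two accessions as MRD = sqrt( Σ_i Σ_j (p_ij - q_ij)² / (2m) ), where p_ij and q_ij are the frequencies of allele j at marker i in the two accessions and m is the number of markers. It is not the R code used in this study. The function names and data layout are hypothetical, and skipping markers that are missing in either accession is an assumption, since the text does not state how missing data were handled. The equal-frequency rule for accessions with multiple alleles follows the description above.

```python
from math import sqrt


def equal_allele_freqs(alleles):
    """Assign equal frequency to every allele observed in an accession."""
    return {allele: 1.0 / len(alleles) for allele in alleles}


def modified_rogers_distance(x, y):
    """Modified Rogers' distance between two accessions.

    x, y: dict mapping marker name -> set of observed alleles
          (empty set = missing data; hypothetical layout).
    Markers missing in either accession are skipped (assumption).
    """
    total = 0.0
    m = 0
    for marker in x.keys() & y.keys():
        if not x[marker] or not y[marker]:
            continue
        px = equal_allele_freqs(x[marker])
        py = equal_allele_freqs(y[marker])
        # Sum squared frequency differences over all alleles at this marker.
        for allele in set(px) | set(py):
            total += (px.get(allele, 0.0) - py.get(allele, 0.0)) ** 2
        m += 1
    if m == 0:
        raise ValueError("no marker scored in both accessions")
    return sqrt(total / (2.0 * m))


def distance_matrix(accessions):
    """Symmetric matrix of pairwise MRD values for a list of accessions."""
    n = len(accessions)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = modified_rogers_distance(accessions[i], accessions[j])
    return d
```

Two accessions fixed for different alleles at every marker have MRD = 1, and identical accessions have MRD = 0. A matrix of this kind is the sort of input that a hierarchical clustering routine, such as hclust in R, operates on.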

Results and Discussion
The percentage of markers with multiple alleles at a locus was low (Table 1). For the SSR markers, the highest percentage of markers with multiple alleles at a locus was found in accession 9 (Mexico), followed by accessions 30 (Guatemala), 47 (Mexico), 7 (Mexico), 27 (Colombia but originated from Salvador-Nicaragua), 29 (Guatemala) and 46 (Mexico). The SNP markers confirmed the different internal genetic structure of accessions 9 and 7.
The analysis of SSR markers resulted in an average of 2.5 alleles per locus, and 38 markers were polymorphic (Table 2). Three markers (M16, M34 and M53) showed multiple alleles in most accessions. However, the overall percentage of markers with multiple alleles was low (7.7%). A total of 25 SSR markers showed PIC values larger than 0.1. As with the SSR markers, the average number of accessions with multiple alleles per locus was very low in the SNP analysis (Table 3). The variance component analysis revealed that the highest proportion of the genetic variation is present among the world regions and among the accessions within a world region and country (Table 4). The variance components based on SSR and SNP marker data of the same accessions (N = 16) were similar. The variance components based on the complete set of 70 accessions with the SSR data showed, as expected, that most of the variation is among accessions within a world region and country.
The cluster analysis based on SNP marker data resulted in a grouping pattern that clearly distinguishes accessions according to the presence or absence of phorbol esters (Figure 1). The cluster analysis based on SSR marker data showed a similar pattern extended to a larger number of accessions (Figure 2). The principal component analysis of the SSR marker data showed a clear grouping pattern that distinguishes accessions according to their phorbol ester classification (Figure 3).
Knowledge of the internal genetic structure within and across accessions is of significant importance in breeding programs. The optimal design of a breeding strategy relies on accurate information about the internal genetic structure and polymorphism of the germplasm in the breeder's hands. The optimal breeding category (clone, line, population, hybrid) for jatropha will then be defined in a comprehensive manner by combining knowledge of genetic structure, polymorphism, heterosis level and the cost of producing the improved cultivars.
The low number of markers that showed multiple alleles at a locus indicates a high level of homozygosity in this germplasm panel of jatropha. Only seven accessions had more than 10% of markers with multiple alleles at a locus (Table 1). The high level of homozygosity is present in germplasm from all world regions. This result is not in agreement with the level of homozygosity expected in outcrossing crops, and it calls the assumed mating system of jatropha into question.
The monoecious character of jatropha, the different opening times of male and female flowers, and the need for insect pollinators to transport pollen suggest a potentially outcrossing mating system. However, molecular results with co-dominant markers indicate a high level of homozygosity. Considering the low genetic variation in germplasm from Africa, Asia, and South America, one might speculate that the high level of homozygosity is due to the lack of genetic variation available in those regions, caused by a bottleneck in the transfer of genetic diversity from Central America and Mexico. However, the high level of homozygosity was also observed in germplasm originating from Mexico, where the highest genetic diversity of jatropha is present. This indicates that the mating system of jatropha might generally result in a high level of homozygosity. Results from recent studies contradict the assumed purely outcrossing mating system in jatropha and point to a mixed mating system [13,21].
The clustering of accessions agreed fully with our expectation and confirms that the accessions from Central America and Mexico have a higher genetic diversity than the accessions from Africa, Asia, and South America (Figures 1 and 2). This information is important for breeders in establishing germplasm management concepts and in planning crosses to exploit heterosis and combine traits of economic importance.
In addition, the cluster analysis showed a clear grouping pattern of the accessions according to the presence of phorbol esters. The principal component analysis indicated that it is possible to classify accessions for the presence of phorbol esters with a high level of certainty (Figure 3). Therefore, we conclude that the molecular markers in our study are a suitable tool to classify accessions rapidly and cost-effectively. In addition to these markers, further published markers [15] could also be used to accelerate the introgression of non-toxicity into elite recurrent toxic genotypes.
The selection of the marker system to be implemented in breeding programs depends principally on cost. As the cost per marker data point is decreasing rapidly, the development of SNP arrays with thousands of marker data points for many samples seems to be the most probable future scenario. However, we have identified 38 polymorphic SSR markers with good potential for the classification of germplasm, and four SSR markers showed a perfect association with the presence of phorbol esters. Similarly, 14 SNP markers showed a perfect association with that trait. In addition, the analysis of those 14 SNP markers indicated that heterozygous marker genotypes corresponded to accessions with phorbol esters present. This agrees well with the dominant character of phorbol ester biosynthesis recently published [15].
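One way to screen markers for this kind of association can be sketched in Python as follows. The criterion used here is an assumption, because the text does not define perfect association: a marker counts as perfectly associated when no genotype (set of alleles) observed in accessions with phorbol esters present is also observed in accessions with phorbol esters absent. The function name and data layout are hypothetical.

```python
def perfectly_associated(genotypes, phenotype):
    """Check whether a marker separates phorbol ester classes without overlap.

    genotypes: dict mapping accession id -> set of alleles at one marker
               (empty set = missing data; hypothetical layout).
    phenotype: dict mapping accession id -> "P" (present) or "A" (absent).

    Assumed criterion: the genotypes seen in the P class and in the A class
    are disjoint, and both classes have at least one scored accession.
    """
    seen = {"P": set(), "A": set()}
    for accession, alleles in genotypes.items():
        if alleles and accession in phenotype:
            seen[phenotype[accession]].add(frozenset(alleles))
    return bool(seen["P"]) and bool(seen["A"]) and seen["P"].isdisjoint(seen["A"])


# Illustrative data only, not taken from this study.
example_genotypes = {1: {"G"}, 2: {"G", "T"}, 3: {"T"}, 4: {"T"}}
example_phenotype = {1: "P", 2: "P", 3: "A", 4: "A"}
print(perfectly_associated(example_genotypes, example_phenotype))  # True
```

Under this criterion a heterozygous genotype that occurs only in the P class does not break the association, which is the pattern expected for a dominant trait.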

Conclusions
We found a high level of homozygosity in a worldwide panel of accessions, and this result is not in accordance with the purely outcrossing mating system assumed to be present in jatropha. We hypothesize that the prevalent mating system of jatropha comprises a high level of self-fertilization and that the outcrossing rate is low. The genetic diversity in accessions from Central America and Mexico was higher than in accessions from Africa, Asia, and South America. Markers associated with the presence of phorbol esters were identified.
The increasing availability of molecular marker information in jatropha allows breeders to design optimal knowledge-based breeding strategies. A first interspecific genetic map [22] and a first intraspecific genetic map [15] have been published, and the positions of a limited number of SNP and SSR markers are known. However, the full exploitation of molecular tools for breeding will require a higher density of markers to allow association studies and genomic selection. In parallel with the efforts to increase marker density, accurate evaluation of field performance across a range of environments will also be important. We believe that an optimal breeding strategy in jatropha will combine the use of high-density marker data with accurate field performance data across a range of environments.

Figure 1. Cluster of 16 accessions of Jatropha curcas L. based on 120 SNP markers. Accession labels combine the accession identification number, world region, and phorbol ester presence. SAM: South America; CNAM: Central and North America. P: phorbol esters present; A: phorbol esters absent.

Figure 2. Cluster of 48 accessions of Jatropha curcas L. based on 54 SSR markers. Accession labels combine the accession identification number, world region, and phorbol ester presence. SAM: South America; CNAM: Central and North America. P: phorbol esters present; A: phorbol esters absent.

Figure 3. Biplot of the first two principal components for 48 accessions of Jatropha curcas L. based on 54 SSR markers.PE: phorbol esters.

Table 2. Allele proportion in 70 accessions of Jatropha curcas L. and polymorphic information content (PIC) of SSR markers. Ama: percentage of accessions with multiple alleles at a locus.

Table 3. Allele proportion in 16 accessions of Jatropha curcas L. and polymorphic information content (PIC) of SNP markers. Ama: percentage of accessions with multiple alleles at a locus.

Table 4. Variance components expressed as percentage of the total variance of genetic distances estimated with 54 SSR and 120 SNP markers in a genetically diverse panel of accessions of Jatropha curcas L. Df: degrees of freedom.