Evaluation of Genetic Diversity Based on Microsatellites and Phytochemical Markers of Core Collection of Cymbopogon winterianus Jowitt Germplasm

Cymbopogon winterianus Jowitt is an industrially important crop due to its value in the aromatic, perfumery and pharmaceutical industries. In this study, 72 accessions of C. winterianus were selected for molecular diversity analysis using SSR markers. The analysis revealed a total of 65 polymorphic alleles, with an average polymorphism of 68.10%. The SSR primer best able to discriminate the germplasm was 3CM0506, with PIC (0.69), MI (0.69) and Rp (3.12). Genetic variation was studied between the Assam, Manipur, Meghalaya and Arunachal Pradesh populations. A dendrogram based on the Neighbour-Joining Method showed clustering of the germplasm according to collection site. A total of six relevant genetic populations were identified through a Structure Harvester analysis. Moreover, a dendrogram based on similarity, complete linkage and Euclidean distance was constructed, differentiating the genotypes with respect to the major phytochemical constituents of the essential oil. GC-FID and GC-MS analyses of the essential oil of the 72 germplasms revealed citronellal content from 2.58–51.45%, citronellol from 0.00–26.39% and geraniol from 0.00–41.15%. This is the first molecular diversity report on 72 accessions of C. winterianus collected from the NE region using 28 SSR primers, together with their diversity based on phytochemical markers. This diversity assessment will improve knowledge of the relationships among the individual accessions, supporting the development of improved cultivars rich in essential oil components.


Introduction
Genetic diversity analysis is a major step for developing plant breeding programmes [1]. A better understanding of the genetic relationship amongst plant species' germplasms, both cultivated and wild, is necessary to formulate strategies for the breeding, conservation and utilization of the genetic resources [2] of a crop's wild relatives, especially those with a primary gene pool [3].
One genus with profound genetic diversity is Cymbopogon, belonging to the Poaceae family, which contains approximately 140 species; its plants are heterogeneous owing to their cross-pollinating nature [1,4,5]. These species are generally distributed in tropical and subtropical regions of the world, with 45 species grown in India [1,6,7]. They are characterised by an aromatic essential oil bearing many bioactive compounds of high pharmaceutical importance and economic value [8].
One of the important species of this genus, Cymbopogon winterianus Jowitt, popularly known as Java citronella, is a perennial grass that propagates by vegetative means [7,9,10]. The crop is cultivated for the extraction of essential oil, which has aromatic and medicinal the species precisely [32]. This will help in the development of future strategies for the improvement and commercial cultivation of this industrially important plant species. The genome mapping of complex traits and the selection of suitable genes for breeding can also be studied with this molecular tool [43,44]. Reports on the genetic divergence of different Cymbopogon taxa were performed [36], but very scanty information is available on the C. winterianus germplasm.
Hence, the present study aimed to estimate the level of divergence existing among and within the populations of this species using SSR markers. The study of diversity based on phytochemical and molecular markers together is regarded as an appropriate measure to conserve and screen the elite genotypes for improvement of the crop and to enlarge the gene pool [45–48]. This would also help in the identification and crossing of parents with desirable traits, subdivided into clusters, for the development of a superior germplasm with more yield-related traits. Segregating populations would also help to increase the number of recombinant combinations in the gene pool. To the best of our knowledge, this is the first report on the phytochemical, as well as molecular, diversity of 72 accessions of C. winterianus.

Results and Discussions
All 72 accessions were submitted to the ICAR-National Bureau of Plant Genetic Resources (NBPGR), New Delhi, to obtain Indigenous Collection (IC) numbers, from IC-0626991 to IC-0627065, which are presented in Table 1. This helps in the conservation of genetic resources; the utilization of the germplasm for screening of elite genotypes; the development of superior genotypes by crossing or by genetic modification of the genes responsible for economic yield; and the transformation of genotypes with disease-resistance, stress-tolerance and pest-tolerance traits. It is the most effective way of conserving all genes present in the germplasm for a long duration, and the germplasm can be regenerated whenever required. This can serve as a resource house for essential and newly identified genes, thereby creating a genetic information library.

SSR Primer Competency
Initially, 45 SSR primer pairs were selected based on a literature review of the genus Cymbopogon and procured from Bioserve Biotechnologies (India) Private Limited, Hyderabad. Of the 45 primer pairs tested, 28 were found to be efficient for the analysis, showing good amplification and clear bands (Table 2). The amplified bands showed 2–5 alleles/locus. In total, 92 alleles were produced from the 28 SSR primers, of which 65 were polymorphic and the remaining 27 were monomorphic. The polymorphism percentage of the primers ranged from 50% to 100%, with an average of 68.10% (Table 2). The primers 3CM0506, 14CM3132, 30CM056 and 34CM014 showed 100% polymorphism.
Previously, a RAPD and ISSR analysis of C. winterianus accessions collected across West Bengal (India) showed polymorphism of 74.07% and 47.5%, respectively [1], while 75.11% polymorphism was reported in other species of Cymbopogon [49] and 81.33% in RAPD analysis [50]. Similarly, 90.68% and 69% polymorphism were reported in ISSR analysis, while 88.62% and 62.93% polymorphism were reported in RAPD analysis [35,51]. An EST-SSR analysis in Cymbopogon species showed ~81% polymorphism [32]. In the present diversity analysis of C. winterianus using SSR markers, an average of 68.10% polymorphism was observed. The level of polymorphism was low compared to the other studies because the present study was performed on a single Cymbopogon species collected from different geographical locations; the accessions are therefore likely to be more closely related to each other.
In addition, PIC, MI and Rp values were evaluated to check the competency of the screened primers. The mean PIC (polymorphism information content) values for all the loci of a particular primer varied from 0.16 (17CM3738) to 0.69 (3CM0506). A high PIC value indicates more variation in the alleles and also describes the efficiency of the primers [52].
Therefore, from the observed data, 13 of the loci (PIC ≥ 0.50) can be considered more informative than the others. The PIC value >0.50 depicts the high efficiency of the primers [53]. Based on this standard value, 13 of the 28 SSR primers can be considered highly efficient for discriminating the C. winterianus accessions. In a study performed on different Cymbopogon species, 19 out of 20 primers were found to be very informative [32] with PIC > 0.50. The primer efficiency was checked by further calculating the marker index (MI) value, where the primer 3CM0506 had the highest MI of 0.69, and the lowest were the primers 5CM1112 and 17CM3738 with a 0.11 marker index. The average MI was found to be 0.34 per primer. Similarly, the resolving power (Rp) of a primer is the ability to discriminate between genotypes. Rp values varied from 0.01 (5CM1112) to 3.12 (3CM0506) with an average of 0.96/primer. Based on the data evaluation, the most efficient primer among the screened primers was the 3CM0506, with PIC (0.69), MI (0.69) and Rp (3.12). The gel image of the primer 3CM0506 differentiating 72 C. winterianus accessions is shown in Figure 1. The primer 5CM1112 could be considered ineffective due to its inability to differentiate variation among the genotypes with low PIC (0.33), MI (0.11) and Rp (0.01).

Intra- and Inter-Population Genetic Variation
The genetic variability parameters, i.e., number of observed alleles (na); number of effective alleles (ne); Nei's gene diversity (h); Shannon's information index (I); genetic diversity in the population (Ht); genetic diversity within the population (Hs); genetic differentiation degree (Gst); and gene flow (Nm), were calculated at the population level (Table 3). The diversity of a species depends on the frequency of heterozygosity (Ht). The total species diversity among the populations (Ht) was found to be 0.22 ± 0.03, and the diversity within the populations (Hs) was 0.16 ± 0.02. Further, the gene differentiation coefficient (Gst) was 0.25 with a gene flow (Nm) of 1.48, indicating significant diversity in the population. All of these results confirmed the genetic variation in the populations because the Nm was higher than the threshold value (Nm ≤ 1.0) [54]. Similar results were also reported in the Cymbopogon species, which showed a gene flow of 2.58 (ISSR) and 2.20 (RAPD) [35]. Moderate heterozygosity (Ht) and gene differentiation coefficient (Gst) suggested moderate genetic variation among the populations. Our results were in accordance with earlier reports [42,55].
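The relation between Ht, Hs, Gst and Nm can be made concrete with a short calculation. The sketch below assumes the commonly used estimators Gst = (Ht − Hs)/Ht and Nm = 0.5(1 − Gst)/Gst; that these are exactly the formulas applied by the software used in this study is an assumption, and the function names are illustrative only. Because the inputs are the rounded values reported above, the output is close to, but not identical with, the reported Nm of 1.48.

```python
def gst(ht: float, hs: float) -> float:
    """Coefficient of gene differentiation, assumed as Gst = (Ht - Hs) / Ht."""
    if ht <= 0:
        raise ValueError("Ht must be positive")
    return (ht - hs) / ht


def gene_flow(gst_value: float) -> float:
    """Indirect gene-flow estimate, assumed as Nm = 0.5 * (1 - Gst) / Gst."""
    if not 0 < gst_value <= 1:
        raise ValueError("Gst must lie in (0, 1]")
    return 0.5 * (1.0 - gst_value) / gst_value


# Rounded values taken from the text above.
print(round(gst(0.22, 0.16), 2))   # Gst recomputed from rounded Ht and Hs
print(round(gene_flow(0.25), 2))   # Nm recomputed from the rounded Gst
```

Under this relation, Nm exceeds 1.0 whenever Gst is below 1/3, which is why the reported Gst of 0.25 falls above the threshold.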

Cluster Analysis
Clustering of the genotypes was performed using the Neighbour-Joining (N-J) Method on a pairwise distance matrix, which generated three major clusters (Figure 2). Cluster 1 comprises only one accession, while Cluster 2 and Cluster 3 comprise 34 and 37 accessions, respectively. Cluster 2 is further subdivided into Cluster 2a, consisting of 12 accessions collected from Assam; Cluster 2b, consisting of accessions collected from Arunachal Pradesh; and Clusters 2c and 2d, both consisting of accessions collected from the Assam region. Similarly, Cluster 3 is divided into three sub-clusters with one outgroup, i.e., Accession No. 16 (IC-0627058), collected from North Lakhimpur, Assam. Cluster 3a consists of accessions collected from Assam, and Cluster 3b consists of all the accessions collected from Manipur as well as four accessions from Assam and two accessions from Arunachal Pradesh. The dendrogram demonstrates that the clustering of accessions depended on their geographical collection site, with the exception of two accessions from Arunachal Pradesh that were clustered separately in a different group. This may be due either to the migration of the same genotype from one location to another or to the inability of the primers to discriminate them. The grouping of genotypes based on the N-J Method is preferred because it is rapid, reliable and produces an unrooted phylogenetic tree based on the minimum evolution criterion [56].
Jaccard's pairwise coefficient of similarity showed the minimum genetic similarity (0.485) between IC-0627064 (Cluster 3c) and IC-0627012 (Cluster 2a), and the maximum similarity (0.928), representing the minimum variation among the genotypes, between IC-0627049 (Cluster 3a) and IC-0627048 (Cluster 3a). This pairwise similarity matrix indicated low variation among the genotypes. This may be due to the non-cross-pollinating nature of C. winterianus, which flowers rarely, unlike the other Cymbopogon species [4]. Further, Nei's genetic identity and the genetic distance were calculated between the four populations. The dendrogram constructed indicated the highest genetic identity between Pop 1 and Pop 4 and a maximum distance between Pop 4 and Pop 3 (Figure 3). Nei's genetic identity and the genetic distance between the four populations showed low variation, indicating gene exchange or duplication of germplasms between adjacent geographical populations [57].
In Cymbopogon winterianus, because flowering is very rare, gene exchange is not carried out by pollen or seed. Therefore, we can interpret that the factors affecting the gene flow are mainly due to human interference, such as genetic swamping, genetic rescue, hybridization and urbanization.
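For readers unfamiliar with Jaccard's coefficient, the following minimal sketch shows how a pairwise similarity is obtained from band presence/absence scores. The two band profiles are invented for illustration and are not data from this study; the function name is likewise hypothetical.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard similarity for two 0/1 band profiles of equal length.

    Shared absences (0, 0) are ignored: the coefficient is the number of
    bands present in both accessions divided by the number of bands
    present in at least one of them.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of bands")
    both = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    if either == 0:
        raise ValueError("no band is present in either profile")
    return both / either


# Hypothetical band scores (1 = band present, 0 = band absent).
accession_x = [1, 1, 0, 1, 0, 1]
accession_y = [1, 0, 0, 1, 1, 1]

similarity = jaccard_similarity(accession_x, accession_y)
print(similarity)        # 3 shared bands out of 5 scored in either profile
print(1.0 - similarity)  # corresponding Jaccard distance
```

A distance matrix built from such pairwise values is the kind of input a Neighbour-Joining tree is constructed from.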
The basis of a multivariate analysis is a principal component analysis (PCA), which is an approximation of the data table provided by the analysis [58]. A field representation of variability can be provided by utilizing the PCA. The PCA is very useful for determining the similarity of samples, as non-similar samples lie further apart in the presentation [36]. Population structure analysis and the PCA are used widely to visualize the structure of the data [59]. Therefore, the PCA was performed for the 72 accessions of C. winterianus to check the variability and relationships among them, which is represented in the biplot (Figure 4). The eigenvalues are highest for the first three principal components (1.91, 0.99 and 0.69), which make the greatest contribution to explaining the variance among the accessions. The fourteen principal components together accounted for 69.33% of the variance, of which 16.09%, 8.36% and 5.77% was contributed by the first, second and third principal components, respectively (Table 4).
A cluster analysis along with the PCA based on molecular marker data helps in the extraction of maximum information if the first three principal components account for more than 25% of the variance [60]. The present analysis revealed that the first three components contribute 30.22% of the variance, which meets this criterion. The PCA plot resembled the clusters formed in the dendrogram, although some deviations of the accessions were observed on the PCA plot (Figure 4).
C. winterianus is a plant of industrial importance globally; therefore, the improvement of this crop through breeding programmes is essential. A germplasm collection, along with knowledge of its variability, is required for any genetic improvement of the crop. However, very scanty evidence on the intra- and inter-specific relationships and genetic diversity within the C. winterianus germplasm is available; therefore, the present study will add new information for researchers and breeders.

Population Structure
A total of six appropriate genetic populations were identified through a Structure Harvester analysis (Figures 5 and 6). The accessions that showed a probability score of more than 0.80 can be deemed genetically pure accessions, while those with a score of less than 0.80 can be considered admixed accessions. The mean Fst values for Populations I, II, III, IV, V and VI are 0.6023, 0.5092, 0.3296, 0.5244, 0.3483 and 0.4550, respectively. The allele frequency divergence among the populations was computed using point estimates of P, which are presented in Table A1. The population structure study indicated the genetic differentiation of the C. winterianus accessions, which suggested that the SSR primers used in the study were suitable for population structure studies.
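The choice of the most probable number of populations can be sketched with the ΔK statistic of the Evanno method, which Structure Harvester reports. The example below is an illustration under stated assumptions: the log-likelihood values are invented (they are not the values obtained in this study), the function name is hypothetical, and ΔK is computed from the mean and standard deviation of L(K) over replicate runs.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: deltaK} for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps each K to the list of log-likelihoods L(K) obtained from
    replicate runs. deltaK = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)|
    divided by the standard deviation of L(K).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # deltaK is undefined at the ends of the tested range
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # avoid division by zero when replicates are identical
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_difference / spread
    return result


# Invented replicate log-likelihoods for K = 3..8 (illustration only).
runs = {
    3: [-3000, -3004, -2996],
    4: [-2900, -2903, -2897],
    5: [-2820, -2824, -2816],
    6: [-2700, -2702, -2698],
    7: [-2690, -2695, -2685],
    8: [-2685, -2689, -2681],
}

scores = delta_k(runs)
best_k = max(scores, key=scores.get)
print(scores)
print(best_k)
```

Note that ΔK cannot be evaluated for the smallest and largest K tested, so the tested range should extend beyond the values of K considered plausible.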

Analysis of Molecular Variance (AMOVA) of C. winterianus Accessions
AMOVA was used to interpret the differences among populations using molecular markers [61]. Analysis of the C. winterianus accessions showed a high variance within the populations (75%) and significantly less variance among the populations (25%), indicating a continuous gene exchange among the populations (Table 5). In the Cymbopogon species, AMOVA previously showed 80% (RAPD) and 79% (ISSR) of the variation within the populations, with significantly less variation among the populations, at 20% (RAPD) and 21% (ISSR) [35]. It was reported that when an intra-specific variation is compared to an inter-specific variation, the presence of the population structure can be predicted [62]. Many factors, such as the geographical location, genetic drift, gene flow, mating system, long-term evolutionary history and wind-pollinated, long-lived outcrossing in species, are responsible for genetic diversity [63]. In addition to this, population size plays an important role in the precision of the work and in avoiding skewed results and errors [64]. Therefore, the highest intra-specific diversity was observed in Pop 1 (Assam) and the lowest in Pop 3 (Meghalaya). According to the genetic theory of population, an increase in diversity improves a species' potential for adapting to a changing environment [65]. Therefore, it is necessary to broaden the knowledge of genetic bases because a loss of heterogeneity might affect the viability of a population, leading to species extinction [66].

Genetic Diversity Based on Phytochemical Analysis of the Biomarkers
The qualitative and quantitative analyses were performed using GC-FID and GC-MS, which identified citronellal, citronellol and geraniol as the major compounds. These are the main biomarkers of the C. winterianus essential oil; therefore, only these three compounds were considered for the chemical diversity analysis. The maximum citronellal (51.450%), citronellol (26.389%) and geraniol (41.146%) contents were present in the genotypes IC-0626993, IC-0627002 and IC-0627004, respectively. These genotypes would serve as a good source for processing high-grade pharmaceuticals, cosmetics and value-added products [30,67]. Across all 72 accessions, the citronellal content ranged from 2.578% to 51.450%, the citronellol content from 0.000% to 26.389% and the geraniol content from 0.000% to 41.146% (Table 1), demonstrating a diversified germplasm. Previously, a new variant of C. winterianus with 1.2% essential oil and 35% citronellal content was developed through mutation breeding and registered as INGR-16021 [10,11]. The chemical profiling of C. winterianus showed a citronellal content of more than 43% in the M 6-10 cultivar developed through mutagenesis, a citronellol content of 15.3% in Mandakini and a geraniol content of 60% in Medini, identified through phenotypic selection [31]. Similarly, the present study demonstrated IC-0626993 as the best line for citronellal content with 51.45%, while the genotypes IC-0627002 and IC-0627004 were rich in citronellol (26.39%) and geraniol (41.15%), respectively. All variants possessing high levels of the three measured constituents were identified through selection and may be further developed into superior lines. From the quantitative and qualitative analyses of the essential oil, IC-0627062 was identified as the ideal genotype with a commendable amount of the phytochemicals (citronellal: 48.48%, citronellol: 9.25% and geraniol: 20.48%). A GC chromatogram of this germplasm (IC-0627062) is shown in Figure 7.
Apart from the molecular diversity analysis, variation based on the essential oil constituents has been studied by different researchers in different species of the genus Cymbopogon [36]. In the current study, however, diversity based on the phytochemical analysis of the essential oil was investigated within the single species C. winterianus. Using these biomarkers, a dendrogram was constructed based on similarity, complete linkage and the Euclidean distance, which formed three clusters. Cluster I comprises nine accessions; Cluster II comprises fifty-eight, and the remaining five accessions come under Cluster III (Figure 8). A boxplot representation of these markers shows that Cluster I contains the accessions with high citronellol and geraniol contents, while the Cluster II accessions were high in citronellal content. Cluster III comprises the accessions with low levels of the compounds measured (Figure 9). Previously, it was reported that C. winterianus differs from C. nardus in the major constituent of the essential oil because the former contains citronellol as the major component, while the latter contains geraniol [68]. However, it was also shown that the production of phytochemicals might depend on genetic makeup and geographical variations [18,47]. The analysis of phytochemical and molecular markers revealed different clustering, probably due to the incorporation of locus-specific SSR markers concerning genes not responsible for phytochemical biosynthesis [32].

Plant Materials
A total of 72 accessions of Java citronella were collected from different states (Assam, Arunachal Pradesh, Manipur and Meghalaya) in Northeast India. The GPS locations of the collected accessions were previously reported [69]. The plants were identified by the plant breeder of the Medicinal, Economic and Aromatic Plant (MAEP) Group of the CSIR-North East Institute of Science and Technology (NEIST). The accessions were planted during kharif 2016 and maintained at the NEIST Institutional experimental farm in Jorhat. The latitude and longitude of the experimental farm were recorded using the WGS84 geographical system as 26°44′15.6948′′ N and 94°9′25.4628′′ E, respectively. The pH of the experimental soil was 4.9 with a sandy loam texture. The NPK (nitrogen, phosphorus and potassium) available were 224, 115 and 142 kg/ha, respectively.

Genomic DNA Extraction
The tender leaves of the C. winterianus germplasm were collected separately in different zip lock bags from the experimental farm of CSIR-NEIST, Jorhat. The leaf samples were cleaned, lyophilised for 48 h at −40 °C and lysed in a Tissue Lyser (Qiagen, Germany) using liquid nitrogen (N2) for 150 s at a frequency of 25 Hz. Genomic DNA extraction was carried out using the HiPurA™ Plant DNA Isolation Kit by the CTAB (cetyl trimethylammonium bromide) method. The purity of the extracted DNA was checked on a 0.8% agarose gel, and the bands were detected in a gel documentation system (Eppendorf, Germany). The concentration was quantified using 3 µL of the stock DNA in a Nano Bio Spectrophotometer (Eppendorf, Germany) at the λ260/λ280 ratio. The stock DNA was further diluted to 30 ng/µL for PCR analysis using SSR markers.

SSR Primers-PCR Analysis
A total of 42 pairs of microsatellite (SSR) primers were screened, out of which the best amplifications were observed in 28 primer pairs, for the analysis of genetic diversity in the C. winterianus germplasm (Table 2). The reaction mixture for the SSR-PCR analysis consisted of 6 µL of working DNA (30 ng/µL), 1.6 µL each of forward and reverse primer (Bio Serve Biotechnologies, Hyderabad, India), 8.5 µL of 1× Hi-Chrome PCR Master Mix (Hi-Media, India) and 2.3 µL of double distilled water, making a total volume of 20 µL. A Prima-96 (Hi-Media, India) thermocycler was used for amplification with the following PCR programme: initial denaturation at 94 °C for 3 min; then 35 cycles of denaturation for 30 s at 94 °C, annealing for 60 s at the primer melting temperature (Tm) ± 5 °C and extension for 90 s at 72 °C; and a final extension for 10 min at 72 °C. The amplified product, along with 100 and 50 bp ladders (Hi-Media, India), was observed on a 2% agarose gel in 1× TBE buffer. Additionally, aliquots (15 µL) were separated using 8% polyacrylamide gel electrophoresis (PAGE). Electrophoresis was run for 1 h at a constant 100 V in 1× TBE buffer. The acrylamide gel was stained for 25–30 min with 0.5 mg/mL SYBR solution, which is considered a safe stain. The DNA bands in the gel were then visualised in a gel documentation system.
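The reaction volumes stated above can be checked, and scaled to several reactions, with a few lines of arithmetic. This is only a bookkeeping sketch: the dictionary keys and function names are illustrative, and scaling to 72 reactions (one per accession) is an assumption made for the example rather than a description of how the mix was actually prepared.

```python
# Per-reaction volumes (µL) as stated in the text.
REACTION_MIX_UL = {
    "working DNA (30 ng/µL)": 6.0,
    "forward primer": 1.6,
    "reverse primer": 1.6,
    "PCR master mix": 8.5,
    "double distilled water": 2.3,
}


def total_volume(mix):
    """Total reaction volume in µL, rounded to avoid float noise."""
    return round(sum(mix.values()), 2)


def template_mass_ng(volume_ul, concentration_ng_per_ul):
    """Mass of template DNA added to one reaction."""
    return volume_ul * concentration_ng_per_ul


def scale_mix(mix, n_reactions):
    """Volumes needed for n_reactions identical reactions (no overage added)."""
    return {name: round(vol * n_reactions, 2) for name, vol in mix.items()}


print(total_volume(REACTION_MIX_UL))      # should equal the stated 20 µL
print(template_mass_ng(6.0, 30.0))        # ng of DNA per reaction
print(scale_mix(REACTION_MIX_UL, 72))     # hypothetical batch of 72 reactions
```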

Statistical Analysis of Data
The amplicons formed were observed in the gel documentation system and were scored to obtain the genetic variation. The Neighbour-Joining tree, distance matrix and principal coordinate analysis (PCA) were constructed based on Jaccard's pairwise distance matrix using DARwin software version 6.0. Based on the polymorphism of the bands, the PIC (polymorphic information content) was calculated as $\mathrm{PIC} = 1 - \sum_i p_i^2$, where $p_i$ is the frequency of the ith allele [53]. The marker index (MI) was calculated using the formula $\mathrm{MI} = \mathrm{EMR} \times \mathrm{PIC}$, where the effective multiplex ratio (EMR) is the product of the number and the fraction of the polymorphic loci [70,71]. The resolving power (Rp) is the summation of the band informativeness, $\mathrm{Rp} = \sum I_b$, with $I_b = 1 - (2 \times |0.5 - p|)$, where p is the proportion of individuals containing the band [71]. The genetic diversity variables, such as genetic differentiation degree (Gst), genetic diversity in the population (Ht), genetic diversity within the population (Hs), Shannon's information index (I), number of observed alleles (na), number of effective alleles (ne) and Nei's gene diversity (h), were calculated using the POPGENE (Version 1.31) software package [72]. The inter-specific and intra-specific genetic diversities were determined using analysis of molecular variance (AMOVA) with the help of GenAlEx software Version 6.5 [73]. The genetic relationships among the 72 accessions were established using 28 SSR primers and a model-based population structure analysis, which was performed using STRUCTURE software Version 2.3.4. The software was run multiple times with K (the number of populations) set from 3 to 10, and the length of the burn-in period and the number of Markov Chain Monte Carlo (MCMC) replications were each set at 100,000 for every run for all 72 accessions in order to evaluate the number of populations. An online tool called Structure Harvester was used to calculate the most probable number of genetic population groups from the study.
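The three primer statistics defined above can be written out as a short calculation. The sketch below follows the formulas as stated in the text; the allele frequencies, the EMR value and the band scores are invented for illustration, and the function names are hypothetical.

```python
def pic(allele_frequencies):
    """PIC = 1 - sum(p_i^2) over the allele frequencies of one locus."""
    return 1.0 - sum(p * p for p in allele_frequencies)


def marker_index(emr, pic_value):
    """MI = EMR x PIC, with the effective multiplex ratio supplied by the caller."""
    return emr * pic_value


def resolving_power(band_scores):
    """Rp = sum of band informativeness I_b = 1 - 2 * |0.5 - p|.

    band_scores holds one 0/1 list per band; p is the proportion of
    accessions in which that band is present.
    """
    rp = 0.0
    for band in band_scores:
        p = sum(band) / len(band)
        rp += 1.0 - 2.0 * abs(0.5 - p)
    return rp


# Hypothetical locus with three alleles.
locus_pic = pic([0.5, 0.3, 0.2])
print(round(locus_pic, 2))
print(round(marker_index(1.0, locus_pic), 2))  # EMR of 1.0 is an assumed value

# Hypothetical scores of three bands across four accessions.
bands = [
    [1, 1, 0, 0],  # present in half of the accessions: most informative
    [1, 1, 1, 0],  # present in three quarters
    [1, 1, 1, 1],  # monomorphic: contributes nothing to Rp
]
print(resolving_power(bands))
```

A band carried by exactly half of the accessions contributes the maximum of 1 to Rp, whereas a band that is present in all or in none contributes 0, which is why monomorphic primers have low resolving power.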

Extraction of Essential Oil
The essential oil was extracted by boiling 300 g of fresh leaves of C. winterianus in water using a 3-litre capacity Clevenger apparatus for 3½ h in three replicates. The essential oil produced was measured, collected and treated with anhydrous sodium sulphate (Na2SO4) to remove excess moisture, and was stored at 4 °C for qualitative and quantitative analysis.

Essential Oil Analysis
For the qualitative and quantitative evaluation of the major phytochemical biomarkers, GC-FID and GC-MS analyses of the essential oil were carried out using the instrument and GC conditions given in Table 6 [69].

Table 6. The optimum conditions for GC-MS and GC-FID analysis in C. winterianus essential oil.

The major components were identified by comparing their retention times with those of standard samples (Sigma Aldrich, Germany and Hi-media, India) run under the same GC conditions, and the percentages of the compounds were determined by an area normalization method. In GC-MS, the mass spectra of the obtained peaks were identified by comparison with the NIST/WILEY Mass Spectral Library and with retention indices reported in the literature.
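As an illustration of the area normalization method, each compound's percentage is its peak area divided by the summed area of all integrated peaks, multiplied by 100. The Python sketch below assumes a simple mapping from compound name to peak area; the function name and the peak areas are hypothetical and are not data from this study.

```python
def area_normalization(peak_areas):
    """Return each compound's share of the total integrated peak area, in percent.

    peak_areas: dict mapping compound name -> integrated peak area
    (arbitrary detector units). All integrated peaks should be included,
    otherwise the percentages are overestimated.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


if __name__ == "__main__":
    # Hypothetical peak areas for one chromatogram (illustrative only).
    areas = {"citronellal": 500, "citronellol": 300, "geraniol": 200}
    for compound, percent in area_normalization(areas).items():
        print(f"{compound}: {percent:.2f}%")
```

Because the method assumes an equal detector response for every component, the resulting values are relative percentages rather than absolute concentrations.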

Statistical Analysis
MINITAB 16.0 software (Minitab Inc, State College, PA, USA) was used to perform the analysis of variance (ANOVA) on the quantitative data for citronellal, citronellol and geraniol, followed by dendrogram clustering based on complete linkage and Euclidean distance.
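The clustering step can be sketched in plain Python as a minimal agglomerative procedure with complete linkage and Euclidean distance. This is only an illustration of the technique and not the MINITAB implementation; the function names and the four example profiles (citronellal, citronellol and geraniol percentages) are hypothetical and are not data from this study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(points):
    """Agglomerative clustering with complete linkage.

    Returns the merge history as a list of (cluster_a, cluster_b, distance),
    where clusters are tuples of point indices. The distance between two
    clusters is the largest pairwise distance between their members.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(
                    euclidean(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


if __name__ == "__main__":
    # Hypothetical (citronellal %, citronellol %, geraniol %) profiles.
    profiles = [
        (50.0, 10.0, 20.0),
        (48.0, 11.0, 21.0),
        (5.0, 25.0, 40.0),
        (6.0, 24.0, 37.0),
    ]
    for left, right, dist in complete_linkage(profiles):
        print(left, "+", right, "at distance", round(dist, 3))
```

The merge history corresponds to the branching order of a dendrogram: accessions with similar oil profiles join at small distances, and chemically distinct groups join last.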

Conclusions
C. winterianus is a plant of high industrial importance because of its essential oil yield and the pharmaceutically valuable compounds the oil contains. Its preservation and conservation are therefore strongly recommended so that the crop can be properly utilised in breeding programmes. Breeding techniques such as molecular, mutation and selection breeding are suitable measures for developing high-yielding varieties. Successful breeding depends on genetic diversity and on information about the accessions collected from different places. Diversity based on phytochemical and molecular markers was therefore studied to differentiate the accessions into clusters. Depending on the constituents required, genotypes could be selected for the development of recombinant lines with desirable characteristics; such lines may contain more essential oil with a highly valuable lead molecule. The knowledge obtained in this study would also improve the information available on the phylogeny of the accessions and would enrich the gene bank. The findings would help in adopting proper conservation strategies by minimizing the risk of extinction of the elite lines. Most importantly, the scientific community, academicians and the commercial sector would benefit from using this research in plant improvement programmes.

Acknowledgments:
The authors are thankful to CSIR-NEIST, Jorhat, for providing the necessary facilities and encouraging us throughout the work.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Figure A1. Gel image of primer 15CM3334 depicting diversity in C. winterianus germplasm.