Comparative Plastomics of Ashwagandha (Withania, Solanaceae) and Identification of Mutational Hotspots for Barcoding Medicinal Plants

Within the family Solanaceae, Withania is a small genus belonging to the Solanoideae subfamily. Here, we report the de novo assembled chloroplast genome sequences of W. coagulans, W. adpressa, and W. riebeckii. The length of these genomes ranged from 154,162 to 154,364 base pairs (bp). These genomes contained a pair of inverted repeats (IRa and IRb) ranging from 25,029 to 25,071 bp that were separated by a large single-copy (LSC) region of 85,635–85,765 bp and a small single-copy (SSC) region of 18,457–18,469 bp. We analyzed the structural organization, gene content and order, guanine-cytosine content, codon usage, RNA-editing sites, microsatellites, oligonucleotide and tandem repeats, and substitutions of Withania plastomes, which revealed high similarities among the species. Comparative analysis among the Withania species also highlighted 10 divergent hotspots that could potentially be used for molecular marker development, phylogenetic analysis, and species identification. Furthermore, our analyses showed that even three mutational hotspots (rps4-trnT, trnM-atpE, and rps15) were sufficient to discriminate the Withania species included in current study.


Introduction
The globally distributed megadiverse Solanaceae family includes 93 genera and 2700 species [1][2][3]. The genus Withania Pauq., belonging to the subfamily Solanoideae, contains 10-20 species [1]. Among the worldwide list of Withania species, ashwagandha or winter cherry (W. somnifera (L.) Dunal) and paneer booti or ashutosh booti (W. coagulans (Stocks) Dunal) are considered highly important due to their therapeutic potential. Withania species are pivotal in the Ayurvedic medicine system in Southeast Asia, and W. somnifera has been used for medicinal purposes for around 3000 years [4,5]. Many studies of Withania have described various pharmacological properties of these species, (e.g., anti-inflammatory, anticancer, antidepressant, neuroprotective, and hepatoprotective) [6][7][8][9]. The ubiquity of such herbal products has expanded globally in recent decades. The worldwide market for medicinal plants is anticipated to reach 5 trillion USD by 2050, with Europe driving the market [10]. Although medicinal plants are outstanding sources of innovative drug development, assessing their pharmacological Plants 2020, 9, 752 3 of 20 Additionally, tRNA genes were identified using tRNAscan-SE version 2.0 under default parameters [37] and ARAGORN version 1.2.38 [38]. CPGAVAS2 [36] and Clico FS [39] were used to draw circular maps of the genomes. The average coverage depths of the Withania species plastomes were determined by mapping the reads to the de novo assembled plastome through the Burrows-Wheeler aligner (BWA) [40] and visualizing in the Tablet [41]. The novel annotated plastomes were deposited in NCBI under the following accession numbers: W. coagulans (MN216390), W. adpressa (BK010847), and W. riebeckii (BK010849). The plastome of W. coagulans was also deposited in the GWH database of the National Genomics Data Center [42] (accession number GWHACBF00000000).

Comparative Chloroplast Genome Analysis
All de novo chloroplast genomes were aligned with multiple alignment using fast Fourier transform (MAFFT) 7.309 [43], using default parameters. Protein-encoding genes, intergenic spacer (IGS) regions, and introns were extracted to calculate the average number of nucleotide differences per site or nucleotide diversity (π) with a 100 bp window size as implemented in DnaSP v6 [44]. The substitution, transition (Ts), and transversion (Tv) rates were resolved from the MAFFT alignment, using W. somnifera as a reference. Each structural element, including the LSC, SSC, and IR, was aligned individually to analyze SNPs and indel polymorphisms with Geneious and DnaSP, respectively. The junction sites of the IRs and their border positions were compared using all Withania species and six additional Solanaceae outgroup species (Table S9), using the default setting of the IRscope [45]. The intergeneric comparison was carried out to gain insight to differences and syntenies that may exist between Withania and other Solanaceae species. Circoletto [46] was used to compare structural features of Withania chloroplast genomes using blastn search (e-value of <1 × 10 −10 ) to create a Circos output.
The predictive RNA editor for plants-chloroplast genes (PREP-cp) was used to predict putative RNA editing sites using default settings [47], while codon usage and amino-acid frequencies were analyzed in Geneious R8.1. The ratios of synonymous (Ks) and non-synonymous (Ka) substitutions for each extracted protein-encoding gene were calculated with DnaSP for all Withania, using W. somnifera as reference. The data were interpreted as: Ka/Ks > 1, Ka/Ks = 1, Ka/Ks < 1, representing positive, neutral, and purifying selection, respectively. Microsatellites in Withania plastomes were detected with the microsatellite-web (MISA) [48], using a minimal repeat number of 7 for mononucleotide simple sequence repeats (SSRs), 4 for dinucleotide SSRs, and 3 for tri-, tetra-, penta-, and hexanucleotide SSRs. REPuter [49] was also used to locate forward (F), reverse (R), palindromic (P), and complementary (C) repeats with the following parameters: min. repeat size 30 bp, Hamming distance 3, min. similarity percentage of two repeat copies 90%, and max. computed repeats 500. A subsequent search for repeats was also carried out with tandem repeat finder [50] using default parameters.

Phylogenomic Analysis
We included all available Withania plastome sequences in our analysis and added further Solanaceae plastomes (Organelle Genome Resources of NCBI, accessed on 21 January 2020) from closely related groups of Physaleae and additional taxonomic groups from the so-called 'x = 12 clade'. This group encompasses species of the traditional subfamily Solanoideae, Nicotiana L. and the Australian endemic tribe Anthocercideae belonging to Nicotianoideae. This strongly supported group is united with the cytological synapomorphy of chromosome numbers based on 12 pairs [1]. We used Petunia × atkinsiana (Sweet) D. Don ex W.H. Baxter (Syn.: Petunia × hybrida Vilm.) as an outgroup to root our tree, since this was the only available complete chloroplast genome sequence outside the x = 12 clade. For phylogenetic analysis, we removed one of the IR regions (IRa), and subsequently excised all protein-encoding genes from the plastomes. The reading frames were manually verified during extraction by checking the start and stop codons. We discarded accD, ycf 1, and ycf 15 from our final alignment, because these genes were highly variable in size. The trans-spliced rps12 was also not included in the phylogenetic alignment together with sequence of the inf A pseudogene. The nucleotide sequences of 74 protein-coding genes were aligned with MAFFT (default setting) via the Geneious shell. IQ-TREE [51] was used to determine the best-fitting models for each partition of concatenated matrix using the TESTMERGEONLY and AICc (Akaike information criterion corrected for small sample sizes) options in the built-in ModelFinder [52]. The maximum likelihood (ML) tree search was carried out using the ultrafast bootstrap approximation (UFBoot; [53]) with 1000 replicates. UFBoot reduces computing time and provides an efficient alternative to standard bootstrap [53]. Branch supports were also assessed using the SH-like approximate likelihood ratio test (SH-aLRT), while final phylogenetic trees were edited using TreeDyn [54,55].
For further analyses we divided the dataset into protein-coding gene subsets according to the heuristic searches carried out with Partition Finder v1.1.1 [56] and default settings using Bayesian information criterion (BIC). Intron regions were regarded as distinct subsets. Partitioned Bayesian phylogenetic analyses were carried out with MrBayes v.3.2.3 [57]. jModelTest [58] was used with default settings to infer fitting substitution models (see Table S10). One cold and three heated Markov chains were run parallel with 2 × 10 6 generations sampling every 100th tree per generation, with unlinked parameters across partitions. Branch length and topology parameters were set unlinked. Convergence was checked using the average standard deviation of split frequencies (ASDFs; <0.01) was used to measure convergence of the runs. A majority-rule consensus tree was constructed from the runs with a 25% burn-in removal.

Organization and Characteristics of Withania Plastomes
Our comparative analysis revealed that Withania species have similar plastome structures ( Figure 1 and Table 1). The length of the assembled plastome varied between 154,162 and 154,364 bp. The average coverage depth of the assembled plastomes of W. coagulans, W. adpressa, and W. riebeckii was 573×, 566×, and 590×, respectively. The total guanosine-cytosine (GC) content of the de novo assembled Withania plastomes was 37.7%, as was the previously sequenced species.
The IRs showed a higher GC content (43.2%) compared to the large-(35.7%) and small-single copy regions (31.8%), which could have been due to the occurrence of rRNA genes containing GC-rich regions [19,[59][60][61]. The plastomes of the de novo assembled Withania species had 132 genes from which 18 were represented in duplication in the inverted repeats (Table 2, Figure 2). All Withania plastomes contained 86 protein-encoding, 37 tRNA, and 8 rRNA genes. The IR regions contained 18 duplicated genes and out of these 7 were protein-encoding, 4 were rRNA, and 7 were tRNA genes. The clpP and ycf 3 genes had two introns in their nucleotide sequence, while rps16, atpF, rpoC1, petB, petD, rpl16, ndhA, rpl2, and ndhB had only one. The 5 end exon of the trans-spliced gene rps12 was found in the LSC and the 3 end exons were located in the IR. The GC content was the highest among tRNAs (53%) and rRNAs (55.3%). Hydrophobic amino acids were abundant, while the acidic amino acids were present in the least amount in plastomes of the genus Withania. These amino acids were adenine-thymine (AT)-rich sequences in all species ( Figure 3A). The analysis of codon usage and amino acids frequencies indicated leucine (Leu) as the most frequent and cysteine (Cys) as a rare amino acid in Withania plastomes ( Figure 3B). The codon usage also revealed a shift towards A/T at the third codon position (Table S1).   blocks. The colored blocks outside the sequences refer to the score/max bit core ration, with green ≤0.50, orange ≤0.75, and red >0.75. Blue blocks and chords represent the inverted repeats (IRs).

Figure 2.
Consensus circular genome map of Withania plastomes. The first larger ring represents the gene structure of the chloroplast genome. The GC content is plotted as a green line while a skewed GC plot is presented as an orange/black track around the inner circles. The second ring is displaying the IRs as longer black boxes, while tandem repeats are represented by short bars. The third, and last circle shows the output of MISA microsatellite detection, where reverse repeats are connected with green arcs, while red arcs represent forward repeats.     * Gene with one intron, ** gene with two introns, a gene with two copies, same genes in all Withania species.

Divergence Hotspots in Withania
Our comparison showed that all Withania genomes had similar nucleotide compositions in all structural (LSC, SSC, and IR) and coding regions, which extended even to IGSs (Table S2). The number of substitutions ranged between 25 and 116, while substitution types were shared among species (Table 3). A/G and C/T SNPs occurred frequently among the genomes (Table 3), while the ratio of Ts and Tv in the plastomes ranged from 1.04 to 1.25 in the LSC and between 0.5 and 1.5 in the SSC; the ratio varied from 1.3 to 2 in the IR region (Table S3). In general, Ts were more frequent in Withania, consistent with observations in other plant species [61,62]. Indels were frequent in the LSC region and their number ranged from 32 to 46. The IRs contained only a few indels (Table 4). This may have been due to the observation that IR sequences evolve under concerted evolution compared to LSC and SSC regions that contain more substitutions [63]. When all positions with single-or multinucleotide Plants 2020, 9,752 8 of 20 variations as SNPs were considered, 207 SNPs were identified, which corresponds to a mean SNP frequency of 0.2070 SNPs/kb in Withania species. Indels showed a mean frequency of 0.116/kb.  The indels and SNP mutational events in the plastome showed uneven distributions and clustered as "hotspots" [64,65]. These fast evolving regions are ideally suited for DNA barcoding [66]. The IGS were more polymorphic (average π = 0.0027) than protein-coding regions (π = 0.0011) and introns (average π = 0.0015). Among the Withania species, the values ranged from 0.0003 (psaB) to 0.0119 (ndhl-ndhA region; Figure 4). We selected the 10 most polymorphic regions for further investigation based on the analysis of mutation rates of the complete chloroplast genome sequences (Table 5). From the selected regions, nine were IGSs, and one was a protein-coding gene (rps15). We assessed the efficacy of these regions to discriminate among the four species of Withania and found that three regions (rps4-trnT, trnM-atpE, and rps15) provided enough information for successful barcoding.
We also investigated the Ks and Ka substitutions and their ratio (Ka/Ks; Table S4). We selected 77 protein-encoding genes for further analysis and observed that 69 genes had Ks = 0 and 58 had Ka = 0, while 72 genes had both Ks and Ka = 0. Of the protein-encoding genes, four (accD, ycf 2, ycf 1, and ndhF) had Ka/Ks ratios > 1. ycf 1 and psbC showed Ka/Ks ratios > 0-1 for W. riebeckii, ndhF for W. coagulans, and rps15 for W. adpressa and accD, ycf 2 showed Ka/Ks > 0 for W. adpressa and W. riebeckii, and rpoC2 for W. coagulans and W. riebeckii. The low divergence of most chloroplast genes showed signs of purifying selection to conserve the sequence and function of proteins related to photosynthesis.

Repeat Structure and Analyses
Chloroplast repeat sequences are important sources of variation for evolutionary studies, plant breeding, and construction of linkage maps [67][68][69]. We performed a microsatellite analysis that revealed shared microsatellite loci ranging from 376 (W. coagulans) to 379 (W. adpressa). Poly-A and T SSR motifs were frequent in Withania chloroplast genome sequences, while AT/TA dinucleotide stretches were also highly abundant. The mononucleotide motifs occurred in 7-17-unit repeats, while dinucleotide repeats had a frequent 4-5-units, whereas other types of SSRs were present mainly in 3-5-unit repeats. Most SSRs occurred in the LSC, followed by IR and SSC ( Figure  5) (Table S5). REPuter was also employed to locate further tandem repeats in all Withania species. A

Repeat Structure and Analyses
Chloroplast repeat sequences are important sources of variation for evolutionary studies, plant breeding, and construction of linkage maps [67][68][69]. We performed a microsatellite analysis that revealed shared microsatellite loci ranging from 376 (W. coagulans) to 379 (W. adpressa). Poly-A and T SSR motifs were frequent in Withania chloroplast genome sequences, while AT/TA dinucleotide stretches were also highly abundant. The mononucleotide motifs occurred in 7-17-unit repeats, while dinucleotide repeats had a frequent 4-5-units, whereas other types of SSRs were present mainly in 3-5-unit repeats. Most SSRs occurred in the LSC, followed by IR and SSC ( Figure 5) (Table S5). REPuter was also employed to locate further tandem repeats in all Withania species. A total of 66 oligonucleotide repeats were found among Withania species. The F and P repeats were present in large numbers in all species. The oligonucleotide repeats were variable in size (30-60 bp) and a large fraction of the repeats was located in the LSC and existed in IGS regions, followed by gene, intron, and coding DNA sequence (CDS) regions ( Figure 6; Table S7). The number of tandem repeats varied from 22 to 25 bp among Withania species (Figure 7; Table S6).
Plants 2019, 10, x FOR PEER REVIEW 10 of 20 oligonucleotide repeats were found among Withania species. The F and P repeats were present in large numbers in all species. The oligonucleotide repeats were variable in size (30-60 bp) and a large fraction of the repeats was located in the LSC and existed in IGS regions, followed by gene, intron, and coding DNA sequence (CDS) regions ( Figure 6; Table S7). The number of tandem repeats varied from 22 to 25 bp among Withania species (Figure 7; Table S6).   total of 66 oligonucleotide repeats were found among Withania species. The F and P repeats were present in large numbers in all species. The oligonucleotide repeats were variable in size (30-60 bp) and a large fraction of the repeats was located in the LSC and existed in IGS regions, followed by gene, intron, and coding DNA sequence (CDS) regions ( Figure 6; Table S7). The number of tandem repeats varied from 22 to 25 bp among Withania species (Figure 7; Table S6).

Comparative Plastomics and Inverted Repeat Boundaries
The plastome of land plants has a conserved quadripartite structure while variation frequently occurs in a form of expansion and contraction of the junction sites that could rarely even lead to the loss of the entire IR regions [70,71]. The size of each structural element of the plastome (LSC, SSC, and IR) shows variation at the junction sites JL (LSC/IR) and JS (IR/SSC). Studying these boundaries among plant linages could broaden our knowledge about chloroplast genome evolution and speciation processes [72]. Syntenies among the junction sites could be conserved between species and could explain relationships among them [73]. To investigate such events, we compared the JL (LSC/IR) and JS (IR/SSC) junction sites of Withania plastomes (Figure 8). The resemblance at junctions revealed the close resemblance among the Withania species. The rps19 gene was found at the junction site of JLB (LSC/IRb), and a portion of this gene (8-59 bp) was copied in the IRa in all Withania genomes. The ndhF gene was entirely present in the SSC region in W. somnifera and W. adpressa, but in W. coagulans (5 bp) and W. riebeckii (3 bp) it was located in the IRb region.

Comparative Plastomics and Inverted Repeat Boundaries
The plastome of land plants has a conserved quadripartite structure while variation frequently occurs in a form of expansion and contraction of the junction sites that could rarely even lead to the loss of the entire IR regions [70,71]. The size of each structural element of the plastome (LSC, SSC, and IR) shows variation at the junction sites JL (LSC/IR) and JS (IR/SSC). Studying these boundaries among plant linages could broaden our knowledge about chloroplast genome evolution and speciation processes [72]. Syntenies among the junction sites could be conserved between species and could explain relationships among them [73]. To investigate such events, we compared the JL (LSC/IR) and JS (IR/SSC) junction sites of Withania plastomes (Figure 8). The resemblance at junctions revealed the close resemblance among the Withania species. The rps19 gene was found at the junction site of JLB (LSC/IRb), and a portion of this gene (8-59 bp) was copied in the IRa in all Withania genomes. The ndhF gene was entirely present in the SSC region in W. somnifera and W. adpressa, but in W. coagulans (5 bp) and W. riebeckii (3 bp) it was located in the IRb region.

Putative RNA-Editing Sites
RNA editing is the molecular processes that can alter the sequence of the transcribed RNA by insertion, deletion, or nucleotide substitution [46]. RNA editing aids in creating transcripts and maintaining protein diversity [74], thus several sites are conserved in the plastome of angiosperms [75]. To examine the RNA editing in Withania species, we predicted putative sites in the plastomes using PREP-cp. This revealed 37 putative sites in 15 genes of W. somnifera, while 35, 39, and 37 editing sites were found in 13 genes of W. coagulans and in 14 genes of W. adpressa and W. riebeckii, respectively. The gene clpP has editing sites only in W. somnifera and ccsA only in W. adpressa. Among Withania species ndhB (9), ndhD (7), and rpoB (5) had the highest number of RNA-editing sites. All species had high levels of conversion for serine (Ser) to leucine (Leu; 60%, 53.8%, and 59.4%, respectively), followed by proline (Pro) to Leu (14.28%, 17.94%, and 16.21%, respectively), and Ser to phenylalanine (Phe; 8.57%, 10.2%, and 10.8%, respectively). Of the putative RNA-editing sites detected, 33 (94.2%), 34 (87.1%), and 33 (89.1%) codons were substituted on the second nucleotide and two (5.71%), five (12.8%), and four (10.81%) codons were substituted in the first nucleotide in W. coagulans, W. adpressa, and W. riebeckii, respectively. Many amino acids were converted from Ser to Leu helping to form hydrophobic amino acids, e.g., valine (Val), Leu, and Phe (Table S8)

Putative RNA-Editing Sites
RNA editing is the molecular processes that can alter the sequence of the transcribed RNA by insertion, deletion, or nucleotide substitution [46]. RNA editing aids in creating transcripts and maintaining protein diversity [74], thus several sites are conserved in the plastome of angiosperms [75]. To examine the RNA editing in Withania species, we predicted putative sites in the plastomes using PREP-cp. This revealed 37 putative sites in 15 genes of W. somnifera, while 35, 39, and 37 editing sites were found in 13 genes of W. coagulans and in 14 genes of W. adpressa and W. riebeckii, respectively. The gene clpP has editing sites only in W. somnifera and ccsA only in W. adpressa. Among Withania species ndhB (9), ndhD (7), and rpoB (5) had the highest number of RNA-editing sites. All species had high levels of conversion for serine (Ser) to leucine (Leu; 60%, 53.8%, and 59.4%, respectively), followed by proline (Pro) to Leu (14.28%, 17.94%, and 16.21%, respectively), and Ser to phenylalanine (Phe; 8.57%, 10.2%, and 10.8%, respectively). Of the putative RNA-editing sites detected, 33 (94.2%), 34 (87.1%), and 33 (89.1%) codons were substituted on the second nucleotide and two (5.71%), five (12.8%), and four (10.81%) codons were substituted in the first nucleotide in W. coagulans, W. adpressa, and W. riebeckii, respectively. Many amino acids were converted from Ser to Leu helping to form hydrophobic amino acids, e.g., valine (Val), Leu, and Phe (Table S8).

Phylogenetic Analysis
We performed maximum-likelihood (ML) and Bayesian analysis for phylogenetic reconstruction for 19 Solanaceae species, based on selected protein-coding gene sequences extracted from whole-plastome sequences. Based on a 69,582-bp alignment, our tree was reconstructed and

Phylogenetic Analysis
We performed maximum-likelihood (ML) and Bayesian analysis for phylogenetic reconstruction for 19 Solanaceae species, based on selected protein-coding gene sequences extracted from whole-plastome sequences. Based on a 69,582-bp alignment, our tree was reconstructed and resolved identical topologies for both methods and a phylogenetic tree were supported by high bootstrap values and posterior probabilities ( Figure 9). The genus Withania was represented by W. adpressa, native to North Africa, Morocco, and Algeria, W. coagulans from the eastern distribution area, W. riebeckii native to the island of Socotra, Jemen, and finally, the widespread W. somnifera. Our phylogenetic analysis with limited taxonomic sampling resolved Withania as a monophyletic of the genus. However, further sampling is needed to investigate the relationship of the allied genera especially from Athenaea Sendtn., Aureliana Sendtn., and Mellissia Hook. f. not included in our analysis, which were shown to be closely related to Withania [76]. Our results were consistent with previous findings based on plastid intergenic atpB-rbcL spacer [77], ndhF and trnLF [1], or whole plastome sequences [18]. of the genus. However, further sampling is needed to investigate the relationship of the allied genera especially from Athenaea Sendtn., Aureliana Sendtn., and Mellissia Hook. f. not included in our analysis, which were shown to be closely related to Withania [76]. Our results were consistent with previous findings based on plastid intergenic atpB-rbcL spacer [77], ndhF and trnLF [1], or whole plastome sequences [18].

Discussion
Chloroplast DNA (cpDNA) is frequently used in plant phylogenetics at various levels (i.e., generic level and above) [19,66,78]. It has also been used in Solanaceae systematics to infer family level phylogenetic relationships and to identify major clades and dispersal events [79]. Thus, we characterized, annotated, and analyzed the plastome of Withania species, which was further used in phylogenetic inference. Withania species belong to a rather diverse and widely distributed Withaninae clade within the so-called physaloid group. Species of the genus Withania are

Discussion
Chloroplast DNA (cpDNA) is frequently used in plant phylogenetics at various levels (i.e., generic level and above) [19,66,78]. It has also been used in Solanaceae systematics to infer family level phylogenetic relationships and to identify major clades and dispersal events [79]. Thus, we characterized, annotated, and analyzed the plastome of Withania species, which was further used in phylogenetic inference. Withania species belong to a rather diverse and widely distributed Withaninae clade within the so-called physaloid group. Species of the genus Withania are morphologically similar: the flowers are found in lateral clusters (fascicles) lacking a supporting inflorescence stem (peduncle). The flower petals (corollas) are bell or elongated cup-shaped, sometimes urn-shaped, circular and flattened (rotate), or trumpet-shaped (salverform), while the filaments often form nectar grooves with lateral attachments. The Withaninae clade consists of approximately seven small often monotypic genera mostly found in the Old World, e.g., Tubocapsicum (Wettst.) Makino, Mellissia, Aureliana, or Discopodium Hochst. D'Arcy [80] considered Withania to be one of the truly Old-World genera, while Symon [81] regarded it as a distinctive African Gondwanan element. Withania has a center of distribution around Spain, NW Africa extending to the Canary Islands, while another is located in India, the southern region of the Arabian Peninsula, and the Horn of Africa. The phylogenetic relationships within the genus are poorly known, and the biology, chromosome numbers, and the exact number of species are also lacking. Chromosome counts showed that most species of Withania are polyploids with 2n = 2x = 48 [82], derived from the x = 12 haploid chromosome number typical for the majority of Solanaceae species. In addition to the currently accepted Withania taxa, there are 35 unresolved botanical names that need further investigation to clarify the taxonomy of the genus.
In the Hepper's treatment [83], Withania consisted of 10 species, which were extended by Hunziker [84] with the addition of nine mesophytes from the genera Mellissia Hook. and Physaliastrum Makino. These additions extended the geographical range of the genus from the Canary Islands in the west, through Asia to China and Japan in the east. Symon [81] also emphasized the similarity of Mellissia (a critically endangered endemic of St Helena) to Withania but retained them as distinct genera. In contrast, Hunziker [84] included Mellissia within Withania and molecular results support this placement [1]. There is no consensus on the positions of the small clades related to Withania, while its closest relatives are also debated. In our analysis, Withania formed a clade together with Physalis, similarly to the findings of Deanna et al. [82], although this branching is supported by only weak bootstrap values.
Plastome sequences could be used as tools to further elucidate species boundaries and investigate the phylogenetic relationships among the small clades of Withaninae and resolve the taxonomic debate over the placement of Melissia and other monotypic genera. For such barcoding studies our results could provide valuable reference genomes for assemblies. The hotspot regions described in our study could be useful in such phylogenetic or even population genetic investigations. It was previously demonstrated that identifying highly variable regions by comparative plastomics could provide reveal loci that could be used in DNA barcoding [85][86][87]. Such divergent hotspots in the plastomes can be applied for DNA barcoding at the generic level [29,[88][89][90]. Thus, the set of 10 polymorphic regions identified among Withania in our study could be applied for DNA barcoding. Moreover, similar to the aforementioned studies, our identified mutational hotspots showed high discrimination properties and from the 10 mutational hotspots, three regions were found to be sufficient for the identification of the four Withania species included in our study.
We analyzed Ks substitutions and Ka substitutions of protein-coding genes and recorded greater Ks substitutions relative to the Ka substations. Such observations are essential markers in evolution for defining slow-and fast-evolving genes [91]. The Ka/Ks ratio also informs us about the selection pressure on these genes. When the Ka/Ks value is minimal, it represents purifying selection, while values similar to it or equal to 1 represent neutral evolution, and values greater than 1 denote positive selection [85]. Most plastid genes showed a minimal Ka/Ks ratio (<1), demonstrating that purifying selection is acting over these genes, due to functional constraints of the plastome. However, atpB, ndhD, ndhF, rpoA, rpoC1, rps2, and rps12 showed greater Ka/Ks values (>1), possibly indicating selective pressure acting over these genes that was previously proposed in other groups [92][93][94][95]. Our sampling in the Withania clade was limited to explore the biological causes of the elevated Ka/Ks ratios observed in cases involving these genes. Here, we suggest that the following set of genes could be the principal candidates in investigations of these environmental interactions and their effects on plastid genes. Such investigations should include a nearly complete phylogenetic sampling of Withania and consider the effects of arbitrary variations in Ka and Ks values leading to false positive inferences and values higher than 1. These shortcoming can be bypassed by complete sampling and additional tests of selective pressure also stressed here for future analyses.

Conclusions
It has been shown that DNA barcoding can fail in complicated groups [96]. Solanaceae includes many species complexes with tangled taxonomy such examples of species groups can be found in potatoes and its wild relatives, e.g., Solanum brevicaule complex [97], or the eggplant and its wild relatives (S. melongena complex [98]) but in other clades of the family for example in the genus Petunia [99] or closely related Physalis [100]. In these complicated groups well known plastid barcode regions (e.g., trnH-psbA, matK) could lack enough polymorphism and thus could fail to provide species-specific information necessary for differentiation [96]. It has been shown that plastid genome based "super barcoding" could overcome these difficulties and could differentiate species in difficult taxonomic groups. This approach has been successfully employed in the S. melongena complex to trace the ancestors of cultivated eggplant and differentiate closely related wild species [98]. Here, we compared the complete plastome sequences of four Withania species and investigated if plastid genomes based "super barcoding" could be applied among closely related ashwagandha species. The structure of these genomes showed synteny with a previously reported organization of Solanaceae species. We identified sequence divergence hotspots and located repeat sequences and indels in the plastomes of Withania species. These regions may constitute a useful means to develop suitable molecular markers for species identification and DNA barcoding of ashwagandha medicinal products. It is hoped that our study will aid the development of DNA barcoding markers to clarify the taxonomic identity of Withania species in medicinal plant production. Such plastome-based "super barcoding" could be repeatable, reliable, and sensitive enough to distinguish look-alike species of ashwagandha.
Supplementary Materials: The following are available online at http://www.mdpi.com/2223-7747/9/6/752/s1, Table S1: Base composition in the chloroplast genome of W. somnifera, W. coagulans, W. adpressa and W. riebeckii;  Table S9: NCBI GenBank accession numbers used in this study; Table S10: Model selected for each data partition identified by software Partition Finder for protein-coding chloroplast genes and intron regions in maximum likelihood (ML) analysis and Bayesian inference (BI).