Complete Chloroplast Genome Characterization of Oxalis corniculata and Its Comparison with Related Species from Family Oxalidaceae

Oxalis corniculata L. (family Oxalidaceae) is a small creeping wood sorrel that grows well in moist climates. Despite its medicinal importance, little is known about the genomics of this species. Here, we determined the complete chloroplast genome sequence of O. corniculata for the first time and compared it with other members of family Oxalidaceae. The genome was 152,189 bp in size and comprised a pair of 25,387 bp inverted repeats (IR) that separated a large 83,427 bp single copy region (LSC) and a small 16,990 bp single copy region (SSC). The chloroplast genome of O. corniculata contains 131 genes, including 83 protein-coding genes, 40 tRNA genes, and 8 rRNA genes. The analysis revealed 46 microsatellites, of which 6 were present in coding sequence (CDS) regions, 34 in the LSC, 8 in the SSC, and 2 in a single IR region. Twelve palindromic repeats, 30 forward repeats, and 32 tandem repeats were also detected. Chloroplast genome comparisons revealed an overall high degree of sequence similarity between O. corniculata and O. drummondii and some divergence in the intergenic spacers of related species in Oxalidaceae. Furthermore, seven genes (ccsA, clpP, rps8, rps15, rpl22, matK, and ycf1) were found to be the most divergent among the genomes. Phylogenomic characterization on the basis of 60 shared genes revealed that O. corniculata is closely related to O. drummondii. The complete O. corniculata genome sequenced in the present study is a valuable resource for investigating the population and evolutionary genetics of family Oxalidaceae and can be used to identify related species.


Introduction
The largest genus of family Oxalidaceae is Oxalis L., which is distributed mostly in Southern Africa and South America and comprises more than 500 species. About one-half of the total species (>200 spp.) grow in Southern Africa and are herbaceous taxa that share a bulbous or tuberous habit [1][2][3]. Great morphological variation has been observed among the approximately 250 species in South America, where the genus seems to have originated and diversified [2,4]. Oxalis section Corniculatae DC., however, consists of creeping herbs, and many of its species grow in the temperate and humid areas of the Americas [5]. A cytogenetic study showed that Corniculatae tends to fall into two sub-groups: (1) a large number of diploid and polyploid species with a base chromosome number of x = 6, symmetrical karyotypes, and small to medium-sized chromosomes, and (2) a smaller number of diploid species with x = 5, more asymmetrical karyotypes, and medium to large chromosomes [3,6,7]. The taxonomy of the group has been complicated by phenotypic similarities among species.
There is little information available on their genetic structure, especially their chloroplast genomes, or on their detailed phylogenetic placement. Hence, the current study aimed to sequence and analyze the complete chloroplast genome of O. corniculata and to compare it with related species from the family Oxalidaceae (Oxalis drummondii, Averrhoa carambola, and Cephalotus follicularis). We also aimed to elucidate and compare the global pattern of structural variation in the cp genomes of O. corniculata and the three related species. In addition, we compared IR region contraction and expansion, intron content, regions of high sequence divergence, and the phylogenomics of O. corniculata with the cp genomes of the related species to gain more insight into their comparative genome architecture.

Chloroplast Genome Structure of O. corniculata
The chloroplast genome of O. corniculata is 152,189 bp long and displays the typical quadripartite structure, with a pair of 25,387 bp inverted repeats (IRs) separating an 83,427 bp large single copy (LSC) region from a 16,990 bp small single copy (SSC) region (Figure 1). The total GC content is 36.7% and is unevenly distributed across the genome: the GC content of the IRs (42.6%) is higher than that of the LSC and SSC regions (34.4% and 30.3%, respectively). The O. corniculata cp genome contains 131 genes, of which 83 are protein-coding genes, 40 are tRNA genes, and 8 are rRNA genes (Figure 1; Table 1). The protein-coding genes include nine genes for large ribosomal proteins (rpl2, 14, 16, 20, 22, 23, 32, 33, 36), 11 genes for small ribosomal proteins (rps2, 3, 4, 7, 8, 11, 12, 14, 15, 18, 19), five genes for photosystem I (psaA, B, C, I, J), 15 genes for photosystem II (psbA, B, C, D, E, F, H, I, J, K, L, M, T, Z), and six genes (atpA, B, E, F, H, I) for ATP synthesis and the electron transport chain (Figure 1, Table 2). Seventeen genes contain introns (11 protein-coding genes and 6 tRNA genes), of which three (rps12, clpP, ycf3) contain two introns, while the rest have a single intron (Table 3). The small ribosomal protein 12 gene (rps12) is trans-spliced: its 5' exon is located in the LSC region, while its 3' exon is located in the IRb region and duplicated in the IRa region (Figure 1). The largest intron is found in trnK-UUU (2558 bp), whilst trnL-UAA contains the smallest intron (492 bp) (Table 3). The protein-coding regions are about 79,239 bp long in total and account for 52% of the cp genome, while the tRNA and rRNA regions are 3042 bp and 9048 bp long and constitute 1.99% and 5.94%, respectively. The rps16 gene, which is found in most angiosperm plastid genomes, is absent from the O. corniculata cp genome, as is infA.
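The proportions quoted above follow directly from the reported lengths. A minimal arithmetic check is sketched below; the variable and function names are illustrative only, and the results agree with the reported percentages to within rounding (the reported values appear to be truncated rather than rounded).

```python
# Lengths in bp, taken from the text above.
GENOME_BP = 152_189
FEATURE_BP = {
    "protein-coding": 79_239,
    "tRNA": 3_042,
    "rRNA": 9_048,
}


def percent_of_genome(length_bp: int, genome_bp: int = GENOME_BP) -> float:
    """Return the share of the genome occupied by a feature class, in percent."""
    return 100.0 * length_bp / genome_bp


# Share of the cp genome taken up by each feature class.
shares = {name: percent_of_genome(bp) for name, bp in FEATURE_BP.items()}

for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```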

[Table columns: Category | Group of Genes | Name of Genes]

Comparative Analysis of O. corniculata Chloroplast Genome with Related Species
The O. corniculata chloroplast genome was compared with three previously sequenced genomes from family Oxalidaceae, i.e., O. drummondii, A. carambola, and C. follicularis (Table 1). The cp genomes varied in size: C. follicularis has the smallest cp genome (142,706 bp), whilst that of A. carambola is the largest (155,965 bp) among the analyzed species. We also calculated the pairwise sequence divergence of the O. corniculata cp genome (Table S1). O. corniculata showed the lowest pairwise sequence divergence with O. drummondii (0.044), followed by A. carambola (0.057; Table S1). The whole cp genomes were also compared via mVISTA to identify regions of sequence divergence (Figure 2). The coding regions of all cp genomes are more conserved than the non-coding regions, and the greatest divergence was observed in the intergenic spacers. The matK gene exhibited a high degree of divergence in all genomes, but the divergence was higher in A. carambola and C. follicularis than in the others. Similarly, rpoC1, rpoC2, and rpoB were divergent in all four cp genomes, most strongly in C. follicularis. In the LSC, the region between the rpoB and psbD genes showed a high degree of divergence in all species. The ndhK and rpl16 genes showed considerable divergence, whereas ycf2 was less divergent. The ndhB gene is weakly divergent in all species except C. follicularis, where it showed high divergence. Similarly, the ndhF gene is highly divergent in all cp genomes except that of C. follicularis, from which it is absent. In the SSC, the region between ndhG and ndhH is highly divergent, and the cp genome of C. follicularis lacks most of the NADH oxidoreductase genes. Furthermore, about 63 protein-coding gene sequences were compared to obtain the average pairwise sequence distance among these species. The majority of the genes maintained low levels of average sequence divergence. Relatively lower sequence identity between the chloroplast genome of O. corniculata and those of the related species was observed especially in the ccsA, clpP, rps8, rps15, rpl22, matK, and ycf1 genes (Figure 3).
Plants 2020, 9, x; doi: FOR PEER REVIEW www.mdpi.com/journal/plants

Repeat Sequence Analysis
We compared the repeat sequences of the O. corniculata chloroplast genome with those of the related species. O. corniculata contains 12 palindromic, 30 forward, and 32 tandem repeats; O. drummondii contains 19 palindromic, 20 forward, and 23 tandem repeats; A. carambola contains 15 palindromic, 18 forward, and 27 tandem repeats; and C. follicularis contains 20 palindromic, 30 forward, and 32 tandem repeats (Figure 5A). In O. corniculata, 9 palindromic repeats were 30-44 bp long, while 4 repeats were 45-59 bp long. Likewise, 28 and 3 tandem repeats were 15-29 bp and 30-44 bp long, respectively, whereas 21 forward repeats were about 30-44 bp long (Figure 5B-D). Amongst all these cp genomes, 74 repeats (the highest) were detected in both A. carambola and O. corniculata. Across all cp genomes, tandem repeats are the most numerous repeat type, followed by forward and palindromic repeats.

Simple Sequence Repeat (SSR) Analysis
In the simple sequence repeat (SSR) analysis, a total of 46 SSRs were detected in the O. corniculata genome; among them, 42 are mononucleotide repeats, 3 are trinucleotide repeats, and 1 is a pentanucleotide repeat (Figure 6). No dinucleotide, tetranucleotide, or hexanucleotide repeats were found in the O. corniculata genome. In O. corniculata, 13% of the SSRs are present in the CDS regions, 73.9% in the LSC region, 15.2% in the SSC region, and 2.1% in the IR region (Figure 6B-E). Similarly, most SSRs in the other three species are located in the intergenic regions, i.e., O. drummondii (70%), A. carambola (87.6%), and C. follicularis (78.5%), followed by the LSC region, that is, 67.5%, 84.6%, and 73.2%, respectively (Figure 6B-E). In the cp genome of O. drummondii, a total of 40 SSRs were found, of which 37 are mononucleotide and 3 are trinucleotide repeats, while di-, tetra-, penta-, and hexanucleotide repeats were not detected. A. carambola and C. follicularis contain 56 and 49 mononucleotide repeats, respectively (Figure 6F). A. carambola has 5 trinucleotide, 3 pentanucleotide, and 1 hexanucleotide repeat, while di- and tetranucleotide repeats are missing. Similarly, in C. follicularis, 4 tri-, 1 penta-, and 2 hexanucleotide repeats were found, while di- and tetranucleotide repeats are absent from this cp genome. Among the four cp genomes, A. carambola has the highest number of SSRs, including 56 mononucleotide repeats.

Phylogenetic Analysis
For the phylogenetic analysis of O. corniculata, we downloaded about 191 genomes from the 20 families mentioned in the Materials and Methods section. We inferred the phylogenetic position of O. corniculata on the basis of 60 genes shared among these genomes. O. corniculata forms a single clade with O. drummondii and A. carambola in the family Oxalidaceae (Figure 7). The results also showed that O. corniculata is closer to O. drummondii than to A. carambola, which belongs to a different genus. Furthermore, the phylogenetic tree indicated that family Oxalidaceae is close to Cephalotaceae and Celastraceae with high bootstrap support (100%), followed by Zygophyllaceae and Euphorbiaceae. The phylogenetic trees in this study also showed that Rosaceae is closely linked to Moraceae and indicated a close relationship between Fabaceae and Apodanthaceae.

Discussion
Chloroplast DNA sequences have been used extensively to assess genetic and evolutionary relationships among plant species [40][41][42]. Complete chloroplast genome sequences provide sufficient information to reconstruct both recent and ancient diversifications [43]. The power and flexibility of Next Generation Sequencing (NGS) have permeated many areas of study, enabling a broad range of applications that have transformed study designs and can unlock information on the genome, transcriptome, and epigenome of any organism [44]. In the current study, we sequenced the complete chloroplast genome of O. corniculata for the first time. The chloroplast genome size of O. corniculata falls within the range reported for flowering plants, from 125,373 bp in Cuscuta exaltata to 176,045 bp in Vaccinium macrocarpon [45,46]. The GC content of O. corniculata is 36.7% (Table 1), which is slightly lower than that of C. follicularis and Paeonia obovata (38.43%) [47]. The GC content of the IR region (42.6%) is higher than that of the LSC and SSC regions. This higher GC content is attributable to the GC-rich rRNA genes located in the IR region, such as rrn5, rrn4.5, rrn23, and rrn16, which is consistent with what has been reported for other cp genomes [48][49][50].

In most angiosperms, the genes of the chloroplast genome and their organization are believed to be extremely conserved [51]. Accordingly, we detected 131 genes in the cp genome of O. corniculata, and other studies also show that many angiosperms have retained these genes [52,53]. With the increasing number of chloroplast genome sequences, the diverse organization of the chloroplast genome is becoming more evident, as demonstrated by genome rearrangements and gene losses in the chloroplast genomes of Oxalidaceae. For example, the rps16 gene, which is found in most angiosperm plastid genomes, has been lost in O. corniculata. Similar losses have been reported previously in various cp genomes [54]. Furthermore, the infA gene has been lost from the O. corniculata cp genome; as reported previously, infA has been lost independently multiple times in angiosperms, especially in most rosids [32,51]. Moreover, we found 11 protein-coding genes and 6 tRNA genes containing introns in the O. corniculata cp genome. Among them, three genes (clpP, rps12, and ycf3) have two introns, while the others have one intron. Similar results were reported previously for the Manihot esculenta chloroplast genome [32] and the Oresitrophe chloroplast genome [55]. In this study, the genes ccsA, clpP, rps8, rps15, rpl22, matK, and ycf1 were found to have high evolutionary rates among the four cp genomes (Figure 3), which agrees with the earlier report of Cuenoud et al. [41]. Similar results for these genes were reported previously among 17 vascular plants and Panax species [56].
In terrestrial plants, the cp genome is structurally highly conserved, and the large inverted repeat (IR) junctions are not essential to the function of the cp genome [57]. The IRs are believed to be the most conserved region, so that the rate of nucleotide substitution in the IRs is lower than in the single copy regions, and variation in the IR/LSC and IR/SSC boundaries is the key reason for size variation among the cp genomes of different groups. Among the four genomes, size variation was reflected in a slight expansion of the IRb (JLB border) in C. follicularis compared to O. corniculata (Figure 4). These results agree with previous work showing that IRs are an efficient means of conformational reorganization within plastid genomes and are regularly subject to expansion, contraction, or even complete loss [20]. Similarly, previous results showed that contractions and expansions of the IR regions drive size diversification among cp genomes [58].
The analysis of different repeats (palindromic, forward, and tandem) in our sequenced cp genome showed variation in the number of repeats, similar to that in other species studied previously [59]. Tandem repeats were more numerous than palindromic and forward repeats in all four cp genomes; these results are consistent with previous reports on Teucrium and Commiphora species [60,61], as well as S. miltiorrhiza [62]. Simple sequence repeats (SSRs) usually have a higher mutation rate than other neutral regions of DNA owing to slipped-strand mispairing. Because cp SSRs are haploid and uniparentally inherited, they are commonly used as molecular markers for assessing population structure in genetic studies [63,64]. In this study, we compared the ideal SSRs among the four species O. corniculata, O. drummondii, A. carambola, and C. follicularis (Figure 6). The largest number of SSRs was found in A. carambola, followed by C. follicularis. Mononucleotide repeats were the most common type of SSR in all four species; A or T mononucleotide repeats are the most abundant SSRs in O. corniculata (Figure 6), which is congruent with the previous finding that SSRs in the chloroplast genome are commonly composed of A or T repeats and rarely of G or C repeats [62,65].
Recently, cp genome information has provided a large amount of data for improving phylogenetic resolution. Chloroplast genome sequences have been widely used to reconstruct phylogenetic relationships among plant lineages [66][67][68][69]. Evolutionary relationships among plant species can be difficult to resolve, specifically at low taxonomic levels, when only a small number of loci are used [70,71]. Previous phylogenetic studies based on complete cp genomes and shared genes have been used to explain problematic phylogenetic relationships among closely related species [34,68] and to improve our understanding of the evolutionary relationships of angiosperms [72,73]. The phylogenetic relationships of O. corniculata were inferred from a dataset of 60 shared genes using the ML method. The results showed that O. corniculata forms a single clade with O. drummondii. Similarly, the phylogenetic tree indicated that family Oxalidaceae is close to Cephalotaceae and Celastraceae with high bootstrap support (100%), followed by Zygophyllaceae and Euphorbiaceae (Figure 7). Further cp genomes from the family Oxalidaceae should be explored to determine the phylogenetic position of O. corniculata within section Corniculatae.

Chloroplast DNA Extraction, Sequencing, and Assembly
Young, immature leaves of O. corniculata were ground into a fine powder in liquid nitrogen, and pure DNA was isolated with a DNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA). The resulting chloroplast DNA was sequenced on an Illumina HiSeq-2000 platform (San Diego, CA, USA) at Macrogen (Seoul, Korea). A total of 43,453,336 raw reads were generated for O. corniculata, and CLC Genomics Workbench v7.0 (CLC Bio, Aarhus, Denmark) was used to trim and filter reads for the de novo genome assembly. Trimmomatic 0.36 was used to filter the reads: leading and trailing nucleotides with a Phred score of <20 were removed, and reads were clipped where the Phred score dropped below 20 within a 4-bp sliding window. Reads of <50 bp were discarded after quality filtering and adaptor trimming. The first assembly was generated using SPAdes v3.9.0, followed by SOAPdenovo v2.04. The resulting contigs were compared against the chloroplast genome of O. drummondii using BLASTN with an E-value cut-off of 1 × 10^-5. Uncertain regions, such as the IR junctions, were resolved against the published genome mentioned above, adjusting the sequence length with an iterative method in Geneious v11.1.2 [74]. The chloroplast genome sequence of O. corniculata has been submitted to GenBank (accession number: MN998500).
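The read-filtering criteria described above (Phred threshold of 20, 4-bp sliding window, 50-bp minimum length) can be illustrated with a simplified sketch. This is not Trimmomatic's implementation: the function name trim_read, the order of the steps, and the use of the window mean are assumptions made for illustration only.

```python
def trim_read(quals, min_q=20, window=4, min_len=50):
    """Return (start, end) of the retained slice, or None if the read is discarded.

    quals is a list of per-base Phred scores; coordinates are 0-based, end-exclusive.
    Simplified illustration only; step order and window rule are assumptions.
    """
    start, end = 0, len(quals)

    # Leading: drop low-quality bases from the 5' end.
    while start < end and quals[start] < min_q:
        start += 1

    # Trailing: drop low-quality bases from the 3' end.
    while end > start and quals[end - 1] < min_q:
        end -= 1

    # Sliding window: cut at the first window whose mean quality is below min_q.
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < min_q:
            end = i
            break

    # Discard reads that are too short after trimming.
    if end - start < min_len:
        return None
    return start, end
```

For example, a read with two poor leading bases, 60 good bases, and a low-quality stretch near the 3' end is clipped on both sides, whereas a uniformly good 40-bp read is discarded by the length filter.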

Genome Annotation
The Dual Organellar Genome Annotator (DOGMA) [75] was used to annotate the cp genome of the sequenced species. The number and positions of ribosomal RNAs, transfer RNAs, and other coding genes present in the chloroplast genome were identified and analyzed through BLASTX and BLASTN, while tRNAscan-SE version 1.21 [76] was used for tRNA annotation. Furthermore, Geneious (v11.0) and tRNAscan-SE [76] were used for manual adjustment by comparison with previously reported reference genomes. The start and stop codons and the intron boundaries were likewise adjusted manually and compared with the published reference chloroplast genome. Additionally, Organellar Genome DRAW (OGDRAW) [77] was used to illustrate the structural features of the O. corniculata chloroplast genome. In addition, MEGA6 software [78] was used to determine the relative synonymous codon usage and deviations in synonymous codon usage while avoiding the effect of amino acid composition.
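Relative synonymous codon usage (RSCU) is conventionally defined as the observed count of a codon divided by the mean count of all codons that encode the same amino acid. The sketch below illustrates that definition; it is not the MEGA6 implementation, the function name rscu is hypothetical, and the standard amino-acid assignments are assumed to apply to plastid genes.

```python
from collections import Counter
from itertools import product

# Standard amino-acid assignments in TCAG order (assumed here for plastid CDS).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for every codon whose amino acid occurs in the CDS.

    Stop codons ('*') are treated as one synonymous family.
    Codons containing characters other than A, C, G, T are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group codons into synonymous families.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid not present in this sequence
        mean = total / len(codons)
        for c in codons:
            values[c] = counts[c] / mean
    return values
```

An RSCU of 1 indicates no bias within a synonymous family; values above 1 indicate codons used more often than expected under equal usage.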

Characterization of Repetitive Sequences and SSR
REPuter software [79] was used to identify the repetitive sequences (palindromic, reverse, and direct repeats) within the four cp genomes (O. corniculata, O. drummondii, A. carambola, and C. follicularis). The following settings were used for repeat identification with REPuter: (1) a minimum repeat size of 30 bp, (2) ≥90% sequence identity, and (3) a Hamming distance of 1. Tandem Repeats Finder version 4.07 b was used with default settings to find tandem repeats [80]. The MIcroSAtellite (MISA) identification tool was used for the microsatellite analysis of the cp genomes of O. corniculata and the other three species (O. drummondii, A. carambola, and C. follicularis) [81]. The unit_size and min_repeats parameters were defined as follows: 1-10, 2-8, 3-4, 4-4, 5-3, and 6-3; the minimum distance between two SSRs was set to 100 bp. The following conditions were set for parametric significance: 10 or more repeats of one base, 6 or more repeats of two bases, 5 or more repeats of three bases, 5 or more repeats of four bases, 4 or more repeats of five bases, and 4 or more repeats of six bases.
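The unit-size and minimum-repeat thresholds can be illustrated with a small regex-based scan. This is a sketch only, not the MISA program: it uses the first threshold set quoted above (1-10, 2-8, 3-4, 4-4, 5-3, 6-3), does not merge compound SSRs within 100 bp, and the names find_ssrs and is_compound_of_shorter are hypothetical.

```python
import re

# Minimum repeat count per motif length (first threshold set quoted in the text).
MIN_REPEATS = {1: 10, 2: 8, 3: 4, 4: 4, 5: 3, 6: 3}


def is_compound_of_shorter(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'ATAT' = 'AT' x 2)."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples; 0-based, end-exclusive."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in sorted(min_repeats.items()):
        # A motif of `unit` bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            motif = m.group(1)
            if is_compound_of_shorter(motif):
                # Already reported under the shorter motif; resume one base later.
                pos = m.start() + 1
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // unit))
            pos = m.end()
    return sorted(hits)
```

Scanning each motif length separately and skipping motifs that reduce to a shorter unit prevents a mononucleotide run from being counted again as a di- or trinucleotide repeat.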

Sequence Divergence and Phylogenetic Analysis
The average pairwise sequence divergence between the O. corniculata chloroplast genome and those of three related species (O. drummondii, A. carambola, and C. follicularis) from the family Oxalidaceae was determined. After a comparison of gene order and multiple sequence alignment, comparative sequence analysis was used to identify missing and unclear gene annotations. For whole-genome alignment, MAFFT version 7.222 [82] was used with default parameters, and pairwise sequence divergence was calculated using Kimura's two-parameter (K2P) model [83]. MEGA 6 software [78] was used to evaluate the relative synonymous codon usage while avoiding the effect of amino acid composition. Finally, the divergence of the new O. corniculata cp genome from related species of family Oxalidaceae was determined using mVISTA [84] in Shuffle-LAGAN mode, with the new O. corniculata genome as the reference. To resolve the phylogenetic position of O. corniculata within the family Oxalidaceae and to examine the relationships among 20 families (Fabaceae, Apodanthaceae, Zygophyllaceae, Cephalotaceae, Oxalidaceae, Celastraceae, Euphorbiaceae, Malpighiaceae, Chrysobalanaceae, Violaceae, Passifloraceae, Salicaceae, Cucurbitaceae, Fagaceae, Juglandaceae, Betulaceae, Elaeagnaceae, Ulmaceae, Cannabaceae, Moraceae, Rosaceae) in the monophyletic rosid clade, about 60 shared genes from 191 cp genomes were downloaded from the National Center for Biotechnology Information (NCBI) database. The 60 shared genes were aligned using MAFFT version 7.222 [82] with default parameters. The maximum likelihood (ML) method was used to infer the phylogenetic trees in MEGA 6 [78], using a BIONJ tree, 1000 bootstrap replicates, and the Kimura two-parameter model with gamma-distributed rate heterogeneity and invariant sites.
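Under the K2P model, the distance between two aligned sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. A minimal sketch of this calculation follows; the function name k2p_distance is illustrative, and skipping sites with gaps or ambiguous bases is an assumption rather than a documented setting of this study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences of equal length."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = sites = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if sites == 0:
        raise ValueError("no comparable sites")

    p = transitions / sites      # proportion of transitional differences
    q = transversions / sites    # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Because transitions and transversions are weighted separately, the K2P distance is slightly larger than the raw proportion of differing sites, which corrects for multiple substitutions at the same site.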

Conclusions
The current findings provide a detailed understanding of the complete cp genome of O. corniculata, sequenced for the first time on the Illumina HiSeq-2000 platform. The gene order and gene structure of O. corniculata were found to be similar to those of three related species from the family Oxalidaceae. Through detailed bioinformatic analysis and comparative assessment, we characterized essential genetic features such as repetitive sequences, SSRs, codon usage, IR contraction and expansion, sequence divergence, and phylogenomic placement. Whole cp genome comparisons revealed an overall high degree of sequence similarity between O. corniculata and O. drummondii and some divergence in the intergenic spacers of the other species. No major structural rearrangement was observed among the four cp genomes. Phylogenomic analyses of the complete plastid genomes revealed that O. corniculata is closely related to O. drummondii. The plastome dataset presented here, together with the detailed comparative analysis of O. corniculata and related species, provides a powerful genetic resource for future studies of the molecular phylogeny, evolution, population genetics, and biological functions of genus Oxalis.

Conflicts of Interest:
The authors have declared that no competing interests exist.
Availability of Data and Materials: All data generated or analyzed during this study are included in this published article.