Characterization of the Chloroplast Genome Sequence of Acer miaotaiense: Comparative and Phylogenetic Analyses

Acer miaotaiense is an endangered species within the Aceraceae family, and has only a few small natural distributions in China's Qinling Mountains and Bashan Mountains. Comparative analyses of the complete chloroplast genome could provide useful knowledge on the diversity and evolution of this species in different environments. In this study, we sequenced and compared the chloroplast genomes of Acer miaotaiense from five ecological regions in the Qinling and Bashan Mountains of China. The size of the chloroplast genome ranged from 156,204 bp to 156,260 bp, including two inverted repeat regions, a small single-copy region, and a large single-copy region. Across the whole chloroplast genome, there were 130 genes in total, and 92 of them were protein-coding genes. We observed four genes with non-synonymous mutations involving post-transcriptional modification (matK), photosynthesis (atpI), and self-replication (rps4 and rpl20). A total of 415 microsatellite loci were identified, and the dominant microsatellite types were composed of dinucleotide and trinucleotide motifs. The dominant repeat units were AT and AG, accounting for 37.92% and 31.16% of the total microsatellite loci, respectively. A phylogenetic analysis showed that the samples from the same altitude (Xunyangba, Ningshan county, and Zhangliangmiao, Liuba county) clustered together with a strong bootstrap value (88%), while the remaining ones shared a similar longitude. These results provided clues about the importance of longitude/altitude for the genetic diversity of Acer miaotaiense. This information will be useful for the conservation and improved management of this endangered species.


Introduction
Acer miaotaiense, an endangered species [1], is only distributed in several small and isolated regions of China's Qinling Mountains and Bashan Mountains [2]. Since its discovery in 1954, this species has added to the known genetic diversity of this region, and has also provided potential scientific value for analyzing its origin and evolution. This species is an important ornamental plant, and has a wide range of medical applications in traditional Chinese medicine. Many highly bioactive compounds with good pharmacological effects, such as flavonoids, tannins, terpenoids, and alkaloids, are extracted from plants belonging to the genus Acer [3]. However, this species is vulnerable to environmental changes and anthropogenic disturbance [4,5]. It was listed as a vulnerable species in a recent nationwide biodiversity report [6]. A recent genetic diversity study using RAPD markers showed that this species has a low level of genetic diversity, especially at the population level; the gene flow among populations is also low [7]. Clustering analysis showed that genetic differentiation occurred between individuals from the middle and west of the Qinling Mountains, and clustering results were inconsistent with

Genomic Features
Sequencing of Acer miaotaiense yielded, on average, 19,087,438 reads and 5,726,231,340 bases per sample (Table S1). The percentage of high-quality bases (≥Q30, representing 99.9% base call accuracy) was quite high (Q30 > 91.67%). The GC content, which is an important indicator of species affinity, accounted for 36.77% of the total chloroplast genome. All samples from the five locations had a narrow range of GC content. This observed content was similar to that of other Acer species [11][12][13], and was slightly lower than the value (37.88%) reported for Acer miaotaiense from Taibai county [2]. The total genome size was 156,238 bp on average (Table 1), which was similar to a recent report [2]. The genome contains four regions, including two inverted repeat regions (IRA and IRB), a small single-copy (SSC) region, and a large single-copy (LSC) region (Figure 1 and Figures S2-S4). The largest genomic component was the LSC, with 86,095 bp on average, accounting for 55% of the chloroplast genome. In comparison, the IR and SSC regions accounted for 33.33% and 11.57%, respectively (on average). The largest and smallest chloroplast genomes of Acer miaotaiense were found in Zhangliangmiao, Liuba county (ZL) and Dianbingchang, Meixian county (DB), with lengths of 156,260 bp and 156,204 bp, respectively (Table 1). The structure of the cp genome of Acer miaotaiense was similar to those from the other Acer species, such as Acer buergerianum [15], Acer morrisonense [16], and Acer davidii [17], in terms of genome size and the length of the four main regions. Across the whole chloroplast genome, there were 130 genes in total, and 92 of them were protein-coding genes (Table 2). The remaining ones were tRNA and rRNA genes, with numbers of 30 and eight, respectively.
Among them, genes involved in photosynthesis and self-replication were the two dominant gene families (Table 2). There were six genes coding the subunits of ATP synthase and 11 genes associated with the subunits of NADH dehydrogenase. Five genes (psaA, psaB, psaC, psaI, and psaJ) and 15 genes (psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, and psbZ) were also identified as coding the subunits of photosystem I and photosystem II, respectively. Of the 130 genes, there were five genes with unknown functions (ycf1, ycf2, ycf3, ycf4, and ycf15). These results showed that, for the cp genome of Acer miaotaiense, the number of genes and protein-coding genes is quite conserved across the five studied geographical locations. The total number of genes was slightly lower than that for some Acer species, such as Acer davidii Franch (134 genes) [17], and Acer buergerianum (134 genes) [15], but with a higher number of protein-coding genes. During the evolution of angiosperms, multiple gene losses remain an ongoing process [10,18,19,20]. Functional and non-functional genes from the plastid genome can be transferred to the nuclear and the mitochondrial genome [21,22]. In some parasitic species, such as those belonging to the Orobanchaceae family, gene losses can happen in the genes encoding subunits of the genetic apparatus [23], and are not restricted to genes that are involved in photosynthesis and related pathways [24,25]. Such gene loss could serve as a low-cost strategy under disadvantageous environmental conditions by facilitating rapid genome replication [25][26][27].
We observed SNP variations in the coding regions of five genes, namely matK, atpI, rps4, atpB, and rpl20 (Table S2). Among these, the SNP in atpB was a synonymous mutation in ZL (Figure S5), and the remaining four caused non-synonymous mutations (Figure 2). The matK gene is involved in post-transcriptional modification [28], and in regulation mechanisms relating to plant development and, indirectly, photosynthesis [29]. It is commonly referred to as a maturase, associated with several intron-containing plastid mRNAs [30].
The atpI gene encodes a membrane protein from the atp operon, and is necessary during the assembly of some ATP synthase complexes [31]. The rps4 gene belongs to the small ribosomal subunit family, and is involved in decoding genetic information during translation [32]. In contrast, rpl20 belongs to the large ribosomal subunit family, which catalyzes peptide bond formation [33]. Notably, each SNP causing a non-synonymous mutation in the aforementioned genes occurred in at least one location, and all of them occurred in Xunyangba, Ningshan county (NS). The surrounding environment there is more disadvantageous than at the other locations in terms of artificial disturbances, diseases, and pests. These results provided clues regarding local adaptation to the environment via alterations in post-transcriptional modification, photosynthesis, and self-replication. The chloroplast plays vital roles in the biosynthesis of many essential metabolites, such as amino acids, fatty acids, vitamins, etc. These metabolites are important for the response to diverse biotic (e.g., pathogens) and abiotic stresses (heat, drought, salt, etc.) [9,12,34].
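The distinction between synonymous and non-synonymous SNPs drawn above can be illustrated with a minimal Python sketch. It assumes the standard codon assignments (NCBI translation table 1; the plastid table 11 shares these assignments and differs only in start codons), and the function name `classify_snp` and the example codons are hypothetical, not taken from Table S2.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G (first base varies slowest).
# Assumption: the text does not state which translation table was used.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon: str, pos: int, alt: str) -> str:
    """Classify a single-base change at 0-based position `pos` in `codon`."""
    codon = codon.upper()
    mutated = codon[:pos] + alt.upper() + codon[pos + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"


# Hypothetical examples: GCT -> GCC keeps alanine; GAT -> GAA changes D to E.
print(classify_snp("GCT", 2, "C"))
print(classify_snp("GAT", 2, "A"))
```

The sketch ignores RNA editing and reading-frame determination, both of which a real annotation pipeline would need to handle.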

Comparative Genomic Analysis
Comparative analysis of cp genomes is crucial in understanding the diversity and evolution of a plant under different environments [12,35]. The sequence identity was quite high across the five geographical locations (Figure 3). In addition, both the gene order and number were highly conserved in the cp genomes across the five geographical locations. The difference in the length of the chloroplast genome was mainly caused by variance in the length of the LSC regions. The majority of the variances occurred in the conserved non-coding regions. These results indicated that the non-coding regions were less conserved than the coding regions, which was also found in some other species, such as Cerasus humilis [36], Talinum paniculatum [37], and Heimia myrtifolia [38].

IR Expansion and Contraction
After comparison of the boundaries among the LSC, IR, and SSC regions of the cp genome of Acer miaotaiense, we found that the IR contraction and expansion were highly similar across the five geographical locations (Figure 4). The ycf1 gene was located across the SSC and IRB regions, and the starting position was different in Shiziba, Foping county (FP) and Yangpigou, Taibai county (YP) compared with the other three locations. The trnH gene was located in the LSC region, and was shifted 1 bp to the right, compared with the other locations. These results showed that, within the species of Acer miaotaiense, the genomic structure was highly conserved, though genomic length variation can be found in the LSC and SSC boundary regions, as reported in other species [36,38,39].

Microsatellite Detection Analysis
A total of 415 microsatellite loci (or simple sequence repeats, SSRs) were identified in the cp genome of Acer miaotaiense (Figure 5). Among them, the largest number of loci was identified in the LSC region across five geographical locations, accounting for 58% of the total microsatellite loci, followed by the IR regions (Figure 5A). There were 102, 26, and 85 microsatellite loci identified in the protein-coding regions of genes in the LSC, SSC, and IR regions, respectively. No microsatellite loci were found in the intron regions of genes in the cp genome (Figure 5B). The dominant microsatellite types were dinucleotide and trinucleotide with a repeat number of three (Figure 5C).
The length of repeats ranged from six to 24, with an average value of 7.5. The dominant repeat length was six (54.83%), followed by seven (15.22%), and nine (11.84%), as shown in Figure 6A. Up to 20 repeat types were detected across the cp genome of Acer miaotaiense. Among these, AT and AG were the two dominant repeat types, accounting for 37.92% and 31.16% of the total microsatellite loci, respectively (Figure 6B).
Among the 415 microsatellite loci identified, the number of loci in the LSC, SSC, and IR regions was highly conserved across four of the geographical locations (DB, YP, ZL, and NS), with 239, 38, and 138 loci, respectively. For the location of FP, 240, 38, and 137 microsatellite loci were detected in the LSC, SSC, and IR regions, respectively. Similar patterns were also reported in other species, such as Cerasus humilis [36], Arabis stellari [40], and Paeonia ostii [41]. However, though the total number of microsatellites was highly conserved, polymorphisms might still be observed in certain loci, which could be helpful in population and evolutionary studies [42][43][44].

Phylogenetic Analysis
To analyze the phylogenetic relationships of Acer miaotaiense, a total of 61 common coding genes of the cp genomes from eight species (Acer miaotaiense, Acer davidii, Acer morrisonense, Acer griseum, Acer buergerianum, Acer palmatum, Dipteronia dyeriana, and Dipteronia sinensis) were analyzed, with the two Dipteronia species serving as outgroups (Figure 7). The phylogenetic tree demonstrated that the Acer miaotaiense samples from the five geographical locations were closely clustered together with a strong bootstrap value of 100%. All Acer species formed the main groups with strong bootstrap values, excluding the two outgroup species (Dipteronia dyeriana and Dipteronia sinensis). Among the five locations, DB, FP, and YP were located at approximately the same longitude, while NS and ZL, which had the same altitude (1224 m), grouped together with a strong bootstrap value (88%). These results indicated that differences in geographical environment, especially longitude, might be responsible for the genetic diversity of Acer miaotaiense. In addition, sequencing depth, annotation methods [45], gene losses, pseudogenizations, and exceptional gene gains could possibly affect the result [10,19].


Plant Materials and DNA Isolation
Fresh and young leaves of Acer miaotaiense were collected from five geographic locations in Shaanxi, China. The DB, YP, and FP regions had a similar longitude. DB was located in the northernmost area of the five regions (Table 3). YP had the highest altitude, and FP had the lowest altitude. In addition, the climate in YP was more complex, and the distributions of Acer miaotaiense there suffered from severe artificial deforestation activities, resulting in serious damage. DB was located in the middle of the Qinling Mountains, where minor insect activity could be observed. FP had a moderate climate with an average temperature of 25 °C in June. The largest distance (156 km) was found between ZL and NS, while the closest distance was between DB and YP (around 3 km). Total genomic DNA was isolated according to the manufacturer's protocol using the HP PlantDNA Kit D2485-01 (Omega Bio-Tek, Santa Clara, CA, USA). After quantification and qualification, a paired-end library was constructed, and high-throughput sequencing was performed using the Illumina HiSeq 2500 platform (Lemont, IL, USA). After cleaning the raw data, a total of 28.5 Gb of high-quality clean data (≥Q30 values were higher than 89.76% for all samples) was retained. The complete chloroplast genome of Acer miaotaiense was assembled using the NOVOPlasty software [46] with the standard default parameters. The circular map of the fully annotated genome was drawn in OGDRAW v1.2 [47]. All five cp genomes were also deposited in the GenBank database.
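The ≥Q30 figures quoted above rest on the Phred definition Q = -10 log10(p), where p is the probability of a wrong base call, so Q30 corresponds to 99.9% accuracy. The following Python sketch shows how such a fraction could be computed from a single FASTQ quality string. It assumes the Phred+33 ASCII encoding, and the function names are hypothetical; the source does not describe the quality-control tooling that was actually used.

```python
def error_probability(q: float) -> float:
    """Phred definition: Q = -10 * log10(p), hence p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def q30_fraction(quality: str, offset: int = 33) -> float:
    """Fraction of bases with Phred score >= 30 in one FASTQ quality string.

    `offset=33` assumes Phred+33 encoding (an assumption, not from the text).
    """
    if not quality:
        return 0.0
    scores = [ord(ch) - offset for ch in quality]
    return sum(score >= 30 for score in scores) / len(scores)


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
print(q30_fraction("IIII"), q30_fraction("II##"))
```

In practice the fraction would be accumulated over all reads in a sample rather than over a single quality string.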

Comparative Genome Analysis
Comparative analysis of the chloroplast genome of the members of the genus Acer was done using the mVISTA program [48], based on the LAGAN alignment strategy. One genome in the same family as Acer miaotaiense was used as the reference (GenBank Accession No. NC_030343.1). Differences in the types and gene sizes of the IR, LSC, and SSC border regions among related species were also analyzed.
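As a rough illustration of the kind of identity profile that such a comparison produces, the Python sketch below computes percent identity in fixed windows over two sequences that are already aligned. It is not the mVISTA or LAGAN algorithm; the function name `window_identity` and the window and step sizes are assumptions chosen for illustration.

```python
def window_identity(ref: str, qry: str, window: int = 100, step: int = 100):
    """Yield (start, percent_identity) over two pre-aligned sequences.

    Gap columns ('-') never count as matches.
    """
    if len(ref) != len(qry):
        raise ValueError("aligned sequences must have equal length")
    for start in range(0, len(ref) - window + 1, step):
        pairs = zip(ref[start:start + window], qry[start:start + window])
        matches = sum(a == b and a != "-" for a, b in pairs)
        yield start, 100.0 * matches / window


# Toy alignment with one mismatch in the second window.
print(list(window_identity("ACGTACGT", "ACGTACGA", window=4, step=4)))
```

Windows with lower identity in such a profile would correspond to the less conserved regions of the genome.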

Microsatellite Detection Analysis
Microsatellites, also known as simple sequence repeats (SSRs), were analyzed using PHOBOS 3.3.12 (http://www.rub.de/ecoevo/cm/cm_phobos.htm). The detected microsatellite loci were then characterized in more detail by region (IR, LSC, and SSC) and by repeat type.
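To make the idea of microsatellite detection concrete, here is a minimal Python sketch that finds perfect tandem repeats with a regular expression. It is not PHOBOS, which also handles imperfect repeats and uses its own scoring; the function name `find_ssrs` and the thresholds (motifs of 2 to 6 bp, at least three repeat units) are assumptions chosen to mirror the repeat lengths reported in this study.

```python
import re


def find_ssrs(seq: str, min_repeats: int = 3, max_motif: int = 6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier prefers the shortest motif, so ATATAT is reported
    as AT x 3 rather than as a longer unit. Homopolymer runs are skipped.
    """
    seq = seq.upper()
    pattern = re.compile(
        r"([ACGT]{2,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:  # e.g. "AA": a mononucleotide run
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Toy sequence containing an (AT)3 repeat starting at position 2.
print(find_ssrs("GGATATATCC"))
```

Counting the hits by motif and by genome region would give summaries comparable in form to those reported for the LSC, SSC, and IR regions, although the exact numbers depend on the tool and its parameters.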

Conclusions
In this study, we reported and compared the chloroplast genomes of Acer miaotaiense from five ecological regions in the Qinling and Bashan Mountains of China. We observed four genes with non-synonymous mutations involving post-transcriptional modification, photosynthesis, and self-replication. A total of 415 microsatellite loci were identified, which are helpful in developing polymorphic molecular markers. Our phylogenetic analysis showed that samples with the same altitude or similar longitude were more closely related, providing clues to the importance of longitude/altitude for the genetic diversity of Acer miaotaiense. This information will be useful for the conservation and improved management of this endangered species.
Supplementary Materials: The following are available online at http://www.mdpi.com/1420-3049/23/7/1740/ s1, Figure S1: Circular gene map of Acer miaotaiense from the location of YP. Genes at the outside circle are transcribed counterclockwise, while genes inside of the circle are presented clockwise. LSC, large single copy; SSC, small single copy; INA, inverted repeat A; INB, inverted repeat B, Figure S2: Circular gene map of Acer miaotaiense from the location of ZL. Genes at the outside circle are transcribed counterclockwise, while genes inside of the circle are presented clockwise. LSC, large single copy; SSC, small single copy; INA, inverted repeat A; INB, inverted repeat B, Figure S3: Circular gene map of Acer miaotaiense from the location of NS. Genes at the outside circle are transcribed counterclockwise, while genes inside of the circle are presented clockwise. LSC, large single copy; SSC, small single copy; INA, inverted repeat A; INB, inverted repeat B, Figure S4 Table S1: Sequencing quality of Acer miaotaiense from five regions. Table S2. Detailed information on the detected SNPs in the chloroplast genome at five geographical locations.