Evolutionary Analysis of Plastid Genomes of Seven Lonicera L. Species: Implications for Sequence Divergence and Phylogenetic Relationships

Plant plastomes play crucial roles in species evolution and phylogenetic reconstruction studies due to being maternally inherited and due to the moderate evolutionary rate of genomes. However, patterns of sequence divergence and molecular evolution of the plastid genomes in the horticulturally- and economically-important Lonicera L. species are poorly understood. In this study, we collected the complete plastomes of seven Lonicera species and determined the various repeat sequence variations and protein sequence evolution by comparative genomic analysis. A total of 498 repeats were identified in plastid genomes, which included tandem (130), dispersed (277), and palindromic (91) types of repeat variations. Simple sequence repeat (SSR) elements analysis indicated the enriched SSRs in seven genomes to be mononucleotides, followed by tetra-nucleotides, dinucleotides, tri-nucleotides, hex-nucleotides, and penta-nucleotides. We identified 18 divergence hotspot regions (rps15, rps16, rps18, rpl23, psaJ, infA, ycf1, trnN-GUU-ndhF, rpoC2-rpoC1, rbcL-psaI, trnI-CAU-ycf2, psbZ-trnG-UCC, trnK-UUU-rps16, infA-rps8, rpl14-rpl16, trnV-GAC-rrn16, trnL-UAA intron, and rps12-clpP) that could be used as the potential molecular genetic markers for the further study of population genetics and phylogenetic evolution of Lonicera species. We found that a large number of repeat sequences were distributed in the divergence hotspots of plastid genomes. Interestingly, 16 genes were determined under positive selection, which included four genes for the subunits of ribosome proteins (rps7, rpl2, rpl16, and rpl22), three genes for the subunits of photosystem proteins (psaJ, psbC, and ycf4), three NADH oxidoreductase genes (ndhB, ndhH, and ndhK), two subunits of ATP genes (atpA and atpB), and four other genes (infA, rbcL, ycf1, and ycf2). Phylogenetic analysis based on the whole plastome demonstrated that the seven Lonicera species form a highly-supported monophyletic clade. The availability of these plastid genomes provides important genetic information for further species identification and biological research on Lonicera.


Introduction
The genus Lonicera, which includes approximately 200 species, is a major component of the family Caprifoliaceae, comprising a large number of horticultural and economically important shrubs and tree species [1]. These plants are generally distributed in the temperate and subtropical regions of North America, Europe, Asia, and Africa [2], and about 100 Lonicera species are found in China. The majority of these species have important medicinal properties. For example, the extracts from
The overall GC content is similar in the seven genomes, at about 38.4%. The overall GC content is unequally distributed across the plastid genome, which is the highest in the IR region (43.4%), followed by LSC (38.4%) and SSC (33.2%) regions. We summarized the codon usage and anticodon recognition patterns in the seven plastid genomes (Figure 2). Protein-coding genes comprise 25,110 amino acids in L. hispida, 25,178 amino acids in L. nervosa, and 25,222 amino acids in L. ferdinandi. Among these codons, those for leucine (10.8%) and isoleucine (8.2%) are the most common, and cysteine was the least frequently coded amino acid in the seven plastid genomes (Figure 2). UGC (two copies)). The overall GC content is similar in the seven genomes, at about 38.4%. The overall GC content is unequally distributed across the plastid genome, which is the highest in the IR region (43.4%), followed by LSC (38.4%) and SSC (33.2%) regions. We summarized the codon usage and anticodon recognition patterns in the seven plastid genomes ( Figure 2). Protein-coding genes comprise 25,110 amino acids in L. hispida, 25,178 amino acids in L. nervosa, and 25,222 amino acids in L. ferdinandi. Among these codons, those for leucine (10.8%) and isoleucine (8.2%) are the most common, and cysteine was the least frequently coded amino acid in the seven plastid genomes ( Figure 2).

Figure 1.
Chloroplast genome map for three Lonicera species. Genes located outside the outer rim are transcribed in a counterclockwise direction, whereas genes inside the outer rim are transcribed in a clockwise direction (as indicated by grey arrows). The colored bars indicate known different functional groups. The dashed gray area in the inner circle shows the percentage GC content of the corresponding genes. LSC, SSC, and IR denote large single copy, small single copy, and inverted repeat, respectively.

Figure 1.
Chloroplast genome map for three Lonicera species. Genes located outside the outer rim are transcribed in a counterclockwise direction, whereas genes inside the outer rim are transcribed in a clockwise direction (as indicated by grey arrows). The colored bars indicate known different functional groups. The dashed gray area in the inner circle shows the percentage GC content of the corresponding genes. LSC, SSC, and IR denote large single copy, small single copy, and inverted repeat, respectively.

Repeat Sequences Analysis
We identified 313 SSR loci in the seven Lonicera plastids (Table S1, Figure 3a). Each species contains 41-49 SSRs (mean 45 SSRs). Among them, the mono-nucleotides repeat is the most common, which accounts for about 65.8% of total SSRs, followed by tetra-nucleotides (15.0%), dinucleotides (8.3%), tri-nucleotides (5.4%), and hex-nucleotides (4.5%). Penta-nucleotides (0.9%) were very rare across the plastid genomes. Mononucleotide SSRs are especially rich in A/T repeats (about 94%) in each Lonicera species. We found that the number of tetra-nucleotides SSRs is the largest in total SSRs, except mononucleotides. Most SSRs are located in noncoding sections (75%) and about 25% are in protein-coding regions (ycf1, ycf2, atpB, rpoA, rpoB, rpoC1, rpoC2, ndhF, ccsA, and rpl23) ( Table S1). The numbers and distributions of all of the repeat types in the seven plastid genomes are similar and conserved ( Figure 3, Tables S2 and S3). We identified 498 other types of repeats, which included tandem (91), dispersed (277), and palindromic (130) repeats in the seven Lonicera plastids. The number of dispersed repeats is more than that of palindromic repeats, and tandem is the lowest in these species. The length of the repeat units mainly ranges from 30 to 45 bp ( Figure 3). These repeat sequences are mainly distributed in non-coding regions, whereas only a few are located in coding regions (ycf2, ycf1, rpl20, rps18, ndhA, rps7, and ndhI). A large number of repeat sequences are distributed around the pseudogene accD in these Lonicera species.

Gene Group
Gene Name

Gene Group Gene Name
Envelope membrane protein cemA ---- Note: (a) two gene copies in seven Lonicera species; (b) pseudogene in the seven Lonicera chloroplast genomes.

Repeat Sequences Analysis
We identified 313 SSR loci in the seven Lonicera plastids (Table S1, Figure 3a). Each species contains 41-49 SSRs (mean 45 SSRs). Among them, the mono-nucleotides repeat is the most common, which accounts for about 65.8% of total SSRs, followed by tetra-nucleotides (15.0%), dinucleotides (8.3%), tri-nucleotides (5.4%), and hex-nucleotides (4.5%). Penta-nucleotides (0.9%) were very rare across the plastid genomes. Mononucleotide SSRs are especially rich in A/T repeats (about 94%) in each Lonicera species. We found that the number of tetra-nucleotides SSRs is the largest in total SSRs, except mononucleotides. Most SSRs are located in noncoding sections (75%) and about 25% are in protein-coding regions (ycf1, ycf2, atpB, rpoA, rpoB, rpoC1, rpoC2, ndhF, ccsA, and rpl23) ( Table S1). The numbers and distributions of all of the repeat types in the seven plastid genomes are similar and conserved ( Figure 3, Tables S2 and S3). We identified 498 other types of repeats, which included tandem (91), dispersed (277), and palindromic (130) repeats in the seven Lonicera plastids. The number of dispersed repeats is more than that of palindromic repeats, and tandem is the lowest in these species. The length of the repeat units mainly ranges from 30 to 45 bp ( Figure 3). These repeat sequences are mainly distributed in non-coding regions, whereas only a few are located in coding regions (ycf2, ycf1, rpl20, rps18, ndhA, rps7, and ndhI). A large number of repeat sequences are distributed around the pseudogene accD in these Lonicera species.

Divergence Hotspots of Plastid Genomes
The coding genes, non-coding regions, and complete chloroplast genomes of seven Lonicera species were compared using the mVISTA program. To elucidate the level of sequence divergence, the percentages of variation were also calculated. As expected, non-coding regions and SC regions exhibited the higher levels of divergence than the coding and IR regions (Tables S4 and S5, Figures 4 and 5). The percentage of variation in non-coding regions ranges from 0 to 61.3%, with an average of 9.37%, which is higher than that in the coding regions (ranging from 0 to 13.4%, an average of 2.50%). In coding regions, seven genes have the greatest variability (>5%): rps15, rps16, rps18, rpl23, psaJ, infA,

Divergence Hotspots of Plastid Genomes
The coding genes, non-coding regions, and complete chloroplast genomes of seven Lonicera species were compared using the mVISTA program. To elucidate the level of sequence divergence, the percentages of variation were also calculated. As expected, non-coding regions and SC regions exhibited the higher levels of divergence than the coding and IR regions (Tables S4 and S5, Figures 4 and 5). The percentage of variation in non-coding regions ranges from 0 to 61.3%, with an average of 9.37%, which is higher than that in the coding regions (ranging from 0 to 13.4%, an average of 2.50%). In coding regions, seven genes have the greatest variability (>5%): rps15, rps16, rps18, rpl23, psaJ, infA, and ycf1. Eleven intergenic regions have a percentage exceeding 20%: trnN-GUU-ndhF, rpoC2-rpoC1, rbcL-psaI, trnI-CAU-ycf2, psbZ-trnG-UCC, trnK-UUU-rps16, infA-rps8, rpl14-rpl16, trnV-GAC-rrn16, trnL-UAA intron, and rps12-clpP ( Figure 5).   We analyzed the border structure of seven Lonicera plastid genomes. Detailed comparisons of the LSC, SSC, and IR regions are shown in Figure 6. The rpl23 gene located in the IRb extended into the LSC region by about 170-176 bp. The trnN and ndhF genes are located in either side of LSC/IRb border and 969-1068 bp apart, whereas the ndhF gene is located in boundary of L. japonica. The ycf1 gene is located in the SSC region, which ranges from 97 bp (L. ferdinandi) to 333 bp (L. hispida) away from the SSC/IRa border. IRa/LSC border performance is relatively stable, and the trnH gene is located 277-286 bp upstream of the IRa/LSC border.

Positive Selection Analysis
We detected 14 genes with positively selected sites via LRT tests (M0 vs. M3, M1 vs. M2, and M7 vs. M8) (p < 0.05, Tables S6 and S7), which included two genes for the subunit of ribosome protein (rpl16, rpl22), three subunits of the photosystem genes (psaJ, psbC, and ycf4), three NADH oxidoreductase genes (ndhB, ndhH, and ndhK), two subunits of ATP genes (atpA and atpB,) and four other genes (infA, rbcL, ycf1, and ycf2). Five genes (ndhB, ndhK, rpl16, rpl22, and ycf4) were detected in only one positively selected sites within Model 8, and the ycf1 and rbcL genes were detected in more than two or, three selected sites for Model 8 than Model 2 with p > 95%, respectively. We detected the most selective sites (18) in the ycf1 gene in the seven Lonicera plastid genomes.

Phylogenetic Analysis
To obtain an accurate phylogenetic relationship of Lonicera species, we performed multiple sequence alignments of 20 complete plastid genomes. The obtained topology is presented in Figure  7. The basic topologies were similar in the MP and ML analyses, which showed that the 18 Dipsacales species were divided into two parts, containing six Adoxaceae and 12 Caprifoliaceae species. Within Caprifoliaceae, Patrinia saniculifolia Hemsl. was placed as a sister clade to Linnaceae (Dipelta floribunda Maximowicz and Kolkwitzia amabilis Graebner) with 100% bootstrap values. We found that the seven Lonicera species formed a highly-supported monophyletic lineage. L. tragophylla separated first of seven Lonicera species. Three Lonicera species (L. fragrantissima var. lancifolia, L. stephanocarpa, and L. hispida) and the other three species (L. ferdinandi, L. nervosa, and L. japonica) formed a sister clade with high bootstrap value.

Positive Selection Analysis
We detected 14 genes with positively selected sites via LRT tests (M0 vs. M3, M1 vs. M2, and M7 vs. M8) (p < 0.05, Tables S6 and S7), which included two genes for the subunit of ribosome protein (rpl16, rpl22), three subunits of the photosystem genes (psaJ, psbC, and ycf4), three NADH oxidoreductase genes (ndhB, ndhH, and ndhK), two subunits of ATP genes (atpA and atpB,) and four other genes (infA, rbcL, ycf1, and ycf2). Five genes (ndhB, ndhK, rpl16, rpl22, and ycf4) were detected in only one positively selected sites within Model 8, and the ycf1 and rbcL genes were detected in more than two or, three selected sites for Model 8 than Model 2 with p > 95%, respectively. We detected the most selective sites (18) in the ycf1 gene in the seven Lonicera plastid genomes.

Phylogenetic Analysis
To obtain an accurate phylogenetic relationship of Lonicera species, we performed multiple sequence alignments of 20 complete plastid genomes. The obtained topology is presented in Figure 7. The basic topologies were similar in the MP and ML analyses, which showed that the 18 Dipsacales species were divided into two parts, containing six Adoxaceae and 12 Caprifoliaceae species. Within Caprifoliaceae, Patrinia saniculifolia Hemsl. was placed as a sister clade to Linnaceae (Dipelta floribunda Maximowicz and Kolkwitzia amabilis Graebner) with 100% bootstrap values. We found that the seven Lonicera species formed a highly-supported monophyletic lineage. L. tragophylla separated first of seven Lonicera species. Three Lonicera species (L. fragrantissima var. lancifolia, L. stephanocarpa, and L. hispida) and the other three species (L. ferdinandi, L. nervosa, and L. japonica) formed a sister clade with high bootstrap value.

Features of Plastid Genomes
The available plastid genome sequences of most land plants have increased rapidly with the development of next generation sequencing (NGS) methods. However, the plastid genomes of Lonicera remained relatively limited, with only four species (L. japonica, L. fragrantissima var. lancifolia, L. stephanocarpa, and L. tragophylla) being reported [13,23]. Generally, most angiosperm plastid genomes are considered highly conserved in terms of their structure, gene content, and order [14]. In this study, we showed that the genome size of seven Lonicera species ranged from 154,513 to 155,545 bp, containing 82 protein-coding genes, 37 tRNA genes, 8 rRNA genes, and one pseudogene within quadripartite structure (LSC, 88,504-89,299 bp; SSC, 18,552-18,766 bp; and IR, 23,646-23,791 bp). The structure characteristics of the chloroplast genomes of these species are similar to those of most angiosperms [29]. In terms of GC content of the seven Lonicera plastids, the complete chloroplast genome had an overall GC content of ~38.4%, similar to the previously published L. japonica genome [13]. The GC content of IR regions is clearly higher than in the other regions, which are highly similar to most of land plants possibly due to the existence of the rRNA gene [30].
The pseudogenes in plastid genomes are functionless relatives of genes that have lost their ability to code and express a protein [31] relative to a complete gene. Although pseudogenes are not protein-coding DNA, these segment sequences may be similar to other kinds of noncoding regions, which may have a regulatory function [32] and have important roles in normal physiology and abnormal pathology [33]. In this study, we determined that the accD gene encoding a subunit of heteromeric acetyl-CoA carboxylase is a pseudogene in seven Lonicera species. The accD gene is known to be essential for leaf development in angiosperms [34]. Previous studies have shown that the accD gene has been lost in some angiosperm plastid genomes including Poales [35], Acoraceae [36,37], and Geraniaceae [38]. This gene may have played the main role in the physiological regulation in Lonicera species.

Features of Plastid Genomes
The available plastid genome sequences of most land plants have increased rapidly with the development of next generation sequencing (NGS) methods. However, the plastid genomes of Lonicera remained relatively limited, with only four species (L. japonica, L. fragrantissima var. lancifolia, L. stephanocarpa, and L. tragophylla) being reported [13,23]. Generally, most angiosperm plastid genomes are considered highly conserved in terms of their structure, gene content, and order [14]. In this study, we showed that the genome size of seven Lonicera species ranged from 154,513 to 155,545 bp, containing 82 protein-coding genes, 37 tRNA genes, 8 rRNA genes, and one pseudogene within quadripartite structure (LSC, 88,504-89,299 bp; SSC, 18,552-18,766 bp; and IR, 23,646-23,791 bp). The structure characteristics of the chloroplast genomes of these species are similar to those of most angiosperms [29]. In terms of GC content of the seven Lonicera plastids, the complete chloroplast genome had an overall GC content of~38.4%, similar to the previously published L. japonica genome [13]. The GC content of IR regions is clearly higher than in the other regions, which are highly similar to most of land plants possibly due to the existence of the rRNA gene [30].
The pseudogenes in plastid genomes are functionless relatives of genes that have lost their ability to code and express a protein [31] relative to a complete gene. Although pseudogenes are not protein-coding DNA, these segment sequences may be similar to other kinds of noncoding regions, which may have a regulatory function [32] and have important roles in normal physiology and abnormal pathology [33]. In this study, we determined that the accD gene encoding a subunit of heteromeric acetyl-CoA carboxylase is a pseudogene in seven Lonicera species. The accD gene is known to be essential for leaf development in angiosperms [34]. Previous studies have shown that the accD gene has been lost in some angiosperm plastid genomes including Poales [35], Acoraceae [36,37], and Geraniaceae [38]. This gene may have played the main role in the physiological regulation in Lonicera species.

Repeat Sequence Variations
Previous studies suggested that repeat sequences may have played crucial roles in the rearrangement and stabilization of plastid genomes [39]. In the current study, we determined the dispersed, palindromic, and tandem repeats in seven Lonicera species, which showed that the number of tandem repeats is more than that of dispersed repeats, and palindromic repeats are the least common in these species. The majority of repeats were distributed in the intergenic spacer and intron regions, which is similar to those reported in other angiosperm lineages [26]. Variability in the copy number of SSRs in the chloroplast is generally polymorphic and can be used to analyze the population genetics and evolutionary studies at the inter-and intra-population levels [40]. We identified 313 SSR loci in the seven Lonicera plastid genomes. Most of these SSRs are located in noncoding regions (75%) and about of 25% are in protein-coding regions, similar to other angiosperms [41]. More tetra-nucleotide SSRs occur in the seven Lonicera plastomes. Among them, (AGAT) 3 and (TATC) 3 are shared by two ycf2 genes. The (AGAT) 3 repeat unit was also found in the pseudo-gene accD region in six Lonicera species, except for L. tragophylla. This large number of repeat sequences and SSRs is possibly related to the plastid genome size variation and divergence [42]. We identified 18 divergence hotspots (rps15, rps16, rps18, rpl23, psaJ, infA, ycf1, trnN-GUU-ndhF, rpoC2-rpoC1, rbcL-psaI, trnI-CAU-ycf2, psbZ-trnG-UCC, trnK-UUU-rps16, infA-rps8, rpl14-rpl16, trnV-GAC-rrn16, trnL-UAA intron, and rps12-clpP) in seven Lonicera plastid genomes. A large number of repeat sequences are also distributed in these divergence hotspot regions. These regions could be considered as potential molecular genetic markers for further study of population genetics and species evolution of Lonicera.

Positive Selection Analysis
Synonymous and nonsynonymous nucleotide substitutions are important markers for protein coding gene evolution. Generally, the rates of nonsynonymous and synonymous substitution in plant chloroplast genomes are relatively slow [43] due to the action of purifying and neutral selection [44]. In this study, we identified 14 protein-coding genes under positive selection. These genes included two small subunits of ribosome genes (rpl16 and rpl22) that have been proven to be essential for the chloroplast ribosome development in plants [45]. Eleven genes (ndhA-ndhK) were found in the plastid genomes of most plants, encoding the NAD(P)H dehydrogenase (NDH) complex, which is involved in the I circulatory electron transport and chlororespiration, whereas three of these genes (ndhB, ndhH, and ndhK) were found to own selected sites. The family genes of psa and psb, and ycf3 and ycf4 genes were found to play vital roles in plant photosystem. The psaJ and psbC genes respectively belong to photosystem I and photosystem II. The ycf4 gene forms modules that mediate PSI assembly as conserved chloroplast-encoded auxiliary factors [46]. The gene infA encodes translation initiation factor 1. It has been lost completely in some angiosperms [47,48] and is present as a pseudogene in the majority of angiosperms [47,48]. The rbcL gene was also found to play an important role as a photosynthetic electron transfer regulator, which is essential for photosynthesis [49]. We found rbcL gene possess nine sites under positive selection in these Lonicera species. A previous study also showed that the rbcL gene is often under positive selection in land plants [23,50]. The ycf1 and ycf2 genes are the largest plastid genes, encoding a protein that was part of the chloroplast inner envelope membrane protein translocon [51]. We identified 7 and 18 positively selected sites in the ycf1 and ycf2 genes, respectively. The current study also revealed that the positive selection of these two genes in angiosperm plants may be a common phenomenon [42].

Phylogenetic Relationship
In the previously phylogenetic results of Caprifoliaceae, Rehder [7] divided Lonicera species into two subgenera: Lonicera and Caprifolium. Lonicera subgenera contains four sections, Coeloxylosteum, Isoxylosteum, Nintooa, and Isika. In our study, phylogenetic analysis based on the complete plastid genomes showed that the seven Lonicera species form a highly-supported monophyletic lineage.
L. tragophylla is separated from the seven Lonicera species. Some previously studies based on the partial nuclear and chloroplast DNA markers found that L. ferdinandi, L. hispida, L. stephanocarpa, and L. fragrantissima var. lancifolia belong to the Isika section, and L. nervosa belongs to Rhodanthae subsection, and L. japonica belongs to Nintooa [12]. These incongruent results may be due to the different sampling strategies and different molecular markers that were used. We also found that the three Lonicera species (L. fragrantissima var. lancifolia, L. stephanocarpa, and L. hispida) and the other three species (L. ferdinandi, L. nervosa, and L. japonica) form a sister clade with high bootstrap values. L. ferdinandi is closely related to L. japonica and L. nervosa. These findings are similar to previous morphological analyses of Caprifoliaceae species [7,12]. In conclusion, the results of phylogenetic analysis based on the plastid genomes greatly enhance our understanding of the evolutionary relationships among Lonicera species [52,53]. In the future, the more plastid genome datasets are needed to test the phylogenetic relationship and species evolution of Lonicera species.

Chloroplast Genome Assembly and Annotation
We trimmed the raw reads by removing the shorter and low-quality reads using NGSQCToolkit v2.3.3 software [54]. After clean reads of L. nervosa, L. ferdinandi, and L. hispida were assembled using MIRA 4.0.2 [55] with the complete plastid genome of closely-related species L. japonica (NC_026839) as the reference. To further assemble the whole plastid genomes, some ambiguous regions were extended using the MITObim v1.7 program [56] with a baiting and iteration method. The complete chloroplast genome sequences were imported into the online program Dual Organellar Genome Annotator (Dogma) [57] for annotation. The positions of starts, stops, introns, and exons were manually adjusted by comparison with homologous genes in other chloroplast genomes (L. japonica, L. fragrantissima var. lancifolia, L. stephanocarpa, and L. tragophylla). All tRNA genes were further confirmed using online tool tRNAscan-SE [58]. Eventually, the circular plastid genome maps were drawn using the bio-software OGDRAW [59]. The plastid genome sequences of the three Lonicera species and their raw reads were submitted to NCBI (accession numbers: MK176510-MK176512, SRR8269399, and SRR8269400).

Repeat Sequence Analysis
In general, the long repeat contains dispersed, palindromic, and tandem repeats. In our study, the online software REPuter [60] was used to identify the dispersed and palindromic repeats with following conditions: (1) hamming distance of 1, (2) 90% or greater sequence identity, (3) and a minimum repeat size of 30 bp. The tandem repeats (>10 bp) were determined using the program Tandem Repeats Finder [61] with 2, 7, and 7 set for the alignment parameters match, mismatch, and indel, respectively. SSR loci were further detected using MISA software [62] with following thresholds: 10, 5, 4, 3, 3, and 3 repeat units for mono-nucleotide, di-nucleotide, tri-nucleotide, tetra-nucleotide, penta-nucleotide, and hexa-nucleotide SSRs, respectively.

Sequence Divergence Analysis
The complete plastid genomes of seven Lonicera species were compared using web-based program mVISTA [63] with L. japonica as the reference. To further identify the percentage of variable characters for each coding and non-coding region, the SNP sites were counted and positioned in the plastid genomes using DnaSP v5.0 [64].

Gene Selection Sites Analysis
The non-synonymous/synonymous substitution rate ratio (ω = dN/dS) is sensitive to the selection pressure in the evolution of protein level, and is particularly useful for identifying positive selection. A total of 75 protein-coding genes in Lonicera plastid genomes were extracted and compared using Genious R v9.0.5 [65] and MAFFT v7.0.0 [66]. The maximum likelihood phylogenetic tree was constructed using the program RAxML v7.2.8 [67] based on complete plastid genomes. The value of dN, dS, and ω for each gene exon were calculated using the site-specific model in the codeml program of Paml 4.7 [68]. In order to choose a more reliable model, we carried out the three likelihood ratio tests (LRT). The candidate sites of positive selection with significant support from posterior probability (p of (ω > 1) ≥ 0.99; Bayes Empirical Bayes approach) identified by M2 and M8 were considered further.

Phylogenetic Analysis
Phylogenetic analyses were performed on aligned data from 20 complete plastid genomes, which included 18 Dipsacales and two Apiaceae species, as demonstrated using Maximum parsimony (MP), Maximum likelihood (ML), and Bayesian inference (BI) analyses. Firstly, plastid genomes were aligned using MAFFT v7.0.0 [66] and the best-fitting model was selected using the MrModeltest 2.3 [69] through the Akaike information criterion (AIC). The ML and MP analyses were conducted using PAUP4 [70] with 1000 bootstrap replicates. BI analyses were performed using the program MrBayes v3.1.2 [71] with the settings as following: 1,000,000 generations Monte Carlo simulations (MCMC) algorithm, starting from random trees, and sampling 1 of every 1000 generations. Then 25% of all trees were burned using the software Tracer v1.6 [72].

Conclusions
In this study, we collected the complete chloroplast genomes of seven Lonicera species and determined the sequence variations and molecular evolution by comparative genomic analysis. The genus Lonicera plastomes exhibited a typical quadripartite DNA molecular structure, which is similar to those in other angiosperm species. A total of 498 repeats were identified in plastid genomes, which included tandem (130), dispersed (277), and palindromic (91) types of repeat variations. Simple sequence repeat (SSR) elements analysis indicated the enriched SSRs in seven plastomes to be mononucleotides, followed by tetra-nucleotides, dinucleotides, tri-nucleotides, hex-nucleotides, and penta-nucleotides. Interestingly, we determined eighteen divergence hotspot regions in these horticulturally-and economically-important Lonicera plastomes, which could be used as the potential molecular genetic markers for the further study of population genetics and phylogenetic evolution of Lonicera species. Selection pressure analysis showed that some plastid genes were under positive selection, which may played the important roles during the evolutionary process of Lonicera. Phylogenetic analysis based on the whole plastome revealed that the seven Lonicera species form a highly-supported monophyletic clade. The availability of these plastid genomes provides important genetic information for further species identification and evolutionary biological research on Lonicera.