Comparative Analysis and Phylogenetic Relationships of Ceriops Species (Rhizophoraceae) and Avicennia lanata (Acanthaceae): Insight into the Chloroplast Genome Evolution between Middle and Seaward Zones of Mangrove Forests

Simple Summary: We sequenced the complete chloroplast genomes of three Ceriops species (C. decandra, C. zippeliana, and C. tagal) and Avicennia lanata and performed comparative analyses among them. All chloroplast genomes have a circular quadripartite structure containing LSC, SSC, and two IR regions. The rpl32 gene was lost in C. zippeliana, and the infA gene was present only in A. lanata. Comparative genome analysis showed that the IR contraction or expansion events resulted in the differentiation of three genes and pseudogenes. Additionally, repeats and SSRs were identified and compared among them and other related mangrove species. The phylogenetic analysis strongly supports that C. decandra is evolutionarily closer to C. zippeliana and A. lanata is closer to A. marina. In addition, two primer pairs were developed for species identification among the three Ceriops species.
Abstract: Ceriops and Avicennia are true mangroves in the middle and seaward zones of mangrove forests, respectively. The chloroplast genomes of Ceriops decandra, Ceriops zippeliana, and Ceriops tagal were assembled into lengths of 166,650, 166,083 and 164,432 bp, respectively, whereas Avicennia lanata was 148,264 bp in length. The gene content and gene order are highly conserved among these species. The chloroplast genome contains 125 genes in A. lanata and 129 genes in Ceriops species. Three duplicate genes (rpl2, rpl23, and trnM-CAU) were found in the IR regions of the three Ceriops species, resulting in expansion of the IR regions. The rpl32 gene was lost in C. zippeliana, whereas the infA gene was present in A. lanata. Short repeats (<40 bp) and a lower number of SSRs were found in A. lanata but not in Ceriops species. The phylogenetic analysis supports that all Ceriops species are clustered in Rhizophoraceae and A. lanata is in Acanthaceae. In a search for genes under selective pressures of coastal environments, the rps7 gene was under positive selection compared with non-mangrove species. Finally, two specific primer sets were developed for species identification of the three Ceriops species. Thus, this finding provides insightful genetic information for evolutionary relationships and molecular markers in Ceriops and Avicennia species.

Ceriops (Rhizophoraceae, Rosids) and Avicennia (Acanthaceae, Asterids) are classified as true mangroves and are the most dominant genera in the middle and seaward zones of mangrove forests, respectively [1,3,24]. Both genera have adapted to extreme conditions in mangrove habitats. For example, Ceriops is viviparous: its seeds germinate and develop into propagules while still attached to the mother plant. It is also a salt excluder that filters out salt at the roots [25]. Avicennia is specially adapted, with pneumatophores (pencil-like aerial roots) and salt glands on the upper and lower leaf surfaces, which secrete excess salt from the leaves [26,27]. The genus Ceriops contains five species, including Ceriops australis, Ceriops decandra, Ceriops pseudodecandra, Ceriops tagal, and Ceriops zippeliana [28][29][30][31]. In the past, C. zippeliana was believed to be a synonym of C. decandra [7,29]; however, differences in morphology and in a trnL intron of chloroplast DNA between them were reported, and they were suggested to be different species [31]. C. decandra and C. tagal are widespread species with a large geographical range from Eastern Africa and throughout tropical Asia and Northern Australia to Melanesia, Micronesia, and Southern China [7,28,29], while C. pseudodecandra and C. australis are endemic to Australia [32]. C. zippeliana is found in Southeast Asia (Thailand, Malaysia, Singapore, Indonesia, and the Philippines) [31]. Based on the International Union for Conservation of Nature (IUCN) Red List, C. decandra is classified as a near threatened species due to habitat loss [33]. In addition to Ceriops, Avicennia, which is the pioneer of the mangrove swamp, comprises at least eight species [7,26,34]. The distribution of Avicennia species is separated into two geographic parts: the Indo-West Pacific (IWP) and Atlantic-East Pacific (AEP) regions. At least six species (A. alba, A. integra, A. lanata, A. marina, A. officinalis, and A. rumphiana) are distributed in the IWP region, whereas three species (A. germinans, A. schaueriana, and A. bicolor) are found in the AEP region [26,35,36]. Notably, A. lanata is distributed only in Southeast Asia [24]. Recently, A. bicolor, A. integra, A. lanata, and A. rumphiana have been listed as vulnerable species on the IUCN Red List [37].
Chloroplasts are photosynthetic organelles in algae and land plants that have their own genomes. Chloroplast genomes are highly conserved because of uniparental (usually maternal) inheritance [38]. The sizes of chloroplast genomes in mangrove species are around 145-168 kb [39][40][41][42]. Chloroplast genomes in mangrove species usually contain four regions: one large single-copy region (LSC), two inverted repeats (IRA and IRB), and one small single-copy region (SSC) [39][40][41][42]. To date, several chloroplast genomes have been reported because of the development of DNA sequencing technology and bioinformatics methods [43][44][45]. For Ceriops and Avicennia species, only the chloroplast genomes of C. tagal and A. marina are available [39,41]. Therefore, too few chloroplast genomes are available for either genus to compare their genomes or to understand the phylogenetic relationships among them.
Mangroves present very special ecological characteristics, and understanding their genome structure will provide valuable genetic information regarding the evolutionary trends in plants under harsh climatic conditions. In this study, we investigated the chloroplast genomes of four mangrove species that are commonly distributed in the middle (Ceriops decandra, C. zippeliana, and C. tagal) and seaward (Avicennia lanata) zones of the coastal region of Southeast Asia to understand the evolutionary relationships under different coastal environments and to identify genetic markers for species identification and candidate genes under selective pressures. The chloroplast genomes of the four mangrove species were sequenced, assembled, and annotated. Comparisons of chloroplast genomes among the three Ceriops species and between the Ceriops and Avicennia species were performed to reveal their evolutionary relationships. Different numbers of SSRs and short repeats were identified among them. Genes under positive selection were identified and might correlate with adaptive selection that could be used for further studies on the response to stress conditions in mangroves. Finally, two sets of species-specific primers were developed for species identification of the three Ceriops species based on SSRs. These chloroplast genomes provide valuable genetic information and potential molecular markers for mangrove species in the southeast coastal regions.
Fresh leaves of C. decandra, C. zippeliana, C. tagal, and A. lanata were collected from the Ranong, Chanthaburi, Ranong, and Prachuap Khiri Khan provinces in Thailand, respectively (Table S1). The leaf samples were frozen in liquid nitrogen for DNA isolation. Genomic DNA was extracted using the standard cetyltrimethylammonium bromide (CTAB) method [46]. Each sample was sequenced using the Illumina HiSeqX ten platform with paired-end reads of 150 bp.

Chloroplast Genome Assembly and Annotation
The chloroplast genome of C. decandra was assembled using NOVOPlasty version 4.2 [47]. The chloroplast rbcL sequence of C. tagal (NCBI accession number: MH240830) was used as a seed sequence. The chloroplast genomes of C. tagal, C. zippeliana, and A. lanata were assembled using GetOrganelle [48] with the reference genome-based strategy based on the C. tagal chloroplast genome (MH240830) for Ceriops species and the A. marina chloroplast genome (MT012822) for Avicennia species. GetOrganelle was the primary assembler because of its highly accurate organelle genome assemblies [48,49]; however, it did not produce a complete chloroplast genome for C. decandra, so NOVOPlasty was used for that species instead.

Comparative Genome Analysis
Comparative genome analysis for Ceriops and Avicennia was carried out using mVISTA with the Shuffle-LAGAN mode [53]. The species in this analysis included three Ceriops species (in this study), A. lanata (in this study), and four previously reported mangrove species, including Kandelia obovata (NC_042718), Rhizophora stylosa (NC_042819), and Bruguiera parviflora (MW836113) in the family Rhizophoraceae and A. marina (MT012822) in the family Acanthaceae. The previously reported chloroplast genome of C. tagal (MH240830.1,

Gene Selective Pressure Analysis
A total of 61 shared chloroplast protein-coding genes were used to investigate selection pressures in two mangrove genera, Ceriops and Avicennia (Table S3). We compared species pairs formed between each of the three Ceriops species and six related mangrove and non-mangrove species (Kandelia obovata (NC_042718), Rhizophora apiculata (MW387538), Bruguiera parviflora (MW836113), Pellacalyx yunnanensis (NC_048998), Erythroxylum novogranatense (NC_030601), and Ranunculus macranthus (NC_008796)) as well as between A. lanata and six related mangrove and non-mangrove species (Avicennia marina (NC_047414), Coffea arabica (NC_008535), Nicotiana tabacum (NC_001879), Eucommia ulmoides (NC_037948), Lonicera japonica (NC_026839), and R. macranthus). Notably, R. macranthus was used as an assumed ancestor for the two mangrove genera. Pairwise sequence alignments for each gene in each species pair were generated using MUSCLE with default settings in MEGA X [58,60]. Then, the values of non-synonymous (Ka) and synonymous (Ks) nucleotide substitutions and their ratio (Ka/Ks) were calculated for all aligned genes using KaKs Calculator version 2.0 [61]. Ka/Ks ratios reported as not available (NA) or as ~50, which indicate no substitutions and extremely low Ks values, respectively, were replaced with zero [62]. The Ka/Ks ratios were then visualized using the heatmap function in R [63].
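The post-processing of Ka/Ks values described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function names, the numeric cutoff used to recognise the "~50" placeholder values, and the example gene values other than the rps7 ratio of 1.06 are assumptions made for demonstration.

```python
import math

# Hypothetical cutoff for recognising the "~50" placeholder ratios that
# arise when Ks is extremely low; the exact rule is an assumption.
PLACEHOLDER_CUTOFF = 49.0


def clean_kaks(raw, cutoff=PLACEHOLDER_CUTOFF):
    """Convert a raw Ka/Ks table entry to a float, replacing NA and
    placeholder (~50) values with zero."""
    text = raw.strip()
    if text.upper() == "NA":
        return 0.0
    value = float(text)
    if math.isnan(value) or value >= cutoff:
        return 0.0
    return value


def classify(ratio):
    """Interpret a cleaned Ka/Ks ratio. Ratios that were replaced with
    zero also fall below 1.0 and should be read as uninformative."""
    if ratio > 1.0:
        return "positive selection"
    if ratio < 1.0:
        return "purifying selection"
    return "neutral"


# Example entries; only the rps7 value (1.06) is taken from the text.
raw_values = {"rps7": "1.06", "geneA": "0.21", "geneB": "NA", "geneC": "50"}
cleaned = {gene: clean_kaks(v) for gene, v in raw_values.items()}
for gene, ratio in cleaned.items():
    print(gene, ratio, classify(ratio))
```

Keeping the replacement step separate from the classification step makes it easy to see which ratios were zeroed before they are plotted in a heatmap.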

Development of Species-Specific Molecular Markers for Ceriops Species
Two primer pairs were designed from the IR region of the three Ceriops chloroplast genome sequences using Primer3 [64]. PCR amplifications were carried out in 20 µL volumes containing 1 µg genomic DNA, 2 µL dNTPs (2.5 mM each), 2 µL of Taq PCR buffer, 0.2 µL of Taq DNA polymerase, and 1.0 µL of each primer. The amplification conditions were 94 °C for 2 min; followed by 30 cycles of 94 °C for 20 s (denaturation), 55 °C for 30 s (annealing), and 72 °C for 30 s (extension); and a final extension of 72 °C for 5 min. PCR products and the DNA ladder were analyzed using a 1% agarose gel to reveal PCR product sizes.
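As a quick planning aid, the thermocycling program and reagent volumes above can be written down and summed. This is only a sketch: the constant and function names are illustrative, ramp times between temperatures are ignored, and the volume left over is assumed to be shared by the template DNA and water.

```python
# Hold times (seconds) taken from the amplification conditions above.
INITIAL_DENATURATION_S = 2 * 60
CYCLE_STEPS_S = {"denaturation": 20, "annealing": 30, "extension": 30}
CYCLES = 30
FINAL_EXTENSION_S = 5 * 60

# Reagent volumes (µL) per 20 µL reaction, excluding template and water.
REACTION_VOLUME_UL = 20.0
REAGENTS_UL = {
    "dNTPs": 2.0,
    "Taq PCR buffer": 2.0,
    "Taq DNA polymerase": 0.2,
    "forward primer": 1.0,
    "reverse primer": 1.0,
}


def total_hold_time_s():
    """Sum of all programmed hold times, without ramping."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


def remaining_volume_ul():
    """Volume left for template DNA plus water in one reaction."""
    return round(REACTION_VOLUME_UL - sum(REAGENTS_UL.values()), 1)


print(total_hold_time_s() / 60, "min of programmed hold time")  # 47.0
print(remaining_volume_ul(), "µL for template DNA and water")   # 13.8
```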

Chloroplast Genome Features
A total of 66.32 million reads (150 bp) were generated for the three Ceriops species and Avicennia lanata on the Illumina HiSeqX ten platform (Table S1). These data were used to assemble the four chloroplast genomes with over 300× coverage. The complete chloroplast genomes of C. decandra, C. zippeliana, C. tagal, and A. lanata were 166,650, 166,083, 164,432, and 148,264 bp in length, respectively (Figure 1 and Table 1). All four species exhibit a typical quadripartite structure, which consists of one large single copy (LSC), one small single copy (SSC), and a pair of inverted repeats (IRs). All four regions of A. lanata (LSC: 87,995 bp; SSC: 17,949 bp; IRs: 21,160 bp each) were shorter than those of the three Ceriops species (92,660-95,217 bp; 18,054-19,158 bp; 26,307-26,535 bp). The overall GC content in the whole chloroplast genomes of the three Ceriops species and A. lanata was 35% and 38%, respectively. The GC content in the IR regions (~42-44%) was greater than that in the LSC (~32-37%) and SSC (~29-33%) regions.
Figure 1. Genes located outside and inside the circle are transcribed clockwise and counter-clockwise, respectively. The grey bar area in the inner circle indicates GC content of the genome, whereas the lighter grey area indicates AT content of the genome. LSC, SSC, and IRs (IRA and IRB) represent large single copy, small single copy, and inverted repeats, respectively. Genes based on different functional groups are shown in different colors. Green rectangle indicates a loss region (rpl32) of C. zippeliana. Labeling in blue color indicates the same region of three genes (rpl2, rpl23, and trnM-CAU) in both Ceriops and Avicennia species, whereas labeling in orange color with an orange arrow indicates a unique region of three duplicate genes (rpl2, rpl23, and trnM-CAU) in three Ceriops species compared to A. lanata. ** indicates genes containing introns.
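The region lengths reported for A. lanata can be checked against its total genome length, because a quadripartite chloroplast genome is the sum of the LSC, the SSC, and two copies of the IR. The short sketch below does this arithmetic; the function name is illustrative.

```python
def quadripartite_length(lsc_bp, ssc_bp, ir_bp):
    """Total length of a quadripartite genome: LSC + SSC + two IR copies."""
    return lsc_bp + ssc_bp + 2 * ir_bp


# Region lengths for A. lanata as reported in the text.
a_lanata_total = quadripartite_length(lsc_bp=87_995, ssc_bp=17_949, ir_bp=21_160)
print(a_lanata_total)  # 148264, matching the reported genome size
```

The same check applied to any of the Ceriops assemblies would require the per-species region lengths from Table 1, which are given above only as ranges.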
List of annotated genes in the chloroplast genomes of three Ceriops species and Avicennia lanata. Table columns: Category; Group of Genes; Gene Name.

Comparative Analysis of Chloroplast Genomes
The comparison of eight mangrove chloroplast genomes (three Ceriops species, three related mangrove species in the family Rhizophoraceae, and two Avicennia species) showed similar gene organization and variation regions (Figure 2). Gene orientation was assessed among the mangrove species, revealing a conserved gene structure in the chloroplast genomes. Coding regions were more conserved than the non-coding regions. Additionally, IR regions were more conserved than the LSC and SSC regions, suggesting low divergence in the IR regions. The IR regions were highly conserved between C. tagal and the Rhizophoraceae species and between C. tagal and the Avicennia species, at over 98% and 90% similarity, respectively. Low similarity (<80%) of nine gene sequences (trnK-UUU, trnT-CGU, trnL-AAA, trnC-ACA, rps3, rpl22, ycf1, rps15, and rpl32) was observed between C. tagal and the Avicennia species. Highly divergent regions were also found in most intergenic regions, especially between the mangrove species in the family Rhizophoraceae and the Avicennia species.

Chloroplast Boundary Structures
The chloroplast boundary structures of the LSC, SSC, and IRs were compared among the three Ceriops species and two Avicennia species (Figure 3). In all species, the ycf1 and ndhF genes are located at the boundary of SSC/IRb and SSC/IRa, respectively. The size of ycf1 is approximately 5800 bp for the Ceriops species and 5500 bp for the Avicennia species. The ycf1 gene is ~1400 bp away from the SSC/IRb border in the Ceriops species, whereas it is ~800 bp away in the Avicennia species. Additionally, the size of the ndhF gene is similar in all species (2231 bp in C. tagal, A. lanata, and A. marina; 2237 bp in C. decandra; and 2243 bp in C. zippeliana). The LSC/IRb and LSC/IRa junctions in the three Ceriops species positioned the rps19 gene and rps19 pseudogene, respectively (Figure 3A). In contrast, the LSC/IRb junction in the two Avicennia species positioned the ycf2 gene (Figure 3B). Notably, no gene stretches across the boundary between the LSC and IRa regions of the two Avicennia species. These results reveal that the contraction and expansion of both LSC/IRa and LSC/IRb boundary regions occurred in Avicennia and Ceriops species, respectively, during their evolution.

Chloroplast Repeats and SSRs
Repeats in the chloroplast genomes of the three Ceriops species and A. lanata were identified (Figure 4A-C and Table S4). The number of forward, reverse, palindromic, and complement repeats was different in each species. For example, 23, 35, 25, and 17 forward repeats were found in C. decandra, C. zippeliana, C. tagal, and A. lanata, respectively (Figure 4A). The number of forward repeats was highest in C. zippeliana (35), while the number of palindromic repeats was highest in A. lanata (20) (Figure 4A). There was no complement repeat in C. zippeliana. Interestingly, all Ceriops species contained long repeats (>30 bp), whereas A. lanata contained only short repeats (<40 bp) (Figure 4B). In all species, most repeats were observed in the LSC region (Figure 4C).
SSRs in the chloroplast genomes of Ceriops species, A. lanata, and related mangrove species were analyzed (Figure 4D-F and Tables 3 and S5). Mononucleotide SSRs were the most prevalent in all species (Figure 4D), consisting predominantly of A/T repeats, at over 90% (Figure 4E). Most SSRs were found in the LSC region (Figure 4F). Some SSRs were unique in each Ceriops species.
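To illustrate what counting SSRs by motif length involves, the sketch below scans a sequence for perfect tandem repeats. It is not the tool used in this study: the function names, the example sequence, and the minimum repeat thresholds (10 copies for mononucleotides, 5 for dinucleotides) are assumptions chosen only for demonstration.

```python
def is_primitive(motif):
    """True if the motif is not itself a tandem repeat of a shorter unit
    (for example, 'AA' and 'ATAT' are not primitive)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq, min_repeats):
    """Find perfect SSRs.

    min_repeats maps motif length -> minimum number of tandem copies.
    Returns (start, end, motif, copies) tuples with 1-based coordinates.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        i = 0
        while i + unit_len <= len(seq):
            motif = seq[i:i + unit_len]
            if set(motif) <= set("ACGT") and is_primitive(motif):
                copies = 1
                # Extend while the next unit-sized window repeats the motif.
                while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == motif:
                    copies += 1
                if copies >= min_rep:
                    hits.append((i + 1, i + copies * unit_len, motif, copies))
                    i += copies * unit_len
                    continue
            i += 1
    return hits


# Hypothetical toy sequence: a 10-copy A run and a 5-copy AT run.
toy = "GGC" + "A" * 10 + "CG" + "AT" * 5 + "GC"
print(find_ssrs(toy, {1: 10, 2: 5}))
# [(4, 13, 'A', 10), (16, 25, 'AT', 5)]
```

Requiring primitive motifs keeps a mononucleotide run from being reported a second time as a dinucleotide repeat.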

Ceriops Species Identification Based on Species-Specific Molecular Markers
Two pairs of primers were designed and tested to identify the differences between Ceriops species. PCR products of the chloroplast genomes exhibited different sizes among the three Ceriops species based on one molecular marker using two primer pairs ( Figure 5). The PCR product of the two primer pairs confirmed the same variation. For the first one, the PCR product sizes of C. tagal, C. decandra, and C. zippeliana were 167, 207, and 226 bp, respectively ( Figure 5 and Table S6). For the other one, the PCR product sizes of C. tagal, C. decandra, and C. zippeliana were 323, 363, and 382 bp, respectively ( Figure 5 and Table S6). The difference in PCR product sizes occurred from indels and SSRs in the IR regions. For example, the chloroplast sequence of these regions of C. tagal and C. zippeliana contained 18 and 5 dinucleotide (AT/TA) repeats, respectively (102,313-102,348 and 154,744-154,779 bp: C. tagal; 120,115-120,124 and 156,400-156,409 bp: C. zippeliana) (Table S5), whereas there were no SSRs in these regions of C. decandra based on the SSR analysis criteria in this study due to short (AT/TA) repeats (<4 repeats) (Table S6).
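The statement that both primer pairs confirm the same variation can be checked directly from the reported product sizes, and the SSR coordinates can be converted into repeat counts. The sketch below does both; the dictionary keys `pair_1` and `pair_2` and the function names are illustrative labels, not names from the study.

```python
# PCR product sizes (bp) reported for each primer pair.
PRODUCT_SIZES = {
    "pair_1": {"C. tagal": 167, "C. decandra": 207, "C. zippeliana": 226},
    "pair_2": {"C. tagal": 323, "C. decandra": 363, "C. zippeliana": 382},
}


def size_offsets(sizes, reference="C. tagal"):
    """Product size of each species relative to the reference species."""
    return {species: bp - sizes[reference] for species, bp in sizes.items()}


def dinucleotide_copies(start, end):
    """Number of dinucleotide units in an inclusive 1-based interval."""
    return (end - start + 1) // 2


offsets_1 = size_offsets(PRODUCT_SIZES["pair_1"])
offsets_2 = size_offsets(PRODUCT_SIZES["pair_2"])
print(offsets_1)  # {'C. tagal': 0, 'C. decandra': 40, 'C. zippeliana': 59}
print(offsets_1 == offsets_2)  # True: both pairs report the same size shifts

print(dinucleotide_copies(102_313, 102_348))  # 18 (C. tagal)
print(dinucleotide_copies(120_115, 120_124))  # 5 (C. zippeliana)
```

Both primer pairs give products in C. decandra and C. zippeliana that are 40 and 59 bp longer than in C. tagal, which is what would be expected if they flank the same variable region.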

Phylogenetic Relationships
The maximum likelihood (ML) analysis, based on 50 conserved chloroplast genes in 59 plant species, resulted in the best single tree ( Figure 6). The ML tree shows two major clades corresponding to Rosids and Asterids. This tree highly supports that all Ceriops species are in the family Rhizophoraceae (Rosids), whereas A. lanata is in the family Acanthaceae (Asterids). C. decandra is closely related to C. zippeliana with a monophyletic branch supported by 100% bootstrap values. C. tagal is a sister species of the other two Ceriops species. For other mangrove species in the family Rhizophoraceae, Kandelia obovata is closer to the Ceriops species than Rhizophora and Bruguiera species. In addition, A. lanata and A. marina are grouped together in the family Acanthaceae (Asterids), with a bootstrap value of 100%.
The gain and loss of the rpl32, rps16, and infA genes in mangrove and non-mangrove species were plotted on the phylogenetic tree (Figure 6). For example, the rpl32 gene was lost in four mangrove species in Rhizophoraceae, namely C. zippeliana, K. obovata, Rhizophora stylosa, and Bruguiera gymnorhiza. The rps16 gene was lost in all mangrove species in Rhizophoraceae and most land plant species in Malpighiales, but not in Acanthaceae (Lamiales). The infA gene was also lost in all mangrove species in Rhizophoraceae and most land plant species in Rosids.

Chloroplast Genes under Positive Selection
To identify candidate genes under positive selection, Ka/Ks (nonsynonymous/synonymous) values were estimated for 61 conserved chloroplast protein-coding genes in Ceriops and Avicennia species relative to related mangrove species and non-mangrove species (assumed ancestors) (Figure 7 and Table S7). Most Ka/Ks ratios were lower than 1.0. However, there were two genes, rps7 and rps15, in which the Ka/Ks ratios were greater than 1.0 in several compared species pairs, suggesting positive selection during their evolution. The rps7 gene was under positive selection in both the Ceriops species and A. lanata compared with related non-mangrove species. The average Ka/Ks ratio of the rps7 gene between the Ceriops species and Ctenolophon englerianus, Averrhoa carambola, and Vitis rotundifolia was 1.06, 1.11, and 1.81, respectively. The Ka/Ks ratio of the rps7 gene between A. lanata and Eucommia ulmoides and Lonicera japonica was 1.03 and 1.13, respectively. In addition, the rps15 gene was positively selected in C. decandra and C. zippeliana. The average Ka/Ks ratio of the rps15 gene between these two Ceriops species and Pellacalyx yunnanensis, B. parviflora, and R. apiculata was 1.16, 1.17, and 1.42, respectively.
Figure 6. Values above the branches represent bootstrap support from 1000 replicates. The mangrove species in this study are indicated in red text, whereas other mangrove species are indicated in blue text. The Rhizophoraceae lineage is indicated in gradient green, and the Acanthaceae lineage is indicated in gradient orange. Gain and loss of the rpl32, rps16, and infA genes are shown in different symbols and colors.

Discussion
Diverse chloroplast genome sequences have been used to study the evolution of mangrove species and to identify different mangrove species [39][40][41][42]. In the current study, we reported the chloroplast genomes of four mangrove species, including three Ceriops species (C. decandra, C. zippeliana, and C. tagal) and Avicennia lanata. Based on morphological characteristics, Ceriops is classified in the family Rhizophoraceae of the order Rosids (polypetalous), whereas Avicennia belongs to the family Acanthaceae of the order Asterids (sympetalous) [1,3]. Ceriops and Avicennia have undergone convergent evolution and are the most dominant genera in the middle and seaward zones of mangrove forests, respectively [1,3,24]. The three Ceriops chloroplast genomes (164.4-166.7 kb) differed only slightly in size, consistent with published chloroplast genomes of mangrove species (middle zone) in Rhizophoraceae such as C. tagal, Kandelia obovata, Rhizophora species, and Bruguiera species (160.3-164.6 kb) [40][41][42][65][66][67]. In contrast, the chloroplast genome of A. lanata was smaller, at 148.3 kb, which is similar to the previously reported chloroplast genomes of Avicennia marina (147.9-152.3 kb) [39,68]. In addition, the chloroplast genomes of Sonneratia alba and Sonneratia apetala, which are true mangroves in the family Lythraceae of the order Rosids in the seaward zone, were approximately 153.1 kb [69,70]. This finding suggests that mangrove chloroplast genomes in the seaward zone may be compact compared with those of mangrove species in the middle zone, possibly as a result of adaptation to coastal stress conditions, especially salt stress. Salinity can affect plants in several ways, such as by changing the chloroplast size, number, lamellar organization, and lipid and starch accumulation and by interfering with cross-membrane transportation [71].
Chloroplast genomes are usually conserved in genome organization, gene order, and gene content [72]. Nevertheless, gene gain and loss have been found among the four mangrove species. The infA gene (translation initiation factor 1) was found in A. lanata but not in the Ceriops species and other mangrove species in Rhizophoraceae [40][41][42]. The transfer of the infA gene from the chloroplast to the nucleus occurred independently in multiple angiosperm lineages, especially in Rosids [73,74]. The rpl16 and rps16 genes became pseudogenes in A. lanata but not in A. marina [39]. The rpl16 gene has been independently pseudogenized in several angiosperm lineages across eudicots and monocots [75][76][77]. Notably, the rps16 gene was not found in the three Ceriops species, consistent with other mangrove and land plant species in the order Malpighiales [40,42,78]. The chloroplast rps16 gene has become a pseudogene or has been lost and functionally replaced by nuclear-encoded rps16 in many higher plants [79][80][81][82]. Three genes, namely rpl2, rpl23, and trnM-CAU, were present as a single copy in the LSC region of A. lanata, as they are in A. marina [39]. In contrast, the three genes are located in the IR regions in the Ceriops species; thus, they have two copies, concordant with other mangrove species in Rhizophoraceae [40][41][42]. Contraction at the LSC/IR junction, which was observed in several land plants, might result in the deletion of rpl2 and rpl23 from one of the IR regions [83,84]. Remarkably, rpl32 was lost in C. zippeliana but not in the other Ceriops species or A. lanata. The loss of rpl32 has occurred in many mangrove species in the family Rhizophoraceae, such as K. obovata, R. stylosa, and B. gymnorhiza [40,42]. Transfer of chloroplast rpl32 to the nuclear genome occurred independently in several families of Malpighiales plants, such as Rhizophoraceae, Erythroxylaceae, Ctenolophonaceae, Violaceae, Passifloraceae, Salicaceae, and Euphorbiaceae [40,42,78][85][86][87][88]. These findings reveal gene evolution in Ceriops, Avicennia, and other mangrove species.
The border positions of the LSC, SSC, and IR regions were compared among the Ceriops and Avicennia chloroplast genomes. The boundaries of the LSC/IRa and LSC/IRb regions between the Ceriops species and A. lanata had different gene positioning. In the Ceriops species, the rps19 gene was located at the LSC/IRb border and the rps19 pseudogene was located at the LSC/IRa border, concordant with other mangrove species such as Bruguiera species [42]. Meanwhile, the ycf2 gene was located at the LSC/IRb border in A. lanata and no gene was located at the LSC/IRa border, which was similar to A. marina and some non-mangrove species in the Acanthaceae, such as Ruellia breedlovei (KP300014) [89,90]. One of the reasons for chloroplast genome variation among angiosperms is the contraction or expansion of the IR regions [91]. These results indicate that the contraction of the IR regions in A. lanata and the expansion of the IR regions in the Ceriops species may be mainly caused by decreasing and increasing gene duplications in the IR regions, respectively, during their evolution.
Repeats of the four mangrove species varied among them. The occurrence of short repeats (<40 bp) and a small number of SSRs was found in A. lanata due to a compact chloroplast genome containing small non-coding regions. The mangrove species carry mostly forward repeats in their chloroplast genomes that are similar in other mangrove chloroplast genomes [39,41,42]. For SSRs, most consist of repetitions of an A/T mononucleotide in all four mangrove species, concordant with other mangrove species [39,41,42]. SSRs with repeat length differences occur from the process of mutation [92], which could be used for identifying related species. In general, differentiation between C. decandra and C. zippeliana based on morphology can be difficult; therefore, we designed two species-specific primer sets based on one different SSR in the IR regions among the three Ceriops species. The specific primer sets were tested and could be used to identify C. decandra and C. zippeliana as well as C. tagal.
C. decandra is more closely related to C. zippeliana than to C. tagal, concordant with the results based on morphological and molecular evidence as well as the phylogenetic tree based on the trnL intron sequence of chloroplast genomes [30]. The Ceriops species are more closely related to K. obovata than other mangrove species in Rhizophoraceae, consistent with the results based on 44 conserved genes in 71 species (14 mangrove species and 57 land plant species) using Bayesian inference (BI) and ML [41]. In addition, the chloroplast genome of two Avicennia species was shown to be closely related to Acanthaceae, with a bootstrap value of 100%. These two lineages (Ceriops and Avicennia) had paraphyletic clades of the phylogeny, indicating convergent evolution.
The low average Ka/Ks ratios of most conserved genes in the four mangrove species suggest that the whole-chloroplast protein level of the species has been subjected to strong purifying selections. In general, synonymous changes (Ks) occur more often than nonsynonymous substitutions (Ka); as a result, the ratios of Ka/Ks are commonly lower than 1.0 [93]. Remarkably, the Ka/Ks ratios of two chloroplast genes (rps7 and rps15) were greater than 1.0, suggesting positive selection pressure. The rps7 gene encodes ribosomal protein S7 involved in the regulation of chloroplast translation [94]. Positive selection on the rps7 gene has also been observed in many mangrove species (such as K. obovata, Rhizophora species, and Bruguiera species) and some land plants (such as Ananas comosus (pineapple)) [42,95,96]. Moreover, the rps15 gene encoding ribosomal protein S15 was under positive selection in C. decandra and C. zippeliana. The multiple sequence alignment result showed co-variation of three sites in the rps15 amino acid sequence that occurs in C. decandra and C. zippeliana but not in C. tagal ( Figure S1). Interestingly, one amino acid site at position 75 was unique in only C. decandra and C. zippeliana (Isoleucine) compared with mangrove and non-mangrove species (Valine) in the family Rhizophoraceae. The rps15 gene was also reported to be related to evolution under positive selection in Araliaceae species [97]. Knockout of the chloroplast rps15 gene in tobacco leads to a specific reduction in small 30S ribosomal subunits [98]. Thus, these genes, rps7 and rps15, might be undergoing adaptive evolution in response to stress environments in mangrove forests.

Conclusions
In this study, the complete chloroplast genomes of Ceriops decandra, C. zippeliana, C. tagal, and Avicennia lanata were sequenced and compared. The chloroplast genome of A. lanata (seaward zone) is more compact than those of the three Ceriops species (middle zone). The chloroplast genomes are mostly conserved in genome organization, gene order, and gene content; however, gene gain and loss were found among them. The contraction of the IR regions in Avicennia and their expansion in Ceriops are likely the result of decreased and increased gene duplication in the IR regions, respectively. Phylogenetic analysis showed that C. decandra is closer to C. zippeliana than to C. tagal in the family Rhizophoraceae, and A. lanata is clustered with A. marina in the family Acanthaceae, which supports convergent evolution between the two genera. The different chloroplast repeats and SSRs in the four mangrove species can be used as genetic markers, and two species-specific primer sets were developed in this work for species identification among the three Ceriops species. The rps7 gene was identified as being under positive selection among mangrove species and might be associated with adaptation to coastal environments. Hence, these results not only provide valuable genetic information on mangrove Ceriops and Avicennia species but also offer molecular markers for species identification and a candidate gene for the response to the climatic stress conditions of coastal environments.

Supplementary Materials:
The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/biology11030383/s1: Figure S1: Multiple protein alignment of rps15 genes among 19 species; Table S1: Sample locations and Illumina raw reads of four mangrove species; Table S2: List of chloroplast accession numbers and genes for phylogenetic analysis; Table S3: List of chloroplast accession numbers and genes for gene selective pressure analysis; Table S4: Repeats in the chloroplast genomes of Ceriops and Avicennia species; Table S5: SSRs in the chloroplast genomes of Ceriops and Avicennia species; Table S6: List of specific sequences in an IR region in three Ceriops species; Table S7: Ka/Ks values between Ceriops species and assumed ancestors as well as between Avicennia species and assumed ancestors.

Data Availability Statement:
The chloroplast genome sequences of Ceriops decandra, Ceriops zippeliana, Ceriops tagal, and Avicennia lanata were submitted to the National Center for Biotechnology Information (NCBI), with the accession numbers OK272497, OK272496, OK258322, and OK258321, respectively.