Complete Plastid Genome Sequences of Three Tropical African Beilschmiediineae Trees (Lauraceae: Cryptocaryeae)

Abstract: Millions of years of isolation have given Madagascar a unique flora that still reflects some of its relationship with the continents of Africa and India. Here, the complete chloroplast sequence of Beilschmiedia moratii, a tropical tree in Madagascar, was determined. The plastome, with a length of 158,410 bp, was 143 bp and 187 bp shorter than those of two closely related species with published sequences, B. pierreana from sub-Saharan Africa and Potameia microphylla from Madagascar, respectively. A total of 124 repeats and 114 simple sequence repeats (SSRs) were detected in the plastome of B. moratii. Six highly variable regions (ndhF, ndhF-rpl32, trnC-petN, psbE-petL, rpl32-trnL, and ycf1) were identified among the three African species, and 1151 mutation events, including 14 SVs, 351 indels, and 786 substitutions, were accurately located. There were 634 mutation events between B. moratii and P. microphylla with a mean nucleotide variability (π) value of 0.00279, while there were 827 mutation events between B. moratii and B. pierreana with a mean π value of 0.00385. The Ka/Ks ratios of 86 protein-coding genes in the three African species were less than 1; the mean value between B. moratii and P. microphylla was 0.184, while the mean value between B. moratii and B. pierreana was 0.286. In this study, the plastid genomes of the three African Beilschmiediineae species were compared for the first time, revealing that B. moratii and P. microphylla from Madagascar were relatively conserved, with low mutation rates and slower evolutionary rates.


Introduction
Madagascar, located off the southeastern coast of the African continent, is the world's fourth largest island after Greenland, New Guinea, and Borneo (Kalimantan). Renowned for its remarkable biodiversity, Madagascar stands out as a prime hotspot for plant diversity research on a global scale [1][2][3]. It is considered one of the few regions in the world to have evolved in isolation: it separated from Africa about 170 million years ago (mya) and from India about 88 mya, which led to its unique composition of plant species [4][5][6]. As of January 2024, a total of 249 vascular plant families, including 1698 genera and 11,580 native vascular plant species (angiosperms, gymnosperms, and ferns), have been recorded in Madagascar, of which 9329 (82%) are endemic [7]. Madagascar's terrain is dominated by mountainous regions and continental rifts, with the highest plant diversity and rarity found on the steep eastern side of this geographical feature [8]. Despite the richness and uniqueness of Madagascar's species, there is evidence that levels of biodiversity in Madagascar are declining and that many species have gone extinct before they were even recorded [2]. This issue has attracted the attention of many researchers, and the conservation of biodiversity in Madagascar has become urgent. Protecting plant diversity in Madagascar is a shared goal of the local government and the international community. The integration of new technology, reduced sequencing costs, and molecular tools for studying species diversity in Madagascar has proven to be a successful pathway towards implementing efficient conservation strategies in the face of environmental challenges and biodiversity threats [9,10].
Chloroplasts (cp) are semi-autonomous organelles necessary for photosynthesis and various metabolic pathways essential for growth and development in plants [11]. The distinctive genetic properties of the plant cp genome stem from its haploid, non-recombining, and highly conserved nature [12,13]. Typically, cp genomes range in size from 107 to 218 kb and have a quadripartite structure consisting of a pair of inverted repeat regions (IRs), a large single-copy region (LSC), and a small single-copy region (SSC) [14]. The cp genome in plants typically retains a limited number of genes, around 120-130, that play essential roles in various cellular functions such as photosynthesis, gene expression, and protein synthesis [15,16]. Although the cp genome is small, it contains a large amount of genetic information, showing both evolution and conservation at the level of nucleotide sequence and structural rearrangement [17,18]. The coding and non-coding regions have different evolutionary rates, and there are significant differences at the molecular level [17]. The cp genome can therefore be used to resolve interspecific relationships and evaluate interspecific variability, and it is widely used in research on population genetic diversity and phylogeny [19,20]. In the past two decades, significant progress has been made in the study of cp genome composition and structure, interspecific differences, and evolutionary relationships of Lauraceae, but studies of Beilschmiediineae remain very limited [21][22][23].
Beilschmiedia is one of the largest pantropical genera of the Lauraceae family, distributed mainly in Africa, Southeast Asia, Oceania, and the Americas (https://www.iplant.cn/info/Beilschmiedia?t=foc, accessed on 23 March 2024). So far, 268 species have been accepted by the Plants of the World Online website (https://powo.science.kew.org/taxon/urn:lsid:ipni.org:names:330809-2, accessed on 23 March 2024). The genus is dominated by tall trees, with a few shrubs; these are important components of their ecosystems and are usually the dominant species in their communities due to their size and ecological role (https://eol.org/pages/39964810/articles, accessed on 23 March 2024) [24]. Because of overlapping characters, it is challenging to differentiate Beilschmiedia from its close relatives [25,26]. These relatives include Potameia Thouars, Endiandra R. Br., Syndiclis Hook.f., Sinopora J. Li, N. H. Xia, and H. W. Li, and Yasunia van der Werff, all of which are included in Beilschmiediineae [26]. In recent years, due to the destruction of primary forests, loss of habitat, invasion of alien species, and overexploitation, the number of Beilschmiedia plants has decreased year by year [27]. A search of the IUCN Red List for 'Beilschmiedia' returns 241 species, with one classified as 'extinct', 44 species (18.26%) as 'critically endangered', 64 species (26.56%) as 'endangered', and 46 species (19.09%) as 'vulnerable' (https://www.iucnredlist.org/search?query=Beilschmiedia&searchType=species, accessed on 23 March 2024) [28]. Beilschmiedia moratii van der Werff, first published in 1996, is endemic to Madagascar; its description brought the number of Beilschmiedia species in Madagascar to 10 and contributed greatly to the study of the Malagasy flora [29].
Little research has been reported on B. moratii since it was published. Nishida and van der Werff (2007) [25] studied twenty-one species of Aspidostemon, nine of Beilschmiedia, three of Cryptocarya, and five of Potameia, all from Madagascar and including B. moratii, and concluded that cuticle characteristics play an important role in the classification of Lauraceae plants [30]. The first molecular study to include B. moratii was that of Song et al. (2023), who used 176 cp genomes to investigate the phylogeny and biogeography of the Cryptocaryeae (Lauraceae), but the cp genomic characteristics of B. moratii were not compared [31]. In addition, previous research suggested that Potameia from Madagascar should be classified into Beilschmiedia [31,32]; testing this requires appropriate material to be collected and morphological and molecular evidence to be combined in further research. In this study, for the first time, we compared the cp genome of B. moratii from Madagascar with those of other species from the African Beilschmiediineae. Our main aim was to reveal genetic differences between Madagascar and sub-Saharan Africa and explore their relationship by comparing the cp genomes of the African Beilschmiediineae. The results can provide a better understanding of the genetic diversity and evolutionary history of plants in Madagascar, and thus an important basis for plant taxonomy and biodiversity conservation.
Forests 2024, 15, 832

Sampling, DNA Extraction, Genomic Sequencing
Fresh young B. moratii leaves were collected in Madagascar and dried in silica gel. The harvested tissues were deposited at the Kunming Institute of Botany, Chinese Academy of Sciences (collection number: 17CS16181). Total genomic DNA of B. moratii was extracted from healthy young leaves by the modified CTAB method [33]. Genomic DNA that qualified for library construction was used to construct a paired-end library with an insert size of 150 bp, which was sequenced using the Illumina HiSeq 2500 platform at BGI-Shenzhen, and more than 2.0 Gb of reads were obtained for the sample.

Mutation Events Detection and Sliding Window Analysis of the Plastomes
We constructed four whole cp genome matrices. Matrix I is composed of sequences from B. moratii and P. microphylla, Matrix II includes sequences from B. moratii and B. pierreana, Matrix III is made up of sequences from P. microphylla and B. pierreana, and Matrix IV integrates sequences from B. moratii, P. microphylla, and B. pierreana. The four matrices were aligned using MAFFT v7.520 [41] and then manually adjusted in BioEdit v7.0.9.0 [42], with particular attention to the inversion sites. The mutation sites of Matrix I, Matrix II, Matrix III, and Matrix IV, including insertions and deletions (indels) and SNPs, were detected using Geneious Prime v2023.2.1 [38]. These sites were then manually verified in BioEdit v7.0.9.0 [42].
The nucleotide polymorphism values (π) of Matrix I, II, III, and IV were calculated using DnaSP v6.12.03 [48]. The window length was set to 600 base pairs and the step size to 200 base pairs. The values were plotted along the plastomes using an R program. The pairwise nucleotide distances of the three genomes were calculated using MEGA v11.0.13 [49] based on the Nei-Gojobori method [50].
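The sliding-window calculation described above can be illustrated with a minimal Python sketch. It is not DnaSP's implementation: the function names are hypothetical, π is taken as the mean pairwise proportion of differing sites within each window, and columns containing a gap or ambiguous base are simply skipped per pair (an assumption; DnaSP's gap handling may differ). Only the window length (600 bp) and step size (200 bp) come from the text.

```python
from itertools import combinations

def pairwise_diff(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap or an ambiguous base in either sequence are skipped
    (an assumption of this sketch).
    """
    valid = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not valid:
        return 0.0
    return sum(x != y for x, y in valid) / len(valid)

def window_pi(seqs, start, length):
    """Mean pairwise difference (pi) over all sequence pairs in one window."""
    chunks = [s[start:start + length] for s in seqs]
    pairs = list(combinations(chunks, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)

def sliding_pi(seqs, window=600, step=200):
    """Return (window midpoint, pi) for each window along the alignment."""
    n = len(seqs[0])
    if any(len(s) != n for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    return [(start + window // 2, window_pi(seqs, start, window))
            for start in range(0, n - window + 1, step)]

if __name__ == "__main__":
    # Toy two-sequence alignment: six substitutions in the first 600 bp.
    seq_a = "A" * 1000
    seq_b = "C" * 6 + "A" * 994
    for midpoint, pi in sliding_pi([seq_a, seq_b]):
        print(midpoint, round(pi, 5))
```

For a two-sequence matrix with no gaps in the window, π reduces to the number of differing sites divided by 600, so a window value of 0.03 would correspond to 18 differing sites.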

Selective Pressure Analysis
The 86 protein-coding genes shared by B. moratii, P. microphylla, and B. pierreana were extracted by PhyloSuite v1.2.3 [40] and aligned with MAFFT v7.520 [41]. The non-synonymous (Ka) and synonymous (Ks) nucleotide substitution rates, as well as the Ka/Ks ratio, were calculated using the KaKs_Calculator v2.0 [51]. Finally, 40 genes potentially under selection were chosen from the three groups for further analysis. Selective pressure was assessed using the Ka/Ks ratio: ratios between 0.5 and 1 indicate relaxed selection, ratios less than 0.5 suggest purifying selection, a ratio of 1 indicates neutral evolution, and ratios greater than 1 signify positive selection [52].
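The interpretation thresholds above can be written as a small lookup. The following Python sketch is only an illustration: the function names are hypothetical, treating a ratio of exactly 0.5 as relaxed selection is an assumption (the text does not assign that boundary), and returning None when Ks is zero is a choice of this sketch rather than the behavior of KaKs_Calculator.

```python
def kaks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if ks == 0:
        return None
    return ka / ks

def classify_kaks(ratio, tol=1e-9):
    """Map a Ka/Ks ratio to the selection regime used in the text."""
    if ratio is None:
        return "undefined (Ks = 0)"
    if ratio > 1 + tol:
        return "positive selection"
    if abs(ratio - 1) <= tol:
        return "neutral evolution"
    if ratio >= 0.5:  # boundary value 0.5 assigned here by assumption
        return "relaxed selection"
    return "purifying selection"

if __name__ == "__main__":
    # Mean ratios reported in this study for the three pairwise comparisons.
    means = {
        "B. moratii vs. P. microphylla": 0.184,
        "P. microphylla vs. B. pierreana": 0.206,
        "B. moratii vs. B. pierreana": 0.286,
    }
    for pair, ratio in means.items():
        print(pair, ratio, classify_kaks(ratio))
```

Applied to the three reported mean ratios, every comparison falls in the purifying-selection range.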

Phylogenetic Analysis
Here, phylogenetic relationships were obtained based on a matK matrix. The resulting phylogenetic tree is shown in Figure 1B.

Figure 1 caption (partial): Two species in red font and one species in blue font, all with whole cp genome data, are located in Madagascar and sub-Saharan Africa, respectively. The numbers above the branches are the Bayesian posterior probability values.

Characteristics of the cp Genome of B. moratii
The newly assembled plastome of B. moratii (deposited in LCGDB: LAU00173) had a quadripartite structure forming a circular molecule, and the genome was 158,410 bp in length, which was 187 bp shorter than that of P. microphylla (158,597 bp, GenBank accession No. MT720950) and 143 bp shorter than that of B. pierreana (158,553 bp, GenBank accession No. MT720942) (Figure 2). The B. moratii cp genome includes a pair of inverted repeats (IRs) of 25,537 bp, separated by a large single copy (LSC) region of 89,198 bp and a small single copy (SSC) region of 18,138 bp (Figure 2). The overall GC content within the plastome was 39.1%. The IR regions exhibited a higher GC content of 43%, whereas the LSC and SSC regions had values of 37.8% and 34%, respectively.
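As a quick consistency check on these figures, the region lengths should sum to the reported genome size (with the IR counted twice), and the length-weighted GC content of the regions should land near the reported overall value. The Python sketch below uses only the numbers quoted above; the dictionary layout and function names are illustrative.

```python
# Region lengths (bp) and GC percentages as reported for B. moratii.
REGIONS = {
    "LSC": {"length": 89_198, "gc": 37.8, "copies": 1},
    "SSC": {"length": 18_138, "gc": 34.0, "copies": 1},
    "IR": {"length": 25_537, "gc": 43.0, "copies": 2},
}

def plastome_length(regions):
    """Total length of the quadripartite genome: LSC + SSC + 2 x IR."""
    return sum(r["length"] * r["copies"] for r in regions.values())

def weighted_gc(regions):
    """Length-weighted GC percentage across all regions."""
    gc_bp = sum(r["length"] * r["copies"] * r["gc"] / 100
                for r in regions.values())
    return 100 * gc_bp / plastome_length(regions)

if __name__ == "__main__":
    total = plastome_length(REGIONS)
    print(total)                            # 158410
    print(158_597 - total, 158_553 - total) # 187 143
    print(round(weighted_gc(REGIONS), 1))
```

The weighted GC value comes out at roughly 39.0%, which agrees with the reported 39.1% once the rounding of the per-region percentages is taken into account.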

remaining 20 genes had a single intron each. The gene rps12 was identified as a trans-spliced gene, with its 5′ end located in the LSC region and its duplicated 3′ end in the IR region. In addition, of the 131 genes, 17 were double-copy genes and the rest were single-copy genes.

Repeats Analysis
In the B. moratii cp genome, four types of repeats were identified: forward repeats (F), palindromic repeats (P), reverse repeats (R), and complement repeats (C) (Figure 3). The repeat analysis showed 52 forward repeats, 46 palindromic repeats, 14 reverse repeats, and 12 complement repeats (Figure 3B). Forward repeats were the most common (41.94%) and complement repeats the least common (9.68%). The longest repeat was a 48 bp palindromic repeat located in the LSC region.
The B. moratii cp genome contains 114 SSRs of six types. Mononucleotide repeats were the most numerous (89), followed by dinucleotide repeats (12), tetranucleotide repeats (6), and trinucleotide repeats (5); pentanucleotide and hexanucleotide repeats were the rarest, with one each (Figure 3C). Among the 13 repeat units of the six SSR types, the mononucleotide unit A/T was the most abundant, accounting for 76.35% of all SSRs (Figure 3C). The AT/AT and AAT/ATT repeat units also predominated in their respective SSR types. The SSRs were unevenly distributed across the genome. The LSC region contained the most SSRs (94, accounting for 82.46%), while the IR regions contained the fewest (only 4, accounting for 3.51%) (Figure 3A). In addition, 81 SSRs, accounting for 71.05%, were located in intergenic regions; most of the rest were in introns, and a few were in exons (Figure 3A).
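To make the SSR categories concrete, the following Python sketch scans a sequence for perfect mono- to hexanucleotide repeats. It is not the tool used in this study, and the minimum repeat counts (10, 5, 4, 3, 3, and 3 for mono- to hexanucleotide units) are an assumption borrowed from settings commonly used in plastome studies; the thresholds actually applied here are not stated in this section. All names are hypothetical.

```python
# Assumed minimum number of repeats per unit length (not taken from this study).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}

def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter motif."""
    k = len(unit)
    return not any(unit == unit[:d] * (k // d)
                   for d in range(1, k) if k % d == 0)

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, unit, repeat count) for each perfect SSR, left to right."""
    seq = seq.upper()
    hits = []
    i = 0
    while i < len(seq):
        found = None
        for k in sorted(min_repeats):
            unit = seq[i:i + k]
            if len(unit) < k or set(unit) - set("ACGT"):
                continue
            if not is_primitive(unit):
                continue
            reps = 1
            while seq[i + reps * k:i + (reps + 1) * k] == unit:
                reps += 1
            if reps >= min_repeats[k]:
                found = (i, unit, reps)
                break
        if found:
            hits.append(found)
            i += found[2] * len(found[1])  # jump past the repeat just reported
        else:
            i += 1
    return hits

if __name__ == "__main__":
    toy = "GC" + "A" * 12 + "GCGG" + "AT" * 6 + "CCG"
    print(find_ssrs(toy))  # [(2, 'A', 12), (18, 'AT', 6)]
```

Because the scan reports the leftmost qualifying repeat, the motif it names depends on where the run starts (for example, a run may be reported as TA rather than AT); grouping complementary and rotated motifs, as in the A/T and AT/AT classes above, would be a separate step.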

Number and Forms of Microstructural Mutations
To better understand cp genomic differences between sub-Saharan Africa and Madagascar, indel mutations between the cp genomes of the three species were compared. In Matrix I (B. moratii and P. microphylla), there were 136 indels in gene spacer regions, 28 in introns, and 18 in exons. Matrix II (B. moratii and B. pierreana) revealed 147 indels in gene spacer regions, 39 in introns, and 22 in exons. Matrix III (P. microphylla and B. pierreana) contained a total of 218 indels, with 158 in gene spacer regions, 42 in introns, and 18 in exons. Finally, in Matrix IV (B. moratii, P. microphylla, and B. pierreana), a total of 351 indels were detected across all regions. These findings are presented in Figure 4A. In Matrix I, Matrix II, and Matrix III, indels were more common in gene spacer regions than in other regions. Notably, Matrix I, in which both species are from Madagascar, had the fewest indels. Furthermore, these indels were classified into simple sequence repeat (SSR)-type indels and non-SSR-type indels. In Matrix I, Matrix II, and Matrix III, there were 81, 89, and 86 SSR-type indels and 101, 119, and 132 non-SSR-type indels, respectively (Figure 4B). Among SSR-type indels, most length changes ranged from 1 to 6 bp, but a 24 bp insertion in the ycf2 gene of Matrix III was the longest among Matrix I, Matrix II, and Matrix III. For non-SSR-type indels, most length changes ranged from 1 to 10 bp, but there was a 61 bp deletion in the intergenic region of psbE-petL in Matrix III, a 54 bp deletion in ycf1 in Matrix II, and a 54 bp deletion in the double copy of ycf2 in Matrix I. In Matrix I, Matrix II, and Matrix III, the LSC region contained 69.78%, 68.75%, and 68.81% of the indels, and the IR regions contained 10.99%, 12.98%, and 17.89%, respectively. In addition, 10, 6, and 10 micro-inversion events, ranging in length from 2 to 8 bp, were identified in the three matrices, respectively.
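As an illustration of how indel events can be tallied from a pairwise alignment, the Python sketch below treats each contiguous run of gap characters as one event and records its position and length. This is a simplified stand-in, not the Geneious Prime workflow used in the study: the function name is hypothetical, and it assumes a two-sequence alignment with "-" as the gap symbol and no columns that are gapped in both sequences.

```python
import re

def indel_events(seq_a, seq_b):
    """List indel events in a pairwise alignment.

    Each contiguous run of '-' counts as one event and is returned as
    (alignment start, length in bp, which sequence carries the gap).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    events = []
    for label, seq in (("gap_in_first", seq_a), ("gap_in_second", seq_b)):
        for match in re.finditer(r"-+", seq):
            events.append((match.start(), match.end() - match.start(), label))
    return sorted(events)

if __name__ == "__main__":
    a = "ACGT--ACGTACGT"
    b = "ACGTTTACG-ACGT"
    for event in indel_events(a, b):
        print(event)
    # (4, 2, 'gap_in_first')
    # (9, 1, 'gap_in_second')
```

Assigning each event to a spacer, intron, or exon, or to the SSR-type and non-SSR-type classes, would additionally require the genome annotation and the flanking sequence, which this sketch does not cover.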

Numbers and Pattern of SNP Mutations
SNP markers were identified as the most prevalent type of mutation across the three species. In Matrix I (B. moratii and P. microphylla), a total of 442 SNPs were observed, including 251 transitions (Ts) and 191 transversions (Tv), with a Tv to Ts ratio of 1:1.31 (Figure 5B). Specifically, 188 SNPs were identified in intergenic regions, 39 in introns, and 215 in exons (Figure 5A). In Matrix II (B. moratii and B. pierreana), a total of 613 SNPs were detected, comprising 371 Ts and 242 Tv sites, with a Tv to Ts ratio of 1:1.53 (Figure 5B). Out of these SNPs, 270 were found in intergenic regions, 60 in introns, and 283 in exons (Figure 5A). In Matrix III (P. microphylla and B. pierreana), a total of 519 SNPs were identified, consisting of 332 Ts and 182 Tv, with a Tv to Ts ratio of 1:1.82 (Figure 5B). Among these SNPs, 235 were located in intergenic regions, 53 in introns, and 231 in exons (Figure 5A). In Matrix IV (B. moratii, P. microphylla, and B. pierreana), a total of 786 SNPs were observed, including 476 Ts and 310 Tv, with a Tv to Ts ratio of 1:1.54 (Supplementary Table S2). The types of base mutations were consistent in Matrix I, Matrix II, Matrix III, and Matrix IV, with the most common SNP mutation types being A to G and T to C, with fewer mutations from A to T or T to A (Figure 5B).
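The transition and transversion counts above follow the standard definition: a substitution between two purines (A, G) or between two pyrimidines (C, T) is a transition, and any purine-pyrimidine change is a transversion. The Python sketch below applies that rule to a pairwise alignment; the function name is hypothetical, and skipping gapped or ambiguous columns is an assumption of the sketch.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    ts = tv = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue  # identical, gapped, or ambiguous column
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    return ts, tv

if __name__ == "__main__":
    print(count_ts_tv("AACCGGTT-A", "AGCTGCTA-T"))  # (2, 3)
    # Ts/Tv ratios recomputed from the counts reported in the text.
    for name, ts, tv in [("Matrix I", 251, 191),
                         ("Matrix II", 371, 242),
                         ("Matrix IV", 476, 310)]:
        print(name, round(ts / tv, 2))
```

Recomputing Ts/Tv from the reported counts gives 1.31, 1.53, and 1.54 for Matrix I, II, and IV, matching the ratios stated above.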

Divergence Hotspots of B. moratii
To elucidate levels of sequence difference, we calculated the nucleotide variability (π) values. The π values in Matrix I ranged from 0 to 0.03, with an average value of 0.00279 (Figure 6A). The π values of Matrix IV ranged from 0 to 0.02667, with an average value of 0.00334 (Figure 6A). The π values of Matrix I and IV are very similar, sharing the highest and lowest points. For the plastomes of Matrix III, the π values ranged from 0 to 0.01833, with a mean value of 0.00330 (Figure 6B). Comparatively, the π values of Matrix II plastomes varied from 0 to 0.02833, with a mean of 0.00385 (Figure 6C). Among Matrix I, Matrix II, and Matrix III, the average π was largest in Matrix II and smallest in Matrix I. In addition, we discovered nine hypervariable loci (π > 0.014) in Matrix I, with two in ndhF and seven in ycf1. However, in Matrix II, we identified 19 hypervariable loci (π > 0.014). These loci were found in ndhF-rpl32, trnT-trnL, two in ndhF-rpl32, ndhG-ndhI, rps15-ycf1, and thirteen in ycf1. Similarly, we found 18 hypervariable loci (π > 0.014) in Matrix III. These loci include trnC-petN, rpl32-trnL, psbE-petL (two loci), ndhF-rpl32 (three loci), and ycf1 (eleven loci). In addition, the pairwise nucleotide divergence values among the three plastomes varied from 0.00283 to 0.00459 (Table 1). Divergence was larger between the sub-Saharan African and Malagasy plastomes and lowest between B. moratii and P. microphylla, both from Madagascar, consistent with the nucleotide polymorphism results.

Selective Pressure Analysis of Protein-Coding Genes
To investigate whether there is selection occurring in the plastid genes of B. moratii, P. microphylla, and B. pierreana, we calculated the Ka, Ks, and Ka/Ks ratios of 86 CDS between them.Among the three comparisons, the highest average Ka was observed between B. moratii vs. B. pierreana, at 0.00192 (Figure 7A).The largest mean value of Ks was found in B. moratii vs. B. pierreana, which was 0.00802, and the smallest was found in P. microphylla vs. B. pierreana, which was 0.00549 (Figure 7B).The mean value of Ka was significantly lower than that of Ks.We then selected 40 genes that showed potential selection pressure for further analysis (Figure 7C).The average Ka/Ks value of the 40 protein-coding genes examined between B. moratii and B. pierreana was found to be 0.286.Out of these, 20 genes had Ka/Ks values higher than the average.Among these twenty genes, the six genes with the highest ratios were ndhD, petA, rps4, rpl16, ycf1a, and ycf1b (Figure 7C).However, none of these genes had a Ka/Ks ratio greater than one.This result was replicated in the other two comparisons, suggesting that none of the genes were positively or neutrally selected between B. moratii, P. microphylla, and B. pierreana.In the comparison of B. moratii vs. P. microphylla, the average Ka/Ks ratio of the 40 protein-coding genes was 0.184.Among these genes, 19 had ratios higher than the average.The top six genes with the highest ratios were ndhF, ndhK, petD, psbB, psbZ, and ycf1a (Figure 7C).In the comparison of P. microphylla vs. B. pierreana, the average Ka/Ks ratio was found to be 0.206.Seventeen genes had ratios higher than the average, with the top six genes being matK, ndhD, ndhI, psbA, rps12, and ycf1a (Figure 7C).In addition, in the comparison between B. moratii and B. pierreana, the Ka/Ks ratio was higher than the other two groups, with the lowest Ka/Ks ratio between B. moratii and P. microphylla.

Discussion
Comparative plastome analyses have revealed structural variations, gene content differences, and evolutionary rate variations that provide insights into the phylogenetic relationships and evolutionary history of plant groups, as evidenced by studies on Betulaceae [53], Paeonia (Paeoniaceae) [54], and Trollius (Ranunculaceae) [55]. In this study, by comparing cp genomes, we hope to shed light on the evolutionary relationships between species and gain insight into the mechanisms of adaptation and differentiation in their respective habitats (Madagascar and sub-Saharan Africa).

Phylogenetic Analysis of Beilschmiediineae from Africa
Our phylogenetic results suggest that the Beilschmiediineae species in Madagascar may share a common ancestor with the Beilschmiediineae species in sub-Saharan Africa. This result is similar to those of Rohwer et al. (2014) [32], Li et al. (2020) [56], and Song et al. (2023) [31]. Rohwer et al. (2014) [32] confirmed that Dahlgrenodendron from sub-Saharan Africa was a sister clade to Aspidostemon from Madagascar on the basis of nuclear ITS and plastid trnK intron sequences, and distinguished sub-Saharan African Beilschmiedia species from those of Madagascar. Li et al. (2020) [56] reconstructed the phylogenetic tree based on the entire cp genome of the Beilschmiedia group using Bayesian inference and the maximum likelihood method and estimated its divergence times. B. pierreana from sub-Saharan Africa and P. microphylla from Madagascar were found to form a sister clade with strong support, and their divergence time ranged from about 3.1 to 31.4 mya. Song et al. (2023) [31] used plastid genome sequences to re-investigate the phylogenetic and biogeographic history of the tribe Cryptocaryeae, and found that the Beilschmiediineae of sub-Saharan Africa and Madagascar were sister clades, with a divergence time of 15.49-38.49 mya. These studies have shown that there are significant genetic and/or morphological differences between plant species on Madagascar and those on the African continent. This result can be attributed to the peculiarities of the isolated island environment, including factors such as geographical isolation and changes in climatic conditions, which have gradually separated the plants on Madagascar from those on the African continent and formed distinct species groups.

Structural Variation of B. moratii Plastome
The structure, size, and gene content of the B. moratii cp genome were relatively conserved. It shows a typical quadripartite circular structure with the LSC and SSC regions separated by the IR regions, similar to other Lauraceae plants and most angiosperms, with no significant differences [57,58]. Each region contributes to the differences in genome size among the three species, although the differences are small. During the comparison, we found a large number of insertion and deletion mutation sites, which may be the main cause of their genomic differences. In addition, the IR region plays a crucial role in maintaining plastid structural stability, and its contraction and expansion are considered the main reasons for differences in plastid genome sizes [59]. We compared the IR regions of the three species and found that they had contracted or expanded. This, together with insertion and deletion mutations in genes at the IR boundaries, caused differences in the size of their IR regions and also affected the size of the LSC and SSC regions. In contrast, studies of Asparagus densiflorus [60] and tree peony [61] have shown that the main determinant of plastid length is not the contraction and expansion of the IR region, but the length change in the LSC region. While the direct link between these plastid variations and the overall evolutionary process of these species remains elusive, our study provides valuable insights into plastid evolution.
In the genome, single nucleotide polymorphisms (SNPs) and insertion/deletion mutations (indels) are the most common mutation types and are the basic sources of genetic variation that drive evolutionary change in populations [62]. They generally have low genetic effects, but as population size increases, they may have an impact on the genetic diversity of the genome [63]. Of the three sets of comparisons, the one between the two species from Madagascar had the fewest mutations, with 634 mutation events, including 442 SNPs and 192 indels. Across the comparisons, the number of SNPs was 2.30-2.95 times the number of indels; the largest multiple was found between B. moratii and B. pierreana, which also had the largest number of SNP mutation sites and the largest total number of mutation sites. Five possible reasons for the differences in mutation number include genetic drift, population history, evolutionary timing, biogeographic factors, and natural selection [64]. We hypothesize that the fewer mutation events in the cp genome of Madagascar species may be due to a combination of these factors. This suggests that the ecological environment and evolutionary history of Madagascar may be different from those of sub-Saharan Africa. In addition, we also noticed that the number of SNPs in exons was 11.94-12.86 times that of indels in the three sets of comparisons. The SNP sites are not randomly distributed but clustered as "hotspots", which is reflected not only in the intergenic regions but also in the exons [65,66]. For example, in the three groups, the SNPs in exons were mostly concentrated in matK, psbA, ndhD, ndhF, rpoC1, rpoC2, ycf1, etc. This suggests that these genes show frequent genetic variation between species or play special functional roles in the evolutionary process, so their mutations may be more susceptible to natural selection.

Divergence Hotspot Analysis
While the cp genome is highly conserved overall, it does contain a number of variable regions and features that can be used for evolutionary and phylogenetic analysis, as well as for the development of genetic markers [67]. Across the four sequence matrices, the nucleotide variability results were very similar and shared common hypervariable regions, suggesting that mutation events in the genome were concentrated in these regions and drove them to become hotspots. It has been shown that nucleotide variation within a genus is low and very similar, whereas variation between genera is much larger and is often accompanied by distinctive hypervariable regions [68]. The nucleotide variability results may therefore also offer some clues to the taxonomic delimitation of Potameia and Beilschmiedia. In addition, the highest variation points in all four matrices were located in the ycf1 gene. Previous studies have confirmed that ycf1a or ycf1b are the most variable plastid genomic regions and can serve as core barcodes of land plants [69]. The ycf1 gene and several other identified hypervariable regions (ndhF, ndhF-rpl32, trnC-petN, pebE-petL, and rpl32-trnL) provide valuable information for further functional studies and plant taxonomy. The IR region is generally considered to be the most conserved part of the plastid genome of land plants, and our results also reflect this.

Selective Pressure Analysis
The ratio of non-synonymous to synonymous substitutions (Ka/Ks) is a key metric used to characterize natural selection pressures on protein-coding genes [70,71]. In this study, most of the genes across the three comparison sets underwent purifying selection, a few genes were under relaxed selection (0.5 < Ka/Ks < 1.0, Figure 7C), and none showed positive or neutral selection, reflecting the typical evolutionary conservation of plastid genes in plants [72]. The ycf1a gene appeared among these genes in all three comparison sets and showed large differences in mutation and nucleotide polymorphism. The ycf1 gene, the largest gene in the plastome, has been reported to be absent or pseudogenized in many previous studies [73]. However, studies have also shown that ycf1 and ycf2 are functional genes and that their encoded products are essential for cell survival [74]. In summary, the proteins encoded by these selected genes function as enzymes involved in chloroplast protein synthesis, gene transcription, energy conversion, and plant development. We infer that the cp functional genes under selection might play key roles in the adaptation and development of the Beilschmiediineae species in terrestrial ecosystems.
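The selection categories used above can be expressed as a simple classification rule on the Ka/Ks ratio. The following is a minimal sketch under stated assumptions: the 0.5 threshold for relaxed selection follows the text, while the function name, the tolerance used to call a ratio "neutral", the treatment of Ks = 0 as undefined, and the example values are our own hypothetical choices.

```python
def classify_selection(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Classify a gene by its Ka/Ks ratio.

    purifying: Ka/Ks <= 0.5
    relaxed:   0.5 < Ka/Ks < 1
    neutral:   Ka/Ks == 1 (within tol)
    positive:  Ka/Ks > 1
    """
    if ks <= 0:
        # The ratio cannot be computed without synonymous substitutions.
        return "undefined"
    ratio = ka / ks
    if ratio > 1 + tol:
        return "positive"
    if abs(ratio - 1) <= tol:
        return "neutral"
    if ratio > 0.5:
        return "relaxed"
    return "purifying"


# Hypothetical (Ka, Ks) pairs, not values from this study.
examples = {
    "gene_a": (0.002, 0.010),
    "gene_b": (0.007, 0.010),
    "gene_c": (0.020, 0.010),
    "gene_d": (0.010, 0.000),
}
labels = {name: classify_selection(ka, ks) for name, (ka, ks) in examples.items()}
print(labels)
```

Under this rule, the reported mean ratios of 0.184 and 0.286 both fall in the purifying range.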

Conclusions
This study is the first to sequence and assemble the complete plastome of B. moratii from Madagascar. Phylogenetic relationships reconstructed from the matK gene show that B. moratii clustered in a clade with other species from Madagascar, and that this clade is sister to the clade of species from sub-Saharan Africa. A comparison of the cp genomes of three African Beilschmiediineae species showed that the two species from Madagascar had the fewest mutation events and the slowest evolutionary rates, a finding supported by multiple results. Six highly variable regions (ndhF, ndhF-rpl32, trnC-petN, pebE-petL, rpl32-trnL, and ycf1) were identified among the three African species; these can be used for phylogenetic studies or as markers for species identification. Our results deepen the understanding of the B. moratii cp genome and of the phylogenetic evolution of the African Beilschmiediineae, contributing to the conservation of species diversity in Madagascar.
, is clearly divided into the Madagascar clade and the sub-Saharan Africa clade, which are sister groups to each other. The Madagascar clade consists entirely of species from Madagascar, namely B. madagascariensis, B. moratii, B. velutina, P. chartacea, P. microphylla, and P. thouarsiana, with support values up to 0.78. The species in the sub-Saharan Africa clade are all from sub-Saharan Africa, including B. acuta, B. jacques-felixii, B. pierreana, and B. sp. (No: KC627690, KC627846, KC627629, and HG314964), and show a stronger support value of 0.99.

Figure 1. A phylogenetic tree constructed by Bayesian inference based on the matK gene. (A) A map of species localities in the phylogenetic tree. Light green circles represent the recorded sites of the Beilschmiediineae in Africa, with data sourced from GBIF (https://www.gbif.org/ (accessed on 12 March 2024)). The blue and red circles mark the species used to build the phylogenetic tree, from sub-Saharan Africa and Madagascar, respectively. Sites marked by a species name with a black dot in the center indicate the species whose cp genomes were compared in this study. (B) A Bayesian phylogenetic tree based on the matK gene. Different clades are highlighted with different colors. The two species in red font and the one species in blue font, all with whole cp genome data, are located in Madagascar and sub-Saharan Africa, respectively. The numbers above the branches are Bayesian posterior probability values.

Figure 2. A gene map of the plastomes of B. moratii, B. pierreana, and P. microphylla in the family Lauraceae.

Figure 3. A comparison of the distribution of repeat sequences and SSRs in the cp genome of B. moratii. (A) A distribution map of repeat sequences and SSRs in the cp genome. The innermost yellow line represents forward repeats (F), the green line represents palindromic repeats (P), the red line represents reverse repeats (R), and the black line represents complement repeats (C). The blue circle in the innermost circle represents the density of SSRs. The outermost circle is the physical map of the cp genome; the two thick black lines are the IR regions, the blue lines are the SSC region, and the green lines are the LSC region. (B) The number of repeats of different types. (C) The type frequency of SSR motifs in different repeat class types.

Figure 4. The number of indels in different types and regions. (A) The number of indels distributed in different regions. (B) The number of different types of indels.

Figure 5. The number of SNPs in different types and regions. (A) The number of SNPs distributed in different regions. (B) The number of different types of SNPs.

Forests 2024, 15, 832

Figure 6. A comparison of the nucleotide variability (π) values. (A) The nucleotide variability of Matrix I and Matrix IV; the red line represents Matrix I, which includes B. moratii and P. microphylla, while the blue line represents Matrix IV, which includes B. moratii, P. microphylla, and B. pierreana. (B) The nucleotide variability of Matrix II, which includes B. moratii and B. pierreana. (C) The nucleotide variability of Matrix III, which includes P. microphylla and B. pierreana. In the figure, red circles represent SNP sites and blue circles represent indel sites.

Figure 7. The Ka, Ks, and Ka/Ks nucleotide substitution values of B. moratii, P. microphylla, and B. pierreana. (A) The Ks values of 86 PCGs. (B) The Ka values of 86 PCGs. (C) The Ka/Ks values for 40 potential selection genes.

Table 1. Pairwise nucleotide divergences of the three plastomes of Beilschmiediineae.