Plastomes of Bletilla (Orchidaceae) and Phylogenetic Implications

The genus Bletilla is a small genus of only five species distributed across Asia, including B. chartacea, B. foliosa, B. formosana, B. ochracea and B. striata, which is of great medicinal importance. Furthermore, this genus is a member of the key tribe Arethuseae (Orchidaceae), harboring an extremely complicated taxonomic history. Recently, the monophyletic status of Bletilla has been challenged, and the phylogenetic relationships within this genus are still unclear. The plastome, which is rich in both sequence and structural variation, has emerged as a powerful tool for understanding plant evolution. Along with four new plastomes, this work is committed to exploring plastomic markers to elucidate the phylogeny of Bletilla. Our results reveal considerable plastomic differences between B. sinensis and the other three taxa in many aspects. Most importantly, the specific features of the IR junction patterns, novel pttRNA structures and codon aversion motifs can serve as useful molecular markers for Bletilla phylogeny. Moreover, based on maximum likelihood and Bayesian inference methods, our phylogenetic analyses based on two datasets of Arethuseae strongly imply that Bletilla is non-monophyletic. Accordingly, our findings from this study provide novel potential markers for species identification, and shed light on the evolution of Bletilla and Arethuseae.


Szlachetko (1995)
Bletilla and Bletia were classified in the same tribe (Arethusa was still placed in tribe Arethuseae) Position: tribe Bletiinae (including Bletilla and Bletia) Berg (2005) Bletilla and Arethusa were classified in the same tribe (Bletia was placed in the subtribe Bletiinae of tribe Epidendreae) Until now Position: tribe Arethuseae subtribe Arethusinae (including Arethusa) subtribe Coelogyninae (including Bletilla) Nevertheless, although most studies have focused on the systematic status of Bletilla, only a few reports have involved its internal branching patterns. Based on nrDNA-ITS and plastid gene matK, Li et al. [19] observed that the genus Bletilla was monophyletic, with B. ochracea being a sister to the remaining species of this clade; Feng et al. [20] found that B. formosana was a sister to B. striata + B. ochracea. Moreover, the monophyletic status of Bletilla has been challenged by a recent study from Huang et al. [18]. In this work, with combined morphological and molecular data, the species B. foliosa (a synonym of B. sinensis) has been treated as a new monotypic genus Mengzia. The monophyletic status of Bletilla and the phylogenetic relationships within this genus are still unclear. Thus, more evidences and further investigations are needed to clarify these issues.
As we know, the plastid genome (plastome), which is rich in both sequence and structural variation, has emerged as a powerful tool for understanding plant evolution [21][22][23][24][25][26][27][28]. In particular, with the rapid development of high-throughput sequencing technologies, numerous studies have enriched plastomic resources, such as the genomic composition, structural variation, high diversity regions [29][30][31]. Additionally, there are still several important characteristics of plastome, which harbor phylogenetic implications, that need to be explored further. For instance, studying plastomic tRNA (pttRNA), which accumulated multiple mutation events, is one point for delving into the plastid evolution [32]. It is becoming widely recognized that tRNAs possess highly conserved clover leaf-like structures [33]. Interestingly, our recent study surprisingly found some novel structures in the pttRNAs of the Macaronesian species (Crassulaceae) [28]. We further identified these secondary structural variations as genus-specific markers. Accordingly, exploring the pttRNAs' secondary structure in Bletilla greatly benefit a better understanding of the phylogeny of this genus.
In addition, codon usage bias (CUB), referring to the unbalanced utilization of synonymous codons in coding DNA, can be analyzed for getting insights into the evolutionary patterns of both taxa and genes [34][35][36][37][38]. By the statistical analysis of CUB, we can speculate which factor is mainly responsible for bias pattern, usually mutational bias or natural selection [39][40][41]. It should also be noted that the CUB pattern has been reported to be highly associated with gene expression level [42,43]. Based on that, a codon that has a distinct positive relationship between its frequency and gene expression is defined as the optimal codon [44,45]. Codon optimality was attributed as a major determinant of mRNA stability [46][47][48]. Additionally, codon aversion motifs (CAM), presented as the nonuse of codons in genes, has recently been found to be a novel marker for phylogeny studies [49,50]. Definitely, the phylogenetic implications obtained from the analyses of the codon usage and aversion will improve our understanding of the phylogeny of Bletilla.
In this work, aiming to explore the interspecific differences of the genus Bletilla, four new plastomes were reported. Through comprehensive analyses, we are committed to address (1) the compositional variations of plastomes among the members of Bletilla; (2) the differences in novel secondary structures of pttRNAs; and (3) the phylogenetic relationships within the genus Bletilla. Ultimately, our findings of this study will shed light on the evolution of Bletilla and Arethuseae.

Plastomic Organizations and Structural Features
The size of four complete plastome sequences of Bletilla ranged from 156,942 (B. sinensis) to 160,168 bp (B. ochracea), with typical quadripartite structures, containing LSC regions (86,746 bp) and SSC regions (18,804 bp) separated by two IR regions (26,809 bp each) ( Figure 1, Table 2). The four plastomes were extremely conserved in GC contents and gene numbers, sharing the highest GC content in IR (43.2%), followed by LSC (35.    As shown in Figure 2, the four Bletilla species displayed similar structures at JSA (junction IRA/SSC) and JSB (junction IRB/SSC), where ycf1 (functional copy) and ndhF genes spanned these two junctions, respectively. Additionally, a 55 bp overlap between ycf1 (pseudo copy) and ndhF was found at JSB in all four plastomes. Most notably, upon further analysis, we found that B. sinensis had several unique features. For instance, compared with the rather high similarity of plastome size among the three other species of Bletilla (159,484-160,022 bp), B. sinensis featured a much smaller size (156,942 bp). Moreover, we also detected that rpl22 of B. sinensis completely existed in LSC regions, whereas this gene in the other three species was all across LSC and IRb. To better understand the structural variation of IR-LSC junctions, we further investigated a total of 16 well-annotated plastomes from the tribe Arethuseae. Additionally, the results show that the same location of rpl22 was found in B. sinensis and all species in node 1. Meanwhile, all species in node 2 and 3, together with Arundina graminifolia, shared the same expansion of IR in rpl22 gene ( Figure 2). Moreover, we also observed that each IR of Arethuseae had a trnH-rps19 gene cluster near both JSA and JSB.
In addition to coding regions, the diversity might also occur in the non-coding regions of the plastomes. Hence, we further analyzed DNA polymorphism among the four plastomes of Bletilla. To detect the intrageneric divergence, the calculation was conducted in two groups: all four Bletilla taxa (group A) and three Bletilla species (excluding B. sinensis) (group B). The identified highly polymorphic regions (HPR) of each group displayed a high degree of variation ( Figure 3). In group A, a total of 12 regions were identified with the nucleotide divergence (Pi) ranging from 0.03667 to 0.07028 and 44-353 mutation sites. In comparison, group B possessed 14 HPRs featuring Pi in 0.00556-0.01185 and different sites in 5-33 (Table 3). Most interestingly, only four regions were shared by the two groups, and the Pi values of group A were overall almost 7 times larger than those of group B. Complicated relationships within Bletilla could be inferred by all the above divergence, and the unique patterns of the IR junction and HPR might serve as specific markers for this genus. In addition to coding regions, the diversity might also occur in the non-coding regions of the plastomes. Hence, we further analyzed DNA polymorphism among the four plastomes of Bletilla. To detect the intrageneric divergence, the calculation was conducted in two groups: all four Bletilla taxa (group A) and three Bletilla species (excluding B. sinensis) (group B). The identified highly polymorphic regions (HPR) of each group displayed a high degree of variation ( Figure 3). In group A, a total of 12 regions were identified with the nucleotide divergence (Pi) ranging from 0.03667 to 0.07028 and 44-353 mutation sites. In comparison, group B possessed 14 HPRs featuring Pi in 0.00556-0.01185 and different sites in 5-33 (Table 3). Most interestingly, only four regions were shared by the two groups, and the Pi values of group A were overall almost 7 times larger than those of group B. Complicated relationships within Bletilla could be inferred by all the above

High-Informative Patterns of pttRNAs' Secondary Structures
The secondary structures of pttRNAs were identified and compared among four plastomes of Bletilla. In 38 tRNA genes, a total of 12 novel putative pttRNA structures were detected ( Figure 4) and were clustered into five groups: (1) pttRNA with one novel loop located in the accept stem: tRNA Arg -ACG and tRNA Thr -UGU; (2) pttRNA with one novel loop located in the D arm: tRNA Met

High-Informative Patterns of pttRNAs' Secondary Structures
The secondary structures of pttRNAs were identified and compared among four plastomes of Bletilla. In 38 tRNA genes, a total of 12 novel putative pttRNA structures were detected ( Figure 4) and were clustered into five groups: (1) pttRNA with one novel loop located in the accept stem: tRNA Arg -ACG and tRNA Thr -UGU; (2)  To explore potential new markers for DNA barcoding, we further compared these novel putative pttRNA structures within Bletilla. Interestingly, the species B. sinensis has unique structural features of pttRNA. For example, the expanded 9-nt anticodon loop of tRNA Gln -UUG was found to be characteristic for B. sinensis ( Figure 4) compared with the ordinary 7-nt loop for the remaining three Bletilla species. Moreover, in the variable loop, three unique pttRNAs were also detected (  To explore potential new markers for DNA barcoding, we further compared these novel putative pttRNA structures within Bletilla. Interestingly, the species B. sinensis has unique structural features of pttRNA. For example, the expanded 9-nt anticodon loop of tRNA Gln -UUG was found to be characteristic for B. sinensis ( Figure 4) compared with the ordinary 7-nt loop for the remaining three Bletilla species. Moreover, in the variable loop, three unique pttRNAs were also detected (  species, however, the sequence 5′-UUU-3′ is only for B. sinensis, while it is 5′-UUA-3′ for the other three; (3) the position nt 56 in tRNA Ser -UGA was occupied by 'C' for B. sinensis and 'A' for the other three.
Furthermore, we conducted comparative analyses at the intergeneric level in the tribe Arethuseae (involving 22 released plastomes in total). Notably, except for tRNA Ser -UGA, the structural features of tRNA Gln -UUG, tRNA Ser -GCU and tRNA Leu -UAA of B. sinensis differed from the other species of the tribe Arethuseae.

Patterns of Codon Usage and Aversion
To explicate the patterns of codon usage and aversion among four species of Bletilla and one closely related species Arundina graminifolia, we performed the evaluation of the effective number of codons (ENCs), relative synonymous codon usage (RSCU) value, optimal codons and codon aversion motifs (CAM). Only 53 CDSs with sizes of at least 300 bp were considered for further analyses.
To reflect varying levels of CUB, the lowest and highest 5% of ENC values were selected and compared among these five species. As presented in Table 4, we detected high similarities in B. formosana, B. ochracea and B. striata, which shared the exactly same pattern for both low (rps18, rpl16 and psbD) and high (clpP, ndhE and ycf3) groups. In contrast, for B. sinensis, the low group included rps8, and the high group possessed rps4 and ndhJ. It was worth noting that the B. sinensis was found to share some similarities with A. graminifolia on ENC values. For instance, the rps8 (37.85) gene was also one of the lowest in A. graminifolia and ndhJ (53.81) was the third highest as well.
Considering the codon aversion can act as a new character system in phylogenetics, we also employed the analysis of CAM for these 53 CDSs of five plastomes (Table S2). The genus Bletilla had a highly conservative codon aversion pattern excluding B. sinensis, harboring the same motifs for 27 CDSs. Among these 27 CDSs, five are shared by B. sinensis  (clpP, ndhB, ndhJ, rps7 and ycf4). Surprisingly, we detected many unique CAM for B. sinensis from a total of 31 CDSs, indicating the substantial interspecific difference in Bletilla.

Patterns of Codon Usage and Aversion
To explicate the patterns of codon usage and aversion among four species of Bletilla and one closely related species Arundina graminifolia, we performed the evaluation of the effective number of codons (ENCs), relative synonymous codon usage (RSCU) value, optimal codons and codon aversion motifs (CAM). Only 53 CDSs with sizes of at least 300 bp were considered for further analyses.
To reflect varying levels of CUB, the lowest and highest 5% of ENC values were selected and compared among these five species. As presented in Table 4, we detected high similarities in B. formosana, B. ochracea and B. striata, which shared the exactly same pattern for both low (rps18, rpl16 and psbD) and high (clpP, ndhE and ycf3) groups. In contrast, for B. sinensis, the low group included rps8, and the high group possessed rps4 and ndhJ. It was worth noting that the B. sinensis was found to share some similarities with A. graminifolia on ENC values. For instance, the rps8 (37.85) gene was also one of the lowest in A. graminifolia and ndhJ (53.81) was the third highest as well. Considering the codon aversion can act as a new character system in phylogenetics, we also employed the analysis of CAM for these 53 CDSs of five plastomes (Table S2). The genus Bletilla had a highly conservative codon aversion pattern excluding B. sinensis, harboring the same motifs for 27 CDSs. Among these 27 CDSs, five are shared by B. sinensis (clpP, ndhB, ndhJ, rps7 and ycf4). Surprisingly, we detected many unique CAM for B. sinensis from a total of 31 CDSs, indicating the substantial interspecific difference in Bletilla. Furthermore, two CDSs (accD and atpF) were observed that were shared by A. graminifolia and B. sinensis. Most significantly, as shown in Figure 6, the aversion motifs identified in ndhA and rps11 genes could distinguish five investigated species.
The optimal codons identified in the five plastomes were shown in Table 5. Four species possessed four optimal codons, respectively: B. formosana, B. ochracea and B. striata had the same pattern (GGU, UUG, UCC and CGU), and A. graminifolia featured by GGU, UUG, UCC and CGA. Instead, B. sinensis only had two optimal codons (GGU and CGA), showing a high degree of diversity. Notably, GGU was shared by all five species, and CGA was shared by B. sinensis and A. graminifolia.  Figure 6. The species-specific codon aversion motifs of ndhA and rps11 gene for the 5 investigated plastomes. The dots that were marked, respectively, in different colors, represent specific species. The optimal codons identified in the five plastomes were shown in Table 5. Four species possessed four optimal codons, respectively: B. formosana, B. ochracea and B. striata had the same pattern (GGU, UUG, UCC and CGU), and A. graminifolia featured by GGU, UUG, UCC and CGA. Instead, B. sinensis only had two optimal codons (GGU and CGA), showing a high degree of diversity. Notably, GGU was shared by all five species, and CGA was shared by B. sinensis and A. graminifolia.

Phylogenetic Implications of Plastomes within Bletilla
To further clarify the evolutionary relationships within the genus Bletilla, especially the taxonomic position of B. sinensis, we performed phylogenetic analyses. Along with four new plastomes generated in this study, our phylogeny totally covered 36 species from 9 genera of Arethuseae using the datasets of 79 PCGs. Based on the 69,937-bp concatenated sequence, similar tree topologies were obtained for both ML and BI algorithms.
As Figure 7 displayed, most Arethuseae species formed two main clades. Clade I comprised 6 genera, which could be further divided into two subclades. All Pleione species clustered in a strongly supported monophyletic group

Discussion
With the aim of clarifying the interspecific relationships in the genus Bletilla, this study reported the plastomes sequences of four species. Comprehensive analyses were performed within Bletilla and the tribe Arethuseae, including basic genomic properties of plastids, the structural features of IR boundaries, the predicted structures of pttRNAs, the patterns of codon usage and aversion, as well as phylogeny. Thereby, this work provides abundant molecular evidence for resolving the taxonomic issues in Bletilla, and also sheds light on the evolution of Arethuseae.
Characterized by single-parent inheritance, conservative organization, and a relatively slow-evolving rate, the plastome is widely recognized as a super-barcode for plant species discrimination and phylogenetic analyses [51][52][53]. From our analyses, both simi- In addition, we also constructed a wider taxonomic sampled cladogram based on six cpDNA loci, additionally comprising the data of three Arethusinae taxa ( Figure S1). Notably, clade I showed a highly similar topology to the PCG trees. Additionally, its sister clade (clade II) was composed of four species, three of which (Arethusa bulbosa, Eleorchis japonica and Calopogon tuberosus) formed a well-supported subclade (100 in ML, 1.0 in BI), and A. graminifolia was weakly grouped with them.
Significantly, it was interesting to note that B. sinensis was not clustered with other Bletilla members, and was located at the basal position of Arethuseae ([BS] = 100, [PP] = 1.0) in all trees instead. Thus, our results suggest that the genus Bletilla is paraphyletic.

Discussion
With the aim of clarifying the interspecific relationships in the genus Bletilla, this study reported the plastomes sequences of four species. Comprehensive analyses were performed within Bletilla and the tribe Arethuseae, including basic genomic properties of plastids, the structural features of IR boundaries, the predicted structures of pttRNAs, the patterns of codon usage and aversion, as well as phylogeny. Thereby, this work provides abundant molecular evidence for resolving the taxonomic issues in Bletilla, and also sheds light on the evolution of Arethuseae.
Characterized by single-parent inheritance, conservative organization, and a relatively slow-evolving rate, the plastome is widely recognized as a super-barcode for plant species discrimination and phylogenetic analyses [51][52][53]. From our analyses, both similarities and differences were detected between B. sinensis and three other Bletilla species. All four plastomes harbored exactly the same number of genes and similar GC contents. With the same location of the ycf1 gene, all four plastomes possessed a pseudo ycf1. Nevertheless, compared to other species, the gene content of B. sinensis at JLB was highly divergent. According to Downie and Palmer [52], any mutation occurring in the structure or content of plastome possibly implicated phylogeny. In our recent study on plastome evolution [28], the rps19 genes in all investigated taxa of Crassulaceae were located at the JLB, and were extended by 105 or 110 bp in the IRb. However, all rps19 genes from members of the tribe Arethuseae were present in IRb. Moreover, the trnH-rps19 gene cluster in IRs observed in this study was also present in most monocots, implying that the duplication of this cluster was prior to the divergence of monocot lineages [54].
Nucleotide mutations are cluster-distributed and manifested as "hotspots" in plastomes [55]. Consistent with previous studies [56][57][58][59][60], we also observed that genes in the IR regions were slower to evolve than those in the SC regions. To our knowledge, few mutational hotspots were found in the IR regions. For instance, Henriquez et al. [31] investigated five plastomes of Monsteroideae, which exhibited no hotspots in the two repeat regions. The decreased rates of substitution might result from gene conversion in IR regions [58,61,62]. More interestingly, Li et al. [63] observed that genes translocated into the IR region of fern plastomes not only reduced substitution rates, but also increased the GC content. In addition, the substantial increase in the Pi values in group A (Bletilla included B. sinensis) compared to group B (Bletilla excluded B. sinensis) indicated the faster substitution rate in B. sinensis. This finding implies that the three species (B. ochracea, B. formosana and B. striata) were closely related to each other and distantly related to B. sinensis. Dong et al. [64] found that highly variable chloroplast markers were suitable for evolutionary studies on angiosperms at low taxonomic levels. In our recent research on the plastome evolution of Aeonium and Monanthes (Crassulaceae) [28], we strongly recommended that the hotspots (highly polymorphic regions) of plastomes might have important implications for phylogeny, and could be used for the DNA barcoding of plants. Therefore, the identified hotspots loci in this study obviously possessed higher informative divergence, which would act as more efficient markers for the barcoding and phylogeny of Bletilla.
As we know, the complete plastome has significant genomic resources for untangling phylogenetic issues [65][66][67][68]. Linking the mRNA and protein, pttRNA occupies a critical part of chloroplast [32,69,70]. Brennan and Sundaralingam [71] pointed out that tRNAs embodies two categories on the basis of a variable-region size: type I contains a small loop with 4 or 5 nt, and type II has a larger size with a stem (3-7 bp) as well as a loop (3-5 nt). In this work, among the 38 pttRNAs of Bletilla, seven with a variable arm were examined to belong to type II, including tRNA Leu -CAA, tRNA Leu -UAA, tRNA Leu -UAG, tRNA Ser -GCU, tRNA Ser -GGA, tRNA Ser -UGA and tRNA Tyr -GUA. It is noteworthy that tRNA Tyr of type II was considered to be unique to prokaryotes in an early study [72]. However, through intensive and extensive sampling, Sun and Caetano-Anollés [73] found that all Tyr specific tRNAs in both Archaea and Eukaryotes were also classified as type II. The long variable arm was presumed to be an ancient structure and was lost in more derived tRNAs [73].
More notably, plastomic tRNA Leu -UAA was proved to have important phylogenetic implications. As shown in Figure 7, Arethuseae could be sorted into five distinct categories based on the variable loop of tRNA Leu -UAA. Type A was shared by Clade I, consisting of the genera Bulleyia, Coelogyne (except for C. corymbosa), Panisea, Pholidota (except for P. yunnanensis) and Pleione; type B was featured in the genus Bletilla (except for B. sinensis), C. corymbosa and P. yunnanensis; while types C, D and E were unique to Thunia alba, Arundina graminifolia and B. sinensis, respectively. Based on these analyses, we strongly suggest that the genus Bletilla is non-monophyletic. Our results clearly indicate that the secondary structures of pttRNAs are of highly informative value for phylogenetic analyses. Thus, more in-depth studies are needed to better understand the evolutionary significance of pttRNAs in plants.
The CUB pattern is affected by multiple factors, such as selection on translation [74], gene size [75] and composition [76], as well as mutation and selection pressure [41,77]. The ubiquity of CUB in different genes or taxa makes it an ideal resource for investigating molecular evolution, gene expression, nucleotide composition, etc. [78,79]. As a vital index of CUB, ENC is widely associated with the gene expression level. The lower value of ENC means the stronger impact from CUB, and vice versa [80]. Additionally, it has been widely acknowledged that a higher level of expression is generally associated with a more powerful bias [81,82]. Notably, in comparison with three other species of Bletilla, our analysis revealed that B. sinensis varied considerably in ENC pattern, which might imply that the gene expression mode in B. sinensis is different. Another important indicator for gene expression is optimal codons. In fact, codon optimality has been proven to be a key determinant of mRNA stability [46][47][48]. Previous studies proposed that the number of optimal codons might be correlated with a different selection mode [81,83]. Generally, a positive selection will result in an increased number of preferred codons, while negative selection might cause a decrease. Significantly, we found that B. sinensis harbored the least number (two) of optimal codons among the five taxa investigated, suggesting that this species might undergo more pressure from purifying selection than others. On the other hand, codon aversion has recently been proposed to have extraordinarily potential value in phylogeny [28,49,50,84,85]. Remarkably, this study identified extremely divergent CAM in B. sinensis compared with other analyzed species. Moreover, following the method of Miller et al. [50], we further recovered the phylogeny of Arethuseae using CAM, exhibiting the same topology with the tree based on the PCGs of plastomes. Accordingly, our analyses on the patterns of codon usage and aversion confirmed the non-monophyly of Bletilla within the tribe Arethuseae.
In order to overcome the limitations of a few loci and the taxon sampling estimated, we employed two datasets (79 plastomic PCGs for 38 taxa, and 6 cpDNA regions for 49 taxa) to construct the cladogram of Arethuseae, respectively. Significantly, the nonmonophyly of Bletilla was strongly supported by two different phylogenetic methods. We also found a heterogeneity in the relationship between B. sinensis and Arundina graminifolia compared to the work of Huang et al. [18]. The latter study found a sister relationship of them, while B. sinensis was located at the basal position of Arethuseae in the present study. Currently, there are a limited number of phylogenetic informative sites for the phylogeny of Arethuseae. Hence, to better evaluate the taxonomic status of B. sinensis, more samples are needed.

Sample Material, DNA Extraction, Sequencing and Annotations
The fresh leaves of four Bletilla species (B. formosana, B. ochracea, B. sinensis and B. striata) were gathered, and their specific locations are listed in Table S3. The extraction of wholegenomic DNA was achieved using the Plant Genomic DNA kit (Tiangen, Beijing, China) according to CTAB method [86]. TruSeq DNA PCR-Free Library Prep Kit (Illumina, San Diego, CA, USA) was employed for library construction. Additionally, the resulting libraries were then sequenced through Illumina Novaseq 6000 with 150 paired-ends and 350 bp insert size.

Comparative Structural Analyses among the Plastomes of Bletilla
Comprehensive structural analyses were conducted comparatively among four members of the genus Bletilla. Firstly, the nucleotide composition of the plastomes was identified using Bioedit [95]. The boundaries at the junctions of IR and SC regions were checked and plotted manually. Additionally, the secondary structure of pttRNAs was then predicted by tRNAscan-SE v.2.0.3 [46].

Plastomic Codon Usage and Aversion Indices
To investigate the plastomic codon usage, CodonW v.1.4.2 was employed for the calculation of the value of ENC and RSCU [96]. The RSCU value was applied to quantify the degree of even use for each synonymous codon, with a value larger than 1 favoring the use of a codon and vice versa [97]. In the range of 20-61, ENC values usually denote the bias of codon usage, and the strong bias features with a low value [98].
Furthermore, to explore the deep correlation between CUB and gene expression, we conducted the ∆RSCU method to determine optimal codons in Bletilla [44,45]. Taking the ENC values of the CDSs as a substitute for expression degree, the highest and lowest 5% were categorized as the low and high group, respectively [40,81] Then, the optimal codons were sifted out with the ∆RSCU > 0.08 as well as the RCU (relative codon usage) value > 1 in the high group and <1 in the low group.
Moreover, to gain more informative genetic evidence, the codon aversion motifs, possessing strong phylogenetic implications, were extracted and manually checked using the CAM algorithm [50].

Phylogenetic Inferences
To untangle the controversy of the phylogeny of Bletilla, we performed phylogenetic analyses within the tribe Arethuseae. Additionally, two species of Liparis, from the closely related tribe Malaxideae of Arethuseae, served as outgroups. Additionally, two phylogenetic sampling strategies were employed in this study.
The first one was the inclusion of all available complete plastomic sequences of the tribe Arethuseae from NCBI, along with four new data from this study. The 79 plastomic CDSs from a total of 38 taxa formed the first dataset, which represented nine genera of the tribe Arethuseae (Table S4).
Furthermore, six cpDNA regions (ccsA, matK, psaB, rbcL, rpoC1 and ycf1) were selected for the second dataset. It consisted of 49 species, including eight additional species of Coelogyninae and three Arethusinae taxa compared to the first dataset (Table S5).
The two datasets were aligned, respectively, with MAFFT, under the default settings [99]. SequenceMatrix was then used for the concatenation [100], with gaps as missing data. After that, two approaches were chosen for phylogenetic analysis: maximum likelihood (ML) and Bayesian inference (BI).
The ML trees were inferred by RAxML 8.2.12 [101]. Fifty runs and one thousand bootstrap replicates were executed under the models identified by PartitionFinder v.2.1.1 with the "-raxml" command line [102]. We also checked the convergence of each node by the "-I autoMRE" option. For BI analysis, the best models for the dataset were determined by ModelTest-NG [103]. Two simultaneous runs with four Markov chains each were run for 10 million generations (sampling every 100 generations), and Tracer 1.7.1 was used to assess the convergence [104].

Conclusions
In the context of the controversial intrageneric relationships within Bletilla, this study newly sequenced plastomes from four species of Bletilla, and performed comparative analyses among them. Interestingly, our results reveal considerable plastomic differences between B. sinensis and the other three taxa in many aspects. Most importantly, the specific features of the IR junction patterns, novel pttRNA structures and codon aversion motifs can serve as useful molecular markers for Bletilla. Furthermore, at the tribe level, the variable region of plastomic tRNA Leu -UAA and IR boundaries showed important phylogenetic implications for Arethuseae. Additionally, our phylogenetic analyses based on the two datasets, covering 36 species and 49 taxa of Arethuseae, respectively, suggested the nonmonophyly of Bletilla with strong support. The convincing molecular evidence reported herein will provide novel potential markers for species identification, and achieve a more profound understanding for the evolution of Bletilla and Arethuseae.

Data Availability Statement:
The four plastomes' sequences data generated in this study are available in GenBank of the National Center for Biotechnology Information (NCBI) (https://www.ncbi.nlm.nih.gov/nuccore, accessed on 29 July 2022) under the access numbers: OP104328-OP104330, and MT806143.