Apostasia Mitochondrial Genome Analysis and Monocot Mitochondria Phylogenomics

Apostasia shenzhenica belongs to the subfamily Apostasioideae, a primitive group located at the base of the Orchidaceae phylogenetic tree. However, the A. shenzhenica mitochondrial genome (mitogenome) has not been explored, and the phylogenetic relationships among monocot mitogenomes remain unclear. In this study, we examined the genetic diversity of the A. shenzhenica mitogenome and its phylogenetic relationships with other monocot mitogenomes. We sequenced and assembled the complete mitogenome of A. shenzhenica, resulting in a circular mitochondrial draft of 672,872 bp, with an average read coverage of 122× and a GC content of 44.4%. The A. shenzhenica mitogenome contained 36 protein-coding genes, 16 tRNAs, two rRNAs, and two copies of nad4L. Repeat sequence analysis revealed a large number of medium and small repeats, accounting for 1.28% of the mitogenome sequence. Selection pressure analysis indicated high mitogenome conservation in related species. RNA editing analysis predicted 416 sites in the protein-coding regions. Furthermore, we found 44 chloroplast genomic DNA fragments that were transferred from the chloroplast to the mitogenome of A. shenzhenica, with five plastid-derived genes remaining intact in the mitogenome. Finally, the phylogenetic analysis of the mitogenomes from A. shenzhenica and 28 other monocots showed that the evolution and classification of most monocots were well resolved. These findings enrich the genetic resources of orchids and provide valuable information on the taxonomic classification and molecular evolution of monocots.


Introduction
Mitochondria are organelles that play a crucial role in plant productivity and development through energy conversion, biosynthesis, and signal transduction [1]. The plant mitogenome is complex and diverse in terms of size, structure, number of repeats, and coding genes [2]. In addition, it shows extensive horizontal gene transfer (HGT) and has RNA editing mechanisms that ensure mitochondrial function and stable gene expression [3,4]. Most seed plants inherit chloroplasts (cp) and mitochondria (mt) from their maternal parent. This mode of inheritance reduces the influence of paternal lines, making genetic studies easier [5,6]. Thus, cp genomes and mitogenomes have been extensively analysed to understand taxon classification and evolution [7]. However, compared to plastid genomes, mitogenomes have lower evolutionary rates and are therefore more suitable for studying phylogenetic relationships at the level of families, orders, or higher taxonomic ranks.
Monocots are one of the most diverse, ecologically important, and economically important terrestrial plant lineages, with approximately 77 families and 85,000 species, accounting for 21% of angiosperms [8,9]. Despite this remarkable species diversity, only 30 monocot mitogenomes, including those of three orchids (Gastrodia elata, Platanthera zijinensis, and P. guangdongensis) [10,11], have been published in the NCBI GenBank database, owing to the complexity of mitogenome structure and sequence. This scarcity has limited the study and utilization of these valuable crops. At present, the phylogenetic relationships of monocots are mainly studied using cp genomes and mitochondrial gene segments, and the phylogenetic relationships based on whole mitogenomes are still unclear [12,13]. Therefore, it is important to provide further evidence for the systematic relationships among monocot lineages using mitogenomes.
Orchidaceae is the largest family of angiosperms and monocots, with over 700 genera and approximately 28,000 species [14]. Apostasia is one of the two genera that form the subfamily Apostasioideae, which is a primitive group located at the base of the Orchidaceae phylogenetic tree [15]. Moreover, Apostasia has a special morphology with "primitive" traits similar to those of Curculigo crassifolia, and its members are considered the closest to the "pseudoorchids" speculated by Darwin. This makes them ideal for studying orchids and phylogenetic evolution [16]. Therefore, we conducted a mitogenome study of A. shenzhenica to provide genetic resources for further studies on the evolution of Orchidaceae. We also combined the available mitogenomes of monocots to provide further evidence for the phylogenetic relationships of monocots.
In this study, we used second-generation sequencing techniques to de novo assemble the mitogenome of A. shenzhenica and systematically analysed gene content, repetitive sequences, selective pressure, and RNA editing sites. We investigated the gene transfer between the chloroplast and mitogenomes of A. shenzhenica. In addition, we explored the phylogenetic relationships among A. shenzhenica and 28 monocot species using the mitogenome, which provided valuable information on the taxonomic classification, molecular evolution, and breeding of monocots.

Mitogenome Structure and Gene Content
The genome sequence of A. shenzhenica was uploaded to GenBank (accession number: OQ645347). We assembled a 672,872 bp mitogenome of A. shenzhenica from 18 Gb of Illumina sequencing data, which we manually represented as a circular structure (Figure 1). The nucleotide composition of the mitogenome was 27.8% A, 27.8% T, 22.3% G, and 22.1% C, giving a GC content of 44.4%. In addition, we identified 54 mitochondrial genes in the A. shenzhenica mitogenome, including 36 protein-coding genes, two rRNA genes, and 16 tRNA genes (Table 1 and Table S1). Protein-coding genes (PCGs) accounted for 4.21% of the total mitogenome, whereas tRNA and rRNA genes accounted for only 0.17% and 0.31%, respectively. The rest of the mitogenome consisted of noncoding sequences, such as introns, intergenic spacers, and potential pseudogenes. There were 57 repeat pairs with a length of >50 bp, occupying 1.38% (9298 bp) of the mitogenome. Interestingly, two copies of the nad4L gene were detected. The depth of coverage across the entire mitogenome was relatively even, indicating the continuity of our assembly (Figure 1). Both overlaps and intervals exist between adjacent genes in the mitogenome. We identified three overlapping sequences, located between rpl5 and rps14, rpl16 and rps3, and rps19 and rps3. A total of 50 gene spacer regions with lengths ranging from 4 to 67,130 bp were identified. The largest spacer region, between nad5 and trnC-GCA, was 67,130 bp long. Furthermore, most PCGs of A. shenzhenica have no introns. Six of the annotated genes contained introns: five (nad5, ccmFc, rpl2, rps3, and rps10) contained one intron each, and nad7 contained four (Table 1). Analysis of the whole mitogenome sequence identified putative group II intron segments near each exon. With the minimum ORF size set to 200 bp in Geneious, we found that these genes have one or more ORFs of unknown function lying in the region between their exons. For example, the ccmFc intron has two ORFs with complete start and stop codons.

Table 1 column headings: Group of Genes, Name, Length, Start Codon, Stop Codon, Amino Acids.
Note: Numbers after gene names are the number of copies. The superscripts * and **** indicate genes containing one and four introns, respectively. The superscript a indicates chloroplast-derived genes.

Codon Usage Analysis of PCGs
ATG was the most frequent start codon for the 36 protein-coding genes (Table 1); the mttB gene was an exception, with the initiating codon TTG. Three stop codons (TAA, TAG, and TGA) were identified. These results indicated that RNA editing from C to U does not occur at the start or stop codon. The relative synonymous codon usage (RSCU) values of A. shenzhenica are displayed in Figure 2. The 36 protein-coding gene regions had 9379 codons, excluding termination codons (Table 2). The most frequently used codons were UUC and UUU (for phenylalanine; Phe) and CUU (for leucine; Leu), whereas CGU and CGC (for arginine; Arg) and GGC (for glycine; Gly) were rarely found. This may explain the negative base skew (AT, GC) of the PCGs.

Substitution Rates of PCGs
The non-synonymous-to-synonymous substitution ratio (Ka/Ks) is important in genetics for assessing the magnitude and direction of natural selection acting on homologous genes among divergent species. It is commonly accepted that Ka/Ks indicates neutral evolution when it equals one, positive selection when it is greater than one, and negative selection when it is less than one.
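The interpretation rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function name `classify_selection` and the tolerance `tol` (how close to 1 still counts as neutral) are assumptions introduced here, and the input values in the usage note are hypothetical.

```python
def classify_selection(ka: float, ks: float, tol: float = 0.05) -> str:
    """Label a gene pair by its Ka/Ks ratio.

    tol is an assumed tolerance around 1.0 for calling a ratio neutral;
    the source text does not specify one.
    """
    if ks == 0:
        # The ratio is undefined when no synonymous substitutions are observed.
        return "undefined"
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive" if ratio > 1.0 else "negative"


if __name__ == "__main__":
    # Hypothetical Ka and Ks values, chosen only to exercise each branch.
    for ka, ks in [(0.02, 0.10), (0.10, 0.10), (0.172, 0.10), (0.01, 0.0)]:
        print(ka, ks, classify_selection(ka, ks))
```

In practice a statistical test is usually needed before a ratio above 1 is taken as evidence of positive selection; the fixed tolerance here only mirrors the verbal rule.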
To investigate the evolutionary rate of mitochondrial genes, we calculated the nonsynonymous substitution rate (Ka) and synonymous substitution rate (Ks) for the 19 shared PCGs of A. shenzhenica against the Allium cepa, Gastrodia elata, Hemerocallis citrina, and Asparagus officinalis mitogenomes. As shown in Figure 3, the Ka/Ks ratios were much lower than 1 for the majority of protein-coding genes, indicating that the protein functions of these genes remained stable during evolution. In contrast, the Ka/Ks ratios of atp9 (1.13) and rps7 (1.72) were greater than 1, implying that these genes were subjected to positive selection. In particular, the Ka/Ks ratio of rps7 for A. shenzhenica and G. elata was higher than 2 (2.3), and those for H. citrina and A. officinalis were also above 2 (2.04) (Table S2), indicating that rps7 may be crucial for the evolution of A. shenzhenica. According to previous reports, the small ribosomal subunit protein encoded by the mitochondrial rps7 gene is essential for various biological activities in eukaryotes, such as embryonic development, leaf formation, and reproductive tissue formation [19][20][21]. Additionally, the ratio of the nad3 gene (0.81) was close to 1, indicating that it has evolved nearly neutrally since the divergence of A. shenzhenica and the four other Asparagales from their last common ancestor.
Figure 3. Ka/Ks values of the protein-coding genes. The x-axis shows the names of the protein-coding genes, and the y-axis shows the Ka/Ks values. Black dots represent outliers.

Prediction of RNA Editing Sites in PCGs
RNA editing refers to the process of altering genetic information at the mRNA level, including the deletion, insertion, or replacement of nucleotides. We used the web-based PREP-Mt tool to predict RNA editing sites in 28 PCGs of A. shenzhenica; 416 sites were predicted, all of them C-to-U edits. Among them, there were 12 conversions of CCT to TTT and two conversions of CCC to TTC. Moreover, the proportions were 39.9% (166 sites) for the first codon position, 64.3% (255 sites) for the second codon position, and none for the third codon position (Table S3). The absence of predicted sites at the third (silent) codon position is more likely to reflect the limitations of the PREP-Mt prediction methodology than a true lack of RNA editing there. Therefore, experimental techniques or proteomic data are required to identify RNA editing at silent sites [22,23].
The number of RNA editing sites varied greatly among genes, and genes encoding complex I (NADH dehydrogenase) and cytochrome c biogenesis proteins (ccmB, ccmC, ccmFc, and ccmFn) had more predicted RNA editing sites on average (Figure 4). Upon comparing the RNA editing sites of the five Asparagales species, we found that the nad4 gene had the most RNA editing sites, whereas rps7 had the fewest.

Identification of Repeat Sequences
Repetitive sequences consist of simple sequence repeats (SSRs), tandem repeats, and dispersed repeats. A total of 226 SSRs were discovered in the A. shenzhenica mitogenome, including 57 (25.22%) monomers, 57 (25.22%) dimers, 39 (17.26%) trimers, 61 (26.99%) tetramers, 10 (4.42%) pentamers, and two (0.88%) hexamers. About half of the 226 SSRs were either monomers or dimers. Additional analysis of the repeat units of the SSRs revealed that G/C accounted for only 8.8% of the monomers, whereas A/T accounted for 91.2%. The high AT content of the A. shenzhenica mononucleotide SSRs was consistent with the high AT content (55.6%) of the entire A. shenzhenica mitogenome. Table S4 shows the size and position of the hexamers and pentamers, all of which were found in intergenic spacers.
Tandem repeats, also known as satellite DNA, are core repeating units of 1-20 bases that are repeated numerous times. They exist extensively in eukaryotic and certain prokaryotic genomes [24]. The mitogenome of A. shenzhenica contained 13 tandem repeats with a matching degree of more than 95% and lengths ranging from 11 to 62 bp (Table S5). Dispersed repeats in the A. shenzhenica mitogenome were identified using the REPuter program [25]. In total, 979 repeats with lengths of 30 bp or more were found, 496 of which were forward repeats and 483 of which were inverted repeats. The longest forward repeat was 115 bp, and the longest inverted repeat was 153 bp. Figure 5 shows the length distributions of the forward and inverted repeats. Repeats of 30-39 bp were the most prevalent for both repeat types.
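The SSR class proportions reported above follow directly from the counts, as the short Python check below illustrates. The counts are taken from the text; the function and variable names are illustrative assumptions, not part of any tool used in the study.

```python
# SSR counts by repeat-unit length, as reported in the text.
SSR_COUNTS = {
    "monomer": 57,
    "dimer": 57,
    "trimer": 39,
    "tetramer": 61,
    "pentamer": 10,
    "hexamer": 2,
}


def ssr_percentages(counts: dict) -> dict:
    """Return each SSR class as a percentage of all SSRs, rounded to two decimals."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no SSRs to summarise")
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


if __name__ == "__main__":
    total = sum(SSR_COUNTS.values())
    print("total SSRs:", total)
    for name, pct in ssr_percentages(SSR_COUNTS).items():
        print(f"{name}: {pct}%")
    short = SSR_COUNTS["monomer"] + SSR_COUNTS["dimer"]
    print("monomers + dimers:", round(100 * short / total, 2), "%")
```

The same check applies to the dispersed repeats: the forward (496) and inverted (483) counts sum to the reported total of 979.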

Characterization of A. shenzhenica cp Genome Transfer into the Mitogenome
The A. shenzhenica mitogenome sequence was approximately 4.5 times longer than its cp genome (151,676 bp). The fragments sharing sequence identity with their original cp counterparts ranged from 30 to 4725 bp in length. A total of 44 fragments with a combined length of 34,456 bp migrated from the cp genome to the mitogenome of A. shenzhenica, accounting for 5.12% of the mitogenome (Figure 6). Five intact annotated genes were located on these fragments, comprising four tRNA genes and one cp protein-coding gene, namely, trnM-CAU, trnD-GUC, trnP-UGG, trnF-GAA, and psaJ. Our data also demonstrated that other genes, such as ycf2, accD, rrn16, psbB, and rpoB, migrated from the cp genome to the mitogenome. However, most of them lost their integrity during evolution, and only fragmentary sequences of these genes can presently be found in the mitogenome (Table 3). Among the transferred chloroplast segments, we found that most tRNA genes were much more conserved than protein-coding genes, probably because they play important roles in the A. shenzhenica mitogenome.
Notes (Table 3): Lowercase a indicates a partial sequence found in the mitogenome. Lowercase b indicates mt-derived genes.

Phylogenetic Analysis and Gene Loss of Monocotyledon Mitogenomes
To understand the evolution of A. shenzhenica and the monocot mitogenome, we performed phylogenetic analyses on A. shenzhenica, 28 other monocots, and four dicots (designated as the outgroup). Table S6 lists the accession numbers for the mitogenomes analysed in this study. A phylogenetic tree was constructed using an aligned data matrix comprising 28 conserved protein-coding genes from these species, as illustrated in Figure 7. The phylogenetic tree strongly supported the separation of monocots and dicots. Moreover, taxa from four orders (Alismatales, Asparagales, Arecales, and Poales) were well clustered. The clustering relationships of taxa in the phylogenetic tree in this study were consistent with those of previous studies examining the evolutionary relationships of these species.
As shown in Figure 7, most nodes had bootstrap support of more than 70%, and 25 nodes had 100% support. According to the maximum likelihood (ML) tree, A. shenzhenica and G. elata were classified into one clade (orchids) with a 100% bootstrap value, and this clade formed a sister relationship with the other four Asparagales. From the base of the tree upward, the bootstrap value for the separation of Alismatales from the clade composed of Asparagales, Arecales, and Poales was 100%. The bootstrap value for the separation of Asparagales and Arecales was 100%, and that for the separation of Arecales from Poales was also 100%.
Furthermore, we compared the A. shenzhenica mitogenome with those of 28 other monocots and found that the gene composition of monocot mitogenomes differed. As illustrated in Figure 8a, the majority of mitochondrial protein-coding and rRNA genes are highly conserved; cox2 is lost only in A. shenzhenica, whereas rpl10, rpl14, and sdh4 are lost in most monocots. In the evolutionary history of monocots, a considerable number of mitochondrial-derived tRNA loss events occurred; only trnC-GCA is conserved (Figure 8b). However, most of the tRNAs are retained in Poales.

Characterization of the A. shenzhenica Mitogenome
Most orchids exist as epiphytes in the wild. Orchid roots, stems, leaves, flowers, and seeds have adapted to a wide variety of habitats and have evolved unique structures and functions. The advent of rapid and cost-effective genome-sequencing technologies has accelerated our understanding of plant evolution. Mitochondria are the power stations of plants, producing the energy required to carry out life processes. Plant mitogenomes are more complex than chloroplast genomes because of factors such as complex structures and size variations [10,18]. Our study is the first to report the A. shenzhenica mitogenome and its characterization. Compared with other species of Asparagales, the mitogenome size of A. shenzhenica is moderate: it is markedly smaller than that of G. elata [10] but larger than those of A. cepa, A. officinalis, and H. citrina [26][27][28]. The overall GC content is 44.4%, which is similar to that of other species of Asparagales (42.9-46.7%) (Table 4).
Most sequences in the A. shenzhenica mitogenome are non-coding, and protein-coding genes account for only 4.21%. By comparison, protein-coding genes account for 2.45% in G. elata, 3.31% in C. comosum, 4.70% in A. cepa, 6.45% in A. officinalis, and 8.27% in H. citrina. This variation is mainly due to the frequent recombination of repeated sequences in the mitogenome and the integration of foreign sequences during evolution.

Repeat Sequences, Ka/Ks, and RNA Editing Sites
Mitochondrial repeat sequences are essential for intermolecular recombination and can contribute to extreme mitogenome sizes and structural differences [29,30]. The mitogenome of A. shenzhenica contains 113 repeat sequences longer than 50 bp, accounting for 1.38% of its genome. The longest repeat is 175 bp, and the mitogenome does not contain medium or large repeats. We therefore suspect that recombination is less frequent in the A. shenzhenica mitochondrion. Moreover, repeats are poorly conserved in plant mitogenomes, even within the same family. As shown in Table 4, the total length of the repeats ranges from 9298 bp (1.38%) in A. shenzhenica to 310,943 bp (57.95%) in A. cepa. Interestingly, the mitogenome of A. cepa is 536,617 bp, whereas that of A. shenzhenica is 672,872 bp, suggesting that the size of the genome is determined not only by the expansion of repeated sequences but also by other factors. This mtDNA heteroplasmy may provide more genetic resources for evolutionary selection [31], imparting ecological and genetic fitness to A. shenzhenica during its evolution.
Ka/Ks analysis and comparison of mitogenome features provide a comprehensive understanding of plant mitogenome evolution. In the present study, atp4, atp9, ccmFc, and nad6 underwent positive selection during evolution. Genes under positive selection pressure in other plant species, including atp8, ccmFn, matR, ccmB, and mttB, have also been reported [5,7,22,32]. However, the A. shenzhenica mitogenome is conserved, and most of the PCGs have undergone neutral or negative selection compared to other Asparagales species. In general, most of the results of this investigation are consistent with previous reports. In addition, nad4L shows the lowest Ka/Ks ratio among the A. shenzhenica mitochondrial genes. The nad4L Ka/Ks ratio is < 0.5 in different species [33]. There are two copies of nad4L in A. shenzhenica, and we suspect that this gene plays an important role in A. shenzhenica.
RNA editing, a post-transcriptional mechanism that occurs in the chloroplasts and mitochondria of higher plants, aids in protein folding [34,35]. The frequency and type of RNA editing in each organelle are highly lineage-specific [36][37][38]. In earlier studies, Oryza sativa exhibited 491 RNA editing sites in 34 genes [39], and Phaseolus vulgaris showed 486 RNA editing sites in 31 genes [40]. In the present study, we predicted RNA editing sites in the 28 PCGs that A. shenzhenica shares with the G. elata, H. citrina, A. cepa, and A. officinalis mitogenomes. As shown in Figure 4, the number of RNA editing sites predicted in the mitogenomes of different Asparagales species is fairly conserved, ranging from 416 sites in A. shenzhenica to 552 sites in A. officinalis. Of these, 514 are found in G. elata, of which 347 are shared with A. shenzhenica. These results suggest that these species share highly conserved PCGs.

DNA Fragment Transfer Events
Intracellular gene transfer between different genomes (mitochondria, nuclei, and chloroplasts) has been widely examined via sequencing analyses [41,42]. Prior studies have found high levels of nuclear DNA translocation to organelles in monocots [43,44]. In orchids, nuclear NDH complex-related genes are lost along with cp-ndh genes [45]. However, the cp-ndh genes in A. shenzhenica are not completely lost; only ndhH, ndhE, ndhF, and ndhG are lost, and whether they have been transferred to the nuclear genome is unknown.
Horizontal gene transfer from chloroplasts to mitochondria has been reported several times. However, the length and number of transferred fragments vary significantly between species [7]. In our study, we identified 44 fragments that have been transferred from the cp genome (22.72% of the cp genome) to the mitogenome. These included five intact genes (four tRNA genes and one protein-coding gene, psaJ). Interestingly, we found that the transferred chloroplast fragments originated from regions scattered across the entire chloroplast genome and not just from the repetitive regions [7].

Phylogenetic Analyses
A phylogenetic tree based on mitochondrial protein-coding gene sequences was constructed to explore the evolutionary relationships among monocot mitogenomes. Phylogenetic analysis using the mitogenome shows results congruent with those of Petersen [4]. The evolutionary relationships among the four monocot lineages are well resolved in this study. Our study supports mitogenomics as an informative tool to address the systematic relationships among families, orders, or higher taxonomic levels of angiosperms. However, the phylogenetic differentiation of some nodes, such as those involving Zea, Tripsacum dactyloides, Chrysopogon zizanioides, and Coix lacryma-jobi, is not well resolved in the present analysis. This suggests that these taxa may have a close genetic or ancestral relationship. An extended sampling of more representative monocot plants, as well as comparisons of mitogenome phylogenies with plastid DNA, nuclear DNA, and morphological data, is necessary to confidently establish the phylogenetic relationships of monocots.

Plant Materials and Sequencing
We obtained plant materials of A. shenzhenica from Shenzhen, Guangdong Province, China. Total DNA of A. shenzhenica was isolated as previously reported [46]. mtDNA was extracted from purified mitochondria using the cetyltrimethylammonium bromide technique [47]. For A. shenzhenica, a 250-bp paired-end library and a 3-kb mate-pair library were produced and sequenced on an Illumina HiSeq 2500 platform using two alternative techniques. Trimmomatic v0.36 was used to remove low-quality bases and adaptor sequences from the raw Illumina reads [48].

Mitogenome Assembly and Annotation
We used SPAdes v3.10.1 for de novo assembly of the A. shenzhenica mitogenome [49]. We performed multiple SPAdes runs with different k-mer values (k = 77, 101, and 127) and used QUAST [50] to evaluate the resulting assemblies, selecting 127 as the optimal k-mer value. Finally, we identified only one candidate mitochondrial scaffold, which can be mapped as a circular molecule with a pair of direct repeats at both ends. Sanger sequencing was used to confirm the junction and to fill the seven remaining gaps in this scaffold.
Geneious [51] was used to predict mitochondrial protein-coding genes. tRNAscan-SE v1.21 [52] and the RNAmmer 1.2 Server [53] were used to identify the tRNA and rRNA genes, respectively. The start/stop codons and exon-intron boundaries of genes were corrected manually. ORFfinder (https://www.ncbi.nlm.nih.gov/orffinder/, accessed on 30 June 2022) was used to examine ORFs longer than 300 bp within intergenic sequences, and BlastN [54] was used to identify repeats with more than 95% identity in each mitogenome. We used a shell script to calculate the guanine-cytosine (GC) content and Circos v0.69 to draw the circular physical map of all mitogenomes [55]. RNAweasel was used to predict group II introns.
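The GC content calculation can be illustrated with a short sketch. The study used a shell script whose contents are not given, so the Python below is an assumed equivalent, not the authors' script. The helper names `read_fasta` and `gc_content` are hypothetical, and ambiguous bases (such as N) are excluded from the denominator by assumption.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def gc_content(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


# Toy record for illustration only.
demo = read_fasta(">mt demo\nATGC\nGGCC\n")
demo_gc = {name: gc_content(seq) for name, seq in demo.items()}
```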
For codon usage analysis, codonW v1.4.4 was used to determine the relative synonymous codon usage (RSCU) of selected protein-coding genes in the mitogenome [56]. The R package ggplot2 was then used for plotting.
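RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below illustrates that definition and is not a reimplementation of codonW. It assumes the standard genetic code and treats stop codons as one synonymous family; the function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds_sequences):
    """Return {codon: RSCU} for in-frame coding sequences.

    RSCU = observed codon count / (family total / family size).
    Amino acids that never occur are omitted from the result.
    """
    counts = Counter()
    for cds in cds_sequences:
        seq = cds.upper().replace("U", "T")
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy input: alanine is encoded three times by GCT and once by GCC.
example = rscu(["GCTGCTGCTGCC"])
```

An RSCU above 1 indicates a codon used more often than expected under equal usage, and a value below 1 indicates one used less often.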

Selective Pressure Analysis
The nonsynonymous (Ka) and synonymous (Ks) substitution rates of each protein-coding gene (PCG) between A. cepa, G. elata, H. citrina, and A. officinalis were estimated. MEGA 6.0 was used to align orthologous gene pairs separately. DnaSP v5.10 was used to calculate Ka, Ks, and Ka/Ks values [57]. ggplot2 v3.3.6 was used to generate a boxplot of pairwise Ka/Ks values [58].
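The Ka/Ks ratio is conventionally read as a signal of purifying selection when below 1, neutral evolution when near 1, and positive selection when above 1. The following is a minimal sketch of that interpretation step, assuming Ka and Ks have already been estimated (for example by DnaSP). The function names and numeric values are hypothetical and are not results from this study.

```python
def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if ks == 0:
        return None
    return ka / ks


def classify(ratio, tol=1e-9):
    """Map a Ka/Ks ratio to the conventional selection category."""
    if ratio is None:
        return "undefined"
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive selection" if ratio > 1.0 else "purifying selection"


# Hypothetical (Ka, Ks) pairs for illustration only.
pairs = {"geneA": (0.01, 0.05), "geneB": (0.03, 0.02), "geneC": (0.0, 0.0)}
calls = {g: classify(ka_ks_ratio(ka, ks)) for g, (ka, ks) in pairs.items()}
```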

Prediction of RNA Editing Sites
The online PREP-Mt server suite (http://prep.unl.edu/, accessed on 10 October 2022) was used to predict potential RNA editing sites in A. shenzhenica PCGs and the other four Asparagales mitogenomes (A. cepa, A. officinalis, G. elata, and H. citrina). The cutoff value was set to 0.2 [23] to produce a more accurate prediction. A lower cutoff value recovers more true editing sites, but it also increases the likelihood of misidentifying an unedited site as an edited one.
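The effect of the cutoff can be sketched as a simple filter over scored predictions. This is an illustration only: it assumes each predicted site carries a score between 0 and 1 and is kept when its score is at or above the cutoff. The `EditSite` record, the `filter_sites` helper, and the example positions and scores are all hypothetical and do not reproduce PREP-Mt output.

```python
from typing import NamedTuple


class EditSite(NamedTuple):
    gene: str
    position: int   # nucleotide position within the gene (hypothetical)
    score: float    # prediction score, assumed to lie in [0, 1]


def filter_sites(sites, cutoff=0.2):
    """Keep predicted editing sites whose score is at or above the cutoff."""
    return [s for s in sites if s.score >= cutoff]


# Invented predictions for illustration only.
predicted = [
    EditSite("nad4L", 10, 0.1),
    EditSite("cox1", 55, 0.2),
    EditSite("atp6", 7, 0.9),
]

lenient = filter_sites(predicted, cutoff=0.2)  # more sites, more false positives
strict = filter_sites(predicted, cutoff=0.6)   # fewer sites, fewer false positives
```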

DNA Transfer between the Chloroplast and the Mitochondrion
The A. shenzhenica cp genome (MG772639.1) was obtained from the NCBI Organelle Genome Resources database. BLASTN was used to analyze sequence similarity between the cp genome and the mitogenome in order to detect transferred DNA fragments, with an e-value cut-off of 1 × 10⁻⁵ [60]. The Circos module implemented in TBtools v1.105 was used to visualize the results [61][62][63].
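The e-value filtering step can be sketched as follows, assuming the BLASTN results were written in the default 12-column tabular layout (`-outfmt 6`). The text does not specify the output format, so this is an assumption, and `significant_hits` is a hypothetical helper. The two example rows are invented.

```python
# Column order of BLAST+ tabular output (-outfmt 6) with default fields.
FIELDS = ("qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore")


def significant_hits(tabular_text, max_evalue=1e-5):
    """Return hits whose e-value is at or below max_evalue."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, line.split("\t")))
        if float(hit["evalue"]) <= max_evalue:
            hit["length"] = int(hit["length"])
            hit["pident"] = float(hit["pident"])
            hits.append(hit)
    return hits


# Invented rows for illustration only (query = cp genome, subject = mitogenome).
demo_rows = (
    "cp\tmt\t98.50\t1200\t18\t0\t100\t1299\t5000\t6199\t0.0\t2100\n"
    "cp\tmt\t85.00\t40\t6\t0\t2000\t2039\t9000\t9039\t0.002\t35.1\n"
)
kept = significant_hits(demo_rows)
total_transferred_bp = sum(h["length"] for h in kept)
```

Summing alignment lengths double-counts overlapping hits, so a real analysis would merge overlapping intervals before reporting the total transferred length.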

Phylogenetic Analysis
Using the Concatenate Sequence module of the PhyloSuite v1.1.16 platform (ref), we converted the protein-coding gene matrix into a PHY-format file. The maximum likelihood (ML) phylogenetic tree was reconstructed using RAxML-HPC2 on XSEDE 8.2.12 in the CIPRES Science Gateway V 3.3 [64]. The ML tree was constructed using the GTRCAT nucleotide substitution model with 1000 bootstrap replicates, and the other parameters were set to default [65]. The maximum parsimony (MP) tree was constructed by heuristic search with tree-bisection-reconnection (TBR) branch swapping. All nucleotide characters were equally weighted, and the search was carried out with 1000 random addition replicates. The reliability of the phylogenetic tree was assessed by bootstrap analysis with 1000 replicates [66].
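The concatenation step can be illustrated with a small sketch that joins per-gene alignments into one supermatrix and writes it in relaxed PHYLIP format. This is not PhyloSuite's implementation. The helpers `concatenate` and `to_phylip` are hypothetical, genes missing from a taxon are padded with gaps by assumption, and the toy alignments are invented.

```python
def concatenate(alignments):
    """Join per-gene alignments into one sequence per taxon.

    alignments: {gene: {taxon: aligned sequence}}. Genes are joined in
    sorted order; a taxon lacking a gene receives a run of '-' of that
    gene's alignment width.
    """
    taxa = sorted({t for aln in alignments.values() for t in aln})
    parts = {t: [] for t in taxa}
    for gene in sorted(alignments):
        aln = alignments[gene]
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError(f"sequences for {gene} are not aligned")
        width = widths.pop()
        for t in taxa:
            parts[t].append(aln.get(t, "-" * width))
    return {t: "".join(p) for t, p in parts.items()}


def to_phylip(matrix):
    """Serialize {taxon: sequence} as relaxed PHYLIP text.

    Taxon names must not contain whitespace.
    """
    n_sites = len(next(iter(matrix.values())))
    lines = [f"{len(matrix)} {n_sites}"]
    lines += [f"{taxon} {seq}" for taxon, seq in matrix.items()]
    return "\n".join(lines) + "\n"


# Toy alignments for illustration only.
toy = {
    "atp1": {"Taxon_A": "ATG", "Taxon_B": "ATA"},
    "cox1": {"Taxon_A": "CC"},
}
supermatrix = concatenate(toy)
phylip_text = to_phylip(supermatrix)
```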

Conclusions
In this study, we assembled and annotated the A. shenzhenica mitogenome and performed a comprehensive analysis based on the annotated sequences. The draft mitogenome of A. shenzhenica was circular, with a length of 672,872 bp, and comprised 54 genes, including 36 protein-coding genes, 16 tRNA genes, and two rRNA genes. Upon comparing the mitogenome and cp genome sequences, we identified 44 fragments that were transferred from the cp genome to the mitogenome. Furthermore, we analysed the A. shenzhenica mitogenome for codon usage, sequence repeats, RNA editing, and selection pressure. Moreover, the evolutionary status of the monocots was inferred by phylogenetic analysis of the mitogenomes of A. shenzhenica and 28 other monocots. This study provides new insights into the diversity and evolution of orchid mitogenomes. We provide mitochondrial evidence for A. shenzhenica and important insights into the evolution of orchids and monocots.

Data Availability Statement:
The complete mitogenome sequence with gene annotation has been deposited in NCBI GenBank under the accession number OQ645347. The sequence data used in this study can be found in the Supplementary Materials.

Conflicts of Interest:
The authors declare no conflict of interest.