The Characterization and Phylogenetic Implications of the Mitochondrial Genomes of Antheminia varicornis and Carpocoris purpureipennis (Hemiptera: Pentatomidae)

Abstract: The mitochondrial genome (mitogenome) has been widely used for structural comparisons and phylogenetic analyses of Hemiptera groups at different taxonomic levels. However, little is known about the mitogenomic characteristics of species from Antheminia and Carpocoris, two morphologically similar genera in the Pentatomidae family.

Pentatomidae is one of the largest and most diverse families of Heteroptera, with a wide distribution around the world [30]. Most species are phytophagous, sucking sap from the stems, leaves, or fruits of their host plants, and pose a serious threat to a wide variety of valued crops, causing significant economic losses worldwide [31,32]. Since the publication of the first complete mitogenome of a Pentatomidae species, Nezara viridula (GenBank accession number NC_011755) [33], the number of mitogenomes in this family has continued to grow [20,22,34–36]. Detailed comparative analyses of the mitogenome and phylogenetic analyses have also been performed for the Pentatomidae family [37]. However, the coverage is still limited relative to the number of species, which hinders efforts to clarify the phylogenetic relationships of genera and species in the Pentatomidae family. Antheminia Mulsant and Rey, 1866, and Carpocoris Kolenati, 1846, are two morphologically similar genera within the Pentatomidae family; most species in both genera are crop pests, with adults and nymphs sucking juice from inflorescences and young stems. Yellow spots appear on the leaves after feeding, and seriously damaged inflorescences and tender ears wither or even fade. Many species of Antheminia and Carpocoris mainly damage cash crops such as alfalfa, wheat, potato, radish, carrot, elm, and bayberry. A previous study based on short molecular fragments (1759 bp of 18S rRNA, 646 bp of 28S rRNA, 592 bp of 16S rRNA and 564 bp of COI) revealed a sister relationship between the two genera, but with low node support [38]. Currently, little is known about the mitogenomic characteristics of the two genera, and molecular data for them remain scarce. Their phylogenetic positions in the Pentatomidae family and the phylogenetic relationship between them still need to be confirmed.
In this study, we sequenced and annotated the mitogenomes of Antheminia varicornis (Jakovlev, 1874) and Carpocoris purpureipennis (De Geer, 1773), representing the Antheminia and Carpocoris genera, respectively. We analyzed the genomic structure, base composition, codon usage, and tRNA secondary structure of the two mitogenomes. Combining these data with published mitogenomes of Pentatomidae, we carried out a phylogenetic analysis to verify the phylogenetic positions of Antheminia and Carpocoris in the Pentatomidae family and the relationship between them. This study may increase our understanding of the relationships between the Antheminia and Carpocoris genera, and may also help clarify the phylogeny and evolution of Pentatomidae.

Sample Collection and DNA Extraction
Specimens of A. varicornis were collected from Horqin Right Front Banner, Neimenggu Province, China (46.46° N, 120.31° E), on 11 July 2021, and specimens of C. purpureipennis were collected from Ziwu Mountain, Shaanxi Province, China (35.87° N, 108.55° E), on 27 July 2019. Since both species are unprotected invertebrates, no special permits were required to collect samples from these sites. All specimens were preserved in 100% ethanol and stored at −20 °C at the Institute of Entomology at Nankai University (Tianjin, China). All specimens were identified based on their morphology. Total genomic DNA was extracted from the thoracic muscle using a Universal Genomic DNA Kit (CWBIO, Beijing, China) and stored at −80 °C until downstream analyses.

Mitogenome Sequencing, Assembly and Annotation
The whole mitochondrial genomes were sequenced on the Illumina NovaSeq 6000 platform with a 150 bp paired-end read strategy at Novogene Co., Ltd. (Beijing, China). Low-quality reads were removed using fastp [39], yielding approximately 2 Gb of clean data for each sample. The clean reads were assembled using MitoZ 2.4 [40] with default settings and IDBA-UD 1.1.3 [41] with minimum and maximum k values of 40 and 120 bp, respectively. Transfer RNA (tRNA) genes and their secondary structures were identified on the MITOS2 webserver (http://mitos2.bioinf.uni-leipzig.de/index.py, accessed on 3 April 2023). The typical secondary structures of the tRNAs were drawn manually in Adobe Illustrator 2021 according to the MITOS2 predictions. Protein-coding genes (PCGs) and ribosomal RNA (rRNA) genes were annotated through alignment with homologous regions of previously published mitogenomes of Pentatominae in GenBank. The newly sequenced mitogenomes were submitted to GenBank (accession numbers: OR074478 and OR074479).

Phylogenetic Analyses
Phylogenetic analyses were performed using the newly sequenced mitogenomes of A. varicornis and C. purpureipennis, together with 28 Pentatomidae mitogenomes downloaded from GenBank (Table 1). Two species of Plataspidae were selected as outgroups (Table 1). The sequences of 13 PCGs and 2 rRNA genes were aligned using MAFFT 7.402 [44]. After removal of the stop codons, the alignments of individual genes were concatenated in PhyloSuite 1.2.2 [45] to generate two datasets: PCG123R (all three codon positions of the 13 PCGs and 2 rRNAs) and PCG12R (the first and second codon positions of the 13 PCGs and 2 rRNAs). The best-fit partitioning scheme and nucleotide substitution models were identified using PartitionFinder 2.0 [46] under the Bayesian Information Criterion (BIC). Phylogenetic analyses were conducted using the Bayesian inference (BI) and maximum likelihood (ML) methods based on the two datasets. BI analysis was performed using MrBayes 3.2.7a [47] with the best-fitting substitution model (Table 2). Two simultaneous Markov chain Monte Carlo (MCMC) runs of 10,000,000 generations were conducted, and trees were sampled every 1000 generations, with the first 25% discarded as burn-in. The convergence of runs was confirmed by checking that the average standard deviation of split frequencies was below 0.01. ML analysis was performed using IQ-TREE 2.2.0 [48] with 1000 bootstrap replicates under the best-fitting substitution model.
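The difference between the PCG123R and PCG12R datasets can be illustrated with a minimal Python sketch. This is not the PhyloSuite implementation; the function names and toy sequences are hypothetical, and the sketch assumes in-frame, gap-free coding sequences ending in a complete TAA or TAG stop codon (incomplete T or TA stops are not handled).

```python
STOP_CODONS = ("TAA", "TAG")  # complete stop codons reported for these mitogenomes


def strip_stop_codon(cds: str) -> str:
    """Drop a trailing complete stop codon from an in-frame coding sequence."""
    if len(cds) >= 3 and cds[-3:].upper() in STOP_CODONS:
        return cds[:-3]
    return cds


def codon_positions(cds: str, keep=(1, 2, 3)) -> str:
    """Keep only the requested codon positions (1-based) of an in-frame sequence."""
    return "".join(base for i, base in enumerate(cds) if (i % 3) + 1 in keep)


def build_dataset(pcgs, rrnas, keep) -> str:
    """Concatenate the PCGs (filtered by codon position) and the rRNAs for one taxon."""
    pcg_part = "".join(codon_positions(strip_stop_codon(s), keep) for s in pcgs)
    return pcg_part + "".join(rrnas)


# Toy input for a single taxon (hypothetical sequences, not real data).
toy_pcgs = ["ATGGCTTAA", "ATTCCAGGT"]
toy_rrnas = ["AATTG"]

pcg123r = build_dataset(toy_pcgs, toy_rrnas, keep=(1, 2, 3))
pcg12r = build_dataset(toy_pcgs, toy_rrnas, keep=(1, 2))
print(pcg123r)  # ATGGCTATTCCAGGTAATTG
print(pcg12r)   # ATGCATCCGGAATTG
```

Dropping the third codon position removes the sites most prone to saturation, which is the usual motivation for analyzing a PCG12R-type dataset alongside the full one.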

Mitogenome Organization and Composition
The mitogenomes of A. varicornis and C. purpureipennis are 15,251 and 15,322 bp in size, respectively. Each mitogenome contains 37 typical genes (13 PCGs, 2 rRNAs, and 22 tRNAs) and a control region. Among these genes, four PCGs (ND1, ND4, ND4L, and ND5), eight tRNAs (trnC, trnF, trnH, trnL (UAG), trnP, trnQ, trnV and trnY), and two rRNAs (12S rRNA and 16S rRNA) are encoded on the minority strand (N strand), while the other 23 genes are encoded on the majority strand (J strand) (Figure 1, Tables 3 and 4). The mitogenome of A. varicornis has seven gene overlaps totaling 26 bp, ranging in length from 1 to 8 bp; the longest overlap lies between the tRNA-Trp and tRNA-Cys genes. In addition, there are sixteen intergenic spacers of 1-25 bp, with a total length of 116 bp; the longest (25 bp) is located between ND1 and tRNA-Ser. In the C. purpureipennis mitogenome, gene overlaps occur at eight gene junctions and total 35 bp; the longest (8 bp) is located between the tRNA-Trp and tRNA-Cys genes. Intergenic spacers occur at 15 gene junctions and total 98 bp, ranging in length from 1 to 21 bp; the longest (21 bp) is located between ND1 and tRNA-Ser. The number and arrangement of genes in both mitogenomes are conserved, consistent with those of ancestral insects [49]. The nucleotide composition of the two mitogenomes is biased toward A + T, as in other Pentatomidae species [22,37]. The A + T content of the whole mitogenome is 76.7% for A. varicornis and 73.4% for C. purpureipennis. The A + T content of the protein-coding genes is 73.2% for A. varicornis and 72.9% for C. purpureipennis. The 12S rRNA and 16S rRNA genes exhibit relatively high A + T content among the 37 typical genes in both mitogenomes. The A + T content of the third codon position in PCGs is considerably higher than that of the first and second codon positions (Table 5). The whole mitogenome of A. varicornis exhibits negative AT-skew and GC-skew, while that of C. purpureipennis exhibits positive AT-skew and negative GC-skew. It is generally believed that the base composition bias of mitochondrial genomes results mainly from two factors, asymmetric mutation among the four bases and selection pressure, both of which arise chiefly during replication and gene transcription [50].
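The composition statistics reported above can be reproduced from any sequence with a few lines of Python. The sketch below assumes the commonly used definitions AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C); the source does not state the formulas, so treat these as an assumption. The function name and the toy sequence are hypothetical.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return A + T content, AT-skew and GC-skew for a nucleotide sequence.

    Only A, T, G and C are counted; ambiguous bases are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[b] for b in "ATGC")
    total = a + t + g + c
    if total == 0 or a + t == 0 or g + c == 0:
        raise ValueError("sequence must contain A/T and G/C bases")
    return {
        "AT_content": (a + t) / total,
        "AT_skew": (a - t) / (a + t),  # positive: more A than T
        "GC_skew": (g - c) / (g + c),  # negative: more C than G
    }


# Toy sequence (hypothetical): 3 A, 4 T, 1 G, 2 C.
stats = base_composition("AAATTTTGCC")
print(stats)
```

A negative AT-skew therefore means the strand carries more T than A, and a negative GC-skew means it carries more C than G.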

Protein-Coding Genes and Codon Usage
Most PCGs of the two mitogenomes begin with the standard start codon ATN (N represents one of four nucleotides, A, T, C, or G), while COI starts with TTG. In addition, the start codons of ATP8 are TTG and GTG in A. varicornis and C. purpureipennis, respectively. The start codon of ND1 in A. varicornis is TTG (Tables 3 and 4). The termination codon in nine PCGs (ATP6, ATP8, COIII, CytB, ND1, ND2, ND4, ND4L and ND6) is TAA or TAG, while COI, COII, ND3, and ND5 have incomplete termination codons (T or TA) (Tables 3 and 4) that are probably completed by post-transcriptional polyadenylation [51]. The total codon numbers (excluding the termination codons) in A. varicornis and C. purpureipennis are 3663 and 3669, respectively. The longest gene is ND5 (1706 and 1705 bp), and the shortest is ATP8 (159 bp in both) in A. varicornis and C. purpureipennis, respectively. The most frequently used codon families are Ile, Leu2, Met, and Phe, each numbering more than 300. The least frequently used codon family is Cys, with a total of 50 in both mitogenomes (Figure 2). The relative synonymous codon usage (RSCU) patterns for the two mitogenomes are similar, and the RSCU values are shown in Figure 3 and Table 6. For each amino acid, the most prevalently used codons are NNA and NNU (Figure 3, Table 6), which is consistent with the higher A + T content at the third codon position in PCGs.
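As an illustration of how RSCU is obtained, the following Python sketch uses the standard definition: the observed count of a codon divided by the count expected if all synonymous codons of that amino acid were used equally. It is not the authors' pipeline; the function name, the two-family codon table and the toy sequence are hypothetical, and codons are written in DNA form (T instead of U).

```python
from collections import Counter


def rscu(cds: str, families: dict) -> dict:
    """Compute RSCU for each codon in the given synonymous codon families."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3].upper() for i in range(0, usable, 3))
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the sequence
        expected = total / len(codons)  # equal usage of synonymous codons
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Two codon families under the invertebrate mitochondrial code (subset only).
toy_families = {"Phe": ["TTT", "TTC"], "Ile": ["ATT", "ATC"]}
# Toy coding sequence: TTT TTT TTC ATT ATT ATT ATC
toy_values = rscu("TTTTTTTTCATTATTATTATC", toy_families)
print(toy_values)
```

An RSCU above 1 indicates a codon used more often than expected under equal usage; in this toy case the T-ending codons are preferred, mirroring the NNA/NNU preference described above.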

tRNAs, rRNAs and Control Region
Typical sets of 22 tRNA genes ranging in length from 62 to 72 bp have been identified in the mitogenomes of A. varicornis and C. purpureipennis (Tables 3 and 4). The A + T content of the concatenated tRNA genes is 75.7% and 73.9% for A. varicornis and C. purpureipennis, respectively. The nucleotide skews in the tRNA genes in the mitogenomes of the two species are consistent, with the concatenated tRNA genes exhibiting a positive AT-skew and a negative GC-skew (Table 5). Most tRNA genes can be folded into the typical cloverleaf secondary structure, while the dihydrouridine (DHU) arms of trnS (GCU) and trnV are very short, with only a single base pair. The non-Watson-Crick base pair G-U is common in tRNA genes from both species (Figures 4 and 5). The tRNAs of A. varicornis range in size from 62 bp (tRNA-Cys) to 71 bp (tRNA-Asp), while those of C. purpureipennis range from 63 bp (tRNA-Ala) to 72 bp (tRNA-Lys). The total length of the 22 tRNAs is 1472 bp in A. varicornis and 1463 bp in C. purpureipennis. Both 12S and 16S rRNA genes exhibit similar positions and sizes in the mitogenomes of A. varicornis and C. purpureipennis (Tables 3 and 4). The 12S rRNA exhibits a positive AT-skew and a negative GC-skew in both species. The 16S rRNA exhibits a negative AT-skew and a positive GC-skew in A. varicornis and a positive AT-skew and a negative GC-skew in C. purpureipennis (Table 5). The control regions of A. varicornis and C. purpureipennis are 595 and 682 bp in size (Tables 3 and 4), with A + T contents of 70.4% and 70.5%, respectively (Table 5).

Phylogenetic Relationships
A previous study, based on short molecular fragments (1759 bp of 18S rRNA, 646 bp of 28S rRNA, 592 bp of 16S rRNA and 564 bp of COI), revealed that the Antheminia genus forms a sister group with Carpocoris, but with low node support [38]. In this study, we selected one species from each genus as a representative taxon, used mitogenome data to verify this sister relationship, and further explored their phylogenetic positions within the Pentatomidae family. Phylogenetic analyses were performed using the BI and ML methods based on the two datasets (PCG123R and PCG12R). All four phylogenetic trees show that A. varicornis forms a sister relationship with C. purpureipennis with high nodal support values (PP = 1 in BI trees; BS values = 100 in ML trees), which is consistent with the traditional taxonomy and the findings of a previous study [38].
In addition, all of the phylogenetic results indicate that the two species form a sister group with Dolycoris baccarum, and that the three species together form a sister group with Rubiconia intermedia (Figures 6 and 7). This result is consistent with previous results based on molecular and morphological evidence; accordingly, we support the previous proposal that the species in the Eysarcorini and Carpocorini are closely related [37]. Neojurtina typica occupies the most basal position within Pentatomidae. The phylogenetic trees constructed through ML and BI analyses show strong support for the monophyly of Asopinae and Phyllocephalinae, while the monophyly of Pentatominae and Podopinae is rejected (Figures 6 and 7).

Conclusions
Previous studies have paid more attention to the phylogenetic relationships among higher-level taxa of Heteroptera than to relationships within subfamilies. In this study, we sequenced and analyzed the mitogenomes of two Pentatominae species, A. varicornis and C. purpureipennis, adding them to the existing data. The two mitogenomes are conserved in genomic structure, base composition, codon usage, and tRNA secondary structure.
We performed a phylogenetic analysis based on the sequences of thirteen PCGs and two rRNA genes. Our results strongly support the sister relationship between A. varicornis and C. purpureipennis. They also reveal the relationships among four subfamilies within Pentatomidae and provide a valuable resource for further phylogenetic and evolutionary analyses of the family. The phylogenetic trees show strong support for the monophyly of Asopinae and Phyllocephalinae, while the monophyly of Pentatominae and Podopinae is rejected. More mitochondrial genomes and nuclear genes need to be sequenced to reveal the mitochondrial genome evolution and phylogenetic relationships of Pentatominae more comprehensively.
Diversity 2023, 15

Figure 1. Mitogenome maps of A. varicornis (a) and C. purpureipennis (b). The names of PCGs and rRNAs are indicated by standard abbreviations, while names of tRNAs are represented by a single-letter abbreviation.

Figure 2. Patterns of codon usage in the mitogenomes of A. varicornis and C. purpureipennis. The X-axis shows the codon families, and the Y-axis shows the total codons.

Figure 3. The relative synonymous codon usage (RSCU) in the mitogenomes of A. varicornis and C. purpureipennis. The X-axis shows the codons, and the Y-axis shows RSCU values. The upper and lower panels share the same color legend.

Figure 4. Secondary structure of 22 tRNAs in A. varicornis. * represents a non-classical G-U pairing.

Figure 5. Secondary structure of 22 tRNAs in C. purpureipennis. * represents a non-classical G-U pairing.

Figure 6. Phylogenetic relationships of Pentatomidae based on dataset PCG123R. Pentagrams mark the mitogenomes of A. varicornis and C. purpureipennis sequenced in this study. The black lines indicate the two outgroups used in this study. (a) BI tree, numbers at the nodes are posterior probabilities; (b) ML tree, numbers at the nodes are bootstrap values.

Figure 7. Phylogenetic relationships of Pentatomidae based on dataset PCG12R. Pentagrams mark the mitogenomes of A. varicornis and C. purpureipennis sequenced in this study. The black lines indicate the two outgroups used in this study. (a) BI tree, numbers at the nodes are posterior probabilities; (b) ML tree, numbers at the nodes are bootstrap values.

Table 1. Taxonomic information and GenBank accession numbers of mitochondrial genomes downloaded from GenBank for this study.

Table 2. The best model for each partition of the two datasets.

Table 3. Organization of mitochondrial genome of A. varicornis.
Gene | Strand | Position | Anticodon | Size (bp) | Start Codon | Termination Codon | Intergenic Nucleotides
tRNA-I | J | 1-65 | GAT | 65 | | |

Table 4. Organization of mitochondrial genome of C. purpureipennis.

Table 5. Nucleotide composition of mitochondrial genomes of A. varicornis and C. purpureipennis.

Table 6. Codon and the relative synonymous codon usage (RSCU) of the mitochondrial genomes (excluding the termination codons) of A. varicornis and C. purpureipennis.