An RNA Virome Analysis of the Pink-Winged Grasshopper Atractomorpha sinensis

Simple Summary The present work focused on the RNA virome analysis of an agricultural pest. In this study, the four novel RNA viruses in A. sinensis were reported. According to the phylogenetic analysis, they were classified as nege-like virus, iflavirus, chu-like virus, and ollusvirus. The complete genome sequence of four RNA viruses were obtained with RACE. Moreover, the analysis of virus-derived small interfering RNAs showed that most of the RNA viruses were targeted by the host antiviral RNA interference pathway. Abstract A large number of RNA viruses have been discovered in most insect orders using high-throughput sequencing (HTS) and advanced bioinformatics methods. In this study, an RNA virome of the grasshopper was systematically identified in Atractomorpha sinensis (Orthoptera: Pyrgomorphidae), an important agricultural pest known as the pink-winged grasshopper. These insect viruses were classified as the nege-like virus, iflavirus, ollusvirus, and chu-like virus using HTS and phylogenetic analyses. Meanwhile, the full sequences of four novel RNA viruses were obtained with RACE and named Atractomorpha sinensis nege-like virus 1 (ASNV1), Atractomorpha sinensis iflavirus 1 (ASIV1), Atractomorpha sinensis ollusvirus 1 (ASOV1), and Atractomorpha sinensis chu-like virus 1 (ASCV1), respectively. Moreover, the analysis of virus-derived small interfering RNAs showed that most of the RNA viruses were targeted by the host antiviral RNA interference pathway. Moreover, our results provide a comprehensive analysis on the RNA virome of A. sinensis.

Hematophagous and sap-sucking insects play an important role in the transmission of arthropod-borne viruses (arboviruses). Previous studies have shown that several IRVs isolated from hematophagous insects interact with arboviruses and affect vector competence for arboviruses. For example, IRVs in the Flaviviridae modulate arboviruses' replication and transmission in mosquitoes and sandflies [6][7][8]. Negeviruses inhibit the replication of several Alphaviruses during coinfection of mosquito cells [9]. For sap-sucking insects, including planthoppers, whiteflies, aphids, and leafhoppers, IRVs have been extensively investigated, and some IRVs appeared to be host-specific, indicating a long-term co-evolution between IRVs and host insects [10][11][12][13]. IRVs are thought to mediate sex ratio and increase pupal duration and fecundity in the host insect [14,15]. An analysis of viral genomic diversity and phylogeny has suggested that IRVs might be the ancestors of arboviruses and plant viruses [2,16].
Atractomorpha sinensis, commonly known as the pink-winged grasshopper (Orthoptera: Pyrgomorphidae) [17,18], is an important agricultural pest, directly feeding on leaves, buds, and tender stems of numerous cultivated plants in the families Leguminosae, Solanaceae, and Cruciferae. This species has caused severe damage to crop production in many countries including the Asia-Pacific region and North America [17]. In terms of virus transmission, grasshoppers have been considered as a vector to transmit vesicular stomatitis virus to hoofed animals (e.g., cattle and horse) [19,20]. Nevertheless, as a member of the grasshopper group, A. sinensis has not attracted wide attention to its outbreak and potential damage over the past several decades.
In this study, four novel RNA viruses were identified in A. sinensis. As revealed through analysis on sRNA data, RNAi-mediated antiviral immunity was active during the infection of these RNA viruses.

Sample Collection and Species Identification
A single adult pink-winged grasshopper was collected in the field of Huzhou, Zhejiang, China, in September 2019. This sample was delivered to our laboratory and stored at −80 • C until further analysis. Then, insect morphology was identified under a dissecting microscope and verified with Sanger sequencing using a pair of universal primers to amplify mitochondrial cytochrome c oxidase subunit 1 (COI) genes.

RNA Extraction and Sequencing
The single sample was homogenized in liquid nitrogen, and total RNA with rRNA was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) following the manufacturer's instructions. High throughput sequencing (HTS) and small RNA sequencing were conducted after evaluating the quality of total RNA with rRNA through Nanodrop (Thermo Scientific, MA, USA). The Illumina TruSeq Total RNA with rRNA Sample Preparation Kit was adopted to construct the cDNA library. A small RNA library was established using the Truseq Small RNA Library Preparation Kit (Illumina, San Diego, CA, USA) and sequenced based on the Illumina HiSeq 2500 platform, which resulted in 19 Mbp of raw data. Furthermore, raw RNA reads were subject to quality control, adapter trimming, and contig assembly using Trinity program (Version 2.8.5) [21] with default parameter settings.

Virus Genome Identification and Small RNA Analysis
To identify viral reads, the assembled RNA-seq contigs were queried against the nonredundant nucleotide (nt) and non-redundant protein (nr) sequence databases in NCBI using Blastn and Blastx algorithm, respectively. Furthermore, the candidate viral reads were compared against the Viral RefSeq database downloaded from GenBank in order to avoid false positive reads. E-value cut-off was set at 2 × 10 −20 for each comparison. Based on blast hit results, unassembled overlaps and same viral reads were merged by using SeqMan program of DNAstar software package to form partial/complete virus genomes. All candidate viral reads were verified through RT-PCR following Sanger sequencing. The first strand of complementary DNA (cDNA) from 1000 ng of total RNA was synthesized with HiScript ® II Q RT SuperMix for PCR (+gDNA wiper) (Vazyme, Nanjing, China) in line with the manufacturer's protocol. RT-PCR was performed in 10 µL-reaction agents containing 0.5 µL template cDNA, 5 µL of 2 × Phanta Max Buffer, 0.2 µL dNTP mix, 0.2 µL of Phanta Max Super-Fidelity DNA Polymerase (Vazyme, Nanjing, China), 0.2 µL of 1 µM forward and reverse primers, and 3.7 µL of ddH 2 O. The thermal cycling conditions were 95 • C for 5 min, 40 cycles of 95 • C for 30 s, 60 • C for 30 s, and 70 • C for 30 s. Abundance of each virus genome was estimated by mapping quality-checked raw reads to virus genomes with Bowtie 2 and Samtools programs [22,23]. Then, the coverage of the aligned reads was visualized using the Integrative Genomics Viewer (version 2.13.2) [24].
To obtain the full-length genome of the candidate virus, the rapid amplification of cDNA ends (RACE) assay was performed as described [25]. Briefly, 5 -and 3 -RACE cDNAs were synthesized using random primers or oligo (dT) primers present in SMARTer ® RACE 5 /3 Kit (Takara, Kyoto, Japan). Both 5 and 3 PCR products were amplified with sequence-specific primers with nest-PCR and cloned into a pMD19-T clone vector individually (Takara, Kyoto, Japan) after Sanger sequencing, to obtain the complete virus genome sequence. The primers used for identification of virus-like sequences are listed in Supplementary Table S1.
To explore the small interfering RNA (siRNA) triggered by viruses in A. sinensis, the clean small RNA (sRNA) reads (18-to 30-nt) were mapped back to full-length virus genome sequences using Bowtie 2 with zero mismatches. In addition to that, virus-derived siRNAs were further vegetated using the Linux bash scripts and the custom perl scripts.

Virus Genome Annotation
The potential open reading frames (ORFs) in the virus genomes were predicted using the ORF Finder program in NCBI. The function annotation of each ORF was conducted through the NCBI Conserved Domain Database (https://www.ncbi.nlm.nih.gov/cdd/, accessed on 5 March 2022) and InterProScan Search (http://www.ebi.ac.uk/interpro/, accessed on 5 March 2022).

Phylogenetic Analyses
Phylogenetic analyses were carried out as described previously with modifications [25]. The amino acid sequence of the viral RNA-dependent RNA polymerase (RdRP) domain was aligned using MAFFT (v7.487) [26] with default parameters, while ambiguous portions of the alignment were trimmed by Gblock (0.91b) [27]. Phylogenies were improved after removing divergent and ambiguously aligned blocks from protein sequence alignments. Thereafter, the best-fit model of amino acid substitution was evaluated with ModelTest-NG [28]. Then, maximum-likelihood (ML) trees were obtained using RAxML-NG (v. 0.9.0) [29] with 1000 bootstrap replications and refined with iTOL [30]. The reference sequences of viral RdRP used for phylogenetic analyses are listed in Supplementary Table S2.

Discovery of RNA Virus-Related Sequences in A. sinensis
To identify the RNA virus-related sequences in A. sinensis, the total RNA sample was subjected to RNA sequencing analysis using HTS. The results showed that 12.9 Gbp raw data (SRR21712770) were generated from the obtained cDNA library, which had deep sequencing on the Illumina HiSeq 4000 platform (150-bp paired-end reads) of Novogene. In total, 86,784,590 reads were generated and assembled de novo into 464,844 reads in the library. Through a Blastn search against the nucleotide database in NCBI, it was found that the insect COI sequence showed a high homology (99% identities) to that of A. sinensis (accession number: MW085548.1). In total, 17,356 virus-like reads were found in the library. After a BLASTn search against the NCBI nucleotide (nt) and the entire viral reference database, four novel RNA viruses were identified in the assembled contigs of A. sinensis, which were classified as nege-like virus, iflavirus, ollusvirus, and chu-like virus, respectively. The sequences of four viruses were uploaded to NCBI, and the accession numbers include OL672484, OL672485, OL672486, and OL672487. It was shown that three reads were almost identical to those of the Aphid lethal paralysis virus (ALPV), suggesting that these reads were derived from a strain of ALPV (Table 1 and Supplementary Figure S1). All of the virus-like sequences were determined using RT-PCR with the specific primers and verified using Sanger sequencing (Supplementary Table S1).
The analysis of the phylogenetic relationship showed that ASNV1 was the cluster with Ganwon-do negev-like virus 1, together with Ingleside virus, Nephila clavipes virus 4, and Hubei virga-like virus 13, forming unclassified group1, showing a close relationship with the Kitaviridae ( Figure 1B). Analyses of nucleotide and amino acid identity based on the RdRP sequences manifested that ASNV1 is related to other four viruses in Group 1, sharing 29-73.5% nt and 44.1-77% aa sequence identity (Supplementary Table S3).
The sRNA library was constructed to analyze virus-derived small RNA (vsiRNA) of ASNV1; as a result, ASNV1 triggered vsiRNA with peaks of 22 nt related to RNAi-mediated antiviral immunity ( Figure 1C). The vsiRNA was generated from the whole genome of ASNV1, including untranslated regions and the asymmetric hotspots on both strands, implying that the host immune system might target these regions preferably ( Figure 1E). Otherwise, the viral siRNAs of ASNV1 showed a strong A/U preference in their 5 -terminal nucleotides ( Figure 1D), which is typical of vsiRNAs from various organisms, including insects.

Characterization of Atractomorpha sinensis Iflavirus 1
The full-length sequence of Atractomorpha sinensis iflavirus 1 (ASIV1) contains 9, 412 nt with the 791 nt 5 UTR, the 118 nt 3 UTR, and a predicted ORF    Moreover, phylogenetic analysis results demonstrated that ASIV1 displayed a clos relationship with another virus in Iflaviridae, especially Hubei coleoptera virus 1 and Tri bolium castaneum iflavirus ( Figure 2B). Furthermore, according to analyses on the con served amino acid and the nucleotide sequence of RdRP domain, ASIV1 clustered closely with Tribolium castaneum iflavirus, with 34.4% aa and 41.3% nt sequence identification (Supplementary Table S4).
For vsiRNA, ASIV1 triggered vsiRNA with peaks of 22 nt ( Figure 2C), which gener Moreover, phylogenetic analysis results demonstrated that ASIV1 displayed a close relationship with another virus in Iflaviridae, especially Hubei coleoptera virus 1 and Tribolium castaneum iflavirus ( Figure 2B). Furthermore, according to analyses on the conserved amino acid and the nucleotide sequence of RdRP domain, ASIV1 clustered closely with Tribolium castaneum iflavirus, with 34.4% aa and 41.3% nt sequence identification (Supplementary Table S4).
For vsiRNA, ASIV1 triggered vsiRNA with peaks of 22 nt ( Figure 2C), which generated the untranslated regions and asymmetric hotspots on both strands from the whole genome of ASIV1 ( Figure 2E). The viral siRNAs of ASNV1 had a strong A/U preference in their 5 -terminal nucleotides ( Figure 2D).  Figure 3B). According to the relatively abundant distribution of the read's coverage in the whole genome and small RNAs with both positive and negative sense, ASOV1 and ASCV1 replicated effectively in A. sinensis ( Figure 3A,B). genome of ASIV1 ( Figure 2E). The viral siRNAs of ASNV1 had a strong A/U preference in their 5′-terminal nucleotides ( Figure 2D).
ASOV1 triggered the peaks of 22 nt ( Figure 3D), and untranslated regions and asymmetric hotspots on both strands of the whole genome were observed in vsiRNA reads ( Figure 3F), while viral siRNAs of ASOV1 exhibited a strong U preference in the 5 -terminal nucleotides ( Figure 3E). However, ASCV1 triggered no peaks of 22 nt (Supplementary Figure S2A), and neither the untranslated regions or asymmetric hotspots on both strands of the whole genome were observed in vsiRNA reads (Supplementary Figure S2B).

Discussion
A large number of RNA viruses were discovered using HTS and advanced bioinformatics methods, thus, gaining a better understanding of insect viromes and viral evolution. Recently, numerous RNA viruses have been identified in multiple grasshopper species [34]. In this study, the four novel RNA viruses, namely, Atractomorpha sinensis nege-like virus 1 (ASNL1), Atractomorpha sinensis iflavirus 1 (ASIV1), Atractomorpha sinensis ollusvirus 1 (ASOV1), and Atractomorpha sinensis chu-like virus 1 (ASCV1) were found in A. sinensis. Negeviruses have a broad host range as a group of insect-specific viruses, including the orders of Coleoptera, Diptera, Hemiptera, Hymenoptera, Lepidoptera, Odonata, Orthoptera, and Thysanoptera [25]. In the meantime, negeviruses are characterized with a ssRNA (+) genome with poly(A) tails that are around 7000 nt to 10,000 nt in length and contain at least three ORFs [3,35]. Iflaviruses are the non-enveloped insect viruses that contain the ssRNA (+) genome with poly(A) tails at 3 UTR and encode one large polyprotein. For the polyprotein, the viral coat protein is located in an N-terminal domain and the non-structural proteins which are involved in replication and polyprotein processing in the C-terminal region [36]. In addition, the negative-sense RNA viruses, Chuviruses and Ollusviruses, belong to the Jingchuvirales order [37,38]. The genome of Chuviruses varies with unsegmented, bi-segmented, and circular forms, and most chuvirus genomes contain glycoprotein (G), nucleoprotein (N), and the large polymerase (L) genes. However, some genes are deleted during the evolution of Chuviruses, such as glycoprotein genes [39]. In our study, the genomic feature of ASNV1 was consistent with that of most negeviruses possessing four ORFs (Figure 1). Similar to other iflaviruses, ASIV1 contains one polyprotein, with the CRPV and Hel domain located at the N-and C-terminus, respectively ( Figure 2). Moreover, the genome of ASCV1 encoded large polymerase (RdRP) and capsid proteins (VP39), but not G protein, which might be deleted during the virus-host co-evolution process ( Figure 3).
As potential biocontrol agents, ISVs have attracted more attention due to their effects on vectoring ability. In addition, they impede some arboviruses' infections in vivo and in vitro in mosquitos, such as Nhumirim virus and Zika virus, participate in reducing West Nile virus transmission when co-infecting the mosquito vector [4], and are suppressed by Menghaic rhabdovirus and Shinobi tetravirus in the A. albopictus C6/36 cell line [40]. Otherwise, it is known that ISVs regulate the innate immune pathway in some insects, such as mosquitoes, to interfere with the replication by decreasing vector competence [4,40,41]. In our study, the antiviral RNA interference pathway of A. sinensis responded strongly during the infection of three RNA viruses.
In conclusion, ASNV1, ASIV1, ASOV1, and ASCV1 were identified in A. sinensis and classified as nege-like virus, iflavirus, ollusvirus, and chu-like virus according to a phylogenetic analysis. Except for ASCV1, other RNA viruses were targeted by the antiviral RNA interference pathway in A. sinensis.
Supplementary Materials: The following are available online at https://www.mdpi.com/xxx/s1, Figure S1: The Genetic structure of Aphid lethal paralysis virus and putative reads found in A. sinensis (% represents amino acid similarity and ? refers to unknown sequences), Figure S2: The sRNA plot of ASCV1, Table S1: Primers used in this study, Table S2: Abbreviations of virus names and GenBank accession numbers used in this study, Table S3: Amino acid/nucleotide identity analysis of ASNV1 based on the conserved amino acid and nucleotide sequence of the RdRp domain, Table S4: Amino acid/nucleotide identity analysis of ASIV1 based on the conserved amino acid and nucleotide sequence of the RdRp domain, Table S5: Amino acid/nucleotide identity analyses of ASCV1 and ASOV1 based on the conserved amino acid and nucleotide sequence of the RdRp domain.

Conflicts of Interest:
The authors declare no conflict of interest.