Transcriptome Analysis Revealed Genes Related to γ-Irradiation Induced Emergence Failure in Third-Instar Larvae of Bactrocera dorsalis

Simple Summary Bactrocera dorsalis (Diptera: Tephritidae) is a severe insect pest of numerous fruits and crops worldwide. Using the Illumina Hiseq 2000 platform, we generated a comprehensive transcriptome of B. dorsalis in response to radiation. The adult of B. dorsalis could not emerge when third-instar larvae were irradiated with 60Co-γ at 116Gy. Most differentially expressed genes (DEGs) in gene ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEEG) in enrichment studies were involved in hemolymph coagulation and digestive processes, such as protein digestion and absorption, while pancreatic secretion may be linked to the response to irradiation. Furthermore, the differential expressions of up/down regulated unigenes were confirmed by quantitative reverse transcription-PCR (qRT-PCR). The downregulation of sqd may contribute to the failure of emergence in irradiated samples, while the up-regulation of ENPEP and Ugt are involved in metabolic pathways, and their up-regulation may be considered the results of the irradiation stress reaction. Overall, the current study is a useful resource for the identification of potentially useful biochemical markers that can be used in quarantine. Abstract The oriental fruit fly is a polyphagous and highly invasive economically important pest in the world. We proposed the hypothesis that radiation treatment influence RNA expression in the larvae and leads to emergence failure. Therefore, transcriptome analyses of third-instar larvae of B. dorsalis ionizing, irradiated with 60Co-γ at 116Gy, were conducted and compared with the controls; a total of 608 DEGs were identified, including 348 up-regulated genes and 260 down-regulated ones. In addition, 130 SNPs in 125 unigenes were identified. For the DEGs, the most significantly enriched GO item was hemolymph coagulation, and some of the enriched pathways were involved in digestive processes. The subsequent validation experiment confirmed the differential expression of six genes, including sqd, ENPEP, Jhe, mth, Notch, and Ugt. Additionally, the 3401:G->A SNP in the Notch gene was also successfully validated. According to previous research, this was the first comparative transcriptome study to discover the candidate genes involved in insect molt to pupae. These results not only deepen our understanding of the emerging mechanism of B. dorsalis but also provide new insights into the research of biomarkers for quarantine insect treatment with the appropriate dose of radiation.


Introduction
The oriental fruit fly, Bactrocera dorsalis (Diptera: Tephritidae), belongs to the B. dorsalis complex and is a severe insect pest of several fruits and vegetables all around the world [1]. It was first detected in Sub-Saharan Africa (SSA) in costal parts of Kenya in 2003 and Furthermore, our data will provide insights into the molecular mechanisms putatively related to the irradiation response of B. dorsalis.

Insect Rearing and Sample Preparation
The laboratory colony of B. dorsalis was from a strain originally collected and reared following the method of Zheng et al. [26] in Guangzhou, Guangdong, China. Adults were fed with water and a standard laboratory diet consisting of yeast powder, honey, sugar, and ascorbic acid. All specimens were kept in an incubator at 27 ± 1 • C, 70 ± 5%, at a relative humidity, and a photoperiod of 12:10(L:D) h. Third-instar larvae were collected and divided into two groups: one group was treated and irradiated with 60 Co-γ at 116 Gy, according to the Guangzhou Huada Biotechnology Co., Ltd. set manufactured protocol, while the other group was used as controls. Three biological replicates were used for each group: CK (Ck1A, Ck2A, and CK3A) and R (R1A, R2A, and R3A). The Fricke system was employed for both the reference standard and routine dosimetry [27]. This dosimetry system was constructed in accordance with ASTM E1026- 13 [27] and ISO/ASTM 51261 [27], and the measured uncertainty value was computed in accordance with ISO/ASTM 51707 [28]. After irradiation, the development from larvae to the pupa of B. dorsalis was delayed for about 3 days, and no adults emerged.

RNA Isolation and Sequencing
RNA from 18 flies (an irradiated group and control group with three replicates of each group, including 3 flies per replicate) was extracted using a Trizol-reagent (Invitrogen, Carlsbad, CA, USA), following the manufacturer's procedures. The quality of the total RNA was verified by measuring its absorbance at 260 nm using a NanoVue UV-Vis spectrophotometer (GE Healthcare Bio-Science, Uppsala, Sweden). The integrity of RNA was verified by 1% agarose gel electrophoresis. Poly (A) mRNA was purified using oligo-dT beads. It was fragmented into short fragments and served as a template to synthesize the first-strand cDNA using a random hexamer-primed reverse transcription, followed by the synthesis of the second-strand cDNA. The cDNA libraries were then prepared following Illumina's protocols [29] and were sequenced on the Illumina Hiseq 2000 platform with 100 bp paired-end reads in BGI (Beijing Genomics Institute, Shenzhen, Guangdong, China).

De Novo Assembly
Raw data generated by the sequencing were first filtered to remove low-quality reads. Briefly, the following reads were excluded: (1) reads containing adapter sequences; (2) reads with over 5% uncertain nucleotides (Ns); (3) reads with over 10% low-quality bases (Q ≤ 10). The remaining clean reads were used for transcriptome de novo assembly using the Trinity de novo RNAseq assembly pipeline (Inchworm, Chrysalis, and Butterfly) with k-mer sizes set at 25. The output of the Trinity pipeline is a fasta formatted file containing full-length transcripts, erroneous contigs, partial transcript fragments, and non-coding RNA molecules. By using RSEM (RNA-Seq by expectation maximization) software1.2.0 [30], these sequences were then filtered to identify contigs containing full or near full-length transcripts or likely coding regions and isoforms that were represented at a minimum level based on the read abundance. The resulting de novo transcripts (unigenes) were screened against protein databases, such as NR, Swiss-Prot, KEGG, and COG using BLASTX with a cut-off E value of 10 −5 . For each transcript, the best aligning result was used to determine the sequence direction. If different databases gave conflicting results, a priority order of GenBank nonredundant (NR), Swiss-Prot, KEGG, and COG was followed. If a unigene did not map to any of the entries in the above databases, the software ESTScan [31] was introduced to predict its coding regions and sequence direction.

Expression Analysis
Clean reads from each sequencing library were independently mapped to the above transcriptome assembly (only contigs containing full and partial ORFs) by using Soapaligner/soap2 [32]. Mismatches of less than three bases were allowed. Gene expression levels were calculated using the RPKM method [33]. If there was more than one transcript for a given gene, the longest transcript remained to calculate its expression level and coverage. The false discovery rate (FDR) method [34] was used to identify differentially expressed genes (DEGs) between the two groups. Genes with FDR < 0.001 and an absolute value of log2Ratio ≥ 1 (fold change ≥ 2) were selected as DEGs in the DESeq package in R (1.18.0) [35]. Then, the DEGs were further annotated by GO function analysis and KEGG pathway analysis [36].

GO and KEGG Enrichment Analysis
All genes were mapped to the KEGG pathways (http://www.genome.jp/kegg/ (accessed on 1 August 2020)) database and GO database (http://geneontology.org/ (accessed on 1 August 2020)). A hypergeometric distribution test was carried out to identify pathways or GO items overrepresented with DEGs. Multiple testing corrections were performed with the Benjamini-Hochberg procedure [37]. KEEG and GO categories less than p < 0.05 were considered highly enriched.

Heterozygous SNP Analysis
The general unigene sequences were used as references for SNP identification. The SNP calling was carried out by using SoapSNP. SoapSNP calculates the likelihood of each genotype based on the alignment results and the corresponding sequencing quality scores. It infers the genotype with the highest posterior probability based on Bayes' theorem. The SNPs from the two groups were classified into three categories: (1) the common SNPs shared by the irradiated and control samples and the SNPs that were heterozygous in each group; (2) the irradiated-specific SNPs, which were derived only from the irradiated samples and the controls that had the same genotype as the reference; and (3) the controlspecific SNPs, which were derived from only the controls and irradiated samples which had the same genotype as the reference. Only irradiated-specific SNPs or control-specific SNPs shared by all the samples in its group remained.

Validation of Gene Expression by qRT-PCR
The differential expressions for the selected genes were validated in two groups (the control and treatment groups), and these genes were picked from the transcriptome data based on their differential expression trends in both groups. In total, we used 18 individuals (from the control and treatment groups) with three biological replicates and three technical replicates. The samples were treated with the same procedure as the samples used for sequencing. RNA was extracted as previously described for sequencing. A total of 2 µg of RNA from each sample was reverse transcribed in a 20 mL reaction system using the PrimerScript TM RT Reagent Kit (Takara Bio Inc., Shiga, Japan). The PCR conditions for both genes were 95 • C for 3 min, followed by 34 cycles of 94 • C for 30 s, 60 • C for 30 s, 72 • C for 30 s, and a final extension at 72 • C for 10 min. The relative fold changes were determined using the 2 −∆∆Ct method with the a-tubulin gene (GU269902) of B. dorsalis as an internal control. The GraphPad Prism was used to analyze the data (version 7; GraphPad Software, San Diego, CA, USA, https://www.graphpad.com/scientific-software/prism/ (accessed on 27 August 2018)). For mean comparisons, Student's t-test (p > 0.05) was employed. All data were displayed as bar charts, along with the means and standard deviations (Mean ± SD).

SNP Validation
To validate the potential SNP markers, the SNP of the Notch gene was further selected and validated into two groups (the control and treatment groups) by PCR and Sanger sequencing [38]. SNP was validated six times, with three biological replicates and three technical replicates, and each replicate had 3 individuals. The samples were treated with the same procedure as the samples used for sequencing. DNA was extracted for the SNP validation experiment. Primers were designed by the Primer Premier 5 software (PREMIER Biosoft International, Palo Alto, CA, USA). PCR amplification in a 50 µL reaction was performed following the conditions: 95 • C for 2 min, 35 cycles of 95 • C for 15 s, 60 • C for 20 s, 72 • C for 30 s, and 72 • C for 2 min. The PCR product purification was carried out using the E.Z.N.A. ® Gel Extraction Kit (Omega Bio-tek, Inc., Norcross, GA, USA). Sanger sequencing was performed in both forward and reverse directions on an ABI 3730 DNA Analyzer. The sequence trace files were analyzed manually.

Sequencing and Transcriptome Assembly
The data generated by Illumina's Hiseq 2000 sequencing contained a total of 31,987,764,800 bases from 159,938,824 pair-end sequence reads. The average number of reads from the irradiated group and controls were 26,898,794 and 26,414,148, respectively (Table S1). For each sample, the raw data were firstly assembled into contigs separately (Table S2), and then the unigenes were combined into the final assembly (Table 1). In total, we generated 50,268 unigenes with the mean length of 781 bp. As shown in Table 1, the unigene size distribution revealed the following: 59.63% (29,975) were between 100 and 500 bp in length; 19.59% (9845) were between 500 and 1000 bp in length; 12.67% (6370) were between 1000 and 2000 bp; and 8.11% (1367) were more than 2000 bp. Note: Irradiated samples are shown with the prefix of "R" while control samples are shown with the prefix of "CK".
To classify the functions of the unigenes, the COG database was used. As shown in Figure 2, a total of 21,313 unigenes (42.40%) were annotated and divided into 26 specific categories. The general function category, which contained 16.23% of the mapped unigenes (3459), was the top category, followed by the transcription (2017, 9.46%), carbohydrate transport and metabolism (1683, 7.90%), replication, recombination, and repair (1586, 7.44%), and translation, ribosomal structure, and biogenesis (1515, 7.11%). The nuclear structure was the smallest category, which only contained two unigenes (0.0094%). GO analysis divided the unigenes into three ontologies: molecular function, cellular component, and biological process. As shown in Figure 3, 32.65% (16,412 unigenes) of the unigenes were categorized into 59 groups. The cellular process was the group containing the most unigenes. To classify the functions of the unigenes, the COG database was used. As shown in Figure 2, a total of 21,313 unigenes (42.40%) were annotated and divided into 26 specific categories. The general function category, which contained 16.23% of the mapped unigenes (3459), was the top category, followed by the transcription (2017, 9.46%), carbohydrate transport and metabolism (1683, 7.90%), replication, recombination, and repair (1586, 7.44%), and translation, ribosomal structure, and biogenesis (1515, 7.11%). The nuclear structure was the smallest category, which only contained two unigenes (0.0094%). GO analysis divided the unigenes into three ontologies: molecular function, cellular component, and biological process. As shown in Figure 3, 32.65% (16,412 unigenes) of the unigenes were categorized into 59 groups. The cellular process was the group containing the most unigenes.    Pathway mapping analysis was performed based on the KEGG pathway database. As shown in Table S4, 16,528 unigenes were mapped to 258 pathways. Compared with Pathway mapping analysis was performed based on the KEGG pathway database. As shown in Table S4, 16,528 unigenes were mapped to 258 pathways. Compared with other, metabolic pathways contained the most unigenes (2421, 14.65%), followed by pathways in cancer (744, 4.5%), focal adhesion (588, 3.56%), and purine metabolism (583, 3.53%).

Differentially Expressed Genes in Response to Irradiation and Pathway Enrichment Analysis of DEGs
To identify the genes that responded to irradiation, differentially expressed sequences between the two groups were identified (Table S5). There were 608 significantly differentially expressed unigenes detected between the irradiated and control samples, including 348 up-regulated and 260 down-regulated genes. To understand the functions of DEGs, DEGs were mapped to the GO database, and enrichment analysis was performed. A total of 252 DEGs were mapped to the GO database, and 78 GO items were significantly enriched (Table 2). Hemolymph coagulation was the most significantly enriched GO item (p = 4.10 × 10 −25 ), followed by hemostasis (p = 4.83 × 10 −22 ), coagulation (p = 4.83 × 10 −22 ), and the regulation of body fluid levels (p = 1.85 × 10 −21 ). All of the DEGs were also mapped to the KEGG database and compared with the whole transcriptome background. Among all of the genes with KEGG annotation, a total of 380 DEGs were found between the two groups. A total of 32 pathways were notably enriched ( Table 3). The most significant pathway was the renin-angiotensin system (p = 5.42 × 10 −10 ), followed by protein digestion and absorption (p = 3.6 × 10 −8 ) and pancreatic secretion (p = 3.6 × 10 −8 ). Most of the enriched pathways (18/32) were metabolic pathways, such as porphyrin and chlorophyll metabolism, retinol metabolism, steroid hormone biosynthesis, and insect hormone biosynthesis.    Genes with irradiated-specific SNPs or control-specific SNPs were also considered candidate genes that responded to irradiation. As shown in Table S6, a total of 130 SNPs in 125 unigenes were identified.

Verification of Differentially Expressed Genes
To further evaluate the sequencing results, the expression levels of six genes (eight unigenes) potentially involved in the response to irradiation were analyzed by qRT-PCR. As shown in Table 4 and Figure 4, the validation experiment revealed the same expression tendency as the sequencing data.

SNP Validation
Among the three pair samples for the SNP validation, one pair of the samples failed the PCR experiment, and only another two pairs of the samples were successfully sequenced. Among them, the 3401:G->A SNP in the Notch gene was successfully validated (Figure 5), implicating its involvement in response to irradiation.

SNP Validation
Among the three pair samples for the SNP validation, one pair of the samples failed the PCR experiment, and only another two pairs of the samples were successfully sequenced. Among them, the 3401:G->A SNP in the Notch gene was successfully validated (Figure 5), implicating its involvement in response to irradiation.

Discussion
Irradiation, temperature, and fumigation are currently the most popular phytosanitary treatment techniques used for the disinfestation of postharvest fruit flies in fruits and vegetables [39]. Irradiation has gained popularity as a phytosanitary treatment in the last decade for the eradication of arthropods, ornamentals, and preserved products [40]. Irradiation can cause sub-lethal molecular or biochemical changes, triggering a cascade of physiological changes. In order to eliminate insect pests, gamma radiation is frequently used as a potentially better alternative to toxic fumigants [41].
In this study, with transcriptome sequencing data from two groups (the control and treatment groups with their three biological replicates), we selected molecular markers for the identification of insects that had been treated with gamma irradiation. Previous researchers demonstrated that gamma radiation is environmentally beneficial, with no noticeable environmental adverse effects. Moreover, gamma irradiation prevents the reproduction of several stored grain pest species [42,43].
Our deep sequencing strategy (over 5 G of data for each sample) generated a transcriptome assembly with 50,268 unigenes (a mean length of 781 bp). A total of 20.78% of the unigenes were more than 1000 bp. The previously reported B. dorsalis transcriptome, which was also sequenced on the Illumina platform, generated 49,804 unigenes with a mean size of 456 bp, and less than 10% of the unigenes were over 1000 bp. With 454-pyrosequencing, Zheng et al. [44] generated a B. dorsalis transcriptome with 48,876 unigenes (mean size of 750 bp), and about 18.4% of the unigenes were over 1000 bp. Therefore, compared with the previously reported B. dorsalis transcriptome assembly, we generated more detailed data that facilitated our subsequent analysis. The functional annotation results of the unigenes were generally consistent with the previous transcriptome: the majority of the unigenes had a strong homology with Drosophila; the general function category was the top COG category, and the KEGG metabolic pathways contained the most unigenes.
A total of 608 significantly differentially expressed unigenes between the two groups may be attributed to their response to γ irradiation. GO item enrichment analyses revealed that the most significantly enriched GO item was hemolymph coagulation, which is one of the first responses to injury in insects. KEGG pathway analyses of these genes revealed that some of the pathways were involved in digestive processes, such as protein digestion

Discussion
Irradiation, temperature, and fumigation are currently the most popular phytosanitary treatment techniques used for the disinfestation of postharvest fruit flies in fruits and vegetables [39]. Irradiation has gained popularity as a phytosanitary treatment in the last decade for the eradication of arthropods, ornamentals, and preserved products [40]. Irradiation can cause sub-lethal molecular or biochemical changes, triggering a cascade of physiological changes. In order to eliminate insect pests, gamma radiation is frequently used as a potentially better alternative to toxic fumigants [41].
In this study, with transcriptome sequencing data from two groups (the control and treatment groups with their three biological replicates), we selected molecular markers for the identification of insects that had been treated with gamma irradiation. Previous researchers demonstrated that gamma radiation is environmentally beneficial, with no noticeable environmental adverse effects. Moreover, gamma irradiation prevents the reproduction of several stored grain pest species [42,43].
Our deep sequencing strategy (over 5 G of data for each sample) generated a transcriptome assembly with 50,268 unigenes (a mean length of 781 bp). A total of 20.78% of the unigenes were more than 1000 bp. The previously reported B. dorsalis transcriptome, which was also sequenced on the Illumina platform, generated 49,804 unigenes with a mean size of 456 bp, and less than 10% of the unigenes were over 1000 bp. With 454-pyrosequencing, Zheng et al. [44] generated a B. dorsalis transcriptome with 48,876 unigenes (mean size of 750 bp), and about 18.4% of the unigenes were over 1000 bp. Therefore, compared with the previously reported B. dorsalis transcriptome assembly, we generated more detailed data that facilitated our subsequent analysis. The functional annotation results of the unigenes were generally consistent with the previous transcriptome: the majority of the unigenes had a strong homology with Drosophila; the general function category was the top COG category, and the KEGG metabolic pathways contained the most unigenes.
A total of 608 significantly differentially expressed unigenes between the two groups may be attributed to their response to γ irradiation. GO item enrichment analyses revealed that the most significantly enriched GO item was hemolymph coagulation, which is one of the first responses to injury in insects. KEGG pathway analyses of these genes revealed that some of the pathways were involved in digestive processes, such as protein digestion and absorption and pancreatic secretion. This observation is consistent with the well-established irradiation effects on digestive physiology in other insect pests.
The differential expressions of eight unigenes were further confirmed by qRT-PCR. As shown in Table 4 (Figure 4), the squid gene sqd was downregulated in irradiated samples while ENPEP, Jhe, mth, Notch, and Ugt were highly expressed. The downregulation of sqd may contribute to the failure of emergence in irradiated samples since the initial organization of the Drosophila dorsoventral axis depends on the proteins encoded by sqd. Both ENPEP and Ugt are involved in metabolic pathways, and their up-regulation may be considered the results of the γ irradiation stress reaction. Juvenile hormone esterase (Jhe) plays an important role in the regulation of the juvenile hormone (JH). In lepidopteran hemolymph, decreased JH titers are positively correlated with increased JHE abundance [45]. The high expression of Jhe here may result in the low titers of JH, which are associated with diapause. For example, Monarch adult diapause is instituted by a deficiency in JH production [46]. Therefore, the upregulation of Jhe may contribute to the diapauses of our γ irradiated samples. Individuals with a mutated mth displayed an approximately 35 percent increase in their average lifespan and showed significant resistance to oxidative stress, starvation, and heat stress. We also detected a heterozygous SNP in this gene in the γ irradiated samples, which might be caused by the stress response. Notch (N) signaling is involved in establishing the correct width of various wing veins [47], and the up-regulation of Notch here may also contribute to the failure of emergence in the irradiated samples since the gain of function in Notch showed thinner and incomplete wing veins. In addition to the confirmed up-regulation of Notch, an SNP in this gene was also validated by the Sanger sequencing. It is highly possible that the confirmed SNP was induced by the irradiation treatment. Since the SNP was validated six times in three biological and three technical replicates of the samples, it could be used as DNA markers in quarantine. Other SNPs identified in this study are also useful sources for further studies on selecting DNA markers in quarantine.

Conclusions
In conclusion, we have generated a comprehensive transcriptome of the B. dorsalis in response to radiation using the Illumina Hiseq 2000 platform. The genes differentially expressed in the controls and irradiated samples were largely identified and functionally annotated. Our results provide useful biochemical markers that could be used in irradiation quarantine treatment. Furthermore, our data provide insights into the molecular mechanisms putatively related to the irradiation response of the third-instar larvae of B. dorsalis.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/insects13111017/s1, Table S1: Statistics of the transcriptome sequencing data for each sample; Table S2: Statistics of assembled contigs; Table S3: Functional annotation statistics; Table S4: Pathway mapping analyses for all unigenes; Table S5: Differentially expressed sequences between two groups; Table S6: Genes with irradiated-specific SNPs or controlspecific SNPs.
Author Contributions: C.S., S.S., W.W. and Q.L. conceived and designed the research; C.G. and Q.L. performed the experiments; Q.L. and S.S. analyzed the data. C.S., S.S., W.W., C.G., Y.G. and Q.L. wrote and revised the manuscript. All authors have read and agreed to the published version of the manuscript.