Transcriptome Analysis of the Oriental Fruit Fly Bactrocera dorsalis Early Embryos

The oriental fruit fly, Bactrocera dorsalis (Hendel), is one of the most devastating and highly invasive agricultural pests world-wide, resulting in severe economic loss. Thus, it is of great interest to understand the transcriptional changes that occur during the activation of its zygotic genome at the early stages of embryonic development, especially the expression of genes involved in sex determination and the cellularization processes. In this study, we applied Illumina sequencing to identify B. dorsalis sex determination genes and early zygotic genes by analyzing transcripts from three early embryonic stages at 0–1, 2–4, and 5–8 h post-oviposition, which include the initiation of sex determination and cellularization. These tests generated 13,489 unigenes with an average length of 2185 bp. In total, 1683, 3201 and 3134 unigenes had significant changes in expression levels at times after oviposition including at 2–4 h versus 0–1 h, 5–8 h versus 0–1 h, and 5–8 h versus 2–4 h, respectively. Clusters of gene orthology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations were performed throughout embryonic development to better understand the functions of differentially expressed unigenes. We observed that the RNA binding and spliceosome pathways were highly enriched and overrepresented during the early stage of embryogenesis. Additionally, transcripts for 21 sex-determination and three cellularization genes were identified, and expression pattern analysis revealed that the majority of these genes were highly expressed during embryogenesis. This study is the first assembly performed for B. dorsalis based on Illumina next-generation sequencing technology during embryogenesis. Our data should contribute significantly to the fundamental understanding of sex determination and early embryogenesis in tephritid fruit flies, and provide gene promoter and effector gene candidates for transgenic pest-management strategies for these economically important species.


Introduction
The oriental fruit fly, Bactrocera dorsalis (Hendel), is one of the most devastating and highly invasive agricultural pests in the world that causes severe economic loss due to damage to over 250 types of fruits and vegetables throughout Southeast Asia and several Pacific Islands [1,2]. As B. dorsalis has the characteristics of polyphagy, high reproductive capacity, and adaptability, it is a dreaded invasive species with great potential to invade new geographic areas and host plants [1]. Chemical insecticides Insects 2020, 11, 323 3 of 16

Insect Rearing and Embryo Collection
B. dorsalis were reared at the Institute of Horticultural and Urban Entomology at Huazhong Agricultural University (Wuhan, China). Larvae were maintained on a banana diet while adult flies were fed on an artificial diet consisting of yeast extract and sugar. All life stages were cultured at 28 • C under a 12 h light/12 h dark photoperiod in cages [52]. Embryos were collected from females by inducing egg laying through a perforated paper-cup coated with orange juice for a period of 15 min (first egg batch discarded) and then placed on moist filter paper and maintained at 25 • C and 70% humidity for set time intervals.

RNA Extraction and Library Preparation for Transcriptome Sequencing
Embryos were collected at 0-1, 2-4, and 5-8 h after egg laying (AEL) and stored in RNAlater ® Solution (Ambion, Austin, TX, USA). Total RNA was extracted using RNAiso Plus reagent (TaKaRa, Dalian, China) according to the manufacturer's protocol, with the quantity and purity examined using a Bioanalyzer 2100 (Agilent, CA, USA). A total amount of 1 µg RNA from 0-1, 2-4, and 5-8 h embryos was used as the input material to prepare RNA samples. Sequencing libraries were produced using the NEBNext ® Ultra™ RNA Library Prep Kit for Illumina ® (NEB, Ipswich, MA, USA) according to the manufacturer's instructions. The index codes were added to sequences to distinguish each sample.
In brief, mRNA was purified from total RNA using poly-T oligo-attached magnetic beads. The divalent cations were used for fragmentation under elevated temperature in NEBNext First Strand Synthesis Reaction Buffer (5X), and the random hexamer primer and M-MuLV Reverse Transcriptase (RNase H − ) were used for first strand cDNA synthesis. Subsequently, DNA Polymerase I and RNase H were used for second strand cDNA synthesis, after which the exonuclease and polymerase were used to convert the remaining overhangs into blunt ends. Hybridizations were prepared by ligating the NEBNext Adaptor with a hairpin loop structure after the adenylation of the DNA fragment 3 ends. The library fragments were purified with the AMPure XP system (Beckman Coulter, Beverly, MA, USA) to preferentially select 250-300 bp cDNA fragments. Then, 3 µL of USER Enzyme (NEB, Ipswich, MA, USA) was incubated with the size-selected and adaptor-ligated cDNA at 37 • C for 15 min, followed by 5 min at 95 • C before PCR. The Phusion High-Fidelity DNA polymerase, Universal PCR primers and Index (X) Primer were used to conduct PCR. PCR products were then purified (AMPure XP system), with library quality being assessed using the Agilent Bioanalyzer 2100 system.
The cBot Cluster Generation System was employed to conduct the clustering of index-coded samples using the TruSeq PE Cluster Kit v3-cBot-HS (Illumina, San Diego, CA, USA) according to the manufacturer's protocol. After cluster generation, an Illumina HiSeq platform was used to sequence the library preparations and to generate 125 bp/150 bp paired-end reads.

Transcript Sequence Analysis
The in-house perl scripts were used first to process raw data (raw reads) of fastq format. After removing reads containing adaptor, reads containing poly-N, and low quality reads from raw data, the clean data (clean reads) were generated. The Q20, Q30 and GC-content of the clean data were calculated simultaneously, and high quality clean data were used for all of the downstream analyses. The paired-end clean reads were then aligned to the whole genome sequence (WGS) of B. dorsalis (NCBI Assembly #ASM78921v2) using HISAT2 v2.0.4. Since HISAT2 can generate a database of splice junctions based on the gene model annotation file, it was the preferred mapping tool versus non-splice mapping tools. The read numbers mapped to each gene were counted using HTSeq v0.9.1, and the expression level of each gene was further calculated by FPKM (Fragment Per Kilobase of exon model per million mapped reads) based on the length of the gene and the read counts mapped to this gene. FPKM is presently the most commonly utilized method for assessing gene expression levels as an effect of sequencing depth and gene length for the concurrent read counts [53,54].

Comparison of Differentially Expressed Genes
Prior to differential gene expression analysis, the read counts were adjusted by the edgeR program package through one scaling normalized factor in each sequenced library. Differential expression analysis of two conditions was performed using the DEGSeq R package (1.20.0). The Benjamini and Hochberg's method was used to adjust the P values. Genes with a corrected P-value less than 0.005 and log 2 (Fold change) of 1 identified by DESeq were assigned as differentially expressed [55]. The GOseq was applied for the Gene Ontology (GO) enrichment analysis of differentially expressed genes, and the gene length bias was corrected. Significantly enriched differentially expressed genes were determined by GO terms with a corrected P-value less than 0.05. Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database resource to check the statistical enrichment of differentially expressed genes using the molecular-level information from biological systems and large-scale molecular datasets from genome sequencing and other high-throughput experimental technologies (http://www.genome.jp/kegg/) [56]. The statistical enrichment of differentially expressed unigenes (DEGs) in KEGG pathways was analyzed with the KOBAS software [57]. Both known and novel transcripts were constructed and identified using the Cufflinks v2.1.1 Reference Annotation Based Transcript (RABT) assembly method from TopHat alignment results [58]. The software rMATS v3.2.5 was used to classify the alternative splicing (AS) events as five basic types. The number of AS events was estimated separately for each compared group.

Quantitative Real-Time PCR
The developmental transcript expression profiles of sex determination and early zygotic genes were investigated using quantitative Real-Time PCR (qPCR). Total RNA was extracted using RNAiso Plus reagent (TaKaRa, Dalian, China) from 100 embryos per replicate, with 200 ng for each sample subjected to reverse transcription for mRNA using the PrimeScript TM RT Master Mix (TaKaRa, Dalian, China). The reverse transcription products were used for qPCR using the primers listed in Table S1. qPCR was performed using the SYBR Green qPCR mix following the manufacturer's instructions in a real-time thermal cycler (Bio-Rad, Hercules, CA, USA) using the cycling conditions: 95 • C for 10 min, 40 cycles of 95 • C for 15 s, 60 • C for 30 s and 72 • C for 30 s. Three biological and three technical replicates were performed, with expression data being analyzed by the 2 −∆∆ct method [59]. Dissociation curves were determined for each mRNA to confirm unique amplification. The expression of ribosomal protein 49 (rp49) was used as an internal control to normalize the expression of mRNA.

Statistical Analysis
Conceptual amino acid sequences encoded by the tra, tra-2 and dsx genes in indicated species were used to construct a phylogenetic tree using MEGA5.0 (Molecular Evolutionary Genetics Analysis, Version 5.0, Sudhir Kumar, AZ, USA) with the pair-wise deletion option under the JTT empirical amino acid substitution model. Branch support was assessed by bootstrap analysis with 1000 replicates. All experiments were repeated at least three times and analyzed using GraphPad Prism 5.0 (GraphPad Software, San Diego, CA, USA) or Microsoft Excel (Microsoft, Redmond, WA, USA), with results being expressed as the mean ± SEM. Data were compared with either a two-way ANOVA-with subsequent t tests using Bonferroni post-tests for multiple comparisons-or with Student's t test. For all tests, differences were considered significant when p < 0.05.

B. Dorsalis Early Embryo Sequencing and Assembly
Transcriptome libraries (accession number: GSE118472) for the B. dorsalis early embryonic developmental stages (0-1, 2-4, and 5-8 h embryos AEL) were constructed by paired-end (PE) Illumina sequencing, which generated a total of 272,036,418 125 bp/150 bp-long PE reads with high sequence quality. The filtered sequence reads from all samples were assembled and produced 13,489 unigenes. The average size of these unigenes was 2185 bp with lengths that ranged from 61  (Table 1). There were 9809, 5227, 2873 and 1066 unigenes that were larger than 1000, 2000, 3000 and 5000 bp, respectively, with all having a complete open reading frame (ORF) ( Figure S1). After removing low-quality reads, 0-1, 2-4, and 5-8 h libraries generated 83, 85 and 98 million clean reads, respectively. Among these clean reads, 72-84 million (85.12%-86.97%), were mapped to unigenes in the whole genome sequence (WGS) of B. dorsalis ( Table 2). The percentage of clean reads ranged from 98.08% to 98.61%, and the percentage of reads mapped to the genome exon regions ranged from 91.2% to 97.1%, reflecting high quality sequencing ( Figures S2 and S3).

Comparison of Gene Expression Profiles in Three Sequential Early Embryo Stages
To evaluate the relative expression level of unigenes in the B. dorsalis early embryo transcriptome, unigene paired-end read counts were normalized by transforming them into Fragments Per Kilobase of transcript per Million mapped reads (FPKM). We obtained a wide range of expression levels from less than 1 FPKM to approximately 6595 FPKM (Table S2) Table S3). In the comparison of 5-8 and 0-1 h, the expression profiles of 3201 unigenes had altered, of which 1451 were up-regulated and 1750 were down-regulated ( Figure 2b and Table S3). When comparing 5-8 and 2-4 h, the expression of 3134 unigenes was significantly different as 1675 unigenes were up-regulated and 1459 unigenes were down-regulated ( Figure 2c and Table S3). The expression of 368 unigenes was significantly different in three pairwise comparisons ( Figure 3).

Unigene Function Annotation and Analysis
Unigene sequences were annotated by searching the nonredundant NCBI protein database using BLASTX. A total of 11,693 distinct sequences (86.69% of the unigenes) matched known genes (Table S2). Based on the gene ontology (GO) classification, differently expressed unigenes were characterized within three groups: biological processes, cellular components, and molecular function (Table S4). The results of each comparison showed high concordance with unigenes related to biological processes (Figure 4). In the comparison of 2-4 and 0-1 h, and 5-8 and 0-1 h, 127 and 421 biological process unigenes were involved in the cellular macromolecule metabolic process ( Figure  4a

Unigene Function Annotation and Analysis
Unigene sequences were annotated by searching the nonredundant NCBI protein database using BLASTX. A total of 11,693 distinct sequences (86.69% of the unigenes) matched known genes (Table S2). Based on the gene ontology (GO) classification, differently expressed unigenes were characterized within three groups: biological processes, cellular components, and molecular function (Table S4). The results of each comparison showed high concordance with unigenes related to biological processes (Figure 4). In the comparison of 2-4 and 0-1 h, and 5-8 and 0-1 h, 127 and 421 biological process unigenes were involved in the cellular macromolecule metabolic process ( Figure  4a

Unigene Function Annotation and Analysis
Unigene sequences were annotated by searching the nonredundant NCBI protein database using BLASTX. A total of 11,693 distinct sequences (86.69% of the unigenes) matched known genes (Table S2). Based on the gene ontology (GO) classification, differently expressed unigenes were characterized within three groups: biological processes, cellular components, and molecular function (Table S4). The results of each comparison showed high concordance with unigenes related to biological processes (Figure 4). In the comparison of 2-4 and 0-1 h, and 5-8 and 0-1 h, 127 and 421 biological process unigenes were involved in the cellular macromolecule metabolic process ( Figure  4a,b). In the comparison of 5-8 and 2-4 h, 607 biological process unigenes were involved in the cellular metabolic process (Figure 4c). In the molecular function category, enrichment comparisons showed that 93 unigenes involved in nucleic acid binding were enriched at 2-4 h vs. 0-1 h, 804 unigenes involved in binding were enriched at 5-8 h vs. 0-1 h, and 496 unigenes involved in organic

Unigene Function Annotation and Analysis
Unigene sequences were annotated by searching the nonredundant NCBI protein database using BLASTX. A total of 11,693 distinct sequences (86.69% of the unigenes) matched known genes (Table S2). Based on the gene ontology (GO) classification, differently expressed unigenes were characterized within three groups: biological processes, cellular components, and molecular function (Table S4). The results of each comparison showed high concordance with unigenes related to biological processes ( Figure 4). In the comparison of 2-4 and 0-1 h, and 5-8 and 0-1 h, 127 and 421 biological process unigenes were involved in the cellular macromolecule metabolic process (Figure 4a,b). In the comparison of 5-8 and 2-4 h, 607 biological process unigenes were involved in the cellular metabolic process (Figure 4c). In the molecular function category, enrichment comparisons showed that 93 unigenes involved in nucleic acid binding were enriched at 2-4 h vs. 0-1 h, 804 unigenes involved in binding were enriched at 5-8 h vs. 0-1 h, and 496 unigenes involved in organic cyclic and heterocyclic compound binding were enriched t 5-8 h vs. 2-4 h (Figure 4). In addition, the UniGenes metabolic pathway analysis was conducted using the Kyoto Encyclopedia of Genes and Genomes (KEGG) annotation system in each comparison (Table S5). We found that spliceosome pathway was enriched in the comparison of stages 2-4 and 0-1 h, and 5-8 and 0-1 h ( Figure 5). The observation that the RNA binding and spliceosome pathway were highly enriched and overrepresented in the GO term and KEGG analysis during B. dorsalis early stage embryogenesis may be an indication that RNA binding and spliceosome pathway enrichment is consistent with sex-specific alternative splicing.  (Figure 4). In addition, the UniGenes metabolic pathway analysis was conducted using the Kyoto Encyclopedia of Genes and Genomes (KEGG) annotation system in each comparison (Table S5). We found that spliceosome pathway was enriched in the comparison of stages 2-4 and 0-1 h, and 5-8 and 0-1 h ( Figure 5). The observation that the RNA binding and spliceosome pathway were highly enriched and overrepresented in the GO term and KEGG analysis during B. dorsalis early stage embryogenesis may be an indication that RNA binding and spliceosome pathway enrichment is consistent with sexspecific alternative splicing.

B. dorsalis Sex Determination and Early Zygotic Genes
The alternative intron splicing events that occur during early embryogenesis in B. dorsalis include the skipped exons (SE), alternative 5′ splice sites (A5SS), alternative 3′ splice sites (A3SS), mutually exclusive exons (MXE), and retained intron (RI), of which exon skipping exhibited the highest distribution in the three developmental time periods compared ( Figure S4). In addition to the Bdtra, Bdtra-2 and Bddsx genes, we identified the expression of another 18 additional sex determination gene transcripts in early embryos (Table 3). We found that all sex determination genes contained full length ORFs and that most of the identified sex determination ortholog genes were highly expressed, except for Bddsx, where low levels (normalized FPKM < 1) did not change significantly among 0-1, 2-4 and 5-8 h embryos (Table 3). The phylogenetic analysis of the TRA, TRA-2 and DSX amino acid sequences showed that the putative orthologs of Bdtra, Bdtra-2 and Bddsx were conserved in tephritid species, especially among the Bactrocera (Figure 6). The qPCR results showed that the expression of Bdtra, Bdfru, BdSxl and Bddpn increased significantly from 0-1 to 5-8 h, while the expression of Bddsx, Bdtra-2, Bdda and Bdfl(2)d had no substantial changes across the three early embryo stages, and the expression of Bdotu decreased significantly from 0-1 to 5-8 h (Figure 7). All these results were consistent with our deep sequencing data, which indicated that the current analysis is accurate. The maternal-to-zygotic transition is the period in B. dorsalis embryogenesis when maternal transcripts degrade and the zygotic genome expression is activated. In B. dorsalis, there were two waves of zygotic transcription at 25 °C, a minor wave occurring early where BdsisA

B. dorsalis Sex Determination and Early Zygotic Genes
The alternative intron splicing events that occur during early embryogenesis in B. dorsalis include the skipped exons (SE), alternative 5 splice sites (A5SS), alternative 3 splice sites (A3SS), mutually exclusive exons (MXE), and retained intron (RI), of which exon skipping exhibited the highest distribution in the three developmental time periods compared ( Figure S4). In addition to the Bdtra, Bdtra-2 and Bddsx genes, we identified the expression of another 18 additional sex determination gene transcripts in early embryos (Table 3). We found that all sex determination genes contained full length ORFs and that most of the identified sex determination ortholog genes were highly expressed, except for Bddsx, where low levels (normalized FPKM < 1) did not change significantly among 0-1, 2-4 and 5-8 h embryos (Table 3). The phylogenetic analysis of the TRA, TRA-2 and DSX amino acid sequences showed that the putative orthologs of Bdtra, Bdtra-2 and Bddsx were conserved in tephritid species, especially among the Bactrocera (Figure 6). The qPCR results showed that the expression of Bdtra, Bdfru, BdSxl and Bddpn increased significantly from 0-1 to 5-8 h, while the expression of Bddsx, Bdtra-2, Bdda and Bdfl(2)d had no substantial changes across the three early embryo stages, and the expression of Bdotu decreased significantly from 0-1 to 5-8 h (Figure 7). All these results were consistent with our deep sequencing data, which indicated that the current analysis is accurate. The maternal-to-zygotic transition is the period in B. dorsalis embryogenesis when maternal transcripts degrade and the zygotic genome expression is activated. In B. dorsalis, there were two waves of zygotic transcription at 25 • C, a minor wave occurring early where BdsisA and Bddpn genes were activated before 4 h after oviposition, and a second major wave that occurred at 5 h when Bdtra, BdSxl and Bdfl(2)d gene transcription was initiated based on the qRT-PCR and mRNA-sequencing results (Figure 8). The waves of zygotic transcription during B. dorsalis embryogenesis are similar to those in C. capitata [60]; however, in D. melanogaster, the minor wave initiates before 2 h after oviposition and the second wave occurs at 3 h, at the time when zygotic Dmtra initiates transcription (Figure 8). In the B. dorsalis early embryonic transcriptome, we identified a BdZelda (BdZld) sequence that encodes a zinc-finger transcription factor protein ( Table 3), but we failed to identify Bdslam encoding transcripts in the transcriptome. BdZld displayed a significant increase in expression from 0-1 to 5-8 h, which was involved in the maternal-to-zygotic transition ( Table 3). Other two cellularization genes, Bdsrya and Bdnullo, were found through BLAST annotation and motif searches of the transcriptome (Table 3). Bdsrya showed greater amino acid similarity to the D. melanogaster srya-like gene, which is involved in cellular blastoderm formation. and Bddpn genes were activated before 4 h after oviposition, and a second major wave that occurred at 5 h when Bdtra, BdSxl and Bdfl(2)d gene transcription was initiated based on the qRT-PCR and mRNA-sequencing results (Figure 8). The waves of zygotic transcription during B. dorsalis embryogenesis are similar to those in C. capitata [60]; however, in D. melanogaster, the minor wave initiates before 2 h after oviposition and the second wave occurs at 3 h, at the time when zygotic Dmtra initiates transcription ( Figure 8). In the B. dorsalis early embryonic transcriptome, we identified a BdZelda (BdZld) sequence that encodes a zinc-finger transcription factor protein (Table 3), but we failed to identify Bdslam encoding transcripts in the transcriptome. BdZld displayed a significant increase in expression from 0-1 to 5-8 h, which was involved in the maternal-to-zygotic transition (Table 3). Other two cellularization genes, Bdsrya and Bdnullo, were found through BLAST annotation and motif searches of the transcriptome (Table 3). Bdsrya showed greater amino acid similarity to the D. melanogaster srya-like gene, which is involved in cellular blastoderm formation.    Error bars indicate the SEM of three independent biological replicates and different letters above them indicate statistically significant differences (p < 0.05) among the three stages based on Student's t-test.

Discussion
In previous studies, the developmental gene expression profiles of B. dorsalis have been investigated by constructing RNA-seq libraries for different developmental stages [61][62][63][64]. However, early embryogenesis transcripts in this dipteran species have been largely unexplored. To better define the broad gene expression profiles in early embryogenesis and the initiation of zygotic control of sex determination, we sequenced transcriptomes from three developmental time periods in B. dorsalis early embryos by next-generation sequencing (NGS) technology. Twenty-four sex determination and cellularization genes from a total of 13,489 unigenes were identified, and differential expression profiles of transcripts were validated from the early embryo libraries. Importantly, this study is the first transcriptome exploration of embryogenesis in B. dorsalis, which provides an essential gene expression resource necessary to fully understand the genetic basis of sex determination in this tephritid fruit fly.
In D. melanogaster, the combination of X-chromosome linked signal elements (XSEs), namely sisA, sc, os and runt as the primary signal, initiates the sex determination cascade [41,43,65,66]. In our study, we identified B. dorsalis homologs of sisA, sc and runt, but not os, expressed in the early embryonic stages. BdsisA, Bdsc and Bdrunt increased 7-, 57-and 45-fold, respectively, over this time period. Compared to the sisA, sc, os and runt genes that are all transcribed from the X chromosome in D. melanogaster [66,67], BdsisA, Bdsc and Bdrunt are autosomal genes in B. dorsalis. The transcripts of BdsisA and Bddpn have the same expression profiles in C. capitata, which exhibit a significant increase in the zygote [60]. Although the primary sex determination signal is postulated to be a Y chromosome-linked male determining factor in tephritids [47][48][49][50][51], the identification of XSEs in the early embryonic developmental transcriptome may expand the function of these genes in the sex determination cascade of non-drosophilid insects [60,[68][69][70][71][72][73]. In contrast to in Drosophila, Sxl is not sex-specifically spliced and is not involved in tephritid sex determination [46,74,75]. In our study, we identified the orthologs of Drosophila Sxl, tra, tra-2, and dsx genes in B. dorsalis, of which BdSxl and Bdtra-2 have the same transcripts in males and females. Bdtra and Bddsx play an important role in B. dorsalis sex determination and reproduction, which are regulated by sex-specific alternative splicing [17,[76][77][78]. Interestingly, we observed that RNA binding, the spliceosome pathway and alternative splicing events were highly enriched during the early stage of embryogenesis in B. dorsalis, which is also consistent with the role of sex-specific alternative splicing in B. dorsalis sex determination, in addition to that in other insects species [75,79,80]. The expression of Bdtra increased approximately two-fold during early embryogenesis, while Bdtra-2 had consistently higher expression levels during the same periods. Given that TRA-2 performs a novel function by interacting with TRA to form a splicing factor that results in the female-specific splicing and autoregulation of tra F and dsx F [13][14][15][16][17][18]38,81,82], their significant expression levels in early embryos is not unexpected. However, the relative differences in the levels of Bdtra-2 transcription had no substantial changes across the three early embryo stages.
Previous research shows that several genes expressed during the minor wave and major wave of embryonic transcription are involved in cellularization and sex determination [24,28,60,83]. Based on FPKM analysis, almost all the genes identified from the B. dorsalis early embryos have maternally derived transcripts (Table 3), except for Bddpn, Bddsx, Bdfru, Bdsc, Bdemc, Bdrunt, and Bdnullo having FPKM values <1 at 0-1 h. A Bdslam transcript was not detected previous to cellularization but has been reported as a maternal transcript in C. capitata and Bactrocera jarvisi [60,69]. Bdsrya was expressed during early embryogenesis at 0-1 h, and cognates of these genes have provided promoters that drive a transgenic tetracycline-suppressible embryonic lethality system, first tested in D. melanogaster [10] and then in the pest species C. capitata, Anastrepha suspensa, Anastrepha ludens and Lucilia cuprina [9,11,12,[84][85][86]. Notably, for two of these species, A. ludens and L. cuprina, the use of the srya promoter resulted in embryonic lethality but also in maternal female sterility, presumably due to the pre-zygotic expression of the lethal effector in their ovaries that could be suppressed by a short-term tetracycline diet [84,85]. This might also occur in B. dorsalis, given the appearance of the pre-zygotic maternal Bdsrya transcript at 0-1 h. For Lucilia, however, the use of the nullo promoter avoided pre-zygotic cell lethality [86], and our results would suggest that the use of the Bdnullo promoter, which initiates expression zygotically at 2-4 h, would also be preferable for an embryo-specific lethality system in B. dorsalis. Other uses of embryonic and sex-determination genes for population control or the manipulation of various insects include the engineering of early zygotic gene (EZG) promoters in Drosophila for the zygotic-specific expression of antidote genes for the maternal-effect dominant embryonic arrest (Medea) gene-drive system [87]. For Anopheles gambiae mosquitoes, the CRISPR-Cas9 gene drive targeting of the terminal dsx sex determination gene has been used to induce female sterility, resulting in incomplete population suppression [88], and in the silkworm, Bombyx mori, a female-specific embryonic lethal system has been developed by the CRISPR-Cas9 targeting of the tra-2 gene [89]. In our study, we have identified several sex determination and early zygotic genes that should also contribute to the development of novel genetic control strategies for B. dorsalis and related pest species.

Conclusions
In summary, RNA-seq analysis was performed to elucidate or confirm the sex determination and cellularization processes at the molecular level during early embryogenesis in B. dorsalis, a major invasive agricultural tephritid pest. Numerous genes displayed significant changes in expression during embryogenesis, especially genes involved in sex determination and the maternal-to-zygotic transcriptional transition. The gene expression patterns of Bdtra, Bdtra-2 and BdZld, in particular, are an indication of their important roles in early embryogenesis. Overall, the identification of sex-determination and cellularization genes has demonstrated the value of B. dorsalis as a non-typical model for early embryonic development and sexual differentiation. This study should also result in the advancement of the identification and use of gene promoters and sex-specific splicing mechanisms from these genes for transgenic-based improvements in pest management strategies for B. dorsalis and related fruit fly species.