Drosophila Interspecific Hybridization Causes a Deregulation of the piRNA Pathway Genes

Almost all eukaryotes have transposable elements (TEs) against which they have developed defense mechanisms. In the Drosophila germline, the main transposable element (TE) regulation pathway is mediated by specific Piwi-interacting small RNAs (piRNAs). Nonetheless, for unknown reasons, TEs sometimes escape cellular control during interspecific hybridization processes. Because the piRNA pathway genes are involved in piRNA biogenesis and TE control, we sequenced and characterized nine key genes from this pathway in Drosophila buzzatii and Drosophila koepferae species and studied their expression pattern in ovaries of both species and their F1 hybrids. We found that gene structure is, in general, maintained between both species and that two genes—armitage and aubergine—are under positive selection. Three genes—krimper, methyltransferase 2, and zucchini—displayed higher expression values in hybrids than both parental species, while others had RNA levels similar to the parental species with the highest expression. This suggests that the overexpression of some piRNA pathway genes can be a primary response to hybrid stress. Therefore, these results reinforce the hypothesis that TE deregulation may be due to the protein incompatibility caused by the rapid evolution of these genes, leading to a TE silencing failure, rather than to an underexpression of piRNA pathway genes.


Introduction
Transposable elements (TEs) are mobile genetic units that are interspersed throughout the genomes of almost all eukaryotes, often occupying significant fractions of the genome of their hosts. Their presence is an important threat to their host's integrity, as their mobilising ability and repetitive nature makes them powerful endogenous mutators. To diminish their harmful effects, organisms have developed several TE repression strategies, especially in the germline, where new mutations can be transmitted to the next generations [1,2]. In the animal germline, the Piwi-interacting small RNA (piRNA) pathway acts by silencing TEs transcriptionally and post-transcriptionally through sequence homology between piRNAs and TEs [3][4][5]. piRNA biogenesis starts when long piRNA precursors are transcribed from specific genomic piRNA clusters and cleaved to produce primary piRNAs [4]. Those analyses of genes armi, aub, krimper, piwi, and zuc; and crosses A2 and B2 for genes ago3, mt2, rhino, and spnE. For simplicity, they are listed as crosses A and B in the manuscript. The parental stocks used were the D. buzzatii Bu28 strain-an inbred line originated by the union of different populations collected in 1982 in Los Negros, Bolivia-and the D. koepferae Ko2 strain-an inbred line originated from a population collected in 1979 in San Luis, Argentina. Both stocks were maintained by brother-sister mating for more than a decade and are now kept by mass culturing. All stocks and crosses were reared at 25 • C in a standard Drosophila medium supplemented with yeast.  (Table 1) were downloaded from the Flybase database [29] and aligned to the D. buzzatii genome [30] using BLAST's tblastn. We retrieved the best alignment hit for each gene and its D. buzzatii genome location was used for primer design (see Supplementary file 1A). Because no reference genome was available for D. koepferae, some primers did not amplify, therefore some of the genes-aub, krimper, piwi, and rhino-lack small fragments at their 5 and/or 3 ends. All nucleotide sequences were deposited in GenBank under the accession numbers from MN901612 to MN901629.
We carried out all PCR reactions in an MJ Research Inc. thermal cycler using the following program: 5 min at 94 • C; 30 cycles of 45 s at 94 • C, 45 s at specific annealing temperature (see Supplementary file 1A for primer sequences), 90 s at 72 • C; and 10 min at 72 • C. A final volume of 50 µL was used, with 1× High Yield Reaction Buffer with Mg2+ (Kapa Biosystems), 0.2 mM of each dNTP (Roche), 0.4 µM of each primer (Sigma-Aldrich), template DNA (10-20 ng) and 0.04 U/µL of Taq polymerase (KapaTaq from Kapa Biosystems). SpnE and aub genes were amplified using Roche's Expand Long Template PCR system (for both parental species). Amplicons were purified with the Nucleospin Gel and PCR Clean-Up kit (Macherey-Nagel), and cloned with the pGEM-T Easy Vector System I (Promega).

Sequence Analysis
Sequencing of PCR cloned products was performed by Macrogen Inc. (Seoul, Korea) service. Multiple sequence alignment was carried out with MAFFT [31]. For transcript prediction and consensus protein domain motifs finding, ORF Finder (https://www.ncbi.nlm.nih.gov/orffinder/) and Conserved Domain Search [32] tools were used respectively. The Augustus sofware [33] was used for gene structure prediction in silico. Those predicted genes were compared to existing annotations in D. mojavensis and D. melanogaster species (data obtained from FlyBase database: http.//www.flybase.org, January 2019). TE intron insertions were detected using Repeat Masker software [34] in the species under study and D. mojavensis (see Supplementary file 3C). To test signatures of selection we performed McDonald and Kreitman test (MK) using DnaSP v6 software [35]. The intraspecific polymorphism was computed by aligning the piRNA pathway genes from the D. buzzatii line considered in this study, to those of the previous sequenced D. buzzatii genome [30] and then compared to D. koepferae sequences.

Quantification of Gene Transcripts by qRT-PCR
Ovaries of 5-or 6-day-old flies (from parental species or F1 hybrids) were dissected in PBT (1× phosphate-buffered saline [PBS], 0.2% Tween 20). Total RNA was purified individually for each fly's pair of ovaries with the Nucleospin RNA purification kit (Macherey-Nagel). cDNA synthesis was carried out with anchored-oligo(dT)18 primers using Roche's Transcriptor First Strand cDNA Synthesis Kit. Transcript abundance was estimated by fluorescence intensity using Biorad's iQ SYBR Green Supermix on a CFX96 BioRad Real-Time lightcycler. We performed relative quantification using the ribosomal rp49 housekeeping gene as endogenous control, with at least two technical replicates per sample. This control gene showed to be equally expressed in D.buzzatii and D. koepferae ovaries in a previous work using the same primers and stocks [25].
For each gene we used the same primer set in both species (Supplementary file 1B), designed in a conserved region and tested to have similar efficiencies in D. buzzatii and D. koepferae (Supplementary file 2A). Thus, expression rates were calculated using the comparative Ct method [36] as in [25] (supplementary file 2B). For each gene, we analyzed five sample groups: 2 maternal D. koepferae groups (crosses A and B), 2 F1 hybrid groups (female offspring from crosses A or B) and one D. buzzatii group (females of the stock, not involved in the cross).

Fluorescent In Situ Hybridization in Ovaries
Ovaries of 3-days old flies (which is the ideal age for optimal visualization of the different cells from ovaries) were dissected in PBT, following the protocol described in [37]. Antisense RNA probes for the 9 genes (see Supplementary file 4 for probes details) of the piRNA pathway, including T7 and SP6 promoter sites, were labeled by in vitro transcription of SP6/T7 using the DIG RNA Labeling Kit (Roche) and used to detect gene expression in ovaries. Hybridization signal was detected using the anti-DIG POD antibody (Roche) and fluorescence amplification (TSA PLUS Cyanine3 kit, PerkinElmer), and visualized with an Olympus Fluoview 1000 confocal scanning laser microscope.

Statistical Methods
We used IBM SPSS 22 software for statistical analyses. As the assumptions of Gaussian distribution and equal variances are not valid in qRT-PCR experiments with small sample sizes, we used the non-parametric Wilcoxon rank sum test (or Mann-Whitney test [38]) to compare expression rates between hybrids and parental species. Kruskal-Wallis test [39] was used to determine whether differences between all groups were significant. All multiple test corrections were achieved using a False Discovery Rate (FDR) threshold of 5% based on the method of Benjamini-Hochberg [40]. Additionally, Levene's test for equality of variances, was used to assess changes in variance between groups.

piRNA Pathway Gene Characterization in D. buzzatii and D. koepferae
In order to perform gene characterization and expression analyses we sequenced nine piRNA pathway genes in our parental species, D. buzzatii and D. koepferae. These genes have never been characterized in these species before-they are not annotated in the available D. buzzatii genome sequence [30], and no genome sequence has been released to date for D. koepferae. Our multiple sequence alignments (MSA) show that ago3 is the most conserved gene between both species with 95.6% of nucleotide identity in the coding sequence and rhino is the most divergent with 78% of identity (Table 1). We observe that although amino acid similarities between parental species highly differ between genes (ranging from 69% for rhino to 95.6% for piwi), gene structure is generally conserved. Indeed, the number of exons is exactly the same between D. buzzatii and D. koepferae, and does not change when compared to their closest sequenced relative, D. mojavensis (Supplementary file 3A), or even to the more distant species D. melanogaster (Supplementary file 3B).
In the case of ago3, the first intron is seven times larger in D. koepferae (2275 bp) than in D. buzzatii (321bp), likely due to transposition events. In fact, fragmented TE sequences represent 42% of the D. koepferae first intron length (including both retrotransposons and DNA transposons, see Supplementary file 3C). Even though the same intron in D. buzzatii does not carry any TE sequence, these are also present in the orthologous sequence of D. mojavensis, the closest species with a sequenced genome. Interestingly, this gene sequence is the most conserved between our parental species (Table 1).
We performed the McDonald and Kreitman (MK) test [41] to test for putative selection marks in the nine studied genes ( Table 2). The proportion of adaptive substitutions (α) is higher than 0 for all genes, indicating that they are likely under selective pressure, although only armi and the region corresponding to the PAZ domain of aub yield significant results. aub PAZ 1.60 6.30 × 10 -1 9.94 × 10 -1 5.00 × 10 -3 7.5 × 10 -1 *** CDS: coding sequence; Pn/Ps: polymorphic changes and Dn/Ds: divergent changes-s refers to neutral sites and n to non-neutral ones. NI: Neutrality Index ((Pn/Ps)/(Dn/Ds)); α: proportion of adaptive substitutions (1-NI); p: p-value after Jukes-Cantor correction. **: p < 0.01; ***: p < 0.001. MK test was performed for the complete CDS (results for all genes shown) and for each individual domain (only domains with significant results are shown). PAZ: protein binding domain found in Piwi, Argonaute, and Zwille proteins. MK test could not be performed for ago 3 due to the low gene polymorphism.

Gene Expression in Parental Species
Gene expression in parental species ovaries was studied in individual flies using a single pair of ovaries per sample. mRNAs were quantified by quantitative real time PCR (qRT-PCR) using the comparative C T method [36]. In order to achieve higher statistical power, five groups were used for measuring and comparing gene expression: One parental D. buzzatii group (Dbu, not involved in the cross), two D. koepferae maternal groups (DkoA and DkoB), used subsequently to obtain the two respective hybrid offspring groups (HybA and HybB).

Gene Expression in Hybrids
We quantified the expression of the same nine piRNA pathway genes in ovaries of hybrid females as previously described for parental species (Figure 2). The ERs values were calculated in hybrids obtained in two different crosses (HybA and HybB, see Methods) and compared to their respective maternal group (DkoA or DkoB) as well as to D. buzzatii (Dbu, females were not involved in the cross). We tested for differences in ER within groups (Dbu, DkoA, DkoB, HybA, and HybB) for the nine studied piRNA pathway genes using the Kruskal-Wallis test, and found significant results for all of them, except for piwi (see Supplementary file 5).
We then performed one-to-one comparisons between groups using the Wilcoxon sum rank test ( Table 4). The analyses revealed three possible scenarios: (a) ERs were not significantly different between hybrids and parental species-no difference scenario, (b) ERs were significantly higher in hybrids than in one of the parental species-Dbu or Dko-biased expression scenario, or (c) ERs were higher in hybrids than in both parental species-hybrid overexpression scenario (see Table 4 and Figure 2). A single gene, piwi, did not show any significant difference between hybrids and parents ( Table 4 and Figure 2F). Ago3 had a Dbu-biased expression (higher than D. koepferae, Figure 2A and

Gene Expression in Hybrids
We quantified the expression of the same nine piRNA pathway genes in ovaries of hybrid females as previously described for parental species (Figure 2). The ERs values were calculated in hybrids obtained in two different crosses (HybA and HybB, see Methods) and compared to their respective maternal group (DkoA or DkoB) as well as to D. buzzatii (Dbu, females were not involved in the cross). We tested for differences in ER within groups (Dbu, DkoA, DkoB, HybA, and HybB) for the nine studied piRNA pathway genes using the Kruskal-Wallis test, and found significant results for all of them, except for piwi (see Supplementary file 5).  We then performed one-to-one comparisons between groups using the Wilcoxon sum rank test ( Table 4). The analyses revealed three possible scenarios: (a) ERs were not significantly different between hybrids and parental species-no difference scenario, (b) ERs were significantly higher in hybrids than in one of the parental species-Dbu or Dko-biased expression scenario, or (c) ERs were higher in hybrids than in both parental species-hybrid overexpression scenario (see Table 4 and Figure 2). A single gene, piwi, did not show any significant difference between hybrids and parents (Table 4 and Figure 2F). Ago3 had a Dbu-biased expression (higher than D. koepferae, Figure 2A and Table 4), while rhino, spnE, and armi presented Dko-biased expression (higher than D. buzzatii, Figure 2B,G,H and Table 4). A total of three genes-krimp, mt2, and zuc-fell in the hybrid overexpression scenario ( Figure 2D,E,I and Table 4). In the case of aub, cross A displayed no significant differences between hybrids and parental species, whereas in cross B the hybrid expression is significantly different to the most expressed parental species, Dko (Figure 2A and Table 4). Moreover, this significance in cross B is still maintained after removing the outlier point (Figure 2A) W = 177 and p = 0.013 * (data not shown). Studying individual fly samples allowed us to test for differences in variability between groups. Using Levene's test for equality of variances, we showed that all genes had significant differences in variance within groups except piwi and rhino (see Supplementary file 6A). Only zuc and spnE genes showed higher variance values in hybrids than in both parental species (Supplementary file 6B). Indeed, they presented high ER individual variability in hybrids: they were overexpressed in some individuals and underexpressed in others, when compared to the parental median value (Figure 2).

Expression Localization Patterns in Hybrid and Parental Species Ovaries
To assess whether the observed quantitative differences in gene expression between hybrids and parents involved changes in the localization of the transcripts, we performed fluorescent in situ hybridization (FISH) in ovarian tissue in order to detect the mRNA location of the genes under study.
All genes showed expression mainly in the cytoplasm of nurse cells (see Supplementary file 7), both in parental species and hybrid ovaries. A faint expression signal can also be detected inside the nucleus of nurse cells in some cases, likely corresponding to recently transcribed mRNAs. Interestingly, transcript location for ago3 showed a different pattern between hybrids ( Figure 3E) and parents ( Figure 3C,D): a clear and strong hybridization signal was detected in the hybrid oocytes, whereas only faint signals were detected in parental species. Additionally, we found some cases in which signal intensity seems to follow the same trend as in qRT-PCR-for instance, for mt2 the expression is higher in hybrids than in parental species. However, this can only be used as a validation since FISH is not a quantitative technique.

piRNA Pathway Gene Structure is Conserved between D. buzzatii and D. koepferae
Nine piRNA pathway genes were sequenced for the first time in the parental species D. buzzatii and D. koepferae. Four of them have an exon number higher than the average in D. buzzatii genome (3.8 exons, [30]). All genes but ago3 have an identical number of exons/introns in both parental species as well as in D. melanogaster and D. mojavensis, which was expected given that 80% of intron positions are conserved across distant eukaryotes [42]. Armi and aub bear marks of positive selection (Table 2), in concordance with a previous study of RNA interference genes across the Drosophila phylogeny [43].
Although the general gene structure of the studied piRNA pathway genes is conserved among Drosophila species, ago3 caught our attention because of its low exon/intron number in our species compared to D. melanogaster (2 vs. 6 exons respectively). Ago3 has a highly variable exon number within the Drosophila genus, from a single exon in D. suzukii and D. pseudoobscura to eight exons in D. virilis [43]. This variability cannot be explained phylogenetically, as ago3 extreme exon numbers (high and low) occur in species of both the Drosophila and the Sophophora subgenus. Hence, we cannot be sure whether this variability is due to intron gain or intron loss processes. Although intron loss is predominant over intron gain in Drosophila [44], the presence of TE sequences in the species with intron-rich ago3 indicates that transposition-driven intron gain might have occurred [45]. Indeed, the predominance of intron gain has been attributed to selective pressures due to large effective population sizes [44], which would not explain a lower intron number in our species, whose population sizes are lower than in D. melanogaster [46,47].

Armitage and Aubergine Bear Marks of Positive Selection
The nine piRNA pathway sequenced genes in this study showed identity values between D. buzzatii and D. koepferae ranging between 78-95.6% for DNA and 69-95.6% for protein sequences, a rather low degree of conservation for a couple of species that diverged approximately 5 Mya [48]. This suggests that piRNA pathway genes tend to evolve quickly compared to other genes, as observed in multiple invertebrates [49] and in a previous work [27] where these genes showed protein identity values lower than the median of the proteome between our parental species. Despite the low number of sequences analyzed, we found that at least two of these genes (armi and aub) are under positive selection in our model species (Table 2), which is in concordance with previous studies showing that piRNA pathway display high rates of adaptive evolution [19,20,42]. It is important to note that in aub these selection marks were only detected in the PAZ protein domain, whereas the whole gene is affected in armi. Because some domains are shared by different piRNA pathway genes (e.g., the PAZ domain), and positive selection marks were not observed in all of them, we deduced that adaptation could be gene-specific rather than domain-specific. Several studies have suggested that the degree of gene adaptive evolution is correlated with the position of the corresponding protein in the interaction network [42,50]. In the piRNA pathway, the fastest evolving components of piRNA pathway do not usually correspond to effector proteins [51]. In our case this is true for ago3 and piwi (that are effector proteins with no significant positive selection marks) but not for aub, which shows a greater effect of positive selection in its PAZ domain. In concordance with our results, armi has a general trend to show positive selection marks in different and independent tests [52]. Signatures of adaptation are a pervasive effect in genes affecting piRNA synthesis, although this high evolution rate is not only restricted to this pathway: most of the genes related to RNA interference pathways have also been reported to display high rates of adaptative evolution [49]. Besides, as these genes participate in TE silencing, it is important to take into account the evolutionary process of host-pathogen interaction or "Red Queen" host-pathogen arms race [52]. This rapid evolution of the piRNA pathway genes is a key process in species divergence and can easily generate orthologous incompatibility after hybridization barrier [20].

Misexpression of piRNA Pathway Genes in D. buzzatii-D. koepferae Hybrids
The combination of divergent Drosophila genomes during hybridization results in a genomic shock characterized, inter alia, by a TE deregulation [22,23,25,26] caused by a failure of TE silencing [20,27,53]. There is limited information linking TE deregulation with expression failures in piRNA pathway genes [27], well-known by their important role in germinal TE regulation. Our study has quantified the individual expression, in ovaries, of nine key piRNA pathway genes in D. buzzatii, D. koepferae and their F1 hybrids. We observed that the median expression values in hybrids tend to be higher than at least one parental species in all genes except aub, which codes for an effector protein ( Figure 2). However, hybrid expression values were only significantly higher than in both parents for krimp, mt2, and zuc. For the four other genes (ago3, rhino, spnE, and armi), hybrid expression was only significantly higher than the parental species with the lowest expression. Finally, piwi expression was not significantly different between hybrids and parents, while aub presented different results between crosses. These results do not completely match the previous RNA-seq study in the same species, in which most genes had expression levels similar to the parental species with the highest expression [27]. These differences likely lie in the fact that here we analyzed each ovary pair individually whereas the previous study pools the ovaries resulting from different hybrid crosses. It is worth highlighting that different individuals (both within parents and hybrids) also showed a high variability of piRNA pathway gene expression reaching differences of up to 2.5-fold.
zuc and spnE are the only genes that displayed higher inter-individual variability in hybrids compared to both parental species (see Supplementary file 6). However, because hybrids are known by their high genome instability, these differences could be due to stochastic genetic and epigenetic changes that do not involve the meiotic process, as suggested by other authors [54]. These results are in concordance with previous studies in hybrids showing high individual and cross heterogeneity in transposition [22] and expression rates [25] of the retrotransposon Osvaldo. In the same way, expression studies on the retrotransposon Helena showed additive patterns of expression in hybrids compared to parental species when using pooled flies in qRT-PCR experiments, while FISH experiments in individual flies showed a more extensive presence of Helena transcripts in F1 hybrids compared to parental species [26]. In the present study, the transcripts of the studied genes were mainly localized in the ovarian nurse cells' cytoplasm in both hybrids and parental species. For Ago3, a strong transcript signal was also detected in the oocyte cytoplasm of hybrid ovaries, while only a faint signal was detected in parental ones. It is known that ovarian nurse cells transfer mRNAs and proteins into the oocyte for the production of the egg and early embryo [55]. However, ago3 is the only gene that seems to be more expressed in the oocyte of hybrids than of parental species, which might be due to an activation of the piRNA pathway to counteract TE deregulation in hybrids, or to an abnormal localization of the transcripts due to hybrid incompatibilities. Indeed, abnormal distributions of tissue expression were previously reported in Drosophila hybrids [26].
Several studies showed that interspecific hybrids tend to present TE derepression compared to parental species [53,[56][57][58][59]. However, repression cases have also been observed for some TEs, pointing out a more complex alteration of the TE regulation network. Our results show that nine piRNA pathway genes have a non-uniform expression pattern between hybrids, and that three genes-krimp, mt2, and zuc-are overexpressed in hybrids compared to parental species. Intuitively, we could think that hybrid TE derepression might be preceded by the underexpression of regulatory genes. However, the overexpression of some piRNA pathway genes could be a genomic response to the stress caused by TE mobilization during interspecific hybridization events, to counteract harmful effects on the cell. Indeed, although a reduction of the ping-pong cycle efficiency seems to occur in hybrids for Helena-specific piRNAs [26], the general trend for whole-genome TEs [27] is to show additive or higher ping-pong signature levels in hybrids than in parental species. In the same way, non-deficient amounts of total piRNA were observed also observed in our previous studies [27].
All in all, the most plausible hypothesis to explain TE deregulation in D. buzzatii-koepferae hybrids is the functional divergence between parental piRNA pathways, especially in terms of piRNA production efficiency. Indeed, here we show that some piRNA pathway genes evolve under positive selection and show lower conservation than expected in species that diverged 4.5 Mya. These results are in agreement with our previous transcriptomics study, in which we showed that most piRNA pathway proteins (predicted in silico) have identity percentages between D. buzzatii and D. koepferae lower than the median of the whole proteome [27]. The accumulated divergence between piRNA pathway proteins has also been proposed to explain TE deregulation in D. melanogaster-D. simulans hybrids [53] and was recently attributed to the lack of Rhino and Deadlock protein binding in hybrids [20].
Still, further research is needed for a better understanding of TE deregulation in interspecific hybrids, including studying how the amount of effector proteins affects the piRNA pathway breakdown, as well as whether and how epigenetic changes (such as histone methylation) are involved in TE deregulation.

Conclusions
Genomic stress caused by interspecific hybridization, induces TEs misregulation in Drosophila where piRNA pathway genes can play an important role. In this study, we characterized and quantified the expression of nine piRNA pathway genes in D. koepferae and D. buzzatii species, together with their interspecific hybrids. We showed that at least two of these genes (armi and aub) are under adaptive selection, despite being closely related species. Hybrid ovaries showed deregulation of some piRNA pathway genes compared to parental species and a trend to the overexpression in krimp, mt2, and zuc. This result, together with the observation of a non-deficient amount of piRNAs in hybrids in previous studies, reinforces the idea that the overexpression is a cellular response to mitigate hybrid stress. Therefore, the TE deregulation in hybrids might be due, at least in part, to protein incompatibility due to the rapid evolution of some of the genes under selective pressure such as armi and aub. Funding: This work was supported by the research grants CGL2013-42432P and CGL2017-89160P from the Spanish Ministerio de Ciencia e Innovación to the Grup de Genòmica, Bioinformàtica i Biologia Evolutiva (Principal Investigator: Mauro Santos) and grants 2014SGR 1346 and 2017SGR 1379 from Generalitat de Catalunya. VGV and VRS were supported by a PIF predoctoral fellowship from the Universitat Autònoma de Barcelona (Spain). JMC was supported by the fellowship "Beca de colaboración de estudiantes en departamentos universitarios" from the Ministerio de Educación y cultura (Spain).