Flaviviruses are small, single-stranded positive-sense RNA viruses that are typically transmitted between arthropod vectors and vertebrate hosts. They are endemic in tropic and sub-tropic regions and represent a global health threat, although humans are considered dead end hosts in many cases.
The genus Flavivirus
within the Flaviviridae
family comprises more than 70 species, which are organized into four groups, each with a specific host association: Mosquito-borne flaviviruses (MBFVs) and tick-borne flaviviruses (TBFVs) spread between vertebrate (mammals and birds) and invertebrate (mosquitoes and ticks) hosts, whereas insect-specific flaviviruses (ISFVs) replicate specifically in mosquitoes and no-known-vector flaviviruses (NKVs) have only been found in rodents and bats, respectively. This natural host-range-based classification is in good agreement with sequence-based phylogenetic clustering, mainly because all flaviviruses share a common genome organization [1
]. Conversely, epidemiology, disease association [2
] and transmission cycles [3
] are fundamentally different among different flavivirus groups.
Emerging and re-emerging MBFVs such as Dengue virus (DENV), Japanese encephalitis virus (JEV), West Nile virus (WNV), Yellow fever virus (YFV) or Zika virus (ZIKV) are the causative agents of large-scale outbreaks that result in millions of human and veterinary infections every year [4
]. Likewise, tick-borne encephalitis virus (TBEV), Powassan virus (POWV) and other members of the tick-borne serocomplex are neuropathogenic agents that cause a large number of infections every year, resulting in a massive incidence increase since the 1970ies [5
]. Consequently, much research effort has gone into studying MBFV and TBFV biology, biochemistry and phylogeny [6
]. The two remaining groups, ISFVs and NKVs, however, have received limited attention in the research community, mainly because they are generally not associated with human or veterinary disease and therefore are still underrepresented in the literature. The phylogenetic relationship among the four ecological flavivirus groups is shown in Figure 1
. Table A1
lists all viral species considered in the present study.
TBFVs form a monophyletic group consisting of a single serocomplex, although pathology and clinical manifestations vary among different viruses. They comprise more than a dozen of recognized species and separate into three groups: Mammalian tick-borne flaviviruses (M-TBFV), seabird tick-borne flaviviruses (S-TBFV) and the Kadam virus group. See [7
] for a comprehensive review.
ISFVs naturally infect hematophagous Diptera
and are typically divided into two groups [8
]: Classical insect-specific flaviviruses (cISFVs) naturally infect mosquitoes and excursively replicate in mosquito cells in vitro. They form a phylogenetically distinct clade among known flaviviruses, appearing at the root of the MBFV, TBFV and NKV branches. The cISFV group separates into two clades, one associated with Aedes
spp. mosquitoes and the other associated with Culex
spp. mosquitoes, respectively [9
]. They lack the ability to infect vertebrates and to replicate in vertebrate cell lines and have not been in the research spotlight until very recently. The second group is comprised of arbovirus-related or dual-host affiliated insect-specific flaviviruses (dISFVs), which represent a non-monophyletic group which is phylogenetically and antigenically related to mosquito/vertebrate flaviviruses, although they do not appear to infect vertebrate cells [10
]. Insect-specific viruses play a crucial role in the mosquito microbiome and have been shown to modulate the replication of other arboviruses [11
]. In this line, they are currently considered as biological control agents and vaccine platforms [12
NKVs represent an ecologically and phylogenetically diverse set of viruses which have been isolated exclusively from vertebrates (mainly bats and rodents), without evidence for transmission by arthropod vectors. They form a non-monophyletic group among flaviviruses and are typically divided into bat- (B-NKV) and rodent-associated (R-NKV) groups, see Table A1
. B-NKVs can be further separated into Entebbe virus group, which is phylogenetically closer to MBFVs, and Rio Bravo virus group, which is a sister clade to TBFVs. Species in the R-NKV group form the Modoc virus group, which is phylogenetically close to the B-NKV Rio Bravo group [13
]. While NKVs are poorly characterized they represent a valuable resource to study evolutionary traits related to host-switch capacity mediated by conserved genomic elements.
Conserved RNA Structures Mediate Pathogenesis
Conserved RNA structures in the untranslated regions (UTRs) of RNA viruses are of particular interest because they mediate the viral life cycle by promoting or enhancing replication, as proposed for elements in both 5
] and 3
]. Mosquito/vertebrate viruses must operate efficiently in vectors and hosts, phylogenetically distinct organisms with different cellular machineries. This requires a high level of flexibility of viral regulatory elements to evade various antiviral response strategies while assuring proper replication conditions required for maintaining a stable quasispecies population. To achieve this resilience in host adaptation, RNA duplication strategies have been proposed as an evolutionary trait for MBFVs [19
]. Tandem RNA structures within DENV 3
UTR are under different selective pressures in alternating hosts, suggesting the idea that duplicated RNA structures differentially evolved to accommodate specific functions in the two hosts [20
]. Likewise, there is evidence for evolutionary pressure on maintaining the primary sequence of parts of duplicated RNA elements, as recently shown for flaviviral dumbbell (DB) elements in the context of finding a biophysical model for explaining a possible route for ZIKV-induced neurotropism [21
Viral RNA genomes are different from procaryotic and eucaryotic mRNA. In addition to coding for and regulating the viral machinery, viral genomic RNA (gRNA) exhibits functional regions that act upon different stages of the viral life cycle. The 10-12kB flaviviral gRNA is capped, but non-polyadenylated and encodes a single open reading frame (ORF). Upon translation, a polyprotein is produced, which is then cleaved by viral and cellular enzymes into structural and non-structural proteins [22
]. The ORF is flanked by highly structured untranslated regions (UTRs), which contain evolutionary conserved RNA elements that are crucially related to regulation of the viral life cycle, thereby inducing processes such as genome circularization, viral replication and packaging [23
Upon flavivirus infection, accumulation of both gRNA as well as viral long non-coding RNAs (lncRNAs) is observed. These lncRNAs, which have been referred to as subgenomic flaviviral RNAs (sfRNAs) [26
] are stable decay intermediates produced by exploiting the host’s mRNA degradation machinery [27
] and are associated with viral replication, pathogenesis and cytopathicity [28
]. The production of sfRNA is induced by partial degradation of viral gRNA by the 5
exoribonuclease Xrn1, an enzyme associated with the cell’s RNA turnover machinery [30
]. Mechanistically, sfRNAs are generated by stalling Xrn1 at conserved structural elements in the viral 3
UTR, termed xrRNA (exoribonuclease-resistant RNA elements). These structures efficiently stall Xrn1 from progressing through from the 5
direction, thus protecting the downstream RNA from degradation, while pass-through in the 3
direction, as required for viral RNA-dependent RNA-polymerase is still possible [32
]. In particular, different types of stem-loop (SL) and dumbbell (DB) elements found in many MBFVs and TBFVs have been related to quantitative protection of downstream virus RNA against exoribonuclease degradation [33
Xrn1 stalling results in dysregulation of cellular function with the aim of promoting viral infections. In this regard, functions of sfRNA in modulating cellular mRNA decay and RNAi pathways [34
] as well as modulating anti-viral interferon response [35
] have been reported.
The genomic architecture of flaviviruses has been extensively studied to understand the molecular principles required for sfRNA production. Chemical and enzymatic probing methods [37
], together with x-ray crystallography revealed the 3
UTR structure of the MBFVs WNV [38
], YFV [39
], DENV [40
], Murray Valley encephalitis virus (MVEV) [41
], ZIKV [42
] and recently different species of the TBFV and NKV groups [33
], highlighting the possibility that exoribonuclease resistance might be a pervasive mechanism of the viral world. Interestingly, several conserved RNA structural elements in viral 3
UTRs have been predicted in our group [43
], some of which have later been attributed to xrRNA functionality [26
]. To further expand the set of potential xrRNAs, we report here on a comparative genomics survey aimed at characterization of evolutionary conserved RNA structures in flavivirus 3
UTRs, focusing on TBFVs and the hitherto understudied groups of ISFVs and NKVs. A detailed study on the evolutionary traits of conserved RNAs in MBFV 3
UTRs will be published elsewhere.
Several flaviviruses have previously been studied in great detail, yielding a varied landscape of repeated RNA sequence and structure elements within the 3
UTRs of these viruses, which are likely to have evolved from numerous duplications [19
]. Many of these studies relied on single sequence predictions, which resulted in a good understanding of both structure and genomic position of conserved elements in individual species. A unified picture of homologous RNAs within the 3
UTRs of flaviviruses, however, has not been available.
The comparative approach applied in the present study outperforms single sequence predictions by considering consensus structures formed by all sequences. This allows us not only to confirm previously described RNA structures but also to elucidate hitherto unrecognized tandem repeats in many species. In this line our results can help in understanding the complex evolution of flavivirus 3UTRs.
3.1. Construction of Seed Alignments
Based on recent experimental evidence for the existence of xrRNAs in TBFVs, ISFVs and NKVs, and previously characterized conserved RNA elements in flaviviral 3
UTRs, we built seed alignments for initial CMs, which were then refined iteratively, i.e., subjected to multiple rounds of screening and incorporation of best hits into the CM. Likewise, candidate structures from RNALalifold
calculations were used as seeds for identification of conserved RNA structures. Figure 2
shows an overview of refined consensus structures for each ecologic group of flaviviruses analyzed here.
The four ecologic groups of flaviviruses show a varied 3
UTR architecture, however, the terminal 3
stem-loop structure (3
SL, also referred to as 3
long stable hairpin, 3
LSH) has been shown to be associated with panhandle-formation during virus replication and is therefore present in the terminal region of all flaviviruses [14
]. The element is listed in Rfam as RF00185 (Flavivirus 3
UTR cis-acting replication element, Flavi_CRE) and we could use it to consistently identify terminal regions within 3
UTRs. Absence of this element from a UTR sequence is indicative of incomplete or truncated sequence data. The underlying sequences generally form a stable stem-loop structure upon structural alignment and single sequence folding. We built individual 3
SL seed alignments and CMs for each ecologic group, termed T.3SL, N.3SL and I.3SL, respectively.
3.2. Tick-Borne Flaviviruses
MacFadden et al. [33
] suggested two different exoribonuclease-resistant structures in TBFV 3
UTRs. We used the proposed sequences from TBEV, POWV, Karshi virus (KSIV), Langat virus (LGTV), Louping ill virus (LIV), Omsk hemorrhagic fever virus (OHFV) and Alkhumra hemorrhagic fever virus (ALKV) as templates for a set of initial structural alignments and CMs. These models were then employed to search for high confidence hits within all TBFV 3
UTRs to construct seed alignments of the two exoribonuclease resistant structures in TBFVs, termed T.xrRNA1 and T.xrRNA2. These models allowed us to construct highly specific CMs for both TBFV xrRNAs, which were subsequently used to annotate xrRNA instances in already studied and previously unstudied TBFV species (Figure 3
The full structural alignment of the 3
UTRs of selected tick-borne species moreover suggests a short stem-loop element in several species, which is characterized by high sequence heterogeneity but heavily conserved structure supported by multiple covariations. Evidence for this element, termed stem-loop 6 (SL6), has been reported earlier for at least TBEV, LGTV and OHFV [58
]. We kept this nomenclature and identified the exact position in each TBFV 3
UTR (Figure 3
Our data further shows that both TBFV xrRNA CMs (Figure 2
a), as well as NKV xrRNA CMs (Figure 2
b and Section 3.5
) consistently yield plausible hits with a high degree of structure conservation immediately upstream of the strongly conserved terminal stem-loop element. Existence of a Y-shaped element (termed Y1) and putative similarity to NKVs has been proposed earlier based on single sequence structure predictions [58
]. Structural locARNA
alignment and subsequent RNAalifold
consensus structure prediction indicates strong secondary structure conservation with frequent structure-conserving sequence covariations. Taken together, this suggests good evidence that respective regions in TBFVs harbor a putatively structured and functional xrRNA-like RNA (Y1, Figure 3
Despite the differences in length and sequence composition, the 3
UTRs of most species in the TBFV group share a common architecture. Similar to MBFV SL-elements [19
], two copies of xrRNAs are found in almost every species of this ecologic group, generally succeeded by one instance of SL6 and Y1. Likewise, the terminal 3
stem loop is conserved in all TBFVs and can be reliably annotated by both our CM, T.3SL, and the Rfam Flavi_CRE model, which is used in Figure 3
. Among all investigated species, only ALKV, OHFV and Kyasanur forest disease virus (KFDV) do not have a copy of xrRNA1, indicating that these viruses may have previously lost this element. Conversely, the two seabird-associated TBFVs with available 3
UTR data, Tyuleniy virus (TYUV) and Kama virus (KAMV) do not fit into this general scheme. Likewise, we were not able to annotate additional homologous or conserved structures with any CM used in this screen in the variable region of the 3
UTR of TBEV [5
], despite the substantially longer UTR (+300 nts).
3.3. Classic Insect-Specific Flaviviruses
Classic insect-specific flaviviruses present diverged 3
UTR architectures, which likely result from the association of different species to Aedes
spp. and Culex
spp. vectors, respectively, which are also reflected by clade separation in the ISFV phylogenetic tree. Previous studies employed single sequence predictions to propose a varied set of homologous RNA structures in combination with an unusually large number of duplicated sequence signals [59
]. Recent experimental evidence, however, suggests the presence of xrRNAs that have a similar fold to those known from MBFVs in cISFVs. Consequently, we set out to independently characterize conserved RNA elements for different subclades.
3.3.1. Exoribonuclease-Resistant RNAs in Aedes-Associated cISFVs
MacFadden et al. [33
] used SHAPE structure probing to report the presence of two exoribonuclease-resistant RNAs in Cell fusing agent virus (CFAV), and provided evidence for a duplicated set of homologous structures in Aedes flavivirus (AEFV) and Kamiti river virus (KRV). We constructed initial alignments from the reported sequences in this clade in Aedes
spp. associated viruses, resulting in two seed alignments, termed I.xrRNA1 and I.xrRNA2 (Figure 2
c). For both elements, seed CMs were iteratively built from structural locARNA
alignments. Minor manual adjustments to the alignments were required here, since the predicted consensus structures diverged slightly from the published SHAPE-guided prediction. Both models were then employed to search for additional high confidence hits within other isolates of CFAV, AEFV and KRV which were subsequently added to the seed alignments.
Screening the entire set of flavivirus 3
UTRs revealed that both ISFV xrRNA elements, I.xrRNA1 and I.xrRNA2, are only found in CFAV, AEFV and KRV, i.e., species initially used for the construction of the respective CMs. Furthermore, also in terms of pure structural conservation, no reliable hits in any other ISFV species could be obtained with any of these CMs (Figure 4
a). This suggests that both ISFV xrRNAs may represent a specialized class of xrRNA elements only present in CFAV, AEFV and KRV. The 3
UTR of KRV is unique among all known flaviviruses because it harbors an additional copy of the terminal 3
stem-loop element 600 nts upstream of the actual 3
-terminus, supporting previous reports that the KRV 3
UTR has undergone a full duplication during its evolution [60
3.3.2. Conserved Structures in Culex-Associated cISFVs
The second distinct clade of cISFVs includes Culex flavivirus (CxFV), Quang Binh virus (QBV), Mosquito flavivirus (MSFV), Palm Creek virus (PCV), Culex theileri flavivirus (CTFV) as well as a few other species with only partial genome sequence availability [9
] and is associated with Culex
spp. vectors. An interesting observation in this clade is that no other CM from any of the four ecologic flavivirus groups shows a hit, not even with remote sequence or structure conservation. We therefore set out to produce a a high quality structural alignment of the complete 3
UTRs of CxFV, QBV and MSFV. Consensus structure folding of the full alignment revealed each species to harbor 3–4 repeats of two highly conserved elements supported by multiple co-varying base pairs (Figure 4
b,c). We termed these “Repeat element a/b”, respectively (Ra and Rb). Both elements, while strongly conserving their folds, show highly variable loop regions as well as weak sequence conservation in the case of the Ra element. Structure conservation and occurrence in multiple copies, as typically seen with other exoribonuclease-stalling elements, hints towards possible functional importance. These results complement earlier reports of sequence repeats in the 3
UTR of CxFV [61
] with the identification of a quadruplicated pair of conserved structures.
3.3.3. Diverged 3UTR Architecture in Many cISFVs
Interestingly, a screen of all available CMs in Parramatta River virus (PaRV) revealed five xrRNA-like elements (Figure 4
a), with elements 1–3 sharing sequence and structure properties with NKV xrRNAs (Section 3.5
), while elements 4 and 5 only conserve N.xrRNAs structure. All five hits can be structurally aligned into a consistent consensus structure (Figure 4
d), despite the overall weak sequence consensus.
Conversely, the 3UTRs of Calbertado virus (CLBOV) and Mercadeo virus (MECDV) appear structurally different from the other cISFVs. A general lack of characteristic CM hits lets these species appear more like an outgroup among cISFVs. In particular, we could only find significant hits for the omnipresent terminal 3stem-loop structure, a putative xrRNA-like element in CLBOV and a single instance of a structure homologous to the Rfam model RF00465 (Japanese encephalitis virus hairpin structure) in MECDV. Still, limited availability of 3UTR sequence data renders the characterization of conserved elements and interpretation difficult here.
Our data suggests that the 3
UTRs of cISFVs, in contrast to TBFVs (Section 3.2
and dISFVs (Section 3.4
), do not appear to have a consistent architectural organization. In agreement with the cISFV phylogenetic subtree (Figure 1
) we constitute three diverged groups with common 3
UTR organization that conform to their respective sub-clades: (i) CFAV-AEFV-KRV, each with two instances of xrRNAs, (ii) CxFV-QBV-MSFV with 3–4 copies of I.Ra/I.Rb elements and (iii) PaRV with 4–5 copies of xrRNA-like structures. Although no full 3
UTR sequences are available for the phylogenetically closest relatives of PaRV, HANV and OCFVPT, an xrRNA-like element in the small available fragment (syntenic to PaRV UTR) of OCFVPT 3
UTR suggests that both viruses might be organized in a similar manner, as supported by earlier reports that these viruses should be classified within the same species [1
]. For CLBOV and MECDV, no clear pattern of conserved elements can be identified with our CMs. Both viruses either employ an entirely different class of elements or might not require capability for exoribonuclease stalling at all. The only element shared universally among all cISFVs is the 3
-terminal stem-loop, although cISFVs seem to diverge from other flaviviruses here, indicated by the inability of Rfam model RF00185 (Flavivirus CRE) to reliably annotate any cISFV 3
3.4. Dual-Host Affiliated Insect-Specific Flaviviruses
Isolated almost exclusively from mosquitoes, dISFVs do not seem to infect vertebrate cells, despite their phylogenetic proximity to MBFVs (Figure 1
). This association is reflected by good hits of the Rfam covariance models RF00525 (Flavivirus DB element) and RF00465 (Japanese encephalitis virus hairpin structure) in all dISFV isolates studied here (Figure 5
). Interestingly, we could not find evidence for any sequences or structures homologous to tick-borne or other insect-specific flaviviruses. In this line, our data is in good agreement with the phylogenetic location of these viruses, which share ancestral roots with MBFVs [10
An unusual species within this group is Ecuador Paraiso Escondido virus (EPEV), which has been isolated from New World sandflies and has been classified as insect-specific virus. EPEV phylogenetically appears at the root of the Entebbe bat virus group (ENTVG), a clade comprised of the three NKVs Entebbe bat virus (ENTV), Sokoluk virus (SOKV) and Yokose virus (YOKV). While all of these viruses contain homologs of conserved stem-loop (SL) and dumbbell (DB) elements found in MBFVs, ENTVG species may have lost their vector dependence [1
3.5. No-Known-Vector Flaviviruses
Rather than forming a monophyletic group, the no-known-vector flaviviruses can be separated into two distinct lineages, which are closely related to either TBFVs or MBFVs, respectively (Figure 1
). Two additional NKVs, Tamana bat virus (TABV) and Cyclopterus lumpus virus (CLuV), are phylogenetically distant and serve as an outgroup to all flaviviruses. In analogy to the procedure outlined above for TBFVs (Section 3.2
) and ISFVs (Section 3.3
), we built a CM for experimentally verified xrRNAs in tick-borne related NKVs, termed N.xrRNA.
We found multiple hits of this CM at various loci within the 3
UTRs of tick-borne related NKVs, indicating that these species, in contrast to TBFVs, do not conserve a common 3
UTR architecture (Figure 6
a). Surprisingly, we could identify several high-quality hits of the Rfam model RF00525 (Flavivirus DB element), an element typically found in MBFVs, in Rio Bravo virus (RBV), Montana myotis leukoencephalitis virus (MMLV) and Modoc virus (MODV). This is in so far remarkable as there is no evidence for conservation of this element in TBFVs, which phylogenetically cluster with this clade of NKVs. This element might have been introduced by an ancestral recombination event. Alternatively, conservation of an MBFV element in NKVs might be indicative of an association with an unknown vector, in agreement with the hypothesis that vector specificity is mediated by characteristic 3
UTR elements [19
Conversely, there seems to be no generally conserved 3
UTR architecture among members of the mosquito-borne related NKVs (Figure 6
c). While sequence data has not been available for Sokoluk virus (SOKV), we could annotate typical MBFV elements in the next relatives Entebbe bat virus (ENTV) and Yokose virus (YOKV), as proposed previously [32
3.6. A Generalized xrRNA Structure
Earlier work suggested that xrRNAs from TBFVs and tick-borne related NKVs fall into a more general structural class of xrRNAs [33
]. Following this line of reasoning, we investigated whether all high confidence hits obtained with our TBFV and NKV CMs could be assembled into one coherent CM that conserves the xrRNA-typical fold. A further advantage of a generalized CM would be higher sensitivity, allowing for identification of common features and eventually lead to annotation of previously unannotated xrRNAs.
Structural alignment and consensus structure prediction revealed all high confidence hits to fold into a common secondary structure (Figure 7
a,b). While most of the consensus structure is characterized by low sequence conservation, stem 3 (S3) and loop 1 (L1) show medium to high degree of sequence conservation. The length of all stems is well conserved, although both major loop regions L2 and L3 show large fluctuations, with the length of L3 being de facto constant and L2 showing a high degree of flexibility.
We further investigated whether any high confidence hits from I.xrRNA1/2 in cISFVs could be aligned to the generalized xrRNA model. Although both cISFV xrRNAs (Figure 2
c) bear some similarity to the generalized model, in particular to S3 and L3, we were not able to build a common alignment or consensus structure. Despite seemingly similar shape, our data also suggests that cISFV xrRNAs form a separate xrRNA subclass, unrelated to MBFV xrRNAs. In particular, we could not obtain hits of Rfam CMs (which can be seen as representatives of MBFV elements) in cISFVs, nor could we confirm any hits of cISFV-specific elements in MBFVs.
In addition to learning xrRNA features, a more generalized CM enabled us to detect xrRNA-like structures (indicated as such in all annotation plots) that could not be found previously.
Mediated gRNA decay in the form of exoribonuclease resistance seems to be a pervasive strategy employed by viruses to circumvent host immune responses. Evidence of sfRNA production following incomplete Xrn1 degradation has not only been observed in different members of the Flavivirus
], but also in other species of the Flaviviridae
family, however, with major differences in xrRNA structure and sfRNA characteristics. While MBFV produce a 300-500nt sfRNA that corresponds to degradation products of the gRNA 3
UTR, hepaciviruses and pestiviruses produce a long subgenomic RNA whose 5
end is located within the first 130nt of the viral gRNA [63
Moreover, recent studies have identified xrRNA functionality in several phylogenetically distant RNA viruses, such as animal-infecting, segmented viruses of the Bunyaviridae
] families, as well as plant-infecting viruses of the Tombusuviridae
]. The interesting question whether exoribonucleases other than Xrn1 would be blocked as well has recently been answered. MacFadden et al. [33
] could show that both RNAse J1 and Dxo1 are stalled by MBFV xrRNAs, thereby demonstrating the general nature of this structure-induced blocking mechanism. These novel findings, together with previous knowledge of Xrn1 stalling in segmented plant viruses [67
] provide evidence for a convergent evolution scenario where xrRNAs depend on a specific folded RNA structure and form a distinct class of functional RNAs.
Repeated RNA elements appear to be a hallmark of flavivirus 3UTR architecture. While there seems to be a plethora of conserved structure classes, our data emphasizes the consistent trend that these elements typically do not occur as single copies. Rather, duplicate or even multiple occurrences of these elements hint towards functional relevance. This is further underlined by the evolutionary conservation of both patterns and elements among different species. Since, exoribonuclease stalling is presumably never perfect, it makes sense that viruses might employ multiple copies of such elements.
In this contribution we set out to identify homologs of known exoribonuclease stalling elements and novel conserved structures. To this end, we computationally characterized homologs of experimentally verified xrRNA in tick-borne and no-known vector viruses that seem to form a coherent class of RNA structures with capability to stall exoribonucleases (T.xrRNA1/2, N.xrRNA). Likewise, we identified another class of xrRNAs in classic insect-specific flaviviruses (I.xrRNA1/2) which appears to be only distantly related to the former class. In the same line, we predicted a set of novel conserved elements in cISFVs that appear in quadruples and do not coincide with other insect-specific elements (I.Ra,I.Rb).
While we did not focus on studying the evolutionary history of these groups of elements in detail, our data suggests that many elements share ancestral roots. This is supported by the observation that at least the tick-borne, no-known-vector and Aedes spp. associated xrRNAs fold into a similar Y-shaped substructure, although the exact fold varies significantly among individual species.
We compiled a set of covariance models that can be used for rapid screening assays in the identification and characterization of novel flaviviruses. All models are available from GitHub via https://github.com/mtw/ITNFV-Data
A major problem is the limited availability of diverse 3UTR sequence data for many viruses analyzed here, particularly within cISFVs. Many novel ISFVs have previously been discovered, but 3UTR sequences were only available for a subset of them. Future studies are required to shed more light on the evolutionary history of 3UTR evolution in flaviviruses.