Characterisation of the RNA Virome of Nine Ochlerotatus Species in Finland

RNA viromes of nine commonly encountered Ochlerotatus mosquito species collected around Finland in 2015 and 2017 were studied using next-generation sequencing. Mosquito homogenates were sequenced from 91 pools comprising 16–60 morphologically identified adult females of Oc. cantans, Oc. caspius, Oc. communis, Oc. diantaeus, Oc. excrucians, Oc. hexodontus, Oc. intrudens, Oc. pullatus and Oc. punctor/punctodes. In total, 514 viral RNA-dependent RNA polymerase (RdRp) sequences of 159 virus species were recovered, belonging to 25 families or equivalent rank, as follows: Aliusviridae, Aspiviridae, Botybirnavirus, Chrysoviridae, Chuviridae, Endornaviridae, Flaviviridae, Iflaviridae, Negevirus, Partitiviridae, Permutotetraviridae, Phasmaviridae, Phenuiviridae, Picornaviridae, Qinviridae, Quenyavirus, Rhabdoviridae, Sedoreoviridae, Solemoviridae, Spinareoviridae, Togaviridae, Totiviridae, Virgaviridae, Xinmoviridae and Yueviridae. Of these, 147 are tentatively novel viruses. One sequence of Sindbis virus, which causes Pogosta disease in humans, was detected from Oc. communis from Pohjois-Karjala. This study greatly increases the number of mosquito-associated viruses known from Finland and presents the northernmost mosquito-associated viruses in Europe to date.

Forty-three species of mosquitoes are recorded from Finland, belonging to the genera Aedes, Aedimorphus, Culex, Culiseta, Dahliana and Ochlerotatus [11]. Some species have rarely been encountered during recent or historical collections, but most have been reported as human-biting either in Finland or in neighbouring countries [11,12]. Species of the genus Ochlerotatus are most numerous, with 23 recorded from across Finland, but distributions [...] −80 °C prior to the study. Mosquitoes were identified over dry ice using morphological keys [15,16] and then either (i) pooled by species, or (ii) stored individually in 1.2 mL collection microtubes (QIAGEN, Venlo, The Netherlands). Of the 18,976 deep-frozen specimens, 14,092 were collected as adults, of which 13,927 were females, and 11,835 were adult female Ochlerotatus. A subset of 2333 of these deep-frozen adult female specimens was chosen for inclusion in this study (see below). Notes were made if any specimens were visibly engorged with blood, or if they had ectoparasites (acarid mites). See Table 1 for the pool numbers, mosquito species and collection dates, and Table A1 for the viruses found at each location.

Pooling and Homogenisation
Pools were constructed using identifiable females of commonly encountered human-biting Ochlerotatus, by species, collection location and collection date (Figure 1, Table 1). Rare species with fewer than 16 specimens were not considered, nor were specimens found in low numbers across several collection sites and years, so that location and temporal data would not be confounded in the results. Because these species are difficult to identify when their scales have been rubbed off, 2176 specimens were excluded at the outset, as they were either unidentified or their identification could not be confirmed. To suit the available resources, 2333 females belonging to nine species, collected in May–August 2015 and July–August 2017, that met these criteria were divided into 91 pools, as follows: Oc. cantans (n = 1), Oc. caspius (n = 11), Oc. communis (n = 35), Oc. diantaeus (n = 6), Oc. excrucians (n = 3), Oc. hexodontus (n = 8), Oc. intrudens (n = 14), Oc. pullatus (n = 2) and Oc. punctor/punctodes (n = 11) (Table 1). Pools varied in size from 16 to 60 whole individuals, with most later pools comprising 20 specimens. Females that were noticeably blood-fed or gravid, or which had one or more ectoparasites, were maintained in individual tubes for homogenisation. Pools were assigned a running number corresponding to the date on which they were processed, from FIN/L-2018/001 to FIN/VS-2018/100 (Table 1). Most pools comprised mosquitoes from a single collection site, but several contained specimens from up to three locations. In these few cases, specimens were pooled from the same region and had been collected within a few days of each other.
Table 1. Details of the 91 mosquito pools included in this study by collection site (see Figure 1). Pools shaded grey were made up of specimens from more than one collection. Where several collections were combined, the "number of specimens from a collection/total number of specimens in the pool" is given.
Individually stored specimens were homogenised in microtubes with 100 µL of Dulbecco's phosphate-buffered saline (PBS) + 0.2% bovine serum albumin (BSA), sterile sand and a 3 mm tungsten carbide bead (QIAGEN, Venlo, The Netherlands). After homogenisation, the tubes were centrifuged at full speed for 5 min at 5 °C. Subsequently, 50 µL of supernatant from each specimen was combined in a "super pool". For pre-pooled mosquitoes, 1.8 mL of Dulbecco's PBS + 0.2% BSA was added to each 2 mL tube, with a 5 mm tungsten carbide bead. These were homogenised using the QIAGEN TissueLyser II for 2 min at full speed, then centrifuged at 5 °C for 5 min. From each of the 91 pooled mosquito homogenates, aliquots were taken for next-generation sequencing (NGS).
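As a quick consistency check on the pooling scheme described above, the per-species pool counts can be tallied against the stated totals. This is only an illustrative sketch: the numbers are copied from the text, while the function name `check_pooling` and the dictionary layout are assumptions made for this example.

```python
# Pool counts per species, copied from the text above.
pools_per_species = {
    "Oc. cantans": 1,
    "Oc. caspius": 11,
    "Oc. communis": 35,
    "Oc. diantaeus": 6,
    "Oc. excrucians": 3,
    "Oc. hexodontus": 8,
    "Oc. intrudens": 14,
    "Oc. pullatus": 2,
    "Oc. punctor/punctodes": 11,
}

TOTAL_POOLS = 91          # stated number of pools
TOTAL_SPECIMENS = 2333    # stated number of pooled females
MIN_POOL, MAX_POOL = 16, 60  # stated range of pool sizes


def check_pooling(pools, total_pools, total_specimens, min_size, max_size):
    """Summarise the pooling scheme and compare it with the stated totals."""
    n_pools = sum(pools.values())
    mean_size = total_specimens / n_pools
    return {
        "n_species": len(pools),
        "n_pools": n_pools,
        "pools_match": n_pools == total_pools,
        "mean_pool_size": round(mean_size, 1),
        # The mean must lie inside the stated range of pool sizes.
        "mean_within_range": min_size <= mean_size <= max_size,
    }


summary = check_pooling(
    pools_per_species, TOTAL_POOLS, TOTAL_SPECIMENS, MIN_POOL, MAX_POOL
)
print(summary)
```

The per-species counts sum to the 91 pools reported, and the implied mean pool size falls within the stated 16–60 range.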

Illumina MiSeq Sequencing
Prior to sequencing, the mosquito homogenates were treated following an established protocol [17] with minor modifications. Specifically, each was filtered through a 0.8 µm polyethersulfone (PES) filter and treated with micrococcal nuclease (New England Biolabs, Ipswich, MA, USA) and benzonase (Millipore, Merck KGaA, Darmstadt, Germany). RNA was then extracted using TRIzol (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer's instructions. The RNA samples were treated with DNase I and purified with Agencourt RNA Clean XP magnetic beads (Beckman Life Industries). Ribosomal RNA was removed using a NEBNext rRNA depletion kit according to the manufacturer's protocol, followed by amplification using a whole transcriptome amplification WTA2 kit (Sigma-Aldrich, Merck KGaA, Darmstadt, Germany). The sequencing libraries were prepared using a Nextera XT kit (Illumina, San Diego, CA, USA) and sequenced on the Illumina MiSeq platform using a v2 reagent kit with 150 bp paired-end reads.

NGS Data Analysis
Sequence reads from the initial homogenates (Figure S1, Table S1) were analysed in Lazypipe v.1.2, an automated bioinformatics pipeline [18]. Preassembly quality control was first performed on the FASTQ reads using Trimmomatic v.0.39 [19] to remove low-quality reads and to trim low-quality bases and Illumina adapters. MEGAHIT v.1.2.8 [20] was used to perform de novo assembly of the quality-controlled reads. Gene-like regions were detected using MetaGeneAnnotator [21] and translated to amino acids with BioPerl [22]. The amino acid sequences were then queried against the UniProtKB database using SANSparallel [23] and assigned NCBI taxonomy IDs. Sequences that remained unclassified according to NCBI Taxonomy could not be identified by the steps above and were therefore identified manually using BLASTx. Any contigs longer than 1000 nt, with the highest similarity to viral RNA-dependent RNA polymerases (RdRps), were selected for phylogenetic analyses.
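The final selection step above (contigs longer than 1000 nt whose best match is a viral RdRp) can be sketched as a simple filter. This is an illustration under stated assumptions, not the Lazypipe implementation: the `Contig` record, its field names and the keyword list used to recognise RdRp hits are all hypothetical, as are the example contigs.

```python
from dataclasses import dataclass

MIN_CONTIG_LENGTH_NT = 1000  # threshold stated in the text


@dataclass
class Contig:
    contig_id: str
    length_nt: int
    best_hit_description: str  # hypothetical: description of the top protein hit


# Hypothetical keywords; the source does not say how RdRp hits were recognised.
RDRP_KEYWORDS = ("rna-dependent rna polymerase", "rdrp")


def is_rdrp_hit(description: str) -> bool:
    """Return True if the hit description looks like a viral RdRp."""
    text = description.lower()
    return any(keyword in text for keyword in RDRP_KEYWORDS)


def select_for_phylogeny(contigs):
    """Keep contigs longer than the threshold whose best hit is an RdRp."""
    return [
        c for c in contigs
        if c.length_nt > MIN_CONTIG_LENGTH_NT
        and is_rdrp_hit(c.best_hit_description)
    ]


# Invented example records, for illustration only.
example_contigs = [
    Contig("contig_1", 2450, "RNA-dependent RNA polymerase"),
    Contig("contig_2", 1000, "RNA-dependent RNA polymerase"),  # not longer than 1000 nt
    Contig("contig_3", 3100, "capsid protein"),                 # not an RdRp hit
    Contig("contig_4", 1001, "putative RdRp"),
]
selected_ids = [c.contig_id for c in select_for_phylogeny(example_contigs)]
print(selected_ids)
```

Note that "longer than 1000 nt" is read here as a strict inequality, so a contig of exactly 1000 nt is excluded.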
Analyses were performed on amino acid sequences, which were derived by analysing each contig with getorf [24] to identify open reading frames (ORFs) and converting them into amino acid format. These were aligned with MAFFT v.7.490 [25] and the resulting alignments trimmed with trimAl v.1.2 [26]. Finally, maximum likelihood (ML) trees were constructed with IQ-TREE2 v.2 [27], which employs the ModelFinder algorithm [28] to determine the optimal protein substitution model, and the UFBoot2 algorithm [29] to compute 1000 bootstrap replicates. The final trees were visualised in R v.4.1.2 using the GGTREE package v.3.0.4 [30].
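The ORF-to-tree workflow described above can be sketched as a list of command lines. The source names the tools but not the options used, so every flag, file name and the `build_phylogeny_commands` helper below are assumptions chosen for illustration; the sketch only builds the commands and does not run them.

```python
from pathlib import Path


def build_phylogeny_commands(contigs_fasta: str, workdir: str = "phylo"):
    """Return (argv, stdout_file) pairs for getorf, MAFFT, trimAl and IQ-TREE2.

    stdout_file is None when the tool writes its own output file.
    All options are illustrative assumptions, not the settings of the study.
    """
    out = Path(workdir)
    orfs = (out / "orfs.faa").as_posix()
    aligned = (out / "orfs.aln.faa").as_posix()
    trimmed = (out / "orfs.trimmed.faa").as_posix()
    return [
        # EMBOSS getorf: find ORFs and write their translations.
        (["getorf", "-sequence", contigs_fasta, "-outseq", orfs], None),
        # MAFFT writes the alignment to standard output.
        (["mafft", "--auto", orfs], aligned),
        # trimAl: automated trimming of the alignment.
        (["trimal", "-in", aligned, "-out", trimmed, "-automated1"], None),
        # IQ-TREE2: ModelFinder (-m MFP) plus 1000 ultrafast bootstraps (-B).
        (["iqtree2", "-s", trimmed, "-m", "MFP", "-B", "1000"], None),
    ]


commands = build_phylogeny_commands("rdrp_contigs.fasta")
for argv, stdout_file in commands:
    suffix = f" > {stdout_file}" if stdout_file else ""
    print(" ".join(argv) + suffix)
```

Keeping the steps as data makes it easy to inspect or log them before passing each one to a process runner such as `subprocess.run`.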
The novel viruses discovered in this study (Table S2) were named after the town or municipality nearest to the site, or one of the sites, from which the mosquitoes were collected, with diacritical marks removed as they were not supported in GGTREE. If more than one virus variant or species was found in the same pool, a final running number was appended to the name. Representative virus sequences for each virus family were downloaded from GenBank, compared with the newly generated sequences, and included in the ML trees.
Two species belonging to Alphaendornavirus in Endornaviridae, a family of viruses known to infect plants, fungi and oomycetes, were recovered from one pool of Oc. punctor/punctodes (Figure 2, Table 2). The first was a strain of Hallsjon virus (GenBank accession: QGA70950.1; amino acid identity: 99.77%) and the second was a novel virus, named "Tvarminne alphaendornavirus", that was distantly similar to Vicia faba alphaendornavirus (GenBank accession: YP_438201.1; amino acid identity: 49.12%). Complete genomes were sequenced for both virus species (GenBank accessions ON955055 and ON955056).
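The naming convention for novel viruses described in the methods (place name with diacritical marks removed, followed by a taxon label and an optional running number) can be sketched as follows. The function names and the exact way the label is assembled are assumptions for illustration, and the accented spellings used as input are the usual Finnish forms of the place names rather than text taken from the source.

```python
import unicodedata


def strip_diacritics(name: str) -> str:
    """Remove combining marks, e.g. 'ä' -> 'a' and 'ö' -> 'o'."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def virus_name(place: str, taxon_label: str, running_number=None) -> str:
    """Build a tentative virus name from a place name and a taxon label.

    A running number is appended only when several variants or species
    share the same place-based name.
    """
    base = f"{strip_diacritics(place)} {taxon_label}"
    return base if running_number is None else f"{base} {running_number}"


print(virus_name("Hämeenlinna", "flavivirus"))
print(virus_name("Enontekiö", "virga-like virus", 2))
```

Unicode decomposition handles the Finnish and Swedish letters that occur in these place names; characters without a decomposed form would need an explicit mapping.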

Results
Five species belonging to two genera of Flaviviridae were sequenced from nine mosquito pools, four of which are tentative novel viruses (Figure 3, Table 2). Three viruses grouped within genus Flavivirus, one with flavivirus-like viruses and one within genus Jingmenvirus. Two of the four novel species were named "Hameenlinna flavivirus" and "Kilpisjarvi flavivirus", and these fell within the insect-specific group of flaviviruses. Hameenlinna flavivirus was most similar to another insect-specific flavivirus species that was first detected in Finland, Hanko virus (GenBank accession: YP_009259489.1; amino acid identity: 79.87%). Kilpisjarvi flavivirus was most similar to Xishuangbanna aedes flavivirus (GenBank accession: YP_009350102.1; amino acid identity: 61.88%), although it clustered with Ochlerotatus scapularis flavivirus (GenBank accession: BCI56825.1; amino acid identity: 61.37%) in the phylogenetic tree. The full genome of Kilpisjarvi flavivirus was sequenced (GenBank accession: ON949931). A novel flavivirus-like species, "Lestijarvi flavi-like virus", was most similar to Hymenopteran flavi-related virus (GenBank accession: QTJ63659.1; amino acid identity: 47.75%), although in the phylogenetic tree it clustered with Gudgenby flavi-like virus (GenBank accession: QTJ63659.1; amino acid identity: 47.3%). Hanko virus, a species first described in 2012, was also present in four pools of mosquitoes collected near the virus's type locality, with an average amino acid identity of >99% (Figure 3, Table 2). The full genome of Hanko virus was sequenced from these variants (GenBank accessions: ON949927–ON949930). One novel member of the genus Jingmenvirus was detected, with two variants provisionally named "Inari jingmenvirus". This species was not closely related to any known species, although it weakly resembled Wuhan aphid virus 1 (GenBank accession: BBV14756.1; amino acid identity: 48.82%), which was derived from aphids from Japan.
Figure 2. Maximum likelihood tree of Endornaviridae. Tentative novel viruses are displayed in red and the mosquito species from which they were derived are in parentheses. Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps. Asterisks denote that the complete genome was recovered.
While not yet formally recognised by the ICTV, negeviruses have been recorded from mosquitoes and sandflies, among other arthropod species. Four of the viruses, "Kustavi negevirus" and "Utsjoki negevirus 1 to 3", were novel, while three, Cordoba virus, Dezidougou virus and Mekrijärvi negevirus (Figure 5, [...]
Six variants of one novel species belonging to Permutotetraviridae, a family associated with arthropods, were sequenced from five mosquito pools (Figure 6, Table 2). Named "Inari permutotetravirus", it was most similar to Smithfield permutotetra-like virus (GenBank accessions: QIJ25871.1 and QIJ25875.1; amino acid identity: 42.72–66.32%), both sequences of which were obtained from unspecified arthropods collected in Queensland, Australia.
Five variants of two species of Picornaviridae, a family of viruses that infect a broad range of vertebrates, were sequenced from two pools of Oc. caspius (Figure 6, Table 2). The first species was a previously described but as yet unnamed RNA virus, tentatively named here "Hanko picorna-like viruses". The previously described virus was obtained from an anal swab taken from a passerine bird in a Chinese zoo and was nearly identical to the Finnish variant (GenBank accession: QKN89015.1; amino acid identity: 97.15–99.47%). The second species, Jotan virus, shared high amino acid identity with its previously described counterpart from Culex mosquitoes in Sweden (GenBank accession: QGA70904.1; amino acid identity: 98.25–98.8%).
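Amino acid identities such as those quoted above, including ranges across several variants, can be illustrated with a simple calculation on aligned sequences. The source does not state how the identity values were computed, so this is only a sketch under one common convention: sequences are already aligned to equal length, and identity is the share of identical residues among columns where neither sequence has a gap. The function names and the toy sequences are invented for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


def identity_range(reference: str, variants):
    """Lowest and highest identity of a set of variants to one reference."""
    values = [percent_identity(reference, v) for v in variants]
    return min(values), max(values)


# Toy sequences, invented for illustration only.
reference = "MKTAYIAKQR"
variants = ["MKTAYIAKQR", "MKTAYLAKQR", "MKT-YIAKHR"]
low, high = identity_range(reference, variants)
print(f"{low:.2f}-{high:.2f}%")
```

Other conventions (for example, counting gaps as mismatches or normalising by the shorter sequence) give different values, so the definition should be reported alongside any identity figure.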
One virus sequence grouped with the proposed insect-specific taxon Quenyavirus, and was named "Enontekio quenyavirus", despite being found in specimens collected from northern Lapland and from Uusimaa in the far south of Finland (Figure 6, Table 2). Based on amino acid identity, it is relatively distant from its closest relative, Nete virus (GenBank accession: QIQ61196.1; amino acid identity: 39.71–39.77%), which was sequenced from the moth Crocallis elinguaria from the UK.
Fifteen variants belonging to the plant-specific Solemoviridae were sequenced, corresponding to one established virus, Evros sobemo-like virus, and four novel species (Figure 7, Table 2). The novel viruses, "Enontekio sobemovirus", "Hanko sobemovirus", "Ilomantsi sobemovirus" and "Joensuu sobemovirus", clustered with other viruses in Sobemovirus in our phylogenetic analysis. Enontekio sobemovirus was closely related to Guadeloupe mosquito virus (GenBank accession: QRW42396. [...]
[...] Table 2). It was closely related to another Finnish mosquito-derived strain (GenBank accession: AFL65801.1; amino acid identity: 99.76%). This new variant is notable because it makes Oc. communis the first mosquito species in Finland to be definitively linked with Sindbis virus, which causes outbreaks of human disease in the country.
Seven variants of viruses closely related to plant-specific viruses in Virgaviridae were recovered, belonging to three viruses (Figure 9, Table 2). They did not, however, cluster with established virgavirus genera in the ML tree, and were therefore all named as virga-like viruses: "Enontekio virga-like virus 1 and 2" and "Pedersore virga-like virus". The closest matches for these three novel viruses were as follows: Enontekio virga-like virus 1 was closest to mosquito-derived Atrato virga-like virus 6 (GenBank accession: QHA33758.1; amino acid identity: 62.86%) from Colombia; Enontekio virga-like virus 2 was distantly similar to the plant pathogen Plasmopara viticola lesion associated virga-like virus 1 (GenBank accession: QHD64722.1; amino acid identity: 34.46%) from Spain; and Pedersore virga-like virus was similar to an unnamed RNA virus which was sequenced from mosquitoes in China [...]
Figure 9. Maximum likelihood tree of Virgaviridae. Tentative novel viruses are displayed in red and the mosquito species from which they were derived are in parentheses. Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps.

Negative-Sense ssRNA Virus Sequences
Negative-sense ssRNA viruses belonging to nine virus families, Aliusviridae, Aspiviridae, Chuviridae, Phasmaviridae, Phenuiviridae, Qinviridae, Rhabdoviridae, Xinmoviridae and Yueviridae, were recovered during this study. The −ssRNA viruses are listed below, with all tentative variant names and associated mosquito species in Tables 3 and 4.
Aliusviridae comprises two genera, Ollusvirus and Obscuruvirus, and its member species have previously been reported from insects. One novel virus belonging to Obscuruvirus was sequenced from a pool of Oc. communis and was tentatively named "Lestijarvi obscuruvirus" (Figure 10, Table 3). It was most similar to Atrato chu-like virus 5 (GenBank accession: QHA33675.1; amino acid identity: 41.87%), which was sequenced from Psorophora ciliata, an aedine mosquito from Colombia.

Similarly, one virus grouped with Aspiviridae, a plant pathogenic family of viruses, and was tentatively named "Kilpisjarvi aspivirus" (Figure 10, Table 3). Its closest match was Wilkie ophio-like virus 1 (GenBank accession: ASA47457.1; amino acid identity: 50.45%), which was derived from a mosquito from Western Australia.
Figure 10. Maximum likelihood trees of Aliusviridae, Aspiviridae and Chuviridae. Tentative novel viruses are displayed in red and the mosquito species from which they were derived are in parentheses. Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps.
Family Phenuiviridae mainly includes arthropod-specific and vector-borne viruses that primarily infect mammals. We detected one sequence representing a novel virus belonging to genus Phasivirus and 13 phenui-like viruses (Figure 12, Table 4). These were tentatively named "Hameenlinna phasivirus", "Enontekio phenui-like virus 1 to 5", "Hanko phenui-like virus 1 to 3", "Ilomantsi phenui-like virus", "Kalajoki phenui-like virus 1 and 2" and "Palkane phenui-like virus 1 and 2". The complete genome of Hameenlinna phasivirus was sequenced (GenBank accession ON955138) and was most similar to Phasi Charoen-like phasivirus (GenBank accession: QEM39210.1 [...]
Three novel variants of Qinviridae were detected from pools of Oc. communis (Figure 13, Table 4), which were provisionally named "Ilomantsi qinvirus", "Kalajoki qinvirus" and "Palkane qinvirus". The first was distantly similar to Nackenback virus (GenBank accession: QGA70919.1; amino acid identity: 63.3%), which was detected in Sweden from a Culex mosquito, while the other two were distantly similar to Wilkie qin-like viruses (GenBank accessions: ASA47357.1 and ASA47455.1; amino acid identities: 54.5–58.2% and 56.61–75.3%).
Figure 11. Maximum likelihood tree of Phasmaviridae. Tentative novel viruses are displayed in red and the mosquito species from which they were derived are in parentheses. Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe (Culicinae) or genus (Anophelinae) of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps.
Twenty-one variants of Rhabdoviridae, viruses which infect vertebrates, invertebrates and plants, were sequenced from 13 mosquito pools and grouped into eight viruses (Figure 14, Table 4). Seven of these were tentative novel rhabdoviruses and one was an established species. Of the tentative novel viruses, two fell within established genera, "Enontekio merhavirus" (Merhavirus) and "Enontekio ohlsrhavirus" (Ohlsrhavirus), while the remaining species, "Enontekio rhabdovirus", "Hattula rhabdovirus", "Inari rhabdovirus", "Joutseno rhabdovirus 1" and "Joutseno rhabdovirus 2", did not. Two variants of Ohlsdorf virus (officially Ohlsdorf ohlsrhavirus) were also sequenced, which were nearly identical to the originally described virus from Oc. cantans mosquitoes from Germany [32].
Xinmoviridae includes member species that have been isolated from insects. Nine sequences from four mosquito pools grouped into four novel species, which were tentatively named "Enontekio anphevirus 1 and 2", "Hanko anphevirus" and "Joensuu anphevirus" (Figure 15, Table 4). The closest sequences available on GenBank for each of these novel species were as follows: Enontekio anphevirus 1 had moderate protein similarity to Culex tritaeniorhynchus anphevirus (GenBank accession: BBQ04822.1; amino acid identity: 53.53%), which was sequenced from Japanese Culex mosquitoes; Enontekio anphevirus 2 to Aedes anphevirus (GenBank accession: AWW13453.1; amino acid identity: 60.48%), from a colony of aedine mosquitoes from Thailand; Hanko anphevirus to Serbia mononega-like virus 1 (GenBank accession: QNS17450.1; amino acid identity: 57.88%), from Serbian specimens of Culex pipiens; and Joensuu anphevirus to Guadeloupe mosquito mononega-like virus (GenBank accession: QEM39171.1; amino acid identity: 49.73–70.95%), from aedine mosquitoes from Guadeloupe. The variant sequences were detected in pools of Oc. caspius, Oc. communis, Oc. hexodontus and Oc. punctor/punctodes from across Finland.
Yueviridae is another recently validated virus family and includes viruses that have been detected in arthropods and marine diatoms. Among our specimens, one virus sequence was recovered from Oc. hexodontus, which we named "Enontekio yuevirus" (Figure 15, Table 4). It was very distantly similar to Shahe yuevirus-like virus 1 (officially Shahe yuyuevirus) (GenBank accession: YP_009337854.1; amino acid identity: 38.47%), which was sequenced from freshwater isopods from China.
Finally, while analysing other sequence data generated during this study, a fragmentary genome of Inkoo virus (family Peribunyaviridae) was identified. The sequences comprised four contigs of 301 to 630 nucleotides which mapped to the M glycoprotein segment, with >99% nucleotide identity to the Russian mosquito-derived strain LEIV-15248Iv (GenBank accession: KT288270). Although these sequences derive from a different gene from the polymerase gene analysed in this study, they are still of interest, as Inkoo virus is pathogenic to humans. The sequences were derived from a pool of 60 Oc. punctor/punctodes (FIN/PK-2018/11), which were collected in late June 2015.

Double-Stranded RNA Virus Sequences
Double-stranded RNA viruses belonging to five established viral families (Chrysoviridae, Partitiviridae, Sedoreoviridae, Spinareoviridae and Totiviridae) and one proposed family (Botybirnaviridae) were recovered during the analyses. The dsRNA viruses sequenced in this study are listed below, with all variant names and associated mosquito species given in Tables 5–7.
Botybirnavirus is a recently proposed virus taxon, whose species have been isolated from plants and phytopathogenic fungi. One novel virus was sequenced and tentatively named "Palkane botybirna-like virus", which had a low resemblance to Bremia lactucae-associated dsRNA virus 1 (GenBank accession: QIP68006.1; amino acid identity: 40.17–44.61%). Eight variants were found in six pools of Oc. communis and one of Oc. intrudens (Figure 16, Table 5).
Five variants of three novel Chrysoviridae viruses, which mainly infect fungi as well as plants and insects, were sequenced from pools of Oc. caspius, Oc. communis, Oc. intrudens and Oc. punctor/punctodes (Figure 16, Table 5). All species belonged to Alphachrysovirus and were provisionally named "Enontekio alphachrysovirus", "Hanko alphachrysovirus" and "Lestijarvi alphachrysovirus". These viruses had a moderate similarity to Keturi virus (GenBank accession: QRW42852.1; amino acid identities: 73.68%, 77.62% and 72.98–74.71%, respectively).
Fifty-five strains grouped into 23 novel species belonging to Partitiviridae, viruses traditionally associated with fungi, plants and protozoa, but recently associated also with arthropods [35][36][37] (Table 6). Eight of these species were partiti-like viruses and did not fall within an established genus, but the remaining fifteen belonged to three established genera: nine in Alphapartitivirus (Figure 17), three in Betapartitivirus and three in Deltapartitivirus (Figure 18). The novel alphapartitiviruses were named "Enontekio alphapartitivirus 1 to 2", "Hanko alphapartitivirus 1 to 3", "Kalajoki alphapartitivirus", "Kuusamo alphapartitivirus" and "Palkane alphapartitivirus 1 and 2". Enontekio alphapartitivirus 1 was most similar to Hubei partiti-like virus 27

Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps.

Figure 18. Maximum likelihood trees of Betapartitivirus and Deltapartitivirus (Partitiviridae). Tentative novel viruses are displayed in red and the mosquito species from which they were derived are in parentheses. Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe (Culicinae) or genus (Anophelinae) of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps.
Finally, the eight partiti-like viruses included the tentatively named "Enontekio partiti-like virus", "Hameenlinna partiti-like virus", "Hattula partiti-like virus", "Ilomantsi partiti-like virus 1", "Ilomantsi partiti-like virus 2", "Kuusamo partiti-like virus", "Lestijarvi partiti-like virus" and "Vaasa partiti-like virus" (Figure 19).

Figure 19. Maximum likelihood tree of partiti-like viruses (Partitiviridae). Tentative novel viruses are displayed in red and the mosquito species from which they were derived are in parentheses. Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe (Culicinae) or genus (Anophelinae) of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps.
Five novel reoviruses belonging to Reovirales, a diverse order of viruses that infect organisms from several phyla, were sequenced (Figure 20, Table 7). Four novel viruses belonging to the family Sedoreoviridae were tentatively named "Ilomantsi reovirus 1", "Ilomantsi reovirus 2", "Ilomantsi reovirus 3" and "Ilomantsi reovirus 4", while one novel virus belonging to Spinareoviridae was named "Enontekio reovirus". According to the phylogenetic analyses, none of these five viruses clustered within established genera. Enontekio reovirus was distantly similar to Operophtera brumata reovirus (GenBank accession: YP_392501.1; amino acid identity: 29.59%), while Ilomantsi reoviruses 1-4 were moderately similar to Aedes camptorhynchus reo-like virus (GenBank accession: YP_009389547.1; amino acid identities: 64.96-66.33%, 67.77-70.69%, 74.94% and 64.98%, respectively). However, a phylogenetic analysis suggested that Atrato reo-like virus (GenBank accession: QHA33824.1) was more closely related to Ilomantsi reoviruses 1-3, while Ilomantsi reovirus 4 clustered near the root of the Ilomantsi reovirus clade.

Figure 20. Maximum likelihood tree of Reovirales. Tentative novel viruses are displayed in red and the mosquito species from which they were derived are in parentheses. Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps.
More viral sequences in this study grouped within Totiviridae, which includes viruses of fungi and protozoans, among others, than within any other family. From 205 sequences, 52 viruses were identified, of which 50 were novel and two were strains of previously described, albeit unnamed, viruses (Figures 21-25, Table 7). Virus strains were found in all nine mosquito species and from across the country. The novel viruses included 33 provisionally named viruses which clustered with member species of Totivirus. These included "Enontekio totivirus 1 to 7", "Hameenlinna totivirus 1 to 3", "Hanko totivirus 1 to 10", "Hattula totivirus 1 to 3", "Ilomantsi totivirus 1 to 3", "Inari totivirus 1 and 2", "Joutseno totivirus", "Karstula totivirus", "Kuusamo totivirus 1 and 2", "Lestijarvi totivirus", "Palkane totivirus" and "Vaasa totivirus".

Figure 21. Maximum likelihood subtrees of Totiviridae. Tentative novel viruses are displayed in red and the mosquito species from which they were derived are in parentheses. Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe (Culicinae) or genus (Anophelinae) of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps. Asterisks denote that the complete genome was recovered.

Tentative novel viruses are displayed in red and the mosquito species from which they were derived are in parentheses. Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps. Asterisks denote that the complete genome was recovered.

Figure 24. Maximum likelihood subtrees of Totiviridae. Tentative novel viruses are displayed in red and the mosquito species from which they were derived are in parentheses. Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps. Asterisks denote that the complete genome was recovered.

Figure 25. Maximum likelihood subtree of Totiviridae. Tentative novel viruses are displayed in red and the mosquito species from which they were derived are in parentheses. Sequences from GenBank are black and display the following information after the virus or species name: "(sampled organism(s)|collection country, collection year)". Tip colours represent the tribe of mosquito from which viruses were obtained. Tip shape represents the continent or region from which the specimens were collected. Trees were constructed from amino acid sequences of virus polymerases >1000 nt, aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps. Asterisks denote that the complete genome was recovered.

Viruses by Mosquito Species
Variable numbers of pools, ranging from 1 to 35, were prepared for each mosquito species included in this study, with pooled material obtained from multiple collection locations. This made direct comparison of some results between species less meaningful, but each species was associated with multiple viruses.
Ochlerotatus cantans, which was the least represented species in the study with only one pool of 20 specimens collected in late June 2015 in Ilomantsi, PK, was found to have six viral sequences. These represented five novel species, and clustered within Solemoviridae, Partitiviridae and Totiviridae (Table 8).
Ochlerotatus caspius was represented by 11 mosquito pools comprising 305 specimens collected from the southern, coastal regions of Uusimaa and Varsinais-Suomi in July and August 2017. In total, 76 viral sequences grouped into 26 virus species, and of these, 20 represented new virus species within Chrysoviridae, Chuviridae, Iflaviridae, Negevirus, Partitiviridae, Phenuiviridae, Solemoviridae, Totiviridae and Xinmoviridae. The seven previously described viruses fell within Flaviviridae, Picornaviridae, Solemoviridae and Totiviridae (Table 8). Oc. caspius was positive for Hanko virus in Uusimaa (FI 1010 and FI 1011), but not in Varsinais-Suomi (FI 988 and FI 1015) (see Figure 1).
Ochlerotatus communis was overrepresented in this study since it is one of the most common human-biting mosquitoes in Finland and is active across the summer months. As such, 35 pools were constructed, comprised of 866 specimens that were collected from around the country in May to August of 2015 and 2017. Inevitably, it also had the most unique viral sequences, with 179 that grouped into 62 species, of which 58 were novel. The three established viruses were Cordoba and Dezidougou viruses (Negevirus) and Sindbis virus (Alphavirus, Togaviridae). This is the first confirmed mosquito species to be associated with Sindbis virus in Finland. The single strain was found in Mekrijärvi, Pohjois-Karjala, an area where the only other Finnish mosquito-borne Sindbis virus strains have been recovered. The remaining 58 novel species belong to Aliusviridae, Botybirnavirus, Chrysoviridae, Chuviridae, Iflaviridae, Negevirus, Partitiviridae, Phasmaviridae, Phenuiviridae, Qinviridae, Sedoreoviridae, Rhabdoviridae, Solemoviridae, Totiviridae, Virgaviridae and Xinmoviridae (Table 8).
Ochlerotatus excrucians was represented by 99 specimens divided into three (unequal) pools, which were collected from northern and western Finland in June and July 2015. Twenty sequences were assembled, which grouped into 12 virus species, 11 of which were novel. Two strains of the previously described Ohlsdorf virus (Rhabdoviridae) were found from Inari in Lapland. The other 11 species grouped within Negevirus, Partitiviridae, Permutotetraviridae, Solemoviridae and Totiviridae (Table 8).
Ochlerotatus pullatus was the second least represented species in this study, with 46 specimens divided into two pools: one from Lapland and the other from Hattula, KH. Both were collected in 2017, in May and July. Ten virus sequences were detected, which grouped into nine novel species, which belong to Chuviridae, Partitiviridae, Rhabdoviridae and Totiviridae (Table 8).
Ochlerotatus punctor/punctodes was also represented by 11 pools, comprising 358 specimens that were collected around Finland between May and August in 2015 and 2017. Forty-one strains were sequenced, which grouped into 27 species, of which 25 were novel. The novel species belong to Chrysoviridae, Endornaviridae, Iflaviridae, Negevirus, Partitiviridae, Permutotetraviridae, Phenuiviridae, Quenyavirus, Rhabdoviridae, Solemoviridae, Totiviridae, Virgaviridae and Xinmoviridae. The two established species were Hallsjon virus (Endornaviridae) and Cordoba virus (Negevirus) (Table 8). Short M glycoprotein sequences from Inkoo virus were also recovered in addition to the RdRp sequences used to assess species diversity.
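Because pool numbers and pool sizes differed so much between species, raw sequence counts are hard to compare directly. The following minimal sketch illustrates one possible normalisation, using only the pool, specimen and sequence counts stated in the paragraphs above. It is an illustration rather than an analysis performed in this study, and the function name and the two rate definitions are assumptions introduced here.

```python
# Counts as stated in the text above: (pools, specimens, viral sequences).
# Species whose counts are not given in this section are omitted.
counts = {
    "Oc. cantans": (1, 20, 6),
    "Oc. caspius": (11, 305, 76),
    "Oc. communis": (35, 866, 179),
    "Oc. excrucians": (3, 99, 20),
    "Oc. pullatus": (2, 46, 10),
    "Oc. punctor/punctodes": (11, 358, 41),
}


def per_pool_rates(counts):
    """Return sequences per pool and per 100 specimens for each species.

    Hypothetical helper: these rates are illustrative only and do not
    correct for sequencing depth, location or collection date.
    """
    rates = {}
    for species, (pools, specimens, sequences) in counts.items():
        rates[species] = {
            "seq_per_pool": sequences / pools,
            "seq_per_100_specimens": 100 * sequences / specimens,
        }
    return rates


rates = per_pool_rates(counts)
# Print species ordered by sequences per pool, highest first.
for species, r in sorted(rates.items(), key=lambda kv: -kv[1]["seq_per_pool"]):
    print(f"{species:24s} {r['seq_per_pool']:5.2f} per pool, "
          f"{r['seq_per_100_specimens']:5.1f} per 100 specimens")
```

Under this crude normalisation the large difference in raw counts between Oc. communis and the less sampled species shrinks considerably, which is consistent with the caution expressed above about direct comparisons.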

Discussion
This is the first in-depth study of the viromes of mosquitoes from Finland. The aim was to investigate RNA viromes of identified Ochlerotatus mosquitoes, thereby ascertaining both the diversity of associated viruses and the potential vector associations of these mosquito species. RNA sequences were generated from nine identified species of female Ochlerotatus (2333 specimens), which were divided into 91 species-specific pools. Viral sequences were present in all mosquito pools, but only 90 contained sequences of RdRp greater than 1000 nucleotides and were included in further analyses. In total, 514 viral RNA sequences were identified that grouped into 159 species, 147 of which were likely to be novel. Strains for 12 viruses which had previously been described were sequenced, although only nine had been named when published: Hallsjon virus (Endornaviridae), Hanko virus (Flaviviridae), Cordoba virus, Dezidougou virus and Mekrijärvi negevirus (Negevirus), Jotan virus (Picornaviridae), Ohlsdorf virus (Ohlsrhavirus, Rhabdoviridae), Evros sobemo-like virus (Solemoviridae) and Sindbis virus (Togaviridae). The remaining three unnamed viruses were given suggested names in this study, to correspond with where the Finnish sequences originated: Hanko picorna-like virus (Picornaviridae), Hanko totivirus 3 and Hanko totivirus 4 (Totiviridae). Only two of these previously described viruses are currently recognised by the ICTV: Sindbis virus and Ohlsdorf virus. Three viruses, which had previously been detected using virus cell culture with Finnish mosquitoes, were sequenced and linked to named mosquitoes as follows: Hanko virus with Oc. caspius, Inkoo virus with Oc. punctor/punctodes and Sindbis virus with Oc. communis. These results affirm the high degree of viral diversity found in mosquitoes from Finland, despite only nine of the forty-three endemic mosquito species [11] being included in this study.

Classification and Interpretation of the Viruses Detected in this Study
Constructing phylogenetic trees of RNA viruses using RNA-dependent RNA polymerase sequences is a common practice to infer evolutionary relationships and classify newly detected viruses. This is because RdRp is a core viral protein which has conserved sequence motifs that make it a preferable gene to utilise in phylogenetic analyses. The nucleotide sequences of RNA viruses change constantly due to the high mutation rate in RNA viruses, but in contrast, amino acid sequences remain relatively stable and conserved [83]. Phylogenetic trees made from RdRps, therefore, tend to be more accurate compared to those made using other core proteins [84][85][86].
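The figure captions state that trees were built from RdRp amino acid sequences aligned with MAFFT and computed with IQ-TREE2 using ModelFinder and 1000 bootstraps. As a hedged illustration of what such a workflow can look like, the sketch below only assembles the two command lines; it does not run them. The file names are hypothetical, and the exact options used in this study are not given in the text: `--auto` for MAFFT and `-B` (ultrafast bootstrap) for IQ-TREE2 are assumptions, and a standard non-parametric bootstrap would use `-b` instead.

```python
import shlex


def build_commands(fasta_in, aln_out, bootstraps=1000):
    """Assemble MAFFT and IQ-TREE2 command lines (not executed here).

    Assumed options, not confirmed by the source:
      mafft --auto        let MAFFT choose an alignment strategy
      iqtree2 -m MFP      ModelFinder, then tree inference with the chosen model
      iqtree2 -B N        N ultrafast bootstrap replicates
    """
    # MAFFT writes the alignment to stdout; the caller redirects it to aln_out.
    mafft = ["mafft", "--auto", fasta_in]
    iqtree = ["iqtree2", "-s", aln_out, "-m", "MFP", "-B", str(bootstraps)]
    return mafft, iqtree


# Hypothetical file names for one virus family.
mafft_cmd, iqtree_cmd = build_commands("totiviridae_rdrp.faa", "totiviridae_rdrp.aln")
print(shlex.join(mafft_cmd), ">", "totiviridae_rdrp.aln")
print(shlex.join(iqtree_cmd))
```

Building the commands as lists keeps them easy to inspect and to pass to `subprocess.run` once the tools are installed.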
The putative viruses sequenced in this study were assigned as novel based on several criteria: (1) novel virus RNA-dependent RNA polymerases submitted to NCBI BLASTx had to have an amino acid identity value lower than 90% compared to the most similar virus; (2) phylogenetic analyses were run to ascertain their evolutionary relationship with previously described viruses and their likely classification within virus families; (3) associated GenBank records from closely related taxa, including the country of sampling, collection date and the organism from which the virus was isolated, were examined and compared with potentially novel viruses to infer their novelty; and (4) the criteria set by the ICTV, including sequence lengths, amino acid identity and clustering in phylogenetic trees, were also considered. Additionally, we computed supplementary pairwise distances from the protein alignments to ascertain the novelty of the detected viruses (Table S3). Certain viruses named in this study were on the borderline of being novel, since they came close to, but above, the 90% amino acid identity threshold with the closest described viruses. Such cases were noted where relevant. These viruses might indeed turn out to be Finnish strains of established viruses, but confirmation would require additional research including more sequence information on the related viral genetic diversity, especially from other geographical regions. All of the virus names proposed in this study are working names, as the final decision on their nomenclature and classification will be made by the ICTV.
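Criterion (1) can be expressed as a simple threshold check. The sketch below is a minimal illustration under stated assumptions: the `Hit` class and function name are hypothetical, only the 90% amino acid identity threshold comes from the text, and criteria (2) to (4), which require phylogenetic and metadata review, are not modelled. The identity values are those reported in the Results for three of the novel viruses.

```python
from dataclasses import dataclass

IDENTITY_THRESHOLD = 90.0  # per cent amino acid identity, criterion (1)


@dataclass
class Hit:
    """Best BLASTx hit for one RdRp sequence (hypothetical container)."""
    name: str
    closest_relative: str
    aa_identity: float  # per cent


def is_candidate_novel(hit, threshold=IDENTITY_THRESHOLD):
    """True if the hit passes criterion (1) only; not a final classification."""
    return hit.aa_identity < threshold


hits = [
    Hit("Enontekio reovirus", "Operophtera brumata reovirus", 29.59),
    Hit("Hanko alphachrysovirus", "Keturi virus", 77.62),
    Hit("Ilomantsi reovirus 3", "Aedes camptorhynchus reo-like virus", 74.94),
]

for hit in hits:
    status = "candidate novel" if is_candidate_novel(hit) else "known virus strain"
    print(f"{hit.name}: {hit.aa_identity:.2f}% to {hit.closest_relative} -> {status}")
```

A check of this kind is only a first screen: a sequence passing it would still need to be placed in a tree and compared against the GenBank metadata of its closest relatives.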
Most (147) of the 159 viruses reported in this study were designated as novel since they had low similarities in RdRp amino acid identity with the most similar existing viruses (average 65.88%). The lowest amino acid identity was seen with Enontekio reovirus, which was only 29.6% similar to Operophtera brumata reovirus (Spinareoviridae), but low values were encountered many times throughout the analysis. This highlights the issues presented by "viral dark matter", i.e., the lack of available sequences in databases to which viral sequences can be aligned [87], as well as the capacity of Lazypipe, the virus discovery and annotation pipeline established in our laboratory [18], to unravel viral sequences that are only remotely related to previously known viruses. Palkane botybirna-like virus (described in Section 3.1.3), shared a low average amino acid identity with another unclassified botybirna-like virus, Bremia lactucae associated dsRNA virus 1. Whether these two viruses are distant relatives, are ancestral viruses to the taxon, whether they can be classified as botybirnaviruses or whether they constitute a novel group of viruses remains undetermined. This study falls short of suggesting new virus genera, but it is likely that many of these sequences will form new genera in future revisions of the affected virus families.
Many more new viruses could have been named from the sequences obtained from this study but were excluded as their contigs fell below the 1000-nucleotide minimum length requirement that was set for any sequences to be considered for analysis. These discarded sequences formed approximately 75.5% of the total viral sequence data generated. In particular, short sequences of the pathogenic species Inkoo virus and Chatanga virus (Peribunyaviridae) were affected by these strict parameters, since short contigs containing polymerase, glycoprotein and nucleocapsid sequences were recovered.
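The effect of a minimum-length cut-off can be sketched as follows. This is an illustration only: the contig names and sequences are placeholders, the function is hypothetical, and treating a contig of exactly 1000 nucleotides as retained is an assumption, since the text describes the cut-off both as a minimum length and as "greater than 1000 nucleotides".

```python
def split_by_length(contigs, min_length=1000):
    """Split a {name: sequence} mapping into retained and discarded contigs.

    Assumption: contigs of exactly min_length nucleotides are retained.
    """
    kept = {name: seq for name, seq in contigs.items() if len(seq) >= min_length}
    discarded = {name: seq for name, seq in contigs.items() if len(seq) < min_length}
    return kept, discarded


# Placeholder contigs; the two short lengths echo the 301 to 630 nt range
# reported for the Inkoo virus contigs, but the sequences are dummy data.
contigs = {
    "contig_short_a": "A" * 301,
    "contig_short_b": "C" * 630,
    "contig_long": "G" * 1500,
}

kept, discarded = split_by_length(contigs)
fraction_discarded = len(discarded) / len(contigs)
print(f"kept {len(kept)}, discarded {len(discarded)} "
      f"({fraction_discarded:.1%} of contigs)")
```

Here the fraction is computed by contig count; whether the figure reported above refers to contig count or to total sequence length is not stated, so the sketch should not be read as reproducing it.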
A common pattern observed in the phylogenetic trees was that novel viruses clustered with available sequences of mosquito-derived viruses, implying that these might be more mosquito-specific than insect-specific. Moreover, many novel viruses obtained during this study clustered with viruses that were sequenced from other mosquitoes, many of which belonged to Aedini, a cosmopolitan tribe of Culicidae with 1263 extant species and which includes 35% of all valid mosquito species [13]. One explanation could be that these viruses share a common ancestry [85]. Viral sequences that grouped within Iflaviridae, Aliusviridae and especially in Flaviviridae (Flavivirus) clustered near to or with insect-specific and Aedini-associated viruses. Several of the novel viruses which grouped within Picornaviridae, Chuviridae and Chrysoviridae were the first viruses of these families to be detected in mosquitoes of the tribe Aedini. These findings could be indicative of broader mosquito association ranges among these RNA virus families. Among the virus taxa which infect plants and fungi, e.g., Alphapartitivirus, the discovery of these novel Finnish viruses would suggest that Ochlerotatus mosquitoes (and most likely mosquitoes in general) act in some capacity as vectors for these viruses, whether by mechanical transmission or otherwise.
The proportion of totivirus sequences detected in all Ochlerotatus pools was very high in this study (Figures S2 and S3), despite totiviruses being traditionally more associated with fungi and protozoa (https://ictv.global/taxonomy/ (accessed on 20 May 2022)). GenBank records show that totiviruses have been found in arthropods, plants, mammals and fish, indicating that these viruses might have a wider host range than is currently recognised by the ICTV (https://ictv.global/taxonomy/ (accessed on 20 May 2022)). Another factor that could explain the high prevalence of totiviruses is that they are part of the core virome of Ochlerotatus species [88]. Either way, this study highlights the need for an expert group to subject Totiviridae to a critical review, since at present only 28 species belonging to five genera are officially recognised by the ICTV (https://ictv.global/taxonomy/ (accessed on 20 May 2022)), whereas in this study alone 52 totiviruses were identified, 50 of which were proposed as novel. Similarly, partitiviruses were the next most represented group in this study, with 52 strains belonging to 23 viruses.
In recent years, most of the novel mosquito-borne viruses have been detected and reported from temperate and equatorial regions, since that is where most of the known mosquito-borne diseases are distributed [89]. The number of viromic studies from northern latitudes is increasing [31,32,90,91], but the uneven distribution of global research effort emphasises the importance of investigating mosquito viromes of these regions for more accurate information about the virosphere.

Reflections on the Methods and Their Impact upon Interpreting the Results
Since the lab work for this study was completed, a viromics study of Swedish mosquitoes was published in which a rinse step was added prior to homogenisation to remove surface contaminants from their specimens [90]. On reflection, this additional step would have been very beneficial to exclude any viruses which may have been mechanically transmitted to mosquitoes, or which were associated with bacteria/protozoa on the mosquito's integument. Many of the viruses that were sequenced during this study, e.g., Chrysoviridae, Endornaviridae, Solemoviridae, Totiviridae and Virgaviridae are more traditionally associated with protozoa, plants or fungi than mosquitoes (see Table 8) [39,54,56,70,81,82]. Species of Virgaviridae even use pollen grains to disperse and infect new hosts [56]. The downside of viromics is not knowing the association of the novel viruses that are recovered, e.g., whether the mosquito happened to be covered in pollen grains which were in turn covered in viruses; whether the viruses were present in undigested gut contents; whether they infected the mosquito; or whether the mosquito is a vector for that virus, and so on.
Mosquitoes also have many interactions with other organisms in the environment. Some species are known to feed on honeydew, a sugar-rich excrement that some insects including ants (Hymenoptera) and aphids (Hemiptera) excrete after feeding on plants [92,93]. It would be interesting to determine, since some species actively seek out honeydew [93], if such interactions affect virus transmission between insects, particularly since so many plant-associated viruses were recovered in this study. In addition, three of the females that were included in the study, one Oc. excrucians (FIN/L-2018/007) and two Oc. punctor/punctodes (FIN/PP-2018/015 and FIN/L-2018/026), were noted to have parasitic or phoretic mites attached to them (nine mites, one mite and one mite, respectively). If truly phoretic, then the mites may just have been temporarily attached to the mosquito for dispersal, so they may not have been so relevant for interspecies transmission. If, however, they were parasitic, then the transfer of viruses between mites and mosquitoes is not out of the realm of possibility [94]. More work is required in the future to elucidate these relationships.
Taking these points into consideration, a further laboratory step would have also increased our understanding of which viruses may be vectored by the mosquitoes included in these analyses. Honey-baited nucleic acid cards, such as FTA® Elute Cards (Whatman, Maidstone, UK), have been used in several studies in recent years to collect mosquito saliva and preserve any viral RNA, which is then sequenced to determine which viruses/virus species are present [95][96][97]. By first collecting mosquitoes, and then allowing them time to feed upon such cards either singly or in small groups, it would certainly be possible to refine results from metagenomic studies such as this one to see which viruses were common to the nucleic acid cards/saliva and mosquitoes, and which were only present in the mosquitoes, thereby determining which viruses have higher or lower likelihoods of being pathogenic. This could then be tested further using virus cell culture methods to isolate possible viruses on vertebrate or mosquito cells.
When mosquitoes were collected for this study, a note was made whenever a female was noticeably blood fed or gravid, but not if they had distended abdomens which looked as though they had recently fed upon plant juices. All but three of the females were not  (1), Totiviridae (4) and Virgaviridae (1). Competition between different viruses within mosquitoes might inhibit the replication or transmission of other viruses, resulting in the over representation of more competitive viruses [98][99][100]. Defective viral genomes have also been observed to inhibit replication or transmission of other viruses in mosquitoes [98,99], or in the case of identical or closely related viruses, the virus which manages to infect a host cell first might inhibit the replication of another via a process named "superinfection exclusion" [100].
Viromes of other mosquitoes which are native to Finland would also be of interest to study further in the future. This study only included nine of 43 (21%) currently recognised endemic species [11], and 38% of the pools were Oc. communis, creating a heavy bias to one species. Additional topics that would be of interest to explore further include the geographic and seasonal variations in the virome, as well as differences between males and females and at different developmental stages. Seasonal variation has been observed in Aedes (Stegomyia) albopictus [101] and Culex mosquitoes [102], though the core virome remains similar across different life stages in Ae. albopictus [103]. The sole focus on female mosquitoes might also limit virus discovery, akin to a study done with Ae. albopictus mosquitoes, in which Aedes iflavi-like virus genomes were only detected in a pool of male mosquitoes [104]. The authors do however note that the explanation for this is uncertain and that there might be other causal factors, such as the location of mosquito sampling [104].

Geographical Distribution of Viruses in Finland
This study has significantly increased the number of locations from which virus-positive mosquitoes have been collected in Finland. Prior studies have detected Hanko and Inkoo viruses from Uusimaa [3,4,8], Lammi virus from Kainuu, Pohjois-Karjala and Päijät-Häme [7,9], Chatanga virus from Kainuu (same location as for Lammi virus) and Pohjois-Karjala [5], Ilomantsi virus from Pohjois-Karjala [9], Sindbis virus from Pohjois-Karjala [1,2] and finally Mekrijärvi negevirus from Pohjois-Karjala [10]. In all, these viruses were found in only 4 of the 19 regions, and from only seven approximate locations, since six publications all included specimens from around Mekrijärvi in Pohjois-Karjala. The previously most northern mosquito-associated viruses in Finland were found in mosquitoes from around Sotkamo in Kainuu, approximately N64°08′, E28°23′ [5,9].
In contrast, this study included specimens which were collected from 49 collection efforts at 43 sites (min 1 km separation) in 11 regions and extended the sampling locations of virus-positive mosquitoes to the entire country (see Table A1 for a list of the proposed novel viruses by collection location, which can be compared with Figure 1) [105]. Two other collections which contribute to this study, FI 654 and FI 655 from Inari, Lapland, were also made further north than the Norwegian study (pools FIN/L-2018/07 and FIN/L-2018/19).
Hanko virus, an insect-specific virus which was first described from Finland [8], was sequenced in this study from mosquitoes that were collected near to the type locality in Hanko, Uusimaa. The four virus-positive pools all comprised Oc. caspius, which were collected in late August 2017. This is the first instance where a named mosquito species is confirmed to be associated with the virus. With future analyses, it will be interesting to see if Hanko virus is restricted to Oc. caspius, a halophilic/coastal mosquito species [16], or if it is also associated with mosquito species with larger distributions in Finland. Other specimens of Oc. caspius were included in the analysis, from Kustavi in Varsinais-Suomi from collections made in July and August 2017 (collection numbers FI 988 and FI 1015 in Figure 1), but the virus was not found therein.
A disproportionate number of pools comprised specimens which were collected from around Mekrijärvi, or in the municipality of Ilomantsi, Pohjois-Karjala. This was in part because the material in this study was all snap-frozen, identified and stored at −70 °C immediately following identification, to permit virus cell culture experiments. Such specialist facilities are located at a few field stations around Finland, which also explains why many collections were made around the municipalities of Enontekiö and Utsjoki in Lapland and in Hanko, Uusimaa. There were other factors, however, which influenced the decision to include material from eastern Finland. Prior to this study, Pohjois-Karjala was the only region where Sindbis virus had been found in Finnish mosquitoes [1,2] and one of only two locations where Chatanga virus had been found [5], and vector species had not been confirmed. However, Sindbis virus has been detected in other parts of Finland in recent years [2]. Chatanga virus was not confirmed within the parameters of the study, but Sindbis virus was, as already mentioned, sequenced from a pool of Oc. communis mosquitoes. This sampling strategy did provide the first record for Inkoo virus in Oc. punctor/punctodes mosquitoes outside of Uusimaa, so from that perspective, it was very interesting, particularly as seroprevalence to California serogroup viruses is high amongst the Finnish population [106], but virus-positive mosquitoes have rarely been encountered. Since Ilomantsi, Hanko and Enontekiö had the majority of mosquito pools, they also had the most unique virus detections. The most widespread virus families in turn were Totiviridae and Partitiviridae. Totiviruses were detected in all sampled regions, which supports them being part of the core virome of Ochlerotatus mosquitoes. Similarly, partitiviruses were detected in all regions with the exception of Keski-Suomi (Central Finland) and Varsinais-Suomi (Southwest Finland). This, however, is very likely explained by sampling bias, since only one mosquito pool included specimens from Keski-Suomi and four pools from Varsinais-Suomi.

Brief Comparison with Other Virome Studies
Metagenomics studies published in recent years have identified diverse viromes in mosquitoes from around the world [31,34,37,43,45,66,90,[107][108][109][110][111][112]. These viromes appear to differ between species and can include anywhere from tens to hundreds of different virus species in a given sample, which often comprises several individuals of a species [31,34,37,43,45,66,90,[107][108][109][110][111][112]. The nearest comparable study to Finland is a single-year, two-location study of 953 specimens of six mosquito species in Sweden [90]. It examined the viromes of Coquillettidia richiardii, Oc. communis, Oc. annulipes, Oc. cantans, Culex pipiens and Cx. torrentium, all species which are common to both Sweden and Finland, and two of which were common to both studies. They found viruses which belonged to multiple families, but ultimately there were none that were common to both studies [90]. They did, however, find viruses belonging to several families/orders which are yet to be detected in Finnish mosquitoes, including Nodaviridae, Orthomyxoviridae, Tombusviridae and Articulavirales [90]. Another Swedish study focused on comparing the viromes of Culex pipiens and Cx. torrentium collected from two locations over several years. They found 40 viruses (28 novel viruses) belonging to 14 families/orders: Bunyavirales, Endornaviridae, Luteoviridae, Mononegavirales, Negevirus, Nidovirales, Orthomyxoviridae, Partitiviridae, Picornaviridae, Qinviridae, Reoviridae, Togaviridae (wrongly attributed to "Alphaviridae", an invalid family), Totiviridae and Virgaviridae [31]. Sindbis virus, Hallsjon virus and Jotan virus were common to both this and the Swedish study [31], but viruses from Luteoviridae and Orthomyxoviridae were not sequenced in this study.
It is also of interest to compare these findings with those of other virus studies from Finland, to determine whether distantly related host taxa share any close virus associations and thereby to explore the potential origins or pathogenicity of novel viruses. A study of glow-worms (Coleoptera: Lampyridae) amplified targeted RNA sequences from adults collected in central and southern Finland [113]. It recovered 11 novel viruses belonging to Flaviviridae, Iflaviridae, Tymoviridae, Bunyavirales, Rhabdoviridae, Partitiviridae, Totiviridae and Metaviridae. Lampyris noctiluca flavivirus 1 grouped within the same clade as Lestijarvi flavi-like virus in the Flavivirus tree (Figure 2), in a branch separate from all other flavivirus sequences that were recovered in this study. Similarly, Lampyris noctiluca iflavirus 2 grouped in the same clade as Mekrijarvi iflavirus (Figure 3), in a branch away from all of the other iflaviruses. The glow-worm totivirus and rhabdovirus sequences also featured within the trees generated for this study, but grouped less closely with the sequences from this study than the two named viruses did. Since glow-worms are not haematophagous, have predatory larvae, do not feed as adults and are nonsocial, their associated viruses have limited sources [113]. The origins, host associations and pathogenicity of the novel viruses in this study are still to be determined.

Viruses Which Have Pathogenic Associations in Vertebrates
Two of the viruses which were sequenced in this study, Inkoo virus (Peribunyaviridae: Orthobunyavirus) and Sindbis virus (Togaviridae: Alphavirus), have known disease associations in Finland, and, although infrequent, can cause symptoms severe enough for patients to require hospitalisation [6,106,114]. These viruses have been detected in mosquitoes in previous Finnish studies, but it is worth noting that two mosquito species, Oc. punctor/punctodes and Oc. communis, have now been implicated as at the very least hosts, if not vectors, of Inkoo virus and Sindbis virus, respectively. The first isolations of Inkoo virus in the 1960s did include mixed pools containing Oc. communis and Oc. punctor/punctodes, but now Oc. punctor/punctodes is confirmed as being virus-positive. While most of the detected viral diversity has not been, and likely will not be, associated with pathogenic traits, it is nevertheless notable that, without targeted sampling to capture outbreaks spatially or temporally, we have been able to detect sequences of the two previously well-established mosquito-borne pathogenic viruses in Finland.
Reovirales (until recently Reoviridae) is an order comprising two families, Sedoreoviridae and Spinareoviridae (formerly Sedoreovirinae and Spinareovirinae), each of which has pathogenic virus species among its members. For this reason, the five novel Reovirales viruses in this study are of particular interest for future examination, to determine whether hitherto unrecognised pathogenic mosquito-borne viruses are present in Finland. The proposed Sedoreoviridae viruses (Ilomantsi reovirus 1 to 4) all group with other viruses which were sequenced from mosquitoes and, based on a phylogenetic analysis, are related to Phytoreovirus, which includes plant-pathogenic viruses. Valmbacken virus (see Figure 20), which is at the root of the novel virus cluster, is likely a mosquito-associated virus [31], indicating that the novel viruses could potentially have such associations. The single tentative Spinareoviridae virus, Enontekio reovirus, is distantly related to Fijivirus, which includes plant-infecting viruses that may spread via an insect vector. The sequences which group together in Figure 20 are all derived from insects.

Conclusions
This study, by using high-throughput next-generation sequencing methods and an unbiased virus discovery pipeline, has vastly increased the knowledge of viruses associated with mosquitoes in Northern Europe and has increased the number of known mosquito-associated viruses and virus families in Finland from seven and four to 159 and 25, respectively. Such a large increase in knowledge of the diversity of mosquito-associated viruses is certainly interesting and begins to illuminate the "viral dark matter", but inevitably brings with it new questions and challenges. It also highlights the pressing need for additional study to bring relevance to the names and sequences presented herein, as well as to investigate arthropod viromes of northern regions more thoroughly. It is evident from the points we have raised that the floodgates have opened, and the real work of elucidating the relationships between mosquitoes, viruses, the environment and host species must now begin.