A Novel Highly Divergent Strain of Cell Fusing Agent Virus (CFAV) in Mosquitoes from the Brazilian Amazon Region

Classical insect-specific flaviviruses (cISFs) have been widely detected in different countries in the last decades. Here, we characterize the near full-length genomes of two cISFs detected in mosquitoes collected in the city of Macapá, state of Amapá, Amazon region of Brazil. A total of 105 pools of female mosquitos were analyzed by next-generation sequencing (NGS). Comparative genomics and phylogenetic analysis identified three strains of cell fusing agent virus (CFAV) and two of Culex flavivirus (CxFV). All sequences were obtained from pools of Culex sp., except for one sequence of CFAV detected in a pool of Aedes aegypti. Both CxFV strains are phylogenetically related to a strain isolated in 2012 in the Southeast region of Brazil. The CFAV strains are the first of this species to be identified in Brazil and one of them is highly divergent from other strains of CFAV that have been detected worldwide. In conclusion, CFAV and CxFV, circulate in mosquitoes in Brazil. One strain of CFAV is highly divergent from others previously described, suggesting that a novel strain of CFAV is present in this region.

There are a few reports on cISFs in Southeast Brazil. CxFV was first isolated in the city of São José do Rio Preto and a fragment of the NS5 gene was sequenced [24]. More recently, the complete genome of this isolate was published [32]. In the city of São Paulo, CxFV and AEFV sequences of the NS5 gene have also been detected in mosquitos [25]. To the best of our knowledge, no information is available on the occurrence of cISF in other regions of the country.
Despite the widespread occurrence of cISFs, there is little information regarding their frequency, distribution, host range and genetic diversity. Therefore, we performed a metagenomics survey in mosquitoes from the North of Brazil, a region with no previous information on cISF and where mosquitoes are highly abundant.

Location of Sample Collection
Mosquitoes (Diptera: Culicidae) were collected in the city of Macapá, Amapá (AP), Northern Brazil. Macapá is the largest city and also the capital of Amapá. It is located in the Amazon region. A population of 474,706 inhabitants was estimated in 2017 [33]. Collections of mosquitoes were performed in either residential or commercial properties at 21 points located in two neighborhoods, Central and Marabaixo ( Figure 1). Central was the first neighborhood to be formed and consists of a commercial and administrative center. Its population of 17,798 inhabitants live in an area of 4.1 km 2 , with 4831 households [33]. Marabaixo is a non-official neighborhood and shows a lower degree of urbanization compared to Central. Coordinates of each point of collection were obtained using the Universal Transverse Mercator System (UTM), by means of Garmin Oregon 550 GPS (Garmin International, Inc., Olathe, Kansas, USA) and QGIS 3.0 software (QGIS ® ).

Collection and Identification of Mosquitoes
Insect collections were carried out twice a month during the morning (8 to 10 a.m.) from January to March 2017. Electric manual aspirators [34], Castro aspirators [35] and entomological nets were used to collect the mosquitoes inside and outside of residential and commercial properties.
The mosquitoes were transported to the laboratory, euthanized with ethyl acetate and morphologically identified using the dichotomous keys of Consoli and Lourenço-de-Oliveira [36]. After identification, mosquitoes had their wings and legs removed. Up to five females were grouped in pools according to their taxonomic category, place and date of collection. Mosquitoes were stored in a −80 • C freezer after identification.
A total of 127 pools of mosquitoes were obtained, 90 from Marabaixo and 37 from Central. A total of 105 of the 127 pools of mosquitoes were analyzed according to the following protocol.

Sample Processing and Next Generation Sequencing (NGS)
The protocol used to perform deep sequencing was a combination of several protocols normally applied to viral metagenomics and/or virus discovery [37], and has been partially described by da Costa et al. [38]. In summary, each mosquito pool was diluted in 900 µL of Hanks' buffered salt solution (HBSS), added to a 2 mL impact-resistant tube containing lysing matrix C (MP Biomedicals, Waltham, MA, USA), and homogenized in a FastPrep-24 5G Homogenizer (MP Biomedicals, Waltham, MA, USA). The homogenized sample was centrifuged at 12,000 × g for 10 min, and approximately 300 µL of the supernatant was then filtrated through a 0.45 µm filter (Merck Millipore, Billerica, MA, USA) to remove eukaryotic and bacterial cell-sized particles. Approximately, 100 µL, roughly equivalent to one-fourth of the volume of the tube, of cold PEG-it Virus Precipitation Solution (System Biosciences, Palo Alto, CA, USA) was added to the obtained filtrate, the contents of the tube were gently mixed and then incubated at 4 • C for 24 h. The mixture was then centrifuged at 10,000× g for 30 min at 4 • C and the supernatant (~350 µL) was discarded. The viral particle-rich pellet was treated with a combination of nuclease enzymes (TURBO DNase and RNase Cocktail Enzyme Mix-Thermo Fischer Scientifc, Waltham, CA, USA; Baseline-ZERO DNase -Epicentre, Madison, WI, USA); Benzonase (Merck KGaA, Darmstadt, Germany); and RQ1 RNaseFree DNase and RNase A Solution (Promega, Madison, WI, USA) to digest free nucleic acids. The resulting mixture was subsequently incubated at 37 • C for 2 h. Viral nucleic acids were then extracted using ZR & ZR-96 Viral DNA/RNA Kit (Zymo Research, Irvine, CA, USA), according to the manufacturer's protocol. The cDNA synthesis was performed with AMV Reverse transcriptase (Promega, Madison, WI, USA). A second strand of cDNA was synthesized using DNA Polymerase I Large (Klenow) Fragment (Promega, Madison, WI, USA). Subsequently, a Nextera XT Sample Preparation Kit (Illumina, San Diego, CA, USA) was used to construct a DNA library, identified using dual barcodes. Individual samples were then purified using the ProNex Size-Selective Purification System (Promega, Madison, WI, USA). For size range, Pippin Prep (Sage Science, Beverly, MA, USA) was used to select a 300 bp insert (range 200-400 bp). The library was deep-sequenced using the HiSeq 2500 Sequencer (Illumina, San Diego, CA, USA) with 126 bp ends. Bioinformatic analysis was performed according to the protocol previously described by Deng et al. [39]. The resulting singlets and contigs were analyzed using BLASTx to search for similarity to viral proteins in GenBank's Virus RefSeq. The contigs were compared to the GenBank non-redundant nucleotide and protein database (BLASTn and BLASTx).

Phylogenetic Analyses
Based on the best hits of the Blast searches, the following genomes, listed by their GenBank numbers, were chosen.  [40]. To the inference of the NS5 gene tree (this region corresponds to the positions 7463-9889 of the genome of the reference NC001564, Galveston), besides the sequences listed above the isolate KJ476731 was also used. Phylogenetic trees were reconstructed using the Maximum Likelihood approach, and branch support values were assessed using the Shimodaira-Hasegawa test. The genome tree was inferred using FastTree software [41] using the general time reversible (GTR) model and gamma distribution according to the likelihood ratio test (LRT) implemented in the jModeltest software [42]. We also used Bayesian Markov chain Monte Carlo method implemented in MrBayes version 3.2.3 [43] assuming GTR substitution model with gamma-distributed rate variation across sites and a proportion of invariable sites. Chains were run for 10 million generations, with the first 25% discarded as burn-in. Additional evolutionary analyses were conducted using MEGA7 [44]. Trees were edited and viewed with FigTree v1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/).

Results
We processed a total of 105 pools of mosquitoes that were submitted to NGS. The main characteristics of pools in which near full-length flavivirus genome sequences were detected are shown in Table 1.

Phylogenetic Analysis of Genomes
The near full-length genomes of the Brazilian strains were compared to cISFs available in the GenBank and a maximum likelihood tree was inferred (Figure 2). The tree shows that strains Macapá 01, Macapá 02 and Macapá 04 clustered in the clade formed by cell fusing agent virus (CFAV) isolates while the strains Macapá 05 and Macapá 06 clustered in the Culex flavivirus (CxFV) clade. The strains of CxFV from this study are in a phylogroup in which the basal strain was detected in Uganda (GQ165808), in 2008. Our CxFV strains are closely related to an isolate (KT726939) detected in the Southeast region of Brazil in 2012. In this clade there is also a Mexican strain isolated in 2007 (EU879060). Two strains of CFAV detected in this study (Macapá 01 and 04) cluster with the historical isolate GQ165810 (Rio Piedras02) from Puerto Rico identified in Aedes aegypti. Additionally, we found one CFAV strain (i.e., Macapá 02) divergent from all the other strains. The genetic distance based on the near full-length genomes between Macapá 02 and GQ165810 was 52% while the distance between GQ165810 and Macapá 01 was just 10% (Table S1). For this reason, a more detailed evolutionary analysis was performed with the CFAV strains.

Phylogenetic Trees of CFAV Envelope (E) and NS5 Genes
To characterize the Brazilian strains of CFAV, we chose the E and the NS5 genes because they have distinct evolutionary rates and are located in different regions of the CFAV genome. The tree reconstructed using the E gene indicates that there are three well defined CFAV groups (i.e., Thailand, America and Australia/USA/UK) determined by high branch support values. This tree was midrooted to facilitate visualization of the topology ( Figure 3A). The group Australia/USA/UK may not represent a true phylogroup because the isolates were obtained from mosquito cell lines. The remaining groups were composed of local strains/isolates and represent de facto regional strains of CFAV. This tree also shows the strains AB488425 (from Indonesia) and Macapá 02 are at the base and are highly divergent from all CFAV strains included in this phylogeny.
In addition, we also used genetic distances to illustrate the diversity of Brazilian strains. The distance among the main groups ranged from 22% to 26% (values indicated by dark braces in the tree). We also measured the diversity within each group (colored circle in Figure 3A). The diversity of the group America was estimated without the strain Macapá 02. The diversity of the Brazilian strains is also shown (blue sector in the circle of Figure 3A) and this measurement was estimated including all strains. It is important to mention that the high genetic diversity in the group of Brazilian strains was due to inclusion of the isolate Macapá 02.
Additionally, the same pattern of topology was observed in the phylogenetic tree inferred using the NS5 region ( Figure 3B). Equally, the main groups (i.e., Thailand, America and Australia/USA/UK) were supported by high posterior probabilities. The strain Macapá 02 is at the base of the clade and not within the group composed by strains from the Americas. Distances among groups ranged from 21% to 34% (values indicated by dark braces in the tree) and the diversity within each group was also measured (colored circle in Figure 3B). Note that the diversity of Brazilian strains was higher because of the inclusion of the strain Macapá 02.

Genetic Distances of the Brazilian Isolate Macapá 02
To better illustrate the divergence of the strain Macapá 02 from other CFAV isolates, a pairwise comparison was made (summarized in Table 2). The genetic distances between Macapá 02 and one strain of each phylogroup (Thailand, America and Australia/USA/UK) were higher than 40% (Lane 1 to 3 in Table 2) and therefore, are higher than the distance values calculated within the groups (colored circles in Figure 3). In addition, the genetic distance between Macapá 01 and Macapá 02, measured using the full length genomes, is 0.053 ± 0.002 and between Macapá 01 and NC001564 (Galveston) is 0.022 ± 0.001 (Table S1). Differences in the polyprotein between Macapá 01 and Macapá 02 are shown in Table S4.

Discussion
Here we report the first detection of CFAV in Brazil. This virus had previously been detected in mosquitoes from Puerto Rico [7], Thailand [8,9], Indonesia [10], Mexico [11] and Kenya [12]. Our findings provide evidence that this virus is also distributed in South America.
Our phylogenetic analyses revealed that one strain of CFAV (Macapá 02) was highly divergent from the other strains detected in Brazil (Macapá 01 and Macapá 04) or elsewhere (Figures 2 and 3 and Table 2 and Table S4). This is the first time that divergent CFAV strains have been reported in the same country. Cook et al. [7] detected one strain of CFAV among several isolates recovered from mosquitoes Aedes aegypti, Aedes albopictus and Culex sp. collected in a wide geographical area of Puerto Rico. In addition, others had reported low divergence among isolates of CFAV from Mexico [11] and Thailand [9]. Therefore, our findings suggest a new strain of CFAV is circulating in Macapá, Northern Brazil, in Culex mosquitoes captured in the Marabaixo neighborhood.
Data from our study reinforces the previously described hypothesis about the existence of two genotypes of CxFV: One found in isolates from Asia and USA and the other in isolates from Africa, Caribbean and Latin America [18,19,23,24,28]. More recently, isolates of CxFV from Taiwan were grouped with the Africa/Caribbean/Latin America genotype [29], a classification that was also confirmed in the phylogenetic tree generated in our study.
In our study, Culex sp. females were not identified to the species level due to difficulties in the morphological identification of mosquitoes. This technique presents the following limitations. Some critical structures of the specimens are commonly damaged or lost during their collection and/or transportation to the laboratory. In addition, identification of Culex sp. is laborious and requires a very experienced and well-trained entomologist. Some species of this genus can only be identified by structures of the male genitalia [36]. This is an obvious limitation in monitoring and surveillance studies of flavivirus that focus on females due to their epidemiological importance. The use of tools that allow molecular identification of the mosquitoes to the species level, such as those described by Cywinska et al. [45] and Murugan et al. [46], could be helpful to overcome the difficulties found in the morphological identification of Culex. sp.
Brazil is a very large country with an area of 8.5 million km 2 . Information on cISF is very limited so far and, therefore, not much can be concluded about their frequency, distribution, host range and genetic divergence in this country. However, in the last few years Brazil has been facing multiple epidemics caused by flaviviruses such as dengue virus, Zika virus and yellow fever virus [47][48][49][50]. This fact will probably lead to an increase in surveillance and monitoring studies. Since the majority of new and ongoing studies employ molecular tools, such as PCR with generic primers and nucleotide sequencing, additional cISFs may very well be detected in mosquitoes from Brazil in the near future.
In conclusion, cISFs, more specifically CxFV and CFAV, circulate in mosquitoes in the North of Brazil, Amazon region. One strain of CFAV is highly divergent from others previously described, suggesting that a novel strain of CFAV is present in this region.
Supplementary Materials: The following are available online at http://www.mdpi.com/1999-4915/10/12/666/ s1, Table S1: Genetic distances of the near full-length genomes of CFAV strains obtained from this study (Macapá 01 and Macapá 02) and recovered from GenBank, Table S2: Similarity (%) of the sequence of the strain Macapá 01 with sequences of flaviviruses from GenBank, Table S3: Similarity (%) of the sequence of the strain Macapá 05 with sequences of flaviviruses from GenBank, Table S4