The Emergence of Chikungunya ECSA Lineage in a Mayaro Endemic Region on the Southern Border of the Amazon Forest

Anthropic changes on the edges of the tropical forests may facilitate the emergence of new viruses from the sylvatic environment and the simultaneous circulation of sylvatic and urban viruses in the human population. In this study, we investigated the presence of arboviruses (arthropod-borne viruses) in the sera of 354 patients, sampled from February 2014 to October 2018 in Sinop city. We sequenced the complete genomes of one chikungunya virus (CHIKV)-positive and one out of the 33 Mayaro virus (MAYV)-positive samples. The CHIKV genome obtained here belongs to the East/Central/South African (ECSA) genotype and the MAYV genome belongs to the L genotype. These genomes clustered with other viral strains from different Brazilian states, but the CHIKV strain circulating in Sinop did not cluster with other genomes from the Mato Grosso state, suggesting that at least two independent introductions of this virus occurred in Mato Grosso. Interestingly, the arrival of CHIKV in Sinop seems to not have caused a surge in human cases in the following years, as observed in the rest of the state, suggesting that cross immunity from MAYV infection might be protecting the population from CHIKV infection. These findings reinforce the need for continued genomic surveillance in order to evaluate how simultaneously circulating alphaviruses infecting the human population will unfold.


Introduction
Arboviruses, arthropod-transmitted viruses, can successfully replicate in both invertebrates, vectors and vertebrate host cells [1]. These viruses are mostly transmitted by mosquitoes in sylvatic and urban environments, causing a large public health burden in several tropical countries around the globe [2]. WHO estimates that the dengue virus alone infects around 100-400 million people annually, mainly in the South American and Asian continents [3]. On the other hand, several human pathogenic arboviruses circulate in the sylvatic environment, with eventual spillover to the human population, such as Saint Louis encephalitis virus (SLEV) [4,5], West Nile virus (WNV) [6,7] and Mayaro virus (MAYV) [8,9].
Most known human-infecting arboviruses belong to the families Flaviviridae, Togaviridae and Peribunyaviridae [9,10]. Viruses from the Togaviridae family are zoonotic and epizootic pathogens that in a transitional zone between the Amazon rainforest and Cerrado (Savannah) biomes. The region has a tropical climate, with a dry season of six months, and an average temperature of 26 • C and precipitation levels of over 2500 mm.

Sample Collection
Serum samples were collected during arboviral surveillance from February 2014 to October 2018. As inclusion criteria, we sought patients presenting clinical diagnoses of dengue, Zika or chikungunya infection, who reported arbovirus-like symptoms, such as fever, myalgia, headache and arthralgia. All of them were interviewed and signed an informed consent form.

Viral Detection-RT-PCR Assays
Viral RNA was extracted directly from the samples or cells' supernatant using a QIamp viral RNA mini kit (QIAGEN, Hilden, Germany), followed by a reverse transcription using 1 µL of random hexamers (Promega, Madison, MI, USA), 1 µL of Go Script Reverse Transcriptase (Promega) and 8 µL of RNA, for a 20 µL reaction, according to the manufacturers' instructions. A duplex-PCR was made using the genus-specific primers for alphaviruses (targeting nsP1 region) and flaviviruses (targeting NS5 region), followed by multiplex-semi-nested RT-PCR assays for species-specific identification [34], with a few modifications, as previously described [31].

Viral Detection-Isolation
The MAYV and CHIKV positive samples were inoculated into the cell culture of Ae. albopictus C6/36 and Vero cell lines; three passages were performed and then the supernatants were tested for viral nucleic acids by RT-PCR assays following [35].

Viral Genome Sequencing
Two PCR positive samples (one for CHIKV and one for MAYV) were assessed by real-time RT-PCR (rRT-PCR), employing the established protocols for CHIKV [36] and adding the MAYV primers used in the conventional PCR, in order to obtain Ct values to guide the genome sequencing strategy. Total RNA was extracted directly from the serum of the CHIKV positive sample and from the supernatant of the MAYV first passage in C6/36 cells, in order to perform the single-strand cDNA synthesis used for viral genomic amplification. RT-PCR reactions were carried out in a volume of 20 µL using Random Hexamers (Promega, Madison, WI, USA) and ProtoScript II Reverse Transcriptase (New England Biolabs, Ipswich, MA, USA), following the manufacturers' instructions.
Single strand cDNA from the CHIKV positive serum sample was submitted to whole genome amplification using multiplex-PCR [37], while single-strand cDNA from the MAYV sample was submitted to whole genome amplification by using multiplex RT-PCR, with primers designed using Primal Scheme [38]. Primer design was performed with two complete genomes of MAYV belonging to the L and N genotypes (GenBank accession number KY618127 and KP842812, respectively), both detected in South America (primers can be found in Supplementary Table S1). Sequencing libraries were prepared with the Nextera XT Library Prep Kit (Illumina, San Diego, CA, USA), using 2 ng of input cDNA derived from the CHIKV and MAYV multiplex PCRs, following the manufacturer's instructions. A MiSeq Reagent Kit V3 of 150 cycles (Illumina, San Diego, CA, USA) was used, employing a paired-end strategy, resulting in 75 bp reads separated by 350 bp. Samples were sequenced on an Illumina MiSeq platform at the Technological Platform Core at the Aggeu Magalhães Institute (IAM).
Trimmomatic v. 0.36 [39] was employed to remove low-quality reads, adapter sequences and primers. FastQC was used to assess the quality of the Illumina raw reads. These were subsequently mapped against the MAYV BeH473130/1988 (KY618133) and CHIKV BHI3734/H804698 (KP164568) reference genomes, using Bowtie 2 with the default parameters [40]. Consensus sequences of viral genomes were obtained through the Integrated Genome Viewer software [41] and the sequences were deposited in GenBank with the accession numbers: MH513597 and MT349960. The raw sequenced reads may be found in the European Nucleotide Archive under the project number PRJEB38124.

Phylogenetic Analysis
Multiple sequence alignment was performed using MAFFT version 7 [43], through the MAFFT online server (http://mafft.cbrc.jp/alignment/software/); maximum-likelihood analysis (ML) was performed using PhyML version 3.0 [44] through the web site at http://www.atgc-montpellier.fr/phyml/. Coding regions corresponding to the complete genomes from Sinop were aligned with all published and available near-complete: CHIKV genomes (>8000 nucleotides), belonging to the WA, ECSA and Asian/Caribbean genotypes, totalizing 930 genomes; and MAYV genomes (>8000 nucleotides), belonging to the L, D and N genotypes, totalizing 66 genomes. We also performed phylogenetic analysis using only Brazilian genomes belonging to the CHIKV ECSA genotype, totalizing 79 genomes. All genomes and data were collected from ViPR [45] (http://www.viprbrc.org/), on 15 April 2020 (accession numbers of all genomes used in the analysis may be seen in Supplementary Table S2). The ML phylogenies were reconstructed by using the best-fit general time-reversible (GTR) model with invariant sites (+I) (GTR + I) for MAYV, and 4 gamma substitution rate categories (+G) (GTR + G + I) for CHIKV, all suggested as the most likely models to represent the data by the Smart Model Selection implemented in version 3.0 of PhyML online [46]. For the tree search operation, we used SPR, and statistical support for the phylogenetic nodes was evaluated by an SH-like approximate likelihood ratio test. Tree visualization and figure generation were performed with FigTree v1.4.4 [47].

Phylogenetic Analysis
Multiple sequence alignment was performed using MAFFT version 7 [43], through the MAFFT online server (http://mafft.cbrc.jp/alignment/software/); maximum-likelihood analysis (ML) was performed using PhyML version 3.0 [44] through the web site at http://www.atgcmontpellier.fr/phyml/. Coding regions corresponding to the complete genomes from Sinop were aligned with all published and available near-complete: CHIKV genomes (>8000 nucleotides), belonging to the WA, ECSA and Asian/Caribbean genotypes, totalizing 930 genomes; and MAYV genomes (>8000 nucleotides), belonging to the L, D and N genotypes, totalizing 66 genomes. We also performed phylogenetic analysis using only Brazilian genomes belonging to the CHIKV ECSA genotype, totalizing 79 genomes. All genomes and data were collected from ViPR [45] (http://www.viprbrc.org/), on 04/15/2020 (accession numbers of all genomes used in the analysis may be seen in Supplementary Table S2). The ML phylogenies were reconstructed by using the best-fit general time-reversible (GTR) model with invariant sites (+I) (GTR + I) for MAYV, and 4 gamma substitution rate categories (+G) (GTR + G + I) for CHIKV, all suggested as the most likely models to represent the data by the Smart Model Selection implemented in version 3.0 of PhyML online [46]. For the tree search operation, we used SPR, and statistical support for the phylogenetic nodes was evaluated by an SH-like approximate likelihood ratio test. Tree visualization and figure generation were performed with FigTree v1.4.4 [47].

Results
Serum samples were collected from 354 patients exhibiting dengue-like illnesses, within 20 days of the onset of symptoms. Thirty-four patients were positive for alphaviruses-33 tested positive for MAYV (33/354; 9.3%) and one for CHIKV (1/354; 0.3%) ( Figure 1). Seventy-eight samples were positive for flaviviruses and are described elsewhere (Supplementary Table S3).  Table S4). Both patients were urban residents and reported no recent history of travel or access to urban parks, sylvatic or rural areas in temporal proximity to the emergence of the first symptoms.
The cycle thresholds (CTs) for the positive samples were 6.12 and 23.19 for the MAYV and CHIKV samples, respectively. We obtained 2,813,153 reads for the MAYV sample, in which 84.48% mapped to the reference genome, reaching an average coverage depth of 15,453.99; from the CHIKV positive sample, we obtained 1,123,424 reads, in which 96.47% mapped to the reference genome, reaching an average coverage depth of 7133.15 (Table 1). The maximum likelihood phylogenetic reconstruction of the obtained MAYV genome and a few other available MAYV genomes showed that the Sinop strain clustered within the L genotype and was closely related to the MAYV strains detected from mosquitoes in the 1960s in Pará state (PA) in the northern region of Brazil (Figure 2a,b), showing a high aLRT SH-like support of 1.0. Meanwhile, the CHIKV genome was placed in a polytomy within a previously described Brazilian subclade that is an offshoot of the East/Central/South African lineage [21], which also had a high aLRT SH-like support of 1.0 ( Figure 3a). Interestingly, the CHIKV sequenced here did not cluster with other lineages from the state of Mato Grosso (Figure 3b). The maximum likelihood phylogenetic reconstruction of the obtained MAYV genome and a few other available MAYV genomes showed that the Sinop strain clustered within the L genotype and was closely related to the MAYV strains detected from mosquitoes in the 1960s in Pará state (PA) in the northern region of Brazil ( Figure 2, panels a and b), showing a high aLRT SH-like support of 1.0. Meanwhile, the CHIKV genome was placed in a polytomy within a previously described Brazilian subclade that is an offshoot of the East/Central/South African lineage [21], which also had a high aLRT SH-like support of 1.0 ( Figure 3, panel a). Interestingly, the CHIKV sequenced here did not cluster with other lineages from the state of Mato Grosso (Figure 3, panel b).  We found a small number of SNPs per sample, varying from 17 in the CHIKV to 30 in the MAYV genome ( Figure 3). For CHIKV, six SNPs found were non-synonymous and 11 were synonymous. For MAYV, seven SNPs found were non-synonymous and 22 were synonymous. Moreover, we found several specific mutations restricted to the genomes sequenced in this study: CHIKV presented four new amino acid mutations, two in nonstructural proteins (nsP2-T31I and nsP3-A388V) and two in an envelope protein (E3 T20I and H57R); MAYV presented two, both in the E1 protein (I425S and V427A) (Supplementary Table S5; Figure 4). We found a small number of SNPs per sample, varying from 17 in the CHIKV to 30 in the MAYV genome ( Figure 3). For CHIKV, six SNPs found were non-synonymous and 11 were synonymous. For MAYV, seven SNPs found were non-synonymous and 22 were synonymous. Moreover, we found several specific mutations restricted to the genomes sequenced in this study: CHIKV presented four new amino acid mutations, two in nonstructural proteins (nsP2-T31I and nsP3-A388V) and two in an envelope protein (E3 T20I and H57R); MAYV presented two, both in the E1 protein (I425S and V427A) (Supplementary Table S5; Figure 4). We detected two previously reported CHIKV amino acid mutations in structural proteins: E2-G60D, which contributes to the increased CHIKV fitness in Ae. albopictus and Ae. aegypti [48], and E1-T98A, which may increase the CHIKV infectivity for Ae. albopictus in the presence of E1-A226V substitution [49]. Five amino acid mutations, that were previously described to be under positive selection and to delineate the genotype L strains on vector infection, were revealed of MAYV. We also found nsP1-L518A, nsP3-A298P, nsP3-V386T, nsP4-A249K and E1-L300T [50].

Discussion
Arboviruses are widespread and diverse in Brazil and impose a large public health burden on the human population due to the abundance of sylvatic and urban vector species associated with poor sanitation and living conditions. As expected in such tropical environments, the simultaneous circulation and human co-infection of these viruses is common [51], imposing a great socioeconomic impact. Most people recover from arbovirus infection with no major sequelae, although lifethreatening conditions can be identified such as hemorrhagic fever, Guillain-Barre syndrome and encephalitis [52]. Among a dozen arboviruses that currently circulate in Brazil and infect humans, dengue virus (DENV) has largely predominated, causing several thousand infections every year in the country. In 2019 alone, around 1.5 million cases were reported, and at least 782 people died from DENV infection [53]. Besides DENV, several other arboviruses have been causing human infection over the years, such as MAYV [8,9], Zika virus (ZIKV), yellow fever virus (YFV) and CHIKV, causing outbreaks with hundreds of thousands of people infected annually [54].
Most of the arboviral diagnoses in developing countries are based on clinical and epidemiological criteria, which leads to biased estimates of infection and produces underreports of less prevalent arboviruses that are not investigated in the molecular diagnostic routine.  We detected two previously reported CHIKV amino acid mutations in structural proteins: E2-G60D, which contributes to the increased CHIKV fitness in Ae. albopictus and Ae. aegypti [48], and E1-T98A, which may increase the CHIKV infectivity for Ae. albopictus in the presence of E1-A226V substitution [49]. Five amino acid mutations, that were previously described to be under positive selection and to delineate the genotype L strains on vector infection, were revealed of MAYV. We also found nsP1-L518A, nsP3-A298P, nsP3-V386T, nsP4-A249K and E1-L300T [50].

Discussion
Arboviruses are widespread and diverse in Brazil and impose a large public health burden on the human population due to the abundance of sylvatic and urban vector species associated with poor sanitation and living conditions. As expected in such tropical environments, the simultaneous circulation and human co-infection of these viruses is common [51], imposing a great socioeconomic impact. Most people recover from arbovirus infection with no major sequelae, although life-threatening conditions can be identified such as hemorrhagic fever, Guillain-Barre syndrome and encephalitis [52]. Among a dozen arboviruses that currently circulate in Brazil and infect humans, dengue virus (DENV) has largely predominated, causing several thousand infections every year in the country. In 2019 alone, around 1.5 million cases were reported, and at least 782 people died from DENV infection [53]. Besides DENV, several other arboviruses have been causing human infection over the years, such as MAYV [8,9], Zika virus (ZIKV), yellow fever virus (YFV) and CHIKV, causing outbreaks with hundreds of thousands of people infected annually [54].
Most of the arboviral diagnoses in developing countries are based on clinical and epidemiological criteria, which leads to biased estimates of infection and produces underreports of less prevalent arboviruses that are not investigated in the molecular diagnostic routine. Such sub-notification can be clearly observed in Sinop from 2015 to 2019, where 7341 cases of DENV, 1324 of ZIKV and 15 of CHIKV were confirmed by laboratory tests, whereas the majority of the negative samples, around 44% of the cases investigated, were not tested for other arboviruses [55] (Supplementary Table S6). The confirmation of chikungunya diagnosis by Mato Grosso State Department of Health has been based on MAC-ELISA, while MAYV infection has not been investigated routinely yet. Meanwhile, molecular studies using human samples in Sinop have described the silent circulation of SLEV in 2011 and 2015 [56] and a MAYV annual incidence since 2011 [9]. Adding up to this scenario of simultaneous arbovirus circulation, here, we confirm the circulation of CHIKV and MAYV in human samples through molecular testing and genome sequencing.
Genome-wide phylogenetic analysis of MAYV confirmed the circulation of the L genotype in Sinop city, as previously demonstrated [9]. The obtained genome clustered with a genome from an isolate from the state of Pará (PA). Genotype D (widely dispersed) currently covers a broad geographic area from Trinidad and Tobago to Brazil, Peru, Bolivia, Venezuela, Haiti and French Guiana, with several available full-genome sequences (51 genomes). On the other hand, genotype L (limited) has only a limited number of complete genomes available, consisting of ten isolates obtained over the years 1955 to 1991 from PA and two recent isolates: one from Haiti and another from the State of São Paulo, imported from PA [57]. Due to the limited data, it is not possible to precisely determine the L genotype's dispersion throughout the Amazonian region and its borders. Other studies have also detected fragments of L and D genotypes in other municipalities in MT [58], whereas molecular studies in other border states with MAYV circulation have not been performed yet [59].
We detected two non-synonymous amino acid substitutions I425S and V427A in the MAYV E1 protein, but whether these substitutions played any role in the cell entry capacity and viral persistence is unknown. Alphavirus glycoproteins E1/E2 mediate host recognition and entry into the cell [60]. These proteins are useful in the development of vaccines and serodiagnostic assays as they hold most of the immunogenic epitopes [61,62]. However, the constant humoral immune pressure creates amino acid variations in these proteins that may lead to viral evasion, influence cross-neutralization activity and allow host-switching [63]. Reverse genetic studies are needed to determine whether any of these substitutions may cause major changes in viral fitness. Continued genome-based molecular investigation of MAYV is recommended in order to understand the spread and maintenance of MAYV in the Southern Amazon region and assess antigenic variations that might impact cross-immunity between alphaviruses and affect the sensitivity of diagnostic tests based on protein binding (immunoassays) [64].
Our CHIKV isolate belonged to the ECSA lineage, sharing 99.8% of its nucleotide identity with isolates from Bahia, a state in Northeast Brazil. The CHIKV genome from Sinop did not cluster with other genomes from MT [65]. The distinct evolutionary divergence from the monophyletic MT clade suggests that CHIKV was introduced into MT by at least two independent CHIKV ECSA variants, which are likely co-circulating in the state [66]. With the current data, the circulation of CHIKV genotypes other than ECSA in MT cannot be ruled out, especially in light of the circulation of the Asian/Caribbean genotype in the Amazon region (states of Pará, Roraima and Amapá), highlighting the need for continuous molecular surveillance in the region.
Genome analysis revealed mutations E1-T98A and E2-G60D in the glycoprotein of the CHIKV strain from Sinop. The E2-G60D amino acid change was reported to contribute to CHIKV's fitness increase in both Ae. albopictus and Ae. aegypti [48] and is currently detected in all the Asian/Caribbean and ECSA lineage genomes that are available. Meanwhile, the E1-T98A amino acid change has been found in all available ECSA genomes and may increase CHIKV infectivity for Ae. albopictus in the presence of E1-A226V substitution [49]. We did not detect other amino acid changes in the E1 glycoprotein previously associated with CHIKV fitness-enhancing in Ae. aegypti (E1-A226V, E1-K211E and E2-V264A) [67] and Ae. albopictus (E1-A226V, E2-I211T) [68,69], suggesting that the CHIKV lineage circulating in Sinop likely does not have a higher fitness compared to other CHIKV lineages circulating in Brazil. However, the ECSA genotype is currently spreading all over South America and infecting an abundant and diverse Ae. aegypti and Ae. albopictus population. Since the E1-T98A amino acid substitution is fixed in all ECSA strains from Brazil, including the one sequenced in this study, if the E1-A226V change occurs, CHIKV's lineage would likely gain increased fitness in Ae. albopictus. In this scenario, transmission of this virus could be exacerbated throughout Brazilian and South American territories, as Ae. albopictus is already spread over the entire continent [70]. Finally, other sample specific mutations were also detected in the Sinop CHIKV genome, but further genetic studies are required to understand their importance in the virus' fitness, if any.
As previously discussed, MAYV is endemic in the Southern Amazon region, where many outbreaks have been reported in the last decade [9,58,71]. Considering that alphavirus antibody cross-immunity has already been reported [72], people living in the Southern Amazon region may have alphavirus cross-immunity due to a previous infection by MAYV. In fact, it has been suggested that infection by CHIKV may confer cross-protection from MAYV, following the use of sera from CHIKV-exposed patients to cross-neutralize MAYV in vitro [73]. Such cross-immunity could be the reason why CHIKV only emerged in Sinop in 2018 and has not caused large outbreaks since then. Likewise, the restricted spread of CHIKV in other MAYV endemic areas of the Amazon has been pointed out [74]. However, a comprehensive serological investigation based on neutralization assays or new diagnostic tools must be performed in order to assess the role of alphavirus cross-immunity in such areas.
It is worth noting that, despite the underreports, the highest CHIKV incidence rate in the central-west region of Brazil has been reported in MT (387.6 cases per 100,000 inhabitants), also in 2018 [23]. Noticeably, reports in Sinop did not follow the same trend, both using PCR diagnostic and serological assays, as most of the cases concentrated in the southern cities of the state (Supplementary  Table S6). Continuous molecular surveillance is required in order to monitor how the spread of recently emerged viruses will unfold [13].

Conclusions
Epidemiological surveillance based on genome-scale sequencing of the circulating viral strains is valuable for the prompt detection of adaptive mutations, which is essential for understanding transmission patterns, assessing the risk of emergence and intervening in vector control strategies [75]. In this study, we found new mutations in the strains of MAYV and CHIKV and observed that these strains clustered with genomes from geographically distant Brazilian states, suggesting that their spread occurred through infected patients that traveled between states. Further studies are encouraged in order to follow the spread of these viruses within the Southern Amazon region, in order to further understand the importance of mutations in the maintenance and spread of MAYV and CHIKV. This need is reinforced by the large outbreaks of CHIKV in Brazil and the underreporting of MAYV infections [76,77]. New genomic data can clarify the epidemiological characteristics, such as adaptation to vector spread and impact on human infection, where arboviruses of the same viral family co-circulate and may have cross antibody reactivity.
Supplementary Materials: The following are available online at http://www.mdpi.com/2414-6366/5/2/105/s1, Table S1. Primer sets used for whole MAYV genome amplification. Table S2a. Sixty-six complete Mayaro virus genomes, belonging to the D and L genotypes, used to reconstruct ML phylogenetic analysis. Table S2b. 930 complete and near-complete Chikungunya virus genomes, belonging to the West African, East/Central/South African and Asian/Caribbean genotypes, used to reconstruct ML phylogenies. Table S3. Chikungunya and Mayaro viruses distribution throughout the years. Table S4. Clinical and epidemiological features of patients who tested positive for MAYV and CHIKV in Sinop, MT, from 2014 to 2018. Table S5. SNPs called from each genome and associated statistics. Table S6

Conflicts of Interest:
The authors declare no conflict of interest.