A Potential SARS-CoV-2 Variant of Interest (VOI) Harboring Mutation E484K in the Spike Protein Was Identified within Lineage B.1.1.33 Circulating in Brazil

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemic in Brazil was dominated by two lineages designated as B.1.1.28 and B.1.1.33. Two SARS-CoV-2 variants harboring mutations at the receptor-binding domain of the Spike (S) protein, designated as lineages P.1 and P.2, evolved from lineage B.1.1.28 and are rapidly spreading in Brazil. Lineage P.1 is considered a Variant of Concern (VOC) because of the presence of multiple mutations in the S protein (including K417T, E484K, and N501Y), while lineage P.2 only harbors mutation S:E484K and is considered a Variant of Interest (VOI). In contrast, no epidemiologically relevant B.1.1.33-derived lineages have been described so far. Here we report the identification of a new SARS-CoV-2 VOI within lineage B.1.1.33 that also harbors mutation S:E484K and was detected in Brazil between November 2020 and February 2021. This VOI displayed four non-synonymous lineage-defining mutations (NSP3:A1711V, NSP6:F36L, S:E484K, and NS7b:E33A) and was designated as lineage N.9. The VOI N.9 probably emerged in August 2020 and has spread across different Brazilian states from the Southeast, South, North, and Northeast regions.


Introduction
The SARS-CoV-2 epidemic in Brazil was mainly driven by lineages B.1.1.28 and B.1.1.33, which probably emerged in February 2020 and were the most prevalent variants in most regions of the country until October 2020 [1,2]. Recent genomic studies, however, draw attention to the emergence of new SARS-CoV-2 variants in Brazil harboring mutations at the receptor-binding domain (RBD) of the Spike (S) protein that might impact viral fitness and transmissibility.
So far, one variant of concern (VOC), designated as lineage P.1, and one variant of interest (VOI), designated as lineage P.2, have been identified in Brazil, and both evolved from lineage B.1.1.28. The VOC P.1, first described in January 2021 [3], displayed an unusual number of lineage-defining mutations in the S protein (L18F, T20N, P26S, D138Y, R190S, K417T, E484K, N501Y, H655Y, T1027I), and its emergence was associated with a second COVID-19 epidemic wave in the Amazonas state [4,5]. The VOI P.2, first described in samples from October 2020 in the state of Rio de Janeiro, was distinguished by the presence of the S:E484K mutation in the RBD and four other lineage-defining mutations outside the S protein [6]. The P.2 lineage was detected as the most prevalent variant in several states across the country in late 2020 and early 2021 (https://www.genomahcov.fiocruz.br, accessed on 1 March 2021). However, none of the B.1.1.33-derived lineages described so far were characterized by mutations of concern in the S protein.
Here, we define the lineage N.9 within B.1.1.33 diversity, which harbors mutation E484K in the S protein and was detected in different Brazilian states between November 2020 and February 2021.

Materials and Methods
The Fiocruz COVID-19 Genomic Surveillance Network recovered SARS-CoV-2 lineage B.1.1.33 genomes from 422 positive samples collected between 12th March 2020 and 27th January 2021 (Supplementary Material). Sequencing protocols were as previously described [7,8]. The FASTQ reads obtained were imported into the CLC Genomics Workbench version 20.0.4 (Qiagen A/S, Denmark), trimmed, and mapped against the reference sequence EPI_ISL_402124 available in the EpiCoV database of GISAID (https://www.gisaid.org/, accessed on 1 March 2021). The alignment was refined using the InDels and Structural Variants module.
Sequences were then combined with 816 B.1.1.33 Brazilian genomes available in the EpiCoV database in GISAID by 1st March 2021 (Supplementary Table S1). Only high-quality (<1% of N) complete (>29 kb) SARS-CoV-2 genomes were used. This dataset was then aligned using MAFFT v7.475 [9] and subjected to maximum likelihood (ML) phylogenetic analysis using IQ-TREE v2.1.2 [10] under the GTR + F + G4 nucleotide substitution model, as selected by the ModelFinder application [11]. Branch support was assessed by the approximate likelihood-ratio test based on the Shimodaira-Hasegawa procedure (SH-aLRT) with 1000 replicates. The mutational profile was investigated using the Nextclade tool (https://clades.nextstrain.org, accessed on 1 March 2021), and the temporal signal was assessed by regression analysis of the root-to-tip genetic distance against sampling dates using the program TempEst [12].
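The genome inclusion criteria and the temporal-signal check described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names are hypothetical, the root-to-tip distances are assumed to have been extracted beforehand from an already rooted ML tree, and the regression is a plain least-squares fit rather than the root optimization that TempEst performs.

```python
from datetime import date


def passes_quality(seq: str, max_n_fraction: float = 0.01,
                   min_length: int = 29_000) -> bool:
    """Return True for a complete (>29 kb), high-quality (<1% N) genome."""
    seq = seq.upper()
    if len(seq) <= min_length:
        return False
    return seq.count("N") / len(seq) < max_n_fraction


def decimal_year(d: date) -> float:
    """Convert a calendar date to a fractional year."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days


def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance on sampling date.

    Returns (slope, x_intercept, r_squared). The slope approximates the
    evolutionary rate (substitutions/site/year), the x-intercept the date
    of the root, and r_squared the strength of the temporal signal.
    Assumes at least two distinct dates and a non-zero slope.
    """
    xs = [decimal_year(d) for d in dates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, -intercept / slope, r_squared
```

In this sketch, a positive slope with a reasonably high r_squared would suggest that the dataset carries enough temporal signal for molecular clock analysis.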
A time-scaled phylogenetic tree was estimated using the Bayesian Markov Chain Monte Carlo (MCMC) approach implemented in BEAST 1.10.4 [13]. The Bayesian tree was reconstructed using the GTR + F + I + G4 nucleotide substitution model, the non-parametric Bayesian skyline (BSKL) model as the coalescent tree prior, and a strict molecular clock model with a uniform substitution rate prior (8 × 10⁻⁴ to 10 × 10⁻⁴ substitutions/site/year). Ancestral node states were reconstructed using a reversible discrete phylogeographic model [14], where transitions between sampling locations (Brazilian states) were estimated under a continuous-time Markov chain (CTMC) rate reference prior. Convergence (effective sample size > 200) in parameter estimates was assessed using TRACER v1.7.
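To give a sense of scale for the strict-clock rate prior, the short sketch below converts a per-site rate into an expected number of substitutions per genome. The genome length of 29,903 nt (Wuhan-Hu-1, NC_045512.2) is an assumption introduced here for illustration, and the function name is hypothetical.

```python
# Assumed genome length (Wuhan-Hu-1, NC_045512.2); not stated in the text.
GENOME_LENGTH = 29_903


def expected_substitutions(rate_per_site_per_year: float, years: float,
                           genome_length: int = GENOME_LENGTH) -> float:
    """Expected substitutions per genome under a strict molecular clock."""
    return rate_per_site_per_year * genome_length * years


# Bounds of the uniform rate prior, expressed per genome per year.
lower = expected_substitutions(8e-4, 1.0)   # about 23.9
upper = expected_substitutions(10e-4, 1.0)  # about 29.9
```

Under these assumptions, the prior corresponds to roughly two to two and a half substitutions per genome per month.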

Results and Discussion
Mutation profile analysis revealed a total of 34 B.1.1.33 sequences harboring the S:E484K mutation. ML phylogenetic analysis revealed that 32 of these sequences branched in a highly supported (SH-aLRT = 98%) monophyletic clade that defines a potential new VOI designated as PANGO lineage N.9 [15]. The other two sequences harboring the S:E484K mutation branched separately in a highly supported (SH-aLRT = 100%) dyad (Figure 1a). The VOI N.9 is characterized by four non-synonymous lineage-defining mutations (NSP3:A1711V, NSP6:F36L, S:E484K, and NS7b:E33A) and also contains a group of three B.1.1.33 sequences from the Amazonas state that have no sequencing coverage at position 484 of the S protein but share the remaining N.9 lineage-defining mutations (Table 1; Table S2).
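The screening for S:E484K, including the case of genomes without coverage at the site, can be sketched as follows. This is an illustration only and not the Nextclade implementation: it assumes genomes aligned to Wuhan-Hu-1 (NC_045512.2) coordinates with no insertions, an S gene starting at nucleotide 21,563, and hypothetical function names.

```python
# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: 1-based start of the S gene in NC_045512.2 coordinates.
S_GENE_START = 21_563


def spike_residue(genome: str, aa_position: int,
                  gene_start: int = S_GENE_START) -> str:
    """Translate one Spike codon; return 'X' if it is ambiguous or missing."""
    offset = gene_start - 1 + 3 * (aa_position - 1)
    codon = genome[offset:offset + 3].upper()
    return CODON_TABLE.get(codon, "X")


def classify_484(genome: str) -> str:
    """Label a genome by the residue found at Spike position 484."""
    residue = spike_residue(genome, 484)
    if residue == "K":
        return "E484K"
    if residue == "E":
        return "484E"
    if residue == "X":
        return "no coverage"
    return f"E484{residue}"
```

A genome labeled "no coverage" here would still need to be checked against the remaining lineage-defining mutations, as was done for the three Amazonas sequences.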

Table 1. N.9 lineage-defining mutations. Columns: Genomic Region (Protein); Nucleotide; Amino Acid.

Among the 35 genomes identified so far as VOI N.9, 10 Brazilian states were represented, suggesting that this lineage is already widely dispersed in the country. The VOI N.9 was first detected in Sao Paulo state on 11 November 2020, and soon afterwards in other Brazilian states from the South (Santa Catarina), North (Amazonas and Para), and Northeast (Bahia, Maranhao, Paraiba, Pernambuco, Piaui, and Sergipe) regions (Figure 1b). Analysis of the temporal structure revealed that the overall divergence of lineage N.9 is consistent with the substitution pattern of other B.1.  (Figure 1d). This analysis also revealed that some additional mutations were acquired during the evolution of VOI N.9 in Brazil, determining two highly supported (PP > 0.95) subclades. One subclade, which mostly contains sequences from Sao Paulo state, probably arose on 16th October (95% HPD: 22nd September to 5th November) and was defined by additional mutations NSP3:S1285F and NSP15:K12N. The other subclade, which mostly comprises sequences from the North region, probably arose on 29th October (95% HPD: 5th October to 17th November) and was defined by additional mutations NSP1:T170I and S:A344S (Figure 1d).

Conclusions
In this study, we identified the emergence of a new VOI (S:E484K) within lineage B.1.1.33 circulating in Brazil. The VOI N.9 displayed a low prevalence (~3%) among all Brazilian SARS-CoV-2 samples analyzed between November 2020 and February 2021, but it is already widely dispersed in the country and comprises a high fraction (35%) of the B.1.1.33 sequences detected in that period. Mutation S:E484K has been identified as one of the most important substitutions that could contribute to immune evasion, as it confers resistance to several monoclonal antibodies and also reduces the neutralization potency of some polyclonal sera from convalescent and vaccinated individuals [16-18]. Mutation S:E484K has emerged independently in multiple VOCs (P.1, B.1.351, and B.1.1.7) and VOIs (P.2 and B.1.526) [19] spreading around the world, and it is probably an example of convergent evolution and ongoing adaptation of the virus to the human host.
The onset date of VOI N.9, estimated here at around mid-August, roughly coincides with the estimated timing of emergence of the VOI P.2 in late July [6] and shortly precedes the detection of a major global shift in the SARS-CoV-2 fitness landscape after October 2020 [20]. These findings indicate that 484K variants probably arose simultaneously in the two most prevalent viral lineages circulating in Brazil around July-August, but may have only acquired some fitness advantage, which accelerated their dissemination, after October 2020. We predict that the Brazilian COVID-19 epidemic during 2021 will be dominated by a complex array of B.1.1.28 (S:E484K) variants, including P.1 and P.2, and B.1.1.33 (S:E484K) variants that will completely replace the parental 484E lineages that drove the epidemic in 2020. Implementation of efficient mitigation measures in Brazil is crucial to reduce community transmission and prevent the recurrent emergence of more transmissible variants that could further exacerbate the epidemic in the country.

Acknowledgments:
The authors wish to thank all the health care workers and scientists who have worked hard to deal with this pandemic threat, the GISAID team, and all the submitters to the EpiCoV database. The GISAID acknowledgment table containing the sequences used in this study is available in Supplementary Table S1. We also appreciate the support of the Fiocruz COVID-19 Genomic Surveillance Network (http://www.genomahcov.fiocruz.br/; accessed on 1 March 2021) members, the Respiratory Viruses Genomic Surveillance, the General Coordination of the Laboratory Network (CGLab), the Brazilian Ministry of Health (MoH), the Brazilian States Central Laboratories (LACEN), and the Amazonas surveillance teams for their partnership in viral surveillance in Brazil.