Convergent Evolution in SARS-CoV-2 Spike Creates a Variant Soup from Which New COVID-19 Waves Emerge

The first 2 years of the COVID-19 pandemic were mainly characterized by recurrent mutations of SARS-CoV-2 Spike protein at residues K417, L452, E484, N501 and P681 emerging independently across different variants of concern (Alpha, Beta, Gamma, and Delta). Such homoplasy is a marker of convergent evolution. Since Spring 2022 and the third year of the pandemic, with the advent of Omicron and its sublineages, convergent evolution has led to the observation of different lineages acquiring an additional group of mutations at different amino acid residues, namely R346, K444, N450, N460, F486, F490, Q493, and S494. Mutations at these residues have become increasingly prevalent during Summer and Autumn 2022, with combinations showing increased fitness. The most likely reason for this convergence is the selective pressure exerted by previous infection- or vaccine-elicited immunity. Such accelerated evolution has caused failure of all anti-Spike monoclonal antibodies, including bebtelovimab and cilgavimab. While we are learning how fast coronaviruses can mutate and recombine, we should reconsider opportunities for economically sustainable escape-proof combination therapies, and refocus antibody-mediated therapeutic efforts on polyclonal preparations that are less likely to allow for viral immune escape.


Introduction
In the third year of the COVID-19 pandemic, a relevant proportion of the general population is now largely protected from severe COVID-19 disease and death by mass vaccination campaigns and by immunity from former infection, as shown by the decongestion of hospitals in the western hemisphere. Unfortunately, SARS-CoV-2 remains a life-threatening pathogen for frail and immunocompromised (IC) patients who are unable to mount a protective immune response [1]. IC individuals create a cohort population in whom the virus can persistently replicate, which is a novelty for pandemics [2]. In this regard, advancements in therapeutics and supportive care have increased the prevalence of IC patients up to 2.8% compared to just a few decades ago [3]. SARS-CoV-2 infection in IC patients is arguably the most difficult current problem in the COVID-19 pandemic: these individuals can have large viral loads which inevitably include antigenically different viruses and have a diminished capacity for clearing the infection [2].
Since Summer 2022, SARS-CoV-2 transmission has proceeded undisturbed worldwide after the relaxation of nonpharmaceutical interventions such as lockdowns, social distancing, and face masks, which together with the waning of infection-and vaccine-elicited 2 of 17 immunity, has increased opportunities for spread and the number of susceptible individuals, respectively. Hence, the increase in the "human culture medium" has led to large infectious waves during 2022, with estimated excess deaths similar to those observed in 2020 [4]. While acquisition and waning of immunity from former infections is not a novel occurrenece, in the COVID-19 pandemic the timely introduction of vaccination campaigns and therapeutics targeting the viral receptor domain has has the potential to alter the course of a coronavirus pandemic. There is no historical precedent for the current situation. The combined action of increasing cumulative viral loads in the "human culture medium" and such selective pressures has led to an unprecedented increase in viral diversification in 2022. WHO nomenclature for variants of concern remained stuck at "Omicron" [5], while alternative naming schemes introduced novel names to designate lineages that are responsible for thousands of hospitalizations. The most refined phylogeny to date has been released by PANGOLIN which counts more than 650 designated Omicron sublineages at the time of writing (https://github.com/cov-lineages/pango-designation) (accessed on 26 December 2022)), accounting for more than 45% of SARS-CoV-2 variability (Figure 1). Of interest, such increase in divergence was detected despite a 75% reduction in genomic surveillance in 2022. After peaking at 1 million sequences in January 2022, the number of new sequences deposited at the site decreased to 248,000 in November 2022 (https: //cov-spectrum.org/explore/World/AllSamples/Past6M/sequencing-coverage (accessed on 26 December 2022)). Consequently, it is likely that the number of defined circulating sublineages is an underestimate of the viral genetic variation in the current pandemic. elicited immunity, has increased opportunities for spread and the number of susceptible individuals, respectively. Hence, the increase in the "human culture medium" has led to large infectious waves during 2022, with estimated excess deaths similar to those observed in 2020 [4]. While acquisition and waning of immunity from former infections is not a novel occurrenece, in the COVID-19 pandemic the timely introduction of vaccination campaigns and therapeutics targeting the viral receptor domain has has the potential to alter the course of a coronavirus pandemic. There is no historical precedent for the current situation. The combined action of increasing cumulative viral loads in the "human culture medium" and such selective pressures has led to an unprecedented increase in viral diversification in 2022. WHO nomenclature for variants of concern remained stuck at "Omicron" [5], while alternative naming schemes introduced novel names to designate lineages that are responsible for thousands of hospitalizations. The most refined phylogeny to date has been released by PANGOLIN which counts more than 650 designated Omicron sublineages at the time of writing (https://github.com/cov-lineages/pango-designation) (accessed on 26 December 2022)), accounting for more than 45% of SARS-CoV-2 variability (Figure 1). Of interest, such increase in divergence was detected despite a 75% reduction in genomic surveillance in 2022. After peaking at 1 million sequences in January 2022, the number of new sequences deposited at the site decreased to 248,000 in November 2022 (https://covspectrum.org/explore/World/AllSamples/Past6M/sequencing-coverage (accessed on 26 December 2022)). Consequently, it is likely that the number of defined circulating sublineages is an underestimate of the viral genetic variation in the current pandemic.

Mutation Rates and Mutational Spectra
Mutation rate (MR) is often used interchangeably to indicate 2 different things: occurrence of mutations within a single host (intrahost evolution at individual level without any demand for outcompeting co-circulating strains) or step-wise accumulation of muta-

Mutation Rates and Mutational Spectra
Mutation rate (MR) is often used interchangeably to indicate 2 different things: occurrence of mutations within a single host (intrahost evolution at individual level without any demand for outcompeting co-circulating strains) or step-wise accumulation of mutations ("antigenic drift") that get fixed within a species. While the first meaning has been demonstrated (e.g., in IC hosts [6][7][8], and after administration of the small molecule antiviral molnupiravir which known to increase G→A and C→U transition mutations [9]), from an evolutionary standpoint it is the second meaning which is more interesting and already well-established for other respiratory pathogens [10], including the related human coronavirus 229E [11].
Early in the pandemic, data suggested that mass vaccination could restrict SARS-CoV-2 mutation rates (MR): the diversity of the SARS-CoV-2 lineages declined at the country-level with increased rate of mass vaccination (r = −0.72) and vaccine breakthrough patients harbor viruses with 2.3-fold lower diversity in known B cell epitopes compared to unvaccinated COVID-19 patients [12]. Additionally, vaccination coverage rate was inversely correlated to the MR of the SARS-CoV-2 Delta variant in 16 countries (r 2 = 0.878) [13].
Ruis et al. found a halving in the relative rate of G→T mutations in Omicron compared to pre-Omicron sublineages [14]. To exclude selective pressures on the derived protein structures, Bloom et al. found similar results by repeating the analysis focusing on 4-fold degenerate codons (i.e., codons that can tolerate any point mutation at the third position, although codon usage bias restricts this in practice in many organisms) [15]. Replicaion of viruses and bacteria in the lower respiratory tract has been associated with high levels of G > T mutations and for SARS-CoV-2 this effect occurred with Delta but was lost in Omicron [14]. Such changes on mutation type and rate could theoretically stem from from mutations affecting genome replication and packaging [16], as well as from mutations in genes encoding proteins that antagonize host innate-immune factors (e.g., APOBEC), which otherwise will mutate viral nucleic acids [17][18][19] and/or from environmental factors [9].
The average MR of the entire SARS-CoV-2 genome was estimated from the related mouse hepatitis virus (MHV) to be 10 −6 nucleotides per cycle, or 4.83 × 10 −4 subs/site/year, which is similar, or slightly lower, that observed for other RNA viruses [20]. Following the removal of mandatory nonpharmaceutical interventions such as face masks, social distancing, and quarantine in most western countries, vaccination was not sufficient to prevent hyperendemicity. The MR of SARS-CoV-2 consequently doubled from 23 substitutions per year before December 2021 to 45 substitutions per year after December 2021, coinciding with the advent of omicron (Figure 2), which approximates 14.5/subs/year for the~30 kb SARS-CoV-2 genome. This rate should set the upper limit for mutation frequency, as many mutations will not be viable and/or transmissible and thus not observed in the sequencing data at baseline. Despite this, the previously acknowledged reductio in sequencing intensity in 2022 leaves some room for higher MR. It had been previously shown that the P203L mutation in the error-correcting exonuclease non-structural protein 14 (nsp14) almost doubles the genomic MR (from 20 to 36 SNPs/year) [21]. While this change is not prevalent in Omicron lineages, many changes in the replication machinery appeared with Omicron, such as K38R, ∆1265, and A1892T in Nsp3; P132H in Nsp5; I189V in Nsp6; P323L in Nsp12; and I42V in Nsp14, and some of them could have contributed to the MR jump [22].

Convergent Evolution
In the midst of such massive lineage divergence, convergent evolution towards certain motifs has become increasingly manifest.
In the pre-Omicron and pre-vaccine era, variants of concern (VOCs) notably converged to mutations which resulted in the following amino acid changes: K417N (Beta and Gamma), L452R (Delta), E484K (Beta and Gamma), N501Y (Alpha, Beta, Gamma) and P681X (Alpha and Delta) [23]. These amino acid changes have been proposed to increase the stability of the trimeric protein [24,25,26], and they emerged in the absence of significant selective pressures by the immune system. K417N, E484A, N501Y and P681H remained hallmarks of BA.2.*, while the BA.2-paraphyletic BA.4/5 (i.e., a clade stemming from BA.2) acquired L452R and F486V and the Q493R reversion.

Convergent Evolution
In the midst of such massive lineage divergence, convergent evolution towards certain motifs has become increasingly manifest.
In the pre-Omicron and pre-vaccine era, variants of concern (VOCs) notably converged to mutations which resulted in the following amino acid changes: K417N (Beta and Gamma), L452R (Delta), E484K (Beta and Gamma), N501Y (Alpha, Beta, Gamma) and P681X (Alpha and Delta) [23]. These amino acid changes have been proposed to increase the stability of the trimeric protein [24][25][26], and they emerged in the absence of significant selective pressures by the immune system. K417N, E484A, N501Y and P681H remained hallmarks of BA.2.*, while the BA.2-paraphyletic BA.4/5 (i.e., a clade stemming from BA.2) acquired L452R and F486V and the Q493R reversion.
This "variant soup" can be organized and stratified according to the number of key Spike mutations present, and although the number of key mutations acquired correlates well with increasing fitness, this is only so within each lineage, which shows that the biology of SARS-CoV-2 infection goes beyond what occurs in the Spike protein ( Figure 4). At present, XBB.1.5 displays the highest relative growth advantage compared to the BA.5.2.1 baseline. Convergence was clearly observed at the amino acid level, with different nucleotide mutations leading to similar amino acid changes: e.g., N460K was caused by T22942A in BQ.1.*, XAW and some of the BA. 5     Step-wise accumulation of key Spike mutations involved in immune escape within SARS-CoV-2 Omicron sublineages increase the relative growth rate. Lineage name text is color coded, where BA.5 descendants are in blue text, BA.4 descendants in green text and BA.2.75 descendants are in red text. Each mutation is color coded as shown in the mutation key, and depicted as colored squares when present or white squares if absent. Number of key mutations of each lineage is summarized at the top. The F486P mutation is counted as two mutations due to the inherent increased fitness displayed by variants that carry this mutation relative to variants with F486S or F486V. Relative growth rates were calculated using BA.5 lineage as baseline, for groups of BA.4, BA.5, BA.2.75 and XBB descendant lineages with each exact total number of key mutations. Relative growth rates were calculated using global data, using CoV-Spectrum [30]. as of 26 December 2022.

Escalating Immune Escape
SARS-CoV-2 evolution represents an accelerated movie of Darwinian selection. Variants that are more likely to escape vaccine-and infection-elicited immunity that are more fit expand at the expense of those less fit. While it may sound obvious, we now have formal evidence of such evolution, with PANGOLIN descendants invariably having increased RBD immune escape scores compared to parental strains ( Figure 5). In this ongoing race, descendants invariably replace parents, as these are fitter in hosts with pre-existing immunity. Step-wise accumulation of key Spike mutations involved in immune escape within SARS-CoV-2 Omicron sublineages increase the relative growth rate. Lineage name text is color coded, where BA.5 descendants are in blue text, BA.4 descendants in green text and BA.2.75 descendants are in red text. Each mutation is color coded as shown in the mutation key, and depicted as colored squares when present or white squares if absent. Number of key mutations of each lineage is summarized at the top. The F486P mutation is counted as two mutations due to the inherent increased fitness displayed by variants that carry this mutation relative to variants with F486S or F486V. Relative growth rates were calculated using BA.5 lineage as baseline, for groups of BA.4, BA.5, BA.2.75 and XBB descendant lineages with each exact total number of key mutations. Relative growth rates were calculated using global data, using CoV-Spectrum [30]. As of 26 December 2022.

Escalating Immune Escape
SARS-CoV-2 evolution represents an accelerated movie of Darwinian selection. Variants that are more likely to escape vaccine-and infection-elicited immunity that are more fit expand at the expense of those less fit. While it may sound obvious, we now have formal evidence of such evolution, with PANGOLIN descendants invariably having increased RBD immune escape scores compared to parental strains ( Figure 5). In this ongoing race, descendants invariably replace parents, as these are fitter in hosts with pre-existing immunity.
RBD immune escape can nowadays be estimated in silico based on in vitro data (https://jbloomlab.github.io/SARS2_RBD_Ab_escape_maps/escape-calc/ (accessed on 26 December 2022)). RBD immune escape is clearly a moving scale with an evolving asymptote. E.g., by changing vaccine composition [32] we are likely to reset the "game".

Mutually Exclusive Mutations
Mutually exclusive mutations across the entire SARS-CoV-2 genome have been previously studied [33], but the vast constellation of Omicron sublineages provides an unique opportunity for an in-depth exploration of substitutions that are incompatible in combination. The best examples so far are N450X and R346X mutations, which have not yet been observed together in more than 6 millions of Omicron sequences. Two dipolar interactions exist between the carboxamide group of Asn and the guanidino group of Arg in the an- Figure 6. Sequential mutational events at the same Spike amino acid residues and change in ACE2 affinity score (as calculated here: https://github.com/jbloomlab/SARS-CoV-2-RBD_DMS_Omicron/ blob/main/results/final_variant_scores/final_variant_scores.csv (accessed on 26 December 2022)). Chart created on NextStrain [31] (https://next.nextstrain.org/staging/nextclade/sars-cov-2/21L? gmin=15&l=scatter&scatterX=ace2_binding&scatterY=immune_escape&showBranchLabels=all (accessed on 26 December 2022)).

Mutually Exclusive Mutations
Mutually exclusive mutations across the entire SARS-CoV-2 genome have been previously studied [33], but the vast constellation of Omicron sublineages provides an unique opportunity for an in-depth exploration of substitutions that are incompatible in combination. The best examples so far are N450X and R346X mutations, which have not yet been observed together in more than 6 millions of Omicron sequences. Two dipolar interactions exist between the carboxamide group of Asn and the guanidino group of Arg in the ancestral sequence, stabilizing the receptor binding module (RBM) tertiary fold (Figure 7, left). R346 resides within a short loop between helix α1 and beta strand 1. N450 is a constituent of the extended RBM insertion into the overall five-stranded antiparallel beta-sheet fold of the domain. As the RBM is the critical determinant for the interaction with ACE2, maintaining its optimal conformation through this stabilizing bond is likely to be essential for pathogenesis. N450D is a common substitution among Omicron lineages. This mutation would result in a similarly sized sidechain but different electrostatic properties (carboxamine → carboxylic acid). This substitution would likely result in a stronger interaction with position 450, as one H-bonding is maintained, and one is replaced with ionic salt bridge between the deprotonated oxygen and the basic guanidino group, provided that the residue at position 346 remains Arg. On the other hand, any substitution at position 346, with the exception of Lys, would result in a significantly shorter, non-cationic sidechain, which would abrogate this RBM-stabilizing interaction. R346K would partially maintain this interaction, replacing a bidentate linkage to N450 with a monodentate dipolar interaction. Thus, the observed mutual exclusivity of mutations at these two sites can be rationalized by their contributions to this stabilizing intradomain interaction.  In the wild-type sequence, the basic R346 sidechain interacts with the N450 residue through a pair of hydrogen bond interactions. N450D results in a similarly sized sidechain, but altered electrostatics. One hydrogen bond is maintained between the neutral oxygen of Asp and Nε of Arg, and a new salt bridge is formed between the anionic deprotonated oxygen of Asp and the cationic center of the guanidino group of Arg. In the case of R346X, any substitution except lysine would result in a side chain that is significantly shorter and non-cationic, thus dissolving the interactions between N450 or other common substitutions at that position.
Other combinations have been exceedingly rare so far, and seen only in cryptic lineages (e.g., F486P and K444 mutations), but no steric justifications can be found for them.

Epistasis
While the focus so far has been mostly on the Spike protein, it is likely that convergent evolution is acting on genes other than Spike. Given that the Spike protein is the best protective antigen for both infection and vaccines, mutations in other genes are more likely to provide fitness advantages if they affect Spike expression. E.g., ORF8 limits the amount of Spike proteins that reaches the cell surface and is incorporated into virions, reducing recognition by anti-SARS-CoV-2 antibodies [34]. ORF8 has accordingly been target of convergent evolution in Omicron (e.g., ORF8:S667F in BR.2.1, ORF8:G8x in XBB.1) and in SARS-related coronaviruses [35].

Selective Pressures from Therapeutics Targeting the Spike Protein
There is a theoretical concern that, in addition to vaccines-and infection-elicited immunity, selective pressure by prophylactic and therapeutic anti-Spike monoclonal antibodies (mAb), can contribute to the emergence of novel SARS-CoV-2 sublineages [36]. A very few of those emerging sublineages could be fit enough to compete with the lineages that are dominating at that time to become locally or globally dominant.
While evolution can occur in the absence of selective pressures due to the intrinsic genomic MR (see section above), extended half-life mAbs (such as Evusheld™) administered for pre-exposure prophylaxis or therapy to chronically infected immunocompromised patients at subneutralizing concentrations provide ideal conditions to facilitate the emergence of mutants [37], for these patients often cannot clear the infection and have high viral loads. Establishing a cause-effect relationship is difficult, but intra-host evolution studies provide a highly suggestive temporal association [38]. mAbs have come of age since the advent of the SARS-CoV-2 Delta VOC, but because of the resistance of Omicron to most authorized mAbs, their use since Spring 2022 has been largely limited to Evusheld™ (for which cilgavimab was the only ingredient with residual activity) and bebtelovimab.
We know from in vitro deep mutational scanning studies the exact mutations that cause resistance to each mAb. S:F486X mutations impart resistance to tixagevimab, S:R346X, S:K444X and S:S494X mutations impart resistance to cilgavimab, while S:K444X and S:V445X mutations impart resistance to bebtelovimab (Table 1). We wondered whether the recent increase in the circulation of Omicron sublineages with S:R346X mutations could partly be the result of selective pressure by mAbs. We compared the prevalence of R346X mutations in countries with high versus low usage of Evusheld™ (France vs. UK) or bebtelovimab (USA vs. UK) (Figure 8). UK also represents an ideal control because of its very high SARS-CoV-2 genome sequencing rate. We discuss these 2 scenarios in details below.  2022)). Green means fold-reductions <5; Orange means fold-reduction 5-100; Red means foldreduction in IC 50 > 100 compared to wild-type; blank means no data available.

S:R346X
Different mutations can affect the R346 residue. R346G has been selected in vitro by cilgavimab + tixagevimab [39]. R346S occurred in vitro after 12 weeks of propagating SARS-CoV-2 in the presence of sotrovimab, and before the other epitope mutation (P337L) which leads to sotrovimab resistance [40]. R346I has been selected in vitro under the selective pressure from cilgavimab [41,42]. Lee et al. reported mutually exclusive substitutions at residues R346 (R346S and R346I) and E484 (E484K and E484A) of Spike protein and continuous turnover of these substitutions in 2 immunosuppressed patients [43]. Unfortunately, in vivo selection evidences are so far available for sotrovimab [44] but not for Evusheld™. It should be anyway noted that R346T [45,46] and R346I [47] have been reported to spontaneously develop and fix in 3 IC patients without any selective pressure.
While R346K was associated with the BA.1.1 wave (see Figure 8), the plethora of different Omicron sublineages that showed convergent evolution towards R346I, R346S or R346T is of concern.
• R346K (previously seen only in VOC Mu/B.1.621 [48]) occurred exclusively in BA. There is a theoretical concern that, in addition to vaccines-and infection-elicited immunity, selective pressure by prophylactic and therapeutic anti-Spike monoclonal antibodies (mAb), can contribute to the emergence of novel SARS-CoV-2 sublineages [36]. A very few of those emerging sublineages could be fit enough to compete with the lineages that are dominating at that time to become locally or globally dominant.
While evolution can occur in the absence of selective pressures due to the intrinsic genomic MR (see section above), extended half-life mAbs (such as Evusheld™) administered for pre-exposure prophylaxis or therapy to chronically infected immunocompromised patients at subneutralizing concentrations provide ideal conditions to facilitate the emergence of mutants [37], for these patients often cannot clear the infection and have high viral loads. Establishing a cause-effect relationship is difficult, but intra-host evolution studies provide a highly suggestive temporal association [38]. mAbs have come of age since the advent of the SARS-CoV-2 Delta VOC, but because of the resistance of Omicron to most authorized mAbs, their use since Spring 2022 has been largely limited to Evusheld™ (for which cilgavimab was the only ingredient with residual activity) and bebtelovimab.
We know from in vitro deep mutational scanning studies the exact mutations that cause resistance to each mAb. S:F486X mutations impart resistance to tixagevimab, S:R346X, S:K444X and S:S494X mutations impart resistance to cilgavimab, while S:K444X and S:V445X mutations impart resistance to bebtelovimab (Table 1). We wondered whether the recent increase in the circulation of Omicron sublineages with S:R346X mutations could partly be the result of selective pressure by mAbs. We compared the prevalence of R346X mutations in countries with high versus low usage of Evusheld™ (France vs. UK) or bebtelovimab (USA vs. UK) (Figure 8). UK also represents an ideal control because of its very high SARS-CoV-2 genome sequencing rate. We discuss these 2 scenarios in details below.

S:K444X
The K444E/R mutations were reported in vitro after selection with cilgavimab [42]. Resistance studies with bebtelovimab selected the K444T escape mutations for BA.2 [56]. Ortega et al. found that K444R (previously found in the Beta VOC [57]), K444Q, and K444N mutations can change the virus binding affinity to the ACE2 receptor [58]. Weisblum et al. found that K444R/Q/N occurs after exposure to convalescent plasma [59]. Among largely diversified VOCs such as Delta, S:K444N was associated with reduced remdesivir binding and increased mortality [60].

Conclusions
The convergent evolution of Omicron sublineages appears to reflect the selective pressure exerted by previous infection-or vaccine-elicited immunity. Vaccines and perhaps antibody therapeutics have without doubt saved an untold number of lives but have the potential to modify the evolutionary trajectory of the virus. While other viruses such as influenza and HIV routinely produced new variants because of their mutagenicity, the scale at which SARS-CoV-2 has spun off new variants and lineages appears unprecedented in modern virology history. The SARS-CoV-2 vaccines reduce severe disease and mortality but do not confer sufficient immunity to prevent re-infection with viral replication in vaccinated hosts. Hence, we have the unusual situation of viral replication in hosts where the immune system is placing evolutionary pressure on the virus to select variants that can escape vaccine-elicited immunity in addition to infection-elicited immunity. Whether this rapid evolutionary trajectory is the result of viral replication properties, replication in immune hosts or both is unknown but conditions present in the past year of the pandemic have produced a remarkable natural experiment in viral evolution for which we cannot discern its conclusion.
Insights from structural biology has shown how some mutations are mutually exclusive, which could help the design of next-generation vaccines. However, the latter could reset the run for immune escape, perpetuating the never-ending game of host and pathogen. Viral recombination [61] (more than 50 lineages censed at the time of writing, with both simple and complex variants [62]) and sudden reemergence of former VOCs [63] have to be considered as further drivers for evolutionary saltation.