Recombination in Coronaviruses, with a Focus on SARS-CoV-2

Recombination is a common evolutionary tool for RNA viruses, and coronaviruses are no exception. We review here the evidence for recombination in SARS-CoV-2, reconcile the nomenclature for recombinants, discuss their origin and fitness, and speculate on how recombinants could shape the future of the COVID-19 pandemic.

Viruses 2022, 14, 1239

nested set of sgmRNAs sharing a 65-90 nt long common leader. One form of NHR that occurs between genomic and sgmRNA has been hypothesized to result from the collapse of the transcription complex during (-)-strand discontinuous transcription [25]. Such a disruption would leave a partial copy of the leader sequence within the genome near the junction between two genes. Remnants of leader RNAs were found in the genomes of wild-type HCoV-OC43 [26], and the HCoV-HKU1 genome contains two very significant segments of embedded leader sequence (Woo et al., 2005) [27][28][29].

Recombination in SARS-CoV-2

Recombinant Origin of SARS-CoV-2
Li et al. initially showed in March 2020 that SARS-CoV-2's entire receptor-binding motif (RBM) was introduced through recombination with coronaviruses from pangolins, possibly a critical step in the evolution of SARS-CoV-2's ability to infect humans [40]. This was later confirmed by Zhu et al. in December 2020 [41]. However, more recently, using sliding window bootstrap (SWB) to highlight the regions supporting phylogenetic relationships, SARS-CoV-2 was defined as a mosaic genome composed of regions sharing recent ancestry with three bat SCoV2rCs recently discovered in the Yunnan region of China (RmYN02, RpYN06, and RaTG13) or related to more ancient ancestors in bats from Yunnan and Southeast Asia [42], with no evidence of direct recombination with pangolin viruses.
A few months after the start of the COVID-19 pandemic, co-infections were documented without any evidence of recombination. The first detailed case, described in February 2021, was a co-infection with the NextStrain 20A and 20B lineages in which the relative abundance of the two lineages was followed over time: an otherwise healthy young Portuguese woman had prolonged viral shedding (97 days), with severe disease at first followed by a short second hospitalization [43]. More cases soon followed, e.g., co-infection by B.1.1.248 (either as major or minor haplotype) and B.1.1.33 or B.1.91, respectively [44], co-infection between B.1.1.7 and B.1.351 [45], or between GH and GR [46]. A less conclusive case of co-infection was reported from Iraq, suggesting that defective co-infecting strains may need helper strains [47]. A large study identified 53 (~0.18%) co-infection events (including with two Delta sublineages) out of 29,993 samples: 52 were co-infections with two SARS-CoV-2 lineages, and one sample, the first such report, carried three SARS-CoV-2 lineages [48]. Another study identified co-infections in around 0.61% of all samples investigated (nine cases) [49]. Co-infections should be distinguished from subclonal variants (so-called intra-host evolution or quasi-species swarm), which naturally occur during infection, especially in long-lasting infections of immunocompromised recipients, either spontaneously or under selective pressure from antiviral therapeutics [50].
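As a quick check of the co-infection frequency quoted above, the following minimal sketch recomputes the proportion from the two counts given in the text (53 of 29,993) and attaches a Wilson score interval. The interval is an addition for illustration only and is not reported by the cited study; the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts quoted in the text for the large study [48].
coinfections, samples = 53, 29_993
rate = coinfections / samples
low, high = wilson_interval(coinfections, samples)
print(f"co-infection rate: {rate:.2%} (Wilson interval {low:.2%} to {high:.2%})")
```

The point estimate rounds to the ~0.18% given in the text; the interval only reflects sampling uncertainty and says nothing about detection sensitivity for minor haplotypes.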

Evidence for Recombination in SARS-CoV-2
There is both in silico [51] and in vivo [52] evidence for recombination of different SARS-CoV-2 strains. Studies relying on linkage disequilibrium identified that SARS-CoV-2 recombination occurs at very low levels [52][53][54] or does not occur at all [55][56][57][58][59][60]. Several alternative methods are available for reconstructing genealogies explicitly in the presence of recombination, both with [61] and without [62][63][64] making the parsimony assumption, but none is tailored to the particular problem of detecting recombination in the presence of recurrent mutation. In fact, many tests of recombination assume that all mutations can only occur once at each site, and hence, recurrent mutation from convergent evolution (as it occurs in SARS-CoV-2) and systematic errors can confound signatures of recombination [7,27,36,65].
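To make the confounding effect of recurrent mutation concrete, here is a minimal sketch of the classical four-gamete test, chosen as one well-known example of a test that assumes each site mutates only once. The text does not name this test; it is used here purely for illustration, and the function name and toy alignment are hypothetical.

```python
from itertools import combinations


def four_gamete_violations(haplotypes: list[str]) -> list[tuple[int, int]]:
    """Return pairs of biallelic sites at which all four allele combinations occur.

    If every site mutated only once, such a pair could only arise through
    recombination. With recurrent mutation the same pattern can appear
    without any recombination, which is the confounding described above.
    """
    if not haplotypes:
        return []
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")

    violations = []
    for i, j in combinations(range(length), 2):
        alleles_i = {h[i] for h in haplotypes}
        alleles_j = {h[j] for h in haplotypes}
        if len(alleles_i) != 2 or len(alleles_j) != 2:
            continue  # the test is only defined for pairs of biallelic sites
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Toy alignment (hypothetical data): sites 0 and 1 show all four combinations.
toy = ["AAC", "ATC", "GAC", "GTT"]
print(four_gamete_violations(toy))
```

A flagged pair is compatible with recombination but does not prove it: a second, independent mutation at either site yields the same signature, which is why such tests can overcall recombination when convergent evolution is common.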
Hence, novel methodological approaches have been developed to detect recombinant genomes in SARS-CoV-2 lineages. Ignatieva et al. proposed a parsimony-based greedy heuristic algorithm for reconstructing plausible ancestral recombination graphs (KwARG) [66]: it does not scale well to large datasets but was powerful enough to disentangle the effects of recurrent mutation from recombination in the history of a sample [67]. Turakhia et al. developed Recombination Inference using Phylogenetic PLacEmentS (RIPPLES), which breaks a candidate recombinant sequence into distinct segments that are differentiated by mutations and separated by up to two breakpoints: for each set of breakpoints, RIPPLES places each of the corresponding segments using maximum parsimony to find the two parental nodes, hereafter termed donor and acceptor. RIPPLES is very fast on large datasets but is biased against identifying recombination events near the edges of the viral genome. Investigating a 1.6M-sample tree, they identified 606 recombination events, showing that approximately 2.7% of sequenced SARS-CoV-2 genomes have recombinant ancestry, that recombination breakpoints occur disproportionately in the Spike protein region, and that cases were coinfected with 2-3 SARS-CoV-2 variants on average [68].
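The idea of splitting a sequence at a breakpoint and matching each segment to a different parent can be illustrated with a deliberately simplified sketch. This is not the RIPPLES implementation: it assumes two known, pre-aligned candidate parents, a single breakpoint, and a plain mismatch count in place of parsimony placement on a phylogenetic tree. All names below are hypothetical.

```python
def best_single_breakpoint(
    query: str, left_parent: str, right_parent: str
) -> tuple[int, int]:
    """Find the breakpoint k minimising mismatches between the query and the
    mosaic left_parent[:k] + right_parent[k:].

    Returns (k, mismatches). k == 0 or k == len(query) means that a single
    parent explains the query at least as well as any mosaic.
    """
    n = len(query)
    if not (len(left_parent) == len(right_parent) == n):
        raise ValueError("all sequences must be aligned to the same length")

    # left_mis[k]: mismatches between query[:k] and left_parent[:k]
    left_mis = [0] * (n + 1)
    for k in range(n):
        left_mis[k + 1] = left_mis[k] + (query[k] != left_parent[k])

    # right_mis[k]: mismatches between query[k:] and right_parent[k:]
    right_mis = [0] * (n + 1)
    for k in range(n - 1, -1, -1):
        right_mis[k] = right_mis[k + 1] + (query[k] != right_parent[k])

    best = min(range(n + 1), key=lambda k: left_mis[k] + right_mis[k])
    return best, left_mis[best] + right_mis[best]


# Toy sequences (hypothetical): the query matches one parent on the left
# half and the other parent on the right half.
print(best_single_breakpoint("AAAACCCC", "AAAAAAAA", "CCCCCCCC"))
```

In this toy setting the breakpoint is only resolvable to the interval between the two nearest parent-distinguishing sites, which also hints at why events near the genome edges, where few informative sites remain on one side, are hard to detect.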
Haddad et al. observed recombination between different strains only in North American and European sequences [69]. Table 1 summarizes the recombinants between VOCs detected in more than one case (generally > 50 GISAID sequences). Many more cases are likely to have occurred between non-VOCs in the pre-VOC era or within individual hosts, such as a recombinant between B.1.160 and Alpha variants in a lymphoma patient chronically infected for 14 months [70]: nevertheless, those recombinants have not been fit enough to spread and outcompete the dominant strain of the moment.
Recombination has been proposed as a mechanism for the generation of B.1.1.7 (Alpha VOC) [71]. Accordingly, further recombination has been detected among B.  [72].
As soon as the possibility of recombination emerged, nomenclature systems started considering how to name these sublineages. In the PANGOLIN phylogeny, all top-level lineages that are recombinants have a prefix that begins with "X" [73]. In most cases, a minimum of 50 sequences is expected before a novel recombinant lineage is designated, but exceptions arise if the recombinant has a particular novelty or significance, with unusual breakpoints and/or parental lineages. As of 5 April 2022, CoV-lineages (https://cov-lineages.org/lineage_list.html) reports lineages from XA to XY, mostly from the UK (which contributes the vast majority of GISAID entries), suggesting new changes to nomenclature will soon be needed.

We will here separately discuss recombination between SARS-CoV-2 VOCs.

Alpha-Delta Recombinants
Recombination between Alpha and Delta SARS-CoV-2 variants has, to date, been reported in a single case despite co-circulation from June 2021 to December 2021. Sekizuka et al. reported a recombinant between Delta AY.29 and B.1.1.7 (later dubbed the XC lineage) [75].

Beta-Delta Recombinants
Recombination between Beta and Delta has, to date, been reported in a single case despite co-circulation since December 2021. He et al. reported possible evidence of recombination in the Orf1ab (174-2692 and 5839) and Spike genes (21,801-22,281, previously proposed as a putative recombination region between the progenitor of SARS-CoV-2 and Bat-SL-CoV) in a patient (dubbed "49H") maintaining a 1:9 Beta:Delta co-infection ratio for 14 days as part of an outbreak during a flight from South Africa to China [78].

Delta-BA.1 Recombinants
Delta and Omicron BA.1 co-circulated from November 2021 until February 2022, and cases of Delta and Omicron co-infection have been reported [79,80]. Their recombinants are often colloquially referred to as "Deltamicron" or "Deltacron". They were among the first recombinants to be named by PANGOLIN (XD, XF, and XS), but, as happened for BA.1, all Deltamicron recombinants were soon outcompeted by BA.2.
On 7 January 2022, virologist Leondios Kostrikis at the University of Cyprus in Nicosia deposited 52 sequences in GISAID, which were described by the media as Deltamicron, but upon further inspection, these appeared to be due to laboratory artifacts (most likely laboratory contamination) or co-infections and were withdrawn from GISAID [81].
Ou et al. reported that multiple additional amino acid mutations in the Delta Spike protein were also identified in the recently emerged Omicron isolates, which implied possible recombination events [82].
More individual cases of Deltamicron, which do not yet have a designated PANGOLIN name, have been reported. In both cases, the 5′-end of the viral genome was from the Delta genome and the 3′-end from Omicron, though the breakpoints were different [80].
Delta and BA.2 co-circulated minimally: accordingly, Delta-BA.2 recombinants have only been reported as a doublet from the end of January in Sweden (PANGO issue #519) and a singlet, again in January 2022, in Karnataka, India (PANGO issue #484). Another possible explanation for their scarcity is that countries with significant co-circulation (e.g., India and the Philippines) do not perform WGS very frequently.

BA.1-BA.2 Recombinants
Most Omicron recombinants identified to date have BA.1 as the acceptor and the breakpoint within ORF1ab, and hence preserve the Spike protein from BA.2 (e.g., XE, XG, XH, XJ, XK, XM, XN, XP, XQ, and XR): this is not surprising since BA.2 currently outcompetes BA.1. XP is the lone exception, having BA.1.1 (the BA.1 sublineage with the R346K mutation) as the acceptor (including Spike) and BA.2 as the donor. Among them, XE (also known as V-22APR-02 in Public Health England) is the most concerning, having a growth advantage over BA.2 estimated at first at +9.8% [86] and then raised to +20.9% (largely the same as observed for AY.4.2 over Delta in late 2021) [87]. This further increase in the basic reproductive number brings SARS-CoV-2 closer to being the most contagious virus in human history (see Figure 1).
Ou et al. identified, by scanning high-quality completed Omicron Spike gene sequences, 18 core mutations of BA.1 variants (frequency > 99%) (eight in NTD, five near the S1/S2 cleavage site, and five in S2). BA.2 variants share three additional amino acid deletions with the Alpha variants. BA.1 subvariants share nine common amino acid mutations (three more than BA.2) in the Spike protein with most VOCs, suggesting a possible recombination origin of Omicron from these VOCs. There are three more Alpha-related mutations (∆69-70, ∆144) in BA.1 than in BA.2, and therefore, BA.1 may be phylogenetically closer to the Alpha variant. Revertant mutations are found in some dominant mutations (frequency > 95%) in the BA.1 subvariant [82]. Colson et al. in Marseille detected two samples with a recombinant genome that was mostly that of a BA.2 variant but with a 3′ tip originating from BA.1 [88]. Gu et al. in Japan reported two more cases with a breakpoint near the 5′ end of the Spike gene (nucleotide position 20,055-21,618) [89]. Leuking et al. in Texas reported two more cases in immunosuppressed patients [84].

Conclusions
Most recombinants to date have been reported in the UK, Denmark, and the USA, mostly because those countries have denser genomic surveillance programs. None of the recombinants detected so far seems to grow fast enough to become dominant, and greater concern comes from the emerging L452R-carrying BA.2 (e.g., BA.2.12.1 in New York or BA.2.11 in Bretagne) or BA.4/BA.5 sublineages. Although recombination is extremely likely to occur between SARS-CoV-2 lineages, several factors limit the generation and spread of recombinants:
(1) Pandemic waves from recent VOCs are becoming shorter and shorter, minimizing the time of co-circulation of different VOCs.
(2) Apart from immunocompromised hosts, the duration of within-host viral replication is limited, again minimizing the room for co-infection/super-infection.
(3) The increasingly high reproductive number achieved by the currently dominating VOC (BA.2) creates a major barrier for any novel strain to emerge (Figure 2). While approaching the asymptote of the reproductive number, only marginal gains in transmissivity will be possible.
(4) Detecting a recombinant lineage requires WGS efforts to stay in place given that, as for XE, Spike gene sequencing is not enough to detect recombination. In this regard, many GISAID-powered bioinformatics tools are available (e.g., Cov-Spectrum [90] or SARS-CoV-2 Recombinant Finder [91]).

Figure 2. Basic reproductive number (R0) and estimated herd immunity threshold for SARS-CoV-2 variants of concern and the XE recombinant compared to other human pathogens. Please note herd immunity cannot be currently achieved with the current generation of systemically delivered vaccines [92].
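The relationship between the two quantities plotted in Figure 2 can be sketched with the standard textbook approximation, herd immunity threshold = 1 - 1/R0, which assumes a homogeneously mixing population and fully sterilizing immunity. The text does not state which formula was used for the figure, and the R0 values below are hypothetical round numbers, not estimates for any variant.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Classical threshold 1 - 1/R0; returns 0.0 when R0 <= 1 (no sustained spread)."""
    if r0 <= 1:
        return 0.0
    return 1.0 - 1.0 / r0


# Hypothetical, illustrative R0 values only (not taken from Figure 2).
previous = None
for r0 in (2, 4, 8, 16):
    hit = herd_immunity_threshold(r0)
    gain = "" if previous is None else f" (+{hit - previous:.1%} vs previous)"
    print(f"R0 = {r0:>2}: threshold = {hit:.1%}{gain}")
    previous = hit
```

Under this approximation each doubling of R0 adds less to the threshold than the previous one, since the curve flattens as it approaches 100%.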
Nevertheless, even extremely rare events are likely to happen under massive viral circulation. In particular, we should not forget that COVID-19 is panzootic, and the possibility of recombination between an animal-adapted lineage and a human-adapted lineage could have unpredictable consequences on the efficacy of current COVID-19 vaccines.

Data Availability Statement: All data are available at PubMed, medRxiv, and bioRxiv.

Conflicts of Interest:
We declare we have no conflict of interest to disclose.