Early Detection of the Recombinant SARS-CoV-2 XAN Variant in Bulgaria: Initial Genomic Insights into Yet Another Piece of the Growing Puzzle of Recombinant Clades

The first recombinant SARS-CoV-2 variants were identified in 2022, raising public health concerns. Recombinant variants have gained importance, especially since 20 November 2022, when the WHO designated the recombinant variant XBB and its lineages as subvariants that require monitoring. In this study, we provide the first insights into the new SARS-CoV-2 variant named XAN, a recombinant of the Omicron sub-lineages BA.2 and BA.5. To our knowledge, this is the first report of the recombinant SARS-CoV-2 XAN variant identified in Bulgaria.


Introduction
In January 2022, for the first time since the emergence of the COVID-19 pandemic, Leondios G. Kostrikis announced that his research group at the University of Cyprus in Nicosia had discovered a novel SARS-CoV-2 viral strain that purportedly shared specific properties with the Delta and Omicron variants [1]. This was the first report of a likely recombination event between two distinct SARS-CoV-2 variants, and the strain was initially designated "Deltacron" [2]. The claim was immediately challenged by some researchers, who suggested that a laboratory error might be a sounder explanation for the Cyprus laboratory findings [3].
Coronaviruses belong to Nidovirales, an order whose members are able to proofread their genomes during replication and recombination. By default, it could therefore be assumed that SARS-CoV-2 does not recombine as frequently as some other viruses, although it retains the potential to generate heterogeneous and dynamic populations. However, soon after the first announcement of possible recombination, SARS-CoV-2 was found to be no exception: recombination is part of its nature and allows it to adapt and evade immunity [4]. Recombination in these viruses is possible during superinfection or coinfection with different SARS-CoV-2 lineages, and the first coinfection case was observed in a healthy young female patient with sustained viral shedding [5]. Numerous recombinant lineages have already been identified in different locations and have been designated by PANGO with the 'X-' lineage prefix [6].
Recombinant variants, including those between Delta and Omicron, have already been described. In these viral strains, the backbone was of Delta origin, while the spike protein came from Omicron, so the resulting recombinant had features of both progenitors [7]. Moreover, the combination of variants such as Delta, which is known to cause more severe disease, and the less virulent but more contagious Omicron could lead to the emergence of more virulent and more transmissible viruses through the inheritance of the most pathogenic traits of their predecessors [7].
Recombination has been reported as one of the main drivers of viral evolution, and it was naturally considered one of the possible events that could have led to the emergence of SARS-CoV-2 itself. Already in the first few months of the pandemic, Li et al. demonstrated that the entire receptor-binding motif of SARS-CoV-2 was introduced through recombination with coronaviruses from pangolins, an event with a key role in the evolution of the ability of SARS-CoV-2 to infect humans [8]. More recently, a sliding-window bootstrap analysis, used to highlight the regions supporting phylogenetic relationships, characterized SARS-CoV-2 as having a mosaic genome closely related to bat viruses from Yunnan [9].
The SARS-CoV-2 Omicron variant has become the dominant circulating strain in the world as a result of its high transmissibility and its capability to evade immune protection induced by both natural infection and vaccination. Owing to its ubiquity, intravariant recombinant lineages have emerged among the Omicron subvariants multiple times [10]. Numerous recombination events are now being observed in this virus, and the number of recombinant forms appears to be increasing.
In this study, we provide the first genomic insights regarding the emerging SARS-CoV-2 XAN recombinant variant, a recombinant strain derived from the Omicron sublineages BA.2 and BA.5. To our knowledge, this is the first report that provides preliminary phylogenetic insights into the SARS-CoV-2 XAN viral strain circulating in Bulgaria [11].

Materials and Methods
The viral RNA was isolated from 400 µL of nasal swab suspensions obtained for routine COVID-19 diagnostic/genomic surveillance purposes of the National Center of Infectious and Parasitic Diseases, Sofia, Bulgaria, using an ExiPrep 48 Viral DNA/RNA Kit (Bioneer, Daejeon, Republic of Korea) and ExiPrep 48 Dx (Bioneer) according to the manufacturer's instructions. Real-time reverse transcription polymerase chain reaction (RT-qPCR) was performed using the GeneFinder™ COVID-19 Plus RealAmp Kit (OSANG Healthcare Co., Ltd., Anyang-si, Gyeonggi-do, Republic of Korea) targeting the RdRp (RNA-dependent RNA polymerase), E (envelope) and N (nucleocapsid) SARS-CoV-2 genes. Whole-genome next-generation sequencing (NGS) of SARS-CoV-2 was conducted on samples from randomly selected SARS-CoV-2 positive individuals by using a modified ARTIC-tailed amplicon method [12]. Briefly, after the reverse transcription step, 3 µL of the complementary DNA (cDNA) was used in four parallel multiplex PCRs. To improve the evenness of the genome coverage, the concentrations of the ARTIC v4.1-tailed primers were normalized following the protocol developed by Benjamin Farr et al. [13]. After indexing, the libraries were purified by using HighPrep™ PCR Cleanup (MagBio Genomics Inc., Kraichtal, Germany), followed by quantification, normalization, and pooling to reach 4 nM for sequencing on an Illumina MiSeq platform with the v2 reagent kit and 500 cycles (Illumina, San Diego, CA, USA). The resulting reads were aligned, trimmed, and quality filtered, the primer sequences were eliminated, and full SARS-CoV-2 genomes were assembled using Geneious Prime 2021.1 (https://www.geneious.com, accessed on 1 March 2023).
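Pooling libraries to a fixed molarity such as 4 nM relies on converting each library's mass concentration into nM. The following is a minimal sketch of that conversion, assuming double-stranded DNA with an average mass of 660 g/mol per base pair. The function names, the example concentration, and the fragment length are hypothetical illustrations and are not measurements or code from this study.

```python
def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA library concentration (ng/uL) to molarity (nM).

    Assumes an average mass of 660 g/mol per base pair of dsDNA.
    """
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)


def dilution_volumes_ul(stock_nm: float, target_nm: float, final_volume_ul: float):
    """Return (library volume, diluent volume) in uL using C1 * V1 = C2 * V2."""
    if stock_nm < target_nm:
        raise ValueError("stock is already below the target molarity")
    library_ul = target_nm * final_volume_ul / stock_nm
    return library_ul, final_volume_ul - library_ul


# Hypothetical example values (not taken from the study).
stock = library_molarity_nm(conc_ng_per_ul=2.64, mean_fragment_bp=400)
lib_ul, diluent_ul = dilution_volumes_ul(stock, target_nm=4.0, final_volume_ul=20.0)
print(f"stock = {stock:.1f} nM; mix {lib_ul:.1f} uL library + {diluent_ul:.1f} uL diluent")
```

With these illustrative inputs the stock is about 10 nM, so 8 µL of library plus 12 µL of diluent would give 20 µL at 4 nM.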
Lineage assignment was performed on the obtained consensus sequences using the Phylogenetic Assignment of Named Global Outbreak Lineages tool (PANGOLIN) [14]. The new sequence generated in this study was compared to a diverse set of SARS-CoV-2 sequences (n = 3036) sampled worldwide and collected up to 15 October 2022. Considering the large amount of data available at that time, we used the Subsampler tool available at https://github.com/andersonbrito/subsampler (accessed on 15 March 2023), a pipeline for subsampling genomic data based on epidemiological time series data. All sequences were aligned using the ViralMSA tool [15], and IQ-TREE 2.1.2 [16] was used for phylogenetic analysis using the maximum likelihood (ML) approach. The raw ML tree topology was then used to analyze and estimate the number of viral transmission events between various regions of the world. TreeTime phylodynamic analysis [17] was conducted to transform this ML tree topology into a dated tree by using a constant mean rate of 8.0 × 10⁻⁴ nucleotide substitutions per site per year, after the exclusion of outlier sequences. The mutation pattern of the VOC was analyzed using the NextClade online tool [18]. In addition, we used data from the Global Lineage Surveillance (COVID CG) https://covidcg.org/ (accessed on 30 March 2023) to further analyze trends in the geographic distribution of all recombinant SARS-CoV-2 variants up to the time of writing this article, March 2023 [19]. See Table 1 for the detailed list of XAN mutations.

Table 1. Detailed list of XAN mutations.
S135R, P314L, T19I, T223I, T9I, D3N, S84L, P13L, L37F, T842I, R1315C, L24S, Q19E, DEL31-33, G1307S, P1452L, DEL25-27, A63T, R203K, L2570M, I1566V, DEL69-70, G204R, L3027F, T2163I, G142D, S413R, T3090I, V213G, L3201F, G339D, T3255I, S371F, P3395H, S373P, DEL3675-3677, S375F, T376A
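To give a sense of scale for the fixed clock rate used in the dating step, the short sketch below converts 8.0 × 10⁻⁴ substitutions per site per year into expected substitutions per genome. The genome length of 29,903 nt (the Wuhan-Hu-1 reference) is an assumption introduced here, as the text does not state which length applies, and the function names are hypothetical.

```python
# Assumption: Wuhan-Hu-1 reference length; not stated in the text.
GENOME_LENGTH_NT = 29_903
# Clock rate used for the dated tree (substitutions per site per year).
CLOCK_RATE = 8.0e-4


def expected_substitutions(rate: float, genome_length: int, years: float) -> float:
    """Expected number of substitutions accumulated over a time span."""
    return rate * genome_length * years


def years_for_divergence(n_substitutions: float, rate: float, genome_length: int) -> float:
    """Time span (years) over which n_substitutions are expected to accumulate."""
    return n_substitutions / (rate * genome_length)


per_year = expected_substitutions(CLOCK_RATE, GENOME_LENGTH_NT, 1.0)
per_month = per_year / 12
print(f"about {per_year:.1f} substitutions per genome per year")
print(f"about {per_month:.1f} substitutions per genome per month")
```

Under these assumptions the rate corresponds to roughly 24 substitutions per genome per year, or about two per month, which is the scale a strict-clock dating step uses to translate branch lengths into calendar time.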

Results
A total of 22,434 samples from patients with SARS-CoV-2 were sequenced in Bulgaria until October 2022. The Bulgarian SARS-CoV-2 XAN isolate was obtained from a 76-year-old woman who spent four days in a clinic and was discharged after her recovery.
Lineage assessment was conducted using PANGOLIN (available at https://github.com/hCoV-2019/pangolin, accessed on 15 March 2023) and revealed that the new strain belonged to the SARS-CoV-2 XAN lineage. Phylogenetic inference combining our new isolate (EPI_ISL_15390336) with a representative dataset available on GISAID (https://www.gisaid.org/, accessed on 1 March 2023) up to 15 October 2022 confirmed this assignment: the newly obtained genome clustered closely with SARS-CoV-2 XAN strains isolated in Europe, North America, and the Caribbean between May and September 2022 (Figure 1A) (bootstrap = 1.0, SH-aLRT = 1.0). Further, we analyzed the specific mutational profile of the newly generated strain to determine its lineage-defining mutations. The newly identified lineage harbored 58 substitutions that are characteristic of XAN, interspersed across the viral genome (Figure 1B; see Table 1 for the detailed list of mutations).
XAN has a BA.5-like mutational profile as fifty-five of the substitutions highlighted are shared with BA.5 sub-lineages. For the remaining three substitutions, one is unique to the XAN lineage (ORF1a:L2570M), whereas ORF1a:L3201F and ORF1b:P1452L are shared with BA.2 lineages and C.36.3.1, respectively.
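The partition of the 58 substitutions by parental lineage can be illustrated with a small set-membership sketch. The reference profiles below are deliberately truncated placeholders that contain only a handful of mutations, not full lineage definitions. The gene prefixes on the two BA.5 examples (S:G339D, N:R203K) are added here as an assumption, and the function name is hypothetical. The final check only verifies that the counts reported in the text add up.

```python
from collections import Counter

# Truncated, illustrative reference profiles: only the mutations needed for
# this example are listed, not the complete lineage definitions.
REFERENCE_PROFILES = {
    "BA.5": {"S:G339D", "N:R203K"},
    "BA.2": {"ORF1a:L3201F"},
    "C.36.3.1": {"ORF1b:P1452L"},
}


def classify_mutation(mutation: str, profiles: dict) -> str:
    """Return the first reference lineage carrying the mutation, else 'unique'."""
    for lineage, defining in profiles.items():
        if mutation in defining:
            return lineage
    return "unique"


sample = ["S:G339D", "N:R203K", "ORF1a:L3201F", "ORF1b:P1452L", "ORF1a:L2570M"]
assignment = {m: classify_mutation(m, REFERENCE_PROFILES) for m in sample}
counts = Counter(assignment.values())
print(dict(counts))

# Consistency check on the counts reported in the text: 55 + 1 + 1 + 1 = 58.
reported = {"BA.5": 55, "BA.2": 1, "C.36.3.1": 1, "unique": 1}
assert sum(reported.values()) == 58
```

Because the lookup returns the first matching profile, the order of the reference dictionary acts as a precedence rule. A mutation carried by several lineages is attributed to whichever is listed first, which is a simplification of how shared mutations would be treated in a real analysis.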
As of February 2023, 63,361 sequences from 89 different recombinant (X-) sub-lineages of SARS-CoV-2 were available in the COVID CG database (Figure 2). Their diversity is considerable, and their geographical distribution across continents is unequal. The number of cases caused by recombinant viruses increased steadily in 2022, rose sharply after October, and peaked in January 2023, especially in North America, before declining in February 2023 (Figure 3).

Discussion
Bulgaria was substantially affected by the pandemic with high mortality rates against the background of low vaccination coverage [20]. After the introduction and distribution of the less virulent Omicron and its sub-variants, the mortality rate in the country significantly decreased.
To date, at least 27 different recombinant XA subvariants have been reported (XA to XAZ). Among these, the most prevalent were XAZ (35%), followed by XAM (9.8%) and XAY.2 (9.5%). A specific regional distribution could be observed, with XAZ being more prevalent in Asia and Europe, while XAM was found almost exclusively in North America. The largest proportion of new cases with XA sub-lineages was registered from May to September 2022, after which they almost disappeared. Notably, XAY. Despite the significant mutational background that XAN inherited from two branches of Omicron, BA.2 and BA.5, its transmission rate and overall spread remained limited, with only 216 sequences deposited in GISAID (up to 09.01.2023) [21]. Almost 85% of the isolates were reported from European countries (n = 183), whereas the variant seemed to be less frequent in the US (n = 12). The first XAN sequences were reported from Spain and Switzerland in May 2022 and later from Denmark and Greece. The highest morbidity peaks and the corresponding estimated daily proportions for XAN were registered in July and September 2022, and by November it had already abated [21].
Further analysis of the clinical course of XAN infections is necessary for a comprehensive assessment of the severity and mortality of the disease, as well as of the efficacy of vaccine protection against this variant. However, compared to other recombinant variants (e.g., XBB), XAN appeared less virulent but still better adapted than other XA sub-lineages that emerged at the same time. XAN, like the other early recombinant subvariants, appears to be an intermediate step in the ongoing evolution of SARS-CoV-2, highlighting the exceptional plasticity of the virus [22].
It is known that the evolution of viruses, including recombination, could impact their accurate diagnosis through substitutions in the regions of the viral genome targeted by PCR tests. The genomic sequences that XAN inherited from BA.2 and BA.5 and the composition of mutations characteristic of XAN do not seem to negatively affect standardized multiplex PCR tests. However, the continuous evolution of viruses also requires continuous adaptation of the tests used for diagnostics and sequencing [23]. Our findings have some limitations. Although more than twenty-two thousand samples had been sequenced by October 2022, not all cases of COVID-19 were referred for sequencing, either because they did not meet the participant selection criteria or because their viral load was too low for the sequencing analysis. Further research on the identified SARS-CoV-2 XAN variant and the associated epidemiological and clinical data could broaden the spectrum of knowledge to better understand the evolution of these viruses.
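The regional shares quoted for XAN follow directly from the sequence counts given above. A minimal sketch of that arithmetic is shown below. The variable and function names are hypothetical, and the "other regions" figure is simply the remainder of the stated total, not a value reported in the text.

```python
def share_percent(count: int, total: int) -> float:
    """Percentage of the total represented by count."""
    return 100.0 * count / total


TOTAL_XAN = 216   # sequences deposited in GISAID, as stated in the text
EUROPE = 183      # sequences from European countries
US = 12           # sequences from the US
other = TOTAL_XAN - EUROPE - US  # remainder, derived here rather than reported

print(f"Europe: {share_percent(EUROPE, TOTAL_XAN):.1f}%")
print(f"US: {share_percent(US, TOTAL_XAN):.1f}%")
print(f"Other regions: {other} sequences ({share_percent(other, TOTAL_XAN):.1f}%)")
```

The European share comes out just under 85%, in line with the "almost 85%" stated in the text.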
Finally, our results can supplement those from other studies to expand and update our knowledge for the development of more efficient vaccine prevention and treatment of the disease.
In conclusion, consistent genomic surveillance of circulating variants remains of high importance for the early detection and monitoring of emerging SARS-CoV-2 viral strains, including recombinant viruses. Therefore, monitoring the evolution of SARS-CoV-2 over time remains a priority to adapt our defenses against the pandemic (see [24,25]).

Funding: This research was funded by a grant from the Ministry of Education and Science, Bulgaria (contract: KΠ-06-H43/1-27.11.2020), entitled "Molecular-virological analysis of the introduced and disseminated newly emerged pandemic virus SARS-CoV-2 in Bulgaria by using next-generation sequencing and combined epidemiological and phylogenetic analysis", and by the European Regional Development Fund through Operational Program Science and Education for Smart Growth 2014-2020, Grant BG05M2OP001-1.002-0001-C04. MG is funded by PON "Ricerca e Innovazione" 2014-2020. MG is supported in part by the CRP-ICGEB RESEARCH GRANT 2020 Project CRP/BRA20-03, Contract CRP/20/03.

Institutional Review Board Statement:
This study was approved by the Ethical Committee at the National Centre of Infectious and Parasitic Diseases, Sofia, Bulgaria (NCIPD IRB 00006384).

Data Availability Statement:
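The Bulgarian SARS-CoV-2 sequence used for the study has been deposited in GISAID under accession number EPI_ISL_15390336. The complete dataset used for the analysis will be provided upon request to the principal investigator.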