Molecular Epidemiology of SARS-CoV-2 in Tunisia (North Africa) through Several Successive Waves of COVID-19

Documenting the circulation dynamics of SARS-CoV-2 variants in different regions of the world is crucial for monitoring virus transmission worldwide and contributing to global efforts towards combating the pandemic. Tunisia has experienced several waves of COVID-19 with a significant number of infections and deaths. The present study provides genetic information on the different lineages of SARS-CoV-2 that circulated in Tunisia over 17 months. Lineages were assigned for 1359 samples using whole-genome sequencing, partial S gene sequencing and variant-specific real-time RT-PCR tests. Forty-eight different lineages of SARS-CoV-2 were identified, including variants of concern (VOCs), variants of interest (VOIs) and variants under monitoring (VUMs), particularly Alpha, Beta, Delta, A.27, Zeta and Eta. The first wave, limited to imported and import-related cases, was characterized by a small number of positive samples and lineages. During the second wave, a large number of lineages were detected; the third wave was marked by the predominance of the Alpha VOC, and the fourth wave was characterized by the predominance of the Delta VOC. This study adds new genomic data to the global context of COVID-19, particularly from the North African region, and highlights the importance of the timely molecular characterization of circulating strains.


Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of coronavirus disease 2019 (COVID-19), was first detected in December 2019 in the city of Wuhan in Hubei Province of China, in a patient with acute pneumonia [1][2][3]. On 11 March 2020, the World Health Organization (WHO) characterized COVID-19 as a pandemic. As of 6 March 2022, more than 440 million confirmed cases and more than 6 million deaths had been reported worldwide [4]. The first whole-genome sequence of SARS-CoV-2 was published on 5 January 2020 [2], and since then, the analysis of viral sequences worldwide has been continuous, with more than 2.5 million complete genomes currently available in public databases, such as the GISAID platform [5]. Molecular analysis has shown significant genetic variability of SARS-CoV-2 due to the accumulation of mutations over time. Most of these changes have little to no impact, but some mutations alter viral properties and lead to an increase in virus transmissibility, more severe infection, a potential reduction in vaccine or immune effectiveness and/or escape from molecular diagnosis. Variants that have acquired at least one of these characteristics are named variants of concern (VOCs) and require special monitoring. In addition, other variants are classified as variants of interest (VOIs), variants under monitoring (VUMs) or variants under investigation (VUIs) [6][7][8].
By July 2021, four major variants of concern (VOCs) had been described that led to increased surveillance efforts worldwide [9]. The Alpha variant, B.1.1.7 lineage (20I/501Y.V1, VOC-202012/01), also known as the UK variant, has an unusually high number of mutations and is more transmissible than the wild-type virus [10]. The Beta variant, B.1.351 lineage (501.V2, 20H/501Y.V2, VOC-202012/02), first detected and reported in South Africa in early October 2020, shares several mutations with B.1.1.7 and reduces vaccine effectiveness to some extent [11,12]. The Gamma variant, P.1 lineage (20J/501Y.V3, VOC-202101/02), emerged in December 2020 in Brazil [13]; it has 10 mutations in the spike protein that may affect its ability to be recognized by antibodies [14,15]. The Delta variant (B.1.617.2, AY.1 and AY.2 VOC-21APR-02) was first described in India and then widely spread all over the world [11,12]. This variant has eight mutations in the spike protein and is characterized by increased transmissibility in comparison with the Alpha variant [11,12]. In November 2021, a new VOC designated the Omicron variant (B.1.1.529) was first described in Botswana and in South Africa. This new variant has rapidly spread all over the world and is presently the most frequently detected worldwide [16,17]. This variant exhibits a large number of mutations, among which more than 30 are in the spike protein. The substitution mutations Q493R, N501Y, S371L, S373P, S375F, Q498R and T478K in the spike protein are suggested to result in increased transmission due to a higher affinity for human angiotensin-converting enzyme 2 (ACE2). Moreover, an increased risk of reinfection is observed with this variant as compared to other VOCs [16,17]. Therefore, the molecular monitoring of circulating strains is crucial for the timely identification of the emergence of novel SARS-CoV-2 variants.
In Tunisia, the first cases of COVID-19 were reported in early March 2020, all of which were imported. By June 2020, 1155 cases had been reported, all imported or import-related: persons coming from foreign countries and testing positive on arrival in Tunisia, or persons who had been in contact with imported cases and were detected through contact tracing. After drastic measures were taken by the Tunisian government, such as a general lockdown, early detection of imported and local cases, quarantining of confirmed/suspected cases and border closures, case numbers reached zero between 4 and 12 June 2020 [18]. The second wave started in July 2020, after the borders reopened and compliance with preventive measures by the general population notably relaxed. After a small decrease in disease incidence in February 2021, the country experienced a third wave of COVID-19 with the introduction of the Alpha variant in March 2021 and then fourth and fifth waves after the introduction of Delta and Omicron in May and December 2021, respectively.
The aim of the present study is to provide genetic information on the different lineages of SARS-CoV-2 that circulated in Tunisia during a period of 17 months (March 2020-July 2021) covering different waves.

Sample Collection
This study is based on nasopharyngeal samples tested in the Laboratory of Clinical Virology of the Pasteur Institute of Tunis, mandated by the Tunisian Ministry of Health (MoH) to perform COVID-19 diagnosis as part of the national program of surveillance of SARS-CoV-2. From March 2020 to 31 July 2021, 125,456 nasopharyngeal swab samples were referred from public and private health institutions, of which 28,517 (22.7%) were confirmed SARS-CoV-2-positive by real-time RT-PCR. Variant detection and lineage characterization were performed on a randomly selected subset of samples (N = 1359) covering the whole 17-month period and representing the 24 governorates of Tunisia.
Three different techniques were used for the timely determination of the lineages circulating: whole-genome sequencing, partial S gene sequencing and real-time RT-PCR. The methods used for SARS-CoV-2 lineage and sub-lineage determination by month of sample collection are shown in Supplementary Material S1.
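The sample counts given in this section and in the three method subsections that follow can be cross-checked with simple arithmetic. The short sketch below only restates figures reported in the text; the variable names are illustrative.

```python
# Figures as reported in the text (March 2020 to 31 July 2021).
total_referred = 125_456   # nasopharyngeal swabs referred to the laboratory
positive = 28_517          # confirmed SARS-CoV-2-positive by real-time RT-PCR

# Number of samples characterized by each method (from the method subsections).
method_counts = {
    "whole-genome sequencing": 601,
    "partial S gene sequencing": 358,
    "variant-specific real-time RT-PCR": 400,
}

typed = sum(method_counts.values())
positivity_pct = round(100 * positive / total_referred, 1)
typed_pct = round(100 * typed / positive, 1)

print(f"Samples with a lineage assignment: {typed}")
print(f"Positivity rate: {positivity_pct}%")
print(f"Share of positive samples characterized: {typed_pct}%")
```

The three method counts sum to the 1359 characterized samples, which correspond to roughly 5% of all positive samples received over the study period.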

Whole-Genome Sequencing
Whole-genome sequencing was performed for 601 specimens with Ct values less than 30 that were collected throughout the whole 17-month study period. Of the 601 specimens, 10 were sequenced using MinION technology, and 591 were sequenced using Illumina technology as follows: RNA was extracted from a 140 µL nasopharyngeal sample with the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. Genomes were generated by an amplicon-based approach using the United States Food and Drug Administration (US FDA)-approved Illumina COVIDSeq kit (San Diego, CA, USA) [19], a modified version of the ARTIC protocol. The extracted RNAs were reverse transcribed to single-strand cDNA using Superscript IV Reverse Transcriptase (Invitrogen™, Waltham, MA, USA) as per the manufacturer's instructions. PCR with two SARS-CoV-2-specific primer pools yielded 98 tiled amplicons covering the whole genome. For each sample, PCR products were combined, and libraries were prepared using the Illumina-Nextera DNA UD Indexes as per the manufacturer's instructions. Libraries were purified with AMPure XP magnetic beads (Beckman Coulter, Brea, CA, USA), and concentration was measured with the Qubit dsDNA HS Assay kit (Thermo Fisher Scientific, Waltham, MA, USA). Library pool validation and mean fragment size were determined with a Bioanalyzer 2100 (Agilent, Santa Clara, CA, USA) as per the manufacturer's instructions. The 400 bp library pool was diluted to 4 nM. Libraries were pooled, denatured, diluted to 1.4 pM and sequenced on a NextSeq550 instrument with a V2.5 High Output Cartridge 2 × 150 bp run kit (Illumina, San Diego, CA, USA). Raw sequence data were processed using fastqc version 0.11.9 for quality control (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed on 30 August 2021)). Low-quality reads and adapters were filtered using trimmomatic version 0.39 with a Phred quality score of 30 as the threshold.
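For orientation, a Phred score Q corresponds to a base-call error probability of 10^(-Q/10), so a threshold of 30 means an expected error rate of at most 1 in 1000 bases. The sketch below illustrates that relation together with a simple mean-quality read filter. It is not Trimmomatic's algorithm; the function names, the Phred+33 quality encoding and the use of a per-read mean are assumptions made for illustration only.

```python
from typing import List


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def decode_phred33(quality_string: str) -> List[int]:
    """Decode a FASTQ quality string, assuming the Phred+33 ASCII encoding."""
    return [ord(char) - 33 for char in quality_string]


def passes_mean_quality(quality_string: str, threshold: int = 30) -> bool:
    """Keep a read only if its mean Phred score reaches the threshold.

    Illustrative filter only; real trimmers typically work on sliding
    windows or read ends rather than on the whole-read mean.
    """
    scores = decode_phred33(quality_string)
    if not scores:
        return False
    return sum(scores) / len(scores) >= threshold


if __name__ == "__main__":
    print(phred_to_error_prob(30))
    print(passes_mean_quality("IIII"))  # 'I' encodes Q40 in Phred+33
    print(passes_mean_quality("5555"))  # '5' encodes Q20 in Phred+33
```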

Genome Assembly
For the sequences generated by ONT MinION technology (Oxford Nanopore Technology, Oxford, UK), the depth ranged between 564× and 2263×, and coverage ranged between 22,776 and 29,126. A modified ARTIC network pipeline v1.0.0 was used to generate consensus genomes with a reference-based genome assembly pipeline and read length filtering against the Wuhan-Hu-1 reference genome (RefSeq accession: NC_045512.2), and variants were polished using nanopolish v0.13.2, medaka v0.11.5 and samtools 1.9. For the sequences generated using Illumina technology, the genomes were assembled using the EDGE COVID-19 pipeline, which is based on the fully open-source EDGE Bioinformatics software [20]. FaQCs was used to control the quality of the reads [21]. Low-quality regions of reads were trimmed, and reads were discarded if they failed a quality threshold of 20 or a minimum length of 50 bp. The reads that passed the QC process were then aligned to the original Wuhan-Hu-1 complete reference genome (RefSeq accession: NC_045512.2) using BWA-mem [22]. The following parameters were left at their default values: (i) minimum depth coverage (5×) to support a variant site, (ii) alternate base threshold (0.5) to support an alternative for a change in the consensus base, (iii) indel threshold (0.5) to support an INDEL for a change in the consensus base and (iv) minimum mapping quality [23].
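The way the depth and alternate-base thresholds interact can be sketched as a per-position decision rule. The following is a simplified illustration and not the EDGE COVID-19 pipeline's code: the function name is hypothetical, mapping quality and indels are ignored, and two behaviors are assumptions, namely that the reference base is kept when depth is insufficient and that the alternate fraction must strictly exceed the threshold.

```python
from collections import Counter


def call_consensus_base(ref_base: str, observed_bases: str,
                        min_depth: int = 5,
                        alt_threshold: float = 0.5) -> str:
    """Return the consensus base at one position of the alignment.

    observed_bases holds the base seen in each read covering the position.
    A substitution is accepted only if the position is covered by at least
    min_depth reads and the most frequent non-reference base accounts for
    more than alt_threshold of them (both rules are assumptions here).
    """
    counts = Counter(observed_bases)
    depth = sum(counts.values())
    if depth < min_depth:
        return ref_base  # assumption: too shallow to support a variant

    alt_counts = {base: n for base, n in counts.items() if base != ref_base}
    if not alt_counts:
        return ref_base

    alt_base, alt_n = max(alt_counts.items(), key=lambda item: item[1])
    if alt_n / depth > alt_threshold:
        return alt_base
    return ref_base


if __name__ == "__main__":
    print(call_consensus_base("A", "TTTTA"))   # 4 of 5 reads support T
    print(call_consensus_base("A", "TTT"))     # depth 3 is below 5x
    print(call_consensus_base("A", "AATTAA"))  # T fraction is only 2/6
```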

Variant Detection by Partial Sequencing of the S Gene
Amplification by standard PCR followed by partial sequencing using Sanger technology was used for 358 samples collected from February to July 2021, as described previously [24]. The 648-nucleotide-long S gene sequence encodes the region of the S protein spanning amino acid residues 477 to 693. It includes key positions and allows the detection of the most important mutations characterizing most VOCs, VOIs and VUMs.
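To see which substitutions a fragment covering residues 477 to 693 can reveal, a mutation list can be filtered by position. The sketch below uses the A.27 spike substitutions listed in the Results section as input; the helper names are hypothetical, and only simple substitutions written as reference residue, position, new residue are handled.

```python
import re
from typing import List

# Spike residues covered by the partial S gene fragment, as stated in the text.
REGION_START, REGION_END = 477, 693


def position_of(substitution: str) -> int:
    """Extract the residue position from a substitution such as 'N501Y'."""
    match = re.fullmatch(r"[A-Z](\d+)[A-Z]", substitution)
    if match is None:
        raise ValueError(f"Not a simple substitution: {substitution!r}")
    return int(match.group(1))


def covered_by_fragment(substitutions: List[str]) -> List[str]:
    """Return the substitutions whose position lies inside the fragment."""
    return [s for s in substitutions
            if REGION_START <= position_of(s) <= REGION_END]


# A.27 spike substitutions as listed in the Results section.
A27_SUBSTITUTIONS = ["L18F", "L452R", "N501Y", "A653V", "H655Y",
                     "D796Y", "G1219V"]

if __name__ == "__main__":
    print(covered_by_fragment(A27_SUBSTITUTIONS))
```

Under these assumptions, three of the seven A.27 substitutions (N501Y, A653V and H655Y) fall inside the sequenced fragment, while the others lie outside it and would require whole-genome sequencing to confirm.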

Variant Detection by Real-Time RT-PCR
The commercial kit SNPsig® real-time PCR SARS-CoV-2 mutation detection/allelic discrimination kit (Primerdesign Ltd., Southampton, UK), which allows the detection of the Alpha, Beta and Gamma variants, was used for 400 samples collected from March to June 2021, during the high-transmission period of the Alpha variant. As a first step, the test screens for the N501Y substitution, which is common to the Alpha, Beta and Gamma variants. Discrimination between the three variants is then performed for all samples harboring the N501Y substitution using primers and probes specific to each variant.
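The two-step logic described above can be summarized as a small decision function. This is a schematic sketch of the workflow only, not the kit's interpretation software: the function name, the boolean inputs and the wording of the returned labels are all hypothetical.

```python
from typing import Dict

SCREENED_VARIANTS = ("Alpha", "Beta", "Gamma")


def classify_sample(n501y_detected: bool,
                    variant_specific: Dict[str, bool]) -> str:
    """Interpret a two-step variant screening result.

    Step 1: screen for the N501Y substitution shared by Alpha, Beta and Gamma.
    Step 2: for N501Y-positive samples only, use the variant-specific
    reactions to discriminate between the three variants.
    """
    if not n501y_detected:
        return "N501Y not detected"

    positives = [name for name in SCREENED_VARIANTS
                 if variant_specific.get(name, False)]
    if len(positives) == 1:
        return positives[0]
    # Zero or several specific signals: refer the sample for sequencing.
    return "N501Y detected, variant unresolved"


if __name__ == "__main__":
    print(classify_sample(True, {"Alpha": True}))
    print(classify_sample(False, {}))
    print(classify_sample(True, {}))
```

A sample that carries N501Y but matches none of the three specific reactions, as would be expected for a lineage such as A.27, ends up in the unresolved category and needs sequencing for assignment.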

Clade and Lineage Assignment
Whole-genome sequences in FASTA format were used for clade and lineage assignment with two online tools: Nextclade [10] and Pangolin (version 3.1.16, lineages version 2021-11-25) [25].

Results
In the present study, lineages were successfully assigned to the 1359 SARS-CoV-2 samples using one of the three described methods. Overall, between March 2020 and 31 July 2021, a very high viral diversity was observed, with the identification of 48 different lineages (Table 1). According to the Pangolin lineage classification (https://pangolin.cog-uk.io/ (accessed on 10 March 2022)), the overwhelming majority of lineages detected (97.8%) belonged to the B clade, with low circulation of the A (1.8%) and P (0.4%) clades.
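Percentages of this kind can be obtained by grouping Pango lineage names by their leading letter. The sketch below shows the grouping on a small toy list, not on the study data; the function name is hypothetical. Note that in the Pango nomenclature the P lineages are aliases of B.1.1.28 descendants, so grouping by prefix simply mirrors how the shares are reported here.

```python
from collections import Counter
from typing import Dict, List


def top_level_shares(lineages: List[str]) -> Dict[str, float]:
    """Return the percentage of samples per leading lineage letter."""
    if not lineages:
        return {}
    prefixes = Counter(name.split(".")[0] for name in lineages)
    total = sum(prefixes.values())
    return {prefix: round(100 * n / total, 1)
            for prefix, n in prefixes.items()}


if __name__ == "__main__":
    # Toy input for illustration only (one assignment per sample).
    toy_assignments = ["B.1.1.7", "B.1.160", "A.27", "P.2"]
    print(top_level_shares(toy_assignments))
```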
According to the Nextclade classification [10], 14 different clades were detected in Tunisia: 19A, 19B, 20A-E, 20G-I, 21A, 21D, 21I and 21J. Figure 1 shows the phylogenetic tree built using the Nextclade online tool (https://clades.nextstrain.org/ (accessed on 10 March 2022)). Figure 2 shows the phylogenetic tree based on Pangolin lineage classification, and Table 1 illustrates the correspondence between the WHO, Pangolin lineage and Nextclade assignments according to the months of sample collection.
Table 1. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) lineage distribution in Tunisia from March 2020 to July 2021 according to month of sample collection. Variants of concern (VOCs) and variants of interest (VOIs) are shown in red and in blue, respectively. * Lineages without any sub-lineage assignment, such as B, B.1 and B.1.1, were observed in n = 1, n = 92 and n = 26 cases, respectively. This was due to the quality of sequences generated, which did not allow proper assignment to a sub-lineage. PS = partial sequencing in the S gene; qRT-PCR = real-time PCR; and WGS = whole-genome sequencing.
Table 1 and Figures 2 and 3 report the different SARS-CoV-2 lineages detected during the four different waves of the disease in Tunisia during the 17-month study period.
The second wave ranged from July 2020 to January 2021; it was characterized by a higher genetic diversity with the circulation of at least 20 different lineages, namely: B.1.1.50, B.1.597, B.1.1.1, B.1.22, B.1.428.2, B.1.1.25, B.1.1.198, B.1.1.189, B.1.1.354, B.1.177 …
The third wave ranged from February to May 2021 and was characterized by the emergence of variants of concern (VOCs) and variants of interest and/or under monitoring. Indeed, the A.27 lineage, considered at that time to be a variant of interest, was first detected in Tunisia in February 2021, and the last isolate was identified in April 2021 (Table 1). The A.27 lineage is recognized by the following substitutions in the S gene: L18F, L452R, N501Y, A653V, H655Y, D796Y and G1219V, and was first detected in the present study using partial S gene sequencing. Starting from March 2021, the Alpha B.1.1.7 … (Figure 3). Other sporadic lineages were also detected during the third wave: B.1.533, B.1.416, A.23.1, B.1.243, B.1.415, B.1.160, B.1.1.178, B.1.620, B.1.1.318 … (Figure 2).
It is also important to note that some lineages, such as B.1.177 and B.1.160, circulated for a long period covering waves 2 and 3, extending from September 2020 to April-May 2021.

Discussion
In this retrospective observational study, we describe the SARS-CoV-2 lineages that circulated in Tunisia for 17 months after the first introduction of the virus to the country in March 2020. Significant genetic diversity of SARS-CoV-2 was observed, with the circulation of many lineages during the four different waves that the country experienced up to July 2021. This could be explained by different virus importations. Several factors could favor the multiple introductions of these different viral lineages to Tunisia and their rapid spread. First, Tunisia, a small country with an area of 163,610 km² and a population of approximately 12 million, has a strategic geographic location, which makes it a junction point between the Arab world, Africa and Europe. Furthermore, it is known for its history of economic and cultural exchanges, particularly with European and neighboring countries. In addition, Tunisia was experiencing an economic and political crisis that prevented total lockdown for long periods, and the introduction of the anti-SARS-CoV-2 vaccine to the population was relatively late.
The succession of the different waves observed in Tunisia is similar to the global picture of COVID-19 infection (https://covid19.who.int/ (accessed on 30 August 2021)). In addition, the same picture of the circulation of several viral lineages has been reported in several countries around the world, such as the Czech Republic, Cyprus, the UK, Russia, South Africa and countries from the Middle East and North Africa (MENA) region [28][29][30][31][32][33]. The exchange between countries has played a crucial role in the importation of new lineages and their rapid spread across countries.
This phenomenon is characteristic of airborne viruses. Countries that have succeeded in stopping their transmission are those that have applied drastic measures, such as China and South Korea, or other countries that were able to contain the virus during the first phases of the pandemic with full containment, border closures and the minimization of any contact, even between non-infected people [34][35][36][37][38].
In Tunisia, the first wave began with the introduction of the virus in March 2020; the drastic decisions taken by the government effectively controlled the epidemic, with a very low number of infected persons that reached zero cases in June 2020. During this first wave, full compliance with sanitary measures by the population and the adequate decisions taken by decision-makers made it possible to contain the disease. At that time, the number of patients was low, especially those with severe disease requiring resuscitation and oxygen beds. On 27 June 2020, it was decided to reopen the borders. Countries were classified into three different categories depending on their epidemiological situation, and passengers coming from countries with low endemicity (green zones) were not required to self-isolate. This led to the re-emergence of the virus, since some people coming from green zones were positive and reintroduced the virus into the country. Moreover, the summer period was marked by the return of emigrants and summer festivities, together with noncompliance with sanitary measures by the general population. This led to the outbreak of a second wave of COVID-19 with local transmission and two peaks coinciding with the start of the school year in September/October and the festivities celebrating the New Year in December 2020/January 2021. After a small decrease in disease incidence in February 2021, the country experienced a third wave of COVID-19 with the introduction of the Alpha VOC and A.27 VOI.
The A.27 lineage was first detected in February 2021 concomitantly with its widespread emergence in France, a country with which Tunisia has important cultural and economic exchanges. At that time, the A.27 lineage, also named the Henri Mondor variant, was considered to be a variant of interest or variant under monitoring since it caused many clustered cases in France and harbors mutations described in VOCs [39]. The A.27 lineage shares substitutions with VOCs such as L18F, L452R and N501Y, which have been suggested to result in immune escape and higher transmissibility [40]. Currently, the A.27 lineage is not considered a VOI (https://www.who.int/en/activities/tracking-SARS-CoV-2-variants/ (accessed on 16 March 2022)). Thereafter, it was quickly displaced by the Alpha B.1.1.7 VOC. The Alpha VOC, first detected in the United Kingdom in late 2020, carries the N501Y amino acid substitution in the spike protein, which increases its transmissibility. The Alpha VOC had become the dominant global variant by early 2021 [5,33,40,41]. In Tunisia, it was first detected in January 2021 and rapidly became the dominant lineage. During this third wave, real-time PCR testing targeting the three VOCs known worldwide at that time (Alpha, Beta and Gamma) was used and allowed rapid screening for this variant.
Other variants that have raised interest at the international level were also detected in our series, including the Beta variant (B.1.351), also known as 20H or variant of concern 501Y.V2. Beta was first described in South Africa and reported by Tegally et al. [11], and it was then detected in several countries on all continents, with peak transmission between October 2020 and September 2021. In our series, it was detected in two travelers upon their arrival in Tunisia in April and May 2021; its early detection and the timely decisions taken contained this variant, hampering its spread in the country. The A.23.1 lineage was also detected. This lineage emerged in September 2020 and has several mutations of potential biological concern, including the P681R substitution, although it has not been classified as a VOC or VOI [42]. It was reported in several countries: 18 in Europe, 12 in Asia and 16 in sub-Saharan Africa, the USA, Canada and Australia. To our knowledge, it is reported herein for the first time in North Africa. The Eta (B.1.525) and Zeta (P.2, a sub-lineage of B.1.1.28) variants were also detected.
Moreover, other lineages circulated for long periods, such as B.1.160 and B.1.177, which took an important place in the lineage landscape circulating in Tunisia. These lineages circulated from September 2020 until mid-2021 without any impact on the overall epidemiological situation. B.1.160, known as 20A/EU2, is one of the main variants first reported in Europe [43]. The B.1.177 lineage, mostly detected in Europe, was first detected in early 2020 and is currently classified into more than 80 sub-lineages [42]. Further molecular characterization of a higher number of viruses will be of great interest to better characterize these two lineages.
In May 2021, the Delta variant, characterized by the L452R and P681R amino acid substitutions in the spike protein, was detected in the country and rapidly displaced the Alpha variant, becoming the dominant variant in June-July 2021. This VOC, first detected in India in early 2021, became the most frequently detected variant in many countries [5,44,45]. Indeed, it was demonstrated that the Delta variant spread faster than the Alpha variant and dominated the variant landscape worldwide. In the present study, the emergence of the Delta variant defined the fourth wave of SARS-CoV-2 infection in the country and contributed to the resurgence of SARS-CoV-2 cases. The circulation of the Delta variant coincided with high transmissibility and with a large number of severe disease cases. In fact, infection with the Delta variant is characterized by the generation of an average of 6 times more viral RNA copies per milliliter than Alpha infections [12]. In Tunisia, the detection of the Delta variant decreased from August to December 2021 (data not shown).
The disease incidence increased again with the introduction of Omicron, which was first detected in the country in early December 2021 and caused a new wave with a much higher transmission rate.

Conclusions
This study describes the Tunisian experience in the molecular surveillance of SARS-CoV-2. The generated genomic data contribute to the enrichment of the globally published data on SARS-CoV-2 circulation, particularly in North Africa. It highlights the efforts that have been made for rapid and efficient detection of variants despite the main limitation of the unavailability of onsite NGS technology. Lineage and sub-lineage assignments were performed using multiple methods, including whole-genome sequencing, partial sequencing of the S gene and screening for VOCs by real-time PCR. Although partial sequencing and VOC detection by real-time PCR do not allow complete characterization of lineages and sub-lineages, they were of great help in promptly identifying the introduction of the main VOCs, especially Alpha and Delta. Our study also points to three important measures that should be considered to prevent the emergence of new waves and new virus(es) introduction(s): (1) …

Informed Consent Statement: Patient consent was waived. In accordance with the approval of the IRB, and as samples were fully anonymized, the bioethics committee waived the requirement for informed consent to be able to manage the pandemic.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author upon reasonable request.