Dynamics of SARS-CoV-2 Variants of Concern in Vaccination Model City in the State of Sao Paulo, Brazil

From a country with one of the highest SARS-CoV-2 morbidity and mortality rates, Brazil has implemented one of the most successful vaccination programs. Brazil’s first model city vaccination program was performed by the CoronaVac vaccine (Sinovac Biotech) in the town of Serrana, São Paulo State. To evaluate the vaccination effect on the SARS-CoV-2 molecular dynamics and clinical outcomes, we performed SARS-CoV-2 molecular surveillance on 4375 complete genomes obtained between June 2020 and April 2022 in this location. This study included the period between the initial SARS-CoV-2 introduction and during the vaccination process. We observed that the SARS-CoV-2 substitution dynamics in Serrana followed the viral molecular epidemiology in Brazil, including the initial identification of the ancestral lineages (B.1.1.28 and B.1.1.33) and epidemic waves of variants of concern (VOC) including the Gamma, Delta, and, more recently, Omicron. Most probably, as a result of the immunization campaign, the mortality during the Gamma and Delta VOC was significantly reduced compared to the rest of Brazil, which was also related to lower morbidity. Our phylogenetic analysis revealed the evolutionary history of the SARS-CoV-2 in this location and showed that multiple introduction events have occurred over time. The evaluation of the COVID-19 clinical outcome revealed that most cases were mild (88.9%, 98.1%, 99.1% to Gamma, Delta, and Omicron, respectively) regardless of the infecting VOC. In conclusion, we observed that vaccination was responsible for reducing the death toll rate and related COVID-19 morbidity, especially during the gamma and Delta VOC; however, it does not prevent the rapid substitution rate and morbidity of the Omicron VOC.


Introduction
The SARS-CoV-2 pandemic dramatically hit Brazil and the country became a world leader in COVID-19-related morbidity and mortality (30.4 million cases with 663,000 deaths until April 2022) [1]. However, by March 2021, Brazil undertook a massive COVID-19 vaccination campaign culminating in one of the highest percentages of vaccinated individuals (79.6%) worldwide. The large territory of the country contributed to the entry of different variants of concern (VOCs), leading to massive epidemic waves. In addition, important VOCs, such as the Gamma variant [2], emerged in Brazil. The application of mass vaccinations led to a diminution in the Delta epidemic wave compared to other countries but was ineffective towards the Omicron VOC [3].
The role of pioneer model city for the studies related to the effects of the vaccination was performed by the town of Serrana, where, for the first time in Brazil, between February and April 2021 the CoronaVac vaccine (Sinovac Biotech) was applied, before the beginning of the official vaccination program in the country. This clinical study related to mass vaccination in adults (>18 years old) was named Project S (Portuguese version, Projeto S). This test of the effectiveness of the CoronaVac vaccine demonstrated a dramatic drop in COVID-19 mortality as deaths were reduced by more than 95% [4].
To comprehensively evaluate the extent of the circulating SARS-CoV-2 variants and their substitution dynamics before and after the vaccination program, we undertook a largescale sequencing program aimed at investigating in real-time all of the SARS-CoV-2 positive samples obtained in the city of Serrana (between June 2020 and March 2022). Beyond the molecular surveillance, we also examined the patients' clinical outcomes (symptomatology and lethal outcome) plotted against the VOC distribution in the timeline.

Ethical Statement and Study Location
This study was approved by the Institutional Ethics Committee of the Medical School of Ribeirão Preto (Process CAAE: 50367721.7.1001.5440). The study was performed in the town of Serrana located at latitude 21 • 12 41 south and longitude 47 • 35 44 west in the state of São Paulo. Its population is estimated to be 46,166 inhabitants.

Molecular Confirmation of SARS-CoV-2 Infection
Viral RNA was automatically extracted from 100 uL of nasopharyngeal swab suspension (Extracta kit AN viral, Loccus) in extractor Extracta 32 (Loccus) following the manufacturer's guidelines. The SARS-CoV-2-RT-PCR was executed using the Gene Find-erTM COVID19 Plus RealAmp kit (OSang Healthcare Co. Ltd., Gyeonggi-do, Korea), which detects fragments of the RdRp, E, and N genes. The reaction was performed according to the manufacturer's protocol using a 7500 Real Time PCR cycler (Thermo Fisher Scientific, Waltham, MA, USA).

SARS-CoV-2 Sequencing
For the genomic sequencing, 60-89% of all SARS-CoV-2 positive samples from each epidemiological week (epiweek) were included. The main criteria for sample selection were the cycle threshold (Ct) value under 35 for at least two of the tested viral genes.
SARS-CoV-2 complete genome sequences were obtained using the COVIDSeq kit (Illumina, San Diego, CA, USA) according to the manufacturer's protocol. Sequencing libraries were pooled, normalized to 4 nM, and denatured with 0.2 N NaOH and 400 mM Tris-HCL (pH-8). The final sample library (9 pM) was loaded into a MiSeq Nano Reagent Kit v2 (300 cycles), and run on an Illumina MiSeq sequencer (Illumina, San Diego, CA, USA).

Bioinformatics Pipeline
The raw sequence data obtained were submitted to quality control analysis using the FastQC software, version 0.11.8 [5]. Trimming was performed using Trimmomatic version 0.3.9 [6] to select high-quality sequences. The sequences with quality scores > 30 were used. We mapped the trimmed sequences against the SARS-CoV-2 Wuhan reference genome (Genbank refseq NC_045512.2) using BWA [7] and SAMtools [8] software for read indexing. The mapped files were submitted to refinement with Pilon [9] to obtain the indels and insertions in the most correct way possible. Afterwards, the trimmed sequences were subjected to a remap against the genome refined by Pilon. Finally, we used BCFtools [10] for variant calling, and seqtk [11] to create the consensus genomes. Positions covered by fewer than 10 reads (DP < 10) and bases of quality lower than 30 were considered a coverage gap and converted to Ns. Coverage values for each genome were calculated using SAMtools v1.12 [8]. SARS-CoV-2 lineages were assigned by pangolin v.4.1.1.

Phylogenetics Analysis
The phylogenetic tree is composed of 7283 genomes: 4375 genomes from Serrana town and 2908 global representative SARS-CoV-2 genomes [12] that were downloaded as a representative dataset from GISAID (https://gisaid.org/ accessed on 21 May 2022). Sequence alignment was performed using MAFFT v7.475 [13] and manually curated to remove artifacts using Aliview [14]. Maximum likelihood (ML) phylogenetic trees were constructed using Iqtree 2 [15] with the statistical support of ultrafast bootstrap applying 1000 replicates. The nucleotide substitution model was GTR + G4 + F, chosen according to the BIC (Bayesian Information Criterion) statistical model. The final formatting and visualization of the phylogenetic tree were performed using the ggtree R package [16].

Aspects of the Tested Population and Performed Sequencing
During the routine SARS-CoV-2 diagnosis, we obtained 4538 SARS-CoV-2 genomes in the period between June 2020 and April 2022, comprising three subsequent COVID-19 waves in Brazil. Of them, we analyzed 4375 (96.4%) sequences that presented high-quality parameters including a mean number of reads (986,784), mean coverage (98.5%), and mean depth, i.e., 5610. The analyzed genomes were distributed by VOC like this: 1653 Delta cases (37.8%), 1053 Gamma cases (24.1%), 1513 Omicron cases (34.6%), 75 Zeta cases (1.7%), and 81 cases of other lineages (1.9%). The patients who were included in this study were distributed like this: 55.1% were female and 44.9% were male with a mean age of 38 (±18) years. Although the participants were of all ages, most were found in the group between 21 and 50 years old in all of the three epidemic waves (Figure 1), whereas 11.6% were pediatric patients under 18 years of age. Among the identified VOCs and VOIs, so far, we identified 52 SARS-CoV-2 sub lineages in the studied location. The clinical outcome of COVID-19 in the tested patients was evaluated using the WHO clinical score ( Figure 2). Most of the cases were characterized as mild (88.9%, 98.1%, 99.1% to Gamma, Delta, and Omicron, respectively) regardless of the VOC in question. Delta VOC mild cases were mostly identified in the population group aged under 40 years old ( Figure 2). For the moderate cases, the Gamma VOC were mostly found in the population group aged between 40-50 year old, while the Delta VOC affected individuals below 40 years of age. Interestingly, the moderate Omicron cases affected elderly people. In Brazil, the first COVID-19 wave is estimated to have been between the epidemiological weeks 9 and 45, 2020. By this time, the performed genomic surveillance showed the presence of parental SARS-CoV-2 lineages coinciding with the epidemiological situation worldwide. We obtained 45 complete SARS-CoV-2 genomes during this epidemic wave  Figure 3a). Most of the SARS-CoV-2 positive patients were aged between 41-50 years of age and generally presented with mild symptoms (54.5%), but 36.4% showed moderate symptoms, and in 9.1% of the positive patients, the symptoms were classified as severe (Figure 3b).
were pediatric patients under 18 years of age. Among the identified VOCs and VO far, we identified 52 SARS-CoV-2 sub lineages in the studied location. The clinic come of COVID-19 in the tested patients was evaluated using the WHO clinical scor ure 2). Most of the cases were characterized as mild (88.9%, 98.1%, 99.1% to Gamma, and Omicron, respectively) regardless of the VOC in question. Delta VOC mild case mostly identified in the population group aged under 40 years old ( Figure 2). F moderate cases, the Gamma VOC were mostly found in the population group ag tween 40-50 year old, while the Delta VOC affected individuals below 40 years Interestingly, the moderate Omicron cases affected elderly people.  were pediatric patients under 18 years of age. Among the identified VOCs and VOIs, so far, we identified 52 SARS-CoV-2 sub lineages in the studied location. The clinical outcome of COVID-19 in the tested patients was evaluated using the WHO clinical score (Figure 2). Most of the cases were characterized as mild (88.9%, 98.1%, 99.1% to Gamma, Delta, and Omicron, respectively) regardless of the VOC in question. Delta VOC mild cases were mostly identified in the population group aged under 40 years old ( Figure 2). For the moderate cases, the Gamma VOC were mostly found in the population group aged between 40-50 year old, while the Delta VOC affected individuals below 40 years of age. Interestingly, the moderate Omicron cases affected elderly people.  Clinal scores of 1 to 3 were classified as mild, 4 to 6 as moderate, 7 to 9 were severe, and 10 for death. Data are expressed in average age (years) ± standard deviation. Statistical analysis was performed using Kruskal-Wallis with Dunn's test for correction of multiple comparisons and α = 0.05, p > 0.05 (not significant), p < 0.05 (*), p < 0.01 (**), and p < 0.0001 (****).
wave which belonged to B.1.1.33 (55.6%), B.1.1 (42.2%), and P.2 (2.2%), respectively (Figure 3a). Most of the SARS-CoV-2 positive patients were aged between 41-50 years of age and generally presented with mild symptoms (54.5%), but 36.4% showed moderate symptoms, and in 9.1% of the positive patients, the symptoms were classified as severe ( Figure  3b).   The second epidemic wave in Serrana occurred between the 46th epiweek (2020) the 39th epiweek (2021). This period was related to the progressive growth of the quenced genomes, raising from 7% to 100% of all positive cases per epiweek. By the of October 2021, we detected 28 SARS-CoV-2 sublineages: 81.9% Gamma VOC (P.1, P.1 P.1.12, P.1.14, P.1. 15 1.1, B.1.1.28, B.1.1.519, B.1.1.7, C.37, N.9, P.4, and P.7) (Fig  4a). The second epidemic wave was related to the emergence and rapid dissemination of Gamma VOC in Brazil, and in the town of Serrana. The first Gamma VOC case appeared in the 5th epiweek (February 2021). The complete substitution of all lineages by Gamma VOC took 11 epiweeks (5th-16th epiweeks), and Gamma predominance was observed until epiweek 34 ( Figure 5). We continued to detect single Gamma VOC strains until October 2021, when we observed a complete disappearance of this VOC after the detection of only The first Delta VOC cases in Serrana appeared in the 31st epiweek (August 2021). Cases started to grow exponentially from the 33rd epiweek until the complete substitution of Gamma VOC by Delta VOC in epiweek 43. The length of time for the replacement of these two VOCs was similar ( Figure 5). However, while the Gamma wave was characterized by P.1 and its sublineages (66.4% P.1, 10.5% P.1.7, 2.6% P.1.14, 1.5% P.1.12, and 0.9% of P.1.11, P.1. 15 Cases started to grow exponentially from the 33rd epiweek until the complete substitution of Gamma VOC by Delta VOC in epiweek 43. The length of time for the replacement of these two VOCs was similar ( Figure 5). However, while the Gamma wave was characterized by P.1 and its sublineages (66.4% P.1, 10.5% P.1.7, 2.6% P.1.14, 1.5% P.1.12, and 0.9% of P.1.11, P.1.15, P.1.2, and P. Most of the patients infected during the second epidemic wave were aged between 31 and 41 years of age. Compared to the first epidemic wave, the percentage of patients with mild symptoms during the second epidemic wave was considerably higher (89.8%). However, the second epidemic wave coincided with the mass vaccination campaign in Serrana town, with the first dose on February 17, 2021, followed by a second dose 4 weeks later. From the first Gamma VOC case (epiweek 5) to the complete vaccination of all individuals in the city (epiweek 15), 71.4% were mild cases, 19.1% were moderate, and 9.5% were classified as severe. After the complete vaccination of the city until the end of the second epidemic wave (epiweeks 16-42), the percentage of mild cases rose to 94.5% while the moderate and severe cases dropped to 3.9% and 1.5%, respectively.  Most of the patients infected during the second epidemic wave were aged between 31 and 41 years of age. Compared to the first epidemic wave, the percentage of patients with mild symptoms during the second epidemic wave was considerably higher (89.8%). However, the second epidemic wave coincided with the mass vaccination campaign in Serrana town, with the first dose on 17 February 2021, followed by a second dose 4 weeks later. From the first Gamma VOC case (epiweek 5) to the complete vaccination of all individuals in the city (epiweek 15), 71.4% were mild cases, 19.1% were moderate, and 9.5% were classified as severe. After the complete vaccination of the city until the end of the second epidemic wave (epiweeks 16-42), the percentage of mild cases rose to 94.5% while the moderate and severe cases dropped to 3.9% and 1.5%, respectively.

Third Epidemic Wave (December 2021-April 2022)
In Brazil, the third epidemic wave started at the 51st epiweek (2021) and continues to date. Interestingly, while in Brazil there was a notable reduction in COVID-19 cases from October to December, in Serrana we observed an increasing number of cases ( Figure 5 than the previous epidemic waves-especially the Delta VOC wave (Figure 1), but also the Gamma emergence in Brazil. By the 50th epiweek of 2021, we identified the first Omicron VOC in Serrana city. Omicron VOC disseminated so rapidly that in five weeks it completely replaced the Delta VOC. This was a record when compared to other waves such as the Gamma and Delta replacements.
such as the Gamma and Delta replacements.

Global SARS-CoV-2 Phylogenetic Analysis in the Town of Serrana
In this study, we performed a phylogenetic analysis of all of nomes that were obtained from the city of Serrana. The utilized d reference sequences retrieved up to April 2022 from GISAID sam obtained global phylogenetic tree showed well-defined clusters w the main VOCs circulating in this region. As observed in Figure  corresponded to the epidemic waves described above, Gamma, De new strains generated in this study were randomly interspersed SARS-CoV-2 strains. For example, considering the collection date, resented by multiple introductions events in this region. This is exp

Global SARS-CoV-2 Phylogenetic Analysis in the Town of Serrana
In this study, we performed a phylogenetic analysis of all of the SARS-CoV-2 genomes that were obtained from the city of Serrana. The utilized dataset contained 2908 reference sequences retrieved up to April 2022 from GISAID sampling worldwide. The obtained global phylogenetic tree showed well-defined clusters which corresponded to the main VOCs circulating in this region. As observed in Figure 7, the largest clusters corresponded to the epidemic waves described above, Gamma, Delta, and Omicron. The new strains generated in this study were randomly interspersed within the reference SARS-CoV-2 strains. For example, considering the collection date, Delta cluster was represented by multiple introductions events in this region. This is explained by the fact that the multiple introductions of the main SARS-CoV-2 lineages occurred over time in these regions, which supports the epidemiologic events observed in the SARS-CoV-2 pandemic not only in the state of São Paulo but also in Brazil, and worldwide.

Discussion
This study investigated the SARS-CoV-2 molecular dynamics between the period of 2020-2022 in a model vaccination town in Brazil (Serrana) that has gained international importance [4]. The first confirmed COVID-19 case in this city dates back to April 2020 [17], almost two months (25 February 2020) after the first confirmed SARS-CoV-2 case in Brazil [18], reflecting the intensive dissemination speed of SARS-CoV-2 in the state of São Paulo. During the initial period of the COVID-19 pandemic, Serrana followed the pre-established measures by the Ministry of Health of Brazil including isolation, lockdowns, reduction in public transportation, similar to the worldwide guidelines [19]. Further, under the Butantan Institute's guidance, Serrana participated in the first stepped-wedge trial to assess the efficiency of the CoronaVac vaccine. Briefly, this clinical trial was conducted between the epidemiological weeks 6 and 19 of 2021, coinciding with the mid-Gamma VOC wave of the second wave when the Gamma VOC was introduced and became predominant (for more information of the campaign organization, refer to reference [20]). Additionally, by this time São Paulo state had established the Network for Pandemic Alert of SARS-CoV-2 variants, that improved the genomic surveillance in this region and helped to understand the introduction routes of the SARS-CoV-2 variants in the town of Serrana.
During the ongoing vaccination program, we observed a significant gap between the vaccination process and the lack of molecular data concerning the circulating SARS-CoV-2 variants; thus, we estimated the effect of the vaccine on the SARS-CoV-2 diversity and the clinical impact that was measured by the percentage of observed lethal and severe cases, respectively. With a focus on this, we adopted a large-scale sequencing survey on the circulating SARS-CoV-2 variants which was aimed at the sequencing of all of the positive cases in this location. By this way, we estimated the SARS-CoV-2 substitution dynamics which followed the trends of emergence and VOCs' dissemination in Brazil from the introduction of the parenteral strains in the region [21], and the emergence of the first epidemic Gamma VOC wave [22]. This profile was also observed for the Delta epidemic wave [23]. The performed genomic surveillance also helped to identify some rare VOIs, such as C.37 which was circulating in the Andean countries but was underrepresented in Brazil, and occurred mostly as introductions [24,25]. This shows the importance of the systemic monitoring of the circulating SARS-CoV-2 variants that gives important molecular epidemiological information, helping not only to monitor rare lineages but also to predict the introduction of important VOCs and to predict further epidemic waves.
We believe that the application of the vaccine in Serrana town was related to reduced morbidity and mortality as judged by the evaluation of the clinical score of the positive individuals (especially Gamma and Delta VOCs). This was observed especially during the Gamma and Delta VOC waves. In that line, during the Gamma VOC wave, nearby cities (São José do Rio Preto city, situated about~200 km from the town of Serrana) experienced a higher mortality rate [26], especially among young unvaccinated individuals. We believe that reduced lethality in Serrana was probably due to the anticipated vaccination program, where, by this time, 80.1% of the adult population was vaccinated. The beneficial effects of the vaccination were also observed in other studies where fully vaccinated individuals were less likely to acquire symptomatic or asymptomatic infections [27]. Therefore, the COVID-19 vaccination shows beneficial effects in reducing the SARS-CoV-2-related infection rates, severe cases, and mortality.
Despite the highly immune population in the studied location, we observed an extremely high circulation of the Omicron VOC that was responsible for the third epidemic wave (the first cases were detected in the 51st epidemic week). The dissemination of the Omicron VOC was very high, and in a period of 3-4 weeks, Omicron VOC was a fully dominant strain in Serrana town. The Omicron VOC dissemination demonstrated that the currently applied vaccination schemes could not contain this variant. The Omicron VOC has a specific mutation profile that confers immune evasion from the commonly applied SARS-CoV-2 vaccines [28,29]. Additionally, the inactivated SARS-CoV-2 vaccines, which were widely applied in Serrana town, lost their neutralizing capacity in regard to the Omicron VOC, where only 40% of the immune serum showed low-level neutralization to Omicron with a 14.7 xfold decrease compared to the wild variant [30]. The low efficiency to Omicron VOC protection has been attributed to the m-RNA vaccines, as observed in studies performed among healthcare workers with high occupational risk of SARS-CoV-2 exposure [31]. Despite that, the vaccination could not contain the dissemination of the Omicron VOC. In Serrana, we observed very low frequency of the lethal and severe cases, similar to other locations, such as the Gauteng province in South Africa [32], and southern California, USA [33]. Moreover, beyond the Omicron VOC vaccine escape, Brazil was gradually reducing restriction measures by this time, e.g., for the holidays of children and students [34]. In addition, the Christmas period related to intensive human dispersion and traveling which might all have contributed to the rapid dissemination of Omicron VOC, not only in Brazil but also worldwide. The lower risk of severe outcomes for the Omicron infection and its interclade variation is related to the global establishment of this variation as a dominant lineage, which should alert the public health authorities worldwide [32].

Conclusions
In conclusion, in this study we performed longitudinal monitoring of the SARS-CoV-2 variants in a model vaccination city in Southeast Brazil. Despite the epidemic waves during the study period, we observed that the vaccination was crucial in reducing morbidity, severe cases, and mortality. The vaccination was not effective in regard to the Omicron VOC introduction and rapid dissemination, which is due to the specific mutation profile of this variant and the immune evasion of the currently used vaccines. The performed genomic surveillance plays an important role in the monitoring of SARS-CoV-2 variants, as it allows for the quick detection of new variants and monitoring of their dissemination and substitution rates. In that line, we still continue the genomic monitoring of the SARS-CoV-2 variants in this region and turn our attention to the diversification of Omicron and its sublineages that can rise.

Informed Consent Statement:
The informed consent was waived by the Institutional Ethics Committee-Hospital das Clínicas da Faculdade de Medicina de Ribeirao Preto-Universidade de São Paulo. The main symptoms related before and after the vaccination campaign were similar and included-CAAE: 50367721.7.1001.5440. An exemption from the application of the informed consent form was approved once the molecular test (SARS-CoV-2 RT-PCR) had already been performed and the result was communicated by the health team to patients. The residual RNA was used (this material would be discarded) and the obtained results did not bring any new clinical/therapeutic conduct for the patient.

Data Availability Statement:
The genomic sequences presented in this study are openly available in the GISAID public database (available online: https://www.gisaid.org/). The GISAID IDs and related metadata presented in this study are available in the Supplementary Materials File S1.