Rapid Shift from SARS-CoV-2 Delta to Omicron Sub-Variants within a Dynamic Southern U.S. Borderplex

COVID-19, caused by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), remains an ongoing global health challenge. This study analyzed 3641 SARS-CoV-2 positive samples from the El Paso, Texas, community and hospitalized patients over 48 weeks from Fall 2021 to Summer 2022. The binational community along the U.S. southern border was predominantly SARS-CoV-2 Delta variant (B.1.617.2) positive for a 5-week period from September 2021 to January 2022 and quickly transitioned to the Omicron variant (B.1.1.529), which was first detected at the end of December 2021. Omicron replaced Delta as the predominant detectable variant in the community and was associated with a sharp increase in COVID-19 positivity rate, related hospitalizations, and newly reported cases. In this study, Omicron BA.1, BA.4, and BA.5 variants were overwhelmingly associated with S-gene dropout by qRT-PCR analysis unlike the Delta and Omicron BA.2 variants. The study reveals that a dominant variant, like Delta, can be rapidly replaced by a more transmissible variant, like Omicron, within a dynamic metropolitan border city, necessitating enhanced monitoring, readiness, and response from public health officials and healthcare workers.

Nasopharyngeal swabs were placed in viral transport media (VTM) or Bionex solutions for diagnostic processing by the provider. COVID-19-positive samples were provided to the UTEP-BBRC for viral RNA extraction followed by next-generation sequencing (NGS) and analysis.

Viral RNA Extraction and SARS-CoV-2 Confirmation
To validate the positivity of the received samples, we performed quantitative real-time PCR (qRT-PCR). SARS-CoV-2 RNA was isolated from specimens (200 µL) using the MagMAX Viral/Pathogen Nucleic Acid Isolation kit (ThermoFisher, Waltham, MA, USA) and the KingFisher Flex Purification System (ThermoFisher) according to the manufacturer's instructions. Viral SARS-CoV-2 RNA was confirmed by a multiplex qRT-PCR test following the guidelines of the Applied Biosystems TaqPath COVID-19 Combo Kit, which is approved for detection of SARS-CoV-2 under Emergency Use Authorization (EUA). The qRT-PCR reactions were prepared following the 200 µL sample input, 96-well reaction plate protocol according to the manufacturer's instructions, processed using an Applied Biosystems 7500 Fast Dx Real-Time PCR Instrument, and analyzed with the Applied Biosystems COVID-19 Interpretive Software (v1.5). The three SARS-CoV-2 viral genes targeted by this protocol were the ORF1ab gene (ORF1ab), the gene for the N protein (N), and the gene for the S protein (S). The Ct values for the viral gene targets (ORF1ab, N, and S) had to be less than or equal to 37 for the samples to be identified as positive. Samples not meeting these criteria were excluded from further studies. The limit of detection (LoD) of SARS-CoV-2 viral concentrations was that established for the TaqPath COVID-19 Combo Kit.
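
The Ct-based inclusion rule described above can be illustrated with a minimal Python sketch. It is not the Applied Biosystems COVID-19 Interpretive Software logic; the function names and the `min_targets` parameter are hypothetical. The default requires all three targets at Ct ≤ 37, as stated in this section, while `min_targets=2` corresponds to the "at least two viral genes" wording used in the Discussion.

```python
from typing import List, Mapping, Optional

CT_CUTOFF = 37.0                 # Ct threshold stated in the text
TARGETS = ("ORF1ab", "N", "S")   # the three gene targets named in the text


def detected_targets(ct_values: Mapping[str, Optional[float]],
                     cutoff: float = CT_CUTOFF) -> List[str]:
    """Return the targets whose Ct value is present and at or below the cutoff."""
    found = []
    for target in TARGETS:
        ct = ct_values.get(target)   # None means no amplification was recorded
        if ct is not None and ct <= cutoff:
            found.append(target)
    return found


def is_positive(ct_values: Mapping[str, Optional[float]],
                min_targets: int = 3,
                cutoff: float = CT_CUTOFF) -> bool:
    """Hypothetical inclusion rule: enough targets detected at or below the cutoff."""
    return len(detected_targets(ct_values, cutoff)) >= min_targets


if __name__ == "__main__":
    sample = {"ORF1ab": 22.1, "N": 23.0, "S": None}   # illustrative values only
    print(detected_targets(sample))
    print(is_positive(sample), is_positive(sample, min_targets=2))
```

Keeping the number of required targets as a parameter makes the difference between the two stated criteria explicit without hard-coding either one.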

NGS and Analysis
SARS-CoV-2 whole viral genome targeted NGS was performed by the UTEP-BBRC Biomolecule Analysis & Omics Core laboratory using the following procedures. Invitrogen's SuperScript IV First-Strand Synthesis System was used to produce cDNA from viral RNA. Next, IDT's xGen for SARS-CoV-2, containing primers against the complete genome of SARS-CoV-2 isolate Wuhan-Hu-1 (NCBI Reference Sequence NC_045512.2), was used for library construction from first-strand cDNA following the low viral load input plate protocol. Libraries were quantified with the Qubit dsDNA high sensitivity kit and quality checked using a 4200 TapeStation with D1000 ScreenTape. Following the quality check, libraries were normalized enzymatically with the IDT Normalase reagents. Equimolar library pools were sequenced on an Illumina MiSeq system with V3 reagent kits (600 cycles) or on Illumina's NextSeq2000 system with P2 (300 cycles) reagent kits while maintaining a consistent number of reads per sample. Because of the accumulating mutations in the SARS-CoV-2 genome, and to achieve complete coverage when sequencing the genomes, it was necessary to expand the sequencing length from 150 to 300 bases to allow enough overlap at the sites with the increased number of mutations. Sequences of the samples were demultiplexed and saved in sample-specific FASTQ files. Reads were trimmed for quality using Trimmomatic (v0.38) and then aligned using the Burrows-Wheeler Aligner (bwa, v0.7.15) to the Wuhan-Hu-1 reference sequence (NCBI Reference Sequence: NC_045512.2), with the GRCh37 human genome build included as a decoy. Sequences were then deduplicated and sorted using Samtools (v1.6). Bcftools (v1.12) mpileup was then used to create a compiled VCF file. Using Bcftools consensus, the sequence was extracted, and a single FASTA file of all the sequences was generated.
Pangolin V (Aine, version 3.0), along with PUsher, was used to call the COVID-19 lineages from the generated FASTA sequences that had a maximum ambiguous cutoff of 20%. Generated results were combined, and graphs were plotted using pivot tables in Excel. Overall, 6.15% of samples sequenced were not assigned a lineage (224 unassigned) because of poor coverage and were excluded from further analysis and reporting in this study. Pango nomenclature was used to assign lineages that were reported as percent of the weekly total analyzed.
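
The two bookkeeping steps described above, applying the 20% ambiguity cutoff and reporting lineages as a percent of the weekly total, can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used in the study: "ambiguous" is taken here to mean any base other than A, C, G, or T, and the function names and week labels are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def ambiguous_fraction(seq: str) -> float:
    """Fraction of positions that are not an unambiguous A, C, G, or T (assumed definition)."""
    seq = seq.upper()
    if not seq:
        return 1.0   # treat an empty sequence as fully ambiguous
    unambiguous = sum(seq.count(base) for base in "ACGT")
    return (len(seq) - unambiguous) / len(seq)


def passes_cutoff(seq: str, max_ambiguous: float = 0.20) -> bool:
    """Keep a consensus sequence only if its ambiguous fraction is within the cutoff."""
    return ambiguous_fraction(seq) <= max_ambiguous


def weekly_lineage_percent(
    records: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, float]]:
    """Convert (week, lineage) pairs into percent of that week's analyzed samples."""
    counts = defaultdict(Counter)
    for week, lineage in records:
        counts[week][lineage] += 1
    result = {}
    for week, lineage_counts in counts.items():
        total = sum(lineage_counts.values())
        result[week] = {lin: 100.0 * n / total for lin, n in lineage_counts.items()}
    return result


if __name__ == "__main__":
    calls = [("wk 01-2022", "BA.1"), ("wk 01-2022", "BA.1"), ("wk 01-2022", "AY.103")]
    print(weekly_lineage_percent(calls))
```

The same table could equally be produced with a spreadsheet pivot table, as was done in the study; the code only makes the arithmetic explicit.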

Radial Phylogenetic Tree
To visualize the sequence relationship of the S-protein, FASTA sequences were loaded into Geneious Prime (v2022.2, www.geneious.com; accessed on 15 September 2022) and the S-gene was isolated. The S-gene was aligned using the Clustal Omega plugin (v1.2.3) and then translated. A Jukes-Cantor distance matrix was determined, and a Neighbor-Joining tree was constructed with the Wuhan S-protein as the outgroup.
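
For readers unfamiliar with the Jukes-Cantor correction, the sketch below shows how a pairwise distance matrix of this kind can be computed from aligned sequences. It is a generic illustration, not the Geneious Prime implementation. The standard nucleotide form is d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites; the `n_states` parameter (4 for nucleotides, 20 for amino acids) generalizes this and is an assumption of the sketch, since the text does not state which form was applied. Neighbor-Joining tree construction itself is not shown.

```python
import math
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences, skipping gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p: float, n_states: int = 4) -> float:
    """Jukes-Cantor corrected distance for an observed proportion of differences p."""
    b = (n_states - 1) / n_states   # 3/4 for nucleotides, 19/20 for amino acids
    if p >= b:
        return math.inf             # saturation: the correction is undefined
    return -b * math.log(1.0 - p / b)


def distance_matrix(seqs: Dict[str, str],
                    n_states: int = 4) -> Dict[Tuple[str, str], float]:
    """Pairwise corrected distances keyed by (name_i, name_j)."""
    names = list(seqs)
    return {
        (i, j): jukes_cantor(p_distance(seqs[i], seqs[j]), n_states)
        for i in names
        for j in names
    }


if __name__ == "__main__":
    toy = {"ref": "ACGTACGT", "s1": "ACGTACGA", "s2": "ACGAACGA"}   # toy sequences
    for pair, d in distance_matrix(toy).items():
        print(pair, round(d, 4))
```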

Data Availability
SARS-CoV-2 sequences were deposited to Global Initiative on Sharing All Influenza Data (GISAID) in accordance with data sharing requirements; sequences accessible by searching "TX-UTEP-" in the virus name field. Additionally, sequences are available at https://datarepo.bioinformatics.utep.edu/getdata?acc=YKQV0DBUXY130OL.

El Paso Strong Data
Data collected and reported by El Paso-DPH were obtained from the El Paso Strong website [17], accessed 8 January 2022. Data from 2021 (wk 36 through wk 52) to 2022 (wk 01 through wk 30) were plotted using Microsoft Excel to depict the number of new weekly cases, the number of individuals hospitalized, and the 7-day rolling average within the County of El Paso, Texas.
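
As an illustration of the 7-day rolling average referred to above, the following Python sketch computes a trailing moving average over daily counts. The input values are made up for the example, and the choice of a trailing (rather than centered) window is an assumption, since the text does not specify how El Paso Strong computed its average.

```python
from typing import List, Sequence


def rolling_average(daily_counts: Sequence[float], window: int = 7) -> List[float]:
    """Trailing moving average; one value per day once a full window is available."""
    if window <= 0:
        raise ValueError("window must be positive")
    counts = list(daily_counts)
    averages = []
    running = 0.0
    for i, count in enumerate(counts):
        running += count
        if i >= window:
            running -= counts[i - window]   # drop the day that left the window
        if i >= window - 1:
            averages.append(running / window)
    return averages


if __name__ == "__main__":
    example = [120, 135, 150, 90, 80, 160, 170, 210, 240]   # illustrative counts only
    print(rolling_average(example))
```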

COVID-19 Sample Characteristics
A collaboration between The University of Texas at El Paso Border Biomedical Research Center (UTEP-BBRC), University Medical Center of El Paso (UMC), and El Paso Department of Public Health (DPH) was formed to monitor SARS-CoV-2 variants within the southern U.S. border community through NGS. UTEP-BBRC sequenced 3641 SARS-CoV-2 positive samples provided weekly by UMC and DPH during a 48-week period from September 2021 to July 2022. A total of 1635 samples were collected from UMC and 2006 from DPH. All samples were de-identified prior to analysis; thus, the study does not report gender, age, or symptomatic state. Table 1 indicates the number of samples sequenced each week. The number of sequenced samples remained steady (average 49 samples) during the initial phase of the study at the end of 2021 (wk 36-2021 to wk 49-2021). In January 2022, more samples were collected and examined (average 95 samples during wk 01-2022 to wk 06-2022). During March and April 2022 (wk 09-2022 to wk 16-2022), fewer SARS-CoV-2 samples were received and sequenced. Finally, as the weeks progressed, we observed a gradual increase in samples, reaching a maximum weekly count of 514 samples in July (wk 27-2022) and then tapering off towards the end of the study (wk 30-2022).

SARS-CoV-2 Lineage Transition
Variants were denoted according to the assigned WHO label, Delta (∆) or Omicron (O). Greek letter symbols were used to denote these variants for easier visualization. The Pango lineage naming system was used to assign lineages and sublineages. Omicron variants were further characterized by the sublineages BA.  However, the proportion of BA.5 increased rapidly, and BA.5 was identifiable in up to 81% of samples during the final week of the study.

Delta and Omicron Sublineage Spectrum
The appearance of SARS-CoV-2 lineages and sublineages from variants was evident as the original virus mutated. A total of 98 sublineages were identified in this study, 50 of which occurred in more than 5 samples and are depicted in Figure 2.

qRT-PCR S-gene Dropout
The failure to detect SARS-CoV-2 S-gene in COVID-19 molecular tests is referred to as S-gene dropout or drop-off [18].
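
A minimal Python sketch of how S-gene dropout could be flagged per sample and summarized per week is given below. The operational definition used here (ORF1ab and N detected, S not detected) and the reuse of the Ct ≤ 37 cutoff from the Methods are assumptions made for illustration; the function names and sample values are hypothetical.

```python
from typing import Dict, Iterable, Mapping, Optional, Tuple

CtValues = Mapping[str, Optional[float]]


def is_s_gene_dropout(ct: CtValues, cutoff: float = 37.0) -> bool:
    """Assumed definition: ORF1ab and N are detected while S is not."""
    def detected(target: str) -> bool:
        value = ct.get(target)   # a missing key or None means no amplification
        return value is not None and value <= cutoff

    return detected("ORF1ab") and detected("N") and not detected("S")


def weekly_dropout_percent(
    samples: Iterable[Tuple[str, CtValues]]
) -> Dict[str, float]:
    """Percent of samples per week that show S-gene dropout."""
    totals: Dict[str, int] = {}
    dropouts: Dict[str, int] = {}
    for week, ct in samples:
        totals[week] = totals.get(week, 0) + 1
        dropouts[week] = dropouts.get(week, 0) + int(is_s_gene_dropout(ct))
    return {week: 100.0 * dropouts[week] / totals[week] for week in totals}


if __name__ == "__main__":
    demo = [
        ("wk 01-2022", {"ORF1ab": 20.0, "N": 21.0, "S": None}),
        ("wk 01-2022", {"ORF1ab": 20.0, "N": 21.0, "S": 22.0}),
    ]
    print(weekly_dropout_percent(demo))
```

Tracking this weekly percentage alongside sequencing-based lineage calls is one way to compare the two signals over time.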

SARS-CoV-2 Viral Phylogenetic Tree
The genetic distance between the SARS-CoV-2 S-gene within our sample population and the Wuhan S-gene as an outgroup was visualized through a radial phylogenetic tree (Figure 4). The ∆-strain (green within Figure 4) was observed to be tightly clustered, while the O-sublineages were spatially dispersed, indicating extensive genetic variance among those samples. The BA.1 and BA.2 sublineages, magenta and blue respectively, formed monophyletic branches. BA.4 and BA.5, shown as orange and yellow, often shared mixed branches. Within the phylogenetic tree (Figure 4), the height of the nodes suggests variability in the mutational load relative to the Wuhan S-gene reference. Here, Delta and O-

El Paso Weekly Cases and Hospitalizations
The incidence of weekly COVID-19 cases and hospitalizations within the El Paso community is displayed in Figure 5A and 5B. The number of cases remained steady until

COVID-19 Cases in El Paso Community and in U.S.
The trends in 7-day average national U.S. cases and El Paso community cases were similar except for an early peak in new El Paso cases that appeared from mid-November 2021 to mid-December 2021 (wk 46-2021 to wk 50-2021) (Figure 6).

Discussion
In this study, the viral genomic landscape of SARS-CoV-2 within the El Paso, Texas, border community was investigated. The observations illustrate the chronological progression of the Omicron variant within a predominantly Delta variant population of 3641 samples. Of these, 1635 samples were provided by UMC from hospitalized patients (Table 1). As the sole Level-1 Trauma Center in any 200-mile direction, UMC serves the El Paso, TX, USA-Juarez, Chihuahua, Mexico communities. Inpatient samples may have originated from symptomatic or asymptomatic individuals with the disease; however, patient information was dissociated from the specimens and not traced. In contrast, 2006 samples were provided by DPH and were obtained from a COVID-19 testing location open to the community (Table 1). Throughout this study, samples were sequenced from the two locations. COVID-19 samples received were validated by qRT-PCR to determine a Ct value for at least two viral genes, confirming a continued SARS-CoV-2 positive status following potential freeze/thaw cycles and time elapsed since collection. Samples that did not meet this criterion likely contained too little viral RNA for NGS and were not sequenced for further analysis. From UMC, we received 1969 samples, and 302 did not meet the study criteria for NGS. From DPH, we received 2408 samples, and 400 did not meet the study criteria. Reliable sequence data of ≥80% genome coverage were obtained for approximately 90% of all samples received. We did not observe differences in the Pango lineage classification of variants between samples obtained from UMC and DPH (Figure 1). The inclusion of samples from different sources reduced bias and resulted in a more diverse sample cohort. The Delta and Omicron variants appeared successively as VOC in the U.S.; as documented here, the evolution of SARS-CoV-2 variants was observed within our sample population from September 2021 to July 2022 (Figure 1).
At the onset of this study, SARS-CoV-2 RNA was predominantly of the Delta variant, primarily ∆-AY lineages (Figures 1 and 2), from September 2021 (wk  to December 2021 (wk 49-2021). Samples analyzed during this time represented 24 total sublineages of the Delta variant (Figure 2). The most common sublineage of the Delta variant was ∆-AY.103 (Figure 2), which first appeared in the U.S. in May 2020 [19]. Following the global arrival of Omicron  [19]. With Omicron first observed in our data during the final week of December 2021, there were only two weeks of data collection at the time of lineage transition (wk 49-2021 and wk 52-2021), and missing data from the intervening two-week period (wk 50-2021 to wk 51-2021) inhibited our ability to monitor the spread of Omicron versus Delta. The lack of samples from the community (DPH) during the transition from Delta to Omicron was due to low numbers of reported cases. As of November 2022, Omicron lineages remain classified as a VOC by the WHO, the CDC, and the European Centre for Disease Prevention and Control [19][20][21].
Recent data suggest that the Omicron variant contains more mutations than its predecessors [22]. Lineages arising from the Omicron variant have several properties that permit their dominant behavior [23] and the ability to evade neutralizing antibodies obtained from previous vaccinations or infections, with BA.2.12.2 having the most significant capability [24]. The onset of O-BA.1 during the last week of December 2021 was associated with a spike in the rolling 7-day average of new weekly cases and hospitalized individuals in El Paso ( Figure 5A [28]. However, in our study, O-BA.5 was prevalent at higher proportions (up to 81% of samples) than O-BA.4, which represented an average of 10.4% of weekly samples (Figure 2).
New mutations within viral genes impact the effectiveness of molecular diagnostic tests, such as qRT-PCR, in detecting SARS-CoV-2 [29,30]. The S-gene target is not recognized by the ThermoFisher TaqPath qRT-PCR kit for SARS-CoV-2 detection in samples containing the S-gene deletion 69-70 [31]. These events are commonly known as S-gene dropout, S-gene target failure (SGTF), or S-gene drop-off. The Omicron variant contains over 30 mutations within the S-gene, including the above-mentioned 69-70 deletion [32]. The O-BA.2 lineage does not have the 69-70 deletion [32], which correlates with the S-gene target being detected by PCR tests, as was the case with the Delta variant. In previous studies, samples identified as Omicron (with the exception of the BA.2 lineage and sublineages) were associated with S-gene dropout, similar to what was observed with samples identified as Alpha (B.1.1.7 VOC 202012/01) [18,31]. In this study, the frequency of S-gene dropout among collected and sequenced samples mirrored the progression of Omicron detection within the population (Figures 1 and 3). S-gene dropout was observed in an average of 3.3% of samples analyzed from September 2021 to the second week of December 2021 (wk 36-2021 to wk 49-2021) (Figure 3). During this time, sequenced samples were associated with the Delta variant, which, unlike Omicron, lacks the 69-70 deletion in the S protein (Figure 1) [33]. The last week of December saw an increase in S-gene dropout (21%) that was associated with the appearance of the Omicron O-BA.1 lineage (Figures 1 and 3). S-gene dropout was observed at a higher frequency during the first week of January 2022 (wk 01-2022; 91% of samples), around the same time the Omicron variant became dominant (lineage O-BA.1, average 89% of samples) (Figures 1 and 3).
Not surprisingly, data reported by the City of El Paso showed an increase in the number of active cases, new weekly cases, hospitalized individuals, and newly reported deaths following the emergence of O-BA.1 [17]. The appearance and successive increase in samples displaying the O-BA.2 lineage during the last week of March 2022 (wk 13-2022) was accompanied by the opposite trend in S-gene detection, with S-gene dropout decreasing (18% of samples) (Figures 1 and 3). The introduction of lineages O-BA.4 and O-BA.5, followed by their dominance in the third week of June (wk 25-2022), resulted in an increase in S-gene dropout (54%) (Figures 1 and 3). Lineage O-BA.5 remained dominant until the end of our study (wk , and correspondingly, S-gene dropout continued to increase (up to 91%) (Figures 1 and 3). Tracking the emergence of new lineages can be facilitated by the chronological observation of S-gene dropout of one or more targets in PCR tests [34]. Our study indicates that S-gene dropout accompanies the appearance of new lineages that can be detected once PCR probes are updated to encompass new mutations. The timely identification of this trend is crucial to avoid false negative results in tested individuals. These data suggest that S-gene dropout can be used by health officials as an indicator of new SARS-CoV-2 variants in the community. Multiple global agencies have reported that S-gene dropout can assist in identifying Omicron variants [35][36][37].
In this study, the S-protein sequences from Delta, O-BA.1, and O-BA.2 formed clearly distinct clades (Figure 4). O-BA.2 showed a greater distance from the original Wuhan reference, whereas O-BA.4 and O-BA.5 were intermixed but closer in similarity to the Wuhan reference. In competition experiments in rodents, the viral fitness of O-BA.5 was greater than that of O-BA.2 [38], suggesting that the additional mutations in O-BA.2 lower its fitness, which may have contributed to O-BA.5 becoming the dominant lineage in our study. The intermixing of O-BA.4 and O-BA.5 was likely due to mutations that are not used by Pangolin in predicting the lineage and should be explored further to determine whether additional clusters can be resolved.
In the El Paso border region, the peaks in the numbers of new weekly COVID-19 cases and hospitalizations seen during O-BA.1 were not seen with subsequent Omicron variants (Figure 5). Trends in the 7-day rolling average within El Paso closely paralleled those of the nation (Figure 6), with the exception of a peak in El Paso cases between mid-November 2021 and mid-December 2021 (wk 46-2021 to wk 50-2021) that was not seen in the U.S. data (Figure 6). Conducting genomic surveillance along the U.S. border regions can serve as an indicator of what is occurring nationally and inform the type of measures that are needed against SARS-CoV-2, highlighting the impact of this study.

Conclusions
Case number trends within the El Paso community and U.S. cases reported by the CDC suggest that surveillance within the dynamic border community could be used to inform public health officials nationwide. This study tracked the progression of SARS-CoV-2 variants by NGS of 3641 samples originating from the southern border community, including the general population and hospitalized individuals. No difference was observed in the frequency of variants between samples that originated from hospitalized and non-hospitalized individuals. Omicron quickly replaced Delta as the most prominent variant detectable in individuals in the El Paso/Juarez borderplex community. The rise of Omicron, specifically the O-BA.1 lineage, was associated with the marked appearance of S-gene dropout. O-BA.1, O-BA.4, and O-BA.5 samples were associated with S-gene dropout, while the S-gene of Delta and O-BA.2 was largely detected during qRT-PCR confirmation of SARS-CoV-2. The evolution of the SARS-CoV-2 lineages in the El Paso/Juarez region followed a trend similar to that in other parts of the United States. Similar studies could be used to guide community control and prevention measures, such as immunization strategies, healthcare preparedness, and COVID-19 testing availability, to reduce viral transmission.