Article

Genetic Diversity of SARS-CoV-2 in Kazakhstan from 2020 to 2022

1 National Center of Public Health Care, The Ministry of Health of the Republic of Kazakhstan, Almaty 050008, Kazakhstan
2 Smorodintsev Research Institute of Influenza, 197022 Saint Petersburg, Russia
3 Research and Production Center for Microbiology and Virology, Almaty 050010, Kazakhstan
4 Faculty of Natural Sciences and Geography, Abai Kazakh National Pedagogical University, Almaty 050010, Kazakhstan
5 National Center of Expertise of the Committee of Sanitary and Epidemiological Control, The Ministry of Health of the Republic of Kazakhstan, Astana 010000, Kazakhstan
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Viruses 2026, 18(1), 138; https://doi.org/10.3390/v18010138
Submission received: 15 November 2025 / Revised: 30 December 2025 / Accepted: 5 January 2026 / Published: 21 January 2026
(This article belongs to the Special Issue Molecular Epidemiology of SARS-CoV-2, 4th Edition)

Abstract

Coronavirus disease 2019 (COVID-19), caused by SARS-CoV-2, has had major social and economic consequences worldwide. Whole genome sequencing (WGS) is essential for genomic monitoring, enabling tracking of viral evolution, detection of emerging variants, and identification of introductions and transmission chains to inform timely public health responses. Here, we compile and harmonize SARS-CoV-2 genomic data generated by multiple laboratories across Kazakhstan together with publicly available sequences to provide a national overview of genomic dynamics across successive epidemic waves from 2020 to 2022. We analyzed 4462 genomes deposited in GISAID (including 340 generated in this study), of which 3299 passed Nextclade quality filters, and summarized lineage turnover across major phases (pre-VOC, Alpha, Delta, Omicron BA.1/BA.2, Omicron BA.4/BA.5, and a later recombinant-dominant period). Sequencing intensity varied markedly over time (0.60‰ of confirmed cases during Delta vs. 11.57‰ during the Omicron BA.5 wave), suggesting that lineage diversity and persistence may be underestimated. Pre-VOC circulation included ≥12 Pango lineages with evidence of multiple introductions and sustained local transmission, including a Kazakhstan-restricted B.4.1 lineage that emerged in Nur-Sultan/Astana and disappeared after April 2020. The Tengizchevroil oilfield outbreak comprised B.1.1 viruses with phylogenetic support for ≥three independent introductions. Alpha and Omicron waves were characterized by repeated introductions and heterogeneous origins, whereas Delta was dominated by AY.122 with an additional distinct AY.122 cluster; a notable BF.7 local transmission event was observed during BA.5. We also highlight locally enriched non-lineage-defining mutations. Overall, recurrent importations and variable local amplification shaped SARS-CoV-2 dynamics in Kazakhstan, while interpretation is constrained by strongly time-skewed sequencing.

1. Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was first identified in Wuhan, China, at the end of 2019 and spread rapidly, causing a global pandemic [1,2]. SARS-CoV-2 is an enveloped, non-segmented, positive-sense, single-stranded RNA virus with a genome of about 30,000 bases that encodes four structural proteins: spike (S), envelope (E), membrane (M), and nucleocapsid (N) [3,4]. Overall, SARS-CoV-2 shares approximately 79.5% and 96% genomic sequence identity with the previously identified SARS-CoV and the bat coronavirus SL-CoV-RaTG13, respectively [5,6].
SARS-CoV-2 readily accumulates mutations and undergoes recombination, which can alter viral protein structure, binding affinity, transmissibility, diagnostic performance, vaccine efficacy, and sensitivity to antiviral drugs [7,8,9]. Since the onset of the pandemic, the virus has given rise to new variants of interest (VOIs) and variants of concern (VOCs). During the circulation of earlier variants such as Alpha, Beta, and Delta, high mortality was observed among elderly individuals and those with comorbidities such as hypertension, kidney disorders, cancer, diabetes, and obesity [10]. Later variants, including the Omicron group, caused less severe illness than previous variants but led to an increase in cases among young people and children [11].
The first confirmed case of COVID-19 in Kazakhstan was reported on 13 March 2020 in an individual who had traveled from Germany. The initial virus strain in Kazakhstan matched the original strain identified in Wuhan, China [12]. Case numbers in Kazakhstan increased from May 2020. Early studies indicated circulation of the original SARS-CoV-2 strain, which caused significant outbreaks around the world [13].
From the beginning of 2021, more transmissible SARS-CoV-2 variants began to spread, including the Alpha variant (B.1.1.7), first identified in the UK, and the Beta variant (B.1.351), first identified in South Africa. From July 2021 onward, Kazakhstan began to register cases caused by the Delta variant (B.1.617.2), which is characterized by higher transmissibility and partial immune evasion. Delta subsequently became the dominant variant in the country, triggering a new wave of the pandemic [14,15].
The Omicron variant was officially confirmed in Kazakhstan in January 2022. It spread quickly across the country, aided by its high basic reproduction number (R0, the average number of secondary infections caused by one infected individual in a fully susceptible population). Omicron was more contagious than previous variants but caused less severe disease in most people, resulting in a large number of infections but relatively few severe cases [16].
In February 2022, Kazakhstan experienced a peak in cases caused by the Omicron strain. Against this background, the number of patients with COVID-19 in medical institutions increased significantly, but most cases were mild or moderate. Thus, Omicron subvariants were dominant in winter–spring 2023 [17].
From March 2022, subvariants BA.1 and BA.2 began to actively circulate. They were then joined by the BA.4 and BA.5 subvariants. These subvariants had improved immune evasion properties, which contributed to the increase in cases despite high vaccination rates and previous infections.
The purpose of this study was to analyze the dynamics of SARS-CoV-2 variant spread in Kazakhstan from 2020 to 2022 and to identify key mutations characteristic of the variants circulating in the country.

2. Materials and Methods

2.1. Sample Collection

Every month, the Reference Laboratory for Viral Infection Control (Almaty, Kazakhstan), which serves as the National Influenza Center of Kazakhstan, received 10% of SARS-CoV-2-positive samples from 18 regional virology laboratories for sequencing.
The samples used in this study were nasopharyngeal and oropharyngeal swabs selected according to epidemiological and clinical criteria that included: (1) samples from individuals with symptoms of acute respiratory viral infection or COVID-19; (2) samples from contacts of confirmed COVID-19 cases; (3) samples from individuals arriving from abroad; (4) samples from household foci of infection; and (5) samples from individuals with moderate or severe disease and/or hospitalized in intensive care units.
From 2022, monitoring of coronavirus infection was integrated into the existing epidemiological surveillance system for acute respiratory viral infections and influenza. All samples collected within this surveillance framework were tested in parallel for influenza and SARS-CoV-2; in addition, some samples that were negative in routine SARS-CoV-2 testing were tested for influenza.

2.2. Screening PCR

In regional laboratories, real-time PCR detection of SARS-CoV-2 RNA was performed using several reagent kits, including Amplitest SARS-CoV-2 (CRIE, Moscow, Russia), Intifika SARS-CoV-2 (Alkor Bio Company Ltd., Saint-Petersburg, Russia), and BGI (BGI Genomics, Shenzhen, China). The CDC’s Influenza SARS-CoV-2 Multiplex Assay reagent kit was used in the reference laboratory.

2.3. Virus Genome Sequencing

Viral RNA was extracted using the commercial PureLink RNA Mini Kit (Life Technologies, Carlsbad, CA, USA) and the QIAamp Viral RNA Mini Kit (Qiagen GmbH, Hilden, Germany). Reverse transcription and SARS-CoV-2 library preparation were performed using the AmpliSeq cDNA Synthesis and AmpliSeq Library PLUS for Illumina kits (Illumina, San Diego, CA, USA). Amplification products were purified using the AMPure XP reagent (Beckman Coulter, Inc., Brea, CA, USA). Library quality was assessed using the Agilent High Sensitivity DNA Kit on a Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA), and libraries were quantified using the NEBNext Library Quant Kit for Illumina. All kits were used according to the manufacturers’ recommendations.
Whole genome sequencing of SARS-CoV-2 was performed on an Illumina MiniSeq next-generation sequencer using the MiniSeq High Output Reagent Cartridge in paired-end mode, according to the manufacturer’s recommendations.

2.4. Bioinformatics and Phylogenetics

Sequencing quality was checked using FastQC 0.12.1, followed by quality trimming with Trimmomatic 0.39. Consensus genomes were assembled using BWA v. 0.7.17, Samtools v. 1.19.2, and iVar v. 1.4.2, along with custom Python 3.10.4 scripts. All sequenced genomes were promptly submitted to GISAID EpiCoV. In total, 4462 SARS-CoV-2 genomic sequences available in GISAID were downloaded and quality-filtered using Nextclade [18]; genomes with a qc.overallStatus of “good” or “mediocre” were retained. A total of 3299 SARS-CoV-2 genomes from Kazakhstan passed quality filtration. We first stratified the genomes into pandemic waves using collection dates and VOC designations: 122 genomes in the pre-VOC period, 167 in the Alpha period, 717 in the Delta period, 823 in the Omicron BA.1/BA.2 period, and 1193 in the Omicron BA.4/BA.5 period.
Uvaia v2.0.1 was used to search for the closest neighboring sequences available in GISAID, with the mmsa-2024-02-04.tar.xz global alignment as input. Briefly, each set of SARS-CoV-2 genomes from the corresponding wave was aligned to the Wuhan reference genome (EPI_ISL_402124) using uvaialign (e.g., for the Alpha wave the following command was used: uvaialign -r EPI_ISL_402124.fasta VOC_alpha.fasta). The uvaia search was then performed as follows: “uvaia -r mmsa-2024-02-04.tar.xz/2024-02-04_masked.fa --trim=230 KZ_alpha/uvaia.10dc4_alphaKZ.aln.xz -o alpha_result --nthreads=120 -x”. Rank 1 and rank 2 neighbors were used for further analysis. Multiple sequence alignment was performed using MAFFT (version 7.526) [19]. Phylogenetic trees were reconstructed in RAxML-NG (“raxml-ng --msa MSA_dataset.fasta --prefix MSA_dataset --model GTR”) [20]. Treemmer v0.3 [21] was used to remove redundant sequences while preserving phylogenetic diversity. Phylogenetic trees were visualized using custom R scripts (R v. 4.4.1, ggtree package).
The overall analysis workflow is presented in Supplementary Figure S1. The final list of viruses after quality filtration is provided in Supplementary Table S1.
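The Nextclade quality filter described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: it assumes a tab-separated Nextclade output containing the `seqName` and `qc.overallStatus` columns, and the function name and sequence names below are hypothetical.

```python
import csv
import io

# Quality categories retained in the study ("good" or "mediocre").
KEEP_STATUSES = {"good", "mediocre"}


def filter_nextclade(tsv_text):
    """Return names of sequences whose qc.overallStatus passes the filter."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row["seqName"]
        for row in reader
        if row["qc.overallStatus"] in KEEP_STATUSES
    ]


# Hypothetical three-row Nextclade table for illustration only.
example_tsv = (
    "seqName\tqc.overallStatus\n"
    "KZ-example-1\tgood\n"
    "KZ-example-2\tbad\n"
    "KZ-example-3\tmediocre\n"
)

passed = filter_nextclade(example_tsv)
print(passed)
```

In practice the same function could be applied to the contents of a real Nextclade TSV file read from disk; only the two column names are relied upon.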

2.5. Ethical Principles

In Kazakhstan, obtaining informed consent from the patient before collecting samples for COVID-19 is a mandatory procedure. During sample collection, patients filled out written voluntary consent in accordance with the order of the Ministry of Healthcare of the Republic of Kazakhstan dated 20 May 2015 No. 364 “On approval of the form of written voluntary consent of the patient for invasive interventions”. This retrospective study used anonymized residual samples obtained as part of routine and sentinel epidemiological surveillance of influenza and respiratory infections in Kazakhstan. The analysis of anonymized data and samples for scientific purposes was carried out in accordance with the Code of the Republic of Kazakhstan “On Public Health and the Healthcare System” (Article 9, Clause 4). The study was reviewed and approved by the Local Bioethics Committee of the RSE on the Right of Economic Management “National Center for Public Health” of the Ministry of Health of the Republic of Kazakhstan (Protocol No. 1 dated 30 April 2021) and the Local Ethics Committee of the Research and Production Center of Microbiology and Virology (Protocol No. 17 dated 30 October 2023).

3. Results

Kazakhstan experienced five COVID-19 waves between March 2020 (first confirmed SARS-CoV-2 case) and the end of the reporting period in late 2022. For analysis, we stratified the timeline by epidemic waves and the emergence of WHO variants of concern (VOCs) in Kazakhstan: (i) pre-VOC—from March 2020 to January 2021 (first detection of Alpha/B.1.1.7); (ii) Alpha period—February 2021 to June 2021 (until first detection of Delta/B.1.617.2); (iii) Delta period—July 2021 to December 2021 (until first detection of Omicron BA.1/BA.2); (iv) Omicron period, further subdivided into BA.1/BA.2 and BA.4/BA.5 sub-periods, followed by a recombinant-dominant phase (Figure 1).
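As a rough illustration of date-based stratification, the sketch below assigns a genome to a period from its collection date alone. The boundaries are taken from the Start and End columns of Table 1; the study also used VOC designations, which this simplified example ignores, and the function name is hypothetical.

```python
from datetime import date

# Period boundaries as listed in Table 1 (inclusive on both ends).
PERIODS = [
    ("pre-VOC", date(2020, 3, 1), date(2021, 1, 31)),
    ("Alpha", date(2021, 2, 1), date(2021, 6, 30)),
    ("Delta", date(2021, 7, 1), date(2021, 12, 14)),
    ("Omicron BA.1/BA.2", date(2021, 12, 15), date(2022, 5, 31)),
    ("Omicron BA.5", date(2022, 6, 1), date(2022, 10, 31)),
]


def assign_period(collection_date):
    """Return the period name containing the date, or None if outside all periods."""
    for name, start, end in PERIODS:
        if start <= collection_date <= end:
            return name
    return None


print(assign_period(date(2021, 8, 15)))
```

A genome collected after 31 October 2022 returns `None` here, which corresponds to the later recombinant-dominant phase that Table 1 does not cover.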
A total of 4462 SARS-CoV-2 genomes were obtained from GISAID, of which 340 were generated in this study. Following quality assessment with Nextclade, 3299 genomes met high- or medium-quality thresholds and were retained for downstream analyses.
Genomic sequencing was unevenly distributed over time, with scarce data early in the pandemic and a marked increase in the second and third years (Table 1). The proportion of confirmed cases sequenced ranged from 0.60‰ (per mille) during the Delta wave—when case counts were high and sequencing capacity limited—to 11.57‰ during the Omicron BA.5 wave. After the WHO ended the COVID-19 Public Health Emergency of International Concern on 5 May 2023, case registration practices changed substantially; therefore, we did not calculate sequencing proportions for the subsequent period.
Table 1. Proportion of sequenced genomes during different periods of COVID-19 pandemic in Kazakhstan.
| Period | Start | End | Total Cases | Total Genomes | Proportion Sequenced, ‰ | 95% CI |
|---|---|---|---|---|---|---|
| pre-VOC | 2020-03-01 | 2021-01-31 | 227,165 | 193 | 0.85 | 0.74–0.98 |
| Alpha | 2021-02-01 | 2021-06-30 | 250,898 | 328 | 1.31 | 1.17–1.46 |
| Delta | 2021-07-01 | 2021-12-14 | 645,121 | 387 | 0.60 | 0.54–0.66 |
| Omicron BA.1/BA.2 | 2021-12-15 | 2022-05-31 | 331,058 | 772 | 2.33 | 2.17–2.50 |
| Omicron BA.5 | 2022-06-01 | 2022-10-31 | 91,081 | 1054 | 11.57 | 10.90–12.29 |
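The per-mille proportions in Table 1 follow directly from the case and genome counts. The sketch below recomputes them and attaches a 95% confidence interval; the article does not state which interval method was used, so the Wilson score interval here is an assumption, and the function name is hypothetical.

```python
import math


def per_mille_with_wilson_ci(genomes, cases, z=1.959964):
    """Return (proportion, lower, upper) in per mille using a Wilson score interval."""
    p = genomes / cases
    denom = 1 + z * z / cases
    center = (p + z * z / (2 * cases)) / denom
    half = z * math.sqrt(p * (1 - p) / cases + z * z / (4 * cases * cases)) / denom
    return 1000 * p, 1000 * (center - half), 1000 * (center + half)


# (period, total genomes, total cases) as given in Table 1.
table1 = [
    ("pre-VOC", 193, 227165),
    ("Alpha", 328, 250898),
    ("Delta", 387, 645121),
    ("Omicron BA.1/BA.2", 772, 331058),
    ("Omicron BA.5", 1054, 91081),
]

for period, genomes, cases in table1:
    point, low, high = per_mille_with_wilson_ci(genomes, cases)
    print(f"{period}: {point:.2f} per mille ({low:.2f} to {high:.2f})")
```

The Wilson interval is a common choice for small proportions because it stays within the valid range and behaves well when the event is rare, as it is here.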

3.1. Genetic Diversity of SARS-CoV-2 Circulating in Kazakhstan in the Pre-VOC Period (March 2020–January 2021)

The first COVID-19 case in Kazakhstan was identified in Almaty on 13 March 2020. The earliest genomic data were generated in May 2020 from specimens collected in Nur-Sultan (now Astana) in late March 2020, with most early strains sequenced retrospectively. During the pre-VOC period, 12 Pango lineages circulated nationwide; B.1.1 predominated across most locations and months, whereas Astana was dominated by B.4.1 (Figure 2 and Figure 3).
Figure 2. Geographical distribution of sequenced SARS-CoV-2 genomes in the pre-VOC period.
Figure 3. Genetic diversity of SARS-CoV-2 in Kazakhstan in pre-VOC period (March 2020–January 2021).
Phylogenetic analysis of Kazakhstan samples from the pre-VOC period resolved six clades corresponding to A, B, A.2, B.1, B.1.1, and B.4.1. Within B.1.1, we observed diversification into B.1.1.10, B.1.1.294, B.1.1.336, B.1.1.440, and B.1.1.462. Nearest-neighbor clustering with Uvaia against a global background indicated multiple, sustained local transmission chains during the first epidemic wave of COVID-19 in Kazakhstan.
Geographically, A and B lineage viruses grouped with contemporaneous sequences from China and Malaysia, whereas A.2 grouped with European sequences (notably Spain and Portugal). B.1 sequences predominantly grouped with Asian sequences (Saudi Arabia and Indonesia), with a minor cluster linked to Iceland.
Lineage B.4.1 formed a distinct, Kazakhstan-only cluster. The sequenced B.4.1 genome was collected in Nur-Sultan (now Astana) in late March 2020. Its putative ancestor, B.4, was detected in Wuhan in January 2020 and became highly prevalent in Iran (303/740 B.4 sequences in GISAID). Compared with B.4, the Kazakhstan B.4.1 lineage carries ORF1ab substitutions I1023K and A1225D, plus a deletion at position 1024. B.4.1 forms a monophyletic clade restricted to Nur-Sultan, with no phylogenetic evidence of exportation and no circulation observed after April 2020.
B.1.1.294 was first reported in Russia in March 2020. In Kazakhstan, B.1.1.294 was detected in December 2020–January 2021 in the West Kazakhstan region and Nur-Sultan (Astana). The highest relative prevalence in the region was observed in Uzbekistan, where 4/28 genomes sequenced between March 2020 and January 2021 belonged to B.1.1.294. Phylogenetic analysis clustered Kazakhstan B.1.1.294 sequences with contemporaneous viruses from Russia and South Korea. According to GISAID metadata, 7/13 B.1.1.294 cases reported in South Korea were imports (six independent introductions from Uzbekistan during July–October 2020 and one from Russia in September 2020). Because genomic surveillance in Central Asia was limited in 2020, the origin of B.1.1.294 remains uncertain; a Russian source is plausible but cannot be determined with confidence.
B.1.1.440 was a predominantly U.S.-centered pre-VOC lineage: 89/93 available genomes were collected in the United States or U.S. territories, including 86 from Texas. Two sequences from Baikonur (Kazakhstan) cluster phylogenetically with a South Korean genome (hCoV-19/South_Korea/KDCA3546/2020) that GISAID metadata annotate as an export from Kazakhstan to South Korea. The Baikonur sequences appear closely related to two genomes from the Northern Mariana Islands.
Tengizchevroil (TCO), in the Atyrau Region, became Kazakhstan’s largest workplace COVID-19 hotspot in 2020. Between March and May 2020, 1306 laboratory-confirmed infections among workers were recorded [22]. In May 2020, as the Ministry of Health considered suspending operations, TCO demobilized ~20,000 of its ~30,000 staff and instituted pre-rotation quarantine with PCR testing for the remaining ~13,000 employees. Despite these measures, cumulative infections rose to 2661 by 29 July 2020 [22]. Genomic data from the outbreak indicate that all sequenced cases belonged to Pango lineage B.1.1. Phylogenetic reconstruction supports at least three independent introductions of SARS-CoV-2 into the Tengiz oilfields (Figure 4).

3.2. Genetic Diversity of SARS-CoV-2 in Kazakhstan in Alpha Period (February–June 2021)

Sequenced Alpha VOC viruses had a wide geographical distribution (Figure 5). The first Alpha variant viruses were reported in Kazakhstan at the end of March 2021, but retrospective sequencing showed that the earliest Alpha-positive specimen dated from the beginning of February 2021 [23]. Most of the sequenced specimens were collected in March and April 2021 (Figure 6).
The B.1.1.7 lineage harbors eighteen amino acid changes compared with the Wuhan-Hu-1 reference—four of which are shared with its parental B.1.1 lineage—along with three in-frame deletions (ORF1a:del3676/3678, S:del69/70, and S:del144). In addition to the lineage-defining set, nine nonsynonymous mutations occurred in at least 10% of Kazakhstan viruses (Table 2), with several showing higher prevalence in Kazakhstan than globally.
Several mutations in the Kazakhstan Alpha dataset show markedly higher frequencies than in global datasets, suggesting potential regional enrichment or distinct local transmission dynamics. The most striking enrichment is observed for ORF1b I28T (NSP12:I37T), detected in 69/167 (41.32%) genomes from Kazakhstan yet virtually absent in global Alpha datasets (0.01%). Additional recurrent changes include ORF1a F200L (NSP13:F200L) in 28/167 (16.77%) and ORF7a P84L (NS7a:P84L) in 30/167 (17.96%), both of which were rare worldwide (0.00% and 0.28%, respectively).
Several substitutions enriched in Kazakhstan reached approximately 10–11% locally: M F100I (M:F100I) and ORF1a M1312I (NSP3:M494I) (each 19/167; 11.38%), both essentially absent globally (0.00% and 0.08%), and ORF3a Y189S (NS3:Y189S), ORF3a A99S (NS3:A99S), and ORF1a V3595D (NSP6:V26D) (each 18/167; 10.78%), all rare worldwide (0.01%, 0.06%, and 0.00%, respectively). In contrast, ORF8 Y73C (NS8:Y73C) was highly frequent both in Kazakhstan (115/167; 68.86%) and globally (94.93%), representing a characteristic Alpha mutation rather than a region-specific variant. Similarly, ORF8 K68stop (NS8:K68stop) appeared at a lower frequency in Kazakhstan (23/167; 13.77%) compared with global datasets (33.23%).
The placement of Kazakhstan B.1.1.7 sequences across the phylogeny (Figure 7) indicates multiple introductions, as they do not form a monophyletic group. Notably, many appear as singletons or in small clades, suggesting that most introductions did not result in sustained community transmission.

3.3. Genetic Diversity of SARS-CoV-2 in Kazakhstan in Delta Period (July–December 2021)

The first Delta cases were identified in Kazakhstan in July 2021 [24], but reliable genomic data are available only from August 2021. Sequenced Delta VOC viruses had a wide geographical distribution (Figure 8). Most of the sequenced specimens were collected in August, September, and November 2021 (Figure 9). The dominant lineage was AY.122 carrying the combination of NS7a_P45L and NSP2_K81N substitutions typical of the previously described Russian sublineage of AY.122 [19]. Interestingly, we also observed a monophyletic cluster of AY.122 viruses that lacked the NS7a_P45L + NSP2_K81N combination and grouped with a virus of Indian origin (Figure 10). Because sequencing levels in Russia and Central Asia were low at that time, it is difficult to determine where these AY.122 sublineages emerged or the directions of their importation and exportation; nevertheless, the data suggest that AY.122 in Kazakhstan was less homogeneous than AY.122 in Russia. Lineages AY.120, AY.121, AY.123, AY.126, and AY.127 were detected sporadically.
The most characteristic Delta variant mutations showed near-universal prevalence in Kazakhstan sequences: Spike D950N in 606/613 (98.86%) and NSP14 A394V in 520/613 (84.83%), both of which were also highly conserved globally (90.63% and 98.68%, respectively). Additional defining Delta mutations included Spike T478K in 553/613 (90.21%), Spike L452R in 545/613 (88.91%), and Spike G142D in 566/613 (92.33%), all with similarly high global frequencies (97.11%, 96.83%, and 60.53%, respectively).
The NS7a substitutions P45L, T120I, and V82A were detected in 509/613 (83.03%), 487/613 (79.45%), and 507/613 (82.71%) of Kazakhstan sequences, with corresponding global frequencies of 62.89%, 93.43%, and 91.53% (Table 3). These represent characteristic AY.122 lineage markers rather than Kazakhstan-specific variants. In contrast, NSP3 H1841Y showed notable enrichment in Kazakhstan, occurring in 56/613 (9.14%) sequences compared with only 0.30% globally, suggesting potential regional selection or distinct local transmission dynamics within the Kazakhstan SARS-CoV-2 Delta variant population.
Figure 8. Geographical distribution of sequenced SARS-CoV-2 genomes in the Delta period.
Figure 9. Genetic diversity of SARS-CoV-2 in Kazakhstan in Delta period (August–December 2021).
Figure 10. Phylogenetic analysis of Delta VOC SARS-CoV-2 in Kazakhstan.
Table 3. Frequency of additional AY.122 substitutions observed in sequences from Kazakhstan.
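| Gene | Substitution | GISAID Notation | Number in Dataset | Frequency in Dataset, % | Number in Global Dataset | Frequency in Global Dataset, % |
|---|---|---|---|---|---|---|
| NS7a | P45L | NS7a_P45L | 509 | 83.03 | 154,034 | 62.89 |
| NS7a | T120I | NS7a_T120I | 487 | 79.45 | 228,836 | 93.43 |
| NS7a | V82A | NS7a_V82A | 507 | 82.71 | 224,181 | 91.53 |
| NSP14 | A394V | NSP14_A394V | 520 | 84.83 | 241,699 | 98.68 |
| NSP3 | H1841Y | NSP3_H1841Y | 56 | 9.14 | 745 | 0.30 |
| Spike | G142D | Spike_G142D | 566 | 92.33 | 148,264 | 60.53 |
| Spike | D950N | Spike_D950N | 606 | 98.86 | 221,980 | 90.63 |
| Spike | L452R | Spike_L452R | 545 | 88.91 | 237,174 | 96.83 |
| Spike | T478K | Spike_T478K | 553 | 90.21 | 237,850 | 97.11 |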
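One simple way to compare local and global frequencies in Table 3 is a fold-enrichment ratio. The sketch below is illustrative only: the ratio is not a statistic reported in the article, the function name is hypothetical, and the inputs are the local counts (out of 613 Delta genomes) and global percentages from Table 3.

```python
def fold_enrichment(local_count, local_total, global_percent):
    """Ratio of the local frequency (in %) to the global frequency (in %)."""
    local_percent = 100 * local_count / local_total
    if global_percent == 0:
        # A mutation absent from the global dataset has no finite ratio.
        return float("inf")
    return local_percent / global_percent


# (substitution, number in Kazakhstan dataset, global frequency in %) from Table 3.
table3 = [
    ("NS7a_P45L", 509, 62.89),
    ("NSP3_H1841Y", 56, 0.30),
    ("Spike_G142D", 566, 60.53),
    ("Spike_L452R", 545, 96.83),
]

for name, count, global_pct in table3:
    ratio = fold_enrichment(count, 613, global_pct)
    print(f"{name}: {ratio:.1f}-fold relative to the global dataset")
```

On these inputs NSP3_H1841Y stands out at roughly 30-fold, whereas the lineage-defining Spike substitutions stay close to 1. A ratio of this kind describes enrichment only and does not distinguish founder effects from selection.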

3.4. Genetic Diversity of SARS-CoV-2 Circulating in Kazakhstan in the Omicron BA.1/BA.2 Period (January–May 2022)

The first Omicron case was reported in Kazakhstan in January 2022, followed by a large epidemic wave that declined sharply in less than a month. Sequenced BA.1/BA.2 viruses showed a wide geographic distribution but a strongly skewed temporal distribution: more than 50% of sequenced specimens were collected in January 2022, at the start of the Omicron BA.1/BA.2 wave in Kazakhstan (Figure 11 and Figure 12). The dominant lineage was BA.1 and its sublineages. The most prevalent sublineage, BA.1.1, did not form a monophyletic cluster, and many of its branches in the phylogenetic tree clustered with Indian SARS-CoV-2 genomes (Figure 13). Notably, inbound tourism from India to Kazakhstan grew from 1603 visitors in 2021 to 10,090 visitors in 2022 [25], and international air travel to Kazakhstan resumed in September 2021. BA.2 genomes from Kazakhstan also clustered with SARS-CoV-2 viruses collected in India and Nepal.

3.5. Genetic Diversity of SARS-CoV-2 Circulating in Kazakhstan in the Omicron BA.4/BA.5 Period (June–October 2022)

The earliest Omicron BA.4/BA.5 genomes in Kazakhstan were collected in June 2022. Most sequenced specimens, however, were collected in August and September 2022 (Figure 14 and Figure 15). BA.5 and its sublineages predominated. Among BA.5 descendants, the most prevalent were BA.5.2, BA.5.1, BF.5 (BA.5.2.1.5), BF.7 (BA.5.2.1.7), and BE.1 (BA.5.3.1.1). Phylogenetic clustering with the closest global sequences suggested a large local transmission event involving the BF.7 lineage, affecting many regions of Kazakhstan, with the nearest non-Kazakhstan BF.7 genome in the tree originating from Central America (Figure 16). In contrast, BE.1 sequences from Kazakhstan clustered mainly with European genomes (notably from Slovenia). BA.5.2 strains appeared to be of heterogeneous origin, with multiple introductions forming clusters closely related to Russian, Indian, and various European sequences. BA.4 genomes from Kazakhstan clustered with European SARS-CoV-2 sequences.

4. Discussion

Kazakhstan experienced five COVID-19 waves from March 2020 through 2023. We analyzed 3299 quality-filtered genomes (4462 from GISAID; 340 newly generated), stratified by epidemic phase: pre-VOC (March 2020–9 February 2021), Alpha (9 February–June 2021), Delta (July–December 2021), Omicron BA.1/BA.2 (December 2021–May 2022), and Omicron BA.4/BA.5 (June–October 2022). The analysis provides new insights into the spread of SARS-CoV-2 in Central Asia. At the same time, the very uneven sequencing intensity across waves (from 0.60‰ of cases during Delta to 11.57‰ during the Omicron BA.5 period) means that inferences about relative lineage diversity and persistence must be interpreted with caution, as the diversity of SARS-CoV-2 viruses in Kazakhstan is likely undersampled.
The pre-VOC phase in Kazakhstan was characterized by the co-circulation of at least 12 Pango lineages across six major clades (A, B, A.2, B.1, B.1.1, and B.4.1), mirroring the broad early diversity reported previously and consistent with multiple importations from both Asian and European/American sources [25]. The explicit association of some B.1.1 sublineages with the Tengiz oilfield outbreak is epidemiologically plausible: that complex has been documented as a major national hotspot in 2020, with >1000 infections among oilfield workers in early 2020 and strong evidence for intense workplace-associated transmission [22].
The strong concentration of B.1.1.440 in Houston, Texas (home to NASA’s Johnson Space Center), together with its apparent phylogenetic connection to Baikonur (a major spaceport) via the Northern Mariana Islands (which host space-launch facilities), may reflect travel associated with space-launch operations. However, this interpretation remains speculative and cannot be confirmed with the available data.
Earlier work suggested that B.4.1 may have arisen independently in Kazakhstan; the Kazakhstan-restricted monophyletic clade that disappeared after April 2020 provides strong support for this and illustrates how geographically constrained lineages can emerge and then go extinct without contributing to the later global VOC landscape. Similar short-lived local lineages have been described elsewhere and are generally interpreted as the product of founder effects and transient ecological opportunities, which are then overwhelmed by fitter variants or by changes in mobility and control measures.
During the Alpha period, B.1.1.7’s widespread geographic presence but scattered phylogenetic placement—dominated by singletons and small clades rather than one or two large monophyletic clusters—indicates that Kazakhstan experienced numerous Alpha introductions, most of which failed to generate large, sustained community transmission chains. This pattern closely parallels detailed B.1.1.7 analyses from Denmark [26], Mexico [27], and wastewater-based studies in Europe, where repeated introductions were common but only a subset of lineages achieved major expansion. The strong local enrichment of ORF1b I28T (NSP12:I37T) among Kazakhstan Alpha genomes, despite its rarity globally, likely reflects a combination of founder effects and expansion of a particular B.1.1.7 sublineage rather than clear adaptive change.
The observed Delta period in Kazakhstan fits well into the global picture of rapid Delta replacement in mid-2021 but with some striking regional nuances. Delta became globally dominant by mid-2021, accounting for nearly all sequenced infections by late August, and similar timing has been reported across Europe and the Middle East [28].
The dominance of AY.122 carrying the nsp2:K81N and ORF7a:P45L substitutions closely parallels the situation in Russia, where >90% of Delta sequences shared this mutation pair and were assigned to AY.122, a combination that remained rare in most other countries. Klink et al. also noted that Kazakhstan was among the few settings outside Russia with a high frequency of this AY.122 signature, suggesting intense epidemiological connectivity across the region during the Delta wave. The broad geographic distribution of these AY.122 viruses within Kazakhstan is consistent with a scenario in which one or a few successful introductions of the “Russian-like” AY.122 sublineage were amplified by sustained community transmission. At the same time, the identification of a monophyletic AY.122 cluster lacking the canonical nsp2:K81N + ORF7a:P45L combination and grouping phylogenetically with a virus of Indian origin indicates that AY.122 circulation in Kazakhstan was not genetically homogeneous. Rather, it likely reflects at least two epidemiologically distinct AY.122 sources: a Russian-linked K81N+P45L sublineage and a second introduction from an undetermined source. Similar coexistence of multiple AY.* sublineages arising from separate importation events has been documented in England and other well-sampled settings, where some sublineages expand locally while others remain confined to small clusters [29].
The BA.1/BA.2 Omicron wave in Kazakhstan, first detected in January 2022, was short but intense, with a rapid rise and decline in reported cases, consistent with the high intrinsic transmissibility and immune escape properties of BA.1/BA.2 observed globally. The genomic data show that BA.1 and its sublineages, particularly BA.1.1, dominated this wave and were widely distributed across the country, but sequencing was heavily concentrated in January, which likely captures the early expansion phase while under-representing subsequent transmission. The fact that Kazakhstan BA.1.1 genomes do not form a single monophyletic clade and instead intersperse with multiple Indian sequences, together with BA.2 genomes clustering with viruses from India and Nepal, supports a scenario of repeated Omicron introductions from South Asia rather than expansion of a single local founder lineage. The resumption of international air travel in late 2021 and the marked increase in inbound tourism from India provide plausible human mobility pathways for these introductions. However, uneven temporal sampling and limited sequencing depth constrain precise reconstruction of introduction routes and onward spread.
Overall, these results show that Kazakhstan’s epidemic was shaped by repeated international introductions, workplace and community amplification of a subset of those importations, and the rapid turnover of locally restricted lineages by globally successful VOCs. Sustained, more uniform genomic surveillance across regions and epidemic phases would not only improve reconstruction of past transmission dynamics in Kazakhstan but also strengthen the country’s ability to detect and characterize future variants of concern.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/v18010138/s1, Figure S1: Kazakhstan SARS-CoV-2 genomic dataset analysis workflow, Table S1: SARS-CoV-2 genomes from Kazakhstan after quality filtration.

Author Contributions

Conceptualization, A.G. and A.M.; methodology, A.G., A.M. and A.K. (Andrey Komissarov); software, A.U. and A.F.; validation, A.M. and K.K.; formal analysis, D.J. and Y.K.; investigation, A.M., A.G., A.A. (Askar Abdaliyev) and A.U.; resources, A.A. (Aigerim Abdimadiyeva), M.K., T.S. and A.G.; data curation, M.S.; writing—original draft preparation, A.G. and A.M.; writing—review and editing, A.K. (Andrey Komissarov) and A.K. (Aidyn Kydyrmanov); visualization, A.P. and A.F.; supervision, A.K. (Ainagul Kuatbaeva) and A.K. (Aidyn Kydyrmanov); project administration, A.M., A.K. (Aidyn Kydyrmanov), and A.K. (Andrey Komissarov); funding acquisition, A.K. (Aidyn Kydyrmanov). All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Science Committee of the Ministry of Science and Higher Education of the Republic of Kazakhstan (Grant No. AP23485202, “Study the epidemiology and evolution dynamics of human coronaviruses and other respiratory viruses in Kazakhstan”) and by the US CDC in the frame of the project “Strengthening the surveillance system, laboratory diagnostics, the workforce for the early detection of cases and outbreaks of infectious diseases, including COVID-19” (Award No: 5NU2GGH002389-05-00).

Institutional Review Board Statement

The study was reviewed and approved by the Local Bioethics Committee of the RSE on the Right of Economic Management “National Center for Public Health” of the Ministry of Health of the Republic of Kazakhstan (Protocol No. 1 dated 30 April 2021) and the Local Ethics Committee of the Research and Production Center for Microbiology and Virology (Protocol No. 02-09/190 dated 30 October 2023).

Informed Consent Statement

Written informed consent was obtained from all subjects in accordance with the order of the Ministry of Healthcare of the Republic of Kazakhstan dated 20 May 2015 No. 364 “On approval of the form of written voluntary consent of the patient for invasive interventions”.

Data Availability Statement

All sequences were submitted to GISAID EpiCoV (EPI_SET_251114au; DOI: https://doi.org/10.55876/gis8.251114au, accessed on 12 November 2025). Programming scripts used for data analysis and visualization are available at https://github.com/LMV-NIC-St-Petersburg/sars-cov-2-kazakhstan-data (accessed on 12 November 2025).

Acknowledgments

We gratefully acknowledge the personnel of the hospitals of Almaty and other cities for their collaboration and for delivering clinical samples from patients with COVID-19. We acknowledge all data contributors, i.e., the Authors and their originating laboratories responsible for obtaining the specimens and their submitting laboratories for generating the genetic sequence and metadata and sharing via the GISAID Initiative, on which this research is based. We also acknowledge Aklab LLC (Almaty, Kazakhstan) for technical support in data analysis and preparation of the publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Chan, J.F.; Kok, K.H.; Zhu, Z.; Chu, H.; To, K.K.; Yuan, S.; Yuen, K. Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan. Emerg. Microbes Infect. 2020, 9, 221–236.
  2. Huang, C.; Wang, Y.; Li, X.; Ren, L.; Zhao, J.; Hu, Y.; Zhang, L.; Fan, G.; Xu, J.; Gu, X.; et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 2020, 395, 497–506.
  3. Wu, J.; Liu, W.; Gong, P. A Structural Overview of RNA-Dependent RNA Polymerases from the Flaviviridae Family. Int. J. Mol. Sci. 2015, 16, 12943–12957.
  4. Follis, K.E.; York, J.; Nunberg, J.H. Furin cleavage of the SARS coronavirus spike glycoprotein enhances cell–cell fusion but does not affect virion entry. Virology 2006, 350, 358–369.
  5. Boni, M.F.; Lemey, P.; Jiang, X.; Lam, T.T.-Y.; Perry, B.W.; Castoe, T.A.; Rambaut, A.; Robertson, D.L. Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic. Nat. Microbiol. 2020, 5, 1408–1417.
  6. Li, X.; Giorgi, E.E.; Marichannegowda, M.H.; Foley, B.; Xiao, C.; Kong, X.-P.; Chen, Y.; Gnanakaran, S.; Korber, B.; Gao, F. Emergence of SARS-CoV-2 through recombination and strong purifying selection. Sci. Adv. 2020, 6, eabb9153.
  7. Munnink, B.B.O.; Worp, N.; Nieuwenhuijse, D.F.; Sikkema, R.S.; Haagmans, B.; Fouchier, R.A.M.; Koopmans, M. The next phase of SARS-CoV-2 surveillance: Real-time molecular epidemiology. Nat. Med. 2021, 27, 1518–1524.
  8. Tegally, H.; Wilkinson, E.; Lessells, R.J.; Giandhari, J.; Pillay, S.; Msomi, N.; Mlisana, K.; Bhiman, J.N.; von Gottberg, A.; Walaza, S.; et al. Sixteen novel lineages of SARS-CoV-2 in South Africa. Nat. Med. 2021, 27, 440–446.
  9. Meng, B.; Kemp, S.A.; Papa, G.; Datir, R.; Ferreira, I.A.; Marelli, S.; Harvey, W.T.; Lytras, S.; Mohamed, A.; Gallo, G.; et al. Recurrent emergence of SARS-CoV-2 spike deletion H69/V70 and its role in the Alpha variant B.1.1.7. Cell Rep. 2021, 35, 109292.
  10. Zhou, F.; Yu, T.; Du, R.; Fan, G.; Liu, Y.; Liu, Z.; Xiang, J.; Wang, Y.; Song, B.; Gu, X.; et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: A retrospective cohort study. Lancet 2020, 395, 1054–1062.
  11. Harvard Medical School Coronavirus Resource Center. Available online: https://www.health.harvard.edu/ (accessed on 24 December 2025).
  12. Battakova, Z.; Imasheva, B.; Slazhneva, T.; Imashev, M.; Beloussov, V.; Pignatelli, M.; Tursynkhan, A.; Askarov, A.; Abdrakhmanova, S.; Adayeva, A.; et al. Public Health Response Measures for COVID-19 in Kazakhstan. Disaster Med. Public Health Prep. 2023, 17, e524.
  13. Zhussupov, B.; Saliev, T.; Sarybayeva, G.; Altynbekov, K.; Tanabayeva, S.; Altynbekov, S.; Tuleshova, G.; Pavalkis, D.; Fakhradiyev, I. Analysis of COVID-19 pandemics in Kazakhstan. J. Res. Health Sci. 2021, 21, e00512.
  14. Kairov, U.; Amanzhanova, A.; Karabayev, D.; Rakhimova, S.; Aitkulova, A.; Samatkyzy, D.; Kalendar, R.; Kozhamkulov, U.; Molkenov, A.; Gabdulkayum, A.; et al. A high scale SARS-CoV-2 profiling by its whole-genome sequencing using Oxford Nanopore Technology in Kazakhstan. Front. Genet. 2022, 13, 906318.
  15. Andre, M.; Lau, L.S.; Pokharel, M.D.; Ramelow, J.; Owens, F.; Souchak, J.; Akkaoui, J.; Ales, E.; Brown, H.; Shil, R.; et al. From alpha to omicron: How different variants of concern of the SARS-Coronavirus-2 impacted the world. Biology 2023, 12, 1267.
  16. Cui, Q.; Shi, Z.; Yimamaidi, D.; Hu, B.; Zhang, Z.; Saqib, M.; Zohaib, A.; Gulnara, B.; Yersyn, M.; Hu, Z.; et al. Dynamic variations in COVID-19 with the SARS-CoV-2 Omicron variant in Kazakhstan and Pakistan. Infect. Dis. Poverty 2023, 12, 18.
  17. Tukusheva, A. Kazakhstan Reports First Cases of Eris COVID-19 Variant. Kursiv.media. 6 October 2023. Available online: https://kz.kursiv.media/en/2023-10-06/kazakhstan-reports-first-cases-of-eris-covid-19-variant/ (accessed on 4 January 2026).
  18. Aksamentov, I.; Roemer, C.; Hodcroft, E.B.; Neher, R.A. Nextclade: Clade assignment, mutation calling and quality control for viral genomes. J. Open Source Softw. 2021, 6, 3773.
  19. Katoh, K.; Standley, D.M. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Mol. Biol. Evol. 2013, 30, 772–780.
  20. Kozlov, A.M.; Darriba, D.; Flouri, T.; Morel, B.; Stamatakis, A. RAxML-NG: A fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 2019, 35, 4453–4455.
  21. Menardo, F.; Loiseau, C.; Brites, D.; Coscolla, M.; Gygli, S.M.; Rutaihwa, L.K.; Trauner, A.; Beisel, C.; Borrell, S.; Gagneux, S. Treemmer: A tool to reduce large phylogenetic datasets with minimal loss of diversity. BMC Bioinform. 2018, 19, 164.
  22. Nabirova, D.; Taubayeva, R.; Maratova, A.; Henderson, A.; Nassyrova, S.; Kalkanbayeva, M.; Alaverdyan, S.; Smagul, M.; Levy, S.; Yesmagambetova, A.; et al. Factors Associated with an Outbreak of COVID-19 in Oilfield Workers, Kazakhstan, 2020. Int. J. Environ. Res. Public Health 2022, 19, 3291.
  23. Usserbayev, B.; Zakarya, K.; Kutumbetov, L.; Orynbayev, M.; Sultankulova, K.; Abduraimov, Y.; Myrzakhmetova, B.; Zhugunissov, K.; Kerimbayev, A.; Melisbek, A.; et al. Near-complete genome sequence of a SARS-CoV-2 variant B.1.1.7 virus strain isolated in Kazakhstan. Microbiol. Resour. Announc. 2022, 11, e0061922.
  24. Usserbayev, B.; Abduraimov, Y.; Kozhabergenov, N.; Melisbek, A.; Shirinbekov, M.; Smagul, M.; Nusupbayeva, G.; Nakhanov, A.; Burashev, Y. Complete Coding Sequence of a Lineage AY.122 SARS-CoV-2 Virus Strain Detected in Kazakhstan. Microbiol. Resour. Announc. 2023, 12, e0030123.
  25. Yegorov, S.; Goremykina, M.; Ivanova, R.; Good, S.V.; Babenko, D.; Shevtsov, A.; MacDonald, K.S.; Zhunussov, Y.; COVID-19 Genomics Research Group on behalf of the Semey COVID-19 Epidemiology Research Group. Epidemiology, clinical characteristics, and virologic features of COVID-19 patients in Kazakhstan: A nation-wide retrospective cohort study. Lancet Reg. Health Eur. 2021, 4, 100096.
  26. Michaelsen, T.Y.; Bennedbæk, M.; Christiansen, L.E.; Jørgensen, M.S.F.; Møller, C.H.; Sørensen, E.A.; Knutsson, S.; Brandt, J.; Jensen, T.B.N.; Chiche-Lapierre, C.; et al. Introduction and transmission of SARS-CoV-2 lineage B.1.1.7, Alpha variant, in Denmark. Genome Med. 2022, 14, 47.
  27. Zárate, S.; Taboada, B.; Muñoz-Medina, J.E.; Iša, P.; Sanchez-Flores, A.; Boukadida, C.; Herrera-Estrella, A.; Selem Mojica, N.; Rosales-Rivera, M.; Gómez-Gil, B.; et al. The Alpha Variant (B.1.1.7) of SARS-CoV-2 Failed to Become Dominant in Mexico. Microbiol. Spectr. 2022, 10, e0224021.
  28. Chen, Z.; Azman, A.S.; Chen, X.; Zou, J.; Tian, Y.; Sun, R.; Xu, X.; Wu, Y.; Lu, W.; Ge, S.; et al. Global landscape of SARS-CoV-2 genomic surveillance and data sharing. Nat. Genet. 2022, 54, 499–507.
  29. Eales, O.; Page, A.J.; Martins, L.d.O.; Wang, H.; Bodinier, B.; Haw, D.; Jonnerby, J.; Atchison, C.; Ashby, D.; Barclay, W.; et al. SARS-CoV-2 lineage dynamics in England from September to November 2021: High diversity of Delta sub-lineages and increased transmissibility of AY.4.2. BMC Infect. Dis. 2022, 22, 647.
Figure 1. COVID-19 weekly new cases in Kazakhstan, epidemic waves, and number of sequenced genomes (in red).
Figure 4. Phylogenetic analysis of SARS-CoV-2 in Kazakhstan in pre-VOC period.
Figure 5. Geographical distribution of sequenced SARS-CoV-2 genomes in the Alpha period.
Figure 6. Genetic diversity of SARS-CoV-2 in Kazakhstan in Alpha period (February–April 2021).
Figure 7. Phylogenetic analysis of Alpha VOC SARS-CoV-2 in Kazakhstan.
Figure 11. Geographical distribution of sequenced SARS-CoV-2 genomes in the Omicron BA.1/BA.2 period.
Figure 12. Genetic diversity of SARS-CoV-2 in Kazakhstan in Omicron BA.1/BA.2 period (January–May 2022).
Figure 13. Phylogenetic analysis of Omicron BA.1/BA.2 SARS-CoV-2 in Kazakhstan.
Figure 14. Geographical distribution of sequenced SARS-CoV-2 genomes in the Omicron BA.4/BA.5 period.
Figure 15. Genetic diversity of SARS-CoV-2 in Kazakhstan in Omicron BA.4/BA.5 period (June–October 2022).
Figure 16. Phylogenetic analysis of Omicron BA.4/BA.5 SARS-CoV-2 in Kazakhstan.
Table 2. Frequency of additional B.1.1.7 substitutions observed in sequences from Kazakhstan.
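| Gene | Substitution | GISAID Notation | Number in Dataset | Frequency in Dataset (%) | Number in Global Dataset | Frequency in Global Dataset (%) |
|---|---|---|---|---|---|---|
| M | F100I | M_F100I | 19 | 11.38 | 23 | 0.00 |
| NS3 | Y189S | NS3_Y189S | 18 | 10.78 | 74 | 0.01 |
| NS3 | A99S | NS3_A99S | 18 | 10.78 | 755 | 0.06 |
| NS7a | P84L | NS7a_P84L | 30 | 17.96 | 3372 | 0.28 |
| NS8 | Y73C | NS8_Y73C | 115 | 68.86 | 1,137,603 | 94.93 |
| NS8 | K68stop | NS8_K68 | 23 | 13.77 | 398,194 | 33.23 |
| NSP12 | I37T | NSP12_I37T | 69 | 41.32 | 84 | 0.01 |
| NSP12 | P227L | NSP12_P227L | 1 | 0.60 | 173,259 | 14.46 |
| NSP13 | F200L | NSP13_F200L | 28 | 16.77 | 48 | 0.00 |
| NSP3 | M494I | NSP3_M494I | 19 | 11.38 | 945 | 0.08 |
| NSP6 | V26D | NSP6_V26D | 18 | 10.78 | 19 | 0.00 |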

Share and Cite

MDPI and ACS Style

Gabiden, A.; Komissarov, A.; Mutaliyeva, A.; Usserbayev, A.; Karamendin, K.; Perederiy, A.; Fadeev, A.; Kuatbaeva, A.; Jussupova, D.; Abdaliyev, A.; et al. Genetic Diversity of SARS-CoV-2 in Kazakhstan from 2020 to 2022. Viruses 2026, 18, 138. https://doi.org/10.3390/v18010138
