1. Introduction
Since the emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in late 2019, vaccines have played a major role in limiting the spread and severity of coronavirus disease 2019 (COVID-19). The first genome sequence became publicly available in January 2020, which accelerated the development of vaccines worldwide [1]. Soon after the virus emerged, several vaccine platforms were developed using different technological approaches. These vaccines have played a decisive role in controlling the spread of the virus by reducing the risk of infection, severe illness, hospitalization and death [2].
Many stages of vaccine development and testing were achieved in a very short period of time, using technologies such as the growth of viruses in cell culture, recombinant DNA, and genomics [3]. Additionally, many in silico approaches, including bioinformatics, immunoinformatics, structural biology and molecular dynamics simulations, were freely available to support vaccine design within this limited time [4].
However, the emergence of SARS-CoV-2 variants has complicated the vaccine strategy against the virus. Each new variant acquires new mutations, and may lose others, which can make it more transmissible and better able to evade the immune response [5]. Numerous studies have demonstrated that mutations arising in SARS-CoV-2 variants significantly affect viral infectivity, disease severity, transmissibility, and interactions with host immunity, including vaccine-induced immune responses [3,6]. Notably, several mutations are located within the spike protein, particularly in the receptor binding domain (RBD) and the N-terminal domain (NTD), where they can alter receptor engagement and antibody recognition, thereby facilitating immune escape.
Emerging variants have consequently acquired an enhanced ability to bind the human angiotensin-converting enzyme 2 (ACE2) receptor, resulting in increased transmissibility compared to the ancestral SARS-CoV-2 strain [3].
Continuous genomic and genetic surveillance of the global viral population is therefore essential for the early detection of emerging mutations and for assessing their potential impact on viral transmission and immune evasion [7]. In this context, bioinformatic tools such as Nextstrain play a pivotal role by enabling the identification and tracking of spike protein mutations through phylogenetic analyses. Nextstrain integrates a comprehensive viral genome database with a bioinformatics pipeline for phylodynamic analysis and an interactive visualization platform, allowing real-time exploration of viral evolution [8]. Complementarily, molecular docking approaches provide functional insights by acting as energy-based filters to predict the initial binding affinity between the spike protein and its targets, including neutralizing antibodies and the ACE2 receptor. This facilitates the characterization of how specific mutations modulate protein–protein interactions and contribute to immune evasion [9]. In addition, advances in machine learning and computational biology offer powerful frameworks for understanding viral evolution, identifying newly emerging SARS-CoV-2 variants, predicting the impact of mutations and supporting vaccine development. For example, experimental studies have reported machine learning models capable of predicting future SARS-CoV-2 mutations in real time based on mutation frequency patterns [10].
Beyond variant detection, bioinformatics tools also play a crucial role in promoting global data sharing and integrative analyses of genomic, transcriptomic, proteomic, and structural data, thereby guiding rational vaccine and therapeutic development strategies [4].
This bibliometric review, based on data from the Scopus database, investigates the scientific literature on SARS-CoV-2 immune and vaccine escape studied with bioinformatics tools. Using quantitative bibliometric indicators and network visualization, the study identifies major research keywords, key authors and leading institutions, and maps the evolution of thematic areas in this topic. Specifically, it aims to map the development and application of bioinformatics approaches to understanding viral immune evasion, identify knowledge gaps, and draw attention to emerging methodologies.
To the best of our knowledge, this bibliometric study is among the first to specifically integrate bioinformatics and in silico tools with immune evasion-focused SARS-CoV-2 research, covering the full pandemic period (2020–2025). In contrast to previous bibliometric analyses, which have largely focused on SARS-CoV-2 vaccines or general virological aspects of the virus, or were limited to 100 publications, the present study provides a comprehensive and longitudinal overview of the scientific landscape. By covering the full duration of the pandemic, this analysis captures the temporal evolution of research efforts in response to emerging variants and highlights the increasing role of bioinformatics tools in understanding immune evasion mechanisms.
Focusing on bioinformatics and in silico approaches is particularly relevant because the rapid accumulation of SARS-CoV-2 genomic, proteomic, and structural data requires systematic computational analysis. Bioinformatics tools enable researchers to monitor emerging mutations, assess their impact on immune escape, and anticipate potential challenges to vaccines and therapeutics. Integrating these approaches highlights both the scientific output and the evolving computational strategies essential for a timely pandemic response. In this study, we also aim to identify and characterize the ten most highly cited publications on our topic for a descriptive analysis.
2. Materials and Methods
2.1. Data Collection
The literature was retrieved on 11 November 2025 from the Scopus database, selected for its broad coverage of virology, immunology, and bioinformatics. The search strategy targeted studies on immune escape investigated with bioinformatics or computational tools.
Scopus was chosen for its extensive indexing of journals and reviews across multiple disciplines, ensuring comprehensive coverage of the evolving SARS-CoV-2 literature. We acknowledge that database selection may influence coverage, as some relevant publications indexed elsewhere may not have been included.
Table 1 presents the search query used in our study. The Scopus database search strategy ensured a comprehensive and reliable collection of studies investigating the molecular and computational aspects of SARS-CoV-2 immune escape.
The search query was adapted to fit the database syntax. The search strategy was designed to balance sensitivity and specificity. Broad terms related to SARS-CoV-2, immune escape, and bioinformatics were included to maximize sensitivity and ensure comprehensive coverage of relevant studies. At the same time, Boolean operators were used to combine terms, and restrictions to titles, abstracts, and keywords helped maintain specificity by reducing retrieval of irrelevant articles in the selected field.
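The way Boolean operators and field restrictions combine can be illustrated with a minimal sketch. It assumes the Scopus `TITLE-ABS-KEY` field code and uses hypothetical term lists chosen only for illustration; the query actually used in this study is the one reported in Table 1.

```python
def or_group(terms):
    """Join the synonyms of one concept with OR, quoting each phrase."""
    return " OR ".join(f'"{term}"' for term in terms)


def build_query(concept_groups):
    """Combine one TITLE-ABS-KEY clause per concept with AND."""
    clauses = [f"TITLE-ABS-KEY({or_group(group)})" for group in concept_groups]
    return " AND ".join(clauses)


# Hypothetical term lists (assumption): not the terms of Table 1.
concept_groups = [
    ["SARS-CoV-2", "COVID-19"],
    ["immune escape", "immune evasion"],
    ["bioinformatics", "in silico"],
]

query = build_query(concept_groups)
print(query)
```

Within a concept, OR widens the net (sensitivity); across concepts, AND and the title, abstract and keyword restriction narrow it (specificity).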
To further ensure the reliability and transparency of both the bibliometric and the descriptive analysis, only articles with full-text access were included, allowing accurate data extraction and consistent evaluation across all studies.
2.2. Inclusion and Exclusion Criteria
The following criteria were used to determine articles included in this bibliometric analysis:
Peer-reviewed original research articles and reviews only.
Published in English.
Available in full text (open access only).
Published between January 2020 and November 2025.
Related to the specified topic (SARS-CoV-2 immune escape studied using bioinformatics).
Exclusion criteria:
The following publications were excluded:
Other publication types, such as books, editorials, and conference abstracts.
Articles in languages other than English.
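The machine-checkable part of these criteria can be sketched as a simple filter. This is an illustration only: the `Record` fields and their values are hypothetical and do not correspond to Scopus export column names, and topical relevance (the last inclusion criterion) is not encoded because it requires manual screening.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical record layout (assumption), not a Scopus export schema."""
    doc_type: str       # e.g. "article", "review", "editorial"
    language: str
    year: int
    month: int
    open_access: bool


ALLOWED_TYPES = {"article", "review"}


def is_eligible(record: Record) -> bool:
    """Apply the document type, language, access and date criteria."""
    in_window = (2020, 1) <= (record.year, record.month) <= (2025, 11)
    return (
        record.doc_type in ALLOWED_TYPES
        and record.language == "English"
        and record.open_access
        and in_window
    )


# Toy records for illustration only.
records = [
    Record("article", "English", 2022, 5, True),
    Record("editorial", "English", 2022, 5, True),
    Record("review", "English", 2021, 3, False),
]
eligible = [r for r in records if is_eligible(r)]
print(len(eligible))
```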
This bibliometric analysis applied the predefined inclusion and exclusion criteria above to an initial dataset of 530 publications retrieved from the database. Eligible publications were limited to peer-reviewed original research articles and review papers published between January 2020 and November 2025. Publications such as books, editorials, and conference abstracts were excluded.
Following this screening process, a final dataset of 416 articles (original research articles and reviews) was retained for bibliometric analysis. From this dataset, the ten most cited original research articles were then identified by citation count for descriptive analysis; review articles were excluded at this stage. The full texts of the selected articles were reviewed to confirm eligibility and extract the required data (Figure 1).
The study period was limited to this timeframe to include all data from the period following the virus’s emergence in late 2019. Publications from 2025 were included in the analysis, despite the year not yet being complete, in order to include the most recent studies.
Figure 1 provides a flowchart generated using R version 4.5.1, visually illustrating the data extraction process from the Scopus database.
A total of 530 records were initially retrieved from the Scopus database, all published between January 2020 and November 2025. After removing records that were neither original research articles nor reviews (n = 29), 501 articles were screened for eligibility against the predefined inclusion criteria, leaving 416 studies with full-text availability. All eligible studies were included in the bibliometric synthesis, and the ten most cited original articles were selected for descriptive analysis, providing an in-depth evaluation of influential research published during the study period.
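The counts in this flow can be checked with a few lines of arithmetic. The sketch below uses only the figures reported above; the number excluded at the eligibility stage and the retention rate are derived values that are not stated in the text.

```python
def flow_counts(retrieved, removed_by_type, included):
    """Derive the intermediate counts of a screening flow."""
    screened = retrieved - removed_by_type
    if included > screened:
        raise ValueError("included records cannot exceed screened records")
    return {
        "screened": screened,
        "excluded_at_eligibility": screened - included,  # derived, not reported
        "retention_rate": included / retrieved,          # derived, not reported
    }


counts = flow_counts(retrieved=530, removed_by_type=29, included=416)
print(counts["screened"], counts["excluded_at_eligibility"])
print(f"{counts['retention_rate']:.1%}")
```

Under these figures, 501 records reach screening, in agreement with the number reported in the text.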
2.3. Bibliometric Analysis
Bibliometric approaches provide a quantitative framework for evaluating scientific performance and identifying research trends. These methods allow scientific output to be classified and organized according to parameters such as authors, institutions, countries and research themes. Analyses of authorship and co-authorship, together with keyword co-occurrence analysis, help to identify collaborative networks, dominant research themes and emerging scientific areas of interest within the field.
The articles (n = 416) retrieved from the Scopus database were analyzed using the Biblioshiny web interface of the R package “Bibliometrix” (R version 4.5.1, RStudio environment). The Biblioshiny platform was used to perform graphical visualizations and statistical analysis.
The bibliometric analysis was performed using R version 4.5.1 and the “Bibliometrix” package to generate descriptive statistics, analyze publication trends and explore thematic evolution. In addition to the Biblioshiny platform [11], VOSviewer (Leiden University, Leiden, The Netherlands, version 1.6.20.0) [12] was employed to construct graphical visualizations of citation, authorship, country, institution, and keyword co-occurrence networks [13].
To ensure data accuracy and reliability, the data extraction and analysis were performed independently three times. This process minimized potential errors and biases, thereby enhancing the quality and credibility of the study’s findings.
Furthermore, a PRISMA flow diagram illustrating the selection and screening process of the retrieved studies was generated in R (RStudio).
This combined methodological approach provided an overview of the scientific literature surrounding SARS-CoV-2 immune escape using bioinformatic tools. It highlighted the major contributions, global collaborations and thematic clusters that define the evolution of research in this field within the period 2020–2025.
The selected bibliometric indicators, including citation counts and keyword co-occurrence networks, were chosen because they are widely used to assess scientific impact, collaboration patterns, and research trends. However, these indicators may have some limitations. For example, citation-based metrics tend to favor older publications and articles published in high-impact journals, which may underrepresent recent but potentially influential studies or contributions from low- and middle-income countries. In addition, co-occurrence network analyses may be influenced by variations in keyword usage across studies. Other bibliometric indicators such as authorship, institutional, and country analyses also have limitations. These indicators may be influenced by database coverage, as not all journals and regions are equally represented; by differences in publication practices across disciplines and countries; and by inconsistencies in the standardization of author and institutional names.
2.4. Descriptive Analysis
To identify the most cited articles, publication citation counts were analyzed, allowing the selection of the top ten cited articles in the selected topic. These analyses were generated using the Biblioshiny platform.
A descriptive analysis was conducted on the ten most cited original articles that were identified and selected. For each article, the following bibliographic information was extracted: title, authors, affiliations, year of publication and keywords, as well as other information (methodology and main findings).
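The selection step can be sketched as a filter followed by a sort. The record layout is hypothetical; the three original articles carry the citation counts reported in Table 2, whereas the review entry is invented purely to show that reviews are dropped before ranking. This is an illustration, not the Biblioshiny procedure itself.

```python
def top_cited_originals(records, n=10):
    """Keep original articles only, then rank by citation count."""
    originals = [r for r in records if r["doc_type"] == "article"]
    return sorted(originals, key=lambda r: r["citations"], reverse=True)[:n]


records = [
    {"first_author": "Tarke A", "doc_type": "article", "citations": 499},
    {"first_author": "Hoffmann M", "doc_type": "article", "citations": 700},
    {"first_author": "Cele S", "doc_type": "article", "citations": 429},
    # Hypothetical review (assumption), included only to show the exclusion.
    {"first_author": "Example R", "doc_type": "review", "citations": 900},
]

ranking = top_cited_originals(records, n=10)
print([r["first_author"] for r in ranking])
```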
3. Results
3.1. An Overview and General Characteristics of Publications
The scientific articles published between 2020 and 2025 were examined to assess the relevant research output. All the retrieved publications were original research articles and reviews highlighting the crucial role of bioinformatics tools in investigating the immune evasion mechanisms of SARS-CoV-2. These studies demonstrated the contribution of computational analyses in understanding viral mutations and their effects on immune recognition, as well as their role in supporting the development of effective prevention and treatment strategies.
Overall, a total of 416 publications were retrieved from the Scopus database. Between 2020 and 2025, the number of scientific articles increased rapidly from 14 articles in 2020 to a peak of 119 in 2022, before gradually declining to 63 in 2025. Despite this decline, output remained much higher than in the early years, reflecting sustained research activity over time (Figure 2).
3.2. Authors and Affiliations
Figure 3 illustrates the most cited countries in publications within the selected field. The world map is color-coded in varying shades of blue to represent scientific productivity levels, with darker shades indicating higher citation counts. This visualization shows the geographic distribution of research impact and highlights which countries contribute most significantly to the field.
Analysis of the geographical distribution of publications showed that the United States of America (USA) was the dominant contributor throughout the study period (2020–2025), with a continuous increase in research output: its publication count rose from 20 articles in 2020 to 520 articles by 2025. China was the second most scientifically productive country, with its output rising from 5 articles in 2020 to 274 articles by 2025.
The top five most productive countries in this field were the USA, China, India, Germany, and the United Kingdom. Notably, the USA emerged as the leading producer of scientific articles in 2021. Its growth far surpassed that of the other four countries in the top five, whose outputs did not exceed 274 publications during the same period (Figure 4).
Moreover, the co-authorship network map, generated using VOSviewer, illustrates the interactions and collaborations among the most highly cited countries contributing to the field. The USA has strong collaborative links with several countries, such as the United Kingdom, Germany, Canada and China, which reflects its significant global research influence. China also has extensive connections, particularly with India, Saudi Arabia, Pakistan, and Singapore. European countries such as the United Kingdom, Germany, Italy, and Spain form a dense cluster, emphasizing their interconnected research activities (Figure 5). These collaborative networks have probably increased the productivity of research by combining expertise and data from different countries, focusing studies on the globally circulating SARS-CoV-2 variants with vaccine escape potential and enabling a quicker understanding of the virus’s full characteristics.
An analysis of citation data confirmed that the USA was also the most cited country, with 3684 citations in total and an average of 40.5 citations per article. This was followed by China, with 1457 citations in total and an average of 20 citations per article, and Germany, with 117 citations in total and an average of 63.6 citations per article. Other countries with a high citation impact included India, the United Kingdom, South Africa, Switzerland, Bangladesh, Korea and Spain. Notably, Morocco also ranked within the top 50 most cited countries worldwide, highlighting its contribution to the field (Figure 6).
Moreover, according to the corresponding authors’ countries, the top six scientific contributors were the USA (91 published articles), China (73), India (54), Germany (16), Brazil (14), and Italy (14).
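The relationship between total and average citations can be illustrated for the two leading countries. The sketch assumes that the average is computed over the corresponding-author article counts given above (91 for the USA and 73 for China); the text does not state this explicitly.

```python
# (total citations, corresponding-author articles) as reported in the text.
# Using the corresponding-author counts as denominators is an assumption.
reported = {
    "USA": (3684, 91),
    "China": (1457, 73),
}

mean_citations = {
    country: round(total / n_articles, 1)
    for country, (total, n_articles) in reported.items()
}
print(mean_citations)
```

Under this assumption, the computed means match the reported averages of 40.5 and 20 citations per article.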
3.3. Contributing Authors and Affiliations
Authors who have made the greatest contributions are those with the highest number of publications or the most cited works. This analysis, based on the correlation between publications produced in the selected field and their citations, highlights key contributors. In total, 3967 authors contributed to the field between 2020 and 2025. Verkhivker G.M. leads with 13 publications, followed by Alshahrani M. with 11 articles. These authors appear in highly cited publications, reflecting the influence and significance of their work within the scientific community. Overall, the data show that a group of researchers led by Verkhivker G.M. consistently produced the highest number of publications, demonstrating both their productivity and the importance of their findings (Figure 7).
Following the analysis of the most frequently cited authors, the focus shifts to the most frequently cited affiliations, highlighting the institutions with the greatest research impact.
The data showed that the institutions with the highest number of citations are mainly world-renowned universities and specialized research institutes.
Chapman University tops the list with 26 citations, followed closely by major USA institutions such as the University of California (14) and Washington University School of Medicine in St. Louis (13). This highlights the important contribution of American research centers to scientific research.
There is also a considerable number of institutions from outside the United States, reflecting global research participation: the University of KwaZulu-Natal (South Africa, 15 citations), the Robert Koch Institute (Germany, 12 citations), the University of Chinese Academy of Sciences (China, 11 citations), the Universidade de São Paulo (Brazil, 10 citations) and the All India Institute of Medical Sciences (India, 10 citations).
Moreover, the analysis of the affiliation co-occurrence network reveals strong collaborations between several research institutions. During the 2022–2023 period, the Peng Cheng Laboratory (Shenzhen) and the Shanghai–Islamabad–Belgrade Joint Laboratories emerged as central actors, highlighting their roles in the selected research. These institutions have strong collaborations with the University of Chinese Academy of Sciences and the Academy of Scientific and Innovative Research (India), suggesting active partnerships between international research centers and interdisciplinary laboratories. The color gradient from blue to yellow illustrates the temporal evolution of collaborations from 2022 to mid-2023, showing the institutional partnerships over this period (Figure 8).
3.4. Sources and Most Cited Documents
The analysis of the most relevant sources in the selected field reveals that Viruses (47 articles) and Frontiers in Immunology (23 articles) are the leading journals in terms of publication volume. Other prominent journals include the International Journal of Molecular Sciences (19 articles), followed by Vaccines (9 articles), Frontiers in Microbiology (8 articles), and Cell Reports, mBio, and Microbiology Spectrum (6 articles each). The dominance of Viruses and Frontiers in Immunology reflects their thematic specialization in virology and immunology, respectively, and their editorial focus on SARS-CoV-2 research during the pandemic.
An analysis of the top five journals’ production during 2020–2025 indicates a continuous increase in publications, culminating in peak publication output in the selected field in 2025 (Figure 9).
In terms of citation impact, the most cited article was that of Hoffmann M. (2022, Cell) [15], with 700 citations, reflecting its global influence in the field. Cell, the journal in which it appeared, has an impact factor of 42.5 and is ranked in the top quartile (Q1) of journals. This is followed by Hacisuleyman E. (2021, New England Journal of Medicine) [16] with 541 citations, and Tarke A. (2021, Cell Reports Medicine) [17] with 499 citations. Other highly cited studies include those by Cele S. (2021, Nature) [18], Starr T. N. (2021, Nature) [21] and Cui Z. (2022, Cell) [14], each surpassing 300 citations. Overall, the majority of the leading papers were published in high-impact journals such as Cell, Nature, Science and The New England Journal of Medicine, highlighting their contribution to advancing research in this field (Figure 10).
Three other journals have each published at least three of the most frequently cited articles: Viruses, iScience, and Frontiers in Immunology. This reflects their important role in disseminating influential research in this field. The impact factors of these three journals range from 3.5 to 5.9. The most cited scientific articles were published between 2021 and 2022. The journal with the highest number of top cited articles was Viruses, which published 47 of the most cited documents.
3.5. Keyword and Co-Occurrence Network
According to the Biblioshiny analysis, the most prominent keyword is “SARS-CoV-2”, which appears 356 times, confirming that studies about this virus represent a central research focus. Following closely behind are the terms “human” (345 occurrences) and “COVID-19” (321 occurrences). Other frequently used keywords include “genetics” (212), “Immune evasion” (182), “Spike protein” (170), “mutation” (137), and “Molecular dynamic” (122).
Figure 11 illustrates the co-occurrence network map that shows the different research themes and their connections within the research topics. The different clusters represent the close relationships between the keywords. The main clusters focus on immune responses and viral immune evasion, as represented by terms such as “immune evasion” and “antibodies”, as well as on viral structure and mutations, as represented by terms such as “spike protein” and “mutations”.
Keywords such as “molecular dynamics” and “ACE2 host receptor” reflect the use of modeling and bioinformatics in understanding SARS-CoV-2 immune escape, highlighting the importance of computational and structural biology approaches. Overall, these clusters encapsulate the main topics in the field, and the network reveals dynamic, interdisciplinary research centered on the molecular characterization of SARS-CoV-2 immune responses using computational analyses.
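The principle behind a keyword co-occurrence network can be shown with a minimal sketch: two keywords are linked each time they appear in the same record, and the link weight is the number of such records. The three keyword lists are toy data, and the code does not reproduce the VOSviewer or Biblioshiny implementation.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count, for every keyword pair, the records in which both appear."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Lower-case and de-duplicate so each pair counts once per record.
        unique = sorted({k.strip().lower() for k in keywords})
        pairs.update(combinations(unique, 2))
    return pairs


# Toy keyword lists (assumption), not taken from the dataset.
docs = [
    ["SARS-CoV-2", "Spike protein", "Immune evasion"],
    ["SARS-CoV-2", "Immune evasion", "Molecular dynamics"],
    ["SARS-CoV-2", "Spike protein", "Mutation"],
]

links = cooccurrence(docs)
for pair, weight in links.most_common(3):
    print(pair, weight)
```

Pairs with higher weights would appear as thicker edges in a network map, and groups of densely linked keywords form the clusters described above.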
3.6. Trend of Topic Research
Figure 12 provides insight into the evolution of research trends by tracking keywords and examining their temporal dynamics, which highlights the growth of popular research topics over time. The trending topics of the selected field between 2020 and 2025 are virology, evolution, and immunology. The trending topics with the highest term frequency between 2022 and 2023 are related to the virus (SARS-CoV-2, COVID-19) and human. The newly trending topics in 2025 are genotype and SARS-CoV-2 (lineage XBB.1.5).
During the early stages of the pandemic (2019–2021), the focus was on virological aspects. From 2021, the focus expanded to include immune responses, with studies on antibodies and immunology becoming important. From 2022, new viral lineages and variants emerged as dominant topics. By 2023–2024, research trends had become centered on genomic surveillance and variant characterization.
3.7. Data Summary
For the descriptive analysis, the top ten cited original articles (Table 2) were selected. This allowed information such as the methodology and the main findings of these articles to be extracted (Table 3).
We selected these articles to highlight the most influential studies. However, citation-based ranking tends to favor older publications due to longer citation accumulation periods, a well-recognized limitation of bibliometric analysis.
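One common way to illustrate this limitation is to divide each citation count by the number of years since publication. The sketch below applies this to the figures in Table 2. It is not part of the analysis performed in this study, and counting the publication year and the retrieval year (2025) inclusively is an assumption made here.

```python
REFERENCE_YEAR = 2025  # retrieval year; inclusive counting is an assumption

# (first author, publication year, citations) as listed in Table 2.
top_ten = [
    ("Hoffmann M", 2022, 700), ("Tarke A", 2021, 499), ("Cele S", 2021, 429),
    ("Starr TN", 2021, 355), ("Cui Z", 2022, 320), ("McCallum M", 2022, 308),
    ("Ren S", 2022, 283), ("Gobeil SMC", 2021, 271), ("Shah M", 2022, 187),
    ("Zimmerman MI", 2021, 177),
]


def citations_per_year(year, citations, reference_year=REFERENCE_YEAR):
    """Citations divided by the number of years in print (inclusive)."""
    return citations / (reference_year - year + 1)


per_year = {author: citations_per_year(y, c) for author, y, c in top_ten}
ranked = sorted(per_year.items(), key=lambda item: item[1], reverse=True)
for author, rate in ranked:
    print(f"{author}: {rate:.1f}")
```

With this normalization, some 2022 articles move ahead of 2021 articles that have higher raw counts, which is the effect described above.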
Half of the studies with the highest number of citations in the dataset focused on the Omicron variant (50%). Structural analyses using cryo-electron microscopy (Cryo-EM) and X-ray crystallography [14,19,20], combined with molecular dynamics simulations, revealed that the Omicron spike primarily exhibits an ‘open’ conformation. Comparative analyses across variants, including Alpha, Beta, Gamma and mink-associated lineages [15], revealed consistent patterns of reduced antibody binding in Omicron and other SARS-CoV-2 variants, alongside increased ACE2 binding affinity across multiple variants.
In some of these studies [14,15,18,21], neutralization assays, including live-virus and pseudovirus techniques, quantified antibody escape for specific mutations, while surface plasmon resonance (SPR) binding assays measured changes in spike and ACE2 interactions. Deep mutational scanning identified RBD and NTD mutations associated with altered antibody recognition. Sequencing-based mutation mapping linked genomic changes to these functional impacts [21].
Advanced computational tools were applied across the studies. Markov state modeling [20] identified conformational transitions in the spike protein, and adaptive sampling simulations [22] detected over 50 cryptic pockets on the spike, as well as additional pockets on Mpro.
Another study [17] analyzing T-cell epitopes revealed a high degree of conservation across variants, with 93% of CD4 and 97% of CD8 epitopes remaining unchanged despite the presence of many spike mutations. Vaccination studies indicated that neutralizing activity against the Omicron variant decreased after two vaccine doses or post-infection, but increased following booster doses.
Finally, all studies analyzing the immune escape of SARS-CoV-2 using neutralization assays or ELISA-based techniques, in combination with in silico tools, used plasma or serum samples. Meanwhile, studies that combined bioinformatics and genome sequencing used nasopharyngeal and oropharyngeal swab samples.
Figure 13 illustrates all the bioinformatic tools used in the top ten cited original articles that studied SARS-CoV-2 immune evasion.
Table 2.
Top ten cited original articles in the study of SARS-CoV-2 immune escape using bioinformatics tools (2020–2025).
| First Author | DOI | Year of Publication | Citations | Affiliation | Journal | Country |
|---|---|---|---|---|---|---|
| Hoffmann M [15] | 10.1016/j.cell.2021.12.032 | 2022 | 700 | German Primate Center | Cell | Germany |
| Tarke A [17] | 10.1016/j.xcrm.2021.100355 | 2021 | 499 | La Jolla Institute for Immunology | Cell Reports Medicine | USA |
| Cele S [18] | 10.1038/s41586-021-03471-w | 2021 | 429 | Africa Health Research Institute | Nature | South Africa |
| Starr TN [21] | 10.1038/s41586-021-03807-6 | 2021 | 355 | Fred Hutchinson Cancer Research Center | Nature | USA |
| Cui Z [14] | 10.1016/j.cell.2022.01.019 | 2022 | 320 | Chinese Academy of Sciences | Cell | China |
| McCallum M [19] | 10.1126/science.abn8652 | 2022 | 308 | University of Washington | Science | USA |
| Ren S [23] | 10.12998/wjcc.v10.i1.1 | 2022 | 283 | China Medical University | World Journal of Clinical Cases | China |
| Gobeil SMC [20] | 10.1126/science.abi6226 | 2021 | 271 | Duke Human Vaccine Institute | Science | USA |
| Shah M [24] | 10.3389/fimmu.2021.830527 | 2022 | 187 | Ajou University School of Medicine | Frontiers in Immunology | South Korea |
| Zimmerman MI [22] | 10.1038/s41557-021-00707-0 | 2021 | 177 | Washington University School of Medicine | Nature Chemistry | USA |
Table 3.
Analysis of the top ten cited original articles in the field (2020–2025).
| Article | Methodology | Cohort | Main Findings |
|---|---|---|---|
| HOFFMANN M, 2022, CELL [15] | Neutralization assay (cell culture) Genomic analysis | Plasma/sera Convalescent Vaccinees | Omicron uses human and animal ACE2 for host cell entry Omicron is resistant against neutralization by several therapeutic antibodies Omicron efficiently evades antibodies from infected or 2 doses BNT-vaccinated patients Omicron moderately evades antibodies induced by 3 doses BNT or heterologous vaccination |
| TARKE A, 2021, CELL REP MED [17] | AIM and FluoroSPOT assays Sequence analysis (Mutations mapping) Peptide pool generation | Serum Convalescent Vaccinees Controls | T-cells of exposed donors or vaccinees effectively recognize SARS-CoV-2 variants Effective recognition in AIM and FluoroSPOT assays for spike and other proteins 93% and 97% of CD4 and CD8 epitopes are 100% conserved across variants |
| CELE S, 2021, NATURE [18] | Live-virus neutralization assay Sequencing analysis (Genome assembly) Phylogenetic analysis | Nasopharyngeal and oropharyngeal swab samples Plasma samples | Convalescent plasma (from infection with a virus containing only E484K) strongly neutralized both variants Vaccines based on variants of concern (VOCs) like 501Y.V2 may still be effective against other circulating SARS-CoV-2 lineages |
| STARR TN, 2021, NATURE [21] | Neutralization assay SPR binding assays Deep mutational scanning Multidimensional scaling projection of antibody epitopes Protein engineering Molecular dynamics simulations | Plasma/sera Convalescent Vaccinees | Identified panel of anti-SARS-CoV-2 antibodies Identified mutations that cause escape from neutralizing antibodies Multidimensional scaling of antibody binding-escape |
| CUI Z, 2022, CELL [14] | Cryo-EM structures of the Omicron spike Neutralization assay | N/A | Omicron spike stably maintains an active conformation for receptor recognition Improved stability of Omicron enhances attachment but compromises viral fusion Mutations perturb the conformation of antigenic sites recognized by most antibodies |
| MCCALLUM M, 2022, SCI [19] | Cryo-electron microscopy and X-ray crystallography Computational structural modeling/molecular dynamics simulations | N/A | Omicron retains high-affinity binding to ACE2 while greatly reducing binding to therapeutic antibodies Omicron carries mutations in both the RBD (13) and NTD (10), which together result in a marked reduction in plasma-neutralizing activity among convalescent or vaccinated individuals |
| REN S, 2022, WORLD J CLIN CASES [23] | Clinical observation study Computational predictive modeling | N/A | Omicron shows high infectivity associated with less severe symptoms than earlier variants Strong ability to evade existing immunity, including vaccine-induced protection Primary vaccination offers reduced effectiveness against Omicron. Booster doses improve immune protection and help restore neutralizing activity |
| GOBEIL SMC, 2021, SCI [20] | Cryo-EM structural determination Binding assays Computational analyses and molecular dynamics Markov state modeling | N/A | B.1.1.7 variant reduced binding to antibodies P.1 and B.1.351 variants decreased binding to both NTD- and receptor-binding domain (RBD)-directed antibodies All variants exhibited increased binding affinity to ACE2 The mink-related variant showed spike protein instability |
| SHAH M, 2022, FRONT IMMUNOL [24] | Structural modeling, Binding affinity analysis, Phylogenetics Visualization of spike protein interactions with ACE2 and therapeutic antibodies | N/A | Omicron RBD binds ACE2 ~2.5 times more strongly than the prototype SARS-CoV-2 Omicron harbors the E484A substitution instead of the E484K substitution that contributed to neutralization escape in the Beta, Gamma, and Mu variants |
| ZIMMERMAN MI, 2021, NAT CHEM [22] | Molecular dynamics simulations, structural biology Adaptive sampling simulations | N/A | Spike protein homologs influence the balance between open and closed conformations, modulating the trade-off between receptor binding and immune evasion The spike protein undergoes conformational changes that reveal over 50 previously hidden (“cryptic”) pockets Cryptic pockets provide expanded opportunities for antiviral drug design by targeting novel binding sites Two novel cryptic pockets on Mpro expand current therapeutic options |
4. Discussion
The global impact of the COVID-19 pandemic has highlighted the importance of vaccination in containing the spread of SARS-CoV-2. However, the virus’s capacity for immune escape can reduce the efficacy and effectiveness of vaccines worldwide. Studying the underlying mechanisms can help anticipate the impact of future mutations and optimize vaccines and drug treatments so that they remain effective against newly emerging variants. Bioinformatic and computational tools play a major role in understanding immune escape because they allow large amounts of data to be analyzed in a very short time, which facilitates the production and optimization of vaccines and drugs.
This study therefore aimed to map the role of bioinformatic tools in understanding viral immune evasion and to highlight the emerging methodologies used during the period 2020–2025.
Overall, a total of 416 publications were retrieved from the Scopus database. The number of publications increased rapidly from 2020 and peaked in 2022, reflecting the urgent global scientific response to the COVID-19 pandemic. This rise coincided with the emergence of major SARS-CoV-2 variants of concern, which triggered intensive research on viral evolution, immune escape mechanisms, and the evaluation of vaccine and therapeutic effectiveness [25,26]. During this phase (2020–2022), bioinformatic and in silico tools played a central role in rapidly analyzing large-scale genomic and structural data and assessing the impact of mutations.
Following this peak, publication output gradually declined during 2024–2025, indicating a transition from emergency-driven research toward more consolidated and targeted scientific activity. Despite this decline, the overall volume of publications remained substantial, suggesting a sustained structural interest in COVID-19 research, particularly in areas related to long-term genomic surveillance, variant monitoring, and computational analyses.
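The yearly trend described above can be reproduced from any list of publication years. The following is a minimal sketch, assuming the publication year has already been extracted from a database export; the function names and the toy input are hypothetical illustrations and are not the data or the software used in this study.

```python
from collections import Counter


def publications_per_year(years):
    """Count publications per year and return a dict ordered by year."""
    counts = Counter(years)
    return dict(sorted(counts.items()))


def peak_year(counts):
    """Return the year with the most publications (earliest year wins ties)."""
    return max(sorted(counts), key=lambda year: counts[year])


# Hypothetical placeholder input, NOT the 416 records analyzed in this study.
toy_years = [2020, 2021, 2021, 2022, 2022, 2022, 2023, 2023, 2024, 2025, 2025]

counts = publications_per_year(toy_years)
for year, n in counts.items():
    print(year, n)
print("peak year:", peak_year(counts))
```

Sorting the years before taking the maximum makes the tie-breaking rule explicit, which matters when two years have the same output.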
During the 2024–2025 period, several new SARS-CoV-2 lineages emerged, including JN.1, NB.1.8.1, XEC, LP.8.1, and KP.3.1.1 [27,28,29]. These variants exhibited differences in transmissibility and partial immune evasion; however, most did not demonstrate a significant increase in clinical severity or hospitalization rates. The continued effectiveness of booster vaccination strategies and antiviral therapies likely contributed to this trend. The slight rebound in publication activity observed in 2025 therefore appears to reflect renewed scientific interest driven by the emergence of new variants within an already established and evolving research framework.
As noted above, the USA leads in citations (3684) and has strong collaborative links with the United Kingdom, Germany, Canada and China, reflecting its significant global research influence. Citation counts indicate visibility and influence rather than scientific quality directly. Moreover, a multiplex network analysis has shown that leading countries in global research increasingly receive a disproportionately high number of citations compared to other countries conducting similar work, reinforcing their prominence within the scientific community [30]. Other countries with a high citation impact include China, India, the United Kingdom, South Africa, Switzerland, Bangladesh, Korea and Spain.
It should be noted that South Africa has a high citation count (442 citations), even though its number of publications is not outstanding. A possible reason is the detection and monitoring of a newly emerged SARS-CoV-2 variant by a genomic surveillance team in South Africa and Botswana in November 2021 [31].
Notably, Morocco also ranked within the top 50 most cited countries worldwide, highlighting its growing contribution to the field. One example is a study of the genomic evolution and genetic diversity of SARS-CoV-2 circulating in Morocco, which analyzed a total of 1989 whole-genome sequences from 2020 to 2024. This study used bioinformatic tools, such as Nextclade, together with phylogenetics to analyze the mutations detected in the retrieved sequences [32]. Moreover, the collection of SARS-CoV-2 genome sequencing data relied on publicly available bioinformatics tools, highlighting the major role of such tools in immune evasion studies.
The analysis of the most cited authors in the selected fields highlights the productivity and impact of a group of authors during the period 2020–2025. Verkhivker G.M. stands out as a leading contributor to SARS-CoV-2 and COVID-19 research using bioinformatics and computational tools. His highly cited output reflects both personal expertise and the advantages of being affiliated with a USA research institution.
He has used bioinformatics approaches, such as molecular dynamics modeling, computational simulations and structural analyses, to study SARS-CoV-2 viral mechanisms and therapeutic targets. The frequent citation of his work indicates its strong influence in the selected field.
One of his studies examined functional dynamics and identified the regulatory centers of SARS-CoV-2 allosteric interactions by using molecular simulations and network modeling approaches [33]. This study offers a novel perspective on the mechanisms of the SARS-CoV-2 spike protein, which could help advance drug and vaccine optimization.
Another of his studies used atomistic simulations and functional dynamics analysis, combined with alanine scanning and mutational sensitivity profiling, to examine the SARS-CoV-2 spike protein’s interactions with the ACE2 host receptor and neutralizing antibodies [34].
In addition, the distribution of the highly cited documents highlights the journals that serve as the main platforms for research in the selected fields (2020–2025). It reflects the diversity of publication venues, while showing that influential studies are concentrated in a limited number of high-output sources and linked to influential affiliations worldwide. Indeed, our results showed that international collaboration is crucial in advancing our understanding of SARS-CoV-2 immune evasion.
Overall, these findings demonstrate how a combination of expertise, access to advanced research infrastructure and international collaboration can increase productivity and scientific impact in the development of SARS-CoV-2 immune escape studies.
Furthermore, keyword and co-occurrence network analysis revealed that SARS-CoV-2 remains the central focus of research in this field. Terms such as “human” and “COVID-19” appeared frequently alongside it, indicating a strong focus on human infection and COVID-19 pandemic-related studies. Frequently used keywords such as “genetics”, “immune evasion”, “spike protein”, “mutation” and “molecular dynamics” suggested a focus on understanding viral structure and function, as well as the mechanisms underlying immune escape using bioinformatics. The co-occurrence network also showed how these topics are interconnected, forming distinct clusters that reflect the interdisciplinary nature of the research. Overall, these keywords demonstrated an integrated approach that combines virology, immunology, and computational modeling to shed light on SARS-CoV-2 immune escape.
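As an illustration of how keyword co-occurrence is typically counted, the sketch below tallies how often two keywords appear in the same record. It is a simplified stand-in for dedicated bibliometric software, not the tool used in this study; the function name and the toy records are hypothetical, and keywords are normalized only by trimming and lowercasing.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count how many records contain each unordered pair of keywords."""
    pair_counts = Counter()
    for keywords in keyword_lists:
        # Normalize and deduplicate so a pair is counted once per record.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Hypothetical placeholder records, NOT taken from the analyzed dataset.
toy_records = [
    ["SARS-CoV-2", "immune evasion", "spike protein"],
    ["SARS-CoV-2", "spike protein", "molecular dynamics"],
    ["sars-cov-2", "Immune Evasion"],
]

for pair, n in cooccurrence_counts(toy_records).most_common(3):
    print(pair, n)
```

The resulting pair counts are the edge weights of the co-occurrence network; clustering and layout are separate steps that this sketch does not cover.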
An analysis of research trends over time reveals a clear evolution in focus as the pandemic progressed. During the early period of the pandemic (2019–2021), research focused on the virological and structural characteristics of SARS-CoV-2, since understanding this newly emerged virus was essential for managing the pandemic and its consequences effectively. From 2021 onwards, studies focused on vaccine- and antibody-mediated immune responses against the emerging variants. The emergence of new viral lineages and variants in 2022 then shifted research priorities towards genomic surveillance and variant characterization, reflecting the need to monitor viral evolution. Recent topics of interest in 2025, such as “SARS-CoV-2 lineage XBB.1.5”, suggest that the focus on variant monitoring and molecular epidemiology continues [34]. This evolution in trends reflects shifting research interests in fundamental virology, immunology and bioinformatics in response to emerging challenges, and demonstrates how scientific activity can increase rapidly in response to global issues.
Finally, the analysis of the top ten cited articles in the selected field focused on the methodology used and the main findings. It showed that Omicron was the main variant studied, because most of these influential studies were published during 2021 and 2022, the period when Omicron emerged as the new variant of concern. This timeline coincides with the WHO’s designation of the B.1.1.529 lineage as the Omicron variant of concern in late November 2021, which spread rapidly around the world by early 2022 [23,35]. The predominance of Omicron in these studies also reflects its significant impact on transmission and vaccine efficacy during this period. During 2021–2022, there was a transition from earlier variants, such as Gamma and Delta, to Omicron, which quickly became the dominant strain worldwide by early 2022 and gave rise to many subvariants. Following its first identification, the sublineages BA.1 and BA.1.1, and then BA.2 and BA.3, were reported around the same period [36]. Several studies conducted globally documented Omicron through epidemiological analyses, genomic surveillance and immune studies.
For example, one study reported a genetic analysis suggesting that the Omicron variant evolved either from the Alpha VOC or from a new monophyletic clade. This study provided a detailed analysis of the emergence of the Omicron variant, its mutational patterns, pathogenicity and transmissibility, as well as treatment and vaccination strategies [36].
In the highly cited articles, structural investigations combining cryo-EM, X-ray crystallography, and molecular dynamics simulations showed that Omicron’s spike is mostly in an open conformation. Compared with the original strain, this probably helps it to bind more strongly to ACE2 and evade the immune system. This open conformation has also been observed in other variants, including Alpha, Beta, and Gamma, which highlights a viral strategy to reduce antibody recognition and thereby allow immune evasion [37].
Moreover, these articles showed that neutralization tests using live virus and pseudovirus systems have validated that certain mutations in the spike protein allow SARS-CoV-2 to escape antibody recognition. Neutralization tests are considered the gold-standard technique for detecting and quantifying neutralizing antibodies against viruses, including SARS-CoV-2. These tests directly measure the ability of antibodies to prevent viral infectivity in vitro, providing information about immune protection. Their high specificity and sensitivity make them important tools for vaccine evaluation, serological studies, and therapeutic development [2,38].
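To make the quantification step concrete, the sketch below estimates a 50% neutralization titer (often written NT50) from a serum dilution series by log-linear interpolation, and expresses immune escape as a fold reduction in titer. This is a simplified illustration under stated assumptions, not the protocol of any cited study: real analyses usually fit a sigmoidal dose-response curve, and the function names and example values are hypothetical.

```python
import math


def nt50(dilutions, percent_neutralization):
    """Estimate the reciprocal dilution giving 50% neutralization.

    Interpolates linearly on a log10 dilution scale between the two
    measurements that bracket 50%. Returns None if no pair brackets 50%.
    """
    points = sorted(zip(dilutions, percent_neutralization))
    for (d_lo, n_lo), (d_hi, n_hi) in zip(points, points[1:]):
        # Neutralization is expected to fall as the serum is diluted further.
        if n_lo >= 50.0 >= n_hi and n_lo != n_hi:
            frac = (n_lo - 50.0) / (n_lo - n_hi)
            log_d = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_d
    return None


def fold_reduction(titer_reference, titer_variant):
    """Fold drop in titer against a variant relative to a reference strain."""
    return titer_reference / titer_variant


# Hypothetical placeholder measurements, NOT experimental data.
reference = nt50([10, 100, 1000], [95.0, 80.0, 20.0])
variant = nt50([10, 100, 1000], [70.0, 30.0, 5.0])
print(round(fold_reduction(reference, variant), 1))
```

A fold reduction above 1 means that more concentrated serum is needed to neutralize the variant, which is how reduced plasma-neutralizing activity is commonly summarized.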
These top ten cited studies also used biophysical techniques such as surface plasmon resonance (SPR) assays, which precisely measure changes in spike–ACE2 binding interactions. Furthermore, deep mutational scanning studies provide a high-resolution map that links spike protein mutations to immune evasion. This approach reveals how mutations facilitate escape from neutralizing antibodies while preserving ACE2 affinity.
In this context, one study used deep mutational scanning to investigate SARS-CoV-2 mutations and their impact on immune evasion and infectivity. Its analysis revealed that spike protein expression, rather than ACE2 affinity, was the main factor affecting viral infectivity and was correlated with the evolution of SARS-CoV-2 [39].
In conclusion, this multidisciplinary framework integrating structural, functional and computational studies of SARS-CoV-2 spike mutations and immune evasion provides information to guide future therapeutic and vaccine development.
Overall, these highly cited studies illustrate a complementary interplay between experimental and computational approaches. Experimental techniques, including neutralization assays with live virus or pseudovirus, SPR analysis, deep mutational scanning, and cryo-EM, provided direct functional and structural insights into Omicron’s immune evasion and infectivity. In parallel, computational and bioinformatic methods, such as molecular dynamics simulations, structural modeling, and phylogenetics, enabled rapid prediction and mapping of mutational effects on spike–ACE2 interactions and antibody escape. Together, these approaches allow a comprehensive understanding of viral evolution, highlighting how experimental validation and computational modeling reinforce and complement each other in SARS-CoV-2 research.
This study focused exclusively on scientific articles indexed in the Scopus database, which may have introduced database biases and excluded relevant studies from other sources. Furthermore, as the search strategy relied on terms in article titles, abstracts and keywords, it may have excluded some relevant studies without explicit keyword matches.
Future research should use broader search strategies and include additional databases in order to incorporate more articles and collect a larger dataset for analysis. This would reduce the risk of missing relevant studies in the selected field.
As far as we know, previous bibliometric studies have focused on global COVID-19 vaccines or on vaccine or immune evasion, have considered only the top 100 highly cited articles, or have investigated the impact of COVID-19 vaccines on the immune system during 2020–2024. However, none of these studies has conducted a comprehensive analysis of all highly cited articles investigating immune evasion using bioinformatic tools. This bibliometric analysis is the first to do so, aiming to evaluate global scientific output on SARS-CoV-2 immune escape using bioinformatics (2020–2025). In addition, a descriptive analysis was conducted to identify and examine the top ten cited original articles, allowing a multidimensional analysis of research trends and of the use of bioinformatic approaches for the study of SARS-CoV-2 immune evasion during 2020–2025.