Article

TCEPVDB: Artificial Intelligence-Based Proteome-Wide Screening of Antigens and Linear T-Cell Epitopes in the Poxviruses and the Development of a Repository

by
Mansi Dutt
1,2,3,
Anuj Kumar
1,2,3,
Ali Toloue Ostadgavahi
1,2,
David J. Kelvin
1,2,3,* and
Gustavo Sganzerla Martinez
1,2,3
1
Department of Microbiology and Immunology, Canadian Center for Vaccinology (CCfV), Faculty of Medicine, Dalhousie University, Halifax, NS B3K 6R8, Canada
2
Laboratory of Immunity, Shantou University Medical College, Jinping, Shantou 515041, China
3
BioForge Canada Limited, Halifax, NS B3N 3B9, Canada
*
Author to whom correspondence should be addressed.
Proteomes 2025, 13(4), 58; https://doi.org/10.3390/proteomes13040058
Submission received: 6 August 2025 / Revised: 8 October 2025 / Accepted: 27 October 2025 / Published: 6 November 2025

Abstract

Background: Poxviruses constitute a family of large dsDNA viruses that can infect a wide range of species, including humans. Historically, poxviruses have caused a substantial health burden across multiple outbreaks. The large genome of poxviruses favors reverse vaccinology approaches that can determine potential antigens and epitopes. Here, we propose a user-friendly database containing the predicted antigens and epitopes of a large cohort of poxvirus proteomes, generated with the existing PoxiPred method for reverse vaccinology of poxviruses. Methods: We obtained the whole proteomes of 37 distinct poxviruses and used each proteome to predict both antigenic proteins and T-cell epitopes with an Artificial Intelligence method, namely PoxiPred. Results: In total, we predicted 3966 proteins as potential antigen targets. Of note, these proteins may exist as sets of proteoforms. Subsets of these proteins yielded a comprehensive repository of 54,291 linear T-cell epitopes. We combined the outcome of the predictions into a web tool that delivers a database of antigens and epitopes of poxviruses, giving end-users access to AI-screened antigens and T-cell epitopes in a user-friendly manner. These antigens and epitopes can be utilized to design experiments for the development of effective vaccines against a range of poxviruses. Conclusions: The TCEPVDB repository, already deployed to the web under an open-source coding philosophy, is free to use, does not require any login, and does not store any information from its users.

1. Introduction

Poxviruses represent a family of large dsDNA viruses [1] whose genomes encode several proteins that are key drivers of viral pathogenicity and transmissibility. Poxviruses can infect invertebrates and vertebrates, including Homo sapiens. Throughout history, poxviruses, in the form of the variola virus (VARV), which causes smallpox, have been listed among the greatest infectious killers of mankind [2]. Archaeological traces of VARV date back to ancient Egypt in 1157 BC: lesions found on the mummified body of the pharaoh Ramses V are suggestive of smallpox. Although smallpox was declared eradicated in 1980, other poxviruses still pose a threat to human life [3]. In May 2022, while the world was still in the SARS-CoV-2 pandemic, sustained human-to-human transmission of the mpox virus (MPXV) began to be reported in areas where the disease, mpox, is not endemic, resulting in 93,497 cases (91,373 in locations that have not historically reported mpox) and 177 deaths (156 in locations that have not historically reported mpox) worldwide [4]. Apart from MPXV, other poxviruses such as tanapox [5], orf [6], and molluscum contagiosum [7] can also infect humans. Poxviruses also pose a veterinary threat to several animals. Lumpy Skin Disease Virus (LSDV) causes mouth ulcers in cattle that can lead to weakness, loss of appetite, and reduced milk production. Camelpox virus accounts for weight loss, reduced milk production, and mortality in camelids [8]. Birds are also affected by poxviruses: of approximately 9000 bird species, 232 have been reported to have acquired a natural poxvirus infection [9].
One of the components of an immune response is cellular immunity, or T-cell-mediated immunity, responsible for defending the body against intracellular pathogens. Pathogens express antigens, which are proteins recognized by T-cell receptors (TCRs). These antigens may contain specific regions called epitopes, which are short protein subsequences that directly interact with TCRs. The epitopes are presented by major histocompatibility complex (MHC) molecules on the surface of antigen-presenting cells. After recognizing the epitope–MHC complex, the specific T cells are activated and clonally expanded, leading to the proliferation of antigen-specific T cells able to mount an immune response against a specific antigen of a pathogen. Upon activation, T cells can specialize in different functions such as supporting antibody production by B cells, inducing apoptosis in target cells, suppressing excessive immune responses, and generating long-lasting immunological memory [10].
Now that whole-genome sequencing has become accessible, the genome of a pathogen can be explored to determine its potential antigenic repertoire; this process is termed ‘reverse vaccinology’ [11,12]. The spread of Artificial Intelligence (AI) has enabled the analysis and prediction of high volumes of genomic data [13,14]. AI techniques have also been employed as a key step in the analysis and classification of data in the reverse vaccinology paradigm [15,16]. Examples of vaccines that are on the market as of August 2025 include a vaccine for the influenza H5N1 virus [17]; Bexsero, a multicomponent meningococcal serogroup B (4CMenB) vaccine [18]; and Shingrix, a vaccine developed against shingles, which is caused by the varicella-zoster virus [19].
In this work, we present the T-Cell Epitopes Poxviruses Database (TCEPVDB). We obtained the protein repertoire of 37 distinct poxviruses and submitted each to the PoxiPred [20] method for the prediction of vaccine components. The predicted outputs are organized into a user-friendly database. Here, we document the development stage of TCEPVDB. Users interested in exploring the data of TCEPVDB can freely access the tool at https://tcepvdb.microbiologyandimmunology.dal.ca (accessed on 11 October 2025).

2. Materials and Methods

A flowchart of the pipeline utilized in the present study is illustrated in Figure 1.

2.1. Retrieval of the Protein Repertoire

We obtained the protein repertoire of 37 poxviruses. We searched NCBI using the query term ‘poxvirus’. A total of 49 results matched our query, from which we could obtain a complete genome for 37 distinct poxviruses. When available, we opted to use RefSeq genomes.

2.2. Predicting Antigens and LTCEs

PoxiPred [20] was originally developed as an agnostic classification framework for predicting antigens and linear T-cell epitopes (LTCEs) in poxvirus protein datasets, functioning as an early data curation step. For the construction of TCEPVDB, we did not develop new models; instead, we employed the pre-trained Deep Learning Artificial Neural Network (DL-ANN) models for (i) antigen prediction and (ii) LTCE prediction. The models are publicly available at https://github.com/gustavsganzerla/poxipred (accessed on 21 September 2025). Using these existing models, we analyzed the protein repertoire of 37 distinct poxviruses. First, the antigen prediction model was applied to each individual protein from the dataset. Proteins predicted as potential antigens were then fragmented into smaller peptides, which were subsequently evaluated using the LTCE prediction model. In both instances, predictions were considered positive when the sigmoid output layer of the corresponding pre-trained model produced a score ≥ 0.5.
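The two-stage screening described above can be sketched as follows. This is a minimal illustration and not the PoxiPred code: the scoring functions `antigen_score` and `epitope_score` and the `fragment` helper are hypothetical stand-ins for the pre-trained models and the peptide fragmentation step, and only the 0.5 cut-off is taken from the text.

```python
from typing import Callable, Dict, Iterable, List, Tuple

THRESHOLD = 0.5  # sigmoid cut-off stated in the text


def screen_proteome(
    proteins: Dict[str, str],
    antigen_score: Callable[[str], float],   # hypothetical stand-in for the antigen model
    epitope_score: Callable[[str], float],   # hypothetical stand-in for the LTCE model
    fragment: Callable[[str], Iterable[str]],  # hypothetical peptide fragmentation helper
    threshold: float = THRESHOLD,
) -> Dict[str, List[Tuple[str, float]]]:
    """Return {protein_id: [(peptide, score), ...]} for proteins predicted as antigens."""
    results: Dict[str, List[Tuple[str, float]]] = {}
    for protein_id, sequence in proteins.items():
        # Stage 1: keep only proteins whose antigen score reaches the threshold.
        if antigen_score(sequence) < threshold:
            continue
        # Stage 2: score every fragment of the antigen and keep positive peptides.
        hits = []
        for peptide in fragment(sequence):
            score = epitope_score(peptide)
            if score >= threshold:
                hits.append((peptide, score))
        results[protein_id] = hits
    return results
```

Passing the models as callables keeps the sketch independent of any particular deep learning library; in practice each callable would wrap a forward pass through the corresponding pre-trained network.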

2.3. Web Development

We implemented TCEPVDB as a webtool using the Django framework (version 4.2.4) for Python (version 3.11) web development. First, a host server was purchased in the region of Toronto (Canada). This server was subsequently configured to synchronize with a Git (version 2.39.2) repository, which serves as the source of version control, with commits being actively contributed over time. We used the relational database system SQLite, which the Django framework supports out of the box. In addition, the deployment architecture uses the Apache2 HTTP server as a reverse proxy that handles incoming requests on a domain obtained within Dalhousie University and directs them to the appropriate backend services.

2.4. Unified Modeling Language (UML) Artifacts

We used UML to document the development stages of TCEPVDB. First, an entity relationship (ER) diagram was used to document the logical structure of our database system. Next, an activity diagram was used to visualize the dynamic aspects of the system, representing a user querying TCEPVDB for antigens and epitopes. Both diagrams were developed using the diagramming webtool LucidChart (Lucid Software, South Jordan, UT, USA).

2.5. Conservation of Epitopes Across Poxvirus Species

Predicted epitope sequences from each organism were extracted from their files and scanned against the proteome of all other organisms. For each source organism, the number of its epitopes present in each proteome was counted and compiled into a matrix. The resulting matrix was visualized as a heatmap to illustrate epitope conservation and cross-species distribution.
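A minimal sketch of this counting step is given below. It assumes exact substring matching and counts each distinct epitope once per target proteome; both choices, as well as the function and argument names, are illustrative assumptions and are not taken from the TCEPVDB code.

```python
from typing import Dict, List


def conservation_matrix(
    epitopes: Dict[str, List[str]],   # organism -> predicted epitope sequences
    proteomes: Dict[str, List[str]],  # organism -> protein sequences
) -> Dict[str, Dict[str, int]]:
    """counts[source][target] = number of source epitopes found in the target proteome."""
    counts: Dict[str, Dict[str, int]] = {}
    for source, peptides in epitopes.items():
        unique_peptides = set(peptides)  # assumption: count each distinct epitope once
        row: Dict[str, int] = {}
        for target, proteins in proteomes.items():
            # An epitope is "present" if it occurs verbatim in any protein of the target.
            row[target] = sum(
                1
                for peptide in unique_peptides
                if any(peptide in protein for protein in proteins)
            )
        counts[source] = row
    return counts
```

The nested dictionary can be converted to a two-dimensional array and passed to any plotting library to draw the heatmap. Note that this sketch also fills the diagonal (an organism scanned against its own proteome), which may or may not match how the published figure was built.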

3. Results

3.1. Organisms and Predictions

We assessed a total of 7185 proteins from 37 distinct poxviruses. Firstly, each protein was submitted to PoxiPred’s antigen predictor. A total of 3966 proteins were flagged as potential antigens (Table 1). Importantly, these proteins may exist in different proteoforms and may arise from sequence variations, post-translational modifications, etc., which may significantly influence epitope accessibility and immune recognition.
Next, we queried each protein and extracted subsequences of it to submit to the T-cell epitope prediction model. First, we explored the training data of PoxiPred and obtained the interquartile length of the epitopes used for training the model, i.e., the range of 9 to 13 amino acids (Supplementary Figure S1). To mathematically represent our epitope search, let $\alpha$ be an antigen sequence. Let $n$ be a random variable representing the length of the epitope such that $n \in \{9, 10, 11, 12, 13\}$. The epitope $\mathrm{epitope}_{a_1,\,a_1+n}$ is the substring of $\alpha$ starting at position $a_1$ and ending at position $a_1 + n - 1$, where
$$\mathrm{epitope}_{a_1,\,a_1+n} = \alpha_{a_1 : a_1+n-1}$$
Upon extracting the epitope $\mathrm{epitope}_{a_1,\,a_1+n}$, let $\alpha$ be updated to the remainder of the sequence beginning at position $a_1 + n$,
$$\alpha = \alpha_{a_1+n \,:\, \mathrm{length}(\alpha)}$$
and the procedure is repeated until $\mathrm{length}(\alpha) \le 13$.
In total, we predicted 54,291 LTCEs.
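The fragmentation procedure can be sketched as below. The sketch assumes that each fragment starts at the beginning of the remaining sequence and that the length n is drawn uniformly from 9 to 13; the text states only that n is a random variable, so the distribution is an assumption. The treatment of the final remainder (13 residues or fewer) is not specified in the text, so the function simply returns it alongside the fragments.

```python
import random
from typing import List, Optional, Tuple

EPITOPE_LENGTHS = (9, 10, 11, 12, 13)  # interquartile length range from the text


def fragment_antigen(
    sequence: str, rng: Optional[random.Random] = None
) -> Tuple[List[str], str]:
    """Cut consecutive, non-overlapping fragments of random length from an antigen.

    Returns (fragments, remainder), where the remainder has at most 13 residues.
    """
    rng = rng or random.Random()
    fragments: List[str] = []
    remaining = sequence
    # Repeat until length(alpha) <= 13, as in the stopping rule above.
    while len(remaining) > max(EPITOPE_LENGTHS):
        n = rng.choice(EPITOPE_LENGTHS)  # assumption: uniform draw over 9..13
        fragments.append(remaining[:n])  # substring covering positions 1..n
        remaining = remaining[n:]        # alpha is updated to the rest of the sequence
    return fragments, remaining
```

Passing a seeded `random.Random` instance makes the fragmentation reproducible, which is convenient when the same proteome needs to be processed more than once.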

3.2. Webtool Functionalities

Users who access TCEPVDB can search in two distinct tables: (i) antigens, which return to the user proteins predicted as antigens, and (ii) epitopes, which return to the user peptides predicted as epitopes. The search term, the only input required from the user, can be of two natures: (i) a search organism, in which epitopes or antigens are searched based on the name or partial name of an organism, and (ii) a search antigen, in which epitopes or antigens are searched based on the name or partial name of a protein. TCEPVDB is implemented using two relational tables, i.e., protein and epitope tables, in an SQLite database. One protein can have 0 to many epitopes associated with it. We modeled the protein and epitope tables as part of an entity relationship diagram (Figure 2A).
Moreover, we modeled a user performing a query operation to TCEPVDB as part of an activity diagram (Figure 2B). First, a user needs to submit a query term, whether in the name of an organism or the name of a protein. Next, the user selects the table to perform the query, i.e., antigens table or epitopes table. The system will then search the query term in the specified table. Finally, the system will either display the results of the submitted query or display a message indicating no results were found.
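The two-table design and the query workflow can be illustrated with Python's standard `sqlite3` module. This is a sketch under assumptions: the column names, the `search` helper, and the use of `LIKE` for partial-name matching are hypothetical and do not reproduce the Django models behind TCEPVDB.

```python
import sqlite3

# Hypothetical schema: one protein row can have zero or many epitope rows.
SCHEMA = """
CREATE TABLE protein (
    id INTEGER PRIMARY KEY,
    organism TEXT NOT NULL,
    description TEXT NOT NULL,
    sequence TEXT NOT NULL,
    score REAL
);
CREATE TABLE epitope (
    id INTEGER PRIMARY KEY,
    protein_id INTEGER NOT NULL REFERENCES protein(id),
    sequence TEXT NOT NULL,
    score REAL,
    start_pos INTEGER,
    end_pos INTEGER
);
"""


def search(conn: sqlite3.Connection, table: str, term: str) -> list:
    """Search by full or partial organism name or protein name in the chosen table."""
    pattern = f"%{term}%"
    if table == "antigens":
        sql = (
            "SELECT organism, description, score FROM protein "
            "WHERE score >= 0.5 AND (organism LIKE ? OR description LIKE ?)"
        )
    elif table == "epitopes":
        sql = (
            "SELECT p.organism, p.description, e.sequence, e.score "
            "FROM epitope e JOIN protein p ON p.id = e.protein_id "
            "WHERE p.organism LIKE ? OR p.description LIKE ?"
        )
    else:
        raise ValueError("table must be 'antigens' or 'epitopes'")
    # An empty list corresponds to the 'no results were found' branch.
    return conn.execute(sql, (pattern, pattern)).fetchall()
```

The search term is bound as a query parameter rather than interpolated into the SQL string, which avoids SQL injection from user-supplied input.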
Upon rendering the results page, the user can click on a ‘download’ icon. If queried to the antigen table, the user will download a .fasta file containing all the proteins predicted as antigens. If queried to the epitopes table, a .fasta file containing the protein of origin followed by all epitopes predicted separated by a tab delimiter will be prepared for the user to download.
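One possible reading of the epitope download format is sketched below: each record has a FASTA header naming the protein of origin, followed by a line with that protein's predicted epitopes separated by tabs. The exact layout of the files served by TCEPVDB is not specified in the text, so the function name and the record layout are assumptions.

```python
from typing import Dict, List


def epitopes_to_fasta(epitopes_by_protein: Dict[str, List[str]]) -> str:
    """Serialize {protein header: [epitope, ...]} into FASTA-like text."""
    records = []
    for header, peptides in epitopes_by_protein.items():
        # Header line with the protein of origin, then tab-delimited epitopes.
        records.append(f">{header}\n" + "\t".join(peptides))
    return "\n".join(records) + "\n" if records else ""
```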
TCEPVDB also includes a functionality to display all organisms available in the database. In the navigation bar, users can select the item ‘All Organisms’. A table similar to Table 1 will be displayed, in which each row represents an organism. The information of an organism is divided into three levels: protein repertoire, antigen, and epitope. For each level, the number of proteins, the number of proteins predicted as antigens, or the total number of LTCEs is displayed, respectively. Next to each level, a ‘view’ icon and a ‘download’ icon are also displayed. Upon clicking the ‘view’ icon, the user will be directed to another table showing, according to the level of the click, (i) all the proteins of an organism; (ii) all the proteins predicted as antigens of an organism; or (iii) all the epitopes of an organism. Upon clicking the ‘download’ icon, the system will automatically download, according to the level of the click, (i) all proteins of an organism, compiled in a .fasta file; (ii) all the proteins predicted as antigens of an organism, compiled in a .fasta file; or (iii) all the LTCEs of an organism, compiled in a .fasta file with the header containing the protein of origin followed by all predicted epitopes separated by a tab delimiter. Both the epitope and antigen models in TCEPVDB contain a score attribute, which is defined by the output of the sigmoid function applied at the final layer of the neural networks implemented in PoxiPred [20]. Entries displayed in TCEPVDB correspond to predicted antigens and epitopes with sigmoid scores ≥ 0.5, reflecting the classification threshold used to determine positive predictions.
Finally, when a user is visualizing either the antigen or the epitope table (resulting from viewing a query or all organisms), we include a URL in the ‘Description’ column of the rendered table. The ‘Description’ column contains information on the antigen or the molecular parent of a predicted epitope. Upon clicking the URL, the user is directed to the individual view of the antigen, in which an HTML page is rendered showing the description (Figure 3A); the protein sequence (Figure 3B); a button to download the epitopes predicted for the protein in question (Figure 3C) in .fasta format; and a table containing all the epitopes predicted for that antigen (Figure 3D), with columns for the epitope number, epitope sequence, epitope prediction score, and genomic coordinates for the start and end of the epitope within the protein. A comparison between the newly developed TCEPVDB and other existing vaccine tools and platforms (IEDB [21], Vaxign [22], and VaxiJen [23]) is given in Table 2.

3.3. Conservation of Predicted Epitopes Across Poxvirus Species

To check the conservation of the predicted epitopes, we investigated the presence or absence of each predicted epitope across the entire protein repertoire of the 37 poxviruses available in TCEPVDB (Figure 4). The poxviruses ectromelia, horsepox, mpox, taterapox, vaccinia, variola, camelpox, and cowpox shared over one thousand epitopes among themselves, attesting to the conservation between these poxviruses, which might be exploited in cross-protective vaccine development.

4. Discussion

Here, we developed TCEPVDB, a database consisting of AI-predicted antigens and LTCEs from 37 distinct poxviruses that infect humans and animals. TCEPVDB is freely accessible, does not require any login, and does not store any information from its users. Moreover, we followed an open-data policy in developing our tool. We believe that TCEPVDB constitutes an important milestone in the reverse vaccinology paradigm; as of August 2025, no databases specifically focused on providing resources for studying the development of vaccines following reverse vaccinology are available.
The first vaccine whose development was based on the paradigm of reverse vaccinology was proposed by Sette and Rappuoli (2010) [12]. Since then, several applications have become available to facilitate the execution of the reverse vaccinology pipeline [24,25,26]. Despite the absence of a single comprehensive database focused on reverse vaccinology elements, there are resources that, when combined, might assist in predicting antigens and epitopes to facilitate vaccine development. Although resources such as the Immune Epitope Database (IEDB) [21], Vaxign [22], and the Pathosystems Resource Integration Center (PATRIC) [27] do not specifically focus on reverse vaccinology, they all contain valuable data and tools that can be used in the process of identifying potential vaccine targets.
The current iteration of TCEPVDB has limitations. First, the database is entirely focused on poxviruses. There is compelling evidence that true antigens and true epitopes have a conserved profile when compared to proteins/peptides not able to drive an immune response [28,29,30]. Next, we also note that TCEPVDB is built entirely upon T-cell epitopes, which constitute only a share of an immune response. However, the lack of experimentally verified B-cell epitopes hindered the assembly of an inclusive training dataset for a pipeline focused on predicting B-cell epitopes of poxviruses. Finally, the predictions we obtained with PoxiPred [20] and published in TCEPVDB are still raw outputs of the PoxiPred pipeline. Filters that test antigens and epitopes for allergenicity [31] and toxicity [32] can be employed and would consequently reduce the number of available antigens/epitopes in our database.
The methodology for the prediction of antigens and epitopes utilized in the present study is solely based on the PoxiPred method, evaluated with ROC AUC metrics, which have been defined as a key indicator of classification performance [20]. Similar trends have already been followed in other related studies. Previously, Souza et al. [33] applied precision–recall trade-offs to validate model reliability in drug-likeness prediction for the screening of SARS-CoV-2 inhibitors. In another recent study, Kacen et al. [34] highlighted the importance of antigenic landscape modeling with the aid of tumor immunopeptidomics. To predict effective vaccine candidates for mpox or other contagious poxviruses, a combined sequence- and structure-based approach is required to enhance accuracy and productivity, as reported by Pritam [35] in a proteome-wide immunoinformatics study. These reports underscore the value of rigorous evaluation of available epitope prediction tools. In the future, incorporating different ML-based metrics into the validation of the antigens and epitopes in TCEPVDB may strengthen its comparative positioning and its utility for the development of effective vaccine candidates against multiple poxviruses.
In conclusion, the in silico step of reverse vaccinology serves primarily as a data curation step, generating preliminary predictions of candidate epitopes and antigens. These predictions require subsequent experimental validation and biological interpretation to refine the set of candidates with true potential for incorporation into vaccine designs.

5. Conclusions

TCEPVDB provides a comprehensive catalog of 3966 proteins predicted as potential antigen targets and 54,291 linear T-cell epitopes from 37 distinct poxviruses. The antigens and linear T-cell epitopes in this database were predicted using the AI-based PoxiPred method. TCEPVDB is a user-friendly database and can be freely accessed using the following URL: https://tcepvdb.microbiologyandimmunology.dal.ca/ (accessed on 11 October 2025). With further progress in genome sequencing and the AI-based screening of antigens and epitopes, we anticipate that the number of entries in TCEPVDB will grow in the coming years. Taken together, the information available in TCEPVDB can be used in reverse vaccinology efforts, facilitating the rapid development of effective vaccines against poxviruses.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/proteomes13040058/s1: Figure S1: Length distribution of the T Cell epitopes used for training the PoxiPred model.

Author Contributions

Conceptualization, M.D., A.K., G.S.M. and D.J.K.; methodology, M.D., G.S.M. and A.K.; software, G.S.M.; validation, G.S.M. and A.K.; formal analysis, M.D., A.T.O., G.S.M. and A.K., investigation, M.D., A.T.O., G.S.M. and A.K.; resources, G.S.M. and A.K.; data curation, M.D.; writing—original draft preparation, M.D., G.S.M. and A.K.; writing—review and editing, D.J.K.; visualization, G.S.M.; supervision, D.J.K.; project administration, D.J.K.; funding acquisition, D.J.K. All authors have read and agreed to the published version of the manuscript.

Funding

This study is supported by awards from the Canadian Institutes of Health Research (CIHR), Mpox Rapid Research Funding Initiative (CIHR MZ1 187236), Moderna Global Fellowship 2024 [2024-MGF-000316 (91353)], Research Nova Scotia Grant 2023-2565, Dalhousie Medical Research Foundation, and the Li-Ka Shing Foundation. A.K. is a Moderna Global Fellow, and D.J.K. is the Canada Research Chair in Translational Vaccinology and Inflammation.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors are thankful to Nikki Kelvin for providing key editorial guidance in the consolidation of this work.

Conflicts of Interest

The authors declare no conflicts of interest. M.D., G.S.M., A.K. and D.J.K. are members of a company, BioForge Canada Limited. BioForge Canada Limited is a company that uses bioinformatics in immunological approaches in the monitoring, prevention, and treatment of infectious diseases. The authors disclose that the interests of BioForge Canada Limited had no impact on this study.

References

  1. Günther, T.; Haas, L.; Alawi, M.; Wohlsein, P.; Marks, J.; Grundhoff, A.; Becher, P.; Fischer, N. Recovery of the first full-length genome sequence of a parapoxvirus directly from a clinical sample. Sci. Rep. 2017, 7, 3734.
  2. Brüssow, H. Pandemic potential of poxviruses: From an ancient killer causing smallpox to the surge of monkeypox. Microb. Biotechnol. 2023, 16, 1723–1735.
  3. Yang, Z.; Gray, M.; Winter, L. Why Do Poxviruses Still Matter? Cell Biosci. 2021, 11, 96.
  4. 2022–2023 Mpox Outbreak Global Map. Available online: https://archive.cdc.gov/#/details?url=https://www.cdc.gov/poxvirus/mpox/response/2022/world-map.html (accessed on 21 September 2025).
  5. Dhar, A.D.; Werchniak, A.E.; Li, Y.; Brennick, J.B.; Goldsmith, C.S.; Kline, R.; Damon, I.; Klaus, S.N. Tanapox infection in a college student. New Engl. J. Med. 2004, 350, 361–366.
  6. Ginzburg, V.E.; Liauchonak, I. Human orf: A typical rash in an urban medical practice. Can. Fam. Physician 2017, 63, 769–771.
  7. Hebert, A.A.; Bhatia, N.; Del Rosso, J.Q. Molluscum Contagiosum: Epidemiology, considerations, treatment options, and therapeutic gaps. J. Clin. Aesthetic Dermatol. 2023, 16 (Suppl. 1), S4–S11.
  8. Balamurugan, V.; Venkatesan, G.; Bhanuprakash, V.; Singh, R.K. Camelpox, an emerging orthopox viral disease. Indian J. Virol. 2013, 24, 295–305.
  9. Bolte, A.L.; Meurer, J.; Kaleta, E.F. Avian host spectrum of avipoxviruses. Avian Pathol. 1999, 28, 415–432.
  10. Janeway, C.A., Jr.; Travers, P.; Walport, M. Immunobiology: The Immune System in Health and Disease, 5th ed.; T-cell Receptor Gene Rearrangement; Garland Science: New York, NY, USA, 2001.
  11. Delany, I.; Rappuoli, R.; Seib, K.L. Vaccines, reverse vaccinology, and bacterial pathogenesis. Cold Spring Harb. Perspect. Med. 2013, 3, a012476.
  12. Sette, A.; Rappuoli, R. Reverse vaccinology: Developing vaccines in the era of genomics. Immunity 2010, 33, 530–541.
  13. Martinez, G.S.; Perez-Rueda, E.; Kumar, A.; Dutt, M.; Maya, C.R.; Ledesma-Dominguez, L.; Casa, P.L.; Kumar, A.; de Avila e Silva, S.; Kelvin, D.J. CDBProm: The Comprehensive Directory of Bacterial Promoters. NAR Genom. Bioinform. 2024, 6, lqae018.
  14. Martinez, G.S.; Pérez-Rueda, E.; Sarkar, S.; Kumar, A.; de Ávila e Silva, S. Machine learning and statistics shape a novel path in archaeal promoter annotation. BMC Bioinform. 2022, 23, 171.
  15. Yang, Z.; Bogdan, P.; Nazarian, S. An in silico deep learning approach to multi-epitope vaccine design: A SARS-CoV-2 case study. Sci. Rep. 2021, 11, 3238.
  16. Keshavarzi Arshadi, A.; Webb, J.; Salem, M.; Cruz, E.; Calad-Thomson, S.; Ghadirian, N.; Collins, J.; Diez-Cecilia, E.; Kelly, B.; Goodarzi, H.; et al. Artificial Intelligence for COVID-19 Drug Discovery and Vaccine Development. Front. Artif. Intell. 2020, 3, 65.
  17. Webby, R.J.; Perez, D.R.; Coleman, J.S.; Guan, Y.; Knight, J.H.; Govorkova, E.A.; McClain-Moss, L.R.; Peiris, J.S.; Rehg, J.E.; Tuomanen, E.I.; et al. Responsiveness to a pandemic alert: Use of reverse genetics for rapid development of influenza vaccines. Lancet 2004, 363, 1099–1103.
  18. Masignani, V.; Pizza, M.; Moxon, E.R. The development of a vaccine against Meningococcus B using reverse vaccinology. Front. Immunol. 2019, 10, 751.
  19. Moxon, R.; Reche, P.A.; Rappuoli, R. Editorial: Reverse Vaccinology. Front. Immunol. 2019, 10, 2776.
  20. Martinez, G.S.; Dutt, M.; Kelvin, D.J.; Kumar, A. PoxiPred: An Artificial-Intelligence-Based Method for the Prediction of Potential Antigens and Epitopes to Accelerate Vaccine Development Efforts against Poxviruses. Biology 2024, 13, 125.
  21. Vita, R.; Mahajan, S.; Overton, J.A.; Dhanda, S.K.; Martini, S.; Cantrell, J.R.; Wheeler, D.K.; Sette, A.; Peters, B. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 2019, 47, D339–D343.
  22. He, Y.; Xiang, Z.; Mobley, H.L. Vaxign: The first web-based vaccine design program for reverse vaccinology and applications for vaccine development. BioMed Res. Int. 2010, 2010, 297505.
  23. Doytchinova, I.A.; Flower, D.R. VaxiJen: A server for prediction of protective antigens, tumour antigens and subunit vaccines. BMC Bioinform. 2007, 8, 4.
  24. Cuypers, B.; Rappuoli, R.; Brozzi, A. A Lean Reverse Vaccinology Pipeline with Publicly Available Bioinformatic Tools. Methods Mol. Biol. 2023, 2673, 341–356.
  25. D’Mello, A.; Ahearn, C.P.; Murphy, T.F.; Tettelin, H. ReVac: A reverse vaccinology computational pipeline for prioritization of prokaryotic protein vaccine candidates. BMC Genom. 2019, 20, 981.
  26. Ras-Carmona, A.; Lehmann, A.A.; Lehmann, P.V.; Reche, P.A. Prediction of B cell epitopes in proteins using a novel sequence similarity-based method. Sci. Rep. 2022, 12, 13739.
  27. Snyder, E.E.; Kampanya, N.; Lu, J.; Nordberg, E.K.; Karur, H.R.; Shukla, M.; Soneja, J.; Tian, Y.; Xue, T.; Yoo, H.; et al. PATRIC: The VBI PathoSystems Resource Integration Center. Nucleic Acids Res. 2007, 35 (Suppl. 1), D401–D406.
  28. Doneva, N.; Dimitrov, I. Viral immunogenicity prediction by machine learning methods. Int. J. Mol Sci. 2024, 25, 2949.
  29. Kiyotani, K.; Toyoshima, Y.; Nemoto, K.; Nakamura, Y. Bioinformatic prediction of potential T cell epitopes for SARS-Cov-2. J. Hum. Genet. 2020, 65, 569–575.
  30. Doytchinova, I.A.; Flower, D.R. Toward the Quantitative Prediction of T-Cell Epitopes: CoMFA and CoMSIA Studies of Peptides with Affinity for the Class I MHC Molecule HLA-A*0201. J. Med. Chem. 2001, 44, 3572–3581.
  31. Dimitrov, I.; Bangov, I.; Flower, D.R.; Doytchinova, I. AllerTOP v.2—A server for in silico prediction of allergens. J. Mol. Model. 2014, 20, 2278.
  32. Wei, L.; Ye, X.; Sakurai, T.; Mu, Z.; Wei, L. ToxIBTL: Prediction of peptide toxicity based on information bottleneck and transfer learning. Bioinformatics 2022, 38, 1514–1524.
  33. Souza, A.S.; Amorim, V.M.F.; Soares, E.P.; de Souza, R.F.; Guzzo, C.R. Antagonistic trends between binding affinity and drug-likeness in SARS-CoV-2 Mpro inhibitors revealed by machine learning. Viruses 2025, 17, 935.
  34. Kacen, A.; Javitt, A.; Kramer, M.P.; Morgenstern, D.; Tsaban, T.; Shmueli, M.D.; Teo, G.C.; Leprevost, F.D.V.; Barnea, E.; Yu, F.; et al. Post-translational modifications reshape the antigenic landscape of the MHC I immunopeptidome in tumors. Nat. Biotechnol. 2023, 41, 239–251.
  35. Pritam, M. Exploring the whole proteome of monkeypox virus to design B cell epitope-based oral vaccines using immunoinformatics approaches. Int. J. Biol. Macromol. 2023, 252, 126498.
Figure 1. The flowchart of the pipeline utilized in the present study to predict the antigens and linear T-cell epitopes, followed by the development of a comprehensive TCEPVDB. The PoxiPred [20] method was utilized to predict the antigens and T-cell linear epitopes
Figure 2. In (A), we show the UML entity relationship diagram of the tables modelled in TCEPVDB as SQL objects with their specific attributes and types of variables assigned in the database modelling. There are two tables, antigen and epitope. One antigen can be associated with 0 or many epitopes. In (B), we show the UML activity diagram of the workflow of a query to TCEPVDB. A query can be done either in the antigen or epitope model. The modelling of Figure 2 was done in the LucidChart diagramming tool.
Figure 3. View of an individual antigen and its associated epitopes in TCEPVDB: (A) Description of the protein entry; (B) Amino acid sequence of the protein; (C) Download option to save the results in the .csv format; and (D) Tabular representation of the epitopes result.
Figure 4. Conservation of epitopes across different poxvirus species. The heatmap shows, for each epitope predicted by PoxiPred and modeled in TCEPVDB, its presence or absence in the proteome of each poxvirus species considered in this study. The heatmap reports the absolute presence of epitopes across poxvirus species.
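The presence/absence compilation described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, assuming that an epitope counts as present in a species when its sequence occurs as an exact substring of at least one protein in that species' proteome. The caption does not state the matching rule the authors used, so the rule, the function names, and the toy sequences below are all assumptions.

```python
def presence_matrix(epitopes, proteomes):
    """Build {epitope: {species: 0 or 1}} from raw sequences.

    epitopes  -- iterable of epitope sequences (strings)
    proteomes -- dict mapping species name to a list of protein sequences

    Assumption: presence means an exact substring match in any protein.
    """
    matrix = {}
    for epitope in epitopes:
        matrix[epitope] = {
            species: int(any(epitope in protein for protein in proteins))
            for species, proteins in proteomes.items()
        }
    return matrix


def species_sharing(matrix):
    """Count, for each epitope, how many species contain it."""
    return {epitope: sum(row.values()) for epitope, row in matrix.items()}


# Toy data only; these are not real poxvirus sequences.
toy_proteomes = {
    "species_A": ["MKTAYIAKQRQISFVK", "MSLLTEVETYVLS"],
    "species_B": ["MKTAYIAKQRAAAAAA"],
    "species_C": ["GGGGGGGGGG"],
}
toy_epitopes = ["KTAYIAKQR", "LLTEVETYV", "WWWWWWWWW"]

toy_matrix = presence_matrix(toy_epitopes, toy_proteomes)
toy_sharing = species_sharing(toy_matrix)
```

Each row of `toy_matrix` corresponds to one row of such a heatmap. The per-epitope totals in `toy_sharing` give a simple conservation measure, where a higher count means the epitope is found in more species.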
Table 1. Protein repertoire, predicted antigens, and predicted LTCEs available in TCEPVDB.
| Organism | Genome Accession | Proteins (n) | Predicted Antigens (n) | Predicted Linear T-Cell Epitopes (n) |
|---|---|---|---|---|
| Amsacta moorei entomopox virus | GCF_000837185.1 | 294 | 157 | 1891 |
| Bovine papular stomatitis virus | GCF_000844045.1 | 130 | 61 | 932 |
| Canarypox virus | GCF_000841685.1 | 322 | 167 | 2387 |
| Choristoneura biennis entomopoxvirus | GCF_000909015.1 | 334 | 179 | 2341 |
| Eastern grey kangaroopox virus | GCF_006450915.1 | 162 | 82 | 1167 |
| Ectromelia virus | GCF_000841905.1 | 180 | 127 | 1714 |
| Goatpox virus | GCF_000840165.1 | 149 | 68 | 1092 |
| Horsepox virus | GCF_000860085.1 | 228 | 154 | 1844 |
| Lumpy skin disease virus | GCF_000839805.1 | 156 | 77 | 1193 |
| Molluscum contagiosum virus | GCF_000843325.1 | 163 | 59 | 895 |
| Mpox virus | GCF_000857045.1 | 183 | 134 | 1777 |
| Mule deerpox virus | GCF_000861985.1 | 169 | 81 | 1113 |
| Myxoma virus | GCF_000843685.1 | 158 | 73 | 924 |
| Orf virus | GCF_000844845.1 | 130 | 52 | 737 |
| Pseudocowpox virus | GCF_000886295.1 | 125 | 55 | 775 |
| Raccoonpox virus | GCF_001029045.1 | 207 | 128 | 1750 |
| Salmon gill poxvirus | GCF_001271235.1 | 210 | 105 | 1490 |
| Sea otter poxvirus | GCF_003260795.1 | 132 | 70 | 1074 |
| Sealpox virus | GCF_002219465.1 | 119 | 52 | 820 |
| Sheeppox virus | GCF_000840205.1 | 147 | 72 | 1113 |
| Squirrelpox virus | GCF_000913615.1 | 141 | 63 | 1083 |
| Swinepox virus | GCF_000839965.1 | 146 | 75 | 1183 |
| Tanapox virus | GCF_000847185.1 | 155 | 79 | 1140 |
| Taterapox virus | GCF_000869985.1 | 220 | 140 | 1761 |
| Turkeypox virus | GCF_001431935.1 | 170 | 92 | 1400 |
| Vaccinia virus | GCF_000860085.1 | 214 | 150 | 1851 |
| Variola virus | GCF_000859885.1 | 211 | 142 | 1663 |
| White-tailed deer poxvirus | MF966153 | 171 | 85 | 1167 |
| Yaba monkey tumor virus | GCF_000845705.1 | 140 | 59 | 902 |
| Yokapox virus | GCF_000892975.1 | 186 | 105 | 1410 |
| Camelpox virus | GCF_000839105.1 | 261 | 164 | 1887 |
| Cowpox virus | GCF_000839185.1 | 214 | 142 | 1959 |
| Finch poxvirus | OM869482 | 335 | 186 | 2414 |
| Fowlpox virus | GCF_000838605.1 | 251 | 146 | 1973 |
| Murmansk poxvirus | GCF_002270885.1 | 206 | 115 | 1647 |
| Penguinpox virus | GCF_000923135.1 | 242 | 137 | 1914 |
| Pigeonpox virus | GCF_000922075.1 | 224 | 133 | 1908 |
Table 2. A comparison of the TCEPVDB and other existing popular vaccine design tools and platforms.
| Name | Focus Organism | Type | Antigen Prediction Method | Epitope Type | Specificity to Poxviruses | Structural Integration | Output Option | User Interface |
|---|---|---|---|---|---|---|---|---|
| TCEPVDB | Poxviruses (n = 37) | Database (epitopes + antigen repository) | PoxiPred (ML-based, proteome-wide) | Linear T-cell epitopes | Yes | No, but output can be utilized for structure modeling | Tabular + downloadable .fasta | Web-based, with custom search |
| IEDB Epitope Tools | Broad (4700+ species) | Prediction + curated experimental database | Multiple ML-based tools (e.g., NetMHCpan) | T-cell, B-cell, MHC ligands | No, but information on multiple poxviruses can be extracted | Partial (some 3D epitope mapping) | Epitope lists, binding scores, and plots | Web-based, modular tools |
| Vaxign | Broad (bacteria, viruses, parasites) | Pipeline + database | Reverse vaccinology (genomic + ML filters) | T-cell (MHC I/II), B-cell | No | Partial (subcellular localization) | Ranked antigen list + epitope predictions based on the scores | Web-based, form-driven |
| VaxiJen | Broad (pathogen-agnostic) | Standalone prediction tool | Alignment-independent auto cross-covariance | No, only antigenicity scores | Can be used to predict antigenicity scores for individual proteins of poxviruses | No | Antigenicity score | Web-based, simple input |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Dutt, M.; Kumar, A.; Ostadgavahi, A.T.; Kelvin, D.J.; Martinez, G.S. TCEPVDB: Artificial Intelligence-Based Proteome-Wide Screening of Antigens and Linear T-Cell Epitopes in the Poxviruses and the Development of a Repository. Proteomes 2025, 13, 58. https://doi.org/10.3390/proteomes13040058
