Bacterial Immunogenicity Prediction by Machine Learning Methods
Faculty of Pharmacy, Medical University of Sofia, 1000 Sofia, Bulgaria
* Author to whom correspondence should be addressed.
Vaccines 2020, 8(4), 709; https://doi.org/10.3390/vaccines8040709
Received: 2 November 2020 / Revised: 19 November 2020 / Accepted: 24 November 2020 / Published: 30 November 2020
(This article belongs to the Special Issue Vaccine Evaluation Methods and Studies)
The identification of protective immunogens is the most important and demanding initial step in the lengthy and expensive process of vaccine design and development. Machine learning (ML) methods are very effective in data mining and in the analysis of big data such as microbial proteomes, and they can significantly reduce the experimental work needed to discover novel vaccine candidates. Here, we applied six supervised ML methods (partial least squares-based discriminant analysis, k nearest neighbor (kNN), random forest (RF), support vector machine (SVM), random subspace method (RSM), and extreme gradient boosting (xgboost)) to a set of 317 known bacterial immunogens and 317 bacterial non-immunogens and derived models for immunogenicity prediction. The models were validated by internal cross-validation in 10 groups on the training set and by an external test set. All of them showed good predictive ability, but the xgboost model displayed the most prominent ability to identify immunogens, recognizing 84% of the known immunogens in the test set. The combined RSM-kNN model was the best at recognizing non-immunogens, identifying 92% of them in the test set. The three best-performing ML models (xgboost, RSM-kNN, and RF) were implemented in the new version of the VaxiJen server, where the prediction of bacterial immunogens is now based on majority voting.
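The majority-voting step and the two recognition rates quoted above (the share of immunogens and the share of non-immunogens correctly identified) can be illustrated with a minimal Python sketch. It is not the VaxiJen implementation: the function names, the 1/0 label encoding, and the toy predictions are all assumptions made for illustration, and no real model is trained here.

```python
from typing import List, Sequence, Tuple


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine binary calls (1 = immunogen, 0 = non-immunogen) from several models.

    Assumes an odd number of models so that no tie can occur.
    """
    n_models = len(predictions)
    if n_models % 2 == 0:
        raise ValueError("use an odd number of models to avoid ties")
    # zip(*predictions) groups the votes that all models cast for the same protein.
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*predictions)]


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (fraction of immunogens recognized, fraction of non-immunogens recognized).

    Assumes both classes are present in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels and per-model calls, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
xgb_calls = [1, 1, 1, 0, 1, 0, 0, 0]
rsm_knn_calls = [1, 0, 1, 0, 0, 0, 0, 0]
rf_calls = [1, 1, 0, 1, 0, 0, 1, 0]

consensus = majority_vote([xgb_calls, rsm_knn_calls, rf_calls])
sensitivity, specificity = sensitivity_specificity(y_true, consensus)
print(consensus, sensitivity, specificity)
```

With three voters, a protein is called an immunogen only when at least two models agree, so a single model's false positive or false negative can be outvoted by the other two.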
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
- Supplementary File 1:
ZIP-Document (ZIP, 1803 KiB)
MDPI and ACS Style
Dimitrov, I.; Zaharieva, N.; Doytchinova, I. Bacterial Immunogenicity Prediction by Machine Learning Methods. Vaccines 2020, 8, 709. https://doi.org/10.3390/vaccines8040709