De Novo Antimicrobial Peptide Design with Feedback Generative Adversarial Networks
Abstract
1. Introduction
- Introduce two alternative classifiers based on k-mers and transfer learning with ESM2 for identifying AMPs, surpassing the baseline FBGAN classifier in performance.
- Propose two alternative generative models, FBGAN-kmers and FBGAN-ESM2, that outperform FBGAN, highlighting the importance of incorporating advanced classifiers.
- Compare FBGAN-kmers and FBGAN-ESM2 with existing state-of-the-art methods for AMP generation, demonstrating competitive performance.
2. Results
2.1. Performance of Classifiers
2.2. Performance of Generative Models
3. Discussion
4. Materials and Methods
4.1. Datasets
Dataset Preparation and Standardization
4.2. Protein Sequence Representation and Classification
4.2.1. Proposed Data Representation via K-Mers Technique
- Dinucleotides or 2-mers (k = 2) represent pairs of adjacent amino acids in the protein sequence. They capture short-range interactions between amino acids and provide insights into local structural motifs such as helices and turns.
- Trinucleotides or 3-mers (k = 3), also known as codons, are sequences of three nucleotides that each encode a specific amino acid during protein synthesis. The frequency of each codon in a coding sequence can indicate the level of expression or translational efficiency of a gene. Tripeptides can capture more complex local structural features, including beta turns and secondary structure elements.
- Tetranucleotides or 4-mers (k = 4) can be used to infer the genomic signature of an organism, which is influenced by factors such as mutation, selection, and horizontal gene transfer [26].
- Pentanucleotides or 5-mers (k = 5) provide broader coverage of local sequence patterns and may capture more diverse structural and functional motifs, including protein–protein interaction sites or substrate binding sites.
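The k-mers representation listed above can be illustrated with a minimal sketch. It assumes nucleotide sequences over the alphabet A, C, G, T, which would give a vocabulary of 4^k k-mers (16, 64, 256, and 1024 for k = 2 to 5). The sliding window with a configurable stride and the function names `kmer_vocabulary` and `kmer_one_hot` are hypothetical choices made for illustration, not details taken from the paper.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: sequences are nucleotide strings


def kmer_vocabulary(k):
    """Map every possible k-mer to an index in lexicographic order."""
    return {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}


def kmer_one_hot(sequence, k, stride=1):
    """One-hot encode the k-mers of a sequence.

    Returns a list of vectors of length 4**k, one per window.
    stride=1 gives overlapping k-mers, stride=k non-overlapping ones;
    which of the two the authors use is not stated here.
    Characters outside ALPHABET raise a KeyError.
    """
    vocab = kmer_vocabulary(k)
    encoded = []
    for start in range(0, len(sequence) - k + 1, stride):
        vector = [0] * len(vocab)
        vector[vocab[sequence[start:start + k]]] = 1
        encoded.append(vector)
    return encoded


if __name__ == "__main__":
    vectors = kmer_one_hot("ATGC", k=2)
    # Index of the active entry for each of the 2-mers AT, TG, GC
    print([v.index(1) for v in vectors])
```

Each vector could then be fed to a recurrent classifier one time step at a time; larger k shortens nothing under stride 1 but widens each input vector by a factor of four.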
4.2.2. Transfer Learning with ESM2
4.2.3. Experimental Setup
4.2.4. Network Architecture and Parameter Tuning
4.2.5. Hyperparameter Tuning
- Number of layers:
- Hidden states:
- Learning rate:
- Batch size:
- Dropout:
4.2.6. Classification and Performance Metrics
4.3. Feedback GAN Architecture
4.3.1. Feedback-Loop Training
4.3.2. Evaluation of AMP Generative Models
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Zasloff, M. Antimicrobial peptides of multicellular organisms. Nature 2002, 415, 389–395.
2. Hancock, R.E.; Sahl, H.G. Antimicrobial and host-defense peptides as new anti-infective therapeutic strategies. Nat. Biotechnol. 2006, 24, 1551–1557.
3. Lei, J.; Sun, L.; Huang, S.; Zhu, C.; Li, P.; He, J.; Mackey, V.; Coy, D.H.; He, Q. The antimicrobial peptides and their potential clinical applications. Am. J. Transl. Res. 2019, 11, 3919.
4. Wu, Q.; Ke, H.; Li, D.; Wang, Q.; Fang, J.; Zhou, J. Recent progress in machine learning-based prediction of peptide activity for drug discovery. Curr. Top. Med. Chem. 2019, 19, 4–16.
5. Zervou, M.A.; Doutsi, E.; Tsakalides, P. Unleashing the Power of Artificial Intelligence for Personalised Drug Design. ERCIM News 2023, 134, 23–24.
6. Gupta, A.; Zou, J. Feedback GAN for DNA optimizes protein functions. Nat. Mach. Intell. 2019, 1, 105–111.
7. Van Oort, C.M.; Ferrell, J.B.; Remington, J.M.; Wshah, S.; Li, J. AMPGAN v2: Machine learning-guided design of antimicrobial peptides. J. Chem. Inf. Model. 2021, 61, 2198–2207.
8. Dean, S.N.; Alvarez, J.A.E.; Zabetakis, D.; Walper, S.A.; Malanoski, A.P. PepVAE: Variational autoencoder framework for antimicrobial peptide generation and activity prediction. Front. Microbiol. 2021, 12, 725727.
9. Szymczak, P.; Możejko, M.; Grzegorzek, T.; Jurczak, R.; Bauer, M.; Neubauer, D.; Sikora, K.; Michalski, M.; Sroka, J.; Setny, P.; et al. Discovering highly potent antimicrobial peptides with deep generative model HydrAMP. Nat. Commun. 2023, 14, 1453.
10. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144.
11. Lee, H.T.; Lee, C.C.; Yang, J.R.; Lai, J.Z.; Chang, K.Y. A large-scale structural classification of antimicrobial peptides. BioMed Res. Int. 2015, 2015, 475062.
12. Bhadra, P.; Yan, J.; Li, J.; Fong, S.; Siu, S.W. AmPEP: Sequence-based prediction of antimicrobial peptides using distribution patterns of amino acid properties and random forest. Sci. Rep. 2018, 8, 1697.
13. Veltri, D.; Kamath, U.; Shehu, A. Deep learning improves antimicrobial peptide recognition. Bioinformatics 2018, 34, 2740–2747.
14. Yan, J.; Bhadra, P.; Li, A.; Sethiya, P.; Qin, L.; Tai, H.K.; Wong, K.H.; Siu, S.W. Deep-AmPEP30: Improve short antimicrobial peptides prediction with deep learning. Mol. Ther.-Nucleic Acids 2020, 20, 882–894.
15. Burdukiewicz, M.; Sidorczuk, K.; Rafacz, D.; Pietluch, F.; Chilimoniuk, J.; Rödiger, S.; Gagat, P. Proteomic screening for prediction and design of antimicrobial peptides with AmpGram. Int. J. Mol. Sci. 2020, 21, 4310.
16. Chung, C.R.; Kuo, T.R.; Wu, L.C.; Lee, T.Y.; Horng, J.T. Characterization and identification of antimicrobial peptides with different functional activities. Briefings Bioinform. 2020, 21, 1098–1114.
17. Fingerhut, L.C.; Miller, D.J.; Strugnell, J.M.; Daly, N.L.; Cooke, I.R. ampir: An R package for fast genome-wide prediction of antimicrobial peptides. Bioinformatics 2020, 36, 5262–5263.
18. Lawrence, T.J.; Carper, D.L.; Spangler, M.K.; Carrell, A.A.; Rush, T.A.; Minter, S.J.; Weston, D.J.; Labbé, J.L. amPEPpy 1.0: A portable and accurate antimicrobial peptide prediction tool. Bioinformatics 2021, 37, 2058–2060.
19. Gawde, U.; Chakraborty, S.; Waghu, F.H.; Barai, R.S.; Khanderkar, A.; Indraguru, R.; Shirsat, T.; Idicula-Thomas, S. CAMPR4: A database of natural and synthetic antimicrobial peptides. Nucleic Acids Res. 2023, 51, D377–D383.
20. Yan, K.; Lv, H.; Guo, Y.; Peng, W.; Liu, B. sAMPpred-GAT: Prediction of antimicrobial peptide by graph attention network and predicted peptide structure. Bioinformatics 2023, 39, btac715.
21. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555.
22. Karlin, S.; Altschul, S.F. Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc. Natl. Acad. Sci. USA 1990, 87, 2264–2268.
23. Lin, Z.; Akin, H.; Rao, R.; Hie, B.; Zhu, Z.; Lu, W.; Smetanin, N.; Verkuil, R.; Kabeli, O.; Shmueli, Y.; et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 2023, 379, 1123–1130.
24. Wang, G.; Li, X.; Wang, Z. APD3: The antimicrobial peptide database as a tool for research and education. Nucleic Acids Res. 2016, 44, D1087–D1093.
25. The UniProt Consortium. UniProt: A worldwide hub of protein knowledge. Nucleic Acids Res. 2019, 47, D506–D515.
26. Abe, T.; Kanaya, S.; Kinouchi, M.; Ichiba, Y.; Kozuki, T.; Ikemura, T. Informatics for unveiling hidden genome signatures. Genome Res. 2003, 13, 693–702.
27. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11.
28. Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of Wasserstein GANs. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11.
29. Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein generative adversarial networks. In Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia, 6–11 August 2017; pp. 214–223.
30. Likic, V. The Needleman-Wunsch Algorithm for Sequence Alignment; Lecture given at the 7th Melbourne Bioinformatics Course; Bi021 Molecular Science and Biotechnology Institute, University of Melbourne: Melbourne, Australia, 2008; pp. 1–46.
31. Dumoulin, V.; Belghazi, I.; Poole, B.; Mastropietro, O.; Lamb, A.; Arjovsky, M.; Courville, A. Adversarially learned inference. arXiv 2016, arXiv:1606.00704.
32. Donahue, J.; Krähenbühl, P.; Darrell, T. Adversarial feature learning. arXiv 2016, arXiv:1605.09782.
33. Kingma, D.P.; Mohamed, S.; Jimenez Rezende, D.; Welling, M. Semi-supervised learning with deep generative models. Adv. Neural Inf. Process. Syst. 2014, 27, 1–9.
34. Melo, M.N.; Ferre, R.; Feliu, L.; Bardaji, E.; Planas, M.; Castanho, M.A. Prediction of antibacterial activity from physicochemical properties of antimicrobial peptides. PLoS ONE 2011, 6, e28549.
35. Kang, S.J.; Kim, D.H.; Mishig-Ochir, T.; Lee, B.J. Antimicrobial peptides: Their physicochemical properties and therapeutic application. Arch. Pharmacal Res. 2012, 35, 409–413.
36. Sitaram, N.; Nagaraj, R. Interaction of antimicrobial peptides with biological and model membranes: Structural and charge requirements for activity. Biochim. Biophys. Acta Biomembr. 1999, 1462, 29–54.
Hyperparameter | FBGAN classifier | k-mers-based classifier (k = 2) | k-mers-based classifier (k = 3) | k-mers-based classifier (k = 4) | k-mers-based classifier (k = 5) | ESM2-based classifier |
---|---|---|---|---|---|---|
Input size | 4 | 16 | 64 | 256 | 1024 | 480 |
Layers | 2 GRUs and 1 Dense | 1 GRU and 1 Dense | 1 GRU and 1 Dense | 1 GRU and 1 Dense | 1 GRU and 1 Dense | 2 Dense |
Hidden states | 128 | 64 | 64 | 64 | 64 | 128 |
Learning rate | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
Dropout | 0.3 in both layers | 0.3 | 0.2 | 0.3 | 0.3 | 0.2 |
Batch size | 64 | 64 | 8 | 64 | 32 | 16 |
Total number of parameters | 198.9 K | 15.8 K | 25.0 K | 61.9 K | 209.4 K | 61.8 K |
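As a rough sanity check on the parameter totals in the table above, the arithmetic for the k-mers-based classifiers can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes a single GRU layer with separate input and recurrent bias vectors (the layout used by PyTorch's `nn.GRU`), followed by one dense layer with a single output unit. The helper names are hypothetical.

```python
def gru_params(input_size, hidden_size):
    """Parameters of one GRU layer: three gates, each with input weights,
    recurrent weights, and two bias vectors (assumed layout)."""
    return 3 * (input_size * hidden_size
                + hidden_size * hidden_size
                + 2 * hidden_size)


def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


def kmer_classifier_params(k, hidden_size=64, alphabet_size=4):
    """1 GRU + 1 Dense, with input size alphabet_size**k and one output unit
    (the single output unit is an assumption)."""
    input_size = alphabet_size ** k
    return gru_params(input_size, hidden_size) + dense_params(hidden_size, 1)


if __name__ == "__main__":
    for k in (2, 3, 4, 5):
        print(k, kmer_classifier_params(k))
```

Under these assumptions the totals for k = 2, 3, and 4 are 15,809, 25,025, and 61,889, which round to the reported 15.8 K, 25.0 K, and 61.9 K. The k = 5 total is 209,345, slightly below the reported 209.4 K, so the exact layer configuration may differ in some detail.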
Metric | FBGAN | k-mers (k = 2) | k-mers (k = 3) | k-mers (k = 4) | k-mers (k = 5) | ESM2 |
---|---|---|---|---|---|---|
Accuracy | 81.4 (7.0) | 85.7 (1.4) | 85.6 (1.5) | 85.3 (1.0) | 82.3 (7.2) | 91.5 (0.7) |
Precision | 82.3 (2.8) | 84.3 (2.1) | 83.8 (1.8) | 82.9 (1.9) | 81.7 (5.1) | 90.6 (2.3) |
Recall | 80.5 (18.7) | 87.7 (3.1) | 88.3 (3.2) | 89.1 (1.8) | 85.0 (18.9) | 92.7 (1.7) |
F1-score | 79.2 (16.8) | 85.8 (1.5) | 85.9 (1.6) | 85.9 (0.8) | 80.8 (17.1) | 91.6 (0.6) |
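The four metrics in the table above have standard definitions in terms of confusion-matrix counts. The sketch below shows those definitions for a binary AMP versus non-AMP classifier; the function name `classification_metrics` and the convention of returning 0.0 when a denominator is zero are choices made for this illustration, not details from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion counts.

    tp: AMPs predicted as AMPs        fp: non-AMPs predicted as AMPs
    fn: AMPs predicted as non-AMPs    tn: non-AMPs predicted as non-AMPs
    Ratios with a zero denominator are reported as 0.0 (a convention
    chosen here for illustration).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(classification_metrics(tp=90, fp=10, fn=5, tn=95))
```

Because F1 is the harmonic mean of precision and recall, a classifier with unstable recall across runs will also show a large spread in F1, which is the pattern visible for the FBGAN and k = 5 columns.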
Sequence set | Diversity | Sequence similarity |
---|---|---|
Real Data | 98.5% | 24.3% |
FBGAN | 100% | 33.5% |
FBGAN-kmers | 100% | 33.4% |
FBGAN-ESM2 | 100% | 27.6% |
AMPGAN | 100% | 26.8% |
HydrAMP | 98.9% | 21.7% |
Model | CAMPR4 RF | CAMPR4 SVM | CAMPR4 ANN | Average among classifiers |
---|---|---|---|---|
FBGAN | 33.5% | 31.4% | 32.9% | 32.6% (0.8) |
FBGAN-kmers | 61.8% | 53.8% | 49.7% | 55.1% (5.0) |
FBGAN-ESM2 | 59.1% | 55.5% | 58.9% | 57.8% (1.6) |
AMPGAN | 64.6% | 53.8% | 56.6% | 58.3% (4.5) |
HydrAMP | 58.8% | 52.6% | 32.9% | 48.1% (11.0) |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zervou, M.A.; Doutsi, E.; Pantazis, Y.; Tsakalides, P. De Novo Antimicrobial Peptide Design with Feedback Generative Adversarial Networks. Int. J. Mol. Sci. 2024, 25, 5506. https://doi.org/10.3390/ijms25105506