Structural Bioinformatics and Deep Learning of Metalloproteins: Recent Advances and Applications
Abstract
1. Introduction
2. Structure-Based Definition of Metal Binding Sites (MBS)
3. Structure-Based Prediction of Metal Sites
3.1. Template-Based Methods
3.2. Random Forest Methods
4. Structural Comparison of the Metal Sites
5. Metalloprotein Databases
6. AI Methods Applied to Metalloproteins
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Foster, A.W.; Young, T.R.; Chivers, P.T.; Robinson, N.J. Protein metalation in biology. Curr. Opin. Chem. Biol. 2022, 66, 102095.
- Smethurst, D.G.J.; Shcherbik, N. Interchangeable utilization of metals: New perspectives on the impacts of metal ions employed in ancient and extant biomolecules. J. Biol. Chem. 2021, 297, 101374.
- Chandrangsu, P.; Rensing, C.; Helmann, J.D. Metal homeostasis and resistance in bacteria. Nat. Rev. Microbiol. 2017, 15, 338–350.
- Young, T.R.; Martini, M.A.; Foster, A.W.; Glasfeld, A.; Osman, D.; Morton, R.J.; Deery, E.; Warren, M.J.; Robinson, N.J. Calculating metalation in cells reveals CobW acquires CoII for vitamin B12 biosynthesis while related proteins prefer ZnII. Nat. Commun. 2021, 12, 1195.
- Begg, S.L. The role of metal ions in the virulence and viability of bacterial pathogens. Biochem. Soc. Trans. 2019, 47, 77–87.
- Hunsaker, E.W.; Franz, K.J. Emerging Opportunities To Manipulate Metal Trafficking for Therapeutic Benefit. Inorg. Chem. 2019, 58, 13528–13545.
- Andreini, C.; Cavallaro, G.; Lorenzini, S.; Rosato, A. MetalPDB: A database of metal sites in biological macromolecular structures. Nucleic Acids Res. 2013, 41, D312–D319.
- Putignano, V.; Rosato, A.; Banci, L.; Andreini, C. MetalPDB in 2018: A database of metal sites in biological macromolecular structures. Nucleic Acids Res. 2018, 46, D459–D464.
- Andreini, C.; Bertini, I.; Cavallaro, G.; Holliday, G.L.; Thornton, J.M. Metal-MACiE: A database of metals involved in biological catalysis. Bioinformatics 2009, 25, 2088–2089.
- Waldron, K.J.; Rutherford, J.C.; Ford, D.; Robinson, N.J. Metalloproteins and metal sensing. Nature 2009, 460, 823–830.
- Valasatava, Y.; Rosato, A.; Furnham, N.; Thornton, J.M.; Andreini, C. To what extent do structural changes in catalytic metal sites affect enzyme function? J. Inorg. Biochem. 2018, 179, 40–53.
- Ben-David, M.; Soskine, M.; Dubovetskyi, A.; Cherukuri, K.-P.; Dym, O.; Sussman, J.L.; Liao, Q.; Szeler, K.; Kamerlin, S.C.L.; Tawfik, D.S. Enzyme Evolution: An Epistatic Ratchet versus a Smooth Reversible Transition. Mol. Biol. Evol. 2019, 37, 1133–1147.
- Ridge, P.G.; Zhang, Y.; Gladyshev, V.N. Comparative genomic analyses of copper transporters and cuproproteomes reveal evolutionary dynamics of copper utilization and its link to oxygen. PLoS ONE 2008, 3, e1378.
- Zhang, Y.; Gladyshev, V.N. Comparative Genomics of Trace Elements: Emerging Dynamic View of Trace Element Utilization and Function. Chem. Rev. 2009, 109, 4828–4861.
- Andreini, C.; Bertini, I.; Rosato, A. A hint to search for metalloproteins in gene banks. Bioinformatics 2004, 20, 1373–1380.
- Andreini, C.; Banci, L.; Bertini, I.; Rosato, A. Zinc through the three domains of life. J. Proteome Res. 2006, 5, 3173–3178.
- Andreini, C.; Banci, L.; Bertini, I.; Elmi, S.; Rosato, A. Non-heme iron through the three domains of life. Proteins Struct. Funct. Bioinf. 2007, 67, 317–324.
- Zhang, Y.; Zheng, J. Bioinformatics of Metalloproteins and Metalloproteomes. Molecules 2020, 25, 3366.
- Zeng, X.; Cheng, Y.; Wang, C. Global Mapping of Metalloproteomes. Biochemistry 2021, 60, 3507–3514.
- Grosjean, N.; Blaby-Haas, C.E. Leveraging computational genomics to understand the molecular basis of metal homeostasis. New Phytol. 2020, 228, 1472–1489.
- Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583–589.
- Baek, M.; DiMaio, F.; Anishchenko, I.; Dauparas, J.; Ovchinnikov, S.; Lee, G.R.; Wang, J.; Cong, Q.; Kinch, L.N.; Schaeffer, R.D.; et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 2021, 373, 871–876.
- AlQuraishi, M. AlphaFold at CASP13. Bioinformatics 2019, 35, 4862–4865.
- Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Applying and improving AlphaFold at CASP14. Proteins Struct. Funct. Bioinf. 2021, 89, 1711–1721.
- Jones, D.T.; Thornton, J.M. The impact of AlphaFold2 one year on. Nat. Methods 2022, 19, 15–20.
- Laine, E.; Eismann, S.; Elofsson, A.; Grudinin, S. Protein sequence-to-structure learning: Is this the end(-to-end revolution)? Proteins Struct. Funct. Bioinf. 2021, 89, 1770–1786.
- Masrati, G.; Landau, M.; Ben-Tal, N.; Lupas, A.; Kosloff, M.; Kosinski, J. Integrative Structural Biology in the Era of Accurate Structure Prediction. J. Mol. Biol. 2021, 433, 167127.
- wwPDB consortium. Protein Data Bank: The single global archive for 3D macromolecular structure data. Nucleic Acids Res. 2019, 47, D520–D528.
- Andreini, C.; Bertini, I.; Cavallaro, G. Minimal functional sites allow a classification of zinc sites in proteins. PLoS ONE 2011, 10, e26325.
- Tran, J.B.; Krężel, A. InterMetalDB: A Database and Browser of Intermolecular Metal Binding Sites in Macromolecules with Structural Information. J. Proteome Res. 2021, 20, 1889–1901.
- Metzner, F.J.; Huber, E.; Hopfner, K.-P.; Lammens, K. Structural and biochemical characterization of human Schlafen 5. Nucleic Acids Res. 2022, 50, 1147–1161.
- Yamashita, M.M.; Wesson, L.; Eisenman, G.; Eisenberg, D. Where metal ions bind in proteins. Proc. Natl. Acad. Sci. USA 1990, 87, 5648–5652.
- Gregory, D.S.; Martin, A.C.; Cheetham, J.C.; Rees, A.R. The prediction and characterization of metal binding sites in proteins. Protein Eng. 1993, 6, 29–35.
- Nair, P.A.; Smith, P.; Shuman, S. Structure of bacterial LigD 3’-phosphoesterase unveils a DNA repair superfamily. Proc. Natl. Acad. Sci. USA 2010, 107, 12822–12827.
- Natarajan, A.; Dutta, K.; Temel, D.B.; Nair, P.A.; Shuman, S.; Ghose, R. Solution structure and DNA-binding properties of the phosphoesterase domain of DNA ligase D. Nucleic Acids Res. 2011, 40, 2076–2088.
- Babor, M.; Gerzon, S.; Raveh, B.; Sobolev, V.; Edelman, M. Prediction of transition metal-binding sites from apo protein structures. Proteins Struct. Funct. Bioinf. 2008, 70, 208–217.
- Goyal, K.; Mande, S.C. Exploiting 3D structural templates for detection of metal-binding sites in protein structures. Proteins Struct. Funct. Bioinf. 2008, 70, 1206–1218.
- Hu, X.; Dong, Q.; Yang, J.; Zhang, Y. Recognizing metal and acid radical ion-binding sites by integrating ab initio modeling with template-based transferals. Bioinformatics 2016, 32, 3260–3269.
- Yang, J.; Yan, R.; Roy, A.; Xu, D.; Poisson, J.; Zhang, Y. The I-TASSER Suite: Protein structure and function prediction. Nat. Methods 2015, 12, 7–8.
- Yang, J.; Roy, A.; Zhang, Y. Protein–ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment. Bioinformatics 2013, 29, 2588–2595.
- Lin, Y.F.; Cheng, C.W.; Shih, C.S.; Hwang, J.K.; Yu, C.S.; Lu, C.H. MIB: Metal Ion-Binding Site Prediction and Docking Server. J. Chem. Inf. Model. 2016, 56, 2287–2291.
- Lu, C.H.; Lin, Y.S.; Chen, Y.C.; Yu, C.S.; Chang, S.Y.; Hwang, J.K. The fragment transformation method to detect the protein structural motifs. Proteins 2006, 63, 636–643.
- Ajitha, M.; Sundar, K.; Arul Mugilan, S.; Arumugam, S. Development of METAL-ACTIVE SITE and ZINCCLUSTER tool to predict active site pockets. Proteins 2018, 86, 322–331.
- Rodríguez-Guerra Pedregal, J.; Sciortino, G.; Guasp, J.; Municoy, M.; Maréchal, J.D. GaudiMM: A modular multi-objective platform for molecular modeling. J. Comput. Chem. 2017, 38, 2118–2126.
- Sciortino, G.; Garribba, E.; Rodríguez-Guerra Pedregal, J.; Maréchal, J.D. Simple Coordination Geometry Descriptors Allow to Accurately Predict Metal-Binding Sites in Proteins. ACS Omega 2019, 4, 3726–3731.
- Sánchez-Aparicio, J.-E.; Tiessler-Sala, L.; Velasco-Carneros, L.; Roldán-Martín, L.; Sciortino, G.; Maréchal, J.-D. BioMetAll: Identifying Metal-Binding Sites in Proteins from Backbone Preorganization. J. Chem. Inf. Model. 2021, 61, 311–323.
- Babor, M.; Greenblatt, H.M.; Edelman, M.; Sobolev, V. Flexibility of metal binding sites in proteins on a database scale. Proteins 2005, 59, 221–230.
- Garg, A.; Pal, D. Inferring metal binding sites in flexible regions of proteins. Proteins Struct. Funct. Bioinf. 2021, 89, 1125–1133.
- Ireland, S.M.; Martin, A.C.R. Zincbindpredict—Prediction of Zinc Binding Sites in Proteins. Molecules 2021, 26, 966.
- Nguyen, H.; Kleingardner, J. Identifying metal binding amino acids based on backbone geometries as a tool for metalloprotein engineering. Protein Sci. 2021, 30, 1247–1257.
- Hirata, A.; Klein, B.J.; Murakami, K.S. The X-ray crystal structure of RNA polymerase from Archaea. Nature 2008, 451, 851–854.
- Lancaster, C.R.D.; Kröger, A.; Auer, M.; Michel, H. Structure of fumarate reductase from Wolinella succinogenes at 2.2 Å resolution. Nature 1999, 402, 377–385.
- Andreini, C.; Cavallaro, G.; Rosato, A.; Valasatava, Y. MetalS2: A tool for the structural alignment of minimal functional sites in metal-binding proteins and nucleic acids. J. Chem. Inf. Model. 2013, 53, 3064–3075.
- Valasatava, Y.; Andreini, C.; Rosato, A. Hidden relationship between metalloproteins unveiled by structural comparison of their metal sites. Sci. Rep. 2015, 5, 9486.
- Rosato, A.; Valasatava, Y.; Andreini, C. Minimal functional sites in metalloproteins and their usage in structural bioinformatics. Int. J. Mol. Sci. 2016, 17, 671.
- Valasatava, Y.; Rosato, A.; Cavallaro, G.; Andreini, C. MetalS3, a database-mining tool for the identification of structurally similar metal sites. J. Biol. Inorg. Chem. 2014, 19, 937–945.
- Andreini, C.; Arnesano, F.; Rosato, A. The Zinc Proteome of SARS-CoV-2. Metallomics 2022, 14, mfac047.
- He, W.; Liang, Z.; Teng, M.; Niu, L. mFASD: A structure-based algorithm for discriminating different types of metal-binding sites. Bioinformatics 2015, 31, 1938–1944.
- Li, G.; Dai, Q.-Q.; Li, G.-B. MeCOM: A Method for Comparing Three-Dimensional Metalloenzyme Active Sites. J. Chem. Inf. Model. 2022, 62, 730–739.
- Sippl, M.J.; Wiederstein, M. Detection of spatial correlations in protein structures and molecular complexes. Structure 2012, 20, 718–728.
- Wiederstein, M.; Sippl, M.J. TopMatch-web: Pairwise matching of large assemblies of protein and nucleic acid chains in 3D. Nucleic Acids Res. 2020, 48, W31–W35.
- Bromberg, Y.; Aptekmann, A.A.; Mahlich, Y.; Cook, L.; Senn, S.; Miller, M.; Nanda, V.; Ferreiro, D.U.; Falkowski, P.G. Quantifying structural relationships of metal-binding sites suggests origins of biological electron transfer. Sci. Adv. 2022, 8, eabj3984.
- Raanan, H.; Pike, D.H.; Moore, E.K.; Falkowski, P.G.; Nanda, V. Modular origins of biological electron transfer chains. Proc. Natl. Acad. Sci. USA 2018, 115, 1280–1285.
- Attwood, T.K.; Agit, B.; Ellis, L.B.M. Longevity of Biological Databases. EMBnet.J. 2015, 21, e803.
- Wren, J.D.; Georgescu, C.; Giles, C.B.; Hennessey, J. Use it or lose it: Citations predict the continued online availability of published bioinformatics resources. Nucleic Acids Res. 2017, 45, 3627–3633.
- Imker, H.J. 25 Years of Molecular Biology Databases: A Study of Proliferation, Impact, and Maintenance. Front. Res. Metr. Anal. 2018, 3, 18.
- Yang, J.; Roy, A.; Zhang, Y. BioLiP: A semi-manually curated database for biologically relevant ligand–protein interactions. Nucleic Acids Res. 2012, 41, D1096–D1103.
- Ireland, S.M.; Martin, A.C.R. ZincBind-the database of zinc binding sites. Database 2019, 2019, baz006.
- Kondo, H.X.; Kanematsu, Y.; Masumoto, G.; Takano, Y. PyDISH: Database and analysis tools for heme porphyrin distortion in heme proteins. Database 2020, 2020, baaa066.
- Jentzen, W.; Song, X.-Z.; Shelnutt, J.A. Structural Characterization of Synthetic and Protein-Bound Porphyrins in Terms of the Lowest-Frequency Normal Coordinates of the Macrocycle. J. Phys. Chem. B 1997, 101, 1684–1699.
- Zhang, H.; Chen, P.; Ma, H.; Woinska, M.; Liu, D.; Cooper, D.R.; Peng, G.; Peng, Y.; Deng, L.; Minor, W.; et al. virusMED: An atlas of hotspots of viral proteins. IUCrJ 2021, 8, 931–942.
- Zheng, H.; Shabalin, I.G.; Handing, K.B.; Bujnicki, J.M.; Minor, W. Magnesium-binding architectures in RNA crystal structures: Validation, binding preferences, classification and motif detection. Nucleic Acids Res. 2015, 43, 3789–3801.
- Zheng, H.; Cooper, D.R.; Porebski, P.J.; Shabalin, I.G.; Handing, K.B.; Minor, W. CheckMyMetal: A macromolecular metal-binding validation tool. Acta Crystallogr. Sect. D 2017, 73, 223–233.
- Laitaoja, M.; Valjakka, J.; Janis, J. Zinc coordination spheres in protein structures. Inorg. Chem. 2013, 52, 10983–10991.
- Choi, H.; Kang, H.; Park, H. MetLigDB: A web-based database for the identification of chemical groups to design metalloprotein inhibitors. J. Appl. Crystallogr. 2011, 44, 878–881.
- Li, G.; Su, Y.; Yan, Y.H.; Peng, J.Y.; Dai, Q.Q.; Ning, X.L.; Zhu, C.L.; Fu, C.; McDonough, M.A.; Schofield, C.J.; et al. MeLAD: An integrated resource for metalloenzyme-ligand associations. Bioinformatics 2020, 36, 904–909.
- Medina-Franco, J.L.; López-López, E.; Andrade, E.; Ruiz-Azuara, L.; Frei, A.; Guan, D.; Zuegg, J.; Blaskovich, M.A.T. Bridging informatics and medicinal inorganic chemistry: Toward a database of metallodrugs and metallodrug candidates. Drug Discov. Today 2022, 27, 1420–1430.
- Anthony, E.J.; Bolitho, E.M.; Bridgewater, H.E.; Carter, O.W.L.; Donnelly, J.M.; Imberti, C.; Lant, E.C.; Lermyte, F.; Needham, R.J.; Palau, M.; et al. Metallodrugs are unique: Opportunities and challenges of discovery and development. Chem. Sci. 2020, 11, 12888–12917.
- Yu, Y.; Wang, R.; Teo, R.D. Machine Learning Approaches for Metalloproteins. Molecules 2022, 27, 1277.
- Greener, J.G.; Moffat, L.; Jones, D.T. Design of metalloproteins and novel protein folds using variational autoencoders. Sci. Rep. 2018, 8, 16189.
- Koohi-Moghadam, M.; Wang, H.; Wang, Y.; Yang, X.; Li, H.; Wang, J.; Sun, H. Predicting disease-associated mutation of metal-binding sites in proteins using a deep learning approach. Nat. Mach. Intell. 2019, 1, 561–567.
- Nallapareddy, V.; Bogam, S.; Devarakonda, H.; Paliwal, S.; Bandyopadhyay, D. DeepCys: Structure-based multiple cysteine function prediction method trained on deep neural network: Case study on domains of unknown functions belonging to COX2 domains. Proteins 2021, 89, 745–761.
- Berardi, A.; Quilici, G.; Spiliotopoulos, D.; Corral-Rodriguez, M.A.; Martin-Garcia, F.; Degano, M.; Tonon, G.; Ghitti, M.; Musco, G. Structural basis for PHDVC5HCHNSD1–C2HRNizp1 interaction: Implications for Sotos syndrome. Nucleic Acids Res. 2016, 44, 3448–3463.
- Feehan, R.; Franklin, M.W.; Slusky, J.S.G. Machine learning differentiates enzymatic and non-enzymatic metals in proteins. Nat. Commun. 2021, 12, 3712.
- Varadi, M.; Anyango, S.; Deshpande, M.; Nair, S.; Natassia, C.; Yordanova, G.; Yuan, D.; Stroe, O.; Wood, G.; Laydon, A.; et al. AlphaFold Protein Structure Database: Massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 2022, 50, D439–D444.
- Perrakis, A.; Sixma, T.K. AI revolutions in biology: The joys and perils of AlphaFold. EMBO Rep. 2021, 22, e54046.
- Thornton, J.M.; Laskowski, R.A.; Borkakoti, N. AlphaFold heralds a data-driven revolution in biology and medicine. Nat. Med. 2021, 27, 1666–1669.
- Evans, R.; O’Neill, M.; Pritzel, A.; Antropova, N.; Senior, A.; Green, T.; Žídek, A.; Bates, R.; Blackwell, S.; Yim, J.; et al. Protein complex prediction with AlphaFold-Multimer. bioRxiv 2022.
- Hekkelman, M.L.; de Vries, I.; Joosten, R.P.; Perrakis, A. AlphaFill: Enriching the AlphaFold models with ligands and co-factors. bioRxiv 2021.
- van Beusekom, B.; Touw, W.G.; Tatineni, M.; Somani, S.; Rajagopal, G.; Luo, J.; Gilliland, G.L.; Perrakis, A.; Joosten, R.P. Homology-based hydrogen bond information improves crystallographic structures in the PDB. Protein Sci. 2018, 27, 798–808.
- Joosten, R.P.; Salzemann, J.; Bloch, V.; Stockinger, H.; Berglund, A.-C.; Blanchet, C.; Bongcam-Rudloff, E.; Combet, C.; Da Costa, A.L.; Deleage, G.; et al. PDB_REDO: Automated re-refinement of X-ray structure models in the PDB. J. Appl. Crystallogr. 2009, 42, 376–384.
- Altschul, S.F.; Gish, W.; Miller, W.; Myers, E.W.; Lipman, D.J. Basic local alignment search tool. J. Mol. Biol. 1990, 215, 403–410.
- Wehrspan, Z.J.; McDonnell, R.T.; Elcock, A.H. Identification of Iron-Sulfur (Fe-S) Cluster and Zinc (Zn) Binding Sites Within Proteomes Predicted by DeepMind’s AlphaFold2 Program Dramatically Expands the Metalloproteome. J. Mol. Biol. 2022, 434, 167377.
- Golinelli-Pimpaneau, B. Prediction of the Iron–Sulfur Binding Sites in Proteins Using the Highly Accurate Three-Dimensional Models Calculated by AlphaFold and RoseTTAFold. Inorganics 2022, 10, 2.
- Littmann, M.; Heinzinger, M.; Dallago, C.; Weissenow, K.; Rost, B. Protein embeddings and deep learning predict binding residues for various ligand classes. Sci. Rep. 2021, 11, 23916.
- Yang, K.K.; Wu, Z.; Bedbrook, C.N.; Arnold, F.H. Learned protein embeddings for machine learning. Bioinformatics 2018, 34, 2642–2648.
- Aptekmann, A.A.; Buongiorno, J.; Giovannelli, D.; Glamoclija, M.; Ferreiro, D.U.; Bromberg, Y. mebipred: Identifying metal-binding potential in protein sequence. Bioinformatics 2022, 38, btac358.
- Laveglia, V.; Giachetti, A.; Sala, D.; Andreini, C.; Rosato, A. Learning to Identify Physiological and Adventitious Metal-Binding Sites in the Three-Dimensional Structures of Proteins by Following the Hints of a Deep Neural Network. J. Chem. Inf. Model. 2022, 62, 2951–2960.
| Tool Name and Link | Implemented Approach | Reference |
|---|---|---|
| Template-based methods | | |
| | Identification of cavities with high hydrophobicity contrast | [33] |
| CHED | Identification of suitable arrangement(s) of triads of the CHED residues based on the distances between candidate donor atoms | [36] |
| IonCom https://zhanggroup.org/IonCom/ (accessed on 5 July 2022) | Integration of four structure-based predictors and a novel sequence-based predictor | [38] |
| MIB http://bioinfo.cmu.edu.tw/MIB/ (accessed on 5 July 2022) | Docking MBS templates with the fragment transformation method | [41] |
| ZINCCLUSTER http://www.metalactive.in/ (accessed on 5 July 2022) | Detection of known structural patterns | [43] |
| Predictive algorithm in the GaudiMM modeling suite | Identification of accessible cavities whose center of mass is within 3.5 Å of the β-carbon atoms of three or more CHED residues | [45] |
| BioMetAll https://github.com/insilichem/biometall (accessed on 5 July 2022) | Identification of cavities followed by their validation against pre-defined geometric patterns of the protein backbone | [46] |
| N.A. | Docking MBS templates with geometric hashing against an ensemble of 11 structural conformations for the query protein, generated with coarse-grained molecular mechanics | [48] |
| Random forest methods | | |
| Zincbindpredict https://zincbind.net/predict (accessed on 5 July 2022) | Application of a portfolio of predictive models, each optimized to detect a specific type of zinc-binding site. Each type corresponds to a different zinc-binding pattern. | [49] |
| | Prediction of positions where metal ligands can be introduced, based on protein backbone coordinates, to design artificial MPs | [50] |
| Structural comparison of metal sites | | |
| MetalS2 http://metalweb.cerm.unifi.it/tools/metals2/ (accessed on 5 July 2022) | Pairwise metal-centered superposition of MBSs based on a combination of sequence and structural similarity | [53] |
| MetalS3 http://metalweb.cerm.unifi.it/tools/metals3/ (accessed on 5 July 2022) | A web server using an optimized version of MetalS2 to search the MetalPDB database for MBSs structurally similar to the query | [56] |
| mFASD http://staff.ustc.edu.cn/~liangzhi/mfasd/ (accessed on 5 July 2022) | A structure-based algorithm to predict which metal populates an MBS based on systematic comparison against a template library | [58] |
| MeCOM https://mecom.ddtmlab.org (accessed on 5 July 2022) | Pairwise superposition of MBSs based on a combination of site features and the position of the Cα atoms | [59] |
| TopMatch + Sahle https://topmatch.services.came.sbg.ac.at (accessed on 5 July 2022) | Scoring of pairwise structural superpositions computed by the TopMatch tool, which ignores metal ions, with the sahle function to detect alignments having a good overlap of the MBSs | [62] |
| Metalloprotein databases | | |
| MetalPDB https://metalpdb.cerm.unifi.it/ (accessed on 5 July 2022) | MetalPDB collects structural information on all the MBSs present in the Protein Data Bank | [8] |
| BioLiP https://zhanggroup.org/BioLiP (accessed on 5 July 2022) | A database collecting structures of protein adducts, including metal-protein complexes | [67] |
| ZincBind https://zincbind.net (accessed on 5 July 2022) | A database specialized in zinc-binding sites, built on biological assemblies | [68] |
| PyDISH https://pydish.bio.info.hiroshima-cu.ac.jp (accessed on 5 July 2022) | PyDISH specializes in the analysis of heme-binding sites in PDB structures | [69] |
| VirusMED https://virusmed.biocloud.top (accessed on 5 July 2022) | A database of epitopes, drug-binding sites and metal-binding sites in viral proteins of known 3D structure | [71] |
| InterMetalDB https://intermetaldb.biotech.uni.wroc.pl (accessed on 5 July 2022) | A database of MBSs occurring at macromolecular interfaces, built on biological assemblies | [30] |
| MetLigDB http://silver.sejong.ac.kr/MetLigDB (accessed on 4 July 2022) | MetLigDB focuses on the structural and chemical properties of small molecules that bind directly to the metal ion(s) in MP structures | [75] |
| MeLAD https://melad.ddtmlab.org (accessed on 5 July 2022) | A database derived from the 3D structures of all metalloenzyme-ligand adducts, which integrates detailed analyses of metal-binding pharmacophores, metalloenzyme structural similarity and ligand chemical similarity | [76] |
| AI methods applied to metalloproteins | | |
| https://github.com/psipred/protein-vae (accessed on 5 July 2022) | Use of conditional variational autoencoders for the automated design of artificial metalloproteins | [80] |
| https://bitbucket.org/mkoohim/multichannel-cnn (accessed on 5 July 2022) | Identification of disease-related mutations through a multichannel convolutional neural network (MCCNN) | [81] |
| DeepCys https://deepcys.herokuapp.com/ (accessed on 5 July 2022) | Discrimination of four different cysteine roles, i.e., metal-binding, disulphide formation, sulphenylation and thioether | [82] |
| MAHOMES https://github.com/SluskyLab/MAHOMES (accessed on 5 July 2022) | Discrimination of enzymatic and non-enzymatic metals in MPs | [84] |
| AlphaFill https://alphafill.eu/ (accessed on 5 July 2022) | A database derived from AlphaFold predictions of apo-proteins where holo-structures of MPs have been reconstructed | [89] |
| bindEmbed21 https://github.com/Rostlab/bindPredict (accessed on 5 July 2022) | bindEmbed21 uses a combination of homology-based inference and a convolutional neural network to predict whether a protein residue binds to a metal ion, a nucleic acid, or a small molecule | [95] |
| mebipred https://services.bromberglab.org/mebipred (accessed on 5 July 2022) | Sequence-based prediction of MPs using a NN trained with information derived from 3D structures | [97] |
| https://github.com/cerm-cirmmp/MBSDL (accessed on 5 July 2022) | Discrimination of physiological and adventitious zinc-binding sites in MPs using a recurrent neural network (RNN) | [98] |
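To make one of the geometric criteria in the table concrete, the following is a minimal Python sketch of the rule summarized for reference [45]: a candidate point is kept when it lies within 3.5 Å of the β-carbon atoms of three or more CHED residues (Cys, His, Glu, Asp). It is not code from GaudiMM or any other listed tool. The function names, the input layout and the toy coordinates are hypothetical, and cavity detection and accessibility checks are deliberately left out.

```python
import math

# Three-letter codes of the residue types referred to as "CHED" in the table.
CHED = {"CYS", "HIS", "GLU", "ASP"}

# Hypothetical toy input: (residue name, residue number, C-beta coordinates in Å).
TOY_CB_ATOMS = [
    ("HIS", 94, (2.0, 0.0, 0.0)),
    ("HIS", 96, (0.0, 2.5, 0.0)),
    ("CYS", 119, (0.0, 0.0, 3.0)),
    ("LEU", 198, (1.0, 1.0, 1.0)),  # close, but not a CHED residue
    ("ASP", 50, (6.0, 0.0, 0.0)),   # CHED residue, but too far from the origin
]


def ched_neighbours(probe, cb_atoms, cutoff=3.5):
    """Return (name, number) of CHED residues whose C-beta is within `cutoff` Å of `probe`."""
    return [
        (name, number)
        for name, number, xyz in cb_atoms
        if name in CHED and math.dist(probe, xyz) <= cutoff
    ]


def is_candidate_site(probe, cb_atoms, cutoff=3.5, min_residues=3):
    """Apply the 'three or more CHED residues within the cutoff' rule to one probe point."""
    return len(ched_neighbours(probe, cb_atoms, cutoff)) >= min_residues


if __name__ == "__main__":
    for probe in [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0)]:
        print(probe, ched_neighbours(probe, TOY_CB_ATOMS), is_candidate_site(probe, TOY_CB_ATOMS))
```

In practice the probe points would come from a cavity-detection step and the β-carbon coordinates from a parsed structure file; both are assumed to be available here.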
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Andreini, C.; Rosato, A. Structural Bioinformatics and Deep Learning of Metalloproteins: Recent Advances and Applications. Int. J. Mol. Sci. 2022, 23, 7684. https://doi.org/10.3390/ijms23147684