Should We Expect a Second Wave of AlphaFold Misuse After the Nobel Prize?

de Brevern, Alexandre G.

doi:10.3390/biomedinformatics4040124

Open Access Editorial

by

Alexandre G. de Brevern

Université Paris Cité and Université de la Réunion, INSERM, BIGR, DSIMB Bioinformatics Team, F-75015 Paris, France

BioMedInformatics 2024, 4(4), 2306-2308; https://doi.org/10.3390/biomedinformatics4040124

Submission received: 2 November 2024 / Accepted: 29 November 2024 / Published: 6 December 2024

AlphaFold (AF) was the first deep learning tool to achieve exceptional fame in the field of biology [1]. To summarise its history, we must first recall the existence of the CASP (Critical Assessment of protein Structure Prediction) competition, in which individual prediction methods are evaluated on the protein structural models they propose. In 2018, the first version of AF obtained excellent results, close to those of the best approaches available at the time [2,3]. Two years later, in 2020, a particularly significant average improvement was observed [4,5]; then, with the communicative power of an Alphabet subsidiary behind it, media coverage of structural bioinformatics increased greatly.

Specialists in the field have had to analyse this phenomenon [6,7,8,9]. An essential aspect of this second version of AF is that all the tools were made available to the scientific community [10]. A large number of laboratories could therefore install and use it autonomously. This particularly positive aspect was further enhanced by the EBI’s AF2 database [11], which was initially dedicated to complete proteomes and later included hundreds of millions of sequences [12]. Moreover, for proteins whose structural models were not already available, online versions such as ColabFold could be employed to build a structural model using a simple web browser [13].

Experts in the field have also had to adapt to benefit from the real-world use of AF applications [14,15,16,17,18,19]. In discussions with non-specialists, who were repeating scientific and, more often, non-scientific articles without critical distance, I discovered that they very often assumed that, since the protein-folding problem had been solved, all structures were now available. I would like to point out two major problems with that statement: (i) the term ‘protein folding’ is wrong and should be ‘protein fold’, and (ii) because we regularly use the term ‘predicted structures’ instead of ‘structural models’, many people now use the term ‘structures’ for AF models in a problematic and ambiguous way.

In their view, this made experimental research almost useless and other predictive approaches obsolete. To repeat what a researcher (who shall remain nameless) told me in a conversation two months after the CASP results became known, ‘If I have a protein sequence, I, the non-specialist, can always get a nice visualisation of a real structure without tiring myself. The field of protein structural biology research is definitely over’.

Unfortunately, reality is more complicated. First, the DeepMind group themselves pointed out in their articles that only about one third of the residues were of atomistic quality, and that this figure only increased to 58% when considering a correct fold; thus, for the remaining 42%, local quality could not be assured [11]. Secondly, an independent consortium clearly showed that, for the human proteome for example, AF2 increased the number of proteins for which a structural model of interest could be proposed by 10% compared to a comparative modelling approach [20]. Finally, technical articles have highlighted practical problems related to unsuccessful AF results. Personally, I support the conclusions of the short article by Dame Janet Thornton and colleagues, which gives examples of the positives and negatives of structural models and includes an impressive final example [21].

For my own work, in recent proposals of structural models related to blood group issues, I have used approaches ranging from classical comparative modelling to deep learning methods. In the end, I retained the models that I built manually [22,23,24]. The proteins I am interested in are transmembrane proteins, so this was not an unreasonable approach. Similarly, in a more systematic way, the analysis of the whole human proteome showed systematic local biases for certain local conformations, such as γ-turns and especially cis ω-angle conformations [25]. Another underlying problem was the incomplete nature of the β-strands that form the β-sheets, so there is still room for improvement [25,26,27].

As I have edited and reviewed numerous manuscripts since the advent of this approach, I have had to temper the enthusiasm of non-specialists who have used the AF tool incorrectly. I would like to recount, anonymously, a heated exchange with researchers who had used AF models in conjunction with experiments. The evaluation was not favourable because they had used two structural models with confidence scores (pLDDT) of less than 30, i.e., a very simple case to analyse. Their response was that their models could not be criticised because they had been built by AF2. They had not understood that, although AF does represent a methodological revolution, as recognised by the recent Nobel Prize [28], it only represents an evolution in terms of application.
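
To make the pLDDT check in this anecdote concrete, here is a minimal Python sketch, not a validated pipeline. It relies on the fact that AF models store the per-residue pLDDT in the B-factor column of PDB-format files, and it uses the confidence bands displayed by the AlphaFold database (above 90, 70 to 90, 50 to 70, below 50). The function names and the `min_mean` cut-off of 70 are illustrative assumptions, not a community standard.

```python
from statistics import mean

# Confidence bands as displayed by the AlphaFold database (pLDDT on a 0-100 scale).
BANDS = [(90.0, "very high"), (70.0, "confident"), (50.0, "low"), (0.0, "very low")]


def read_plddt(pdb_text):
    """Return one pLDDT value per residue, read from the B-factor column of CA atoms."""
    values = []
    for line in pdb_text.splitlines():
        # PDB fixed-width format: atom name in columns 13-16, B-factor in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    return values


def band(plddt):
    """Map a pLDDT value to its confidence band label."""
    for lower, label in BANDS:
        if plddt >= lower:
            return label
    return "very low"


def summarise(values, min_mean=70.0):
    """Summarise per-residue pLDDT; `min_mean` is a hypothetical acceptance cut-off."""
    if not values:
        raise ValueError("no CA atoms found: is this a PDB-format model?")
    counts = {label: 0 for _, label in BANDS}
    for value in values:
        counts[band(value)] += 1
    fractions = {label: n / len(values) for label, n in counts.items()}
    average = mean(values)
    return {"mean": average, "fractions": fractions, "usable": average >= min_mean}
```

For a model in which every residue has a pLDDT below 30, `summarise(read_plddt(text))` would place all residues in the ‘very low’ band and return `usable` as `False`, which is the simple case described above. A high mean is still no guarantee of local quality, so the per-band fractions are worth inspecting as well.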

This recent and well-deserved award brings the focus back to phrases such as ‘protein folding problem solved’, ‘50-year-old problem solved’, etc. This renewed focus will lead us to rethink the critical evaluation of structural models. The very user-friendly aspects of tools such as ColabFold, for example, often make non-specialists forget that a protein is often in complex with other proteins. Several other problems are also easily overlooked:

Alternative splicing events are not taken into account if the correct sequence is not given;
Post-translational modifications, which are essential for function but may be poorly known, poorly determined, reversible, etc., are not taken into account by AF;
Proteins often require ions, cofactors and partners that can be very complex to introduce (the latest version of AF remains very limited in this respect);
A non-negligible part of the protein structure is disordered and therefore difficult to take into account, even when using AF;
A fixed structure with a nice visual appearance represents only part of the structural conformations of the protein of interest, especially in terms of conformers.
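
As a rough illustration of the disorder point in the list above: stretches of low pLDDT are commonly examined as candidate disordered regions, although low confidence covers more than disorder [26]. The Python sketch below simply locates such stretches in a list of per-residue pLDDT values. The function name, the cut-off of 50 and the minimum length of 5 residues are assumptions chosen for illustration, and a low score does not by itself prove disorder.

```python
def low_confidence_segments(plddt, cutoff=50.0, min_length=5):
    """Return (start, end) index pairs (0-based, inclusive) of runs with pLDDT below `cutoff`.

    Runs shorter than `min_length` residues are ignored.
    """
    segments = []
    start = None  # index where the current low-confidence run began, if any
    for i, value in enumerate(plddt):
        if value < cutoff:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at residue i - 1; keep it only if it is long enough.
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    # Handle a run that extends to the last residue.
    if start is not None and len(plddt) - start >= min_length:
        segments.append((start, len(plddt) - 1))
    return segments
```

Segments returned by such a function are places where the model should not be interpreted as a fixed structure without additional experimental or computational evidence.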

In conclusion, the structural bioinformatics community was pleased to be recognised with these new Nobel Prizes, following on from those awarded to Martin Karplus, Michael Levitt, and Arieh Warshel in 2013 [29], but once again we need to point out to non-specialists that there are always limitations, as we lack precise experimental data in a significant number of cases.

Funding

This Editorial was funded by a PHC ORCID grant n°49680ZC, which enabled me to visit the Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu City, Taiwan, a project supported by Campus France and the French Office in Taipei.

Acknowledgments

The author wishes to thank Jean-Christophe Gelly and Wei-Cheng Lo for multiple discussions on this topic, as well as for their support for the evolution of our field’s methodologies, applications and position in society.

Conflicts of Interest

The author, who is Editor-in-Chief of BioMedInformatics, declares no other conflicts of interest. The funders had no role in the design of the study; the collection, analysis, or interpretation of data; the writing of the manuscript, or the decision to publish the results.

References

1. Method of the year 2021: Protein structure prediction. Nat. Methods 2022, 19, 1.
2. Senior, A.W.; Evans, R.; Jumper, J.; Kirkpatrick, J.; Sifre, L.; Green, T.; Qin, C.; Žídek, A.; Nelson, A.W.R.; Bridgland, A.; et al. Protein structure prediction using multiple deep neural networks in the 13th critical assessment of protein structure prediction (casp13). Proteins 2019, 87, 1141–1148.
3. AlQuraishi, M. Alphafold at casp13. Bioinformatics 2019, 35, 4862–4865.
4. Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Highly accurate protein structure prediction with alphafold. Nature 2021, 596, 583–589.
5. Jumper, J.; Hassabis, D. Protein structure predictions to atomic accuracy with alphafold. Nat. Methods 2022, 19, 11–12.
6. Fersht, A.R. Alphafold—A personal perspective on the impact of machine learning. J. Mol. Biol. 2021, 433, 167088.
7. Mullard, A. What does alphafold mean for drug discovery? Nat. Rev. Drug Discov. 2021, 20, 725–727.
8. Cramer, P. Alphafold2 and the future of structural biology. Nat. Struct. Mol. Biol. 2021, 28, 704–705.
9. Xu, T.; Xu, Q.; Li, J. Toward the appropriate interpretation of alphafold2. Front. Artif. Intell. 2023, 6, 1149748.
10. Radjasandirane, R.; de Brevern, A.G. Alphafold2 for protein structure prediction: Best practices and critical analyses. Methods Mol. Biol. 2024, 2836, 235–252.
11. Tunyasuvunakool, K.; Adler, J.; Wu, Z.; Green, T.; Zielinski, M.; Žídek, A.; Bridgland, A.; Cowie, A.; Meyer, C.; Laydon, A.; et al. Highly accurate protein structure prediction for the human proteome. Nature 2021, 596, 590–596.
12. Varadi, M.; Anyango, S.; Deshpande, M.; Nair, S.; Natassia, C.; Yordanova, G.; Yuan, D.; Stroe, O.; Wood, G.; Laydon, A.; et al. Alphafold protein structure database: Massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 2022, 50, D439–D444.
13. Mirdita, M.; Schütze, K.; Moriwaki, Y.; Heo, L.; Ovchinnikov, S.; Steinegger, M. Colabfold—Making protein folding accessible to all. Nat. Methods 2022, 19, 679–682.
14. Yang, Z.; Zeng, X.; Zhao, Y.; Chen, R. Alphafold2 and its applications in the fields of biology and medicine. Signal Transduct. Target. Ther. 2023, 8, 115.
15. Peng, C.X.; Liang, F.; Xia, Y.H.; Zhao, K.L.; Hou, M.H.; Zhang, G.J. Recent advances and challenges in protein structure prediction. J. Chem. Inf. Model 2024, 64, 76–95.
16. Guo, H.B.; Perminov, A.; Bekele, S.; Kedziora, G.; Farajollahi, S.; Varaljay, V.; Hinkle, K.; Molinero, V.; Meister, K.; Hung, C.; et al. Alphafold2 models indicate that protein sequence determines both structure and dynamics. Sci. Rep. 2022, 12, 10696.
17. Tong, A.B.; Burch, J.D.; McKay, D.; Bustamante, C.; Crackower, M.A.; Wu, H. Could alphafold revolutionize chemical therapeutics? Nat. Struct. Mol. Biol. 2021, 28, 771–772.
18. Skolnick, J.; Gao, M.; Zhou, H.; Singh, S. Alphafold 2: Why it works and its implications for understanding the relationships of protein sequence, structure, and function. J. Chem. Inf. Model 2021, 61, 4827–4831.
19. Tourlet, S.; Radjasandirane, R.; Diharce, J.; de Brevern, A.G. Alphafold2 update and perspectives. BioMedInformatics 2023, 3, 378–390.
20. Akdel, M.; Pires, D.E.V.; Pardo, E.P.; Jänes, J.; Zalevsky, A.O.; Mészáros, B.; Bryant, P.; Good, L.L.; Laskowski, R.A.; Pozzati, G.; et al. A structural biology community assessment of alphafold2 applications. Nat. Struct. Mol. Biol. 2022, 29, 1056–1067.
21. Thornton, J.M.; Laskowski, R.A.; Borkakoti, N. Alphafold heralds a data-driven revolution in biology and medicine. Nat. Med. 2021, 27, 1666–1669.
22. Floch, A.; Lomas-Francis, C.; Vege, S.; Brennan, S.; Shakarian, G.; de Brevern, A.G.; Westhoff, C.M. A novel high-prevalence antigen in the lutheran system, luga (lu24), and an updated, full-length 3d bcam model. Transfusion 2023, 63, 798–807.
23. Floch, A.; Lomas-Francis, C.; Vege, S.; Burgos, A.; Hoffman, R.; Cusick, R.; de Brevern, A.G.; Westhoff, C.M. Two new scianna variants causing loss of high prevalence antigens: Ermap model and 3d analysis of the antigens. Transfusion 2023, 63, 230–238.
24. Floch, A.; Galochkina, T.; Pirenne, F.; Tournamille, C.; de Brevern, A.G. Molecular dynamics of the human rhd and rhag blood group proteins. Front. Chem. 2024, 12, 1360392.
25. de Brevern, A.G. An agnostic analysis of the human alphafold2 proteome using local protein conformations. Biochimie 2022, 207, 11–19.
26. Bruley, A.; Mornon, J.P.; Duprat, E.; Callebaut, I. Digging into the 3d structure predictions of alphafold2 with low confidence: Disorder and beyond. Biomolecules 2022, 12, 1467.
27. Bruley, A.; Bitard-Feildel, T.; Callebaut, I.; Duprat, E. A sequence-based foldability score combined with alphafold2 predictions to disentangle the protein order/disorder continuum. Proteins Struct. Funct. Bioinform. 2022, 91, 466–484.
28. Nobel Prize. 2024. Available online: https://www.nobelprize.org/prizes/chemistry/2024/press-release/ (accessed on 1 November 2024).
29. Jorgensen, W.L. Foundations of biomolecular modeling. Cell 2013, 155, 1199–1202.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
