AlphaFold2 Update and Perspectives

Access to the three-dimensional (3D) structural information of macromolecules is of major interest in both fundamental and applied research. Obtaining this experimental data can be complex, time consuming, and costly. Therefore, in silico computational approaches are an alternative of interest, and sometimes the only option. In this context, the Protein Structure Prediction method AlphaFold2 represented a revolutionary advance in structural bioinformatics. Named method of the year in 2021 and widely distributed by DeepMind and EBI, it led many to think at the time that protein-folding issues had been resolved. However, the reality is slightly more complex. Due to a lack of input experimental data, related to crystallographic challenges, some targets have remained highly challenging or not feasible. This perspective exercise, dedicated to a non-expert audience, discusses and places AlphaFold2 methodology in its proper context and, above all, highlights its use, limitations, and opportunities. After a review of the interest in the 3D structure and of the previous methods used in the field, AF2 is brought into its historical context. Its spatial interests are detailed before presenting precise quantifications showing some limitations of this approach and finishing with the perspectives in the field.


Foreword
The idea for this short perspective comes from multiple discussions about the real impact of AlphaFold2 (AF2) with fellow specialists, biologists, and students. We provide a simple but comprehensive overview, drawing on the expertise of researchers who deal with AF2 on a regular basis, for non-specialists such as medical doctors. AF2 has a wide variety of users. It is a method that has been discussed in an unparalleled way in recognized scientific journals (method of the year for Nature Methods [1], with a $3 million award for its designers [2]) and has reached non-specialists (e.g., the Times best inventions 2022 [3]). Statements asserting that 'It will change everything' [4] or 'DeepMind AI cracks 50-year-old problem of protein folding' [5] raise questions, especially when the reality and impact of the results differ from one research lab to another.
This strategic perspective exercise is articulated in four parts. First, we outline for the record the issues of interest in protein structure and the history of the field of three-dimensional (3D) structural model prediction. Second, we discuss more specifically the deep learning approaches in Structural Bioinformatics. Third, we present our ideas on the contributions and limitations of AF2. Finally, we conclude with perspectives for the evolution of the field.

Introduction

Proteins and 3D Structures
Proteins are composed of a succession of amino acids, essential biological molecules that are the building blocks of macromolecules. With 20 different types, these amino acids trigger the folding of the three-dimensional structure. The latter supports the biological functions of these entities. When the protein is small in size, it is called a peptide. In this case, its structure is often highly flexible and adaptable. In other cases, proteins can be of very large size (average size of human protein is about 800 residues). By knowing the protein structure, one can better understand what happens at the atomic level, e.g., the interaction between a parasite protein and a human cell or, in the case of a drug, with its target [6]. Figure 1 shows an example of an FDA-approved drug (namely Jakavi) associated with the Janus Kinase 2 protein [7,8]. This structural information allows us to precisely understand which atoms are interacting and, thus, which are essential to consider for drug design purposes.

Protein 3D structures have been deposited for more than 50 years in the Protein Data Bank (https://www.rcsb.org, accessed on 15 March 2023) [10][11][12]. Two other key databases are (i) the UniProt database (https://www.uniprot.org, accessed on 15 March 2023) [13] and (ii) the EMBL's European Bioinformatics Institute website (EBI, https://www.ebi.ac.uk, accessed on 15 March 2023) [14]. UniProt is particularly well known because of the large amount of information it holds. It includes almost 555,000 manually curated proteins. They are the reference proteins for biology, and since 2021, UniProt has also listed the best AlphaFold model generated for each entry in the "Structure" section. EBI hosts a large number of tools and databases, mainly concerning protein sequences. It has been central to the diffusion of the AlphaFold2 models.

Protein Structure Prediction
High-throughput sequencing approaches make it possible to obtain very large numbers of sequences. However, obtaining experimental structures remains highly complex (expensive and sometimes not possible, for example, for transmembrane proteins). As a consequence, the number of available structures remains very limited compared to the tremendous number of currently available sequences.
For more than 40 years, techniques have been developed to predict 3D structural models of proteins from their sequences alone. The first approaches were rather simple, namely homology/comparative modelling; the best-known and most successful tool in this field is Modeller from Sali's lab [15]. It has been maintained and improved since 1993 (https://salilab.org/modeller/, accessed on 15 March 2023) [16,17]. Its main principles, for a sequence of interest without a known structure, are (i) to find a sequence with a known 3D structure sharing a good sequence identity or similarity, i.e., that can be aligned; (ii) to use this sequence alignment to build a local structural analogue (performing a copy/paste from the sequence with a 3D structure to the query one); and (iii) to optimize the global fold. The closer the sequences are, the easier the process is. Similarly, the fewer the gaps (insertions/deletions), the easier it is. Specialists often manually adapt the alignment and add extra constraints, such as multiple templates or partial structures for specific portions of the query sequence [18]. Modeller is free for academics and has been incorporated in commercial solutions. An effective comparable approach, named SwissModel, is also freely available (https://swissmodel.expasy.org, accessed on 15 March 2023) [19,20].
This approach is based on the hypothesis of accumulation of neutral mutations during evolution, leading to the famous sentence: 'sequence is less conserved than the structure', as the structure must conserve its properties to provide the protein biological functions. However, as mutations accumulate, the alignment becomes impossible, even if the fold is preserved.
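The two criteria mentioned above for judging how easy a comparative modelling case is (sequence identity and number of gaps) can be illustrated with a minimal sketch. It assumes the query/template alignment is already available as two gapped strings of equal length; the function name `alignment_stats` and the toy sequences are hypothetical and are not part of Modeller or SwissModel.

```python
def alignment_stats(query_aln: str, template_aln: str) -> dict:
    """Summarise a pairwise alignment given as two equal-length gapped strings."""
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")

    aligned = identical = gap_columns = 0
    for q, t in zip(query_aln.upper(), template_aln.upper()):
        if q == "-" and t == "-":
            continue  # column empty in both sequences: ignore it
        if q == "-" or t == "-":
            gap_columns += 1  # insertion or deletion
            continue
        aligned += 1
        if q == t:
            identical += 1

    # Identity is computed over aligned (non-gap) columns only.
    percent_identity = 100.0 * identical / aligned if aligned else 0.0
    return {
        "aligned": aligned,
        "identical": identical,
        "gap_columns": gap_columns,
        "percent_identity": percent_identity,
    }


# Toy alignment (hypothetical sequences, for illustration only).
stats = alignment_stats("MKT-AYIAKQR", "MKTLAYLA-QR")
print(stats)
```

Several definitions of sequence identity exist (for instance, normalising by the length of the shorter sequence); the choice made here, aligned columns only, is one assumption among others.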
Following this idea, the next generation of Protein Structure Prediction (PSP) methods was developed and named the 'threading approach'. Using more complex compatibility measures, threading approaches are able to find structural compatibility: they calculate the energetic compatibility of the sequence with a fold when no alignment is possible using classical approaches. An iconic example of this method is Phyre2 (http://www.sbg.bio.ic.ac.uk/phyre2/, accessed on 15 March 2023) [21]. There are also some ab initio approaches based only on physical parameters, i.e., trying to mimic the protein folding process; however, these are mainly limited to small proteins [22].
A first generation of web servers providing consensus approaches combining homology/comparative modelling, threading, and ab initio methods has been proposed, e.g., the aptly named 'Frankenstein's monster' [23]. However, it was the second generation of web servers with de novo approaches that resulted in significant advances [24]. The de novo approaches used a large series of evolutionary, energetic, and optimisation processes. They required large clusters and considerable CPU power. Two main groups were leaders in this field: I-Tasser from Zhang's lab (https://zhanggroup.org/I-TASSER/, accessed on 15 March 2023) [25][26][27] and the Rosetta/Robetta suite from Baker's lab (https://robetta.bakerlab.org, accessed on 15 March 2023) [28][29][30]. These approaches rely on complex computational searches to find distant associations.

Recent Protein Structure Prediction Methods
Since the beginning of the 2010s, artificial neural networks (ANNs)/machine learning approaches have undergone a phenomenal evolution, with the emergence of deep learning (DL). ANNs typically consisted of a small number of successive layers, with thousands of parameters to be optimized. DL, thanks to new Graphics Processing Units (GPUs) and innovative developments, allowed the design of highly complex architectures with dozens of layers and tens of millions of parameters to be optimized. DL first became widely known through Go and chess games of human versus machine, as well as through the automatic recognition of faces proposed by Facebook (now called Meta) in 2011.
In 2018, AlphaFold1, developed by DeepMind (a subsidiary of Alphabet Inc., Mountain View, CA, USA), was the first complex DL method applied to PSP. Its performance was comparable to that of most approaches of the time [31,32], such as I-Tasser [27]. Two years later, the second version showed an exceptional improvement [33][34][35]. It moved beyond the classical methods, which is why we will discuss it in the following sections. Since then, several other similar approaches have been made available, such as RoseTTAfold by Baker's group [36] and ESMFold by Meta Platforms, Inc. [37]. However, they are considered less efficient.

Deep Learning in Structural Bioinformatics
Every two years since 1994, a competition known as the Critical Assessment of Structure Prediction (CASP, https://predictioncenter.org, accessed on 15 March 2023) has evaluated the methods of structural model prediction. For this purpose, researchers who obtain new protein structures during this two-year period do not deposit them directly in the PDB. They provide their sequences to the committee, and competitors have access to them in order to propose a structural model. Obviously, the model must be close to the experimental structure, which has not yet been revealed. Competitors deposit their proposals on a dedicated website, and these are evaluated by the committee [29,38].
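To give non-specialists an idea of how "closeness" between a model and the experimental structure can be quantified, the sketch below computes a simplified score in the spirit of GDT_TS, a measure widely used in CASP: the average fraction of Cα atoms lying within 1, 2, 4, and 8 Å of their experimental positions. It is only an illustration under strong assumptions: the two structures are taken to be already superposed and to contain the same residues in the same order, whereas the real measure searches for optimal superpositions. The function name and the toy coordinates are hypothetical.

```python
import math


def gdt_ts_like(model_ca, reference_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS-style score (0 to 100) on pre-superposed C-alpha coordinates."""
    if not model_ca or len(model_ca) != len(reference_ca):
        raise ValueError("both structures must contain the same, non-zero number of residues")

    # Distance between each modelled C-alpha and its experimental counterpart.
    distances = [math.dist(m, r) for m, r in zip(model_ca, reference_ca)]

    # Fraction of residues under each distance cutoff, then averaged.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(fractions)


# Toy example: four residues deviating by 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
score = gdt_ts_like(model, reference)
print(f"GDT_TS-like score: {score:.2f}")
```

A score close to 100 indicates that almost every residue is positioned within 1 Å of the experimental structure, whereas a single badly placed region lowers the score without cancelling the credit given to well-modelled parts.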
In 2018, AlphaFold1 was the first Deep Learning method present at CASP [31,32]. It showed very positive results, but older approaches gave comparable structural models in most cases. Two years later, DeepMind completed a large amount of refinement work by integrating several new pieces of information into their model architecture [39]. Surprisingly, the information and general parameters used were very similar to those of de novo approaches, i.e., sequence evolution and optimization of local and global folding [40]. These Deep Learning approaches derive from historical neural network approaches, but at a larger scale (millions of parameters optimized, thanks to the power of GPUs) [41].
The results of AlphaFold2 are very significant; it outdoes the second-best approach (from David Baker's group) by a large margin, and above all, it proposes models that are sometimes of atomistic quality (equivalent to the experimental structure) and thus useful for numerous purposes [32,33,35,42]. For example, AlphaFold2 models may be used for protein engineering and for predicting the structure of entirely computationally designed sequences. In addition, drug discovery could be an important field of application of this methodology. Because of their quality, models coming from this software could also be good starting points for molecular dynamics simulations or docking calculations. These few examples and the corresponding figures quickly spread on Twitter, then in the scientific media, and finally in the general media. AlphaFold2 was named Method of the Year by Nature Methods and Breakthrough of the Year by Science [1,43]. The original paper had 8965 citations in less than 2 years (Google Scholar, accessed 1 March 2023). The researchers behind the AlphaFold artificial-intelligence (AI) system won one of this year's US $3 million Breakthrough prizes, the most lucrative award in science [2]. AlphaFold2 now also supports the prediction of protein-protein multimeric complexes, with a version called AlphaFold-Multimer, with interesting results [44].
Another interesting point that contributes to its success is the possibility of installing the program locally on a computer. Though the device must have a good GPU and sufficient memory, this is attainable even for a small academic laboratory. Moreover, AlphaFold2 is faster than other methods. Indeed, it can propose a model within days, whereas some methods take weeks to produce a prediction [34]. In addition, researchers have made their personal AlphaFold2 installation available for free to the scientific community (ColabFold, based on Google Colab notebooks, https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/beta/AlphaFold2_advanced.ipynb, accessed on 15 March 2023 [45]), with great success. Similarly, the European Bioinformatics Institute (EBI) has proposed dedicated databases for all human proteins, as well as for other organisms. Those models were generated by the AlphaFold2 approach; eventually, the whole UniProt database encompassed, in the structure section, the AlphaFold results of about 200 million structural models (https://www.alphafold.ebi.ac.uk, accessed on 15 March 2023) [46,47].
More recently, competing approaches have been made available, such as RoseTTAfold [36,48] and ESMFold by Meta [37]. The latter group also made available more than 600 million structural models coming from various metagenomes [37]. These approaches are very similar to AF2 but, on average, perform less well.

Current Identified AlphaFold2 Limits

Introduction
The emergence of the AF2 algorithm was indeed game-changing in PSP, but there remain significant issues, despite the strong statements made by the non-specialist media [49]. Indeed, one question that may highlight some flaws of AF2 is as follows: does AF2 propose quality models for every potential protein? Here, we aim to answer this question by considering different types of proteins and their complexities. In addition, we look at several precise quantifications that will help us to have a better and less biased view of AF2.

Law and Order
An essential first point to remember is that 15 to 20% of residues do not belong to an experimentally defined ordered region but are disordered (highly flexible and therefore not experimentally characterized) [50].
AF2 provides a confidence index (pLDDT) that informs about the quality of the folding at the level of each residue. Numerous works have shown a correlation between low pLDDT and disorder [51][52][53]. Nonetheless, studies have shown that low pLDDT is associated not only with disorder but also with poor-quality prediction [54,55].
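In practice, AlphaFold2 model files in PDB format store the per-residue pLDDT in the B-factor column, so the confidence profile can be read with a few lines of code. The sketch below is a minimal illustration: it reads the Cα lines of a PDB-formatted text and reports the fraction of residues in each confidence band. The band limits (90, 70, 50) follow the colour scheme commonly used by the AlphaFold database; the function names and the four demonstration residues are hypothetical.

```python
def plddt_per_residue(pdb_text: str) -> list:
    """Read one pLDDT value per residue from the B-factor column of C-alpha atoms."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB fixed columns: atom name in 13-16, B-factor in 61-66 (1-based).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def plddt_bands(scores: list) -> dict:
    """Fraction of residues in each confidence band (assumed limits: 90, 70, 50)."""
    counts = {"very_high": 0, "confident": 0, "low": 0, "very_low": 0}
    for s in scores:
        if s > 90:
            counts["very_high"] += 1
        elif s > 70:
            counts["confident"] += 1
        elif s > 50:
            counts["low"] += 1
        else:
            counts["very_low"] += 1
    n = len(scores)
    return {band: (c / n if n else 0.0) for band, c in counts.items()}


def demo_ca_line(serial: int, res_seq: int, plddt: float) -> str:
    """Build a correctly aligned C-alpha ATOM record (demonstration data only)."""
    return "ATOM  {:5d}  CA  ALA A{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}".format(
        serial, res_seq, 0.0, 0.0, 0.0, 1.0, plddt
    )


demo_pdb = "\n".join(
    demo_ca_line(i + 1, i + 1, p) for i, p in enumerate([95.2, 82.0, 61.5, 35.0])
)
scores = plddt_per_residue(demo_pdb)
bands = plddt_bands(scores)
print(scores, bands)
```

As discussed above, a residue falling in the lowest bands cannot be interpreted on its own: it may be genuinely disordered or simply poorly predicted.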
It remains important to remember that AlphaFold2 learns from experimental data and, therefore, potentially also learns the errors associated with them. Three classes of proteins exist: (i) globular (98% of the PDB), (ii) transmembrane (2%), and (iii) fibrous (almost nothing). Only the globular proteins are strongly represented. For transmembrane proteins, this low number (with, in addition, much redundancy) is related to the experimental difficulties caused by the lipid bilayer. Consequently, the learning is unbalanced.
Finally, the CASP competition is always limited by its very low number of targets. It is therefore difficult to generalize its results simply and directly to all protein datasets.

A Few Pertinent Quantifications
Four primary studies provide a better understanding of the general quality of AlphaFold2 models:
(i) The first work, coming directly from the EBI with the help of DeepMind, made available structural models of 220 million sequences (including proteomes of interest, UniProt, and other sequences) [46,47]. With the analysis of the pLDDT, they clearly show that a third of the human proteome is of atomistic quality; 58% corresponds to a correct fold, and there is no clear information on the remaining 42%. This tendency is also found in several of the other available proteomes. Of course, this result may be partly due to disordered regions, but these are thought to be less numerous. Clearly, a significant number of transmembrane protein structures are not correctly predicted [56]. It was recently confirmed, with the analysis of nearly 700,000 domains provided by AF2, that only 52% of models were appropriate for analysis with the new CATH-annotation tool [57].
(ii) The second work came from a large academic consortium that independently evaluated the advances provided by this methodology compared to a recognized comparative modelling tool, namely SwissModel [19], with its own repository [58]. They selected 21 model species, corresponding to more than 365,000 proteins, i.e., twice the number of experimental structures and six times the number of unique proteins in the PDB. They analysed the SwissModel repository for 11 model species and compared it with the AF2 database. On average, AF2 models cover longer portions of the sequences (+44% of residues). Looking at high-quality regions (pLDDT > 90), an average of around 25% of the residues of the proteomes of the 11 model species are covered by AF2 with novel (not present in the SwissModel repository) and confident predictions [59]. This very elegant and rigorous study also shows, similarly to the previous analysis, that a large number of proteins are still not reachable. The surprise for non-specialists is that these are not only transmembrane proteins but also globular ones.
(iii) The third study analyses the local conformations of proteins and shows that, globally, the results are very good. However, surprisingly, some local conformations observed recurrently in all proteins are systematically associated with particularly low confidence scores [60]. These conformations are PolyProline II helices (important for protein-protein interactions) [61], γ-turns (present in many loops), and ω angles in cis conformation (often associated with proline) [62]. This analysis also suggests an under-representation of β-sheets compared to what should be observed, even though β-like conformations are present in large numbers and would only need to pair to form sheets.
(iv) The fourth study focuses on the position of the side chains, a complex subject due to their large panel of motions. The quality of the predictions at this level still leaves considerable room for improvement [63].
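For readers unfamiliar with the ω angle mentioned in the third study, it is the dihedral angle around the peptide bond, defined by the atoms Cα(i), C(i), N(i+1), and Cα(i+1): it is close to 180° in the usual trans conformation and close to 0° in the rare cis conformation. The following sketch computes a dihedral angle from four atomic positions and labels the bond. The 30° tolerance used to call a bond cis is an assumption made for this illustration, and the coordinates are idealised planar toy values, not taken from a real structure.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p0, p1, p2, p3) -> float:
    """Dihedral angle in degrees, in (-180, 180], defined by four points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)   # normals of the two planes
    norm_b1 = math.sqrt(_dot(b1, b1))
    m1 = _cross(n1, tuple(c / norm_b1 for c in b1))
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def peptide_bond_state(ca_i, c_i, n_next, ca_next, cis_tolerance=30.0) -> str:
    """Label a peptide bond from its omega angle (tolerance is an assumed value)."""
    omega = dihedral(ca_i, c_i, n_next, ca_next)
    return "cis" if abs(omega) < cis_tolerance else "trans"


# Idealised planar toy geometries: only the last atom changes side.
ca_i, c_i, n_next = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
omega_cis = dihedral(ca_i, c_i, n_next, (1.0, 1.0, 0.0))
omega_trans = dihedral(ca_i, c_i, n_next, (1.0, -1.0, 0.0))
print(omega_cis, peptide_bond_state(ca_i, c_i, n_next, (1.0, 1.0, 0.0)))
print(omega_trans, peptide_bond_state(ca_i, c_i, n_next, (1.0, -1.0, 0.0)))
```

Only the absolute value of ω is used for the classification, so the result does not depend on the sign convention chosen for the dihedral.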
As a partial conclusion of these four representative studies, we can assert the following:
(a) An undeniable strength of AlphaFold2 compared to its previous version is that it is made available in the form of a usable and stable GitHub repository, which does not require overly expensive and powerful computers. In addition, several academic groups provide their own AlphaFold system, called ColabFold, which can be used free of charge by the scientific community, but with a less rich database of protein structures compared to the real AlphaFold.
(b) AlphaFold2 can quickly model more proteins than previous approaches and, on average, with better quality. However, the number of attainable/usable proteins is lower than we would have expected from the assertion that "a 50-year-old problem was solved".
(c) One AF2 limitation that strongly affects the biomedical field is the poor quality of transmembrane protein models: even though the confidence indices of transmembrane segments can be good, the overall predicted topology may not be compatible with insertion within a membrane bilayer. Figure 2 shows the AF2 model proposed for Atypical Chemokine Receptor 1 (ACKR1), previously named Duffy Antigen for Chemokine. ACKR1 is a seven-transmembrane (TM) protein associated with malarial Plasmodium vivax infection [64,65]. This model is incomplete, but the segments are present. However, it is not possible to insert it in a membrane bilayer, e.g., with CHARMM-GUI [66]. Indeed, its topology does not allow any recognition of this portion as transmembrane by the webserver.
(d) Another limitation is that most proteins have ions and co-factors (such as FAD and NADPH); however, AlphaFold2 has not been trained to take them into account. In addition, in a certain number of cases, it is impossible to add them or to dock them. Thus, an external tool, called AlphaFill, has been dedicated to placing them and adding post-translational modifications such as glycosylation (often essential for the protein functions) on the models [68].
(e) As noted, AlphaFold2 is also pertinent for highlighting intrinsically disordered regions (IDRs). Nonetheless, sometimes AF2 provides protein models with regions that look like IDRs but are in reality not disordered (see Figure 3).

Perspectives
Future developments are not easy to predict, as all the structural information has already been used to develop the AlphaFold models. For example, DeepMind did not participate in the last CASP competition, and the results did not show new cutting-edge advances. Many academic teams have used the AlphaFold code and adapted it.
In fact, current and future concerns revolve around the proper use of AlphaFold2 models. As with the previous approaches, it is always advisable to evaluate the quality of the model obtained. The confidence index (pLDDT) allows this, in part. A structural model whose pLDDT exceeds 90 is particularly useful and most likely of atomistic quality, but such high confidence is not always reached.
For instance, Figure 4 shows a recent example of the Scianna blood group. The AlphaFold2 model (see Figure 4a) contains an overly long helix, and some domains known from the literature are not properly modelled. On the other hand, a manual human-supervised model (see Figure 4b) represents each region in a better way. Obviously, a longer time was needed to design it (analysis of different domains, evolution, prediction of properties, multiple optimisations, etc.). It should be noted that all other automatic approaches have the same issues [70].
Another important point is that the structural model, even of excellent quality, is not final. For instance, single nucleotide polymorphisms (i.e., those that change an amino acid) could be the driving force behind pathologies. A recent study showed that these point mutations placed on very good quality models do not provide information on the mechanism [71] but require additional processing [72]. Similarly, AF2 models are used to establish links with physico-chemical parameters and experiments [73,74]. TAGPPI extracts multi-dimensional features by employing a 1D convolution operation on protein sequences and graph learning methods on contact maps constructed from AlphaFold [75]. However, the use of AF2 is often indirect; e.g., PROST extracts several descriptors from the most promising sequence-based predictors, such as BoostDDG, SAAFEC-SEQ, and DDGun, but also from iFeature and AlphaFold2 descriptors [76]. Protein dynamics must be taken into account, as some effects can be long-range [77][78][79]. Some new DL developments are in progress [80,81]. It is essential to note that AF2 provides only a single topology, without giving any indication of conformational change, and that the notions of dynamics are to be taken into account to answer biological questions [82,83].
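To make the notion of a contact map more concrete, the sketch below derives one from the Cα coordinates of a structural model: two residues are considered in contact when their Cα atoms are closer than a distance cutoff. This is a generic illustration and not the TAGPPI implementation; the 8 Å cutoff, the function name, and the toy coordinates are assumptions.

```python
import math


def contact_map(ca_coords, cutoff=8.0, min_separation=1):
    """Binary residue-residue contact map from C-alpha coordinates.

    cutoff (angstroms) and min_separation (in residues) are assumed values.
    """
    n = len(ca_coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1  # the map is symmetric
    return cmap


# Toy chain: four residues on a straight line, 3.8 angstroms apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(coords)
for row in cmap:
    print(row)

# The same information as an edge list, the usual input of graph learning methods.
edges = [(i, j) for i in range(len(cmap)) for j in range(i + 1, len(cmap)) if cmap[i][j]]
print(edges)
```

Such a map reduces the 3D model to a graph whose nodes are residues and whose edges are spatial contacts, which is the kind of representation graph learning methods operate on.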
A question arises: Will AF2 have a negative impact on Structural Biology? Because of its existence, would it no longer be necessary to conduct X-ray crystallography or Nuclear Magnetic Resonance? The contribution of AF2 is twofold and positive in the short and medium term. First, AF2 helps experimenters to obtain their structures. AF2 models can be fitted directly into electron density maps and can solve complex cases [84][85][86]. Second, AF2 has brought back to the forefront the problem of proteins that it cannot yet predict. More experimental data are needed to make advancements in the space of known folds. Hence, crystallographic strategies and high-accuracy models are being continually developed. The AlphaFold framework helps crystallographers to concentrate their work on the most unresolved and challenging structures, e.g., coiled-coils inducing modulations and challenges in crystallographic method development.
A final question is the possibility of improvement of AF2 and its competitors. A first point to take into account is the question of the topology of deep learning approaches. New developments offer relative improvements and can allow small gains. During the last CASP, AF2 was not present. Approaches based on it, or similar to it, failed to show significant gains. The second essential point is the need for experimental data to increase our body of knowledge on difficult targets.

Conclusions
The interest in 3D structures includes the knowledge of the topology of proteins and the possibility of accessing the positions of the cofactors, ions, ligands, DNA, RNA, etc., that interact with them. In particular, 3D structures can facilitate the design of drugs (simple peptide, modified, or exclusively chemical types), bringing explanatory power regarding the functioning of the therapeutic molecule (which is currently required by the FDA).
AF2 has changed many things. However, as seen with the ACKR1 model of Figure 2 or the Scianna model of Figure 4, modelling research design has not changed greatly. Figure 5a shows the different steps of the classical protocol to provide correct 3D structural models, e.g., [70,87,88,89]. Figure 5b shows the current protocol; AF2 has not replaced everything but is added as a new step in the design.
To conclude, it has been proven that experimentally determined X-ray struct (3.5 Å resolution or better) are more reliable than AlphaFold2-computed structure m els.Experimental structure should be used preferentially whenever possible.Impr ment of AF2 models, even the top ones, is possible in the future [90].At first, it is essential to analyse the protein sequence and its properties, then, looking at the information available in the different databases, the evolution is determined.Following this, the first building of structural models can begin, with (if possible) comparative modelling, threading, and de novo approaches.The most important point in the analyses is that it must be critical.
To conclude, it has been proven that experimentally determined X-ray structures (3.5 Å resolution or better) are more reliable than AlphaFold2-computed structure models. Experimental structures should be used preferentially whenever possible. Improvement of AF2 models, even the top ones, is possible in the future [90].

Figure 1. Jakavi™ interacting with JAK2. Taken from the Protein Data Bank, PDB ID 6VGL is the Janus Kinase 2 (JAK2) JH1 domain in complex with Ruxolitinib (commercial name Jakavi) [7]. (a) Cartoon representation of JAK2 JH1, with Ruxolitinib in sphere representation; (b) zoom on the interaction zone; (c) addition of electrostatics and (d) zoom (red: negative charge; blue: positive charge). A JAK2 variant is a driver of myeloproliferative neoplasms (MPNs); this drug has been approved by the FDA against MPNs. Visualisation was done with PyMOL software [9].

Figure 3 presents the AF2 model of the E3 ubiquitin-protein ligase PPP1R11 (UniProt ID O60927), highlighted by Thornton and collaborators in [69]. This model is considered a poor-quality model because the protein is globular, whereas the model looks like a disordered protein.
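A quick numerical check can complement the visual impression of a disordered-looking model. The following is a minimal sketch, assuming the model file follows the AlphaFold Database convention of storing the per-residue confidence score (pLDDT) in the B-factor column of the PDB file; values below 50 are commonly read as very low confidence. The function names (`ca_plddt`, `summarise`) and the synthetic three-residue input are hypothetical and serve only as an illustration; no real PPP1R11 values are reproduced here.

```python
from statistics import mean


def ca_plddt(pdb_text):
    """Read per-residue pLDDT from the B-factor column of C-alpha atoms.

    Assumption: the file stores pLDDT in the B-factor field
    (PDB fixed-width columns 61-66), as AlphaFold Database models do.
    """
    scores = []
    for line in pdb_text.splitlines():
        # Atom name occupies columns 13-16 (0-based slice 12:16).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def summarise(scores, low_cutoff=50.0):
    """Return (mean pLDDT, fraction of residues below low_cutoff)."""
    if not scores:
        raise ValueError("no C-alpha atoms found")
    low = sum(1 for s in scores if s < low_cutoff)
    return mean(scores), low / len(scores)


def _atom_line(serial, name, res, resseq, b):
    """Build a synthetic, correctly aligned PDB ATOM line (illustration only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} A{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b:6.2f}")


# Synthetic input: three residues, one confident and two very low confidence.
demo_pdb = "\n".join([
    _atom_line(1, "N", "MET", 1, 92.5),
    _atom_line(2, "CA", "MET", 1, 92.5),
    _atom_line(3, "CA", "ALA", 2, 45.0),
    _atom_line(4, "CA", "GLY", 3, 30.0),
])

mean_plddt, low_fraction = summarise(ca_plddt(demo_pdb))
print(f"mean pLDDT = {mean_plddt:.1f}, fraction below 50 = {low_fraction:.2f}")
```

On a real model, a large fraction of residues below the cutoff would suggest treating the model with caution. It does not by itself distinguish a truly disordered protein from a globular protein that was poorly predicted, which is why the critical analysis stressed in this section remains necessary.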

Figure 5. Example of PSP protocols. (a) A classical protocol and (b) a new protocol encompassing the use of AF2. At first, it is essential to analyse the protein sequence and its properties; then, looking at the information available in the different databases, the evolution is determined. Following this, the first building of structural models can begin, with (if possible) comparative modelling, threading, and de novo approaches. The most important point is that the analyses must be critical.