Making NSCLC Crystal Clear: How Kinase Structures Revolutionized Lung Cancer Treatment

Parallel advances across scientific fields have created a scenario in which collaboration is not a differential but a requirement. In this context, crystallography has made a major contribution to the medical sciences, providing a “face” for disease targets that were previously known solely by name or sequence. Worldwide, cancer remains one of the leading causes of death, with 9.6 million associated deaths annually, of which lung cancer accounts for 1.7 million. Since the relationship between cancer and kinases was unraveled, these proteins have been extensively explored and have become associated with drugs that later attained blockbuster status. Crystallographic structures of kinases related to lung cancer, and of their marketed drugs, have provided insight into kinase conformation in the absence or presence of small molecules. These structures were also of service once the initially highly successful drugs started to lose their effectiveness as mutations emerged. This review focuses on one subclass of lung cancer, non-small cell lung cancer (NSCLC), and on the major oncogenic driver mutations in kinases. It discusses how crystallographic structures can be used not only to understand the function and inhibition of these mutants, but also as a starting point for further computational studies aimed at addressing novel mutations in the field of personalized medicine.


Introduction
Cancer is a general term for a myriad of medical conditions that can affect different tissues in the body. A common characteristic is the abnormal growth of cells that later acquire the ability to spread to other tissues, in a process known as metastasis [1]. Worldwide, it is the second major cause of death, responsible for one in six deaths. Within cancer-related mortality, lung cancer is responsible for more than one million deaths annually, placing it at the top of the list of deadliest cancer types, and this number is likely to increase [2]. Lung cancer can be subdivided into Small Cell Lung Cancer (SCLC) and Non-Small Cell Lung Cancer (NSCLC), with the latter being diagnosed in around 85% of lung cancer patients [3]. Advances in molecular techniques unveiled details of genes acting as drivers in NSCLC, identifying a family of proteins responsible for controlling key features of cell development: the kinase family [4].

Kinases: A Structural Overview
The involvement of kinases in key regulatory aspects of cell biology stems from their ability to modulate other proteins through a phosphorylation reaction, in which the γ-phosphate group of an Adenosine Triphosphate (ATP) molecule is transferred to selected amino acids of a substrate. Phosphorylation is the most common protein modification in signal transmission, largely because it can be reversed through dephosphorylation by phosphatases [25].
The nature of the phosphorylated residue guides the classification of kinases into serine/threonine kinases or tyrosine kinases. However, a small group of proteins is able to target both threonine and tyrosine residues (dual specificity kinases), exemplified by the Dual Specificity Mitogen-Activated Protein Kinase 1 (MEK1) and 2 (MEK2) [25,26]. The structure of the kinase domain can be generically depicted as a bilobal structure with a larger C-terminal lobe, presenting several conserved α-helices and β-strands, connected through a hinge to a smaller N-terminal lobe, composed of a five-stranded antiparallel β-sheet (β1-β5) and a roving α-helix (αC-helix). In Figure 1, the crystal structure of the Hepatocyte Growth Factor Receptor (HGFR or c-MET) in the presence of ATP is used to exemplify the general kinase domain fold (Protein Data Bank (PDB): ID 3DKC).

Figure 1. (D) c-MET in complex with ATP (PDB: ID 3DKC) is used as a general representation of a kinase domain. The C-terminal lobe is colored blue with the hinge motif colored pink; the N-terminal lobe is colored yellow, with the backbone of the P-loop residues depicted as sticks and the regulatory αC-helix colored orange. Nitrogen and oxygen atoms are colored blue and red, respectively. The ATP molecule is depicted with carbon atoms in green and phosphorus atoms in orange; the magnesium ion is depicted as a green sphere.
The N-terminal lobe presents a conserved glycine-rich (GxGxΦG) loop, located between the β1- and β2-strands, which is responsible for positioning the β- and γ-phosphate groups of the ATP molecule for catalysis. The glycine-rich loop is also known as the G-loop or P-loop, with the latter being the most common name in kinase-related literature. The β1- and β2-strands also harbor the adenine moiety of ATP, contributing to its stabilization [27]. Within the N-terminal lobe, a characteristic interaction is often observed between a conserved lysine from the β3-strand and a glutamate residue in the αC-helix; this salt bridge is a precondition for the active state. The presence or absence of this interaction depends on the positioning of the roving αC-helix. For the kinases covered in this review, when the αC-helix is oriented toward the ATP binding pocket and the salt bridge is present, the conformation is classified as αC-in. The outward positioning of the helix and absence of the lysine-glutamate interaction is classified as αC-out.

EGFR mutations are one of the major causes of NSCLC formation and progression and appear more frequently in never or light smokers, women, and East Asian NSCLC patients [39]. In a healthy cell, the absence of extracellular stimuli drives EGFR monomers into an auto-inhibitory, tethered conformation in which the dimerization arm is buried [40,41]. Activating ligands bind bivalently to EGFR and trigger a large conformational change in the extracellular domain, in which the dimerization arm, a β hairpin-like motif, becomes exposed to the aqueous environment and is thus able to dimerize [42]. Recent reports show the presence of a mixed population of inactive monomers and homodimers in the cellular membrane. In the inactive dimer, the monomers adopt a symmetric configuration and are kept together through interactions of the intracellular domain, the transmembrane domain, and the C-terminal tail of the extracellular domain.
In response to ligand binding, a conformational change leads to a rotation of the transmembrane and intracellular domains of the receptor, yielding an active, asymmetric dimer [43].
The elucidated structure of the EGFR kinase domain agrees with the canonical structure previously described (Figure 1), with an α-helix-rich C-terminal lobe and an N-terminal lobe containing a five-stranded β-sheet, the P-loop, and the mobile αC-helix [44]. As shown in Figure 3, the plasticity of EGFR becomes evident when its active and inactive conformations are compared: for example, the activation segment adopts a clearly coiled conformation in the inactive form and an elongated loop in active EGFR [45,46,47].

Figure 3. The most common sequence deletion in exon 19, E746-A750, is colored orange. Comparison of regulatory motifs: (B) β1 and β2 strands along with the P-loop (residues G719-G724); (C) αC-helix (residues P741-A767); (D) activation segment starting with the DFG motif (residues D855/F856/G857) and ending with the AxE motif (A882/L883/E884). Regulatory motifs are highlighted and aligned with the inactive conformation (PDB: ID 1XKK) in gray. (E) K745 from the β3 strand engaging in a salt bridge with E762 in the αC-helix.
In the binding of ATP and its analogues to EGFR, the aromatic N1 nitrogen atom of the adenine ring serves as a hydrogen bond acceptor for the backbone amino group of M793, while the N6 amino group serves as a hydrogen bond donor for Q791 (PDB: ID 2GS6) [45,48]. A combination of MD simulations with the Molecular Mechanics Generalized Born Surface Area (MMGBSA) method was employed to elucidate the specific structural elements that stabilize ATP binding in both the active and inactive conformations. MMGBSA allows the binding free energy of protein-ligand complexes to be assessed at a modest computational cost, and it has a proven record of improving the evaluation of small ligands binding to biomacromolecules [49]. In the EGFR kinase domain, ATP binding is stabilized by hydrogen bonds and salt bridges between the negatively charged phosphate group and residues K745, R841, and D855, in both the active and inactive states throughout the simulations. Furthermore, in the inactive state, an ionic bond is formed between the negatively charged phosphate and Mg2+, which is correctly oriented by D855 and N842 [50,51].
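For reference, the MMGBSA estimate mentioned above is usually written in the general form below. This is the standard textbook decomposition, given as a sketch rather than as the specific protocol of the cited studies; the symbols are introduced here for illustration and do not appear in the original text.

```latex
\Delta G_{\mathrm{bind}}
  = \langle G_{\mathrm{complex}} \rangle
  - \langle G_{\mathrm{receptor}} \rangle
  - \langle G_{\mathrm{ligand}} \rangle ,
\qquad
G = E_{\mathrm{MM}} + G_{\mathrm{GB}} + G_{\mathrm{SA}} - T S
```

Here E_MM is the molecular mechanics energy (bonded, electrostatic, and van der Waals terms), G_GB is the polar solvation term from the Generalized Born model, G_SA is the nonpolar term estimated from the solvent-accessible surface area, and TS is an entropic contribution that is often omitted when only relative binding energies are compared. The angle brackets denote averages over snapshots taken from the MD trajectory.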
The crystal structure of the N-terminal kinase domain of EGFR resembles that of other RTKs such as the Insulin Receptor Kinase (IRK) and the Fibroblast Growth Factor Receptor Kinase (FGFRK) [52]. However, the C-lobe kinase domain adopts a prototypical structure and activation mechanism, not previously observed in other RTKs. Typically, RTKs require phosphorylation on a conserved tyrosine residue located on the activation loop, which causes the conformational shift of the activation segment. This conformation is compatible with the active receptor and permits ATP binding [50]. However, crystallographic studies of the kinase domain of EGFR showed that the A-loop often adopts an active conformation and does not require phosphorylation of the conserved Y869 residue on the activation loop to become activated [53]. Consistent with this, a mutagenesis study provided evidence that Y869F substitution had no adverse effect on kinase activity [54].
In addition to the conformational changes required for EGFR activation, a dimerization process is also necessary. As previously mentioned, inactive EGFR is often found as symmetric dimers, while activation is characterized by an asymmetric complex of two monomeric units. In an EGFR-EGFR dimer, one of the units acts as the active partner, contributing its N-terminal lobe, while the other is the passive partner, contributing its C-terminal lobe. The dimer interface is governed by hydrophobic interactions involving residues L704, I706, L760, L782, and V786 from the active partner, and I941, Y944, M945, V948, and M952 from the passive partner, which contributes its C-lobe. Formation of the asymmetric dimer results in transphosphorylation between the kinase domains and their activation [45].
Early attempts to crystallize the monomeric inactive EGFR TK domain were challenging due to the spontaneous dimerization of EGFR at high concentrations. Wood et al. elucidated the first inactive structure in the presence of the EGFR inhibitor lapatinib [55]. Lapatinib is an ATP-competitive drug with a 4-anilinoquinazoline scaffold designed to interact with the hinge motif. Lapatinib explores the allosteric pocket guarded by the gatekeeper residue with an extended moiety, using the hydrophobicity of phenyl rings substituted with chlorine and fluorine atoms to form van der Waals interactions, as depicted in Figure 4 (PDB: ID 1XKK) [56]. The inactive conformation closely resembles the structures of inactive Src and Cyclin-Dependent Kinase (CDK) proteins, mainly due to the positioning of the αC-helix [55].
Figure 4. Two-dimensional representation of lapatinib and the first-generation inhibitors gefitinib and erlotinib. The 4-anilinoquinazoline scaffold, common to all three drugs, engages in hydrogen bonds with hinge residue M793, while hydrophobic substituents contribute van der Waals interactions with residues from the allosteric pocket, such as L788, V766, and L858. Interactions were analyzed and plotted using Discovery Studio Visualizer [57]. Atom color scheme: carbon (grey), nitrogen (blue), oxygen (red), sulfur (beige), chlorine (green), and fluorine (light blue).
Crystals 2020, 10, 725

A more elegant methodology has been employed to study the inactive TK domain, in which the ATP nucleotide analog AMPPNP is combined with the incorporation of the V948R point mutation (PDB: ID 2GS7). This mutation lies in the dimer interface and disrupts the hydrophobic interactions between the active and the receiver units by introducing the polar residue arginine. Consequently, the mutation restricts the dimerization of ErbB members and allows the study of an otherwise wild-type (WT) enzyme in its monomeric form [45].
The first crystallographic structure obtained in the active state was WT EGFR in the apo form (PDB: ID 1M14). In the active EGFR TK domain conformation, the αC-helix becomes highly structured, adopting an αC-in conformation allowing the formation of the catalytically important salt bridge (E762-K745). Furthermore, the A-loop is positioned away from the active site, allowing substrate access to the catalytic site [52].
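The αC-in/αC-out distinction described above can be checked numerically on any structure by measuring the lysine-glutamate distance. The following is a minimal sketch, assuming that coordinates (in Å) for the K745 side-chain nitrogen (NZ) and the E762 carboxylate oxygens have already been extracted from a PDB file. The 4.0 Å cutoff, the function names, and the example coordinates are illustrative assumptions, not values taken from the cited structures.

```python
import math

# Assumed distance cutoff (in angstroms) below which a Lys N-zeta / Glu
# carboxylate-oxygen pair is counted as a salt bridge. Illustrative choice.
SALT_BRIDGE_CUTOFF = 4.0


def distance(a, b):
    """Euclidean distance between two 3D points given as (x, y, z) tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def classify_alpha_c(lys_nz, glu_oxygens, cutoff=SALT_BRIDGE_CUTOFF):
    """Label a structure 'alphaC-in' if any Glu carboxylate oxygen lies
    within `cutoff` of the Lys NZ atom, otherwise 'alphaC-out'.
    Returns the label together with the shortest N-O distance."""
    shortest = min(distance(lys_nz, oxygen) for oxygen in glu_oxygens)
    label = "alphaC-in" if shortest <= cutoff else "alphaC-out"
    return label, shortest


# Hypothetical coordinates, made up for illustration only.
k745_nz = (10.0, 5.0, 3.0)
e762_close = [(12.0, 6.0, 4.0), (13.5, 5.0, 3.0)]  # helix rotated inward
e762_far = [(20.0, 9.0, 8.0), (21.0, 10.0, 8.5)]   # helix displaced outward

print(classify_alpha_c(k745_nz, e762_close))
print(classify_alpha_c(k745_nz, e762_far))
```

The salt bridge is only one marker of the active state; a fuller classification would also consider the orientation of the helix itself and the conformation of the A-loop, which this sketch deliberately ignores.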
The increased availability of EGFR structures enabled more in-depth computational studies of the dynamics of EGFR, both as a monomer and as a member of the activated dimer. A set of 25 EGFR crystallographic structures was used to build an ensemble that was then submitted to molecular dynamics simulations and analyzed as standalone kinase domains or as active/passive components of an EGFR homodimer. This extensive study showed that asymmetric dimerization might not only stabilize the active conformation, thus organizing the ATP binding site, but also affect regions distal from the dimerization interface, such as the A-loop. One concern is the duration of the simulations, here 100 ns, which is potentially insufficient to accurately model the intrinsically disordered intermediate between the active and inactive conformations. Songtawee et al. show that MD simulations can sample different experimentally obtained conformations by comparing the MD output with the aforementioned structures. The high level of similarity between MD outputs and crystallographic data strengthens the reliability of computational methods such as molecular dynamics for exploring kinase dynamics [58].
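One common way to compare MD output with crystal structures, as described above, is the root-mean-square deviation (RMSD) between matching atoms. The sketch below is an illustration only and is not the analysis pipeline of the cited study: it assumes that the structures have already been superposed and that atoms are listed in the same order. The function names and the three-atom coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of
    (x, y, z) tuples. Assumes the structures are already superposed and
    that atoms are listed in the same order."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def closest_reference(frame, references):
    """Return (label, rmsd) for the reference structure closest to one
    MD frame. `references` maps a label to a coordinate list."""
    return min(
        ((name, rmsd(frame, ref)) for name, ref in references.items()),
        key=lambda item: item[1],
    )


# Hypothetical three-atom coordinates, for illustration only.
references = {
    "active_like": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)],
    "inactive_like": [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 2.5, 0.0)],
}
md_frame = [(0.1, 0.0, 0.0), (1.4, 0.1, 0.0), (3.0, 0.2, 0.0)]

print(closest_reference(md_frame, references))
```

In practice the comparison would be run over every frame of a trajectory and restricted to a region of interest, such as the A-loop or the αC-helix, after a least-squares superposition on the rest of the kinase domain.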

[Table header: Mutations | 1st Generation | 2nd Generation | 3rd Generation]
The two most commonly found activating mutations, L858R and exon 19 deletions, account for over 90% of all sensitizing mutations and have thus been termed "classical" activating mutations [88]. L858 is located towards the N-terminus of the activation segment, directly neighboring the DFG motif. In the inactive conformation, L858 is part of a hydrophobic cluster composed of hydrophobic and/or aromatic residues (F723, L747, M766, and L788) that is responsible for stabilizing the coiled state of the activation segment. Structures of L858R mutants are available with a variety of ligands, from ATP analogs (PDB: ID 2EB3) to natural products (PDB: ID 2ITU). These ligand complexes all share the same active conformation, with an uncoiled A-loop and an αC-in helix [44,65].
The L858R point mutation is characterized by a 10- to 100-fold increase in affinity for TKIs [75,89]. A structural comparison of WT/AMPPNP (PDB: ID 3VJO) with L858R/AMPPNP (PDB: ID 2EB3) showed that the side chain of F723 protrudes outwards towards the active site and interacts with R748. This enlarges the active site cleft, which allows faster release of ATP while also making the active site more accessible to TKIs [65].
Further information on the L858R mutation was provided by Ding et al. through long (500 ns or more) atomistic MD simulations combined with experimental data. The MD studies provided evidence on the process of switching from the inactive state to an active conformation. Simulations of gefitinib (also a 4-anilinoquinazoline EGFR inhibitor) bound to either WT or L858R EGFR show stronger binding to the active than to the inactive conformation. These data explain why inhibitors such as gefitinib are more successful against activating mutations than against gene amplifications in the treatment of EGFR-positive NSCLC patients. Because L858R drives the kinase domain into an active conformation, disturbing the equilibrium between the active and inactive states, it favors drug binding by providing a more accessible (open) conformation for the drug [90].
Deletions in exon 19, and more specifically the common Δ746ELREA750 deletion (Del19), occur in the N-terminal lobe, in the loop between the β3-strand and the αC-helix. MD studies performed by Tamirat et al. on the inactive and active forms of the Del19 mutant found that the active state is favored due to stabilization of the E762-K745 salt bridge. This results from a decrease in the flexibility of the β3-αC loop, which stabilizes the αC-helix in the active αC-in conformation. Furthermore, the Δ746ELREA750 deletion causes an inward conformational shift of the αC-helix, which disrupts the hydrophobic cluster involving the αC-helix, thereby promoting the activation of the TK domain [86].
Similarly to L858R point mutants, Del19 EGFR mutants display a decreased affinity for ATP (Km) and a lower Ki for first-generation TKIs such as erlotinib [66,86]. However, exon 19 deletions and the accompanying residue insertions display large heterogeneity and thus exhibit differential drug sensitivity [85,91]. Unfortunately, no structure is available so far for any of the exon 19 deletions.
Upon identification of the aforementioned mutations as major biomarkers of NSCLC, there has been keen interest in developing drugs able to inhibit the enhanced kinase activity conferred by these mutations. These efforts culminated in the development of the first-generation inhibitors gefitinib and erlotinib, both reversible ATP-competitive inhibitors sharing a 4-anilinoquinazoline scaffold [92].
As pictured in Figure 4, both drugs interact with M793 similarly to the adenine ring of ATP, through a hydrogen bond with the residue backbone. The 3-chloro-4-fluoroaniline group allows gefitinib to explore the allosteric hydrophobic pocket through interactions with L788 and T790. The methoxy moiety is within van der Waals contact of G796 (PDB: ID 4WKQ). The 6-propylmorpholine ring of gefitinib extends into a solvent-exposed area and was introduced to improve pharmacokinetic properties [93].
Interestingly, gefitinib has also been found to bind to the L858R mutant in a second conformation. This second conformation exhibits a 180° rotation of the aniline ring, which allows the chlorine substituent to interact with the side chain of R855 via a halogen bond mediated by a coordinated water molecule. In both cases, the ether group extends outwards from the ATP binding pocket towards the aqueous environment [44].
In the active EGFR conformation, the erlotinib anilinoquinazoline ring is stabilized by seven hydrophobic (L718, A743, L788, L792, P794, and L844), three polar (T790, Q791, and T854), and three charged residues (E762, K745, and D855), while the solvent-exposed substituents interact with F795 and G796 (PDB: ID 1M17) [94]. In the inactive state, erlotinib is stabilized by the same seven hydrophobic interactions observed in the active state and by an extra hydrophobic interaction with V726. Furthermore, it is stabilized by the same three polar residues and by three charged residues (K745, D800, and D855) [95].
Initial studies suggested that both drugs recognize the active conformation of EGFR. However, computational approaches indicated that erlotinib can bind to both the active and inactive conformations [96]. The inactive TK domain was subsequently co-crystallized with erlotinib (PDB: ID 4HJO), providing crystallographic evidence that erlotinib can bind to both states [96].
Erlotinib and gefitinib both showed greater potency against L858R EGFR than against WT EGFR. Notably, gefitinib binds 20-fold more strongly to the EGFR L858R mutant than to WT EGFR and thus preferentially inhibits L858R-positive cancer cells, leading to their apoptosis and to cancer remission while sparing healthy cells [44].
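To put the 20-fold difference in perspective, a ratio of binding constants can be converted into a free-energy difference through the standard relation ΔΔG = -RT ln(fold). The snippet below is a minimal sketch that assumes the 20-fold figure is a ratio of dissociation constants and that the temperature is 298.15 K; neither assumption is stated in the source, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold, temperature_k=298.15):
    """Binding free-energy difference (kcal/mol) implied by a `fold`-times
    tighter dissociation constant: ddG = -R * T * ln(fold)."""
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return -R_KCAL * temperature_k * math.log(fold)


# 20-fold tighter binding to L858R than to WT, as quoted in the text.
print(round(ddg_from_fold_change(20.0), 2))
```

Under these assumptions the difference is roughly -1.8 kcal/mol, which is comparable to the energy of one or two additional favorable contacts. This illustrates how modest structural changes in the binding site can translate into clinically relevant selectivity.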
Afatinib, a second-generation inhibitor approved in 2013, is an irreversible ATP-competitive anilinoquinazoline that harbors an acrylamide reactive group, as shown in Figure 5. Crystallographic studies by the Solca laboratory on WT EGFR in complex with afatinib (PDB: ID 4G5J) showed a hydrogen bond between M793 and the core quinazoline ring but, most importantly, the electron density map revealed a covalent bond formed between C797 and the acrylamide group in the active state of the kinase [77].

G719X (where X can be alanine, aspartic acid, cysteine, or serine) is a rare sensitizing point mutation in exon 18, accounting for approximately 3% of all EGFR TK domain mutations, with G719S (PDB: ID 2EB2) being the most common variant [65,97]. G719 is located on the P-loop connecting strands β1 and β2, contributing to the stabilization of the ATP phosphate groups. Furthermore, G719 is part of the hydrophobic cluster found in the inactive conformation, creating a helical turn whose steric hindrance helps position the αC-helix away from the active site in an αC-out conformation. Replacing the glycine is not tolerated in the inactive state, because it disrupts the hydrophobic cluster, and therefore favors the active state. Structures of the G719S mutant in complex with AMP-PNP (PDB: ID 2ITN), gefitinib (PDB: ID 2ITO), and a staurosporine analog (PDB: ID 2ITQ) are available. Unlike the L858R and Del19 mutations, the G719X mutation does not promote receptor dimerization but rather influences intrinsic structural components favoring receptor activation [66].
Figure 5. Two-dimensional representation of the second-generation inhibitors afatinib and dacomitinib, and the third-generation inhibitor osimertinib. Although all three drugs follow a binding mode similar to that of first-generation inhibitors through hinge-binding scaffolds, osimertinib lacks interactions with the allosteric pocket. The proximity of each drug's warhead to C797, its covalent bond partner, is also depicted. Interactions were analyzed and plotted using Discovery Studio Visualizer [57]. Atom color scheme: carbon (grey), nitrogen (blue), oxygen (red), chlorine (green), and fluorine (light blue).
The G719S mutant is sensitive to gefitinib, with a half-maximal inhibitory concentration (IC50) of 0.18 µM against 1.04 µM for WT EGFR. However, a secondary mutation at position 790 (T790M) increases the IC50 roughly 10-fold (IC50 = 1.86 µM). Interestingly, the dissociation constant (Kd) of the double mutant (Kd = 5.6 nM) indicates tighter binding of gefitinib than to either the single mutant (Kd = 31.9 nM) or the WT (Kd = 14.2 nM), showing that the diminished sensitivity of the double mutant to gefitinib is not due to reduced drug binding. A plausible explanation comes from kinetic studies, which show a ratio between the catalytic rate (kcat) and the Michaelis-Menten constant (Km) comparable to that of the WT, indicating that the T790M mutation restores nucleotide binding [65].
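The apparent paradox above, tighter drug binding yet a higher IC50, can be rationalized with the Cheng-Prusoff relation for competitive inhibitors, IC50 = Ki (1 + [ATP]/Km): when ATP affinity is restored (lower Km), more inhibitor is needed to outcompete ATP. The sketch below illustrates the trend only. It uses the Kd values quoted in the text as stand-ins for Ki, which is an assumption, while the ATP concentration and both Km values are hypothetical numbers chosen for illustration and are not taken from the cited work.

```python
def apparent_ic50(ki_nm, atp_um, km_atp_um):
    """Cheng-Prusoff relation for an ATP-competitive inhibitor:
    IC50 = Ki * (1 + [ATP] / Km). Ki and the result are in nM;
    [ATP] and Km must share the same unit (here micromolar)."""
    if km_atp_um <= 0:
        raise ValueError("Km must be positive")
    return ki_nm * (1.0 + atp_um / km_atp_um)


ATP_UM = 1000.0  # hypothetical ATP concentration (micromolar)

# Kd values from the text are used as stand-ins for Ki (an assumption);
# the Km values are hypothetical and chosen only to show the trend.
single = apparent_ic50(ki_nm=31.9, atp_um=ATP_UM, km_atp_um=100.0)
double = apparent_ic50(ki_nm=5.6, atp_um=ATP_UM, km_atp_um=5.0)

print(single, double)
```

With these illustrative inputs the double mutant ends up with the higher apparent IC50 despite its lower Kd, because the assumed drop in Km outweighs the gain in drug affinity. The numbers are not expected to reproduce the reported IC50 values.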
Treatment of L858R, Del19, and G719X mutants with first- and second-generation TKIs shows improved overall survival when compared to classical chemotherapy [69,98]. However, after a median of nine to thirteen months, EGFR-positive NSCLC treated with first- or second-generation TKIs typically acquires the resistance mutation T790M [79,83]. The T790M resistance mutation is analogous to the T315I mutation found in the imatinib-resistant BCR-ABL fusion, and it accounts for more than 50% of all EGFR TKI resistance mutations [99].
T790M is referred to as the gatekeeper mutation and it is located at the back of the ATP-binding site [67]. Just like L858R, the T790M mutation stabilizes the active conformation of the TK domain, but via a different mechanism, as shown by free energy studies. M790 is part of a hydrophobic cluster formed at the back of the ATP-binding site of the N-lobe and interacts with M766, located on the αC-helix. This hydrophobic interaction further extends towards F856 and the catalytically relevant DFG motif [45,100]. Once the methionine replaces the threonine residue, the αC-in conformation is stabilized. Thermodynamic integration (TI) is a theoretical method that estimates the free energy difference between two given states of a system, even when their spatial coordinates diverge over long simulation durations [101]. Park and colleagues' combination of TI with MD simulations showed that T790M isoenergetically stabilizes the active (αC-in) and intermediate disordered forms of active apo EGFR, while disfavoring the inactive form. However, it does not repress the intrinsic disorder of the αC-helix and, consequently, is not believed to favor dimerization. Combining MD simulations with MM-GBSA for both the active and inactive conformations, with T790M alone or together with L858R, shows that only the binding energy of erlotinib is decreased by the gatekeeper mutation, while that of lapatinib is not affected [102].
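For orientation, the two free energy approaches mentioned above are commonly written in the following textbook forms. The notation here (the coupling parameter λ, the Hamiltonian H, and the individual energy terms) is generic and is not taken from the cited studies, whose exact protocols may differ.

```latex
% Thermodynamic integration between state A (lambda = 0) and state B (lambda = 1)
\Delta G_{A \rightarrow B} = \int_{0}^{1}
  \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle_{\lambda} \, d\lambda

% MM-GBSA end-point estimate of the binding free energy
\Delta G_{\mathrm{bind}} \approx
  \langle G_{\mathrm{complex}} \rangle
  - \langle G_{\mathrm{receptor}} \rangle
  - \langle G_{\mathrm{ligand}} \rangle,
\qquad
G = E_{\mathrm{MM}} + G_{\mathrm{GB}} + G_{\mathrm{SA}} - TS
```

In the first expression the angle brackets denote an ensemble average sampled at a fixed value of λ; in the second, each G is averaged over snapshots taken from an MD trajectory.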
The formation of resistance to first and second-generation TKIs by acquiring T790M is believed to emerge due to changes in the stability of ATP and drug binding [103]. It has recently been proposed that T790M resistance is a consequence of the restoration of ATP sensitivity similar to that of WT EGFR [50,104]. Comparisons of the L858R mutant with the WT EGFR showed that the variant amplifies the conformational landscape of the kinase domain. Interestingly, the co-existence of L858R with T790M as a double mutant presents a conformational landscape similar to the WT EGFR. Such a similar conformational profile is associated with the restored ATP affinity of the double mutant, being comparable to the WT [50].
Resistance also stems from steric clashes caused by the replacement of threonine with methionine within the ATP-binding pocket, which contributes to the reduced binding of reversible TKIs [67]. However, more recent studies show that both gefitinib and erlotinib retain low nanomolar binding affinity towards the T790M mutant, indicating that drug binding is still possible despite being limited [100]. In a scientific setting, limited binding affinity might still correlate with response to a drug; yet clinical assessment of these results might lead to a decision to withdraw the drug, since the advantage of targeting the mutated kinase rather than its wild-type counterpart is lost. The therapeutic window for oncology drugs is a major point in medical decision-making [105].
Following the observation that first-generation drugs retain binding affinity for the T790M mutant, free-energy studies found that gefitinib binding to the T790M and L858R mutants is more energetically favorable than binding to WT EGFR, at approximately -12 and -15 kcal/mol, respectively. In addition, MD studies have shown that gefitinib binding alters the conformation of the αC-helix whilst the activation loop maintains an active conformation. Overall, it has been observed that the T790M mutation does not ablate gefitinib binding, as experimentally demonstrated by Gajiwala et al. through in vitro phosphorylation analysis of T790M in the presence of L858R and through crystal structures (PDB: IDs 3UG2, 4I22) [103,104].
Emergence of T790M can also follow the primary G719X activating mutation, leading to a double mutant with synergistic interactions that stabilize the active conformation. Analysis of the gefitinib-double mutant complex (PDB: ID 3UG2) showed that gefitinib binds to the double mutant in a manner similar to its binding to the WT EGFR [65]. Intriguingly, gefitinib binds the double mutant about 6-fold more tightly (Kd = 5.6 nM) than the single G719S mutant (Kd = 31.9 nM), providing evidence that binding is still possible and is not prohibited by steric hindrance. However, the G719S mutant does become 10-fold less sensitive to gefitinib when acquiring the T790M resistance mutation. In the presence of AMPPNP (PDB: ID 3VJN), the methionine at position 790 forms a more stable structure with AMPPNP when compared to the WT. An important observation for future drug development is that the double mutant also decreases the size of the hydrophobic cleft formed between L718 and G796, and therefore future drug prototypes that aim to treat the T790M mutant should avoid this cleft [65].
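To put the 6-fold difference in Kd on an energy scale, the dissociation constants can be converted with the standard relation ΔG° = RT ln(Kd). The sketch below is illustrative only: the temperature of 298.15 K and the 1 M standard state are assumptions on our part, since the assay conditions are not restated here, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3     # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15   # assumed temperature, not taken from the cited study


def binding_free_energy(kd_molar: float, temperature: float = TEMPERATURE_K) -> float:
    """Standard binding free energy in kcal/mol from a Kd given in mol/L."""
    return R_KCAL * temperature * math.log(kd_molar)


dg_single = binding_free_energy(31.9e-9)  # G719S, Kd = 31.9 nM
dg_double = binding_free_energy(5.6e-9)   # G719S/T790M, Kd = 5.6 nM
ddg = dg_double - dg_single               # negative means tighter binding

print(f"dG (G719S)       = {dg_single:.2f} kcal/mol")
print(f"dG (G719S/T790M) = {dg_double:.2f} kcal/mol")
print(f"ddG              = {ddg:.2f} kcal/mol")
```

Under these assumptions the 6-fold tighter binding corresponds to a gain of roughly 1 kcal/mol, a modest shift that fits the view that resistance arises from restored ATP affinity rather than from lost drug binding.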
Another piece of evidence that steric hindrance does not ablate drug binding is that afatinib has been co-crystallized with the receptor in its active conformation in the presence of the T790M mutation (PDB: ID 4G5P) [77]. In vitro kinase assays identified that afatinib has 100-fold higher potency against the L858R/T790M double mutant when compared to gefitinib [77]. However, the concentrations necessary to inhibit the T790M point mutant might not be attained in the clinic under standard dosing regimens [106].
Due to the need to find a viable treatment option for the resistant T790M mutation, third-generation TKIs have been developed. Osimertinib, whose molecular structure is shown in Figure 5, is an irreversible EGFR inhibitor comprising a 2,4-diarylaminopyrimidine scaffold. It is used for the L858R/T790M or exon 19 deletion/T790M mutants and shows a 200-fold preference for the double mutants over the WT EGFR [107]. Combining the information provided by the crystal structure of osimertinib with WT EGFR (PDB: ID 4ZAU) with computational tools helped to elucidate the binding mode of osimertinib to the inactive state of the TK domain.
Yosaatmadja et al. modeled osimertinib binding using the previously known crystal structure of T790M EGFR in complex with dacomitinib (PDB: ID 4I24) [103,107]. The L858R, T790M, and L858R/T790M mutations do not directly contribute to osimertinib binding but do favorably alter the conformation and dynamics of the TK domain in ways that enhance drug binding [107]. Osimertinib engages in a hydrogen bond with the M793 backbone, while van der Waals interactions contribute to the orientation of the drug within the pocket. Specifically, the phenyl ring sits in a hydrophobic sandwich between L718 and G796, and the methyl-indole moiety is within van der Waals distance of G719, F723, and V725. The interactions of the indole ring, especially with the aromatic side chain of F723, are believed to be responsible for positioning the P-loop towards the ATP-binding pocket. As an irreversible ligand, osimertinib is capable of forming a covalent bond with C797 [107].
Unfortunately, similar to the emergence of resistance mutations against first-generation drugs, acquired resistance develops in response to osimertinib and afatinib treatment [108]. The most common resistance mutation following T790M is C797S. The replacement of C797 with serine ablates the covalent binding ability of these irreversible drugs and thus confers resistance in approximately 15-25% of patients treated with osimertinib [74,109]. Uchibori et al. were able to identify a treatment option for the C797S/T790M/activating mutation triple mutant by running computational simulations and structure-activity relationship analyses that yielded brigatinib, a dual EGFR/ALK TKI, as a therapeutic agent [110]. Brigatinib binding to the ATP-binding site of C797S/T790M/activating mutation EGFR resembles its binding to EML4-ALK (PDB: ID 6MX8). Kinase studies identified that the inhibitor is more potent against Del19/T790M/C797S than against triple mutants carrying the L858R activating mutation. When screened against different cell lines presenting the triple mutants, brigatinib was the only drug to inhibit EGFR phosphorylation and its downstream signaling. The effect of brigatinib on the triple mutant is improved in combination with cetuximab, an anti-EGFR antibody [110].
L718 mutations significantly increase the IC50 value of osimertinib, with L718Q conferring the greatest resistance potential [64]. As previously mentioned, L718 is located on the P-loop in the proximity of the ATP-binding site and is important for the correct coordination of osimertinib during covalent bond formation with C797. Substitution of leucine with glutamine sterically inhibits osimertinib binding due to the introduction of the larger, polar side chain, which decreases local hydrophobicity at the point of contact with osimertinib and spatially restricts its binding [64].
The L718Q mutant, in combination with L858R, confers resistance to gefitinib in either the presence or absence of T790M [64]. The L718Q mutation also confers gefitinib resistance to C797S-positive NSCLC. Following the substitution of leucine with glutamine, the local hydrophobicity is disrupted, which impairs gefitinib binding [64]. Interestingly, although the L718Q point mutant confers resistance to osimertinib and gefitinib, a patient with advanced metastatic NSCLC harboring the L858R/L718Q double mutant was successfully treated with afatinib, indicating that a furan moiety might be suitable in the presence of an L718 mutation [81].
L792F/Y/H mutations constitute approximately 1.5% of resistance to osimertinib, with the L792H conferring the most remarkable resistance [64]. Structural and mutagenesis analysis of the complex of WT EGFR/osimertinib (PDB: ID 4ZAU) showed that replacement of L792 with the aforementioned amino acids sterically inhibits the binding of osimertinib to the ATP binding cleft thus disrupting the correct orientation of the inhibitor and its pharmacological action [111,112].
The G796C/D/R mutations had previously been identified to hamper the potency of erlotinib without further mechanistic disclosures [63]. The G796D mutation has also been identified to confer resistance to osimertinib due to the steric hindrance imposed by the replacement of glycine with aspartic acid, thus impairing the formation of the previously mentioned hydrophobic sandwich [71].
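The compact notation used for these resistance mutations (for example L792F/Y/H or G796C/D/R) packs several single-residue substitutions into one label. When such labels need to be handled programmatically, for instance to enumerate variants for modeling, they can be expanded as in the following minimal sketch. The function name and the regular expression are our own illustrative choices, and labels such as exon 19 deletions are deliberately not handled.

```python
import re
from typing import List

# Wild-type residue, position, then one or more slash-separated substitutions.
_SHORTHAND = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")


def expand_mutations(shorthand: str) -> List[str]:
    """Expand a label such as 'L792F/Y/H' into ['L792F', 'L792Y', 'L792H']."""
    match = _SHORTHAND.match(shorthand.strip())
    if match is None:
        raise ValueError(f"Unrecognized mutation label: {shorthand!r}")
    wild_type, position, variants = match.groups()
    return [f"{wild_type}{position}{variant}" for variant in variants.split("/")]


for label in ("L792F/Y/H", "G796C/D/R", "T790M"):
    print(label, "->", expand_mutations(label))
```

Note that a label such as G719X passes through unchanged, because X conventionally denotes any substitution rather than a specific residue.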
A novel G724S point mutation, identified in a set of NSCLC patients and linked with resistance to osimertinib, was studied using computational modeling [113]. Interestingly, G724S only confers osimertinib resistance to exon19del and not to L858R mutants [82,113]. Resistance arises through the complementary action of exon19del, which reduces β3-αC loop flexibility, and the G724S point mutation, which leads to the destabilization of the αC-in conformation [82].
EGFR mutations are clearly arising faster than drug development can follow, as seen by the emerging clinical resistance to osimertinib and the associated poor prognosis for patients. Repurposing of already approved inhibitors can be of use as an accelerated methodology for clinicians, as demonstrated for allopurinol and methotrexate, both initially developed to treat cancer but later repurposed for gout and rheumatoid arthritis [114]. The process of repurposing, despite being faster than developing a new molecular entity, remains hindered by the multitude of drugs to be assessed against a myriad of diseases, indicating a clear need for methodological improvement and opening an avenue for the application of in silico high-throughput screening [115].
Computational tools, such as those previously described, can be used for a rapid assessment of novel mutations, as shown by Kemper et al. with the triple mutation exon19del/T790M/P794L. A team of clinicians and structural biologists was faced with the emergence of a novel mutation, from a proline to a leucine at position 794, in addition to an exon 19 deletion and T790M. Despite the presence of T790M, the patient was responding poorly to osimertinib. Through docking studies, the Molecular Tumor Board compared the drugs osimertinib and afatinib, providing insight into how afatinib could retain binding affinity and thus be a suitable therapeutic option [87]. This example shows that drug repurposing, usually associated with a change of therapeutic indication of a medication, may also be used to reconsider drugs that were previously discarded.

Anaplastic Lymphoma Kinase (ALK)
ALK is an RTK that is subject to aberrations in 4-5% of NSCLC cases (UniProt: ID Q9UM73) [116]. The mutations cause ALK to become an essential growth driver of the tumor. This renders NSCLCs part of the ALKoma entity [117]. The ALK protein was first discovered in 1994 as a fusion protein in a Non-Hodgkin's Lymphoma subtype and has since been classified as a member of the insulin receptor tyrosine kinase (IRK) superfamily [118-120]. Subsequently, an increasing number of genetic aberrations have been identified in ALK, many of them caused by chromosomal rearrangements that often lead to the hyperactivation of ALK. This leads to the overstimulation of downstream pathways that are involved in cell survival, differentiation, and apoptosis, resulting in oncogenesis [121].
There is an increasing number of treatments available for ALK-positive NSCLC in the form of ALK specific TKIs. However, most of them are met with drug resistance mutations after prolonged treatment. Computational studies can aid in the understanding of conformational changes due to aberrations in the ALK protein. This would be a key step towards personalized medicine for ALK-positive NSCLC patients by filling the gap brought by novel mutations and their response to available treatment.
Unlike many of the IRK subfamily members, the normal physiological function of ALK is yet to be fully elucidated. There is evidence that ALK plays a role in growth and fetal development of both the central nervous system (CNS) and peripheral nervous system (PNS) [119,120,122-129]. Furthermore, evidence indicates that constitutively active mutated ALK proteins induce neuronal growth and differentiation [130,131].
Like the physiological role of ALK, the downstream signaling pathways of the protein remain to be fully deciphered. A membrane-bound receptor, such as ALK, receives and transfers extracellular signals by activating intracellular signaling pathways. Upon ligand binding, the wild-type ALK receptor homodimerizes and activates via trans-autophosphorylation of tyrosine residues within the kinase activation loop. Docking sites for downstream Src homology 2 (SH2) and phosphotyrosine-binding domain (PTB)-containing effector and adaptor proteins are located within the cytoplasmic domain [132].
The ligands that activate ALK are the subject of considerable debate. Some studies show that the growth factors pleiotrophin and midkine induce ALK activation, whereas others refute this or propose that the small secreted FAM150 peptides activate ALK [132]. There is consensus, however, that upon ligand binding ALK can induce a multitude of pathways. These include the JAK/STAT, PI3K/AKT/mTOR, SOS/RAS/MEK/ERK1/2, and PLC-γ/DAG/PKC pathways, as depicted in Figure 2 [133-135]. Cell growth and proliferation are induced via the PLC-γ/DAG/PKC and SOS/RAS/ERK1/2 pathways, while cell survival is directed through JAK/STAT and PI3K/AKT/mTOR. The diverse pathogenic signaling profile of ALK is caused by the different aberrations within the protein and the large number of pathways ALK can induce [133,135,136].
ALK consists of 1620 amino acids in its native full-length single-chain receptor form. Multiple subdomains make up the 1030-residue extracellular region, including the LDL-A domain (low-density lipoprotein class A domain), a MAM (meprin, A5, mu) domain, and a glycine-rich region [119,120]. The cytoplasmic domain contains 563 residues and includes the catalytic kinase domain. ALK was first identified in a chimeric protein, where the catalytic kinase domain of ALK was fused with the amino-terminal portion of nucleophosmin (NPM) [118]. NPM mediates constitutive dimerization of the protein, thereby inducing constant activation via trans-autophosphorylation of the chimeric protein [117]. Currently, nearly 30 fusion partners of ALK have been identified, of which the most common in NSCLC is echinoderm microtubule-associated protein-like 4 (EML4), giving rise to the EML4-ALK gene fusion [137].
EML4-ALK is the main rearrangement form in ALK-positive NSCLC, present in 3-7% of all cases [138,139]. EML4-ALK-positive NSCLC patients share clinical characteristics with patients who harbor activating mutations in EGFR: both groups are non- or light smokers and manifest adenocarcinoma histology [88,116,117,140]. Notably, the EML4-ALK fusion gene and mutations in EGFR or Kirsten rat sarcoma virus (KRAS) are mutually exclusive, albeit with rare exceptions [141-143].
The protein resulting from the fusion gene consists of the amino-terminal portion of EML4 fused to the intracellular region of ALK, including its catalytic domain. The chimeric protein is constitutively active, which is caused by the coiled-coil trimerization domain of the EML4 portion [144]. This results in a transforming ability that depends on the associated upregulation of RTK activity [117].
Furthermore, there are not only variations within the EML4-ALK fusion oncogene, but rearrangements of the ALK gene with a different partner have also been reported, such as kinesin family member 5B (KIF5B)-ALK [145], Huntington-interacting protein 1 (HIP1)-ALK [146], translocated promoter region (TPR)-ALK [147], baculoviral inhibition of apoptosis protein repeat-containing 6 (BIRC6)-ALK [148], and many more [149]. Most of these fusion proteins rely on ALK catalytic activity. Furthermore, recent evidence suggests that the different fusion partners influence kinase activity, transforming ability, protein stability, and notably ALK TKI drug sensitivity [150]. The development of a wide range of ALK specific inhibitors is therefore imperative.
Understanding the structure and mechanism of the ALK catalytic domain aids in the development of ALK-specific TKIs, on the road towards personalized medicine. An important step in understanding the unique substrate specificity of ALK was the release of its X-ray crystal structure in 2010 by Lee et al. (PDB: ID 3L9P) [132]. The catalytic domain of ALK was expected to be similar to that of other IRK family members, as there is a high degree of structural and sequence conservation within this kinase family: 45% sequence identity and 62% sequence conservation over 280-290 residues between ALK and IRK/IGF1RK. Indeed, ALK has the canonical kinase domain architecture and topology, with the ATP-binding site residing at the interlobar cleft. As with EGFR and BRAF, the cleft is formed by a smaller N-terminal lobe (N-lobe) and a larger C-terminal lobe (C-lobe) connected through the hinge region, as portrayed in Figure 6 [132].
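A percent identity such as the one quoted above is obtained by counting identical residue pairs over the aligned positions of two sequences. The sketch below illustrates one common convention (identity over aligned, non-gap columns). The function name is hypothetical, and the second input string is a toy variant used only for demonstration, not a real alignment of ALK against IRK or IGF1RK.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("Aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy example: the alphaAL sequence from the text against an arbitrary variant.
print(percent_identity("GMARDIYR", "GMTRDIYE"))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, so reported identities are only comparable when the denominator is stated.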
Figure 6. Schematic representation of (A) the partially active ALK kinase domain with crizotinib (PDB: ID 5FTO). Mutation hotspots are indicated in blue, with L1196 depicted as spheres. The regulatory (B) β1 and β2 strands with the P-loop (residues G1123-G1128), (C) the unique YCFAGKTS motif (residues Y1096-S1103) in grey packed against the αC-helix (residues E1158-K1173), and (D) the αAL motif (G1272-R1279) from the activation segment (residues D1270/F1271/G1272 to E1299) are highlighted and aligned with the inactive conformation in grey (PDB: ID 3L9P). (E) K1150 from the β3 strand engaging in a salt bridge with E1167 in the αC-helix.
Although ALK is believed to follow the same switch between active and inactive states, Lee et al. determined the structure of ALK in an intermediate conformation in which, despite the αC-in position characteristic of an active conformation, the A-loop is not fully extended as previously described for similar kinases [132]. Accelerated MD (aMD) is an enhancement of classical molecular dynamics (MD) in which the energy barriers between different states of a system are reduced, thus improving conformational sampling and facilitating free energy calculations [151]. Analysis of an apo WT ALK system with aMD suggests a smooth switch of the αC-helix from the in to the out state, as would be expected in a cellular setting [132,152]. The juxtamembrane segment at the N-terminus, residues 1096-1103 (YCFAGKTS), contains a β-turn motif. This β-turn motif packs against the distal end of the αC-helix, on the opposite side of the β-sheet. It is not observed in the structures of IGF1RK or IRK and seems to be unique to ALK, although its contribution to the general structure is yet to be disclosed [132].
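The barrier-lowering idea behind aMD mentioned above can be illustrated with the boost potential commonly used in the method: whenever the potential energy V falls below a threshold E, a non-negative boost (E - V)^2 / (α + E - V) is added, which raises the energy wells while leaving regions above E untouched. The sketch below is a generic illustration; the parameter values are arbitrary, and whether the cited ALK simulations used exactly this form and which settings they chose is not stated here.

```python
def boost_potential(v: float, threshold: float, alpha: float) -> float:
    """Boost energy added to potential v; all arguments share the same energy unit."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if v >= threshold:
        return 0.0  # above the threshold the potential is left unmodified
    gap = threshold - v
    return gap * gap / (alpha + gap)


# Arbitrary example values (not taken from any published simulation).
threshold, alpha = 10.0, 10.0
for v in (0.0, 5.0, 10.0, 12.0):
    modified = v + boost_potential(v, threshold, alpha)
    print(f"V = {v:5.1f} -> V* = {modified:5.2f}")
```

Smaller values of α flatten the modified surface more aggressively, so deep minima receive the largest boost and barrier crossings become more frequent within accessible simulation times.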
The most solvent-exposed A-loop tyrosine residue, Y1282, is positioned below R1284. The other tyrosine residue, Y1283, is flipped towards the active site and sandwiched between hydrophobic residues of the proximal (M1273 of αAL) and distal (M1290/L1291) ends of the A-loop [132]. Y1278, critical for ALK transforming activity, forms a hydrogen bond with the backbone amide nitrogen of C1097 of the N-terminal β-turn motif. Y1278 is separated from the second and third tyrosines in the ALK A-loop sequence by the RAS (Arg-Ala-Ser) motif [132]. This RAS sequence motif has been predicted to be a distinguishing A-loop feature, contributing to ALK A-loop autophosphorylation efficiency and the preference for Y1278. The RAS motif is not present in IGF1RK/IRK; instead, they have the ETD (Glu-Thr-Asp) motif [153].
The ALK catalytic domain is intrinsically autoinhibited due to the aforementioned β-turn motif in the N-lobe [132]. The hydrogen atom of the C1097 backbone amide forms a hydrogen bond with the hydroxy group of Y1278, which prohibits the phosphorylation of the latter residue [154]. The A-loop also plays a role in the autoinhibition of ALK. A short, two-turn α-helix, named αAL (residues 1272-1279; GMARDIYR), is found at the proximal portion of the A-loop immediately following the phenylalanine of the DFG motif. αAL is packed orthogonally below the αC-helix. This prevents ALK from relaxing to its active conformation [154].
The previously mentioned accelerated molecular dynamics study supported the findings of Lee et al. that wild-type ALK resides in an autoinhibitory state. The A-loop was reported not to be fully extended, thereby blocking the space for peptide binding. Furthermore, the A-loop blocks the ATP-binding site when deviating inward. It thereby adopts a closed conformation that is similar to that of other inactive tyrosine kinases [152]. IRK family members form a pseudosubstrate with their A-loop in the inactive and unphosphorylated form. Furthermore, the phospho-acceptor site (P-site) of the substrate peptide binding region is bound in a cis-autoinhibitory fashion to the second A-loop tyrosine residue [132]. However, during the simulation ALK adopts a DFG-in conformation, where D1270 points inward, allowing for more space in the ATP-binding site, an indication of an active conformation [154]. Furthermore, computational studies indicate that the more active A-loop conformation does not display the pseudosubstrate cis-inhibitory pose seen in other IRK family members. This means that dormant wild-type ALK resides in a unique, partially inactive tyrosine kinase conformation, as it lacks the negative regulatory structural elements consistent with a fully inactive kinase conformation [132,152].
The X-ray crystal structure in combination with the aMD simulation demonstrates that the hydrogen bond network is essential in the construction of the conformational framework [132,152]. Basic residues on the polar face of αAL, along with K1285, draw the proximal portion of the A-loop downward toward the lower and inner aspect of the αC-helix, stabilizing this relative position in the partially inactive conformation of ALK. D1163 of the αC-helix forms two hydrogen bonds with the positively charged residues R1275 and R1279 on the face of αAL [132,152]. These hydrogen bonds are considered the key factors in the stabilization of the αC-in conformation in the partially inactive state [152]. As for the semi-closed A-loop conformation in this state, Q1159 of the αC-helix and Y1283 form a hydrogen bond that results in a sharp U-turn at the end of αAL. As a result, the A-loop can neither be fully extended nor adopt a fully closed conformation. The electrostatic and hydrophobic interactions within the protein further cement the partially inactive conformation. The hydrophobic stem consisting of M1290, L1291, and P1292 is stabilized by the hydrophobic cluster of M1296, F1301, and L1339 in combination with the hydrogen bond between C1288 and L1291 [152].
As previously discussed, the apo ALK structure rarely adopts an active conformation and the dormant protein resides in a partially inactive state. Upon binding of ATP, aMD simulations show that the structure adopts another energetically favorable conformation, comparable to an experimentally obtained structure (PDB: ID 3LCT). This new conformation is an active-like state. Generally, protein kinases use ATP to phosphorylate their specific substrates with its phosphate groups [152]. However, ATP has recently also been reported to act as an allosteric modulator [155-158]. The active-like state of ATP-bound ALK is also the result of such allosteric effects [152].
Whereas the apo kinase domain mostly transitions between the αC-in and αC-out conformations, ATP-bound ALK has a more compact active site, as the triphosphate moiety positions the αC-helix towards the active site [152]. The triphosphate moiety engages in electrostatic interactions with K1150, thereby anchoring the side chain of this residue to E1167 and protecting the stability of the salt bridge. Furthermore, a hydrogen bond network is formed between ATP, the charged R1275, and D1163, pulling the αC-helix even further toward the active site [152].
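Salt bridges such as the K1150-E1167 pair described above are usually identified in a structure or trajectory with a simple distance criterion between the lysine side-chain nitrogen and the glutamate carboxylate oxygens. The sketch below shows that check. The 4.0 Å cutoff is a commonly used value that we assume here, and the coordinates are invented placeholders, not taken from any PDB entry.

```python
import math
from typing import Iterable, Tuple

Coordinate = Tuple[float, float, float]

SALT_BRIDGE_CUTOFF_A = 4.0  # assumed N-O distance cutoff in Å


def is_salt_bridge(
    lys_nz: Coordinate,
    carboxylate_oxygens: Iterable[Coordinate],
    cutoff: float = SALT_BRIDGE_CUTOFF_A,
) -> bool:
    """True if the Lys NZ atom lies within the cutoff of any carboxylate oxygen."""
    return any(math.dist(lys_nz, oxygen) <= cutoff for oxygen in carboxylate_oxygens)


# Placeholder coordinates in Å, invented for illustration only.
k1150_nz = (10.0, 5.0, 3.0)
e1167_oe = [(12.8, 5.5, 3.2), (13.9, 7.1, 2.6)]

print(is_salt_bridge(k1150_nz, e1167_oe))
```

Applied frame by frame to a trajectory, the same test yields the fraction of simulation time during which the salt bridge is formed, which is one way to quantify the stabilizing effect attributed to ATP.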
An MD study also shows that ATP binding induces conformational rearrangements in the A-loop of ALK [152]. Upon binding of ATP, the sharp U-turn at the end of αAL is disrupted. This allows for the full extension of the A-loop and creates space for peptide substrate binding, which is considered a determinant step in kinase activation [159]. Due to the hydrogen bond formed between ATP and R1275, αAL moves nearly 4 Å forward [152]. This leads to the disruption of the hydrogen bond between D1276 and R1284, and the sharp U-turn is no longer formed. Additionally, C1288 and K1285 form a hydrogen bond that stabilizes the distal A-loop residues in a position that no longer blocks the substrate-binding site. Upon ATP binding, the hydrophobic region consisting of M1290, L1291, and P1292 moves away from the active site, and a new hydrophobic core is formed as M1273 packs against P1292 and V1293. This new hydrophobic core resides closer to the active site than the original hydrophobic core and is considered to support the open A-loop conformation [152].
To combat ALK-positive lung cancer, small-molecule inhibitors have been developed over the past decade. As with EGFR, the drugs are designed specifically to stop the pro-survival signaling provided by ALK mutants. To date, the Food and Drug Administration (FDA) has approved five TKIs for the treatment of ALK-positive NSCLC [116].
Crizotinib was the first ATP-competitive small-molecule ALK inhibitor to be approved by the FDA for lung cancer, in 2011. Treatment with crizotinib resulted in longer progression-free survival, higher response rates, and substantially reduced symptoms compared to standard second-line chemotherapy [160]. The compound is a type I inhibitor; as such, it binds to the active conformation of the protein kinase (PDB: ID 2XP2). Crizotinib inhibits ALK by occupying the front ATP-binding pocket in the αC-in/DFG-in conformation, forming two hydrogen bonds with E1197 and M1199, as depicted in Figure 7 [133]. Hydrophobic interactions between L1256 and crizotinib stabilize the 2-aminopyridine core and the 3-benzyloxy group, and anchor the scaffold in an L-shape [161].
Unfortunately, drug resistance emerged after prolonged treatment with the first-generation inhibitor crizotinib, as outlined in Table 2 [162]. To overcome this resistance, multiple generations of drugs were developed. Resistance is frequently induced by single point mutations, either within or surrounding the catalytic ALK pocket (such as the L1196M gatekeeper mutation). These mutations alter the structure of the binding pocket, preventing ALK-specific inhibitors from binding in their normal "bioactive" pose. X-ray crystallography and computational simulations allow us to examine the effect of mutations on the ALK-TKI complex and thereby design and develop new TKIs [163].
Ceritinib, a second-generation ALK inhibitor, was discovered through high-throughput screening by Novartis [175]. Ceritinib is almost 10-fold more potent than crizotinib against wild-type ALK and proved highly effective against the L1196M, S1206Y, G1269A, and I1171T EML4-ALK mutants [162]. Like crizotinib, ceritinib is a type I inhibitor and binds to the active ALK conformation. As portrayed in Figure 8, ceritinib forms two hydrogen bonds with M1199, located at the hinge region (PDB: ID 4MKC).
Furthermore, a salt bridge is formed between the piperidine ring and the side chain of E1210. The isopropoxy group of the compound forms favorable interactions with R1120, E1132, and the L1198-A1200-G1201-G1202 hinge segment [133,189].
Figure 8. Two-dimensional representation of second-generation inhibitors, alectinib, brigatinib, and ceritinib. All represented drugs interact with the hinge region through hydrogen bonds with M1199. Alectinib 6-dimethyl-5,6-dihydro-11H-benzo[b]carbazol-11-one scaffold is depicted as illustration of its constrained pose. Interactions were analyzed and plotted using Discovery Studio Visualizer [57]. Atoms color scheme: carbon (grey), nitrogen (blue), oxygen (red), phosphorus (orange), sulfur (beige), and chlorine (green).
Another type I second-generation inhibitor, alectinib, was approved in 2015. This compound is more effective than crizotinib in untreated ALK-positive lung cancer and has an IC50 of 1.9 nM against native ALK [190,191]. Moreover, alectinib inhibits both native ALK (Ki = 0.83 nM) and the L1196M ALK mutant (Ki = 1.56 nM) [190]. Furthermore, alectinib is more effective against tumors with brain metastasis, due to higher penetration across the blood-brain barrier (BBB) [192]. The structure of ALK co-crystallized with alectinib (PDB: ID 3AOX) indicates that, as with other inhibitors, the backbone nitrogen of M1199 forms a hydrogen bond with the inhibitor, here with the carbonyl oxygen of the benzo[b]carbazole moiety. Furthermore, the tetracyclic benzo[b]carbazolone of alectinib adopts a planar conformation in this co-crystallized structure. This allows for the formation of a hydrogen bond network via solute or water molecules to K1150, E1269, E1270, and R1253 [190].
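The two inhibition constants quoted above can be compared as a simple ratio. The snippet below is a minimal sketch of that arithmetic; `fold_change` is a hypothetical helper name, and only the two Ki values are taken from the text.

```python
def fold_change(mutant_value_nm: float, reference_value_nm: float) -> float:
    """Return how many times larger the mutant value is than the reference."""
    if reference_value_nm <= 0:
        raise ValueError("reference value must be positive")
    return mutant_value_nm / reference_value_nm


KI_NATIVE_ALK_NM = 0.83  # alectinib Ki against native ALK, as quoted in the text
KI_L1196M_NM = 1.56      # alectinib Ki against L1196M ALK, as quoted in the text

shift = fold_change(KI_L1196M_NM, KI_NATIVE_ALK_NM)
print(f"L1196M shifts the alectinib Ki by about {shift:.1f}-fold")
```

A shift of less than two-fold is in line with the statement that alectinib retains activity against the gatekeeper mutant.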
Brigatinib has also been approved for EGFR-positive NSCLC, classifying it as a dual inhibitor. The compound engages in multiple interactions with the target: the methoxy group interacts with L1198, the C5-chlorine atom with L1196, and the unique dimethyl phosphine oxide (DMPO) moiety with the DFG motif (PDB: ID 6MX8), as illustrated in Figure 8 [194]. Moreover, the hydrogen bond between the protein and the DMPO moiety stabilizes the compound in a U-shaped conformation.
To date, all FDA-approved ALK TKIs contain an ATP-adenine-equivalent kinase hinge binder and an extra motif extending into the solvent area. This renders the compounds sensitive to solvent-front mutations, e.g., G1202R. In 2018, a new type of compound, a third-generation ALK inhibitor, was approved by the FDA. This compound, lorlatinib, contains an amido-linked 12-membered macrocycle [195]. Johnson et al. compared the apo ALK structure with the ALK/crizotinib co-crystallized structure and used crizotinib as the base to design the compound. As illustrated in Figure 7, the macrocycle of lorlatinib is precisely anchored in the adenine binding site (PDB: ID 4CLI) [195]. Furthermore, two stable hydrogen bonds are formed with M1199 and one with E1197 through the N3, N17, and N24 atoms of lorlatinib [196]. Moreover, van der Waals interactions are established between the compound and R1270 [133]. Hydrophobic interactions with the residues L1122, A1148, L1196, A1200, G1202, and L1256 are conserved between crizotinib and lorlatinib. However, the interactions of the drug with G1123, V1130, F1198, and G1269 are unique to the ALK-lorlatinib complex [196].
Unfortunately, resistance mutations against lorlatinib in ALK-positive NSCLC have already been predicted [197] and reported, as highlighted in Table 2. Notably, the L1198F substitution induces resistance to lorlatinib but resensitizes cells to crizotinib [176].
The inevitability of resistance mutations drives the scientific community to develop a wider range of ALK-specific TKIs. One such TKI in development is ensartinib (X-396), which is seen as a promising candidate because its potency is ten-fold greater than that of crizotinib, with an IC50 of 22 nM [198]. The drug is effective against multiple ALK mutants, including G1269A and C1156Y, and it is currently undergoing Phase II clinical trials [199-202].
Entrectinib (PDB: ID 5FTO) is a promising, potent drug against ALK with an IC50 of 2 nM [203]. It was approved in 2019 for the treatment of ROS1-positive NSCLC patients, but still awaits approval for ALK-positive disease [204]. In phase I trials it was well tolerated and, like ensartinib, appears to have antitumor activity in the central nervous system [205]. Molecular docking and molecular dynamics studies have examined the behavior of the drug in the ALK catalytic domain [206]. These provide early knowledge of the ALK-TKI relationship and of whether inhibitory activity is sufficient for clinical use, findings that might be further used in the study of activating and resistance mutations [207].
Computational studies are heavily reliant on X-ray crystallography. The protein structures that crystallography provides allow for the creation of 3D models of mutants with unknown structure. When an X-ray crystal structure is not available, homology modeling can provide a structural model that still allows for the analysis of certain ALK mutations, as was the case for Okamoto et al. [206].
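As a starting point for such studies, the ALK co-crystal structures cited in this section can be collected in one place. The following is a minimal sketch: the dictionary holds only the PDB identifiers quoted above, while the helper names and the download URL pattern are assumptions made for illustration (the pattern reflects how the RCSB file service is commonly addressed and should be checked against its documentation before use).

```python
import re

# PDB identifiers exactly as cited in this section (inhibitor -> entry).
ALK_COCRYSTAL_STRUCTURES = {
    "crizotinib": "2XP2",
    "ceritinib": "4MKC",
    "alectinib": "3AOX",
    "brigatinib": "6MX8",
    "lorlatinib": "4CLI",
    "entrectinib": "5FTO",
}

# Classic four-character PDB identifier: a digit 1-9 followed by three alphanumerics.
_PDB_ID = re.compile(r"^[1-9][A-Za-z0-9]{3}$")


def is_valid_pdb_id(pdb_id: str) -> bool:
    """Check the format of an identifier; this does not confirm the entry exists."""
    return bool(_PDB_ID.match(pdb_id))


def structure_file_url(pdb_id: str) -> str:
    """Build a download URL (assumed RCSB pattern) for a PDB-format file."""
    if not is_valid_pdb_id(pdb_id):
        raise ValueError(f"not a four-character PDB identifier: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


for drug, pdb_id in ALK_COCRYSTAL_STRUCTURES.items():
    print(f"{drug:12s} {pdb_id}  {structure_file_url(pdb_id)}")
```

The sketch performs no network access; it only prepares the identifiers so that a downstream modeling workflow can fetch or load the structures.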
The L1196M mutation has been dubbed the gatekeeper mutation and is analogous to the T790M EGFR resistance mutation. L1196 controls access of inhibitors to a hydrophobic pocket within the active site. Substituting L1196 with residues bearing bulky side chains leads to resistance to ATP-competitive ALK inhibitors, which is considered a common mechanism of resistance in tyrosine kinases [208]. Studies revealed not only that the mutation causes steric hindrance but also that it affects the structure by shifting the αC-helix, thereby increasing the affinity of the kinase for ATP [163,209].
Extensive in silico analyses have led to the understanding of the resistance mechanism of L1196M against crizotinib [163,209]. The results indicate that L1196M increases the flexibility of the P-loop and A-loop, thereby causing fluctuations in the conformation of the protein. The increased flexibility of the P-loop, including its GxGxΦG motif, increases the binding affinity for ATP [210]. A-loop flexibility alters the hydrophobic interaction of the protein with crizotinib, leading to poor binding of the inhibitor [163]. Furthermore, studies indicate that the L1196M mutation causes a secondary structural modification in the A-loop region, which reduces the stability of the αAL motif (residues 1272-1278). Therefore, the preferentially autophosphorylated Y1278 is destabilized by the L1196M mutation. The van der Waals interactions of Y1278 with Y1096 and M1166, located within the hydrophobic face of the C-helix, are thereby compromised. Changes in van der Waals interactions caused by L1196M have been identified as a major component of the binding energy difference between all wild-type and mutated complexes [209,211]. In silico experiments suggest that the L1196M mutation decreases the total binding energy of crizotinib. This occurs through a decrease in the non-bonded contribution and an increase in the entropy upon binding of the ligand. The main cause of resistance to other drugs is associated with a change in the overall binding energy [163].
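The bookkeeping behind such comparisons can be sketched with a generic additive decomposition of the binding energy into non-bonded (van der Waals and electrostatic), solvation, and entropic terms, as is common in MM/GBSA-style analyses. This is a minimal sketch under that assumption and is not the protocol of the cited studies; the class name, field names, and all numerical values are hypothetical placeholders chosen only to make the arithmetic visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BindingEnergy:
    """Additive binding-energy terms, all in kcal/mol (placeholder units)."""
    vdw: float              # van der Waals contribution
    electrostatic: float    # electrostatic contribution
    solvation: float        # solvation contribution
    minus_t_delta_s: float  # entropic term, -T*dS (positive = unfavorable)

    @property
    def non_bonded(self) -> float:
        return self.vdw + self.electrostatic

    @property
    def total(self) -> float:
        return self.non_bonded + self.solvation + self.minus_t_delta_s


def delta_delta_g(wild_type: BindingEnergy, mutant: BindingEnergy) -> float:
    """Positive values mean the ligand binds the mutant more weakly."""
    return mutant.total - wild_type.total


# Hypothetical numbers, NOT taken from any cited study.
wild_type = BindingEnergy(vdw=-50.0, electrostatic=-20.0, solvation=30.0, minus_t_delta_s=15.0)
mutant = BindingEnergy(vdw=-45.0, electrostatic=-18.0, solvation=30.0, minus_t_delta_s=18.0)

print("non-bonded change:", mutant.non_bonded - wild_type.non_bonded)
print("entropic change:  ", mutant.minus_t_delta_s - wild_type.minus_t_delta_s)
print("ddG (mutant - WT):", delta_delta_g(wild_type, mutant))
```

In this toy case a less favorable non-bonded term and a larger entropic penalty both push the total towards weaker binding, which mirrors the qualitative picture described for L1196M and crizotinib.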
In summary, the L1196M mutation introduces conformational flexibility that leads to a shift in the αC-helix, which increases the binding affinity for ATP while simultaneously decreasing the binding affinity for the ALK inhibitor [209,210]. Many other computational studies have uncovered different resistance mechanisms of ALK mutations, such as F1174V, L1198F, and the frequent G1202R [196,212-214]. For example, molecular dynamics and binding energy calculations have shown that the C1156Y mutation disrupts the van der Waals and electrostatic interactions between the mutated ALK and crizotinib [215]. The mutation alters the conformations of the P-loop, β-sheet, and αC-helix, causing the displacement of crizotinib.
Similarly, molecular dynamics together with binding free energy calculations revealed that the F1174C mutation disrupts the aromatic-aromatic network formed among residues F1098, F1174, F1245, and F1271 [216]. This allosterically affects the dynamics of the P-loop, leading to its upward movement, away from the ATP-binding site. The mutation weakens the binding of ALK to ceritinib because hydrophobic interactions are disrupted, causing drug resistance. Computational studies are not limited to single ALK point mutations. Doubly mutated ALK catalytic domains have also been subjected to virtual molecular studies, such as the L1196M/G1269A [217] or the L1198F/G1202R double mutant, where the effects of the individual mutations as well as those of the double mutant were considered [218].
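When preparing single or double mutant models, the substitution shorthand used throughout this section (for example L1196M or L1198F/G1202R) can be parsed and applied to a sequence programmatically. The following is a minimal sketch; the function names are hypothetical, and the seven-residue fragment is assembled only from the residue labels quoted in this section (L1196, E1197, L1198, M1199, A1200, G1201, G1202) and serves as a toy input rather than a full ALK sequence.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    wild_type: str
    position: int
    mutant: str


_AA = "ACDEFGHIKLMNPQRSTVWY"
_PATTERN = re.compile(rf"^([{_AA}])(\d+)([{_AA}])$")


def parse_mutations(label: str) -> List[Substitution]:
    """Parse 'L1196M' or 'L1198F/G1202R' into a list of substitutions."""
    substitutions = []
    for token in label.split("/"):
        match = _PATTERN.match(token.strip())
        if match is None:
            raise ValueError(f"cannot parse substitution: {token!r}")
        substitutions.append(
            Substitution(match.group(1), int(match.group(2)), match.group(3))
        )
    return substitutions


def apply_mutations(sequence: str, first_residue_number: int, label: str) -> str:
    """Apply substitutions after checking the wild-type residue at each position."""
    residues = list(sequence)
    for sub in parse_mutations(label):
        index = sub.position - first_residue_number
        if not 0 <= index < len(residues):
            raise ValueError(f"position {sub.position} lies outside the fragment")
        if residues[index] != sub.wild_type:
            raise ValueError(
                f"expected {sub.wild_type} at {sub.position}, found {residues[index]}"
            )
        residues[index] = sub.mutant
    return "".join(residues)


# Toy fragment covering residues 1196-1202, built from labels quoted in the text.
FRAGMENT = "LELMAGG"
print(apply_mutations(FRAGMENT, 1196, "L1196M"))
print(apply_mutations(FRAGMENT, 1196, "L1198F/G1202R"))
```

Checking the wild-type residue before substitution guards against off-by-one numbering errors, which are easy to introduce when a model does not start at residue 1.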
Computational studies may not only clarify the mechanism of resistance of mutations but also indicate the compatibility of certain drugs with the mutated ALK protein. In the case of L1198F, computational studies showed that the mutation alters the conformation of the ATP binding pocket, which is unfavorable for lorlatinib binding. This conformational change allows for the resensitization of the protein to crizotinib, with a new hydrogen bond being formed between the fluorine atom and K1150. MMGBSA analysis of the simulations indicates improved binding of crizotinib to the L1198F single or double mutant when compared to WT ALK [176,218]. The divergence between experimental and theoretical data for the G1202R mutant is explained once the apo structure of the mutant is analyzed through MD simulations. The resistance is suggested to be due to a hydrogen bond involving R1202 and the backbone of L1122, hindering the access of drugs to the binding pocket. This blockage is absent in the double mutant L1198F/G1202R due to the interaction of F1198 with E1132, which displaces R1202 [196].
Recently, Shaw et al. reported a case in which virtual molecular studies helped guide the treatment regimen of a patient [176]. The patient was diagnosed with ALK-positive NSCLC and treated accordingly with crizotinib. After prolonged treatment, however, resistance arose through the C1156Y mutation, and the patient was switched to ceritinib. After entering a clinical trial in which the patient received lorlatinib, the patient acquired a second mutation, L1198F, resulting in the C1156Y/L1198F double mutant. Interestingly, the patient was resensitized to crizotinib, as observed for the single mutant L1198F [176].
In addition, Shaw et al. set out to understand the mechanism of the patient's mutated ALK protein and included a structural study. The computational study proposed that the C1156Y substitution increased the kinase activity by shifting the P-loop, even though C1156 is positioned 13 Å from the inhibitor binding site. The novel resistance mutation, L1198F, resides near the ATP-binding site. This mutation forces the rigid macrocyclic inhibitor lorlatinib to rotate away from the introduced phenylalanine. This disrupts the hinge-binding interaction with the inhibitor and introduces strain into the kinase, reducing the binding energy compared to wild-type ALK. Interestingly, the newly introduced phenylalanine does not clash with crizotinib; instead, it moves slightly closer to the inhibitor. In the double mutant, the computational study revealed that the enhanced binding caused by L1198F offsets the increased kinase activity introduced by C1156Y, resulting in crizotinib sensitization [176].
This case is a prime example of computational biology applied to diagnosis and personalized medicine, in which theoretical data were evaluated against different sources of experimental output. The structure-based study led to an understanding of the mechanism of the double mutant carrying a novel resistance mutation. With the structure and mechanism understood, an informed clinical decision could be made and treatment with crizotinib was restarted. The patient had a "rapid and dramatic clinical improvement" [176].

Rapidly Accelerated Fibrosarcoma Homologue B (BRAF)
The RAF (rapidly accelerated fibrosarcoma) protein family members ARAF, BRAF, and CRAF are core protein kinases in the mitogen-activated protein kinase (MAPK) pathway represented in Figure 2 [219]. The MAPK cascade plays a major role in cellular signal transduction, and it is activated by different extracellular ligands such as growth factors, cytokines, and hormones. This activation, depending on specific cellular conditions, mediates cell growth, survival, and differentiation [220,221]. RAS (rat sarcoma viral oncogene), a membrane-associated protein, induces dimerization and activates RAF proteins when loaded with GTP. The activation of RAF kinases leads to phosphorylation of mitogen-activated protein kinase kinase (MEK), which in turn phosphorylates ERK (extracellular signal-regulated kinase) downstream in the pathway [221,222]. Although it is an ATP-dependent kinase like EGFR and ALK, BRAF is classified as a serine-threonine kinase and does not contact the extracellular environment directly, instead depending on tyrosine kinase receptors (TKRs) and RAS binding for its activation. RAF activity is tightly regulated and its malfunction is often linked to cancer [223].
The three isoforms share three conserved regions: CR1 and CR2 in the N-terminus, and CR3, which contains the kinase domain, in the C-terminus [224]. CR1 is composed of a RAS-GTP binding domain (RBD) and a cysteine-rich domain, which binds two zinc ions [225]. In the autoinhibited state, the cysteine-rich domain also functions as an autoinhibitor of the kinase domain [226]. The CR2 region is rich in serine and threonine residues and contains a binding site for the 14-3-3 protein [225].
Confined between CR1 and CR3, CR2 functions as a flexible hinge between the two regions [227]. Besides the kinase domain, the CR3 region presents a binding site for 14-3-3 proteins, a class of helical regulatory molecules. The N-terminus of CR3 contains a glycine-rich motif responsible for the stabilization of ATP binding, which also contributes to the stabilization of the inactive conformation. The substrate-binding pocket is found in the C-terminal end of CR3 and has a catalytic loop that promotes the transfer of the phosphate group from ATP to the enzyme substrate [228,229]. Although all three isoforms contribute to physiological processes, the focus of this review will be on the BRAF isoform due to its contribution to NSCLC (UniProt: ID P15056) [230].
Structurally, BRAF presents two lobes connected by a hinge region, as mentioned for EGFR and ALK. Regulatory components such as the P-loop (G464-G469), the movable αC-helix (T491-R506), and the dynamic A-loop with its DFG motif (D594/F595/G596) are also present, as depicted in Figures 1 and 9 [231,232].
ATP binding to BRAF follows a similar pattern as described for the previous kinases, with the adenosine ring anchored through hydrogen bonds to the backbone of Q530 and C532 in the hinge region. In addition to stabilization by the P-loop and the DFG motif, the non-transferable phosphate groups are in contact with K483 and E501, with the positive charge of K483 bending towards the ATP α and β phosphate groups, and being stabilized by E501 in the absence of a nucleotide [233].
In quiescent cells, RAF proteins are localized in the cytosol and are devoid of subcellular localization motifs [234]. They exist in an autoinhibited state in which the N-terminus interacts with the C-terminus and represses kinase activity [222]. Dimerization of the kinase is believed to be an important step in RAF activation, and binding of RAS-GTP to the RBD triggers the dimerization of RAF members, leading to the formation of homo- or heterodimers [235,236].
In quiescent cells, besides existing in the autoinhibited state, BRAF forms a complex with MEK. RAF and MEK form a dimer interface primarily through the C-lobes of both kinases, including the activation segment. Characterization of the BRAF-MEK1 complex (PDB: ID 4MNE and 6PP9) provides insights into the dynamics of the protein-protein interaction and casts light on the inactive complex in the presence of the inhibitory 14-3-3 protein [237,238]. Upon RAS-GTP-mediated pathway activation, BRAF bound to MEK dimerizes with CRAF or another BRAF monomer to form a transient active heterotetrameric RAF-MEK complex, as shown in Figure 2, resulting in phosphorylated MEK. Phosphorylation of MEK destabilizes the heterotetrameric complex, leading to dissociation of MEK and allowing MEK to phosphorylate the kinase domain of ERK [237].
Studies conducted to examine RAF dimerization have shown that all RAF proteins are capable of forming homo- or heterodimers [236,239]. However, growth factor stimulus predominantly induces BRAF/CRAF heterodimerization, with a low level of BRAF/ARAF and little to no CRAF/ARAF heterodimers observed [239]. Moreover, BRAF/CRAF heterodimers have a higher kinase activity than their respective monomers or homodimers [235]. Dimerization of RAF family members relies on the αC-helices of both units of the dimer packing against each other. Therefore, mutations at the intermolecular interface are linked to the disruption of RAF dimers. As an example, the mutation of a conserved arginine in both BRAF (R509) and CRAF (R401) into a histidine residue disrupts the dimerization process [239,240]. The introduction of this mutation is useful when seeking crystallization of monomeric BRAF instead of its dimeric state (PDB: ID 4RZV) [241].
In the inactive kinase conformation, the activation segment is packed and the αC-helix is in the "out" conformation, disrupting the salt bridge between the conserved lysine of the β3 strand and the glutamate of the αC-helix. Activation is usually triggered by phosphorylation and leads to the disruption of the inactive conformation: the activation segment takes the position needed for catalysis (the DFG-in state) and the αC-helix turns "in" to form the catalytic salt bridge [242].
RAF paralogs are subject to complex regulation, and their activity depends on the phosphorylation of different regions, the binding of inhibitory proteins, and intramolecular autoinhibitory interactions between the different domains of the protein. The details of their regulation also vary based on the regions conserved in each isoform [221,222]. The activation segment, found at the center of the catalytic domain, plays a key role in the regulation, and phosphorylation of residues within it converts the inactive kinase conformation into the active one [243]. ARAF and CRAF require additional phosphorylation within the N region of the kinase domain, whereas BRAF does not [221,244]. This difference in their activation implies that these isoforms can be regulated independently [244].
Although all RAF isoforms are important in normal physiology, BRAF has higher basal kinase activity than ARAF and CRAF. Moreover, it is by far the most frequently mutated RAF kinase in cancer [229,245]. BRAF mutations have been identified in malignancies such as thyroid cancer, colorectal cancer, and hairy cell leukemia. When mutated, BRAF is a major therapeutic target in melanoma and, interestingly, also in NSCLC [246]. Mutations in BRAF are mostly found in the activation segment (A-loop) near position 600 or in the glycine-rich phosphate-binding loop (P-loop), with rare cases in flanking regions [245].
While all BRAF mutations constitutively hyperactivate the MAPK pathway, they differ in their mechanism of activation. Based on their effect on kinase activity, RAS dependency, and dimerization state, BRAF mutations can be categorized into three distinct groups [227]: class 1 BRAF mutants act as RAS-independent active monomers, whereas class 2 BRAF mutants are active dimers. Class 3 mutants are RAS-dependent and possess low kinase activity [227]. As mentioned above, a significant number of oncogenic BRAF mutations occur in the conserved region of the kinase domain, specifically in the P-loop and the DFG motif, destabilizing the inactive conformation and leading to auto-activation of BRAF independent of upstream signaling [244].
Class 2 mutants have an intermediate to high kinase activity, whereas class 3 mutants have an impaired kinase activity [234]. Class 2 mutations predominantly reside in the activation segment or P-loop, whereas class 3 mutations are located in the P-loop, catalytic loop, and DFG motif [227]. Class 2 mutations found in the activation segment block its interaction with the P-loop, which is important for the auto-inhibition of BRAF, leading to increased intrinsic kinase activity. Class 1 and 2 mutated kinases act independently of their upstream activator, the RAS GTPase, in signaling growth and proliferation in cancer cells [229].
In contrast, class 3 mutants are RAS-dependent and possess low kinase activity compared to wild-type BRAF, or lack kinase activity altogether [227]. However, these non-oncogenic BRAF mutations, which have a lower kinase activity, stimulate ERK activity by disfavoring BRAF-MEK complex formation and potentiating BRAF-CRAF dimerization. Those mutants trigger the dimerization of WT BRAF and CRAF, and their stimulation of the MAPK pathway is dependent on WT CRAF [237,244].
BRAF V600E/D/K/R mutations are categorized as class 1 mutations and are the most frequently identified BRAF mutations in cancer. These mutations lead to a 500-fold increase in kinase activity compared to the wild-type protein [244]. The WT BRAF kinase domain structure shows that the aliphatic side chain of V600 contacts the phenyl ring of residue F468 found in the P-loop. Substitution of valine with a larger and hydrophilic amino acid, such as glutamic or aspartic acid, lysine, or arginine, is presumed to disrupt the interaction that maintains the DFG motif in the inactive conformation, thereby holding the activation segment in an active position [244]. Additionally, mutation at position 600 (V600X) could also guide the activation segment into the active conformation, resulting in increased BRAF activity [244]. This conformational change is similar to the one occurring during dimerization, which explains why BRAF V600E is active as a monomer and does not depend on dimerization for kinase activity and MEK hyperactivation [222,228].
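The three-class scheme described above reduces to two questions: whether signaling depends on RAS, and whether the mutant needs to dimerize to be active. The following is a minimal sketch of that decision logic; the function name and its boolean parameters are hypothetical simplifications, and real classification relies on experimental kinase and dimerization data rather than on two flags.

```python
def classify_braf_mutant(ras_dependent: bool, requires_dimerization: bool) -> int:
    """Return the BRAF mutation class (1, 2, or 3) following the scheme in the text."""
    if ras_dependent:
        # Class 3: RAS-dependent, with low or absent kinase activity.
        return 3
    if requires_dimerization:
        # Class 2: RAS-independent, signals as an active dimer.
        return 2
    # Class 1: RAS-independent, active as a monomer (e.g., V600E/D/K/R).
    return 1


# V600E is described as RAS-independent and active as a monomer.
print("V600E -> class", classify_braf_mutant(ras_dependent=False, requires_dimerization=False))
```

RAS dependency is tested first because, in the scheme as summarized here, it alone separates class 3 from the two RAS-independent classes.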
Although BRAF V600E mutants are functional as monomers, an in vitro study by Cope et al. revealed that BRAF V600E mutants show increased oligomer formation, indicating that V600E contributes to a higher dimerization potential of BRAF. In the same study, MD simulations showed that the BRAF V600E mutant preferentially adopts a conformation in which the activation segment is fully uncoiled and the αC-helix takes the αC-in position. Because the αC-in state is associated with BRAF dimerization, this provides evidence for how BRAF V600E shows an amplified dimerization capacity. Moreover, the introduction of R509H severely disrupts BRAF dimers but does not abolish the kinase activity of BRAF V600E. This implies that the V600E mutation could influence BRAF kinase activity in two ways, by creating an active monomer and by increasing dimerization potential [247].
Over 40 mutations have been identified in BRAF. However, the substitution of a valine residue by glutamic acid at position 600 (V600E) is responsible for 92% of BRAF mutations in different types of cancer [245]. This mutation is located proximal to the DFG motif and introduces a negatively charged side chain at a previously hydrophobic position. The negatively charged side chain of glutamic acid has an effect on the activation loop comparable to that of a phosphate group on the neighboring residues T599 and S602, classifying this variant as phosphomimetic. Similarly to the L858R variant of EGFR, the V600E mutation is believed to favor the active conformation by increasing the energy threshold required to transition from the active into the inactive conformation, locking the kinase in a constitutively active conformation. This loss of inhibition increases the basal kinase activity, creating an oncogenic mutation [248].
Identification of BRAF gene alterations in different cancers and a better understanding of the three-dimensional structure of BRAF have enabled the design and development of small-molecule inhibitors that specifically inhibit BRAF kinase activity [229,249]. The first-generation RAF inhibitors were ATP-competitive inhibitors, of which sorafenib was the first to be approved. Sorafenib has a multi-kinase profile and acts on wild-type and V600E BRAF, CRAF, VEGFR, PDGFR, c-KIT, and FLT3 [229,249].
Sorafenib was first designed as a CRAF inhibitor, targeting the inactive conformation of RAF proteins [244,250]. Sorafenib was shown to be ineffective in the treatment of melanoma as a monotherapy or in combination with chemotherapy [229], but was found to be effective in renal cell and hepatocellular carcinoma, without evidence of efficacy in NSCLC treatment [251,252]. The co-crystal structure of sorafenib with WT BRAF (PDB: ID 1UWH) demonstrated that the drug locks the kinase domain in the inactive conformation by inducing a DFG-out conformation and decreasing activation loop mobility. The locked DFG-out conformation is stabilized by the urea carbonyl moiety of sorafenib, which acts as a hydrogen bond acceptor for the backbone amide of D594, with further steric hindrance created by the trifluoromethyl ring. The pyridyl ring mimics the adenosine ring interaction with W531, F583, and F595, which are part of the hinge region, catalytic loop, and DFG motif, respectively. Residue F595 in the DFG motif also interacts with the central benzene ring of sorafenib, as demonstrated in Figure 10. The other end of the inhibitor contains a lipophilic trifluoromethyl phenyl ring and is buried in the hydrophobic pocket formed between the αC and αE helices and the N-terminal region of the catalytic loop and DFG motif [244]. The same binding mode is observed in the presence of the V600E mutation, with the DFG-out conformation conserved (PDB: ID 1UWJ) [244].
Second-generation inhibitors were designed to specifically inhibit V600E-mutant BRAF. Through structure-based drug design, vemurafenib (PLX4032) was developed and approved by the FDA for the treatment of advanced-stage melanoma [253]. Dabrafenib (GSK2118436), another mutant-selective BRAF inhibitor, likewise gained approval in 2013 [254]. Encorafenib (LGX818), also an ATP-competitive second-generation RAF inhibitor, has a higher potency than vemurafenib and dabrafenib due to its significantly slower off-rate from BRAF V600E, and it is currently in a phase II clinical trial in combination with binimetinib, a MEK inhibitor (clinical trial identifier: NCT03915951). Moreover, in vivo and in vitro data showed that it is inactive against BRAF wild-type tumors [255,256].
Figure 10. Two-dimensional representation of BRAF inhibitors, sorafenib, dabrafenib, and vemurafenib. All represented drugs interact with the hinge region through hydrogen bonds with Q530 and C532. Vemurafenib and dabrafenib present a functionalized sulfonamide moiety providing an anchoring point with the allosteric pocket and the αC-helix (L505). Interactions were analyzed and plotted using Discovery Studio Visualizer [57]. Atoms color scheme: carbon (grey), nitrogen (blue), oxygen (red), sulfur (beige), chlorine (green), and fluorine (light blue).
Studies conducted to analyze the binding mode of BRAF inhibitors have shown that different sulfonamide-based inhibitors, like vemurafenib, bind to the off-state conformation of BRAF [257]. Interaction of the sulfonamide moiety with residues at the N-terminus of the DFG region leads to the binding of the alkyl chain to a "RAF-selective pocket", which is unique to the RAF family. This binding is critical and drives selective inhibition of oncogenic mutants [258]. This unique pocket contains residues L505, L514, F516, and F595, and the interaction between the functionalized sulfonamide moiety and the L505 side chain stabilizes the αC-helix in an outward position [257].
RAF inhibitors are also classified by the conformation of the regulatory αC-helix and DFG motif resulting from drug binding. In this classification, inhibitors can be type I (αC-in/DFG-in), type I 1/2 (αC-out/DFG-in), or type II (αC-in/DFG-out) [241,259,260]. First- and second-generation BRAF inhibitors belong mostly to types I and I 1/2, with vemurafenib and dabrafenib belonging to the type I 1/2 class. Type II inhibitors target the inactive αC-in/DFG-out kinase state, whereas type I 1/2 kinase inhibitors bind to another inactive conformation, αC-out/DFG-in, which blocks the formation of the catalytically competent enzyme form. Type II and type I 1/2 kinase inhibitors are more selective, since these molecules target a particular inactive conformation that could be structurally unique to BRAF [261]. The crystal structures of different inhibitors with the BRAF kinase domain have been solved, including vemurafenib (PDB: ID 3OG7, 4RZV), sorafenib (PDB: ID 1UWH, 1UWJ), and dabrafenib (PDB: ID 4XV2, 5CSW) [244,258,262].
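This conformational classification amounts to a small lookup on two states, the position of the αC-helix and of the DFG motif. The following is a minimal sketch of that lookup; the function name, the string encoding of the states, and the "unclassified" fallback for the αC-out/DFG-out combination (which is not assigned a type in this section) are assumptions made for illustration.

```python
from typing import Dict, Tuple

# (αC-helix state, DFG state) -> inhibitor type, as defined in the text.
INHIBITOR_TYPES: Dict[Tuple[str, str], str] = {
    ("in", "in"): "type I",
    ("out", "in"): "type I 1/2",
    ("in", "out"): "type II",
}


def classify_inhibitor(alpha_c: str, dfg: str) -> str:
    """Map the conformation stabilized by an inhibitor to its type."""
    key = (alpha_c.strip().lower(), dfg.strip().lower())
    for state in key:
        if state not in ("in", "out"):
            raise ValueError(f"state must be 'in' or 'out', got {state!r}")
    # αC-out/DFG-out is not covered by the classification given in the text.
    return INHIBITOR_TYPES.get(key, "unclassified")


# Binding modes of the BRAF inhibitors as described in this section.
EXAMPLES = {
    "sorafenib": ("in", "out"),
    "vemurafenib": ("out", "in"),
    "dabrafenib": ("out", "in"),
}

for drug, (alpha_c, dfg) in EXAMPLES.items():
    print(f"{drug}: {classify_inhibitor(alpha_c, dfg)}")
```

In practice the two states would be derived from a co-crystal structure, for example from the distance of the conserved lysine-glutamate salt bridge and the orientation of the DFG phenylalanine, rather than entered by hand.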
The crystal structures of the type II inhibitor sorafenib with WT BRAF and V600E BRAF have shown that type II inhibitors bind strongly to both monomers and exhibit a long inhibition period, and may be relatively weak dimerization inducers. Sorafenib binds to both protomers and leads to the dimerization of BRAF [244]. The recent discovery of LY3009120 (PDB: ID 5C9C) led to a better understanding of "paradoxical" activation by BRAF inhibitors. LY3009120 is a type II pan-RAF dimer inhibitor of the three RAF isoforms and causes minimal pathway activation, due to its effective binding to both protomers of the dimer [263]. Vemurafenib, a typical type I 1/2 inhibitor, binds to only one of the monomers of the dimer and stabilizes the αC-out conformation. This position induces negative allosteric changes in the αC-helix of the other monomer, resulting in a conformation that is resistant to vemurafenib binding and in activation of the BRAF-CRAF heterodimer (hence the term "paradoxical"). This mechanism of vemurafenib is best visualized in the BRAF-vemurafenib co-crystal structure (PDB: ID 3OG7) [258].
Dabrafenib is categorized as a type I 1/2 inhibitor and binds to both protomers with an identical DFG-in/αC-helix-out conformation (PDB: ID 4XV2, 5CSW) [262,264]. The potential of type I 1/2 inhibitors to induce dimerization could be correlated with the extent of their paradoxical activation. Accordingly, whereas dabrafenib is a relatively strong promoter of dimerization and paradoxical activation, vemurafenib may only weakly induce dimerization and shows decreased paradoxical activation in biological assays [262,264,265]. Responses of dabrafenib, vemurafenib, and encorafenib to BRAF mutations are summarized in Table 3.
MD simulations, combined with cell assays of paradox inducers (vemurafenib and PLX4720) and paradox breakers (PLX8394 and PLX7904), indicated that a subtle structural difference could result in profound changes in the conformation and overall dynamics of the BRAF dimer complex [266]. Arora et al. showed that paradox inducers could increase the movement of both the αC-helix and the A-loop region, whereas paradox breakers promote a relatively stable αC-helix and activation segment. Analysis of hydrogen bonds throughout the MD simulations showed that paradox inducers form a significantly higher number of hydrogen bonds with the gatekeeper residue T529 than paradox breakers, providing an explicit discrimination between paradox inducers and breakers [266].
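As a rough illustration of how hydrogen bonds are typically tallied along an MD trajectory, the sketch below computes the fraction of frames in which one donor-acceptor pair satisfies a geometric criterion. It is a minimal Python example, not the analysis used in [266]: the function name, the cutoffs (3.5 Å donor-acceptor distance, 150° donor-hydrogen-acceptor angle), and the per-frame values are all assumptions made for this sketch.

```python
from typing import Iterable, Tuple


def hbond_occupancy(
    frames: Iterable[Tuple[float, float]],
    max_distance: float = 3.5,  # Angstrom; assumed cutoff
    min_angle: float = 150.0,   # degrees; assumed cutoff
) -> float:
    """Fraction of frames in which a donor-acceptor pair is hydrogen bonded.

    Each frame is (donor-acceptor distance in Angstrom,
    donor-hydrogen-acceptor angle in degrees).
    """
    frames = list(frames)
    if not frames:
        return 0.0
    bonded = sum(
        1 for distance, angle in frames
        if distance <= max_distance and angle >= min_angle
    )
    return bonded / len(frames)


# Invented per-frame geometries for two hypothetical ligand complexes.
complex_a = [(2.9, 165.0), (3.1, 158.0), (3.0, 170.0), (3.8, 140.0)]
complex_b = [(4.2, 120.0), (3.3, 155.0), (4.5, 110.0), (4.0, 130.0)]

print(hbond_occupancy(complex_a))  # bonded in 3 of 4 frames
print(hbond_occupancy(complex_b))  # bonded in 1 of 4 frames
```

Comparing such occupancies for contacts with a single residue, such as the gatekeeper, is one simple way to quantify the kind of difference described above; real analyses would read distances and angles from trajectory files rather than from hand-written lists.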
As observed for EGFR and ALK, gatekeeper mutations are a common mechanism of resistance, as they do not hamper ATP binding but preclude interactions with the hydrophobic pocket composed of the αC-helix and A-loop, a common pocket for drug development. This residue is also considered an essential structural feature that could influence kinase activation [208]. Arguably, the addition of the sulfonylurea group in paradox breakers blocks the interaction with the gatekeeper residue T529, inducing an overall conformational change in the paradox breaker complex and promoting an open conformation. In contrast, the sulfonamide group of the paradox inducers promotes a closed conformation that facilitates BRAF dimerization [266], which can lead to paradoxical activation of the MAPK pathway in the presence of ATP-competitive BRAF inhibitors [240,260].
The mutational landscape of BRAF, despite being limited when compared to EGFR and ALK in NSCLC patients, remains a topic of interest due to its complex contribution to signaling and its possible future role in lung cancer [229].

Kirsten Rat Sarcoma Viral Oncogene Homologue (KRAS)
The rat sarcoma viral oncogene (RAS) family includes Kirsten rat sarcoma viral oncogene homolog (KRAS), neuroblastoma rat sarcoma viral oncogene homolog (NRAS), and Harvey rat sarcoma viral oncogene homolog (HRAS). The RAS family is a membrane-bound GTPase family with a pivotal role in cell homeostasis. RAS is responsible for the conversion of a plethora of extracellular signals from multiple TKRs (e.g., EGFR, insulin receptors) and G protein-coupled receptors (GPCRs) into intracellular signals, through the activation of canonical cascades such as PI3K/AKT/mTOR and RAF/MEK/ERK. Therefore, this family is associated with a multitude of typical cellular functions, from proliferation and survival to motility and gene transcription [281].
RAS exists in two conformations: an inactive conformation with a GDP molecule bound and an active conformation with GTP bound. The association of GTP with RAS leads to conformational changes that allow downstream effectors to bind and be further activated by one of the three RAS isoforms [282].
A slow intrinsic hydrolysis rate is not compatible with the dynamic role of RAS in cell survival. A combination of RAS-associated proteins, such as GTPase-activating proteins (GAPs) and guanine nucleotide exchange factors (GEFs), is responsible for the strict amplification and control of RAS activity. GAPs accelerate GTP hydrolysis by RAS, increasing its GTPase activity. Once GDP is generated, GEFs help RAS release it, allowing a new GTP molecule to bind [283].
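The interplay between nucleotide exchange and hydrolysis can be illustrated with a deliberately simplified two-state model, in which RAS is either GDP-bound or GTP-bound and both steps are treated as first-order processes with effective rate constants. Under these assumptions (which are ours, not taken from [283]), the fraction F of GTP-bound RAS follows dF/dt = k_exchange(1 - F) - k_hydrolysis F, so that at steady state F = k_exchange / (k_exchange + k_hydrolysis). The Python sketch below evaluates this expression; the function name and the rate values are hypothetical and in arbitrary units.

```python
def gtp_bound_fraction(k_exchange: float, k_hydrolysis: float) -> float:
    """Steady-state GTP-bound fraction in a two-state RAS cycle.

    k_exchange:   effective rate of GDP -> GTP exchange (GEF-assisted)
    k_hydrolysis: effective rate of GTP -> GDP hydrolysis (GAP-assisted)
    Both are assumed first-order and expressed in the same units.
    """
    if k_exchange < 0 or k_hydrolysis < 0:
        raise ValueError("Rate constants must be non-negative")
    total = k_exchange + k_hydrolysis
    if total == 0:
        raise ValueError("At least one rate constant must be positive")
    return k_exchange / total


# Arbitrary illustrative rates, not measured values.
print(gtp_bound_fraction(1.0, 9.0))   # baseline
print(gtp_bound_fraction(1.0, 99.0))  # faster hydrolysis (more GAP activity)
print(gtp_bound_fraction(9.0, 9.0))   # faster exchange (more GEF activity)
```

In this toy model, anything that slows hydrolysis or accelerates exchange raises the GTP-bound fraction, which is the qualitative behavior expected from the regulation described above.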
Despite being rather small proteins (~190 amino acids), RAS family members have well-defined regions associated with either their function (binding site, switch I, switch II, effector binding site) or regulation (GEF binding site and GAP binding site). GTP binds to the nucleotide-binding region where it is converted into GDP. Upon GTP hydrolysis, the switch I (Q25-Y40) and switch II (D57-G75) regions change their conformation allowing GAP and GEF binding. The changes also allow the binding of effector proteins while in the active conformation [283].
RAS signaling is structurally linked to conformational changes in the GTP-bound (active) state. In the presence of GTP, RAS proteins present a closed conformation of two structural elements (switch I and switch II), allowing the binding of effectors such as RAF kinases and PI3K. The positively charged P-loop stabilizes the negatively charged phosphate group, thereby maintaining the GDP-bound conformation. The P-loop is conserved throughout the family as a GxxxxGKS/T motif. A conserved lysine stabilizes the negative charges of the β and γ phosphate groups, and the remaining negative charges are stabilized by the bidentate interaction with a magnesium ion. Another conserved motif, NKxD, stabilizes the guanine base [284,285].
Structurally, the RAS family follows the canonical α,β-fold with six β-strands and five α-helices, like other nucleotide-binding proteins, as shown in Figure 11 [286]. The variability of conformations, especially of the switch II region, was analyzed through the lens of computational biology. Interestingly, the output of MD simulations shows that there is not only a change of flexibility in the three isoforms but also a change in their degrees of freedom upon binding of different nucleotide states. Kapoor and Travesset also uncovered a transient new pocket in KRAS, neighboring the nucleotide-binding site. The proximity of this new transient pocket to the effector domain and the membrane-binding domain provides insight into a possible targetable site for the abrogation of nucleotide exchange [287].
Probe-based Molecular Dynamics (pMD) differs from the classical method by using probe molecules in conjunction with the usual solvents (e.g., water) [288,289]. This method is comparable with fragment-based crystallography and NMR spectroscopy [290,291]. By applying the principles of pMD to the RAS family, the initial binding pockets were analyzed dynamically, indicating that these are better formed upon protein relaxation [292,293]. Besides four allosteric pockets, pMD also highlighted reactive surfaces that might be involved in protein-protein or protein-membrane interactions [294].
Due to its association with different cancer types, KRAS has gained the spotlight [4], with its role in tumor survival evident from KRAS-driven mouse models [295]. Despite its critical role in pancreatic and colorectal cancer, the KRAS mutational landscape has also been in focus in lung cancer with related prognostic and predictive power [296]. Due to its relevance for oncogenic addiction, multiple attempts to target the RAS family have been described. However, they failed, either due to moderate efficacy or lack of selectivity leading to toxic effects [292,297–302].
Recent advances provided experimentally determined structures for less common mutants such as G13C (PDB: ID 6OB3), D33E located on switch I (PDB: ID 6BP1), and A59G (PDB: ID 6ASA) [307,308]. Q61 is located in the switch II region and is related to effector binding. In KRAS, the P-loop is adjacent to both switch I and switch II, leading to the hypothesis that these mutations, as well as variations at codon 61, could affect KRAS binding to effector partners [287].
As previously mentioned, wild-type KRAS exists in two states. A molecular dynamics study indicates that hotspot mutations influence the inactive-to-active conformational transition [309].
Like WT KRAS, the G12C mutant still exists in two states, active and inactive. Both the G12C and the G12D mutants alter the dynamics of KRAS, shifting the protein such that the GTP-binding pocket resides in a more open conformation compared to wild-type KRAS-GDP. These open conformations increase the solvent-accessible surface area (SASA) by almost 23% and 14%, respectively [309].
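For readers less familiar with how such percentages are obtained, the relative change is simply the difference in SASA divided by the wild-type value. The Python sketch below shows the arithmetic; the function name and the absolute areas are hypothetical placeholders, chosen only so that the results match the percentages quoted above, and are not values from [309].

```python
def percent_increase(reference: float, value: float) -> float:
    """Relative increase of `value` over `reference`, in percent."""
    if reference <= 0:
        raise ValueError("Reference must be positive")
    return 100.0 * (value - reference) / reference


# Placeholder pocket SASA values (in square Angstrom), not measured data.
sasa_wild_type = 1000.0
sasa_mutant_a = 1230.0
sasa_mutant_b = 1140.0

print(percent_increase(sasa_wild_type, sasa_mutant_a))  # 23.0
print(percent_increase(sasa_wild_type, sasa_mutant_b))  # 14.0
```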
Of note, the increased SASA of the G12D substitution culminated in a more prominent open conformation than that of the G13D mutation. The G12D mutation also causes increased atomic fluctuations in the P-loop and switch II regions [310]. Interestingly, the open conformation is not limited to the common mutations in the P-loop. The rare switch mutations D33E (switch I) (PDB: ID 6BP1) and A59G (switch II) (PDB: ID 6ASA) also adopt an open switch I conformation in silico [308]. The dynamic and open conformation of mutated KRAS is suggested to hamper nucleotide exchange [311].
Hunter and colleagues showed that all three residue positions are linked to a decreased affinity between RAS and RAF, although with different impacts [312]. The impact of mutations was studied in the context of intrinsic GTP hydrolysis, with the mutation G12C presenting minimal impact on the hydrolysis rate, whereas alanine or arginine residues at the same position led to around 80%-fold decrease, and aspartic acid and valine showed an intermediate effect. A considerable decrease in hydrolysis constants was observed for position 61, where the mutations Q61H/L have profiles similar to G12D/V. Hotspot position G13 also presented decreased hydrolysis when the mutation G13D was present. A similar outcome is found in the presence of GAPs [312].
The common mutation G12C impairs the arrangement necessary for GAP-mediated hydrolysis by disturbing the positioning of the catalytic residue Q61 of KRAS and the arginine finger of GAP in the active site. The mutation also disrupts the formation of a hydrogen bond between the side chain NE2 atom of Q61 and the γ-phosphate of GTP [309].
Similarly, the side chain of alanine in the G12A substitution causes a clash with Q61. In addition, the G12A substitution causes a clash between the side chain of T32 and the side chain of R789 of GAP. The formation of a hydrogen bond between the backbone of A12 and the hydroxyl group of the T32 side chain in switch I stabilizes the γ-phosphate of GTP in its pre-catalytic form. This restricts the phosphates from stretching and rotating their bonds, making the transition state harder to reach and explaining the reduced intrinsic GTP hydrolysis [313].
Other position 12 mutations, such as G12D, cause the OE1 atom of Q61 to move away from the γ-phosphate of GTP, preventing a hydrogen from being abstracted from a catalytic water molecule [309,314]. Instead, the aspartate side chain of G12D takes the place of Q61, thereby impairing the charge distribution in the active site for intrinsic hydrolysis. Furthermore, position 12 of the P-loop is less flexible than Q61 of switch II, thereby sterically interfering with the GAP arginine finger, as supported by a computational study [311,314].
Interestingly, it is not the common G12C but rather the G13D substitution that induces rapid nucleotide exchange kinetics compared to other KRAS mutants. The kinetics of nucleotide exchange are the same for wild-type and mutant KRAS, except for KRAS G13D, whose GDP exchange rate is 13.5 times faster than that of wild-type KRAS. Furthermore, the GTP exchange rate of KRAS G13D is nine times faster than that of WT KRAS. The G13D mutation decreases the intrinsic and GAP-mediated GTPase activity of KRAS, thereby impairing the self-inactivation that normally limits KRAS signal propagation. Furthermore, the substitution causes a 2.4-fold decrease in the affinity for RAF-RBD, an effector protein of KRAS [312].
Mutations at position 13 alter the side-chain conformation of Q61 in such a manner that it can no longer interact with a catalytic water. Furthermore, the G13D substitution prevents any direct interaction between Q61 and GTP [309]. However, the kinetic abnormalities induced by the G13D mutation can be explained by changes in the electrostatic charge distribution of the active site [312]. The crystal structure shows that the aspartate side chain of G13D is positioned above the α-phosphate, with its carboxyl group pointing towards carbon-5 of the GDP ribose. As the G13D mutation resides in the strongly positive P-loop region above the negatively charged phosphate groups, it disturbs the local electrostatics of the binding pocket. The aspartic acid introduces a strongly negative charge into the phosphate-binding pocket, making nucleotide binding more difficult. Of note, the neighboring G12D induces much less of a disruption to the electrostatic density of the GTP-binding pocket [312].
KRAS has been considered problematic in the past, as the high in vivo concentration of GTP and GDP, in combination with the high affinity of KRAS for these nucleotides, has proven troublesome for the direct targeting of the protein [312]. However, data collated over the years suggest that direct targeting of the mutated KRAS proteins is possible. For example, the data collated on the structure and differences in conformational states were used to develop a specific, high-affinity non-covalent inhibitor selective for G12D KRAS over the wild-type protein [315]. As familiarity with mutated KRAS increased, some have exploited the mutations themselves in the development of small-molecule inhibitors. Lim, Hunter, and colleagues developed a GDP analog with an electrophilic warhead that covalently binds to the mutant cysteine in G12C KRAS [316,317].
Another challenge in the KRAS drug development process is the lack of a sufficiently large and deep hydrophobic pocket for small-molecule binding, aside from the precarious nucleotide-binding pocket. There have been numerous efforts to identify and target shallow sites on RAS proteins, which have led to the ambitious effort of Welsch et al. to develop a compound capable of simultaneously inhibiting HRAS, NRAS, and KRAS (also known as a pan-RAS inhibitor) [298]. However, these allosteric binding sites may also be used in a mutant-specific manner [318]. Ostrem et al. showed that the binding of G12C-specific inhibitors in an allosteric pocket disrupts both switch I and switch II, thereby shifting the nucleotide preference of KRAS to favor GDP over GTP and impairing binding to effector proteins [319]. The combination of protein-protein interaction assays has also proven fruitful in the development of Q61H KRAS-specific inhibitors, again using an allosteric binding site [320].
The most promising development is the compound AMG-510. Upon the discovery of a surface groove that may be occupied by aromatic rings due to an alternative conformation of H95, AMG-510 emerged as a top candidate after an optimization campaign. Preclinical data show that AMG-510 selectively targets G12C KRAS tumors and causes durable regression as monotherapy. The compound may not only be applied individually but also works synergistically with cytotoxic and targeted agents. Furthermore, there are indications that AMG-510 also synergizes with immunotherapy. Moreover, the combination of AMG-510 with immunotherapy seems to result in an adaptive immune response that can recognize and eradicate related non-G12C KRAS tumors [321].
Elucidating the structures, mechanisms, and kinetics of KRAS mutants is essential for drug development and personalized medicine. As seen for the aforementioned tyrosine kinases, computational biology will aid the KRAS drug development process, with molecular dynamics studies providing an understanding of the mutant structures and their mechanisms. Using techniques such as docking, possible compounds may be assessed virtually. Gupta and colleagues have developed a pipeline that combines molecular dynamics, high-throughput ensemble docking, and biophysical and cell assays, and that yielded novel lead compounds [322]. Similarly, this pipeline may be applied in the future as a means of personalized medicine if novel mutations emerge. The KRAS mutant of an individual patient, even one with a novel or double mutation, can be evaluated through molecular dynamics, and in combination with docking the treatment options may be ranked according to viability. This could allow for fast and accurate diagnostics and the development of treatment regimens tailored to an individual patient.
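The ranking step mentioned above can be sketched in a few lines. In ensemble docking, each candidate is docked against several receptor snapshots (for example, frames taken from an MD simulation) and the scores are then aggregated per compound. The Python example below is a generic illustration and not the pipeline of Gupta and colleagues [322]: the function name, the compound labels, and the scores are hypothetical, and it assumes the common convention that a more negative docking score indicates a more favorable predicted binding.

```python
from statistics import mean
from typing import Dict, List, Tuple


def rank_compounds(scores: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Rank compounds by their mean docking score across snapshots.

    Assumes more negative scores are better; compounds without any
    score are skipped.
    """
    averaged = {
        name: mean(values) for name, values in scores.items() if values
    }
    # Ascending sort puts the most negative (best) mean score first.
    return sorted(averaged.items(), key=lambda item: item[1])


# Invented docking scores against three receptor snapshots per compound.
ensemble_scores = {
    "compound_A": [-8.0, -7.0, -9.0],
    "compound_B": [-6.0, -6.5, -5.5],
    "compound_C": [-9.5, -4.5, -7.0],
}

for name, score in rank_compounds(ensemble_scores):
    print(f"{name}: {score:.2f}")
```

Averaging over snapshots is only one possible aggregation; taking the best score per compound or weighting snapshots by their population in the simulation are alternatives, and any such ranking would still need experimental confirmation.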

Molecular Modeling as a Supporting Tool for Personalized Medicine: The Future Is Knocking on the Door
Despite medical advances, the prognosis of advanced NSCLC remains poor, especially for those with a late diagnosis. In the case of metastasis, surgery can no longer be performed, and these patients might not be suitable for chemoradiation approaches. In those cases, genetic profiling is performed to search for driver mutations in EGFR, ALK, BRAF, RAS, RET, and ROS that can be targeted with small molecules inhibiting the abnormal kinase activity. While there are many kinase inhibitors on the market, physicians are still struggling to improve the overall survival of these patients, which ranges from 7 to 14 months [323,324]. The lag between the emergence or identification of resistance and the availability of treatment is not only due to the long process of drug development and approval but also due to the complexity of cancer, which can result from anomalies in a multitude of pathways within the same tumor cell population. Additionally, there is a clear gap between the medical and the research communities, which delays the classification of mutations once they emerge and is thereby detrimental to patient care [325].
An alternative to the traditional pursuit of novel molecular entities to target novel mutations is the repurposing of already approved drugs, as described for the ALK single mutant L1198F and crizotinib, and for the EGFR triple mutant exon19del/T790M/P794L and afatinib [87,176]. These cases show how computational studies combined with experimental data can not only suggest possible treatments but also provide insight into the mechanism of resistance. Another remarkable example of combining theoretical and experimental methodologies is the potential to predict future BRAF resistance mutations by combining previous knowledge of EGFR and ALK hotspot positions [278,280].
Throughout this review, the contributions of computational methods were highlighted in the study of protein and small-molecule dynamics when experimental data were limited or absent. A highly noteworthy contribution of MD simulations is the uncovering of a transient pocket on KRAS, yielding a molecule for a target previously considered undruggable [321].
Experimental data might vary with the technique used or even with the settings of the same experiment, and reproducibility is a growing scientific concern that is being addressed by sizable initiatives, such as those established by the Brazilian government or the European Union (Findability, Accessibility, Interoperability, and Reusability; the FAIR project) [326,327]. Computational studies are no different; simulations of the same system using different force fields, temperature and pressure controls, and cluster sizes can provide different outputs. Docking is also susceptible to such effects, since there are hundreds of commercial and non-commercial software packages available, and their weighting of the types of protein-ligand interactions might differ [328]. Following the steps set by the FAIR project, a sustainable approach for the computational assessment of proteins and related mutations can no longer be postponed, given its remarkable contribution to the biomedical field.
In an attempt to close the gap between medical teams and scientists, an increasingly common approach is establishing tumor boards at cancer treatment centers, gathering not only health care providers but also experts in scientific fields such as molecular biology. The main goal is to provide a personalized approach to patients who do not fit into the conventional treatment regimens, either in the presence of novel mutations or limited response to an established regimen [329].
Although tumor boards initially rely solely on previous patient reports, an innovative approach is the use of structural data obtained either from experimental or theoretical sources. As described by Koopman et al., structural biologists are able to provide computational data regarding novel mutations in a timeframe to contribute to patient care [87,330,331]. Ultimately, the combination of experimental and theoretical structural biology, when applied judiciously, is shown to be an advantageous supporting tool for the process of decision-making in personalized medicine and, hopefully, it can be included in standard care in the not-too-distant future.

Conflicts of Interest:
The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.