Rational Computational Design of Fourth-Generation EGFR Inhibitors to Combat Drug-Resistant Non-Small Cell Lung Cancer

Although the inhibitors of singly mutated epidermal growth factor receptor (EGFR) kinase are effective for the treatment of non-small cell lung cancer (NSCLC), their clinical efficacy has been limited by the emergence of various drug-resistant double and triple EGFR mutants. It has thus become urgent to identify potent and selective inhibitors of triple mutant EGFRs resistant to first-, second-, and third-generation EGFR inhibitors. Herein, we report the discovery of potent and highly selective inhibitors of the EGFR exon 19 p.E746_A750del/EGFR exon 20 p.T790M/EGFR exon 20 p.C797S (d746-750/T790M/C797S) mutant, which were derived via two-track virtual screening and de novo design. This two-track approach was performed so as to maximize the inhibitory activity against the triple mutant and to minimize that against the wild type. Extensive chemical modifications of the initial hit compounds led to the identification of several low-nanomolar inhibitors of the d746-750/T790M/C797S mutant. Among them, two compounds exhibited more than 10^4-fold selectivity in the inhibition of EGFRd746-750/T790M/C797S over the wild type. The formation of a hydrogen bond with the mutated residue Ser797 and of a van der Waals contact with the mutated residue Met790 were found to be common features in the interactions between EGFRd746-750/T790M/C797S and the fourth-generation inhibitors. Such an exceptionally high selectivity could also be attributed to the formation of a hydrophobic contact with a Gly loop residue or a hydrogen bond with Asp855 in the activation loop. The discovery of the potent and selective EGFRd746-750/T790M/C797S inhibitors was made possible by the modified protein–ligand binding free energy function involving a new hydration free energy term with enhanced accuracy.
The fourth-generation EGFR inhibitors found in this work are anticipated to serve as a new starting point for the discovery of anti-NSCLC medicines to overcome the problematic drug resistance.


Introduction
Epidermal growth factor receptor (EGFR) plays a key role in regulating various intracellular signaling pathways for tumor cell proliferation, differentiation, migration, and invasion [1,2]. Deregulated activity of EGFR kinase is responsible for the pathogenesis and progression of approximately 10-15% of lung adenocarcinomas, which are the main subtype of non-small cell lung cancer (NSCLC) [3][4][5]. The most
template [28]. The optimized structural model for the d746-750/T790M/C797S mutant was validated with the ProSa 2003 program, which has been widely used as a computational tool for examining whether the interactions of each amino-acid residue with the rest of the protein are maintained favorably [32]. This free energy profile for individual amino acids is obtained with a knowledge-based mean field potential. Figure 1 shows the free energy profile of the homology-modeled kinase domain of the d746-750/T790M/C797S mutant in comparison with that of the L858R/T790M/C797S mutant, which served as the structural template. Remarkably, the target protein exhibits a better energy profile than the template in most parts of the protein, especially in the central region between the N- and C-terminal domains. Furthermore, the energy values remain negative for all amino acid residues, implying that the homology-modeled structure of the d746-750/T790M/C797S mutant is physically acceptable. Based on these good energetic features, the structure of the kinase domain of the d746-750/T790M/C797S mutant constructed with homology modeling was adopted as the receptor model for the virtual screening and de novo design to find the fourth-generation EGFR inhibitors.
The applicability of virtual screening and de novo design has been limited owing to the inaccurate nature of the scoring function for estimating the protein-ligand binding affinity [33,34]. In particular, the binding free energy functions of popular docking programs tend to underestimate the ligand dehydration effects in the protein-ligand association. This may lead to overestimating the biochemical potency of a molecule with many hydrophilic groups [31]. Prior to conducting the virtual screening, therefore, we modified the dehydration term in the original scoring function within the framework of the extended solvent-contact model to enhance the predictive capability.
Virtual screening began with the preparation of a chemical library from which tight-binding inhibitor scaffolds for the d746-750/T790M/C797S mutant could be found by docking simulations. This docking library was constructed by collecting approximately 370,000 'Rule-of-Five'-compliant molecules [35] with molecular weights ranging from 300 to 400 atomic mass units (amu) from among the commercially available compounds. Virtual screening of the fourth-generation EGFR inhibitors was then carried out through docking simulations in the ATP-binding pockets of the d746-750/T790M/C797S mutant and the wild type of EGFR. This two-track approach was intended to collect the putative inhibitors with binding free energies lower than −10 kcal/mol with respect to the triple mutant and higher than −6.0 kcal/mol with respect to the wild type, which corresponds to at least a 1000-fold difference in binding affinities. Because bidentate hydrogen-bond interactions in the hinge region are characteristic of effective fourth-generation EGFR inhibitors [27,28], only molecules capable of forming two hydrogen bonds with backbone groups of residues 791-795, with a distance criterion of <3.5 Å, were selected after all molecules in the docking library were screened.
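The selection criteria described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the screening code used in this work: the `Candidate` record, its field names, and the four example entries are hypothetical, and the Rule of Five is applied in its strict form (no violations allowed), which is an assumption. The last function converts a binding free energy gap into an affinity ratio via exp(ΔΔG/RT), again assuming T = 298.15 K.

```python
import math
from dataclasses import dataclass, field

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


@dataclass
class Candidate:
    """Hypothetical record of precomputed descriptors and docking results."""
    name: str
    mol_weight: float      # amu
    logp: float
    h_donors: int
    h_acceptors: int
    dg_mutant: float       # docking score vs. triple mutant, kcal/mol
    dg_wild_type: float    # docking score vs. wild type, kcal/mol
    # donor-acceptor distances (Å) to backbone groups of hinge residues 791-795
    hinge_hbond_dists: list = field(default_factory=list)


def in_docking_library(c):
    """Strict Rule-of-Five compliance plus the 300-400 amu window."""
    rule_of_five = (c.mol_weight <= 500 and c.logp <= 5
                    and c.h_donors <= 5 and c.h_acceptors <= 10)
    return rule_of_five and 300 <= c.mol_weight <= 400


def passes_two_track(c, dg_mut_max=-10.0, dg_wt_min=-6.0, hbond_cutoff=3.5):
    """Tight binding to the mutant, weak binding to the wild type,
    and at least two hinge hydrogen bonds shorter than the cutoff."""
    hinge_bonds = sum(1 for d in c.hinge_hbond_dists if d < hbond_cutoff)
    return (c.dg_mutant < dg_mut_max
            and c.dg_wild_type > dg_wt_min
            and hinge_bonds >= 2)


def affinity_ratio(ddg, temperature=298.15):
    """Ratio of binding constants implied by a free energy gap (kcal/mol)."""
    return math.exp(ddg / (R_KCAL * temperature))


# Made-up entries for illustration only; none are compounds from this study.
library = [
    Candidate("cand_A", 352.4, 3.1, 2, 5, -10.8, -5.2, [2.9, 3.1]),
    Candidate("cand_B", 361.0, 2.7, 1, 6, -10.5, -8.9, [3.0, 3.2]),  # binds wild type too well
    Candidate("cand_C", 388.2, 4.0, 2, 4, -11.2, -5.5, [3.0, 4.1]),  # one hinge H-bond only
    Candidate("cand_D", 455.3, 3.3, 3, 7, -11.0, -5.0, [2.8, 3.0]),  # outside the MW window
]

hits = [c.name for c in library
        if in_docking_library(c) and passes_two_track(c)]
print(hits)
print(round(affinity_ratio(4.0)))
```

With these assumptions, the 4 kcal/mol gap between the two thresholds gives an affinity ratio on the order of 10^3; the exact factor depends on the temperature assumed.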
Only 26 compounds satisfied all the filtration criteria, as indicated in Figure 2. This small number of virtual hits exemplifies the difficulty in discovering fourth-generation inhibitors that are specific for the triple mutant, which can be understood in the context of the high structural similarity between the mutant and the wild type. All virtual hits were assessed for inhibitory activity against the d746-750/T790M/C797S mutant and the wild type via in vitro radiometric ([γ-33P]-ATP) kinase assays (Reaction Biology Corp., Malvern, PA, USA). As a consequence of virtual and experimental screening, four molecules were identified as low-micromolar inhibitors of the d746-750/T790M/C797S mutant. None of the four molecules has previously been reported as an EGFR kinase inhibitor. To rule out the possibility that they act as false positives in the enzyme inhibition assays, all four inhibitors were examined in the public ZINC database [36] to confirm the absence of any substructure found in pan assay interference compounds (PAINS) [37].
Chemical structures and biochemical potencies of the four inhibitors of the d746-750/T790M/C797S mutant EGFR are summarized in Figure 3 and Table 1, respectively.
Int. J. Mol. Sci. 2020, 21
It is a common structural feature for 1-4 to possess several polar groups and nonpolar aromatic rings, therefore, both hydrogen-bond
As listed in Table 1, all four inhibitors exhibited good biochemical potency against the triple mutant, with IC50 values ranging from 0.8 to 3 μM. Furthermore, compounds 1-3 appear to be selective in the inhibition of the triple mutant over the wild type by a factor of 10-35. They are therefore worthy of further development with structural modifications to maximize the inhibitory activity against NSCLC cells resistant to second- and third-generation EGFR inhibitor drugs. Compound 1 seems to serve as an effective molecular core for designing new fourth-generation EGFR inhibitors due to its highest selectivity and good inhibitory activity against the triple mutant. Compounds 2 and 3 contain 2-aryl-4-aminoquinazoline and 1,3,5-triazin-2-amine moieties, respectively, which were also included in the previously identified inhibitors of the triple mutant [27].
It is quite unexpected that 4 reveals 136-fold higher inhibitory activity against the wild type than against the d746-750/T790M/C797S mutant, indicating that its binding affinity for the wild type was greatly underestimated in the preceding virtual screening. Thus, the scoring function used in this work remains imperfect despite the modification by substituting a sophisticated hydration energy term. Although 4 was found accidentally as a potent inhibitor of the wild-type EGFR, it deserves consideration for further development as a new first-generation inhibitor because its biochemical potency reached the nanomolar level (Table 1) without any chemical modification.
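The selectivity factors discussed above are ratios of IC50 values. A minimal sketch of that bookkeeping follows, assuming both IC50 values are measured in the same unit and under the same assay conditions. The function names and the numerical inputs are hypothetical placeholders and are not taken from Table 1.

```python
def selectivity_index(ic50_wild_type, ic50_mutant):
    """Fold selectivity for the mutant; values above 1 favor the mutant."""
    if ic50_wild_type <= 0 or ic50_mutant <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_wild_type / ic50_mutant


def describe(si):
    """Report the selectivity in the direction in which it exceeds 1."""
    if si >= 1:
        return f"{si:.0f}-fold selective for the mutant"
    return f"{1 / si:.0f}-fold selective for the wild type"


# Placeholder IC50 values in micromolar, for illustration only.
mutant_selective = selectivity_index(ic50_wild_type=40.0, ic50_mutant=2.0)
wild_type_selective = selectivity_index(ic50_wild_type=0.05, ic50_mutant=5.0)

print(describe(mutant_selective))
print(describe(wild_type_selective))
```

Reporting the ratio in the direction in which it exceeds 1 keeps a wild-type-selective compound such as 4 from being read as a small fractional selectivity for the mutant.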
To find the structural relevance for the low-micromolar activity of the newly discovered EGFR inhibitors, their binding modes were derived with docking simulations using the modified scoring function. Overlaid in Figure 4 are the docked poses of 1-4 around the ATP-binding site of the d746-750/T790M/C797S mutant. All four inhibitors appear to be accommodated in the well-established binding pocket comprising the glycine-rich loop (Gly loop, residues 718-726), the hinge region (residues 791-795) of the ATP-binding site, and the residues at the interface of the N-terminal (700-860) and C-terminal (861-1014) domains. It is also a common complexation pattern that the inhibitors reside in close proximity to the side chains of Met790 and Ser797, which were mutated from Thr790 and Cys797 of the wild-type EGFR, respectively. The interactions with these mutated residues would be necessary for the specific inhibition of the d746-750/T790M/C797S mutant. To find potential allosteric sites that could accommodate 1-4, additional docking simulations were carried out using 3D grid maps extended to include the whole EGFR kinase domain. Nonetheless, no peripheral binding site was identified in which 1-4 could be stabilized with a negative value of binding free energy. Therefore, compounds 1-4 are likely to impair the kinase activity of the d746-750/T790M/C797S mutant through specific binding in the ATP-binding site.
Because 1 and 2 are low-micromolar inhibitors of the d746-750/T790M/C797S mutant with reasonably good selectivity over the wild type, their complexation patterns in the ATP-binding site would provide insight into how to promote the potency and selectivity through chemical modifications. Figure 5 illustrates the most stable binding modes of 1 and 2 obtained with docking simulations using the modified scoring function. The carbonyl oxygen of 1 receives a hydrogen bond from the backbone amidic moiety of Met793 in the hinge region, while the phenolic moiety donates a hydrogen bond to the backbone aminocarbonyl oxygen of Leu718 in the Gly loop. These two hydrogen bonds seem to play a key role in the binding of 1 to the mutant EGFR because they are established in the middle of the ATP-binding pocket. A third hydrogen bond is observed between the terminal piperidine moiety of 1 and the side-chain hydroxyl group of Ser797, which augments the strength of binding between EGFR d746-750/T790M/C797S and 1. This interaction would also be important in terms of selectivity because it involves the mutated side chain at residue 797. Compound 1 appears to be stabilized further through hydrophobic contacts with the nonpolar side chains of Leu718, Val726, Phe723, Met790, Leu792, and Leu844. In particular, we note that three leucine residues (Leu718, Leu792, and Leu844) accommodate the 6-hydroxybenzofuran-3-one moiety of 1 around the two hydrogen bonds in the ATP-binding site. These interactions are assumed to strengthen the vicinal hydrogen bonds by preventing the approach of solvent water molecules. Significant synergistic effects on binding affinity are therefore anticipated from positioning the two hydrogen bonds in proximity to the hydrophobic contacts. Indeed, reinforcing hydrogen bonds cooperatively with hydrophobic interactions in complexation with the target protein has served as a facile strategy for optimizing biochemical potency [38,39].
The binding mode of 2 in the ATP-binding site of EGFR d746-750/T790M/C797S is similar to that of 1 in that three hydrogen bonds are involved in the complexation. In particular, we note that one of the nitrogens on the pyrimidine ring of 2 receives a hydrogen bond from the backbone amidic nitrogen of Met793, while the terminal phenolic moiety donates a hydrogen bond to the aminocarbonyl oxygen of Gln791. These bidentate hydrogen bonds are supported by the neighboring van der Waals contacts between the inhibitor phenyl ring and the side chain of the mutated Met790. This kind of interaction pattern has been observed in a variety of EGFR-inhibitor complexes, irrespective of the mutational status of EGFR [40]. In the calculated structure of the complex between EGFR d746-750/T790M/C797S and 2, an additional hydrogen bond is found at the bottom of the ATP-binding site between the -NH moiety attached to the pyrimidine ring of 2 and the side-chain hydroxyl group of the mutated Ser797. This hydrogen bond seems to contribute to the selective inhibition of the triple mutant because the substitution of serine for cysteine at residue 797 is the tertiary mutation that affords resistance to the second- and third-generation EGFR inhibitors. The stabilization of 2 in the ATP-binding pocket may be promoted by the hydrophobic interactions with the nonpolar side chains of Leu718, Val726, Phe723, Met790, Leu792, and Leu844.
On the basis of the structural features in the calculated EGFR d746-750/T790M/C797S -2 complex, it is most likely that the low-micromolar inhibitory activity of 2 would stem from the multiple hydrogen bonds facilitated by the hydrophobic contacts in the ATP-binding site.
Because 1 is almost fully accommodated in the ATP-binding site of the d746-750/T790M/C797S mutant, only small substituents would be allowed in derivatizations to optimize the inhibitory activity. On the other hand, the terminal phenolic moiety of 2 is directed to a vacant peripheral binding pocket consisting of Val726, Lys745, Met790, Thr854, and Asp855 (Figure 5). This sub-binding region may serve as a potential target for improving both the potency and selectivity in the inhibition of deregulated EGFR mutants because it includes amino-acid residues not only in the hinge region (Met790) and Gly loop (Val726) but also in the DFG motif (Asp855-Phe856-Gly857) that resides on the activation loop. Therefore, the introduction of a suitable chemical moiety at the terminal phenolic moiety of 2 would be expected to enhance the biochemical potency against the triple mutant.

Synthesis of the Derivatives of 1 and 2 Generated from de novo Design
Virtual screening and the subsequent de novo design processes generated a variety of fourth-generation EGFR inhibitors with 1 and 2 as the molecular core. Aurone derivatives C were synthesized as illustrated in Scheme 1. The key intermediates B were prepared by Mannich reaction of benzofuran-3(2H)-one A with paraformaldehyde and various secondary amines. Subsequently, base-catalyzed condensation of B with aldehyde gave rise to the desired aurone derivatives C.
Quinazoline derivatives E were prepared as outlined in Scheme 2. Base-promoted nucleophilic addition of the requisite amines to 2,4-dichloroquinazoline C led to the installation of the amino groups at the C4 position of quinazolines D. Next, Suzuki-Miyaura coupling reactions between intermediate D and various arylboronic acids or arylboronic acid pinacol esters yielded target compound E.

Biochemical Potencies of the Newly Synthesized Compounds
Most initial hit compounds derived from virtual and experimental screening tend to be insufficient in both biochemical potency and selectivity to serve as a good lead compound because all the candidate molecules were prepared for purposes other than inhibiting the target protein of interest. Hence, we performed de novo design to identify fourth-generation EGFR inhibitors superior to 1 using this compound as the molecular core. Based on the calculated binding mode of 1 (Figure 5), the derivatizations were conducted so as to maximize the interactions with Ser797 and the amino-acid residues in the peripheral binding pocket. Table 2 lists the chemical structures and IC50 data of the seven derivatives of 1 with respect to the d746-750/T790M/C797S mutant and the wild type of EGFR. Biochemical potencies for both the wild type and the triple mutant remain almost unchanged when the methyl group on the terminal piperidine ring of 1 is removed in 5 or shifted to the neighboring carbons in 6 and 7. Further substitution of the methoxy moiety in 8 causes a substantial loss of inhibitory activity, confirming that a bulky substituent on the benzofuran ring would be disfavored for tight binding to EGFR. In contrast to the negligible substitution effects on the terminal piperidine ring, the replacement of piperidine in 5 with piperazine in 9 leads to an increase in the inhibitory activity by a factor of 3 with respect to both target proteins. Although a negative effect of the methoxy substitution is also observed in 10, the biochemical potency against the triple mutant increases further to the submicromolar level due to the introduction of an -OH moiety in 11 on the benzofuran ring of 9. However, the loss of selectivity over the wild-type EGFR makes it difficult for 11 to serve as a good lead for the development of a new anti-NSCLC medicine.

Because the derivatizations of 1 were unsuccessful in yielding effective fourth-generation EGFR inhibitors, alternative de novo designs were carried out using 2 as a new molecular core. Based on the structural features derived from docking simulations between the d746-750/T790M/C797S mutant and 2, the structural modifications aimed at finding the optimal chemical moiety (R1) attached to the nitrogen of the central quinazolin-4-amine group. The para position (R2) of the terminal phenol group was also selected as a substitution point because its importance in EGFR inhibition was demonstrated in the previous study [27]. Only small substituents are allowed at the R2 position because the terminal phenyl ring of 2 resides in close proximity to the side chains of Val726, Met790, and Thr854 in the ATP-binding site (Figure 5). Among the ten derivatives of 2 designed as new fourth-generation EGFR inhibitors, two candidates exhibited not only low-nanomolar activity but also high selectivity in the inhibition of the triple mutant over the wild type by a factor of more than 10^4. Summarized in Table 3 are the IC50 values of various derivatives of 2 generated from de novo design with respect to the d746-750/T790M/C797S mutant and the wild type of EGFR. All derivatives reveal higher biochemical potency against the triple mutant than against the wild type.
This exemplifies the usefulness of the modified scoring function adopted in this work for optimizing the activities of the fourth-generation EGFR inhibitors as well as for identifying the molecular cores of potent inhibitors in virtual screening. Although the introduction of an N,N-diethylpropan-1-amine (12) or 4-methylthiazole (14) moiety at the R1 position leads to only a slight increase in the inhibitory activity against the triple mutant, the biochemical potency increases to the submicromolar level upon further substitution of a nitrile group at the R2 position in 13 and 15. The submicromolar inhibitory activity is retained in the presence of 3-methylpyrrole (16) and 4-methyloxazole (17) at the R1 position. Most remarkably, the replacement of the five-membered aromatic ring with imidazole (18) led to an increase in the biochemical potency to the low-nanomolar level, with more than 6690-fold higher activity against the d746-750/T790M/C797S mutant than against the wild-type EGFR. The selectivity index rises above 10^4 either by the methyl substitution at the nitrogen atom of the imidazole ring (19) or by the one-carbon elongation of the linking group in 20 between the central quinazoline and the terminal imidazole ring. The change of the R2 substituent from nitrile in 19 to aldehyde in 21 leads to a substantial loss of selectivity while the low-nanomolar inhibitory activity against the triple mutant is maintained, which confirms the necessity of the nitrile group to retain both biochemical potency and selectivity. Among the ten derivatives of 2 listed in Table 3, 18-20 may be proposed as promising new fourth-generation EGFR inhibitors in view of their exceptionally high selectivity and low-nanomolar activity in the inhibition of the d746-750/T790M/C797S mutant. They are anticipated to serve as a good starting point for the development of new medicines against NSCLC cells with acquired resistance to second- and third-generation EGFR inhibitor drugs.
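To put the selectivity indices quoted above on an energy scale, a rough worked estimate may help. It rests on two assumptions that are not stated in the text: that the ratio of IC50 values approximates the ratio of dissociation constants (reasonable only when both assays are run under comparable conditions), and that T = 298 K, so that RT is about 0.593 kcal/mol.

```latex
\Delta\Delta G_{\mathrm{bind}}
  = \Delta G_{\mathrm{WT}} - \Delta G_{\mathrm{mut}}
  \approx RT \ln\!\left( \frac{\mathrm{IC}_{50}^{\mathrm{WT}}}{\mathrm{IC}_{50}^{\mathrm{mut}}} \right),
\qquad
RT \ln\!\left(10^{4}\right)
  \approx \left(0.593\ \mathrm{kcal\,mol^{-1}}\right)\left(9.21\right)
  \approx 5.5\ \mathrm{kcal\,mol^{-1}}
```

Under these assumptions, a selectivity index of 10^4 corresponds to a binding free energy gap of roughly 5.5 kcal/mol, somewhat larger than the 4 kcal/mol gap between the two thresholds used in the virtual screening.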
of interest. Hence, we performed de novo design to identify the fourth-generation EGFR inhibitors better than 1 using this compound as the molecular core. Based on the calculated binding mode of 1 ( Figure 5), the derivatizations were conducted so as to maximize the interactions with Ser797 and the amino-acid residues in the peripheral binding pocket. Table 2 lists the chemical structures and IC50 data of the seven derivatives of 1 with respect to the d746-750/T790M/C797S mutant and the wild type of EGFR. Biochemical potencies for both the wild type and the triple mutant appear to remain almost intact as the methyl group on the terminal piperidine ring of 1 is removed in 5 or shifted to the neighboring carbons in 6 and 7. Further substitution of the methoxy moiety in 8 causes a substantial loss of inhibitory activity, confirming that a bulky substituent on the benzofuran ring would be disfavored for tight binding to EGFR. In contrast to the negligible substitution effects on the terminal piperidine ring, the replacement of piperidine in 5 with piperazine in 9 leads to the increase in the inhibitory activity by a factor of 3 with respect to both target proteins. Although a negative effect of the methoxy substitution is also observed in 10, the biochemical potency against the triple mutant increases further to the submicromolar level due to the introduction of -OH moiety in 11 on the benzofuran ring of 9. However, the loss of selectivity over the wild-type EGFR makes it difficult for 11 to serve as a good lead for the development of a new anti-NSCLC medicine.

Biochemical Potencies of the Newly Synthesized Compounds
Most initial hit compounds derived from virtual and experimental screening tend to be insufficient in both biochemical potency and selectivity to serve as a good lead compound because all the candidate molecules were prepared for the purposes other than inhibiting the target protein of interest. Hence, we performed de novo design to identify the fourth-generation EGFR inhibitors better than 1 using this compound as the molecular core. Based on the calculated binding mode of 1 ( Figure 5), the derivatizations were conducted so as to maximize the interactions with Ser797 and the amino-acid residues in the peripheral binding pocket. Table 2 lists the chemical structures and IC50 data of the seven derivatives of 1 with respect to the d746-750/T790M/C797S mutant and the wild type of EGFR. Biochemical potencies for both the wild type and the triple mutant appear to remain almost intact as the methyl group on the terminal piperidine ring of 1 is removed in 5 or shifted to the neighboring carbons in 6 and 7. Further substitution of the methoxy moiety in 8 causes a substantial loss of inhibitory activity, confirming that a bulky substituent on the benzofuran ring would be disfavored for tight binding to EGFR. In contrast to the negligible substitution effects on the terminal piperidine ring, the replacement of piperidine in 5 with piperazine in 9 leads to the increase in the inhibitory activity by a factor of 3 with respect to both target proteins. Although a negative effect of the methoxy substitution is also observed in 10, the biochemical potency against the triple mutant increases further to the submicromolar level due to the introduction of -OH moiety in 11 on the benzofuran ring of 9. However, the loss of selectivity over the wild-type EGFR makes it difficult for 11 to serve as a good lead for the development of a new anti-NSCLC medicine.
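The selectivity discussed in this section is the ratio of the IC50 against the wild type to the IC50 against the triple mutant. The following minimal Python sketch makes that arithmetic explicit; the function name and the numerical values are hypothetical placeholders chosen for illustration and are not data from Tables 2 or 3.

```python
def selectivity_index(ic50_wild_type_nm: float, ic50_mutant_nm: float) -> float:
    """Ratio of wild-type IC50 to mutant IC50 (same units).

    A larger value means the compound inhibits the mutant more strongly
    than the wild type.
    """
    if ic50_wild_type_nm <= 0 or ic50_mutant_nm <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_wild_type_nm / ic50_mutant_nm


# Hypothetical IC50 values in nM (placeholders, NOT measured data).
examples = {
    "cmpd_X": (50_000.0, 4.0),   # (wild type, triple mutant)
    "cmpd_Y": (800.0, 400.0),
}

for name, (wt, mut) in examples.items():
    print(f"{name}: selectivity index = {selectivity_index(wt, mut):,.0f}")
```

Under these placeholder numbers, cmpd_X would exceed the 10^4 threshold mentioned in the text, whereas cmpd_Y would be essentially non-selective.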

To find a rationale for the exceptionally high selectivity and biochemical potency, the binding modes of 19 and 20 in the ATP-binding site of the d746-750/T790M/C797S mutant were explored with docking simulations. The calculated binding configurations of the two nanomolar inhibitors are compared in Figure 6. Both 19 and 20 appear to be well accommodated in the ATP-binding site in a way similar to 2 (Figure 5), with the nitrile moiety occupying the peripheral binding pocket. This additional interaction would contribute to the increase in selectivity as well as in inhibitory activity in going from 2 to 19 and 20 (Table 3) because the peripheral binding pocket involves the mutated residue Met790.
It is interesting to note that the terminal 1-methylimidazole group of 19 forms a close van der Waals contact with the side-chain phenyl ring of Phe723 in the Gly loop. This interaction would also contribute significantly to the impairment of the kinase activity of EGFRd746-750/T790M/C797S because the Gly loop acts as a receptor for the phosphate group in the complexation with ATP. The extra stabilization of the complex of EGFRd746-750/T790M/C797S with 20 seems to be caused by the additional hydrogen bond of the terminal imidazole moiety with the side-chain carboxylate group of Asp855, which is a component of the DFG motif (Asp855-Phe856-Gly857) on the activation loop. This new hydrogen bond is made possible by the one-carbon elongation of the linking group that connects the central quinazoline ring and the terminal imidazole group, which is required to reach Asp855 at the end of the ATP-binding pocket. In the calculated complexes of EGFRd746-750/T790M/C797S with 19 and 20, it is also worth noting that both the hydrophobic interaction of 19 with Phe723 and the hydrogen bond of 20 with Asp855 are supported by the neighboring hydrogen bond with the side-chain hydroxyl moiety of the mutated residue Ser797. The additional interactions generated by the chemical modification from 2 to 19 and 20 are thus consistent with the substantial increase in inhibitory activity and selectivity.
Two promising molecules (18 and 19) were further evaluated for cellular growth inhibition against Ba/F3 cell lines with the wild type and the d746-750/T790M/C797S mutant EGFR. These cell-based assays were conducted at WuXi AppTec Corp. (Shanghai, China) using gefitinib and brigatinib, both approved by the FDA for NSCLC therapy, as the reference compounds. Only these two molecules were investigated because of the difficulty in synthesizing 20 and 21 in amounts sufficient for cellular studies. As shown in Table 4, both 18 and 19 revealed high antiproliferative activity against the d746-750/T790M/C797S mutant cells at the submicromolar level. However, micromolar-level inhibitory activity was also observed for the cell lines with the wild-type EGFR, to an extent similar to the reference compounds. Because the anticellular activity appears to be higher than the inhibitory activity in enzyme assays (Table 3), further optimization would be required for the fourth-generation EGFR inhibitors found in this work to alleviate the potential off-target activity.
Although some promising fourth-generation EGFR inhibitors were identified in this work, the biochemical potencies of many compounds in Tables 1-3 remained modest in spite of the modification of the scoring function for virtual screening and de novo design. The imperfection of the scoring function can be attributed in large part to the incomplete optimization of the weighting factors for the various energy terms, which stems from the insufficient number of EGFR-inhibitor complexes in the training set used for parameterization. The scoring function is expected to become more accurate upon reoptimizing the weighting factors with a new training set supplemented with a variety of EGFR-inhibitor complexes. In the near future, we plan to use the improved scoring function to design and identify fourth-generation EGFR inhibitors that are more potent and selective than those presented in this work.

Structural Preparations of d746-750/T790M/C797S Mutant and Wild Type of EGFR
Because the 3D structure of the d746-750/T790M/C797S mutant EGFR was unavailable in the Protein Data Bank (PDB), its atomic coordinates were constructed by homology modeling using the active conformation of the L858R/T790M/C797S mutant [28] as the structural template (PDB entry: 6JRJ). This homology modeling began with the retrieval of the amino acid sequence of human EGFR comprising 1210 residues from the UniProtKB protein knowledgebase (http://www.uniprot.org, accession number: P00533, gene name: EGFR ERBB). Only the cytoplasmic kinase domain (residues 700-1014) of EGFR was considered in the homology modeling because the present study was focused on the discovery of ATP-competitive inhibitors. To build the structure of the d746-750/T790M/C797S mutant, the atomic coordinates were optimized so as to minimize the violation of spatial restraints, as implemented in the latest version of the MODELLER program [41].
The X-ray crystal structure of the EGFR kinase domain in the active form [42] (PDB entry: 2GS2) was selected as the receptor model for the wild type. Finally, all-atom models of the wild type and the d746-750/T790M/C797S mutant were constructed by adding hydrogen atoms according to the protonation states of titratable residues inferred from the patterns of intramolecular hydrogen bonds.
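Before a homology model of the triple mutant can be built, the target sequence has to carry the deletion of residues 746-750 and the T790M and C797S substitutions. The sketch below illustrates one way to derive such a target sequence in plain Python. It is an assumption-laden illustration rather than the authors' procedure: the function name build_mutant is hypothetical, and the input is a toy poly-Gly fragment in which only the residues relevant to the mutations (ELREA at 746-750, Thr790, Cys797) are set, not the real EGFR kinase-domain sequence.

```python
def build_mutant(seq, first_resnum, deletion, substitutions):
    """Apply point substitutions and one deletion to a numbered sequence.

    seq           -- one-letter wild-type sequence
    first_resnum  -- residue number of seq[0]
    deletion      -- (start, end) residue numbers, inclusive
    substitutions -- {resnum: (expected_wild_type, mutant)}
    """
    residues = {first_resnum + i: aa for i, aa in enumerate(seq)}
    for resnum, (wt, mut) in substitutions.items():
        found = residues.get(resnum)
        if found != wt:
            raise ValueError(f"expected {wt} at {resnum}, found {found}")
        residues[resnum] = mut
    start, end = deletion
    return "".join(aa for n, aa in sorted(residues.items())
                   if not start <= n <= end)


# Toy fragment covering residues 744-799 (placeholder, NOT the real sequence).
FIRST = 744
toy = ["G"] * 56
toy[746 - FIRST:751 - FIRST] = "ELREA"  # residues removed by E746_A750del
toy[790 - FIRST] = "T"
toy[797 - FIRST] = "C"
wild_type = "".join(toy)

mutant = build_mutant(wild_type, FIRST, (746, 750),
                      {790: ("T", "M"), 797: ("C", "S")})
print(len(wild_type), len(mutant))
```

The check on the expected wild-type residue guards against numbering offsets, which are a common source of error when the kinase-domain fragment does not start at residue 1.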

Two-Track Virtual Screening to Identify the Fourth-Generation EGFR Inhibitors
To identify EGFR inhibitors specific for the d746-750/T790M/C797S mutant, virtual screening should be conducted against both the triple mutant and the wild-type EGFR so as to collect candidates that bind tightly to the former and weakly to the latter. This two-track virtual screening started with the preparation of a docking library containing approximately 370,000 synthetic and natural compounds from the latest version of the chemical database provided by InterBioScreen Ltd. A total of 560,000 compounds in the database were first filtered to retain only those that possess the physicochemical properties of drug candidates [35]. Molecules with pairwise Tanimoto coefficients exceeding 0.8 were then clustered and replaced by a single representative molecule to remove structural redundancy. After the two filtration steps, 3D atomic coordinates of all the molecules in the docking library were generated with the CORINA program [43].
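The redundancy-removal step can be illustrated with a short Python sketch. The text specifies only the Tanimoto threshold of 0.8; the fingerprint type and the clustering algorithm are not stated, so the greedy leader-style selection below, the set-of-bits fingerprint representation, and all names are assumptions made for illustration.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pick_representatives(fingerprints: dict, threshold: float = 0.8) -> list:
    """Greedy leader selection.

    A molecule is kept only if its similarity to every representative
    chosen so far does not exceed the threshold; otherwise it is treated
    as a member of an existing cluster and dropped.
    """
    representatives = []
    for name, fp in fingerprints.items():
        if all(tanimoto(fp, fingerprints[r]) <= threshold
               for r in representatives):
            representatives.append(name)
    return representatives


# Toy fingerprints (hypothetical bit indices, not real molecules).
library = {
    "mol_1": frozenset(range(1, 11)),                       # bits 1..10
    "mol_2": frozenset(range(1, 12)),                       # bits 1..11
    "mol_3": frozenset(range(1, 6)) | frozenset(range(20, 25)),
}
print(pick_representatives(library))
```

In this toy library, mol_2 shares 10 of 11 bits with mol_1 (Tanimoto about 0.91) and is therefore absorbed into the cluster represented by mol_1, while mol_3 is dissimilar enough to be kept.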
Two-track virtual screening to identify fourth-generation EGFR inhibitors was performed with the automated AutoDock program [44,45], which had performed well in the discovery of various kinase inhibitors [46-48]. Although ligand hydration contributes significantly to protein-ligand binding affinity, it is difficult to reflect this effect precisely in docking simulations for virtual screening because the scoring function of the original AutoDock program contains a crude dehydration energy term that uses only six atom types to describe widely varying solute molecules. As a preliminary step to virtual screening of fourth-generation EGFR inhibitors, the scoring function was therefore modified by substituting a new ligand dehydration energy term for the original one. This modified scoring function (ΔG_b^aq) has the following mathematical form.
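Equation (1) itself is not reproduced in this text. A form consistent with the term definitions given in the following paragraphs would be the one below, assuming the standard AutoDock 12-6 and 12-10 functional forms and a solvent-contact hydration term with a Gaussian envelope. The symbols q_i (atomic charge), ε(r_ij) (distance-dependent dielectric), W_sol (hydration weighting factor), and σ (Gaussian width) do not appear in the surrounding text and are assumptions of this reconstruction.

```latex
\begin{equation}
\begin{aligned}
\Delta G_{b}^{aq} ={}& W_{vdW}\sum_{i}\sum_{j}\left(\frac{A_{ij}}{r_{ij}^{12}}-\frac{B_{ij}}{r_{ij}^{6}}\right)
 + W_{hbond}\sum_{i}\sum_{j}E(t)\left(\frac{C_{ij}}{r_{ij}^{12}}-\frac{D_{ij}}{r_{ij}^{10}}\right) \\
 &+ W_{elec}\sum_{i}\sum_{j}\frac{q_{i}q_{j}}{\varepsilon(r_{ij})\,r_{ij}}
 + W_{tor}N_{tor}
 + W_{sol}\sum_{i}S_{i}\left(O_{i}^{max}-\sum_{j\neq i}V_{j}\,e^{-r_{ij}^{2}/2\sigma^{2}}\right)
\end{aligned}
\tag{1}
\end{equation}
```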
The coefficients W_vdW, W_hbond, W_elec, and W_tor in Equation (1) are the weighting factors of van der Waals interactions, hydrogen bonds, electrostatic interactions, and torsional motions of the putative inhibitor, respectively. The variable r_ij represents the interatomic separation, and the A_ij, B_ij, C_ij, and D_ij parameters are associated with the well depth and the equilibrium distance in a given potential energy function. AMBER force field parameters were adopted to compute the van der Waals interaction energies between EGFR and all putative inhibitors. An additional weighting factor, E(t), was necessary in the intermolecular hydrogen bond term to reflect the angle-dependent directionality. In calculating the intermolecular electrostatic interactions between EGFR and a putative inhibitor, we used the atomic charges determined with the Gasteiger-Marsili method [49] and a distance-dependent sigmoidal function as the dielectric constant to simulate the long-range charge screening effects [50]. The N_tor parameter in the torsional term is the number of rotatable bonds, which is used to estimate the entropic penalty incurred when a putative inhibitor binds in the ATP-binding site of EGFR.
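The charge-screening idea can be made concrete with a short sketch. The text specifies only "a distance-dependent sigmoidal function"; the parameter values below are the Mehler-Solmajer constants as commonly used in AutoDock and are an assumption here, as are the function names and the default weight of 1.0.

```python
import math

# Assumed Mehler-Solmajer parameters (as commonly used in AutoDock).
A = -8.5525
EPS0 = 78.4          # bulk water dielectric constant
B = EPS0 - A
LAM = 0.003627
K = 7.7839
COULOMB = 332.06371  # kcal*Angstrom/(mol*e^2)


def sigmoidal_dielectric(r: float) -> float:
    """Distance-dependent dielectric: small at contact, bulk-like far away."""
    return A + B / (1.0 + K * math.exp(-LAM * B * r))


def screened_coulomb(qi: float, qj: float, r: float, w_elec: float = 1.0) -> float:
    """One pairwise electrostatic contribution, in kcal/mol (r in Angstrom)."""
    return w_elec * COULOMB * qi * qj / (sigmoidal_dielectric(r) * r)


# Screening grows with distance, so distant charge pairs contribute little.
near = screened_coulomb(0.4, -0.4, 3.0)
far = screened_coulomb(0.4, -0.4, 12.0)
```

With these assumed constants the effective dielectric rises from a value near 1 at very short range toward 78.4 at long range, which damps long-range charge-charge terms far more strongly than a constant dielectric would.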
The first four terms in Equation (1) correspond to the binding free energy in the gas phase, while the final term is the negative of the ligand hydration free energy. In this energy term, the S_i, V_i, and O_i^max parameters denote the atomic hydration energy per unit volume, the atomic volume in molecules, and the atomic maximum occupancy, respectively [51]. To quantify the hydration free energies of putative EGFR inhibitors, all atomic parameters were derived with the extended solvent-contact model, which had performed well in the SAMPL4 blind prediction challenge for molecular hydration free energies [52,53]. Introducing this more sophisticated hydration free energy term into the scoring function is expected to raise the likelihood of finding actual EGFR inhibitors in virtual screening by preventing the overestimation of the biochemical potency of candidate molecules with many polar groups [31]. The virtual screening protocol was not validated explicitly in this work because the accuracy of the modified scoring function had been demonstrated in previous studies [47,48].
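A minimal sketch of how a solvent-contact term of this kind can be evaluated is given below. It assumes a Gaussian envelope of width sigma for the volume occupied by neighboring atoms; the default sigma of 3.5 Å, the weight `w_sol`, the choice of which atoms count as neighbors, and the function name are all assumptions, and the toy parameter values are not taken from the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def ligand_dehydration_term(coords: Sequence[Point],
                            S: Sequence[float],
                            V: Sequence[float],
                            O_max: Sequence[float],
                            sigma: float = 3.5,
                            w_sol: float = 1.0) -> float:
    """Evaluate w_sol * sum_i S_i * (O_i^max - sum_{j != i} V_j * exp(-r_ij^2 / (2 sigma^2))).

    S_i is the atomic hydration energy per unit volume, V_j the atomic
    volume, and O_i^max the maximum occupancy around atom i. Neighbors
    that sit close to atom i reduce the volume left for solvent.
    """
    total = 0.0
    for i, ri in enumerate(coords):
        occupied = sum(
            V[j] * math.exp(-math.dist(ri, rj) ** 2 / (2.0 * sigma ** 2))
            for j, rj in enumerate(coords)
            if j != i
        )
        total += S[i] * (O_max[i] - occupied)
    return w_sol * total


# Toy example: an isolated atom keeps its full occupancy.
isolated = ligand_dehydration_term([(0.0, 0.0, 0.0)], [0.5], [10.0], [2.0])
```

Under these assumptions, a polar atom buried among many neighbors loses most of its solvent contact, so its contribution to the term shrinks accordingly.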

De novo Design
To improve the potency and selectivity in inhibiting the d746-750/T790M/C797S mutant, the hit compounds identified from virtual screening needed to be structurally modified so as to maximize the interactions in the ATP-binding site. For this purpose, structure-based de novo design was performed in a stepwise fashion. First, various derivatives of a hit compound were generated with the LigBuilder program [54] using the structure of the d746-750/T790M/C797S mutant in complex with the hit compound as the starting structure. This step used a genetic algorithm to modify the molecular core by introducing a variety of substituents at specified positions. The number of such substitution positions was limited to two in this study to reduce the computational burden. An empirical scoring function comprising electrostatic, van der Waals, hydrogen bond, and entropic terms was then used to select approximately 14,000 derivatives that were predicted to have a higher binding affinity than the initial hit. Bioavailability rules were also applied in this step to collect only the derivatives with drug-like physicochemical properties.
The second step of de novo design was performed in a similar manner to the preceding two-track virtual screening in that all of the derivatives were further screened to select only those with a higher binding affinity for the d746-750/T790M/C797S mutant than for the wild type. The modified scoring function in Equation (1) served to evaluate the derivatives designed in the first step. Among the derivatives predicted to bind more tightly to the triple mutant than to the wild type, those with a difference in binding free energies larger than 5 kcal/mol were inspected for synthetic accessibility. Finally, seven and ten derivatives of 1 and 2, respectively, were synthesized and evaluated with enzyme inhibition assays to find the new fourth-generation EGFR inhibitors.
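To give a sense of scale for the 5 kcal/mol cutoff, the sketch below converts a binding free energy difference into an idealized ratio of binding constants via exp(ΔΔG/RT) and applies the cutoff to a table of scores. The temperature of 298.15 K, the function names, and the example scores are assumptions for illustration only; docking scores are approximate, so the ratio is a rough guide rather than a predicted experimental selectivity.

```python
import math
from typing import Dict, List, Tuple

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def selectivity_fold(ddg_kcal: float, temperature: float = 298.15) -> float:
    """Idealized ratio of binding constants implied by a free energy gap."""
    return math.exp(ddg_kcal / (R_KCAL * temperature))


def select_candidates(scores: Dict[str, Tuple[float, float]],
                      min_gap: float = 5.0) -> List[str]:
    """Keep derivatives whose mutant score beats the wild-type score by more than min_gap.

    scores maps a name to (dG_mutant, dG_wild_type) in kcal/mol; more
    negative values mean tighter predicted binding.
    """
    return [name for name, (dg_mut, dg_wt) in scores.items()
            if (dg_wt - dg_mut) > min_gap]


# Hypothetical scores, not taken from the paper.
example = {"deriv_x": (-12.0, -6.0), "deriv_y": (-10.0, -8.0)}
kept = select_candidates(example)
fold_at_cutoff = selectivity_fold(5.0)
```

At room temperature a 5 kcal/mol gap corresponds to a ratio on the order of several thousand, which is the same order of magnitude as the selectivity targeted in this work.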

General Methods
Unless stated otherwise, reactions were performed in flame-dried glassware. Analytical thin layer chromatography (TLC) was performed on precoated silica gel 60 F254 plates, and visualization on TLC was achieved by UV light (254 and 365 nm). High-resolution mass spectra were obtained by the EI or FAB method at the Korea Basic Science Institute (Daegu). Commercial grade reagents and solvents were used without further purification except as indicated below.

Synthesis of Compound C
In general, aurone derivatives were synthesized in two steps as depicted in Scheme 1. In the first step, piperidine derivatives or Boc-protected piperazine were attached to a coumaranone core through a methylene unit using paraformaldehyde. Benzofuran-2-carbaldehyde was then introduced to the modified coumaranone core by a condensation reaction in the presence of a catalytic amount of base. Further deprotection with HCl or BBr3 was employed to afford additional aurone derivatives.

Representative Procedure for Modification of Coumaranone (Step 1)
The reaction was conducted in a round-bottom flask (15 mL) sealed with a rubber septum. 6-Hydroxy-3-coumaranone (100 mg, 0.67 mmol) and paraformaldehyde (20 mg, 0.67 mmol) were combined. To the mixture, 3 mL of ethanol and 3-methylpiperidine (78 mg, 0.67 mmol) were added. The mixture was stirred under reflux at 80 °C overnight. The reaction was monitored by TLC using 95% dichloromethane and 5% methanol as the mobile phase. After 24 h, the reaction mixture was concentrated, diluted with dichloromethane (25 mL × 3), and washed with brine (50 mL). The organic layer was dried over Na2SO4. After removal of the solvent, the concentrated mixture was purified by flash chromatography on silica gel (dichloromethane/methanol = 20:1) to give the desired product (35 mg, 20%, white solid).

Cell Proliferation Inhibition Assay
Two compounds (18 and 19) were sent to WuXi AppTec Corp., Shanghai, China, for cell proliferation inhibition assay. The cells were treated with various concentrations of the compounds in duplicate. After 72 h, the fluorescence signals were measured at an excitation wavelength of 540 nm and an emission wavelength of 590 nm using a microplate reader. More specifically, the number of living cells was determined with the CellTiter-Glo® Luminescent Cell Viability Assay (Promega).

Conclusions
Based on the two-track virtual screening and the targeted de novo design, we discovered effective fourth-generation EGFR inhibitors that are highly selective for the d746-750/T790M/C797S mutant over the wild type. This was made possible by the modified protein-ligand binding free energy function involving a new hydration free energy term with enhanced accuracy. Most remarkably, compounds 19 and 20 exhibited low-nanomolar biochemical potency against EGFR d746-750/T790M/C797S as well as more than 10^4-fold selectivity over the wild type in vitro. The docking simulation results indicated that these new inhibitors would be bound tightly in the ATP-binding pocket of the triple mutant through bidentate hydrogen bonds with backbone groups in the hinge region, together with hydrophobic interactions with the nonpolar residues in the Gly loop, hinge region, and interdomain region. A hydrogen bond with the mutated residue Ser797 was also shown to be a significant binding force that stabilizes the fourth-generation EGFR inhibitors in the ATP-binding site of the d746-750/T790M/C797S mutant. In addition to the interactions with the mutated residue Met790, the establishment of a van der Waals contact with Gly loop residues and the formation of a hydrogen bond with Asp855 in the activation loop could be invoked to account for the high inhibitory activity and selectivity of 19 and 20, respectively. These fourth-generation EGFR inhibitors are anticipated to serve as lead compounds for the development of new anticancer medicines against NSCLC cells resistant to second- and third-generation EGFR inhibitor drugs.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.