Binding Free Energy Calculation Based on the Fragment Molecular Orbital Method and Its Application in Designing Novel SHP-2 Allosteric Inhibitors

The accurate prediction of binding free energy is a major challenge in structure-based drug design. Quantum mechanics (QM)-based approaches show promising potential in predicting ligand–protein binding affinity by accurately describing the behavior and structure of electrons. However, traditional QM calculations face computational limitations, hindering their practical application in drug design. Nevertheless, the fragment molecular orbital (FMO) method has gained widespread application in drug design due to its ability to reduce computational costs and achieve efficient ab initio QM calculations. Although the FMO method has demonstrated its reliability in calculating the gas phase potential energy, the binding of proteins and ligands also involves other contributing energy terms, such as solvent effects, the ‘deformation energy’ of a ligand’s bioactive conformations, and entropy. Particularly in cases involving ionized fragments, the calculation of solvation free energy becomes particularly crucial. We conducted an evaluation of some previously reported implicit solvent methods on the same data set to assess their potential for improving the performance of the FMO method. Herein, we develop a new QM-based binding free energy calculation method called FMOScore, which enhances the performance of the FMO method. The FMOScore method incorporates linear fitting of various terms, including gas-phase potential energy, deformation energy, and solvation free energy. Compared to other widely used traditional prediction methods such as FEP+, MM/PBSA, MM/GBSA, and Autodock vina, FMOScore showed good performance in prediction accuracies. By constructing a retrospective case study, it was observed that incorporating calculations for solvation free energy and deformation energy can further enhance the precision of FMO predictions for binding affinity. Furthermore, using FMOScore-guided lead optimization against Src homology-2-containing protein tyrosine phosphatase 2 (SHP-2), we discovered a novel and potent allosteric SHP-2 inhibitor (compound 8).


Introduction
Computer-aided drug design (CADD) strategies have become an indispensable tool in modern drug discovery and development, with the primary objective of structure-based drug design being the prediction of protein-ligand binding affinity [1][2][3][4].The accuracy of binding affinity predictions is essential in rationalizing potency, selectivity, and computational drug design.Currently, various strategies have been proposed to predict proteinligand binding affinities, including low-end empirical scoring functions (SFs) [5], molecular mechanics/Poisson-Boltzmann (generalized Born) surface area (MM/PB(GB)SA) methods [6,7], and theoretically rigorous free energy perturbation (FEP) [8] and thermodynamic integration (TI) [9,10].However, these force field-based molecular mechanics calculation methods are limited by their ability to accurately capture weak interactions (CH-π, Cl-π, cation-π) in inter-and intra-biomolecules without rigorous testing.In contrast, quantum mechanics (QM) can describe the behavior and structure of electrons and derive the properties of substances [11].However, applying QM methods toward large biomolecular systems has been computationally quite expensive.
The fragment molecular orbital (FMO) method was proposed by Kitaura et al. [12][13][14][15], and is achieved by dividing the larger system into smaller fragments, which is performed using conventional molecular orbital calculations and offers a substantial computational cost saving over traditional QM methods.This method is helpful in analyzing the interaction energies of the ligand and surrounding residues by calculating fragment-fragment interaction energies.This indicates that the FMO method constitutes a more precise approach than molecular mechanics methods for capturing weak interactions.Therefore, this method has a wide range of applications in the field of drug design, including structure-based drug design (SBDD) [16][17][18][19], fragment-based drug design (FBDD) [20], and protein-protein interactions (PPIs) [21][22][23][24].In structure-based drug design, the FMO method proves highly valuable for the quantitative understanding of molecular interactions.Medicinal chemists and computational chemists employ interaction data between ligands and individual residues of the protein to engage in structure-activity relationship research, establishing a theoretical foundation for ligand design [25,26].This information is also instrumental in predicting binding modes [27,28], performing virtual screening [29][30][31], and elucidating molecular recognition mechanisms within RNA-ligand systems [32], among other applications.
While FMO calculations represent a valuable approach for enhancing enthalpic contributions in drug design, the phenomenon of protein-ligand binding is an exceedingly complex process where binding might be driven by different energy terms, including direct enthalpic contributions, entropy, solvation effects, and the 'deformation energy' of the ligand's bioactive conformation [33].Solvation effects should be taken into consideration when assessing binding affinities, particularly in cases involving interactions with ionized fragments [34].In recent years, solvation methods have been incorporated into the FMO method through the utilization of implicit solvent models such as the polarized continuum model (PCM) and solvation model density (SMD) [35][36][37][38][39].In addition, the solvation energies were calculated in separate calculations using generalized Born/surface area (GBSA) or the Poisson-Boltzmann/surface area (PBSA) method [40][41][42][43].These solvation methods were anticipated to enhance the precision of FMO binding affinity calculations, and previous studies have reported their efficacy [44][45][46].
In the context of the large-scale systems encountered in drug design, semi-empirical quantum mechanical methods offer a compromise between time and precision.The PM7/COSMO solvation model demonstrates good performance in calculating solvation energy changes during binding [47].Hence, we intend to assess the performance of three levels of implicit solvent models (ab initio, semi-empirical, and molecular mechanics (MM)based) on FMO binding affinity calculations using the Schrödinger FEP+ data set [48] as a benchmark.Currently, studies have reported that the incorporation of ligand deformation energy can enhance the predictive performance of the FMO /PCM method in binding affinity predictions [36,49].However, as of now, there is no reported research on the contribution of entropy.Hence, we explore the optimal combinations of enthalpy of FMO calculation, solvation free energy, deformation energy, and entropy to deal with a wide variety of drug design cases.
In this work, we develop an FMO-based binding affinity calculation method (FMOScore) with linear fitting on terms including solvation free energy, calculated by the PM7/COSMO model [50], and the deformation energy term of ligands.When the entropy term, calculated with the interaction entropy (IE) method [51], was added, FMOScore exhibited a slight performance boost.Considering time consumption and cost computationally, we discarded the entropy term.The performance study of FMOScore was carried out and compared with other methods of calculating the free energy of binding (FEP+, MM/PB(GB)SA, Vina) on the Schrödinger FEP+ benchmark of eight targets containing 258 complexes.Results demonstrated that FMOScore improved the prediction accuracy of the FMO method and displayed better or equally high correlation to the experimental data compared to the results for these traditional methods (FEP+, MM/PB(GB)SA, Vina).
Further, FMOScore was applied to design new inhibitors against Src homology-2containing protein tyrosine phosphatase 2 (SHP-2), which is a non-receptor protein tyrosine phosphatase encoded by the gene PTPN11 [52].SHP-2 is recognized as a promising drug target for the treatment of multiple cancer types and tumors, including leukemia, Noonan syndrome (NS), LEOPARD syndrome (LS), colon cancer, melanoma, lung adenocarcinoma, neuroblastoma, and hepatocellular carcinoma [53][54][55].Previously, SHP-2 inhibitors targeting the catalytic PTP domain failed to advance successfully into clinical trials mainly because of their drawbacks, including poor membrane permeability, low selectivity, and poor oral bioavailability [56].To overcome these shortcomings, Novartis reported the first allosteric inhibitor, SHP099 (SHP-2 WT IC 50 = 0.071 µM), with high selectivity and low toxicity that stabilizes the auto-inhibited conformation of SHP-2 at the "tunnel" allosteric site [57].Until now, several SHP-2 allosteric inhibitors, such as JAB-3068, JAB-3312, TNO155, ERAS-601, RMC-4630, BBP-398, SH3809, RLY-1971, and HBI-2376, have advanced into clinical trials for cancer treatment [58].In addition, recent studies have reported that SHP-2 inhibitors, activators, and proteolysis-targeting chimera (PROTAC)-based degraders also display therapeutic promise [59][60][61].Herein, we report strategies for scaffold hopping and structural modifications based on a lead compound (SHP099) using the FMOScore method, leading to the discovery of compound 8 as a novel and potent allosteric SHP-2 inhibitor.These encouraging outcomes suggest that FMOScore may be valuable in various structure-based drug design efforts for rational drug discovery.

Influence of Energy Terms on the Performance of FMOScore
Given the thermodynamic cycle, free energy can be split into several components, including the gas-phase potential energy, solvation free energy, and the entropy of ligandreceptor interactions (Figure 1).Therefore, we constructed an FMO-based binding affinity prediction strategy, which consisted of eight functions (Table 1).These functions included the calculation of the solvation free energy term by the PBSA model (function form 2: FMO_PBSA), the GBSA model (function form 3: FMO_GBSA), the PM7/COSMO model (function form 4: FMO_COSMO), the PCM model (function form 5: FMO_PCM), and the SMD model (function form 6: FMO_SMD).Additionally, the deformation energy term of ligands was calculated (function form 7: FMO_COSMO_SE), and the entropy term was calculated by the IE method (function form 8: FMO_COSMO_SE_IE).Detailed linear coefficients are given in Table S2.Table 1 summarizes the functions' integral performance to the eight biological targets in the Schrödinger FEP+ benchmark [48] (confidence intervals calculated by bootstrap sampling for each data set).The statistics of each target for Pearson correlation coefficient R, Kendall τ, the root-mean-squared error (RMSE), as well as the mean unsigned error (MUE), are also shown in Figure 2.   a R, ρ, τ are the Pearson, Spearman, Kendall correlation coefficients between ∆G exp and the calculated scores using a publicly available Schrödinger FEP+ benchmark of 8 targets containing 258 complexes, respectively.MUE is the mean unsigned error.RMSE is the root-mean-squared error between ∆G exp and scores.For each performance metric, 90% symmetric confidence intervals were estimated via bootstrap sampling with 10,000 resampling events and are reported as , where X is the mean statistic and X low and X high are the values of the statistic at the 5th and 95th percentiles of the value-sorted list of the bootstrap samples.The arrows indicate whether Functions 2~8 show increases or decreases in five evaluation criteria in comparison to Function 1 (FMO method).The percentage values in brackets indicate the relative change of metrics computed by other functions in comparison to those calculated by Function 1 (FMO method).
To explore the best-fit function of the FMO-based binding affinity calculation model, the Schrödinger FEP+ benchmark was used to evaluate its prediction power.We performed linear fitting on the calculated energy terms and experimental binding affinities of eight targets, respectively.The results showed that the binding affinities using the conventional FMO method on this data set exhibited a relatively low correlation with the experimental values (average Pearson R = 0.62 0.55 0.68 , average Kendall τ = 0.47 0.41 0.52 ).The implicit solvation models (PBSA, GBSA, COSMO, PCM, SMD) were used to calculate the solvation free energies that evaluated the response of the whole solute energy.The PM7/COSMO model systematically improved the correlation between the scores and experimental binding free energies, with a 37.1% improvement of the Pearson R and a 35.0%decrease in the averaged RMSE over the FMO method.As expected, the semi-empirical quantum mechanical (SQM) method, the PM7/COSMO solvation model, was the best-performing method, which had a higher accuracy than the PBSA or GBSA models.This improvement can be mainly attributed to the algorithmic simplicity, numerical stability, and great insensitivity for outlying charge errors by the PM7/COSMO model [62].It is noteworthy that the performance of function form 3 (FMO_GBSA) was slightly better than that of function form 2 (FMO_PBSA).Previous studies have also reported that molecular mechanics/ Generalized-Born surface area (MM/GBSA) performed better than molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA) for both binding pose predictions and binding free energy estimations [41].In addition, we separately incorporated both PCM and SMD implicit solvent models into the FMO method to assess binding affinity.The results indicated that the binding affinities using the SMD solvent effects marginally improved the correlation with experimental values, with a 17% increase in Pearson R.However, the binding affinity predictions using the PCM solvation effects did not exhibit a good correlation with experimental values (average Pearson R = 0.43 0.26 0.60 ).This could possibly be attributed to the poor performance of the PCM for specific protein cavities, resulting in a lack of benefit from the nonpolar contributions to protein-ligand binding; on the contrary, the results deteriorate [35,63,64].Because of the COSMO method used in our scoring function, the computational cost was much lower than the combination of traditional FMO calculations with the PCM or the SMD solvent model.

Prediction Performance of the FMOScore Approach
Achieving absolute accuracy in binding affinity predictions is crucial to provide definitive guidance for lead optimization.Therefore, a performance assessment of FMOScore developed was carried out, whose results are summarized in Table S1 (detailed plots for the per-target data set can be found in Figure 3).In seven data sets (CDK8, c-Met, EG5, PFKFB3, SHP-2, SYK, and TNKS2), FMOScore demonstrated an excellent ability to predict binding free energy (Pearson R > 0.62, Kendall τ > 0.5, RMSE < 0.9 kcal/mol).In particular, FMOScore produced a good ranking in the c-Met and PFKFB3 data sets (0.77  Hence, we added the deformation energy of the ligand into function form 4 (FMO_COSMO) to obtain function form 5 (FMO_COSMO_SE).The performance of the model increased significantly (Pearson R increased by 40.3% and RMSE decreased by 37.7%, compared to function form 1). When the entropy term was included in function form 6 (FMO_COSMO_SE_IE), the prediction power was slightly elevated compared to that in function form 5 (FMO_COSMO_SE).Considering the calculation of the entropy term involves MD simulation sampling and the extraction of protein-ligand interactions from a trajectory of 100,000 frames, it incurs a high computational cost; hence, we chose function form 5 (FMO_COSMO_SE) as the FMO-based binding affinity prediction model (called FMOScore) in this work.

Prediction Performance of the FMOScore Approach
Achieving absolute accuracy in binding affinity predictions is crucial to provide definitive guidance for lead optimization.Therefore, a performance assessment of FMOScore developed was carried out, whose results are summarized in Table S1 (detailed plots for the per-target data set can be found in Figure 3).In seven data sets (CDK8, c-Met, EG5, PFKFB3, SHP-2, SYK, and TNKS2), FMOScore demonstrated an excellent ability to predict binding free energy (Pearson R > 0.62, Kendall τ > 0.5, RMSE < 0.9 kcal/mol).In particular, FMOScore produced a good ranking in the c-Met and PFKFB3 data sets (0.77 0.65 0.88 and 0.73 0.64 0.81 by Kendall τ, respectively).Meanwhile, the experimental and predicted free-energy changes (∆∆G), presented as box-plot diagrams in Figure S1, indicated the high prediction accuracy of FMOScore, which measured accuracy by comparing predicted free energy ∆G pred to experimental affinity ∆G exp for each compound individually.Except for the HIF-2α target, the predicted ∆G pred was within 2 kcal/mol of the experimental value using the FMOScore method in seven data sets (Figures 3 and S1).However, the predictive performance of FMOScore is poor for the HIF-2α target, possibly due to the pivotal role of entropy contributions in ligand binding, while FMOScore primarily focuses on the precise calculation of enthalpy contributions.Despite our attempts to enhance the performance of the FMOScore method by employing the IE method for entropy calculation, favorable outcomes were not achieved.free-energy changes (ΔΔG), presented as box-plot diagrams in Figure S1, indicated the high prediction accuracy of FMOScore, which measured accuracy by comparing predicted free energy ΔGpred to experimental affinity ΔGexp for each compound individually.Except for the HIF-2α target, the predicted ΔGpred was within 2 kcal/mol of the experimental value using the FMOScore method in seven data sets (Figures 3 and S1).However, the predictive performance of FMOScore is poor for the HIF-2α target, possibly due to the pivotal role of entropy contributions in ligand binding, while FMOScore primarily focuses on the precise calculation of enthalpy contributions.Despite our attempts to enhance the performance of the FMOScore method by employing the IE method for entropy calculation, favorable outcomes were not achieved.
Moreover, the contributions of different energy terms for the Schrödinger FEP+ benchmark calculated by FMOScore are summarized in the Supplemental Data, where the pair interaction energy calculated with the FMO method can be decomposed (PIEDA).The PIEDA analysis decomposes the interaction energy into the electrostatic and charge transfer terms, the exchange-repulsion term, and the dispersion term, which describe Hbonds (or salt bridges, polar interactions), steric repulsion, and hydrophobic interactions.Figure S2 and Table S3 display detailed information on the individual contribution of each residue to the lead compound of the eight targets and provide a reference for the rational structure-based drug design (SBDD) strategy.Moreover, the contributions of different energy terms for the Schrödinger FEP+ benchmark calculated by FMOScore are summarized in the Supplemental Data, where the pair interaction energy calculated with the FMO method can be decomposed (PIEDA).The PIEDA analysis decomposes the interaction energy into the electrostatic and charge transfer terms, the exchange-repulsion term, and the dispersion term, which describe H-bonds (or salt bridges, polar interactions), steric repulsion, and hydrophobic interactions.Figure S2 and Table S3 display detailed information on the individual contribution of each residue to the lead compound of the eight targets and provide a reference for the rational structurebased drug design (SBDD) strategy.

Comparison with Traditional Methods for Calculating the Binding Free Energy
The primary goal of FMOScore is to predict the binding free energy of a given protein-ligand complex.Comparing FMOScore results on the benchmark with FEP+, MM/GBSA, MM/PBSA, and Vina, we found that FMOScore performed better than these traditional structure-based drug design (SBDD) methods (Figure 4 and Table S1).In detail, the FMOScore method achieved better correlation (average Pearson R = 0.87 0.84 0.89 ) and higher-ranking power (average Kendall τ = 0.69 0.65 0.72 ) than FEP+ (average R = 0.78 0.74 0.83 , average Kendall τ = 0.58 0.53 0.63 ).Among the eight data sets tested, it exhibited comparable performance in binding free energy prediction of FMOScore and FEP+ (Figures 3 and 4 and Table S1).In addition, the overall performance of FMOScore (average Pearson R = 0.87 0.84 0.89 , average RMSE = 0.71 0.62 0.80 kcal/mol) was better than that of MM/GBSA (average Pearson R = 0.27 0.18 0.36 , average RMSE = 1.39 1.25  1.54 ) and MM/PBSA (average Pearson R = 0.13 0.03 0.23 , average RMSE = 1.44 1.30  1.57 ), and was much better than that of Autodock Vina (average Pearson R = 0.09 0.008 0.18 , average RMSE = 1.44 1.30 1.57 ) (Figure 4 and Table S1).Therefore, this confirmed that FMOScore has higher prediction accuracy on this benchmark consisting of eight data sets with structurally diverse sets of ligands, compared to traditional methods for calculating binding free energy.

Retrospective Application of FMOScore to Predict Binding Free Energy
In the initial phase of our investigation, we conducted a retrospective evaluation of FMOScore's performance concerning thyroid hormone receptor β (THR-β) agonists, using a set of 18 compounds taken from the work of Hu et al. [65] (Table S4 and Figure 5A).Overall, we observed that the FMOScore approach (R 2 = 0.71) exhibited a better correlation with ∆G exp compared to the FMO interaction energy (R 2 = 0.46) (Figure 5C), which suggests that the solvation free energy and deformation energy contributions play a positive role in ligand binding.Table S4 reveals that the contributions of solvation free energy and deformation energy to the binding free energy are significant, with correlations R 2 reaching 0.37 and 0.4, respectively.In detail, ligand 16 incorporated a strongly polar carboxyl moiety, which resulted in a notably elevated penalty in solvation free energy as computed using the COSMO solvent model.Ligands 3 to 5 exhibited a reduced number of flexible bonds, resulting in the lowest deformation energy penalties calculated from first principles (Table S4).This also underscored the rationality of the solvation free energy and deformation energy calculations established by this method.Moreover, the difference between ∆G pred obtained by FMOScore and the experimental value remained within 2 kcal/mol, indicating its high prediction accuracy (Figure 5B).Hence, by means of solvation free energy and deformation energy calculations, the precision of FMO predictions for binding affinity can be further enhanced, thereby extending its applicability to a broader spectrum of drug design scenarios.S1).Therefore, this confirmed that FMOScore has higher prediction accuracy on this benchmark consisting of eight data sets with structurally diverse sets of ligands, compared to traditional methods for calculating binding free energy.

Design and Synthesis of Potent and Novel SHP-2 Inhibitors
The synthesis of compound 1-8 and intermediate S1-S6 is shown in Schemes 1 and 2. Firstly, substituted phenylboronic acid and 2,5-dibromo-3-nitropyridine underwent a Suzuki coupling reaction catalyzed by palladium afforded intermediate S1.Then, S1 reacted with substituent amines to obtain intermediate S2.Subsequently, S2 was deprotected to remove the Boc group and then reduced by iron powder to obtain the final compounds 1~6 (Scheme 1).The initial step in the synthesis of compounds 7-8 involved a reaction between 3-dichlorophenol and 5-bromo-2-fluoro-3-nitropyridine, yielding intermediate S4.Then, S4 underwent the Suzuki coupling reaction with the corresponding substituted amine to obtain intermediate S5.Finally, S5 reduced the nitro group to an amino group by iron powder reduction, and then removed the Boc group to obtain the final compounds 7~8 (Scheme 2).ples (Table S4).This also underscored the rationality of the solvation free energy and deformation energy calculations established by this method.Moreover, the difference between ΔGpred obtained by FMOScore and the experimental value remained within 2 kcal/mol, indicating its high prediction accuracy (Figure 5B).Hence, by means of solvation free energy and deformation energy calculations, the precision of FMO predictions for binding affinity can be further enhanced, thereby extending its applicability to a broader spectrum of drug design scenarios.It has been reported that SHP099, a lead compound, has an inhibitory role by locking SHP-2 in an auto-inhibited conformation, which binds to the central tunnel region formed at the C-SH2, N-SH2, and PTP domains (Figure 6A).The results of PIEDA on the X-ray crystal structure of SHP-2 (PDB: 5EHR) are shown in Figure 6B.The PIEDA results showed that the high electrostatic (ES) and charge transfer with inseparable (CT + MIX) terms of Arg111, followed by Glu250, indicate that these two residues formed a strong hydrogen bond with the ligand.In addition, the main PIEDA components of Glu249 (PTP) and His114 (N-SH2) were ES and CT + MIX terms.The former formed a salt bridge or ionic interaction with quaternary ammonium cation, and the latter formed a cation-π interaction with this cation group.Arg111 made a cation-π stacking interaction with the dichlorophenyl motif.The methyl group of Thr219 formed a CH-π interaction with pyrazine of SHP099, and its DI component was substantial at −5 kcal/mol.Moreover, five residues (Thr253, Leu254, Pro491, Gln257, and Lys492), whose main PIEDA component is dispersion interactions (DIs) (−3.0~−7.2kcal/mol), were located around the dichlorophenyl group.This indicates that these residues form hydrophobic interactions with the ligand.The sum of PIE was equal to −141.7 kcal/mol (ES:EX:CT + MIX:DI = −131:98:−30:−77 kcal/mol), indicating a significant influence of electrostatic forces in the activity of this ligand.Given FMO analysis, the quaternary ammonium cation of SHP099 made apparent ionic interactions, which occupied a polar region of the allosteric binding pocket.The aminopyrazine ring acted as a H-bond acceptor and a H-bond donor to make key H-bonds with Arg111 and Glu250, respectively.Therefore, we applied the developed FMOScore method for rational drug design after analyzing the key pharmacophores of SHP099.It has been reported that SHP099, a lead compound, has an inhibitory role by locking SHP-2 in an auto-inhibited conformation, which binds to the central tunnel region formed at the C-SH2, N-SH2, and PTP domains (Figure 6A).The results of PIEDA on the X-ray crystal structure of SHP-2 (PDB: 5EHR) are shown in Figure 6B.The PIEDA results showed that the high electrostatic (ES) and charge transfer with inseparable (CT + MIX) terms of Arg111, followed by Glu250, indicate that these two residues formed a strong hydrogen bond with the ligand.In addition, the main PIEDA components of Glu249 (PTP) and His114 (N-SH2) were ES and CT + MIX terms.The former formed a salt bridge or ionic interaction with quaternary ammonium cation, and the latter formed a cation-π interaction with this cation group.Arg111 made a cation-π stacking interaction with the dichlorophenyl motif.The methyl group of Thr219 formed a CH-π interaction with pyrazine of SHP099, and its DI component was substantial at −5 kcal/mol.Moreover, five residues (Thr253, Leu254, Pro491, Gln257, and Lys492), whose main PIEDA component is dispersion interactions (DIs) (−3.0~−7.2kcal/mol), were located around the dichlorophenyl group.This indicates that these residues form hydrophobic interactions with the ligand.The sum of PIE was equal to −141.7 kcal/mol (ES:EX:CT + MIX:DI = −131:98:−30:−77 kcal/mol), indicating a significant influence of electrostatic forces in the activity of this ligand.Given FMO analysis, the quaternary ammonium cation of SHP099 made apparent ionic interactions, which occupied a polar region of the allosteric binding pocket.The aminopyrazine ring acted as a H-bond acceptor and a H-bond donor to make key H-bonds with Arg111 and Glu250, respectively.Therefore, we applied the developed FMOScore method for rational drug design after analyzing the key pharmacophores of SHP099.
The protein-ligand interactions of inhibitor SHP099 with SHP-2 (PDB code 5EHR) were revealed by FMO analysis (Figure 6B).In attempts to build an SAR of the central ring for more novel compounds, we preserved the aniline interaction with Glu250 and replaced pyrazine with pyridine.The results showed that compound 1 retained biochemical inhibition (IC 50 = 1.494 ± 0.024 µM, SHP2 PTP IC 50 > 100 µM, Table 2 and Figure S4), which exhibited a 16-fold decrease compared to SHP099 (IC 50 = 0.092 ± 0.001 µM, SHP2 PTP IC 50 > 100 µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.The protein-ligand interactions of inhibitor SHP099 with SHP-2 (PDB code 5EHR) were revealed by FMO analysis (Figure 6B).In attempts to build an SAR of the central ring for more novel compounds, we preserved the aniline interaction with Glu250 and replaced pyrazine with pyridine.The results showed that compound 1 retained biochemical inhibition (IC50 = 1.494 ± 0.024 µM, SHP2 PTP IC50 > 100 µM, Table 2 and Figure S4), which  exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > 100 µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > 100 µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > 100 µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > 100 µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > 100 µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > 100 µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > 100 µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > 100 µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > 100 µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > 100 µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed exhibited a 16-fold decrease compared to SHP099 (IC50 = 0.092 ± 0.001 µM, SHP2 PTP IC50 > µM).This is presumably due to changes in the binding affinity caused by differences in electron cloud density between pyrazine and pyridine.In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed In our attempt to improve the potency and obtain novel structures by modifying the piperazine motif, we extended the terminal amine toward a polar region for discovering new interactions.Morphing the piperazine ring to a piperidine-3-ylmethanamine motif and a piperidine-4-amine motif reduced inhibition 10-fold (compound 2, IC 50 = 15.610 ± 1.490 µM) and 15-fold (compound 4, IC 50 = 23.490± 2.860 µM), respectively.These compounds have low binding affinities predicted by FMOScore, which can be attributed to higher deformation energies resulting from an increased number of one or two rotation bonds (Table 2).Pyrrolidine-3-amine substituted for the piperazine ring (compound 3, IC 50 = 5.174 ± 1.080 µM) was approximately equipotent to compound 1 in biochemical assays, on account of it having less solvation free energy (Table 2).Further, our attention turned to the dichlorophenyl motif, which effectively filled a small hydrophobic pocket formed by Arg111, Thr253, Leu254, Gln257, and Gln495.Attempts to replace meta-chlorine with fluorine in the phenyl ring resulted in poor phosphatase inhibition (compound 5, IC 50 = 15.110 ± 0.800 µM).
The redesign of the aryl group to include pyridyl N resulted in a more polar compound 6, displaying reduced phosphatase inhibition (IC 50 = 10.650 ± 0.920 µM) compared to compound 1.These data indicated that the decrease in the hydrophobicity of compounds 5 and 6 affected the biochemical activity.Then, we introduced the thioether linker into the dichlorophenyl motif, which resulted in compound 7.We evaluated compound 7 and found reduced biochemical potency (IC 50 = 11.250± 1.190 µM) relative to compound 1.In our attempt to improve the potency of SHP-2 biochemical inhibition, substituting the piperazine subunit in compound 7 with spirocyclic amine (compound 8, IC 50 = 2.290 ± 0.660 µM, Table 2 and Figure S4) found new interactions and increased the biochemical activity five-fold compared to that in compound 7. FMOScore revealed that spirocyclic amine formed a CH-π interaction with His114 (Figure 6D,E), which occupied more ES components than compounds 7 (−16 kcal/mol vs. −5 kcal/mol).We observed a higher correlation (R 2 = 0.87, Figure 6C) between the inhibitors' biological potencies and the binding affinities calculated by FMOScore, as compared to the FMO method (R 2 = 0.56).Despite the fact that SHP2 inhibitors consist of ionizable fragments, the incorporation of an implicit solvent model into the FMOScore method proved advantageous for assessing the accurate binding affinity.Those results suggested that the direction of structure optimization is reasonable for aminopyrazine core, dichlorophenyl, and piperazine motifs.

FMOScore Calculations
From a microscopic point of view, binding free energy (∆G bind ) is composed of the enthalpy change and entropy change (see also the thermodynamic cycle in Figure 1).In the drug discovery process, the half maximal inhibitory concentration (IC 50 ) values are often experimentally measured for tested inhibitors.The dependence between IC 50 or K d and free energy of binding can be expressed [66] as follows: where k B and T stand for Boltzmann constant and temperature, respectively.The free energy of binding in solution, ∆G bind , is broken down into the gas-phase binding free energy (∆G gas bind ) and the solvation free energy (∆G sol ).The gas-phase free energy of binding contains an enthalpic term (∆H gas bind ) and an entropic term (T∆S gas bind ).
∆H gas bind is expressed as the difference of molecular energy in vacuum for the structures of the complex ("com"), receptor protein ("pro"), and isolated ligand ("lig").T∆S gas bind is represented as the degrees of freedom of the individual systems, calculated by interaction entropy (IE) method. ∆H There is a difference between the energy of the isolated ligand conformation and the protein-bound ligand ("lig(com)"), due to fact that the ligand must take on strain in order to bind to the protein.Therefore, ∆H gas bind is further separated into the intermolecular interaction energy (∆E int ), all in the bound state, and the deformation energy of the ligand (∆E def lig ).∆H ∆E def lig = E lig(com) − E lig (7) where ∆E int is represented as the summation of IFIEs calculated by FMO calculations in vacuum (see below).Finally, the solvation free energy upon complex formation is approximately divided into the change of solvation free energy in binding state (∆G sol ).Hence, ∆G bind is directly estimated from FMOScore energies as follows: Considering the entropy term is computationally expensive, the entropy term was dropped in FMOScore (detailed results can be found in Section 2.1).
Structure preparation.In this work, the structures of Schrödinger FEP+ benchmark and SHP-2-ligand complexes were obtained by crystal structures and molecular docking simulations.Residues within 4.5 Å surrounding the ligand were chosen as the entire system calculated.The missing hydrogen atoms of complexes were added at physiological pH (7.0), and the protonation states of the acidic and basic amino acid residues were predicted using the Protonate 3D module with the default settings (the generalized Born/volume integral methodology with a 15 Å cutoff and a solvent dielectric constant of 80).We used a constrained minimization procedure to refine the complexes with the semiempirical Amber10: EHT forced field implemented in Molecular Operating Environment (MOE version 2014.09)program, where each atom was allowed to deviate up to 0.5 Å from its original position in the crystal structure.The prepared complex structures were subjected to subsequent accurate FMO calculations.
Fragment molecular orbital (FMO) method and inter-fragment interaction energy (IFIE).Here, the FMO method was applied to all systems using FMO code version 5.1 as embedded in General Atomic and Molecular Electronic Structure System (GAMESS), a general ab initio quantum chemistry package [14,67].In this work, we used the secondorder Møller−Plesset perturbation theory (MP2) [68] method with a 6-31G* basis set [69].
The FMO method is a general quantum mechanical method that divides large molecules, such as proteins, into N fragments.Each residue is characterized as a fragment, and the interaction energies reported herein correspond to essential amino acid residues, not fragments of residues.The total energy E of the whole system is calculated using the following series expansion (truncated here at the two-body terms, the FMO2 method [70]).
where monomer (I) and dimer (IJ) energy are obtained as described above.
The value of IFIE as well as the pair interaction energy decomposition analysis (PIEDA) were used to investigate the chemical situations in pharmacophore.The fragment pair interaction energy (PIE) was obtained by subtracting the two monomer energies from the dimer energy.PIE does not indicate binding free energy, but rather the "strength" of the interaction between the ligand and the protein residues in the complex, which is enthalpy in gas phase.The IFIE between fragments I and J (∆E int I J ) is decomposed into its energy components, such as electrostatic (ES), exchange (EX), charge transfer with inseparable (CT + MIX), and dispersion interactions (DI) [71], as follows: where the DI component corresponds to the MP2 electron correlation energy and the others constitute the Hartree−Fock (HF) energy.PM7/COSMO solvation.For biological systems, a significant task to be described is the solvation phenomena of the complexes and the binding partners prior to binding, in addition to the complex interactions in vacuo.A compromise that balances speed and precision for the solvation is using the COSMO model employing the scaled-conductor approximation [72], which is implemented for PM7 calculations in MOPAC [73].Kristiam et al.'s benchmark set showed that the PM7/COSMO solvation model is highly accurate in the calculation of solvation energy change upon binding [50].
Deformation energy of ligands.Ligands often distort the shape to adjust to the narrow space of the pocket at the deformation energy.The deformation energy of the ligand is obtained from the energy difference of single-point energy between the ligand-bound conformation and the low-energy conformation.For each molecule, 10,000 conformers were generated using Confab [74] as implemented in Open Babel.The generated conformers, clustered and deduplicated, were optimized using the GFN2-xTB method developed by Grimme [75,76].The lowest energy conformers were then obtained by clustering, deduplicating, and sorting the conformations with Isostat program [77].Moreover, the conformations with the top three lowest energies were optimized and computed the vibrational frequencies in the Gaussian09 (Revision C.01) [78] software package using the B3LYP method and the 6-31G(d,p) basis set in Molclus software [77].Therefore, we obtained the lowest energy of molecules ab initio.Meanwhile, the single-point energy of ligand-bound conformation was calculated at the B3LYP/6-31G(d,p) level.In this method, it is recommended to add a dispersion correction, such as B3LYP-D3(BJ), to optimize the system when the small molecule exhibits significant intramolecular weak interactions [79].In the presence of pronounced intramolecular weak interactions, particularly dispersion-dominated pi-pi stacking, neglecting dispersion correction may lead to substantial discrepancies between the optimized and actual structures [80].
Interaction entropy calculation.A novel entropy calculation method (named interaction entropy, IE) developed by Duan et al. [51] was used for the binding free energy calculation.Interaction entropy calculations are derived from protein-ligand interactions in the gas phase, obtained from MD simulation, and represent fluctuations in ligand-receptor interactions (the derivation process can be found in reference [51]).A 2 ns MD simulation was previously performed for each system at a constant temperature (300 K).Furthermore, 100,000 snapshots were extracted from the last 1 ns trajectories for interaction entropy calculations.Equation (11) shows the components for calculating IE, where ∆E int pl = E int pl − E int pl represents the fluctuation of protein-ligand interaction energy around mean energy.

Comparison with Traditional Methods
The relative binding free energy data were obtained from the publicly available Schrödinger FEP+ benchmark data set [48].The MM/PB(GB)SA calculations were performed to obtain the binding energy of the complex implemented in mmpbsa.pytool of Amber18 [81].Snapshot conformations were extracted from the short-time MD simulation every 10 ps, from 1 ns to 2 ns, for a total of 101 snapshots.The binding conformation of proteins and ligands was scored by Autodock vina [82] with default settings.

Statistical Analyses
Correlation statistics and error estimation were evaluated with the bootstrap method with 10,000 bootstrap repetitions using standard Python 3 with NumPy and SciPy libraries [83].

Protein Expression, Purification, and Biochemical Assay
SHP2 WT (1-593) and SHP2 PTP (237-529) genes were inserted separately into a pET30 vector, and SHP2 recombinant proteins were expressed and purified according to protocols published previously [84].Briefly, the constructs were transformed into BL21 Star TM (DE3) cells and cultured at 37 • C for 3~4 h.Subsequently, the cells with high expression of SHP2 protein were harvested after being induced by IPTG and grown overnight at 18 • C. Cell pellets were resuspended in lysis buffer (25 mM imidazole, 50 mM Tris−HCl, 500 mM NaCl, 2.5 mM MgCl 2 , 1 mM tris(2-carboxyethyl)phosphine (TCEP), 1 µg/mL DNase1, and complete EDTA-free protease inhibitor) and lysed using Low Temperature Ultrahigh Pressure Continuous Flow Cell Disrupter (JNBIO), followed by ultracentrifugation.The supernatant was collected.Then, the SHP2 was purified from the collected supernatant by using HisTrap HP chelation column, TEV protease, and HiTrap Q FastFlow column.

Figure 1 .
Figure 1.Computational workflow of the FMOScore approach, which contains three terms: (a) gasphase potential energy, (b) deformation energy, and (c) solvation free energy.

Figure 1 .
Figure 1.Computational workflow of the FMOScore approach, which contains three terms: (a) gasphase potential energy, (b) deformation energy, and (c) solvation free energy.

Figure 3 .
Figure 3. Prediction results of FMOScore in the Schrödinger FEP+ benchmark.ΔG values are predicted by FMOScore in red dots and FEP+ in dark dots.Predictions within 1 and 2 kcal/mol of the experimental affinity are highlighted by dark and light gray areas, respectively.

Figure 3 .
Figure 3. Prediction results of FMOScore in the Schrödinger FEP+ benchmark.∆G values are predicted by FMOScore in red dots and FEP+ in dark dots.Predictions within 1 and 2 kcal/mol of the experimental affinity are highlighted by dark and light gray areas, respectively.
), and was much better than that of Autodock Vina (average Pearson R = 0

Figure 6 .
Figure 6.Discovery and design of novel SHP-2 inhibitors.(A) Binding mode of SHP-2 inhibitor SHP099 (PDB code: 5EHR).Key residues are shown in stick.Hydrogen bonds are displayed as yellow dashes.(B) PIE between inhibitor SHP099 and the residues at the allosteric site of SHP-2 (left graph), and PIEDA for these residues (right graph).The dispersion, exchange-repulsion, charge transfer, and electrostatic terms are represented in gray, cyan, blue, and red, respectively.(C) Correlations between experimentally measured affinities and ΔG values predicted by FMOScore for the designed SHP-2 inhibitors.(D) Predicted binding mode of compound 8. (E) PIE and PIEDA for the residues at the allosteric site of SHP-2 and compound 8.

Figure 6 .
Figure 6.Discovery and design of novel SHP-2 inhibitors.(A) Binding mode of SHP-2 inhibitor SHP099 (PDB code: 5EHR).Key residues are shown in stick.Hydrogen bonds are displayed as yellow dashes.(B) PIE between inhibitor SHP099 and the residues at the allosteric site of SHP-2 (left graph), and PIEDA for these residues (right graph).The dispersion, exchange-repulsion, charge transfer, and electrostatic terms are represented in gray, cyan, blue, and red, respectively.(C) Correlations between experimentally measured affinities and ∆G values predicted by FMOScore for the designed SHP-2 inhibitors.(D) Predicted binding mode of compound 8. (E) PIE and PIEDA for the residues at the allosteric site of SHP-2 and compound 8.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Materials and Methods Section for details).b Experimental affinities were converted to ΔGexp values using the equation ΔGexp ≈ kBTlog IC50.c Predicted binding free energy ΔGprep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (ΔE int ), solvation free energy (ΔG sol ), and deformation energy into the isolated form (ΔE def lig ).All calculated values are in kcal/mol.

a
All the data are averages of three independent experiments (see the Section 3 for details).b Experimental affinities were converted to ∆G exp values using the equation ∆G exp ≈ k B Tlog IC 50.cPredicted binding free energy ∆G prep was fitted to the three types of energies calculated by FMOScore, consisting of binding affinities in vacuo (∆E int ), solvation free energy (∆G sol ), and deformation energy into the isolated form (∆E def lig ).All calculated values are in kcal/mol.

Table 1 .
Performance comparison of FMOScore with different function forms in Schrödinger FEP+ benchmark.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.

Table 2 .
Structure-activity relationship study of SHP-2 inhibition activities.