Design, Synthesis, Biological Evaluation, and Molecular Modeling of 2-Difluoromethylbenzimidazole Derivatives as Potential PI3Kα Inhibitors

PI3Kα is a potential target for novel anticancer drugs. In this study, a series of 2-difluoromethylbenzimidazole derivatives was studied using a combination of molecular modeling techniques: 3D-QSAR, molecular docking, and molecular dynamics. The best comparative molecular field analysis (CoMFA) model had q² = 0.797 and r² = 0.996, and the best comparative molecular similarity indices analysis (CoMSIA) model had q² = 0.567 and r² = 0.960, indicating that these 3D-QSAR models are well validated and have good predictive capability. The binding mode of compound 29 with PI3Kα (PDB code: 4YKN) was explored using molecular docking and a molecular dynamics simulation. Ultimately, five new PI3Kα inhibitors were designed and screened with these models. Two of them (86 and 87) were then synthesized and biologically evaluated, giving IC50 values of 22.8 nM for 86 and 33.6 nM for 87.


Introduction
The PI3K/AKT/mTOR signaling pathway, which is aberrantly activated in many human cancers, is a therapeutic target in the development of cancer treatments [1][2][3].
Class II PI3Ks comprise PI3KC2α, PI3KC2β, and PI3KC2γ, which modulate intracellular membrane dynamics and membrane traffic [13]. Class III PI3Ks have a single member, VPS34, which participates in vesicle trafficking and autophagy [13].
Among the subtypes of PI3K, PI3Kα is most often associated with cancer and controls a variety of physiological processes as an important regulator of intracellular signaling pathways [8,14]. The aberrant activation of PI3Kα and its downstream effectors is related to tumor initiation and maintenance [15]. The selective targeting of the PI3Kα isoform has special pharmacological significance, which makes PI3Kα an attractive target in anticancer drug development [10,16]. However, although great efforts have been made to explore PI3Kα inhibitors, there are few reports on PI3Kα inhibitors that possess the ability to differentiate between PI3Kα and its isoforms [17].

In this study, 3D-QSAR models were developed using CoMFA [20] and CoMSIA [21] based on 85 2-difluoromethylbenzimidazole derivatives [22][23][24]. The relationships between key structural features and pharmacodynamic information obtained from the 3D-QSAR models are helpful for further drug research [25][26][27][28][29][30]. The binding mode of compound 29 with 4YKN was explored through molecular docking and molecular dynamics simulation. Ultimately, five new PI3Kα inhibitors were designed and evaluated, and two of them were synthesized and biologically evaluated. Compound 86 (IC50 = 22.8 nM) and compound 87 (IC50 = 33.6 nM) showed potent inhibitory activity against PI3Kα.

CoMFA and CoMSIA Statistical Results
To verify the predictive ability of the models, all key statistical parameters were analyzed. The statistical parameters of the standard CoMFA and CoMSIA models are given in Table 1. For the CoMFA analysis, partial least squares (PLS) analysis gave a cross-validated correlation coefficient (q²) of 0.797, a non-cross-validated correlation coefficient (r²) of 0.996, and an optimal number of components (ONC) of 10 in the training set. The contribution of the electrostatic field (50.4%) to the binding affinity was close to that of the steric field (49.6%). For the CoMSIA analysis, PLS analysis gave a q² of 0.567, an r² of 0.960, a Fisher test value (F) of 230.278, and a standard error of estimate (SEE) of 0.099, with 6 components in the training set.
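The two correlation coefficients reported above differ only in which predictions enter the residual sum: r² uses predictions from a model fitted to all training compounds, whereas leave-one-out q² uses, for each compound, a prediction from a model fitted without it. The following is a minimal sketch of that distinction, assuming a single descriptor and an ordinary least-squares line in place of the PLS model used in the study; the function names and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x (a stand-in for PLS, by assumption)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def r_squared(xs, ys):
    """Non-cross-validated r2: residuals come from a model fitted to all points."""
    a, b = fit_line(xs, ys)
    my = mean(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def q_squared_loo(xs, ys):
    """Leave-one-out q2 = 1 - PRESS / SS_tot, refitting once per left-out point."""
    my = mean(ys)
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot

# Hypothetical descriptor values and activities, for illustration only.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0]
activity = [1.1, 2.9, 5.2, 6.8, 9.1]
print(r_squared(descriptor, activity), q_squared_loo(descriptor, activity))
```

For a least-squares model, each left-out residual is at least as large in magnitude as the corresponding fitted residual, so q² cannot exceed r². A large gap between the two, as with the CoMSIA model here, is usually read as a sign that the fit is better than the predictive ability.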

3D-QSAR Contour Map Results and Analysis
In the optimized CoMFA and CoMSIA models, the most active compound, 29, was used as a template molecule in all subsequent contour maps (Figures 2 and 3). Contour maps are important for designing new drug candidates because they show areas where changes in the molecular field energy are closely related to changes in biological activity.
The steric and electrostatic fields from the optimal CoMFA model are shown in Figure 2. In the steric map (Figure 2a), areas where bulky substituents increase potency are represented by green polyhedrons, while areas where bulky substituents decrease potency are represented by yellow polyhedrons. The medium green contour at R2 of compound 29 indicates that compounds with bulky substituents at this site would have better biological activity. Compound 2 (pIC50 = 8.40), with a methoxy group at R2, possesses better biological activity than compound 1 (pIC50 = 8.05), without methoxy substitution. The medium-yellow contour near the methylsulfonyl group at R1 of compound 29 indicates that compounds with small substituents would have better biological activity. Compound 5 (pIC50 = 8.22), with the methylsulfonyl group, possesses better biological activity than compound 6 (pIC50 = 7.68), with the ethylsulfonyl group. A large green contour on the piperidine ring at R1 of compound 29 indicates that compounds with bulky substituents at this site would have better biological activity. Compound 17 (pIC50 = 8.29), with a 6-membered ring, possesses better biological activity than the corresponding compound 21 (pIC50 = 8.12), with a 5-membered ring, and compound 25 (pIC50 = 7.49), with a 4-membered ring.
In the electrostatic map (Figure 2b), blue areas indicate that electropositive groups are beneficial to biological activity and red areas indicate that electronegative groups are beneficial to biological activity. A large blue contour near the 2′-site of the piperidine ring at R1 of compound 29 indicates that compounds with electropositive substituents at this site would have better biological activity. Compound 29 (pIC50 = 8.64), with -N at the 1′-site of the piperidine ring, possesses better biological activity than the corresponding compound 31 (pIC50 = 7.7), with -N at the 2′-site of the piperidine ring. A small red contour above the aminomethyl group of R1 of compound 29 indicates that compounds with electronegative substituents at this site would have better biological activity. Compound 29 (pIC50 = 8.64), with the aminomethyl group, possesses better biological activity than compound 28 (pIC50 = 7.85), with the imino group.
Figure 3a depicts the CoMSIA steric contour map of the optimal model with compound 29 overlaid. In this map, the green (sterically favorable) and yellow (sterically unfavorable) contours represent 80% and 20% level contributions, respectively. A small green contour above R2 of compound 29 indicates that compounds with bulky substituents at this site would have better biological activity. Compound 5 (pIC50 = 8.22), with the methoxy group at R2, possesses better biological activity than compound 4 (pIC50 = 7.68), without the methoxy group. A medium-yellow contour near the sulfonamide group indicates that compounds with small substituents at this site would have better biological activity. Compound 47 (pIC50 = 7.11), with -NH2 not in the yellow contour, possesses slightly better biological activity than the corresponding compound 48 (pIC50 = 6.97), with -NH2 in the yellow contour.
The electrostatic contour map of the CoMSIA model is displayed in Figure 3b. The blue (electropositive group favorable) and red (electronegative group favorable) contours represent 80% and 20% level contributions, respectively. A medium-red contour near the oxygen atom of R1 of compound 29 indicates that compounds with electronegative substituents at this site would have better biological activity. Compound 2 (pIC50 = 8.40), with -OCH3, possesses better biological activity than compound 1 (pIC50 = 8.05), with -H. A medium-blue contour on the piperidine ring of R1 indicates that increasing the electropositivity of this site would improve the biological activity.
The CoMSIA contour map of the hydrophobic contribution is shown in Figure 3c. In this figure, the yellow (hydrophobic favorable) and white (hydrophobic unfavorable) contours represent 80% and 20% level contributions, respectively. Small yellow contours near the 2′-site of the piperidine ring at R1 of compound 29 indicate that a hydrophobic group at this site would improve the biological activity. Compound 10 (pIC50 = 8.49), with -CH2 at the 4′-site of the piperidine ring, possesses better biological activity than compound 9 (pIC50 = 8.13), without -CH2. Small white contours covering the 2′,3′-site of the piperidine ring at R1 of compound 29 indicate that a hydrophobic group at this site would reduce the biological activity.
In Figure 3d, the cyan and purple contours indicate regions where H-bond donor groups are favorable and unfavorable, representing 80% and 20% level contributions, respectively. A medium cyan contour at R1 indicates that compounds with an H-bond donor substituent at this site tend to have higher PI3Kα inhibitory activity. Compound 3 (pIC50 = 8.51), with -NH2, possesses slightly better biological activity than compound 2 (pIC50 = 8.40), with -H. A medium purple contour located below the piperidine ring at R1 of compound 29 indicates that H-bond donor substituents at this site would reduce biological activity.
As shown in Figure 3e, the hydrogen bond acceptor field of the CoMSIA model is represented by magenta (hydrogen bond acceptor favorable) and red (hydrogen bond acceptor unfavorable) contours, representing 80% and 20% level contributions, respectively. Small red contours around the methylsulfonyl group suggest that introducing hydrogen bond acceptor substituents in these regions would decrease biological activity. A medium magenta contour near the piperidine ring at R1 of compound 29 shows that hydrogen bond acceptor substituents at this site would improve biological activity. Compound 7 (pIC50 = 8.12), with -N(CH3)2, possesses better biological activity than compound 6 (pIC50 = 7.68), with -CH2CH3 in the red contour.
The contour analyses of CoMFA and CoMSIA described above are summarized in Figure 4, which provides guidance for the future design of new PI3Kα inhibitors.

Molecular Docking Results and Analysis
Docking studies were performed with compound 29 as the template. As shown in Figure 5, compound 29 docked in the binding cavity of 4YKN with three non-classical H-bonds. An intramolecular H-bond (-H…N-, 2.62 Å) formed between -H of the methoxy group at R2 and the -N group at the 2′-site of the core structure. Two hydrogen atoms of -NCH3 at R1 formed H-bonds with -O=C of Ser1078 (-H…O=C-, 2.60 Å; -H…O=C-, 2.65 Å), which increased the affinity between compound 29 and the protein.

Molecular Dynamics Simulation Results and Analysis
A 20 ns molecular dynamics (MD) simulation was carried out on the docking complex described above (PI3Kα with ligand 29). The main purpose of the simulation was to refine the binding pocket and the interactions between PI3Kα and compound 29. The root-mean-square deviation (RMSD) plots for the receptor (4YKN) and the ligand (compound 29) are shown in Figure 6a to explore conformational variations. After 10 ns, the RMSD of the complex was about 2.5 Å, and it retained this value for the rest of the simulation. A superposition of the most stable structure from the MD simulation and the initial docked structure is shown in Figure 6b,c. After the MD simulation, the interactions between compound 29 and the receptor were analyzed in order to explore the similarities and differences between the docking and MD results. The most stable structure extracted from the MD simulation revealed that one classical hydrogen bond and three non-classical hydrogen bonds are formed between the ligand and the receptor (shown in Figure 6a). The oxygen atom of the morpholine group at R1 interacts through classical H-bonding with -NH of Arg929 (-O…H-, 2.53 Å). The oxygen atom of methanesulfonyl at R1 interacts through intramolecular H-bonding with -H of the piperidine ring (-O…H-, 2.51 Å). The methoxy group at R2 forms a non-classical H-bond with Glu1008 (-C=O…H-, 2.76 Å). The hydrogen atom of -NCH3 at R1 interacts through a non-classical H-bond with -O=C of Ser1078 (-H…O=C-, 3.07 Å). The MD result contained more hydrogen bonds than the docking model, and a classical hydrogen bond was formed between the ligand and the receptor. This suggests that the conformation obtained after MD simulation was more stable than the docked conformation, which is consistent with MD simulation being performed in a more realistic environment that is much closer to physiological conditions.

Design and Prediction
On the basis of the above analysis, five novel structures (compounds 86-90), with -CH3/-CH2CH3/-C(CH3)3/-CH2CH(CH3)CH3 for R4 and -CF3/-CHF2/-CHFBr for R5, were designed from template compound 29 (Figure 7). The predicted activities are shown in Table 2. The 3D-QSAR models predict that compounds 86-90 might have higher activities than template compound 29, suggesting that these newly designed compounds are potential candidates as PI3Kα inhibitors. Compounds 86 and 87 were synthesized to validate the reliability of the 3D-QSAR models.

The Synthesis of Compounds 86 and 87
Two novel 2-difluoromethylbenzimidazole derivatives (86 and 87) were synthesized according to Scheme 1. Compounds 91, 92a, 94, and 96 are commercially available and were used without further purification. Compounds 86b and 87a were synthesized with reference to previous literature [31][32][33]. Compounds 93a and 93b were synthesized by Pd/C hydrogenation and subsequent dehydration cyclization with trifluoroacetic acid, with moderate yields of 59% and 60%, respectively. Then, under the action of potassium carbonate in DMF, 95a and 95b were obtained via the first SNAr reaction in yields of 59% and 52%, respectively. Finally, after the second SNAr reaction at the residual chlorine atom in 95a and 95b under slightly stronger reaction conditions, and subsequent methylation with methyl iodide, the desired compounds 86 and 87 were obtained in 59% and 51% yields, respectively.

Biological Evaluation
The IC50 values of compounds 86 and 87, PIK-90, and ZSTK474 against PI3Kα are listed in Table 3 and Figure 8. ZSTK474 and PIK-90 were used as reference drugs. The results showed that compounds 86 and 87 both have potent inhibitory activity against PI3Kα, close to that of ZSTK474.
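The measured IC50 values are reported in nM, whereas the 3D-QSAR models work on the pIC50 scale. The sketch below shows the conversion, assuming the standard definition pIC50 = -log10(IC50 in mol/L); the text does not state the definition explicitly, and the function name is hypothetical.

```python
import math

def pic50_from_nm(ic50_nm):
    """Convert an IC50 given in nM to pIC50 = -log10(IC50 in mol/L).

    Since 1 nM = 1e-9 mol/L, this is 9 - log10(IC50 in nM).
    The definition is the conventional one and is assumed here.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)

# Measured values reported for compounds 86 and 87.
for name, ic50 in (("86", 22.8), ("87", 33.6)):
    print(name, round(pic50_from_nm(ic50), 2))
```

Under this assumption, the measured values correspond to pIC50 of about 7.64 for compound 86 and about 7.47 for compound 87, which puts them on the same scale as the pIC50 values quoted in the contour-map analysis.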

Datasets and Biological Activity
For the 3D-QSAR studies, a set of 85 2-difluoromethylbenzimidazole derivatives with good biological activity against PI3Kα was collected from the literature [22][23][24]. The derivatives were divided into a training set of 65 compounds (76%), used to generate the 3D-QSAR models, and a test set of 20 compounds (24%), used to confirm the reliability of the models. The test set was selected randomly so that the data set would show high structural diversity and a wide range of activities.
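A random 65/20 partition of the kind described above can be sketched as follows. This is only an illustration: the seed, the function name, and the use of compound numbers 1-85 as identifiers are assumptions, and the sketch does not reproduce the split used in the study.

```python
import random

def split_dataset(compound_ids, n_test, seed=0):
    """Randomly hold out n_test compounds; the rest form the training set.

    A fixed seed (an arbitrary choice here) makes the split reproducible.
    """
    rng = random.Random(seed)
    test = set(rng.sample(compound_ids, n_test))
    train = [c for c in compound_ids if c not in test]
    return train, sorted(test)

compounds = list(range(1, 86))            # compounds 1-85
train, test = split_dataset(compounds, n_test=20)
print(len(train), len(test))              # sizes of the two sets
print(round(100 * len(train) / 85), round(100 * len(test) / 85))
```

A purely random draw does not by itself guarantee that the test set covers the full activity range, so in practice the resulting split would still be checked for structural diversity and activity coverage.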

The CoMFA and CoMSIA models were created using the SYBYL-X 2.0 software package. Gasteiger-Hückel [34] charges were added to all compounds, and their energies were minimized with the Tripos force field [35] using the Powell gradient algorithm. To obtain the most stable conformation, the convergence criterion was set to 0.005 kcal/mol·Å and the maximum number of iterations to 10,000. The chemical structures and activity values against PI3Kα are listed in the Supplementary Information (Table S1).

Molecular Modeling and Alignment
The molecules of the training set were aligned onto the most potent inhibitor (compound 29), which was employed as the template molecule because it had the highest PI3Kα inhibitory activity in the training set. The alignment results are shown in Figure 9. The common substructure is depicted in bold.

3D-QSAR Models Studies
The reliability and predictive ability of the 3D-QSAR models were evaluated through internal and external validation. In the partial least squares (PLS) analysis, the leave-one-out (LOO) method was used to determine the optimal number of components (ONC). In the 3D-QSAR model validation, the non-cross-validated correlation coefficient (r²ncv), standard error of estimate (SEE), and F-statistic value (F) were obtained. The predicted pIC50 values from the CoMFA and CoMSIA models are listed in the Supplementary Information (Table S2).
Five fields (steric, electrostatic, donor, acceptor, and hydrophobic) were considered during PLS analysis. The relationship between the actual and predicted pIC50 values of the training and test set molecules, which is approximately linear, is illustrated in Figure 10.
Before this validation, an initial inspection of the predicted activities revealed poor predictions for compounds 14 and 48, which were considered outliers in our work for both the ligand-based and receptor-based models. The steric hindrance of the morpholine ring of compound 14 was increased by two methylene bridges. The structure was different from that of other compounds, which was inconsistent with the model, and it led to abnormal activity values. The terminal methyl group of compound 48 was surrounded by a blue area in the electrostatic map, which was inconsistent with the model, and it resulted in abnormal activity values.

Molecular Docking
Molecular docking studies were carried out using the Surflex-Dock module in SYBYL-X 2.0 to analyze the detailed binding mode between the small molecules and PI3Kα and to validate the 3D-QSAR models. The 3D structure of PI3Kα (PDB code: 4YKN) was downloaded from the RCSB Protein Data Bank. Before docking, the co-crystallized ligands and water molecules were removed from the protein to expose the binding pocket.

Molecular Dynamics Simulation
A 20 ns molecular dynamics simulation was carried out on the docking complex to assess the reliability of the docking results and to further explore the interactions between compound 29 and PI3Kα. Amber 12.0 software [36] was used for the molecular dynamics studies. The general AMBER force field (GAFF) and the Amber ff12SB force field were applied to the ligand and the protein, respectively. Na+ ions were added to neutralize the system within a cubic water box. The simulation followed the sequence of minimization, heating, density equilibration, and production. The steepest descent and conjugate gradient methods were used to minimize the energy of the whole system. The minimized system was heated gradually to 300 K over 20 ps under the NVT ensemble. Then, a 20 ns unrestrained MD simulation was performed in the NPT ensemble at 1 atm and 300 K. All bonds involving hydrogen atoms were constrained with the SHAKE algorithm [37]. After the MD simulation, the root-mean-square deviation (RMSD) was calculated to evaluate the stability of the complex system.
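The RMSD used to judge stability is the root of the mean squared displacement of matched atoms between a trajectory frame and a reference structure. The following is a minimal sketch of that formula, assuming the two coordinate sets are already superimposed and list the same atoms in the same order; the function name and the toy coordinates are hypothetical, and in practice the quantity would be computed with the trajectory analysis tools of the MD package.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (same length unit as the input) between two matched coordinate sets.

    Each argument is a sequence of (x, y, z) tuples. No superposition is
    performed here; the structures are assumed to be pre-aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Toy example: every atom displaced by the vector (3, 4, 0) Å.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame = [(x + 3.0, y + 4.0, z) for x, y, z in reference]
print(rmsd(reference, frame))
```

A plateau in this value over time, such as the roughly 2.5 Å level reported after 10 ns, is the usual indication that the complex has reached a stable conformation.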

Enzyme Assays
These assays were carried out as described previously [38]. All of the enzymatic reactions were conducted at 30 °C for 40 min. The 50 µL reaction mixture contained 40 mM Tris (pH 7.4), 10 mM MgCl2, 0.1 mg/mL BSA, 1 mM DTT, 10 µM ATP, kinase, and the enzyme substrate. The compounds were diluted in 10% DMSO, and 5 µL of the dilution was added to each 50 µL reaction so that the final concentration of DMSO was 1% in all reactions. The assay was performed using the Kinase-Glo Plus luminescence kinase assay kit, which measures kinase activity by quantitating the amount of ATP remaining in solution following a kinase reaction. The luminescent signal from the assay is correlated with the amount of ATP present and is inversely correlated with the amount of kinase activity. The IC50 values were calculated using nonlinear regression with a normalized dose-response fit using Scikit-learn 1.0.2 [39].
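A normalized dose-response fit models the remaining kinase activity as 100% without inhibitor and 0% at full inhibition, with the IC50 as the concentration giving 50%. The sketch below illustrates the idea with a Hill-type curve and a coarse grid-search least-squares fit. It is an assumption-laden illustration, not the fitting procedure of the software used in the study: the function names, the grid ranges, and the synthetic data are all hypothetical.

```python
def percent_activity(conc, ic50, hill=1.0):
    """Normalized response: 100% with no inhibitor, 50% at conc == ic50."""
    return 100.0 / (1.0 + (conc / ic50) ** hill)

def fit_ic50(concs, responses):
    """Crude least-squares fit by grid search over log10(IC50) and Hill slope.

    The grid (IC50 from 0.1 to 10,000 in the units of concs, slope from
    0.5 to 2.0) is an arbitrary choice for illustration.
    """
    best = None
    for i in range(-100, 401):              # log10(IC50) = -1.00 ... 4.00
        ic50 = 10.0 ** (i / 100.0)
        for k in range(31):                 # Hill slope = 0.50 ... 2.00
            hill = 0.5 + 0.05 * k
            sse = sum(
                (r - percent_activity(c, ic50, hill)) ** 2
                for c, r in zip(concs, responses)
            )
            if best is None or sse < best[0]:
                best = (sse, ic50, hill)
    return best[1], best[2]

# Synthetic, noise-free data generated from an IC50 of 22.8 nM (illustration only).
concs_nm = [1, 3, 10, 30, 100, 300, 1000, 3000]
responses = [percent_activity(c, 22.8) for c in concs_nm]
print(fit_ic50(concs_nm, responses))
```

A dedicated nonlinear least-squares routine would replace the grid search in real use; the grid is chosen here only because it keeps the example free of external dependencies.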

Conclusions
In this work, the binding models between 2-difluoromethylbenzimidazole derivatives and PI3Kα were studied by a combination of 3D-QSAR, molecular docking, and molecular dynamics simulations. The contour maps explained the relationship between chemical structures and bioactivities. The molecular docking and molecular dynamics results implied that relevant hydrogen bonds are important for ligand-receptor binding. The results from the 3D-QSAR models, molecular docking, and MD simulation illustrated the chemical structure characteristics of 2-difluoromethylbenzimidazole derivatives. Ultimately, two designed compounds (86 and 87) were selected to be synthesized and biologically evaluated. Compound 86 (IC50 = 22.8 nM) and compound 87 (IC50 = 33.6 nM) show potent inhibitory activity against PI3Kα. This study provides guidance for the further development of potent PI3Kα inhibitors with improved biological activity.
Supplementary Materials: The following supporting information can be downloaded online. Table S1: Structures and activity values of every molecule used in the study; Table S2: The actual and predicted pIC50 values of all compounds; Figures S1-S22: 1H-NMR, 19F-NMR, 13C-NMR, and HRMS spectra of the new compounds.