Predicting Accurate Lead Structures for Screening Molecular Libraries: A Quantum Crystallographic Approach

Optimization of lead structures is crucial for drug discovery. However, the accuracy of such a prediction using the traditional molecular docking approach remains a major concern. Our study demonstrates that the employment of quantum crystallographic approach-counterpoise corrected kernel energy method (KEM-CP) can improve the accuracy by and large. We select human aldose reductase at 0.66 Å, cyclin dependent kinase 2 at 2.0 Å and estrogen receptor β at 2.7 Å resolutions with active site environment ranging from highly hydrophilic to moderate to highly hydrophobic and several of their known ligands. Overall, the use of KEM-CP alongside the GoldScore resulted superior prediction than the GoldScore alone. Unlike GoldScore, the KEM-CP approach is neither environment-specific nor structural resolution dependent, which highlights its versatility. Further, the ranking of the ligands based on the KEM-CP results correlated well with that of the experimental IC50 values. This computationally inexpensive yet simple approach is expected to ease the process of virtual screening of potent ligands, and it would advance the drug discovery research.


Introduction
Lead optimization is an essential part of drug discovery, where a weakly potent substrate/lead structure, identified by virtual or high throughput screening, is developed by improving ligand specificity, potency, and pharmacokinetic properties. One of the efficient ways of accelerating the lead optimization process is to predict the ligand binding affinity and/or functional potency, as it cuts down the labour and reduces the cost. Various methods have been developed and reviewed for calculating ligand binding affinity [1,2]. Methods such as molecular dynamic simulations, free energy perturbation, Monte Carlo simulations and thermodynamic integration can calculate binding free energies that are comparable to the experimentally determined values [3][4][5]. Molecular Mechanics/Poisson Boltzmann Surface Area (MM/PBSA) calculations compute binding free energies between the bound and the unbound states for the binding complexes using a combination of MM and continuum solvation [6]. A relatively similar approach MM/Generalized Born Surface Area (GBSA) has been used to study protein-ligand interactions and is applied to diverse targets [7,8]. Although free energy calculations using the aforementioned methods have produced promising results to some extent [9], these approaches are computationally expensive and often becomes tedious for the quick evaluation of binding affinities.
Currently, the field of computer-aided drug-design (CADD) is dominated by molecular docking approach, for which scoring functions are used to identify and rank possible binding poses of a ligand in a binding pocket. As per the records in Swiss Institute of Bioinformatics, there are 57 tools and 20 web services available for molecular docking (Click2Drug: https://www.click2drug.org, accessed on 28 April 2021). The classic scoring functions are broadly divided into three classes-MM force field, empirical scoring and knowledge-based scoring functions. The knowledge-based scoring function incorporates binding modes present in the training dataset but is also accompanied with a requirement of a considerable database having high-quality empirical structures for training [10,11]. On the other hand, machine learning approach deduces the functional form directly from the data. The functional form for the relationship between the structural features and binding affinity of the protein-ligand complex is not predetermined [12]. There are two types of models generally adopted in protein-ligand docking, namely, "Lock and Key" and "Induced-fit". For docking, two well-known programs, AutoDock [13] and GOLD [14]. have been used widely because of their easy accessibility [15]. Recently, the program Rosetta [16] has also become popular. Figure 1 is plotted based on the data retrieved from PubMed, and depicts the steadily increasing trend of publication on docking studies during the past two decades. functions are broadly divided into three classes-MM force field, empirical scoring and knowledge-based scoring functions. The knowledge-based scoring function incorporates binding modes present in the training dataset but is also accompanied with a requirement of a considerable database having high-quality empirical structures for training [10,11]. On the other hand, machine learning approach deduces the functional form directly from the data. The functional form for the relationship between the structural features and binding affinity of the protein-ligand complex is not predetermined [12]. There are two types of models generally adopted in protein-ligand docking, namely, "Lock and Key" and "Induced-fit". For docking, two well-known programs, AutoDock [13] and GOLD [14]. have been used widely because of their easy accessibility [15]. Recently, the program Rosetta [16] has also become popular. Figure 1 is plotted based on the data retrieved from PubMed, and depicts the steadily increasing trend of publication on docking studies during the past two decades. Research in medicinal chemistry is heavily dependent on docking tools and scoring functions, but it has been observed that these scoring functions can result in accuracies anywhere between 0-92.66% and thus their reliability remains a major concern [15]. However, the introduction of experimentally derived crystal structure geometries has proven to improve the success rate to as high as 99% [17]. For induced-fit modelling, the flexibility of protein affects both the scoring and ranking of the best poses. This arises from the additional burden of accurately analysing protein conformational free energy changes apart from ligand binding free energies [18]. Thus, the performances of docking and scoring functions are assessed based on two quantities. First, the reproduction of binding poses of the ligands to that present in the protein complex crystal structures, in which docking is considered accurate only if the heavy atom root mean square displacement (RMSD) is ≤2.0 Å from the localization of crystalized ligand for the top ranked poses [19]. Second, Research in medicinal chemistry is heavily dependent on docking tools and scoring functions, but it has been observed that these scoring functions can result in accuracies anywhere between 0-92.66% and thus their reliability remains a major concern [15]. However, the introduction of experimentally derived crystal structure geometries has proven to improve the success rate to as high as 99% [17]. For induced-fit modelling, the flexibility of protein affects both the scoring and ranking of the best poses. This arises from the additional burden of accurately analysing protein conformational free energy changes apart from ligand binding free energies [18]. Thus, the performances of docking and scoring functions are assessed based on two quantities. First, the reproduction of binding poses of the ligands to that present in the protein complex crystal structures, in which docking is considered accurate only if the heavy atom root mean square displacement (RMSD) is ≤2.0 Å from the localization of crystalized ligand for the top ranked poses [19]. Second, the enrichment factor (EF), which validates docking and scoring algorithms by examining them after screening [19]. For a given percentile limit, higher the EF value better the scoring function. EF studies require large dataset like A Database of Useful Decoys: Enhanced (DUD-E) [20], Maximum Unbiased Validation (MUV) [21] and Comparative Assessment of Scoring Function (CASF) [22], which contain actives and decoys for a diverse set of proteins. In general, the docking and scoring functions are assessed using these two parameters and seldom based on the binding affinity. Additionally, the employment of deep-learning approach [23,24], the graph matching method [25], followed by the traditional docking process have shown to improve the accuracy of protein-ligand binding mode. However, this approach could be target specific [26]. A comparative study by Wang et al. evaluated 11 scoring functions with 100 protein-ligand complexes by accessing their ability to reproduce the binding conformations and affinities [27]. Out of those 11 scoring functions only four resulted a ranking correlation of 50-70% for predicting the binding affinities for the complexes with RMSD criterion of ≤2 Å [27]. Another recent study using a large set of scoring functions suggests that the choice of scoring functions highly depends on the environment of the active site of a target [28]. The use of CASF demonstrated that the current docking tools have promising "docking power" in comparison to "scoring power", "ranking power" and "screening power" [22].
Recently, the quantum crystallographic (QCr) approach-kernel energy method (KEM) [29]-has been successfully employed for estimating the protein-ligand interaction energies in a simplified, accurate and yet faster way in comparison to the other similar fragment-based approaches [30]. Calculating the ab-initio density matrices using any chemical model present within the quantum chemistry and using crystallographic coordinates is the forte of KEM [31]. In proteins or their complexes, the fragments (a.k.a. kernels) can be as small as one amino acid or a ligand molecule. Thus, the desired energy is estimated at a considerably reduced computational cost. Since its inception, KEM has been applied to a large variety of systems like peptides [29], DNA [32], RNA [33], and proteins [34,35]. Huang et al. have also calculated the interaction energies of aminoglycoside drugs with the target ribosomal A site of RNA as well as the hydrogen bonding interactions within the double stranded RNA [36]. Based on their numerous studies, Massa et al. have demonstrated that the accuracy of energies obtained using KEM is independent of the basis functions and the MP2 method [37,38] provides the best results in comparison to HF and DFT.
Alongside KEM, counterpoise (CP)-corrected energy calculation [39], which accounts for the basis set superposition errors (BSSE) [40], is essential for the accurate estimation of interaction energy (IE) of a hydrogen-bonded complex [41,42]. IE thus calculated provides both CP-corrected and raw (uncorrected) energy values and their average value provide a good estimation of energy for a hydrogen-bonded complex system [42]. In our recent study on exploration of potent ligands for proteins based on the IEs for protein-ligand complexes containing both polar and non-polar interactions, the use of CP corrected KEM (hereafter referred to as KEM-CP) provided accurate results [43].
The aforementioned studies clearly indicate that the requirement of a lead structure is imperative for drug discovery and even for molecular docking when predicting the potential of a ligand towards the formation of its complex with a specific target. Therefore, in this study, we employ the KEM-CP approach [43] for predicting the lead structures based on their IE. For this, we consider three protein complexes with resolutions ranging from ultrahigh to standard to low and binding pocket environment ranging from highly hydrophilic to moderately hydrophobic to highly hydrophobic in nature ( Table 1). The protein complex structures harvested from RCSB PDB, are (a) 2-(4-bromo-2-fluorobenzylthiocarbamoyl)-5-fluorophenoxyacetic acid (IDD594) bound human aldose reductase (hAR-IDD594), (b) O6-cyclohexylmethoxy-2-(4 -sulphamoylanilino) purine (NU6102) bound cyclin dependent kinase 2 (CDK2-NU6102, PDB ID-1H1S) and (c) 1-chloro-6-(4-hydroxyphenyl)-2-napthol (4NA) bound estrogen receptor β (ERβ -4NA, PDB ID-1YY4). Subsequently, we compare the scoring functions GoldScore [14], ChemScore [44,45] and ChemPLP [46], as implemented in GOLD [14] and utilize the superior scoring function GoldScore for docking some of the known ligands (Schemes S1-S3, Supplementary Materials) with structures similar to their ligands present in the respective complex structures. Thereby, for all the ligands, irrespective of their fitting scores, we select various types of poses predicted via docking and estimate their IEs using KEM-CP. Finally, based on their fitting scores (from GoldScore) and binding IEs (from KEM-CP) we predict the lead structure(s). We also take into account the heavy atom RMSD of the poses with respect to the crystal structure of the ligand present in the complex for predicting the accurate lead structures. Moreover, we compare the ranking of the ligands based on both the fitness scores and the IEs with the ranking based on the reported experimental IC 50 values, except for hAR-IDD594 complex. Thus, we demonstrate the versatility of KEM-CP and its accuracy over the well-known scoring tool GoldScore.

Case of hAR-IDD594
In this case, for the five ligands, GoldScore predicts two major types of orientation of poses (namely Type 1 and Type 2) as the best poses ( Figure S1 and Table S1, Supplementary Materials). For all the five ligands for both types of poses the IEs are calculated for the best poses (highest fitting score, Table 2) and the RMSD of the poses are compared with the crystal geometry of IDD594 (Table 3) Table 2. List of ligands of hAR with their average IEs and GoldScore fitness scores for the two types of poses. The most negative IEs and the best fitness scores from GoldScore are highlighted using (values in bold). The reported experimental IC 50 [7] values are also listed.  The results depicted in Figure 2 suggest that KEM-CP approach predic poses for all the five ligands of hAR with RMSDs of <1.2 Å . Whereas GoldS predict correct poses only for the ligands 10, 16 and 19 with RMSDs of <0.3 Å.

Case Study of CDK2-NU6102
In this case, the majority of the 30 poses generated for each of the seven lig GoldScore belong to three types of poses as shown in Figure S2, Supplementary The populations of each type and the overall populations of all three types (7 for ligand 33 with poor IC50 value) for each ligand are listed in Table S7, Supp Materials. The average IEs along with the IC50 values [48], fitting scores and sponding ranks are listed in Table 4. While for ligand 28, none of the poses Type 2, for ligand 29, only one pose belongs to Type 2. The IE calculation for pose of ligand 34 was failed even using lower basis sets, possibly due to the un geometry of this particular pose. For all the seven ligands, GoldScore predicts T as the best pose. However, to avoid conformational biasness, we calculate th based IEs for all three types of poses. Interestingly, KEM-CP also predicts Typ  The results depicted in Figure 2 suggest that KEM-CP approach predicts the best poses for all the five ligands of hAR with RMSDs of <1.2 Å . Whereas GoldScore could predict correct poses only for the ligands 10, 16 and 19 with RMSDs of <0.3 Å.

Case Study of CDK2-NU6102
In this case, the majority of the 30 poses generated for each of the seven ligands using GoldScore belong to three types of poses as shown in Figure S2, Supplementary Materials. The populations of each type and the overall populations of all three types (70%, except for ligand 33 with poor IC50 value) for each ligand are listed in Table S7, Supplementary Materials. The average IEs along with the IC50 values [48], fitting scores and the corresponding ranks are listed in Table 4. While for ligand 28, none of the poses belongs to Type 2, for ligand 29, only one pose belongs to Type 2. The IE calculation for the Type 3 pose of ligand 34 was failed even using lower basis sets, possibly due to the unfavourable geometry of this particular pose. For all the seven ligands, GoldScore predicts Type 1 pose as the best pose. However, to avoid conformational biasness, we calculate the KEM-CP based IEs for all three types of poses. Interestingly, KEM-CP also predicts Type 1 as the 10 − The results depicted in Figure 2 suggest that KEM-CP approach predicts the best poses for all the five ligands of hAR with RMSDs of <1.2 Å. Whereas GoldScore could predict correct poses only for the ligands 10, 16 and 19 with RMSDs of <0.3 Å.  The results depicted in Figure 2 suggest that KEM-CP approach predicts the best poses for all the five ligands of hAR with RMSDs of <1.2 Å . Whereas GoldScore could predict correct poses only for the ligands 10, 16 and 19 with RMSDs of <0.3 Å.

Case Study of CDK2-NU6102
In this case, the majority of the 30 poses generated for each of the seven ligands using GoldScore belong to three types of poses as shown in Figure S2, Supplementary Materials. The populations of each type and the overall populations of all three types (70%, except for ligand 33 with poor IC50 value) for each ligand are listed in Table S7, Supplementary Materials. The average IEs along with the IC50 values [48], fitting scores and the corresponding ranks are listed in Table 4. While for ligand 28, none of the poses belongs to Type 2, for ligand 29, only one pose belongs to Type 2. The IE calculation for the Type 3 pose of ligand 34 was failed even using lower basis sets, possibly due to the unfavourable geometry of this particular pose. For all the seven ligands, GoldScore predicts Type 1 pose as the best pose. However, to avoid conformational biasness, we calculate the KEM-CP based IEs for all three types of poses. Interestingly, KEM-CP also predicts Type 1 as the

Case Study of CDK2-NU6102
In this case, the majority of the 30 poses generated for each of the seven ligands using GoldScore belong to three types of poses as shown in Figure S2, Supplementary Materials. The populations of each type and the overall populations of all three types (≥70%, except for ligand 33 with poor IC 50 value) for each ligand are listed in Table S7, Supplementary Materials. The average IEs along with the IC 50 values [48], fitting scores and the corresponding ranks are listed in Table 4. While for ligand 28, none of the poses belongs to Type 2, for ligand 29, only one pose belongs to Type 2. The IE calculation for the Type 3 pose of ligand 34 was failed even using lower basis sets, possibly due to the unfavourable geometry of this particular pose. For all the seven ligands, GoldScore predicts Type 1 pose as the best pose. However, to avoid conformational biasness, we calculate the KEM-CP based IEs for all three types of poses. Interestingly, KEM-CP also predicts Type 1 as the best pose, except for ligand 30, for which Type 2 is predicted as the best pose. Further, the comparison of RMSDs of these three types of poses of the ligands with that of the crystal geometry suggests that for all the seven ligands the Type 1 is the most favourable pose, which has least RMSD of <1 Å (Table 5, Figure 3). Moreover, among the three types of poses, the ligands with Type 1 pose have average IEs closest to that of the crystal geometry of NU6102 ligand (−444.7 kCal·mol −1 , Table S8, Supplementary Materials). Whereas, both Type 2 and Type 3 poses have RMSDs of >2.7 Å and >3.3 Å, respectively and the IEs of most of these ligands differ by a large to the IE of the NU6102 ligand. The pairwise IEs for all the docked poses including the crystal geometry are listed in Tables S8-S14, Supplementary Materials.  Further, given a similar prediction by both KEM-CP and GoldScore for a good num ber of ligands, in this case, we rank the ligands based on the result from these two ap proaches and compared to the ranking based on the experimental IC50 values (Table S15 Supplementary Materials). Interestingly, both the KEM-CP-and the GoldScore-based     Further, given a similar prediction by both KEM-CP and GoldScore for a good number of ligands, in this case, we rank the ligands based on the result from these two approaches and compared to the ranking based on the experimental IC 50 values (Table S15, Supplementary Materials). Interestingly, both the KEM-CP-and the GoldScore-based rankings for the best poses correlate very well (64%) with the rankings based on the experimental IC 50 values (Table S15, Supplementary Materials and Figure 4). While focusing only on the pose Type 1 (Table S16 and Figure S3, Supplementary Materials), which is grossly predicted as the best pose by KEM-CP and GoldScore, the correlation between the IE and the IC 50 value has further improved to 71%. Such an excellent correlation of the predicted ranking to the experimental ranking of the ligands of CDK2 suggests that both the approaches, KEM-CP and the GoldScore, perform similarly for proteins with hydrophilic active site.

Case of ERβ-4NA
In this case, the 30 poses generated for each of the 10 ligands using GoldScore are distributed into four types of poses ( Figure S4, Supplementary Materials). The distribution of populations of the four types of poses for each ligand are listed in Table S17 (Supplementary Materials). The average IEs along with the IC 50 values [49], fitting score and the corresponding ranks are listed in Table 6. For ERβ, the GoldScore predicts Type 1 pose as the best pose for majority of the ligands (six: 15, 40, 44, 57, 62 and 68), Type 4 as the best pose for three ligands (25, 27 and 29), Type 3 as the best pose for ligand 70 only and none from Type 2. Interestingly, KEM-CP also predicts Type 1 pose as the best pose for six ligands (25,29,40,57, 62, and 68), Type 3 as the best pose for three ligands (15,44,70), Type 2 as the best pose for ligand 27 only and none from Type 4. Further, the comparison of RMSDs (Table 7) of these four types of poses of the ligands with the crystal geometry of 4NA suggests that Type 1 (RMSD < 0.4 Å) and Type 3 (RMSD < 1.3 Å) are the correctly predicted poses for these ligands. Moreover, among the four types of poses, the ligands with Type 1 and Type 3 poses have average IEs closest to that of the crystal geometry of 4NA ligand (−53.5 kCal·mol −1 , Table S18, Supplementary Materials). Both Type 2 and Type 4 poses are found to be the incorrect predictions as their RMSDs are >3 Å and the IEs for most of these ligands differ by a large to the IE of the 4NA. Accordingly, GoldScore predicts the correct pose only for seven ligands (six of Type 1 and one of Type 3) whereas KEM-CP predicts the correct pose for as many as nine out of the ten ligands (six of Type 1 and three of Type 3) as shown in Figure 5. However, both GoldScore and KEM-CP predict Type 1 as the best pose for the ligands 40, 57, 62 and 68 and Type 3 as the best pose for the ligand 70. The pairwise IEs for all the docked poses are provided in Tables S18-S27, Supplementary Materials. It is noteworthy that although both Type 1 and Type 3 poses are considered favourable, with respect to the Type 1 pose the two fused rings of the ligands in Type 3 pose is rotated by 180 • along the single bond present between the two aromatic groups.

Case of ERβ-4NA
In this case, the 30 poses generated for each of the 10 ligands using GoldScore are distributed into four types of poses ( Figure S4, Supplementary Materials). The distribution of populations of the four types of poses for each ligand are listed in Table S17 (Supplementary Materials). The average IEs along with the IC50 values [49], fitting score and the corresponding ranks are listed in Table 6. For ERβ, the GoldScore predicts Type 1 pose as the best pose for majority of the ligands (six: 15, 40, 44, 57, 62 and 68), Type 4 as the best pose for three ligands (25, 27 and 29), Type 3 as the best pose for ligand 70 only and none from Type 2. Interestingly, KEM-CP also predicts Type 1 pose as the best pose for six ligands (25, 29, 40, 57, 62, and 68), Type 3 as the best pose for three ligands (15, 44, 70), Type 2 as the best pose for ligand 27 only and none from Type 4. Further, the comparison                Tables 6 and 7.
Further, in this case also, the best pose ligands are ranked based on the average IEs and the fitting scores are compared with the ranking based on their experimental IC 50 values. For these ten ligands, the correlations of the rankings based on the average IE and the GoldScore to that of the IC 50 values resulted only 31% and 14%, respectively (Table S28 and Figure S5, Supplementary Materials). However, the ranking based on the average IE and IC 50 is correlated to the least for the ligands 25 and 68, excluding which the overall correlation for eight ligands has improved to as high as 88%. Likewise, for GoldScore, excluding the ligands 15 and 44 the correlation with IC 50 has improved to 71% (Table S28, Supplementary Materials and Figure 6). While focusing only on Type 1 pose, the corresponding correlations are 55% and 67%, respectively (Table S29 and Figure S6, Supplementary Materials), whereas, for pose Type 3, the correlations are 48% and 41%, respectively (Table S30 and Figure S7, Supplementary Materials).
Further, in this case also, the best pose ligands are ranked based on the average I and the fitting scores are compared with the ranking based on their experimental IC50 v ues. For these ten ligands, the correlations of the rankings based on the average IE and t GoldScore to that of the IC50 values resulted only 31% and 14%, respectively (Table S and Figure S5, Supplementary Materials). However, the ranking based on the average and IC50 is correlated to the least for the ligands 25 and 68, excluding which the over correlation for eight ligands has improved to as high as 88%. Likewise, for GoldSco excluding the ligands 15 and 44 the correlation with IC50 has improved to 71% (Table S Supplementary Materials and Figure 6). While focusing only on Type 1 pose, the cor sponding correlations are 55% and 67%, respectively (Table S29 and Figure S6, Supp mentary Materials), whereas, for pose Type 3, the correlations are 48% and 41%, resp tively (Table S30 and Figure S7, Supplementary Materials).  from Table S28).

Discussion
Initially, by docking the ligand IDD594 to the ultra-high resolution hAR crystal stru ture and without supplying any lead structure both qualitatively and quantitatively w confirm that the scoring function GoldScore is indeed superior to the ChemScore a ChemPLP.
In the case of hAR-IDD594, KEM-CP approach could predict the correct pose of t ligands solely based on their IEs as the differences between the IEs of the two types poses are significantly high (Table 3). Additionally, the corresponding RMSDs correlat extremely well with their IEs, poses with least IE resulted minimum RMSD. However, the cases of CDK2-NU6102 and ERβ-4NA, the KEM-CP based IEs of the various pos resulted similar values. Therefore, in these two cases, for predicting the correct types poses, the RMSDs of the best poses predicted by the KEM-CP approach and the GoldSco are compared with the crystal geometries of the corresponding ligands.
Overall, GoldScore predicted correct poses for only three out of five (60%) ligands hAR, which has moderately hydrophobic active site and seven out of ten (70%) ligan for ERβ with highly hydrophobic active site ( Table 8). As expected, for CDK2 with a h drophilic active site, GoldScore predicts correct poses for all the seven ligands (100% Interestingly, KEM-CP approach did not show any such environment specificity and could predict the correct poses for all the five (100%) ligands of hAR, for nine out of t ten (90%) ligands of ERβ and for six out of the seven (86%) ligands of CDK2 (Table 8).  from Table S28).

Discussion
Initially, by docking the ligand IDD594 to the ultra-high resolution hAR crystal structure and without supplying any lead structure both qualitatively and quantitatively we confirm that the scoring function GoldScore is indeed superior to the ChemScore and ChemPLP.
In the case of hAR-IDD594, KEM-CP approach could predict the correct pose of the ligands solely based on their IEs as the differences between the IEs of the two types of poses are significantly high (Table 3). Additionally, the corresponding RMSDs correlated extremely well with their IEs, poses with least IE resulted minimum RMSD. However, in the cases of CDK2-NU6102 and ERβ-4NA, the KEM-CP based IEs of the various poses resulted similar values. Therefore, in these two cases, for predicting the correct types of poses, the RMSDs of the best poses predicted by the KEM-CP approach and the GoldScore are compared with the crystal geometries of the corresponding ligands.
Overall, GoldScore predicted correct poses for only three out of five (60%) ligands of hAR, which has moderately hydrophobic active site and seven out of ten (70%) ligands for ERβ with highly hydrophobic active site ( Table 8). As expected, for CDK2 with a hydrophilic active site, GoldScore predicts correct poses for all the seven ligands (100%). Interestingly, KEM-CP approach did not show any such environment specificity and it could predict the correct poses for all the five (100%) ligands of hAR, for nine out of the ten (90%) ligands of ERβ and for six out of the seven (86%) ligands of CDK2 (Table 8). For both ERβ and CDK2, the ranking comparison emphasizes that the prediction of the lead structures by the KEM-CP approach is as good as the experimental IC 50 values, whereas the GoldScore approach could predict well only for the CDK2. Such an excellent correlation of the predicted ranking to the experimental ranking of the ligands of CDK2 suggests that both the approaches, KEM-CP and the GoldScore, perform similarly for proteins with hydrophilic active site. For hAR, such a comparison did not result a meaningful correlation, possibly due to the small number of ligands and inadequate experimental measurements (without error analysis).
Overall, a better correlation of ranking is observed for the CDK2 in comparison to the ERβ. This is because the resolution of the crystal structure of CDK2 is better (2.0 Å) than that of ERβ (2.7 Å). Moreover, CDK2 has hydrophilic environment, for which GoldScore performs better.

Selection of Complex Structures and Preparation of the Targets and Their Ligands
The rationale behind the selection of the three protein complexes with distinct active site environments is summarized below: (1) hAR-IDD594: The ultra-high resolution (0.66 Å) complex structure was previously considered by some of us for studying protein-ligand interactions and for benchmarking KEM-CP approach against MOE scoring function [43]. Here, once again, we consider this complex structure with moderately hydrophobic active site environment for benchmarking KEM-CP approach against GoldScore. (2) CDK2-NU6102: This standard resolution (2.0 Å) complex structure has a hydrophilic environment in its active site and previous report [28] suggests that GoldScore provides better results for such systems. Therefore, we select this system to check the superiority of KEM-CP over GoldScore. (3) ERβ-4NA: This complex structure consists of a hydrophobic (or lyophilic) active site and reported at a low resolution of 2.7 Å. As reported earlier, GoldScore fails to rank potent ligand accurately for proteins with hydrophobic environments [28] ( Table 1) and hence IE study on such a system provides us an opportunity to test the potentiality of KEM-CP approach.
Such selections allowed us to investigate distinct protein active site environments ranging from highly hydrophilic (CDK2) to moderately hydrophobic (hAR) to highly hydrophobic (ERβ) in nature. Moreover, these target systems provide an opportunity to investigate structures of varying resolution ranging between 0.66 Å to 2.7 Å ( Table 1).
The retrieved protein structures from RCSB PDB were prepared for docking by selecting the major conformer (wherever multiple conformers were present) using pdbset module of CCP4 [50], which was also used to remove the solvents including water molecules. The atomic coordinates were then fed into GOLD. Subsequently, the metal ions (none was at the active site) were removed and the H-atoms, protonation state and tautomerization were assigned. The hydrophobicity of the protein active site was calculated using SiteMap (SiteMap, Schrödinger, LLC, New York, NY, USA, 2020) [47] and the corresponding values are listed in Table 1.

Scoring Function and Docking Studies
The accuracy of each of the scoring functions, GoldScore [14], ChemScore [44,45] and ChemPLP [46], is tested by docking the ligand IDD594 to the active site of the ultra-high resolution hAR-IDD594 complex structure (see Section S1.1, Supplementary Materials). Thereby, the poses with rank 1 of the IDD594 ligand as provided by the three scoring functions are compared among themselves ( Figure S8, Supplementary Materials) as well as with the crystal geometry of IDD594 ( Figure S9, Supplementary Materials). The qualitative comparison suggests that the pose predicted by GoldScore is closest to its crystal geometry. Further, to bring the quantitative comparison, in each case, we calculate the IE using KEM-CP approach as discussed in the following section. The corresponding pairwise IEs are listed in Tables S31-S34, Supplementary Materials. Thus, the total IEs of hAR with the best poses of IDD594 as predicted by the three scoring functions and with the crystal geometry of IDD594 along with the RMSDs of the poses with respect to the crystal geometry are compared (Table S35, Supplementary Materials). The comparisons suggest that the pose predicted by the GoldScore has the least RMSD and minimum IE (Table S35, Supplementary Materials). Therefore, we conclude that GoldScore is superior among these three scoring functions, which was also pointed out by Xu et al. [28]. Subsequently, the ligands (Schemes S1-S3, Supplementary Materials) selected in this study were docked into the target protein structures using GoldScore. The binding pocket was defined by selecting a sphere of radius 5 Å with one of the residues located deep inside the active site as the centre. The central residues (Trp111 for hAR, Leu354 for ERβ and Phe80 for CDK2) are highlighted in blue in Figure S10, Supplementary Materials. The docking was performed based on the "Lock and Key" model without supplying any lead structure as input and the ligands were allowed to be highly flexible. Thereby, 30 poses were generated for each ligand. The fitness scores thus obtained from the docking are unit less and higher the fitness score better the binding affinity of the ligand to the target. The 30 poses are then grouped into certain types of orientations with a significant number of poses within a given orientation type. Subsequently, using KEM-CP, the IEs are calculated for the best ranked posed from each type of orientations.

KEM-CP Interaction Energy Calculation and Kernel Selection
The first step of KEM is to fragment the macromolecule into small pieces call kernels such that every atom of the macromolecule belongs to one and only one kernel ( Figure S11, Supplementary Materials) [29]. Subsequently, the kernels are capped at the point of fragmentation using H-atoms to preserve the valency. The quantum calculations are performed on the single kernels and double kernels (pairs) and their contributions are then summed up to get the total molecular energy as follows (Equation (1)): where n is the number of single kernels, m, i, j are the running numbers, E ij is the energy of a double kernel and E i is the energy of a single kernel. The IE between any two kernels, I ij is defined as the difference of energy between their double kernel (E ij ) and the constituent single kernels (E i and E j ), represented as (Equation (2)): In general, for protein-ligand IE calculation, the ligand is chosen as the first kernel, and each of the residues/solvent present as the immediate neighbour of the ligand are selected as the second kernel. Thus, forming the single and double kernels (pair of kernels), the protein-ligand interaction energy, I protein−ligand is estimated by the summation of all such kernel pairs as follows (Equation (3)): The calculations on the double kernels (dimers) are performed using 'counterpoise' keyword in Gaussian09 [51]. after assigning the fragment number to each monomer along with their charge and multiplicity information.
The KEM-CP calculations are performed on multiple types of orientation of poses of every ligand of hAR (2 types per ligand), CDK2 (3 types per ligand) and ERβ (4 types per ligand) using MP2/6-311G(d,p) level of theory. All the residues identified at the active site, irrespective of their contact with the ligand, are considered for the IE calculations. Thereby, the variation in the protein-ligand interactions due to the conformational changes of the ligands at the active site are monitored. In any case, in the absence of a contact, the KEM-CP correctly predicts a negligible IE.

Conclusions
Here, we employ the quantum crystallographic approach, KEM-CP, for predicting accurate lead ligand structures for proteins hAR, CDK2 and ERβ. Our study demonstrates an important application of KEM-CP for drug discovery. We select protein systems with active site environments ranging from highly hydrophilic to moderate to highly hydrophobic and with structural resolutions ranging from ultra-high (0.66 Å, hAR-IDD594) to standard (2 Å, CDK2-NU6102) to low (2.7 Å, ERβ-4NA) for exploring the versatility of KEM-CP approach. Our comparative study based on the hAR-IDD594 complex structure confirms that GoldScore is indeed superior to the other two scoring functions as implemented in the package GOLD. However, the results based on the KEM-CP approach in conjunction with the molecular docking demonstrate that not necessarily the top ranked pose predicted by the GoldScore is the best pose for a given target. Comparison of the fitting score and the IE based results with the RMSDs of the ligands w.r.t. their crystal geometries allowed identifying the accurate lead structures. Further, the ranking of the ligands based on our results correlated well with that of the experimental IC 50 values. Although, the GoldScore is active site environment specific, KEM-CP approach shows neither such environment specificity nor any dependency on the structural resolutions. Moreover, besides its efficiency and quickness, KEM-CP calculations are simple and can be performed economically. KEM-CP can also be used for exploring the potent ligands of a system for which only an apo crystal structure or even a simulated structure is known. These clearly demonstrate the versatility and simplicity of the KEM-CP approach. Nevertheless, further such studies on some more systems would be necessary to judge the enhanced capability of KEM-CP. Moreover, the usefulness of KEM-CP based accurate poses, especially their orientations, for the EF based benchmarking using DUD-E, MUV, CASF etc. could be worth exploring in the future. Lastly, it is evident from our analysis that the application of KEM-CP approach along with docking studies may ease the process of virtual screening of potent ligands-the bottleneck of drug discovery research.
Supplementary Materials: The following information are available online at. Scheme S1: Chemical structures of the five ligands docked in hAR. Scheme S2: Chemical structures of the seven ligands docked in CDK2. Scheme S3: Chemical structures of the ten ligands docked in ERβ. Figure S1: Two major types of orientations of the IDD594 ligand at the binding site. Figure S2: Three major types of orientations of the ligand NU6102 of CDK2 at the binding site and their overlay diagrams. Figure S3: Plot showing IC 50 ranking distribution with the KEM-CP IE and GoldScore Fitness score rankings for pose Type 1 (from Table S16). Figure S4: Four major types of orientations of the ligand 4NA of ERβ at the binding site and their overlay diagrams. Figure S5: Plot showing correlations of rankings based on the average KEM-CP IE and GoldScore fitness score with the ranking based on IC 50 values for best poses of all 10 ligands (data from Table S28). Figure S6: Plot showing correlations of rankings based on the average KEM-CP IE and GoldScore fitness score with the ranking based on IC 50 values for pose Type 1 (data from Table S34). Figure S7: Plot showing correlations of rankings based on the average KEM-CP IE and GoldScore fitness score with the ranking based on IC 50 values for pose Type 3 (data from Table S35). Figure S8: Best poses of IDD594 in hAR using different scoring functions. Figure S9: Comparison of IDD594 poses generated by different scoring functions with respect to the crystal geometry (blue). Figure S10: Views of the active sites of (a) hAR-IDD594, (b) CDK2-NU6102 and (c) ERβ-4NA highlighting the complexed ligand (cyan) and the residues chosen as the active site centre (blue). Figure S11: Representation of the fragmentation of macromolecule as single and double kernels. Table S1: Population distribution of the ligands of hAR belonging to two major types. Tables S2-S6: Interaction energies (kCal·mol −1 ) of hAR with GoldScore poses of ligand 10, 16, 19, 24 and 25, respectively. Table S7: Population distribution of various poses of the ligands of CDK2 belonging to three major types. Tables S8-S14: Interaction energies (kCal·mol −1 ) of CDK2 with GoldScore poses of ligand 3, 25, 28, 29, 30, 33 and 34, respectively. Table S15: Comparison of ranking based on the experimental IC 50 values, the average IE from KEM-CP and fitting score from GoldScore of the best poses of the CDK2 ligands. Table S16: Comparison of ranking based on the experimental IC 50 values, the average IE from KEM-CP and fitting score from GoldScore of the Type 1 poses of the CDK2 ligands. Table S17: Population distribution of the 30 poses of the ligands of ERβ belonging to 4 types. Tables S18-S27: Interaction energies (kCal·mol −1 ) of ERβ with GoldScore poses of ligand 15,25,27,29,40,44,57,62, 68 and 70, respectively. Table S28: Comparison of ranking based on the experimental IC 50 values, the average IE from KEM-CP and fitting score of the best poses of the ERβ ligands. Table S29: Comparison of ranking based on the experimental IC 50 values, the average IE from KEM-CP and fitting score of the pose Type 1 of the ERβ ligands. Table S30: Comparison of ranking based on the experimental IC 50 values, the average IE from KEM-CP and fitting score of the pose Type 3 of the ERβ ligands. Table S31: Interaction energies (kCal·mol −1 ) of hAR-IDD594 complex. Table S32: Interaction energies (kCal·mol −1 ) of hAR with GoldScore pose of ligand 19. Table S33: Interaction energies (kCal·mol −1 ) of hAR with ChemScore pose of ligand 19. Table S34: Interaction energies (kCal·mol −1 ) of hAR with ChemPLP pose of ligand 19. Table S35: Interaction energy (IE) of hAR with IDD594 poses generated by different scoring function and with the crystal geometry of IDD594 and RMSD of the docked poses with respect to the crystal geometry. Section S1.1: Benchmarking of scoring functions.