Boronic Acid Group: A Cumbersome False Negative Case in the Process of Drug Design

Herein we present, an exhaustive docking analysis considering the case of autotaxin (ATX). HA155, a small molecule inhibitor of ATX, is co-crystallized. In order to further extract conclusions on the nature of the bond formed between the ligands and the amino acid residues of the active site, density functional theory (DFT) calculations were undertaken. However, docking does not provide reproducible results when screening boronic acid derivatives and their binding orientations to protein drug targets. Based on natural bond orbital (NBO) calculations, the formed bond between Ser/Thr residues is characterized more accurately as a polar covalent bond instead of a simple nonpolar covalent one. The presented results are acceptable and could be used in screening as an active negative filter for boron compounds. The hydroxyl groups of amino acids are bonded with the inhibitor’s boron atom, converting its hybridization to sp3.


Introduction
Boronic acid is considered as a bioisostere for carboxylic acids bearing lower acidity [1]. It emerged as a pharmacophore group after the clinical approval of the drug bortezomib [2] in 2003 as an anticancer agent. Boron is located before carbon in the periodic table of elements. Both elements present similarities in behavior when participating in organic molecules [3]. In nature, boron-containing compounds can be found in plants, algae, microorganisms [4], and in certain natural products derived from bacteria [5]. Boron contains an empty p orbital in its sp 2 hybridization state, presenting a trigonal planar geometry. Upon covalent bond formation, sp 3 hybridization takes place, leading to a tetrahedral geometry [6].
Based on our previous 2D-QSAR and 3D in silico studies on boronic acid derivatives [23,24], we present herein a docking analysis in order to define the typical steps and variables that are important to incorporate in a library's virtual screening when organoboron derivatives are included. 2 of 9 The enzyme autotaxin (ATX) was used as a biological target in our investigation. The enzyme is located extracellularly and participates in lipid metabolism, converting lysophosphatidylcholine (LPC) to lysophosphatidic acid (LPA). Its active site is located within its catalytic phosphodiesterase (PDE) domain with known recorded XRD data (PDB entry 2XRG) in the presence of a small molecule inhibitor HA155 [25], presenting a threonine-mediated covalent bond between the enzyme and the inhibitor. Moreover, we performed DFT calculations in order to explore the nature of the covalent bond formed between the inhibitor HA155 and two residues similar like Ser/Thr. Additionally, boron compounds are known to form covalent bonds with His [26] and/or Asp [18] residues. The use of covalent "warhead" groups in Medicinal Chemistry are an emerging strategy to achieve irreversible protein inhibition [27,28]. Hence, designed molecules bearing a covalent warhead can be used in targeting shallow pockets or clefts at the surface of proteins to selectively disrupt protein-protein interactions (PPIs) [29] in vivo and in other contexts.

Results and Discussion
The usage of typical constraints in the screening of boronic acid derivative libraries could lead to a false negative presentation with regard to binding poses and scoring function values, failing to replicate possible covalent bonds among boron and the amino acids Ser/Thr existing in the active site. Thus, alternate steps must be taken into account when we are dealing with hit compounds bearing boron in their structures producing plausible bindng poses. In particular, using standard hydrogen bond (HB) or contact constraints for Thr209 provided HA155 solutions with 100% off poses based only on the inhibitor's shape score with minimum HB network scores. Alternatively, the introduction of custom constraints incorporating a smart pattern-like option is needed in order to better simulate the original HA155 query in our training sets (see Scheme 1). Additionally, smart patterns I and II reproduce a number of binding modes with other neighboring residues bearing an oxygen atom, such as Asp171/Tyr306 (and excepting Thr209). The statistics for our study groups are presented in Table 1. Based on our previous 2D-QSAR and 3D in silico studies on boronic acid derivatives [23,24], we present herein a docking analysis in order to define the typical steps and variables that are important to incorporate in a library's virtual screening when organoboron derivatives are included. The enzyme autotaxin (ATX) was used as a biological target in our investigation. The enzyme is located extracellularly and participates in lipid metabolism, converting lysophosphatidylcholine (LPC) to lysophosphatidic acid (LPA). Its active site is located within its catalytic phosphodiesterase (PDE) domain with known recorded XRD data (PDB entry 2XRG) in the presence of a small molecule inhibitor HA155 [25], presenting a threonine-mediated covalent bond between the enzyme and the inhibitor. Moreover, we performed DFT calculations in order to explore the nature of the covalent bond formed between the inhibitor HA155 and two residues similar like Ser/Thr. Additionally, boron compounds are known to form covalent bonds with His [26] and/or Asp [18] residues. The use of covalent "warhead" groups in Medicinal Chemistry are an emerging strategy to achieve irreversible protein inhibition [27,28]. Hence, designed molecules bearing a covalent warhead can be used in targeting shallow pockets or clefts at the surface of proteins to selectively disrupt proteinprotein interactions (PPIs) [29] in vivo and in other contexts.

Results and Discussion
The usage of typical constraints in the screening of boronic acid derivative libraries could lead to a false negative presentation with regard to binding poses and scoring function values, failing to replicate possible covalent bonds among boron and the amino acids Ser/Thr existing in the active site. Thus, alternate steps must be taken into account when we are dealing with hit compounds bearing boron in their structures producing plausible bindng poses. In particular, using standard hydrogen bond (HB) or contact constraints for Thr209 provided HA155 solutions with 100% off poses based only on the inhibitor's shape score with minimum HB network scores. Alternatively, the introduction of custom constraints incorporating a smart pattern-like option is needed in order to better simulate the original HA155 query in our training sets (see Scheme 1). Additionally, smart patterns I and II reproduce a number of binding modes with other neighboring residues bearing an oxygen atom, such as Asp171/Tyr306 (and excepting Thr209). The statistics for our study groups are presented in Table 1.  Training set_1 used custom constraint I and provided the worst results. On the other hand, Training set_7 (with no active constraints) resulted in impressively high scoring functions, but Scheme 1. General smart pattern representation structures (I, II, III, IV) used for the performed constrained docking simulations. Training set_1 used custom constraint I and provided the worst results. On the other hand, Training set_7 (with no active constraints) resulted in impressively high scoring functions, but entirely lacked the correct binding mode. Training set_3 had as active constraint patterns I, II, and III, and provided improved results. For the study of the remaining cases, all of the above-mentioned constraints (I-IV) were used (differentiated on the included distance). The distance was presented as a sphere with descending radius centered on the Thr209 hydroxyl group varying from preset 3.5 to 1.25 Angstroms (Å). The presented results were acceptable and could be used in screening as an active negative filter for boron compounds. Among Training sets_2, _4, _5, and _6 (based on their statistics), the most enriched proved to be Training set_4, with a cut-off radius of 2.5 Å. Figure 1 provides a detailed 2D map analysis of the ATX active site.
Molecules 2016, 21, 1185 3 of 10 entirely lacked the correct binding mode. Training set_3 had as active constraint patterns I, II, and III, and provided improved results. For the study of the remaining cases, all of the above-mentioned constraints (I-IV) were used (differentiated on the included distance). The distance was presented as a sphere with descending radius centered on the Thr209 hydroxyl group varying from preset 3.5 to 1.25 Angstroms (Å). The presented results were acceptable and could be used in screening as an active negative filter for boron compounds. Among Training sets_2, _4, _5, and _6 (based on their statistics), the most enriched proved to be Training set_4, with a cut-off radius of 2.5 Å. Figure 1 provides a detailed 2D map analysis of the ATX active site. Images of the successful case studies are given in Figure 2, both in overlay with the original query compound HA155 and in detailed 3D representations inside the protein's active site.
The optimized geometries of serine and threonine amino acids are the global minima (as shown in Figure 3). Two hydrogen bonds are observed in serine. The first is located between the carboxylic hydrogen and the nitrogen of the backbone amine (1.929 Å), and the second HB is between the carboxylic oxygen and the residual hydroxyl group (2.040 Å). In threonine, the HBs can be observed between the carboxylic hydrogen and the amine group (1.899 Å).
The optimized geometries of the inhibitor HA155 are depicted in Figure 4. The HA155 (2) structure is the optimized crystal structure as experimentally isolated. This conformation of the inhibitor is more stable than the HA155 (1) structure, which is another local minimum conformation in the potential energy surface, varying by 2.02 kcal/mol, including the zero point correction energy. Images of the successful case studies are given in Figure 2, both in overlay with the original query compound HA155 and in detailed 3D representations inside the protein's active site.
1.25 Angstroms (Å). The presented results were acceptable and could be used in screening as an active negative filter for boron compounds. Among Training sets_2, _4, _5, and _6 (based on their statistics), the most enriched proved to be Training set_4, with a cut-off radius of 2.5 Å. Figure 1 provides a detailed 2D map analysis of the ATX active site. Images of the successful case studies are given in Figure 2, both in overlay with the original query compound HA155 and in detailed 3D representations inside the protein's active site.
The optimized geometries of serine and threonine amino acids are the global minima (as shown in Figure 3). Two hydrogen bonds are observed in serine. The first is located between the carboxylic hydrogen and the nitrogen of the backbone amine (1.929 Å), and the second HB is between the carboxylic oxygen and the residual hydroxyl group (2.040 Å). In threonine, the HBs can be observed between the carboxylic hydrogen and the amine group (1.899 Å).
The optimized geometries of the inhibitor HA155 are depicted in Figure 4. The HA155 (2) structure is the optimized crystal structure as experimentally isolated. This conformation of the inhibitor is more stable than the HA155 (1) structure, which is another local minimum conformation in the potential energy surface, varying by 2.02 kcal/mol, including the zero point correction energy.  The optimized geometries of serine and threonine amino acids are the global minima (as shown in Figure 3). Two hydrogen bonds are observed in serine. The first is located between the carboxylic hydrogen and the nitrogen of the backbone amine (1.929 Å), and the second HB is between the carboxylic oxygen and the residual hydroxyl group (2.040 Å). In threonine, the HBs can be observed between the carboxylic hydrogen and the amine group (1.899 Å).    The optimized geometries of the inhibitor HA155 are depicted in Figure 4. The HA155 (2) structure is the optimized crystal structure as experimentally isolated. This conformation of the inhibitor is more stable than the HA155 (1) structure, which is another local minimum conformation in the potential energy surface, varying by 2.02 kcal/mol, including the zero point correction energy.    The optimized geometries of the bonded serine and threonine with boronic acid (HA155) are presented in Figure 5. In both structures, the hydroxyl groups of the amino acids are bonded with the inhibitor's boron atom, converting its hybridization to sp 3 . The geometric parameters are shown in Table 2. The highest occupied molecular orbitals (HOMO) of the two complexes are shown in Figure 6, depicting the electron density concentrated between the boron atom of compound HA155 and the amino acids. The natural bond orbital (NBO) calculations-both in HA155-Ser and HA155-Thr complexes-showed that the bond between boron and oxygen (B-O3, Figure 5) more correctly refers to a polar covalent bond. Based on these calculations, it seems that the occupation number in oxygen is approximately 1.65 electrons, and in boron 0.40 electrons, respectively. Accordingly, the degree of polarity for our models was calculated as 1.770 for HA155-Thr and 1.821 for HA155-Ser, with a threshold of 1.700 for the covalent bond. Figure 6, depicting the electron density concentrated between the boron atom of compound HA155 and the amino acids. The natural bond orbital (NBO) calculations-both in HA155-Ser and HA155-Thr complexes-showed that the bond between boron and oxygen (B-O3, Figure 5) more correctly refers to a polar covalent bond. Based on these calculations, it seems that the occupation number in oxygen is approximately 1.65 electrons, and in boron 0.40 electrons, respectively. Accordingly, the degree of polarity for our models was calculated as 1.770 for HA155-Thr and 1.821 for HA155-Ser, with a threshold of 1.700 for the covalent bond.    * The atom numbering follows the same format as represented in Figure 4. The optimized geometries of the bonded serine and threonine with boronic acid (HA155) are presented in Figure 5. In both structures, the hydroxyl groups of the amino acids are bonded with the inhibitor's boron atom, converting its hybridization to sp 3 . The geometric parameters are shown in Table 2. The highest occupied molecular orbitals (HOMO) of the two complexes are shown in Figure 6, depicting the electron density concentrated between the boron atom of compound HA155 and the amino acids. The natural bond orbital (NBO) calculations-both in HA155-Ser and HA155-Thr complexes-showed that the bond between boron and oxygen (B-O3, Figure 5) more correctly refers to a polar covalent bond. Based on these calculations, it seems that the occupation number in oxygen is approximately 1.65 electrons, and in boron 0.40 electrons, respectively. Accordingly, the degree of polarity for our models was calculated as 1.770 for HA155-Thr and 1.821 for HA155-Ser, with a threshold of 1.700 for the covalent bond.   The binding energies of the HA155-Ser complex were calculated both in the gas phase as well as in water solution, providing 311.18 and 300.13 kcal/mol, respectively, whereas the energies for HA155-Thr were 326.49 and 309.52 kcal/mol. The high binding energies found here support the formation of a weak covalent bond between compound HA155 and both Ser/Thr residues. These results provided extra evidence of the reversible nature of this bond, as it was also previously experimentally documented [13,15].

Computational Details
Docking studies. The target enzyme ATX (PDB entry 2XRG) co-crystallized with the small molecule inhibitor HA155 was used in our study [25]. All simulations were reproduced on a typical desktop PC running a Windows 7 64-bit operating system (Dual Core Intel Pentium 3.2 GHz CPU processors, RAM 8 GB), using OEDocking suite v. 3.0.1 programs (OpenEye Scientific Software, Inc., Santa Fe, NM, USA; www.eyesopen.com) [31][32][33] using Exhaustive Search Algorithm. Visualization of the docking solutions was performed with PyMol v. 1.4.1 software [30].
The PDB data were used to evaluate our results, and the HA155 molecule was represented in smile format compiled in a simple *.txt file, where with the use of Open Babel [34] we transformed it to the respective *.smi file. Consequently, we acquired its conformer library file with the use of Omega v.2.5.1.4 software (OpenEye Scientific Software, Inc., Santa Fe, NM, USA; www.eyesopen.com) [35,36] in *.oeb.gz format. Preparation of the respective *.pdb-formatted protein was done using the OEDocking v. 3.0.1 suite program MAKE RECEPTOR (OpenEye Scientific Software, Inc., Santa Fe, NM, USA; www.eyesopen.com) [31][32][33]. All water molecules were removed. The docking box was centered on the protein active site which included the co-crystallized HA155 molecule implementing a balanced site-shape potential by an outer contour docking space. No residue modifications were presented unless otherwise specified. Several different constraint options were utilized for residue Thr209 in an effort to better reproduce the original data, and the protein was then saved as a *.oeb-formatted file for further use. Command line-based docking was then performed, specifying the above prepared files as input for the OEDocking v. 3.0.1 suite software (OpenEye Scientific Software, Inc., Santa Fe, NM, USA; www.eyesopen.com) [31][32][33], generating 60 final poses for each model. All of the results were further refined by the use of OEDocking suite program FRED rescore v. 3.0.1 (OpenEye Scientific Software, Inc., Santa Fe, NM, USA; www.eyesopen.com) [31][32][33], providing the sorting of poses using default parameters. Scoring functions were based on Chemgauss4 values, and the solutions were saved as *.sdf files, offering the possibility of being used further for visualization purposes.
Density Functional Theory (DFT) calculations. All geometry optimization calculations were carried out on a typical desktop PC running a Windows 7 64-bit operating system (Intel i7 2.9 GHz CPU processors, RAM 4 GB), in gas phase, using the G09W [37] software package. The hybrid DFT method with Becke's [38] three-parameter functional and the nonlocal correlation provided by the Lee, Yang, and Parr expression [39] (B3LYP) was used for optimization, employing the 6-31+G(d,p) basis set [40][41][42]. Frequency calculations for all optimized geometries ensured that the calculated structures were the global minima of the potential energy surface of the molecules. Single point calculations starting from the optimized geometry of all structures were also carried out in water solvent using the PCM [43][44][45] model as implemented in G09W software. Natural bond orbital analysis was performed using NBO 3.1 software package as implemented in the G09W package [37], in order to evaluate the covalent bond nature between the studied compounds.

Conclusions
Virtual screening of large libraries of boronic acid derivatives fail to dock in a natural mode, since they commonly form covalent bonds with Ser and/or Thr residues when they are presented in the protein active site. Hence, they are left out as false negatives both in regard to their binding poses and their scoring function values. This issue demands programs that can incorporate custom constraints such as smart pattern options or pinpoint selection.
Based on NBO calculations (performed originally herein), the formed bond between Ser/Thr residues is characterized more accurately as a polar covalent bond instead of a simple nonpolar covalent one. The presented results are acceptable, and could be used in screening as an active negative filter for boron compounds. The hydroxyl groups of amino acids are bonded with the inhibitor's boron atom, converting its hybridization to sp 3 . Furthermore, the extremely high binding energies of the HA155-Ser/Thr complexes revealed that this reaction is not spontaneous. These theoretical data are also in accordance with the experimentally witnessed reversible binding of the inhibitor HA155 inside the ATX catalytic domain active site.
The findings described above highlight general options that need to be considered when large libraries of boron compounds are virtually screened to identify novel hits in drug design.