In the past several years, halogen bonding has been recognized as an important contributor to the stability of supramolecular structures [1], crystals [3], protein-ligand complexes [6], as well as other chemical and biomolecular structures [12]. These noncovalent interactions have properties that are similar to those of hydrogen bonds, most notably their strong directionality and electrostatic character [12]. However, because of the involvement of large polarizable halogens, halogen bonds depend more strongly on the contributions from dispersion forces [15], making their solubilities and binding free energy properties distinctly different from those of hydrogen bonds [17]. Traditionally, electronegative Lewis bases have been considered the primary halogen bond accepting group, as is true in hydrogen bonding [12]. More recently, π-bonding and aromatic moieties have been shown to be effective halogen bond acceptors, leading to a subclass of interactions, sometimes called R–X···π (or C–X···π) interactions [19].
Halogen bonds have been found to be of particular interest in the stabilization of protein-ligand complexes and in the field of drug design, where these interactions are being actively explored as a means to modify binding free energies [6]. In recent Protein Data Bank (PDB) survey studies seeking to identify halogen bonds in protein-ligand complexes, R–X···π interactions have been found to be one of the most common types of halogen bonding motifs [9]. Examples of R–X···π interactions that contribute to the stabilization of protein-ligand complexes are given in Figure 1. In one study, Zhu and coworkers found that 33% of halogen bonds in the PDB are of the R–X···π type, the second most common halogen bonding motif identified, with the most common motif being of the R–X···O type (53%) [9]. Matter and coworkers identified R–X···π interactions in several different types of protein-ligand complexes, and found that the introduction of R–Cl···π and R–Br···π interactions in complexes involving inhibitors of the serine protease factor Xa (fXa) results in a dramatic enhancement of ligand binding affinity [26]. Imai and coworkers identified 59 Cl···π contacts within the PDB (2007 release), with 21, 15, 15, and 8 contacts involving phenylalanine, tyrosine, tryptophan, and histidine, respectively [27]. Sirimulla and coworkers examined the propensities for each of the amino acids to form halogen bonds, carefully distinguishing between the interactions involving protein sidechains and those involving the backbone [28]. Among all sidechain interactions, halogen contacts with the aromatic amino acids phenylalanine and tyrosine were found to be among the most frequent, whereas interactions involving tryptophan and histidine, although still significant, occur less frequently.
One of the main sources of attraction in an R–X···Y halogen bond is an electrostatic interaction between the σ-hole, a region of positive potential (electron deficiency) located along the extension of the R–X bond, and an electronegative Lewis base (Y) [12]. Because of the large size and polarizability of halogens, dispersion also plays a significant role in stabilizing halogen bonds [15]. The σ-hole is responsible for halogen bond directionality, although the nature of this directionality is not as straightforward as might be presumed [12]. Several recent computational studies based on symmetry adapted perturbation theory (SAPT), a method that divides interaction energies into contributions from electrostatics, dispersion, induction, and exchange, indicate that it is exchange-repulsion, and not electrostatics as might be assumed, that is responsible for R–X···Y directionality [29]. In a phenomenon known as polar flattening [32], the electron density envelope in the region of the σ-hole exhibits a flat character in a plane perpendicular to the R–X bond. As this flat region rotates away from the optimal R–X···Y angle of 180°, there is an increase in the electron density overlap of the halogen and the Lewis base, resulting in an increasingly large exchange-repulsion contribution, thus destabilizing the R–X···Y complex [30].
R–X···π interactions are generally slightly weaker than their R–X···Y counterparts, with binding energies whose magnitudes are typically 10–25% lower [20]. The SAPT characteristics of R–X···π interactions are similar to those of R–X···Y interactions; however, the former tend to have larger relative contributions from dispersion, which is to be expected given the large size and polarizabilities of both halogens and phenyl groups [25]. The geometric properties of R–X···π interactions are inherently different from those of standard halogen bonds because of the diffuse nature of the region of negative potential in an aromatic system.
In a recent computational study utilizing the accurate CCSD(T)/aug-cc-pVQZ and SAPT2+3δMP2/aug-cc-pVTZ methods on model complexes involving a benzene R–X···π acceptor, the ways in which the strength of an R–X···π interaction depends on the relative orientation of the interacting pair were investigated [25]. It was seen that the strength of an R–X···π interaction depends strongly on the distance between the halogen and the benzene ring, a distance that is described by both R(R–X···π) and the α angle (as defined in Figure 2, and first introduced by Glaser et al.) [33], while being only mildly dependent on the R–X···π angle (θ). Another useful geometric parameter, R⫫, derived from both R(R–X···π) and α, describes the distance between the halogen and the benzene plane (not the benzene centroid). With this definition, we can say that the strength of an R–X···π interaction depends strongly on R⫫. It was also found in this study that the C6v (T-shaped) structure of the complex, in which the bromine is directly above the benzene center with the C–Br bond perpendicular to the benzene plane, is not the global minimum for this complex. The global potential energy minimum was not located in that study; however, it was shown that shifting the R–X···π donor across the benzene plane results in interaction energies that are slightly more attractive (by about 0.05 kcal/mol) at shift distances of 0.50–0.75 Å. The rotation of the R–X···π donor away from perpendicularity with the benzene plane (i.e., decreasing the θ angle) results in a weaker interaction; however, this destabilization is a weaker function of this rotation than has been observed for R–X···Y interactions. In terms of SAPT contributions to the interaction energy, the chief reason for R–X···π destabilization upon rotation away from perpendicularity is an increase in the (repulsive) exchange term, with the dispersion term becoming slightly more attractive as the R–X···π angle (θ) decreases, as has been observed for “standard” R–X···Y type interactions [30]. The electrostatic and induction terms are essentially flat throughout the rotation (whose minimum angle is 140°).
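The geometric parameters described above can be computed from Cartesian coordinates. The following is a minimal Python sketch, not the procedure used in the cited study. It assumes that R(R–X···π) is the halogen-to-centroid distance, that α is the angle between the ring normal and the centroid-to-halogen vector (so that R⫫ = R·cos α), and that θ is the R–X···centroid angle measured at the halogen. The function name `rx_pi_geometry` and its return keys are hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def rx_pi_geometry(r_atom, x_atom, ring):
    """Return R, alpha, R_plane and theta for one R-X...pi contact.

    r_atom, x_atom: (x, y, z) of the atom bonded to the halogen and of the halogen.
    ring: (x, y, z) of the aromatic ring atoms, listed in bonded order.
    Distances are in the units of the input; angles are in degrees.
    """
    n_atoms = len(ring)
    centroid = tuple(sum(p[i] for p in ring) / n_atoms for i in range(3))

    # Ring normal from two adjacent centroid-to-atom vectors
    # (assumption: the ring is close to planar).
    raw_normal = _cross(_sub(ring[0], centroid), _sub(ring[1], centroid))
    length = _norm(raw_normal)
    normal = tuple(c / length for c in raw_normal)

    # Halogen-centroid distance, halogen-plane distance, and offset angle alpha.
    to_x = _sub(x_atom, centroid)
    dist = _norm(to_x)
    r_plane = abs(_dot(to_x, normal))
    alpha = math.degrees(math.acos(min(1.0, r_plane / dist)))

    # theta: angle R-X...centroid, measured at the halogen.
    x_to_r = _sub(r_atom, x_atom)
    x_to_c = _sub(centroid, x_atom)
    cos_theta = _dot(x_to_r, x_to_c) / (_norm(x_to_r) * _norm(x_to_c))
    theta = math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

    return {"R": dist, "alpha": alpha, "R_plane": r_plane, "theta": theta}
```

Under these assumptions a halogen placed directly above the ring centroid gives α = 0° and R⫫ = R, and shifting it parallel to the ring plane increases R and α while leaving R⫫ unchanged.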
There are two main factors that control the size and charge of a halogen’s σ-hole, thus modulating the strength of an R–X···π or R–X···Y interaction. The identity of the halogen has a large impact on interaction strength, with larger halogens generally forming stronger interactions (I > Br > Cl), both because larger halogens have larger σ-holes and because larger halogens have higher polarizabilities, which leads to stronger dispersion interactions [12]. The chemical environment in which the halogen is found also has a large impact on the σ-hole size, with the electronegativity of neighboring atoms having the largest effect. Thus, it is observed that a halogen bound to an electron withdrawing group will generally have a larger, more positive, σ-hole than one bound to an electropositive or electroneutral group. The size of a σ-hole correlates strongly with the strength of the halogen bonding, or R–X···π, interaction in which the halogen participates [24].
Although R–X···π contacts appear to be quite common in protein-ligand complexes, these interactions have not been the subject of intensive study until recently, and their characteristics are not understood nearly as well as those of their R–X···Y counterparts. Here we seek to investigate the strength and character of R–X···π interactions found within protein-ligand complexes obtained from the PDB. Several model complexes derived from these protein-ligand structures are considered, including six R–Cl···π complexes and five R–Br···π complexes. This study is not intended to be a comprehensive survey of R–X···π interactions in protein-ligand complexes, but is aimed at elucidating the strengths and properties of these interactions that can be expected within the context of the protein binding pocket. As such, we have chosen the 11 systems studied here to have varying near-halogen chemical environments (i.e., differing σ-hole magnitudes) as well as variable R–X···π geometries.
2. Results and Discussion
Before computational results are discussed, it is important to state clearly that, although we have gone to a good deal of effort to exclude secondary (i.e., non-R–X···π) interactions from our model systems to as great an extent as possible, there are several systems for which this was not possible. Thus, some of the interaction energies reported here do not purely reflect the strength of the R–X···π interaction in question, but may also include some other non-negligible contribution that either stabilizes or destabilizes the overall complex. A large number of complexes were excluded from our study because of the large contributions to binding made by secondary interactions. It should be further noted that R–X···π complexes whose geometric arrangements are far from the C6v configuration are the most likely to include secondary interactions, as these structures often include interactions between aromatic hydrogens (or other aromatic substituents) on the R–X···π donor and the aromatic amino acid R–X···π acceptor. Model systems in which secondary interactions clearly play a role in complex stabilization will be indicated throughout the text. The difficulty in finding R–X···π interactions that are not significantly affected by secondary interactions highlights the fact that a particular noncovalent interaction motif often does not exist “in a vacuum” within a protein-ligand complex, meaning that it is difficult to separate the contributions to overall protein-ligand complex stability that are made by the various noncovalent contacts between the ligand and the surrounding amino acids.
gives electrostatic potential maps for benzene, ethylbenzene (phenylalanine model), and 4-ethylphenol (tyrosine model). Here, it is seen that the area of most negative potential near the aromatic centroid is of similar size and charge for all three molecules, although slightly smaller for 4-ethylphenol. In the case of 4-ethylphenol, this region of negative charge extends towards the electronegative hydroxyl oxygen with only a small barrier of less negative potential. The main point to be made about the electrostatic potentials of these three molecules is that they all have negative regions with potentials of roughly the same magnitude (~−100 kcal/mol) and size. Thus, at least in terms of electrostatics, interactions involving ethylbenzene and 4-ethylphenol as R–X···π acceptors can be expected to have properties that are similar to those having benzene as the R–X···π acceptor.
illustrates model complex geometries, geometric parameters, SAPT2+ interaction energies and interaction terms, R–Cl···π donor electrostatic potentials, and VS,max values for all R–Cl···π model complexes considered here. The VS,max value represents the most positive potential on a σ-hole and reflects the size and charge of that σ-hole. One of the most outstanding features of the depicted data is the relatively low variability of the interaction strengths, with all of the complexes having binding energies falling within the range from −2.10 kcal/mol to −2.66 kcal/mol. This finding is consistent with the results of previous studies in which it was found that the strength of an R–X···π interaction does not strongly depend on geometric factors other than the R–X···π distance. It will be noted here that the most stable complex in this group (−2.66 kcal/mol), 1T4V, contains a strong C–H···O contact in addition to the R–Cl···π interaction. Neglecting this system, the next most stable complex (−2.56 kcal/mol), 1IQE, can be seen to have the most favorable C–Cl···π distance (3.47 Å, R⫫ = 3.36 Å) and the second highest VS,max value (53 kcal/mol) within the group. This result is consistent with previous studies showing that the two factors most strongly affecting the strength of an R–X···π interaction are the R–X···π distance and the size/charge of the halogen σ-hole (i.e., VS,max).
In terms of SAPT interaction contributions, it is clearly seen that dispersion is the largest attractive force for all interactions considered here, with electrostatics also playing a significant role in attraction. This is consistent with previous SAPT results on model R–X···benzene complexes [24] and with the results of Imai and coworkers [27], who used Morokuma decomposition and comparison of Hartree-Fock and MP2 results to investigate the origins of attraction in R–Cl···benzene complexes. SAPT dispersion components fall in the range between −3.35 kcal/mol and −4.29 kcal/mol, while the electrostatic term ranges from −1.05 kcal/mol to −1.86 kcal/mol. In terms of the relative role that each attractive term plays in stabilizing these complexes, reflected in the % attraction quantity, it is seen that dispersion is responsible for 64% to 72% of attraction in the R–Cl···π complexes considered here. Large dispersion contributions to attraction are expected here, as both the R–X···π donor and acceptor are large polarizable groups, and the size of each of these leads to a high degree of overlap between the two electron density envelopes. This high degree of overlap also leads to strong exchange-repulsion components, with the SAPT values varying between 2.01 kcal/mol and 4.47 kcal/mol. In previous studies it has been seen that the strengths of the dispersion and exchange interaction contributions depend strongly on the R⫫ distance, but are affected relatively little by the R–X···π angle (θ) and by the R–X···π donor VS,max value. The electrostatic and induction SAPT terms are also relatively insensitive to the variations in θ angles, but depend strongly on both R⫫ and the donor VS,max value. Thus, as is the case for standard R–X···Y type halogen bonds, the tuning of the strength of an R–X···π interaction is mediated primarily through the electrostatic and, to a much lesser extent, induction contributions to attraction.
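The % attraction quantity used above can be illustrated with a short Python sketch. It assumes that % attraction for a given term is that term divided by the sum of the attractive (negative) SAPT terms, taken here to be electrostatics, induction, and dispersion; the text does not state the formula explicitly, so this definition is an assumption. The function names and the numbers in the usage note are illustrative only and are not taken from the tables of this work.

```python
def sapt_total(elec, exch, ind, disp):
    """Total SAPT interaction energy as the sum of its four components."""
    return elec + exch + ind + disp

def percent_attraction(elec, ind, disp):
    """Share (in %) of each term in the total attraction.

    Assumed definition: only negative (attractive) terms contribute to the
    denominator; a non-attractive term is assigned a share of zero.
    """
    terms = {"elec": elec, "ind": ind, "disp": disp}
    attractive = {name: value for name, value in terms.items() if value < 0.0}
    total = sum(attractive.values())
    if total == 0.0:
        raise ValueError("no attractive terms")
    return {name: 100.0 * attractive.get(name, 0.0) / total for name in terms}
```

For a hypothetical complex with E(elec) = −1.5, E(exch) = 3.5, E(ind) = −0.5, and E(disp) = −4.0 kcal/mol, `sapt_total` gives −2.5 kcal/mol and `percent_attraction` assigns about two thirds of the attraction to dispersion.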
Looking at the SAPT binding energy terms as a function of geometric parameters, it can be seen that the electrostatic, dispersion, and exchange terms exhibit a degree of correlation with the R⫫ distance; the electrostatic and dispersion terms become more attractive, and the exchange term becomes more repulsive as the R⫫ decreases. For example, the three complexes having R⫫ distances of at least 3.40 Å (1T4V, 1NFY, and 2BMG) have electrostatic terms ranging from −1.05 to −1.19 kcal/mol, dispersion terms ranging from −3.35 to −3.41 kcal/mol, and exchange terms ranging from 2.01 to 2.75 kcal/mol while the complexes having R⫫ distances of less than 3.36 Å (1IQE, 1NFU, and 2BQW) have electrostatic terms ranging from −1.75 to −1.86 kcal/mol, dispersion terms ranging from −4.09 to −4.29 kcal/mol, and exchange terms ranging from 3.80 to 4.47 kcal/mol. Conversely, there is no strong correlation seen here between the SAPT interaction terms and the θ angle.
gives model complex geometries, geometric parameters, SAPT2+ interaction energies and interaction terms, R–Br···π donor electrostatic potentials, and VS,max values for all R–Br···π model complexes considered here. The most salient aspect of the data depicted here is the fact that, as expected, R–X···π interactions involving bromine are generally stronger than those involving chlorine, with R–Br···π interactions having binding energies that fall in the range from −2.01 to −3.60 kcal/mol. There are two principal reasons for the enhanced strength of R–Br···π interactions relative to R–Cl···π interactions, both related to the larger size of bromine: larger halogens tend to form larger, more positive, σ-holes, and larger halogens are more polarizable and, thus, tend to form stronger dispersion interactions. The first of these points (increased σ-hole size) is reflected in VS,max values, which range from 57 to 118 kcal/mol for R–Br···π donors, and from 25 to 54 kcal/mol for R–Cl···π donors.
In terms of geometric parameters, it is seen that R–Br···π distances are generally greater than R–Cl···π distances, as would be expected because of the larger size of bromine, with an average difference of 0.20 Å. It should be noted here that, in comparison to the R–Cl···π complexes, a wider range of R–X···π angles, both α and θ, is represented in the set of five R–Br···π structures, with α angles ranging from 16.3 to 30.3° and θ angles in the range from 127.5 to 174.8°. The result of the larger variation in the three principal geometric parameters is a larger variation of the R⫫ distance, which varies from 3.18 to 3.82 Å (∆R⫫ = 0.64 Å) for R–Br···π structures, as compared to 3.29 to 3.68 Å (∆R⫫ = 0.39 Å) for R–Cl···π complexes. The 1P5E complex, which exhibits the shortest R–Br···π distance and the most positive σ-hole, represents the strongest interaction among all complexes considered here, with an interaction energy of −3.60 kcal/mol. The strength of this interaction is principally attributable to the large size and charge of the bromine σ-hole (VS,max = 118 kcal/mol) and the very short distance between the halogen and the benzene ring (R⫫ = 3.18 Å). The short R⫫ distance results in SAPT terms that are all large relative to those of structures with longer intermolecular separations. This is not a surprising result, as it has been seen that the magnitudes of all SAPT terms tend to increase sharply with decreasing intermolecular separation within the region of the potential energy minimum for R–X···π interactions, traditional halogen bonds, and other noncovalently bound complexes. It should also be noted that 1P5E has the largest relative contribution from electrostatics, with a % attraction value of 35.6%, which is attributable to both the short R⫫ distance and the large VS,max value.
Comparing the 1ZOE complex, representing the second strongest interaction in this study, to 1P5E, it is seen that 1ZOE has a bromine σ-hole that is significantly smaller (VS,max = 86 kcal/mol) and an R⫫ distance that is much larger, resulting in an R–Br···π interaction that is weaker by 0.50 kcal/mol. The relative contribution of electrostatics in stabilizing this complex is smaller than in 1P5E (% attraction = 30.5%), but is higher than in all of the other complexes considered here, which is again attributable to a relatively short R⫫ distance and large VS,max value. The low binding energy (−2.01 kcal/mol) of the weakest complex in this group, 1O27, is chiefly attributable to a large R⫫ distance of 3.82 Å. This complex is seen to be largely dispersion-bound, with dispersion accounting for 78.4% of attraction.
3. Conclusions
In this work, we have found that the strengths of R–X···π interactions in protein-ligand complexes vary a great deal, with binding energies ranging from −2.01 to −3.60 kcal/mol. SAPT analysis has shown that dispersion plays the largest role in stabilizing these R–X···π interactions, generally accounting for about 50% to 80% of attraction. Electrostatics are also important, accounting for roughly 20% to 35% of attraction. Generally, R–Br···π interactions are stronger than R–Cl···π interactions; however, geometric factors, as well as the halogen’s chemical environment, play a large role in determining this strength, meaning that the strongest R–Cl···π interactions can be stronger than the weakest R–Br···π interactions. It is reasonable to expect that R–X···π interactions involving iodine, which is larger and generally has a larger σ-hole than bromine (for an equivalent chemical environment), would tend to be stronger than those involving bromine or chlorine. Interactions involving iodine are not included here, as the influence of relativistic effects in these interactions is strong and we are not able to include these effects in our SAPT calculations.
In previous studies on smaller model systems involving benzene as the R–X···π acceptor, it was found that the two factors most strongly affecting the strength of these interactions are the distance between the halogen and the phenyl plane, and the size of the halogen’s σ-hole. The R–X···π angle (θ) has a much weaker influence on binding energies. The results of this investigation largely confirm these findings, with interaction strengths that generally increase with increasing VS,max values and decreasing R⫫ distances.
For both R–Cl···π and R–Br···π interactions, it is seen that the SAPT electrostatic, dispersion, and exchange terms are strongly correlated with the R⫫ distance, with E(elec) and E(disp) increasing, and E(exch) decreasing as R⫫ becomes larger. In the case of R–Br···π interactions, it is also clearly seen that interaction energies, which depend both on geometric factors and σ-hole size, generally decrease (become more stable) with a decreasing R⫫.
4. Computational Methods
As noted above, the interaction energies are computed for R–X···π complexes derived from the PDB. In all cases, the R–X···π donor is derived from the ligand and the R–X···π acceptor is derived from a protein tyrosine or phenylalanine residue.
These complexes are prepared in such a way that the critical aspects of the halogen’s chemical environment are included, while the size of the R–X···π donor is kept as small as possible. There are two critical reasons for minimizing the size of the R–X···π donating system: firstly, this limits the computational expense for dimer calculations, and secondly (and perhaps more importantly), this helps limit the possibility of secondary (non-R–X···π) interactions being present in the complexes. As our primary aim is to study the strengths of R–X···π interactions, it is important to limit the role that other interactions within the system may play in stabilizing the complexes under investigation.
For all 11 ligands investigated here, the halogen on the R–X···π donating group is bound to an aromatic carbon, which is the most common situation for these types of interactions, as has been shown through PDB survey studies [9]. In all of the cases, with the exceptions of the 1P5E and 1ZOE R–Br···π structures, the entire ligand is not retained for model calculations, but only the aromatic moiety on which the halogen is located, along with relevant functional groups that are also bound to that aromatic moiety (see figures in Table 1 and Table 2). Phenylalanine is modeled as ethylbenzene, and tyrosine is modeled as 4-ethylphenol. Heavy (non-hydrogen) atoms remain fixed in their crystal structure geometries while the positions of hydrogens, which are not determined experimentally, are determined by optimization at the BLYP/def2-TZVP level of theory.
The main tool used here to investigate the strength and character of R–X···π interactions is symmetry adapted perturbation theory at the SAPT2+/aug-cc-pVDZ level [36], which has been shown to provide binding energies that are at least semi-quantitatively accurate [38]. For example, for the S22 database of hydrogen bonded and dispersion bound complexes, this method consistently yields binding energies that are more accurate (mean unsigned error of 0.22 kcal/mol) than the very commonly used MP2/cc-pVTZ method (mean unsigned error of 0.70 kcal/mol) [38]. An additional feature of SAPT methods is that the interaction energy is divided into physically meaningful components corresponding to electrostatics, induction, dispersion, and exchange-repulsion. These terms give insight into the nature of a noncovalent interaction, and indicate the dominant stabilizing (and/or destabilizing) forces in an interacting pair. All SAPT calculations are performed using the PSI4 suite of molecular electronic structure programs [40].
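As an illustration of how a SAPT2+/aug-cc-pVDZ dimer calculation of this kind might be set up, the following Python sketch writes a PSI4 input file for a donor/acceptor pair. It is not the input used in this work. The function name `psi4_sapt_input`, the assumption that both monomers are neutral closed-shell singlets (the `0 1` lines), and the choice to keep the crystal-structure frame with `no_reorient` and `no_com` are all assumptions made for the example.

```python
def psi4_sapt_input(donor, acceptor, method="sapt2+", basis="aug-cc-pvdz"):
    """Return PSI4 input text for a SAPT calculation on a two-fragment dimer.

    donor, acceptor: lists of (element, x, y, z) tuples, coordinates in angstrom.
    Assumption: each fragment is neutral and closed-shell (charge 0, multiplicity 1).
    """
    def atom_lines(atoms):
        return "\n".join(
            f"{element} {x:.6f} {y:.6f} {z:.6f}" for element, x, y, z in atoms
        )

    lines = [
        "molecule dimer {",
        "0 1",
        atom_lines(donor),
        "--",  # fragment separator: SAPT needs the two monomers defined separately
        "0 1",
        atom_lines(acceptor),
        "units angstrom",
        "no_reorient",  # keep the input orientation
        "no_com",       # do not shift to the center of mass
        "}",
        "",
        f"set basis {basis}",
        f"energy('{method}')",
    ]
    return "\n".join(lines) + "\n"
```

The returned text can be saved to a file and passed to the `psi4` executable. A charged ligand fragment would need its charge and multiplicity line changed accordingly.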
Electrostatic potential maps are produced on the molecular ‘surface’, taken to be the 0.001 electron/bohr³ contour of the molecule’s electron density. Molecular electronic densities are computed at the B3LYP/6-311G* level of theory [41]. Electrostatic potentials are computed using the Spartan molecular electronic structure program [42]. The VS,max value, which represents the maximum potential value on the halogen, is a convenient measure of the magnitude of a σ-hole, and is used extensively throughout this work [41