This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
In this work, the partition method introduced by Carvalho and Melo was used to study the complex between Cucurbita maxima trypsin inhibitor (CMTII) and glycerol at the AM1 level. An effective potential, combining nonbonding and polarization plus charge transfer (PLCT) terms, was introduced to evaluate the magnitude of the interaction between each amino acid and the ligand. In this case study, the noncompensation between nonbonding and PLCT terms characterizes the stabilization energy of the association process under study. The main residues responsible for the stability of the protein-ligand complex, namely Glu29, Cys3 and Arg5 (with net attractive effects) and Arg1 (with a net repulsive effect), are associated with large nonbonding energies that are not compensated by PLCT effects. The results obtained enable us to conclude that the present decomposition scheme can be used for understanding the cohesive phenomena in proteins.
1. Introduction
Computational methods are of great interest for evaluating binding affinities between proteins and ligands, with many applications in structure-based drug design (SBDD) [1]. A complete description of the corresponding molecular interactions, including the short-range polarization plus charge transfer (PLCT) effects, can only be carried out at a quantum mechanics (QM) level. However, the more accurate QM methods require very large computational resources, and this has limited high-level theoretical studies in this field [2]. Current methods using classical potentials can only provide a good approximation for the nonbonding protein-ligand interactions, because they are usually designed to treat the QM effects in an average manner. For this purpose, QM methods have been used to parameterize force fields [3–7] and scoring functions [8]. This has enabled computationally intensive SBDD studies at lower theoretical levels. The hybrid quantum mechanics/molecular mechanics (QM/MM) methods are an interesting alternative approach for this type of problem, because they combine an appropriate description of the molecular system with moderate computational resources [9, 10]. Good QM/MM methods require a rigorous partition of the molecular system into two regions: a strongly perturbed region that should be described at a quantum level and a bulk region whose interactions can be reproduced by classical potentials. In this context, the energy partition schemes [11–13] can provide rational criteria to define these regions. In vitro methods and mathematical models for enzymatic reactions [14–16] are the prototype for evaluating kinetics and binding affinities between proteins and ligands.
In this work, the application of the Semi-Empirical Energy Based (SEEB) partition method introduced by Carvalho and Melo [17] to protein-ligand association processes was analyzed. This method enables the decomposition of the stabilization energy into both physically meaningful and spatial components. As this formalism was developed at a semiempirical quantum level, it also enables the complete separability of these components. Here, the SEEB formalism was extended to describe protein-ligand interactions using a pairwise potential. The SEEB method was then used to study the association between the Cucurbita maxima trypsin inhibitor (CMTII) and glycerol. CMTII is well known for its biological importance [18, 19]. Glycerol is a cryoprotectant [20], which should be washed away with solvent in the crystallization process. However, it forms a stable complex with CMTII that is detected in the crystallized structure. In this context, glycerol should be considered to have a large affinity for CMTII, and this study can provide further insight for the rational modeling of highly specific ligands for this protein.
2. Methods
A non-covalent (no) association between a protein (P) with n amino acid residues and a ligand (L) can be represented by equation (1):
$$P + L \rightleftharpoons P{:}L \tag{1}$$
A general association process can be described by a hypothetical mechanism involving two steps (see Figure 1). In the first step, the molecular monomers (P and L) are rearranged, assuming the geometries adopted in the dimer.
In the second step, the rearranged species (P(rearr) and L(rearr)) associate with each other, preserving their geometries and originating the dimer (P:L). According to the SEEB formalism, the stabilization energy can be partitioned as:
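$$\Delta E^{no} = \Delta E_{1}^{no} + \Delta E_{2}^{no} \tag{2}$$
In equation (2), the conformational rearrangement component (ΔE_{rearr}^{no}) is associated with step 1, while the polarization plus charge transfer (ΔE_{PLCT}^{no}) and nonbonding (ΔE_{n/bond}^{no}) components are associated with step 2:
$$\Delta E_{1}^{no} = \Delta E_{rearr}^{no} = \Delta E_{rearr}^{no}(P) + \Delta E_{rearr}^{no}(L) \tag{3}$$
$$\Delta E_{2}^{no} = \Delta E_{PLCT}^{no} + \Delta E_{n/bond}^{no} \tag{4}$$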
In equations (2) to (4), the superscript (no) indicates that the P:L association is of non-covalent nature. The nomenclature used for binding states is presented elsewhere [17]. The exact definition of the terms occurring in equations (3) and (4) is presented in Appendix A.
An effective pairwise potential, combining nonbonding and PLCT contributions, is proposed in this work:
$$\Delta E_{eff}^{no} = \sum_{A=1}^{n} \Delta E_{eff}^{no}(AL) \tag{5}$$
In this context, equation (4) can be rewritten as:
$$\Delta E_{2}^{no} = \Delta E_{eff}^{no} \tag{6}$$
The transformation of the exact interaction energy decomposition (equations (2) to (4)) into the effective pair-additive expression (5) is developed and justified in Appendix B.
Consistent with the SEEB formalism, the stabilization energy associated with step 2 can also be partitioned into strongly perturbed (ΔE_{pert}^{no}) and bulk (ΔE_{bulk}^{no}) components:
$$\Delta E_{2}^{no} = \Delta E_{pert}^{no} + \Delta E_{bulk}^{no} \tag{7}$$
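Both components can then be partitioned into long-range nonbonding and short-range PLCT terms:
$$\Delta E_{pert}^{no} = \Delta E_{pert,n/bond}^{no} + \Delta E_{pert,PLCT}^{no}$$
$$\Delta E_{bulk}^{no} = \Delta E_{bulk,n/bond}^{no} + \Delta E_{bulk,PLCT}^{no}$$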
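For this purpose, the amino acid residues have to be divided between these two regions. The extension of the perturbed region should be selected to minimize the absolute value of the bulk component (ΔE_{bulk}^{no}) and especially of its short-range PLCT term (ΔE_{bulk,PLCT}^{no}). Within a hypothetical hybrid (QM/MM) model, it is also essential that the long-range nonbonding bulk energy (ΔE_{bulk,n/bond}^{no}) can be reproduced by molecular mechanics,
$$\Delta E_{bulk,n/bond}^{no} \approx \sum_{A \in bulk} \; \sum_{X=N_{A}}^{M_{A}} \; \sum_{Y=N_{L}}^{M_{L}} \left[ \frac{q_{X} q_{Y}}{r_{XY}} + \frac{C_{X,Y}}{r_{XY}^{12}} - \frac{D_{X,Y}}{r_{XY}^{6}} \right] \tag{14}$$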
where q_{X} is the charge of atom X, q_{Y} is the charge of atom Y, C_{X,Y} and D_{X,Y} are the van der Waals parameters associated with atoms X and Y, and r_{XY} is the distance between the same atoms. In equation (14), N_{A} and M_{A} are respectively the first and the last atoms of amino acid A. In the same equation, N_{L} and M_{L} have the same meaning for the ligand. In this work, the AMBER99 force field [21] was used to parameterize the bulk terms (14), and the atomic point charges (q_{X} and q_{Y}) were calculated using both the Mulliken [22] and Merz-Kollman [23] schemes. On the other hand, the effective interaction energy between a residue (C) included in the strongly perturbed region and the ligand has to be calculated at a quantum level.
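The classical bulk term described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes a Coulomb plus 12-6 Lennard-Jones form with C_{X,Y} as the repulsive and D_{X,Y} as the attractive coefficient, uses charges in units of e and distances in Å, and all function and variable names (`pair_nonbonding_energy`, `bulk_nonbonding_energy`, `vdw_params`) are hypothetical rather than taken from AMBER or from the authors' code.

```python
from math import dist

# e^2 / (4*pi*eps0) expressed in kJ mol^-1 Å e^-2 (unit-conversion constant).
COULOMB_KJ_MOL_ANGSTROM = 1389.35


def pair_nonbonding_energy(residue_atoms, ligand_atoms, vdw_params):
    """Classical residue-ligand interaction energy in kJ mol^-1.

    Each atom is a tuple (atom_type, charge, (x, y, z)).
    vdw_params maps (type_x, type_y) to (C_xy, D_xy); assumed form:
    Coulomb + C/r^12 - D/r^6.
    """
    energy = 0.0
    for type_x, q_x, pos_x in residue_atoms:
        for type_y, q_y, pos_y in ligand_atoms:
            r = dist(pos_x, pos_y)
            c_xy, d_xy = vdw_params[(type_x, type_y)]
            energy += COULOMB_KJ_MOL_ANGSTROM * q_x * q_y / r
            energy += c_xy / r**12 - d_xy / r**6
    return energy


def bulk_nonbonding_energy(bulk_residues, ligand_atoms, vdw_params):
    """Sum the pair energies over all residues assigned to the bulk region."""
    return sum(
        pair_nonbonding_energy(atoms, ligand_atoms, vdw_params)
        for atoms in bulk_residues.values()
    )
```

Swapping the charge set (Mulliken or Merz-Kollman) only changes the `charge` entries of the atom tuples, which is how the two parameterizations can be compared against the quantum value.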
In this work, the association of CMTII and glycerol was studied at a semiempirical level [26], AM1, using the modified SEEB formalism described above. The initial structure of the CMTII-glycerol complex was obtained from X-ray crystallography at 1.03 Å resolution [24] and can be found in the Protein Data Bank (PDB, 2004) under the reference 1LU0. The geometries of all species (both monomers and the complex) were optimized using the MOPAC2002 package [25] on a Pentium 4 computer.
3. Discussion
The physically meaningful components of the stabilization energy of the CMTII-glycerol complex, obtained using the modified SEEB formalism, are presented in Table 1. The nonbonding term is the dominant component of the stabilization energy. However, the PLCT and the conformational rearrangement components have an important corrective effect.
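As a quick arithmetic check of this decomposition, the following Python sketch sums the components reported in Table 1. The dictionary and function names are illustrative only and are not part of the SEEB implementation.

```python
# Totals reported in Table 1 (kJ mol^-1).
components = {"nonbonding": -147.50, "plct": 68.45, "rearrangement": 50.00}

# Protein and ligand contributions reported in Table 1 (kJ mol^-1).
plct_parts = {"protein": 35.30, "ligand": 33.14}
rearr_parts = {"protein": 33.11, "ligand": 16.89}


def stabilization_energy(parts):
    """Stabilization energy as the plain sum of its components."""
    return sum(parts.values())


total = stabilization_energy(components)
print(f"Stabilization energy: {total:.2f} kJ/mol")
print(f"PLCT (protein + ligand): {sum(plct_parts.values()):.2f} kJ/mol")
print(f"Rearrangement (protein + ligand): {sum(rearr_parts.values()):.2f} kJ/mol")
```

The three totals add up to the reported stabilization energy, and the protein and ligand PLCT contributions match the reported total to within rounding (0.01 kJ mol^{−1}).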
The nonbonding interaction energies between the amino acids and the ligand are presented in Figure 2.
Five residues (Arg1, Cys3, Arg5, Cys28 and Glu29) have the most relevant contributions, which correspond to absolute values larger than 10 kJ mol^{−1}. Three residues (Cys3, Arg5 and Cys28) are involved in specific hydrogen bonds with the hydroxyl groups of glycerol (see Figure 3). Arg5 is of particular importance, because this residue establishes two hydrogen bonds of this type and is responsible for approximately 57 % of the nonbonding energy. The two terminal residues (Arg1 and Glu29) are connected by an internal hydrogen bond involving the guanidinium (Arg1) and carboxylate (Glu29) groups.
In the complex under study, the most electropositive groups of glycerol are oriented in the opposite direction to this guanidinium-carboxylate salt bridge (see Figures 3 and 4). Therefore, the attractive (Glu29) and repulsive (Arg1) contributions of these residues can be explained by nonspecific electrostatic interactions with the ligand. The PLCT energy is represented in Figure 5 as a sum of intra- and interfragment terms (see equation A9).
Residue-additive effective PLCT terms, defined according to equations (B1) and (B2), are presented in Figure 6. To obtain a bulk region consistent with the requirements presented in the previous section, all the residues having effective PLCT contributions larger in absolute value than 1.0 kJ mol^{−1} and/or a nonbonding interaction with the ligand larger in absolute value than 3.0 kJ mol^{−1} were included in the strongly perturbed region. Twelve residues (Arg1, Val2, Cys3, Pro4, Arg5, Ile6, Asp13, Glu19, Cys20, Tyr27, Cys28 and Glu29) satisfied this requirement.
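The selection rule above can be expressed as a simple threshold test. The sketch below is illustrative: the function name `assign_regions` is hypothetical, the Cys28 values are those quoted in this section, and the entries named ResX, ResY and ResZ are made-up placeholders, not results from this work.

```python
PLCT_THRESHOLD = 1.0        # kJ mol^-1, on |effective PLCT energy|
NONBONDING_THRESHOLD = 3.0  # kJ mol^-1, on |nonbonding energy with the ligand|


def assign_regions(residue_energies,
                   plct_threshold=PLCT_THRESHOLD,
                   nonbonding_threshold=NONBONDING_THRESHOLD):
    """Split residues into (perturbed, bulk) lists.

    residue_energies maps a residue name to a tuple
    (effective PLCT energy, nonbonding energy), both in kJ mol^-1.
    A residue is perturbed if either absolute value exceeds its threshold.
    """
    perturbed, bulk = [], []
    for name, (plct_eff, nonbonding) in residue_energies.items():
        if abs(plct_eff) > plct_threshold or abs(nonbonding) > nonbonding_threshold:
            perturbed.append(name)
        else:
            bulk.append(name)
    return perturbed, bulk


example = {
    "Cys28": (15.8, -15.2),  # values quoted in the text
    "ResX": (0.4, -1.2),     # hypothetical placeholder
    "ResY": (-1.5, 0.8),     # hypothetical placeholder
    "ResZ": (0.2, 3.5),      # hypothetical placeholder
}
perturbed, bulk = assign_regions(example)
```

Because the two criteria are joined by "and/or", a residue with a small nonbonding energy can still enter the perturbed region through its PLCT term, as ResY does in this toy input.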
The global contributions of the amino acid residues to the stability of the CMTII:glycerol complex were evaluated using the effective pairwise potential introduced in the previous section. The results obtained are presented in Figure 7. Including these corrective effects, only four residues (Arg1, Cys3, Arg5 and Glu29) were found to have significant effective contributions (larger than 10 kJ mol^{−1} in absolute value) to the stabilization energy. A fifth residue (Cys28) had been previously identified as relevant, because its nonbonding interaction energy with the ligand is markedly negative (−15.2 kJ mol^{−1}). However, this residue has an opposite effective PLCT energy of 15.8 kJ mol^{−1}, and its overall contribution is nearly null.
The partition into the strongly perturbed and bulk components is presented in Table 2. These components were evaluated using quantum and classical formalisms. The results obtained enable us to conclude that the interaction energy of the bulk region with the ligand is well reproduced by classical potentials when Merz-Kollman charges are used.
4. Conclusions
Using the SEEB formalism, the association of CMTII with glycerol was analysed. The stability of the corresponding complex can be mostly associated with four residues (Arg1, Cys3, Arg5 and Glu29). The residues were also divided between a strongly perturbed and a bulk region. This spatial partition was carried out according to the requirements previously discussed. In fact, the corresponding short-range bulk component (ΔE_{bulk,PLCT}^{no}) is 1.86 kJ mol^{−1}, representing only 6.3 % of the total stabilization energy (−29.05 kJ mol^{−1}), and the long-range bulk component (ΔE_{bulk,n/bond}^{no}) is well reproduced by an AMBER-type potential. In a previous work, it was verified that Mulliken charges are more appropriate to reproduce this type of potential in small dimeric species [17]. However, in the globular protein environment studied in this work, the Merz-Kollman charges seem to be preferable.
The spatial decomposition used in this study constitutes a rational methodology to build QM/MM models. For this purpose, the short-range PLCT term (ΔE_{bulk,PLCT}^{no}) can be neglected and the long-range nonbonding term (ΔE_{bulk,n/bond}^{no}) can be calculated at a molecular mechanics level. On the other hand, the strongly perturbed region should be described at an appropriate quantum level.
The modified SEEB method introduced in this work enables the description of a protein-ligand association process in terms of a pairwise interaction potential. This effective potential includes the nonbonding interaction between each pair and the corresponding PLCT correction associated with electronic rearrangement effects. The associated (amino acid-ligand) pairwise energies can be regarded as physically meaningful components, which can provide an important contribution to a better understanding of protein-ligand association processes.
We thank the Fundação para a Ciência e a Tecnologia (FCT) for a doctoral scholarship (SFRH/BD/17900/2004) granted to Alexandre R. F. Carvalho. Financial support from Fundação para a Ciência e a Tecnologia (FCT), project POCI/QUI/55673/2004 is gratefully acknowledged.
Appendix A. Detailed Description of the Physically Meaningful Decomposition
The physically meaningful decomposition, presented in equations (3) and (4) of the second section, is discussed here in detail.
The conformational rearrangement components, ΔE_{rearr}^{no}(P) and ΔE_{rearr}^{no}(L) in equation (3), are defined as,
where the superscript (∞) represents a nonbinding state in which the fragments P and L are at infinite distance. These components are associated with the conformational energy rearrangements occurring in step 1 for the protein (ΔE_{rearr}^{no}(P)) and the ligand (ΔE_{rearr}^{no}(L)), respectively. From a physical point of view, each of these terms represents the energy cost for the corresponding species (P or L) associated with the transition from the free optimized structure to the free rearranged structure, which is the most appropriate for the P:L docking.
$$P \xrightarrow{\;\Delta E_{rearr}^{no}(P)\;} P(rearr) \qquad L \xrightarrow{\;\Delta E_{rearr}^{no}(L)\;} L(rearr)$$
Consequently, the total conformational rearrangement energy component (ΔE_{rearr}^{no}) is calculated as the sum of the (A1) and (A2) terms:
$$\Delta E_{rearr}^{no} = \Delta E_{rearr}^{no}(P) + \Delta E_{rearr}^{no}(L) \tag{A3}$$
The PLCT and nonbonding components are associated with step 2 of the hypothetical mechanism presented in Figure 1. In this step, the rearranged fragments are docked, preserving their internal geometries (P(rearr) and L(rearr)) and originating the final optimized complex (P:L). In this sense, these components have a purely electronic nature, occurring without any modification of the intrafragment nuclear positions.
Each of these terms represents the electronic energy cost for the corresponding fragment, associated with the transition of the (P(rearr) or L(rearr)) species from the free rearranged state (∞) to the binding state (no), where the dimeric P:L species is formed. These terms originate from polarization plus charge transfer effects, associated with the electronic density redistribution within each fragment (P(rearr) or L(rearr)) and the electronic charge transfer between these species. These components involve energy terms that exist both in the free rearranged species (P(rearr) or L(rearr)) and in the dimer (P:L). However, the associated values are not conserved between these two states, owing to the electronic effects mentioned above. Each of these terms includes the intrafragment kinetic electronic energy, the attraction energies between the electronic density of the fragment and its nuclei, and the repulsion energy associated with its electronic density.
Considering that the protein (P) is constituted by n amino acid residues, its PLCT component can be partitioned into intra- and inter-residue terms:
$$\Delta E_{PLCT}^{no}(P) = \sum_{A=1}^{n} \Delta E^{no}(A) + \sum_{A=1}^{n} \sum_{B=1}^{A-1} \Delta E^{no}(AB)$$
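Finally, the total PLCT component can be calculated as:
$$\Delta E_{PLCT}^{no} = \sum_{A=1}^{n} \Delta E^{no}(A) + \sum_{A=1}^{n} \sum_{B=1}^{A-1} \Delta E^{no}(AB) + \Delta E^{no}(L) \tag{A9}$$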
In equation (A9), the terms ΔE^{no}(A) and ΔE^{no}(L) correspond to the energy variations during the mentioned step 2, associated with the amino acid A and the ligand respectively. In the same equation, ΔE^{no}(AB) is associated with the variation in interaction energy between the amino acids A and B during the same step.
The nonbonding component ΔE_{n/bond}^{no} can be calculated as a sum of pair-additive terms,
$$\Delta E_{n/bond}^{no} = \sum_{A=1}^{n} \Delta E^{no}(AL) \tag{A10}$$
where ΔE^{no}(AL) is the interaction energy between the amino acid A and the ligand (L), which only exists in the dimer. This term includes the attraction energies between the electronic density of A and the nuclei of L, the attraction energies between the electronic density of L and the nuclei of A, the repulsion energy between the electronic densities of A and L, and the repulsion energies between the nuclei of the two species.
Appendix B. Detailed Description and Justification of the Pairwise Decomposition
The pairwise decomposition, presented in equation (5) of the second section, is discussed and justified here in detail. From equation (A9) in Appendix A, it is clear that the PLCT component is not residue-additive, owing to the occurrence of the inter-residue terms (ΔE^{no}(AB)). However, effective additive terms can be obtained by dividing these interaction energies equally between the two partner residues (A and B) involved:
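$$\Delta E_{PLCT}^{no} = \sum_{A=1}^{n} \Delta E_{eff}^{no}(A) + \Delta E^{no}(L) \tag{B1}$$
with
$$\Delta E_{eff}^{no}(A) = \Delta E^{no}(A) + \frac{1}{2} \sum_{\substack{B=1 \\ B \neq A}}^{n} \Delta E^{no}(AB) \tag{B2}$$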
For non-covalent interactions, the simplification adopted in equations (B1) and (B2) is a natural criterion to divide the inter-residue interaction energies. For covalent interactions, this corresponds to a Mulliken-like approach that involves some degree of arbitrariness. However, in general, the methodology proposed here seems to be a reasonable approach to obtain effective PLCT terms.
An effective interaction potential between the ligand L and an amino acid A (ΔE_{eff}^{no}(AL)), combining nonbonding and PLCT contributions, can now be obtained from equations (A10), (B1) and (B2). This potential evaluates the nonbonding interaction between A and L, corrected by the PLCT reorganization associated with both species.
$$\Delta E_{eff}^{no}(AL) = \Delta E^{no}(AL) + \Delta E_{eff}^{no}(A) + \alpha_{AL} \times \Delta E^{no}(L) \tag{B3}$$
In equation (B3), the coefficient (α_{AL}) is defined as,
$$\alpha_{AL} = \frac{\Delta E^{no}(AL)}{\Delta E^{no}} \tag{B4}$$
The mutual PLCT energy cost of two species (A and L) is clearly correlated with the magnitude of the corresponding interaction. In fact, if the AL interaction is extremely weak, their mutual polarization effects can be neglected. On the other hand, if the amino acid A strongly interacts with L, these polarization effects should be very important. Almost all of the PLCT effect induced on amino acid A originates, directly or indirectly, from the interaction with the ligand. In turn, the PLCT effects in the ligand are induced by its interactions with the amino acids of the protein (P). In this context, the coefficient (α_{AL}) can be considered a good weighting factor to evaluate the degree of contribution of the AL interaction to the PLCT energy cost of the ligand.
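The construction in equations (B1) to (B4) can be illustrated with a short Python sketch. The function names and the toy numbers are hypothetical. By default the sketch normalizes α_{AL} by the total nonbonding energy, which is an assumption made here so that the weights sum to one and the pairwise terms add up to the step 2 energy; a different normalization can be passed explicitly if equation (B4) is read otherwise.

```python
def effective_plct(intra, inter):
    """Residue-additive PLCT terms, following (B2).

    intra: {A: dE(A)}; inter: {(A, B): dE(AB)} with each unordered pair once.
    Each inter-residue term is split equally between its two partners.
    """
    eff = dict(intra)
    for (a, b), energy in inter.items():
        eff[a] += energy / 2.0
        eff[b] += energy / 2.0
    return eff


def effective_pair_energies(nonbonding, intra, inter, ligand_plct,
                            normalization=None):
    """Effective residue-ligand energies, following (B3).

    nonbonding: {A: dE(AL)}; ligand_plct: dE(L).
    normalization: denominator of alpha_AL; defaults to the total
    nonbonding energy (an assumption of this sketch).
    """
    if normalization is None:
        normalization = sum(nonbonding.values())
    eff = effective_plct(intra, inter)
    return {
        a: nonbonding[a] + eff[a] + (nonbonding[a] / normalization) * ligand_plct
        for a in nonbonding
    }


# Toy two-residue system (made-up numbers, kJ mol^-1).
nonbonding = {"A1": -10.0, "A2": -30.0}
intra = {"A1": 2.0, "A2": 4.0}
inter = {("A1", "A2"): 1.0}
ligand_plct = 8.0

pair = effective_pair_energies(nonbonding, intra, inter, ligand_plct)
step2 = (sum(nonbonding.values()) + sum(intra.values())
         + sum(inter.values()) + ligand_plct)
```

With this normalization, the sum of the pairwise energies equals the nonbonding plus PLCT energy of step 2 for the toy input, which is the property required by equation (6).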
References and Notes
[1] Kuntz ID. Structure-based strategies for drug design and discovery.
[2] Netzeva TI, Aptula AO, Benfenati E, Cronin MTD, Gini G, Lessigiarska I, Maran U, Vracko M, Schüürmann G. Description of the Electronic Structure of Organic Chemicals Using Semiempirical and Ab Initio Methods for Development of Toxicological QSARs.
[3] Weiner PK, Kollman PA. AMBER: Assisted model building with energy refinement. A general program for modeling molecules and their interactions.
[4] Weiner SJ, Kollman PA, Case DA, Singh UC, Ghio C, Alagona G, Profeta S, Weiner PK. A new force field for molecular mechanical simulation of nucleic acids and proteins.
[5] Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM, Ferguson DM, Spellmeyer DC, Fox T, Caldwell JW, Kollman PA. A Second Generation Force Field for the Simulation of Proteins.
[6] Brooks BR, Bruccoleri RF, Olafson BD, States DJ, Swaminathan S, Karplus M. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations.
[7] MacKerell AD Jr, Wiorkiwicz-Kuczera J, Karplus M. An all-atom empirical energy function for the simulation of nucleic acids.
[8] Raha K, Merz KM Jr. Large-Scale Validation of a Quantum Mechanics Based Scoring Function: Predicting the Binding Affinity and the Binding Mode of a Diverse Set of Protein-Ligand Complexes.
[9] Lipkowitz KB, Boyd DB. Methods and Applications of Combined Quantum Mechanical and Molecular Mechanical Potentials.
[10] Friesner RA, Guallar V. Ab Initio Quantum Chemical and Mixed Quantum Mechanics/Molecular Mechanics (QM/MM) Methods for Studying Enzymatic Catalysis.
[11] Melo A, Ramos M. A new partitioning scheme for molecular interacting systems within a multiconfigurational or monoconfigurational Hartree-Fock formalism.
[12] Dixon SL, Merz KM Jr. Combined quantum mechanical/molecular mechanical methodologies applied to biomolecular systems.
[13] Mayer I, Salvador P. Overlap populations, bond orders and valences for 'fuzzy' atoms.
[14] Putz MV, Lacrămă AM. Enzymatic control of the bio-inspired nanomaterials at the spectroscopic level.
[15] Putz MV, Lacrămă AM, Ostafe V. Introducing logistic enzyme kinetics.
[16] Putz MV, Lacrămă AM, Ostafe V. Full Analytic Progress Curves of the Enzymic Reactions in Vitro.
[17] Carvalho ARF, Melo A. Energy partitioning in association processes.
[18] Carvalho ARF, Melo A. Natural inhibitors of proteases – pharmacological target for destabilization/stabilization of the protease/inhibitor complex.
[19] Wynn R, Laskowski M Jr. Inhibition of human β-factor XII_{A} by squash family serine proteinase inhibitors.
[20] Mitchell EP, Garman EF. Flash freezing of protein crystals: investigation of mosaic spread and diffraction limit with variation of cryoprotectant concentration.
[21] Case DA.
[22] Mulliken RS. Electronic Population Analysis on LCAO-MO Molecular Wave Functions. I.
[23] Besler HB, Merz MK, Kollman PA. Atomic Charges Derived from Semiempirical Methods.
[24] Thaimattam R, Tykarska E, Bierzynski A, Sheldrick GM, Jaskolski M. Atomic resolution structure of squash trypsin inhibitor: unexpected metal coordination.
[25] Stewart JJP.
[26] Bredow T, Jug K. Theory and Range of Modern Semiempirical Molecular Orbital Methods.
Figures and Tables
Figure 1. Hypothetical mechanism for a non-covalent association between a protein (P) and a ligand (L), which enables the application of the SEEB formalism to partition the associated stabilization energy into physically meaningful components.
Figure 2. Nonbonding amino acid residue (A)-ligand (L) interaction energies for the CMTII:glycerol complex. The residues included in the strongly perturbed region are identified by the symbol (P). The residues that strongly interact with the ligand are identified by the symbol (S).
Figure 3. Schematic representation of the strongest amino acid residue-ligand interactions (nonbonding interaction energies with absolute values larger than 10 kJ mol^{−1}) in the CMTII-glycerol complex.
Figure 4. Three-dimensional structure of the CMTII-glycerol complex. Glycerol is represented in CPK, the strongly perturbed region is represented in bold black, and the bulk region is represented in gray lines.
Figure 5. Intra- and interfragment terms of the polarization plus charge transfer (PLCT) energy. The amino acids are numbered sequentially. Residue 30 corresponds to glycerol.
Figure 6. Residue (A) additive effective PLCT energies. The residues included in the strongly perturbed region are identified by the symbol (P).
Figure 7. Effective amino acid residue (A)-ligand (L) interaction energies for the CMTII:glycerol complex. The residues included in the strongly perturbed region are identified by the symbol (P). The residues that strongly interact with the ligand are identified by the symbol (S).
Table 1. Physically meaningful components (kJ mol^{−1}) of the stabilization energy for the association between CMTII and glycerol.

| | ΔE_{n/bond}^{no} | ΔE_{PLCT}^{no} | ΔE_{rearr}^{no} | ΔE^{no} |
|---|---|---|---|---|
| Protein | | 35.30 | 33.11 | |
| Ligand | | 33.14 | 16.89 | |
| Total | −147.50 | 68.45 | 50.00 | −29.05 |
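Table 2. Spatial components (kJ mol^{−1}) of the stabilization energy for the association between CMTII and glycerol.

| Component | Method | Value |
|---|---|---|
| ΔE_{rearr}^{no} | | 50.00 |
| ΔE_{bulk,PLCT}^{no} | | 1.86 |
| ΔE_{bulk,n/bond}^{no} | Q | 1.277 |
| ΔE_{bulk,n/bond}^{no} | Ml | 0.300 |
| ΔE_{bulk,n/bond}^{no} | MK | 1.347 |
| ΔE_{bulk}^{no} | | 3.13 |
| ΔE_{pert,PLCT}^{no} | | 66.59 |
| ΔE_{pert,n/bond}^{no} | | −148.77 |
| ΔE_{pert}^{no} | | |
| ΔE^{no} | | |

Q: quantum; Ml: molecular mechanics with Mulliken charges; MK: molecular mechanics with Merz-Kollman charges.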
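The entries of Table 2 can be cross-checked against the partition of the step 2 energy into perturbed and bulk components. The sketch below is a consistency check only: it assumes that the three bulk nonbonding values are listed in the order Q, Ml, MK, and the perturbed and total energies it derives are computed here from the listed components rather than read from the table.

```python
# Values listed in Table 2 (kJ mol^-1); Q/Ml/MK ordering is an assumption.
rearr = 50.00
bulk_plct = 1.86
bulk_nonbonding = {"Q": 1.277, "Ml": 0.300, "MK": 1.347}
pert_plct = 66.59
pert_nonbonding = -148.77

# Bulk and perturbed components as sums of their PLCT and nonbonding terms.
bulk = bulk_plct + bulk_nonbonding["Q"]
pert = pert_plct + pert_nonbonding

# Stabilization energy rebuilt from rearrangement + perturbed + bulk terms.
total = rearr + pert + bulk

# Deviation of each classical charge scheme from the quantum bulk value.
mm_errors = {
    scheme: value - bulk_nonbonding["Q"]
    for scheme, value in bulk_nonbonding.items()
    if scheme != "Q"
}

print(f"bulk = {bulk:.2f}, pert = {pert:.2f}, total = {total:.2f} kJ/mol")
for scheme, error in mm_errors.items():
    print(f"{scheme}: deviation from Q = {error:+.3f} kJ/mol")
```

Under these assumptions, the rebuilt total agrees with the stabilization energy of Table 1 to within rounding, and the Merz-Kollman value deviates from the quantum bulk nonbonding energy by much less than the Mulliken value.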