The reaction catalysed by an enzyme depends on substrate availability. Whereas substrate access to the active site can be straightforward for enzymes whose active site is exposed to the surroundings, the situation is much more complicated for enzymes with buried active sites: substrate entry or product exit, occurring via tunnels, can define the rate-limiting step of the reaction [1
]. Here an immediate question appears: ‘Why has nature complicated the structure of some biocatalysts and hidden the active site inside a buried protein cavity?’ The constrained active site arrangement in 3D space, providing precise prepositioning of the amino acid functional groups, is one obvious answer [3
]. Tunnels provide access to the buried active sites and ensure more precise control of substrate selection [4
]. However, the possibility of controlling the reaction environment seems to be just as important as the geometrical constraints. Each enzyme provides its service immersed in the solvent that contributes to catalyst stability, activity, and selectivity [6
]. Since water molecules can greatly influence the catalytic event, a variety of mechanisms regulating water access to the active site have been developed during natural evolution [9
]. The division of the hydrophobic and hydrophilic compartments in the protein core can separate processes requiring distinct dielectric conditions. In enzymes with a buried active site connected with the surrounding solvent by tunnels, the water flow can be controlled much more precisely by the molecular properties of the amino acids constituting the tunnels or, in more sophisticated enzymes, by gates controlling the opening and closing of the access pathways [10
]. Such control mechanisms can be much more complicated than one might expect. Various arrangements of tunnels and gates can deliver substrate or water molecules ‘on request’, thus protecting hydrophobic conditions when required and guaranteeing water access for a hydrolytic event [13
]. However, our understanding of solvent distribution and the flow of water molecules inside the protein core has been limited, mainly due to the lack of proper tools facilitating such analysis. We found that the exchange of water molecules occupying the active site cavity and penetrating the protein core can be investigated reasonably successfully using classical molecular dynamics (MD) simulations. This is not a simple task, even though MD simulation packages contain a variety of tools for selecting atoms or molecules and for tracking collections of molecular entities. Identifying and tracking water molecules that enter regions important for catalysis requires screening the positions of several thousand individual molecules over several thousand MD steps. To fill the existing gap between tools searching for tunnels and pathways and advanced tools for accelerated water flux investigations, we have developed AQUA-DUCT, an easy-to-use tool facilitating the analysis of the behaviour of water (and, if necessary, other solvent molecules) penetrating any selected region in a protein [15
An epoxide hydrolase from Solanum tuberosum (StEH1) is a well-studied enzyme with a buried active site and a well-defined main tunnel providing access to the catalytic triad. As an attractive target for protein engineering, it has been intensively studied using both experimental [16
] and computational methods [18
]. Solanum tuberosum epoxide hydrolase belongs to the α/β-hydrolase family. The core domain comprises a central eight-stranded β-sheet flanked by α-helices, with a mainly helical cap-domain (or lid-domain) positioned over the core, forming a buried active site. The active site consists of a catalytic triad, D105, D265 and H300, of the α/β domain, with two tyrosines (Y154 and Y235) from the cap domain assisting in epoxide ring opening [16
Previously, several mutants have been designed to test and modify the mechanism of the reaction, substrate specificity, enantioselectivity, regioselectivity, and pH dependence (Supplementary Table S1
]. Most of the targeted residues were positioned in the binding site cavity; however, a few of them were distinct and were located on the protein surface (Supplementary Figure S1
). Initial studies were dedicated to investigating the role of active site residues. Elfström and Widersten [16
] mutated four catalytic candidate residues, D105, Y154, Y235, and H300, into residues with non-ionisable functional groups. They constructed seven mutants, D105A, Y154F, Y235F, Y154F/Y235F, H300A, H300N, and H300Q, with very low activity towards different TSO (trans-stilbene oxide) enantiomers and no detectable activity towards other tested epoxides [16
]. Almost a decade later, Amrein et al. [30
] described H300N and E35Q/H300N mutants in order to establish the principles underlying the activity and selectivity of StEH1 and proposed to expand the ‘catalytic triad’ to include E35 and H104 residues, which according to the authors’ results are indispensable for epoxide hydrolase activity. They resolved the crystal structure of the H300N mutant, which revealed a substantial perturbation of the active site.
Thomaeus et al. [19
] investigated the importance and role of a putative proton wire in the StEH1 catalytic mechanism. The proton transfer path was suggested to run from donor oxonium ions in the bulk solvent through a chain of water molecules leading to the Y154 residue in the active site. Water molecules in the main StEH1 tunnel were coordinated by the following residues (from the active site to the protein surface): hydroxyl group of Y154 → imidazole moiety of H153 → backbone carbonyl of L266 → hydroxyl of Y149 → backbone carbonyl of H269 → backbone carbonyl of P186. Three mutant variants, Y149F, H153F and Y149F/H153F, were constructed to validate the integral role of Y149 and H153 in this chain. The single mutations resulted in a protein with a shorter estimated half-life than the wild-type (about 1 h, compared with about 2 h and 15 min for the wild-type enzyme at 55 °C), whereas the double mutant showed a dramatic drop in enzyme activity with an interpolated half-life of 20 min. The single mutants H153F and Y149F also lost one and two protein-water hydrogen bonds, respectively, which destabilised the structure. The loss of three hydrogen bonds in the double mutant Y149F/H153F was linked with the lowering of the protein’s half-life.
The influence of salt bridges situated between the core and cap domain and on the surface on protein stability and regioselectivity was studied by Lindberg et al. [28
]. Four mutations, K179Q, E215Q, R236K and R236Q, introduced at the interface of the α/β-hydrolase fold core and the lid domains and between residues in the lid domain (residues 139–237) disrupted the salt-bridging interactions between K179-D202, E215-R41 and R236-E165 and caused increased flexibility of the protein, lower thermostability and a decrease in activity with a general trend of wild-type > K179Q > E215Q > R236K > R236Q.
The analysis of StEH1 enantioselectivity was performed in a series of studies [21
]. Four hotspots were targeted by random mutagenesis: a single F33 residue (hotspot A), the W106 and L109 residues (hotspot B), the V141, L145, and I155 residues (hotspot C) and the I180 and F189 residues (hotspot D) [21
]. Several constructed mutants were investigated in detail [21
] and the crystal structures were solved for four of them: R-C1 variant (V141K, I155V), R-C1B1 variant (W106L, L109Y, V141K, I155V), R-C1B1D33 variant (W106L, L109Y, V141K, I155V, F189L), and R-C1B1D33E6 variant (W106L, L109Y, V141K, I155V, F189L, L266G) [26
]. All the targeted residues were situated in the binding site cavity, apart from F189, which was located behind the active site cavity. The R-C1B1 variant had an increased volume of the binding site cavity caused by the Y109 side chain’s rotation away from the binding site. The additional F189L substitution in the R-C1B1D33 variant and the L266G in the R-C1B1D33E6 variant further increased the volume of the active site pocket. The R-C1B1 variant was studied by Bauer et al. [29
], who pointed out that the origin of the observed enantio- and regioselectivity is related not only to active site cavity enlargement but also to water penetration into the active site. Analysis of the F189L variant [26
] also showed the importance of the F33 residue, which was found to be involved in substrate binding and might explain the degree of conservation of this residue during directed evolution (only exchanges to tyrosine were observed) [22
Keeping in mind all the above-mentioned findings, we used AQUA-DUCT with wild-type StEH1 to investigate in detail the water flow in the vicinity of the binding site cavity and to identify residues that may be important for further modification of the activity and selectivity of this enzyme.
Depending on the planned improvement of the enzyme, protein redesign methods can vary significantly. A strategy aiming at protein stabilisation seeks to obtain better residue packing or the introduction of covalent bonds or salt bridges [44
]. On the other hand, modification of the enzyme’s selectivity targets the binding site cavity or the residues composing or controlling the access pathways [1
]. Recent examples show that substrate specificity regulation can be achieved by subtle changes, at a distance from the active site, which modify water or solvent accessibility [51
]. Also, the entropy contribution to both binding affinity and catalysis is greatly affected by internal water positioning and dynamics; mutations of residues that position water can thus modify enzyme properties significantly [54
]. Therefore, we applied a ligand tracking approach to examine water traffic in the interior of StEH1 and to identify potential hot-spots or regions suitable for further enzyme modification.
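The core idea of such tracking can be illustrated with a minimal sketch. It is not the AQUA-DUCT implementation: the function name `track_entries`, the spherical region, and the toy coordinates are all assumptions made purely for illustration. Each frame is represented as a mapping from a water molecule identifier to its position, and the sketch records the frames in which each molecule lies inside the region of interest.

```python
from math import dist

def track_entries(frames, center, radius):
    """Map each water ID to the frame indices in which it lies inside a sphere.

    Hypothetical helper for illustration; the region shape is an assumption.
    """
    visits = {}
    for t, frame in enumerate(frames):
        for water_id, xyz in frame.items():
            if dist(xyz, center) <= radius:
                visits.setdefault(water_id, []).append(t)
    return visits

# Toy trajectory: three frames, two water molecules (invented coordinates, in Å).
frames = [
    {1: (0.0, 0.0, 9.0), 2: (12.0, 0.0, 0.0)},
    {1: (0.0, 0.0, 4.0), 2: (11.0, 0.0, 0.0)},
    {1: (0.0, 0.0, 1.0), 2: (10.0, 1.0, 0.0)},
]
visits = track_entries(frames, center=(0.0, 0.0, 0.0), radius=5.0)
print(visits)
```

In this toy case only water 1 enters the region, in the second and third frames. A real analysis would repeat the same test for thousands of molecules over thousands of frames, which is why dedicated tooling is helpful.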
We assumed that our conceptually simple approach, the analysis of water distribution, can identify regions where water is either attracted by favourable interactions with nearby amino acids or trapped in hydrophobic cages. In both cases, such hot-spots can mark regions of particular importance for the enzyme’s functions. Indeed, using such a procedure for hot-spot identification in the protein core, we were able to detect the most important residues and regions in the StEH1 enzyme. The active site cavity, E35 (a residue responsible for catalytic water positioning), as well as amino acids regulating the access of water via the TC/M tunnel, were easily detected. Residues surrounding hot-spots at the TC/M and TM2 tunnel entries can contribute to the dynamics of the tunnel entrances and work as gates controlling not only solvent flow but also substrate entry or product egress. This was clearly visible during the first repetition, where high flexibility of the residues of the P247-P256 loop increased the traffic of water molecules via the TM2 tunnel. Besides these, we identified several key amino acids that can be considered for further modification of StEH1’s properties.
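One simple way to turn a water distribution into candidate hot-spots is to bin the traced positions on a regular grid and rank the cells by occupancy. The sketch below assumes this grid-counting scheme; the function name `density_hotspots`, the 1 Å cell size, and the sample coordinates are hypothetical, and the procedure actually used in this work may differ.

```python
from collections import Counter

def density_hotspots(positions, cell=1.0, top=3):
    """Rank cubic grid cells by the number of water positions they contain.

    Hypothetical helper: cell size and ranking rule are illustrative assumptions.
    """
    counts = Counter(
        tuple(int(c // cell) for c in xyz) for xyz in positions
    )
    return counts.most_common(top)

# Invented water oxygen positions pooled over all frames (in Å).
positions = [
    (0.2, 0.3, 0.1),
    (0.8, 0.9, 0.4),
    (0.5, 0.5, 0.5),
    (3.1, 0.0, 0.0),
]
hotspots = density_hotspots(positions, cell=1.0, top=1)
print(hotspots)
```

Cells with high counts mark positions where water is found repeatedly, whether because it is attracted by nearby residues or because it is trapped; distinguishing the two cases requires inspecting the surrounding amino acids.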
Using the water tracking approach, we identified and described three cavities that were directly connected with the active site and contributed to enzyme selectivity and activity. Two of them were previously reported [17
] as narrow volumes that lie on the ‘inside’ of the active site cavity and in chain B of the crystal structure; they were partly filled by the two aliphatic tails of valpromide. The authors pointed out that one branch (in our study cavity I) consisted of mixed hydrophobic and polar components and eventually led out to the solvent, while the other (cavity III) was lined almost exclusively with phenylalanine side chains and was completely enclosed. In our research, we could detect both cavities in all simulations. Cavity I was able to expand significantly, and thus protein dynamics allow it to host even larger substrates than valpromide (Supplementary Figure S5a,b
). In contrast to cavity I, cavity III had a rather constant size, probably due to the large number of π-π interactions between aromatic residues; however, in one simulation, significant enlargement was observed, with the cavity reaching surface residues. The AQUA-DUCT results suggested that both cavities can be permeable to water molecules; however, the events occurred over different time scales. Cavity I, which was lined by both polar and aromatic residues, provided fast transfer of water molecules, whereas the ‘aromatic’ cavity III trapped water molecules prior to exit or after entry. Interestingly, passages via both cavities were observed on only a few occasions over the entire simulation. According to evolutionary analysis, both cavities (I and III) are made up of non-conserved amino acids, and thus provide an opportunity for their modification, which could run in two directions: (i) widening of a potential tunnel and thus opening an alternative pathway that could modify solvent access and thus influence substrate specificity or selectivity; or (ii) reshaping of cavities to host larger substrates. The first strategy was successfully reported for haloalkane dehalogenase from Sphingomonas paucimobilis UT26 (LinB) [49
], where a de novo tunnel was designed, and for D-amino acid oxidase, where modification of selectivity was achieved by water access modification [53
]. The second strategy was previously applied to StEH1 [22
] and other epoxide hydrolases, for example Bacillus megaterium epoxide hydrolase [57
], where the cavity at the back was opened to accommodate larger substrates.
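The contrast drawn above between fast water transfer through cavity I and trapping in cavity III can be quantified by residence times. The following sketch assumes that, for a single water molecule, a per-frame flag is available indicating whether it is inside a given cavity; the function name `residence_stretches` and the example flags are hypothetical and are not taken from the simulations reported here.

```python
def residence_stretches(inside):
    """Return the lengths (in frames) of consecutive runs spent inside a cavity.

    `inside` is a per-frame sequence of booleans for one water molecule
    (hypothetical input format, chosen for illustration).
    """
    runs, current = [], 0
    for flag in inside:
        if flag:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs

# Invented example: a brief passage versus a long, trapped stay.
fast_transfer = residence_stretches([False, True, False, False, True, False])
trapped = residence_stretches([False, True, True, True, True, True])
print(fast_transfer, trapped)
```

Short runs are consistent with a molecule merely passing through, whereas long uninterrupted runs indicate trapping; multiplying the run lengths by the frame spacing converts them to time.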
Interesting suggestions can arise from the merged analysis of structural and evolutionary data. The low entropy of the residues composing cavity II suggests its importance in catalysis, whereas the more variable residues creating the walls of cavities I and III suggest the possibility of their reshaping towards changed substrate specificity with a potentially tolerable effect on enzyme stability. Indeed, among the residues mutated by Carlsson et al. [22
], there are two, F189 and L266, that make up the walls of cavity I. The F189L and L266G mutations increased the volume of the active-site pocket without rearranging the positions of other residues. In the case of the F189L substitution, the possibility of binding aromatic substrates in ‘sandwiched’ mode was lost [26
]. In contrast, the additional L266F mutation in the R-C1B1D33 variant had to influence the rearrangement of the surrounding amino acid side chains; however, there has been no experimental or theoretical investigation providing insight into the incorporated changes. All the tested mutants influenced the regio- and enantioselectivity of the enzyme, which proves the importance of this region in substrate positioning. The analysis of our data suggests several additional potential residues that could be used to reshape cavity I and thus further modify the enzyme’s properties.
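The notion of residue entropy used in this comparison can be made concrete with a short example. The sketch assumes Shannon entropy computed per column of a multiple sequence alignment, which is a common conservation measure; the exact measure and alignment used in this study are not restated here, and the function name `column_entropy` and the toy columns are hypothetical.

```python
from collections import Counter
from math import log2

def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column.

    Assumed conservation measure for illustration only.
    """
    counts = Counter(column)
    total = len(column)
    return sum(-(n / total) * log2(n / total) for n in counts.values())

# Toy alignment columns: one fully conserved, one split between two residues.
conserved = column_entropy("HHHH")
variable = column_entropy("FLFL")
print(conserved, variable)
```

A fully conserved position gives an entropy of zero, while a position split evenly between two residues gives one bit; low values therefore flag positions such as those lining cavity II, and higher values flag the more variable walls of cavities I and III.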
Two amino acids detected by our approach as residues surrounding the expanded cavity III have been mutated, and their impact on enzyme activity has been reported in the literature. The Y154F mutant was inactive, which verified the role of the hydroxyl group of the Y154 residue in the proposed reaction mechanism [16
]. The K179Q mutant was designed to examine salt-bridges’ role in enzyme stability [28
]. To our knowledge, none of the other residues surrounding cavity III has been used for StEH1 redesign so far.
As we have already pointed out, in contrast to the non-conserved cavities I and III, cavity II was found to be built of highly conserved residues. In fact, it is located in close vicinity to the 36HGXP39 sequence (where X is an aromatic residue), one of the most highly conserved motifs among EHs due to its crucial role in shaping the active site. Without the cis-peptide bond provided by this motif, the catalytic water molecule would not be correctly placed for the second step of the reaction. The alkyl-enzyme intermediate is hydrolysed by the attack of a water molecule that has been activated through proton extraction using the H300-D265 charge relay. In the crystal structures, a crystallographic water molecule (water molecules 21 and 49 in structures A and B, respectively) was found to be ideally positioned to play this role. Its position was stabilised by the side chains of H300 and E35, as well as the backbone carbonyl oxygen of F33 [17
]. The crystallographic water molecule was positioned at the entrance to cavity II. In two repetitions of our MD simulations, we detected an opening of cavity II in a manner that allowed it to accommodate two water molecules. Thus, cavity II can act as a reservoir of the water molecules required for the reaction, and protein dynamics can ensure water molecule availability after formation of the alkyl-enzyme intermediate. As we stated in the introduction, the importance of the E35 and H104 residues was confirmed by Amrein et al., and led these authors to propose the expansion of the catalytic triad with these two residues [30
]. We should also mention that in contrast to cavities I and III, we did not observe any leakage of water molecules from cavity II, which suggests that the control of the water molecule’s presence in the position required for the reaction is truly exceptional.