Modelling Complex Bimolecular Reactions in a Condensed Phase: The Case of Phosphodiester Hydrolysis

(1) Background: the theoretical modelling of reactions occurring in liquid phase is a research line of primary importance both in theoretical–computational chemistry and in the context of organic and biological chemistry. Here we present the modelling of the kinetics of the hydroxide-promoted hydrolysis of phosphoric diesters. (2) Method: the theoretical–computational procedure involves a hybrid quantum/classical approach based on the perturbed matrix method (PMM) in conjunction with molecular mechanics. (3) Results: the presented study reproduces the experimental data both in the rate constants and in the mechanistic aspects (C–O bond vs. O–P bond reactivity). The study suggests that the basic hydrolysis of phosphodiesters occurs through a concerted ANDN mechanism, with no formation of penta-coordinated species as reaction intermediates. (4) Conclusions: the presented approach, despite the approximations, is potentially applicable to a large number of bimolecular transformations in solution and therefore leads the way to a fast and general method to predict the rate constants and reactivities/selectivities in complex environments.


Introduction
Phosphodiester bonds are the linkages that connect biological information stored in nucleic acids. Their resistance to hydrolysis [1] makes them ideal for preserving biochemical information. However, phosphodiester hydrolysis is a crucial process in biochemical systems and for this reason, nature has developed a number of enzymes able to catalyse the hydrolysis of these compounds in aqueous solution. The rate enhancements induced by these enzymes (e.g., kinases, ATPases, phosphatases) are astonishing (up to 10 17 in the case of staphylococcal nuclease) [2,3]. On the other hand, numerous research groups have designed and synthesized small molecule enzyme mimics able to cleave DNA, RNA and their model compounds in an effective and selective fashion [4][5][6][7]. The mechanism of phosphodiester hydrolysis was investigated at different pH conditions and for diverse phosphodiesters through experimental and theoretical-computational approaches for both catalysed [2,4,[8][9][10][11][12][13][14] and uncatalysed reactions [9,[15][16][17][18]. Both approaches present some critical aspects. The measurements of the rate constants for the uncatalysed hydrolysis of phosphodiesters are hampered by the extremely long half-life time of the reaction (e.g., 30,000,000 years at 25°C) [2]. For this reason, there is the need to carry out kinetic experiments at elevated temperatures and pressures and obtain the reaction rate at room temperature by extrapolation [2]. Regarding the computational approach, the problems manifest in the choice of the proper method to describe the reaction coordinate and the solvent effect. For the reasons illustrated above, the development of an efficient and reliable method to investigate phosphodiester hydrolysis is crucial to provide useful insights into the mechanisms of these reactions and to evaluate the catalytic enhancement offered by the enzymes and artificial catalysts with respect to spontaneous hydrolysis.
In this work, we focused on the study of the hydrolysis of a simple phosphodiester, dineopentyl phosphate, hereafter referred to as N p 2 P − . The hydrolysis of this substrate was investigated by Wolfenden et al. through an in-depth kinetic analysis [2], and therefore experimental data are available for a direct comparison. This compound has the tendency, similarly to DNA, to undergo cleavage of the phosphorous-oxygen bond, instead of the carbon-oxygen bond as many phosphoric diesters [2]. This reactive behaviour can be attributed to the steric hindrance provided by the t-butyl units that force the attack of the upcoming nucleophile to the phosphorous atom, as illustrated and discussed in detail in the present manuscript (vide infra. Experimental data indicate that the hydroxidecatalyzed hydrolysis dominates over the spontaneous reaction as the leaving groups become poorer [2] and, according to this evidence, we have selected the hydroxide ion attack onto the substrate, instead of a water molecule, as a test reaction. The hydroxide attack is also frequent in enzyme-catalysed reactions, where a metal cation helps the deprotonation of a water molecule [8,19,20]. This transformation was modelled by a statistical-mechanical approach based on a quantum mechanics/molecular mechanics hybrid method. The results obtained by this approach are illustrated and discussed in detail and compared with the experimental data.

General Considerations
Let us consider a simple reaction from the reactant (R) chemical state to the product (P) chemical state, without involving any distinguishable intermediate chemical state, as occurring within a single reactive centre (RC), either a molecule or a complex, interacting with its environment. Note that we assume for the R and P chemical states a fully in equilibrium: i.e., the corresponding mean residence time is large enough to allow for internal relaxation. We can also reasonably assume that the reaction events must pass trough a specific and unique transition state (TS) with a mean traversing time too short for any internal relaxation, hence providing the reaction scheme and thus[ Considering that we necessarily have k k R we can assume the steady state for TS (i.e.,[TS] ∼ = 0) providing It is reasonable to assume that within the TS equilibrium ensemble the outward flux rate constants are identical, given by k TS ∼ = k/2 (only one half of the TS population is moving towards P), and thus from Equation (6) we havė with Q TS and Q R the canonical partition functions of the whole system with a selected single RC complex (the reference RC) constrained in the TS or the R state, respectively (note that for reactive centres at infinite dilution the whole system can be considered as including a single RC embedded into a huge amount of solvent molecules). From Equation (7) it readily follows that the reaction process can be described by the simplified reaction scheme

The Reaction Coordinate, the Landau Free Energy and the Rate Constant
Let us assume that we can identify a generalized coordinate ξ (i.e., the reaction coordinate) univocally defining, via its different intervals, the R, TS and P chemical states of the reference RC (infinite dilution at the reactive centre the single RC is present), and thus properly describing the reaction process (see Figure 1). Defining with [ξ TS − δ/2, ξ TS + δ/2] the TS tiny reaction coordinate range and with ξ m the minimum ξ value of the R chemical state, we can express the partition functions Q R and Q TS via the Landau free energy A defined by A(ξ) = −k B T ln Q(ξ) (9) with Q(ξ) the partition function density providing where 1/β = k B T and we used A(ξ) ∼ = A(ξ TS ) within the TS range. From Equations (7), (10) and (11) we obtain where ξ R is a reference position within the R state range (typically the Landau free energy minimum). Realizing that the TS, corresponding to a tiny ξ interval, is characterized by a constant density within the whole δ range and for each reaction event by a virtually constant dξ/dt velocity, we can express the outward flux (forming the P population) via k[TS] ∼ = [TS] vs. /δ with vs. the mean TS traversing velocity given by the equilibrium mean value of v = |dξ/dt|. Therefore, it follows readily providing Finally, we can obtain the Landau free energy change along ξ, via [21] A where the subscript of the angle brackets indicate averaging within the ξ re f reference ensemble, U (ξ) and U (ξ re f ) are the perturbed electronic energies (estimated by means of the perturbed matrix method (PMM) [22,23]) as obtained either at fixed ξ or at fixed ξ re f for each accessible semi-classical configuration, once relaxed all the quantum vibrational coordinates are at their minimum energy position (we consider the quantum vibrational energy as independent of the ξ coordinate), and A ξ and A ξ re f are the Helmholtz free energies when confining the RC within the tiny δ range either around ξ or around ξ re f . Note that in practice Equation (17) is accurate only when ξ is not too far from ξ re f , thus allowing us to use the fixed ξ re f MD simulation to reconstruct the ξ ensemble (i.e., the MD simulation provides a reasonable sampling of the relevant phase space regions of both ensembles). Therefore, for evaluating the complete free energy profile we can use a set of reference positions ξ re f ,n , n = 1, N providing a set of [ξ re f ,n , ξ re f ;n+1 ] sub-ranges each to be used to obtain the corresponding (inner) Landau free energy changes. By summing the free energy variations over such sub-ranges the complete Landau free energy profile can be reconstructed and hence by averaging over the free energy profiles as obtained by both the ξ re f ,1 → ξ re f ,N and ξ re f ,N → ξ re f ,1 calculations we can obtain a reliable estimate of the reaction Landau free energy surface.

Application to the N p 2 P Hydrolysis Reaction
The general scheme for the hydrolysis reaction considered is where N p 2 P − is the dineopentyl phosphate, HO − is the hydroxide ion and C NR , C R are the non-reactive and reactive complexes, respectively: the former defined by the interacting partners still insufficiently close to alter their covalent structure while the latter corresponding to closely interacting partners possibly undergoing covalent rearrangement (i.e., the R state of Scheme (8)). In the previous reaction scheme the association rate constant k A can be obtained via [24,25] where r 0 is the maximum distance between N p 2 P − and HO − within the non-reactive complex (i.e., the radius length of the sphere defining the overall complex C = C NR + C R ), D N p 2 P − and D HO − are the free diffusion coefficients of the two reactant molecules characterizing their behaviour for r ≥ r 0 , and N A is the Avogadro number. Finally, assuming steady state for C NR and C R , we have [25] [ with as the overall (hydrolysis) rate constant (note that in Equation (24) we used k 1 + K R ∼ = k 1 as following from K R k 1 , see Table 1). By means of proper MD simulations, the PMM approach and the relations described in the theory section we obtained all the kinetic constants to be used in Equation (24). Table 1. Kinetic rate constants of the P-O bond cleavage in N p 2 P − .

Calculated Rate Constants
Value Dimension

Quantum Chemical Calculations
The quantum chemical calculations were conducted using Orca software package [26]. To model the hydrolysis of dineopentyl phosphate (N p 2 P − ) with the attack on the phosphorous atom, we selected as the quantum centre (QC) the dimethyl-substituted phosphate (DMP − ) and the hydroxide ion. The quantum chemical calculations were performed using density functional theory (DFT) to obtain the intrinsic reaction coordinate (IRC). The IRC calculation requires a guess of the transition state (TS) of the reaction. The initial guess for the transition state structure was taken from previous work [8] and then optimized for a saddle point at the DFT/PBE0 [27] level of theory using the 6-311++G(2d,2p) [28,29] basis set. The analysis of the transition state hessian confirmed that the obtained structure corresponds to a saddle point of the potential energy surface (PES).
The choice of the PBE0 functional was motivated by a previous benchmark work present in the literature [30].
To show that the attack of the hydroxide ion on the neopentyl group is energetically precluded, we performed a PES scan along the hydroxide oxygen atom-dineopentyl carbon atom and the dineopentyl carbon atom-leaving group (see Figure 2a) distances, at DFT/PBE0 [27] level of theory using the 6-31+G(d,p) [28,29,31]

Molecular Dynamics Simulations
The molecular dynamics (MD) simulations were conducted starting from the structures obtained from the IRC calculation (as explained in the previous subsection) and by replacing one hydrogen atom of each methyl group of the DMP − with a t-butyl group to obtain a model of the dineopentyl phosphate and hydroxide ion, along the reaction coordinate.
All the molecular dynamics simulations have been performed using Gromacs 2022 software package [32]. The initial structure was centred in a cubic box and solvated using the SPC [33] water model, whereas the force field parameters of the neopentyl phosphate were obtained by means of the ACPYPE package [34]. Two Na + ions have been inserted in the simulation box to reach the electro-neutrality. An energy minimization step was performed using the steepest descent algorithm freezing the atoms corresponding to the QC. After the minimization, a series of equilibration steps, lasting 50 ps with a timestep of 2 fs, were performed in the NVT ensemble: after each run, we checked the density of the system and tune the size of the box in order to reach the correct value of density. After the equilibration steps, each system has been simulated for 25 ns at fixed temperature and volume. The temperature was kept constant at 300 K using the V-rescale thermostat [35] using a τ T of 0.1 ps. The electrostatic interactions were calculated using the particle mesh Ewald method [36,37] with a cut-off of 1.2 nm. A cut-off of 1.2 nm was used for the van der Waals interactions.
Because the reaction shows a relevant reorganization of the internal degrees of freedom, we divided the intrinsic reaction coordinate into five subparts. For each of them, an MD simulation (maintaining the QC fixed, in a reference structure for the selected subpart) was carried out to provide a proper semi-classical perturbation of the QC.
Additionally, we performed an MD simulation, lasting 100 ns, of the free reactants using a timestep of 2 fs to obtain the hydroxide ion-dineopentyl phosphate distance probability distribution.
A short simulation, lasting 200 ps, with a timestep of 1 fs of the dineopentyl phosphate in aqueous solution was performed to estimate its diffusion coefficient according to the Einstein Equation [38]. For the hydroxide ion, we used the experimental diffusion coefficient reported in [39]. The diffusion coefficients are required for the calculation of the kinetic constant k A (see Scheme (21)) according to Equation (22).
We also performed a set of 70 simulations, lasting 500 ps, starting from random initial configuration within the hydroxide ion-dineopentyl phosphate distance of 0.5-0.6 nm to reconstruct the reactive and non-reactive complex dissociation kinetics.

Computational Strategy
The computational strategy can be summarized in the following steps: • Calculation of the unperturbed quantum chemical properties, needed to apply the MD-PMM procedure, for each point along the reaction coordinate (electronic ground state energy, permanent dipole moments and CHELPG charges [40] Calculation of the kinetic rate constants involved in the reactive non-reactive equilibrium in Scheme (21) by performing different sets of MD simulations: (i) a 100 ns long simulation of the free reactants to obtain the hydroxide ion-dineopentyl phosphate distance probability distribution and the equilibrium fraction of the non-reactive complex, χ eq ; (ii) one short simulation for the determination of the diffusion coefficient of the dineopentyl phosphate; (iii) a set of 70 simulations in order to reconstruct the reactive and non-reactive complex dissociation kinetics.

Results and Discussion
As explained in the Theory section, the evaluation of the overall rate constant k H for the P-O cleavage reaction of the N p 2 P − in water requires the estimate of the rate constants associated with the three reaction steps, described in Scheme (21), using Equation (24).

Evaluation of the Rate Constant K R of the Dineopentyl Phosphate Hydrolysis
The methodology described in the Theory and Methods sections was applied to reconstruct the full kinetics of the hydrolysis of the dineopentyl phosphate. This substrate was selected because of the availability of experimental data [2], as highlighted in the introduction. It is known that the hydrolysis of the dialkyl phosphodiesters with minor steric hindrance around the carbon atoms bound to the oxygen atoms proceeds preferentially via the nucleophilic attack at this carbon atom with respect to the phosphorous atom [17,18] (Figure 2a). However, in the case of the dineopentyl phosphate, the steric hindrance drives the attack onto the phosphorous atom [1], as shown in Figure 2b.
Therefore, to quantitatively model the free energy profile of the reaction by means of the PMM-MD approach, we first evaluate the minimum energy path along both the P-O and C-O bond cleavages. To study the attack of the hydroxide ion on the phosphorous atom of the dineopentyl phosphate, we performed a TS optimization of the quantum centre and analysis of the TS hessian to confirm that the obtained structure is a saddle point of the potential energy surface of the system. From this TS structure, an intrinsic reaction coordinate calculation was performed to obtain the reaction profile reported in Figure 3b. In the case of C-O bond cleavage, the same calculation on N p 2 P − failed because of the large steric hindrance due to the neopentyl groups. Therefore, a bi-dimensional PES along the two bond distances involved in the reaction, i.e., the distance between the carbon atom and the oxygen of the hydroxide (d C−OH ) and between that carbon atom and the oxygen of the leaving group (d C−OLg ), was performed. These results reported in Figure 3 clearly show a higher energy barrier associated with the C-O bond cleavage (36.1 kcal/mol) with respect to the P-O bond (27.5 kcal/mol). This indicates that the steric hindrance inhibits the C-O bond cleavage. Therefore, we have applied the MD-PMM procedure to evaluate only the thermodynamics and kinetics in solution for the P-O bond cleavage.
To this end, the electronic energies were evaluated for ∼100 structures taken along the reaction coordinate and perturbed-in the corresponding MD ensemble-by the electric field as provided by the MD simulations. These data indicate that the effect of the environment is to stabilize both the reactants and the TS, as shown by the distributions of the reactant and TS perturbed electronic energies (see Figure 4).
However, the TS perturbed energies decrease less than the reactant state ones, resulting in a higher barrier with respect to the vacuum condition (see Table 2). This effect can be ascribed to a better ability of polar solvent molecules to stabilize two negative charges located on two separated molecular entities compared with a doubly charged molecule.
Using these perturbed energy trajectories in Equation (17), it was possible to calculate the corresponding (Helmoltz) free energy profile along the reaction coordinate, which results in a barrier of 38.8 kcal/mol, Figure 5. Using the statistical-mechanical approach described in the Theory section, i.e., Equation (16), we obtain, for the last (and slowest) reaction step (see Scheme (21)), a kinetic rate constant K R as high as 1.85·10 −15 s −1 .

Evaluation of the Other Kinetic Rate Constants
In order to evaluate the other rate constants for the reaction steps providing the complex dissociation and the non-reactive reactive complex transitions, we computed the overall complex dissociation rate constant (k D = χ eq k 2 , χ eq = k 1 /(k 1 + k −1 )), assuming pre-equilibrium within the reactant complex, as provided by 70 MD simulations starting from random initial configurations representative of the reactive complex. The direct evaluation of k D can be obtained by monitoring and fitting the overall complex fraction in time, i.e., by first-order modelling the time decay of the trajectory fraction corresponding to the reactants with distances within 1.2 nm. This cut-off value, defining the overall complex radius r 0 , can be identified by inspecting the hydroxide ion-dineopentyl phosphate distance distribution ( Figure 6) as provided by MD simulations. When that distribution starts to exhibit a purely quadratic behaviour, the two reactants can be considered to be outside of the non-reactive complex region. Analogously, the rate constant k 1 for the reactive complex → non-reactive complex transition can be computed by monitoring the time decay of the complex fraction corresponding to the reactants with distances within 0.6 nm (i.e., the cut-off radius defining the reactive complex), see Figure 7. The computed values for k D and k 1 are reported in Table 1. Moreover, from the 100 ns MD simulation of the free reactants we obtained the equilibrium fraction of the non-reactive complex (χ eq ) used to estimate the kinetic constants k −1 and k 2 (see Scheme (21)), also reported in Table 1.
Finally, using all the rate constants for the cleavage reaction steps in Equation (24), it was possible to estimate the overall rate constant k H = 2.61·10 −15 s −1 in very good agreement with the experimental data, both reported in Table 1.
Our data clearly show that the first two steps of the reaction have a limited effect on the overall rate of the reaction, as pointed out by the limited difference between K R and k H .

Hydrolysis of the Dimethyl Phosphate
In order to confirm that the steric hindrance is able to drive the reaction towards different paths, that is P-O or C-O bond cleavage, the slow step of the hydrolysis reaction of the dimethyl phosphate (DMP − ) was investigated by means of the same theoreticalcomputational strategy for both C-O and P-O bond breaks. In agreement with the experimental data, the free energy barrier is remarkably lower for the C-O bond break with respect to the cleavage of the P-O bond.
Assuming that in the case of the DMP − the first two reaction steps exhibit similar behaviours with respect to N p 2 P − , we compare the ratio between the experimental overall rate constants of the two possible bond breaks with the ratio of the corresponding K R . From these data, it comes out that our procedure correctly predicts the favourite path (see Table 2).

Mechanistic Considerations
The hydrolysis of phosphodiesters may proceed via a step-wise mechanism or, alternatively, a concerted pathway (see Figure 8). In the first case, the formation of the intermediate can be generated either by the attack of the nucleophile onto the phosphate unit, i.e., A N + D N , mechanism with the generation of a penta-coordinated species, or by the early departure of the leaving group (Lg), named as D N + A N [8,9,41,42]. The latter case can be considered only in the case of substrates provided with very good leaving groups. On the other hand, in a concerted reaction mechanism, the penta-coordinated phosphorane is a transition state instead of an intermediate, namely A N D N pathway, indicated also as S N 2(P) for its close similarity to carbon nucleophilic substitution. Even in the concerted transformation, the TS can verge to one of the two extreme conditions illustrated above. A transition state that is closer to the associative limit is labelled as A N D N , i.e., asynchronous tight. On the contrary, an asynchronous loose D N A N mechanism is closer to the dissociative limit [8,41]. From the analysis of the potential energy surface and the corresponding O nucleophile -P and P-OLg distances (see Figure 3) the reaction pathway does not indicate the existence of any intermediate, as no energy minima is detected along the profile, and therefore the mechanism can be considered concerted and synchronous, or slightly "associative" (A N D N ). This is in contrast to what has been observed in a previous investigation based on a more approximate theoretical-computational approach and on the implicit treatment of the solvent [9].

Conclusions
Our theoretical and computational procedure, based on a mixed quantum/classical approach, was employed to investigate a chemical transformation of relevant importance in biological systems. The proposed methodology, based on the perturbed matrix method (PMM), has turned out to be effective in the quantitative prediction of the kinetics and thermodynamics of the hydrolysis of phosphate diesters in water solution. The results obtained by such an approach, in spite of the adopted approximation, are in excellent agreement with the experimental rate constants. Furthermore, this approach provides useful mechanistic insights about the reaction pathway, i.e., C-O bond cleavage vs. O-P bond cleavage for phosphodiesters featuring different steric hindrance conditions, in accordance with the experimental picture. The analysis of the potential energy surface does not show the presence of any energy minimum along the reaction pathway, thus suggesting that the hydroxide-promoted hydrolysis of these compounds proceeds through a concerted "synchronous" mechanism. We believe the present approach can be applied in a wide variety of chemical transformations and therefore can open up the way to a fast and reliable general method to treat bimolecular reactions in condensed phase and in other complex environments.