# Identifying Systematic Force Field Errors Using a 3D-RISM Element Counting Correction

## Abstract

## 1. Introduction

## 2. Theory

#### 2.1. PMV Correction

#### 2.2. vdW Volume Correction

#### 2.3. Element Count Correction

## 3. Results

#### 3.1. Identifying Rigid and Flexible Molecules Using Molecular Dynamics with GB Solvent

#### 3.2. Fitting PMVC, ECC, and PMVECC Parameters

#### 3.3. Quality of Fit

## 4. Discussion

#### 4.1. Dealing with Conformational Sampling

#### 4.2. Accuracy and Computational Efficiency of 3D-RISM/PMVECC

#### 4.3. Force Field Parameters

## 5. Materials and Methods

#### 5.1. Structure Preparation

#### 5.2. GB HFE

and vacuum ( ) [3,54,55] environments in the MD engine of AmberTools 2017 [56]. For all simulations, a 1 fs time step was used, temperature was held at $298.15\,\mathrm{K}$ using a Langevin thermostat with $\gamma = 5\,\mathrm{ps}^{-1}$, and conformations were saved every 10,000 steps. The resulting trajectories were then post-processed in `sander` (`imin = 5`) using the GB with surface area implicit solvent (`igb = 2, gbsa = 1`) and in a vacuum to obtain the potential energy of each conformation in aqueous and gas phases. HFEs were then calculated from these potential energies using pyMBAR 3.1.1 [57,58].
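As a rough illustration of how a free energy can be extracted from re-evaluated potential energies, the sketch below uses one-sided exponential averaging (the Zwanzig relation). This is a simpler stand-in for illustration, not the MBAR estimator from pyMBAR that was used here. It assumes the conformations were sampled in vacuum and that each frame's energy difference $U_{\mathrm{GB}} - U_{\mathrm{vac}}$ is already available in kcal/mol; the function name `exp_free_energy` and the input layout are hypothetical.

```python
import math

# Gas constant in kcal/(mol K), used as the molar Boltzmann constant.
R_KCAL_PER_MOL_K = 0.0019872041


def exp_free_energy(delta_u, temperature=298.15):
    """One-sided exponential-averaging estimate of a free energy difference.

    delta_u: per-frame energy differences U_GB - U_vac (kcal/mol), evaluated
             on conformations sampled in vacuum (an assumption of this sketch).
    Returns -kT * ln < exp(-delta_u / kT) > in kcal/mol.
    """
    if not delta_u:
        raise ValueError("need at least one frame")
    beta = 1.0 / (R_KCAL_PER_MOL_K * temperature)
    exponents = [-beta * du for du in delta_u]
    # Log-sum-exp trick: factor out the largest exponent to avoid overflow.
    shift = max(exponents)
    total = sum(math.exp(x - shift) for x in exponents)
    log_mean = shift + math.log(total / len(exponents))
    return -log_mean / beta


# Hypothetical per-frame energy differences, for illustration only.
example_delta_u = [-2.0, -4.0, -3.0, -3.5]
print(f"EXP estimate: {exp_free_energy(example_delta_u):.3f} kcal/mol")
```

By Jensen's inequality, the estimate is never larger than the plain average of the energy differences. A one-sided estimator of this kind generally converges more slowly than MBAR, which combines samples from both end states.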

#### 5.3. 1D-RISM

in AmberTools 2021 [44,59]. The coincident extended simple point charge model (cSPC/E) was used to model water [44,60]. The dielectrically consistent RISM (DRISM) equations [61,62] were solved with a dielectric constant of 78.497 to a residual tolerance of $10^{-12}$ on a 16,384-point grid with a grid spacing of $0.025\,\mathrm{\AA}$. Convergence was accelerated with the modified direct inversion of the iterative subspace (MDIIS) method [63].
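To give a sense of scale for the grid quoted above, the following sketch computes the radial extent it covers. The reciprocal-space spacing uses the convention $\Delta k = \pi / (N \Delta r)$ that is common for sine-transform grids; whether the 1D-RISM code uses exactly this convention is an assumption here, and the function names are hypothetical.

```python
import math

N_POINTS = 16384      # number of radial grid points (from the text)
DR_ANGSTROM = 0.025   # radial grid spacing in Å (from the text)


def radial_grid_extent(n_points, dr):
    """Total radial extent covered by n_points evenly spaced by dr."""
    return n_points * dr


def reciprocal_spacing(n_points, dr):
    """Assumed sine-transform convention: dk = pi / (n_points * dr)."""
    return math.pi / (n_points * dr)


r_max = radial_grid_extent(N_POINTS, DR_ANGSTROM)
dk = reciprocal_spacing(N_POINTS, DR_ANGSTROM)
print(f"r_max = {r_max:.1f} Å, dk = {dk:.5f} Å^-1")
```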

#### 5.4. 3D-RISM Calculations

of AmberTools 2021 [44,59] was used to calculate the HFE and PMV using the AMBER parameter and coordinate files provided with the FreeSolv dataset for each solute and bulk water properties from

. The 3D-RISM equations were solved to a residual tolerance of $10^{-4}$ on a grid with a spacing of $0.3\,\mathrm{\AA}$, with convergence accelerated by MDIIS. Lennard–Jones cutoffs with a relative tolerance of $10^{-4}$ were used to determine the size of the grid, and analytic corrections were applied [47]. Reciprocal space long-range asymptotics were calculated with a relative tolerance of $10^{-5}$.

#### 5.5. Parameter Fitting

## Supplementary Materials

**Figure 1.** Categorizing rigid and flexible molecules from MD simulations. The standard deviation of the combined GB and surface area energy from MD simulations is given on the x-axis. The difference between ${E}_{\mathrm{GB}}$ calculated from just the first frame (static) and over the entire MD trajectory is given on the y-axis. Histograms for both quantities are given on their respective axes. For clarity, the full range of the data is not shown, which has maximum values of $\sigma_{\Delta G_{\mathrm{GB}}} = 4.0\,\mathrm{kcal/mol}$ and $\left|\Delta G_{\mathrm{GB,static}} - \Delta G_{\mathrm{GB,MD}}\right|$.

**Figure 2.** HFEs for 3D-RISM/PMVC, 3D-RISM/ECC, 3D-RISM/PMVECC, and explicit solvent using parameters from Table 1. Leave-out data were used for all plots, except for uncorrected explicit solvent calculations, which are from Refs. [22,35]. Molecules containing combinations of F, Cl, Br, P, and S atoms are plotted with multiple symbols (e.g., see labeled molecule in the bottom row). See Section 5.5 for details of the fitting procedure.

**Figure 3.** HFEs from single (original conformation) rigid and flexible datasets for GB and 3D-RISM with PMVECC.

**Table 1.** Fit parameters for PMVC, ECC, and PMVECC, averaged over all leave-one-out fits. Uncertainties in the last digit are given in parentheses, and represent the standard deviation over all leave-one-out fits. Uncertainty for the a coefficient for PMVC is $8\times 10^{-5}\,\mathrm{kcal/mol/\AA^{3}}$. Coefficient a is in kcal/mol/Å$^{3}$. All other values are in kcal/mol. See Section 5.5 for details of the fitting procedure.

Parameter | PMVC | ECC | PMVECC | Explicit Solvent ECC |
---|---|---|---|---|
a | −0.15 | | −0.130(1) | |
b | −0.04(1) | | 0.00(1) | |
H | | −1.199(1) | −0.225(5) | −0.098(1) |
N | | −1.573(6) | −0.392(7) | 0.091(5) |
C | | −1.667(1) | −0.148(8) | 0.114(1) |
O | | −1.277(3) | 0.069(9) | 0.088(3) |
F | | −2.082(4) | −0.05(1) | 0.076(2) |
Cl | | −4.695(4) | −1.19(2) | −0.456(2) |
Br | | −5.544(7) | −1.06(2) | −0.412(6) |
I | | −6.27(1) | −0.79(3) | −0.25(1) |
P | | −1.03(3) | 2.04(3) | 2.93(3) |
S | | −3.18(1) | 0.09(2) | 0.32(1) |
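The units in Table 1 (a in kcal/mol/Å$^{3}$, everything else in kcal/mol) suggest a correction that is linear in the partial molar volume and in the element counts. The sketch below applies such a correction under that assumption, i.e., $\Delta G_{\mathrm{corr}} = \Delta G_{\mathrm{3D\text{-}RISM}} + aV + b + \sum_i c_i N_i$. It reads the a and b rows as belonging to PMVC and PMVECC and the second value in each element row as the PMVECC coefficient. The function name, the functional form, and the example solute values are illustrative assumptions, not data from this work.

```python
# Mean PMVECC parameters transcribed from Table 1 (uncertainties dropped).
PMVECC_A = -0.130  # kcal/mol/Å^3
PMVECC_B = 0.00    # kcal/mol
PMVECC_C = {       # per-element coefficients, kcal/mol
    "H": -0.225, "N": -0.392, "C": -0.148, "O": 0.069, "F": -0.05,
    "Cl": -1.19, "Br": -1.06, "I": -0.79, "P": 2.04, "S": 0.09,
}


def pmvecc_hfe(dg_rism, pmv, element_counts,
               a=PMVECC_A, b=PMVECC_B, coeffs=PMVECC_C):
    """Apply an assumed linear PMV + element-count correction.

    dg_rism:        uncorrected 3D-RISM HFE (kcal/mol)
    pmv:            partial molar volume (Å^3)
    element_counts: mapping of element symbol to number of atoms
    """
    unknown = set(element_counts) - set(coeffs)
    if unknown:
        raise KeyError(f"no coefficient for elements: {sorted(unknown)}")
    element_term = sum(coeffs[el] * n for el, n in element_counts.items())
    return dg_rism + a * pmv + b + element_term


# Hypothetical solute: one carbon and four hydrogens, made-up HFE and PMV.
corrected = pmvecc_hfe(dg_rism=10.0, pmv=50.0, element_counts={"C": 1, "H": 4})
print(f"corrected HFE: {corrected:.3f} kcal/mol")
```

Passing `a=0.0` and `b=0.0` together with the ECC coefficients would give the element-count-only form under the same assumption.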

**Table 2.** Hydration free energies calculated with 3D-RISM and an explicit solvent [22] with PMVC, ECC, and PMVECC corrections using parameters from Table 1. Leave-out data were used to calculate statistics, except for uncorrected explicit solvent calculations, which used data from Ref. [22] with the same bootstrap procedure. All values are given in kcal/mol. Uncertainties in the last digit are given in parentheses and represent the standard error of the mean. See Section 5.5 for details of the fitting procedure.

Method | Slope | MUE | MSE | RMSE | $R^{2}$ | Max Error |
---|---|---|---|---|---|---|
Rigid | | | | | | |
3D-RISM/PMVC | 0.93(4) | 0.86(6) | −0.29(7) | 1.3(1) | 0.75(4) | 6.6 |
3D-RISM/ECC | 0.92(4) | 1.02(6) | −0.51(8) | 1.37(8) | 0.76(3) | 5.9 |
3D-RISM/PMVECC | 0.92(2) | 0.61(3) | 0.05(5) | 0.83(6) | 0.89(2) | 4.4 |
Explicit solvent | 0.96(3) | 0.85(4) | −0.59(6) | 1.11(6) | 0.86(2) | 4.6 |
Explicit solvent, ECC | 0.91(2) | 0.66(3) | −0.14(5) | 0.86(4) | 0.88(1) | 3.1 |
Flexible | | | | | | |
3D-RISM/PMVC | 0.98(4) | 1.53(8) | 0.2(1) | 2.1(1) | 0.75(3) | 9.6 |
3D-RISM/ECC | 1.07(4) | 1.56(7) | 0.0(1) | 2.1(1) | 0.78(3) | 9.8 |
3D-RISM/PMVECC | 0.95(5) | 1.35(6) | −0.04(9) | 1.8(1) | 0.79(3) | 9.4 |
Explicit solvent | 0.97(4) | 1.34(7) | −0.09(0) | 1.8(1) | 0.79(3) | 10.8 |
Explicit solvent, ECC | 0.91(4) | 1.17(6) | −0.13(9) | 1.7(1) | 0.81(3) | 7.8 |
Total | | | | | | |
3D-RISM/PMVC | 1.01(3) | 1.22(5) | 0.00(7) | 1.77(9) | 0.83(2) | 9.6 |
3D-RISM/ECC | 1.06(2) | 1.32(5) | −0.21(7) | 1.80(8) | 0.84(2) | 9.8 |
3D-RISM/PMVECC | 0.96(3) | 1.01(4) | 0.00(6) | 1.44(7) | 0.87(1) | 9.4 |
Explicit solvent | 1.02(3) | 1.11(4) | −0.32(6) | 1.53(8) | 0.87(1) | 10.8 |
Explicit solvent, ECC | 0.94(2) | 0.94(4) | −0.13(5) | 1.35(8) | 0.88(1) | 7.8 |
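For readers who want to reproduce statistics of the kind reported in Table 2, the sketch below computes the slope, mean unsigned error (MUE), mean signed error (MSE), RMSE, $R^{2}$, and maximum error from paired calculated and experimental HFEs. Several conventions are assumptions of this sketch rather than statements about the procedure used here: errors are taken as calculated minus experimental, the slope is from a least-squares fit of calculated against experimental values, and $R^{2}$ is the squared Pearson correlation. The bootstrap used for the uncertainties is not reproduced.

```python
import math


def hfe_statistics(calculated, experimental):
    """Summary statistics for paired HFE values (kcal/mol).

    Assumed conventions: error = calculated - experimental; slope from
    regressing calculated on experimental; R^2 = squared Pearson r.
    """
    if len(calculated) != len(experimental) or len(calculated) < 2:
        raise ValueError("need two equal-length sequences with >= 2 values")
    n = len(calculated)
    errors = [c - e for c, e in zip(calculated, experimental)]
    mean_x = sum(experimental) / n
    mean_y = sum(calculated) / n
    sxx = sum((x - mean_x) ** 2 for x in experimental)
    syy = sum((y - mean_y) ** 2 for y in calculated)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(experimental, calculated))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("slope and R^2 are undefined for constant data")
    return {
        "slope": sxy / sxx,
        "MUE": sum(abs(err) for err in errors) / n,
        "MSE": sum(errors) / n,
        "RMSE": math.sqrt(sum(err ** 2 for err in errors) / n),
        "R2": sxy ** 2 / (sxx * syy),
        "max_error": max(abs(err) for err in errors),
    }


# Made-up values for illustration only; not taken from FreeSolv.
stats = hfe_statistics(calculated=[1.5, 2.5, 3.5, 4.5],
                       experimental=[1.0, 2.0, 3.0, 4.0])
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```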

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

