# Using the Maximum Entropy Principle to Combine Simulations and Solution Experiments

## Abstract

## 1. Introduction

## 2. The Maximum Entropy Principle

#### 2.1. Combining Maximum Entropy Principle and Molecular Dynamics

#### 2.2. A Minimization Problem

- When data are incompatible with the prior distribution.
- When data are mutually incompatible. As an extreme case, one can imagine two different experiments that measure the same observable and report different values.

#### 2.3. Connection with Maximum Likelihood Principle

#### 2.4. Enforcing Distributions

#### 2.5. Equivalence to the Replica Approach

## 3. Modelling Experimental Errors

## 4. Exact Results on Model Systems

#### 4.1. Consistency between Prior Distribution and Experimental Data

#### 4.2. Consistency between Data Points

## 5. Strategies for the Optimization of Lagrangian Multipliers

#### 5.1. Ensemble Reweighting

#### 5.2. Iterative Simulations

#### 5.3. On-the-Fly Optimization with Stochastic Gradient Descent

#### 5.4. Other On-the-Fly Optimization Strategies

## 6. Convergence of Lagrangian Multipliers in Systems Displaying Metastability

#### 6.1. Results for a Langevin System

#### 6.2. Comments about Using Enhanced Sampling Methods

## 7. Discussion and Conclusions

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Abbreviation | Meaning |
---|---|
DEER | double electron-electron resonance |
EDS | experiment-directed simulation |
GD | gradient descent |
MD | molecular dynamics |
NMR | nuclear magnetic resonance |
SAXS | small-angle X-ray scattering |
SGD | stochastic gradient descent |
VES | variationally enhanced sampling |

**Figure 1.** The effect of a linear correcting potential on a given reference potential. $P_0(s)$ is the marginal probability distribution of some observable $s(\mathit{q})$ according to the reference potential $V_0(\mathit{q})$, and $F_0(s)$ is the corresponding free-energy profile (**left panel**). The energy scale is reported on the vertical axis in units of $k_B T$. Probability scales are not reported. Vertical lines represent the average value of the observable $s$ in the prior ($\langle s\rangle_0$) and in the experiment ($s^{exp}$). A correcting potential linear in $s$ (green line) shifts the relative depths of the two free-energy minima, leading to a new free-energy profile $F_{ME}(s)=F_0(s)+k_B T\lambda^{\ast}s$ that corresponds to a probability distribution $P_{ME}(s)$ (**central panel**). Choosing $\lambda^{\ast}$ equal to the value that minimizes $\Gamma(\lambda)$ (**right panel**) leads to an average $\langle s\rangle = s^{exp}$.
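The construction in Figure 1 can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes that $\Gamma(\lambda)=\ln\langle e^{-\lambda s}\rangle_0+\lambda s^{exp}$, with the average taken over samples from the prior, and it uses a hypothetical bimodal prior and a hypothetical target value. The function names `reweight`, `gamma`, and `solve_lambda` are illustrative only.

```python
import numpy as np

def reweight(s, lam):
    """Normalized weights proportional to exp(-lam * s) over prior samples."""
    a = -lam * s
    w = np.exp(a - a.max())          # subtract the max to avoid overflow
    return w / w.sum()

def gamma(s, s_exp, lam):
    """Gamma(lam) = ln <exp(-lam*s)>_0 + lam*s_exp, estimated from prior samples."""
    a = -lam * s
    return a.max() + np.log(np.mean(np.exp(a - a.max()))) + lam * s_exp

def solve_lambda(s, s_exp, lo=-50.0, hi=50.0, n_iter=100):
    """Find lam* by bisection on dGamma/dlam = s_exp - <s>_lam.

    Gamma is convex, so its derivative increases monotonically with lam
    and the root of the derivative is the minimum of Gamma. The bracket
    [lo, hi] is assumed to contain the solution.
    """
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        grad = s_exp - np.dot(reweight(s, mid), s)
        if grad > 0.0:
            hi = mid                 # minimum lies at smaller lam
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical bimodal prior: two equally populated states of the observable s.
rng = np.random.default_rng(0)
s_prior = np.concatenate([rng.normal(1.0, 0.5, 50_000),
                          rng.normal(5.0, 0.5, 50_000)])
s_exp = 2.0                          # hypothetical experimental average

lam_star = solve_lambda(s_prior, s_exp)
w_star = reweight(s_prior, lam_star)
print("lambda* =", lam_star)
print("<s>_ME  =", np.dot(w_star, s_prior))
```

Bisection on the derivative is used here instead of a Newton step because it cannot overshoot when the prior is strongly bimodal. It requires $s^{exp}$ to lie within the range spanned by the prior samples.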

**Figure 2.** Effect of modeling the error with a Gaussian probability distribution with different standard deviations $\sigma$ on the posterior distribution $P_{ME}(s)$. The experimental value is here set to $s^{exp}=5.7$, which is compatible with the prior distribution. **Left** and **middle** columns: prior $P_0(s)$ and posterior $P_{ME}(s)$ with $\sigma = 0,\ 2.5,\ 5.0$. **Right** column: ensemble average $\langle s\rangle$ plotted as a function of $\sigma$, and $\Gamma(\lambda)$ plotted for different values of $\sigma$. $\lambda^{\ast}$ denotes the value of $\lambda$ that minimizes $\Gamma(\lambda)$.

**Figure 3.** Same as Figure 2, but the experimental value is here set to $s^{exp}=2$, which is almost incompatible with the prior distribution.
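The dependence of $\langle s\rangle$ on $\sigma$ shown in Figures 2 and 3 can be reproduced qualitatively with a short sketch. It assumes that a Gaussian error model adds a term $\frac{1}{2}\sigma^{2}\lambda^{2}$ to $\Gamma(\lambda)$, so that the stationarity condition becomes $\langle s\rangle = s^{exp}+\lambda^{\ast}\sigma^{2}$. The bimodal prior below is hypothetical and is not the one used in the figures. The helper names are illustrative.

```python
import numpy as np

def mean_s(s, lam):
    """Average of s under weights proportional to exp(-lam * s)."""
    a = -lam * s
    w = np.exp(a - a.max())          # subtract the max to avoid overflow
    return np.dot(w, s) / w.sum()

def solve_lambda_gaussian(s, s_exp, sigma, lo=-50.0, hi=50.0, n_iter=100):
    """Minimize Gamma(lam) = ln<exp(-lam*s)>_0 + lam*s_exp + 0.5*sigma**2*lam**2.

    The derivative s_exp + sigma**2*lam - <s>_lam increases with lam,
    so its root is bracketed in [lo, hi] and found by bisection.
    """
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        grad = s_exp + sigma**2 * mid - mean_s(s, mid)
        if grad > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical bimodal prior with average close to 3.
rng = np.random.default_rng(1)
s_prior = np.concatenate([rng.normal(1.0, 0.5, 50_000),
                          rng.normal(5.0, 0.5, 50_000)])
s_exp = 2.0

results = {}
for sigma in (0.0, 2.5, 5.0):
    lam = solve_lambda_gaussian(s_prior, s_exp, sigma)
    results[sigma] = (lam, mean_s(s_prior, lam))
    print(f"sigma={sigma}: lambda*={lam:.4f}, <s>={results[sigma][1]:.4f}")
```

With $\sigma=0$ the posterior average matches $s^{exp}$ exactly. As $\sigma$ grows, the restraint becomes softer and the posterior average moves back toward the prior average.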

**Figure 4.** Effect of different prior distributions for the error model in a two-dimensional system. In the first (last) two columns, compatible (incompatible) data are enforced. In the first and third columns, prior distributions are represented as black contour lines and posterior distributions are shown in color scale. A black dot and a ★ are used to indicate the average values of $\mathit{s}$ in the prior and posterior distributions, respectively, while an empty circle is used to indicate the target $\mathit{s}^{exp}$. In the second and fourth columns, the function $\Gamma(\mathit{\lambda})$ is shown, and its minimum $\mathit{\lambda}^{\ast}$ is indicated with a ★. The first row reports results where errors are not modeled, whereas the second and third rows report results obtained using Gaussian and Laplace priors for the error model, respectively. Notice that a different scale is used to represent $\Gamma(\mathit{\lambda})$ in the first row. For the Laplace prior, the region of $\mathit{\lambda}$ where $\Gamma(\mathit{\lambda})$ is undefined is marked in white.
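The white region mentioned for the Laplace prior can be understood from a short worked derivation. The following is a sketch under stated assumptions: the error $\epsilon_i$ on each observable is additive and independent, with prior $P_0(\epsilon_i)$ of standard deviation $\sigma_i$, and the error model enters $\Gamma$ as an additive term $\Gamma_{err}(\mathit{\lambda})=\sum_i \ln\int d\epsilon_i\, P_0(\epsilon_i)\, e^{-\lambda_i \epsilon_i}$. The symbols $\epsilon_i$ and $\Gamma_{err}$ are introduced here for illustration.

$$
\Gamma_{err}^{Gauss}(\mathit{\lambda}) = \frac{1}{2}\sum_i \lambda_i^{2}\sigma_i^{2},
\qquad
P_0(\epsilon_i)\propto e^{-\sqrt{2}\,|\epsilon_i|/\sigma_i}
\;\Rightarrow\;
\Gamma_{err}^{Laplace}(\mathit{\lambda}) = -\sum_i \ln\left(1-\frac{1}{2}\lambda_i^{2}\sigma_i^{2}\right)
$$

The Gaussian term is finite for every $\mathit{\lambda}$. The Laplace term is finite only when $|\lambda_i| < \sqrt{2}/\sigma_i$ for every $i$, because the integral defining it diverges outside this range. Under these assumptions, this is the region where $\Gamma(\mathit{\lambda})$ is undefined.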

**Figure 5.** Effect of choosing different values for $k$ and $\tau$ when using stochastic gradient descent (SGD) on-the-fly during molecular dynamics (MD) simulations. Panel labels (**a**–**e**) refer to different sets of $k$ and $\tau$ values matching those of Table 1. In particular, for each set of $k$ and $\tau$ we show the convergence of the Lagrangian multipliers (number 1 of each letter), the time series of the observable (number 2 of each letter), and the resulting sampled posterior distribution, red bars, together with the analytical result, continuous line (number 3 of each letter).

**Table 1.** Summary of the results obtained with the Langevin model, including learning parameters ($k$ and $\tau$) and average $\langle \lambda \rangle$ and $\langle s\rangle$ computed over the second half of the simulation. In addition, we report the exact Lagrangian multiplier $\lambda_{\langle s\rangle}^{\ast}$ required to enforce an average equal to $\langle s\rangle$ and the exact average $\langle s\rangle_{\langle \lambda \rangle}$ corresponding to a Lagrangian multiplier $\langle \lambda \rangle$. The last two columns are obtained by using the analytical solutions described in Section 4. Panel labels match those in Figure 5.

Panel | $k$ | $\tau$ | $\langle \lambda\rangle$ | $\langle s\rangle$ | $\lambda_{\langle s\rangle}^{\ast}$ | $\langle s\rangle_{\langle \lambda\rangle}$ |
---|---|---|---|---|---|---|
a | 2 | 10 | 0.207 | 1.008 | 0.210 | 1.015 |
b | 2 | 0.001 | 0.080 | 1.308 | 0.080 | 1.307 |
c | 0.001 | 10 | 0.077 | 1.324 | 0.073 | 1.316 |
d | 2 | 10000 | 0.145 | 1.000 | 0.214 | 1.157 |
e | 1000 | 10 | 0.158 | 1.001 | 0.214 | 1.125 |
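The role of the learning parameters $k$ and $\tau$ can be explored with a toy simulation. The sketch below is not the model behind Table 1. It assumes a harmonic prior $V_0(q)=q^{2}/2$ with $s(q)=q$ and $k_BT=1$, for which the exact multiplier is $\lambda^{\ast}=-s^{exp}$. It also assumes a learning rate of the form $\eta(t)=k/(1+t/\tau)$ and an update $\dot{\lambda}=\eta(t)\,(s(q(t))-s^{exp})$. The function name `run_sgd` is illustrative.

```python
import numpy as np

def run_sgd(k, tau, s_exp=1.0, dt=0.01, n_steps=200_000, seed=0):
    """Overdamped Langevin dynamics with on-the-fly update of lambda.

    Assumptions (not taken from the text): harmonic prior V0(q) = q**2 / 2,
    observable s(q) = q, k_B T = 1, unit friction, Euler-Maruyama integrator.
    Returns <lambda> and <s> averaged over the second half of the run.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_steps) * np.sqrt(2.0 * dt)
    q, lam = 0.0, 0.0
    lam_traj = np.empty(n_steps)
    s_traj = np.empty(n_steps)
    for i in range(n_steps):
        t = i * dt
        eta = k / (1.0 + t / tau)        # decays as 1/t once t >> tau
        force = -q - lam                 # -dV0/dq plus the force from the bias lam*s
        q += force * dt + noise[i]
        lam += eta * (q - s_exp) * dt    # step against the instantaneous gradient s_exp - s
        lam_traj[i] = lam
        s_traj[i] = q
    half = n_steps // 2
    return lam_traj[half:].mean(), s_traj[half:].mean()

# k and tau as in panel a of Table 1; the target s_exp = 1 is a hypothetical choice.
lam_avg, s_avg = run_sgd(k=2.0, tau=10.0)
print(f"<lambda> = {lam_avg:.3f} (exact value is -1 for this toy model)")
print(f"<s>      = {s_avg:.3f} (target is 1)")
```

Because this toy prior has a single minimum, it does not show the slow convergence associated with metastability. Replacing the force with that of a double-well potential would be the natural next step for such an experiment.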

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Cesari, A.; Reißer, S.; Bussi, G.
Using the Maximum Entropy Principle to Combine Simulations and Solution Experiments. *Computation* **2018**, *6*, 15.
https://doi.org/10.3390/computation6010015
