Simulation of pH-Dependent Conformational Transitions in Membrane Proteins: The CLC-ec1 Cl−/H+ Antiporter

Intracellular transport of chloride by members of the CLC transporter family involves a coupled exchange between a Cl− anion and a proton (H+), which makes the transport function dependent on ambient pH. Transport activity peaks at pH 4.5 and stalls at neutral pH. However, a structure of the WT protein at acidic pH is not available, making it difficult to assess the global conformational rearrangements that support a pH-dependent gating mechanism. To enable modeling of the CLC-ec1 dimer at acidic pH, we have applied molecular dynamics simulations (MD) featuring a new force field modification scheme—termed an Equilibrium constant pH approach (ECpH). The ECpH method utilizes linear interpolation between the force field parameters of protonated and deprotonated states of titratable residues to achieve a representation of pH-dependence in a narrow range of physiological pH values. Simulations of the CLC-ec1 dimer at neutral and acidic pH comparing ECpH-MD to canonical MD, in which the pH-dependent protonation is represented by a binary scheme, substantiates the better agreement of the conformational changes and the final model with experimental data from NMR, cross-link and AFM studies, and reveals structural elements that support the gate-opening at pH 4.5, including the key glutamates Gluin and Gluex.


Introduction
Members of the CLC family of transmembrane proteins that mediate voltage-dependent transport of chloride across the membrane cell offer powerful examples of the key role that the ambient pH can play in the mechanisms of function of biomolecular systems in the cell membranes. The CLCs family comprises both ion channels and transporters that share major structural motifs [1,2]. Nine different human CLCs were identified, expressed in brain, heart, liver, gut and muscle tissues [3,4]. They have major physiological roles in maintaining homeostasis (e.g., in skeleton muscle tissues, the CLC-1 channel supports action potential generation by sodium and potassium re-polarizing the muscle fiber) [5,6]. The CLC transporter members of the family operate as Cl − /H + antiporters where the intraand extracellular concentrations of permeant Cl − and pH levels regulate conductivity. The prokaryotic analog of human CLC transporters, CLC-ec1, has been studied extensively as a mechanistic and structural prototype for the human transporters in this class [7][8][9][10].
CLC-ec1 shares the common structural architecture of a homodimer, in which each subunit has an independent anion-permeation pathway [11,12]. Structure-function studies have identified key residues involved in the transport mechanisms, such as the two glutamate residues, Glu ex and Glu in , located, respectively, in the extracellular and intracellular sides of each protomer [13][14][15]. Thus, Glu ex was proposed to act as both a "gate" for chloride transport and a mediator for proton binding [9,16], whereas Glu in is considered to be a proton transfer site [10,17].

Results
The structural rearrangements underlying the pH-dependent transition of the CLC-ec1 dimer between opened and closed gate states observed with HR-AFM [19] and a variety of other experimental approaches [18,[21][22][23][24] were investigated with MD simulations employing the ECpH force field modification scheme. This approach enables ECpH to overcome the binary representation of the protonation states in the force field of cMD, thereby introducing a crucial improvement in the ability of MD simulations to represent protein molecules in a range of physiological pH (Appendix A). Thus, the core of the ECpH method is a linear interpolation between the protonated and non-protonated force-field parameters of each titratable residue in the system that scales the current protonation state according to its individual pKa value and environmental pH (see Section 4.1). We estimated the individual pKa values for all the titratable residues with the recently introduced PROPKAtraj [25,26] approach that revises the past core flaw of PROPKA [26] and takes advantage of the protein dynamics in its evaluation of individual pKas (see Methods). We investigated the pH-dependent structural dynamics of the CLC-ec1 macromolecular system at the physiologically relevant pH values with the equilibrium constant pH (ECpH) MD approach described in [20] and briefly reviewed in the Methods (Section 4). Additional information about the formal framework of the ECpH methodology is provided in Appendices A and B).
The simulations focused on the path of the pH-dependent CLC-ec1 conformational rearrangement that takes the molecule to the state in which the transport efficiency is largest, at~pH 4.5. Interestingly, the comparison of crystallography data for the WT CLC-ec1 at pH 4.6 (PDB ID 1KPL) and pH 8.5 (PDB ID 1KPK) does not show major differences in the conformational state of the protein [27], but evidence from other experimental approaches has indicated conformational changes interpreted to represent the interconversion between functional states of the CLC protein in response to pH changes [18,[21][22][23][24]. For example, the introduction of a disulfide crosslink by the mutations A399C/A432C was shown to inhibit Cl − transport, leading to the suggestion of a connection between the resulting displacement of Helix O opening of the gate in the outward-facing state in WT protein [22]. Major structural rearrangements were also suggested from DEER experiments, crosslinking, and NMR experiments, which concluded that motions of helices N, O and P act to widen the extracellular bottleneck [18,23]. Additionally, the NMR data also suggested a pH dependence of the position of the P-Q linker that is associated with sidechain orientation of Y419 [10,24,28]. The recent high-resolution X-ray structure of a mutant construct of the CLC-ec1 protein (E148Q/E203Q/E113Q, PDB ID 6V2J), termed QQQ CLC [18], does show structural rearrangements to form an outward-facing open conformational state at pH 7.5 that mimics the WT-like conformation at low pH. The next sections describe the results from the detailed analysis of both local and global pH-dependent conformational changes in CLC-ec1, revealed by the analysis of the ECpH MD simulation trajectories.

Analysis of ECpH Trajectories Brings to Light the Mechanisms of Structural Changes Detected Experimentally
The initial structure used in all the ECpH and cMD simulations was the X-ray structure of CLC-ec1 at basic pH (PDB ID 1OTS). Initial equilibration was carried out with an established protocol described in previous publications (see [18] and Methods). Starting from the equilibrated structure, parallel ECpH simulations at pH 4 and 8 were run in equivalent swarms of six replicas for~1 µs each. The trajectories from the production runs of the simulated systems were used in the various comparative analyses described in Methods.

Spatial Reorganization of Helices B and C
Analysis of the conformational ensembles in the ECpH trajectories at pH 4 and pH 8 revealed structural changes in the CLC-ec1 protein in response to the decrease in pH. As shown in Figure 1A, the extracellular part of helix B moved closer to the interdomain surface as indicated by the decrease in intra-subunit distance between the D73 residues from~98 Å at pH 8 to 95 Å at pH 4. The tighter conformation shown in Figure 1A is the result of a repositioning of helices B and C. The repositioning of the R147 sidechain in helix B ( Figure 1A), which follows the protonation of D54 and E148 at lower pH values, enables the rearrangement of the two helices. Another local reorientation that likely supports the helix B shift occurs near the extracellular part of the helix. Thus, at pH 4, H70 and D73 form hydrogen bonds with the backbone atoms of G66 and T71, respectively, whereas in their deprotonated form at pH 8, both H70 and D73 adopt different conformations that destabilize the alpha-helical structure at the tip of helix B ( Figure 1B). the rearrangement of the two helices. Another local reorientation that likely supports the helix B shift occurs near the extracellular part of the helix. Thus, at pH 4, H70 and D73 form hydrogen bonds with the backbone atoms of G66 and T71, respectively, whereas in their deprotonated form at pH 8, both H70 and D73 adopt different conformations that destabilize the alpha-helical structure at the tip of helix B ( Figure 1B). The zoom-in insert shows orientations of R147 at pH 4 (in dark blue) and at pH 8 (in light blue). (B) 2D histograms of the distribution of the dihedral angles χ 1 and χ 2 at pH 4 and 8 for H70 (left two panels) and R147 (right panels). (C) Kernel plots of density (probability) distribution for the H-bonds stabilizing the conformation between the sidechain of H70, D73 and the backbone oxygen of G66 and T71, (left two panels) and the sidechains of R147 and E148, D54 (right panels).

pH-Dependent Conformational Rearrangement of Helices N, P and O
Multiple experimental studies [18,[21][22][23][24] have concluded that helices N and O exhibit pH-dependent conformational rearrangements. From the ECpH trajectories, we find that the distribution of inter-subunit distances between Cα atoms of M373 (Helix N) and E385 (helix O) increases by ~4 Å at pH 4 compared to the pH 8 conformation (Figure 2), and a subtle shift of ~0.5 Å in the RMSD of their backbone atoms ( Figure 2). The pH dependence of the shift in position of the N helix is likely related to the reorientation of the Gluex sidechain (E148) shown in Figure 2, accompanied by the reorganization of the hydrophobic environment located at the bottom of helix N (F190, F199, F357 and I186), as shown in Figure S2. Helix P also undergoes a significant displacement in the ECpH simulations at pH 4, as seen from the >1 Å decrease in ensemble average of the inter-subunit distance between the Y419 residues located on the P-Q loop ( Figure 2). This is associated with the emergence at pH 4 of a new orientation of the Y419 sidechain which, at pH 8, is buried in the hydrophobic core of the protein ( Figure S3). Notably, the position of this sidechain is similar to that observed in the crystal structure of the E113Q/E148Q/E203Q mutant CLC-ec1 (PDB ID 6V2J). Indeed, this mutant is considered to mimic the opened conformation of the WT under acidic conditions.

pH-Dependent Conformational Rearrangement of Helices N, P and O
Multiple experimental studies [18,[21][22][23][24] have concluded that helices N and O exhibit pH-dependent conformational rearrangements. From the ECpH trajectories, we find that the distribution of inter-subunit distances between Cα atoms of M373 (Helix N) and E385 (helix O) increases by~4 Å at pH 4 compared to the pH 8 conformation (Figure 2), and a subtle shift of~0.5 Å in the RMSD of their backbone atoms ( Figure 2). The pH dependence of the shift in position of the N helix is likely related to the reorientation of the Glu ex sidechain (E148) shown in Figure 2, accompanied by the reorganization of the hydrophobic environment located at the bottom of helix N (F190, F199, F357 and I186), as shown in Figure S2. Helix P also undergoes a significant displacement in the ECpH simulations at pH 4, as seen from the >1 Å decrease in ensemble average of the inter-subunit distance between the Y419 residues located on the P-Q loop ( Figure 2). This is associated with the emergence at pH 4 of a new orientation of the Y419 sidechain which, at pH 8, is buried in the hydrophobic core of the protein ( Figure S3). Notably, the position of this sidechain is similar to that observed in the crystal structure of the E113Q/E148Q/E203Q mutant CLC-ec1 (PDB ID 6V2J). Indeed, this mutant is considered to mimic the opened conformation of the WT under acidic conditions. that can form pH-dependent interactions. As shown in Figure 3, at pH 4, the H234-D240 region of this loop visits a conformation that buries hydrophobic residues V236, A237, L238 and I239 in the hydrophobic core of the subunit interface, following occasional formation of hydrogen bonds between R230 and E235. At pH 8, however, a stable salt-bridge is created between R230 and E414 (P), and R230 is no longer available to orient E235 sidechain ( Figure 3). Thus, the bending of the I-J loop is likely associated with the displacement of helix P and serves as an indicator of the transition.   The extracellular parts of helices N, O and P are located in close proximity to extracellular I-J loop (H234-L249), which also changes its conformation under acidic pH conditions. This loop carries a number of titratable residues (including H234, E235 and K243) that can form pH-dependent interactions. As shown in Figure 3, at pH 4, the H234-D240 region of this loop visits a conformation that buries hydrophobic residues V236, A237, L238 and I239 in the hydrophobic core of the subunit interface, following occasional formation of hydrogen bonds between R230 and E235. At pH 8, however, a stable salt-bridge is created between R230 and E414 (P), and R230 is no longer available to orient E235 sidechain ( Figure 3). Thus, the bending of the I-J loop is likely associated with the displacement of helix P and serves as an indicator of the transition. The extracellular parts of helices N, O and P are located in close proximity to extracellular I-J loop (H234-L249), which also changes its conformation under acidic pH conditions. This loop carries a number of titratable residues (including H234, E235 and K243) that can form pH-dependent interactions. As shown in Figure 3, at pH 4, the H234-D240 region of this loop visits a conformation that buries hydrophobic residues V236, A237, L238 and I239 in the hydrophobic core of the subunit interface, following occasional formation of hydrogen bonds between R230 and E235. At pH 8, however, a stable salt-bridge is created between R230 and E414 (P), and R230 is no longer available to orient E235 sidechain ( Figure 3). Thus, the bending of the I-J loop is likely associated with the displacement of helix P and serves as an indicator of the transition.

Discussion
Application of ECpH to model pH-dependent conformational changes of the CLC-ec1 transporter produced a detailed representation of the complex structural response to the change from neutral to acidic pH. The ability of this new approach to bring to light the pH-driven conformational changes of specific residues that lead to major repositioning of transmembrane helices inferred from experimental measurements represents significant progress in the computational representation of the functional mechanism of a transporter that is activated at the acidic pH. Unlike the trajectories from canonical MD simulation with binary protonation states, the adaptation of the molecular structure to the gradual change in protonation probability modeled in the ECpH simulation framework enables the attribution of functionally important conformational rearrangements to specific local conformational rearrangements. As detailed in the results of the ECpH simulations, above, local pHdependent rearrangements of residue interactions lead to observed rearrangements of secondary structure elements and changes in the tertiary structure of the entire protein that agree with direct experimental measurements and inferences from a variety of structurefunction investigations. This holds as well for global function-related changes in the structure, as shown by comparison of our results to recently published experimental data, which identified the same changes in the switch from the inward facing conformational state at pH 7.5 to an outward facing opened state at pH 4 [23].
The ability of ECpH simulations but not the parallel cMD runs to successfully identify the local conformational changes relates to the improved model of pH-dependent conformational transitions by the force-field scaling approach. This approach enables ECpH to perform MD simulations within a canonical (or n, P, T) ensemble, starting with a protein construct in which the representation of electrostatic and Van der Waals interactions for each titratable site is scaled (see Section 4.2.3) according to the desired pH level and defined pKa value. In cMD, the pH-dependence is modeled by a fixed protonation configuration determined by an assumed protonation status based on the pH and pKa of a particular residue, in which the occupancy of the proton is either 1 or 0. This corresponds only to the situation at the starting and final pH. In contrast, the scaling procedure is a linear interpolation between force-field representations of protonated and non-protonated states (see Section 4.2.3) that takes advantage of an ensemble definition of protonation probability for each titratable site. This yields a construct of the system that corresponds to the statistical probability in an ensemble of the molecules at a certain pH. Here, we showed that the ECpH results agree with experimental findings about the pH-dependent rearrangement of the CLC-ec1 transporter at both the local and the global scale. Thus, the ECpH simulations revealed the details of local conformational changes produced by the evolution of protonation probabilities in the systems at one pH or another, as well as their consequence in the structural rearrangement of larger motifs and elements of secondary structures such as the transmembrane helical segments. As demonstrated by the specific comparisons of ECpH results with the corresponding experimental findings (in Section 2), the agreement obtained in the transition from basic pH to acidic pH values is remarkable at both the local and the global scale. Examples of agreement with detailed results in a recent study offer a direct comparison to parallel computations with cMD.
The increase observed with ECpH in the inter-subunit distance measured from HR-AFM maps [19] between the highest points of helices B in the two protomers reproduces the HR-AFM maps of CLC-ec1 structure at pH 4.5 that show the displacement of the peaks attributed to the B-C linker. These changes are seen from the ECpH simulations to accompany sidechain reorientations of residues D54, H70, D73 and R147 that occur as a result of the changes in protonation states, as shown in Figure 1B,C. Moreover, the juxtaposition of HR-AFM maps of CLC-ec1 at pH 4.6 and 7.5 also revealed conformational rearrangements in the I-J loop, as evidenced by a decreased gap between AFM peaks attributed to residue K243 [19]. This ability of the ECpH approach to highlight the pHdependent behavior of the all-atom CLC-ec1 system had eluded our conventional MD simulations in Ref. [19], pointing to the significant advantages of using partial protonation probabilities, over the binary representation of protonation changes in cMD. The results obtained for the CLC-ec1 transporter system demonstrate the capabilities of ECpH to distinguish in this manner between the effects of pH in the range of physiologically relevant values (5)(6)(7)(8) with the updated set of individual pKa values of titratable residues estimated by PROPKAtraj [25,26], a substantial advantage over the cMD framework, which captures effects of a protein's dynamics only in a pH range in which only histidine is expected to change its protonation state.

Definition of the ECpH Framework
A comprehensive description of the ECpH methodology is presented in [20]. Here, we define the main parameters used in the MD of the CLC-ec1 molecular system with this method.

Individual pKa Definition
To model the conformational transition to an open-gate conformation of the channel, molecular dynamics simulations were carried out by starting at both pH 4 and pH 8. The pKa values were estimated by applying PROPKATraj to the 40 × 500 ns trajectories of CLC-ec1 at pH 7 described in our previous publication [19].

Protonation State Representation
The representation of ensemble protonation states is realized in the semi-grand canonical ensemble framework by the concept of co-existing protonation states. This is described by discrete switching between the protonation states of the titratable residues. Because of the high frequency of proton exchange, this approach requires significant computational resources, and for the real-world all-atom protein systems remains mostly unfeasible. For this reason, the ECpH method employs a force-field modification scheme. Our approach is formulated in the same n, P, T-ensemble framework that is commonly used in MD simulations (see Appendix A), and using the same information for input pKa values. The specific protonation scheme (Section 4.1.3) that enables the representation of the protonation states is a central element of the theoretical framework of the ECpH method.

The Protonation Scheme: Definition of The Effective Forcefield
To enable the shift between protonated and not protonated states of a molecular system composed of any number of protonatable sites, the number of particles is kept constant (no protons are added or subtracted), but the presence/absence of a proton at each site is then represented by linearly scaling the non-bonded interactions. This scaling by an individual protonation probability value is applied to sites with titratable hydrogens. The protonation probability is dynamically determined by (1) the ambient pH, and (2) the pKa value of that site in a particular conformational state of the entire system. To achieve a gradual shift between the protonated/non-protonated states of the system, the charges of the atoms neighboring the titratable residues are also scaled linearly, by the same scaling factor. Once individual pKa values are set for each protonatable residue in the system, the protonation probabilities are evaluated from the equilibrium constant of the protonation/deprotonation process with the Hill equation.
As presented in detail elsewhere [20], the core idea of the ECpH method is the introduction of the modified force field, representing the pH-dependence of the state of the system. A single canonical ensemble is created for each pH in a given range using this protonation scheme as the corresponding protonation probabilities modify accordingly the internal energies of the titratable protonation sites of the system (see Appendix A). We follow the common protocol [29,30] of applying scaling only to nonbonded terms of the potential energy to eliminate the dependence of kinetic energy on pH. The electrostatic potential energy and van der Waals terms for interactions of titratable protons are scaled by λ in Equations (1) and (2): where q i and q j are the partial charges of the two atoms, r ij is the distance between them, and ε 0 , σ ij are, respectively, the electrostatic constant and the Lennard-Jones well depth. The partial charges of atoms neighboring the titratable residues, which are affected by the change in protonation state, are calculated as follows: where q current represents the new value of the partial charges that correspond to the current level of pH, q prot stands for the partial charge value in the force-field for the atom in a protonated state, and ∆q prot→nonprot is the difference between the force-filed partial charges of the atom in a protonated and a non-protonated state. Amino acid residues considered as titratable sites include aspartate, glutamate, histidine, cysteine, and lysine [31]. Because histidine can be involved in two pH-dependent transitions-(i) deprotonation of the protonated state and (ii) transition between the ε and δ tautomeric forms-we use two scaling parameters in Equations (1)-(3) for the histidine: the λ to model deprotonation, and a binary parameter to encode the information about the tautomeric form.
We note that because adaptation of partial charges of the titratable residues to the environmental pH leads to small changes in the system's net charges, we have developed a neutralization protocol. This protocol distributes the excessive charge to the water molecules, mimicking the wet-lab titration experiment (see Appendix A).

Parameters Used in the cMD and ECpH Simulations
All MD simulations (both ECpH and cMD) were performed with the OpenMM 7.5 software [32] using the CHARMM36 all-atom force field [33], in the NPT thermodynamic ensemble at P = 1 bar and T = 310 K. The pressure was preserved by either a Monte Carlo barostat or its membrane modification implemented in OpenMM (MonteCarloMem-braneBarostat) [32]. Long-range electrostatic interactions were evaluated with PME. Nonbonded interactions' cutoff and switching distances were set to 12 Å and 10 Å, respectively. The integration step was set to 2 fs, while the fluctuations of water bonds were constrained. The pKa values used in assigning the protonation states for titratable residues in both cMD and ECpH simulations were Glu: 4.4, Asp: 4.0, Lys: 10.4, His 6.5, 9.1, Cys: 9.5, unless stated differently (see below). Cys residues involved in disulfide bridges were not titrated.

Individual pKa Assignment in CLC-ec1 Homodimer
For pKa estimation of the CLC-ec1 system, we have applied a recently developed extension to the PROPKA algorithm-PROPKA Traj [25,26]-to cMD trajectories of the protein obtained previously and described in detail in Section 4.2.2 and in ref. [19]. According to the estimated accuracy of PROPKA (RMSD = 0.89) [26], we have altered the model value of individual pKa for residues for which the median of PROPKA Traj-predicted distribution deviated from the model value by more than 1 pKa unit. If, on the other hand, the standard deviation of pKa prediction exceeded 1 pKa unit, the value of the pKa standard deviation itself was taken as a threshold instead.

ECpH and cMD Simulation Protocols for MD Trajectories of CLC-ec1 System
For CLC-eq1, the simulated molecular construct was built from PDB ID 1OTS [16] and was embedded in the membrane with the CHARMM-GUI Membrane Builder server [33][34][35]. The membrane was composed of 629 lipid molecules with a 70:30 composition of POPE:POPG. The system was solvated in explicit water solution with 0.15 M KCl. The equilibration procedure followed the standard CHARMM-GUI equilibration protocol in NAMD 2.10 software [36,37]. The ECpH MD simulation was performed for a pH range from 2 to 9 with step 1 in replicas of 6 for~1 µs each. In both ECpH and cMD simulations published previously [19], pKa values for Glu113 and Glu148 were set at 8 and 6.5, respectively, according to experimental and computational data [38,39]. All other titratable residues were assigned the pKa values of the corresponding amino acids. For cMD of the CLC-ec1 system, data were generated from two parallel trajectories of 1.6 µs each for both pH 4.5 and pH 7.5 [19].

ECpH and cMD Simulation Protocols for MD Trajectories of BBL Protein
For the ECpH simulations of the test BBL system described in Appendix B, the pH range was from 2 to 11, running for 1 µs at each pH value. The runs were initiated from both opened and closed states of the EF loop. The crystal structures of BBL at pH~8.1 (PDB ID 2BLG), and at~6.3 (PDB ID 3BLG), were used as starting points. In all the simulations, the pKa value of Glu89 was set to 7.3 according to experimental data [40,41]. The corresponding cMD simulations were performed for pH 2, 6 and 8. At pH 2, all titratable residues were protonated, while ensembles at pH 6 and 8 differed in the protonation states of Glu89 and His residues. We also performed simulations starting from each of the two states of the EF loop, for 1 µs each.

Native Contacts Analysis Procedure
The analysis of native contacts was performed on 1 µs trajectories at a pH range from 2 to 11 for ECpH and pH 2, 6 and 8 for cMD. The distance cutoff was set to 5 Å.

Supplementary Materials:
The following are available online: Figure S1 representing the aligned crystallographic structures of CLC-ec1 from PDB data bank along with the main conditions of X-ray experiment; Figure S2 showing the distribution of sidechain dihedral angles of CLC-ec1 residues L186, F190, F199 and F357 in ECpH simulations at pH 4 and 8; Figure S3 depicting the distribution of sidechain dihedrals of Y419 residue in ECpH MD simulations at pH 4 and 8; Figure S4 depicting per-residue RMSF evaluated for BBL protein from NMR, cMD and ECpH experiments; Figure S5 showing the first two principal components for the motion of BBL EF loop computed from ECpH trajectories at pH 6 and 8; Figure S6 showing the radial distribution of water molecules around protonation sites for BBL protein from ECpH and cMD at various pHs. The ECpH python code compatible with OpenMM is published in GitHub: https://github.com/weinsteinlab/ECpH-MD.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The ECpH python code compatible with OpenMM is published in GitHub: https://github.com/weinsteinlab/ECpH-MD. Along with the ECpH code the GitHub repository contains tutorial files for BBL protein system and step-by step instructions to perform the calculations reported in this manuscript.

Acknowledgments:
The authors thank George Khelashvili and members of the Weinstein lab for helpful discussions. The computational work was performed using the following resources: the Oak Ridge Leadership Computing Facility (INCITE allocations BIP109) at the Oak Ridge National Laboratory, which is a DOE Office of Science User Facility supported under Contract DE-AC0500OR22725; resources services, and support provided at the RPI Artificial Intelligence Multiprocessing Optimized System (AiMOS) system accessed through an award from the COVID-19 HPC Consortium (https://COVID19-hpc-consortium.org/), which is a unique private-public effort to bring together government, industry, and academic leaders volunteering free computer time and resources in support of COVID-19 research.

Conflicts of Interest:
The authors declare no conflict of interest.
Sample Availability: Not Applicable.

Appendix A
Expression of the pH dependence of the state. For clarity, the description of the core idea of pH dependence within a canonical ensemble framework and its implementation begins with a system with a single protonation site. The protonation probability λ for this case is equal to 1, which corresponds to the system at very low pH. Expressing the average energy of the subset of atoms of a titratable residue in this protonated state from the probability density (ρ 0 prot ) integrated over configuration space Ω, we obtain: where E 0 prot is the average total energy of the single protonation site of the system, ρ 0 prot (E) is a Boltzmann probability density of a protonated site over its whole conformational space, and U 0 prot is the internal energy of this part of the system. The expected pH-dependent differences in dynamics arise from the protonation probability densities modified by the change in proton availability defined by the pH. This pH dependence is introduced through individual protonation probabilities for each protonatable site throughout the system. We assume that the system visits a shared configurational space at the different pH values; degradation processes (e.g., complete unfolding or fragmentation) are outside the scope of this method. Therefore, for a system in any protonation state i with a probability density ρ i prot , the average energy ( E i prot ) is expressed as: where the probability density function is defined in the same configuration space Ω. Thus, by scaling the internal energy U 0 prot by the λ corresponding to the pH λ(pH), Equation (2) defines the Boltzmann probability density for any single-site system with protonation probability λ at a given pH, in terms of the probability density of the fully protonated system (ρ 0 prot ). This definition is equivalent to splitting the entire ensemble into nonprotonated (U i nonprot ) and protonated fractions (U i prot ): so that at extreme λ values, the internal energy would be: and, for any other λ, the internal energy of the corresponding ensemble is expressed as: As the protonation probabilities vary at different pH values, the difference between energies of ensembles at different pH values arises from interactions of the titratable proton, so that the internal energies U 0 prot and U nonprot represent interactions between all atoms of the system and the titratable proton, according to probability l; this makes U nonprot equal to zero.
By the same definitions, expressing the average total energy of a system with k titratable sites at a very low pH (λ 0 = · · · = λ k = 1) yields: so that for a system with multiple pH-dependent protonation sites and a set of protonation probabilities λ(pH): where λ k is an individual protonation probability for the site k calculated for the current pH. It should be noted that protonation probability values (λ k ) in each state are determined individually for each protonation site by the pH and the pKa. For complex systems with multiple protonation sites, the individual pKa values depend not only on thermodynamical parameters (T, P, n, the nature of the solution), but also on the impact of the other titratable residues and the overall conformation of the system, which can cause shifts of 2-3 units on the pKa scale [42,43].
Conservation of the net charge. All CpH approaches based on discrete protonation states face a common problem, emerging from the charge-exchange procedure that affects the net charge of the molecular system, which becomes variable, non-zero, during the calculation. Thus, approaches proposed to enable simulations in explicit solvent with PME treatment of long-range electrostatics were shown to affect the dynamical properties of the system [44,45]. Here, we use an approach that overcomes the problem by distributing the excessive net charge to a sufficiently large number of water molecules in the solution to avoid affecting their state. This is achieved by assigning an additional partial charge of 0.001e to the oxygen atoms of randomly selected water molecules. The extremely dilute acidic environment of this water solution acts as a charge buffer that can absorb from, or provide to, the molecular system the necessary charges without affecting enthalpy, entropy, or the interactions with proteins or membranes.

Appendix B
The implementation of the ECpH method to the study of conformational transitions in protein systems is designed to have the general characteristics of conventional MD simulations. This includes the stability of the trajectories generated in the runs, the integrity of the overall protein structure throughout the simulation, and the similarity of dynamic properties of the generated ensembles. To test these elements of the performance of the ECpH method in comparison to parallel canonical cMD simulations, we used a small (162 amino acids) globular protein-bovine β-lactoglobulin (BBL) as our benchmark system. BBL is known for having a pH-dependent conformational shift of the EF loop at pH~7.3 (Tanford transition) that is attributed to an unusually high pKa value of Glu89 [39,40]. In addition, the initiation of an unfolding process of BBL at low (<3) and at high (>9) pH values is associated with a series of conformational transitions that have been captured with numerous experimental techniques, including NMR, X-ray crystallography and heteronuclear single quantum coherence spectroscopy (HSQC) [46]. We therefore investigated the performance of the ECpH method in describing the pH-dependent dynamic properties of BBL from a set of ECpH trajectories obtained at pH conditions ranging from 2 to 11 and compared it to cMD simulations with protonation states of titratable residues varied to represent states at pH 2, 6, and 8. The details of the simulation protocols and parameters are given in the main Methods section (Sections 4.2.3 and 4.2.4).
Stability and dynamical properties of ECpH trajectories. The root-mean-square deviation (RMSD) computed along the trajectory for backbone atoms of the protein (see Figure A1A) is commonly used to infer the stability of MD methods [47,48]. We also use the comparison of standard deviation of the RMSD (STD RMSD ) calculated for short segments of the trajectories to quantify that stability ( Figure A1B). As shown in Figure A1A, the canonical MD and ECpH trajectories initiated from the same conformation of BBL (PDB ID 3BLG, pH~6.3) [39] demonstrate common trends of pH dependence. Ensembles generated by the computations at pH 6 refer to the folded conformation of BBL with EF-loop closed and, thus, should be the closest to X-ray structure. Indeed, at pH 6, the RMSD of Cα-atoms converges rapidly in both ECpH and canonical MD simulations within the first 300 ns, to values falling below 2.5 Å on average, with STD RMSD remaining~0.8 Å ( Figure A1B). The initiation of an unfolding process involving loosening of the proteins' loops was observed at pH 2 in NMR and other spectroscopic studies [49][50][51]. This is evidenced in the RMSD profile of the ECpH trajectory at pH 2 by the change representing the rearrangements of loops AB, BC, CD, and DE into a loose opened conformation at~600 ns ( Figure A1, Figure S4). In the canonical MD simulations, the geometry of BBL evolves gradually to the same RMSD values. Transition to a more basic environment (pH 8) forces the system into a gradual conformational change associated with the Tanford transition of loop EF ( Figure A2, Figure S5). Stability and dynamical properties of ECpH trajectories. The root-mean-square deviation (RMSD) computed along the trajectory for backbone atoms of the protein (see Figure A1A) is commonly used to infer the stability of MD methods [47,48]. We also use the comparison of standard deviation of the RMSD (STDRMSD) calculated for short segments of the trajectories to quantify that stability ( Figure A1B). As shown in Figure A1A, the canonical MD and ECpH trajectories initiated from the same conformation of BBL (PDB ID 3BLG, pH ~ 6.3) [39] demonstrate common trends of pH dependence. Ensembles generated by the computations at pH 6 refer to the folded conformation of BBL with EF-loop closed and, thus, should be the closest to X-ray structure. Indeed, at pH 6, the RMSD of Cα-atoms converges rapidly in both ECpH and canonical MD simulations within the first 300 ns, to values falling below 2.5 Å on average, with STDRMSD remaining ~0.8 Å ( Figure  A1B). The initiation of an unfolding process involving loosening of the proteins' loops was observed at pH 2 in NMR and other spectroscopic studies [49][50][51]. This is evidenced in the RMSD profile of the ECpH trajectory at pH 2 by the change representing the rearrangements of loops AB, BC, CD, and DE into a loose opened conformation at ~600 ns ( Figure A1, Figure S4). In the canonical MD simulations, the geometry of BBL evolves gradually to the same RMSD values. Transition to a more basic environment (pH 8) forces the system into a gradual conformational change associated with the Tanford transition of loop EF ( Figure A2, Figure S5).   We used the standard technique of native contact analysis [47] to evaluate the stability of BBL in the ECpH and cMD simulations, with reference to the crystal structure of BBL at pH 6 (PDB ID 3BLG), in which the EF loop is in the closed conformation ( Figure  A1C). Interestingly, the changes in the fraction of native contacts in response to pH changes calculated with the two methods are similar and reproduce the experimental structural data [46,49,[51][52][53].
Water distribution. Since the force field modification scheme in ECpH uses explicit water molecules as a charge buffer, the method was also tested for the ability to reproduce proper water dynamics. The radial distribution functional (RDF) analysis demonstrates identical water densities in canonical MD and ECpH trajectories. Calculated separately, the water distribution around titratable protons is seen to morph gradually between the non-protonated and protonated state densities observed in cMD trajectories as the pH changes ( Figure S6 in the Supplementary Information).
Advantages of the ECpH method in describing the dynamics of the BBL protein. The pH-dependent conformational change in BBL, captured in both the crystal structure and by NMR, is known as a reversible Tanford transition of the EF-loop at pH 7, associated with a highly elevated pKa value of Glu89 (~7.3) [39,40]. Below pH 7, the carboxyl of Glu89 is buried in the hydrophobic core of BBL, and the gates formed by the EF and AB loops are closed. We observed the opening of the EF-loop at pH 8 in 1 μs trajectories obtained from both methods ( Figure A2, right).
Another pH-dependent conformational change was observed for Trp61 in the fluorescence study conducted at pH ~ 2 (2.0-3.5), ~4 (4.0-5.5), ~6 (6.0-7.0), 7.5 and 8.0 by Sakurai et al. [46]. The three stable states observed were assigned to states of BBL unfolding: native, loose, and intermediate. The intermediate of the unfolding process was attributed to the reorientation of Trp61, which the authors considered to be triggered by the Tanford EF transition occurring at ~pH 7. ECpH simulations have captured the new orientation of Trp61 triggered by the overall conformational change at pH 7. In its intermediate conformation ( Figure A3B), the Trp61 side chain is stabilized by π-interactions, with Arg40 located on AB ( Figure A3A). Notably, the cMD simulations do not sample the intermediate state at the pH levels tested ( Figure A3C). A third state, corresponding to the beginning of unfolding, is visited only very briefly in cMD at pH 6 ( Figure A3C), whereas in the ECpH simulations, it is strongly populated at extreme pH values (pH 2, 10), at which multiple experiments had reported observations of the beginning of unfolding [49][50][51][52]. Although the unfolding process is too slow to be captured in the 1 μs simulations that we carried out [49], the initiation of the process is revealed in the mechanistic details emerging from the ECpH simulation, but not with cMD. We used the standard technique of native contact analysis [47] to evaluate the stability of BBL in the ECpH and cMD simulations, with reference to the crystal structure of BBL at pH 6 (PDB ID 3BLG), in which the EF loop is in the closed conformation ( Figure A1C). Interestingly, the changes in the fraction of native contacts in response to pH changes calculated with the two methods are similar and reproduce the experimental structural data [46,49,[51][52][53].
Water distribution. Since the force field modification scheme in ECpH uses explicit water molecules as a charge buffer, the method was also tested for the ability to reproduce proper water dynamics. The radial distribution functional (RDF) analysis demonstrates identical water densities in canonical MD and ECpH trajectories. Calculated separately, the water distribution around titratable protons is seen to morph gradually between the non-protonated and protonated state densities observed in cMD trajectories as the pH changes ( Figure S6 in the Supplementary Information).
Water distribution. Since the force field modification scheme in ECpH uses explicit water molecules as a charge buffer, the method was also tested for the ability to reproduce proper water dynamics. The radial distribution functional (RDF) analysis demonstrates identical water densities in canonical MD and ECpH trajectories. Calculated separately, the water distribution around titratable protons is seen to morph gradually between the non-protonated and protonated state densities observed in cMD trajectories as the pH changes ( Figure S6 in the Supplementary Information).
Advantages of the ECpH method in describing the dynamics of the BBL protein. The pH-dependent conformational change in BBL, captured in both the crystal structure and by NMR, is known as a reversible Tanford transition of the EF-loop at pH 7, associated with a highly elevated pKa value of Glu89 (~7.3) [39,40]. Below pH 7, the carboxyl of Glu89 is buried in the hydrophobic core of BBL, and the gates formed by the EF and AB loops are closed. We observed the opening of the EF-loop at pH 8 in 1 µs trajectories obtained from both methods ( Figure A2, right).
Another pH-dependent conformational change was observed for Trp61 in the fluorescence study conducted at pH~2 (2.0-3.5),~4 (4.0-5.5),~6 (6.0-7.0), 7.5 and 8.0 by Sakurai et al. [46]. The three stable states observed were assigned to states of BBL unfolding: native, loose, and intermediate. The intermediate of the unfolding process was attributed to the reorientation of Trp61, which the authors considered to be triggered by the Tanford EF transition occurring at~pH 7. ECpH simulations have captured the new orientation of Trp61 triggered by the overall conformational change at pH 7. In its intermediate conformation ( Figure A3B), the Trp61 side chain is stabilized by π-interactions, with Arg40 located on AB ( Figure A3A). Notably, the cMD simulations do not sample the intermediate state at the pH levels tested ( Figure A3C). A third state, corresponding to the beginning of unfolding, is visited only very briefly in cMD at pH 6 ( Figure A3C), whereas in the ECpH simulations, it is strongly populated at extreme pH values (pH 2, 10), at which multiple experiments had reported observations of the beginning of unfolding [49][50][51][52]. Although the unfolding process is too slow to be captured in the 1 µs simulations that we carried out [49], the initiation of the process is revealed in the mechanistic details emerging from the ECpH simulation, but not with cMD. (C) Distribution of Trp61 sidechain dihedrals in cMD at pH 2, 6, and 8. The Unfolded* label refers to the initiation of the unfolding process indicated first by the increased mobility of the loops as described in detail in [49][50][51][52].