Influence of Secondary-Structure Folding on the Mutually Exclusive Folding Process of GL5/I27 Protein: Evidence from Molecular Dynamics Simulations

Mutually exclusive folding proteins are a class of multidomain proteins in which the host domain remains folded while the guest domain is unfolded, and both domains achieve exchange of their folding status by a mutual exclusive folding (MEF) process. We carried out conventional and targeted molecular dynamics simulations for the mutually exclusive folding protein of GL5/I27 to address the MEF transition mechanisms. We constructed two starting models and two targeted models, i.e., the starting models GL5/I27-S and GL5/I27-ST in which the first model involves the host domain GL5 and the secondary-structure unfolded guest domain I27-S, while the second model involves the host domain GL5 and the secondary/tertiary-structure extending guest domain I27-ST, and the target models GL5-S/I27 and GL5-ST/I27 in which GL5-S and GL5-ST represent the secondary-structure unfolding and the secondary/tertiary-structure extending, respectively. We investigated four MEF transition processes from both starting models to both target models. Based on structural changes and the variations of the radius of gyration (Rg) and the fractions of native contacts (Q), the formation of the secondary structure of the I27-guest domain induces significant extending of the GL5-host domain; but the primary shrinking of the tertiary structure of the I27-guest domain causes insignificant extending of the GL5-host domain during the processes. The results indicate that only formation of the secondary structure in the I27-guest domain provides the main driving force for the mutually exclusive folding/unfolding between the I27-guest and GL5-host domains. A special structure as an intermediate with both host and guest domains being folded at the same time was found, which was suggested by the experiment. The analysis of hydrogen bonds and correlation motions supported the studied transition mechanism with the dynamical “tug-of-war” phenomenon.

The details of the model construction for the GL5/I27-S and GL5/I27-ST models were described as follows: the folded coordinates of Met1-Val39 and Asp136-Gly152 sequences of the folded GL5-host domain and the folded coordinates of Leu45-Leu133 sequences of the I27-guest domain were taken from the NMR structures of the GB1 protein (PDB:2GB1) and the I27 protein (PDB:1TIT), respectively. These coordinates were loaded in the Discovery Studio visualizer (http://accelrys.com/) by adding the additional residues of Gly40-Gly44 and the linker of Leu134-Gly135 using "Insert after" option of "Build and Edit Protein" menu, and were positioned manually close to each other. The complex folded coordinates of GL5/I27 protein were generated with four inserted sites by using "Save" option in the Discovery Studio visualizer. The unfolded coordinates of I27-guest domain in the complex GL5/I27 protein were built through using repeatedly the "Build Loop" option of "Build" menu on several β strands in the Swiss-PdbViewer (also known as DeepView, http://spdbv.vital-it.ch/). Further, the unfolding sizes of the secondary-structure and the secondary/tertiary-structure for such complex coordinates were manually adjusted in the Discovery Studio visualizer. Finally, the two constructed GL5/I27-S and GL5/I27-ST models were fully minimized by using AMBER 9 package to obtain the appropriate joined coordinates of the folded GL5-host and unfolded I27-guest domains with the secondary-structure unfolding and the secondary/tertiary-structure extending. Similarly, the other two models of GL5-S/I27 and GL5-ST/I27 corresponding to the secondary-structure unfolding and the full extending in the GL5-host domain were built by using the same method.

Conventional Molecular Dynamics Simulation Protocols
All CMD simulations for these built models were carried out using the AMBER 9 package [1] and ff03 all atom force field parameters [2][3][4]. The protocol for all CMD simulations was described as follows: (1) the systems were energetically minimized to remove unfavorable contacts. Four cycles of minimizations were performed with 2500 steps of each minimization and harmonic restraints on the GL5/I27-S, GL5/I27-ST, GL5-S/I27 and GL5-ST/I27 models from 100, 75, 50 to 25 kcal/(mol·Å 2 ), which means that the restraints were relaxed stepwise by 25 kcal/(mol·Å 2 ) per cycle. The fifth cycle consists of 5000 steps of unrestrained minimization before the heating process. The cutoff distance used for the non-bonded interactions was 10 Å. The SHAKE algorithm [5] was used to restrain the bonds containing hydrogen atoms; (2) each energy-minimized structure was heated over 120 ps from 0 to 300 K (with a temperature coupling of 0.2 ps, while the positions of the GL5/I27-S, GL5/I27-ST, GL5-S/I27, and GL5-ST/I27 models were restrained with a small value of 25 kcal/(mol·Å 2 ). Constant volume was maintained during the processes; (3) the unrestrained equilibration of 200 ps with constant pressure and temperature conditions was carried out for each system. The temperature and pressure were allowed to fluctuate around 300 K and 1 bar, respectively, with the corresponding coupling of 0.2 ps. A Langevin thermostat (NTT = 3) was used for the temperature regulation of each system. For each simulation, an integration step of 2 fs was used; and (4) finally, conventional molecular dynamics (CMD) runs of 50 ns for GL5/I27-S, GL5/I27-ST, GL5-S/I27, and GL5-ST/I27 models were carried out, respectively, by following the same protocol.

Targeted Molecular Dynamics Simulation Protocols
In the targeted molecular dynamics simulations [6,7], a time-dependent constraint force was applied onto the positions of all the heavy atoms of the GL5/I27-S, GL5/I27-ST, GL5-S/I27 and GL5-ST/I27 models at each time-step to bias the trajectory toward the corresponding target structure. To obtain the appropriate small force constant k, four 10-ns independent targeted molecular dynamics (TMD) simulations (including the last 2 ns of equilibration) were performed using different force constants of 0.5, 1.0, 2.0, 3.0, and 4.0 kcal/(mol·Å 2 ), respectively. As shown in Figure S5 for the conversion of GL5/I27-S to GL5-ST/I27, three out of the four TMD simulations reached a RMSD value of <0.5 Å extracting from the backbone atoms of two GL5/I27-S and GL5-ST/I27 models. To achieve the satisfied transition process of the initial structure GL5/I27-S to the target structure GL5-ST/I27, k = 3.0 kcal/(mol·Å 2 ) was chosen as the lowest harmonic force constant to apply onto all the heavy atoms of four models to bias the trajectories toward the target structure. All the other atoms in the two GL5/I27-S and GL5-ST/I27 models were assumed to move freely and to rearrange at their convenience to let the proteins reach their equilibrium under this constraint. The weighted RMSDs along the TMD trajectories from four, 10-ns independent simulations showed that each process could follow the same conformational transition pathway, which indicated that the external forces and initial velocities did not affect the transition pathway. Therefore, the TMD simulations with the 8 ns simulation time followed by 2 ns of equilibration with an integration step of 2 ps were performed for these four mutually exclusive processes. . The fraction of native contacts Q defined as the number of native residue-residue contacts in the predicted complex divided by the number of native contacts in the target was used as order parameter for the folding degrees (Q = 0 at the fully unfolded state and Q = 1 at the folded state) [11][12][13]. In recent years, these two parameters have been shown to be a key determinant of transition state structure and the mechanism of protein folding [14,15]. A "hydrogen bond" is defined as a distance of less than 3.5 Å between a hydrogen atom attached to either an oxygen or a nitrogen atom and an acceptor atom, and as an angle formed by a donor, a hydrogen atom and an acceptor being larger than 120° with the corresponding occupancy of ≥30%. To compare the amount of the total possible hydrogen bonds in the various transition conformations, a relative percentage of total hydrogen bond occupancies for a conformation was calculated by defining the relative percentage equal to one hundred percent when the all possible hydrogen bonds occupy each snapshot in the reactant during the mutually exclusive process. These parameter analyses in the present work were calculated by using the PTRAJ module of the AMBER 9 program.
The dynamic feature of a protein and the extent of correlation of motions in the different regions of a protein were assessed via the calculation of cross-correlation coefficients, C(i, j), given as follows: In the equation, ∆ri and ∆rj are the displacement vectors for Cα atoms of residues i and j, respectively, and the angle brackets denote the ensemble averages [4]. The correlation coefficients were averaged over the regions of protein, and the resulted cross-correlation coefficients are presented in the form of a two-dimensional graph.
The correlation of motions analyses in the present work were calculated by using the PTRAJ module of the AMBER 9 program.    I27-guest domain for the GL5/I27-ST,  GL5-ST/I27 states and I', II', III',