Free Energy Analyses of Cell-Penetrating Peptides Using the Weighted Ensemble Method

Cell-penetrating peptides (CPPs) have been widely used for drug-delivery agents; however, it has not been fully understood how they translocate across cell membranes. The Weighted Ensemble (WE) method, one of the most powerful and flexible path sampling techniques, can be helpful to reveal translocation paths and free energy barriers along those paths. Within the WE approach we show how Arg9 (nona-arginine) and Tat interact with a DOPC/DOPG(4:1) model membrane, and we present free energy (or potential mean of forces, PMFs) profiles of penetration, although a translocation across the membrane has not been observed in the current simulations. Two different compositions of lipid molecules were also tried and compared. Our approach can be applied to any CPPs interacting with various model membranes, and it will provide useful information regarding the transport mechanisms of CPPs.


Introduction
Cell-penetrating peptides (CPPs) have been extensively studied for a long time since they are capable of transporting various cargoes (e.g., proteins, peptides, DNAs, and even small drugs) into cells [1]. Various factors, including the concentration of CPPs and the properties of the membrane affect the transport mechanisms of CPPs [2,3]. It has been known that the translocation mechanisms of different families of CPPs are not the same, and most CPPs can have more than a single pathway depending on the experimental conditions such as a concentration of CPPs [4].
Arginine (R)-rich peptides have been extensively studied because of their effectiveness in translocation [2,5]. One of the possible scenarios for the insertion of R-rich peptides into the lipid bilayer is a strong interaction with negatively charged phospholipid heads.
Molecular dynamics (MD) simulations have been used to investigate functional properties of various CPPs and their interactions with many different lipids; however, the mechanism of translocation of CPPs and interactions with lipids are still under debate.
Herce et al. [6,7] showed in their MD simulations that the attractive interactions between the Arg 9 (or Tat) peptides and the phosphate groups of the phospholipids results in significant local distortions of the bilayer, and these distortions lead to the formation of a toroidal pore. However, Herce et al.'s simulation was criticized by Yesylevskyy et al. [8] that the spontaneous translocation of CPPs in MD simulations is not expected within a short time scale (100∼200 ns). It is believed that the time required for the translocation of CPP is on the order of minutes [9].
It is unclear that we can observe the translocation of CPPs across the membranes using conventional all-atom MD simulations. However, an MD simulation is still one of valuable tools to study protein-lipid interactions, and it can provide us with detailed information such as the contribution of electrostatic energy between CPPs and the membranes, effects of water molecules, etc. In our previous work [10], we found that the electrostatic interaction between Arg 9 and a DOPC/DOPG(4:1) membrane plays a role at the initial stage of translocation. We also quantitatively showed that several water molecules coordinated by Arg 9 increased during the penetration, and this led to a membrane thinning and a decreased bending rigidity of the membrane. However, we could not find a translocation of Arg 9 across the membrane within a 1 µs time scale.
Due to the short time scale achieved in current MD simulations, people have been using biased simulations (e.g., umbrella sampling [11][12][13], steered MD simulations [14]) to study CPPs and their interactions with membranes. The umbrella sampling is very popular to obtain free energy barrier between CPPs and membranes. However, people have noticed that there could be an artifact in the free energy analysis due to a short simulation time in each window.
In this study, we implement a weighted ensemble (WE) method [15] in our MD simulations (see more in the Section 2). The WE method is one of very flexible path sampling techniques and it is easy to implement. A detailed review was given by Zuckerman and Chong [15]. We use the WESTPA software [16,17], which has been widely applied to various systems, ranging from atomistic to cellular scale. It turns out that the WE method is very effective for studying interactions between CPPs and membranes and for obtaining free energy barriers in the current study.
First, we apply the WE method to Arg 9 with a DOPC/DOPG(4:1) membrane used in the previous study [10] as well as Tat with identical lipids. Only a few simulations used mixtures of lipids (e.g., a mixture of DOPC/DOPG or DOPC/DOPE) so far, and it is still unclear whether the surface charge of a membrane affects the translocation of CPPs. For example, a DOPC/DOPG(1:1) membrane appeared the better host for the translocation of KR9C in experiments [18] and the fluorescence probe carboxyfluorescein-labeled R9 (CF-R9) translocates continuously across the lipid membrane of single DOPG/DOPC(2/8), DOPG/DOPC(4/6), and DLPG/ DTPC(2/8) giant unilamellar vesicles (GUVs) and enters these GUVs without pore formation [19]. However, Herce et al. showed in their experiments that DOPG lipids are not necessary for the Arg 9 peptides to penetrate bilayers [7]. Thus, we also simulate two different lipid compositions, DOPC/DOPE(4:1) and DOPC lipids, respectively, to see if the surface charge of model membranes affects the penetration of CPPs. Then, the free energy barriers between CPPs and model membranes can be obtained using the WE simulation data, and these free energy calculations will be helpful to understand the transport mechanisms of CPPs.

Equilibrium Simulations
All simulations were performed using the NAMD package [20] and CHARMM36 force field [21]. The following systems were well equilibrated before starting the WE simulations: 4 Arg 9 with a DOPC/DOPG(4:1) membrane (System I), 4 Tat with a DOPC/DOPG(4:1) membrane (System II), 4 Arg 9 with a DOPC/DOPE(4:1) membrane (System III), and 4 Arg 9 with a DOPC membrane (System IV). CHARMM-GUI [22] was used to setup a DOPC/DOPG(4:1) membrane, a DOPC/DOPE(4:1) membrane, a DOPC membrane and TIP3P water molecules. The DOPC/DOPG(4:1) mixture consists of 76 DOPC and 19 DOPG lipids in each layer in System I. The same model membrane was used in System II. System III has a DOPC/DOPE(4:1) mixture which consists of 80 DOPC and 20 DOPE lipids in each layer. System IV has 95 DOPC lipids in each layer. K and Cl ions were added in each system to make a concentration of 150 mM. All Arg 9 (in System I, III, and IV) or Tat (in System II) peptides were initially located in the upper solution, and they were bound to the upper layer during the equilibration.
The NPT simulations were performed at T = 310K. Temperature and pressure were kept constant using Langevin dynamics. An external electric field (0.05 V/nm) was applied in the negative z-direction (from CPPs to the membrane) as suggested in previous work [7,23] and also in our previous simulations [10] to account for the transmembrane potential [24]. The particle-mesh Ewald (PME) algorithm was used to compute the elec-tric forces, and the SHAKE algorithm was used to allow a 2 fs time step during the whole simulations.
In the case of System I, we use a pre-equilibrated structure in the previous study [10], which was equilibrated up to 1 µs. System II was also equilibrated for at least 1 µs. The other two systems (System III and IV) were equilibrated for 100 ns before starting the WE simulations. This time scale is enough for CPPs to close to model membranes. Figure S1 (see Supplementary Information) shows both an initial setup and a snapshot after equilibration for each system, and Figure S2 presents radial distribution functions of the followings: phosphorus atoms vs. water, phosphorus atoms vs. oxygen in water, ester oxygen vs. oxygen in water. These plots show that the quality of hydration of membranes is very similar to each other. During the equilibration, CPPs were confined in the upper water box so that there is no interaction between CPPs and the lower leaflet of model membranes. Whenever a CPP is leaving the upper water box, a small force was applied to pull that CPP inside the box. All CPPs in each system were well contacted with the lipid molecules after the equilibration, and then the WE simulations were performed using those equilibrated systems.

Weighted Ensemble (WE) Simulations
We use the WESTPA (The Weighted Ensemble Simulation Toolkit with Parallelization and Analysis) software package [16,17] to enable the simulation of rare events, for example, translocation of CPPs across a model membrane. WESTPA is an open source, and its utility has been proved for a broad range of problems. All WE trajectories are unbiased and hence used to calculate conditional probabilities or transition rates.
To use the WE method in MD simulations, we need to define a progress coordinate, several bins, several walkers (child simulations) in each bin, and a time interval for splitting and combining trajectories [17]. We define the progress coordinate as a distance in zdirection between the center of mass of phosphorus atoms in the upper leaflet and that of a CPP (Arg 9 or Tat). After equilibration of each system, an initial distance between phosphorus atoms and a CPP was measured, and boundaries were set using this initial position and the position of the center of the membrane, for example, [−18 Å (the center of membrane), 3 Å (the initial distance)]. Each bin size was 0.25 Å and, the number of walkers in each bin was 5. The time interval for splitting and combining the trajectories was set 5 ps during all the WE simulations. The progress coordinate was calculated using MDAnalysis [25]. The potential mean of force (PMF) profiles were obtained from each WE simulation data using "w_pdist" and "plothist" codes in the WESTPA package [16,17].

WE Simulations Show Much-Enhanced Penetration within a Short Amount of Time
We observe that both Arg 9 and Tat penetrate through the middle of the model membrane during the WE simulations; however, translocation across the membrane has not been observed in the current simulations. Figure 1 shows the penetration depth of both Arg 9 and Tat vs. the number of iterations of each WE simulation. Here, we define the penetration depth as a distance between the center of mass of CPP and that of phosphorus atoms of the upper leaflet. The penetration depth of Arg 9 in the previous equilibrium simulation was about −6∼−7 Å [10], and the current WE simulation shows −17.6 Å and −17.7 Å for Arg 9 and Tat, respectively. The initial positions of Arg 9 and Tat after equilibration are different from each other; however, they quickly penetrate the DOPC/DOPG(4:1) membrane. Please note that these plots show only one of the trajectories of each simulation which shows the maximum penetration depth. Multiple trajectories cannot reach the middle of the membrane. As one can see in the figure, the penetration depth changes a lot within a very short time scale. As described in the Section 2 each iteration corresponds to 5 ps and the figure shows that both CPPs reach their maximum penetration depth within a very short time scale (3∼5 ns) after the WE simulation started. Therefore, the WE method provides us with a very effective tool to overcome potential energy barriers. Figure 2 presents snapshots of both Arg 9 and Tat, which show the maximum penetration depth during the WE simulation. The yellow shows each CPP, and gray is the lipids. The blue and red are phosphorus atoms of the upper and the lower leaflets. Water molecules, ions, and the other three CPPs are omitted for clarity. As one can see in Figure 2, the membrane is deformed a lot when Arg 9 or Tat penetrates the middle of the membrane. Figure S3 in Supplementary Information shows membrane curvature more clearly. As shown in the previous simulation [10], several water molecules around a CPP is increased when the CPP penetrates the membrane, and a membrane thinning is also observed in the current WE simulations.   In our previous equilibrium simulation of Arg 9 with a DOPC/DOPG(4:1) mem-157 brane, we showed that electrostatic energy between Arg 9 and the membrane is strongly 158 correlated with the penetration depth of Arg 9 [10]. This proved that electrostatic in-159 teraction between a CPP and a membrane is essential at the early translocation stage 160 [26]. 161 We calculate the same electrostatic energy between Arg 9 (or Tat) and the membrane. 162 We want to see if there is still a strong correlation between electrostatic energy and

Electrostatic Energy between a CPP and a Model Membrane Plays a Role during the Translocation
In our previous equilibrium simulation of Arg 9 with a DOPC/DOPG(4:1) membrane, we showed that electrostatic energy between Arg 9 and the membrane is strongly correlated with the penetration depth of Arg 9 [10]. This proved that electrostatic interaction between a CPP and a membrane is essential at the early translocation stage [26].
We calculate the same electrostatic energy between Arg 9 (or Tat) and the membrane. We want to see if there is still a strong correlation between electrostatic energy and penetration depth during the WE simulations. We use two trajectories in Figure 1 to calculate electrostatic energy. Figure 3 presents the penetration depth (left axis, black line) vs. electrostatic energy (right axis, blue line) between Arg 9 and the lipid molecules within 15 Å of Arg 9 . The red dotted line is an average over every 30 iterations. The figure shows that an absolute magnitude of electrostatic energy becomes smaller when Arg 9 rapidly penetrates inside the membrane (e.g., between 450th ∼ 550th iterations). Therefore, one can expect that the penetration increases as the magnitude of electrostatic energy becomes smaller.
We find similar behavior in the case of Tat. Figure 4 shows the same energy calculation for Tat. The red dotted line is an average over every 30 iterations as before. The penetration increases gradually as the magnitude of electrostatic energy becomes smaller.
At the initial stage of translocation, it is well known that the interaction between positive charges of CPPs and a negative part of lipid molecules is essential for binding. The current simulations show that electrostatic interaction between CPPs and membranes still plays a role during the translocation of CPPs. In System I and System II Arg 9 and Tat strongly interact with DOPG lipids, and the translocation seems not easy. However, our simulations show that the translocation of CPPs is possible even in this strong electrostatic interaction. During the translocation, the electrostatic interaction should be diminished. We conjecture that water molecules play a role to make electrostatic energy weaker. As shown in the previous work [10], several water molecules around CPPs are increased during the translocation. penetration depth during the WE simulations. We use two trajectories in Fig. 1 to   164 calculate electrostatic energy. the membrane (e.g., between 450th ∼ 550th iterations). Therefore, one can expect that 170 the penetration increases as the magnitude of electrostatic energy becomes smaller. 171 We find similar behavior in the case of Tat. Fig. 4 shows the same energy calculation

The Free Energy Profiles Based on the WE Simulation Data Show That Arg 9 Penetrates through a DOPC/DOPG(4:1) Model Membrane Much Easier Than Tat Does
This section presents the potential mean of force (PMF) profiles of Arg 9 and Tat along the penetration paths shown in Figure 1. Another advantage of using the WE method is that it provides a free energy analysis without any biased sampling (e.g., umbrella sampling) [17]. Therefore, these PMF profiles will tell us how much potential energy barriers exist along those paths and which CPP can penetrate the membrane more efficiently. Figure 5 shows the PMF profiles of both Arg 9 and Tat as a function of a progress coordinate (or reaction coordinate). We use a penetration depth as the progress coordinate. The progress coordinate becomes negative when Arg 9 and Tat penetrate below the upper leaflet of the membrane. The blue line is the PMF profile for Arg 9 and the red line for Tat. One can notice that the free energy of Arg 9 is much lower than that of Tat. Figure 5 shows that a free energy barrier is about 100 kT for Arg 9 to reach the middle of the membrane while it is about 280 kT for Tat to reach a similar position. This means that Arg 9 penetrates the DOPC/DOPG(4:1) membrane much easier than Tat does. However, these analyses can depend on the type of model membranes, and we cannot conclude that Arg 9 is always more efficient for penetrating membranes than Tat regardless of membrane types. The order of magnitude of free energy for Arg 9 (∼100 kT) is very similar to a previous result [27], which used umbrella samplings. On the other hand, Tat's order of magnitude of free energy is much higher than those of previous simulations [8,13]. Please note that most of the previous free energy analyses were done with a single lipid molecule (e.g., DOPC or DPPC), while we use a mixed composition of lipid molecules (DOPC/DOPG(4:1)) in our simulations. Although the magnitude of free energy can depend on systems studied, the observed tendency between Arg 9 and Tat is consistent with experimental results [28,29] which showed more efficient cellular uptake of Arg 9 than Tat. is that it provides a free energy analysis without any biased sampling (e.g., umbrella 190 sampling) [16]. Therefore, these PMF profiles will tell us how much potential energy

Different Lipid Compositions Slightly Affect the PMF Profiles of Arg 9
It is interesting to see whether a different lipid composition can affect the penetration depth and the PMF profile. Figure 6 presents the PMF profiles of Arg 9 with DOPC/DOPG(4:1) lipids, DOPC/DOPE(4:1) lipids, and DOPC lipids, respectively. The blue line is for Arg 9 with DOPC/DOPG(4:1) lipids shown already in Figure 5. The red one is for Arg 9 with DOPC/DOPE(4:1) lipids and the orange for Arg 9 with DOPC lipids. DOPC and DOPE lipids are neutral, while DOPG lipids are negatively charged. We want to see if the translocation of CPPs strongly correlates with the lipid types, e.g., the charge of membranes. The minimum point in each plot is different from each other because the initial position of each CPP after equilibration is not the same. Although the initial positions are different from each other, the slope of each plot looks very similar to each other. Although Arg 9 in System III (with DOPC/DOPE(4:1) lipids) and Arg 9 in System IV (with DOPC lipids) have not reached the final position as of System I (with DOPC/DOPG(4:1)), the similar slope means similar variations in free energy. Based on these plots, we can conjecture that the penetration of Arg 9 is not much dependent on the lipid composition, and the effect of surface charge is minimal.

Discussion
Both Arg 9 and Tat do not show any translocation across the model membranes during the WE simulations. However, the WE method can be helpful to identify the transport mechanisms of CPPs and interactions between CPPs and lipid molecules because conventional equilibrium MD simulations have difficulty in sampling. Free energy (potential mean of force, PMF) profiles can be obtained without any biased simulation (e.g., umbrella sampling) within the WE approach. The WE simulation and its free energy analyses can be used to study the efficiency of penetration of CPPs.
Is it possible to observe the translocation if we simulate much longer? Currently, both Arg9 and Tat are stuck in the hydrophobic core. They can stay in the hydrophobic core for a long time even if we simulate much longer. They can move up and down a little, but it is not likely to move in either direction, moving back or forward. A visual inspection shows that they are trapped inside the hydrophobic core. The lipids make a small cage around CPPs so that CPPs cannot move easily. It seems that a single CPP cannot cause further deformation of the membrane at the current stage, and thus it is difficult to move. There is a possibility that the free energy barriers along the downward direction are much higher than those along the upward direction. In that case, it is easier for CPPs to go back to the hydrophilic region where they started. It will be interesting to try the WE simulations from the current positions of both Arg9 and Tat to the hydrophilic area outside, i.e., the positive z-direction. Then, we will obtain another free energy profiles for both Arg9 and Tat, and these profiles will give us a clue.
There are still some limitations in our WE simulations. First, a time scale of each trajectory shown in Figure 1 is less than 10 ns. Please note that the total simulation time for each CPP simulation is about 5∼10 µs, which includes all the walkers (child simulations) in each bin and all the iterations. This time scale of a single trajectory might be too short to observe a water pore mentioned in the previous results [8,13,27,30]. Second, our PMF profiles are incomplete because we do not find any translocation of CPPs and PMF profiles spanned only from the top to the middle of the model membranes. It has been known that free energy barriers between CPPs and membranes are changed in the presence of a water pore. For example, Huang et al. [27] reported that the free energy becomes lower in the presence of a water pore. This confirms the previous results from MD simulations and a continuum modeling which calculate the free energy of a single arginine residue for translocating across a membrane [31][32][33][34]. We need more WE simulation data points to complete the free energy profiles, which include translocation of CPP across the membranes and the effect of the water pore. Third, we have not attached any cargo to Arg 9 (or Tat) in the current study. We expect that the conformational changes both in CPP and the model membranes depend on a type of cargo. CPPs are commonly used with attaching negatively charged molecules and thus the total charge of the system decreases or becomes closer to neutral. It is not clear whether the overall charge of the whole system or only the charge of CPP is essential for the internalization. Most electronic neutral CPPs are usually far less powerful than cationic or amphiphilic CPPs. However, the electronic neutral cyclic peptide cyclosporin A (CsA) can have stronger (5-fold or 19-fold) penetrating capacity than other neutral peptides and its penetrating capacity was as strong as Tat [35]. On the other hand, in the case of oligoarginine (R n : n = 4, 8, 12, 16) R 12 and R 16 showed more efficient uptake than R 4 and R 8 [36]. If only the charge of CPP is essential during the translocation, we can compare the free energy barriers of these oligoarginine within the WE method and identify their penetration efficiency.
It has been known that it is difficult for a single CPP to make a pore and to make a translocation across the membrane [8]. One of the key factors which makes the water pore is a concentration of CPPs. In our WE simulations, the P(protein)-L(lipids) ratio was 0.042 or 0.05, and these concentrations maybe not be enough to observe the translocation across the membrane. Another critical factor for translocating CPPs is pH dependence. Experimental results showed the uptake of CPPs depends on pH values [37,38], and the uptake of CPPs is enhanced at low pH. It is worth trying MD simulations at low pH to see if any changes in conformational changes both in CPPs and in the model membranes. There has been significant progress in the development of constant pH molecular dynamics (pHMD) techniques [39,40] and this approach with the WE method would be good to reveal the pH dependence in interactions between CPPs and the model membranes.
Our simulations show characteristics of both the carpet-like model [41] and the membrane thinning model [42]. It is interesting to note that an orientation (a direction from the N-terminal to the C-terminal) of CPPs (Arg 9 or Tat) in our simulations is almost parallel to the membrane, and its orientation is changed a little during the entire simulation. This could be one of the reasons that it is difficult for a single CPP to make a water pore. In this work, the progress coordinate was set for a single CPP, and it was difficult to observe aggregation of CPPs in a local area. To make a water pore, a collective behavior between CPPs seems to be essential, and a high concentration of CPPs would be needed. It is worth using an increased concentration of CPPs and implementing a progress coordinate as a distance between the center of mass of the phosphorus atoms of the upper leaflet and that