Application of Coarse-Grained (CG) Models to Explore Conformational Pathway of Large-Scale Protein Machines

Protein machines are clusters of protein assemblies that function in order to control the transfer of matter and energy in cells. For a specific protein machine, its working mechanisms are not only determined by the static crystal structures, but also related to the conformational transition dynamics and the corresponding energy profiles. With the rapid development of crystallographic techniques, the spatial scale of resolved structures is reaching up to thousands of residues, and the concomitant conformational changes become more and more complicated, posing a great challenge for computational biology research. Previously, a coarse-grained (CG) model aiming at conformational free energy evaluation was developed and showed excellent ability to reproduce the energy profiles by accurate electrostatic interaction calculations. In this study, we extended the application of the CG model to a series of large-scale protein machine systems. The spike protein trimer of SARS-CoV-2, ATP citrate lyase (ACLY) tetramer, and P4-ATPases systems were carefully studied and discussed as examples. It is indicated that the CG model is effective to depict the energy profiles of the conformational pathway between two endpoint structures, especially for large-scale systems. Both the energy change and energy barrier between endpoint structures provide reasonable mechanism explanations for the associated biological processes, including the opening of receptor binding domain (RBD) of spike protein, the phospholipid transportation of P4-ATPase, and the loop translocation of ACLY. Taken together, the CG model provides a suitable alternative in mechanistic studies related to conformational change in large-scale protein machines.


Introduction
Protein machines are the basic units to perform biological functions in cells. By responding to biological stimulus such as substrate binding or nucleoside triphosphate hydrolysis, they can execute directional molecular movements or specific chemical reaction pathways, and they play an important role in the cycle of matter and energy in cells [1]. A large number of protein machines have been structurally resolved at the atomic level and functionally verified under physiological conditions, such as the natural "molecular motor" ATPase [2], the transmembrane protein GPCR that mediates cellular "communication" [3], and diverse enzymes that are responsible for the synthesis, decomposition, or transformation of substances [4]. Like the machines in macroscopic world, the steady-state or equilibrium conformations of these protein machines are often restricted to a small set of possibilities and occur in a specific order, which guarantees the highly organized activities of those machines [5]. In that way, the functions of protein machines are not only determined by their crystal structures at steady state, but also closely related to the conformational pathways connecting those states. Nowadays, the crystal structures at steady state

Coarse-Grained (CG) Model
The coarse-grained (CG) model applied in this work is based on the solvation of ionizable residues and emphasizes the electrostatic effects within proteins. In the CG model, the sidechain atoms are united into a simplified atom, while the atoms of the main chain are explicitly represented. The simplified atom is located at the mass center of side chains for polar and nonpolar residues, and at the center of the charged part for ionizable residues. The CG model is derived by fitting to the observed absolute stability, which means the folding free energy of proteins is expressed as: where ∆G vdw side , ∆G CG solv , ∆G CG HB , ∆G elec side , ∆G polar side , ∆G hyd side , and ∆G vdw main/side represent the sidechain van der Waals interaction, the mainchain solvation energy, the mainchain hydrogen bond force, the sidechain electrostatic effect, polar, hydrophobic contribution, and the mainchain/sidechain van der Waals interaction, respectively. Scaling coefficients c 1 , c 2 , and c 3 have values of 0.10, 0.25, and 0.15, respectively, in this formula.
The mainchain interaction involves van der Waals interactions and electrostatic effects, and all bonding terms are calculated by ENZYMIX force field [31]. The key treatment for the CG model comes from the sidechain interaction. The sidechain electrostatic contribution can be expressed as: QQ , and ∆G dev Q stand for the Monte Carlo averaged charge, the pKa under protein environment, the pKa in water of the ith ionizable residue, the folded states interaction potential, the unfolded states interaction potential of ionized groups, and the term that describes the scaled-down effect of changing the protonation state of an ionizable residue upon unfolding, respectively. For the estimation of pKa i i , the Monte Carlo Proton Transfer (MCPT) method is utilized.
The Metropolis Monte Carlo (MC) approach is applied to determine the ionization states of protein residues under given pH and temperature [13,32]. Each MC step involves the attempt of proton transfer (PT) between a pair of ionizable residues or within an ionizable residue and bulk solvent. The transferring is repeated until the electrostatic interaction of the folded protein converges, then the ionization states of the protein residues are obtained to evaluate the CG free energy. During every MC move, the electrostatic free energy of a folded protein for the mth charge configuration Q (m) i is given by: where ∆G (m) QQ represents the charge-charge interaction free energies in the mth charge configuration. When a lower value of electrostatic free energy is achieved or the Metropolis criteria are met, the charge configuration is accepted. The minimized ∆G elec is used to evaluate the electrostatic contribution to the folding free energy using Equation (3).
The pKa i i term is expressed as: where sgn Q ion i refers to the charge sign function of the ith residue in its ionized state (which is +1 for HIS, ARG, LYS, and −1 for GLU, ASP). ∆G sel f ,i stands for the change in self-energy of an ionizable residue when it moves from water to the protein. ∆G sel f ,i is a Entropy 2022, 24, 620 4 of 15 key element for reliable evaluation of the electrostatic effect in the system. ∆G sel f ,i can be given by: where U is effective potential, and U sel f p , U sel f np , and U sel f mem are the self-energy contribution from non-polar residues, polar residues, and membrane grid points, respectively. N i p , N i np , and N i mem give the number of non-polar residues, polar residues, and membrane atoms surrounding the ith residue, respectively. The empirical function U sel f mem is given by: and where R min refers to the distance to the closest solvent molecule. The value of B sel f ,ion mem is 15 kcal/mol, which is based on the experimental data and computer simulations [33]. N mem is the number of neighboring membrane grid points. In this way, the self-energy penalty induced by a partial double counting in the center of the membrane can be avoided, so this CG model is suitable for studying complex protein systems such as transmembrane proteins [34].

Workflow Integrated with Coarse-Grained (CG) Model
Here, we present a molecular modeling workflow to explore the dynamic properties of protein machines in terms of their conformational pathways and free energy profiles. The complete workflow consists of four steps, namely, molecular modeling, acquisition of intermediate conformations, determination of electrostatic charge, and conformational free energy calculation.
Firstly, the initial endpoint assemblies connecting a specific biological process were built based on the crystal structures from the Protein Data Bank database [35]. The missing parts were repaired by Modeller software [36]. All the ligands were removed, and the proteins were trimmed to the same length. After that, targeted molecular simulation (TMD) [37] was conducted to construct the conformational pathways between different endpoint structures and sample a series of intermediate conformations representing the transition process. At each TMD step, the target structure was firstly aligned to the current structure, and then the root-mean-square (RMS) distance between the current structure and the target structure was computed. The force (F TMD ) on each target atom is given by the equation: where the k represents the spring constant and N represents the number of targeted atoms. RMS (t) is the current RMS value. RMS (0) is a reference value which evolves linearly from the initial RMSD to the final RMSD [37]. The initial structure was aligned to the target structure using the backbone heavy atoms, which were restrained to the initial positions by harmonic restraints with a force constant of 100.0 kcal/mol/Å 2 . The system was restrained throughout the simulation to prevent abnormal translation and rotation. When TMD from the initial state to the target state was completed, the initial and final states were interchanged, so we can obtain the intermediate structures of the whole transition process. Then, the accurate electrostatic charge was computed by the Monte Carlo Proton Transfer (MCPT) method. Finally, each structure was converted into coarse-grained (CG) representation and the CG free energy was calculated to obtain the conformational free energy profiles. For transmembrane proteins, membrane particles were added, and molecular dynamics relaxation was carried out to attain energy convergence. All simulations were completed by Molaris-XG software [31].

Application to Protein-Protein Interaction System
Protein-protein interaction (PPI) is the basic cooperation form wherein proteins perform their functions together. The formation of interfacial interactions is coupled with specific conformational states of each protein partner; therefore, the conformational change is an important regulatory mechanism for the binding and unbinding process of protein-protein interaction systems. Starting from the unbinding conformation, the protein will go through a series of intermediate conformations until it finally reaches another protein to form binding interactions. The energic profile of a protein's conformational change process can explain detailed mechanisms for the existence of a PPI relationship with its partner.
The spike protein of SARS-CoV-2 is a typical molecular machine that depends on conformational changes to form protein-protein interaction. It locates on the surface of the viral envelope and mediates the viral entry into cells by interacting with the ACE2 receptor of human cells through its receptor binding domain (RBD) [38,39]. A complete spike protein is a homotrimer with a large scale of more than 3000 residues. There are three distinct conformational states of spike trimers in nature: S-closed, S-open, and S-complex ( Figure 1a) [40]. S-closed is a tightly closed state in which the receptor binding motif (RBM) is masked, while S-open is a fusion-prone state with one RBD erecting to expose the RBM. Under the ACE2-free condition, most spike trimers are in the S-closed state and a minor population of them are in the S-open state, forming a dynamic balance (Figure 1a). During ACE2 approaching, the conformational landscape shifts toward the S-open state, and finally, most trimers reach the state in which they can tightly bind to ACE2 (Figure 1a). The transition process reflects the free energy difference among the three conformational states of spike trimers. Due to the too many computational resources needed by such a large-scale system, the energic profile of the process has never been obtained.
Here, we constructed CG models of the spike protein to reduce the resource demand for molecular modeling. The structural models of three conformations of the spike trimer were built by the Cryo-EM structures (PDB ID: 7DF3, 7DK3, and 7DF4) resolved by Xu et al. [40]. Then, the molecular modeling workflow integrated with the CG model was performed to obtain the energic profile of the process. Our results give an adequate explanation about behavior characteristics of spike proteins. Regarding the relative free energy of the three experimental conformations, S-closed has the lowest free energy, and the free energy of S-complex is the second lowest, while that of S-open is the highest (Figure 1b, Table 1). This gives an explanation that most spike trimers are in the S-closed state because of the energy advantages under ACE2-free conditions. The physiological significance of the S-closed state is hiding RBD, which is recognized by antibodies to achieve immune evasion, and therefore, the viruses become more adaptable [40]. During the transition process, the conformational free energy keeps rising until reaching the peak, which corresponds to the S-open conformation, and then turns into a downward trend (Figure 1b). The energy barrier of the spike trimer's conformational transition is 25.44 kcal/mol ( Figure 1b). Other computational studies also obtained an energy barrier varying from 7 to 20 kcal/mol of the process [41][42][43]. The high energy barrier is consistent with the conclusion that SARS-CoV-2 spike protein has higher energy barriers between active and inactive states as compared with the SARS-CoV-1 spike protein [44]. Lu et al. verified the high barrier between S-closed and S-open states by performing single-molecule fluorescence (Förster) resonance energy transfer experiments [45]. The evolutionary significance of the high energy barrier is to maintain the S-closed state of most trimers to hide surface antigens. Our calculation here only reflects the barrier of the conformational change in spike trimers. In reality, the energy barrier of spike protein's activation may be different due to the role of other ligands, such as glycans [41,46]. This suggests that the S-open state is the energy barrier site for the whole conformational change process. The approaching of ACE2 may shift the balance of S-closed and S-open states [40] by reducing the energy barrier. We Entropy 2022, 24, 620 6 of 15 decomposed the energic terms of free energy of S-closed and S-open, and found that the main contribution is that S-open's hydrophobic energy is much weaker than that of Sclosed (Table 1). This is reasonable because RBD's erecting exposes RBM as well as lots of hydrophobic residues in the solvent. We previously reported the important role of spike/ACE2 interaction in variants of SARS-CoV-2 [47,48]. The interaction between ACE2 and S-open may compensate the hydrophobic energy and contribute more electrostatic energy than the interaction between ACE2 and S-closed because of the glycans covering RBD [49,50]. The free energy of the S-complex conformation is lower than S-open but is still hard to reach because there is another energy barrier in the conversion process from S-open to S-complex ( Figure 1b). As the hydrophobic energy of S-complex is stronger than S-open (Table 1), when ACE2 binding, the S-complex state could have a significantly reduced free energy and become the most stable conformation.
receptor of human cells through its receptor binding domain (RBD) [38,39]. A com spike protein is a homotrimer with a large scale of more than 3000 residues. Ther three distinct conformational states of spike trimers in nature: S-closed, S-open, a complex ( Figure 1a) [40]. S-closed is a tightly closed state in which the receptor bin motif (RBM) is masked, while S-open is a fusion-prone state with one RBD erecti expose the RBM. Under the ACE2-free condition, most spike trimers are in the S-c state and a minor population of them are in the S-open state, forming a dynamic ba (Figure 1a). During ACE2 approaching, the conformational landscape shifts toward open state, and finally, most trimers reach the state in which they can tightly bind to A (Figure 1a). The transition process reflects the free energy difference among the three formational states of spike trimers. Due to the too many computational resources ne by such a large-scale system, the energic profile of the process has never been obtain

Application to Enzyme System
As a special class of protein machines, enzymes reduce the energy barriers of chemical reactions with the help of key residues or cofactors under the protein environment. Upon substrate binding, an enzyme may undergo a wide range of conformational changes, but only some specific intermediate states are important and meaningful for their stabilities on kinetic and thermodynamic properties [51]. It is hypothesized that the catalytic abilities of enzymes are dependent on the stabilities of transition-state structures of the substrate compared to that of the ground state [52]. Revealing the atomic details and energy change along the conformational transition process could provide preliminary mechanistic explanation of the catalytic process. In this case, we take ATP citrate lyase (ACLY) as an example and describe its key conformational pathway the during substrate-binding process.
ATP-citrate lyase (ACLY) is a representative central metabolic enzyme linking carbohydrate and lipid metabolism [53]. It catalyzes coenzyme A (CoA) to acetyl coenzyme A (Ac-CoA) with the participation of Mg 2+ -ATP and citrate. The overall reaction is as follows: Mg 2+ -ATP + Citrate + CoA → Mg 2+ -ADP + Pi + Ac-CoA + oxaloacetate. Its substratebinding process is accomplished with the successive binding of Mg 2+ -ATP, Citrate, CoA, and the related chemical reactions, as shown in Figure 2a. The latest crystallographic studies have revealed that the basic function unit of ACLY is a homo-tetrameric complex with D2 symmetry (Figure 2b), comprising four CCL domains that closely assemble at the center of ACLY tetramer and four CCS domains that freely stretch outside [54][55][56]. Two significant conformational changes in ACLY tetramers were observed from the crystal structures at the endpoints of the substrate-binding process (Figure 2c): (1) the His760 located loop transferred from the Mg 2+ -ATP site to the citrate site after reaction (i), and (2) the CCS domain on each monomer underwent significant conformational changes compared to the CCL domain. Owing to the structural lack of His760-located loop in both apo and substrate-binding crystal structures, it is unclear whether these two conformational changes occur in cascade or in coupled ways. Therefore, we tried to construct a detailed free energy profile of conformational change and provide a clear mechanistic explanation.  As shown in Figure 3a-c, according to the occurrence order of two conformationa changes (conf. 1 and conf. 2), a coupling pathway (pathway 1) and two cascade pathway (pathway 2, pathway 3) were constructed, in which E, E', T1, and T2 represent the startin state, ending state, and intermediate states 1 and 2, respectively. The calculated confor mational free energies for state E, E', T1, and T2 are −1372.78 kcal/mol, −1485.72 kcal/mo −1371.95 kcal/mol, and −1476.02 kcal/mol, respectively ( Table 1). The energy differenc between E and E' was −112.94 kcal/mol, indicating that the conformational change i ACLY showed a stable trend during the substrate-binding process. The energy differenc between E and T1 was slight, while the energy difference between E and T2 was approx imately 100 kcal/mol, indicating that conf. 2 but not conf. 1 causes a major energy chang during the process. It can be seen that the energy difference between E and E' was mainl affected by the electrostatic energy term (about 45 kcal/mol) and hydrophobic energy term (about 60 kcal/mol), as shown in Table 1. According to the conformational analysis by We et al. [56], E' (the substrate-binding pose) showed a more compact position between mon omers relative to E (the apo pose), which partly suggests the high stability of the confor mation E'. The energy profiles of three pathways were further analyzed, as shown in Fig  ure 3d-f. In the coupled situation, two important energy barrier changes are located in th initial and middle stages of the conformational change of pathway 1 with values of 43.9 kcal/mol and 32.69 kcal/mol, respectively. As for pathway 2, the energy barrier of E→T is 55.21 kcal/mol, while the energy barrier of T1→E' is 30.72 kcal/mol. In pathway 3, th energy barrier of E→T2 stage is 29.49 kcal/mol, while the energy barrier of T2→E' is 50.7 kcal/mol. It can be seen that conf. 1 and conf. 2 correspond to two important energy barrie changes in the free energy profile. At the E→T1 of pathway 2 and the T2→E' of pathwa 3, the energy barrier values are both higher than 50 kcal/mol, indicating that conf. 1 cor responds to the key energy barrier in the overall substrate-binding process. In the cascad Here, the ACLY tetramers along the substrate-binding process were built based on the D2-symmetrical Cryo-EM structures solved by Wei et al. [56]. The two captured conformations E and E' (PDB ID: 6POF, 6UUW) represent the start point and end point conformations of the substrate-binding process, respectively. It is indicated that the long linker region between CSSβ and CSSα (sequence 426-486) did not affect the catalytic function of ACLY tetramer [54]. Therefore, all the protein monomers were trimmed to the same length (sequence 2-425, 487-1099). The missing loop of conformation E (sequence 751-766) and conformation E' (sequence 140-148) were completed from the corresponding section of another ACLY tetramer (PDB ID: 6QFB). Then, the molecular modeling workflow integrated with the CG model was performed to obtain the energic profile of the process.
As shown in Figure 3a-c, according to the occurrence order of two conformational changes (conf. 1 and conf. 2), a coupling pathway (pathway 1) and two cascade pathways (pathway 2, pathway 3) were constructed, in which E, E', T1, and T2 represent the starting state, ending state, and intermediate states 1 and 2, respectively. The calculated conformational free energies for state E, E', T1, and T2 are −1372.78 kcal/mol, −1485.72 kcal/mol, −1371.95 kcal/mol, and −1476.02 kcal/mol, respectively ( Table 1). The energy difference between E and E' was −112.94 kcal/mol, indicating that the conformational change in ACLY showed a stable trend during the substrate-binding process. The energy difference between E and T1 was slight, while the energy difference between E and T2 was approximately 100 kcal/mol, indicating that conf. 2 but not conf. 1 causes a major energy change during the process. It can be seen that the energy difference between E and E' was mainly affected by the electrostatic energy term (about 45 kcal/mol) and hydrophobic energy term (about 60 kcal/mol), as shown in Table 1. According to the conformational analysis by Wei et al. [56], E' (the substrate-binding pose) showed a more compact position between monomers relative to E (the apo pose), which partly suggests the high stability Entropy 2022, 24, 620 9 of 15 of the conformation E'. The energy profiles of three pathways were further analyzed, as shown in Figure 3d-f. In the coupled situation, two important energy barrier changes are located in the initial and middle stages of the conformational change of pathway 1 with values of 43.98 kcal/mol and 32.69 kcal/mol, respectively. As for pathway 2, the energy barrier of E→T1 is 55.21 kcal/mol, while the energy barrier of T1→E' is 30.72 kcal/mol. In pathway 3, the energy barrier of E→T2 stage is 29.49 kcal/mol, while the energy barrier of T2→E' is 50.70 kcal/mol. It can be seen that conf. 1 and conf. 2 correspond to two important energy barrier changes in the free energy profile. At the E→T1 of pathway 2 and the T2→E' of pathway 3, the energy barrier values are both higher than 50 kcal/mol, indicating that conf. 1 corresponds to the key energy barrier in the overall substrate-binding process. In the cascade of chemical reactions of the ACLY catalytic cycle, the activation effect of Mg 2+ -ATP at the starting point is the driving force of the conformational shift of His760-located loop [57]. The above results provide a mechanistic explanation for the necessity of Mg 2+ -ATP activation in the catalytic cycle of ACLY. A comparison between the energy barriers of three pathways shows that the coupling of conf. 1 and conf. 2 in pathway 1 could effectively reduce the highest energy barriers of conf. 1 (11.23 kcal/mol lower than pathway 2, and 6.72 kcal/mol lower than pathway 3), which preliminarily showed the synergy between different domains during the conformational change process.
Entropy 2022, 24, x FOR PEER REVIEW 9 of 15 of chemical reactions of the ACLY catalytic cycle, the activation effect of Mg 2+ -ATP at the starting point is the driving force of the conformational shift of His760-located loop [57].
The above results provide a mechanistic explanation for the necessity of Mg 2+ -ATP activation in the catalytic cycle of ACLY. A comparison between the energy barriers of three pathways shows that the coupling of conf. 1 and conf. 2 in pathway 1 could effectively reduce the highest energy barriers of conf. 1 (11.23 kcal/mol lower than pathway 2, and 6.72 kcal/mol lower than pathway 3), which preliminarily showed the synergy between different domains during the conformational change process.

Application to Transmembrane Protein System
Transmembrane proteins are integral to many biological processes such as membrane trafficking, enzymatic activity, and cell-cell recognition, to name a few [58]. The function of transmembrane proteins is achieved by the conformational changes under the effect of membrane constituents [59]. It is a challenge to investigate the precise conforma-

Application to Transmembrane Protein System
Transmembrane proteins are integral to many biological processes such as membrane trafficking, enzymatic activity, and cell-cell recognition, to name a few [58]. The function of transmembrane proteins is achieved by the conformational changes under the effect of membrane constituents [59]. It is a challenge to investigate the precise conformational change details of protein machines in the presence of membranes. P4-ATPase is an important transmembrane protein that acts as an essential transporter to flip specific lipids from the extracellular leaflet to the inner leaflet of cell membranes in eukaryote cells [60]. This transporting activity regulates nearly all biological process such as myotube formation, apoptosis, immune response, and sperm capacitation [61][62][63][64][65][66][67]. Understanding how phospholipid and proteins within bilayers operate and interact represents an important question in health and pathophysiology. Massive endeavors have been undertaken to glean structural and mechanistic insights into P4-ATPase over the decades [68][69][70][71]. With the technical development of Cryo-EM and X-ray crystallography, we know that P4-ATPases are made up of α protein subunit and β cell division cycle (CDC) 50 subunit. The α subunit consists of the A domain (actuator domain), N domain (nucleotide binding domain), P domain (phosphorylation domain), and 10 TMs (transmembrane helix). Without the CDC50 expression, P4-ATPase cannot induce catalytic activity [72]. Further mutagenesis and assay experiments provide insights into the key amino acids that are indispensable for the lipid recognition and for the translocation pathway [68,73]. However, the energy basis of the catalytic process by P4-ATPase is still not well understood.
Here, the initial models were constructed based on the Cryo-EM structures solved by Hiraizumi and coworkers [71]. The six captured intermediates E1, E1-ATP, E1P-ADP, E1P, E2P, and E2Pi-PL (PDB ID: 6K7G-6K7M) reflect the whole reaction cycle of lipid transfer (Figure 4a). The membrane was added by Molaris-XG software [31]. The membrane model was selected in which the hydrophobic contributions are scaled down by a factor of ∼3.6 and does not consider the polar contribution of sidechains [16]. The free energy for the six endpoint states of P4-ATPase system is displayed in Table 1. The highest energy of E1P state (−813.39 kcal/mol) suggests its instability in the observed conformation, which is consistent with the finding that E1P is a transient phosphorylated state [71]. E1P connects the phosphoryl transfer intermediate (E1P-ADP) and phosphoenzyme ground state (E2P), during which the TM1 and TM2 segments in the region proximal to the A domain clearly shift. Moreover, the distance between the N and P domain elongate when the N domain forces out in the E1P state [71]. Such structural change may induce the lower electrostatic energy term ( Table 1). The obviously low value of E Form2MC (−27.82 kcal/mol) in the E1P state conforms to the assumption that the electrostatic term usually contributes the most from the different interaction items in biophysical systems [21].
We also analyzed the whole energy transition for the transport cycle along the lipid flipping reaction, as shown in Figure 4b. The energy barrier (37.06 kcal/mol) for the transition from E1P-ADP to E1P surpass the values of the resting transitions. This is consistent with the fact that the interaction network in the E1P-ADP state reconnects within a short time during the dissociation of ADP, and energy generated from ATP hydrolysis is needed to compensate. For other ATPase members such as F1-ATPase and Adenosine triphosphate (ATP)-binding cassette (ABC) transporter, the energy barrier for conformation change varies differently from 10 to 32.1 kcal/mol [23,74,75]. The relatively high barrier of P4-ATPase may be due to the relatively large molecular conformation and functional specificity of phospholipid transporters [76]. The second important energy barrier (34.13 kcal/mol) occurs at the transition from E2P to E2Pi-PL, indicating that proteins need to overcome potential resistance generated by the occlusion of lipid head groups. The energy barrier is obtained at the early stage of the transition, representing that phospholipids on the exoplasmic side of the membrane invade the membrane at the start of the transition from E2P to E2Pi-PL. This is consistent with the cellular observations of Nakanishi and coworkers [77]. With our coarse-grained simulation methods, we can achieve relative reasonable results and give a quantitative explanation for the phospholipid transport progress.

Conclusions and Summary
Here, we reported the application of the CG model on several large-scale protein machine systems. It is important to mention that our results still follow the philosophy of capturing more relevant physical features and focusing less on minute details [78]. It is not a perfect strategy, but in many cases, it gives the best option for capturing the functions of complex biological systems. The spike protein of SARS-CoV-2 is an important target for antiviral drugs because of its critical role in infection [79]. We modeled its three conformations by using the CG model and calculated the whole energetic profile of conformational change. The results explain the energetic basis of the mechanism of spike protein conformational change. Furthermore, ACLY is a homo-tetrameric structure with D2 symmetry; hence, its conformational changes are accompanied by more complex monomer motions and coupling relationships. We detailed here the coupling process of conformational changes of the ACLY enzyme during its catalytic cycle. Finally, we studied the catalytic mechanism of a membrane protein system, P4-ATPase, which is responsible for lipid transport across membranes. The difficulty in modeling this system is that it embedded in a membrane environment and has six intermediate conformations. The phospholipid transport function is accomplished in a cycle of successive transitions of these six conformations. Thanks to the optimization of membrane proteins of the CG models, we described the energetic profile of the conformational change cycle and gave a quantitative explanation for the phospholipid transport progress. These applications demonstrate valid examples of the applicability of the CG model in large-scale protein machines.

Conclusions and Summary
Here, we reported the application of the CG model on several large-scale protein machine systems. It is important to mention that our results still follow the philosophy of capturing more relevant physical features and focusing less on minute details [78]. It is not a perfect strategy, but in many cases, it gives the best option for capturing the functions of complex biological systems. The spike protein of SARS-CoV-2 is an important target for antiviral drugs because of its critical role in infection [79]. We modeled its three conformations by using the CG model and calculated the whole energetic profile of conformational change. The results explain the energetic basis of the mechanism of spike protein conformational change. Furthermore, ACLY is a homo-tetrameric structure with D2 symmetry; hence, its conformational changes are accompanied by more complex monomer motions and coupling relationships. We detailed here the coupling process of conformational changes of the ACLY enzyme during its catalytic cycle. Finally, we studied the catalytic mechanism of a membrane protein system, P4-ATPase, which is responsible for lipid transport across membranes. The difficulty in modeling this system is that it embedded in a membrane environment and has six intermediate conformations. The phospholipid transport function is accomplished in a cycle of successive transitions of these six conformations. Thanks to the optimization of membrane proteins of the CG models, we described the energetic profile of the conformational change cycle and gave a quantitative explanation for the phospholipid transport progress. These applications demonstrate valid examples of the applicability of the CG model in large-scale protein machines.
With the development and application of cryo-electron microscopy, more and more intermediate conformation states of protein machine systems will be resolved, and the CG model will be more widely applied in this context. The most important application of the CG model is in studying the coupling of a protein's conformational changes with specific reactions. Most protein machines cannot function solely by conformational transitions and need to be coupled with interactions with other biomolecules. For instance, the spike proteins must interact with the ACE2 receptor to invade cells; the catalytic reaction of ACLY involves the substrate binding of CoA, Mg 2+ , and ATP to carry out chemical reactions; and the lipids need to bind to P4-ATPase to be transported. There is no doubt that the participation of these biomolecules may promote or depend on the conformational changes of the protein machines. It is important to understand the coupling relationship between conformational change and biological events and then gain a deeper understanding of protein functions. Achieving this goal requires more efficient modeling of the free energy landscape of biological process, which must extend the application of the CG model by combining it with other computational methods-for example, using the empirical valence bond (EVB) method [80] to compute the energy profile of chemical reactions in protein, using the protein dipoles Langevin dipoles (PDLD) method [81] to calculate the binding free energy between a ligand and a protein, etc. Altogether, based on its advantage of conformational free energy evaluation, the CG model is set to play an increasingly important role in various computational biology studies.