Unveiling Conformational States of CDK6 Caused by Binding of Vcyclin Protein and Inhibitor by Combining Gaussian Accelerated Molecular Dynamics and Deep Learning

CDK6 plays a key role in the regulation of the cell cycle and is considered a crucial target for cancer therapy. In this work, conformational transitions of CDK6 were identified by using Gaussian accelerated molecular dynamics (GaMD), deep learning (DL), and free energy landscapes (FELs). DL finds that the binding pocket as well as the T-loop binding to the Vcyclin protein are involved in obvious differences of conformation contacts. This result suggests that the binding pocket of inhibitors (LQQ and AP9) and the binding interface of CDK6 to the Vcyclin protein play a key role in the function of CDK6. The analyses of FELs reveal that the binding pocket and the T-loop of CDK6 have disordered states. The results from principal component analysis (PCA) indicate that the binding of the Vcyclin protein affects the fluctuation behavior of the T-loop in CDK6. Our QM/MM-GBSA calculations suggest that the binding ability of LQQ to CDK6 is stronger than AP9 with or without the binding of the Vcyclin protein. Interaction networks of inhibitors with CDK6 were analyzed and the results reveal that LQQ contributes more hydrogen binding interactions (HBIs) and hot interaction spots with CDK6. In addition, the binding pocket endures flexibility changes from opening to closing states and the Vcyclin protein plays an important role in the stabilizing conformation of the T-loop. We anticipate that this work could provide useful information for further understanding the function of CDK6 and developing new promising inhibitors targeting CDK6.


Introduction
Cyclin-dependent kinases (CDKs) are a class of protein kinases that play a key role in the regulation of the cell cycle [1][2][3][4][5][6][7][8][9][10].According to their respective functions, CDKs can be categorized into two major groups: cell cycle-regulated CDKs (such as CDK1, CDK2, CDK4, and CDK6) and transcription-regulated CDKs (such as CDK7, CDK8, and CDK9) [11,12].Throughout the progression of the cell cycle, these CDKs sequentially regulate various mitotic events.Under the influence of diverse growth signals, phosphorylation of pRb is initiated by CDK4-Cyclin D and CDK6-Cyclin D, thereby initiating the cell cycle.Subsequently, the CDK2-Cyclin E complex keeps pRb highly phosphorylated, regulating the entry of cells into the S phase and centrosome replication.CDK1 is a key determinant of mitosis and forms active complexes with Cyclin A and B in the latter half of the G2 phase and throughout the M phase, respectively.Furthermore, CDK5 plays a crucial role in the regulation of post-mitotic events in specific tissues [11].In cancer cells, CDKs lead to abnormal cell proliferation due to their over-activation [13,14].Consequently, CDKs have been regarded as a potential key target for cancer therapy [15][16][17].Exploring the molecular-level structures of CDKs is imperative for developing next-generation CDK inhibitors.However, much of the current research is primarily focused on CDK2, with limited investigation into CDK6.
Molecules 2024, 29, 2681 2 of 24 CDK6, a pivotal member of the CDK family, integrates mitogenic and anti-mitogenic extracellular signals with the cell cycle, exerting significant regulation during the G1 phase.CDK6 has similar structures to other CDK members, characterized by an N-lobe domain comprising five β-strands, a C-lobe domain consisting of multiple α-helices, and the catalytic domain featuring as an interconnecting region between the N-lobe and Clobe domains [18], as depicted in Figure 1A.The substrates of CDK6, including ATP and peptides, bind to the catalytic sites of CDK6, initiating a catalytic reaction that leads to the phosphorylation of peptides as well as the production of ADP and H 2 O.The catalytic domain of CDK6 primarily encompasses an ATP binding pocket, a phosphate transfer catalytic loop (C-loop: residue: 141-151), and an activation loop (T-loop: residue: 163-184) responsible for substrate peptide binding.CDK6 activity is finely regulated by its binding to cyclin protein and subsequent phosphorylation on the T-loop [19][20][21][22][23][24][25][26][27][28][29][30].
Molecules 2024, 29, x FOR PEER REVIEW 3 of 24 inhibitors, specific inhibitor LQQ, and less specific inhibitor AP9, indicated by using their identity document (ID) in the protein data bank (PDB), were selected for our current studies.The topological structures of the inhibitor-CDK6 complex and inhibitor-CDK6/Vcyclin are depicted in Figure 1A,B, respectively.The structures of LQQ and AP9 are separately displayed in Figure 1C,D.Two inhibitors, LQQ and AP9, have IC50 values of 15 and 450 nM, respectively, showing different inhibition abilities on the activity of CDK6 [66].
Insights into the effect of the two inhibitors and Vcyclin protein on the conformational changes of CDK6 will be important in the design of potent inhibitors.In this work, multiple separate GaMD (MS-GaMD) simulations were carried out to enhance the conformational sampling of CDK6, and DL was performed to identify key residue contacts in CDK6.Additionally, principal component analysis (PCA) [67][68][69][70] and construction of free energy landscapes (FELs) were performed to reveal the changes in the conformational dynamics of CDK6 induced by the binding of inhibitors and Vcyclin protein.We anticipate that this study will provide valuable information for the development of the potential inhibitor against CDK6.

Characteristic Residue Contacts Revealed by Deep Learning
Classification of the LQQ-bound CDK6/Vcyclin, LQQ-bound CDK6, AP9-bound CDK6/Vcyclin, and AP9-bound CDK6 was achieved by DL, and the results are shown in Figure 2A.The overall accuracy achieved on the validation set after 25 epochs was 0.99992, while the overall loss was 0.00019 (Figure S1).In total, 6000 frames were used for the validation of each system with most of them being accurately classified (Figure 2A), including 6000 frames of the LQQ-bound CDK6/Vcyclin, 6000 frames of the LQQ-bound CDK6, 5999 In the CDK6-Vcyclin complex, a contiguous protein-protein interface was established by the interaction of Vcyclin with one side of the CDK6 catalytic domain, as depicted in Figure 1B.For the αC helix, the Vcyclin protein drives it to shift and rotate toward the catalytic domain.As a result, the E61 residue of CDK6 on the αC helix is oriented toward the ATP binding pocket, thereby contributing to stabilization of the ATP binding conformation.For the T-loop, five hydrogen bonds and a large number of van der Waals contacts between the Vcyclin and T-loop residues in CDK6 lock the T-loop in a stable conformation.V181 of CDK6 adopts the left-handed conformation, which is necessary to form a substrate binding pocket.Unlike CDK2, residues 171-174 in CDK6 adopt a classical type II turn conformation, contributing to the larger buried surface [3].Therefore, it is highly requisite to probe the molecular mechanism underlying the conformational regulation of CDK6 caused by the binding of inhibitors and Vcyclin for the development of efficient inhibitors targeting CDK6.
Molecular dynamics (MD) simulation is a computational technique that provides valuable insights into biomolecular dynamics at atomic levels [31][32][33][34][35].Meanwhile, calculations of binding free energies are used as a tool for evaluating the binding ability of inhibitors to targets [36][37][38].These two simulation methods have been extensively applied in the conformation analysis of CDKs [39][40][41][42][43][44].These studies suggest that CDK6 exhibits significant conformational flexibility and undergoes multiple conformations during the reaction cycle.Notably, the conformation of the catalytic domain is closely related to the action mechanism of CDK6 and binding to specific inhibitors.Conformations sampled by conventional MD (cMD) simulations are possibly trapped within an energy minima space due to the high energy barrier in simulation systems [45][46][47][48][49].To address this limitation, Gaussian accelerated molecular dynamics (GaMD) simulations employ a harmonic boost potential to smooth the free energy barrier of biomolecules [50,51].Moreover, GaMD ensures the accurate calculation of free energy profiles by minimizing energy noise during the reweighting process [52,53].These attributes make GaMD particularly well-suited for the investigation of larger biological systems and ligand-target binding [54][55][56][57][58][59][60].To better understand the molecular mechanism from MD simulations, the integration of machine learning (ML) with MD simulations has been proposed [61,62].Miao's group proposed a trajectory-based deep learning (DL) approach, called GaMD, DL, and free energy profiling workflow (GLOW), to successfully decipher the molecular mechanisms underlying the activation and allosteric modulation of G protein-coupled receptors [63,64].Wang et al. combined MD simulations and DL to successfully probe the binding mechanism of inhibitors to BRD4 and BRD9 [65].However, DL has not been combined with GaMD for the conformation analysis on inhibitor-bound CDK6.
To achieve our goal, four complexes, including the LQQ-bound CDK6/Vcyclin, LQQbound CDK6, AP9-bound CDK6/Vcyclin, and AP9-bound CDK6, were selected to investigate the influence of Vcyclin protein and inhibitor binding on CDK6 conformations.Two inhibitors, specific inhibitor LQQ, and less specific inhibitor AP9, indicated by using their identity document (ID) in the protein data bank (PDB), were selected for our current studies.The topological structures of the inhibitor-CDK6 complex and inhibitor-CDK6/Vcyclin are depicted in Figure 1A,B, respectively.The structures of LQQ and AP9 are separately displayed in Figure 1C,D.Two inhibitors, LQQ and AP9, have IC50 values of 15 and 450 nM, respectively, showing different inhibition abilities on the activity of CDK6 [66].Insights into the effect of the two inhibitors and Vcyclin protein on the conformational changes of CDK6 will be important in the design of potent inhibitors.In this work, multiple separate GaMD (MS-GaMD) simulations were carried out to enhance the conformational sampling of CDK6, and DL was performed to identify key residue contacts in CDK6.Additionally, principal component analysis (PCA) [67][68][69][70] and construction of free energy landscapes (FELs) were performed to reveal the changes in the conformational dynamics of CDK6 induced by the binding of inhibitors and Vcyclin protein.We anticipate that this study will provide valuable information for the development of the potential inhibitor against CDK6.

Characteristic Residue Contacts Revealed by Deep Learning
Classification of the LQQ-bound CDK6/Vcyclin, LQQ-bound CDK6, AP9-bound CDK6/Vcyclin, and AP9-bound CDK6 was achieved by DL, and the results are shown in Figure 2A.The overall accuracy achieved on the validation set after 25 epochs was 0.99992, while the overall loss was 0.00019 (Figure S1).In total, 6000 frames were used for the validation of each system with most of them being accurately classified (Figure 2A), including 6000 frames of the LQQ-bound CDK6/Vcyclin, 6000 frames of the LQQ-bound CDK6, 5999 frames of the AP9-bound CDK6/Vcyclin, and 5999 frames of the AP9-bound CDK6.Only two frames were inaccurately categorized: one frame of the AP9-bound CDK6/Vcyclin was predicted to be the LQQ-bound CDK6/ Vcyclin, and one frame of the AP9-bound CDK6 was predicted to be the LQQ-bound CDK6.
αC helix and T-loop (Table S1 and Figure 2F).Compared to the LQQ-bound CDK6/Vcyclin and LQQ-bound CDK6, the AP9-bound CDK6/Vcyclin leads to the disappearance of the characteristic residue contacts between the G-loop and T-loop, and induces new characteristic residue contacts as we mentioned above.The characteristic residue contacts of the AP9-bound CDK6 were located between the αC helix and T-loop (Table S1 and Figure 2F).By referencing the AP9-bound CDK6/Vcyclin, the AP9-bound CDK6 changes the residue contact from the T-loop.In the AP9-bound CDK6/Vcyclin, the residues of the T-loop identified by DL were situated in the binding pocket; however, in the AP9-bound CDK6, the residue contacts from the T-loop were located near the binding interface between CDK6 and the Vcyclin protein.The pixel-based residue attention maps of gradients of the most populated CDK6 structures are shown in Figure 2B-E, respectively.The residue contacts with gradients ≥ 0.7 that are characteristic residue contacts are shown in Table S1.Overall, the characteristic residue contacts of the LQQ-bound CDK6/Vcyclin were located between the G-loop and T-loop (Table S1 and Figure 2F).The characteristic residue contacts of the LQQ-bound CDK6 were also located between the G-loop and T-loop (Table S1 and Figure 2F).The characteristic residue contacts of the AP9-bound CDK6/Vcyclin were situated between the αC helix and T-loop (Table S1 and Figure 2F).Compared to the LQQ-bound CDK6/Vcyclin and LQQ-bound CDK6, the AP9-bound CDK6/Vcyclin leads to the disappearance of the characteristic residue contacts between the G-loop and T-loop, and induces new characteristic residue contacts as we mentioned above.The characteristic residue contacts of the AP9-bound CDK6 were located between the αC helix and T-loop (Table S1 and Figure 2F).By referencing the AP9-bound CDK6/Vcyclin, the AP9-bound CDK6 changes the residue contact from the T-loop.In the AP9-bound CDK6/Vcyclin, the residues of the T-loop identified by DL were situated in the binding pocket; however, in the AP9-bound CDK6, the residue contacts from the T-loop were located near the binding interface between CDK6 and the Vcyclin protein.
Through the characteristic residue contacts learned by DL, the involved structure domains are primarily involved in the G-loop, the αC helix, and the T-loop.The work of Schulze-Gahmen et al. indicated that the G-loop, the αC helix, and the T-loop participate in the stable conformation of the binding pocket from CDK6 while the T-loop and the αC helix are heavily affected by the binding of Vcyclin [3], which agrees with our current findings learned by DL.

Conformational Transition of CDK6 from Free Energy Landscapes
The previous DL analysis suggests that the residue contacts are mainly located near the binding pocket and binding interface between CDK6 and the Vcyclin protein.Residues located in the binding pocket identified from the DL results with the gradient of 0.9 were selected to calculate distances.Thus, we calculate the distance 1 (DIS1) between the Cα atom of residue E61 in the αC helix and the Cα atom of residue A162 in the T-loop and the distance 2 (DIS2) of the Cα atoms of residue G22 and residue A23 in the G-loop, respectively, away from the Cα atoms of residue D163 and residue L166 in the T-loop.The DIS1 and DIS2 were selected as the RCs to build FELs so as to reveal the binding pocket conformations of CDK6.The FELs and the corresponding representative structures are depicted in Figures 3-5.
spectively, away from the Cα atoms of residue D163 and residue L166 in the T-loop.The DIS1 and DIS2 were selected as the RCs to build FELs so as to reveal the binding pocket conformations of CDK6.The FELs and the corresponding representative structures are depicted in Figures 3-5.

Dynamics Behavior of CDK6
To understand the different structural stabilities of inhibitors, root-mean-square deviations (RMSDs) of heavy atoms for LQQ and AP9 were calculated relative to the initial structure (Figures 8A and S2).It is observed that the inhibitors show a stable fluctuation in the LQQ-bound CDK6/Vcyclin, the LQQ-bound CDK6, and the AP9-bound CDK6/Vcyclin systems.The RMSDs of LQQ in the LQQ-bound CDK6/Vcyclin and the LQQ-bound CDK6 systems are populated at 1.6 Å and 1.9 Å (Figure 8A), respectively.The RMSD of AP9 in the AP9-bound CDK6/Vcyclin system is populated at 2.4 Å (Figure 8A), while the RMSD of AP9 in the AP9-bound CDK6 is populated at two peaks of 3.4 and 4.2 Å (Figure 8A), with a wider distribution range, suggesting that the absence of the Vcyclin protein increases the RMSD of AP9.Meanwhile, we also calculate the RMSDs of all heavy atoms from proteins to understand the structural stability of proteins (Figure S3).The results show that proteins show a similar fluctuation and structural stability through GaMD simulations.Root-mean-square fluctuations (RMSFs) of CDK6 were estimated by using the coordinates of the Cα atoms (Figure 8B).The LQQ-bound CDK6 and AP9-bound CDK6 strengthen the structural flexibility of most of the CDK6 regions, particularly in the β-strands, the αC helix, and the T-loop relative to the LQQ-bound CDK6/Vcyclin and AP9-bound CDK6/Vcyclin.It is also found that the L1-loop shows stronger flexibility (Figure 8B).Furthermore, it is worth noting that the residues obtained from the DL results are also located within these regions.These results indicate that the changes in the structural flexibility can influence the function of CDK6.

Dynamics Behavior of CDK6
To understand the different structural stabilities of inhibitors, root-mean-square deviations (RMSDs) of heavy atoms for LQQ and AP9 were calculated relative to the initial structure (Figure 8A and Figure S2).It is observed that the inhibitors show a stable fluctuation in the LQQ-bound CDK6/Vcyclin, the LQQ-bound CDK6, and the AP9-bound CDK6/Vcyclin systems.The RMSDs of LQQ in the LQQ-bound CDK6/Vcyclin and the LQQ-bound CDK6 systems are populated at 1.6 Å and 1.9 Å (Figure 8A), respectively.The RMSD of AP9 in the AP9bound CDK6/Vcyclin system is populated at 2.4 Å (Figure 8A), while the RMSD of AP9 in the AP9-bound CDK6 is populated at two peaks of 3.4 and 4.2 Å (Figure 8A), with a wider distribution range, suggesting that the absence of the Vcyclin protein increases the RMSD of AP9.Meanwhile, we also calculate the RMSDs of all heavy atoms from proteins to understand the structural stability of proteins (Figure S3).The results show that proteins show a similar fluctuation and structural stability through GaMD simulations.Root-mean-square fluctuations (RMSFs) of CDK6 were estimated by using the coordinates of the Cα atoms (Figure 8B).The LQQ-bound CDK6 and AP9-bound CDK6 strengthen the structural flexibility of most of the CDK6 regions, particularly in the β-strands, the αC helix, and the T-loop relative to the LQQbound CDK6/Vcyclin and AP9-bound CDK6/Vcyclin.It is also found that the L1-loop shows stronger flexibility (Figure 8B).Furthermore, it is worth noting that the residues obtained from the DL results are also located within these regions.These results indicate that the changes in the structural flexibility can influence the function of CDK6.To explore the impacts of the inhibitor and Vcyclin binding on the concerted movements of structural domains in CDK6, PCA was performed on the coordinates of the Cα atoms using the CPPTRAJ module in Amber 20.The function of eigenvalues over eigenvectors is depicted in Figure S4, which is used to characterize the structural fluctuation along the eigenvectors.It is found that the binding of the Vcyclin protein weakens the total structural fluctuation of CDK6.The first eigenvector was visualized utilizing the VMD software 1.9.4a51 [72] and the results are displayed in Figure 9. Structural domains of CDK6 display well-concerted motions; furthermore, the Vcyclin protein greatly affects the concerted motions of the L1-loop.In the LQQ-bound CDK6/Vcyclin, the L2-loop and L3-loop generate a parallel concerted motion in the same direction.It was also found that the L1-loop exhibits a high fluctuation (Figure 9A).Differently, in the LQQ-bound CDK6, the L2-loop and L3-loop have an opposite fluctuation tendency and the L1-loop has a different motion direction (Figure 9B).By referencing the LQQ-bound CDK6/Vcyclin, removal of the Vcyclin protein slightly strengthens the fluctuations of the α helices in the C-lobe domains.Compared to the AP9-bound CDK6/Vcyclin (Figure 9C), the removal of Vcyclin highly strengthens the concerted motions of the L2-loop and T-loop (Figure 9C).By comparison with the AP9-bound CDK6/Vcyclin, cutting the Vcyclin changes the fluctuation tendency of the L1-loop (Figure 9D).In summary, the removal of the Vcyclin protein greatly alters the structural fluctuation of the T-loop, L1-loop, and L2-loop, To explore the impacts of the inhibitor and Vcyclin binding on the concerted movements of structural domains in CDK6, PCA was performed on the coordinates of the Cα atoms using the CPPTRAJ module in Amber 20.The function of eigenvalues over eigenvectors is depicted in Figure S4, which is used to characterize the structural fluctuation along the eigenvectors.It is found that the binding of the Vcyclin protein weakens the total structural fluctuation of CDK6.The first eigenvector was visualized utilizing the VMD software 1.9.4a51 [72] and the results are displayed in Figure 9. Structural domains of CDK6 display well-concerted motions; furthermore, the Vcyclin protein greatly affects the concerted motions of the L1-loop.In the LQQ-bound CDK6/Vcyclin, the L2-loop and L3loop generate a parallel concerted motion in the same direction.It was also found that the L1-loop exhibits a high fluctuation (Figure 9A).Differently, in the LQQ-bound CDK6, the L2-loop and L3-loop have an opposite fluctuation tendency and the L1-loop has a different motion direction (Figure 9B).By referencing the LQQ-bound CDK6/Vcyclin, removal of the Vcyclin protein slightly strengthens the fluctuations of the α helices in the C-lobe domains.Compared to the AP9-bound CDK6/Vcyclin (Figure 9C), the removal of Vcyclin highly strengthens the concerted motions of the L2-loop and T-loop (Figure 9C).By comparison with the AP9-bound CDK6/Vcyclin, cutting the Vcyclin changes the fluctuation tendency of the L1-loop (Figure 9D).In summary, the removal of the Vcyclin protein greatly alters the structural fluctuation of the T-loop, L1-loop, and L2-loop, implying that these structural domains are possibly involved in the function domains of CDK6.

Binding Free Energy Calculations
To explore the binding preference of two inhibitors, the binding free energies of LQQ and AP9 to CDK6 with and without the binding of Vcyclin were calculated with the QM/MM-GBSA method.The calculated free energy components are listed in Table 1.It is worth noting that the rank of the experimental values is consistent with that of the binding free energies predicted by the QM/MM-GBSA [66], which suggests that our current analyses on free energy are reliable.g Binding free energy: ∆ = ∆ − ∆.h ∆ is obtained via ∆G = −50 with the experimental value of IC50 [66].

Binding Free Energy Calculations
To explore the binding preference of two inhibitors, the binding free energies of LQQ and AP9 to CDK6 with and without the binding of Vcyclin were calculated with the QM/MM-GBSA method.The calculated free energy components are listed in Table 1.It is worth noting that the rank of the experimental values is consistent with that of the binding free energies predicted by the QM/MM-GBSA [66], which suggests that our current analyses on free energy are reliable.The inhibitor-CDK6 van der Waals interactions (∆E vdW ) in the LQQ-bound CDK6/Vcyclin, LQQ-bound CDK6, AP9-bound CDK6/Vcyclin, and AP9-bound CDK6 are −52.14, −51.73, −48.89, and −42.90 kcal/mol, respectively.On the whole, the binding ability of LQQ to CDK6 is stronger than that of AP9.By comparison with CDK6 bounded by Vcyclin, it is found that the removing of the Vcyclin protein slightly weakens the van der Waals interactions of LQQ and AP9 with CDK6.The electrostatic interactions (∆E ele ) provide few contributions for inhibitor binding.The self-consistent field energy (∆G sc f ) of AP9bound CDK6/Vcyclin and AP9-bound CDK6 are abated by 5.1 and 7.76 kcal/mol relative to those of LQQ-bound CDK6/Vcyclin and LQQ-bound CDK6, which shows that HBI between V101 and the inhibitor plays a key role in the binding of the inhibitor to CDK6.The binding entropies (−T∆S) of LQQ-bound CDK6/Vcyclin, LQQ-bound CDK6, AP9bound CDK6/Vcyclin, and AP9-bound CDK6 are 17.20, 20.98, 21.08, and 19.22 kcal/mol, respectively, which are unfavorable for the binding of inhibitors to CDK6.On the whole, the binding abilities of inhibitors LQQ and AP9 to CDK6 are strengthened by 1.62 and 2.24 kcal/mol because of the Vcyclin binding.The results not only suggest that the binding ability of LQQ to CDK6 is stronger than AP9 to CDK6 but also verifies that the binding of the Vcyclin protein strengthens the binding ability of the inhibitor to CDK6.In summary, van der Waals interactions and HBI contribute the main force to inhibitor-CDK6 binding.In the future drug design with regard to CDK6, these two interactions should be paid special attention.

Analyses of Inhibitor-CDK6 Interaction Networks
To obtain atomic-level insights the interaction modes of inhibitors with CDK6, the residue-based free energy decomposition method was applied to estimate the inhibitorresidue interaction spectrum of LQQ and AP9 with CDK6, and the results are displayed in Figure 10.The contributions of the side chains and backbones of residues to the inhibitor-CDK6 associations are provided in Table 2. HBIs between the inhibitors and residues of CDK6 were analyzed using the program CPPTRAJ, and the results are listed in Table 3.The geometric information regarding inhibitor-residue interactions is depicted in Figure 11.

𝑆
and  , respectively, correspond to contributions of the side chains and backbones to electrostatic interactions ( ) of inhibitors with residues. and  , respectively, represent contributions of the side chains and backbones to inhibitor-residue polar solvation free energies.a Hydrogen bonds are analyzed by an acceptor-donor distance of <3.5 Å and acceptor-H-donor angle of >120 • .b Occupancy (%) is defined as the percentage of simulation time that a specific hydrogen bond exists.For the LQQ-bound CDK6/Vcyclin, LQQ produces interactions stronger than −0.8 kcal/mol with nine residues, including I19, V27, A41, F98, H100, V101, Q103, T107, and L152 (Figure 10A,E).According to Figure 11A, the hydrophobic groups of I19, V27, A41, H100, V101, Q103, T107, and L152 are located near the hydrophobic ring of LQQ.Thus, H100 structurally forms the π-π interaction of −1.77 kcal/mol with LQQ.Residues I19, V27, A41, V101, Q103, T107, and L152 yield the CH-π interactions with LQQ, and their For the LQQ-bound CDK6/Vcyclin, LQQ produces interactions stronger than −0.8 kcal/mol with nine residues, including I19, V27, A41, F98, H100, V101, Q103, T107, and L152 (Figure 10A,E).According to Figure 11A, the hydrophobic groups of I19, V27, A41, H100, V101, Q103, T107, and L152 are located near the hydrophobic ring of LQQ.Thus, H100 structurally forms the π-π interaction of −1.77 kcal/mol with LQQ.Residues I19, V27, A41, V101, Q103, T107, and L152 yield the CH-π interactions with LQQ, and their corresponding interaction energies are −3.1, −1.94, −0.83, −2.37, −1.83, −0.94, and −2.74 kcal/mol, respectively (Figure 10A,E and Table 2).Structurally, the phenyl groups of F98 are adjacent to the alkyls of LQQ, which leads to a CH-π interaction of −1.1 kcal/mol (Figure 11A and Table 2).In addition, H100 and V101 form three HBIs with LQQ, and their occupancy is higher than 65.4% (Table 3), indicating that these three hydrogen bonds are stable though the entire GaMD simulations.According to Table 2, the energetic contributions of I19, V27, A41, F98, T107, and L152 mostly arise from the sidechains of these residues.The energetic contribution of H100 mainly comes from the van der Waals interactions of its sidechain but the polar interactions are weak.The energetic contribution of V101 mostly arises from the van der Waals interactions of its sidechain and electrostatic interaction of its backbone.The energetic contribution of Q103 is primarily provided by the van der Waals interactions of both the sidechain and the backbone of Q103.Moreover, a hydrogen bond with an occupancy of 53% appears between LQQ and D163 (Table 3), implying a favorable force for the LQQ-CDK6 binding.Compared to the LQQ-bound CDK6/Vcyclin, LQQ produces similar interaction modes with CDK6 in the LQQ-bound CDK6, but the deleting of the Vcyclin protein also induces the alteration.By referencing the LQQ-bound CDK6/Vcyclin, the removal of the Vcyclin protein strengthens the LQQ-H100 interaction (Table 2), which mainly comes from the changes in electrostatic interactions of the sidechain from H100 with LQQ.
For the AP9-bound CDK6/Vcyclin, AP9 produces interactions stronger than −0.8 kcal/mol with seven residues, including I19, V27, A41, F98, V101, Q103, and L152 (Figure 10C,E).According to Figure 11C, the alkyls of I19, V27, A41, V101, and L152, and the CH-group of Q103 are located near the hydrophobic ring of AP9.Thus I19, V27, A41, V101, Q103, and L152 yield the hydrophobic CH-π interactions, and their corresponding interaction energies are −2.95,−2.22, −0.98, −2.71, −2.0, and −2.95 kcal/mol, respectively (Figure 10C,E and Table 2).Structurally, the phenyl groups of F98 are adjacent to the alkyls of AP9, which leads to a CH-π interaction of −1.68 kcal/mol (Figure 11C and Table 2).According to Table 2, the energetic contributions of I19, V27, A41, F98, and L152 mostly arise from the sidechains of these residues.The energetic contribution of V101 mainly stems from the van der Waals interactions of its sidechain with AP9 and the electrostatic interaction of its backbone with AP9.The energetic contribution of Q103 arises from the van der Waals interactions of both the sidechain and the backbone of Q103 with AP9.In addition, AP9 forms two HBIs with V101, and their occupancy is higher than 59.7%, indicating that these two hydrogen bonds are stable through the entire GaMD simulations (Table 3).By comparison with the AP9-bound CDK6/Vcyclin, AP9 yields the same interaction modes in the AP9-bound CDK6.Apart from residue Q103, the removal of the Vcyclin protein slightly weakens the interactions of AP9 with I19, V27, A41, F98, V101, and L152 (Table 2).The deletion of the Vcyclin protein enhances the occupancy of the hydrogen bond V101-N-H. ..AP9-N7 relative to the AP9-bound CDK6/Vcyclin, but decreases the occupancy of the hydrogen bond V101-O. ..AP9-N6-H6.
Based on the aforementioned description, two inhibitors, LQQ and AP9, form hydrophobic interactions with common residues I19, V27, A41, F98, V101, Q103, T107, and L152 of CDK6; moreover, our current predicted binding sites are in good agreement with the experiment results [66].Differently, LQQ also forms hydrophobic interaction with residue H100.By comparison with LQQ, AP9 misses two HBIs with residues H100 and D163.These different interactions reflect the structural difference between LQQ and AP9.It is concluded that the above-mentioned residues play key roles in the binding of inhibitors to CDK6.More importantly, the CH-π and π-π interactions, and HBIs between the abovementioned residues and inhibitors are identified as target sites of drug design with regard to CDK6, which should be paid special attention.

Scheme of Operating Calculations
The integration of MS-GaMD simulations and DL was employed to identify crucial residue contacts and uncover significant function domains of CDK6 binding.The overall approach is illustrated in Figure 12.The procedure involved: (1) extracting initial atomic coordinates from the PDB and constructing simulation systems using the Amber program with force field parameters, (2) performing three separate GaMD simulations to relax conformations and gather conformational ensembles, (3) utilizing the MDTraj program to convert conformational ensembles into images suitable for DL analysis, (4) randomly dividing the images into training and validation sets for image classification based on a two-dimensional (2D) convolutional neural network (CNN), (5) identifying significant residue contacts through the gradient maps, and finally, (6) obtaining reaction coordinates (RCs) from these key residue contacts to construct FELs and reveal conformation changes of CDK6.
formations and gather conformational ensembles, (3) utilizing the MDTraj program to con-vert conformational ensembles into images suitable for DL analysis, (4) randomly dividing the images into training and validation sets for image classification based on a two-dimensional (2D) convolutional neural network (CNN), ( 5) identifying significant residue contacts through the gradient maps, and finally, (6) obtaining reaction coordinates (RCs) from these key residue contacts to construct FELs and reveal conformation changes of CDK6.

Constructions of Simulated Systems
The initial structures of the LQQ-bound CDK6/Vcyclin and AP9-bound CDK6/Vcyclin complexes were obtained from the PDB and they, respectively, correspond to the PDB

Constructions of Simulated Systems
The initial structures of the LQQ-bound CDK6/Vcyclin and AP9-bound CDK6/Vcyclin complexes were obtained from the PDB and they, respectively, correspond to the PDB entries 2EUF and 2F2C [66].The LQQ-bound CDK6 without the Vcyclin protein was obtained by cutting Vcyclin from the crystal structure 2EUF.The AP9-bound CDK6 without the Vcyclin protein was obtained by deleting Vcyclin from the crystal structure 2F2C.The missing residues of CDK6 in four crystal structures were repaired by using the program Modeller [73].All of the crystal water and non-inhibitor molecules were deleted from the initial model.The protonated states of residues in CDK6 were examined by using the program H++3.0 [74].The Leap module in Amber20 [75,76] was utilized to complete the following tasks: (1) assigning force field parameters for CDK6 and CDK6/Vcyclin using the ff 19SB force field, (2) constructing a periodic box of octahedral TIP3P water molecules with a buffer size of 10.0 Å to solve four CDK6-related systems, and (3) adding counter ions into each system within a 0.15 M salt environment to achieve neutralization, where the force parameters for sodium ions (Na + ) and chloride ions (Cl − ) were obtained from Joung and Cheatham's work [77,78].The molecular structures of two inhibitors, LQQ and AP9, were optimized at a semi-empirical AM1 level, followed by assigning BCC charges to each atom of the inhibitors using Amber20's Antechamber module [79].The general Amber force field (GAFF2) [80,81] was employed to generate force field parameters for both LQQ and AP9 inhibitors.
Third, the standard deviation of ∆V should be sufficiently small to ensure the appropriateness of energetic reweighting: where σ ∆V is the standard deviation of ∆V, V avg is the average potential energy, and σ V is the standard deviation of the potential energy.σ 0 is a user-defined upper limit for appropriate reweighting.When E is set to the lower bound E = V max , k 0 can be: Similarly, when E is set to the upper bound E = V min + 1 k , k 0 can be: in which k 0 is the effective harmonic constant that determines the magnitude of the applied push potential.As k 0 increased, a higher harmonic push potential was applied to the free energy surface, thus enhancing the conformational sampling of biomolecules.Before GaMD simulation, a cMD simulation is conducted to obtain the V max , V min , V avg , and σ V of each simulation system.In our current study, 3 µs GaMD simulations were run on the LQQbound CDK6/Vcyclin, LQQ-bound CDK6, AP9-bound CDK6/Vcyclin, and AP9-bound CDK6 systems.These simulations consisted of three separate GaMD simulations, each running for 1 µs.To facilitate data analysis, we combined three separate GaMD trajectories into a single GaMD trajectory (SGT) using the CPPTRAJ module in Amber 20 [83].This allows us to extract valuable insights into the function of CDK6.We employed a program called PyReweighting developed by Miao et al. [84] to accurately reweight and identify the original free energy profiles of our CDK6-related systems.The SHAKE algorithm was used in both cMD and GaMD simulations to constrain chemical bonds between hydrogen atoms and heavy atoms [85].To regulate the temperatures of four CDK6-related systems, we utilized a Langevin thermostat with a collision frequency of 1.0 ps −1 [86].Non-bonded interactions were estimated using the PME method with a cutoff distance of 9 Å.All simulations were performed using pmemd.cudaimplemented in Amber 20 [75,76].

Deep Learning
Deep Learning was employed to analyze the GaMD simulations of the four CDK6related system.Contact maps for the conformational frames of the GaMD simulations were computed using a contact definition of ≤4.5 Å between any Cα atoms in a time interval of 100 ps.The Python packages MDTraj [87] and contact map explorer [87] were utilized to generate 295 × 295 residue contact maps.These contact maps were transformed into grayscale images of 295 × 295 pixels for subsequent 2D CNN analysis.A total of 30,000 images were generated for each CDK6-related system, with a random selection of 80% images used for training and the remaining images used for validation purposes.Our constructed 2D CNN was implemented using the Python PyTorch 1.12.1 package.The optimal model consisted of four convolutional layers with kernel sizes of 3 × 3, comprising filters in quantities of 32, 32, 64, and 64, respectively, followed by two fully connected (dense) layers where the first two contained filters numbering at 512 and then 128, respectively; both layers had a dropout rate set at 0.5.The final fully connected layer was the classification layer, which categorized input data into four classes: LQQ-bound CDK6/Vcyclin, LQQbound CDK6, AP9-bound CDK6/Vcyclin, and AP9-bound CDK6.Throughout all layers of the 2D CNN, except for the classification layer, the "ReLU" activation function was employed, whereas in the classification layer, the "softmax" activation function was utilized.The "softmax" function transforms the network's output into a probability distribution that represents the likelihood of each potential class.A 2 × 2 max-pooling layer was added after each convolutional layer.Finally, by utilizing the residue contact maps from the most populated structure in each CDK6-related system, significance (attention) maps of residue contact gradients were computed through backpropagation.We declare that our program was rewritten with the PyTorch package based on the work of Miao's group [63,64].

Construction of Free Energy Landscapes
Key residue contacts were detected by the DL model according to the three following criteria: (1) the contact gradients were higher than 0.7 in the saliency maps, (2) the residue contacts were calculated by using all Cα atoms, and (3) obvious changes in contacts among structural domains can be captured.The distances between the Cα atoms of key residues were used as RCs to construct FELs according to the reweighting procedure.
In the reweighting process of GaMD simulations, the reweighted free energy F(A) = −k B Tln(ρ A ) can be calculated as: where F * (A) = −k B Tlnp * (A) represents the modified free energy arising from GaMD simulations, F C denotes a constant, and β = k B T. The probability distribution p * (A) of selected RCs from GaMD simulations can be reweighted to recover the canonical ensemble distribution ρ A .All calculations involved in the free energy reweighting were realized by using the program PyReweighting 1.0 developed by Miao et al., and the details for the reweighting procedure have been elucidated in the work of Miao et al. [84].

Principal Component Analysis
PCA is a valuable tool for gaining insights into concerted motions of structural domains within biomolecules.Hence, we employed PCA to clarify how the binding of inhibitors and Vcyclin protein impacts the concerted motions of CDK6.In our work, PCA was performed by diagonalizing a covariance matrix C, which was constructed using the coordinates of the Cα atoms in CDK6 based on Equation (12): C = (q i − q i )(q j − q j T ) (12) in which q i and q j represent the Cartesian coordinates of the ith and jth Cα atoms in CDK6, respectively, while q i and q j are their averaged positions obtained from conformational ensembles recorded at the SGT.To calculate this average, a superimposition of the SGT with a referenced structure is performed to eliminate overall translations and rotations using a least-squares fitting procedure [88].The resulting eigenvalues and eigenvectors from the PCA are usually applied to, respectively, embody the fluctuation amplitude along an eigenvector and the concerted motions of structural domains.In our study, we performed the PCA using the CPPTRAJ program in Amber 20 [83].

QM/MM-GBSA Calculations
Binding free energies are usually employed to evaluate the binding ability of inhibitors to their targets.Inhibitor binding is influenced by two crucial factors, namely enthalpy changes and entropy changes.Currently, molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) and MM-GBSA are considered as two effective methods for this purpose [80,89].Hou's team performed several works to evaluate the performance of these two methods [90][91][92].Chen's group proved that the QM/MM-GBSA method can accurately evaluate the hydrogen bonding interactions [93].Based on their information, we selected the QM/MM-GBSA method to calculate inhibitor-CDK6 binding free energies with Equation (13) as follows: ∆G bind = ∆E ele + ∆E vdw + ∆G gb + ∆G sur f + ∆G sc f − T∆S (13)

Figure 2 .
Figure 2. Classification and saliency map of residue contact gradients: (A) classification of the LQQbound CDK6/Vcyclin, LQQ-bound CDK6, AP9-bound CDK6/Vcyclin, and AP9-bound CDK6, (B-E) the saliency map of residue contact gradients for the LQQ-bound CDK6/Vcyclin, LQQ-bound CDK6, AP9-bound CDK6/Vcyclin, and AP9-bound CDK6, and (F) key structural domains revealed by DL.The gradient of each residue contact is shown in a 0.2 (blue) to 0.8 (red) color scale.Through the characteristic residue contacts learned by DL, the involved structure domains are primarily involved in the G-loop, the αC helix, and the T-loop.The work of Schulze-Gahmen et al. indicated that the G-loop, the αC helix, and the T-loop participate in the stable

Figure 3 .
Figure 3. FEL constructed by using DSI1 and DIS2 as RCs and representative structures of the LQQbound CDK6/Vcyclin: (A) FEL, (B) superimposition of the EV1, the EV2, and the EV3 structures, (C) structural superimposition of LQQ in the EV1, the EV2, and the EV3 structures, (D-F) geometric positions of the G-loop, the αC helix, the T-loop, and the C-loop in the EV1, the EV2, and the EV3 structures, in which CDK6 was shown in surface modes.The PMF is scaled in kcal/mol.

Figure 3 .
Figure 3. FEL constructed by using DSI1 and DIS2 as RCs and representative structures of the LQQbound CDK6/Vcyclin: (A) FEL, (B) superimposition of the EV1, the EV2, and the EV3 structures, (C) structural superimposition of LQQ in the EV1, the EV2, and the EV3 structures, (D-F) geometric positions of the G-loop, the αC helix, the T-loop, and the C-loop in the EV1, the EV2, and the EV3 structures, in which CDK6 was shown in surface modes.The PMF is scaled in kcal/mol.
), that of structure EV2 shows a semi-open one (Figure 3E), and that of structure EV3 has an open state (Figure 3F).Based on these results, the up-down moving of the G-loop results in the open and closed states of the binding pocket during GaMD simulations.Moreover, the conformational transformation between the open and closed states of the binding pocket induces the twist of the LQQ posture.closedstate (Figure4F).Compared to structure EV1, structure EV3 has a greater distance between the αC helix and the T-loop.By referencing to the LQQ-bound CDK6/Vcyclin, the LQQ-bound CDK6 has no open state.Moreover, the LQQ-bound CDK6 induces more disordered states of the αC helix and the T-loop.Thus, binding of the Vcyclin protein contributes to the stable conformation of the αC and the T-loop[3].As a result, removing the Vcyclin protein leads to more disordered states of the αC helix and the T-loop.

Figure 4 .
Figure 4. FEL constructed by using DSI1 and DIS2 as RCs and representative structures of the LQQbound CDK6: (A) FEL, (B) superimposition of the EV1, the EV2, and the EV3 structures, (C) structural superimposition of LQQ in the EV1, the EV2, and the EV3 structures, (D-F) geometric positions of the G-loop, the αC helix, the T-loop, and the C-loop in the EV1, the EV2, and the EV3 structures, in which CDK6 was shown in surface modes.The PMF is scaled in kcal/mol.

Figure 4 .
Figure 4. FEL constructed by using DSI1 and DIS2 as RCs and representative structures of the LQQ-bound CDK6: (A) FEL, (B) superimposition of the EV1, the EV2, and the EV3 structures, (C) structural superimposition of LQQ in the EV1, the EV2, and the EV3 structures, (D-F) geometric positions of the G-loop, the αC helix, the T-loop, and the C-loop in the EV1, the EV2, and the EV3 structures, in which CDK6 was shown in surface modes.The PMF is scaled in kcal/mol.
Compared to structure EV1, structure EV3 has a greater distance between the αC helix and the T-loop.By referencing to the LQQ-bound CDK6/Vcyclin, the LQQ-bound CDK6 has no open state.Moreover, the LQQ-bound CDK6 induces more disordered states of the αC helix and the T-loop.Thus, binding of the Vcyclin protein contributes to the stable conformation of the αC and the T-loop [3].As a result, removing the Vcyclin protein leads to more disordered states of the αC helix and the T-loop.With regard to the AP9-bound CDK6/Vcyclin and AP9-bound CDK6, only an energy valley (EV1) was captured by GaMD simulation, and it was situated at (DIS1, DIS2) of (8.6 Å, 10.1 Å) and (8.5 Å, 13.1 Å), respectively (Figure 5A,C), implying that the binding of AP9 and Vcyclin does not lead to conformational rearrangement of CDK6.The two FELs exhibit some similarity; however, the DIS2 in the AP9-bound CDK6 is higher than that of the AP9-bound CDK6/Vcyclin.The binding pocket of structure EV1 in the AP9-bound CDK6/Vcyclin forms a semi-open state, while that of structure EV1 in the AP9-bound CDK6 shows a more open one (Figure 5B,D).Compared with the LQQ-bound CDK6/Vcyclin and LQQ-bound CDK6, the AP9-CDK6/Vcyclin and AP9-CDK6 have no closed state of the binding pocket.Thus, different inhibitors certainly impact the binding pocket conformation of CDK6.Molecules 2024, 29, x FOR PEER REVIEW 7 of 24

Figure 5 .
Figure 5. FEL constructed by using DSI1 and DIS2 as RCs and representative structures of the AP9bound CDK6/Vcyclin and AP9-bound CDK6: (A) FEL of AP9-bound CDK6/Vcyclin, (B) geometric positions of the G-loop, the αC helix, the T-loop, and the C-loop in the EV1 structure of AP9-bound CDK6/Vcyclin, (C) FEL of AP9-bound CDK6, (D) geometric positions of the G-loop, the αC helix, the T-loop, and the C-loop in the EV1 structure of AP9-bound CDK6.The PMF is scaled in kcal/mol.

Figure 5 .
Figure 5. FEL constructed by using DSI1 and DIS2 as RCs and representative structures of the AP9bound CDK6/Vcyclin and AP9-bound CDK6: (A) FEL of AP9-bound CDK6/Vcyclin, (B) geometric positions of the G-loop, the αC helix, the T-loop, and the C-loop in the EV1 structure of AP9-bound CDK6/Vcyclin, (C) FEL of AP9-bound CDK6, (D) geometric positions of the G-loop, the αC helix, the T-loop, and the C-loop in the EV1 structure of AP9-bound CDK6.The PMF is scaled in kcal/mol.

Figure 6 .
Figure 6.FELs constructed by using the distance of the Cα atom of residue E61 and that of A162 and the distance of the Cα atom of residue I59 and that of M174 as RCs: (A) LQQ-bound CDK6/Vcyclin, (B) LQQ-bound CDK6, (C) AP9-bound CDK6/Vcyclin, and (D) AP9-bound CDK6.The PMF is scaled in kcal/mol.

Figure 6 .
Figure 6.FELs constructed by using the distance of the Cα atom of residue E61 and that of A162 and the distance of the Cα atom of residue I59 and that of M174 as RCs: (A) LQQ-bound CDK6/Vcyclin, (B) LQQ-bound CDK6, (C) AP9-bound CDK6/Vcyclin, and (D) AP9-bound CDK6.The PMF is scaled in kcal/mol.
(1) the up-down moving of the G-loop results in the open and closed states of the binding pocket; (2) unlike the binding pocket of the LQQ-bound CDK6, that of AP9-bound CDK6 has no closed state of the binding pocket and (3) without the Vcyclin protein, single CDK6 leads to more disordered states of the αC helix and the T-loop.The work of He et al. showed that the flexibility of the G-loop alters the conformation of the binding pocket, supporting our current findings [71].The study of Schulze-Gahmen et al. indicated that the Vcyclin protein contributes to the stable conformation of the T-loop [3], agreeing with our current results.

Molecules 2024 ,
29, x FOR PEER REVIEW 11 of 24implying that these structural domains are possibly involved in the function domains of CDK6.

a
Hydrogen bonds are analyzed by an acceptor-donor distance of <3.5 Å and acceptor-H-donor angle of >120°.b Occupancy (%) is defined as the percentage of simulation time that a specific hydrogen bond exists.

Figure 12 .
Figure 12.Workflow of deep learning from GaMD simulations: (A) the structure of CDK6, (B) the initialized system bound with the inhibitor, (C) conformational ensembles recorded in three independent GaMD trajectories, (D) images extracted from the MDTraj program, (E) convolution neural networks, (F) the saliency maps obtained from backward propagation, (G) key residue contacts identified by deep learning, and (H) free energy landscapes used for revealing the conformation of CDK6 influenced by the inhibitors and the Vcyclin protein.

Figure 12 .
Figure 12.Workflow of deep learning from GaMD simulations: (A) the structure of CDK6, (B) the initialized system bound with the inhibitor, (C) conformational ensembles recorded in three independent GaMD trajectories, (D) images extracted from the MDTraj program, (E) convolution neural networks, (F) the saliency maps obtained from backward propagation, (G) key residue contacts identified by deep learning, and (H) free energy landscapes used for revealing the conformation of CDK6 influenced by the inhibitors and the Vcyclin protein.

Table 1 .
Binding free energies of inhibitors to CDK6 estimated by utilizing the QM/MM-GBSA method.

Table 1 .
Binding free energies of inhibitors to CDK6 estimated by utilizing the QM/MM-GBSA method. a

Table 2 .
Contributions of the side chains and backbones to inhibitor-residue interaction a .S vdW and B vdW , respectively, indicate contributions of the side chains and backbones to van der Waals interactions (T vdW ) of inhibitors with residues.S ele and B ele , respectively, correspond to contributions of the side chains and backbones to electrostatic interactions (T ele ) of inhibitors with residues.S gb and B gb , respectively, represent contributions of the side chains and backbones to inhibitor-residue polar solvation free energies.
a All energy components are scaled in kcal/mol.

Table 3 .
Hydrogen bonds formed between inhibitors and residues analyzed using CPPTRAJ.

Table 3 .
Hydrogen bonds formed between inhibitors and residues analyzed using CPPTRAJ.