Insights into the Allosteric Effect of SENP1 Q597A Mutation on the Hydrolytic Reaction of SUMO1 via an Integrated Computational Study

Small ubiquitin-related modifier (SUMO)-specific protease 1 (SENP1) is a cysteine protease that catalyzes the cleavage of the C-terminus of SUMO1 for the processing of SUMO precursors and deSUMOylation of target proteins. SENP1 is considered to be a promising target for the treatment of hepatocellular carcinoma (HCC) and prostate cancer. SENP1 Gln597 is located at the unstructured loop connecting the helices α4 to α5. The Q597A mutation of SENP1 allosterically disrupts the hydrolytic reaction of SUMO1 through an unknown mechanism. Here, extensive multiple replicates of microsecond molecular dynamics (MD) simulations, coupled with principal component analysis, dynamic cross-correlation analysis, community network analysis, and binding free energy calculations, were performed to elucidate the detailed mechanism. Our MD simulations showed that the Q597A mutation induced marked dynamic conformational changes in SENP1, especially in the unstructured loop connecting the helices α4 to α5 which the mutation site occupies. Moreover, the Q597A mutation caused conformational changes to catalytic Cys603 and His533 at the active site, which might impair the catalytic activity of SENP1 in processing SUMO1. Moreover, binding free energy calculations revealed that the Q597A mutation had a minor effect on the binding affinity of SUMO1 to SENP1. Together, these results may broaden our understanding of the allosteric modulation of the SENP1−SUMO1 complex.


Introduction
Small ubiquitin-related modifier (SUMO) modification, also called sumoylation, is a reversible process that catalyzes the post-translational modification of a target protein using a SUMO protein [1]. As an important mechanism in the regulation of numerous protein activities, sumoylation has been implicated in a number of biological processes, including signal transduction and cell proliferation, DNA replication and repair, cell cycle control, genome integrity, and protein localization [2,3]. Sumoylation is a highly dynamic process. The SUMO catalytic reaction is accomplished by E1, E2, and E3 ligases, while the reverse, desumolyation, is catalyzed by SUMO-specific proteases (SENPs) [4]. There are six SENP isoforms (SENP1-3 and SENP5-7) that have been identified in mammals; these are divided into three sub-families according to their sequence homology, subcellular localization, and substrate specificity [3]. SUMO proteins are expressed in their precursor forms, and SENPs catalyze the maturation process of SUMO proteins to expose their C-terminal Gly-Gly motif [1].
Among them, SENP1 has been associated with prostate tumorigenesis [5,6]. Accumulating evidence indicates that SENP1 is involved in the desumoylation of androgen receptor (AR), and that overexpression of SENP1 increases AR transcriptional activity, which has been found in >50% of high-grade precancerous prostate tissues, and in numerous prostate cancer cases [3,7]. Moreover, through comprehensive analysis of >150 specimens of prostate cancer, Wang et al. validated that there was a correlation between SENP1 expression and prostate cancer aggressiveness and recurrence [8]. Furthermore, based on quantitative PCR in human paired hepatocellular carcinoma (HCC), Cui et al. validated that deSUMOylation of hypoxia-inducible factor 1α (HIF-1α) by SENP1 is associated with increased cancer stemness in HCC and hepatocarcinogenesis under hypoxic conditions [9]. In addition, Tao et al. revealed that SENP1 is a crucial promotor for HCC through deSUMOylation of ubiquitin-conjugation enzyme E2T (UBE2T) [10]. Therefore, SENP1 is considered to be a promising target for the treatment of HCC and prostate cancer.
Human SENP1 is a 644 amino acid protein which includes an N-terminal regulatory domain (residues 1-414) and a C-terminal catalytic domain (residues 415-643) [11,12]. The C-terminal domain controls catalytic activity, while the N-terminal domain modulates substrate specificity and cellular localization. To date, only crystal structures of the Cterminal catalytic domain of SENP1 have been solved in the apo form or with SUMObound protein-protein interactions (PPIs). The three-dimensional crystal structure of the C-terminal catalytic domain of SENP1 in complex with SUMO1 (residues 20-97) shows that the N-terminal subdomain of SENP1 contains two β-strands (β1 and β2) and four α-helices (α1, α2, α5, and α6), while the C-terminal subdomain includes a four stranded β-sheet (β3-β6) that is surrounded by two α-helices (α3 and α4) ( Figure 1A). The C-terminal domain of SUMO1 forms an elongated strand that occupies the large cleft of SENP1. There is a covalent thiohemiacetal linkage between the active site Cys603 of SENP1 and the C-terminal carbonyl-C of Gly97 in SUMO1 ( Figure 1B).
It should be noted that the protease catalytic triad Cys603, His533, and Asp550 are located in the SENP1 active site at the front of the helix α5, at the front of the β5, and at the end of the β6, respectively. The nucleophilic active site Cys603 is coordinated by the general base His533, which is in turn stabilized by Asp550 ( Figure 1B). In vitro assays have demonstrated that each mutation of Cys603, His533, and Asp550 to alanine in SENP1 severely diminished its hydrolysis reaction with SUMO1 [3]. These results are explained based on the SENP1−SUMO1 structural complex, because the three residues form the catalytic center for the processing of pre-SUMO1 to its mature form. However, the Gln597 of SENP1 that is located at the unstructured loop is away from the interface with SUMO1, but the Q597A variant was severely impaired in the processing of SUMO1. Because Gln597 has no direct contact with the active sites of SENP1 and SUMO1, the effect of the Q597A mutation on the hydrolytic reaction of SUMO1 may be through an allosteric mechanism [13][14][15][16][17][18][19]. However, the underlying allosteric mechanism of the Q597A mutation in the disruption of SENP1 catalytic activity has remained unclear.
Here, we employed multiple microsecond MD simulations of wild-type (WT) and Q597A SENP1−SUMO1 complexes to investigate SENP1 Q597A mutation-induced allosteric effects in the explicit water environment. Coupled with cross-correlation analysis, principal component analysis, and binding free energy calculations, we showed that the SENP1 Q597A mutation largely affects the arrangement of the catalytic Cys603 andHis533 in the active site, which might impair the catalytic activity of SENP1 to process SUMO1. The results obtained may broaden our understanding of the allosteric modulation of SENP1−SUMO1 complex. The secondary structural elements of α-helices, β-strands, and loops are colored using wheat, yellow, and gray, respectively; (B) Conformational rearrangement of the catalytic triad Cys603, His533, and Asp550 in the SENP1 active site. Cys603 of SENP1 forms a covalent thiohemiacetal linkage with the C-terminal carbonyl-C of Gly97 in SUMO1. Gln597 of SENP1 on the unstructured loop is shown as sticks.

System Stability
To reveal the allosteric effect of SENP1 Q597A mutation on the hydrolysis of SUMO1, three replicate 1 µs MD simulations of WT and Q597A SENP1-SUMO1 complexes were performed. To unravel the conformational dynamics of the SENP1-SUMO1 complexes during the simulations, the root-mean-square deviation (RMSD) of all the Cα atoms was monitored for the three independent runs. As shown in Figure 2, the RMSD plots implied that both systems reached equilibrium after 400 ns of simulation, and the RMSD values for the WT and Q597A mutant were 1.74 ± 0.16 Å and 1.68 ± 0.20 Å, respectively. The mutant system had a similar RMSD value to the WT system, indicating that the single Q597A mutation on SENP1 had a minor effect on the overall stability of the protein complex. To further illuminate the local conformational dynamics of SENP1-SUMO1 upon mutation, we calculated the atomic root-mean-square fluctuation (RMSF) for all Cα atoms in the three independent runs. As shown in Figure 3, compared to the WT system, a markedly higher RMSF was observed at the residues Ser571-Asp602 of SENP1 in the mutant system, which are located at the unstructured loop connecting the helices α4 to α5. Indeed, the SENP1 Q597A mutation site is located at this unstructured loop. This result suggested that the Q597A mutation had a significant effect on its nearby residues. As for SUMO1, the RMSF profiles in both the WT and mutant systems were similar, indicating that the SENP1 Q597A mutation had a subtle effect on the conformational dynamics of SUMO1. However, in comparing the conformational dynamics of the N-and C-terminals of SUMO1, we found that the C-terminus became more stable relative to the N-terminus, with the latter exhibiting a large conformational flexibility. This observation was reasonable, because in the crystal structure of the SENP1-SUMO1 complex, the C-terminal of SUMO1 inserts into the catalytic site of SENP1, forming strong contacts with SENP1 and stabilizing the C-terminal SUMO1; while the N-terminal of SUMO1 has no direct contacts with SENP1, thereby demonstrating the noticeable conformational dynamics of the N-terminal of SUMO1 during the simulations.

Q597A Mutation Enhanced the Coupled Motions of Protein Domains
Dynamic cross-correlation matrix (DCCM) analysis was carried out to probe the interdependent conformational motions of the SENP1-SUMO1 complex among spatially different domains in both the WT and Q597A mutant systems. In the DCCM calculations, the CCij matrix reflects the correlated motions between the two Cα atoms (i and j), representing whether they move as correlated motions (CCij > 0) or as anti-correlated motions (CCij < 0). As shown in Figure 4, the CCij matrix of the SENP1-SUMO1 complex showed a conserved pattern of correlated and anti-corelated motions in both the WT and Q597A mutant systems. However, in SENP1, the Q597A mutant system had a higher correlated motion in the region of residues~Ile470-Ile620 compared to the WT system. This result was consistent with the RMSF analysis, in which the Q597A mutant system yielded an enhanced residue fluctuation in the region of SENP1 Ser571-Asp602 located at the unstructured loop connecting the helices α4 to α5. In contrast, the Q597A mutation caused a minor effect on the conformational dynamics of SUMO1. Collectively, these data indicated that the Q597A mutation yielded an enhancement of the correlated motions of protein domains in SENP1.

Pricipal Component Analysis (PCA)
In order to dissect the large-scale collective motions of the SENP1−SUMO1 complex and to reveal the effect of Q597A mutation on its conformational dynamics, principal component analysis (PCA) was carried out. According to PCA, the first two principal models of motion (i.e., principal components 1 and 2, PC1 and PC2) offer knowledge regarding the large amplitude motions of the SENP1−SUMO1 complex. During PCA, the three replicates of the simulated trajectories for both the WT and Q597A mutant systems were used, and subjected to RMS fitting to the same crystal structures of the SENP1−SUMO1 complex. This operation ensured the consistency of the PCs.
First, PC1 and PC2 were plotted over the two-dimensional surfaces of the SENP1−SUMO1 complex in the WT and Q597A mutant systems. As a result, the PC1 vs. PC2 plots for the WT complex showed one major state ( Figure 5A), while two states were observed in the Q597A mutant system ( Figure 5B). This indicated that the Q597A mutation induced conformational changes in the SENP1−SUMO1 complex. To further reveal which protein domains were affected by the Q597A mutation, PC1 was plotted over the three-dimensional structures of the SENP1−SUMO1 complex in the WT and Q597A systems, using arrows to represent the direction and the relative amplitude of motions. As shown in Figure 5C, in the WT system, SUMO1 showed moderate amplitude motions, while SENP1 displayed no obvious motions. In contrast, in the Q597A mutant system ( Figure 5D), SUMO1 showed large amplitude motions and SENP1 displayed obvious motions in the region of the unstructured loop connecting the helices α4 to α5. The PCA results unveiled an observation that the Q597A mutation induced marked motion of the unstructured loop in SENP1, which was agreement with the RMSF analysis. An increased flexibility of the unstructured loop in response to the Q597A mutation could have an essential role on the influence of the hydrolytic reaction of SUMO1.

Community Network Analysis
To unveil the allosteric differences between the WT and Q597A SENP1−SUMO1 complexes, allosteric community analysis was carried out to calculate mutual information for all residues based on the Girvan-Newman algorithm. The analysis can assess the varied coupling among the different communities. In all simulated trajectories, the two Cα atoms of any residues within a cut-off distance of 4.5 Å keeping >75% of simulation time were deemed as belonging to the same community. The community was regarded as a coordinated unit within the overall protein structure. Visualization of the community network graphs can uncover the corresponding intensity of the different communities by the means of comparing the allosteric networks of the WT and Q597A systems within SENP1-SUMO1. Both WT (Figure 6A,C) and Q597A ( Figure 6B,D) SENP1−SUMO1 complexes were divided into nine communities. Moreover, in both systems, the SUMO1 part largely belonged to Community 6, while the SENP1 part constituted Communities 1-5 and 7-9. These results suggested that the Q597A mutation did not change the number of communities within the SENP1-SUMO1 complex. However, the Q597A mutation induced the sizes and intensities of several communities to be changed. For example, the sizes of Communities 2, 4, 7, and 9 in the Q597A mutant were decreased compared to the WT, while the sizes of Communities 3 and 5 became larger in the Q597A mutant compared to those in the WT. In fact, the SENP1 residues Ser571-Asp602 were in Community 3 and the C-terminal helix α5 was in Community 5. The newly formed communication between Ser571-Asp602 and the C-terminal of helix α5 in the mutant would affect the conformational dynamics of the N-terminal helix α5 occupied by the catalytic Cys603. The three remaining communities, 1, 6, and 8, were unchanged in both the WT and Q597A systems. Indeed, the three communities 1, 6, and 8 mainly comprised the interface of SENP1−SUMO1 protein-protein interactions. These results implied that the Q597A mutation changes the communication network of SENP1, but may have a minor effect on SUMO1 dynamic communication.

Q597A Mutation Caused Conformational Changes to Cys603 and His533
In the SENP1 catalytic site, the catalytic triad Cys603, His533, and Asp550 forms direct interactions. His533 forms hydrogen bonding interactions with both Asp550 and Cys603. Thus, the correct arrangement of the catalytic triad in the SENP1 active site is critical for the hydrolysis of SUMO1. To reveal the impact of the Q597A mutation on the arrangement of the catalytic triad, we monitored the conformational dynamics of the catalytic triad during the simulations. We calculated the distances between the NE2 atom of His533 and the OD1 atom of Asp550 ( Figure 7A), the NE2 atom of His533 and the OD2 atom of Asp550 ( Figure 7B), and the ND1 atom of His533 and the SG atom of Cys603 ( Figure 7C) for the three independent runs. As shown in Figure 7A, the distance between the NE2 atom of His533 and the OD1 atom of Asp550 was stable in the WT system (3.44 ± 0.70 Å), while in the Q597A mutant, this distance was larger (3.74 ± 0.54 Å) than that in the WT system. The distance between the NE2 atom of His533 and the OD2 atom of Asp550, however, was slightly shorter in the Q597A mutant (3.18 ± 0.45 Å) compared to the WT system (3.56 ± 0.71 Å) ( Figure 7B). These results suggested concerted changes in His533 and Asp550 in both systems. Notably, the distance between the His533 and the catalytic residue Cys603 varied due to the Q597A mutation ( Figure 7C). In the WT system, the calculated distance was 3.69 ± 0.23 Å, which was shorter than that in the Q597A mutant (4.02 ± 0.49 Å). The analysis of the arrangement of the catalytic triad in both the WT and the Q597A mutant suggested that the mutation mainly affected the conformational arrangement of Cys603 and His533 of SENP1, affecting the hydrolysis of SUMO1. To further elucidate the arrangement of the catalytic triad, we extracted the most representative structural complexes of SENP1-SUMO1 using cluster analysis, and we superimposed the two conformations, focusing on the catalytic site. The k-means algorithm was used for cluster analysis, which first generated seed points, and then all the data points were iterated and each was assigned to its closest seed point. As shown in Figure 8, compared to the WT system, the catalytic triad Cys603, His533, and Asp550 showed conformational changes in the Q597A mutant system. Taken together, these results indicated that the SENP1 Q597A mutation caused the conformational changes of Cys603 and His533, which would impair the hydrolytic activity of SUMO1.

Q597A Mutation Had a Minor Effect on the Binding Affinity of SUMO1
To further reveal the effect of Q597A mutation on the binding affinity of SUMO1 to SENP1, a MM-PBSA binding free energy calculation was performed; this has been proven successful in evaluating protein-ligand or protein-protein interactions in different systems [43,44]. The last 200 ns trajectory in all three replicates for both the WT and Q597A systems were selected for the MM-PBSA binding free energy calculation. As shown in Table 1, the gas phase (∆E gas ) free energy of the SENP1-SUMO1 complex in the WT and Q597A mutant systems was −331.65 ± 15.88 and −324.55 ± 16.38 kcal/mol, respectively. The solvation free energy (∆G solvation ) of the SENP1-SUMO1 complex in the WT and Q597A mutant systems was −224.89 ± 12.72 and −220.79 ± 12.82 kcal/mol, respectively. Overall, the binding free energy (∆G binding ) of the SENP1-SUMO1 complex in the WT and Q597A mutant systems was −106.76 ± 6.46 and −103.77 ± 7.06 kcal/mol, respectively. The results show that the binding potencies of the SENP1-SUMO1 complex in the WT and Q597A mutant systems were similar, suggesting that the SENP1 Q597A mutation had a minor impact on the binding affinity of SUMO1 to SENP1.

Conclusions
In the present study, MD simulations in combination with multiple analysis methods, including RMSD, RMSF, DCCM, PCA, MM-PBSA, and community network analyses, were performed to elucidate the allosteric effect of the SENP1 Q597A mutation on the hydrolytic reaction of SUMO1. RMSF analyses revealed that Q597A mutation caused enhanced conformational dynamics of the unstructured loop linking the helices α4 to α5 occupied by the mutation site. DCCM analyses were carried out to reveal correlated motions in the SENP1-SUMO1 complex. The results revealed that Q597A mutation enhanced the correlated motions of protein domains in SENP1, rather than in SUMO1. The visualized community network data showed the same number of communities in the allosteric crosstalk for both the WT and Q597A mutant systems. Structural analysis further exposed that Q597A mutation caused conformational changes to Cys603 and His533 in the SENP1 catalytic site. MM-PBSA calculation further implied that the binding affinity of SUMO1 to each of WT and Q597A SENP1 was similar. Together, these data suggested that the allosteric effect of the Q597A mutation was through the impacts on Cys603 and His533 in SENP1, rather than on SUMO1, which may deepen our insight into the allosteric modulation of the SENP1-SUMO1 complex.

Structural Preparation
The 2.8 Å X-ray crystal structure of the SENP1−SUMO1 complex was obtained from the RCSB Protein Data Bank (PDB) (PDB ID: 2G4D) [11]. In 2G4D, the active site Cys603 of SENP1 was mutated to Ser603. Therefore, we mutated Ser603 back to Cys603 using the Discovery Studio program, and we manually adjusted the sidechain of SENP1 Cys603 to form a covalent thiohemiacetal linkage with the carbonyl-C of Gly97 in SUMO1. To construct the mutant, the SENP1 Gln597 was mutated to alanine to represent the SENP1 Q597A −SUMO1 complex.

MD Simulations
MD simulations of SENP1−SUMO1 complexes in the explicit water environment were performed using the AMBER 18 package [45]. The force field for the protein was treated using AMBER FF14SB [46], and the TIP3P force field was used for the water molecules [47]. The covalent bond between SENP1C603 and SUMO1G97 was defined using the antechamber module. Both the WT and Q597A SENP1−SUMO1 complexes were solvated by TIP3P water molecules in an octahedral box, with a minimum distance of 10 Å between any protein atom and the edge of the water box. To simulate physiological conditions, 0.15 mol/L NaCl was added to both systems.
A two-stage energy minimization approach was performed to optimize the initial complexes using previous protocols [48][49][50]. First, we applied minimization of water molecules and ions (Na + and Cl − ) with positional restraints on protein atoms via a harmonic potential with a force constant of 500 kcal mol −1 Å −2 . This process was completed with 5000 steps of the steepest descent energy minimization, followed by 5000 steps of the conjugate gradient energy minimization. Then, we applied minimization of all the atoms in the system without any restraints. This process was completed with 10,000 steps of the steepest descent energy minimization, followed by 20,000 steps of the conjugate gradient energy minimization. After minimization, both systems were heated gradually from 0 to 300 K within 300 ps, followed by constant temperature equilibration at 300 K for 700 ps, with positional restraints on protein atoms via a harmonic potential with a force constant of 10 kcal mol −1 Å −2 , using a canonical NVT ensemble. Finally, three independent runs of 1 µs MD simulations were simulated with random velocities for both systems under NPT ensemble and periodic boundary conditions. An integration step of 2 fs was set for the simulations. Langevin dynamics were used to maintain a temperature at 300 K with a collision frequency of 1 ps −1 . The particle mesh Ewald method was used to compute long-range electrostatic interactions [51]. A cutoff of 10 Å was set to calculate van der Waals and electrostatic interactions. The SHAKE method was used to constrain all covalent bonds involving hydrogen atoms [52].

Principal Component Analysis (PCA)
Principal component analysis (PCA) was performed to analyze the cartesian coordinates of the Cα atoms of the SENP1−SUMO1 complexes [53][54][55]. All simulated snapshots were used to construct a covariance matrix Cij: where x i is a cartesian coordinate of the ith Cα atom, and x i represents the time averaged over all the configurations selected in the simulation. Before PCA analysis, translational and rotational motions were excluded by overlaying the backbone atoms of SENP1−SUMO1 complexes on the reference crystal structure.
Diagonalization of the covariance matrix C yields the eigenvalues λi and the corresponding eigenvectors Vi, namely, the principal component (PC). Vi represent the directions in the multidimensional space that correspond to independent modes of atomic motion, while λi represent their corresponding amplitudes. The projection Proj (M, PC i ) of any structure (snapshot) M onto the ith PC was calculated as follows: where M α are the Cα atoms of the SENP1−SUMO1 complexes after overlaying M on the reference crystal structure.

Dynamic Cross-Correlation Analysis
The correlated motions were explored in both systems by calculating the dynamic cross-correlation matrix of the Cα atoms, described by the index c ij as follows: where the fluctuation ∆r i of the Cα atom of the ith residue along the simulated trajectory is calculated with respect to the average structure. The correlation values of c(i,j) fall in the [-1, 1] range. Positively correlated residues move in the same direction, whereas negatively (anti-correlated) residues move in the opposite direction. Values of 0 suggest the uncorrelated motions of two residues.

MM-PBSA Binding Free Energy Calculations
The molecular mechanics Poisson-Boltzmann surface area (MM−PBSA) method [43,44,[56][57][58][59][60] was performed to calculate the binding free energy between SENP1 and SUMO1. The binding free energy (∆G binding ) was shown as Equation (1): where ∆G complex , ∆G SENP1 , and ∆G SUMO1 are the free energies of the complex, SENP1, and SUMO1, respectively. Equation (1) can be further expressed as a sum of the absolute free energy in the gas phase (∆E gas ), the solvation free energy (∆G solvation ), and the entropy term (T∆S) (Equation (2)). The conformational entropy (−T∆S) was not calculated owing to challenges and inaccuracy in the estimation of the conformational entropy for the proteinprotein interactions [59].

Community Network Analysis
The community network analysis was calculated based on the correlation coefficient matrix, Cij, through the NetworkView plugin for VMD [55]. The Cα atoms of each residue in the SENP1−SUMO1 complex were considered to be a group of nodes that were connected by edges. The edges in the dynamic network were weighted according to the correlation data obtained from the DCCM results. The formation of the edges was calculated between the two nodes within a cut-off distance of 4.5 Å for >75% of the total of simulation time. We calculated the edge connections between two nodes using Equation (6): where i and j represent two nodes, and Cij was calculated from the dynamical crosscorrelation matrix of the Cα atoms.