Exploration of the Product Specificity of chitosanase CsnMY002 and Mutants Using Molecular Dynamics Simulations

Chitosanase CsnMY002 is a new type of enzyme isolated from Bacillus subtilis that is used to prepare chitosan oligosaccharide. Although mutants G21R and G21K could increase Chitosan yield and thus increase the commercial value of the final product, the mechanism by which this happens is not known. Herein, we used molecular dynamics simulations to explore the conformational changes in CsnMY002 wild type and mutants when they bind substrates. The binding of substrate changed the conformation of protein, stretching and deforming the active and catalytic region. Additionally, the mutants caused different binding modes and catalysis, resulting in different degrees of polymerization of the final Chitooligosaccharide degradation product. Finally, Arg37, Ile145 ~ Gly148 and Trp204 are important catalytic residues of CsnMY002. Our study provides a basis for the engineering of chitosanases.

At present, the best method of Cos preparation is enzymatic hydrolysis of Chitosan, which is mediated by chitosanases [9][10][11]. These are a group of enzymes with high similarity formed by seven families: GH3, GH5, GH7, GH8, GH46, GH75 and GH80 [12,13], where the GH46 family is different from the others.
Chitosanase MY002 (a GH46 family member) was successfully isolated in 2021 from Bacillus subtilis [14] and was referred to as chitosanase CSNMY002. Three mutants were produced, E19A, G21K and G21R, that were also active, but different from the wild type with respect to substrate binding and cleavage mechanism. The cleavage mode of Chitohexose (GlcN) 6 by CSNMY002 is a "3 + 3" symmetry mode, whereas the three mutants have a different mechanism.
Subsequent functional experiments explored the enzymatic properties of these enzymes, but how these three mutations affect the substrate binding and splitting mechanism is not known. Herein, we have used molecular dynamics to simulate the reaction of CSNMY002 wild type (WT) and its two mutants G21R and G21K with Chitodisaccharide (GlcN) 2 and Chitohexose (GlcN) 6 . Our results provide the basis for the design of new chitosanases.
CSNMY002 wild type (WT) and its two mutants G21R and G21K with Chitodisaccharide (GlcN)2 and Chitohexose (GlcN)6. Our results provide the basis for the design of new chitosanases.

System Stabilization
The root-mean-square deviation (RMSD) values of the atomic skeletons in the four simulation architectures (Figure 2A,B) showed that the four reaction systems reached equilibrium after 60 ns. The final RMSD was always below 5 Å, indicating that the systems were stable during the 100 ns MD simulation.

System Stabilization
The root-mean-square deviation (RMSD) values of the atomic skeletons in the four simulation architectures (Figure 2A,B) showed that the four reaction systems reached equilibrium after 60 ns. The final RMSD was always below 5 Å , indicating that Then Rg value represents the tight degree of protein structure. The smaller the value of Rg is, the more closely the 3D structure of the protein is. The Rg was below 30 Å in all four systems ( Figure 3A-D), and it was smaller in the mutants, indicating a conformational change after binding to ligand. Then Rg value represents the tight degree of protein structure. The smaller the value of Rg is, the more closely the 3D structure of the protein is. The Rg was below 30 Å in all four systems ( Figure 3A-D), and it was smaller in the mutants, indicating a conformational change after binding to ligand.

System Stabilization
The root-mean-square deviation (RMSD) values of the atomic skeletons in the four simulation architectures (Figure 2A,B) showed that the four reaction systems reached equilibrium after 60 ns. The final RMSD was always below 5 Å , indicating that Then Rg value represents the tight degree of protein structure. The smaller the value of Rg is, the more closely the 3D structure of the protein is. The Rg was below 30 Å in all four systems ( Figure 3A-D), and it was smaller in the mutants, indicating a conformational change after binding to ligand. The SASA value of the free protein was stable (around 28,000~30,000 Å 2 ) after 100 ns (Figures 4 and 5), but it was smaller in the three complexes, with final values stable between 26,000 and 28,000 Å 2 , suggesting reduced protein hydrophilicity caused by ligand binding. The SASA residue values of subsites −1 and +1 ( Figure 5A,B) showed a significant increase in Thr50 in 7C6C-(GlcN) 2 , compared to 7C6C-free, but residues at subsite 1 did not show significant differences. The SASA values of residue Tyr118 in the two mutants increased, whereas G21K did not show significant changes in these key sites.  The SASA value of the free protein was stable (around 28,000 ~ 30,000 Å 2 ) after 100 ns ( Figure 4 and 5), but it was smaller in the three complexes, with final values stable between 26,000 and 28,000 Å 2 , suggesting reduced protein hydrophilicity caused by ligand binding. The SASA residue values of subsites −1 and +1 ( Figure 5A,B) showed a significant increase in Thr50 in 7C6C-(GlcN)2, compared to 7C6C-free, but residues at subsite 1 did not show significant differences. The SASA values of residue Tyr118 in the two mutants increased, whereas G21K did not show significant changes in these key sites.  Overall, the four systems were stable after 100 ns MD simulations and could be used in subsequent steps.

Conformational Changes between Protein and Ligands
Compared with free CSNMY002, binding to (GLCN)2 induced changes in residues 83-89 ( Figure 6A), and differences were observed in the active site region. The RMSF value The SASA value of the free protein was stable (around 28,000 ~ 30,000 Å 2 ) after 10 ns ( Figure 4 and 5), but it was smaller in the three complexes, with final values stab between 26,000 and 28,000 Å 2 , suggesting reduced protein hydrophilicity caused by li and binding. The SASA residue values of subsites −1 and +1 ( Figure 5A,B) showed a si nificant increase in Thr50 in 7C6C-(GlcN)2, compared to 7C6C-free, but residues at subsi 1 did not show significant differences. The SASA values of residue Tyr118 in the two m tants increased, whereas G21K did not show significant changes in these key sites.  Overall, the four systems were stable after 100 ns MD simulations and could be use in subsequent steps.

Conformational Changes between Protein and Ligands
Compared with free CSNMY002, binding to (GLCN)2 induced changes in residu 83-89 ( Figure 6A), and differences were observed in the active site region. The RMSF valu Overall, the four systems were stable after 100 ns MD simulations and could be used in subsequent steps.

Conformational Changes between Protein and Ligands
Compared with free CSNMY002, binding to (GLCN) 2 induced changes in residues 83-89 ( Figure 6A), and differences were observed in the active site region. The RMSF value of the G21R mutant and free WT were almost the same and only showed a strong fluctuation at the active site Trp204. In contrast, the G21K mutant showed strong fluctuations during the MD simulations at Arg37, Thr50, Tyr118 and Trp204.
olecules 2023, 28, x FOR PEER REVIEW 5 of 12 of the G21R mutant and free WT were almost the same and only showed a strong fluctuation at the active site Trp204. In contrast, the G21K mutant showed strong fluctuations during the MD simulations at Arg37, Thr50, Tyr118 and Trp204.  In two mutants, the α-helices remained unchanged in the Ile145-Gly148 domain (Table 1). In the free WT, the α-helices in the inner part dropped and sharply formed a loop. Residues Ile145-Gly148 located at the α6 region contain key catalytic residues at subsites −1 and +1. In both mutants, the enhanced helix probability may increase the tunnel length of the α6 helix, which may facilitate sliding of the substrate into the tunnel.  The secondary structure change probability is shown in (Figure 7A-E).
of the G21R mutant and free WT were almost the same and only showed a strong fluctu ation at the active site Trp204. In contrast, the G21K mutant showed strong fluctuation during the MD simulations at Arg37, Thr50, Tyr118 and Trp204. The secondary structure change probability is shown in (Figure 7A-E). In two mutants, the α-helices remained unchanged in the Ile145-Gly148 domain (Ta ble 1). In the free WT, the α-helices in the inner part dropped and sharply formed a loop Residues Ile145-Gly148 located at the α6 region contain key catalytic residues at subsite −1 and +1. In both mutants, the enhanced helix probability may increase the tunnel lengt of the α6 helix, which may facilitate sliding of the substrate into the tunnel.  In two mutants, the α-helices remained unchanged in the Ile145-Gly148 domain ( Table 1). In the free WT, the α-helices in the inner part dropped and sharply formed a loop. Residues Ile145-Gly148 located at the α6 region contain key catalytic residues at subsites −1 and +1. In both mutants, the enhanced helix probability may increase the tunnel length of the α6 helix, which may facilitate sliding of the substrate into the tunnel. The amino acid residues of the Ile145-Gly148 fragment were used to analyze RMSD, Rg and SASA. The RMSD value of the free protein fluctuated, especially around 60 ns ( Figure 8A,B), but that of the three complexes was more stable. The Rg of the free protein in this region was smaller than in the three complexes ( Figure 8C,D), where it increased after binding the substrate, especially those involving the two mutant complex systems. The SASA of the free protein in this region was lower, whereas the hydrophilicity of the three complex systems increased. The SASA values of the two mutants showed an increase ( Figure 8E,F The amino acid residues of the Ile145-Gly148 fragment were used to analyze RM Rg and SASA. The RMSD value of the free protein fluctuated, especially around 60 ( Figure 8A,B), but that of the three complexes was more stable. The Rg of the free pro in this region was smaller than in the three complexes ( Figure 8C,D), where it increa after binding the substrate, especially those involving the two mutant complex syste The SASA of the free protein in this region was lower, whereas the hydrophilicity of three complex systems increased. The SASA values of the two mutants showed an crease ( Figure 8E,F). Protein active pocket analysis is important to study protease activity. The ac pocket region in the MD simulation was calculated using POCASA [15]. Changes in ac pocket volume were examined after 0, 50 and 100 ns ( Figure 9A-D). Compared with free protein, the active pocket volumes of the three complex systems increased. In the mutants, they were bigger than in the 7C6C-(GlcN)2 complex, which may be useful for substrate to slide into and to start the catalytic reaction. This may be one of the reasons the improvement of enzyme activity in the two mutants. Protein active pocket analysis is important to study protease activity. The active pocket region in the MD simulation was calculated using POCASA [15]. Changes in active pocket volume were examined after 0, 50 and 100 ns ( Figure 9A-D). Compared with the free protein, the active pocket volumes of the three complex systems increased. In the two mutants, they were bigger than in the 7C6C-(GlcN) 2 complex, which may be useful for the substrate to slide into and to start the catalytic reaction. This may be one of the reasons for the improvement of enzyme activity in the two mutants.

Cross-Correlation Analysis
Cross-correlation matrix analysis can find protein regions that experience large conformational changes. The G21K-(GlcN)2 complex showed more flexibility in the MD simulations, whereas Ile145-Gly148 were negatively correlated in the G21K-(GlcN)2 system ( Figure 10A-D).

Cross-Correlation Analysis
Cross-correlation matrix analysis can find protein regions that experience large conformational changes. The G21K-(GlcN) 2 complex showed more flexibility in the MD simulations, whereas Ile145-Gly148 were negatively correlated in the G21K-(GlcN) 2 system ( Figure 10A-D).

Molecular Docking
The substrate was docked to CSNMY002 with AutoDock 4.2 software [18][19][20][21]. The grid size was set to 60 × 50 × 60 Å and spacing between grid points was 0.375 Å. After docking, protein-ligand complexes with the lowest energy were selected and used for subsequent molecular dynamics simulations.

Molecular Dynamics Simulations
Amber 16 software [22,23] was used to simulate the systems consisting of free WT, WT-(GlcN) 2 , G21K-(GlcN) 2 and G21R-(GLCN) 2 for 100 ns. The force field for the protein was Amber FF99SB [24,25], whereas for (GLCN) 2 it was GAFF2 [26,27]. The TIP3P model [28,29] was used, and periodic boundary conditions were applied to the reaction system during the simulation. Because the net charge in the initial reaction system is not zero, Na + was added in the initial stage of the simulation. The information of each system is listed in Table 3. After the system was built, it was energy-minimized using the steepest descent and a conjugate gradient method, each with 500 steps. After this minimization, the initial structure was stable. The temperature of the simulated reaction was raised from 0 to 300 K in 50 ps. At the end of heating, the system was left to react for another 50 ps. Finally, the system was equilibrated with constant pressure under NPT condition [30][31][32], with a constant pressure balance of 500 ps at 300 K. This was the last step for the system balance, which took 2 fs. After stabilization of all the thermodynamic parameters, a 100 ns MD simulation was performed for each system, collecting data every 1 fs, with a storage interval of 2 ps/interval and a total of 10,000 frames. AutoDockTools 1.5.6 was used for the six molecular docking systems. The results of molecular docking were visualized with Pymol 2.4.0. Data from four MD systems were collected and analyzed for protein structure fluctuation, combined with analysis of pocket volume, stretch kinetics and secondary structure. Trajectory analyses were computed using Amber16 s CPPTRAJ module [15] and included RMSD, radius of gyration, RMSF, SASA and dictionary of secondary structures. The cross-correlation matrix of the trajectory was generated with Tcl script in VMD, and its eigenvector and eigenvalue were calculated [33,34].

MM-PBSA
The MM-PBSA method [35] was used to predict binding free energies and relative stabilities of the models [36]. Binding free energies were calculated using the MM-PBSA method in AMBER 16. A total of 100 snapshots were chosen evenly from the MD trajectory. Total binding energy (∆G bind ) was computed using the equation: where ∆Gbind is the binding free energy between protein and ligand, calculated as the difference between the total free energy of the complex (Gcomplex) and the sum of the free energy of protein (Gprotein) and ligand (Gligand). The binding energy is expressed as the combination of enthalpy and entropy terms: where T∆S refers to the entropic contribution to the free energy in a vacuum, and T and S are temperature and entropy, respectively. The changes in protein and ligand upon binding were similar in all complexes, with very small entropy differences; therefore, the calculation of the solvate entropy term is omitted.
where E MM is the molecular mechanics energy of the molecule expressed as the sum of internal energy and electrostatic and van der Waals energies.
The solvation free energy is the sum of polar and nonpolar contributions: where G nonpolar is calculated from the solvent-accessible surface area (SASA): Here, γ = 0.0072 kcal/mol/Å, and b = 0 kcal/mol.

Conclusions
First, binding of substrate can activate the protein, changing the conformation of the active and catalytic regions. The combination of chitosaccharide and chitosanase can effectively stabilize the structure of chitosanase during the reaction and enhance its stability. The point mutation of residue 21 changed the original properties of regions 145-148 and 198-208. When combined with the substrate, the segment underwent obvious stretching deformation, thus changing the initial wave condition. In chitosanase CSNMY002, the active region recruits Chitosan molecules through stretching and deformation. When Chitodisaccharide or Chitohexose docks and reacts with CSNMY002, a tight interaction network forms, and Chitosan is degraded.
Second, point mutations such as G21K and G21R lead to conformational changes of the original degradation site and a changed degree of polymerization of the final degradation product. In the two mutants, the helical augmentation effect effectively inhibited the catalytic action of the original specificity on the +1/−1 site and at the same time increased the tunnel length of the α6 helix, which made the hydrogen bond network between chitosanase and chitosan molecules more stable and closer.
Finally, in this enzyme, Arg37, Ile145-Gly148 and Trp204 are important catalytic residues, and Arg37 forms two stable hydrogen bonds with the -1 site which helps to form a tighter complex with Chitosan. Ile145-Gly148 is an important binding and catalytic site, which can catalytically degrade the β-(1,4)-glycosidic bond under the synergistic action of Glu19 sites while binding the +1/−1 sites. Trp204, located in the α9 helix, is also an important site for degradation, which can be induced by the synergistic action of Lys21 or Arg21. These sites are potential target sites for CSNMY002 optimization.