Mutation Hotspot for Changing the Substrate Specificity of β-N-Acetylhexosaminidase: A Library of GlcNAcases

β-N-Acetylhexosaminidase from Talaromyces flavus (TfHex; EC 3.2.1.52) is an exo-glycosidase with dual activity for cleaving N-acetylglucosamine (GlcNAc) and N-acetylgalactosamine (GalNAc) units from carbohydrates. By targeting a mutation hotspot of the active site residue Glu332, we prepared a library of ten mutant variants with their substrate specificity significantly shifted towards GlcNAcase activity. Suitable mutations were identified by in silico methods. We optimized a microtiter plate screening method in the yeast Pichia pastoris expression system, which is required for the correct folding of tetrameric fungal β-N-acetylhexosaminidases. While the wild-type TfHex is promiscuous with its GalNAcase/GlcNAcase activity ratio of 1.2, the best single mutant variant Glu332His featured an 8-fold increase in selectivity toward GlcNAc compared with the wild-type. Several prepared variants, in particular Glu332Thr TfHex, had significantly stronger transglycosylation capabilities than the wild-type, affording longer chitooligomers – they behaved like transglycosidases. This study demonstrates the potential of mutagenesis to alter the substrate specificity of glycosidases.

In general, the specificity of wild-type (WT) fungal β-N-acetylhexosaminidases varies between species. There are relatively selective GlcNAcases: β-N-acetylhexosaminidase from Aspergillus oryzae (AoHex) with a GalNAcase/GlcNAcase ratio of 0.3-0.6 [9], or from A. versicolor (AvHex) with a GalNAcase/GlcNAcase ratio of 0.09-0.11 [10]. There are also prevalent GalNAcases, e.g., β-N-acetylhexosaminidase from Penicillium oxalicum (PoHex) with a GalNAcase/GlcNAcase ratio of 1.4-2.3 [9,11]. Although bacterial β-Nacetylhexosaminidases capable of GalNAc transfer have been identified [12][13][14], selective yielded only limited results in our hands. Therefore, we resorted to site-directed mutagenesis, where the prospective variants were proposed by in silico modeling. Thus, we could thoroughly elucidate the relationship between the amino acid substitution in the active site and the change in the enzyme properties. We found that the point mutation at this position prevalently led to an increase in GlcNAcase activity relative to GalNAcase. In the series of ten variants prepared, we identified Glu332His TfHex, which had a GalNAcase/GlcNAcase ratio (0.15) comparable to the most selective natural fungal β-Nacetylhexosaminidase from A. versicolor [10].

Results
Our previous docking and molecular dynamics simulation study of interactions between the active site residues of TfHex and its substrates revealed that the side-chain of Glu332 is located close to the C-3 and C-4 hydroxyls of the substrate and could form hydrogen bond interaction with pNP-GlcNAc while the hydrogen bond with pNP-GalNAc was not formed, which often affects the binding affinity to the substrate ( Figure  1) [32]. Thus, due to the direct involvement in the interaction with the substrate, the mutation of the Glu332 residue was proposed to significantly shift the substrate specificity either sterically, energetically, or both. Therefore, this residue was selected as the mutagenesis hotspot for the present study.

Site-Saturation Mutagenesis Approach-A Dead-End Street?
In the first approach, we decided to perform site-saturation mutagenesis of the TfHex gene at the position of Glu332 (Supplementary Table S1). Due to their complex quarternary structure, β-N-acetylhexosaminidases need to be produced in a eukaryotic host such as Pichia pastoris. Advantageously, P. pastoris is capable of complex posttranslational modifications, and in particular, the correct folding of the catalytically active tetramer consisting of two non-covalently associated subunits and two non-covalently linked propeptides [39], and glycosylation, contrary to the prokaryotic host of E. coli [40,41]. The WT gene of TfHex was cloned into the pPICZαA expression vector using EcoRI and KpnI restriction sites downstream of the α-factor (for extracellular enzyme targeting and zeocin resistance) as described previously [39]. The site-saturation mutagenesis was performed by PCR as detailed in Materials and Methods. After the

Results
Our previous docking and molecular dynamics simulation study of interactions between the active site residues of Tf Hex and its substrates revealed that the side-chain of Glu332 is located close to the C-3 and C-4 hydroxyls of the substrate and could form hydrogen bond interaction with pNP-GlcNAc while the hydrogen bond with pNP-GalNAc was not formed, which often affects the binding affinity to the substrate ( Figure 1) [32]. Thus, due to the direct involvement in the interaction with the substrate, the mutation of the Glu332 residue was proposed to significantly shift the substrate specificity either sterically, energetically, or both. Therefore, this residue was selected as the mutagenesis hotspot for the present study.

Site-Saturation Mutagenesis Approach-A Dead-End Street?
In the first approach, we decided to perform site-saturation mutagenesis of the Tf Hex gene at the position of Glu332 (Supplementary Table S1). Due to their complex quarternary structure, β-N-acetylhexosaminidases need to be produced in a eukaryotic host such as Pichia pastoris. Advantageously, P. pastoris is capable of complex post-translational modifications, and in particular, the correct folding of the catalytically active tetramer consisting of two non-covalently associated subunits and two non-covalently linked propeptides [39], and glycosylation, contrary to the prokaryotic host of E. coli [40,41]. The WT gene of Tf Hex was cloned into the pPICZαA expression vector using EcoRI and KpnI restriction sites downstream of the α-factor (for extracellular enzyme targeting and zeocin resistance) as described previously [39]. The site-saturation mutagenesis was performed by PCR as detailed in Materials and Methods. After the transformation of competent cells of P. pastoris KM71H, we initially screened 150 clones as recommended by Reetz et al. [42,43] (see Equation (1) in Section 4-Materials and Methods) to cover all 19 amino acid residues except for the WT (Supplementary Table S1). However, because the activity screening afforded mostly variants with a GalNAcase/GlcNAcase ratio close to the WT (ca. 0.6-1.4), we increased the number of screened colonies to approximately 1600. For this aim, we optimized the screening of transformed P. pastoris cells in 96-deep-well plates under methanol induction based on the procedure by Weis et al. [44]. The deep-well screening was performed in the minimal medium with glucose as a carbon source, with 6-day-long cultivation. The production of active enzymes (GalNAcase & GlcNAcase activities) in the cultivation medium was confirmed by the spectrophotometric activity assay using pNP-GalNAc and pNP-GlcNAc substrates.
The 22 prospective variants with a significantly modified GalNAcase/GlcNAcase ratio (more than 3-fold decreased or increased compared to the WT) were then produced by the standard cultivation procedure in flasks and purified by cation-exchange chromatography. Thus, we could confirm or disprove the preliminary estimate of the activity ratio from the medium screening. As a result, only seven variants were selected for further analysis, in which the significant shift in activity ratio was confirmed after purification. The respective Glu332 mutation in the mutant variants was identified by DNA sequencing. Chromosomal DNA of P. pastoris was isolated by combining enzymatic degradation (zymolyase) and adsorption (DNeasy Ultraclean Microbial Kit) to reach sufficient amounts of DNA for further processing. After amplification of mutant Tf Hex genes by PCR, they were ligated into the pGEM-T Easy vector and sequenced.
The properties of the analyzed clones selected for DNA isolation are summarized in Supplementary Table S2. In summary, as a result of site-saturation mutagenesis, we identified Glu332Gly, Glu332His, and Glu332Trp Tf Hex variants with a significantly shifted GalNAcase/GlcNAcase activity ratio (Supplementary Table S2). In one case, we detected simultaneous recombination of two different genes into the yeast genome in the variant Glu332Trp/Glu332Arg. Although multiple gene copies are usually desirable due to higher expression levels of recombinant proteins [45,46], and may further reinforce the desired enzyme property, multiple gene expression hinders a straightforward elucidation of structure-activity relations. In our case, the combined Trp/Arg variant showed a quite favorable GalNAcase/GlcNAcase activity ratio, but since it was not the best of all variants, we have not analyzed it further.
Despite the risk of multicopy expression, P. pastoris has previously been used for screening of directed-evolution mutants or products of DNA shuffling. Chen and coworkers identified two mutant α-galactosidases from Penicillium janczewskii zalesk with a higher galactosidase activity, potentially useful in feed manufacturing [47]. Kao and coworkers used a 96-deep-well plate screening to identify a β-glucosidase with an enhanced specific activity [48]. However, since this system afforded just limited results for the site-saturation mutagenesis of Tf Hex (only four prospective amino acid exchanges were identified), we resorted to the site-directed mutagenesis approach, as detailed further.

Site-Directed Mutagenesis Approach
For the aim of site-directed mutagenesis, we had to predict the potential candidate mutations at Glu332 by molecular modeling.

In Silico Rational Design of Prospective Tf Hex Mutants
The in silico modeling of interactions between β-N-acetylhexosaminidases and their substrates is a relatively challenging task, given the high substrate flexibility of the enzyme. Additionally, the predictions acquired for Tf Hex are just based on the homology model with the enzyme from AoHex (sequence identity of catalytic domains is 67%), which is the only known fungal β-N-acetylhexosaminidase with a resolved 3D structure [38]. These limitations had to be considered when analyzing the results of in silico experiments. The final selection of potential candidate mutant variants of Tf Hex with a shifted Gal-NAcase/GlcNAcase ratio was performed based on a combination of in silico methods. One of them was the scanning of residues by BioLuminate based on MM-GBSA scoring (a part of Schrödinger software, release 2018-4) [49]. By using this program, the Glu332 residue was in silico mutated into the corresponding amino acid residue, followed by short energy minimization in an implicit solvent (including only residues within 5 Å of the mutated residue). Then, the change in the substrate-enzyme binding affinity compared with WT based on MM-GBSA scoring (∆∆G = ∆G mutant − ∆G WT ) was calculated. More negative values of ∆∆G corresponded to variants with an enhanced preference for the respective substrate over WT. The discrimination between pNP-GlcNAc and pNP-GalNAc should be achieved by selecting variants with a higher absolute difference in respective ∆∆G values ( Figure 2). According to the ∆∆G calculation, the exchange of Glu332 for Gln, Lys, Met, Thr, Phe, His, Trp, and Leu was predicted to improve the affinity to both substrates (pNP-GlcNAc and pNP-GalNAc) compared with WT, with a more pronounced GlcNAcase activity ( Figure 2). In addition, two variants were expected to show improvement in GalNAcase activity compared to WT: Glu332Arg and Glu332Tyr Tf Hex.
Glu332 residue was in silico mutated into the corresponding amino acid residue, followed by short energy minimization in an implicit solvent (including only residues within 5 Å of the mutated residue). Then, the change in the substrate-enzyme binding affinity compared with WT based on MM-GBSA scoring (ΔΔG = ΔGmutant − ΔGWT) was calculated. More negative values of ΔΔG corresponded to variants with an enhanced preference for the respective substrate over WT. The discrimination between pNP-GlcNAc and pNP-GalNAc should be achieved by selecting variants with a higher absolute difference in respective ΔΔG values ( Figure 2). According to the ΔΔG calculation, the exchange of Glu332 for Gln, Lys, Met, Thr, Phe, His, Trp, and Leu was predicted to improve the affinity to both substrates (pNP-GlcNAc and pNP-GalNAc) compared with WT, with a more pronounced GlcNAcase activity ( Figure 2). In addition, two variants were expected to show improvement in GalNAcase activity compared to WT: Glu332Arg and Glu332Tyr TfHex. Since the BioLuminate method only analyzes the interactions of a static complex, the hypothetical Glu332 mutant variants in the complex with both substrates were next subjected to molecular dynamics simulation. Thus, we could verify the stability of the formed complexes and analyze the interactions of the respective variant with the substrates. The molecular dynamics simulations of enzyme-substrate complexes were run in YASARA for 30-50 ns. The root-mean-square deviation (RMSD) analysis revealed that pNP-GlcNAc was not stabilized in the active site of Glu332Met and Glu332Arg TfHex and that neither of the substrates was well stabilized in the active site of Glu332Leu TfHex (Supplementary Figures S1 and S2). Therefore, these three variants were excluded from the BioLuminate selection. The complexes of variants with significant differences in substrate orientation required for hydrolysis were analyzed by visual inspection. All modeled complexes are shown in Supplementary Figures S2 and S3. The formation of hydrogen bonds, reflecting on the correct formation of the oxazolinium ion intermediate, is shown in Figure S4. For substrates docked in the selected candidate mutants, this parameter distance was comparable to WT. Since the BioLuminate method only analyzes the interactions of a static complex, the hypothetical Glu332 mutant variants in the complex with both substrates were next subjected to molecular dynamics simulation. Thus, we could verify the stability of the formed complexes and analyze the interactions of the respective variant with the substrates. The molecular dynamics simulations of enzyme-substrate complexes were run in YASARA for 30-50 ns. The root-mean-square deviation (RMSD) analysis revealed that pNP-GlcNAc was not stabilized in the active site of Glu332Met and Glu332Arg Tf Hex and that neither of the substrates was well stabilized in the active site of Glu332Leu Tf Hex (Supplementary Figures S1 and S2). Therefore, these three variants were excluded from the BioLuminate selection. The complexes of variants with significant differences in substrate orientation required for hydrolysis were analyzed by visual inspection. All modeled complexes are shown in Supplementary Figures S2 and S3. The formation of hydrogen bonds, reflecting on the correct formation of the oxazolinium ion intermediate, is shown in Figure S4. For substrates docked in the selected candidate mutants, this parameter distance was comparable to WT.
The equilibrated complexes of both substrates with the variants allowed us to analyze the binding score for each substrate-enzyme complex for selected snapshots derived from the stable period of molecular dynamics simulation (Supplementary Table S5). The variants with a larger XP score difference between pNP-GlcNAc or pNP-GalNAc than WT should allow for better discrimination between both substrates than WT. They are denoted in bold in Table S5 (Phe, Gly, His, Asn). Based on this particular analysis, we can consider them as potential candidates for modulating the GalNAcase/GlcNAcase ratio. This extended the original BioLuminate-method list of prospective variants with Glu332Gly and Glu332Asn Tf Hex.
To complete the selection of candidate variants, we also included published mutant homologs in the analysis. For β-N-acetylhexosaminidase from Ostrinia furnacalis, the exchange of the residue corresponding to Glu332 for Ala and Gln was reported [50]. To compare the effects of these mutations on the substrate affinity in Tf Hex, we also included the Ala variant in the selection.
In summary, by combining (i) the results of the BioLuminate residue scanning showing altered free energy of binding; (ii) the analysis of the stability of the enzyme-substrate complex in molecular dynamics simulation; (iii) the binding scores for individual substrates in equilibrated complexes (Glide XP method); and (iv) information from the literature, a library of ten prospective variants of Tf Hex at the Glu332 hotspot (Ala, Asn, Gln, Gly, His, Lys, Phe, Thr, Trp, Tyr) was selected. These variants were prepared by site-directed mutagenesis and tested for their GalNAcase/GlcNAcase ratio.

Preparation of Tf Hex Mutants by Site-Directed Mutagenesis
Based on molecular modeling, ten prospective amino acid substitutions at the position of Glu332 were selected for modifying Tf Hex substrate specificity. Single mutant variants were prepared by PCR from the WT gene [39], as summarized in Supplementary Table S3. The codons present in the respective variants are shown in Table S4. Some mutant constructs were also isolated while probing the occurrence of mutant genes during site-saturation mutagenesis (cf. Supplementary Tables S1 and S4). Mutant pPICZαA-Tf Hex plasmids were linearized by restriction endonuclease SacI and electroporated into the competent cells of P. pastoris KM71H. After screening for growth under the selection pressure of zeocin, the production of variants was monitored by SDS-PAGE, and enzyme activity was determined with a standard activity assay. The clones exhibiting the most extreme ratio of GalNAcase/GlcNAcase activities were cryopreserved and aliquots stored at −80 • C.

Production and Characterization of Site-Directed Mutants of Tf Hex
The preparative production of Tf Hex variants was performed in minimal cultivation medium for easier purification. The variants of Tf Hex were purified to homogeneity in a single purification step by cation-exchange chromatography. Their purity was confirmed by SDS-PAGE (Supplementary Figure S5). The purification yield and the specific GalNAcase and GlcNAcase activities were determined; their ratio was evaluated. The obtained data were compared with the recombinant Tf Hex WT [23] and are summarized in Table 1. The specific activities show that the mutagenesis generally caused a significant decrease in the GalNAcase/GlcNAcase activity ratio in the tested variants, affording GlcNAcases (3-to 8-fold more selective compared to WT). The Glu332Trp and Glu332Phe Tf Hex variants were rather unselective. The only variant that showed a (slightly) higher GalNAcase affinity than WT was Glu332Tyr Tf Hex with a respective ratio of 1.4. Notably, in many prepared variants, the significant increase in selectivity (GalNAcase/GlcNAcase activity ratio decreased from 3-to 8-fold compared to WT) has not caused a considerable decline in specific activity as is otherwise usual in active site mutant variants, particularly those at the catalytic nucleophile [51]. Here, in many cases, we rather observed an increase in the enzyme specific activity compared with the WT. This was particularly striking for the Glu332Thr variant. Its specific (GlcNAcase) activity increased almost 4-fold compared with the WT. A higher specific GlcNAcase activity over WT was also observed in Ala, Asn, Gln, and Gly variants. The highest substrate selectivity was achieved by introducing histidine and lysine; however, in both cases, it was accompanied by a sharp decline in the specific activity. In the most selective variant Glu332His Tf Hex, the GalNAcase/GlcNAcase activity ratio decreased ca. 8-fold (ratio of 0.15), and in the lysine variant, ca. 7-fold (ratio 0.18) compared with the WT. The previously published Arg218Lys Tf Hex exhibited a GalNAcase/GlcNAcase ratio of 0.15 [32].

Kinetic Analysis and Transglycosylation Potential of the Best Mutants
To gather more information on the influence of the Glu332 mutation on the catalytic activity, we determined the kinetic parameters of outstanding GlcNAcase variants. In all the variants with increased specific GlcNAcase activity relative to WT (Ala, Asn, Gln, Gly, and Thr; Table 1), we observed a well-preserved value of the turnover number (k cat ) but a significantly higher K M (i.e., lower affinity to the pNP-GlcNAc substrate). As a result, the catalytic efficiency (k cat /K M ) in these variants decreased from 12-to 27-fold compared to WT Tf Hex [23], though it was still well comparable with other native GlcNAcases such as AvHex (Supplementary Table S6).
In the two most selective variants (His and Lys), this trend was even more pronounced. Due to a very high K M (>7 mM), the kinetic curves could not be measured to full saturation due to the limited water solubility of the pNP-GlcNAc substrate (Supplementary Figure S6). Therefore, for Glu332Lys Tf Hex, only estimates of kinetic parameters are presented [23]. Interestingly, Glu332Gln Tf Hex showed a unique catalytic behavior, which could not be fitted to the standard Michaelis-Menten kinetics (Supplementary Figure S6), and thus disabled a reliable extraction of kinetic parameters for this variant. We suspect that this phenomenon may be caused by changes in the hydrogen bonding network during substrate binding in the active site caused by the Gln332 substitution.
The transglycosylation ability of Tf Hex variants was tested in a self-condensation reaction with pNP-GlcNAc. This experiment showed the effect of the introduced mutation on the transglycosylation capability of the enzyme. Previously, the formation of longer pNP-functionalized chitooligomers was described in the reaction with Tf Hex transglycosidases [35] and with Arg218Lys Tf Hex [32]. We used the same reaction conditions as in our previous work [32] (100 mM pNP-GlcNAc; 1 U/mL enzyme) for the analytical reaction with outstanding Tf Hex variants and compared it with the respective data for Tf Hex WT ( Figure 3). Interestingly, in contrast to WT and also to the previously described Arg218Lys mutant variant with a comparable GalNAcase/GlcNAcase activity ratio, we found that Glu332His and Glu332Lys Tf Hex had a completely different transglycosylation behavior. Instead of creating a whole array of transglycosylation products (chitooligomers of varying lengths), these variants selectively formed only a small amount of pNP-(GlcNAc) 2 , behaving prevalently as hydrolases (lower transglycosylation degree than WT) (see also Supplementary Table S7). In contrast, vivid transglycosylation activity accompanied by the formation of longer chitooligosaccharides was observed in transglycosylation reactions catalyzed by the Ala, Asn, Gln, Gly, and Thr variants. Especially, Glu332Thr Tf Hex showed an almost 5-fold higher transglycosylation efficiency compared with WT (Supplementary Table S7). This is a considerable increase in the synthetic potential by introducing a single amino acid exchange.
small amount of pNP-(GlcNAc)2, behaving prevalently as hydrolases (lower transglycosylation degree than WT) (see also Supplementary Table S7). In contrast, vivid transglycosylation activity accompanied by the formation of longer chitooligosaccharides was observed in transglycosylation reactions catalyzed by the Ala, Asn, Gln, Gly, and Thr variants. Especially, Glu332Thr TfHex showed an almost 5-fold higher transglycosylation efficiency compared with WT (Supplementary Table S7). This is a considerable increase in the synthetic potential by introducing a single amino acid exchange.

Discussion
The present work raises the question of why only a limited number of variants with shifted GalNAcase/GlcNAcase ratio were identified by site-saturation mutagenesis of TfHex. With 1600 clones screened, over 10-fold the number theoretically required for 99% coverage [42,43], we should have assumed that all six mutations with the significant shift in the activity ratio identified by site-directed mutagenesis would have been detected in the screening after site-saturation mutagenesis. However, this was not the case. Most likely, the ratio and representation of mutant genes in the mixture of plasmids were not completely balanced after site-saturation mutagenesis, and some mutant genes were generated, incorporated into the host chromosome, and expressed preferentially over others. This result indicates that the applicability of Equation (1) by Reetz et al. [42,43]

Discussion
The present work raises the question of why only a limited number of variants with shifted GalNAcase/GlcNAcase ratio were identified by site-saturation mutagenesis of Tf Hex. With 1600 clones screened, over 10-fold the number theoretically required for 99% coverage [42,43], we should have assumed that all six mutations with the significant shift in the activity ratio identified by site-directed mutagenesis would have been detected in the screening after site-saturation mutagenesis. However, this was not the case. Most likely, the ratio and representation of mutant genes in the mixture of plasmids were not completely balanced after site-saturation mutagenesis, and some mutant genes were generated, incorporated into the host chromosome, and expressed preferentially over others. This result indicates that the applicability of Equation (1) by Reetz et al. [42,43] may have practical limitations. Nevertheless, since the overall outcome of this study may lean on the results of both site-directed and site-saturation approaches, we may reasonably suppose that we were able to identify all the mutants at the Glu332 position with a significantly shifted GalNAcase/GlcNAcase ratio.
Another factor that may have contributed to false-positive or false-negative hits in the site-saturation mutagenesis screening is the fact that individual transformants of the same gene in P. pastoris may exhibit slightly different catalytic properties (in terms of both specific activity (U/mg) and GalNAcase/GlcNAcase activity ratio); although, with one particular transformant, these parameters are well reproducible in different enzyme productions. Apparently, the activity ratio in β-N-acetylhexosaminidases does not solely result from the primary enzyme structure, but other factors must also be considered-besides the optimum folding of the catalytically active tetramer and the glycosylation issue mentioned above, it is also the number of copies of the same gene transformed into the Pichia genome and possibly other factors yet undisclosed. From our experience, the catalytic parameters of β-N-acetylhexosaminidases produced in P. pastoris also slightly vary according to the type of medium and carbon source. Variations between specific activities and yields of different transformants of the same gene (as seen in the Glu332His mutant and also partially in the Glu332Gly mutant prepared by site-directed and site-saturation mutagenesis, cf. Table 1 and Supplementary Table S2) may supposedly be contributed, e.g., to an incomplete folding into an active tetramer comprising two catalytic propeptides. This issue deserves a deeper analysis in the future. The quite heavy workload of screening hundreds of variants further favors the site-directed mutagenesis approach, especially when aiming at single-mutant variants.
In the retrospective analysis of the correlation between site-directed mutagenesis data and in silico predictions, we found that the substrate preference was adequately predicted with the exception of the Glu332Phe variant, which was expected to have higher selectivity based on BioLuminate mutation scan data. However, the values of the calculated differences in the ∆∆G values by mutant scan or ∆XP score for various substrates were poorly correlated with the experimental GlcNAcase/GalNAcase ratio (R 2 lower than 0.1). The lower predictability of the experimental results by molecular modeling reflects the fact that, besides the initial binding affinity, the processing of the substrate also depends on its ability to reach active conformation and proper orientation in the active site. Therefore, not only the binding score (reflecting the substrate affinity) but also the distances to the catalytic residues, the stability of the interactions, and the transition state may play an important role. The main challenge and bottleneck remain in how to weigh the importance of each of these parameters. Moreover, we are still working only with a homology model since Tf Hex has never been crystallized, despite many years of effort.
The calculation of energy values for each hydrolysis step, including energy barriers, is possible by the combined QM/MM method. However, structures with a good resolution are required for a precise prediction. Calculation of such parameters for a large number of variants is rather time-consuming. Another complication in predicting some variants is the correct prediction of rotamers [52]; the mutation scan may not fully reflect the enzyme flexibility, especially when the mutation is introduced in the loop region, as in this case. Molecular dynamics simulation helps to sample all possible conformations and optimize the variants, but the scoring problem remains. Despite all these bottlenecks, in this study, the site-directed mutagenesis approach based on in silico modeling provided more information on the relationship between the Glu332 substitution and the GalNAcase/GlcNAcase activity ratio in Tf Hex than site-saturation mutagenesis.

General
pNP-GlcNAc and pNP-GalNAc, standard substrates for β-N-acetylhexosaminidases [32], were obtained from Gold Biotechnology, USA. If not stated otherwise, all other chemicals were from VWR Chemicals or Lach-Ner (Czech Republic).

Preparation and Cloning of Mutant Genes
The pPICZαA expression vector carrying the gene of Tf Hex WT (GenBank ID: JN601495) and the gene for zeocin resistance were prepared as described earlier [39]. The Tf hex gene was cloned downstream of the α-factor-encoding the DNA segment for extracellular protein targeting using EcoRI and KpnI restriction sites as described previously [39].
Site-saturation mutagenesis and site-directed mutagenesis were performed using the QuikChange Site-Directed Mutagenesis Kit (Agilent, Santa Clara, CA, USA) and the T-Personal Thermal Cycler (Biometra, Göttingen, Germany) with the following cycling parameters: initial denaturation at 95 • C for 2 min, followed by 18 cycles of denaturation at 95 • C for 20 s, primer annealing for 1 min, elongation at 68 • C for 6 min, and final elongation at 68 • C for 10 min. For each primer pair, the annealing temperature varied, see Supplementary Tables S1 and S3. For site-saturation mutagenesis, four pairs of degen-erate primers (Supplementary Table S1) were used to minimize the number of colonies required for screening and also to avoid the GAG codon present in the WT sequence. After amplification, 2 µL of DpnI were added, and the reaction mixture was incubated at 37 • C and 300 rpm for 5 min to cleave the methylated DNA template. The amplified product or product mixture (in the case of site-directed or site-saturation mutagenesis, respectively) was then transformed into E. coli Top10 competent cells (Thermo Fisher Scientific, Waltham, MA, USA) by heat shock under the selection pressure of zeocin (100 µg/mL).
In the case of site-directed mutagenesis, the respective mutant plasmid verified by sequencing was amplified, isolated using the High Pure Plasmid Midi Isolation Kit (Roche, Basel, Switzerland), and used for the transformation of Pichia pastoris as described further. In the case of site-saturation mutagenesis, all transformants on a plate were pooled and cultivated as a single culture for plasmid isolation. The prepared mixture of mutant plasmids was subsequently used for the transformation of P. pastoris.

Transformation of P. pastoris Host with Mutant Genes
The single-mutant pPICZαA-Tf Hex plasmid or the mixture of yeast expression vectors pPICZαA carrying various mutant genes (30 µg) were linearized using SacI (New England Biolabs, Ipswich, MA, USA). The P. pastoris KM71H cells (Invitrogen, Life Technologies, Waltham, MA, USA) were transformed with the plasmids by electroporation according to the EasySelect Pichia Expression Kit standard protocol (Invitrogen, Life Technologies, Waltham, MA, USA). Protein expression data and enzyme activities were determined on a small-scale (site-directed mutants) or in 96-deep-well-plate production systems (saturated mutants).

Deep-Well Screening for Heterologous Expression of β-N-Acetylhexosaminidase Variants in P. pastoris
The 96-deep-well-plate screening cultivation method was used to identify the prospective mutants of Tf Hex prepared by site-saturation mutagenesis. The number of colonies needed for the screening was originally calculated according to Reetz et al. [42,43] using Equation (1): where T is the required number of transformants, V denotes the number of required codon combinations (32 in our case), and P i is the probability with which the sequence occurs in the library. For our library, the number of transformants was found to be ca. 150 with 99% probability. Finally, 1600 variants were screened to maximize the coverage. The screening was optimized based on the procedure by Weis and coworkers [44]. Single colonies were inoculated into the wells containing 250 µL of BMD medium (10 g/L glucose, 200 mM K-P buffer, 13.4 g/L yeast nitrogen base, 0.4 mg/L biotin). The cultures were grown under standard conditions (28 • C, 320 rpm, 80% relative humidity) for 60 h. Then, protein expression was induced by adding 250 µL of BMM2 medium (1% methanol, 200 mM K-P buffer, 13.4 g/L yeast nitrogen base, 0.4 mg/L biotin) and incubating under the same conditions for 8 h. The cultures were then fed with 50 µL BMM10 medium (5% methanol, 200 mM K-P buffer, 13.4 g/L yeast nitrogen base, 0.4 mg/L biotin) at time points of 68, 80, and 92 h after inoculation. Then, the cultures were centrifuged and the supernatants were tested for β-N-acetylhexosaminidase (GlcNAcase, GalNAcase) activity. Colonies producing enzyme variants with a significantly changed activity were cryopreserved at −80 • C in 15% (v/v) glycerol for long-term storage.

Heterologous Expression of β-N-Acetylhexosaminidase Variants in P. pastoris, Screening and Purification
The screening of the expression of site-directed Tf Hex mutant variants was performed according to the manufacturer's instructions (EasySelect Pichia Expression Kit; Invitrogen, Life Technologies, USA). For each variant, at least 16 transformants were screened. For the small-scale production of P. pastoris, a combination of BMGY medium (Buffered Glycerol Complex Medium) and BMMY medium (Buffered Methanol Complex Medium) was used. Sixteen selected colonies were inoculated into BMGY medium (100 mL) and incubated at 28 • C and 220 rpm overnight. Grown cultures were centrifuged (5000 rpm, 10 min, 12 • C), and the pellets were resuspended in BMMY medium (30 mL), where the recombinant protein expression was induced by methanol (0.5% v/v). The cultures were incubated with shaking (28 • C, 220 rpm) for three days, and methanol was added every 24 h. On day five after inoculation, the cultures were tested for the presence of Tf Hex using SDS-PAGE and enzyme activity assay (see below). Two colonies for each variant were cryopreserved at −80 • C in 15% (v/v) glycerol for long-term storage.
For purification purposes, BMMY medium was exchanged for BMMH (Buffered Methanol Minimal Medium). On day five, the cultures were centrifuged (5000 rpm, 10 min, 12 • C), and the enzyme contained in the supernatant was purified to homogeneity by cation exchange chromatography (Fractogel EMD SO 3− , Merck, Darmstadt, Germany) using an Äkta Protein Purifier system (Amersham Biosciences, Uppsala, Sweden). Equilibration of the column was performed with 10 mM sodium citrate-phosphate buffer pH 3.5, and the enzyme was eluted with a linear gradient of 0-1 M sodium chloride (60 mL). The purified enzyme was concentrated by ultrafiltration in 50 mM sodium citrate-phosphate buffer pH 5.0, and its purity was verified by SDS-PAGE.

Isolation of Chromosomal DNA
Cryopreserved stocks of P. pastoris containing a mutant Tf Hex gene (50 µL) were transferred to YPD medium (50 mL, 10 mg/L yeast extract, 20 g/L peptone, 20 mg/L D-glucose), and the culture was incubated overnight (28 • C, 220 rpm). When the cultures reached an OD 600 of 20-30, 4 mL of the culture were centrifuged, and the pellet was resuspended in 50 mM sodium phosphate buffer pH 7.5 containing 10 mM 2-mercaptoethanol. The disruption of the yeast cell wall was promoted by the addition of zymolyase (100 U). Further procedures were performed according to DNeasy UltraClean Microbial Kit (Qiagen; Germany).

Amplification and Ligation of Mutant Genes into the Plasmid for Sequencing
The isolated Tf Hex gene was amplified using DreamTaq DNA polymerase (Thermo Fisher Scientific, USA) in a T-Personal Thermal Cycler (Biometra, Germany). The P. pastoris chromosomal DNA carrying the Tf Hex gene (200 ng) and two pairs of primers were used for PCR amplification. Primer pair 1: Fw: 5'-gctgttttgccattttccaacagcacaaataacggg-3', Re: 5'-tggtcgacggcgctattcagatcctcttctg-3'; primer pair 2: Fw: 5'-ggggatttcgatgttgctgttttgccattttccaac-3', Re: 5'-tgatgatggtcgacggcgctattcagatcctc-3' (Generi Biotech, Hradec Králové, Czech Republic). PCR was run under the following conditions: initial denaturation at 95 • C for 1 min, followed by 35 cycles of denaturation at 95 • C for 30 s, primer annealing at 66 • C for 30 s, elongation at 72 • C for 1 min, and final elongation at 72 • C for 10 min. The PCR product was purified with a GeneJET PCR Purification Kit (Thermo Fisher Scientific, USA). Then, A-overhangs were added to the PCR product using dATP (0.5 mM) and DreamTaq DNA polymerase (Thermo Fisher Scientific, USA). The reaction mixtures were incubated at 70 • C for 15 min and precipitated with sodium acetate and ethanol. The PCR products were ligated into the pGEM-TEasy vector system (Promega, Madison, WI, USA) according to the manufacturer s protocol and sequenced. At least 5 clones of each variant were confirmed by sequencing to contain the same point mutation. In the case of the combined mutant variant Glu332Trp/Glu332Arg Tf Hex (8A11), the ratio of both detected amino acids (in 16 clones) was ca. 3:2.

Protein Characterization
The protein concentration was determined by the Bradford method [53] using Protein Assay Dye Reagent Concentrate (Bio-Rad, Hercules, CA, USA). Calibration was performed for bovine plasma γ-globulin (Bio-Rad, UK).

Enzyme Activity Assay
The hydrolytic activity of β-N-acetylhexosaminidases was determined in an end-point assay with spectrophotometric detection. p-Nitrophenyl 2-acetamido-2-deoxy-β-D-glucopyranoside (pNP-GlcNAc) and p-nitrophenyl 2-acetamido-2-deoxy-β-D-galactopyranoside (pNP-GalNAc) were used as substrates for determining the GlcNAcase and GalNAcase activity of the enzyme, respectively. The reaction mixture consisted of McIlvaine buffer (50 mM citric acid/100 mM Na 2 HPO 4 , pH 5.0) and 2 mM pNP-GlcNAc or pNP-GalNAc in a microtube or the well of a microtiter plate. Assay reaction mixtures were pre-incubated at 35 • C, and the reaction was started by adding 10 µL of the appropriately diluted enzyme to avoid exceeding 10% reaction conversion. The reaction ran for 10 min at 850 rpm and was stopped by adding 1 mL of 0.1 M sodium carbonate to the tube or 0.1 mL to the microtiter plate well. The free p-nitrophenolate released from the enzymatic reaction was detected by a spectrophotometer at 420 nm. One unit of enzymatic activity is the amount of enzyme that hydrolyzes 1 µmol of the respective substrate per min under the standard assay conditions; all activity determinations were performed at least in triplicate.
Michaelis-Menten parameters of pNP-GlcNAc hydrolysis by Glu332Ala, Glu332Asn, Glu332Gln, Glu332Gly, Glu332His, Gly332Lys, or Gly332Thr Tf Hex were determined using a discontinuous kinetic assay. The reaction mixture (400 µL) containing pNP-GlcNAc, McIlvaine buffer pH 5.0, and the appropriately diluted enzyme was incubated at 25 • C and 850 rpm for 14 min. Aliquots (50 µL) were taken every 2 min, and the reaction was stopped by adding 1 M sodium carbonate (100 µL) to the microplate well. The released p-nitrophenolate was determined using a Sunrise Microplate Reader (Tecan, Mennedorf, Switzerland). The kinetic parameters were evaluated by non-linear regression using Graph-Pad Prism 7 (GraphPad Software, San Diego, CA, USA). In acidophilic glycosidases such as Tf Hex, continuous determination of kinetic parameters is not suitable due to a significant overlap of absorption spectra of pNP-GlcNAc and p-nitrophenol, which results in a poor sensitivity of the measurement. The reaction progress was monitored by TLC (propan-2-ol/water/NH 4 OH aq., 7/2/1; visualization by UV and carbonization in 5% H 2 SO 4 in ethanol) and by HPLC. The reaction was stopped after 5 h by boiling for 2 min. In every analytical reaction, the amount of enzyme was identical (1 U/mL).
HPLC analysis was used to monitor the course of transglycosylation reactions and to quantify the transglycosylation products. The Shimadzu Prominence LC analytical system comprised a Shimadzu CBM-20A system controller, a Shimadzu LC-20AD binary HPLC pump, a Shimadzu CTO-10AS column oven, a Shimadzu SIL-20ACHT cooling autosampler, and a Shimadzu SPD-20MA diode array detector (Shimadzu, Kyoto, Japan). Analyses were performed on a TSK gel Amide-80 column (250 × 4.6 mm, 5 µm) preceded by a TSKgel Amide-80 Guardgel

Molecular Modeling, Docking, and Molecular Dynamics
For the modeling and docking of WT and mutant Tf Hex, we used the homology model of Tf Hex built as described previously [54]. Multiple sequence alignment to determine and compare mutation hotspots was performed by the Mustang algorithm [55] in YASARA [56]. Initial predictions of the effects of mutations were performed by residue scan and mutation in the BioLuminate package of Schrödinger software [49,57]. It uses the MM-GBSA (molecular mechanics with generalized Born and surface area solvation) method [49] for calculating the difference in the free energy of binding for WT and mutant variants. The atoms were parameterized in the OPLS2005 force field and minimized in the implicit VSGB2.0 solvent model.
The models of selected Tf Hex variants were then built in YASARA [56] based on the model of WT in complex with docked pNP-GlcNAc and pNP-GalNAc constructed previously [32]. The best rotamers were selected based on the combination of the SCWALL algorithm [56,58] followed by energy minimization of the mutated residue in vacuum [56] using the YASARA standard protocol.
Modeled structures were placed in the periodic boundary conditions (the box border is extended by 0.1 nm from the system in each direction), and explicit TIP3P water with neutralizing Na + ions was added in YASARA. For the simulation of substrates, we used the GLYCAM force field [59]. The parameters for substrates were assigned from the GLYCAM force field [59]; the parameters for the pNP group were automatically added using the AutoSMILE approach [60]; the parameters for the protein, water, and ions were from YASARA2 force field [61]. The protonation state of the residues was assigned for pH 6.0, and the catalytic Glu371 was modeled in the protonated state. The modeled complexes were minimized in YASARA according to the standard protocol and used for further molecular dynamics simulation in the NPT (constant number of particles N, pressure P, and temperature T, adjusted by Berendsen thermostat) ensemble [62]. The results of the molecular dynamics simulation were analyzed by YASARA tools [63] and visualized by Xmgrace [64].
The equilibrated systems for calculating the Glide XP binding score were selected from molecular dynamics simulations based on the stable RMSD values and a short distance from the catalytic Glu371 residue, as explained previously [32].

Conclusions
By employing site-directed mutagenesis based on rational design, we performed a detailed study on the influence of a point mutation of the Glu332 residue in the active site of Tf Hex, a promiscuous enzyme, on its substrate specificity. Two variants, Glu332His and Glu332Lys Tf Hex, were the most selective GlcNAcases with an 8-fold and 7-fold higher activity, respectively, towards pNP-GlcNAc over pNP-GalNAc compared with WT. Their synthetic potential was tested in a transglycosylation reaction to determine that they both behaved more like hydrolases, forming fewer transglycosylation products than WT. This property may be used for selective cleavage of GlcNAc-terminated oligosaccharides from reaction mixtures. Five other variants containing Ala, Asn, Gln, Gly, or Thr at position 332 combined a higher specific GlcNAcase activity than WT with a high transglycosylation capability, producing almost five times more transglycosylation products than WT.
Based on this and previous work [32], we conclude that the task of shifting the GalNAcase/GlcNAcase activity ratio in transglycosylating β-N-acetylhexosaminidases is extremely challenging and that it might even be generally beyond reach to acquire a very high substrate selectivity (>100-fold) in these enzymes (as also apparent in the existing WT enzymes of this class [32]. In this study, site-saturation mutagenesis followed by high-throughput screening gave less information on the structure-activity relationship than rational site-directed mutagenesis. This is also caused by the strong dependence of catalytic properties of β-N-acetylhexosaminidases on the enzyme folding, glycosylation pattern, cultivation medium, and higher complexity of the expression system of P. pastoris, which is, however, necessary for correct folding of propeptide-carrying tetramers of fungal β-N-acetylhexosaminidases such as Tf Hex.