Change of the Product Specificity of a Cyclodextrin Glucanotransferase by Semi-Rational Mutagenesis to Synthesize Large-Ring Cyclodextrins

Cyclodextrin glucanotransferases (CGTases) convert starch to cyclodextrins (CD) of various sizes. To engineer a CGTase for the synthesis of large-ring CD composed of 9 to 12 glucose units, a loop structure of the protein involved in substrate binding was targeted for semi-rational mutagenesis. Based on multiple protein alignments and protein structure information, a mutagenic megaprimer was designed to encode a partial randomization of eight amino acid residues within the loop region. The library obtained encoding amino acid sequences occurring in wild type CGTases in combination with a screening procedure yielded sequences displaying a changed CD product specificity. As a result, variants of the CGTase from the alkaliphilic Bacillus sp. G825-6 synthesizing mainly CD9 to CD12 could be obtained. When the mutagenesis experiment was performed with the CGTase G825-6 variant Y183R, the same loop alterations that increased the total CD synthesis activity resulted in lower activities of the variant enzymes created. In the presence of the amino acid residue R183, the synthesis of CD8 was suppressed and larger CD were obtained as the main products. The alterations not only affected the product specificity, but also influenced the thermal stability of some of the CGTase variants indicating the importance of the loop structure for the stability of the CGTase.


Introduction
Evolutionary methods can be applied to change amino acid residues of proteins to alter their properties as desired [1]. Iterative rounds of diversification and selection thereby mimic an evolutionary process in vitro [2]. Such directed evolution approaches on a molecular level are in contrast to site-directed mutagenesis techniques, where rational considerations lead to specific and targeted alterations of the target protein [3]. A combination of both approaches can be described as semi-rational protein engineering, where the size of the library of the created protein variants is reduced and its fitness is increased [4]. Libraries with higher fitness can be constructed using mutagenic primers [5][6][7]. Their design requires information obtained from multiple sequence alignments, consensus sequences, and from the structure and function of the protein [5][6][7]. By selecting distinct amino acid residues instead of total randomization, which would lead to an unmanageable number of combinations, a drastic decrease of the size of the variant library can be achieved, allowing the alteration of further residues of the target protein without increasing its size [8].
Cyclodextrin glucanotransferases (CGTases), a part of the GH13 α-amylase superfamily, are starch-degrading enzymes produced by many bacteria [9,10]. They catalyze an intramolecular transglycosylation reaction resulting in the formation of cyclic α-1,4-linked glucans, cyclodextrins (CD) [11]. Due to their ability to form reversible complexes with guest molecules, thereby increasing their solubility and stability, CD6 to CD8 composed of 6 to 8 glucose units have found many industrial applications [12]. In contrast, limited information is available on the potential of larger ring CD as host compounds [13][14][15]. In particular, CD9 to CD12 could be of interest as hosts for the formation of complexes with bulky guest compounds or with pharmaceuticals not able to form complexes with CD6 to CD8 [16][17][18]. Although a mixture of CD with a degree of polymerization of 6 up to more than 60 is initially synthesized by the CGTase, the larger CD are rapidly reused as substrates in further intermolecular transglycosylation reactions [19]. Therefore, mostly CD6, CD7 and some CD8 with only small amounts of large-ring CD of more than eight glucose units are obtained as products of a reaction of CGTases with starch [20,21]. While efforts to engineer CGTases to increase the yield of CD6 to CD8 have been published previously [22], attempts to increase the yield of larger CD by site-directed mutagenesis and domain shuffling of CGTases have been reported only recently [23,24].
In this paper, we engineered a CGTase to synthesize large-ring CD in high amounts. We used a semi-rational mutagenesis approach to generate a diverse library of enzyme variants with high fitness. Based on multiple sequence alignments and protein structure information, a loop region of the enzyme was selected for mutagenesis using a partially randomized megaprimer. A library adjusted to reduce sequence space and increase its fitness was designed using degenerated codons, followed by fitness screening of the obtained recombinant E. coli clones.

Construction of Vector Libraries
A low-conserved loop region of the CGTase G825-6 (amino acid residues 81-89) representing the binding site for the glucan substrate at subsite −3 (three glucose residues downstream from the glucan chain cleavage site towards the non-reducing end) was selected as the target for mutagenesis. Based on a multiple sequence alignment of 31 CGTases (Table S1), degenerated codons were used to design a DNA library encoding a partially randomized loop sequence (Figure S1, Figure 1a). The mutagenic megaprimer with conserved flanking regions was amplified and used in a further PCR with CGTase G825-6 and the variant Y183R as templates. The obtained vector library displayed the desired randomization and was subsequently used for expression and activity screening in E. coli BL21(DE3) as host (Figure 1a). The constructs were designated Loop-3 (L3) with the CGTase G825-6 as a template and Loop-3 Y183R (L3YR) with the corresponding CGTase encoding the mutation Y183R as a template.

Agar Plate Screening of the CGTase Variants
About 1350 E. coli BL21 clones of the L3 and 1550 clones of the L3YR experiment were screened on agar plates for the specific detection of CD7, CD8, and starch-degrading activity (Figure S2). The majority of the L3 clones (>80%) synthesized CD7 and CD8, and about 10% of them did so in larger amounts than the wild type (WT) CGTase, as indicated by the formation of larger halos around the colonies. Twenty-two clones from the L3 and eight clones from the L3YR experiments were selected and further characterized by isolating and sequencing the vector DNA followed by the recombinant expression and purification of the CGTase variants. SDS-PAGE analysis confirmed the high purity of the CGTase (Figure S3).

Comparison of the CD Synthesis by the CGTase Variants
To compare the synthesis of CD by the created L3 variants, their cyclic glucan products up to a size of CD12 were analyzed after 1 h and 24 h of reaction at 40 °C and 50 °C (Figure 2). At a reaction time of 24 h, the WT CGTase produced the maximum yield of the larger CD. After 1 h of reaction at 50 °C, 20 of the 22 L3 variants showed higher yields of total CD (CD7 to CD12) compared to the WT CGTase. All variants except L3-8 also synthesized larger amounts of CD9 to CD12 (Figure 2a). After 24 h of reaction at 50 °C, nine variants (L3-1 to L3-9) produced between 6.9 and 11.5 mg CD7 to CD12, a higher yield than the WT enzyme (6.4 mg total CD), corresponding to a substrate conversion of up to 58% (Figure 2b). While CD9 to CD12 made up 22% of the total CD synthesized by the WT enzyme, the variants L3-1, L3-3, L3-5 and L3-9 synthesized 38%, 43%, 47% and 51% CD9 to CD12 at 50 °C, respectively. The loop randomization step was repeated with the CGTase G825-6 variant Y183R, which synthesized CD8 as the smallest CD. The CD products were analyzed after 1 h (Figure 2c) and 24 h (Figure 2d) of reaction. The L3YR variants synthesized between 76% and 97% CD9 to CD12 after 1 h of reaction. CD9 to CD12 made up 58% of its products after 24 h of reaction (Figure 2d). In contrast, the variants L3YR-6, L3YR-7 and L3YR-8 synthesized between 91% and 94% CD9 to CD12 under the same reaction conditions, however with a concomitant very low total CD yield of 0.7 to 1.5 mg. At a reaction temperature of 40 °C, these three L3YR variants showed a total CD yield similar to that of Y183R after 1 h of reaction, but with a higher proportion of CD9 to CD12 (Figure 2e). After 24 h of reaction, the total amount of CD produced by L3YR-8 at 40 °C was three times higher than in a synthesis performed at 50 °C, indicating a destabilizing effect of the alterations at the loop structure of the enzyme (Figure 2f).
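The substrate conversion quoted above can be related to the reaction setup given in the Methods (20 g L⁻¹ soluble starch in a total volume of 1 mL, i.e., 20 mg starch per reaction). The following is a minimal Python sketch of that arithmetic, assuming that conversion is simply the CD mass divided by the starch mass supplied; the function name is illustrative and not part of the original work.

```python
def substrate_conversion_percent(cd_mg, starch_g_per_l=20.0, volume_ml=1.0):
    """Percentage of the supplied starch recovered as CD (mass basis)."""
    starch_mg = starch_g_per_l * volume_ml  # g/L multiplied by mL gives mg
    return 100.0 * cd_mg / starch_mg


# Yields after 24 h at 50 °C as reported in the text
best_variant = substrate_conversion_percent(11.5)  # highest L3 variant yield
wild_type = substrate_conversion_percent(6.4)      # WT enzyme

print(f"best variant: {best_variant:.1f}%, WT: {wild_type:.1f}%")
```

Under this assumption, 11.5 mg CD corresponds to 57.5% conversion (reported as up to 58%), and the WT yield of 6.4 mg to 32%.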

Discussion
Previously, we engineered the CGTase G825-6 variant Y183R, which produced a high yield of large-ring CD without concomitant formation of CD6 and CD7 [23]. However, its total CD synthesis activity was strongly decreased. To reconstitute its activity, we aimed to modify the enzyme near its substrate binding site by a semi-rational mutagenesis approach. We focused on a loop structure of the protein involved in substrate binding composed of the amino acid residues 81−89 (Figure 4). This region has been previously suggested to control the CD product specificity of CGTases [25,26]. Indeed, the loop structure of CGTases mainly synthesizing CD6 or CD7 (Figure 4a) was different from that of CGTases with CD8 as the main product (Figure 4b,c). Since the residue F88 in the CGTase G825-6 plays an important role in its product specificity for CD8, it was kept constant while the surrounding residues were targeted for mutagenesis (Figure S6).

High Fitness of the CGTase Variant Library
Based on the multiple sequence alignments performed, the limited selection of residues used for the design of the variant library reduced the number of possibly translated protein sequences by a factor of ~6 × 10⁵ to 0.9 million sequences. The resulting library not only encoded the majority of amino acid residues occurring in WT CGTases, but also included a set of residues not found in CGTases within the corresponding loop structure (Figure S1, Figure 1a). The residues D, E, K and R were encoded to allow for the formation of salt bridges with other loop regions to stabilize the binding pocket, since previous studies have shown that a crosslinking of CGTases increases their catalytic efficiency [27][28][29]. The residues D, E and P were encoded to destabilize a β-sheet formation of the loop (Figure 5) with the aim to increase its flexibility [30,31]. Since stabilizing the loop or making it more flexible was expected to influence the CD product specificity, the library was designed to allow both possibilities. Instead of a classical directed evolution approach performing small adaptive walks in an imaginary fitness landscape [32], the library design set a high mutation frequency within a narrow sequence window to cover a large sequence space simultaneously. By this strategy, a large set of combinations could be tested for a small sequence element like a loop, while keeping the remaining structure intact. This allowed screening for combinatorial effects that had a beneficial outcome regarding the catalytic activity or product specificity of the CGTase.
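The stated reduction factor can be checked with simple arithmetic. The sketch below assumes that total randomization means all 20 amino acids at each of the nine loop positions 81 to 89; this reading is an assumption and is not stated explicitly in the text.

```python
# Assumption: full randomization = 20 amino acids at each of the
# nine loop positions (81-89). This is not stated explicitly in the text.
AMINO_ACIDS = 20
LOOP_POSITIONS = 9

full_space = AMINO_ACIDS ** LOOP_POSITIONS  # all possible loop sequences
library_size = 0.9e6                        # library size given in the text
reduction_factor = full_space / library_size

print(f"full randomization: {full_space:.2e} sequences")
print(f"reduction factor:   {reduction_factor:.1e}")
```

Under this assumption, the full sequence space comprises 5.12 × 10¹¹ loop sequences, and the reduction factor of about 5.7 × 10⁵ is consistent with the rounded value given above.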
In comparison to other loop saturation experiments [33,34], the high fitness of the L3 semi-randomized library can be attributed to several factors, which ensured loop compositions with a high proportion of active variants: (i) the sufficient distance of the mutagenesis target site to avoid a direct interference with the catalytic triad, (ii) the choice of encoded residues deduced from multiple sequence alignments and (iii) the low conservation of this loop. Furthermore, the introduction of ionic residues should promote synergistic effects by electrostatic interactions between different residues of the loop [35]. Three of the randomized codons encoded amino acid residues with an equal distribution (YHT, NMY, NCN), while for the other randomized codons the ratio of the most frequent to the rarest amino acid encoded was 2:1, except for VYN (4:1). Therefore, only a minor encoded bias occurred in the library.
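The codon bias described above can be illustrated by expanding a degenerate codon into its concrete codons and counting the encoded residues. The following Python sketch assumes the standard genetic code and the IUPAC nucleotide ambiguity codes; the function names are illustrative and do not refer to software used in this study.

```python
from collections import Counter
from itertools import product

# IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons ordered T, C, A, G at each position
BASES = "TCAG"
RESIDUES = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), RESIDUES)
}


def residue_counts(degenerate_codon):
    """Count how many concrete codons encode each residue."""
    codon = degenerate_codon.upper()
    expanded = product(*(IUPAC[base] for base in codon))
    return Counter(CODON_TABLE["".join(c)] for c in expanded)


def bias_ratio(degenerate_codon):
    """Ratio of the most frequent to the rarest encoded residue."""
    counts = residue_counts(degenerate_codon)
    return max(counts.values()) / min(counts.values())


for codon in ("YHT", "NCN", "RRN", "VYN"):
    counts = dict(sorted(residue_counts(codon).items()))
    print(codon, counts, f"ratio {bias_ratio(codon):.0f}:1")
```

With the standard genetic code, this gives ratios of 1:1 for YHT and NCN, 2:1 for RRN, and 4:1 for VYN, where methionine (a single codon, ATG) is the rarest residue.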

Selection of CGTase Variants with Changed CD Product Specificity
By screening for either CD7 or CD8 and for starch-degrading activity in a semi-quantitative agar plate assay, CGTase variants synthesizing CD7 or CD8 in amounts relative to their starch-degrading activity could be selected in parallel (Figure S2). Despite coverage of less than 0.1% of the L3 library, this screening method resulted in the identification of variants with a changed CD product specificity, demonstrating the validity of the library design. The L3 variants showed a distribution of amino acid residues similar to the encoded library, indicating that certain residues were not preferred among the 22 L3 variants (Figure 1a,b). In the L3YR clones (n = 8), residues H84 and G87 occurred in 7 out of 8 protein sequences of the variants, indicating their importance for the synthesis of CD in the presence of the substitution Y183R. Interestingly, L3YR-1, L3YR-2 and L3YR-3 had the sequence 83-LHPXG-87 with X = E or G, similar to the CGTase G825-6 and a CGTase of Bacillus clarkii. In fact, the sequences of the variants L3YR-1 and L3-8 were identical to the CGTase from B. clarkii (Figure S1). The sequence 81-YALHP of variant L3YR-2 was also found in the protein sequences of other WT CGTases (Figure S1). This suggests that the naturally evolved loop sequence efficiently compensated alterations of the centrally located Y183 residue, resulting in active enzyme variants. While the number of investigated variants was too small to draw definitive conclusions, we consider these sequences as interesting targets for further site-directed mutagenesis experiments.

Variants Synthesizing High Yields of Large-Ring CD
The incorporated mutations resulted in a set of variants that produced high proportions of CD9 to CD12 (Figure 2). The variant L3-8, carrying the B. clarkii CGTase/G825-6 CGTase loop consensus sequence 83-LHP-85, showed the same composition of its CD products as the WT enzyme, indicating that this sequence could be involved in the suppression of the synthesis of larger CD. L3YR variants with the composition 83-LHPGG or LHPEG, corresponding to the sequences of the B. clarkii CGTase/G825-6, synthesized a proportion of large-ring CD similar to that of Y183R. In contrast, L3YR-6, L3YR-7, L3YR-8 and L3-2M showed a strongly decreased synthesis of CD8. Whether these changes were caused by the alteration of the sequence 83-LHP, the position 86 or by a combination of both remains ambiguous. The CD yield of these variants increased at lower reaction temperatures, indicating that they were less thermostable compared to the WT CGTase.
From 18 possible combinations of FA, FD and YA of the loop amino acid residues 81 and 82, 16 were present in the 30 variant sequences and represent combinations also found in WT CGTases (Table S1, Figure S1). When the altered sequence of variant L3-2 was cloned into Y183R, the resulting variant L3-2M showed properties similar to those of L3YR-6, L3YR-7 and L3YR-8 (Figure 2d). The replacement of Y at position 183 with R resulted in the synthesis of almost only large-ring CD composed of CD9 to CD12, however concomitant with a decreased overall CD yield.
The most efficient variants from the L3 experiment displayed E, R or K at position 87, amino acids capable of electrostatic interactions with D42 and K45 of the nearby loop. Variant L3-3 with K85 and E87 possibly formed salt bridges between loop 81-89, loop 39-48 and loop 136-142 (Figure 5). All three loops were positioned near the binding site and have been reported to contribute to the activity and CD product specificity of the CGTase [11,36,37]. Accordingly, our results also indicate that a stabilization of the loop resulted in an increased synthesis of large-ring CD. However, less efficient L3 variants likewise had E, D and K at position 87, indicating that surrounding residues also contribute to the observed changes in CD synthesis and enzyme stability.

The L3 Loop Contributes to the Thermal Stability of the CGTase
Several L3YR variants showed high CD synthesis activities within 1 h of reaction, which were significantly lower after 24 h, suggesting a thermal inactivation of the enzyme during longer reaction times due to the altered loop (Figure S4). A determination of the thermal stability and optimum temperature of the variant enzymes confirmed this assumption. The variants L3YR-6, L3YR-7 and L3YR-8 showed a lower Tm and optimum temperature compared to the WT enzyme (Figure 3). When the CD synthesis reaction with these variants was performed at 40 °C, these L3YR variants indeed showed a 1.8 to 2.9-fold higher yield of total CD compared to 50 °C (Figure 2e,f).

Generation of the Vector Library
A CGTase expression system in E. coli based on a pET20b(+) vector harboring an expression cassette encoding a mature CGTase from the alkaliphilic Bacillus sp. G825-6 with an N-terminal DacD signal peptide was used [38]. A partially randomized megaprimer 5'-agc ccg ccg att gaa aat gtg yht gmw vyn nmy vmn rrn rrn ttc ncn agc tat cat ggc tat tgg ggc-3' was amplified by PCR (Phusion HF Kit, New England Biolabs, Frankfurt, Germany) using primers hybridizing to the flanking sites to generate a dsDNA megaprimer. The randomized ssDNA primer (1 pmol) was added as a template for a 50 µL PCR reaction for 30 cycles according to the supplier's manual. The product was purified by agarose gel electrophoresis. The megaprimer DNA and the template vector pET20b(+):dacD-cgt were used in a molar ratio of 10:1 in a second PCR reaction (18 cycles, annealing temperature: 60 °C). The product designated pET20b(+):dacD-cgt-L3 was digested with DpnI, and ultracompetent E. coli XL10-Gold cells (New England Biolabs, Frankfurt, Germany) were used for transformation [39]. After 1 h, an aliquot (50 µL) was plated on LB agar plates containing 70 µg mL⁻¹ ampicillin (LB-amp), and the residual cells were further incubated for 12 h before harvesting the cells. Plasmids were extracted to obtain the L3 vector library and used for the transformation of E. coli BL21(DE3). The procedure was repeated with a mutated version of the vector encoding the CGTase G825-6 with the substitution Y183R to generate the L3YR vector library.
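The layout of the megaprimer can be inspected by splitting it into codons and flagging those that contain degenerate bases. The short Python sketch below uses the primer sequence exactly as printed above; the variable names are illustrative, and the reading frame is assumed to follow the triplet spacing of the printed sequence.

```python
# Megaprimer as printed in the text, one triplet per codon (5' to 3')
MEGAPRIMER = (
    "agc ccg ccg att gaa aat gtg yht gmw vyn nmy vmn rrn rrn "
    "ttc ncn agc tat cat ggc tat tgg ggc"
)

codons = MEGAPRIMER.split()

# A codon is degenerate if it contains any symbol other than a, c, g or t
randomized = [
    (index, codon)
    for index, codon in enumerate(codons, start=1)
    if set(codon) - set("acgt")
]

print(f"{len(codons)} codons in total, {len(randomized)} randomized")
for index, codon in randomized:
    print(f"codon {index}: {codon}")
```

This yields eight degenerate codons flanked by fixed sequence, with a single fixed codon (ttc, encoding phenylalanine) located between the seventh and eighth randomized position, in line with residue F88 being kept constant.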

Agar Plate Screening
Single colonies of E. coli BL21(DE3) pET20b(+):dacD-cgt-L3 were transferred to LB-amp, congo red [40], and phenolphthalein (LB agar pH 7.4, 40 mg L⁻¹ phenolphthalein) agar plates [41]. Both congo red and phenolphthalein agar plates were supplemented with 10 g L⁻¹ soluble starch and 70 µg mL⁻¹ ampicillin. Plates for the L3 experiment were incubated at 37 °C for 24 h. For the screening of the L3YR clones, the congo red plates were transferred after 14 h from 37 °C to 50 °C and further incubated for 8 h. The size of the halos formed on the congo red plates was estimated after 14 h and 24 h of incubation. Subsequently, starch-degrading activity was visualized by covering the congo red plates with 1% (w/w) Lugol's solution, and halo formation was estimated after 1 min. After 24 h of incubation, the phenolphthalein plates were overlaid with a solution containing 1 M NaOH, 0.1 M glycine, and 1 M NaCl. After 1 min of incubation, the size of the halos was estimated. Twenty-eight clones were picked per plate. Each plate contained two clones for the positive control E. coli BL21(DE3) pET20b(+):dacD-cgt, a second positive control encoding the G825-CGTase variant D358R with low CD7- and CD8-synthesizing activity, and a negative control (E. coli BL21(DE3) pET20b(+):tfcut, encoding a cutinase [42]).

Recombinant Protein Production and Analysis
Positive clones were repeatedly screened, and selected clones were used for expression of the proteins in 50 mL cultures. The enzyme in the extracellular fraction was purified by starch adsorption [38]. The starch-degrading activity and the protein concentration of the purified enzyme variants were determined as previously described [38].

CD Synthesis and Analysis
For the determination of the product spectrum synthesized by the CGTase variants, 0.2 µg purified protein (0.4 µg for the L3YR variants) was added to a 20 g L⁻¹ soluble starch substrate (soluble potato starch, CAS 9005-84-9; Merck KGaA, Darmstadt, Germany) in a total volume of 1 mL. Reactions were performed at 40 °C and 50 °C in 25 mM Tris-HCl pH 8.5 containing 10 mM KCl and 5 mM MgCl2. Samples were analyzed by high pressure anion exchange chromatography with pulsed amperometric detection after 1 h and 24 h of reaction [23].

Determination of Temperature Optimum and Thermostability of the Variants
The temperature optimum of the variants was determined by measuring their starch-degrading activity between 40 °C and 70 °C [38]. The thermostability of Y183R and L3YR-6 was analyzed by determining their residual activity after incubation of the enzyme solution at 50 °C for 2 h. Nano differential scanning fluorimetry (nano-DSF) (Prometheus NT.48, Nanotemper Technologies, Munich, Germany) based on the tryptophan fluorescence ratio 350/330 nm (20 °C to 95 °C with 1 °C/min) was used to determine the melting temperature (Tm) of the proteins, which was calculated by first derivative analysis.
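The first derivative analysis mentioned above can be sketched as follows. This is a minimal illustration in Python, not the instrument software: it assumes that the 350/330 nm ratio increases upon unfolding and takes Tm as the temperature at which the central-difference slope is largest. The melting curve is synthetic and purely hypothetical.

```python
import math


def melting_temperature(temps, ratios):
    """Return the temperature with the steepest increase of the ratio.

    The slope is estimated by central differences, so the first and
    last data points are not considered as candidates.
    """
    if len(temps) != len(ratios) or len(temps) < 3:
        raise ValueError("need two equally long series with at least 3 points")
    best_temp, best_slope = None, -math.inf
    for i in range(1, len(temps) - 1):
        slope = (ratios[i + 1] - ratios[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


# Hypothetical data: sigmoidal unfolding curve with its midpoint at 60 °C,
# sampled in 1 °C steps from 20 °C to 95 °C.
temps = [20 + i for i in range(76)]
ratios = [0.8 + 0.3 / (1.0 + math.exp(-(t - 60.0) / 2.0)) for t in temps]

tm = melting_temperature(temps, ratios)
print(f"estimated Tm: {tm} °C")
```

For real measurements, smoothing of the fluorescence ratio before differentiation is usually advisable, since numerical derivatives amplify noise.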

Protein Sequence Alignments and Molecular Modeling
Protein sequences from 30 CGTases were obtained from the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov; accession numbers in Table S1) and compared to the CGTase G825-6 sequence [43] using the ClustalW algorithm implemented in the homology software MEGA, version 6.06 [44]. Based on the X-ray structure of the CGTase from Bacillus clarkii (PDB:4JCM), obtained from the Protein Data Bank (PDB) (www.rcsb.org), structure homology modeling was performed using the SWISS-MODEL platform [45,46]. The PyMOL molecular graphics system (v0.99, Schrödinger, LLC) was used to model amino acid substitutions. The superimposed structure of a maltononaose substrate was derived from PDB:1CXK [47].

Conclusions
A strongly compressed library with high fitness was constructed using a semi-rational design for the alteration of the CD synthesis activity of a CGTase. The screening method based on three different markers allowed the selection of variants with changed CD synthesis activity as well as changed CD product specificity. With this approach, we were able to screen a reasonably small number of clones to obtain CGTase variants synthesizing preferentially large-ring CD. This approach resulted in enzyme variants producing three times higher total amounts of CD7 to CD12 with a high proportion of CD9 to CD12. Some of the variants even synthesized almost solely CD10, CD11 and CD12. The results demonstrate that by semi-rational design, CGTase variants specifically producing these large-ring CD can be generated, providing a previously difficult to access group of novel host molecules in supramolecular complexing reactions.
Supplementary Materials: Supplementary Materials are available online at http://www.mdpi.com/2073-4344/9/3/242/s1, Table S1: Accession numbers for the sequences used in the multiple protein alignment, Figure S1: Multiple sequence alignment of CGTases and library design, Figure S2: Agar plate screening of L3 clones for CD7 and CD8 synthesis activity, and starch-degrading activity, Figure S3: SDS-PAGE of purified L3 and L3YR variants, Figure S4: Thermal stability of variant Y183R and L3YR-6, Figure S5: Nano-DSF melting curves, Figure S6: Influence of residue 88 on the CD product share.

Figure 1. Cyclodextrin glucanotransferase (CGTase) library design and screening process. (a) Sequencing of the generated L3YR vector library. The protein sequence 80-90 of the CGTase G825-6 and the library of encoded residues and codon distributions are shown as the frequency of the sequences. (b) Logos of the sequence frequency of 22 variants from the L3, and of eight variants from the L3YR experiment.

Figure 2. Synthesis of CD by the CGTase variants. The distribution of synthesized CD7 to CD12 is shown on the primary y-axis. The total amount of CD synthesized (CD6 to CD12) is shown on the secondary y-axis. The CD products of the CGTase G825-6 and the L3 variant enzymes obtained after 1 h (a) and 24 h (b) of reaction are shown. The CD products of the Y183R variant and of eight L3YR variant CGTases obtained after 1 h (c) and 24 h (d) of reaction at 50 °C are also shown. Selected enzymes were compared to Y183R in a synthesis reaction at 40 °C for 1 h (e) and 24 h (f). Mean values (n = 3) ± S.D. are presented.

Figure 3. Temperature optimum of the CGTase variants. The starch-degrading activity of L3 variant enzymes (a) and L3YR variant enzymes (b) was determined between 40 °C and 70 °C. Data represent mean values, n = 3.

Figure 5. Potential salt bridges between loop elements in the CGTase variants. A model of residues encoded in the L3 library (positions 81−89) predicted to form salt bridges with residues from the adjacent loops 39−48 and 136−142 is shown. The residues D, E, R and K encoded in the library design at position 87 are in close range to K45 and D42 and could form a salt bridge between these loops. K85 may form a salt bridge with D139, located on a neighboring loop.