Structural and Biochemical Characterization of Endo-β-1,4-glucanase from Dictyoglomus thermophilum, a Hyperthermostable and Halotolerant Cellulase

Enzymatic conversion of polysaccharides in the lignocellulosic biomass is currently the subject of intensive research and will be a key technology in future biorefineries. Using a bioinformatics approach, we previously identified a putative endo-β-1,4-glucanase (DtCel5A) from Dictyoglomus thermophilum, a chemoorganotrophic and thermophilic bacterium. Here, we structurally and functionally characterize DtCel5A and show that it is endowed with remarkable thermal and chemical stability. The structural features of DtCel5A and of its complex with cellobiose have been investigated by combining X-ray crystallography and other biophysical studies. Importantly, biochemical assays show that DtCel5A retains its activity on cellulose at high temperatures and at elevated salt concentrations. These features make DtCel5A an enzyme with interesting biotechnological applications for biomass degradation.


Introduction
Cellulases are highly attractive enzymes for various industrial applications, such as lignocellulosic biomass conversion [1,2]. Due to its recalcitrance, cellulose requires enzymatic treatment for the complete hydrolysis of its components [3,4]. In the biofuel industry, an enzymatic complex of cellulases is required for the complete hydrolysis of cellulosic polymers into fermentable sugar units. This complex is composed of three major hydrolases: endoglucanases, which attack low-crystallinity regions in the cellulose fibers by endo-action, creating free chain ends; exoglucanases or cellobiohydrolases, which hydrolyze the 1,4-glycosidic linkages to form cellobiose; and β-glucosidases, which convert cello-oligosaccharides and the disaccharide cellobiose into glucose residues [5].
Fungi, such as Trichoderma reesei, are industrially employed as a source of extracellular cellulases [6]. The maximum activity for most fungal cellulases occurs at 50 ± 5 °C and at a pH of 4.5-5 [7]. Usually, they lose about 60% of their activity in the temperature range of 50-60 °C and almost all of it at 80 °C [8]. Therefore, owing to the progress in protein engineering and in recombinant protein production using Escherichia coli, new cellulases are being identified that are more stable over long periods of time and at elevated temperatures, thus enabling a highly efficient conversion of biomass [9]. Moreover, several thermophilic strains producing endoglucanases have been isolated and identified from various environments [10]. In this context, Dictyoglomus thermophilum, a chemoorganotrophic and thermophilic bacterium that encodes different thermostable xylanases and amylases [11], can be considered a promising source of new enzymes for the production of tailored cellulolytic cocktails for industrial application [12].
We have recently identified a new uncharacterized endo-β-1,4-glucanase from D. thermophilum, belonging to glycoside hydrolase family 5 (UniProtKB accession number B5YAS2) [13]. Here, we report the structural and functional characterization of this endo-β-1,4-glucanase, denoted as DtCel5A, as a promising enzyme for biotechnological applications. The crystal structure of DtCel5A shows an overall fold that is typical of glycosyl hydrolase family 5 and clan GH-A. Importantly, DtCel5A is endowed with remarkable thermal and chemical stability, with preserved enzymatic activity even at 80 °C. Additionally, its elevated halo-stability makes DtCel5A fully usable for biotechnological applications. The structural and functional data reported here provide a rationale for this outstanding stability and may help the bioengineering of cellulases using a structure-based approach to produce more efficient and resistant enzymes for the degradation of biomass polysaccharides in biofuel production.

DtCel5A from D. thermophilum Is a Thermally and Chemically Stable Cellulase
The full-length coding sequence of a putative endoglucanase from D. thermophilum was purchased from GENEWIZ (Sigma-Aldrich, Gillingham, Dorset, UK). The gene encodes a 335 amino-acid endo-β-1,4-glucanase (UniProtKB accession number B5YAS2), which we named DtCel5A. Bioinformatic analysis, using the PFAM database, shows that DtCel5A is a member of glycosyl hydrolase family 5, constituted by a single cellulase domain. Its fold is represented in 811 different domain architectures and, although DtCel5A is of bacterial origin, is most frequent among fungi (data not shown).
Since DtCel5A contains a transmembrane helix at its N-terminal end, as predicted by TMPRED (Figure 1), we cloned the corresponding gene deprived of the region encoding its first 19 amino acids. Additionally, we mutated two cysteine residues to their isosteric serine (Cys156Ser and Cys308Ser) to avoid aggregation phenomena in solution. The recombinant protein was overexpressed for biochemical and structural studies. The thermal and chemical stability of cellulose-hydrolyzing enzymes are valuable characteristics for their biotechnological use. Using far-UV CD spectroscopy, we observed that the spectrum of DtCel5A is typical of a well-structured α-β fold, indicating a good degree of structural integrity of the protein in solution (Figure 2A). To investigate the heat-induced changes in the protein's secondary structure, thermal unfolding curves were recorded by following the CD signal at 222 nm as a function of temperature, using a 1 °C/min heating rate. Thermal unfolding shows that DtCel5A is highly thermostable, with a melting temperature of 85 °C (Figure 2B).
The stability of DtCel5A against the denaturing action of guanidine hydrochloride (GuHCl) was also investigated by performing CD measurements at 22 °C in 20 mM sodium phosphate buffer at pH 7.4 in the presence of increasing GuHCl concentrations, up to 6.0 M (Figure 2C). The resulting denaturation profile is characteristic of a two-state transition with a single inflection point. Results show that DtCel5A possesses a remarkably high resistance against the GuHCl-denaturing action, showing a value of GuHCl concentration at half-completion of the transition (C1/2) of 5.4 M (Figure 2C). The extraordinary thermal and chemical stability of DtCel5A makes this enzyme a promising tool for biotechnological applications.
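As an illustration of how a half-transition concentration can be read off a two-state denaturation profile, the following minimal Python sketch normalizes the CD signal between the first and last points and interpolates linearly to the 50% level. The function name and the data points are hypothetical and are not the measured values; a full analysis would typically fit sloping baselines rather than use the end points.

```python
def transition_midpoint(x, signal):
    """Estimate the x value where the normalized signal crosses 0.5.

    Assumes x is sorted in ascending order and that the first and last
    points represent the native and unfolded baselines, respectively.
    """
    native, unfolded = signal[0], signal[-1]
    if native == unfolded:
        raise ValueError("no transition: baselines are identical")
    # Fraction unfolded under the two-state assumption.
    frac = [(s - native) / (unfolded - native) for s in signal]
    for i in range(1, len(x)):
        f0, f1 = frac[i - 1], frac[i]
        if f0 < 0.5 <= f1:
            # Linear interpolation between the two bracketing points.
            return x[i - 1] + (0.5 - f0) * (x[i] - x[i - 1]) / (f1 - f0)
    raise ValueError("signal never crosses the half-transition level")


# Hypothetical CD signal at 222 nm versus GuHCl concentration (M).
guhcl = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cd_222 = [-20.0, -20.0, -20.0, -20.0, -20.0, -15.0, -5.0]
c_half = transition_midpoint(guhcl, cd_222)
print(f"Estimated C1/2: {c_half:.2f} M")
```

The same function could be applied to a thermal unfolding curve by passing temperatures instead of denaturant concentrations.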

DtCel5A Is a Structurally Conserved Endo-β-1,4-glucanase
Fine-tuning of protein concentration in crystallization drops, using vapor-diffusion techniques, produced crystals of DtCel5A suitable for X-ray studies in two different space groups, the trigonal space group P3₂ and the orthorhombic P2₁2₁2₁ [13]. The structures were solved by molecular replacement, using the structure of Cel5A from T. maritima (PDB code 3mmu) as a template. For details of data processing, refinement, and structure validation, see Table 1. In the P3₂ space group, there are two DtCel5A molecules in the asymmetric unit, with a strong structural similarity (rmsd 0.07 Å on Cα atoms). Overall, DtCel5A adopts a compact (β/α)₈-barrel fold (Figure 3), which has long been known as the most commonly occurring fold among protein catalysts, appearing in approximately 10% of all known enzyme structures [14]. An analysis of the degree of conservation of surface residues of the protein, carried out with ConSurf, clearly identifies the catalytic site cleft of DtCel5A as a highly conserved cluster of residues in the central part of the β-barrel (Figure 3). The most conserved residues are located at the bottom of the cleft and include residues Glu154-His214-Glu272, which are predicted to act as a catalytic triad [15,16] (Figure 3).
A superposition of the structure in the P3₂ space group (chain A) with that in P2₁2₁2₁ shows full structural conservation, with rmsd computed on Cα atoms (residues 20-333) of 2.0 Å. However, significant differences exist in the N-terminal region (between Ser20 and Pro29), which adopts a different conformation in the two forms. In the P3₂ space group, this region exhibits an extended conformation and forms several backbone-backbone hydrogen-bonding interactions with loop 227-230 of the molecule related through non-crystallographic symmetry (Figure 4). In this conformation, it partially locks the catalytic site of the adjacent molecule in the asymmetric unit (Figure 4). By contrast, in the P2₁2₁2₁ space group, the same N-terminal region forms intra-molecular interactions with outer residues of helices H7 and H8 (e.g., of S19 and Y21 with D258 and D254 of H7, respectively), thus leaving the catalytic site empty (Figure 4). Consistent with these findings, secondary structure predictions using JPred show that the N-terminal region 20-29 is expected to adopt a loop conformation. Consistent with the full conservation of DtCel5A structure in two different space groups, strong structural conservation is also observed with other members of the glycosyl hydrolase family 5, clan GH-A. Specifically, a structural similarity search using DALI showed that the overall architecture of DtCel5A resembles a set of endoglucanases, among which the most similar are those of the thermophilic bacteria Fervidobacterium pennivorans (seqid 55%, PDB code 6kdd, Z = 49.3, and rmsd = 0.8 Å on backbone atoms), Fervidobacterium nodosum (seqid 52%, PDB code 3rjy, Z = 48.3, and rmsd = 0.9 Å) [16], and Thermotoga maritima (seqid 60%, PDB code 3amd, Z = 48.4, and rmsd = 0.9 Å) [17]. In all cases, structural differences are only observed in the N-terminal region, which, as reported above, is strongly affected by crystal packing (Figure 4).
Strong structural similarity is also observed with endo-1,4-β-glucanase of the mesophilic Clostridium difficile (seqid 41%, PDB code 6uje, Z = 46.1, and rmsd = 1.2 Å) [18]. This structural similarity allowed us to compare specific features of these two proteins, which could potentially explain their dramatically different thermostabilities. Although the identification of structural determinants of thermostability has long been debated, it appears clear that DtCel5A possesses peculiar features compared to Cel5A of C. difficile. Indeed, DtCel5A contains twice as many proline residues (Table 2). These proline residues of DtCel5A present backbone dihedral angles similar to those of their corresponding residues in Cel5A of C. difficile and are located either at the N-terminal side of α-helices or in loop regions, likely entropically stabilizing the entire structure (Figure 5). Consistently, a critical effect of proline on the thermostability of endoglucanase II from P. verruculosum was previously observed [19]. In addition, the sequence of DtCel5A presents a larger number of hydrophobic side chains, 151 compared to 120 in Cel5A of C. difficile (Table 2). Additionally, a larger number of salt bridges, 14 versus 12, is computed for the DtCel5A structure. Consistent with our data, mutational analysis showed that electrostatic and hydrophobic networks strongly contribute to thermal stability and halo-stability [20].
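For readers unfamiliar with the rmsd values quoted above, the following minimal Python sketch shows the underlying arithmetic for two sets of paired atomic coordinates. It assumes that the two structures have already been optimally superposed and that atoms are listed in matching order; the function name and the coordinates are hypothetical, and the sketch is not the procedure implemented in DALI or in the superposition software used here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. Å)
    between two equally long lists of (x, y, z) tuples.

    No superposition is performed: the coordinates are assumed to be
    already in a common reference frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and paired")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical pair of two-atom fragments shifted by 1 Å along z.
fragment_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
fragment_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(fragment_a, fragment_b))
```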

Crystal Structure of DtCel5A with Cellobiose
Crystals of complexes of DtCel5A with cellobiose, the major hydrolysis product of most endo-β-1,4-glucanases, were obtained in the P2₁2₁2₁ space group. Indeed, likely due to the partial occlusion of the catalytic site pocket by the N-terminal region of DtCel5A in the P3₂ space group (Figure 4), all soaking and co-crystallization experiments were unsuccessful in this crystal form. The analysis of electron density maps shows a single cellobiose molecule located at the entrance of the DtCel5A catalytic pocket (Figure 6A,B). This cellobiose molecule is stacked between Trp48 and Trp228 and forms a hydrogen bond with the side chain of Trp305 (Figure 6B). Superposition of the apo-form and cellobiose-bound complex structures shows that the binding of cellobiose to the enzyme does not cause significant conformational changes (rmsd on backbone atoms 0.7 Å).
A superposition of the DtCel5A-cellobiose crystal structure with those of an inactive mutant of Cel5A from Thermotoga maritima in complex with cellobiose (CBI) and cellotetraose (CTT) [17] shows that the cellobiose molecule binds in a different position in the two enzymes. Indeed, in DtCel5A, cellobiose is located further from the catalytic triad and has a one-position shift, from −2/−1 to −3/−2 (Figure 6C,D). This finding points to a possible role of Trp48 and Trp228 in holding cellulose in the enzyme cavity by ensuring the non-specific (stacking) interactions needed for cellulose processivity. Indeed, conservation analysis using ConSurf shows that Trp48 and Trp228 are among the most conserved tryptophan residues of DtCel5A (data not shown).

DtCel5A Is Fully Active and Halotolerant at High Temperatures
To probe the usability of DtCel5A for biotechnological applications, we measured its activity on carboxymethyl cellulose (CMC) as a function of pH, temperature, and ion concentration. Endoglucanase activity of DtCel5A was measured at 70 °C in the pH range of 3.5-7.5, using citrate-phosphate buffer. The enzyme shows a remarkably high cellulase activity, with a maximum of (233 ± 24.4) U/µmol at a pH value of 4.5 (Figure 7A), consistent with the previously characterized endoglucanase Cel5H from D. thermophilum [21]. Given the elevated thermostability of DtCel5A, we tested the effect of 1 h incubation at different temperatures, in the range between 30 and 80 °C, on the cellulase activity of the purified enzyme. Experiments were conducted in the best-identified pH conditions, 50 mM citrate-phosphate buffer, with a pH of 4.5. We observed a substantial increase in CMC-degrading activity from 30 °C to 80 °C, with maximum activity between 70 °C and 80 °C (Figure 7B). These data are consistent with the elevated melting temperature (Tm 85 °C, Figure 2) observed by CD spectroscopy analyses.
Halotolerant enzymes are beneficial for industrial processes, often requiring high salt concentrations [22]. Therefore, we investigated the effect of NaCl on the activities of DtCel5A. DtCel5A was incubated in the presence of this salt, ranging from 0.3 to 3 M, and the enzymatic activity was measured under optimum values of pH and temperature (pH of 4.5 and 70 °C, respectively). As a result, we observed that the enzymatic activity of DtCel5A remained almost unchanged at different NaCl concentrations (Figure 7C), albeit with a small enhancement at 3 M NaCl. This property was previously observed for the paralogous Cel5H [21] and for the recombinant endoglucanase from the thermophilic fungus Scytalidium thermophilum [23].
It was previously shown that hydrophobic residues significantly improve protein halotolerance by strengthening intra-domain hydrophobic interactions [24]. As stated above and shown in Table 2, the higher number of hydrophobic residues in DtCel5A (aliphatic index 83.5), compared to the homologous mesophilic Cel5A from C. difficile (aliphatic index 78.5), may provide a rationale for its enhanced halotolerance.
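The aliphatic index quoted above is a simple sequence-derived quantity. As a hedged illustration, the sketch below computes it with the commonly used definition (as reported by tools such as ProtParam): the mole percent of alanine, plus 2.9 times that of valine, plus 3.9 times the sum of those of isoleucine and leucine. The function name and the toy sequence are hypothetical; the DtCel5A sequence itself is not reproduced here.

```python
def aliphatic_index(sequence):
    """Aliphatic index of a protein sequence in one-letter code.

    AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)),
    where X is the mole percent of each residue type.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / len(seq)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


# Toy four-residue sequence, for illustration only.
print(aliphatic_index("AVIL"))
```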

Recombinant Protein Production, Crystallization of DtCel5A, and Data Collection
The gene encoding a putative endoglucanase from D. thermophilum (UniProtKB. Available online https://www.uniprot.org/uniprot/B5YAS2), encompassing the protein region 18-334 and named DtCel5A, was synthesized and subcloned into the pETM-13 expression vector as previously described [13]. Briefly, the recombinant protein was expressed in E. coli and purified as a C-terminal His-tagged protein by coupling two consecutive chromatography steps. Freshly concentrated protein (molar extinction coefficient of 95,500 M⁻¹ cm⁻¹, derived from the ProtParam tool. Available online https://web.expasy.org/protparam/) was used for the crystallization experiments.
A high-throughput crystallization screening was performed at 293 K by hanging-drop vapor-diffusion methods using commercially available sparse-matrix solutions (Hampton Research, Aliso Viejo, CA, USA). Two diffracting crystal forms were obtained using different protein concentrations, spanning from 20 to 35 mg/mL, and two different precipitants [13]. All crystals were cryoprotected by adding 15-20% (v/v) glycerol to the crystallization solution, and diffraction data were collected in-house. The first crystal form, obtained in ammonium sulfate, belonged to space group P3₂ and diffracted to 1.5 Å, whereas the second crystal form, obtained in PEG8K, belonged to the orthorhombic space group P2₁2₁2₁ and diffracted to 1.6 Å resolution [13]. The data sets were scaled and merged using the HKL2000 software package (Z. Otwinowski and W. Minor, 1997, C.W. Carter, Jr. & R. M. Sweet, Eds., Academic Press, New York). Statistics of data collection are reported in Table 1. Crystals of complexes of DtCel5A with cellobiose (5 mM) were obtained using the hanging-drop vapor-diffusion method, both by co-crystallization and soaking approaches.
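The extinction coefficient given above allows protein concentration to be estimated from absorbance through the Beer-Lambert law, A = ε · c · l. The following minimal Python sketch illustrates the conversion. The absorbance reading is hypothetical, the measurement wavelength is assumed to be 280 nm, and the molecular mass of about 38.6 kDa is an assumption inferred from the 27 µg per 0.7 nmol stated for the enzyme assays, not a value reported directly.

```python
EPSILON_M_CM = 95500.0  # M^-1 cm^-1, extinction coefficient from the text


def molar_concentration(absorbance, epsilon=EPSILON_M_CM, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), in mol/L."""
    return absorbance / (epsilon * path_cm)


def mg_per_ml(absorbance, mw_da, epsilon=EPSILON_M_CM, path_cm=1.0):
    """Mass concentration; mol/L times g/mol gives g/L, i.e. mg/mL."""
    return molar_concentration(absorbance, epsilon, path_cm) * mw_da


# Hypothetical reading of a diluted sample in a 1 cm cuvette.
a280 = 0.955
assumed_mw = 38600.0  # Da, assumption (see lead-in)
print(f"{molar_concentration(a280) * 1e6:.1f} µM")
print(f"{mg_per_ml(a280, assumed_mw):.3f} mg/mL")
```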

Structure Determination and Refinement
The crystal structure of DtCel5A was solved by molecular replacement using the software PHASER (ccp4i supported program, Didcot, UK) [25] and the structure of endoglucanase Cel5A from the hyperthermophilic Thermotoga maritima (PDB code 3MMU) as template [26]. Crystallographic refinement was first carried out against 95% of the measured data using the program REFMAC in the ccp4i program suite (Didcot, UK) [27]. The remaining 5% of the observed data, which were randomly selected, was used in Rfree calculations to monitor the progress of refinement. Water molecules were incorporated into the structure in several rounds of successive refinement. The structures were validated using the program PROCHECK [28]. Structure comparison was carried out using the DALI server (http://ekhidna2.biocenter.helsinki.fi/dali/). All figures were prepared using the program PyMOL (DeLano Scientific, San Carlos, CA, USA).

Circular Dichroism
To analyze the conformational state of DtCel5A, far-UV CD spectra were recorded at 20 °C in 20 mM sodium phosphate buffer at pH 7.5. All CD spectra were recorded with a Jasco J-810 spectropolarimeter equipped with a Peltier temperature control system (Model PTC-423-S, Jasco Europe, Cremella, LC, Italy). Molar ellipticity per mean residue, [θ] in deg cm² dmol⁻¹, was calculated from the equation [θ] = [θ]obs · mrw · (10 · l · C)⁻¹, where [θ]obs is the ellipticity measured in degrees, mrw is the mean residue molecular mass (118 Da), C is the protein concentration in mg/mL, and l is the optical path length of the cell in cm. Far-UV measurements (195-250 nm) were carried out at 20 °C using a 0.1 cm optical path length cell and a protein concentration of 0.2 mg/mL. Thermal denaturation was investigated by recording the CD signal at 222 nm. The GuHCl-induced denaturation curves, at a constant temperature of 20 °C, were obtained by recording the CD spectra at increasing concentrations of GuHCl, up to 6.0 M. The signal at 222 nm was also followed as a function of GuHCl concentration to estimate the C1/2.
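The conversion from observed ellipticity to mean residue ellipticity can be sketched as follows, using the equation and the default values stated above (mrw of 118 Da, 0.1 cm path length, 0.2 mg/mL protein). The function name and the example reading are hypothetical. The unit of the observed ellipticity is taken as given in the text; note that with a prefactor of 10 and a concentration in mg/mL, instrument readings are commonly entered in millidegrees, so the unit of the input should be checked before use.

```python
def mean_residue_ellipticity(theta_obs, mrw=118.0, path_cm=0.1, conc_mg_ml=0.2):
    """[theta] = theta_obs * mrw / (10 * l * C).

    theta_obs : observed ellipticity (unit as discussed in the lead-in)
    mrw       : mean residue molecular mass in Da
    path_cm   : optical path length in cm
    conc_mg_ml: protein concentration in mg/mL
    """
    if path_cm <= 0 or conc_mg_ml <= 0:
        raise ValueError("path length and concentration must be positive")
    return theta_obs * mrw / (10.0 * path_cm * conc_mg_ml)


# Hypothetical observed signal at 222 nm.
print(mean_residue_ellipticity(-2.0))
```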

Enzyme Assays: Temperature, pH, and Salt Dependence of DtCel5A
Endoglucanase activity was determined in a reaction mixture (500 µL) with 1% (w/v) carboxymethyl cellulose (CMC low viscosity, C5678 Sigma-Aldrich, Merck Milano, Italy), as substrate, in 50 mM citrate-phosphate buffer (pH 4.5) and 300 mM NaCl. The mixture was pre-incubated at 70 °C before adding 27 µg (0.7 nmol) of purified protein. The reactions were performed for 1 h, and the resulting amount of reducing sugar equivalents was detected using the 3,5-dinitrosalicylic acid (DNS) method [21]. DNS assay solution was prepared with 1% (w/v) DNS, 0.05% (w/v) sodium sulfite, and 1% (w/v) sodium hydroxide. After incubation, 100 µL of digested CMC was added to 0.5 mL of the DNS assay solution, 0.1 mL of sodium acetate (pH 5), and brought to 1 mL with water. The samples were boiled for 10 min and quenched with 0.1 mL of 1.5 M potassium sodium tartrate, and then cooled to room temperature. The absorbance measured at 575 nm is proportional to the concentration of reducing sugar, which was calculated with the standard curve. Standard glucose solutions were prepared in 1 mL of DNS solution ranging from 0.6 to 4.0 mM (µmol/mL).
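The use of the glucose standard curve can be illustrated with a short Python sketch: a least-squares line is fitted to absorbance versus glucose concentration and then inverted to convert a sample absorbance into reducing sugar equivalents. The absorbance values and function names are hypothetical, and a linear response over the 0.6-4.0 mM range is assumed.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reducing_sugar_mm(absorbance, slope, intercept):
    """Invert the standard curve: concentration in mM (µmol/mL)."""
    return (absorbance - intercept) / slope


# Hypothetical glucose standards (mM) and A575 readings.
standards_mm = [0.6, 1.0, 2.0, 4.0]
a575 = [0.20, 0.30, 0.55, 1.05]
slope, intercept = fit_line(standards_mm, a575)
print(reducing_sugar_mm(0.80, slope, intercept))
```

Samples whose absorbance falls outside the range of the standards would normally be diluted and re-measured rather than extrapolated.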
The effect of temperature on the enzymatic activity was determined in 50 mM citrate-phosphate buffer (pH 4.5) in the temperature range from 30 to 80 °C, whereas the effect of pH was investigated in citrate-phosphate buffer in the pH range of 3.5-7.5. The enzymatic activity of DtCel5A was also investigated at different concentrations of NaCl, ranging from 0 to 3.0 M, under its optimum pH and temperature. One unit of enzymatic activity (U/mL) was defined as the amount of enzyme required to release 1 µmol of glucose-equivalent reducing groups per minute in 1 mL. All enzymatic assays were performed in triplicate. The means and standard errors of the means (mean ± S.E.) were calculated for each treatment, and S.E. values are displayed as Y-error bars in figures.
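The arithmetic behind the reported activities can be sketched as follows. The example converts a reducing sugar concentration into units per µmol of enzyme and summarizes triplicates as mean ± S.E. The default reaction volume (0.5 mL), reaction time (60 min), and enzyme amount (0.7 nmol) are taken from the assay description; the sugar concentrations and function names are hypothetical, and the sketch assumes that the concentration refers to the reaction mixture, with any dilution in the DNS step already corrected.

```python
import math
import statistics


def specific_activity(sugar_mm, volume_ml=0.5, minutes=60.0, enzyme_nmol=0.7):
    """Activity in U per µmol of enzyme (1 U = 1 µmol product per minute)."""
    product_umol = sugar_mm * volume_ml      # mM equals µmol/mL
    units = product_umol / minutes           # µmol released per minute
    return units / (enzyme_nmol / 1000.0)    # nmol converted to µmol


def mean_and_se(values):
    """Mean and standard error of the mean (sample standard deviation)."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# Hypothetical triplicate sugar concentrations (mM) from one condition.
triplicate = [specific_activity(c) for c in (7.56, 8.40, 9.24)]
mean, se = mean_and_se(triplicate)
print(f"{mean:.1f} ± {se:.1f} U/µmol")
```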

Conclusions
The production of thermostable and halotolerant biocatalysts is one of the current challenges in industrial and biorefinery processes [29]. Endoglucanases are essential operating enzymes in cellulolytic cocktails for the treatment of lignocellulose, aimed at its conversion into fermentable sugars. In this study, we report the functional and structural characterization of a new thermostable endo-β-1,4-glucanase from D. thermophilum, DtCel5A.
We show that DtCel5A displays extreme thermostability, with a Tm of 85 °C and the strongest enzymatic activity at pH 4.5 between 70 and 80 °C. Additionally, DtCel5A possesses a remarkable resistance to GuHCl denaturation, with a value of GuHCl concentration at half-completion of the transition (C1/2) of 5.4 M. Moreover, the enzymatic activity of DtCel5A is unaffected by NaCl concentration, up to a concentration of 3 M. The structural features shown here for DtCel5A allow for the understanding of the extremely resistant properties of DtCel5A, mainly due to its elevated number of proline residues at the N-terminal ends of α-helices or in loop regions and the elevated number of buried hydrophobic residues. All these properties provide a rationale for DtCel5A being an ideal enzyme for the industrial pretreatment processes commonly used for lignocellulosic material. Finally, the characterization of DtCel5A provides molecular determinants useful for the design of enzymes with enhanced cellulase activity.