Improvement of the catalytic activity and thermostability of a hyperthermostable endoglucanase by optimizing N-glycosylation sites

Background: Endoglucanase has been extensively employed in industrial processes as a key biocatalyst for lignocellulosic biomass degradation. Thermostable endoglucanases with high catalytic activity at elevated temperatures are preferred in industrial use. To improve the activity and thermostability, site-directed mutagenesis was conducted to modify the N-glycosylation sites of the thermostable β-1,4-endoglucanase CTendo45 from Chaetomium thermophilum.

Results: In this study, structure-based rational design was performed based on the modification of N-glycosylation sites in CTendo45. Eight single mutants and one double mutant were constructed and successfully expressed in Pichia pastoris. When the unique N-glycosylation site of N88 was eliminated, a T90A variant was active, and its specific activity towards CMC-Na and β-d-glucan was increased 1.85- and 1.64-fold, respectively. The mutant R67S with an additional N-glycosylation site of N65 showed a distinct enhancement in catalytic efficiency. Moreover, T90A and R67S were endowed with extraordinary heat endurance after 200 min of incubation at different temperatures ranging from 30 to 90 °C. Likewise, the half-lives (t1/2) indicated that T90A and R67S exhibited improved enzyme thermostability at 80 °C and 90 °C. Notably, the double-mutant T90A/R67S possessed better hydrolysis activity and thermal stability than its single-mutant counterparts and the wild type.

Conclusions: This study provides initial insight into the biochemical function of N-glycosylation in thermostable endoglucanases. Moreover, the design approach to the optimization of N-glycosylation sites presents an effective and feasible strategy to improve enzymatic activity and thermostability.

Since elevating temperature generally improves catalytic efficiency and simultaneously reduces microbial contamination, thermostability is a desired quality for endoglucanases in practice [4,5]. Additionally, utilizing potent thermostable endoglucanases with optimal activity at high temperatures can accelerate the hydrolysis process, shorten the reaction period, and enhance cost competitiveness [6,7]. Consequently, the enhancement of advantageous characteristics is desirable to generate superior endoglucanases [8]. Rational engineering coupled with structural analysis and functional prediction is an efficient genetic approach to optimize enzyme properties [9,10]. To date, many thermostable endoglucanases from diverse origins and GH families have been engineered to improve their specific activity and thermal stability [11][12][13][14].
N-glycosylation is a ubiquitous posttranslational modification involving the covalent attachment of a carbohydrate unit at the asparagine residue within the sequon Asn-Xaa-Ser/Thr (where Xaa cannot be Pro) [15]. Cellulases secreted from filamentous fungi are often decorated with N-glycosylation, which plays varied roles in myriad biological functions [16]. For example, the modification of N-glycans in GH6 and GH7 family cellobiohydrolases can improve their biochemical properties in some cases, including thermal and proteolytic stability, hydrolytic activity, and substrate binding [16][17][18][19][20]. In addition, N-glycosylation affects the proper folding, enzymatic characteristics, and production of fungal GH1 and GH3 family β-glucosidases [21,22]. However, the underlying role of N-glycosylation in the enzymatic activity and stability of endoglucanases is rarely reported [16], especially for industrially important thermostable endoglucanases. Furthermore, ideal engineering of thermostable endoglucanases should be conducted, along with modification of N-glycosylation sites, to generate mutants with optimized performance.
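The sequon rule described above (Asn-Xaa-Ser/Thr, with Xaa not Pro) can be illustrated with a short pattern search. The following is a minimal sketch, not the prediction method used in this work; the function name `find_sequons` and the toy sequence are hypothetical, and the presence of a sequon is only a necessary condition for N-glycosylation, not proof that the site is occupied.

```python
import re

# Asn-Xaa-Ser/Thr where Xaa is any residue except Pro.
# A zero-width lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Hypothetical toy sequence (NOT the CTendo45 sequence).
toy = "MKNATLLNPSQNGTA"
print(find_sequons(toy))  # [3, 12]; N8 is skipped because Xaa is Pro
```

A dedicated predictor additionally weighs the sequence context around each sequon, so a plain pattern match like this one should be read as a first-pass filter only.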
In our previous study, a novel thermostable β-1,4-endoglucanase, CTendo45, which is a typical GH45 family member, was identified in the thermophilic fungus Chaetomium thermophilum [12,23,24]. In this study, rational engineering was performed to further improve the specific activity and thermostability of CTendo45 by removing the existing N-glycosylation sites and introducing additional ones using site-directed mutagenesis, providing a feasible pathway for improved enzyme redesign proposals.

Structural characterization
Homologous modeling is an efficient and general approach to predict the three-dimensional (3D) structure of structurally uncharacterized proteins [25,26].
Although the 3D structure of CTendo45 has not been solved, crystal structures of some homologous GH45 endoglucanases are available from different organisms [27][28][29][30]. Among them, the Thielavia terrestris β-1,4-endoglucanase TtCel45A (PDB: 5GLY), which is a typical GH45 endoglucanase and shares the highest amino acid identity (64%) with CTendo45, was used as a basis to construct the protein model for structural characterization [28]. The root-mean-square deviation (RMSD) value and the global model quality estimation (GMQE) score used for model quality evaluation are 0.095 Å and 0.8, respectively, indicating that the model is reliable for homologous structural analysis [31,32]. As shown in Fig. 1, CTendo45 possesses a six-stranded β-barrel framework and a characteristic region with several interconnecting loops. These two main portions are separated by a substrate-binding cleft, which contains the catalytic center (subsites −4 to +3) [28]. The latter comprises Asp144 (catalytic acid) and Asp32, the catalytic base that confers nucleophilic enhanced character to the catalytic water [29,33,34]. Residue Asn88, the unique N-glycosylation site in CTendo45 (Additional file 1: Table S1), lies at the β-barrel domain on the opposite side of the catalytic domain and is decorated with a branched oligosaccharide chain (Fig. 1).

Production and purification of mutant enzymes
To determine the enzymatic characteristics, CTendo45 and its mutants were heterologously expressed in P. pastoris and purified using Ni²⁺ affinity chromatography. The similar protein yields implied that the attachment of N-glycans to different sites has hardly any influence on efficient enzyme expression (Additional file 1: Table S2). SDS-PAGE analysis showed that each single mutant with an additional N-glycosylation site presented as a single band at approximately 34 kDa, which is higher than that of the wild type (approximately 32 kDa) (Fig. 3a) because of the artificial attachment of oligosaccharides under the same culture conditions [36]. After the deglycosylation of N-linked glycans with PNGase F, the enzyme molecular mass decreased to 28 kDa, which was consistent with the size of the N-deglycosylated mutant T90A. The molecular mass of the double-mutant R67S/T90A is nearly the same as that of the wild type (Additional file 1: Fig. S3). Glycoprotein staining further confirmed that residue N88 was the unique N-glycosylation site in CTendo45. Moreover, this result indicated that the N-glycan was successfully attached to each additional mutation as a single clear band (Fig. 3b). In particular, it is worth mentioning that the bands of T90A and deglycosylated wild-type also appeared in glycoprotein staining, which resulted from O-linked glycans, as there are three predicted O-glycosylation sites (T52, T235, and T237) in CTendo45. The mass spectrum of the tryptic digest of CTendo45 and its mutants further confirmed the addition or deletion of N-glycan in a specific position (Additional file 1: Table S3 and Fig. S4).

Fig. 1 Homologous modeling of CTendo45 using the Thielavia terrestris β-1,4-endoglucanase TtCel45A (PDB: 5GLY) as a template. a The solvent accessible surface. Cellotriose and cellotetraose molecules, bound in the substrate-binding cleft, are shown in cyan. The branched N-glycan attached to the residue Asn88 in CTendo45 is represented as a white and red stick. The V-VI loop is noted in purple. Catalytic residues and the additional histidine residues are noted in orange and yellow, respectively. The intrinsic histidine residue was noted in red. b The 3D superposition between CTendo45 (marine) and TtCel45A (wheat). The spatial positions of valuable residues used in this study are marked by colored sticks. A front view of substrate-binding cleft. c Rotating the configuration 90° anti-clockwise. d Rotating 90° clockwise. All of the structural diagrams were drawn using PyMOL software.

Fig. 2 Sequence alignment of CTendo45 with other homologous GH45 endoglucanases using Clustal Omega. These enzymes are isolated from Thielavia terrestris (PDB: 5GLY, 64% of sequence identity), Magnaporthiopsis poae ATCC 64411 (KLU88048, 77% of sequence identity), Madurella mycetomatis (KXX82926, 78% of sequence identity), Coniochaeta ligniaria NRRL 30616 (OIW24112, 78% of sequence identity), and Rosellinia necatrix (GAP84246, 72% of sequence identity), respectively. Asterisk indicates the positions which have a single, fully conserved residue. Colon indicates the strongly similar parts among homologous sequences and period means the weakly similar parts among homologous sequences. The potential signal peptide is signed with a black arrow and the V-VI loop is signed with a purple arrow. Conserved and nonconserved asparagine residues in CTendo45 are noted by closed and open inverted triangles, respectively. Highlight blocks specify the catalytic residues Asp32 and Asp144 of CTendo45 in orange. The modifiable and intrinsic N-glycosylation sequences are shaded in yellow and red, respectively. All mutant sites in this study are noted by blue vertical arrows.

Effect of N-glycosylation on activity
The optimal pH values displayed no obvious differences, with relatively high catalytic activity against CMC-Na under acidic conditions (pH 4) (Fig. 4a). In addition, all mutants had temperature optima similar to that of wild-type CTendo45 at 60 °C (Fig. 4b). These results were consistent with those of the optimum activity assay using the native substrate β-d-glucan (Additional file 1: Fig. S5), which is likely attributable to the lack of apparent conformational rearrangements as a result of changes in N-glycosylation sites [37]. Compared with the wild type, the mutant T90A, which eliminates the unique N88-glycosylation site, showed enhanced hydrolytic activity towards CMC-Na and β-d-glucan, with increases of 1.85- and 1.64-fold, respectively (Table 2). The intrinsic glycosylation with the branched glycan attached to residue N88 acts as a fastening clamp that restrains the conformation of the backbone β-barrel domain and further maintains a relatively crowded and confined structure for the whole protein, especially for the substrate-binding cleft [38]. The absence of the glycan, therefore, initiates a moderate loosening of the buried cleft, leading to functional improvement of the catalytic residues [16]. In addition, a distinct enhancement of hydrolytic performance was noted for the mutant R67S (Table 2), probably attributable to the dynamic interaction between cellulose chains and the additional oligosaccharide attached at residue N65, which is located close to the cleft (Fig. 1) [17,39]. Based on these observations, the double-mutant R67S/T90A was generated. The activity of R67S/T90A was increased and reached maxima of 2.26- and 1.94-fold against CMC-Na and β-d-glucan, respectively (Table 2).
Among the other mutants, the activity of L47T and Q56T improved to a certain degree, and the mutant G63T had slightly reduced activity. The mutation of F143 to T or S resulted in appreciably impaired activity (Table 2). The residue N141 in CTendo45, homologous to N119 in TtCel45A, was adjacent to the flexible V-VI loop, which is acknowledged as an active regulatory switch in cooperation with the catalytic acid D144 during catalytic reactions (Fig. 1) [28]. Therefore, the addition of a glycan chain at N141 actually blocked the dynamic active space and seriously impeded the specific function of the V-VI loop. Residue W169 is a significant component of the linker joining two α-helices; as a result, the additional branched N-glycan at N167 would destroy the structural configuration [36]. Alternatively, the additional glycan can generate a steric hindrance to perturb the ability of the catalytic base D32 (Fig. 1) [40], resulting in a near-complete loss of the enzyme's ability to hydrolyze both CMC-Na and β-d-glucan. As mentioned above, N-glycosylation at different positions exerted diverse effects on enzyme activity, which is consistent with the results of recent studies on other glucoside hydrolases [17,37,41]. Residue substitutions at either the F143 or W169 position had a strong negative influence on enzymatic activity (Table 2); hence, the mutants F143S, F143T, and W169S were not further analyzed in this study.

Effect of N-glycosylation on thermostability
To determine the effect of glycosylation on enzyme thermostability, the hydrolytic activities of these endoglucanases were detected after preincubation at different temperatures ranging from 30 to 90 °C for 200 min. T90A exhibited excellent heat resistance after treatment at high temperatures with CMC-Na (Fig. 4c). Consistent thermostability results were obtained for T90A using β-d-glucan as a substrate (Additional file 1: Fig. S5). Among the other single mutants, L47T and R67S were more thermostable than CTendo45 at elevated temperatures, while both Q56T and G63T lost nearly all hydrolytic activity after 200 min of incubation at 90 °C with each substrate. The high-temperature resistance of the double-mutant R67S/T90A was much greater than that of the single-mutation counterparts. Furthermore, the half-lives (t1/2) consistently revealed that four mutants, L47T, R67S, T90A, and R67S/T90A, exhibited improved enzyme thermostability at 80 °C and 90 °C (Table 3).
Previous studies have demonstrated that the optimized stabilization introduced by additional glycosylation is closely associated with entropy, which is largely dependent on the positions of glycosylation sites [38,42]. Glycans attached to the flexible region within random coils would, in general, inherently confine the conformational space and encourage entropic reduction to a point to enhance conformational stability at high temperatures [17,36]. Additionally, the polarity of the protein surface would significantly change after glycosylation, extending the tertiary structure and exposing some hydrophobic amino acids to a more hydrophilic environment [43]. Nevertheless, the lower thermostability of Q56T and G63T could be related to other complicated structural determinants that increase the protein's entropy, for instance, the destruction of hydrogen bonds and the perturbation of secondary structures through deglycosylation [44,45]. The greater thermostability of T90A appeared to be connected with the amino acid position located in the backbone structure of the β-barrel region. The conjugated glycan might have acted as a strong clamp and tightened the enzyme spatial conformation with increased configurational entropy [38]. Thus, the reduction in the entropy of the folded state, in the case of deglycosylation, might repress fluctuations in a more stable structure and thereby reduce heat sensitivity [46]. Thermostability is a complex property that can be controlled by several factors; therefore, the thermodynamic mechanism has not yet been fully elucidated [5]. In this case, additional details should be pursued in future research to ascertain the reasons for the enhanced thermostability resulting from the modification of N-glycosylation sites.

Effect of N-glycosylation on kinetic characterization
The kinetic parameters against CMC-Na were determined at the optimum enzyme conditions of 60 °C and pH 4 (Table 4). The catalytic efficiency of T90A was significantly increased, as the kcat/Km value was 1.57-fold greater than that of native CTendo45, and this trend was consistent for the Vmax and kcat values of T90A. The additional mutant R67S also showed an obvious increase in turnover rate and catalytic efficiency with elevated kcat and kcat/Km values. The double-mutant R67S/T90A inherited the improved enzymatic performance from its single-mutation counterparts. In contrast, the mutant G63T had a reduction in catalytic efficiency, and the kcat/Km value was lower than that of other endoglucanases. The kinetic parameters of each enzyme were also measured with barley β-d-glucan as the native substrate (Table 4). These results indicated that the N-glycosylation modifications, except for that of G63T, effectively improved the catalytic efficiency towards cellulose substrates. The covalent bond connecting the oligosaccharide and the N-glycosylation site could adjust the energy landscape of glycoproteins, resulting in substantial changes in kinetic properties [47]. The deglycosylated mutant exhibited increased catalytic efficiency, probably due to a relatively flexible protein structure that in turn influenced the location and function of amino acids at the active site [48]. The residue N65 is located near the active site cleft, and the branched N-glycan would unavoidably interact with cellulose chains. To tear away a single chain from the surface of cellulose molecules, multiple intermolecular hydrogen bonds must be formed, and the free energy of new hydrogen bonds can facilitate the hydrolytic process [17]. The discrepancies in kinetic characterization among L47T, Q56T, and G63T are mainly attributed to their different amino acid positions (although all of them are situated in the flexible loop), where the glycan would play different roles in enzyme properties [41].
From a practical perspective, the efficient catalytic activity of enzymes is a crucial prerequisite for industrial applications. Consequently, T90A, R67S, and their double-mutant R67S/T90A, which showed considerable thermostability and elevated catalytic efficiency, are regarded as prospective candidates for widespread biotechnological applications. More remarkably, the N88- and N65-glycosylation sites are highly conserved in homologous sequences of CTendo45. Therefore, the design principle regarding the use of N-glycosylation site modification to optimize enzyme properties may be widely applied to homologous endoglucanases sharing a similar structure with CTendo45 or even to the majority of GH45 family members.

Conclusion
In this study, we investigated for the first time each modifiable N-glycosylation site of a thermostable endoglucanase and obtained three mutants, T90A, R67S, and the double-mutant R67S/T90A, with superior catalytic activity and thermostability compared to the wild-type and other mutants. This work provides preliminary insight into the biological functions of N-glycosylation in thermostable endoglucanases and has referential significance for engineering homologous and structurally similar enzymes to improve enzymatic performance via rational design.

Materials
Escherichia coli T1 (TransGen, Beijing, China) was used for gene cloning and sequencing. Pichia pastoris GS115 (Invitrogen, Carlsbad, CA, USA) was used as a heterologous expression host. The pPIC9K vector (Invitrogen) was used for methanol-inducible expression in P. pastoris. The recombinant plasmid pPIC9K/ctendo45, harboring the endoglucanase-encoding gene ctendo45 (GenBank Accession no. KC441877) and a 6× His-tag at the C-terminus, was previously constructed [12]. A Fast Mutagenesis System Kit was purchased from TransGen. Mutagenic primers were synthesized by Sangon (Shanghai, China) and are summarized in Additional file 1: Table S4. All chemicals were of analytical grade.

Table 4 Kinetic parameters of CTendo45 and its mutants against CMC-Na and β-d-glucan. Values are mean ± SD of three replicates.

Mutagenesis of CTendo45
N-glycosylation site and O-glycosylation site analyses were carried out using the NetNGlyc 1.0 Server (http://www.cbs.dtu.dk/services/NetNGlyc/) and the NetOGlyc 4.0 Server (http://www.cbs.dtu.dk/services/NetOGlyc/), respectively. Homologous sequence alignment was performed using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/). The homology-modeled structure of the Thielavia terrestris β-1,4-endoglucanase TtCel45A (PDB: 5GLY) was used to predict the biological function of candidate mutation sites [28]. Homologous modeling was performed using the online software SWISS-MODEL. To remove the existing N-glycosylation sites and introduce additional ones, different residues were selected to produce one deletion mutant (T90A), seven single addition mutants (L47T, Q56T, G63T, R67S, F143T, F143S and W169S), and one double mutant (R67S/T90A). Each target mutant plasmid was generated by site-directed mutagenesis with the plasmid pPIC9K/ctendo45 as a PCR template and then transformed into E. coli T1. Positive colonies were picked on LB agar plates supplemented with 50 µg/mL kanamycin after culture at 37 °C for 14 h and ultimately verified by DNA sequencing with AOX1 gene primers and self-primers (Additional file 1: Table S4).
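The design logic of the mutants listed above is that a substitution at position n+2 can either create a sequon (as R67S does for N65) or destroy one (as T90A does for N88). A minimal sketch of that check follows; it assumes only the Asn-Xaa-Ser/Thr rule, the helper names (`sequon_sites`, `apply_mutation`, `sequon_changes`) are hypothetical, and the toy sequence is invented for illustration and is not the CTendo45 sequence.

```python
import re

# Asn-Xaa-Ser/Thr with Xaa != Pro; lookahead allows overlapping matches.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_sites(seq):
    """Set of 1-based Asn positions that start an N-X-S/T sequon."""
    return {m.start() + 1 for m in SEQUON.finditer(seq)}


def apply_mutation(seq, mutation):
    """Apply a point mutation written like 'T90A' (old residue, 1-based position, new residue)."""
    wt, pos, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    if seq[pos - 1] != wt:
        raise ValueError(f"expected {wt} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + new + seq[pos:]


def sequon_changes(seq, mutation):
    """Return (gained, lost) sequon positions caused by one point mutation."""
    before = sequon_sites(seq)
    after = sequon_sites(apply_mutation(seq, mutation))
    return sorted(after - before), sorted(before - after)


# Hypothetical toy sequence with one existing sequon at N8 (N-C-T).
toy = "AANGRAANCTA"
print(sequon_changes(toy, "R5S"))   # ([3], []): N3-G4-S5 is created
print(sequon_changes(toy, "T10A"))  # ([], [8]): N8-C9-T10 is destroyed
```

Such a check only confirms that the intended sequon appears or disappears in the primary sequence; whether the new site is actually glycosylated still has to be verified experimentally, as done here by SDS-PAGE, glycoprotein staining, and mass spectrometry.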

Heterologous expression in Pichia pastoris
The identified recombinant plasmid was linearized using SacI (Fermentas, Thermo Scientific, Waltham, MA, USA) and electrotransformed into P. pastoris GS115 [49]. The transformants that emerged on MD and MM plates were inoculated onto YPD medium plates supplemented with 1-4 mg/mL G418 (Sangon) and cultured at 28 °C for 3 days to select multicopy integrants [12]. PCR amplification was performed with AOX1 primers based on the genomic DNA extracted from the acquired multicopy colony to confirm the presence of the mutant plasmid. Enzyme induction was carried out according to the protocol of a Pichia Expression Kit (Invitrogen) [50].

Purification and SDS-PAGE analysis
After 7 days of methanol-induced culture, the cell-free culture supernatant was collected by centrifugation at 8000 rpm for 15 min. Then, the supernatant was precipitated with ammonium sulfate at 80% saturation and 4 °C overnight. The suspension was centrifuged at 8000 rpm for 15 min and the precipitate was dissolved in 20 mM phosphate buffer solution (pH 7.4). Afterwards, His-tagged recombinant enzymes were purified using Ni²⁺ affinity chromatography (HisTrap™ FF crude; GE Healthcare, Buckinghamshire, UK), as previously described [49]. Protein concentrations were estimated using a Pierce™ BCA Protein Assay Kit (Thermo Scientific). SDS-PAGE analysis was carried out in a 12% (w/v) polyacrylamide gel, and staining was conducted with Coomassie blue R-250 (Sigma-Aldrich, St. Louis, MO, USA) and a Pierce™ Glycoprotein Staining Kit (Thermo Scientific), respectively. PNGase F, which is the most effective enzyme for specifically removing N-linked glycans (but not O-linked glycans) from glycoproteins [51], was obtained from New England Biolabs (Ipswich, MA, USA).

Enzymatic activity assay
β-d-glucan and CMC-Na (400-800 centipoise in water at room temperature) were purchased from Sigma-Aldrich and used as substrates. The reaction system comprised 150 µL of 1% (w/v) CMC-Na or 0.2% (w/v) β-d-glucan and 15 µg of purified enzyme in a 300 µL reaction mixture. The reaction was incubated at 60 °C for 30 min and terminated by the addition of 300 µL of 3,5-dinitrosalicylic acid reagent in a boiling water bath for exactly 10 min. After the sample was cooled to ambient temperature, the absorbance was read at 540 nm [52]. One international unit (IU) of enzyme activity was defined as the amount of enzyme that catalyzes the liberation of reducing sugar equivalent to 1 μmol of glucose per minute under the reaction conditions. Each experiment was performed in triplicate.
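The unit definition above translates directly into a small calculation. The sketch below assumes that the amount of reducing sugar (in μmol glucose equivalents) has already been obtained from the absorbance at 540 nm, for example via a glucose standard curve, which is an assumption not detailed in the text; the function names and the numerical reading are hypothetical and are not measured data.

```python
def activity_iu(glucose_umol, minutes):
    """Enzyme activity in IU: micromoles of glucose equivalents released per minute."""
    return glucose_umol / minutes


def specific_activity(glucose_umol, minutes, enzyme_mg):
    """Specific activity in IU per mg of enzyme."""
    return activity_iu(glucose_umol, minutes) / enzyme_mg


# Hypothetical reading: 0.9 umol glucose equivalents released
# during the 30 min assay by 15 ug (0.015 mg) of enzyme.
sa = specific_activity(0.9, 30, 0.015)
print(round(sa, 2))  # 2.0 IU/mg
```

Fold changes such as those reported for the mutants are then simply the ratio of the mutant's specific activity to that of the wild type measured under the same conditions.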

Biochemical characterization
The optimal pH was measured in multiple buffer solutions at 50 mM concentrations, including acetate buffer (pH 3-6), sodium phosphate buffer (pH 6-8), and Tris-HCl buffer (pH 8-9). The optimal temperature was evaluated from 30 to 90 °C at the optimal pH value [53]. Thermostability was determined by detecting the residual enzyme activity after the enzyme was preincubated at 30-90 °C for 200 min [54]. Moreover, the half-life (t1/2), which was defined as the time at which the enzyme activity declined to half of its initial activity value at that temperature, was investigated at 80 °C and 90 °C [55].
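One common way to estimate a half-life from residual-activity data is to assume first-order thermal inactivation, A(t) = A0·exp(−kt), so that t1/2 = ln 2 / k. The text does not state which fitting procedure was used, so the following is only a sketch under that assumption; the function name `half_life` and the data points are hypothetical (synthetic values generated for a 100 min half-life).

```python
import math


def half_life(times_min, residual_fraction):
    """Estimate t1/2 (min) by a least-squares line through ln(residual activity) vs time.

    Assumes first-order inactivation; residual_fraction is activity / initial activity.
    """
    ys = [math.log(a) for a in residual_fraction]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
    k = -sxy / sxx  # inactivation rate constant (1/min)
    if k <= 0:
        raise ValueError("activity does not decay; half-life is undefined")
    return math.log(2) / k


# Synthetic, noise-free example constructed to have t1/2 = 100 min.
times = [0, 50, 100, 200]
residual = [1.0, 2 ** -0.5, 0.5, 0.25]
print(round(half_life(times, residual), 1))  # 100.0
```

If the inactivation is not first order (for example, biphasic decay), the half-life is better read directly from the time course as the point where the activity crosses 50% of its initial value, which matches the definition given above.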

Kinetic parameters
The reaction was performed in 50 mM acetate buffer (pH 4) at 60 °C for 30 min using an appropriate equivalent amount of diluted enzyme with varying concentrations of CMC-Na (1-10 mg/mL) and β-d-glucan (0.5-5 mg/mL). Kinetic parameters were calculated according to the Michaelis-Menten equation [56].
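The Michaelis-Menten equation, v = Vmax·[S] / (Km + [S]), can be fitted in several ways, and the text does not specify which was used. The sketch below uses the Hanes-Woolf linearization ([S]/v = [S]/Vmax + Km/Vmax) purely because it needs no external libraries; nonlinear regression is generally preferred for real data. All function names and the data are hypothetical (synthetic, noise-free values generated with Vmax = 10 and Km = 2).

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_hanes_woolf(s, v):
    """Estimate (Vmax, Km) from a linear regression of [S]/v against [S]."""
    y = [si / vi for si, vi in zip(s, v)]
    n = len(s)
    mean_s = sum(s) / n
    mean_y = sum(y) / n
    sxx = sum((si - mean_s) ** 2 for si in s)
    sxy = sum((si - mean_s) * (yi - mean_y) for si, yi in zip(s, y))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_s  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def kcat(vmax, enzyme_conc):
    """Turnover number: Vmax divided by total enzyme concentration (consistent units)."""
    return vmax / enzyme_conc


# Synthetic substrate series (mg/mL) and rates generated from Vmax = 10, Km = 2.
substrate = [1, 2, 4, 6, 8, 10]
rates = [michaelis_menten(s, 10.0, 2.0) for s in substrate]
vmax_fit, km_fit = fit_hanes_woolf(substrate, rates)
print(round(vmax_fit, 2), round(km_fit, 2))  # 10.0 2.0
```

The catalytic efficiency compared between mutants is then kcat/Km, with kcat obtained from Vmax and the molar enzyme concentration used in the assay.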