Endoglucanases: insights into thermostability for biofuel applications
© Yennamalli et al.; licensee BioMed Central Ltd. 2013
Received: 4 June 2013
Accepted: 24 September 2013
Published: 27 September 2013
Skip to main content
© Yennamalli et al.; licensee BioMed Central Ltd. 2013
Received: 4 June 2013
Accepted: 24 September 2013
Published: 27 September 2013
Obtaining bioethanol from cellulosic biomass involves numerous steps, among which the enzymatic conversion of the polymer to individual sugar units has been a main focus of the biotechnology industry. Among the cellulases that break down the polymeric cellulose are endoglucanases that act synergistically for subsequent hydrolytic reactions. The endoglucanases that have garnered relatively more attention are those that can withstand high temperatures, i.e., are thermostable. Although our understanding of thermostability in endoglucanases is incomplete, some molecular features that are responsible for increased thermostability have been recently identified. This review focuses on the investigations of endoglucanases and their implications for biofuel applications.
The use of plants as bioreactors is not a new concept. Within the last fifteen years, studies have established that with the knowledge of biotechnology and genetic engineering, plants are indeed a low-cost source to produce stable molecules: plants are harnessed to produce antibodies, biodegradable plastics, recombinant proteins, carbohydrates, and fatty acids. A major goal of plant-based bioreactor technology is the production of stable enzymes to produce industrially useful products. Among these enzymes are manganese-dependent lignin peroxidase (for bleaching of pulp), phytase (for animal feed), (1–3,1-4)-β-glucanase (for brewing), and xylanase (for animal feed, paper and baking), which are used in processing plants such as alfalfa, tobacco, and barley [1–8]. In addition, as a result of growing environmental concerns of consuming fossil-based fuels, enzymes used in the production of plant-based ethanol (i.e., bioethanol) gained more importance in recent years, including α-amylases and endoglucanases.
α-amylase and endoglucanase are both involved in the conversion of plant material into sugar; however, there is a critical difference when it comes to which part of the plant they catalyze: while α-amylases break down starch from maize grain, which is primarily used as food by humans and animals, endoglucanases break down cell-wall cellulose primarily from maize stover (i.e., leaves and stalk), which has been historically considered as waste. Despite the obvious drawbacks of using a crucial food item as a fuel source, maize grain remains a preferred choice as the biofeedstock for the ethanol production, because of its ease of harvesting in a large acreage and of introducing new traits.
The demand for bioethanol is expected to increase significantly in the near future: according to the Billion Ton Study published in August 2011 , 76 million tons of maize is used for the production of ethanol and yields 14.2 billion gallons of ethanol fuel per year. By 2017, the consumption of biomass for biofuel production is projected to be 103 million tons. Therefore, with the ever-increasing demand for energy sources, there is a strong need to look at other non-grain sources of biomass. While many feedstock sources (e.g., perennials such as switchgrass and Miscanthus, and biomass from forests) may take several years to become fully developed as resources, maize stover is a readily available resource for biofuel production .
Some efforts have been made to reduce the cost of cellulose breakdown in maize stover to make the process more cost-effective, and therefore more attractive for the bioethanol industry. Traditionally, industrial enzymes are added into a batch of reagents in reactors during chemical processing. Controlling pH and temperature levels become critical, because even small environmental variations can cause enzyme denaturation and subsequent loss of enzymatic activity . Cellulose-based bioethanol processing therefore requires several reactors, including one for a high-temperature pre-treatment (~75°C) and another for low-temperature endoglucanase-mediated cellulose conversion into sugar. One strategy currently being explored to improve the efficiency and lower the cost of these conversion steps is to develop feedstocks (such as maize stover) with heterologously expressed thermostable endoglucanases. Such a strategy will allow bypassing the need to add an externally sourced enzyme and possibly reduce number of processing steps [11–14]. To help in these efforts, we need to understand what factors, including protein sequence, structure, and dynamics, can be used to engineer thermostable endoglucanases.
Thermostability is a complex property that can be controlled by several factors, which may be working additively. Many studies comparing mesostable and thermostable proteins concluded that following properties may increase thermostability in general: selective pressure of certain amino acids , increase in hydrophobicity , change in a single amino acid , increase in compactness , increase in positively charged amino acids , and Gibbs free energy change of hydration . These studies are usually general in the sense that they compare proteins across protein families, folds, sizes, and cellular location. When it comes to a specific protein family, however, our knowledge is usually limited as to what differentiates thermostable and mesostable proteins in that protein family.
In this review, we will first provide a short description of endoglucanase-mediated cellulose conversion process. Then we will return our attention to the main question of the thermostability-sequence-structure relationship in endoglucanases.
Although the composition varies between different plants, 40-55% of the plant biomass is comprised of cellulose (a homopolymer made of repeating units of glucose), 25-50% of hemicelluloses (a heteropolymer made of glucose and other sugars) , and the remaining 10-40% is made up of lignin (a complex organic polymer) . Conversion of cellulose polymers to simple sugars requires the use of cellulases. Cellulase is comprised of three distinct classes of enzymes (endoglucanases, cellobiohydrolases, and β-glucosidases) that act synergistically to break down the polymer. Endoglucanases act by cleaving internal β-glycosidic bonds in the cellulose chain, thereby making chain ends accessible to cellobiohydrolase. The end product cellobiose is further broken down to glucose units by β-glucosidase. Endoglucanases have an enzyme classification number 22.214.171.124 and belong to the broader enzyme group called glycosyl hydrolases, which also includes other cellulases such as exoglucanase and β-glucosidase. According to the CAZy database (http://www.cazy.org), endoglucanases are part of 13 distinct glycosyl hydrolase families, distributed in several archeal, bacterial, fungal, and eukaryotic organisms.
From the agricultural field to the gas station, biofuel production from cellulosic biomass involves numerous steps that are strategically designed for efficient conversion . In one of the earlier steps, pre-processing of the biomass occurs at high temperatures (75°C) coupled with a dilute sulfuric acid treatment . This is done to liberate lignin, hemicellulose, and other compounds, and make the cellulosic polymers available for enzymatic degradation. Inefficient liberation of these compounds can lead to the enzyme inhibition, and therefore, reduce the efficiency of cellulose conversion. Currently, the efficiency of ethanol production from lingocellulosic biomass is physically and economically constrained by the pretreatment processes and the subsequent addition of cellulases to affect cellulose breakdown [25–27].
Industrial cellulases, particularly endoglucanases, are currently obtained from the fermentation of fungal and bacterial sources that are added to the production batch after the pretreatment step. The expression of endoglucanases from these external sources adds to the final cost of the bioethanol product. There is an intense interest in exploiting the potential of thermostable bioprocessing enzymes [28, 29].
Due to the pH and temperature extremes involved in the biomass-to-biofuel conversion, stable endoglucanases and its synergistic enzymes, such as cellobiohydrolase and β-glucosidase, are sought for the enzymatic conversion of biomass. Thermostable endoglucanases from extremophiles are considered promising because they typically exhibit valuable characteristics in biofuel production including optimal functionality at higher temperatures and the ability to withstand extreme pH changes .
The strategy of heterologously expressing a thermostable endoglucanase in maize stover bypasses the need to add an externally sourced enzyme [11, 14]. This way, the enzyme (in the biomass itself) can breakdown cellulose immediately following the pre-processing step. Furthermore, if the stable enzyme is expressed in the host tissue for biomass conversion, there can be substantive gains in production efficiency. This concept was shown to work previously with a non-cellulase enzyme during the breakdown of starch in maize grain: a synthetic chimera of three wild-type α-amylases from the archea Thermococcales that have an optimal growth temperature of above 80°C . Introducing endoglucanase into plants reduces the recalcitrance of cellulose [12, 13], since endoglucanases are capable of making random internal cleavage of the polymer, whose hydrolysis products are used by other enzymes. Transgenic plant expression is ideal for many reasons, including (a) almost unlimited scale-up potential, (b) cheap production – i.e. photosynthesis, (c) correct protein folding, (d) lack of human pathogens, and (e) the potential for direct use . Cellulases currently account for approximately $0.68 to $1.47 per gallon of ethanol produced from cellulosic feedstocks , but this cost component could be reduced 5-fold with in planta expression [34–36].
A closer look at endoglucanases originating from bacterial sources shows significant diversity in their optimal temperature for enzymatic function. Some of these are optimally functional at elevated temperatures and thus thermostable. In the next section, we are interested in the following questions: What could be the differences between a thermostable and a mesostable endoglucanase? If there are any, can we translate these differences to modify an existing endoglucanase into a more thermostable protein? We will pursue the answers to these questions by reviewing computational analyses of thermostability in endoglucanases.
Enzymes that can (a) withstand high temperatures, (b) resist unfolding, and (c) perform their optimal activity at higher temperatures are called thermostable or thermophilic. Thermostable enzymes are highly stable at elevated temperatures where mesostable enzymes become denatured and thus lose their optimal activities [30, 37, 38].
Generally speaking, thermostability is a desired quality for proteins that have industrial and therapeutic significance [28, 29, 39–44]. Introducing thermostability has been one of the major focuses of protein engineering studies, specifically of computational studies, which can be divided in three broad categories in terms of number of enzymatic families and organisms analyzed: the proteome of a thermophilic organism to the proteome of a mesophilic organism , (b) proteins from multiple organisms belonging to a range of different protein families [15, 16, 18, 20, 46–48], and (c) a single protein family between thermophilic and mesophilic organisms [17, 49, 50].
Such studies have identified various factors imparting thermostability, including sequence-level factors (specific amino acids like Arg and Glu being significantly higher in thermophiles) and structure-level factors (energy of unfolding, number of Van der Waals contacts per residue, number of hydrogen bonds per residue, or number of residues involved in secondary structure). Other factors such as Gibbs free energy change of hydration, long-range non-bonded energy, and hydrophobicity, have also been mentioned. Although most of the studies attempted to identify “universal” factors imparting thermostability across protein families, at the protein family level, another set of factors (e.g., a different amino acid composition) may determine thermostability . This inability to identify common features responsible for thermostability provided evidence for the view that no single rule defines thermostability .
In the previous studies of thermostable endoglucanases, Panasik et al.  analyzed the thermostability factors for proteins belonging to GH families that have the identical (α/β)8 fold. The lack of Gly in thermostable glycosyl hydrolase (which includes endoglucanase) was identified as the responsible factor. However, this analysis falls short in the following areas when it comes to its applicability to other endoglucanases: (a) the criteria of selecting the dataset of 29 proteins were based on higher crystallographic resolution rather than on diverse sequence identities, (b) only three endoglucanase structures were studied, and (c) the study does not individually analyze endoglucanases, but as part of a larger group of GH families.
Recently, a directed evolution approach has been used to identify an endoglucanase with higher thermostability from a thermophilic endoglucanase . Although directed evolution has the ability to successfully find a mutant with higher stability and similar enzymatic activity, it is not guaranteed to do so. As such, finding a desirable mutant is a matter of trial and error, and does not necessarily explain why a particular mutant is thermostable or not. A promising method is using SCHEMA, where the structure of the protein is also used. This was used in developing thermostable chimeric cellobiohydrolases, of which many exhibited higher stability and optimal activity [53, 54].
Abundant structural information is present for endoglucanases in three different folds; namely (α/β)8, β-jelly roll, and (α/α)6 fold. Segregating and analyzing the structural data on the basis of these three different fold families has indicated that the type of fold is critical in identifying specific thermostabilizing features for endoglucanases .
Proteins are constantly in motion, and an understanding of their dynamics will inform their functions . Coarse-grained models, such as elastic network models, help in identifying the biologically meaningful motions of a protein . Unlike molecular dynamics, which analyzes the dynamic motions of a protein at the atomistic level and are fine-grained, elastic network models perform at the amino acid level and are coarse-grained. An advantage of using coarse-grained models is the ability to observe the longer time-scaled motions in a short time-scale .
As a proof of concept, the dynamics of a pair of thermostable and mesostable endoglucanases using coarse-grained models was compared to identify dynamic differences . For this purpose a pair of structurally highly similar endoglucanases with lower sequence similarity was selected. The ENM analyses showed that both the thermophile and mesophile displayed open/close and shear-type motions in the slow modes, and the loops that face the substrate binding side were observed to be more mobile than the non-substrate binding side. In contrast, the thermophile had large dynamic blocks moving in concert within the catalytic domain, providing more stability to the thermostable protein. Differences were also observed in the catalytic residues (i.e., nucleophile and acid/base donor): in thermophiles they showed positively correlated motions while they remain uncoupled in the mesophile.
Even a single mutation can significantly increase the thermostability of cellulases and their optimal activities [17, 52, 58]: (1) a Cys to Ser mutation of a cellobiohydrolase resulted in an increased thermostability by 8°C and a 10-fold increase in expression  and (2) two positions of an endoglucanase were identified to be crucial for increasing activity . These studies provide evidence that only a few mutations can improve stability and activity significantly. Unfortunately two of these studies lack experimentally determined tertiary structures [52, 58] and their claim of intramolecular bonding as a possible candidate for imparting thermostability is not easy to validate without a structure.
List of plant species where E1 endoglucanase has been expressed and its activity studied
% E1 of TSP
Tobacco (Nicotiana tabacum)
0.18 - 1.35%
0.0007 - 0.015%
0.003 - 0.67%
0.09 - 1.6%
0.06 - 12.0%
Maize (Zea mays)
0.01 - 1.16%
0.2 - 2.0%
0.1 - 0.2%
Arabidopsis (Arabadopsis thaliana)
1.0 - 26.0%
Potato (Solanum tuberosum)
0.73 - 2.6%
0.38 - 0.92%
Duckweed (Lemna minor)
Rice (Oryza sativa)
2.4 - 4.9%
Many studies have analyzed different aspects of E1 expression: the level of heterologous expression , maximum recovery of recombinant enzyme after expression , retention of enzymatic activity after the ammonia fiber explosion treatment (AFEX) process , ability of E1 to access cell wall components , stability of E1 in various sub cellular compartments , and maximum accumulation through specific subcellular targeting . In maize tissues where E1 was expressed, studies focused on E1 stability after expression , activity on AFEX-treated stover , tissue specific production , synergistic action with other cellulases , and its effect on whole plant digestibility .
As of now, E1 expressed in plants has to be harvested and added exogenously to the ethanol process after pretreatment. In this respect, two studies focus on the feedback effect of pre-treatment processes on endoglucanase activity. Brunecky et al. reported that maize stover expressing E1 exhibited greater degradability due to E1 actively hydrolyzing the plant cell wall during growth . This increased processability was particularly interesting as E1 was substantively active during plant growth at ambient temperatures, however, after pretreatment processes (up to 170°C heat) E1 was not considered active in subsequent saccharification. Similarly, Teymouri et al. found that E1 present in biofeedstocks lost at least 65% of the original activity after the ammonia fiber expansion (AFEX) pre-treatment process. AFEX is a pretreatment process that combines use of ammonia under moderate pressure (up to 400 psi) and high temperature (up to 200°C). This causes the cellulose fibers to separate from depolymerized lignin. It has been noted that AFEX is a mild pretreatment strategy compared to alternative pretreatment processes such as steam explosion or acid treatment . Additionally, E1 has been expressed in planta (apoplast) and was found to have internal activity during plant growth that initiated the breakdown of plant cell walls before harvest . This in turn would increase the processability of feedstocks during stover-to-ethanol conversion.
Therefore, more studies are required on E1 that relate to thermostability and E1’s synergistic action in an enzyme cocktail, expression of hyper-thermostable mutants of E1 and its localization, pretreatment technologies, and overall processability.
Production of bioethanol from cellulosic biomass has come to a stage where newer catalysts are required for efficient production. In this review, we have discussed a specific endoglucanase (E1) that is currently seen as an important catalyst, and the factors contributing to thermostability that can be exploited for cellulosic bioethanol production. We explored the question of “What parameters need to be considered to engineer an endoglucanase with suitable thermodynamics, stability, and higher activity?” Any mutant or engineered endoglucanase that preserves the thermostability in terms of the dynamics is a potential industrial candidate. In reviewing literature and current structural studies, we observed that although “universal” rules that impart thermostability are useful for engineers, each protein family is different, and may have different factors imparting thermostability. In endoglucanases, protein sequence, structure, and dynamics all play a critical role in stabilizing the endoglucanase at high temperatures.
In the same vein, engineers should keep in mind the kinetic and thermodynamics constraints of other cellulolytic enzymes (cellobiohydrolase, ligninase, β-glucosidase and E1 endoglucanase) that act synergistically to break down the polymer, which we have not covered in this review . Screening for specific residual activity of endoglucanases in the presence of multiple enzymes can also improve enzyme selection.
Elastic network model
Floopy inclusions and rigid substructure topography
Endoglucanase from acidothermus cellulolyticus
Ammonia fiber explosion treatment.
This project is supported by the Biotechnology Risk Assessment Program Competitive Grant no. 2008-33522-04758 from the USDA National Institute of Food and Agriculture and USDA-Agricultural Research Service.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.