Efficient whole-cell-catalyzing cellulose saccharification using engineered Clostridium thermocellum

Background: Cost-efficient saccharification is one of the main bottlenecks for industrial lignocellulose conversion. Clostridium thermocellum naturally degrades lignocellulose efficiently using the cellulosome, a multiprotein supermolecular complex, and thus can potentially be used as a low-cost catalyst for lignocellulose saccharification. The industrial use of C. thermocellum is restrained largely by the inhibition of its cellulosome by the hydrolysis product cellobiose. Although supplementation with beta-glucosidase (BGL) may solve the problem, producing the enzyme greatly complicates the process and may also increase the cost of saccharification.

Results: To overcome the feedback inhibition and establish an efficient whole-cell catalyst for cellulose saccharification, we used a newly developed seamless genome editing system to construct a recombinant C. thermocellum strain, ∆pyrF::CaBglA, which produces a secretory heterologous BGL fused to the exoglucanase CelS. Without the extra addition of enzymes, the relative saccharification level of ∆pyrF::CaBglA increased more than twofold compared with its parent strain ∆pyrF in a two-stage saccharification process with 100 g/L Avicel as the carbon source. With increased cell density, the production of reducing sugars and the relative saccharification level were further enhanced to 490 mM and 79.4%, respectively.

Conclusions: The high cellulose-degrading ability and sugar productivity suggest that the whole-cell-catalysis strategy for cellulose saccharification is promising, and that the C. thermocellum strain ∆pyrF::CaBglA could potentially be used as an efficient whole-cell catalyst for industrial cellulose saccharification.

Electronic supplementary material: The online version of this article (doi:10.1186/s13068-017-0796-y) contains supplementary material, which is available to authorized users.


Background
Lignocellulosic biomass is the most abundantly available raw material on Earth. Its sustainability and low cost make it an attractive feedstock to substitute for fossil resources [1][2][3]. Because of its recalcitrant structure, the main obstacle to lignocellulose bioconversion is the high cost of deconstruction [4][5][6]. Clostridium thermocellum has previously been demonstrated to have outstanding potential in lignocellulose bioconversion [7], because it produces a cellulosome, a multiprotein supermolecular complex, for highly efficient degradation of cellulose [8,9]. By assembling various enzymatic subunits and multiple structural scaffoldins together, the cellulosome makes full use of the synergy resulting from their interactions to promote lignocellulose hydrolysis [10,11]. Thus, the cellulosome-producing C. thermocellum is naturally suitable for lignocellulose bioconversion [12][13][14]. Nevertheless, wild-type C. thermocellum cannot so far be used directly as a whole-cell industrial catalyst. One of the main problems is that feedback inhibition of the cellulosome by the end product cellobiose greatly limits continuous cellulose saccharification [15,16].

Biotechnology for Biofuels, Open Access. *Correspondence: cuiqiu@qibebt.ac.cn; liuyj@qibebt.ac.cn. 3 Qingdao Engineering Laboratory of Single Cell Oil, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao 266101, People's Republic of China.

Currently, the feedback inhibition is generally relieved by supplying beta-glucosidases (BGL) to the hydrolysis system. For example, the targeted integration of BGL into the C. thermocellum cellulosome could enhance the degradation of cellulosic substrates [17]. Prawitwong et al. obtained high production of glucose using a C. thermocellum culture supplemented with a BGL from Thermoanaerobacter brockii [18]. Many efforts have been made to obtain low-cost BGLs, including the screening and engineering of BGLs for higher activity [19][20][21][22][23], and the development of novel processes for protein recycling [24,25]. However, a relatively high enzyme load is still unavoidable because the activity and stability of the added BGLs decrease continuously during the hydrolysis process. In addition, the production and supplementation of BGL may greatly complicate the saccharification process.
To avoid the extra addition of enzymes to the saccharification system, we considered that one of the most convenient and efficient ways is to construct a recombinant C. thermocellum that simultaneously produces secretory BGL and serves as a whole-cell catalyst for cellulose saccharification. To relieve the feedback inhibition and promote cellulose hydrolysis, the expressed BGL should function synergistically with the cellulosome by assembling into it and forming a substrate-coupled catalyzing pathway (cellulose-cellobiose-glucose) with other cellulosomal enzymes. The exoglucanase CelS (also known as Cel48S) is the most abundant enzymatic subunit in the cellulosome of C. thermocellum [26,27], and a previous study indicates that it makes the major contribution to cellulosome function [28]. Since CelS is the main producer of cellobiose in the cellulosome, relieving the cellobiose inhibition of CelS should be a priority. Thus, fusing BGL with CelS may be a preferred way of supplying BGL. To achieve the fused expression of CelS with BGL in C. thermocellum, precise genome-editing methods, such as the markerless gene-deletion approach [29], are required.

Plasmid-dependent expression of BGLs in C. thermocellum
We initially tried to express BGLs in C. thermocellum using a replicating plasmid, which is convenient to construct, to demonstrate the feasibility of BGL integration into the cellulosome in vivo. pHK-CtBglA-Doc and pHK-CaBglA-Doc were constructed for the expression of the fusion proteins CtBglA-Doc and CaBglA-Doc, respectively, under the control of the CelS promoter. The expressed proteins would contain a signal peptide for protein secretion and a dockerin (Doc) module for cellulosome assembly. Weak bands of ~65 kDa were detected by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analysis of the cellulosomes (data not shown), but mass spectrometry analysis failed to verify the fusion protein because of its low concentration. This result indicated little, if any, BGL expression and assembly in the cellulosomes of the recombinant strains. A BGL assay was then performed against p-nitrophenyl-β-d-glucopyranoside (pNPG) to confirm the assembly of the expressed BGLs into the cellulosomes and their functionality. A BGL activity of 1.15 ± 0.07 U/mg was detected in the cellulosome of ∆pyrF::pHK-CaBglA-Doc, but no BGL activity was detected in the cellulosomes of ∆pyrF and ∆pyrF::pHK-CtBglA-Doc, indicating plasmid-based expression of the fusion protein CaBglA-Doc at low abundance. The failure to detect CtBglA-Doc indicated that its expression or secretion had failed. The difficulty of plasmid-based expression in C. thermocellum has been discussed in previous studies [33,34]. Nevertheless, the BGL activity of the cellulosome of ∆pyrF::pHK-CaBglA-Doc demonstrated that C. thermocellum can express and secrete active CaBglA. Thus, we decided to introduce the BGL-encoding genes into C. thermocellum by chromosomal integration to relieve the metabolic burden resulting from plasmid replication and the expression of plasmid-carried antibiotic-resistance genes. In addition, chromosomal gene expression is more convenient for industrial application than plasmid-based expression.

Table 1 Enzymatic properties of selected BGLs
All experiments were performed in triplicate to calculate the averages and standard errors with pNPG as the substrate, and the reaction conditions were pH 5.5 and 55 °C.
a The optimal temperature and pH of CaBglA and CtBglA were also determined in this study, as shown in Additional file 1.
b The thermal stability is shown as the percentage of BGL activity remaining after incubation at 60 or 80 °C for 24 h.
c The glucose inhibition of BGLs was determined by adding glucose at different concentrations (0-600 mM) to the standard reaction mixture, and was calculated as the glucose concentration required to inhibit 50% of the initial BGL activity (Additional file 1).
d Instead of being inhibited, Td2f2 was stimulated by 29.2-72.0% with the addition of 100-600 mM glucose, which is consistent with a previous report [31].
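Footnote c of Table 1 defines glucose inhibition as the glucose concentration that removes 50% of the initial BGL activity. One simple way to estimate such a value from a dose-response series is linear interpolation between the two tested concentrations that bracket 50% activity. The sketch below is only an illustration under that assumption: the function name and the data points are hypothetical, and the source does not state which fitting procedure was actually used.

```python
def half_inhibition_concentration(glucose_mM, relative_activity_pct):
    """Estimate the glucose concentration giving 50% residual activity.

    Linear interpolation between the first pair of adjacent points that
    bracket 50%. Returns None if 50% inhibition is never reached in the
    tested range (for example, an enzyme that glucose stimulates).
    """
    points = list(zip(glucose_mM, relative_activity_pct))
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 >= 50.0 >= a1 and a0 != a1:
            return c0 + (a0 - 50.0) * (c1 - c0) / (a0 - a1)
    return None


# Hypothetical dose-response series (not measured values from the paper).
concentrations = [0, 100, 200, 400, 600]          # mM glucose
activities = [100.0, 80.0, 60.0, 40.0, 20.0]      # % of initial activity
estimate = half_inhibition_concentration(concentrations, activities)

# A glucose-stimulated enzyme never crosses 50%, so no value is returned.
stimulated = half_inhibition_concentration([0, 100, 600], [100.0, 129.2, 172.0])
```

For the hypothetical series, the 50% point falls between 200 and 400 mM. A fitted inhibition model would be preferable when enough data points are available.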

Seamless knock-in of BGL gene in the genome of C. thermocellum
A seamless genome editing system was developed on the basis of the allele-coupled exchange (ACE) strategy [35], including a pyrF-deleted chassis strain ∆pyrF and corresponding plasmids for homologous recombination.
The ACE strategy was originally developed in mesophilic clostridia [35], and to our knowledge it had not previously been used for the genetic engineering of the thermophilic C. thermocellum. Three regions of homology with different lengths (two long regions of ~1.2 kb and one short region of ~300 bp) were used to control the order of the homologous recombination events. Two selection markers, the orotidine 5′-phosphate decarboxylase-encoding gene pyrF from C. thermocellum DSM1313 and the thymidine kinase-encoding gene tdk from the thermophilic anaerobe Thermoanaerobacter sp. X514 [36,37], were used to achieve markerless manipulation (Additional file 2). With the developed genome editing system, seamless gene deletion, insertion and replacement could be achieved after three screening steps and two rounds of recombination (Fig. 1).

We opted to create a BGL-CelS fusion protein to form a substrate-coupled catalyzing channel, because removing the cellobiose produced by the exoglucanase CelS, the most abundant cellulosome subunit in C. thermocellum, might greatly relieve the feedback inhibition of the whole cellulosome system. BGL was designed to be located between the catalyzing module (Cel) and the dockerin module (Doc) of CelS to avoid interfering with the type I dockerin-cohesin interaction during cellulosome assembly. The resulting fusion protein Cel-BGL-Doc would contain three functional modules. The plasmid pHK-HR-CaBglA was constructed for the knock-in of the gene caBglA. The termination codons of the BGL-encoding genes were eliminated during plasmid construction to guarantee fused expression with the CelS modules.

The recombinant strain ∆pyrF::CaBglA was obtained after three screening steps (Fig. 1a). In short, the ∆pyrF transformants containing pHK-HR-CaBglA were initially screened on solid GS-2 medium with Tm. Then, the recombinants were selected in MJ medium lacking uracil but containing FUDR. Under such selection stress, the plasmid containing the tdk cassette must be cured to allow cell growth, while the pyrF cassette on the plasmid was required to produce uracil because the host cell was pyrF-deleted. Thus, the first round of recombination occurred through the two long regions of homology (HR-up and HR-down) to integrate both the selection marker pyrF and the CaBglA gene into the chromosome. The pyrF function of the host strain was restored in this step, and the plasmid backbone was cured. In the third step, the counter-selection function of pyrF was used to remove the pyrF cassette through the second round of recombination between the short region of homology HR-short and the 3′ region of HR-up, which has the same sequence as HR-short. The cells without PyrF function were selected in FOA-supplemented GS-2 medium. Colony PCR and sequencing were performed using the primer set HR-F/R to verify the recombination after each screening step (Fig. 1b).

We also tried to fuse the endogenous BGL of C. thermocellum DSM1313 (CtBglA) with CelS. However, the knock-in of the ctBglA gene was not successful even after several attempts. The construction stalled during the first round of recombination in the second selection step (Fig. 1a), and no positive recombinant could be detected after screening hundreds of colonies. The difficulty might be due to preferential recombination between the homologous ctBglA sequences, since the ctBglA sequence (~1.3 kb) was longer than the regions of homology (~1.2 kb).
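The logic of the three screening steps described above can be summarized as a small truth table over marker status. The sketch below is a deliberately simplified model, not the authors' protocol: the class, the medium labels, and the assumptions that the plasmid backbone confers Tm resistance and that the FOA medium supplies uracil are ours and are flagged in the comments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Marker status of a cell (hypothetical simplification of the text)."""
    label: str
    chromosomal_pyrF: bool  # pyrF cassette integrated on the chromosome
    free_plasmid: bool      # replicating plasmid (pyrF + tdk cassettes) present

    @property
    def pyrF_active(self) -> bool:
        return self.chromosomal_pyrF or self.free_plasmid

    @property
    def tdk_active(self) -> bool:
        return self.free_plasmid


def survives(cell: Cell, medium: str) -> bool:
    if medium == "Tm":
        # Assumption: the plasmid backbone carries the Tm resistance marker.
        return cell.free_plasmid
    if medium == "-uracil +FUDR":
        # pyrF is needed to make uracil; tdk makes FUDR toxic.
        return cell.pyrF_active and not cell.tdk_active
    if medium == "+FOA":
        # Assumption: uracil is supplied; pyrF makes FOA toxic.
        return not cell.pyrF_active
    raise ValueError(f"unknown medium: {medium}")


host = Cell("pyrF-deleted host", False, False)
transformant = Cell("transformant with free plasmid", False, True)
first_recombinant = Cell("pyrF and BGL gene integrated, plasmid cured", True, False)
final_strain = Cell("pyrF cassette removed, BGL gene retained", False, False)
```

In this model each step passes exactly the intended genotype: the transformant on Tm, the first recombinant on uracil-free medium with FUDR, and the final strain on FOA.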

Investigation of BGL expression by C. thermocellum
The cellulosomes and extracellular proteins of ∆pyrF::CaBglA were prepared and analyzed to confirm the expression of the fusion protein Cel-BGL-Doc, which has a theoretical size of ~135 kDa. Samples from the parent strain ∆pyrF were analyzed as the control. SDS-PAGE analysis showed that the ~75-kDa band corresponding to the wild-type CelS protein was barely detectable for ∆pyrF::CaBglA, whereas an additional band of ~135 kDa was detected (Fig. 2), indicating expression of the fusion protein instead of the wild-type CelS. The ~135-kDa protein was further confirmed as Cel-CaBglA-Doc by mass spectrometry analysis (Additional file 3). Furthermore, a BGL activity of 19.1 ± 1.2 U/mg was detected in the cellulosome of ∆pyrF::CaBglA, which was about 16 times that of the plasmid-based CaBglA expression in ∆pyrF::pHK-CaBglA-Doc. These results indicated the successful expression, secretion and cellulosomal assembly of active Cel-CaBglA-Doc in ∆pyrF::CaBglA.
We observed lower abundance of the fused protein Cel-CaBglA-Doc in ∆pyrF::CaBglA compared to that of the wild-type CelS in ∆pyrF, which might influence the efficiency of cellulose degradation. In order to investigate whether the decreased protein expression was because of the unsuitable codon usage of caBglA gene, another strain ∆pyrF::CaBglAm was constructed by replacing CaBglA of ∆pyrF::CaBglA with a codon-modified CaBglAm. The expression and cellulosomal assembly of the CelS-bearing CaBglAm in ∆pyrF::CaBglAm was confirmed by enzyme assay, but the expression of the fusion protein was not significantly different from that of ∆pyrF::CaBglA according to the SDS-PAGE analysis (Fig. 2).

Enhanced cellulosomal activity by CelS-bearing BGL
The cellulolytic activity of the cellulosome of ∆pyrF::CaBglA was analyzed by monitoring the concentrations of released reducing sugar after a 24-h hydrolysis assay at 55 °C with Avicel as the substrate (Fig. 3). Cellobiose and glucose were also quantified by HPLC. The cellulosomal activity of ∆pyrF::CaBglA was 1.6-fold that of the parent strain ∆pyrF, while the proportion of glucose in the released reducing sugar increased from 34 to 78% owing to the expression of CaBglA, suggesting enhanced cellulolytic activity. Although the cellulosome of ∆pyrF contained no BGL activity against cellobiose, glucose was also detected in its cellulose hydrolysate. This indicated that some endoglucanases in the cellulosome system of C. thermocellum could actively convert cello-oligosaccharides to glucose.

Fig. 2 … and extracellular proteins (E) of C. thermocellum strains. The parent strain ∆pyrF produced an intact CelS protein (black arrows). Compared to ∆pyrF, an additional ~135-kDa band was observed in both cellulosomal and extracellular proteins of ∆pyrF::CaBglA and ∆pyrF::CaBglAm (red arrows), suggesting the successful expression of the fusion protein Cel-CaBglA(m)-Doc and its assembly in the cellulosome. Cellulosomal and extracellular proteins of ∆pyrF::CaBglA with sizes of ~135 and ~75 kDa were further identified by mass spectrometry (Additional file 3). Bands corresponding to known cellulosomal proteins are identified to the left of the Coomassie blue-stained gel. M, protein standards.

Cellulose saccharification using C. thermocellum as a whole-cell catalyst
A two-stage process, including a cell-cultivation stage and a cellulose hydrolysis stage, was employed for cellulose saccharification. To determine the cultivation time in the two-stage process, the growth patterns of ∆pyrF and ∆pyrF::CaBglA were initially analyzed using 5 g/L Avicel as the carbon source. The result showed that both strains reached stationary phase after 28 h of cultivation (Additional file 4). Thus, the cell-cultivation stage lasted for 36 h to guarantee the production of cellulosomal proteins and the complete utilization of the initial carbon source. In addition, ∆pyrF and ∆pyrF::CaBglA showed similar Avicel consumption patterns (Additional file 4). This result indicated that the expression of CelS-bearing CaBglA had little influence on the cellulose degradation of C. thermocellum at the cell-cultivation stage, because the small amounts of accumulated sugars (2.24 ± 0.05 and 1.85 ± 0.85 mM cellobiose for ∆pyrF and ∆pyrF::CaBglA, respectively; no glucose was detected) would have only a slight inhibitory effect, if any, on cellulose hydrolysis [38].
After the cell-cultivation stage, 100 g/L Avicel was supplemented to initiate the cellulose hydrolysis stage. C. thermocellum strains can assimilate the produced sugars [39], which may decrease the production of reducing sugars. Because C. thermocellum is a strict anaerobe, its growth may cease in the presence of oxygen or at low pH (pH < 6) [39]. Thus, we performed an aerobic treatment (aeration) and an acidic treatment (reducing the pH to 5.5) to inhibit cell growth as well as the assimilation process. Without any treatment, ∆pyrF::CaBglA produced 394 ± 34.7 mM reducing sugar, including 381 ± 13.3 mM glucose, in 20 days. Meanwhile, the parent strain ∆pyrF produced only 182 ± 8.7 mM reducing sugar, including 173 ± 0.2 mM glucose. The results showed that ∆pyrF::CaBglA produced more reducing sugar than the parent strain under untreated as well as acidic conditions, but not under the aerobic condition (Fig. 4a). In addition, neither the aerobic nor the acidic treatment was conducive to the production of reducing sugars by C. thermocellum strains, especially ∆pyrF::CaBglA (Fig. 4a). Under anaerobic (untreated or acidic) conditions, the parent strain ∆pyrF produced less glucose but accumulated more cellobiose than ∆pyrF::CaBglA, indicating that the expression of CaBglA stimulated the conversion of cellobiose to glucose and thereby promoted cellulose hydrolysis (Fig. 4b, c). However, ∆pyrF and ∆pyrF::CaBglA produced similar amounts of glucose and cellobiose under the aerobic condition (Fig. 4b, c). This suggested that the presence of oxygen inhibited not only the activity of the cellulosome but also the activity of CaBglA.
At the end of the cell-cultivation phase, the BGL activity in the ∆pyrF::CaBglA broth was determined as 7.23 ± 0.2 U/g cellulose. Because some cell-bound CelS-bearing CaBglA might not have been released into the broth, we presumed that the actual BGL activity in the ∆pyrF::CaBglA culture was higher. To evaluate the contribution of the expressed BGL to saccharification, a positive control was prepared by adding 15 U/g cellulose of purified CaBglA protein to ∆pyrF cultures at the beginning of the saccharification phase; 359 ± 48.5 mM reducing sugar was produced after 20-day saccharification, which was similar to the production of ∆pyrF::CaBglA (Fig. 4a). The relative levels of cellulose saccharification for ∆pyrF::CaBglA and the positive control (∆pyrF supplemented with CaBglA) were 63.9 and 58.2%, respectively. These results indicated that the CelS-bearing CaBglA in ∆pyrF::CaBglA was as active as the free CaBglA protein. In addition, although the glucose production showed no significant difference at the end of the saccharification process, the cellobiose concentration in the culture of the positive control was slightly higher than that in the ∆pyrF::CaBglA culture (Fig. 4b, c), indicating more effective removal of cellobiose by CelS-bearing CaBglA.

The strain ∆pyrF::pHK-CaBglA-Doc, which produces the plasmid-borne CaBglA-Doc protein, was also used for cellulose saccharification, since its cellulosome contained BGL activity. However, no increase in reducing sugar was detected for this strain compared with the parent strain, which might be explained by the low expression of CaBglA. In addition, the saccharification activity of the strain ∆pyrF::CaBglAm, which expresses a chromosome-borne codon-modified CaBglA, was compared with that of ∆pyrF::CaBglA, and no significant change was observed either (Additional file 6).
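The relative saccharification levels quoted above can be reproduced with a short calculation. The sketch assumes, since this section does not spell out the formula, that the level is the reducing sugar (in mM, counted as glucose) divided by the glucose equivalents of the supplied Avicel, with 162.14 g/mol taken for an anhydroglucose unit; the function names are hypothetical. Under these assumptions 100 g/L Avicel corresponds to about 617 mM glucose equivalents, the value the paper itself uses for the carbon balance.

```python
ANHYDROGLUCOSE_G_PER_MOL = 162.14  # glucose (180.16) minus one water (18.02)


def glucose_equivalent_mM(cellulose_g_per_L: float) -> float:
    """Glucose equivalents (mM) released by complete hydrolysis of cellulose."""
    return cellulose_g_per_L / ANHYDROGLUCOSE_G_PER_MOL * 1000.0


def relative_saccharification_pct(reducing_sugar_mM: float,
                                  cellulose_g_per_L: float = 100.0) -> float:
    """Assumed definition: reducing sugar as a share of the theoretical maximum."""
    return 100.0 * reducing_sugar_mM / glucose_equivalent_mM(cellulose_g_per_L)


engineered = relative_saccharification_pct(394.0)        # dpyrF::CaBglA
positive_control = relative_saccharification_pct(359.0)  # dpyrF + purified CaBglA
```

Rounded to one decimal place, these give 63.9 and 58.2%, matching the reported values.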

Improvement of the sugar production by C. thermocellum ∆pyrF::CaBglA
An increased amount of whole-cell catalyst in the saccharification system might result in enhanced cellulose degradation and fermentable sugar production. Thus, we tried to stimulate the cell growth of C. thermocellum ∆pyrF::CaBglA by modifying the fermentation medium. Cellobiose and ammonium sulfate were supplemented as an extra carbon source and an inorganic nitrogen source, respectively [40]. A more than 4.5-fold increase in cell density (OD600nm = 5.5) was observed when 20 g/L cellobiose and 1.3 g/L ammonium sulfate were used (Additional file 7). However, ∆pyrF::CaBglA produced only 241 ± 3.4 mM reducing sugar in this rich medium after 15-day saccharification, which was only 59% of that in the control setup using regular medium. This indicated that although C. thermocellum prefers cellobiose to cellulose and grows fast with increased carbon and nitrogen loading, the cellulosome produced with cellobiose as the carbon source may not be suitable for cellulose degradation because of the substrate-coupled regulation mechanism [41].
To avoid the potential impact of a substrate change on the composition and activity of the cellulosome, concentrated ∆pyrF::CaBglA cells were inoculated into fresh medium, and the cell density was increased by 1.6- or 2.4-fold (the concentration of pellet protein increased from 0.26 to 0.46 or 0.63 mg/mL, respectively). The production of reducing sugar was stimulated up to 490 ± 7.6 mM (Fig. 5), including 451 ± 5.7 mM (81.1 ± 1.0 g/L) glucose and 6.8 ± 0.4 mM (2.3 ± 0.1 g/L) cellobiose, and the saccharification level increased to 79.4%. The pH values of the cultures were buffered and maintained at around 6 (Fig. 5). At the end of the saccharification process, we observed complete degradation of the initial Avicel, but no significant cell growth was detected (Additional file 8). Thus, the change in cell biomass would have only a slight influence on the carbon recovery. To investigate the carbon flux in the saccharification system in addition to fermentable sugars, we quantified potential intermediates and end metabolites produced by ∆pyrF::CaBglA, including pyruvate, ethanol, lactate, acetate, and formate. At the end of saccharification, ∆pyrF::CaBglA with 2.4-fold cell density produced 261 ± 10.3 mM ethanol, while only trace amounts of other products were detected. The produced ethanol was equivalent to around 131 mM glucose. Together, 620.5 ± 9.2 mM products in glucose equivalents were produced from the initial Avicel (617 mM in glucose equivalents), and the carbon recovery was 100.6 ± 1.6%.
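The carbon balance in this paragraph follows from simple stoichiometry: one glucose yields two ethanol, so 261 mM ethanol corresponds to about 131 mM glucose, and the sum of reducing sugar and ethanol-derived glucose is compared with the 617 mM glucose equivalents supplied as Avicel. The sketch below reproduces that arithmetic; the function name is hypothetical, and reducing sugar is treated as glucose equivalents, as the paper's own totals imply.

```python
def carbon_recovery(reducing_sugar_mM: float,
                    ethanol_mM: float,
                    substrate_glucose_eq_mM: float):
    """Return (total products in glucose equivalents, recovery in percent)."""
    ethanol_as_glucose = ethanol_mM / 2.0  # glucose -> 2 ethanol + 2 CO2
    total = reducing_sugar_mM + ethanol_as_glucose
    return total, 100.0 * total / substrate_glucose_eq_mM


# Mean values reported for the 2.4-fold cell density culture.
total_mM, recovery_pct = carbon_recovery(490.0, 261.0, 617.0)
```

With the reported means this gives 620.5 mM products and a recovery of about 100.6%, in line with the stated figures; the uncertainty terms are not propagated here.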

Discussion
Clostridium thermocellum can potentially be used as a whole-cell catalyst for cellulose saccharification because of its robust cellulosome system [9], but proper engineering is indispensable to overcome its natural deficiencies, e.g. cellobiose inhibition, to meet industrial standards [5]. Previous in vitro studies have shown that BGL supplementation is one of the most efficient strategies to relieve the cellobiose inhibition of the cellulosome [17]. Prawitwong et al. developed a biological saccharification system using a C. thermocellum culture supplemented with BGL to produce glucose [18], and BGL recycling was addressed by fusing BGL with a cellulose-binding module [24,25], but the additional production and supplementation of BGL complicates the process and may also increase the saccharification cost. In addition, the decreasing stability of BGL and its inhibition by glucose during the saccharification process might require a relatively high or successive load of BGL protein, which may further increase the cost. In this study, we employed a whole-cell catalyzing strategy for lignocellulose saccharification without supplementation of extra enzymes. This strategy fits the industrial requirements of low cost, a simple process, and high efficiency, but needs an efficient whole-cell catalyst that produces secretory BGL.
It is known that CelS plays a key role in cellulose hydrolysis by the cellulosome and is the main producer of cellobiose [28,42]; thus, the fusion of a selected BGL with CelS would greatly relieve the feedback inhibition of the cellulosome. By developing a novel ACE-based seamless genome editing system for the thermophilic bacterium, we successfully inserted the BGL gene caBglA into the CelS-encoding sequence of C. thermocellum DSM 1313, precisely between its catalyzing module and dockerin module. A recombinant strain, ∆pyrF::CaBglA, was finally constructed that produces a CelS-bearing CaBglA protein as an active cellulosomal component. Nevertheless, we observed decreased expression of the fused protein Cel-CaBglA-Doc in ∆pyrF::CaBglA compared to that of the wild-type CelS in ∆pyrF, which might influence the efficiency of cellulose degradation and sugar production. We suspected that unsuitable codon usage of the caBglA gene might be the problem and optimized its codons, but the result suggested that this was not the case. The fusion protein Cel-CaBglA-Doc was expressed under the control of the strong promoter [...]

In addition to cellulosomal integration, we also achieved plasmid-dependent expression of CaBglA in C. thermocellum using the multicopy replicating plasmid pHK. Although ∆pyrF::pHK-CaBglA-Doc theoretically contained more caBglA gene copies than ∆pyrF::CaBglA, and both employed the CelS promoter to drive transcription, the cellulosome of ∆pyrF::pHK-CaBglA-Doc showed 16-fold lower BGL activity (1.15 ± 0.07 U/mg) than that of ∆pyrF::CaBglA (19.1 ± 1.2 U/mg). This result suggested that episomal plasmid DNA might not be a suitable vector for protein expression, that chromosomal integration might result in a higher level of protein expression in some cases [43], and that the CelS-bearing expression pattern might play a key role during protein secretion and/or assembly to support the functional expression of the heterologous BGL. This phenomenon should be considered when expressing other exogenous proteins in C. thermocellum.
Because the assimilation of cellodextrins by C. thermocellum may result in the decreased production of reducing sugars [39], we set up aerobic or low-pH treatments to inhibit the cell metabolism at the cellulose hydrolysis stage. However, neither aeration nor pH reduction resulted in increased production of reducing sugars. In contrast, lower sugar concentrations were detected compared with the untreated control. The negative effect of aeration on the cellulose hydrolysis might be explained by the oxygen sensitivity of the cellulosome [44,45], and CaBglA might also prefer anaerobic condition. Although pH 5.5 was determined as the optimal pH of both cellulosome [45,46] and CaBglA [23] (Additional file 1), low-pH condition decreased the production of reducing sugars in this study. We also detected decreased pellet protein and extracellular protein concentrations when the pH value of ∆pyrF::CaBglA was decreased to 5.5 at the beginning of the cellulose hydrolysis stage (Additional file 8). These results indicated that the low-pH condition might inhibit the cell growth as well as the continuous production of cellulosomal proteins by cells, and cell lysis might also occur under acidic conditions [39]. In addition, there might be other extracellular protein components that contributed to the cellulose hydrolysis but were not in favor of acidic conditions. Thus, the control of pH value should be considered in further application of C. thermocellum for cellulose conversion.
Although the parent strain ∆pyrF supplemented with 15 U/g cellulose of purified CaBglA protein showed saccharification activity similar to that of ∆pyrF::CaBglA, it accumulated more cellobiose during hydrolysis. This result indicated that the CelS-bearing CaBglA expressed by C. thermocellum was more effective in cellobiose conversion than the supplemented free protein. In contrast to the parent strain ∆pyrF, further supplementation of purified CaBglA to the culture of ∆pyrF::CaBglA showed no apparent effect on sugar production, suggesting that the fused expression of Cel-CaBglA-Doc led to the full occupation of CelS by CaBglA. The resulting substrate-coupled catalyzing channel relieved most of the cellobiose inhibition of the whole cellulosome system. Under our whole-cell-catalyzing conditions, the carbon recovery of the supplemented cellulose was about 100%, i.e., 79% of the supplemented carbon was converted to soluble sugars and 21% to the end product ethanol. The productivity of fermentable sugars might be further enhanced by disrupting the metabolic pathways leading to end products, especially ethanol, in the recombinant strain ∆pyrF::CaBglA.

Conclusion
Lignocellulosic biomass is an attractive feedstock to substitute fossil resources, but is difficult to deconstruct. C. thermocellum naturally degrades lignocellulose efficiently but cannot be directly used in industry due largely to the feedback inhibition to its enzymatic system cellulosome. The supplementation of BGL could solve the problem. However, the addition of enzymes produced elsewhere complicates the process and hinders the utilization of lignocellulose in industry. Here, we constructed an efficient whole-cell catalyst producing BGL for cellulose saccharification by targeted engineering of C. thermocellum. Without supplementation of any other enzymes, the whole-cell catalyst showed the high cellulose-saccharification activity and sugar production. Hence, our work confirmed the feasibility of the whole-cell-catalysis strategy for cellulose saccharification, and provided a potential whole-cell catalyst for industrial cellulose saccharification.
All plasmids constructed for genetic manipulation in C. thermocellum were derived from pHK (GenBank accession number: KY792637) [51]. To construct the plasmid pHK-∆pyrF, ~1-kb upstream and downstream homologous arms of pyrF were amplified from the genomic DNA of C. thermocellum DSM1313 using primer sets PyrF5′-F/R and PyrF3′-F/R (Additional file 9), respectively, and were joined by overlap PCR using primers PyrF5′-F/PyrF3′-R. The obtained PCR product was cloned into the pHK vector through PstI and NheI restriction sites to generate pHK-∆pyrF (Table 2).
The seamless editing plasmid pHK-HR contains two selection gene markers and three regions of homology  (Additional file 2). The endogenous selection marker pyrF (Clo1313_1266) was driven by its native promoter [37]. The selection marker tdk (Teth514_0091) was obtained from Thermoanaerobacter sp. X514 [36], and was expressed under the control of a glyceraldehyde-3-phosphate dehydrogenase (gapDH) promoter from C. thermocellum DSM1313. Three regions of homology, HR-up, HR-short, and HR-down, were amplified from the genome DNA of C. thermocellum DSM1313 according to the genome editing demand, in which the sequence of HR-short was the same with the 3′ region of HR-up. BGL-encoding genes ctBglA and caBglA were chosen as the candidate DNA sequences to knock in, and were amplified from the genome DNAs of C. thermocellum DSM1313 and Caldicellulosiruptor sp. F32, respectively. The termination codons of the BGL-encoding genes were eliminated for fused protein expression. All fragments were cloned into the pHK vector sequentially. The Tdk expression cassette with the gapDH promoter was first cloned into pHK using NheI and XbaI sites. Subsequently, a fragment containing the upstream homology HR-up, the PyrF cassette as well as the short homology HR-short was obtained by overlap PCR and cloned into the plasmid using XbaI and EagI sites. The downstream homology HR-Down was then ligated into the plasmid using MluI and BamHI sites. We determined to integrate BGL sequences between the catalyzing module and dockerin module of CelS on the chromosome. To maintain the individual functions of the adjacent modules, repeated GGT sequence encoding glycine residuals were introduced in the upstream (3′ end of HR-short) and downstream (5′ end of HR-down) of the target DNA sequence. 
Finally, the ctBglA, caBglA, or caBglAm gene sequence was cloned into the plasmid via the MluI and EagI sites, resulting in the plasmids pHK-HR-CtBglA, pHK-HR-CaBglA, and pHK-HR-CaBglAm (Table 2), respectively, for seamless genome editing. For the plasmid-dependent expression of CtBglA and CaBglA in C. thermocellum DSM1313, the predicted promoter and signal peptide region of CelS in C. thermocellum DSM1313 was amplified with primer set Pcs-sigF/R (Additional file 9) and cloned into pHK-HR-CtBglA and pHK-HR-CaBglA using the NheI and EagI sites to construct pHK-CtBglA-Doc and pHK-CaBglA-Doc (Table 2), respectively. In these constructs, the Tdk cassette, HR-up region, PyrF cassette, and HR-short region of the plasmids were replaced by the promoter and signal peptide of CelS. Thus, the BGL-encoding genes would be driven by the CelS promoter, and the expressed proteins would contain the signal peptide of CelS for protein secretion. Because the HR-down region contained the sequence encoding the dockerin module of CelS, the expressed BGLs would bear the dockerin module for cellulosome assembly.
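The insert design described above (stop codon removed, GGT glycine codons on both sides of the BGL sequence) can be illustrated with a minimal Python sketch. This is not the authors' cloning workflow: the function names, the toy sequence, and the default of three GGT repeats are hypothetical, since the number of repeats is not stated here. In the actual constructs the linkers sit at the 3′ end of HR-short and the 5′ end of HR-down; the sketch simply places them directly around the coding sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop_codon(cds: str) -> str:
    """Return the coding sequence without a terminal stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    if cds[-3:] in STOP_CODONS:
        return cds[:-3]
    return cds


def glycine_linker(n_codons: int) -> str:
    """Build a linker of repeated GGT codons (each encodes glycine)."""
    if n_codons < 0:
        raise ValueError("n_codons must be non-negative")
    return "GGT" * n_codons


def build_fusion_insert(bgl_cds: str, n_linker_codons: int = 3) -> str:
    """Flank a stop-free BGL CDS with glycine linkers on both sides.

    n_linker_codons is a hypothetical value chosen for illustration.
    """
    linker = glycine_linker(n_linker_codons)
    return linker + strip_stop_codon(bgl_cds) + linker


# Toy example: 'ATGAAATAA' stands in for a real BGL gene.
insert = build_fusion_insert("ATGAAATAA", n_linker_codons=2)
print(insert, len(insert) % 3 == 0)
```

Because both linkers and the stop-free coding sequence are whole codons, the insert length stays a multiple of three, which is what keeps the downstream dockerin module in frame.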

Heterologous expression and purification of beta-glucosidases in E. coli
The plasmids pET28aNS-CaBglA, pET28aNS-CtBglA, pET28aNS-CglT, and pET28aNS-Td2f2 were transformed into E. coli BL21(DE3) for heterologous expression of CaBglA, CtBglA, CglT, and Td2f2, respectively. Synthesis of recombinant proteins in E. coli BL21(DE3) cells was initiated by the addition of 1 mM IPTG, and cultivation was continued for an additional 16 h at 16 °C. Cells were harvested by centrifugation at 10,000 rpm, resuspended in 50 mM Tris-HCl buffer containing 30 mM imidazole and 300 mM NaCl, pH 8.0, and lysed by ultrasonication. The supernatants were applied onto a HisTrap™ HP Ni-affinity column (GE Healthcare). The proteins were eluted with 50 mM Tris-HCl buffer containing 500 mM imidazole and 300 mM NaCl, pH 8.0. The eluted fractions were then concentrated to 2 mL using Amicon Ultra-15 centrifugal filter units (10 kDa cutoff) (Merck Millipore, Billerica, MA, USA), and applied onto a Superdex 75 gel filtration column (GE Healthcare) with 50 mM K2HPO4-KH2PO4 buffer containing 100 mM KCl, pH 6.0.

Preparation of cellulosomal and extracellular proteins
Clostridium thermocellum strains were cultivated in GS-2 medium with 5 g/L Avicel as the sole carbon source at 55 °C for 48 h. Cultures of 200 mL were centrifuged at 3000 × g for 30 min, and the cell pellets were washed twice, resuspended in 4 mL of 50 mM Tris-HCl buffer containing 5 mM DTT, pH 7.0, and lysed using a high-pressure homogenizer (Constant Systems LTD). The extracellular proteins were prepared by concentrating 10 mL of the culture supernatants to 0.5 mL using Amicon Ultra-15 centrifugal filter units (10 kDa cutoff) (Merck Millipore, Billerica, MA, USA). The rest of the culture supernatants were used for cellulosome extraction according to a modified cellulose affinity procedure [10].

Protein analyses
Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) was performed to check protein purity and composition as previously described [10]. The molecular weight of each protein was estimated from its mobility relative to protein ladders (10-245 kDa, New England BioLabs). The Bradford method was used for protein quantification [52]. Mass spectrometry analyses were performed according to a published procedure [10]. All protein samples were stored at −80 °C for further analyses.

Enzyme assay
The BGL activity was determined against p-nitrophenyl-β-D-glucopyranoside (pNPG). Samples were incubated in 200-μL reaction buffer (50 mM sodium acetate, 1 mM pNPG, pH 5.5) at 55 °C for 5 or 10 min. The reaction was terminated by adding 1 mL of 1 M Na2CO3, and the absorbance of the mixture was measured at 405 nm immediately. One unit of enzyme activity was defined as the amount of enzyme required to produce 1 μmol of p-nitrophenol (pNP) per min under the assay conditions.
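The unit definition above can be turned into a small calculation. The following Python sketch assumes that absorbance at 405 nm is converted to μmol of pNP with a linear pNP standard curve measured in the same final mixture; the slope value, the blank reading, and the function names are hypothetical and not taken from this work.

```python
def bgl_units(a405_sample: float, a405_blank: float,
              slope_abs_per_umol: float, minutes: float) -> float:
    """Enzyme units (umol pNP released per min) from one endpoint reading.

    slope_abs_per_umol: slope of a pNP standard curve, in absorbance units
    per umol pNP in the final mixture (an assumed calibration, to be
    measured by the user).
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    if slope_abs_per_umol <= 0:
        raise ValueError("standard-curve slope must be positive")
    umol_pnp = (a405_sample - a405_blank) / slope_abs_per_umol
    return umol_pnp / minutes


def specific_activity(units: float, protein_mg: float) -> float:
    """Specific activity in U per mg of protein."""
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    return units / protein_mg


# Hypothetical readings for a 10-min reaction containing 0.002 mg protein.
u = bgl_units(a405_sample=0.85, a405_blank=0.05,
              slope_abs_per_umol=8.0, minutes=10)
print(f"{u:.4f} U, {specific_activity(u, 0.002):.1f} U/mg")
```

Subtracting a no-enzyme blank before applying the standard curve removes the background from spontaneous pNPG hydrolysis at 55 °C.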
The cellulase activity was tested in a 1-mL final reaction volume containing 50 μg of cellulosome proteins and 15 mg Avicel or 7.5 mg cellobiose as the substrate. The reaction buffer contained 50 mM sodium acetate, 10 mM CaCl2, and 5 mM dithiothreitol (DTT), pH 5.5. The abundance of reducing sugars was determined by the 3,5-dinitrosalicylic acid (DNS) method after incubation at 55 °C for 24 h, and the glucose and cellobiose in the hydrolysates were quantified by high-performance liquid chromatography (HPLC) as previously described [53].

Electrotransformation and screening of C. thermocellum
The pHK derivative plasmids were transformed into E. coli BL21(DE3) to remove Dcm methylation, and then transformed into C. thermocellum DSM1313 according to published protocols [10,51]. In brief, 200 μL of C. thermocellum competent cells was added to 0.2-cm electroporation cuvettes (BioRad) with 10 μL of DNA (~2000 ng) in sterile distilled water. A series of 40 square pulses were applied, each with an amplitude of 1.5 kV and for a duration of 50 s at 500-ms intervals. Cells were then recovered for 24 h at 51 °C in 4 mL of fresh GS-2 medium before screening on solid medium containing Tm. The ∆pyrF mutant was selected as described [37]. In detail, the transformants containing pHK-∆pyrF were cultivated in the presence of Tm and then plated on GS-2 solid medium with 500 μg/mL FOA. FOA-resistant colonies were screened by colony PCR using primers PyrF-F/R, which flank the genomic pyrF gene. The colonies yielding 0.25-kb PCR products were identified as pyrF-deleted mutants.
The mutant ∆pyrF was then used as the parent to construct other C. thermocellum recombinant strains. The ∆pyrF transformants containing pHK-derived plasmids were initially selected on solid GS-2 medium supplemented with Tm, and then inoculated into liquid GS-2 medium with Tm to strengthen the replication of the transformed plasmid before further screening. Transformants containing plasmid pHK-CtBglA-Doc or pHK-CaBglA-Doc were verified by detecting the CtBglA-Doc or CaBglA-Doc fragments via colony PCR and sequencing. Transformants containing pHK-HR-CtBglA, pHK-HR-CaBglA, or pHK-HR-CaBglAm were further inoculated into MJ medium containing FUDR and cultivated until the late exponential phase. Cultures of 200 μL were diluted 1-, 10-, 100-, and 1000-fold, plated on MJ solid medium containing FUDR, and then cultivated at 51 °C for 5-7 days. The obtained colonies were screened by colony PCR using primer set HR-F/R (Additional file 9). A PCR product of ~5.7 kb indicated that the first round of recombination had occurred via the long regions of homology. A band of 2.9 kb might also be detected, indicating contamination with the parent strain ∆pyrF (Additional file 10). In that case, the FUDR screening in MJ medium was repeated until a single band of ~5.7 kb was detected. The verified colonies were inoculated into GS-2 liquid medium with Tm, and plasmid curing was confirmed if no growth was observed. The cured colonies were further cultivated in GS-2 medium containing FOA to the late exponential phase, diluted, and plated on solid medium with FOA to select colonies without PyrF function. Colony PCRs were subsequently performed using primer set HR-F/R to confirm pyrF elimination. The colonies with a PCR product of 4.7 kb were finally verified as the target strain after sequencing.
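The band sizes used in this screening (2.9 kb for the parent, ~5.7 kb after the first recombination, 4.7 kb for the final strain, all with primer set HR-F/R) lend themselves to a simple decision rule. The Python sketch below is only an illustration of that logic; the 0.2-kb tolerance, the labels, and the function names are assumptions, and sequencing remains the final confirmation.

```python
# Expected HR-F/R colony PCR product sizes (kb), as given in the text.
EXPECTED_KB = {
    "parent": 2.9,
    "first_recombination": 5.7,
    "final": 4.7,
}


def match_band(size_kb, tol=0.2):
    """Map an observed band to a genotype label, or None if unexpected.

    tol is a hypothetical gel-reading tolerance in kb.
    """
    for label, expected in EXPECTED_KB.items():
        if abs(size_kb - expected) <= tol:
            return label
    return None


def interpret_colony(bands_kb, tol=0.2):
    """Suggest the next step from the set of bands seen for one colony."""
    labels = {match_band(b, tol) for b in bands_kb}
    if not labels or None in labels:
        return "unexpected pattern: re-check PCR"
    if labels == {"first_recombination"}:
        return "first recombination: proceed to FOA selection"
    if labels == {"first_recombination", "parent"}:
        return "mixed with parent: repeat FUDR screening"
    if labels == {"final"}:
        return "candidate target strain: confirm by sequencing"
    if labels == {"parent"}:
        return "parent genotype: no integration"
    return "mixed population: re-streak and re-screen"


for bands in ([5.7], [5.6, 2.9], [4.7]):
    print(bands, "->", interpret_colony(bands))
```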

Fermentation of C. thermocellum strains
Fermentation of C. thermocellum strains was performed in 250-mL anaerobic bottles containing 100-mL cultures with 10 g/L Avicel as the sole carbon source. Three independent fermentations were set up for each strain, and 1.5-mL cultures were sampled every 8 to 12 h with a 2.5-mL syringe. Of each sample, 1 mL was used to determine the amount of residual cellulose by a modified saccharification method [10], and 0.5 mL was centrifuged to pellet the cells for determination of the total cellular protein [10].

Cellulose saccharification
100-mL fermentations of C. thermocellum strains were initially performed with 5 g/L Avicel as the sole carbon source for 36 h. For aerobic treatment, the cultures were transferred into 250-mL sterile flasks and shaken at 170 rpm aerobically. For acidic treatment, the pH value of the broths was adjusted to 5.5 by adding 1 N HCl in an anaerobic chamber. For BGL treatment, 15 U/g cellulose of CaBglA protein was added at the beginning of the saccharification process. For high-intensity treatment, cells from 200- to 300-mL cultures were concentrated, resuspended, and reinoculated into 100 mL of fresh GS-2 medium anaerobically. Untreated controls were prepared under the same conditions. For all setups, 100 g/L Avicel was supplemented to initiate the cellulose-saccharification stage. The saccharification process lasted for 15-20 days, and a 1-mL sample was taken from each setup with a 5-mL syringe every 1 to 5 days to determine the abundance of produced reducing sugar by the DNS method, and the concentrations of sugars (cellobiose and glucose) and other metabolites (pyruvate, ethanol, lactate, acetate, and formate) by HPLC. The relative saccharification level was subsequently determined by dividing the amount of reducing sugar obtained (mM) by the initial Avicel (617 mM glucose equivalents). Three independent experiments were prepared for every strain under each condition.
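As a worked example of the relative saccharification level, the Python sketch below takes the ratio of the reducing sugar obtained to the initial Avicel expressed in glucose equivalents. It assumes that cellulose is converted to glucose equivalents using the molar mass of an anhydroglucose unit (162.14 g/mol), which gives the stated 617 mM for 100 g/L Avicel; the function names are illustrative.

```python
ANHYDROGLUCOSE_G_PER_MOL = 162.14  # C6H10O5, repeating unit of cellulose


def glucose_equivalents_mM(cellulose_g_per_L: float) -> float:
    """Convert a cellulose loading (g/L) to mM glucose equivalents."""
    return cellulose_g_per_L / ANHYDROGLUCOSE_G_PER_MOL * 1000.0


def relative_saccharification(reducing_sugar_mM: float,
                              cellulose_g_per_L: float = 100.0) -> float:
    """Relative saccharification level in percent."""
    initial_mM = glucose_equivalents_mM(cellulose_g_per_L)
    return reducing_sugar_mM / initial_mM * 100.0


print(round(glucose_equivalents_mM(100.0)))        # initial Avicel, mM
print(round(relative_saccharification(490.0), 1))  # percent
```

With 490 mM of reducing sugar from 100 g/L Avicel, this ratio comes to about 79.4%, matching the value reported in the abstract.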