Materials
Substrate
The substrate used for all experiments was αcellulose (SigmaAldrich, C8002). Compositional analysis showed that the substrate contained glucan 78.5% and xylan 14.4% [11].
Enzyme
The enzyme used in this study was Novozymes Cellic^{®} CTec2 (lot# VCPI 0007), a blend of aggressive cellulases with high levels of βglucosidases and hemicellulases that degrade lignocellulose into sugars [12]. The protein concentration was determined to be 294 mg protein/mL with Pierce BCA assay [11]. Before use, the enzyme solution was diluted ten times with deionized (DI) water.
Citrate buffer
To maintain relatively high enzyme activity, citrate buffer (0.1 M) with a pH of 4.8 was used in enzymatic hydrolysis experiments. To prepare the buffer, citric acid monohydrate and trisodium citrate dihydrate were added to DI water.
Antibiotics
To prevent the growth of contaminating microorganisms that could consume produced sugars, an antibiotic cocktail was added. To prepare the solutions, tetracycline powder was dissolved in an aqueous solution of 70% ethanol at 10 g/L and cycloheximide powder was dissolved in DI water at 10 g/L.
Enzymatic hydrolysis
In the enzymatic hydrolysis experiments, desired amounts of αcellulose, glucose, and DI water together with 125 mL citrate buffer, 2 mL tetracycline solution, and 1.5 mL cycloheximide solution were added to a 1L centrifuge bottle in sequence and then preheated. When the mixture reached 50 °C, enzymes were added. Then, the centrifuge bottle (total working volume of 250 mL) was placed in the incubator at 50 °C for 10 days with an axial rotation of 2 rpm. Liquid samples of 0.5 mL were taken periodically and submerged in boiling water for 20 min to deactivate the enzymes (note: the volume of liquid sample is small relative to the total slurry volume, so it is assumed to have a negligible impact on substrate concentration). Then, to determine the glucose concentration, the samples were filtered and analyzed by a highperformance liquid chromatography (HPLC), which was equipped with a pair of deashing guard columns (BioRad MicroGuard deashing cartridges, 30 mm × 4.6 mm) and an HPLC carbohydrate analysis column (BioRad Aminex HPX87P, 300 mm × 7.8 mm).
Selection of hydrolysis conditions
Experiments for model fitness
Based on our previous study [11], 16 enzymatic hydrolysis conditions were tested including four different substrate concentrations (40, 60, 80, and 100 g/L), two different enzyme loadings (2 and 5 mg protein/g of dry biomass (mg/g)), and two different initial glucose concentrations (0 and 33 g/L).
Experiments for model predictions
Three enzymatic hydrolysis conditions—which were different from the conditions used for model fitness—were tested for model predictions: (1) substrate concentration: 40 g/L, enzyme loading: 1 mg/g, initial glucose concentration: 0 g/L; (2) substrate concentration: 70 g/L, enzyme loading: 1 mg/g, initial glucose concentration: 28 g/L; (3) substrate concentration: 100 g/L, enzyme loading: 5 mg/g, initial glucose concentration: 28 g/L.
Enzyme stability
Enzyme stability was measured by quantifying the soluble protein concentration over the course of 20 days. In this experiment, the desired amount of CTec2 was added to the preheated mixture of citrate buffer, DI water, and antibiotic cocktail. The resulting solution was placed in the incubator at 50 °C. Samples of 0.5 mL were taken periodically and then centrifuged at 13,000 rpm for 10 min. The protein concentration of the supernatant was measured by the Pierce BCA method.
Modification of HCH1 model
Simulation of enzyme stability
Wallace et al. [13] reported that unproductive binding with lignin and thermal deactivation may play a significant role in enzyme deactivation. Considering the substrate used in this study is ligninfree, we assume that enzyme deactivation is solely due to thermal deactivation. RosalesCalderon et al. [14] observed that the protein concentration of a mixture of glucanase and βglucosidase dropped 34% after incubating at 50 °C for 4 days. It was hypothesized that the enzyme proteins suffered a structural change at 50 °C, which led to protein aggregation and precipitation. Additives, whose concentration was assumed constant and proportional to the initial enzyme protein concentration, were supposed to be present in the cocktail to stabilize the enzyme protein against aggregation. Equation 2 incorporates the presence of additives and is proposed to model protein stability [14]. The development of Eq. 2 is described in detail by RosalesCalderon et al. [14].
$$\begin{array}{*{20}c} {  \frac{{\left. {{\text{d}}[E} \right]}}{{{\text{d}}t}} = k_{1} \left[ E \right]  k_{2} \left( {\left[ {E_{0} } \right]  \left[ E \right]} \right)\left[ {E_{0} } \right]} \\ \end{array} ,$$
(2)
where E is the native enzyme protein concentration (g/L), E_{0} is the initial enzyme protein concentration (g/L), and k_{1} and k_{2} are the rate constants (h^{−1}).
The stability of CTec2 with three different initial concentrations was tested. Equation 2 was used to fit the experimental data.
Modification of HCH1 model
The HCH1 model was modified by the following steps:
 Step 1::

Use an empirical equation (Eq. 3) to fit the experimental data of the 16 enzymatic hydrolysis conditions (“Experiments for model fitness” section) with high accuracy. This smoothed version of the data provides the reaction rates needed to fit the parameters in the modified HCH1 model of enzymatic hydrolysis.
$$\begin{aligned} { \frac{{\left. {{\text{d}}[G_{1} } \right]}}{{{\text{d}}t}} = \frac{{3.7798\left( {\left[ {G_{x}^{0} } \right]  \left[ {G_{1} } \right]} \right)^{0.6410} \left( {\frac{{\left[ {E_{0} } \right]\left( {0.0574\left[ {E_{0} } \right] + 0.4370\exp \left( {  t\left( {0.4370 + 0.0574\left[ {E_{0} } \right]} \right)} \right)} \right)}}{{0.4370 + 0.0574\left[ {E_{0} } \right]}}} \right)^{0.8500} }}{{1 + 0.0247\left[ {G_{1} } \right]^{1.1579} }}} \\, \end{aligned} $$
(3)
where \(\begin{array}{*{20}c} { G_{x}^{0} } \\ \end{array}\) is the initial cellulose concentration (g/L, equivalent to glucose).
Equation 3 was developed based on the integrated version of Eq. 2 and an empirical model for batch fermentation [15]. Detailed development of this equation is described in Additional file 1. To fit the parameters, Eq. 3 was solved with the ode45 routine in MATLAB and nonlinear optimization was achieved by the fmincon routine. The objective function was the sum of square errors (SSE), which is the sum of the squared difference between experimental data and the predicted value [16]. The optimal set of parameters corresponds to the smallest SSE value. This empirical correlation of the data provided a coefficient of determination R^{2} = 0.994.
 Step 2::

Divide substrate conversion (from 0 to 1) into 50 equal segments. Using Eq. 3, calculate the reaction rate at each conversion and get a 16 × 50 data set.
 Step 3::

Determine product inhibition.
The inhibition parameter i in the HCH1 model was calculated by determining the initial velocities with and without inhibitor (Eq. 4) [17]. To estimate this value, the same enzyme and cellulose concentrations should be used.
$$\begin{array}{*{20}c} { i = \frac{{V_{{{\text{with}}\;{\text{inhibitor}}}} }}{{V_{{{\text{no}}\;{\text{inhibitor}}}} }}} \\ \end{array} = \frac{{\left[ {\frac{{\kappa \left[ {G_{x}^{0} } \right]\left[ E \right]}}{{\alpha + \left[ {G_{x}^{0} } \right] + \varepsilon \left[ E \right]}}} \right]i}}{{\frac{{\kappa \left[ {G_{x}^{0} } \right]\left[ E \right]}}{{\alpha + \left[ {G_{x}^{0} } \right] + \varepsilon \left[ E \right]}}}}$$
(4)
The inhibition of enzymatic hydrolysis by cellobiose was not considered in this study, because CTec2 contains a high level of βglucosidase that rapidly converts produced cellobiose into glucose; the cellobiose peak was not found in the HPLC results.
For a single inhibitor, the inhibition parameter i is expressed in Eq. 5 and the glucose binding constant β_{1} is calculated with Eq. 6.
$$\begin{array}{*{20}c} { i = \frac{1}{{\left. {1 + \beta_{1} [G_{1} } \right]}}} \\ \end{array}$$
(5)
$$\begin{array}{*{20}c} { \beta_{1} = \frac{{\left( {1  i} \right)}}{{i\left[ {G_{1} } \right]}}} \\ \end{array}$$
(6)
 Step 4::

Use the HCH1 model to fit the 16 reaction conditions at each conversion, and determine the bestfit coefficients κ, α, and ɛ.
 Step 5::

Determine the relationship between parameter κ and conversion x.
Figure 2a presents the relationship between parameter κ in the HCH1 model and substrate conversion x. The data were obtained from Steps 1–4. As shown in the figure, κ drops very fast in the beginning and then stabilizes after conversion reaches 0.38. Nidetzky and Steiner [18] assumed that cellulose consists of an easily hydrolysable part and a recalcitrant part. Based on their twosubstrate hypothesis, the authors derived a mathematical model to describe the kinetics of cellulose hydrolysis. According to the simulation results, the obtained rate constant for easily hydrolysable cellulose was much higher than that of recalcitrant cellulose. Using αcellulose as substrate, they determined that the fraction of easily hydrolysable cellulose was 0.3 [18]. Figure 2a can be explained by this hypothesis, but the rate constant for the easily hydrolysable cellulose decreases as conversion increases instead of being constant. In our experiments, the fraction of easily hydrolysable cellulose (0.38) is close to the result in [18].
Equation 7 was developed to describe the relationship between parameter κ and conversion x.
$$\begin{array}{*{20}c} { \kappa = \frac{{k_{3} }}{{\left( {1 + x^{{k_{4} }} } \right)^{{k_{5} }} }} + k_{6} } \\ \end{array} ,$$
(7)
where x is the substrate conversion, k_{3}, k_{4}, k_{5}, and k_{6} are the parameters.
The conversion x in the denominator is used to describe the negative effect of conversion on the rate constant. The parameter \(k_{6}\) is considered as the rate constant for recalcitrant cellulose. The parameter k_{3} is used to describe the difference in rate constants between the easily hydrolysable cellulose (initial) and recalcitrant cellulose (height of the curve). The parameters \(k_{4} \;{\text{and}}\;k_{5}\) are used to describe the decrease rate of the rate constant (steepness of the curve) for the easily hydrolysable part. To fit the data, the MATLAB curve fitting tool was used and a coefficient of determination R^{2} = 0.998 was acquired. The values of parameters k_{3}, k_{4}, k_{5}, and k_{6} were determined in this step.
 Step 6::

Determine the relationship between parameter ε and conversion x.
Figure 2b shows the relationship between parameter ε in the HCH1 model and conversion x. As shown in the figure, parameter ε has a very narrow range (0–0.5) over the entire conversion and remains almost unchanged (nearly zero) at conversion 0.1–0.95. Therefore, in this study, parameter ε is assumed not to be affected by conversion and its optimal value should be close to zero. Brown and Holtzapple [19] reported that if [\(\begin{array}{*{20}c} { G_{x}^{0} } \\ \end{array}\)]/[E_{0}] > 35, assuming ε = 0 would not introduce considerable error (< 1%) (note: in our study, [\(\begin{array}{*{20}c} { G_{x}^{0} } \\ \end{array}\)]/[E_{0}] ≥ 200). The parameter ε is needed only at high enzyme loadings. In industrialscale saccharification, considering the high cost of enzymes, the enzyme dosage must be very low; therefore, if modeling a commercial process, the value of ε can be set as zero.
 Step 7::

Determine the relationship between parameter α and conversion x.
The parameter α in the original HCH1 model may be expressed by Eq. 8, which is related to enzyme adsorption:
$$\begin{array}{*{20}c} {\alpha = \frac{{\left[ {E^{f} } \right]\left[ {G_{x}^{f} } \right]}}{{\left[ {E^{a} } \right] + \left[ {EG_{x} } \right]}}} \\ \end{array} .$$
(8)
Kumar and Wyman [20] showed that glucose addition and enzyme dosage can affect the percentage of cellulase adsorption. Therefore, besides the impact of conversion, the effects of glucose and enzyme concentration on the value of α were tested. Using the bestfit coefficients κ and ɛ obtained from Step 4, two optimal α values corresponding to two initial glucose concentrations were determined by fitting the data (eight data at each initial glucose concentration) from Step 2 at each conversion with the HCH1 model (Fig. 3a). Another two optimal α values corresponding to two enzyme concentrations were determined by repeating this procedure at each conversion (Fig. 3b). As shown in Fig. 3, as the reaction proceeds, the value of α increases and then is unchanged when the conversion reaches a certain point. It is obvious that high initial glucose concentration and low enzyme dosage improve the value of α significantly over the entire conversion range. Equation 9 was proposed to describe the relationship among α, conversion x, enzyme concentration E, and glucose concentration G_{1}. As shown in Fig. 3, all four curves show an “S” shape; therefore, the sigmoid function—which has an Sshaped curve—was used. The core structure of Eq. 9 is a sigmoidal function that describes the relationship between parameter α and conversion x (note: a_{2} and a_{3} are the parameters of the sigmoid function). In addition, because of the significant effect of glucose and enzyme concentrations on the value of α, the terms [G_{1}] and [E] were included in the numerator and denominator of the sigmoid function, respectively. The parameter a_{1} was added to describe the weight of terms [G_{1}] and [E].
$$\begin{array}{*{20}c} { \alpha = \frac{{a_{1} \left[ {G_{1} } \right]}}{{\left[ E \right]\left( {1 + { \exp }\left( {  a_{2} x + a_{3} } \right)} \right)}}} \\ \end{array} ,$$
(9)
where a_{1}, a_{2}, and a_{3} are the parameters.
 Step 8::

Modify HCH1 model.
Summarizing the proposed equations, Eq. 10 is the modified HCH1 model, where k_{1}, k_{2}, k_{3}, k_{4}, k_{5}, k_{6}, a_{1}, a_{2}, a_{3}, ε, and β_{1} are parameters. Estimates for k_{1}, k_{2}, k_{3}, k_{4}, k_{5}, k_{6}, and β_{1} were determined in previous steps. In this step, the optimal values of a_{1}, a_{2}, a_{3}, and ε were determined by simultaneously fitting the experimental data of the 16 enzymatic hydrolysis conditions (Section: Experiments for model fitness) with Eq. 10.
$$\frac{{\left. {d[G_{1} } \right]}}{dt} = \frac{{\kappa \left[ {G_{x} } \right]\left[ E \right]i}}{{\alpha + \varphi \left[ {G_{x} } \right] + \varepsilon \left[ E \right]}},$$
where,
$$i = \frac{1}{{\left. {1 + \beta_{1} [G_{1} } \right]}}$$
$$\begin{array}{*{20}c} {\varphi = \frac{{\left[ {G_{x} } \right]  \alpha  \varepsilon \left[ E \right] + \sqrt {\left( {\left[ {G_{x} } \right]  \alpha  \varepsilon \left[ E \right]} \right)^{2} + 4\alpha \left[ {G_{x} } \right]} }}{{2\left[ {G_{x} } \right]}}} \\ \end{array}$$
$$ \frac{{\left. {d[E} \right]}}{dt} = k_{1} \left[ E \right]  k_{2} \left( {\left[ {E_{0} } \right]  \left[ E \right]} \right)\left[ {E_{0} } \right]$$
$$\kappa = \frac{{k_{3} }}{{\left( {1 + x^{{k_{4} }} } \right)^{{k_{5} }} }} + k_{6}$$
$$\begin{array}{*{20}c} { \alpha = \frac{{a_{1} \left[ {G_{1} } \right]}}{{\left[ E \right]\left( {1 + { \exp }\left( {  a_{2} x + a_{3} } \right)} \right)}}} \\ \end{array} .$$
(10)
Integration of the differential equations described in Eq. 10 was performed using the same numerical methods described in Step 1.
Sensitivity analysis
Local sensitivity analysis
Local sensitivity analysis assesses the local impact of variation in input factors on model outputs. To do this analysis, the direct differential method [21] was used by calculating the sensitivity indices (Eq. 11). The sensitivities of parameters k_{1}, k_{2}, and β_{1} were not analyzed, because their values were obtained from independent experiment or calculation.
$$\begin{array}{*{20}c} { S_{{p_{j} }} = \frac{\partial y}{{\partial p_{j} }}\frac{{p_{j} }}{y}} \\ \end{array},$$
(11)
where \(S_{{p_{j} }}\) is the nondimensional sensitivity index of the jth parameter, y is the glucose concentration (g/L), and p_{j} is the jth parameter in the parameter vector p.
Global sensitivity analysis
Local sensitivity only analyzes the sensitivity of a solution close to the optimal parameter set. In contrast, global sensitivity analysis assesses the sensitivity of the model for the full range of possible parameter values [16]. In addition, global sensitivity indices can evaluate the combined impact of multiple parameters on model output.
To calculate a global sensitivity index, a normally distributed search of parameter space using the Monte Carlo method was performed and subsequent analysis of the variance in the model outputs was used. In this study, two global sensitivity indices were calculated: firstorder index and totaleffect index [16, 22]. The firstorder index measures the effect of the parameter of interest alone on the output variance. The totaleffect index accounts for not only the effect of the parameter of interest, but also interactions between the other parameters and the parameter of interest at any order.
Model evaluation
The modified HCH1 model was compared with the literature models for longterm enzymatic hydrolysis. As described in Step 1, the same differential equation integration method, nonlinear optimization constraint algorithm, and objective function were used. The Akaike information criterion (AIC) was used to evaluate model quality for the experimental data. The corrected version of AIC (Eq. 12) was used, because the number of observations was not large enough:
$$\begin{array}{*{20}c} {{\text{AIC}}_{C} = N \cdot \ln \left( {\frac{\text{SSE}}{N}} \right) + 2\left( {P + 1} \right) + 2\frac{{\left. {\left( {P + 1} \right)(P + 2} \right)}}{N  P}} \\ \end{array} ,$$
(12)
where N is the number of observations, P is the number of model parameters, and SSE is the sum square error.