Investigating the effects of substrate morphology and experimental conditions on the enzymatic hydrolysis of lignocellulosic biomass through modeling

Background Understanding how the digestibility of lignocellulosic biomass is affected by its morphology is essential to design efficient processes for biomass deconstruction. In this study, we used a model based on a set of partial differential equations describing the evolution of the substrate morphology to investigate the interplay between experimental conditions and the physical characteristics of biomass particles as the reaction proceeds. Our model carefully considers the overall quantity of cellulase present in the hydrolysis mixture and explores its interplay with the available accessible cellulose surface. Results Exploring the effect of various experimental and structural parameters highlighted the significant role of internal mass transfer as the substrate size increases and/or the enzyme loading decreases. In such cases, diffusion of cellulases to the available cellulose surface limits the rate of glucose release. We notably see that increasing biomass loading, while keeping enzyme loading constant should be favored for both small- (R < 300 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mu m$$\end{document}μm) and middle-ranged (300 < R < 1000 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mu m$$\end{document}μm) substrates to enhance enzyme diffusion while minimizing the use of enzymes. In such cases, working at enzyme loadings exceeding the full coverage of the cellulose surface (i.e. eI>1) does not bring a significant benefit. For larger particles (R > 1000 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mu m$$\end{document}μm), increases in biomass loading do not offset the significant internal mass transfer limitations, but high enzyme loadings improve enzyme penetration by maintaining a high concentration gradient within the particle. We also confirm the well-known importance of cellulose accessibility, which increases with pretreatment. Conclusions Based on the developed model, we are able to propose several design criteria for deconstruction process. Importantly, we highlight the crucial role of adjusting the enzyme and biomass loading to the wood particle size and accessible cellulose surface to maintain a strong concentration gradient, while avoiding unnecessary excess in cellulase loading. Theory-based approaches that explicitly consider the entire lignocellulose particle structure can be used to clearly identify the relative importance of bottlenecks during the biomass deconstruction process, and serve as a framework to build on more detailed cellulase mechanisms. Supplementary Information The online version contains supplementary material available at 10.1186/s13068-021-01920-2.

surface. Mathematically, change in the particle porosity can thus be expressed as: This translation allows us to compare our model more directly to a general and common mechanistic model of cellulose hydrolysis by single enzymes (Nill et al., 2018), where the glucose release is dictated by the processive action of productively bound cellulases hydrolysing cellulose into glucose at a rate given by the catalytic rate constant [ Figure   S0]: leading to the following equivalence: Considering that values of intrinsic processivity for crystalline cellulose fall in the range of  Table S1 shows an overview of the available data found in literature for the various substrates used as inputs for the model in this work (Grethlein, 1985).  Table S1. Characteristics of the various substrates used in this study (Grethlein, 1985;Luterbacher et al., 2013) The overall digestibility, defining the accessible cellulose fraction, is taken as the reported 24h-glucose yields. Particle radii are assumed to correspond to the average size between sieves used to screen the native biomass particles (i.e. = 25

SUBSTRATE
). Enzymatic hydrolysis was carried out at a low solid loadings of 2% in citrate buffer pH=4.8 at 50°C for 24h, using cellulase powder from the Trichoderma Reesei Rutger's C strain (92.5mg/100ml) complemented by a source of -glucosidase (0.1ml).
In addition to these data, a new data set was generated from beech wood, the characteristics of which are summarized in Table S2. Beech wood particles were here subjected to dilute sulfuric acid pretreatment in typical lab-scale conditions (Shi et al, 2011, Sun et al. 2014). The final glucose yield was assumed to be that reached after 120h when the enzymatic hydrolysis was performed at high enzyme loadings to avoid incomplete cellulose degradation stemming from enzyme-related limitations. In any case, sensitivity analysis performed to evaluate this influence of change in the final yields on the predictions of the 2h-glucose yields showed only a marginal effect [ Figure S3]. The recorded concentration of cellobiose throughout hydrolysis was marginal in comparison to the glucose concentrations, as illustrated in Figure S4.  3150] 0.6773 0.61 33.9 ± 3.1 Table S2. Characteristics of the substrates generated in this study Figure S1. Predicted 2h-glucose yields as a function of the specified final glucose yield in the model for pretreated beech wood (1%SA/160°C/30min) with different particle sizes. Simulations were performed for a low enzyme loading of 0.3. Experimental enzyme loadings were expressed as normalized to the number of initially accessible binding sites on the cellulose surface assuming an average molecular mass of 51.8 kDa for the cellulases making up the enzyme cocktail used experimentally (Seiboth et al., 2011).

A3. Individual and combined optimization of and
Tortuosity values were rationally bound between 1 and 7, which represent situations from the case where all pores are aligned with the diffusive flux to extreme cases of tortuosity (Davis et al., 2003). When fitting Mp over the whole set of data, only marginal differences in fit were obtained for values of tortuosity up to 4-5 [ Figure S5], with slight decreases in goodness-of-fit when increasing the diffusion resistance. The choice of the optimal parameter to use for simulations was then purely based on the slightly improved R 2 value (leading to the optimal value of Mp = 755 and τ=2 ). When instead performing individual fits for each substrate [ Figure S6], we did not observe any clear trend based on substrate characteristics (e.g. pretreatment severity), indicating again that the dependence on accessibility dominated the kinetics and that small changes in Mp or τ are likely due to uncertainty in parameter estimation rather than the result of physical phenomena.

A4. Effect of external mass transfer
The potential impact of external mass transfer on the overall reaction rate of cellulose depolymerization was evaluated using the modified Weisz-Prater criterion for external mass transfer (Davis et al., 2003), where is the observed reaction rate, the particle radius, , the enzyme bulk concentration, is the reaction order and ̅ the mass transfer coefficient. Satisfying this relationship indicates that internal mass transfer or reaction controls the reaction and that external mass transfer can be neglected.
While the viscosity ̅ and density of the sodium citrate buffer containing the enzymes is assumed to be same as those for water, the fluid velocity is calculated based on the flask internal radius ( = 1.25 ⋅ 10 −2 ) and incubator rotation speed ( = 120 ), Since we are comparing enzyme mass transfer to its participation in the surface reaction, the quantity of glucose generated per amount of time is divided by the factor to calculate the quantity of enzyme participating in the reaction per time. This factor accounts for the fact that several glucose molecules are generated for one enzyme going through a binding cycle.
The Weisz-Prater criterion was evaluated for the initial predicted reaction rate (the fastest rate, and thus the most likely to be controlled by external transfer) for two extreme particle sizes, In both cases, predicted initial rates with respect to enzyme concentration, closely followed a first order reaction rate, thus = 1 was used as a first approximation. Altogether, the values were far below the criteria requirement. Therefore, even though the estimation of and ̅ were based on several approximations, we can safely assume that external mass transfer plays a negligible role on enzymatic hydrolysis and can safely be ignored.

A5. Working equationssurface enzyme concentration and non-dimensional formulation
The time evolution of the maximum enzyme concentration at the cellulose surface where is the parameter representing the number of moles of enzymes per unit of surface, ,0 is the initial accessible pore volume and ,0 the initial cellulose surface.
Here, we assumed that the available surface for adsorption tends to 0 as the cellulose hydrolysis goes to completion, i.e.: (4)- (5) and (6) where ℎ is the size of the discretized step and ( ) is the function of interest.

A6. Model parametersrate constants
In this work, we assumed a general desorption rate constant, with no distinction made between desorption and decomplexation rate. This global desorption rate constant used However, by assuming a particle with density including of 0.9 in solution and a loading between 2-10%, the desorption rate constant used in this study can be estimated at around 0.01-0.05 μM -1 s -1 , which is within the same order of magnitude reported for adsoprtion rates of 0.097-0.33 μM -1 s -1 and complexation rates of 0.018-0.029 μM -1 s -1 by Nill et al (2018). The slightly lower adsorption rates obtained can likely be explained by the fact that this parameter includes all enzymes, including some enzymes that bind and others that don't or may bind irreversibly to other parts of lignocellulosic biomass.

A7. Sensitivity Analysis
Sensitivity analyses were performed on relevant parameters for both cases with high cellulose accessibility (Dataset DS1, DAP mixed hardwood at 220°C for 7.8s) and low cellulose accessibility (dataset DS1, native mixed hardwood). Figure S7 shows the variation observed in 2h-glucose yield while changing the kdes, kads and Dbulk values by x10 or x0.1 times for each variable, while keeping other parameters equal to their initial values. Figure S8 represent the combined effect of change in fitted parameter  and Mp on the predicted 2h-glucose yield for both substrates. For the substrate with low cellulose accessibility, the change in kinetic parameters have virtually no impact on the initial glucose rate, as the reaction rates are largely controled by internal diffusion limitations [ Figure S6]. This effect is even more pronounced when the ratio of enzyme:initial binding sites is decreased, as the concentration gradient within the particle decreases further, slowing diffusion. In contrast, for particle presenting higher porosity, glucose release rates are strongly dependent to the desorption rate constant, while being relatively insensitive to the adsorption rate. In this case [ Figure S6]. For both substrates, increasing the internal diffusion resistance by increasing the tortuosity lead to a shift in the fitted parameter Mp towards a larger value [ Figure S7], which shows a proportional relationship between the optimal value of tortuosity and the value of Mp.  A8. Estimation of the number of accessible binding sites from porosity Figure S7. Schematic representation of the double-slit pore geometry, and how the pore's surfaces Sp are accounted for depending on the pore diameter dp and the size of a cellulase.
The initial number of accessible binding sites is computed from the pore size distribution measured for a specific substrate using purely geometrical assumptions [ Figure S1].
Based on the hypothesis that pores have a double-slit geometry (Stone and Scallan, 1968), the total pore surface Stot can be computed from the pore volume Vp and the diameter/width dp as follows, Importantly, we assume that pores presenting a diameter smaller than that of a cellulase (i.e. smaller than 51A) are not accessible to cellulases and thus their pore surface is not accounted for in the total pore surface Stot. In this study, a distinction is also made between pores that can accomodate 1 or 2 cellulases within their diameter by considering 1 or 2 pore surfaces within a slit geometry [ Figure S1]. This surface is then converted into a number of cellulase adsorption sites by assuming (i) cellulose, hemicellulose and lignin are evenly distribute on the pore surface surface and (ii) that the footprint of a cellulase on the cellulose surface is equivalent to 5X5 nm (Bothwell, 1994).

A9. Boundary condition -derivation
Adopting Fick's law to describe the flux of enzyme entering the particle from the bulk solution, the boundary condition at the outer radius of the cylindrical biomass particle is obtained by integrating this flux over the entire particle surface

A10. External vs. internal accessible cellulose surface
Assuming a cylindrical geometry for the biomass, the relationship between internal and external surface can be derived as follows. The external surface of a cylindrical particle , of radius , not considering the edge surface, can be related to its volume , and consequently its mass , where is the particle density including the void created by the pores. This lead to the ratio between external and internal surface being: , , = , 2 .
As external surface is likely to be more important for particles exhibiting a low porosity and a small diameter, we consider here the extreme cases of a 10 particle with a porosity of 6 2 / of biomass (i.e. the typical accessibility of native mixed hardwood), leading to a ratio of 29.5. In this extreme case, in which cellulose was also assumed to be evenly distributed in the particle, the error on the calculated accessible surface would be around 2%.

A11. Model implementationverification and convergence
The convergence of the system against the discretization parameter was first assessed to ensure reliable estimations of glucose release over time for relevant ranges of model parameters. As analytical solutions are not available for the system of equations considered here, the correctness of the implemented algorithm was evaluated by verifying proper conservation of enzymes over the course of the simulation.
The convergence of simulations can be evaluated by verifying that the conservation of the total number of enzymes in the system is conserved by calculating and verifying the following criterion : While increasing the discretization number improves accuracy, a trade-off value ensuring less than 2% deviation in the enzyme mole balance was chosen in order to alleviate computational cost and keep simulation time reasonable. For most systems considered here, this condition was already fulfilled with = 50. Notable exceptions occurred for simulations that used low enzyme loadings ( ≤ 1). In these cases, because the pore enzyme concentration tended to be particularly low due to significant surface adsorption, a finer grid was required to maintain numerical stability. Figure S1 shows convergence of the system as a function of the model parameters for a small particle, with different enzyme loadings. For lower enzyme loadings and/or higher biomass loadings (not pictured) the criterion was met only for a discretization number above 2000. However, = 50 was actually sufficient for most conditions. Notably, as a consequence of the finite time step and the fact that results are essentially exact up to machine precision, does not tend to zero, but to a finite value. Figure S8. Convergence as a function of discretization number for particle exhibiting a radius of 10um and biomass loading of 1%.

A12. Compositional Analysis
Compositional analysis followed the LAP procedure published by NREL (NREL, Sluiter et al., 2012). Briefly, dried substrates were subjected to a two-step acid hydrolysis, starting with the incubation of samples (~0.3 g) in 7.5ml of acid solution (72% SA) for 2h at 30°C and 120 rpm, followed by dilution down to 3% sulfuric acid concentration before proceeding with the reaction at 121°C for 1h in an autoclave. Lignin was then separated from the soluble sugars by filtration and dried overnight at 105°C for quantification, while cellulose and hemicellulose contents were estimated through determination of sugar concentration in the filtrate by HPLC analysis (Bio-Rad Aminex HPX-87H column, 5mM H2SO4 mobile phase). The moisture content of all samples was initially determined by drying the wood particles overnight at 105°C and weighing the dried mass at room temperature after the sample was left for 2h in a desiccator.

A13. Pore size distribution
Characteristics of the relevant probes employed in this study are summarized in Table S3.  ( Measurement of the fiber saturation point (FSP) was performed using a probe that was assumed to be unable to penetrate into the pores. In this case, we used a probe with a size of 560 Å, and used the following formula to calculate the pore volume accessible to probe : , = ,560 − , . (2)

A14. Model -Error propagation
To account for uncertainties associated with experimental measurements, and in particular those associated with the experimental determination of the pore volume distribution, a Monte-Carlo uncertainty analysis was performed in silico. To this end, the initial standard error associated with the accessible pore volume measured experimentally was sampled assuming a normal distribution using the function normrnd in Matlab. A sampling number of 50 was found to ensure that the average pore volume calculated from the generated sampling distribution led to the correct average pore volume and associated standard error, while limiting computational cost. Simulations were then run for each of the generated data sets, which allowed for computing uncertainties on the predicted glucose yields.