 Research
 Open access
 Published:
Computational inference of the structure and regulation of the lignin pathway in Panicum virgatum
Biotechnology for Biofuels volumeĀ 8, ArticleĀ number:Ā 151 (2015)
Abstract
Background
Switchgrass is a prime target for biofuel production from inedible plant parts and has been the subject of numerous investigations in recent years. Yet, one of the main obstacles to effective biofuel production remains to be the major problem of recalcitrance. Recalcitrance emerges in part from the 3D structure of lignin as a polymer in the secondary cell wall. Lignin limits accessibility of the sugars in the cellulose and hemicellulose polymers to enzymes and ultimately decreases ethanol yield. Monolignols, the building blocks of lignin polymers, are synthesized in the cytosol and translocated to the plant cell wall, where they undergo polymerization. The biosynthetic pathway leading to monolignols in switchgrass is not completely known, and difficulties associated with in vivo measurements of these intermediates pose a challenge for a true understanding of the functioning of the pathway.
Results
In this study, a systems biological modeling approach is used to address this challenge and to elucidate the structure and regulation of the lignin pathway through a computational characterization of alternate candidate topologies. The analysis is based on experimental data characterizing stem and tiller tissue of four transgenic lines (knockdowns of genes coding for key enzymes in the pathway) as well as wildtype switchgrass plants. These data consist of the observed content and composition of monolignols. The possibility of a Glignin specific metabolic channel associated with the production and degradation of coniferaldehyde is examined, and the results support previous findings from another plant species. The computational analysis suggests regulatory mechanisms of product inhibition and enzyme competition, which are well known in biochemistry, but so far had not been reported in switchgrass. By including these mechanisms, the pathway model is able to represent all observations.
Conclusions
The results show that the presence of the coniferaldehyde channel is necessary and that product inhibition and competition over cinnamoylCoAreductase (CCR1) are essential for matching the model to observed increases in Hlignin levels in 4coumarate:CoAligase (4CL) knockdowns. Moreover, competition for 4coumarate:CoAligase (4CL) is essential for matching the model to observed increases in the pathway metabolites in caffeic acid Omethyltransferase (COMT) knockdowns. As far as possible, the model was validated with independent data.
Background
About 440 million years ago plants started to leave the oceans and inhabit land [1, 2]. The emergence of lignin during this time was an adaptation to the new environment and, specifically, a response to gravity and to limitations in accessing water. The new life also demanded plants to store water and develop systems of water transfer. The plant furthermore needed to grow in height in order to have enough access to sunlight and oxygen. Plants ultimately accomplished these multiple tasks through their xylem structures, of which lignin is a key constituent. Lignin is a phenolic polymer that is woven around and between cellulose and hemicellulose within the secondary cell wall; it provides strength and facilitates water transfer in plants. A consequence of these significant benefits for plants is that lignin is very difficult to decompose, because it is an irregular polymer that contains aromatic rings. This resistance against decomposition and digestion is known as recalcitrance. It is arguably the most important barrier to industrializing secondgeneration biofuels, and in particular the production of ethanol from inedible plant parts as sustainable and affordable biofuels, because recalcitrance necessitates additional treatment steps, such as hot acid or ammonia baths, to loosen the lignin structure [3ā5]. These steps require time and expense and therefore reduce feasibility and cost effectiveness. Moreover, most of the pretreatments are not environmentally friendly [6, 7]. Outside the biofuel industry, recalcitrance affects forage digestibility, and progress toward reducing recalcitrance could have a significant impact on the cattle and sheep industry [8].
Numerous attempts have been made in recent times to manipulate the lignin content and composition in candidate plants for biofuel production. Many of these studies relied on the assumption that the lignin biosynthesis pathway was known. However, this is not necessarily the case, especially in understudied plant species, and the precise pathway structure is often unclear and requires dedicated research for such species. For instance, Selaginella moellendorffi and Medicago truncatula have basically similar lignin pathways, which however differ in some of their metabolic branch points as well as their enzyme properties [9ā11]. Beyond the topological structure, it is not surprising that different species have evolved distinct regulatory control patterns. The immediate consequence of such discrepancies for the biofuel industry is that the direct extrapolation of knowledge, methods and treatments from one species to another is not necessarily valid. Moreover, it is well known that pathway systems are highly nonlinear and difficult to predict with intuition alone. A feasible strategy is therefore to employ computational approaches of systems biology and metabolic engineering.
The design of suitable models for this purpose is not trivial. First, it is generically unclear which mathematical representations are optimal for describing a natural system. Second, one cannot be sure that information or data from one species can be assumed to be valid in another species, even if the two are closely related. Similarly, it has been shown many times that data obtained in vitro are not necessarily applicable in vivo [10ā14]. At the same time, speciesspecific experiments are time consuming and expensive. Mechanistic models based on enzyme kinetics seem to be an intriguing choice, but it has been shown that mechanistic models are not always good solutions, for instance, if parameter values and enzymatic rate laws are based on strong assumptions like bulk reactivity that are not necessarily satisfied in vivo [12]. An alternative that was recently proposed is the characterization of in vivolike kinetics [13], which however is costly and time consuming and would still require extensive validation, which however is seldom truly achieved [12]. An additional challenge for the design of models is the scarcity and quality of test and validation data, which pose a significant obstacle to all analyses of relatively understudied species.
In this study we analyze the lignin biosynthesis pathway in switchgrass, Panicum virgatum, with computational means of systems biology. The analysis is based on a dataset from stem and tiller tissue that consists of the lignin content (H, G and S lignin) and the S/G lignin ratio in wild type and in four transgenic lines (4CL, CCR1, CAD and COMT knockdowns). To some degree, details of the in vitro kinetics of some of the pathway enzymes have also been determined by one of our labs. Our approach here is to develop computational models that characterize the structure and regulatory control patterns of lignin biosynthesis in P. virgatum at a systemic level. The goals of this modeling approach are, first, to explain the experimental results from wild type and transgenic lines and, second, to devise a rational basis for strategies to manipulate the pathway toward reduced recalcitrance.
Results
The results are described in a sequence that follows our stepbystep model design and conveys our rationale for utilizing the observations to remediate discrepancies with the data and for suggesting the investigation of new features to the model in the next step of the analysis. We begin by assessing the pathway structure in switchgrass as it is alleged in the current literature. Next, we examine possible channeling of CCR/CAD, which has been reported for the lignin pathway in alfalfa [5, 14], but not in switchgrass. Even accounting for the possibility of channeling, the experimental data regarding H lignin cannot be captured at this point. Thus, we investigate the effects of product inhibition and competitive inhibition. In the next phase, 4CL inhibition is added as a potential explanation for the accumulation of 4CL substrates, along with a simultaneous decrease in coniferaldehyde in the COMT knockdown. Finally, principal component analysis is performed to investigate the distribution of parameters within the highdimensional parameter space and to reduce the feasible subspace of parameter values. The results section ends with a validation of the model.
Reaction system of lignin biosynthesis in switchgrass
The traditionally accepted lignin biosynthesis pathway branches at pcoumaroyl CoA to provide S and Glignin precursors (Fig. 1). The hexagon in this figure shows the details of this branch point. It was also previously assumed, based on studies in the dicots A. thaliana and N. benthamiana, that pcoumaroyl CoA is converted to pcoumaroyl shikimate and pcoumaroyl quinic acid by HCT. Subsequently, both products, pcoumaroyl shikimate and pcoumaroyl quinic acid, were shown to be converted to caffeoyl shikimate and caffeoyl quinic acid, respectively [15]. The enzyme for these unidirectional reactions is C3ā²H. Downstream, HCT was proposed to operate in the reverse direction to convert caffeoyl shikimate and caffeoyl quinic acid into caffeoylCoA.
A recent study demonstrated that this pathway organization is unlikely to occur in switchgrass [16]. Based on kinetic measurements of PvHCT1a, PvHCT2a and PvHCTLike1, it was shown that caffeoyl shikimate is not converted to caffeoylCoA by the reverse HCT reaction, but is more likely converted into caffeic acid through caffeoyl shikimate esterase, and that this step is actually the main route of mass transfer into the pathway towards S and G monolignols. As indicated with dashed arrows in Fig. 1, HCT is not active in the formation of caffeoylCoA. This new information helps us reduce the steps in Fig. 1. It has furthermore been suggested that cinnamic acid is a precursor for salicylic acid; this process is represented by the thick grey arrow [5]. Similarly, a considerable portion of ferulic acid leaves the pathway [17]. Finally, the efflux out of pcoumaric acid acts to avoid accumulation of the metabolite in the 4CL knockdown strain (Fig. 1). These simplifications yield the pathway diagram in Fig. 2.
At this point, it is not entirely clear whether the lignin pathway in switchgrass contains caffeyl aldehyde. It appears that this is not the case, and the following analysis assumes that caffeyl aldehyde is indeed not produced. Nonetheless, since other species do generate this intermediate, the Additional file 1: Text S1 analyzes this case.
Largescale simulation studies with this pathway structure lead to irreconcilable differences between the experimental data and the model results, which indicate that the model has genuine flaws. In particular, the dynamics of the different lignin species cannot be explained for the various transgenics (data not shown).
Channeling
Experimental and theoretical work in alfalfa has suggested that functional enzymatic channeling likely occurs at the coniferaldehyde node [5, 14]. According to this suggestion, the āGchannelā facilitates the use of feruloylCoA for the production of coniferyl alcohol, which is the precursor of the G monolignol (Fig. 2). We investigate the same channeling hypothesis here as a possibility. Specifically, we use pertinent experimental data from switchgrass to analyze the feasibility of different hypothetical pathway topologies. The potential existence of a functional complex consisting of CCR1/CAD leads to three possible pathway topologies that satisfy the requirement of mass conservation (Fig. 3).
Each of these sofar unregulated topologies was modeled as a generalized mass action (GMA) model, whose parameter values were obtained with a sophisticated largescale sampling scheme (see āMethodsā). Although all topologies were found to be consistent with most of the experimental results, no topology was compatible with the accumulation of H lignin in 4CL knockdown transgenics (Table 1); this situation could not be simulated by any of the candidate models, regardless of the presence or absence of the channel. This strong result suggests the existence of regulatory mechanisms, and considering the structure of the pathway and the branch toward H lignin in particular, we decided to analyze the possible role of product inhibition, which is frequently found in pathway systems in vivo.
Product inhibition
Experimental results from transgenic plants have demonstrated that H lignin accumulates when the enzyme 4CL is downregulated [3]. Analyzing this initially counterintuitive observation closer suggests that there might be a wave of accumulation in the metabolites preceding H lignin. Such a wave can be explained with product inhibition (Fig. 4). When an enzyme is downregulated, the corresponding substrate accumulates. The secondary effect is that the accumulated substrate is by itself a product of a previous reaction whose increased concentration decreases its own rate of production. This backward cascade has an upstream domino effect along the pathway and, depending on the kinetics of the reactions, can lead to the accumulation of upstream metabolites. This observation can be explained by the following chain of events: Downregulating 4CL leads to a decrease in the products of this enzyme, i.e., pcoumaroylCoA, caffeoylCoA, and feruloylCoA. At the same time, product inhibition leads to a backward accumulation in upstream metabolites, which compensates, at least partially, for the initial decrease in pcoumaroylCoA. Product inhibition is easily incorporated into the GMA model (see āMethodsā). Thus, in a new round of simulations, a new set of 100,000 randomly sampled parameter values was generated as before, this time accounting for product inhibition. Again, the configurations satisfying the experimental results were recorded.
Although the simulations showed an improvement regarding the H lignin accumulation in the 4CL knockdown, no topology reached the twofold increase that was reported in the literature [3].
Substrate competition for shared enzymes
Several enzymes in the lignin pathway catalyze multiple reactions with slightly different substrates, and it is reasonable to assume substrate competition for an enzyme among the multiple substrates. This competition can play an important role in altering the flow of mass in a mutant plant.
We explored the consequences of substrate competition with respect to the pertinent enzyme CCR. The analysis yielded the following result. If CCR favors pcoumaroylCoA over feruloylCoA, due to substrate competition, the flux towards H lignin is increased. In fact, simulation analysis shows that the increase in H lignin is strong enough to match the experimental data.
It could be possible that substrate competition alone would be sufficient for increased H lignin production. We tested this conjecture with a corresponding simulation, which revealed that only the combined model with product inhibition and substrate competition matches the experimental observations. The strength of inhibition is a priori unknown, but simply becomes a parameter value in the GMA model (see āMethodsā section). For instance, consider the pathway in Fig. 4, where X _{2} and X _{6} share the same enzyme for fluxes V _{2} and V _{4}. Blue arrows represent the competition between the substrates, while red arrows represent product inhibition. In this case the equation for V _{2} becomes
where Y _{2} is the enzyme catalyzing the reaction (CCR).
Inhibition of 4CL in COMT knockdown transgenics
Although product inhibition and substrate competition improve the consistency between the experimental data and numerical results in CCR1 transgenic plants, the model does not match COMT knockdown data sufficiently well. Specifically, the model does not capture the observed 30 % increase in ferulic acid in COMT knockdowns [4]. This observation becomes even more difficult to explain if one considers the simultaneous 20 % decrease in coniferyl aldehyde. One could speculate that the high accumulation in 5OHferulic acid might trigger a cascade of product inhibition that leads to the accumulation of ferulic acid, but computational results did not support the idea.
Further analysis with the model revealed that the reaction from ferulic acid to feruloylCoA, which is catalyzed by 4CL, is the bottleneck. Indeed, the computational results show that this reaction has a flux that is 10 times as large as the efflux from ferulic acid towards 5OHferulic acid. Thus, if the flux towards ferulic acid decreases, any substantial accumulation is impossible unless the 4CL reaction is inhibited. This modelbased deduction is indirectly supported by experimental data from one of our labs that exhibit a slight accumulation in the distant pcoumaric acid and caffeic acid, which is explained by 4CL inhibition as well (data not shown).
Accounting for the deduced 4CL inhibition in the model leads to simulations that faithfully capture all experimental data associated with the COMT knockdown; in particular, the 4CL substrates accumulate and the concentration of coniferaldehyde decreases, as observed. From a biochemical point of view, one might be interested in identifying the inhibiting agent. As it was mentioned earlier, the 5OHferulic acid concentration increases by 70 % in COMT knockdown plants. While the metabolite has not been identified as a substrate for 4CL, it might be reasonable to assume that it binds to 4CL in high concentrations, due to its molecular similarity, and thereby inhibit the enzyme competitively (Fig. 5). While this hypothesis remains to be experimentally validated, the same type of substrate competition with respect to 4CL has recently been proposed by others [18]. To implement 4CL inhibition in the model in the most generic manner, we simply lowered the corresponding rate constants.
Compatible configurations
The mathematical model with universal product inhibition, substrate competition for CCR1, inhibition of 4CL, and the possibility of a metabolic channel was subjected to largescale simulations aimed at inferring the most likely topology of the lignin pathway (recall Fig. 3). Similar to previous simulations, a sample of 100,000 parameter sets was generated to test model consistency with the experimental data and to provide likely kinetic orders for the model (see āMethodsā). Intriguingly, the only pathway configuration that is compatible with all available data is Configuration 1 of Fig. 3. Note that the speculated coniferaldehyde channel is indeed present. In fact, no parameter set, using Configurations 2 and 3, could reproduce the experimental data which eliminates the chance to compare the relative performance of the configurations.
Principal component analysis
To gain a better understanding of the parameter space of the system, principal component analysis (PCA) was performed on the parameter sets that had been filtered by the model criteria. Once the principal components of the parameter space were identified, a new round of simulations was executed. Specifically, a sample of 100,000 parameter sets was generated along the principal directions and within the reduced space. The set was then transformed back to the original coordinates. The successful parameter sets were recorded and are depicted in Additional file 2: Figure S8. Ultimately, principal components 1 through 4 collectively account for 88 % of the variance.
Model uniqueness
It is theoretically impossible to proof the uniqueness of a model for such complex nonlinear problem, because it is always possible to evoke additional processes in such a fashion that the original model could be subsumed as a simpler special case. In our case, one should note that our largescale simulation approach led to a structurally and numerically compact ensemble of similar solutions within the highdimensional parameter space of the system. Given that we determined the ensemble with Monte Carlo simulations that cast a very wide net over the parameter space, it is difficult to imagine entirely different parameterizations that would capture all data as well as our ensemble and perform well in the validation studies we performed.
Moreover, considering that the available data were obtained from several independent transgenics, and that the stoichiometric system of the system is underdetermined, the likelihood of significantly other solutions appears to be rather small. Also, our simulations show that the system converges to the same steadystate starting from a wide array of initial conditions. Some arbitrary initial conditions actually lead to steadystate values outside of the defined physiological bounds; however, among the initial conditions that lead to admissible steadystates, several rounds of screening showed identical results.
In summary, it is well understood that model design is an iterative procedure, and while our logical analysis of numerical results suggested the stepwise addition or elimination of new features, there is no mathematical proof that the model ensemble is truly unique.
Outside these purely mathematical arguments, we might also look at the biological reasonableness of the model. For instance, one could ask why only CCR was subjected to substrate competition, while there are other shared enzymes. The answer is a matter of simplicity, as suggested by Ockhamās razor. Namely, we demonstrate that the substrate competition of CCR is needed to match the available data, while additional mechanisms are not necessary to explain the experimental data. Thus, we cannot exclude that additional regulatory mechanisms might exist, but we would need additional, independent data to confirm or refute such a hypothesis.
We also note that, although the model design progressed iteratively, we carefully investigated the necessity of including each individual mechanism a posteriori. For example, upon discovering that competitive inhibition over CCR improves Hlignin accumulation, we asked whether product inhibition was still vital for the model to explain the observations. We examined this hypothesis and determined that Hlignin accumulation could not be captured anymore. We therefore concluded that both mechanisms, product inhibition and CCR competition, are necessary. We found this conclusion reasonable, as both product inhibition and substrate competition are common in metabolic pathway systems.
Model validation
The model with parameter values described above was constructed based on experimental data from wildtype switchgrass and four transgenic lines (4CL, CCR1, CAD and COMT knockdowns). To validate the model, experimental data from a separate transgenic plant, which had not been used in any way during the model design, were used to investigate how well the system performs under untested conditions. Namely, in a recent study, the transcription inhibitor PvMYB4 was overexpressed in order to reduce enzyme expression in the lignin pathway [19]. While metabolite concentrations were not measured for any of the pathway intermediates, the published data contain H, G and S lignin levels, as well as comparisons of enzyme activities between the wild type and PvMYB4 plants. The overall result of the study is a global reduction in the expression of the enzymes of the pathway, which in turn leads to 40ā70 % decreases in total lignin.
We tested our model against the profile of observed enzyme expression under overexpression of PvMYB4. We started with the already parameterized model without introducing any alterations or adjustments, except for resetting the appropriate enzyme activities, and tested how the system responded to the inhibition in comparison to the in vivo experiments [19]. Encouragingly, the altered G and Slignin amounts and their ratio, reported in the experimental study, are captured by the model with the compatible topological configuration quite well. The Hlignin was essentially unchanged in the experiment, while it slightly decreases in our model, in accordance with the data we used. However, Hlignin constitutes only about 3 % of the total lignin so that this difference is of no particular pertinence. Results are shown in Figs. 6 and 7. Figure 6 compares the fold change in lignin monomers between the experimental data and model results. The first row shows the fold change in G, S, the total lignin, and the S/G ratio comparing the wild type and PvMYB4 lines from the experiment; the second row corresponds to the computed configuration. As can be seen, the model results are quite consistent with experimental data.
This independent validation is very reassuring, especially with respect to future attempts to use metabolic engineering techniques to alter the S/G ratio in switchgrass. For instance, if further model predictions prove similarly reliable, the model could be used to simulate and optimize the outcome of combinatorial knockdowns, whose outcomes are not necessarily predictable with intuition alone. Such predictions would be very valuable, as a comprehensive combinatorial screening of double and triple knockdowns would neither be economical nor experimentally feasible.
While the published PvMYB4 data used for the first validation do not contain intermediate metabolite concentrations, a more recent study provides steadystate data for several of the pathway metabolites [20]. Comparing the published data in [20] with those in our model, we find that seven metabolites are represented in both, namely, caffeic acid, 5OHconiferyl alcohol, ferulic acid, sinapyl alcohol, coniferaldehyde, pcoumaric acid and coniferyl alcohol.
Figure 7 exhibits a comparison of the steadystate profiles. The top row shows the simulation results, while the bottom row represents experimentally measured steadystate concentrations in PvMYB4 normalized to wild type from [20]. The wildtype value for each concentration is set to 100 (white), and the redblue spectrum represents increases or decreases in steadystate values of knockdowns. For five of these seven metabolites, our computational results of PvMYB4 conditions show the same semiquantitative behavior in steadystate concentrations compared to the wild type; these are caffeic acid, 5OHconiferyl alcohol, sinapyl alcohol, coniferaldehyde and coniferyl alcohol. Discrepancies are seen in ferulic acid and pcoumaric acid. Here, the experimental data show a decrease in the steadystate concentrations, while our computational results predict an accumulation. Interestingly, these differences occur for metabolites whose effluxes out of the lignin pathway are ill defined, because their characteristics were not documented in the literature. It is therefore likely that they are not optimally parameterized in the model.
Discussion
In this work, we developed an ensemble of models of lignin biosynthesis in stem and tiller tissue in switchgrass, P. virgatum. The model reflects the consequences of various enzyme knockdowns quite well and performed satisfactorily in two validation studies with experimental data that had not been used in the model design or implementation. We used as the modeling framework the generalized mass action (GMA) format within biochemical systems theory (BST) [21ā25]. The powerLaw representation, which is the hallmark of this type of model, is arguably the least biased default formulation and by its mathematical nature avoids problems due to possibly invalid assumptions that may cast doubt on traditional MichaelisāMenten models in vivo [26]. Parameter values were, as always, difficult to obtain in a direct manner. We used for this purpose experimental knockdown data and a sophisticated Monte Carlo sampling strategy that has been used very successfully for similar systems before [14]. As a particular subgoal, we investigated the regulatory mechanism of the pathway and the possible colocalization or coupling of the pair of enzymes, CCR1/CAD that was previously suggested for Medicago [5].
To elucidate the colocalization or coupling of these enzymes in switchgrass, we studied multiple configurations that seemed a priori plausible and identified those natural designs that were consistent with the experimental data. The consistent designs were further examined under different regulation scenarios. The main result from this study is a very robust model of lignin biosynthesis in switchgrass that is consistent with all available data. The model was, at least to some degree, validated with a formerly unused dataset. If this validation can be confirmed and expanded experimentally, the model proposed here may be used to predict responses of the natural pathway system to alterations that are difficult to assess with experimental means. For instance, a further validated model will allow the prediction of responses to combinatorial knockdowns that could be the basis for future designs of more sophisticated transgenic lines than are currently available.
The computational analysis suggests the colocalization or functional coupling of the two enzymes CCR1 and CAD. Metabolic channeling and compartmentalization in plants have been identified in many biochemical pathways [27]. Of importance here, it has been suggested that enzymes catalyzing early reactions in the monolignol pathway may be colocalized in their binding to the ER. For instance, a multiprotein complex has been identified between PAL and C4H, and it seems that most of the substrates use these channels, but that some substrate undergoes the metabolic conversion in two steps [28ā30]. C4H can also form a complex with C3ā²H [31], and it has been suggested that different forms of 4CL form a complex in poplar [32]. Independent computational work on alfalfa came to a similar conclusion for channeling of enzymes associated with coniferaldehyde, which were proposed to form a metabolic channel [14]. Our results on switchgrass, presented in this article, are in line with the latter result and suggest moreover that channeling around coniferaldehyde is necessary to capture the available data.
The comparative study of different configurations revealed that consistency with the available experimental data was most difficult to achieve for transgenic 4CL downregulated lines, in which, surprisingly, the H lignin concentration is increased. This observation is at first counterintuitive because 4CL is located directly upstream of the H lignin precursors, which would lead to the a priori expectation of a decrease in H lignin. The combination of two postulated types of regulatory mechanisms was able to explain this observation. The first is product inhibition, which is observed quite frequently in biochemical systems. While improving the data compatibility, this mechanism turned out to be insufficient, thus requiring additional signaling. Arguably the simplest explanation is a regulatory structure that works in either of the mechanisms below:

An intermediate in the pathway is increased in response to the 4CL knockdown and activates the precursors of H lignin synthesis. The most likely candidates for this scenario appear to be pcoumaric acid, caffeic acid, and ferulic acid (Fig. 8a).

There exists an inhibitor for the H lignin branch. This metabolite would have to be located such that its concentration is decreased due to the 4CL knockdown, which means that the inhibitor activity is inhibited and therefore exerts a net positive effect on the system (Fig. 8b). FeruloylCoA could be a good candidate for this scenario.
The current literature does not support the first hypothesis. By contrast, multiple candidates are available for the second scenario. A reasonable scenario arises from the fact that the lignin pathway in switchgrass includes parallel fluxes that share the same enzymes. Indeed, 4CL, CAD, COMT, F5H and CCR1 all catalyze multiple reactions, and it is likely that the substrates exert competitive inhibition for the shared enzyme, as it was also suggested in [33]. Supporting this scenario, a targeted numerical analysis demonstrated that competition over CCR1 perfectly matches the results of the 4CL knockdown line in the model with product inhibition. One could surmise that the latter mechanism would suffice to represent the increase in H lignin concentration. To test this hypothesis, we simulated the model with enzyme competition but without product inhibition. The results showed that competitive inhibition by itself could not satisfactorily resolve the issue. By contrast, the combined model containing product inhibition and competitive inhibition matches the experimental results very well. One should also recall that the product inhibition and substrate competition mechanisms only work properly if the proposed metabolic channel is present (Fig. 3, Configuration 1).
Another aspect of the experimental data that was not captured well by the original model, even when product inhibition and substrate competition over CCR1 were taken into account, is the accumulation of 4CL substrates in COMT transgenic plants. Particularly counterintuitive appears to be the accumulation of ferulic acid as a product of a reaction catalyzed by COMT. The observed concomitant decrease in the steadystate concentration of coniferaldehyde supports the possible explanation that the observation is due to regulation that begins to inhibit the conversion of ferulic acid into coniferaldehyde, when 4CL substrates are in excess. The simultaneous accumulation of pcoumaric acid and caffeic acid provides additional evidence that reactions catalyzed by 4CL are inhibited in COMT knockdown plants. Accounting for this feature to our model, all experimental data are represented well. The mechanism of the regulation remains a subject of further experimental investigations. Figure 9 shows the pathway including all inferred regulatory signals.
Conclusions
The model proposed in this article captures all available data and performed well in independent PvMYB4 validation experiments. This good match with data is reason for cautious optimism, which however is to be supported with further experimental confirmation. Indeed, work is in progress to generate and analyze additional transgenic switchgrass lines and to incorporate further lignin compositional and enzyme activity and kinetic data into the model. If the model fares well in these additional validation studies, the results from the present study suggest that one might use the model for predictions, for instance, with respect to double knockdowns, and for optimization studies that could potentially affect the ligninbased recalcitrance in switchgrass in a favorable manner.
Methods
Model construction
Much of the analysis in this article consists of comparisons and simulations with different models. Each of these models consists of a system of differential equations that represent the rate of change in metabolite concentrations, which are represented as dependent variables. The righthand side of each equation contains a set of fluxes which enter (influxes) or leave (effluxes) the metabolite pool. Enzymes are included in the model as independent variables; that is, they do not change in activity during any given computational experiment. The generic formulation of each equation is
where each X _{ i } is a metabolite, V _{ j } are fluxes associated with X _{ i }, and the quantities s _{ i,j } are stoichiometric coefficients, which here are simply 0, 1 or ā1 and determine whether flux V _{ j } affects X _{ i } as influx or efflux or not at all. Each V _{ j } is a function of some or potentially all of the X _{ i }. At the steady state, the lefthand side is equal to zero, and fluxes can be assessed with methods of linear algebra [34]. Because the system in our case is underdetermined, infinitely many solutions satisfy the steadystate condition. Following the tenets of Flux Balance Analysis (FBA), an objective function is chosen and the problem is solved as a linear programming problem [34]. In the present study, maximizing the total amount of lignin is set as the objective of the system. The optimization problem is solved using MATLAB (version R2014a, The MathWorks, Natick, MA, USA) function linpro. The output is the set of fluxes at the steady state that maximizes the defined objective.
The fluxes themselves are formulated as general mass action (GMA) models of the type
within the modeling framework of BST [21, 22, 24, 35, 36]. Here, Ī± _{ j } is the rate constant, each X _{ r }, for 1 < r < n, is a metabolite or, for n + 1 < r < n + m, an enzyme involved in the reaction. Thus, n is the number of metabolites and m is the number of enzymes in the pathway. The exponents g _{ r,j } are kinetic orders that quantify the effect of X _{ r } on V _{ j }. Similarly, h _{ r,j } describes the effect of the enzyme on the reaction. It is customary to set each h _{ r,j } to 0 or 1, thus merely reflecting absence or presence of an enzyme in a specific flux. This setting of h _{ r,j } = 1 is consistent with the underpinnings of MichaelisāMenten, massaction, and other traditional models, where a reaction is assumed to be a linear function of enzyme activity. All other kinetic orders g _{ r,j } are sampled from the range between 0 and 1 if X _{ r } is a substrate or activator of the flux, or from the range between ā1 and 0 if X _{ r } is an inhibitor.
Due to the nature of the present experimental data for switchgrass, the real concentrations of metabolites and enzyme activities in vivo are unknown. As a remedy, we normalize these quantities with respect to the steady state and set all base values to 100. Thus, we set
and express Eq. (4) as
Since the constant X _{ ss,i } refers to the steady state, simple algebra adjusts the rate constants to this steady state. Thus, we obtain
The enzymes are independent variables and therefore constant for each experiment. Therefore, X _{ r } = X _{ ss } for n + 1 < r < n + m for wild type, whereas for a transgenic line it takes a value between 0 and 1, according to the level of knockdown. At the steady state we have:
with this setting, each steadystate flux is given as
If the flux is known, the rate constant can be computed as
With these settings, the set of the differential equations for the model takes the form below.
where the quantities \(I_{i}\) include the influxes into the pathway and the fluxes, \(V_{i}\), are defined as follows:
The metabolites of the pathway are
while the enzymes of the pathway are
Note that the model does not account for the dynamics of tyrosine, which we consider constant here. The model scheme is shown in Fig. 10.
Parameter space and sampling
Similar to earlier work [14, 37, 38], flux rates are computed with FBA. Next, the parameters to be estimated are the kinetic orders and rate constants are in turn estimated from the FBA results and randomly sampled kinetic orders through the steps mentioned above. The kinetic order of a metabolite is positive if the metabolite is a substrate or activator of the flux and negative if it acts as an inhibitor. The kinetic order of each enzyme has a default value of 1, which is in line with traditional enzyme kinetics, because it is customary to assume that a flux has a linear relationship with the enzyme. This assumption is explicitly or implicitly made in essentially all traditional models of enzyme kinetics as, for instance, in the MichaelisāMenten formalism, where V _{max} equals k _{cat} times the enzyme concentration.
The downregulation of an enzyme is modeled through the enzyme concentration, not the kinetic order. Since the concentrations of metabolites and enzymes are normalized, the concentration of an enzyme in the wild type has the default value of 1. In transgenics, the concentration of the corresponding enzyme is set to a value less than one if it is downregulated. For example, to represent the 4CL knockdown, the concentration of the enzyme is set to 0.6 as the enzyme is downregulated by 40 %.
To account for product inhibition, the inhibiting product is represented in each reaction by a factor consisting of its concentration, raised to a negative power. The result is as follows:
Here, S is the substrate, P is the product, g _{ I } is the kinetic order of the inhibiting product and g _{ S } is the kinetic order of the substrate. The ratio of kinetic orders could be derived directly [22] from the corresponding expression for a productinhibited MichaelisāMenten reaction, which takes the form
The powerlaw form of Eq. 15 can directly be computed from the tenets of Biological Systems Theory (BST), which defines the kinetic orders as
Rearrangement of these equations gives the ratio of kinetic orders as follows:
The bounded ratio of kinetic orders provides a valuable constraint for the Monte Carlo simulations, because a fixed ratio does not affect the dimension of the parameter space.
For the initial set of simulations, the sampling space is chosen as a unit hypercube in ā\(^{n}\) where n is number of kinetic orders to be estimated. A set of 100,000 parameter sets is generated for each scenario simulation. 10,000 sets are randomly generated from the sampling space using Latin Hypercube Sampling to assure a homogeneous coverage of the space, while 90,000 sets are generated by the MATLAB (version R2014a, The MathWorks, Natick, MA, USA) function rand. Each parameter set is simulated to examine whether the model with this set can match the experimental results for the wild type and transgenics. The model is deemed a match for the experimental results if:

The model returns proper lignin contents and S/G ratios for the wild type and different transgenics, with downregulation of 4CL (40 %), CCR1 (50 %), COMT (30 %), and CAD (30 %).

The model returns the proper decrease in lignin content in the case of knockdowns in 4CL, CCR1, COMT, and CAD.

The model demonstrates an increase in H lignin in 4CL transgenics.

The model matches the altered metabolite concentrations in the COMT transgenic.
If a parameter satisfies the above conditions, it is recorded along with the corresponding topological configuration.
While our model approach emphasizes ensembles of feasible models, the parameter values in Tables 2, 3, and 4 represent one implementation, which we used for further numerical exploration. This specific parameter set corresponds to the minimum error in the comparison of the model results in PvMYB4 and the experimental data.
Abbreviations
 PAL:

lphenylalanine ammonialyase
 C4H:

cinnamate 4hydroxylase
 4CL:

4coumarate:CoAligase
 CCR1:

cinnamoyl CoA reductase
 CAD:

cinnamyl alcohol dehydrogenase
 HCT:

hydroxycinnamoylCoA:shikimate hydroxycinnamoyl transferase
 C3ā²H:

pcoumaroyl shikimate 3ā²hydroxylase
 CSE:

caffeoyl shikimate esterase
 COMT:

caffeic acid Omethyltransferase
 CCoAOMT:

caffeoyl CoA Omethyltransferase
 F5H:

ferulate 5hydroxylase
 ER:

endoplasmic reticulum
 BST:

biochemical systems theory
 GMA:

generalized mass action
 PCA:

principal component analysis
 FBA:

flux balance analysis
References
Lang WH, Cookson IC. On a flora, including vascular land plants, associated with Monograptus, in rocks of Silurian age, from Victoria, Australia. Philos Trans R Soc Lond B. 1935;224(517):421ā49.
Kotyk ME, Basinger JF, Gensel PG, de Freitas TA. Morphologically complex plant macrofossils from the Late Silurian of Arctic Canada. Am J Bot. 2002;89:1004ā13.
Xu B, EscamillaTrevino LL, Sathitsuksanoh N, Shen Z, Shen H, Zhang YH, Dixon RA, Zhao B. Silencing of 4coumarate:coenzyme A ligase in switchgrass leads to reduced lignin content and improved fermentable sugar yields for biofuel production. New Phytol. 2011;192:611ā25.
Tschaplinski TJ, Standaert RF, Engle NL, Martin MZ, Sangha AK, Parks JM, Smith JC, Samuel R, Jiang N, Pu Y, et al. Downregulation of the caffeic acid Omethyltransferase gene in switchgrass reveals a novel monolignol analog. Biotechnol Biofuels. 2012;5:71.
Fu C, Mielenz JR, Xiao X, Ge Y, Hamilton CY, Rodriguez M Jr, Chen F, Foston M, Ragauskas A, Bouton J, et al. Genetic manipulation of lignin reduces recalcitrance and improves ethanol production from switchgrass. Proc Natl Acad Sci USA. 2011;108:3803ā8.
Zhang ZY, Rackemann DW, Doherty WOS, OāHara IM. Glycerol carbonate as green solvent for pretreatment of sugarcane bagasse. Biotechnol Biofuels. 2013;6(1):153. doi:10.1186/175468346153.
Naik SN, Goud VV, Rout PK, Dalai AK. Production of first and second generation biofuels: a comprehensive review. Renew Sust Energy Rev. 2010;14:578ā97.
Jung HG, Vogel KP. Influence of lignin on digestibility of forage cell wall material. J Anim Sci. 1986;62:1703ā12.
Weng JK, Li X, Stout J, Chapple C. Independent origins of syringyl lignin in vascular plants. Proc Natl Acad Sci USA. 2008;105:7887ā92.
Weng JK, Akiyama T, Bonawitz ND, Li X, Ralph J, Chapple C. Convergent evolution of syringyl lignin biosynthesis via distinct pathways in the lycophyte Selaginella and flowering plants. Plant Cell. 2010;22:1033ā45.
Weng JK, Akiyama T, Ralph J, Chapple C. Independent recruitment of an Omethyltransferase for syringyl lignin biosynthesis in Selaginella moellendorffii. Plant Cell. 2011;23:2708ā24.
Bulik S, Grimbs S, Huthmacher C, Selbig J, Holzhutter HG. Kinetic hybrid models composed of mechanistic and simplified enzymatic rate laws: a promising method for speeding up the kinetic modelling of complex metabolic networks. FEBS J. 2009;276:410ā24.
van Eunen K, Bouwman J, DaranLapujade P, Postmus J, Canelas AB, Mensonides FI, Orij R, Tuzun I, van den Brink J, Smits GJ, et al. Measuring enzyme activities under standardized in vivolike conditions for systems biology. FEBS J. 2010;277:749ā60.
Lee Y, EscamillaTrevino L, Dixon RA, Voit EO. Functional analysis of metabolic channeling and regulation in lignin biosynthesis: a computational approach. PLoS Comput Biol. 2012;8:e1002769.
Hoffmann L, Besseau S, Geoffroy P, Ritzenthaler C, Meyer D, Lapierre C, Pollet B, Legrand M. Silencing of hydroxycinnamoylcoenzyme A shikimate/quinate hydroxycinnamoyltransferase affects phenylpropanoid biosynthesis. Plant Cell. 2004;16:1446ā65.
EscamillaTrevino LL, Shen H, Hernandez T, Yin Y, Xu Y, Dixon RA. Early lignin pathway enzymes and routes to chlorogenic acid in switchgrass (Panicum virgatum L.). Plant Mol Biol. 2014;84:565ā76.
Ralph J, Grabber JH, Hatfield RD. Ligninferulate crosslinks in grassesāactive incorporation of ferulate polysaccharide esters into ryegrass lignins. Carbohydr Res. 1995;275:167ā78.
Lin CY, Wang JP, Li Q, Chen HC, Liu J, Loziuk P, Song J, Williams C, Muddiman DC, Sederoff RR, Chiang VL. 4Coumaroyl and caffeoyl shikimic acids inhibit 4coumaric acid: coenzyme a ligases and modulate metabolic flux for 3hydroxylation in monolignol biosynthesis of Populus trichocarpa. Mol Plant. 2015;8(1):176ā87. doi:10.1016/j.molp.2014.12.003.
Shen H, He X, Poovaiah CR, Wuddineh WA, Ma J, Mann DG, Wang H, Jackson L, Tang Y, Stewart CN Jr, et al. Functional characterization of the switchgrass (Panicum virgatum) R2R3MYB transcription factor PvMYB4 for improvement of lignocellulosic feedstocks. New Phytol. 2012;193:121ā36.
Shen H, Poovaiah CR, Ziebell A, Tschaplinski TJ, Pattathil S, Gjersing E, Engle NL, Katahira R, Pu Y, Sykes R, et al. Enhanced characteristics of genetically modified switchgrass (Panicum virgatum L.) for high biofuel production. Biotechnol Biofuels. 2013;6:71.
Savageau MA. Biochemical systems analysis: a study of function and design in molecular biology. Reading, Mass.: AddisonWesley Pub. Co. Advanced Book Program; 1976.
Voit EO. Computational analysis of biochemical systems : a practical guide for biochemists and molecular biologists. New York: Cambridge University Press; 2000.
Voit EO. A First course in systems biology. New York: Garland Science; Taylor & Francis distributor; 2012.
Voit EO. Biochemical systems theory: a review. ISRN Biomath. 2013;2013:53.
Torres NV, Voit EO. Pathway analysis and optimization in metabolic engineering. New York: Cambridge University Press; 2002.
Savageau MA: Enzyme kinetics in vitro and in vivo: MichaelisāMenten revisited. In: Bittar EE, Bittar N, editors. Greenwich Principles of medical biology, vol 4. Conn.: JAI Press Inc; 1995. p. 93ā146.
Winkel BS. Metabolic channeling in plants. Annu Rev Plant Biol. 2004;55:85ā107.
Achnine L, Blancaflor EB, Rasmussen S, Dixon RA. Colocalization of Lphenylalanine ammonialyase and cinnamate 4hydroxylase for metabolic channeling in phenylpropanoid biosynthesis. Plant Cell. 2004;16:3098ā109.
Rasmussen S, Dixon RA. Transgenemediated and elicitorinduced perturbation of metabolic channeling at the entry point into the phenylpropanoid pathway. Plant Cell. 1999;11:1537ā52.
Zhou R, Jackson L, Shadle G, Nakashima J, Temple S, Chen F, Dixon RA. Distinct cinnamoyl CoA reductases involved in parallel routes to lignin in Medicago truncatula. Proc Natl Acad Sci USA. 2010;107:17803ā8.
Bassard JE, Richert L, Geerinck J, Renault H, Duval F, Ullmann P, Schmitt M, Meyer E, Mutterer J, Boerjan W, et al. Proteināprotein and proteināmembrane associations in the lignin pathway. Plant Cell. 2012;24:4465ā82.
Chen HC, Song J, Wang JP, Lin YC, Ducoste J, Shuford CM, Liu J, Li Q, Shi R, Nepomuceno A, et al. Systems biology of lignin biosynthesis in Populus trichocarpa: Heteromeric 4coumaric acid: coenzyme A ligase protein complex formation, regulation, and numerical modeling. Plant Cell. 2014;26:876ā93.
Wang JP, Naik PP, Chen HC, Shi R, Lin CY, Liu J, Shuford CM, Li Q, Sun YH, TunlayaAnukit S, et al. Complete proteomicbased enzyme reaction and inhibition kinetics reveal how monolignol biosynthetic enzyme families affect metabolic flux and lignin in Populus trichocarpa. Plant Cell. 2014;26:894ā914.
Orth JD, Thiele I, Palsson BO. What is flux balance analysis? Nat Biotechnol. 2010;28:245ā8.
Savageau MA. Biochemical systems analysis. I. Some mathematical properties of the rate law for the component enzymatic reactions. J Theor Biol. 1969;25:365ā9.
Savageau MA. Biochemical systems analysis. II. The steadystate solutions for an npool system using a powerlaw approximation. J Theor Biol. 1969;25:370ā9.
Lee Y, Voit EO. Mathematical modeling of monolignol biosynthesis in Populus xylem. Math Biosci. 2010;228:78ā89.
Lee Y, Chen F, GallegoGiraldo L, Dixon RA, Voit EO. Integrative analysis of transgenic alfalfa (Medicago sativa L.) suggests new metabolic control mechanisms for monolignol biosynthesis. PLoS Comput Biol. 2011;7:e1002047.
Vanholme R, Cesarino I, Rataj K, Xiao Y, Sundin L, Goeminne G, Kim H, Cross J, Morreel K, Araujo P, et al. Caffeoyl shikimate esterase (CSE) is an enzyme in the lignin biosynthetic pathway in Arabidopsis. Science. 2013;341:1103ā6.
Shen H, Mazarei M, Hisano H, EscamillaTrevino L, Fu C, Pu Y, Rudis MR, Tang Y, Xiao X, Jackson L, et al. A genomics approach to deciphering lignin biosynthesis in switchgrass. Plant Cell. 2013;25:4342ā61.
Fu CX, Xiao XR, Xi YJ, Ge YX, Chen F, Bouton J, Dixon RA, Wang ZY. Downregulation of cinnamyl alcohol dehydrogenase (CAD) leads to improved saccharification efficiency in switchgrass. Bioenerg Res. 2011;4:153ā64.
Authorsā contributions
MF designed the model as well as the computational experiments and wrote the article. LLF provided feedback and proposed some of the regulatory mechanisms. LET and RAD performed the laboratory experiments and provided feedback on the manuscript. EOV supervised the project and contributed to all computational and editorial aspects. All authors read and approved the final manuscript.
Acknowledgements
This work was supported by DOEBESC grant DEAC0500OR22725 (PI: Paul Gilna). BESC, the BioEnergy Science Center, is a U.S. Department of Energy Bioenergy Research Center supported by the Office of Biological and Environmental Research in the DOE Office of Science.
Compliance with ethical guidelines
Competing interests The authors declare that they have no competing interests.
Author information
Authors and Affiliations
Corresponding author
Additional files
13068_2015_334_MOESM1_ESM.docx
Additional file 1. Analysis of the lignin pathway with inclusion of caffeyl aldehyde.
13068_2015_334_MOESM2_ESM.docx
Additional file 2. Principal component analysis.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Faraji, M., Fonseca, L.L., EscamillaTreviĆ±o, L. et al. Computational inference of the structure and regulation of the lignin pathway in Panicum virgatum . Biotechnol Biofuels 8, 151 (2015). https://doi.org/10.1186/s1306801503348
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s1306801503348