Gene coexpression network analysis reveals a novel metabolic mechanism of Clostridium acetobutylicum responding to phenolic inhibitors from lignocellulosic hydrolysates

Background Lignocellulosic biomass is a promising resource of renewable biochemicals and biofuels. However, the presence of inhibitors existing in lignocellulosic hydrolysates (LCH) is a great challenge to acetone-butanol-ethanol (ABE) fermentation by Clostridium acetobutylicum. In particular, phenolic compounds (PCs) from LCH severely block ABE production even at low concentrations. Thus, it is urgent to gain insight into the intracellular metabolic disturbances caused by phenolic inhibitors and elucidate the underlying mechanisms to identify key industrial bottlenecks that undermine efficient ABE production. Results In this study, a time-course of ABE fermentation by C. acetobutylicum in the presence of four typical PCs (syringaldehyde, vanillin, ferulic acid, and p-coumaric acid) was characterized, respectively. Addition of PCs caused different irreversible effects on ABE production. Specifically, syringaldehyde showed the greatest inhibition to butanol production, followed by vanillin, ferulic acid, and p-coumaric acid. Subsequently, a weighted gene co-expression network analysis (WGCNA) based on RNA-sequencing data was applied to identify metabolic perturbations caused by four LCH-derived PCs, and extract the gene modules associated with extracellular fermentation traits. The hub genes in each module were subjected to protein–protein interaction analysis and enrichment analysis. The results showed that functional modules were PC-dependent and shared some unique features. Specifically, p-coumaric acid caused the most extensive transcriptomic disturbances, particularly affecting the gene expressions of ribosome proteins and the assembly of flagella, DNA replication, repair, and recombination; the addition of syringaldehyde caused significant metabolic disturbances on the gene expressions of ribosome proteins, starch and sucrose metabolism; vanillin mainly disturbed purine metabolism, sporulation and signal transduction; and ferulic acid caused a metabolic disturbance on glycosyl transferase-related gene expressions. Conclusion This study uncovers novel insights into the inhibitory mechanisms of PCs for the first time and provides guidance for future metabolic engineering efforts, which establishes a powerful foundation for the development of phenol-tolerant strains of C. acetobutylicum for economically sustainable ABE production with high productivity from lignocellulosic biomass.


Background
The exacerbation of global economic, social, and environmental problems due to the exhaustion of petroleum resources renders the production of renewable energy from low-cost plant biomass increasingly important, which has aroused intensive attention from governments and researchers worldwide [1,2]. Lignocellulose, the most abundant renewable and low-cost biomass, is a promising substrate for biofuel production. Lignocellulose can be converted into either butanol or ethanol, and among these second generation vehicle biofuels, butanol displays better performance [3][4][5] due to its higher energy density, higher combustion heat, better engine compatibility, and decreased corrosivity [2].
Lignocellulosic feedstocks are usually pretreated with acids, bases, steam, and other harsh conditions, changing their physical properties to overcome the "recalcitrance" of lignocellulosic biomass, prior to utilization by microorganisms such as Clostridium spp. [6,7]. Subsequently, pretreated lignocellulosic biomass is hydrolyzed into microorganism-accessible monosaccharides (hexose and pentose) through enzymatic hydrolysis [8]. During pretreatment and hydrolysis, a wide variety of toxic compounds are released from the lignocellulosic material [9,10]. The presence of these compounds inhibits cell growth, substrate utilization, and product synthesis, significantly depressing the subsequent butanol production efficiency [11]. Generally, these compounds are classified into three categories according to their sources and properties, furans (e.g., furfural, 5-hydroxymethylfurfural), weak acids (e.g., formic acid, acetic acid), and phenolic compounds (PCs; e.g., phenol, catechol, ferulic acid, syringaldehyde, vanillin, coumarin, and p-hydroxybenzoic acid) [10,12]. Among them, PCs are more toxic than furans to microorganisms such as Escherichia coli and Saccharomyces cerevisiae [13], and smaller molecular weight PCs show stronger cytotoxicity [14].
Lignocellulosic hydrolysate (LCH) PCs are mainly generated during lignin depolymerization [15], and their types and final concentrations vary greatly from milligrams to grams per liter of LCH [16][17][18], based on the source of feedstocks, as well as the technical protocols of pretreatment, detoxification, and enzymatic hydrolysis [19][20][21]. The effects of PCs on acetone-butanol-ethanol (ABE) fermentation have been well studied [22,23], but their mechanisms of inhibition remain unelucidated because of their low concentrations, various molecular structures, complexity, and difficulty to quantify. Therefore, a systematic investigation on the inhibitory mechanisms of PCs is urgently needed. The recent development of various omics technologies provides the opportunity for systematic profiling of the metabolic response mechanisms of microorganisms under various conditions that could be used to eliminate bottlenecks affecting target product synthesis. Great efforts have been made to dissect the metabolic mechanisms of C. acetobutylicum using transcriptomics [24,25], proteomics [26,27], and metabolomics [28,29], providing a technical basis for dissecting PC inhibitory mechanisms in this microorganism.
In this study, we used an integrated strategy based on RNA sequencing (RNA-seq) and weighted gene coexpression network analysis (WGCNA) to systematically dissect gene expression changes in C. acetobutylicum under different phenolic stress conditions. Syringaldehyde, vanillin, ferulic acid, and p-coumaric acid were selected as representative LCH-derived PCs. Using WGCNA, genes with similar expression patterns were clustered and the association between modules and specific traits or phenotypes were analyzed [30,31]. Our results promote fundamental understanding of the genetic regulatory mechanisms underlying C. acetobutylicum's responses to PCs. We also propose novel potential metabolic engineering targets involved in regulating resistance to PC inhibitor stress. The transcriptomeguided approach demonstrated here could be a promising strategy to improve complex phenotypes in wild-type strains.

Effect of different PCs on fermentation characteristics of C. acetobutylicum
In this study, PCs (vanillin (Van), p-coumaric acid (Coum), syringaldehyde (Syr), and ferulic acid (Fer)) were added during the logarithmic growth phase (12 h, Fig. 1). The biomass (measured by assaying the optical density at 600 nm (OD 600 )) displayed no significant changes between samples from 12 to 24 h, when it peaked. The highest OD 600 was 8.24 (Van), approximately 12% higher than in the control sample, while the lowest OD 600 (7.45) occurred with Syr. However, the ethanol, acetone, and butanol production of the PC-treated samples were lower than those in the control sample at 60 h. Among the PC-treated samples, addition of Syr resulted in the greatest inhibition of butanol production (66.4% of the control), followed by Van (72.7% of the control), Keywords: Phenolic compounds, Clostridium acetobutylicum, Weighted gene co-expression network analysis, RNA sequencing, Acetone-Butanol-Ethanol, Lignocellulosic hydrolysates indicating that these PCs had notable impacts on fermentation performance.
The acetic acid and butyric acid concentrations peaked at 36 h, when these concentrations were much higher in the control sample than in the PC-treated samples (Fig. 1). The concentrations of these two organic acids subsequently decreased rapidly in the control sample, such that there were no significant differences between the samples at the end of fermentation, indicating that the acetic acid and butyric acid in the control sample were largely reabsorbed and utilized in the later stage of fermentation, promoting ABE synthesis [24,32].
To compare short-term changes of fermentation characteristics caused by PC stress, a change ratio (CR) was used to express the ratio of production at 18 h compared to that at 12 h. As shown in Table 1, the CRs of acetone, and butyric acid in the treated samples were much lower than in the control sample, suggesting that the PCs negatively impacted acetone and butyric acid production. In addition, the CRs of acetic acid and the OD 600 with Syr, and ethanol with Fer and Coum, were greater than in the control sample. Interestingly, butanol CRs were larger in the treated samples than in the control sample, which suggested that the addition of PCs actually promoted butanol production in the short term. Taken together, the results indicate that the PCs exert different effects on the fermentation products.

RNA-seq analysis of PC-treated C. acetobutylicum
To further analyze the relationships between PC stresses and intracellular metabolic disturbances, we performed a transcriptomic analysis, in which 18 h samples treated with each PC were harvested and subjected to RNA-sequencing.
To analyze the relationship between gene expression profiles and fermentation traits, identify highly synergistic gene sets and candidate biomarker genes or metabolic targets, the transcriptomic data were used for WGCNA model construction according to gene set connectivity [33,34].

WGCNA model construction Soft threshold determination and network topology analysis of adjacency matrices based on WGCNA
In WGCNA, the soft-threshold process transforms the correlation matrix to generate a series of adjacency matrices that mimics the scale-free topology, a phenomenon observed in gene expression networks and in a variety of complex biological systems in which the distribution of gene relationships follows a power decay law, i.e., genes with the highest numbers of connections occur least frequently [35]. A gradient method was applied to evaluate the scale-free fit index and the mean connectivity degree of different coexpression modules with power values ranging from 1 to 20. The optimal power value was 18 when the scale-free fit index was > 0.9 (Fig. 2a), meeting the requirements of WGCNA modeling and enabling further analysis. To ensure high reliability of the results, the minimum gene number of each module was set to 10. Figure 2b depicts the clustering dendrogram of all expressed genes. Based on clustering and dynamic pruning using the topological overlap measure (TOM), 3,911 correlated genes were clustered into 14 modules marked by different colors. Module-trait relationships were analyzed using correlations between the module eigengenes and 12 fermentation traits (specific increasing rates of Acetone, Butanol, Ethanol, Acetate, Butyrate, Sugar, OD 600 as well as the conditions of Con, Van, Syr, Fer, and Coum), which enabled the identification of coexpression modules with significant correlations to fermentation traits. We considered the WGCNA modules with correction coefficients > 0.65 and p ≤ 0.001 to be highly associated with fermentation traits, and identified 24 module-trait relationships (Fig. 2c).

Gene clustering and module-trait relationships
For each gene expression profile, the gene significance (GS) was calculated as the absolute value of the correlation between the expression profile and each external trait; and the module membership (MM) was defined as the correlation between the expression profile and each module eigengene. By calculating GS and MM values, genes that are highly significant for each trait and have high MMs in interesting modules can be identified. Scatterplots of GS vs. MM in each module are shown in Fig. 3 and additional details on the GS and MM are provided in Table SII (Additional file 1). GS and MM were correlated, illustrating that genes that were significantly associated Fig. 2 Construction of the WGCNA model. a Network topology analysis for adjacency matrices with different soft threshold powers. Red numbers indicate the soft-threshold power corresponding to the correlation coefficient square value and mean connectivity. The linear model fit (R 2 ) between log(p(k)) and log(k) was calculated from each adjacency matrix, where k = the connectivity and p(k) = the proportion of genes with connectivity k. b Clustering dendrogram of all expressed genes. Each row corresponds to a module eigengene and each column to a fermentation phenotype. c Module-traits relationships identified by WGCNA. Each cell contains the corresponding correlation in the first line and the p-value in the second line. Modules are colored as in the legend. Green and red denote negative and positive correlations, respectively. The grey module represents a collection of genes that could not be grouped into other modules with a fermentation trait were also important elements of modules.

Functional enrichment analysis of hub genes
Hub genes located in their respective modules, have a high likelihood of being critical components, and are representative of the module's overall function within the network. In this study, genes with GS and MM values both > 0.5 were defined as the hub genes in their respective modules. WGCNA modules, related traits, eigengene counts, and hub genes are summarized in Table 2. Hub genes in each WGCNA module that were highly associated with the fermentation traits were subjected to gene ontology (GO) and Kyoto Encyclopedia of Genes and Genome (KEGG) pathway enrichment analysis with the Database for Annotation, Visualization and Integrated Discovery (DAVID). The results are shown in Fig. 4 and a further dissection is carried out as below.

Biomass (OD 600 ) and sugar (glucose)
As shown in Fig. 2c, the specific growth rate (rOD 600 ) was highly negatively associated with the purple and black modules, while the opposite trend was observed for the specific consumption rate of sugar (rSugar), indicating that sugar consumption supported biomass synthesis. Enriched GO terms for rOD 600 genes were de novo pyrimidine nucleobase biosynthetic process (GO:0006207) and de novo UMP biosynthetic process (GO:0044205), while glucosylceramidase activity (GO:0004348), de novo pyrimidine nucleobase biosynthetic process (GO:0006207), sphingolipid metabolic process (GO:0006665), and de novo UMP biosynthetic process (GO:0044205) were enriched by the rSugarrelated gene modules. Glucosylceramidase activity and the sphingolipid metabolic process are related to membrane lipid metabolic processes. Two KEGG metabolic pathways were enriched by the OD 600 -related gene modules, namely cysteine and methionine metabolism In each plot, the y-axis represents the GS of a fermentation trait and the x-axis represents the MMs of selected modules that were highly associated with those traits (correction coefficient > 0.65 and p ≤ 0.001) (cac00270) and alanine, aspartate, and glutamate metabolism (cac00250).

ABE production
The specific production rate of acetone (rAcetone) was positively associated with the purple and black modules. Four GO terms were enriched, and were consistent with those enriched in the sugar-related modules. In addition, the KEGG pathway cysteine and methionine metabolism (cac00270) was also enriched.
The specific production rate of butanol (rButanol) was positively associated with the black and salmon modules. Two GO terms (GO:0004348, glucosylceramidase activity; GO:0006665, sphingolipid metabolic process) and two KEGG pathways (cac00270, cysteine and methionine metabolism; cac02020, two-component system) were enriched in the black module; however, hub genes of the salmon module were not enriched for any GO terms or KEGG pathways. Five genes in this module (KDP operon transcriptional regulatory protein KdpE (CA_C3677), potassium-transporting ATPase subunit C (CA_C3680), methyl-accepting chemotaxis protein (CA_C3476), d-alanyl-d-alanine carboxypeptidase (CA_C3297), and anaerobic C4-dicarboxylate transporter (CA_C3500)) are involved in the two-component system, a signaling pathway that regulates many bacterial characteristics, such as virulence, pathogenicity, symbiosis, motility, nutrient uptake, secondary metabolite production, metabolic regulation, and cell division. These systems regulate physiological processes in response to environmental or cellular parameters and enable adaptation to changing conditions [36], consistent with the view that butanol production results in the deterioration of fermentation.
In ABE metabolic pathways, ethanol is closely related to C2 compounds, such as acetic acid, acetyl-P, and acetyl-CoA in primary metabolism. Butanol is derived from butyric acid, butyryl-P, and butyryl-CoA. Acetone is the decarboxylated product of acetoacetic acid, a branch pathway in the synthesis of butyric acid. The critical genes that regulate ABE synthesis are contained in the sol operon [37]. However, module-trait relationship analysis demonstrated no obvious links between ABE production and known ABE pathways, suggesting that ABE synthesis may be subjected to complex metabolic regulation.

Organic acid production
The specific production rate of acetate (rAcetate) was negatively associated with the blue module (GO:0006412, translation; GO:0003735, structural constituent of ribosome; GO:0019843, rRNA binding; ribosome; GO:0005840, transcription, DNA-templated; GO:0043565, sequence-specific DNA binding; GO:0071973, bacterial-type flagellum-dependent cell motility; GO:0031514, motile cilium; cac02040, flagellar assembly; cac03010, ribosome) and the turquoise module Table 2 Summary of the relationships between traits and WGCNA modules * The table shows WGCNA modules that were highly associated with the fermentation characteristics (correction coefficient > 0.65 and p ≤ 0.001) and the gene counts in each module. The specific rates of acetone, butanol, ethanol, acetate, butyrate, sugar, and biomass are represented by rAcetone, rButanol, rEthanol, rAcetate, rButyrate, rSugar, and rOD 600 , respectively. Con, Van, Coum, Syr, and Fer denote the control, vanillin, p-coumaric acid, syringaldehyde, and ferulic acid samples, respectively. + Positive correlation, − negative correlation. Eigengenes in the magenta, tan, and green-yellow modules were not significantly related to any traits (GO:0006810, transport), demonstrating that rAcetate was highly related to protein biosynthesis, flagellar assembly, and transportation. The specific production rate of butyrate (rButyrate) was positively associated with the black module (cac00270, cysteine and methionine metabolism and alanine; cac00250, aspartate and glutamate metabolism) and the salmon module (no enriched terms), which were involved in amino acid metabolism.

PC stresses
As shown in Fig. 2c, treatment with each PC corresponded to different WGCNA modules, indicating that these PCs caused different metabolic disturbance to the strain. In detail, the control condition was highly associated with the black module, with hub genes enriched in the KEGG pathways cysteine and methionine metabolism (cac00270) and alanine, aspartate, and glutamate metabolism (cac00250). Van treatment was associated with the purple (GO:0006207, de novo pyrimidine nucleobase biosynthetic process; GO:0044205, de novo UMP biosynthetic process), pink (no enriched terms), and brown (GO:0006189, de novo IMP biosynthetic process; cac01100, metabolic pathways; cac00670, one carbon pool by folate; cac00230, purine metabolism; cac01110, biosynthesis of secondary metabolites) modules, suggesting effects on nucleic acid metabolism.
To deeply explore the internal characteristics of the gene modules, protein-protein interaction (PPI) analysis was carried out.

PPI analysis of hub genes under different PC stresses
PPI networks represent webs of protein complexes formed by biochemical events and/or electrostatic forces that serve distinct biological functions while complexed. We performed the PPI analysis with the hub genes of all modules associated to each PC, obtaining the PC-specific interaction networks (Fig. 5). Using the STRING database, predicted functional associations between hub proteins with each treatment were identified based on known interactions (experimentally determined interactions from curated databases), predicted interactions (gene neighborhoods, gene fusions, and gene co-occurrences), and other evidence (text mining, coexpression, protein homology). As the result, sixteen high density subnetworks reveal the existence of highly interconnected gene sets that were biologically related to PC stress and were coexpressed, suggesting that they may play important roles under specific conditions. Information of the 16 subnetworks is listed in Table 3 and additional details regarding the PPI network are provided in Table SIII (Additional file 2).

p-Coumaric acid-associated PPI subnetworks
A total of 990 hub genes from the blue and green modules of the WGCNA were used to construct the PPI network. Using the MCODE algorithm, nine dense regions with high internal connectivity (subnetworks SN-1 ~ SN-9 in Fig. 5) were identified. The genes in each  Table 3. Enrichment analysis demonstrated that these subnetworks were associated with ribosome, DNA replication, flagellum, cytoplasm and biosynthesis of amino acids, kinase, glycosyl transferase, purine metabolism, sporulation and as well as several other cellular functions, suggesting wide effects on cellular metabolism caused by the addition of p-coumaric acid. Interestingly, SN-4 contained 18 genes that could be divided into two subnetworks (Fig. 5), one associated with DNA damage and repair, the other including the neighborhood genes CA_C2007 ~ CA_C20015 and CA_ C20019. Among them, glycosyl transferase (CA_C2007), 3-oxoacyl-ACP synthase (pksF, CA_C2008), 3-hydroxyacyl-CoA dehydrogenase (mmgB, CA_C2009), Fe-S oxidoreductase (CA_C2010), 3-oxoacyl-ACP synthase (fabH, CA_C2011), enoyl-CoA hydratase (fadB, CA_ C2012), esterase (CA_C2014), and malonyl CoA-ACP transacylase (CA_C2019) are associated with the fatty acid biosynthetic process. In additions, we noted spoT and obg, which encode the (p)guanosine 3′-diphosphate 5′-diphosphate (ppGpp) synthetase and GTPase (binding GTP, GDP, and possibly (p)ppGpp with moderate affinity); in eubacteria ppGpp is a mediator of the response to changes in nutritional abundance, which coordinates a variety of cellular activities [38].

Syringaldehyde-associated PPI subnetworks
A total of 263 hub genes from the blue and green modules of WGCNA were used to construct the PPI network, and two dense regions (SN-10 and SN-11) were extracted. SN-10 contained 16 hub genes which were enriched in ribosomal proteins. SN-11 contained 9 hub genes, which were enriched in starch and sucrose metabolism and signal transduction. Further analysis showed that CA_C0910, CA_C0911, CA_C0912, CA_C0913, CA_C0915 and CA_C0916 encode the cellulosomal proteins. The cellulosome is a cellulase system that exist in cellulolytic microorganisms such as C. thermocellum, C. cellulolyticum, and C. cellulovorans [39]. Cellulase and hemicellulase form a multienzyme complex structure through an anchorage-adhesion mechanism. Cellulosomes are attached to the bacterial cell wall by cell adhesion proteins, but lack cellulolytic activity in C. acetobutylicum [40].

Ferulic acid-associated PPI subnetworks
A total of 156 hub genes from the blue and green modules of the WGCNA were used to construct the PPI network, and two dense regions (SN-12 and SN-13) with extensive internal connections were extracted (Table 2). SN-12 contained seven hub genes that were clustered because of gene co-occurrence, but lacked annotations or descriptions. SN-13 contained seven genes that clustered by gene neighborhood, among which CA_C3068, CA_C3069, CA_C3070, CA_C3071 were related to glycosyl transferase.

Vanillin-associated PPI subnetworks
A total of 241 hub genes from the blue and green modules of the WGCNA were used to construct the PPI network. Three dense regions (SN-14, SN-15, and SN-16) were extracted (Fig. 5). SN-14 contained 13 hub genes that were enriched for roles in purine metabolism. SN-15 contained seven hub genes that were enriched for roles in sporulation, which was considered to be associated with the deterioration of fermentation [41,42]. SN-16 contained 17 hub genes which were involved with organonitrogen compound metabolic process, glycosyl transferase activity, phenylalanine, tyrosine and tryptophan biosynthesis and cell shape (Table 3).

Discussion
PCs widely exist in LCH and have strong toxicity to microorganisms. For ABE production, several research groups have examined PC-induced inhibition of Clostridia [10,43,44]. Cho et al. [10] assessed the inhibitory effects of six PCs (p-coumaric acid, ferulic acid, 4-hydroxybenzoic acid, vanillic acid, vanillin, and syringaldehyde) on C. beijerinckii. At 1 g/L, the tested PCs inhibited cell growth by 64 ~ 74% and completely inhibited butanol production. Ezeji et al. [43] evaluated the impact of PCs on C. beijerinckii growth and butanol production, and found that > 0.3 g/L ferulic acid was most toxic to C. beijerinckii growth and completely inhibited butanol production, followed by syringaldehyde. Chen et al. [44] established a mathematical model to evaluate the inhibitory effects of phenolic derivatives on ABE fermentation by C. saccharoperbutylacetonicum.
In this study, due to the low doses used, the four PCs did not show significant inhibitory effects on biomass and sugar consumption, but caused different irreversible effects on ABE fermentation (Fig. 1), indicating that under these conditions, C. acetobutylicum survived by regulating intracellular metabolism to cope with environmental stress, but at the expense of fermentation performance.
Generally, ABE fermentation can be divided into two stages, acidogenesis and solventogenesis. In the acidogenesis stage, large amounts of acetic acid and butyric acid are generated, accompanied with biomass accumulation, while in the solventogenesis stage the organic acids are reabsorbed and utilized by the cells to produce ethanol and butanol. With respect to metabolic pathways, butyric acid and acetic acid are the precursors of butanol and ethanol, respectively, so the levels of these organic acids and their corresponding alcohols are closely related. In the WGCNA models (Fig. 2c), we found that butanol production was highly related to the black and salmon modules, which were linked with butyric acid production in the module-trait correlation analysis. However, hub gene functional enrichment analysis revealed differences. Butanol production was associated with the twocomponent system, sphingolipid metabolic process, and glucosylceramidase activity, which are involved in cellular stress responses, while butyric acid was related to alanine, aspartate, and glutamate metabolism.
Similarly, acetic acid production was closely related to flagellar assembly (cell mobility) and ribosome function (protein synthesis), while ethanol was related to specific secondary metabolic activities, such as the biosynthesis of glycogen and secondary metabolites. Moreover, biological functions closely related to acetone and butanol were similar. The sphingolipid metabolic process and glucosylceramidase activity are both related to cellular membrane lipid metabolism. These analyses indicate that as a systematic method for describing gene association patterns among different samples, WGCNA can be used to identify highly synergistic gene sets and identify candidate biomarker genes or metabolic targets by associating gene sets and phenotypes. Some other microorganisms, e.g., S. cerevisiae, can convert PCs into less toxic compounds, achieving detoxification at the cost of reduced product synthesis [14], which was a in situ detoxification by the oxidoreductive pathway [45]. Larsson et al. [14] evaluated the influence of hydroxy-methoxy-benzaldehydes, diphenols/quinones, and phenylpropane derivatives on S. cerevisiae cell growth and ethanol formation. Aromatic alcohols were detected as the reduction products of their corresponding aldehydes. In another report, Shen et al. [46] reported upregulated oxidoreductase and antioxidant activities when S. cerevisiae was exposed to vanillin. However, in our study, no significant correlation between PCs and any oxidoreductive metabolism-related KEGG pathways, GO terms, or protein complexes were observed, indicating that the effects of PCs on C. acetobutylicum were distinct from their effects on S. cerevisiae.
Systems biology methods provides a global perspective on the effects of inhibitors on microbial metabolism. In this study, a WGCNA strategy based on RNA-seq transcriptomics was applied to identify the metabolic perturbations in gene expression caused by four LCH-derived PCs, and extract the metabolic modules associated with extracellular fermentation characteristics. Several gene sets that were highly associated with fermentation traits were identified and analyzed for enriched biological functions. Based on WGCNA, hub gene enrichment, and PPI analysis, four PCs exerted effects on cell metabolism via both different and shared targets.
PCs had significant effects on important physiological processes. As shown in Fig. 6, a potential metabolic mechanism in response to lignocellulose-derived PCs stress was proposed. In general, p-coumaric acid has an effect mainly on the assembly of ribosome, flagella, DNA replication, repair, and recombination; syringaldehyde on ribosome protein gene expressions, starch and sucrose metabolism; vanillin mainly on purine metabolism, sporulation and signal transduction, organonitrogen compound metabolic process; ferulic acid

Conclusions
In summary, we have performed an in-depth analysis of key pathway responses to various lignocellulosederived PCs. Our results indicate that precise regulation of these pathways or bottleneck steps are essential for strain development and aid in the improved ABE production from LCH.
C. acetobutylicum spores were heat-shocked at 85 °C for 10 min and then inoculated at a 1% (v/v) volume size for seed cultivation in a YQX-II anaerobic incubator (Shanghai Longyue Instrument Equipment Co., Ltd., China) for 24 h (to an OD 600 of ~ 3.0). Anaerobic ABE fermentation was performed in a 3.5 L fermenter (NBS BIOFLO-2000, New Brunswick Scientific Co., Inc., U.S.A) with a 10% (v/v) inoculation volume and a 2.0 L working volume. The stirring speed was maintained at 100 rpm and the temperature at 37 °C. Broth pH was monitored by pH electrode and maintained at 5.8 with 6 M ammonia and 6 M HCl.

Addition of phenolic inhibitors
The four PCs used in this study (vanillin, p-coumaric acid, syringaldehyde, and ferulic acid), were added separately 12 h after the beginning of fermentation to 200 mg/L. The main fermentation parameters, OD 600 , glucose, acetic acid, butyric acid, ethanol, butanol, and acetone, were monitored as previously reported [47].

Biomass and glucose measurements
The biomass was determined by ultraviolet spectrophotometry at 600 nm. Glucose was measured with a Bioanalyzer SBA 40D equipped with a biosensor (Biology Institute of Shandong Academy of Sciences, China) according to the manufacturer's instructions.

RNA-seq and quantification of gene expression levels
A 10 mL sample of each fermentation broth after 18 h was used for RNA-seq. Total RNA was extracted using TRIzol Reagent (Invitrogen, Carlsbad, USA) according to the manufacturer's instructions, and purified using an RNeasy Mini Kit (Qiagen, Germany) including an on-column DNase (Qiagen) digestion step to avoid contamination with genomic DNA. Sequencing libraries were generated using NEBNext ® Ultra ™ Directional RNA Library Prep Kit for Illumina ® (New England Biolabs Ltd., USA) following the manufacturer's recommendations, and index codes were added to attribute sequences to each sample. Clustering of the index-coded samples was performed on a cBot Cluster Generation System using a TruSeq PE Cluster Kit v3-cBot-HS (Illumina, Inc., USA). After cluster generation, library preparations were sequenced on an Illumina Hiseq platform and paired-end reads were generated. The total number of raw reads obtained from the samples (Con, Van, Syr, Fer, and Coum) ranged from 9.18 to 11.37 million. Raw data in fastq format were first processed to obtain clean data by removing reads containing adapters and poly-N tracts and low-quality reads from the raw data. At the same time, the Q20, Q30, and GC content of the clean data were calculated. Clean reads were mapped to the C. acetobutylicum ATCC 824 genome (NC_003030.1) and plasmid pSOL1 (NC_001988.2; NCBI, https ://www. ncbi.nlm.nih.gov/genom e/?term=ATCC+824) using Bowtie 2 v2.2.3 [49]. HTSeq v0.6.1 was used to count the read numbers mapped to each gene. Then, the expected number of FPKMs (fragment per kilobase of transcript sequence per millions base pairs sequenced) of each gene was calculated to determine their expression values, based on the length of each gene and the read counts mapped to it [50]. 3,911 genes in each sample had ≥ 1 (FPKM) and were considered expressed (Table SI in  Additional file 3).