RNA-seq based identification and mutant validation of gene targets related to ethanol resistance in cyanobacterial Synechocystis sp. PCC 6803

Background Fermentative production of fuel ethanol consumes agricultural crops and therefore competes directly with the food supply. As an alternative, photosynthetic cyanobacteria have been proposed as microbial factories to produce ethanol directly from solar energy and CO2. However, the ethanol productivity of photoautotrophic cyanobacteria is still very low, mostly due to the low tolerance of cyanobacterial systems to ethanol stress. Results To build the foundation necessary to engineer robust ethanol-producing cyanobacterial hosts, in this study we applied a quantitative transcriptomics approach with a next-generation sequencing technology, combined with quantitative reverse-transcription PCR (RT-PCR) analysis, to reveal the global metabolic responses to ethanol in the model cyanobacterium Synechocystis sp. PCC 6803. The results showed that ethanol exposure induced genes involved in common stress responses, transport and cell envelope modification. In addition, the cells can also utilize enhanced polyhydroxyalkanoate (PHA) accumulation and the glyoxalase detoxication pathway as means against ethanol stress. The up-regulation of photosynthesis by ethanol was further confirmed at the transcriptional level. Finally, we used gene knockout strains to validate the potential target genes related to ethanol tolerance. Conclusion RNA-Seq based global transcriptomic analysis provided a comprehensive view of the cellular response to ethanol exposure. The analysis provided a list of gene targets for engineering ethanol tolerance in the cyanobacterium Synechocystis.


Background
Ethanol currently constitutes 99% of all biofuels in the United States. E-10 Unleaded, a blend of 10% ethanol and 90% ordinary gasoline, has been used in the U.S. for more than 25 years. Additionally, a blend of 85% ethanol and 15% ordinary gasoline (known as E-85) is rapidly growing in popularity [1]. The 3.4 billion gallons of ethanol blended into gasoline in 2004 amounted to about 2% of all gasoline sold by volume and 1.3% (2.5 x 10^17 J) of its energy content [1]. Greater quantities of ethanol are expected to be used as a motor fuel in the future because of federal policies such as the "Twenty-in-Ten" program, which proposes to cut gasoline consumption and greenhouse gas emissions from motor vehicles by 20 percent over the next 10 years. Large-scale ethanol production utilizes yeast or bacteria, such as Saccharomyces cerevisiae and Zymomonas mobilis, to ferment sugar syrups [2]. The process has seen significant progress in recent years: inhibitor sensitivity, product tolerance, ethanol yield and specific ethanol productivity have been improved in modern industrial strains to the degree that up to 20% (v/v) of ethanol can be produced from starch-derived glucose [3]. However, large-scale ethanol fermentation consumes a significant amount of agricultural crops and thus competes directly with the world food supply, and its increased production has been blamed for the food price increases of recent years.
Photosynthetic cyanobacteria have recently attracted significant attention as a 'microbial factory' to produce biofuels and fine chemicals due to their capability to utilize solar energy and CO2 as sole energy and carbon sources, respectively [4]. By expressing a bacterial pyruvate decarboxylase (pdc) and alcohol dehydrogenase (adh) from the bacterium Z. mobilis in the cyanobacterium Synechococcus sp. PCC 7942, Deng and Coleman (1999) obtained a recombinant microorganism which can produce up to 230 mg/L ethanol directly from CO2 within 4 weeks of growth [5]. More recently, a genome-scale metabolic network model of Synechocystis sp. PCC 6803 was used to improve cyanobacterial ethanol production up to 690 mg/L in a week [6]. Although productivity is still very low, these works clearly demonstrated that photoautotrophic cyanobacteria could potentially be engineered for a direct conversion of solar energy and CO2 into biofuel products such as ethanol.
One of the key factors responsible for the low ethanol productivity is the low tolerance of photosynthetic systems to ethanol [7,8]. Ethanol can interfere with the cell membrane's ability to act as a barrier, and interrupt key cellular processes such as protein biosynthesis, energy transduction and transport [8]. Although ethanol tolerance mechanisms and the application of ethanol-tolerant strains for enhanced production have been reported in native-producing yeasts and bacteria, current knowledge on ethanol tolerance in cyanobacteria is not sufficient to guide the rational engineering of more robust cyanobacterial hosts. To address this issue, we previously applied a quantitative iTRAQ LC-MS/MS proteomics approach to determine the responses of the model cyanobacterium Synechocystis sp. PCC 6803 to ethanol [9]. The analysis showed that the Synechocystis cells employed a combination of induced common stress response, modifications of cell membrane and envelope, and induction of multiple transporters and cell mobility-related proteins as major protection mechanisms against ethanol toxicity [9]. To further decipher responses at the transcriptional level, in this study we applied a quantitative transcriptomics approach with a next-generation sequencing technology, combined with quantitative reverse-transcription PCR (RT-PCR) analysis, to reveal the global metabolic responses to ethanol in Synechocystis sp. PCC 6803 [10]. We then compared the transcriptomics data with the proteomic data obtained previously to further confirm the targets related to ethanol tolerance [9]. Finally, we constructed several knockout mutants of ethanol-induced genes to validate their potential application as targets for engineering ethanol tolerance.
The RNA-seq transcriptomics analysis not only confirmed the cellular responses revealed by the previous proteomics analysis, but also showed that Synechocystis cells can utilize enhanced PHA accumulation and the glyoxalase detoxication pathway as means against ethanol stress. The study provided a list of gene targets for tolerance engineering in the cyanobacterium Synechocystis.

Results and discussion
Ethanol effects on Synechocystis sp. PCC 6803
To make the transcriptomics data comparable with the previous proteomics data, we used sampling conditions identical to those of our previous proteomics analysis [9]. As described before, the growth of Synechocystis sp. PCC 6803 supplemented with 0, 1.25, 1.50 and 2.00% ethanol was assessed to determine an appropriate ethanol concentration for proteomic studies. The ethanol concentration that caused a 50% growth decrease at 24 h (corresponding to middle-exponential phase) was found to be 1.50% (v/v), and this concentration was selected for the analysis in this study [9]. Cell morphology under ethanol-treated and control conditions was compared under the microscope, and visible aggregation of a large number of cells was found after 24 h of treatment even at a concentration of 1.50%, compared with the clearly individual cells in the control (data not shown). For transcriptomic analysis, two independent cultivations for both control (no ethanol) and 1.5% ethanol-treated experiments were conducted, and cells were collected by centrifugation (8,000 x g for 10 min at 4°C) at 24 h, 48 h and 72 h, resulting in two biological replicates for each time point. The sampling time points corresponded to the middle-exponential, exponential-stationary transition and stationary phases of cell growth, respectively [9].

Overview of transcriptomics analysis
A total of 112 million raw sequencing reads was obtained from the RNA-seq transcriptomics analysis of nine samples, with an average of 12.5 million reads per sample. After a two-step data filtering process, first to eliminate reads with low-quality bases (such as multiple N) and reads shorter than 20 bp, and then to eliminate sequence reads mapped to non-coding RNA of Synechocystis sp. PCC 6803 [10], a total of 20.4 million qualified mRNA-based sequence reads were identified (Table 1). Except for the control sample at 24 h (Control-24 h), which has a genome mapping ratio of 57%, all other samples have mapping ratios larger than 60%, with the control sample at 72 h larger than 80%. Reproducibility between biological replicates of ethanol-treated samples at the three time points was plotted (Figure 1), with correlation coefficients around 0.98-0.99, indicating the overall good quality of RNA sequencing. The sequence reads matched all 3189 coding genes in the Synechocystis sp. PCC 6803 genome (Additional file 1: Table S1), suggesting that the sequencing is deep enough to cover almost all species of transcripts in the cells. Abundance of the qualified mRNA-based raw sequence reads ranged from 1 to 341,135 for control samples, and from 1 to 154,326 for ethanol-treated samples, representing an expression dynamic range of 10^5, which is higher than the 10^3 to 10^4 of typical microarray-based analyses [11,12]. Using Reads Per Kilobase of Gene per Million Mapped Reads (RPKM) as an index of the normalized transcript abundance [13], we identified the top expressed genes under the control and ethanol-treated conditions through the growth time course (Table 2).
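The dynamic-range and per-sample figures quoted above follow from simple arithmetic on the reported read counts. The short sketch below reproduces them; the helper name `orders_of_magnitude` is purely illustrative and not part of the original analysis pipeline.

```python
import math


def orders_of_magnitude(min_reads: int, max_reads: int) -> float:
    """Return the dynamic range as log10(max/min) of raw read counts."""
    if min_reads <= 0 or max_reads < min_reads:
        raise ValueError("need 0 < min_reads <= max_reads")
    return math.log10(max_reads / min_reads)


# Read-count extremes quoted in the text for control and treated samples.
control_range = orders_of_magnitude(1, 341_135)
treated_range = orders_of_magnitude(1, 154_326)

# 112 million raw reads spread over nine samples.
mean_reads_per_sample_millions = 112 / 9

print(f"control: ~10^{control_range:.1f}, treated: ~10^{treated_range:.1f}")
print(f"mean raw reads per sample: {mean_reads_per_sample_millions:.1f} million")
```

Both ranges fall between five and six orders of magnitude, which is the basis for the 10^5 figure compared against microarrays.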
The top 50 expressed genes were found to be involved mostly in energy metabolism, including genes coding for the photosynthesis-related phycocyanin alpha subunit, phycocyanin beta subunit and photosystem I subunit XI and genes coding for multiple subunits of ATP synthase, followed by genes involved in protein synthesis, such as those encoding multiple 50S ribosomal proteins and elongation factor, and genes involved in CO2 fixation such as ribulose bisphosphate carboxylase genes, consistent with previous analyses of highly expressed genes in Synechocystis [14,15]. Interestingly, several genes encoding hypothetical proteins (ssl0483, slr0144, slr1470 and slr0373) were also among the top expressed genes, suggesting that they may have important physiological functions. Although their exact functions are still unknown, slr0144 has been suggested to encode a PSII-associated protein [16], and slr0373 forms an operon with slr0374, which has been found to be responsive to various environmental stresses [17].
Using a cutoff of 1.5-fold change in both biological replicates, we determined that 1874 and 274 genes were down- and up-regulated by ethanol, respectively. For the down-regulated genes, 1343, 596 and 830 genes were down-regulated at 24, 48 and 72 h, respectively. Among them, 167 genes were down-regulated at all three time points (Additional file 2: Table S2). The functional categories of the down-regulated genes are shown in Figure 2. The most affected functional categories were "hypothetical proteins" and "unknown function", representing more than 68% of all the down-regulated genes, consistent with the fact that nearly half of the Synechocystis genome is still annotated as hypothetical [10,18]. Other strongly affected functional categories included "Energy metabolism", "Protein synthesis" and "Regulatory functions". Down-regulation of central metabolism is consistent with the overall slower growth upon ethanol stress [9]. For the up-regulated genes, 29, 114 and 161 genes were up-regulated at 24 h, 48 h and 72 h, respectively, among which 3 genes were up-regulated at all three time points (Table 3 and Additional file 3: Table S3). The larger number of genes up-regulated at late growth phases suggests that cells needed time to adjust their metabolism and initiate resistance responses.
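The per-time-point counts sum to more than the overall totals because many genes respond at more than one time point: the overall total is the union of the three gene sets, and the genes regulated "at all three time points" are their intersection. A minimal sketch of that bookkeeping follows; the gene identifiers and the function name `summarize_timepoints` are hypothetical placeholders, not data from this study.

```python
from typing import Dict, Set, Tuple


def summarize_timepoints(per_timepoint: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (genes regulated at any time point, genes regulated at all)."""
    gene_sets = list(per_timepoint.values())
    regulated_any = set().union(*gene_sets)
    regulated_all = set.intersection(*gene_sets)
    return regulated_any, regulated_all


# Toy example with made-up gene names (not real results).
down_regulated = {
    "24h": {"geneA", "geneB", "geneC"},
    "48h": {"geneB", "geneC", "geneD"},
    "72h": {"geneC", "geneE"},
}

regulated_any, regulated_all = summarize_timepoints(down_regulated)
print(len(regulated_any), "genes at any time point;",
      len(regulated_all), "at all three")
```

In the toy data the per-time-point counts sum to eight, while only five distinct genes are involved and one is shared by all three time points.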

Correlation with quantitative RT-PCR analysis
Based on their expression levels and patterns of regulation by ethanol, a subset of 12 genes was selected for quantitative RT-PCR validation. According to the RNA-seq transcriptomics data, six of these genes were down-regulated (sll0721, sll1796, slr1992, sll0248, sll1327, ssr1399) and six were up-regulated (sll1734, slr1761, slr1828, sll1091, slr0288, sll0057) by ethanol. Under the control condition, their expression levels varied from a normalized RPKM value of 2529.6 for sll0248 (encoding a flavodoxin) to 421749.3 for ssr1399 (encoding ribosomal protein S18) (Additional file 4: Table S4). RT-PCR analysis was performed for these genes in the treated and control samples at all three time points (24, 48 and 72 h). The results showed a clear positive correlation between the qRT-PCR and RNA-Seq transcriptomics data (correlation coefficient of 0.75-0.8) (Figure 3), suggesting good quality of the RNA-seq data.
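The agreement between the two platforms is summarized by a correlation coefficient. The sketch below shows one common way to compute it, a Pearson correlation over log2 expression ratios; the source does not state which correlation measure or transformation was used, so both are assumptions here, and the ratio values are invented for illustration only.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2(treated/control) ratios for a handful of genes.
rnaseq_log2 = [-2.0, -1.0, -0.6, 0.7, 1.0, 2.0]
qpcr_log2 = [-1.5, -1.2, 0.1, 0.4, 0.8, 2.1]

r = pearson_r(rnaseq_log2, qpcr_log2)
print(f"Pearson r between RNA-seq and qRT-PCR log2 ratios: {r:.2f}")
```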

Cells utilize multiple approaches to cope with ethanol stress
Our previous proteomic analysis found that the Synechocystis cells employed a combination of induced common stress response, modifications of cell membrane and envelope, and induction of multiple transporters and cell mobility-related proteins as protection mechanisms against ethanol toxicity [9]. At the transcriptional level, a very similar response was observed [9]. First, we found that common stress responses were induced: one gene encoding a heat-shock DnaK homolog (slr0086) was induced at 72 h. Multiple genes involved in resistance against reactive oxygen species (ROS), such as slr2033 encoding a membrane-associated rubredoxin, slr1109 encoding an ankyrin homolog [19], sll1545 encoding glutathione S-transferase [20], slr0242 encoding a bacterioferritin comigratory protein [21] and slr1379 encoding quinol oxidase subunit I [22], were up-regulated. In addition, consistent with findings from the proteomic analysis, we found that the circadian rhythms of Synechocystis sp. PCC 6803 were also regulated by ethanol. It was reported that cyanobacterial circadian rhythms are controlled by a cluster of three genes, kaiA, kaiB, and kaiC [23]. The previous proteomic analysis showed that one of the key circadian clock proteins, KaiB (Slr0757), was induced [9]. RNA-Seq transcriptomic analysis showed that the kaiC gene (slr0758) was also induced (Table 3). The transcriptomics analysis here complemented the proteomic analysis well, further supporting that circadian clock genes are induced by ethanol treatment. The ethanol-induced genes are listed in Table 3, while the induced genes encoding hypothetical proteins are provided in Additional file 3: Table S3.
Cross-membrane transporters for small molecules were suggested as one important mechanism against ethanol toxicity in early studies with yeast [24,25]. In cyanobacteria, transporters are also involved in tolerance to many different types of stresses, such as arsenate, Cu2+, salinity and heavy metals [26][27][28][29][30].
Our quantitative proteomic analysis also identified 5 putative transporters with different substrate specificities induced by ethanol exposure [9]. RNA-Seq based transcriptomics found that 12 transporters were induced by ethanol at varying growth phases. Similarly, these transport proteins had a wide range of putative functions and substrate specificities: sll0759 encoding an ABC transporter ATP-binding protein, slr0949 encoding an integral membrane protein of the ABC-type Nat permease, sll0540 encoding a phosphate-binding protein PstS homolog, sll0671 encoding a probable cation transporter, sll0536 encoding a probable potassium channel protein, sll1428 encoding a probable sodium-dependent transporter, slr2131 encoding an RND multidrug efflux transporter, sll0384 encoding a cation and iron carrying protein, sll1041 encoding a sulfate transport ATP-binding protein CysA, sll0374 encoding a urea transport system ATP-binding protein, slr0678 encoding a biopolymer transport ExbD-like protein, and slr1452 encoding a sulfate transport system substrate-binding protein. Interestingly, they represented a totally different set of ethanol-induced transporters from those revealed by the proteomics analysis [9], although they shared some similarity in substrate specificity with two of the previously identified transporters, Sll0689 (a sodium-dependent transporter) and Slr1295 (an iron transporter).
Early studies have found that many microbes can modify their cell membrane and envelope to increase tolerance to ethanol [24,31]. One well-described change is the shift from cis to trans unsaturated fatty acids to decrease membrane fluidity, resulting in a corresponding increase in solvent tolerance [8]. RNA-seq transcriptomics analysis showed that slr1350 encoding acyl-lipid desaturase was up-regulated at 72 h. In a previous study, the acyl-lipid desaturase (desA) gene from Synechocystis sp. PCC 6803 was expressed in prokaryotic (E. coli) and eukaryotic (Solanum tuberosum) cells, which led to enhanced cold tolerance due to an increased unsaturated fatty acid concentration in their lipids [32]. Several genes encoding cell envelope proteins were found to be induced by ethanol exposure (Table 3). The slr0819 gene encoding apolipoprotein N-acyltransferase was induced 2.17 and 1.67 fold in the two biological replicates at 72 h. Apolipoprotein N-acyltransferase transfers an acyl group from sn-1-glycerophospholipid to the free alpha-amino group of the N-terminal cysteine of apolipoproteins, resulting in mature triacylated lipoprotein, which plays an important role in the survival of Staphylococcus aureus in mice [33,34]. The sll1370 gene encoding a mannose-1-phosphate guanylyltransferase was induced 3.25 and 1.88 fold in the two biological replicates at 72 h. Mannose-1-phosphate guanylyltransferase is involved in lipopolysaccharide biosynthesis, which has been found necessary for adaptation to high external NaCl stress in Rhizobium tropici [35]. The slr1910 gene encoding a probable N-acetylmuramoyl-L-alanine amidase was induced 1.71 and 2.00 fold in the two biological replicates at 72 h. N-acetylmuramoyl-L-alanine amidase has been suggested to be involved in degradation and reconstruction of the cell peptidoglycan layer in Anabaena sp. strain PCC 7120 [36].
Up-regulation of these cell envelope proteins by ethanol exposure could contribute to strengthening the cell wall and extracellular matrix for stress resistance, although the mechanism still needs more investigation.
Polyhydroxyalkanoates (PHAs) are highly reduced bacterial storage compounds that are accumulated in most bacteria during unbalanced growth conditions [37]. Accumulation and degradation of PHAs endow bacteria with enhanced survival, competition abilities, and stress tolerance, increasing fitness in changing environments [37,38]. RNA-seq analysis showed that two genes involved in PHA biosynthesis, slr1994 encoding a PHA-specific acetoacetyl-CoA reductase and slr1993 encoding a PHA-specific beta-ketothiolase, were up-regulated. Genetic analysis suggested that these two genes are probably located in the same operon [39]. Of the two, slr1994 was up-regulated significantly at all three time points (24, 48 and 72 h), with 6.0 and 9.0 fold increases in the two biological replicates at 24 h (Table 3). Although PHA accumulation has been reported for many natural stress conditions [38], this is the first report that this pathway is also responsive to organic solvents and biofuels.
One factor that may affect the long-term survival of bacterial cells in a population is the level of damage incurred by macromolecules via the nonenzymatic process of glycation, which is responsible for the formation of several compounds identified as advanced glycation end products (AGEs) [40]. Many biochemical pathways produce reactive dicarbonyl intermediates, such as glyoxal and methylglyoxal (MG), which can further react with DNA, proteins, or other biomolecules to form AGEs [40]. In E. coli, the predominant MG detoxification system has been found to consist of glyoxalase enzyme I, which converts MG to S-lactoyl glutathione [41]. In plants, the level of MG is enhanced upon exposure to different abiotic stresses, and overexpression of glyoxalase pathway genes can support survival and growth of transgenic plants under various abiotic stresses [42]. RNA-Seq analysis of the ethanol-treated cells showed that lactoylglutathione lyase (also called glyoxalase enzyme I) was up-regulated significantly, by 5.14 and 5.0 fold in the two biological replicates at 72 h, suggesting that the glyoxalase pathway may play an important role in resistance to ethanol stress in Synechocystis.
In the previous proteomics analysis, we unexpectedly discovered that many proteins involved in multiple aspects of photosynthesis activity (i.e. photosystem I and II, cytochrome, ferredoxin) were up-regulated even when cell growth was slowed down. We further confirmed the results by comparatively measuring chlorophyll a concentration in cells [9]. Based on our results, we proposed that ethanol treatment might enhance photosynthesis in Synechocystis to generate more ROS, which would trigger the oxidative stress response [9]. RNA-seq transcriptomics analysis showed very similar results: although cell growth was slow and genes involved in energy metabolism and protein synthesis were mostly down-regulated (Figure 2 and Additional file 2: Table S2), genes involved in photosystem I and II, light collection and electron transfer, such as ssl0563 encoding photosystem I subunit VII, smr0009 encoding photosystem II PsbN protein, sll1051 encoding phycocyanin alpha-subunit phycocyanobilin lyase, sll1471 encoding a phycobilisome rod-core linker polypeptide, and slr1828 encoding a ferredoxin, were up-regulated. Among them, sll1051 was increased significantly, by 8.0 and 13.0 fold in the two biological replicates at 72 h (Table 3). In addition, genes for multiple cytochromes, such as slr1185 encoding the cytochrome b6-f complex alternative iron-sulfur subunit, sll1316 encoding the cytochrome b6-f complex iron-sulfur subunit, sll0450 encoding the cytochrome b subunit of nitric oxide reductase, and smr0003 encoding the cytochrome b6-f complex subunit PetM, were also up-regulated. The results further confirmed this unique phenomenon of cyanobacteria under biofuel stress.
RNA-seq transcriptomics analysis identified ten signal transduction proteins induced upon ethanol exposure, including two histidine kinases (sll1473, slr1805), two response regulators (slr0947, sll1330) of bacterial two-component systems (TCS), one serine/threonine kinase (slr1225) and three transcriptional regulators (slr0741, sll0792, sll1423, ssl0707) (Table 3). sll1473, encoding a phytochrome-like sensor histidine kinase, was up-regulated at 48 h. Phytochromes are red/far-red photoreceptors that bear linear tetrapyrrole (bilin) chromophores attached to an N-terminal sensory module, and have been identified in many prokaryotes, including cyanobacteria [43,44]. In one study, the cikA gene of the cyanobacterium Synechococcus elongatus PCC 7942, encoding a phytochrome-related histidine kinase, was found to be involved in signal perception for resetting the circadian clock in response to environmental cues [45]. Although more evidence is needed, the up-regulation of the sll1473 gene may be consistent with the enhanced expression of the kaiC gene (slr0758) related to circadian rhythms. slr0947, encoding a response regulator for energy transfer from phycobilisomes to photosystems, was up-regulated at 72 h after ethanol exposure. An earlier study found that the RpaB response regulator (Slr0947) can bind to the upstream region of the high light (HL)-inducible genes in Synechocystis sp. PCC 6803 to cope with the potentially damaging effects of high light [46]. slr0741, encoding a transcriptional regulator, was up-regulated at 72 h. The gene was previously found to be involved in transduction of the phosphate-limitation signal in Synechocystis [47]. sll1330, encoding a two-component system response regulator of the OmpR subfamily, was induced by ethanol at 48 h. A recent study found that expression of sll1330 can be enhanced by nitrogen depletion under the control of NtcA, which then activates transcript accumulation of sugar catabolic genes during nitrogen starvation [48].
slr1805, encoding a two-component sensor histidine kinase, was up-regulated; it was previously found to participate in the perception and transduction of salt-stress and hyperosmotic-stress signals [49].
Considering that many signal transduction genes were involved in the ethanol-induced responses, we speculated that some of the ethanol-responsive genes may be under direct control of the response regulators or transcriptional regulators. To seek evidence for this hypothesis, we performed a promoter DNA-binding motif search on 500 bp sequences extracted from the upstream regions of all the up-regulated genes using the Gibbs Motif Sampler software [50,51]. This analysis showed that the top conserved motifs identified were two palindromes containing 16 and 17 total sites, with the DNA sequences "AXXCCTGGCCAAGGXXT" and "AAXXTTTXXAAAXXTT", respectively (Figure 4) [52]. Both motif models have several conserved positions with information bits greater than 0.5 and are highly likely to be significant [50]. The genes associated with the first motif included slr0086 encoding a DnaK protein and slr0942 encoding an alcohol dehydrogenase [NADP+], both of which have been confirmed to be involved in ethanol resistance in Clostridium [53], and sll1330 encoding an OmpR subfamily response regulator, which was shown to control the expression of glycolytic genes in Synechocystis sp. PCC 6803 [54]. The genes associated with the second motif included slr1109 encoding an ankyrin, slr1828 encoding a ferredoxin, slr1994 encoding a PHA-specific acetoacetyl-CoA reductase, slr2033 encoding a membrane-associated rubredoxin and slr0940 encoding a zeta-carotene desaturase, which are all involved in stress responses in various microbes. The functions of these motifs may be worth further investigation.
Figure 4 Putative regulatory module identified upstream of common-responsive genes. The motif is represented by a sequence logo generated by the WebLogos software [52].
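Once a consensus such as "AXXCCTGGCCAAGGXXT" has been reported, candidate sites in other upstream regions can be located with a simple pattern scan. The sketch below is not the Gibbs Motif Sampler, which discovers motifs probabilistically; it is only a follow-up lookup that assumes "X" denotes any base, and it scans the forward strand only. The function names and the test sequence are hypothetical.

```python
import re
from typing import List


def motif_to_regex(motif: str) -> str:
    """Translate a consensus motif into a regex, treating 'X' as any base."""
    return "".join("[ACGT]" if base == "X" else base for base in motif.upper())


def find_sites(motif: str, upstream_sequence: str) -> List[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    pattern = re.compile(f"(?={motif_to_regex(motif)})")
    return [m.start() for m in pattern.finditer(upstream_sequence.upper())]


# Made-up upstream fragment containing one instance of the first motif.
toy_upstream = "GGATACCTGGCCAAGGCGTTT"
sites = find_sites("AXXCCTGGCCAAGGXXT", toy_upstream)
print("motif start positions:", sites)
```

In practice each 500 bp upstream region would be scanned in turn, and the reverse complement would be checked as well if strand orientation is not assumed.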

Correlation of transcriptomic and proteomic analyses
While it is well known that RNA expression and protein abundance are not always well correlated [55,56], we have presented evidence above that the overall cellular responses identified from transcriptomics and proteomics are very similar: responses such as induction of the common stress response, transporters, cell envelope proteins and photosynthesis were observed in both the proteomic and transcriptomic datasets. To further compare the proteomic and transcriptomic datasets quantitatively, twenty-three common genes/proteins up-regulated in both the transcriptomics and proteomics datasets were plotted together (Figure 5). The results showed very similar trends of up-regulation, with only five genes up-regulated in the transcriptomic data but showing almost no change in the proteomics dataset (sll1423, ssl0707, slr0947, slr2143 and sll1892). No gene/protein with an opposite direction of regulation was found. In Saccharomyces cerevisiae, it has been proposed that there are three potential reasons for the lack of a strong correlation between transcriptomic and proteomic datasets: i) translational regulation, ii) differences in protein half-lives in vivo and iii) significant levels of experimental error, including differences with respect to the experimental conditions being compared [57,58]. The inconsistency between the transcriptomic and proteomic datasets also highlights that it may not be enough to analyze biological systems at only a single level.
Validation of the potential resistance targets by mutant strains
Two genes, slr0724 and sll1392, which were found to be induced by ethanol exposure at 72 h by 1.5-2.0 and 4.0-5.0 fold, respectively (Table 3, Additional file 3: Table S3), were selected for construction of knockout mutants and for validation of their involvement in ethanol resistance. slr0724 encodes an HtrA suppressor protein homolog (sohA, or prlF) according to the CYORF Cyanobacteria Gene Annotation Database (http://cyano.genome.ad.jp/), and sll1392 encodes a regulatory gene, designated pfsR (photosynthesis, Fe homeostasis and stress-response regulator) [59]. After confirmation by PCR and sequencing analysis, the mutants were grown in parallel with wild type Synechocystis sp. PCC 6803 in both normal BG11 medium and BG11 medium supplemented with 1.5% ethanol. Comparative analysis showed that although there was no visible difference in growth patterns between the wild type and the mutants in BG11 medium (Figure 6A), the slr0724 and sll1392 mutants grew more slowly than the wild type under 1.5% ethanol (Figure 6B and C), suggesting that the mutants are more sensitive to ethanol and that the genes slr0724 and sll1392 may be involved in ethanol resistance. In addition, the growth difference between the wild type and the mutants became more pronounced at the late growth phases (i.e. 60-72 h), consistent with the transcriptomic results that both genes were up-regulated only at 72 h (Table 3, Additional file 3: Table S3). According to the NCBI annotation (NCBI accession ID: NP_439991.1), the slr0724 gene could be involved in protein secretion, and it induces a growth defect when overproduced or mutated; however, under our growth conditions, no difference in growth was observed between the mutant and the wild type strains (Figure 6A). In addition, the PrlF mutation was found to induce the activity of the Lon protease.
In prokaryotic cells, the ATP-dependent Lon proteases are involved in the turnover of misfolded proteins and the degradation of regulatory proteins, and depending on the organism, these proteases contribute variably to stress tolerance [60,61]. Early studies have shown that lon mutants of Campylobacter jejuni grow poorly at high temperature [60] and that Lon protease is involved in the control of the SOS response, acid tolerance and nutritional deprivation in Escherichia coli [61]. Whether a similar biological process is also functional against ethanol in Synechocystis sp. PCC 6803 still needs more evidence. An early study found that the sll1392 (pfsR) deletion mutants were less sensitive to iron limitation under low light conditions and suffered less lipid peroxidation following exposure to high light, suggesting a critical role of PfsR in the regulation of iron homeostasis and stress response [59]. The relationship between ethanol stress and iron homeostasis in Synechocystis sp. PCC 6803 may be worth further investigation.

Conclusions
To fully elucidate microbial metabolism and its responses to ethanol, it is necessary to include functional characterization and accurate quantification of all levels of gene products: mRNA, proteins and even metabolites [54]. While high-throughput 'omics' approaches to analyze molecules at different cellular levels are rapidly becoming available, it is also becoming clear that any single 'omics' approach may not be sufficient to characterize the complexity of biological systems. To confirm the previous proteomic analysis and to reveal more responses at the transcriptional level, in this study we applied a quantitative RNA-Seq based transcriptomics approach combined with quantitative reverse-transcription PCR (RT-PCR) analysis to reveal the global transcriptomic responses to ethanol in Synechocystis sp. PCC 6803. The results showed that Synechocystis probably employs multiple and synergistic resistance mechanisms in dealing with ethanol stress. In addition, we found that the overall cellular responses inferred from the transcriptomic and proteomic analyses were very similar, although the responsive genes were not always the same. By constructing knockout mutants and analyzing their ethanol tolerance, we have provided preliminary validation that the targets identified by this study could be used to obtain ethanol-tolerant cyanobacterial hosts by genetic engineering in Synechocystis sp. PCC 6803. Finally, our results showed that knocking out the potential targets individually caused only a partial loss of ethanol tolerance, consistent with the earlier conclusion that microbes tend to employ multiple resistance mechanisms in dealing with the stress of a single biofuel product [7,8]. With the ethanol-tolerance gene targets discovered in this study and in the previous proteomic analysis [9], it may be possible to engineer multiple gene targets from different cellular functional categories simultaneously to achieve high-tolerance hosts in the future.

Bacterial growth conditions and ethanol treatment
Synechocystis sp. PCC 6803 was grown in BG11 medium (pH 7.5) under a light intensity of approximately 50 μmol photons m⁻² s⁻¹ in an illuminating incubator shaking at 130 rpm at 30°C (HNY-211B Illuminating Shaker, Honour, China). Cell density was measured on a UV-1750 spectrophotometer (Shimadzu, Japan). For growth and ethanol treatment, 10 mL of fresh cells at an OD730 of 0.5 were collected by centrifugation and then inoculated into 50 mL of BG11 liquid medium in a 250-mL flask. Ethanol at varying concentrations was added at the beginning of cultivation. Culture samples of 1 mL were taken every 12 h and their OD730 was measured. Morphology of Synechocystis sp. PCC 6803 control and ethanol-treated samples was observed using a BX43 fluorescence microscope (Olympus, Japan). Cells for transcriptomics analysis were collected by centrifugation at 8,000 × g for 10 min at 4°C.

RNA preparation and cDNA synthesis
Approximately 10 mg of cell pellets were frozen in liquid nitrogen immediately after centrifugation, and cell walls were broken by mechanical cracking at low temperature. Cell pellets were then resuspended in Trizol reagent (Ambion, Austin, TX) and mixed well by vortexing.

Transcriptomics data analysis
Sequence reads were pre-processed using the FASTX Toolkit (version 0.0.13) to remove low-quality bases and reads shorter than 20 bp. The qualified sequence reads were then mapped to non-coding RNA (ncRNA) sequences using Bowtie (version 2.0.0) with default settings. Genome sequences (including ncRNA sequences) and annotation information of Synechocystis sp. PCC 6803 were downloaded from NCBI and the Comprehensive Microbial Resource (CMR) of TIGR (http://www.tigr.org/CMR; downloaded on April 22, 2012) [10].
Reads that mapped to ncRNA sequences were excluded from further analysis. For paired-end Illumina reads, both mates were removed if either mate mapped to rRNA. The remaining reads were mapped to the Synechocystis sp. PCC 6803 genome using Bowtie (version 2.0.0) with the default parameters. For gene expression determination, we performed a standard calculation of Reads Per Kilobase of Gene Per Million Mapped Reads (RPKM) based on the following formula [13]:
$$\mathrm{RPKM} = \frac{\text{transcription\_reads}}{\text{transcription\_length} \times \text{total\_mapped\_reads\_in\_run}} \times 10^{9}$$
in which "transcription_reads" stands for the number of reads mapped to a given gene, "transcription_length" stands for the gene length, and "total_mapped_reads_in_run" stands for the total number of mapped reads in a given measurement. For each time point, two biological replicates of ethanol-treated samples and their controls were analyzed, and the corresponding gene expression ratios based on RPKM were calculated. Genes with a fold change of at least 1.5 in both biological replicates were considered differentially regulated.
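The RPKM calculation and the two-replicate fold-change criterion described above can be illustrated with a minimal Python sketch. The function names and the example numbers are hypothetical and are not taken from the study's own pipeline. The sketch also assumes that down-regulation is called symmetrically (ratio of 1/1.5 or lower) and that the threshold is inclusive, neither of which is stated explicitly in the text.

```python
def rpkm(gene_reads, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of gene per Million mapped reads.

    gene_reads:         reads mapped to the gene
    gene_length_bp:     gene length in base pairs
    total_mapped_reads: total mapped reads in the run
    """
    return gene_reads * 1e9 / (gene_length_bp * total_mapped_reads)


def expression_ratio(rpkm_treated, rpkm_control):
    """Ratio of treated to control expression for one biological replicate."""
    if rpkm_control == 0:
        # A zero control value has no defined ratio; how to handle it
        # (for example with a pseudocount) is left open here.
        raise ValueError("control RPKM is zero; ratio undefined")
    return rpkm_treated / rpkm_control


def is_differential(ratios, threshold=1.5):
    """True if every replicate changes by at least `threshold`-fold
    in the same direction (assumed symmetric for down-regulation)."""
    up = all(r >= threshold for r in ratios)
    down = all(r <= 1.0 / threshold for r in ratios)
    return up or down


# Hypothetical example: 500 reads on a 1,000 bp gene, 10 million mapped reads.
example_rpkm = rpkm(500, 1000, 10_000_000)

# Hypothetical ratios from two biological replicates.
consistent = is_differential([2.0, 1.6])    # both replicates up by >= 1.5-fold
inconsistent = is_differential([2.0, 1.2])  # second replicate below threshold
```

Requiring agreement between both replicates, rather than averaging them, means a single noisy replicate cannot by itself place a gene on the differentially regulated list.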

Quantitative real-time RT-PCR analysis
The RNA samples were collected from cells grown under the same growth conditions as described above for transcriptomic analysis. Approximately 10 mg of cell pellets were frozen in liquid nitrogen immediately after centrifugation, and cell walls were broken by mechanical cracking at low temperature. Cell pellets were then resuspended in Trizol reagent (Ambion, Austin, TX) and mixed well by vortexing. Total RNA was extracted using a miRNeasy Mini Kit (Qiagen, Valencia, CA). First-strand cDNAs were synthesized using RevertAid™ Reverse Transcriptase (Fermentas, Glen Burnie, MD). cDNA was subjected to eight-hundred-fold dilutions, and 2 μl of each dilution was used as template for the subsequent qPCR reaction. The qPCR was carried out in 20 μl reactions containing 10 μl of SYBR® Green PCR Master Mix (Applied Biosystems, Foster City, CA) and 2 μl of each PCR primer at 2 mM, employing the StepOnePlus™ Real-Time PCR System (Applied Biosystems, Foster City, CA), under the following conditions: 50°C for 2 min and 95°C for 10 min, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min. Gene expression was quantified according to the standard RT-PCR procedure, in which serial dilutions of chromosomal DNA of known concentration are used as template to construct a standard curve. A total of 18 genes were selected for verification based on their differential expression patterns revealed by iTRAQ, and the rnpB gene (6803s01) encoding RNase P subunit B was used as an internal control according to the previous publication [62]. Three technical replicates were performed for each gene. Data analysis was carried out using the StepOnePlus analytical software (Applied Biosystems, Foster City, CA).
Briefly, the amount of relative gene transcript was normalized by that of rnpB in each sample (wild type or mutant), using the following method:
$$R_{x} = \frac{2^{(Ct_{\text{control}} - Ct_{\text{treated}})\ \text{of}\ x}}{2^{(Ct_{\text{control}} - Ct_{\text{treated}})\ \text{of}\ rnpB}}$$
where R_x is the relative gene expression of gene x. Data were then presented as ratios of the amount of normalized transcript in the treatment to that from the control. The gene IDs and their related primer sequences used for real-time RT-PCR analysis are listed in Additional file 4: Table S4.
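As an illustration of the rnpB normalization above, the following minimal Python sketch computes the relative expression of a target gene from Ct values. The function name and the Ct values are hypothetical. The base of 2 follows the formula as written, which implicitly assumes that the template doubles in every PCR cycle for both the target gene and rnpB.

```python
def relative_expression(ct_control_x, ct_treated_x,
                        ct_control_rnpb, ct_treated_rnpb):
    """Relative expression of gene x, normalized to the rnpB internal control.

    Each argument is a Ct value (for example, the mean of technical replicates).
    """
    # Change in the target gene between control and treated samples.
    target_change = 2.0 ** (ct_control_x - ct_treated_x)
    # Change in the reference gene rnpB between the same two samples.
    reference_change = 2.0 ** (ct_control_rnpb - ct_treated_rnpb)
    return target_change / reference_change


# Hypothetical Ct values: the target gene crosses the threshold two cycles
# earlier in the treated sample, while rnpB is unchanged.
ratio = relative_expression(25.0, 23.0, 18.0, 18.0)
```

Dividing by the rnpB term corrects for differences in template input between the two samples, so a shift that affects the target gene and rnpB equally yields a ratio of 1.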