Strains and culture conditions
Clostridium thermocellum DSM1313 and mutant strains C. thermocellum Δhpt ΔhydG (referred to as ΔhydG) and C. thermocellum Δhpt ΔhydG Δech (referred to as ΔhydG Δech) were grown in CTFUD medium and MTC minimal medium, prepared as described previously. CTFUD medium contained (per liter): 3 g sodium citrate tribasic dihydrate, 1.3 g ammonium sulfate, 1.43 g potassium phosphate monobasic, 1.8 g potassium phosphate dibasic trihydrate, 0.5 g cysteine-HCl, 10.5 g 3-morpholino-propane-1-sulfonic acid (MOPS), 6 g glycerol-2-phosphate disodium, 5 g cellobiose, 4.5 g yeast extract, 0.13 g calcium chloride dihydrate, 2.6 g magnesium chloride hexahydrate, 0.0011 g ferrous sulfate heptahydrate, and 0.0001 g resazurin, adjusted to pH 7.0. MTC medium contained (per liter): 2 g sodium citrate dihydrate, 1.25 g citric acid monohydrate, 1 g sodium sulfate, 1 g potassium phosphate dibasic trihydrate, 2.5 g sodium bicarbonate, 1.5 g ammonium chloride, 2 g urea, 1 g magnesium chloride hexahydrate, 0.2 g calcium chloride dihydrate, 0.1 g ferrous chloride tetrahydrate, 1 g L-cysteine hydrochloride monohydrate, 5 g cellobiose, 0.001 g resazurin, 5 g MOPS, 20 mg pyridoxamine dihydrochloride, 1 mg riboflavin, 1 mg nicotinamide, 0.5 mg DL-thioctic acid, 4 mg 4-aminobenzoic acid, 4 mg D-biotin, 0.025 mg folic acid, 2 mg cyanocobalamin, 4 mg thiamine hydrochloride, 0.5 mg MnCl2·4H2O, 0.5 mg CoCl2·6H2O, 0.2 mg ZnSO4·7H2O, 0.05 mg CuSO4·5H2O, 0.05 mg H3BO3, 0.05 mg Na2MoO4·2H2O, and 0.05 mg NiCl2·6H2O.
Genome resequencing was performed by the Department of Energy Joint Genome Institute (JGI, Walnut Creek, CA) using an Illumina MiSeq instrument. Genomic DNA was extracted using a Qiagen DNeasy kit (Qiagen, Valencia, CA), sheared to 500 bp fragments using a Covaris LE220 ultrasonicator (Covaris), and size selected using AMPure XP SPRI beads (Beckman Coulter). The fragments were subjected to end repair, A-tailing, and ligation of Illumina-compatible adapters (IDT, Inc) using the KAPA-Illumina library creation kit (KAPA Biosystems). The prepared libraries were quantified using the KAPA Biosystems next-generation sequencing library qPCR kit on a Roche LightCycler 480 real-time PCR instrument. The quantified multiplexed libraries were pooled in sets of 10 and sequenced on the Illumina MiSeq sequencer using an indexed PE150 protocol with MiSeq V2 chemistry.
Resequencing data were analyzed using QIAGEN Bioinformatics CLC Genomics Workbench (http://www.qiagenbioinformatics.com/products/clc-genomics-workbench), which incorporates a set of analysis tools for next-generation sequencing data. Paired-end reads were mapped to the reference genome [GenBank: CP002416] using the built-in Map Reads to Reference Tool. The read mapping was then refined with the Local Realignment Tool, which attempts to realign each mapped read by exploiting the alignment information of other mapped reads. Realignment typically occurs in areas around insertions and deletions in the sample reads relative to the reference, resulting in more accurate mapping. Mapped reads were next analyzed with the built-in Basic Variant Detection Tool for putative SNV and MNV detection, and with the InDels and Structural Variants Tool for detection of putative structural variants. Variants occurring in <90% of the reads and variants that were identical to those of the parent Δhpt strain (e.g., due to errors in the reference sequence or mutations present at the beginning of strain construction) were filtered out. Raw data are available from the JGI Sequence Read Archive (JGI Project Id: 1053867 and 1053888).
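The two filtering criteria above (a read-frequency cutoff and removal of variants shared with the parent strain) can be illustrated with a minimal sketch. This is not the CLC Genomics Workbench implementation; the `Variant` record, its field names, and the `filter_variants` function are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; not a CLC Genomics Workbench data type."""
    position: int     # 1-based coordinate on the reference genome
    ref: str          # reference allele
    alt: str          # observed allele
    frequency: float  # percent of mapped reads supporting the variant


def filter_variants(mutant, parent, min_frequency=90.0):
    """Keep mutant variants found in at least min_frequency percent of reads
    and not present (same position and alleles) in the parent strain."""
    parent_keys = {(v.position, v.ref, v.alt) for v in parent}
    return [
        v for v in mutant
        if v.frequency >= min_frequency
        and (v.position, v.ref, v.alt) not in parent_keys
    ]
```

Matching on position and alleles only, rather than on frequency as well, is an assumption made here so that a variant inherited from the parent is removed even if its read support differs slightly between samples.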
The inoculum for batch fermentation was prepared by growing the mutants in MTC medium overnight at 55 °C in an anaerobic chamber (COY Laboratory Products, Grass Lake, MI). Fermentations were carbon limited and were carried out in 27 mL Balch tubes containing 10 mL of MTC medium with 5 g L−1 cellobiose as the carbon source, supplemented with 5 mM sodium acetate where noted, under an N2 headspace and sealed with butyl rubber stoppers. The tubes were inoculated with 0.5% v/v culture and incubated at 55 °C. Fermentation products were determined after 53 h of growth. The final cellobiose concentration was usually <0.5 mM, suggesting that fermentation was complete. Fermentations were performed at least two times with three independent biological replicates each. The “No Acetate” data, which were generated simultaneously with the “Added Acetate” data reported here, were previously reported.
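As a rough check on the quantities above, the following sketch works through the inoculum volume and the fraction of cellobiose consumed. It assumes that the 0.5% v/v inoculum is expressed relative to the 10 mL of medium and uses a molar mass of 342.30 g/mol for cellobiose (C12H22O11); the function names are illustrative only.

```python
CELLOBIOSE_MW = 342.30  # g/mol for C12H22O11 (assumed anhydrous cellobiose)


def inoculum_volume_ul(medium_ml, percent_vv):
    """Inoculum volume in microliters, assuming % v/v is relative to the medium volume."""
    return medium_ml * percent_vv / 100.0 * 1000.0


def g_per_l_to_mm(grams_per_liter, molar_mass):
    """Convert a mass concentration (g/L) to a molar concentration (mM)."""
    return grams_per_liter / molar_mass * 1000.0


def fraction_consumed(initial_mm, final_mm):
    """Fraction of substrate consumed between the initial and final measurements."""
    return 1.0 - final_mm / initial_mm


initial_mm = g_per_l_to_mm(5.0, CELLOBIOSE_MW)   # about 14.6 mM cellobiose
print(inoculum_volume_ul(10.0, 0.5))             # inoculum volume for a 10 mL tube
print(fraction_consumed(initial_mm, 0.5))        # lower bound when residual is 0.5 mM
```

Under these assumptions, a residual concentration below 0.5 mM corresponds to more than 96% of the initial cellobiose being consumed, which is consistent with treating the fermentation as complete.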
Fermentation products, including ethanol, acetate, lactate, and formate, were analyzed on a Breeze 2 high-performance liquid chromatography system (Waters Corp, Milford, MA) using an Aminex HPX-87H column with a 5 mM sulfuric acid mobile phase. Sulfide was measured using an Orion silver ion-selective electrode (Thermo Fisher Scientific, Waltham, MA) as previously described. H2 was measured on a 6850 Series II gas chromatograph (Agilent Technologies, Santa Clara, CA) equipped with a thermal conductivity detector at 190 °C with an N2 reference flow and a Carboxen 1010 PLOT column (30.0 m × 530 µm I.D.; model Supelco 25467).
Cells were grown to an OD of 0.3–0.4 in CTFUD medium, centrifuged at 4 °C for 5 min, and immediately flash frozen in liquid N2. Pelleted cells were resuspended in 1.5 mL of TRIzol (Invitrogen, Carlsbad, CA). Glass beads (0.8 g of 0.1 mm beads; BioSpec Products, Bartlesville, OK) were added to the cell suspension, and the cells were lysed with 3 × 20 s bead-beating treatments at 6500 rpm in a Precellys 24 high-throughput tissue homogenizer (Bertin Technologies, Montigny-le-Bretonneux, France). Total RNA was purified using an RNeasy kit (Qiagen, Valencia, CA) with on-column DNase I treatment. RNA quantity was determined with a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific), and RNA quality was assessed with an Agilent Bioanalyzer (Agilent Technologies). RNA (10 µg) was used as the template to generate ds-cDNA using the Invitrogen ds-cDNA synthesis kit according to the manufacturer’s protocols (Invitrogen).
Microarray sample labeling, hybridization, scan, and statistical analysis of array data
The ds-cDNA was labeled, hybridized, and washed according to the NimbleGen protocols. Hybridizations were conducted using a 12-bay hybridization station (BioMicro Systems, Salt Lake City, UT), and the arrays were dried using a Maui wash system (BioMicro Systems). Microarrays were scanned with a SureScan high-resolution DNA microarray scanner (5 µm) (Agilent Technologies), and the images were quantified using the NimbleScan software (Roche NimbleGen, Madison, WI). Raw data were log2 transformed and imported into the statistical analysis software JMP Genomics 6.0 (SAS Institute, Cary, NC). The data were normalized together using a single round of the LOESS normalization algorithm within JMP Genomics, and distribution analyses were conducted before and after normalization as a quality control step. An ANOVA was performed in JMP Genomics to determine differential expression levels between conditions using the False Discovery Rate (FDR) testing method (p < 0.05). Microarray data have been deposited in the NCBI Gene Expression Omnibus (GEO) database under accession number GSE54082. Data are the average of three independent biological replicates.
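The log2 transformation and FDR control described above can be sketched as follows. The text does not state which FDR procedure JMP Genomics applied, so the Benjamini-Hochberg adjustment shown here is an assumption, and the LOESS normalization step is omitted. The function names are illustrative.

```python
import math


def log2_transform(intensities):
    """Log2-transform raw (strictly positive) probe intensities."""
    return [math.log2(x) for x in intensities]


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumed FDR procedure; the source only names 'the FDR testing method'.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant(pvalues, alpha=0.05):
    """Flag tests whose adjusted p-value falls below alpha."""
    return [q < alpha for q in benjamini_hochberg(pvalues)]
```

For example, `significant([0.01, 0.04, 0.03, 0.5])` flags only the first test, because the second and third raw p-values no longer fall below 0.05 after adjustment for four comparisons.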
Real-time quantitative-PCR (RT-qPCR) analysis
Microarray data were validated using real-time qPCR, as described previously. Based on microarray hybridizations of the C. thermocellum mutants, a set of five genes (Clo1313_0115, Clo1313_0147, Clo1313_0372, Clo1313_1559, and Clo1313_2243) representing a range of gene expression values was analyzed by qPCR using the cDNA prepared for the microarrays. Oligonucleotide sequences of the primers targeting the five selected genes are shown in Additional file 1: Table S1. Data are the average of three independent biological replicates.
The crude protein fraction of each C. thermocellum cell pellet was processed and digested with trypsin, and peptides were eluted and analyzed over an 11-step MudPIT as described previously. High mass accuracy was used for both MS1 (30 K resolution) and MS2 (7.5 K resolution; CID) scans (1 microscan each), with data-dependent acquisition settings as follows: 1 full scan followed by 20 MS/MS scans, isolation window = 2.1 m/z, and dynamic exclusion window, duration, and max = −0.52/+1.02 m/z, 15 s, and 500, respectively. Peptides generated from a C. thermocellum strain DSM1313 FASTA database concatenated with common contaminants and reversed entries were matched to MS/MS spectra using MyriMatch v. 2.1. Common sample prep-induced modifications, i.e., Cys + 57.0214 Da (alkylation; static), Met + 15.9949 Da (oxidation; dynamic), and N-terminus + 43.0058 Da (carbamylation; dynamic), were included in the search parameters. Matches were filtered and assembled using IDPicker v. 3.0, requiring a minimum of two distinct peptides per protein identification and adjusting the minimum spectral count (SpC) per protein to achieve protein-level FDRs < 5%, peptide-level FDRs < 1%, and PSM-level FDRs < 0.25%. Protein identifications with associated SpC were tabulated, balanced, and normalized for semi-quantitative proteomics as previously described. Normalized SpC (nSpC) were used as a proxy for protein abundance across individual samples. To assess differences in protein abundance, the top 99% of total assigned spectra (across all sample conditions) was log2-transformed and processed by ANOVA (JMP Genomics v. 4.1) to assess statistical significance. Proteins with significant differences in abundance (p value ≤ 0.01) and a minimum twofold change were identified and compared with the transcriptomics data to identify proteins affected by the knock-out of hydrogenases as well as by the addition of acetate to the culture. Data are the average of three independent biological replicates.
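The selection criteria for differentially abundant proteins (p value ≤ 0.01 and at least a twofold change) can be expressed as a short sketch. A twofold change in either direction corresponds to an absolute log2 ratio of at least 1. The function name and arguments are hypothetical, the p value is assumed to come from the ANOVA described above, and both nSpC values are assumed to be greater than zero.

```python
import math


def is_differential(nspc_reference, nspc_test, p_value,
                    p_cutoff=0.01, min_fold=2.0):
    """Return True when a protein passes both the p-value and fold-change cutoffs.

    nspc_reference and nspc_test are mean normalized spectral counts (> 0)
    for the two conditions being compared; p_value is supplied by the caller.
    """
    log2_ratio = math.log2(nspc_test / nspc_reference)
    passes_fold = abs(log2_ratio) >= math.log2(min_fold)
    return p_value <= p_cutoff and passes_fold
```

For instance, a protein with nSpC of 10 in the reference and 40 in the test condition (fourfold increase) passes at p = 0.005, but a protein changing from 10 to 15 does not pass regardless of its p value.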
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://www.proteomexchange.org) via the PRIDE partner repository with the data set identifier PXD000777.