Cellular adhesiveness and cellulolytic capacity in Anaerolineae revealed by omics-based genome interpretation
Biotechnology for Biofuels volume 9, Article number: 111 (2016)
The Anaerolineae lineage of Chloroflexi has been identified as one of the core microbial populations in anaerobic digesters; however, the ecological role of the Anaerolineae remains uncertain due to the scarcity of isolates and annotated genome sequences. Our previous metatranscriptomic analysis revealed that this prevalent population showed minimal involvement in the main pathways of cellulose hydrolysis and subsequent methanogenesis in the thermophilic cellulose fermentative consortium (TCF).
In further pursuit, five high-quality curated draft genomes (>98 % completeness) of this population, including two affiliated with the previously inaccessible lineage SBR1031, were retrieved by sequence-based multi-dimensional coverage binning. Comparative genomic analyses revealed versatile genetic capabilities for a carbohydrate-based fermentative lifestyle, including key genes catalyzing cellulose hydrolysis in Anaerolinea phylotypes. However, the low transcriptional activities of carbohydrate-active genes (CAGs) excluded cellulolytic capability as the selective advantage for their prevalence in the community. Instead, a substantially active type IV pili (Tfp) assembly was observed. Expression of the tight adherence protein on the Tfp indicated a function in cellular attachment, which further experiments showed to be more likely related to cell aggregation than to cellulose surface adhesion. Meanwhile, this Tfp structure was found not to contribute to syntrophic methanogenesis. Members of SBR1031 encoded key genes for acetogenic dehydrogenation that may allow ethanol to be used as a carbon source.
The common prevalence of Anaerolineae in anaerobic digesters likely originates from the advantageous cellular adhesiveness enabled by Tfp assembly rather than from its potential as a cellulose degrader or anaerobic syntroph.
Anaerobic digestion, a key environmental technology for resource recovery, is driven by microbial reactions involving a complex community. Anaerolineae (also known as subphylum I of the phylum Chloroflexi) has been identified as one of the core populations and, in most cases, as the dominant fraction of anaerobic digestive systems [1, 2]; however, its roles in the anaerobic digestion process remain uncertain due to a general lack of sequenced genomes.
A series of metabolic reactions such as hydrolysis, acidogenesis (fermentation), acetogenesis, and methanogenesis are involved in the process of anaerobic digestion. Normally, the Anaerolineae lineage has been regarded as a typical fermentative population within the community. As hydrogen can be produced during fermentation of soluble sugars, researchers have also speculated that Anaerolineae act as anaerobic syntrophs that conduct reverse electron transfer via tightly coupled mutualistic interaction with methanogens; however, the involvement of Anaerolineae in syntrophic methanogenesis is not yet confirmed. Additionally, the common ability to grow on starch (alpha-glucan polysaccharides) [3, 4] and the recent discovery of the cellulolytic representative Ornatilinea apprima are attracting increasing research interest in the importance of this lineage in the rate-limiting polysaccharide hydrolysis step of anaerobic digestion. Moreover, the flow velocity within the digester, especially in upflow anaerobic sludge blanket (UASB) reactors, may select for organisms that can adhere to each other to form well-settling granular sludge. The widely distributed Anaerolineae population has been reported as both the backbone of sludge granules and the causative agent of filamentous bulking in UASB reactors [2, 7].
To date, ten strains have been isolated in the class Anaerolineae [3–5, 8–11]. These representative strains, isolated from anaerobic sludge treating various pollutants, help to resolve the phylogenetic composition of this lineage into eight genera representing the single family Anaerolineaceae in the single order Anaerolineales. Phenotypic comparison of the cultivated strains identifies a number of common traits, including filamentous morphology as well as non-motile, non-sporulating, and gram-negative characteristics [3–5, 8–11]. In contrast, genomic information is quite limited for this class; indeed, there is only one finished complete genome, Anaerolinea thermophila UNI-1 (abbreviated as UNI-1 in subsequent discussion), available in IMG 4.0 (as of 12 Jan 2016). Further recovery of interpretable genomes is required to disclose the ecological importance of Anaerolineae as a core population in the anaerobic digestion process.
Rapid accumulation of next generation sequencing (NGS) data from various metagenomes has made genome reconstruction independent of isolation possible [12–15]. Such cultivation-independent binning methods based on multi-dimensional abundance profiles have provided initial genomic insight into the metabolic styles of previously inaccessible phyla such as TM6 and OP8 [15–17]. This approach has helped to expand the evolutionary boundaries of the Dehalococcoidia lineage of the Chloroflexi from obligate organohalide respiration to fermentation, CO2 fixation, and acetogenesis [18, 19]; however, this frontier work did not focus on the Anaerolineae lineage, which is important in the operation of anaerobic reactors.
In the present study, with the purpose of resolving the special physiochemical features that support accumulation of Anaerolineae in the thermophilic cellulose-fermenting reactor, we utilized sequence-based metagenomic binning to recover high-quality genomes from the Anaerolineae lineage. Comparative genomics was conducted to reveal specific genetic traits related to key functions in anaerobic digestion. Beneficial ecological functions of Anaerolineae within the community were inferred from the expressed genes and pathways identified by metatranscriptomic sequencing and then tested experimentally. The information obtained here adds contextual information on the ecological importance of Anaerolineae in anaerobic digestive systems and helps to resolve the intra-lineage physiological differences among the uncultivated majority of this lineage.
Results and discussion
Two-dimensional coverage binning and quality evaluation
Metagenomic DNA extracted from sludge samples collected from the same thermophilic anaerobic cellulose-degrading reactor at two different time points [the short-term enrichment at 120 days (SE) and the long-term enrichment at 545 days (LE)] was deeply sequenced to provide the two differential coverage profiles used for genome reconstruction. The 129,535,600 high-quality reads obtained from Illumina paired-end sequencing (see Additional file 1: Table S1 for a summary) were de novo assembled, resulting in a total of 119 Mb of operational scaffolds (scaffolds longer than 1 kb) with an N50 of 19,859 bp. Most of the reads (87 %) were included in the assembly, with the longest scaffold being 640 kb (Additional file 1: Table S2).
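The N50 statistic quoted above can be illustrated with a short sketch. This is not the authors' script; the function name `n50` and the 1 kb filter argument are assumptions introduced here to mirror the "scaffolds longer than 1 kb" criterion in the text.

```python
def n50(lengths, min_length=1000):
    """Return the N50 of the scaffolds strictly longer than `min_length` bp.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembled bases.
    """
    kept = sorted((n for n in lengths if n > min_length), reverse=True)
    if not kept:
        raise ValueError("no scaffolds pass the length filter")
    half_total = sum(kept) / 2
    running = 0
    for length in kept:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy scaffold lengths in bp (illustrative only, not data from the study).
    print(n50([500, 2000, 3000, 5000]))
```

Scaffolds at or below the cutoff are discarded before the statistic is computed, so the value depends on the chosen filter.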
As shown in Fig. 1, the coverage binning produced six primary genome bins (named TCF-2, 5, 8, 12, 13, and 14) showing closely clustered coverage in the SE and LE metagenomes. The accumulation of Chloroflexi during enrichment facilitated the retrieval of the related genome bins (Additional file 1: Figure S1). Tetra-nucleotide frequency (TNF) was then used to filter out possible contamination at a hierarchical clustering distance of 0.1. The quality of these six primary Chloroflexi genome bins was evaluated in terms of genome completeness and contamination, based respectively on the occupation and duplication of the 107 essential single-copy genes (ESCGs) shared by >95 % of all bacteria (Additional file 1: Tables S3, S4). Except for TCF-14, the genome bins showed completeness and ESCG redundancy comparable to those of the finished genomes of Chloroflexi (completeness larger than 96 % with redundancy less than 5 %, Additional file 1: Table S5). This quality estimation was double checked by an alternative method based on 35 conserved single-copy COGs. Consistent completeness and purity estimates were observed for these five genome bins (Additional file 1: Table S6). The subsequent genome annotation was based only on these five high-quality genome bins. The other retrieved genomes can be found in Additional file 1: Table S7.
Phylogenetic position of curated genomes
To resolve the phylogenetic position of these five genome bins, a neighbor-joining phylogenetic tree was first constructed with the recovered 16S rRNA genes of TCF-2, 5, 12, and 13 and high-quality 16S clone sequences downloaded from the Silva SSU 115 database. As shown in Fig. 2, all of the curated genomes were placed within the class Anaerolineae. TCF-2, TCF-5, and TCF-12 clustered within the order Anaerolineales and were respectively affiliated with A. thermolimosa, Bellilinea caldifistulae, and Thermanaerothrix daxensis. In contrast, TCF-13 showed no confirmed affiliation to any known genus. The closest known relative of TCF-13 is Thermomarinilinea lacunofontalis at a maximum-likelihood evolutionary distance larger than 0.1, suggesting that this species may belong to a novel lineage within the class Anaerolineae. This lineage was named the SHA-31 family of the SBR1031 order in the updated Greengenes taxonomy (published in May 2013). Since the 16S rRNA gene of TCF-8 is too short (only 65 bp) for confident alignment, a maximum-likelihood tree based on the concatenated alignment of 35 single-copy ESCGs shared among the five genome bins and twenty-two finished genomes of Chloroflexi was used in addition to the phylogenetic tree based on 16S rRNA genes. The concatenated clustering indicated close phylogenetic affiliation of TCF-8 to the SBR1031 lineage containing TCF-13 (Fig. 3). Further genome comparison also supported this affiliation: TCF-8 shared far more genes with TCF-13 (811 shared genes) than with any of the genomes of the Anaerolineales lineage (Additional file 1: Figure S2b). TCF-8 and TCF-13 should represent different species of this previously inaccessible lineage, since the average nucleotide identity (ANI) between these two genomes (79.3 %) was less than 94 % and the in silico DNA–DNA hybridization value (DDH of 17.7 %) was much lower than 70 % (Additional file 1: Figure S2a).
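As an illustration of the TNF signature used to refine the bins, the sketch below computes tetra-nucleotide frequencies for two sequences and the Euclidean distance between them (the distance named in the methods). It is a simplified assumption-laden example: it does not merge reverse-complement tetramers and does not perform the hierarchical clustering step, and the function names are hypothetical.

```python
import math
from itertools import product

# All 256 tetramers over the DNA alphabet, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tnf(seq):
    """Return the tetra-nucleotide frequency vector of `seq` (256 values)."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # windows containing N or other symbols are skipped
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no valid tetramers")
    return [counts[k] / total for k in KMERS]


def euclidean(a, b):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    d = euclidean(tnf("ACGTACGTACGT"), tnf("ACGTACGTTTTT"))
    print(f"TNF distance: {d:.3f}")
```

Scaffolds whose signature lies farther than the chosen cutoff (0.1 in the text) from the rest of a bin would be candidates for removal.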
General physiology and prevalence of Anaerolineae in the TCF community
Among the five curated genomes obtained, TCF-2, 5, and 12 showed an average genome size of 3.5 Mb and a GC content of 54 %, consistent with A. thermophila UNI-1 (the only available complete genome of Anaerolineae), while TCF-8 and TCF-13 showed slightly larger genomes (>4.0 Mb) with higher GC content (around 65 %) (Table 1). In line with their phylogenetic affiliation (Figs. 2, 3), complete-linkage clustering on COG orthologs also indicated that TCF-8 and TCF-13 were functionally divergent from the cluster containing TCF-2, 5, 12, and UNI-1 (Additional file 1: Figure S3). In both the SE and LE metagenomes, the order Anaerolineales containing TCF-2, 5, 12, and UNI-1 was generally more prevalent than the SBR1031 order containing TCF-8 and TCF-13 (Fig. 1). TCF-2, accounting for 14.1 % of the SE and 11.7 % of the LE metagenome, was one of the dominant populations within the TCF community. Despite its large population size, TCF-2 expressed only a comparatively small fraction (27.8 % of possible transcripts detected, Table 1) of its genetic complement in situ, and this at a moderate level of expression, suggesting tight regulation of gene expression to facilitate preferential metabolism in TCF-2. The metabolic advantage of this population is discussed below in terms of the major steps of the anaerobic digestion process: fermentative metabolism, cellulose hydrolysis, and syntrophic methanogenesis.
Fermentative lifestyle of Anaerolineae
Anaerolineae showed versatile metabolic abilities for carbohydrate fermentation. The glycolysis pathway towards acetate and lactate production was conserved among TCF-2, 5, 12, and UNI-1 (Additional file 1: Figure S4), suggesting that acetate and lactate are produced during fermentation. TCF-8 and TCF-13 also possessed the complete acetate pathway but not that for lactate generation. Additionally, a common gene cluster encoding NiFe hydrogenase (COG3260, 3261, 3262) and related proteins in the five curated genome bins and UNI-1 indicated the metabolic ability to produce hydrogen during fermentation, consistent with previous experimental results based on isolated strains [3–5, 8–11].
Except for TCF-8, the presence of cellulase M and cellulases of GH05 and GH09 in the other four Anaerolineae genomes indicated their potential ecological roles as cellulose hydrolyzers in the cellulolytic community (Fig. 4). We speculated that the Anaerolineae populations might rely on extracellular cellulase systems for hydrolysis because no cohesin, dockerin (key components of the cellulosome complex), or any cellulase-related carbohydrate-binding modules (CBMs) could be identified in the five curated genomes and UNI-1. Nevertheless, our previous study on the transcriptional characterization of this TCF consortium showed minimal contribution of the Chloroflexi, compared to Clostridiales and Bacteroidetes, to overall cellulose hydrolysis. Such transcriptional inefficiency may be regulatory, in that the cellulase M clusters of Anaerolineae were not protected by a preceding heat shock protein like that found in Clostridiales. Consequently, cellulolytic capacity is unlikely to be the driving force for the prevalence of Anaerolineae within the community.
Major transcription of type IV pili (Tfp)
Based on metatranscriptomic data, strikingly high transcription of the pilA gene (type IV pili assembly protein, designated K02651 and K02650 in the KEGG database), the leading gene for the assembly of a conserved type IV pilus (Tfp), was noticed in TCF-2, TCF-5, and TCF-12 (Fig. 5). On the assumption that genes encoding beneficial physiochemical features are actively transcribed, deciphering the function of the vigorously transcribed Tfp assembly in Anaerolineales should bring useful insight into the gradual dominance of this population within the TCF community. Tfp are the most widespread organelles of bacterial attachment. As has already been noted, pili are often involved in facilitating adhesion and colonization in a wide variety of scenarios, including host cell attachment in numerous human pathogens such as Actinobacillus actinomycetemcomitans; cellulose binding in Ruminococcus albus; and biofilm formation on stainless steel in Pseudomonas aeruginosa. Moreover, the Tfp in Geobacter sulfurreducens are electrically conductive nanowires involved in direct interspecies electron transfer (DIET) between syntrophic partners.
As shown in Fig. 5, the conserved Tfp cluster observed in Anaerolineales contains a series of genes encoding pilus assembly proteins (pilA, CpaB, CpaE, CpaF) and two consecutive Tad proteins (TadB, TadC). The adhesive nature of the Tfp of Anaerolineales could be inferred from the occupation and expression of the Tad (tight adherence) locus, which is the essential machinery for the assembly of adhesive pili. The precursor of the major structural component of Tfp is encoded by the pilA gene. This precursor seems to be regulatorily crucial for the effectiveness of Tfp in Anaerolineales because the Tfp clusters without preceding pilA genes in TCF-8 and TCF-13 showed no transcriptional activity (Fig. 5). Given the metabolic capacities of TCF-8 and TCF-13, which are comparable to those of the dominant Anaerolineales, such ineffectiveness of the Tfp cluster may play a role in their rareness in the community. Additionally, in contrast to the low expression level in carbohydrate metabolism, all the homologous copies of the pilA gene in TCF-2, TCF-5, and TCF-12 (which respectively encoded 3, 2, and 2 genes annotated as homologs of this protein class) were expressed, suggesting ecological benefits conferred by pilA expression in Anaerolineales.
To further investigate the conditions of Anaerolineales adherence, an experiment was conducted to reveal the community change on the surface of filter paper (made of 98 % microcrystalline cellulose) during hydrolysis. As community profiling based on HTS of 16S rRNA gene amplicons revealed, in contrast to the evident accumulation of Clostridium and Fervidobacterium, populations of Anaerolineales (Anaerolinea and Bellilinea), though among the most prevalent populations attached, stayed unchanged in size during the first 12 h of hydrolysis, suggesting their relative incompetence to grow on the cellulose surface (Fig. 6). This observation was consistent with the low expression of cellulase genes in the retrieved genome bins of this lineage. Additionally, increasing bacterial diversity (Additional file 1: Figure S5) in the attached community was induced by the more degradable alpha- and beta-monosaccharides generated at the steady phase of hydrolysis (after 24 h, Additional file 1: Figure S5). A remarkable increase of Sphingomonas and Pseudomonas was observed at this stage, but the abundance of Anaerolineae remained unaffected. Although commonly regarded as aerobic, strains of Sphingomonas and Pseudomonas have been reported to tolerate anaerobic environments [32–35]. The overgrowth of these two genera in the attached community after 24 h of incubation may originate from a combination of their ability to utilize beta-linked monosaccharides released by the hydrolysis process and their extraordinary ability to grow in biofilms [37, 38]. These results indicate that the accumulation of Anaerolineales took place elsewhere than directly on the surface of cellulose. We therefore speculate that, instead of initiating attachment on the substrate surface, the adhesive feature of Anaerolineae enabled by active pilA expression might serve as the adhesive matrix for the aggregation of the fermentative population in the liquid phase.
Since most anaerobic cellulolytic microorganisms grow optimally on cellulose when attached to the substrate, and in at least a few species this adhesion appears to be obligate, this surface-free lifestyle of Anaerolineales reflects its incompetence in cellulose hydrolysis as disclosed by the metatranscriptome. The continuous stirring provided in the enrichment SBR may play a selective role for Anaerolineales, in that microorganisms capable of attaching to each other would benefit from a more efficient exchange of fermentation intermediates and thus proliferate more effectively in competition with free-living anaerobic fermentative counterparts. The advantageous bonding capacity in Anaerolineales observed in this study may provide novel insight into its ubiquity and accumulation in anaerobic digestive systems.
Another interesting function of Tfp is its conductive role in syntrophic DIET. Since the Anaerolineae lineage of Chloroflexi has been considered semi-syntrophic in anaerobic systems and its interspecies electron transfer (IET) mechanism in mutualistic cooperation with methanogens has yet to be studied, investigation of the syntrophic machinery and DIET involvement of Anaerolineae is warranted.
Despite the lack of detected transcriptional activity, the 926 genes shared between TCF-8 and TCF-13 (Additional file 1: Figure S2b) revealed a genetic potential of these populations to metabolize ethanol to acetate (Additional file 1: Figure S4), implying their putative role as anaerobic syntrophs. However, these pathways were absent from the Anaerolineales containing TCF-2, 5, 12, and UNI-1. Additionally, when the transcriptional activities of genes involved in the fundamental steps of syntrophic metabolism as proposed by Sieber et al. were compared, relatively weak activities of hydrogenase and formate dehydrogenase suggested only unsteady involvement of H2 or formate as the electron carrier for IET in Anaerolineae populations (Table 2). Genomic co-occurrence of pilA and outer membrane cytochromes has been considered a prerequisite for DIET to take place in a microbe. Despite the active pilA, none of the five curated genome bins or UNI-1 possesses c-type cytochromes (Table 2). We therefore could neither confirm nor exclude the DIET potential, given the paradox between the highly active pilA gene and the absence of membrane cytochromes in TCF-2, 5, and 12. Consecutive iron supplementation batch tests were thus designed to verify the DIET potential of the TCF community, based on the hypothesis that electricity-based syntrophic methanogenesis could be expedited by dosing conductive iron-oxide minerals. Fe2O3 powder was dosed at 20 mM of iron atoms to stimulate electron exchange within the TCF community in three consecutive batches. However, batches with iron-oxide supplementation showed no evident enhancement of overall methanogenesis in either the short-term (1st batch) or long-term (3rd batch) run (Additional file 1: Figure S6a). These batch results indicated that DIET among microbial populations within the TCF community was unlikely and thus argued against the initial speculation on DIET involvement of the highly active Tfp in TCF-2, 5, and 12.
Coverage-based genome recovery coupled with metatranscriptomic interpretation was used to disclose the advantageous features of Anaerolineae populations in anaerobic digestive systems, based on the five near-complete genomes retrieved from the TCF community. Given the slight transcription of cellulolytic genes, the prevalence of this population is more likely related to the evident cellular adhesiveness enabled by active transcription of Tfp. Further experiments showed that this Tfp structure functions as an adhesive matrix for cell–cell aggregation rather than for cell-surface attachment in biofilm initiation or for electron transfer in syntrophic methanogenesis.
Enrichment reactor setup
Anaerobic digestion sludge (ADS) collected from Shek Wu Hui Wastewater Treatment Plant (Hong Kong SAR, China) was used for the enrichment of a thermophilic cellulolytic consortium in a sequential batch reactor (SBR) as described previously. Enriched thermophilic cellulose-fermenting (TCF) sludge was sampled at two different time points (SE: short-term enrichment at 120 days and LE: long-term enrichment at 545 days) during the enrichment.
Metagenomic libraries and Illumina sequencing
Two metagenomic libraries were constructed with genomic DNA extracted from the SE and LE sludge samples, respectively. Genomic DNA was extracted from 500 mg (dry weight) of sludge sample with the FastDNA® SPIN Kit for Soil (MP Biomedicals, LLC, Illkirch, France). Sequencing of the metagenomic DNA was carried out on the Illumina HiSeq 2000 platform at BGI (Shenzhen, China) by applying the 101 bp paired-end strategy with combined insert lengths of 180 and 800 bp for the SE metagenome and a sole 180 bp insert for the LE metagenome (Additional file 1: Table S2). The resulting PE reads were trimmed of sequencing adaptors before reads with an average Phred quality score lower than 20 or containing ambiguous nucleotides were filtered out using PRINSEQ. The shotgun metagenomic reads have been deposited in the MG-RAST server for data sharing (see Additional file 1: Table S1 for the accession numbers). The SE and LE metagenomes and the LE metatranscriptome have been used in our previous studies with a focus other than Anaerolineae populations [25, 44].
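The read filter described above (mean Phred score of at least 20, no ambiguous nucleotides) can be sketched in plain Python. This is an illustration of the rule, not a reproduction of PRINSEQ; the function names are hypothetical, and the Phred+33 quality encoding is an assumption, since the encoding of the original FASTQ files is not stated.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def passes_filter(seq, quality, min_mean_q=20, offset=33):
    """Keep a read only if it has no ambiguous base and mean quality >= cutoff."""
    if not seq or len(seq) != len(quality):
        return False  # malformed record
    if "N" in seq.upper():
        return False  # ambiguous nucleotide
    return mean_phred(quality, offset) >= min_mean_q


if __name__ == "__main__":
    # Toy reads (sequence, quality string); illustrative only.
    reads = [("ACGT", "IIII"), ("ACGN", "IIII"), ("ACGT", "####")]
    kept = [r for r in reads if passes_filter(*r)]
    print(f"kept {len(kept)} of {len(reads)} reads")
```

For paired-end data, a pair would typically be retained only if both mates pass; that pairing logic is omitted here.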
De novo assembly and two-dimensional coverage binning
Three popular de novo assemblers, namely MetaVelvet (1.2.01), IDBA_UD (1.1.1), and CLCbio Genomic Workbench 6.0.2 (CLCbio, Denmark), were compared in terms of read utilization efficiency and scaffold length (Additional file 1: Table S9). IDBA_UD, which gave the most comprehensive assembly, was selected to co-assemble the SE and LE metagenomes using a series of k-mer sizes (20, 40, 60, 80, and 100). The two metagenomes were assembled together to facilitate the generation of long scaffolds. Only scaffolds longer than 1 kb were kept for subsequent genomic binning analysis.
Based on the assumption that scaffolds belonging to the same genome (strain) should share similar coverage across different metagenomes, scaffolds of the targeted Anaerolineae genome bins were recruited from the two-dimensional coverage plot using R scripts. Divergent coverage of Chloroflexi populations was provided by metagenomic libraries of thermophilic cellulolytic sludge sampled from the same reactor at two different times (SE at 120 days and LE at 545 days). The coverage of each scaffold was obtained by independently mapping PE reads of the SE and LE metagenomes against the assembled scaffolds using Bowtie 1.0.1, allowing two mismatches over the entire read length (bowtie options: -v 2 -m 200). Coverage of a scaffold was calculated as the total base pairs of mapped reads divided by its length. The scaffolds were then binned based on the clustering of coverage and phylum assignment. To minimize potential contamination, another genomic signature, tetra-nucleotide frequency (TNF), was used to refine the bins at a Euclidean distance cutoff of 0.1. Finally, PE-tracking tools from the mmgenome package were used to reinforce the scaffolding by retrieving genes initially excluded, for example, genes showing deviating coverage caused by multiple copies.
At the same time, community composition was assessed by identifying 16S rRNA sequences in the metagenomes. The unassembled Illumina reads were searched against the Silva SSU 115 database with BLASTN using an e-value cutoff of 1E-20. The tabular BLAST results were parsed at phylum level with MEGAN4 using the lowest common ancestor algorithm.
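The coverage definition and the idea of recruiting scaffolds from the two-dimensional coverage plot can be sketched as follows. The study used R scripts for this step; the rectangular selection window below is a simplifying assumption standing in for the selection made on the plot, and all names and numbers are hypothetical.

```python
def scaffold_coverage(mapped_bases, scaffold_length):
    """Coverage = total base pairs of mapped reads / scaffold length."""
    if scaffold_length <= 0:
        raise ValueError("scaffold length must be positive")
    return mapped_bases / scaffold_length


def select_bin(coverages, se_window, le_window):
    """Return scaffold names whose (SE, LE) coverage falls inside both windows.

    coverages: dict mapping scaffold name -> (coverage_SE, coverage_LE)
    se_window, le_window: (low, high) inclusive coverage bounds
    """
    return sorted(
        name
        for name, (se, le) in coverages.items()
        if se_window[0] <= se <= se_window[1]
        and le_window[0] <= le <= le_window[1]
    )


if __name__ == "__main__":
    read_len = 101  # bp, the paired-end read length reported in the text
    # Toy example: mapped read counts per scaffold in SE and LE (not real data).
    mapped = {"s1": (4000, 3000, 20200), "s2": (3900, 3100, 19000),
              "s3": (200, 5000, 15000)}
    cov = {
        name: (scaffold_coverage(n_se * read_len, length),
               scaffold_coverage(n_le * read_len, length))
        for name, (n_se, n_le, length) in mapped.items()
    }
    print(select_bin(cov, se_window=(15, 25), le_window=(10, 20)))
```

Scaffolds from the same genome are expected to cluster in this plane because their abundance changes together between the two sampling times.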
Genome completeness, contamination, and abundance in metagenomes
Hidden Markov models (HMMs) of 107 essential single-copy genes (ESCGs) (Additional file 1: Table S4), defined as the single-copy genes conserved in 95 % of all bacteria, were used as a reference set to indicate the completeness and potential contamination of the genome bins. The completeness of a draft genome was measured as the percentage of identified ESCGs out of the total 107 ESCGs, while the contamination was determined by dividing the number of duplicated ESCGs by the number of ESCGs identified in the draft genome. To double check our estimation of the completeness and purity of a draft genome, a set of 35 clusters of orthologous groups (COGs) (Additional file 1: Table S6) was used as alternative markers. The relative abundance of each curated genome bin in a metagenome was calculated as the number of reads mapped to the bin as a percentage of the total number of reads in the metagenome. ANI was calculated with a similarity cutoff of 60 %, while DDH was estimated in silico by GGDC.
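The completeness and contamination definitions above translate directly into a few lines of code. The sketch below is illustrative; it assumes that "duplicated ESCGs" means the number of marker genes found in more than one copy, and the function and variable names are hypothetical.

```python
def bin_quality(copy_counts, n_markers=107):
    """Return (completeness %, contamination %) for a genome bin.

    copy_counts: dict mapping ESCG name -> number of copies found in the bin.
    Completeness  = identified ESCGs / total ESCGs.
    Contamination = duplicated ESCGs / identified ESCGs
                    (a marker with more than one copy counts once; assumption).
    """
    identified = sum(1 for c in copy_counts.values() if c >= 1)
    duplicated = sum(1 for c in copy_counts.values() if c > 1)
    completeness = 100.0 * identified / n_markers
    contamination = 100.0 * duplicated / identified if identified else 0.0
    return completeness, contamination


def relative_abundance(reads_mapped_to_bin, total_reads):
    """Percentage of metagenome reads that map to a genome bin."""
    return 100.0 * reads_mapped_to_bin / total_reads


if __name__ == "__main__":
    # Toy bin: 105 of 107 markers found, two of them in duplicate.
    counts = {f"escg_{i}": 1 for i in range(105)}
    counts["escg_0"] = 2
    counts["escg_1"] = 2
    comp, cont = bin_quality(counts)
    print(f"completeness {comp:.1f} %, contamination {cont:.1f} %")
```

The same function can be reused with `n_markers=35` for the alternative COG marker set.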
Reconstruction of 16S rRNA genes
Complete 16S rRNA genes of the genome bins TCF-2, 5, and 12 were determined by the IMG 4.0 genome annotation pipeline and confirmed by EMIRGE. EMIRGE was used as a complementary approach to reconstruct 16S rRNA genes from the shotgun libraries with 80 iterations. UCHIME was used to filter possible chimeras formed in EMIRGE before the reconstructed 16S rRNA genes were compared to those of the curated genome bins. The incomplete prediction of the 16S rRNA gene in TCF-13 (258 bp) was manually extended based on its nearly identical BLAST match (similarity higher than 99 % over 258 bp) to a 16S rRNA sequence in the Silva SSU database (version 115).
Phylogenetic analysis of draft genomes
In order to determine the phylogenetic position of the draft genomes obtained here, a neighbor-joining tree of Anaerolineae was built using MEGA5 with the maximum-likelihood method and 1000 bootstrap replicates. The phylogenetic tree was constructed using (1) 16S rRNA sequences of the draft genomes, (2) the 16S rRNA gene of A. thermophila UNI-1, and (3) 16S rRNA genes of ten isolated strains and high-quality 16S clones collected from the Silva SSU database.
To determine the phylogenetic affiliation of TCF-8, whose 16S rRNA gene is too short for reliable alignment, a genome tree was constructed from a concatenated alignment of 35 protein-coding ESCGs shared in single copy among the five curated genomes and twenty-two finished genomes of Chloroflexi in IMG 4.0. A maximum-likelihood tree was created with PhyML 3.1 using default settings for amino acids and 100 bootstraps, based on MUSCLE alignments.
Functional and transcription analysis
Functional annotation of the Anaerolineae genomes
The five near-complete genomic bins retrieved from the TCF community were submitted to IMG annotation pipeline for ORF calling as well as functional annotation (The IMG genome ID of each bin was listed in Table 1). IMG annotation on Pfam, KEGG, and COG databases were compared against that of twenty-two finished genomes of Chloroflexi to reveal metabolism styles. Given the unavailability of syntrophic pathways in a single database, identification of key genes involved in the syntrophic process in the present study was based on the integration of COG, PfamA, TIGRFAMs, as well as KEGG KO annotation (The identifier of syntrophic metabolism related genes used in this study are listed in Additional file 1: Table S10).
Metatranscriptomic sequencing and expression quantification
Total RNA of the LE sludge sample was extracted and then sequenced following the protocol described previously. Transcriptional activities of genes in each draft genome were investigated in the same manner as previously established. Briefly, the concept of MRPKM, defined as the ratio of RPKM-RNA to RPKM-DNA, was used to evaluate the transcriptional activity of genes in the metatranscriptome. RPKM-DNA and RPKM-RNA were respectively calculated from the metagenome and metatranscriptome of the LE sample using RSEM based on Bowtie 1.0.1 alignment, allowing two mismatches over the entire read length.
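The MRPKM ratio can be illustrated with the standard RPKM formula (reads per kilobase of gene per million mapped reads). This is a sketch of the arithmetic only: the study obtained its values through RSEM, and the function names and the handling of genes with zero DNA coverage are assumptions made here.

```python
def rpkm(reads_on_gene, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return reads_on_gene * 1e9 / (gene_length_bp * total_mapped_reads)


def mrpkm(rna_reads, dna_reads, gene_length_bp, total_rna_reads, total_dna_reads):
    """MRPKM = RPKM-RNA / RPKM-DNA for one gene.

    Returns None when the gene has no DNA coverage, because the ratio is
    undefined in that case (a choice made for this sketch).
    """
    dna = rpkm(dna_reads, gene_length_bp, total_dna_reads)
    if dna == 0:
        return None
    rna = rpkm(rna_reads, gene_length_bp, total_rna_reads)
    return rna / dna


if __name__ == "__main__":
    # Toy numbers, not data from the study.
    value = mrpkm(rna_reads=50, dna_reads=100, gene_length_bp=1000,
                  total_rna_reads=1_000_000, total_dna_reads=1_000_000)
    print(f"MRPKM = {value:.2f}")
```

Normalizing transcript abundance by gene abundance separates high expression per gene copy from mere high abundance of the host population.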
Iron-oxide supplementation batch tests
Sludge (50 ml) collected from the enrichment batch at the peak of an SBR cycle was added as seed sludge to each batch test with a working volume of 100 ml. Medium solution was prepared following the previous protocol. Microcrystalline cellulose (50 µm in diameter, Sigma, USA) was dosed at a concentration of 2.5 g/l, while Fe2O3 powder (≥99.995 % trace metals basis, Aldrich, USA) was supplemented to the stock solution to give a final concentration of 20 mM as Fe atoms. Nitrogen was used to purge the air from the serum bottle to ensure an anaerobic environment. Batch tests were carried out in a 55 °C water bath with continuous stirring at 120 rpm. For each batch test, the initial pH was controlled at around 7.5. Three consecutive batches were conducted to investigate the effect of iron supplementation in the short- and long-term run. Each batch was stopped when biogas generation ceased in all the reactors. The results presented are average values of duplicate tests.
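As a worked example of the dosing arithmetic, the sketch below converts the stated concentrations into masses per bottle. It assumes that the 20 mM Fe and 2.5 g/l cellulose refer to the 100 ml working volume, and it uses standard atomic masses; the function names are hypothetical.

```python
# Standard atomic masses in g/mol.
FE_ATOMIC_MASS = 55.845
O_ATOMIC_MASS = 15.999


def fe2o3_mass_g(fe_mM, volume_l):
    """Mass of Fe2O3 (g) giving `fe_mM` millimolar Fe atoms in `volume_l` litres.

    Each Fe2O3 formula unit carries two Fe atoms, so the moles of Fe2O3
    needed are half the moles of Fe.
    """
    molar_mass = 2 * FE_ATOMIC_MASS + 3 * O_ATOMIC_MASS  # about 159.69 g/mol
    mol_fe = fe_mM / 1000.0 * volume_l
    return (mol_fe / 2) * molar_mass


def cellulose_mass_g(conc_g_per_l, volume_l):
    """Mass of cellulose (g) for a given concentration and volume."""
    return conc_g_per_l * volume_l


if __name__ == "__main__":
    working_volume_l = 0.1  # 100 ml working volume, as stated in the text
    print(f"Fe2O3: {fe2o3_mass_g(20, working_volume_l):.4f} g")
    print(f"Cellulose: {cellulose_mass_g(2.5, working_volume_l):.3f} g")
```

Under these assumptions each bottle would receive roughly 0.16 g of Fe2O3 and 0.25 g of cellulose.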
Gas and volatile fatty acids analysis
Gas volume was monitored with a glass syringe. Gas composition, including hydrogen, methane, and carbon dioxide, was determined using a gas chromatograph (GC-TCD) following the configuration described previously. The composition of the liquid phase, including volatile fatty acids and alcohols, was measured using a second gas chromatograph (GC-FID).
Attachment experiment and community profiling by high-throughput sequencing
Slices of filter paper (Whatman, 98 % microcyrstalline cellulose) were dipped into the thermophilic SBR (run for 716 days) for a certain time (1 min and 2, 6, 12, 24, 30, and 32 h) to accumulate microbes that attached to cellulose surface. Biological replicates were sampled at 12 h (Additional file 1: Figure S7). Filter paper dipped for 1 min was used to represent the community binded to filter paper by physical adsorption. The initial community before experiment was also sampled. Filter paper after dipping was washed with DI waster to kept microbial populations steadily attached to the surface . The sampled filter paper was cut in half, with each half respectively used for DNA extraction (with the same protocol for metagenomic DNA extraction) and weight measuring. Dry weight lost was used to evaluate the hydrolysis efficiency (Additional file 1: Figure S8). Universal primers for V4 FLX forward primer (“AYTGGGYDTAAAGNG”) and reverse primers (“TACNVGGGTATCTAATCC”, “TACCRGGGTHTCTAATCC”, TACCAGAGTATCTAATTC”, “CTACDSRGGTMTCTAATC”) targeting the V4 region of 16S rRNA gene were used to amplify genetic amplicons for community profiling using Roche FLX 454 high-throughput sequencing (HTS) at BGI (Shenzhen, China). Slice of unused filter paper was also subject to the same DNA extraction and 16S rRNA gene amplification to testify primer specificity towards microbial populations (Additional file 1: Figure S9).
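The primers listed above contain IUPAC degenerate bases (Y, D, N, V, R, H, S, M). As an illustration of what such a primer represents, the sketch below converts a degenerate primer into a regular expression and counts how many concrete sequences it encodes. The helper names are hypothetical, and the example template sequence is made up for demonstration.

```python
import re

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_to_regex(primer):
    """Translate a degenerate primer into a regex matching all its variants."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]  # KeyError for symbols outside the IUPAC table
        parts.append(options if len(options) == 1 else f"[{options}]")
    return "".join(parts)


def degeneracy(primer):
    """Number of distinct non-degenerate sequences encoded by the primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


if __name__ == "__main__":
    forward = "AYTGGGYDTAAAGNG"  # forward primer quoted in the text
    pattern = re.compile(primer_to_regex(forward))
    template = "GGACTGGGTATAAAGAGCC"  # made-up template for demonstration
    print(degeneracy(forward), bool(pattern.search(template)))
```

Matching against the reverse-complemented read would be needed for the reverse primers; that step is left out of this sketch.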
Quality filtering and community analysis of the 454 reads were conducted following a previously reported protocol. Briefly, the raw reads were demultiplexed, quality trimmed, aligned, and finally checked with ChimeraSlayer to remove chimeric sequences using the standard procedure in Mothur. The quality-filtered reads (Additional file 1: Table S11) were clustered into operational taxonomic units (OTUs) equivalent to genus level (0.97 similarity) by the open OTU algorithm adopted in the QIIME platform. The taxonomy of each OTU was assigned by the RDP Classifier using a confidence threshold of 50 %, which provides a trade-off between adequate classification accuracy and maximizing the percentage of classifiable sequences. Discussion of the community composition focuses only on the prevalent populations accounting for >1 % of the bacterial community.
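The two cut-offs described above (a 50 % classification confidence and a 1 % relative abundance) can be illustrated with a small sketch. It does not call Mothur, QIIME, or the RDP Classifier; it only assumes that per-rank confidence values and per-taxon read counts are already available. The function names and the example data are hypothetical.

```python
def trim_lineage(lineage, threshold=0.5):
    """Keep ranks from the root downwards until the confidence drops below threshold.

    `lineage` is a list of (rank, name, confidence) tuples ordered from
    the highest rank to the lowest.
    """
    kept = []
    for rank, name, confidence in lineage:
        if confidence < threshold:
            break
        kept.append((rank, name))
    return kept


def prevalent_taxa(counts, min_fraction=0.01):
    """Return the relative abundance of taxa exceeding `min_fraction` of all reads."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        taxon: n / total
        for taxon, n in counts.items()
        if n / total > min_fraction
    }


# Hypothetical example data, not taken from the study
example_lineage = [
    ("phylum", "Chloroflexi", 0.98),
    ("class", "Anaerolineae", 0.90),
    ("genus", "Anaerolinea", 0.42),
]
example_counts = {"Anaerolineae": 450, "Clostridia": 545, "Other": 5}

print(trim_lineage(example_lineage))
print(prevalent_taxa(example_counts))
```

In this toy example the genus-level assignment is dropped because its confidence is below 0.5, and the "Other" group is excluded because it accounts for only 0.5 % of the reads.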
Abbreviations
thermophilic cellulose fermenting
upflow anaerobic sludge blanket reactor
Anaerolinea thermophila UNI-1
type IV pili
next generation sequencing
essential single-copy genes
average nucleotide identity
DNA–DNA hybridization value
cellulase M cluster
anaerobic digestion sludge
sequential batch reactor
interspecies electron transfer
direct interspecies electron transfer
operational taxonomic units
References
Narihiro T, Terada T, Ohashi A, Kamagata Y, Nakamura K, Sekiguchi Y. Quantitative detection of previously characterized syntrophic bacteria in anaerobic wastewater treatment systems by sequence-specific rRNA cleavage method. Water Res. 2012;46:2167–75.
Rivière D, Desvignes V, Pelletier E, Chaussonnerie S, Guermazi S, Weissenbach J, et al. Towards the definition of a core of microorganisms involved in anaerobic digestion of sludge. ISME J. 2009;3:700–14.
Sekiguchi Y. Anaerolinea thermophila gen. nov., sp. nov. and Caldilinea aerophila gen. nov., sp. nov., novel filamentous thermophiles that represent a previously uncultured lineage of the domain bacteria at the subphylum level. Int J Syst Evol Microbiol. 2003;53:1843–51.
Yamada T. Anaerolinea thermolimosa sp. nov., Levilinea saccharolytica gen. nov., sp. nov. and Leptolinea tardivitalis gen. nov., sp. nov., novel filamentous anaerobes, and description of the new classes Anaerolineae classis nov. and Caldilineae classis nov. in the bacterial phylum Chloroflexi. Int J Syst Evol Microbiol. 2006;56:1331–40.
Podosokorskaya OA, Bonch-Osmolovskaya EA, Novikov AA, Kolganova TV, Kublanov IV. Ornatilinea apprima gen. nov., sp. nov., a cellulolytic representative of the class Anaerolineae. Int J Syst Evol Microbiol. 2013;63:86–92.
Mielczarek AT, Kragelund C, Eriksen PS, Nielsen PH. Population dynamics of filamentous bacteria in Danish wastewater treatment plants with nutrient removal. Water Res. 2012;46:3781–95.
Yamada T, Sekiguchi Y, Imachi H, Kamagata Y, Ohashi A, Harada H. Diversity, localization, and physiological properties of filamentous microbes belonging to Chloroflexi subphylum I in mesophilic and thermophilic methanogenic sludge granules. Appl Environ Microbiol. 2005;71:7493–503.
Yamada T, Imachi H, Ohashi A, Harada H, Hanada S, Kamagata Y, et al. Bellilinea caldifistulae gen. nov., sp. nov. and Longilinea arvoryzae gen. nov., sp. nov., strictly anaerobic, filamentous bacteria of the phylum Chloroflexi isolated from methanogenic propionate-degrading consortia. Int J Syst Evol Microbiol. 2007;57:2299–306.
Grégoire P, Fardeau M-L, Joseph M, Guasco S, Hamaide F, Biasutti S, et al. Isolation and characterization of Thermanaerothrix daxensis gen. nov., sp. nov., a thermophilic anaerobic bacterium pertaining to the phylum “Chloroflexi”, isolated from a deep hot aquifer in the Aquitaine Basin. Syst Appl Microbiol. 2011;34:494–7.
Imachi H, Sakai S, Lipp JS, Miyazaki M, Saito Y, Yamanaka Y, et al. Pelolinea submarina gen. nov., sp. nov., an anaerobic, filamentous bacterium of the phylum Chloroflexi isolated from subseafloor sediment. Int J Syst Evol Microbiol. 2014;64:812–8.
Nunoura T, Hirai M, Miyazaki M, Kazama H, Makita H, Hirayama H, et al. Isolation and characterization of a thermophilic, obligately anaerobic and heterotrophic marine Chloroflexi bacterium from a Chloroflexi-dominated microbial community associated with a Japanese shallow hydrothermal system, and proposal for Thermomarinilinea lacunofontalis gen. nov., sp. nov. Microbes Environ. 2013;28:228–35.
Sharon I, Morowitz MJ, Thomas BC, Costello EK, Relman DA, Banfield JF. Time series community genomics analysis reveals rapid shifts in bacterial species, strains, and phage during infant gut colonization. Genome Res. 2013;23:111–20.
Albertsen M, Hugenholtz P, Skarshewski A, Nielsen KL, Tyson GW, Nielsen PH. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nat Biotechnol. 2013;31:533–8.
Dick GJ, Andersson AF, Baker BJ, Simmons SL, Thomas BC, Yelton AP, et al. Community-wide analysis of microbial genome sequence signatures. Genome Biol. 2009;10:R85.
Tyson GW, Chapman J, Hugenholtz P, Allen EE, Ram RJ, Richardson PM, et al. Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature. 2004;428:37–43.
Kantor RS, Wrighton KC, Handley KM, Sharon I, Hug LA, Castelle CJ, et al. Small genomes and sparse metabolisms of sediment-associated bacteria from four candidate phyla. mBio. 2013;4:e00708.
Wrighton KC, Thomas BC, Sharon I, Miller CS, Castelle CJ, VerBerkmoes NC, et al. Fermentation, hydrogen, and sulfur metabolism in multiple uncultivated bacterial phyla. Science. 2012;337:1661–5.
Hug LA, Castelle CJ, Wrighton KC, Thomas BC, Sharon I, Frischkorn KR, et al. Community genomic analyses constrain the distribution of metabolic traits across the Chloroflexi phylum and indicate roles in sediment carbon cycling. Microbiome. 2013;1:22.
Wasmund K, Schreiber L, Lloyd KG, Petersen DG, Schramm A, Stepanauskas R, et al. Genome sequencing of a single cell of the widely distributed marine subsurface Dehalococcoidia, phylum Chloroflexi. ISME J. 2014;8:383–97.
Hess M, Sczyrba A, Egan R, Kim T-W, Chokhawala H, Schroth G, et al. Metagenomic discovery of biomass-degrading genes and genomes from cow rumen. Science. 2011;331:463–7.
Raes J, Korbel JO, Lercher MJ, von Mering C, Bork P. Prediction of effective genome size in metagenomic samples. Genome Biol. 2007;8:R10.
DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, et al. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol. 2006;72:5069–72.
Konstantinidis KT, Tiedje JM. Genomic insights that advance the species definition for prokaryotes. Proc Natl Acad Sci USA. 2005;102:2567–72.
Goris J, Konstantinidis KT, Klappenbach JA, Coenye T, Vandamme P, Tiedje JM. DNA–DNA hybridization values and their relationship to whole-genome sequence similarities. Int J Syst Evol Microbiol. 2007;57:81–91.
Xia Y, Wang Y, Fang HHP, Jin T, Zhong H, Zhang T. Thermophilic microbial cellulose decomposition and methanogenesis pathways recharacterized by metatranscriptomic and metagenomic analysis. Sci Rep. 2014;4:6708.
Pelicic V. Type IV pili: e pluribus unum? Mol Microbiol. 2008;68:827–37.
Kachlany SC, Planet PJ, DeSalle R, Fine DH, Figurski DH, Kaplan JB. flp-1, the first representative of a new pilin gene subfamily, is required for non-specific adherence of Actinobacillus actinomycetemcomitans. Mol Microbiol. 2001;40:542–54.
Rakotoarivonina H, Jubelin G, Hebraud M, Gaillard-Martinie B, Forano E, Mosoni P. Adhesion to cellulose of the gram-positive bacterium Ruminococcus albus involves type IV pili. Microbiology. 2002;148:1871–80.
Alm RA, Mattick JS. Genes involved in the biogenesis and function of type-4 fimbriae in Pseudomonas aeruginosa. Gene. 1997;192:89–98.
Reguera G, McCarthy KD, Mehta T, Nicoll JS, Tuominen MT, Lovley DR. Extracellular electron transfer via microbial nanowires. Nature. 2005;435:1098–101.
Tomich M, Planet PJ, Figurski DH. The tad locus: postcards from the widespread colonization island. Nat Rev Microbiol. 2007;5:363–75.
Kudlich M, Keck A, Klein J, Stolz A. Localization of the enzyme system involved in anaerobic reduction of azo dyes by Sphingomonas sp. Strain BN6 and effect of artificial redox mediators on the rate of azo dye reduction. Appl Environ Microbiol. 1997;63:3691–4.
Hunt JC, Phibbs PV. Regulation of alternate peripheral pathways of glucose catabolism during aerobic and anaerobic growth of Pseudomonas aeruginosa. J Bacteriol. 1983;154:793–802.
Cuskey SM, Wolff JA, Phibbs PV, Olsen RH. Cloning of genes specifying carbohydrate catabolism in Pseudomonas aeruginosa and Pseudomonas putida. J Bacteriol. 1985;162:865–71.
Soto-Giron MJ, Rodriguez-R LM, Luo C, Elk M, Ryu H, Hoelle J, et al. Characterization of biofilms developing on hospital shower hoses and implications for nosocomial infections. Appl Environ Microbiol. 2016. doi:10.1128/AEM.03529-15.
Lynd LR, Weimer PJ, Van Zyl WH, Pretorius IS. Microbial cellulose utilization: fundamentals and biotechnology. Microbiol Mol Biol Rev. 2002;66:506–77.
Spiers AJ, Arnold DL, Moon CD. A survey of AL biofilm formation and cellulose expression amongst soil and plant-associated Pseudomonas isolates. Microb Ecol Aer Plant Surf. 2006;29:121–32.
Ude S, Arnold DL, Moon CD, Timms-Wilson T, Spiers AJ. Biofilm formation and cellulose expression among diverse environmental Pseudomonas isolates. Environ Microbiol. 2006;8:1997–2011.
Dunne WM. Bacterial adhesion: seen any good biofilms lately? Clin Microbiol Rev. 2002;15:155–66.
Sieber JR, McInerney MJ, Gunsalus RP. Genomic insights into syntrophy: the paradigm for anaerobic metabolic cooperation. Annu Rev Microbiol. 2012;66:429–52.
Shi L, Richardson DJ, Wang Z, Kerisit SN, Rosso KM, Zachara JM, et al. The roles of outer membrane cytochromes of Shewanella and Geobacter in extracellular electron transfer. Environ Microbiol Rep. 2009;1:220–7.
Kato S, Hashimoto K, Watanabe K. Methanogenesis facilitated by electric syntrophy via (semi)conductive iron-oxide minerals. Environ Microbiol. 2012;14:1646–54.
Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics. 2011;27:863–4.
Xia Y, Ju F, Fang HHP, Zhang T. Mining of novel thermo-stable cellulolytic genes from a Thermophilic Cellulose-Degrading Consortium by metagenomics. PLoS One. 2013;8:e53779.
Namiki T, Hachiya T, Tanaka H, Sakakibara Y. MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads. Nucleic Acids Res. 2012;40:e155.
Peng Y, Leung HCM, Yiu SM, Chin FYL. Meta-IDBA: a de Novo assembler for metagenomic data. Bioinformatics. 2011;27:i94–101.
Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.
Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2012;41:D590–6.
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinform. 2009;10:421.
Huson DH, Mitra S, Ruscheweyh H-J, Weber N, Schuster SC. Integrative analysis of environmental sequences using MEGAN4. Genome Res. 2011;21:1552–60.
Dupont CL, Rusch DB, Yooseph S, Lombardo M-J, Alexander Richter R, Valas R, et al. Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage. ISME J. 2012;6:1186–99.
Auch AF, Klenk H-P, Göker M. Standard operating procedure for calculating genome-to-genome distances based on high-scoring segment pairs. Stand Genomic Sci. 2010;2:142–8.
Markowitz VM, Chen I-MA, Palaniappan K, Chu K, Szeto E, Pillay M, et al. IMG 4 version of the integrated microbial genomes comparative analysis system. Nucleic Acids Res. 2014;42:D560–7.
Miller CS, Baker BJ, Thomas BC, Singer SW, Banfield JF. EMIRGE: reconstruction of full-length ribosomal genes from microbial community short read sequencing data. Genome Biol. 2011;12:R44.
Edgar RC, Haas BJ, Clemente JC, Quince C, Knight R. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics. 2011;27:2194–200.
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–9.
Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010;59:307–21.
Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–7.
Li B, Dewey CN. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinform. 2011;12:323.
Fang HH, Li C, Zhang T. Acidophilic biohydrogen production from rice slurry. Int J Hydrog Energy. 2006;31:683–92.
Zhang T, Shao M-F, Ye L. 454 Pyrosequencing reveals bacterial diversity of activated sludge from 14 sewage treatment plants. ISME J. 2011;6:1137–47.
Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, et al. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol. 2009;75:7537–41.
Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7:335–6.
Wang Q, Garrity GM, Tiedje JM, Cole JR. Naïve Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007;73:5261–7.
Ibarbalz FM, Figuerola ELM, Erijman L. Industrial activated sludge exhibit unique bacterial community composition at high taxonomic ranks. Water Res. 2013;47:3854–64.
Handley KM, Bartels D, O'Loughlin EJ, Williams KH, Trimble WL, Skinner K, et al. The complete genome sequence for putative H2- and S-oxidizer Candidatus Sulfuricurvum sp., assembled de novo from an aquifer-derived metagenome. Environ Microbiol. 2014;16:3443–62.
Pope PB, Denman SE, Jones M, Tringe SG, Barry K, Malfatti SA, et al. Adaptation to herbivory by the Tammar wallaby includes bacterial and glycoside hydrolase profiles different from other herbivores. Proc Natl Acad Sci. 2010;107:14793–8.
Authors' contributions
YX carried out the experiment and data analysis and drafted the manuscript. YW participated in the experiment. YW contributed to the data analysis. FYLC and TZ participated in the coordination of the study and helped to draft the manuscript. All the authors read and approved the final manuscript.
Acknowledgements
The authors thank ShenZhen Knowledge Innovation Program—Basic Research Project from Shenzhen Municipal Science and Technology Innovation Council (JCYJ20130401141412386) and the Hong Kong General Research Fund (HKU 7111/12E) for the financial support on this study. Francis Y.L. Chin would like to thank HKU for the Outstanding Researcher Award. YuBo Wang and Yi Wang thank HKU for the postgraduate studentships. Yu Xia would like to thank HKU for the postdoctoral fellowship. Technical support from Ms. Vicky Fung is greatly appreciated.
Availability of supporting data
Supporting data can be found in Additional file 1.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
All the authors consented to the publication of this work.
Funding
ShenZhen Knowledge Innovation Program—Basic Research Project from Shenzhen Municipal Science and Technology Innovation Council (JCYJ20130401141412386). Hong Kong General Research Fund (HKU 7111/12E).