Metataxonomic profiling and prediction of functional behaviour of wheat straw degrading microbial consortia

Background Mixed microbial cultures, in which bacteria and fungi interact, have been proposed as an efficient way to deconstruct plant waste. The characterization of specific microbial consortia could be the starting point for novel biotechnological applications related to the efficient conversion of lignocellulose to cello-oligosaccharides, plastics and/or biofuels. Here, the diversity, composition and predicted functional profiles of novel bacterial-fungal consortia are reported, on the basis of replicated aerobic wheat straw enrichment cultures. Results In order to set up biodegradative microcosms, microbial communities were retrieved from a forest soil and introduced into a mineral salt medium containing 1% of (un)treated wheat straw. Following each incubation step, sequential transfers were carried out using 1 to 1,000 dilutions. The microbial source next to three sequential batch cultures (transfers 1, 3 and 10) were analyzed by bacterial 16S rRNA gene and fungal ITS1 pyrosequencing. Faith’s phylogenetic diversity values became progressively smaller from the inoculum to the sequential batch cultures. Moreover, increases in the relative abundances of Enterobacteriales, Pseudomonadales, Flavobacteriales and Sphingobacteriales were noted along the enrichment process. Operational taxonomic units affiliated with Acinetobacter johnsonii, Pseudomonas putida and Sphingobacterium faecium were abundant and the underlying strains were successfully isolated. Interestingly, Klebsiella variicola (OTU1062) was found to dominate in both consortia, whereas K. variicola-affiliated strains retrieved from untreated wheat straw consortia showed endoglucanase/xylanase activities. Among the fungal players with high biotechnological relevance, we recovered members of the genera Penicillium, Acremonium, Coniochaeta and Trichosporon. 
Remarkably, the presence of peroxidases, alpha-L-fucosidases, beta-xylosidases, beta-mannases and beta-glucosidases, involved in lignocellulose degradation, was indicated by predictive bacterial metagenome reconstruction. Reassuringly, tests for specific (hemi)cellulolytic enzymatic activities, performed on the consortial secretomes, confirmed the presence of such gene functions. Conclusion In an in-depth characterization of two wheat straw degrading microbial consortia, we revealed the enrichment and selection of specific bacterial and fungal taxa that were presumably involved in (hemi) cellulose degradation. Interestingly, the microbial community composition was strongly influenced by the wheat straw pretreatment. Finally, the functional bacterial-metagenome prediction and the evaluation of enzymatic activities (at the consortial secretomes) revealed the presence and enrichment of proteins involved in the deconstruction of plant biomass.


Background
Efficient bioconversion of lignocellulosic substrates depends critically on the functioning of multispecies microbial consortia rather than single strains [1]. In such consortia, secretion of the enzymes involved in biodegradation, as affected by the interactions between the microbial players (bacteria-fungi), is of crucial importance [2,3]. Wheat straw, as the source of lignocellulose, can potentially serve to provide building blocks for production of plastics or energy in biofuels [4]. The conversion of lignocellulosic polymers into monomers that can be further processed involves the synergistic action of a range of secreted enzymes, that is, peroxidases, xylanases and endo/exoglucanases [5,6]. In spite of the fact that intricate knowledge on the decomposition process is lacking, many bacteria are known to be capable of producing such enzymes. In particular, members of the Gammaproteobacteria, Firmicutes and Bacteroidetes have been implicated in lignocellulose biodegradation [7,8]. Moreover, fungi like Trichosporon and Coniochaeta are considered as potential sources of hydrolytic enzymes, in particular those involved in the bioconversion of (toxic) furanic compounds and in the production of unique secondary metabolites [9,10]. In addition, recent evidence suggests that, from the biotechnological perspective, Penicillium, Acremonium and Trichoderma species represent fungi that are applicable in the production of commercial lignocellulases [11].
The current literature indicates several strategies by which effective microbial consortia can be obtained [12]. In addition, the construction of target microbial communities can be aided using stable isotope probing (SIP) [13]. However, SIP suffers from drawbacks related to cross-feeding phenomena and/or the possible detection of bacterial or fungal predators of labeled cells, that is, those representing "microbial cheaters" [14]. Thus, a valid strategy to obtain efficient microbial consortia that degrade lignocellulosic matter is ex situ dilution to stimulation, using (partially unlocked) plant material as the unique energy and carbon source [15,16]. Due to selective processes, this last approach results in a stimulus of (biodegradation) function within the emerging consortia during succession [12]. The enrichment cultures produced can then provide a robust platform for biotechnological applications [17][18][19].
Unfortunately, cultivation-based analyses of complex microbial consortia are restrictive, as key organisms may be omitted. Thus, DNA-based high-throughput sequencing techniques have been recently applied to lignocellulolytic consortia [20,21]. The studies performed so far have, however, only addressed the role of bacteria, to the exclusion of fungal players. Fungi, either in the mycelial or yeast form, can have dominant roles in lignocellulose decomposition in plant litter and soil [22,23]. In lignocellulosic enrichment cultures, the bacterial and fungal diversities may be driven by the microbial source, available substrates, pH, redox potential, temperature and possible toxic compounds [24][25][26]. Thus, such consortia need to be assessed over time in relation to conditions and metabolic fluxes among key members, which is important for further "consortium engineering" [2].
The classical bacterial 16S rRNA gene and fungal ITS1 based markers are useful to describe community composition but do not provide information on the genes that are involved in lignocellulose deconstruction. Recently, Langille et al. (2013) [27] suggested a way to overcome such a limitation. They developed the software PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States) to predict the occurrence of functions in microbial communities solely on the basis of bacterial 16S rRNA gene sequences. Although such an approach is theoretically fraught with uncertainties, realistic predictions of function in low-complexity environments were given. Thus, PICRUSt has been used to analyze the human intestinal mucosal surface microbiome, and the results correlated fairly well with the extant metabolome, suggesting a relationship between inferred function and metabolites found [28]. However, the method needs extreme caution in the interpretation of its outcomes, given the known impact of horizontal gene transfer (HGT) across the genomes of the members of most microbial communities. In addition, the quality of these functional predictions is largely dependent on the availability of annotated reference genomes.
In a previous study [29], we reported the construction of two novel bacterial-fungal consortia involved in the bioconversion of lignocellulose next to furanic compounds. We described their characteristics based on bacterial cell counts, quantitative PCR (qPCR), denaturing gradient gel electrophoresis (DGGE) analyses and isolation of some key consortium members. In addition, we designed a novel iodide oxidation method to detect 5-hydroxymethylfurfural oxidoreductase activity. In the current study, we expanded our previous work by focusing on the metataxonomic evaluation (based on bacterial 16S rRNA gene and fungal ITS1 pyrosequencing data) of two lignocellulolytic microbial consortia enriched on untreated versus pretreated wheat straw. We here analyze the successional microbial diversity and community composition of the two consortia and apply PICRUSt, thus predicting genes for functions involved in lignocellulose metabolism. Moreover, we evaluated the joint expression of some of these genes in the secretome, by quantification of specific (hemi)cellulolytic enzymatic activities. The two consortia constitute starting points for biotechnological applications in the light of their possible capacities in the conversion of lignin, (hemi)cellulose, furanic compounds and cello-oligosaccharides.

Analysis of the community structures and diversities of two microbial consortia
Overall, 18,200 trimmed-rarefied sequences of bacterial 16S rRNA from the forest soil inoculum (SS) as well as the RWS (untreated wheat straw) and TWS (heat-treated wheat straw) consortia (n = 13) were analyzed. The rarefied sequencing data (1,400 sequences per sample) were binned into 338, 109 (±6.5) and 102 (±0.3) abundant operational taxonomic units (OTUs), for SS, RWS and TWS (both at transfer 1, T1), respectively (Figure 1A). For TWS, the bacterial consortia also showed progressively decreasing CRE and PD values, with the higher ones in the SS and T1 (161.67 ± 0.49 and 3.54 ± 0.26 for CRE and PD, respectively) (Figure 1B).

Relative abundance at genus level and structure of the microbial consortia
To assess the relative abundance (RA) of each genus in the RWS and TWS microbial consortia, we removed the least abundant OTUs (those containing fewer than 10 sequences in total). We thus used 92.64% (±1.50) to 98.50% (±0.57), and 81.63% (±0) to 95.65% (±1.09) of the 1,400 16S rRNA gene and 550 ITS1 sequences, respectively. On this basis, 24 bacterial and 13 fungal genera were detected across all samples in both enrichment strategies (omitting the source SS) (Figure 3). The bacterial communities at T1 for RWS and TWS showed eight abundant genera, defined as having an RA > 5%; these were Stenotrophomonas, Sphingobacterium, Acinetobacter, Flavobacterium, Pseudomonas, Serratia, Klebsiella and Paenibacillus.
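The filtering and relative-abundance steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy OTU table and the OTU-to-genus mapping are hypothetical, and the sketch assumes that RA is computed against the sequences retained after filtering.

```python
from collections import defaultdict

def filter_rare_otus(table, min_total=10):
    """Drop OTUs whose summed count over all samples is below min_total."""
    totals = defaultdict(int)
    for counts in table.values():
        for otu, n in counts.items():
            totals[otu] += n
    keep = {otu for otu, n in totals.items() if n >= min_total}
    return {sample: {o: n for o, n in counts.items() if o in keep}
            for sample, counts in table.items()}

def genus_relative_abundance(counts, otu_to_genus):
    """Return genus -> RA (%) for one sample, relative to retained sequences."""
    total = sum(counts.values())
    per_genus = defaultdict(int)
    for otu, n in counts.items():
        per_genus[otu_to_genus[otu]] += n
    return {genus: 100.0 * n / total for genus, n in per_genus.items()}

def abundant_genera(ra, threshold=5.0):
    """Genera with RA strictly above the threshold (5% in the text)."""
    return sorted(g for g, pct in ra.items() if pct > threshold)

# Hypothetical toy data (not values from the study).
toy_table = {
    "RWS_T1": {"otu_a": 60, "otu_b": 36, "otu_c": 4},
    "TWS_T1": {"otu_a": 50, "otu_b": 45, "otu_c": 5},
}
toy_genus = {"otu_a": "Klebsiella", "otu_b": "Pseudomonas", "otu_c": "Opitutus"}

filtered = filter_rare_otus(toy_table)   # otu_c has 9 reads in total and is dropped
ra_rws = genus_relative_abundance(filtered["RWS_T1"], toy_genus)
```

In this toy case the rare OTU is removed before the percentages are computed, so the two remaining genera sum to 100%.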

Bacterial OTUs related to (hemi)cellulolytic strains
A total of 10 out of the 15 abundant bacterial OTUs detected by direct molecular assessment were recovered as isolates (Figure 5). Bacterial isolates were recovered by dilution plating on R2A agar and presumptively identified using 16S rRNA gene sequencing [29]. Among these, non-(hemi)cellulolytic strain 10w8 isolated from TWS matched Pseudomonas putida_OTU418 (99% identity). In RWS, two OTUs representing the genus Acinetobacter were found to be abundant. Sequence-wise, the isolated strains 8w3 and 8w5, which were found to have endoglucanase and xylanase activities, matched one OTU, affiliated with Acinetobacter johnsonii (OTU1927), whereas strain 10w16, which was devoid of any (hemi)cellulolytic activity (based on its activity on carboxymethylcellulose (CMC) and xylan from birchwood), matched Acinetobacter calcoaceticus_OTU636. The sequence of OTU1062 (Klebsiella variicola) represented the most abundant OTU in RWS (approximately 37%), and the RWS- and TWS-derived strains 10w11, 10w26, 10t14 and 1t2 matched this sequence type. Interestingly, the strains retrieved from RWS (10w11 and 10w26) showed (hemi)cellulolytic activity, whereas those from TWS (10t14 and 1t2) did not (Figure 5). In accordance with the phylogenetic tree and the low identity (90%), we do not consider that Flavobacterium hercynium_OTU838 represents the (hemi)cellulolytic bacterial strain 3w2. Moreover, strains 3t5 and 3t6, which likely represented the highly abundant Sphingobacterium faecium_OTU387, did not show CMC-ase and xylanase activities on agar plates.

Predicting functions involved in lignocellulose deconstruction
Totals of 25 and 17 genes related to plant biomass deconstruction were predicted to be consistently enriched from the SS to the RWS and TWS consortia at T10, respectively. Conversely, 34 and 18 predicted genes were enriched from the sequential batches (T1 to T10) in RWS and TWS, respectively (Table 1). Interestingly, predicted genes that encode glycolate oxidase (EC:1.  (Table 1).
Six predicted genes potentially related to lignin bioconversion were enriched in RWS at T10, while one (catalase, EC:1.11.1.6) was enriched in TWS at T10, compared with the SS source (Table 1). With respect to (hemi)cellulose bioconversion, the number of predicted genes that encode alpha-mannosidase (EC:3.2.1.24) and levanase (EC:3.2.1.65) increased along the sequential transfers (T1 to T10) in both cultures and was also higher than in the SS source. The predicted genes that encode glucosidase (alpha and beta) enzymes also increased along the sequential transfers in both strategies. Moreover, many predicted genes that were already abundant in SS, such as those that encode beta-galactosidases (EC:3.2.1.23) and beta-glucosidases (EC:3.2.1.21), showed a further increase over the sequential transfers in both cultures. Also, xylan 1,4-beta-xylosidase (EC:3.2.1.37) and endoglucanase (EC:3.2.1.4) related predicted genes were either enriched or depleted along the sequential transfers in RWS and TWS, respectively (Table 1). In addition, a beta-mannosidase (EC:3.2.1.25) related gene was found in both enrichment cultures. Regarding the accuracy of metagenome predictions, the Nearest Sequenced Taxon Index (NSTI), which quantifies the uncertainty of the prediction (lower values mean a better prediction), decreased from 0.18 in SS to 0.03 (RWS) and 0.09 (TWS) at T10 (Figure 6A). The NSTI metric represents the sum of phylogenetic distances for each organism in the OTU table to its nearest relative with a sequenced reference genome, measured in terms of substitutions per site in the 16S rRNA gene and weighted by the frequency of that organism in the OTU table [27].
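The NSTI definition given above amounts to an abundance-weighted sum of distances. A minimal sketch of that arithmetic follows; it does not call PICRUSt, and the function name, OTU identifiers and distance values are hypothetical.

```python
def weighted_nsti(otu_counts, nearest_distance):
    """Abundance-weighted sum of distances to the nearest sequenced relative.

    otu_counts: OTU id -> sequence count in one sample.
    nearest_distance: OTU id -> 16S rRNA distance (substitutions per site)
        to the closest reference genome.
    """
    total = sum(otu_counts.values())
    if total == 0:
        raise ValueError("sample contains no sequences")
    return sum((n / total) * nearest_distance[otu]
               for otu, n in otu_counts.items())

# Hypothetical toy sample: a dominant OTU close to a sequenced genome
# and a minor OTU that is further away.
toy_counts = {"otu_a": 3, "otu_b": 1}
toy_distances = {"otu_a": 0.02, "otu_b": 0.10}
toy_nsti = weighted_nsti(toy_counts, toy_distances)  # 0.75*0.02 + 0.25*0.10
```

Because the index is weighted by frequency, a community dominated by a few OTUs with close sequenced relatives yields a low NSTI, which is consistent with the decrease reported from SS to the enriched consortia.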

Quantification of specific enzymatic activities related to (hemi)cellulose bioconversion
As shown by direct enzymatic assays, beta-xylosidases, beta-galactosidases, beta-mannosidases, cellobiohydrolases and beta-glucosidases were active in the secretome of both consortia. Interestingly, we observed high beta-mannosidase (0.35 nM 4-methylumbelliferyl (MUF)/min) and beta-galactosidase (1.3 nM MUF/min) activities in TWS but low ones in RWS. In addition, cellobiohydrolases showed higher activities in RWS (0.11 nM MUF/min) compared to TWS (0.03 nM MUF/min). Beta-galactosidases, beta-glucosidases and beta-xylosidases also showed higher activity values in both secretomes compared with the other two enzymes (Figure 6B). The bacterial abundance at T10 was evaluated by cell counts, and we observed increases from the inoculum level, around 5, to about 8.4 log bacterial cells/mL in both consortia (data not shown).

Table 1 note: The absolute values show the number of predicted genes per 1,000 rarefied 16S rRNA sequences analyzed (normalized data).

Discussion
The use of microbial consortia in lignocellulose transformation is likely to reduce impairments in bioprocesses using lignocellulosic matter, such as incompletely synergistic enzymes, pH regulation, the presence of toxic compounds, end-product inhibition and tolerance to environmental fluctuations [2,18]. Previously, several different types of plant biomass, such as sugarcane bagasse, poplar wood chips and switchgrass [15,30,31], have been used in the selection of biodegradative microbial consortia. In our work, wheat straw, either torrefied (pretreated with heat at 240°C) or not, was used. However, torrefaction, which is proposed as a valuable step in waste plant biomass valorization [32], introduces furanic compounds and/or cello-oligosaccharides in the medium [24]. Other studies suggested that lignocellulose-degrading soil communities are best addressed using SIP analysis based on 13C-substrates [33], whereas lignin-degradative microbial communities can be enriched directly in soil [34].
Lignocellulolytic microbes are important members of forest soil communities and aid in the degradation of litter [35,36]. Our dilution-to-stimulation approach from a forest soil source community resulted in a stimulus of biodegradative function within the emerging bacterial-fungal consortia. Moreover, qPCR showed that the bacterial 16S rRNA gene copy numbers were higher than the fungal ITS1 copy numbers (about 2 log units) in both substrates [29]. Notably, the microbial diversity became markedly reduced in both consortia compared to the source SS (Figure 1), suggesting the selection of particular taxa (including lignocellulose and possibly cello-oligosaccharide eaters) to the detriment of others. On the other hand, the apparently low microbial richness in SS could be related to the deletion of all singletons in our bioinformatic analysis. The bacterial CRE values decreased along the sequential batches, indicating enrichment of OTUs that grew consistently well in the enrichment. Interestingly, the TWS consortia showed low bacterial diversities compared to the RWS ones, possibly due to the presence of toxic compounds such as furanic aldehydes. With respect to fungal diversity and richness, our data showed values that were similar between RWS and TWS, indicating that substrate type did not strongly affect fungal diversity. However, Faith's PD measure suggested the selection of particular lignocellulolytic fungi in both consortia. Moreover, both unweighted UniFrac distances and PCA (Figure 4; Additional file 4) showed that the consortia were highly influenced by substrate type (variation explained: 43.8%), which is similar to other reported data [21,37]. In addition, the structures of both consortia became less different after T3. In our previous analysis based on PCR-DGGE, stability of the dominant microorganisms after six transfers was reported [29].
The enrichment process increased the abundances of the bacterial orders Enterobacteriales, Pseudomonadales, Flavobacteriales and Sphingobacteriales (Figure 2A and B). In 2012, Eichorst and Kuske [38] performed SIP using 13C-maize cellulose to evaluate the biodegradative microbial communities in different soils. They identified Burkholderiales, Caulobacterales, Rhizobiales, Sphingobacteriales and Xanthomonadales as the active bacterial groups. Enterobacteriales and Pseudomonadales were also enriched in insect herbivore microbiomes and lignin-enriched cultures, respectively [39,40]. Interestingly, in our RWS consortia the abundance of Enterobacteriales increased after T1, but the number of OTUs decreased, suggesting the selection of specific strains, for example, K. variicola, within this group. A similar pattern was observed for Sphingobacteriales in TWS, although the number of OTUs did not change after T3 (Additional file 5).
In soils, Ascomycota may be highly abundant in early stages of litter decomposition, whereas particular basidiomycetous yeasts increase in the later stages, possibly due to their capacity to degrade more recalcitrant compounds [41]. Members of the orders Atheliales, Agaricales, Helotiales, Chaetothyriales and Russulales were found to be abundant in (coniferous) forest soils [35]. Here, we found a high abundance of Malasseziales and Hypocreales. Abundant fungal orders present in SS were even more enriched in RWS. In contrast, only low-abundance orders were enriched in TWS, such as Coniochaetales and Tremellales (Additional file 3B). Štursová et al. (2012) [22], using SIP analysis in soil and litter plant samples with 13C-cellulose, identified fungi affiliated with Dothideales, Leotiomycetes, Helotiales, Tremellales and Chaetothyriales. Thus, diverse fungi can be involved in lignocellulose bioconversion. We posit here that this diversity is dictated by the environmental source, substrate and methodology used for recovery and characterization.
Bacterial strains affiliated with an abundant K. variicola_OTU1062 showed (hemi)cellulolytic activity when isolated from RWS, but not from TWS, suggesting either HGT of mobile genes or differential gene expression between the isolates obtained from the two substrate types. Okeke and Lu (2011) [42] proposed that the capacity of Klebsiella types to degrade lignocellulose can be attributed to the acquisition of plasmids encoding (hemi)cellulolytic enzymes from the environment. However, Suen et al. (2010) [43] reported a chromosomal location in K. variicola At-22 of genes involved in plant biomass degradation, that is, beta-1,4-glucanase, alpha-xylosidases and alpha-mannosidases. Interestingly, the degradation of lignocellulose by an insect herbivore microbiome has been attributed to an association between Leucoagaricus gongylophorus (Basidiomycota) with Klebsiella species [39]. Another possible explanation for such findings arises from the regulatory mechanism of (hemi)cellulolytic genes, which is ultimately mediated by environmental conditions, in this case the torrefied substrate.
Members of Citrobacter, Enterobacter, Acinetobacter, Pseudomonas, Flavobacterium and Stenotrophomonas have the capacity to degrade plant lignin, (hemi)cellulose and/or CMC [44][45][46][47][48][40]. The presence of Sphingobacterium and Pedobacter in both microbial consortia may suggest the production of beta-glucosidases, indicating that these organisms are acting as "cheaters" that remove the cello-oligosaccharides produced by polymer degraders [49]. However, such organisms might also be involved in the production of aryl alcohol oxidases or endoxylanases [50,51]. We recently confirmed the production of beta-mannosidases, beta-galactosidases and beta-glucosidases by characterization of S. faecium (similar to OTU387, strain 3T5, data not shown). The presence of Chryseobacterium, Opitutus, Chitinophaga and Xanthomonas in RWS might relate to secondary functions, the nature of which is unclear. In TWS, members of Stenotrophomonas, Pseudomonas and Flavobacterium can be involved in the degradation of furanic compounds [52,29]. However, in both our consortia, P. putida_OTU418 might also act as a sugar cheater. Interestingly, Ronan et al. (2013) [53] reported an aerotolerant bacterial consortium composed of Clostridium and Flavobacterium that had the ability to produce ethanol. Moreover, the production of hydrogen by a consortium composed of Clostridium, Klebsiella, Acinetobacter, Bacillus, Pseudomonas, Ruminococcus and Bacteroides retrieved from an anaerobic sludge digester has been evaluated [19]. These studies highlight the importance of aerobic bacterial members to deconstruct lignocellulose, such as those belonging to Flavobacterium, Klebsiella and Acinetobacter.
Concerning fungi, Acremonium is considered to be a very important organism for the production of (hemi)cellulases, as compared to Trichoderma reesei [54,55]. Moreover, Penicillium species have an elaborate enzymatic machinery to deconstruct lignocellulose, such as vanillyl-alcohol oxidases, copper-dependent polysaccharide monooxygenases [56], galactosidases, mannosidases and fucosidases [57]. In our consortia, the Malassezia species may have acted as sugar monomer cheaters, and their high abundances in RWS might be related to their high abundance in the SS. Trichosporon species are anamorphic basidiomycetous yeasts that are widespread in nature [58,59]. The presence of Trichosporon in the gut of xylophagous insects is probably facilitated by their ability to assimilate and transform lignin and various phenolic compounds [60]. Recent results from our group confirm the ability of our Trichosporon isolates to produce cellobiohydrolases and β-xylosidases (data not shown). Trichosporon, an oil-rich yeast, has high biotechnological potential and has been shown to be tolerant to furanic compounds [61,10]. It has been reported that the use of single fungal strains can be highly efficient to deconstruct specific compounds, such as lignin [62]. However, the breakdown of lignocellulose, for example, for biofuel production, often encounters great recalcitrance, which will likely require the synergistic action of multispecies consortia (with higher gene diversity) to overcome it [2]. Some enzymatic transformations might be slow in such communities as a consequence of interspecific competition or even antagonism. To resolve these issues, enzyme cocktails that come from multispecies consortia may be retrieved and applied directly to the plant waste materials.
On theoretical grounds, one could bring up compelling evidence pointing to the scientific danger of attempting to link phylogeny with function by using PICRUSt, and the arguments extend to the limitation of current databases used in the software. However, the linkage might be regarded in a loose manner, including genes/functions that might be actually "floating" in the horizontal gene pool of the community. Thus, such functions are thought of as being not tightly linked to a phylogenetically determined species. In both microbial consortia, the uncertainty of the prediction as revealed by the NSTI was markedly lower than that in the SS, thus indicating fair reliability and accuracy in the metagenome reconstruction (Figure 6A). The analysis predicted the enrichment of several genes in our consortia that were potentially involved in lignocellulose degradation, and also showed that TWS was possibly a poorer selector of such genes than RWS (Table 1). It was predicted that some peroxidases (EC:1.11.1.-), classified as "auxiliary activities" (AA2 family) in the CAZy (Carbohydrate-Active EnZymes) database [63], were enriched in both consortia by the sequential transfers. Such enzymes oxidize phenolic and non-phenolic aromatic compounds and can modify lignin polymers [56]. These enzymes were more evident in the RWS consortium, supporting its potential to act on lignin. Furthermore, glycolate oxidase (EC:1.1.3.15; an oxidoreductase capable of oxidizing glycolate to glyoxylate, producing reactive oxygen species) was progressively enriched in the TWS consortium, suggesting a correlation with the metabolism of furanic compounds. Glycolate oxidases are classified in CAZy as family AA7. In this family, we found gluco-oligosaccharide oxidases capable of oxidizing a variety of carbohydrates and possibly involved in the biotransformation of lignocellulosic compounds [64].
The beta-galactosidases, which hydrolyze beta-galactosidic bonds between galactose and its organic functional group and can act on xyloglucans [69], were highly active in both consortia (Figure 6B). The beta-mannosidases (EC:3.2.1.25), involved in the hydrolysis of terminal, nonreducing beta-D-mannosyl residues in beta-D-mannosides [70], were of low abundance in our consortia as compared to SS, but such activities were also detected in the secretome. The activities of these last two types of enzymes were higher in TWS than in RWS (Figure 6B), suggesting the raised availability of beta-D-galactose and beta-D-mannosyl residues in TWS, possibly released due to the torrefaction.
Conversely, mannan endo-1,4-beta-mannosidase (EC:3.2.1.78) (GH5, GH26, GH113 and AA10) related genes were enriched in RWS compared to SS. These enzymes are involved in the random hydrolysis of (1→4)-beta-D-mannosidic linkages in mannans, galactomannans and glucomannans. Cellobiohydrolases (endo- and exoglucanases) showed low activity in the secretome of TWS, suggesting the presence of high cellulose levels in the untreated compared with the torrefied wheat tissue. Several genes that encode beta-glucosidases were enriched in both consortia compared with SS, suggesting that the conversion of cellobiose to glucose is an important function in these consortia. Finally, cleavage and further metabolism of di-sugars was represented by several predicted enzymes. For example, alpha-L-rhamnosidase related genes were highly abundant in TWS at T10, compared to SS. These enzymes cleave terminal alpha-L-rhamnose from a large number of natural glycosides, and are relevant for application in citrus fruit juice and wine industries [71].

Conclusion
In this study, the application of DNA-based high-throughput sequencing technology allowed the characterization of novel bacterial-fungal consortia growing on wheat straw. The data, in conformity with our previous work [29], indicate that mixed microbial consortia, encompassing specific biodegradative (mainly affiliated with Klebsiella, Sphingobacterium, Flavobacterium, Acinetobacter, Penicillium and Acremonium) and cheater types, are selected by the specific lignocellulose substrate. The approach allowed us to identify interesting yeasts, such as Coniochaeta and Trichosporon, that are possibly involved in plant biomass degradation and/or conversion of furanic compounds. Application of PICRUSt to predict the functional profile (using 16S rRNA sequences), in conjunction with the evaluation of enzymatic activities in the consortial secretomes, allowed the inference of genes/proteins that were presumptively involved in lignocellulose degradation (such as peroxidases, beta-mannases, beta-galactosidases, alpha-L-fucosidases, alpha-L-arabinofuranosidases and beta-glucosidases). Finally, assays of the degradation of other plant waste and quantification of initial and final products (for example, cello-oligosaccharides) might demonstrate the degradative potential that is needed for future biofuel production. A closer analysis of the metagenome and mobilome in our consortia will clarify the enzymatic profile and biotechnological potential present and can also shed light on the potential role of HGT in its evolution. A greater understanding of the ecological interactions between consortium members during plant biomass biodegradation is required for further progress in this area.

Lignocellulolytic microbial consortia construction
Soil samples (n = 10) were collected and mixed from a forest (top layer, 0 to 10 cm depth) in Groningen, The Netherlands (53.41 N; 6.90 E). Soil cell suspensions were prepared by adding 10 g of freshly sampled soil to 250-mL flasks containing 10 g of sterile gravel in 90 mL of mineral salt medium (MSM). The flasks were shaken for 20 min at 250 rpm, and aliquots (250 μL) of soil suspension were added to triplicate Erlenmeyer flasks containing 25 mL of MSM with 1% lignocellulose substrate (0.25 g in 25 mL), amended with a trace element and vitamin solution. Two different substrates were used as carbon sources: i) "raw" wheat straw (RWS) and ii) heat-treated (torrefied) wheat straw (TWS). The flasks were incubated at 25°C with shaking at 100 rpm. Cultures were monitored at regular time intervals, and once the systems reached high cell density (log 7 to 8 cells/mL), aliquots (25 μL microbial suspension with fibrous material) were transferred to 25 mL of fresh medium. Finally, a sample of soil suspension and duplicate flask samples (selected based on reported DGGE analyses) at the final batches were taken from the RWS and TWS consortia after 1 (T1), 3 (T3) and 10 (T10) transfers (n = 13) and used for total DNA extractions and pyrosequencing as described below. Details of the experimental setup, substrate preparation, growth in sequential-batch cultures (cell counts and qPCR) and negative controls have been reported [29].
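The transfer regime described above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the dilution is taken as the nominal ratio of fresh medium to aliquot volume, and the starting density of log 8 cells/mL is simply the upper bound quoted in the text.

```python
import math

def nominal_dilution_factor(aliquot_ul, fresh_medium_ml):
    """Nominal fold dilution: fresh medium volume divided by aliquot volume."""
    return (fresh_medium_ml * 1000.0) / aliquot_ul

def doublings_to_recover(fold_dilution):
    """Population doublings needed to regain the pre-transfer density."""
    return math.log2(fold_dilution)

def log_density_after_transfer(log_density_before, fold_dilution):
    """log10 cells/mL immediately after the transfer."""
    return log_density_before - math.log10(fold_dilution)

fold = nominal_dilution_factor(aliquot_ul=25, fresh_medium_ml=25)  # 25 uL into 25 mL
doublings = doublings_to_recover(fold)
start_density = log_density_after_transfer(8.0, fold)  # assumes log 8 cells/mL before transfer
```

Under these assumptions each transfer is a 1,000-fold dilution, so a culture must go through roughly ten doublings per batch to return to its previous density.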

Sequencing processing and statistical analysis
Pyrosequencing raw data were processed using the Quantitative Insights Into Microbial Ecology (QIIME) toolkit [74]. The sequences were quality trimmed using the following parameters: quality score of >25, sequence length of 300 to 900 bp (for 16S rRNA) and 100 to 900 bp (for ITS1), maximum homopolymer of 6, 0 maximum ambiguous bases and 0 mismatched bases in the primer. In order to select for the same region of each gene, we retrieved sequences with primers GM3F (for the bacterial 16S rRNA) and ITS1F (for fungal ITS1). We identified bacterial and fungal players by grouping highly similar sequences into OTUs (at 97% nucleotide identity) using UCLUST [75], followed by selection of representative sequences. Subsequently, chimeric sequences were detected using ChimeraSlayer [76] and deleted. Additionally, clusters consisting only of singleton sequences were removed in order to avoid sequencing errors. Analyses of community composition, as well as richness and diversity estimators, were carried out at a depth of 1,400 bacterial and 550 fungal rarefied sequences per sample, to eliminate the effect of sampling effort. QIIME was also used to generate alpha- and beta-diversity metrics, including OTU richness, CRE, SWI, PD and UniFrac distances. Taxonomic classifications at the phylum and order level of each OTU were done using the RDP classifier and BLAST algorithms against the Greengenes (16S rRNA), UNITE and GenBank (ITS1) databases. The assignment of each OTU at the genus level was based on the best BLASTn hit against the GenBank database (Additional file 6). Abundant OTUs, with more than 10 and 5 sequences per OTU for the 16S rRNA and ITS1 data, respectively, were selected to construct the PCA using Canoco software v4.52 (Wageningen, The Netherlands). The 16S rRNA and ITS1 sequences were deposited in GenBank under SRA accession number SRP039495.
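The quality-trimming criteria listed above can be expressed as a single per-read check. The following is a sketch in plain Python rather than the QIIME implementation. It assumes that the quality threshold applies to the mean score of a read and that the primer must match the start of the read exactly; the toy primer and read are hypothetical (real primers such as GM3F contain degenerate positions, which this sketch does not handle).

```python
from itertools import groupby
from statistics import mean

def longest_homopolymer(seq):
    """Length of the longest run of one repeated base."""
    return max((sum(1 for _ in run) for _, run in groupby(seq)), default=0)

def passes_filter(seq, quals, primer, min_len, max_len=900,
                  min_mean_q=25, max_homopolymer=6):
    """Apply the length, quality, ambiguity, homopolymer and primer checks."""
    seq = seq.upper()
    if not (min_len <= len(seq) <= max_len):
        return False
    if mean(quals) <= min_mean_q:            # text: quality score of >25
        return False
    if set(seq) - set("ACGT"):               # no ambiguous bases allowed
        return False
    if longest_homopolymer(seq) > max_homopolymer:
        return False
    return seq.startswith(primer.upper())    # no mismatches in the primer

# Hypothetical toy primer and read (324 bases, longest homopolymer of 2).
TOY_PRIMER = "ACGT"
toy_read = TOY_PRIMER + "ACGTTGCA" * 40
toy_quals = [30] * len(toy_read)
toy_ok = passes_filter(toy_read, toy_quals, TOY_PRIMER, min_len=300)
```

For ITS1 reads the same check would be called with `min_len=100`, following the length ranges given in the text.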

Detection of abundant OTUs as bacterial strains
The isolation of bacterial strains during the experiment, their taxonomic identification and their (hemi)cellulolytic activity on agar plates (with CMC and xylan from birchwood) were previously reported [29]. Partial 16S rRNA gene sequences of these strains were obtained using the same forward primer as used for the 16S rRNA pyrosequencing. To detect which OTUs were possibly recovered as bacterial strains, we constructed a phylogenetic tree using the sequences of the 15 most abundant bacterial OTUs (representing over 72% and 88% of the consortia in RWS and TWS, respectively) in addition to 20 sequences retrieved from the bacterial strains. Sequences were aligned using the ClustalW software, and the phylogenetic analyses (p-distance) were conducted with MEGA v5.1 using the Neighbour-Joining method [77]. The evolutionary distances were computed using the Kimura 2-parameter method and are expressed as the number of base substitutions per site (see scale bar in Figure 5). Branch support was tested with bootstrap analyses (1,000 replications). Furthermore, (hemi)cellulolytic activity was linked to the OTUs on the basis of their similarity to, and clustering with, the bacterial strains.
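The two distance measures named above can be illustrated for a single pair of aligned sequences. The Python sketch below computes the p-distance and the Kimura 2-parameter distance from the fractions of transitions (P) and transversions (Q), using d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). It is a minimal illustration and not the MEGA implementation; skipping sites with gaps or ambiguity codes (pairwise deletion) is an assumption.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def transition_transversion_fractions(seq_a, seq_b):
    """Return (P, Q): fractions of transitions and transversions over the
    aligned sites where both sequences carry an unambiguous base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: gaps and ambiguity codes are skipped
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return transitions / sites, transversions / sites


def p_distance(seq_a, seq_b):
    """Proportion of differing sites."""
    p, q = transition_transversion_fractions(seq_a, seq_b)
    return p + q


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance in substitutions per site."""
    p, q = transition_transversion_fractions(seq_a, seq_b)
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0.0 or term2 <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

For closely related sequences the two measures are nearly identical; the Kimura correction exceeds the p-distance increasingly as divergence grows, because it accounts for multiple substitutions at the same site.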

Reconstructing the bacterial metagenomes with PICRUSt software
The bacterial metagenomes were reconstructed using the PICRUSt software [27]. A PICRUSt-compatible OTU table was constructed in QIIME (at 97% nucleotide identity) by closed-reference OTU picking against the newest available reference OTU collection in the Greengenes database [78]. In order to normalize the data, we used 1,000 rarefied bacterial 16S rRNA sequences per sample as input. Subsequently, normalization by 16S rRNA gene copy number per OTU was performed with the normalize_by_copy_number.py script and IMG database information. The metagenome inference was done using the predict_metagenomes.py script with the normalized OTU table as input. We analyzed the average number of annotated genes in each sample and selected the top 40 known genes related to the bioconversion of lignocellulose. PICRUSt also calculated the nearest sequenced taxon index (NSTI), a measure of prediction uncertainty, which is presented here in a comparative way along the sequential batches in both consortia datasets.
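The logic of the normalization and prediction steps can be sketched as follows. This is a conceptual Python illustration and not the PICRUSt code: the function names, OTU identifiers, gene labels, copy numbers and distances are all hypothetical, and the abundance weighting used in the NSTI calculation is an assumption.

```python
def normalize_by_copy_number(otu_counts, copy_numbers):
    """Divide each OTU's read count by its 16S rRNA gene copy number."""
    return {otu: count / copy_numbers[otu] for otu, count in otu_counts.items()}


def predict_gene_counts(normalized, gene_content):
    """Sum, over OTUs, normalized abundance times gene copies per genome."""
    totals = {}
    for otu, abundance in normalized.items():
        for gene, copies in gene_content[otu].items():
            totals[gene] = totals.get(gene, 0.0) + abundance * copies
    return totals


def weighted_nsti(normalized, nearest_distance):
    """Abundance-weighted mean distance to the nearest sequenced genome."""
    total = sum(normalized.values())
    weighted = sum(normalized[otu] * nearest_distance[otu] for otu in normalized)
    return weighted / total


if __name__ == "__main__":
    # Hypothetical toy sample with two OTUs.
    counts = {"otu1": 70, "otu2": 30}
    copies = {"otu1": 7, "otu2": 1}
    content = {
        "otu1": {"beta_glucosidase": 2},
        "otu2": {"beta_glucosidase": 1, "beta_xylosidase": 1},
    }
    distances = {"otu1": 0.02, "otu2": 0.10}

    normalized = normalize_by_copy_number(counts, copies)
    print(predict_gene_counts(normalized, content))
    print(weighted_nsti(normalized, distances))
```

The toy data show why copy-number normalization matters: an OTU with seven 16S rRNA gene copies contributes far fewer inferred genomes than its raw read count suggests.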

Quantification of specific enzymatic activities related to the (hemi)cellulose bioconversion
In order to evaluate the metabolic potential for the degradation of (hemi)cellulose and the expression of selected genes identified by the PICRUSt prediction, we quantified specific enzymatic activities in 2-mL samples taken from the enriched cultures after the final batch (T10), when the communities were stable. Microbial cells plus wheat substrate were pelleted by centrifugation for 3 min at 12,000 rpm, and the supernatant (secretome) was recovered and tested for enzymatic activity using MUF-beta-D-xylopyranoside, MUF-beta-D-mannopyranoside, MUF-beta-D-galactopyranoside, MUF-beta-D-cellobioside and MUF-beta-D-glucopyranoside as substrates. The reaction mixture consisted of 10 μL of MUF-substrate (10 mM in dimethyl sulfoxide), 15 μL of McIlvaine buffer (pH 6.8) and 25 μL of each supernatant. The mixture was incubated at 27°C for 45 min in the dark, and the reaction was stopped by adding 150 μL of 0.2 M glycine-NaOH buffer (pH 10.4). Fluorescence was measured at an excitation wavelength of 365 nm and an emission wavelength of 445 nm. We also measured the fluorescence without the MUF-substrate as a negative control. Enzyme activities were determined from the fluorescence units using a standard calibration curve and expressed as rates of MUF production (nM MUF per min at 27°C, pH 6.8).
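The conversion from fluorescence units to a rate of MUF production can be sketched as follows. This Python example is illustrative only: it assumes a linear calibration curve built from MUF standards measured under the same conditions as the samples, it uses the 45-min incubation time given above, and the standard and sample readings are invented for the purpose of the example.

```python
INCUBATION_MIN = 45.0  # incubation time stated in the text


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def muf_concentration(fluorescence, slope, intercept):
    """Invert the calibration line: fluorescence units to nM MUF."""
    return (fluorescence - intercept) / slope


def muf_rate(sample_fluor, control_fluor, slope, intercept,
             minutes=INCUBATION_MIN):
    """Control-corrected MUF production rate in nM MUF per min."""
    net = (muf_concentration(sample_fluor, slope, intercept)
           - muf_concentration(control_fluor, slope, intercept))
    return net / minutes


if __name__ == "__main__":
    # Hypothetical calibration standards (nM MUF) and their fluorescence.
    standards_nM = [0.0, 100.0, 200.0, 400.0]
    standards_fluor = [5.0, 105.0, 205.0, 405.0]
    slope, intercept = fit_line(standards_nM, standards_fluor)
    print(muf_rate(455.0, 5.0, slope, intercept))
```

Subtracting the control reading (supernatant without MUF-substrate) removes background fluorescence of the secretome, so that only substrate-derived MUF contributes to the reported rate.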

Additional files
Additional file 1: Rarefaction curves in the soil inoculum (SS) and in enriched cultures (RWS and TWS) along the sequential batches. Rarefaction curves of (A) bacterial 16S rRNA and (B) ITS1 pyrosequencing. OTUs were generated at 97% of nucleotide identity.