Plant biomass degradation modules (PDMs) mapping to the cel-cip and xyl-doc gene clusters in Clostridium cellulolyticum H10. Blouzard et al. described two clusters of genes that are involved in cellulose and hemicellulose degradation, and we adopted their domain architecture in this study. Abbreviations used for the carbohydrate-binding module (CBM) and glycoside hydrolase (GH) architecture are: S, signal sequence; DOC1, dockerin type-I module; COH, cohesin type-I module; LNK, linker sequence; UNK, unknown function. Additional domains predicted by our in-house annotation sets are marked as [+family X]. Some dockerin annotations were filtered out by our bit score criterion. (A) Genes from the cel-cip operon (Ccel_0728 to Ccel_0740) are essential for cellulose degradation by C. cellulolyticum H10, which uses the cellulosome strategy. The cluster includes multiple protein families of the PDMs M1 and M5. Although the consensus modules of M1 and M5 did not directly include the two endoglucanase families GH8 and GH48, M1 was associated with GH8 and M5 with GH48 (probability values ≥ 0.005 in the respective topic probability distributions). (B) Genes from the 32-kb xyl-doc gene cluster (Ccel_1229 to Ccel_1242) encode functionalities for hemicellulose degradation. The cluster includes multiple protein families of the PDMs M1, M2, and M5, which together cover most of the cluster. Some additional protein families originate from M3 and M4 (purple). We assumed the following correspondences: CE1 ~ PF00756 (esterase); CBM22 ~ PF02018; and COG3533 (an uncharacterized protein in bacteria) ~ PF07944 (a putative glycosyl hydrolase of unknown function, DUF1680). The xyl-doc cluster contains a xylosidase/arabinofuranosidase gene (Ccel_1233), which is characterized as a putative β-xylosidase in the Integrated Microbial Genomes database (IMG).
This gene corresponds to β-xylosidase genes in Caldicellulosiruptor saccharolyticus (Csac_2411), Bacteroides cellulosilyticus (BACCELL_02584 and BACCELL_00858), and Fibrobacter succinogenes (FSU_2269/Fisuc_1769). Clusters containing M1 protein families were also detected around these genes.