Table 1 All cellulosomal proteins containing a dockerin type I module

From: Optimizing the composition of a synthetic cellulosome complex for the hydrolysis of softwood pulp: identification of the enzymatic core functions and biochemical complex characterization

Locus tag (Clo1313_)^a Alternative locus tag (CLO1313_)^b Homolog locus tag in type strain^c Homolog protein in type strain^c Protein sequence identity (%)^d Refs. Domain structure (CAZy/Pfam)^e Molecular weight (kDa) Expression and purification
2216 RS11245 Cthe_0015   100   CBM42, GH43 79 Soluble protein
2202 RS11155 Cthe_0032 Man26B 100   CBM35, GH26 67 Soluble protein
2189 RS11090 Cthe_0043 Cel9N 100 [23] GH9, CBM3c 82 Soluble protein
2188 RS11085 Cthe_0044 CseP 100 [23] CotH 62 Soluble protein
2122 RS10750 Cthe_0109   99   Un 12 Soluble protein
2043 RS10350 Cthe_0190 PinA 100 [24] Fn3, serpin 68 Soluble protein
2042 RS10345 Cthe_0191 PinB 100 [24] Fn3, serpin 68 Soluble protein
2022 RS10235 Cthe_0211 Lic16B 100 [25] GH16 38 Soluble protein
1990 RS10065 Cthe_0239   99   LTD, LTD, Fn3, CotH 117 Not clonable
1983 RS10030 Cthe_0246   99   CBM35, PL11 89 Not clonable
1971 RS09970 Cthe_0258   100   (RCC1)5 51 Soluble protein
1960 RS09915 Cthe_0269 Cel8A 100 [26] GH8 53 Soluble protein
1959 RS09910 Cthe_0270 Chi18A 100 [27] GH18 55 Soluble protein
1955 RS09890 Cthe_0274 Cel9P 100 [10] GH9 63 Soluble protein
1816 RS09185 Cthe_0405 Cel5L 100 [10] GH5 58 Soluble protein
1809 Synthetic construct Cthe_0412 Cel9K 100 [28] GH9, Ig, CBM4_9, CBM3b 101 Soluble protein
1808 RS09145 Cthe_0413 Cbh9A 99 [29] GH9, Ig, CBM4_9, CBM3b 138 Soluble protein
1788 RS09045 Cthe_0433 Lec9B 100 [10] GH9, CBM3c 89 Soluble protein
1786 Synthetic construct Cthe_0435 Cel124A 100 [30] GH124 40 Soluble protein
1783 RS09020 Cthe_0438   100   Un 15 No purification
1701 RS08595 Cthe_0536 Cel5B 100 [31] GH5 64 Soluble protein
1694 RS08560 Cthe_0543 Cel9F 100 [32] GH9, CBM3c 82 Soluble protein
1659 RS08380 Cthe_0578 Cel9R 99 [33] GH9, CBM3c 85 Soluble protein
1604 Synthetic construct Cthe_0624 Cel9-44J 100 [34] GH9, GH44, Ig, CBM4_9 178 Soluble protein
1603 RS08090 Cthe_0625 Cel9Q 100 [35] GH9, CBM3c 80 Soluble protein
1587 RS08010 Cthe_0640   100   Pectate-lyase 3 superfamily 65 Soluble protein
1564 RS07895 Cthe_0660   99   GH81 86 Soluble protein
1563 RS07890 Cthe_0661 Gal43A 99 [36] GH43, CBM13 64 Soluble protein
1494 RS07550 Cthe_0729   100   CBM 58 No expression
1477 RS07470 Cthe_0745 Cel9W 100 [10] GH9, CBM3c 82 Soluble protein
1425 RS07205 Cthe_0797 Cel5E 99 [37] GH5, CE2 90 Soluble protein
1425* RS07205* Cthe_0797* tCel5E 100 [38] GH5 54 Soluble protein
1424 RS07200 Cthe_0798 Ces3A 100 [39] CE3, CE3 55 Soluble protein
1398 RS07080 Cthe_0821 Man5A 99 [40] GH5, CBM32 60 Soluble protein
1396 RS07070 Cthe_0825 Cel9D 99 [41] GH9, Ig 72 Soluble protein
1305 RS06630 Cthe_0912 Xyn10Y 100 [42] CBM22, GH10, CBM22, CE1 120 Soluble protein
0987 RS05040 Cthe_1271   100   GH43, CBM6, CBM6 75 Soluble protein
0851 RS04370 Cthe_1398 Xgh74A 100 [43] GH74 92 Soluble protein
0849 RS04360 Cthe_1400   100   GH53 47 Soluble protein
2234 RS11350 Cthe_1472 Cel5-26H 99 [44] GH5, GH26, CBM11 102 Soluble protein
2479 RS12560 Cthe_1806   93   Un 236 Not clonable
2530 RS12825 Cthe_1838 Xyn10C 100 [45] CBM22, GH10 70 Soluble protein
2564 RS13020 Cthe_1890   85   (LRR_5)3 76 Not clonable
2635 RS13380 Cthe_1963 Xyn10Z 99 [46] CE1, CBM6, GH10 92 Soluble protein
2693 RS13665 Cthe_2038 Pgu28A 99   GH28 homology 92 Soluble protein
2747 Synthetic construct Cthe_2089 Cel48S 100 [47] GH48 83 Soluble protein
2793 RS14190 Cthe_2137   100   GH39, CBM35, CBM35 88 Insoluble protein
2794 RS14195 Cthe_2138   100   CBM42, GH43 66 Soluble protein
2795 RS14200 Cthe_2139   99   GH30, CBM42, GH43 111 Low expression yield
2805 RS14250 Cthe_2147 Cel5O 99 [48] GH5, CBM3b 75 Soluble protein
2843 RS14430 Cthe_2179   98   PL1, CBM35, PL9 98 No expression
2856 RS14510 Cthe_2193 Xyl5A 99 [49] GH5, CBM6, CBM13, CBM62 103 Soluble protein
2858 RS14520 Cthe_2194   96   CE1, CBM6 54 Insoluble protein
2859 RS14525 Cthe_2195 Xyn141E 99 [65] GH141, CBM6 105 Soluble protein
2860 RS14530 Cthe_2196   100   GH43, CBM6 59 Soluble protein
2861 RS14535 Cthe_2197   74   GH2, CBM6 104 Truncated protein only
2944 RS14960 Cthe_2271   100   Un 19 No expression
3023 RS15380 Cthe_2360 Cel9U 99 [10] GH9, CBM3b, CBM3c 105 Soluble protein
0135 RS00705 Cthe_2549   100   Un 37 Insoluble protein
0177 RS00915 Cthe_2590 Xyn10D 100 [10] CBM22, GH10 72 Soluble protein, partially degraded
0349 RS01780 Cthe_2760 Cel9V 99 [10] GH9, CBM3b, CBM3c 110 Soluble protein
0350 RS01785 Cthe_2761 Lec9A 99 [10] GH9, CBM3c 80 Soluble protein
0399 RS02020 Cthe_2811 Man26A 100 [66] CBM35, GH26 67 Soluble protein
0400 RS02025 Cthe_2812 Cel9T 100 [50] GH9 69 Soluble protein
0413 RS02085 Cthe_2872 Cel5G 99 [51] GH5 63 Soluble protein
0420 RS02120 Cthe_2879   99   CE-nc 55 Soluble protein, partially degraded
0500 RS02535 Cthe_2949   99   CE8 62 Soluble protein
0501 RS02540 Cthe_2950   99   PL1, CBM35 60 Soluble protein
0521 RS02665 Cthe_2972 Xyn11A 99 [52] GH11, CBM6, CE4 74 Soluble protein
0563 RS02880 Cthe_3012   100   GH30, CBM6 71 Soluble protein
0685 RS03545 Cthe_3132   100   Un 47 Soluble/insoluble protein
0689 RS03565 Cthe_3136 CprA 100 [53] Subtilisin-like serine protease 40 Insoluble protein
0693 RS03585 Cthe_3141   99   CE12, CBM35, CE12 91 Soluble protein
  1. GH glycoside hydrolase family, CBM carbohydrate-binding module family, Ig glycoside hydrolase-associated immunoglobulin module, CE carbohydrate esterase family, PL polysaccharide lyase family, Un unknown module or module with unknown function, LTD lamin tail domain, Fn3 fibronectin module, CotH CotH spore coat protein kinase module, RCC1 regulator of chromosome condensation, LRR leucine-rich repeat
  2. ^a Gene feature record annotated as old locus tag for C. thermocellum DSM1313 in the NCBI database (https://www.ncbi.nlm.nih.gov/nuccore/385777386)
  3. ^b Current gene feature record annotated as locus tag in the NCBI database
  4. ^c Homolog sequence annotation (locus tag and protein name) of type strain C. thermocellum ATCC 27405
  5. ^d Sequence identity by blastP (https://blast.ncbi.nlm.nih.gov) against type strain C. thermocellum ATCC 27405 (% of protein sequence)
  6. ^e Protein family classification based on the carbohydrate-active enzyme (CAZy) database [54] (http://www.cazy.org) and the Pfam database (http://pfam.sanger.ac.uk)