Domain structure of naturally occurring CBM33-containing proteins. Annotations are based on Pfam (http://pfam.sanger.ac.uk), and the number of sequences currently representing each architecture is indicated in brackets. All module families shown are themselves diverse, but have been shown experimentally to have (at least) the following substrate preferences: CBM33, chitin, chitosan and cellulose; CBM1, cellulose and chitin; CBM2, chitin, cellulose and xylan; CBM5/2, chitin and cellulose; FnIII, a wide variety of soluble and insoluble substrates; CBM20, granular starch and cyclodextrins; CBM18, chitin; CBM3, cellulose and chitin; CBM14, chitin; PKD (polycystic kidney disease protein-like protein), unknown substrate; LysM, peptidoglycan. Three hydrolytic modules are also present: GH5 (cellulose, mannan, chitosan, xylan and more), GH18 (chitin, chitosan) and GH19 (chitin, chitosan). Note that the CBM33 module occurs almost exclusively at the N-terminus, in accordance with the notion that the N-terminal histidine is crucial for activity (Figure 2).