Configuration of active site segments in lytic polysaccharide monooxygenases steers oxidative xyloglucan degradation

Background Lytic polysaccharide monooxygenases (LPMOs) are powerful enzymes that oxidatively cleave plant cell wall polysaccharides. LPMOs classified as fungal Auxiliary Activities family 9 (AA9) have been mainly studied for their activity towards cellulose; however, various members of this AA9 family have also been shown to oxidatively cleave hemicelluloses, in particular xyloglucan (XG). So far, it has not been studied in detail how various AA9 LPMOs act in XG degradation, and in particular, how the mode-of-action relates to the structural configuration of these LPMOs. Results Two Neurospora crassa (Nc) LPMOs were found to display different modes-of-action towards XG. Interestingly, the configuration of active site segments of these LPMOs differed as well, with a shorter Segment 1 (−Seg1) and a longer Segment 2 (+Seg2) present in NcLPMO9C and the opposite for NcLPMO9M (+Seg1−Seg2). We confirmed that NcLPMO9C cleaved the non-reducing end of unbranched glucosyl residues within XG via the oxidation of the C4-carbon. In contrast, we found that the oxidative cleavage of the XG backbone by NcLPMO9M occurred next to both unbranched and substituted glucosyl residues. The latter are decorated with xylosyl, xylosyl–galactosyl and xylosyl–galactosyl–fucosyl units. The relationship between active site segments and the mode-of-action of these NcLPMOs was rationalized by a structure-based phylogenetic analysis of fungal AA9 LPMOs. LPMOs with a −Seg1+Seg2 configuration clustered together and appear to have a similar XG substitution-intolerant cleavage pattern. LPMOs with the +Seg1−Seg2 configuration also clustered together and are reported to display a XG substitution-tolerant cleavage pattern. A third cluster contained LPMOs with a −Seg1−Seg2 configuration and no oxidative XG activity. Conclusions The detailed characterization of XG degradation products released by LPMOs reveals a correlation between the configuration of active site segments and mode-of-action of LPMOs. In particular, oxidative XG-active LPMOs, which are either tolerant or intolerant to XG substitutions, are structurally and phylogenetically distinguished from XG-inactive LPMOs. This study contributes to a better understanding of the structure–function relationship of AA9 LPMOs.


Background
For establishing sustainable processes and a circular economy, plant biomass is an essential source for the production of fuels and chemicals, in particular, to replace fossil-based resources [1]. Plant biomass dry matter is mainly composed of plant cell wall polymers, which are present in the middle lamella, primary and secondary cell wall layers [2]. The primary cell wall is mainly built of pectin, cellulose and hemicellulosic xyloglucan (XG), while the secondary cell wall is mostly composed of cellulose, hemicellulosic xylan or mannan and the aromatic polymer lignin [3][4][5][6]. While the dry matter of dicotyledonous plant biomass is in general mainly composed of primary cell wall components, the dry matter of other biomass, such as grasses and wood, is mainly composed of secondary cell wall components [3][4][5][6]. An important step in biomass-based processes is the release of fermentable carbohydrates. In the last decade, monocopper-dependent lytic polysaccharide monooxygenases (LPMOs) have been shown to assist glycosyl hydrolases as green and effective tools for biomass polysaccharide degradation [7][8][9][10]. In this research, we aimed to understand how LPMOs oxidatively cleave XG, and hypothesized that the mode-of-action of LPMOs towards XG correlates with their active site configuration.
Whether these different XG-cleavage pathways result from distinct XG-binding sites neighboring the catalytic site of the LPMOs has yet to be defined. Active site structures of LPMOs interacting with cellulosic substrates have already been reported [24,34,35,36,37,38], but information about relevant binding sites of XG is still scarce in the literature. Courtade and coworkers have shown through NMR analysis that a so-called L3 loop around the active site of the XG-active NcLPMO9C strongly interacted with XG [35]. This L3 loop has also been shown to be present in other XG-active LPMOs like PaLPMO9H [31] and MtLPMO9J [22]. However, in another XG-active LPMO, GtLPMO9A-2, the L3 loop is absent. Instead, GtLPMO9A-2 has an extended L2 loop [28]. This difference might indicate that the configuration of segments around the AA9 LPMO active site influences their catalytic behavior on XG. The definition of the loops (L2, L3, LS and LC) around the active site has previously been suggested [38][39][40], and these loops were further redefined as segments in our previous study due to the presence of secondary structure elements [41]. Briefly, five segments (Seg1-Seg5) were defined, of which Seg1, Seg2, Seg3 and Seg5 are comparable, but slightly different, to the previously defined L2, L3, LS and LC regions, respectively (see also Fig. 1). Seg4 was newly defined and has not been described before.
In this work, two distinct product profiles of two different XG-active LPMOs from Neurospora crassa (NcLPMO9C and NcLPMO9M) were characterized by identification of the formed non-oxidized and oxidized XG oligosaccharides. In addition to various other chromatographic techniques, hydrophilic interaction chromatography coupled with electrospray ionization-collision induced dissociation-mass spectrometry (HILIC-ESI-CID-MS/MS) was used. To test our hypothesis that the mode-of-action of LPMOs towards XG is a result of their specific structural configuration around the active site, a structure-based sequence analysis of AA9 LPMOs was performed. The resulting phylogenetic tree shows three distinct groups, which not only differ in structural active site segments, but also seemingly correlate to the oxidative XG cleavage being either tolerant or intolerant to substitutions, and to XG-inactive LPMOs.

Results
NcLPMO9C and NcLPMO9M and their oxidative XG cleavage patterns
Two LPMOs from N. crassa with different active site segment configurations (NcLPMO9C and NcLPMO9M; Fig. 1) were tested for their mode-of-action towards XG. As presented in Fig. 1, NcLPMO9C has a short (−) Seg1 and a long (+) Seg2, whereas NcLPMO9M has a + Seg1 − Seg2 configuration [41]. We first monitored the mode-of-action of the two NcLPMOs on TXG by profiling the molecular weight (MW) distribution of NcLPMO9M- and NcLPMO9C-TXG-digests during incubation using high-performance size exclusion chromatography coupled to a refractive index detector (HPSEC-RI) (Fig. 2; Additional file 1: Fig. S1). The MW distribution of both NcLPMO-TXG-digests after 24 h incubation showed only little change in the absence of ascorbic acid (Asc) (Additional file 1: Fig. S1), which showed that these enzyme preparations were almost free of hydrolytic side activities. However, upon addition of Asc an autooxidation of the TXG could be observed, resulting in a visible decrease in the MW distribution after 24 h (Additional file 1: Fig. S1). Therefore, the MW distributions of the NcLPMO digests (with Asc) were compared to the ones of TXG without enzyme but with Asc (24 h; Fig. 2).
Already after 2 h, the products formed by NcLPMO9M had a lower MW range than those formed by NcLPMO9C, indicating that the two LPMOs have distinct modes-of-action on TXG (Fig. 2). To be more precise, NcLPMO9M formed two rather broad populations (Fig. 2a), one ranging from 30-200 kDa and another ranging from 1-30 kDa, while NcLPMO9C formed two larger MW populations (80-700 kDa and 1-80 kDa, Fig. 2b). Notably, oxidative XG cleavage by NcLPMO9M has not been reported previously and neither have the MW distributions of XG digests of these NcLPMOs. Further, as seen from the MW profiles (Fig. 2a), after 8 h the NcLPMO9M TXG degradation was complete: no high MW population (30-200 kDa) of XG remained, and final products ranged from 0.4 to 3 kDa (Fig. 2a). In contrast, for the NcLPMO9C-TXG-digest the high MW XG population (80-700 kDa) remained and a decrease in MW of the products was observed even between 8 h and 24 h of incubation (Fig. 2b). The final digest was composed of products ranging from 0.4 to 3 kDa and showed a different MW distribution profile than the 24 h NcLPMO9M-TXG-digest (Fig. 2b).
To learn more about the exact cleavage sites in the TXG for both NcLPMOs, the formed TXG oligosaccharides were characterized in detail. First, the digests were analyzed by HPAEC-PAD and the corresponding chromatograms are shown in Fig. 3. For comparison, the commercial xyloglucanase (XEG)-TXG-digest (Fig. 3g) and commercial non-oxidized TXG oligosaccharide (XXXG, XLXG, XXLG and XLLG) standards (Fig. 3h) were analyzed, of which the annotation of HPAEC-peaks has been well defined in literature [42][43][44]. The control reactions (Fig. 3b, d) did not show the formation of (detectable) oligosaccharides, which confirms the absence of hydrolytic xyloglucanase (side-)activities. In the presence of Asc, both NcLPMOs released noticeably different types of TXG oligosaccharides (Fig. 3a, c), underlining the differences in the above-described MW distributions (Fig. 2). The TXG-digest of NcLPMO9C has been described previously and our HPAEC profile corresponds with the published one [32]. However, the annotation, in particular of the non-oxidized products, seems to be different compared to the previous research. Based on our results, the common non-oxidized "XXXG"-type products were not present in the NcLPMO9C-TXG-digest (Fig. 3a). Our annotation was based on (i) comparison with the XEG-TXG-digest and standards of a mixture of XXXG, XLXG, XXLG and XLLG (Fig. 3g, h), and (ii) β-galactosidase treatment of the NcLPMO9C- and XEG-TXG-digest to confirm that L units were degraded to X units (Additional file 1: Fig. S2). Indeed, β-galactosidase treatment of the XEG-TXG-digest (Additional file 1: Fig. S2b) resulted in removal of XLXG, XXLG and XLLG, and only XXXG remained. In addition, XXG was formed, confirmed by MALDI-TOF-MS (Additional file 1: Fig. S3a; m/z 775.3 (lithium (Li)-adduct, [M+Li] + )), due to the presence of isoprimeverase in the commercial β-galactosidase [45,46], which was further substantiated by the formation of isoprimeverose (X unit) (Additional file 1: Fig. S2b). In contrast, β-galactosidase treatment of the NcLPMO9C-TXG-digest mainly resulted in XXX (Additional file 1: Fig. S2d), which was confirmed by MALDI-TOF-MS (Additional file 1: Fig. S3b; m/z 907.3 ([M+Li] + )), and no other main non-oxidized compounds remained. Again, minor isoprimeverase side-activity was seen, resulting in formation of X and XX. The peak representing XXX was also present in the NcLPMO9C-TXG-digest without β-galactosidase treatment, in addition to three peaks now defined as XLX, XXL and XLL. These last three peaks were removed by the β-galactosidase treatment, which confirmed the presence of L units. It should be noted that in previous research studying LPMO activity towards TXG, the non-oxidized oligosaccharides now annotated as XXX, XLX, XXL and XLL were incorrectly suggested to be XXXG, in addition to XLXG, XXLG and XLLG [22,24,29,31,32].
Fig. 3 (caption fragment) ... in the presence of Asc (g) were added as the reference. In addition, TXG only (f), TXG with 1 mM Asc (e), TXG oligosaccharide standards (xyloglucan hepta + octa + nona saccharides; h) and a standard (i) containing a mixture of cellobiose, cellotriose, cellotetraose, cellopentaose and cellohexaose (from left to right in chromatogram) are shown
The HPAEC pattern of the NcLPMO9M-TXG-digest showed considerably more oligosaccharide peaks than the TXG-digest of NcLPMO9C. As the type of oligosaccharides formed (especially the oxidized ones) cannot be identified with HPAEC without standards, further characterization of degraded TXG oligosaccharides was carried out by MALDI-TOF-MS and HILIC-ESI-CID-MS/MS.
The NcLPMO9M-TXG-digest showed m/z-values ([M+Li] + ) corresponding to many different types of TXG oligosaccharides (e.g., H 7 P 5 ; Fig. 4b). The NcLPMO9M-TXG-digest was again composed of both non-oxidized (e.g., H 5 P 3 ; m/z 1231.4) and oxidized oligosaccharides (e.g., Ox-H 5 P 3 ; m/z 1229.4 and 1247.4; Fig. 4b). The m/z difference of +16 suggested the occurrence of C1-oxidation and can be explained by the spontaneous hydrolysis of the unstable δ-lactone form (− 2 Da) into the aldonic acid form (+ 16 Da) [23,30,47]. Although some studies have shown that an m/z difference of +16 could also be attributed to the gem-diol form of the C4-oxidized products, other studies, e.g., in our laboratory, using the same MALDI-TOF-MS settings as in the current work, did not observe an m/z difference of +16 for C4-oxidized products [30,47]. Therefore, we suggest that TXG was most likely oxidatively cleaved by NcLPMO9M at the C1 position. Still, occurrence of C4-oxidation could not be excluded, because of the presence of oxidized oligosaccharides with an m/z difference of − 2. These masses (M − 2) not only represent the unstable δ-lactone form, but also the keto-form of C4-oxidized oligosaccharides [30,32,48].
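As a rough cross-check of the lithium-adduct masses quoted above, the following sketch recomputes [M+Li] + values from the hexose/pentose composition. It is an illustration only and not part of the original analysis: the function name `li_adduct_mz` is hypothetical, the residue masses are standard monoisotopic values rounded to four decimals, and oxidation is modeled simply as a loss of H2 (δ-lactone or keto form, M − 2) or a gain of one oxygen (aldonic acid, M + 16).

```python
# Monoisotopic masses in Da (standard values, rounded to four decimals).
HEXOSE = 162.0528      # anhydro-hexose residue C6H10O5 ("H": Glc, Gal)
PENTOSE = 132.0423     # anhydro-pentose residue C5H8O4 ("P": Xyl)
WATER = 18.0106        # one water per intact oligosaccharide
LITHIUM_ION = 7.0155   # 7Li+ (atom mass minus one electron)
H2 = 2.0157            # lost on oxidation to lactone or keto form
OXYGEN = 15.9949       # net gain for the aldonic acid form


def li_adduct_mz(n_hex, n_pent, shift=0.0):
    """m/z of the singly charged [M+Li]+ ion of an HnPm oligosaccharide.

    `shift` is an assumed mass change caused by oxidation (0 for non-oxidized).
    """
    neutral = n_hex * HEXOSE + n_pent * PENTOSE + WATER
    return neutral + shift + LITHIUM_ION


non_oxidized = li_adduct_mz(5, 3)                    # H5P3
lactone_or_keto = li_adduct_mz(5, 3, shift=-H2)      # Ox-H5P3, M - 2
aldonic_acid = li_adduct_mz(5, 3, shift=OXYGEN)      # Ox-H5P3, M + 16

for label, mz in [("H5P3", non_oxidized),
                  ("Ox-H5P3 (M-2)", lactone_or_keto),
                  ("Ox-H5P3 (M+16)", aldonic_acid)]:
    print(f"{label}: m/z {mz:.1f}")
```

Under these assumptions the three values round to the m/z 1231.4, 1229.4 and 1247.4 mentioned in the text. A mass of M − 2 alone cannot tell the δ-lactone (C1) from the keto form (C4), which is why the fragmentation analysis is needed.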

Unambiguous structural characterization of XG degradation products generated by NcLPMO9C and NcLPMO9M
To further identify the exact TXG cleavage sites of the two NcLPMOs, digests were subjected to negative ion mode HILIC-ESI-CID-MS/MS. Similar to the data discussed above ( compared to the same degree of polymerization (DP) of non-oxidized oligosaccharides. Secondly, masses that could be either C1-oxidized products or, based on their mass, formic acid adducts of non-oxidized products were observed (Fig. 5). For instance, m/z 1107 could represent the C1-oxidized H 5 P 2 , but also the formic acid adduct of non-oxidized H 4 P 3 (Additional file 1: Table S1). Nevertheless, corresponding MS/MS data easily distinguished formic acid adducts as these products showed a clear fragment of m/z − 46 (formic acid; data not shown).
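The ambiguity at m/z 1107 can be illustrated with a short calculation. The sketch below is not taken from the original work: it assumes negative-mode [M−H] − ions, models the C1-oxidized product as the aldonic acid (one extra oxygen), and models the formic acid adduct as [M+HCOOH−H] − ; the helper names are hypothetical and the masses are standard monoisotopic values rounded to four decimals.

```python
# Monoisotopic masses in Da (standard values, rounded to four decimals).
HEXOSE = 162.0528       # anhydro-hexose residue ("H")
PENTOSE = 132.0423      # anhydro-pentose residue ("P")
WATER = 18.0106
OXYGEN = 15.9949
PROTON = 1.0073
FORMIC_ACID = 46.0055   # HCOOH


def neutral_mass(n_hex, n_pent):
    """Neutral monoisotopic mass of a non-oxidized HnPm oligosaccharide."""
    return n_hex * HEXOSE + n_pent * PENTOSE + WATER


def deprotonated_mz(mass):
    """m/z of the singly charged [M-H]- ion."""
    return mass - PROTON


# Candidate 1: C1-oxidized H5P2, assumed to be in the aldonic acid form.
c1_oxidized_h5p2 = deprotonated_mz(neutral_mass(5, 2) + OXYGEN)

# Candidate 2: formic acid adduct of non-oxidized H4P3.
formate_adduct_h4p3 = deprotonated_mz(neutral_mass(4, 3) + FORMIC_ACID)

# Losing formic acid (46 Da) from the adduct gives back plain [M-H]- of H4P3.
adduct_after_loss = formate_adduct_h4p3 - FORMIC_ACID

print(f"C1-oxidized H5P2:       m/z {c1_oxidized_h5p2:.3f}")
print(f"H4P3 + formic acid:     m/z {formate_adduct_h4p3:.3f}")
print(f"adduct after 46 Da loss: m/z {adduct_after_loss:.3f}")
```

Because a hexose and a pentose residue differ by CH2O, the two candidates have the same elemental composition, so no mass accuracy could separate them at the MS level. Only the MS/MS loss of 46 identifies the adduct.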
Due to the complexity of multiple charges and formic acid adducts, the intensity of the MS/MS spectra was too poor for structural elucidation. The spectral quality improved considerably after having established MS and MS/MS analysis via a defined mass list (Additional file 1:

Characterization of non-oxidized TXG oligosaccharide products
Multiple non-oxidized TXG oligosaccharides released by the two NcLPMOs were identified (see Additional file 1: Figs. S4, S5 for examples). A summary of all MS/MS fragments and structural annotations can be found in Additional file 1: Table S2 (for NcLPMO9C) and Additional file 1: Table S3 (for NcLPMO9M). MS/MS fragments of non-oxidized products were annotated following the principle of predominance of C/Z-type and A-type fragments of neutral oligosaccharides in negative MS-mode [49,50]. In addition, a double C/Z-type cleavage on three linked sugar residues was observed and annotated as D-type (Additional file 1: Figs. S4, S5), which has previously been reported for TXG oligosaccharides [50]. Overall, non-oxidized XXX (m/z 899.3, Additional file 1: Table S2). These non-oxidized "XXX"-type TXG oligosaccharides reflected cleavage at the non-reducing end of an unbranched glucosyl unit in TXG (see below). In summary, 19 different non-oxidized TXG oligosaccharides released by NcLPMO9M were identified (Additional file 1: Table S3, Fig. S5).

Characterization of C4-oxidized TXG oligosaccharide products
Based on our previous study on CID-MS/MS fragmentation patterns of C4-oxidized cello-oligosaccharides [51], we identified multiple structures of C4-oxidized TXG oligosaccharides, which are shown in Tables 2 and 3, for NcLPMO9C and NcLPMO9M, respectively. In the NcLPMO9C-TXG-digest, we found several "XXXG"-type C4-oxidized products such as O=G GXXX (m/z 1059.4, O=G indicates the C4-oxidized glucosyl unit), O=G GXLX (m/z 1221.5), O=G GXXL (m/z 1221.5), O=G GLXX (m/z 1221.5) and O=G G(H 5 P 3 ) (m/z 1383.7) (Table 2). To explain the identification of these compounds, for instance through annotation of MS/MS fragments of O=G GXXX (m/z 1059.4, Fig. 6a) and O=G GXLX (m/z 1221.5, Fig. 6b), a fragment (Y 4 ) was observed having the terminal oxidized unbranched glucosyl residue removed via B/Y-cleavage (m/z difference of 160 compared to the parent m/z). In addition, the diagnostic cross-ring fragment 2,4 X 4 confirmed the single C4-oxidation on an unbranched glucosyl unit. This diagnostic cleavage fragment has been shown for C4-oxidized cello-oligosaccharides as well [51]. Additionally, to a much lesser extent, oligosaccharides with a C4-oxidized terminal X unit were determined, such as in O=G XXXG (m/z 1059.4) and O=G X(H 4 P 2 ) (m/z 1221.5). Again, fragments resulting from B/Y-cleavage of the glycosidic linkage between the glucosyl units next to the C4-oxidized glucosyl unit were observed in the MS/MS spectra. Fragments of (m/z) 767 and 929 showed a 292 m/z difference compared to the parent m/z of 1059 and 1221, respectively. The 292 m/z difference indicated the loss of the oxidized glucosyl unit (m/z 160) substituted with a xylosyl residue (m/z 132).
(Figure caption fragment) ... O=G indicates that the oxidation is on the glucosyl unit in keto-form. Oxidation of the C4-carbon position is indicated in red. The fragments are annotated according to the nomenclature proposed by Domon and Costello [49]
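The diagnostic mass differences of 160 and 292 follow directly from residue masses, as the sketch below illustrates. It is a worked example under stated assumptions rather than the authors' procedure: the parent ions are taken as singly charged [M−H] − ions of keto-form C4-oxidized oligosaccharides, the helper name `c4_oxidized_parent_mz` is hypothetical, and the masses are standard monoisotopic values rounded to four decimals.

```python
# Monoisotopic masses in Da (standard values, rounded to four decimals).
HEXOSE = 162.0528    # anhydro-hexose residue ("H")
PENTOSE = 132.0423   # anhydro-pentose residue ("P", xylosyl)
WATER = 18.0106
H2 = 2.0157          # lost when C4-OH is oxidized to the keto form
PROTON = 1.0073

# Residue masses lost from the non-reducing end in a B/Y-type cleavage.
OX_GLC = HEXOSE - H2           # oxidized unbranched glucosyl unit
OX_X_UNIT = OX_GLC + PENTOSE   # oxidized glucosyl unit carrying one xylosyl


def c4_oxidized_parent_mz(n_hex, n_pent):
    """[M-H]- of a keto-form C4-oxidized HnPm oligosaccharide (assumed form)."""
    return n_hex * HEXOSE + n_pent * PENTOSE + WATER - H2 - PROTON


parents = {"Ox-H4P3": (4, 3), "Ox-H5P3": (5, 3)}
fragments = {}
for name, (n_hex, n_pent) in parents.items():
    parent = c4_oxidized_parent_mz(n_hex, n_pent)
    fragments[name] = (parent, parent - OX_GLC, parent - OX_X_UNIT)
    print(f"{name}: parent {parent:.1f}, "
          f"after loss of oxidized G {parent - OX_GLC:.1f}, "
          f"after loss of oxidized X {parent - OX_X_UNIT:.1f}")
```

With these assumptions the nominal parent masses are 1059 and 1221, and loss of an oxidized X unit gives nominal fragments at 767 and 929, in line with the values discussed above. Which of the two losses is observed is what distinguishes an oxidized terminal G from an oxidized terminal X.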
C4-oxidized TXG oligosaccharides released by NcLPMO9M were different from the ones formed by NcLPMO9C, which is summarized in Table 3. First, two small motifs, O=G GX and O=G XG (both m/z 471.2), were identified. The single C4-oxidation on these G and X units (Table 3).

Table 2 List of C4-oxidized XG oligosaccharides identified based on fragmentation patterns in CID-MS/MS present in the NcLPMO9C-TXG-digest. Chromatograms, including peak numbers, are shown in Fig. 5

Among these structures, the single C4-oxidation of G and X units was elucidated by MS/MS fragments having m/z differences of 160 and 292 from their parent m/z, respectively, as described previously. An example for the identification of O=G L units in MS/MS is shown in Fig. 7a, where the B 3 (m/z 453) indicated the oxidation on the H 2 P 1 structure ( O=G H 2 P 1 ). However, O=G H 2 P 1 has three isomeric structures: O=G L, O=G XG and O=G GX. These three structures were further distinguished by the ion B 4 ( O=G H 3 P 1 , m/z 615) and the cross-ring fragment 2,4 X 3 (an X unit and a cross-ring cleaved G unit, m/z 413). Altogether, including the m/z of the parent oligosaccharide ( O=G H 4 P 2 , m/z 927.3), it is concluded that O=G LGX represented m/z 927.3.
All above-mentioned motifs were generated by the oxidative XG cleavage of NcLPMO9M at the non-reducing end of substituted glucosyl units from "XXXG"-type building blocks of TXG. Furthermore, the C4-oxidized oligosaccharides having an m/z-value of 1059.5 ( O=G H 4 P 3 ) in the NcLPMO9M-TXG-digest were composed of mainly O=G XXGX and O=G XGXX instead of compounds having terminal G units (for example O=G GXXX and O=G XXXG in the NcLPMO9C-TXG-digest). Similarly, an m/z-value of 1221.5 was also annotated as mainly O=G XGXL, O=G XGLX and O=G LXGX and an m/z-value of 1383.7 was O=G XGLL (only one was identified, Fig. 7b) in the NcLPMO9M-TXG-digest.

Characterization of C1-oxidized TXG oligosaccharide products
C1-oxidized products were only detected in the NcLPMO9M-TXG-digest. However, due to the poor signal intensity and heavy co-elution of all C1-oxidized products in HILIC-ESI-MS, these products could not be structurally identified. Nevertheless, the presence of the parent masses of C1-oxidized products confirmed that NcLPMO9M resulted in both C1- and C4-oxidized XG oligosaccharides.

Characterization of (oxidized) BCXG oligosaccharide products
We further analyzed the cleavage patterns of NcLPMO9C- and NcLPMO9M-digests towards BCXG, which is a XG having additional F units (glucosyl-xylosyl-galactosyl-fucosyl residue; Table 1), again by using HILIC-ESI-CID-MS/MS (Additional file 1: Fig. S6). The HILIC-ESI-MS base-peak chromatograms of the two NcLPMO-BCXG-digests showed once more the striking difference between the patterns (Additional file 1: Fig. S6a, b). Due to the high complexity, not all (oxidized) BCXG degradation products released by the LPMOs were fully elucidated. Nevertheless, in the NcLPMO9C-BCXG-digest, we were able to identify BCXG oligosaccharides with a C4-oxidized terminal G unit (e.g., O=G GXXF, m/z 1367.7, Additional file 1: Fig. S6c), which is absent in the NcLPMO9M-BCXG-digest. Interestingly, a diagnostic C4-oxidized F unit ( O=G F(H 3 P 2 ), m/z 1367.7, Additional file 1: Fig. S6d) was identified in the NcLPMO9M-BCXG-digest, which was absent in the NcLPMO9C-BCXG-digest. The identified C4-oxidized F unit indicated that oxidative cleavage of BCXG by NcLPMO9M also occurred next to the extensively substituted glucosyl units.

Distinct mode-of-action of NcLPMO9C and NcLPMO9M towards XG
In this study, the structures of oxidized XG oligosaccharides generated from XG by the two NcLPMOs, NcLPMO9C (Table 2) and NcLPMO9M (Table 3), were unambiguously elucidated. In the NcLPMO9C-TXG-digest, TXG oligosaccharides were found mostly to be typical "XXXG"-type block units, but with C4-oxidized unbranched G units (e.g., O=G GXXX, O=G GXLX, O=G GXXL, O=G GLXX and O=G G(H 5 P 3 )). Another C4-oxidized "XXXG"-type product ( O=G GXXF) was identified in the NcLPMO9C-BCXG-digest. In contrast, non-"XXXG"-type C4-oxidized TXG oligosaccharides were identified in the NcLPMO9M-TXG-digest. The C4-oxidation of TXG oligosaccharides by NcLPMO9M on X and L units confirmed that NcLPMO9M can oxidize substituted glucosyl units at the C4-carbon. In addition, the fact that the oxidation in the TXG oligosaccharides characterized by HILIC-ESI-CID-MS/MS was predominantly found on X and L units, instead of on unbranched G units, may reflect that NcLPMO9M has a preference for cleaving the substituted glucosyl backbone. The identified C4-oxidized F unit from the NcLPMO9M-BCXG-digest further indicated that the oxidative cleavage of XG by NcLPMO9M is independent of the type and length of the branches. Based on these determined XG cleavage sites, it was concluded that NcLPMO9C oxidatively cleaves XG predominantly at the non-reducing end of single unbranched glucosyl units [32], further referred to as a substitution-intolerant mode-of-action towards XG (in brief "Substitution-intolerant") (Fig. 8). In contrast, the oxidative cleavage of XG by NcLPMO9M was shown to be more tolerant to substitutions, with even a preference for cleavage next to substituted glucosyl units, and is referred to as "Substitution-tolerant" (Fig. 8).

Phylogenetic and structural analysis of LPMOs with XG activity
To test our hypothesis that the mode-of-action of AA9 LPMOs towards XG is dependent on the type of active site segments, as showcased by NcLPMO9C and NcLPMO9M (Fig. 8), amino acid sequence alignment and phylogenetic analysis were conducted. Here, all characterized fungal AA9 LPMOs (cellulose-active and XG-(plus cellulose)active LPMOs) and a number of randomly selected uncharacterized AA9 LPMOs from the CAZy database were compared. We first aligned the mature amino acid sequences (Additional file 2), which revealed three main clusters, and generated an unrooted "full-length" (FL) phylogenetic tree (Additional file 1: Fig. S7). The clustering of AA9 LPMOs into three groups has already been described in literature [36,40,41,52,53,54], but has never been used for comparisons of active site segments and XG catalytic behavior. Next, only the amino acids of the five active site segments (Seg1-Seg5, based on the definition described in our previous study [41]) were aligned (Additional file 3) and subjected to a phylogenetic analysis. The resulting structure-based "segments-only" (SO) phylogenetic tree (Additional file 1: Fig. S8; Fig. 9) shows three main clusters: one with the structural features + Seg1 − Seg2 (red area), the second defined as − Seg1 + Seg2 (light blue area) and the third defined as − Seg1 − Seg2 (yellow area). A sub-cluster with a − Seg1 + Seg2 feature was found (dark blue area in Fig. 9), but mostly with an extended Seg3 ( − Seg1 + Seg2 + Seg3).

Homology of active site segments of XG-active and XG-inactive LPMOs
As previously described, NcLPMO9C and NcLPMO9M have a different catalytic site configuration in terms of neighboring segments, in particular for Seg1 and Seg2 (Fig. 1). In this research, we characterized NcLPMO9C as "Substitution-intolerant" and NcLPMO9M as "Substitution-tolerant". From this, we hypothesized that the long/short Seg1 and Seg2 is a generic feature amongst AA9 LPMOs altering their interaction with XG, which further steers their mode-of-action in degrading XG. Indeed, the characterized NcLPMOs belong to different clusters of the structure-based SO phylogenetic tree of AA9 LPMOs (Fig. 9). Whether other characterized AA9 LPMOs, shown in the three clusters, have been reported to represent "Substitution-intolerant" or "Substitution-tolerant" oxidative cleavage activities is discussed here. Note that all discussed AA9 LPMOs are able to oxidatively cleave cellulose. For ease of structural comparison, published three-dimensional structures or homology models of selected characterized AA9 LPMOs from each of the three main phylogenetic clusters are shown in Additional file 1: Fig. S9.
In the + Seg1 − Seg2 cluster (red area in Fig. 9), a "Substitution-tolerant" mode-of-action was found for all characterized LPMOs, except for PaLPMO9D (No. 17 in Fig. 9), which was determined to be "Inactive", although only based on a colorimetric H 2 O 2 -production assay [31]. A similar conclusion of "Inactive" for PaLPMO9B (No. 62 in Fig. 9) and PaLPMO9E (No. 84 in Fig. 9) was also drawn based on the H 2 O 2 -production assay [31]. As only a repression of the H 2 O 2 production of the LPMOs is measured with this peroxidase assay, it cannot be concluded whether these LPMOs truly show no oxidative cleavage of XG. Hence, to confirm their (non-) XG activity, a more detailed chromatography- and mass spectrometry-based analysis is required.
In the cluster of − Seg1 + Seg2 (light blue area in Fig. 9), NcLPMO9A (No. 47 in Fig. 9), having a high structural similarity to NcLPMO9C (No. 43 in Fig. 9) and NcLPMO9D (No. 51 in Fig. 9), displayed no activity on XG alone [20]. NcLPMO9A showed the "Substitution-intolerant" degradation only when cellulose was present [20] and is apparently an exception in this cluster. From the same cluster, PaLPMO9H (No. 45 in Fig. 9) was reported as a "Substitution-tolerant" LPMO by using direct infusion mass spectrometry [26]. However, in another study, the HPAEC chromatogram of a PaLPMO9H-TXG-digest showed a more "Substitution-intolerant" behavior [31]. Again, a more detailed chromatography- and mass spectrometry-based analysis is required to unambiguously define the mode-of-action of PaLPMO9H towards XG. Nevertheless, taking a closer look at the PaLPMO9H structure, it appeared that this enzyme has a higher content of hydrophobic amino acid residues (F, W, Y) in Seg1, fewer charged residues but a higher negative net charge in Seg3, and one additional positively charged residue in Seg4 (Additional files 2 and 3), compared to NcLPMO9C.
Also, in the − Seg1 − Seg2 cluster (yellow area in Fig. 9) some exceptions were annotated. For example, AN3046 (No. 73 in Fig. 9) was reported to be active towards XG based on MALDI-TOF-MS data [29]. However, these data remain to be verified with other analytical techniques, as the reported MALDI-TOF mass spectra only showed aldonic acid forms, while m/z-values of δ-lactone forms were absent. Detection of aldonic acids without δ-lactones in MALDI-TOF-MS analysis of LPMO-digests has not been observed in other studies. In addition, only XXLG ox and XLLG ox were detected in the LPMO-TXG digest, while the more common XXXG ox block was not found [29]. Another candidate in the − Seg1 − Seg2 cluster (yellow area in Fig. 9) that is still difficult to classify is TtLPMO9E (No. 76 in Fig. 9), which has been reported as "Inactive" when using Asc as electron donor, but as active when reduced by photosynthetic pigments with light [33]. The above special cases, together with LPMOs not yet tested on XG, further exemplify the difficulties and pitfalls in understanding LPMO mode-of-action towards XG based on their active site segment configuration. The latter can only be properly understood if not only the experimental conditions and assays used are carefully considered, but also a detailed characterization of LPMO-XG degradation products is performed, which further reflects the importance of our research. Hence, careful characterization of more LPMO modes-of-action towards XG is highly recommended to further understand how active site segments steer the XG degradation by AA9 LPMOs.
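The cluster-level correlation discussed above can be summarized as a simple lookup from segment configuration to the mode-of-action that the SO phylogenetic tree suggests. The sketch below is only a compact restatement of that correlation, not a predictive tool: the names `SegmentConfig`, `EXPECTED_MODE` and `expected_xg_mode` are hypothetical, a + Seg1 + Seg2 combination is not described in the text and is therefore reported as such, and the exceptions named above (for example NcLPMO9A, PaLPMO9H, AN3046 and TtLPMO9E) show that an individual enzyme can deviate from its cluster.

```python
from typing import NamedTuple


class SegmentConfig(NamedTuple):
    """Length class of the two segments that define the three main clusters."""
    seg1_long: bool  # True for +Seg1, False for -Seg1
    seg2_long: bool  # True for +Seg2, False for -Seg2

    def label(self) -> str:
        sign1 = "+" if self.seg1_long else "-"
        sign2 = "+" if self.seg2_long else "-"
        return f"{sign1}Seg1{sign2}Seg2"


# Correlation as stated in the text; cluster colors refer to Fig. 9.
EXPECTED_MODE = {
    SegmentConfig(False, True): "Substitution-intolerant",   # light blue
    SegmentConfig(True, False): "Substitution-tolerant",     # red
    SegmentConfig(False, False): "Inactive",                 # yellow
}


def expected_xg_mode(config: SegmentConfig) -> str:
    """Return the cluster-level expectation, or a note if none is described."""
    return EXPECTED_MODE.get(config, "Not described")


nc_lpmo9c = SegmentConfig(seg1_long=False, seg2_long=True)
nc_lpmo9m = SegmentConfig(seg1_long=True, seg2_long=False)

for name, cfg in [("NcLPMO9C", nc_lpmo9c), ("NcLPMO9M", nc_lpmo9m)]:
    print(f"{name}: {cfg.label()} -> {expected_xg_mode(cfg)}")
```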

Conclusions
In this study, we described two distinct XG degradation patterns generated by two AA9 NcLPMOs representing different configurations of active site segments. The oxidative cleavage of XG by NcLPMO9C predominantly occurred at the non-reducing end of single unbranched glucosyl units ("Substitution-intolerant"), while NcLPMO9M displayed a more substitution-tolerant cleavage behavior ("Substitution-tolerant"). Based on active site segment phylogeny of AA9 LPMOs, "Substitution-intolerant" was found to correlate to the configuration − Seg1 + Seg2, while "Substitution-tolerant" correlated to + Seg1 − Seg2. These findings support the hypothesis that the mode-of-action of AA9 LPMOs towards XG is based on the distinct structural features of their active site segments.

Catalytic performance of XEG, NcLPMO9C and NcLPMO9M on XG
Expression, production and purification of NcLPMO9C and NcLPMO9M were described previously [41]. XG substrates (TXG or BCXG, 2 mg/mL) were dissolved in 50 mM ammonium acetate buffer (pH 5.0) with the addition of Asc (1 mM final concentration). Subsequently, XEG, NcLPMO9C and NcLPMO9M were added to a concentration of 1.25 µM. Control reactions were performed without the addition of Asc. Single 200 µL reactions were incubated in an Eppendorf ThermoMixer ® C at 800 rpm (in a vertical orientation) and reactions used to produce the time curves were incubated in a head-over-tail rotator at 20 rpm (5 mL total volume). NcLPMO9C and NcLPMO9M reactions were incubated at 30 °C while the XEG reaction was incubated at 50 °C. All reactions were performed in duplicate. To create a time curve for NcLPMO9C and NcLPMO9M, a larger reaction volume of 500 µL was sampled at 0, 1, 2, 4, 8 and 24 h after enzyme addition. The reactions were stopped by incubating for 10 min at 97 °C in an Eppendorf ThermoMixer ® C. Subsequently, the supernatant was recovered after centrifugation in a Hermle Z 233 MK-2 centrifuge at 22000×g (Rotor: 220.87 VO5/6) for 20 min and stored at − 20 °C until further usage. Parts of the XEG- and NcLPMO9C-TXG-digests were further treated with β-galactosidase (GH35 from Aspergillus niger, Megazyme), which is further described in Additional file 1.
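For orientation, the absolute amounts present in one 200 µL reaction can be derived from the stated concentrations. The sketch below is a simple unit calculation and not part of the original protocol; the variable and function names are hypothetical, and it only uses the concentrations and volume given above.

```python
# Concentrations and volume taken from the reaction conditions in the text.
REACTION_VOLUME_UL = 200.0     # single reaction volume (µL)
SUBSTRATE_MG_PER_ML = 2.0      # TXG or BCXG
ASCORBATE_UM = 1000.0          # 1 mM ascorbic acid, expressed in µM
ENZYME_UM = 1.25               # XEG, NcLPMO9C or NcLPMO9M


def picomoles(conc_um, volume_ul):
    """Amount in pmol: (µmol/L) x (µL) = pmol."""
    return conc_um * volume_ul


substrate_mg = SUBSTRATE_MG_PER_ML * REACTION_VOLUME_UL / 1000.0
ascorbate_nmol = picomoles(ASCORBATE_UM, REACTION_VOLUME_UL) / 1000.0
enzyme_pmol = picomoles(ENZYME_UM, REACTION_VOLUME_UL)
ascorbate_per_enzyme = ASCORBATE_UM / ENZYME_UM

print(f"substrate: {substrate_mg:.2f} mg")
print(f"ascorbic acid: {ascorbate_nmol:.0f} nmol")
print(f"enzyme: {enzyme_pmol:.0f} pmol")
print(f"molar ratio ascorbic acid : enzyme = {ascorbate_per_enzyme:.0f} : 1")
```

Each 200 µL reaction therefore holds 0.4 mg of substrate, 200 nmol of ascorbic acid and 250 pmol of enzyme, which corresponds to an 800-fold molar excess of reductant over enzyme.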

HPSEC analysis for molecular weight distribution of (degraded) TXG
TXG and corresponding digests were analyzed by HPSEC-RI for their molecular weight distribution. Instrument settings, column and elution program were the same as described previously [41]. Pullulans (Associated Polymer Labs Inc., New York, USA) in the MW range of 0.4-708 kDa were used for calibration.

MALDI-TOF-MS analysis of oligosaccharides
To analyze the mass of formed XG oligosaccharides, MALDI-TOF-MS (Bruker Daltonics, Billerica, Massachusetts, USA) was used as previously described [47]. The mass spectrometer was calibrated using maltodextrins (Avebe, Veendam, The Netherlands) in a mass range (m/z) of 500-3000 and a total of 300 spectra were collected for each measurement. Prior to analysis, samples were desalted using Dowex AG 50 W-X8 Resin (Bio-Rad Laboratories, Hemel Hempstead, UK). The desalted supernatants were dried under nitrogen and re-dissolved in water containing 20 mM LiCl to obtain lithium (Li)-adducts. 1 µL of each lithium-rich sample was mixed with 1 µL matrix solution (50% (v/v) acetonitrile in H 2 O containing 12 mg/mL 2,5-dihydroxy-benzoic acid (Bruker Daltonics)) and dried under nitrogen.

HILIC-ESI-CID-MS/MS for structural elucidation of (degraded) XG
The LPMO-TXG- and LPMO-BCXG-digests were separated and analyzed using HILIC coupled to ESI-MS. To separate the TXG oligosaccharides, a Vanquish UHPLC system (Thermo Scientific, San Jose, CA, USA) equipped with an Acquity UPLC BEH Amide column (1.7 μm, 2.1 mm ID × 150 mm) and a VanGuard pre-column (1.7 μm, 2.1 mm ID × 5 mm) was used. Supernatants from LPMO-TXG- and LPMO-BCXG-digests were concentrated five times and then injected (2 μL) onto the column. The column temperature was set at 35 °C using the still air mode and the flow rate was 0.45 mL/min. Water (A) and acetonitrile (B), both containing 0.1% formic acid (all UHPLC-grade; Biosolve, Valkenswaard, The Netherlands), were used as mobile phases. The elution profile was: 0-2 min at 82% B (isocratic), 2-62 min from 82% to 60% B (linear gradient), 62-62.5 min from 60% to 42% B (linear gradient), 62.5-69 min at 42% B (isocratic), 69-70 min from 42% to 82% B (linear gradient) and 70-80 min at 82% B (isocratic). The MS settings have been described previously [51]. The full MS (m/z) range was set to 300-2000. To improve the fragmentation, MS/MS was performed using dependent scan followed by a parent mass list. The mass list used is displayed in Additional file 1: Table S1. For MS/MS, CID was performed with a normalized collision energy of 35%, a minimum signal threshold of 20,000 counts, an activation Q of 0.15 and an activation time of 10 ms. Mass spectrometric data were processed using Xcalibur 2.2 (Thermo Scientific).
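The elution profile above is a piecewise-linear program and can be written down as a list of breakpoints. The sketch below transcribes those breakpoints and interpolates the mobile-phase composition at any time point; the function names are illustrative and the code is not instrument method software.

```python
from bisect import bisect_right

# (time in min, % B) breakpoints transcribed from the elution profile.
GRADIENT = [
    (0.0, 82.0), (2.0, 82.0),    # isocratic hold
    (62.0, 60.0),                # linear gradient 82% -> 60% B
    (62.5, 42.0),                # linear gradient 60% -> 42% B
    (69.0, 42.0),                # isocratic hold
    (70.0, 82.0),                # linear gradient 42% -> 82% B
    (80.0, 82.0),                # re-equilibration
]

def percent_b(t_min):
    """Return % B (acetonitrile + 0.1% formic acid) at time t_min."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-80 min program")
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:   # exactly at the final breakpoint
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

def percent_a(t_min):
    """Return % A (water + 0.1% formic acid) at time t_min."""
    return 100.0 - percent_b(t_min)
```

Halfway through the main gradient, at 32 min, the interpolation gives 71% B, which can help when relating a retention time to the solvent composition at elution.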

Crystal structures and homology models
Structural data of LPMOs were derived from the RCSB Protein Data Bank (https://www.rcsb.org). Homology models of LPMOs without published three-dimensional structures were generated using SWISS-MODEL (https://swissmodel.expasy.org) [63][64][65][66][67]. Template searches with BLAST [68] and HHblits [69] were performed against the SWISS-MODEL template library (SMTL). The target sequences were searched with BLAST against the primary amino acid sequences contained in the SMTL. The PyMOL Molecular Graphics System (Version 1.7.2.1, Schrödinger, LLC) was used for visualization and structural alignments.

Sequence mining, structure-based multiple sequence alignment and phylogenetic analysis
To obtain an unbiased set of amino acid sequences covering the large variety within AA9 LPMOs, sequences were selected randomly from the 498 available eukaryotic AA9 LPMO sequences in the CAZy database. This set was completed by adding all AA9 LPMO sequences labeled as "characterized" in the CAZy database, all AA9 LPMO sequences with a resolved structure, and those with known XG (in)activity, if not already present in the set. The amino acid sequences were aligned using the MUSCLE algorithm [70] in MEGA7 [71] and fine-tuned by removing the signal peptide, the linker and the CBM region, as well as sequences that did not fit the alignment. The amino acid sequences were then realigned using the structure-based MAFFT-DASH algorithm [72]. The resulting structure-based alignment was then cut down to the regions of interest termed "Segments 1 to 5" (Seg1-Seg5).
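The random selection and the reduction of the alignment to Seg1-Seg5 can be sketched as follows. This is an illustration only: the function names, the seed and the column ranges passed in are hypothetical, and the segment boundaries used in this study are not reproduced here.

```python
import random

def sample_sequence_ids(all_ids, n_selected, seed=0):
    """Draw a reproducible random subset of sequence identifiers.

    The seed is a hypothetical choice that makes the draw repeatable.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(list(all_ids), n_selected))

def cut_to_segments(alignment, segments):
    """Keep only the alignment columns that belong to the given segments.

    alignment: dict mapping sequence name -> aligned sequence (with gaps)
    segments:  list of (start, end) column ranges, 1-based and inclusive
    """
    if len({len(seq) for seq in alignment.values()}) != 1:
        raise ValueError("all aligned sequences must have equal length")
    return {
        name: "".join(seq[start - 1:end] for start, end in segments)
        for name, seq in alignment.items()
    }
```

Gap characters inside a segment are kept, so the trimmed sequences remain aligned column by column and can be passed directly to a phylogenetic analysis.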
Phylogenetic analysis of both the FL and SO structure-based multiple sequence alignments was performed using RAxML-NG [73]. First, the alignments were tested for the most applicable substitution model using ModelTest-NG [74]. The trees were inferred using the BLOSUM62 model [75] (number of discrete gamma categories: 4; with frequencies and invariant sites) for the FL alignment and the Probability Matrix from Blocks (PMB) model [76] (number of discrete gamma categories: 4; with frequencies and invariant sites) for the SO alignment. In each case, 20 starting trees were calculated. Bootstrap analysis was then carried out until the convergence criteria (cut-off: 0.03) of the bootstopping test [77] were reached (800 and 1120 bootstraps for the FL and SO alignment, respectively). The resulting phylogenetic trees were prepared for publication using MEGA7.
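A RAxML-NG run with these settings could be assembled roughly as sketched below. The file names and prefixes are hypothetical, and several details are assumptions rather than reported settings: the split of the 20 starting trees into 10 parsimony and 10 random trees, the use of empirical frequencies (+F), and the upper limit of 2000 bootstrap replicates for the autoMRE bootstopping test.

```python
import shlex

def raxml_ng_command(msa_path, model, prefix,
                     n_pars=10, n_rand=10,
                     bs_cutoff=0.03, max_bootstraps=2000):
    """Build an argument list for a combined ML search and bootstrap run.

    Assumptions: 10 parsimony + 10 random starting trees (20 in total)
    and autoMRE bootstopping with an assumed replicate limit.
    """
    return [
        "raxml-ng", "--all",
        "--msa", msa_path,
        "--model", model,
        "--tree", f"pars{{{n_pars}}},rand{{{n_rand}}}",
        "--bs-trees", f"autoMRE{{{max_bootstraps}}}",
        "--bs-cutoff", str(bs_cutoff),
        "--prefix", prefix,
    ]

# Hypothetical input files for the two alignments.
FL_CMD = raxml_ng_command("fl_alignment.fasta", "BLOSUM62+I+G4+F", "FL")
SO_CMD = raxml_ng_command("so_alignment.fasta", "PMB+I+G4+F", "SO")

if __name__ == "__main__":
    # Print shell-ready commands; nothing is executed here.
    print(shlex.join(FL_CMD))
    print(shlex.join(SO_CMD))
```

Building the command as a list keeps the model string and tree settings explicit, and the list can be handed to `subprocess.run` once the input paths point to real alignments.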