Phylogenetics-based identification and characterization of a superior 2,3-butanediol dehydrogenase for Zymomonas mobilis expression

Background Zymomonas mobilis has recently been shown to be capable of producing the valuable platform biochemical 2,3-butanediol (2,3-BDO). Despite this capability, the production of high titers of 2,3-BDO is restricted by several physiological parameters. One such bottleneck is the conversion of acetoin to 2,3-BDO, a step catalyzed by 2,3-butanediol dehydrogenase (Bdh). Several Bdh enzymes have been successfully expressed in Z. mobilis, although a highly active enzyme is yet to be identified for expression in this host. Here, we report the application of a phylogenetic approach to identify and characterize a superior Bdh, followed by validation of its structural attributes using a mutagenesis approach.
Results Of the 11 distinct bdh genes that were expressed in Z. mobilis, crude extracts expressing Serratia marcescens Bdh (SmBdh) were found to have the highest activity (8.89 µmol/min/mg) when compared to other Bdh enzymes (0.34–2.87 µmol/min/mg). The SmBdh crystal structure was determined through crystallization with cofactor (NAD+) and substrate (acetoin) molecules bound in the active site. Active SmBdh was shown to be a tetramer with the active site populated by a Gln247 residue contributed by the diagonally opposite subunit. SmBdh showed a more extensive supporting hydrogen-bond network in comparison to the other well-studied Bdh enzymes, which enables improved substrate positioning and substrate specificity. This protein also contains a short α6 helix, which provides more efficient entry and exit of molecules from the active site, thereby contributing to enhanced substrate turnover. Extending the α6 helix to mimic the lower-activity Enterobacter cloacae (EcBdh) enzyme reduced SmBdh activity to nearly 3% of the wild-type activity. In contrast, shortening the corresponding α6 helix of EcBdh to mimic the SmBdh structure resulted in a ~ 70% increase in its activity.
Conclusions This study has demonstrated that SmBdh is superior to other Bdhs for expression in Z. mobilis for 2,3-BDO production. SmBdh possesses unique structural features that confer a biochemical advantage to this protein. While coordinated active site formation is a unique structural characteristic of this tetrameric complex, the smaller α6 helix and extended hydrogen-bond network contribute towards improved activity and substrate promiscuity of the enzyme.

Background Petroleum alternatives are critical to maintaining a sustainable economy while satisfying the ever-growing global energy demands. Diols, potentially renewable compounds containing two hydroxyl groups, have wide-ranging applications in chemicals and fuels. 2,3-Butanediol (2,3-BDO) is an exemplary industrial diol, having been used in liquid fuels, cosmetics and drugs, paints, food additives, and synthetic rubber [1,2].
Zymomonas mobilis is known primarily for its ethanologenic properties. In contrast to yeasts, which use the Embden-Meyerhof-Parnas pathway for glycolysis, Z. mobilis uses the Entner-Doudoroff (ED) pathway. The ED pathway is found in facultative anaerobes and aerobic microorganisms, leading to higher ethanol yields [14]. Z. mobilis has other advantages, such as high alcohol and pH tolerance, a high rate of sugar uptake, low biomass production, and a reduced aeration requirement, all of which reduce production costs [15]. Interestingly, Z. mobilis is capable of using both deacetylated disc-refined (DDR) and deacetylated mechanically refined (DMR) sugar streams for ethanol production [16]. Recently, our laboratory also demonstrated that the ethanol flux could be diverted to 2,3-BDO using pure or mixed sugars as substrates. By introducing the three 2,3-BDO pathway genes, followed by promoter replacements and optimization of fermentation conditions, 2,3-BDO titers of > 13 g/L were achieved [15].
One of the bottlenecks for production of 2,3-BDO in Z. mobilis is the competition between NADH oxidase (Ndh) and Bdh to oxidize NADH to NAD+ under oxic conditions. This is in addition to the NADH demand for glycerol and ethanol production by this organism. Depending on the activity of the Bdh enzyme, the conversion of NADH could be shifted towards 2,3-BDO production instead of the respiratory chain. The conversion of acetoin to 2,3-BDO by Bdh is a reversible reaction governed by the pH [17,18].
Bdh enzymes can be classified as R-acting or S-acting depending on the chirality of the chiral center introduced by the enzyme at the acetoin C2 atom. Whereas the preference for (3R)-acetoin or (3S)-acetoin is imprinted in the geometry of the substrate-binding pocket, R-acting and S-acting Bdh enzymes belong to different protein families and possess different architectures. Recently, the first structure of an R-acting Bdh from Bacillus subtilis was deposited to the Protein Data Bank (PDB, www.rcsb.org, [19]) with the PDB ID 6IE0, while there are numerous structures available for other members of the medium-chain dehydrogenase/reductase (MDR) superfamily [20][21][22][23]. Examples of R-acting Bdh enzymes are those from S. cerevisiae, Paenibacillus polymyxa, B. subtilis, Pseudomonas putida, and Bacillus licheniformis [24][25][26][27].
S-acting Bdh enzymes belong to the short-chain dehydrogenase/reductase (SDR) superfamily of proteins [28]. The substrates for this superfamily of enzymes vary greatly in size and include glucose, alcohols, and steroids [29]. These enzymes are well studied and several S-acting Bdhs have been characterized structurally, such as the (3R)-acetoin-dependent S-acting Bdh from Klebsiella pneumoniae (PDB ID 1GEG, deposited in 2001) and the (3S)-acetoin-dependent S-acting Bdh from Corynebacterium glutamicum (PDB ID 3A28, deposited in 2010) reported by Otagiri and colleagues [30,31].
In this work, we have focused on identifying the most active Bdh enzyme that can function in Z. mobilis using a phylogenetics approach to enable efficient conversion of acetoin to 2,3-BDO. Furthermore, we have obtained the crystal structure of the Serratia marcescens Bdh (SmBdh) by expressing it in this industrially relevant host and provided comparative analysis against similar Bdh enzymes. We have also carried out structurally guided changes to SmBdh to explain its superiority over lower activity Bdh enzymes from the same family of proteins, followed by biochemical confirmation of the activity of the designed mutants.

Results and discussion
Production of 2,3-BDO depends on several factors, including the types of heterologous 2,3-BDO pathway genes used, the levels of oxygen present, the presence of competing pathways, such as those for ethanol and glycerol, and the presence of competing enzymes for NADH conversion. We have concentrated on only one of these parameters, namely the activity of the NADH-consuming enzyme, Bdh, that can compete with Ndh in the presence of oxygen. This biochemical step is considered one of the bottlenecks in 2,3-BDO production by this organism [15]. Although Z. mobilis has been shown to express several commonly known Bdh enzymes [15], no comprehensive study has been undertaken to express functional Bdhs from diverse species that could contribute towards improving 2,3-BDO production in this organism. We therefore decided to take a phylogenetic approach to this problem by screening Bdh enzymes across different bacterial kingdoms to identify the most active enzyme that could be engineered into this organism.

Identification of butanediol dehydrogenase sequences for expression in Z. mobilis
A total of 57 protein sequences were included in the list of Bdh for phylogenetic analysis. This list included two sequences each from Azotobacter vinelandii, Acidovorax avenae, Rhodococcus jostii, and Agrobacterium tumefaciens. These sequences belonged to two different classes of Bdh proteins based on their protein sequence lengths. Bifidobacterium asteroides contained three Bdh sequences, two belonging to the short-length class and one belonging to the long class. We also included sequences from B. subtilis, B. licheniformis, K. pneumoniae, E. cloacae, and S. marcescens, for which expression studies have been carried out in E. coli [13]. Two distinct clusters were clearly observed on the phylogenetic tree: one cluster (21 sequences) primarily contained the "short length" Bdhs, while the other (36 sequences) contained the "long" Bdhs (Fig. 1). The "short length" Bdh cluster comprised S-acting enzymes (presumably belonging to the SDR superfamily), such as those from K. pneumoniae, E. cloacae, and S. marcescens [18,32,33]. The "long" Bdh cluster comprised R-acting enzymes (presumably belonging to the MDR superfamily), such as those from B. subtilis and C. beijerinckii [33]. Each of these two major clusters was further subdivided into smaller sub-clusters. Based on their clustering pattern, we subdivided the "short length" cluster into two sub-clusters and the "long" class of Bdhs into five sub-clusters. Eight protein sequences were selected from the "long" cluster and three sequences were selected from the "short length" cluster, such that at least one representative protein sequence was selected from each sub-cluster of the phylogenetic tree (Fig. 1).

Screening of butanediol dehydrogenases for protein expression and activity in Z. mobilis
The Z. mobilis strain used in this study was 9C, a modified version of strain 8b that was originally developed for improved ethanol production [15,34]. The 9C strain lacks the chloramphenicol and tetracycline markers that the 8b strain contains. Neither of these strains was engineered for 2,3-BDO production. Moreover, native Z. mobilis does not contain the 2,3-BDO pathway. Thus, our study was not designed to express the entire 2,3-BDO pathway; instead, it was intended only to study the expression and activity of the Bdh enzyme in this organism. Transformants obtained from the spectinomycin selection plates were confirmed for the presence of the individual bdh genes by PCR analysis. Eight individual transformants were subjected to colony PCR, the results of which are shown in Additional file 1. Based on PCR analysis, five independent transformants were selected for protein expression analysis. Because these Bdh proteins were expressed without an epitope tag, we decided to use activity assays as a screen to detect protein expression in the different Z. mobilis transformants. The results of the activity assay are shown in Table 1. We observed that most of the bdh transformants showed an activity of ≤ 2 µmol/min/mg using acetoin and NADH as the substrate and cofactor, respectively. This included 9 of the 11 tested bdh genes (Additional file 2). As controls, we used three Z. mobilis strains expressing the E. cloacae, B. subtilis, and Lactococcus lactis bdh genes under the same promoter, for which expression and/or 2,3-BDO production has been tested in-house (unpublished data). Two of the Bdh enzymes, namely those from M. luteus and S. marcescens, showed Bdh activity of ~ 3.0 and 9.0 µmol/min/mg, respectively, which was 2- to 4-fold higher than the control strains. Based on the activity analysis, we selected S. marcescens Bdh (SmBdh) as the candidate enzyme for structural characterization. We also carried out SDS-PAGE analysis to determine expression of the individual Bdh proteins produced by these transformants. Interestingly, only some of the bdh gene transformants showed unique bands at the expected molecular weight based on Coomassie staining (Additional file 3), suggesting different expression levels of the heterologous Bdh proteins. It is likely that some of these heterologous enzymes were not expressed to detectable levels, which may have been a factor in their poor activity levels. Nevertheless, for the purpose of screening, we used their crude extract activities as the standard to determine the best-expressing and most active enzyme.
Fig. 1 Phylogenetic tree analysis of butanediol dehydrogenases. The neighbor-joining tree was generated using 57 full-length Bdh sequences. The optimal tree with the sum of branch length = 16.08413861 is shown. Bootstrap values are shown next to the branches. The tree has been divided into 2 clusters, the medium-chain dehydrogenases/reductases (MDR) and short-chain dehydrogenases/reductases (SDR). The MDR and SDR clusters are further subdivided into 5 and 2 sub-clusters, respectively. The sub-clusters are represented by numbers 1 through 7. Red triangles represent the sequences selected for gene synthesis.
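The activities in this section are reported in µmol/min/mg, which for an NADH consumption assay is typically derived from the rate of absorbance decrease at 340 nm. The following is a minimal sketch of that conversion, assuming the commonly used NADH extinction coefficient of 6.22 mM⁻¹ cm⁻¹ and a 1 cm path length. The function name, the reaction volume, and the protein amount are hypothetical illustration values and are not taken from this study.

```python
# Millimolar extinction coefficient of NADH at 340 nm (mM^-1 cm^-1).
# Assumption: the standard literature value; the source does not state it.
NADH_EXTINCTION_MM = 6.22


def specific_activity(delta_a340_per_min, reaction_volume_ml, protein_mg,
                      path_length_cm=1.0):
    """Return specific activity in µmol NADH oxidized per min per mg protein."""
    if protein_mg <= 0 or reaction_volume_ml <= 0 or path_length_cm <= 0:
        raise ValueError("volume, protein amount and path length must be positive")
    # Beer-Lambert law: rate in mM/min, which equals µmol/mL/min.
    rate_mm_per_min = delta_a340_per_min / (NADH_EXTINCTION_MM * path_length_cm)
    # Multiply by the reaction volume to obtain µmol/min in the cuvette.
    umol_per_min = rate_mm_per_min * reaction_volume_ml
    # Normalize to the amount of protein in the assay.
    return umol_per_min / protein_mg


# Hypothetical example: A340 falls by 0.311 per minute in a 1 mL reaction
# containing 5 µg (0.005 mg) of protein.
example_activity = specific_activity(0.311, 1.0, 0.005)
print(f"{example_activity:.2f} µmol/min/mg")
```

With these illustrative inputs the result is about 10 µmol/min/mg, which is the same order of magnitude as the SmBdh crude extract activity reported above.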

Crystal structure of SmBdh along with its bound cofactor and substrate molecules
Expression of a high-performing Bdh enzyme in Z. mobilis is critical for 2,3-BDO production in this industrially relevant ethanol producer. Although structural analysis of SmBdh was not the primary intent of this study, this enzyme turned out to be the best expressed and most active Bdh in this organism, and it was therefore deemed important to determine the reasons behind its improved functionality. The WT SmBdh expressed in Z. mobilis crystallized in the space group P4₃2₁2 with two protein molecules per asymmetric unit that could be superimposed with an r.m.s.d. of 0.271 Å over 1379 atoms. The two molecules found in the asymmetric unit form a tetramer (dimer of dimers) with a symmetry-related dimer (Fig. 2a) with extensive interface surfaces (Table 2, [35]). The tetrameric arrangement is common to the proteins belonging to the SDR superfamily. SmBdh was shown to be a tetramer in solution by Native PAGE [36]. The main difference between the two molecules in the asymmetric unit is that molecule A is found in the 'closed' conformation (best described by a ~ 10 Å distance between the Cα atoms of residues Ala93 and Trp192) and molecule B is found in the 'open' conformation (the same distance is ~ 14 Å) (Additional file 4, Fig. 2b and c). Upon examination of the electron density maps, we were able to locate the NAD+ cofactor molecule and acetoin molecules in protein molecule A ('closed' conformation, Fig. 3a), while in molecule B ('open' conformation) the corresponding space is occupied by only adenosine diphosphate (Fig. 3b). The transition from the 'open' conformation into the 'closed' conformation seems to coincide with NAD+ binding and includes a shift of the loop 183-203, containing two helices, α6 and α7 (Fig. 2b, c). In the process, the short helix α6 is unfolded, but this is compensated by the hydrogen bonds formed between NAD+ and Thr186Oγ and between the main chain carbonyl of Val184 and Thr189Oγ (Figs. 2c, 3).
Additional stabilization is likely provided by the hydrophobic interaction between the Met188 side chain and the nicotinamide ring (Fig. 2c). Interactions between protein and NAD+ are shown in Fig. 3c. WT SmBdh contains acetoin in the substrate-binding pocket of the 'closed' molecule A. Crystallization conditions included NAD+ and acetoin, and more acetoin was used for cryoprotection. Omit electron density maps showed three bulges at the C3 atom (Fig. 4a). When only the (3R)-acetoin molecule was modeled, a positive peak appeared in the difference map (Fig. 4b). We interpreted this as both (3R)- and (3S)-acetoin molecules being bound in the pocket, since commercially available acetoin is a racemic mixture and no other small molecules (like ethylene glycol) were added during crystallization or as a cryoprotectant. Out of the three "bulges", only one should be modeled as a methyl group since its surroundings are hydrophobic (side chains of Leu183, Trp192, Phe200, and Thr189Cγ) and no possible hydrogen bond donors or acceptors are available. Thus, this bulge was modeled as the C4 atom of both (3R)-acetoin and (3S)-acetoin. The two other bulges, in contrast, have hydrogen bond partners available: Gln247 symm Nε (3.1 Å), Ser140Oγ (3.1 Å), or a water molecule (2.7 Å) for the (3R)-acetoin O3 atom, and Gln247 symm Oε (3.0 Å) or Ser138 Oγ (3.2 Å) for the (3S)-acetoin O3 atom.

SmBdh belongs to the SDR superfamily of enzymes
SmBdh exhibits the typical single-domain short-chain dehydrogenase/reductase (SDR) architecture. A monomer displays a dinucleotide-binding Rossmann fold that includes a seven-stranded parallel β-sheet (β3-β2-β1-β4-β5-β6-β7) flanked by three α-helices (α2, α1, and α8) on one side and three α-helices (α3, α4, and α5) on the other. There is an additional small lobe on top of the core structure, formed by two short helices, α6 and α7, creating a deep cleft in which the cofactor is bound and where the active site is located (Figs. 2 and 3). It should be noted that SmBdh has been shown to be an NADH-dependent dehydrogenase enzyme [18]. Two α-turns (αt1 and αt2) can also be recognized in this structure.
A PDBeFold [37] search returned over 3000 molecules with a Z-score between 6 (usually considered to be the lower threshold for likeliness) and 16.6, and sequence identity up to 37%. The top matching structure could be superimposed over SmBdh with an r.m.s.d. of 1.03 Å over 239 Cα-atoms (FabG from Bacillus sp., PDB ID 4NBU). While the SDR superfamily is vast and includes enzymes active towards various substrates, there are several reported Bdh structures, namely from Burkholderia cenocepacia (BcBdh, PDB ID 4WEO), Burkholderia xenovorans (BxBdh, PDB ID 5JY1), C. glutamicum (PDB ID 3A28), Gluconobacter oxydans (GoBdh, PDB ID 3WTC), and K. pneumoniae (KpBdh, PDB ID 1GEG) that are available in the PDB database. All mentioned enzymes are S-acting or S-installing, introducing a 2S chiral center in the acetoin molecule. However, only two enzymes are well characterized with regard to the substrate chiral specificity: Bdh from K. pneumoniae (KpBdh) is (3R)-acetoin dependent [30] and Bdh from C. glutamicum (CgBdh) is (3S)-acetoin dependent [31]. An explanation for the chiral substrate recognition based on the spatial organization of the substrate-binding pocket has been presented before [31]. To determine the structural characteristics that confer this property to the SmBdh enzyme, we compared this protein with similar Bdhs from other organisms. Based on our assessment, the strategic placement of the hydrophobic cluster on the αt1 side of the acetoin binding site (Ile142, Phe148, Leu151) and possible hydrogen bond partner on the α6 side of the acetoin binding site (Trp192) would be preferential for (3S)-acetoin binding in CgBdh (Figs. 2b and 5a  (Fig. 5c). This water molecule is held in place by Nε of His91 that replaced Ala92/90 of CgBdh/KpBdh, respectively, and Oδ of Asp145. This water molecule is placed within 2.7 Å from the O3 atom of the (3R)-acetoin molecule in SmBdh.
2: Ala143/141 of CgBdh/KpBdh, respectively, is replaced with Ser140 in SmBdh and Ser140Oγ is within 3.1 Å from the O3 atom of (3R)-acetoin. Nε of the Gln247 symm is situated 3.1 Å away from the O3 atom of (3R)-acetoin (Fig. 5c). 3: For the (3S)-acetoin, its O3 atom is located 2.8 Å away from Oε of the Gln247 symm and 3.2 Å away from Ser138Oγ (Fig. 5c). We suggest that the significantly improved supporting hydrogen bond network in the active site of SmBdh, in comparison to the other known Bdhs, contributes to more precise positioning of the acetoin molecule in the active site and probably to better stabilization of the reaction intermediate, which could in turn lead to higher turnover of the substrate.

Substrate promiscuity of the SmBdh enzyme
Most of the characterized Bdh enzymes show strong preference for either (3S)- or (3R)-acetoin as a substrate [24,33,38,39]. SmBdh was reported to be able to reduce both (3R)- and (3S)-acetoin to 2,3-BDO, although (3R)-acetoin was more readily converted than (3S)-acetoin [18]. On the other hand, SmBdh was able to oxidize meso-2,3-BDO and (2S,3S)-BDO, while oxidation of (2R,3R)-BDO was not detectable. Moreover, its activity towards (2S,3S)-BDO was only 11% of that towards meso-2,3-BDO [18]. This observation was similar to that reported by Médici et al., where (3R)-acetoin was found to be the preferred substrate in the racemic mixture of acetoin. Fifteen percent of the (3S)-acetoin remained unconverted after 24 h in their study [36]. Indeed, in our crystal structure we were able to locate both acetoin enantiomers in the active site. Whereas the positions of all acetoin atoms, with the exception of O3, superimpose quite well, the O3 atoms of the (3S)- and (3R)-acetoin molecules are involved in hydrogen bonds with different surrounding residues (Figs. 4, 5c). This finding supports the idea of productive binding of both acetoin enantiomers in the active site. Still, (3S)- and (3R)-acetoin molecules are not treated equally by SmBdh. We suggest that the differences in the hydrogen bond networks of the (3S)-acetoin and (3R)-acetoin can explain our findings. Specifically, the O3 atom of the (3S)-acetoin is only involved in one hydrogen bond with Gln247 symm Oε (or Nε if the side chain is flipped). While Ser138Oγ is located only 3.2 Å away from (3S)-acetoin O3, the geometry is not favorable for H-bond formation. In contrast, the O3 atom of the (3R)-acetoin has three possible partners for   (Fig. 2a).
Comparison between SmBdh and Bdhs with known structures shows a shortened α6 helix in SmBdh (also known as FG1 in SDR structure description) (Fig. 6). The α6 helix is six residues long in SmBdh, 10 residues long in BxBdh and 15 residues long in KpBdh, CgBdh, and GoBdh. The shortened α6 helix provides a wider opening of the catalytic cleft, thus improving access to the NADH/NAD+ and acetoin/2,3-BDO binding sites and easing ingress/egress of the cofactor and substrate/product.

One of the most interesting and exclusive features of SmBdh is the organization of the active site. We found that the substrate-binding pocket is formed by two protein molecules, not a single peptide as found in all other reported Bdh enzymes. The C-terminus of molecule A protrudes into the groove between the α7 helix and the α-turn αt1, capping the substrate-binding pocket of molecule A symm, and vice versa (Fig. 2a). The side chain of Q247 symm is involved in substrate positioning, forming a hydrogen bond with the O3 atom of the acetoin (Fig. 5). In most SDRs the groove between the α7 helix and the α-turn αt1 is unobstructed and open to the solvent, but there are examples of active sites with the involvement of the second protein chain, for example: 17β hydroxysteroid  [40,41]), alcohol dehydrogenase from Arthrobacter sp. TS-15 (PDB ID 6QHE, [42]), and glucose dehydrogenase 4 from Bacillus megaterium (PDB ID 3AUU, [43]). Searching through the PDB, we could not find any other Bdh structure where the active site required the participation of a second protein molecule, so SmBdh is the first Bdh with this feature. The only similar involvement of the C-terminus in active site formation is found in BxBdh, where it protrudes into the same protein molecule instead of a different protein molecule, as in SmBdh.
There are several consequences of this substrate-binding pocket organization. First, SmBdh would only be active as a tetramer because the active site is incomplete unless tetramerization is achieved. In other Bdh enzymes active sites are solely formed by a single protein molecule and the enzymatic activity in theory could occur in the monomeric or dimeric state. Second, the presence of the C-terminal portion of another protein molecule near the substrate-binding pocket creates a cap making the pocket much less accessible from the solvent compared to other Bdh enzymes. Third, higher specificity towards acetoin is another possible consequence as the substrate binding site is restricted in volume and cannot accept any larger substrate. This concept is in good agreement with the observation that SmBdh prefers smaller substrates, such as vicinal diketones/diols that do not contain bulky groups [36]. Tighter and more specific binding of the acetoin molecule could be achieved due to an improved hydrogen bonding network (as explained in the section "Key differences in the SmBdh structure lead to improved positioning of acetoin in the active site of the enzyme"); thereby improving reaction intermediate stabilization and ultimately enhancing catalytic efficiency of the SmBdh.

Mutational studies of the SmBdh
Gln247 plays a crucial role in SmBdh catalysis
Based on the structural evidence that we have obtained, Gln247 seems to play an important role in the hydrogen bonding-mediated stabilization of the substrate, as well as in better positioning of the substrate in the substrate-binding pocket of the protein. This is important in the context of the overall catalytic ability of this protein, and it may be one of the critical parameters that make SmBdh a better performing Bdh enzyme in comparison to its closely related homologs. In order to determine the importance of the Gln247 side chain for catalysis, we developed two mutants of SmBdh: (1) Q247A, where this side chain is removed, leaving alanine at position 247, and (2) the double mutant Q247A + V139Q, where the missing glutamine side chain would be reinstated at position 139, as is present in KpBdh (Fig. 7, Additional files 5 and 6). Whereas Q247A is expected to disrupt the active site of the protein, Q247A + V139Q is expected to restore the active site via a compensatory mechanism, resulting in the active site being established by a single protein chain without contribution from a symmetry-related molecule (i.e., from the C-terminus of the opposite molecule in the tetramer).
The SmBdh Q247A and Q247A + V139Q mutants crystallized in the same space group (P4₃2₁2) as the WT SmBdh, repeating its arrangement of protein chains. Two protein molecules per asymmetric unit could be superimposed with an r.m.s.d. of 0.275 Å over 1400 atoms (Q247A) and 0.263 Å over 1368 atoms (Q247A + V139Q). As with the WT SmBdh, the primary difference between the two molecules in the asymmetric unit in the mutant structures was that molecule A was found in the 'closed' conformation (best described by a ~ 10 Å distance between the Cα atoms of residues Ala93 and Trp192) and molecule B was found in the 'open' conformation (the corresponding distance is ~ 15 Å) (Additional file 4, Fig. 2). Upon examination of the electron density maps, we were able to locate the NAD+ cofactor molecule and a substrate in protein molecule A ('closed' conformation), whereas in molecule B ('open' conformation) the corresponding space is occupied by only adenosine diphosphate (as in the Q247A mutant) or remains entirely unoccupied (as in the Q247A + V139Q mutant). The substrate binding site was found to be occupied by a glycerol molecule in the Q247A mutant and by ethylene glycol in the Q247A + V139Q mutant (Fig. 7).
In the Q247A mutant structure, a glycerol molecule was modeled at the substrate-binding pocket (Fig. 7b). Acetoin was present in the crystallization conditions, but our attempts to use additional acetoin for cryoprotection only led to crystal damage. Therefore, a 50/50 v/v mix of glycerol and ethylene glycol was used for cryoprotection. We postulate that the absence of the Gln247 symm side chain led to an increase of the available space for the substrate and therefore the larger glycerol molecule could be bound. In the Q247A + V139Q double mutant structure, an ethylene glycol molecule was modeled at the substrate-binding pocket (Fig. 7c). As with the Q247A mutant, acetoin was present during crystallization, but raising its concentration was damaging to the crystals. A 50/50 v/v mix of glycerol and ethylene glycol was therefore used for cryoprotection. We suggest that the available space in the substrate binding site became reduced compared to the Q247A single mutant, thereby providing insufficient space for glycerol to fit. Instead, ethylene glycol which is comparable in size to acetoin could fit.
We further tested the activity of both these mutants using the NADH consumption assay to determine the effect of these mutations on the function of the protein. In comparison to the WT SmBdh, Q247A mutant showed a ~ 90% loss in activity as predicted by the structure, whereas the double mutant Q247A + V139Q showed ~ 300% improvement in activity in comparison to Q247A mutant (Fig. 8). Although the double mutant did not completely restore the loss of Gln247 activity, significant function was regained by introducing the V139Q mutation in this protein. It can thus be inferred that Gln247 is extremely important for the activity of the protein, which cannot be fully restored by a complementary mutation, such as V139Q. However, this is not a thorough analysis of complementation of the Q247A mutation and there could be other mutations that could help restore full functionality of the Q247A mutation. This analysis also highlights the importance of the C-terminal Gln247 from the opposite molecule in conferring full functionality.
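The relative activities quoted above can be placed on a common scale with simple arithmetic. The sketch below normalizes WT SmBdh to 1.0 and assumes that "~ 300% improvement" means an increase of 300% over the Q247A value (a fourfold change); if it is instead read as threefold, the double mutant would sit near 0.3 rather than 0.4 of WT. The helper names are hypothetical, and the exact measured values are those in Fig. 8, not the rounded figures used here.

```python
def remaining_fraction(loss_fraction):
    """Fraction of the reference activity left after a fractional loss."""
    return 1.0 - loss_fraction


def apply_improvement(baseline, improvement_fraction):
    """Activity after a fractional improvement over a baseline."""
    return baseline * (1.0 + improvement_fraction)


# WT SmBdh is normalized to 1.0.
q247a_vs_wt = remaining_fraction(0.90)                 # ~90% loss of activity
# Assumption: "~300% improvement" is read as +300% relative to Q247A.
double_vs_wt = apply_improvement(q247a_vs_wt, 3.00)

print(f"Q247A:         {q247a_vs_wt:.2f} of WT")
print(f"Q247A + V139Q: {double_vs_wt:.2f} of WT")
```

Under this reading the double mutant recovers to roughly 40% of WT activity, which is consistent with the statement that V139Q regains significant function without fully compensating for the loss of Gln247.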

Extending the short α6 helix of SmBdh results in loss in activity
Comparing the different Bdh enzymes, we found that most of the S-acting Bdh enzymes, such as EcBdh, CgBdh, and KpBdh, have a longer α6 helix (Fig. 6), whereas an 11-residue deletion in SmBdh leads to a shorter α6 helix along with a shorter linker between the α6 and α7 helices. Based on the activity analysis data (Table 1, Fig. 8), it was clear that SmBdh is much more active than the control enzymes, which include EcBdh. Furthermore, structural analysis suggested that the shortened α6 helix contributes to improved ingress or egress of the substrate and/or product molecules. This prompted us to investigate the role of this structural difference in SmBdh with regard to enzyme performance. To test this hypothesis, we used two Bdh enzymes: SmBdh with its short α6 helix and EcBdh with its long α6 helix. Sequence alignments with highlighted differences are shown in Fig. 6b. We designed two mutants: (1) 11aa-Ins-SmBdh, where the sequence fragment "RDK" was replaced with "SEAAGKPLGYGTET" to mimic the longer α6 helix of EcBdh, and (2) 11aa-del-EcBdh, where the amino acid sequence "SEAAGKPLGYGTET" was replaced with "RDK" to mimic the corresponding shorter α6 helix of SmBdh. The residues flanking these regions in EcBdh and SmBdh showed acceptable homology, so we decided to keep the swap region as small as possible. Proteins were expressed in the Z. mobilis 9C strain, purified to homogeneity, and subjected to activity analysis using the NADH consumption assay. As expected, the activity of 11aa-Ins-SmBdh was greatly reduced (down to 3% of the WT activity) by the insertion, whereas the activity of 11aa-del-EcBdh increased by ~ 70% in comparison to WT EcBdh (Fig. 8), suggesting that the size of the α6 helix contributes to the unique activity of SmBdh in comparison to the other S-acting Bdhs. Nevertheless, replacing the longer α6 helix in EcBdh with the shorter helix from SmBdh did not render EcBdh similar to SmBdh in terms of its activity; we found that the activity of 11aa-del-EcBdh was only 17% of that of WT SmBdh (Fig. 8). This clearly suggests that while the short α6 helix is important for the high activity of SmBdh, this enzyme possesses other structural features that also contribute towards its superior performance.
Fig. 8 2,3-Butanediol dehydrogenase activity of purified Bdh enzyme variants. Bdh activity was measured by following NADH consumption at 340 nm. Purified proteins were used for the enzyme assay. Reactions were carried out in triplicate and are represented as mean ± SD.
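The design of the two helix-swap mutants and the relative activities reported for them can be checked with a short sketch. The segment strings "RDK" and "SEAAGKPLGYGTET" are taken from the text; the flanking residues in the toy sequences, the function name, and the variable names are hypothetical placeholders and do not represent the real SmBdh or EcBdh sequences.

```python
SM_SEGMENT = "RDK"               # short α6 region of SmBdh (from the text)
EC_SEGMENT = "SEAAGKPLGYGTET"    # longer α6 region of EcBdh (from the text)


def swap_segment(sequence, old, new):
    """Replace a segment that must occur exactly once in the sequence."""
    if sequence.count(old) != 1:
        raise ValueError("segment must occur exactly once in the sequence")
    return sequence.replace(old, new)


# Toy sequences with made-up flanks, used only to show the length change.
toy_smbdh = "MMMM" + SM_SEGMENT + "WWWW"
toy_ecbdh = "MMMM" + EC_SEGMENT + "WWWW"

toy_ins_smbdh = swap_segment(toy_smbdh, SM_SEGMENT, EC_SEGMENT)
toy_del_ecbdh = swap_segment(toy_ecbdh, EC_SEGMENT, SM_SEGMENT)

net_insertion = len(toy_ins_smbdh) - len(toy_smbdh)   # 14 - 3 residues
net_deletion = len(toy_ecbdh) - len(toy_del_ecbdh)

# Back-calculation from the reported ratios (rounded values from the text):
# 11aa-del-EcBdh is ~1.7x WT EcBdh and ~0.17x WT SmBdh.
wt_ecbdh_vs_wt_smbdh = 0.17 / 1.70

print(net_insertion, net_deletion, round(wt_ecbdh_vs_wt_smbdh, 2))
```

The swap changes the length by 11 residues in each direction, which matches the mutant names, and the two reported ratios together imply that purified WT EcBdh has roughly one tenth of the activity of WT SmBdh.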

Conclusions
Based on expression of 11 phylogenetically different Bdh enzymes, we have identified that the S. marcescens Bdh enzyme shows the highest activity when expressed in Z. mobilis. This enzyme has been classified as an S-acting Bdh based on production of meso-2,3-BDO and (2S,3S)-BDO from a racemic mixture of acetoin. We have structurally characterized this enzyme and ascertained the distinct structural features that may be critical for its activity. Specifically: (1) This enzyme is organized as a tetramer with diagonally opposite protein molecules acting in tandem, such that one diagonally opposite pair is bound by the substrate in the closed conformation, while the other two protein molecules in the tetramer are ready to take up new substrate molecules in the open conformation. This is the first instance of a piston-type function of a Bdh enzyme. (2) The active site of the SmBdh enzyme is formed with the participation of Gln247 of the opposite molecule from the tetramer; removal of this residue causes severe functional defects in the protein. A complementary mutation that introduces a glutamine partially restores the function of this protein.
(3) SmBdh possesses an improved hydrogen bond network, resulting in better positioning of the substrate in its active site. (4) SmBdh is able to bind both (3S)- and (3R)-acetoin productively, and we were able to locate both substrate molecules in the active site of the crystal structure, thus providing structural confirmation of the enzyme's substrate promiscuity. (5) The presence of a shorter α6 helix provides a wider cleft for efficient entry and exit of the substrate and products, respectively. We have experimentally verified this hypothesis by introducing residues from a low-activity Bdh enzyme (i.e., EcBdh) into SmBdh, which resulted in a dramatic reduction in the activity of this protein. (6) Finally, we have also demonstrated that deleting residues to shorten the α6 helix of EcBdh can result in a substantial increase in its activity (~ 70%), which could prove valuable when considering low-activity Bdh enzymes for 2,3-BDO production. Overall, this work has identified a superior heterologous Bdh enzyme suitable for expression in Z. mobilis. Moreover, availability of a strain expressing this Bdh enzyme is expected to greatly alleviate one of the major bottlenecks involved in 2,3-BDO production by this organism. The next logical step would be to incorporate SmBdh into a Z. mobilis strain carrying the other two 2,3-BDO pathway genes, Als and Aldc. Our future work will involve testing SmBdh in combination with other Als and Aldc enzymes to identify the combination of enzymes that leads to the greatest improvement in 2,3-BDO production.

Strain and growth conditions
Zymomonas mobilis strain 9C (ZM4 derivative strain with xylose-utilizing abilities) was revived from glycerol stocks and grown in rich medium (RM, 10 g/L yeast extract, 2 g/L KH2PO4) containing 5% glucose (RMG) at 30 °C under shaking conditions (120 rpm). Spectinomycin at a concentration of 200 µg/mL was used to maintain transformant colonies.

Phylogenetic analysis of Bdh
Initially, 52 Bdh amino acid sequences were obtained from the NCBI database. These Bdh sequences represented one enzyme per genus, under the assumption that Bdh enzymes within the same genus would have similar structures and functions. Two distinct sets of Bdh sequences were formed based on their sequence lengths: one set contained sequences of ~350 aa, while the other contained sequences of ~250 aa. In genera containing both sequence types, both sequences were selected. In the end, 57 Bdh sequences were used for phylogenetic analysis. Full-length Bdh sequences were aligned using ClustalW, and phylogenetic analysis was carried out using the MEGA X software [44]. The unrooted evolutionary tree was generated using the neighbor-joining method [45]. Bootstrap analysis was performed with 1000 replications [46]. All gaps and ambiguous positions were removed using the pairwise deletion option. The evolutionary distances were calculated using the Poisson correction method [47].
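The distance calculation described above can be illustrated with a minimal sketch. It assumes the standard form of the Poisson correction, d = −ln(1 − p), where p is the proportion of differing residues, and it applies pairwise deletion by skipping any alignment column in which either sequence has a gap or an ambiguous residue. The function names, the set of characters treated as missing, and the toy sequences are assumptions for illustration; this is not the MEGA X implementation.

```python
import math

# Characters treated as missing data for pairwise deletion (an assumption).
MISSING = {"-", "X", "?"}


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing residues over sites present in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue  # pairwise deletion: drop this site for this pair only
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differing / compared


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance, d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance undefined when all compared sites differ")
    return -math.log(1.0 - p)


# Toy aligned fragments (hypothetical, not real Bdh sequences).
toy_a = "MKAV-LTG"
toy_b = "MRAVALSG"
print(f"p = {p_distance(toy_a, toy_b):.4f}")
print(f"d = {poisson_distance(toy_a, toy_b):.4f}")
```

In the toy pair, seven columns are comparable and two differ, so p = 2/7 and the corrected distance is slightly larger than p, reflecting the allowance for multiple substitutions at the same site.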

Plasmid construction, transformation, and screening of transformants
Eleven bdh gene sequences were codon-optimized for expression in Z. mobilis (Additional file 2), synthesized, and cloned into the vector pEZ15Asp [15] under the control of the pyruvate decarboxylase (PDC) promoter by GenScript (Piscataway, NJ, USA). These included genes from Erwinia amylovora (Ea), Myroides odoratimimus (Ma), Staphylococcus warneri (Sw), Thermococcus gammatolerans (Tg), A. vinelandii (Av), Mycobacterium smegmatis (Ms), Micrococcus luteus (Ml), A. tumefaciens (At), S. marcescens (Sm), Streptomyces coelicolor (Sc), and Dickeya dadantii (Dd). Prior to transformation into Z. mobilis, the plasmids were first transformed into a Dam−/Dcm− E. coli strain to avoid methylation of the DNA. The plasmids were then transformed into the Z. mobilis 9C strain using the protocol described in Yang et al. [15]. Briefly, 1 µg of plasmid DNA was mixed with 50 µL of freshly thawed competent cells in a 0.1 cm gap electroporation cuvette on ice. Cells were electroporated using a Bio-Rad Gene Pulser (Bio-Rad Laboratories, Inc., Hercules, CA, USA) with the following settings: 200 Ω, 25 µF, 1.6 kV. The tubes were immediately transferred to ice for cooling. One milliliter of mating medium (MM; 50 g/L glucose, 10 g/L yeast extract, 5 g/L tryptone, 2.5 g/L (NH4)2SO4, and 0.2 g/L K2HPO4) containing 1 mM MgSO4 was added to the tube, and the liquid was transferred to a 1 mL cryovial and incubated for 5-6 h at 30 °C to allow cell recovery. Cells were then plated on MM agar containing spectinomycin (200 µg/mL) and incubated in an anaerobic jar (containing a gas pak) at 30 °C for 2 days. Individual colonies were restreaked on MM medium containing spectinomycin, followed by colony PCR to check for the presence of the bdh gene. Colony PCR was carried out using the forward primer, SV-90, and the reverse primer, SV-91. These primers flanked the promoter and terminator sequences, respectively, thereby amplifying the complete gene cassette. All primers used in this study are available in Additional file 7.
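The electroporation settings quoted above translate into two derived quantities that are often used to compare protocols: the field strength across the cuvette gap and the nominal time constant of the pulse. The sketch below computes both from the stated values (1.6 kV, 0.1 cm gap, 200 Ω, 25 µF). The function names are hypothetical, and the time constant is the product of the set resistance and capacitance only; the actual value depends on the sample resistance and is typically shorter.

```python
def field_strength_kv_per_cm(voltage_kv: float, gap_cm: float) -> float:
    """Electric field strength across the cuvette gap, in kV/cm."""
    return voltage_kv / gap_cm


def nominal_time_constant_ms(resistance_ohm: float, capacitance_uf: float) -> float:
    """Nominal RC time constant in milliseconds.

    ohm * farad = second, so ohm * microfarad = microsecond;
    dividing by 1000 converts microseconds to milliseconds.
    """
    return resistance_ohm * capacitance_uf / 1000.0


# Settings taken from the protocol described in the text.
field = field_strength_kv_per_cm(voltage_kv=1.6, gap_cm=0.1)
tau = nominal_time_constant_ms(resistance_ohm=200.0, capacitance_uf=25.0)
print(f"Field strength: {field:.1f} kV/cm")
print(f"Nominal time constant: {tau:.1f} ms")
```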

Construction of Bdh variants for crystal structure assessment
Two primers, SV-161 and SV-162, were used to amplify the pEZ15Asp plasmid containing the Smbdh gene, such that sequences encoding a 6X-histidine tag and a Tobacco etch virus (TEV) protease cleavage site (ENLYFQG) were introduced, in that order, immediately following the start codon of the Smbdh gene. The PCR product was subjected to DpnI digestion followed by ligation using the In-Fusion Cloning protocol (Takara Bio USA, Inc., Mountain View, CA, USA) to obtain the final plasmid, henceforth referred to as "N-term SmBdh".
The Q247A SmBdh mutant was generated by amplifying the N-terminal His-TEV-Smbdh containing vector with primers SV-274 and SV-275 to introduce the Q247A mutation within the bdh coding fragment, while amplifying the entire plasmid. This PCR product was then subjected to DpnI treatment followed by recircularization using the In-Fusion Cloning protocol to obtain the final plasmid.
The double mutant, Q247A + V139Q SmBdh, was generated by amplifying the Q247A Smbdh plasmid with primers SV-279 and SV-280. Following DpnI treatment, the PCR product was self-ligated using the In-Fusion Cloning protocol to obtain the final plasmid.
The 11aa-Ins-SmBdh mutant was constructed by amplifying the entire N-term SmBdh plasmid with primers SV-277 and SV-276 so that the coding sequence for the 14 amino acids "SEAAGKPLGYGTET" of Enterobacter cloacae Bdh (EcBdh) replaced the coding sequence for 3 amino acids ("RDK") in the Smbdh gene. This PCR product was subjected to DpnI treatment followed by self-ligation using the In-Fusion Cloning protocol. Prior to cloning, the sequence encoding the 14 E. cloacae amino acids was codon-optimized for expression in Z. mobilis.
To introduce a 6X-histidine tag and a TEV protease cleavage site at the N-terminal end of the E. cloacae Bdh (N-term EcBdh), the untagged version of the Ecbdh gene was first amplified from a previously transformed Z. mobilis strain, BC-11B [15], using the primers SV-285 and SV-286. The vector fragment was obtained by amplifying the pEZ15Asp-Smbdh vector with primers SV-283 and SV-284, followed by DpnI treatment of the PCR product. The two fragments were ligated using the In-Fusion Cloning protocol to obtain the Ecbdh gene in the same expression vector, pEZ15Asp. The N-terminal histidine tag along with the TEV site was then introduced by amplifying the pEZ15Asp-Ecbdh vector with primers SV-293 and SV-294, followed by DpnI treatment of the PCR product and self-ligation by the In-Fusion Cloning protocol.
The 11 amino acid deleted version of the Ecbdh gene (11aa-del-EcBdh) was constructed by amplifying the entire N-term EcBdh plasmid with primers SV-302 and SV-303, except for the region encoding the 14 amino acid sequence ("SEAAGKPLGYGTET") of EcBdh, which was replaced with the region encoding the 3 amino acid sequence ("RDK") of SmBdh. The PCR product was subjected to DpnI treatment followed by circularization using the In-Fusion Cloning protocol.
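At the protein level, the two α6-helix constructs amount to a reciprocal segment swap between "RDK" and "SEAAGKPLGYGTET", which is why both are named for a change of 11 residues. The sketch below shows this on a toy sequence. The flanking residues are hypothetical placeholders, not the real SmBdh or EcBdh sequences, and the helper name is an assumption.

```python
SM_SEGMENT = "RDK"             # short segment from SmBdh (from the text)
EC_SEGMENT = "SEAAGKPLGYGTET"  # long segment from EcBdh (from the text)


def swap_segment(protein: str, old: str, new: str) -> str:
    """Replace a segment that must occur exactly once in the protein."""
    occurrences = protein.count(old)
    if occurrences != 1:
        raise ValueError(f"expected exactly one {old!r}, found {occurrences}")
    return protein.replace(old, new)


# Hypothetical flanking residues, used only to make the example runnable.
LEFT_FLANK = "MTLEA"
RIGHT_FLANK = "WVNGQ"

toy_sm = LEFT_FLANK + SM_SEGMENT + RIGHT_FLANK
toy_ins = swap_segment(toy_sm, SM_SEGMENT, EC_SEGMENT)   # mimics 11aa-Ins-SmBdh
toy_back = swap_segment(toy_ins, EC_SEGMENT, SM_SEGMENT)  # mimics 11aa-del-EcBdh

print(toy_ins)
print("Net length change:", len(toy_ins) - len(toy_sm))
```

Requiring exactly one occurrence guards against silently editing the wrong site, which matters for a motif as short as three residues.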

Bdh protein expression and purification
Following confirmation of the strains for the presence of the bdh genes, individual colonies were inoculated into 5 mL RMG medium containing 200 µg/mL spectinomycin and incubated at 30 °C for 2 days under shaking conditions (120 rpm). An aliquot of this pre-grown culture was then transferred to 20 mL RMG medium containing 200 µg/mL spectinomycin for small scale protein extraction and activity analysis. For large-scale protein purification studies, 750 mL cultures were started in duplicates.
In both cases, the starting OD600 of the cultures was adjusted to 0.025. Cultures were incubated at 120 rpm at 30 °C for 3 days. The final OD600 values were between 3 and 6 for the different cultures. Cultures were then transferred to ice prior to centrifugation at 6000 × g for 5 min to obtain cell pellets. Cells were immediately frozen in liquid nitrogen prior to further processing.
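Adjusting a culture to a starting OD600 of 0.025 is a simple dilution calculation (C1·V1 = C2·V2). The sketch below shows it for the two culture volumes mentioned above. The pre-culture OD600 of 5.0 is a hypothetical value chosen for illustration, and the function name is an assumption; the final volume is taken to include the inoculum.

```python
def inoculum_volume_ml(preculture_od: float, target_od: float,
                       final_volume_ml: float) -> float:
    """Volume of pre-culture needed so the final culture starts at target_od.

    Uses C1 * V1 = C2 * V2, assuming OD600 scales linearly with cell
    density over the range involved and that final_volume_ml includes
    the inoculum itself.
    """
    if preculture_od <= target_od:
        raise ValueError("pre-culture OD must exceed the target OD")
    return target_od * final_volume_ml / preculture_od


TARGET_OD = 0.025          # starting OD600 stated in the text
PRECULTURE_OD = 5.0        # hypothetical pre-culture OD600

for volume_ml in (20.0, 750.0):  # small-scale and large-scale cultures
    v = inoculum_volume_ml(PRECULTURE_OD, TARGET_OD, volume_ml)
    print(f"{volume_ml:.0f} mL culture: add {v:.2f} mL of pre-culture")
```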
For small-scale protein extraction, cells were thawed on ice and resuspended in 1 mL of 100 mM phosphate buffer (pH 7.0) containing protease inhibitor cocktail (Sigma-Aldrich Corp., St. Louis, MO, USA). Cells were disrupted by sonication using four cycles of 30 s pulses, with intermittent cooling on ice for 30 s between cycles. Cells were then subjected to bead beating for 10 min at 4 °C to ensure complete lysis. The protein extract was clarified by centrifugation at 13,000 × g for 10 min. Total protein content was estimated using the Bradford reagent protocol (Bio-Rad Laboratories, Inc., Hercules, CA, USA).
For large-scale protein extraction, frozen cell pellets were thawed and resuspended at room temperature in an equal volume of buffer (50 mM Tris pH 7.5, 100 mM NaCl and 10 mM imidazole) and lysed using lysozyme and vortexing with glass beads. Specifically, 1 mg/mL lysozyme (Hampton Research, Aliso Viejo, CA, USA), 1 U/mL Pierce Universal Nuclease (Thermo Scientific, Rockford, IL, USA) and EDTA-free protease inhibitor (Thermo Scientific, Rockford, IL, USA; used according to the manufacturer's instructions) were added, and the lysis mixture was incubated for 30 min at room temperature. Lysis was completed by vortexing with glass beads (0.1 mm diameter, 1:1 volume with the cell pellet) for 3-5 min, and cell debris was removed by centrifugation for 15 min at 22,000 × g. The resulting supernatant was loaded onto a 5-mL HisTrap FF crude column (GE Life Sciences, Piscataway, NJ, USA) using an Akta FPLC system (GE Life Sciences, Piscataway, NJ, USA) in 50 mM Tris pH 7.5, 100 mM NaCl and 10 mM imidazole. After loading and washing the unbound proteins from the column, the target proteins were eluted using 50 mM Tris pH 7.5, 100 mM NaCl and 250 mM imidazole. Minor impurities were removed by size exclusion chromatography using HiLoad Superdex 75 (26/60) (GE Life Sciences, Piscataway, NJ, USA) in 20 mM Tris pH 7.5 and 100 mM NaCl. Peaks corresponding to the Bdh samples were pooled and concentrated to 5-15 mg/mL. Protein concentration was determined from the absorbance at 280 nm, measured in a NanoDrop 1000, using the protein-specific extinction coefficients and molecular weights.
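Converting an A280 reading into a mass concentration follows the Beer-Lambert law, c = A / (ε·l), multiplied by the molecular weight. The sketch below shows the arithmetic. The extinction coefficient, molecular weight, and absorbance used in the example are hypothetical placeholders, not the values for SmBdh or EcBdh, and the function name is an assumption. The absorbance is assumed to be expressed for a 1 cm path length.

```python
def protein_conc_mg_per_ml(a280: float, ext_coeff_m_cm: float, mw_da: float,
                           path_cm: float = 1.0, dilution: float = 1.0) -> float:
    """Protein concentration in mg/mL from absorbance at 280 nm.

    ext_coeff_m_cm: molar extinction coefficient in M^-1 cm^-1
    mw_da:          molecular weight in Da (g/mol)
    dilution:       dilution factor applied before the measurement
    """
    if ext_coeff_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 * dilution / (ext_coeff_m_cm * path_cm)  # mol/L
    return molar * mw_da  # g/L, numerically equal to mg/mL


# Hypothetical example values for illustration only.
conc = protein_conc_mg_per_ml(a280=5.0, ext_coeff_m_cm=20000.0, mw_da=28000.0)
print(f"Concentration: {conc:.1f} mg/mL")
```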

Protein crystallization
Initial screening was done in a sitting drop vapor diffusion setup using a 96-well plate with Crystal Screen HT, PEG/Ion HT and Grid Screen Salt HT from Hampton Research (Aliso Viejo, CA, USA). Fifty microliters of well solution was added to the reservoir, with three drops made of 0.2 µL of well solution and 0.1, 0.2 or 0.3 µL of protein solution using a Phoenix crystallization robot (Art Robbins Instruments, Sunnyvale, CA, USA). The final crystals for all three structures were grown at 20 °C using an optimization screen with 0.1 M sodium malonate pH 6-7 and 6-15% w/v polyethylene glycol 3350 as the well solution. The protein solution for WT SmBdh contained 9 mg/mL of protein, 20 mM Tris pH 7.5, 100 mM NaCl, 20 mM NAD+ and 200 mM acetoin. The SmBdh Q247A protein solution consisted of 4.2 mg/mL of protein, 20 mM Tris pH 7.5, 100 mM NaCl, 20 mM NAD+ and 20 mM acetoin. The SmBdh Q247A + V139Q protein solution contained 14.5 mg/mL of protein, 20 mM Tris pH 7.5, 100 mM NaCl, 20 mM NAD+ and 20 mM acetoin.

Data collection and processing
Before data collection, the SmBdh crystals were flash frozen in a nitrogen gas stream at 100 K. Data were then collected using an in-house Bruker X8 MicroStar X-ray generator with Helios mirrors and a Bruker Platinum 135 CCD detector (Bruker AXS LLC, Madison, WI, USA). SmBdh Q247A and SmBdh Q247A + V139Q crystals were further protected from ice formation by briefly moving them into a well solution drop with 15% (v/v) ethylene glycol and 15% (v/v) glycerol for improved cryoprotection. The Bruker Suite of programs, version 2014-9 (Bruker AXS LLC, Madison, WI, USA), was used for data indexing and processing.

Structure solution and refinement
The CCP4 package of programs [48] was used for converting intensities into structure factors, for project tracking, and for access to the individual programs. Five percent of reflections were reserved for Rfree calculations using the programs F2MTZ, Truncate, CAD and Unique. The structure of WT SmBdh was determined via molecular replacement using MOLREP [49]. The initial model was built based on PDB entry 4ni5 with the FFAS03 search and ProtMod modeling servers [50][51][52] using the SCWRL method. The resulting model was refined in REFMAC5 [53] version 5.8.0258 and rebuilt and inspected using Coot version 0.8.0.2 [54]. Structures of the SmBdh Q247A mutant and the Q247A + V139Q double mutant were determined via molecular replacement as described above, using WT SmBdh as the model. The Ramachandran plot was calculated with MOLPROBITY [55], and root mean square deviations (r.m.s.d.) of bond lengths and angles were calculated using the ideal values of the Engh and Huber stereochemical parameters [56]. The Wilson B-factor was obtained from CTRUNCATE [48] version 1.0.11. All structures have been deposited in the Protein Data Bank with PDB codes 6XEW (WT SmBdh), 6VSP (SmBdh Q247A), and 6XEX (SmBdh Q247A + V139Q). Data collection and refinement statistics are shown in Table 3.

Bdh activity analysis
Bdh activity was assayed by following NADH oxidation at 340 nm for 5 min using a molar extinction coefficient of 6.22 mM⁻¹ cm⁻¹. The enzyme reaction was carried out in a total volume of 200 µL and contained 100 mM potassium phosphate buffer (pH 7.0), 0.25 mM NADH and 25 mM acetoin. The reaction was started by addition of the enzyme. All reactions were carried out in microtiter plates and monitored using a FLUOstar Omega plate reader (BMG Labtech GmbH, Ortenburg, Germany). Enzyme activities were represented as units per nmol of the purified protein. One unit of enzyme was defined as the amount of enzyme that consumed one micromole of NADH per minute. For total proteins, enzyme activities were represented as micromoles of NADH consumed per min per mg of total protein.
For phylogenetic screening of the Bdh enzymes, whole cell extracts obtained from the small-scale extraction procedure were used for enzyme assays. A range of concentrations from 1-20 µg total protein were tested for each protein extract. For the Bdh assays involving purified proteins (SmBdh and EcBdh), a range of protein concentrations from 0.05-1 µg were used.
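The conversion from a measured absorbance slope to a specific activity can be sketched as follows. The extinction coefficient (6.22 mM⁻¹ cm⁻¹) and reaction volume (200 µL) come from the assay description; the optical path length in the microtiter well, the absorbance slope, and the protein amount in the example are hypothetical, and the function names are assumptions. The path length of a 200 µL well is usually well below 1 cm and has to be measured or corrected for the plate in use.

```python
EPSILON_NADH_MM_CM = 6.22  # mM^-1 cm^-1 at 340 nm (from the assay description)


def nadh_rate_umol_per_min(delta_a340_per_min: float, path_cm: float,
                           volume_ul: float = 200.0) -> float:
    """NADH consumed per minute in the well, in micromoles.

    The slope is negative when NADH is oxidized, so its magnitude is used.
    mM/min equals umol/mL/min, so multiplying by the volume in mL gives umol/min.
    """
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    rate_mm_per_min = abs(delta_a340_per_min) / (EPSILON_NADH_MM_CM * path_cm)
    return rate_mm_per_min * (volume_ul / 1000.0)


def specific_activity(delta_a340_per_min: float, path_cm: float,
                      protein_ug: float, volume_ul: float = 200.0) -> float:
    """Specific activity in umol NADH consumed per min per mg protein."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    units = nadh_rate_umol_per_min(delta_a340_per_min, path_cm, volume_ul)
    return units / (protein_ug / 1000.0)


# Hypothetical reading: slope of -0.0622 A340/min, 0.5 cm path, 5 ug protein.
activity = specific_activity(-0.0622, path_cm=0.5, protein_ug=5.0)
print(f"Specific activity: {activity:.2f} umol/min/mg")
```

The same rate divided by nanomoles of purified enzyme, rather than milligrams of total protein, gives the units per nmol used for the purified proteins.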

Table 3 X-ray data collection and refinement statistics
Statistics for the highest resolution bin are shown in parentheses
a $R_{\mathrm{int}} = \sum_{hkl} \sum_{i} |I_i(hkl) - \langle I(hkl) \rangle| / \sum_{hkl} \sum_{i} I_i(hkl)$, where $I_i(hkl)$ is the intensity of an individual reflection and $\langle I(hkl) \rangle$ is the mean intensity of a group of equivalents; the sums are calculated over all reflections with more than one equivalent measured
b [56]
c [55]
WT SmBdh | SmBdh Q247A | SmBdh Q247A + V139Q

Data collection
Space group P4 3