In silico design
Substitutions were selected based on Rosetta energy units and on features associated with increased thermal stability, including increased buried hydrophobic surface area [22], improved quality of side-chain packing [23], increased surface charge of the tetrameric complex [24], decreased conformational flexibility, and improved packing at the symmetric dimer interface [25]. The computational approaches fall into four categories:
1. Single substitutions were identified with in silico site saturation mutagenesis, using the Rosetta protein modeling software [21]. The in silico experiments used either a fixed-backbone approximation, where the backbone coordinates from the X-ray crystal structure were held fixed in the simulation, or a minimization of backbone and side-chain torsion angles prior to evaluating amino acid identities at each position.
2. Previous work comparing thermophilic and mesophilic enzymes showed that the backbone of structurally equivalent clusters can accommodate alternate sequence combinations by adjusting an average of 3.5 Å for each Cα atom [23]. Here, clusters of interacting residues were determined using a distance cutoff, where residues were considered interacting when any side-chain heavy atoms were within 4 Å. Two clusters with poor side-chain packing were identified as design targets: one comprising G6, I97, I166, and A171, and the other comprising H21, M42, Q44, and Y46. These two clusters were designed by iterating between sequence design and minimization of backbone and side-chain torsion angles.
3. The PDC dimer is symmetric at the interface, which imposes a design constraint: any mutation made to one monomer must be mirrored in the bound partner. The symmetric design protocol simultaneously models the effects of an amino acid substitution at both positions at the interface [25].
4. Engineering a protein surface to have a high net charge can increase solubility and enhance thermal stability [24, 26]. Working from the hypothesis that supercharging the surface can stabilize the interactions between domains of the PDC complex, we identified substitutions that impart a high net charge while modeling the energetic consequences of the substitutions.
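The distance-cutoff clustering described in item 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that clusters are the connected components of the "interacting" graph (the text does not say how clusters were assembled), and the residue labels and coordinates are invented toy values.

```python
from itertools import combinations
from math import dist

CUTOFF = 4.0  # Å, the side-chain heavy-atom cutoff stated in the text


def interacting(atoms_a, atoms_b, cutoff=CUTOFF):
    """True if any pair of side-chain heavy atoms lies within the cutoff."""
    return any(dist(a, b) <= cutoff for a in atoms_a for b in atoms_b)


def clusters(residues, cutoff=CUTOFF):
    """Group residues into connected components of the interaction graph.

    residues maps a residue label to a list of (x, y, z) coordinates of its
    side-chain heavy atoms. Treating clusters as connected components is an
    assumption made for this sketch.
    """
    parent = {r: r for r in residues}

    def find(r):
        # Union-find lookup with path halving.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in combinations(residues, 2):
        if interacting(residues[a], residues[b], cutoff):
            parent[find(a)] = find(b)

    groups = {}
    for r in residues:
        groups.setdefault(find(r), set()).add(r)
    # Largest clusters first; members sorted for stable output.
    return sorted((sorted(g) for g in groups.values()), key=len, reverse=True)


# Hypothetical toy input: one pseudo-atom per residue, placed along a line.
toy = {
    "R1": [(0.0, 0.0, 0.0)],
    "R2": [(3.0, 0.0, 0.0)],
    "R3": [(6.5, 0.0, 0.0)],
    "R4": [(20.0, 0.0, 0.0)],
}
toy_clusters = clusters(toy)
```

In the toy input, R1 and R3 are more than 4 Å apart but end up in the same cluster because both contact R2, which is the behavior the connected-component assumption implies.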
We selected PDC1.01 and PDC1.10 as the foundations for the second round of protein design based on their observed enhancements in thermal stability. Because PDC1.01, which contained two glycine-removing mutations, was successfully stabilized, we targeted additional mutations of this type and identified four by selecting the next set of substitutions that would remove glycine residues (G491A, G515A, G516A, and G540A; Additional file 1: Table S1). Substitutions that removed a glycine residue underwent additional evaluation. Glycine, which has only a hydrogen atom at its side-chain position, can adopt phi–psi backbone angles that are not accessible to any other natural amino acid. For this reason, in silico mutations replacing a glycine residue were filtered on Rosetta energy scores. The Rosetta energy term p_aa_pp reflects the probability of finding an amino acid at given phi and psi dihedral angles, and mutations that removed a glycine residue were only considered if the p_aa_pp energy score was below 0.2. In addition, because glycine is the smallest amino acid, mutations that passed this filter were sorted by the Lennard–Jones repulsive energy term (LJrep) to ensure that no significant atomic clashes were present that might indicate the protein region could not accommodate the substitution (Additional file 1: Table S1). Glycine residues within 6 Å of the dimer active sites were not considered for redesign. We also avoided mutations introducing cysteine residues, as cysteine residues can participate in redox reactions [27].
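The glycine-replacement filter can be summarized in a short sketch. The thresholds (p_aa_pp below 0.2, more than 6 Å from the active sites, no cysteine) come from the text; the record layout, the field names, and every numeric score below are hypothetical placeholders, not values from Table S1 and not Rosetta output formats.

```python
P_AA_PP_MAX = 0.2        # p_aa_pp threshold stated in the text
ACTIVE_SITE_MIN_A = 6.0  # Å; glycines within this distance were excluded


def rank_glycine_substitutions(candidates):
    """Apply the three exclusion rules, then sort by repulsive energy.

    Each candidate is a dict with hypothetical keys:
      name, new_aa, p_aa_pp, lj_rep, active_site_dist
    """
    passed = [
        c for c in candidates
        if c["p_aa_pp"] < P_AA_PP_MAX
        and c["active_site_dist"] > ACTIVE_SITE_MIN_A
        and c["new_aa"] != "C"
    ]
    # Lowest Lennard-Jones repulsive term first, i.e. fewest clashes.
    return sorted(passed, key=lambda c: c["lj_rep"])


# Invented example records for illustration only.
example = [
    {"name": "mut_a", "new_aa": "A", "p_aa_pp": 0.05, "lj_rep": 1.2, "active_site_dist": 15.0},
    {"name": "mut_b", "new_aa": "A", "p_aa_pp": 0.35, "lj_rep": 0.3, "active_site_dist": 22.0},
    {"name": "mut_c", "new_aa": "A", "p_aa_pp": 0.10, "lj_rep": 0.4, "active_site_dist": 4.5},
    {"name": "mut_d", "new_aa": "C", "p_aa_pp": 0.01, "lj_rep": 0.1, "active_site_dist": 30.0},
    {"name": "mut_e", "new_aa": "S", "p_aa_pp": -0.30, "lj_rep": 0.6, "active_site_dist": 12.0},
]
ranked_names = [c["name"] for c in rank_glycine_substitutions(example)]
```

Here mut_b fails the backbone-angle filter, mut_c is too close to the active site, and mut_d introduces a cysteine, so only mut_e and mut_a remain, ordered by their repulsive term.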
A solvent-exposed mutation in PDC1.01, A189K, was identified as a potentially beneficial single mutation in an in silico point mutation scan, but it places a positively charged residue at the PDC surface and thus does not fit the goal of a negatively supercharged surface for PDC1.10. This mutation was subsequently removed in PDC2.02. A version of PDC2.02 that included the A189K mutation was characterized by CD to confirm that the mutation does not alter thermal stability, based on observed changes in molar ellipticity (Additional file 1: Figure S3).
Construction, cloning, and expression of PDC variants
DNA sequences for each PDC variant were codon optimized and cloned into a pET22b(+) vector (GenScript, Piscataway, NJ). The sequence for a hexahistidine tag was placed at the C terminus of the constructs. PDC protein variants were expressed either for 4 h at 37 °C or overnight at 25 °C with 0.1 mM IPTG in the BL21 (DE3) strain of Escherichia coli. Proteins were concentrated using Vivaspin spin columns with a molecular weight cutoff of 10,000 Da (GE Healthcare Life Sciences, Pittsburgh).
Protein purification
The frozen cell pellets were thawed at room temperature with an equal volume of buffer A (50 mM Tris pH 7.5, 100 mM NaCl, 10 mM imidazole, 0.1 mM thiamine pyrophosphate (TPP), 0.5 mM dithiothreitol (DTT) and 1 mM MgCl2) and lysed with lysozyme and sonication. Lysozyme at 1 mg/mL (Hampton Research, Aliso Viejo, CA), 1.0 U/mL Pierce Universal Nuclease (Thermo Scientific, Rockford, IL), and EDTA-free protease inhibitor (Thermo Scientific, Rockford, IL; used according to the manufacturer's instructions) were added to the lysis mixture, which was incubated for 30 min at room temperature with occasional vortexing. Sonication was done at room temperature for 2 min using a Branson 5510 water bath sonicator (Branson Ultrasonics Corporation, Danbury, CT). Cell debris was removed by centrifugation at 15,000×g for 15 min. The supernatant was loaded onto an 8 mL HisPur Cobalt column (Thermo Scientific, Rockford, IL) using an Akta FPLC system (GE Life Sciences, Piscataway, NJ) with buffer A (50 mM Tris pH 7.5, 100 mM NaCl, 20 mM imidazole, 0.1 mM TPP, 0.5 mM DTT, and 1 mM MgCl2). After loading and washing the unbound proteins from the column, PDC samples were eluted using 100% buffer B (50 mM Tris pH 7.5, 100 mM NaCl, 250 mM imidazole, 0.1 mM TPP, 0.5 mM DTT, and 1 mM MgCl2). Final purification was performed by size-exclusion chromatography using a HiLoad Superdex 200 (26/60) column (GE Healthcare, Piscataway, NJ) in buffer C (20 mM Tris pH 7.5, 100 mM NaCl, 0.1 mM TPP, 0.5 mM DTT, and 1 mM MgCl2).
Differential scanning calorimetry
Protein samples were analyzed by differential scanning calorimetry (DSC) using a Microcal VP-DSC instrument (Malvern, Worcestershire, UK) to measure the excess heat capacity of protein unfolding as a function of temperature. These measurements were used to directly calculate the enthalpy of unfolding (∆Hcal) for each protein sample according to the equation:
$$\Delta H_{\text{cal}} = \int_{T_{\text{f}}}^{T_{\text{u}}} C_{\text{p}} \, {\text{d}}T$$
Protein samples were measured over a temperature range of 10–90 °C at a scan rate of 60 °C/h. Feedback mode was not used in any DSC experiment. Buffer baseline scans were established before each protein run by loading both the sample and reference cells with buffer and performing 3 or more heating and cooling cycles until the deviation between scans was less than 0.01 mcal/min. Buffer was then removed from the sample cell and replaced with the protein sample while the temperature was between 25 and 15 °C during the cool-down cycle. All protein samples were diluted to 0.2 mg/mL in 20 mM Tris pH 7.5, 100 mM NaCl, 1 mM MgCl2, 0.5 mM DTT, 0.1 mM TPP buffer and run at that concentration in order to minimize post-unfolding aggregation.
DSC data for each sample were analyzed with the Origin 7.0 software [28] coupled to a DSC data analysis module provided with the Microcal VP-DSC. Sample scans were buffer subtracted, normalized to molar concentration, and corrected using the baseline correction options in the DSC data analysis module prior to least-squares analysis using a non-two-state model option. The fitted sample curves produced the melting temperature (Tm), enthalpy of unfolding (∆Hcal), and van’t Hoff enthalpy (∆HvH) for each transition.
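The calorimetric enthalpy integral can be approximated numerically once a scan has been buffer subtracted and baseline corrected. The sketch below uses the trapezoidal rule on a synthetic triangular peak; it is an illustration of the integral only, it does not reproduce the non-two-state fit performed in the Origin DSC module, and the peak shape and units are invented.

```python
def calorimetric_enthalpy(temps, cp_excess):
    """Trapezoidal estimate of the integral of excess Cp over temperature.

    temps:     temperatures in ascending order (°C or K, only differences matter)
    cp_excess: excess heat capacity at each temperature (e.g. kcal/mol/K)
    Returns the area in the corresponding enthalpy unit (e.g. kcal/mol).
    """
    if len(temps) != len(cp_excess) or len(temps) < 2:
        raise ValueError("need two or more matched (T, Cp) points")
    area = 0.0
    for i in range(1, len(temps)):
        width = temps[i] - temps[i - 1]
        area += 0.5 * (cp_excess[i] + cp_excess[i - 1]) * width
    return area


# Synthetic peak (not measured data): a triangle centered at 60 °C,
# 10 units high and 20 °C wide at the base.
scan_t = [float(t) for t in range(50, 71)]
scan_cp = [max(0.0, 10.0 - abs(t - 60.0)) for t in scan_t]
delta_h_cal = calorimetric_enthalpy(scan_t, scan_cp)
```

For this triangle the exact area is one half of base times height, so the estimate can be checked by hand against 100 enthalpy units.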
Circular dichroism spectroscopy and thermal melts
Circular dichroism (CD) spectra and thermal melt measurements were performed on a Chirascan-plus spectrometer (Applied Photophysics, Leatherhead, Surrey, UK) using a 0.5-mm path-length quartz cuvette. WT and mutant PDC proteins were buffer exchanged into a potassium phosphate buffer (10 mM potassium phosphate, 100 mM NaCl, 1 mM MgCl2, 0.1 mM TPP, 0.5 mM DTT, pH 7.5) and prepared to a final concentration of 0.2 mg/mL. CD spectra were measured at the stated temperatures with a step size of 0.25 nm. The thermal melts were performed in continuous ramp mode at a rate of 2 °C/min while measuring CD at 222 nm. The Tm of each protein was determined from the first derivative of the thermal melt curves using Prism 6 (GraphPad, La Jolla, CA). Each thermal melt experiment was performed in triplicate.
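Determining Tm from the first derivative amounts to finding the temperature at which the melt curve changes most steeply. The sketch below uses a plain finite difference on a synthetic sigmoidal curve; any smoothing or derivative settings used in Prism are not described in the text, so this is only an assumed, simplified stand-in, and the curve parameters are invented.

```python
import math


def tm_from_first_derivative(temps, signal):
    """Return the midpoint of the interval with the steepest slope.

    temps:  temperatures in ascending order (°C)
    signal: CD signal at 222 nm (any consistent unit) at each temperature
    """
    best_slope = 0.0
    best_t = None
    for i in range(1, len(temps)):
        slope = (signal[i] - signal[i - 1]) / (temps[i] - temps[i - 1])
        if abs(slope) > abs(best_slope):
            best_slope = slope
            best_t = 0.5 * (temps[i] + temps[i - 1])
    return best_t


# Synthetic two-state-like melt (not measured data), midpoint at 60.25 °C.
melt_t = [25.0 + 0.5 * i for i in range(131)]  # 25 to 90 °C in 0.5 °C steps
melt_cd = [-20.0 + 15.0 / (1.0 + math.exp(-(t - 60.25) / 2.0)) for t in melt_t]
tm_estimate = tm_from_first_derivative(melt_t, melt_cd)
```

With real data, noise usually makes some smoothing necessary before differentiating, and averaging the three replicate melts would follow this step.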
ThermoFluor high-throughput protein stability assay
The high-throughput ThermoFluor assay was used to evaluate the effects of the glycine-to-alanine substitutions G491A, G515A, G516A, and G540A. In the ThermoFluor assay, a hydrophobic dye, SYPRO Orange, binds to exposed hydrophobic regions of the protein. As proteins denature upon heating, more hydrophobic regions are exposed, resulting in an increased signal. The assay is performed in a real-time PCR machine using 96-well plates, allowing the simultaneous characterization of many protein variants. This assay has been used to rapidly screen protein mutants generated from computational design approaches [29, 30], and thus allowed us to evaluate the threshold for selecting these mutations for combinatorial libraries.
SYPRO® Orange Protein Gel Stain, supplied at 5000× concentrate in dimethyl sulfoxide (Thermo Scientific, Waltham, MA), was diluted to 200× in buffer (20 mM Tris pH 7.5, 100 mM NaCl, 0.1 mM TPP, 0.5 mM DTT, and 1 mM MgCl2). Wild-type PDC and variants were diluted to 5 μM in buffer, with protein concentrations determined by measuring absorbance at 280 nm. Extinction coefficients were calculated using the method described by Gill and von Hippel [31]. Samples of 50 μL were made by combining 45 μL of protein with 5 μL of 200× SYPRO® Orange stain. Protein variants were measured in triplicate, with the 50 μL samples placed in Hard-Shell® 96-well PCR plates with clear wells (BioRad, Hercules, CA). Plates were covered with MicroAmp® optical adhesive film (Thermo Scientific, Rockford, IL) to prevent sample evaporation. Spectra were obtained on a BioRad Real-Time C1000 Touch Thermal Cycler (BioRad, Hercules, CA). Thermal denaturations were performed by increasing the temperature from 25 to 95 °C at a rate of 1 °C/min, taking a plate read every minute using the FRET scan mode. Circular dichroism and ThermoFluor spectra were generated using IGOR Pro (WaveMetrics Inc., Lake Oswego, OR).
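The in-well concentrations implied by this mixing scheme follow from C1·V1 = C2·V2. The short sketch below works through the arithmetic using only the volumes and stock concentrations given in the text; the function and variable names are illustrative.

```python
def diluted(stock_conc, stock_vol_ul, total_vol_ul):
    """Concentration after bringing stock_vol_ul of stock to total_vol_ul."""
    return stock_conc * stock_vol_ul / total_vol_ul


TOTAL_UL = 50.0  # 45 µL protein + 5 µL dye

# Fold dilution needed to take the 5000x dye concentrate to the 200x working stock.
dye_stock_fold_dilution = 5000.0 / 200.0

# Dye in the well, expressed as a multiple of the 1x working strength.
dye_in_well_x = diluted(200.0, 5.0, TOTAL_UL)

# Protein in the well, in µM, starting from the 5 µM dilution.
protein_in_well_um = diluted(5.0, 45.0, TOTAL_UL)
```

Under these numbers each well holds 20× dye and 4.5 μM protein, with the concentrate having been diluted 25-fold to make the working stock.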
PDC activity assay
PDC activity was measured using an assay in which decarboxylation is coupled with alcohol dehydrogenase [32], and the conversion of NADH to NAD+ by the alcohol dehydrogenase was monitored for 5 min with a Varian Cary 400 (Agilent Technologies, Santa Clara, CA) temperature-controlled spectrophotometer at 25 °C. Four cuvettes, a blank and a triplicate of one sample of interest, were prepared and measured at a time. Each cuvette contained 1 mL of reaction mix and 20 µL of PDC at a suitable dilution in buffer D (20 mM BIS–Tris pH 6.5, 100 mM NaCl, 1 mM MgCl2, 0.1 mM TPP, 0.5 mM DTT), mixed using a 1 mL pipette. The reaction mix contained 20 mM Na pyruvate (Fisher Scientific, Fair Lawn, NJ), 0.288 mM NADH (Sigma Chemical Co., St. Louis, MO) and 40 U yeast alcohol dehydrogenase (MP Biomedicals LLC, Solon, OH) in buffer D. The blank contained only the reaction mix. Protein concentration was determined using the Bradford protein reagent with bovine serum albumin as the standard (BioRad, Hercules, CA).
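A rate of absorbance change can be converted into enzyme units with the Beer–Lambert law. The sketch below is one conventional way to do this and rests on assumptions the text does not state: that NADH was followed at 340 nm, that the usual extinction coefficient of 6.22 mM⁻¹ cm⁻¹ applies, that the path length was 1 cm, and that one unit is 1 µmol of NADH oxidized per minute. The volumes are those given in the text.

```python
EPS_NADH_MM = 6.22  # mM^-1 cm^-1 at 340 nm (assumed; wavelength not given in the text)


def pdc_activity_u_per_ml(rate_sample, rate_blank, dilution_factor,
                          reaction_vol_ml=1.0, enzyme_vol_ml=0.020,
                          path_cm=1.0):
    """Activity of the undiluted enzyme stock in U/mL.

    rate_sample, rate_blank: magnitude of absorbance decrease per minute
    dilution_factor:         fold dilution of the enzyme before the assay
    """
    total_vol_ml = reaction_vol_ml + enzyme_vol_ml
    # mM/min equals µmol per mL per minute.
    rate_mm_per_min = (rate_sample - rate_blank) / (EPS_NADH_MM * path_cm)
    umol_per_min = rate_mm_per_min * total_vol_ml
    return umol_per_min / enzyme_vol_ml * dilution_factor


def specific_activity(u_per_ml, protein_mg_per_ml):
    """Specific activity in U/mg from a Bradford protein concentration."""
    return u_per_ml / protein_mg_per_ml


# Hypothetical reading: 0.0622 absorbance units per minute, no blank drift.
example_u_per_ml = pdc_activity_u_per_ml(0.0622, 0.0, dilution_factor=1.0)
```

The one-to-one stoichiometry between pyruvate decarboxylated and NADH oxidized is what allows the NADH rate to stand in for the PDC rate, provided the alcohol dehydrogenase is in excess.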
Crystallization
PDC2.03 crystals were initially obtained with sitting drop vapor diffusion using a 96-well plate with Grid Screen Salt HT from Hampton Research (Aliso Viejo, CA). Fifty microliters of well solution was added to the reservoir, and drops were made with 0.2 µL of well solution and 0.2 µL of protein solution using a Phoenix crystallization robot (Art Robbins Instruments, Sunnyvale, CA). The crystals were grown in 0.1 M MES monohydrate pH 6.0 and 2.4 M ammonium sulfate at 20 °C. The protein solutions contained 6 mg/mL of protein in 20 mM Tris pH 7.5, 100 mM NaCl, 1 mM MgCl2, 0.5 mM DTT, and 0.1 mM TPP.
Data collection and processing
The PDC2.03 crystals were flash frozen in a nitrogen gas stream at 100 K before home source data collection using an in-house Bruker X8 MicroStar X-ray generator with Helios mirrors and Bruker Platinum 135 CCD detector. Data were indexed and processed with the Bruker Suite of programs version 2014.9 (Bruker AXS, Madison, WI).
Structure solution and refinement
Intensities were converted into structure factors, and 5% of the reflections were flagged for Rfree calculations using programs F2MTZ, Truncate, CAD, and Unique from the CCP4 package of programs [33]. The program MOLREP [34] version 11.4.06 was used for molecular replacement using wild-type PDC (PDB code 2WVG [35]) as the search model. Refinement and manual correction were performed using REFMAC5 [36] version 5.8.0155 and Coot [37] version 0.8.6. The MOLPROBITY method [38] was used to analyze the Ramachandran plot, and root-mean-square deviations (RMSD) of bond lengths and angles were calculated from ideal values of Engh and Huber stereochemical parameters [39]. The Wilson B-factor was calculated using CTRUNCATE version 1.15.10 [33]. The data collection and refinement statistics are shown in Additional file 1: Table S2.
Structure analysis
Coot [37], PyMOL (http://www.pymol.org), and ICM (http://www.molsoft.com) were used for comparing and analyzing structures. Figures 2a, 2b, and 5 were created using PyMOL. The root-mean-square deviation (RMSD) between the monomeric units of PDC and PDC2.03 was computed using PyMOL (The PyMOL Molecular Graphics System, Version 1.5.0.4, Schrödinger, LLC).
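For reference, the RMSD between two sets of paired atoms reduces to the formula sketched below. This is only the final averaging step: it assumes the two structures have already been superposed and their atoms matched one to one, whereas PyMOL performs the alignment itself, and the text does not say which PyMOL command or atom selection was used. The coordinates are toy values.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, pre-superposed coordinates.

    coords_a, coords_b: equal-length lists of (x, y, z) tuples in Å,
    ordered so that index i in one list corresponds to index i in the other.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return sqrt(total / len(coords_a))


# Toy example: every atom displaced by exactly 1 Å along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_rmsd = rmsd(set_a, set_b)
```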
DIC microscopy
Pyruvate decarboxylase (PDC) protein preparations were diluted in buffer to a concentration of 1 mg/mL. A 7.5 µL aliquot of protein solution was placed between two glass coverslips separated by a 0.15-mm-deep SecureSeal imaging spacer (Grace Bio-labs, Bend, OR). The sample was heated from 24 °C to either 60 or 120 °C at a ramp rate of 1 °C/min using a Linkam FTIR600 temperature-controlled microscope stage (Linkam Scientific Instruments, UK). The optics were set up for bright-field differential interference contrast imaging on a Nikon E800 microscope (Nikon, Tokyo, Japan) using a 20× 0.75 NA PlanApo ELWD objective. Images were captured every 30 s over 3 h using a SPOT RTKE CCD camera (Diagnostic Instruments, Sterling Heights, MI) and saved as TIFF stacks, which were analyzed using FIJI (ImageJ).
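When analyzing the image stacks, each frame can be assigned a nominal stage temperature from the ramp rate and capture interval. The sketch below assumes that the first frame coincides with the start of the ramp and that the stage holds at the final temperature once it is reached; neither detail is stated in the text.

```python
def frame_temperature(frame_index, start_c=24.0, end_c=60.0,
                      ramp_c_per_min=1.0, interval_s=30.0):
    """Nominal stage temperature (°C) for a frame in the TIFF stack.

    frame_index: zero-based index; frame 0 is assumed to be taken at ramp start.
    The temperature is capped at end_c, assuming the stage then holds.
    """
    elapsed_min = frame_index * interval_s / 60.0
    return min(start_c + ramp_c_per_min * elapsed_min, end_c)


# At 1 °C/min with a frame every 30 s, each frame advances the ramp by 0.5 °C.
t_frame_10 = frame_temperature(10)                  # during the 60 °C ramp
t_frame_192 = frame_temperature(192, end_c=120.0)   # end of the 120 °C ramp
t_frame_300 = frame_temperature(300)                # after the 60 °C ramp ends
```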