Selection of 10 representative ADOs
Previously, we compared the aldehyde-producing activity of AARs from eight representative cyanobacteria [24]. Here, we compared the activity of the ADOs from the same eight representative cyanobacteria: 73102ADO, 7421ADO, 7942ADO, 9313ADO, and TeADO, together with the ADOs from Synechocystis sp. PCC 6803 (6803ADO), Synechococcus sp. PCC 7336 (7336ADO), and Microcystis aeruginosa PCC 9443 (9443ADO) [24]. In addition, we used 7425ADO and PaADO because these ADOs have been reported to produce high amounts of hydrocarbons [28, 52]. Thus, a total of 10 ADOs were selected for the activity assays. A multiple sequence alignment of the 10 ADOs is shown in Fig. 1. The amino acid sequence identity among them is 69% on average (Additional file 1: Table S1).
A phylogenetic tree of cyanobacterial ADOs was constructed based on the ADO amino acid sequences (Additional file 2: Figure S1). There are two large groups (groups 1 and 2) and one small group (group 3) of ADO sequences in the phylogenetic tree, which is similar to that constructed using AAR amino acid sequences [24]. Group 1 contains 93 ADO sequences, which are mainly from freshwater cyanobacteria, including Nostoc punctiforme PCC 73102, Synechococcus elongatus PCC 7942, Planktothrix agardhii NIVA-CYA 126/8, Synechocystis sp. PCC 6803, Microcystis aeruginosa PCC 9443, Cyanothece sp. PCC 7425, and Thermosynechococcus elongatus BP-1. Group 2 contains 31 ADO sequences, which are mainly from marine cyanobacteria, including Prochlorococcus marinus MIT 9313. Group 3 contains 10 ADO sequences from both marine and freshwater cyanobacteria, such as Synechococcus sp. PCC 7336 and Gloeobacter violaceus PCC 7421, respectively. The amino acid sequences of the ADOs from group 3 are distinct from those categorized into groups 1 and 2 (Additional file 1: Table S1).
ADO activity
ADO activity towards the C16 and C18 aldehydes was measured with an in vivo activity assay that we previously developed [22, 24] because in vitro activity measurement was precluded by the low solubility of the substrates (see above). Thus, we coexpressed 7942AAR, which has the highest activity among the various AARs [24], with one of the 10 selected ADOs in E. coli to produce hydrocarbons. In this method, ADOs that exhibit higher hydrocarbon-producing activity should have higher catalytic efficiency, higher affinity for 7942AAR, and/or more efficient coupling with an external reducing system. A GC–MS profile of the extract from the E. coli cell culture coexpressing AAR and ADO showed predominant amounts of heptadecene and pentadecane and a small amount of heptadecane (Additional file 3: Figure S2), consistent with previous studies [22, 24]. The predominant production of heptadecene (an alkene) is probably related to the fact that most acyl-ACPs are desaturated during fatty acid synthesis in E. coli [55].
A control experiment showed that fatty aldehydes and the corresponding fatty alcohols can be extracted using ethyl acetate (Additional file 4: Figure S3a, b). Thus, the GC–MS profiles of the extracts from the E. coli cell cultures coexpressing AAR and ADO showed peaks of fatty aldehydes and fatty alcohols, in addition to those of alkanes/alkenes (Additional file 4: Figure S3d). However, because the peaks for aldehydes and alcohols had extremely low intensities and partially overlapped with those from E. coli cells (Additional file 4: Figure S3c, d), it was not possible to accurately quantify their amounts and therefore to use them to estimate ADO activity.
Figure 2a shows the amounts of pentadecane, heptadecene, and heptadecane and their total amounts in E. coli cell cultures coexpressing 7942AAR and the ADO from one of the 10 representative cyanobacteria. Because 73102ADO had been previously considered to have the highest activity [11], the amounts of hydrocarbons relative to those produced by 73102ADO are shown in Fig. 2. The actual amounts of produced hydrocarbons are shown in Additional file 5: Table S2. The total amounts of hydrocarbons produced in E. coli were the highest for TeADO, followed by, in descending order, 7421ADO, 73102ADO, 7425ADO, 9313ADO, 7942ADO, 9443ADO, PaADO, 6803ADO, and 7336ADO. The amounts of hydrocarbons produced by 6803ADO and 7336ADO were less than 20% of those produced by TeADO.
Factors affecting the amounts of hydrocarbons produced in E. coli are ADO activity and the amount of the soluble form of ADO. Therefore, the total amount of hydrocarbon in the E. coli cell culture, normalized to the amount of the soluble form of ADO, was used as an index of ADO activity. The amount of the soluble form of ADO was quantified by western blotting (Fig. 3a; Additional file 6: Figure S4) because western blotting and SDS-PAGE have been successfully used to quantify the amounts of soluble and insoluble forms of proteins [56,57,58]. The 7425ADO protein migrated more slowly than the other ADOs. This may be because the 7425ADO protein has the highest number of negative charges at pH 8.8 among the ADOs used in this study (see Additional file 7: Table S3), as abnormally slow migration by SDS-PAGE has been previously reported for proteins that have many negative charges [59, 60]. The western blotting results showed that there were large differences in the amount of the soluble form among the various ADOs. We found that the amount of the soluble protein was high for 7421ADO, TeADO, and 7336ADO but low for PaADO, 6803ADO, and 7942ADO (Fig. 3a; Additional file 6: Figure S4). Notably, the variations in the amounts of soluble ADOs were not due to handling errors associated with western blotting because the reproducibility of the data was confirmed by triplicate experiments, as shown in Additional file 6: Figure S4a; the error bars for quantifications of the amounts of soluble ADO and AAR are shown in Fig. 4 and Additional file 6: Figure S4b, respectively.
Figure 2b shows the hydrocarbon-producing activity of ADO expressed as the total amount of hydrocarbons in the E. coli cell culture (Fig. 2a) divided by the amount of the soluble form of ADO (Fig. 3a). The activities relative to that of 73102ADO are shown. The hydrocarbon-producing activity was the highest for 7942ADO, followed by, in descending order, 6803ADO, 9313ADO, 73102ADO, PaADO, 7425ADO, 9443ADO, TeADO, 7421ADO, and 7336ADO (Fig. 2b). Although the amount of the soluble form of ADO for 7421ADO, TeADO, and 7336ADO was high, their activities were low.
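The normalization described above can be illustrated with a minimal sketch. The enzyme names ADO_X and ADO_Y and all numerical values are hypothetical placeholders, not measurements from this study; only the arithmetic (total hydrocarbon divided by soluble ADO, then expressed relative to 73102ADO) follows the text.

```python
# Hypothetical relative values (73102ADO set to 1.0); NOT data from this study.
hydrocarbon = {"73102ADO": 1.00, "ADO_X": 0.50, "ADO_Y": 1.10}
soluble_ado = {"73102ADO": 1.00, "ADO_X": 0.25, "ADO_Y": 2.20}


def activity_index(hydrocarbon, soluble, reference):
    """Total hydrocarbon per unit of soluble ADO, relative to a reference ADO."""
    raw = {name: hydrocarbon[name] / soluble[name] for name in hydrocarbon}
    return {name: value / raw[reference] for name, value in raw.items()}


activity = activity_index(hydrocarbon, soluble_ado, reference="73102ADO")

# List the ADOs in descending order of the activity index.
for name, value in sorted(activity.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f}")
```

In this toy example, ADO_Y yields the most hydrocarbon but has the lowest activity index because it is the most abundant in soluble form, which mirrors the situation described for 7421ADO, TeADO, and 7336ADO.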
Additional file 6: Figure S4b shows that the amount of soluble AAR ranged from 80 to 175% when the value of E. coli coexpressing 7942AAR and 73102ADO was used as a reference. However, this variation does not affect the measurement of ADO activities. We have previously shown that the rate-limiting step in the biosynthesis of hydrocarbons in E. coli coexpressing 7942AAR and 73102ADO is the reaction catalyzed by ADO [24]. Moreover, when the amount of the soluble AAR protein was reduced to 1/3 (33%), the amount of hydrocarbon produced by 73102ADO was decreased to 1/2 [24]. If the aldehyde supply is proportional to the amount of soluble AAR, this result indicates that when the amount of soluble AAR is reduced to 2/3 (66%), the efficiency of substrate supply by AAR becomes the same as that of hydrocarbon production by 73102ADO. Thus, if the amount of soluble AAR is larger than 80%, the aldehyde substrates supplied are sufficient for a hydrocarbon production level that is 1.2-fold higher than that of 73102ADO (80%/66% ≈ 1.2). Figure 2a shows that the levels of hydrocarbons produced by all ADOs were less than 1.2-fold that of 73102ADO. Therefore, in the present study, the aldehyde supply from AAR was sufficient for estimating ADO activities, and normalizing the amount of hydrocarbon to the amount of soluble AAR was unnecessary when estimating ADO activity.
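The arithmetic behind this argument can be written out explicitly. The sketch below assumes that the aldehyde supply is proportional to the amount of soluble AAR and that hydrocarbon output is capped by the smaller of the supply and the ADO capacity; this simple linear model is our reading of the reasoning above, not a fitted kinetic model.

```python
from fractions import Fraction

# Observation from the text: at 1/3 of the reference soluble AAR, hydrocarbon
# production by 73102ADO dropped to 1/2 (supply-limited regime).
aar_reduced = Fraction(1, 3)
hydrocarbon_reduced = Fraction(1, 2)

# Assumed linear model: supply = supply_per_aar * (relative soluble AAR),
# in units where the hydrocarbon level of 73102ADO at 100% AAR equals 1.
supply_per_aar = hydrocarbon_reduced / aar_reduced

# AAR level at which supply just matches production by 73102ADO (output = 1).
break_even_aar = 1 / supply_per_aar

# Lowest soluble AAR level observed among the coexpression strains (80%).
min_aar = Fraction(4, 5)
max_supported_level = supply_per_aar * min_aar

print(f"break-even AAR level: {float(break_even_aar):.2f}")
print(f"supported hydrocarbon level at 80% AAR: {float(max_supported_level):.2f}")
```

Under these assumptions, any ADO producing less than the supported level relative to 73102ADO remains ADO-limited rather than supply-limited.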
Solubility and protein expression level
The solubility and protein expression levels of the ADOs in E. coli were quantified by western blotting (Fig. 3b, c). Here, the protein expression level was determined as the total amount of the soluble and insoluble forms of ADO, while solubility was estimated as the ratio of the amount of the soluble form to the total amount of the soluble and insoluble forms of ADO. We found that solubility was highest for TeADO (88%), followed by, in descending order, 9443ADO, 7336ADO, 7942ADO, 9313ADO, PaADO, 6803ADO, 7425ADO, 73102ADO, and 7421ADO (Fig. 3b). The high solubility of TeADO may be explained by the fact that proteins derived from thermophilic bacteria have high stability [61]. In contrast, the solubility of 7421ADO was the lowest (70%). This may be due to the markedly high expression level of the 7421ADO protein (Fig. 3c). Nonetheless, the amount of the soluble form of 7421ADO, which corresponds to the product of the protein expression level and solubility, was the highest among the ADOs (Fig. 3a).
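The definitions used here can be summarized in a short sketch. The band intensities are hypothetical numbers in arbitrary units, chosen only so that the solubility equals 70%; they are not measurements from this study.

```python
def expression_and_solubility(soluble, insoluble):
    """Return (expression level, solubility) from band intensities.

    expression level = soluble + insoluble
    solubility       = soluble / (soluble + insoluble)
    """
    total = soluble + insoluble
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return total, soluble / total


# Hypothetical western blot intensities (arbitrary units).
soluble_band, insoluble_band = 7.0, 3.0
expression, solubility = expression_and_solubility(soluble_band, insoluble_band)

# The amount of soluble ADO is recovered as expression level x solubility.
soluble_amount = expression * solubility
print(f"expression = {expression:.1f}, solubility = {solubility:.0%}, "
      f"soluble amount = {soluble_amount:.1f}")
```

Because the soluble amount is the product of the two quantities, a protein with the lowest solubility can still show the highest soluble amount if its expression level is sufficiently high, as noted for 7421ADO.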
Substrate specificity
For all 10 ADOs examined in this study, the major product was heptadecene (Fig. 2c), indicating that these ADOs have a similar substrate specificity. However, there were slight differences in the substrate specificities of the 10 ADOs (Fig. 2d). 9443ADO and 6803ADO had relatively higher substrate specificities for the 18-carbon fatty aldehyde than the other ADOs because the amount of pentadecane was less than 30% of the total amount of produced hydrocarbon (Fig. 2c). In contrast, the amount of pentadecane was more than 40% of the total amount of hydrocarbon produced with TeADO and 9313ADO, which are derived from freshwater and marine cyanobacteria, respectively (Fig. 2c), indicating that these ADOs had a relatively higher substrate specificity for the 16-carbon fatty aldehyde. Among the three types of detected hydrocarbons, heptadecane exhibited the lowest production (Fig. 2a). Nevertheless, the level of heptadecane produced by 9313ADO was ~2.5 times higher than that produced by 73102ADO (Fig. 2d).
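The product composition used as a measure of substrate specificity can be computed as in the following sketch. The hydrocarbon amounts are hypothetical, and the 30% and 40% cut-offs are simply the values quoted in the text, applied here as illustrative labels rather than as a formal classification.

```python
def product_fractions(amounts):
    """Fraction of each hydrocarbon in the total produced hydrocarbon."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total hydrocarbon amount must be positive")
    return {name: amount / total for name, amount in amounts.items()}


def preference_label(pentadecane_fraction):
    """Illustrative label based on the pentadecane fraction (C16 aldehyde product)."""
    if pentadecane_fraction < 0.30:
        return "relatively C18-preferring"
    if pentadecane_fraction > 0.40:
        return "relatively C16-preferring"
    return "intermediate"


# Hypothetical amounts in arbitrary units; NOT data from this study.
example = {"pentadecane": 25.0, "heptadecene": 65.0, "heptadecane": 10.0}
fractions = product_fractions(example)
label = preference_label(fractions["pentadecane"])
print(fractions, label)
```

Pentadecane derives from the 16-carbon aldehyde, whereas heptadecene and heptadecane derive from 18-carbon aldehydes, so the pentadecane fraction alone is enough to compare the relative preference between the two chain lengths.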
Correlation analysis
Correlation analysis of activity, solubility, the protein expression level, the amount of hydrocarbon, and the amount of soluble ADO measured for the 10 representative ADOs was performed (Additional file 8: Figure S5). Strong correlations were not observed, except for a correlation between the protein expression level and the amount of soluble ADO. However, clear correlations between ADO properties were observed when the data from the 7421ADO mutants were used (see below).
We also carried out correlation analyses between the above properties and the amino acid sequence identities. To determine whether the activity was higher for the ADOs with amino acid sequences that were more similar to that of the most active ADO, we plotted the ADO activities against the sequence identities with 7942ADO, which exhibited the highest activity (Additional file 9: Figure S6a). There appears to be a positive correlation, but when the data point for 7942ADO is omitted, the correlation disappears, indicating that the activity level of ADO is not determined by the overall similarity of amino acid sequences. Similarly, solubility, the relative protein expression level, the relative amount of hydrocarbon, and the relative amount of soluble ADO do not correlate well with the sequence identities with TeADO, 7421ADO, TeADO, and 7421ADO, respectively, which have the highest values of the corresponding properties (Additional file 9: Figure S6b–e). These results indicate that local differences in amino acid sequences, which cannot be inferred from the phylogenetic tree of ADOs, determine these ADO properties, suggesting that a limited number of non-conserved residues determine the activity level of an ADO.
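The effect of a single self-comparison point on such a correlation can be illustrated with a small sketch. The identities and activities below are invented toy values, and the Pearson coefficient is used as one plausible measure of correlation; the text does not state which statistic was used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data: the first point is the reference enzyme compared with itself
# (100% identity, highest activity). Values are hypothetical.
identity = [100.0, 60.0, 65.0, 70.0, 75.0]
activity = [3.0, 1.0, 0.8, 1.1, 0.9]

r_all = pearson_r(identity, activity)
r_without_reference = pearson_r(identity[1:], activity[1:])
print(f"r (all points) = {r_all:.2f}")
print(f"r (reference omitted) = {r_without_reference:.2f}")
```

A high coefficient that collapses once the reference point is removed indicates that the apparent trend is carried by that single point, which is the situation described for 7942ADO.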
Mutational analysis of non-conserved residues of 7421ADO
As shown above, hydrocarbon-producing activity is diverse among the various ADOs. This difference can be ascribed to residues that are not completely conserved among ADOs. To identify the non-conserved residues that are essential for ADO activity, we introduced mutations at the non-conserved residues of the less active ADO (7421ADO) to make its amino acid sequence more similar to that of the highly active ADO (7942ADO). The sites for mutation were selected using the multiple sequence alignment shown in Fig. 1, in which the amino acid sequences of the ADOs are shown in descending order of activity. Eighty-four ADO residues are completely conserved. Among the non-conserved residues, including partially conserved residues, 7421ADO and 7942ADO have the same amino acids at 56 positions, which were excluded from the group of candidates for mutational studies. In addition, positions with the same amino acids as the highly active 7942ADO or the less active 7336ADO were excluded from the group of candidates. Thus, we selected 37 positions in 7421ADO and replaced them with the corresponding residues in 7942ADO one at a time (Fig. 1). We then coexpressed each of the 37 mutants of 7421ADO with 7942AAR in E. coli and measured the amounts of produced hydrocarbons and soluble ADO, as well as the activity, substrate specificity, solubility, and protein expression level of the ADO (Fig. 4).
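The first two selection filters can be sketched as follows. The sequences are short invented strings, not the ADO sequences in Fig. 1; the function name and the numbering by ungapped position in the target sequence are assumptions made for illustration. The additional exclusion based on 7942ADO and 7336ADO described above is not implemented here.

```python
def candidate_mutations(alignment, target, template):
    """List substitutions (e.g. 'A2S') that make `target` more like `template`.

    Columns are skipped when they are completely conserved across the
    alignment, or when target and template already share the same residue.
    Positions are numbered along the ungapped target sequence (assumption).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")

    candidates = []
    position = 0
    for column in range(lengths.pop()):
        target_res = alignment[target][column]
        if target_res == "-":
            continue  # gap in the target: no residue to mutate
        position += 1
        if len({seq[column] for seq in alignment.values()}) == 1:
            continue  # completely conserved column
        template_res = alignment[template][column]
        if template_res == "-" or template_res == target_res:
            continue  # nothing to substitute
        candidates.append(f"{target_res}{position}{template_res}")
    return candidates


# Invented toy alignment ('-' marks a gap); NOT real ADO sequences.
toy_alignment = {
    "low_activity": "MAR-EL",
    "high_activity": "MSRKQL",
    "other": "MARKEL",
}
mutations = candidate_mutations(toy_alignment, "low_activity", "high_activity")
print(mutations)
```

Each returned label follows the convention used for the mutants in this study: the original residue, its position, and the residue introduced from the template.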
As expected, mutations affected the hydrocarbon-producing activity of 7421ADO. Twenty out of 37 mutations improved ADO activity (Fig. 4b). Among them, four mutations increased the activity for total hydrocarbons by more than 40% (I179L, E88S, R46Q, and D115K mutations, in descending order of activity). Six other mutations increased the activity by more than 17% (R192L, Q102R, V203E, V121L, H49F, and Q165E). However, the amount of the soluble form of ADO was decreased in these 10 mutants (Fig. 4d). These results are consistent with the fact that 7942ADO has high activity but yields low amounts of the soluble protein (Figs. 2, 3). Ten other mutations increased the activity by up to 17% but maintained more than 85% of the amount of soluble ADO (R171N, E202R, R181E, T174A, A184R, L53N, Q108K, A63R, P18A, and L161R) (Fig. 4b, d).
In contrast, 17 mutations in 7421ADO decreased the activity while increasing or maintaining the amount of soluble ADO (Fig. 4b, d). Among them, 13 mutations retained more than 80% of the wild-type activity while increasing the amount of soluble protein (I189L, K235R, G224E, A98K, G178E, V233I, D93M, R97Q, Y99F, M230T, A196E, I211L, and A17E). However, three mutations (R239Y, A50D, and V175S) decreased the activity by more than 20%.
Substrate specificity was almost unchanged by the mutations, although the fraction of pentadecane was slightly decreased for the mutants having the 10 highest activities (Fig. 4c). The results are consistent with the fact that all representative ADOs have similar specificity for the substrates (Fig. 2c).
In addition to their effects on activity, the mutations also affected the amounts of hydrocarbons produced by ADO. Among the 37 mutants examined, the R181E mutant showed the highest yield of total hydrocarbon, which was 60% higher than that of the wild-type 7421ADO (Fig. 4a). The next highest yields, with a more than 20% increase in hydrocarbon production, were observed for the Q108K, A17E, A196E, A98K, and G224E mutants of 7421ADO, in descending order. However, the activities of these mutants were not markedly increased; in particular, A17E and A196E had activities of only ~0.8-fold that of the wild type (Fig. 4b). Instead, the amount of the soluble form of ADO was increased by more than 20% (Fig. 4d). In particular, A17E showed the highest amount of soluble ADO among all mutants (57% increase compared with that of the wild type), followed by A196E and R181E.
The amount of the soluble form of ADO depends on both the solubility and protein expression level of ADO. A decrease in solubility was observed for the mutants with high activities (Fig. 4e). For some mutants, the protein expression level was increased by more than 50%, especially for R181E (2.2-fold greater than that of the wild type) (Fig. 4f). This increase in the protein expression level resulted in the higher amount of soluble ADO (Fig. 4d).
Correlation analysis among the properties of 7421ADO mutants
We performed a correlation analysis among the properties measured for the 37 mutants of 7421ADO (Fig. 5; Additional file 10: Figure S7). The amounts of total hydrocarbons produced in E. coli cells coexpressing AAR and ADO should be directly related to both the activity of the ADO and the amount of the soluble form of ADO. We found that there was a good positive correlation between the amount of total hydrocarbon and the amount of the soluble form of ADO (Fig. 5a). In contrast, a clear correlation was not observed between the amount of hydrocarbon and activity (Fig. 5b).
The amount of the soluble form of ADO should be directly related to both the solubility and protein expression level of the ADO. In fact, the amount of the soluble form was positively correlated with both the solubility and protein expression level of the ADO (Fig. 5c, d). In contrast, the activity of the ADO was negatively correlated with both the amount of soluble ADO and solubility (Fig. 5e, f) but not with the protein expression level of the ADO (Additional file 10: Figure S7c).
The correlation between the amount of the soluble form and solubility (Fig. 5c) may seem to contradict the results for the R181E mutant, which showed higher amounts of the soluble protein but had lower solubility than the wild-type 7421ADO. However, this may be due to the twofold higher expression level of the protein relative to wild-type 7421ADO expression, and the solubility of R181E may not be correctly estimated. When this data point is omitted from Fig. 5c, a more significant correlation is observed between solubility and the amount of soluble ADO. Similarly, the correlation between the solubility and the amount of hydrocarbon produced by the ADO becomes more significant when the data point for R181E is omitted (Additional file 10: Figure S7a).