Zymomonas diversity and potential for biofuel production

Background Zymomonas mobilis is an aerotolerant α-proteobacterium, which has been genetically engineered for industrial purposes for decades. However, a comprehensive comparison of existing strains on the genomic level in conjunction with phenotype analysis has yet to be carried out. We here performed whole-genome comparison of 17 strains including nine that were sequenced in this study. We then compared 15 available Zymomonas strains for their natural abilities to perform under conditions relevant to biofuel synthesis. We tested their growth in anaerobic rich media, as well as growth, ethanol production and xylose utilization in lignocellulosic hydrolysate. We additionally compared their tolerance to isobutanol, flocculation characteristics, and ability to uptake foreign DNA by electroporation and conjugation. Results Using clustering based on 99% average nucleotide identity (ANI), we classified 12 strains into four clusters based on sequence similarity, while five strains did not cluster with any other strain. Strains belonging to the same 99% ANI cluster showed similar performance while significant variation was observed between the clusters. Overall, conjugation and electroporation efficiencies were poor across all strains, which was consistent with our finding of coding potential for several DNA defense mechanisms, such as CRISPR and restriction–modification systems, across all genomes. We found that strain ATCC31821 (ZM4) had a more diverse plasmid profile than other strains, possibly leading to the unique phenotypes observed for this strain. ZM4 also showed the highest growth of any strain in both laboratory media and lignocellulosic hydrolysate and was among the top 3 strains for isobutanol tolerance and electroporation and conjugation efficiency. Conclusions Our findings suggest that strain ZM4 has a unique combination of genetic and phenotypic traits that are beneficial for biofuel production and propose investing future efforts in further engineering of ZM4 for industrial purposes rather than exploring new Zymomonas isolates. Supplementary Information The online version contains supplementary material available at 10.1186/s13068-021-01958-2.

knowledge required to enable rapid and effective engineering of Z. mobilis for these features.
One approach to improve biofuel-producing organisms is to take advantage of naturally occurring genetic diversity within a species or genus, as has been successful in yeast [2]. Comparative genomics is now starting to be utilized in Zymomonas and several Z. mobilis genomes were compared in a recent analysis [3]. This comparison showed a high degree of similarity between Zymomonas genomes, although different strains had different plasmid sets. This is in line with an earlier phenotypic characterization which showed relatively low diversity, and only one species (mobilis) within the genus Zymomonas [4]. However, the previous studies did not combine genome sequence analysis with phenotypic analysis, therefore, it is difficult to make inferences about gene function from their datasets. We hypothesized that by expanding the set of available genomes in conjunction with performing physiological characterization of all accessible strains, new connections between specific genes, operons, and/ or plasmids and phenotypic biofuel-relevant traits could be revealed.
To better understand Z. mobilis as a platform organism, we combined a comparative genomics approach with physiological characterization. Specifically, we compared the genomes of 17 strains, including 8 which had previously been sequenced [3,[5][6][7][8][9][10][11] and 9 which were sequenced at the Joint Genome Institute for this study (Table 1). Specific sequencing information, such as read coverage for each genome sequenced in this study is in Additional file 1: Table S1. Here, strains are defined as separate isolates or derivates that have been deposited to strain collections under unique identifiers. The strains were originally isolated from several locations around the world, across North and South America, Africa, and Europe (Additional file 1: Table S2). The strains were primarily obtained from fermentations intended to produce alcohol for human consumption. One strain (PROIMI A1) was isolated specifically for research purposes [12]. In some cases, the Zymomonas was a normal component of the fermentation and in other cases (European isolates) it was considered a contaminant [13]. While Zymomonas strains have previously been found in environmental samples, including bees, we were not able to obtain any such isolates [14]. Future studies may be able to expand knowledge of the Zymomonas genus by focusing on strains not isolated from human-associated fermentations. We compared the strains for their growth rates in rich media, tolerance of lignocellulosic hydrolysate and isobutanol, conjugation and transformation efficiencies, and flocculation characteristics.
We obtained and physiologically characterized as many of the Zymomonas strains as possible using three publicly available culture collections; the American Type Culture Collection (ATCC), the German Collection of Microorganisms and Cell Cultures (DSMZ), and the

Table 1 Information on Zymomonas genomes
Common strain names were used in the table and in the text. Other names used by different strain collections are listed in Additional file 1: Table S2 for crossreference. For strains with multiple scaffolds, NCBI Acc # for the largest contig is shown; smaller contigs have consecutive numbers. Contigs recognized as plasmids are listed in Additional file 1: Table S3 with specific accession numbers. SRA accession numbers are provided for the genomes sequenced for this study USDA Agricultural Research Service Culture Collection (NRRL). Two strains, NCIMB11163 and B-12526, were not readily available from culture collections, resulting in a total of 15 strains for phenotyping.

Phylogenetic analysis
We explored the phylogenetic relationships of all 17 Zymomonas strains by first constructing a concatenated marker gene tree using a set of 56 universal single copy markers [15]. Based on this tree, most Zymomonas isolates were highly similar, with a slight branching of the pomaceae and francensis subspecies in the marker gene tree (Fig. 1a). To increase resolution, a single nucleotide polymorphism (SNP) tree was constructed, which showed increased branching patterns between the genomes [16]. We analyzed the branching pattern together with two separate clustering approaches, hierarchical Bayesian clustering (hierBAPS) [17,18] and 99% average nucleotide identity (ANI) clustering (Fig. 1b, c). The hierBAPS approach placed the isolates into 4 distinct genome clusters, while 9 clusters resulted from the 99% ANI clustering. Based on 99% ANI clustering, most of the 17 sequenced Zymomonas strains can be classified into four multi-strain clusters, which we refer to as CP1 (ANI_99 cluster 4), CP4 (ANI_99 cluster 2), Z6 (ANI_99 cluster 3), and Drainas (ANI_99 cluster 1) (Table 1; Fig. 1). Each group was named for the most commonly studied strain within the group, except for the Drainas group, which was named for the research group that generated three of the four strains. Five strains did not cluster with any other strains based on 99% ANI, including Z. mobilis subsp. pomaceae, Z. mobilis subsp. francensis, PROIMI A1, NCIMB11163, and ZM4. Throughout this work, we use ZM4 as the comparator strain, and group PROIMI A1, francensis, pomaceae and NCIMB11163 together in a 'none' group. In general, the clusters align with our expectations based on the history of each strain as described below. The CP1 group consists of two strains, CP1 and B-1960. Strain CP1 was isolated from fermenting sugarcane juice in Recife, Brazil [19] and strain B-1960 was isolated from spoiled beer by H.J. Bunker of Barclay Perkins & Co., Ltd ( Table 1). The CP4 group includes two strains that were isolated from the same market as CP1 (CP3 and CP4), and a flocculating variant of CP4 developed at Oak Ridge National Lab, B-12526 [20]. Three additional strains fall within this large branch but do not fall within one of the clustering groups, PROIMI A1, NCIMB11163, and ZM4. Strain PROIMI A1 was isolated in Argentina from fermenting sugarcane juice. PROIMI A1 has not been widely studied, but previous work shows that it flocculates readily and has plasmids that can be repurposed for genetic modification purposes [12]. NCIMB11163 was isolated from ale in England in the 1970s [21]. The origin of strain ZM4 is somewhat unclear, because it is either a derivative or contaminant of CP4. After several decades of research on CP4, research groups realized that not all strains designated CP4 were identical and designated ZM4 as a separate group because of differing plasmid profiles [22]. Despite the history of the two strains, there are significant differences in their genome sequences [7]. In our analysis, ZM4 did not cluster with either the CP1 or CP4 groups, despite the expectation that they would be close relatives based on strain history. The Z6 group is formed by three strains; two that were isolated from fermenting palm sap in Zaire (Z6 and B-23394), and a third, B-4492, which is listed in the NRRL catalog as identical to strain B-1960 (from the CP1 group) [23]. We observed that strain B-4492 is phenotypically and genetically much more similar to the strains originating in Zaire than to the CP1 group. We suggest that strain B-4492 is likely an isolate from Zaire and therefore include it with the Z6 group.
The Drainas group is formed by the type strain, ATCC10988, and three derivative strains created by mutagenesis and selection by Dr. Constantin Drainas and coauthors. ATCC10988 was isolated from fermenting agave sap in Mexico in 1924 [4]. Strain CU1 was developed by incubating ATCC10988 with acridine orange and was reported to have a decreased tolerance to glucose concentrations ≥ 20% [24]. Strain CU1rif2 is a derivative of CU1 developed by further treatment with acridine orange and screening for rifampicin resistance [25]. Strain uvs51 is a derivative of CU1rif2 and was developed by mutagenizing with MNNG and screening for UV sensitivity [26]. Based on the relationships between these strains, we expect all three derivative strains to be sensitive to high glucose concentrations and CU1rif2 and uvs51 to be resitant to rifampicin. All strains showed the expected phenotypes (Additional file 1: Figure S1).
Z. mobilis subsp. pomaceae and Z. mobilis subsp. francensis were the most genetically distinct from the rest of the Zymomonas strains. Z. mobilis subsp. francensis strains were isolated from contaminated cider in France in 2001 [27]. These strains were designated as a separate subspecies based on differences in proteome (as observed by SDS-PAGE) and differences from subsp. mobilis and subsp. pomaceae in sucrose fermentation and tolerance of bile salts. Similarly, Z. mobilis subsp. pomaceae strains were isolated from contaminated cider in England [13].

Zymomonas mobilis pangenome
The pangenome of the Zymomonas strains was assessed by assigning genes to ortholog clusters and generating a rarefaction curve by plotting the number of unique orthologue groups as a function of total gene content, with a step size of 100 genes (Fig. 2a). A Venn diagram was also generated to visualize the number of ortholog clusters shared or unique among the four hierBAPS clusters (Fig. 2b). There were 2,564 unique gene families and while the rarefaction curve appears to approach an asymptote, the number of unique gene families is still increasing, suggesting that there are more unique gene families to be discovered when increasing the number of sampled genomes, i.e., the pangenome may be open. The Venn diagram does display however, that most unique gene families are observed in the core (62%, dark blue), suggesting that while we do not yet have a closed pangenome, these Zymomonas genomes have a relatively small accessory component.

Zymomonas mobilis plasmids
Beyond chromosome sequences, PacBio sequencing revealed multiple circular plasmids in 6 of 9 genomes analyzed. In two strains (CP1 and CU1), only small linear contigs were found, which may represent incompletely sequenced or linear plasmids. CP3 was assembled in 22 linear contigs, some of which may represent plasmids. The new and previously identified plasmids and linear contigs are summarized in Table 2 and accession numbers for all contigs and topology determination are included in Additional file 1: Table S3. Each group of strains appears to have a common set of plasmids, which share over 99.97% ANI. There is little homology between plasmids from different groups. Interestingly, three out of four plasmids of ZM4 share partial homolgy with plasmids from two other groups: CP4 and Drainas. Namely, 64% of p32.8 has 96% ANI to p32.3 of ATCC10988 and 78 and 61% of p36.5 and p39.3 share 98% ANI with p36.9 and p32.4 of the CP4 group, respectively. It is possible that ZM4 evolved from CP4 but developed a unique set of plasmids through conjugation with other Zymomonas strains during serial passage in the laboratory, which would explain its unusual history.
One of the most important features found on extrachromosomal elements is the Type I restriction-modification (R-M) system, encoded by hsdRMS genes. At least one such system was found on plasmids in all four groups and in ZM4 and pomaceae (highlighted in Table 2). Consistent with the ANI between the plasmids, BLASTp confirmed almost 100% amino acid identity between the R-M subunits (Restriction, Methylation, and Specificity) within the same group. Interestingly, although R and M subunits encoded on p32.8 of ZM4 show 99% amino acid identity to respective proteins encoded on p32.3 of ATCC10988, the putative S subunit is annotated as a hypothetical protein and has only 39% homology to respective subunit of ATCC10988. Similarly, although R and M subunits of the R-M system in pomaceae show 95% amino acid identity to subunits on CP4′s p30.9, the S subunit has only 38% identity to the same subunit of CP4. Hence, the specificity factor of Type I R-M system seems to evolve faster than endonuclease and methylase, possibly as a result of adaptation to new foreign DNA sequences present in a changing environment. Except for those in ZM4 and pomaceae, no hsdRMS genes were found on plasmids present in strains outside of the four groups. It is noteworthy that ZM4 was the only strain with hsdMS genes found on the chromosome without a corresponding hsdR identified.  Table S3. BLASTn of whole plasmids sequences was used to determine their ANI. Plasmids shared by more than one strain in the group were assigned the same colors, i.e., all blue-labeled plasmid were shared by multiple strains in the CP4 group. Homologous plasmids (> 99% ANI) are vertically aligned, i.e., the blue-labeled plasmids in the ZM4 row are homologous to the corresponding blue-labeled plasmids in the CP4 row. Plasmids with no significant homology to other plasmids are shown in black. Plasmids with annotated hsdRMS genes are highlighted (see details in text). ND-not detected 1 hsdMS and truncated hsdR with 100% aa identity to hsdRMS on linear contig 34.5 of B-1960

Comparison of growth in rich medium
We compared growth of all strains in rich medium in anoxic conditions to determine growth rates (Fig. 3).
Grouping the growth curves based on the phylogenetic tree reveals similarities within the groups. ZM4 and strains in the Z6 group have the fastest growth rates. Strains in the CP1 and CP4 groups grew more slowly than ZM4 and Z6. The type strain and its derivatives in the Drainas group show that two of the mutants have a growth defect compared with the parent strain. Of the strains that could not be classified, the francensis subspecies grew more slowly than all other strains. The pomaceae subspecies also grew slowly. The exponential growth rates and doubling times for the growth curves were calculated by fitting logistic curves to the data in Fig. 3 and are reported in Table 3. Strains in the Z6 group along with ZM4 and CP4 showed the highest growth rates and shortest doubling times. Flocculation of some strains ( Fig. 3b) caused instability in the apparent growth rates, making the doubling time an unreliable summary statistic for some of these growth curves. Therefore, we also report the area under the curve after 20 h (AUC 20 ) as a summary of growth, which takes into account both the growth rate and the final density [28]. ZM4 and Z6 group strains showed the highest growth as measured by AUC 20 , the CP1, CP4 and Drainas groups showed intermediate growth, and the unclassified strains showed variable growth with francensis showing the lowest growth ( Table 3).

Comparison of growth in lignocellulosic hydrolysate
Because of significant costs associated with purifying sugars from lignocellulosic materials, ideal biofuel-producing strains should have the capability to grow directly in lignocellulosic hydrolysates. However, these hydrolysates can be stressful environments for microbes, due to high sugar concentrations, lignotoxins such as furfural or phenolic aldehydes, and low pH due to organic acids [29][30][31]. Therefore, we measured how well each Zymomonas strain tolerated lignocellulosic hydrolysate during anaerobic growth. All strains except the francensis subspecies were able to grow in lignocellulosic hydrolysate, although some required a relatively dense  Table 3 Growth statistics for Zymomonas strains calculated from data in Fig. 3 Logistic curves were fit to the measured growth curves using the R package 'growthcurver' . The calculated exponential growth rate (r), doubling time, and area under the curve after 20 h (AUC 20  inoculum (initial OD 600 = 0.5) to enable growth and ethanol production (Figs. 4 and 5). The requirement of some strains for a dense inoculum to grow may reflect higher sensitivity of these strains to toxins present in hydrolysate. Previous work with Escherichia coli demonstrated that increased inoculation density improved growth in the presence of furfural and phenolic aldehydes [32]. Strain ZM4 grew to a higher final OD 600 than any of the other strains. The Drainas group provides interesting comparisons to examine genes relevant to hydrolysate tolerance because of the variation in hydrolysate tolerance despite very similar genomes. Within the Drainas group, the derivatives of ATCC10988 did not grow well in hydrolysate and required a high inoculum, unlike their parent strain. CU1 and CU1rif2 could not consume all glucose in the hydrolysate even at the higher inoculum level. These strains' previously documented intolerance of high glucose concentrations may explain the poor growth of the mutant strains on hydrolysate (6% glucose in the hydrolysate used here). Genome comparison between the four strains reveals that all three derivatives lost a set of 41 genes present in the parent strain (Additional file 1: Table S4). Of these, 37 were annotated as hypothetical proteins, 3 had general protein family annotations and one was annotated as a ribosomal protein. It seems likely that some of these 41 genes contribute to glucose tolerance, but at the   Fig. 4 and the initial hydrolysate were analyzed by HPLC for a glucose, b xylose, c ethanol, and d acetate concentrations. Cultures were inoculated to a starting OD 600 of 0.1 (triangles) or 0.5 (circles). Points represent the average of three biological replicates and error bars represent standard error current level of annotation of the Zymomonas genomes, it is not possible to propose a mechanism. These 41 genes are good candidates for future investigation of glucose tolerance in Zymomonas.
ZM4, the CP1 group, the CP4 group, ATCC10988, PROIMI A1, and pomaceae were able to consume all glucose in the hydrolysate and convert it to ethanol with high efficiency, regardless of inoculation density (Fig. 5). These strains also produced 2 to 8 mM acetate during growth on hydrolysate (initial hydrolysate contained 38.45 ± 0.05 mM acetate). Strains in the Z6 group consistently required a high inoculum to grow in hydrolysate and convert glucose to ethanol in this condition. One possible cause of the lower hydrolysate tolerance of strains in the Z6 group is that none of the strains contain a homolog of ZM4 ZM0101. This locus encodes a NAD-dependent epimerase/dehydratase and is important for hydrolysate tolerance in ZM4 [30]. The francensis subspecies appeared completely intolerant of hydrolysate and did not grow or significantly ferment glucose at either inoculum level.
All strains except the pomaceae subspecies were incapable of consuming xylose in the hydrolysate. The pomaceae subspecies consumed ~ 10 mM or ~ 4.7% of the xylose in the hydrolysate. However, it did not produce more ethanol than ZM4 and rather produced slightly more acetate. Significant efforts have been made to engineer ZM4 and CP4 to ferment xylose, therefore a native xylose fermentation pathway in the Zymomonas genus would be of significant interest in the field [33][34][35]. Although the pomaceae genome contains over 200 genes that do not have homologs in the other Zymomonas genomes, none of these could be clearly linked to xylose metabolism via current annotations. It is also possible that the xylose consumption phenotype results from changes in regulation of genes found in other organisms, considering that ZM4 can be evolved for xylose utilization without expression of heterologous genes [35]. Although the pomaceae subspecies in general does not have optimal characteristics as a biofuel producer, its native xylose utilization capability should be explored further.

Isobutanol tolerance
Isobutanol (IBA) is an advanced biofuel being pursued for production in several platforms [36][37][38]. Therefore, we tested the tolerance of the Z. mobilis strains to this compound. We first determined the half maximal inhibitory concentration (IC 50 ) of IBA with strain ZM4 to develop a protocol for assaying IBA tolerance. We grew ZM4 with a range of IBA concentrations and calculated area under the curve after 8 h of growth (AUC 8 ). We used the AUC values to calculate an observed IC 50 of ~ 1% for ZM4 grown in rich medium in anoxic conditions (Fig. 6a). We then compared the tolerance of all strains to 0.5% and 1.0% IBA by measuring the AUC at 8 h (Fig. 6b). All strains were able to grow in the presence of 1% IBA, with AUC 8 , ranging from ~ 40 to 80% of the control without isobutanol. Strains ZM4, Z6, francensis, and pomaceae showed the highest relative growth with 1% IBA, although the overall growth of the francensis and pomaceae strains was low at 8 h (Additional file 1: Figure  S2). Strains in other groups showed similar tolerance to IBA with Drainas group being the least tolerant. Considering the relatively small differences in isobutanol tolerance, it was not possible to connect specific genes with improved isobutanol tolerance at this time. Because there was an unexpected increase in AUC 8 for strain Z6 growth with 0.5% isobutanol, we repeated growth of ZM4 and Z6 with isobutanol in culture tubes. We observed that Z6 did not actually have increased growth with isobutanol and the result in the initial experiment was likely due to an inoculation error (Additional file 1: Figure S3).

Genome methylation analysis and DNA defense systems
The new genomes presented here were sequenced using a PacBio platform that also detects methylation at each base. The methylation sites for the new genomes were analyzed to predict which types of DNA methylation machinery are present in each strain. All 9 of the newly sequenced genomes showed greater than 99% methylation at GANTC sites, characteristic of the CcrM (Cell cycle regulated Methyltransferase) system, which has been well-studied in Caulobacter crescentus and other α-proteobacteria. This methylation pattern is consistent with prior REBASE predictions for Zymomonas based on the presence of CcrM system genes [39]. The modification methylase (ZMO1005) has been found in all 17 sequenced genomes, with amino acid identity to the ZM4 gene of 99-100% in subspecies mobilis, 95% in francensis, and 82% in pomaceae. CcrM is involved in cell cycle regulation, including transcriptional regulation of cell-cycle control genes, and GANTC sites are overrepresented in intergenic regions in C. crescentus [40]. Further study of the Z. mobilis CcrM system is warranted, considering that recent analysis suggests that Z. mobilis may contain up to 100 copies of the genome per cell [30,41].
Beyond the GANTC methylation site, several methylation patterns consistent with restriction-modification (R-M) systems were detected ( Table 4). Sites that were methylated with a frequency greater than 99.9% were usually associated with Type I or Type II R-M systems. In some strains, a few highly similar sequences were detected. These are denoted as 'variable' in Table 4, and all sequence variants are provided in Additional file 1: Table S5. Methylated sequences were shared within the groups of strains (at least 4 within the Drainas group, and two within the Z6 group), probably resulting from the activity of hsdM methylases on the homologous plasmids described above. Some of the Type I methylation sequences were predicted earlier, but only for a few of the previously sequenced genomes (CP4, ZM4 and pomaceae, REBASE [39]).Thus, our methylation analysis together with annotation of Type I R-M genes to specific plasmids is an important addition to the current knowledge of restriction-modification systems in Z. mobilis.
Beyond the Type I and Type II R-M systems detected by methylation analysis, Zymomonas strains have additional DNA defense mechanisms in the form of Type IV restriction and CRIPSR/Cas systems. In 2011, Kerr et al. [43] identified a putative Type IV restriction endonuclease coding gene, mrr (ZMO0028) in ZM4, based on conserved domains with known Mrr proteins using BLASTp. Inactivation of mrr by insertion of a chloramphenicol resistance gene (cam) resulted in increased transformation efficiency in Z. mobilis ZM4, suggesting that the Mrr protein was responsible for cleaving foreign DNA [43,44]. Mrr homologues were found in all sequenced genomes except subspecies pomaceae and are all chromosomally located. Sequence identity to the ZM4 gene was 100% with Drainas group, NCIMB 11163 and CP1, 99% with PROIMI A1, 97% with CP4 and 82% with francensis. CRISPR systems have also been detected in Zymomonas previously. Z. mobilis ZM4 has a functional Type I-F CRISPR/Cas system, with Cas3 nuclease/helicase [3]. This CRISPR system has been successfully utilized for genome editing in ZM4 [45]. We observed Cas3 homologs on the chromosomes of all sequenced genomes of subspecies mobilis. Sequence identity with ZM4 Cas3 is 98% in the CP1 group, PROIMI A1 and NCIMB 11163, 97% with the CP4 and Drainas groups, and 95% with the Z6 group. Z. mobilis pomaceae appears to have Type I-E and I-C CRISPR systems on the chromosome and a 37 kb plasmid, respectively [3]. Z. mobilis francensis has many genes associated with CRISPR systems, but no homolog to the Cas3 of ZM4 has been found.

Electroporation and conjugation efficiency
To begin to link the presence of DNA defense systems with physiology, we measured the efficiency of electroporation and conjugation of a plasmid into each of the Z. mobilis strains and normalized the efficiency to that of ZM4 (Table 5). We utilized a plasmid (pRL814) that was previously constructed from standard synthetic biology parts for use in Z. mobilis [46,47]. pRL814 carries a spectinomycin resistance cassette and an IPTG-inducible gfp. We found that most strains were not easily transformed with this plasmid by electroporation. We successfully obtained colonies for only 5 strains, ZM4, Z6, B-4492, CP4, and CU1rif2. Two strains in the Z6 group had slightly higher electroporation efficiencies than ZM4, while CP4 and CU1rif2 had significantly lower electroporation efficiencies. Conjugation of the same plasmid from E. coli WM6026 into these strains was more successful. We obtained transconjugants carrying the plasmid for 12 out of 15 strains tested. In the case of conjugation, ZM4 clearly had the highest efficiency, with all other strains having a 10-to 1000-fold lower electroporation efficiencies. Overall, the ZM4 and Z6 groups were the most amenable to DNA uptake, with Z6 strains having slightly higher electroporation efficiency and ZM4 having significantly higher conjugation efficiency. We did not successfully introduce a plasmid into the francensis and pomaceae strains by either method. Specific method development for these subspecies is likely necessary to enable their engineering. Recent efforts to enhance the transformability of ZM4 by removing restriction-modification systems have been highly successful and a modified ZM4 strain has been developed with 3 R-M systems and the CRISPR-Cas system inactivated. This strain has a much higher conjugation efficiency than the parent strain with several plasmids tested (personal communication, Patricia Kiley, University of Wisconsin, Madison). Overall, the low conjugation and electroporation efficiencies we observed are consistent with the large number of genome defense systems detected in all genomes. Because a high level of DNA defense appears to be a common trait across the genus, further modification to improve ZM4 as a chassis appears to be a more promising strategy than screening for strains with a higher natural competency.

Variability in cellulose synthase operon and promoter
Z. mobilis strains have a tendency to flocculate, a trait that can be beneficial for bioprocessing [20,48], but challenging for genetic modification and other laboratory protocols [49]. Floc formation by Z. mobilis ZM4, CP4 and their flocculating variants is driven by the production of extracellular cellulose fibrils, which become entangled and bind cells into flocs up to 4 mm in length [50,51]. Several potential flocculation triggers have been reported, including high ethanol concentration, minimal medium, and oxygen, but overall, regulation of flocculation remains unclear [52,53]. We identified three strains with naturally flocculating phenotypes: two strains in the CP1 group, and Z. mobilis subspecies francensis (Fig. 3c). To better understand the regulation of floc formation, we compared flocculating strains and ZM4 under a range of growth conditions, varying oxygen availability, shaking, and medium type (Fig. 7). The final OD 600 of cultures grown in different conditions is shown in (Additional file 1: Figure S4. To measure flocculation, we fully resuspended stationary phase cultures by harsh vortexing, allowed them to re-flocculate and settle for 45 min, then measured the decrease in OD 600 in the top 100 µL after settling. A greater % decrease in OD 600 indicates more settling, and therefore, a greater degree of flocculation. Although PROIMI A1 was previously described as a flocculating strain, it formed only delicate, easy-to-disperse flocs under these experimental conditions, therefore it was not included in this analysis. We observed that ZM4 exhibited little flocculation under any tested condition, although flocculation has been observed for this strain previously [53]. Overall patterns of flocculation were similar between the other three strains, with increased flocculation in rich medium and static cultures. Oxygen did not appear to be a major driver of flocculation, except in the case of the francensis subspecies, for which oxygen exposure promoted flocculation. The greater effect of mechanical disruption than oxygen or medium suggested that the flocculating phenotypes were constitutive and probably determined by major changes in cellulose synthesis rather than more subtle changes at the regulatory level. To ensure that floc formation of all strains was due to cellulose fibril synthesis, we also treated cultures of CP1, B-1960, and francensis with cellulase and found that this completely dispersed the flocs, indicating that all strains tested here use the same flocculation mechanism as other Zymomonas strains (data not shown). Cellulose synthesis is required for floc formation in Z. mobilis ZM4, CP4, and flocculating variants and production of cellulose fibrils is performed by the cellulose synthase complex encoded by the bcs operon [50,51]. Strains bearing Tn5 insertions in bcs genes or with the entire operon deleted are not able to form flocs during aerobic growth in rich or minimal media [30,49,53]. Several highly flocculent strains have been described previously: PROIMI A1, ATCC 31822 (ZM401) and B-12526, a flocculating derivative of CP4. However, little work has been done to identify the genetic differences leading to the flocculating phenotype. Using Multiple Sequence Alignment (Clustal Ω), we compared bcs operons and their regulatory regions for 18 genomes to find differences between flocculating and non-flocculating strains. We added ZM401, which was not included in the general genome comparison, to this analysis.
Two transcription start sites (TSS) for the bcs operon at − 26 and − 23 have been identified upstream of ZMO1080 in ZM4 [54]. We identified and analyzed regulatory elements upstream of the TSS (Additional file 1: Figure S5) and found that a 177-nt region upstream of ZMO1080 was mostly invariable within Z. mobilis mobilis strains. Single nucleotide substitutions in the Z6 group seem irrelevant for flocculation and a single-nucleotide deletion in PROIMI A1 is unlikely to affect regulation (Additional file 1: Figure S5C). Subspecies francensis and pomaceae showed significant variability from Z. mobilis mobilis and each other within the region. TSSs of the bcs operon in these strains must be identified separately.
Overall, it appears unlikely that differences in the promoter region contribute to differences in flocculation between strains.
There are 6 genes in the bcs operon of Z. mobilis: two coding for hypothetical proteins (ZMO1080, ZMO1081) and four genes with known functions (bcs-ABCZ, ZMO1083-1086). BcsA encodes a catalytic subunit anchored in the inner membrane with the catalytic center and regulatory PilZ domain in the cytosol. BcsA requires c-di-GMP bound to the PilZ domain for activity. BcsB is a periplasmic subunit anchored in the inner membrane. The BcsAB complex is found in all bacterial cellulose synthase operons and is indispensable for cellulose synthesis. BcsC is composed of an N-terminal periplasmic domain and C-terminal β-barrel forming a channel in the outer membrane for cellulose secretion. The N-terminal domain has multiple TPR (tetratricopeptide repeat) domains and is thought to organize function of the entire complex. Finally, BcsZ is a gluconase, which is required for proper formation of fibrils. ZMO1080 and ZMO1081 are highly conserved within Zymomonas mobilis and ZMO1081 shares homology with bcsQ from other Proteobacteria. ZMO1080 appears to be specific for Z. mobilis. Given their high conservation, these two genes are likely important for cellulose synthase function, but their roles have yet to be determined.
Unlike the promoter regions, bcsA genes were much more variable within Zymomonas. Clustal Ω alignment revealed significant length and sequence variability of Bcs proteins in pomaceae and francensis compared to Fig. 7 Flocculation of Zymomonas strains. Identical volume of the same colony resuspended in ZMMG was used to inoculate 5 ml of ZMMG or ZRMG. Cultures were grown for 48 h statically or with shaking in aerobic or anaerobic conditions as described in "Materials and methods". After this time flocs were dispersed by vortexing, and OD 600 was measured by removing 100 μl from the top of the culture to 0.9 ml of medium immediately and after 45 min of standing on a bench. Results shown are average of three independent experiments. Points represent the average of three biological replicates and error bars represent standard error Z. mobilis mobilis and the bcsA gene was not annotated in pomaceae. Thus, although we identified francensis as a flocculating strain, we were not able to attribute this phenotype to any specific gene within the bcs operon. Protein alignment of BcsA within the mobilis subspecies identified differences specific for one or more of the flocculating strains (Additional file 2: Figure S6). In PROIMI A1, a 465-nucleotide deletion at the junction of bcsA and bcsB results in truncated forms of both proteins. Adding the sequence of BcsA from Rhodobacter sphaeroides (for which the 3D crystal structure has been resolved [55]) to the alignment showed that the truncated BcsA lacks part of the PilZ domain that participates in binding c-di-GMP (Additional file 3: Figure S7). It seems likely that activity of cellulose synthase in PROIMI A1 has been deregulated by this deletion, possibly resulting in the constant but weak floc formation.
Sequence comparison of 26 genes differentially expressed in ZM401 compared to ZM4, did not reveal any differences in bcsA or bcsB genes or their upstream regions [56]. However, in independent research Xia et al. [50] found a single-nucleotide deletion (T) in bcsA of ZM401 as compared to ZM4 and suggested that bcsA401 was fused to the upstream gene. This would result in a longer protein with an additional transmembrane helix compared with BcsA of ZM4. Our protein alignment of BcsA from ZM401 to other Z. mobilis mobilis strains showed a protein of the same length as in the other 12 strains (Additional file 2: Figure S6), revealing that rather than a T deletion in ZM401, there is a T insertion in ZM4. We confirmed one T insertion after A172 in bcsA from ZM4 by Sanger sequencing. Thus, ZM4 bcsA is a pseudogene as annotated in NCBI data base. It is uncertain why ZM4 can still flocculate in certain conditions. Because the additional T was inserted within a stretch of 8 thymidine nucleotides, it is possible that a ribosomal shift or RNA polymerase slippage can occasionally lead to synthesis of full-length BcsA in ZM4, thus providing some cellulose synthesis apparently sufficient to cause flocculation under specific conditions [53].
As described above, we found that two strains from subspecies mobilis, B-1960 and CP1, showed strong flocculation in all tested conditions. Protein alignment identified four single amino acid substitutions in BcsA that are specific for these two strains (Y408H, W452S, T519I, and F645V, Fig S5). The first three residues are conserved in Zymomonas mobilis and corresponding residues of R. sphaeroides BcsA (W418, L462, T528, Additional file 3: Figure S7) are located in transmembrane helices where they may assist in passage of cellulose through the inner membrane. It is possible that these mutations affect efficiency of cellulose export and result in increased flocculation. BcsB showed less variability across Zymomonas except that in B-12526 it is a pseudogene and in PROIMI A1 it is truncated from N-terminus (∆61 aa). As both these strains are naturally flocculating, it is possible that these changes contribute to flocculation.
In three flocculating strains (CP1, B-1960 and ZM401), we identified insertion of two serines in BcsC (S818-819), resulting in seven serines in a row versus four or five in other strains (Additional file 4: Figure S8). This stretch of serines is located within the periplasmic N-terminal part of BcsC. P. aeruginosa AlgK, a functional homolog of the BcsC N-terminal domain, organizes the function of the entire Bcs secretion complex [57]. It is possible that in Zymomonas the N-terminus of BcsC has a similar function and thus mutations within this region may lead to altered cellulose transport and subsequent flocculation.
Overall comparison of the Bcs operons and their promoter regions across 18 Z. mobilis strains showed that the promoter region is highly conserved while genes encoding cellulose synthase subunits display significant changes, including gene fusions, pseudogenes, and amino acid substitutions. Some changes within the bcs genes were specific to the flocculating strains, at least in one case affecting a regulatory domain, possibly leading to the flocculation phenotype appearing across all conditions. Our analysis suggests that flocculation is regulated primarily at the post-translational level in Z. mobilis and efforts to implement transcriptional control of this trait are not likely to be successful.

Regulation of flocculation by diguanylate cyclases/ phosphodiesterases
Bacterial Bcs complexes are positively regulated by cyclic di-GMP. Four GG(D/E)EF motif-containing diguanylate cyclases have been identified in ZM4 [53]. Two were previously linked to flocculation phenotypes; a Tn5 insertion in a diguanylate cyclase (ZMO0919) abolished flocculation of ZM4 in oxic minimal media while another diguanylate cyclase, ZMO1055 was upregulated in ZM401 as compared to ZM4. The single amino acid substitution A525V in the latter was proposed to cause flocculation [56]. We compared sequences of both genes and their promoter regions to see if there are any differences, which could lead to higher activation of BcsA in flocculating strains. There were two TSS reported at − 27 and − 38 of ZMO0919 gene and TSS at -119 of ZMO1055 [54]. Similar to the bcs operon, we found that the regulatory regions are highly conserved across subspecies mobilis with just one nucleotide deletion in two strains in the Drainas group. This mutation seems irrelevant to flocculation as we did not observe any differences between the four strains in this group. Other single-nucleotide substitutions are group-specific (Z6 or CP4) and not related to flocculation. The diguanylate cyclase genes were more variable; we found that in Drainas group ZMO1019 was split into two pseudogenes by insertion of a transposase. In NCIMB 11163, this locus acquired a stop codon and is annotated as a pseudogene. We also found a V430I substitution in flocculating strains CP1, B-1960 and PROIMI A1 (not shown). ZMO1055 showed even more variability within Z. mobilis mobilis. In the Z6 group and CP1 it is a pseudogene. We confirmed that the A525V substitution observed by Jeon et. al. [56] is unique for ZM401 as is A571T in PROIMI A1, both located in the N-terminus of the protein (not shown). Some diguanylate cyclases can substitute for each other, but recent findings show that only some regulate BcsA [58]. Our analysis shows that all flocculating strains have ZMO0919 while ZMO1055 is carrying mutations in PROIMI A1 and ZMO401 and is non-functional in the CP1 group (pseudogene). We speculate that the ZMO0919 diguanylate cyclase is more active in regulating BcsA and it is also possible that mutated versions of ZMO1055 acquired higher activities. We expect that future functional studies will shed additional light on the regulation of flocculation by diguanylate cyclases/phosphodiesterases.

Conclusion
We described the present Zymomonas genomic diversity and compared bioenergy-relevant traits of 15 Zymomonas strains which revealed that strain ZM4 had the highest isobutanol tolerance, hydrolysate tolerance, and conjugation efficiency. Considering the low overall diversity observed within the genus, we propose that future efforts should focus on further developing one or a few strains via synthetic biology methods, rather than exploration of additional Zymomonas isolates. For example, several DNA defense systems were present in all strains, suggesting that low natural competence is a trait common to the entire genus. Based on our physiological characterization, we propose that ZM4 should be further developed into the model organism and platform strain for the genus. Some other strains had promising traits that should be studied further for potential transfer to ZM4, particularly the capability of the pomaceae strain to consume xylose in lignocellulosic hydrolysate.

Sequencing of Zymomonas genomes
The draft genomes of Zymomonas mobilis strains sequenced for this study were generated at the DOE Joint Genome Institute (JGI) using the Pacific Biosciences (PacBio) sequencing technology [59]. A > 10 kbp Pacbio SMRTbellTM library was constructed and sequenced on the PacBio Sequel platform. All general aspects of library construction and sequencing performed at the JGI can be found at http:// www. jgi. doe. gov. The raw reads were assembled using HGAP [smrtlink/6.0.0.47835, HGAP 4 (0.2.1)] [60]. Additional file 1: Table S1 contains sequencing information for each genome, including filtered subreads, total bp, number of contigs and scaffolds in the final draft assembly, and genome size and input read coverage. DNA modification detection and motif analysis were performed using PacBio SMRT analysis platform (pbsmrtpipe.pipelines.ds modification motif analysis 0.1.0). Briefly, raw reads were filtered using SFilter, to remove short reads and reads derived from sequencing adapters. Filtered reads were aligned to the reference genome using BLASR (v5.3) [61]. Modified sites were then identified through kinetic analysis of the aligned DNA sequence data [42]. Modified sites were then grouped into motifs using MotifFinder2. These motifs represent the recognition sequences of methyltransferase genes active in the genome [62].

Phylogenetics of Zymomonas genomes Marker gene trees
A concatenated marker gene tree was constuctued using all 17 Zymomonas genomes together with 5 Sphingomonas genomes used as outgroup. The marker gene tree consists of 56 universal single copy genes and was produced by extracting these marker genes from each genome using hmmsearch (version 3.1b2). For each marker, alignments were constructed with MAFFT [63], trimmed with trimAl 1.4 [64], then concatenated into a single alignment. Maximum likelihood phylogenies were constructed with IQ-TREE [65], using the WAG substitution model and 1000 bootstraps. Trees were visualized with ggtree [66].

SNP trees
To generate the SNP tree, we followed a similar pipeline to the one found in [16], where each genome was aligned to a reference genome (highest quality genome in the set). SNPs were identified based on the Nucmer alignment output from the MUMmer package [67], concentated, followed by neighbor joinging tree construction. Trees were rooted on the two outlying genomes: francensis and pomaceae, and visualized with ggtree [66].

Genome clustering
Genome clustering was performed with two different approaches, the hierBAPS bayesian clustering analysis and clustering based on 99% average nucleotide identity (ANI). First for hierBAPS clustering, we employed a package integrated into R called RhierBAPS, Tonkin-Hill et al. 2018 which is an R implementation of the BAPS algorithm (Bayesian Analysis of Population Structure) [18], designed to identify subspecies genome clusters. Input for this software was the same whole genome alignment used in the SNP tree. Annother approach to unravel genetic relationships from the 17 Zymomonas genomes was to perform pairwise ANI clustering of all genomes at 99% similarity. To do this we used fastANI [68], filtered to an alignment fraction > 70%, then clustered into > 99% ANI clusters using mcl [69].

Orthologue clustering and analysis of gene content
Genes were called and annotated based on the Integrated Microbial Genomes (IMG) system at the DOE Joint Genome Instittute [70]. Naming of genes and contigs follows the IMG nomenclature. To cluster genes into gene families/orthologue groups, all genomes were passed into OrthoFinder [71]. Following clustering, genes were identified as belonging to the Core, Accessory or Singleton genomic components. For a gene to be considered core, it had to be identified in all 17 of the analyzed genomes. We merged the orthologue data together with the annotation data and provide as a supplemental file (Additional file 5: Table S6). A single rarefaction curve was generated based on the random resampling of the pool of unique gene family/orthologue groups, and a Venn diagram was created to assess shared and unique orthologues, both plotted in R.

Growth conditions
Growth kinetics was performed using a plate reader (BioTek Synergy H1) in anaerobic chamber (Coy Laboratory Products). Overnight cultures in 5 mL of ZRMG were inoculated from single colonies grown on ZRMG plates anaerobically. The cultures were grown statically to early stationary phase (OD 600 around 1.0). The starter cultures were diluted to OD 600 0.1 with fresh ZRMG and wells of 96-well microtiter plates were filled with 150 μl of each culture in triplicates. The side wells of the plate were filled with ZRMG to minimize evaporation. Cells were grown at 30 °C with shaking for 20 h, and OD 600 measurements were taken every 15 min. Doubling times were calculated from growth curves using the package "growthcurver" in R [28].

Conjugation efficiency
Conjugation was performed as described in Felczak et al. [47] except that the E. coli strain WM6026 was used as a donor of pRL814. Briefly, Z. mobilis strains were inoculated from a single colony and grown in 5 ml of ZRMG media statically in 14-ml tubes at 30 °C. WM6026 was grown overnight in LB supplemented with diaminopimelic acid (DAP) and spectinomycin at 37 °C. In the morning WM6026 culture was diluted to OD 600 0.2 in LB with DAP but without the antibiotic and grown at 30 °C to OD 600 1.0. Overnight culture of Z. mobilis was diluted to OD 600 1.0 and 0.9 ml was mixed with 0.9 ml of WM6026 at OD 600 1.0 in 2.0 ml Eppendorf tubes. Bacteria were centrifuged at 17,000 × g for 30 s in microcentrifuge at room temperature. Supernatant was decanted and the pellet let stand for 3 min. Loosened pellet was placed as a one drop in the center of plate containing solid ZRMG supplemented with 0.1 mM DAP and tryptone (10 g/ L). Plates were incubated overnight at 30 °C. Next day cells were scraped into 1.5-ml Eppendorf tube, 1 ml of ZRMG was added and cultures incubated for 5 h at 30 °C. After this time, 100 μl of undiluted or diluted (10 or 100 times) culture was spread on ZRMG with spectinomycin. 100 μl from serial dilutions in duplicate were spread on ZRMG plates without spectinomycin. Plates were incubated at 30 °C for 48 h for CFU or 4 days for exconjugants. Efficiency of conjugation was calculated as ratio of exconjugants to CFU/ ml.

Electroporation efficiency
pRL814 was introduced to Z. mobilis by electroporation. Electrocompetent cells were prepared as follows. 125 ml of ZRMG was inoculated from a single colony and cells were grown overnight in 125 ml bottles, loosely covered, at 30 °C until OD 600 0.4 (microaerobic conditions). Bottles were chilled on ice for over 30 min and bacteria were spun at 4000 × g for 15 min at 4 °C. Pellets were resuspended in equal volume of ice-cold MQ H 2 O and left on ice for 10 min. After this time, bacteria were spun as above. Washing was repeated one more time with half of the initial volume of ice-cold MQ H 2 O. After spinning as above cells were resuspended in 10 ml of ice-cold 10% glycerol and centrifuged at 4000 × g for 10 min at 4 °C. Final pellets were resuspended gently in 150 μl of ice-cold 10% glycerol and used immediately for electroporation as follows. 0.5-1 μg of plasmid DNA was mixed with 40 μl of electrocompetent cells. Electroporation was in 0.1mm cuvettes in Micro Pulser (BioRad) set at EC-1. Cells were immediately resuspended in 1 ml of pre-warmed ZRecM and incubated statically at 30 °C for 5 h. 100 μl from undiluted or 10 times diluted electroporations was spread on ZRMG plates supplemented with spectinomycin. Plates were protected with parafilm from drying and incubated at 30 °C aerobically, for 4 days or longer until colonies appeared. Serial dilutions were plated on ZRMG without antibiotic to measure survival after electroporation. Efficiency of electroporation was calculated per 1 μg of plasmid DNA.

Growth on AFEX hydrolysate
pH of AFEX hydrolysate was adjusted to 5.8 with 10 M NaOH. The hydrolysate was filtered through 22-μm filter device (VWR) and stored in anaerobic chamber. Z. mobilis cultures were inoculated from fresh ZRMG plates and grown anaerobically in 10 mL of ZRMG overnight to OD 1-2 in anaerobic chamber. OD 600 was measured and cells were concentrated to OD 600 10 by centrifuging cells at 8,000 rpm for 10 min (Sorval ST8, Thermo Scientific) at room temperature, followed by resuspending in appropriate volume of the hydrolysate. 5 mL of hydrolysate in Hungate tubes was inoculated to OD 600 0.1 or 0.5 under anaerobic chamber and tubes were capped and sealed. Tubes were incubated outside the chamber at 30 °C with shaking for 50 h. After this time, OD 600 of 10 times diluted cultures was measured and samples were saved and stored at − 20 °C for analysis by HPLC.

Isobutanol tolerance
On cultures in 5 ml of ZRMG were inoculated from single colonies and grown overnight in anaerobic chamber to stationary phase. Cultures were diluted to OD 600 0.1 in fresh anaerobic ZRMG. Isobutanol was added to indicated final concentration, mixed and aliquots of 150 μl were loaded in triplicate to 96-well plate. Incubation was performed with shaking at 30 °C until growth reached saturation [18 h]. IC 50 for ZM4 was calculated from % growth inhibition by 0.25-10% IBA calculated as area under curve (AUC) after 8 h of incubation using online IC 50 calculator (https:// www. aatbio. com/ tools/ ic50calcu lator). Tolerance of Z. mobilis strains to 0.5% and 1% IBA was calculated similarly from AUC after 8 h.

Flocculation
Single colony from ZRMG plate grown anaerobically for 48 h was suspended in 200 μl of ZMMG. 40 μl was used to inoculate 5 ml of ZMMG or ZRMG in duplicate for subsequent aerobic or anaerobic growth. For static conditions, cultures were grown in 14-mL plastic tubes in an aerobic environment or an anaerobic chamber. For shaking conditions, cultures were grown with shaking at 275 rpm in Hungate tubes closed with a butyl rubber stopper or loosely covered with aluminum foil for anaerobic and aerobic growth, respectively. For the anaerobic condition, anoxic medium was dispensed into the tubes in an anaerobic chamber and the tubes were capped before removal to ensure an anoxic environment. Cultures were grown for 48 h. Cultures were vortexed until visible flocs disappeared (usually 5-10 s) and OD 600 was measured immediately, and again after standing for 45 min on a bench. Each time 100 μl from the top of the culture was removed to 0.9 ml of media for measurement.

Chromosomal DNA purification
For the novo genome sequencing, Z. mobilis strains were grown to stationary phase in 100 ml of ZRMG at 30 °C, statically in anaerobic chamber. Cells were harvested by centrifugation at 6,000 rpm for 10 min at 4 °C (Sorval ST8, Thermo Scientific). Genomic DNA was purified as described by Neumann, et al. [73] with some modifications. Briefly, pelleted cells were washed with 100 ml of 0.1 M NaCl, centrifuged as above and resuspended in 5 ml of SET buffer (75 mM NaCl, 25 mM EDTA, 20 mM Tris pH 7.5). Lysozyme was added to final concentration of 1.0 mg/ml and incubated at 37 °C for 20 min. RNase A (Sigma), was added to a final concentration of 0.3 mg/ml and mixture was incubated for 20 min at 37 °C. After this time, SDS and Proteinase K were added to final concentrations of 1% and 1.0 mg/ ml, respectively, and incubated at 55 °C for 2 h. After this time, 0.4 volume of 5.0 M NaCl was added to a clear, viscous cell lysate and mixed gently. An equal volume of chloroform was added and the mixture was incubated at RT for 30 min, on a swinging platform. After this time, phases were separated by spinning for 15 min at 8000 rpm at room temperature (Sorval ST8, Thermo Scientific). Clear water phase was collected into a new 15-ml conical tube and equal volume of isopropanol was added. The precipitated DNA was collected by wrapping around Pasteur pipette and submerged in 70% ethanol to wash out the residual water. DNA was transferred to Eppendorf tube, let dry at room temperature, resuspended in 200 μl of TE buffer (10 mM Tris HCl pH 8.0, 1 mM EDTA) and left in 4 °C overnight. Quality of the DNA was checked by electrophoresis on 0.7% agarose alongside lambda DNA standard (Thermo Scientific), and RNase A treatment was repeated when necessary followed by ethanol precipitation as described [74]. A 260 / A 280 and A 260 /A 230 was measured by nanodrop assay and the final concentration of DNA was determined by fluorometric method (Qubit). Finally, identity of the isolated DNA was confirmed by16S rRNA gene PCR using 27F and 1492R primers followed by Sanger-based sequencing of the PCR product [75].