The present invention relates to methods and compositions for genetically manipulating bacterial cells, particularly a cell of the class Clostridia, but also related bacteria which are difficult to genetically manipulate due to lack of an effective recombination system. In particular, embodiments of the present invention relate to the expression of recombinant homologous recombination proteins in Clostridia and in other bacterial species as demonstrated in the following provisional patent.
Rising and unstable prices for petroleum based chemicals and fuels have resulted in renewed interest in their production via alternative approaches (e.g. biochemical approaches). Coupled with concerns of global climate change and securing a domestic source of transportation fuels, efforts are being focused towards the fermentative conversion of inexpensive renewable feedstocks (biomass) to fuel alcohols and chemicals. Such processes have been employed for over a century at very large scale, but advanced genetic and metabolic engineering approaches for generating second-generation chemical and fuel producing microbes are required for making these current ventures commercially viable.
Clostridia are strictly anaerobic, endospore forming prokaryotes of major importance to cellulose degradation, human and animal health and physiology, anaerobic degradation of simple and complex carbohydrates, acidogenesis, and bioremediation of complex organics [10]. Solventogenic, butyric-acid clostridia (e.g., Clostridium acetobutylicum, C. beijerinckii and C. butyricum) [11] played a major industrial role in the production of acetone and butanol in the past (and likely now and in the future) by the Acetone-Butanol-Ethanol fermentation (ABE) (Jones and Woods 1986; Rogers 1986; Lesnik, Sampath et al. 2001). Significantly, metabolic engineering (ME) of solventogenic clostridia, as recently reviewed [9], may lead to industrial processes for production of additional chemicals such as butyric acid, butanediol, propanol, and acetoin (Jones and Woods 1986; Rogers 1986; Lesnik, Sampath et al. 2001), production of hydrogen [12] or for biotransformations [13]. Some of these chemicals (butanol, ethanol) can serve as biofuels directly, while others can be used for chemical conversion to biofuels (e.g., butyric acid [14]) or the generation of electricity [12]. Related clostridia can produce additional chemicals such as propionic and acrylic acids [15] [16]. Finally, clostridia, as might be expected from these ancient anaerobic soil organisms, have a great potential for applications in bioremediation [17].
Based on the fundamental and applied importance of this genus, the DOE has completed the genome sequence [18] of C. acetobutylicum ATCC 824 (referred to as Cac from now on). A number of ME tools have been developed for this genus of bacteria, such as recombinant DNA expression plasmids [19], antisense RNA approaches [20], and gene expression libraries [21]. However, the full potential of any industrially relevant species will not be realized until an efficient chromosomal integration system is developed that allows for more elaborate and stable genetic manipulations of the host. Such a system would ideally be suitable for all clostridia species, be able to disrupt desired genes, be able to integrate large pieces of recombinant DNA into the host chromosome, and be easily and rapidly implemented in any research and R&D setting (academic or industrial). Methods for gene inactivation in clostridia have been inefficient, first based on non-replicating vectors [22], and later using a replicating vector (Harris, Welker et al. 2002). These methods, however, are tedious and slow, require substantial time and effort, and are rarely successful in inactivating genes. Most recently, the TargeTron™ system (group II intron principle) was adapted to clostridia by two different groups (Heap, Pennington et al. 2007; Shao, Hu et al. 2007). Two other methods have been reported more recently. The first was developed by the research group of Dr. P. Soucaille at INSA of Toulouse, in collaboration with the company Metabolic Explorer. They developed a novel endonuclease expression technique to digest replicating plasmid DNA into linear disruption cassettes within the clostridia host that can then recombine via homologous recombination into the target chromosomal region. A patent has been filed based on their work [23] and a paper has been published that employed this method [24].
The other was a suicide plasmid approach developed by the inventors (Bryan Tracy and Eleftherios Papoutsakis) and was used to knock out acid-formation genes in the Cac asporogenous mutant M5 [25]. However, all of these approaches are severely limited in multiple regards.
First, the group II intron and endonuclease-based methods have not demonstrated the ability to incorporate more than ~1.5 kb of DNA into the chromosome, and the majority of that 1.5 kb is already consumed by a selection marker. Second, the suicide approach has only been demonstrated in one specific strain of C. acetobutylicum, the M5 strain, and has yet to be successful in any other clostridia strain.
Homologous Recombination
Homologous recombination is a housekeeping process involved in the maintenance of chromosome integrity and generation of genetic variability that is nearly ubiquitous to all microorganisms [26-28]. The cellular machinery involved is not necessarily conserved, but the general series of events is common to all microorganisms studied to date. The typical series of events for homologous recombination are initiation, strand-invasion, strand-exchange, and Holliday junction resolution [28-30], see
Absence, Importance and Utility of Resolvase Expression
The most essential protein in the later stages of homologous recombination is the resolvase. Resolvases are a well-known class of proteins that perform a key role in Holliday-junction resolution. There are a number of distinct resolvase enzymes, and resolvase activity is ubiquitous to nearly all bacteria [28, 31]. Holliday-junctions are four-way DNA intermediate complexes formed during homologous recombination [32]. There are two major resolvases found on the genomes of Gram-negative and Gram-positive bacteria. These are ruvC and recU, respectively [28, 33]. The significance of resolvases, and more specifically of recU in Gram-positive organisms was studied via deletion mutants and tested by the deficiency in DNA repair and intramolecular recombination [33-35]. These studies strongly support the essential role of RecU in Holliday-junction resolution for Gram-positive organisms, such as clostridia and bacilli. Subsequent studies determined high-resolution structures of RecU from Bacillus subtilis and B. stearothermophilus and proposed detailed models for how the RecU protein physically interacts with the Holliday-junction [36, 37]. A recent comparative genomic analysis suggests that clostridia do not contain genes for any recognizable resolvase protein [28]. Thus, we hypothesized that the lack of resolvase activity is responsible for the experimental difficulty in generating homologous recombination events for gene disruptions in all clostridia [38].
The analysis by Rocha et al. [28] demonstrated the absence of resolvase activity in Clostridium perfringens, C. tetani and Cac, which were the only clostridia genomes they analyzed. Significantly, only 10 genomes out of the 110 analyzed appeared to be resolvase deficient. Of those 10, four were void of any sort of recombination system. To further extend their analysis, we performed our own homology searches of the B. subtilis recU gene against six annotated or draft clostridia genome sequences, based upon the first round of orthology assignment performed in the Rocha et al. paper. The final analysis included Cac, C. perfringens, C. tetani, C. difficile 630, C. novyi NT, C. thermocellum, C. beijerinckii, and C. cellulolyticum. All genomes were void of any discernible RecU resolvase, which is the conserved resolvase for Gram-positive organisms.
We thus tested the hypothesis that RecU expression in clostridia will result in efficient recombination. Specifically, we over-expressed the heterologous resolvase from B. subtilis, RecU (coded by recU), in Cac under the control of the strong Cac thiolase promoter. For this, we used a replicating plasmid that contained two contiguous regions of homology for a gene on the Cac chromosome. Initial investigations targeted the sigma factor sigE, which is a known transcriptional regulator in clostridia sporulation and possibly also solvent formation [10]. The final plasmid, targeted against a specific gene is referred to as a site-specific chromosomal integration plasmid (SSCI plasmid). For sigE we refer to this as the sigE SSCI plasmid. The two regions of homology were disrupted by a thiamphenicol (TH) antibiotic resistance gene. There was also an erythromycin (EM) antibiotic resistance gene on the plasmid, outside of the regions of homology. Therefore if a double crossover event between the host chromosome and plasmid were to occur, the TH marker would be incorporated into the chromosome and the EM marker would be lost upon plasmid curing, which describes the loss of the plasmid. If a single crossover occurred, both TH and EM markers would be incorporated in the chromosome, but EM resistance would not be as strong upon plasmid curing, because cells require multiple copies of the EM resistance gene to be very EM resistant. A single copy of the TH resistance gene is sufficient for strong TH resistance.
Resolvase Expression is not Sufficient for Efficient Double-Crossover Recombination
Previous analyses, for which a utility patent was filed, showed that resolvase expression improved single-crossover efficiency very well, but double-crossover homologous recombination was still only witnessed in one example out of many. Therefore, we hypothesized that homologous recombination proteins are not well-expressed and/or down-regulated during exponential growth, the period of culture during which we believe homologous recombination is occurring.
Current State of the Technology for Generating Targeted Gene Disruptions in Clostridia
In regards to gene disruption via homologous recombination, the current state of the art is to employ the Clostridia host's homologous recombination machinery for double crossover recombination. Recombination occurs between parent chromosome and plasmid-borne homologous regions that flank a selectable marker, see
For C. acetobutylicum there are only five published reports of site-specific integration: three via non-replicating (suicide) plasmids [22, 25, 39], one via a replicating plasmid [40] and one via linear DNA created within the host cell [24]. The first attempt utilized a suicide plasmid with an integration cassette composed of ~225 bp nucleotide sequences of contiguous homology flanking a macrolide-lincosamide-streptogramin B resistance (MLSr) gene. The MLSr gene confers erythromycin (EM) resistance. The plasmid was introduced into C. acetobutylicum via electroporation, and knockout mutants were selected for EM resistance. Only mutants that had undergone a recombination event could maintain EM resistance. This technique was successful three times, for the generation of pta, bk and aad mutants [22, 39]. However, integration efficiency was very low (<0.5 mutants/μg transformed KO plasmid DNA), the approach was unsuccessful for many additional targets, and the nature of integration is still not fully characterized (i.e. the region of integration has yet to be sequenced and confirmed). A similar approach was performed to knock out the bk and ak genes in the degenerate C. acetobutylicum strain M5 [25]. This approach employed a “stronger” antibiotic marker (i.e. chloramphenicol under the strong ptb promoter). The integration was confirmed (i.e. sequenced) to have perfectly integrated through a single region of homology. Unfortunately this approach has not been successful in the WT C. acetobutylicum strain.
A second approach was later developed that employed a replicating plasmid. When using the replicating approach, an additional selection marker is required outside the integration cassette in order to prove the loss of the plasmid after a successful double crossover recombination. For this approach a thiamphenicol (CM/TH) resistance gene was employed. Following electroporation, transformed cells were selected for EM resistance. Transformants were then vegetatively transferred six times on non-antibiotic nutrient plates. The seventh and eighth transfers were onto EM and TH containing plates, respectively. These two plates were compared for regions of growth on EM but not on TH, suggesting double crossover recombination and loss of the plasmid. So far this approach has been successful at generating only a handful of mutants, such as the spo0A mutant [40], the CAC8241 mutant and the ctfAB mutant [41], and all subsequent attempts at additional targets have been unsuccessful.
Among other clostridia species there have been few successful attempts at generating targeted chromosomal integrations via suicide and replicating plasmids [42-44]. Most notable is the use of multimeric, suicide plasmids in C. perfringens, but this approach is unreliable (i.e. low frequency of integration) and the mechanism is very poorly characterized [45]. Thus a different sort of gene disruption system was adapted to Clostridia in order to increase site-specific integration efficiency. The group II intron system developed by the Lambowitz lab at the University of Texas at Austin, now commercialized by Sigma-Aldrich (TargeTron™), has been employed on multiple occasions to generate gene disruptions in C. perfringens and C. acetobutylicum [46-48]. A more intensive study modified the TargeTron™ specifically for application in clostridia species. The resulting ClosTron system has been employed to generate gene disruptions in C. acetobutylicum, C. difficile, C. botulinum and C. sporogenes [38]. Group II introns are naturally occurring autocatalytic retrotransposable elements that include a six stem-loop RNA structure complexed with an intron-encoded protein (IEP). The IEP exhibits four unique activities: 1) maturase for intron splicing, 2) DNA binding for target site recognition, 3) endonuclease for nicking the host chromosome and 4) reverse transcriptase for forming intron cDNA. Group II introns can insert RNA directly into target DNA sequences and then reverse transcribe themselves. DNA is targeted mainly by base pairing of the intron RNA; however, the IEP also recognizes a few base pairs. Consequently, group II introns can theoretically be engineered to target any desired DNA sequence by modifying the intron RNA [49].
The engineering of microbes for specialty chemical conversion, biofuel generation, bioremediation and pharmaceutical production remains an immediate scientific and industrial goal. Specifically for the class Clostridia among prokaryotes, the pursuit of industrial scale biofuel production, replacement of fossil feedstock chemicals with biomass, and the production of value added chemicals from industrial and municipal waste streams is motivating a tremendous amount of strain development. Clostridia are naturally some of the most prolific cellulosic-material fermenting microbes [1-4], or exhibit great potential to be metabolically engineered into cellulose utilizing bacteria [5-8]. Additionally, due to the anaerobic and spore forming characteristics, clostridia are being engineered to target the necrotic and anaerobic cores of malignant tumors and to essentially kill tumors, inside out (REFS).
The tools for genetically manipulating clostridia remain limiting and insufficient for harnessing the total potential of this class of bacteria [9]. Advances have occurred slowly over the past thirty years, but need to be dramatically accelerated, especially given the recent interest in biofuels and white biotechnology. Three of the more notable limitations, which can be addressed with our technology, are engineering gene specific mutants for gene inactivation, generating genetically diverse mutant populations for genome scale library screenings, and being able to add/delete large lengths of chromosomal DNA in any host clostridia strain.
We have invented a novel approach for genetically altering clostridia. Our invention in its simplest form is the recombinant expression of the homologous recombination associated proteins RecO, RecG, and RecA derived from any heterologous source that is naturally compatible (i.e. can be transcribed and translated) or engineered to be compatible (e.g. codon usage of the heterologous gene is varied so that it is readily expressed in the bacterial host) with any clostridia species or other difficult to genetically manipulate species. Expression can be independent or in combinations, particularly with RecU. We demonstrate that during independent or combinatorial expression, homologous recombination is stimulated at higher frequency than in the absence of recombinant homologous recombination protein expression. Additionally, we demonstrate that this approach is feasible and delivers similar advantages in other clostridia species, particularly C. cellulolyticum and C. cellulovorans, but also including C. butyricum, C. thermocellum, C. tyrobutyricum, C. beijerinckii, C. perfringens, C. tetani, C. difficile, C. botulinum, C. sporogenes, and C. novyi.
The utility of our technology is the enhanced capability to genetically modify clostridia. We demonstrate this utility through the following examples in Clostridium acetobutylicum: 1) enhanced frequency of site-specific homologous recombination by independent recombinant expression of a homologous recombination protein, 2) enhanced frequency of site-specific homologous recombination by combinatorial recombinant expression of homologous recombination proteins (particularly with RecU), and 3) enhanced frequency of site-specific double crossover homologous recombination by combinatorial recombinant expression of homologous recombination proteins (particularly with RecU). 4) We also demonstrate the same advantages in C. cellulolyticum and C. cellulovorans.
The simplest explanation of our technology is the recombinant expression of any individual or combination of the homologous recombination proteins RecG, RecO or RecA. This can be in any clostridia host or related bacterial species, and the source of the recG, recO or recA can be any natural or engineered heterologous gene. Specific applications include complementing a clostridia or related species with the aforementioned genes. The genes can be expressed individually, in combination, from the site-specific chromosome targeted integration plasmid, from a separate plasmid, or from chromosomal integration into a host organism. The expression is used for gene knockins, gene knockouts, constructing gene knockin/knockout libraries, creating chromosomally expressed fusion proteins, etc.
We analyzed the expression profiles and absolute expression levels of each homologous recombination protein from a detailed time profile of a batch culture of the WT Cac strain (ATCC824) [50]. Expression profiles refer to the differential expression of the gene over all growth phases (exponential, transition, early-stationary, mid-stationary and late-stationary), which were determined by hybridizing cDNA from a specific period of growth against a pool of cDNA from all periods of growth. The expression level was determined by ranking each gene from the full genome microarray on a percentage scale of 0 to 100. Genes with greater expression at a specific time point, as determined by greater fluorescence intensity on the microarray, are ranked closer to 100. Genes that showed very low intensity were ranked closer to 0.
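The 0 to 100 ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the input format (a mapping of gene name to fluorescence intensity at one time point) and the linear spacing of ranks are hypothetical, and the intensities in the usage note are made-up values, not data from the microarray study.

```python
def percentile_ranks(intensities):
    """Rank genes on a 0-100 scale for a single time point.

    `intensities` maps gene name -> fluorescence intensity (assumed format).
    The lowest-intensity gene is ranked 0 and the highest 100; ties are
    ordered arbitrarily (by input order), which is a simplification.
    """
    genes = sorted(intensities, key=intensities.get)
    n = len(genes)
    if n == 1:
        return {genes[0]: 100.0}
    return {gene: 100.0 * i / (n - 1) for i, gene in enumerate(genes)}


def below_cutoff(ranks_by_timepoint, gene, cutoff):
    """True if `gene` ranks below `cutoff` at every time point."""
    return all(ranks[gene] < cutoff for ranks in ranks_by_timepoint)
```

For example, with hypothetical intensities `{"geneA": 10, "geneB": 20, "geneC": 30, "geneD": 40, "geneE": 50}`, `geneA` is ranked 0, `geneC` 50 and `geneE` 100. A check such as `below_cutoff(all_ranks, "recO", 23)` would then express a statement of the form "below the 23rd percentile for all timepoints".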
Based upon the microarray analysis, differential expression suggests that a number of homologous recombination initiation proteins are upregulated during exponential growth, such as recO, recN, recJ and recD. However, the most important “strand exchange” protein, RecA, is down-regulated during exponential growth. The expression rankings suggest that recO and recG are expressed at very low levels (<23rd percentile ranking for all timepoints for recO and <31st percentile for recG). Based upon these findings and the recU over-expression results, we believe homologous recombination can be enhanced by over-expressing recO, recG and/or recA.
Based upon our findings from the DNA-microarray transcriptional analysis, as discussed above, we believe that RecG, RecO and RecA are ideal targets for over-expression. Additionally, we will test the expression of a heterologous RecA (from B. subtilis), since this approach was successful with the B. subtilis resolvase (RecU) expression.
Each homologous recombination protein (RecO, RecG, endogenous RecA and heterologous RecA) will be PCR amplified from Cac genomic DNA with an appended thiolase promoter (Pthl) on the 5′-primer and a rho-independent transcription terminator sequence on the 3′-primer. The thiolase promoter is a strong, growth-associated promoter, commonly used in Cac for gene over-expression, and was used in our previous studies with RecU expression. The rho-independent terminator is a palindromic sequence that forms a stem-loop, hairpin structure when transcribed, causing the RNA polymerase to dissociate from the DNA thus terminating transcription. Due to the presence of the Pthl and the rho-independent terminator, each PCR product is a single transcriptional unit. The resulting PCR products will be individually cloned into the sigE-targeted, replicating, SSCI plasmid (sigE-SSCI plasmid). This is the same SSCI plasmid we previously employed for disrupting the sigE locus via a single crossover event with RecU over-expression, thus already has RecU over-expression.
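The layout of each transcriptional unit (promoter, coding sequence, stem-loop terminator) can be illustrated with a short Python sketch. Everything here is hypothetical: the function names are introduced for illustration, the sequences are short placeholders rather than the real Pthl, rec gene or terminator sequences, and the stem-loop check is a simplified test for an inverted repeat, not a terminator prediction tool.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def has_stem_loop(seq, stem_len, loop_len):
    """Check whether `seq` starts with a perfect inverted repeat.

    The first `stem_len` bases must be followed by `loop_len` loop bases
    and then by the reverse complement of the first stem, which is the
    arrangement that lets the transcript fold back into a hairpin.
    """
    stem5 = seq[:stem_len]
    start3 = stem_len + loop_len
    stem3 = seq[start3:start3 + stem_len]
    return len(stem3) == stem_len and stem3 == reverse_complement(stem5)


def build_cassette(promoter, orf, terminator):
    """Concatenate promoter, coding sequence and terminator, 5' to 3'."""
    return promoter + orf + terminator


# Placeholder sequences only (NOT the real Pthl, gene or terminator).
promoter = "TTGACATATAAT"
orf = "ATGAAATAA"
terminator = "GCCGCC" + "TTTT" + "GGCGGC" + "TTTTTT"
cassette = build_cassette(promoter, orf, terminator)
```

With these placeholders, `has_stem_loop(terminator, 6, 4)` is true because `GGCGGC` is the reverse complement of `GCCGCC`. In the scheme described above, the promoter and terminator would be appended through the 5′ and 3′ primers rather than concatenated in software; the sketch only shows the resulting arrangement.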
The 5′ region of homology is 253 basepairs (bp) and the 3′ region is 306 bp. The regions of homology are contiguous to the targeted region of the chromosome, and are disrupted on the plasmid by a thiamphenicol (TH) antibiotic resistance marker (refer to
To induce SSCI, we grow cells harboring the SSCI plasmid for 5 days under vegetative growth conditions and under TH selection. This is done by replica plating cells every 24 hours onto a fresh nutrient plate with TH selection. Cells grow exponentially to create a “lawn” of growth within 24 hours and are then replica plated again with velveteen squares and a replica-plating device. TH selection is maintained for a period of 5 days in order to either maintain cells harboring the SSCI plasmid or to maintain cells that have integrated the SSCI plasmid into the chromosome via either a single or double crossover event. A single-crossover event incorporates the entire SSCI plasmid (disruption cassette and plasmid backbone), and a double-crossover event replaces the endogenous regions of homology with the disruption cassette, and excises the SSCI plasmid backbone. Therefore, SSCI plasmid harboring cells, single-crossover and double-crossover cells will be maintained during the TH replica plating. Cells that lose the plasmid and have not undergone a crossover event will be lost from the population.
Prior to screening, we “cure” cells of the SSCI plasmid by replica plating for 5 days under vegetative growth conditions without any antibiotic selection. During this time, cells are likely to lose the replicating plasmid since there is no selection for its maintenance, but copies of the TH marker that have integrated into the chromosome are maintained. Additionally, copies of the EM marker that have integrated into the chromosome via single-crossover events are also maintained, unless a second crossover event occurs and excises out the plasmid backbone.
For screening, plates are replica plated after the 5th day of no antibiotic pressure onto fresh nutrient plates with TH selection. Cells that were “cured” of the plasmid and did not undergo a crossover event will be lost from the population under TH selection. Cells are allowed to grow for 24 hours under TH selection and then replica plated onto fresh nutrient plates with EM selection. Cells that still harbor the SSCI plasmid or have undergone a single-crossover event will grow on the EM plates in 24 to 48 hours. Cells that have plasmid borne resistance to EM grow within 24 hours of replica plating. Cells that have single-crossover, chromosomal borne EM resistance require at least 36 and more often 48 hours to grow because there is only a single copy of the EM resistance gene compared to 5-15 copies from the replicating SSCI plasmid (the average copy number of these plasmids is 7). Cells that do not grow at all on EM plates, but do grow on TH are indicative of double-crossover events. Table 1 outlines the selection criteria and likely explanation for each cell type.
Table 1. The possible phenotypes from SSCI screening and the likely genotype associated with each phenotype. Cross is in reference to crossover.
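The screening logic described above can be summarized as a small decision function. The Python sketch below is an illustration only: the function name, argument names and returned labels are hypothetical, and the thresholds are taken from the growth times stated in the text (plasmid-borne EM resistance within 24 hours, single-copy chromosomal EM resistance at 36 hours or later). Every call is putative until confirmed by PCR and sequencing.

```python
def classify_clone(grows_on_th, hours_to_grow_on_em=None):
    """Map a screening phenotype to its most likely genotype.

    grows_on_th: True if the clone grows on TH plates after curing.
    hours_to_grow_on_em: hours until growth appears on EM plates,
        or None if the clone never grows on EM.
    """
    if not grows_on_th:
        # Lost the plasmid and never integrated the TH marker.
        return "cured, no crossover"
    if hours_to_grow_on_em is None:
        # TH marker is chromosomal, EM marker was lost with the backbone.
        return "putative double crossover"
    if hours_to_grow_on_em <= 24:
        # Fast EM growth suggests multiple plasmid-borne copies.
        return "SSCI plasmid still present"
    if hours_to_grow_on_em >= 36:
        # Slow EM growth suggests a single chromosomal copy.
        return "putative single crossover"
    # Growth between 24 and 36 hours is not addressed in the text.
    return "ambiguous, re-screen"
```

For example, `classify_clone(True, None)` returns `"putative double crossover"` and `classify_clone(True, 48)` returns `"putative single crossover"`.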
The current standard for confirming SSCI is sequencing the genomic region about which the integration event occurred. For double-crossover integrations, this is a simple task of PCR amplifying the region where integration occurred (refer to
In the case of single-crossover integrations, the PCR amplification of the region of integration is not easy to perform because the PCR product would typically be greater than 6000 bp and will be susceptible to a lot of mispriming due to incomplete product extension. However, by knowing the orientation of the TH marker in relation to the gene we are attempting to disrupt (i.e., whether the TH marker is in the same or opposite coding strand of the gene of interest), we can perform two PCR reactions to determine if crossover occurred through the first or second region of homology. This is depicted in
Table 2. List of appropriate primer sets to use when confirming a single integration event through the 1st and 2nd region of homology. The table also details possible results and the most probable explanation of such results.
Eventually we need to determine the exact sequence of the entire region of integration. So after confirming a putative single-crossover clone by the aforementioned PCR method, we perform XL (extra-long) PCR reactions under an assortment of reaction and annealing temperature conditions to obtain specific and large quantities of PCR product that can then be sequenced.
To determine the relative overall effectiveness of each homologous recombination protein in conjunction with RecU at stimulating and enhancing recombination, we first determine whether single or double-crossovers occur at all. Our comparison control is the sigE-SSCI plasmid without any homologous recombination protein expression. Previously, such experiments never generated single or double-crossover events without the expression of the RecU protein. Thus the ability to generate either a single or double-crossover event is a positive outcome. However, there are no established protocols for quantitatively determining the effectiveness of stimulating homologous recombination. Therefore we propose the following semi-quantitative approach, which will likely be necessary for comparing the results from each homologous recombination protein expression against each other.
Semi-quantitative analysis will be performed by first quantifying the physical area on each TH screening plate that indicates single or double crossover integration. Subsequently we will determine the frequency of single and double-crossover events per colony screened as determined by PCR confirmation. This value, multiplied by the physical area of single or double integration from the TH screening plates will represent the relative effectiveness (RE) for enhancing and stimulating chromosomal integration.
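The relative effectiveness (RE) calculation described above amounts to multiplying a plate area by a PCR-confirmed frequency. The Python sketch below shows one way to write it, assuming the area is measured in a consistent unit across plates (cm² is used here as an assumption) and that the frequency is the number of PCR-confirmed integrants divided by the number of colonies screened. The function and parameter names are hypothetical, and the numbers in the usage note are illustrative, not experimental results.

```python
def relative_effectiveness(area_cm2, confirmed, screened):
    """Relative effectiveness (RE) of stimulating chromosomal integration.

    area_cm2: plate area on the TH screening plate that indicates single
        or double crossover integration (unit is an assumption).
    confirmed: number of colonies confirmed as integrants by PCR.
    screened: total number of colonies screened by PCR.
    """
    if screened <= 0:
        raise ValueError("at least one colony must be screened")
    if not 0 <= confirmed <= screened:
        raise ValueError("confirmed must be between 0 and screened")
    frequency = confirmed / screened
    return frequency * area_cm2
```

For instance, 5 confirmed integrants out of 20 screened colonies on a plate with 10.0 cm² of candidate growth gives an RE of 2.5. Because RE is a product of two measured quantities, it is only meaningful for comparing constructs screened under the same plating conditions.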
We have already demonstrated the utility of resolvase (RecU) expression for stimulating homologous recombination. However, we will continue to investigate various parameters that affect the frequency of the recombination events, as well as parameters that affect the frequency of single versus double-crossover events.
As mentioned, the majority of our experiments have used, and will continue to use, regions of homology that are 250-300 bp long. However, the majority of the clostridia literature that has attempted chromosomal integration via homologous recombination reports using regions of homology that are significantly longer. Therefore we will investigate the significance of the length of the homologous regions. Specifically, we will test 1000, 500, 250 and 100 bp regions of homology, aiming as above to integrate into the sigE locus. We will construct new disruption cassettes and clone them into the already made sigE-SSCI plasmid that contains the RecU-Pthl expression. We will stimulate, screen, confirm and determine the relative effectiveness of enhancing recombination for each length of homology by the methods described above.
Other common approaches for integrating DNA into the chromosome include linear DNA (i.e. the Longtine approach employed in yeast [52]) and suicide/non-replicating plasmids, which have been reported in Cac but cannot be routinely performed. We will attempt these same approaches by creating the strain 824(pRecU), which expresses RecU-Pthl from a plasmid separate from the SSCI plasmid.
EM resistance provided on the pRecU plasmid will maintain RecU expression. We will transform 824(pRecU) with either a linear DNA-disruption cassette or a suicide SSCI plasmid that contains a disruption cassette but no origin of replication for Gram-positive organisms, such as pAKKO from a recent publication from the Papoutsakis group [25]. Transformants that survive TH selection theoretically must have undergone a chromosomal integration event because suicide plasmids and linear DNA cannot replicate. In this approach, RecU is under the control of the strong, growth-associated thiolase promoter. Thus, at the time of transformation, the competent cells should be actively expressing RecU, and RecU will serve the same purpose of promoting recombination as demonstrated via the replicating SSCI plasmid approach. RecU expression will again be verified by reverse transcription PCR. Resulting TH resistant mutants are readily cured of the pRecU plasmid by vegetatively transferring without EM selection. We will test a range of DNA amounts for each approach, from 50 μg to 0.1 μg of DNA per transformation. We typically use 0.5 μg of DNA for transforming a replicating plasmid. We will stimulate, screen, confirm and determine the relative success at enhancing recombination by the methods described previously.
This application claims priority from U.S. Provisional Patent Application No. 61/262,288, filed Nov. 18, 2009, which is incorporated herein by reference in entirety.
Number | Date | Country
---|---|---
61262288 | Nov 2009 | US