The present invention relates to biofuels and biorenewables. More particularly, but not exclusively, the present invention relates to engineering a methane-to-acetate pathway for producing liquid biofuels.
The United States has vast reserves of methane; however, many of these methane reserves are remote and a significant amount of the methane is released into the atmosphere (this is detrimental since methane is a potent greenhouse gas). There is a need to convert these resources easily, reliably, and efficiently into liquid fuels.
Therefore, it is a primary object, feature, or advantage of the present invention to improve over the state of the art.
It is a further object, feature, or advantage of the present invention to convert methane reserves into liquid fuels.
It is a still further object, feature, or advantage of the present invention to reduce greenhouse gas from the atmosphere.
It is another object, feature, or advantage of the present invention to provide for conversion of methane gas into liquid fuels in a manner that is easy, reliable, and efficient.
One or more of these and/or other objects, features, or advantages of the present invention will become apparent from the specification and claims that follow. No single embodiment need exhibit each or every object, feature, or advantage as it is contemplated that different embodiments or aspects of the invention may have different objects, features, or advantages.
According to one aspect, the present invention provides compositions and methods for creating metabolically-engineered microbes that convert methane gas to a biofuel such as ethanol. For example, the activity of methylreductase of M. acetivorans can be reversed to create a pathway that converts methane to acetate. Moreover, the invention includes methods for converting acetate to biofuels, such as, but not limited to, ethanol or butane, for example by: (i) further engineering the pathway to accommodate supplemental reductant (CO and/or electricity) or (ii) coupling M. acetivorans with pure cultures or consortia of anaerobes thermodynamically assisted with H2 or CO.
The term “comprises” and variations thereof do not have a limiting meaning where these terms appear in the description and claims.
Unless otherwise specified, “a,” “an,” “the,” and “at least one” are used interchangeably and mean one or more than one.
Also herein, the recitations of numerical ranges by endpoints include all numbers subsumed within that range (e.g., 1 to 5 includes 1, 1.5, 2, 2.75, 3, 3.80, 4, 5, etc.). For any method disclosed herein that includes discrete steps, the steps may be conducted in any feasible order. And, as appropriate, any combination of two or more steps may be conducted simultaneously.
As used herein, the term “biofuel” is intended to mean a liquid fuel produced using a microbial biology based process. Thus, the term “biofuel” includes fuels made from fossil methane (e.g. natural gas, coal bed methane), as well as fuels where the carbon feedstock is recently from a biological source (e.g. biorenewables).
As used herein, the term “microbe” refers to any microscopic organism, which may be a single cell or multicellular organism. The term is understood to include all of the bacteria and archaea and almost all the protozoa. It also includes some members of the fungi and algae.
“Methanogens” refers to microorganisms that produce methane as a metabolic byproduct in anoxic conditions. Methanogens are understood to include any microorganism that possesses the cellular machinery for producing methane, including those microorganisms with endogenous systems and microorganisms expressing heterologous systems. Methanogens include bacteria and fungi, but are most commonly classified as archaea. They are common in wetlands, where they are responsible for marsh gas, and in the digestive tracts of animals such as ruminants and humans. Only two genera of archaeal methanogens, Methanosarcina (Msr.) and Methanosaeta (Mse.), contain species that produce methane by the “aceticlastic” pathway. Microbes that produce methane by the “aceticlastic” pathway, including but not limited to Msr. and Mse. microbes, are referred to herein as “acetate-utilizing” or “aceticlastic”, which are used interchangeably. Methanosaeta (previously Methanothrix) species have a higher affinity for acetate than Methanosarcina and dominate in methanogenic habitats with low concentrations of acetate. The biochemistry of the aceticlastic pathway has been investigated primarily in the freshwater Methanosarcina species Msr. barkeri, Msr. mazei and Msr. thermophila and the marine isolate Msr. acetivorans.
“Methylreductase” (“Mcr”) refers to a group of enzymes, including coenzyme-B sulfoethylthiotransferase, also known as methyl-coenzyme M reductase or most systematically as 2-(methylthio)ethanesulfonate:N-(7-thioheptanoyl)-3-O-phosphothreonine S-(2-sulfoethyl)thiotransferase, that catalyze the final step in the formation of methane. Mcr refers to any enzyme that combines the hydrogen donor coenzyme B and the methyl donor coenzyme M. Mcr are most commonly expressed by methanogens.
As used herein, “mutation” includes reference to alterations in the nucleotide sequence of a polynucleotide, such as for example a gene or coding DNA sequence (CDS), compared to the wild-type sequence. The term includes, without limitation, substitutions, insertions, frameshifts, deletions, inversions, translocations, duplications, splice-donor site mutations, point-mutations or the like.
As used herein, “nucleic acid” includes reference to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses conservatively modified variants and known analogues having the essential nature of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide nucleic acids).
As used herein “operably linked” includes reference to a functional linkage between a promoter and a second sequence, wherein the promoter sequence initiates and mediates transcription of the DNA sequence corresponding to the second sequence. Generally, operably linked means that the nucleic acid sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame.
As used herein, “polynucleotide” includes reference to a deoxyribopolynucleotide, ribopolynucleotide, or conservatively modified variants; the term may also refer to analogs thereof that have the essential nature of a natural ribonucleotide in that they hybridize, under stringent hybridization conditions, to substantially the same nucleotide sequence as naturally occurring nucleotides and/or allow translation into the same amino acid(s) as the naturally occurring nucleotide(s). A polynucleotide can be full-length or a subsequence of a native or heterologous structural or regulatory gene. Unless otherwise indicated, the term includes reference to the specified sequence as well as the complementary sequence thereof. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are “polynucleotides” as that term is intended herein. Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, to name just two examples, are polynucleotides as the term is used herein. It will be appreciated that a great variety of modifications have been made to DNA and RNA that serve many useful purposes known to those of skill in the art.
The term polynucleotide as it is employed herein embraces such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including among other things, simple and complex cells.
The terms “polypeptide”, “peptide” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. The terms also may apply to conservatively modified variants and to amino acid polymers in which one or more amino acid residues is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The essential nature of such analogues of naturally occurring amino acids is that, when incorporated into a protein, that protein is specifically reactive to antibodies elicited to the same protein but consisting entirely of naturally occurring amino acids. The terms “polypeptide”, “peptide” and “protein” are also inclusive of modifications including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation. It will be appreciated, as is well known and as noted above, that polypeptides are not always entirely linear. For instance, polypeptides may be branched as a result of ubiquitination, and they may be circular, with or without branching, generally as a result of posttranslational events, including natural processing events and events brought about by human manipulation which do not occur naturally. Circular, branched and branched circular polypeptides may be synthesized by non-translation natural processes and by entirely synthetic methods, as well. Further, this invention contemplates the use of both the methionine-containing and the methionine-less amino terminal variants of the protein of the invention.
As used herein “promoter” includes reference to a region of DNA upstream from the start of transcription and involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. Examples of promoters under developmental control include promoters that preferentially initiate transcription in certain tissues, such as leaves, roots, or seeds. Such promoters are referred to as “tissue preferred”. Promoters which initiate transcription only in certain tissue are referred to as “tissue specific”. A “cell type” specific promoter primarily drives expression in certain cell types in one or more organs, for example, vascular cells in roots or leaves. An “inducible” or “repressible” promoter is a promoter which is under environmental control. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions or the presence of light. Tissue specific, tissue preferred, cell type specific, and inducible promoters constitute the class of “non-constitutive” promoters. A “constitutive” promoter is a promoter which is active under most environmental conditions.
As used herein “recombinant” includes reference to a cell or vector, that has been modified by the introduction of a heterologous nucleic acid or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found in identical form within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under-expressed or not expressed at all as a result of deliberate human intervention. The term “recombinant” as used herein does not encompass the alteration of the cell or vector by naturally occurring events (e.g., spontaneous mutation, natural transformation/transduction/transposition) such as those occurring without deliberate human intervention.
As used herein, a “recombinant expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements which permit transcription of a particular nucleic acid in a host cell. The recombinant expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant expression cassette portion of an expression vector includes, among other sequences, a nucleic acid to be transcribed, and a promoter.
The term “residue” or “amino acid residue” or “amino acid” are used interchangeably herein to refer to an amino acid that is incorporated into a protein, polypeptide, or peptide (collectively “protein”). The amino acid may be a naturally occurring amino acid and, unless otherwise limited, may encompass non-natural analogs of natural amino acids that can function in a similar manner as naturally occurring amino acids.
The term “selectively hybridizes” includes reference to hybridization, under stringent hybridization conditions, of a nucleic acid sequence to another nucleic acid sequence or other biologics. When utilizing a hybridization-based detection system, a nucleic acid probe is chosen that is complementary to a reference nucleic acid sequence, and then by selection of appropriate conditions the probe and the reference sequence selectively hybridize, or bind, to each other to form a duplex molecule. The term “sequence” refers to an amino acid or nucleotide sequence of any length, which can be DNA or RNA; can be linear, circular or branched and can be either single-stranded or double stranded. The term “donor sequence” refers to a nucleotide sequence that is inserted into a genome. A donor sequence can be of any length, for example between 2 and 10,000 nucleotides in length (or any integer value there between or thereabove), preferably between about 100 and 1,000 nucleotides in length (or any integer there between), more preferably between about 200 and 500 nucleotides in length.
A “homologous, non-identical sequence” refers to a first sequence which shares a degree of sequence identity with a second sequence, but whose sequence is not identical to that of the second sequence. For example, a polynucleotide comprising the wild-type sequence of a mutant gene is homologous and non-identical to the sequence of the mutant gene. In certain embodiments, the degree of homology between the two sequences is sufficient to allow homologous recombination there between, utilizing normal cellular mechanisms. Two homologous non-identical sequences can be any length and their degree of non-homology can be as small as a single nucleotide (e.g., for correction of a genomic point mutation by targeted homologous recombination) or as large as 10 or more kilobases (e.g., for insertion of a gene at a predetermined ectopic site in a chromosome). Two polynucleotides comprising the homologous non-identical sequences need not be the same length. For example, an exogenous polynucleotide (i.e., donor polynucleotide) of between 20 and 10,000 nucleotides or nucleotide pairs can be used.
Techniques for determining nucleic acid and amino acid sequence identity are known in the art. Typically, such techniques include determining the nucleotide sequence of the mRNA for a gene and/or determining the amino acid sequence encoded thereby, and comparing these sequences to a second nucleotide or amino acid sequence. Genomic sequences can also be determined and compared in this fashion. In general, identity refers to an exact nucleotide-to-nucleotide or amino acid-to-amino acid correspondence of two polynucleotides or polypeptide sequences, respectively.
Two or more sequences (polynucleotide or amino acid) can be compared by determining their percent identity. The percent identity of two sequences, whether nucleic acid or amino acid sequences, is the number of exact matches between two aligned sequences divided by the length of the shorter sequence and multiplied by 100. An approximate alignment for nucleic acid sequences is provided by the local homology algorithm of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981). This algorithm can be applied to amino acid sequences by using the scoring matrix developed by Dayhoff, Atlas of Protein Sequences and Structure, M. O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research Foundation, Washington, D.C., USA, and normalized by Gribskov, Nucl. Acids Res. 14(6):6745-6763 (1986). An exemplary implementation of this algorithm to determine percent identity of a sequence is provided by the Genetics Computer Group (Madison, Wis.) in the “BestFit” utility application. The default parameters for this method are described in the Wisconsin Sequence Analysis Package Program Manual, Version 8 (1995) (available from Genetics Computer Group, Madison, Wis.). A preferred method of establishing percent identity in the context of the present disclosure is to use the MPSRCH package of programs copyrighted by the University of Edinburgh, developed by John F. Collins and Shane S. Sturrok, and distributed by IntelliGenetics, Inc. (Mountain View, Calif.). From this suite of packages the Smith-Waterman algorithm can be employed where default parameters are used for the scoring table (for example, gap open penalty of 12, gap extension penalty of one, and a gap of six). From the data generated the “Match” value reflects sequence identity. Other suitable programs for calculating the percent identity or similarity between sequences are generally known in the art, for example, another alignment program is BLAST, used with default parameters.
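As a minimal sketch of the percent-identity definition given above (exact matches divided by the length of the shorter sequence, multiplied by 100), the following Python function scores two sequences that have already been aligned. The function name, the gap character "-", and the toy sequences are illustrative assumptions and are not part of the specification; the alignment itself would come from a program such as those named above.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Exact matches are counted column by column; the denominator is the
    length of the shorter ungapped sequence, as in the definition above.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # A column counts as a match only when both residues are identical
    # and neither is a gap.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    # Lengths of the original sequences, with gap characters removed.
    len_a = len(aligned_a.replace(gap, ""))
    len_b = len(aligned_b.replace(gap, ""))
    shorter = min(len_a, len_b)
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")

    return 100.0 * matches / shorter


# Hypothetical toy alignment (not taken from the specification):
# 6 matching columns, shorter sequence has 7 residues.
example = percent_identity("ACGTACGT", "ACGT-CGA")
```

In this toy case the second sequence has one gap and one mismatch, so six of its seven residues match and the result is 600/7 percent.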
For example, BLASTN and BLASTP can be used with the following default parameters: genetic code=standard; filter=none; strand=both; cutoff=60; expect=10; Matrix=BLOSUM62; Descriptions=50 sequences; sort by=HIGH SCORE; Databases=non-redundant, GenBank+EMBL+DDBJ+PDB+GenBank CDS translations+Swiss protein+Spupdate+PIR. Details of these programs can be found at the following internet address: http://www.ncbi.nlm.gov/cgi-bin/BLAST. With respect to sequences described herein, the range of desired degrees of sequence identity is approximately 80% to 100% and any integer value therebetween. Typically the percent identities between sequences are at least 70-75%, preferably 80-82%, more preferably 85-90%, even more preferably 92%, still more preferably 95%, and most preferably 98% sequence identity.
Alternatively, the degree of sequence similarity between polynucleotides can be determined by hybridization of polynucleotides under conditions that allow formation of stable duplexes between homologous regions, followed by digestion with single-stranded-specific nuclease(s), and size determination of the digested fragments. Two nucleic acid sequences, or two polypeptide sequences, are substantially homologous to each other when the sequences exhibit at least about 70%-75%, preferably 80%-82%, more preferably 85%-90%, even more preferably 92%, still more preferably 95%, and most preferably 98% sequence identity over a defined length of the molecules, as determined using the methods above. As used herein, substantially homologous also refers to sequences showing complete identity to a specified DNA or polypeptide sequence. DNA sequences that are substantially homologous can be identified in a Southern hybridization experiment under, for example, stringent conditions, as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Sambrook et al., supra; Nucleic Acid Hybridization: A Practical Approach, editors B. D. Hames and S. J. Higgins, (1985) Oxford; Washington, D.C.; IRL Press.
A “gene,” for the purposes of the present disclosure, includes a DNA region encoding a gene product (see infra), as well as all DNA regions which regulate the production of the gene product, whether or not such regulatory sequences are adjacent to coding and/or transcribed sequences. Accordingly, a gene includes, but is not necessarily limited to, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control regions.
“Gene expression” refers to the conversion of the information, contained in a gene, into a gene product. A gene product can be the direct transcriptional product of a gene (e.g., mRNA, tRNA, rRNA, antisense RNA, ribozyme, structural RNA or any other type of RNA) or a protein produced by translation of an mRNA. Gene products also include RNAs which are modified, by processes such as capping, polyadenylation, methylation, and editing, and proteins modified by, for example, methylation, acetylation, phosphorylation, ubiquitination, ADP-ribosylation, myristoylation, and glycosylation.
“Modulation” of gene expression refers to a change in the activity of a gene. Modulation of expression can include, but is not limited to, gene activation and gene repression.
As used herein, the terms “coding region” and “coding sequence” are used interchangeably and refer to a nucleotide sequence that encodes a polypeptide and, when placed under the control of appropriate regulatory sequences expresses the encoded polypeptide. The boundaries of a coding region are generally determined by a translation start codon at its 5′ end and a translation stop codon at its 3′ end. A “regulatory sequence” is a nucleotide sequence that regulates expression of a coding sequence to which it is operably linked. Non-limiting examples of regulatory sequences include promoters, enhancers, transcription initiation sites, translation start sites, translation stop sites, and transcription terminators. The term “operably linked” refers to a juxtaposition of components such that they are in a relationship permitting them to function in their intended manner. A regulatory sequence is “operably linked” to a coding region when it is joined in such a way that expression of the coding region is achieved under conditions compatible with the regulatory sequence.
A polynucleotide that includes a coding region may include heterologous nucleotides that flank one or both sides of the coding region. As used herein, “heterologous nucleotides” refer to nucleotides that are not normally present flanking a coding region that is present in a wild-type cell. For instance, a coding region present in a wild-type microbe and encoding a Mcr polypeptide is flanked by homologous sequences, and any other nucleotide sequence flanking the coding region is considered to be heterologous. Examples of heterologous nucleotides include, but are not limited to regulatory sequences. Typically, heterologous nucleotides are present in a polynucleotide disclosed herein through the use of standard genetic and/or recombinant methodologies well known to one skilled in the art. A polynucleotide disclosed herein may be included in a suitable vector.
Conservatively modified variants refer to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or conservatively modified variants of the amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations” and represent one species of conservatively modified variation. Every nucleic acid sequence herein that encodes a polypeptide also, by reference to the genetic code, describes every possible silent variation of the nucleic acid.
One of ordinary skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine; and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide of the present invention is implicit in each described polypeptide sequence and is within the scope of the present invention.
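The idea of silent variation described above can be sketched in Python as follows. The table is the standard genetic code written in RNA letters; the helper names and the nine-nucleotide example sequence are hypothetical and chosen only to show that alanine has four interchangeable codons while methionine (AUG) and tryptophan (UGG) each have one.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse lookup: amino acid -> list of codons that encode it.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def split_codons(rna: str) -> list:
    """Split an RNA coding sequence into consecutive codons."""
    if len(rna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def translate(rna: str) -> str:
    """Translate an RNA coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[c] for c in split_codons(rna))


def silent_variants(rna: str):
    """Yield every RNA sequence that encodes the same polypeptide."""
    choices = [SYNONYMS[CODON_TABLE[c]] for c in split_codons(rna)]
    for combo in product(*choices):
        yield "".join(combo)


# Hypothetical example: Met-Ala-Trp. Only the alanine codon can vary.
variants = sorted(silent_variants("AUGGCAUGG"))
```

Because AUG and UGG have no synonyms, the example yields exactly four sequences, one per alanine codon, and each translates to the same three-residue peptide.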
As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a “conservatively modified variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Thus, any number of amino acid residues selected from the group of integers consisting of from 1 to 15 can be so altered. Thus, for example, 1, 2, 3, 4, 5, 7, or 10 alterations can be made.
Conservatively modified variants typically provide similar biological activity as the unmodified polypeptide sequence from which they are derived. For example, substrate specificity, enzyme activity, or ligand/receptor binding is generally at least 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the native protein for its native substrate. Conservative substitution tables providing functionally similar amino acids are well known in the art.
The following six groups each contain amino acids that are conservative substitutions for one another:
1) Alanine (A), Serine (S), Threonine (T); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W). See also, Creighton (1984) Proteins W. H. Freeman and Company.
The above summary of the present invention is not intended to describe each disclosed embodiment or every implementation of the present invention. The description that follows more particularly exemplifies illustrative embodiments. In several places throughout the application, guidance is provided through lists of examples, which examples can be used in various combinations. In each instance, the recited list serves only as a representative group and should not be interpreted as an exclusive list.
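The six conservative substitution groups recited above can be encoded directly, which gives one possible way to screen a variant polypeptide against its reference. The following Python sketch assumes two sequences of equal length with no insertions or deletions; the function names and the five-residue example sequences are hypothetical.

```python
# The six groups recited above, as one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("AST"),   # 1) Alanine, Serine, Threonine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(ref: str, var: str) -> bool:
    """True if replacing residue `ref` with `var` stays within one group."""
    ref, var = ref.upper(), var.upper()
    if ref == var:
        return False  # identical residues are not a substitution
    return any(ref in g and var in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str) -> list:
    """List (position, ref, var, conservative?) for each differing residue.

    Positions are 1-based. Assumes equal-length sequences (substitutions
    only), which is a simplification made for this sketch.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (i, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]


# Hypothetical example: K->R at position 2 and E->D at position 5.
changes = classify_substitutions("MKTLE", "MRTLD")
```

A variant whose reported changes are all flagged as conservative, and whose count falls within the range of alterations stated above, would be a candidate conservatively modified variant under this simplified check.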
As used herein “full-length sequence” in reference to a specified polynucleotide or its encoded protein means having the entire amino acid sequence of a native (nonsynthetic), endogenous, biologically active form of the specified protein. Methods to determine whether a sequence is full-length are well known in the art including such exemplary techniques as northern or western blots, primer extension, S1 protection, and ribonuclease protection. Comparison to known full-length homologous (orthologous and/or paralogous) sequences can also be used to identify full-length sequences of the present invention. Additionally, consensus sequences typically present at the 5′ and 3′ untranslated regions of mRNA aid in the identification of a polynucleotide as full-length. For example, the consensus sequence ANNNNAUGG, where the AUG codon represents the N-terminal methionine, aids in determining whether the polynucleotide has a complete 5′ end. Consensus sequences at the 3′ end, such as polyadenylation sequences, aid in determining whether the polynucleotide has a complete 3′ end.
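A search for the 5′ consensus sequence ANNNNAUGG mentioned above can be sketched with a regular expression. In this Python sketch, N is taken to mean any of A, C, G or U, DNA input is accepted by converting T to U, and the function name and test sequences are hypothetical; a match is only a hint that the 5′ end is complete, not proof.

```python
import re
from typing import Optional

# ANNNNAUGG: an A, any four nucleotides, then AUG followed by G.
CONSENSUS = re.compile(r"A[ACGU]{4}AUGG")


def find_start_context(sequence: str) -> Optional[int]:
    """Return the 0-based index of the AUG in the first consensus match.

    Returns None when the consensus is not present.
    """
    rna = sequence.upper().replace("T", "U")
    match = CONSENSUS.search(rna)
    if match is None:
        return None
    # The AUG begins five positions after the leading A of the motif.
    return match.start() + 5


# Hypothetical example: the motif ACCGCAUGG begins at index 2,
# so the AUG codon begins at index 7.
aug_index = find_start_context("GCACCGCAUGGCU")
```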
As used herein, “heterologous” in reference to a nucleic acid is a nucleic acid that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention. For example, a promoter operably linked to a heterologous structural gene is from a species different from that from which the structural gene was derived, or, if from the same species, one or both are substantially modified from their original form. A heterologous protein may originate from a foreign species or, if from the same species, is substantially modified from its original form by deliberate human intervention.
Engineered pathways are provided for converting methane (CH4) to liquid fuels using a two-phase approach. Phase I involves an engineered reversal of the natural pathway for acetate conversion to CH4, such as by Methanosarcina acetivorans. Phase II involves converting the acetate to liquid fuels via CO oxidation, co-culture with acetate-utilizing species, or non-biological processes. It is important to note that naturally occurring cultures produce ethanol from acetate utilizing H2 as a reductant. Phase I has three significant advantages: the pathway (i) conserves all electrons in CH4, (ii) consumes CO2 and (iii) produces a C—C bond. The choice of M. acetivorans is appropriate as: (i) trace CH4 oxidation has been documented, (ii) a novel methanogenic pathway (the first and only) has been engineered in M. acetivorans, (iii) the genome is sequenced and a well-documented robust genetic system is available, (iv) the first, and only, over-production of an active methanogen metalloenzyme was accomplished in M. acetivorans, (v) the pathway for acetate formation from relevant pathway intermediates has been elucidated, (vi) the enzymology required for potential CH4 conversion to acetate is well-characterized, (vii) novel enzymes and physiologies to reverse engineer the acetate-to-CH4 pathway are in hand and (viii) a molecular understanding of gene regulation in the acetate-to-CH4 pathway is well underway. After reversing the acetate-to-methane pathway, methane may be converted to ethanol by supplying a reductant (CO or electricity) to M. acetivorans that has been further engineered to express three genes for ethanol production from acetate. Alternatively, the engineered M. acetivorans may be coupled with cultures converting acetate to ethanol assisted with CO or H2.
Opportunity of the Marcellus Shale. Initiated in 2004, horizontal drilling combined with hydraulic fracturing has made the methane resources of the Marcellus Shale economical to tap. The Marcellus Shale is a 54,000 square mile black shale formation that lies up to 10,000 feet below primarily Pennsylvania and New York and holds up to 516 trillion cubic feet of natural gas; for Pennsylvania, the recoverable gas is worth more than $500B. These and other reserves create an exciting time in which the U.S. can contemplate energy independence. Unfortunately, methane is released in significant quantities as a greenhouse gas in some cases, and it is difficult to transport from remote areas. Also, liquid biofuels are easier to use with the existing automotive infrastructure. Therefore, it would be advantageous to convert methane to a liquid fuel in situ. This application aims to accomplish this through biological means by engineering a microbe such as Methanosarcina acetivorans to convert methane gas for use as a liquid fuel. In particular, M. acetivorans may be engineered to reverse the natural pathway for conversion of acetate to methane and then biological and/or non-biological methods may be used for converting the acetate to liquid fuels.
Microbiology of the natural pathway converting acetate to methane. Methane is the end product of the decomposition of complex organic matter in diverse O2-free (anaerobic) environments, producing nearly one billion metric tons of methane each year. The process is an essential link in Earth's carbon cycle (
Approximately one-third of the methane produced in Earth's biosphere is generated by the reduction of CO2 with electrons derived from the oxidation of H2 (
CO2+4H2→CH4+2H2O [Eq. 1]
The remaining two-thirds originates from the methyl group of acetate (
CH3COO−+H+→CH4+CO2 [Eq. 2]
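The two overall methanogenesis reactions (Eq. 1, CO2 reduction with H2, and Eq. 2, acetate cleavage to CH4 and CO2) can be checked for atom and charge balance with a short script. The Python sketch below is illustrative only: the parser handles simple formulas without parentheses, charges are supplied separately, and all function names are hypothetical.

```python
import re
from collections import Counter


def atoms(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'CH3COO' or 'H2O'."""
    counts = Counter()
    for element, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(n) if n else 1
    return counts


def side_totals(side):
    """Sum atoms and charge over a list of (coefficient, formula, charge)."""
    total = Counter()
    charge = 0
    for coeff, formula, q in side:
        for element, n in atoms(formula).items():
            total[element] += coeff * n
        charge += coeff * q
    return total, charge


def is_balanced(reactants, products) -> bool:
    """True when both sides carry the same atoms and the same net charge."""
    return side_totals(reactants) == side_totals(products)


# Eq. 1: CO2 + 4 H2 -> CH4 + 2 H2O
eq1 = is_balanced(
    [(1, "CO2", 0), (4, "H2", 0)],
    [(1, "CH4", 0), (2, "H2O", 0)],
)

# Eq. 2: CH3COO- + H+ -> CH4 + CO2
eq2 = is_balanced(
    [(1, "CH3COO", -1), (1, "H", +1)],
    [(1, "CH4", 0), (1, "CO2", 0)],
)
```

Both reactions balance as written; changing the H2 coefficient in Eq. 1 to any value other than 4 makes the check fail.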
Pathway converting acetate to methane.
Biochemical and bioinformatic evidence indicates that the core methyl transfer steps leading from the methyl group of acetate to methane are similar in all freshwater and marine species (
CH3COO−+ATP→CH3CO2PO3−2+ADP [Eq. 3]
CH3CO2PO3−2+HS-CoA→CH3COSCoA+Pi [Eq. 4]
CH3COSCoA+H4SPT+H2O+Fdo→CH3—H4SPT+Fdr+CO2+HS-CoA [Eq. 5]
The structure and function of the enzymes catalyzing these reactions have been investigated in considerable detail, the understanding of which has impacted the broader field of prokaryotic biology in view of the fact that paralogs are widespread in diverse anaerobes from the Bacteria domain. Acetate kinase (
CH3COO−+ATP+CoASH→CH3COSCoA+AMP+PPi [Eq. 6]
The Cdh cleaves the C—C and C—S bonds of acetyl-CoA (Eq. 5) yielding methyl and carbonyl groups. The methyl group is transferred to H4SPT for eventual conversion to methane and the carbonyl group is oxidized to CO2 with the electrons transferred to a 2x[4Fe-4S] ferredoxin. In addition to aceticlastic methanogens, diverse anaerobes from the Bacteria domain utilize Cdh in energy-yielding pathways generating acetyl-CoA with CoA-SH, a methyl group and CO2 plus a pair of electrons. The acetyl-CoA is further metabolized to acetate catalyzed by phosphotransacetylase and acetate kinase producing ATP. The Cdh from methanogens has been biochemically characterized from both Methanosarcina and Methanosaeta species. Most structure and function studies have been performed with the enzymes from Msr. thermophila and Msr. barkeri. In both species the enzyme is a complex composed of five subunits (α,β,γ,δ,ε) resolvable by detergent treatment into a Ni/Fe—S component (α and ε subunits), a Co/Fe—S component (γ and δ subunits), and the β subunit. The resolved Ni/Fe—S component catalyzes the reversible oxidation of CO to CO2 utilizing a 2x[4Fe-4S] ferredoxin as the redox partner consistent with a role for this component in oxidizing the carbonyl group of acetyl-CoA and reducing ferredoxin. The Co/Fe—S component of the Cdh from Methanosarcina species contains Factor III and is involved in transfer of the methyl group of acetyl-CoA to H4SPT. The Co/Fe—S component also contains a [4Fe-4S] cluster with a midpoint potential at pH 7.8 of −502 mV, which is nearly isopotential with the Co2+/Co1+ couple and likely serves as the direct electron donor. The cdhD and cdhE genes which encode the δ and γ subunits have been cloned and sequenced. The CdhE sequence contains a four-cysteine motif with the potential to bind a 4Fe-4S cluster. Further, CdhE overproduced in E. coli contains the cluster and a corrinoid cofactor with the benzimidazole base in the base-off configuration.
The results are consistent with a role for the γ subunit in transfer of the methyl group to H4SPT.
The conversion of CH3—H4SPT to methane is common to all methanogenic pathways and requires three reactions catalyzed by an eight-subunit, membrane-bound, CH3—H4SPT:coenzyme M methyltransferase (MtrA-H) (Eq. 7), CH3-CoM methylreductase (McrABC) (Eq. 8) and heterodisulfide reductase (HdrDE) (Eq. 9).
CH3-THMPT+HS-CoM→CH3—S-CoM+THMPT [Eq. 7]
CH3—S-CoM+HS-CoB→CoMS—SCoB+CH4 [Eq. 8]
CoM-S—S-CoB+2e−+2H+→HS-CoB+HS-CoM [Eq. 9]
The methyltransferase (
The heterodisulfide CoM-S—S-CoB is the terminal electron acceptor of a membrane-bound electron transport chain coupled to formation of an electrochemical ion gradient driving ATP synthesis. The “archaeal” A1A0-type ATP synthase is abundant in acetate-grown Msr. acetivorans and Msr. mazei consistent with a role in ATP synthesis. Freshwater Methanosarcina species utilize a membrane-bound reduced ferredoxin (Fdr):CoM-S—S-CoB oxidoreductase system involving the production and consumption of H2. However, Msr. acetivorans, a marine species, has evolved a mechanism for oxidizing ferredoxin and reducing CoM-S—S-CoB that does not involve H2 (
The conversion of acetate to CH4 and CO2 provides only a marginal amount of energy available for ATP synthesis (ΔG°′=−36 kJ/CH4). A calorimetric and thermodynamic analysis of Msr. barkeri grown with acetate suggests a retarding effect of the positive enthalpy change on the driving force of growth that is overcompensated by a large positive entropy change resulting from the conversion of acetate to only gaseous products. Since both the enthalpy and the entropy increases are due in part to transition of CH4 and CO2 into the gaseous phase, it is proposed that a carbonic anhydrase (
CO2+H2O→HCO3−+H+ [Eq. 10]
The synthesis of Cam is up-regulated in Msr. thermophila, Msr. acetivorans and Msr. mazei when switched from growth on methanol to growth on acetate consistent with a role during growth on acetate. Cam from Msr. thermophila is the archetype of an independently-evolved class of carbonic anhydrases (γ class) and is the first carbonic anhydrase shown to function with iron in the active site. This provides an overview of the native methane utilization pathways in archaea. Enzyme engineering techniques, such as directed evolution, may be used to improve enzymatic performance of key steps.
Reversal of the Natural Pathway for Acetate Conversion to Methane
One aspect of the present invention reverses the natural pathway converting acetate to methane, instead converting methane to acetate. Such reversal may be accomplished using a variety of approaches, for example by altering the activity or expression of key enzymes. Alteration can be achieved through manipulation and engineering techniques, for example by directed evolution, DNA shuffling, mutagenesis (e.g. saturation mutagenesis), enzyme redesign, heterologous expression of enzymes, and other molecular genetics approaches.
Key enzymes include those that carry out or contribute to the reactions set out in the above equations. For example, one or more of methylreductase (McrABC), coenzyme M (HS-CoM), bound methyltransferase complex (Mtr), soluble monomeric methyltransferase (CmtA), dehydrogenase/acetyl-CoA synthase (Cdh), the Rnf complex and methanophenazine (MP), phosphotransacetylase and acetate kinase, and methyl-CoM methylreductase may be specifically altered.
In one aspect, the natural pathway for acetate conversion to methane may be reversed along with modifications to specific aspects of the pathway. In an exemplary embodiment, shown in
In another aspect, replacing membrane-bound complexes may involve reversing or circumventing the natural sodium gradient, for example by replacing endogenous 8-subunit membrane-bound methyltransferase complex (Mtr) with a soluble monomeric methyltransferase (CmtA) that does not require a sodium gradient. In another aspect, a homolog of the CO dehydrogenase/acetyl-CoA synthase (Cdh) of Methanosarcina spp. may be provided or expressed, which is capable of consuming CO2 and conserving electrons generated by oxidation of HS-CoB and HS-CoM to the heterodisulfide CoMS-SCoB with production of reduced ferredoxin (Fd).
In certain aspects, reversal of the natural pathway for acetate conversion to methane may involve altering expression or activity of enzymes that are common to all methanogenic pathways and are normally constitutively expressed. In other aspects, reversal of the pathway may involve expression of engineered enzymes under strong and/or inducible promoters. In other aspects, reversal of the pathway may involve utilizing enzymes that are naturally induced under certain conditions, for example growth in the presence of acetate. One or more of these approaches may be used in various combinations.
In another aspect, electricity may be used to supply reductant in place of CO. This approach may be preferred in light of the reported reduction of carbon dioxide to methane by a methanogen using electricity to supply electrons.
DNA shuffling. Directed evolution, or DNA shuffling, is a powerful mutagenesis technique that mimics the natural molecular evolution of genes to efficiently redesign them. It can introduce multiple mutations into a gene to create new enzymatic activity, which is then found by a suitable method of screening or selection. Because the method can access improving mutations far from the active site that are difficult to predict rationally using molecular modeling or intuition, it is not necessary to know the 3-D structure to optimize enzyme activity; directed evolution identifies mutations that influence activity through subtle, long-range interactions. This method was developed by Willem Stemmer of Affymax Research Institute (now Maxygen) and consists of using PCR without oligo primers to re-assemble a gene from random 10-300 bp DNA fragments generated by first cleaving the gene with DNase. After the original gene is re-assembled from these fragments using a series of homologous recombinations and extensions with dNTPs and polymerase, conventional PCR with nested oligo primers is performed to yield the full-length gene with random mutations. The mutations arise from infidelity in the assembly process, PCR infidelity (polymerase base-reading errors), and insertion of mutated gene fragments during assembly (controlled by the researcher by adding specific oligos or DNA fragments from related but not identical genes). The advantages of this method are that DNA shuffling introduces mutations much more efficiently than other methods (e.g., unlike DNA shuffling, error-prone PCR and oligonucleotide cassette mutagenesis are not combinatorial), and that it may be used to create a chimeric gene by reassembling closely related genes (molecular breeding). This method has been used to increase the antibiotic resistance conferred by β-lactamase by 32,000-fold, to increase the fluorescence signal of the green fluorescent protein by 45-fold, and to evolve a fucosidase from β-galactosidase. Family shuffling has also been used to combine sequences of related enzymes to improve enzyme activity.
Saturation mutagenesis. Saturation mutagenesis is used to introduce all possible amino acids at key sites to explore a larger fraction of the protein sequence space than site-specific mutagenesis can. It can provide much more comprehensive information than can be achieved by single-amino acid substitutions, and it overcomes a drawback of random mutagenesis (e.g., error-prone PCR, DNA shuffling): a single mutation randomly placed in a codon generates on average only 5.6 out of 19 possible substitutions, because it is very difficult to introduce more than one bp change per codon using these methods, and one change is insufficient to introduce all amino acids at each codon. To determine the number of independent clones from saturation mutagenesis that need to be screened to ensure each possible codon has been tested, a multinomial distribution equation was developed; for example, it was determined that 292 colonies need to be screened if a single codon is changed.
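The order of magnitude of the screening effort can be illustrated with a minimal sketch. It does not reproduce the multinomial distribution equation referred to above; it assumes instead an idealized library in which all 64 codons are equally likely in every clone, and asks how many clones are needed for any one given codon to be sampled with 99% probability. The function name, the uniform-library assumption, and the 99% threshold are illustrative assumptions, not values taken from the text.

```python
import math


def clones_for_codon_coverage(num_codons=64, coverage=0.99):
    """Clones needed so that one given codon is sampled with probability `coverage`.

    Assumes (hypothetically) that every codon is equally likely in each clone.
    Solves 1 - (1 - 1/num_codons)**n = coverage for n.
    """
    miss_probability = 1.0 - 1.0 / num_codons  # chance a clone lacks the codon
    return math.log(1.0 - coverage) / math.log(miss_probability)


if __name__ == "__main__":
    print(f"clones needed: {clones_for_codon_coverage():.1f}")
```

Under these assumptions the result is between 292 and 293 clones for a single randomized codon, which is of the same magnitude as the 292 colonies stated above.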
These techniques (DNA shuffling and saturation mutagenesis) may be used to increase the effectiveness of Mcr in converting methane to methyl coenzyme M (CH3—SCoM, step 1).
Molecular genetics with methanoarchaea. The pathways producing methane utilize several specialized metalloenzymes unique to the methanoarchaea that contain a diversity of metals and unique cofactors in the active site. The Inventors previously developed a system for overproduction of recombinant metalloproteins in M. acetivorans to circumvent problems associated with overproduction in eukaryotic or bacterial systems. The system was validated by overproduction of catalytically active Cam, an iron-containing carbonic anhydrase that functions in the pathway (
Rapid generation of metabolic models. Genome-scale models have been developed for a wide variety of organisms including pathogens, archaea and plants. Table 1 summarizes the reconstruction efforts and highlights size statistics and quality metrics. Specificity, as shown in Table 1, is defined as the % of correctly identified true positives, such as essential genes or feasible growth substrates, while sensitivity is the % of correctly identified true negatives, such as non-essential genes or substrates that do not allow for growth. Systematic procedures have been used to refine and improve the prediction accuracy of the draft reconstructed metabolic models. Examples include model refinement via GapFind and GapFill to unblock biomass precursors and reconnect unreachable metabolites, as well as evaluation and improvement of model performance when compared to in vivo gene essentiality or synthetic lethality data (if available) using the GrowMatch and Extended GrowMatch procedures. A knowledgebase called MetRxn has also been developed for reaction/metabolite data standardization, correction, and congruency to aid model reconstruction efforts.
Table 1. Organisms with reconstructed genome-scale metabolic models: Mycoplasma genitalium, Salmonella Typhimurium, Methanosarcina acetivorans, Zea mays, Cyanothece sp., Synechocystis
A genome-scale metabolic model (iVS941) has been developed for Methanosarcina acetivorans. M. acetivorans, with a genome size of ~5.7 Mb, has the largest genome among sequenced methanogenic archaea and is unique amongst the methanogens in its biochemical characteristics. The generated model iVS941 accounts for 941 genes, 705 reactions and 708 metabolites. A systematic procedure relying on the GapFind, GapFill and GrowMatch procedures enabled sequential evaluation and improvement of model capabilities. In the completed model, 87% of the metabolites can be produced, and the model shows a high agreement of 93.3% with published in vivo growth data across different substrates and genetic perturbations, with a specificity of 81% and a sensitivity of 89.7%. The model also correctly recapitulates metabolic pathway usage patterns of M. acetivorans, such as the indispensability of flux through methanogenesis for growth on acetate and methanol and the unique biochemical characteristics under growth on carbon monoxide.
Metabolic model reconstruction pipeline. The metabolic model reconstruction process follows four major steps (see
The model reconstruction and refinement tools may be used in conjunction with the ModelSEED resource and MetRxn knowledgebase to rapidly reconstruct highly curated metabolic models for methanogens. The energetics, electron flow and regulation of export/import of various metabolites and signaling molecules may be described.
Community metabolic modeling using multi-level and multi-criteria optimization techniques. OptCom is a flux balance analysis framework for microbial communities using genome-scale metabolic models, which relies on a multi-level and multi-objective optimization formulation to properly describe trade-offs between individual vs. community level fitness criteria (see
An automated computational workflow may be used to select, from an ensemble of microbial models, which member would lead to the most efficient utilization of available metabolic resources. Such a workflow has been used in the design of the synthetic microbial community composed of M. acetivorans co-cultured with an acetate-utilizing species to alleviate the thermodynamically unfavorable conditions.
In addition, targeted gene knockouts may be introduced. These genetic manipulations may be warranted for forcing a desired obligatory syntrophic relation in the community (so that metabolic flows can be uniquely attributed) or as part of optimizing the production of a targeted product metabolite. To this end, computational strain design procedures such as OptKnock and OptForce may be extended to a multi-species setting (see
De novo enzyme design. Computational protein design techniques may be used in support of the enzyme engineering for a targeted pathway.
In
Here, an enzyme design method termed OptZyme may be used. The initial step of OptZyme is assessing whether the enzyme should be redesigned to modify kcat, KM, kcat/KM, or a combination of these kinetic properties. If the objective kinetic property is either kcat or kcat/KM, an appropriate transition state analogue (TSA) that resembles the transition state of the desired reaction must be chosen. TSA selection can be guided by the transition state structure, resolved from quantum mechanics calculations, or by reported inhibitory compounds. Next, key catalytic contacts are identified, and harmonic restraints are imposed to preserve enzymatic activity during the redesign process. Finally, IPRO is employed to find mutations that minimize interaction energy with the appropriate substrate. Multiple IPRO trajectories can be used to identify various mutants with low objective function values. These values can be further validated by modeling the enzyme's active site with quantum mechanics, while representing the remainder of the enzyme with molecular mechanics.
For the E. coli β-glucuronidase (GUS), KM correlates (R2=0.960) with the interaction energy with the native substrate analogue, while kcat/KM correlates (R2=0.864) with the transition state analogue (TSA) D-glucaro-1,5-lactone (see
OptZyme may be used to alter enzyme specificity from a native substrate analogue, para-nitrophenyl-β-D-glucuronide, to a novel substrate, para-nitrophenyl-β-D-galactoside. OptZyme was used to improve kcat, KM, and kcat/KM for both substrates. Differences were observed between the corresponding libraries for the two substrates, demonstrating the power of OptZyme to distinguish between similar substrates. In addition, separate OptZyme runs may be used to distinguish residues essential for substrate binding from residues important for substrate turnover. Using various objectives within OptZyme, it was found that large, polar side chains (e.g. lysine and aspartate) favored the improvement of kcat/KM for GUS. This tendency was attributed to the more flattened geometry of the TSA relative to the substrate. Furthermore, this propensity could be used to bias combinatorial libraries toward large, polar amino acids if the goal of the redesign is improving kcat/KM. This computational infrastructure may be deployed and fine-tuned for enzymes whose activity is identified as limiting in the assembled pathway.
Novel biocatalysts through DNA shuffling and saturation mutagenesis. DNA shuffling has been successfully deployed for monooxygenases, dioxygenases, epoxide hydrolases, and biofilm regulators. Hence, such approaches have been used to create better catalysts as well as better regulators for synthetic circuits.
Soluble methane monooxygenase (sMMO) was first cloned with activity in a heterologous host in 1994. Since then, the related toluene monooxygenases have been evolved in E. coli for various applications, including the production of 8 industrial compounds that could not previously be made by bacteria or enzymes (nitrohydroquinone, 4-methylresorcinol, 1-hydroxyfluorene, 3-hydroxyfluorene, 4-hydroxyfluorene, 2-naphthol, 2,6-dihydroxynaphthalene, and 3,6-dihydroxyfluorene), and six residues that influence catalysis have been discovered (gate residue I100, A101, A107, A110, M180, and gate residue E214,
The Inventors have also used these methods to evolve dioxygenases for green chemistry applications and for bioremediation. For example, the Inventors evolved the first two enzymes in the pathway for 2,4-dinitrotoluene (2,4-DNT) degradation, and the evolution of the large subunit of 2,4-DNT dioxygenase led to an increase in the rates of degradation of 2,3-DNT, 2,4-DNT, 2,5-DNT, 2,6-DNT, 2-NT, and 4-NT and to the first enzyme capable of degrading 2,3-DNT and 2,5-DNT; the evolution of 4-methyl-5-nitrocatechol monooxygenase (the second enzyme in the degradation of 2,4-DNT) led to a broadening of the enzyme substrate range to include 4-nitrophenol and 3-methyl-4-nitrophenol. The production of engineered enzymes by the Inventors has been described previously.
Synthetic circuits and engineered process regulators. In previous work determining that indole decreases biofilm formation, the Inventors also developed the first synthetic signaling circuit to control biofilm formation and used it to control the biofilm formation of E. coli and Pseudomonas fluorescens by manipulating the extracellular concentration of the signal indole. Based on this previous work, protein engineering may be used to create novel cell catalysts as well as circuits for various applications, including engineered CH4-to-acetate pathways.
Engineering Promoters
In certain embodiments, expression of one or more components of the engineered pathway for conversion of methane to acetate may be under the control of a particular promoter, for example an inducible promoter. In an exemplary embodiment, expression of one or more components may be under the control of a tetracycline-dependent promoter; for example, the promoter driving expression of one or more genes encoding a component of the engineered pathway may include additional nucleotide sequences permitting tetracycline inducibility.
The following biological material has been maintained by Applicant since prior to the filing date of this application. Access to biological material will be available during the pendency of the application to the Commissioner of Patents and Trademarks and persons determined by the Commissioner to be entitled thereto upon request.
Methanosarcina acetivorans
The biological material listed above will be deposited under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. The listed deposit will be maintained in the indicated international depository for at least 30 years and will be made available to the public upon the grant of a patent disclosing it. All restrictions imposed by the depositor on the availability to the public of the deposited materials will be irrevocably removed upon the granting of a patent in this application. The availability of a deposit does not constitute a license to practice the subject invention in derogation of patent rights granted by government action.
The examples below are provided only for illustrative purposes and not to limit the scope of the present invention. Numerous embodiments within the scope of the claims will be apparent to those of ordinary skill in the art; thus, the following non-limiting examples describe only particular embodiments of the invention.
To facilitate a better understanding of the present invention, the following examples of preferred or representative embodiments are given. In no way should the following examples be read to limit, or to define, the scope of the invention.
Engineered Reversal of the Natural Pathway for Acetate Conversion to Methane
An exemplary engineered CH4-to-acetate pathway (Phase I) is shown in
CO-assisted conversion of methane to liquid fuels. The engineered pathway shown in
Table 2 lists the engineered pathway reaction steps and the associated ΔG°′ for each under standard conditions. The pathway initiates with oxidation of CH4 (step 1) and transfer of the methyl group to form CH3—H4SPT (step 2), catalyzed by McrABC* and CmtA. The HS-CoM and HS-CoB produced in steps 1 and 2 are oxidized by the soluble ferredoxin:heterodisulfide oxidoreductase system expressed during growth with acetate to regenerate CoMS-SCoB (step 3). Oxidized ferredoxin is regenerated by transfer of electrons to NADP+ catalyzed by Fnr (step 4). CO supplies the carbonyl group for coupling with the methyl group of CH3—H4SPT, producing acetyl-CoA (step 5) in a reaction catalyzed by the Cdh synthesized during growth with acetate (ref. 112). A second molecule of CO is oxidized by the multi-functional Cdh, reducing ferredoxin (step 6) that is re-oxidized by Fnr (steps 6-7), contributing to the NADPH + H+ pool. The two molecules of NADPH and H+ are re-oxidized by Ald and Adh in steps 8-9, producing ethanol.
The acetate-grown and tetracycline-induced engineered M. acetivorans strain may be transferred to a buffered resting cell solution under one atmosphere containing one-third CH4 and two-thirds CO. Factors (temperature, pH, gas composition and partial pressures, and ethanol concentration) determining the rate, product composition, stoichiometry and ethanol tolerance will be examined. High tolerance to CO partial pressures (at least up to three atmospheres) is possible based on published accounts of growth on CO (refs. 58, 78). Although acetate kinase and phosphotransacetylase will be present (catalyzing acetyl-CoA+ADP→ATP+acetate) in the acetate-grown cells, little acetate production is anticipated based on unfavorable thermodynamics.
Electricity may be used to supply reductant in place of CO. This approach is encouraged by the published account for reduction of carbon dioxide to methane by a methanogen using electricity to supply electrons.
Calculating the ΔG of reaction under non-standard conditions for methane to ethanol. The non-standard ΔG (in kJ mol−1) is calculated by the following relation:
ΔG = ΔG°′ + RT·ln(Q) [Eq. 11]
where ΔG°′ is the standard Gibbs free energy change of reaction (in kJ mol−1), R is the universal gas constant (8.314×10−3 kJ mol−1 K−1), T is the temperature (298 K), and Q is the reaction quotient.
Under standard conditions of 1 atm and equimolar composition of CO, CH4 and CO2 in gaseous phase, the non-standard ΔG of the reaction is calculated in Table 3 for the CO-assisted phase II reaction:
H2O+2CO+CH4→CH3CH2OH+CO2 [Eq. 12]
a Using Henry's Law: Pi = KH · ci
b Concentration of water inside the cell is slightly less than that of pure water (55.5 M)
c Assumed maximum ethanol tolerance of 10 g/L, i.e., 0.217 M
As shown in Table 3, the net reaction is thermodynamically unfavorable (i.e., ΔG=+39.17 kJ mol−1). It is contemplated that this unfavorable condition can be alleviated by increasing the pressure to 20 atm in order to enhance the solubility (and hence activity) of CH4 and CO. Assuming 1 mM aqueous concentrations of ethanol and CO2, the analysis shows that the partial pressures of CO and CH4 that minimize the ΔG of the reaction, reducing it to a negative value, are 13 and 7 atm, respectively (see Table 4). Therefore, at sufficiently high pressures, with an approximately 2:1 ratio of CO to CH4, it is possible to ensure thermodynamic feasibility of the reaction (i.e., ΔG=−0.76 kJ mol−1).
a Using Henry's Law: Pi = KH · ci
b Concentration of water inside the cell is slightly less than that of pure water (55.5 M)
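The calculation of Eq. 11 for the reaction of Eq. 12 can be sketched in a few lines. This is an illustrative sketch only: the values of ΔG°′ and of the Henry's law constants are not given in the text, so they are left as parameters to be supplied, and all function names are hypothetical. The sketch also assumes that the total pressure is shared only between CO and CH4 and that dissolved gas concentrations follow Henry's law as in the table footnotes.

```python
import math

R = 8.314e-3  # universal gas constant, kJ mol^-1 K^-1 (value from the text)
T = 298.0     # temperature, K (value from the text)


def delta_g(delta_g0, q, temperature=T):
    """Eq. 11: non-standard Gibbs free energy change, in kJ/mol."""
    return delta_g0 + R * temperature * math.log(q)


def reaction_quotient(c_ethanol, c_co2, c_water, p_co, p_ch4, kh_co, kh_ch4):
    """Q for Eq. 12 (H2O + 2CO + CH4 -> CH3CH2OH + CO2).

    Dissolved gas concentrations come from Henry's law, c_i = P_i / K_H.
    The Henry's constants are caller-supplied assumptions.
    """
    c_co = p_co / kh_co
    c_ch4 = p_ch4 / kh_ch4
    return (c_ethanol * c_co2) / (c_water * c_co ** 2 * c_ch4)


def best_integer_split(total_pressure_atm):
    """Integer (P_CO, P_CH4) split of the total pressure that minimizes Q.

    Only the factor 1 / (P_CO^2 * P_CH4) depends on the split, so the
    Henry's constants and product concentrations do not affect the ranking.
    """
    candidates = [(p_co, total_pressure_atm - p_co)
                  for p_co in range(1, total_pressure_atm)]
    return max(candidates, key=lambda split: split[0] ** 2 * split[1])
```

Under these assumptions, best_integer_split(20) returns (13, 7), which agrees with the partial pressures of CO and CH4 stated above for a total pressure of 20 atm. The agreement reflects the stoichiometry of Eq. 12, in which two molecules of CO are consumed per molecule of CH4.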
Protein & pathway engineering. In addition to the assembly of the pathway, improvements in the catalytic efficiency of Steps 1 through 5 (
The screen for enhanced, reversed Mcr activity may be conducted in M. acetivorans due to the need for the nickel cofactor, hydroporphinoid nickel complex coenzyme F430, that is not synthesized in Escherichia coli. The E. coli-M. acetivorans shuttle vector pWM321 may be used to express Mcr from M. acetivorans using the constitutive M. acetivorans promoter Ptbp. To gauge higher Mcr activity (Step 1,
After M. acetivorans is transformed via a liposome-mediated protocol, the enzymatic screen may be conducted in a 96-well format in an anaerobic chamber, which keeps Mcr active and provides the proper concentration of methane to drive the reaction. M. acetivorans cells overexpressing the Mcr variants formed from DNA shuffling will be screened spectrophotometrically for HSCoB production in cell lysates by detecting the yellow color formed with 7NDCC. Cells containing Mcr variants may be grown in 300 μL of high-salt medium with shaking in Costar 96-well plates (Corning, Corning, N.Y.). The cells may be harvested at mid-log phase by filtering 200 μL of the cell cultures using MultiScreen-GV 96-well filter plates (Millipore, Bedford, Mass.). The collected cells may then be washed with 200 μL of 50 mM Tris-HCl, pH 7.4, suspended in 200 μL of 10 mM HEPES buffer, and sonicated using a 96-well plate sonicator (MISONIX Sonicator 4000). The cell lysates may be exposed to methane (0.5 atm) and heterodisulfide (1 mM); 7NDCC (25 μM) may then be added, and the appearance of HSCoB may be quantified spectrophotometrically at 405 nm using a 96-well plate reader (Multiscan RC, Labsystems, Helsinki, Finland). Lysates from wells with the greatest absorbance (largest amount of HSCoB) may be saved and re-checked in additional 96-well plates, and the plasmids isolated from the highest-expressing strains may be used for subsequent rounds of shuffling, saturation mutagenesis, and DNA sequencing. In this way, Mcr may be engineered to improve dramatically the rate limiting step (Step 1) in
De novo enzyme design. Along with protein engineering using combinatorial screening (both directed evolution and site-directed mutagenesis), de novo enzyme design computational tools may be used. An enzyme design method such as OptZyme may be used. OptZyme uses transition state analogue (TSA) compounds, which are known for many enzymatic reactions, as surrogates for the typically unknown transition state (TS) structure. Metrics were derived that correlate enzyme-TSA interaction energies with enzyme activity (i.e., kcat and kcat/KM). By identifying mutations that minimize the interaction energy of the enzyme with its TSA rather than its native substrate, lowering of the transition state energy barrier is presumably achieved. It is contemplated that integrated use of computations and combinatorial screening will help lead to improved enzymatic performance in the pathway. In addition, the M. acetivorans metabolic model iVS941 (705 reactions and 708 metabolites) may be used as a blueprint for exploring the impact of any proposed pathway construct on system-wide metabolism, including coupling of any proposed pathway with metabolic conversions already present in M. acetivorans (or in co-culture with other organisms) that have a negative free energy, to drive the pathway in the intended direction.
Reversal of Methanogenic Pathways and Extra-Cellular Electron Donation by Mcr.
In order to determine the reversibility of methanogenic pathways, experiments were initiated using artificial electron acceptors for the oxidation of methane by Mcr, with the longer-term goal of co-culturing to enhance the rate of methane oxidation and produce reductant for production of liquid fuels and value-added products. When whole cells of M. acetivorans grown with methanol were incubated with methane and methylene blue, the electron acceptor was reduced as evidenced by loss of color (
Next, methylene blue was replaced with 2-hydroxyphenazine, an analog of the physiological electron acceptor methanophenazine, and 13C-methane. NMR analyses of the assay mixture identified two methyl-containing compounds other than methane as products of the incubation, tentatively identified as acetate and methanol (
The results further document the feasibility of co-culturing with an electron accepting partner. The results also document reversibility of the pathway, demonstrating the efficacy of processes for converting methane to liquid biofuels and other value-added products by monocultures of M. acetivorans.
Engineering Reversal of the Acetate-to-Methane Pathway to Obtain Baseline Rates of Acetyl-CoA Production
Conversion of methane to liquid fuels and value-added products depends on three modifications of the pathway shown in
The system was reconstituted in vitro in order to identify the electron carriers and the Hdr leading to optimization in a genetically engineered strain that by-passes the membrane-bound system reversing electron transport independent of an ion gradient (
Reduction of a flavodoxin (FldI) coupled to oxidation of reduced Fd generated by the CO dehydrogenase/acetyl-CoA synthase (Cdh) was previously demonstrated in cell-free extract. Reduction of FldII is also coupled to oxidation of reduced Fd, as further shown by overexpression and purification of a second flavodoxin (FldII) (
Previous work has also presented the purification of two soluble electron carriers (polyferredoxin, PolyFd; and electron transport protein, Etp). Reconstitution of these proteins with a full complement of iron-sulfur clusters is demonstrated in
Characterization of Promoter Structure for Optimization and Regulation of Engineered Pathways
MreA is a significant global regulator of methane-producing pathways, and represents a target for optimization and regulation of engineered pathways reversing methanogenesis. The inventors have identified the MreA binding site in promoters in order to better design promoters that are most efficient in optimizing both expression and regulation of essential genes. Previous work using chromatin immunoprecipitation (ChIP) yielded results identifying a tentative binding site.
However, the genome encodes three MreA homologs with the potential to interact with MreA antisera, confounding specificity of binding. Accordingly, all three homologs were overexpressed and purified, and none was found to cross-react with MreA antisera (
Growth of Metabolically Engineered Methanogens
An acetate-grown and tetracycline-induced engineered M. acetivorans strain expressing Mcr was produced (C2A/pES1-MATmcr). The C2A strain was assessed initially for the ability to replicate. As determined by phase-contrast microscopy and hemocytometer counts, the cell number of the metabolically engineered strain, M. acetivorans C2A/pES1-MATmcr, increased ~10-fold after 30 days of incubation (
Increase in Total Protein
Using the bicinchoninic acid assay, a 450-fold increase in total protein was measured for M. acetivorans C2A/pES1-MATmcr cultures grown on methane and 0.1 mM Fe3+, in comparison to the negative control where growth was not observed (
Methane Consumption
Gas chromatography was used to measure the level of methane in the headspace of tubes showing growth and no growth. After 46 days, 51% of the methane was consumed by cells grown on methane and 0.1 mM FeCl3. Hundreds of crimp-sealed 28-mL tubes showed no methane consumption, and methane was only consumed in tubes containing the engineered strain (e.g., no methane was consumed by strains with empty plasmids or no plasmid). Therefore, the wild-type strain is unable to consume methane, and methane consumption is a robust assay.
Methane, Not Bicarbonate or Cysteine in the Growth Medium, Provides the Carbon Source for Growth
The composition of the growth medium was further examined to negate the possibility of cells growing on a carbon source other than methane (190 mM). Computational approaches have also been used to check feasibilities of cells using certain media components as carbon sources, or as terminal electron acceptors. Consistent with data from these computational approaches, the present experimental results support that other major carbon sources in the growth medium, bicarbonate (45 mM) and cysteine (3.2 mM), are not sufficient to support the growth of ANME-1 Mcr-producing M. acetivorans. Trace vitamins found in the medium are below 0.005 mM, so we do not consider these as major carbon sources. Furthermore, we have yet to observe any growth of M. acetivorans (with and without ANME-1 mcr plasmid) when the headspace of the tube was filled with N2 instead of CH4. Omission of bicarbonate was also found to reduce the pH of the growth medium, while omission of cysteine renders the growth medium less reductive. In both cases, no growth was seen, despite adding methanol (the preferred carbon source of M. acetivorans) to the growth medium. Therefore, bicarbonate and cysteine are required for growth of M. acetivorans, but they act mainly as a buffering agent and a reducing agent, respectively.
Product Identification
We identified several bio-products from growth on methane by M. acetivorans C2A/pES1-MATmcr. Gas chromatography and high performance liquid chromatography (HPLC) analyses were performed on methane-grown cultures from both liquid (culture medium) and gas (head-space) phases. Acetate (up to 6 mM), pyruvate (up to 10 μM), and hydrogen (up to 28 μM) were detected (see
To corroborate our results, pyruvate was also detected by liquid chromatography-electrospray ionization-tandem mass spectrometry (LC-ESI-MS) analysis of the culture supernatant of the engineered strain (M. acetivorans/pES1-MATmcr) and the wild-type M. acetivorans. In comparison to the wild-type M. acetivorans strain grown on methanol (the preferred substrate), the methane-grown, engineered strain secreted 40-fold more pyruvate. This indicates that pyruvate secretion is specific to the engineered strain growing on methane. Acetate was not detected by LC-ESI-MS, as the molecular weight of acetate is below the lower detection limit of LC-ESI-MS (~80 g/mol).
Methane Consumption by Samples Inoculated with Sludge
Growth selection experiments were begun by incubating activated sludge from anaerobic digesters of waste-water treatment plants as an alternative approach for converting methane into biofuels. After 16 days of incubation, growth was seen at various concentrations of SO42− (ranging from 0.01 to 100 mM) and with 10 mM Fe3+. Gas chromatography analysis of the methane remaining in the headspace of these cultures indicated that 9% to 69% of the methane had been consumed. PCR amplification of the 16S rRNA genes from the sludge samples indicated the presence of both archaea and bacteria. The dominant species appears to be an unculturable Methanomicrobiales archaeon based on preliminary sequencing analysis of the 16S rRNA genes. However, upon enriching the consortium present in the activated sludge (i.e., by subculturing into new growth medium), methane was only moderately consumed (up to 10%) after 41 days.
Computational Protein Engineering
Overview
Within the framework of producing liquid fuels from methane, computational protein engineering is used to improve the activity of rate-limiting enzymes along the methane-to-acetate pathway, namely methyl-coenzyme M reductase (Mcr). Factor F430 is a nickel-centered tetrapyrrole cofactor in Mcr. All known Mcrs from methanogenic archaea contain an unmodified F430. In contrast, an Mcr homolog from a consortium of methanotrophic archaea (ANME Mcr) contains a methylthiolated F430. This key discrepancy between cofactors may be responsible for the methanotrophic activity in ANME Mcr. Since the methylthio moiety lies at the exterior of the F430 macrocycle, this modification may be necessary to lock the cofactor into the proper geometry within the active site. Alteration of the cofactor specificity of ANME Mcr to the unmodified F430 was investigated, along with expression of this mutant protein in M. acetivorans, where the unmodified cofactor is accessible.
Algorithmic Inputs
Iterative Protein Redesign & Optimization (IPRO) [1] procedures were used to help identify mutations that improve ANME Mcr activity. IPRO relies on molecular statics calculations, and it was thus necessary to fully parameterize the system before any calculations were performed. The molecules that were parameterized include coenzyme M, coenzyme B, and the unmodified Ni(I) F430. Parameterization was largely performed using CGENFF version 2b8 [2, 3], which obtains parameters from a database of analogous molecules. The validity of the analogies made using CGENFF is quantified using penalty scores. Mulliken population analysis from quantum mechanics (QM) calculations on the Ni(I) unmodified F430 was used to estimate partial atomic charges for atoms with high CGENFF penalty scores. The optimized geometry of the Ni(I) unmodified cofactor was also used to adjust highly penalized equilibrium bond lengths and bond angles. All adjusted partial charges were a weighted combination of the CGENFF value and the QM value, where the weight depended on the penalty score (the higher the penalty, the higher the contribution from quantum mechanics).
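The penalty-weighted charge adjustment described above can be illustrated with a minimal sketch. The specification does not state the functional form of the weight, so the linear ramp and the `penalty_cap` value below are assumptions made purely for illustration, and the function name `blend_charge` is hypothetical.

```python
def blend_charge(cgenff_charge, qm_charge, penalty, penalty_cap=50.0):
    """Blend a CGENFF partial charge with a QM (Mulliken) partial charge.

    ASSUMPTION: the QM weight rises linearly with the CGENFF penalty score
    and saturates at `penalty_cap`. The source only states that a higher
    penalty gives a higher QM contribution; the exact form is not given.
    """
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    w_qm = min(penalty / penalty_cap, 1.0)  # 0 = pure CGENFF, 1 = pure QM
    return (1.0 - w_qm) * cgenff_charge + w_qm * qm_charge


# Illustrative values only (not taken from the specification).
low_penalty = blend_charge(0.20, 0.60, penalty=0.0)    # CGENFF value kept
high_penalty = blend_charge(0.20, 0.60, penalty=80.0)  # QM value used
```

Any monotonically increasing weight would satisfy the description; the linear form is only the simplest choice.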
In addition to small molecule parameterization, IPRO also requires design position selection. A design position is an amino acid that is permitted to mutate away from wild-type such that binding to a target molecule is improved. Mcrs from highly diverse methanogenic archaea display high sequence conservation (61-69% identity), and these Mcrs are also largely homologous with ANME Mcrs (~50% identity). Since the ANME Mcr sequence is largely conserved, destruction of catalytic activity by mutating an essential residue was to be avoided. Structure and sequence alignments were used to find residues that were highly conserved amongst methanogenic Mcrs but more diverse in ANME Mcrs.
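As a rough sketch of the sequence-based part of this screen, the following compares per-column conservation between two groups of pre-aligned sequences. The thresholds `min_meth` and `max_anme`, the function names, and the toy sequences are all hypothetical; the specification does not give numerical cutoffs, and the actual selection also used structure and distance from the cofactor.

```python
from collections import Counter


def column_conservation(aligned_seqs, col):
    """Fraction of sequences sharing the most common residue (gaps ignored)."""
    residues = [s[col] for s in aligned_seqs if s[col] != "-"]
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(residues)


def candidate_columns(methanogen_aln, anme_aln, min_meth=0.9, max_anme=0.6):
    """Columns conserved among methanogenic Mcrs but diverse among ANME Mcrs.

    ASSUMPTION: both alignments share one column numbering, and the two
    thresholds are illustrative values only.
    """
    n_cols = len(methanogen_aln[0])
    return [
        col
        for col in range(n_cols)
        if column_conservation(methanogen_aln, col) >= min_meth
        and column_conservation(anme_aln, col) <= max_anme
    ]


# Toy alignments (hypothetical sequences, not real Mcr data).
methanogen = ["MKVLA", "MKVLA", "MKVIA", "MKVLA"]
anme = ["MRVLA", "MQVLA", "MKVLA", "MHVLA"]
hits = candidate_columns(methanogen, anme)
```

In the toy data only the second column is fully conserved in the first group and variable in the second, so it is the only column returned.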
Structure alignments were performed by aligning the homologous regions of the α subunits of the ANME-1, Methanothermobacter marburgensis, Methanopyrus kandleri, and Methanosarcina barkeri Mcrs. Homologous regions were identified using BLAST and were then aligned by minimizing the root mean squared deviation (RMSD) between the backbones of the homologous regions. For consistency, all Mcrs were aligned to the coordinates corresponding to ANME-1 Mcr. Following the structure alignment protocol, the backbone atom RMSDs between the α subunits of ANME-1 Mcr and the Mcrs of M. marburgensis, M. kandleri, and M. barkeri were 1.12, 1.20, and 3.85 Å, respectively. The alignment of these α subunits can be seen in
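For reference, the RMSD figure quoted above is the quantity computed below for two equal-length lists of matched backbone atom coordinates. This sketch assumes the structures have already been superposed; it does not perform the RMSD-minimizing fit itself, and the function name and coordinates are illustrative only.

```python
import math


def backbone_rmsd(coords_a, coords_b):
    """RMSD (same units as the input, e.g. Å) between matched atom positions.

    ASSUMPTION: coords_a[i] and coords_b[i] are the same backbone atom in
    two structures that were superposed beforehand.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two toy atoms, each displaced by 1 unit along z.
rmsd = backbone_rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
```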
Sequence alignment was performed using Clustal-Omega version 1.2.1 on both Mcr and ANME Mcr sequences. P07962, P22948, A4PJ22, D3E050, P12971, P11558, O27232, Q49605, Q58256, P11559, P07961, Q6LWZ5 are the Uniprot sequence IDs for methanogenic Mcrs. The sequence IDs for the methanotrophic ANME Mcrs are Q6VUA6, Q64E03, Q64EA1, Q648C5, Q64D16, D1JBK4, Q6MZD1, Q64CB7, Q64AN3, Q64EF1, Q649Z5, and Q64DN6. Sequence ID D1JBK4 is the primary structure for the lone crystallized ANME Mcr (ANME-1 Mcr [5], see
Sequence alignment, distance from the active site, and the literature all contributed to the eventual choice of design positions. Based on the sequence alignment comparison, the nine residues closest to the modified portion of the cofactor were chosen. In addition to these nine residues, a tenth design position (V419) was selected because others have suggested that it accommodates the modified cofactor. The ten selected design positions were Q72, L77, M78, N90, P149, I154, H157, H414, V419, and C423.
IPRO was used to redesign ANME Mcr to tightly bind the unmodified cofactor. If the purpose of the methylthiolation is to lock the cofactor into its proper orientation, then we can introduce mutations to ANME Mcr to change cofactor specificity. The IPRO trajectory was run for 2000 iterations using the standard CHARMM energy terms without the use of solvation. We can neglect solvation here because the binding of CoB blocks the ability of water to reach the active site. Neglecting solvation decreases the computational time required for IPRO and also decreases the energetic variance between sequentially identical mutants. ANME Mcr was modeled by only using the two α subunits immediately adjacent to the unmodified cofactor. The β and γ subunits were removed since they are sufficiently distant from the active site that they would not affect nonbonded interactions within the active site. The IPRO runs additionally contained weak position restraints that limited how much atoms could move from the initial ANME-1 Mcr structure. The model furthermore contained two restraints that limited the distances of the atoms coordinated to the nickel of Factor F430. IPRO was adjusted to avoid tendencies to mutate to glycine (through incorporation of a softening term) and charged residues (by matching the electrostatic contribution to the total energy in Rosetta). The designs predicted to improve binding affinity for the unmodified cofactor are presented in Table 5. From Table 5, I154N, V419K, and C423N are all completely conserved. V419K seems to be a critical mutation suggested by IPRO as it was also conserved from another set of results (data not shown). The structure of V419K also suggests efficient packing around C172 (see
Genome-Scale Metabolic Modeling
Model Development
An extended genome-scale metabolic model for M. acetivorans, iMAC865, was constructed by adding reactions from a previously published model (iVS941) to a more recently published model, iMB745. The integrated model contains 865 genes, 827 reactions, and 712 metabolites. 117 gene-protein-reaction associations (GPRs) were found only in the iVS941 model, and 194 GPRs were found only in the iMB745 model. 277 GPRs were found in both models and were in agreement, while 38 reactions found in both models had differing gene associations. All 38 GPRs were revised and corrected using a list of 781 newly revised gene re-annotations (Ferry and co-workers, unpublished data) along with the KEGG database and relevant literature. Ten of these reactions were involved in the methanogenesis pathway. Thirteen reactions required elemental and charge rebalancing due to missing or wrong chemical formulae. Three of these reactions were in the methanogenesis pathway. iMAC865 more accurately predicts biomass yields on methanol and acetate as compared to iMB745, as shown in Table 6.
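The bookkeeping behind these counts (GPRs unique to one model, shared and in agreement, or shared but conflicting) can be sketched as a dictionary comparison. The reaction and gene identifiers below are hypothetical, and the comparison is a plain whitespace-insensitive string match, which is an assumption; it would not recognize logically equivalent rules that are written differently.

```python
def compare_gprs(model_a, model_b):
    """Classify reactions by how their GPR rules compare across two models.

    Each model is a dict mapping reaction ID -> GPR rule string.
    ASSUMPTION: two rules "agree" when they match after collapsing whitespace.
    """
    def normalize(rule):
        return " ".join(rule.split())

    shared = set(model_a) & set(model_b)
    return {
        "only_a": sorted(set(model_a) - set(model_b)),
        "only_b": sorted(set(model_b) - set(model_a)),
        "agree": sorted(
            r for r in shared if normalize(model_a[r]) == normalize(model_b[r])
        ),
        "differ": sorted(
            r for r in shared if normalize(model_a[r]) != normalize(model_b[r])
        ),
    }


# Hypothetical reaction and gene names for illustration.
model_a = {"rxn1": "geneA and geneB", "rxn2": "geneC", "rxn3": "geneD or geneE"}
model_b = {"rxn2": "geneC", "rxn3": "geneD", "rxn4": "geneF"}
summary = compare_gprs(model_a, model_b)
```

Reactions in the `differ` bucket correspond to the cases that needed manual curation against re-annotations and the literature.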
Incorporation of Substrate Specific Protein Measurements
Along with updates to the genome-scale model, the Maranas lab has incorporated regulatory interactions via a regulatory gene-protein-reaction (R-GPR) approach. This approach modifies certain GPRs and removes select reactions in order to align with the protein expression profiles for different substrates. M. acetivorans has been shown to have differential transcriptome and proteome profiles depending on the substrate it utilizes. This approach was used for the methanol/acetate protein expression dataset published by Li et al. The dataset contained quantitative protein levels for over 250 genes of M. acetivorans grown on acetate and methanol.
The ratio of protein abundance for acetate to methanol or methanol to acetate was calculated and any genes that had values below a cutoff of 0.25 were assigned to a set G. The reactions with GPRs that had genes within set G were then reevaluated assuming that all genes in set G were knocked out. Those reactions that would still be active had their GPRs modified to account for the loss of these genes. Those reactions that could not exist without the genes in set G were added to set R. The maximum number of reactions in set R that could be removed without dropping the biomass yield below the experimentally reported value was identified.
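The first two steps of this procedure (building set G from the abundance ratio, then re-evaluating each GPR with those genes knocked out) can be sketched as follows. GPRs are represented here as nested tuples such as `("and", "geneA", "geneB")`, which is a representation chosen for this illustration; the gene names, reaction names, and abundance values are hypothetical. The final step, choosing the largest subset of set R whose removal preserves the reported biomass yield, requires a flux balance model and is not shown.

```python
def low_abundance_genes(abundance_cond, abundance_ref, cutoff=0.25):
    """Set G: genes whose abundance ratio (condition / reference) is below cutoff."""
    return {
        gene
        for gene, level in abundance_cond.items()
        if abundance_ref.get(gene, 0) > 0 and level / abundance_ref[gene] < cutoff
    }


def gpr_active(gpr, knocked_out):
    """Evaluate a GPR with the knocked-out genes treated as absent.

    A GPR is a gene name (str) or a tuple ("and" | "or", term, term, ...).
    """
    if isinstance(gpr, str):
        return gpr not in knocked_out
    op, *terms = gpr
    results = [gpr_active(term, knocked_out) for term in terms]
    return all(results) if op == "and" else any(results)


# Hypothetical protein levels for growth on acetate and on methanol.
acetate = {"geneA": 1.0, "geneB": 8.0, "geneC": 5.0}
methanol = {"geneA": 10.0, "geneB": 8.0, "geneC": 4.0}
gprs = {
    "rxn1": ("and", "geneA", "geneB"),  # lost without geneA
    "rxn2": ("or", "geneA", "geneC"),   # survives via geneC
    "rxn3": "geneB",
}

set_g = low_abundance_genes(acetate, methanol)  # acetate-to-methanol ratio
set_r = sorted(rxn for rxn, gpr in gprs.items() if not gpr_active(gpr, set_g))
```

Reactions that remain active despite containing a gene from set G (here `rxn2`) are the ones whose GPRs would be modified rather than removed.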
The R-GPR approach suggested the removal of 34 out of a possible 47 reactions when the sole carbon source was acetate to keep the in silico yield at the in vivo yield. 45 additional reactions had their GPRs modified. When the carbon source was methanol, 5 out of 8 possible reactions were removed and 12 GPRs were modified. These changes led to an increase from 90% to 96.7% for the correct prediction of gene deletion mutants grown on acetate and methanol (
These R-GPR modifications may be implemented again on the M. acetivorans model when the final products from the reverse methanogenesis pathway are determined, as such products could trigger similar regulatory responses as when they are supplied as substrates.
The Reverse Methanogenesis Pathway in the Genome-Scale Model
The iMAC865 model was modified in order to test the hypothesis that pure cultures of M. acetivorans can oxidize methane through the reversal of the methanogenesis pathway (
Therefore various methods, compositions, bioengineered organisms, and systems have been disclosed relating to making biofuels, including developing bioengineered organisms, the resulting bioengineered organisms, method for identifying pathways, and other methods, compositions, and systems. Although specific examples have been provided throughout, numerous variations, options, and alternatives are contemplated including variations in the particular pathways used, the manner in which the pathways are identified, the type of intermediates produced, and the type of fuel produced. The present invention is not to be limited to the specific disclosure provided herein as numerous options, variations, and alternatives are contemplated.
All references provided herein are hereby incorporated by reference in their entirety.
This application claims priority to U.S. Provisional Patent Application No. 61/907,086, filed Nov. 21, 2013, hereby incorporated by reference in its entirety.
Number | Date | Country
---|---|---
61907086 | Nov 2013 | US