Nitrogenases are an ancient group of enzymes, existing approximately 3.2 billion years ago, before the evolution of oxygenic photosynthesis and subsequent widespread oxygenation (1, 2). Their essential function is reduction of dinitrogen gas into ammonia, contributing over half of the annual global nitrogen fixation required for the synthesis of nucleic and amino acids by all life on earth (3). Ancestors to nitrogenase in anaerobic prokaryotes also gave rise to distinct nitrogenase-like reductases for bacterial photosynthesis and archaeal methanogenesis cofactor metabolism (4, 5, 6, 7). These include the dark operative protochlorophyllide oxidoreductase (DPOR) and chlorophyllide α oxidoreductase (COR) of bacteriochlorophyll biosynthesis, and Ni2+-sirohydrochlorin a,c-diamide reductive cyclase for biosynthesis of the archaeal methyl coenzyme-M reductase cofactor F430 (4, 5, 6, 7). However, studies of the evolutionary history of nitrogen fixation revealed overlooked nitrogen fixation-like (NFL) sequences of entirely unknown function in the genomes of anaerobic bacteria. Some were surprisingly associated with sulfur metabolism and transport genes (8, 9). This suggested that certain members of the nitrogenase family potentially have a role in sulfur metabolism.
Previous production of ethylene gas (>1 μmol/h/g dry cell weight) was observed from photosynthetic Alphaproteobacteria such as Rhodospirillum rubrum and Rhodopseudomonas palustris when growing anaerobically under the low sulfate concentrations (<200 μM) commonly encountered in their freshwater and soil habitats (see
There is a clear need for methods of producing the industrial precursor compounds ethylene, ethane, and methane, and microorganisms for the same. In particular, known ethylene producing enzyme systems require oxygen (aminocyclopropanecarboxylate oxidase and 2-oxoglutarate dioxygenase), forming a flammable ethylene-oxygen gas mixture. In addition, methane and ethane when mixed with air are also explosive and flammable. Therefore, a microorganism and enzyme system to produce significant levels of ethylene, ethane, or methane in the absence of oxygen would have great utility.
The present disclosure provides non-naturally occurring microbial organisms which are capable of producing ethylene, ethane, methane, or combinations thereof.
In one aspect, a non-naturally occurring microbial organism is provided comprising a nucleic acid encoding one or more genes of a methylthio-alkane reductase complex and one or more genes of a methionine salvage pathway.
In another aspect, a non-naturally occurring microbial organism is provided, wherein the organism is an anaerobic organism which produces ethylene, ethane, and/or methane using a methylthio-alkane reductase complex and a methionine salvage pathway, and wherein the organism has been optimized for producing ethylene, ethane, and/or methane with one or more non-naturally occurring genes.
In another aspect, a method of producing ethylene, ethane, and/or methane is provided, the method comprising:
culturing a population of the non-naturally occurring microbial organism described herein in a culture medium comprising one or more carbon sources; and
recovering the ethylene, ethane, and/or methane.
A bioreactor is further provided comprising the non-naturally occurring microbial organism described herein.
A vector is also provided comprising: one or more exogenous nucleic acid molecules encoding one or more genes of a methylthio-alkane reductase complex and one or more genes of a methionine salvage pathway.
The details of one or more embodiments of the disclosure are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the disclosure will be apparent from the description and drawings, and from the claims.
FIG. 12 shows the NifH superfamily amino acid alignment. Pairwise alignments of NifH superfamily sequences are shown in the region of active site residues responsible for MgATP binding and hydrolysis and Fe4-S4 iron-sulfur cluster binding (*). The conserved arginine (▾) is the site of ADP-ribosylation, a post-translational modification that regulates nitrogenase activity in the bona fide nitrogenases. ADP-ribosylation performed by dinitrogenase reductase ADP-ribosyl transferase (DraT) in R. rubrum prevents association of NifH with NifDK. The modification is removed by dinitrogenase reductase activating glycohydrolase (DraG). Numbering is based on Azotobacter vinelandii NifH (Av) (9). For NflH, NfaH, and MarH, corresponding genes were located within 10 genes upstream or downstream from nflDK, nfaKD, and marDK, respectively, in each organism.
Like reference symbols in the various drawings indicate like elements.
Methane is used for the production of energy, hydrogen gas, synthesis gas, and methanol used in the manufacturing of various organic chemicals. Methane is the second most used energy source next to electricity. Ethylene is used in a variety of industrial processes, including the production of polyethylene for plastic bags, polystyrene for packaging and insulation, and ethylene oxide for detergents. In addition, ethylene may be converted to C5-C10 gasoline-like molecules. Ethylene is thus thought to be the most widely used chemical on earth (over 175 million tons in 2018) and the demands and market for this feedstock are steadily increasing, with nearly a $300 billion annual market. Thus, there is considerable interest in developing new and innovative ways to produce these key industrial precursor compounds (ethylene, ethane, methane) with bio-based methods as a potential way to supplement chemical-based processes.
For anaerobic ethylene production by microorganisms, the novel and widespread bacterial carbon and sulfur salvage pathway, the DHAP Shunt (
Disclosed herein is an exclusively anaerobic enzyme system and associated pathways that couple sulfur metabolism to ethylene and methane production in the purple non-sulfur alpha-proteobacteria Rhodospirillum rubrum, Rhodopseudomonas palustris, and Blastochloris viridis (
Disclosed herein are methods for the development of a potential industrially compatible process to biologically produce ethylene and methane in high yields. Disclosed herein is a method to fully characterize the anaerobic ethylene/ethane/methane producing enzyme system and determine how the genes are regulated at the molecular level. Computational modeling of the chemical reactions performed by the relevant enzymes is initiated to learn the mechanisms by which these enzymes catalyze the reactions involved in ethylene biosynthesis. In addition, since ethylene/ethane/methane synthesis from the respective precursor compound is an inducible process, further studies probe the molecular regulation of the genes involved during photosynthetic metabolism using a variety of “omics” tools. These biochemical and molecular studies are invaluable for optimizing ethylene/ethane/methane production and creating bacterial strains that over-produce ethylene/ethane/methane under controlled conditions.
Also disclosed herein is a method to maximize ethylene and methane production with different feedstocks; e.g., lignocellulose digests as well as inorganic carbon sources (
Further disclosed are metagenomics and bioinformatic/computational approaches to discover more effective enzymes of uncultured organisms from anaerobic environments. Analysis of existing genome and metagenome databases allows identification of potential gene sequences for ethylene/ethane/methane producing enzyme systems that have specific or enhanced catalytic properties. Such sequences, homologous to known genes, may then be screened for their effectiveness in catalyzing key reactions of ethylene/ethane/methane synthesis. This leverages over 4 billion years of evolution to obtain the most efficient enzymes. In addition, a functional genomics approach may be established to isolate relevant genes from the metagenome without previous knowledge of sequences; e.g., by complementing specific mutant host organisms with environmental DNA (68). These metagenomics approaches, plus a full battery of other synthetic biology and “omics” approaches, are utilized to optimize ethylene/ethane/methane formation.
The following description of the disclosure is provided as an enabling teaching of the disclosure in its best, currently known embodiments. Many modifications and other embodiments disclosed herein will come to mind to one skilled in the art to which the disclosed compositions and methods pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the disclosures are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. The skilled artisan will recognize many variants and adaptations of the aspects described herein. These variants and adaptations are intended to be included in the teachings of this disclosure and to be encompassed by the claims herein.
Any recited method can be carried out in the order of events recited or in any other order that is logically possible. That is, unless otherwise expressly stated, it is in no way intended that any method or aspect set forth herein be construed as requiring that its steps be performed in a specific order. Accordingly, where a method claim does not specifically state in the claims or descriptions that the steps are to be limited to a specific order, it is in no way intended that an order be inferred, in any respect. This holds for any possible non-express basis for interpretation, including matters of logic with respect to arrangement of steps or operational flow, plain meaning derived from grammatical organization or punctuation, or the number or type of aspects described in the specification.
All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided herein can be different from the actual publication dates, which can require independent confirmation.
It is also to be understood that the terminology used herein is for the purpose of describing particular aspects only and is not intended to be limiting. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosed compositions and methods belong. It can be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the specification and relevant art and should not be interpreted in an idealized or overly formal sense unless expressly defined herein.
Prior to describing the various aspects of the present disclosure, the following definitions are provided and should be used unless otherwise indicated. Additional terms may be defined elsewhere in the present disclosure.
As used herein, “comprising” is to be interpreted as specifying the presence of the stated features, integers, steps, or components as referred to, but does not preclude the presence or addition of one or more features, integers, steps, or components, or groups thereof. Moreover, each of the terms “by”, “comprising,” “comprises”, “comprised of,” “including,” “includes,” “included,” “involving,” “involves,” “involved,” and “such as” are used in their open, non-limiting sense and may be used interchangeably. Further, the term “comprising” is intended to include examples and aspects encompassed by the terms “consisting essentially of” and “consisting of.” Similarly, the term “consisting essentially of” is intended to include examples encompassed by the term “consisting of.”
As used in the specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise.
It should be noted that ratios, concentrations, amounts, and other numerical data can be expressed herein in a range format. It can be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint. It is also understood that there are a number of values disclosed herein, and that each value is also herein disclosed as “about” that particular value in addition to the value itself. For example, if the value “10” is disclosed, then “about 10” is also disclosed. Ranges can be expressed herein as from “about” one particular value, and/or to “about” another particular value. Similarly, when values are expressed as approximations, by use of the antecedent “about,” it can be understood that the particular value forms a further aspect. For example, if the value “about 10” is disclosed, then “10” is also disclosed.
When a range is expressed, a further aspect includes from the one particular value and/or to the other particular value. For example, where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure, e.g. the phrase “x to y” includes the range from ‘x’ to ‘y’ as well as the range greater than ‘x’ and less than ‘y’. The range can also be expressed as an upper limit, e.g. ‘about x, y, z, or less’ and should be interpreted to include the specific ranges of ‘about x’, ‘about y’, and ‘about z’ as well as the ranges of ‘less than x’, ‘less than y’, and ‘less than z’. Likewise, the phrase ‘about x, y, z, or greater’ should be interpreted to include the specific ranges of ‘about x’, ‘about y’, and ‘about z’ as well as the ranges of ‘greater than x’, ‘greater than y’, and ‘greater than z’. In addition, the phrase “about ‘x’ to ‘y’”, where ‘x’ and ‘y’ are numerical values, includes “about ‘x’ to about ‘y’”.
It is to be understood that such a range format is used for convenience and brevity, and thus, should be interpreted in a flexible manner to include not only the numerical values explicitly recited as the limits of the range, but also to include all the individual numerical values or sub-ranges encompassed within that range as if each numerical value and sub-range is explicitly recited. To illustrate, a numerical range of “about 0.1% to 5%” should be interpreted to include not only the explicitly recited values of about 0.1% to about 5%, but also include individual values (e.g., about 1%, about 2%, about 3%, and about 4%) and the sub-ranges (e.g., about 0.5% to about 1.1%; about 0.5% to about 2.4%; about 0.5% to about 3.2%, and about 0.5% to about 4.4%, and other possible sub-ranges) within the indicated range.
As used herein, the terms “about,” “approximate,” “at or about,” and “substantially” mean that the amount or value in question can be the exact value or a value that provides equivalent results or effects as recited in the claims or taught herein. That is, it is understood that amounts, sizes, formulations, parameters, and other quantities and characteristics are not and need not be exact, but may be approximate and/or larger or smaller, as desired, reflecting tolerances, conversion factors, rounding off, measurement error and the like, and other factors known to those of skill in the art such that equivalent results or effects are obtained. In some circumstances, the value that provides equivalent results or effects cannot be reasonably determined. In such cases, it is generally understood, as used herein, that “about” and “at or about” mean the nominal value indicated ±10% variation unless otherwise indicated or inferred. In general, an amount, size, formulation, parameter or other quantity or characteristic is “about,” “approximate,” or “at or about” whether or not expressly stated to be such. It is understood that where “about,” “approximate,” or “at or about” is used before a quantitative value, the parameter also includes the specific quantitative value itself, unless specifically stated otherwise.
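The default reading of “about” given above (the nominal value ±10%, where equivalence cannot otherwise be determined) can be illustrated with a short sketch. The function names and the `tolerance` parameter are illustrative assumptions and are not part of the disclosure.

```python
def about_bounds(nominal, tolerance=0.10):
    """Return the (low, high) interval read onto 'about <nominal>'.

    Assumes the default +/-10% variation; 'tolerance' is a hypothetical
    parameter for cases where a different variation is indicated.
    """
    delta = abs(nominal) * tolerance
    return nominal - delta, nominal + delta


def is_about(value, nominal, tolerance=0.10):
    """True if 'value' falls within 'about <nominal>' (limits included)."""
    low, high = about_bounds(nominal, tolerance)
    return low <= value <= high


if __name__ == "__main__":
    # "about 10" covers the exact value 10 as well as nearby values.
    print(is_about(10, 10), is_about(10.5, 10), is_about(11.5, 10))
```

In this sketch the exact nominal value always satisfies the test, consistent with the statement that the parameter also includes the specific quantitative value itself.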
The terms “culture”, “cultivate”, and “ferment” are used interchangeably and refer to the intentional growth, propagation, proliferation, and/or enablement of metabolism, catabolism, and/or anabolism of one or more cells (e.g. a microbial organism). The combination of both growth and propagation may be termed proliferation. Examples include production by an organism of ethylene, ethane, or methane. Culture does not refer to the growth or propagation of microorganisms in nature or otherwise without human intervention.
The term “growth” means an increase in cell size, total cellular contents, and/or cell mass or weight of a cell (e.g. a microbial organism).
A “growth media” or “growth medium” as used herein can be a solid, powder, or liquid mixture which comprises all or substantially all of the nutrients necessary to support the growth of microbial organisms; various nutrient compositions are preferably prepared when particular microbial species are being assayed. Amino acids, carbohydrates, minerals, vitamins and other elements known to those skilled in the art to be necessary for the growth of microbial organisms are provided in the medium. In one embodiment, the growth medium is liquid.
The term “propagation” refers to an increase in cell number via cell division.
The term “promoter” or “regulatory element” refers to a region or sequence determinant located upstream or downstream from the start of transcription that is involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. Promoters need not originate from the microbial organism used; for example, promoters derived from viruses or from other organisms can be used in the compositions or methods described herein.
A polynucleotide sequence is “heterologous” to a second polynucleotide sequence if it originates from a foreign species, or, if from the same species, is modified by human action from its original form. For example, a promoter operably linked to a heterologous coding sequence refers to a coding sequence from a species different from that from which the promoter was derived, or, if from the same species, a coding sequence which is different from naturally occurring allelic variants.
The term “recombinant” refers to a human manipulated nucleic acid (e.g. polynucleotide) or a copy or complement of a human manipulated nucleic acid (e.g. polynucleotide), or if in reference to a protein (i.e., a “recombinant protein”), a protein encoded by a recombinant nucleic acid (e.g. polynucleotide). In embodiments, a recombinant expression cassette comprising a promoter operably linked to a second nucleic acid (e.g. polynucleotide) may include a promoter that is heterologous to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation (e.g., by methods described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989) or Current Protocols in Molecular Biology Volumes 1-3, John Wiley & Sons, Inc. (1994-1998)). In another example, a recombinant expression cassette may comprise nucleic acids (e.g. polynucleotides) combined in such a way that the nucleic acids (e.g. polynucleotides) are extremely unlikely to be found in nature. For instance, human manipulated restriction sites or plasmid vector sequences may flank or separate the promoter from the second nucleic acid (e.g. polynucleotide).
“Nucleic acid” or “oligonucleotide” or “polynucleotide” or grammatical equivalents used herein means at least two nucleotides covalently linked together. The term “nucleic acid” includes single-, double-, or multiple-stranded DNA, RNA and analogs (derivatives) thereof. Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000, etc. In certain embodiments, the nucleic acids herein contain phosphodiester bonds. In other embodiments, nucleic acid analogs are included that may have alternate backbones. The term encompasses nucleic acids containing known analogues of natural nucleotides which have similar or improved binding properties, for the purposes desired, as the reference nucleic acid. A particular nucleic acid sequence also encompasses “splice variants.” Similarly, a particular protein encoded by a nucleic acid encompasses any protein encoded by a splice variant of that nucleic acid. “Splice variants,” as the name suggests, are products of alternative splicing of a gene. After transcription, an initial nucleic acid transcript may be spliced such that different (alternate) nucleic acid splice products encode different polypeptides. Mechanisms for the production of splice variants vary, but include alternate splicing of exons. Alternate polypeptides derived from the same nucleic acid by read-through transcription are also encompassed by this definition. Any products of a splicing reaction, including recombinant forms of the splice products, are included in this definition. An example of splice variants is discussed in Leicher, et al., J. Biol. Chem. 273 (52):35095-35101 (1998).
The term “expression cassette” refers to a nucleic acid construct, which when introduced into a host cell, results in transcription and/or translation of an RNA or polypeptide, respectively. In some embodiments, an expression cassette comprising a promoter operably linked to a second nucleic acid (e.g. polynucleotide) may include a promoter that is heterologous to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation (e.g., by methods described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., (1989) or Current Protocols in Molecular Biology Volumes 1-3, John Wiley & Sons, Inc. (1994-1998)). In some embodiments, an expression cassette comprising a terminator (or termination sequence) operably linked to a second nucleic acid (e.g. polynucleotide) may include a terminator that is heterologous to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation. In some embodiments, the expression cassette comprises a promoter operably linked to a second nucleic acid (e.g. polynucleotide) and a terminator operably linked to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation. In some embodiments, the expression cassette comprises an endogenous promoter. In some embodiments, the expression cassette comprises an endogenous terminator. In some embodiments, the expression cassette comprises a synthetic (or non-natural) promoter. In some embodiments, the expression cassette comprises a synthetic (or non-natural) terminator.
The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher identity over a specified region when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI web site or the like). Such sequences are then said to be “substantially identical.” This definition also refers to, or may be applied to, the complement of a test sequence. The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 10 amino acids or 20 nucleotides in length, or more preferably over a region that is 10-50 amino acids or 20-50 nucleotides in length. As used herein, percent (%) amino acid sequence identity is defined as the percentage of amino acids in a candidate sequence that are identical to the amino acids in a reference sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. Alignment for purposes of determining percent sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software.
Appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared can be determined by known methods.
For sequence comparisons, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Preferably, default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.
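As an illustration of the percent identity calculation described above, the following minimal sketch counts identical positions between a reference and a test sequence that have already been aligned (gaps written as `-`). The function name and the choice of denominator (all aligned columns that are not gaps in both sequences) are assumptions made for this example; alignment programs such as BLAST may report identity over a different denominator.

```python
def percent_identity(aligned_ref, aligned_test, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption for this sketch: the denominator is the number of aligned
    columns, ignoring columns where both sequences carry a gap.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (r, t)
        for r, t in zip(aligned_ref.upper(), aligned_test.upper())
        if not (r == gap and t == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for r, t in columns if r == t and r != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments; one gap was introduced
    # in the reference to maximize correspondence.
    print(round(percent_identity("MKV-LA", "MKVALA"), 1))
```

A pair of sequences could then be compared against a chosen threshold (for example, the "about 60% identity" level mentioned above) by testing the returned value.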
Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1997) Nuc. Acids Res. 25:3389-3402, and Altschul et al. (1990) J. Mol. Biol. 215:403-410, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al. (1990) J. Mol. Biol. 215:403-410). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4 and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of 50, expectation (E) of 10, M=5, N=−4, and a comparison of both strands.
The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01.
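The seed-and-extend idea described above can be sketched for nucleotide sequences as follows. This is a simplified illustration and not the BLAST implementation: seeds are restricted to exact word matches (the neighborhood threshold T is not modeled), extension is ungapped and only toward the right, and the function names, the default `x_drop` value, and the example sequences are assumptions. The reward M=5 and penalty N=−4 follow the BLASTN defaults stated above.

```python
def find_seeds(query, subject, word_len):
    """Return (query_pos, subject_pos) pairs where a word of length
    word_len occurs identically in both sequences (exact-match seeds)."""
    index = {}
    for s in range(len(subject) - word_len + 1):
        index.setdefault(subject[s:s + word_len], []).append(s)
    seeds = []
    for q in range(len(query) - word_len + 1):
        for s in index.get(query[q:q + word_len], []):
            seeds.append((q, s))
    return seeds


def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Ungapped rightward extension of one seed.

    Stops when the running score falls x_drop below its maximum, when it
    reaches zero or below, or when either sequence ends.
    Returns (best_score, length_of_best_segment).
    """
    score = match * word_len  # exact seed: every position matches
    best, best_len = score, word_len
    i = word_len
    while q_start + i < len(query) and s_start + i < len(subject):
        same = query[q_start + i] == subject[s_start + i]
        score += match if same else mismatch
        if score > best:
            best, best_len = score, i + 1
        if best - score >= x_drop or score <= 0:
            break
        i += 1
    return best, best_len


if __name__ == "__main__":
    q, s = "ACGTACGTACGT", "ACGTACGTTTTT"
    print(find_seeds("ACGTAC", "TTACGT", 4))
    print(extend_right(q, s, 0, 0, word_len=4))
```

A full search would also extend each seed to the left and keep the segment pairs whose scores pass the statistical test; those steps are omitted here for brevity.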
The phrase “codon optimized” as it refers to genes or coding regions of nucleic acid molecules for the transformation of various hosts, refers to the alteration of codons in the gene or coding regions of polynucleic acid molecules to reflect the typical codon usage of a selected organism without altering the polypeptide encoded by the DNA. Such optimization includes replacing at least one, or more than one, or a significant number, of codons with one or more codons that are more frequently used in the genes of that selected organism.
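A minimal sketch of codon optimization as defined above: each codon is replaced with a codon preferred by the selected organism for the same amino acid, and the encoded polypeptide is checked to be unchanged. The standard genetic code is used; the `preferred` table in the example is a hypothetical placeholder, not measured codon usage for any organism, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna, preferred):
    """Replace codons with the host-preferred synonymous codon.

    'preferred' maps one-letter amino acid codes to a codon; amino acids
    without an entry keep their original codon.
    """
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        new = preferred.get(aa, codon)
        if CODON_TABLE[new] != aa:
            raise ValueError(f"{new} does not encode {aa}")
        out.append(new)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical preference table, for illustration only.
    preferred = {"L": "CTG", "A": "GCC", "R": "CGC"}
    original = "ATGTTAGCAAGATAA"
    optimized = codon_optimize(original, preferred)
    print(optimized, translate(original) == translate(optimized))
```

Because every substitution is checked against the genetic code, the optimized sequence encodes the same polypeptide as the input, which is the defining constraint stated above.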
The phrase “selectively (or specifically) hybridizes to” refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence with a higher affinity, e.g., under more stringent conditions, than to other nucleotide sequences (e.g., total cellular or library DNA or RNA).
The phrase “stringent hybridization conditions” refers to conditions under which a probe will hybridize to its target subsequence, typically in a complex mixture of nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen, Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Probes, “Overview of principles of hybridization and the strategy of nucleic acid assays” (1993). Generally, stringent conditions are selected to be about 5-10° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal is at least two times background, preferably 10 times background hybridization. Exemplary stringent hybridization conditions can be as follows: 50% formamide, 5×SSC, and 1% SDS, incubating at 42° C., or, 5×SSC, 1% SDS, incubating at 65° C., with wash in 0.2×SSC, and 0.1% SDS at 65° C.
Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides which they encode are substantially identical. This occurs, for example, when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize under moderately stringent hybridization conditions. Exemplary “moderately stringent hybridization conditions” include a hybridization in a buffer of 40% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 1×SSC at 45° C. A positive hybridization is at least twice background. Those of ordinary skill will readily recognize that alternative hybridization and wash conditions can be utilized to provide conditions of similar stringency. Additional guidelines for determining hybridization parameters are provided in numerous references, e.g., Current Protocols in Molecular Biology, ed. Ausubel, et al. One of skill will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning and the like. Polypeptides which are “substantially similar” share sequences as noted above except that residue positions which are not identical may differ by conservative amino acid changes. Conservative amino acid substitutions refer to the interchangeability of residues having similar side chains.
For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. Exemplary conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, aspartic acid-glutamic acid, and asparagine-glutamine.
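As a non-limiting sketch, the exemplary conservative substitution groups listed above can be encoded as sets of standard one-letter amino acid codes and used to test whether a given residue change is conservative. The function name and the set-based representation are assumptions made for illustration; other groupings described herein could be substituted.

```python
# Exemplary conservative substitution groups from the text above,
# written with standard one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    {"V", "L", "I"},  # valine-leucine-isoleucine
    {"F", "Y"},       # phenylalanine-tyrosine
    {"K", "R"},       # lysine-arginine
    {"A", "V"},       # alanine-valine
    {"D", "E"},       # aspartic acid-glutamic acid
    {"N", "Q"},       # asparagine-glutamine
]


def is_conservative_substitution(original, replacement):
    """Return True if the two different residues share an exemplary group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        # An unchanged residue is an identity, not a substitution.
        return False
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    for pair in [("V", "L"), ("A", "V"), ("A", "L"), ("K", "D")]:
        print(pair, is_conservative_substitution(*pair))
```

Membership is tested per group rather than transitively, so alanine to leucine is not treated as conservative even though each pairs with valine in a listed group.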
The term “modulator” refers to a composition that increases or decreases the level of a target molecule or the level of activity or function of a target molecule or the physical state of the target of the molecule. In embodiments a modulator is a recombinant nucleic acid that is capable of increasing or decreasing the amount of a protein in a cell or the level of activity of a protein in a cell or transcription of a second nucleic acid in a cell. In embodiments, a modulator increases or decreases the level of activity of a protein or the amount of the protein in a cell. The term “modulate” is used in accordance with its plain and ordinary meaning and refers to the act of changing or varying one or more properties. “Modulation” refers to the process of changing or varying one or more properties. For example, as applied to the effects of a modulator on a target protein, to modulate means to change by increasing or decreasing a property or function of the target molecule or the amount of the target molecule. In embodiments, a recombinant nucleic acid that modulates the level of activity of a protein may increase the activity or amount of the protein relative to the absence of the recombinant nucleic acid. In embodiments, an increase in the activity or amount of a protein may include overexpression of the protein. “Overexpression” is used in accordance with its plain and ordinary meaning and refers to an increased level of expression of a protein relative to a control (e.g. cell or expression system not including a recombinant nucleic acid that contributes to the overexpression of a protein). In embodiments, a decrease in the activity or amount of a protein may include a mutation (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid; all/any of which may be in the coding region for a protein or in an operably linked region (e.g., promoter)) of the protein.
The term “increased” refers to a detectable increase compared to a control.
A nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. For example, DNA for a presequence or secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation. Generally, “operably linked” means that the DNA sequences being linked are near each other, and, in the case of a secretory leader, contiguous and in reading phase. However, operably linked nucleic acids (e.g. enhancers and coding sequences) do not have to be contiguous. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, the synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice. In embodiments, a promoter is operably linked with a coding sequence when it is capable of affecting (e.g. modulating relative to the absence of the promoter) the expression of a protein from that coding sequence (i.e., the coding sequence is under the transcriptional control of the promoter).
“Transformation” refers to the transfer of a nucleic acid molecule into a host organism (e.g. a microbial organism). In embodiments, the nucleic acid molecule may be a plasmid that replicates autonomously or it may integrate into the genome of the host organism (e.g. a microbial organism). Host organisms containing the transformed nucleic acid molecule may be referred to as “transgenic” or “recombinant” or “transformed” organisms (e.g. microbial organisms). A “genetically modified” organism (e.g. genetically modified microbial organism) is an organism (e.g. microbial organism) that includes a nucleic acid that has been modified by human intervention. Examples of a nucleic acid that has been modified by human intervention include, but are not limited to, insertions, deletions, mutations, expression nucleic acid constructs (e.g. over-expression or expression from a non-natural promoter or control sequence or an operably linked promoter and gene nucleic acid distinct from a naturally occurring promoter and gene nucleic acid in an organism), extra-chromosomal nucleic acids, and genomically contained modified nucleic acids. Genetically modified organisms may be made by rational modification of a nucleic acid or may be made by use of a mutagen or mutagenesis protocol that results in a mutation that was not identified (e.g. intended or targeted) prior to the use of the mutagen or mutagenesis protocol (e.g. UV exposure, EMS exposure, mutagen exposure, random genomic mutagenesis, transformation of a library of different nucleic acid constructs). Genetically modified organisms that include a modification (e.g. modification, insertion, deletion, mutation) not previously known or intended prior to making of the genetically modified organism may be identified through screening a plurality of organisms including one or more genetically modified organisms by using a selection criterion that identifies the genetically modified organism of interest.
In embodiments, a genetically modified organism includes a recombinant nucleic acid.
As used herein, the term “episome” or “episomally” is intended to refer to an extrachromosomal DNA moiety or plasmid that can replicate autonomously in a host cell when physically separated from the chromosomal DNA of the host cell.
Methods for synthesizing sequences and bringing sequences together are well established and known to those of skill in the art. For example, in vitro mutagenesis and selection, site-directed mutagenesis, error prone PCR (Melnikov et al., Nucleic Acids Research, 27 (4)1056-1062 (Feb. 15, 1999)), “gene shuffling” or other means can be employed to obtain mutations of naturally occurring genes.
The present disclosure provides non-naturally occurring microbial organisms which are capable of producing ethylene, ethane, methane, or combinations thereof. In some aspects, the microbial organism has been genetically modified with one or more genes directed to the production of ethylene, ethane, methane, or combinations thereof. In other aspects, the microbial organism may naturally produce ethylene, ethane, methane, or combinations thereof, but has been optimized for said production by the introduction of one or more non-naturally occurring genes.
Thus, in one aspect, a non-naturally occurring microbial organism is provided comprising a nucleic acid encoding one or more genes of a methylthio-alkane reductase complex and one or more genes of a methionine salvage pathway.
In some embodiments, the organism can produce ethylene, ethane, methane, or combinations thereof. In some embodiments, the organism produces ethylene. In some embodiments, the organism produces ethane. In some embodiments, the organism produces methane.
In another aspect, a non-naturally occurring microbial organism is provided, wherein the organism is an anaerobic organism which produces ethylene, ethane, and/or methane using a methylthio-alkane reductase complex and a methionine salvage pathway, and wherein the organism has been optimized for producing ethylene, ethane, and/or methane with one or more non-naturally occurring genes. In some embodiments, the one or more non-naturally occurring genes comprise one or more genes of a SAM hydrolase. In some embodiments, the one or more non-naturally occurring genes comprise one or more genes of a methanethiol methylase (mddA), a methionine gamma lyase (mgl), or combinations thereof.
In some embodiments, the one or more genes of a methylthio-alkane reductase complex may comprise marB, marH, marD, marK, or combinations thereof.
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise marB. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the nucleic acid sequence of SEQ ID NO: 1 (marB).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the nucleic acid sequence of SEQ ID NO: 1 (marB). In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence of SEQ ID NO: 1.
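For illustration only, a percent identity threshold such as those recited above could be checked as in the following Python sketch. It assumes the two sequences have already been aligned to equal length by a separate alignment tool, with "-" marking gaps, and it uses one common convention (matching columns divided by alignment length); the definition of identity given elsewhere in this disclosure governs. The function names and the short example sequences are hypothetical and are not SEQ ID NO: 1.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be pre-aligned strings of equal, non-zero length.
    Columns containing a gap ("-") never count as matches.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold_percent):
    """Return True if identity is at least the given percentage (e.g. 60 or 90)."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical 10-column alignment with a single mismatch.
    reference = "ATGCATGCAT"
    candidate = "ATGCATGCAA"
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate, 90))
```

The same check applies to aligned peptide sequences, since the comparison is made character by character.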
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the peptide sequence of SEQ ID NO: 2 (MarB).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the peptide sequence of SEQ ID NO: 2. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein of SEQ ID NO: 2. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise marH. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the nucleic acid sequence of SEQ ID NO: 3 (marH).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the nucleic acid sequence of SEQ ID NO: 3. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence of SEQ ID NO: 3.
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise one or more marH genes associated with an accession number found in Table 1 below:
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the peptide sequence of SEQ ID NO: 4 (MarH).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the peptide sequence of SEQ ID NO: 4. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein of SEQ ID NO: 4. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise marD. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the nucleic acid sequence of SEQ ID NO: 5 (marD).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the nucleic acid sequence of SEQ ID NO: 5. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence of SEQ ID NO: 5.
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise one or more marD genes associated with an accession number found in Table 2 below:
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the peptide sequence of SEQ ID NO: 6 (MarD).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the peptide sequence of SEQ ID NO: 6. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein of SEQ ID NO: 6. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise marK. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the nucleic acid sequence of SEQ ID NO: 7 (marK).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the nucleic acid sequence of SEQ ID NO: 7. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence of SEQ ID NO: 7.
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise one or more marK genes associated with an accession number found in Table 3 below:
Pararhodospirillum oryzae strain NBRC 107573 sequence093,
Rhodospirillum photometricum DSM 122 draft genome sequence
Rhodospirillum rubrum ATCC 11170 chromosome, complete
Rhodospirillum rubrum F11, complete genome
Rhodomicrobium udaipurense JA643 contig00206, whole genome
Phaeospirillum fulvum MGU-K5 contig00054, whole genome
Rhodoblastus sphagnicola strain DSM 16996 scaffold0018, whole
Rhodoblastus acidophilus strain DSM 137
Rhodoblastus acidophilus strain DSM 137, whole genome shotgun
Rhodoblastus acidophilus strain DSM 137 scaffold0022, whole
Rhodomicrobium sp. JA980
Pleomorphomonas carboxyditropha strain SVCO-16
Rhodomicrobium vannielii ATCC 17100, complete genome
Thermoanaerobacterium thermosaccharolyticum M0795, complete
Bacteroidales bacterium Barb6XT Barb6XT_contig_167, whole
Prevotella bryantii strain TC1-1 contig9, whole genome shotgun
Selenomonas ruminantium strain WCT3, whole genome shotgun
Phaeospirillum fulvum strain DSM 13234, whole genome shotgun
Clostridium coskatii strain PTA-10522 CLCOS_contig000056,
Clostridium coskatii strain PS02 scaffold19 1 86601, whole
Clostridium autoethanogenum strain H21-9 Contig_058, whole
Clostridium ljungdahlii DSM 13528 strain PETC
Clostridium drakei strain SLI contig_79, whole genome shotgun
Clostridium drakei strain SLI chromosome, complete genome
Clostridium scatologenes strain ATCC 25775, complete genome
Fibrobacter sp. UWT2, whole genome shotgun sequence
Fibrobacter sp. UWB8, whole genome shotgun sequence
Fibrobacter sp. UWB6 Ga0136278_108, whole genome shotgun
Fibrobacter sp. UWB15, whole genome shotgun sequence
Fibrobacter sp. UWB5 NODE_1, whole genome shotgun sequence
Fibrobacter sp. UWBI NODE_4, whole genome shotgun sequence
Fibrobacter sp. UWOV1, whole genome shotgun sequence
Fibrobacter sp. UWH4, whole genome shotgun sequence
Selenomonas bovis 8-14-1 T485DRAFT scaffold00002.2 C, whole
Rhodopseudomonas palustris strain 2.1.37 scaffold 36, whole
Rhodopseudomonas palustris strain DSM 126 scaffold0001, whole
Rhodopseudomonas palustris strain R1
Rhodopseudomonas sp. AAP120 AAP120_Contigs_11, whole
Fibrobacter succinogenes subsp. succinogenes S85, complete
Fibrobacter succinogenes subsp. succinogenes S85, complete
Blastochloris viridis genome assembly Blastochloris viridis genome,
Blastochloris viridis strain ATCC 19567, complete genome
Blastochloris viridis DNA, complete genome, strain: DSM 133
Clostridium autoethanogenum DSM 10061 seq4, whole genome
Clostridium autoethanogenum strain JA1-1
Ruminococcaceae bacterium HV4-5-B5C, whole genome shotgun
Clostridium bornimense replicon M2/40_rep1, complete genome,
Clostridium ljungdahlii strain ERI-2 scaffold7, whole genome
Clostridium chromiireducens strain DSM 23318
Rhodopseudomonas palustris strain R1
Rhodopseudomonas palustris strain DSM 126 scaffold0020, whole
Pleomorphomonas carboxyditropha strain SVCO-16
Pleomorphomonas sp. CF100 Ga0189743 114, whole genome
Pleomorphomonas koreensis DSM 23070
Roseiarcus fermentans strain DSM 24875 Ga0244512_102, whole
Ruminococcus flavefaciens strain XPD3002, whole genome shotgun
Clostridium beijerinckii HUN142 T483DRAFT scaffold00009.9 C,
Clostridium beijerinckii strain NRRL B-591 CLBKI_contig000007,
Clostridium beijerinckii strain 4J9 CLOSB_contig000013, whole
Clostridium beijerinckii ATCC 35702, complete genome
Clostridium beijerinckii NCIMB 8052, complete genome
Clostridium beijerinckii G117 Scaffold22, whole genome shotgun
Clostridium beijerinckii strain WB
Clostridium_beijerinckii_WB_contig15, whole genome shotgun
Clostridium beijerinckii strain DSM 791 CLBEI_contig000075,
Clostridium beijerinckii strain NBRC 109359 sequence070, whole
Clostridium beijerinckii strain BAS/B2 CLBEJ_contig000034,
Clostridium beijerinckii strain NCP 260 CLOBJ_contig000033,
Clostridium beijerinckii strain ATCC 39058 CBEIJ_contig000004,
Clostridium beijerinckii strain NCTC13035, whole genome shotgun
Clostridium beijerinckii strain BAS/B3/1/124, complete genome
Clostridium beijerinckii NRRL B-598 chromosome, complete
Clostridium beijerinckii strain NCIMB 14988, complete genome
Clostridium beijerinckii strain NRRL B-593 CLOBI_contig000172,
Clostridium beijerinckii strain NRRL B-528
Clostridium beijerinckii isolate C. beijerinckii DSM 6423 genome
Clostridium beijerinckii strain NRRL B-596 CLOBE_contig000006,
Clostridium sp. BL-8 CLOBL_contig000019, whole genome
Ruminococcus sp. HUN007
Siculibacillus lacustris strain SA-279 scaffold_6, whole genome
Pelosinus sp. UFO1, complete genome
Pectinatus cerevisiiphilus strain DSM 20467 Ga0244680_115,
Clostridium tyrobutyricum isolate MGYG-HGUT-00125, whole
Dendrosporobacter quercicolus strain DSM 1736, whole genome
Rhodopseudomonas palustris strain YSC3 chromosome, complete
Sporomusaceae bacterium strain FL31 scf_SPFL3102_001, whole
Sporomusaceae bacterium FL31 scf_SPFL3101_011, whole genome
Ruminiclostridium hungatei strain DSM 14427
Propionispora vibrioides strain DSM 13305, whole genome shotgun
Paenibacillus durus ATCC 35681, complete genome
Rhodopseudomonas palustris strain PS3 chromosome, complete
Sporomusa sp. KB1 SalpaDRAFT_Scaffold1.2, whole genome
Propionispora sp. 2/2-37, whole genome shotgun sequence
Clostridium pasteurianum strain W5 contig00122, whole genome
Clostridium sp. BNL 1100, complete genome
Paenibacillus stellifer strain DSM 14472, complete genome
Ruminiclostridium josui JCM 17888
Rhodopseudomonas palustris strain ELI 1980 Contig20, whole
Rhodopseudomonas palustris CGA009 complete genome
Rhodopseudomonas palustris TIE-1, complete genome
Clostridium chromiireducens strain C1 Scaffold1, whole genome
Rhodomicrobium sp. JA980
Clostridium tyrobutyricum strain Cirm BIA 2237 chromosome
Paenibacillus sabinae T27, complete genome
Clostridium pasteurianum DSM 525 = ATCC 6013 ctg1, whole
Clostridium ljungdahlii DSM 13528, complete genome
Clostridium autoethanogenum DSM 10061, complete genome
Clostridium autoethanogenum DSM 10061, complete genome
Clostridium pasteurianum strain M150B, complete genome
Clostridium pasteurianum DSM 525 = ATCC 6013, complete
Clostridium pasteurianum DSM 525 = ATCC 6013, complete
Clostridium pasteurianum DSM 525 = ATCC 6013, complete
Clostridium pasteurianum BCI, complete genome
Clostridium sp. DL-VIII chromosome, whole genome shotgun
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the peptide sequence of SEQ ID NO: 8 (MarK).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the peptide sequence of SEQ ID NO: 8. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein of SEQ ID NO: 8. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
The art is familiar with the methods and techniques used to identify other methylthio-alkane reductase genes and nucleotide sequences.
In some embodiments, the one or more genes of a methionine salvage pathway comprise one or more genes of a dihydroxyacetone phosphate (DHAP) shunt pathway. In some embodiments, the one or more genes of a DHAP shunt pathway comprise 5′-methylthioadenosine phosphorylase (mtnP), methylthioadenosine nucleosidase (mtnN), 5-methylthioribose kinase (mtnK), 5-methylthioribose-1-phosphate isomerase (mtnA), 5-methylthioribulose-1-phosphate aldolase (ald2), or combinations thereof.
In some embodiments, the one or more genes of a methionine salvage pathway comprise mtnP. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid). In some embodiments, the one or more genes of a methionine salvage pathway comprise an mtnP gene associated with an accession number found in Table 4 below:
The art is familiar with the methods and techniques used to identify other 5′-methylthioadenosine phosphorylase genes and nucleotide sequences.
In some embodiments, the one or more genes of a methionine salvage pathway comprise mtnK. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid). In some embodiments, the one or more genes of a methionine salvage pathway comprise an mtnK gene associated with an accession number found in Table 5 below:
The art is familiar with the methods and techniques used to identify other 5-methylthioribose kinase genes and nucleotide sequences.
In some embodiments, the one or more genes of a methionine salvage pathway comprise mtnA. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid). In some embodiments, the one or more genes of a methionine salvage pathway comprise an mtnA gene associated with an accession number found in Table 6 below:
The art is familiar with the methods and techniques used to identify other 5-methylthioribose-1-P isomerase genes and nucleotide sequences.
In some embodiments, the one or more genes of a methionine salvage pathway comprise ald2. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid). In some embodiments, the one or more genes of a methionine salvage pathway comprise an ald2 gene associated with an accession number found in Table 7 below:
The art is familiar with the methods and techniques used to identify other 5-methylthioribulose-1-P aldolase genes and nucleotide sequences.
In some embodiments, the nucleic acid may encode one or more genes of a SAM hydrolase. In some embodiments, the one or more genes of a SAM hydrolase may be a non-naturally occurring, or exogenous, gene. In some embodiments, the SAM hydrolase may be derived from a coliphage virus. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
The art is familiar with the methods and techniques used to identify other SAM hydrolase genes and nucleotide sequences.
In some embodiments, the nucleic acid may encode one or more genes of a methanethiol methylase (mddA), a methionine gamma lyase (mgl), or combinations thereof. In some embodiments, the one or more genes of mddA, mgl, or combinations thereof, may be a non-naturally occurring, or exogenous, gene. In some embodiments, the one or more genes of mddA and/or mgl are derived from Rhodopseudomonas palustris. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
The art is familiar with the methods and techniques used to identify other methanethiol methylase and/or methionine gamma lyase genes and nucleotide sequences.
In some embodiments, the nucleic acid may be codon optimized. In some embodiments, the one or more genes may be optionally and independently linked to a control element. In some embodiments, the control element comprises a promoter.
In another aspect, vectors are provided comprising one or more exogenous nucleic acid molecules encoding one or more genes of a methylthio-alkane reductase complex and one or more genes of a methionine salvage pathway. Vectors are also provided for use in the methods disclosed herein. For example, one or more of the vectors disclosed herein can be used to transform a microbial organism. Microbial organisms transformed with or comprising one or more of the vectors described herein are also described.
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex may comprise marB, marH, marD, marK, or combinations thereof.
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise marB. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the nucleic acid sequence of SEQ ID NO: 1 (marB).
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the nucleic acid sequence of SEQ ID NO: 1 (marB). In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence of SEQ ID NO: 1.
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the peptide sequence of SEQ ID NO: 2 (MarB).
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the peptide sequence of SEQ ID NO: 2. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein of SEQ ID NO: 2. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise marH. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the nucleic acid sequence of SEQ ID NO: 3 (marH).
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the nucleic acid sequence of SEQ ID NO: 3. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence of SEQ ID NO: 3.
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise one or more marH genes associated with an accession number found in Table 1.
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the peptide sequence of SEQ ID NO: 4 (MarH).
In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the peptide sequence of SEQ ID NO: 4. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein of SEQ ID NO: 4. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise marD. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the nucleic acid sequence of SEQ ID NO: 5 (marD).
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the nucleic acid sequence of SEQ ID NO: 5. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence of SEQ ID NO: 5.
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise one or more marD genes associated with an accession number found in Table 2.
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the peptide sequence of SEQ ID NO: 6 (MarD).
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the peptide sequence of SEQ ID NO: 6. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein of SEQ ID NO: 6. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise marK. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the nucleic acid sequence of SEQ ID NO: 7 (marK).
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the nucleic acid sequence of SEQ ID NO: 7. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a nucleic acid sequence of SEQ ID NO: 7.
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise one or more marK genes associated with an accession number found in Table 3.
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the peptide sequence of SEQ ID NO: 8 (MarK).
In some embodiments of the vectors described herein, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the peptide sequence of SEQ ID NO: 8. In some embodiments, the one or more genes of a methylthio-alkane reductase complex comprise a gene encoding a protein of SEQ ID NO: 8. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
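By way of non-limiting illustration, the percent identity thresholds recited above can be checked computationally. The sketch below is an assumption-laden example, not a method specified by this disclosure: it assumes the two sequences have already been aligned pairwise (with `-` marking gaps) and defines identity as the number of identical columns divided by the total number of alignment columns. The function names `percent_identity` and `meets_threshold` are hypothetical, and other identity definitions (for example, dividing by the shorter sequence length) would give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumptions (not taken from the disclosure): both strings come from the
    same pairwise alignment, '-' marks a gap, and identity is counted over
    all alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # Count columns where both residues are identical and not a gap.
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, minimum: float = 60.0) -> bool:
    """True if the aligned pair has at least `minimum` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= minimum
```

For example, a candidate sequence aligned against a reference sequence would satisfy an "at least 60%" embodiment when `meets_threshold(candidate, reference, 60.0)` returns `True`.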
In some embodiments of the vectors described herein, the one or more genes of a methionine salvage pathway comprise one or more genes of a dihydroxyacetone phosphate (DHAP) shunt pathway. In some embodiments, the one or more genes of a DHAP shunt pathway comprise 5′-methylthioadenosine phosphorylase (mtnP), 5-methylthioribose kinase (mtnK), 5-methylthioribose-1-phosphate isomerase (mtnA), 5-methylthioribulose-1-phosphate aldolase (ald2), or combinations thereof.
In some embodiments of the vectors described herein, the one or more genes of a methionine salvage pathway comprises mtnP. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid). In some embodiments, the one or more genes of a methionine salvage pathway comprises an mtnP gene associated with an accession number found in Table 4.
In some embodiments of the vectors described herein, the one or more genes of a methionine salvage pathway comprises mtnl. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
In some embodiments of the vectors described herein, the one or more genes of a methionine salvage pathway comprises mtnK. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid). In some embodiments, the one or more genes of a methionine salvage pathway comprises an mtnK gene associated with an accession number found in Table 5.
In some embodiments of the vectors described herein, the one or more genes of a methionine salvage pathway comprises mtnA. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid). In some embodiments, the one or more genes of a methionine salvage pathway comprises an mtnA gene associated with an accession number found in Table 6.
In some embodiments of the vectors described herein, the one or more genes of a methionine salvage pathway comprises ald2. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid). In some embodiments, the one or more genes of a methionine salvage pathway comprises an ald2 gene associated with an accession number found in Table 7.
In some embodiments of the vectors described herein, the exogenous nucleic acid molecules may further encode one or more genes of a SAM hydrolase. In some embodiments, the one or more genes of a SAM hydrolase may be a non-naturally occurring, or exogenous, gene. In some embodiments, the SAM hydrolase may be derived from a coliphage virus. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
In some embodiments of the vectors described herein, the exogenous nucleic acid molecules may encode one or more genes of a methanethiol methylase (mddA), a methionine gamma lyase (mgl), or combinations thereof. In some embodiments, the one or more genes of mddA, mgl, or combinations thereof, may be a non-naturally occurring, or exogenous, gene. In some embodiments, the one or more genes of mddA and/or mgl are derived from Rhodopseudomonas palustris. In some embodiments, the gene is a wildtype version of the gene or encodes a wildtype form of the associated protein. In some embodiments, the gene is a mutant form of the gene or may encode a mutant form of the associated protein (e.g. point mutant, loss of function mutation, missense mutation, deletion, or insertion of heterologous nucleic acid).
In some embodiments, the one or more exogenous nucleic acid molecules are integrated into a gene expression cassette. In some embodiments, the gene expression cassette comprises one or more control elements. In some embodiments, the one or more exogenous nucleic acid molecules disclosed herein are operably linked to a control element. In some embodiments, the control element is a promoter. In some embodiments, the promoter may be constitutively active or inducibly active. In some embodiments, the promoter is constitutively active regardless of sulfate concentration, i.e., sulfate limitation is not required in order to induce expression of the genes found in the one or more exogenous nucleic acid molecules.
In some embodiments, the promoter comprises a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the sequence of SEQ ID NO: 9:
In some embodiments, the promoter comprises a nucleic acid sequence having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the nucleic acid sequence of SEQ ID NO: 9. In some embodiments, the promoter comprises a nucleic acid sequence of SEQ ID NO: 9.
In some embodiments, the promoter comprises a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or more identity to the sequence of SEQ ID NO: 10:
In some embodiments, the promoter comprises a nucleic acid sequence having 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the nucleic acid sequence of SEQ ID NO: 10. In some embodiments, the promoter comprises a nucleic acid sequence of SEQ ID NO: 10.
In another aspect, a non-naturally occurring organism is provided comprising a vector described herein.
In another aspect, methods of producing ethylene, ethane, and/or methane are provided comprising:
culturing a population of the non-naturally occurring microbial organism described herein in a culture medium comprising one or more carbon sources; and
recovering the ethylene, ethane, and/or methane.
In some embodiments, the methods described herein may be used in the production of ethylene. In some embodiments, the methods described herein may be used in the production of ethane. In some embodiments, the methods described herein may be used in the production of methane.
The term “carbon source” means a carbon source that a microbial organism described herein will metabolize to derive energy (e.g. monosaccharides, oligosaccharides, polysaccharides, alkanes, fatty acids, esters of fatty acids, monoglycerides, acetate, carbon dioxide, methanol, formaldehyde, formate or carbon-containing amines). The term “carbon source” refers to a carbon containing composition (e.g. compound, mixture of compounds) that an organism may metabolize for use by the organism or that may be used for organism viability. A “majority carbon source” refers to a carbon containing composition that accounts for greater than 50% of the available carbon sources for an organism (e.g. in a media, in a growth media, in a defined media for the organism, or in a defined media for producing ethylene, ethane, and/or methane by an organism) at a specified time (e.g. media when starting a culture, media in a bioreactor when growing the organism, or media when producing ethylene, ethane, and/or methane from the organism). In embodiments, an organism may be cultured using a medium comprising a majority carbon source selected from the group consisting of glucose, glycerol, xylose, fructose, mannose, ribose, sucrose, and lignocellulosic biomass. In embodiments, an organism may be cultured using a medium comprising one or more carbon sources selected from the group consisting of glucose, fructose, sucrose, lactose, galactose, xylose, mannose, rhamnose, arabinose, glycerol, acetate, depolymerized sugar beet pulp, black liquor, corn starch, depolymerized cellulosic material, corn stover, sugar beet pulp, switchgrass, milk whey, molasses, potato, rice, sorghum, sugar cane, wheat, and mixtures thereof (e.g. mixtures of glycerol and glucose, mixtures of glucose and xylose, mixtures of fructose and glucose, mixtures of sucrose and depolymerized sugar beet pulp, black liquor, corn starch, depolymerized cellulosic material, corn stover, sugar beet pulp, switchgrass, milk whey, molasses, potato, rice, sorghum, sugar cane, and/or wheat). In some embodiments, an organism is cultured using a medium comprising one or more carbon sources selected from the group consisting of depolymerized sugar beet pulp, black liquor, corn starch, depolymerized cellulosic material, corn stover, sugar beet pulp, switchgrass, milk whey, molasses, potato, rice, sorghum, sugar cane, thick cane juice, sugar beet juice, and wheat. In some embodiments, an organism is cultured using a medium comprising lignocellulosic biomass. In some embodiments, carbon sources may be monosaccharides (e.g., glucose, fructose), disaccharides (e.g., lactose, sucrose), oligosaccharides, polysaccharides (e.g., starch, cellulose or mixtures thereof), sugar alcohols (e.g., glycerol) or mixtures from renewable feedstocks (e.g., cheese whey permeate, cornsteep liquor, sugar beet molasses, or barley malt). Additionally, carbon sources may include alkanes, fatty acids, esters of fatty acids, monoglycerides, diglycerides, triglycerides, phospholipids, various commercial sources of fatty acids including vegetable oils (e.g., soybean oil) or animal fats. In some embodiments, the culture medium may contain, in addition to the primary (or majority) carbon source, one or more secondary carbon sources. In some embodiments, the secondary carbon source comprises lignin or lignin derived aromatic compounds. In some embodiments, the secondary carbon source comprises lignin breakdown products.
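As a non-limiting illustration of the “majority carbon source” definition above (a composition accounting for greater than 50% of the available carbon sources at a specified time), the following sketch identifies the majority carbon source in a medium, if one exists. It assumes that each carbon source is quantified in the same unit (for example, grams of carbon per liter); the disclosure does not prescribe a unit or a calculation, and the function name `majority_carbon_source` is hypothetical.

```python
from typing import Dict, Optional


def majority_carbon_source(carbon_by_source: Dict[str, float]) -> Optional[str]:
    """Return the source supplying more than 50% of available carbon, if any.

    Assumption: every value is expressed in the same unit (for example,
    grams of carbon per liter of medium) and is measured at one time point.
    """
    total = sum(carbon_by_source.values())
    if total <= 0:
        return None
    for name, amount in carbon_by_source.items():
        # Strictly greater than half, matching "greater than 50%".
        if amount / total > 0.5:
            return name
    return None


# Hypothetical medium composition, for illustration only.
example_medium = {"glucose": 6.0, "glycerol": 3.0, "xylose": 1.0}
print(majority_carbon_source(example_medium))  # glucose supplies 60% of the carbon
```

A medium in which no single source exceeds 50% (for example, equal parts glucose and xylose) has no majority carbon source under this definition, and the function returns `None`.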
In some embodiments, the one or more carbon sources may comprise biomass, for example lignocellulosic biomass. The term “biomass” refers to material produced by growth and/or propagation of cells. “Lignocellulosic biomass” is used according to its plain and ordinary meaning and refers to plant dry matter comprising carbohydrate (e.g. cellulose or hemicellulose) and polymer (e.g. lignin). Lignocellulosic biomass may include agricultural residues (e.g. corn stover or sugarcane bagasse), energy crops (e.g. poplar trees, willow, Miscanthus purpureum, Pennisetum purpureum, elephant grass, maize, Sudan grass, millet, white sweet clover, rapeseed, giant miscanthus, switchgrass, jatropha, Miscanthus giganteus, or sugarcane), wood residues (e.g. sawmill or papermill discard), or municipal paper waste.
In some embodiments, the one or more carbon sources may be selected from one or more in combination of: carbon dioxide and carbon monoxide, mono and disaccharide sugars, organic acids (for example, malate, succinate, pyruvate, and fumarate), volatile fatty acids (for example, formate, acetate, propionate, and butyrate), alcohols (for example, ethanol and glycerol), and cellulosic plant biomass including but not limited to corn stover, miscanthus, switchgrass.
A “growth media” or “growth medium” as used herein can be a solid, powder, or liquid mixture which comprises all or substantially all of the nutrients necessary to support the growth of an organism; various nutrient compositions are preferably prepared when particular species are being assayed. Amino acids, carbohydrates, minerals, vitamins and other elements known to those skilled in the art to be necessary for the growth of microbial organisms are provided in the medium. In one embodiment, the growth medium is liquid. In one embodiment, the growth medium is a production medium (for example, medium optionally containing higher concentrations of glucose and/or altered concentrations of nitrogen).
In some embodiments, the growth media is sufficiently deficient in or absent of sulfate.
In another aspect, a bioreactor is provided comprising a non-naturally occurring organism as described herein. Such bioreactors may be used in the methods described herein.
Further embodiments of the present disclosure are provided as follows:
Embodiment 39: a vector comprising: one or more exogenous nucleic acid molecules encoding one or more genes of a methylthio-alkane reductase complex and one or more genes of a methionine salvage pathway.
A number of embodiments of the disclosure have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.
By way of non-limiting illustration, examples of certain embodiments of the present disclosure are given below.
The following examples are set forth below to illustrate the compositions, methods, and results according to the disclosed subject matter. These examples are not intended to be inclusive of all aspects of the subject matter disclosed herein, but rather to illustrate representative methods and results. These examples are not intended to exclude equivalents and variations of the present invention which are apparent to one skilled in the art.
Efforts have been made to ensure accuracy with respect to numbers (e.g., amounts, temperature, etc.), but some errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, temperature is in ° C. or is at ambient temperature, and pressure is at or near atmospheric.
R. rubrum was grown under conditions for ethylene induction (50 μM limiting sulfate or 1 mM MT-EtOH as sole S-source) and ethylene repression (1 mM sulfate) (
Several proteins previously identified as NFL sequences of unknown function (8,9) showed some of the highest increases in abundance under ethylene inducing conditions (
Other biologically relevant volatile organic sulfur compounds (VOSCs) were then tested for utilization by this putative nitrogenase-like enzyme system (
When all R. rubrum NFL genes were deleted (strain Δ0772:3/Δ0793:6) and specific gene combinations were re-introduced via expression from a plasmid, expression of marBHDK was necessary and sufficient to restore growth and hydrocarbon metabolism from VOSCs (
These results demonstrated the requirement of the MarBHDK nitrogenase-like system for the anaerobic assimilation of sulfur from common environmental VOSCs such as DMS and MT-EtOH in order to support growth and methionine metabolism. Moreover, these observations revealed a previously unknown mechanism for the bacterial production of methane and ethylene.
The link between VOSC utilization and methionine synthesis via the marBHDK gene products was characterized by feeding experiments with (2-[methyl-C14]thio)ethanol. This enabled detection of the methylthio-moiety of MT-EtOH. Upon feeding the wild type strain, MT-EtOH was consumed. Labeled methanethiol (C14H3—SH) and methionine (methyl-C14) were concomitantly produced and observed at low levels (˜2% of MT-EtOH concentration) until MT-EtOH was depleted (
SalR—Sulfur metabolism evidently is the primary function of these nitrogenase-like methylthio-alkane reductases, as opposed to nitrogen fixation by nitrogenase. R. rubrum possesses molybdenum nitrogenase (NifHDK), which is the default nitrogenase, and iron-only nitrogenase (AnfHDGK), which is synthesized in the absence of molybdenum (9). In in vivo activity assays, the R. rubrum molybdenum nitrogenase could not perform methylthio-alkane reduction, even under maximally inducing conditions, and vice versa (
The nitrogenase superfamily is composed of the bona fide nitrogenase sequences (groups I-III) and nitrogen fixation-like sequences (NFL; groups IV-VI) (
Nitrogenase functions via a coordinated transfer of electrons through a network of highly modified iron and sulfur metal clusters. The minimal molybdenum nitrogenase system requires gene products NifBHDKEN; the vanadium (Vnf) and iron (Anf) nitrogenases have similar requirements (8, 9). The NifH homodimer possesses a single Fe4—S4 cluster at the homodimer interface. The NifDK heterotetramer contains Fe8—S7 P-clusters coordinated at each of the two NifDK subunit interfaces, and each NifD subunit contains the characteristic catalytic FeMo-cofactor [Fe7—S9—C—Mo-homocitrate] (12). In the Vnf and Anf nitrogenase systems Mo is replaced with V or Fe, respectively. Initially, electrons are donated to the NifH Fe4—S4 cluster from a reducing agent such as a ferredoxin or flavodoxin (61). When NifH is in complex with NifDK, these electrons are transferred in an ATP binding and hydrolysis dependent manner to the P-cluster of NifDK. NifH also has roles in P-cluster assembly from two Fe4—S4 clusters on the apo-NifDK heterotetramer and synthesis of FeMo-cofactor when in complex with NifDK-like FeMo-cofactor assembly proteins, NifEN (12). P-cluster electrons are then passed to the FeMo-cofactor catalytic cluster and ultimately to FeMo-cofactor-bound dinitrogen for stepwise reduction to ammonia (17, 62).
MarH: MarH contains the same NifH conserved residues for MgATP hydrolysis and Fe4—S4 cluster coordination that enables transfer of electrons from the NifH Fe4—S4 cluster to the NifDK P-cluster (
MarDK: MarD and MarK each possess the triad of cysteines conserved in the molybdenum nitrogenase subunits NifD and NifK for P-cluster coordination (
MarB: NifB is a radical SAM enzyme responsible for carbide insertion and formation of the 8Fe—9S—C NifB-cofactor, the precursor to FeMo-cofactor (12). MarB possesses all of the identified motifs conserved across NifB enzymes associated with bona fide nitrogenases (
Together, this indicates that methylthio-alkane reductase proceeds via a mechanism similar to, but distinct from, that of nitrogenase to convert MT-EtOH to ethylene, ethylmethylsulfide to ethane, and dimethylsulfide to methane (17). Methane release from DMS by the methylthio-alkane reductases is separate and distinct from the other known non-archaeal methanogenic processes, including photosynthesis-linked methane production by cyanobacteria (18), methane release from methylphosphonates by marine bacteria (19), and direct reduction of carbon dioxide to methane by iron-only nitrogenase (AnfDHGK) (20). In waterlogged soils, strictly anaerobic microbial processes produce ethylene that can accumulate to levels inhibitory to plant root growth, causing crop damage (21, 22). Early attempts at identifying ethylene-producing organisms surprisingly isolated oxygen-dependent soil bacteria and fungi (23, 24). The organisms and methylthio-alkane reductases identified here function anaerobically and could contribute to this soil-ethylene paradox (10). This anaerobic ethylene process is distinct from the oxygen-dependent reactions catalyzed by aminocyclopropanecarboxylate oxidase and 2-oxoglutarate dioxygenase in plants, fungi, and certain bacteria.
The ethylene precursor, 5′-methylthioadenosine (MTA), is a routine byproduct of processes such as quorum sensing and polyamine production. Because these processes are highly regulated, native production of MTA is rate limiting for subsequent ethylene production. The coliphage SAM hydrolase (MTA-forming) is a viral enzyme that directly converts SAM to MTA (
The methane precursor, dimethylsulfide, is the most abundant organic sulfur compound in the environment. It is produced by marine bacteria from dimethylsulfoniopropionate and by terrestrial bacteria from methanethiol (71, 72). A non-natural methionine salvage pathway from Rhodopseudomonas palustris for the conversion of methionine to dimethylsulfide is constructed using methionine gamma lyase (mgl) and methanethiol methyltransferase (mddA) (
Fine chemicals: Dimethyl sulfide, methanethiol, L-methionine, 5′-methylthioadenosine, and S-methyl-L-cysteine were from Sigma; ethyl methyl sulfide, (2-methylthio)ethanol, (2-methylthio)acetate, and (3-methylthio)propanol were from Alfa Aesar. All media components were of ultrapure grade from Sigma or J. T. Baker. For targeted metabolite detection, (2-[methyl-C14]thio)ethanol was synthesized from [methyl-C14]-S-adenosylmethionine (Perkin Elmer). Labeled S-adenosylmethionine was acid hydrolyzed in 0.01 N H2SO4 under reflux at 100° C. for 30 min to form [methyl-C14]-5′-methylthioadenosine. (2-[methyl-C14]thio)ethanol was subsequently formed enzymatically in a reaction containing 50 mM potassium phosphate pH 7.8, 5 mM MgCl2, 0.2 mM NADH, 60 μM substrate, and 2 μM each of purified R. rubrum 5′-methylthioadenosine phosphorylase (10), Bacillus subtilis 5-methylthioribose-1-phosphate isomerase (29), E. coli 5-methylthioribulose-1-phosphate aldolase (25), and S. cerevisiae alcohol dehydrogenase (Sigma) at 30° C. for 2 h. Enzymes were synthesized and purified as previously described (10). Complete conversion was monitored by reverse phase HPLC with an inline scintillation detector as previously described (10), followed by enzyme removal via Amicon (Millipore) centrifugal concentration device.
Bacterial strains and growth conditions: R. rubrum ATCC 11170 wild type strain (SmR; NC_007643.1; American Type Culture Collection), Rru_A1998 deletion strain WR (ΔrlpA::GmR) in which the MTA-isoprenoid shunt is inactivated, and Rru_A1998/Rru_A0359 deletion strain WRdht (ΔrlpA::GmR/Δald2) in which the MTA-isoprenoid and DHAP shunts are inactivated were as previously described (10, 30). Rhodobacter capsulatus SB1003 (NC_014034.1, American Type Culture Collection) (31), Rhodopseudomonas palustris CGA010 (32), and Blastochloris viridis DSM133 (NZ_AP014854.2, University of Leibnitz DSMZ) (33) wild type strains were also as previously described. Rhodopseudomonas palustris CGA010 (Caroline Harwood, University of Washington) is a derivative of CGA009 (SmR; NC_005296.1, American Type Culture Collection) in which a frame shift mutation is corrected. Anaerobic growth of R. rubrum and R. capsulatus was performed in static anaerobic culture tubes and serum bottles at 30° C. with 2000 lux incandescent illumination. Cultures were composed of sulfur-free Ormerod's malate (30 mM) minimal medium supplemented with the indicated sulfur source under a 95:5 mixture of N2:H2 gaseous headspace as previously described (34, 35). Anaerobic growth of R. palustris was similarly performed by replacing malate with 0.5% (v/v) ethanol and 0.2% (w/v) sodium bicarbonate and adding 2 μg/ml para-aminobenzoic acid. All anaerobic manipulations were performed using an anaerobic chamber under 5% hydrogen and 95% nitrogen (Coy Laboratories).
Anaerobic growth of B. viridis was performed in anaerobic culture tubes continuously rotated on a rotisserie at 30° C. with 2000 lux incandescent illumination. Cultures were composed of a modified sulfur-free succinate medium 27 (N medium) (36) supplemented with the indicated sulfur source under an N2 gaseous headspace. Briefly, sulfur-free succinate medium 27 contained (per liter water) 0.3 g yeast extract, 1.0 g Na2-succinate, 0.5 g ammonium acetate, 5 mg Fe(III) citrate, 0.5 g KH2PO4, 0.33 g MgCl2.6H2O, 0.4 g NaCl, 0.4 g NH4Cl, 0.05 g CaCl2.2H2O, 0.4 ml of 0.1 g/L vitamin B12 solution, 0.5 ml of 1.0 g/L resazurin solution, and 1.0 ml of trace element solution [(per liter water) 0.075 g Zn-acetate, 0.03 g MnCl2.4H2O, 0.3 g H3BO3, 0.20 g CoCl2.6H2O, 0.01 g CuCl2.2H2O, 0.02 g NiCl2.6H2O, 0.03 g Na2MoO4.2H2O] at pH 6.8. Media was brought to a boil, dispensed and sealed in anaerobic culture tubes, sparged with N2 until anaerobic, autoclaved, cooled, supplemented with the appropriate sulfur source, and reduced with Tris-buffered titanium citrate pH 8.0 (1 mM final concentration) before inoculating.
Proteomics analysis: To optimize induction of ethylene production and, by inference, of the remaining steps of the pathway that metabolizes MT-EtOH to methionine, the growth of R. rubrum strain WR (ΔrlpA::GmR) was measured spectrophotometrically by optical density at 660 nm (O.D.660nm) and the specific rate of ethylene production (μmol/h/g dry cell weight) was independently measured by gas chromatography (see GC analysis below) at regular intervals for a given sulfate or MT-EtOH concentration (
Each cell pellet was lysed by 4% sodium deoxycholate in 100 mM ammonium bicarbonate with the application of sonication (20% amplitude, 10 s pulse, 10 s rest, 2 min total pulse time). Crude protein extract was precleared via centrifugation, reduced with 10 mM dithiothreitol, alkylated with 30 mM iodoacetamide, and then collected on top of a 10 kDa cutoff spin column filter (VIVASPIN 500, Sartorius). Collected proteins were digested to peptides with two sequential aliquots of sequencing-grade trypsin (Sigma) at a 1:75 enzyme:protein ratio (w/w), initially overnight at room temperature followed by an additional 3 h at room temperature. Peptides were collected by centrifugation and acidified to 1% formic acid followed by extraction with ethyl acetate to remove sodium deoxycholate. The peptide-containing aqueous phase was recovered and concentrated. Concentrated peptides were measured using the bicinchoninic acid assay (Pierce).
Each peptide mixture was analyzed on a two-dimensional liquid chromatography tandem mass spectrometry (2D-LC-MS/MS) platform using a Q Exactive Plus (QE+) mass spectrometer (Thermo Fisher Scientific) equipped with an Ultimate 3000 RS system (Thermo Fisher Scientific). 9 μg of each peptide sample was loaded via autosampler onto a triphasic pre-column (5 cm C18 reversed phase (RP), 5 cm strong cation exchange, and 5 cm C18 RP). Bound peptides were then washed and separated over three successive salt cuts of ammonium acetate (35 mM, 50 mM and 500 mM), each followed by an RP-LC elution via an in-house pulled nano-electrospray emitter (75 μm ID) packed with 30 cm of C18 RP. Mass spectra were acquired on QE+ in a data-dependent mode with full scan at 70K resolution, followed by HCD fragmentation of the top 15 most abundant ions at 15K resolution.
Acquired MS/MS spectra were matched with theoretical tryptic peptides generated from a concatenated Rhodospirillum rubrum proteome FASTA database with contaminants and decoy sequences using MyriMatch v. 2.2 (37). Peptide spectral matches were filtered to achieve peptide false-discovery rates (FDR) <1% and assembled to their respective proteins using IDPicker v. 3.1 (38). Peptide abundance intensities were derived in IDPicker by extracting precursor intensities from chromatograms with lower and upper retention time of 90 s and mass tolerance of 5 ppm. Protein abundances were calculated by summing the intensities of all identified peptides and normalizing by the respective protein lengths. Protein intensities were further log2 transformed and median centered using InfernoRDN version 1.1 (39), to approximate a normal distribution and reduce technical variance for further pairwise comparison. Student's t-test was then performed for every pair of conditions using the Perseus platform (40) for two different thresholds (Benjamini-Hochberg FDR adjusted p-value <0.05 and fold change >2, or Benjamini-Hochberg FDR adjusted p-value <0.01 and fold change >4; two-sided).
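The quantification arithmetic described above (summing peptide intensities, normalizing by protein length, log2 transformation, median centering, and Benjamini-Hochberg adjustment) can be sketched as follows. This is an illustrative outline only and does not reproduce IDPicker, InfernoRDN, or Perseus; the function names and the input layout are assumptions introduced for the example.

```python
import math
from statistics import median
from typing import Dict, List


def protein_abundance(peptide_intensities: List[float], protein_length: int) -> float:
    """Sum of peptide intensities divided by protein length (in residues)."""
    return sum(peptide_intensities) / protein_length


def log2_median_center(abundances: Dict[str, float]) -> Dict[str, float]:
    """Log2-transform positive abundances and subtract the sample median."""
    logged = {name: math.log2(value) for name, value in abundances.items() if value > 0}
    center = median(logged.values())
    return {name: value - center for name, value in logged.items()}


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

In this sketch, a protein would pass the less stringent threshold recited above when its adjusted p-value is below 0.05 and the absolute difference in centered log2 abundance between two conditions exceeds 1 (a fold change greater than 2).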
Transcriptomics analysis: R. rubrum strain WRdht (ΔrlpA/Δald2) and 0785::Tn5 (ΔrlpA/Δald2/0785::Tn5) were grown in triplicate (biological replicates) photoheterotrophically in anaerobic culture tubes containing 20 ml sulfur-free malate minimal medium supplemented with 50 μM (“Lo”) or 1000 μM (“Hi”) sulfate. When cells reached an O.D.660nm of 0.65-0.8, they were harvested and stabilized with RNAprotect reagent (Qiagen). RNA was isolated using the RNeasy protect kit (Qiagen) and quantified by UV absorbance. RNA-seq library construction and sequencing were performed at The Genomics and Microarray Shared Resource at University of Colorado Denver Cancer Center, Denver, CO, USA. Library preparation and rRNA depletion were performed using the Zymo-Seq RiboFree Total RNA Library Kit (Cat. No. R3000) with an input of 250 ng, and libraries were sequenced on the Illumina NovaSeq 6000 using 2×150 paired-end reads. Raw RNA-seq data were trimmed using sickle (github.com/najoshi/sickle) (41). Prior genomic sequencing of R. rubrum strain WRdht confirmed the rlpA and ald2 deletions and >99% nucleotide identity to the R. rubrum ATCC11170 genome. Mapping of transcriptomic reads to the reference was conducted using Bowtie2 (v2.3.5.1) with the options --very-sensitive and --score-min L,0,-0.1 (42). Differential expression analysis was performed using DESeq2 (v 1.22.2) (fitType=local, test=Wald) (43). Comparison of transcriptomes from the parent strain (WRdht) grown under 50 μM versus 1000 μM sulfate identified all genes that were transcriptionally regulated >1.5-fold in response to sulfate availability (two-sided Wald Chi-square test, BH-FDR adjusted p<0.002 as implemented by DESeq2 (43)). The corresponding comparison for the SalR deletion strain (0785::Tn5) indicated which of these genes were no longer regulated in response to sulfate availability.
Comparison of the SalR deletion strain to the parent strain under 1000 μM sulfate indicated which of these genes were potentially transcriptionally activated or repressed by SalR.
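The logic of the three comparisons can be sketched in Python as follows. This is an illustrative sketch, not the DESeq2 workflow itself: the input dictionaries stand in for DESeq2 result tables as (log2 fold change, adjusted p-value) pairs, the gene names and values are hypothetical, and applying the same 1.5-fold and p<0.002 cutoffs to all three comparisons is an assumption. The mapping from the sign of the mutant-versus-parent change to "repressed" or "activated" is likewise an interpretive assumption.

```python
import math

def is_regulated(result, min_fold=1.5, max_padj=0.002):
    """True if |fold change| exceeds min_fold at the adjusted p-value cutoff."""
    log2_fc, padj = result
    return padj < max_padj and abs(log2_fc) > math.log2(min_fold)

def classify_genes(parent_lo_vs_hi, mutant_lo_vs_hi, mutant_vs_parent_hi):
    """Classify sulfate-responsive genes by their behavior in the SalR mutant."""
    calls = {}
    for gene, result in parent_lo_vs_hi.items():
        if not is_regulated(result):
            continue  # not sulfate-responsive in the parent strain
        lost = not is_regulated(mutant_lo_vs_hi[gene])
        effect = "none"
        if is_regulated(mutant_vs_parent_hi[gene]):
            # Higher expression without SalR suggests repression by SalR (assumption).
            effect = "repressed" if mutant_vs_parent_hi[gene][0] > 0 else "activated"
        calls[gene] = {"regulation_lost_in_mutant": lost,
                       "salr_effect_at_high_sulfate": effect}
    return calls

# Hypothetical (log2 fold change, adjusted p-value) pairs, for illustration only.
parent = {"gene_1": (2.0, 1e-5), "gene_2": (0.2, 0.5)}
mutant = {"gene_1": (0.1, 0.8), "gene_2": (0.1, 0.9)}
mutant_vs_parent = {"gene_1": (1.8, 1e-4), "gene_2": (0.0, 1.0)}

calls = classify_genes(parent, mutant, mutant_vs_parent)
print(calls)
```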
Transposon mutagenesis: R. rubrum strain WRdht (ΔrlpA::GmR/Δald2) was randomly mutagenized using the efficient mini-Tn5 transposable element (44). R. rubrum was initially grown aerobically at 30° C. to late log phase in PYE liquid medium (3 g/L peptone, 3 g/L yeast extract, 266 mg/L MgSO4.7H2O, 75 mg/L CaCl2.2H2O, 11.8 mg/L FeSO4.7H2O, 20 mg/L ethylenediaminetetraacetic acid, 1 ml/L Ormerod's trace elements solution (31), 1 mg/L thiamine, 1 mg/L nicotinic acid, 15 μg/L biotin). Donor strain E. coli BW20767/pRL27 (Coli Genetic Stock Center, Yale) (44) was grown in lysogeny broth at 37° C. to mid exponential phase. Strains were separately centrifuged and washed three times with PYE medium, combined in a 1:2 ratio of E. coli to R. rubrum, concentrated, and spotted onto a 16% PYE agar plate. Biparental conjugation was carried out aerobically at 30° C. in the dark for no more than 24 h to ensure R. rubrum cells received no more than one Tn5 insertion per genome. R. rubrum transconjugants were selected on 16% PYE agar plates with 25 μg/ml kanamycin and 30 μg/ml gentamycin under the same growth conditions.
Transposon-insertion isolates of R. rubrum were individually picked into 96-well flat-bottom tissue culture plates containing 200 μl of sulfur-free Ormerod's malate minimal medium supplemented with 100 μM ammonium sulfate and 25 μg/ml kanamycin. Inoculated plates were incubated in an anaerobic chamber for 2 h, sealed with thermal adhesive film to prevent evaporation, and further sealed in thermal-seal bags (Kapak, ProAmpac) to maintain anaerobic conditions. Isolates were grown anaerobically at 30° C. under 2000 lux incandescent illumination to late log phase. Cultures were briefly exposed to air atmosphere, quickly transferred by 96-pin transfer device to new anaerobic 96-well plates containing 200 μl of anaerobic sulfur-free Ormerod's malate minimal medium supplemented with 1 mM ammonium sulfate or 1 mM MT-EtOH, and then incubated and sealed in an anaerobic chamber as before. Isolates were again grown anaerobically under illumination to screen for mutants incapable of growth on MT-EtOH but still able to grow on sulfate as sole S-source. 11,250 mutants were screened to ensure each gene received a transposon insertion at least once (
Gene deletion and complementation studies: Nonpolar gene cluster deletions of Rru_A1066-Rru_A1069, Rru_A0772-Rru_A0773, and Rru_A0793-Rru_A0796 in the R. rubrum wild type strain were performed by homologous recombination using previously described methods (10). Briefly, DNA fragments were amplified by PCR using primers listed in Table A below, digested with the indicated restriction enzyme following manufacturer's protocols (New England Biolabs), and ligated into pK18mobSacBgm (10) using T4 DNA ligase (New England Biolabs). Sequence verified plasmids were transformed into E. coli Stellar strain (TaKaRa Bio) and mobilized into R. rubrum wild type by triparental conjugation with helper strain E. coli JM109/pRK2013 (American Type Culture Collection) (45), similar to methods used for the transposon mutagenesis. Transconjugants were selected on 16% PYE agar plates with 25 μg/ml kanamycin and 50 μg/ml streptomycin under aerobic growth at 30° C. First and second homologous recombination events were selected by 10% (w/v) sucrose sensitivity and kanamycin resistance of the isolates, and second recombinants possessing the proper gene deletion were sequence verified.
Gene complementation of the R. rubrum NFL gene deletion strain Δ0772:3/Δ0793:6 was performed in trans by NFL genes expressed from complementation plasmid pMTAP (70). Genes were amplified by PCR using primers listed in Table A, digested with the indicated restriction enzyme, and ligated into pMTAP. Sequence verified plasmids were transformed into E. coli Stellar strain (Takara) and mobilized into R. rubrum by triparental conjugation with helper strain E. coli JM109/pRK2013. Transconjugants were selected on 16% PYE agar plates with 2 μg/ml tetracycline and 50 μg/ml streptomycin under aerobic growth at 30° C. Isolates were then tested for their ability to grow anaerobically with sulfate, MT-EtOH, or DMS as sole sulfur source. R. rubrum Δ0772:3/Δ0793:6 transconjugants with plasmids that complemented for growth on MT-EtOH and DMS were also quantified for restoration of ethylene and methane production by GC as described below.
Whole-cell VOSC utilization and gas production assays: Cells were initially grown aerobically in 150 ml serum bottles containing sulfur-free Ormerod's malate minimal medium supplemented with 50 μM ammonium sulfate (methylthio-alkane reductase inducing conditions) to mid log phase (O.D.660nm of 0.7-0.8). Cultures were washed anaerobically three times by centrifugation and resuspension in sulfur-free Ormerod's malate minimal medium. Cells were resuspended to a final O.D.660nm of ˜2.0 (higher cell densities suppressed methylthio-alkane reductase activity), dispensed in 20 ml aliquots in 60 ml serum vials, fed with 1 mM of DMS, EMS, or MT-EtOH, sealed, and incubated at 30° C. under 2000 lux incandescent illumination for 12 h. Methane, ethane, and ethylene gas produced were quantified by GC as described below.
Whole-cell nitrogenase and methylthio-alkane reductase specific rate assays: R. rubrum wild type and NFL gene deletion (Δ0772:3/Δ0793:6) strains were grown anaerobically under argon headspace to late log phase (O.D.660nm 0.9-1.1) in Ormerod's malate minimal medium with 15 mM ammonium chloride or sodium glutamate as sole N-source and 50 μM or 1 mM sodium sulfate as sole S-source. For whole-cell nitrogenase assays (46), 2 ml of culture was transferred via syringe to an anaerobic 7.5 ml serum vial flushed with argon. Assays were initiated by the addition of 0.06 atm acetylene and allowed to proceed for 10 min under 2000 lux illumination at 30° C. Assays were quenched with 100% (w/v) trichloroacetic acid (TCA) to 10% final and ethylene was quantified by GC as described below. Similarly, for whole-cell methylthio-alkane reductase assays, 4 ml of culture was transferred via syringe to an anaerobic 8 ml serum vial flushed with argon. Assays were initiated by the addition of EMS to 1 mM final concentration and allowed to proceed for 30 min under 2000 lux illumination at 30° C. Assays were quenched with TCA and ethane was quantified by GC.
GC analysis of hydrocarbons: Quantification of methane, ethane, and ethylene was performed using a Shimadzu GC-14A with a Restek Rt-Alumina BOND/Na2SO4 column. Gaseous culture headspace after feeding or growth experiments was injected (250-500 μl) at 180° C. and separated isothermally at 30° C. Eluted compounds were detected by flame ionization detector at 180° C. and identified based on the retention times of methane, ethane, and ethylene standards (Praxair). The total amount of each hydrocarbon present was calculated from the peak area as compared to standard concentration curves of the corresponding reference standard.
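The standard-curve quantification described above can be sketched in Python. This is a minimal illustration that assumes a linear detector response fitted by ordinary least squares; the function names, standard amounts, and peak areas are hypothetical and are not taken from the experiments.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def amount_from_area(area, slope, intercept):
    """Invert the standard curve to convert a peak area into an amount."""
    return (area - intercept) / slope

# Hypothetical standard curve: injected amount (nmol) versus FID peak area.
standard_amounts = [1.0, 2.0, 4.0, 8.0]
standard_areas = [100.0, 200.0, 400.0, 800.0]

slope, intercept = fit_line(standard_amounts, standard_areas)
sample_amount = amount_from_area(300.0, slope, intercept)
print(sample_amount)
```

A separate curve would be fitted for each hydrocarbon, since each compound is compared against its own reference standard.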
Targeted metabolomics: R. rubrum wild type and Rru_A0793-Rru_A0796 deletion strains were grown anaerobically to an O.D.660nm of 0.8 (mid log phase) in Ormerod's malate minimal medium supplemented with 50 μM ammonium sulfate to induce ethylene production. Cultures were washed anaerobically three times by centrifugation and resuspension in sulfur-free Ormerod's malate minimal medium. Cells were resuspended to a final O.D.660nm of ˜2.0 (higher concentrations repressed methylthio-alkane reductase activity), supplemented with 100 μM 5,5′-dithiobis-(2-nitrobenzoic acid) (Ellman's reagent for trapping free thiols), and sealed as 1 ml aliquots in 1.5 ml anaerobic serum vials. Cells were then fed with 10 μM MT-EtOH and 1 μM (2-[methyl-C14]thio)ethanol and incubated under 2000 lux incandescent light at 30° C. Metabolism was stopped by flash freezing in liquid nitrogen; cells were pelleted, media supernatant reserved, and the cell pellet was extracted with 80% acetonitrile+0.04N ammonium hydroxide with vortexing for 5 min followed by 20 min incubation at −20° C. Acetonitrile was removed by vacuum concentration, and the extracted metabolites were combined with the reserved media supernatant. Metabolites were separated by reverse phase HPLC and identified by inline scintillation detector based on retention time compared to reference standards as previously described for N=2 biological replicates (10).
Free-energy calculations: Standard free energies of formation and reaction were determined using electronic structure calculations with continuum solvent models. Specifically, density functional theory with the B3LYP (47, 48) exchange correlation functional was used with the 6-311++G(2d, 2p) basis set. The geometries were optimized and harmonic frequencies determined in a continuum model solvent using the COSMO self-consistent reaction field method (49). All calculations were performed with the NWChem computational chemistry package (50) using the EMSL Arrows interface (51). H2 was used as the electron donor in each redox reaction since the actual electron donor is not known. The relative difference in the reaction free energies will not change if, for example, ferredoxin or any other redox pair were used as the electron donor, since the electrochemical potential of the actual electron donor would be measured relative to the standard hydrogen electrode.
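The statement that the choice of electron donor does not affect the relative difference between reaction free energies can be illustrated with a short Python sketch. The species names and free energy values below are hypothetical placeholders, not computed results, and the sketch assumes a simplified bookkeeping in which both reactions consume the same number of donor equivalents.

```python
def reaction_free_energy(dgf, reactants, products):
    """Reaction free energy as sum(products) minus sum(reactants), weighted by stoichiometry."""
    total_products = sum(n * dgf[s] for s, n in products.items())
    total_reactants = sum(n * dgf[s] for s, n in reactants.items())
    return total_products - total_reactants

def difference_between_reactions(dgf):
    """Free energy difference between two reductions that use the same donor."""
    dg1 = reaction_free_energy(dgf, {"S1": 1, "donor_red": 1}, {"P1": 1, "donor_ox": 1})
    dg2 = reaction_free_energy(dgf, {"S2": 1, "donor_red": 1}, {"P2": 1, "donor_ox": 1})
    return dg1 - dg2

# Placeholder formation free energies (arbitrary units, illustration only).
species = {"S1": -10.0, "P1": -30.0, "S2": -5.0, "P2": -40.0}
with_donor_a = dict(species, donor_red=0.0, donor_ox=0.0)
with_donor_b = dict(species, donor_red=7.0, donor_ox=-3.0)

diff_a = difference_between_reactions(with_donor_a)
diff_b = difference_between_reactions(with_donor_b)
print(diff_a, diff_b)
```

Changing the donor shifts each individual reaction free energy by the same amount, so the difference between the two reactions is unchanged.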
Phylogenetics: The R. rubrum MarH, MarD, and MarK proteins were separately queried against the NCBI reference genome database using the translated nucleotide blast (tblastn) algorithm and filtered for protein subjects with e-value <1e-50. Each identified MarH, MarD, and MarK candidate was correlated with its reference genome, and only genomes that contained all three homologues on the same contig, with MarD and MarK adjacent, were retained. These candidates, along with recently discovered Group VI representatives from metagenome assembled genomes (28), were then appended to a reference nitrogenase (Groups I, II, III) and NFL sequence (Groups IV and V) database (9) with additional sequences identified from genomes in the JGI IMG/M database. Amino acid sequences were aligned using MAFFT (52) (v7.394) (--auto). Alignments were trimmed using TrimAl (53) (v1.4.rev22) (-gappyout). Maximum likelihood trees were constructed using IQ-TREE (54) (v1.6.8) (-alrt 1000 -bb 1000) using best-fit models (NifH: LG+R10; NifD: LG+R6) identified by ModelFinder (55) as implemented in IQ-TREE with ultrafast bootstrap (UFBoot) (56).
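The genome retention rule described above (all three homologues on one contig, with MarD and MarK adjacent) can be sketched in Python. The record layout is a hypothetical simplification: each hit is given as (genome, contig, family, gene_index), where gene_index is an assumed ordinal position of the gene along its contig, and adjacency is taken to mean consecutive positions. Genome and contig names are illustrative.

```python
from collections import defaultdict

def genomes_with_mar_cluster(hits):
    """Return genomes with MarH, MarD, and MarK on one contig and MarD next to MarK."""
    by_contig = defaultdict(lambda: defaultdict(list))
    for genome, contig, family, gene_index in hits:
        by_contig[(genome, contig)][family].append(gene_index)

    retained = set()
    for (genome, _contig), families in by_contig.items():
        if not all(f in families for f in ("MarH", "MarD", "MarK")):
            continue  # at least one homologue is missing from this contig
        if any(abs(d - k) == 1 for d in families["MarD"] for k in families["MarK"]):
            retained.add(genome)
    return retained

# Hypothetical tblastn hits that already passed the e-value filter.
hits = [
    ("genome_A", "contig1", "MarH", 10),
    ("genome_A", "contig1", "MarD", 11),
    ("genome_A", "contig1", "MarK", 12),
    ("genome_B", "contig1", "MarH", 5),   # MarH on a different contig
    ("genome_B", "contig2", "MarD", 3),
    ("genome_B", "contig2", "MarK", 4),
    ("genome_C", "contig1", "MarH", 1),
    ("genome_C", "contig1", "MarD", 2),
    ("genome_C", "contig1", "MarK", 9),   # MarD and MarK not adjacent
]

retained = genomes_with_mar_cluster(hits)
print(retained)
```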
Pairwise alignment of NifB, NifH, NifD, and NifK superfamily sequences for conserved active site residue analysis (
To identify organisms with native ethylene capacity (DHAP Shunt plus marBHDK genes,
1. E. E. Stueken, R. Buick, B. M. Guy, M. C. Koehler, Isotopic evidence for biological nitrogen fixation by molybdenum-nitrogenase from 3.2 Gyr. Nature. 520, 666-669 (2015).
2. M. C. Weiss, F. L. Sousa, N. Mrnjavac, S. Neukirchen, M. Roettger, S. Nelson-Sathi, W. F. Martin, The physiology and habitat of the last universal common ancestor. Nat. Microbiol. 1, 16116 (2016).
3. E. S. Boyd, J. W. Peters, New biological insights into the evolutionary history of biological nitrogen fixation. Front. Microbiol. 4, 201 (2013).
4. K. Zheng, P. D. Ngo, V. L. Owens, X. P. Yang, S. O. Mansoorabadi, The biosynthetic pathway of coenzyme F430 in methanogenic and methanotrophic archaea. Science. 354, 339-342 (2016).
5. S. J. Moore, S. T. Sowa, C. Schuchardt, E. Deery, A. D. Lawrence, J. V. Ramos, S. Billig, C. Birkemeyer, P. T. Chivers, M. J. Howard, S. E. Rigby, G. Layer, M. J. Warren, Elucidation of the biosynthesis of the methane catalyst coenzyme F430. Nature. 543, 78-82 (2017).
6. N. Muraki, J. Nomata, K. Ebata, T. Mizoguchi, T. Shiba, H. Tamiaki, G. Kurisu, Y. Fujita, X-ray crystal structure of the light-independent protochlorophyllide reductase. Nature. 465, 110-4 (2010).
7. J. Nomata, T. Mizoguchi, H. Tamiaki, Y. Fujita, A second nitrogenase-like enzyme for bacteriochlorophyll biosynthesis: reconstitution of chlorophyllide a reductase with purified X-protein (BchX) and YZ-protein (BchY-BchZ) from Rhodobacter capsulatus. J. Biol. Chem. 281, 15021-8 (2006).
8. P. C. Dos Santos, Z. Fang, S. W. Mason, J. C. Setubal, R. Dixon, Distribution of nitrogen fixation and nitrogenase-like sequences amongst microbial genomes. BMC Genomics. 13, 162 (2012).
9. J. Raymond, J. L. Siefert, C. R. Staples, R. E. Blankenship, The natural history of nitrogen fixation. Mol. Biol. Evol. 21, 541-54 (2004).
10. J. A. North, A. R. Miller, J. A. Wildenthal, S. J. Young, F. R. Tabita, Microbial pathway for anaerobic 5′-methylthioadenosine metabolism coupled to ethylene formation. Proc. Natl. Acad. Sci. U.S.A. 114, E10455-E10464 (2017).
11. N. Parveen, K. A. Cornell, Methylthioadenosine/S-adenosylhomocysteine nucleosidase, a critical enzyme for bacterial metabolism. Mol. Microbiol. 79, 7-20 (2011).
12. S. Burén, E. Jiménez-Vicente, C. Echavarri-Erasun, L. M. Rubio, Biosynthesis of nitrogenase cofactors. Chemical Reviews. doi: 10.1021/acs.chemrev.9b00489 (2020).
13. T. J. Erb, B. S. Evans, K. Cho, B. P. Warlick, J. Sriram, B. M. Wood, H. J. Imker, J. V. Sweedler, F. R. Tabita, J. A. Gerlt, A RuBisCO-like protein links SAM metabolism with isoprenoid biosynthesis. Nat. Chem. Biol. 8, 926-932 (2012).
14. Y. Zhang, E. L. Pohlmann, P. W. Ludden, G. P. Roberts, Mutagenesis and functional characterization of the glnB, glnA, and nifA genes from the photosynthetic bacterium Rhodospirillum rubrum. J. Bacteriol. 182, 983-92 (2000).
15. D. Sippel, O. Einsle, The structure of vanadium nitrogenase reveals an unusual bridging ligand. Nat. Chem. Biol. 13, 956-960 (2017).
16. L. M. Zhang, C. M. Morrison, J. T. Kaiser, D. C. Rees, Nitrogenase MoFe protein from Clostridium pasteurianum at 1.08 Å resolution: comparison with the Azotobacter vinelandii MoFe protein. Acta Crystallogr. D Biol. Crystallogr. 71, 274-282 (2015).
17. D. Sippel, M. Rohde, J. Netzer, C. Trncik, J. Gies, K. Grunau, I. Djurdjevic, L. Decamps, S. L. A. Andrade, O. Einsle, A bound reaction intermediate sheds light on the mechanism of nitrogenase. Science. 359, 1484-1489 (2018).
18. M. Bižić, T. Klintzsch, D. Ionescu, M. Y. Hindiyeh, M. Günthel, A. M. Muro-Pastor, W. Eckert, T. Urich, F. Keppler, H.-P. Grossart, Aquatic and terrestrial cyanobacteria produce methane. Sci. Adv. 6, eaax5343 (2020).
18. M. Bižić, T. Klintzsch, D. Ionescu, M. Y. Hindiyeh, M. Günthel, A. M. Muro-Pastor, W. Eckert, T. Urich, F. Keppler, H.-P. Grossart, Aquatic and terrestrial cyanobacteria produce methane. Sci. Adv. 6, eaax5343 (2020)
19. D. Repeta, S. Ferran, O. Sosa, C. G. Johnson, L. D. Repeta, M. Acker, E. F. DeLong, D. M. Karl, Marine methane paradox explained by bacterial degradation of dissolved organic matter. Nat. Geosci. 9, 884-887 (2016).
20. Y. Zheng, D. F. Harris, Z. Yu, Y. Fu, S. Poudel, R. N. Ledbetter, K. R. Fixen, Z. Y. Yang, E. S. Boyd, M. E. Lidstrom, L. C. Seefeldt, C. S. Harwood, A pathway for biological methane production using bacterial iron-only nitrogenase. Nat. Microbiol. 3, 281-286 (2018).
21. K. A. Smith, R. S. Russell, Occurrence of ethylene and its significance in anaerobic soils. Nature. 222, 769-771 (1969).
22. S. Manik, G. Pengilley, G. Dean, B. Field, S. Shabala, M. Zhou, Soil and crop management practices to minimize the impact of waterlogging on crop productivity. Front. Plant Sci. 10, 140 (2019).
23. J. M. Lynch, Identification of substrates and isolation of micro-organisms responsible for ethylene production in soil. Nature. 240, 45-46 (1972).
24. J. M. Lynch, Ethylene in soil. Nature. 256, 576-577 (1975).
25. J. A. North, J. A. Wildenthal, T. J. Erb, B. S. Evans, K. M. Byerly, J. A. Gerlt, F. R. Tabita, A bifunctional salvage pathway for two distinct S-adenosylmethionine byproducts that is widespread in bacteria, including pathogenic Escherichia coli. Mol. Microbiol. doi: 10.1111/mmi.14459 (2020).
26. G. A. W. Beaudoin, Q. Li, J. Folz, O. Fiehn, J. L. Goodsell, A. Angerhofer, S. D. Bruner, A. D. Hanson, Salvage of the 5-deoxyribose byproduct of radical SAM enzymes. Nat. Commun. 9, 3105 (2018).
27. H. Zheng, C. Dietrich, R. Radek, A. Brune, Endomicrobium proavitum, the first isolate of Endomicrobia class. nov. (phylum Elusimicrobia)—an ultramicrobacterium with an unusual cell cycle that fixes nitrogen with a Group IV nitrogenase. Environ. Microbiol. 18, 191-204 (2016).
28. R. Méheust, C. J. Castelle, P. B. M. Carnevali, I. F. Farag, C. He, L. X. Chen, Y. Amano, L. A. Hug, J. F. Banfield, Aquatic Elusimicrobia are metabolically diverse compared to gut microbiome Elusimicrobia and some have novel nitrogenase-like gene clusters. https://www.biorxiv.org/content/10.1101/765248v2 (2019).
29. H. J. Imker, A. A. Fedorov, E. V. Fedorov, S. C. Almo, J. A. Gerlt, Mechanistic diversity in the RuBisCO superfamily: the “enolase” in the methionine salvage pathway in Geobacillus kaustophilus. Biochemistry. 46, 4077-89 (2007).
30. J. Singh, F. R. Tabita, Roles of RubisCO and the RubisCO-like protein in 5-methylthioadenosine metabolism in the nonsulfur purple bacterium Rhodospirillum rubrum. J. Bacteriol. 192, 1324-31 (2010).
31. H. Strnad, A. Lapidus, J. Paces, P. Ulbrich, C. Vlcek, V. Paces, R. Haselkorn, Complete genome sequence of the photosynthetic purple nonsulfur bacterium Rhodobacter capsulatus SB 1003. J. Bacteriol. 192, 3545-6 (2010).
32. F. E. Rey, Y. Oda, C. S. Harwood, Regulation of uptake hydrogenase and effects of hydrogen utilization on gene expression in Rhodopseudomonas palustris. J. Bacteriol. 188, 6143-6152 (2006).
33. G. Drews, P. Giesbrecht, Rhodopseudomonas viridis, nov. spec., ein neu isoliertes, obligat phototrophes Bakterium. Archiv für Mikrobiol. 53, 255-262 (1966).
34. J. G. Ormerod, K. S. Ormerod, H. Gest, Light-dependent utilization of organic compounds and photoproduction of molecular hydrogen by photosynthetic bacteria; relationships with nitrogen metabolism. Arch. Biochem. Biophys. 94, 449-463 (1961).
35. S. Dey, J. A. North, J. Sriram, B. S. Evans, F. R. Tabita, In vivo studies in Rhodospirillum rubrum indicate that ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) catalyzes two obligatorily required and physiologically significant reactions for distinct carbon and sulfur metabolic pathways. J. Biol. Chem. 290, 30658-68 (2015).
36. D. P. Canniffe, D. A. Bryant, Engineered biosynthesis of bacteriochlorophyll b in Rhodobacter sphaeroides. Biochim. Biophys. Acta. 1837, 1611-6 (2014).
37. D. L. Tabb, C. G. Fernando, M. C. Chambers, MyriMatch: highly accurate tandem mass spectral peptide identification by multivariate hypergeometric analysis. J. Proteome Res. 6, 654-661 (2007).
38. Z. Q. Ma, S. Dasari, M. C. Chambers, M. D. Litton, S. M. Sobecki, L. J. Zimmerman, P. J. Halvey, B. Schilling, P. M. Drake, B. W. Gibson, D. L. Tabb, IDPicker 2.0: Improved protein assembly with high discrimination peptide identification filtering. J. Proteome Res. 8, 3872-3881 (2009).
39. T. Taverner, Y. V. Karpievitch, A. D. Polpitiya, J. N. Brown, A. R. Dabney, G. A. Anderson, R. D. Smith, DanteR: an extensible R-based tool for quantitative analysis of omics data. Bioinformatics. 28, 2404-2406 (2012).
40. S. Tyanova, T. Temu, P. Sinitcyn, A. Carlson, M. Y. Hein, T. Geiger, M. Mann, J. Cox, The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat. Methods. 13, 731-740 (2016).
41. N. A. Joshi, J. N. Fass, Sickle: A sliding-window, adaptive, quality-based trimming tool for FastQ files (Version 1.33) https://github.com/najoshi/sickle (2011).
42. B. Langmead, S. L. Salzberg, Fast gapped-read alignment with Bowtie 2. Nat. Methods. 9, 357-359 (2012).
43. M. I. Love, W. Huber, S. Anders, Moderated estimation of fold change and dispersion for RNA-Seq data with DESeq2. Genome Biol. 15, 550 (2014).
44. R. A. Larsen, M. M. Wilson, A. M. Guss, W. W. Metcalf, Genetic analysis of pigment biosynthesis in Xanthobacter autotrophicus Py2 using a new, highly efficient transposon mutagenesis system that is functional in a wide variety of bacteria. Arch. Microbiol. 178, 193-201 (2002).
45. D. H. Figurski, D. R. Helinski, Replication of an origin-containing derivative of plasmid RK2 dependent on a plasmid function provided in trans. Proc. Natl. Acad. Sci. U.S.A. 76, 1648-1652 (1979).
46. R. W. F. Hardy, R. D. Holsten, E. K. Jackson, R. C. Burns, The acetylene reduction assay for N2 fixation: laboratory and field evaluation. Plant Physiol. 43, 1185-1207 (1968).
47. C. T. Lee, W. T. Yang, R. G. Parr, Development of the Colle-Salvetti correlation-energy formula into a functional of the electron-density. Phys. Rev. B. 37, 785-789 (1988).
48. A. D. Becke, Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys. 98, 5648-5652 (1993).
49. A. Klamt, G. Schuurmann, Cosmo—a new approach to dielectric screening in solvents with explicit expressions for the screening energy and its gradient. J. Chem. Soc., Perkin Trans. 2. 1993, 799-805 (1993).
50. M. Valiev, E. J. Bylaska, N. Govind, K. Kowalski, T. P. Straatsma, H. J. J. Van Dam, D. Wang, J. Nieplocha, E. Apra, T. L. Windus, W. A. de Jong, NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations. Comput. Phys. Commun. 181, 1477-1489 (2010).
51. E. J. Bylaska, EMSL Arrows. https://arrows.emsl.pnnl.gov/api/ (2020).
52. K. Katoh, D. M. Standley, MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772-780 (2013).
53. S. Capella-Gutierrez, J. M. Silla-Martinez, T. Gabaldon, TrimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 25, 1972-1973 (2009).
54. L. T. Nguyen, H. A. Schmidt, A. von Haeseler, B. Q. Minh, IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268-274 (2015).
55. S. Kalyaanamoorthy, B. Q. Minh, T. K. F. Wong, A. von Haeseler, L. S. Jermiin, ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods. 14, 587-589 (2017).
56. D. T. Hoang, O. Chernomor, A. von Haeseler, B. Q. Minh, L. S. Vinh, UFBoot2: Improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518-522 (2018).
57. F. Madeira, Y. M. Park, J. Lee, N. Buso, T. Gur, N. Madhusoodanan, P. Basutkar, A. R. N. Tivey, S. C. Potter, R. D. Finn, R. Lopez, The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids. Res. 47, W636-W641 (2019).
58. A. M. Waterhouse, J. B. Procter, D. M. A. Martin, M. Clamp, G. J. Barton, Jalview Version 2—a multiple sequence alignment editor and analysis workbench. Bioinformatics. 25, 1189-1191 (2009).
59. D. Wilkins, gggenes: Draw Gene Arrow Maps in ‘ggplot2’. R package version 0.4.0. https://wilkox.org/gggenes (2019).
60. P.-A. Chaumeil, A. J. Mussig, P. Hugenholtz, D. H. Parks, GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics 36, 1925-1927 (2019).
61. S. Poudel, D. R. Colman, K. R. Fixen, R. N. Ledbetter, Y. Zheng, N. Pence, L. C. Seefeldt, J. W. Peters, C. S. Harwood, E. S. Boyd, Electron Transfer to Nitrogenase in Different Genomic and Metabolic Backgrounds. J. Bacteriol. 200, e00757-17 (2018).
62. B. M. Hoffman, D. Lukoyanov, Z.-Y. Yang, D. R. Dean, L. C. Seefeldt, Mechanism of nitrogen fixation by nitrogenase: the next stage. Chem. Rev. 114, 4041-62 (2014).
63. J. Oetjen, B. Reinhold-Hurek, Characterization of the DraT/DraG system for posttranslational regulation of nitrogenase in the endophytic betaproteobacterium Azoarcus sp. Strain BH72. J. Bacteriol. 191, 3726-3735 (2009).
64. M. J. Bröcker, S. Virus, S. Ganskow, P. Heathcote, D. W. Heinz, W. D. Schubert, D. Jahn, J. Moser, ATP-driven reduction by dark-operative protochlorophyllide oxidoreductase from Chlorobium tepidum mechanistically resembles nitrogenase catalysis. J. Biol. Chem. 283, 10559-67 (2008).
65. S. J. Moore, S. T. Sowa, C. Schuchardt, E. Deery, A. D. Lawrence, J. Vazquez Ramos, S. Billig, C. Birkemeyer, P. T. Chivers, M. J. Howard, S. E. J. Rigby, G. Layer, M. J. Warren, Elucidation of the biosynthesis of the methane catalyst coenzyme F430. Nature. 543, 78-82 (2017).
66. Y. Hu, J. M. Yoshizawa, A. W. Fay, C. Chung Lee, J. A. Wiig, M. W. Ribbe, Catalytic activities of NifEN: Implications for nitrogenase evolution and mechanism. Proc. Natl. Acad. Sci. U.S.A. 106, 16962-16966 (2009).
67. A. R. Miller, J. A. North, J. A. Wildenthal, F. R. Tabita, Two distinct aerobic methionine salvage pathways generate volatile methanethiol in Rhodopseudomonas palustris. mBio. 9, e00407-18 (2018).
68. V. A. Varaljay, S. Satagopan, J. A. North, B. Witte, M. N. Dourado, K. Anantharaman, M. A. Arbing, S. Hoeft McCann, R. S. Oremland, J. F. Banfield, K. C. Wrighton, F. R. Tabita, Functional metagenomic selection of RubisCO from uncultivated bacteria. Environ. Microbiol. 18, 1187-1199 (2016).
69. J. J. Hultqvist, O. Warsi, A. Söderholm, M. Knopp, U. Eckhard, E. Vorontsov, M. Selmer, D. A. Andersson, A bacteriophage enzyme induces bacterial metabolic perturbation that confers a novel promiscuous function. Nat. Ecol. Evol. 2, 1321-1330 (2018).
70. J. A. Hughes, In vivo hydrolysis of S-adenosyl-L-methionine in Escherichia coli increases export of 5-methylthioribose. Can. J. Microbiol. 52, 599-602 (2006).
71. A. R. J. Curson, J. D. Todd, M. J. Sullivan, A. W. B. Johnston, Catabolism of dimethylsulphoniopropionate: microorganisms, enzymes and genes. Nat. Rev. Microbiol. 9, 849-859 (2011).
72. O. Carrión, A. Curson, D. Kumaresan, Y. Fu, A. S. Lang, E. Mercadé, J. D. Todd, A novel pathway producing dimethylsulphide in bacteria is widespread in soil environments. Nat. Commun. 6, 6579 (2015).
It will be apparent to those skilled in the art that various modifications and variations can be made in the present disclosure without departing from the scope or spirit of the invention. Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the methods disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
This application claims the benefit of priority to U.S. Provisional Application No. 63/165,904, filed Mar. 25, 2021, the disclosure of which is incorporated herein by reference in its entirety.
This invention was made with government support under Grant No. DE-SC0019338 awarded by the U.S. Department of Energy. The Government has certain rights in the invention.
Number | Date | Country |
---|---|---|
63165904 | Mar 2021 | US |
 | Number | Date | Country |
---|---|---|---|
Parent | PCT/US2022/021905 | Mar 2022 | US |
Child | 18473637 | | US |