The invention relates generally to the field of industrial microbiology. The invention relates to recombinant host cells comprising (i) a modification in an endogenous gene encoding a polypeptide that converts pyruvate to acetyl-CoA, acetaldehyde or acetyl-phosphate and (ii) a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity. The invention also relates to recombinant host cells comprising (i) a modification in an endogenous gene encoding a polypeptide having pyruvate decarboxylase (PDC) activity, or a modification in an endogenous polypeptide having PDC activity, and (ii) a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity. The invention also relates to recombinant host cells further comprising (iii) a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity. Additionally, the invention relates to methods of making and using such recombinant host cells including, for example, methods of increasing cell growth, methods of reducing or eliminating the requirement of an exogenous carbon substrate for cell growth, methods of increasing glucose consumption and methods of increasing the production of a product of a pyruvate-utilizing pathway.
Global demand for liquid transportation fuel is projected to strain the ability to meet certain environmentally driven goals, for example, the conservation of oil reserves and limitation of greenhouse gas emissions. Such demand has driven the development of technology which allows utilization of renewable resources to mitigate the depletion of oil reserves and to minimize greenhouse gas emissions.
Butanol is an important industrial chemical, useful as a fuel additive, as a feedstock chemical in the plastics industry, and as a food grade extractant in the food and flavor industry. Each year 10 to 12 billion pounds of butanol are produced by petrochemical means and the need for this commodity chemical will likely increase in the future.
Methods for the chemical synthesis of isobutanol are known, such as oxo synthesis, catalytic hydrogenation of carbon monoxide (Ullmann's Encyclopedia of Industrial Chemistry, 6th edition, 2003, Wiley-VCH Verlag GmbH and Co., Weinheim, Germany, Vol. 5, pp. 716-719) and Guerbet condensation of methanol with n-propanol (Carlini et al., J. Molec. Catal. A: Chem. 220:215-220, 2004). These processes use starting materials derived from petrochemicals, are generally expensive, and are not environmentally friendly. The production of isobutanol from plant-derived raw materials would minimize greenhouse gas emissions and would represent an advance in the art.
2-Butanone, also referred to as methyl ethyl ketone (MEK), is a widely used solvent and is the most important commercially produced ketone after acetone. It is used as a solvent for paints, resins, and adhesives, as well as a selective extractant and activator of oxidative reactions, and it can be chemically converted to 2-butanol by reacting with hydrogen in the presence of a catalyst (Nystrom, R. F. and Brown, W. G., J. Am. Chem. Soc. (1947) 69:1198). 2,3-Butanediol can be used in the chemical synthesis of butene and butadiene, important industrial chemicals currently obtained from cracked petroleum, and esters of 2,3-butanediol may be used as plasticizers (Voloch et al., “Fermentation Derived 2,3-butanediol,” in Comprehensive Biotechnology, Pergamon Press Ltd., England Vol. 2, Section 3:933-947 (1986)).
Microorganisms can be engineered for the expression of biosynthetic pathways that initiate with cellular pyruvate to produce, for example, 2,3-butanediol, 2-butanone, 2-butanol and isobutanol. U.S. Pat. No. 7,851,188 discloses the engineering of recombinant microorganisms for production of isobutanol. U.S. Patent Application Publication Nos. US 20070259410 A1 and US 20070292927 A1 disclose the engineering of recombinant microorganisms for production of 2-butanone or 2-butanol. Multiple pathways are disclosed for biosynthesis of isobutanol and 2-butanol, all of which initiate with cellular pyruvate. Butanediol is an intermediate in the 2-butanol pathway disclosed in U.S. Patent Application Publication No. US 20070292927 A1.
The disruption of the enzyme pyruvate decarboxylase (PDC) in recombinant host cells engineered to express a pyruvate-utilizing biosynthetic pathway has been used to increase the availability of pyruvate for product formation via the biosynthetic pathway. For example, U.S. Application Publication No. US 20070031950 A1 discloses a yeast strain with a disruption of one or more pyruvate decarboxylase genes (a PDC knock-out or PDC-KO) and expression of a D-lactate dehydrogenase gene, which is used for production of D-lactic acid. U.S. Application Publication No. US 20050059136 A1 discloses glucose tolerant two-carbon source-independent (GCSI) yeast strains with no PDC activity, which may have an exogenous lactate dehydrogenase gene. Nevoigt and Stahl (Yeast 12:1331-1337 (1996)) describe the impact of reduced PDC and increased NAD-dependent glycerol-3-phosphate dehydrogenase in Saccharomyces cerevisiae on glycerol yield. U.S. Application Publication No. 20090305363 A1 discloses increased conversion of pyruvate to acetolactate by engineering yeast for expression of a cytosol-localized acetolactate synthase and substantial elimination of PDC activity.
While PDC-KO recombinant host cells can be used to produce the products of pyruvate-utilizing biosynthetic pathways, PDC-KO recombinant host cells require exogenous carbon substrate supplementation (e.g., ethanol or acetate) for their growth (Flikweert et al. 1999. FEMS Microbiol. Lett. 174(1):73-79 “Growth requirements of pyruvate-decarboxylase-negative Saccharomyces cerevisiae”). A similar auxotrophy is observed in Escherichia coli strains carrying a mutation of one or more genes encoding pyruvate dehydrogenase (Langley and Guest, 1977, J. Gen. Microbiol. 99:263-276).
In commercial applications, addition of exogenous carbon substrate in addition to the substrate converted to a desired product can lead to increased costs. There remains a need in the art for recombinant host cells with reduced or eliminated need for exogenous carbon substrate supplementation.
One aspect of the invention relates to a recombinant host cell comprising (i) at least one deletion, mutation, and/or substitution in an endogenous gene encoding a polypeptide that converts pyruvate to acetaldehyde, acetyl-phosphate, or acetyl-CoA; and (ii) a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity. Another aspect of the invention relates to such a recombinant host cell further comprising (iii) a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity. In embodiments, the polypeptide that converts pyruvate to acetaldehyde, acetyl-phosphate, or acetyl-CoA is pyruvate decarboxylase, pyruvate-formate lyase, pyruvate dehydrogenase, pyruvate oxidase, or pyruvate:ferredoxin oxidoreductase.
One aspect of the invention relates to a recombinant host cell comprising (i) a modification in an endogenous gene encoding a polypeptide having pyruvate decarboxylase activity or in an endogenous polypeptide having pyruvate decarboxylase activity; and (ii) a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity. Another aspect of the invention relates to such a recombinant host cell further comprising (iii) a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity.
One aspect of the invention relates to a recombinant host cell comprising (i) at least one deletion, mutation, and/or substitution in an endogenous gene encoding a polypeptide having pyruvate decarboxylase activity; and (ii) a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity. Another aspect of the invention relates to a recombinant host cell further comprising: (iii) a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity. Another aspect of the invention relates to a reduced or eliminated requirement of such cells for an exogenous two-carbon substrate for their growth in culture compared to a recombinant eukaryotic host cell comprising (i) and not (ii) or (iii). Another aspect of the invention relates to the growth of such host cells in culture media that is not supplemented with an exogenous two-carbon substrate, for example, at a growth rate substantially equivalent to, or greater than, the growth rate of a host cell comprising (i) and not (ii) or (iii) in culture media supplemented with an exogenous two-carbon substrate.
In one aspect of the invention, the recombinant host cell is a member of the genera Clostridium, Zymomonas, Escherichia, Salmonella, Serratia, Erwinia, Klebsiella, Shigella, Rhodococcus, Pseudomonas, Bacillus, Lactobacillus, Enterococcus, Alcaligenes, Klebsiella, Paenibacillus, Arthrobacter, Corynebacterium, Brevibacterium, Schizosaccharomyces, Kluyveromyces, Yarrowia, Pichia, Candida, Hansenula, or Saccharomyces. In another aspect of the invention, the recombinant host cell is S. cerevisiae.
In another aspect of the invention, the recombinant host cell expresses a pyruvate-utilizing biosynthetic pathway including, for example, a biosynthetic pathway for a product such as 2,3-butanediol, isobutanol, 2-butanol, 2-butanone, valine, leucine, alanine, lactic acid, malic acid, fumaric acid, succinic acid, or isoamyl alcohol. Another aspect of the invention relates to expression of an isobutanol biosynthetic pathway in the recombinant host cell comprising at least one DNA molecule encoding a polypeptide that catalyzes a substrate to product conversion selected from the group consisting of (i) pyruvate to acetolactate; (ii) acetolactate to 2,3-dihydroxyisovalerate; (iii) 2,3-dihydroxyisovalerate to 2-ketoisovalerate; (iv) 2-ketoisovalerate to isobutyraldehyde; and (v) isobutyraldehyde to isobutanol. Another aspect of the invention relates to expression of a 2-butanone biosynthetic pathway in the recombinant host cell comprising at least one DNA molecule encoding a polypeptide that catalyzes a substrate to product conversion selected from the group consisting of (i) pyruvate to acetolactate; (ii) acetolactate to acetoin; (iii) acetoin to 2,3-butanediol; and (iv) 2,3-butanediol to 2-butanone.
Another aspect of the invention relates to expression of a 2-butanol biosynthetic pathway in the recombinant host cell comprising at least one DNA molecule encoding a polypeptide that catalyzes a substrate to product conversion selected from the group consisting of (i) pyruvate to acetolactate; (ii) acetolactate to acetoin; (iii) acetoin to 2,3-butanediol; (iv) 2,3-butanediol to 2-butanone; and (v) 2-butanone to 2-butanol.
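By way of non-limiting illustration, the substrate to product conversions recited above for the isobutanol, 2-butanone, and 2-butanol biosynthetic pathways may be written as ordered lists. The sketch below is in Python; the names PATHWAYS, is_contiguous, and shared_steps are hypothetical and are introduced here only for illustration. It checks that each recited conversion feeds the next and shows which conversions two pathways have in common.

```python
# Substrate-to-product conversions as recited in the text above.
# The data layout and function names are illustrative assumptions only.
PATHWAYS = {
    "isobutanol": [
        ("pyruvate", "acetolactate"),
        ("acetolactate", "2,3-dihydroxyisovalerate"),
        ("2,3-dihydroxyisovalerate", "2-ketoisovalerate"),
        ("2-ketoisovalerate", "isobutyraldehyde"),
        ("isobutyraldehyde", "isobutanol"),
    ],
    "2-butanone": [
        ("pyruvate", "acetolactate"),
        ("acetolactate", "acetoin"),
        ("acetoin", "2,3-butanediol"),
        ("2,3-butanediol", "2-butanone"),
    ],
    "2-butanol": [
        ("pyruvate", "acetolactate"),
        ("acetolactate", "acetoin"),
        ("acetoin", "2,3-butanediol"),
        ("2,3-butanediol", "2-butanone"),
        ("2-butanone", "2-butanol"),
    ],
}


def is_contiguous(steps):
    """Return True if the product of each step is the substrate of the next."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(steps, steps[1:]))


def shared_steps(name_a, name_b):
    """Return the conversions of pathway name_a that also occur in name_b."""
    return [step for step in PATHWAYS[name_a] if step in PATHWAYS[name_b]]


if __name__ == "__main__":
    for name, steps in PATHWAYS.items():
        print(name, "starts from", steps[0][0], "contiguous:", is_contiguous(steps))
    print(shared_steps("isobutanol", "2-butanol"))
```

In this representation, every pathway begins with pyruvate, and the 2-butanol pathway is the 2-butanone pathway extended by one conversion.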
One aspect of the invention relates to methods for the production of a product selected from the group consisting of 2,3-butanediol, isobutanol, 2-butanol, 2-butanone, valine, leucine, alanine, lactic acid, malic acid, fumaric acid, succinic acid and isoamyl alcohol comprising growing the recombinant host cells described herein under conditions wherein the product is produced and optionally recovering the product. Another aspect of the invention relates to methods of producing a recombinant host cell comprising transforming a host cell comprising at least one deletion, mutation, and/or substitution in an endogenous gene encoding a polypeptide having pyruvate decarboxylase activity with (i) a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity; and optionally (ii) a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity.
Another aspect of the invention relates to methods of improving the growth of a recombinant host cell comprising at least one deletion, mutation or substitution in an endogenous gene encoding a polypeptide having pyruvate decarboxylase activity, comprising (i) transforming the recombinant host cell with a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity; and optionally (ii) transforming the recombinant host cell with a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity. In embodiments, the methods further comprise growing the recombinant host cell in media containing limited carbon substrate.
Another aspect of the invention relates to methods of reducing the requirement for an exogenous two-carbon substrate for the growth of a recombinant host cell comprising at least one deletion, mutation or substitution in an endogenous gene encoding a polypeptide having pyruvate decarboxylase activity, comprising (i) transforming the host cell with a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity; and optionally (ii) transforming the host cell with a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity.
Another aspect of the invention relates to methods of eliminating the requirement for an exogenous two-carbon substrate for the growth of a recombinant host cell comprising at least one deletion, mutation or substitution in an endogenous gene encoding a polypeptide having pyruvate decarboxylase activity, comprising (i) transforming the host cell with a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity; and optionally (ii) transforming the host cell with a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity.
Still another aspect of the invention relates to methods for increasing the activity of the phosphoketolase pathway in a recombinant host cell comprising (i) providing a recombinant host cell of the invention; and (ii) growing the recombinant host cell under conditions whereby the activity of the phosphoketolase pathway in the recombinant host cell is increased.
In another aspect, the recombinant host cells comprise a phosphoketolase that matches the Profile HMM given in Table 6 with an E value of less than 7.5E-242. In another aspect, the phosphoketolase has at least about 40% identity to at least one of SEQ ID NO: 355, 379, 381, 388, 481, 486, 468, or 504. In another aspect, the phosphoketolase has at least about 90% identity to at least one of SEQ ID NO: 355, 379, 381, 388, 481, 486, 468, or 504. In another aspect, the phosphoketolase matches the Profile HMMs given in Tables 6, 7, 8, and 9 with E values of less than 7.5E-242, 1.1E-124, 2.1E-49, 7.8E-37, respectively. In another aspect, the recombinant host cells further comprise a phosphotransacetylase which matches the Profile HMM given in Table 14 with an E value of less than 5E-34. In another aspect, the phosphotransacetylase has at least about 40% identity to SEQ ID NO: 1475, 1472, 1453, 1422, 1277, 1275, 1206, 1200, 1159, or 1129. In another aspect, the phosphotransacetylase has at least about 90% identity to SEQ ID NO: 1475, 1472, 1453, 1422, 1277, 1275, 1206, 1200, 1159, or 1129.
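As a non-limiting sketch of how identity and E value cutoffs of the kind recited above might be checked, the Python example below computes percent identity over a pair of sequences that have already been aligned and compares an E value to a cutoff. The function names, the gap character, and the choice to count every aligned column that is not a gap in both sequences are assumptions made for illustration; this passage does not prescribe an alignment method, and identity conventions differ between tools.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption: columns that are gaps in both sequences are skipped, and
    every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def passes_evalue(e_value, cutoff):
    """Return True if the E value is less than the stated cutoff."""
    return e_value < cutoff


if __name__ == "__main__":
    # Toy aligned fragments, not sequences from the sequence listing.
    identity = percent_identity("MKT-A", "MKTLA")
    print(identity, identity >= 40.0)
    print(passes_evalue(1e-250, 7.5e-242))
```

The toy strings in the usage lines are placeholders; in practice the alignment and the E value would come from the sequence comparison and profile HMM software actually used.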
The various embodiments of the invention can be more fully understood from the detailed description, the figures, and the accompanying sequence descriptions, which form a part of this application.
Tables 6, 7, 8, 9, and 14 are tables of the Profile HMMs described herein. Tables 6, 7, 8, and 14 are submitted herewith electronically and are incorporated herein by reference.
The invention can be more fully understood from the following detailed description and the accompanying sequence descriptions which form a part of this application.
The sequence listing provided herewith is herein incorporated by reference and conforms with 37 C.F.R. 1.821-1.825 (“Requirements for Patent Applications Containing Nucleotide Sequences and/or Amino Acid Sequence Disclosures—the Sequence Rules”) and is consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (2009) and the sequence listing requirements of the EPO and PCT (Rules 5.2 and 49.5 (a-bis), and Section 208 and Annex C of the Administrative Instructions). The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822. The content of the electronically submitted sequence listing Name: 20110615_CL4871USNA_SeqList.txt; Size: 6.67 MB; and Date of Creation/Modification: Jun. 9, 2011/Jun. 15, 2011 is incorporated herein by reference in its entirety.
SEQ ID NOs: 1-20 are sequences of PDC target gene coding regions and proteins.
SEQ ID NOs: 21-638 are phosphoketolase target gene coding regions and proteins.
SEQ ID NOs: 762-1885 are phosphotransacetylase target gene coding regions and proteins.
SEQ ID NOs: 1893-1897 are hybrid promoter sequences.
SEQ ID NOs: 639-642, 644-654, 656-660, 662-701-714, 725-726, 729-740, 742-748, and 750-761 are primers.
SEQ ID NO: 643 is the vector pRS426::GPD-xpk1+ADH1-eutD.
SEQ ID NO: 655 is the TEF1p-kan-TEF1t gene.
SEQ ID NO: 661 is vector pLA54.
SEQ ID NO: 715 is vector pRS423::pGAL1-cre.
SEQ ID NO: 716 is the vector pLH468-sadB.
SEQ ID NOs: 717 and 718 are the amino acid and nucleic acid sequences for sadB from Achromobacter xylosoxidans.
SEQ ID NO: 719 is the kivD coding region from L. lactis.
SEQ ID NO: 720 is the plasmid pRS425::GPM-sadB.
SEQ ID NO: 721 is the GPM promoter.
SEQ ID NO: 722 is the ADH1 terminator.
SEQ ID NO: 723 is the GPM-sadB-ADHt segment.
SEQ ID NO: 724 is the pUC19-URA3 plasmid.
SEQ ID NO: 741 is the ilvD-FBA1t segment.
SEQ ID NO: 749 is URA3r2 template DNA.
SEQ ID NO: 1886 is the ilvD coding region from S. mutans.
SEQ ID NO: 1888 is vector pLH468.
SEQ ID NO: 1898 is pUC19-URA3::pdc1::GPD-xpk1+ADH1-eutD.
SEQ ID NOs: 1899-1906 are the sequences of modified S. cerevisiae loci.
SEQ ID NO: 1907 is the sequence of pLH702.
SEQ ID NO: 1908 is the sequence of pYZ067ΔkivDΔhADH.
SEQ ID NO: 1909 is the amino acid sequence of ALD6.
SEQ ID NO: 1910 is the amino acid sequence of K9D3.
SEQ ID NO: 1911 is the amino acid sequence of K9G9.
SEQ ID NO: 1912 is the amino acid sequence of YMR226c.
SEQ ID NOs: 1913 and 1914 are the nucleic acid and amino acid sequences of AFT1.
SEQ ID NOs: 1915 and 1916 are the nucleic acid and amino acid sequences of AFT2.
SEQ ID NOs: 1917 and 1918 are the nucleic acid and amino acid sequences of FRA2.
SEQ ID NOs: 1919 and 1920 are the nucleic acid and amino acid sequences of GRX3.
SEQ ID NOs: 1921 and 1922 are the nucleic acid and amino acid sequences of CCC1.
SEQ ID NO: 1923 is the amino acid sequence of an alcohol dehydrogenase from Beijerinckia indica.
Applicants have solved the stated problem by expressing enzymes of the phosphoketolase pathway in recombinant host cells that otherwise require supplementation for growth, thereby reducing or eliminating the need to provide two substrates, one of which is converted to a desired product and the other of which is converted fully or partly into acetyl-CoA. One such enzyme, phosphoketolase (Enzyme Commission Number EC 4.1.2.9), catalyzes the conversion of xylulose 5-phosphate into glyceraldehyde 3-phosphate and acetyl-phosphate (Heath et al., J. Biol. Chem. 231: 1009-29; 1958). Another such enzyme is phosphotransacetylase (Enzyme Commission Number EC 2.3.1.8) which converts acetyl-phosphate into acetyl-CoA.
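For illustration only, the two conversions described above can be combined to show the net route from xylulose 5-phosphate to acetyl-CoA. The Python sketch below writes each reaction as a map of species to stoichiometric coefficients; the co-substrates and co-products not named in the text (inorganic phosphate, water, and coenzyme A) follow the standard reactions for EC 4.1.2.9 and EC 2.3.1.8, and the variable and function names are hypothetical.

```python
# Negative coefficients are consumed; positive coefficients are produced.
# Phosphate, H2O, and CoA are not named in the text above; they are taken
# from the standard reactions for these EC numbers.
PHOSPHOKETOLASE = {
    "xylulose 5-phosphate": -1,
    "phosphate": -1,
    "glyceraldehyde 3-phosphate": 1,
    "acetyl-phosphate": 1,
    "H2O": 1,
}

PHOSPHOTRANSACETYLASE = {
    "acetyl-phosphate": -1,
    "CoA": -1,
    "acetyl-CoA": 1,
    "phosphate": 1,
}


def net_reaction(*reactions):
    """Sum stoichiometric coefficients and drop species that cancel out."""
    total = {}
    for reaction in reactions:
        for species, coeff in reaction.items():
            total[species] = total.get(species, 0) + coeff
    return {species: coeff for species, coeff in total.items() if coeff != 0}


if __name__ == "__main__":
    for species, coeff in net_reaction(PHOSPHOKETOLASE, PHOSPHOTRANSACETYLASE).items():
        print(f"{coeff:+d} {species}")
```

Summed this way, acetyl-phosphate and inorganic phosphate cancel, leaving xylulose 5-phosphate and CoA consumed and glyceraldehyde 3-phosphate, acetyl-CoA, and water produced.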
Applicants have provided PDC-KO recombinant host cells comprising a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity, and optionally a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity. Such cells exhibit a reduced or eliminated requirement for exogenous two-carbon substrate supplementation for their growth compared to PDC-KO cells. Applicants have also provided methods of making and using such recombinant host cells including, for example, methods of increasing cell growth, methods of reducing or eliminating the requirement of an exogenous two-carbon substrate for cell growth, methods of increasing glucose consumption and methods of increasing the production of a product of a pyruvate-utilizing pathway.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In case of conflict, the present application including the definitions will control. Unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular. All publications, patents and other references mentioned herein are incorporated by reference in their entireties for all purposes as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference, unless only specific sections of patents or patent publications are indicated to be incorporated by reference.
Although methods and materials similar or equivalent to those described herein can be used in practice or testing of the present invention, suitable methods and materials are described below. The materials, methods and examples are illustrative only and are not intended to be limiting. Other features and advantages of the invention will be apparent from the detailed description and from the claims.
In order to further define this invention, the following terms, abbreviations and definitions are provided.
As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having,” “contains,” or “containing,” or any other variation thereof, are intended to be non-exclusive or open-ended. For example, a composition, a mixture, a process, a method, an article, or an apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such composition, mixture, process, method, article, or apparatus. Further, unless expressly stated to the contrary, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
Also, the indefinite articles “a” and “an” preceding an element or component of the invention are intended to be nonrestrictive regarding the number of instances, i.e., occurrences of the element or component. Therefore “a” or “an” should be read to include one or at least one, and the singular word form of the element or component also includes the plural unless the number is obviously meant to be singular.
The term “invention” or “present invention” as used herein is a non-limiting term and is not intended to refer to any single embodiment of the particular invention but encompasses all possible embodiments as described in the application.
As used herein, the term “about” modifying the quantity of an ingredient or reactant of the invention employed refers to variation in the numerical quantity that can occur, for example, through typical measuring and liquid handling procedures used for making concentrates or use solutions in the real world; through inadvertent error in these procedures; through differences in the manufacture, source, or purity of the ingredients employed to make the compositions or to carry out the methods; and the like. The term “about” also encompasses amounts that differ due to different equilibrium conditions for a composition resulting from a particular initial mixture. Whether or not modified by the term “about”, the claims include equivalents to the quantities. In one embodiment, the term “about” means within 10% of the reported numerical value, preferably within 5% of the reported numerical value.
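The numerical embodiment of “about” given above (within 10%, preferably within 5%, of the reported value) can be expressed as a simple relative-tolerance check. The Python sketch below is illustrative only; the function name within_about and its parameters are hypothetical, and the check does not capture the broader, non-numerical sense of the term described in the same paragraph.

```python
def within_about(measured, reported, tolerance=0.10):
    """Return True if measured is within tolerance (as a fraction) of reported.

    tolerance=0.10 corresponds to "within 10% of the reported numerical
    value"; tolerance=0.05 corresponds to the preferred 5% range.
    """
    return abs(measured - reported) <= tolerance * abs(reported)


if __name__ == "__main__":
    print(within_about(43.0, 40.0))        # compared against the 10% range
    print(within_about(43.0, 40.0, 0.05))  # compared against the 5% range
```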
The term “butanol” as used herein, refers to 2-butanol, 1-butanol, isobutanol, or mixtures thereof.
The term “pyruvate-utilizing biosynthetic pathway” refers to an enzyme pathway to produce a biosynthetic product from pyruvate.
The term “isobutanol biosynthetic pathway” refers to an enzyme pathway to produce isobutanol from pyruvate.
The term “2-butanol biosynthetic pathway” refers to an enzyme pathway to produce 2-butanol from pyruvate.
The term “2-butanone biosynthetic pathway” refers to an enzyme pathway to produce 2-butanone from pyruvate.
The terms “pdc-,” “PDC knock-out,” or “PDC-KO” as used herein refer to a cell that has a genetic modification to inactivate or reduce expression of at least one gene encoding pyruvate decarboxylase (PDC) so that the cell substantially or completely lacks pyruvate decarboxylase enzyme activity. If the cell has more than one expressed (active) PDC gene, then each of the active PDC genes may be inactivated or have minimal expression thereby producing a pdc- cell.
The term “carbon substrate” refers to a carbon source capable of being metabolized by the recombinant host cells disclosed herein. Non-limiting examples of carbon substrates are provided herein and include, but are not limited to, monosaccharides, oligosaccharides, polysaccharides, ethanol, lactate, succinate, glycerol, carbon dioxide, methanol, glucose, fructose, sucrose, xylose, arabinose, dextrose, or mixtures thereof.
The term “exogenous two-carbon substrate” refers to the carbon source provided to be metabolized into acetyl-CoA by a host cell that lacks the ability to convert pyruvic acid into acetyl-CoA. The term is used to distinguish from the carbon substrate which is converted into a pyruvate-derived product by a pyruvate-utilizing biosynthetic pathway, herein also referred to as the “pathway substrate” which includes, for example, glucose.
The term “polynucleotide” is intended to encompass a singular nucleic acid as well as plural nucleic acids, and refers to a nucleic acid molecule or construct, e.g., messenger RNA (mRNA) or plasmid DNA (pDNA). A polynucleotide can contain the nucleotide sequence of the full-length cDNA sequence, or a fragment thereof, including the untranslated 5′ and 3′ sequences and the coding sequences. The polynucleotide can be composed of any polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA or modified RNA or DNA. For example, polynucleotides can be composed of single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and double-stranded RNA, and RNA that is a mixture of single- and double-stranded regions, hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or a mixture of single- and double-stranded regions. “Polynucleotide” embraces chemically, enzymatically, or metabolically modified forms.
A polynucleotide sequence may be referred to as “isolated,” in which it has been removed from its native environment. For example, a heterologous polynucleotide encoding a polypeptide or polypeptide fragment having dihydroxy-acid dehydratase activity contained in a vector is considered isolated for the purposes of the present invention. Further examples of an isolated polynucleotide include recombinant polynucleotides maintained in heterologous host cells or purified (partially or substantially) polynucleotides in solution. Isolated polynucleotides or nucleic acids according to the present invention further include such molecules produced synthetically. An isolated polynucleotide fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.
The term “gene” refers to a nucleic acid fragment that is capable of being expressed as a specific protein, optionally including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in nature with its own regulatory sequences. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “heterologous gene” refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. “Heterologous gene” includes a native coding region, or portion thereof, that is reintroduced into the source organism in a form that is different from the corresponding native gene. For example, a heterologous gene may include a native coding region that is a portion of a chimeric gene including non-native regulatory regions that is reintroduced into the native host. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes.
As used herein the term “coding region” refers to a DNA sequence that codes for a specific amino acid sequence. “Suitable regulatory sequences” refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, polyadenylation recognition sequences, RNA processing site, effector binding site and stem-loop structure.
As used herein, the term “polypeptide” is intended to encompass a singular “polypeptide” as well as plural “polypeptides,” and refers to a molecule composed of monomers (amino acids) linearly linked by amide bonds (also known as peptide bonds). The term “polypeptide” refers to any chain or chains of two or more amino acids, and does not refer to a specific length of the product. Thus, peptides, dipeptides, tripeptides, oligopeptides, “protein,” “amino acid chain,” or any other term used to refer to a chain or chains of two or more amino acids, are included within the definition of “polypeptide,” and the term “polypeptide” may be used instead of, or interchangeably with any of these terms. A polypeptide may be derived from a natural biological source or produced by recombinant technology, but is not necessarily translated from a designated nucleic acid sequence. It may be generated in any manner, including by chemical synthesis.
By an “isolated” polypeptide or a fragment, variant, or derivative thereof is intended a polypeptide that is not in its natural milieu. No particular level of purification is required. For example, an isolated polypeptide can be removed from its native or natural environment. Recombinantly produced polypeptides and proteins expressed in host cells are considered isolated for purposes of the invention, as are native or recombinant polypeptides which have been separated, fractionated, or partially or substantially purified by any suitable technique.
As used herein, the term “variant” refers to a polypeptide differing from a specifically recited polypeptide of the invention by amino acid insertions, deletions, mutations, and substitutions, created using, e.g., recombinant DNA techniques, such as mutagenesis. Guidance in determining which amino acid residues may be replaced, added, or deleted without abolishing activities of interest, may be found by comparing the sequence of the particular polypeptide with that of homologous polypeptides, e.g., yeast or bacterial, and minimizing the number of amino acid sequence changes made in regions of high homology (conserved regions) or by replacing amino acids with consensus sequences.
Alternatively, recombinant polynucleotide variants encoding these same or similar polypeptides may be synthesized or selected by making use of the “redundancy” in the genetic code. Various codon substitutions, such as silent changes which produce various restriction sites, may be introduced to optimize cloning into a plasmid or viral vector for expression. Mutations in the polynucleotide sequence may be reflected in the polypeptide or domains of other peptides added to the polypeptide to modify the properties of any part of the polypeptide.
Amino acid “substitutions” may be the result of replacing one amino acid with another amino acid having similar structural and/or chemical properties, i.e., conservative amino acid replacements, or they may be the result of replacing one amino acid with an amino acid having different structural and/or chemical properties, i.e., non-conservative amino acid replacements. “Conservative” amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, or the amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic acid. Alternatively, “non-conservative” amino acid substitutions may be made by selecting the differences in polarity, charge, solubility, hydrophobicity, hydrophilicity, or the amphipathic nature of any of these amino acids. “Insertions” or “deletions” may be within the range of variation as structurally or functionally tolerated by the recombinant proteins. The variation allowed may be experimentally determined by systematically making insertions, deletions, or substitutions of amino acids in a polypeptide molecule using recombinant DNA techniques and assaying the resulting recombinant variants for activity.
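As a simplified, non-limiting illustration of the grouping described above, the Python sketch below places the twenty amino acids into the four classes named in the text and treats a substitution as conservative when both residues fall in the same class. This same-class rule is an assumption made for illustration; the text bases conservative substitutions more broadly on similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, or amphipathic nature. The names AMINO_ACID_CLASSES, residue_class, and is_conservative are hypothetical.

```python
# Classes exactly as listed in the text above.
AMINO_ACID_CLASSES = {
    "nonpolar": {
        "alanine", "leucine", "isoleucine", "valine",
        "proline", "phenylalanine", "tryptophan", "methionine",
    },
    "polar neutral": {
        "glycine", "serine", "threonine", "cysteine",
        "tyrosine", "asparagine", "glutamine",
    },
    "basic": {"arginine", "lysine", "histidine"},
    "acidic": {"aspartic acid", "glutamic acid"},
}


def residue_class(name):
    """Return the class name for an amino acid, or raise if it is unknown."""
    for class_name, members in AMINO_ACID_CLASSES.items():
        if name.lower() in members:
            return class_name
    raise KeyError(f"unknown amino acid: {name}")


def is_conservative(original, replacement):
    """Illustrative rule: conservative if both residues share a class."""
    return residue_class(original) == residue_class(replacement)


if __name__ == "__main__":
    print(is_conservative("leucine", "isoleucine"))
    print(is_conservative("lysine", "glutamic acid"))
```

Whether a given substitution is actually tolerated is, as the paragraph above states, determined experimentally by assaying the resulting variants for activity.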
The term “promoter” refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3′ to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters.” It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity. For example, it will be understood that “FBA1 promoter” can be used to refer to a fragment derived from the promoter region of the FBA1 gene.
The term “terminator” as used herein refers to DNA sequences located downstream of a coding sequence. This includes polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3′ end of the mRNA precursor. The 3′ region can influence the transcription, RNA processing or stability, or translation of the associated coding sequence. It is recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical terminator activity. For example, it will be understood that “CYC1 terminator” can be used to refer to a fragment derived from the terminator region of the CYC1 gene.
The term “operably linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of effecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.
The term “expression,” as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide.
The term “overexpression,” as used herein, refers to expression that is higher than endogenous expression of the same or related gene. A heterologous gene is overexpressed if its expression is higher than that of a comparable endogenous gene.
As used herein the term “transformation” refers to the transfer of a nucleic acid fragment into a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” or “recombinant” or “transformed” organisms.
The terms “plasmid” and “vector” as used herein, refer to an extra chromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3′ untranslated sequence into a cell.
As used herein the term “codon degeneracy” refers to the nature in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. The skilled artisan is well aware of the “codon-bias” exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.
The term “codon-optimized” as it refers to genes or coding regions of nucleic acid molecules for transformation of various hosts, refers to the alteration of codons in the gene or coding regions of the nucleic acid molecules to reflect the typical codon usage of the host organism without altering the polypeptide encoded by the DNA. Such optimization includes replacing at least one, or more than one, or a significant number, of codons with one or more codons that are more frequently used in the genes of that organism.
Deviations in the nucleotide sequence that comprise the codons encoding the amino acids of any polypeptide chain allow for variations in the sequence coding for the gene. Since each codon consists of three nucleotides, and the nucleotides comprising DNA are restricted to four specific bases, there are 64 possible combinations of nucleotides, 61 of which encode amino acids (the remaining three codons encode signals ending translation). The “genetic code” which shows which codons encode which amino acids is reproduced herein as Table 1. As a result, many amino acids are designated by more than one codon. For example, the amino acids alanine and proline are coded for by four triplets, serine and arginine by six, whereas tryptophan and methionine are coded by just one triplet. This degeneracy allows for DNA base composition to vary over a wide range without altering the amino acid sequence of the proteins encoded by the DNA.
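The codon counts stated above can be checked with a short script. The following is a minimal sketch in Python, assuming the standard genetic code written in DNA nomenclature; the names BASES, AMINO_ACIDS, CODON_TABLE and degeneracy are illustrative and do not appear in the specification.

```python
from collections import Counter
from itertools import product

# Assumption: the standard genetic code, with codons enumerated in the
# order TTT, TTC, TTA, TTG, TCT, ... (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 triplets to a one-letter amino acid code ("*" = stop).
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons that specify each amino acid (or stop signal).
degeneracy = Counter(CODON_TABLE.values())

total_codons = len(CODON_TABLE)
stop_codons = degeneracy["*"]
sense_codons = total_codons - stop_codons

print(total_codons, sense_codons, stop_codons)
for aa in "APSRWM":
    print(aa, degeneracy[aa])
```

Under these assumptions the script reports the degeneracy of alanine (A), proline (P), serine (S), arginine (R), tryptophan (W) and methionine (M) alongside the totals of sense and stop codons.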
Many organisms display a bias for use of particular codons to code for insertion of a particular amino acid in a growing peptide chain. Codon preference, or codon bias, differences in codon usage between organisms, is afforded by degeneracy of the genetic code, and is well documented among many organisms. Codon bias often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, inter alia, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization.
Given the large number of gene sequences available for a wide variety of animal, plant and microbial species, it is possible to calculate the relative frequencies of codon usage. Codon usage tables are readily available, for example, at the “Codon Usage Database” available at http://www.kazusa.or.jp/codon/ (visited Mar. 20, 2008), and these tables can be adapted in a number of ways. See Nakamura, Y., et al. Nucl. Acids Res. 28:292 (2000). Codon usage tables for yeast, calculated from GenBank Release 128.0 [15 Feb. 2002], are reproduced below as Table 2. This table uses mRNA nomenclature, and so instead of thymine (T) which is found in DNA, the tables use uracil (U) which is found in RNA. Table 2 has been adapted so that frequencies are calculated for each amino acid, rather than for all 64 codons.
Table 2. Codon usage for Saccharomyces cerevisiae genes (table body not reproduced here).
By utilizing this or similar tables, one of ordinary skill in the art can apply the frequencies to any given polypeptide sequence, and produce a nucleic acid fragment of a codon-optimized coding region which encodes the polypeptide, but which uses codons optimal for a given species.
Randomly assigning codons at an optimized frequency to encode a given polypeptide sequence can be done manually by calculating codon frequencies for each amino acid, and then assigning the codons to the polypeptide sequence randomly. Additionally, various algorithms and computer software programs are readily available to those of ordinary skill in the art. Examples include the “EditSeq” function in the Lasergene Package, available from DNAstar, Inc., Madison, Wis., the backtranslation function in the VectorNTI Suite, available from InforMax, Inc., Bethesda, Md., and the “backtranslate” function in the GCG-Wisconsin Package, available from Accelrys, Inc., San Diego, Calif. In addition, various resources are publicly available to codon-optimize coding region sequences, e.g., the “backtranslation” function at http://www.entelechon.com/bioinformatics/backtranslation.php?lang=eng (visited Apr. 15, 2008) and the “backtranseq” function available at http://bioinfo.pbi.nrc.ca:8090/EMBOSS/index.html (visited Jul. 9, 2002). Constructing a rudimentary algorithm to assign codons based on a given frequency can also easily be accomplished with basic mathematical functions by one of ordinary skill in the art.
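A rudimentary algorithm of the kind mentioned above might look like the following Python sketch. It is illustrative only: the function name back_translate and the dictionary EXAMPLE_FREQUENCIES are hypothetical, and the frequency values are placeholders chosen for the example, not values taken from Table 2 or from any codon usage database. Codons are written in mRNA nomenclature (U rather than T).

```python
import random

# Hypothetical per-amino-acid codon frequencies (placeholders, NOT real data).
# Each inner dictionary should sum to 1.0 for its amino acid.
EXAMPLE_FREQUENCIES = {
    "M": {"AUG": 1.0},
    "K": {"AAA": 0.6, "AAG": 0.4},
    "F": {"UUU": 0.6, "UUC": 0.4},
    "*": {"UAA": 0.5, "UAG": 0.2, "UGA": 0.3},
}


def back_translate(protein, frequencies, rng=None):
    """Assign one codon per residue, drawn at random with the given weights."""
    rng = rng or random.Random()
    codons = []
    for residue in protein:
        table = frequencies[residue]  # KeyError if a residue is not covered
        choices = list(table)
        weights = [table[codon] for codon in choices]
        codons.append(rng.choices(choices, weights=weights, k=1)[0])
    return "".join(codons)


# Usage: a seeded generator makes the random assignment reproducible.
coding_region = back_translate("MKF*", EXAMPLE_FREQUENCIES, random.Random(0))
print(coding_region)
```

Because each codon is drawn independently, the codon frequencies in a long coding region approach the supplied frequencies, while the encoded polypeptide is unchanged.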
Codon-optimized coding regions can be designed by various methods known to those skilled in the art including software packages such as “synthetic gene designer” (http://phenotype.biosci.umbc.edu/codon/sal/index.php).
A polynucleotide or nucleic acid fragment is “hybridizable” to another nucleic acid fragment, such as a cDNA, genomic DNA, or RNA molecule, when a single-stranded form of the nucleic acid fragment can anneal to the other nucleic acid fragment under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989), particularly Chapter 11 and Table 11.1 therein (entirely incorporated herein by reference). The conditions of temperature and ionic strength determine the “stringency” of the hybridization. Stringency conditions can be adjusted to screen for moderately similar fragments (such as homologous sequences from distantly related organisms), to highly similar fragments (such as genes that duplicate functional enzymes from closely related organisms). Post-hybridization washes determine stringency conditions. One set of preferred conditions uses a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2×SSC, 0.5% SDS at 45° C. for 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at 50° C. for 30 min. A more preferred set of stringent conditions uses higher temperatures in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS is increased to 60° C. Another preferred set of highly stringent conditions uses two final washes in 0.1×SSC, 0.1% SDS at 65° C. An additional set of stringent conditions includes hybridization at 0.1×SSC, 0.1% SDS, 65° C. and washes with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS, for example.
Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridizations with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). In one embodiment the length for a hybridizable nucleic acid is at least about 10 nucleotides. Preferably a minimum length for a hybridizable nucleic acid is at least about 15 nucleotides; more preferably at least about 20 nucleotides; and most preferably the length is at least about 30 nucleotides. Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the probe.
A “substantial portion” of an amino acid or nucleotide sequence is that portion comprising enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to putatively identify that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Altschul, S. F., et al., J. Mol. Biol., 215:403-410 (1990)). In general, a sequence of ten or more contiguous amino acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene specific oligonucleotide probes comprising 20-30 contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers. Accordingly, a “substantial portion” of a nucleotide sequence comprises enough of the sequence to specifically identify and/or isolate a nucleic acid fragment comprising the sequence. The instant specification teaches the complete amino acid and nucleotide sequence encoding particular proteins. The skilled artisan, having the benefit of the sequences as reported herein, may now use all or a substantial portion of the disclosed sequences for purposes known to those skilled in this art. Accordingly, the instant invention comprises the complete sequences as reported in the accompanying Sequence Listing, as well as substantial portions of those sequences as defined above.
The term “complementary” is used to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenine is complementary to thymine and cytosine is complementary to guanine.
The term “percent identity”, as known in the art, is a relationship between two or more polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the sequences. In the art, “identity” also means the degree of sequence relatedness between polypeptide or polynucleotide sequences, as the case may be, as determined by the match between strings of such sequences. “Identity” and “similarity” can be readily calculated by known methods, including but not limited to those described in 1.) Computational Molecular Biology (Lesk, A. M., Ed.) Oxford University: NY (1988); 2.) Biocomputing: Informatics and Genome Projects (Smith, D. W., Ed.) Academic: NY (1993); 3.) Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H. G., Eds.) Humana: NJ (1994); 4.) Sequence Analysis in Molecular Biology (von Heijne, G., Ed.) Academic (1987); and 5.) Sequence Analysis Primer (Gribskov, M. and Devereux, J., Eds.) Stockton: NY (1991).
Preferred methods to determine identity are designed to give the best match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. Sequence alignments and percent identity calculations may be performed using the MegAlign™ program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences is performed using the “Clustal method of alignment” which encompasses several varieties of the algorithm including the “Clustal V method of alignment” corresponding to the alignment method labeled Clustal V (described by Higgins and Sharp, CABIOS. 5:151-153 (1989); Higgins, D. G. et al., Comput. Appl. Biosci., 8:189-191 (1992)) and found in the MegAlign™ program of the LASERGENE bioinformatics computing suite (DNASTAR Inc.). For multiple alignments, the default values correspond to GAP PENALTY=10 and GAP LENGTH PENALTY=10. Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4. After alignment of the sequences using the Clustal V program, it is possible to obtain a “percent identity” by viewing the “sequence distances” table in the same program. Additionally the “Clustal W method of alignment” is available and corresponds to the alignment method labeled Clustal W (described by Higgins and Sharp, CABIOS. 5:151-153 (1989); Higgins, D. G. et al., Comput. Appl. Biosci. 8:189-191 (1992)) and found in the MegAlign™ v6.1 program of the LASERGENE bioinformatics computing suite (DNASTAR Inc.). Default parameters for multiple alignment are GAP PENALTY=10, GAP LENGTH PENALTY=0.2, Delay Divergent Seqs(%)=30, DNA Transition Weight=0.5, Protein Weight Matrix=Gonnet Series, and DNA Weight Matrix=IUB.
After alignment of the sequences using the Clustal W program, it is possible to obtain a “percent identity” by viewing the “sequence distances” table in the same program.
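For illustration, a percent identity value can be computed from two sequences that have already been aligned. The Python sketch below is an assumption-laden example and is not the calculation performed by MegAlign™ or by any Clustal program: the function name percent_identity is hypothetical, gaps are assumed to be written as "-", and columns containing a gap in either sequence are excluded from the comparison (other conventions for the denominator exist and give different values).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Usage: five gap-free columns, four of them identical.
print(percent_identity("ACGT-A", "ACCTTA"))
```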
It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying polypeptides, such as from other species, wherein such polypeptides have the same or similar function or activity. Useful examples of percent identities include, but are not limited to: 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100% may be useful in describing the present invention, such as 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%. Suitable nucleic acid fragments not only have the above homologies but typically encode a polypeptide having at least 50 amino acids, preferably at least 100 amino acids, more preferably at least 150 amino acids, still more preferably at least 200 amino acids, and most preferably at least 250 amino acids.
The term “sequence analysis software” refers to any computer algorithm or software program that is useful for the analysis of nucleotide or amino acid sequences. “Sequence analysis software” may be commercially available or independently developed. Typical sequence analysis software will include, but is not limited to: 1.) the GCG suite of programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.); 2.) BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol. Biol., 215:403-410 (1990)); 3.) DNASTAR (DNASTAR, Inc. Madison, Wis.); 4.) Sequencher (Gene Codes Corporation, Ann Arbor, Mich.); and 5.) the FASTA program incorporating the Smith-Waterman algorithm (W. R. Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Plenum: New York, N.Y.). Within the context of this application it will be understood that where sequence analysis software is used for analysis, the results of the analysis will be based on the “default values” of the program referenced, unless otherwise specified. As used herein “default values” will mean any set of values or parameters that originally load with the software when first initialized.
Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989) (hereinafter “Maniatis”); and by Silhavy, T. J., Berman, M. L. and Enquist, L. W., Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1984); and by Ausubel, F. M., et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience (1987). Additional methods used here are in Methods in Enzymology, Volume 194, Guide to Yeast Genetics and Molecular and Cell Biology (Part A, 2004, Christine Guthrie and Gerald R. Fink (Eds.), Elsevier Academic Press, San Diego, Calif.). Other molecular tools and techniques are known in the art and include splicing by overlapping extension polymerase chain reaction (PCR) (Yu, et al. (2004) Fungal Genet. Biol. 41:973-981), positive selection for mutations at the URA3 locus of Saccharomyces cerevisiae (Boeke, J. D. et al. (1984) Mol. Gen. Genet. 197, 345-346; M A Romanos, et al. Nucleic Acids Res. 1991 Jan. 11; 19(1): 187), the cre-lox site-specific recombination system as well as mutant lox sites and FLP substrate mutations (Sauer, B. (1987) Mol Cell Biol 7: 2087-2096; Senecoff, et al. (1988) Journal of Molecular Biology, Volume 201, Issue 2, Pages 405-421; Albert, et al. (1995) The Plant Journal, Volume 7, Issue 4, pages 649-659), “seamless” gene deletion (Akada, et al. (2006) Yeast; 23(5):399-405), and gap repair methodology (Ma et al., Genetics 58:201-216; 1981).
Applicants have discovered that activation of the phosphoketolase pathway in a recombinant host cell comprising a modification in an endogenous gene encoding a polypeptide having pyruvate decarboxylase activity, or a modification in an endogenous polypeptide having pyruvate decarboxylase activity, reduces or eliminates the need for an exogenous carbon substrate for the growth of such a cell. In embodiments, the recombinant host cells comprise (i) at least one deletion, mutation and/or substitution in an endogenous gene encoding a polypeptide having pyruvate decarboxylase activity; (ii) a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity; and optionally (iii) a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity.
The genetic manipulations of the host cells described herein can be performed using standard genetic techniques and screening and can be made in any host cell that is suitable for genetic manipulation (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202). In embodiments, the recombinant host cells disclosed herein can be any bacteria, yeast or fungi host useful for genetic modification and recombinant gene expression. In other embodiments a recombinant host cell can be a member of the genera Clostridium, Zymomonas, Escherichia, Salmonella, Serratia, Erwinia, Klebsiella, Shigella, Rhodococcus, Pseudomonas, Bacillus, Lactobacillus, Enterococcus, Alcaligenes, Paenibacillus, Arthrobacter, Corynebacterium, Brevibacterium, Schizosaccharomyces, Kluyveromyces, Yarrowia, Pichia, Candida, Hansenula, Issatchenkia, or Saccharomyces. In other embodiments, the host cell can be Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces lactis, Kluyveromyces thermotolerans, Kluyveromyces marxianus, Candida glabrata, Candida albicans, Pichia stipitis, Yarrowia lipolytica, E. coli, or L. plantarum. In still other embodiments, the host cell is a yeast host cell. In some embodiments, the host cell is a member of the genus Saccharomyces. In some embodiments, the host cell is Kluyveromyces lactis, Candida glabrata or Schizosaccharomyces pombe. In some embodiments, the host cell is Saccharomyces cerevisiae. S. cerevisiae yeast are known in the art and are available from a variety of sources, including, but not limited to, American Type Culture Collection (Rockville, Md.), Centraalbureau voor Schimmelcultures (CBS) Fungal Biodiversity Centre, LeSaffre, Gert Strand AB, Ferm Solutions, North American Bioproducts, Martrex, and Lallemand. S. cerevisiae include, but are not limited to, BY4741, CEN.PK 113-7D, Ethanol Red® yeast, Ferm Pro™ yeast, Bio-Ferm® XR yeast, Gert Strand Prestige Batch Turbo alcohol yeast, Gert Strand Pot Distillers yeast, Gert Strand Distillers Turbo yeast, FerMax™ Green yeast, FerMax™ Gold yeast, Thermosacc® yeast, BG-1, PE-2, CAT-1, CBS7959, CBS7960, and CBS7961.
Sources of Acetyl-CoA
Acetyl-CoA is a major cellular building block, required for the synthesis of fatty acids, sterols, and lysine. Pyruvate is often a major contributor to the acetyl-CoA pool. Pyruvate dehydrogenase catalyzes the direct conversion of pyruvate to acetyl-CoA (E.C. 1.2.4.1, E.C. 1.2.1.51) or acetate (E.C. 1.2.2.2) and is almost ubiquitous in nature. Other enzymes involved in conversion of pyruvate to acetyl-CoA, acetyl-phosphate or acetate include pyruvate-formate lyase (E.C. 2.3.1.54), pyruvate oxidase (E.C. 1.2.3.3, E.C. 1.2.3.6), pyruvate-ferredoxin oxidoreductase (E.C. 1.2.7.1), and pyruvate decarboxylase (E.C. 4.1.1.1). Genetic modifications made to a host cell to conserve the pyruvate pool for a product of interest may include those that restrict conversion to acetyl-CoA, leading to decreased growth in the absence of an exogenously supplied two-carbon substrate, i.e., a carbon substrate that can be readily converted to acetyl-CoA independent of pyruvate (e.g., ethanol or acetate). An example is the documented auxotrophy observed in pyruvate decarboxylase deficient Saccharomyces cerevisiae (Flikweert et al. 1999, supra). Another example is the documented auxotrophy observed in pyruvate dehydrogenase deficient Escherichia coli when grown aerobically on glucose (Langley and Guest, 1977, J. Gen. Microbiol. 99:263-276).
In embodiments, the recombinant host cells disclosed herein comprise a modification in an endogenous polynucleotide encoding a polypeptide having pyruvate decarboxylase (PDC) activity or a modification in an endogenous polypeptide having PDC activity. In embodiments, the recombinant host cells disclosed herein can have a modification or disruption of one or more polynucleotides or genes encoding PDC, or of one or more polypeptides having PDC activity. In embodiments, the recombinant host cell comprises at least one deletion, mutation, and/or substitution in one or more endogenous polynucleotides or genes encoding a polypeptide having PDC activity, or in one or more endogenous polypeptides having PDC activity. Such modifications, disruptions, deletions, mutations, and/or substitutions can result in PDC activity that is reduced or eliminated, resulting in a PDC knock-out (PDC-KO) phenotype.
In embodiments, the endogenous pyruvate decarboxylase activity of the recombinant host cells disclosed herein converts pyruvate to acetaldehyde, which can then be converted to ethanol or to acetyl-CoA via acetate.
In embodiments, the recombinant host cell is Kluyveromyces lactis containing one gene encoding pyruvate decarboxylase, Candida glabrata containing one gene encoding pyruvate decarboxylase, or Schizosaccharomyces pombe containing one gene encoding pyruvate decarboxylase.
In other embodiments, the recombinant host cell is Saccharomyces cerevisiae containing three isozymes of pyruvate decarboxylase encoded by the pdc1, pdc5, and pdc6 genes, as well as a pyruvate decarboxylase regulatory gene, pdc2. In a non-limiting example in S. cerevisiae, the pdc1 and pdc5 genes, or all three genes, are disrupted. In another non-limiting example in S. cerevisiae, pyruvate decarboxylase activity may be reduced by disrupting the pdc2 regulatory gene. In another non-limiting example in S. cerevisiae, polynucleotides or genes encoding pyruvate decarboxylase proteins such as those having about 70% to about 75%, about 75% to about 80%, about 80% to about 85%, about 85% to about 90%, about 90% to about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity to pdc1 or pdc5 can be disrupted.
In embodiments, the polypeptide having PDC activity or the polynucleotide or gene encoding a polypeptide having PDC activity is associated with Enzyme Commission Number EC 4.1.1.1. In other embodiments, a PDC gene of the recombinant host cells disclosed herein is not active under the fermentation conditions used, and therefore such a gene would not need to be modified or inactivated.
Examples of recombinant host cells with reduced pyruvate decarboxylase activity due to disruption of pyruvate decarboxylase encoding genes have been reported, such as for Saccharomyces in Flikweert et al. (Yeast (1996) 12:247-257), for Kluyveromyces in Bianchi et al. (Mol. Microbiol. (1996) 19(1):27-36), and disruption of the regulatory gene in Hohmann (Mol. Gen. Genet. (1993) 241:657-666). Saccharomyces strains having no pyruvate decarboxylase activity are available from the ATCC with Accession #200027 and #200028.
Examples of PDC polynucleotides, genes and polypeptides that can be targeted for modification or inactivation in the recombinant host cells disclosed herein include, but are not limited to, those of the following table.
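Table 3. Source organisms of the PDC target sequences (sequence identifiers not reproduced here): Candida glabrata; Kluyveromyces lactis; Yarrowia lipolytica; Schizosaccharomyces pombe; Zygosaccharomyces rouxii.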
Other examples of PDC polynucleotides, genes and polypeptides that can be targeted for modification or inactivation in the recombinant host cells disclosed herein include, but are not limited to, PDC polynucleotides, genes and/or polypeptides having at least about 70% to about 75%, about 75% to about 80%, about 80% to about 85%, about 85% to about 90%, about 90% to about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity to any one of the sequences of Table 3.
In embodiments, the sequences of other PDC polynucleotides, genes and/or polypeptides can be identified in the literature and in bioinformatics databases well known to the skilled person using sequences disclosed herein and available in the art. For example, such sequences can be identified through BLAST (as described above) searching of publicly available databases with known PDC encoding polynucleotide or polypeptide sequences. In such a method, identities can be based on the Clustal W method of alignment using the default parameters of GAP PENALTY=10, GAP LENGTH PENALTY=0.1, and Gonnet 250 series of protein weight matrix.
Additionally, the PDC polynucleotide or polypeptide sequences described herein or known the art can be used to identify other PDC homologs in nature. For example, each of the PDC encoding nucleic acid fragments described herein can be used to isolate genes encoding homologous proteins. Isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to (1) methods of nucleic acid hybridization; (2) methods of DNA and RNA amplification, as exemplified by various uses of nucleic acid amplification technologies [e.g., polymerase chain reaction (PCR), Mullis et al., U.S. Pat. No. 4,683,202; ligase chain reaction (LCR), Tabor, S. et al., Proc. Acad. Sci. USA 82:1074 (1985); or strand displacement amplification (SDA), Walker et al., Proc. Natl. Acad. Sci. USA., 89:392 (1992)]; and (3) methods of library construction and screening by complementation.
In embodiments, PDC polynucleotides, genes and/or polypeptides related to the recombinant host cells described herein can be modified or disrupted. Many methods for genetic modification and disruption of target genes to reduce or eliminate expression are known to one of ordinary skill in the art and can be used to create the recombinant host cells described herein. Modifications that can be used include, but are not limited to, deletion of the entire gene or a portion of the gene encoding a PDC protein, inserting a DNA fragment into the encoding gene (in either the promoter or coding region) so that the protein is not expressed or expressed at lower levels, introducing a mutation into the coding region which adds a stop codon or frame shift such that a functional protein is not expressed, and introducing one or more mutations into the coding region to alter amino acids so that a non-functional or a less active protein is expressed. In other embodiments, expression of a target gene can be blocked by expression of an antisense RNA or an interfering RNA, and constructs can be introduced that result in, cosuppression. In other embodiments, the synthesis or stability of the transcript can be lessened by mutation. In embodiments, the efficiency by which a protein is translated from mRNA can be modulated by mutation. All of these methods can be readily practiced by one skilled in the art making use of the known or identified sequences encoding target proteins.
In other embodiments, DNA sequences surrounding a target PDC coding sequence are also useful in some modification procedures and are available, for example, for yeasts such as Saccharomyces cerevisiae in the complete genome sequence coordinated by Genome Project ID9518 of Genome Projects coordinated by NCBI (National Center for Biotechnology Information) with identifying GOPID #413838. An additional non-limiting example of yeast genomic sequences is that of Candida albicans, which is included in GPID #10771, #10701 and #416373. Other yeast genomic sequences can be readily found by one of skill in the art in publicly available databases.
In other embodiments, DNA sequences surrounding a target PDC coding sequence can be useful for modification methods using homologous recombination. In a non-limiting example of this method, PDC gene flanking sequences can be placed bounding a selectable marker gene to mediate homologous recombination whereby the marker gene replaces the PDC gene. In another non-limiting example, partial PDC gene sequences and PDC gene flanking sequences bounding a selectable marker gene can be used to mediate homologous recombination whereby the marker gene replaces a portion of the target PDC gene. In embodiments, the selectable marker can be bounded by site-specific recombination sites, so that following expression of the corresponding site-specific recombinase, the resistance gene is excised from the PDC gene without reactivating the latter. In embodiments, the site-specific recombination leaves behind a recombination site which disrupts expression of the PDC protein. In other embodiments, the homologous recombination vector can be constructed to also leave a deletion in the PDC gene following excision of the selectable marker, as is well known to one skilled in the art.
In other embodiments, deletions can be made to a PDC target gene using mitotic recombination as described in Wach et al. (Yeast, 10:1793-1808; 1994). Such a method can involve preparing a DNA fragment that contains a selectable marker between genomic regions that can be as short as 20 bp, and which bound a target DNA sequence. In other embodiments, this DNA fragment can be prepared by PCR amplification of the selectable marker gene using as primers oligonucleotides that hybridize to the ends of the marker gene and that include the genomic regions that can recombine with the yeast genome. In embodiments, the linear DNA fragment can be efficiently transformed into yeast and recombined into the genome, resulting in gene replacement, including deletion of the target DNA sequence (as described, for example, in Methods in Enzymology, Volume 194, Guide to Yeast Genetics and Molecular and Cell Biology (Part A, 2004, Christine Guthrie and Gerald R. Fink (Eds.), Elsevier Academic Press, San Diego, Calif.)).
Moreover, promoter replacement methods can be used to exchange the endogenous transcriptional control elements allowing another means to modulate expression such as described in Mnaimneh et al. ((2004) Cell 118(1):31-44).
In other embodiments, the PDC target gene encoded activity can be disrupted using random mutagenesis, which can then be followed by screening to identify strains with dependency on carbon substrates for growth. In this type of method, the DNA sequence of the target gene encoding region, or any other region of the genome affecting carbon substrate dependency for growth, need not be known. In embodiments, cells identified in a screen for reduced PDC activity and/or two-carbon substrate dependency, or other mutants having reduced PDC activity and a reduced or eliminated dependency on exogenous two-carbon substrate for growth, can be useful as recombinant host cells of the invention.
Methods for creating genetic mutations are common and well known in the art and can be applied to the exercise of creating mutants. Commonly used random genetic modification methods (reviewed in Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) include spontaneous mutagenesis, mutagenesis caused by mutator genes, chemical mutagenesis, irradiation with UV or X-rays, or transposon mutagenesis.
Chemical mutagenesis of host cells can involve, but is not limited to, treatment with one of the following DNA mutagens: ethyl methanesulfonate (EMS), nitrous acid, diethyl sulfate, or N-methyl-N′-nitro-N-nitroso-guanidine (MNNG). Such methods of mutagenesis have been reviewed in Spencer et al. (Mutagenesis in Yeast, 1996, Yeast Protocols: Methods in Cell and Molecular Biology. Humana Press, Totowa, N.J.). In embodiments, chemical mutagenesis with EMS can be performed as described in Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. Irradiation with ultraviolet (UV) light or X-rays can also be used to produce random mutagenesis in yeast cells. The primary effect of mutagenesis by UV irradiation is the formation of pyrimidine dimers which disrupt the fidelity of DNA replication. Protocols for UV-mutagenesis of yeast can be found in Spencer et al. (Mutagenesis in Yeast, 1996, Yeast Protocols: Methods in Cell and Molecular Biology. Humana Press, Totowa, N.J.). In embodiments, the introduction of a mutator phenotype can also be used to generate random chromosomal mutations in host cells. In embodiments, common mutator phenotypes can be obtained through disruption of one or more of the following genes: PMS1, MAG1, RAD18 or RAD51. In other embodiments, restoration of the non-mutator phenotype can be obtained by insertion of the wildtype allele. In other embodiments, collections of modified cells produced from any of these or other known random mutagenesis processes may be screened for reduced or eliminated PDC activity.
Genomes have been completely sequenced and annotated and are publicly available for the following yeast strains: Ashbya gossypii ATCC 10895, Candida glabrata CBS138, Kluyveromyces lactis NRRL Y-1140, Pichia stipitis CBS 6054, Saccharomyces cerevisiae S288c, Schizosaccharomyces pombe 972h-, and Yarrowia lipolytica CLIB 122. Typically BLAST (described above) searching of publicly available databases with known PDC polynucleotide or polypeptide sequences, such as those provided herein, is used to identify PDC-encoding sequences of other host cells, such as yeast cells.
Accordingly, it is within the scope of the invention to provide pyruvate decarboxylase polynucleotides and polypeptides having at least about 70% to about 75%, about 75% to about 80%, about 80% to about 85%, about 85% to about 90%, about 90% to about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity to any of the PDC polynucleotides or polypeptides disclosed herein (SEQ ID NOs: 1-20). Identities are based on the Clustal W method of alignment using the default parameters of GAP PENALTY=10, GAP LENGTH PENALTY=0.1, and Gonnet 250 series of protein weight matrix.
The modification of PDC in the host cells disclosed herein to reduce or eliminate PDC activity can be confirmed using methods known in the art. For example, PCR methods well known in the art can be used to confirm deletion of PDC. Other suitable methods will be known to those of skill in the art and include, but are not limited to, lack of growth on yeast extract peptone-dextrose medium (YPD).
Applicants have found that expression of enzymes associated with the phosphoketolase pathway (e.g., phosphoketolase and/or phosphotransacetylase) results in a reduced or eliminated requirement for exogenous two-carbon substrate supplementation for growth of PDC-KO cells. Phosphoketolases and/or phosphotransacetylases identified as described herein can be expressed in such cells using methods described herein.
Enzymes of the phosphoketolase pathway include phosphoketolase and phosphotransacetylase.
In embodiments, the phosphoketolase pathway is activated in the recombinant host cells disclosed herein by engineering the cells to express polynucleotides and/or polypeptides encoding phosphoketolase and, optionally, phosphotransacetylase. In embodiments, the recombinant host cells disclosed herein comprise a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity. In other embodiments, the recombinant host cells disclosed herein comprise a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity and a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity. In other embodiments, the heterologous polynucleotide encoding a polypeptide having phosphoketolase activity is overexpressed, or expressed at a level that is higher than endogenous expression of the same or related endogenous gene, if any. In still other embodiments, the heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity is overexpressed, or expressed at a level that is higher than endogenous expression of the same or related endogenous gene, if any.
In embodiments, a polypeptide having phosphoketolase activity catalyzes the conversion of xylulose 5-phosphate into glyceraldehyde-3-phosphate and acetyl-phosphate and/or the conversion of fructose-6-phosphate into erythrose-4-phosphate and acetyl-phosphate. In embodiments, the activity of a polypeptide having phosphoketolase activity is inhibited by erythrose 4-phosphate and/or glyceraldehyde 3-phosphate. In other embodiments, a polypeptide having phosphotransacetylase activity catalyzes the conversion of acetyl-phosphate into acetyl-CoA.
Numerous examples of polynucleotides, genes and polypeptides encoding phosphoketolase activity are known in the art and can be used in the recombinant host cells disclosed herein. In embodiments, such a polynucleotide, gene and/or polypeptide can be the xylulose 5-phosphate phosphoketolase (XpkA) of Lactobacillus pentosus MD363 (Posthuma et al., Appl. Environ. Microbiol. 68: 831-7; 2002). XpkA is the central enzyme of the phosphoketolase pathway (PKP) in lactic acid bacteria, and exhibits a specific activity of 4.455 μmol/min/mg (Posthuma et al., Appl. Environ. Microbiol. 68: 831-7; 2002). In other embodiments, such a polynucleotide, gene and/or polypeptide can be the phosphoketolase of Leuconostoc mesenteroides, which exhibits a specific activity of 9.9 μmol/min/mg and is stable at pH above 4.5 (Goldberg et al., Methods Enzymol. 9: 515-520; 1966). This phosphoketolase exhibits a Km of 4.7 mM for D-xylulose 5-phosphate and a Km of 29 mM for fructose 6-phosphate (Goldberg et al., Methods Enzymol. 9: 515-520; 1966). In other embodiments, such a polynucleotide, gene and/or polypeptide can be the D-xylulose 5-phosphate/D-fructose 6-phosphate phosphoketolase gene xfp from B. lactis, as described, for example, in a pentose-metabolizing S. cerevisiae strain by Sonderegger et al. (Appl. Environ. Microbiol. 70: 2892-7; 2004).
In embodiments, a polynucleotide, gene and/or polypeptide encoding phosphoketolase corresponds to the Enzyme Commission Number EC 4.1.2.9.
In embodiments, host cells comprise a polypeptide having at least about 80%, at least about 85%, at least about 90%, or 100% identity to a polypeptide of Table 4 or an active fragment thereof or a polynucleotide encoding such a polypeptide. In other embodiments, a polynucleotide, gene and/or polypeptide encoding phosphoketolase can include, but is not limited to, a sequence provided in the following Tables 4 or 5.
Lactobacillus plantarum
Lactobacillus pentosus
In other embodiments, a polynucleotide, gene and/or polypeptide encoding phosphoketolase can have at least about 70% to about 75%, about 75% to about 80%, about 80% to about 85%, about 85% to about 90%, about 90% to about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity to that of any one of the sequences of Table 4, wherein the polynucleotide, gene and/or polypeptide encodes a polypeptide having phosphoketolase activity.
In other embodiments, a polynucleotide, gene and/or polypeptide encoding phosphoketolase can be used to identify other phosphoketolase polynucleotide, gene and/or polypeptide sequences or to identify phosphoketolase homologs in other cells, as described above for PDC. Such phosphoketolase encoding sequences can be identified, for example, in the literature and/or in bioinformatics databases well known to the skilled person. For example, the identification of phosphoketolase encoding sequences in other cell types using bioinformatics can be accomplished through BLAST (as described above) searching of publicly available databases with known phosphoketolase encoding DNA and polypeptide sequences, such as those provided herein. Identities are based on the Clustal W method of alignment using the default parameters of GAP PENALTY=10, GAP LENGTH PENALTY=0.1, and Gonnet 250 series of protein weight matrix.
Additional phosphoketolase target gene coding regions were identified using diversity search, clustering, experimentally verified xylulose-5-phosphate/fructose-6-phosphate phosphoketolases and domain architecture. Briefly, a BLAST search with the experimentally verified sequences with an Evalue cut-off of 0.01 resulted in 595 sequence matches. Clustering with the CD-HIT program at 95% sequence identity and 90% length overlap reduced the number to 436. CD-HIT is a program for clustering large protein databases at a high sequence identity threshold. The program removes redundant sequences and generates a database of only the representatives (Clustering of highly homologous sequences to reduce the size of large protein database, Weizhong Li, Lukasz Jaroszewski & Adam Godzik, Bioinformatics (2001) 17:282-283).
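The redundancy-reduction step can be illustrated with a minimal Python sketch of longest-first greedy clustering. This is not the CD-HIT implementation: the function names are hypothetical, and the ungapped positional identity used here is a simplifying stand-in for the alignment-based identity that CD-HIT computes. The 95% identity and 90% length-overlap defaults are taken from the text above.

```python
def ungapped_identity(a: str, b: str) -> float:
    """Fraction of identical positions over the shorter sequence (no gaps).

    Assumption: a crude stand-in for an alignment-based identity.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / n


def greedy_cluster(seqs, identity=0.95, overlap=0.90):
    """Return {representative_id: [member_ids]}.

    Sequences are visited longest first; each one joins the first
    representative it matches, otherwise it becomes a new representative.
    """
    order = sorted(seqs, key=lambda k: (-len(seqs[k]), k))
    clusters = {}
    for sid in order:
        s = seqs[sid]
        for rep in clusters:
            r = seqs[rep]
            if (len(r) > 0
                    and len(s) / len(r) >= overlap
                    and ungapped_identity(s, r) >= identity):
                clusters[rep].append(sid)
                break
        else:
            clusters[sid] = [sid]
    return clusters


if __name__ == "__main__":
    base = "ACDEFGHIKL" * 10
    demo = {"a": base, "b": base[:96], "c": "W" * 100, "d": base[:50]}
    print(greedy_cluster(demo))
```

Only the dictionary keys (the representatives) would be carried forward to later steps, which mirrors the reduction from 595 matches to 436 representatives described above.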
Xylulose-5-phosphate/fructose-6-phosphate phosphoketolases have three Pfam domains: XFP_N; XFP; XFP_C. Each of these domains may be present in several domain architectures; for example, XFP_N is found in eight architectures. The architecture of interest was determined to be XFP_N; XFP; XFP_C. The cumulative length of the three domains is 760 amino acids.
A structure/function characterization of the phosphoketolases was performed using the HMMER software package. The following information, based on the HMMER software user guide, gives some description of the way that the hmmbuild program prepares a Profile HMM. A Profile HMM is capable of modeling gapped alignments, e.g. including insertions and deletions, which lets the software describe a complete conserved domain (rather than just a small ungapped motif). Insertions and deletions are modeled using insertion (I) states and deletion (D) states. All columns that contain more than a certain fraction x of gap characters will be assigned as an insert column. By default, x is set to 0.5. Each match state has an I and a D state associated with it. HMMER calls a group of three states (M/D/I) at the same consensus position in the alignment a “node”. These states are interconnected with arrows called state transition probabilities. M and I states are emitters, while D states are silent. The transitions are arranged so that at each node, either the M state is used (and a residue is aligned and scored) or the D state is used (and no residue is aligned, resulting in a deletion-gap character, ‘-’). Insertions occur between nodes, and I states have a self-transition, allowing one or more inserted residues to occur between consensus columns.
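The column-assignment rule described above (a column with more than a fraction x of gap characters becomes an insert column) can be sketched in a few lines of Python. This is an illustrative sketch only, not HMMER code; the function name and the set of gap characters are assumptions.

```python
def assign_columns(alignment, gap_fraction=0.5, gap_chars="-."):
    """Label each alignment column 'M' (match) or 'I' (insert).

    A column is an insert column when strictly more than `gap_fraction`
    of its characters are gaps; otherwise it is a match column.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    labels = []
    for col in range(width):
        gaps = sum(1 for row in alignment if row[col] in gap_chars)
        labels.append("I" if gaps / len(alignment) > gap_fraction else "M")
    return labels


if __name__ == "__main__":
    msa = ["AC-G", "A--G", "ACTG", "A--G"]
    print(assign_columns(msa))
```

In the four-sequence example, the second column has exactly half gaps and therefore stays a match column, while the third column (three of four gaps) is assigned as an insert column.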
The scores of residues in a match state (i.e. match state emission scores), or in an insert state (i.e. insert state emission scores), are proportional to log_2 (p_x/null_x), where p_x is the probability of an amino acid residue at a particular position in the alignment according to the Profile HMM, and null_x is the probability according to the Null model. The Null model is a simple one state probabilistic model with a pre-calculated set of emission probabilities for each of the 20 amino acids, derived from the distribution of amino acids in the SWISSPROT release 24. State transition scores are also calculated as log odds parameters and are proportional to log_2 (t_x), where t_x is the probability of transiting to an emitter or non-emitter state.
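As a worked illustration of the log-odds scores just described, the following Python sketch computes an emission score from a model probability and a null probability, and a transition score from a transition probability. The function names and the numeric values are hypothetical, and any scaling constant applied by the software is omitted.

```python
import math


def emission_score(p_x: float, null_x: float) -> float:
    """Log-odds emission score: log2 of model probability over null probability."""
    if p_x <= 0 or null_x <= 0:
        raise ValueError("probabilities must be positive")
    return math.log2(p_x / null_x)


def transition_score(t_x: float) -> float:
    """Log-odds transition score: log2 of the transition probability."""
    if t_x <= 0:
        raise ValueError("probability must be positive")
    return math.log2(t_x)


if __name__ == "__main__":
    # A residue four times more likely under the model than under the null model.
    print(emission_score(0.25, 0.0625))
    # A transition taken with probability one half.
    print(transition_score(0.5))
```

A positive emission score means the residue is more probable under the Profile HMM than under the Null model; a negative score means the opposite.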
Using a multiple sequence alignment of experimentally verified sequences containing the architecture of interest XFP_N; XFP; XFP_C, a profile Hidden Markov Model (HMM) was created for representing members of the xylulose-5-phosphate/fructose-6-phosphate phosphoketolases (XPK-XFP). As stated in the user guide, Profile HMMs are statistical models of multiple sequence alignments. They capture position-specific information about how conserved each column of the alignment is, and which amino acid residues are most likely to occur at each position. Thus HMMs have a formal probabilistic basis. Profile HMMs for a large number of protein families are publicly available in the PFAM database (Janelia Farm Research Campus, Ashburn, Va.), see ftp://ftp.sanger.ac.uk/pub/databases/Pfam/releases/Pfam24.0/.
Eight xylulose-5-phosphate/fructose-6-phosphate phosphoketolase sequences with experimentally verified function were identified in the BRENDA database:
1. CBF76492.1 from Aspergillus nidulans FGSC A4 (SEQ ID NO: 355)
2. AAR98787.1 from Bifidobacterium longum (SEQ ID NO: 379)
3. ZP_03646196.1 from Bifidobacterium bifidum NCIMB 41171 (SEQ ID NO: 381)
4. ZP_02962870.1 from Bifidobacterium animalis subsp. lactis HNO19 (SEQ ID NO: 388)
5. ZP_786060.1 from Lactobacillus plantarum WCFS1 (SEQ ID NO: 481)
6. ZP_03940142.1 from Lactobacillus brevis subsp. gravesensis ATCC 27305 (SEQ ID NO: 486)
7. ZP_03073172.1 from Lactobacillus reuteri 100-23 (SEQ ID NO: 468)
8. YP_818922.1 from Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293 (SEQ ID NO: 504)
The BRENDA database is a freely available information system containing biochemical and molecular information on all classified enzymes as well as software tools for querying the database and calculating molecular properties. The database covers information on classification and nomenclature, reaction and specificity, functional parameters, occurrence, enzyme structure and stability, mutants and enzyme engineering, preparation and isolation, the application of enzymes, and ligand-related data (BRENDA, AMENDA and FRENDA the enzyme information system: new content and tools in 2009, Nucleic Acids Res. 2009 Jan; 37 (Database issue): D588-92. Epub 2008 Nov. 4. Chang A, Scheer M, Grote A, Schomburg I, Schomburg D). The eight sequences were used to build a profile HMM which is provided herein as Table 6.
To further identify the proteins of interest, the 436 sequences were searched with four profile HMMs: the generated XPK_XFP_HMM profile HMM provided in Table 6 as well as the three published profiles for the three domains XFP_N; XFP; XFP_C (PFAM DATABASE) described in Tables 7, 8, and 9, respectively. The 309 protein sequences whose lengths were between 650 amino acids and 900 amino acids, and which contained the three domains, were retained.
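The retention criteria in the preceding paragraph (length between 650 and 900 amino acids, and presence of all three domains) can be expressed as a simple filter. The Python sketch below is illustrative only: the record layout, the function name, and the treatment of the length bounds as inclusive are assumptions, and the domain hits are presumed to come from a prior profile HMM search.

```python
REQUIRED_DOMAINS = frozenset({"XFP_N", "XFP", "XFP_C"})


def keep_candidate(length, domains, min_len=650, max_len=900):
    """True when the protein length is in range and all three domains were hit."""
    return min_len <= length <= max_len and REQUIRED_DOMAINS <= set(domains)


def filter_candidates(records):
    """Keep records of the hypothetical form {'id', 'length', 'domains'}."""
    return [r for r in records if keep_candidate(r["length"], r["domains"])]


if __name__ == "__main__":
    demo = [
        {"id": "seq1", "length": 790, "domains": ["XFP_N", "XFP", "XFP_C"]},
        {"id": "seq2", "length": 610, "domains": ["XFP_N", "XFP", "XFP_C"]},
        {"id": "seq3", "length": 800, "domains": ["XFP_N", "XFP_C"]},
    ]
    print([r["id"] for r in filter_candidates(demo)])
```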
All 309 sequences are at least 40% identical to an experimentally verified phosphoketolase, with the exception of 12 sequences that are within 35% identity distance. However, all 309 sequences have a highly significant match to all 4 profile HMMs. The least significant matches have Evalues of 7.5E-242, 1.1E-124, 2.1E-49, and 7.8E-37 to the XPK_XFP_HMM, XFP_N, XFP, and XFP_C profile HMMs, respectively. The 309 sequences are provided in Table 5; however, it is understood that any xylulose-5-phosphate/fructose-6-phosphate phosphoketolase identifiable by the method described may be expressed in host cells as described herein. Where accession information is given as “complement (NN_NNNNN:X . . . Y)”, it should be understood to mean the reverse complement of nucleotides X to Y of the sequence with Accession number NN_NNNNN.N. Where accession information is given as “join (NNNNNN.N:X . . . Y, NNNNNN.N:Z . . . Q)”, it should be understood to mean the sequence resulting from joining nucleotides X to Y of NNNNNN.N to nucleotides Z to Q of NNNNNN.N.
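The “complement” and “join” accession notations explained above can be illustrated with a short Python sketch. It assumes 1-based, inclusive coordinates and unambiguous DNA letters; the function names and example sequences are hypothetical and no accession is actually retrieved.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def subsequence(seq: str, start: int, end: int) -> str:
    """Nucleotides start..end, counted from 1 and inclusive at both ends."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates out of range")
    return seq[start - 1:end]


def complement_region(seq: str, start: int, end: int) -> str:
    """Reverse complement of nucleotides start..end."""
    return subsequence(seq, start, end).translate(_COMPLEMENT)[::-1]


def join_regions(parts) -> str:
    """Concatenate regions given as (sequence, start, end) tuples, in order."""
    return "".join(subsequence(s, a, b) for s, a, b in parts)


if __name__ == "__main__":
    print(complement_region("ATGCCGTA", 2, 5))
    print(join_regions([("ATGCCGTA", 1, 3), ("TTTGGG", 4, 6)]))
```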
Gluconacetobacter diazotrophicus
Shewanella loihica
Shewanella amazonensis
Shewanella
Shewanella baltica
Shewanella benthic
Shewanella sediminis
Shewanella woodyi
Shewanella halifaxensis
Shewanella denitrificans
Legionella drancourtii
Ajellomyces dermatitidis
Ajellomyces dermatitidis
Ajellomyces capsulatus
Ajellomyces capsulatus
Paracoccidioides brasiliensis
Paracoccidioides brasiliensis
Uncinocarpus reesii
Coccidioides posadasii
Microsporum canis
Aspergillus oryzae
Aspergillus niger
Neosartorya fischer
Aspergillus clavatu
Aspergillus terreus
Aspergillus nidulan
Penicillium chrysogenum
Talaromyces stipitatus
Penicillium marneffei
Aspergillus fumigatus
Botryotinia fuckeliana
Sclerotinia sclerotiorum
Gibberella zeae
Nectria haematococca
Verticillium albo-atrum
Neurospora crassa
Magnaporthe grisea
Podospora anserine
Coprinopsis cinerea
Schizosaccharomyces pombe
Schizosaccharomyces japonicus
Cryptococcus neoformans neoformans
Ustilago maydis
Microcoleus chthonoplastes
Actinosynnema mirum
Atopobium rimae
Atopobium parvulum
Atopobium vaginae
Collinsella stercoris
Bifidobacterium longum
Bifidobacterium breve
Bifidobacterium bifidum
Bifidobacterium angulatum
Bifidobacterium catenulatum
Bifidobacterium
Bifidobacterium pullorum
Gardnerella vaginalis
Bifidobacterium gallicum
Bifidobacterium animalis lactis
Bifidobacterium pseudolongum Globosum
Atopobium vaginae
Frankia
Frankia alni
Rhodococcus opacus
Rhodococcus jostii
Rhodococcus erythropolis
Arthrobacter aurescens
Propionibacterium freudenreichii Shermanii
Pseudomonas syringae
Chitinophaga pinensis
Verrucomicrobiae bacterium
Cyanothece
Cyanothece
Cyanothece
Planktothrix rubescens
Arthrospira maxima
Cyanothece
Cyanothece
Microcystis aeruginosa
Lyngbya
Nostoc
Synechococcus
Acaryochloris marina
Synechococcus
Burkholderia graminis
Burkholderia
Synechococcus
Cyanobium
Synechococcus
Synechococcus
Synechococcus
Burkholderia phytofirmans
Burkholderia xenovorans
Burkholderia graminis
Burkholderia
Burkholderia phymatum
Acidobacterium capsulatum
Leptospirillum
Leptospirillum ferrodiazotrophum
Synechococcus elongatus
Thermosynechococcus elongatus
Methylococcus capsulatus
Cyanothece
Synechocystis
Cyanothece
Allochromatium vinosum
Mariprofundus ferrooxydans
Gallionella ferruginea
Aspergillus clavatu
Neosartorya fischer
Aspergillus oryzae
Aspergillus niger
Aspergillus terreus
Penicillium chrysogenum
Penicillium marneffei
Talaromyces stipitatus
Botryotinia fuckeliana
Sclerotinia sclerotiorum
Pyrenophora tritici-repentis
Phaeosphaeria nodorum
Cryptococcus neoformans neoformans
Gibberella zeae
Nectria haematococca
Metarhizium anisopliae
Neurospora crassa
Podospora anserine
Acidithiobacillus ferrooxidans
Acidiphilium cryptum
Thermotoga lettingae
Dictyoglomus turgidum
Nitrobacter hamburgensis
Blastopirellula marina
Marinomonas
Rhodopirellula baltica
Legionella drancourtii
Streptomyces
Lactobacillus reuter
Lactobacillus vaginalis
Lactobacillus reuter
Lactobacillus coleohominis
Lactobacillus fermentum
Lactobacillus acidophilus
Lactobacillus crispatus
Lactobacillus ultunensis
Lactobacillus crispatus
Lactobacillus gasseri
Lactobacillus iners
Lactobacillus delbrueckii bulgaricus
Lactobacillus jensenii
Lactobacillus buchneri
Oenococcus oeni
Lactobacillus plantarum
Lactobacillus sakei subsp. sakei
Pediococcus pentosaceus
Lactobacillus rhamnosus
Lactobacillus brevis gravesensis
Lactobacillus salivarius
Lactobacillus ruminis
Bacillus coagulans
Kingella oralis
Granulicatella adiacens
Streptococcus gordonii Challis
Streptococcus agalactiae
Listeria grayi
Enterococcus casseliflavus
Enterococcus gallinarum
Enterococcus faecium
Mycoplasma fermentans
Mycoplasma arthritidis
Mycoplasma agalactiae
Lactobacillus casei
Lactobacillus plantarum
Lactobacillus salivarius
Leuconostoc mesenteroides mesenteroides
Lactobacillus brevis
Weissella paramesenteroides
Leuconostoc citreum
Leuconostoc mesenteroides mesenteroides
Lactococcus lactis Lactis
Oenococcus oeni
Clostridium butyricum
Clostridium carboxidivorans
Clostridium acetobutylicum
Coprococcus comes
Ruminococcus
Roseburia intestinalis
Bacteroides capillosus
Phaeodactylum tricornutum
Rhodopseudomonas palustris
Rhodopseudomonas palustris
Rhodopseudomonas palustris
Rhodopseudomonas palustris
Polaromonas naphthalenivorans
Stigmatella aurantiaca
Pseudomonas putid
Arthrobacter
Arthrobacter chlorophenolicus
Sanguibacter keddieii
Beutenbergia cavernae
Jonesia denitrifican
Xylanimonas cellulosilytica
Cellulomonas flavigena
Mycobacterium gilvum
Mycobacterium vanbaalenii
Brachybacterium faecium
Kytococcus sedentarius
Clavibacter michiganensis michiganensis
Salinispora tropica
Salinispora arenicola
Micromonospora
Mycobacterium smegmatis
Mycobacterium
Mycobacterium kansasii
Mycobacterium marinum
Mycobacterium avium paratuberculosis
Mycobacterium intracellulare
Mycobacterium abscessus
Janibacter
Thermobifida fusca
Thermomonospora curvata
Streptosporangium roseum
Nocardiopsis dassonvillei dassonvillei
Stackebrandtia nassauensis
Actinosynnema mirum
Streptomyces coelicolor
Streptomyces ambofaciens
Streptomyces griseoflavus
Streptomyces sviceus
Streptomyces scabiei
Streptomyces avermitilis
Streptomyces ghanaensis
Streptomyces viridochromogenes
Streptomyces hygroscopicus
Streptomyces flavogriseus
Streptomyces griseus griseus
Streptomyces albus
Streptomyces
Streptomyces
Kribbella flavida
Nocardia farcinica
Frankia
Frankia
Catenulispora acidiphila
Acidothermus cellulolyticus
Nocardioides
Saccharopolyspora erythraea
Rhizobium leguminosarum trifolii
Rhizobium leguminosarum trifolii
Rhizobium etli
Rhizobium etli
Agrobacterium radiobacter
Brucella
Ochrobactrum intermedium
Ochrobactrum anthropi
Bradyrhizobium
Bradyrhizobium
Bradyrhizobium japonicum
Nitrobacter hamburgensis
Methylobacterium extorquens
Mesorhizobium
Mesorhizobium opportunistum
Nitrobacter winogradskyi
Methylobacterium radiotolerans
Methylobacterium radiotolerans
Methylobacterium extorquens
Methylobacterium extorquens
Methylobacterium nodulans
Methylobacterium nodulans
Methylobacterium
Variovorax paradoxus
Oceanicola granulosus
Nodularia spumigena
Nostoc punctiforme
Anabaena variabilis
Nostoc azollae
Gloeobacter violaceus
Synechococcus
Sinorhizobium medicae WSM419
Rhizobium leguminosarum viciae 3841
Sinorhizobium meliloti
Verrucomicrobium spinosum
Methylobacterium extorquens
Nitrobacter
Gemmata obscuriglobus
Desulfomicrobium baculatum
Oxalobacter formigenes
Oxalobacter formigenes
Solibacter usitatus
Pelodictyon phaeoclathratiforme
Prosthecochloris aestuarii
Chlorobium limicola
Chlorobium tepidum
Chlorobium ferrooxidans
Chlorobium luteolum
Dechloromonas aromatica
Thiobacillus denitrificans
Methylobacillus flagellatus
Nitrosomonas europaea
Nitrosomonas eutropha
Nitrosospira multiformis
Nitrosococcus oceani
Candidatus Protochlamydia amoebophila
Sinorhizobium meliloti
Sulfurospirillum deleyianum
Mesorhizobium loti
Oligotropha carboxidovorans
Beijerinckia indica indica
Numerous examples of polynucleotides, genes and/or polypeptides encoding phosphotransacetylase are known in the art and can be used in relation to the recombinant host cells disclosed herein. In embodiments, the phosphotransacetylase can be EutD from Lactobacillus plantarum. In embodiments, the phosphotransacetylase can be the phosphotransacetylase from Bacillus subtilis. This phosphotransacetylase has a specific activity of 1371 mmol/min/mg and a Km of 0.06 mM for acetyl-CoA (Rado and Hoch, Biochim. Biophys. Acta. 321: 114-25; 1973). In addition, the equilibrium constant (Keq) of this reaction was found to be 154±14 in favor of the formation of acetyl-CoA according to the following formula:
In embodiments, host cells comprise a polypeptide having at least about 80%, at least about 85%, at least about 90%, or at least about 99% identity to a polypeptide of Table 10 or an active fragment thereof or a polynucleotide encoding such a polypeptide. In embodiments, the phosphotransacetylase can be a polypeptide having at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% identity to SEQ ID NO: 1472 or an active fragment thereof. In other embodiments, a polynucleotide, gene and/or polypeptide encoding phosphotransacetylase can include, but is not limited to, a sequence provided in the following Tables 10 or 12.
DAAFVEKVGLQKAPGSKVAGHANVFVFPELQSGNIGYKIAQRFGHFEAVGPVLQGLNK
bacillus
SDLSRGCSEEDVYKVAIITAAQGLA
plantarum
LVSGAAHSTADTVRPALQIIKTKEGVKKTSGVFIMARGEEQYVFADCAINIAPDSQDL
Bacillus
subtilis
DLSRGCNAEDVYNLALITAAQAL
Additional suitable phosphotransacetylase target gene coding regions and proteins were identified by diversity searching and clustering. A BLAST search of the non-redundant GenBank protein database (NR) was performed with the L. plantarum EutD protein as a query sequence. A BLAST cut-off (Evalue) of 0.01 was applied. This search resulted in 2124 sequence matches. Redundancy reduction was achieved by clustering proteins with the CD-HIT program with parameters set at 95% sequence identity and 90% length overlap. The longest seed sequence, representative of each cluster, was retained for further analysis. Clustering reduced the number of protein sequences to 1336. Further clean-up by removing sequences shorter than 280 amino acids and sequences longer than 795 amino acids resulted in 1231 sequences.
The BRENDA database was queried for experimentally verified phosphate acetyltransferases. Thirteen were found in the following organisms: S. enterica, E. coli K12, V. parvula, C. kluyveri, C. acetobutylicum, C. thermocellum, M. thermophila, S. pyogenes, B. subtilis, L. fermentum, L. plantarum, L. sanfranciscensis, R. palustris, E. coli.
Experimentally verified phosphate acetyltransferases (EC 2.3.1.8) belong to the PTA_PTB Pfam family. However, the PTA_PTB domain is present in 13 distinct architectures (http://pfam.janelia.org/family/PF01515, Pfam database version 24). The motivation for investigating the domain architecture is to determine which of the proteins that were identified by BLAST search are likely to be phosphate acetyltransferases.
Experimentally verified sequences extracted from the BRENDA database as well as sequences retained after the CD-HIT clustering and clean-up, were searched against the Pfam database to determine their domain architecture. Pfam is a collection of multiple sequence alignments and profile hidden Markov models (HMMs). Each Pfam HMM represents a protein family or domain. By searching a protein sequence against the Pfam library of HMMs, it is possible to determine which domains it carries i.e. its domain architecture. (The Pfam protein families database: R. D. Finn, J. Tate, J. Mistry, P. C. Coggill, J. S. Sammut, H. R. Hotz, G. Ceric, K. Forslund, S. R. Eddy, E. L. Sonnhammer and A. Bateman Nucleic Acids Research (2008) Database Issue 36:D281-D288)
Twelve of the experimentally verified proteins only contained the PTA_PTB domain. Two sequences, from R. palustris and E. coli, contained two domains PTA_PTB and DRTGG, a domain of unknown function. Therefore, from the CD-HIT clustering results, proteins that contained either the PTA_PTB domain only (Group 1: 549 sequences) or a combination of PTA_PTB+DRTGG domains (Group 2: 201 sequences) were chosen.
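The selection of Group 1 and Group 2 by domain architecture can be sketched as follows. This Python sketch is illustrative only; it assumes the domain content of each protein has already been determined by a search against the Pfam HMM library, and the function name and input layout are hypothetical.

```python
def group_by_architecture(architectures):
    """Split proteins into Group 1 (PTA_PTB only) and Group 2 (PTA_PTB + DRTGG).

    `architectures` maps a sequence identifier to the list of Pfam domains
    found in that sequence. Proteins with any other architecture are dropped.
    """
    group1, group2 = [], []
    for seq_id, domains in architectures.items():
        found = set(domains)
        if found == {"PTA_PTB"}:
            group1.append(seq_id)
        elif found == {"PTA_PTB", "DRTGG"}:
            group2.append(seq_id)
    return group1, group2


if __name__ == "__main__":
    demo = {
        "p1": ["PTA_PTB"],
        "p2": ["DRTGG", "PTA_PTB"],
        "p3": ["PTA_PTB", "OtherDomain"],
    }
    print(group_by_architecture(demo))
```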
Furthermore, the PTA_PTB domain, as the name indicates, is actually not specific to phosphate acetyltransferases. The domain is also a signature for phosphate butyryltransferases (EC 2.3.1.19). Methods to distinguish between the two subfamilies, acetyltransferases and butyryltransferases, were employed and are as follows:
To further characterize the relationship among the sequences, multiple sequence alignment (MSA), phylogenetic analysis, profile HMMs and GroupSim analysis were performed. For this set of analyses, the phosphate acetyltransferases are split in two groups. Group 1 contains phosphate acetyltransferases with the PTA_PTB domain only, while Group 2 contains phosphate acetyltransferases with PTA_PTB+DRTGG. The motivation here is to generate groups with similar lengths.
Clustal X, version 2.0 was used for sequence alignments with default parameters. (Thompson J D, et al. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. (1997) 25:4876-4882.)
Alignment results were used to compute percent sequence identities (% IDs) to a reference sequence. When the sequence from L. plantarum is taken as the reference, % IDs range from as low as 10.5% to 75.6% for the closest sequence. Alignment results also provided the basis for reconstructing phylogenetic trees. The Neighbor Joining method, available in the Jalview package version 2.3, was used to produce the trees, and computed trees were visualized in MEGA 4 (Tamura, Dudley, Nei, and Kumar 2007). The Neighbor Joining method reconstructs a phylogenetic tree and computes the lengths of its branches. In each stage, the two nearest nodes of the tree (the term “nearest nodes” will be defined in the following paragraphs) are chosen and defined as neighbors in the tree. This is done recursively until all of the nodes are paired together (Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987 July; 4(4):406-25). Jalview Version 2 is a system for interactive editing, analysis and annotation of multiple sequence alignments (Waterhouse, A. M., Procter, J. B., Martin, D. M. A, Clamp, M. and Barton, G. J. (2009) “Jalview Version 2—a multiple sequence alignment editor and analysis workbench” Bioinformatics 25 (9) 1189-1191). The MEGA software provides tools for exploring, discovering, and analyzing DNA and protein sequences from an evolutionary perspective. MEGA4 enables the integration of sequence acquisition with evolutionary analysis. It contains an array of input data and multiple results explorers for visual representation; the handling and editing of sequence data, sequence alignments, and inferred phylogenetic trees; and estimated evolutionary distances. The results explorers allow users to browse, edit, summarize, export, and generate publication-quality captions for their results.
MEGA 4 also includes distance matrix and phylogeny explorers as well as advanced graphical modules for the visual representation of input data and output results (Tamura K, Dudley J, Nei M & Kumar S (2007) MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Molecular Biology and Evolution 24:1596-1599).
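By way of a non-limiting sketch, a percent identity to a reference sequence can be computed from two rows of a multiple sequence alignment as shown below. The choice of denominator is an assumption: here only columns in which neither sequence has a gap are counted, whereas alignment tools may use other conventions. The function name and the short example sequences are hypothetical.

```python
def percent_identity(aligned_ref, aligned_query, gap="-"):
    """Percent identity between two rows taken from the same alignment.

    Assumption: columns where either row has a gap are skipped, so the
    denominator is the number of columns aligned residue-to-residue.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("rows from one alignment must have equal length")
    matches = 0
    compared = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res == gap or query_res == gap:
            continue
        compared += 1
        if ref_res == query_res:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments: 4 identical residues out of 5 compared columns.
print(percent_identity("MKV-LA", "MRVQLA"))
```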
Taken together, % IDs and the generated tree (
Based on experimentally verified sequences contained within each of the Subfamilies, Subfamily 1 and Subfamily 2 were determined to represent phosphate butyryltransferases (PTB) and phosphate acetyltransferases (PTA), respectively.
Discrimination between Subfamily 1 members and Subfamily 2 members was also performed by GroupSim analysis (Capra and Singh (2008) Bioinformatics 24: 1473-1480). The GroupSim method identifies amino acid residues that determine a protein's functional specificity. In a multiple sequence alignment (MSA) of a protein family whose sequences are divided into multiple Subfamilies, amino acid residues that distinguish between the functional Subfamilies of sequences can be identified. The method takes an MSA and known specificity groupings as input, and assigns a score to each amino acid position in the MSA. Higher scores indicate a greater likelihood that an amino acid position is a specificity determining position (SDP).
GroupSim analysis performed on the MSA of 537 sequences (divided into Subfamily 1 and Subfamily 2 by the phylogenetic analysis, above) identified highly discriminating positions. Listed in Table 11 are positions (Pos) having scores greater than 0.7, where a perfect score of 1.0 would indicate that all proteins within a Subfamily have the listed amino acid in the specified position and that between Subfamilies the amino acid is always different. The “Pattern” columns give the amino acid(s) in single letter code. Numbers between parentheses indicate frequency of occurrence of each amino acid at the particular position. The amino acid position number in column 1 is for the PTA protein sequence from Lactobacillus plantarum, the representative protein of group 2 with a GI#28377658 (SEQ ID NO: 1472).
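The idea of scoring an alignment column by how well it separates predefined groups can be illustrated with the simplified sketch below. This is not the GroupSim scoring of Capra and Singh, which relies on a substitution-matrix similarity and further refinements; the sketch substitutes plain residue identity, and the function names are hypothetical. It does reproduce the property stated above that a column conserved within each Subfamily but different between Subfamilies scores 1.0.

```python
from itertools import combinations


def mean_identity(pairs):
    """Fraction of residue pairs that are identical (1.0 if there are no pairs)."""
    pairs = list(pairs)
    if not pairs:
        return 1.0
    return sum(a == b for a, b in pairs) / len(pairs)


def specificity_score(column_by_group):
    """Identity-based stand-in for a group-specificity score of one MSA column.

    column_by_group maps a group name to the residues observed in that
    group at the column. Score = mean within-group identity minus
    between-group identity.
    """
    groups = list(column_by_group.values())
    within = sum(mean_identity(combinations(g, 2)) for g in groups) / len(groups)
    between = mean_identity(
        (a, b) for g1, g2 in combinations(groups, 2) for a in g1 for b in g2
    )
    return within - between


# Hypothetical columns: a perfectly discriminating one and a partly mixed one.
print(specificity_score({"subfamily1": "GGGG", "subfamily2": "AAAA"}))
print(specificity_score({"subfamily1": "GGGA", "subfamily2": "AAAA"}))
```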
An alternative structure/function characterization of the PTA and PTB subfamilies of enzymes was performed using the HMMER software package (the theory behind profile HMMs is described in R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological sequence analysis: probabilistic models of proteins and nucleic acids, Cambridge University Press, 1998; Krogh et al., 1994; J. Mol. Biol. 235:1501-1531), following the user guide which is available from HMMER (Janelia Farm Research Campus, Ashburn, Va.).
Using a multiple sequence alignment of the experimentally verified sequences (containing the PTA_PTB domain only) in Subfamily 2, a profile Hidden Markov Model (HMM) was created for representing Subfamily 2 members. The sequences were:
1. BAB19267.1 from Lactobacillus sanfranciscensis (SEQ ID NO: 1475)
2. NP_784550.1 from Lactobacillus plantarum WCFS1 (SEQ ID NO: 1472)
3. ZP_03944466.1 from Lactobacillus fermentum ATCC 14931 (SEQ ID NO: 1453)
4. NP_391646.1 from Bacillus subtilis subsp. subtilis str. 168 (SEQ ID NO: 1422)
5. AAA72041.1 from Methanosarcina thermophila (SEQ ID NO: 1277)
6. ZP_03152606.1 from Clostridium thermocellum DSM 4150 (SEQ ID NO: 1275)
7. NP_348368.1 from Clostridium acetobutylicum ATCC 824 (SEQ ID NO: 1206)
8. YP_001394780.1 from Clostridium kluyveri DSM 555 (SEQ ID NO: 1200)
9. ZP_03855267.1 from Veillonella parvula DSM 2008 (SEQ ID NO: 1159)
10. YP_149725.1 from Salmonella enterica subsp. enterica serovar Paratyphi A str. ATCC 9150 (SEQ ID NO: 1129)
The Profile HMM was built as follows: The 10 seed sequences (sequences representing experimentally verified function) that are in Subfamily 2 were aligned using Clustal X (interface to Clustal W) with default parameters. The hmmbuild program was run on each set of the aligned sequences using default parameters. hmmbuild reads the multiple sequence alignment file, builds a new Profile HMM, and saves the Profile HMM to file. Using this program an un-calibrated profile was generated from the multiple alignment for each set of subunit sequences described above.
The Profile HMM was read using hmmcalibrate, which scores a large number of synthesized random sequences with the Profile (the default number of synthetic sequences used is 5,000), fits an extreme value distribution (EVD) to the histogram of those scores, and re-saves the HMM file now including the EVD parameters. These EVD parameters (μ and λ) are used to calculate the E-values of bit scores when the profile is searched against a protein sequence database. hmmcalibrate writes two parameters into the HMM file on a line labeled “EVD”: these parameters are the μ (location) and λ (scale) parameters of an extreme value distribution (EVD) that best fits a histogram of scores calculated on randomly generated sequences of about the same length and residue composition as SWISS-PROT. This calibration was done once for the Profile HMM.
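As a hedged illustration of how the EVD parameters relate a bit score to an E-value, the sketch below uses the standard Gumbel tail probability P(S ≥ x) = 1 − exp(−exp(−λ(x − μ))) and multiplies it by the number of sequences Z in the search space. The numerical values of μ and λ shown are invented for the example and are not taken from Table 14; the function names are hypothetical, and the actual HMMER implementation may apply additional numerical safeguards.

```python
import math


def evd_pvalue(bit_score, mu, lam):
    """Probability that a random sequence scores at least bit_score.

    Gumbel (extreme value) tail: 1 - exp(-exp(-lam * (bit_score - mu))).
    -expm1(-y) evaluates 1 - exp(-y) accurately when y is very small.
    """
    return -math.expm1(-math.exp(-lam * (bit_score - mu)))


def evalue(bit_score, mu, lam, num_sequences):
    """Expected number of random hits at or above bit_score among Z sequences."""
    return num_sequences * evd_pvalue(bit_score, mu, lam)


# Illustrative (made-up) calibration parameters; Z set to one billion.
MU, LAMBDA, Z = -50.0, 0.25, 1_000_000_000
for score in (100.0, 300.0, 500.0):
    print(score, evalue(score, MU, LAMBDA, Z))
```

Because the tail probability decreases monotonically with the bit score, a higher-scoring sequence always receives a lower (more significant) E-value for a fixed Z.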
The calibrated profile HMM for the Subfamily 2 set is provided as Table 14. The Profile HMM table gives the probability of each amino acid occurring at each position in the amino acid sequence. The amino acids are represented by the one letter code. The first line for each position reports the match emission scores: probability for each amino acid to be in that state (highest score is highlighted). The second line reports the insert emission scores, and the third line reports on state transition scores: M→M, M→I, M→D; I→M, I→I; D→M, D→D; B→M; M→E. Table 14 shows that in the Subfamily 2 profile HMM, methionine has scores of 3792 and 4481, respectively, for being in the first two positions.
The Subfamily 2 profile HMM was evaluated using hmmsearch, with the Z parameter set to one billion, for the ability to discriminate Subfamily 1 members from those of Subfamily 2. The hmmsearch program takes the HMM file for the Subfamily 2 profile HMM and all the sequences from both Subfamilies and assigns an E-value score to each sequence. This E-value score is a measure of fit to the Profile HMM, with a lower score being a better fit. The Profile HMM distinguished Subfamily 2 members from Subfamily 1 members, and there was a large margin of E-value difference between the worst scoring Subfamily 2 member (5e-34) and the best scoring Subfamily 1 member (4.3e-07). This analysis shows that the Profile HMM prepared for Subfamily 2 phosphate acetyltransferases (PTA) distinguishes PTA sequences from phosphate butyryltransferase (PTB) protein sequences.
Based on these analyses, 361 phosphate acetyltransferase sequences (PTA_PTB domain only) were identified and are provided in Table 12a.
Eubacterium saphenum
Chthoniobacter flavus
Akkermansia muciniphila
Citrobacter
Citrobacter koseri
Salmonella enterica subsp. enterica serovar
Salmonella enterica subsp. arizonae
Escherichia coli str.
Klebsiella pneumoniae
Yersinia intermedia
Photobacterium profundum
Shewanella benthica
Marinobacter aquaeolei
Desulfotalea psychrophila
Rhodococcus opacus
Rhodococcus jostii
Streptomyces
Gemmatimonas aurantiaca
Clostridiales bacterium
Clostridium papyrosolvens
Slackia heliotrinireducens
Ruegeria
Sagittula stellata
Fusobacterium nucleatum subsp. nucleatum
Fusobacterium
Fusobacterium periodonticum
Fusobacterium
Fusobacterium
Fusobacterium varium
Fusobacterium mortiferum
Arcobacter butzleri
Leptotrichia buccalis
Leptotrichia hofstadii
Leptotrichia goodfellowii
Streptobacillus moniliformis
Veillonella parvula
Acidaminococcus
Treponema denticola
Treponema vincentii
Treponema pallidum subsp. pallidum
Brachyspira murdochii
Brachyspira hyodysenteriae
Candidatus Cloacamonas acidaminovorans
Perkinsus marinus
Borrelia turicatae
Borrelia hermsii
Borrelia duttonii
Borrelia spielmanii
Borrelia afzelii
Borrelia garinii
Borrelia valaisiana
Borrelia burgdorferi
Mycoplasma mycoides subsp. mycoides
Mycoplasma capricolum subsp. capricolum
Mesoplasma florum
Spiroplasma citri
Mycoplasma genitalium
Mycoplasma pneumoniae
Mycoplasma gallisepticum
Mycoplasma penetrans
Mycoplasma hyopneumoniae
Mycoplasma conjunctivae
Mycoplasma agalactiae
Mycoplasma fermentans
Mycoplasma synoviae
Mycoplasma pulmonis
Mycoplasma mobile
Mycoplasma agalactiae
Buchnera aphidicola str. cedri)
Clostridium botulinum
Clostridium beijerinckii
Clostridium
Clostridium butyricum
Clostridium perfringens
Clostridium carboxidivorans
Clostridium kluyveri
Clostridium sporogenes
Clostridium tetani
Clostridium botulinum
Clostridium novyi
Clostridium cellulovorans
Clostridium acetobutylicum
Thermoanaerobacterium saccharolyticum
Thermoanaerobacterium thermosaccharolyticum
Thermoanaerobacter tengcongensis
Thermoanaerobacter
Halothermothrix orenii
Desulfotomaculum acetoxidans
Natranaerobius thermophilus
Carboxydothermus hydrogenoformans
Bacteroides
Parabacteroides merdae
Porphyromonas gingivalis
Porphyromonas uenonis
Porphyromonas endodontalis
Bacteroides uniformis
Bacteroides eggerthii
Bacteroides cellulosilyticus
Bacteroides fragilis
Bacteroides
Bacteroides coprophilus
Bacteroides plebeius
Bacteroides vulgatus
Prevotella tannerae
Prevotella bergensis
Prevotella veroralis
Prevotella
Candidatus Azobacteroides pseudotrichonymphae
Syntrophomonas wolfei
Collinsella aerofaciens
Collinsella stercoris
Collinsella intestinalis
Atopobium rimae
Atopobium parvulum
Atopobium vaginae
Oribacterium sinus
Abiotrophia defectiva
Oribacterium
Clostridium
Ruminococcus
Ruminococcus obeum
Bryantella formatexigens
Blautia hydrogenotrophica
Clostridium nexile
Ruminococcus gnavus
Ruminococcus lactaris
Ruminococcus torques
Clostridium scindens
Clostridium hylemonae
Dorea formicigenerans
Dorea longicatena
Clostridium phytofermentans
Clostridiales bacterium
Clostridium bolteae
Butyrivibrio crossotus
Eubacterium ventriosum
Eubacterium eligens
Helicobacter pullorum
Helicobacter canadensis
Helicobacter winghamensis
Helicobacter hepaticus
Helicobacter cinaedi
Anaerostipes caccae
Clostridium
Coprococcus eutactus
Epulopiscium
Eggerthella lenta
Cryptobacterium curtum
Slackia heliotrinireducens
Clostridium papyrosolvens
Clostridium thermocellum
Caldicellulosiruptor saccharolyticus
Methanosarcina thermophila
Methanosarcina acetivorans
Methanosarcina barkeri
Roseobacter litoralis
Roseobacter denitrificans
Dinoroseobacter shibae
Rhodobacteraceae bacterium
Silicibacter lacuscaerulensis
Sinorhizobium medicae
Sinorhizobium meliloti
Ochrobactrum intermedium
Ochrobactrum anthropi
Burkholderia phytofirmans
Burkholderia xenovorans
Burkholderia phymatum
Ralstonia eutropha
Cupriavidus taiwanensis
Burkholderia multivorans
Burkholderia cenocepacia
Photobacterium profundum
Lutiella nitroferrum
Vibrionales bacterium
Vibrio splendidus
Vibrio shilonii
Vibrio coralliilyticus
Paracoccus denitrificans
Rhodobacter sphaeroides
Castellaniella defragrans
Roseovarius nubinhibens
Ruegeria pomeroyi
Roseobacter
Roseobacter
Roseobacter litoralis
Jannaschia
Rhodobacterales bacterium
Candidatus Solibacter usitatus
Desulfuromonas acetoxidans
Pelobacter carbinolicus
Geobacter
Geobacter uraniireducens
Geobacter sulfurreducens
Geobacter metallireducens
Pelobacter propionicus
Geobacter lovleyi
Geobacter
Geobacter
Pelobacter carbinolicus
Denitrovibrio acetiphilus
Chloroherpeton thalassium
Victivallis vadensis
Helicobacter pylori
Helicobacter pylori
Helicobacter pylori
Helicobacter pylori
Helicobacter acinonychis
Campylobacter jejuni subsp. jejuni
Campylobacter coli
Campylobacter upsaliensis
Campylobacter lari
Campylobacter hominis
Campylobacter gracilis
Campylobacter fetus subsp. fetus 82-40
Campylobacter concisus
Campylobacter curvus
Campylobacter showae
Bifidobacterium pseudocatenulatum
Bifidobacterium dentium
Bifidobacterium adolescentis
Bifidobacterium angulatum
Bifidobacterium breve
Bifidobacterium longum subsp. infantis
Bifidobacterium longum subsp. infantis
Bifidobacterium bifidum
Gardnerella vaginalis
Bifidobacterium animalis subsp. lactis
Bifidobacterium gallicum
Actinomyces odontolyticus
Actinomyces coleocanis
Corynebacterium glutamicum
Corynebacterium efficiens
Corynebacterium diphtheriae
Corynebacterium matruchotii
Corynebacterium glucuronolyticum
Corynebacterium genitalium
Corynebacterium lipophiloflavum
Corynebacterium accolens
Corynebacterium tuberculostearicum
Corynebacterium striatum
Corynebacterium aurimucosum
Corynebacterium jeikeium
Corynebacterium urealyticum
Corynebacterium kroppenstedtii
Corynebacterium amycolatum
Neisseria flavescens
Neisseria sicca
Neisseria meningitidis
Kingella oralis
Rhodospirillum rubrum
Wigglesworthia glossinidia brevipalpis
Buchnera aphidicola str. pistaciae)
Fibrobacter succinogenes subsp. succinogenes S85
Mycobacterium tuberculosis
Capnocytophaga gingivalis
Candidatus Sulcia muelleri
Corynebacterium glutamicum
Mobiluncus mulieris
Mobiluncus curtisii
Eubacterium hallii
Eubacterium hallii
Faecalibacterium prausnitzii
Bacteroides capillosus
Roseburia inulinivorans
Roseburia intestinalis
Eubacterium rectale
Clostridium
Shuttleworthia satelles
Eubacterium biforme
Eubacterium dolichum
Eubacterium dolichum
Anaerococcus hydrogenalis
Anaerococcus vaginalis
Anaerococcus tetradius
Anaerococcus prevotii
Anaerococcus lactolyticus
Streptococcus pyogenes
Streptococcus pyogenes
Streptococcus uberis
Streptococcus equi subsp. zooepidemicus
Streptococcus mutans
Streptococcus infantarius infantarius
Streptococcus agalactiae
Streptococcus salivarius
Streptococcus thermophilus
Streptococcus pneumoniae
Streptococcus
Streptococcus suis
Lactobacillus johnsonii
Lactobacillus acidophilus
Lactobacillus ultunensis
Lactobacillus crispatus
Lactobacillus helveticus
Lactobacillus jensenii
Lactobacillus jensenii
Lactobacillus delbrueckii subsp. bulgaricus
Lactobacillus iners
Bacillus subtilis subsp. subtilis str. 168
Bacillus amyloliquefaciens
Bacillus licheniformis
Bacillus pumilus
Anoxybacillus flavithermus
Geobacillus
Geobacillus thermodenitrificans
Geobacillus kaustophilus
Bacillus
Bacillus coahuilensis
Bacillus sp.
Oceanobacillus iheyensis
Bacillus cereus
Listeria monocytogenes
Listeria grayi
Bacillus halodurans
Bacillus clausii
Exiguobacterium
Exiguobacterium sibiricum
Bacillus selenitireducens
Staphylococcus epidermidis
Staphylococcus capitis
Staphylococcus warneri
Staphylococcus epidermidis
Staphylococcus aureus
Staphylococcus haemolyticus
Staphylococcus hominis
Staphylococcus xylosus
Staphylococcus saprophyticus subsp. saprophyticus
Staphylococcus carnosus subsp. carnosus
Macrococcus caseolyticus
Lactobacillus fermentum
Lactobacillus coleohominis
Lactobacillus vaginalis
Lactobacillus reuteri
Lactobacillus antri
Leuconostoc mesenteroides subsp. mesenteroides
Leuconostoc citreum
Weissella paramesenteroides
Oenococcus oeni
Granulicatella adiacens
Granulicatella elegans
Carnobacterium
Enterococcus gallinarum
Enterococcus faecalis
Enterococcus faecium
Lactobacillus sakei subsp. sakei 23K
Catonella morbi
Lactococcus lactis subsp. cremoris
Lactobacillus casei
Lactobacillus plantarum
Lactobacillus brevis
Lactobacillus hilgardii
Lactobacillus sanfranciscensis
Lactobacillus ruminis
Lactobacillus salivarius
Erysipelothrix rhusiopathiae
Pediococcus pentosaceus
Parvimonas micra
Finegoldia magna
Bacillus coagulans
Gemella haemolysans
In addition, 201 phosphate acetyltransferase sequences that are characterized by two domains (DRTGG and PTA_PTB) are provided in Table 12b. MSA and phylogenetic analysis were performed as described above. Percent identity with respect to experimentally verified (or human curated) sequences is equal to or larger than 40%, except for 4 sequences derived from plant organisms. Furthermore, an hmmsearch of the 201 sequences against the profile HMM of Subfamily 2 (Table 14) clearly indicates that all Group 2 sequences belong to the PTA subfamily (least significant E-value is 4.1e-93).
Kineococcus radiotolerans
Reinekea blandensis
Teredinibacter turnerae
Marinobacter aquaeolei
Hahella chejuensis
Pseudomonas mendocina
Pseudomonas aeruginosa
Pseudomonas syringae pv.
Pseudomonas fluorescens
Pseudomonas entomophila
Azotobacter vinelandii
Pseudomonas stutzeri
Nitrosomonas europaea
Azotobacter vinelandii
Deinococcus deserti
Deinococcus geothermalis
Deinococcus radiodurans
Rhodoferax ferrireducens
Rhodopseudomonas palustris
Rhodopseudomonas palustris
Rhodopseudomonas palustris
Burkholderia oklahomensis
Rhodospirillum rubrum
Rhodopseudomonas palustris
Chromobacterium violaceum
Lutiella nitroferrum
Psychrobacter
Psychrobacter cryohalolentis
Enhydrobacter aerosaccus
Acinetobacter radioresistens
Acinetobacter
Acinetobacter
Acinetobacter
Anaeromyxobacter
Anaeromyxobacter dehalogenans
Mannheimia succiniciproducens
Actinobacillus succinogenes
Aggregatibacter aphrophilus
Haemophilus influenzae
Haemophilus somnus
Pasteurella multocida multocida
Pasteurella dagmatis
Actinobacillus pleuropneumoniae
Actinobacillus minor
Haemophilus ducreyi
Mannheimia haemolytica
Haemophilus parasuis
Pantoea
Erwinia tasmaniensis
Sodalis glossinidius
Dickeya dadantii
Pectobacterium wasabiae
Dickeya dadantii
Yersinia pestis
Serratia proteamaculans
Edwardsiella ictaluri
Proteus mirabilis
Photorhabdus luminescens laumondii
Klebsiella pneumoniae
Enterobacter
Cronobacter turicensis
Escherichia coli
Candidatus Hamiltonella defensa 5AT pisum)
Pectobacterium carotovorum brasiliensis
Photobacterium
Photobacterium profundum
Grimontia hollisae
Vibrio furnissii
Vibrio metschnikovii
Vibrio
Vibrio vulnificus
Vibrio shilonii
Vibrio splendidus
Aliivibrio salmonicida
Vibrio cholerae bv. albensis
Aeromonas salmonicida salmonicida
Tolumonas auensis
Psychromonas
Psychromonas ingrahamii
Shewanella sediminis
Shewanella woodyi
Shewanella loihica
Shewanella halifaxensis
Shewanella
Shewanella amazonensis
Shewanella frigidimarina
Shewanella denitrificans
Shewanella sediminis
Shewanella halifaxensis
Alteromonas macleodii
Pseudoalteromonas atlantica
Alteromonadales bacterium
Pseudoalteromonas tunicata
Colwellia psychrerythraea
Marinomonas
Marinomonas
Dichelobacter nodosus
Cardiobacterium hominis
Phytophthora infestans
Phytophthora infestans
Chlamydomonas reinhardtii
Physcomitrella patens patens
Cyanothece
Cyanothece
Cyanothece
Cyanothece
Microcystis aeruginosa
Cyanothece
Synechocystis
Leeuwenhoekiella blandensis
Flavobacterium johnsoniae
Robiginitalea biformata
Flavobacteriales bacterium
Polaribacter
Polaribacter irgensii
Capnocytophaga sputigena
Capnocytophaga ochracea
Desulfovibrio vulgaris str.
Desulfovibrio vulgaris str.
Desulfovibrio desulfuricans desulfuricans
Desulfovibrio salexigens
Desulfohalobium retbaense
Desulfomicrobium baculatum
Desulfonatronospira thiodismutans
Desulfovibrio salexigens
Desulfovibrio piger
Desulfovibrio desulfuricans desulfuricans
Desulfotalea psychrophila
Lawsonia intracellularis
Lyngbya
Arthrospira maxima
Syntrophobacter fumaroxidans
Allochromatium vinosum
Rhodopirellula baltica
Sulfurimonas denitrificans
Campylobacterales bacterium
Sulfurospirillum deleyianum
Sulfurovum
Mycobacterium vanbaalenii
Mycobacterium gilvum
Mycobacterium
Mycobacterium smegmatis
Mycobacterium abscessus
Mycobacterium kansasii
Mycobacterium marinum
Mycobacterium tuberculosis
Mycobacterium avium paratuberculosis
Mycobacterium intracellulare
Rhodococcus erythropolis
Rhodococcus jostii
Nocardia farcinica
Tsukamurella paurometabola
Gordonia bronchialis
Jonesia denitrificans
Sanguibacter keddieii
Cellulomonas flavigena
Beutenbergia cavernae
Xylanimonas cellulosilytica
Nocardioides
Kribbella flavida
actinobacterium
Clavibacter michiganensis sepedonicus
Leifsonia xyli xyli
Nitrosomonas eutropha
Catenulispora acidiphila
Nakamurella multipartita
Brachybacterium faecium
Actinomyces urogenitalis
Kytococcus sedentarius
Streptomyces flavogriseus
Streptomyces griseus griseus
Streptomyces clavuligerus
Streptomyces sviceus
Streptomyces griseoflavus
Streptomyces ghanaensis
Streptomyces viridochromogenes
Streptomyces lividans
Streptomyces avermitilis
Streptomyces scabiei
Streptomyces albus
Streptomyces
Streptomyces
Streptomyces hygroscopicus
Streptomyces
Streptosporangium roseum
Salinispora tropica
Salinispora arenicola
Micromonospora
Arthrobacter
Arthrobacter oxydans
Micrococcus luteus
Rothia mucilaginosa
Kocuria rhizophila
Francisella tularensis holarctica
Francisella philomiragia philomiragia
Baumannia cicadellinicola coagulata)
Buchnera aphidicola pisum)
Verrucomicrobiae bacterium
Verrucomicrobium spinosum
Mariprofundus ferrooxydans
Bermanella marisrubri
In other embodiments, a polynucleotide, gene and/or polypeptide encoding phosphotransacetylase can have at least about 70% to about 75%, about 80% to about 85%, about 85% to about 90%, about 90% to about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity to any one of the sequences of Tables 10 or 12a or 12b, wherein the polynucleotide, gene and/or polypeptide encodes a polypeptide having phosphotransacetylase activity.
In embodiments, a polynucleotide, gene and/or polypeptide encoding phosphotransacetylase corresponds to the Enzyme Commission Number EC 2.3.1.8.
In other embodiments, the phosphotransacetylase polynucleotide, gene and/or polypeptide sequences described herein or those recited in the art can be used to identify phosphotransacetylase sequences or phosphotransacetylase homologs in other cells, as described above for PDC.
Methods for gene expression in recombinant host cells, including, but not limited to, yeast cells are known in the art (see, for example, Methods in Enzymology, Volume 194, Guide to Yeast Genetics and Molecular and Cell Biology (Part A), 2004, Christine Guthrie and Gerald R. Fink (Eds.), Elsevier Academic Press, San Diego, Calif.). In embodiments, the coding region for the phosphoketolase and/or phosphotransacetylase genes to be expressed can be codon optimized for the target host cell, as is well known to one skilled in the art. Expression of genes in recombinant host cells, including but not limited to yeast cells, can require a promoter operably linked to a coding region of interest, and a transcriptional terminator. A number of promoters can be used in constructing expression cassettes for genes, including, but not limited to, the following constitutive promoters suitable for use in yeast: FBA1, TDH3 (GPD), ADH1, and GPM1; and the following inducible promoters suitable for use in yeast: GAL1, GAL10 and CUP1. Other yeast promoters include hybrid promoters UAS(PGK1)-FBA1p (SEQ ID NO: 1893), UAS(PGK1)-ENO2p (SEQ ID NO: 1894), UAS(FBA1)-PDC1p (SEQ ID NO: 1895), UAS(PGK1)-PDC1p (SEQ ID NO: 1896), and UAS(PGK)-OLE1p (SEQ ID NO: 1897). Suitable transcriptional terminators that can be used in a chimeric gene construct for expression include, but are not limited to, FBA1t, TDH3t, GPM1t, ERG10t, GAL1t, CYC1t, and ADH1t.
Recombinant polynucleotides are typically cloned for expression using the coding sequence as part of a chimeric gene used for transformation, which includes a promoter operably linked to the coding sequence as well as a ribosome binding site and a termination control region. The coding region may be from the host cell for transformation and combined with regulatory sequences that are not native to the natural gene encoding phosphoketolase and/or phosphotransacetylase. Alternatively, the coding region may be from another host cell.
Vectors useful for the transformation of a variety of host cells are common and described in the literature. Typically the vector contains a selectable marker and sequences allowing autonomous replication or chromosomal integration in the desired host. In addition, suitable vectors can comprise a promoter region which harbors transcriptional initiation controls and a transcriptional termination control region, between which a coding region DNA fragment may be inserted, to provide expression of the inserted coding region. Both control regions can be derived from genes homologous to the transformed host cell, although it is to be understood that such control regions can also be derived from genes that are not native to the specific species chosen as a production host.
In embodiments, suitable promoters, transcriptional terminators, and phosphoketolase and/or phosphotransacetylase coding regions can be cloned into E. coli-yeast shuttle vectors, and transformed into yeast cells. Such vectors allow strain propagation in both E. coli and yeast strains, and can contain a selectable marker and sequences allowing autonomous replication or chromosomal integration in the desired host. Typically used plasmids in yeast include, but are not limited to, shuttle vectors pRS423, pRS424, pRS425, and pRS426 (American Type Culture Collection, Rockville, Md.), which contain an E. coli replication origin (e.g., pMB1), a yeast 2-micron origin of replication, and a marker for nutritional selection. The selection markers for these four vectors are HIS3 (vector pRS423), TRP1 (vector pRS424), LEU2 (vector pRS425) and URA3 (vector pRS426).
In embodiments, construction of expression vectors with a chimeric gene encoding the described phosphoketolases and/or phosphotransacetylases can be performed by the gap repair recombination method in yeast. In embodiments, a yeast vector DNA is digested (e.g., in its multiple cloning site) to create a “gap” in its sequence. A number of insert DNAs of interest are generated that contain an approximately 21 bp sequence at both the 5′ and the 3′ ends that sequentially overlap with each other, and with the 5′ and 3′ termini of the vector DNA. For example, to construct a yeast expression vector for “Gene X,” a yeast promoter and a yeast terminator are selected for the expression cassette. The promoter and terminator are amplified from the yeast genomic DNA, and Gene X is either PCR amplified from its source organism or obtained from a cloning vector comprising the Gene X sequence. There is at least a 21 bp overlapping sequence between the 5′ end of the linearized vector and the promoter sequence, between the promoter and Gene X, between Gene X and the terminator sequence, and between the terminator and the 3′ end of the linearized vector. The “gapped” vector and the insert DNAs are then co-transformed into a yeast strain and plated on the medium containing the appropriate compound mixtures that allow complementation of the nutritional selection markers on the plasmids. The presence of correct insert combinations can be confirmed by PCR mapping using plasmid DNA prepared from the selected cells. The plasmid DNA isolated from yeast (usually low in concentration) can then be transformed into an E. coli strain, e.g., TOP10, followed by mini preps and restriction mapping to further verify the plasmid construct. Finally, the construct can be verified by sequence analysis.
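The overlap requirement of the gap repair design can be checked in silico before ordering primers. The following non-limiting sketch assumes that all fragments are written 5′ to 3′ on the same strand and in assembly order; the sequences are randomly generated placeholders, and the function names (`shared_overlap`, `junction_overlaps`, `design_is_valid`) are hypothetical.

```python
import random


def shared_overlap(upstream, downstream):
    """Length of the longest suffix of upstream that is also a prefix of downstream."""
    for n in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream.endswith(downstream[:n]):
            return n
    return 0


def junction_overlaps(fragments):
    """Overlap length at each junction between consecutive (name, sequence) fragments."""
    return [
        (f"{name_a}/{name_b}", shared_overlap(seq_a, seq_b))
        for (name_a, seq_a), (name_b, seq_b) in zip(fragments, fragments[1:])
    ]


def design_is_valid(fragments, min_overlap=21):
    """True if every junction shares at least min_overlap bases."""
    return all(length >= min_overlap for _, length in junction_overlaps(fragments))


# Placeholder 400-nt assembled region, sliced so that neighbours share 21 nt.
rng = random.Random(0)
assembled = "".join(rng.choice("ACGT") for _ in range(400))
fragments = [
    ("vector 5' end", assembled[0:100]),
    ("promoter", assembled[79:200]),
    ("Gene X", assembled[179:300]),
    ("terminator", assembled[279:360]),
    ("vector 3' end", assembled[339:400]),
]
for junction, length in junction_overlaps(fragments):
    print(junction, length)
print("design valid:", design_is_valid(fragments))
```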
Like the gap repair technique, integration into the yeast genome also takes advantage of the homologous recombination system in yeast. In embodiments, a cassette containing a coding region plus control elements (promoter and terminator) and an auxotrophic marker is PCR-amplified with a high-fidelity DNA polymerase using primers that hybridize to the cassette and contain 40-70 base pairs of sequence homology to the regions 5′ and 3′ of the genomic area where insertion is desired. The PCR product is then transformed into yeast and plated on medium containing the appropriate compound mixtures that allow selection for the integrated auxotrophic marker. For example, to integrate “Gene X” into chromosomal location “Y”, the promoter-coding region X-terminator construct is PCR amplified from a plasmid DNA construct and joined to an auxotrophic marker (such as URA3) by either SOE PCR or by common restriction digests and cloning. The full cassette, containing the promoter-coding region X-terminator-URA3 region, is PCR amplified with primer sequences that contain 40-70 bp of homology to the regions 5′ and 3′ of location “Y” on the yeast chromosome. The PCR product is transformed into yeast and selected on growth media lacking uracil. Transformants can be verified either by colony PCR or by direct sequencing of chromosomal DNA.
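A non-limiting sketch of how such integration primers could be assembled in silico is given below. Each primer consists of a 5′ tail homologous to the genomic flank and a 3′ portion that anneals to the cassette. The 20-nt annealing length, the function names, and the placeholder sequences are assumptions made for illustration and are not taken from the Examples.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def integration_primers(cassette, upstream_flank, downstream_flank,
                        homology=50, anneal=20):
    """Build forward and reverse primers carrying genomic homology tails.

    homology: tail length in nt (40-70, per the range described above).
    anneal:   cassette-annealing length in nt (assumed value).
    All input sequences are given 5' to 3' on the top strand.
    """
    if not 40 <= homology <= 70:
        raise ValueError("homology tail should be 40-70 nt")
    if len(upstream_flank) < homology or len(downstream_flank) < homology:
        raise ValueError("flanking sequence shorter than requested homology")
    if len(cassette) < anneal:
        raise ValueError("cassette shorter than annealing region")
    forward = upstream_flank[-homology:] + cassette[:anneal]
    reverse = reverse_complement(cassette[-anneal:] + downstream_flank[:homology])
    return forward, reverse


# Placeholder sequences for the cassette and for the flanks of location "Y".
cassette = "ATG" + "GATTACA" * 10
upstream = "A" * 30 + "CCGGTTAACC" * 5
downstream = "TTGGCCAATT" * 6
fwd, rev = integration_primers(cassette, upstream, downstream)
print(fwd)
print(rev)
```

The reverse primer is the reverse complement of the cassette 3′ end followed by the downstream flank, so its 5′ tail carries the genomic homology and its 3′ end anneals to the cassette.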
The presence of phosphoketolase and phosphotransacetylase activity in the recombinant host cells disclosed herein can be confirmed using routine methods known in the art. In a non-limiting example, and as described in the Examples herein, transformants can be screened by PCR using primers for the phosphoketolase and phosphotransacetylase genes. In embodiments, and as described in the Examples herein, transformants can be screened by PCR with primers N1039 and N1040 (SEQ ID NOs: 639 and 640) to confirm integration of the xpk1 gene, and primers N1041 and N1042 (SEQ ID NOs: 641 and 642) can be used to confirm integration of the eutD gene. In another non-limiting example, and as described in the Examples herein, transformants can be screened for integration of phosphoketolase constructs and/or phosphotransacetylase constructs at the Δpdc1::ilvD(Sm) locus by the loss of ilvD(Sm) in the host cells.
In another non-limiting example, and as described in the examples herein, phosphoketolase activity can be assayed by expressing phosphoketolase identifiable by the methods disclosed herein in a recombinant host cell disclosed herein that lacks endogenous phosphoketolase activity. If phosphoketolase activity is present, such cells exhibit a reduced or eliminated requirement for exogenous two-carbon substrate supplementation for growth in culture.
In another non-limiting example, and as described in the examples herein, phosphoketolase and phosphotransacetylase activity can be assayed by expressing phosphoketolase and phosphotransacetylase activity identifiable by the methods disclosed herein in a recombinant host cell disclosed herein that lacks endogenous phosphoketolase and phosphotransacetylase activity. If phosphoketolase activity and phosphotransacetylase activity are present, such cells exhibit a reduced or eliminated requirement for exogenous two-carbon substrate supplementation for growth in culture.
In another non-limiting example, phosphoketolase and/or phosphotransacetylase activity can be confirmed by more indirect methods, such as by assaying for a downstream product in a pathway requiring phosphoketolase activity. For example, a polypeptide having phosphoketolase activity can catalyze the conversion of xylulose-5-phosphate into glyceraldehyde-3-phosphate and acetyl-phosphate and/or the conversion of fructose-6-phosphate into erythrose-4-phosphate and acetyl-phosphate. Also, a polypeptide having phosphotransacetylase activity can catalyze the conversion of acetyl-phosphate into acetyl-CoA.
PDC-KO cells fail to grow in glucose-containing media (e.g., 2% glucose), but PDC-KO cells carrying a functional butanediol biosynthetic pathway have been shown to grow on glucose supplemented with exogenous two-carbon substrates such as ethanol (see, for example, US Patent Application Publication No. 20090305363, herein incorporated by reference). In embodiments, the host cells disclosed herein can be grown in fermentation media which contains a suitable pathway carbon substrate and C2-substrate supplement, including combinations of suitable pathway carbon substrates with C2-substrate supplement. Non-limiting examples of suitable pathway carbon substrates include, but are not limited to, monosaccharides such as fructose, oligosaccharides such as lactose, maltose, galactose, or sucrose, polysaccharides such as starch or cellulose or mixtures thereof and unpurified mixtures from renewable feedstocks such as cheese whey permeate, cornsteep liquor, sugar beet molasses, and barley malt, including any combinations thereof. In other embodiments, the suitable pathway carbon substrates can include lactate, glycerol, or combinations thereof.
In embodiments, a suitable carbon substrate can be a one-carbon substrate such as carbon dioxide, or methanol for which metabolic conversion into key biochemical intermediates has been demonstrated, or combinations thereof. In other embodiments related to methylotrophic organisms, the carbon substrate can be carbon containing compounds such as methylamine, glucosamine and a variety of amino acids for metabolic activity. In a non-limiting example, methylotrophic yeasts are known to utilize the carbon from methylamine to form trehalose or glycerol (Bellion et al., Microb. Growth C1 Compd., [Int. Symp.], 7th (1993), 415-32, Editor(s): Murrell, J. Collin; Kelly Don P. Publisher: Intercept, Andover, UK). In another non-limiting example, various species of Candida can metabolize alanine (Sulter et al., Arch. Microbiol. 153:485-489 (1990)). Hence it is contemplated that the source of carbon utilized in the present invention can encompass a wide variety of carbon containing substrates and will only be limited by the choice of organism.
In other embodiments, the suitable pathway carbon substrate can be glucose, fructose, and sucrose, or mixtures of these with five-carbon (C5) sugars such as xylose and/or arabinose for yeast cells modified to use C5 sugars. In embodiments, sucrose can be derived from renewable sugar sources such as sugar cane, sugar beets, cassava, sweet sorghum, and mixtures thereof. In other embodiments, glucose and dextrose can be derived from renewable grain sources through saccharification of starch based feedstocks including grains such as corn, wheat, rye, barley, oats, and mixtures thereof. In embodiments, the pathway carbon substrates can be derived from renewable cellulosic or lignocellulosic biomass through processes of pretreatment and saccharification, as described, for example, in U.S. Patent Application Publication No. US 20070031918 A1, which is herein incorporated by reference.
As used herein, “biomass” refers to any cellulosic or lignocellulosic material and includes, but is not limited to, materials comprising cellulose, and optionally further comprising hemicellulose, lignin, starch, oligosaccharides and/or monosaccharides. In embodiments, biomass can also comprise additional components, such as protein and/or lipid. In other embodiments, biomass can be derived from a single source, or biomass can comprise a mixture derived from more than one source; for example, biomass can comprise a mixture of corn cobs and corn stover, or a mixture of grass and leaves. Biomass includes, but is not limited to, bioenergy crops, agricultural residues, municipal solid waste, industrial solid waste, sludge from paper manufacture, yard waste, wood and forestry waste. Other non-limiting examples of biomass include, but are not limited to, corn grain, corn cobs, crop residues such as corn husks, corn stover, grasses, wheat, wheat straw, barley, barley straw, hay, rice straw, switchgrass, waste paper, sugar cane bagasse, sorghum, soy, components obtained from milling of grains, trees, branches, roots, leaves, wood chips, sawdust, shrubs and bushes, vegetables, fruits, flowers, animal manure and mixtures thereof.
The recombinant host cells described herein can be cultured using standard laboratory techniques known in the art (see, e.g., Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202). In embodiments related to media supplemented with exogenous two-carbon substrates, and as described in the Examples, recombinant host cells can be grown in synthetic complete medium supplemented with one or more exogenous two-carbon substrates as described herein at a concentration of about 0.01%, about 0.05%, about 0.1%, about 0.5%, about 1.0%, about 1.5%, or about 2% (v/v) of the media. In embodiments, the recombinant host cells can be grown in synthetic complete culture without uracil or histidine, supplemented with 0.5% (v/v) ethanol. In embodiments related to growth in media that is not supplemented with exogenous two-carbon substrates, the recombinant host cells described herein can be first grown in culture medium comprising an exogenous two-carbon substrate and then diluted (e.g., starting OD=0.1 in medium in a 125 ml vented flask) into media that is not supplemented with exogenous two-carbon substrate.
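The supplementation and dilution steps above reduce to simple arithmetic, sketched below. The function names are hypothetical, and the 20 ml culture volume used in the usage note is an assumed value chosen only for illustration. The dilution calculation assumes OD scales linearly with cell density over the range used.

```python
# Illustrative sketch only: function names and example volumes are assumptions.

def supplement_volume_ml(final_volume_ml: float, percent_vv: float) -> float:
    """Volume of liquid supplement (e.g., ethanol) giving percent_vv (v/v) in the final medium."""
    if final_volume_ml <= 0 or percent_vv < 0:
        raise ValueError("volume must be positive and percentage non-negative")
    return final_volume_ml * percent_vv / 100.0


def inoculum_volume_ml(preculture_od: float, target_od: float,
                       final_volume_ml: float) -> float:
    """Volume of preculture needed to reach target_od in final_volume_ml (C1*V1 = C2*V2)."""
    if preculture_od <= 0 or target_od <= 0 or final_volume_ml <= 0:
        raise ValueError("all inputs must be positive")
    if target_od > preculture_od:
        raise ValueError("preculture is less dense than the target OD")
    return target_od * final_volume_ml / preculture_od
```

For example, under these assumptions, 0.5% (v/v) ethanol in 20 ml of medium corresponds to `supplement_volume_ml(20, 0.5)`, i.e., 0.1 ml, and a preculture at OD 2.0 diluted to a starting OD of 0.1 in 20 ml requires `inoculum_volume_ml(2.0, 0.1, 20)`, i.e., 1 ml.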
The growth of the recombinant host cells described herein can be measured by methods known in the art (see, e.g., Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202). In a non-limiting example, the growth of the recombinant host cells described herein can be determined by measuring the optical density (OD) of cell cultures over time. For example, the OD at 600 nm for a yeast culture is proportional to yeast cell number. In another non-limiting example, the growth of the recombinant host cells described herein can be determined by counting viable cells in a sample of the culture over time.
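One common way to turn OD measurements over time into a growth rate is a least-squares fit of ln(OD) against time. The sketch below is an illustration of that general approach, not the procedure of the Examples. It assumes the readings come from exponential phase and lie within the linear range of the spectrophotometer; the function names are hypothetical.

```python
import math

# Illustrative sketch only: assumes exponential-phase OD600 readings.

def specific_growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) versus time, in units of 1/h."""
    if len(times_h) != len(od600) or len(times_h) < 2:
        raise ValueError("need at least two paired time/OD readings")
    if any(od <= 0 for od in od600):
        raise ValueError("OD readings must be positive")
    logs = [math.log(od) for od in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    return sxy / sxx


def doubling_time_h(mu_per_h: float) -> float:
    """Doubling time in hours for a positive specific growth rate."""
    if mu_per_h <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / mu_per_h
```

A culture whose OD doubles every 2 hours would give a specific growth rate of ln(2)/2 per hour under this fit.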
Applicants have provided cells that have a reduced or eliminated requirement for two-carbon substrate supplementation for growth. In embodiments, such cells comprise (i) a deletion, mutation, and/or substitution in an endogenous gene encoding a polypeptide that converts pyruvate to acetaldehyde, acetyl-phosphate or acetyl-CoA that results in a requirement for exogenous two-carbon substrate supplementation for optimal growth; (ii) a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity; and optionally (iii) a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity. In embodiments, such cells comprise (i) a modification in an endogenous polypeptide having PDC activity which results in reduced or eliminated PDC activity; (ii) a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity; and optionally (iii) a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity. As such, Applicants have also provided methods of improving the growth of a recombinant host cell comprising at least one modification in an endogenous polypeptide that converts pyruvate to acetaldehyde, acetyl-phosphate or acetyl-CoA that results in a requirement for exogenous two-carbon substrate supplementation for optimal growth comprising transforming the host cell with a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity. Applicants have also provided methods of improving the growth of a recombinant host cell comprising at least one modification in an endogenous polypeptide having pyruvate decarboxylase activity (e.g., having at least one deletion, mutation or substitution in an endogenous gene encoding a polypeptide having PDC activity that results in reduced or eliminated PDC activity) comprising transforming the host cell with a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity.
In other embodiments, the method further comprises transforming a recombinant host cell described herein with a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity.
Applicants have also provided methods of reducing or eliminating the requirement for an exogenous two-carbon substrate for the growth of a recombinant host cell comprising at least one modification in an endogenous activity that converts pyruvate to acetaldehyde, acetyl-phosphate or acetyl-CoA that results in a requirement for exogenous two-carbon substrate supplementation for optimal growth, comprising transforming the recombinant host cell with a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity. In other embodiments, the method further comprises transforming the recombinant host cell with a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity.
Applicants have also provided methods of reducing the requirement for an exogenous two-carbon substrate for the growth of a recombinant host cell comprising at least one modification in an endogenous polypeptide having PDC activity (e.g., having at least one deletion, mutation or substitution in an endogenous gene encoding a polypeptide having pyruvate decarboxylase activity) comprising transforming the recombinant host cell with a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity. In other embodiments, the method further comprises transforming the recombinant host cell with a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity.
In addition, Applicants have provided methods of eliminating the requirement for an exogenous two-carbon substrate for the growth of a recombinant host cell comprising at least one modification in an endogenous polypeptide having PDC activity (e.g., having at least one deletion, mutation or substitution in an endogenous gene encoding a polypeptide having PDC activity that results in reduced or eliminated PDC activity) comprising transforming the recombinant host cell with a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity. In other embodiments, the method further comprises transforming the recombinant host cell with a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity.
In embodiments, a reduced requirement for exogenous two-carbon substrate supplementation can be a growth rate of the recombinant host cells described herein in media that is not supplemented with an exogenous two-carbon substrate that is the same or substantially equivalent to the growth rate of a recombinant host cell comprising a modification in an endogenous activity that converts pyruvate to acetaldehyde, acetyl-phosphate or acetyl-CoA grown in media that is supplemented with an exogenous two-carbon substrate. In embodiments, such a growth rate can be at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% of the growth rate of a recombinant host cell comprising a modification in an endogenous activity that converts pyruvate to acetaldehyde, acetyl-phosphate or acetyl-CoA grown in media that is supplemented with an exogenous two-carbon substrate.
In embodiments, a reduced requirement for exogenous two-carbon substrate supplementation can be a growth rate of the recombinant host cells described herein in media that is not supplemented with an exogenous two-carbon substrate that is the same or substantially equivalent to the growth rate of a recombinant host cell comprising a modification in an endogenous PDC activity grown in media that is supplemented with an exogenous two-carbon substrate. In embodiments, such a growth rate can be at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% of the growth rate of a recombinant host cell comprising a modification in an endogenous PDC activity grown in media that is supplemented with an exogenous two-carbon substrate.
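The percentage comparison in the two preceding paragraphs can be sketched as follows. The function names are hypothetical. The strict numeric comparison is a simplification, since the text recites "at least about" each level and does not define a tolerance.

```python
# Illustrative sketch only: strict >= comparison stands in for "at least about".

THRESHOLDS_PERCENT = (70, 75, 80, 85, 90, 95, 96, 97, 98, 99)


def relative_growth_rate_percent(mu_unsupplemented: float,
                                 mu_supplemented_reference: float) -> float:
    """Growth rate without C2 supplement as a percentage of the supplemented reference."""
    if mu_supplemented_reference <= 0:
        raise ValueError("reference growth rate must be positive")
    return 100.0 * mu_unsupplemented / mu_supplemented_reference


def highest_threshold_met(percent: float):
    """Largest recited level that the percentage reaches, or None if below all of them."""
    met = [level for level in THRESHOLDS_PERCENT if percent >= level]
    return max(met) if met else None
```

Both growth rates would need to be measured by the same method and in the same units for the ratio to be meaningful.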
In other embodiments, the recombinant host cells described herein have a growth rate in media that is not supplemented with an exogenous two-carbon substrate that is greater than the growth rate of a recombinant host cell comprising a modification in an endogenous activity that converts pyruvate to acetaldehyde, acetyl-phosphate or acetyl-CoA in media that is not supplemented with an exogenous two-carbon substrate.
In other embodiments, the recombinant host cells described herein have a growth rate in media that is not supplemented with an exogenous two-carbon substrate that is greater than the growth rate of a recombinant host cell comprising a modification in an endogenous PDC activity in media that is not supplemented with an exogenous two-carbon substrate.
In other embodiments, the recombinant host cells described herein can have an increased glucose consumption compared to a recombinant host cell comprising a modification in an endogenous polypeptide that converts pyruvate to acetaldehyde, acetyl-phosphate or acetyl-CoA.
In other embodiments, the recombinant host cells described herein can have an increased glucose consumption compared to a recombinant host cell comprising a modification in an endogenous polypeptide having PDC activity (e.g., at least one deletion, mutation, and/or substitution in an endogenous gene encoding a polypeptide having PDC activity that reduces or eliminates PDC activity).
Glucose consumption of the recombinant host cells described herein can be measured by methods known in the art (see, e.g., Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202). In a non-limiting example, glucose consumption can be measured by quantitating the amount of glucose in culture media by HPLC or with a YSI Biochemistry Analyzer (YSI, Inc., Yellow Springs, Ohio).
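Once glucose concentrations in the medium have been quantitated, consumption follows by difference. The sketch below assumes concentrations in g/L measured at the start and at a later time point; the function names and the example values in the usage note are hypothetical.

```python
# Illustrative sketch only: names and units (g/L, hours) are assumptions.

def glucose_consumed_g_per_l(initial_g_per_l: float, residual_g_per_l: float) -> float:
    """Glucose consumed, as the drop in medium concentration."""
    if initial_g_per_l < 0 or residual_g_per_l < 0:
        raise ValueError("concentrations must be non-negative")
    if residual_g_per_l > initial_g_per_l:
        raise ValueError("residual glucose exceeds initial glucose")
    return initial_g_per_l - residual_g_per_l


def percent_glucose_consumed(initial_g_per_l: float, residual_g_per_l: float) -> float:
    """Glucose consumed as a percentage of the initial concentration."""
    if initial_g_per_l <= 0:
        raise ValueError("initial concentration must be positive")
    return 100.0 * glucose_consumed_g_per_l(initial_g_per_l, residual_g_per_l) / initial_g_per_l


def consumption_rate_g_per_l_h(initial_g_per_l: float, residual_g_per_l: float,
                               elapsed_h: float) -> float:
    """Average volumetric glucose consumption rate over the interval."""
    if elapsed_h <= 0:
        raise ValueError("elapsed time must be positive")
    return glucose_consumed_g_per_l(initial_g_per_l, residual_g_per_l) / elapsed_h
```

For instance, a medium starting at 20 g/L glucose (2% w/v) with 5 g/L remaining after 30 hours gives 15 g/L consumed, 75% consumption, and an average rate of 0.5 g/L/h under these assumptions.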
In other embodiments, methods of producing a recombinant host cell are provided comprising transforming a recombinant host cell comprising a modification in an endogenous polynucleotide, gene or polypeptide encoding pyruvate decarboxylase (e.g., at least one deletion, mutation, and/or substitution in an endogenous gene encoding a polypeptide having pyruvate decarboxylase activity) with a heterologous polynucleotide encoding a polypeptide having phosphoketolase activity. In other embodiments, the method further comprises transforming the recombinant host cell with a heterologous polynucleotide encoding a polypeptide having phosphotransacetylase activity.
In other embodiments, methods for the conversion of xylulose 5-phosphate or fructose-6-phosphate into acetyl-phosphate are provided comprising (i) providing a recombinant host cell as described herein, or combinations thereof; and (ii) growing the recombinant host cell under conditions wherein xylulose 5-phosphate or fructose-6-phosphate is converted into acetyl-phosphate. In other embodiments, methods for the conversion of xylulose 5-phosphate or fructose-6-phosphate into acetyl-CoA are provided comprising (i) providing a recombinant host cell as described herein, or combinations thereof; and (ii) growing the recombinant host cell under conditions where xylulose 5-phosphate or fructose-6-phosphate is converted into acetyl-CoA.
In other embodiments, methods for the conversion of acetyl-phosphate to acetyl-CoA are provided comprising (i) providing a recombinant host cell as described herein, or combinations thereof; and (ii) growing the recombinant host cell under conditions where acetyl-phosphate is converted into acetyl-CoA. In other embodiments, methods for increasing the specific activity of a heterologous polypeptide having phosphoketolase activity in a recombinant host cell are provided comprising (i) providing a recombinant host cell as described herein, or combinations thereof; and (ii) growing the recombinant host cell under conditions wherein the heterologous polypeptide having phosphoketolase activity is expressed in functional form having a specific activity greater than the same recombinant host cell lacking the heterologous polypeptide having phosphoketolase activity.
In other embodiments, methods for increasing the specific activity of a heterologous polypeptide having phosphotransacetylase activity in a recombinant host cell are provided comprising (i) providing a recombinant host cell described herein, or combinations thereof; and (ii) growing the recombinant host cell under conditions whereby the heterologous polypeptide having phosphotransacetylase activity is expressed in functional form having a specific activity greater than the same recombinant host cell lacking a heterologous polypeptide having phosphotransacetylase activity.
In still other embodiments, methods for increasing the activity of the phosphoketolase pathway in a recombinant host cell are provided comprising (i) providing a recombinant host cell as described herein, or combinations thereof; and (ii) growing the host cell under conditions whereby the activity of the phosphoketolase pathway in the host cell is increased.
Threonine aldolase (E.C. number 4.1.2.5) catalyzes cleavage of threonine to produce glycine and acetaldehyde. Plasmid-based overexpression of a gene encoding this enzyme in S. cerevisiae PDC-KO strains was shown to eliminate the requirement for exogenous C2 supplementation (van Maris et al., Appl Environ Microbiol. 2003 April; 69(4):2094-9). In embodiments, recombinant host cells comprise (i) a deletion, mutation, and/or substitution in an endogenous gene encoding a polypeptide that converts pyruvate to acetaldehyde, acetyl-phosphate or acetyl-CoA that results in a requirement for exogenous two-carbon substrate supplementation for optimal growth; and (ii) a heterologous polynucleotide encoding a polypeptide having threonine aldolase activity.
In embodiments, the recombinant host cells described herein can be engineered to have a biosynthetic pathway for production of a product from pyruvate. A product from such a pyruvate-utilizing biosynthetic pathway includes, but is not limited to, 2,3-butanediol, isobutanol, 2-butanol, 2-butanone, valine, leucine, alanine, lactic acid, malic acid, fumaric acid, succinic acid and isoamyl alcohol. The features of any pyruvate-utilizing biosynthetic pathway may be engineered in the recombinant host cells described herein in any order. Any product made using a biosynthetic pathway that has pyruvate as the initial substrate can be produced with greater effectiveness in a recombinant host cell disclosed herein having a modification in an endogenous polypeptide that converts pyruvate to acetaldehyde, acetyl-phosphate or acetyl-CoA (such as pyruvate decarboxylase, pyruvate formate lyase, pyruvate dehydrogenase, pyruvate oxidase, or pyruvate:ferredoxin oxidoreductase) and having heterologous phosphoketolase and/or phosphotransacetylase activity, compared to a recombinant host cell having a modification in an endogenous polypeptide that converts pyruvate to acetaldehyde, acetyl-phosphate or acetyl-CoA (such as pyruvate decarboxylase, pyruvate formate lyase, pyruvate dehydrogenase, pyruvate oxidase, or pyruvate:ferredoxin oxidoreductase). Any product made using a biosynthetic pathway that has pyruvate as the initial substrate can be produced with greater effectiveness in a recombinant host cell disclosed herein having a modification in an endogenous polypeptide having PDC activity that reduces or eliminates PDC activity and having heterologous phosphoketolase and/or phosphotransacetylase activity, compared to a recombinant host cell having a modification in an endogenous polypeptide having PDC activity that reduces or eliminates PDC activity.
The biosynthetic pathway of the recombinant host cells described herein can be any pathway that utilizes pyruvate and produces a desired product. The pathway genes may include endogenous genes and/or heterologous genes. Typically at least one gene in the biosynthetic pathway is a heterologous gene. Suitable biosynthetic pathways for production of butanol are known in the art, and certain suitable pathways are described herein. In some embodiments, the butanol biosynthetic pathway comprises at least one gene that is heterologous to the host cell. In some embodiments, the butanol biosynthetic pathway comprises more than one gene that is heterologous to the host cell. In some embodiments, the butanol biosynthetic pathway comprises heterologous genes encoding polypeptides corresponding to every step of a biosynthetic pathway.
Genes and polypeptides that can be used for substrate to product conversions described herein, as well as methods of identifying such genes and polypeptides, are described herein and/or in the art, for example, for isobutanol, in the Examples and in U.S. Pat. No. 7,851,188. Ketol-acid reductoisomerase (KARI) enzymes are described in U.S. Patent Appl. Pub. Nos. 20080261230 A1, 20090163376 A1, 20100197519 A1, and PCT Appl. Pub. No. WO/2011/041415. Examples of KARIs disclosed therein are those from Lactococcus lactis, Vibrio cholerae, Pseudomonas aeruginosa PAO1, and Pseudomonas fluorescens PF5 mutants. KARIs include Anaerostipes caccae KARI variants “K9G9” and “K9D3” (SEQ ID NOs: 1911 and 1910, respectively). US Appl. Pub. No. 20100081154 A1, and U.S. Pat. No. 7,851,188 describe dihydroxyacid dehydratases (DHADs), including a DHAD from Streptococcus mutans. U.S. Patent Appl. Publ. No. 20090269823 A1 describes SadB, an alcohol dehydrogenase (ADH) from Achromobacter xylosoxidans. Alcohol dehydrogenases also include horse liver ADH and Beijerinckia indica ADH (protein SEQ ID NO: 1923).
An example of a biosynthetic pathway for producing 2,3-butanediol can be engineered in the recombinant host cells described herein, as described in U.S. Patent Application Publication No. 20090305363, which is herein incorporated by reference. The 2,3-butanediol pathway is a portion of the 2-butanol biosynthetic pathway that is disclosed in U.S. Patent Application Publication No. US 20070292927 A1, which is herein incorporated by reference. Such pathway steps include, but are not limited to, conversion of pyruvate to acetolactate by acetolactate synthase, conversion of acetolactate to acetoin by acetolactate decarboxylase, and conversion of acetoin to 2,3-butanediol by butanediol dehydrogenase. The skilled person will appreciate that polypeptides having the activity of such pathway steps, isolated from a variety of sources, can be used in the recombinant host cells described herein.
In addition, examples of biosynthetic pathways for production of 2-butanone or 2-butanol that can be engineered in the recombinant host cells described herein are disclosed in U.S. Patent Application Publication Nos. US 20070292927 A1 and US 20070259410 A1, which are herein incorporated by reference. The pathway in U.S. Patent Application Publication No. US 20070292927 A1 is the same as described for butanediol production with the addition of the following steps:
2,3-butanediol to 2-butanone as catalyzed for example by diol dehydratase or glycerol dehydratase; and
2-butanone to 2-butanol as catalyzed for example by butanol dehydrogenase.
Described in U.S. Patent Application Publication No. US 20090155870 A1, which is herein incorporated by reference, is the construction of chimeric genes and genetic engineering of yeast for 2-butanol production using the biosynthetic pathway disclosed in U.S. Patent Application Publication No. US 20070292927 A1. Further description for gene construction and expression related to these pathways can be found, for example, in International Publication No. WO 2009046370 (e.g., butanediol dehydratases); U.S. Patent Application Publication No. US 20090269823 A1 (e.g., butanol dehydrogenase); and U.S. Patent Application Publication No. US 20070259410 A1, which are herein incorporated by reference. The skilled person will appreciate that polypeptides having the activity of such pathway steps, isolated from a variety of sources, can be used in the recombinant host cells described herein.
Biosynthetic pathways for the production of isobutanol that may be used include those described in U.S. Pat. No. 7,851,188 and PCT Publication WO 2007050671, incorporated herein by reference. One isobutanol biosynthetic pathway comprises the following substrate to product conversions:
pyruvate to acetolactate, which may be catalyzed, for example, by acetolactate synthase;
acetolactate to 2,3-dihydroxyisovalerate, which may be catalyzed, for example, by acetohydroxy acid reductoisomerase;
2,3-dihydroxyisovalerate to α-ketoisovalerate, which may be catalyzed, for example, by acetohydroxy acid dehydratase;
α-ketoisovalerate to isobutyraldehyde, which may be catalyzed, for example, by a branched-chain keto acid decarboxylase; and
isobutyraldehyde to isobutanol, which may be catalyzed, for example, by a branched-chain alcohol dehydrogenase. In some embodiments, the isobutanol biosynthetic pathway comprises at least one gene, at least two genes, at least three genes, or at least four genes that is/are heterologous to the yeast cell. In embodiments, each substrate to product conversion of an isobutanol biosynthetic pathway in a recombinant host cell is catalyzed by a heterologous polypeptide. In embodiments, the polypeptide catalyzing the substrate to product conversions of acetolactate to 2,3-dihydroxyisovalerate and/or the polypeptide catalyzing the substrate to product conversion of isobutyraldehyde to isobutanol are capable of utilizing NADH as a cofactor.
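As an aid to reading, the five substrate to product conversions listed above can be written as an ordered data structure and checked for continuity. This is only a bookkeeping sketch; the variable and function names are hypothetical, and "alpha-ketoisovalerate" is an ASCII spelling of α-ketoisovalerate.

```python
# Illustrative sketch only: each tuple is (substrate, product, example enzyme).

ISOBUTANOL_PATHWAY = [
    ("pyruvate", "acetolactate", "acetolactate synthase"),
    ("acetolactate", "2,3-dihydroxyisovalerate", "acetohydroxy acid reductoisomerase"),
    ("2,3-dihydroxyisovalerate", "alpha-ketoisovalerate", "acetohydroxy acid dehydratase"),
    ("alpha-ketoisovalerate", "isobutyraldehyde", "branched-chain keto acid decarboxylase"),
    ("isobutyraldehyde", "isobutanol", "branched-chain alcohol dehydrogenase"),
]


def is_contiguous(pathway) -> bool:
    """True if the product of each step is the substrate of the next step."""
    return all(pathway[i][1] == pathway[i + 1][0] for i in range(len(pathway) - 1))


def count_heterologous(pathway, heterologous_enzymes) -> int:
    """Number of steps whose enzyme is in the given set of heterologous enzymes."""
    return sum(1 for _, _, enzyme in pathway if enzyme in heterologous_enzymes)
```

A check such as `count_heterologous(ISOBUTANOL_PATHWAY, {...}) >= 1` mirrors the "at least one gene" language above, with the set contents supplied by the user for a given strain.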
An example of a biosynthetic pathway for production of valine that can be engineered in the recombinant host cells described herein includes the steps of acetolactate conversion to 2,3-dihydroxy-isovalerate by acetohydroxyacid reductoisomerase (ILV5), conversion of 2,3-dihydroxy-isovalerate to 2-keto-isovalerate by dihydroxy-acid dehydratase (ILV3), and conversion of 2-keto-isovalerate to valine by branched-chain amino acid transaminase (BAT2) and branched-chain amino acid aminotransferase (BAT1). Biosynthesis of leucine includes the same steps to 2-keto-isovalerate, followed by conversion of 2-keto-isovalerate to alpha-isopropylmalate by alpha-isopropylmalate synthase (LEU9, LEU4), conversion of alpha-isopropylmalate to beta-isopropylmalate by isopropylmalate isomerase (LEU1), conversion of beta-isopropylmalate to alpha-ketoisocaproate by beta-IPM dehydrogenase (LEU2), and finally conversion of alpha-ketoisocaproate to leucine by branched-chain amino acid transaminase (BAT2) and branched-chain amino acid aminotransferase (BAT1). It is desired for production of valine or leucine to overexpress at least one of the enzymes in these described pathways.
An example of a biosynthetic pathway for production of isoamyl alcohol that can be engineered in the recombinant host cells described herein includes the steps of leucine conversion to alpha-ketoisocaproate by branched-chain amino acid transaminase (BAT2) and branched-chain amino acid aminotransferase (BAT1), conversion of alpha-ketoisocaproate to 3-methylbutanal by ketoisocaproate decarboxylase (THI3) or decarboxylase ARO10, and finally conversion of 3-methylbutanal to isoamyl alcohol by an alcohol dehydrogenase such as ADH1 or SFA1. Production of isoamyl alcohol benefits from increased production of leucine or the alpha-ketoisocaproate intermediate by overexpression of one or more enzymes in biosynthetic pathways for these chemicals. In addition, one or both enzymes for the final two steps can be overexpressed.
An example of a biosynthetic pathway for production of lactic acid that can be engineered in the recombinant host cells described herein includes pyruvate conversion to lactic acid by lactate dehydrogenase. Engineering yeast for lactic acid production using lactate dehydrogenase, known as EC 1.1.1.27, is well known in the art such as in Ishida et al. (Appl. Environ. Microbiol. 71:1964-70 (2005)).
An example of a biosynthetic pathway for production of alanine that can be engineered in the recombinant host cells described herein includes pyruvate conversion to alanine by aminotransferase.
An example of a biosynthetic pathway for production of malate that can be engineered in the recombinant host cells described herein includes pyruvate conversion to oxaloacetate by pyruvate carboxylase, and conversion of oxaloacetate to malate by malate dehydrogenase as described in Zelle et al. (Applied and Environmental Microbiology 74:2766-77 (2008)). In addition, a malate transporter can be expressed.
An example of a biosynthetic pathway for production of fumarate that can be engineered in the recombinant host cells described herein includes pyruvate conversion to oxaloacetate by pyruvate carboxylase, and conversion of oxaloacetate to malate by malate dehydrogenase as described in Zelle et al. (Applied and Environmental Microbiology 74:2766-77 (2008)). In addition, a fumarase and a fumarate transporter can be expressed. Favorable production conditions and engineering of fungi for fumarate production are well known in the art, described, e.g., by Goldberg et al. (Journal of Chemical Technology and Biotechnology 81:1601-1611 (2006)).
An example of a biosynthetic pathway for production of succinate that can be engineered in the recombinant host cells described herein includes pyruvate conversion to oxaloacetate by pyruvate carboxylase, and conversion of oxaloacetate to malate by malate dehydrogenase as described in Zelle et al. (Applied and Environmental Microbiology 74:2766-77 (2008)). In addition, a fumarase, a succinate dehydrogenase and a succinate transporter can be expressed.
The skilled person will appreciate that polypeptides having activities of the above-mentioned biosynthetic pathways, isolated from a variety of sources, can be used in the recombinant host cells described herein.
It will be appreciated that host cells comprising a butanol biosynthetic pathway such as an isobutanol biosynthetic pathway as provided herein may further comprise one or more additional modifications. U.S. Appl. Pub. No. 20090305363 (incorporated by reference) discloses increased conversion of pyruvate to acetolactate by engineering yeast for expression of a cytosol-localized acetolactate synthase and substantial elimination of pyruvate decarboxylase activity. Modifications to reduce glycerol-3-phosphate dehydrogenase activity and/or disruption in at least one gene encoding a polypeptide having pyruvate decarboxylase activity or a disruption in at least one gene encoding a regulatory element controlling pyruvate decarboxylase gene expression as described in U.S. Patent Appl. Pub. No. 20090305363 (incorporated herein by reference), modifications to a host cell that provide for increased carbon flux through an Entner-Doudoroff Pathway or reducing equivalents balance as described in U.S. Patent Appl. Pub. No. 20100120105 (incorporated herein by reference). Other modifications include integration of at least one polynucleotide encoding a polypeptide that catalyzes a step in a pyruvate-utilizing biosynthetic pathway. Other modifications include at least one deletion, mutation, and/or substitution in an endogenous polynucleotide encoding a polypeptide having acetolactate reductase activity. In embodiments, the polypeptide having acetolactate reductase activity is YMR226C (SEQ ID NO: 1912) of Saccharomyces cerevisae or a homolog thereof. Additional modifications include a deletion, mutation, and/or substitution in an endogenous polynucleotide encoding a polypeptide having aldehyde dehydrogenase and/or aldehyde oxidase activity. In embodiments, the polypeptide having aldehyde dehydrogenase activity is ALD6 (SEQ ID NO 1909) from Saccharomyces cerevisiae or a homolog thereof. 
A genetic modification which has the effect of reducing glucose repression wherein the yeast production host cell is pdc- is described in U.S. Appl. Publication No. 20110124060, incorporated herein by reference.
Recombinant host cells may further comprise (a) at least one heterologous polynucleotide encoding a polypeptide having dihydroxy-acid dehydratase activity; and (b)(i) at least one deletion, mutation, and/or substitution in an endogenous gene encoding a polypeptide affecting Fe-S cluster biosynthesis; and/or (ii) at least one heterologous polynucleotide encoding a polypeptide affecting Fe-S cluster biosynthesis. In embodiments, the polypeptide affecting Fe-S cluster biosynthesis is encoded by AFT1 (nucleic acid SEQ ID NO: 1913, amino acid SEQ ID NO: 1914), AFT2 (SEQ ID NOs: 1915 and 1916), FRA2 (SEQ ID NOs: 1917 and 1918), GRX3 (SEQ ID NOs: 1919 and 1920), or CCC1 (SEQ ID NOs: 1921 and 1922). In embodiments, the polypeptide affecting Fe-S cluster biosynthesis is constitutive mutant AFT1 L99A, AFT1 L102A, AFT1 C291F, or AFT1 C293F.
The recombinant host cells disclosed herein can be grown in fermentation media for production of a product utilizing pyruvate. For maximal production of some products, such as 2,3-butanediol, isobutanol, 2-butanone, or 2-butanol, the recombinant host cells disclosed herein used as production hosts preferably have enhanced tolerance to the produced chemical, and have a high rate of carbohydrate utilization. These characteristics can be conferred by mutagenesis and selection, genetic engineering, or can be natural.
Fermentation media for production of the products disclosed herein may contain glucose. Additional carbon substrates for product production pathways can include but are not limited to those described above. It is contemplated that the source of carbon utilized can encompass a wide variety of carbon containing substrates and will only be limited by the choice of organism.
In addition to an appropriate carbon source, fermentation media can contain suitable minerals, salts, cofactors, buffers and other components, known to those skilled in the art, suitable for the growth of the cultures and promotion of the enzymatic pathway necessary for production of the desired product.
Typically cells are grown at a temperature in the range of about 20° C. to about 37° C. in an appropriate medium. Suitable growth media for the recombinant host cells described herein are common commercially prepared media such as broth that includes yeast nitrogen base, ammonium sulfate, and dextrose (as the carbon/energy source) or YPD Medium, a blend of peptone, yeast extract, and dextrose in optimal proportions for growing most Saccharomyces cerevisiae strains. Other defined or synthetic growth media may also be used and the appropriate medium for growth of the particular microorganism will be known by one skilled in the art of microbiology or fermentation science.
Suitable pH ranges for the fermentation are between pH 3.0 and pH 7.5, where pH 4.5 to pH 6.5 is preferred as the initial condition.
Fermentations can be performed under aerobic or anaerobic conditions, where anaerobic or microaerobic conditions are preferred.
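By way of illustration only, the temperature, pH and aeration preferences stated above can be collected into a simple check of a planned fermentation. The following is a minimal sketch; the class name, function name and note wording are hypothetical and are not part of the disclosed methods. Only the numerical ranges are taken from the text.

```python
from dataclasses import dataclass

# Ranges quoted from the text; all names below are illustrative, not from the source.
TEMP_RANGE_C = (20.0, 37.0)          # "about 20 C to about 37 C"
PH_RANGE = (3.0, 7.5)                # suitable pH range
PREFERRED_INITIAL_PH = (4.5, 6.5)    # preferred initial condition
AERATION_MODES = ("aerobic", "microaerobic", "anaerobic")


@dataclass
class FermentationSetup:
    temperature_c: float
    initial_ph: float
    aeration: str


def check_setup(setup: FermentationSetup) -> list:
    """Return notes on departures from the stated ranges (empty list if none)."""
    if setup.aeration not in AERATION_MODES:
        raise ValueError(f"unknown aeration mode: {setup.aeration!r}")
    notes = []
    if not TEMP_RANGE_C[0] <= setup.temperature_c <= TEMP_RANGE_C[1]:
        notes.append("temperature outside typical range")
    if not PH_RANGE[0] <= setup.initial_ph <= PH_RANGE[1]:
        notes.append("pH outside suitable range")
    elif not PREFERRED_INITIAL_PH[0] <= setup.initial_ph <= PREFERRED_INITIAL_PH[1]:
        notes.append("initial pH outside preferred range")
    if setup.aeration == "aerobic":
        notes.append("aerobic is permitted but not preferred")
    return notes


print(check_setup(FermentationSetup(30.0, 5.5, "microaerobic")))
```

Because the temperature range is stated as approximate ("about"), the hard bounds used here are a simplification.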
The amount of product in the fermentation medium can be determined using a number of methods known in the art, for example, high performance liquid chromatography (HPLC) or gas chromatography (GC).
A batch method of fermentation can be used with the recombinant host cells described herein. A classical batch fermentation is a closed system where the composition of the medium is set at the beginning of the fermentation and not subject to artificial alterations during the fermentation. Thus, at the beginning of the fermentation the medium is inoculated with the desired organism or organisms, and fermentation is permitted to occur without adding anything to the system. Typically, however, a “batch” fermentation is batch with respect to the addition of carbon source and attempts are often made at controlling factors such as pH and oxygen concentration. In batch systems the metabolite and biomass compositions of the system change constantly up to the time the fermentation is stopped. Within batch cultures cells progress through a static lag phase to a high growth log phase and finally to a stationary phase where growth rate is diminished or halted. If untreated, cells in the stationary phase will eventually die. Cells in log phase generally are responsible for the bulk of production of end product or intermediate.
A Fed-Batch system can also be used with the recombinant host cells described herein. A Fed-Batch system is similar to a typical batch system with the exception that the carbon source substrate is added in increments as the fermentation progresses. Fed-Batch systems are useful when catabolite repression (e.g. glucose repression) is apt to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the media. Measurement of the actual substrate concentration in Fed-Batch systems is difficult and is therefore estimated on the basis of the changes of measurable factors such as pH, dissolved oxygen and the partial pressure of waste gases such as CO2. Batch and Fed-Batch fermentations are common and well known in the art and examples may be found in Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition (1989) Sinauer Associates, Inc., Sunderland, Mass., or Deshpande, Mukund V., Appl. Biochem. Biotechnol., 36:227, (1992), herein incorporated by reference.
Although a batch mode can be performed, it is also contemplated that continuous fermentation methods could also be performed with the recombinant host cells described herein. Continuous fermentation is an open system where a defined fermentation medium is added continuously to a bioreactor and an equal amount of conditioned media is removed simultaneously for processing. Continuous fermentation generally maintains the cultures at a constant high density where cells are primarily in log phase growth.
Continuous fermentation allows for the modulation of one factor or any number of factors that affect cell growth or end product concentration. For example, one method will maintain a limiting nutrient such as the carbon source or nitrogen level at a fixed rate and allow all other parameters to vary. In other systems a number of factors affecting growth can be altered continuously while the cell concentration, measured by media turbidity, is kept constant. Continuous systems strive to maintain steady state growth conditions and thus the cell loss due to the medium being drawn off must be balanced against the cell growth rate in the fermentation. Methods of modulating nutrients and growth factors for continuous fermentation processes as well as techniques for maximizing the rate of product formation are well known in the art of industrial microbiology and a variety of methods are detailed by Brock, supra.
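The balance between cell loss and cell growth described above can be illustrated with the standard relation for a well-mixed continuous culture, in which the dilution rate D equals the medium flow rate divided by the working volume and, at steady state, the specific growth rate equals D. The following is a minimal sketch under that idealized assumption; the function names and the numerical values are hypothetical and are not taken from the Examples.

```python
def dilution_rate(feed_rate_l_per_h: float, working_volume_l: float) -> float:
    """Dilution rate D (1/h) of a well-mixed continuous culture."""
    if working_volume_l <= 0:
        raise ValueError("working volume must be positive")
    return feed_rate_l_per_h / working_volume_l


def is_washout(dilution_rate_per_h: float, mu_max_per_h: float) -> bool:
    """True when cells are drawn off at least as fast as they can grow."""
    return dilution_rate_per_h >= mu_max_per_h


# Hypothetical operating point: 0.1 L/h fed to a 1 L working volume.
D = dilution_rate(0.1, 1.0)
# Hypothetical maximum specific growth rate of the culture (1/h).
MU_MAX = 0.19
print(D, is_washout(D, MU_MAX))
```

In this sketch a steady state is possible only when D is below the maximum specific growth rate of the organism; at or above that value the culture is washed out of the vessel.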
It is contemplated that the present invention can be practiced using either batch, fed-batch or continuous processes and that any known mode of fermentation would be suitable. Additionally, it is contemplated that cells can be immobilized on a substrate as whole cell catalysts and subjected to fermentation conditions for production.
Product Isolation from Fermentation Medium
Products can be isolated from the fermentation medium by methods known to one skilled in the art. For example, bioproduced isobutanol may be isolated from the fermentation medium using methods known in the art for ABE fermentations (see, e.g., Durre, Appl. Microbiol. Biotechnol. 49:639-648 (1998), Groot et al., Process. Biochem. 27:61-75 (1992), and references therein). For example, solids may be removed from the fermentation medium by centrifugation, filtration, decantation, or the like. Then, the isobutanol may be isolated from the fermentation medium using methods such as distillation, azeotropic distillation, liquid-liquid extraction, adsorption, gas stripping, membrane evaporation, pervaporation or vacuum flash fermentation (see e.g., U.S. Pub. No. 20090171129 A1, and International Pub. No. WO2010/151832 A1, both incorporated herein by reference in their entirety).
Because butanol forms a low boiling point, azeotropic mixture with water, distillation can be used to separate the mixture up to its azeotropic composition. Distillation may be used in combination with another separation method to obtain separation around the azeotrope. Methods that may be used in combination with distillation to isolate and purify butanol include, but are not limited to, decantation, liquid-liquid extraction, adsorption, and membrane-based techniques. Additionally, butanol may be isolated using azeotropic distillation using an entrainer (see, e.g., Doherty and Malone, Conceptual Design of Distillation Systems, McGraw Hill, N.Y., 2001).
The butanol-water mixture forms a heterogeneous azeotrope so that distillation may be used in combination with decantation to isolate and purify the isobutanol. In this method, the isobutanol containing fermentation broth is distilled to near the azeotropic composition. Then, the azeotropic mixture is condensed, and the isobutanol is separated from the fermentation medium by decantation. The decanted aqueous phase may be returned to the first distillation column as reflux. The isobutanol-rich decanted organic phase may be further purified by distillation in a second distillation column.
The butanol can also be isolated from the fermentation medium using liquid-liquid extraction in combination with distillation. In this method, the isobutanol is extracted from the fermentation broth using liquid-liquid extraction with a suitable solvent. The isobutanol-containing organic phase is then distilled to separate the butanol from the solvent.
Distillation in combination with adsorption can also be used to isolate isobutanol from the fermentation medium. In this method, the fermentation broth containing the isobutanol is distilled to near the azeotropic composition and then the remaining water is removed by use of an adsorbent, such as molecular sieves (Aden et al., Lignocellulosic Biomass to Ethanol Process Design and Economics Utilizing Co-Current Dilute Acid Prehydrolysis and Enzymatic Hydrolysis for Corn Stover, Report NREL/TP-510-32438, National Renewable Energy Laboratory, June 2002).
Additionally, distillation in combination with pervaporation may be used to isolate and purify the isobutanol from the fermentation medium. In this method, the fermentation broth containing the isobutanol is distilled to near the azeotropic composition, and then the remaining water is removed by pervaporation through a hydrophilic membrane (Guo et al., J. Membr. Sci. 245, 199-210 (2004)).
In situ product removal (ISPR) (also referred to as extractive fermentation) can be used to remove butanol (or other fermentative alcohol) from the fermentation vessel as it is produced, thereby allowing the microorganism to produce butanol at high yields. One method for ISPR for removing fermentative alcohol that has been described in the art is liquid-liquid extraction. In general, with regard to butanol fermentation, for example, the fermentation medium, which includes the microorganism, is contacted with an organic extractant at a time before the butanol concentration reaches a toxic level. The organic extractant and the fermentation medium form a biphasic mixture. The butanol partitions into the organic extractant phase, decreasing the concentration in the aqueous phase containing the microorganism, thereby limiting the exposure of the microorganism to the inhibitory butanol.
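The effect of partitioning on the aqueous butanol concentration can be illustrated with a single-stage equilibrium mass balance. The following is a minimal sketch assuming a constant partition coefficient K (organic-phase concentration divided by aqueous-phase concentration) and complete equilibration between the phases; the function name and all numerical values, including K, are hypothetical and are not measured values from this disclosure.

```python
def aqueous_concentration(total_butanol_g: float,
                          v_aq_l: float,
                          v_org_l: float,
                          k_partition: float) -> float:
    """Equilibrium aqueous butanol concentration (g/L) in a biphasic mixture.

    Mass balance: total = C_aq * V_aq + C_org * V_org, with C_org = K * C_aq,
    so C_aq = total / (V_aq + K * V_org).
    """
    return total_butanol_g / (v_aq_l + k_partition * v_org_l)


# Hypothetical case: 20 g butanol produced in 1 L broth.
without_extractant = aqueous_concentration(20.0, 1.0, 0.0, 4.0)
# Same butanol with 0.5 L extractant and an assumed K of 4.
with_extractant = aqueous_concentration(20.0, 1.0, 0.5, 4.0)
print(without_extractant, with_extractant)
```

Under these assumed values the extractant lowers the aqueous concentration to one third of its value without extractant, which is the mechanism by which exposure of the microorganism to butanol is limited.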
Liquid-liquid extraction can be performed, for example, according to the processes described in U.S. Patent Appl. Pub. No. 20090305370, the disclosure of which is hereby incorporated in its entirety. U.S. Patent Appl. Pub. No. 20090305370 describes methods for producing and recovering butanol from a fermentation broth using liquid-liquid extraction, the methods comprising the step of contacting the fermentation broth with a water immiscible extractant to form a two-phase mixture comprising an aqueous phase and an organic phase. Typically, the extractant can be an organic extractant selected from the group consisting of saturated, mono-unsaturated, poly-unsaturated (and mixtures thereof) C12 to C22 fatty alcohols, C12 to C22 fatty acids, esters of C12 to C22 fatty acids, C12 to C22 fatty aldehydes, and mixtures thereof. The extractant(s) for ISPR can be non-alcohol extractants. The ISPR extractant can be an exogenous organic extractant such as oleyl alcohol, behenyl alcohol, cetyl alcohol, lauryl alcohol, myristyl alcohol, stearyl alcohol, 1-undecanol, oleic acid, lauric acid, myristic acid, stearic acid, methyl myristate, methyl oleate, undecanal, lauric aldehyde, 2-methylundecanal, and mixtures thereof.
In some embodiments, the alcohol can be esterified by contacting the alcohol in a fermentation medium with an organic acid (e.g., fatty acids) and a catalyst (e.g. enzyme such as a lipase) capable of esterifying the alcohol with the organic acid. In such embodiments, the organic acid can serve as an ISPR extractant into which the alcohol esters partition. The organic acid can be supplied to the fermentation vessel and/or derived from the biomass supplying fermentable carbon fed to the fermentation vessel. Lipids present in the feedstock can be catalytically hydrolyzed to organic acid, and the same catalyst (e.g., enzymes) can esterify the organic acid with the alcohol. The catalyst can be supplied to the feedstock prior to fermentation, or can be supplied to the fermentation vessel before or contemporaneously with the supplying of the feedstock. When the catalyst is supplied to the fermentation vessel, alcohol esters can be obtained by hydrolysis of the lipids into organic acid and substantially simultaneous esterification of the organic acid with butanol present in the fermentation vessel. Organic acid and/or native oil not derived from the feedstock can also be fed to the fermentation vessel, with the native oil being hydrolyzed into organic acid. Any organic acid not esterified with the alcohol can serve as part of the ISPR extractant. The extractant containing alcohol esters can be separated from the fermentation medium, and the alcohol can be recovered from the extractant. The extractant can be recycled to the fermentation vessel. Thus, in the case of butanol production, for example, the conversion of the butanol to an ester reduces the free butanol concentration in the fermentation medium, shielding the microorganism from the toxic effect of increasing butanol concentration.
In addition, unfractionated grain can be used as feedstock without separation of lipids therein, since the lipids can be catalytically hydrolyzed to organic acid, thereby decreasing the rate of build-up of lipids in the ISPR extractant.
In situ product removal can be carried out in a batch mode or a continuous mode. In a continuous mode of in situ product removal, product is continually removed from the reactor. In a batchwise mode of in situ product removal, a volume of organic extractant is added to the fermentation vessel and the extractant is not removed during the process. For in situ product removal, the organic extractant can contact the fermentation medium at the start of the fermentation forming a biphasic fermentation medium. Alternatively, the organic extractant can contact the fermentation medium after the microorganism has achieved a desired amount of growth, which can be determined by measuring the optical density of the culture. Further, the organic extractant can contact the fermentation medium at a time at which the product alcohol level in the fermentation medium reaches a preselected level. In the case of butanol production according to some embodiments of the present invention, the organic acid extractant can contact the fermentation medium at a time before the butanol concentration reaches a toxic level, so as to esterify the butanol with the organic acid to produce butanol esters and consequently reduce the concentration of butanol in the fermentation vessel. The ester-containing organic phase can then be removed from the fermentation vessel (and separated from the fermentation broth which constitutes the aqueous phase) after a desired effective titer of the butanol esters is achieved. In some embodiments, the ester-containing organic phase is separated from the aqueous phase after fermentation of the available fermentable sugar in the fermentation vessel is substantially complete.
The meaning of abbreviations used is as follows: “min” means minute(s), “h” means hour(s), “sec” means second(s), “μl” means microliter(s), “ml” means milliliter(s), “L” means liter(s), “nm” means nanometer(s), “mm” means millimeter(s), “cm” means centimeter(s), “μm” means micrometer(s), “mM” means millimolar, “M” means molar, “mmol” means millimole(s), “μmole” means micromole(s), “g” means gram(s), “μg” means microgram(s), “mg” means milligram(s), “rpm” means revolutions per minute, “w/v” means weight/volume, “v/v” means volume/volume, “OD” means optical density, “bp” means base pair(s), and “PCR” means polymerase chain reaction.
Standard recombinant DNA and molecular cloning techniques used in the Examples are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989, by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1984, and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, Greene Publishing Assoc. and Wiley-Interscience, N.Y., 1987. Phusion® HF Master Mix (NEB Cat. No. F-531) and HotStarTaq® Master Mix (Qiagen Cat. No. 203443) were used for PCR in gene cloning and clone screening, respectively.
Materials and methods suitable for the maintenance and growth of bacterial cultures are also well known in the art. Techniques suitable for use in the following Examples may be found in Manual of Methods for General Bacteriology, Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, eds., American Society for Microbiology, Washington, D.C., 1994, or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland, Mass., 1989. All reagents, restriction enzymes and materials used for the growth and maintenance of bacterial cells were obtained from Aldrich Chemicals (Milwaukee, Wis.), BD Diagnostic Systems (Sparks, Md.), Life Technologies (Rockville, Md.), or Sigma Chemical Company (St. Louis, Mo.), unless otherwise specified.
Analysis for fermentation by-product composition is well known to those skilled in the art. For example, one high performance liquid chromatography (HPLC) method utilizes a Shodex SH-1011 column with a Shodex SH-G guard column (both available from Waters Corporation, Milford, Mass.), with refractive index (RI) detection. Chromatographic separation is achieved using 0.01 M H2SO4 as the mobile phase with a flow rate of 0.5 mL/min and a column temperature of 50° C. Isobutanol retention time is 47.6 minutes. For butanediol, meso-butanediol eluted at 26.0 min and 2R,3R-butanediol eluted at 27.7 min.
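For illustration, the retention times stated above can be used to assign observed chromatogram peaks to compounds. The following is a minimal sketch; the retention times are those given in the text, whereas the function name and the matching tolerance of 0.3 min are hypothetical choices and would need to be set from the run-to-run variability of the actual instrument.

```python
# Retention times (min) as stated in the text for the Shodex SH-1011 method.
RETENTION_MIN = {
    "meso-butanediol": 26.0,
    "2R,3R-butanediol": 27.7,
    "isobutanol": 47.6,
}


def assign_peak(rt_min: float, tolerance_min: float = 0.3):
    """Return the compound whose stated retention time is nearest to rt_min,
    or None if no compound lies within the (assumed) tolerance."""
    nearest = min(RETENTION_MIN, key=lambda name: abs(RETENTION_MIN[name] - rt_min))
    if abs(RETENTION_MIN[nearest] - rt_min) <= tolerance_min:
        return nearest
    return None


# Hypothetical observed peaks from a single run.
for rt in (26.1, 27.6, 35.0, 47.5):
    print(rt, assign_peak(rt))
```

The tolerance is deliberately smaller than half the 1.7 min spacing between the two butanediol isomers, so that a peak cannot be matched to both.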
Construction of phosphoketolase/phosphotransacetylase expression cassette. The xpk1 and eutD genes (GenBank GI numbers 28379168 (SEQ ID NO: 172) and 28377658 (SEQ ID NO: 1111), respectively) were obtained from Lactobacillus plantarum (ATCC No. BAA-793) via polymerase chain reaction (PCR) using primers N1039 and N1040 (for xpk1) and N1041 and N1042 (for eutD). The primer sequences of N1039, N1040, N1041 and N1042 correspond to SEQ ID Nos. 639-642, respectively.
The xpk1 and eutD genes were fused to a DNA fragment containing opposing yeast terminator sequences (CYC and ADH terminators, obtained from PacI digestion of pRS423::CUP1-alsS+FBA-budA, described in U.S. Patent Application Publication No. 20090155870, herein incorporated by reference) by overlap PCR method (Yu et al., Fungal Genet. Biol. 41: 973-981; 2003). The resulting PCR product was cloned into an E. coli-yeast shuttle vector using gap repair methodology (Ma et al., Genetics 58:201-216; 1981). The shuttle vector was based on pRS426 (ATCC No. 77107) and contained both GPD (also known as TDH3) and ADH1 promoters. The resulting vector contained xpk1 under control of the GPD promoter and eutD under control of the ADH1 promoter in opposing orientation. The sequence of the resulting vector (pRS426::GPD-xpk1+ADH1-eutD) is provided as SEQ ID No: 643 (see
An expression cassette of the pRS426::GPD-xpk1+ADH1-eutD vector (GPD-xpk1+ADH1-eutD) was prepared by digestion with EcoRI and SacI restriction enzymes. The resulting cassette was ligated into the yeast integration vector pUC19-URA3-MCS which was also prepared by digestion with EcoRI and SacI restriction enzymes.
Vector pUC19-URA3MCS is pUC19-based and contains the sequence of the URA3 gene from Saccharomyces cerevisiae situated within a multiple cloning site (MCS). pUC19 (American Type Culture Collection, Manassas, Va.; ATCC#37254) contains the pMB1 replicon and a gene coding for beta-lactamase for replication and selection in Escherichia coli. In addition to the coding sequence for URA3, the sequences from upstream and downstream of this gene are included for expression of the URA3 gene in yeast. The vector can be used for cloning purposes and can be used as a yeast integration vector.
The DNA encompassing the URA3 coding region along with 250 bp upstream and 150 bp downstream of the URA3 coding region from Saccharomyces cerevisiae CEN.PK 113-7D (CBS 8340; Centraalbureau voor Schimmelcultures (CBS) Fungal Biodiversity Centre, Netherlands) genomic DNA was amplified with primers oBP438 (SEQ ID NO: 644), containing BamHI, AscI, PmeI, and FseI restriction sites, and oBP439 (SEQ ID NO: 645), containing XbaI, PacI, and NotI restriction sites. Genomic DNA was prepared using a Gentra Puregene Yeast/Bact kit (Qiagen). The PCR product and pUC19 were ligated with T4 DNA ligase after digestion with BamHI and XbaI to create vector pUC19-URA3MCS. The vector was confirmed by PCR and sequencing with primers oBP264 (SEQ ID NO:646) and oBP265 (SEQ ID NO:647).
The ligation reaction was transformed into E. coli Stbl3 cells, according to the manufacturer's instructions (Invitrogen, Carlsbad, Calif., Cat. No. C7373). Transformants were screened by polymerase chain reaction (PCR) to detect the eutD gene using the primers N1041 and N1042 (SEQ ID NOs: 641 and 642, respectively). Positive clones for eutD gene expression detected by PCR were further confirmed for eutD gene incorporation by digestion of the vector with SacII restriction enzyme.
Two confirmed clones were selected and an integration targeting sequence was added to the clones as follows. PCR was used to amplify regions of the genome of S. cerevisiae strain BY4700 (ATCC No. 200866) both 5′ and 3′ of the PDC1 gene using the following primers: N1049 and N1050 (5′) and N1047 and N1048 (3′) (SEQ ID NOs: 648-651, respectively). Primer N1049 enables the 3′ end of the 161-bp PDC1 3′ sequence to be fused to the 5′ end of the 237 bp PDC1 5′ sequence via PCR. This pdc1 3′-5′-fusion fragment (368 bp in length) was cloned into the pCRII-Blunt TOPO vector according to the manufacturer's instructions (Invitrogen, Carlsbad, Calif., Cat. No. K2800).
Transformants were screened by PCR to detect the pdc1 3′-5′-fusion fragment using primers N1047 and N1050. The pdc1 3′-5′-fusion fragment was isolated from positive clones and released from the vector by digestion with EcoRI enzyme, and ligated into a pUC19-URA3::GPD-xpk1+ADH-eutD vector that had been linearized by digestion with EcoRI restriction enzyme to generate the “phosphoketolase pathway” vector. Additionally, the pdc1 3′-5′-fusion fragment was ligated with pUC19-URA3-MCS digested with EcoRI restriction enzyme to generate the control vector. Both ligation reactions were transformed into E. coli Stbl3 cells according to the manufacturer's instructions (Invitrogen, Carlsbad, Calif., Cat. No. C7373). The resulting transformants were screened by PCR to detect the pdc1 3′-5′-fusion fragment using primers N1047 and N1050. Positive clones containing the pdc1 3′-5′-fusion fragment were identified and the vectors were digested with either NcoI restriction enzyme (control vector) or BsgI restriction enzyme (phosphoketolase pathway vector) to confirm cloning orientation. One control clone (=pUC19-URA3::pdc1) and one phosphoketolase pathway clone (=pUC19-URA3::pdc1::GPD-xpk1+ADH1-eutD; SEQ ID NO: 1898) were selected for integration.
The control and phosphoketolase pathway vectors described in Example 2 were linearized with AflII restriction enzyme and transformed into strain BP913 (CEN.PK113-7D Δura3::loxP Δhis3 Δpdc6 Δpdc1::ilvD(Sm) Δpdc5::sadB) to form control and phosphoketolase pathway strains. Strain BP913 is further described in Example 10.
Transformed cells were plated on synthetic complete medium without uracil containing ethanol as the sole carbon source (1% vol/vol) and screened by PCR using primers N238 and oBP264 (SEQ ID Nos. 652 and 646, respectively) to confirm integration at the pdc1 locus. Integration at the Δpdc1::ilvD(Sm) locus resulted in the loss of ilvD(Sm).
Pyruvate decarboxylase knockout (PDC-KO) yeast strains are unable to grow in media containing 2% glucose as the sole carbon source, but can grow in 2% glucose supplemented with ethanol, as shown with a strain transformed with one or more plasmids encoding members of the butanediol pathway (described in U.S. Patent Application Publication No. 20090305363, herein incorporated by reference). To test whether the introduction of the phosphoketolase and phosphotransacetylase genes could support growth of PDC-KO cells, PDC-KO yeast were transformed with the phosphoketolase and phosphotransacetylase genes (as described in Example 3) and with the vector pRS423::CUP1-alsS+FBA-budA (described in U.S. Patent Application Publication No. 20090155870, herein incorporated by reference) encoding members of the butanediol pathway. After cultivation in media containing 2% glucose (synthetic complete minus his and ura) supplemented with 0.05% v/v ethanol, cultures were diluted into the same media lacking ethanol (starting OD=0.1, 20 ml medium in a 125 ml vented flask). For comparison, a control PDC-KO strain without introduction of the phosphoketolase and phosphotransacetylase genes was also diluted into medium supplemented with ethanol (0.05% vol/vol). The optical density at 600 nm was measured during growth (results shown in
The growth of PDC-KO yeast transformed with phosphoketolase and phosphotransacetylase in media that was not supplemented with ethanol (xpkA-xpkC, representing n=3 results) was indistinguishable from the growth of PDC-KO yeast strains grown in media containing 2% glucose that was supplemented with ethanol (cont A-cont C w/EtOH, representing n=3 results). The average growth rate of the phosphoketolase- and phosphotransacetylase-transformed strains under these conditions was 0.19 h−1. A growth rate of 0.23 h−1 for the phosphoketolase- and phosphotransacetylase-transformed strains was observed upon culturing under the same conditions with higher aeration (data not shown). PDC-KO yeast strains grown in media containing 2% glucose that was not supplemented with ethanol showed some growth in the first 16 hours, but then grew at a rate of only 0.01 h−1 (control A-control C, representing n=3 results).
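Growth rates of the kind reported above are commonly obtained as the slope of the natural logarithm of optical density against time during exponential growth. The following is a minimal sketch of that calculation, assuming purely exponential growth over the fitted interval; the function names are hypothetical and the optical density series is synthetic (generated from a rate of 0.19 h−1 for illustration), not measured data from the Examples.

```python
import math


def specific_growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) versus time, in 1/h."""
    ys = [math.log(v) for v in od600]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(ys) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, ys))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    return sxy / sxx


def doubling_time_h(mu_per_h):
    """Doubling time (h) corresponding to a specific growth rate."""
    return math.log(2) / mu_per_h


# Synthetic series: starting OD of 0.1 growing at 0.19 1/h (illustrative only).
times = [0.0, 2.0, 4.0, 6.0, 8.0]
ods = [0.1 * math.exp(0.19 * t) for t in times]

mu = specific_growth_rate(times, ods)
print(round(mu, 3), round(doubling_time_h(mu), 2))
```

Expressed as doubling times, a rate of 0.19 h−1 corresponds to about 3.6 h, 0.23 h−1 to about 3.0 h, and 0.01 h−1 to about 69 h.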
The integration vector described above (pUC19-URA3::pdc1::GPD-xpk1+ADH1-eutD) was modified to eliminate either the xpk1 phosphoketolase gene or the eutD phosphotransacetylase gene. Specifically, to remove eutD, the integration vector was digested with the ClaI and SpeI restriction enzymes to remove a 0.6 kb region from the eutD coding sequence, forming the vector pUC19-URA3::pdc1::GPD-xpk1. To remove xpk1, the integration vector was digested with the SpeI and KpnI restriction enzymes to remove the 3.4 kb region from SpeI to KpnI, forming the vector pUC19-URA3::pdc1::ADH-eutD. The resulting vectors were linearized by digestion with the AflII restriction enzyme and transformed into BP913/pRS423::CUP1-alsS+FBA-budA cells (described in Example 3). Transformed cells were screened by PCR to confirm integration at the pdc1 locus and cultured, as described above.
To test whether the introduction of either the phosphoketolase or phosphotransacetylase genes could support the growth of PDC-KO cells, PDC-KO yeast were transformed with either the phosphoketolase or phosphotransacetylase genes (as described in Example 5) and with the vector pRS423::CUP1-alsS+FBA-budA encoding members of the butanediol pathway (as described in Example 4). After cultivation in media containing 2% glucose (synthetic complete minus his and ura) supplemented with 0.05% v/v ethanol, cultures were diluted into the same media lacking ethanol (starting OD=0.1, 20 ml medium in a 125 ml vented flask). For comparison, a PDC-KO strain without introduction of the phosphoketolase or phosphotransacetylase genes was grown under the same conditions. The optical density at 600 nm was measured during growth (results shown in
The growth of PDC-KO yeast transformed with phosphoketolase in media that was not supplemented with exogenous carbon substrate (xpk1,
To test the effects of introduction of phosphoketolase into PDC-KO cells on glucose consumption and butanediol yield, PDC-KO yeast were transformed with either (1) phosphoketolase and phosphotransacetylase (as described in Example 4) and the vector pRS423::CUP1-alsS+FBA-budA encoding members of the butanediol (BDO) pathway (“Xpk” in Table 6 below), or (2) the vector pRS423::CUP1-alsS+FBA-budA encoding members of the butanediol pathway alone (“Control” in Table 6 below).
After cultivation in medium containing 2% glucose (synthetic complete minus histidine and uracil) supplemented with 0.05% ethanol, Xpk and Control cultures were diluted into medium without ethanol (starting OD=0.1, 20 ml medium in a 125 ml vented flask). Glucose consumption and butanediol yield of Xpk and Control cultures were measured by HPLC analysis of culture media for amount of glucose and butanediol as shown in the Table below.
The glucose consumption of Xpk cells (n=3) was nearly twice the amount of glucose consumption of control strains (n=3). In addition, the butanediol molar yield of Xpk cells was increased compared to the butanediol molar yield of Control cells.
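For illustration, glucose consumption and butanediol molar yield can be derived from HPLC concentrations as sketched below. This is a minimal sketch assuming concentrations expressed in g/L and a constant culture volume; the function names and the numerical inputs are hypothetical and are not the measured values of the Examples. The molar masses used are those of glucose (180.16 g/mol) and 2,3-butanediol (90.12 g/mol).

```python
GLUCOSE_G_PER_MOL = 180.16       # C6H12O6
BUTANEDIOL_G_PER_MOL = 90.12     # C4H10O2 (2,3-butanediol)


def glucose_consumed_g_per_l(initial_g_per_l: float, residual_g_per_l: float) -> float:
    """Glucose consumed, from initial and residual medium concentrations."""
    return initial_g_per_l - residual_g_per_l


def bdo_molar_yield(bdo_g_per_l: float, glucose_consumed: float) -> float:
    """Moles of butanediol formed per mole of glucose consumed."""
    if glucose_consumed <= 0:
        raise ValueError("no glucose consumed; yield is undefined")
    mol_bdo = bdo_g_per_l / BUTANEDIOL_G_PER_MOL
    mol_glucose = glucose_consumed / GLUCOSE_G_PER_MOL
    return mol_bdo / mol_glucose


# Hypothetical readings: 2% (w/v) glucose = 20 g/L at the start.
consumed = glucose_consumed_g_per_l(20.0, 10.0)
yield_mol_per_mol = bdo_molar_yield(3.0, consumed)
print(consumed, round(yield_mol_per_mol, 3))
```

Since one molecule of glucose yields two molecules of pyruvate and one molecule of butanediol is formed from two molecules of pyruvate, a molar yield of 1 is the upper bound for this calculation.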
A phosphoketolase/phosphotransacetylase integration vector similar to the one described in Example 2 was constructed. In this case the xpk1 and eutD gene constructs were cloned so that they would be integrated immediately downstream of the Δpdc1::ilvD(Sm) locus of BP913. To do this, the intergenic region between ilvD(Sm) and TRX1 was amplified from BP913 genomic DNA using primers N1110 and N1111 (SEQ ID Nos. 653 and 654). This was cloned into pUC19-URA3-MCS at the PmeI site, as follows. The ilvD-TRX1 PCR product was phosphorylated with polynucleotide kinase (NEB Cat. No. M0201), the vector was prepared by digesting with PmeI and treating with calf intestinal phosphatase, and the two fragments were ligated overnight and cloned into E. coli Stbl3 cells. Clones were screened by PCR (using N1110 and N1111 primers) and then digested with BsgI to determine the orientation of the ilvD-TRX1 insertion. One clone from each orientation (pUC19-URA3::ilvD-TRX1 A and B) was carried over to the next step: addition of the xpk1/eutD expression cassette. The xpk1/eutD expression cassette from pRS426::GPD-xpk1+ADH1-eutD was obtained by digestion with BglII and EcoRV. The 5′ overhanging DNA was filled in using Klenow Fragment. pUC19-URA3::ilvD-TRX1 was linearized with AflII and the 5′ overhanging DNA was filled in using Klenow fragment. This vector was then ligated with the prepared xpk1/eutD cassette. Ligation reactions were transformed into E. coli Stbl3 cells. Clones were screened using primers for eutD (N1041 and N1042) and then digested with Ban to determine orientation of the xpk1/eutD cassette relative to the ilvD-TRX1 DNA sequence.
The URA3 marker gene was then replaced with a geneticin resistance marker as follows. A chimeric geneticin resistance gene was constructed that contained the Kluyveromyces lactis TEF1 promoter and terminator (TEF1p-kan-TEF1t gene, provided as SEQ ID No. 655). This gene was maintained in a pUC19 vector (cloned at the SmaI site). The kan gene was isolated from pUC19 by first digesting with KpnI, removing the 3′ overhanging DNA using Klenow Fragment (NEB, Cat. No. M212), digesting with HincII and then gel purifying the 1.8 kb gene fragment (Zymoclean™ Gel DNA Recovery Kit, Cat. No. D4001, Zymo Research, Orange, Calif.). The URA3 marker was removed from pUC19-URA3::ilvD::GPD-xpk1+ADH1-eutD::TRX1 (paragraph above) using NsiI and NaeI (the 3′ overhanging DNA from NsiI digestion was removed with Klenow fragment). The vector and kan gene were ligated overnight and transformed into E. coli Stbl3 cells. Clones were screened by PCR using primers BK468 and either N1090 or N1113 (SEQ ID Nos. 656, 657, and 658, respectively); positive PCR results indicate the presence and orientation of the kan gene. Clones in both orientations were digested with PmeI and transformed into BP913 with selection on yeast extract-peptone medium supplied with 1% (v/v) ethanol as carbon source and 200 μg/ml geneticin (G418). A single transformant was obtained, as confirmed by PCR (primers N886 and oBP264 for the 5′ end; N1090 and oBP512 for the 3′ end; SEQ ID Nos. 659, 646, 657, and 660, respectively).
The strain described in Example 8 was transformed with two plasmids containing genes for an isobutanol pathway, pYZ090 and pYZ067 (SEQ ID NOs: 1892 and 1891).
pYZ090 was constructed to contain a chimeric gene having the coding region of the alsS gene from Bacillus subtilis (nt position 457-2172) expressed from the yeast CUP1 promoter (nt 2-449) and followed by the CYC1 terminator (nt 2181-2430) for expression of ALS, and a chimeric gene having the coding region of the ilvC gene from Lactococcus lactis (nt 3634-4656) expressed from the yeast ILV5 promoter (nt 2433-3626) and followed by the ILV5 terminator (nt 4682-5304) for expression of KARI. pYZ067 was constructed to contain the following chimeric genes: 1) the coding region of the ilvD gene from S. mutans UA159 with a C-terminal Lumio tag (nt 2260-3972) expressed from the yeast FBA1 promoter (nt 1661-2250) followed by the FBA1 terminator (nt 4005-4317) for expression of dihydroxy acid dehydratase, 2) the coding region for horse liver ADH (nt 4680-5807) expressed from the yeast GPM1 promoter (nt 5819-6575) followed by the ADH1 terminator (nt 4356-4671) for expression of alcohol dehydrogenase, and 3) the coding region of the kivD gene from Lactococcus lactis (nt 7175-8821) expressed from the yeast TDH3 promoter (nt 8830-9493) followed by the TDH3 terminator (nt 6582-7161) for expression of ketoisovalerate decarboxylase.
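As a reading aid, the coding-region coordinates quoted above can be checked for internal consistency: each span should contain a whole number of codons. The sketch below is one way a reader might do this; the dictionary keys and function names are illustrative assumptions, and only the coordinates themselves come from the text.

```python
# Coding-region coordinates (1-based, inclusive) as quoted in the paragraph above.
# The labels are illustrative; only the numbers are taken from the text.
CODING_REGIONS = {
    "pYZ090 alsS": (457, 2172),
    "pYZ090 ilvC": (3634, 4656),
    "pYZ067 ilvD-Lumio": (2260, 3972),
    "pYZ067 horse liver ADH": (4680, 5807),
    "pYZ067 kivD": (7175, 8821),
}


def span_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive coordinate span."""
    return end - start + 1


def check_regions(regions):
    """Map each region name to (length in nt, True if length is a multiple of 3)."""
    report = {}
    for name, (start, end) in regions.items():
        length = span_length(start, end)
        report[name] = (length, length % 3 == 0)
    return report


if __name__ == "__main__":
    for name, (length, whole_codons) in check_regions(CODING_REGIONS).items():
        print(f"{name}: {length} nt, whole codons: {whole_codons}")
```

A span whose length is not a multiple of three would suggest a transcription error in the coordinates rather than a property of the plasmid.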
Transformants were obtained on synthetic complete medium lacking uracil and histidine with 1% (v/v) ethanol as carbon source and 100 μg/ml geneticin. Control strains (BP913) were also transformed with the same plasmids and plated without geneticin. A number of transformants were then patched to the same medium containing 2% glucose as carbon source and supplemented with 0.05% (v/v) ethanol. After 36 hours, patches were used to inoculate liquid medium (same composition as the plates). After 48 hours, ODs for both phosphoketolase pathway and control strains were similar (ca. 4-5 OD) and all were subcultured into medium lacking ethanol (i.e. no exogenous two-carbon substrate source). The phosphoketolase cultures grew without ethanol supplementation, similar to ethanol supplemented control strains. Results are shown in
The purpose of this example is to describe the construction of Saccharomyces cerevisiae strain BP913. The strain was derived from CEN.PK 113-7D (CBS 8340; Centraalbureau voor Schimmelcultures (CBS) Fungal Biodiversity Centre, Netherlands) and contains deletions of the following genes: URA3, HIS3, PDC1, PDC5, and PDC6.
Deletions, which completely removed the entire coding sequence, were created by homologous recombination with PCR fragments containing regions of homology upstream and downstream of the target gene and either a G418 resistance marker or URA3 gene for selection of transformants. The G418 resistance marker, flanked by loxP sites, was removed using Cre recombinase. The URA3 gene was removed by homologous recombination to create a scarless deletion.
In general, the PCR cassette for each scarless deletion was made by combining four fragments, A-B-U-C, by overlapping PCR. The PCR cassette contained a selectable/counter-selectable marker, URA3 (Fragment U), consisting of the native CEN.PK 113-7D URA3 gene, along with the promoter (250 bp upstream of the URA3 gene) and terminator (150 bp downstream of the URA3 gene). Fragments A and C, each 500 bp long, corresponded to the 500 bp immediately upstream of the target gene (Fragment A) and the 3′ 500 bp of the target gene (Fragment C). Fragments A and C were used for integration of the cassette into the chromosome by homologous recombination. Fragment B (500 bp long) corresponded to the 500 bp immediately downstream of the target gene and was used for excision of the URA3 marker and Fragment C from the chromosome by homologous recombination, as a direct repeat of the sequence corresponding to Fragment B was created upon integration of the cassette into the chromosome. Using the PCR product ABUC cassette, the URA3 marker was first integrated into and then excised from the chromosome by homologous recombination. The initial integration deleted the gene, excluding the 3′ 500 bp. Upon excision, the 3′ 500 bp region of the gene was also deleted. For integration of genes using this method, the gene to be integrated was included in the PCR cassette between fragments A and B.
URA3 Deletion
To delete the endogenous URA3 coding region, a ura3::loxP-kanMX-loxP cassette was PCR-amplified from pLA54 template DNA (SEQ ID NO: 661). pLA54 contains the K. lactis TEF1 promoter and kanMX marker, and is flanked by loxP sites to allow recombination with Cre recombinase and removal of the marker. PCR was done using Phusion DNA polymerase and primers BK505 and BK506 (SEQ ID NOs:662 and 663). The URA3 portion of each primer was derived from the 5′ region upstream of the URA3 promoter and 3′ region downstream of the coding region such that integration of the loxP-kanMX-loxP marker resulted in replacement of the URA3 coding region. The PCR product was transformed into CEN.PK 113-7D using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202) and transformants were selected on YPD containing G418 (100 μg/ml) at 30° C. Transformants were screened to verify correct integration by PCR using primers LA468 and LA492 (SEQ ID NOs: 664 and 665) and designated CEN.PK 113-7D Δura3::kanMX.
The four fragments for the PCR cassette for the scarless HIS3 deletion were amplified using Phusion High Fidelity PCR Master Mix (New England BioLabs) and CEN.PK 113-7D genomic DNA as template, prepared with a Gentra Puregene Yeast/Bact kit (Qiagen; Valencia, Calif.). HIS3 Fragment A was amplified with primer oBP452 (SEQ ID NO: 666) and primer oBP453 (SEQ ID NO: 667), containing a 5′ tail with homology to the 5′ end of HIS3 Fragment B. HIS3 Fragment B was amplified with primer oBP454 (SEQ ID NO: 668), containing a 5′ tail with homology to the 3′ end of HIS3 Fragment A, and primer oBP455 (SEQ ID NO: 669), containing a 5′ tail with homology to the 5′ end of HIS3 Fragment U. HIS3 Fragment U was amplified with primer oBP456 (SEQ ID NO: 670), containing a 5′ tail with homology to the 3′ end of HIS3 Fragment B, and primer oBP457 (SEQ ID NO: 671), containing a 5′ tail with homology to the 5′ end of HIS3 Fragment C. HIS3 Fragment C was amplified with primer oBP458 (SEQ ID NO: 672), containing a 5′ tail with homology to the 3′ end of HIS3 Fragment U, and primer oBP459 (SEQ ID NO: 673). PCR products were purified with a PCR Purification kit (Qiagen, Valencia, Calif.). HIS3 Fragment AB was created by overlapping PCR by mixing HIS3 Fragment A and HIS3 Fragment B and amplifying with primers oBP452 (SEQ ID NO: 666) and oBP455 (SEQ ID NO: 669). HIS3 Fragment UC was created by overlapping PCR by mixing HIS3 Fragment U and HIS3 Fragment C and amplifying with primers oBP456 (SEQ ID NO: 670) and oBP459 (SEQ ID NO: 673). The resulting PCR products were purified on an agarose gel followed by a Gel Extraction kit (Qiagen, Valencia, Calif.). The HIS3 ABUC cassette was created by overlapping PCR by mixing HIS3 Fragment AB and HIS3 Fragment UC and amplifying with primers oBP452 (SEQ ID NO: 666) and oBP459 (SEQ ID NO: 673). The PCR product was purified with a PCR Purification kit (Qiagen, Valencia, Calif.).
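The overlapping PCR assembly described above (A and B joined to AB, U and C joined to UC, then AB and UC joined to ABUC) can be illustrated with a simplified string model. This is only a sketch under stated assumptions: the sequences are short hypothetical placeholders rather than HIS3 sequences, only one strand is modeled, and the 4-nt tails stand in for the much longer primer tails used in practice.

```python
def join_by_overlap(left: str, right: str, min_overlap: int = 4) -> str:
    """Join two fragments whose adjoining ends share an identical overlap."""
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError("fragments do not share a sufficient overlap")


# Hypothetical toy fragments (NOT real HIS3 sequences).
A, B, U, C = "ATGCCGTA", "GGATCCTT", "CACGTGAC", "TTAAGCGC"
TAIL = 4  # toy tail length

# Primer tails add the start of the next fragment or the end of the previous one.
a_pcr = A + B[:TAIL]
b_pcr = A[-TAIL:] + B + U[:TAIL]
u_pcr = B[-TAIL:] + U + C[:TAIL]
c_pcr = U[-TAIL:] + C

ab = join_by_overlap(a_pcr, b_pcr)    # Fragment AB
uc = join_by_overlap(u_pcr, c_pcr)    # Fragment UC
cassette = join_by_overlap(ab, uc)    # ABUC cassette

print(cassette == A + B + U + C)
```

The same model applies to the other cassettes in this example, since each follows the A-B-U-C layout with its own primers.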
Competent cells of CEN.PK 113-7D Δura3::kanMX were made and transformed with the HIS3 ABUC PCR cassette using a Frozen-EZ Yeast Transformation II kit (Zymo Research, Orange, Calif.). Transformation mixtures were plated on synthetic complete media lacking uracil supplemented with 2% glucose at 30° C. Transformants with a his3 knockout were screened for by PCR with primers oBP460 (SEQ ID NO: 674) and oBP461 (SEQ ID NO: 675) using genomic DNA prepared with a Gentra Puregene Yeast/Bact kit (Qiagen). A correct transformant was selected as strain CEN.PK 113-7D Δura3::kanMX Δhis3::URA3.
KanMX Marker Removal from the Δura3 Site and URA3 Marker Removal from the Δhis3 Site
The KanMX marker was removed by transforming CEN.PK 113-7D Δura3::kanMX Δhis3::URA3 with pRS423::PGAL1-cre (SEQ ID NO: 715) using a Frozen-EZ Yeast Transformation II kit (Zymo Research, Orange, Calif.) and plating on synthetic complete medium lacking histidine and uracil supplemented with 2% glucose at 30° C. Transformants were grown in YP supplemented with galactose at 30° C. for ~6 hours to induce the Cre recombinase and KanMX marker excision and plated onto YPD (2% glucose) plates at 30° C. for recovery. An isolate was grown overnight in YPD and plated on synthetic complete medium containing 5-fluoro-orotic acid (0.1%) at 30° C. to select for isolates that lost the URA3 marker. 5-FOA resistant isolates were grown in and plated on YPD for removal of the pRS423::PGAL1-cre plasmid. Isolates were checked for loss of the KanMX marker, URA3 marker, and pRS423::PGAL1-cre plasmid by assaying growth on YPD+G418 plates, synthetic complete medium lacking uracil plates, and synthetic complete medium lacking histidine plates. A correct isolate that was sensitive to G418 and auxotrophic for uracil and histidine was selected as strain CEN.PK 113-7D Δura3::loxP Δhis3 and designated as BP857. The deletions and marker removal were confirmed by PCR and sequencing with primers oBP450 (SEQ ID NO: 676) and oBP451 (SEQ ID NO: 677) for Δura3 and primers oBP460 (SEQ ID NO: 674) and oBP461 (SEQ ID NO: 675) for Δhis3 using genomic DNA prepared with a Gentra Puregene Yeast/Bact kit (Qiagen, Valencia, Calif.).
The four fragments for the PCR cassette for the scarless PDC6 deletion were amplified using Phusion High Fidelity PCR Master Mix (New England BioLabs, Ipswich, Mass.) and CEN.PK 113-7D genomic DNA as template, prepared with a Gentra Puregene Yeast/Bact kit (Qiagen). PDC6 Fragment A was amplified with primer oBP440 (SEQ ID NO: 678) and primer oBP441 (SEQ ID NO: 679), containing a 5′ tail with homology to the 5′ end of PDC6 Fragment B. PDC6 Fragment B was amplified with primer oBP442 (SEQ ID NO: 680), containing a 5′ tail with homology to the 3′ end of PDC6 Fragment A, and primer oBP443 (SEQ ID NO: 681), containing a 5′ tail with homology to the 5′ end of PDC6 Fragment U. PDC6 Fragment U was amplified with primer oBP444 (SEQ ID NO: 682), containing a 5′ tail with homology to the 3′ end of PDC6 Fragment B, and primer oBP445 (SEQ ID NO: 683), containing a 5′ tail with homology to the 5′ end of PDC6 Fragment C. PDC6 Fragment C was amplified with primer oBP446 (SEQ ID NO: 684), containing a 5′ tail with homology to the 3′ end of PDC6 Fragment U, and primer oBP447 (SEQ ID NO: 685). PCR products were purified with a PCR Purification kit (Qiagen). PDC6 Fragment AB was created by overlapping PCR by mixing PDC6 Fragment A and PDC6 Fragment B and amplifying with primers oBP440 (SEQ ID NO: 678) and oBP443 (SEQ ID NO: 681). PDC6 Fragment UC was created by overlapping PCR by mixing PDC6 Fragment U and PDC6 Fragment C and amplifying with primers oBP444 (SEQ ID NO: 682) and oBP447 (SEQ ID NO: 685). The resulting PCR products were purified on an agarose gel followed by a Gel Extraction kit (Qiagen). The PDC6 ABUC cassette was created by overlapping PCR by mixing PDC6 Fragment AB and PDC6 Fragment UC and amplifying with primers oBP440 (SEQ ID NO: 678) and oBP447 (SEQ ID NO: 685). The PCR product was purified with a PCR Purification kit (Qiagen).
Competent cells of CEN.PK 113-7D Δura3::loxP Δhis3 were made and transformed with the PDC6 ABUC PCR cassette using a Frozen-EZ Yeast Transformation II kit (Zymo Research). Transformation mixtures were plated on synthetic complete media lacking uracil supplemented with 2% glucose at 30° C. Transformants with a pdc6 knockout were screened for by PCR with primers oBP448 (SEQ ID NO: 686) and oBP449 (SEQ ID NO: 687) using genomic DNA prepared with a Gentra Puregene Yeast/Bact kit (Qiagen). A correct transformant was selected as strain CEN.PK 113-7D Δura3::loxP Δhis3 Δpdc6::URA3.
CEN.PK 113-7D Δura3::loxP Δhis3 Δpdc6::URA3 was grown overnight in YPD and plated on synthetic complete medium containing 5-fluoro-orotic acid (0.1%) at 30° C. to select for isolates that lost the URA3 marker. The deletion and marker removal were confirmed by PCR and sequencing with primers oBP448 (SEQ ID NO: 686) and oBP449 (SEQ ID NO: 687) using genomic DNA prepared with a Gentra Puregene Yeast/Bact kit (Qiagen). The absence of the PDC6 gene from the isolate was demonstrated by a negative PCR result using primers specific for the coding sequence of PDC6, oBP554 (SEQ ID NO: 688) and oBP555 (SEQ ID NO: 689). The correct isolate was selected as strain CEN.PK 113-7D Δura3::loxP Δhis3 Δpdc6 and designated as BP891.
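The two-step logic of the scarless deletion (integration of the ABUC cassette via Fragments A and C, then loss of URA3 by recombination between the two copies of Fragment B under 5-FOA selection) can be traced with a toy model. The bracketed tokens below are hypothetical stand-ins for the real sequences, and the function names are illustrative; the sketch only tracks the order of elements at the locus.

```python
def integrate(chromosome: str, frag_a: str, frag_c: str, cassette: str) -> str:
    """Replace the span from Fragment A through Fragment C with the cassette."""
    start = chromosome.index(frag_a)
    end = chromosome.index(frag_c, start) + len(frag_c)
    return chromosome[:start] + cassette + chromosome[end:]


def excise_direct_repeat(chromosome: str, repeat: str) -> str:
    """Recombine the first two copies of `repeat`, dropping what lies between them."""
    first = chromosome.index(repeat)
    second = chromosome.index(repeat, first + len(repeat))
    return chromosome[:first + len(repeat)] + chromosome[second + len(repeat):]


# Hypothetical tokens standing in for sequence blocks (not real sequences).
LEFT, RIGHT = "[left]", "[right]"
A, B, U, C = "[A]", "[B]", "[URA3]", "[C]"
GENE_5P = "[gene-5p]"  # target gene minus its 3' 500 bp (Fragment C)

native = LEFT + A + GENE_5P + C + B + RIGHT
integrated = integrate(native, A, C, A + B + U + C)
scarless = excise_direct_repeat(integrated, B)

print(integrated)
print(scarless)
```

In this model the integrated locus carries two copies of Fragment B flanking URA3 and Fragment C, and the excised locus retains only the upstream and downstream flanks of the target gene.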
PDC1 Deletion ilvDSm Integration
The PDC1 gene was deleted and replaced with the ilvD coding region from Streptococcus mutans ATCC #700610 (SEQ ID NO: 1886). The A fragment followed by the ilvD coding region from Streptococcus mutans for the PCR cassette for the PDC1 deletion-ilvDSm integration was amplified using Phusion High Fidelity PCR Master Mix (New England BioLabs) and NYLA83 genomic DNA as template (construction of strain NYLA83 is described in U.S. Application Pub. No. 20110124060 A1, incorporated herein by reference), prepared with a Gentra Puregene Yeast/Bact kit (Qiagen). PDC1 Fragment A-ilvDSm was amplified with primer oBP513 (SEQ ID NO: 690) and primer oBP515 (SEQ ID NO: 691), containing a 5′ tail with homology to the 5′ end of PDC1 Fragment B. The B, U, and C fragments for the PCR cassette for the PDC1 deletion-ilvDSm integration were amplified using Phusion High Fidelity PCR Master Mix (New England BioLabs) and CEN.PK 113-7D genomic DNA as template, prepared with a Gentra Puregene Yeast/Bact kit (Qiagen). PDC1 Fragment B was amplified with primer oBP516 (SEQ ID NO: 692), containing a 5′ tail with homology to the 3′ end of PDC1 Fragment A-ilvDSm, and primer oBP517 (SEQ ID NO: 693), containing a 5′ tail with homology to the 5′ end of PDC1 Fragment U. PDC1 Fragment U was amplified with primer oBP518 (SEQ ID NO: 694), containing a 5′ tail with homology to the 3′ end of PDC1 Fragment B, and primer oBP519 (SEQ ID NO: 695), containing a 5′ tail with homology to the 5′ end of PDC1 Fragment C. PDC1 Fragment C was amplified with primer oBP520 (SEQ ID NO: 696), containing a 5′ tail with homology to the 3′ end of PDC1 Fragment U, and primer oBP521 (SEQ ID NO: 697). PCR products were purified with a PCR Purification kit (Qiagen). PDC1 Fragment A-ilvDSm-B was created by overlapping PCR by mixing PDC1 Fragment A-ilvDSm and PDC1 Fragment B and amplifying with primers oBP513 (SEQ ID NO: 690) and oBP517 (SEQ ID NO: 693). PDC1 Fragment UC was created by overlapping PCR by mixing PDC1 Fragment U and PDC1 Fragment C and amplifying with primers oBP518 (SEQ ID NO: 694) and oBP521 (SEQ ID NO: 697). The resulting PCR products were purified on an agarose gel followed by a Gel Extraction kit (Qiagen). The PDC1 A-ilvDSm-BUC cassette was created by overlapping PCR by mixing PDC1 Fragment A-ilvDSm-B and PDC1 Fragment UC and amplifying with primers oBP513 (SEQ ID NO: 690) and oBP521 (SEQ ID NO: 697). The PCR product was purified with a PCR Purification kit (Qiagen).
Competent cells of CEN.PK 113-7D Δura3::loxP Δhis3 Δpdc6 were made and transformed with the PDC1 A-ilvDSm-BUC PCR cassette using a Frozen-EZ Yeast Transformation II kit (Zymo Research). Transformation mixtures were plated on synthetic complete media lacking uracil supplemented with 2% glucose at 30° C. Transformants with a pdc1 knockout ilvDSm integration were screened for by PCR with primers oBP511 (SEQ ID NO: 698) and oBP512 (SEQ ID NO: 699) using genomic DNA prepared with a Gentra Puregene Yeast/Bact kit (Qiagen). The absence of the PDC1 gene from the isolate was demonstrated by a negative PCR result using primers specific for the coding sequence of PDC1, oBP550 (SEQ ID NO: 700) and oBP551 (SEQ ID NO: 701). A correct transformant was selected as strain CEN.PK 113-7D Δura3::loxP Δhis3 Δpdc6 Δpdc1::ilvDSm-URA3.
CEN.PK 113-7D Δura3::loxP Δhis3 Δpdc6 Δpdc1::ilvDSm-URA3 was grown overnight in YPD and plated on synthetic complete medium containing 5-fluoro-orotic acid (0.1%) at 30° C. to select for isolates that lost the URA3 marker. The deletion of PDC1, integration of ilvDSm, and marker removal were confirmed by PCR and sequencing with primers oBP511 (SEQ ID NO: 698) and oBP512 (SEQ ID NO: 699) using genomic DNA prepared with a Gentra Puregene Yeast/Bact kit (Qiagen). The correct isolate was selected as strain CEN.PK 113-7D Δura3::loxP Δhis3 Δpdc6 Δpdc1::ilvDSm and designated as BP907.
PDC5 Deletion sadB Integration
The PDC5 gene was deleted and replaced with the sadB coding region from Achromobacter xylosoxidans. A segment of the PCR cassette for the PDC5 deletion-sadB integration was first cloned into plasmid pUC19-URA3MCS (described in Example 2). The coding sequence of sadB (SEQ ID NO: 718) and PDC5 Fragment B were cloned into pUC19-URA3MCS to create the sadB-BU portion of the PDC5 A-sadB-BUC PCR cassette. The coding sequence of sadB was amplified using pLH468-sadB (SEQ ID NO: 716) as template with primer oBP530 (SEQ ID NO: 702), containing an AscI restriction site, and primer oBP531 (SEQ ID NO: 703), containing a 5′ tail with homology to the 5′ end of PDC5 Fragment B. PDC5 Fragment B was amplified from CEN.PK 113-7D genomic DNA with primer oBP532 (SEQ ID NO: 704), containing a 5′ tail with homology to the 3′ end of sadB, and primer oBP533 (SEQ ID NO: 705), containing a PmeI restriction site. PCR products were purified with a PCR Purification kit (Qiagen). sadB-PDC5 Fragment B was created by overlapping PCR by mixing the sadB and PDC5 Fragment B PCR products and amplifying with primers oBP530 (SEQ ID NO: 702) and oBP533 (SEQ ID NO: 705). The resulting PCR product was digested with AscI and PmeI and ligated with T4 DNA ligase into the corresponding sites of pUC19-URA3MCS after digestion with the appropriate enzymes. The resulting plasmid was used as a template for amplification of sadB-Fragment B-Fragment U using primers oBP536 (SEQ ID NO: 706) and oBP546 (SEQ ID NO: 707), containing a 5′ tail with homology to the 5′ end of PDC5 Fragment C. PDC5 Fragment C was amplified from CEN.PK 113-7D genomic DNA with primer oBP547 (SEQ ID NO: 708), containing a 5′ tail with homology to the 3′ end of PDC5 sadB-Fragment B-Fragment U, and primer oBP539 (SEQ ID NO: 709). PCR products were purified with a PCR Purification kit (Qiagen). PDC5 sadB-Fragment B-Fragment U-Fragment C was created by overlapping PCR by mixing PDC5 sadB-Fragment B-Fragment U and PDC5 Fragment C and amplifying with primers oBP536 (SEQ ID NO: 706) and oBP539 (SEQ ID NO: 709). The resulting PCR product was purified on an agarose gel followed by a Gel Extraction kit (Qiagen). The PDC5 A-sadB-BUC cassette was created by amplifying PDC5 sadB-Fragment B-Fragment U-Fragment C with primers oBP542 (SEQ ID NO: 710), containing a 5′ tail with homology to the 50 nucleotides immediately upstream of the native PDC5 coding sequence, and oBP539 (SEQ ID NO: 709). The PCR product was purified with a PCR Purification kit (Qiagen).
Competent cells of CEN.PK 113-7D Δura3::loxP Δhis3 Δpdc6 Δpdc1::ilvDSm were made and transformed with the PDC5 A-sadB-BUC PCR cassette using a Frozen-EZ Yeast Transformation II kit (Zymo Research). Transformation mixtures were plated on synthetic complete media lacking uracil supplemented with 1% ethanol (no glucose) at 30° C. Transformants with a pdc5 knockout sadB integration were screened for by PCR with primers oBP540 (SEQ ID NO: 711) and oBP541 (SEQ ID NO: 712) using genomic DNA prepared with a Gentra Puregene Yeast/Bact kit (Qiagen). The absence of the PDC5 gene from the isolate was demonstrated by a negative PCR result using primers specific for the coding sequence of PDC5, oBP552 (SEQ ID NO: 713) and oBP553 (SEQ ID NO: 714). A correct transformant was selected as strain CEN.PK 113-7D Δura3::loxP Δhis3 Δpdc6 Δpdc1::ilvDSm Δpdc5::sadB-URA3.
CEN.PK 113-7D Δura3::loxP Δhis3 Δpdc6 Δpdc1::ilvDSm Δpdc5::sadB-URA3 was grown overnight in YPE (1% ethanol) and plated on synthetic complete medium supplemented with ethanol (no glucose) and containing 5-fluoro-orotic acid (0.1%) at 30° C. to select for isolates that lost the URA3 marker. The deletion of PDC5, integration of sadB, and marker removal were confirmed by PCR with primers oBP540 (SEQ ID NO: 711) and oBP541 (SEQ ID NO: 712) using genomic DNA prepared with a Gentra Puregene Yeast/Bact kit (Qiagen). The correct isolate was selected as strain CEN.PK 113-7D Δura3::loxP Δhis3 Δpdc6 Δpdc1::ilvDSm Δpdc5::sadB and designated as BP913.
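The strain lineage built up in this example can be summarized by accumulating the modification added at each named step. The sketch below is a bookkeeping aid only; the data structure and function name are illustrative assumptions, and marker-bearing intermediates (for example the kanMX- and URA3-marked strains) are omitted.

```python
# Named strains of this example and the modifications each adds to its parent.
# Marker-bearing intermediates are intentionally left out of this summary.
LINEAGE = [
    ("CEN.PK 113-7D", []),
    ("BP857", ["Δura3::loxP", "Δhis3"]),
    ("BP891", ["Δpdc6"]),
    ("BP907", ["Δpdc1::ilvDSm"]),
    ("BP913", ["Δpdc5::sadB"]),
]


def genotypes(lineage):
    """Return {strain name: full genotype string} by accumulating modifications."""
    parent = lineage[0][0]
    accumulated = []
    result = {}
    for name, modifications in lineage:
        accumulated = accumulated + modifications
        result[name] = " ".join([parent] + accumulated)
    return result


if __name__ == "__main__":
    for strain, genotype in genotypes(LINEAGE).items():
        print(f"{strain}: {genotype}")
```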
This example describes insertion-inactivation of endogenous PDC1 and PDC6 genes of S. cerevisiae. PDC1, PDC5, and PDC6 genes encode the three major isozymes of pyruvate decarboxylase. The resulting strain was used as described in Example 10.
Construction of pRS425::GPM-sadB
A DNA fragment encoding a butanol dehydrogenase (SEQ ID NO: 717) from Achromobacter xylosoxidans (disclosed in US Patent Application Publication No. US20090269823) was cloned. The coding region of this gene called sadB for secondary alcohol dehydrogenase (SEQ ID NO: 718) was amplified using standard conditions from A. xylosoxidans genomic DNA, prepared using a Gentra Puregene kit (Gentra Systems, Inc., Minneapolis, Minn.; catalog number D-5500A) following the recommended protocol for gram negative organisms using forward and reverse primers N473 and N469 (SEQ ID NOs:725 and 726), respectively. The PCR product was TOPO-Blunt cloned into pCR4 BLUNT (Invitrogen) to produce pCR4Blunt::sadB, which was transformed into E. coli Mach-1 cells. Plasmid was subsequently isolated from four clones, and the sequence verified.
The sadB coding region was PCR amplified from pCR4Blunt::sadB. PCR primers contained additional 5′ sequences that would overlap with the yeast GPM1 promoter and the ADH1 terminator (N583 and N584, provided as SEQ ID NOs:727 and 728). The PCR product was then cloned using “gap repair” methodology in Saccharomyces cerevisiae (Ma et al. ibid) as follows. The yeast-E. coli shuttle vector pRS425::GPM::kivD::ADH, which contains the GPM1 promoter (SEQ ID NO:721), kivD coding region from Lactococcus lactis (SEQ ID NO:719), and ADH1 terminator (SEQ ID NO:722) (described in U.S. Pat. No. 7,851,188, Example 17), was digested with BbvCI and PacI restriction enzymes to release the kivD coding region. Approximately 1 μg of the remaining vector fragment was transformed into S. cerevisiae strain BY4741 along with 1 μg of sadB PCR product. Transformants were selected on synthetic complete medium lacking leucine. The proper recombination event, generating pRS425::GPM-sadB, was confirmed by PCR using primers N142 and N459 (SEQ ID NOs:729 and 730).
Construction of pdc6::PGPM1-sadB Integration Cassette and PDC6 Deletion:
A pdc6::PGPM1-sadB-ADH1t-URA3r integration cassette was made by joining the GPM-sadB-ADHt segment (SEQ ID NO:723) from pRS425::GPM-sadB (SEQ ID NO: 720) to the URA3r gene from pUC19-URA3r. pUC19-URA3r (SEQ ID NO:724) contains the URA3 marker from pRS426 (ATCC #77107) flanked by 75 bp homologous repeat sequences to allow homologous recombination in vivo and removal of the URA3 marker. The two DNA segments were joined by SOE PCR (as described by Horton et al. (1989) Gene 77:61-68) using as template pRS425::GPM-sadB and pUC19-URA3r plasmid DNAs, with Phusion DNA polymerase (New England Biolabs Inc., Beverly, Mass.; catalog no. F-540S) and primers 114117-11A through 114117-11D (SEQ ID NOs:731, 732, 733 and 734), and 114117-13A and 114117-13B (SEQ ID NOs:735 and 736).
The outer primers for the SOE PCR (114117-13A and 114117-13B) contained 5′ and 3′ ~50 bp regions homologous to regions upstream and downstream of the PDC6 promoter and terminator, respectively. The completed cassette PCR fragment was transformed into BY4700 (ATCC #200866) and transformants were maintained on synthetic complete media lacking uracil and supplemented with 2% glucose at 30° C. using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202). Transformants were screened by PCR using primers 112590-34G and 112590-34H (SEQ ID NOs:737 and 738), and 112590-34F and 112590-49E (SEQ ID NOs: 739 and 740) to verify integration at the PDC6 locus with deletion of the PDC6 coding region. The URA3r marker was recycled by plating on synthetic complete media supplemented with 2% glucose and 5-FOA at 30° C. following standard protocols. Marker removal was confirmed by patching colonies from the 5-FOA plates onto SD-URA media to verify the absence of growth. The resulting identified strain has the genotype: BY4700 pdc6::PGPM1-sadB-ADH1t.
Construction of pdc1::PPDC1-ilvD Integration Cassette and PDC1 Deletion:
A pdc1::PPDC1-ilvD-FBA1t-URA3r integration cassette was made by joining the ilvD-FBA1t segment (SEQ ID NO:741) from pLH468 (SEQ ID NO: 1888) to the URA3r gene from pUC19-URA3r by SOE PCR (as described by Horton et al. (1989) Gene 77:61-68) using as template pLH468 and pUC19-URA3r plasmid DNAs, with Phusion DNA polymerase (New England Biolabs Inc., Beverly, Mass.; catalog no. F-540S) and primers 114117-27A through 114117-27D (SEQ ID NOs:742, 743, 744 and 745).
The outer primers for the SOE PCR (114117-27A and 114117-27D) contained 5′ and 3′ ~50 bp regions homologous to regions downstream of the PDC1 promoter and downstream of the PDC1 coding sequence. The completed cassette PCR fragment was transformed into BY4700 pdc6::PGPM1-sadB-ADH1t and transformants were maintained on synthetic complete media lacking uracil and supplemented with 2% glucose at 30° C. using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202). Transformants were screened by PCR using primers 114117-36D and 135 (SEQ ID NOs 746 and 747), and primers 112590-49E and 112590-30F (SEQ ID NOs 740 and 748) to verify integration at the PDC1 locus with deletion of the PDC1 coding sequence. The URA3r marker was recycled by plating on synthetic complete media supplemented with 2% glucose and 5-FOA at 30° C. following standard protocols. Marker removal was confirmed by patching colonies from the 5-FOA plates onto SD-URA media to verify the absence of growth. The resulting identified strain “NYLA67” has the genotype: BY4700 pdc6::PGPM1-sadB-ADH1t pdc1::PPDC1-ilvD-FBA1t.
To delete the endogenous HIS3 coding region, a his3::URA3r2 cassette was PCR-amplified from URA3r2 template DNA (SEQ ID NO: 749). URA3r2 contains the URA3 marker from pRS426 (ATCC #77107) flanked by 500 bp homologous repeat sequences to allow homologous recombination in vivo and removal of the URA3 marker. PCR was done using Phusion DNA polymerase and primers 114117-45A and 114117-45B (SEQ ID NOs:750 and 751) which generated a ~2.3 kb PCR product. The HIS3 portion of each primer was derived from the 5′ region upstream of the HIS3 promoter and 3′ region downstream of the coding region such that integration of the URA3r2 marker results in replacement of the HIS3 coding region. The PCR product was transformed into NYLA67 using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202) and transformants were selected on synthetic complete media lacking uracil and supplemented with 2% glucose at 30° C. Transformants were screened to verify correct integration by replica plating of transformants onto synthetic complete media lacking histidine and supplemented with 2% glucose at 30° C. The URA3r marker was recycled by plating on synthetic complete media supplemented with 2% glucose and 5-FOA at 30° C. following standard protocols. Marker removal was confirmed by patching colonies from the 5-FOA plates onto SD-URA media to verify the absence of growth. The resulting identified strain, called NYLA73, has the genotype: BY4700 pdc6::PGPM1-sadB-ADH1t pdc1::PPDC1-ilvD-FBA1t Δhis3.
Construction of pdc5::kanMX Integration Cassette and PDC5 Deletion:
A pdc5::kanMX4 cassette was PCR-amplified from strain YLR134W chromosomal DNA (ATCC No. 4034091) using Phusion DNA polymerase and primers PDC5::KanMXF and PDC5::KanMXR (SEQ ID NOs:752 and 753) which generated a ~2.2 kb PCR product. The PDC5 portion of each primer was derived from the 5′ region upstream of the PDC5 promoter and 3′ region downstream of the coding region such that integration of the kanMX4 marker results in replacement of the PDC5 coding region. The PCR product was transformed into NYLA73 using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202) and transformants were selected on YP media supplemented with 1% ethanol and geneticin (200 μg/ml) at 30° C. Transformants were screened by PCR to verify correct integration at the PDC5 locus with replacement of the PDC5 coding region using primers PDC5kofor and N175 (SEQ ID NOs: 754 and 755). The identified correct transformants have the genotype: BY4700 pdc6::PGPM1-sadB-ADH1t pdc1::PPDC1-ilvD-FBA1t Δhis3 pdc5::kanMX4. The strain was named NYLA74.
Deletion of HXK2 (hexokinase II):
A hxk2::URA3r cassette was PCR-amplified from URA3r2 template (described above) using Phusion DNA polymerase and primers 384 and 385 (SEQ ID NOs:756 and 757) which generated a ~2.3 kb PCR product. The HXK2 portion of each primer was derived from the 5′ region upstream of the HXK2 promoter and 3′ region downstream of the coding region such that integration of the URA3r2 marker results in replacement of the HXK2 coding region. The PCR product was transformed into NYLA73 using standard genetic techniques (Methods in Yeast Genetics, 2005, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 201-202) and transformants were selected on synthetic complete media lacking uracil and supplemented with 2% glucose at 30° C. Transformants were screened by PCR to verify correct integration at the HXK2 locus with replacement of the HXK2 coding region using primers N869 and N871 (SEQ ID NOs:758 and 759). The URA3r2 marker was recycled by plating on synthetic complete media supplemented with 2% glucose and 5-FOA at 30° C. following standard protocols. Marker removal was confirmed by patching colonies from the 5-FOA plates onto SD-URA media to verify the absence of growth, and by PCR to verify correct marker removal using primers N946 and N947 (SEQ ID NOs:760 and 761). The resulting identified strain, named NYLA83, has the genotype: BY4700 pdc6::PGPM1-sadB-ADH1t pdc1::PPDC1-ilvD-FBA1t Δhis3 Δhxk2.
Strain PNY2242 was constructed in several steps from BP913 (described above). First, the native GPD2 gene on Chromosome XV was deleted. The coding region was deleted using CRE-lox mediated marker removal (methodology described above), so the resulting locus contains one loxP site. The sequence of the modified locus is provided as SEQ ID NO: 1899 (Upstream region=nt 1-500; loxP site=nt 531-564; Downstream region=nt 616-1115). Second, the native FRA2 gene on Chromosome VII was deleted. Elimination of FRA2 was a scarless deletion of only the coding region. The sequence of the modified locus is provided as SEQ ID NO: 1900 (Upstream region=nt 1-501; Downstream region=nt 526-1025). Next, the ADH1 gene on Chromosome XV was deleted along with insertion of a chimeric gene comprised of the UAS(PGK1)-FBA1 promoter and the kivD coding region. The native ADH1 terminator was used to complete the gene. The sequence of the modified locus is provided as SEQ ID NO: 1901 (Upstream region=nt 1-500; UAS(PGK1)-FBA1 promoter=nt 509-1233; kivD coding region=nt 1242-2888; Downstream region (includes terminator)=nt 2889-3388). Next, a chimeric gene comprised of the FBA1 promoter, the alsS coding region and the CYC1 terminator was integrated into Chromosome XII, upstream of the TRX1 gene. The sequence of the modified locus is provided as SEQ ID NO: 1902 (Upstream region=nt 1-154; FBA1 promoter=nt 155-802; alsS CDS=nt 810-2525; CYC1 terminator=nt 2534-2788; Downstream region=nt 2790-3015). Next, two copies of a gene encoding horse liver alcohol dehydrogenase were integrated into Chromosomes VII and XVI. On Chromosome VII, a chimeric gene comprised of the PDC1 promoter, the hADH coding region and the ADH1 terminator was placed into the fra2Δ locus (the original deletion of FRA2 is described above). The sequence of the modified locus is provided as SEQ ID NO: 1903 (Upstream region=nt 1-300; PDC1 promoter=nt 309-1178; hADH coding region=nt 1179-2306; ADH1 terminator=nt 2315-2630; Downstream region=nt 2639-2900). On Chromosome XVI, a chimeric gene comprised of the PDC5 promoter, the hADH coding region and the ADH1 terminator was integrated in the region formerly occupied by the long terminal repeat element YPRCdelta15. The sequence of the modified locus is provided as SEQ ID NO: 1904 (Upstream region=nt 1-150; PDC5 promoter=nt 159-696; hADH coding region=nt 697-1824; ADH1 terminator=nt 1833-2148; Downstream region=nt 2157-2656). Then the native genes YMR226c and ALD6 were deleted. Elimination of YMR226c was a scarless deletion of only the coding region. The sequence of the modified locus is provided as SEQ ID NO: 1905 (Upstream region=nt 1-250; Downstream region=nt 251-451). The ALD6 coding region plus 700 bp of upstream sequence were deleted using CRE-lox mediated marker removal, so the resulting locus contains one loxP site. The sequence of the modified locus is provided as SEQ ID NO: 1906 (Upstream region=nt 1-500; loxP site=nt 551-584; Downstream region=nt 678-1128). The geneticin-selectable phosphoketolase expression vector described in Example 8 was transformed into the strain and confirmed as described above (the locus is depicted in
Growth rates were assessed as described in previous examples. Over a 24-hour period, PNY2257 grown without ethanol or another two-carbon supplement displayed growth rates similar to those observed for PNY2242 with supplementation.
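The nucleotide coordinates given for the modified loci of PNY2242 lend themselves to a simple consistency check. The sketch below is offered only as an illustration: it takes the feature ranges stated for SEQ ID NOs: 1901 and 1903, computes each feature's length, and confirms that the features are ordered, non-overlapping, and that each coding region is a whole number of codons. The function and variable names are hypothetical and no sequence data are used.

```python
# Sketch: sanity-checking annotated feature coordinates of a modified locus.
# Coordinates are 1-based and inclusive, as written in the text.
# Function and variable names are illustrative assumptions.

# (feature name, start, end) as stated for SEQ ID NO: 1901 (adh1 locus).
LOCUS_1901 = [
    ("Upstream region", 1, 500),
    ("UAS(PGK1)-FBA1 promoter", 509, 1233),
    ("kivD coding region", 1242, 2888),
    ("Downstream region", 2889, 3388),
]

# (feature name, start, end) as stated for SEQ ID NO: 1903 (fra2 locus).
LOCUS_1903 = [
    ("Upstream region", 1, 300),
    ("PDC1 promoter", 309, 1178),
    ("hADH coding region", 1179, 2306),
    ("ADH1 terminator", 2315, 2630),
    ("Downstream region", 2639, 2900),
]


def feature_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based inclusive range."""
    return end - start + 1


def check_features(features: list) -> list:
    """Return a list of problem descriptions (empty if none are found)."""
    problems = []
    previous_end = 0
    for name, start, end in features:
        if start > end:
            problems.append(f"{name}: start {start} is after end {end}")
        if start <= previous_end:
            problems.append(f"{name}: overlaps the preceding feature")
        # A coding region should span a whole number of codons.
        if "coding region" in name and feature_length(start, end) % 3 != 0:
            problems.append(f"{name}: length is not a multiple of 3")
        previous_end = end
    return problems


for label, locus in (("1901", LOCUS_1901), ("1903", LOCUS_1903)):
    for name, start, end in locus:
        print(label, name, feature_length(start, end), "nt")
    print(label, "problems:", check_features(locus))
```

With the stated coordinates, the kivD coding region spans 1647 nt and the hADH coding region spans 1128 nt, both whole multiples of three, and no feature overlaps its neighbor.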
This application is a continuation of and claims the benefit of U.S. application Ser. No. 13/161,168, filed on Jun. 15, 2011, which is related to and claims the benefit of priority of U.S. Provisional Patent Application No. 61/356,379, filed on Jun. 18, 2010. Each of the referenced applications is herein incorporated by reference.
Number | Date | Country
---|---|---
61356379 | Jun 2010 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 13161168 | Jun 2011 | US
Child | 14335734 | | US