A Sequence Listing accompanies this application and is submitted as an XML file named “960296.04348.xml” which is 216,433 bytes in size and was created on Dec. 6, 2022. The sequence listing is electronically submitted via Patent Center with the application and is incorporated herein by reference in its entirety.
Plants can directly convert atmospheric carbon dioxide (CO2) into diverse aromatic natural products, which are primarily derived from the aromatic amino acids tyrosine, phenylalanine, and tryptophan. Aromatic compounds have unusual stability due to their aromaticity (i.e., electron delocalization). As a result, aromatic compounds have potential to be used as a carbon sink for reducing atmospheric CO2 (1). Aromatic compounds are also key precursors for pharmaceuticals, commodity chemicals, and industrial materials, for which there is rapidly growing global demand (2, 6). However, the chemical conversion of CO2 into aromatic compounds remains challenging, and fossil fuels remain the primary source of aromatic compounds (3). Thus, there remains a need in the art for improved methods for harvesting aromatic compounds from renewable sources, such as plants.
In a first aspect, the present invention provides engineered 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase (DHS) polypeptides. The polypeptides comprise at least one mutation at a position corresponding to amino acid residue 109, 114, 159, 240, 244, 245, 247, 248, 319, 322, or 348 of the Arabidopsis DHS1 polypeptide (SEQ ID NO:1).
In a second aspect, the present invention provides polynucleotides encoding the engineered polypeptides disclosed herein.
In a third aspect, the present invention provides constructs comprising a promoter operably linked to one of the polynucleotides described herein.
In a fourth aspect, the present invention provides vectors comprising one of the polynucleotides or constructs described herein.
In a fifth aspect, the present invention provides cells comprising one of the engineered polypeptides, polynucleotides, constructs, or vectors described herein.
In a sixth aspect, the present invention provides seeds comprising one of the engineered polypeptides, polynucleotides, constructs, vectors, or cells described herein.
In a seventh aspect, the present invention provides plants grown from the seeds described herein and plants comprising one of the engineered polypeptides, polynucleotides, constructs, vectors, or cells described herein.
In an eighth aspect, the present invention provides methods for improving a plant by (1) increasing production of aromatic amino acids in the plant, and/or (2) increasing the amount of carbon dioxide (CO2) sequestered by the plant. The methods comprise: introducing one of the engineered polypeptides, polynucleotides, constructs, or vectors described herein into the plant.
In a ninth aspect, the present invention provides methods for using the plants described herein to (1) produce aromatic amino acids or derivatives thereof, or (2) sequester CO2. Both sets of methods comprise growing the plants described herein. The methods for producing aromatic amino acids or derivatives thereof further comprise purifying the aromatic amino acids or derivatives thereof produced by the plant.
The present invention provides engineered 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase (DHS) polypeptides comprising mutations that deregulate the shikimate pathway, resulting in increased production of aromatic amino acids and enhanced carbon assimilation in plants. Also provided are polynucleotides, constructs, and vectors that encode the engineered polypeptides; cells, seeds, and plants that express the engineered polypeptides; and methods for generating and using plants that express the engineered polypeptides.
In the Examples, the inventors describe the identification of suppressor of tyra2 (sota) mutations in Arabidopsis thaliana that deregulate the first step of the shikimate pathway, i.e., a pathway that connects central carbon metabolism to the pathway for aromatic amino acid biosynthesis in plants. The sota mutations mapped to genomic loci that encode the three Arabidopsis isoforms of the enzyme 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase (DHS). DHS catalyzes the first reaction of the shikimate pathway using two substrates, phosphoenolpyruvate (PEP) and erythrose-4-phosphate (E4P), which are directly supplied from glycolysis and the Calvin-Benson-Bassham (CBB) cycle, respectively (
In a first aspect, the present invention provides engineered DHS polypeptides. The polypeptides comprise at least one mutation at a position corresponding to amino acid residue 109, 114, 159, 240, 244, 245, 247, 248, 319, 322, or 348 of the Arabidopsis DHS1 polypeptide (SEQ ID NO:1). These residues correspond to positions at which suppressor of tyra2 (sota) mutations were identified by the inventors. Identification of the mutations at residues 114, 159, 240, 244, 245, and 247 is described in Example 1, whereas identification of the mutations at residues 109, 248, 319, 322, and 348 is described in Example 2.
The terms “polypeptide,” “protein,” and “peptide” are used interchangeably herein to refer to a series of amino acid residues connected by peptide bonds between the alpha-amino and carboxy groups of adjacent residues. Polypeptides include modified amino acids. Suitable polypeptide modifications include, but are not limited to, acylation, acetylation, formylation, lipoylation, myristoylation, palmitoylation, alkylation, isoprenylation, prenylation, amidation at C-terminus, glycosylation, glycation, polysialylation, glypiation, and phosphorylation. Polypeptides may also include amino acid analogs.
The engineered DHS polypeptides described herein may be full-length polypeptides or may be fragments of a full-length polypeptide. As used herein, a “fragment” is a portion of a polypeptide that is identical in sequence to, but shorter in length than, the full-length polypeptide. For example, a fragment may comprise at least 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 150, 250, or 500 contiguous amino acid residues of a full-length polypeptide. Fragments may be preferentially selected from certain regions of a polypeptide. A fragment may include an N-terminal truncation, a C-terminal truncation, or both an N-terminal and C-terminal truncation relative to the full-length polypeptide. Preferably, the DHS polypeptide fragments used with the present invention are functional fragments. As used herein, a “functional fragment” is a fragment that retains at least 20%, 40%, 60%, 80%, or 100% of the DHS activity of the corresponding full-length polypeptide.
The polypeptides described herein are “engineered,” meaning that they have been altered by the hand of man. Specifically, the engineered DHS polypeptides of the present invention have been altered to comprise a mutation. As used herein, the term “mutation” refers to a difference in an amino acid sequence relative to a reference sequence (e.g., the sequence of the wild-type polypeptide). Mutations include insertions, deletions, and substitutions of an amino acid relative to a reference sequence. An “insertion” refers to a change in an amino acid sequence that results in the addition of one or more amino acid residues. An insertion may add 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, or more amino acid residues to a sequence. A “deletion” refers to a change in an amino acid sequence that results in the removal of one or more amino acid residues. A deletion may remove 1, 2, 3, 4, 5, 10, 20, 50, 100, 200, or more amino acid residues from a sequence. A “substitution” refers to a change in an amino acid sequence in which one amino acid is replaced with a different amino acid. An amino acid substitution may be a conservative replacement (i.e., a replacement with an amino acid that has similar properties) or a radical replacement (i.e., a replacement with an amino acid that has different properties).
The engineered DHS polypeptides of the present invention comprise one or more mutations relative to the corresponding wild-type polypeptide (i.e., the wild-type version of the same DHS polypeptide). The term “wild-type” is used to describe the non-mutated version of a polypeptide that is most typically found in nature.
Arabidopsis thaliana expresses three isoforms of DHS, which are referred to as DHS1, DHS2, and DHS3. The sota mutations described herein were identified in one or more of these three Arabidopsis DHS isoforms. These isoforms are closely related (e.g., DHS2 has 77.58% identity to DHS1, and DHS3 has 80.53% identity to DHS1). Thus, for simplicity, we have arbitrarily used the Arabidopsis DHS1 polypeptide (SEQ ID NO:1) as a reference sequence and have specified the positions of the sota mutations using the amino acid residue numbering of this polypeptide. However, the polypeptide sequence of any related DHS polypeptide could be used instead. For example, amino acid residues 109, 114, 159, 240, 244, 245, 247, 248, 319, 322, and 348 of DHS1 (SEQ ID NO:1) correspond to residues 91, 136, 217, 218, 219, 220, 221, 222, 223, 224, and 225 of DHS2 (SEQ ID NO:2); and to residues 114, 159, 240, 241, 242, 243, 244, 245, 246, 247, and 248 of DHS3 (SEQ ID NO:3), respectively, as is demonstrated in the sequence alignment shown in
In the Examples, the inventors demonstrate that expression of engineered DHS polypeptides from several plants (i.e., Arabidopsis, sorghum, and poplar) can be used to increase the aromatic amino acid production and CO2 sequestration of a plant. DHS enzymes (which are found in bacteria and plants) are highly conserved across a wide variety of plants, as is demonstrated in
In some embodiments, the engineered DHS polypeptides comprise a polypeptide or a functional fragment thereof having at least 50%, 60%, 70%, 80%, 85%, 90%, 95%, 98%, 99%, or 100% sequence identity to a polypeptide selected from SEQ ID NO:1-37. “Percentage of sequence identity” is determined by comparing two optimally aligned sequences over a comparison window. The aligned sequences may comprise additions or deletions (i.e., gaps) relative to each other for optimal alignment. The percentage is calculated by determining the number of matched positions at which an identical nucleic acid base or amino acid residue occurs in both sequences, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Protein and nucleic acid sequence identities are evaluated using the Basic Local Alignment Search Tool (“BLAST”), which is well known in the art (Proc. Natl. Acad. Sci. USA (1990) 87: 2267-2268; Nucl. Acids Res. (1997) 25: 3389-3402). The BLAST programs identify homologous sequences by identifying similar segments, which are referred to herein as “high-scoring segment pairs”, between a query amino acid or nucleic acid sequence and a test sequence which is preferably obtained from a protein or nucleic acid sequence database. Preferably, the statistical significance of a high-scoring segment pair is evaluated using the statistical significance formula (Proc. Natl. Acad. Sci. USA (1990) 87: 2267-2268), the disclosure of which is incorporated by reference in its entirety. The BLAST programs can be used with the default parameters or with modified parameters provided by the user.
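By way of illustration only, the percentage calculation described above (matched positions divided by the total number of positions in the comparison window, multiplied by 100) can be sketched in Python as follows. The sketch assumes that the two sequences have already been aligned by an alignment program and are supplied as equal-length strings with “-” marking gaps. The function name and the toy sequences are hypothetical and are not taken from SEQ ID NO:1-37, and the sketch does not reproduce the BLAST algorithm or its statistics.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a pre-aligned comparison window.

    Both strings must already be aligned (equal length, gaps written as `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # A matched position carries the same residue (not a gap) in both sequences.
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    # Divide by the total number of positions in the window, gaps included.
    return 100.0 * matches / len(aligned_a)


# Toy alignment (made-up sequences, for illustration only).
window_a = "MKT-AYIAKQ"
window_b = "MKTLAYLAKQ"
identity = percent_identity(window_a, window_b)  # 8 matches over 10 positions
```

In this toy window, eight of the ten aligned positions carry identical residues, so the calculated identity is 80%. Other conventions for the denominator exist (for example, excluding gapped positions); the sketch follows the definition given above.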
Regardless of their origin, the engineered DHS polypeptides of the present invention comprise at least one mutation at a position corresponding to amino acid residue 109, 114, 159, 240, 244, 245, 247, 248, 319, 322, or 348 of the Arabidopsis DHS1 polypeptide (SEQ ID NO:1). As used herein, the phrase “at a position corresponding to” refers to an amino acid position that aligns with an amino acid position in another protein in a protein sequence alignment or a protein structure alignment. For example, the phrase “at a position corresponding to amino acid residue 114 of SEQ ID NO:1” refers to an amino acid position in a polypeptide sequence that aligns with the 114th amino acid residue in SEQ ID NO:1 when the two polypeptide sequences are aligned using a sequence alignment program. (Note: This position is flagged with a red arrow labeled “G114R on DHS3” above the partial sequence alignment of SEQ ID NO:1-37 shown in
In some embodiments, the engineered polypeptide comprises one of the specific sota mutations that were identified by the inventors in the Arabidopsis DHS enzymes in the Examples. These specific mutations include mutations corresponding to G114R, L159F, A240T, G244R, G245S, and A247T in SEQ ID NO:1 (identified in Example 1), and mutations corresponding to P109S, P109L, A240V, A247V, A248T, D319N, S322F, and E348K in SEQ ID NO:1 (identified in Example 2). Thus, in some embodiments, the at least one mutation includes at least one mutation corresponding to P109S, P109L, G114R, L159F, A240V, A240T, G244R, G245S, A247V, A247T, A248T, D319N, S322F, or E348K in SEQ ID NO:1.
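As a non-limiting sketch, the following Python code illustrates how a position “corresponding to” a residue of a reference sequence could be located in another polypeptide from a pairwise alignment, and how a substitution written in the notation used above (e.g., G114R: wild-type residue, position, replacement residue) could be applied. The sketch assumes that the alignment has already been produced by a sequence alignment program and is supplied as two equal-length strings with “-” marking gaps. The function names and the short toy sequences are hypothetical and do not represent any of SEQ ID NO:1-37.

```python
import re
from typing import Optional


def corresponding_position(
    aligned_ref: str, aligned_query: str, ref_pos: int, gap: str = "-"
) -> Optional[int]:
    """Map a 1-based residue number in the reference to the aligned query residue.

    Returns None if the reference residue aligns with a gap in the query.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    ref_count = 0
    query_count = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r != gap:
            ref_count += 1
        if q != gap:
            query_count += 1
        # Stop at the alignment column that holds the requested reference residue.
        if r != gap and ref_count == ref_pos:
            return query_count if q != gap else None
    raise ValueError("ref_pos is outside the reference sequence")


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a substitution such as 'G114R' to an ungapped sequence."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if m is None:
        raise ValueError(f"unrecognized mutation notation: {mutation}")
    wild_type, pos, new = m.group(1), int(m.group(2)), m.group(3)
    if not 1 <= pos <= len(sequence):
        raise ValueError("position is outside the sequence")
    # Guard against numbering errors: the stated wild-type residue must match.
    if sequence[pos - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {pos}, found {sequence[pos - 1]}"
        )
    return sequence[: pos - 1] + new + sequence[pos:]


# Toy alignment (made-up sequences, for illustration only).
toy_ref = "MAG-LKP"    # ungapped reference: MAGLKP
toy_query = "M-GSLRP"  # ungapped query:     MGSLRP
query_pos = corresponding_position(toy_ref, toy_query, 3)  # G3 of the reference
mutant = apply_substitution("MGSLRP", f"G{query_pos}R")
```

In this toy case, residue 3 of the reference aligns with residue 2 of the query, so the substitution is applied at query position 2. Checking the wild-type residue before substituting is a design choice intended to catch numbering mismatches between isoforms.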
In the Examples, the inventors demonstrate that the identified DHS mutations reduce inhibition by tyrosine-associated compounds and tryptophan-associated compounds (i.e., compounds consisting of or derived from tyrosine and tryptophan, respectively). Thus, in some embodiments, the engineered DHS enzymes have reduced inhibition by one or more of these compounds relative to the wild-type version of the same DHS enzyme. Exemplary tyrosine-associated compounds include, without limitation, tyrosine, tyrosol, tyramine, hydroxyphenylpyruvate (HPP), and homogentisate (HGA). Exemplary tryptophan-associated compounds include, without limitation, tryptophan, indole-3-pyruvate (IPA), indole-3-acetate (IAA; auxin), indole-3-lactate (ILA), anthranilate, and tryptamine.
Inhibition by tyrosine, tryptophan, and tyrosine/tryptophan-associated compounds may be reduced by 1.5-, 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 11-, 12-, 13-, 14-, 15-, 16-, 17-, 18-, 19-, 20-fold, or more as compared to the inhibition exhibited by the corresponding wild-type DHS enzyme. Inhibition by these compounds may be measured using a DHS enzyme activity assay performed in the presence of the compound. Suitable DHS enzyme activity assays include those described in Plant Cell. (2021) 33, 671-696, which is incorporated by reference in its entirety. Alternatively, DHS enzyme activity can be analyzed by measuring the loss of the substrate phosphoenolpyruvate (PEP) at absorbance 232 nm (Acta Crystallogr Sect F Struct Biol Cryst Commun (2005) 61(Pt 4): 403-6; J Biol Chem (2010) 285(40): 30567-30576). Also, the production of the product 3-deoxy-D-arabinoheptulosonate 7-phosphate (DAHP) can be directly measured via liquid chromatography-mass spectrometry (LCMS, Yokoyama R, El-Azaz J, Maeda H A, unpublished data).
In a second aspect, the present invention provides polynucleotides encoding the engineered polypeptides disclosed herein. The terms “polynucleotide,” “oligonucleotide,” and “nucleic acid” are used interchangeably to refer to a polymer of DNA or RNA. A polynucleotide may be single-stranded or double-stranded and may represent the sense or the antisense strand. A polynucleotide may be synthesized or obtained from a natural source. A polynucleotide may contain natural, non-natural, or altered nucleotides, as well as natural, non-natural, or altered internucleotide linkages (e.g., phosphoroamidate linkages, phosphorothioate linkages). The term polynucleotide encompasses constructs, vectors, plasmids, and the like. In some embodiments, the polynucleotide is complementary DNA (cDNA; i.e., synthetic DNA that has been reverse transcribed from a messenger RNA) or genomic DNA (i.e., chromosomal DNA from an organism). Those of skill in the art understand the degeneracy of the genetic code and that a variety of polynucleotides can encode the same polypeptide.
While the polynucleotide sequences disclosed herein are derived from sequences found in plants, any polynucleotide sequence that encodes the desired engineered DHS polypeptide may be used with the present invention. For example, in some embodiments, the polynucleotides are codon-optimized for expression in a particular cell (e.g., a plant cell, bacterial cell, or fungal cell). “Codon optimization” is a process used to increase expression of a polynucleotide in a particular host cell by altering the sequence of the polynucleotide to accommodate the codon bias of the host cell. Computer programs for generating codon-optimized sequences for use in a particular host cell are known in the art.
In a third aspect, the present invention provides constructs comprising a promoter operably linked to one of the polynucleotides described herein. As used herein, the term “construct” refers to a recombinant polynucleotide, i.e., a polynucleotide that was formed by combining at least two polynucleotide components from different sources, natural or synthetic. For example, a construct may comprise the coding region of one gene operably linked to a promoter that is (1) associated with another gene found within the same genome, (2) from the genome of a different species, or (3) synthetic. Constructs can be generated using conventional recombinant DNA methods.
As used herein, the term “promoter” refers to a DNA sequence that defines where transcription of a polynucleotide begins. RNA polymerase and the necessary transcription factors bind to the promoter to initiate transcription. Promoters are typically located directly upstream (i.e., at the 5′ end) of the transcription start site. However, a promoter may also be located at the 3′ end, within a coding region, or within an intron of a gene that it regulates. Promoters may be derived in their entirety from a native or heterologous gene, may be composed of elements derived from multiple regulatory sequences found in nature, or may comprise synthetic DNA. A promoter is “operably linked” to a polynucleotide if the promoter is positioned such that it can affect transcription of the polynucleotide.
The promoter used in the constructs described herein may be a heterologous promoter (i.e., a promoter that is not naturally associated with the DHS polynucleotide), an endogenous promoter (i.e., a promoter that is naturally associated with the DHS polynucleotide), or a synthetic promoter that is designed to function in a desired manner in a particular host cell. Suitable promoters for use with the present invention include, but are not limited to, constitutive, inducible, temporally regulated, developmentally regulated, chemically regulated, tissue-preferred, and tissue-specific promoters. In some cases, it may be advantageous to use a tissue-specific promoter or a developmental stage-specific promoter such that the construct will drive expression of the DHS polypeptide in a particular tissue (e.g., the roots or leaves of a plant) or during a particular developmental stage (e.g., leaf maturation, seed development, senescence).
In some embodiments, the promoter is a plant promoter, i.e., a promoter that is active in plant cells. Suitable plant promoters include, without limitation, the 35S promoter of the cauliflower mosaic virus, ubiquitin, the tCUP cryptic constitutive promoter, the Rsyn7 promoter, the maize In2-2 promoter, and the tobacco PR-1a promoter.
In a fourth aspect, the present invention provides vectors comprising one of the polynucleotides or constructs described herein. The term “vector” refers to a DNA molecule that is used to carry a particular DNA segment (i.e., a DNA segment included in the vector) into a host cell. Some vectors are capable of autonomous replication in a host cell (e.g., bacterial vectors that include an origin of replication and episomal mammalian vectors). Other vectors can be integrated into the genome of a host cell such that they are replicated along with the host genome (e.g., viral vectors and transposons). Vectors may include heterologous genetic elements that are necessary for propagation of the vector or for expression of an encoded gene product. Vectors may also include a reporter gene or a selectable marker gene. Suitable vectors include plasmids (i.e., circular double-stranded DNA molecules) and mini-chromosomes.
In a fifth aspect, the present invention provides cells comprising one of the engineered polypeptides, polynucleotides, constructs, or vectors described herein. The cells may be eukaryotic or prokaryotic. Preferably, the cell is a type of cell that can be used for large-scale production of aromatic amino acids or CO2 sequestration. For example, in some embodiments, the cell is a plant cell, a bacterial cell, a fungal cell, or a protist cell.
In some embodiments, the cell is a plant cell. Suitable plant cells for use with the present invention include, without limitation, tomato plant cells, tobacco plant cells, soybean plant cells, cotton plant cells, poplar plant cells, sorghum plant cells, rice plant cells, corn plant cells, beet plant cells, mung bean plant cells, opium poppy plant cells, alfalfa plant cells, wheat plant cells, barley plant cells, millet plant cells, oat plant cells, rye plant cells, rapeseed plant cells, and miscanthus plant cells.
In a sixth aspect, the present invention provides seeds comprising one of the engineered polypeptides, polynucleotides, constructs, vectors, or cells described herein. A “seed” is an embryonic plant enclosed in a protective outer covering. In embodiments in which the seed comprises a nucleic acid (i.e., a polynucleotide, construct, or vector) described herein, the nucleic acid may either be integrated into the genome of the seed or exist independently from the genome.
In a seventh aspect, the present invention provides plants grown from the seeds described herein and plants comprising one of the engineered polypeptides, polynucleotides, constructs, vectors, or cells described herein.
As used herein, the term “plant” includes both whole plants and plant parts. Examples of plant parts include, without limitation, embryos, pollen, ovules, flowers, glumes, panicles, roots, root tips, anthers, pistils, leaves, stems, seeds, pods, calli, clumps, cells, protoplasts, germplasm, asexual propagates, and tissue cultures. This term also includes chimeric plants in which only a subset of the plant's cells comprises the engineered polypeptide, polynucleotide, construct, or vector.
The plants may be of any species. In some embodiments, the plant is selected from a tomato plant, a tobacco plant, a soybean plant, a cotton plant, a poplar plant, a sorghum plant, a rice plant, and a corn plant. The protein sequences of DHS enzymes found in these plants are provided as SEQ ID NO:1-37 (see
In the Examples, the inventors demonstrate that plants (i.e., both Arabidopsis thaliana and Nicotiana benthamiana plants) comprising sota mutant DHS enzymes (1) produce more aromatic amino acids, and (2) assimilate a greater quantity of CO2 as compared to a control plant. As used herein, the term “control plant” refers to a comparable plant (e.g., of the same species, cultivar, and age) that was raised under the same or comparable conditions (e.g., water, sunlight, nutrients) but that does not express an engineered DHS polypeptide described herein.
In some embodiments, the plant produces a greater quantity of aromatic amino acids (i.e., tyrosine, phenylalanine, and tryptophan) or produces aromatic amino acids at a greater rate as compared to a control plant. Suitably, the plant produces at least 1.5-, 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 11-, 12-, 13-, 14-, 15-, 16-, 17-, 18-, 19-, or 20-fold more aromatic amino acids as compared to the control plant. Production of aromatic amino acids may be measured using 13CO2 labeling followed by quantification via gas chromatography-mass spectrometry (GC-MS), liquid chromatography-mass spectrometry (LC-MS), or nuclear magnetic resonance (NMR).
In some embodiments, the plant assimilates a greater quantity of CO2 or assimilates CO2 at a greater rate as compared to a control plant. Suitably, the CO2 assimilation of the plant is at least 2%, 5%, 10%, 20%, 30%, 40%, 50%, or 60% greater than that of a control plant. CO2 assimilation may be quantified by measuring the gas exchange activity of the plant. For example, CO2 assimilation may be measured using an LI-6400XT photosynthesis system equipped with the 6400-40 leaf chamber (LI-COR), as described in the Examples. Alternatively, labeled 13CO2 can be fed to plants and the rate of 13C incorporation into plants can be measured over time.
In an eighth aspect, the present invention provides methods for improving a plant by (1) increasing production of aromatic amino acids in the plant, and/or (2) increasing the amount of CO2 sequestered by the plant. The methods comprise: introducing one of the engineered polypeptides, polynucleotides, constructs, or vectors described herein into the plant.
As used herein, “introducing” describes a process by which exogenous polypeptides or polynucleotides are introduced into a recipient cell. Suitable introduction methods include, without limitation, Agrobacterium-mediated transformation, the floral dip method, bacteriophage or viral infection, electroporation, heat shock, lipofection, microinjection, and particle bombardment. CRISPR/Cas-based gene editing systems may also be used to edit a native DHS gene in a plant to include at least one of the sota mutations described herein.
In some embodiments, the methods further comprise purifying aromatic amino acids or derivatives thereof from the plant. As used herein, the term “purifying” refers to the process of separating a desired product from other cellular components and impurities. Suitable methods for purifying aromatic amino acids and derivatives thereof include, without limitation, high performance liquid chromatography (HPLC) and other chromatographic techniques, such as affinity chromatography. A “purified” product may be at least 85% pure, at least 95% pure, or at least 99% pure.
In some embodiments, the plant to be improved is selected from a tomato plant, a tobacco plant, a soybean plant, a cotton plant, a poplar plant, a sorghum plant, a rice plant, and a corn plant.
In a ninth aspect, the present invention provides methods for using the plants described herein to (1) produce aromatic amino acids or derivatives thereof, or (2) sequester CO2. Both sets of methods comprise growing the plants described herein. The methods for producing aromatic amino acids or derivatives thereof further comprise purifying the aromatic amino acids or derivatives thereof produced by the plant.
Exemplary aromatic amino acid derivatives that could be produced using the methods of the present invention include the tyrosine derivatives homogentisate (HGA), α-tocopherols, and γ-tocopherols, which were found to be produced at increased levels in plants comprising engineered DHS polynucleotides.
“Carbon sequestration” is a process in which atmospheric CO2 is captured and stored. It is one method for reducing the amount of CO2 in the atmosphere (i.e., to reduce global climate change). In some embodiments, the methods further comprise harvesting part of the plant while leaving the roots of the plant in the soil such that the carbon contained in the roots is sequestered therein. Harvestable parts of plants include, without limitation, flowers, pollen, seedlings, tubers, leaves, stems, fruit, seeds, roots, cuttings, and the like. Aboveground tissues that are enriched for aromatic compounds will be decomposed slowly by soil microbes, which also enhances carbon sequestration.
The present disclosure is not limited to the specific details of construction, arrangement of components, or method steps set forth herein. The compositions and methods disclosed herein are capable of being made, practiced, used, carried out and/or formed in various ways that will be apparent to one of skill in the art in light of the disclosure that follows. The phraseology and terminology used herein is for the purpose of description only and should not be regarded as limiting to the scope of the claims. Ordinal indicators, such as first, second, and third, as used in the description and the claims to refer to various structures or method steps, are not meant to be construed to indicate any specific structures or steps, or any particular order or configuration to such structures or steps. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to facilitate the disclosure and does not imply any limitation on the scope of the disclosure unless otherwise claimed. No language in the specification, and no structures shown in the drawings, should be construed as indicating that any non-claimed element is essential to the practice of the disclosed subject matter. The use herein of the terms “including,” “comprising,” or “having,” and variations thereof, is meant to encompass the elements listed thereafter and equivalents thereof, as well as additional elements. Embodiments recited as “including,” “comprising,” or “having” certain elements are also contemplated as “consisting essentially of” and “consisting of” those certain elements.
Recitations of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. For example, if a concentration range is stated as 1% to 50%, it is intended that values such as 2% to 40%, 10% to 30%, or 1% to 3%, etc., are expressly enumerated in this specification. These are only examples of what is specifically intended, and all possible combinations of numerical values between and including the lowest value and the highest value enumerated are to be considered to be expressly stated in this disclosure. Use of the word “about” to describe a particular recited amount or range of amounts is meant to indicate that values very near to the recited amount are included in that amount, such as values that could or naturally would be accounted for due to manufacturing tolerances, instrument and human error in forming measurements, and the like. All percentages referring to amounts are by weight unless indicated otherwise.
No admission is made that any reference, including any non-patent or patent document cited in this specification, constitutes prior art. In particular, it will be understood that, unless otherwise stated, reference to any document herein does not constitute an admission that any of these documents forms part of the common general knowledge in the art in the United States or in any other country. Any discussion of the references states what their authors assert, and the applicant reserves the right to challenge the accuracy and pertinence of any of the documents cited herein. All references cited herein are fully incorporated by reference, unless explicitly indicated otherwise. The present disclosure shall control in the event there are any disparities between any definitions and/or description found in the cited references.
The following examples are meant only to be illustrative and are not meant as limitations on the scope of the invention or of the appended claims.
Terrestrial plants can convert atmospheric CO2 into diverse and abundant aromatic compounds, which have unusual stability due to their aromaticity (i.e., electron delocalization) and hence are promising sinks for carbon storage of atmospheric CO2. However, it is unclear how plants control the shikimate pathway, which connects the photosynthetic carbon fixation pathway (i.e., the Calvin-Benson-Bassham (CBB) cycle) to the pathways responsible for the biosynthesis of aromatic amino acids (AAAs) and aromatic phytochemicals (
In the following example, we identify suppressor of tyra2 (sota) mutations in Arabidopsis thaliana that deregulate the first step of the plant shikimate pathway by alleviating effector-mediated feedback regulation. Plants with these sota mutations showed hyperaccumulation of aromatic amino acids accompanied by up to a 30% increase in net CO2 assimilation. Thus, the identified mutations could be used to enhance plant-based conversion of atmospheric CO2 into high-energy and high-value aromatic compounds.
Suppressor of Tyra2 (sota) Identified Dominant Mutations Targeting the Entry Step of the Shikimate Pathway
We conducted genetic screening to isolate suppressors of the Arabidopsis thaliana tyra2 knockout mutant, which lacks one of two TyrA genes of tyrosine biosynthesis (
For genetic mapping, eight representative lines (i.e., sotaA4, sotaA11, sotaB3, sotaB4, sotaF1, sotaG1, sotaH1, and sotaH9) were backcrossed with the original Arabidopsis tyra2 mutant. Illumina whole-genome sequencing of tyra2-like and/or sota-like F2 progenies identified high-frequency missense mutations from all eight lines in At4g39980, At4g33510, or At5g05920, which are the three loci encoding 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase (DAHP synthase or DHS) isoforms (
sota Mutations Alleviate the Complex Regulation of Plant DHS Enzymes
Within the DHS proteins, the identified DHS sota mutations were located near a predicted effector binding site away from the active site (
Unlike DHS2, DHS1 is not inhibited by AAAs (9), and this was also the case for the DHS1B4 mutant enzyme (
Further screening demonstrated that Trp-derived indole-3-pyruvate (IPA), the immediate precursor of the plant hormone indole-3-acetate (IAA; auxin), and, to a lesser extent, IAA itself inhibit both DHS1 and DHS2 (
sota Mutations Deregulate the Shikimate Pathway and Elevate AAAs
To directly test whether the relaxed feedback regulation of DHS enzymes with sota mutations increases the shikimate pathway activity in plants, Arabidopsis Col-0 (WT) and the sotaB4 and sotaA4 mutant plants were fed with stable isotope-labeled 13CO2 in the light for 6 hours from the beginning of the day. The following time course metabolite analyses showed that the 13C label was gradually incorporated into various metabolites (
To further assess the impacts of the sota mutations on AAA and AAA-derived metabolites, we conducted targeted metabolite profiling using GC-MS and liquid chromatography (LC)-MS. First, we generated the sotaB4 and sotaA4 mutants in the Arabidopsis Col-0 background by outcrossing to Col-0. Overall, these plants were indistinguishable from Col-0 in terms of their growth and seed yield (Table 6 and
The levels of HGA and α- and γ-tocopherols derived from Tyr were, like Tyr, also elevated in both sotaB4 and sotaA4 mutants (
Further careful comparisons of GC-MS traces between genotypes revealed that a few previously unidentified peaks appeared in both sotaB4 and sotaA4 mutants but not in Col-0 samples. On the basis of the National Institute of Standards and Technology library search and subsequent comparison to respective authentic standards, these peaks were identified as PPY, the keto acid of Phe produced by aromatic aminotransferases (21-23), as well as phenylacetate and phenyllactate, which are both likely derived from PPY (
DHS uses two substrates, phosphoenolpyruvate (PEP) and erythrose-4-phosphate (E4P) that are directly supplied from glycolysis and the CBB cycle, respectively (
To further test potential impacts of the sota mutations on photosynthetic carbon fixation, net CO2 assimilation rates (A) in response to different light intensity were analyzed by measuring the gas exchange activity of Col-0, sotaB4, and sotaA4 plants. Both sota mutant plants exhibited significantly higher A levels at all light intensities at and above 100 microeinstein (μE), the growth light condition used in this study, and eventually reached a plateau to an approximately 30% higher assimilation than Col-0 (
The DHS-catalyzed reaction has been assumed to be important for the regulation of the plant shikimate pathway based on prior microbial studies (26, 28) and expression of deregulated microbial DHS in plants (29-31). Our study provides strong genetic evidence to support this notion, as all eight studied metabolic sota mutations mapped to the loci encoding DHSs, but not other shikimate pathway enzymes. Unlike microbial DHSs, which are directly inhibited by the pathway products (i.e., AAAs), plant DHSs were found in this study to be subject to highly complex feedback regulation mediated not only by AAAs but also by many AAA-derived compounds (
The elevated CO2 assimilation observed in the sota mutants was striking and is likely important for efficient supply of E4P (
Arabidopsis thaliana plants used in this study were grown under a 12-hour/12-hour 100-μE light/dark cycle with 85% air humidity in soil supplied with Hoagland solution or on the agarose-containing 0.5-strength Murashige and Skoog (MS) medium with 1% sucrose, unless stated otherwise.
Screen for Suppressor of Tyra2 (sota) Mutations
The seeds of the tyra2-1 transfer DNA insertion mutant (SALK_001756), which was previously determined to be null homozygous with a dwarf and reticulate phenotype (15), were used to conduct a forward genetic suppressor screen using ethyl methanesulfonate (EMS), following a method by Weigel and Glazebrook (38) with a few modifications. Briefly, ~10,000 tyra2 homozygous seeds were mutagenized with 0.2% EMS (M0880, Sigma-Aldrich) for 15 hours in a 50-mL Falcon tube on a rocking platform. Seeds were rinsed with ultrapure water 10 times and soaked in the last rinse for 1 hour. Subsequently, seeds were suspended in 400 mL 0.1% agarose and spread on eight different trays (~50 mL on each tray, the 1020 tray; CN-FLXHD, Greenhouse Megastore, Danville) containing germination soil mix (8269028, Sungro). Eight M1 pools from different trays were named with alphabet letters (A to H). Each pool contained approximately 1000 M1 plants. Mutagenesis efficiency was calculated by applying the Poisson distribution, as described previously (38). Observation of siliques from 50 M1 plants identified 15 plants without aborted seeds, indicating that the mutagenesis was successful. M2 screening was performed by germinating ~10,000 seeds from each M1 pool on 10 trays containing the germination mix. A total of ~80,000 M2 seeds were germinated on 80 trays. Phenotypes were evaluated at 4 to 5 weeks after germination. Col-0 and tyra2-1 were germinated side by side with EMS mutants in each tray for comparison. Plants showing the tyra2-like dwarf and reticulate leaf phenotypes were removed, while those showing any recovery of one or both of the tyra2 phenotypes were kept and deemed to be suppressor of tyra2 (sota) lines. Each sota line was named based on the pool (i.e., A to H) from which it originated followed by a number. For example, the line sotaB4 is the fourth sota line recovered from pool B.
Each M2 sota line was allowed to self-fertilize, and the resulting M3 seeds were collected for further experiments.
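The Poisson-based estimate of mutagenesis efficiency mentioned above can be illustrated with a minimal sketch. It assumes that the fraction of M1 plants without aborted seeds is treated as the zero class of a Poisson distribution, P(0) = e^(−λ), so that λ is the mean number of scorable mutational events per plant. This reading of the method of reference 38 is an assumption, and the function name is hypothetical.

```python
import math

def poisson_mean_hits(n_scored, n_unaffected):
    """Estimate the Poisson mean (lambda) from the zero class.

    n_scored:     number of M1 plants whose siliques were inspected
    n_unaffected: number of those plants showing no aborted seeds
    Assumes P(0) = exp(-lambda), i.e. lambda = -ln(n_unaffected / n_scored).
    """
    if not 0 < n_unaffected <= n_scored:
        raise ValueError("need 0 < n_unaffected <= n_scored")
    return -math.log(n_unaffected / n_scored)

# Values reported in the screen: 15 of 50 M1 plants had no aborted seeds.
lam = poisson_mean_hits(50, 15)
print(f"estimated mean events per M1 plant: {lam:.2f}")
```

With the reported counts (15 of 50), the zero-class fraction is 0.3, which corresponds to a mean of roughly 1.2 events per M1 plant under this assumption.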
Whole-Genome Sequencing-Based Mapping of sota Mutations
To identify the causal mutations leading to the suppression of the tyra2 phenotypes and the accumulation of aromatic amino acids (AAA) in the metabolic sota lines, the M3 plants of a first subset of the sota lines (i.e., sotaA4, sotaA11, and sotaB4) were backcrossed with tyra2. Note: The remaining sota lines were analyzed later, see below. The F1 population also showed the tyra2 recovery phenotype, indicating that all three of the tested sota mutations had semidominant or dominant characteristics, with the F1 plants of sotaB4 being almost indistinguishable from its M3 parent. As expected, roughly one quarter of F2 segregating populations showed the tyra2-like phenotypes (
dCAPS-Based Genotyping of the sota Mutants
To determine if the DHS sota mutations identified by the whole genome sequencing segregated with the sota-like phenotype (i.e., suppression of tyra2 phenotypes), the presence or absence of each DHS sota mutation was examined in F2 populations via a derived cleaved amplified polymorphic sequence (dCAPS) analysis. Primers for each sota SNV were designed using the bioinformatic tool dCAPS Finder 2.0 (39), while complementary primers for each dCAPS primer were designed using primer3 v.0.4.0 (40). The sequences of these primers are listed in Table 9 and Table 10. Polymerase chain reaction (PCR) was performed using EconoTaq PLUS green 2× master mix (Lucigen) in a 20-μL reaction containing ~10 ng genomic DNA and 0.5 μM of each primer. After amplification, the PCR product was visualized on a 4% 1× tris-borate EDTA (TBE)-agarose gel via electrophoresis and 5 μL of the PCR product was digested using the restriction enzyme indicated in Table 9 (Thermo Scientific) in a 20-μL reaction. Digested fragments were separated by electrophoresis in a 4-5% 1×TBE-agarose gel containing ethidium bromide. The GeneRuler Ultra Low Range DNA Ladder (Thermo Scientific) was used to verify the sizes of the digested fragments. In all eight sota lines, the corresponding DHS sota mutation was found only in F2 individuals exhibiting the tyra2 suppression phenotypes (
We next determined whether the identified sota mutations were responsible for the observed phenotypes, including the tyra2 suppression phenotypes and the elevated levels of Tyr and Phe (
Mutagenesis PCR was carried out by mixing 1 ng ribonuclease-treated plasmid as template, 2× PrimeSTAR® MAX DNA polymerase mix (R045A, Takara Bio USA), and 0.5 μM oligonucleotide primers (Table 10), which were designed using the Takara Web tool for mutagenesis (www.takarabio.com/learning-centers/cloning/primer-design-and-other-tools). After 20 cycles of PCR (98° C. for 15 seconds, 58° C. for 10 seconds, 72° C. for 2 minutes, and final extension at 72° C. for 5 minutes), the PCR product was treated with FastDigest DpnI (Thermo Scientific), purified using QIAquick PCR Purification Kit (QIAGEN), and introduced into ultracompetent E. coli MC1061 cells (Lucigen). The final binary vector sequence was confirmed by whole-plasmid sequencing (MGH DNA Core).
To generate transgenic Arabidopsis plants in the tyra2-1 mutant background, tyra2-1 seeds were germinated on the germination mix and grown until flowering before being transformed with each construct using the floral dip method (43). The transformed T0 plants were allowed to complete their life cycle in the growth chamber, and dried T1 seeds were harvested. The positive T1 transformants were then selected based on RFP fluorescent marker expression, i.e., by observing the seeds under the AxioZoom V16 (Zeiss) stereo fluorescent microscope with RFP settings (EX 572/25, BA590, EM 629/62). T2 seeds were used to select lines that contain a single insertion of the transgene. Overall, eight individual T2 plants from each single insertion line were allowed to complete their life cycle and their seeds were observed under a stereo RFP fluorescent microscope to identify homozygous T3 seeds, which were used for further analyses. Due to positional effects, some T2 homozygous plants could not complete their life cycle because of high accumulation of AAAs, similar to the sotaF1 homozygous line. For these specific lines, T2 heterogeneous plant populations were used for further analysis. Notably, although the hygromycin resistance gene was also present, seed selection based on RFP expression was more efficient and less aggressive, allowing for the germination of positive transformants directly on soil.
To generate transgenic lines expressing the WT or sota-mutated DHS genes in the Col-0 background, the same constructs that were used for the complementation test were transformed into Col-0 plants. One leaf of each 5-week-old T2 plant of each line was first analyzed for photosynthetic measurement, and then other leaves were harvested from the same plant for metabolite analysis.
To generate recombinant DHS proteins, the pET28a vectors carrying the A. thaliana DHS1 (AtDHS1), AtDHS2, or AtDHS3 WT sequence without the predicted plastid transit peptide (amino acid residues 49-525, 34-507, and 52-527, respectively) were expressed in E. coli Rosetta-2 cells, and the proteins were purified using Ni-affinity chromatography, exactly as was conducted previously (9). To generate DHS proteins with individual sota mutations via site-directed mutagenesis, these pET28a plasmid templates were diluted 500-fold and mixed with 0.04 U/μL Phusion DNA polymerase (Thermo Scientific), 0.2 mM deoxynucleoside triphosphates (dNTPs), 1× Phusion reaction buffer (Thermo Scientific), and 0.5 μM forward and reverse mutagenesis primers (Table 10). The PCR reaction was run using the following protocol: 98° C. for 30 s followed by 20 cycles of 10 s at 98° C., 20 s at 70° C., 4.5 min at 72° C. with a final extension at 72° C. for 10 min. The PCR products were purified using a QIAquick Gel Extraction Kit (QIAGEN), treated with FastDigest DpnI (Thermo Scientific) for 20 min at 37° C. to digest methylated plasmid template DNA, and transformed into E. coli cells. The mutagenized pET28a plasmids were sequenced to confirm that no errors were introduced during the mutagenesis process.
The DHS enzyme assays were conducted using the colorimetric method that we recently described (9). Briefly, the enzyme solution (7.7 μl) containing 50 mM Hepes (pH 7.4) was preincubated with an effector molecule(s) at room temperature for 15 min. For assays using recombinant protein and enzyme fractions isolated from plant leaves, 0.01 to 0.1 μg and approximately 50 μg of proteins were used, respectively. After adding 0.5 μl of 0.1 M dithiothreitol, the samples were further incubated at room temperature for 15 min. During these incubations, the substrate solution containing 50 mM Hepes (pH 7.4), 2 mM MnCl2, 4 mM E4P, and 4 mM PEP at final concentration was preheated at 37° C. The enzyme reaction was started by adding 6.8 μl of the substrate solution, then incubated at 37° C. for 30 min, and terminated by adding 30 μl of 0.6 M trichloroacetic acid. After a brief centrifugation, 5 μl of 200 mM NaIO4 (sodium meta-periodate) in 9 N H3PO4 was added to oxidize the enzymatic product, and the mixture was incubated at 25° C. for 20 min. To stop the oxidation reaction, 20 μl of 0.75 M NaAsO2 (sodium arsenite), which was dissolved in 0.5 M Na2SO4 and 0.05 M H2SO4, was added and immediately mixed. After 5 min of incubation at room temperature, one-third of the sample solution was transferred to a new tube to be mixed with 50 μl of 40 mM thiobarbituric acid and incubated at 99° C. for 15 min in a thermal cycler. The mixture was added to 600 μl of cyclohexanone in eight-strip solvent-resistant plastic tubes, mixed vigorously, and centrifuged at 4500 g for 3 min to separate water- and cyclohexanone-based layers for the extraction of the developed pink chromophore. The absorbance of the pink supernatant was read at 549 nm with the microplate reader (Infinite 200 PRO, TECAN) to calculate DAHP production with the molar extinction coefficient at 549 nm (ε549) of 4.5×10^4 M^−1 cm^−1.
Reaction mixtures with boiled enzymes were run in parallel and used as negative controls to estimate the background signal.
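The conversion from absorbance to DAHP amount can be sketched with the Beer-Lambert law, using the extinction coefficient given above and subtracting the boiled-enzyme background. The optical path length, the assumption that the chromophore is recovered quantitatively in the 600 μl cyclohexanone layer, and the function and parameter names are illustrative assumptions rather than details stated in this disclosure.

```python
EPSILON_549 = 4.5e4  # M^-1 cm^-1, molar extinction coefficient stated in the text

def dahp_nmol(a_sample, a_boiled, path_cm=1.0, extract_ul=600.0,
              fraction_assayed=1.0 / 3.0):
    """Estimate DAHP (nmol) formed in the whole reaction.

    a_sample, a_boiled: A549 of the reaction and of the boiled-enzyme control
    path_cm:            optical path length (ASSUMED; plate-reader dependent)
    extract_ul:         cyclohexanone volume holding the chromophore (ASSUMED
                        to contain all of it)
    fraction_assayed:   share of the stopped reaction carried into the
                        thiobarbituric acid step (one-third in the text)
    """
    delta_a = a_sample - a_boiled                    # background-corrected absorbance
    conc_molar = delta_a / (EPSILON_549 * path_cm)   # Beer-Lambert: c = A / (eps * l)
    nmol_in_extract = conc_molar * (extract_ul * 1e-6) * 1e9
    return nmol_in_extract / fraction_assayed        # scale back to the full reaction

# Illustrative absorbance values, not measured data.
print(round(dahp_nmol(0.50, 0.05), 2))
```

Dividing the result by the incubation time and the protein amount would give a specific activity; those steps are omitted here because the disclosure does not state the normalization used.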
The three-dimensional structure of DHS2 WT was generated by homology modeling using the high resolution structure 5uxm.pdb of type II DHS from Pseudomonas aeruginosa as a template structure (16). DHS2 WT has more than 60% sequence identity with the template. Homology modeling was performed using Modeller 9.24 (44). The model with the lowest discrete optimized protein energy (DOPE) value was chosen for further validation. The modelled structure was validated by inspection of phi/psi distributions of a Ramachandran plot obtained through PROCHECK (45) and the significance of consistency between template and models was evaluated using the ProSA server (46). In addition, the root mean square deviation (RMSD) was analyzed by Chimera (match-maker) (47) on superimposition of template (5uxm.pdb) with predicted structures to check the reliability of models. The model shows RMSD of 0.207 Å to 5uxm.pdb Trp for 441 atom pairs. The Trp binding site was mapped in the model on Chimera by superposition of Trp-bound 5uxm.pdb.
To examine the impact of the sota mutations on the interaction between DHS and AAA effectors (
Approximately 50 to 80 mg of fully expanded mature leaves were pooled from multiple plants at the same developmental stages. For seedling analyses, approximately 50 mg of shoots and 10 to 20 mg of roots were pooled from more than five 10-day-old seedlings. After quickly measuring their fresh weight, obtained tissues were immediately frozen in liquid nitrogen and kept at −80° C. until use. The frozen tissues were mixed in 800 μl of extraction buffer containing (v/v) 2:1 of methanol and chloroform with isovitexin (0.5 μg/ml) (MilliporeSigma), 100 μM norvaline (Thermo Fisher Scientific), and Tocol (1.25 μg/ml) (Matreya LLC), as internal standards for soluble metabolite analysis by LC-MS and GC-MS and tocopherol analysis by GC-MS, respectively. The mixtures were immediately homogenized for at least 3 min using the 1600 MiniG Tissue Homogenizer (SPEX SamplePrep) and 3-mm glass beads. After adding 600 μl of H2O and then 250 μl of chloroform, polar phase containing amino acids and nonpolar phase containing tocopherols were separated by centrifugation and dried in new tubes for further analysis.
Metabolite analyses of amino acids and tocopherols using GC-MS were carried out after derivatization of the polar and nonpolar metabolites with N-tert-butyldimethylsilyl-N-methyltrifluoroacetamide with 1% tert-butyldimethylchlorosilane (Cerilliant) and N-methyl-N-(trimethylsilyl)trifluoroacetamide (MSTFA) with 1% trimethylchlorosilane (Restek), respectively, exactly as we previously described (15, 49).
For targeted metabolite analysis of Trp and AAA-derived compounds, reverse-phase LC-MS analysis with the Vanquish UHPLC system coupled with the Q Exactive Quadrupole-Orbitrap MS (Thermo Fisher Scientific) was conducted as previously described (9), with some modifications. The metabolites were dissolved in 70 μl of LC-MS-grade 80% methanol and separated using the mobile phases of 0.1% formic acid in LC-MS-grade water (solvent A) and 0.1% formic acid in LC-MS-grade acetonitrile (solvent B) at a flow rate of 0.4 ml/min and a column temperature of 40° C. The binary 25-min linear gradient with the following ratios of solvent B was used: 0 to 1 min, 1%; 1 to 10 min, 1 to 10%; 10 to 13 min, 10 to 30%; 13 to 14.5 min, 30 to 70%; 14.5 to 15.5 min, 70 to 99%; 15.5 to 21 min, 99%; 21 to 22.5 min, 99 to 10%; 22.5 to 23 min, 10 to 1%; and 23 to 25 min, 1%. The spectra were recorded using the full scan mode of negative ion detection, covering a mass range from mass/charge ratio (m/z) 100 to 1500. The resolution was set to 25,000, and the maximum scan time was set to 250 ms. The sheath gas was set to a value of 60, while the auxiliary gas was set to 35. The transfer capillary temperature was set to 150° C., while the heater temperature was adjusted to 300° C. The spray voltage was fixed at 3 kV, with a capillary voltage and a skimmer voltage of 25 and 15 V, respectively. The identity of amino acids and I3M peaks was confirmed by comparing their accurate masses and retention times with those of the corresponding authentic standards. The identity of the other compounds was confirmed by LC-tandem MS analysis as previously performed (9). Quantification was based on the standard curves generated by injecting different concentrations of authentic chemical standards. The isovitexin peak of each sample was detected to normalize the sample-to-sample variation and to calculate the recovery rate by comparing with a blank sample corresponding to 800 μl of the extraction buffer.
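The 25-min reverse-phase gradient described above is a piecewise-linear program. As a minimal sketch for checking the solvent B percentage at any time point, the breakpoints from the text can be stored as a table and interpolated linearly; the names `GRADIENT` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in min, % solvent B) breakpoints transcribed from the gradient above
GRADIENT = [
    (0.0, 1.0), (1.0, 1.0), (10.0, 10.0), (13.0, 30.0), (14.5, 70.0),
    (15.5, 99.0), (21.0, 99.0), (22.5, 10.0), (23.0, 1.0), (25.0, 1.0),
]

def percent_b(t_min):
    """Return % solvent B at time t_min by linear interpolation."""
    times = [t for t, _ in GRADIENT]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time outside the 0 to 25 min program")
    i = bisect_right(times, t_min) - 1        # index of the segment start
    if i == len(GRADIENT) - 1:                # exactly at the final breakpoint
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

for t in (0.0, 5.5, 12.0, 18.0, 25.0):
    print(t, round(percent_b(t), 2))
```

The same table-driven approach applies to the HILIC and IAA gradients by substituting their breakpoints.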
For quantification of some highly polar metabolites such as shikimate, we used hydrophilic interaction chromatography (HILIC) followed by compound detection with a Vanquish UHPLC (ultrahigh-performance LC) system coupled with the Q Exactive MS (Thermo Fisher Scientific). The same samples used for reverse-phase LC-MS analysis were injected onto an HPLC Poroshell 120 HILIC-Z column (150-mm by 2.1-mm inner diameter, 2.7-μm particle size; Agilent) and eluted using mobile phases of 0.2% acetic acid in LC-MS-grade water containing 5 mM ammonium acetate (solvent A) and 0.2% acetic acid in LC-MS-grade acetonitrile containing 5 mM ammonium acetate (solvent B) with the following 22.5-min gradient at a flow rate of 0.45 ml/min and column temperature of 40° C. The binary linear gradient with the following ratios of solvent B was used: 0 to 1 min, 100%; 1 to 11 min, 100 to 89%; 11 to 15.75 min, 89 to 70%; 15.75 to 16.25 min, 70 to 20%; 16.25 to 18.5 min, 20%; 18.5 to 18.6 min, 20 to 100%; and 18.6 to 22.5 min, 100%. The spectra were recorded using the full-scan negative-ion mode, covering a mass range from m/z 70 to 1050. The resolution was set to 70,000, and the maximum scan time was set to 100 ms. The sheath gas was set to a value of 60, while the auxiliary gas was set to 35. The transfer capillary temperature was set to 150° C., while the heater temperature was adjusted to 300° C. The spray voltage was fixed at 3 kV, with a capillary voltage and a skimmer voltage of 25 and 15 V, respectively. Retention times, MS spectra, and associated peak intensities were extracted from the raw files using the Xcalibur software (Thermo Fisher Scientific). The identities of metabolite peaks were confirmed by comparing their accurate masses and retention times with those of the corresponding authentic standards. Quantification was based on the standard curves generated by injecting different concentrations of authentic chemical standards.
The isovitexin peak was also detected as an internal standard for the normalization and the recovery rate calculation as used in the reverse-phase LC-MS analysis above.
The IAA level was quantified as previously reported (50), with some modifications. Approximately 150 mg of 10-day-old Arabidopsis WT and the sota mutant seedlings grown on the agar plates were pooled and quickly frozen in a tube with three 3-mm glass beads. After grinding frozen tissues with the 1600 MiniG Tissue Homogenizer (SPEX SamplePrep), the sample was dissolved in 1 ml of ice-cold sodium phosphate buffer (100 mM; pH 7.0) containing 1% (w/v) diethyldithiocarbamic acid and 1 μM isovitexin and shaken on an orbital shaker for 20 min at 4° C. After centrifugation at 23,000 g and 4° C. for 20 min, the pH of the supernatant was adjusted to below 3.0 with 1 N hydrochloric acid. The IAA metabolite was obtained by solid-phase extraction using Oasis HLB columns (1 ml/30 mg; Waters), which were conditioned with 1 ml of methanol and then 1 ml of water and equilibrated with 0.5 ml of sodium phosphate buffer (acidified with 1 N hydrochloric acid to below pH 3). After the sample application, the column was washed with 2 ml of 5% methanol and then eluted with 2 ml of 80% methanol. The eluate was evaporated and stored at −20° C. until LC-MS analysis. IAA was detected by the same reverse-phase LC-MS method as described above, with the following modifications. The metabolites were separated using the mobile phases of 0.1% formic acid in LC-MS-grade water (solvent A) and 0.1% formic acid in LC-MS-grade acetonitrile (solvent B) at a flow rate of 0.2 ml/min. The binary 25-min linear gradient with the following ratios of solvent B was used: 0 to 0.5 min, 10%; 0.5 to 10 min, 10 to 50%; 10 to 12.5 min, 50 to 60%; 12.5 to 14.5 min, 60 to 70%; 14.5 to 16 min, 70 to 99%; 16 to 21 min, 99%; 21 to 22.5 min, 99 to 10%; and 22.5 to 25 min, 10%. The separated metabolites were detected as described above in the reverse-phase LC-MS analysis, using a selective ion monitoring (SIM) mode.
The identity of the IAA peak was confirmed by comparing its accurate mass and retention times with those of the corresponding authentic standards. Quantification was based on the standard curves generated by injecting different concentrations of authentic chemical standards. The isovitexin peak was also detected as an internal standard for the normalization and the recovery rate calculation.
For anthocyanin quantification, the polar phase isolated for amino acid analysis was diluted 10 times with water in a new tube. After adding 5 μl of 5 N HCl for acidification, the absorption was measured at 530 and 657 nm with a microplate reader (Infinite 200 PRO, TECAN) to calculate anthocyanin contents with the formula A530−0.25×A657 (51). For chlorophyll quantification, the nonpolar phase was dried down and then resuspended in 1 ml of 90% methanol. Several serial dilutions were prepared, and absorbance at 652 and 665 nm was measured using a microplate reader (Infinite 200 PRO, TECAN). The quantities of chlorophylls in each dilution were estimated by the following equations: Chl a=16.72×A665−9.16×A652 and Chl b=34.09×A652−15.28×A665 (52).
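The pigment formulas above translate directly into code. The following minimal sketch implements exactly the three stated equations; the function names are hypothetical, and the units of the chlorophyll values depend on the conventions of reference 52, which are not restated here.

```python
def anthocyanin_index(a530, a657):
    """Relative anthocyanin content: A530 - 0.25 x A657."""
    return a530 - 0.25 * a657

def chlorophyll_a(a665, a652):
    """Chl a = 16.72 x A665 - 9.16 x A652 (90% methanol extract)."""
    return 16.72 * a665 - 9.16 * a652

def chlorophyll_b(a665, a652):
    """Chl b = 34.09 x A652 - 15.28 x A665 (90% methanol extract)."""
    return 34.09 * a652 - 15.28 * a665

# Illustrative absorbance readings, not measured data.
a530, a657, a665, a652 = 0.80, 0.20, 0.50, 0.30
print(round(anthocyanin_index(a530, a657), 3))
print(round(chlorophyll_a(a665, a652), 3), round(chlorophyll_b(a665, a652), 3))
```

Because several serial dilutions were measured, each computed value would additionally be multiplied by its dilution factor and normalized to fresh weight; that bookkeeping is left out of the sketch.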
13CO2 Labeling Experiments
The 13CO2 labeling experiments were conducted following the previously published protocol (53, 54). Briefly, for the time course labeling experiment, Col-0 WT and the sotaB4 and sotaA4 mutants (in the tyra2 background) were grown for 3 weeks under 12 hours of 150-μE light and 12 hours of darkness. These plants were transferred to a 60-liter labeling chamber (75 cm in width, 40 cm in depth, and 20 cm in height;
The harvested shoot samples were frozen and ground to fine powders using the Retsch Ball Mill MM400, and soluble metabolites were extracted as described above, except that ribitol was added, in addition to isovitexin, as an internal standard for GC-MS analysis. Soluble metabolites were dried and derivatized by MSTFA and analyzed by GC-time-of-flight-MS as described previously (55). For quantification of shikimate and Trp, the dried samples were dissolved in 100 μl of 80% MeOH and analyzed by the HILIC LC-MS and the reverse-phase LC-MS methods, respectively, as described above, with the following modified HILIC mobile phase gradient: 0 to 1 min, 100%; 1 to 1.5 min, 100 to 89%; 1.5 to 15.75 min, 89 to 70%; 15.75 to 16.25 min, 70 to 20%; 16.25 to 18.5 min, 20%; 18.5 to 18.6 min, 20 to 100%; and 18.6 to 22.5 min, 100%. To increase the sensitivity of peak detection, especially for 13C-labeled fragments, the MS compound detection was performed in SIM mode.
The peak integration and labeling calculation were carried out as described previously (54). Briefly, the peak areas of nonlabeled and labeled ions (isotopomers) in different samples were integrated using the Xcalibur software (Thermo Fisher Scientific). The obtained data were corrected for natural abundance by comparing to unlabeled control samples using the CORRECTOR software as described previously (54). The amounts of 13C-labeled metabolites (nmol/mg of fresh weight) were calculated by multiplying the total metabolite pool sizes (nmol/mg of fresh weight) with the percent of 13C-labeled over total metabolite (the sum of both 12C- and 13C-labeled metabolites).
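The final calculation described above (pool size multiplied by the labeled fraction) can be sketched as follows. The sketch assumes that the isotopomer peak areas have already been corrected for natural abundance, and it pools all 13C-containing isotopomers into a single labeled fraction; the function name and input layout are hypothetical.

```python
def labeled_amount(pool_nmol_per_mg, area_unlabeled, areas_labeled):
    """Amount of 13C-labeled metabolite (nmol/mg fresh weight).

    pool_nmol_per_mg: total metabolite pool size (nmol/mg fresh weight)
    area_unlabeled:   corrected peak area of the all-12C ion
    areas_labeled:    corrected peak areas of the 13C-containing isotopomers
    """
    total = area_unlabeled + sum(areas_labeled)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    fraction_labeled = sum(areas_labeled) / total   # 13C-labeled over total
    return pool_nmol_per_mg * fraction_labeled

# Illustrative numbers, not measured data: 25% of a 2.0 nmol/mg pool is labeled.
print(labeled_amount(2.0, 75.0, [15.0, 10.0]))
```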
Quantification of starch and sugar contents was conducted as previously described (56), with some modifications. Thirty to 50 mg of 4-week-old fully mature leaves were harvested for each biological sample at indicated time points and frozen in a tube with three 3-mm glass beads. Soluble sugars were extracted twice by boiling the sample in 700 μl of 80% ethanol at 80° C. for 45 min until the leaves became bleached. The ethanol extract was evaporated and dissolved in 200 μl of distilled water. The sucrose and glucose levels were determined using the Total Sugar Assay Kit (Megazyme) according to the manufacturer's instructions. For starch analysis, the bleached leaf tissues were air-dried and then ground in 1 ml of 100 mM sodium acetate buffer (pH 5.0) containing 5 mM CaCl2. The solubilized starch was enzymatically hydrolyzed into glucose by incubating with 10 μl of α-amylase (3 U/μl; Megazyme) at 100° C. for 15 min. After cooling to room temperature, the mixture was further incubated with 10 μl of amyloglucosidase (3 U/μl; Megazyme) at 50° C. for 50 min. The glucose concentration was determined using the Total Starch Assay Kit (Megazyme) according to the manufacturer's instructions and expressed as micromole glucose equivalent/g fresh weight (FW).
For determination of total protein content, frozen leaf tissues harvested from 4-week-old Arabidopsis plants were ground in liquid nitrogen and dissolved in 500 μl of ice-cold isolation buffer containing 20 mM Hepes (pH 7.4) and 2.5 mM EDTA to determine the protein concentration via a Bradford assay (57). For analyzing the protein amount of Rubisco large subunit (RbcL), the same samples were applied to 4 to 20% Mini-PROTEAN TGX Stain-Free Protein Gels (Bio-Rad) to visualize and quantify the RbcL bands.
To determine the lignin deposition, 4-week-old leaves and roots were first fixed in formaldehyde/acetic acid/ethanol/water at a ratio of 5:5:45:45 (v/v) and decolorized with ethanol/acetic acid at a ratio of 6:1 (v/v). Phloroglucinol staining was conducted as previously described (58). Briefly, tissues were incubated in a mixture of one volume of 37% HCl (v/v) and two volumes of 3% phloroglucinol in ethanol (w/v) for 10 min and observed under bright-field lighting with an Olympus SZX12 stereoscope. For quantifying lignin content, 4-week-old leaves (whole aerial parts) and matured inflorescence stems were harvested and freeze-dried. Three individual plant samples were obtained for each genotype. The tissues were homogeneously pulverized with a tissue homogenizer (1600 MiniG, Spex SamplePrep). The homogenate was then extracted sequentially with distilled water, methanol, and hexane and then freeze-dried to give cell wall residues (CWRs). Thioglycolic acid lignin analysis was performed as described previously (59). The relative lignin content was expressed as absorbance of thioglycolic acid lignin at 280 nm (A280) per weight of CWRs (mg).
The rate of net CO2 assimilation was measured using an LI-6400XT photosynthesis system equipped with the 6400-40 leaf chamber (LI-COR). Arabidopsis plants were grown in the growth chamber under the condition of a 12-hour/12-hour 100-μE light/dark cycle with 85% air humidity for 4 weeks after germination, and fully expanded nonshaded leaves were used for the measurement. Because leaves did not fully fill the cuvette area, the leaf area inside the cuvette was photographed and quantified by ImageJ to normalize each assimilation rate. The temperature was kept at 25° C. for all measurements. For analysis of the light response curve, the CO2 concentration in the airstream was maintained at 400 μmol/mol. For analysis of the A-Ci curve, the light intensity was saturated at 1500 μE. After acclimating the leaves at the Ci level of 400 μmol/mol to achieve a steady-state rate of assimilation, the Ci level of the response curve was set at 400, 185, 70, 35, 740, 1100, 1500, and 1900 μmol/mol, and measurements were taken when assimilation reached a steady-state rate. To determine the Vcmax, Jmax, and Rd values, each A-Ci curve was fitted to the Farquhar-von Caemmerer-Berry model by the “plantecophys” R package (60, 61). The initial slope and CO2 compensation point of the light response curves and A-Ci curves were determined using the first three and five points at low light and low Ci points, respectively, as previously calculated (62).
To test the effects of the sota mutations on the DHS gene expression, the transcript levels of DHS1, DHS2, and DHS3 were analyzed by reverse transcription quantitative PCR (RT-qPCR). Approximately 20 to 30 mg of fully expanded mature leaves were pooled from multiple 4-week-old plants grown on soil, immediately frozen in liquid nitrogen in a tube with three 3-mm glass beads, and ground using the 1600 MiniG Tissue Homogenizer (SPEX SamplePrep). Total RNA was isolated as previously described (63), treated with deoxyribonuclease I (Thermo Fisher Scientific), and reverse-transcribed to synthesize cDNA with M-MuLV reverse transcriptase and random hexamer primers (Promega) according to the manufacturer's protocol. RT-qPCR was conducted by the Stratagene Mx3000P (Agilent Technologies) using the GoTaq qPCR Master Mix (Promega), and target gene-specific primers listed in Table 10. Four biological replicates with two technical RT-qPCR replicates were conducted. Expression of the UBQ9 gene was used to normalize the sample-to-sample variations between different cDNA preparations. Relative expression levels among different genotypes were analyzed for each DHS gene using the 2^−ΔΔCt method.
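The 2^−ΔΔCt calculation named above can be sketched in a few lines. The sketch assumes an amplification efficiency of approximately 100% for both target and reference, uses UBQ9 as the reference gene as stated, and treats Col-0 as the calibrator genotype, which is an assumption; the function name is hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference, e.g. UBQ9) within each genotype
    ddCt = dCt(sample genotype) - dCt(calibrator genotype, ASSUMED Col-0)
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)

# Illustrative Ct values, not measured data: the target crosses threshold
# two cycles earlier in the sample, relative to the reference gene.
print(relative_expression(24.0, 20.0, 26.0, 20.0))
```

In practice, the technical replicates would be averaged per biological replicate before this calculation is applied.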
DHS orthologs were first identified by BlastP searches using the amino acid sequence of AtDHS1 as a query against Phytozome 13 (64). Nicotiana benthamiana DHSs were searched from the N. benthamiana draft genome sequence v1.0.1 (65). The sequence alignment of
L., Ethylmethanesulfonate saturation mutagenesis in Arabidopsis to determine frequency of herbicide resistance. Plant Physiol. 131, 139-146 (2003).
We also conducted Illumina whole-genome sequencing on 12 additional sota lines (sotaA12, sotaE3, sotaE31, sotaC4, sotaA2, sotaA5, sotaB1, sotaA3, sotaA9, sotaA13, sotaF26, and sotaA12) using the methods described in Example 1. Here, however, the mutants themselves were sequenced directly rather than being backcrossed with Arabidopsis tyra2 to generate a population. These additional sota lines were selected based on the data presented in
Mycobacterium_tuberculosis_Type2_DHS
Pseudomonas_aeruginosa_DHS
In the following example, we demonstrate that the identified mutant Arabidopsis DHS proteins can be used to increase production of AAAs in other plant species. The Arabidopsis DHS1 sotaB4 mutant was transiently expressed in tobacco, and the levels of the three AAAs were measured using LC-MS. As is shown in
Vectors for plant expression were made as previously described using MoClo modular cloning technology. For transient expression in Nicotiana benthamiana, expression of the protein-coding sequences (CDSs) of DHS1 WT and B4 was driven by a 1987-bp sequence obtained from the upstream region of the ubiquitin 10 gene (At4g05320) from Arabidopsis. In addition, a synthetic hemagglutinin (HA) tag comprising 6 repeats of the sequence YPYDVPDYA (SEQ ID NO:75) was added to the C-terminus of the protein for quantification of protein expression using an anti-HA antibody. Vectors containing the promoter, CDS, HA epitope tag, and a terminator were transformed into Agrobacterium tumefaciens via electroporation.
Transient Expression in Nicotiana benthamiana
Positive transformants were used to perform transient expression in Nicotiana benthamiana. Single bacterial colonies were grown in LB media supplemented with kanamycin (100 mg/L) and gentamicin (100 mg/L) at 28° C. with constant agitation at 200 rpm. A 10 mL initial culture was expanded to 50 mL by inoculating 50 mL of fresh LB media, supplemented with the same antibiotics plus 10 mM MES and 200 μM acetosyringone, with 3 mL of overnight culture. Bacterial cultures were allowed to grow at 28° C. with agitation at 200 rpm for 16 hours and were then sedimented in 50 mL Falcon tubes by centrifugation at 4000×g for 20 min at room temperature. After centrifugation, the growth media was decanted and the bacteria were resuspended in 5 mL of inoculation solution (10 mM MES, 10 mM MgCl2, and 200 μM acetosyringone). After complete resuspension, the bacterial concentration was evaluated by spectrophotometry using the optical density (OD) at 600 nm. The bacterial suspension was diluted to a final OD600 of 1.0 using inoculation solution. Diluted bacteria were incubated at room temperature without agitation for 3 hours and used to inoculate Nicotiana leaves via needleless 1 mL syringes.
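The dilution to OD600 = 1.0 follows the usual C1V1 = C2V2 relation. The Python sketch below is illustrative only and assumes that the measured OD600 lies in the linear range of the spectrophotometer (in practice a concentrated suspension may need to be read after a known pre-dilution). The function name, the final volume, and the measured OD are hypothetical.

```python
def dilution_volumes(od_measured, final_volume_ml, od_target=1.0):
    """Return (suspension_ml, diluent_ml) needed to reach od_target.

    Uses C1 * V1 = C2 * V2, treating OD600 as proportional to cell
    density. The diluent here would be the inoculation solution.
    """
    if od_measured < od_target:
        raise ValueError("suspension is already below the target OD")
    suspension_ml = final_volume_ml * od_target / od_measured
    return suspension_ml, final_volume_ml - suspension_ml


if __name__ == "__main__":
    # Hypothetical example: resuspended pellet reads OD600 = 4.0 and
    # 10 mL of inoculum at OD600 = 1.0 is wanted.
    suspension, diluent = dilution_volumes(od_measured=4.0, final_volume_ml=10.0)
    print(f"mix {suspension:.2f} mL suspension with {diluent:.2f} mL inoculation solution")
```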
Each inoculated leaf was divided into 4 quadrants, and each quadrant received a different bacterial suspension containing a different vector. The experiment was completely randomized to allow the production of aromatic amino acids by DHS1 WT and DHS1 B4 to be compared. After inoculation, the inoculated area was marked with a black permanent marker, excess bacterial solution on the leaves was gently removed using tissue paper, and the plants were returned to growth chambers. Samples for metabolite analysis were collected two days after inoculation and processed as previously described for quantification of aromatic amino acids.
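One way to produce a randomized quadrant layout of the kind described above is sketched below in Python. This is an assumption about how such a layout could be generated, not a record of how it was done. The quadrant labels and the two placeholder construct names are hypothetical; only DHS1 WT and DHS1 B4 are named in the text.

```python
import random

# Hypothetical quadrant labels for one leaf.
QUADRANTS = ("Q1", "Q2", "Q3", "Q4")


def randomize_quadrants(constructs, n_leaves, seed=None):
    """Assign each of four constructs to a random quadrant on every leaf.

    Returns a list with one dict per leaf, mapping quadrant label to
    construct name. Every construct appears exactly once per leaf.
    """
    if len(constructs) != len(QUADRANTS):
        raise ValueError("exactly four constructs are required")
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    layout = []
    for _ in range(n_leaves):
        order = list(constructs)
        rng.shuffle(order)
        layout.append(dict(zip(QUADRANTS, order)))
    return layout


if __name__ == "__main__":
    # The last two names are placeholders for the remaining vectors.
    constructs = ["DHS1 WT", "DHS1 B4", "construct_3", "construct_4"]
    for i, leaf in enumerate(randomize_quadrants(constructs, n_leaves=3, seed=1), 1):
        print(f"leaf {i}: {leaf}")
```

Keeping every construct on every leaf lets leaf-to-leaf variation be separated from the construct effect, while shuffling the positions avoids confounding a construct with a particular region of the leaf.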
In the following example, we demonstrate that introducing sota mutations into DHS genes from sorghum and poplar also dramatically enhances AAA production in plants. The sota mutations sotaB4 and sotaF1 were introduced into the Sorghum bicolor gene SbDHS (Sobic.007G225700.1.p) and the Populus trichocarpa gene PtDHS (Potri.005G073300.1.p) and expressed in Nicotiana benthamiana leaves via Agrobacterium-mediated transformation.
To generate these mutant genes, DNA sequences encoding SbDHS (SEQ ID NO:20; which was cloned from Sorghum cDNA) and PtDHS (SEQ ID NO:17; which was synthesized) were cloned into E. coli expression vectors. Notably, the portions of these sequences that encode plastid transit peptides were omitted from the cloned sequences to aid in the production of recombinant protein. The sequences were modified to include the sotaB4 and sotaF1 mutations via site-directed mutagenesis. Then, both wild-type and sota mutant versions of the CDS were cloned into the modular cloning (MoClo) vector pAGM1287, wherein each sequence was flanked by the quadruplets AATG and TTCG for future cloning purposes. In this vector, expression of the DHS proteins was driven by a 739-bp sequence containing the promoter and 5′-UTR from the upstream region of the rbcS2 (ribulose bisphosphate carboxylase small subunit, chloroplastic 2) gene from Solanum lycopersicum (SEQ ID NO:123). This regulatory sequence was obtained from the MoClo plasmid pICH71301 and was modified to be flanked by the quadruplets GGAG and CCAT in the vector. Additionally, to allow the DHS proteins to be expressed in the plastids, a 176-bp synthetic DNA fragment encoding the rubisco complex (RbcS) plastid transit peptide (SEQ ID NO:124; obtained from the MoClo plasmid pICH78133) was included in the vector and was modified to be flanked by the quadruplets CCAT and AATG. Two different tags, i.e., hemagglutinin (HA) and TdTomato-HA, were used to monitor protein expression, and the P19 vector was co-transformed to prevent gene silencing. A dipeptide (glycine-serine) linker was included between the C-terminus of the DHS proteins and the HA/TdTomato-HA tags. Additionally, the PtDHS protein contained a 6×-His-Tag at its N-terminus (introduced during sequence synthesis) for purification using Ni2+-affinity chromatography.
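The flanking quadruplets named above determine the order in which the parts can join: the downstream quadruplet of one part must equal the upstream quadruplet of the next. The short Python sketch below checks that ordering for the three parts whose quadruplets are stated in this paragraph. It is an illustrative consistency check only; the quadruplets of the tags and terminator are not given here and are therefore omitted, and the function and variable names are hypothetical.

```python
# (part name, upstream quadruplet, downstream quadruplet), as stated above.
PARTS = [
    ("rbcS2 promoter + 5'-UTR", "GGAG", "CCAT"),
    ("RbcS plastid transit peptide", "CCAT", "AATG"),
    ("DHS CDS (wild type or sota mutant)", "AATG", "TTCG"),
]


def mismatched_junctions(parts):
    """Return the junctions where neighboring quadruplets do not match.

    Each entry is (upstream part, downstream part, quadruplet at the end
    of the upstream part, quadruplet at the start of the downstream part).
    An empty list means the parts can be joined in the given order.
    """
    problems = []
    for (name_a, _, right_a), (name_b, left_b, _) in zip(parts, parts[1:]):
        if right_a != left_b:
            problems.append((name_a, name_b, right_a, left_b))
    return problems


if __name__ == "__main__":
    issues = mismatched_junctions(PARTS)
    if issues:
        for upstream, downstream, right, left in issues:
            print(f"{upstream} ({right}) does not match {downstream} ({left})")
    else:
        print("all stated junctions are compatible")
```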
The sequences of the components used in these expression vectors are outlined in Table 12, and the sequences of the proteins expressed from these vectors are outlined in Table 13.
Sorghum bicolor DHS CDS with sotaB4 mutation
Sorghum bicolor DHS CDS with sotaF1 mutation
Populus trichocarpa DHS CDS with sotaB4
Populus trichocarpa DHS CDS with sotaF1
Sorghum bicolor DHS with sotaB4 mutation
Sorghum bicolor DHS with sotaF1 mutation
Populus trichocarpa DHS with sotaB4 mutation
Populus trichocarpa DHS with sotaF1 mutation
The levels of AAAs produced in Nicotiana benthamiana leaves that expressed the wild-type and sota mutant versions of these DHS proteins were measured via liquid chromatography-mass spectrometry (LC-MS), as described in Materials and Methods. As is shown in
This application claims priority to U.S. Provisional Application No. 63/286,811 filed on Dec. 7, 2021, the contents of which are incorporated by reference in their entireties.
This invention was made with government support under 1818040 awarded by the National Science Foundation. The government has certain rights in this invention.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/US2022/081110 | 12/7/2022 | WO |
| Number | Date | Country | |
|---|---|---|---|
| 63286811 | Dec 2021 | US |