This application includes a sequence listing in XML format titled “960296.04479_ST26.xml”, which is 356,334 bytes in size and was created on Mar. 14, 2024. The sequence listing is electronically submitted with this application via Patent Center and is incorporated herein by reference in its entirety.
Lignin is a complex organic polymer that is used as a structural material to support the tissues of land plants. It comprises up to 30% of plant dry mass and is the most abundant aromatic polymer on earth. Engineering the lignin biosynthesis pathway is a potential way to increase carbon sequestration in plants and to enhance the value of plant biomass for use in the production of bioenergy and biomaterials. Accordingly, there is a need in the art for methods of altering this pathway.
In a first aspect, the present invention provides engineered phenylalanine ammonia-lyase (PAL) enzymes that have increased tyrosine ammonia-lyase (TAL) activity. These engineered PAL enzymes comprise a first mutation at a position corresponding to residue 112 of SEQ ID NO: 28 and a second mutation at a position corresponding to residue 140 of SEQ ID NO: 28 in a wild-type PAL enzyme and have increased TAL activity relative to the wild-type PAL enzyme.
In a second aspect, the present invention provides polynucleotides encoding an engineered PAL enzyme described herein.
In a third aspect, the present invention provides constructs comprising a promoter operably linked to a polynucleotide described herein.
In a fourth aspect, the present invention provides vectors comprising a polynucleotide or construct described herein.
In a fifth aspect, the present invention provides cells comprising an engineered PAL enzyme, polynucleotide, construct, or vector described herein.
In a sixth aspect, the present invention provides seeds comprising an engineered PAL enzyme, polynucleotide, construct, vector, or cell described herein.
In a seventh aspect, the present invention provides plants grown from a seed described herein and plants comprising an engineered PAL enzyme, polynucleotide, construct, vector, or cell described herein.
In an eighth aspect, the present invention provides methods of making the plants described herein.
In a ninth aspect, the present invention provides methods for using the plants described herein to (1) produce a phenylpropanoid-derived product or (2) sequester carbon dioxide. The methods comprise growing the plants. The methods for producing phenylpropanoid-derived products further comprise purifying the phenylpropanoid-derived products produced by the plant.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon request and payment of the necessary fee.
The present invention provides engineered phenylalanine ammonia-lyase (PAL) enzymes comprising one or more mutations that increase the enzymes' tyrosine ammonia-lyase (TAL) activity. Also provided are plants comprising the engineered PAL enzymes and methods of using these plants to sequester CO2 or produce phenylpropanoid-derived products.
Most vascular plants synthesize lignin from the amino acid phenylalanine using the enzyme phenylalanine ammonia-lyase (PAL). However, grass plants possess a bifunctional enzyme, phenylalanine tyrosine ammonia-lyase (PTAL), that allows them to synthesize lignin and other phenylpropanoids using either phenylalanine or tyrosine as a substrate. To better understand how PTAL enzymes evolved in grasses, the inventors identified orthologs of grass PTAL enzymes in other, closely related plants. Biochemical characterization of these orthologs revealed that PTAL enzymes are found not only in grasses but also in the non-grass graminid Joinvillea ascendens, which indicates that PTAL enzymes emerged before the evolution of grasses.
It was previously reported that a particular residue, referred to herein as His/Phe 140, determines whether PAL/PTAL enzymes have TAL activity in bacteria. However, the inventors discovered that both His 140 and an additional residue, Ile 112, are required for TAL activity in plants. They demonstrate that introducing Ile 112 and His 140 into the monofunctional PAL enzymes of J. ascendens and Arabidopsis thaliana converts them into bifunctional PTAL enzymes. Thus, these residues represent novel gene editing targets that can be used to introduce the alternative TAL pathway into plants. Creating genetically engineered plants that can use both phenylalanine and tyrosine to synthesize lignin and phenylpropanoids should increase the carbon flow into these synthesis pathways and increase the amount of carbon sequestered by the plants. Further, it should increase the phenylpropanoid content of the plants, which may increase the value of their plant material, strengthen their disease resistance, and/or improve their nutritional quality.
While others have previously shown that overexpressing PAL enzymes (Phytochemistry, 64: 153-161, 2003) or expressing bacterial TAL enzymes in transgenic plants (Planta, 232: 209-218, 2010) has some effect on the production of phenylpropanoid-derived compounds, the inventors predict that engineering the native PAL enzymes of plants to introduce TAL activity will more effectively increase carbon flow into the phenylpropanoid synthesis pathway as compared to PAL overexpression (i.e., because TAL activity is more efficient than PAL activity, see below) while avoiding the need to introduce a transgene from another organism into the plant.
Land plants produce a diverse array of phenylpropanoid compounds, which include polymers, such as lignin, suberin, and condensed tannin, as well as soluble metabolites, such as flavonoids, coumarin, stilbenes, and phenylpropenes. In most plants, the first step in the phenylpropanoid biosynthetic pathway is the deamination of the amino acid phenylalanine into trans-cinnamic acid (
The PAL and PTAL enzymes of the non-grass graminid Joinvillea ascendens are used as reference sequences herein. These enzymes are referred to as JaPAL (protein sequence: SEQ ID NO: 28, DNA sequence: SEQ ID NO: 147) and JaPTAL (protein sequence: SEQ ID NO: 27, DNA sequence: SEQ ID NO: 151).
“Tyrosine ammonia-lyase (TAL) activity” is enzyme activity that converts the amino acid tyrosine into p-coumaric acid via non-oxidative deamination. PAL enzymes naturally lack or have trace levels of TAL activity, whereas PTAL enzymes naturally possess strong TAL activity. However, in the Examples, the inventors demonstrate that TAL activity can be introduced into or dramatically increased in PAL enzymes via the introduction of mutations at two specific residues. The TAL activity of an engineered PAL enzyme of the present invention may be increased by 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, 15-fold, 16-fold, 17-fold, 18-fold, 19-fold, 20-fold, or more as compared to the TAL activity of the corresponding wild-type PAL enzyme. The TAL activity of an enzyme can be assessed using TAL activity assays, in which the reaction products formed by the enzyme in the presence of the substrate tyrosine are measured. For example, TAL activity can be assessed by measuring the production of the product p-coumaric acid using high-performance liquid chromatography (HPLC) or by measuring absorbance at 309 nm (e.g., using a plate reader). TAL activity can also be assessed by measuring the release of ammonia from the reaction. See Example 1 for a description of such assays.
Thus, in a first aspect, the present invention provides engineered phenylalanine ammonia-lyase (PAL) enzymes that have increased tyrosine ammonia-lyase (TAL) activity. An “enzyme” is a protein or RNA molecule that acts as a catalyst in a living organism. Enzymes decrease the activation energy required for a chemical reaction to occur by stabilizing the transition state.
The engineered PAL enzymes described herein may be full-length proteins or may be fragments of full-length proteins. As used herein, a “fragment” is a portion of a protein that is identical in sequence to, but shorter in length than, the full-length protein. For example, a fragment may comprise at least 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 150, 250, or 500 contiguous amino acid residues of a full-length protein. Fragments may be preferentially selected from certain regions of a protein. A fragment may comprise an N-terminal truncation, a C-terminal truncation, or both an N-terminal and C-terminal truncation relative to the full-length protein. Preferably, the PAL enzyme fragments used with the present invention are functional fragments. As used herein, the term “functional fragment” refers to a fragment that retains at least 20%, 40%, 60%, 80%, or 100% of the PAL/TAL activity of the corresponding full-length protein.
The PAL enzymes described herein are “engineered,” meaning that they have been altered by the hand of man. Specifically, the PAL enzymes of the present invention have been engineered to comprise one or more mutations. As used herein, the term “mutation” refers to a difference in an amino acid sequence relative to a reference sequence (e.g., the sequence of a wild-type PAL enzyme). Mutations include insertions, deletions, and substitutions of an amino acid relative to a reference sequence. An “insertion” refers to a change in an amino acid sequence that results in the addition of one or more amino acid residues. An insertion may add 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, or more amino acid residues to a sequence. A “deletion” refers to a change in an amino acid sequence that results in the removal of one or more amino acid residues. A deletion may remove 1, 2, 3, 4, 5, 10, 20, 50, 100, 200, or more amino acid residues from a sequence. A “substitution” refers to a change in an amino acid sequence in which one amino acid is replaced with a different amino acid. An amino acid substitution may be a conservative replacement (i.e., a replacement with an amino acid that has similar properties) or a radical replacement (i.e., a replacement with an amino acid that has different properties).
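By way of illustration only, the substitution notation used throughout this application (e.g., S112I, meaning that the serine at position 112 is replaced with isoleucine) can be applied to a sequence programmatically. The following Python sketch is not part of the disclosed methods; the function name apply_substitution and the four-residue toy sequence are hypothetical and are used only to show how the notation is read.

```python
import re


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a single substitution written as <wild-type><position><new>,
    e.g. "S112I". Positions are 1-based, as in the notation used herein."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized substitution: {mutation!r}")
    wild_type, position, replacement = (
        match.group(1), int(match.group(2)), match.group(3)
    )
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    # Confirm that the reference residue matches before replacing it.
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + replacement + sequence[position:]


# Toy sequence (hypothetical, not a PAL enzyme): Ser at position 3 -> Ile.
print(apply_substitution("MASF", "S3I"))
```

Checking the wild-type residue before replacing it guards against numbering errors, which is relevant here because the same mutation has different residue numbers in different PAL enzymes.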
The engineered PAL enzymes of the present invention comprise one or more mutations relative to the corresponding wild-type PAL enzyme. The term “wild-type” is used herein to describe the non-mutated version of an enzyme that is most typically found in nature. Wild-type PAL enzymes comprise a serine at the position corresponding to residue 112 of SEQ ID NO: 28 (Ser112) and comprise a phenylalanine at the position corresponding to residue 140 of SEQ ID NO: 28 (Phe 140), whereas wild-type PTAL enzymes comprise an isoleucine at the position corresponding to residue 112 of SEQ ID NO: 28 (Ile112) and comprise a histidine at the position corresponding to residue 140 of SEQ ID NO: 28 (His140) (see, e.g.,
For simplicity, throughout this application, we have arbitrarily used the wild-type PAL enzyme of Joinvillea ascendens (JaPAL; SEQ ID NO: 28) as a reference sequence and have specified the positions of mutations in various PAL/PTAL enzymes using the residue numbering of this enzyme. Any mutation position can be converted to use the residue numbering of another PAL or PTAL enzyme using a sequence alignment, such as the alignment shown in
In Example 1, the inventors demonstrate that introducing the mutation S112I into the PAL enzyme of Joinvillea ascendens (JaPAL; SEQ ID NO: 28) or introducing the corresponding mutation (i.e., S116I) into the PAL enzyme of the distantly related plant Arabidopsis thaliana (AtPAL1; SEQ ID NO: 144) increases the TAL activity of these enzymes (
As is noted above, the inventors have demonstrated that PAL enzymes from multiple, distantly related plants (i.e., Joinvillea ascendens (a monocot) and Arabidopsis thaliana (a dicot)) can be converted into bifunctional PTAL enzymes. PAL enzymes (which are found in bacteria, fungi, and plants) are highly conserved across a wide variety of land plants, as is demonstrated in
In some embodiments, the engineered PAL enzymes comprise a polypeptide or a functional fragment thereof having at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more sequence identity to a polypeptide selected from SEQ ID NO: 28-143. “Percentage of sequence identity” is determined by comparing two optimally aligned sequences over a comparison window. The aligned sequences may comprise additions or deletions (i.e., gaps) relative to each other for optimal alignment. The percentage is calculated by determining the number of matched positions at which an identical nucleic acid base or amino acid residue occurs in both sequences, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100. Protein and nucleic acid sequence identities can be evaluated using the Basic Local Alignment Search Tool (“BLAST”), which is well known in the art (Karlin and Altschul, Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc. Natl. Acad. Sci. USA (1990) 87: 2267-2268; Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl. Acids Res. (1997) 25: 3389-3402). The BLAST programs identify homologous sequences by identifying similar segments between a query sequence and a test sequence, which is preferably obtained from a protein or nucleic acid sequence database. The BLAST programs can be used with the default parameters or with modified parameters provided by the user.
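As a non-limiting illustration, the percentage calculation described above can be sketched in Python as follows. The sketch assumes that the two sequences have already been optimally aligned (e.g., by BLAST or another alignment tool) and are supplied as equal-length strings in which gaps are written as “-”. The function name percent_identity and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences over a comparison window.

    Matched positions are divided by the total number of positions in the
    window (gap positions included) and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a, aligned_b)
        if a == b and a != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment (hypothetical): 4 identical positions in a window of 6.
print(round(percent_identity("MKT-LV", "MKTALI"), 1))
```

Alignment tools differ in how they treat gapped positions and terminal overhangs when reporting identity, so values computed this way may differ slightly from those reported by a particular program.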
Regardless of their origin, the engineered PAL enzymes of the present invention comprise a mutation at a position corresponding to residue 112 of JaPAL (SEQ ID NO: 28) and optionally further comprise a second mutation at a position corresponding to residue 140 of JaPAL. As used herein, the phrase “at a position corresponding to” refers to an amino acid position that aligns with an amino acid position in another protein in a protein sequence alignment or a protein structure alignment. For example, the phrase “a position corresponding to residue 112 of SEQ ID NO: 28” refers to an amino acid position in the sequence of protein X that aligns with the 112th amino acid residue of SEQ ID NO: 28 when the sequence of protein X is aligned with SEQ ID NO: 28. To determine whether a particular protein sequence has a mutation at a position “corresponding to” a position disclosed herein, one may align that particular protein sequence with SEQ ID NO: 28 using a conventional sequence alignment method (see, e.g., Bioinformatics (2007) 23(7): 802-8) and examine the alignment at the appropriate position.
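For illustration only, examining an alignment at the appropriate position can be sketched in Python as shown below. The sketch assumes a pairwise alignment that has already been computed and is supplied as two equal-length gapped strings; the function name corresponding_position and the toy sequences are hypothetical and do not represent any sequence in the sequence listing.

```python
from typing import Optional


def corresponding_position(
    aligned_ref: str, aligned_query: str, ref_pos: int
) -> Optional[int]:
    """Return the 1-based query position that aligns with the 1-based
    reference position ref_pos, or None if the query has a gap there."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    ref_count = 0
    query_count = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        # Count only real residues; gap characters do not advance numbering.
        if ref_char != "-":
            ref_count += 1
        if query_char != "-":
            query_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            return query_count if query_char != "-" else None
    raise ValueError("ref_pos is outside the reference sequence")


# Toy alignment (hypothetical): the query carries a two-residue insertion,
# so reference residue 4 aligns with query residue 6.
print(corresponding_position("MK--TSL", "MKAATIL", 4))
```

An insertion or deletion upstream of the position of interest shifts the residue numbering in this way, which is why positions are specified herein relative to a single reference sequence.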
In some embodiments, the engineered PAL enzyme comprises a serine to isoleucine mutation at a position corresponding to residue 112 of SEQ ID NO: 28 (e.g., a S112I mutation). However, in Example 3, the inventors demonstrate that several different substitutions at position 112 retain the TAL activity of the JaPALF140H_S112I double mutant. Specifically, they show that substituting the Ile at this position with a valine or threonine retains strong TAL activity but substituting it with a serine does not (
In Example 1, the inventors generated a JaPAL enzyme, referred to as JaPALF140H_MUT8, that has a PTAL-type substitution at residue 140 and at eight additional residues that are highly conserved within both PAL and PTAL enzymes but are distinct between these two groups (i.e., residues 102, 112, 121, 138, 267, 444, 448, and 500). Kinetic assays showed that the catalytic properties of TAL activity (especially tyrosine substrate affinity (Km)) of JaPALF140H_MUT8 were significantly improved compared to those of wild-type JaPAL and were comparable with those of wild-type JaPTAL (
In a second aspect, the present invention provides polynucleotides encoding an engineered PAL enzyme described herein. The terms “polynucleotide,” “oligonucleotide,” and “nucleic acid” are used interchangeably to refer to a polymer of DNA or RNA. A polynucleotide may be single-stranded or double-stranded and may represent the sense or the antisense strand. A polynucleotide may be synthesized or obtained from a natural source. A polynucleotide may contain natural, non-natural, or altered nucleotides, as well as natural, non-natural, or altered internucleotide linkages (e.g., phosphoroamidate linkages, phosphorothioate linkages). The term polynucleotide encompasses constructs, vectors, plasmids, and the like. In some embodiments, the polynucleotide is complementary DNA (cDNA; i.e., synthetic DNA that has been reverse transcribed from a messenger RNA) or genomic DNA (i.e., chromosomal DNA from an organism). Those of skill in the art understand that, due to degeneracy of the genetic code, a variety of polynucleotides can encode the same polypeptide.
While the polynucleotide sequences disclosed herein are derived from sequences found in plants, any polynucleotide sequence that encodes the desired engineered PAL enzyme may be used with the present invention. For example, in some embodiments, the polynucleotides are codon-optimized for expression in a particular cell (e.g., a plant cell, bacterial cell, or fungal cell). “Codon optimization” is a process used to increase expression of a polynucleotide in a particular host cell by altering the sequence of the polynucleotide to accommodate the codon bias of the host cell. Computer programs for generating codon-optimized sequences for use in a particular host cell are known in the art.
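As a simplified, non-limiting illustration of codon optimization, the Python sketch below back-translates a protein sequence by choosing one preferred codon per amino acid. The table PREFERRED_CODONS is a placeholder chosen for illustration and is not measured codon-usage data for any host; in practice, the preferred codons would be taken from a codon-usage table for the intended host cell. The function name back_translate is likewise hypothetical.

```python
# Placeholder table: one valid codon per amino acid (standard genetic code).
# These choices are illustrative assumptions, not host-specific usage data.
PREFERRED_CODONS = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",  # stop codon
}


def back_translate(protein: str, preferred: dict = PREFERRED_CODONS) -> str:
    """Encode a protein sequence using one preferred codon per residue."""
    try:
        return "".join(preferred[residue] for residue in protein)
    except KeyError as err:
        raise ValueError(f"no codon defined for residue {err.args[0]!r}") from None


# Toy peptide (hypothetical): Met-Ser-Ile followed by a stop codon.
print(back_translate("MSI*"))
```

Because the encoded polypeptide is unchanged, a sequence produced in this way encodes the same engineered PAL enzyme as the plant-derived sequence. Dedicated codon-optimization programs typically weigh additional factors that this sketch ignores.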
In a third aspect, the present invention provides constructs comprising a promoter operably linked to one of the polynucleotides described herein. As used herein, the term “construct” refers to a recombinant polynucleotide, i.e., a polynucleotide that was formed by combining at least two polynucleotide components from different sources, natural or synthetic. For example, a construct may comprise the coding region of one gene operably linked to a promoter that is (1) associated with another gene found within the same genome, (2) from the genome of a different species, or (3) synthetic. Constructs can be generated using conventional recombinant DNA methods.
As used herein, the term “promoter” refers to a DNA sequence that defines where transcription of a polynucleotide begins. RNA polymerase and the necessary transcription factors bind to the promoter to initiate transcription. Promoters are typically located directly upstream (i.e., at the 5′ end) of the transcription start site. However, a promoter may also be located at the 3′ end, within a coding region, or within an intron of a gene that it regulates. Promoters may be derived in their entirety from a native or heterologous gene, may be composed of elements derived from multiple regulatory sequences found in nature, or may comprise synthetic DNA. A promoter is “operably linked” to a polynucleotide if the promoter is positioned such that it can affect transcription of the polynucleotide.
The promoter used in the constructs described herein may be a heterologous promoter (i.e., a promoter that is not naturally associated with the wild-type PAL enzyme), an endogenous promoter (i.e., a promoter that is naturally associated with the wild-type PAL enzyme), or a synthetic promoter that is designed to function in a desired manner in a particular host cell. Suitable promoters for use with the present invention include, but are not limited to, constitutive, inducible, temporally regulated, developmentally regulated, chemically regulated, tissue-preferred, and tissue-specific promoters. In some cases, it may be advantageous to use a tissue-specific promoter or a developmental stage-specific promoter to ensure that the construct will drive expression of the engineered enzyme in a particular tissue (e.g., roots, leaves) or during a particular developmental stage (e.g., leaf maturation, seed development, senescence).
In some embodiments, the promoter is a plant promoter, i.e., a promoter that is active in plant cells. Suitable plant promoters include, without limitation, the 35S promoter of the cauliflower mosaic virus, ubiquitin, the tCUP cryptic constitutive promoter, the Rsyn7 promoter, the maize In2-2 promoter, and the tobacco PR-1a promoter.
In a fourth aspect, the present invention provides vectors comprising one of the polynucleotides or constructs described herein. The term “vector” refers to a DNA molecule that is used to carry a particular DNA segment (i.e., a DNA segment included in the vector) into a host cell. Some vectors are capable of autonomous replication in a host cell (e.g., bacterial vectors that include an origin of replication and episomal mammalian vectors). Other vectors can be integrated into the genome of a host cell such that they are replicated along with the host genome (e.g., viral vectors and transposons). Vectors may include heterologous genetic elements that are necessary for propagation of the vector or for expression of an encoded gene product. Vectors may also include a reporter gene or a selectable marker gene. Suitable vectors include plasmids (i.e., circular double-stranded DNA molecules) and viral vectors.
In a fifth aspect, the present invention provides cells comprising one of the engineered enzymes, polynucleotides, constructs, or vectors described herein. The cells may be eukaryotic or prokaryotic. Preferably, the cell is a type of cell that can be used for large-scale production of phenylpropanoid-derived compounds or for carbon dioxide sequestration. In some embodiments, the cell is a plant cell, a bacterial cell, a fungal cell, or a protist cell.
In a sixth aspect, the present invention provides seeds comprising one of the engineered enzymes, polynucleotides, constructs, vectors, or cells described herein. A “seed” is an embryonic plant enclosed in a protective outer covering. In embodiments in which the seed comprises a nucleic acid (i.e., a polynucleotide, construct, or vector) described herein, the nucleic acid may either be integrated into the genome of the seed or exist independently from the genome.
In a seventh aspect, the present invention provides plants grown from the seeds described herein and plants comprising one of the engineered PAL enzymes, polynucleotides, constructs, vectors, or cells described herein.
As used herein, the term “plant” includes both whole plants and plant parts. Examples of plant parts include, without limitation, embryos, pollen, ovules, flowers, glumes, panicles, roots, root tips, anthers, pistils, leaves, stems, seeds, pods, calli, clumps, cells, protoplasts, germplasm, asexual propagates, and tissue cultures. This term also includes chimeric plants in which only a subset of the plant's cells comprises the engineered PAL enzyme, polynucleotide, construct, or vector.
The inventors predict that engineering the native PAL enzymes of plants to introduce TAL activity will increase carbon flow into lignin/phenylpropanoid synthesis pathways. Thus, the inventors predict that the plants described herein will: (a) produce a greater quantity of lignin as compared to a control plant; (b) produce a greater quantity of phenylpropanoid-derived compounds as compared to a control plant; and/or (c) sequester a greater quantity of carbon dioxide (CO2) into aromatic compounds as compared to a control plant.
Examples of phenylpropanoid compounds and derivatives thereof that could be produced in higher quantities by the plants of the present invention include flavonoids, anthocyanins, lignins, phenolic acids, stilbenes, coumarins, tannins, suberin, cutins, sporopollenin, lignans, and phenylpropenes. These compounds may be useful, for example, for making dyes, colorants, nutraceuticals, pharmaceuticals, and industrial materials. Lignin-derived aromatic monomers can be obtained from plants using microbial (Curr Opin Biotechnol 56: 179-186, 2019) or chemical (Angew Chem Int Ed 55: 8164-8215, 2016) lignin degradation methods.
“Carbon sequestration” is a process in which atmospheric CO2 is captured and stored. It is one method for reducing the amount of CO2 in the atmosphere (i.e., to reduce global climate change). In some embodiments, the methods further comprise harvesting part of the plant while leaving the roots of the plant in the soil such that the carbon contained in the roots is sequestered therein. Harvestable parts of plants include, without limitation, flowers, pollen, seedlings, tubers, leaves, stems, fruit, seeds, roots, cuttings, and the like.
As used herein, the term “control plant” refers to a comparable plant (e.g., of the same species, cultivar, and age) that was raised under the same or comparable conditions (e.g., water, sunlight, nutrients) but that does not express an engineered PAL enzyme described herein.
In some embodiments, the plant produces a greater quantity of lignin and/or phenylpropanoid-derived products or produces these products at a greater rate as compared to a control plant. Suitably, the plant produces at least 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, 15-fold, 16-fold, 17-fold, 18-fold, 19-fold, or 20-fold more lignin and/or phenylpropanoid-derived products as compared to the control plant. The amount of lignin produced by a plant may be measured using the thioglycolic acid method (J Agric Food Chem 60(4): 922-8, 2012), which is a standard method for estimating the total lignin content in plant biomass. The amount of a phenylpropanoid-derived product produced by a plant may be measured using liquid chromatography-mass spectrometry (LC-MS).
In some embodiments, the plant sequesters a greater quantity of CO2 or sequesters CO2 at a greater rate as compared to a control plant. Suitably, the CO2 sequestration of the plant is at least 2%, 5%, 10%, 20%, 30%, 40%, 50%, or 60% greater than that of a control plant. CO2 sequestration may be quantified by measuring the gas exchange activity of the plant. For example, CO2 assimilation may be measured using an LI-6400XT photosynthesis system equipped with the 6400-40 leaf chamber (LI-COR). Alternatively, labeled 13CO2 can be fed to plants and the rate of 13C incorporation into plants can be measured over time.
The plants of the present invention may be of any species. In some embodiments, the plant is a land plant that comprises a native PAL enzyme. PAL enzymes are expressed broadly in plants. In some embodiments, the plant is selected from Acorus americanus, Amborella trichopoda, Ananas comosus, Apostasia shenzhenica, Asparagus officinalis, Brachypodium distachyon, Calamus simplicifolius, Dendrobium catenatum, Ecdeiocolea monostachya, Elaeis guineensis, Flagellaria indica, Joinvillea ascendens, Musa acuminata, Oryza sativa, Panicum hallii, Panicum virgatum, Phalaenopsis equestris, Setaria italica, Setaria viridis, Sorghum bicolor, Spirodela polyrhiza, Streptochaeta angustifolia, Zea mays, and Zostera marina. Protein sequences of PAL enzymes found in these plants are provided as SEQ ID NO: 28-143, and these sequences are aligned in
In some embodiments, the engineered PAL enzyme is encoded by the genome of the plant. In some embodiments, the plant is a plant that naturally expresses a PAL enzyme, and the gene encoding the native PAL enzyme was modified via gene editing to encode a mutation at a position corresponding to residue 112 of SEQ ID NO: 28. In other embodiments, a polynucleotide encoding an engineered version of a PAL enzyme that is not natively expressed by the plant is introduced into the genome of the plant. In other embodiments, the plant comprises a polynucleotide encoding an engineered PAL enzyme that exists independently of the genome. Methods of genetically engineering plants using recombinant biology or gene editing, such as CRISPR/Cas based gene editing, are known to those of skill in the art.
In some embodiments, the plants further comprise additional mutations that affect how they absorb and utilize atmospheric carbon. The inventors have previously identified mutations in Arabidopsis thaliana that deregulate the first step of the shikimate pathway, i.e., a pathway that connects central carbon metabolism to the pathway for aromatic amino acid biosynthesis in plants. See Yokoyama et al., Science Advances 8(23): eabo3416 (2022), which is hereby incorporated by reference in its entirety. These mutations map to genomic loci that encode the three Arabidopsis isoforms of the enzyme 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase (DHS), which catalyzes the first reaction of the shikimate pathway. The inventors discovered that these mutations reduce inhibition by tyrosine/tryptophan-associated compounds and that plants that express DHS enzymes comprising these mutations produce greater quantities of aromatic amino acids and assimilate greater quantities of CO2. Thus, in some embodiments, the plants of the present invention further comprise an engineered DHS enzyme that comprises one or more of these mutations, i.e., one or more mutations at a position corresponding to residue 109, 114, 159, 240, 244, 245, 247, 248, 319, 322, or 348 of the Arabidopsis thaliana DHS1 enzyme (SEQ ID NO: 152). Plants that further comprise such engineered DHS enzymes (i.e., in addition to an engineered PAL enzyme) are expected to produce even higher levels of phenylpropanoids.
Additionally, the inventors have previously identified an active site residue (i.e., residue 220 of the Medicago truncatula PDH enzyme) that determines the substrate specificity (i.e., for prephenate or arogenate) and level of tyrosine feedback inhibition of TyrA family enzymes, which are the key regulatory enzymes of tyrosine biosynthesis. See U.S. Pat. No. 11,136,559, which is hereby incorporated by reference in its entirety. Mutations at this residue may be used to enhance the production of tyrosine and tyrosine-derived products in plants. Thus, in some embodiments, the plants of the present invention further comprise an engineered TyrA enzyme. In some embodiments, the engineered TyrA enzyme is an engineered arogenate dehydrogenase (ADH) enzyme comprising a non-acidic amino acid residue at a position corresponding to residue 220 of the Medicago truncatula ADH enzyme (e.g., SEQ ID NO: 153, which comprises a D220C mutation). These engineered ADH enzymes have increased prephenate dehydrogenase (PDH) activity and relaxed tyrosine sensitivity as compared to the corresponding wild-type ADH enzyme. In other embodiments, the engineered TyrA enzyme is an engineered PDH enzyme comprising an aspartic acid or glutamic acid at a position corresponding to residue 220 of the Medicago truncatula PDH enzyme (e.g., SEQ ID NO: 154, which comprises a C220D mutation). These engineered PDH enzymes have increased ADH activity and increased tyrosine sensitivity as compared to the corresponding wild-type PDH enzyme. Plants that further comprise such engineered TyrA enzymes (i.e., in addition to an engineered PAL enzyme) are expected to produce even higher levels of phenylpropanoids.
In an eighth aspect, the present invention provides methods of making the plants described herein. In some embodiments, the methods comprise introducing one of the engineered PAL enzymes, polynucleotides, constructs, or vectors described herein into the plant. As used herein, “introducing” describes a process by which exogenous polypeptides or polynucleotides are introduced into a recipient cell. Suitable introduction methods include, without limitation, Agrobacterium-mediated transformation, the floral dip method, bacteriophage or viral infection, electroporation, heat shock, lipofection, microinjection, and particle bombardment.
In other embodiments, the plant comprises a native gene encoding a PAL enzyme, and the methods comprise editing the native gene to encode an engineered PAL enzyme described herein. “Gene editing” describes a process by which mutations (i.e., deletions, insertions, and substitutions) are introduced into a native gene within an organism's genome. Gene editing can be performed using several different nucleases, including zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALENs), and CRISPR/Cas endonucleases. Site-directed mutagenesis (e.g., homologous recombination) may also be used to edit a gene.
In specific embodiments, the methods comprise using an RNA-guided endonuclease (e.g., Cas9) to edit the native gene to have a mutation at a position corresponding to residue 112 of SEQ ID NO: 28. This can be accomplished by using the endonuclease to specifically edit the codon of the gene encoding the residue corresponding to residue 112 of SEQ ID NO: 28. In some embodiments, the methods further comprise using the endonuclease to edit the native gene to have a mutation at a position corresponding to residue 140 of SEQ ID NO: 28.
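As a minimal illustration of what editing a single codon means at the sequence level, the following Python sketch replaces the codon that encodes one residue of a coding sequence. The toy coding sequence, the helper names `translate` and `edit_codon`, and the particular codons chosen for isoleucine (ATC) and histidine (CAC) are assumptions made for illustration only; they are not taken from the sequence listing, and the sketch does not model guide RNA design, repair templates, or any other aspect of nuclease-based editing.

```python
# Partial codon table: only the standard-code codons used in this toy example.
CODON_TO_AA = {
    "ATG": "M", "GCT": "A", "TCT": "S", "ATC": "I",
    "TTC": "F", "CAC": "H", "TAA": "*",
}

def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))

def edit_codon(cds, residue, expected_aa, new_codon):
    """Replace the codon for a 1-based residue number.

    The current codon is checked against expected_aa first, so that an
    off-by-one position is caught instead of silently edited.
    """
    start = 3 * (residue - 1)
    old_codon = cds[start:start + 3]
    if CODON_TO_AA.get(old_codon) != expected_aa:
        raise ValueError(f"residue {residue} is not {expected_aa}")
    return cds[:start] + new_codon + cds[start + 3:]

# Hypothetical toy gene encoding M-A-S-A-F-stop (not a real PAL sequence).
toy_cds = "ATGGCTTCTGCTTTCTAA"

# Ser -> Ile at toy residue 3, then Phe -> His at toy residue 5.
edited = edit_codon(toy_cds, 3, "S", "ATC")
edited = edit_codon(edited, 5, "F", "CAC")

print(translate(toy_cds))
print(translate(edited))
```

In a real gene, the residue numbers passed to such a helper would be the positions that correspond to residues 112 and 140 of SEQ ID NO: 28 in that gene's own numbering.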
In a ninth aspect, the present invention provides methods for using the plants described herein to (1) produce a phenylpropanoid-derived product or (2) sequester CO2. The methods comprise growing the plants described herein or plants genetically engineered to produce the engineered PAL enzymes described herein. The methods for producing phenylpropanoid-derived products further comprise purifying the phenylpropanoid-derived products produced by the plant.
The present disclosure is not limited to the specific details of construction, arrangement of components, or method steps set forth herein. The compositions and methods disclosed herein are capable of being made, practiced, used, carried out and/or formed in various ways that will be apparent to one of skill in the art in light of the disclosure that follows. The phraseology and terminology used herein is for the purpose of description only and should not be regarded as limiting to the scope of the claims. Ordinal indicators, such as first, second, and third, as used in the description and the claims to refer to various structures or method steps, are not meant to be construed to indicate any specific structures or steps, or any particular order or configuration to such structures or steps. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to facilitate the disclosure and does not imply any limitation on the scope of the disclosure unless otherwise claimed. No language in the specification, and no structures shown in the drawings, should be construed as indicating that any non-claimed element is essential to the practice of the disclosed subject matter. The use herein of the terms “including,” “comprising,” or “having,” and variations thereof, is meant to encompass the elements listed thereafter and equivalents thereof, as well as additional elements. Embodiments recited as “including,” “comprising,” or “having” certain elements are also contemplated as “consisting essentially of” and “consisting of” those certain elements.
Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. For example, if a concentration range is stated as 1% to 50%, it is intended that values such as 2% to 40%, 10% to 30%, or 1% to 3%, etc., are expressly enumerated in this specification. These are only examples of what is specifically intended, and all possible combinations of numerical values between and including the lowest value and the highest value enumerated are to be considered to be expressly stated in this disclosure. Use of the word “about” to describe a particular recited amount or range of amounts is meant to indicate that values very near to the recited amount are included in that amount, such as values that could or naturally would be accounted for due to manufacturing tolerances, instrument and human error in forming measurements, and the like. All percentages referring to amounts are by weight unless indicated otherwise.
No admission is made that any reference, including any non-patent or patent document cited in this specification, constitutes prior art. In particular, it will be understood that, unless otherwise stated, reference to any document herein does not constitute an admission that any of these documents forms part of the common general knowledge in the art in the United States or in any other country. Any discussion of the references states what their authors assert, and the applicant reserves the right to challenge the accuracy and pertinence of any of the documents cited herein. All references cited herein are fully incorporated by reference, unless explicitly indicated otherwise. The present disclosure shall control in the event there are any disparities between any definitions and/or description found in the cited references.
The following examples are meant only to be illustrative and are not meant as limitations on the scope of the invention or of the appended claims.
In the following example, the inventors describe their discovery of a novel mutation that is necessary to convert monofunctional phenylalanine ammonia-lyase (PAL) enzymes into bifunctional phenylalanine tyrosine ammonia-lyase (PTAL) enzymes.
Acquisition of the ability to synthesize lignin was one of the most important events that allowed vascular plants to migrate from water to land and adapt to the harsh environment. Lignin is essential in land plants for providing mechanical strength, facilitating water transportation, and strengthening the physical barrier against biotic and abiotic stresses. In addition to cellulose and hemicelluloses, lignin is one of the major components of plant secondary cell walls, and up to 30% of photosynthetically fixed carbon is utilized to produce lignin. Lignin hinders the efficient use of cell wall polysaccharides as a source of pulp, paper, and bioethanol. However, lignin is the only abundant, renewable feedstock that comprises aromatics. Thus, it has potential for use in the production of sustainable, value-added aromatic materials and high-energy-density solid fuels.
The monocot grass plant group is one of the most widely distributed plant groups on earth and contains 780 genera and about 12,000 species. These plants succeeded in expanding their habitat from forest to harsh open land by developing a series of morphological, physiological, and biochemical features. This plant group contains a substantial number of economically important crops. For example, grass cereal crops (e.g., rice, wheat, and corn) comprise a major portion of most people's diets, and grass straws are used as livestock feeds. This plant group also contains several crops with superior biomass productivity (e.g., switchgrass, sorghum, and Miscanthus) that have potential for use in the production of plant-based energy and materials. Grasses are classified as Poales, a large order of flowering, monocotyledonous plants that contains around 21,000 species of great diversity that evolved within a relatively short evolutionary timescale (Givnish et al., 2010; McKain et al., 2016) (
Although lignin is an indispensable component of vascular plants, the biosynthesis and structure of lignin differ not only among plant species but also across the organ and cell types of individual plants (Renault et al., 2019; Vanholme et al., 2019). In all vascular plants, lignin is composed of the monomeric units guaiacyl (G), syringyl unit (S), and p-hydroxyphenyl (H), which are produced via polymerization of coniferyl alcohol, sinapyl alcohol, and p-hydroxyphenyl alcohol, respectively. In addition to these three monomers, grass lignin uniquely incorporates γ-acylated (p-coumarylated and feruloylated) monomers and flavone tricin (
TAL activity has been detected in plant extracts of a wide range of grass species, including species classified in both the BOP and PACMAD clades (
The residue His 140, which is located in the substrate binding pocket of TAL enzymes, was previously proposed to be a key residue for the acquisition of TAL activity (Dixon and Barros, 2019). This residue was shown to be critical for recognition of the substrate tyrosine based on the crystal structure of the bacterial TAL enzyme (Watts et al. 2006). PAL enzymes have a highly conserved Phe 140 at this position (Louie et al. 2006; Watts et al. 2006). When a His 140 to Phe (H140F) mutation was introduced into the bacterial TAL enzyme, the TAL enzyme (which previously had a high substrate specificity for L-Tyr) was essentially converted into a PAL enzyme with a high specificity for L-Phe (Watts et al. 2006). However, in previous studies, introducing a Phe 140 to His (F140H) mutation into the Arabidopsis PAL enzyme failed to convert it into a bifunctional PTAL enzyme (Watts et al. 2006). Further, introducing a H140F mutation into the Sorghum bicolor PTAL enzyme produced an enzyme with kinetic properties that were noticeably different from other S. bicolor PAL enzymes (Jun et al., 2018). Thus, in addition to His140, other unidentified residue(s) are thought to be necessary for the acquisition of TAL activity (Barros and Dixon, 2020).
To elucidate the evolutionary history of the emergence of the PTAL enzyme in Poales, we obtained PAL/PTAL homolog sequences from 45 monocot species, including basal-grasses and non-grass graminids, whose genomes were sequenced only recently. We found that PAL orthologs from non-grass graminids nested directly into the grass PTAL clade and were distinct from the PAL clade. Biochemical characterization of recombinant PAL/PTAL homologs demonstrated that PTAL enzymes emerged in the common ancestor of the non-grass graminid Joinvillea ascendens and grasses, just before the emergence of grasses. A combined approach using phylogeny-guided sequence comparison and site-directed mutagenesis identified an additional mutation, Ser112 to Ile (S112I), that is essential for the transition from a monofunctional PAL enzyme to a bifunctional PTAL enzyme. We found that introduction of S112I and F140H mutations into PAL enzymes from J. ascendens and Arabidopsis thaliana conferred significant TAL activity to these enzymes.
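Because the same mutation carries a different residue number in each enzyme, a position "corresponding to" a reference residue is read off a sequence alignment. The Python sketch below shows one way this lookup could be done from the two gapped rows of a pairwise alignment. The function name `corresponding_position` and the short toy sequences are hypothetical and are not derived from the sequence listing.

```python
def corresponding_position(ref_aln, tgt_aln, ref_pos):
    """Map a 1-based residue number in the reference to the target.

    ref_aln and tgt_aln are the two rows of a pairwise alignment, with
    '-' marking gaps. Returns the 1-based residue number in the target,
    or None if the reference residue is aligned to a gap.
    """
    if len(ref_aln) != len(tgt_aln):
        raise ValueError("aligned rows must have equal length")
    ref_count = 0
    tgt_count = 0
    for ref_char, tgt_char in zip(ref_aln, tgt_aln):
        if ref_char != "-":
            ref_count += 1
        if tgt_char != "-":
            tgt_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            return tgt_count if tgt_char != "-" else None
    raise ValueError("ref_pos is beyond the end of the reference sequence")

# Toy alignment: the target carries a two-residue insertion after position 2,
# so every later reference position is shifted by two in the target.
ref_row = "MK--SLF"
tgt_row = "MKAASLF"

print(corresponding_position(ref_row, tgt_row, 3))  # reference Ser
print(corresponding_position(ref_row, tgt_row, 5))  # reference Phe
```

The same kind of offset is why one pair of equivalent substitutions can be written with different residue numbers for enzymes from different species.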
To determine when PTAL enzymes emerged in grasses, we obtained the genome sequences of 44 species of green plants, identified their PAL family enzymes using the PTAL orthogroup from OrthoFinder (Table 1), and generated a large-scale phylogenetic tree of plant PAL and PTAL enzymes. The angiosperm PAL family was divided into two distinct clades: clades I and II. Clade I includes well-characterized angiosperm PAL enzymes (e.g., from Arabidopsis thaliana, Cochrane et al., 2004) and both PAL and PTAL enzymes from grasses, such as Zea mays (Rosler et al., 1997), Sorghum bicolor (Jun et al., 2018), and Brachypodium distachyon (Barros et al., 2016) (
The residue His 140, which is located in the substrate binding pocket of TAL enzymes, was previously shown to be critical for the recognition of the substrate tyrosine based on the crystal structure of the bacterial enzyme (Watts et al. 2006). In contrast, PAL enzymes have highly conserved Phe 140 at this position (Louie et al. 2006, Watts et al. 2006). When the His residue of a bacterial TAL enzyme was mutated to Phe, the TAL enzyme was essentially converted to a PAL enzyme (Watts et al. 2006). To predict the functionality of the PAL/PTAL orthologs from S. angustifolia, J. ascendens, and E. monostachya (which are labeled in
To further examine the TAL activities of these PAL (i.e., JaPAL, EmoPAL, BdPAL, and SbPAL) and PTAL (i.e., SbPTAL, BdPTAL, SaPTAL-a, SaPTAL-b, EmoPTAL, and JaPTAL) enzymes, we determined the kinetic parameters of reactions using various concentrations of the substrate Tyr (
Additional Amino Acids are Involved in the Transition from PAL to PTAL
To experimentally test the role of His 140 in the acquisition of TAL activity, we next conducted site-directed mutagenesis on the PAL and PTAL enzymes of grasses and non-grass graminids characterized above and analyzed their effects on TAL activity. For the PAL enzymes, the residue corresponding to Phe 140 was converted to His to generate JaPALF140H, EmoPALF134H, BdPALF137H, and SbPALF135H. A detailed kinetic analysis showed that, compared to the corresponding wild-type PAL enzymes, all these mutants exhibited increased overall TAL activity (kcat/Km; 9.7-fold on average) with significantly reduced Km values for Tyr (0.04-fold on average) (Table 3). For the PTAL enzymes, the residue corresponding to His 140 was converted to Phe to generate SbPTALH123F, BdPTALH123F, SaPTAL-aH118F, SaPTAL-bH126F, EmoPTALH127F, and JaPTALH125F. Compared to the corresponding wild-type PTAL enzymes, all these mutants exhibited decreased TAL activity (0.01-fold on average) and significantly increased Km for Tyr (13.2-fold on average) (
Introduction of Eight Additional Mutations Besides F140H Converts PAL into PTAL
To identify the additional residues critical for the transition of PAL to PTAL in this plant lineage, we conducted a phylogeny-guided sequence comparison (Maeda, 2019) utilizing the phylogenetic distribution of the functional PAL and PTAL enzymes (
To investigate the potential role of these residues in TAL activity, we generated two JaPAL mutant enzymes, one with PTAL-type substitutions in the 8 circle residues and the other with PTAL-type substitutions in both the circle and triangle residues (Table 4) in addition to the F140H mutation (JaPALF140H_MUT8 and JaPALF140H_MUT16, respectively). Kinetic assays showed that the apparent Km value of JaPALF140H_MUT8 (17.9 μM) was significantly improved compared to that of the JaPALF140H single mutant (222.7 μM) and closely approached that of wild-type JaPTAL with similar kcat values (
To determine which of the 8 circle residues are essential in the conversion of PAL enzymes to PTAL enzymes (
We generated homology model structures of JaPAL and JaPTAL proteins using the parsley PAL and sorghum PTAL enzymes, respectively, as templates (
Introduction of F140H and S112I is Sufficient to Change PAL into PTAL
To test this hypothesis further, the reciprocal S112I mutation was introduced into the JaPALF140H single mutant to generate the JaPALF140H_S112I double mutant. For comparison, a single mutant in which the residue corresponding to Ser112 was converted to Ile (i.e., JaPALS112I) was generated as well. While kcat was not drastically affected by these mutations, the Km of the JaPALF140H_S112I mutant for TAL activity (17.5 μM) became significantly lower than those of wild-type JaPAL (4859 μM) and the single mutants JaPALF140H and JaPALS112I (223 μM and 354 μM, respectively) and reached the level of wild-type JaPTAL (
To test whether two amino acid substitutions equivalent to F140H and S112I can also confer TAL activity in distantly related PAL enzymes, we introduced these mutations into a recombinant Arabidopsis PAL1 enzyme that has higher PAL activity and weak TAL activity (Cochrane et al., 2004; Watts et al., 2006) (Table 3). AtPAL1F144H_S116I showed a drastic reduction in its Km towards Tyr (20.2 μM) as compared to that of wild-type AtPAL1 (3070 μM) and its single mutants (AtPAL1F144H and AtPAL1S116I) (314 μM and 515 μM, respectively) (
The protein sequences of the JaPAL and AtPAL1 enzymes tested in this example are outlined in Table 6, and the DNA sequences of the JaPAL and AtPAL1 enzymes tested in this example are outlined in Table 7.
Amborella trichopoda
Physcomitrella patens
Sphagnum fallax
Selaginella moellendorffii
Marchantia polymorpha
Azolla filiculoides
Salvinia cucullata
Daucus carota
Solanum lycopersicum
Mimulus guttatus
Solanum tuberosum
Amaranthus hypochondriacus
Aquilegia coerulea
Arabidopsis thaliana
Brassica oleracea capitata
Brassica rapa
Cucumis sativus
Eucalyptus grandis
Fragaria vesca
Gossypium raimondii
Medicago truncatula
Populus trichocarpa
Phaseolus vulgaris
Ricinus communis
Theobroma cacao
Vitis vinifera
Kalanchoe fedtschenkoi
Chlamydomonas reinhardtii
Picea abies
Acorus americanus
Spirodela polyrhiza
Zostera marina
Joinvillea ascendens
Musa acuminata
Calamus simplicifolius
Elaeis guineensis
Brachypodium distachyon
Oryza sativa
Panicum virgatum
Setaria italica
Streptochaeta angustifolia
Setaria viridis
Zea mays
Ananas comosus
Amborella trichopoda
Acorus americanus
Zostera marina
Spirodela polyrhiza
Xerophyta viscosa
Asparagus officinalis
Apostasia shenzhenica
Phalaenopsis equestris
Dendrobium catenatum
Allium sativum
Dioscorea rotundata
Musa acuminata
Calamus simplicifolius
Elaeis guineensis
Cocos nucifera
Phoenix dactylifera
Carex littledalei
Ananas comosus
Joinvillea ascendens
Ecdeiocolea monostachya
Streptochaeta angustifolia
Pharus latifolius
Oropetium thomaeum
Sorghum bicolor
Zea mays
Setaria viridis
Setaria italica
Panicum virgatum
Panicum hallii
Oryza sativa
Brachypodium stacei
Brachypodium sylvaticum
Brachypodium distachyon
Hordeum vulgare
Joinvillea ascendens
Joinvillea ascendens
Joinvillea ascendens
Joinvillea ascendens
Joinvillea ascendens
Joinvillea ascendens
Streptochaeta angustifolia
Streptochaeta angustifolia
Streptochaeta angustifolia
Streptochaeta angustifolia
Streptochaeta angustifolia
Streptochaeta angustifolia
Streptochaeta angustifolia
Streptochaeta angustifolia
Sorghum bicolor RTx430
Sorghum bicolor RTx430
Sorghum bicolor RTx430
Sorghum bicolor RTx430
Sorghum bicolor RTx430
Sorghum bicolor RTx430
Sorghum bicolor RTx430
Sorghum bicolor RTx430
Brachypodium distachyon
Brachypodium distachyon
Brachypodium distachyon
Brachypodium distachyon
Arabidopsis thaliana
Arabidopsis thaliana
Arabidopsis thaliana
Arabidopsis thaliana
Joinvillea ascendens PAL
Arabidopsis thaliana PAL1
Joinvillea ascendens PAL
Arabidopsis thaliana PAL1
Setaria viridis
Setaria italica
Setaria viridis
Setaria italica
Panicum hallii
Panicum virgatum
Zea mays
Zea mays
Sorghum bicolor
Setaria viridis
Setaria italica
Oryza sativa
Brachypodium distachyon
Panicum virgatum
Panicum virgatum
Panicum hallii
Zea mays
Sorghum bicolor
Setaria italica
Setaria viridis
Setaria viridis
Setaria italica
Oryza sativa
Streptochaeta angustifolia
Streptochaeta angustifolia
Ecdeiocolea monostachya
Joinvillea ascendens
Joinvillea ascendens
Ecdeiocolea monostachya
Flagellaria_indica_Trinity_comp23995_c0_seq1
Flagellaria indica
Setaria italica
Setaria viridis
Setaria italica
Setaria viridis
Setaria italica
Setaria viridis
Panicum hallii
Panicum hallii
Panicum hallii
Panicum virgatum
Panicum virgatum
Panicum virgatum
Sorghum bicolor
Sorghum bicolor
Sorghum bicolor
Zea mays
Zea mays
Zea mays
Oryza sativa
Oryza sativa
Brachypodium distachyon
Brachypodium distachyon
Brachypodium distachyon
Brachypodium distachyon
Brachypodium distachyon
Oryza sativa
Panicum virgatum
Panicum virgatum
Panicum hallii
Setaria italica
Setaria viridis
Zea mays
Zea mays
Sorghum bicolor
Oryza sativa
Oryza sativa
Oryza sativa
Sorghum bicolor
Zea mays
Brachypodium distachyon
Streptochaeta angustifolia
Streptochaeta angustifolia
Panicum virgatum
Panicum hallii
Panicum virgatum
Panicum virgatum
Setaria viridis
Setaria italica
Zea mays
Sorghum bicolor
Panicum virgatum
Oryza sativa
Oryza sativa
Brachypodium distachyon
Streptochaeta angustifolia
Streptochaeta angustifolia
Streptochaeta angustifolia
Streptochaeta angustifolia
Ananas comosus
Ananas comosus
Apostasia shenzhenica
Apostasia shenzhenica
Apostasia shenzhenica
Dendrobium catenatum
Phalaenopsis equestris
Dendrobium catenatum
Apostasia shenzhenica
Apostasia shenzhenica
Phalaenopsis equestris
Spirodela polyrhiza
Spirodela polyrhiza
Musa acuminata
Musa acuminata
Musa acuminata
Musa acuminata
Musa acuminata
Musa acuminata
Elaeis guineensis
Calamus simplicifolius
Ananas comosus
Ananas comosus
Musa acuminata
Elaeis guineensis
Elaeis guineensis
Ananas comosus
Asparagus officinalis
Asparagus officinalis
Asparagus officinalis
Asparagus officinalis
Asparagus officinalis
Asparagus officinalis
Acorus americanus
Acorus americanus
Zostera marina
Zostera marina
Amborella trichopoda
Amborella trichopoda
Calamus simplicifolius
Calamus simplicifolius
Elaeis guineensis
Elaeis guineensis
Calamus simplicifolius
Ananas comosus
Musa acuminata
Zostera marina
Zostera marina
Spirodela polyrhiza
Acorus americanus
Acorus americanus
Acorus americanus
Amborella trichopoda
Amborella trichopoda
Amborella trichopoda
We obtained the genome and protein sequence data listed in Table 1 and Table 2 from the NCBI, DNA Databank of Japan (DDBJ), Phytozome, JGI, and plaza_v4.5_monocots databases. The genome sequence of Streptochaeta angustifolia was downloaded from a publication (Seetharam et al., 2021). The genome sequence of Ecdeiocolea monostachya was provided by Dr. Matthew Moscou (University of Minnesota, MN).
Phylogenetic Tree Analysis and Identification of Residues Involved in the Transition from PAL to PTAL
To find PAL homologs, we used OrthoFinder with the protein sequence datasets for green plants (Table 1) and monocots (Table 2) with the options of an MCL inflation parameter of 1.5, DIAMOND for sequence alignment, FastME, MAFFT for multiple sequence alignment, and FastTree for gene trees (Emms and Kelly, 2015). Because many genome sequences had duplicated or truncated sequences annotated as genes, we then ran a filter fasta script on the obtained orthogroup sequences to remove duplicate genes and genes shorter than 3× the standard deviation from the mean or a given length (less than 50 amino acids). Using the filtered sequence dataset, we generated an alignment using MAFFT v7.450 (Katoh and Standley, 2013). To determine the best evolutionary model for each PAL tree, we ran ModelTest-NG (Darriba et al. 2020). The best model was JTT+G4+F for the green plant dataset and JTT+I+G4+F for the monocot dataset. The maximum-likelihood phylogenetic tree was generated using RAxML-NG (Alexey et al., 2019).
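The filtering step described above can be sketched in Python as follows. This is not the script used by the inventors. It is a minimal sketch under one reading of the description, namely that exact duplicate sequences are dropped and that a sequence is also dropped when it is shorter than 50 amino acids or more than three standard deviations below the mean length. The function name `filter_sequences`, its parameters, and the toy records are hypothetical.

```python
from statistics import mean, pstdev

def filter_sequences(records, min_len=50, n_sd=3.0):
    """Filter (name, sequence) pairs from one orthogroup.

    Assumed rules (one interpretation of the text, not the original script):
      1. keep only the first copy of each identical sequence;
      2. drop sequences shorter than min_len residues;
      3. drop sequences more than n_sd standard deviations below the mean length.
    """
    seen = set()
    unique = []
    for name, seq in records:
        if seq not in seen:
            seen.add(seq)
            unique.append((name, seq))
    if not unique:
        return []
    lengths = [len(seq) for _, seq in unique]
    # The effective cutoff is whichever of the two length rules is stricter.
    cutoff = max(min_len, mean(lengths) - n_sd * pstdev(lengths))
    return [(name, seq) for name, seq in unique if len(seq) >= cutoff]

# Toy records: one exact duplicate and one truncated fragment.
toy_records = [
    ("seq_a", "M" * 700),
    ("seq_a_copy", "M" * 700),
    ("seq_c", "A" * 710),
    ("fragment", "G" * 30),
]

kept = filter_sequences(toy_records)
print([name for name, _ in kept])
```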
Sequences encoding PAL and PTAL candidate enzymes from S. bicolor, B. distachyon, S. angustifolia, and J. ascendens were amplified from cDNA with gene specific primers and PrimeSTAR® MAX DNA polymerase (Takara Bio) and were cloned into the pET28a vector using the In-Fusion® HD Cloning Kit (Takara Bio). The resulting vectors were submitted for sequence analysis, which confirmed that the coding sequences matched the sequences in the database. Polynucleotides encoding BdPTAL1, EmoPTAL, EmoPAL, JaPAL-MUT9, and JaPAL-MUT17 were synthesized and cloned into pET28a vectors (SynbioTechnologies). For site-directed mutagenesis, 1:100 diluted plasmid was PCR amplified using PrimeSTAR® MAX DNA polymerase (Takara Bio) and mutagenesis primers. The primers used for cloning are shown in Table 5.
For recombinant protein expression, the pET28a vectors were transformed into Rosetta-2 (DE3) E. coli and cultured in 3 ml of terrific broth (TB) medium containing kanamycin (50 μg/ml), chloramphenicol (34 μg/ml), and 0.1% glucose at 37° C. and 200 rpm overnight. Then, 500 μl of pre-culture solution was added to 50 ml TB medium containing the same antibiotics and further cultured at 27° C. and 200 rpm until the OD600 reached 0.5-0.7. The bacterial cultures were then cooled down on ice, isopropyl β-D-1-thiogalactopyranoside (IPTG, 0.5 mM final concentration) was added, and the cultures were incubated at 22° C. and 200 rpm. After 24 hours, the cultures were harvested by centrifugation (5000 g, 5 min, 4° C.) and the pellets were frozen at −30° C. The pellets were thawed and resuspended in lysis buffer containing 50 mM sodium phosphate buffer (pH 8.0), 300 mM NaCl, 10% glycerol, and 0.25 mg lysozyme. After a 30 min incubation on ice, the suspension was sonicated three times for 20 sec and the supernatant was recovered after centrifugation (12500 g, 20 min, 4° C.). The supernatants were added to a new tube containing 100 μl of Ni-NTA beads (Millipore) and the mixture was incubated at 25° C. for 30 min under constant inversion. After unbound proteins were washed away via three washes with washing buffer containing 50 mM sodium phosphate buffer (pH 8.0), 300 mM NaCl, 10% glycerol, and 10 mM imidazole, target proteins were eluted with elution buffer containing 50 mM sodium phosphate buffer (pH 8.0), 300 mM NaCl, 10% glycerol, and 300 mM imidazole. The purified enzyme solutions were desalted using a Sephadex G-50 column (GE Healthcare). The protein concentration was determined using the BioRad protein assay dye (BioRad). The purity was confirmed to be >90% using SDS-PAGE and ImageJ software.
All substrate solutions were prepared with 0.01 N NaOH to increase the solubility of L-Tyr. A mixture containing 100 mM Tris-HCl (pH 8.5), 1% glycerol, and purified enzyme in a total volume of 50 μl was preincubated for 3 min at 30° C. PAL and TAL reactions were started by addition of 50 μl of 1 mM substrate (L-Phe or L-Tyr, respectively) and were incubated at 30° C. for 20 min unless otherwise noted. The reactions were terminated by addition of 6N acetic acid (10 μl).
The reaction products were analyzed using high-performance liquid chromatography (HPLC) (1200 Infinity Series, Agilent Technologies) to directly detect products produced by PAL and TAL activity, i.e., cinnamic acid and p-coumaric acid, respectively. Analytical conditions were as follows: column, Neptune T3 C18 column (3 μm, 2.1×150 mm, ES industries); solvent system, solvent A (water including 0.1%[v/v] formic acid) and solvent B (acetonitrile including 0.1%[v/v] formic acid); gradient program: 99% A/1% B at 0 min, 99% A/1% B at 4.5 min, 95% A/5% B at 7.5 min, 85% A/15% B at 12 min, 75% A/25% B at 16.5 min, 70% A/30% B at 21 min, 5% A/95% B at 23 min, 5% A/95% B at 26 min, 99% A/5% B at 26.5 min, and 99% A/5% B at 30 min; flow rate: 0.3 mL/min; DAD: 275 nm for cinnamic acid, 309 nm for p-coumaric acid.
The kinetic parameters of the recombinant enzymes were determined using HPLC. Reaction mixtures containing 100 mM Tris-HCl (pH 8.5), 1% glycerol, and purified enzyme (0.15 μg for PAL assay and 1 μg for TAL assay) in a 50 μl total volume were preincubated for 3 min at 30° C. PAL and TAL reactions were started by addition of 50 μl substrate solution prepared with 0-4 mM L-Phe and 0-2 mM L-Tyr. After 10 min and 20 min incubations for PAL and TAL assay, respectively, at 30° C., the reaction was terminated by addition of 6N acetic acid (10 μl). Analytical conditions were as follows: column, Atlantis T3 C18 column (3 μm, 2.1×150 mm, Waters); solvent system, solvent A (water including 0.1%[v/v] formic acid) and solvent B (acetonitrile including 0.1%[v/v] formic acid); gradient program: 85% A/15% B at 0 min, 85% A/15% B at 1 min, 70% A/30% B at 3 min, 15% A/95% B at 6.5 min, 15% A/95% B at 7.5 min, 85% A/15% B at 8.5 min, and 85% A/15% B at 10 min; flow rate: 0.4 mL/min; DAD: 275 nm for cinnamic acid, 309 nm for p-coumaric acid. The products were quantified using calibration curves generated using authentic standards. Non-linear hyperbolic regression analyses were conducted using the Excel Solver tool to calculate Km and Vmax values.
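For readers who wish to reproduce this type of hyperbolic regression outside of a spreadsheet, the following Python sketch fits the Michaelis-Menten equation, v = Vmax·[S]/(Km + [S]), by least squares. It is not the Excel Solver procedure described above. As a swapped-in technique, it computes the best Vmax in closed form for each trial Km and then searches over Km with a golden-section search, which assumes that the squared error has a single minimum within the search bounds. The function names, the search bounds, and the noise-free synthetic data (generated with Km = 20 and Vmax = 1 in arbitrary units) are assumptions for illustration and are not measured values.

```python
import math

def _vmax_and_sse(km, substrate, rate):
    """For a fixed Km, return the least-squares Vmax and the squared error."""
    x = [s / (km + s) for s in substrate]
    vmax = sum(v * xi for v, xi in zip(rate, x)) / sum(xi * xi for xi in x)
    sse = sum((v - vmax * xi) ** 2 for v, xi in zip(rate, x))
    return vmax, sse

def fit_michaelis_menten(substrate, rate, km_lo=1e-3, km_hi=1e5, iters=100):
    """Estimate (Km, Vmax) by golden-section search over log(Km)."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = math.log(km_lo), math.log(km_hi)
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        sse_c = _vmax_and_sse(math.exp(c), substrate, rate)[1]
        sse_d = _vmax_and_sse(math.exp(d), substrate, rate)[1]
        if sse_c < sse_d:
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = math.exp((a + b) / 2)
    vmax, _ = _vmax_and_sse(km, substrate, rate)
    return km, vmax

# Synthetic, noise-free data (arbitrary units), not experimental results.
true_km, true_vmax = 20.0, 1.0
conc = [5, 10, 20, 50, 100, 500, 1000]
vel = [true_vmax * s / (true_km + s) for s in conc]

km_fit, vmax_fit = fit_michaelis_menten(conc, vel)
print(f"Km = {km_fit:.3f}, Vmax = {vmax_fit:.3f}")
```

With real assay data, the fitted Vmax would be divided by the enzyme concentration to obtain kcat, and kcat/Km would give the catalytic efficiency compared in the kinetic analyses.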
The structures of JaPAL and JaPTAL were generated with SWISS-MODEL (Waterhouse et al., 2018) using a homo-tetrameric PAL structure from parsley 6F6T.pdb (Bata et al., 2021) and a homo-dimeric PTAL structure from sorghum 6AT7.pdb (Sun et al., 2018), respectively, as templates. The sequence identities against the respective templates were 77.3% and 80.5% for JaPAL and JaPTAL, respectively.
In the following example, the inventors describe experiments that demonstrate that several different amino acid substitutions at position 112 in JaPAL retain the TAL activity observed in the JaPALF140H_S112I double mutant.
A phylogenetic analysis revealed that, while the amino acids Ser and Ile are well conserved at positions corresponding to residue 112 in JaPAL in angiosperm PAL enzymes, basal non-flower PAL enzymes possess Ile, Thr, or Val at this position (
In the following example, the inventors describe future experiments in which engineered PAL enzymes will be tested in planta.
To test the effects of the F140H and S112I mutations in plants, we will transiently express recombinant PAL enzymes (e.g., Arabidopsis PAL_S112I-F140H) with and without the corresponding mutations in Nicotiana benthamiana using Agrobacterium-mediated transformation. Soluble metabolites will be extracted from the transformed Nicotiana leaves and quantified to determine if the production of any soluble phenylpropanoid compounds was affected by the presence of the recombinant PAL enzymes.
This experiment will also be conducted in plants that express deregulated TyrA enzymes that we previously discovered, such as Beta vulgaris TyrAalpha (Lopez-Nieves et al., Plant J 109: 844-855 (2021)). The presence of the deregulated TyrA enzymes should increase the availability of the tyrosine substrate for the TAL activity.
This application claims priority to U.S. Provisional Application No. 63/491,152, filed on Mar. 20, 2023, the contents of which are incorporated by reference in their entireties.
This invention was made with government support under grant number 1836824 awarded by the National Science Foundation. The government has certain rights in the invention.
Number | Date | Country
---|---|---
63491152 | Mar 2023 | US