The sequence listing written in file 081906-1082455i-220730us_sl.txt created on Oct. 10, 2018, 13,046 bytes, machine format IBM-PC, MS-Windows Operating System, is hereby incorporated by reference in its entirety for all purposes.
Cytoplasmic lipid droplets (LDs) of neutral lipids are reserves in cells of all eukaryotes and prokaryotes. Recently, studies of LDs have attracted significant attention for their involvement in the production of biodiesels in photosynthetic organisms and high-valued lipid-related products in diverse organisms, as well as in human diseases related to obesity and host-pathogen interaction. LDs in plant seeds are prominent and were studied extensively before those in other organisms. Plant LDs are covered with a layer of phospholipids and abundant structural protein called oleosin. Oleosin has short amphipathic N- and C-terminal peptides flanking a 72-residue hydrophobic hairpin, which penetrates and stabilizes the LD. Oleosin is synthesized on endoplasmic reticulum (ER) and extracts ER-budding LDs to cytosol. The present inventors examined oleosin targeting signals for ER-LDs by expressing recombinant-oleosin genes in Physcomitrella patens after transient expression and Nicotiana tabacum cells after stable transformation. The initial ˜4 residues and the entirety of the hairpin, but not the N- and C-terminal peptides, were required for oleosin targeting to ER and staying on LDs. Oleosin with additions of an N-terminal ER-targeting peptide and a vacuole-targeting propeptide and a reduction of the hairpin length entered the ER lumen; extracted ER-budding LDs to the lumen; and guided LDs to vacuoles. These findings define the mechanism of oleosin and LD biosynthesis and reveal approaches to redirecting cytosolic LDs to storage vacuoles or secretion to avoid metabolic feedback inhibition and for other industrial and health applications.
Cytoplasmic lipid droplets (LDs) of neutral lipids are reserves in all eukaryotes and prokaryotes (1-10). The spherical LDs are 1-2 μm in diameter, depending on cell types and metabolic conditions. Each LD is enclosed with a layer of phospholipids (PLs) embedded with proteins, which exert structural and/or metabolic/regulatory functions. Vegetative oils from plant LDs have been extensively used for food and non-food purposes. Recently, studies of LDs have attracted major attention because they are involved in industrial manufacture of renewable biodiesels in photosynthetic organisms (11) and high-valued lipid-related products in diverse organisms (12), as well as human diseases related to obesity and host-pathogen interactions (6-10).
LDs in plant seeds are prominent and were studied extensively (1, 2) before those in mammals and microbes (5-10). Seeds store triacylglycerols (TAGs) in LDs (also called oil bodies, oleosomes, lipid bodies, spherosomes, etc.) as food reserves for germination. Each LD has a TAG matrix enclosed with a layer of PLs and the structural protein oleosin. Oleosin completely covers the surface of LDs and prevents them from coalescing, even in desiccated seeds (2, 13). The small size of LDs provides a large surface area per unit TAG, which facilitates lipolysis during germination.
Oleosins are present in green algae and primitive and advanced plants (1, 2, 13, 14, 30). They are small proteins of 15-26 kDa. Each oleosin has short amphipathic N- and C-terminal peptides lying on the LD and a hallmark central hydrophobic hairpin of ˜72 uninterrupted non-charged residues. The hairpin has 2 arms each of ˜30 residues linked with a loop of 12 most-conserved residues (PX5SPX3P (SEQ ID NO:1), with X representing a nonpolar residue). The hairpin of an alpha (15) or beta (16) structure of 5-6 nm long penetrates the surface PL layer into the TAG matrix of an LD and stabilizes the whole LD.
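The loop motif PX5SPX3P described above can be expressed as a pattern search. The following is a minimal sketch, not a method taken from this disclosure: the set of residues treated as "nonpolar" (NONPOLAR) is an assumption, since the text does not enumerate them, and the example sequence is a hypothetical placeholder rather than a sequence from the sequence listing.

```python
import re

# Assumption: residues counted as "nonpolar" for X; the source does not list them.
NONPOLAR = "AVLIMFWG"

# P, five nonpolar residues, S, P, three nonpolar residues, P (12 residues in total).
LOOP_MOTIF = re.compile(rf"P[{NONPOLAR}]{{5}}SP[{NONPOLAR}]{{3}}P")


def find_loop(seq):
    """Return (start_index, matched_loop) for the first loop motif, or None."""
    match = LOOP_MOTIF.search(seq.upper())
    return (match.start(), match.group()) if match else None


# Hypothetical placeholder sequence containing one loop motif.
print(find_loop("MKTPLLVIFSPVLVPGA"))
```

Widening or narrowing NONPOLAR changes which sequences qualify, so the set should be checked against the oleosins of interest before the pattern is relied on.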
Seed LDs are synthesized on endoplasmic reticulum (ER) (1, 2). TAG-synthesizing enzymes are associated with extended regions or subdomains of ER (17-23). TAGs synthesized on ER are sequestered in the non-polar acyl region of the PL bilayer, which results in an ER-budding LD. Oleosin is synthesized on the cytosolic side of ER via Signal-Recognition-Particle-guided mRNAs (22) and extracts ER-budding LDs to cytosol. The C-terminal peptide is not required for oleosin targeting to ER-LDs (21, 22), but the N-terminal peptide may or may not be so required (21, 24, 25). The hairpin and its loop PX5SPX3P (SEQ ID NO:1) are required for proper oleosin targeting to ER-LDs (21). Adding an N-terminal ER-targeting peptide to oleosin allows the protein to associate with ER but not enter the ER lumen, presumably because of the bulky hydrophobic hairpin (21).
Seed LDs produced from ER budding are discharged to and stored in cytosol, because oleosin is synthesized on the ER cytosolic side and extracts budding LDs to cytosol. If oleosin were synthesized on the ER luminal side, budding LDs could be extracted to the ER lumen and then move to protein storage vacuoles (PSVs) or other vesicular compartments, or be excreted. Plant vacuoles serve many functions (26-28), one of which is being metabolic sinks of accumulated secondary metabolites, not just for storage but also for avoidance of metabolic feedback inhibition.
Here the present inventors have delineated the mechanisms of oleosin synthesis and its targeting to ER-LDs. From the findings, they have modified oleosin with additions of an N-terminal ER-targeting signal peptide and a vacuole-targeting propeptide and a reduction of the hairpin length; the modified oleosin enters the ER lumen and guides ER-budding LDs to the lumen and then vacuoles (29). These approaches have potential for industrial and health applications.
This invention provides a modified oleosin protein that is effective for regulating subcellular lipid storage, transportation, and disposition and thus has substantial potential in various applications. In one aspect, the present invention relates to a recombinant or modified oleosin protein generated from a native oleosin protein. Typically, the native oleosin protein comprises an amphipathic N-terminal peptide, an amphipathic C-terminal peptide, and a hairpin in the middle comprising or consisting of a hairpin loop flanked by two hairpin arms. In contrast, the modified oleosin protein comprises (i) an endoplasmic reticulum (ER)-targeting peptide at the N-terminus of the modified oleosin protein; and (ii) a truncated hairpin arm of the native oleosin protein; at least one hairpin arm is truncated, and more typically both are truncated.
In some embodiments, the modified oleosin protein further comprises a vacuole-targeting sequence, for example, a protein storage vacuole (PSV)-targeting sequence. In some embodiments, the vacuole-targeting sequence is located between the ER-targeting peptide and the first (or closest to N-terminus) truncated hairpin arm. In some embodiments, the modified oleosin protein has one or both hairpin arms truncated, with 5-15 (e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15) amino acids remaining in one or each arm. The amino acids that have been deleted from a hairpin arm may have been originally located at the N-terminus or the C-terminus of the hairpin arm; or they may have been originally located in the middle of the hairpin arm, i.e., not immediately at either the N-terminus or the C-terminus of a hairpin arm but rather at least one or a few amino acids away. In some embodiments, the amphipathic N- or C-terminal peptide of the native oleosin protein is truncated or deleted. In some embodiments, the modified oleosin protein includes the initial 5-10 amino acids of the first hairpin arm, e.g., the first 5, 6, 7, 8, 9, or 10 amino acids of the first hairpin arm counting from the direction of the N-terminus. In some embodiments, the modified oleosin protein is a fusion protein further comprising an additional heterologous peptide, or a peptide from a different origin, such as a green fluorescent protein (GFP) or β-glucuronidase (GUS). In some embodiments, the modified oleosin protein is covalently linked to a detectable label, which, in addition to proteins capable of generating a detectable signal, may include molecules of another nature (e.g., radioisotopes or marker peptides with commercially available antibodies).
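The arrangement of elements described above (ER-targeting peptide, vacuole-targeting sequence, truncated first arm, loop, truncated second arm) can be sketched as a simple string assembly. This is an illustrative sketch only. The function names, the choice of which end of each arm is retained, and every sequence shown are hypothetical placeholders, not sequences from the sequence listing. The amphipathic terminal peptides are omitted here, which the text permits but does not require.

```python
def truncate_arm(arm, keep, side="N"):
    """Keep `keep` residues from the N- or C-terminal end of a hairpin arm."""
    if not 5 <= keep <= 15:
        raise ValueError("the text describes 5-15 residues remaining per arm")
    if keep > len(arm):
        raise ValueError("cannot keep more residues than the arm contains")
    return arm[:keep] if side == "N" else arm[-keep:]


def build_modified_oleosin(er_signal, vacuole_propeptide, arm1, loop, arm2,
                           keep=10, side1="N", side2="N"):
    """Concatenate the elements in N-to-C order with both arms truncated.

    Assumption: which end of each arm is retained (side1, side2) is a free
    choice in this sketch; the text allows deletions at either end or in
    the middle of an arm.
    """
    return (er_signal + vacuole_propeptide
            + truncate_arm(arm1, keep, side1)
            + loop
            + truncate_arm(arm2, keep, side2))


# Hypothetical placeholders with the lengths mentioned in the text:
# 21-residue ER signal, 12-residue propeptide, ~30-residue arms, 12-residue loop.
er_signal = "M" + "A" * 20
propeptide = "S" * 12
arm1, loop, arm2 = "L" * 30, "PLLVIFSPVLVP", "V" * 30

protein = build_modified_oleosin(er_signal, propeptide, arm1, loop, arm2, keep=10)
print(len(protein))  # 21 + 12 + 10 + 12 + 10 residues
```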
In a second aspect, this invention provides an isolated polynucleotide sequence encoding the modified oleosin protein described above and herein (29). In some embodiments, the polynucleotide sequence is present in an expression cassette (e.g., one that is capable of directing transcription of the coding sequence) or a vector (e.g., one that is capable of self-replicating). Also provided is a host cell comprising the expression cassette or the vector. The host cell may be a prokaryotic or eukaryotic cell. In some cases, the host cell has its genomic sequence encoding the endogenous oleosin gene and/or its regulatory elements manipulated (e.g., by deletion, substitution, or insertion of nucleotide(s) to alter the genomic sequence) such that the expression level of the endogenous oleosin protein is substantially reduced or completely eliminated. Some examples of the host cells include plant cells, animal cells, fungal cells, or bacterial cells. An organism comprising such a host cell is also provided, which may be a plant, an animal, or a microbe. In essence, the organism can be of any species, including plants and animals (invertebrates or vertebrates, such as mammals including primates such as humans) and microorganisms such as fungi (e.g., yeast), algae, all prokaryotes and eukaryotes that contain lipid droplets in their cells.
In a third aspect, the present invention provides a method of regulating distribution of subcellular lipid within a cell or a method for promoting secretion of lipids from within a cell. Both methods comprise the step of introducing into the cell the polynucleotide encoding the modified oleosin protein of this invention described above and herein. Typically, the modified oleosin protein is expressed in the cell, which may be in a permanent manner or only transiently. In some cases, the method further comprises a step of manipulating the cell's genomic sequence encoding the endogenous oleosin gene and/or its regulatory elements (e.g., by deletion, substitution, or insertion of nucleotide(s) to alter the genomic sequence) such that the expression level of the endogenous oleosin protein is substantially reduced or completely eliminated. In some embodiments, the cell is within a living organism such as a plant, an animal, or a microbe. In essence, the organism can be of any species, including plants and animals (invertebrates or vertebrates, such as mammals including primates such as humans) and microorganisms such as fungi (e.g., yeast), algae, all prokaryotes and eukaryotes that contain lipid droplets in their cells.
In a fourth aspect, this invention provides a method for generating a cell with increased extracellular secretion of lipids. This method comprises introducing into the cell the polynucleotide encoding the modified oleosin protein of this invention. In some cases, the method further comprises a step of manipulating the cell's genomic sequence encoding the endogenous oleosin gene and/or its regulatory elements (e.g., by deletion, substitution, or insertion of nucleotide(s) to alter the genomic sequence) such that the expression level of the endogenous oleosin protein is substantially reduced or completely eliminated. In some embodiments, this method may further comprise a step of selecting a cell, subsequent to the introducing step, for increased extracellular secretion of lipids. Optionally, the method scheme can include an additional step of collecting secreted lipids from a cell exhibiting increased extracellular secretion of lipids. In some embodiments, the cell is within and a part of a living organism (e.g., a plant, a non-human animal, or a microbe). An organism generated by this method is also provided, which may be a plant, a non-human animal, or a microbe. In essence, the organism can be of any species, including plants and animals (invertebrates or vertebrates, such as mammals including primates such as humans) and microorganisms such as fungi (e.g., yeast), algae, all prokaryotes and eukaryotes that contain lipid droplets in their cells.
The term “oleosin protein,” as used herein, refers to small proteins of 15-26 kDa that are present in green algae and primitive and advanced plants. Each oleosin has short amphipathic N- and C-terminal peptides orienting horizontally on the LD surface and a characteristic central hydrophobic hairpin of ˜72 uninterrupted non-charged residues. The hairpin has 2 arms each of ˜30 residues linked with a loop of the 12 most-conserved residues (PX5SPX3P (SEQ ID NO:1), with X representing a nonpolar residue). The hairpin of an alpha or beta structure of 5-6 nm long penetrates the surface PL layer into the TAG matrix of an LD and stabilizes the whole LD. The inventor's group cloned the first oleosin gene, published the finding in 1987, and soon afterward named the protein oleosin. Three reviews on oleosins published subsequently are: Huang, A H C. 1992 Oil bodies and oleosins in seeds. Annu. Rev. Plant Physiol. Mol. Biol. 43, 177-200; Huang A H C. 2010 Subcellular lipid droplets and oleosins in plants. Am. Oil Chem Soc. Library on lipids (website: lipidlibrary.aocs.org/plantbio/oilbodies/index.htm); and Huang A H C. 2018 Plant lipid droplets and their associated proteins: potential for rapid advances. Plant Physiol 176: 1894-1918.
As used herein, a “signal peptide” refers to a short (5-30 amino acids long) peptide present at the N-terminus of the majority of newly synthesized proteins that are destined towards the secretory pathway. These proteins include those that reside inside certain organelles (the endoplasmic reticulum or ER, Golgi, or endosomes), are secreted from the cell, or are inserted into most cellular membranes. Although most type I membrane-bound proteins have signal peptides, the majority of type II and multi-spanning membrane-bound proteins are targeted to the secretory pathway by their first transmembrane domain, which biochemically resembles a signal sequence except that it is not cleaved. A signal peptide is sometimes also referred to as a signal sequence, targeting signal, localization signal, localization sequence, transit peptide, leader sequence, or leader peptide.
The N-terminus is the first part of the protein that exits the ribosome during protein biosynthesis. It often contains signal peptide sequences, “intracellular postal codes” that direct delivery of the protein to the proper organelle (in the present case, the ER). The signal peptide is typically removed at the destination by a signal peptidase. The N-terminal signal peptide is recognized by the signal recognition particle (SRP) and results in the targeting of the protein to the secretory pathway. In eukaryotic cells, these proteins are synthesized at the rough endoplasmic reticulum. In prokaryotic cells, the proteins are exported across the cell membrane. The N-terminal peptide from one species often works in another species. In this study, the inventors successfully used either (i) the 21-amino-acid N-terminal ER-targeting sequence of aspartic proteinase in Physcomitrella (a primitive plant called moss) or (ii) the 23-amino-acid N-terminal ER-targeting sequence of a protein called CLV3 in Arabidopsis (an advanced plant). For a review of signal peptides, see, e.g., Nilsson I, Lara P, Hessa T. et al. 2015 The code for directing proteins for translocation across ER membrane: SRP cotranslationally recognizes specific features of a signal sequence. J Mol Biol 427: 1191-1201.
As used herein, a “vacuole-targeting sequence” refers to a type of signal peptide specifically for targeting proteins to the subcellular vacuoles. The sequence (peptide) is usually (but not always) located immediately after the N-terminal ER-targeting sequence; it is called a propeptide. The vacuole-targeting peptide from one species often works in another species. In this study, the sequence (peptide) used was that (12 amino acids) for targeting a protein called ricin to subcellular vacuoles in castor bean (an advanced plant). For reviews, see, e.g., Chrispeels M J, Raikhel N V. 1992. Short peptide domains target proteins to plant vacuoles. Cell 68 (4): 613-616; Martinoia E, Meyer S, De Angeli A, & Nagy R (2012) Vacuolar transporters in their physiological context. Annual review of plant biology 63:183-213; and Zhang C, Hicks G R, & Raikhel N V. (2014) Plant vacuole morphology and vacuolar trafficking. Frontiers in plant science 5:476.
The term “nucleic acid” or “polynucleotide” refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). The term nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.
The term “gene” means the segment of DNA involved in producing a polypeptide chain. It may include regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons).
The term “amino acid” refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. “Amino acid mimetics” refers to chemical compounds having a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
There are various known methods in the art that permit the incorporation of an unnatural amino acid derivative or analog into a polypeptide chain in a site-specific manner, see, e.g., WO 02/086075.
Amino acids may be referred to herein by either the commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
“Conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, “conservatively modified variants” refers to those nucleic acids that encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of conservatively modified variations. Every nucleic acid sequence herein that encodes a polypeptide also describes every possible silent variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid that encodes a polypeptide is implicit in each described sequence.
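The degeneracy argument above can be made concrete with the standard genetic code. The sketch below is illustrative only: the function names are hypothetical, and the table is the standard nuclear code written in RNA letters, which is assumed to apply to the organism in question.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_codons(aa):
    """All codons that encode the given one-letter amino acid, sorted."""
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def count_silent_variants(peptide):
    """Number of distinct coding sequences that encode the same peptide."""
    return prod(len(synonymous_codons(aa)) for aa in peptide)


print(synonymous_codons("A"))        # the four alanine codons named in the text
print(synonymous_codons("M"), synonymous_codons("W"))
print(count_silent_variants("MAW"))  # only the alanine position can vary
```

Because methionine and tryptophan each have a single codon, they contribute a factor of one to the count, which matches the exception noted in the text.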
As to amino acid sequences, one of skill will recognize that an individual substitution, deletion, or addition to a nucleic acid, peptide, polypeptide, or protein sequence that alters, adds, or deletes a single amino acid or a small percentage of amino acids in the encoded sequence yields a “conservatively modified variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention.
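The following eight groups each contain amino acids that are conservative substitutions for one another:
1) Alanine (A), Glycine (G);
2) Aspartic acid (D), Glutamic acid (E);
3) Asparagine (N), Glutamine (Q);
4) Arginine (R), Lysine (K);
5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V);
6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W);
7) Serine (S), Threonine (T); and
8) Cysteine (C), Methionine (M)
(see, e.g., Creighton, Proteins, W. H. Freeman and Co., N. Y. (1984)).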
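A conservative-substitution check can be sketched as a set lookup. The groups below follow the eight-group table commonly reproduced from Creighton (1984); the exact membership is an assumption to be checked against that reference, and the function names are hypothetical.

```python
# Assumption: eight-group table as commonly reproduced from Creighton (1984).
CONSERVATIVE_GROUPS = [
    set("AG"), set("DE"), set("NQ"), set("RK"),
    set("ILMV"), set("FYW"), set("ST"), set("CM"),
]


def is_conservative(a, b):
    """True if residues a and b are identical or share a substitution group."""
    return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def differ_only_conservatively(seq1, seq2):
    """True if two equal-length sequences differ only by conservative substitutions."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be the same length")
    return all(is_conservative(a, b) for a, b in zip(seq1, seq2))


print(is_conservative("D", "E"))                   # same group
print(is_conservative("D", "K"))                   # different groups
print(differ_only_conservatively("ADKS", "GERT"))  # every position within a group
```

Methionine belongs to two groups in this table, so the check tests membership in any single group rather than assigning each residue one group label.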
In the present application, amino acid residues are numbered according to their relative positions from the leftmost residue, which is numbered 1, in an unmodified wild-type polypeptide sequence.
As used herein, the terms “identical” or percent “identity,” in the context of describing two or more polynucleotide or amino acid sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (for example, a modified oleosin protein sequence of this invention (or a portion thereof, e.g., the hairpin loop section of a modified oleosin protein) has at least 80% identity, preferably 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity, to a reference sequence, e.g., a wild-type or native oleosin protein or the corresponding portion thereof, such as the hairpin loop section of a wild-type oleosin protein), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Such sequences are then said to be “substantially identical.” With regard to polynucleotide sequences, this definition also refers to the complement of a test sequence. Preferably, the identity exists over a region that is at least about 50 amino acids or nucleotides in length, or more preferably over a region that is 75-100 amino acids or nucleotides in length.
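Percent identity over an already-aligned window can be computed as sketched below. The denominator convention (all aligned columns, with gap columns counted as mismatches) is an assumption, since several conventions are in use; the function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding the same residue in both sequences.

    Assumption: the denominator is the number of aligned columns, excluding
    columns where both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def is_substantially_identical(aligned_a, aligned_b, threshold=80.0):
    """Apply the 'at least 80% identity' criterion from the text."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 12-residue loop sections differing at one position.
print(percent_identity("PLLVIFSPVLVP", "PLLVIFSPALVP"))
print(is_substantially_identical("PLLVIFSPVLVP", "PLLVIFSPALVP"))
```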
For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters. For sequence comparison of nucleic acids and proteins, the BLAST and BLAST 2.0 algorithms and the default parameters discussed below are used.
A “comparison window”, as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds. 1995 supplement)).
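As an illustration of the local homology approach cited above, the following sketch computes a Smith-Waterman local alignment score. It is a simplified teaching version, not the implementation in any of the named software packages: the match, mismatch, and linear gap scores are assumed values, and only the best score is returned, without the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Assumptions: simple match/mismatch scoring and a linear gap penalty.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(0,                 # start a new local alignment
                       prev[j - 1] + s,   # align a[i-1] with b[j-1]
                       prev[j] + gap,     # gap in b
                       cur[j - 1] + gap)  # gap in a
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best


print(smith_waterman_score("XXACGTXX", "ACGT"))  # the shared ACGT segment scores 4 * match
```

Flooring every cell at zero is what makes the alignment local: a poorly matching prefix cannot drag down the score of a good segment that follows it.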
Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly available at the National Center for Biotechnology Information website, ncbi.nlm.nih.gov. The algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word size (W) of 28, an expectation (E) of 10, M=1, N=−2, and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a word size (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.
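The word-hit extension and its three stopping rules can be sketched for nucleotide sequences as follows. This is a simplified illustration rather than the BLAST implementation: it extends an exact-match seed in one direction only and without gaps, the function name is hypothetical, and the drop-off value X=5 is an arbitrary assumption. M=1 and N=−2 are the defaults stated in the text.

```python
def extend_right(query, subject, q_start, s_start, word, M=1, N=-2, X=5):
    """Extend an exact-match seed to the right; return (best_score, best_length).

    Extension stops when the score drops X or more below its maximum,
    when the score reaches zero or below, or when either sequence ends.
    """
    score = best = word * M  # the seed is assumed to be an exact match
    length = best_len = word
    i, j = q_start + word, s_start + word
    while i < len(query) and j < len(subject):
        score += M if query[i] == subject[j] else N
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= X or score <= 0:
            break
        i += 1
        j += 1
    return best, best_len


# Hypothetical sequences: eight matching bases followed by mismatches.
print(extend_right("ACGTACGTAAAA", "ACGTACGTCCCC", 0, 0, word=4))
```

Extension to the left would mirror this loop with decreasing indices, and the two results would be combined into one high-scoring segment pair.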
An indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the antibodies raised against the polypeptide encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below. Yet another indication that two nucleic acid sequences are substantially identical is that the same primers can be used to amplify the sequence.
“Polypeptide,” “peptide,” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. All three terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. As used herein, the terms encompass amino acid chains of any length, including full-length proteins, wherein the amino acid residues are linked by covalent peptide bonds.
In the context of a fusion protein comprising a first polypeptide segment and a second “heterologous” polypeptide segment, the term “heterologous” refers to the fact that the two polypeptide segments originate from two distinct sources, for example, from two different proteins, or from the same protein but two different portions of the protein. In other words, a polypeptide being a component in a fusion protein is “heterologous” to another polypeptide component of the fusion protein when the two polypeptides do not appear in nature in the same manner as they appear in the fusion protein. The term “heterologous” has a similar meaning when used in the context of describing the relationship between two polynucleotide sequences joined together in a manner not found in nature.
An “expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular polynucleotide sequence in a host cell. An expression cassette may be part of a plasmid, viral genome, or nucleic acid fragment. Typically, an expression cassette includes a polynucleotide to be transcribed, operably linked to a promoter. Other elements that may be present in an expression cassette include those that enhance transcription (e.g., enhancers) and terminate transcription (e.g., terminators), as well as those that confer certain binding affinity or antigenicity to the recombinant protein produced from the expression cassette.
Based on the present inventors' discovery of oleosin's important role in cellular lipid distribution, this disclosure provides a recombinant or modified protein based on a naturally occurring oleosin protein, a polynucleotide sequence encoding the protein and associated vectors, expression cassettes, and host cells, as well as methods of making and using the modified oleosin protein to modulate lipid storage/secretion at a cellular level.
A. General Recombinant Technology
Basic texts disclosing general methods and techniques in the field of recombinant genetics include Sambrook and Russell, Molecular Cloning, A Laboratory Manual (3rd ed. 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Ausubel et al., eds., Current Protocols in Molecular Biology (1994).
For nucleic acids, sizes are given in either kilobases (kb) or base pairs (bp). These are estimates derived from agarose or acrylamide gel electrophoresis, from sequenced nucleic acids, or from published DNA sequences. For proteins, sizes are given in kilodaltons (kDa) or amino acid residue numbers. Protein sizes are estimated from gel electrophoresis, from sequenced proteins, from derived amino acid sequences, or from published protein sequences.
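As a rough illustration of how these size units relate to one another, the following sketch converts a residue count to an approximate mass and to the length of the corresponding open reading frame. The average residue mass of 110 Da is a common rule of thumb and is an assumption here, not a value taken from this disclosure; the function names are hypothetical.

```python
# Rule-of-thumb conversions between the size units used for proteins and
# nucleic acids.
# ASSUMPTION: an average residue mass of ~110 Da (a common approximation,
# not a value stated in this disclosure).
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues):
    """Return an approximate protein mass in kDa for a given residue count."""
    if n_residues < 0:
        raise ValueError("residue count must be non-negative")
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def coding_length_bp(n_residues, include_stop=True):
    """Return the length in bp of an open reading frame encoding n_residues."""
    if n_residues < 0:
        raise ValueError("residue count must be non-negative")
    return 3 * (n_residues + (1 if include_stop else 0))
```

Such estimates are only a starting point; sizes read from a gel or computed from an actual sequence take precedence.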
Oligonucleotides that are not commercially available can be chemically synthesized, e.g., according to the solid phase phosphoramidite triester method first described by Beaucage & Caruthers, Tetrahedron Lett. 22: 1859-1862 (1981), using an automated synthesizer, as described in Van Devanter et al., Nucleic Acids Res. 12: 6159-6168 (1984). Purification of oligonucleotides is performed using any art-recognized strategy, e.g., native acrylamide gel electrophoresis or anion-exchange HPLC as described in Pearson & Regnier, J. Chrom. 255: 137-149 (1983).
The sequence of a gene of interest, such as an oleosin gene, a polynucleotide encoding a modified oleosin protein, and synthetic oligonucleotides can be verified after cloning or subcloning using, e.g., the chain termination method for sequencing double-stranded templates of Wallace et al., Gene 16: 21-26 (1981).
B. Coding Sequence for an Oleosin Protein
Polynucleotide sequences encoding a protein of interest, such as a naturally occurring oleosin protein, are typically known and in some cases may be obtained from a commercial supplier.
The rapid progress in genome studies of human and other species has made possible a cloning approach in which a genomic DNA sequence database can be searched for any gene segment that has a certain percentage of sequence homology to a known nucleotide sequence, such as one encoding a previously identified oleosin protein. Any DNA sequence so identified can be subsequently obtained by chemical synthesis and/or a polymerase chain reaction (PCR) technique such as the overlap extension method. For a short sequence, completely de novo synthesis may be sufficient, whereas further isolation of the full-length coding sequence from a cDNA or genomic library using a synthetic probe may be necessary to obtain a larger gene.
Alternatively, a nucleic acid sequence encoding a naturally occurring oleosin protein can be isolated from a cDNA or genomic DNA library of human or another species using standard cloning techniques such as polymerase chain reaction (PCR), where homology-based primers can often be derived from a known nucleic acid sequence encoding an oleosin protein. Most commonly used techniques for this purpose are described in standard texts, e.g., Sambrook and Russell, supra.
cDNA libraries suitable for obtaining a coding sequence for a naturally occurring oleosin protein may be commercially available or can be constructed. The general methods of isolating mRNA, making cDNA by reverse transcription, ligating cDNA into a recombinant vector, transfecting into a recombinant host for propagation, screening, and cloning are well known (see, e.g., Gubler and Hoffman, Gene, 25: 263-269 (1983); Ausubel et al., supra). Upon obtaining an amplified segment of nucleotide sequence by PCR, the segment can be further used as a probe to isolate the full length polynucleotide sequence encoding the oleosin protein from the cDNA library. A general description of appropriate procedures can be found in Sambrook and Russell, supra.
A similar procedure can be followed to obtain a full-length sequence encoding a naturally occurring oleosin protein from a genomic library. Human genomic libraries, for example, are commercially available or can be constructed according to various art-recognized methods. In general, to construct a genomic library, the DNA is first extracted from a tissue where an oleosin protein is likely found. The DNA is then either mechanically sheared or enzymatically digested to yield fragments of about 12-20 kb in length. The fragments are subsequently separated by gradient centrifugation from polynucleotide fragments of undesired sizes and are inserted in bacteriophage λ vectors. These vectors and phages are packaged in vitro. Recombinant phages are analyzed by plaque hybridization as described in Benton and Davis, Science, 196: 180-182 (1977). Colony hybridization is carried out as described by Grunstein et al., Proc. Natl. Acad. Sci. USA, 72: 3961-3965 (1975).
Based on sequence homology, degenerate oligonucleotides can be designed as primer sets and PCR can be performed under suitable conditions (see, e.g., White et al., PCR Protocols: Current Methods and Applications, 1993; Griffin and Griffin, PCR Technology, CRC Press Inc. 1994) to amplify a segment of nucleotide sequence from a cDNA or genomic library. Using the amplified segment as a probe, the full-length nucleic acid encoding an oleosin protein is obtained.
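The notion of a degenerate primer set can be made concrete with a short sketch. It uses the standard IUPAC nucleotide ambiguity codes to count and enumerate the individual oligonucleotides represented by one degenerate primer. The example primer in the usage note is a made-up placeholder, not an oleosin-derived sequence, and the function names are hypothetical.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])  # KeyError signals a non-IUPAC character
    return n


def expand(primer: str) -> List[str]:
    """Enumerate every unambiguous sequence a degenerate primer stands for."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]
```

For the placeholder primer `ATGGARCAY`, `degeneracy` returns 4 and `expand` lists the four corresponding sequences. Keeping the degeneracy low generally helps amplification specificity, which is one reason to check it before ordering a primer set.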
Upon acquiring a nucleic acid sequence encoding a naturally occurring oleosin protein, the coding sequence can be further modified by a number of well-known techniques such as restriction endonuclease digestion, PCR, and PCR-related methods to generate coding sequences for oleosin proteins, including mutants and variants derived from the wild-type oleosin protein. The polynucleotide sequence encoding the desired polypeptide, e.g., a modified oleosin protein as described herein, can then be subcloned into a vector, for instance, an expression vector, so that a recombinant polypeptide can be produced from the resulting construct. Further modifications to the coding sequence, e.g., nucleotide substitutions, may be subsequently made to alter the characteristics of the polypeptide.
A variety of mutation-generating protocols are established and described in the art, and can be readily used to modify a polynucleotide sequence encoding a naturally occurring oleosin protein. See, e.g., Zhang et al., Proc. Natl. Acad. Sci. USA, 94: 4504-4509 (1997); and Stemmer, Nature, 370: 389-391 (1994). The procedures can be used separately or in combination to produce variants of a set of nucleic acids, and hence variants of encoded polypeptides. Kits for mutagenesis, library construction, and other diversity-generating methods are commercially available.
Mutational methods of generating diversity include, for example, site-directed mutagenesis (Botstein and Shortle, Science, 229: 1193-1201 (1985)), mutagenesis using uracil-containing templates (Kunkel, Proc. Natl. Acad. Sci. USA, 82: 488-492 (1985)), oligonucleotide-directed mutagenesis (Zoller and Smith, Nucl. Acids Res., 10: 6487-6500 (1982)), phosphorothioate-modified DNA mutagenesis (Taylor et al., Nucl. Acids Res., 13: 8749-8764 and 8765-8787 (1985)), and mutagenesis using gapped duplex DNA (Kramer et al., Nucl. Acids Res., 12: 9441-9456 (1984)).
Other possible methods for generating mutations include point mismatch repair (Kramer et al., Cell, 38: 879-887 (1984)), mutagenesis using repair-deficient host strains (Carter et al., Nucl. Acids Res., 13: 4431-4443 (1985)), deletion mutagenesis (Eghtedarzadeh and Henikoff, Nucl. Acids Res., 14: 5115 (1986)), restriction-selection and restriction-purification (Wells et al., Phil. Trans. R. Soc. Lond. A, 317: 415-423 (1986)), mutagenesis by total gene synthesis (Nambiar et al., Science, 223: 1299-1301 (1984)), double-strand break repair (Mandecki, Proc. Natl. Acad. Sci. USA, 83: 7177-7181 (1986)), mutagenesis by polynucleotide chain termination methods (U.S. Pat. No. 5,965,408), and error-prone PCR (Leung et al., Biotechniques, 1: 11-15 (1989)).
C. Modification of Nucleic Acids for Preferred Codon Usage in a Host Organism
The polynucleotide sequence encoding a protein of interest, e.g., a modified oleosin protein, can be further altered to coincide with the preferred codon usage of a particular host. For example, the preferred codon usage of one strain of bacterial cells can be used to derive a polynucleotide that encodes a recombinant polypeptide of the invention and includes the codons favored by this strain. The frequency of preferred codon usage exhibited by a host cell can be calculated by averaging frequency of preferred codon usage in a large number of genes expressed by the host cell (e.g., calculation service is available from web site of the Kazusa DNA Research Institute, Japan). This analysis is preferably limited to genes that are highly expressed by the host cell.
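The codon usage calculation described above can be sketched as follows. This is a minimal illustration, not the method of any particular calculation service: it pools codon counts over a set of coding sequences (a simplification of averaging per gene) and reports, for each codon, its frequency among the synonymous codons for the same amino acid. The function names and the input sequences in the usage note are hypothetical.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_usage(cds_list):
    """Frequency of each codon relative to its synonymous codons.

    ASSUMPTION: counts are pooled over all input coding sequences, which
    are expected to be in frame; trailing partial codons are ignored.
    """
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {c: n / totals[CODON_TABLE[c]] for c, n in counts.items()}


def preferred_codons(usage):
    """Pick the most frequently used codon for each amino acid."""
    best = {}
    for codon, freq in usage.items():
        aa = CODON_TABLE[codon]
        if aa not in best or freq > usage[best[aa]]:
            best[aa] = codon
    return best
```

With the toy input `["ATGGCTGCTGCC", "ATGGCTAAA"]`, alanine is encoded three times by GCT and once by GCC, so GCT has a relative frequency of 0.75 and is reported as the preferred alanine codon. In practice the input would be restricted to highly expressed genes of the host, as noted above.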
At the completion of modification, the coding sequences are verified by sequencing and are then subcloned into an appropriate expression vector for recombinant production of a protein of interest, such as a modified oleosin protein described herein.
Following verification of the coding sequence, a protein of interest (e.g., a modified oleosin protein) can be produced using routine techniques in the field of recombinant genetics, relying on the polynucleotide sequences encoding the polypeptide disclosed herein.
A. Expression Systems
To obtain high level expression of a nucleic acid encoding a modified oleosin protein of this invention, one typically subclones a polynucleotide encoding the protein in the correct reading frame into an expression vector that contains a strong promoter to direct transcription, a transcription/translation terminator and a ribosome binding site for translational initiation. Suitable bacterial promoters are well known in the art and described, e.g., in Sambrook and Russell, supra, and Ausubel et al., supra. Bacterial expression systems for expressing the polypeptide are available in, e.g., E. coli, Bacillus sp., Salmonella, and Caulobacter. Kits for such expression systems are commercially available. Eukaryotic expression systems for mammalian cells (including human cells), yeast, and insect cells are well known in the art and are also commercially available. In one embodiment, the eukaryotic expression vector is an adenoviral vector, an adeno-associated vector, or a retroviral vector.
The promoter used to direct expression of a heterologous coding sequence (e.g., one encoding a modified oleosin protein) depends on the particular application. The promoter is optionally positioned about the same distance from the heterologous transcription start site as it is from the transcription start site in its natural setting. As is known in the art, however, some variation in this distance can be accommodated without loss of promoter function.
In addition to the promoter, the expression vector typically includes a transcription unit or expression cassette that contains all the additional elements required for the expression of the modified oleosin protein of this invention in host cells. A typical expression cassette thus contains a promoter operably linked to the nucleic acid sequence encoding the modified protein and signals required for efficient polyadenylation of the transcript, ribosome binding sites, and translation termination. The nucleic acid sequence encoding the modified protein may be linked to a cleavable signal peptide sequence to promote secretion of the polypeptide by the transformed cell. Such signal peptides include, among others, the signal peptides from tissue plasminogen activator, insulin, and neuron growth factor, and juvenile hormone esterase of Heliothis virescens. Additional elements of the cassette may include enhancers and, if genomic DNA is used as the structural gene, introns with functional splice donor and acceptor sites.
In addition to a promoter sequence, the expression cassette should also contain a transcription termination region downstream of the structural gene to provide for efficient termination. The termination region may be obtained from the same gene as the promoter sequence or may be obtained from different genes.
The particular expression vector used to transport the genetic information into the cell is not particularly critical. Any of the conventional vectors used for expression in eukaryotic or prokaryotic cells may be used. Standard bacterial expression vectors include plasmids such as pBR322 based plasmids, pSKF, pET23D, and fusion expression systems such as GST and LacZ. Epitope tags can also be added to recombinant proteins to provide convenient methods of isolation, e.g., c-myc.
Expression vectors containing regulatory elements from eukaryotic viruses are typically used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective for expression in eukaryotic cells.
Some expression systems have markers that provide gene amplification such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate reductase. Alternatively, high yield expression systems not involving gene amplification are also suitable, such as a baculovirus vector in insect cells, with a polynucleotide sequence encoding a protein of interest (e.g., the modified oleosin protein) under the direction of the polyhedrin promoter or other strong baculovirus promoters.
The elements that are typically included in expression vectors also include a replicon that functions in E. coli, a gene encoding antibiotic resistance to permit selection of bacteria that harbor recombinant plasmids, and unique restriction sites in nonessential regions of the plasmid to allow insertion of eukaryotic sequences. The particular antibiotic resistance gene chosen is not critical; any of the many resistance genes known in the art are suitable. The prokaryotic sequences are optionally chosen such that they do not interfere with the replication of the DNA in eukaryotic cells, if necessary. Similar to antibiotic resistance selection markers, metabolic selection markers based on known metabolic pathways may also be used as a means for selecting transformed host cells.
When periplasmic expression of a recombinant protein (e.g., a modified oleosin protein of the present invention) is desired, the expression vector further comprises a sequence encoding a secretion signal, such as the E. coli OppA (Periplasmic Oligopeptide Binding Protein) secretion signal or a modified version thereof, which is directly connected to 5′ of the coding sequence of the protein to be expressed. This signal sequence directs the recombinant protein produced in cytoplasm through the cell membrane into the periplasmic space. The expression vector may further comprise a coding sequence for signal peptidase 1, which is capable of enzymatically cleaving the signal sequence when the recombinant protein is entering the periplasmic space. More detailed description for periplasmic production of a recombinant protein can be found in, e.g., Gray et al., Gene 39: 247-254 (1985), U.S. Pat. Nos. 6,160,089 and 6,436,674.
A person skilled in the art will recognize that various conservative substitutions can be made to any wild-type or mutant/variant protein to produce a modified oleosin protein within the scope of this disclosure. Moreover, modifications of a polynucleotide coding sequence may also be made to accommodate preferred codon usage in a particular expression host without altering the resulting amino acid sequence.
B. Transfection Methods
Standard transfection methods are used to produce bacterial, mammalian, yeast, insect, or plant cell lines that express large quantities of a modified oleosin protein of this invention, which can then be purified using standard techniques (see, e.g., Colley et al., J. Biol. Chem. 264: 17619-17622 (1989); Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, J. Bact. 132: 349-351 (1977); Clark-Curtiss & Curtiss, Methods in Enzymology 101: 347-362 (Wu et al., eds, 1983)).
Any of the well-known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, liposomes, microinjection, plasmid vectors, viral vectors and any of the other well-known methods for introducing cloned genomic DNA, cDNA, synthetic DNA, or other foreign genetic material into a host cell (see, e.g., Sambrook and Russell, supra). It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the modified oleosin protein of this invention.
In some cases, the host cell into which the modified oleosin coding sequence is being introduced may also have its genomic sequence(s) modified so as to reduce or abolish the expression of its native oleosin protein. Methods such as sequence homology-based gene disruption methods utilizing a viral vector or CRISPR system can be used for altering the oleosin genomic sequence, for example, by insertion, deletion, or substitution, which may occur in the coding region of the gene or in the non-coding regions (e.g., promoter region or other regulatory region) and which may result in substantial suppression or complete abolition of endogenous oleosin expression.
C. Purification of Recombinantly Produced Proteins
Once the expression of a recombinant protein, such as a modified oleosin protein of this invention, in transfected host cells is confirmed, e.g., via an immunoassay such as Western blotting assay, the host cells are then cultured in an appropriate scale for the purpose of purifying the recombinant protein.
1. Purification of Recombinantly Produced Polypeptides from Bacteria
When the modified proteins of the present invention are produced recombinantly by transformed bacteria in large amounts, typically after promoter induction, although expression can be constitutive, the polypeptides may form insoluble aggregates. There are several protocols that are suitable for purification of protein inclusion bodies. For example, purification of aggregate proteins (hereinafter referred to as inclusion bodies) typically involves the extraction, separation and/or purification of inclusion bodies by disruption of bacterial cells, e.g., by incubation in a buffer of about 100-150 μg/ml lysozyme and 0.1% Nonidet P40, a non-ionic detergent. The cell suspension can be ground using a Polytron grinder (Brinkman Instruments, Westbury, N.Y.). Alternatively, the cells can be sonicated on ice. Additional methods of lysing bacteria are described in Ausubel et al. and Sambrook and Russell, both supra, and will be apparent to those of skill in the art.
The cell suspension is generally centrifuged and the pellet containing the inclusion bodies resuspended in buffer which does not dissolve but washes the inclusion bodies, e.g., 20 mM Tris-HCl (pH 7.2), 1 mM EDTA, 150 mM NaCl and 2% Triton-X 100, a non-ionic detergent. It may be necessary to repeat the wash step to remove as much cellular debris as possible. The remaining pellet of inclusion bodies may be resuspended in an appropriate buffer (e.g., 20 mM sodium phosphate, pH 6.8, 150 mM NaCl). Other appropriate buffers will be apparent to those of skill in the art.
Following the washing step, the inclusion bodies are solubilized by the addition of a solvent that is both a strong hydrogen acceptor and a strong hydrogen donor (or a combination of solvents each having one of these properties). The proteins that formed the inclusion bodies may then be renatured by dilution or dialysis with a compatible buffer. Suitable solvents include, but are not limited to, urea (from about 4 M to about 8 M), formamide (at least about 80%, volume/volume basis), and guanidine hydrochloride (from about 4 M to about 8 M). Some solvents that are capable of solubilizing aggregate-forming proteins, such as SDS (sodium dodecyl sulfate) and 70% formic acid, may be inappropriate for use in this procedure due to the possibility of irreversible denaturation of the proteins, accompanied by a lack of immunogenicity and/or activity. Although guanidine hydrochloride and similar agents are denaturants, this denaturation is not irreversible and renaturation may occur upon removal (by dialysis, for example) or dilution of the denaturant, allowing re-formation of the immunologically and/or biologically active protein of interest. After solubilization, the protein can be separated from other bacterial proteins by standard separation techniques. For further description of purifying recombinant polypeptides from bacterial inclusion body, see, e.g., Patra et al., Protein Expression and Purification 18: 182-190 (2000).
Alternatively, it is possible to purify recombinant polypeptides, e.g., a modified oleosin protein, from bacterial periplasm. Where the recombinant protein is exported into the periplasm of the bacteria, the periplasmic fraction of the bacteria can be isolated by cold osmotic shock in addition to other methods known to those of skill in the art (see e.g., Ausubel et al., supra). To isolate recombinant proteins from the periplasm, the bacterial cells are centrifuged to form a pellet. The pellet is resuspended in a buffer containing 20% sucrose. To lyse the cells, the bacteria are centrifuged and the pellet is resuspended in ice-cold 5 mM MgSO4 and kept in an ice bath for approximately 10 minutes. The cell suspension is centrifuged and the supernatant decanted and saved. The recombinant proteins present in the supernatant can be separated from the host proteins by standard separation techniques well known to those of skill in the art.
2. Standard Protein Separation Techniques for Purification
When a recombinant polypeptide of the present invention, e.g., a modified oleosin protein, is expressed in host cells (such as human cells) in a soluble form, its purification can follow the standard protein purification procedure described below. This standard purification procedure is also suitable for purifying recombinant proteins obtained from chemical synthesis.
i. Solubility Fractionation
Often as an initial step, and if the protein mixture is complex, an initial salt fractionation can separate many of the unwanted host cell proteins (or proteins derived from the cell culture media) from the recombinant protein of interest, e.g., a modified oleosin protein of the present invention. The preferred salt is ammonium sulfate. Ammonium sulfate precipitates proteins by effectively reducing the amount of water in the protein mixture. Proteins then precipitate on the basis of their solubility. The more hydrophobic a protein is, the more likely it is to precipitate at lower ammonium sulfate concentrations. A typical protocol is to add saturated ammonium sulfate to a protein solution so that the resultant ammonium sulfate concentration is between 20% and 30% of saturation. This will precipitate the most hydrophobic proteins. The precipitate is discarded (unless the protein of interest is hydrophobic) and ammonium sulfate is added to the supernatant to a concentration known to precipitate the protein of interest. The precipitate is then solubilized in buffer and the excess salt removed, if necessary, through either dialysis or diafiltration. Other methods that rely on solubility of proteins, such as cold ethanol precipitation, are well known to those of skill in the art and can be used to fractionate complex protein mixtures.
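The volume of saturated ammonium sulfate needed for each cut follows from a simple mass balance. The sketch below assumes that concentrations are expressed as percent of saturation, that the stock is a fully saturated (100%) solution, and that volumes are additive on mixing; these are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def saturated_volume_to_add(volume_ml, start_pct, target_pct):
    """Volume (ml) of saturated ammonium sulfate to add to a solution.

    volume_ml:  current volume of the protein solution
    start_pct:  current ammonium sulfate level, percent of saturation
    target_pct: desired level, percent of saturation

    ASSUMPTIONS: the stock is 100% saturated and volumes are additive.
    Mass balance: start*V + 100*x = target*(V + x), solved for x.
    """
    if not (0 <= start_pct < target_pct < 100):
        raise ValueError("require 0 <= start_pct < target_pct < 100")
    return volume_ml * (target_pct - start_pct) / (100.0 - target_pct)
```

For example, bringing 75 ml of a salt-free solution to 25% of saturation requires 25 ml of saturated stock under these assumptions. Published saturation tables that correct for temperature and non-additive volumes should be preferred where precision matters.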
ii. Size Differential Filtration
Based on a calculated molecular weight, a protein of interest can be separated from proteins of greater and lesser size using ultrafiltration through membranes of different pore sizes (for example, Amicon or Millipore membranes). As a first step, the protein mixture is ultrafiltered through a membrane with a pore size that has a lower molecular weight cut-off than the molecular weight of a protein of interest, e.g., a modified oleosin protein. The retentate of the ultrafiltration is then ultrafiltered against a membrane with a molecular weight cut-off greater than the molecular weight of the protein of interest. The recombinant protein will pass through the membrane into the filtrate. The filtrate can then be chromatographed as described below.
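The two-step logic can be summarized in a small sketch that treats each membrane as an ideal sieve with a sharp cut-off. Real membranes have gradual retention curves, so this is an idealization; the masses and cut-off values in the usage note are hypothetical and the function name is illustrative only.

```python
def two_step_ultrafiltration(masses_kda, low_cutoff_kda, high_cutoff_kda):
    """Return the masses expected in the final filtrate.

    Step 1: the low cut-off membrane retains species at or above
            low_cutoff_kda (smaller species are lost in the filtrate).
    Step 2: the retentate is passed over the high cut-off membrane;
            species below high_cutoff_kda pass into the filtrate.

    ASSUMPTION: ideal membranes with sharp molecular weight cut-offs.
    """
    if low_cutoff_kda >= high_cutoff_kda:
        raise ValueError("low cut-off must be below high cut-off")
    retentate = [m for m in masses_kda if m >= low_cutoff_kda]
    return [m for m in retentate if m < high_cutoff_kda]
```

With a hypothetical mixture of 5, 18, 45 and 120 kDa species and cut-offs of 10 and 30 kDa, only the 18 kDa species remains in the final filtrate.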
iii. Column Chromatography
The proteins of interest (such as a modified oleosin protein of the present invention) can also be separated from other proteins on the basis of their size, net surface charge, hydrophobicity, or affinity for ligands, such as amylose. In addition, antibodies raised against a segment of the protein of interest (e.g., a modified oleosin protein) can be conjugated to column matrices and the target fusion protein can therefore be immunopurified. All of these methods are well known in the art.
Optionally, a cleavage site recognized by a protease may be designed into the coding sequence of the protein of this invention. For example, a cleavage site can be built in the sequence or sequences linking the target protein (e.g., a modified oleosin protein) and one or more affinity tags such as MBP or GST tag(s), such that the tag(s) can be readily removed after protease treatment.
It will be apparent to one of skill that chromatographic techniques can be performed at any scale and using equipment from many different manufacturers (e.g., Pharmacia Biotech).
The following examples are provided by way of illustration only and not by way of limitation. Those of skill in the art will readily recognize a variety of non-critical parameters that could be changed or modified to yield essentially the same or similar results.
Oleosin has a Mushroom- or T-Shaped Structure on a Lipid Droplet.
An oleosin molecule on a lipid droplet (LD) has its N- and C-terminal amphipathic peptides lying on the surface and interacting with the phospholipid (PL) charged/polar moieties, and its central hydrophobic polypeptide of ˜72 residues forming a hairpin structure and penetrating the TAG matrix (1, 2). The 2 hairpin arms could be an alpha-helix (15) or a beta-sheet structure (16). Homology modeling was used to delineate the secondary structures of oleosin (PpOLE1 from Physcomitrella [30]; hereafter named OLE) and revealed a mushroom- or T-shaped oleosin with the hairpin arms largely configuring an alpha-helix structure (
N- and C-Terminal Portions of Oleosin are not Essential for Oleosin Targeting to ER-LDs.
OLE was modified (
In transformed Physcomitrella cells (
Because of the uncertainty of whether the N-portion of oleosin is (25) or is not (21, 22, 31, 32) needed for ER targeting, the N-portion (24 of the 25 residues) and the initial 3 or 6 residues of the hairpin polypeptide were deleted (OLE-GFP [wild type], OLEΔN24-GFP, OLEΔN28-GFP and OLEΔN31-GFP, respectively;
The inventors expanded the N-portion studies to an oleosin of Arabidopsis. Arabidopsis thaliana has 17 oleosins (2), and the one with the shortest N-portion (6-residue N-portion+72-residue hairpin+28-residue C-portion;
Overall, several initial hairpin residues adjacent to the N-portion, but not the N-portion per se, of oleosin are required for the targeting. The several residues of Physcomitrella (NRRQ-VLGL) (SEQ ID NO: 43) and Arabidopsis (EIIQ-AVFS) (SEQ ID NO: 44) oleosins at the junction of the N-portion and the hairpin have no appreciable common denominators. The residues at this junction for oleosin targeting to ER-LDs require further study.
The Highly Conserved PSPP Residues of the Hairpin Loop PX5SPX3P (SEQ ID NO:1) of Oleosin are Required for Oleosin Targeting to ER-LDs.
The 12-residue loop, PX5SPX3P (SEQ ID NO:1), of the oleosin hairpin is highly conserved, and the 3 proline and 1 serine residues (PSPP in discontinuity) are completely conserved among all known oleosins (1, 2). Replacing PSPP with LLLL allows for limited targeting of the modified oleosin to ER-LDs (32). The 4 residues were modified from PSPP to PPPP, SSSS, PYPP and LSLL via the encoded genes, and the mutated genes were transferred into Physcomitrella (
Oleosin with a Shortened Hairpin Shows Reduced Targeting to ER-LDs.
The inventors maintained the N- and C-portions and the hairpin loop of oleosin but shortened each of the 2 hairpin arms from ˜30 (wild type) to 15, 10 and 5 residues via their encoded genes (termed OLE-GFP [wild-type], OLE-15-GFP, OLE-10-GFP and OLE-5-GFP, respectively). In transformed Physcomitrella, the recombinant oleosins showed progressively reduced targeting to LDs in proportion to the hairpin length, from 86% (wild-type) to 31%, 14% and 3%, respectively (
Oleosin with an Added N-Terminal ER-Targeting Peptide and a Shortened Hairpin Enters the ER Lumen.
OLE was modified by adding an N-terminal ER-targeting 21-residue peptide (of Physcomitrella aspartic proteinase [33]) to the N-terminus and shortening each of the 2 hairpin arms from ˜30 (wild type) to 15, 10 and 5 residues (termed s-OLE-GFP, s-OLE-15-GFP, s-OLE-10-GFP and s-OLE-5-GFP, respectively) (
Physcomitrella cells were transformed with DNA constructs encoding s-OLE-GFP, s-OLE-10-GFP (
Oleosin with an Added N-Terminal ER-Targeting Peptide and a Shortened Hairpin Directs Budding LDs into the ER Lumen.
Because s-OLE-10-GFP successfully entered the ER lumen, the inventors tested whether the luminal s-OLE-10-GFP could extract ER-budding LDs to the luminal rather than cytosolic side. The Physcomitrella transient expression system was not used, because the cells would already have had native oleosin-coated solitary LDs in cytosol or ER-budding LDs facing cytosol. Instead, tobacco cells were used for stable transformation and the transformed cells were grown for several generations, such that s-OLE-10-GFP would outcompete native oleosin and extract ER-budding LDs into the ER lumen. In tobacco cells transformed with a DNA construct encoding OLE-GFP or s-OLE-10-GFP (
Modified Oleosin (s-OLE-10-GFP) with a Further Addition of a Vacuole-Targeting Propeptide Directs ER-Luminal LDs to Vacuoles.
Because s-OLE-10-GFP extracted LDs to the ER lumen, the inventors tested whether these luminal LDs were firmly bonded to the ER luminal surface or held in the lumen because of their bulkiness, or whether they could be exported to the cellular exterior via a default pathway or moved to PSVs. They did not observe the export of s-OLE-10-GFP-associated LDs to the cellular exterior of transformed tobacco cells. Therefore, the inventors examined whether s-OLE-10-GFP attached to a PSV-targeting propeptide would guide s-OLE-10-GFP-associated LDs in the ER lumen to PSVs.
The present inventors made a DNA construct encoding s-p-OLE-10-GFP that included a 12-residue PSV-targeting propeptide (of castor ricin [36]) (
Oleosin has a Mushroom- or T-Shaped Structure on the Surface of a LD.
Homology modeling was used to define the structure of oleosin on the surface of a LD (
Overall, the oleosin structure is predicted to be a T- or mushroom-shaped molecule with the hairpin inserted into the TAG matrix of a LD (panel b in
When the length of the hairpin arms was artificially reduced from 30+12+26 residues (first arm+loop+second arm) to 10+12+10 residues and the homology modeling was repeated, the hairpin arms became shorter (lower panel b in
Further Observation and Confirmation of Redistribution of Lipid Droplets (LDs) by Modified Oleosin: LDs are Directed to Endoplasmic Reticulum (ER) and/or Vacuole Instead of Cytosol.
Cells that were treated with digitonin (a mild detergent that breaks the plasma membrane but not the ER membrane) and a commercial protease retained the oleosin-LDs inside the ER lumen. Cells that were further treated with Triton-X (a stronger detergent that breaks also the ER membrane) had the oleosin-LDs proteolyzed (i.e., the ER membrane was broken, allowing the applied protease to enter the ER lumen and hydrolyze the oleosin-LDs).
In addition to delineating the mechanism of oleosin and LD biosynthesis in plant cells, the current study demonstrates a successful redirection of massive LDs originally designated for cytosol to the ER lumen via 3 manipulations: (i) addition of an N-terminal ER-targeting signal to oleosin, (ii) reduction of the oleosin hairpin length, and (iii) lack of abundant pre-existing native oleosins that would have already extracted ER-budding LDs to the cytosolic side or into cytosol. Further addition of a PSV-targeting propeptide to the modified oleosin transports the ER-luminal LDs to PSVs and then large vacuoles in transformed tobacco cells. Without this PSV-targeting propeptide, LDs coated with the recombinant oleosin in the ER lumen did not move to the cellular exterior via a default pathway. This observation may reflect an undefined signal within proteins that could allow for secretion in tobacco cells; such a signal is absent in oleosin. However, in cells of other organisms, LDs coated with modified oleosin in the ER lumen may move to the cell exterior by default.
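The combination of manipulations summarized above can be written as a small lookup, shown here as a hedged sketch. The class name, field names, and returned strings are hypothetical labels chosen for the example; combinations that the summary does not address are reported as such rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OleosinConstruct:
    er_signal: bool        # (i) N-terminal ER-targeting signal added
    short_hairpin: bool    # (ii) hairpin length reduced
    psv_propeptide: bool   # PSV-targeting propeptide added


def ld_destination(c: OleosinConstruct, native_oleosin_abundant: bool = False) -> str:
    """Return the LD destination described in the summary, if it is stated."""
    if not (c.er_signal or c.short_hairpin or c.psv_propeptide):
        return "cytosol"  # unmodified oleosin: LDs go to their original destination
    # (iii) abundant pre-existing native oleosin must be absent
    if c.er_signal and c.short_hairpin and not native_oleosin_abundant:
        if c.psv_propeptide:
            return "PSVs, then large vacuoles"
        return "ER lumen"
    return "not stated in this summary"


print(ld_destination(OleosinConstruct(True, True, True)))
```

Under these assumptions, s-OLE-10-GFP corresponds to `OleosinConstruct(True, True, False)` and s-p-OLE-10-GFP to `OleosinConstruct(True, True, True)`.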
The current work has potential applications in various areas. Earlier studies have shown that after gene transformation and expression, oleosin is correctly targeted to LDs in yeast (37) and mammalian (38) cells. Photosynthetic microbes are being used to produce oils in LDs as renewable biodiesels and high-value products. This industrial production is inefficient, because the microbes must be stressed (thereby stopping growth) to induce LD accumulation and then be killed to extract the LD oils. The photosynthetic microbes could be manipulated to excrete LDs, such that there is no need to stress and then kill the cells. Even if the LDs cannot be excreted but rather are stored in metabolically inert vacuoles, the compartmentation would eliminate metabolic feedback and allow for the continuous synthesis and accumulation of more oils (end metabolite). This can benefit agricultural production of oils in seeds and industrial use of yeast and other microbes to produce high-value lipid-related metabolites in LDs. For obesity treatments, the addition of an apparently inert recombinant oleosin to mammalian cells could lead to the transfer of cytosol-designated LDs to the intracellular secretory pathway for excretion. The present inventors have demonstrated that cytosol-designated LDs can be redirected to the ER lumen, PSVs and then large vacuoles in tobacco cells. Procedures for moving ER-luminal LDs to the cell exterior in plants and other organisms can be explored.
Plant Materials.
Gametophytes of Physcomitrella patens subsp. patens were grown axenically on solid Knop's medium supplemented with micronutrients (30) at 25±1° C. under a 16-h light (60-100 μE m−2 s−1)/8-h dark cycle. The Nicotiana tabacum BY-2 cell line was maintained as described (39).
Transient Expression with Physcomitrella Cells.
Expression constructs encoding OLE and recombinant OLEs (Table 1) and the primers (Table 2) are shown in Supplemental Data. The coding fragments were digested with BamHI and cloned into the expression site of a GFP expression vector (40) or an RFP expression vector (41) driven by a CaMV 35S promoter. A BIP-RFP expression vector of a similar construct (33) was also used. Transformation involved particle bombardment (30). Gold particles of 1.6-μm diameter coated with 5 μg plasmid DNA were bombarded at 900 psi under 28-in Hg vacuum onto 60-day-old leafy tissue from a distance of 6 cm in a PDS-1000 (BIO-RAD, Hercules, Calif.). The bombarded tissue was observed with CLSM (Zeiss 510M for Physcomitrella, and Leica SP5 for tobacco) at time intervals. GFP and RFP were excited at 488 and 543 nm, and emission was detected at 500-530 and 565-615 nm, respectively.
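The GFP and RFP detection settings can be recorded as a small configuration and checked for overlapping emission windows, as in the sketch below. The dictionary layout and the helper name are assumptions made for illustration; only the wavelengths come from the text.

```python
# Excitation lines and emission windows (nm) as stated for GFP and RFP.
CHANNELS = {
    "GFP": {"excitation_nm": 488, "emission_nm": (500, 530)},
    "RFP": {"excitation_nm": 543, "emission_nm": (565, 615)},
}


def windows_overlap(a: tuple, b: tuple) -> bool:
    """Return True when two closed (low, high) wavelength windows share any value."""
    return a[0] <= b[1] and b[0] <= a[1]


gfp = CHANNELS["GFP"]["emission_nm"]
rfp = CHANNELS["RFP"]["emission_nm"]
print(windows_overlap(gfp, rfp))  # False: the two detection windows are disjoint
```

Disjoint windows mean the two signals can be assigned to separate channels by wavelength alone; the check says nothing about spectral bleed-through, which depends on the fluorophores' full emission spectra.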
Transformation of Tobacco BY-2 Cells.
Agrobacterium-mediated transformation of BY-2 cells was as described (39). The expression vectors are shown in Table 1. Agrobacterium tumefaciens (strain GV3101) with the binary expression vector (100 μL) at OD600 ~0.5 was added to 4 ml of 3-d-old suspension cells. After co-cultivation at 25° C. for 2 d, cells were collected by centrifugation at 500 × g for 2 min, washed 3 times with liquid medium containing 500 mg·L−1 carbenicillin, and transferred to solid BY-2 medium containing 500 mg·L−1 carbenicillin and selection antibiotic, 50 mg·L−1 kanamycin and/or 20 mg·L−1 hygromycin B. Expression was observed with CLSM.
Staining of LDs and ER.
LDs were stained with Nile Red (42). ER was stained with ER-Tracker-Red (BODIPY TR Glibenclamide, Invitrogen, Carlsbad, Calif.). Tissue was placed in a solution containing Nile Red stock (100 mg/ml DMSO) or ER-Tracker-Red stock (100 μg/110 μl DMSO) diluted 100× with 1× phosphate buffered saline (PBS: 10 mM K phosphate, pH 7.4, 138 mM NaCl and 2.7 mM KCl) for 10 min, washed with PBS twice, and observed with CLSM. Nile Red and ER-Tracker-Red were excited at 543 and 594 nm, and emission was detected at 565-615 and 610-650 nm, respectively.
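The working concentrations implied by the 100× dilution follow from simple arithmetic, sketched below. The function name is hypothetical, and the stock values are taken as stated in the text without independent verification.

```python
def working_concentration(stock_mass: float, stock_volume: float, dilution: float) -> float:
    """Mass per volume after diluting a stock by the given factor."""
    return stock_mass / stock_volume / dilution


# Nile Red: 100 mg per 1 ml DMSO, diluted 100x -> mg/ml
nile_red_mg_per_ml = working_concentration(100, 1, 100)

# ER-Tracker-Red: 100 ug per 110 ul DMSO, diluted 100x -> ug/ul, then ug/ml
er_tracker_ug_per_ml = working_concentration(100, 110, 100) * 1000

print(f"Nile Red: {nile_red_mg_per_ml:.2f} mg/ml")
print(f"ER-Tracker-Red: {er_tracker_ug_per_ml:.2f} ug/ml")
```

With the stated stocks, this gives 1 mg/ml for Nile Red and about 9.09 μg/ml for ER-Tracker-Red in the staining solution.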
Fluorescence Protease Protection Assay.
The assay was modified from Lorenz et al. (34). Physcomitrella cells were transformed with DNA constructs encoding s-OLE or s-OLE-10 (attached to GFP or RFP) and BIP-RFP. After 12 h, cells were incubated in 1×PBS for 10 min, washed with PBS twice, and permeabilized with 25 μg/mL digitonin for 10 min. Then, 4 mM trypsin in PBS was added, and digestion was allowed to proceed for 20 min. Fluorescence was observed with CLSM before and after trypsin treatment.
Electron Microscopy.
Tissue was fixed via high-pressure freezing or chemical fixation. For freezing fixation, tissue was fixed in a high-pressure freezer (Leica EM PACT2) and then subjected to freeze substitution in ethanol containing 0.2% glutaraldehyde and 0.1% uranyl acetate in a Leica AFS system and embedded in LR Gold resin (Structural Probe, West Chester, Pa.). For chemical fixation, tissue was fixed with 2.5% glutaraldehyde, 4% paraformaldehyde and 0.1 M K-phosphate (pH 7.0) at 4° C. for 24 h. Materials were washed twice with 0.1 M K-phosphate buffer (pH 7.0) for 10 min each and treated with 1% OsO4 in 0.1 M K-phosphate (pH 7.0) at 24° C. for 4 h. Fixed materials were rinsed with 0.1 M K-phosphate buffer (pH 7.0), dehydrated in an acetone series and embedded in Spurr resin. Ultrathin sections (70-90 nm) were stained with uranyl acetate and lead citrate and examined with a Philips CM 100 TEM at 80 kV.
All patents, patent applications, and other publications, including GenBank Accession Numbers, cited in this application are incorporated by reference in the entirety for all purposes.
This application claims priority to U.S. Provisional Patent Application No. 62/511,494, filed on May 26, 2017, the contents of which are hereby incorporated by reference in the entirety for all purposes.
Number | Date | Country
---|---|---
62511494 | May 2017 | US