The invention relates to a process of producing a heterologous protein of interest in genetically-modified plant cells, in particular in plastids thereof, and to proteins produced thereby.
Plants have become an attractive choice as a low-cost, indefinitely scalable production system for recombinant proteins, including those for pharmaceutical and industrial application (Stoger et al., 2000, Plant Mol. Biol., 42, 583-590; Larrick & Thomas, 2001, Curr. Opin. Biotechnol., 12, 411-418). Most proteins, especially in the pharmaceutical area, are secretory, i.e. they are initially translated as protein precursors carrying signal peptides which target them to the endoplasmic reticulum (ER) for further post-translational processing and compartmentalization. Said processing includes correct folding and assembly, disulfide bond formation and complex enzymatic modifications. However, post-translational protein modifications in the ER of animal and plant cells can differ significantly, especially in the glycosylation pattern, as plants synthesize different types of carbohydrates attached to glycosylation sites (Wilson, I. B., 2002, Curr. Opin. Struct. Biol., 12, 569-577; Schillberg, Fischer & Emans, 2003, Cell Mol. Life. Sci., 60, 433-445). In many cases it would be an advantage to avoid plant-specific post-translational modifications, especially glycosylation of plant-produced pharmaceutical proteins, as they may cause allergenic responses in patients. Chloroplasts have their own protein quality control system and, like the ER, can provide for correct disulfide bond formation and protein folding (Dickson et al., 2000, J. Biol. Chem., 275, 11829-11835). Glycosylation that does not play any role in protein activity can be avoided by either of two approaches: a) targeting the protein of interest into a different subcellular compartment for processing, for example into chloroplasts; b) expressing the protein of interest in plant plastids by engineering transplastomic plants. The latter approach is not suitable for expressing proteins requiring an N-terminal amino acid residue other than methionine (M).
There was an attempt to address this issue by expressing the human secretory protein somatotropin in transplastomic tobacco as a ubiquitin fusion in order to obtain the required N-terminus upon cleavage of ubiquitin (Staub et al., 2000, Nature Biotechnol., 18, 333-338). Ubiquitin fusion proteins are cleaved by ubiquitin protease immediately downstream of the C-terminal residue of ubiquitin, thus allowing production of recombinant proteins containing N-terminal residues of choice except proline (Baker, R. T., 1996, Curr. Opin. Biotechnol., 7, 541-546). However, there is no ubiquitin-specific protease in chloroplasts, and processing of the ubiquitin-somatotropin fusion took place only during the protein extraction period, giving a high level (up to 70%) of unprocessed somatotropin.
There have been attempts (U.S. Pat. No. 6,063,601; U.S. Pat. No. 6,130,366) to target heterologous proteins into plastids of plant cells after expression outside the plastids. This requires fusion of the protein of interest with an N-terminal transit peptide. The transit peptide has the function of facilitating translocation through the membranes of the plastid. Inside the plastid, the transit peptide is cleaved off by a plastid protease. Details of the cleavage sites of natural plastid-targeted plant proteins have been investigated (Gavel & Von Heijne 1990, FEBS Lett., 261, 455-458). These investigations concern well-adapted natural proteins of certain plants. Heterologous proteins, however, are not adapted to plant plastids by evolution. Therefore, there is the general problem of constructing fusion proteins that combine a transit peptide, a heterologous protein of interest, and a sequence around the prospective cleavage position that secures efficient and defined cleavage of the fusion protein, so that the desired protein of interest is produced in high quality, notably with high N-terminal sequence accuracy.
Therefore, it is the problem of the invention to provide a process for producing a heterologous plastid-targeted protein of interest, whereby the desired N-terminal sequence of said protein of interest is readily obtained with high reliability.
This problem is solved by a process of producing a protein of interest in cells of a predetermined plant, comprising:
introducing into said cells a vector encoding a fusion protein comprising in the direction from the N-terminus to the C-terminus
(i) a transit peptide for targeting said fusion protein into plastids and, contiguous thereto,
(ii) said protein of interest,
wherein the C-terminal three amino acids X−3X−2X−1 of said transit peptide and the N-terminal amino acid Z of said protein of interest form a cleavage site
for cleaving said fusion protein between X−1 and Z for releasing said protein of interest in plastids, whereby for a predetermined amino acid Z of said protein of interest an amino acid sequence X−3X−2X−1 is selected from the set of amino acid sequences X−3X−2X−1 naturally occurring contiguous to Z on the N-terminal side of Z in plastid-targeted fusion proteins in plants, thereby forming said cleavage site.
Herein, the single letter code is generally used for the 20 standard amino acids. Similarly, the symbols X−3, X−2, X−1, and Z each stand for a standard amino acid. The numerals -1, -2, and -3 indicate the position of the respective amino acid in the sequence of the transit peptide, counted from the cleavage position in the direction towards the N-terminus of the transit peptide.
It is generally accepted that the cleavage accuracy and efficiency depend on the sequence of the transit peptide and of the plastid-targeted natural protein in the vicinity of the cleavage position. However, general knowledge that can be extended to combinations of an arbitrary transit peptide and a selected heterologous protein of interest is not available.
The length of the C-terminal sequence of the transit peptide and the length of the N-terminal sequence of the protein of interest that jointly determine the cleavage position have been uncertain. For natural proteins it is reasonable to assume that these two sequences have been mutually adjusted during evolution. For a heterologous protein to be targeted to plastids, the appropriate transit peptide has to be determined by trial and error. The success of said trial and error is not predictable. With sheer luck a suitable combination of sequences may be found. However, the uncertainty is great and not calculable.
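The position numbering defined above may be illustrated by a minimal sketch in Python. The function name `cleavage_site`, the parameter `tp_length` and the example sequence are hypothetical and serve only to show how X−3X−2X−1 and Z are read from a fusion protein when the length of the transit peptide is known.

```python
def cleavage_site(fusion: str, tp_length: int) -> tuple[str, str]:
    """Return (X-3 X-2 X-1, Z) for a fusion protein in one-letter code.

    `fusion` is read from the N-terminus to the C-terminus; the transit
    peptide is assumed to occupy the first `tp_length` residues.
    """
    if tp_length < 3 or tp_length >= len(fusion):
        raise ValueError("need at least 3 transit-peptide residues followed by a protein")
    triad = fusion[tp_length - 3:tp_length]  # X-3, X-2, X-1: last three residues of the transit peptide
    z = fusion[tp_length]                    # Z: first residue of the protein of interest
    return triad, z


# Hypothetical 10-residue example: a 6-residue "transit peptide" followed by a protein.
print(cleavage_site("MKLLVSAFPT", 6))
```

Cleavage between X−1 and Z corresponds to cutting the string between index `tp_length - 1` and index `tp_length`.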
We have surprisingly found that the problem of finding a C-terminal sequence of the transit peptide adapted to the N-terminal sequence of the protein of interest can be solved with a greater success rate based on a hierarchy of considerations. Within this hierarchy, it has been found that it is sufficient to primarily consider only the N-terminal amino acid of the heterologous protein of interest and the three last amino acids of the transit peptide. Notably, it has been found that each N-terminal amino acid of the protein of interest is correlated with a certain set of suitable C-terminal amino acid triads of the transit peptide.
In the process of the invention, a vector encoding said fusion protein is introduced into plant cells. Said protein of interest may then be expressed in said plant cells, e.g. in cell culture. Preferably, said protein of interest is expressed in whole plants. This may be achieved by regenerating plants from plant cells transformed with said vector. Alternatively, said vector is introduced into plant cells of a whole plant. Several transformation or transfection methods for plants or plant cells are known in the art and include Agrobacterium-mediated transformation, particle bombardment, PEG-mediated protoplast transformation, viral infection etc. For the preferred embodiment of transient expression, viral infection or Agrobacterium-mediated transformation is advantageously employed. Plants or plant cells are transformed or transfected with a nucleotide sequence (vector) having a coding region encoding said fusion protein. Transformation may produce stably transformed plants or plant cells, e.g. transgenic plants. Alternatively, said plant or plant cells may be transfected for transient expression of said fusion protein. Transient transfection of fully grown plants is most preferred.
Said vector may be a DNA or an RNA vector depending on the transformation or transfection method. In most cases, it will be DNA. In an important embodiment, however, transformation or transfection is performed using RNA virus-based vectors, in which case said nucleotide sequence may be RNA. One very convenient way is to use a DNA vector that is based on a virus. Preferably, the DNA vector is based on an RNA virus, i.e. the DNA vector contains the cDNA of RNA viral sequences in addition to said nucleotide sequence. Examples of plant DNA or RNA viruses whose sequences may be used for viral vectors according to the invention are given in WO 02/29068 and in WO0288369. Such DNA vectors further contain a transcriptional promoter for producing the RNA viral transcript. In these embodiments, transformation or transfection is preferably carried out by viral transfection, more preferably via Agrobacterium-mediated transformation.
For an N-terminal amino acid Z of a protein of interest to be produced by the process of the invention, an amino acid triad X−3X−2X−1 is selected from the set of amino acid sequences X−3X−2X−1 naturally occurring on the N-terminal side of Z contiguous to Z in plastid-targeted fusion proteins in plants. Preferably, said protein of interest is produced in angiosperms. In this case, X−3X−2X−1 triads naturally occurring in angiosperms are selected for a predetermined Z. More preferably, the amino acid sequence (or triad) X−3X−2X−1 is selected from the set of amino acid sequences X−3X−2X−1 naturally occurring contiguous to Z in plastid-targeted fusion proteins in the family, more preferably the genus, of plants of which said predetermined plant is a member. Most preferably, the amino acid sequence X−3X−2X−1 is selected from the set of amino acid sequences X−3X−2X−1 naturally occurring contiguous to Z in plastid-targeted fusion proteins of plants of the same species as said predetermined plant.
If Z is A, said triad X−3X−2X−1 is preferably ASN, ACR, MA, FVA, HVR, ICC, IGA, IRA, IRC, ISA, ISC, IQC, QIR, KTK, KAK, PLQ, PIA, PIQ, RMG, RCM, RAQ, RVK, SAA, SCT, SLA, SIC, SIV, TCQ, TAM, TAQ, TCK, VCK, VAM, VVA, VKA, VRA, VTR, VGA, VVR, VVY, VVQ, VSC, VVC, or VFA.
Using the above combinations of triads X−3X−2X−1 with a predetermined N-terminal residue Z of a protein of interest, the probability of finding a suitable transit peptide for a chosen protein of interest, such that the resulting fusion protein is efficiently targeted to plastids and cleaved to release the protein of interest with high accuracy of the N-terminal end, is significantly increased compared to a prior art process that relies on luck. If more than one X−3X−2X−1 is possible for a certain Z, the efficiencies of these X−3X−2X−1 triads may differ. A particularly suitable X−3X−2X−1 amino acid triad for a certain protein of interest may be selected by an assay comprising the following steps:
Instead of putting said reporter protein on the C-terminal side of X−3X−2X−1, said protein of interest may be placed contiguous to X−3X−2X−1, whereby said protein of interest may be followed by a reporter protein. If said protein of interest is used in said assay, the X−3X−2X−1 triad selected in the assay has a higher probability of giving efficient targeting and cleaving in the process of the invention.
For simplicity, the assay is preferably performed in plant cell culture, whereby the cells may derive from the same plant as is used for producing the protein of interest. Alternatively, leaves of a plant may be transiently transfected with the vector of the assay. Both methods allow a straightforward assessment of targeting of the fusion protein to plastids and of cleavage of the fusion protein. Targeting of the fusion protein to plastids may e.g. be assessed using a fluorescent reporter protein like green fluorescent protein (GFP) together with fluorescence microscopy as has been done for
Regarding said transit peptide to be used in the process of the invention, there are no particular limitations. It is however of advantage to use a plastid transit peptide known from a plant that is related to the predetermined plant used for the process of the invention. If the process of the invention is carried out in dicot plant cells or in dicot plants, a highly preferred transit peptide has the sequence, in N-terminal to C-terminal direction, MASSMLSSM WATRASAAQ ASMVAPFTGL KSMSFPVTR KQNNLDITSI ASNGGR X−3X−2X−1. If the process of the invention is carried out in monocot plant cells or in monocot plants, a highly preferred transit peptide has the sequence, in N-terminal to C-terminal direction, MAPTVMASSA TTVAPFQGLK STAGRLPVAR RSSGSLGSVS NGGRX−3X−2X−1.
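Assembly of the amino acid sequence of a fusion protein from a transit peptide, a selected triad and a protein of interest may be sketched in Python as follows. The function name `build_fusion` is hypothetical. The monocot transit peptide sequence is taken as printed in the preceding paragraph, with the grouping spaces removed; the triad VCK is one of the triads listed for Z being A, and the short "protein of interest" is a hypothetical placeholder.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def build_fusion(tp_core: str, triad: str, protein: str) -> str:
    """Concatenate transit peptide core, X-3 X-2 X-1 triad and protein of interest.

    Whitespace used for grouping residues in printed sequences is removed.
    """
    tp_core = "".join(tp_core.split())
    triad = "".join(triad.split())
    protein = "".join(protein.split())
    if len(triad) != 3:
        raise ValueError("X-3 X-2 X-1 must consist of exactly three residues")
    for name, seq in (("transit peptide", tp_core), ("triad", triad), ("protein", protein)):
        unknown = set(seq) - STANDARD_AMINO_ACIDS
        if not seq or unknown:
            raise ValueError(f"{name}: empty or non-standard residues {sorted(unknown)}")
    return tp_core + triad + protein


# Monocot consensus transit peptide as printed above (without X-3 X-2 X-1).
MONOCOT_TP = "MAPTVMASSA TTVAPFQGLK STAGRLPVAR RSSGSLGSVS NGGR"

# Hypothetical protein of interest whose N-terminal residue Z is A.
fusion = build_fusion(MONOCOT_TP, "VCK", "AKTEL")
print(fusion)
```

The expected cleavage position then lies directly after the triad, i.e. after residue `len(fusion) - len("AKTEL")` counted from the N-terminus.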
No particular limitations exist regarding the protein of interest to be produced according to the invention. The protein of interest may be of bacterial, viral, plant, or animal origin or it may be artificially designed. Said protein of interest may be an agricultural trait, a human or animal health protein, an immune response protein, a polypeptide hormone, etc.
A: Western blot with anti-hGH antibodies. Lane C—mature hGH (control); lanes 1,2—hGH expressed from pICH14061A; lanes 3,4—hGH expressed from pICH14061B. U—hGH precursor (unprocessed); M—mature, correctly processed hGH; S—incorrectly processed (small) hGH.
B: Detailed schemes of the T-DNA regions of the binary vectors pICH14061A and pICH14061B. TP—transit peptide; P—transcriptional promoter; T—transcriptional terminator; NPT—neomycin phosphotransferase; NTR—3′ non-translated region of tobacco mosaic virus. Amino acid triads R—F—N (pICH14061A) and P—S—R (pICH14061B) are given in the one-letter amino acid code in the direction from the N-terminal side to the C-terminal side.
The general principle of the invention is the following: a gene encoding the fusion protein TP(XXX)-(Z)P (from the N-terminus to the C-terminus) comprising
The data shown in Table 1 are the result of transit peptide cleavage site analysis for approximately 400 nuclear-encoded chloroplast targeted proteins from publicly available databases. Some X−3X−2X−1 triads correspond to the cleavage motif (I/V)-X-(A/C)-A suggested by Gavel & Von Heijne 1990, FEBS Lett., 261, 455-458. For compiling Table 1, we have taken into account the possibility that 1-2 amino acid residues are removed from the N-terminus after cleaving off the transit peptide (Emanuelsson, Nielsen & Heijne, Protein Sci., 8, 978-984). We found appropriate X−3X−2X−1 triads for almost all possible N-terminal amino acid residues Z, except for tryptophan (Trp, W). However, according to the N-end rule, W at the N-terminus destabilizes proteins in eukaryotic and prokaryotic cells, reducing the protein half-life to 2-3 min (Varshavsky, A., 1996, Proc. Natl. Acad. Sci. USA, 93, 12142-12149).
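The kind of tabulation underlying such an analysis may be sketched in Python as follows: for each annotated precursor, the last three residues of the transit peptide are recorded under the first residue of the mature protein. This is a simplified sketch under stated assumptions. The function name `compile_triad_table` and the input records are hypothetical, and the sketch does not model the possible removal of 1-2 N-terminal residues after cleavage mentioned above.

```python
from collections import defaultdict
from typing import Iterable


def compile_triad_table(sites: Iterable[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each N-terminal residue Z to the sorted triads X-3 X-2 X-1 seen before it.

    Each record is (transit_peptide, mature_protein) in one-letter code.
    Records with fewer than three transit-peptide residues or an empty
    mature protein are skipped.
    """
    table: dict[str, set[str]] = defaultdict(set)
    for transit_peptide, mature in sites:
        if len(transit_peptide) < 3 or not mature:
            continue
        table[mature[0]].add(transit_peptide[-3:])
    return {z: sorted(triads) for z, triads in sorted(table.items())}


# Hypothetical toy records (NOT taken from any database).
toy_sites = [
    ("MASSVCK", "AKT"),
    ("MAAASN", "ALL"),
    ("MGGRFN", "FPT"),
    ("MA", "AAA"),  # too short, skipped
]
print(compile_triad_table(toy_sites))
```

A residue Z for which no precursor is found, as reported above for tryptophan, simply has no entry in the resulting mapping.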
We also used sequence alignments of transit peptides of the small subunit of RUBISCO from different plant species in order to build artificial consensus transit peptides suitable for efficient targeting of proteins of interest into the plastids of dicotyledonous (
The sequences designed as described in example 1 can be used for testing all possible combinations of a specific Z with X−3X−2X−1 triads. The scheme of the experiment is described in example 2.
In example 3 of the invention we describe the use of the artificial transit peptide for delivery of the human growth hormone (hGH) somatotropin and human interferon alpha 2b into the chloroplasts of Nicotiana benthamiana plants. Both proteins are secretory and have in the processed form (after cleaving off the transit peptide) an N-terminus starting from phenylalanine (F) for somatotropin and from cysteine (C) for interferon alpha 2b.
Various methods can be used to deliver a DNA or RNA vector into the plant cell, including direct introduction of said vector into a plant cell by means of microprojectile bombardment, electroporation or PEG-mediated treatment of protoplasts (for review see: Gelvin, S. B., 1998, Curr. Opin. Biotechnol., 9, 227-232; Hansen & Wright, 1999, Trends Plant Sci., 4, 226-231). Plant RNA and DNA viruses are also efficient delivery systems (Hayes et al., 1988, Nature, 334, 179-182; Palmer et al., 1999, Arch. Virol., 144, 1345-1360; Lindbo et al., 2001, Curr. Opin. Plant. Biol., 4, 181-185). Said vectors can deliver a transgene either for stable integration into the genome of the plant (direct or Agrobacterium-mediated DNA integration) or for transient expression of the transgene (“agroinfiltration”).
Preferred plants for use in this invention include any plant species, with preference given to agronomically and horticulturally important species. Common crop plants for use in the invention include alfalfa, barley, beans, canola, cowpeas, cotton, corn, clover, lotus, lentils, lupine, millet, oats, peas, peanuts, rice, rye, sweet clover, sunflower, sweetpea, soybean, sorghum, triticale, yam beans, velvet beans, vetch, wheat, wisteria, and nut plants. Plant species preferred for practicing this invention include but are not restricted to representatives of Gramineae, Compositae, Solanaceae and Rosaceae.
Additionally, preferred species for use in the invention are plants from the genera: Arabidopsis, Agrostis, Allium, Antirrhinum, Apium, Arachis, Asparagus, Atropa, Avena, Bambusa, Brassica, Bromus, Browallia, Camellia, Cannabis, Capsicum, Cicer, Chenopodium, Cichorium, Citrus, Coffea, Coix, Cucumis, Cucurbita, Cynodon, Dactylis, Datura, Daucus, Digitalis, Dioscorea, Elaeis, Eleusine, Festuca, Fragaria, Geranium, Glycine, Helianthus, Hemerocallis, Hevea, Hordeum, Hyoscyamus, Ipomoea, Lactuca, Lens, Lilium, Linum, Lolium, Lotus, Lycopersicon, Majorana, Malus, Mangifera, Manihot, Medicago, Nemesia, Nicotiana, Onobrychis, Oryza, Panicum, Pelargonium, Pennisetum, Petunia, Pisum, Phaseolus, Phleum, Poa, Prunus, Ranunculus, Raphanus, Ribes, Ricinus, Rubus, Saccharum, Salpiglossis, Secale, Senecio, Setaria, Sinapis, Solanum, Sorghum, Stenotaphrum, Theobroma, Trifolium, Trigonella, Triticum, Vicia, Vigna, Vitis, Zea, and the Olyreae, the Pharoideae and many others.
Within the scope of this invention, plant species which are not included in the food or feed chain are particularly preferred for producing pharmaceutical and technical proteins. Among those, Nicotiana species are the most preferred, as they are easy to transform and to cultivate and well-developed expression vector systems (especially viral vectors) are available for them.
Genes of interest, their fragments (functional or non-functional) and their artificial derivatives that can be expressed as the cellular process of interest and isolated using the present invention include, but are not limited to: starch modifying enzymes (starch synthase, starch phosphorylation enzyme, debranching enzyme, starch branching enzyme, starch branching enzyme II, granule bound starch synthase), sucrose phosphate synthase, sucrose phosphorylase, polygalacturonase, polyfructan sucrase, ADP glucose pyrophosphorylase, cyclodextrin glycosyltransferase, fructosyl transferase, glycogen synthase, pectin esterase, aprotinin, avidin, bacterial levansucrase, E. coli glgA protein, MAPK4 and orthologues, nitrogen assimilation/metabolism enzyme, glutamine synthase, plant osmotin, 2S albumin, thaumatin, site-specific recombinase/integrase (FLP, Cre, R recombinase, Int, SSVI Integrase R, Integrase phiC31, or an active fragment or variant thereof), isopentenyl transferase, Sca M5 (soybean calmodulin), coleopteran type toxin or an insecticidally active fragment, ubiquitin conjugating enzyme (E2) fusion proteins, enzymes that metabolise lipids, amino acids, sugars, nucleic acids and polysaccharides, superoxide dismutase, inactive proenzyme form of a protease, plant protein toxins, traits altering fiber in fiber producing plants, Coleopteran active toxin from Bacillus thuringiensis (Bt2 toxin, insecticidal crystal protein (ICP), CryIC toxin, delta endotoxin, polypeptide toxin, protoxin etc.), insect specific toxin AaIT, cellulose degrading enzymes, E1 cellulase from Acidothermus cellulolyticus, lignin modifying enzymes, cinnamoyl alcohol dehydrogenase, trehalose-6-phosphate synthase, enzymes of cytokinin metabolic pathway, HMG-CoA reductase, E. coli inorganic pyrophosphatase, seed storage protein, Erwinia herbicola lycopene synthase, ACC oxidase, pTOM36 encoded protein, phytase, ketohydrolase, acetoacetyl CoA reductase, PHB (polyhydroxybutanoate) synthase, acyl carrier protein, napin, EA9, non-higher plant phytoene synthase, pTOM5 encoded protein, ETR (ethylene receptor), plastidic pyruvate phosphate dikinase, nematode-inducible transmembrane pore protein, trait enhancing photosynthetic or plastid function of the plant cell, stilbene synthase, an enzyme capable of hydroxylating phenols, catechol dioxygenase, catechol 2,3-dioxygenase, chloromuconate cycloisomerase, anthranilate synthase, Brassica AGL15 protein, fructose 1,6-bisphosphatase (FBPase), AMV RNA3, PVY replicase, PLRV replicase, potyvirus coat protein, CMV coat protein, TMV coat protein, luteovirus replicase, MDMV messenger RNA, mutant geminiviral replicase, Umbellularia californica C12:0 preferring acyl-ACP thioesterase, plant C10 or C12:0 preferring acyl-ACP thioesterase, C14:0 preferring acyl-ACP thioesterase (luxD), plant synthase factor A, plant synthase factor B, Δ6-desaturase, protein having an enzymatic activity in the peroxisomal β-oxidation of fatty acids in plant cells, acyl-CoA oxidase, 3-ketoacyl-CoA thiolase, lipase, maize acetyl-CoA-carboxylase, 5-enolpyruvylshikimate-3-phosphate synthase (EPSP), phosphinothricin acetyl transferase (BAR, PAT), CP4 protein, ACC deaminase, protein having posttranslational cleavage site, DHPS gene conferring sulfonamide resistance, bacterial nitrilase, 2,4-D monooxygenase, acetolactate synthase or acetohydroxyacid synthase (ALS, AHAS), polygalacturonase, Taq polymerase, bacterial nitrilase, many other enzymes of bacterial or phage origin including restriction endonucleases, methylases, DNA and RNA ligases, DNA and RNA polymerases, reverse transcriptases, nucleases (DNases and RNases), phosphatases, transferases etc.
Our invention can also be used for the purpose of molecular farming and purification of commercially valuable and pharmaceutically important proteins including industrial enzymes (cellulases, lipases, proteases, phytases etc.) and fibrous proteins (collagen, spider silk protein, etc.). Any human or animal health protein can be expressed and purified using the approach described in our invention. Examples of such proteins of interest include inter alia immune response proteins (monoclonal antibodies, single chain antibodies, T cell receptors etc.), antigens including those derived from pathogenic microorganisms, colony stimulating factors, relaxins, polypeptide hormones including somatotropin (hGH) and proinsulin, cytokines and their receptors, interferons, growth factors and coagulation factors, enzymatically active lysosomal enzyme, fibrinolytic polypeptides, blood clotting factors, trypsinogen, α1-antitrypsin (AAT), human serum albumin, glucocerebrosidases, native cholera toxin B as well as function-conservative proteins like fusions, mutant versions and synthetic derivatives of the above proteins.
a) Transit Peptide for Plastid Targeting in Dicotyledonous Plants
The consensus amino acid sequence of nine chloroplast targeting transit peptides from rubisco small subunit precursor proteins (rbcs) of different dicotyledonous plants was generated by sequence analysis with the DNASTAR software package (
b) Transit Peptide for Targeting in Monocotyledonous Plants
The same strategy as described above was used to create an artificial chloroplast targeting signal sequence for the expression and plastid targeting in monocot plants. The consensus amino acid sequence derived from six chloroplast transit peptides from rbcs-proteins of different monocot plants was generated by sequence analysis with the DNASTAR software package (
In order to test the efficiency of chloroplast targeting with the help of artificial transit peptides, the plasmids encoding the transit peptide—reporter gene fusion were delivered into leaf cells of tobacco and wheat with the help of microprojectile bombardment. The results showed an efficient GFP targeting into chloroplasts of both dicotyledonous and monocotyledonous plant species with the help of artificial transit peptides (
In order to produce a required N-terminus for a protein of interest targeted into the chloroplast, we have analyzed approximately 400 predicted or experimentally identified transit peptide cleavage sites of nuclear-encoded chloroplast targeted proteins from publicly available databases. The results of such analysis are shown in the Table 1. The cleavage sites can be tested for their suitability to provide a desired N-terminus to the protein of interest by using the constructs shown in
The coding sequences for protein fusions shown in the
Number | Date | Country | Kind |
---|---|---|---|
103 21 963.3 | May 2003 | DE | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP04/05151 | 5/13/2004 | WO | 00 | 12/13/2007 |