The present invention relates to leader sequences that are useful for production of heterologous secretable polypeptides; heterologous secreted polypeptides; nucleic acid constructs encoding such leader sequences; nucleic acid constructs encoding such heterologous secretable polypeptides; vectors that contain such nucleic acid constructs; recombinant host cells that contain such nucleic acid constructs, vectors and polypeptides; and methods of making such secretable polypeptides with such heterologous leader sequences; and methods of using such secretable polypeptides.
Proteins are the most prominent biomolecules in living organisms. In addition to being structural components and catalysts, they play crucial roles in regulatory processes. The cooperation of numerous cellular and extracellular proteins controls and affects the regulation of cell proliferation and metabolism. For example, many signal transduction pathways that affect physiological responses operate through proteins via intermolecular interactions.
Extracellular proteins, sometimes referred to as “secreted proteins,” or “secretable proteins” herein, often function as intercellular signal communicators. In this role, they act as ligands. Their counterpart, the membrane-associated receptors that have extracellular, intracellular, or cytoplasmic domains, transmit extracellular signals into the cells when ligand/receptor binding events take place on the cell surfaces.
While receptors often make potentially important therapeutic targets, secretable proteins are of particular interest as therapeutic agents. Because of their frequent involvement in signaling or hormonal pathways, secretable proteins tend to exhibit high and specific biological activities (Schoen, 1994). For example, secretable proteins have been reported to control or regulate physiological processes such as differentiation and proliferation, blood clotting and thrombolysis, somatic growth and cell death, as well as various immune responses (Id.). Significant resources and research efforts have been expended on discovering new secretable proteins and investigating their regulatory functions. Some of these secretable proteins, including cytokines and peptide hormones, have been manufactured and used as therapeutic agents (Zavyalov et al. 1997), but they constitute a minority amongst the thousands of proteins that are expected to be secreted and potentially efficacious therapeutically.
Typically, a secretable protein is expressed as a full-length polypeptide, sometimes referred to as a “protein precursor,” which is then processed in the Endoplasmic Reticulum (ER) and the Golgi in the post-translational phase. During this phase, a signal peptidase cleaves off a characteristic hydrophobic amino acid sequence at the N-terminus, a sequence that is generally referred to as a “signal peptide” (SP) or a “secretory leader sequence.” A typical SP is about 16 to 30 amino acid residues in length. The resulting polypeptide sans the SP is then exported to outside the cell. The resulting polypeptide is called a “mature protein” or a “secreted polypeptide.” And compared to the original secretable protein, this mature protein lacks the signal peptide sequence. Some proteins do not have an SP at the N-terminus, such as some of the members in the fibroblast growth factor family.
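The processing step described above can be illustrated with a minimal sketch. The precursor sequence and the cleavage position used here are hypothetical placeholders chosen only for illustration; in practice the cleavage site would be predicted or determined experimentally, and the function name is an assumption of this sketch rather than part of the invention.

```python
from typing import Tuple

# Typical signal peptide length range stated in the text (about 16 to 30 residues).
TYPICAL_SP_MIN, TYPICAL_SP_MAX = 16, 30


def cleave_signal_peptide(precursor: str, sp_length: int) -> Tuple[str, str]:
    """Split a precursor into (signal peptide, mature protein).

    sp_length is an assumed, externally supplied cleavage position;
    this sketch does not predict where signal peptidase cuts.
    """
    if not 0 < sp_length < len(precursor):
        raise ValueError("sp_length must fall inside the precursor sequence")
    return precursor[:sp_length], precursor[sp_length:]


# Hypothetical sequences, made up for illustration only.
hypothetical_sp = "MKLLLAVLLAALSGAQAASA"
hypothetical_mature = "DEVKRTPQ"
precursor = hypothetical_sp + hypothetical_mature

sp, mature = cleave_signal_peptide(precursor, len(hypothetical_sp))
sp_is_typical_length = TYPICAL_SP_MIN <= len(sp) <= TYPICAL_SP_MAX
```

The mature protein returned here is simply the precursor without its N-terminal signal peptide, which mirrors the relationship between the “protein precursor” and the “mature protein” described above.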
Naturally-occurring secretable proteins are expressed in varying amounts depending on their physiological roles in vivo. Many of them, under the regulation of their natural or endogenous SP, are expressed in quantities that are too low to be used commercially. It would therefore be advantageous if nucleic acid constructs and methods are devised to enable the production of secretory proteins in vivo or in vitro to meet the manufacturing needs for therapeutic applications.
The present invention provides nucleic acid and polypeptide constructs for producing proteins in higher yields than when such proteins are produced from sequences that comprise their endogenous signal peptide. The present invention also provides vectors, host cells and methods for producing proteins in higher yields than when such proteins are produced from DNA sequences that encode the protein with its endogenous signal peptide or without an endogenous signal peptide; the higher yield being achieved either by replacing the endogenous secretory leader sequence with a heterologous secretory leader sequence of the invention, or by adding a heterologous secretory leader sequence of the invention to a protein that would otherwise not contain a leader sequence. Accordingly, the present invention provides polypeptide and polynucleotide constructs where the polypeptides and polynucleotides are modified, such as to form a fusion molecule with a fusion partner. The fusion molecules of the invention may be prepared by any conventional technique.
Accordingly, the present invention comprises the following embodiments:
1. A heterologous polypeptide comprising a secretory leader and a second polypeptide, wherein the secretory leader is operably linked to the N-terminal of the second polypeptide, wherein the secretory leader is not so linked to the second polypeptide in nature, and wherein the secretory leader comprises a leader sequence of a secretable protein.
2. The heterologous polypeptide of 1, wherein the second polypeptide is a secretable protein selected from collagen type IX alpha 1 chain, long splice form, alpha-2-antiplasmin precursor (alpha-2-plasmin inhibitor), trinucleotide repeat containing 5, ARMET protein, calumenin, COL9A1 protein, NBL1, PACAP protein, alpha-1B-glycoprotein precursor (alpha-1-B glycoprotein), similar to brain-specific angiogenesis inhibitor 2 precursor, SPOCK2, protein disulfide-isomerase (EC 5.3.4.1) ER60 precursor, serine (or cysteine) proteinase inhibitor, clade A (alpha-1), GM2 ganglioside activator precursor, coagulation factor X precursor, secreted phosphoprotein 1 (osteopontin, bone sialoprotein 1), Vitamin D-binding protein precursor, interleukin 6 (interferon, beta 2), orosomucoid 1 precursor, hemopexin, glycoprotein hormones, alpha polypeptide precursor, kininogen 1, prolyl 4-hydroxylase, beta subunit, proopiomelanocortin, prostaglandin D2 synthase 21 kDa, alpha-2-glycoprotein 1, zinc, chromogranin A, cystatin M precursor, clusterin isoform 1, inter-alpha (globulin) inhibitor H1, leukemia inhibitory factor (cholinergic differentiation factor), lumican, secretoglobin, family 2A, member 2, nov precursor, reticulocalbin 1 precursor, reticulocalbin 2, EF-hand calcium binding domain, gastric intrinsic factor (vitamin B synthesis), cerberus 1, lipocalin 2 (oncogene 24p3), interleukin 18 binding protein isoform C precursor, cell growth regulator with EF hand domain 1, leukocyte immunoglobulin-like receptor, subfamily A, spondin 2, extracellular matrix protein, transmembrane protein 4, sparc/osteonectin, cwcv and kazal-like domain proteoglycan, Rho GTPase activating protein 25 isoform b, dickkopf homolog 3, ameloblastin precursor, chorionic gonadotropin, beta polypeptide 8 precursor, multiple coagulation factor deficiency 2, similar to common salivary protein 1, hypothetical protein FLJ32115, oncoprotein-induced transcript 3, hypothetical protein MGC40499, interleukin 18 binding protein isoform A precursor, interleukin 1 receptor antagonist isoform 1 precursor, WFIKKN2 protein, similar to hypothetical protein 9330140G23, and SEQ ID NOs: 2-3, 9, 19, 22, 26, 28, 31, 37, 41, 47, 54, 57, 62, 68, 75, 79, 82, 86, 88, 94, 97, 102, 104, 107, 111, 116, 120, 127, 131, 137, 140, 145, 147, 153, 159, 167, 175, 177, 181, 185, 189, 191, 196, 200, 207, 209, 215, 218, 222, 227, 232, 235, 239, 241, 245, 248, and 254.
3. The heterologous polypeptide of 1, wherein the secretory leader comprises an amino acid sequence selected from SEQ ID NOs: 20-21, 23-25, 27, 32-36, 38-40, 48-53, 76-78, 80-81, 83-85, 87, 95-96, 103, 108-110, 112-115, 117-119, 121-126, 128-130, 132-136, 138-139, 141-144, 154-158, 160-166, 178-180, 186-188, 197-199, 210-214, 223-226, 233-234, 240, and 246-247.
4. The heterologous polypeptide of 1, wherein the second polypeptide is selected from a secretable polypeptide, an extracellular portion of a transmembrane protein, and a soluble receptor.
5. The heterologous polypeptide of 4, wherein the secretable polypeptide is selected from a growth factor, a cytokine, a lymphokine, an interferon, a hormone, a stimulatory factor, an inhibitory factor, a soluble receptor, and splice variants thereof.
6. A secretory leader comprising a leader amino acid sequence selected from the leader sequences of the secretable polypeptides of Table 1 and the secretory leaders listed in Table 2.
7. The secretory leader of 6, the amino acid sequence of which is selected from the amino acid sequences of Appendix A, the amino acids residues of SEQ ID NOs: 1, 4-8, 10-18, 20-21, 23-25, 27, 29-30, 32-36, 38-40, 42-46, 48-53, 55-56, 58-61, 63-67, 69-74, 76-78, 80-81, 83-85, 87, 89-93, 95-96, 98-101, 103, 105-106, 109-110, 112-115, 117-119, 121-126, 128-130, 132-136, 138-139, 141-144, 146, 148-152, 154-158, 160-166, 168-174, 176, 178-180, 182-184, 186-188, 190, 192-195, 197-199, 201-206, 208, 210-214, 216-217, 219-221, 223-226, 228-231, 233-234, 236-238, 240, 242-244, 246-247, 249-253, and 255-256.
8. The heterologous polypeptide of 1, further comprising a fusion partner.
9. The heterologous polypeptide of 8, wherein the fusion partner is a polymer.
10. The heterologous polypeptide of 9, wherein the polymer is a third molecule, and wherein the third molecule is selected from polyethylene glycol and all or part of human serum albumin, fetuin A, fetuin B and Fc.
11. An isolated nucleic acid molecule comprising a polynucleotide sequence selected from: (1) a polynucleotide sequence encoding an amino acid sequence of a heterologous polypeptide according to any one of 1-5 and 8-10; (2) a polynucleotide encoding an amino acid sequence of a secretory leader according to any one of 6-7.
12. A nucleic acid molecule encoding a heterologous polypeptide, comprising a first polynucleotide that encodes a secretory leader of any one of 6-7, a second polynucleotide that encodes a second polypeptide, wherein the first polynucleotide and the second polynucleotide are operably linked to facilitate secretion of the heterologous polypeptide from a cell, and wherein the first and second polynucleotide are not so linked in nature.
13. The nucleic acid molecule of 12, wherein the second polypeptide is selected from a secretable polypeptide, an extracellular portion of a transmembrane protein, and a soluble receptor.
14. The nucleic acid molecule of 12, further comprising a third polynucleotide, wherein the third polynucleotide is a Kozak sequence or a fragment thereof that is situated at its 5′ end.
15. The nucleic acid molecule of 14, further comprising a fourth polynucleotide, wherein the fourth polynucleotide comprises a restriction enzyme-cleavable sequence at its 3′ end.
16. The nucleic acid molecule of 15, further comprising a fifth polynucleotide that encodes a tag.
17. The nucleic acid molecule of 16, wherein the tag is a purification tag.
18. The nucleic acid molecule of 16, wherein the tag is selected from V5, HisX6, HisX8, an avidin molecule, and a biotin molecule.
19. The nucleic acid molecule of 16, further comprising a sixth polynucleotide that encodes a second enzyme-cleavable sequence that can be cleaved by a second enzyme, wherein the second cleavable sequence is situated upstream of the tag if the tag is situated at the C-terminus of the heterologous polypeptide, or downstream of the tag if the tag is situated at the N-terminus of the heterologous polypeptide.
20. The nucleic acid molecule of 19, wherein the second enzyme is thrombin or TEV protease from tobacco etch virus.
21. A vector comprising the nucleic acid molecule of any one of 11-20, further comprising an origin of replication and a selectable marker.
22. The vector of 21, wherein the origin of replication is selected from SV40 ori, Pol ori, EBNA ori, and pMB1 ori.
23. The vector of 21, wherein the selectable marker is an antibiotic resistance gene.
24. The vector of 23, wherein the antibiotic resistance is selected from puromycin resistance, kanamycin resistance, and ampicillin resistance.
25. A recombinant host cell comprising a cell and the heterologous polypeptide of any of 1-4 and 8-10, the nucleic acid molecule of any of 11-20, or the vector of any one of 21-24.
26. The recombinant host cell of 25, wherein the cell is a eukaryotic cell.
27. The recombinant host cell of 26, wherein the cell is a human cell.
28. A method of producing a secreted polypeptide, comprising:
29. The method of 28, wherein the expression system is a cellular expression system or a cell free expression system.
30. The method of 28, wherein the expression system is a cellular expression system and the cell is a mammalian cell.
31. The method of 30, wherein the mammalian cell is selected from a 293 cell line, a PER.C6® cell line, and a CHO cell line.
32. The method of 31, wherein the 293 cell is a 293-T cell or a 293-6E cell.
Table 1: lists information regarding the secretable proteins from which the leader sequences of the invention are derived. Column 1 lists the internal designation identification numbers; column 2 lists the reference identification numbers; column 3 lists the identities of the secretable proteins.
Table 2: lists information regarding the leader sequences of the invention. Column 1 lists the internal designation identification numbers; column 2 lists the SEQ ID NOs. for the leader sequences (P); column 3 lists the reference identification numbers; column 4 lists the leader sequence types, i.e., full length versus alternative leader sequences; and column 5 lists the secretable proteins from which the leader sequences are derived.
Table 3: summarizes the results obtained with the leader sequences of the current invention. Column 1 lists the clone designation identification numbers; column 2 lists the protein concentrations in micrograms/milliliter (μg/ml) as detected and measured from the Coomassie-stained SDS-PAGE; column 3 ranks the expression levels as measured by Coomassie-stained SDS-PAGE, silver stained SDS-PAGE, or quantitative Western Blot using an Anti-V5 antibody relative to purified V5-tagged protein standards, of each construct on a scale of 1 to 56, from the lowest at 56 to the highest at 1; column 4 lists whether a band was detected using silver-stain developed SDS-PAGE; column 5 lists the molecular weights of the tested secretable proteins in Daltons; column 6 lists the gel numbers and lane numbers corresponding to
Appendix A/Sequence Listing lists the amino acid sequences of the leader sequences (P1) in Table 2.
To express and secrete the proteins of interest in larger quantities (e.g., about 10% more, 20% more, 30% more, or a higher percentage more) than those obtained when the proteins are expressed and secreted from DNA sequences that encode their full-length amino acid sequence and contain their endogenous signal peptide, the inventors replaced their endogenous secretory leader sequence with that from another, i.e., different or heterologous, secretable protein. The latter secretable protein of interest is typically one that is expressed and/or secreted at high levels (“high expressor protein” or “high secretor protein”), or moderately high levels (“moderate expressor protein” or “moderate secretor protein”) under typical conditions for assaying protein expression and secretion, which are not limited to those described in detail in the Examples of the invention. In other words, if one were to express a panel of proteins (including but not limited to those listed in this specification, in Appendix A, and those listed in Tables 1-3), and all were expressed under the same assay conditions, one would find that some proteins are expressed and/or secreted at higher levels than others. Accordingly, it is an aspect of the invention to recognize the differences in expression and secretion levels among the proteins of the invention, and take advantage of these recognized differences to further identify from the leader sequences those that are useful for improving the secretion and/or expression of otherwise low expressor proteins, or of proteins that are not secreted at the desirable levels. Employing heterologous secretory leader sequences is further advantageous in that, during the secretion process, the resulting mature amino acid sequence of the secretable polypeptide is not altered as the secretory leader sequence is removed in the endoplasmic reticulum (ER) or the Golgi. A secretory leader sequence of the invention serves to direct certain proteins to the ER.
The ER separates the membrane-bound proteins from all other types of proteins amongst those comprising the leader sequences. Each group is then separately moved to the Golgi apparatus. The Golgi apparatus then distributes the proteins to vesicles such as secretory vesicles, the cell membranes, the lysosomes, or other organelles.
Moreover, the addition of a heterologous secretory leader facilitates the expression and secretion of the extracellular domains of transmembrane proteins. An example of such a transmembrane protein is a Type II single transmembrane (STM) protein, the secretory leader of which is also the transmembrane domain, which must be removed before the protein becomes soluble and secreted.
Thus, to identify robust secretory leader sequence(s), which enhance or improve the secretion and expression of proteins relative to that achieved by the endogenous leader sequence, and which optionally can be used universally for making secretable proteins, many different secretable proteins have been cloned and expressed, as described herein. The expression and secretion levels of the cloned and expressed proteins in the supernatant of the mammalian 293 cells have also been measured, the results of which are shown in, for example, Example 1,
In one embodiment, a secretory leader sequence that is a part of the secretable protein collagen type IX alpha 1 chain, long form, has been identified. This particular leader sequence was selected to further examine its ability to promote expression and secretion when used as a heterologous and/or universal secretory leader sequence. The amino acid sequence of the secretory leader, which is part of the secretable protein collagen type IX alpha 1 chain, long form, is predicted to be MKTCWKIPVFFFVCSFLEPWASA (SEQ ID NO: 1). As further described herein, vectors were constructed to comprise this particular secretory leader. Using these vectors, several proteins were cloned without their own naturally-existing secretory leaders, yielding secretable proteins with a heterologous secretory leader sequence. The expression and secretion levels of these fusion proteins were found to be about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70% or more higher than the expression or secretion levels as observed with their non-fusion counterparts.
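At the amino acid level, replacing an endogenous leader with the leader of SEQ ID NO: 1 can be sketched as follows. This is a minimal illustration only: the target precursor, its endogenous leader, and the function name are hypothetical, and the length of the endogenous leader is assumed to be known in advance.

```python
# Leader sequence as given in the text for SEQ ID NO: 1.
COL9A1_LEADER = "MKTCWKIPVFFFVCSFLEPWASA"


def swap_leader(precursor: str, endogenous_sp_length: int,
                leader: str = COL9A1_LEADER) -> str:
    """Return a heterologous precursor: leader + mature part of the target.

    endogenous_sp_length is an assumed, known cleavage position.
    Pass 0 for a protein that has no endogenous leader, in which
    case the heterologous leader is simply added at the N-terminus.
    """
    if not 0 <= endogenous_sp_length < len(precursor):
        raise ValueError("endogenous_sp_length must fall inside the precursor")
    mature = precursor[endogenous_sp_length:]
    return leader + mature


# Hypothetical target protein, made up for illustration only.
endogenous_leader = "MAAAALLLLLVVVVSSAQA"
target_mature = "GTPEDKR"
target_precursor = endogenous_leader + target_mature

fusion = swap_leader(target_precursor, len(endogenous_leader))
```

Because only the N-terminal leader is exchanged, the mature portion of the target is carried over unchanged, consistent with the observation that the mature amino acid sequence is not altered once the leader is removed.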
The present invention may be more clearly understood in light of the following definitions. Generally, the terms used herein have their ordinary meanings and the meanings given them specifically below.
The terms “polynucleotide,” “nucleotide,” “nucleic acid,” “nucleotide molecule,” “nucleic acid molecule,” “nucleic acid sequence,” “polynucleotide sequence,” and “nucleotide sequence” are used interchangeably herein to refer to polymeric forms of nucleotides of any length. The polynucleotides can contain deoxyribonucleotides, ribonucleotides, and/or their analogs or derivatives. For example, nucleic acids can be naturally occurring DNA or RNA, or can be synthetic analogs of the naturally occurring DNA or RNA, as known in the art. The terms also encompass genomic DNA; genes; gene fragments; exons; introns; regulatory sequences or regulatory elements, such as promoters, enhancers, initiation and termination regions, other control regions, expression regulatory factors and expression controls; isolated DNA; and cDNA. In addition, the terms encompass mRNA, tRNA, rRNA, ribozymes, splice variants, antisense RNA, antisense conjugates, RNAi, siRNA and isolated RNAs. The terms also encompass recombinant polynucleotides, heterologous polynucleotides, branched polynucleotides, labeled polynucleotides, DNA/RNA hybrids, polynucleotide constructs, vectors comprising the subject nucleic acids, nucleic acid probes, primers and primer pairs. The terms comprise modified nucleic acid molecules, such as analogs of purines and pyrimidines, with alterations in the backbones, sugars, or heterocyclic bases, such as methylated nucleic acid molecules; peptide nucleic acids; and nucleic acid molecule analogs, which may be suitable as, for example, probes if they demonstrate superior stability and/or binding affinity under assay conditions. Analogs of purines and pyrimidines, including radiolabeled and fluorescent analogs, are known in the art. The polynucleotides can have any three-dimensional structure.
The terms also encompass single-stranded, double-stranded and triple-helical molecules that are DNA, RNA, or hybrid DNA/RNA, and that may encode a full-length gene or a biologically active fragment thereof. Biologically active fragments of polynucleotides can encode the polypeptides herein, as well as anti-sense, ribozymes, or RNAi molecules. Thus, for example, the full length polynucleotides herein may be treated with enzymes, such as Dicer, to generate a library of short RNAi fragments, which are also within the scope of the present invention.
The terms “polypeptide,” “peptide,” and “protein,” used interchangeably herein, refer to a polymeric form of amino acids of any length. The amino acids can include naturally-occurring amino acids; coded and non-coded amino acids; chemically or biochemically modified, derivatized, or designer amino acids; amino acid analogs; peptidomimetics and depsipeptides; and polypeptides having modified, cyclic, bicyclic, depsicyclic, or depsibicyclic peptide backbones. The terms may also refer to conjugated proteins; fusion proteins, including, but not limited to, GST fusion proteins, fusion proteins with a heterologous amino acid sequence, fusion proteins with heterologous and homologous leader sequences, fusion proteins with or without N-terminal methionine residues; pegylated proteins; and immunologically tagged proteins. Also included in the terms are variations of naturally occurring proteins, where such variations are homologous or substantially similar to the naturally occurring proteins, as well as their corresponding homologs from different species. Variants of polypeptide sequences include insertions, additions, deletions, or substitutions when compared with the original polypeptides, but nonetheless retaining the same type of biological activity albeit possibly at a different level. The term also includes peptide aptamers.
A “secretory leader,” “signal peptide,” or “leader sequence” contains a sequence comprising amino acid residues that directs the intracellular trafficking of the polypeptide of which it is a part. Polypeptides contain secretory leaders, signal peptides or leader sequences, typically at their N-terminus. These polypeptides may also contain cleavage sites where the secretory leaders, signal peptides or leader sequences may be cleaved from the rest of the polypeptides by signal endopeptidases. Such polypeptides, after cleavage at the cleavage sites, generate mature polypeptides. Cleavage typically takes place during secretion or after the intact polypeptide has been directed to the appropriate cellular compartment.
According to the invention, a “high secretor signal peptide/secretory leader sequence” is one that (i) can be operably linked to a protein as a heterologous sequence, thereby replacing its endogenous signal peptide; and (ii) is capable of enhancing the level of secretion of the protein at least about 5 fold, when compared to the level of secretion that the protein exhibits when it carries its endogenous SP.
Also according to the invention, a “moderate secretor signal peptide/secretory leader sequence” is one that (i) can be operably linked to a protein as a heterologous sequence, thereby replacing its endogenous signal peptide; and (ii) is capable of enhancing the level of secretion of the protein about 2 to 5 fold, when compared to the level of secretion that the protein exhibits when it carries its endogenous SP.
Further according to the invention, a “low secretor signal peptide/secretory leader” is one that (i) can be operably linked to a protein as a heterologous sequence, thereby replacing its endogenous signal peptide; and (ii) is capable of enhancing the level of secretion of the protein less than about 2 fold or does not enhance the level of secretion of the protein when compared to the level of secretion that the protein exhibits when it carries its endogenous SP.
Moreover, a secretory leader of the invention can also be added to a protein which is otherwise not predicted to be secreted via the ER-Golgi and does not have an endogenous signal peptide. In this case, the above definitions of “high/moderate/low secretor signal peptide/secretory leader sequence” are not applicable since there is no baseline secretion level for the protein that can be used for comparison purposes. In this case, the effect that the addition of the signal peptide/secretory leader sequence has on the secretion of an otherwise non-secretable protein will be compared among the resulting heterologous proteins.
For the purpose of this invention, the above definitions of “high/moderate/low secretor signal peptide/secretory leader sequence” relate only to the signal peptide (or secretory leader sequences). They do not relate to “high secretor proteins,” “moderate secretor proteins” or “low secretor proteins”. The proteins themselves were ranked as such on the basis of a relative scale that served to rank all the proteins of the invention (Tables 1-3 and Appendix A) relative to each other, with regard to their own expression and secretion levels in either wheat germ extracts, or mammalian cells (see Examples 1-3 for detailed explanation).
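The fold-enhancement thresholds in the definitions of high, moderate, and low secretor leader sequences can be sketched as a simple classification. This is an illustration under stated assumptions: the “about 2” and “about 5” fold boundaries are treated here as hard cutoffs, the function name is hypothetical, and secretion levels are assumed to be measured in the same units under the same assay conditions.

```python
def classify_leader(heterologous_level: float, endogenous_level: float) -> str:
    """Classify a leader by fold enhancement over the endogenous SP baseline.

    Assumed hard cutoffs: >= 5 fold is "high", >= 2 fold is "moderate",
    anything lower (including no enhancement) is "low".
    """
    if endogenous_level <= 0:
        # No baseline exists, e.g. the protein has no endogenous signal
        # peptide, so the definitions are not applicable.
        raise ValueError("a positive endogenous baseline is required")
    fold = heterologous_level / endogenous_level
    if fold >= 5.0:
        return "high"
    if fold >= 2.0:
        return "moderate"
    return "low"


# Example with made-up secretion levels (arbitrary units).
example_class = classify_leader(heterologous_level=30.0, endogenous_level=10.0)
```

For a protein with no endogenous signal peptide there is no baseline, so the sketch raises an error rather than returning a class; in that case the resulting heterologous proteins would instead be compared with one another.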
A “secretable” protein is one capable of being directed to the ER, secretory vesicles, or the extracellular space by a secretory leader, signal peptide, or leader sequence. It may also be one that is released into the extracellular space without necessarily containing a signal sequence. If the secretable protein is one that is released into the extracellular space, it can undergo processing to produce a “mature” polypeptide. Proteins that contain transmembrane domains and typically remain inserted into the plasma membrane are considered, for the purposes of the invention, secretable proteins because they are also synthesized in the ER-Golgi, and some fragments or parts of such proteins can be released into the extracellular compartment, for example, by proteolytic cleavage. Thus, release into the extracellular space can occur in multiple ways, including, for example, exocytosis and proteolytic cleavage.
The terms “mature protein” and “secreted protein” are used interchangeably herein, and refer to the form(s) of a secretable protein after it is secreted to the outside of the cell (for example, into the media conditioned by cells in culture). Typically, the mature protein has the amino acid sequence of the secretable protein sans the signal peptide. However, when a protein is expressed in nature or recombinantly, parts of the signal peptides are often not removed, resulting in a mature-protein mixture that may contain many forms of the mature protein, attached to varying lengths of the signal peptides. Thus, multiple “mature forms” can exist for a secretable protein depending on the specific amino acids cleaved off by the signal endopeptidase. Other proteases can also cleave off amino acids from a secretable protein, further adding to the heterogeneity of its “mature protein” forms. The exact place where a signal peptide has been removed from a particular protein sample may be determined by N-terminal protein sequencing or otherwise by standard methods known to those skilled in the art.
A “biologically active” entity, or an entity having “biological activity,” is one that has the structural, regulatory, or biochemical functions of a naturally occurring molecule, or one that has the functions related to or associated with a metabolic or physiological process. A biologically active polynucleotide fragment or polypeptide fragment according to this invention is one that exhibits activities similar, but not necessarily identical, to the activities of the counterpart polynucleotide or polypeptide, of which the fragment is a part. Biological activities may include, but are not limited to, an improved desired activity and a decreased undesirable activity. For example, an entity demonstrates biological activity when it participates in molecular interactions with other molecules. An example of such an interaction is hybridization. Another example of such an interaction may be the exhibition of therapeutic effectiveness in alleviating a disease condition, or prophylactic effectiveness in inducing an immune response to the molecule. Another example of such an interaction may be the demonstration of potential uses as diagnostic tools in determining the presence of the molecule, for example, when the active fragment of a polynucleotide or a polypeptide is unique to the polynucleotide or the polypeptide, allowing the detection of the polynucleotide or the polypeptide by detecting the fragment. A biologically active polypeptide or fragment thereof includes one that can participate in a biological reaction, for example, one that can serve as an epitope or immunogen to stimulate an immune response, which includes but is not limited to the production of antibodies; or one that participates in signal transduction pathways by binding to receptors, proteins, or nucleic acids; or one that activates enzymes or substrates. Yet another example of such an interaction may be the suitability of using the polynucleotide molecule as a primer in PCR.
An “isolated” or “substantially isolated” polynucleotide or polypeptide, or a polynucleotide or polypeptide in “substantially pure form,” in “substantially purified form,” or as an “isolate,” is one that is substantially free of the sequences with which it is associated in nature, or of other nucleic acid sequences that do not include a sequence or fragment of the subject polynucleotide or polypeptide. “Substantially free” means that less than about 10%, less than about 20%, less than about 30%, less than about 40%, or less than about 50%, of the composition is composed of the undesired materials.
“Operably linked” refers to an arrangement of elements wherein the components so described are configured so as to perform their desired function. Thus, a given promoter operably linked to a coding sequence is capable of effecting the expression of the coding sequence when the proper transcription factors and conditions are present. The promoter need not be contiguous with the coding sequence, so long as it functions to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between the promoter sequence and the coding sequence, as can translated introns, and the promoter sequence can still be considered “operably linked” to the coding sequence.
“Recombinant,” when used to describe a nucleic acid molecule, means a polynucleotide of genomic, cDNA, viral, or synthetic origin which, by virtue of its origin or manipulation, is not associated with all or a portion of the polynucleotide with which it is associated in nature. The term “recombinant” when used to describe a protein or polypeptide, means a polypeptide produced by expression of a recombinant polynucleotide.
A “control element” refers to a polynucleotide sequence that aids in the expression of a coding sequence to which it is linked. The term may refer to promoters, transcription termination sequences, upstream regulatory domains, polyadenylation signals, and when appropriate, leader sequences and enhancers, which collectively provide for the transcription and translation of a coding sequence in a host cell.
A “promoter” as used herein refers to a DNA regulatory region capable of binding RNA polymerase in a mammalian cell and initiating transcription of a downstream (3′ direction) coding sequence operably linked thereto. For purposes of the present invention, a promoter sequence includes the minimum number of bases or elements required to initiate transcription of a gene of interest at a level detectable above background. Within the promoter sequence is a transcription initiation site, as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase. Eukaryotic promoters often, but not always, contain “TATA” boxes and “CAT” boxes. Promoters further include those that are naturally contiguous to nucleic acid molecules and those that are not naturally contiguous to nucleic acid molecules. Additionally, promoters may include inducible promoters; conditionally active promoters, such as a cre-lox promoter; constitutive promoters; and tissue specific promoters.
A “selectable marker” refers to a gene that confers one or more phenotypes on a cell expressing the marker, such that the cell can be identified in appropriate conditions under which the phenotypes associated with the markers are manifested and observable. Generally, a selectable marker allows selection of transformed cells based on their ability to thrive in the presence or absence of one or more chemicals and/or other agents that inhibit an essential cell function. Suitable markers, therefore, include genes coding for proteins that confer drug resistance or sensitivity thereto, impart color to, or change the antigenic characteristics of those cells transfected with a molecule encoding the selectable marker, when the transfected cells are grown in an appropriate selective medium. For example, selectable markers include: cytotoxic markers and drug resistance markers, whereby cells are selected by their ability to grow on media containing one or more of the cytotoxins or drugs; auxotrophic markers by which cells are selected by their ability to grow on defined media with or without particular nutrients or supplements, such as thymidine and hypoxanthine; metabolic markers by which cells are selected for phenotypes such as their abilities to grow on defined media containing the appropriate sugar as the sole carbon source; or markers that confer the abilities of forming colored colonies on chromogenic substrates or the abilities to fluoresce.
“Transformation,” as used herein, refers to the insertion of a polynucleotide into a host cell, regardless of the method used for insertion, which may be, for example, transformation, transfection, infection, and the like. The introduced polynucleotide may be maintained as a nonintegrated vector, for example, an episome, or alternatively, may be integrated into the host genome.
A “gene” comprises a DNA region encoding a gene product, as well as all DNA sequence regions that regulate the production of the gene product, whether or not such regulatory sequence regions are adjacent to coding sequences that may or may not be transcribed. Accordingly, a gene may include, for example, a promoter sequence, a terminator, a translational regulatory sequence such as a ribosome binding site or an internal ribosome entry site, an enhancer, a silencer, an insulator, a boundary element, a replication origin, a matrix attachment site, or a locus control region.
“Gene expression” refers to the conversion of the information, contained in a gene, into a gene product. A gene product can be the direct transcriptional product of a gene (e.g., mRNA, tRNA, rRNA, antisense RNA, ribozyme, structural RNA or any other type of RNA) or a protein produced by translating an mRNA. A gene product can also be an RNA that is modified, by a process such as capping, polyadenylation, methylation, or editing; or a protein modified by, for example, methylation, acetylation, phosphorylation, ubiquitination, ADP-ribosylation, myristoylation, and glycosylation.
A “coding sequence” or a sequence that “encodes” a selected polypeptide, is a nucleic acid molecule that is transcribed (in the case of a DNA) and translated (in the case of an mRNA) into a polypeptide in vivo, when the sequence is placed under the control of one or more appropriate regulatory sequences. The coding sequence begins at a start codon at the 5′ (amino) terminus and ends at a translation stop codon at the 3′ (carboxy) terminus. A coding sequence can be, for example, a cDNA from viral, prokaryotic, or eukaryotic mRNA; a genomic DNA viral sequence (e.g. DNA viruses and retroviruses); a prokaryotic DNA; or a synthetic DNA sequence. A transcription termination sequence may be located at a position that is 3′ to the coding sequence.
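By way of illustration only, the structural part of the foregoing definition can be sketched in Python. The sketch checks that a DNA string begins with a start codon, ends with a translation stop codon, and contains no earlier in-frame stop codon. The function name `is_coding_sequence` and the toy sequences in the usage note are hypothetical, and the sketch assumes the common ATG start codon and the standard stop codons; it says nothing about the regulatory sequences that control expression.

```python
# DNA equivalents of the stop codons UAA, UGA and UAG.
STOP_CODONS = {"TAA", "TGA", "TAG"}


def is_coding_sequence(dna: str) -> bool:
    """Return True if dna reads as a single open reading frame.

    The frame must start with ATG, end with a stop codon, and contain
    no in-frame stop codon before the final one.
    """
    dna = dna.upper()
    # Need at least a start and a stop codon, in whole codons.
    if len(dna) < 6 or len(dna) % 3 != 0:
        return False
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Reject a premature in-frame stop codon.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])
```

For example, the toy string "ATGGCTTAA" satisfies the check, whereas "ATGTAAGCTTGA" does not, because it contains a premature in-frame stop codon.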
A “fragment” refers to a polypeptide or polynucleotide comprising only a part of the sequence and structure of an intact full-length polypeptide or polynucleotide. The polypeptide fragment can comprise a C-terminal deletion, an N-terminal deletion, and/or an internal deletion from the intact polypeptide. The polynucleotide fragment can comprise a 5′ deletion, a 3′ deletion, and/or an internal deletion from the intact polynucleotide. A fragment of a protein generally comprises at least about 5-10 contiguous amino acid residues of the full-length molecule, at least about 15-25 contiguous amino acid residues of the full-length molecule, or at least about 20-50 or more contiguous amino acid residues of the full-length molecule. A fragment of a polynucleotide generally comprises at least about 15-30 contiguous nucleotides of the full-length molecule, at least about 45-75 contiguous nucleotides of the full-length molecule, or at least about 60-150 or more contiguous nucleotides of the full-length molecule. In a certain embodiment, the number of amino acid residues in the fragment may be any integer between 5 and the total number of amino acid residues in the full-length molecule. In another embodiment, the number of nucleotides in the polynucleotide fragment may be any integer between 15 and the total number of nucleotides in the full-length molecule.
The term “host cell” or “recombinant host cell” refers to an individual cell, cell line, cell culture, or a cell in vivo, which can be or has been a recipient of one or more polynucleotides or polypeptides of the invention, which may be, for example, a recombinant vector, an isolated polynucleotide, an antibody, or a fusion protein. Host cells may be progeny of a single host cell, and the progeny may not necessarily be identical in morphology, physiology, in total DNA, RNA, or in polypeptide complement to the original recipient cell, as a result of natural, accidental, or deliberate mutations and/or changes. Host cells can be prokaryotic or eukaryotic, including but not limited to, mammalian, insect, amphibian, reptile, crustacean, avian, fish, plant and fungal cells. A host cell may be a cell that is transformed, transfected, transduced, or infected in vivo or in vitro with a polynucleotide of the invention such as a recombinant vector. A host cell that comprises a recombinant vector of the invention may be called a “recombinant host cell.”
The term “receptor” refers to a polypeptide that binds to a specific extracellular molecule and this binding may initiate a cellular response.
The term “ligand” refers to a molecule that binds to a specific site on another molecule.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed. Moreover, it must be understood that the invention is not limited to the particular embodiments described, as the embodiments may, of course, vary. Further, the terminology used to describe particular embodiments is not intended to be limiting, since the scope of the present invention will be limited only by its claims.
Unless defined otherwise, the meanings of all technical and scientific terms used herein are those commonly understood by one of ordinary skill in the art to which this invention belongs. One of ordinary skill in the art will appreciate that any methods and materials similar or equivalent to those described herein can also be used to practice or test the invention.
It must be noted that, as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a subject polypeptide” includes a plurality of such polypeptides and reference to “the agent” includes reference to one or more agents as well as equivalents thereof known to those skilled in the art.
Further, all numbers expressing quantities of ingredients, reaction conditions, % purity, polypeptide and polynucleotide lengths, and so forth, used in the specification and the claims, are modified by the term “about,” unless otherwise indicated. Accordingly, the numerical parameters set forth in the specification and claims are approximations that may vary depending upon the desired properties of the present invention. At the very least, and not as an attempt to limit the application of the doctrine of equivalents to the scope of the claims, each numerical parameter should be construed in light of the number of reported significant digits, applying customary rounding techniques.
Nonetheless, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical value, however, inherently contains certain errors from the standard deviation of its experimental measurement.
All publications cited herein are incorporated by reference in their entireties, and references cited in such publications are also incorporated by reference in their entireties.
As described herein, secretory leader sequences, which are identified from secretable proteins, are demonstrated to be useful for producing proteins at an amount that is about 5% higher, about 10% higher, about 20% higher, about 30% higher, about 40% higher, or about 50% or more higher, than when such proteins are produced under the same conditions from DNA sequences that contain the protein's endogenous secretory leader sequence. Secretory leader sequences identified and described herein include, for example, those from the following secretable proteins: interleukin-9 precursor, T cell growth factor P40, P40 cytokine, triacylglycerol lipase, pancreatic precursor, somatoliberin precursor, vasopressin-neurophysin 2-copeptin precursor, beta-neoendorphin-dynorphin precursor, complement C2 precursor, small inducible cytokine A14 precursor, elastase 2A precursor, plasma serine protease inhibitor precursor, granulocyte-macrophage colony-stimulating factor precursor, interleukin-2 precursor, interleukin-3 precursor, alpha-fetoprotein precursor, alpha-2-HS-glycoprotein precursor, serum albumin precursor, inter-alpha-trypsin inhibitor light chain, serum amyloid P-component precursor, apolipoprotein A-II precursor, apolipoprotein D precursor, colipase precursor, carboxypeptidase A1 precursor, alpha-s1 casein precursor, beta casein precursor, cystatin SA precursor, follitropin beta chain precursor, glucagon precursor, complement factor H precursor, histidine-rich glycoprotein precursor, interleukin-5 precursor, alpha-lactalbumin precursor, Von Ebner's gland protein precursor, matrix Gla-protein precursor, alpha-1-acid glycoprotein 2 precursor, phospholipase A2 precursor, dendritic cell chemokine 1, statherin precursor, transthyretin precursor, apolipoprotein A-I precursor, apolipoprotein C-III precursor, apolipoprotein E precursor, complement component C8 gamma chain precursor, serotransferrin precursor, beta-2-microglobulin precursor, neutrophil defensin 1 precursor, triacylglycerol lipase gastric precursor, haptoglobin precursor, neutrophil defensin 3 precursor, neuroblastoma suppressor of tumorigenicity 1 precursor, small inducible cytokine A13 precursor, CD5 antigen-like precursor, phospholipid transfer protein precursor, dickkopf related protein-4 precursor, elastase 2B precursor, alpha-1-acid glycoprotein 1 precursor, beta-2-glycoprotein 1 precursor, neutrophil gelatinase-associated lipocalin precursor, C-reactive protein precursor, interferon gamma precursor, kappa casein precursor, plasma retinol-binding protein precursor, interleukin-13 precursor, and any of the secretable proteins listed in Tables 1-3.
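The comparison of yields stated above, in which a protein produced with a heterologous secretory leader sequence is compared with the same protein produced under the same conditions with its endogenous leader, can be expressed as a simple relative increase. The following Python sketch is illustrative only; the function name `percent_increase` and the numbers in the usage note are hypothetical and do not represent measured data.

```python
def percent_increase(heterologous_yield: float, endogenous_yield: float) -> float:
    """Relative increase, in percent, of the heterologous-leader yield
    over the endogenous-leader yield.

    Both yields must be in the same units and obtained under the same
    conditions. A negative result means the heterologous leader gave less.
    """
    if endogenous_yield <= 0:
        raise ValueError("endogenous yield must be positive")
    return 100.0 * (heterologous_yield - endogenous_yield) / endogenous_yield
```

For instance, a hypothetical yield of 12 units with a heterologous leader against 10 units with the endogenous leader corresponds to an increase of 20%.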
The above-identified secretory leader sequences, together with the vectors and methods of the invention, are useful in expressing a wide variety of polypeptides, including, for example, secretable polypeptides, extracellular proteins, transmembrane proteins, and receptors, such as a soluble receptor. Examples of such polypeptides include cytokines and growth factors, such as Interleukins 1 through 18; the interferons; the lymphokines; hormones; RANTES; lymphotoxin-β; Fas ligand; flt-3 ligand; ligand for receptor activator of NF-kappa B (RANKL); TNF-related apoptosis-inducing ligand (TRAIL); CD40 ligand; Ox40 ligand; 4-1BB ligand and other members of the TNF family; thymic stroma-derived lymphopoietin; stimulatory factors such as, for example, granulocyte colony stimulating factor and granulocyte-macrophage colony stimulating factor; inhibitory factors; mast cell growth factor, stem cell growth factor, epidermal growth factor, growth hormone, tumor necrosis factor; leukemia inhibitory factor, oncostatin-M; splice variants; and hematopoietic factors such as erythropoietin and thrombopoietin.
Descriptions of some of the proteins that can be expressed according to the invention may be found, for example, in H
Receptors for any of the aforementioned proteins may also be expressed using secretory leader sequences, vectors and methods described herein. The receptors may include, for example, both forms of tumor necrosis factor receptor (referred to as p55 and p75), Interleukin-1 receptors (types 1 and 2), Interleukin-4 receptor, Interleukin-15 receptor, Interleukin-17 receptor, Interleukin-18 receptor, granulocyte-macrophage colony stimulating factor receptor, granulocyte colony stimulating factor receptor, receptors for oncostatin-M and leukemia inhibitory factor, receptor activator of NF-kappa B (RANK), receptors for TRAIL, and receptors that comprise death domains, such as Fas or Apoptosis-Inducing Receptor (AIR).
Other proteins can also be expressed using the secretory leader sequences, vectors and methods described herein. These proteins include, for example, cluster of differentiation antigens (referred to as “CD proteins” or “CD molecules”) such as those disclosed in L
Proteins that are enzymes may also be expressed employing the herein described secretory leader sequences, vectors and methods. These enzymes may include, for example, members of the metalloproteinase-disintegrin family, various kinases such as streptokinase, tissue plasminogen activator, Death Associated Kinase Containing Ankyrin Repeats, IKR 1, or IKR 2; TNF-alpha Converting Enzyme; and numerous other enzymes. Ligands for enzymes can also be expressed by applying the secretory leader sequences, vectors and methods of the instant invention.
The secretory leader sequences, vectors and methods described herein are also useful for the expression of other types of recombinant proteins. These recombinant proteins may include, for example, immunoglobulin molecules or portions thereof, as well as chimeric antibodies (e.g., antibodies that have human constant regions coupled to murine antigen-binding regions) or fragments thereof. Numerous techniques are known by which DNAs encoding immunoglobulin molecules can be manipulated to yield DNAs capable of encoding recombinant proteins such as single chain antibodies, antibodies with enhanced affinity, or other antibody-based polypeptides (see, e.g., Larrick et al. 1989; Reichmann et al. 1988; Roberts et al. 1987; Verhoeyen et al. 1988; Chaudhary et al. 1989).
The present invention provides recombinant vectors that contain, for example, nucleic acid constructs that encode one or more secretory leader sequences of interest or selected heterologous polypeptides of interest that are not necessarily secretory leader sequences, and host cells that are genetically engineered to incorporate the recombinant vectors.
The vector of the invention may be one that contains a selectable marker for propagation in a host and a secretory leader sequence such as one of those listed in Table 1. Such selectable markers may be, for example, dihydrofolate reductase; G418; neomycin-, or puromycin-resistance for eukaryotic cell cultures; or tetracycline-, kanamycin-, puromycin-, or ampicillin-resistance for E. coli and other bacterial cultures.
The vector of the invention may be, for example, a phage, plasmid, viral, or retroviral vector. Generally, a plasmid vector is introduced in a precipitate form, such as a calcium phosphate precipitate, or in a complex comprising a charged lipid. If the vector is a virus, it may be packaged in vitro using an appropriate packaging cell line, and then incorporated into host cells by transduction. A retroviral vector may be replication competent or replication defective. When it is replication defective, viral propagation generally occurs only in complementing host cells.
Among vectors useful in the present invention are the herein described vectors employing a pTT vector backbone (see, e.g.,
The nucleic acid constructs of interest may be a DNA that is operatively linked to an appropriate promoter. The appropriate promoter may be, for example, the phage lambda PL promoter, the E. coli lac, trp, phoA and tac promoters, the SV40 early and late promoters, or one of the promoters from retroviral LTRs. The promoters may also be, for example, the metallothionein promoters derived from the genome of mammalian cells. Alternatively, the promoters may be the adenovirus late promoters or the vaccinia virus 7.5K promoters derived from mammalian viruses. Other suitable promoters are known to the person skilled in the art.
The expression constructs further contain sites for transcription initiation, termination and, in the transcribed region, a ribosome binding site. The coding portion of the transcripts expressed by the constructs will preferably include an appropriately-positioned translation initiating codon at the beginning and a termination codon (UAA, UGA or UAG) at the end of the polypeptide to be translated. The heterologous polypeptides the polynucleotides encode may include, for example, extracellular fragments of secretable proteins, type I membrane proteins, type II membrane proteins, multi-membrane proteins, and soluble receptors.
A construct can be introduced into a host cell by calcium phosphate transfection, DEAE-dextran-mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection, or any other methods known to the person skilled in the art. Such methods are described in many standard laboratory manuals, such as by Davis et al., in B
A variety of host-expression vector systems may be used to express the polypeptides of the invention. Such host-expression systems are vehicles by which the coding sequences of interest may be produced and subsequently purified. These systems can also be cells that, when transformed or transfected with the appropriate nucleotide coding sequences, express the polypeptides of the invention. These systems may include, for example, microorganisms, such as bacteria like E. coli or B. subtilis, transformed with recombinant bacteriophage DNA, plasmid DNA, or cosmid DNA expression vectors that contain the polypeptide coding sequences. These systems may also include, for example, yeast such as Saccharomyces or Pichia, transformed with recombinant yeast expression vectors that contain the polypeptide coding sequences. They may also be insect cells infected with recombinant virus expression vectors such as baculovirus, which contain the polypeptide coding sequences. They may also be plant cells infected with recombinant virus expression vectors such as cauliflower mosaic viruses (“CaMV”) or tobacco mosaic viruses (“TMV”), or transformed with recombinant plasmid expression vectors such as Ti plasmids, which contain the polypeptide coding sequences. They may further include mammalian cells such as COS, CHO, BHK, 293, 293-6E, PER.C6®, 293T, or 3T3, which harbor recombinant expression constructs that contain promoters.
After the host cells are transfected with the vectors or DNA constructs encoding the polypeptides of interest, the cells are grown in appropriate media and under appropriate conditions to produce the polypeptides of the present invention.
Typically, a heterologous polypeptide may be expressed as a fusion protein. It may further include not only one or more of the secretion signals, but also one or more of the secretory leader sequences as exemplified in Table 1. The expression of such fusion proteins according to the invention is described in detail below.
Additionally, peptide moieties and/or purification tags may be added to the polypeptide to facilitate purification, improve stability, and engender secretion or excretion. The moieties and/or tags may be removed prior to the final steps of purification. The techniques are familiar and routine to one skilled in the art. In certain embodiments, such a tag may be a hexa-histidine peptide, such as the one provided in a pQE™ vector (QIAGEN™, Inc., Chatsworth, Calif.). Another peptide tag, the “HA” tag, which is an epitope derived from the influenza hemagglutinin protein, may also be fused with the polypeptide of the present invention (see Wilson et al. 1984). Other suitable purification tags may be, for example, V5, HISX8, avidin, or biotin.
In a certain embodiment, the fusion protein comprises a heterologous region from immunoglobulin, the presence of which may facilitate purification and may help to stabilize the purified protein. For example, EP-A-0 464 533 and its Canadian counterpart 2045869 describe fusion proteins comprising various parts of the immunoglobulin constant region (Fc) and a human protein or parts thereof. According to EP-A-0 232 262, the Fc regions in a fusion protein are thought to be advantageous for use in therapy and diagnosis because they tend to lead to improved pharmacokinetic properties. But for some other uses, it might be desirable to delete the Fc regions after the fusion protein has been expressed, detected and purified, especially when the Fc regions hinder the use of the polypeptide to which the regions are fused in therapy and diagnosis. For example, the deletion of Fc regions might be necessary when the fusion protein is used as an antigen for immunization.
The purification tags may also be used in drug discovery. For example, a human protein hIL-5 was fused with the Fc regions to facilitate the identification of hIL-5 antagonists using high-throughput screening assays (Bennett et al. 1995; Johanson et al. 1995).
A heterologous polypeptide of the invention can be purified from a recombinant cell culture by well-known methods, which include, for example, ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography. In a particular embodiment, high performance liquid chromatography (“HPLC”) is employed for purification. Polypeptides of the present invention may include, for example, products purified from directly-isolated or cultured natural sources such as bodily fluids, tissues and cells; products of chemical synthetic procedures; products produced by recombinant techniques from prokaryotic or eukaryotic hosts such as bacterial cells, yeast, higher plant cells, insect cells, mammalian cells; or products produced by recombinant techniques from cell-free expression systems.
The invention encompasses polypeptides that are differentially modified during or after translation, for example, by glycosylation, acetylation, methylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, or linkage to an antibody molecule or other cellular ligand. Any of these chemical modifications may be carried out by known techniques, including specific chemical cleavage by cyanogen bromide; digestion by trypsin, chymotrypsin, papain, or V8 protease; treatment by NaBH4; acetylation; formylation; oxidation; reduction; and metabolic synthesis in the presence of tunicamycin.
Depending upon the hosts employed in the recombinant production procedures, the polypeptides of the present invention may be glycosylated or non-glycosylated. A polypeptide of the invention may also include an initial modified methionine residue at the N-terminus, usually as the result of host-mediated processes. It is known in the art that the N-terminal methionine encoded by the translation initiation codon can generally be removed with high efficiency after translation in eukaryotic cells. While the N-terminal methionines can be efficiently removed from most prokaryotic proteins, the removal processes are not always efficient in prokaryotes. The efficiency depends on the nature and identity of the amino acids to which the N-terminal methionines are covalently linked.
Additional post-translational modifications according to the invention include, for example, N-linked or O-linked carbohydrate chains, processing of N-terminal or C-terminal ends, attachment of chemical moieties to the amino acid backbone, chemical modifications of N-linked or O-linked carbohydrate chains, or addition or deletion of an N-terminal methionine as the result of prokaryotic host cell expression. To facilitate detection and isolation of the protein, the polypeptide may also be modified with one or more detectable labels, which may be, for example, an enzymatic, fluorescent, isotopic, or affinity label.
Additional embodiments of the invention may be chemically modified derivatives of the polypeptides of the invention, which may provide additional advantages such as increased solubility, stability and circulating time for the polypeptides, or decreased immunogenicity in biological systems (U.S. Pat. No. 4,179,337). The chemical moieties used in derivatization may be selected from water soluble polymers such as polyethylene glycol, ethylene glycol/propylene glycol copolymers, carboxymethylcellulose, dextran, polyvinyl alcohol and the like. The polypeptides may be modified at random positions, or at predetermined positions within the molecule, and may include one, two, three, or more attached chemical moieties. Reaction conditions may be selected from any of those known in the art and those subsequently developed, but should be selected so that the protein to be modified is not exposed to harsh temperature, solvent, and pH conditions, or suffers only limited loss of activity from such exposure. In general, the larger the ratio of polymer to polypeptide conjugate, the greater the percentage of conjugated product. The optimum ratio, measured by the efficiency of the reaction, may be determined by factors such as the desired degree of derivatization (e.g. mono-, di-, tri- etc.), the molecular weight of the polymer selected, the degree of branching, and the reaction conditions. The ratio of polymer to polypeptide generally ranges from 1:1 to 100:1. One or more purified conjugates may be prepared from each mixture by standard purification techniques, which include, for example, dialysis, salting-out, ultrafiltration, ion-exchange chromatography, gel filtration chromatography, and electrophoresis.
A polymer may be of any molecular weight, and may be branched or unbranched. In certain embodiments, where the polypeptides of the invention are modified by polyethylene glycol, the molecular weight of the polyethylene glycol is from about 1 kDa to about 100 kDa. The term “about,” when used in the description of polyethylene glycol, is intended to suggest that, during the preparations of polyethylene glycol, some molecules will weigh more, some less, than the stated molecular weight. The size of the polyethylene glycol used in the modification may depend on the desired therapeutic profile, such as, for example, the desired duration of sustained release; the effects, if any, on biological activities; the ease of handling; the degree or the lack of antigenicity; as well as other known effects of the polyethylene glycol on a therapeutic protein or an analog.
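As a worked illustration of the polymer to polypeptide ratio discussed above, the following Python sketch converts a chosen ratio into the mass of polymer to add to a given mass of polypeptide. It assumes that the ratio is a molar ratio, which the description does not state explicitly; the function name `polymer_mass_mg` and the numbers in the usage note are hypothetical.

```python
def polymer_mass_mg(protein_mass_mg: float,
                    protein_mw_kda: float,
                    polymer_mw_kda: float,
                    molar_ratio: float) -> float:
    """Mass of polymer (mg) giving the requested polymer:polypeptide molar ratio.

    Moles of polypeptide are mass / molecular weight; the polymer mass is
    the ratio times those moles times the polymer molecular weight. The
    unit conversions cancel, leaving the expression below.
    """
    if min(protein_mass_mg, protein_mw_kda, polymer_mw_kda, molar_ratio) <= 0:
        raise ValueError("all arguments must be positive")
    return molar_ratio * protein_mass_mg * polymer_mw_kda / protein_mw_kda
```

For example, under these assumptions, reacting 1 mg of a hypothetical 20 kDa polypeptide with a 20 kDa polymer at a 5:1 molar ratio calls for 5 mg of polymer.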
There are a number of attachment methods available to those skilled in the art. For example, EP 0 401 384 describes the coupling of PEG to G-CSF. Malik et al. reported pegylation of GM-CSF using tresyl chloride (Malik et al. 1992). Polyethylene glycol may be covalently bound to a reactive group of an amino acid residue, which may be, for example, a free amino group, a carboxyl group, or a sulfhydryl group. In the context of pegylation, which means attaching polyethylene glycol moieties to a molecule, reactive groups are defined as groups to which an activated polyethylene glycol molecule may be bound.
One may specifically desire proteins chemically modified at the N-terminus. This may be accomplished by reductive alkylation, which exploits the different reactivities of primary amino groups such as the internal lysines and the N-terminal amino acid, which are available for derivatization. For example, one may selectively attach a polymer to the N-terminus of a protein by performing the reaction at a pH where only the α-amino group of the N-terminal residue, and not the ε-amino groups of the lysine residues, would be susceptible to the reaction, taking advantage of the pKa differences between these types of amino groups. The polymer used in reductive alkylation typically has a single reactive aldehyde. The N-terminally chemically modified protein may be separated from other monoderivatized moieties, if necessary, by purifying the N-terminally modified protein from a population of protein molecules that are modified elsewhere.
In a further embodiment of the invention, the heterologous polypeptides of the invention may be combined with one or more fusion partners to form fusion molecules. Such fusion molecules may advantageously provide improved pharmacokinetic properties when compared to their unmodified non-fused counterparts. These fusion molecules comprising the heterologous polypeptides of the invention may be prepared by a person skilled in the art who is apprised of the disclosures herein. Suitable chemical moieties for derivatization of a heterologous polypeptide in this regard may be, for example, polymers such as water soluble polymers; all or part of human serum albumin; fetuin A; fetuin B; or Fc regions.
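The pKa-based selectivity described above can be illustrated with the Henderson-Hasselbalch relationship, which gives the fraction of an amino group present in its unprotonated, reactive form at a given pH. The following Python sketch is illustrative only. The pKa values are assumed, textbook-typical figures that vary from protein to protein, the function names are hypothetical, and treating reactivity as proportional to the unprotonated fraction is a simplification.

```python
# Assumed, textbook-typical pKa values; actual values depend on the protein.
PKA_ALPHA_AMINO = 8.0     # N-terminal alpha-amino group (assumption)
PKA_EPSILON_AMINO = 10.5  # lysine epsilon-amino group (assumption)


def unprotonated_fraction(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of an amine present as the free base."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def n_terminal_selectivity(ph: float) -> float:
    """Ratio of reactive alpha-amino to reactive epsilon-amino fraction at ph."""
    return (unprotonated_fraction(PKA_ALPHA_AMINO, ph)
            / unprotonated_fraction(PKA_EPSILON_AMINO, ph))
```

Under these assumed values, the ratio at pH 5 is roughly 300-fold in favor of the α-amino group, whereas at strongly alkaline pH both groups are largely unprotonated and the ratio approaches 1, which is why a mildly acidic pH favors N-terminal modification.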
Specifically, a modified heterologous polypeptide of the invention may be prepared by attaching one or more polyaminoacids, peptide moieties, or branch-point amino acids to the polypeptide. Polyaminoacids are commercially available and widely used in drug delivery technology and other emerging technologies such as gene therapy. In addition to the advantages one may achieve with a fusion molecule as described above, the polyaminoacid may be a carrier that serves to increase the polypeptide's circulation half-life. For the therapeutic purpose of the present invention, such a polyaminoacid should ideally be one that does not generate neutralizing antigenic or other adverse responses. As described herein, the position at which the polyaminoacid is attached to the polypeptide or fusion polypeptide may be located at the N-terminus, C-terminus, or any other positions in between. The polyaminoacid may also be connected by a chemical “linker” moiety to either end of the selected polypeptide or fusion polypeptide.
A method for preparing a fusion molecule conjugated with one or more polymers, such as water-soluble polymers, is described above.
Additionally, heterologous polypeptides of the present invention and the epitope-bearing fragments thereof can be combined with parts of the immunoglobulin constant domain, resulting in chimeric polypeptides. These particular fusion molecules facilitate purification and tend to show an increased half-life in vivo when compared to their pre-fusion counterparts. Examples of these chimeric polypeptides include the chimeric proteins comprising the first two domains of the human CD4-polypeptide and various domains of the mammalian immunoglobulin constant regions (EP A 394,827; Traunecker et al. 1988). A fusion molecule having a disulfide-linked dimeric structure tends to be more efficient in binding and neutralizing other molecules than, for example, a monomeric polypeptide or fragment (Fountoulakis et al. 1995).
In another embodiment, a human serum albumin fusion molecule may also be prepared as described herein and as further described in U.S. Pat. No. 6,686,179, which is hereby incorporated by reference in its entirety.
Moreover, the polypeptides of the present invention can also be fused to a purification tag, which is a peptide region that facilitates the purification of the polypeptide of which it is a part. The method of fusing the tag to the polypeptide of interest is described herein.
It will be clear to those skilled in the art that the invention may be practiced in ways other than those particularly described in the foregoing descriptions and the examples herein. Many modifications and variations of the present invention are possible in light of the teachings herein and, therefore, are within the scope of the appended claims.
Recombinant technologies allow for expression of proteins in vitro or in vivo. Examples of in vitro systems for protein expression include cell-free systems such as rabbit reticulocyte lysates and wheat germ extracts, and cell-based systems such as bacteria, insect cells, yeast cells and mammalian cells (for example, CHO cells, 293 cells, and human embryonic retinal PER.C6® cells (Crucell, Netherlands)). In vivo expression of recombinant proteins is useful, for example, in the generation of transgenic animals in which the transgene(s) encodes protein(s) tagged with markers such as, for example, Green Fluorescent Protein and its variants or β-galactosidase. Such tags allow for easier visualization, tracking and/or isolation of the cells in which the tagged protein is expressed. Another example of in vivo expression of recombinant proteins is the use of transgenic mice, or of cells implanted into mice, that have been genetically modified for the expression of secretable proteins. The latter can be proteins that, for example, are thought to promote tumor development, or to work as hormones, growth factors, and/or survival factors. In that setting, it can be important to obtain various levels of protein secretion (low, moderate, high) in order to obtain a specific result (e.g. tumor promotion). Many proteins are not efficiently secreted when expressed in recombinant settings. In that case, it is useful to be able to replace, via recombinant methods, the endogenous leader sequence of such a protein with a leader sequence that is capable of driving its efficient secretion.
It is often useful to confirm that a given isolated cDNA is capable of supporting the expression of the protein which its nucleotide sequence encodes in vitro, before the cDNA is used to express that protein in vivo. This process may also serve, for example, to obtain further information regarding the post-translational modifications that the protein undergoes in a specific host cell (e.g. CHO cells versus PER.C6® cells), and the activity of the protein. In the case of a secretable protein, the cDNA sequence may encode its full-length form, its mature form (i.e., the protein without the leader sequence), or any other part of the protein, such as a particular domain.
Preparation of Plasmid Templates for Recombinant Protein Expression in Cell-Free Systems.
To recombinantly express a cDNA encoding the mature form of any protein of interest, it is often useful that the cDNA be modified in order to include, in addition to the coding sequence, a translational initiation site/translational enhancer (e.g. KOZAK sequence, Omega sequence, Non-Omega sequences). In this example, the mature form refers to the most typical product of secretion, which is the protein without the signal peptide. Furthermore, if no antibody exists for the protein of interest, a tag may also be added which facilitates both the detection and the purification processes. Examples of such tags are Glutathione-S-Transferase (GST), and the epitopes V5, HisX6, and HisX8 (H8). The addition of these features to a cDNA encoding a protein of interest can be done by a variety of cloning methods. If no appropriate restriction enzyme sites are present in the cDNA of interest, PCR amplification methods such as those described below can also be used during the cloning process. A cloning process that involves three PCR steps and results in a mature ORF tagged with Glutathione-S-Transferase is exemplified below.
To begin, a first plasmid containing the cDNA sequence encoding the mature open reading frame (mature ORF) of interest was provided for the first PCR. To add the translational initiation site/translational enhancer to the 5′ region of the coding sequence for the mature ORF, a nucleotide primer (forward primer FP1) was designed and synthesized, which contained 5′ GTTCTGTTCCAGGGGCCC 3′ followed by the first nineteen nucleotides predicted to encode the amino terminus of the mature secretable protein of interest. A second primer (reverse primer RP1) was designed and synthesized, based on a region of the plasmid approximately 1000 nucleotides downstream from the coding sequence (mature ORF) of the cDNA to be expressed. Specifically, the RP1 primer was designed as the reverse complement of the vector sequence in this region such that RP1 could be used with FP1 in a PCR to amplify the mature ORF. The exact sequence of RP1 would vary depending on the starting plasmid, but it was typically 17-23 nucleotides long with a Tm of approximately 55-65° C.
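The construction of FP1 and RP1 described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example ORF and vector sequences are hypothetical placeholders, and only the 18-nucleotide 5′ tail of FP1 and the nineteen-nucleotide gene-specific length are taken from the text.

```python
# Fixed 5' tail of FP1, as given in the text.
FP1_TAIL = "GTTCTGTTCCAGGGGCCC"
# Number of ORF nucleotides appended to the tail, as given in the text.
GENE_SPECIFIC_LENGTH = 19

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_fp1(mature_orf: str) -> str:
    """FP1 = fixed tail + first nineteen nucleotides of the mature ORF."""
    orf = mature_orf.upper()
    if len(orf) < GENE_SPECIFIC_LENGTH:
        raise ValueError("mature ORF is shorter than the gene-specific region")
    return FP1_TAIL + orf[:GENE_SPECIFIC_LENGTH]


def build_rp1(vector_region: str) -> str:
    """RP1 = reverse complement of the chosen downstream vector region."""
    return reverse_complement(vector_region)


# Hypothetical sequences, used only to exercise the functions.
example_orf = "GCTCCTGAGGAAGAGGACCACGTCCTGGTG"
example_vector_region = "CTAGAGGGCCCGTTTAAACC"

fp1 = build_fp1(example_orf)
rp1 = build_rp1(example_vector_region)
print("FP1:", fp1)
print("RP1:", rp1)
```

The sketch does not estimate melting temperature; the 55-65° C. window stated above would still have to be checked with whatever Tm method the laboratory normally uses.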
The purified starting plasmid containing the cDNA to be expressed as a mature ORF, or E. coli cells containing the plasmid, was then added as a template to a standard PCR, which included the two primers (FP1 and RP1), as described above, standard PCR reagents, and a DNA polymerase. The reaction mixture was then subjected to 15-30 cycles of PCR amplification. The product of this first PCR is called the “PCR1 coding template” for the purpose of this application.
A separate PCR was performed to prepare a “GST-Mega primer,” whose purpose was to provide the GST portion of the final GST-mature ORF expression template in the second PCR step. To this end, a different starting plasmid template was used, for example, one containing a GST coding sequence downstream from the Non-Omega translation initiation sequence, and which is herein referred to as “template 2.” It is often useful that the GST fusion protein is linked to the mature ORF via a cleavable bridge. To this end, the template might have a GST protein modified to include a protease-cleavable sequence, such as one sensitive to thrombin, or to the commercially available PreScission™ Protease (Amersham, N.J.). This allows for the two proteins, mature ORF and GST, to be separated at the end of the purification procedure by protease-mediated cleavage. Thus, a PCR was prepared to amplify “template 2” using two primers: FP2, of sequence 5′ GGTGACACTATAGAACTCACCTATCTCCCCAACA 3′; and RP2, of sequence 5′ GGGCCCCTGGAACAGAACTTC 3′. The amplification took place for 15 to 30 cycles in a standard PCR mixture that included template 2, the two primers described above (FP2 and RP2), standard PCR reagents, and a DNA polymerase. After the PCR was complete, the amplification product was treated with exonuclease I for 30 minutes at 37° C., and then heat-inactivated at 80° C. for 30 minutes. The product was then purified by agarose gel electrophoresis and extracted using a gel purification kit (Amersham, N.J.), producing the “GST-Mega primer.” The “GST-Mega primer” was, in fact, one of the two templates used in the second PCR that yields a GST-fusion expression template. The other template of the final reaction was the “PCR1 coding template,” prepared as described above.
The final construct, which was the mature ORF/GST fusion expression template, was prepared as follows. The two templates “GST-Mega primer” and “PCR1 coding template” were combined via the second PCR involving the mature ORF. This PCR reaction mix included: (i) standard PCR reagents; (ii) a DNA polymerase; (iii) an aliquot of the “PCR1 coding template” (e.g., 0.5 μl); (iv) an aliquot of the “GST-Mega primer” (e.g., 1 μl); (v) a fifth primer, FP3, of sequence 5′ GCGTAGCATTTAGGTGACACT 3′, which comprised part of the SP6 promoter sequence, and was annealed to the 5′ end of the “GST-Mega primer” via its common 3′ end (compare underlined sequences); and (vi) a sixth primer, RP3, which was designed as the reverse complement of the vector sequence in the same region of the vector as RP1 but starting three nucleotides upstream of RP1 to specifically anneal only on the full-length PCR1 coding template; RP3 is typically 17-23 nucleotides long with a Tm of approximately 55-65° C., and can be used in amplifying the “PCR1 coding template.” After 15-30 cycles of PCR amplification, the “Mature ORF/GST-fusion expression template” was thus generated.
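The way the primers chain together in this overlap-extension scheme can be checked directly from the sequences printed above. The following Python sketch is offered as an illustration only; the helper names are hypothetical. It measures the shared sequence between the 3′ end of FP3 and the 5′ end of FP2, and between the reverse complement of RP2 (the 3′ end of the sense strand of the GST-Mega primer) and the fixed 5′ tail of FP1.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def suffix_prefix_overlap(upstream: str, downstream: str) -> int:
    """Length of the longest suffix of `upstream` equal to a prefix of `downstream`."""
    for k in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream[-k:] == downstream[:k]:
            return k
    return 0


# Primer sequences exactly as printed in the text (5' to 3').
FP1_TAIL = "GTTCTGTTCCAGGGGCCC"
FP2 = "GGTGACACTATAGAACTCACCTATCTCCCCAACA"
RP2 = "GGGCCCCTGGAACAGAACTTC"
FP3 = "GCGTAGCATTTAGGTGACACT"

# FP3 anneals to the 5' end of the GST-Mega primer through its 3' end.
fp3_fp2_overlap = suffix_prefix_overlap(FP3, FP2)

# The 3' end of the GST-Mega primer (sense strand) overlaps the FP1 tail
# carried at the 5' end of the PCR1 coding template.
mega_fp1_overlap = suffix_prefix_overlap(reverse_complement(RP2), FP1_TAIL)

print("FP3/FP2 shared nucleotides:", fp3_fp2_overlap)
print("GST-Mega primer/FP1 tail shared nucleotides:", mega_fp1_overlap)
```

With the sequences as printed, the sketch reports a 9-nucleotide overlap between FP3 and FP2 and an 18-nucleotide overlap between the GST-Mega primer and the FP1 tail, which is consistent with the two products being joined in the second PCR.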
Expression of GST-Fusion Expression Templates in Wheat Germ Extracts.
In order to express a mature protein of interest in a cell-free system, the mRNA can be both transcribed and translated from the “Mature ORF/GST-fusion expression template” in the same reaction, or in separate reactions. A separate in vitro transcription reaction (50 μl) can be prepared with 5 μl of the “GST-fusion expression template” in the following buffer: 80 mM HEPES KOH pH 7.8, 16 mM Mg(OAc)2, 2 mM spermidine, 10 mM DTT, 1 unit of SP6 (Promega, Wis.) and 1 unit of RNasin (Promega, Wis.). The reaction mixture is incubated for 3 hours at 37° C. The resulting mRNA is subjected to ethanol precipitation in a solution containing 200 μl of RNase-free water, 37.5 μl of 5 M ammonium acetate, and 862 μl of 99% ethanol. The ethanol precipitation comprises the steps of mixing by vortexing and pelleting by centrifugation at 15,000×g for 10 minutes at 4° C. The mRNA pellet is then washed in 70% ethanol and again pelleted by centrifugation at 15,000×g for 5 minutes at 4° C., after which steps the pelleted mRNA is ready for in vitro translation.
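The precipitation volumes above can be checked with a short calculation. The Python sketch below assumes, which the text does not state explicitly, that the entire 50 μl transcription reaction is part of the aqueous phase; the function name is hypothetical.

```python
def ethanol_precipitation_summary(
    reaction_ul: float = 50.0,       # transcription reaction (assumed to be included)
    water_ul: float = 200.0,         # RNase-free water
    salt_ul: float = 37.5,           # 5 M ammonium acetate
    salt_stock_molar: float = 5.0,
    ethanol_ul: float = 862.0,       # 99% ethanol
) -> dict:
    """Summarize the aqueous volume, salt concentration and ethanol ratio."""
    aqueous_ul = reaction_ul + water_ul + salt_ul
    return {
        "aqueous_ul": aqueous_ul,
        # Ammonium acetate concentration in the aqueous phase before ethanol.
        "salt_molar": salt_stock_molar * salt_ul / aqueous_ul,
        # Ethanol volume expressed in multiples of the aqueous volume.
        "ethanol_volumes": ethanol_ul / aqueous_ul,
    }


summary = ethanol_precipitation_summary()
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under that assumption the aqueous phase is 287.5 μl, the ammonium acetate is at roughly 0.65 M, and the ethanol amounts to approximately three volumes.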
Wheat germ extracts can be used for in vitro translation of the mRNA, prepared separately as described above. First, a stock solution of 2× Dialysis Buffer was prepared by mixing two separate stocks. The first stock contained 20 mM HEPES KOH buffer pH 7.8, 200 mM KOAc, 5.4 mM Mg(OAc)2, 0.8 mM spermidine, 100 μM DTT, 2.4 mM ATP, 0.5 mM GTP, 32 mM creatine phosphate, 0.02% NaN3, and 0.6 mM of an amino acid mix that did not contain aspartic acid, tryptophan, glutamic acid, isoleucine, leucine, phenylalanine and tyrosine. The second stock contained an 80 mM mix of the amino acids aspartic acid, tryptophan, glutamic acid, isoleucine, phenylalanine and tyrosine in 1 N HCl. After all the amino acids in the second stock were dissolved, the two stocks were mixed, so that the final concentration of the second-stock amino acids was 0.6 mM. The 2× Dialysis Buffer stock was then adjusted to pH 7.6 using 5 N KOH, filter sterilized, and stored frozen in aliquots at −80° C.
To resuspend the in vitro transcribed mRNA (prepared separately as described above), a 50 μl “translation mixture” was prepared that included Wheat Germ Reagent (Promega, Wis.) at a final OD260 nm of 60 prepared in 1× Dialysis Buffer containing 2 mM dithiothreitol (DTT). After removing the supernatant (ethanol) from the precipitated mRNA, the 50 μl “translation mixture” was added to the precipitate and allowed to sit for 5-10 minutes before the mRNA was resuspended into the translation mixture. The complete translation mixture containing the resuspended mRNA was then layered under 250 μl of 1× Dialysis Buffer that had already been added to one well of a 96-well round-bottom microtiter plate to set up a Bilayer Reaction. The plate was then sealed manually with a plate seal and the in vitro translation reaction was allowed to incubate for 20 hours at 26° C.
At the end of the in vitro translation reaction period, and to recover the recombinant mature ORF protein expressed as a GST fusion, the translation mixture was transferred to a tube and diluted five-fold with phosphate-buffered saline (PBS) containing 0.25 M sucrose and 2 mM DTT. Ten microliters of glutathione (GSH)-sepharose beads (Amersham-Pharmacia Biotech, N.J.), to which the Glutathione-S-Transferase (GST) protein binds, were then added to the mixture, which was then incubated at 4° C. for 3 hours, with constant agitation to ensure mixing. The GSH-sepharose beads, containing the bound GST-fusion protein, were then washed three times in PBS containing 0.25 M sucrose and 2 mM DTT. If the mature ORF and the GST were recombinantly engineered to be fused via a protease-cleavable bridge, a fourth wash was then performed in a protease-cleavage buffer containing 50 mM Tris pH 7.4, 150 mM NaCl, 1 mM EDTA, 2 mM DTT, and 0.25 M sucrose. This wash buffer was also called the “final wash buffer.” After the wash buffer was carefully removed from the bead mixture, 10 μl of the final wash buffer collected from the last step was mixed with the beads, and 0.4 μl of the appropriate protease such as PreScission™ Protease (Amersham, N.J.) was added to the mixture. A pipette was then used to gently suspend the beads. This bead mixture/suspension was then allowed to sit overnight at 4° C. To recover the cleaved mature ORF protein product, 20 μl of the final wash buffer was added and the entire liquid fraction (without the beads) was recovered by pipetting (after allowing the beads to settle), or by filtering through a sintered frit.
Aliquots of the recovered liquid fraction (containing the purified mature protein) were analysed by ELISA and/or Coomassie/Silver Staining of SDS-PAGE gels, in order to quantify the level of expression of the mature protein.
To stabilize the recovered mature ORF protein, a solution of 10 mg/ml purified BSA in PBS was added to the purified protein solution so that the final concentration of BSA became about 1 mg/ml. The protein sample was then dialyzed in PBS and filter-sterilized for storage. Western blot analysis can be done from aliquots recovered throughout various steps along the purification procedure to assess, for example, the level of protein expression, and to determine whether or not the protein translated corresponds to the protein expected to be encoded by the cDNA of interest, both in terms of its length and its sequence. The protein can also be used in future characterization studies, such as biological activity measurements, mass-spectrometry, and post-translational modification assays. To produce additional protein from the same mRNA template, the single Bilayer Reaction can be repeated multiple times, and the purification and formulation can be scaled accordingly.
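The volume of BSA stock needed for the stabilization step follows from a simple mass balance, C_stock × V_add = C_final × (V_sample + V_add). The Python sketch below illustrates it; the function name and the 90 μl sample volume are hypothetical, and the calculation assumes the sample contains no BSA beforehand.

```python
def bsa_stock_volume_ul(
    sample_ul: float,
    stock_mg_per_ml: float = 10.0,   # BSA stock concentration from the text
    final_mg_per_ml: float = 1.0,    # target final BSA concentration from the text
) -> float:
    """Volume of BSA stock to add so the mixture reaches the final concentration."""
    if not 0 < final_mg_per_ml < stock_mg_per_ml:
        raise ValueError("final concentration must be positive and below the stock")
    # Rearranged mass balance: V_add = V_sample * C_final / (C_stock - C_final)
    return sample_ul * final_mg_per_ml / (stock_mg_per_ml - final_mg_per_ml)


# Hypothetical example: 90 microliters of purified protein solution.
volume_to_add = bsa_stock_volume_ul(90.0)
print(f"Add {volume_to_add:.1f} microliters of 10 mg/ml BSA")
```

In other words, one part of the 10 mg/ml stock is added to nine parts of sample.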
Typically, sixteen Bilayer Reactions (set up as described above) will produce sufficient biologically active protein for testing in most typical assays such as biological activity assays. Since these reactions are done in 96-well plates, this expression system is suitable for high-throughput assays in which multiple cDNAs of interest can be translated simultaneously in separate wells. Once a cDNA is shown to be capable of encoding a specific protein of interest in wheat germ extracts, it can be desirable to express larger amounts of protein than those typically obtainable with this expression system. It can also be desirable to compare the post-translational modifications that a given protein undergoes in different cell systems, for example those that occur in a plant-based system such as the wheat germ lysates, with those that occur in a mammalian system (e.g. CHO cells, 293 cells, PER.C6® cells).
Evaluation of the Expression Levels of Various Signal-Peptide-Less Mature Proteins
Column 3 (“Highest Expressors”) of Table 3 summarizes the results of a high-throughput expression experiment aimed at comparing the expression levels of various proteins of the invention, without their endogenous signal peptide and under standardized conditions. Starting with a set of cDNAs that included those encoding the full-length proteins listed in Table 1 and Appendix A, mature ORF templates were prepared as described in detail in the previous paragraphs, to express the mature version of each protein (i.e., the protein without its endogenous signal peptide). After purification, the expression levels were quantified by Coomassie-stained SDS-PAGE, silver-stained SDS-PAGE, or quantitative Western blot using an anti-V5 antibody relative to purified V5-tagged protein standards, and 56 of the “highest expressors” were ranked from 1 (high) to 56 (low) based on their expression levels relative to one another. Under these standardized conditions, among the “highest expressors” of column 3/Table 3, the very highest expressor (ranked 1) was the mature version of the beta-subunit of prolyl 4-hydroxylase (CLN00517790); a moderate expressor (ranked 20) was the mature version of the long form of alpha I collagen type IX (CLN00517648); and the lowest expressor (ranked 56) was WFIKKN-related protein (CLN00463474).
The next set of assays aimed at comparing proteins on the basis of the amounts that could be recovered from the conditioned media (i.e., on the basis of “secretion”). The cDNAs used for Example 1, Table 1 and Table 2, were subcloned into modified versions of the pTT mammalian expression vector, and the proteins were expressed with their endogenous signal peptides/leader sequences, in mammalian cells. After quantifying the levels of the resulting protein present in the conditioned media, proteins were ranked again, this time from “high secretors” to “moderate secretors” to “low secretors.”
Later on, this information served as the baseline to assess whether one could improve secretion of a protein by re-engineering its signal peptide/leader sequence. This “re-engineering” was done by replacing the endogenous signal peptide of a “low secretor” protein with that of a “moderate” or “high secretor.”
In order to proceed with the above re-engineering, the amino acid sequence corresponding to the signal peptide/leader sequence of each of the proteins of the invention first had to be identified (Appendix A, Tables 1 and 2). Based on a defined set of attributes, cDNAs from an existing library can be predicted bioinformatically to encode secretable proteins. For example, a signal peptide is typically encoded by the first 6-27 amino acid codons (18-81 nucleotides) of the ORF, and it usually begins with 1-4 polar amino acids, followed by a stretch of hydrophobic amino acids, and then followed by a short region of charged amino acids just before the site where the secretion-related cleavage takes place. Using these attributes, together with other physical characteristics, cDNAs can be predicted to encode secretable proteins while the identities of the proteins may remain unknown. The results of one such analysis, done on our complete cDNA library, are summarized in Appendix A and Tables 1 and 2. A current limitation is that one still cannot predict whether or not the presence of a putative signal peptide/leader sequence allows a protein containing said leader sequence to be secreted in vitro or in vivo, or what the efficiency of this process will be.
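The attributes listed above lend themselves to a simple first-pass screen. The Python sketch below is a toy heuristic only, not the bioinformatic method used to generate Appendix A or Tables 1 and 2: the set of residues treated as hydrophobic, the minimum core length of six, the function names, and both example sequences are assumptions chosen for illustration. Only the 27-residue upper bound and the codon-to-nucleotide conversion come from the text.

```python
# Residues treated as hydrophobic in this sketch (an assumption).
HYDROPHOBIC = set("AVLIMFWC")

MAX_SIGNAL_PEPTIDE_CODONS = 27  # upper bound given in the text
MIN_SIGNAL_PEPTIDE_CODONS = 6   # lower bound given in the text


def codons_to_nucleotides(codons: int) -> int:
    """Each amino acid codon spans three nucleotides."""
    return 3 * codons


def longest_hydrophobic_run(peptide: str) -> int:
    """Length of the longest uninterrupted run of hydrophobic residues."""
    best = run = 0
    for residue in peptide.upper():
        run = run + 1 if residue in HYDROPHOBIC else 0
        best = max(best, run)
    return best


def has_candidate_leader(protein: str, min_core: int = 6) -> bool:
    """Flag a protein whose N-terminal window contains a hydrophobic core."""
    window = protein[:MAX_SIGNAL_PEPTIDE_CODONS]
    return protein.upper().startswith("M") and longest_hydrophobic_run(window) >= min_core


# Hypothetical sequences, not taken from the tables of this application.
secreted_like = "MKRLLLLAVLLALSAGAQAEEPTT"
cytosolic_like = "MSDEEKRKSTDPNQGSEDKKRSTE"

print(codons_to_nucleotides(MIN_SIGNAL_PEPTIDE_CODONS),
      codons_to_nucleotides(MAX_SIGNAL_PEPTIDE_CODONS))
print(has_candidate_leader(secreted_like), has_candidate_leader(cytosolic_like))
```

A screen of this kind only flags candidates; as noted above, it cannot predict whether, or how efficiently, a protein carrying the putative leader sequence will actually be secreted.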
Preparation of the Expression Vectors for High-Throughput Screening of Leader Sequences
In order to identify signal peptides or secretory leader sequences that yield high secretion levels in proteins, a set of cDNAs predicted to encode secretable proteins (using a cDNA library existing in house and the methods described above) was subcloned into one of several modified versions of the pTT5 expression vector (
The plasmid DNAs for each cDNA clone inserted into pTT5 were purified using the QIAGEN™ TURBO™ DNA system in 96-well plates. The DNA concentration for each clone was determined by absorbance at 260 nm, and subsequently adjusted, for example, to a concentration of 50 μg/ml in a suitable buffer. The expression/secretion assays were done after the resulting pTT5-based constructs were transiently transfected into 293T cells (ATCC®, VA) using a high-throughput 96-well system. These steps are described next.
High-Throughput Transfection in 96-Well Plates
For transient transfection of ten 96-well plates, 10 μl of each cDNA plasmid were combined with 50 μl of GIBCO® OPTI-MEM I™ (GIBCO, Gaithersburg, Md., Cat#: 319-85-070) in separate wells (one for each cDNA) of a round-bottom 96-well polystyrene plate. This plate was named the “master transfection plate.” Then, 37.5 μl of each OPTI-MEM I™/cDNA mix were pre-incubated for 5 minutes with 2.5 μl of FUGENE™ 6 transfection reagent (Roche Applied Science, Palo Alto, Calif., cat#1988387) in separate wells (one for each cDNA) of another round-bottom 96-well polystyrene plate. The mixture was then incubated at room temperature for about 30 minutes, resulting in one “transfection complex” per cDNA.
Each transfection complex was subsequently diluted by the addition of 100 μl of OPTI-MEM I™, mixed several times by repeated pipetting, and then transferred 20 μl at a time into ten separate wells. Each well was on a separate 96 well flat bottom poly-lysine-coated plate (Becton Dickinson, Rockville, Md., cat# 356461) to facilitate collection of samples for up to 10 different assays after transfection. Each plate contained up to 96 different cDNAs.
Two hundred microliters of a suspension of 2×10⁵ cells/ml of 293T cells in DMEM medium (containing 10% FBS, penicillin and streptomycin) were then added to each well. The different mixtures of cells and diluted transfection complex were allowed to incubate at 37° C. in 5% CO2. After approximately 40 hours, the medium was removed from the wells by aspiration, the cells were briefly washed with 150 μl PBS, and new pre-warmed medium was added.
To prepare the set of transfected cells used for the purpose of assaying the expression and secretion levels of each protein, 150 μl of fresh HYQ-PF™ CHO Liquid Soy medium (Hyclone, Logan, Utah, Cat# SH30359.02) were added to each well.
To prepare the set of transfected cells used for the purpose of assaying the activity of the secreted protein, 150 μl fresh DMEM medium containing 5% FBS, penicillin and streptomycin were added to the wells instead of the HYQ-PF™ CHO Liquid Soy, and the resulting mixtures were incubated at 37° C. in 5% CO2.
After an additional 48 hours, during which the various cDNA expressed their respective secretable proteins, the culture supernatants from all ten 96-well plates were harvested and, when appropriate, combined into a single sterile deep-well plate, covered with a sterile lid. The deep-well plates were centrifuged at 1,400 RPM for 10 minutes to pellet any loose cells or cell debris. The supernatants were then transferred to new sterile deep-well plates so that the level of protein released into the conditioned media (i.e. secreted protein) could be measured. This was achieved by Western blot using anti-V5-HRP antibody and sandwich ELISA using the anti-penta-HIS antibody as a capture step and anti-V5-HRP to detect expression and measure expression levels relative to purified V5His standard. The layer of cells, which remained attached to the plates, was solubilized with 0.2% SDS, 0.5% NP-40 in PBS; the resulting cell lysates were used to assay the levels of protein in the cell lysates by ELISA.
In the first set of screening assays, a subset of leader sequences was identified that was shown to correlate with high secretion levels of the proteins they belonged to. The results of a high-throughput secretion assay, done following the steps just described, are shown in
The high-throughput assay described in detail in Example 2 provided a panel of cDNAs from the “highest expressor” proteins with levels of secretion which varied from “low secretor proteins” to “high secretor proteins.” For a summary of their identity and properties, see Tables 1, 2 and 3. The next question was whether the signal peptide/leader sequences of the high secretors were transferable to other proteins. More importantly, we asked whether the secretion of “low secretor proteins” could be improved by replacing their endogenous leader sequence with one taken from any one of the “high secretor proteins” of the invention. To this end, a series of experiments were conducted, using standard subcloning techniques, transfection and expression methods essentially as described in detail in Examples 1 and 2. One of these experiments is exemplified next.
The signal peptide/leader sequence from CLN00517648 was used to replace the signal peptide/leader sequence of a panel of proteins, which in the initial sets of high-throughput expression and high-throughput secretion assays had been shown to be low expressor proteins, low secretor proteins, or both. The proteins encoded by the resulting re-engineered cDNAs, which carried the heterologous leader sequence of the high secretor clone CLN00517648 instead of their own endogenous leader sequence, were found to have become high secretor proteins from what otherwise had been low expressor/low secretor proteins. Indeed, the signal peptide/leader sequence of CLN00517648 is capable of enhancing the secretion of type I proteins and type II proteins. Some specific examples of proteins whose secretion was improved by this process include cDNA constructs encoding the following proteins: human CD30 Ligand, SCDFR1 Ox40 Ligand, all of which were engineered to replace their endogenous signal peptide/leader sequence with that of CLN00517648 according to the process described in Examples 1 and 2. Moreover, the total level of expression of the modified proteins was also increased by this substitution. This was determined by quantifying the total levels of protein in both cell lysates and conditioned media. Thus, the signal peptide/leader sequence from CLN00517648 can enhance both the expression and the secretion of low expressor proteins.
The high-throughput results described above, showing improvements in secretion and/or expression levels of low secretors and/or low expressors by replacing their endogenous leader sequence with that of either CLN00517648 or of another protein (heterologous leader sequence) selected from the list of “highest expressors” (see Table 3, column 3), were further confirmed using the scale-up procedures described in Example 4.
An alternative to the 96-well high-throughput transfection-expression assay is one in which both the transfection and the expression are done in larger scale protocols. These can use, for example, 293-6E cells provided by Y. Durocher grown in shaker flasks rather than 96-well plates. For the high-throughput process, the 293-6E cells can be treated with the same reagents and subjected to the same conditions as the ones used for the 293T cells, except that PEI is used for DNA transfection in shake flasks instead of FUGENE™ 6.
For the scale-up process, the 293-6E cells were grown in polycarbonate Erlenmeyer flasks fitted with a vented screw cap and rotated on a table top shaker at 100 RPM in FREESTYLE™ Medium (INVITROGEN®, Carlsbad, Calif.) at 37° C. in 5% CO2. The cell densities in those flasks were maintained in a range from 0.5×10⁶ to 3×10⁶ cells/ml. Typically 50 ml cultures were grown in 250 ml flasks. One day before transfection, 293-6E cells were diluted into fresh FREESTYLE™ Medium to a cell density of about 0.6×10⁶ cells/ml. On the day of transfection, the cells were predicted to be in the log phase, which is characterized by a cell density range of 0.8×10⁶ to 1.5×10⁶ cells/ml. The volumes of the log-phase cell cultures were adjusted so that their cell densities were about 10⁶ cells/ml.
For each cDNA, a different transfection mix was prepared. To prepare each transfection mix, 2.5 ml sterile PBS were added to each of two 15-ml tubes. The first tube also contained 50 μg DNA. The second tube also contained 100 μl PEI solution, a 1 mg/ml sterile stock solution of linear 25 kDa polyethylenimine, pH 7.0 (from Polysciences, Warrington, Pa.). The solutions in the two tubes were then combined and allowed to incubate together for 15 minutes at room temperature, yielding the transfection complex. The transfection complex was then transferred to a 293-6E suspension culture and allowed to grow for 4-6 days at 37° C. in 5% CO2; this process was repeated for each cDNA.
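The volume adjustment on the day of transfection is a straightforward dilution. The Python sketch below illustrates it under the assumption that the adjustment is made by adding fresh medium only; the function names and the example culture are hypothetical, while the log-phase range and the target density come from the text.

```python
LOG_PHASE_MIN = 0.8e6    # cells/ml, lower bound of the log-phase range in the text
LOG_PHASE_MAX = 1.5e6    # cells/ml, upper bound of the log-phase range in the text
TARGET_DENSITY = 1.0e6   # cells/ml, target density at transfection


def in_log_phase(density_cells_per_ml: float) -> bool:
    """True when the measured density lies in the stated log-phase range."""
    return LOG_PHASE_MIN <= density_cells_per_ml <= LOG_PHASE_MAX


def medium_to_add_ml(culture_ml: float, density_cells_per_ml: float,
                     target: float = TARGET_DENSITY) -> float:
    """Fresh medium needed to dilute the culture down to the target density."""
    if density_cells_per_ml < target:
        raise ValueError("culture is already below the target density")
    final_volume_ml = culture_ml * density_cells_per_ml / target
    return final_volume_ml - culture_ml


# Hypothetical example: a 50 ml culture counted at 1.5 x 10^6 cells/ml.
extra_medium = medium_to_add_ml(50.0, 1.5e6)
print(in_log_phase(1.5e6), f"{extra_medium:.1f} ml of fresh medium")
```

A culture counted below the target density cannot be corrected by dilution, which is why the sketch raises an error in that case rather than returning a negative volume.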
To determine protein secretion levels, culture supernatants were analyzed by Western blot. Samples (15 μl per cDNA) were resolved by SDS-PAGE on 26-lane CRITERION™ gels (Bio-Rad, Inc., Hercules, Calif.) and transferred to nitrocellulose membranes. The membranes were blocked, and probed with an antibody against the specific epitope introduced at the cloning step. For example, for proteins tagged with a V5 and/or a HisX8 epitope, either an anti-V5 or an anti-HisX8 epitope antibody, conjugated to HRP (INVITROGEN®, Carlsbad, Calif.), was used. The HRP signal was developed using standard HRP chemiluminescence substrates (ECL Detection Kit, Amersham).
Secretion levels were determined by comparing the intensity of signal obtained for each secreted protein to that of one of three purified mass standards (for example, 15 μl of standards at 8, 33, and 133 ng/ml) that were loaded into separate lanes of the same gels. The comparison involved determining the area of the bands present on either the Coomassie-stained gel, the silver-stained gel, or the Western blot; this process was done with an image scanner and NIH Image freeware, which can be downloaded from the Scion Corporation website. Various protein standards were used. Examples include a V5-HisX6-tagged Delta-like protein 1 extracellular protein, a V5-HisX6-tagged CSF-1 Receptor extracellular domain, and/or a POSITOPE™ (INVITROGEN®, Carlsbad, Calif., cat#: R900-50) containing a V5-HisX6 tag. These standards can be expressed separately using, for example, a baculovirus expression system, and purified to >90% purity.
The combined results from the experiments described in Examples 1-4 suggest a classification of the leader sequences of the invention according to their ability, in their role as heterologous leader sequences, to improve secretion and/or expression of the proteins they are inserted into. The leader sequences are accordingly classified under categories such as “high secretor signal peptide/secretory leaders,” “moderate secretory signal peptide/secretory leaders,” or “low secretory signal peptide/secretory leader sequences.”
Because the secretion levels and the increases in secretion caused by the heterologous polypeptide of the invention are separate and distinct from the expression levels of the resulting polypeptides, the resulting polypeptides were also ranked on the basis of their expression levels on a relative scale that served to rank all the proteins of the invention (Tables 1-3 and Appendix A) relative to one another. These rankings were made for expression and secretion levels in either wheat germ extracts or mammalian cells (see Examples 1-3).
Moreover, whereas the above classification is based on the results obtained from using in vitro assays, the classification extends to results that can be obtained while expressing the proteins of the invention in vivo. As already discussed in Example 1, the signal peptides/leader sequences of the invention can be assayed for their ability to be used to improve the in vivo expression of heterologous proteins they are attached to. For example, any of the leader sequences described in Table 2 can be operatively linked to a heterologous protein using cloning methods essentially as described in Examples 1 and 2. The resulting cDNA construct can then be electroporated or microinjected into embryonic stem (ES) cells (for example, mouse or pig ES cells), which are then used, according to standard methods known to those skilled in the art, for generating transgenic animals (e.g. mice or pigs). Depending on the protein, and on other properties of the cDNA construct (for example, the specific promoter used to drive expression of the recombinant protein), the secreted recombinant protein can be assayed from bodily fluids such as, for example, blood, milk, or saliva, and its expression levels quantified. The assay can be done such that two recombinant proteins are expressed that vary only by their signal peptide (i.e., comparing endogenous signal peptide and heterologous signal peptide of the invention).
It is possible that the signal peptide/leader sequences of the invention do not fall into the same categories when, instead of being used for protein expression in vitro, they are used for protein expression in vivo. However, the results from the in vitro assays described herein should serve as guidelines for choosing which particular signal peptide/leader sequence one can use in order to achieve the desired levels of protein expression both in vitro and in vivo.
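One way to turn band areas into concentration estimates is to fit a line through the three mass standards and read the samples off that line. The Python sketch below illustrates this; it is an assumption-laden example rather than the procedure used here, since the text describes comparison to one of the three standards and does not specify a curve fit. The band areas are invented, the response is assumed to be linear over the 8-133 ng/ml range, and equal volumes (15 μl) are assumed to have been loaded for standards and samples.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(band_area, slope, intercept):
    """Invert the standard curve to estimate a sample concentration in ng/ml."""
    return (band_area - intercept) / slope


# Standard concentrations from the text (ng/ml).
STANDARDS_NG_PER_ML = [8.0, 33.0, 133.0]
# Hypothetical band areas measured for those standards (arbitrary units).
standard_band_areas = [80.0, 330.0, 1330.0]

slope, intercept = fit_line(STANDARDS_NG_PER_ML, standard_band_areas)

# Hypothetical band area for one secreted protein on the same gel.
sample_ng_per_ml = estimate_concentration(500.0, slope, intercept)
print(f"Estimated concentration: {sample_ng_per_ml:.1f} ng/ml")
```

Estimates for band areas outside the range spanned by the standards are extrapolations and would need to be treated with corresponding caution.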
The specification is most thoroughly understood in light of the following references, all of which are hereby incorporated by reference in their entireties. The disclosures of the patents and other references cited above are also hereby incorporated by reference.
Applicants include a Sequence Listing provided in both electronic and paper format as Appendix A.
The leader sequences, heterologous secreted polypeptides, nucleic acids, vectors, host cells and methods of making these find use in a number of investigative, diagnostic, and therapeutic applications.
This application claims the benefit (pursuant to 35 U.S.C. § 119(e)) of provisional application 60/647,013, filed in the United States Patent and Trademark Office on Jan. 27, 2005, the disclosures of which are hereby incorporated by reference.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2006/002951 | 1/27/2006 | WO | 00 | 6/3/2008 |
Number | Date | Country | |
---|---|---|---|
60647013 | Jan 2005 | US |