The present invention is directed to small molecular weight molecules including, but not limited to, peptides, peptide analogs and peptide mimetics that stabilize the non-native states of proteins and prevent the aggregation of unfolded, abnormally folded or misfolded proteins. The invention is further directed to methods for treatment of protein conformation disease utilizing the peptides, peptide analogs or peptide mimetics, or nucleic acids encoding the peptides.
Proteins are large polypeptide chains composed of sequences of amino acids encoded by genes and synthesized by the protein synthesis machinery of cells. Synthesis of proteins is followed by folding into functional 3-dimensional structures, which often requires participation of separate proteins called molecular chaperones. Molecular chaperones are endogenous specialized proteins that assist normal folding of synthesized polypeptides into their functional form. Correctly folded proteins are transported to their destination where they perform their function(s).
Mutations, molecular and environmental stress, post-translational modifications, proteolysis and aging can alter the structure of a protein, leading to an unfolded or misfolded protein with an altered function. The altered function can lead to increased morbidity through a number of mechanisms including, but not limited to, disruption of important cellular processes, toxicity due to aggregation and cell-death responses.
The interaction between molecular chaperones and partially folded polypeptides is a natural defense against protein unfolding and aggregation diseases. Protein conformation diseases occur when natural proteins in the body gain or lose function due to structural instability. Protein aggregation diseases are a subtype of protein conformation diseases in which unfolded or misfolded proteins form aggregates that are toxic to cells. A large number of protein conformation diseases are a natural consequence of aging. With age, the ability of cells to protect themselves from the lethal effects of protein unfolding and aggregation diminishes greatly. The capacity of molecular chaperones, which are the natural defense molecules against protein unfolding and misfolding, declines dramatically with age while the number of unfolded and misfolded proteins increases dramatically. Consequently, when protective pathways like refolding and degradation become inefficient and are unable to clear non-functional, structurally compromised proteins from cells and/or tissues, the result is protein misfolding and aggregation diseases (Table 1). Sanders and Myers, Annu. Rev. Biophys. Biomol. Struct., 33: 25-51, 2004.
While molecular chaperones can consist of thousands of peptides, only a small proportion of the peptides are necessary for their function against protein conformation diseases. Santhoshkumar and Sharma, Molecular and Cellular Biochemistry 267: 147-155, 2004. Although protein molecular chaperones are very efficient in vivo, their enormous size limits their bioavailability in therapeutic applications. Accordingly, there is a clear and unmet need in the art for peptides having the functional characteristics of molecular chaperones, which may be more readily produced and used in a variety of therapeutic and manufacturing applications.
The present invention generally relates to polypeptides, peptide analogs and peptide mimetics that stabilize and reduce the aggregation of unfolded, abnormally folded or misfolded proteins. Accordingly, the present invention provides peptide-based compositions, peptide variant compositions, or peptide mimetic compositions that inhibit protein misfolding and/or aggregation and are, therefore, useful in a variety of therapeutic and manufacturing applications, including, e.g., the treatment of diseases and disorders associated with protein misfolding and/or aggregation and methods for manufacturing and purifying recombinant proteins. The present invention provides polypeptide compositions, functional variants, and peptide mimetics thereof, and methods for treating a disease in a mammalian subject comprising administering a polypeptide up to about 50 amino acids in length having molecular chaperone activity to the subject in need thereof. The methods are useful for treating diseases, for example, diseases related to protein aggregation, and diseases such as age-related myopathy and cardiac ischemia. A method for stabilizing a protein is provided comprising contacting the protein with a polypeptide up to about 50 amino acids in length having molecular chaperone activity. A method is provided to increase the efficacy of a therapeutic protein to treat disease. A method is also provided to increase production of a recombinantly-produced protein. The methods provide a polypeptide up to about 50 amino acids in length that limits protein aggregation and provides recombinant proteins with correct folding of the polypeptide as an active protein composition.
A polypeptide X1-WIRRPFFPFHSP-X2, X1-WIRRPFFP-X2, X1-PFFPFHSP-X2, X1-FPFHSPSR-X2, X1-DQFFGEHL-X2, X1-FFGEHLLE-X2, or X1-IAIHHPWI-X2, or a functional variant or mimetic thereof, is provided wherein each X1 and X2 independently of one another represents any amino acid sequence of n amino acids, n varying from 0 to 50, and n being identical or different in X1 and X2. In a further aspect, the functional variant or mimetic is a conservative amino acid substitution or peptide mimetic substitution. In a detailed aspect, the functional variant has about 70% or greater amino acid sequence identity to X1-WIRRPFFPFHSP-X2, X1-WIRRPFFP-X2, X1-PFFPFHSP-X2, X1-FPFHSPSR-X2, X1-DQFFGEHL-X2, X1-FFGEHLLE-X2, or X1-IAIHHPWI-X2.
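The X1-core-X2 formula and the identity threshold above can be illustrated with a minimal Python sketch. The function names `find_core` and `percent_identity` are hypothetical, and the identity calculation shown here (ungapped, position-by-position comparison of equal-length sequences) is an assumption made for illustration; the specification does not state which alignment method is used to determine sequence identity.

```python
# Core sequences taken from the paragraph above.
CORES = ["WIRRPFFPFHSP", "WIRRPFFP", "PFFPFHSP", "FPFHSPSR",
         "DQFFGEHL", "FFGEHLLE", "IAIHHPWI"]


def find_core(sequence, cores=CORES, max_flank=50):
    """Return (core, n_X1, n_X2) for the first listed core that occurs in
    `sequence` with both flanking segments of 0 to `max_flank` residues,
    or None if no such placement exists."""
    for core in cores:
        start = sequence.find(core)
        while start != -1:
            n_x1 = start                               # residues before the core (X1)
            n_x2 = len(sequence) - start - len(core)   # residues after the core (X2)
            if n_x1 <= max_flank and n_x2 <= max_flank:
                return core, n_x1, n_x2
            start = sequence.find(core, start + 1)
    return None


def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences
    (an assumed, simplified definition of sequence identity)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    print(find_core("AAWIRRPFFPG"))
    print(percent_identity("WIRRPFFP", "WIRKPFFP") >= 70.0)
```

Under this simplified definition, an 8-residue core tolerates at most two substitutions before falling below 70% identity (6/8 = 75%, 5/8 = 62.5%).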
A polypeptide X1-SLSPFYLRPPSFLRAP-X2, X1-SPFYLRPP-X2, X1-SLSPFYLR-X2, X1-FYLRPPSF-X2, X1-LRPPSFLR-X2, X1-PPSFLRAP-X2, X1-SFLRAPSW-X2, or X1-LRAPSWFD-X2, or a functional variant or mimetic thereof, is provided wherein each X1 and X2 independently of one another represents any amino acid sequence of n amino acids, n varying from 0 to 50, and n being identical or different in X1 and X2. In a further aspect, the functional variant or mimetic is a conservative amino acid substitution or peptide mimetic substitution. In a detailed aspect, the functional variant has about 70% or greater amino acid sequence identity to X1-SLSPFYLRPPSFLRAP-X2, X1-SPFYLRPP-X2, X1-SLSPFYLR-X2, X1-FYLRPPSF-X2, X1-LRPPSFLR-X2, X1-PPSFLRAP-X2, X1-SFLRAPSW-X2, or X1-LRAPSWFD-X2.
A polypeptide X1-RLEKDRFS-X2, X1-FSVNLDVK-X2, X1-LKVKVLGD-X2, X1-FISREFHR-X2, X1-HGFISREF-X2, X1-KYRIPADV-X2, X1-EFHRKYRI-X2, X1-SREFHRKY-X2, X1-LTITSSLS-X2, X1-GVLTVNGP-X2, or X1-LTVNGPRK-X2, or a functional variant or mimetic thereof, is provided wherein each X1 and X2 independently of one another represents any amino acid sequence of n amino acids, n varying from 0 to 50, and n being identical or different in X1 and X2. In a further aspect, the functional variant or mimetic comprises a conservative amino acid substitution or peptide mimetic substitution. In a detailed aspect, the functional variant comprises about 70% or greater amino acid identity to X1-RLEKDRFS-X2, X1-FSVNLDVK-X2, X1-LKVKVLGD-X2, X1-FISREFHR-X2, X1-HGFISREF-X2, X1-KYRIPADV-X2, X1-EFHRKYRI-X2, X1-SREFHRKY-X2, X1-LTITSSLS-X2, X1-GVLTVNGP-X2, or X1-LTVNGPRK-X2.
A polypeptide X1-RTIPITRE-X2, or a functional variant or mimetic thereof, is provided wherein each X1 and X2 independently of one another represents any amino acid sequence of n amino acids, n varying from 0 to 50, and n being identical or different in X1 and X2. In a further aspect, the functional variant or mimetic comprises a conservative amino acid substitution or peptide mimetic substitution. In a detailed aspect, the functional variant comprises about 70% or greater amino acid sequence identity to X1-RTIPITRE-X2. In a further detailed aspect, the functional variant comprises an I-X-I/V amino acid motif.
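As a rough illustration, the I-X-I/V motif can be screened for with a regular expression. The sketch below assumes that "X" denotes any single residue and that "I/V" means isoleucine or valine at the third position; the helper name `has_ixiv_motif` is hypothetical.

```python
import re

# I, then any one residue, then I or V (assumed reading of "I-X-I/V").
IXIV_MOTIF = re.compile(r"I.[IV]")


def has_ixiv_motif(sequence):
    """Return True if the sequence contains at least one I-X-I/V motif."""
    return IXIV_MOTIF.search(sequence) is not None


if __name__ == "__main__":
    # RTIPITRE contains I-P-I; a variant with I-P-V still carries the motif.
    for candidate in ("RTIPITRE", "RTIPVTRE", "RTAPATRE"):
        print(candidate, has_ixiv_motif(candidate))
```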
A method for treating a protein conformation disease in a mammalian subject is provided comprising administering a polypeptide to the subject in need thereof, wherein the polypeptide is X1-WIRRPFFPFHSP-X2, X1-WIRRPFFP-X2, X1-PFFPFHSP-X2, X1-FPFHSPSR-X2, X1-DQFFGEHL-X2, X1-FFGEHLLE-X2, X1-IAIHHPWI-X2, X1-SLSPFYLRPPSFLRAP-X2, X1-SPFYLRPP-X2, X1-SLSPFYLR-X2, X1-FYLRPPSF-X2, X1-LRPPSFLR-X2, X1-PPSFLRAP-X2, X1-SFLRAPSW-X2, X1-LRAPSWFD-X2, X1-RLEKDRFS-X2, X1-FSVNLDVK-X2, X1-LKVKVLGD-X2, X1-FISREFHR-X2, X1-HGFISREF-X2, X1-KYRIPADV-X2, X1-EFHRKYRI-X2, X1-SREFHRKY-X2, X1-LTITSSLS-X2, X1-GVLTVNGP-X2, X1-LTVNGPRK-X2, or X1-RTIPITRE-X2, or a functional variant or mimetic thereof, wherein each X1 and X2 independently of one another represents any amino acid sequence of n amino acids, n varying from 0 to 50, and n being identical or different in X1 and X2. In a further aspect, the functional variant or mimetic comprises a conservative amino acid substitution or peptide mimetic substitution. In a detailed aspect, the functional variant comprises about 70% or greater amino acid sequence identity to X1-WIRRPFFPFHSP-X2, X1-WIRRPFFP-X2, X1-PFFPFHSP-X2, X1-FPFHSPSR-X2, X1-DQFFGEHL-X2, X1-FFGEHLLE-X2, X1-IAIHHPWI-X2, X1-SLSPFYLRPPSFLRAP-X2, X1-SPFYLRPP-X2, X1-SLSPFYLR-X2, X1-FYLRPPSF-X2, X1-LRPPSFLR-X2, X1-PPSFLRAP-X2, X1-SFLRAPSW-X2, X1-LRAPSWFD-X2, X1-RLEKDRFS-X2, X1-FSVNLDVK-X2, X1-LKVKVLGD-X2, X1-FISREFHR-X2, X1-HGFISREF-X2, X1-KYRIPADV-X2, X1-EFHRKYRI-X2, X1-SREFHRKY-X2, X1-LTITSSLS-X2, X1-GVLTVNGP-X2, X1-LTVNGPRK-X2, or X1-RTIPITRE-X2. In a further detailed aspect, the functional variant of X1-RTIPITRE-X2 polypeptide comprises an I-X-I/V amino acid motif. The disease includes, but is not limited to, Alexander's disease, Alzheimer's disease, Creutzfeld-Jakob disease, Parkinson's disease, Huntington's disease, cataract, retinitis pigmentosa, prion disease, or mad cow disease. The disease further includes, but is not limited to, age-related myopathy or cardiac ischemia.
A method for treating a protein conformation disease in a mammalian subject is provided comprising administering a nucleic acid encoding a polypeptide X1-WIRRPFFPFHSP-X2, X1-WIRRPFFP-X2, X1-PFFPFHSP-X2, X1-FPFHSPSR-X2, X1-DQFFGEHL-X2, X1-FFGEHLLE-X2, X1-IAIHHPWI-X2, X1-SLSPFYLRPPSFLRAP-X2, X1-SPFYLRPP-X2, X1-SLSPFYLR-X2, X1-FYLRPPSF-X2, X1-LRPPSFLR-X2, X1-PPSFLRAP-X2, X1-SFLRAPSW-X2, X1-LRAPSWFD-X2, X1-RLEKDRFS-X2, X1-FSVNLDVK-X2, X1-LKVKVLGD-X2, X1-FISREFHR-X2, X1-HGFISREF-X2, X1-KYRIPADV-X2, X1-EFHRKYRI-X2, X1-SREFHRKY-X2, X1-LTITSSLS-X2, X1-GVLTVNGP-X2, X1-LTVNGPRK-X2, or X1-RTIPITRE-X2, or a functional variant or mimetic thereof, wherein each X1 and X2 independently of one another represents any amino acid sequence of n amino acids, n varying from 0 to 50, and n being identical or different in X1 and X2. In a further aspect, the functional variant or mimetic comprises a conservative amino acid substitution or peptide mimetic substitution. In a detailed aspect, the functional variant comprises about 70% or greater amino acid sequence identity to X1-WIRRPFFPFHSP-X2, X1-WIRRPFFP-X2, X1-PFFPFHSP-X2, X1-FPFHSPSR-X2, X1-DQFFGEHL-X2, X1-FFGEHLLE-X2, X1-IAIHHPWI-X2, X1-SLSPFYLRPPSFLRAP-X2, X1-SPFYLRPP-X2, X1-SLSPFYLR-X2, X1-FYLRPPSF-X2, X1-LRPPSFLR-X2, X1-PPSFLRAP-X2, X1-SFLRAPSW-X2, X1-LRAPSWFD-X2, X1-RLEKDRFS-X2, X1-FSVNLDVK-X2, X1-LKVKVLGD-X2, X1-FISREFHR-X2, X1-HGFISREF-X2, X1-KYRIPADV-X2, X1-EFHRKYRI-X2, X1-SREFHRKY-X2, X1-LTITSSLS-X2, X1-GVLTVNGP-X2, X1-LTVNGPRK-X2, or X1-RTIPITRE-X2. In a further detailed aspect, the functional variant of X1-RTIPITRE-X2 polypeptide comprises an I-X-I/V amino acid motif. The disease includes, but is not limited to, Alexander's disease, Alzheimer's disease, Creutzfeld-Jakob disease, Parkinson's disease, Huntington's disease, cataract, retinitis pigmentosa, prion disease, or mad cow disease. The disease further includes, but is not limited to, age-related myopathy or cardiac ischemia.
A method for stabilizing a protein is provided comprising contacting the protein with a polypeptide X1-WIRRPFFPFHSP-X2, X1-WIRRPFFP-X2, X1-PFFPFHSP-X2, X1-FPFHSPSR-X2, X1-DQFFGEHL-X2, X1-FFGEHLLE-X2, X1-IAIHHPWI-X2, X1-SLSPFYLRPPSFLRAP-X2, X1-SPFYLRPP-X2, X1-SLSPFYLR-X2, X1-FYLRPPSF-X2, X1-LRPPSFLR-X2, X1-PPSFLRAP-X2, X1-SFLRAPSW-X2, X1-LRAPSWFD-X2, X1-RLEKDRFS-X2, X1-FSVNLDVK-X2, X1-LKVKVLGD-X2, X1-FISREFHR-X2, X1-HGFISREF-X2, X1-KYRIPADV-X2, X1-EFHRKYRI-X2, X1-SREFHRKY-X2, X1-LTITSSLS-X2, X1-GVLTVNGP-X2, X1-LTVNGPRK-X2, or X1-RTIPITRE-X2, or a functional variant or mimetic thereof, wherein each X1 and X2 independently of one another represents any amino acid sequence of n amino acids, n varying from 0 to 50, and n being identical or different in X1 and X2. In a further aspect, the functional variant or mimetic comprises a conservative amino acid substitution or peptide mimetic substitution. In a detailed aspect, the functional variant comprises about 70% or greater amino acid sequence identity to X1-WIRRPFFPFHSP-X2, X1-WIRRPFFP-X2, X1-PFFPFHSP-X2, X1-FPFHSPSR-X2, X1-DQFFGEHL-X2, X1-FFGEHLLE-X2, X1-IAIHHPWI-X2, X1-SLSPFYLRPPSFLRAP-X2, X1-SPFYLRPP-X2, X1-SLSPFYLR-X2, X1-FYLRPPSF-X2, X1-LRPPSFLR-X2, X1-PPSFLRAP-X2, X1-SFLRAPSW-X2, X1-LRAPSWFD-X2, X1-RLEKDRFS-X2, X1-FSVNLDVK-X2, X1-LKVKVLGD-X2, X1-FISREFHR-X2, X1-HGFISREF-X2, X1-KYRIPADV-X2, X1-EFHRKYRI-X2, X1-SREFHRKY-X2, X1-LTITSSLS-X2, X1-GVLTVNGP-X2, X1-LTVNGPRK-X2, or X1-RTIPITRE-X2. In a further detailed aspect, the X1-RTIPITRE-X2 polypeptide comprises an I-X-I/V amino acid motif.
The method for stabilizing a protein further provides that the protein is a therapeutic protein. The therapeutic protein includes, but is not limited to, a vaccine, insulin, growth factor, or antibody. In a further aspect, the protein is a recombinantly-produced protein. The method for stabilizing a protein further comprises increasing the stability of the therapeutic protein to treat a disease state in a mammalian subject. The method further comprises inhibiting protein misfolding or reducing protein aggregation. The method further comprises restoring correct or native folding to the protein.
A method for diagnosing a protein conformation disease in a mammalian subject is provided comprising contacting a tissue sample from the mammalian subject with a polypeptide X1-WIRRPFFPFHSP-X2, X1-WIRRPFFP-X2, X1-PFFPFHSP-X2, X1-FPFHSPSR-X2, X1-DQFFGEHL-X2, X1-FFGEHLLE-X2, X1-IAIHHPWI-X2, X1-SLSPFYLRPPSFLRAP-X2, X1-SPFYLRPP-X2, X1-SLSPFYLR-X2, X1-FYLRPPSF-X2, X1-LRPPSFLR-X2, X1-PPSFLRAP-X2, X1-SFLRAPSW-X2, X1-LRAPSWFD-X2, X1-RLEKDRFS-X2, X1-FSVNLDVK-X2, X1-LKVKVLGD-X2, X1-FISREFHR-X2, X1-HGFISREF-X2, X1-KYRIPADV-X2, X1-EFHRKYRI-X2, X1-SREFHRKY-X2, X1-LTITSSLS-X2, X1-GVLTVNGP-X2, X1-LTVNGPRK-X2, or X1-RTIPITRE-X2, or a functional variant or mimetic thereof, wherein each X1 and X2 independently of one another represents any amino acid sequence of n amino acids, n varying from 0 to 50, and n being identical or different in X1 and X2, detecting binding of a misfolded protein, an unfolded protein, or an aggregated protein to the polypeptide, and determining a presence or absence of the disease in the mammalian subject. In a further aspect, the presence or absence of the disease is detected by a stage of misfolding, unfolding, or aggregation of the protein. In a further aspect, the functional variant or mimetic comprises a conservative amino acid substitution or peptide mimetic substitution. In a detailed aspect, the functional variant comprises about 70% or greater amino acid sequence identity to X1-WIRRPFFPFHSP-X2, X1-WIRRPFFP-X2, X1-PFFPFHSP-X2, X1-FPFHSPSR-X2, X1-DQFFGEHL-X2, X1-FFGEHLLE-X2, X1-IAIHHPWI-X2, X1-SLSPFYLRPPSFLRAP-X2, X1-SPFYLRPP-X2, X1-SLSPFYLR-X2, X1-FYLRPPSF-X2, X1-LRPPSFLR-X2, X1-PPSFLRAP-X2, X1-SFLRAPSW-X2, X1-LRAPSWFD-X2, X1-RLEKDRFS-X2, X1-FSVNLDVK-X2, X1-LKVKVLGD-X2, X1-FISREFHR-X2, X1-HGFISREF-X2, X1-KYRIPADV-X2, X1-EFHRKYRI-X2, X1-SREFHRKY-X2, X1-LTITSSLS-X2, X1-GVLTVNGP-X2, X1-LTVNGPRK-X2, or X1-RTIPITRE-X2. In a further detailed aspect, the X1-RTIPITRE-X2 polypeptide comprises an I-X-I/V amino acid motif.
The disease includes, but is not limited to, Alexander's disease, Alzheimer's disease, Creutzfeld-Jakob disease, Parkinson's disease, Huntington's disease, cataract, retinitis pigmentosa, prion disease, or mad cow disease. The disease further includes, but is not limited to, age-related myopathy or cardiac ischemia.
The present invention provides polypeptides, peptide analogs and peptide mimetics that stabilize the non-native states of proteins and prevent the aggregation of unfolded, abnormally folded or misfolded proteins. Accordingly, the present invention provides peptide-based compositions that inhibit protein misfolding, abnormal folding, and/or aggregation and are, therefore, useful in a variety of therapeutic and manufacturing applications, including, e.g., the treatment of diseases and disorders associated with protein misfolding, abnormal folding and/or aggregation and in methods for manufacturing and purifying recombinant proteins.
Protein pin arrays identified interactive polypeptide sequences for chaperone activity in human αB crystallin using natural lens proteins, βH crystallin and γD crystallin, and in vitro chaperone target proteins, for example, alcohol dehydrogenase and citrate synthase. A polypeptide fragment having activity to stabilize and reduce aggregation of misfolded proteins comprises polypeptide sequences from the N-terminal domain, α crystallin core domain, or the C-terminal domain of the human αB crystallin protein. The N-terminal domain contained interactive polypeptide sequences with chaperone activity, 9WIRRPFFPFHSP20 and 43SLSPFYLRPPSFLRAP58. The N-terminal domain also contained the following interactive polypeptide sequences with chaperone activity: WIRRPFFP, PFFPFHSP, FPFHSPSR, DQFFGEHL, FFGEHLLE, IAIHHPWI, SPFYLRPP, SLSPFYLR, FYLRPPSF, LRPPSFLR, PPSFLRAP, SFLRAPSW, and LRAPSWFD. The α crystallin core domain contained interactive protein sequences with chaperone activity, 75FSVNLDVK82 (β3), 113FISREFHR120, 131LTITSSLS138 (β8) and 141GVLTVNGP148 (β9). The α crystallin core domain also contained interactive protein sequences with chaperone activity: RLEKDRFS, LKVKVLGD, HGFISREF, KYRIPADV, EFHRKYRI, SREFHRKY, and LTVNGPRK. The C-terminal domain contained an interactive sequence, 157RTIPITRE164, that included the highly conserved I-X-I/V amino acid motif. Two interactive sequences, 73DRFSVNLDVKHFS85 and 131LTITSSLSDGV141, belonging to the α crystallin core domain were synthesized as peptides and assayed for chaperone activity in vitro. Both synthesized peptides inhibited the thermal aggregation of βH crystallin, alcohol dehydrogenase and citrate synthase in vitro. Five of the seven chaperone sequences identified by the pin arrays overlapped with sequences identified previously as sequences for subunit-subunit interactions in human αB crystallin.
The results suggested that interactive sequences in human αB crystallin have dual roles in subunit-subunit assembly and chaperone activity.
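The relationship between the numbered interactive regions and the 8-residue sequences listed above can be sketched as a sliding window over a parent sequence. The Python sketch below is illustrative only: the function name `overlapping_windows`, the default window of 8 residues, and the step of 1 are assumptions, since the design of the pin array (peptide length and overlap) is not specified in this passage.

```python
def overlapping_windows(sequence, first_residue, window=8, step=1):
    """List (start_number, peptide, end_number) for each window of `window`
    residues, using the residue numbering of the parent protein."""
    peptides = []
    for offset in range(0, len(sequence) - window + 1, step):
        start = first_residue + offset
        end = start + window - 1
        peptides.append((start, sequence[offset:offset + window], end))
    return peptides


if __name__ == "__main__":
    # 9WIRRPFFPFHSP20 from the N-terminal domain, numbered from residue 9.
    for start, peptide, end in overlapping_windows("WIRRPFFPFHSP", 9):
        print(f"{start}{peptide}{end}")
```

With these assumed settings, the first and last windows correspond to WIRRPFFP (residues 9 to 16) and PFFPFHSP (residues 13 to 20), both of which appear in the list of N-terminal domain sequences.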
It is to be understood that this invention is not limited to particular methods, reagents, compounds, compositions or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms “a”, “an” and “the” include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to “a cell” includes a combination of two or more cells, and the like.
“About” as used herein when referring to a measurable value such as an amount, a temporal duration, and the like, is meant to encompass variations of ±20% or ±10%, more preferably ±5%, even more preferably ±1%, and still more preferably ±0.1% from the specified value, as such variations are appropriate to perform the disclosed methods.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred materials and methods are described herein. In describing and claiming the present invention, the following terminology will be used.
The peptides, peptide analogs and peptide mimetics of the present invention are herein collectively referred to as “Intellipeptides”, “aggregation inhibition peptides”, or “peptides that inhibit abnormal protein folding, protein unfolding, protein misfolding, or protein aggregation.” Intellipeptides are identified using protein pin arrays, computer modeling, multiple sequence alignment analyses of structurally and functionally similar proteins, spectroscopic in vitro chaperone assays and/or in vivo cell killing assays.
Intellipeptides stabilize and prevent the protein unfolding, misfolding or aggregation of a wide variety of target proteins including, but not limited to, amyloid-beta, beta/gamma crystallins, actin, desmin, vimentin, insulin, citrate synthase, alcohol dehydrogenase, glial fibrillary acidic protein, alpha-lactalbumin, fibroblast growth factor, insulin-like growth factor, transforming growth factor-beta, nerve growth factor-beta, epidermal growth factor, vascular endothelial growth factor, beta-catenin, tumor necrosis factor-alpha, Bcl-2, Bcl-X1 and caspases.
In a method for treating a protein conformation disease in a mammalian subject, Intellipeptides are useful to stabilize and/or prevent the protein unfolding, misfolding or aggregation of a wide variety of disease target proteins. Disease target proteins include, but are not limited to, neurodegenerative disease: Alzheimer's disease (Amyloid beta, tau); Parkinson's disease (Alpha-synuclein, tau); Creutzfeld-Jakob disease (Amyloid protein); Kuru (Amyloid protein); GSS disease (Amyloid protein); Huntington's disease (Huntingtin); Polyglutamine diseases (Atrophin-1, ataxins); Prion disease (Prion protein); Bovine Spongiform Encephalopathy (BSE) (Prion protein); Amyotrophic Lateral Sclerosis (Superoxide dismutase); Alexander's disease (Glial fibrillary acidic protein); Primary Systemic Amyloidosis (Immunoglobulin light chain or fragments); Secondary Systemic Amyloidosis (Fragments of serum amyloid-A); Senile Systemic Amyloidosis (Transthyretin and fragments); Amyloidosis in senescence (Apolipoprotein A-II); Ocular disease: Cataract (Crystallins, filaments); Retinitis Pigmentosa (Rhodopsin); Macular Degeneration (Amyloid-beta, crystallins); and other disease: Islet amyloid (Insulin); Medullar Carcinoma of the Thyroid (Calcitonin); Hereditary Renal Amyloidosis (Fibrinogen); Hemodialysis-related amyloidosis (beta 2-microglobulin); Desmin-related Cardiomyopathy (Desmin); Charcot-Marie Tooth disease (PMP-22); diabetes insipidus (aquaporin); diabetes insipidus (vasopressin receptor); Charcot-Marie Tooth disease (connexin 32); cystic fibrosis (CFTR). See, for example, Sanders and Myers, Annu. Rev. Biophys. Biomol. Struct., 33: 25-51, 2004.
In one embodiment, Intellipeptides of the present invention comprise or consist of a fragment of αB crystallin. In another embodiment, Intellipeptides of the present invention comprise or consist of peptides that are structurally and functionally similar to the parent set of peptide sequences identified from αB crystallin, including, but not limited to the peptides provided in
In certain embodiments, Intellipeptides include peptide analogs and peptide mimetics. Indeed, Intellipeptides include peptides having any of a variety of different modifications, including those described herein.
Intellipeptide analogs are generally designed and produced by chemical modifications of a lead peptide, including, e.g., any of the particular peptides described herein, such as any of the following sequences: i) EKDRFSVNLDVKHFS; ii) DPLTITSSLSSDGVLTVNGPRKQ; iii) LTITSSLSDGVLTVNGPRK; iv) STSLSPFYLRPPSFLRAP; v) SLSPFYLRPPSFLRAPS; vi) GPERTIPITREEK; vii) PERTIPITREEK; viii) HGKHEERQDE; ix) HGFISREFHRKYR; or functional variants or peptide mimetics thereof. An exemplary polypeptide fragment of αB crystallin protein having molecular chaperone activity is presented; e.g., the N-terminal domain polypeptide fragment is 9WIRRPFFPFHSP20 or 43SLSPFYLRPPSFLRAP58, the α crystallin core domain polypeptide fragment is 75FSVNLDVK82 (β3), 113FISREFHR120, 131LTITSSLS138 (β8), 141GVLTVNGP148 (β9), 73DRFSVNLDVKHFS85, or 131LTITSSLSDGV141, or the C-terminal domain polypeptide fragment is 157RTIPITRE164, or functional variants thereof. The present invention clearly establishes that these peptides in their entirety and derivatives created by modifying any side chains of the constituent amino acids have the ability to prevent aggregation of proteins, correctly fold proteins, and stabilize proteins. The present invention further encompasses polypeptides up to about 50 amino acids in length that include the amino acid sequences and functional variants or peptide mimetics of the sequences described herein.
In one embodiment, an Intellipeptide of the present invention includes an N- and C-terminal modification. N-terminal acetylation or desamination confers protection against digestion by a number of aminopeptidases, and the presence of amides or alcohols replacing the C-terminal carboxyl group prevents splitting by several carboxypeptidases, including carboxypeptidases A and B.
In another embodiment, an Intellipeptide of the present invention includes a side-chain modification. The presence of non-natural amino acids usually increases peptide stability. In addition, at least one of these amino acids (alpha-aminoisobutyric acid or Aib) imposes significant constraints on model peptides, diminishing their conformational flexibility. Therefore, the introduction of Aib is expected to enhance peptide stability and inhibitory activity at the same time.
In a further embodiment, an Intellipeptide of the present invention includes modifications in the alpha-carbon. The most commonly used alpha-carbon modification to improve peptide stability is alpha-methylation. In addition, replacement of the hydrogen atom linked to the alpha-carbon of Phe, Val or Leu favors the adoption of beta-bend conformation that is unfavorable for the formation of beta-pleated sheet structures. According to the present invention, methylation of those residues in the inhibitor peptides is expected to enhance stability and potency.
In another embodiment, an Intellipeptide of the present invention includes a chirality change. Replacement of natural L-residues by their D-enantiomers dramatically increases resistance to proteolytic degradation. Aggregation inhibitors containing D-enantiomers are as effective in preventing aggregation as the L-enantiomer forms of the aggregation inhibition parent peptides.
In another embodiment, Intellipeptides of the present invention are cyclic peptides. Conformationally constrained cyclic peptides represent better drug candidates than linear peptides due to their reduced conformational flexibility and improved resistance to exopeptidase cleavage. Two alternative strategies can be used to convert a linear sequence into a cyclic structure. One is the introduction of cysteine residues to achieve cyclization through the formation of a disulfide bridge, and the other is the side-chain attachment strategy involving resin-bound head-to-tail cyclization. To avoid modifications of the peptide sequence, the latter approach is used. Aggregation inhibition peptides contain the ideal sequences for facilitating macrocyclization because proline, due to its ability to promote turns and loops, is a constituent of many naturally occurring or artificially synthesized cyclic peptides.
In another embodiment, an Intellipeptide of the present invention is a pseudopeptide. Pseudopeptides, or amide bond surrogates, refer to peptides containing chemical modifications of some (or all) of the peptide bonds. The introduction of amide bond surrogates not only decreases peptide degradation but also may significantly modify some of the biochemical properties of the peptides, particularly the conformational flexibility and hydrophobicity. It is likely that an increase in conformational flexibility will be beneficial for docking the inhibitor to the binding sites. On the other hand, since the interaction between the aggregation-prone proteins and the inhibitors seems to depend to a great extent on hydrophobic interactions, it is likely that amide bond replacement increasing hydrophobicity may enhance affinity and, hence, potency of the inhibitors. In addition, increased hydrophobicity could also enhance transport of the peptide across membranes and thus improve barrier permeability (blood-brain barrier and intestinal barrier). The amide bonds to replace are those located at the ends of the peptide, to prevent exoprotease degradation, and those after each of the prolines, since it has been described that a frequent endopeptidase cleavage site occurs after this residue by an enzyme known as prolyl endopeptidase.
To improve or alter the characteristics of polypeptides of the present invention, protein engineering can be employed. Recombinant DNA technology known to those skilled in the art can be used to create novel mutant proteins or muteins including single or multiple amino acid substitutions, deletions, additions, or fusion proteins. Such modified polypeptides can show, e.g., increased/decreased biological activity or increased/decreased stability. In addition, they can be purified in higher yields and show better solubility than the corresponding natural polypeptide, at least under certain purification and storage conditions. Further, the polypeptides of the present invention can be produced as multimers including dimers, trimers and tetramers. Multimerization can be facilitated by linkers or recombinantly through heterologous polypeptides such as Fc regions.
It is known in the art that one or more amino acids can be deleted from the N-terminus or C-terminus without substantial loss of biological function. See, e.g., Ron, et al., J. Biol. Chem., 268: 2984-2988, 1993. Accordingly, the present invention provides polypeptides having one or more residues deleted from the amino terminus. Similarly, many examples of biologically functional C-terminal deletion mutants are known (see, e.g., Dobeli, et al., 1988). Accordingly, the present invention provides polypeptides having one or more residues deleted from the carboxy terminus. The invention also provides polypeptides having one or more amino acids deleted from both the amino and the carboxyl termini as described below.
Other mutants in addition to N- and C-terminal deletion forms of the protein discussed above are included in the present invention. Thus, the invention further includes variations of the polypeptides which show substantial chaperone polypeptide activity. Such mutants include deletions, insertions, inversions, repeats, and substitutions selected according to general rules known in the art so as to have little effect on activity.
There are two main approaches for studying the tolerance of an amino acid sequence to change; see Bowie, et al., Science, 247: 1306-1310, 1990. The first method relies on the process of evolution, in which mutations are either accepted or rejected by natural selection. The second approach uses genetic engineering to introduce amino acid changes at specific positions of a cloned gene and selections or screens to identify sequences that maintain functionality. These studies have revealed that proteins are surprisingly tolerant of amino acid substitutions.
Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Thus, the polypeptide of the present invention can be, for example: (i) one in which one or more of the amino acid residues are substituted with a conserved or non-conserved amino acid residue (preferably a conserved amino acid residue) and such substituted amino acid residue may or may not be one encoded by the genetic code; or (ii) one in which one or more of the amino acid residues includes a substituent group; or (iii) one in which the polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol); or (iv) one in which additional amino acids are fused to the above form of the polypeptide, such as an IgG Fc fusion region peptide or leader or secretory sequence or a sequence which is employed for purification of the above form of the polypeptide or a pro-protein sequence.
Thus, the polypeptides of the present invention can include one or more amino acid substitutions, deletions, or additions, either from natural mutations or human manipulation. As indicated, changes are preferably of a minor nature, such as conservative amino acid substitutions that do not significantly affect the folding or activity of the protein. The following groups of amino acids represent equivalent changes: (1) Ala, Pro, Gly, Glu, Asp, Gln, Asn, Ser, Thr; (2) Cys, Ser, Tyr, Thr; (3) Val, Ile, Leu, Met, Ala, Phe; (4) Lys, Arg, His; (5) Phe, Tyr, Trp, His.
Furthermore, polypeptides of the present invention can include one or more amino acid substitutions that mimic modified amino acids. An example of this type of substitution includes replacing amino acids that are capable of being phosphorylated (e.g., serine, threonine, or tyrosine) with a negatively charged amino acid that resembles the negative charge of the phosphorylated amino acid (e.g., aspartic acid or glutamic acid). Also included is substitution of amino acids that are capable of being modified by hydrophobic groups (e.g., arginine) with amino acids carrying bulky hydrophobic side chains, such as tryptophan or phenylalanine. Therefore, a specific embodiment of the invention includes chaperone polypeptides that include one or more amino acid substitutions that mimic modified amino acids at positions where amino acids that are capable of being modified are normally positioned. Further included are chaperone polypeptides where any subset of modifiable amino acids is substituted. For example, a chaperone polypeptide that includes three serine residues can be substituted at any one, any two, or all three of said serines. Furthermore, any chaperone polypeptide amino acid capable of being modified can be excluded from substitution with a modification-mimicking amino acid.
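By way of illustration only, the substitution of any subset of modifiable residues with a modification-mimicking residue can be sketched as follows. This is a minimal sketch, not a prescribed method: the function name `mimic_variants`, the choice of aspartic acid (D) as the mimicking residue, and the example fragment containing three serines are assumptions made for the example.

```python
from itertools import combinations

def mimic_variants(sequence, targets="S", mimic="D"):
    """Yield every variant in which a non-empty subset of the target
    residues is replaced by the modification-mimicking residue."""
    # Positions of residues that are capable of being modified.
    positions = [i for i, aa in enumerate(sequence) if aa in targets]
    # Any one, any two, ... up to all of the modifiable positions.
    for k in range(1, len(positions) + 1):
        for subset in combinations(positions, k):
            residues = list(sequence)
            for i in subset:
                residues[i] = mimic
            yield "".join(residues)

# Hypothetical example: a fragment with three serine residues.
variants = list(mimic_variants("LTITSSLSDGV", targets="S", mimic="D"))
print(len(variants))
```

For a polypeptide with three modifiable serines, the sketch enumerates the single, double, and triple substitutions; residues can be excluded from substitution simply by leaving them out of `targets` or by filtering `positions`.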
The present invention is further directed to fragments of the polypeptides of the present invention. More specifically, the present invention embodies purified, isolated, and recombinant polypeptides comprising at least any one integer between 6 and 504 (or the length of the polypeptide's amino acid residues minus 1 if the length is less than 1000) of consecutive amino acid residues. Preferably, the fragments are at least 6, preferably at least 8 to 10, more preferably 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 360, or more consecutive amino acids of a polypeptide of the present invention.
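For illustration, the set of fragments of consecutive amino acid residues described above can be enumerated with a short Python sketch. The function name `fragments` and the default minimum length of 6 are assumptions for the example, and the input peptide is used only as sample data.

```python
def fragments(sequence, min_len=6):
    """Return every fragment of consecutive residues that is at least
    min_len long and at most one residue shorter than the sequence."""
    n = len(sequence)
    return [
        sequence[start:start + length]
        for length in range(min_len, n)        # lengths min_len .. n-1
        for start in range(n - length + 1)     # every start position
    ]

# Hypothetical example with a 12-residue peptide.
frags = fragments("PERTIPITREEK")
print(len(frags))
```

Fragments specified by position or by size can then be excluded by filtering the returned list.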
The present invention also provides for the exclusion of any species of polypeptide fragments of the present invention specified by 5′ and 3′ positions or sub-genuses of polypeptides specified by size in amino acids as described above. Any number of fragments specified by 5′ and 3′ positions or by size in amino acids, as described above, can be excluded.
In addition, it should be understood that in certain embodiments, Intellipeptides of the present invention include two or more modifications, including, but not limited to, those described herein. Taking into account the features of the peptide drugs on the market or under current development, it is clear that most of the peptides successfully stabilized against proteolysis consist of a mixture of several types of the above described modifications. This conclusion is understood in the light of the knowledge that many different enzymes are implicated in peptide degradation.
The present invention includes libraries of Intellipeptides. Such libraries include both peptide libraries and libraries of nucleic acid constructs capable of expressing Intellipeptides. In one embodiment, a library of the present invention consists of sequences related to i) EKDRFSVNLDVKHFS; ii) DPLTITSSLSSDGVLTVNGPRKQ; iii) LTITSSLSDGVLTVNGPRK; iv) STSLSPFYLRPPSFLRAP; v) SLSPFYLRPPSFLRAPS; vi) GPERTIPITREEK; vii) PERTIPITREEK; viii) HGKHEERQDE; ix) HGFISREFHRKYR or functional derivatives or mimetics thereof. In a particular embodiment, a library of the invention consists of two or more Intellipeptides or encoding sequences, including, e.g., the sequences provided in
The invention provides isolated or recombinant polypeptides comprising an amino acid sequence having at least 95%, 96%, 97%, 98%, 99% or more sequence identity to a polypeptide fragment of an N-terminal domain, an α crystallin core domain, or a C-terminal domain of the αB crystallin protein over a region of at least about 10, 50, 100, 150, or 200, or more residues, or, a polypeptide encoded by a nucleic acid of the invention. In one aspect, the polypeptide comprises a sequence as set forth in a polypeptide fragment of an N-terminal domain, an α crystallin core domain, or a C-terminal domain of the αB crystallin protein. The invention provides methods for inhibiting protein aggregation in a mammalian subject by administering a polypeptide fragment of αB crystallin protein, e.g., a polypeptide of the invention. The invention also provides methods for screening for compositions that have chaperone activity or inhibit protein aggregation by screening polypeptide fragments of αB crystallin protein, e.g., a polypeptide of the invention.
In one aspect, the invention provides a polypeptide fragment of αB crystallin protein (and the nucleic acids encoding them) where one, some or all of the amino acids in the polypeptide fragment of αB crystallin protein comprise replacements with substituted amino acids. In one aspect, the invention provides methods to enhance the interaction of a polypeptide fragment of αB crystallin protein having molecular chaperone activity with unfolded proteins, denatured proteins, or native conformation proteins.
The peptides and polypeptides of the invention can be expressed recombinantly in vivo after administration of nucleic acids, as described above, or they can be administered directly, e.g., as a pharmaceutical composition. They can be expressed in vitro or in vivo to screen for polypeptide fragments of αB crystallin protein having molecular chaperone activity and for agents that can ameliorate disease, for example, protein aggregation disease, age-related myopathy, or cardiac ischemia.
Polypeptides and peptides of the invention can be isolated from natural sources, be synthetic, or be recombinantly generated polypeptides. Peptides and proteins can be recombinantly expressed in vitro or in vivo. The peptides and polypeptides of the invention can be made and isolated using any method known in the art. Polypeptide and peptides of the invention can also be synthesized, whole or in part, using chemical methods well known in the art. See e.g., Caruthers, Nucleic Acids Res. Symp. Ser. 215-223, 1980; Horn, Nucleic Acids Res. Symp. Ser. 225-232, 1980; Banga, Therapeutic Peptides and Proteins, Formulation, Processing and Delivery Systems Technomic Publishing Co., Lancaster, Pa., 1995. For example, peptide synthesis can be performed using various solid-phase techniques (see e.g., Roberge, Science 269: 202, 1995; Merrifield, Methods Enzymol. 289: 3-13, 1997) and automated synthesis can be achieved, e.g., using the ABI 431A Peptide Synthesizer (Perkin Elmer) in accordance with the instructions provided by the manufacturer.
The peptides and polypeptides of the invention, as defined above, include all “mimetic” and “peptidomimetic” forms. The terms “mimetic” and “peptidomimetic” refer to a synthetic chemical compound which has substantially the same structural and/or functional characteristics as the polypeptides of the invention. The mimetic can be either entirely composed of synthetic, non-natural analogues of amino acids, or a chimeric molecule of partly natural peptide amino acids and partly non-natural analogs of amino acids. The mimetic can also incorporate any amount of natural amino acid conservative substitutions as long as such substitutions also do not substantially alter the mimetic's structure and/or activity. As with polypeptides of the invention which are conservative variants, routine experimentation will determine whether a mimetic is within the scope of the invention, i.e., that its structure and/or function is not substantially altered. Thus, a mimetic composition is within the scope of the invention if, when administered to or expressed in a cell, it has the activity of, e.g., a polypeptide fragment of αB crystallin protein having molecular chaperone activity. A mimetic composition can also be within the scope of the invention if it stimulates a molecular chaperone activity in a cell or mammalian subject with a protein aggregation disease.
Intellipeptides or peptides that inhibit abnormal protein folding, protein unfolding, protein misfolding, or protein aggregation include, but are not limited to, Intellipeptides SLSPFYLRPPSFLRAPS, EKDRFSVNLDVKHFS, HGFISREFHRKYR, DPLTITSSLSSDGVLTVNGPRKQ, and PERTIPITREEK.
Polypeptide mimetic compositions can contain any combination of non-natural structural components, which are typically from three structural groups: a) residue linkage groups other than the natural amide bond (“peptide bond”) linkages; b) non-natural residues in place of naturally occurring amino acid residues; or c) residues which induce secondary structural mimicry, i.e., to induce or stabilize a secondary structure, e.g., a beta turn, gamma turn, beta sheet, alpha helix conformation, and the like. For example, a polypeptide can be characterized as a mimetic when all or some of its residues are joined by chemical means other than natural peptide bonds. Individual peptidomimetic residues can be joined by peptide bonds, other chemical bonds or coupling means, such as, e.g., glutaraldehyde, N-hydroxysuccinimide esters, bifunctional maleimides, N,N′-dicyclohexylcarbodiimide (DCC) or N,N′-diisopropylcarbodiimide (DIC). Linking groups that can be an alternative to the traditional amide bond (“peptide bond”) linkages include, e.g., ketomethylene (e.g., —C(═O)—CH2— for —C(═O)—NH—), aminomethylene (CH2—NH), ethylene, olefin (CH═CH), ether (CH2—O), thioether (CH2—S), tetrazole (CN4—), thiazole, retroamide, thioamide, or ester (see, e.g., Spatola (1983) in Chemistry and Biochemistry of Amino Acids, Peptides and Proteins, Vol. 7, pp 267-357, “Peptide Backbone Modifications,” Marcell Dekker, N.Y.).
A polypeptide can also be characterized as a mimetic by containing all or some non-natural residues in place of naturally occurring amino acid residues. Non-natural residues are well described in the scientific and patent literature; a few exemplary non-natural compositions useful as mimetics of natural amino acid residues and guidelines are described below. Mimetics of aromatic amino acids can be generated by replacement with, e.g., D- or L-naphthylalanine; D- or L-phenylglycine; D- or L-2-thienylalanine; D- or L-1-, 2-, 3-, or 4-pyrenylalanine; D- or L-3-thienylalanine; D- or L-(2-pyridinyl)-alanine; D- or L-(3-pyridinyl)-alanine; D- or L-(2-pyrazinyl)-alanine; D- or L-(4-isopropyl)-phenylglycine; D-(trifluoromethyl)-phenylglycine; D-(trifluoromethyl)-phenylalanine; D-p-fluoro-phenylalanine; D- or L-p-biphenylphenylalanine; D- or L-p-methoxy-biphenylphenylalanine; D- or L-2-indole(alkyl)alanines; and D- or L-alkylalanines, where alkyl can be substituted or unsubstituted methyl, ethyl, propyl, hexyl, butyl, pentyl, isopropyl, iso-butyl, sec-isotyl, iso-pentyl, or a non-acidic amino acid. Aromatic rings of a non-natural amino acid include, e.g., thiazolyl, thiophenyl, pyrazolyl, benzimidazolyl, naphthyl, furanyl, pyrrolyl, and pyridyl aromatic rings.
Mimetics of acidic amino acids can be generated by substitution with, e.g., non-carboxylate amino acids that maintain a negative charge, (phosphono)alanine, or sulfated threonine. Carboxyl side groups (e.g., aspartyl or glutamyl) can also be selectively modified by reaction with carbodiimides (R′—N═C═N—R′) such as, e.g., 1-cyclohexyl-3-(2-morpholinyl-(4-ethyl)) carbodiimide or 1-ethyl-3-(4-azonia-4,4-dimethylpentyl) carbodiimide. Aspartyl or glutamyl can also be converted to asparaginyl and glutaminyl residues by reaction with ammonium ions.
Mimetics of basic amino acids can be generated by substitution with, e.g., (in addition to lysine and arginine) the amino acids ornithine, citrulline, or (guanidino)-acetic acid, or (guanidino)alkyl-acetic acid, where alkyl is defined above. Nitrile derivatives (e.g., containing the CN-moiety in place of COOH) can be substituted for asparagine or glutamine. Asparaginyl and glutaminyl residues can be deaminated to the corresponding aspartyl or glutamyl residues.
Arginine residue mimetics can be generated by reacting arginyl with, e.g., one or more conventional reagents, including, e.g., phenylglyoxal, 2,3-butanedione, 1,2-cyclohexanedione, or ninhydrin, preferably under alkaline conditions. Tyrosine residue mimetics can be generated by reacting tyrosyl with, e.g., aromatic diazonium compounds or tetranitromethane. N-acetylimidazole and tetranitromethane can be used to form O-acetyl tyrosyl species and 3-nitro derivatives, respectively. Cysteine residue mimetics can be generated by reacting cysteinyl residues with, e.g., alpha-haloacetates such as 2-chloroacetic acid or chloroacetamide and corresponding amines, to give carboxymethyl or carboxyamidomethyl derivatives. Cysteine residue mimetics can also be generated by reacting cysteinyl residues with, e.g., bromo-trifluoroacetone; alpha-bromo-beta-(5-imidazolyl) propionic acid; chloroacetyl phosphate; N-alkylmaleimides; 3-nitro-2-pyridyl disulfide; methyl 2-pyridyl disulfide; p-chloromercuribenzoate; 2-chloromercuri-4-nitrophenol; or chloro-7-nitrobenzo-oxa-1,3-diazole. Lysine mimetics can be generated (and amino terminal residues can be altered) by reacting lysinyl with, e.g., succinic or other carboxylic acid anhydrides. Lysine and other alpha-amino-containing residue mimetics can also be generated by reaction with imidoesters, such as methyl picolinimidate, pyridoxal phosphate, pyridoxal, chloroborohydride, trinitrobenzenesulfonic acid, O-methylisourea, 2,4-pentanedione, and transamidase-catalyzed reactions with glyoxylate. Mimetics of methionine can be generated by reaction with, e.g., methionine sulfoxide. Mimetics of proline include, e.g., pipecolic acid, thiazolidine carboxylic acid, 3- or 4-hydroxyproline, dehydroproline, 3- or 4-methylproline, or 3,3-dimethylproline. Histidine residue mimetics can be generated by reacting histidyl with, e.g., diethylpyrocarbonate or para-bromophenacyl bromide.
Other mimetics include, e.g., those generated by hydroxylation of proline and lysine; phosphorylation of the hydroxyl groups of seryl or threonyl residues; methylation of the alpha-amino groups of lysine, arginine and histidine; acetylation of the N-terminal amine; methylation of main chain amide residues or substitution with N-methyl amino acids; or amidation of C-terminal carboxyl groups.
A component of a polypeptide of the invention can also be replaced by an amino acid (or peptidomimetic residue) of the opposite chirality. Thus, any amino acid naturally occurring in the L-configuration (which can also be referred to as the R or S, depending upon the structure of the chemical entity) can be replaced with the amino acid of the same chemical structural type or a peptidomimetic, but of the opposite chirality, referred to as the D-amino acid, but which can additionally be referred to as the R- or S-form.
The invention also provides polypeptides that are “substantially identical” to an exemplary polypeptide of the invention. A “substantially identical” amino acid sequence is a sequence that differs from a reference sequence by one or more conservative or non-conservative amino acid substitutions, deletions, or insertions, particularly when such a substitution occurs at a site that is not the active site of the molecule, and provided that the polypeptide essentially retains its functional properties. A conservative amino acid substitution, for example, substitutes one amino acid for another of the same class (e.g., substitution of one hydrophobic amino acid, such as isoleucine, valine, leucine, or methionine, for another, or substitution of one polar amino acid for another, such as substitution of arginine for lysine, glutamic acid for aspartic acid or glutamine for asparagine). One or more amino acids can be deleted, for example, from an αB crystallin polypeptide having molecular chaperone activity of the invention, resulting in modification of the structure of the polypeptide, without significantly altering its biological activity. For example, amino- or carboxyl-terminal, or internal, amino acids which are not required for molecular chaperone activity of αB crystallin protein can be removed.
The skilled artisan will recognize that individual synthetic residues and polypeptides incorporating these mimetics can be synthesized using a variety of procedures and methodologies, which are well described in the scientific and patent literature, e.g., Organic Syntheses Collective Volumes, Gilman, et al. (Eds) John Wiley & Sons, Inc., NY. Peptides and peptide mimetics of the invention can also be synthesized using combinatorial methodologies. Various techniques for generation of peptide and peptidomimetic libraries are well known, and include, e.g., multipin, tea bag, and split-couple-mix techniques; see, e.g., al-Obeidi, Mol. Biotechnol. 9: 205-223, 1998; Hruby, Curr. Opin. Chem. Biol. 1: 114-119, 1997; Ostergaard, Mol. Divers. 3: 17-27, 1997; Ostresh, Methods Enzymol. 267: 220-234, 1996. Modified peptides of the invention can be further produced by chemical modification methods, see, e.g., Belousov, Nucleic Acids Res. 25: 3440-3444, 1997; Frenkel, Free Radic. Biol. Med. 19: 373-380, 1995; Blommers, Biochemistry 33: 7886-7896, 1994.
Peptides and polypeptides of the invention can also be synthesized and expressed as fusion proteins with one or more additional domains linked thereto for, e.g., producing a more immunogenic peptide, to more readily isolate a recombinantly synthesized peptide, to identify and isolate antibodies and antibody-expressing B cells, and the like. Detection and purification facilitating domains include, e.g., metal chelating peptides such as polyhistidine tracts and histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification system (Amgen Corp, Seattle, Wash.). Cleavable linker sequences such as Factor Xa or enterokinase (Invitrogen, San Diego, Calif.) can be included between a purification domain and the motif-comprising peptide or polypeptide to facilitate purification. For example, an expression vector can include an epitope-encoding nucleic acid sequence linked to six histidine residues followed by a thioredoxin and an enterokinase cleavage site (see e.g., Williams, Biochemistry 34: 1787-1797, 1995; Dobeli, Protein Expr. Purif. 12: 404-14, 1998). The histidine residues facilitate detection and purification while the enterokinase cleavage site provides a means for purifying the epitope from the remainder of the fusion protein. Technology pertaining to vectors encoding fusion proteins and application of fusion proteins are well described in the scientific and patent literature, see e.g., Kroll, DNA Cell. Biol., 12: 441-53, 1993.
As used herein, the term “isolated” means that the material is removed from its original environment (e.g., the natural environment if it is naturally occurring). For example, a naturally occurring polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide or polypeptide, separated from some or all of the coexisting materials in the natural system, is isolated. Such polynucleotides could be part of a vector and/or such polynucleotides or polypeptides could be part of a composition, and still be isolated in that such vector or composition is not part of its natural environment. As used herein, an isolated material or composition can also be a “purified” composition, i.e., it does not require absolute purity; rather, it is intended as a relative definition. Individual nucleic acids obtained from a library can be conventionally purified to electrophoretic homogeneity. In alternative aspects, the invention provides nucleic acids which have been purified from genomic DNA or from other sequences in a library or other environment by at least one, two, three, four, five or more orders of magnitude.
Intellipeptide analogs, polypeptide fragment of αB crystallin protein having molecular chaperone activity, are generally designed and produced by chemical modifications of a lead peptide, including, e.g., any of the particular peptides described herein, such as any of the following sequences: i) EKDRFSVNLDVKHFS; ii) DPLTITSSLSSDGVLTVNGPRKQ; iii) LTITSSLSDGVLTVNGPRK; iv) STSLSPFYLRPPSFLRAP; v) SLSPFYLRPPSFLRAPS; vi) GPERTIPITREEK; vii) PERTIPITREEK; viii) HGKHEERQDE; ix) HGFISREFHRKYR or functional variants or mimetics thereof. An exemplary polypeptide fragment of αB crystallin protein having molecular chaperone activity is presented; e.g., the N-terminal domain polypeptide fragment is 9WIRRPFFPFHSP20 or 43SLSPFYLRPPSFLRAP58, the α crystallin core domain polypeptide fragment is 75FSVNLDVK82 (β3), 113FISREFHR120, 131LTITSSLS138 (β8), 141GVLTVNGP148 (β9), 73DRFSVNLDVKHFS85, or 131LTITSSLSDGV141, or the C-terminal domain polypeptide fragment is 157RTIPITRE164, or functional variants thereof.
The terms “identical” or percent “identity”, in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region (e.g., nucleotide sequence encoding an antibody described herein or amino acid sequence of an antibody described herein), when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection. Such sequences are then said to be “substantially identical.” This term also refers to, or can be applied to, the complement of a test sequence. The term also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides in length.
For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Preferably, default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.
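As a minimal illustration of the percent identity calculation, the following Python sketch assumes that the test and reference sequences have already been aligned and are supplied as equal-length strings with `-` marking gaps. The function name `percent_identity` and the use of the full alignment length as the denominator are assumptions for the example; alignment programs can apply other conventions.

```python
def percent_identity(aligned_ref, aligned_test, gap="-"):
    """Percent of alignment columns holding the same residue in both
    sequences; a gap never counts as a match."""
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_ref, aligned_test)
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_ref)

# Hypothetical example: a 12-residue peptide aligned against a
# 13-residue reference, with one gap column at the N-terminus.
pid = percent_identity("GPERTIPITREEK", "-PERTIPITREEK")
print(round(pid, 1))
```

Here 12 of the 13 alignment columns are identical, so the sketch reports an identity of roughly 92%.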
A “comparison window”, as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence can be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2: 482, 1981, by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48: 443, 1970, by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85: 2444, 1988, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Current Protocols in Molecular Biology, Ausubel et al., eds. 1995 supplement).
Programs for searching for alignments are well known in the art, e.g., BLAST and the like. For example, if the target species is human, a source of such amino acid sequences or gene sequences (germline or rearranged antibody sequences) can be found in any suitable reference database such as Genbank, the NCBI protein databank (http://ncbi.nlm.nih.gov/BLAST/), VBASE, a database of human antibody genes (http://www.mrc-cpe.cam.ac.uk/imt-doc), and the Kabat database of immunoglobulins (http://www.immuno.bme.nwu.edu) or translated products thereof. If the alignments are done based on the nucleotide sequences, then the selected genes should be analyzed to determine which genes of that subset have the closest amino acid homology to the originating species antibody. It is contemplated that amino acid sequences or gene sequences which approach a higher degree homology as compared to other sequences in the database can be utilized and manipulated in accordance with the procedures described herein. Moreover, amino acid sequences or genes which have lesser homology can be utilized when they encode products which, when manipulated and selected in accordance with the procedures described herein, exhibit specificity for the predetermined target antigen. In certain embodiments, an acceptable range of homology is greater than about 50%. It should be understood that target species can be other than human.
Preferred examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., J. Mol. Biol. 215: 403-410, 1990 and Altschul et al., Nuc. Acids Res. 25: 3389-3402, 1997, respectively. BLAST and BLAST 2.0 are used, with the parameters described herein, to determine percent sequence identity for the nucleic acids and proteins of the invention. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
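The seed-and-extend procedure described above can be sketched, in simplified form, for nucleotide sequences. The following Python example is not the BLAST implementation: it assumes exact word matches only (no neighborhood threshold T), performs ungapped extension, and uses hypothetical function names (`extend`, `seed_and_extend`) and small example values of W and X chosen for illustration.

```python
def extend(query, subject, qi, si, step, start_score, M, N, X):
    """Ungapped extension in one direction (step is +1 or -1).
    Returns (score gained at the best point, residues added)."""
    score = best = start_score
    best_len = length = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        score += M if query[qi] == subject[si] else N
        length += 1
        if score > best:
            best, best_len = score, length
        # Halt when the score has fallen X below its maximum,
        # or when it has dropped to zero or below.
        if best - score >= X or score <= 0:
            break
        qi += step
        si += step
    return best - start_score, best_len

def seed_and_extend(query, subject, W=11, M=5, N=-4, X=20):
    """Return the best (score, query_start, subject_start, length)
    found from exact word seeds of length W, or None if no seed."""
    index = {}
    for j in range(len(subject) - W + 1):
        index.setdefault(subject[j:j + W], []).append(j)
    hits = []
    for i in range(len(query) - W + 1):
        for j in index.get(query[i:i + W], []):
            seed = W * M
            right, rlen = extend(query, subject, i + W, j + W, 1,
                                 seed, M, N, X)
            left, llen = extend(query, subject, i - 1, j - 1, -1,
                                seed + right, M, N, X)
            hits.append((seed + right + left, i - llen, j - llen,
                         W + llen + rlen))
    return max(hits, default=None)

# Hypothetical example: an 8-base match flanked by mismatches.
best_hit = seed_and_extend("TTACGTACGTAA", "CCACGTACGTGG", W=4, X=8)
print(best_hit)
```

In this example the highest scoring segment pair covers the eight shared bases, scoring 8 × M = 40 and starting at position 2 (zero-based) in both sequences.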
The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89: 10915, 1992), with alignments (B) of 50, expectation (E) of 10, M=5, N=−4, and a comparison of both strands.
“Polypeptide”, “peptide” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymer. “Polypeptide” and “protein” further refer to amino acids joined to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and can contain modified amino acids other than the 20 gene-encoded amino acids. The term “polypeptide” also includes peptides and polypeptide fragments, motifs and the like. The term also includes glycosylated polypeptides. The peptides and polypeptides of the invention also include all “mimetic” and “peptidomimetic” forms, as described in further detail, below.
“Amino acid” or “functional variant or mimetic” of a polypeptide refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
Amino acids can be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, can be referred to by their commonly accepted single-letter codes.
“Conservatively modified variants” or “variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations”, which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is implicit in each described sequence with respect to the expression product, but not with respect to actual probe sequences.
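To illustrate silent variations, the following Python sketch enumerates every coding sequence that encodes the same polypeptide as a given sequence. It is a sketch under stated assumptions: the standard genetic code is used, sequences are written in the DNA alphabet (T in place of U), and the names `CODON_TABLE` and `silent_variants` are introduced here for illustration only.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks stop codons.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def silent_variants(coding_seq):
    """Return every coding sequence that encodes the same amino acid
    sequence as coding_seq (including coding_seq itself)."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]
    # For each position, collect all codons for the same amino acid.
    choices = [
        [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        for codon in codons
    ]
    return ["".join(combo) for combo in product(*choices)]

# Hypothetical example: Met-Ala-Trp, where only alanine is degenerate.
variants = sorted(silent_variants("ATGGCATGG"))
print(variants)
```

Because methionine and tryptophan each have a single codon and alanine has four, the example yields four silent variations.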
As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a “conservatively modified variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention.
The following eight groups each contain amino acids that are conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins, 1984).
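By way of non-limiting illustration, the eight groups listed above can be encoded as a simple lookup. The following is a minimal sketch only; the names CONSERVATIVE_GROUPS, is_conservative and only_conservative_changes are hypothetical and do not appear in the disclosure, and treating an unchanged residue as acceptable is an assumption made for convenience.

```python
# The eight conservative-substitution groups listed above (one-letter codes).
# Methionine (M) appears in two groups, so membership is not transitive:
# C~M and M~I do not imply C~I.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(a: str, b: str) -> bool:
    """Return True if a and b are the same residue or share a listed group."""
    a, b = a.upper(), b.upper()
    if a == b:  # assumption: an unchanged residue is trivially acceptable
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def only_conservative_changes(seq1: str, seq2: str) -> bool:
    """Return True if two equal-length, ungapped sequences differ only by
    substitutions within the listed groups."""
    if len(seq1) != len(seq2):
        return False
    return all(is_conservative(x, y) for x, y in zip(seq1, seq2))
```

For example, only_conservative_changes("MKT", "LRS") evaluates each position against the groups (M/L, K/R, T/S) and accepts the pair, whereas a cysteine-to-isoleucine change is rejected because the two residues share no group.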
Macromolecular structures such as polypeptide structures can be described in terms of various levels of organization. For a general discussion of this organization, see, e.g., Alberts et al., Molecular Biology of the Cell (3rd ed., 1994) and Cantor and Schimmel, Biophysical Chemistry Part I: The Conformation of Biological Macromolecules, 1980. “Primary structure” refers to the amino acid sequence of a particular peptide. “Secondary structure” refers to locally ordered, three dimensional structures within a polypeptide. These structures are commonly known as domains, e.g., enzymatic domains, extracellular domains, transmembrane domains, pore domains, and cytoplasmic tail domains. Domains are portions of a polypeptide that form a compact unit of the polypeptide and are typically 15 to 350 amino acids long. Exemplary domains include domains with enzymatic activity, e.g., a kinase domain. Typical domains are made up of sections of lesser organization such as stretches of β-sheet and α-helices. “Tertiary structure” refers to the complete three dimensional structure of a polypeptide monomer. “Quaternary structure” refers to the three dimensional structure formed by the noncovalent association of independent tertiary units. Anisotropic terms are also known as energy terms.
A particular nucleic acid sequence also implicitly encompasses “splice variants.” Similarly, a particular protein encoded by a nucleic acid implicitly encompasses any protein encoded by a splice variant of that nucleic acid. “Splice variants”, as the name suggests, are products of alternative splicing of a gene. After transcription, an initial nucleic acid transcript can be spliced such that different (alternate) nucleic acid splice products encode different polypeptides. Mechanisms for the production of splice variants vary, but include alternate splicing of exons. Alternate polypeptides derived from the same nucleic acid by read-through transcription are also encompassed by this definition. Any products of a splicing reaction, including recombinant forms of the splice products, are included in this definition.
Functional variants of polypeptides of these genes and gene products are useful in the invention. “Functional variant” refers to a nucleic acid or protein having a nucleotide sequence or amino acid sequence, respectively, that is “identical,” “essentially identical,” “substantially identical,” “homologous” or “similar” (as described below) to a reference sequence which may, by way of non-limiting example, be the sequence of an isolated nucleic acid or protein, or a consensus sequence derived by comparison of two or more related nucleic acids or proteins, or a group of isoforms of a given nucleic acid or protein. Non-limiting examples of types of isoforms include isoforms of differing molecular weight that result from, e.g., alternate RNA splicing or proteolytic cleavage; and isoforms having different post-translational modifications, such as glycosylation; and the like.
Two sequences are said to be “identical” if the two sequences, when aligned with each other, are exactly the same with no gaps, substitutions, insertions or deletions.
Two sequences are said to be “essentially identical” if the following criteria are met. Two amino acid sequences are “essentially identical” if the two sequences, when aligned with each other, are exactly the same with no gaps, insertions or deletions, and the sequences have only conservative amino acid substitutions. Conservative amino acid substitutions are as described in Table 3.
Two nucleotide sequences are “essentially identical” if they encode the identical or essentially identical amino acid sequence. As is known in the art, due to the nature of the genetic code, some amino acids are encoded by several different three base codons, and these codons may thus be substituted for each other without altering the amino acid at that position in an amino acid sequence. In the genetic code, TTA, TTG, CTT, CTC, CTA and CTG encode Leu; AGA, AGG, CGT, CGC, CGA and CGG encode Arg; GCT, GCC, GCA and GCG encode Ala; GGT, GGC, GGA and GGG encode Gly; ACT, ACC, ACA and ACG encode Thr; GTT, GTC, GTA and GTG encode Val; TCT, TCC, TCA and TCG encode Ser; CCT, CCC, CCA and CCG encode Pro; ATT, ATC and ATA encode Ile; GAA and GAG encode Glu; CAA and CAG encode Gln; GAT and GAC encode Asp; AAT and AAC encode Asn; AGT and AGC encode Ser; TAT and TAC encode Tyr; TGT and TGC encode Cys; AAA and AAG encode Lys; CAT and CAC encode His; TTT and TTC encode Phe; TGG encodes Trp; ATG encodes Met; and TGA, TAA and TAG are translation stop codons.
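As a non-limiting illustration of the codon degeneracy described above, the following sketch translates two coding sequences and tests whether they encode the identical amino acid sequence. It uses the standard genetic code, in which isoleucine is encoded by ATT, ATC and ATA. The names CODON_TABLE, translate and encode_identical_protein are hypothetical. The sketch covers only the "identical" case and does not evaluate conservative substitutions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in T, C, A, G order (third base varies fastest);
# '*' marks the translation stop codons TAA, TAG and TGA.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq: str) -> str:
    """Translate an in-frame coding sequence (DNA or RNA letters) to protein."""
    dna = seq.upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_identical_protein(seq1: str, seq2: str) -> bool:
    """Return True if both sequences encode the same amino acid sequence,
    i.e. they differ only by silent (synonymous) codon changes."""
    return translate(seq1) == translate(seq2)
```

For example, ATGGCTTTA and ATGGCGCTG differ at two codons yet both translate to Met-Ala-Leu, so the second function accepts them as encoding the identical amino acid sequence.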
Two amino acid sequences are “substantially identical” if, when the two sequences are aligned, (i) less than 30%, preferably <20%, more preferably <15%, most preferably <10%, of the amino acid residues differ between the two sequences; and (ii) the number of gaps between, or insertions in, deletions of and/or substitutions of residues is <10%, preferably <5%, more preferably <3%, most preferably <1%, of the number of amino acid residues that occur over the length of the shorter of the two aligned sequences.
Two sequences are said to be “homologous” if any of the following criteria are met. The term “homolog” includes without limitation orthologs (homologs having genetic similarity as the result of sharing a common ancestor and encoding proteins that have the same function in different species) and paralogs (similar to orthologs, yet gene and protein similarity is the result of a gene duplication).
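The two criteria for substantial identity can be illustrated with the following non-limiting sketch, which operates on a pair of already aligned sequences in which "-" marks a gap. The function names and default thresholds are hypothetical choices reflecting the broadest limits recited above (30% and 10%). As one possible reading of the criteria, substituted positions are counted under criterion (i) and gap positions under criterion (ii), each expressed as a fraction of the shorter ungapped sequence.

```python
def alignment_stats(aln1: str, aln2: str) -> tuple[float, float]:
    """Return (fraction of substituted positions, fraction of gap positions),
    both relative to the length of the shorter ungapped sequence."""
    if len(aln1) != len(aln2):
        raise ValueError("aligned sequences must have equal length")
    mismatches = gaps = 0
    for x, y in zip(aln1, aln2):
        if x == "-" or y == "-":
            gaps += 1          # insertion or deletion column
        elif x != y:
            mismatches += 1    # substitution column
    shortest = min(len(aln1.replace("-", "")), len(aln2.replace("-", "")))
    if shortest == 0:
        raise ValueError("sequences must contain at least one residue")
    return mismatches / shortest, gaps / shortest


def substantially_identical(aln1: str, aln2: str,
                            max_variation: float = 0.30,
                            max_gaps: float = 0.10) -> bool:
    """Apply criteria (i) and (ii) with the given (assumed) thresholds."""
    variation, gap_fraction = alignment_stats(aln1, aln2)
    return variation < max_variation and gap_fraction < max_gaps
```

The stricter preferred limits (for example 10% variation and 1% gaps) can be applied by passing smaller values for max_variation and max_gaps.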
One indication that nucleotide sequences are homologous is if two nucleic acid molecules hybridize to each other under stringent conditions. Stringent conditions are sequence dependent and will be different in different circumstances. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Typically, stringent conditions will be those in which the salt concentration is about 0.02 M at pH 7 and the temperature is at least about 60° C.
Another way by which it can be determined if two sequences are homologous is by using an appropriate algorithm to determine if the above-described criteria for substantially identical sequences are met. Sequence comparisons between two (or more) polynucleotides or polypeptides are typically performed by algorithms such as, for example, the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2: 482, 1981; by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48: 443, 1970; by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. U.S.A. 85: 2444, 1988; and by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, version 10.2 Genetics Computer Group (GCG), 575 Science Dr., Madison, Wis.); BLASTP, BLASTN, and FASTA (Altschul et al., J. Mol. Biol. 215: 403-410, 1990); or by visual inspection.
Optimal alignments are found by inserting gaps to maximize the number of matches using the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2: 482-489, 1981. “Gap” uses the algorithm of Needleman and Wunsch (J. Mol. Biol. 48: 443-453, 1970) to find the alignment of two complete sequences that maximizes the number of matches and minimizes the number of gaps. In such algorithms, a “penalty” of about 3.0 to about 20 for each gap, and no penalty for end gaps, is used.
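The following non-limiting sketch illustrates a Needleman-Wunsch-style dynamic programming alignment with a per-gap penalty and no penalty for end gaps. It is not the GCG “Gap” program. The function name align and the match, mismatch and gap values are assumptions chosen for illustration, and each gap position is penalized equally, with no separate gap-extension term.

```python
def align(a: str, b: str, match: float = 1.0, mismatch: float = 0.0,
          gap: float = 3.0) -> tuple[float, str, str]:
    """Align two complete sequences; leading and trailing gaps are free."""
    n, m = len(a), len(b)

    def up_cost(j: int) -> float:
        # Gap placed in b opposite a residue of a; free before or after b.
        return 0.0 if j in (0, m) else gap

    def left_cost(i: int) -> float:
        # Gap placed in a opposite a residue of b; free before or after a.
        return 0.0 if i in (0, n) else gap

    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] - up_cost(j),
                              score[i][j - 1] - left_cost(i))

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] - up_cost(j):
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Because end gaps carry no penalty, a short sequence can be placed anywhere along a longer one without cost; internal gaps are introduced only when the matches gained outweigh the penalty.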
Homologous proteins also include members of clusters of orthologous groups of proteins (COGs), which are generated by phylogenetic classification of proteins encoded in complete genomes. To date, COGs have been delineated by comparing protein sequences encoded in 43 complete genomes, representing 30 major phylogenetic lineages. Each COG consists of individual proteins or groups of paralogs from at least 3 lineages and thus corresponds to an ancient conserved domain (see Tatusov et al., Science, 278: 631-637, 1997; Tatusov et al., Nucleic Acids Res. 29: 22-28, 2001; Chervitz et al., Science 282: 2022-2028, 1998; and http://www.ncbi.nlm.nih.gov/COG/).
The entirety of two sequences may be identical, essentially identical, substantially identical, or homologous to one another, or portions of such sequences may be identical or substantially identical with sequences of similar length in other sequences. In either case, such sequences are similar to each other. Typically, stretches of identical or essentially identical sequence within similar sequences have a length of >12, preferably >24, more preferably >48, and most preferably >96 residues.
“Polypeptide” includes proteins, fusion proteins, oligopeptides and polypeptide derivatives, with the exception that peptidomimetics are considered to be small molecules herein. Antibodies and antibody derivatives are described in a separate section but are, for purposes of the invention, treated as a subclass of the polypeptides and derivatives.
A “protein” is a molecule having a sequence of amino acids that are linked to each other in a linear molecule by peptide bonds. The term protein refers to a polypeptide that is isolated from a natural source, or produced from an isolated cDNA using recombinant DNA technology; and has a sequence of amino acids having a length of at least about 200 amino acids.
A “fusion protein” is a type of recombinant protein that has an amino acid sequence that results from the linkage of the amino acid sequences of two or more normally separate polypeptides.
A “protein fragment” is a proteolytic fragment of a larger polypeptide, which may be a protein or a fusion protein. A proteolytic fragment may be prepared by in vivo or in vitro proteolytic cleavage of a larger polypeptide, and is generally too large to be prepared by chemical synthesis. Proteolytic fragments have amino acid sequences having a length from about 200 to about 1,000 amino acids.
An “oligopeptide” is a polypeptide having a short amino acid sequence (i.e., 2 to about 200 amino acids). An oligopeptide is generally prepared by chemical synthesis.
Although oligopeptides and protein fragments may be otherwise prepared, it is also possible to prepare them using recombinant DNA technology and/or in vitro biochemical manipulations. For example, a nucleic acid encoding an amino acid sequence may be prepared and used as a template for in vitro transcription/translation reactions. In such reactions, an exogenous nucleic acid encoding a preselected polypeptide is introduced into a mixture that is essentially depleted of endogenous nucleic acids and that contains all of the cellular components required for transcription and translation. One or more radiolabeled amino acids are added before or with the exogenous DNA, and transcription and translation are allowed to proceed. Because the only nucleic acid present in the reaction mix is the exogenous nucleic acid added to the reaction, only polypeptides encoded thereby are produced, and incorporate the radiolabeled amino acid(s). In this manner, polypeptides encoded by a preselected exogenous nucleic acid are radiolabeled. Although other proteins are present in the reaction mix, the preselected polypeptide is the only one that is produced in the presence of the radiolabeled amino acids and is thus uniquely labeled.
As is explained in detail below, “polypeptide derivatives” include without limitation mutant polypeptides, chemically modified polypeptides, and peptidomimetics.
The polypeptides of this invention, including the analogs and other modified variants, may generally be prepared following known techniques. Preferably, synthetic production of the polypeptide of the invention is according to the solid phase synthetic method. Solid phase synthesis is well understood and is a common method for preparation of polypeptides, as are a variety of modifications of that technique [Merrifield, J. Am. Chem. Soc., 85: 2149, 1963; Stewart and Young, Solid Phase Peptide Synthesis, Pierce Chemical Company, Rockford, Ill., 1984; Bodanszky and Bodanszky, The Practice of Peptide Synthesis, Springer-Verlag, New York, 1984; Atherton and Sheppard, Solid Phase Peptide Synthesis: A Practical Approach, IRL Press, New York, 1989]. See, also, the specific method described in Example 1 below.
Alternatively, polypeptides of this invention may be prepared in recombinant systems using polynucleotide sequences encoding the polypeptides. For example, fusion proteins are typically prepared using recombinant DNA technology.
Functional Polypeptide Variant. A “variant” or “functional variant” of a polypeptide is a compound that is not, by definition, a polypeptide, i.e., it contains at least one chemical linkage that is not a peptide bond. Thus, polypeptide derivatives include without limitation proteins that naturally undergo post-translational modifications such as, e.g., glycosylation. It is understood that a polypeptide of the invention may contain more than one of the following modifications within the same polypeptide. Preferred polypeptide derivatives retain a desirable attribute, which may be biological activity; more preferably, a polypeptide derivative is enhanced with regard to one or more desirable attributes, or has one or more desirable attributes not found in the parent polypeptide. Although they are described in this section, peptidomimetics are taken as small molecules in the present disclosure.
A polypeptide having an amino acid sequence identical to that found in a protein prepared from a natural source is a “wildtype” polypeptide. Functional variants of polypeptides can be prepared by chemical synthesis, including without limitation combinatorial synthesis.
Functional variants of polypeptides larger than oligopeptides can be prepared using recombinant DNA technology by altering the nucleotide sequence of a nucleic acid encoding a polypeptide. Although some alterations in the nucleotide sequence will not alter the amino acid sequence of the polypeptide encoded thereby (“silent” mutations), many will result in a polypeptide having an amino acid sequence that is altered relative to the parent sequence. Such altered amino acid sequences may comprise substitutions, deletions and additions of amino acids, with the proviso that such amino acids are naturally occurring amino acids.
Thus, subjecting a nucleic acid that encodes a polypeptide to mutagenesis is one technique that can be used to prepare functional variants of polypeptides, particularly ones having substitutions of amino acids but no deletions or insertions thereof. A variety of mutagenic techniques are known that can be used in vitro or in vivo including without limitation chemical mutagenesis and PCR-mediated mutagenesis. Such mutagenesis may be randomly targeted (i.e., mutations may occur anywhere within the nucleic acid) or directed to a section of the nucleic acid that encodes a stretch of amino acids of particular interest. Using such techniques, it is possible to prepare randomized, combinatorial or focused compound libraries, pools and mixtures.
Polypeptides having deletions or insertions of naturally occurring amino acids may be synthetic oligopeptides that result from the chemical synthesis of amino acid sequences that are based on the amino acid sequence of a parent polypeptide but which have one or more amino acids inserted or deleted relative to the sequence of the parent polypeptide. Insertions and deletions of amino acid residues in polypeptides having longer amino acid sequences may be prepared by directed mutagenesis.
Chemically Modified Polypeptides. As contemplated by this invention, “polypeptide” includes those having one or more chemical modifications relative to another polypeptide, i.e., chemically modified polypeptides. The polypeptide from which a chemically modified polypeptide is derived may be a wildtype protein, a functional variant protein or a functional variant polypeptide, or polypeptide fragments thereof; an antibody or other polypeptide ligand according to the invention including without limitation single-chain antibodies, crystalline proteins and polypeptide derivatives thereof; or polypeptide ligands prepared according to the disclosure. Preferably, the chemical modification(s) confer(s) or improve(s) desirable attributes of the polypeptide but do(es) not substantially alter or compromise the biological activity thereof. Desirable attributes include but are not limited to increased shelf-life; enhanced serum or other in vivo stability; resistance to proteases; and the like. Such modifications include by way of non-limiting example N-terminal acetylation, glycosylation, and biotinylation.
Polypeptides with N-Terminal or C-Terminal Chemical Groups. An effective approach to confer resistance to peptidases acting on the N-terminal or C-terminal residues of a polypeptide is to add chemical groups at the polypeptide termini, such that the modified polypeptide is no longer a substrate for the peptidase. One such chemical modification is glycosylation of the polypeptides at either or both termini. Certain chemical modifications, in particular N-terminal glycosylation, have been shown to increase the stability of polypeptides in human serum (Powell et al., Pharma. Res. 10: 1268-1273, 1993). Other chemical modifications which enhance serum stability include, but are not limited to, the addition of an N-terminal alkyl group, consisting of a lower alkyl of from 1 to 20 carbons, such as an acetyl group, and/or the addition of a C-terminal amide or substituted amide group.
Polypeptides with a Terminal D-Amino Acid. The presence of an N-terminal D-amino acid increases the serum stability of a polypeptide that otherwise contains L-amino acids, because exopeptidases acting on the N-terminal residue cannot utilize a D-amino acid as a substrate. Similarly, the presence of a C-terminal D-amino acid also stabilizes a polypeptide, because serum exopeptidases acting on the C-terminal residue cannot utilize a D-amino acid as a substrate. With the exception of these terminal modifications, the amino acid sequences of polypeptides with N-terminal and/or C-terminal D-amino acids are usually identical to the sequences of the parent L-amino acid polypeptide.
Polypeptides with Substitution of Natural Amino Acids by Unnatural Amino Acids. Substitution of unnatural amino acids for natural amino acids in a subsequence of a polypeptide can confer or enhance desirable attributes including biological activity. Such a substitution can, for example, confer resistance to proteolysis by exopeptidases acting on the N-terminus. The synthesis of polypeptides with unnatural amino acids is routine and known in the art (see, for example, Coller, et al. 1993, cited above).
Post-Translational Chemical Modifications. Different host cells will contain different post-translational modification mechanisms that may provide particular types of post-translational modification of a fusion protein if the amino acid sequences required for such modifications are present in the fusion protein. A large number (~100) of post-translational modifications have been described, a few of which are discussed herein. One skilled in the art will be able to choose appropriate host cells, and design chimeric genes that encode protein members comprising the amino acid sequence needed for a particular type of modification.
Glycosylation is one type of post-translational chemical modification that occurs in many eukaryotic systems, and may influence the activity, stability, pharmacogenetics, immunogenicity and/or antigenicity of proteins. However, specific amino acids must be present at such sites to recruit the appropriate glycosylation machinery, and not all host cells have the appropriate molecular machinery. Saccharomyces cerevisiae and Pichia pastoris provide for the production of glycosylated proteins, as do expression systems that utilize insect cells, although the pattern of glycosylation may vary depending on which host cells are used to produce the fusion protein.
Another type of post-translational modification is the phosphorylation of a free hydroxyl group of the side chain of one or more Ser, Thr or Tyr residues. Protein kinases catalyze such reactions. Phosphorylation is often reversible due to the action of a protein phosphatase, an enzyme that catalyzes the dephosphorylation of amino acid residues.
Differences in the chemical structure of amino terminal residues result from different host cells, each of which may have a different chemical version of the methionine residue encoded by a start codon, and these will result in amino termini with different chemical modifications.
For example, many or most bacterial proteins are synthesized with an amino terminal amino acid that is a modified form of methionine, i.e., N-formyl-methionine (fMet). The statement is often made that all bacterial proteins are synthesized with an fMet initiator amino acid; although this may be true for E. coli, recent studies have shown that it is not true in the case of other bacteria such as Pseudomonas aeruginosa (Newton et al., J. Biol. Chem. 274: 22143-22146, 1999). In any event, in E. coli, the formyl group of fMet is usually enzymatically removed after translation to yield an amino terminal methionine residue, although the entire fMet residue is sometimes removed (see Hershey, Chapter 40, “Protein Synthesis” in: Escherichia Coli and Salmonella Typhimurium: Cellular and Molecular Biology, Neidhardt, Frederick C., Editor in Chief, American Society for Microbiology, Washington, D.C., 1987, Volume 1, pages 613-647, and references cited therein). E. coli mutants that lack the enzymes (such as, e.g., formylase) that catalyze such post-translational modifications will produce proteins having an amino terminal fMet residue (Guillon et al., J. Bacteriol. 174: 4294-4301, 1992).
In eukaryotes, acetylation of the initiator methionine residue, or the penultimate residue if the initiator methionine has been removed, typically occurs co- or post-translationally. The acetylation reactions are catalyzed by N-terminal acetyltransferases (NATs, a.k.a. N-alpha-acetyltransferases), whereas removal of the initiator methionine residue is catalyzed by methionine aminopeptidases (for reviews, see Bradshaw et al., Trends Biochem. Sci. 23: 263-267, 1998; and Driessen et al., CRC Crit. Rev. Biochem. 18: 281-325, 1985). Amino terminally acetylated proteins are said to be “N-acetylated,” “N alpha acetylated” or simply “acetylated.”
Another post-translational process that occurs in eukaryotes is the alpha-amidation of the carboxy terminus. For reviews, see Eipper et al. Annu. Rev. Physiol. 50: 333-344, 1988, and Bradbury et al. Lung Cancer 14: 239-251, 1996. About 50% of known endocrine and neuroendocrine peptide hormones are alpha-amidated (Treston et al., Cell Growth Differ. 4: 911-920, 1993). In most cases, carboxy alpha-amidation is required to activate these peptide hormones.
In general, a polypeptide mimetic (“peptidomimetic”) is a molecule that mimics the biological activity of a polypeptide but is no longer peptidic in chemical nature. By strict definition, a peptidomimetic is a molecule that contains no peptide bonds (that is, amide bonds between amino acids). However, the term peptidomimetic is sometimes used to describe molecules that are no longer completely peptidic in nature, such as pseudo-peptides, semi-peptides and peptoids. Examples of some peptidomimetics by the broader definition (where part of a polypeptide is replaced by a structure lacking peptide bonds) are described below. Whether completely or partially non-peptide, peptidomimetics according to this invention provide a spatial arrangement of reactive chemical moieties that closely resembles the three-dimensional arrangement of active groups in the polypeptide on which the peptidomimetic is based. As a result of this similar active-site geometry, the peptidomimetic has effects on biological systems that are similar to the biological activity of the polypeptide.
There are several potential advantages for using a mimetic of a given polypeptide rather than the polypeptide itself. For example, polypeptides may exhibit two undesirable attributes, i.e., poor bioavailability and short duration of action. Peptidomimetics are often small enough to be both orally active and to have a long duration of action. There are also problems associated with stability, storage and immunoreactivity for polypeptides that are not experienced with peptidomimetics.
Candidate, lead and other polypeptides having a desired biological activity can be used in the development of peptidomimetics with similar biological activities. Techniques of developing peptidomimetics from polypeptides are known. Peptide bonds can be replaced by non-peptide bonds that allow the peptidomimetic to adopt a similar structure, and therefore biological activity, to the original polypeptide. Further modifications can also be made by replacing chemical groups of the amino acids with other chemical groups of similar structure. The development of peptidomimetics can be aided by determining the tertiary structure of the original polypeptide, either free or bound to a ligand, by NMR spectroscopy, crystallography and/or computer-aided molecular modeling. These techniques aid in the development of novel compositions of higher potency and/or greater bioavailability and/or greater stability than the original polypeptide (Dean, BioEssays, 16: 683-687, 1994; Cohen and Shatzmiller, J. Mol. Graph., 11: 166-173, 1993; Wiley and Rich, Med. Res. Rev., 13: 327-384, 1993; Moore, Trends Pharmacol. Sci., 15: 124-129, 1994; Hruby, Biopolymers, 33: 1073-1082, 1993; Bugg et al., Sci. Am., 269: 92-98, 1993, all incorporated herein by reference).
Thus, through use of methods described above, the present invention provides compounds exhibiting enhanced therapeutic activity in comparison to the polypeptides described above. The peptidomimetic compounds obtained by the above methods, having the biological activity of the above named polypeptides and similar three-dimensional structure, are encompassed by this invention. It will be readily apparent to one skilled in the art that a peptidomimetic can be generated from any of the modified polypeptides described in the previous section or from a polypeptide bearing more than one of the modifications described from the previous section. It will furthermore be apparent that the peptidomimetics of this invention can be further used for the development of even more potent non-peptidic compounds, in addition to their utility as therapeutic compounds.
Specific examples of peptidomimetics derived from the polypeptides described in the previous section are presented below. These examples are illustrative and not limiting in terms of the other or additional modifications.
Peptides with a Reduced Isostere Pseudopeptide Bond. Proteases act on peptide bonds. It therefore follows that substitution of peptide bonds by pseudopeptide bonds confers resistance to proteolysis. A number of pseudopeptide bonds have been described that in general do not affect polypeptide structure and biological activity. The reduced isostere pseudopeptide bond is a suitable pseudopeptide bond that is known to enhance stability to enzymatic cleavage with no or little loss of biological activity (Couder, et al., Int. J. Peptide Protein Res. 41: 181-184, 1993, incorporated herein by reference). Thus, the amino acid sequences of these compounds may be identical to the sequences of their parent L-amino acid polypeptides, except that one or more of the peptide bonds are replaced by an isostere pseudopeptide bond. Preferably the most N-terminal peptide bond is substituted, since such a substitution would confer resistance to proteolysis by exopeptidases acting on the N-terminus.
Peptides with a Retro-Inverso Pseudopeptide Bond. To confer resistance to proteolysis, peptide bonds may also be substituted by retro-inverso pseudopeptide bonds (Dalpozzo, et al., Int. J. Peptide Protein Res. 41: 561-566, incorporated herein by reference). According to this modification, the amino acid sequences of the compounds may be identical to the sequences of their L-amino acid parent polypeptides, except that one or more of the peptide bonds are replaced by a retro-inverso pseudopeptide bond. Preferably the most N-terminal peptide bond is substituted, since such a substitution will confer resistance to proteolysis by exopeptidases acting on the N-terminus.
Peptoid Derivatives. Peptoid derivatives of polypeptides represent another form of modified polypeptides that retain the important structural determinants for biological activity, yet eliminate the peptide bonds, thereby conferring resistance to proteolysis (Simon, et al., Proc. Natl. Acad. Sci. USA, 89: 9367-9371, 1992, incorporated herein by reference). Peptoids are oligomers of N-substituted glycines. A number of N-alkyl groups have been described, each corresponding to the side chain of a natural amino acid.
The variants typically exhibit the same qualitative biological activity; however, the chaperone activity may be altered from that of the original candidate variant protein, as needed. Alternatively, the variant may be designed such that the biological activity of the candidate variant protein is altered. For example, glycosylation sites may be altered or removed. Similarly, the biological function may be altered.
In addition, in some embodiments, it is desirable to have candidate variant proteins with altered chaperone activity that will bind to the target protein. Preferably, it would be desirable to have proteins that exhibit oxidative stability, alkaline stability, and thermal stability.
A change in oxidative stability is evidenced by at least about 20%, more preferably at least about 50% increase of activity of a variant protein when exposed to various oxidizing conditions as compared to that of wild-type protein. Oxidative stability is measured by known procedures.
A change in alkaline stability is evidenced by at least about a 5% or greater increase or decrease (preferably increase) in the half life of the activity of a variant protein when exposed to increasing or decreasing pH conditions as compared to that of wild-type protein. Generally, alkaline stability is measured by known procedures.
A change in thermal stability is evidenced by at least about a 5% or greater increase or decrease (preferably increase) in the half-life of the activity of a variant protein when exposed to a relatively high temperature and neutral pH as compared to that of wild-type protein. Generally, thermal stability is measured by known procedures.
The candidate variant proteins and nucleic acids of the invention can be made in a number of ways. Individual nucleic acids and proteins can be made as known in the art and outlined below. Alternatively, libraries of candidate variant proteins can be made for testing.
In a preferred embodiment, the library of candidate variant proteins is generated from a probability distribution table. As outlined herein, there are a variety of methods of generating a probability distribution table, including using PDA™ technology, sequence alignments, and forcefield calculations such as self-consistent mean field (SCMF) calculations. In addition, the probability distribution can be used to generate information entropy scores for each position, as a measure of the mutational frequency observed in the library.
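The entropy score mentioned above can be illustrated with a minimal sketch. It assumes that the information entropy is the Shannon entropy (in bits) of the residue frequencies at each position; the specification does not state the formula, and the positions, residues and frequencies in the table below are hypothetical.

```python
import math

def position_entropy(freqs):
    """Shannon entropy (bits) of one position's residue frequencies.

    Assumption: the 'information entropy score' is Shannon entropy;
    the specification does not define the formula.
    """
    total = sum(freqs.values())
    probs = [f / total for f in freqs.values() if f > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical probability distribution table: position -> {residue: frequency}
table = {
    10: {"A": 0.5, "G": 0.5},
    11: {"L": 1.0},
    12: {"K": 0.25, "R": 0.25, "E": 0.25, "D": 0.25},
}

entropies = {pos: position_entropy(freqs) for pos, freqs in table.items()}
for pos, score in entropies.items():
    print(pos, round(score, 3))
```

Under this assumption, a position with a single allowed residue scores zero, and the score rises as more residues are observed at comparable frequencies.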
In this embodiment, the frequency of each amino acid residue at each variable position in the list is identified. Frequencies can be thresholded, wherein any variant frequency lower than a cutoff is set to zero. This cutoff is preferably about 1%, 2%, 5%, 10% or 20%, with about 10% being particularly preferred. These frequencies are then built into the library of candidate variant proteins. That is, as above, these variable positions are collected and all possible combinations are generated, but the amino acid residues that “fill” the library of candidate variant proteins are utilized on a frequency basis. Thus, in a non-frequency based library of candidate variant proteins, a variable position that has 5 possible residues will have about 20% of the proteins comprising that variable position with the first possible residue, 20% with the second, etc. However, in a frequency based library of candidate variant proteins, a variable position that has 5 possible residues with frequencies of about 10%, 15%, 25%, 30% and 20%, respectively, will have 10% of the proteins comprising that variable position with the first possible residue, 15% of the proteins with the second residue, 25% with the third, etc. As will be appreciated by those in the art, the actual frequency may depend on the method used to actually generate the proteins; for example, exact frequencies may be possible when the proteins are synthesized. However, when the frequency-based primer system outlined below is used, the actual frequencies at each position will vary, as outlined below.
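The thresholding and frequency-based filling described above can be sketched as follows. This is an illustration only: the residue letters are hypothetical, the renormalization of the surviving frequencies after thresholding is an assumption not stated in the text, and the library size of 1,000 is arbitrary.

```python
def threshold_frequencies(freqs, cutoff=0.10):
    """Set frequencies lower than the cutoff to zero.

    Assumption: the remaining frequencies are renormalized to sum to 1;
    the specification only states that sub-cutoff values are set to zero.
    """
    kept = {aa: (f if f >= cutoff else 0.0) for aa, f in freqs.items()}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("all frequencies fall below the cutoff")
    return {aa: f / total for aa, f in kept.items()}

def expected_counts(freqs, library_size):
    """Number of library members expected to carry each residue."""
    return {aa: round(f * library_size) for aa, f in freqs.items()}

# Frequencies from the five-residue example in the text (residue letters hypothetical)
example = {"A": 0.10, "S": 0.15, "T": 0.25, "V": 0.30, "L": 0.20}

counts = expected_counts(example, 1000)
stricter = threshold_frequencies(example, cutoff=0.20)
print(counts)
print(stricter)
```

With a 20% cutoff the first two residues drop out of the library and the remaining three are filled in proportion to their original frequencies.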
As will be appreciated by those in the art and outlined herein, probability distribution tables can be generated in a variety of ways. In addition to the methods outlined herein, self-consistent mean field (SCMF) methods can be used in the direct generation of probability tables. SCMF is a deterministic computational method that uses a mean field description of rotamer interactions to calculate energies. A probability table generated in this way can be used to create libraries of candidate variant proteins as described herein. SCMF can be used in three ways: the frequencies of amino acids and rotamers for each amino acid are listed at each position; the probabilities are determined directly from SCMF (see Delarue et al., Pac. Symp. Biocomput. 109-21, 1997, expressly incorporated by reference). In addition, highly variable positions and non-variable positions can be identified. Alternatively, another method is used to determine what sequence is jumped to during a search of sequence space; SCMF is used to obtain an accurate energy for that sequence; this energy is then used to rank it and create a rank-ordered list of sequences (similar to a Monte Carlo sequence list). A probability table showing the frequencies of amino acids at each position can then be calculated from this list. See Koehl et al., J. Mol. Biol. 239: 249, 1994; Koehl et al., Nat. Struc. Biol. 2: 163, 1995; Koehl et al., Curr. Opin. Struct. Biol. 6: 222, 1996; Koehl et al., J. Mol. Biol. 293: 1183, 1999; Koehl et al., J. Mol. Biol. 293: 1161, 1999; Lee, J. Mol. Biol. 236: 918, 1994; and Vasquez, Biopolymers 36: 53-70, 1995; all of which are expressly incorporated by reference.
Similar methods include, but are not limited to, OPLS-AA (Jorgensen, et al., J. Am. Chem. Soc., 118: 11225-11236, 1996; Jorgensen, W. L.; BOSS, Version 4.1; Yale University: New Haven, Conn., 1999); OPLS (Jorgensen, et al., J. Am. Chem. Soc., 110: 1657ff, 1988; Jorgensen, et al., J. Am. Chem. Soc. 112: 4768ff, 1990); UNRES (United Residue Forcefield; Liwo, et al., Protein Science, 2: 1697-1714, 1993; Liwo, et al., Protein Science, 2: 1715-1731, 1993; Liwo, et al., J. Comp. Chem. 18: 849-873, 1997; Liwo, et al., J. Comp. Chem., 18: 874-884, 1997; Liwo, et al., J. Comp. Chem. 19: 259-276, 1998); Forcefield for Protein Structure Prediction (Liwo, et al., Proc. Natl. Acad. Sci. USA, 96: 5482-5485, 1999); ECEPP/3 (Liwo et al., J. Protein Chem 13(4): 375-80, 1994); AMBER 1.1 force field (Weiner, et al., J. Am. Chem. Soc. 106: 765-784, 1994); AMBER 3.0 force field (U. C. Singh et al., Proc. Natl. Acad. Sci. USA. 82: 755-759, 1994); CHARMM and CHARMM22 (Brooks, et al., J. Comp. Chem. 4: 187-217); cvff3.0 (Dauber-Osguthorpe, et al., Proteins: Structure, Function and Genetics, 4: 31-47, 1988); cff91 (Maple, et al., J. Comp. Chem. 15: 162-182, 1988); also, the DISCOVER (cvff and cff91) and AMBER forcefields are used in the INSIGHT molecular modeling package (Biosym/MSI, San Diego Calif.) and CHARMM is used in the QUANTA molecular modeling package (Biosym/MSI, San Diego Calif.); all references hereby expressly incorporated by reference in their entirety.
A probability distribution table can also be generated through the use of sequence alignment programs. In addition, the probability table can be obtained by a combination of sequence alignments and computational approaches. For example, one can add amino acids found in the alignment of homologous sequences to the result of the computation. Preferably, one can add the wild type amino acid identity to the probability table if it is not found in the computation.
As will be appreciated, a library of candidate variant proteins created by recombining variable positions and/or residues at the variable position may not be in a rank-ordered list. In some embodiments, the entire list may just be made and tested. Alternatively, in a preferred embodiment, the secondary library is also in the form of a rank-ordered list. This may be done for several reasons, for example because the secondary library is still too large to generate experimentally, or for predictive purposes. The ranking may be done in several ways. In one embodiment, the secondary library is ranked or filtered using the scoring functions of PDA™. Alternatively, statistical methods could be used. For example, the secondary library may be ranked or filtered by frequency score; that is, proteins containing the most high frequency residues could be ranked higher, etc. This may be done by adding or multiplying the frequency at each variable position to generate a numerical score. Similarly, different positions of the secondary library could be weighted and then the proteins scored; for example, those containing certain residues could be arbitrarily ranked or filtered.
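The frequency score described above can be sketched as follows. This is a minimal illustration under stated assumptions: the two-position probability table is hypothetical, all combinations of the listed residues are enumerated, and each member is scored by either multiplying or adding the per-position frequencies.

```python
from itertools import product
from math import prod

def frequency_score(sequence, table, mode="multiply"):
    """Combine the per-position frequencies of a library member into one score."""
    values = [table[i][aa] for i, aa in enumerate(sequence)]
    return prod(values) if mode == "multiply" else sum(values)

def ranked_library(table, mode="multiply"):
    """Enumerate all combinations of variable residues and rank them by score."""
    positions = [sorted(column) for column in table]
    members = ["".join(combo) for combo in product(*positions)]
    return sorted(members, key=lambda s: frequency_score(s, table, mode), reverse=True)

# Hypothetical table: one dict of residue frequencies per variable position
table = [
    {"A": 0.7, "G": 0.3},
    {"K": 0.6, "R": 0.4},
]

ranking = ranked_library(table)
print(ranking)
```

A cutoff on the score, or on the rank, could then be used to filter the library down to a size that can be generated experimentally.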
In one embodiment, the different protein members of the candidate variant library may be chemically synthesized. This is particularly useful when the designed proteins are short, preferably less than 150 amino acids in length, with less than 100 amino acids being preferred, and less than 50 amino acids being particularly preferred, although as is known in the art, longer proteins can be made chemically or enzymatically. See for example Wilken et al., Curr. Opin. Biotechnol. 9: 412-26, 1998, hereby expressly incorporated by reference.
In another embodiment, particularly for longer proteins or proteins for which large samples are desired, the candidate variant sequences are used to create nucleic acids such as DNA which encode the member sequences and which can then be cloned into host cells, expressed and assayed, if desired. Thus, nucleic acids, and particularly DNA, can be made which encodes each member protein sequence. This is done using well known procedures. The choice of codons, suitable expression vectors and suitable host cells will vary depending on a number of factors, and can be easily optimized as needed.
In a further embodiment, multiple PCR reactions with pooled oligonucleotides are performed. In this embodiment, overlapping oligonucleotides are synthesized which correspond to the full length gene. Again, these oligonucleotides may represent all of the different amino acids at each variant position or subsets. These oligonucleotides can be pooled in equal proportions and multiple PCR reactions are performed to create full length sequences containing the combinations of mutations defined by the secondary library. In addition, this may be done using error-prone PCR methods. Alternatively, the different oligonucleotides can be added in relative amounts corresponding to the probability distribution table. The multiple PCR reactions thus result in full length sequences with the desired combinations of mutations in the desired proportions.
The total number of oligonucleotides needed is a function of the number of positions being mutated and the number of mutations being considered at these positions: (number of oligos for constant positions)+M1+M2+M3+ . . . Mn=(total number of oligos required), where Mn is the number of mutations considered at position n in the sequence.
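The oligonucleotide count above is a simple sum, sketched here for illustration. The function name and the example numbers are hypothetical; the sketch covers only the case in which each mutated position is encoded on its own oligonucleotide.

```python
def total_oligos(constant_oligos, mutations_per_position):
    """Total oligos = oligos for constant positions + M1 + M2 + ... + Mn."""
    return constant_oligos + sum(mutations_per_position)

# Hypothetical design: 6 constant oligos, three mutated positions
# with 3, 2 and 4 mutations considered, respectively
required = total_oligos(6, [3, 2, 4])
print(required)  # 15
```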
In a further aspect, each overlapping oligonucleotide comprises only one position to be varied; in alternate embodiments, the variant positions are too close together to allow this and multiple variants per oligonucleotide are used to allow complete recombination of all the possibilities. That is, each oligo can contain the codon for a single position being mutated, or for more than one position being mutated. The multiple positions being mutated must be close in sequence to prevent the oligo length from being impractical. For multiple mutating positions on an oligonucleotide, particular combinations of mutations can be included or excluded in the library by including or excluding the oligonucleotide encoding that combination. For example, as discussed herein, there may be correlations between variable regions; that is, when position X is a certain residue, position Y must (or must not) be a particular residue. These sets of variable positions are sometimes referred to herein as a “cluster”. When the clusters are comprised of residues close together, and thus can reside on one oligonucleotide primer, the clusters can be set to the “good” correlations, and eliminate the bad combinations that may decrease the effectiveness of the library. However, if the residues of the cluster are far apart in sequence, and thus will reside on different oligonucleotides for synthesis, it may be desirable to either set the residues to the “good” correlation, or eliminate them as variable residues entirely. In an alternative embodiment, the library may be generated in several steps, so that the cluster mutations only appear together. This procedure, i.e., the procedure of identifying mutation clusters and either placing them on the same oligonucleotides or eliminating them from the library or library generation in several steps preserving clusters, can considerably enrich the experimental library with properly folded protein. 
Identification of clusters can be carried out in a number of ways, e.g. by using known pattern recognition methods, comparisons of frequencies of occurrence of mutations or by using energy analysis of the sequences to be experimentally generated (for example, if the energy of interaction is high, the positions are correlated). These correlations may be positional correlations (e.g. variable positions 1 and 2 always change together or never change together) or sequence correlations (e.g. if there is a residue A at position 1, there is always residue B at position 2). See: Pattern Discovery in Biomolecular Data: Tools, Techniques, and Applications; edited by Jason T. L. Wang, Bruce A. Shapiro, Dennis Shasha. New York: Oxford University, 1999; Andrews, Harry C. Introduction to mathematical techniques in pattern recognition; New York, Wiley-Interscience, 1972; Applications of Pattern Recognition; Editor, K. S. Fu. Boca Raton, Fla. CRC Press, 1982; Genetic Algorithms for Pattern Recognition; edited by Sankar K. Pal, Paul P. Wang. Boca Raton: CRC Press, c1996; Pandya, Abhijit S., Pattern recognition with Neural networks in C++/Abhijit S. Pandya, Robert B. Macy. Boca Raton, Fla.: CRC Press, 1996; Handbook of pattern recognition and computer vision/edited by C. H. Chen, L. F. Pau, P. S. P. Wang. 2nd ed. Singapore; River Edge, N.J.: World Scientific, c1999; Friedman, Introduction to Pattern Recognition: Statistical, Structural, Neural, and Fuzzy Logic Approaches; River Edge, N.J.: World Scientific, c1999, Series title: Series in machine perception and artificial intelligence; vol. 32; all of which are expressly incorporated by reference. In addition, programs used to search for consensus motifs can be used.
In addition, correlations and shuffling can be fixed or optimized by altering the design of the oligonucleotides; that is, by deciding where the oligonucleotides (primers) start and stop (e.g. where the sequences are “cut”). The start and stop sites of oligos can be set to maximize the number of clusters that appear in single oligonucleotides, thereby enriching the library with higher scoring sequences. Different oligonucleotide start and stop site options can be computationally modeled and ranked or filtered according to the number of clusters that are represented on single oligos, or the percentage of the resulting sequences consistent with the predicted library of sequences.
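The ranking of start and stop site options by cluster coverage can be sketched as follows. This is an illustrative model only: each oligonucleotide is represented as an inclusive range of residue positions, overlaps between adjacent oligonucleotides are ignored, and the design names, ranges and cluster positions are hypothetical.

```python
def clusters_on_single_oligos(boundaries, clusters):
    """Count clusters whose positions all fall inside one oligo.

    boundaries: list of (start, end) inclusive residue ranges, one per oligo.
    clusters:   list of tuples of correlated residue positions.
    """
    count = 0
    for cluster in clusters:
        if any(all(start <= p <= end for p in cluster) for start, end in boundaries):
            count += 1
    return count

def rank_cut_options(options, clusters):
    """Order candidate oligo designs by how many clusters they keep intact."""
    return sorted(
        options,
        key=lambda name: clusters_on_single_oligos(options[name], clusters),
        reverse=True,
    )

# Hypothetical clusters of correlated positions and two candidate designs
clusters = [(12, 18), (40, 47)]
options = {
    "design_a": [(1, 15), (16, 45), (46, 60)],
    "design_b": [(1, 20), (21, 50), (51, 60)],
}

best_first = rank_cut_options(options, clusters)
print(best_first)
```

In this example the first design splits both clusters across oligonucleotides, whereas the second keeps each cluster on a single oligonucleotide and is therefore ranked first.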
The total number of oligonucleotides required increases when multiple mutable positions are encoded by a single oligonucleotide. The annealed regions are the ones that remain constant, i.e. have the sequence of the reference sequence.
Oligonucleotides with insertions or deletions of codons can be used to create a library expressing different length proteins. In particular, computational sequence screening for insertions or deletions can result in secondary libraries defining different length proteins, which can be expressed by a library of pooled oligonucleotides of different lengths.
In a further aspect, the secondary library is generated by shuffling the family (e.g. a set of variants); that is, some set of the top sequences (if a rank-ordered list is used) can be shuffled, either with or without error-prone PCR. “Shuffling” in this context means a recombination of related sequences, generally in a random way. It can include “shuffling” as defined and exemplified in U.S. Pat. Nos. 5,830,721; 5,811,238; 5,605,793; 5,837,458 and PCT US/19256, all of which are expressly incorporated by reference in their entirety. This set of sequences can also be an artificial set; for example, from a probability table (for example generated using SCMF) or a Monte Carlo set. Similarly, the “family” can be the top 10 and the bottom 10 sequences, the top 100 sequences, etc. This may also be done using error-prone PCR.
Thus, in a further aspect, in silico shuffling is done using the computational methods described herein. That is, starting with either two libraries or two sequences, random recombinations of the sequences can be generated and evaluated.
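A minimal sketch of in silico shuffling of two sequences follows. It assumes the two parents are already aligned to the same length and recombines them at randomly chosen crossover points; the function name, the number of crossovers and the placeholder parent sequences are hypothetical, and the evaluation step (scoring each recombinant) is left out.

```python
import random

def shuffle_pair(parent_a, parent_b, n_crossovers, rng):
    """Recombine two aligned sequences at random crossover points."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    # Pick distinct interior crossover points and sort them
    points = sorted(rng.sample(range(1, len(parent_a)), n_crossovers))
    parents = (parent_a, parent_b)
    child, current, previous = [], 0, 0
    for point in points + [len(parent_a)]:
        # Copy the next segment from the current parent, then switch parents
        child.append(parents[current][previous:point])
        current = 1 - current
        previous = point
    return "".join(child)

rng = random.Random(0)      # seeded so the run is repeatable
parent_a = "AAAAAAAAAA"     # placeholder sequences that make the
parent_b = "BBBBBBBBBB"     # parental origin of each residue visible
children = [shuffle_pair(parent_a, parent_b, 2, rng) for _ in range(5)]
print(children)
```

Each recombinant could then be passed to whichever scoring function is used for the library so that the shuffled set is ranked or filtered before synthesis.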
Error-prone PCR can be done to generate the secondary library. See U.S. Pat. Nos. 5,605,793, 5,811,238, and 5,830,721, all of which are hereby incorporated by reference. This can be done on the optimal sequence or on top members of the library, or some other artificial set or family. In this embodiment, the gene for the optimal sequence found in the computational screen of the primary library can be synthesized. Error prone PCR is then performed on the optimal sequence gene in the presence of oligonucleotides that code for the mutations at the variant positions of the secondary library (bias oligonucleotides). The addition of the oligonucleotides will create a bias favoring the incorporation of the mutations in the secondary library. Alternatively, only oligonucleotides for certain mutations may be used to bias the library.
Gene shuffling with error-prone PCR can be performed on the gene for the optimal sequence, in the presence of bias oligonucleotides, to create a DNA sequence library that reflects the proportion of the mutations found in the secondary library. The choice of the bias oligonucleotides can be done in a variety of ways; they can be chosen on the basis of their frequency, i.e. oligonucleotides encoding high mutational frequency positions can be used; alternatively, oligonucleotides containing the most variable positions can be used, such that the diversity is increased; if the secondary library is ranked or filtered, some number of top scoring positions can be used to generate bias oligonucleotides; random positions may be chosen; a few top scoring and a few low scoring ones may be chosen; etc. What is important is to generate new sequences based on preferred variable positions and sequences.
PCR using a wild type gene or polypeptide sequence can be used. In this embodiment, a starting gene is used; generally, although this is not required, the gene is the wild type gene. In some cases it may be the gene encoding the global optimized sequence, or any other sequence of the list. In this embodiment, oligonucleotides are used that correspond to the variant positions and contain the different amino acids of the secondary library. PCR is done using PCR primers at the termini, as is known in the art. This provides two benefits; the first is that this generally requires fewer oligonucleotides and can result in fewer errors. In addition, it has experimental advantages in that if the wild type gene is used, it need not be synthesized. Ligation of PCR products can be done.
A variety of additional steps may be done to one or more candidate variant secondary libraries; for example, further computational processing can occur, candidate variant secondary libraries can be recombined, or cutoffs from different candidate variant secondary libraries can be combined. In a preferred embodiment, a candidate variant secondary library may be computationally remanipulated to form an additional secondary library (sometimes referred to herein as “tertiary libraries”). For example, any of the candidate variant secondary library sequences may be chosen for a second round of PDA™, by freezing or fixing some or all of the changed positions in the first secondary library. Alternatively, only changes seen in the last probability distribution table are allowed. Alternatively, the stringency of the probability table may be altered, either by increasing or decreasing the cutoff for inclusion. Similarly, the candidate variant secondary library may be recombined experimentally after the first round; for example, the best gene/genes from the first screen may be taken and gene assembly redone (for example, using techniques outlined below, multiple PCR, error-prone PCR, or shuffling). Alternatively, the fragments from one or more good gene(s) may be used to change probabilities at some positions. This biases the search to an area of sequence space found in the first round of computational and experimental screening.
“Small molecule” includes any chemical or other moiety that can act to affect biological processes. Small molecules can include any number of therapeutic agents presently known and used, or can be small molecules synthesized in a library of such molecules for the purpose of screening for biological function(s). Small molecules are distinguished from macromolecules by size. The small molecules of this invention usually have molecular weight less than about 5,000 daltons (Da), preferably less than about 2,500 Da, more preferably less than 1,000 Da, most preferably less than about 500 Da.
Small molecules include without limitation organic compounds, peptidomimetics and conjugates thereof. As used herein, the term “organic compound” refers to any carbon-based compound other than macromolecules such as nucleic acids and polypeptides. In addition to carbon, organic compounds may contain calcium, chlorine, fluorine, copper, hydrogen, iron, potassium, nitrogen, oxygen, sulfur and other elements. An organic compound may be in an aromatic or aliphatic form. Non-limiting examples of organic compounds include acetones, alcohols, anilines, carbohydrates, monosaccharides, oligosaccharides, polysaccharides, amino acids, nucleosides, nucleotides, lipids, retinoids, steroids, proteoglycans, ketones, aldehydes, saturated, unsaturated and polyunsaturated fats, oils and waxes, alkenes, esters, ethers, thiols, sulfides, cyclic compounds, heterocyclic compounds, imidazoles and phenols. An organic compound as used herein also includes nitrated organic compounds and halogenated (e.g., chlorinated) organic compounds. Methods for preparing peptidomimetics are described below. Collections of small molecules, and small molecules identified according to the invention, are characterized by techniques such as accelerator mass spectrometry (AMS; see Turteltaub et al., Curr Pharm Des 6(10): 991-1007, 2000, Bioanalytical applications of accelerator mass spectrometry for pharmaceutical research; and Enjalbal et al., Mass Spectrom Rev 19(3): 139-61, 2000, Mass spectrometry in combinatorial chemistry).
Preferred small molecules are relatively easy and inexpensive to manufacture, formulate or otherwise prepare. Preferred small molecules are stable under a variety of storage conditions. Preferred small molecules may be placed in tight association with macromolecules to form molecules that are biologically active and that have improved pharmaceutical properties. Improved pharmaceutical properties include changes in circulation time, distribution, metabolism, modification, excretion, secretion, elimination, and stability that are favorable to the desired biological activity. Improved pharmaceutical properties also include changes in the toxicological and efficacy characteristics of the chemical entity.
Intellipeptides of the present invention may be useful in a variety of applications, including, but not limited to, therapeutic uses, e.g., to treat diseases and disorders associated with protein aggregation or misfolding, as well as in the manufacture and purification of polypeptides, including recombinantly-produced polypeptides. It is believed that the ability of a candidate therapeutic compound to prevent protein unfolding and aggregation in vitro may be correlated with the ability of the compound to inhibit protein unfolding and aggregation in vivo. In addition, it is believed that the ability of a candidate therapeutic compound to stabilize the functional structure of a protein in vitro may be correlated with the ability of the compound to assist that protein in performing its function in vivo.
The peptides presented here provide a versatile set of drug molecules that can be customized for use as therapeutic peptides to prevent protein aggregation or protein misfolding involved in disease. Examples of proteins whose aggregation or misfolding is associated with disease include, but are not limited to, amyloid-beta in Alzheimer's disease, beta/gamma crystallins and filaments in cataract, alpha-synuclein in Parkinson's disease, Huntingtin in Huntington's disease, rhodopsin in retinitis pigmentosa, prions in mad cow disease, and others. Missense mutations leading to single amino acid changes in protein sequences have been linked to human disease. A majority of the approximately 16,0000 identified missense mutations affect folding or trafficking of proteins, rather than specifically affecting protein function. Disease-linked missense mutations in integral membrane proteins result in membrane protein misassembly, for example, PMP-22 in Charcot-Marie-Tooth disease, aquaporin in diabetes insipidus, vasopressin receptor in diabetes insipidus, rhodopsin in retinitis pigmentosa, connexin 32 in Charcot-Marie-Tooth disease, and CFTR in cystic fibrosis.
Further, the peptides can be used for the stabilization of therapeutic proteins such as vaccines, insulin, growth factors, and monoclonal antibodies. The Intellipeptides can also be used in the purification of proteins to aid with folding of the proteins to their functional 3D conformation.
Accordingly, the present invention describes a variety of methods related to the use of Intellipeptides. In one embodiment, the present invention provides a method of inhibiting protein unfolding or reducing protein aggregation by providing an Intellipeptide to a cell or solution comprising said protein. In a related embodiment, the present invention includes a method of restoring correct or proper protein folding, by providing an Intellipeptide to a cell or solution comprising said protein. In addition, the present invention further provides a method of enhancing the production and/or isolation of a recombinantly-produced polypeptide, by providing an Intellipeptide to a cell or solution comprising said polypeptide.
Intellipeptides may be provided to a cell or solution by a variety of means available in the art. For example, synthesized Intellipeptides may be directly provided to a solution or into a cell. In addition, Intellipeptides may be provided to a cell or solution by introducing an expression vector comprising a polynucleotide sequence encoding an Intellipeptide with regulatory elements that drive expression of said Intellipeptide in a cell. The polynucleotide sequence may further comprise additional coding regions, including, e.g., a secretion signal such that the Intellipeptide will be secreted from the cell and/or additional elements regulating expression of the encoded Intellipeptide, of which a large variety are known and available in the art, including those used for inducible expression of peptides and polypeptides. Thus, the present invention further includes polynucleotide sequences encoding Intellipeptides and expression vectors comprising the same, including, e.g., viral vectors.
Intellipeptides can be used as therapeutics for, but not limited to, protein aggregation diseases, including, e.g., Alzheimer's disease, cataract, Parkinson's disease, Huntington's disease, Lou Gehrig's disease, bovine spongiform encephalopathy (mad cow disease), prion disease, macular degeneration and retinitis pigmentosa. In addition, Intellipeptides can stabilize proteins and/or peptides used as therapeutics including but not limited to vaccines, insulin, growth factors and monoclonal antibodies. Intellipeptides help fold and stabilize the 3-dimensional structure of proteins during purification. Thus, the present invention provides a method for treating a disease or disorder comprising the administration of an Intellipeptide to a patient in need thereof. A diagram related to the therapeutic applications of Intellipeptides is provided in
In one embodiment, the invention provides a method for treatment of Alzheimer's disease by providing an Intellipeptide to a patient with Alzheimer's disease. Alzheimer's disease (AD) is a devastating neurodegenerative condition characterized by loss of short-term memory, disorientation, and impairment of judgment and reasoning. AD is the most common dementia in the elderly population and is estimated to affect more than twenty-five million people worldwide to some degree. A hallmark event in AD is the deposition of insoluble protein aggregates, known as amyloid, in brain parenchyma and cerebral vessel walls. The main component of amyloid is a 4.3 kDa hydrophobic peptide, named amyloid beta-peptide (amyloid-beta), that is encoded on chromosome 21 as part of a much longer precursor protein (APP) (Selkoe, Science 275: 630-631, 1997). Genetic, biochemical, and neuropathological evidence accumulated in the last 10 years strongly suggests that amyloid plays an important role in early pathogenesis of AD and perhaps triggers the disease (Selkoe, 1997). In the case of Alzheimer's disease, the aggregation of the proteins amyloid-beta and tau is associated with degeneration of nerve cells and nerve processes. Amyloid-beta aggregation leads to the formation of toxic plaque, which causes nerve cell death. Tau, on the other hand, plays a key role in the structure of nerve processes. Tau associates with a cytoskeletal filament called tubulin to form microtubules that are the core of nerve processes. Post-translational modifications, specifically hyper-phosphorylation of tau, lead to destabilization of the tau-tubulin interaction. Destabilization of microtubules leads to degeneration of nerve processes. Eventually the tau-tubulin interaction is so severely destabilized that tau dissociates from tubulin and becomes free. Unbound tau has a tendency to aggregate and form neurofibrillary tangles. Neurofibrillary tangles are neurotoxic and lead to neurodegeneration. Amyloid-beta plaques and neurofibrillary tangles are the hallmarks of Alzheimer's disease.
In another embodiment, the present invention provides a method for the treatment of diabetes by providing an Intellipeptide to a patient with diabetes. In a particular embodiment, an Intellipeptide is provided to the patient as a stabilizing molecule for the oral and/or nasal delivery of insulin for diabetics. Accordingly, the present invention includes a method of treating diabetes, comprising administering insulin in combination with an Intellipeptide to a patient in need thereof.
Even though oral and nasal delivery methods for insulin are convenient and pain-free and have been recently approved for use in humans by the FDA, currently the most widely used method for delivering insulin to Type I diabetics is through injections. Many diabetics need multiple injections in a single day, which can be both painful and inconvenient. Insulin, a peptide drug, is unstable at room temperature, cannot be absorbed through the gastrointestinal (GI) membrane and is prone to proteolysis by the enzymes of the GI tract. Molecules that can stabilize insulin by protecting it from harsh environmental conditions including temperature, pH and proteolytic enzymes will enable the delivery of insulin via an oral or nasal route. Though small molecules and polymers exist that will protect insulin from pH and proteolytic enzymes, no existing polymer or small molecule can maintain the structural integrity of insulin for long periods of time. Intellipeptides can bind to insulin and stabilize its structure, maintain its function and protect it from temperature, pH and proteolytic enzymes. Formulations of insulin with one or more Intellipeptides will enable insulin delivery via routes including but not limited to oral, nasal, device and patch methods.
In another embodiment, the present invention includes a method of manufacturing and/or purification of a peptide or protein by introduction of an Intellipeptide to a cell or solution comprising said peptide or protein. Modern pharmaceutical discovery processes increasingly focus on developing drugs against specific molecular targets and have greatly increased the requirements within the industry for the production of recombinant proteins. Expression of high quality proteins is essential for drug discovery and drug therapeutics. As the number of potential drug targets and protein and peptide drugs increases the requirement for recombinant proteins is likely to increase. Development of robust methods to produce target proteins in a soluble form and in significant amounts is an essential requirement for modern drug discovery. Intellipeptides can serve as folding aids to fold proteins that are synthesized using biotechnological methods such as bacterial or mammalian expression systems.
Intellipeptides are a versatile set of molecules that target one or more intermediates in the protein misfolding and aggregation disease pathway. In one embodiment, Intellipeptides are designed to bind and stabilize non-native intermediates of conformationally compromised proteins; for example, Intellipeptides bind and stabilize amyloid-beta and prevent its aggregation.
αB Crystallin peptides that bind the target protein beta-amyloid were identified by screening an αB crystallin protein pin array, which provides a systematic method for evaluating interactions between peptides corresponding to residues 1-175 of human αB crystallin and selected target proteins in a parallel and simultaneous manner.
Peptides were synthesized on derivatized polyethylene pins arranged in a microtiter plate format. Each peptide was 8 amino acids in length and consecutive peptides were offset by 2 amino acids. All peptides were covalently bonded to the surface of the plastic pins. The first immobilized peptide was 1MDIAIHHP8 and the last peptide was 168PAVTAAPK175 for the human αB crystallin.
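The tiling scheme described above (8-residue peptides, consecutive peptides offset by 2 residues, covering residues 1-175) can be illustrated with a minimal sketch. The function name `tile_windows` and its parameters are hypothetical and are introduced here only for illustration; the sketch enumerates residue ranges and does not reproduce the αB crystallin sequence itself. Under strict two-residue stepping it yields 84 windows, in agreement with the 84 peptides screened.

```python
def tile_windows(protein_length, peptide_length=8, offset=2):
    """Return 1-based, inclusive (start, end) residue ranges of overlapping peptides.

    Illustrative helper (hypothetical name); it only enumerates positions.
    """
    windows = []
    start = 1
    # Keep adding windows while the whole peptide still fits in the protein.
    while start + peptide_length - 1 <= protein_length:
        windows.append((start, start + peptide_length - 1))
        start += offset
    return windows


windows = tile_windows(175)
print(len(windows), windows[0], windows[-1])
```

Changing `offset` or `peptide_length` shows how the number of pins scales with the chosen overlap.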
After synthesis, the pins were pre-coated at room temperature with 2% Bovine Serum Albumin (BSA), 0.1% Tween-20 and 0.1% Sodium azide in 10 mM phosphate buffered saline pH 7.2 (PBS) for 1 hr, and washed three times for 10 mins each with 10 mM PBS. To screen for binding to the peptides, fixed concentrations of the target protein beta amyloid were dissolved in 10 mM PBS containing 0.05% Tween-20, added to each well and incubated for 1 hr at room temperature. The pin array was washed three times for 10 mins each with 10 mM PBS containing 0.05% Tween-20. Monoclonal or polyclonal primary antibodies for the target protein beta amyloid were diluted into PBS buffer, added to each well and incubated for 1 hr at room temperature. Subsequently, the pin array was washed three times for 10 mins each with 10 mM PBS containing 0.05% Tween-20. Anti-rabbit horse-radish peroxidase conjugate secondary antibodies were diluted into PBS buffer, added to each well and incubated for 1 hr at room temperature. The pin array was washed three times for 10 mins each with 10 mM PBS containing 0.05% Tween-20. 3,3′,5,5′-Tetramethylbenzidine (TMB) (Pharmingen, San Diego, Calif.), a colorless chromogenic substrate of horse-radish peroxidase, was added, and the reaction was carried out for 45 min.
A positive reaction at a pin resulted in the formation of a blue color. The reaction was stopped by adding 6N sulfuric acid, which changes the color from blue to yellow. The absorption at 450 nm was measured by an ELISA reader (BioTek, Winooski, Vt.).
In all, 84 peptides corresponding to the entire primary sequence of human αB crystallin were screened for binding to the target protein beta amyloid (Table 2). Of the 84 peptides, 36 bound the target protein beta amyloid.
The pin array was regenerated by sonication in a water bath (VWR Aquasonic, West Chester, Pa.) with 100 mM PBS containing 1% Sodium dodecyl sulfate (SDS) and 0.1% 2-Mercaptoethanol at 60° C. for 10 mins. Next, the pin array was rinsed three times in deionized water preheated to 60° C., for 30 secs each time, and shaken in water preheated to 60° C. for 30 min. The pin array was then washed with methanol at 60° C. for 15 secs, air-dried and stored at −20° C. for future use. Each target protein was assayed 2-5 times against the human αB crystallin protein pin array peptides to verify the reproducibility of the results. Peptides numbered 5, 7, 8, 9, 13, 19, 22, 23, 24, 27, 35, 38, 45, 51, 52, 56, 57, 61, 66, 71, 72, and 79 in Table 2 reproducibly tested positive in the protein pin assay.
In vitro assays were performed to measure the ability of the human αB crystallin derived peptides identified in Example 1 to inhibit pH-induced aggregation of amyloid beta (1-42). Amyloid-beta (1-42), the peptide implicated in Alzheimer's disease, aggregates under acidic conditions (pH 5.2). Aggregation assays were performed using routine procedures, in the presence or absence of a synthesized peptide identified in Example 1, at the acidic pH of 5.2. Aggregation was measured spectrophotometrically as light scattering at a wavelength of 360 nm. Aggregated amyloid-beta in the absence of other peptides was assigned a value of 100% aggregation and used to normalize aggregation in the presence of synthesized peptides.
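The normalization described above, in which the peptide-free control is assigned 100% aggregation, can be written as a short sketch. The function name `percent_aggregation` and the numerical readings are hypothetical and serve only to illustrate the arithmetic; they are not data from the assays.

```python
def percent_aggregation(scatter_sample, scatter_control):
    """Express light scattering at 360 nm as a percentage of the peptide-free control.

    scatter_control is the reading for amyloid-beta alone, defined as 100%.
    """
    if scatter_control <= 0:
        raise ValueError("control scattering must be positive")
    return 100.0 * scatter_sample / scatter_control


# Hypothetical readings, for illustration only.
control_reading = 0.80
reading_with_peptide = 0.20
print(percent_aggregation(reading_with_peptide, control_reading))
```

A value below 100 indicates less light scattering, and hence less aggregation, than in the control.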
Three peptides containing sequences identified using protein pin arrays were tested for their efficacy in preventing the pH-induced aggregation of amyloid-beta, and all three reproducibly prevented the aggregation of amyloid-beta in vitro. The sequences of these peptides are DRFSVNLDVKHFS, LTITSSLSDGV and HGKHEERQDE, respectively.
In vitro assays were also performed to measure the ability of the human αB crystallin derived peptides identified in Example 1 to inhibit Cu+++-induced aggregation of amyloid beta (1-42). Aggregation assays were performed using routine procedures, in the presence or absence of a synthesized peptide identified in Example 1, in the presence of 100 mM Cu+++. Aggregation was measured spectrophotometrically as light scattering at a wavelength of 360 nm. Aggregated amyloid-beta was assigned a value of 100% aggregation and used to normalize aggregation in the presence of synthesized peptides.
Three peptides containing sequences identified using protein pin arrays and tested for their efficacy in preventing the Cu+++ induced aggregation of amyloid-beta were able to reproducibly prevent the aggregation of amyloid-beta in vitro, as shown in
The sequences of the peptides identified in Examples 2 and 3 for the ability to decrease protein aggregation were used to generate a library of related Intellipeptides, based upon molecular modeling of human αB crystallin.
The homology modeling program Molecular Operating Environment (MOE) (Chemical Computing Group, Inc, Montreal, Canada) was used to construct a 3D homology model of human αB crystallin using the wheat sHSP16.9 crystal structure as the template. MOE employs a number of techniques including but not limited to multiple sequence alignments, structure superposition, contact analyzer and fold identification to develop homology models based on available high resolution crystal and/or NMR structures of the template protein molecule. In building the homology model of human αB crystallin, the primary sequence of human αB crystallin was first coarsely aligned to the primary sequence of the template protein, wheat sHSP16.9, using ClustalX. The predicted secondary structure of human αB crystallin was then obtained (JPred) and verified with the available spin labeling information about the structural elements. The human αB crystallin 3-dimensional structure was evaluated using Procheck and the ModelEval module of MOE.
Percent surface hydrophobicities of the peptide sequences were calculated by applying an electrostatic potential to the 3D structure of human αB crystallin.
The primary sequence and the predicted secondary structure of human αB crystallin holoprotein were aligned with the primary sequence and the predicted secondary structure of all the proteins in the small heat shock protein family whose sequence is known. Homologous peptides that are structurally conserved and have primary sequence similar to the human αB crystallin peptides were selected. A library of peptides having sequences related to and derived from i) EKDRFSVNLDVKHFS or ii) LTITSSLSDGVLTVNGPRK was prepared. The amino acid sequence of selected peptides is presented in
Protein pin arrays identified seven interactive sequences for chaperone activity in human αB crystallin using natural lens proteins, βH crystallin and γD crystallin, and in vitro chaperone target proteins, alcohol dehydrogenase and citrate synthase. The N-terminal domain contained two interactive sequences, 9WIRRPFFPFHSP20 and 43SLSPFYLRPPSFLRAP58. The α crystallin core domain contained four interactive sequences, 75FSVNLDVK82 (β3), 113FISREFHR120, 131LTITSSLS138 (β8) and 141GVLTVNGP148 (β9). The C-terminal domain contained one interactive sequence, 157RTIPITRE164, that included the highly conserved I-X-I/V amino acid motif. Two interactive sequences, 73DRFSVNLDVKHFS85 and 131LTITSSLSDGV141, belonging to the α crystallin core domain were synthesized as peptides and assayed for chaperone activity in vitro. Both synthesized peptides inhibited the thermal aggregation of βH crystallin, alcohol dehydrogenase and citrate synthase in vitro. Five of the seven chaperone sequences identified by the pin arrays overlapped with sequences identified previously as sequences for subunit-subunit interactions in human αB crystallin. The results suggested that interactive sequences in human αB crystallin have dual roles in subunit-subunit assembly and chaperone activity.
Human αB crystallin is a small heat shock protein (sHSP) and molecular chaperone. sHSPs are characterized by molecular weights <43 kDa, low sequence similarity, up-regulation in response to environmental stress and an ability to protect against the unfolding and aggregation of proteins through activity as molecular chaperones. Ingolia and Craig, Proc Natl Acad Sci USA 79: 2360-4, 1982; Klemenz et al., Proc Natl Acad Sci USA 88: 3652-6, 1991; Merck et al., J Biol Chem 268: 1046-52, 1993; de Jong et al., Mol Biol Evol 10: 103-26, 1993; Groenen et al., Eur J Biochem 225: 1-19, 1994; de Jong et al., Int J Biol Macromol 22: 151-62, 1998; Bloemendal et al., Prog Biophys Mol Biol 86: 407-85, 2004. sHSPs are ubiquitous in cells and tissues throughout the plant and animal kingdoms and are upregulated in age-related myopathies, cardiac ischemia, and a variety of protein aggregation diseases including Alexander's disease, Alzheimer's disease, Creutzfeldt-Jakob disease and Parkinson's disease. Iwaki et al., Cell 57: 71-8, 1989; Iwaki et al., Am J Pathol 140: 345-56, 1992; Kato et al., Acta Neuropathol (Berl) 84: 443-8, 1992; Shinohara et al., J Neurol Sci 119: 203-8, 1993; Goebel and Bornemann, Virchows Arch B Cell Pathol Incl Mol Pathol 64: 127-35, 1993; van Noort et al., Nature 375: 798-801, 1995; Jackson et al., Neuropathol Appl Neurobiol 21: 18-26, 1995; Lobrinus et al., Neuromuscul Disord 8: 77-86, 1998; Renkawek et al., Neuroreport 10: 2273-6, 1999; van Rijk and Bloemendal, Ophthalmologica 214: 7-12, 2000; Yeboah and White, Croat Med J 42: 523-6, 2001. In the lens, where α crystallins comprise approximately 33% of the total protein content, the accumulation of post-translational modifications is associated with protein unfolding that favors attractive interactions between proteins and formation of aggregates large enough to result in light scattering and cataract.
Garland et al., Basic Life Sci 49: 347-52, 1988; Harding, Lens Eye Toxic Res 8: 245-50, 1991; Miesbauer et al., J Biol Chem 269: 12494-502, 1994; Goode and Crabbe, Comput Chem 19: 65-74, 1995; Lund et al., Exp Eye Res 63: 661-72, 1996; Hanson et al., Exp Eye Res 67: 301-12, 1998; Yan and Hui, Yan Ke Xue Bao 16: 91-6, 2000; Feng et al., J Biol Chem 275: 11585-90, 2000; Garner et al., Biochim Biophys Acta 1476: 265-78, 2000; Lapko et al., Protein Sci 10: 1130-6, 2001; Kim et al., Biochemistry 41: 14076-84, 2002; Ueda et al., Invest Ophthalmol Vis Sci 43: 205-15, 2002. In theory, high concentrations of α crystallins in lens cytoplasm can bind unfolding β/γ crystallin proteins, stabilize transparent cell structure and protect against aggregation and opacification through their function as molecular chaperones. Fu and Liang, Invest Ophthalmol Vis Sci 44: 1155-9, 2003; Srivastava and Srivastava, Mol Vis 9: 110-8, 2003; del Valle et al., Biochim Biophys Acta 1601: 100-9, 2002; Slingsby and Clout, Eye 13 Pt 3b: 395-402, 1999; Hook and Harding, Int J Biol Macromol 22: 295-306, 1998; Takemoto and Boyle, Int J Biol Macromol 22: 331-7, 1998. In a normal lens, αB crystallin is a structural protein that interacts weakly with the β/γ crystallins and is closely associated with the filament network. Fu and Liang, J Biol Chem 277: 4255-60, 2002; Nicholl and Quinlan, Embo J 13: 945-53, 1994.
While X-ray crystal structures exist for β and γ crystallin, the structure of αB crystallin is based on spectroscopic data and homology models. Kim et al., Nature 394: 595-9, 1998; Basak et al., J Mol Biol 328: 1137-47, 2003; Purkiss et al., J Biol Chem 277: 4199-205, 2002; Sergeev et al., Mol Vis 4: 9, 1998; Hejtmancik et al., Protein Eng 10: 1347-52, 1997; Norledge et al., Protein Sci 6: 1612-20, 1997; van Montfort et al., Nat Struct Biol 8: 1025-30, 2001; Carver and Lindner, Int J Biol Macromol 22: 197-209, 1998; Ghosh and Clark, Protein Sci 14: 684-95, 2005. Spectroscopic data, secondary structure prediction and X-ray crystal structures of two homologous sHSPs, Methanococcus jannaschii (Mj) sHSP16.5 and wheat sHSP16.9, indicated that sHSPs are characterized by three structural domains: an N-terminal domain that varies in primary sequence, an α crystallin core domain that is conserved in primary sequence and secondary structure, and a C-terminal extension that is variable in sequence. In the crystal structures of Mj sHSP16.5 and wheat sHSP16.9, the N-terminal domain is largely helical or unstructured, the α crystallin core domain is an immunoglobulin-like fold and the C-terminal extension domain protrudes from the α crystallin core domain and is unstructured and flexible. Carver and Lindner, Int J Biol Macromol 22: 197-209, 1998. The immunoglobulin-like fold adopted by the α crystallin core domain is a β sandwich composed of two anti-parallel β sheets formed by six to nine β strands connected by loops of variable lengths. The formation of dimers in wheat sHSP16.9 is due to interactions between the β2 and β3 strands of one monomer with the β6 strand contained in the loop connecting β5 and β7 of another monomer. The C-terminal extension contains a conserved I-X-I/V amino acid motif where I is Isoleucine, V is Valine and X is any natural amino acid.
In wheat sHSP16.9, the I-X-I motif of one monomer interacts with residues of the β4 and β8 strands of another monomer to form the higher order dodecameric quaternary structure observed in the crystal structure. While αB crystallin contains the same three structural domains found in Mj sHSP16.5 and wheat sHSP16.9, the complex assembly of human αB crystallin is larger and more polydisperse than the two sHSPs that have been crystallized. This suggests that the dimer interface and the oligomerization interface in αB crystallin may be different from those of the smaller homologous sHSPs. Proteolysis and domain swapped chimeric mutants of αB crystallin and Caenorhabditis elegans (C. elegans) sHSP12.2 indicated that sequences in all three structural domains of sHSPs were important for complex assembly and chaperone activity. Kokke et al., Cell Stress Chaperones 6: 360-7, 2001; Saha and Das, Proteins 57: 610-7, 2004. The identity of specific residues or sequences in each domain important for the complex assembly and chaperone activity of αB crystallin and other small heat shock proteins remains to be determined.
In a previous study using a protein pin array of sequential and overlapping eight-mer peptides of human αB crystallin, the interactive domains required for subunit-subunit interactions and complex assembly in human αB crystallin were identified. Ghosh and Clark, Protein Sci 14: 684-95, 2005. The N-terminal sequence, 37LFPTSTSLSPFYLRPPSF54, three α crystallin core domain sequences, 75FSVNLDVK82 (β3), 131LTITSSLS138 (β8) and 141GVLTVNGP148 (β9), and the C-terminal sequence, 155PERTIPITREEK166, were identified as subunit-subunit interaction sites in αB crystallin. The pin array studies confirmed and expanded on spectroscopic observations, mutational studies, proteolytic degradation experiments and a two-hybrid screen that characterized interactive domains in sHSPs. The subunit-subunit interactive domains identified by the pin arrays were consistent with the dimer and complex interfaces (β3, β8, β9 and the I-X-I/V amino acid motif) identified in the crystal structures of Mj sHSP16.5 and wheat sHSP16.9, with one exception. The pin arrays did not identify sequences in the loop region connecting β5 and β7 as interactive sequences for dimerization in αB crystallin. Kim et al., Nature 394: 595-9, 1998; van Montfort et al., Nat Struct Biol 8: 1025-30, 2001; Ghosh and Clark, Protein Sci 14: 684-95, 2005. In separate reports using peptide scanning techniques, sequences for subunit-subunit assembly and chaperone function were identified in αB crystallin and a small heat shock protein, sHSPB, that were consistent with the sequences identified by the pin arrays. Ghosh and Clark, Protein Sci 14: 684-95, 2005; Sreelakshmi et al., Biochemistry 43: 15785-95, 2004; Lentze and Narberhaus, Biochem Biophys Res Commun 325: 401-7, 2004. A sequence in the α crystallin core domain of αA crystallin, KFVIFLDVKHFSPEDLTVK, which is homologous to the sequence 75FSVNLDVK82 from the α crystallin core domain of human αB crystallin, was reported to have chaperone-like activity in vitro.
Sharma et al., J Biol Chem 275: 3767-71, 2000.
In this report, protein pin arrays identified and characterized interactive sequences that were mapped to a 3-D structural model. Seven interactive domains for chaperone function in human αB crystallin were identified as sequences that interacted with denatured βH crystallin, γD crystallin, alcohol dehydrogenase and citrate synthase. Two of the interactive peptides, 73DRFSVNLDVKHFS85 and 131LTITSSLSDGV141, were synthesized and observed to have chaperone activity in vitro against the thermal aggregation of βH crystallin, ADH and CS. Five of the seven interactive sequences for chaperone function identified by the pin arrays overlapped with the interactive domains for subunit-subunit interactions and complex assembly identified previously. Ghosh and Clark, Protein Sci 14: 684-95, 2005. Taken together, these data suggest that interactive domains in αB crystallin mediate dual functions of chaperone activity and subunit-subunit assembly.
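The overlap between the seven chaperone sequences and the previously identified subunit-subunit interaction sites can be checked from the residue numbers alone. The following is a minimal sketch: the residue ranges are read from the sequence labels given in the text, while the names `CHAPERONE`, `SUBUNIT`, `ranges_overlap` and `overlapping_chaperone_sequences` are hypothetical and introduced only for illustration. Treating two sequences as overlapping when they share at least one residue position, five of the seven chaperone sequences overlap a subunit-subunit interaction site.

```python
# Residue ranges (1-based, inclusive) read from the sequence labels in the text.
CHAPERONE = {
    "9WIRRPFFPFHSP20": (9, 20),
    "43SLSPFYLRPPSFLRAP58": (43, 58),
    "75FSVNLDVK82": (75, 82),
    "113FISREFHR120": (113, 120),
    "131LTITSSLS138": (131, 138),
    "141GVLTVNGP148": (141, 148),
    "157RTIPITRE164": (157, 164),
}
SUBUNIT = {
    "37LFPTSTSLSPFYLRPPSF54": (37, 54),
    "75FSVNLDVK82": (75, 82),
    "131LTITSSLS138": (131, 138),
    "141GVLTVNGP148": (141, 148),
    "155PERTIPITREEK166": (155, 166),
}


def ranges_overlap(a, b):
    """True if two inclusive residue ranges share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlapping_chaperone_sequences(chaperone=CHAPERONE, subunit=SUBUNIT):
    """Chaperone sequences that overlap any subunit-subunit interaction site."""
    return [
        name
        for name, rng in chaperone.items()
        if any(ranges_overlap(rng, site) for site in subunit.values())
    ]


shared = overlapping_chaperone_sequences()
print(len(shared), shared)
```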
Synthesis of Pins, binding and detection of peptides binding to ligand proteins. The αB crystallin protein pin array was used to measure the interaction between peptides and chaperone target proteins including bovine βH crystallin, human γD crystallin, equine alcohol dehydrogenase (ADH) and porcine citrate synthase (CS) as described (
The purity of the target proteins used in the pin array assays, bovine βH crystallin, human γD crystallin, equine alcohol dehydrogenase (ADH) and porcine citrate synthase (CS), was determined to be >90% by SDS-PAGE. In addition, primary antibodies for each target protein were specific to that target protein and consequently contaminating proteins that may bind to the peptides will not be detected. Eighty-four sequential and overlapping peptide fragments corresponding to residues 1-175 of human αB crystallin were synthesized employing a simultaneous peptide synthesis strategy developed by Geysen, called Multipin Peptide Synthesis (Mimotopes, San Diego, Calif.). Geysen, Southeast Asian J Trop Med Public Health 21: 523-33, 1990; Chin et al., J Org Chem 62: 538-539, 1997. Peptides were immobilized on derivatized polyethylene pins arranged in a microtiter ELISA plate format. Each peptide was eight amino acids in length and consecutive peptides were offset by two amino acids. All peptides were bound covalently to the surface of the plastic pins. The first peptide was 1MDIAIHHP8 and the last peptide was 168PAVTAAPK175 for human αB crystallin. All proteins and antibodies were purchased from suppliers as listed in Table 3.
Human myoglobin did not interact with the αB crystallin peptides and was the negative control for the protein pin array assay. Ghosh and Clark, Protein Sci 14: 684-95, 2005. Positive interactions resulted in blue color in the wells of the ELISA plate. The intensity of the blue color in the wells was measured at 450 nm using an ELISA plate-reader (BioTek, Winooski, Vt.). The intensity of the blue color (plotted on the X-axis) was a measure of the interaction between αB crystallin peptides (plotted on the Y-axis) and target proteins. To measure the effect of temperature on the interactions between interactive peptides and the target proteins, the target proteins were heated in a water-bath at 45° C. for fifteen minutes prior to use. Pin arrays are unable to differentiate between monomers, dimers or oligomers of target proteins that exist in solution. Instead, pin arrays are very sensitive detectors of interactions between individual peptides and the entire population (monomers, dimers or oligomers) of specific target proteins that may exist in solution under specific conditions. Each target protein was assayed two to five times. A single pin array was used for all experiments and no change in interactions was observed after more than thirty repetitions. The last three peptides of the protein pin array, 163REEKPAVT170, 165EKPAVTAA172 and 167PAVTAAPK174, correspond to the epitope (163REEKPAVTAAPKK175) recognized by the primary antibody for human αB crystallin. A positive reaction is observed for these three peptides in the absence of human αB crystallin as the ligand due to direct binding of the anti-human αB crystallin antibody to these three peptides. The loss of efficiency for the pin array was measured using this assay and was determined to be <5% after more than 30 assays.
Molecular modeling of human αB crystallin. The homology modeling program Molecular Operating Environment (MOE) (Chemical Computing Group, Inc, Montreal, Canada) was used to construct the 3-dimensional homology model of human αB crystallin as described previously. Ghosh and Clark, Protein Sci 14: 684-95, 2005. The software included modules for multiple sequence alignment, structure superposition, contact analysis, fold identification and analysis of the stereochemical quality of the predicted models, which takes into account parameters such as planarity, chirality, phi/psi preferences, chi angles, non-bonded contact distances, and unsatisfied donors and acceptors. Fechteler et al., J Mol Biol 253: 114-31, 1995. The wheat sHSP16.9 crystal structure was chosen as the template for the homology modeling of human αB crystallin because, of all the available crystal structures of sHSPs, wheat sHSP16.9 has the highest degree of sequence similarity with human αB crystallin (40% in the α crystallin core domain and 25.4% overall). In building the homology model of human αB crystallin, the primary sequence of human αB crystallin was aligned with the primary sequence of the template protein, wheat sHSP16.9, using ClustalX. Jeanmougin et al., Trends Biochem Sci 23: 403-5, 1998; Aiyar, Methods Mol Biol 132: 221-41, 2000. The predicted secondary structure of human αB crystallin was then obtained (JPred) and verified with the available spin labeling information for the structural elements. Koteiche and McHaourab, J Mol Biol 294: 561-77, 1999. The secondary structure of human αB crystallin was then aligned structurally with the observed secondary structure of the wheat sHSP16.9. This alignment was used to create a series of ten energy minimized models in MOE. Each model was evaluated using the ModelEval module of MOE and the best fit was selected as the final model and verified by Procheck. Morris et al., Proteins 12: 345-64, 1992.
The αB crystallin 3D model computed on the basis of the X-ray crystal structures of wheat sHSP16.9 and Mj sHSP16.5 was consistent with the electron spin resonance (ESR) data and previous homology models of αB crystallin. Koteiche and McHaourab, J Mol Biol 294: 561-77, 1999; Guruprasad and Kumari, Int J Biol Macromol 33: 107-12, 2003; Berengian et al., Biochemistry 36: 9951-7, 1997; Koteiche et al., Biochemistry 37: 12681-8, 1998; Muchowski et al., J Mol Biol 289: 397-411, 1999. When the 3D homology model of αB crystallin was superimposed on the crystal structures of wheat sHSP16.9 and Mj sHSP16.5, the Cα root mean square deviation of the fit was 3.25 Å. Ghosh and Clark, Protein Sci 14: 684-95, 2005. Superimposition of the conserved α crystallin core domains of the three structures resulted in a Cα root mean square deviation of 2.06 Å. Hydrophobic surface areas formed by the chaperone sequences were calculated by a custom script provided by the manufacturer (Chemical Computing Group, Inc, Montreal, Canada). Graphical representations of human αB crystallin were made using PyMOL and MOE.
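The Cα root mean square deviation quoted above is the square root of the mean squared distance between paired Cα atoms of two superimposed structures. The following minimal sketch illustrates that definition only. It assumes the two coordinate lists are already superimposed and paired residue by residue, and it does not reproduce the superposition procedure of MOE; the function name `ca_rmsd` and the coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha positions.

    Assumes the structures are already superimposed and that position i in
    one list is paired with position i in the other. The result is in the
    same units as the input coordinates (for example, angstroms).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates, for illustration only.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(ca_rmsd(model, template))
```

Restricting both lists to residues of the α crystallin core domain corresponds to the core-domain comparison described in the text.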
Protein pin arrays enabled the identification of interactive sequences necessary for the chaperone activity of human αB crystallin. Interactions between immobilized 8-mer human αB crystallin peptides and unheated and pre-heated βH crystallin, a natural protein constituent of lens cells, were measured as absorbances at λ=450 nm at 23° C. Maximum absorbances were measured for the peptide sequences, 9WIRRPFFP16, 45SPFYLRPP52, 47FYLRPPSF54, 51PPSFLRAP58, 69RLEKDRFS76, 75FSVNLDVK82, 89LKVKVLGD96, 113FISREFHR120, 121KYRIPADV128, 131LTITSSLS138, 141GVLTVNGP148 and 157RTIPITRE164 when unheated βH crystallin was the target protein (
When βH crystallin was replaced with γD crystallin, a native protein constituent of lens cytoplasm from the same gene family as the β crystallins, as the ligand in the pin array, twelve interactive sequences were identified (
In addition to interactions with physiological target proteins in the β/γ crystallin family, interactions of the αB crystallin peptides with unheated and pre-heated chaperone target proteins, alcohol dehydrogenase (ADH) and citrate synthase (CS) were measured. Prior to use in the pin array assay, the secondary structures of βH crystallin, γD crystallin, ADH, and CS were determined using far ultra-violet circular dichroism (UVCD) at 23° C., after heating at 45° C. for fifteen minutes and after heating at 50° C. for sixty minutes (
Prior to use in the pin array assay, the tertiary structures of βH crystallin, γD crystallin, ADH, and CS were determined using near ultra-violet circular dichroism at 23° C., after heating at 45° C. for fifteen minutes and after heating at 50° C. for sixty minutes (
The pattern of absorbance at λ=450 nm when αB crystallin peptides were assayed with unheated and pre-heated ADH was similar to the absorbance pattern obtained with unheated and pre-heated βH/γD crystallin (
The pattern of absorbance at λ=450 nm when αB crystallin peptides were assayed with unheated and pre-heated CS was similar to the absorbance pattern obtained with unheated and pre-heated βH/γD crystallin and ADH (
3IAIHHPWI10
9WIRRPFFP16
9WIRRPFFP16
9WIRRPFFP16
9WIRRPFFP16
9WIRRPF
21PFFPFHSP28
21PFFPFHSP28
23FPFHSPSR30
25DQFFGEHL32
27FFGEHLLE34
37LFPTSTSL44
43SLSPFY
45SPFYLRPP52
43SLSPFYLR50
43SLSPFYLR50
43SLSPFYLR50
47FYLRPPSF54
47FYLRPPSF54
49LRPPSFLR56
47FYLRPPSF54
51PPSFLRAP58
53SFLRAPSW60
55LRAPSWFD62
69RLEKDRFS76
69RLEKDRFS76
75FSVNL
75FSVNLDVK82
75FSVNLDVK82
75FSVNLDVK82
75FSVNLDVK82
89LKVKVLGD96
113FISREFHR120
111HGFISREF118
113FISREFHR120
113FISREF
121KYRIPADV128
117EFHRKYRI124
115SREFHRKY122
131LTITSSLS138
131LTITSSLS138
131LTITSSLS138
131LTITSSLS138
131LTITSS
141GVLTVNGP148
141GVLTVNGP148
141GVLTVNGP148
143LTVNGPRK150
141GVLT
157RTIPITRE164
157RTIPITRE164
157RTIPITRE164
157RTIPIT
Seven interactive sequences for chaperone function in αB crystallin, 9WIRRPFFPFHSP20, 43SLSPFYLRPPSFLRAP58, 75FSVNLDVK82, 113FISREFHR120, 131LTITSSLS138, 141GVLTVNGP148 and 157RTIPITRE164, were identified as sequences that had the strongest interactions with the chaperone target proteins, ADH and CS, and the physiological proteins, βH/γD crystallin, that were pre-heated at 45° C. for fifteen minutes (Table 4: Column 6). Two of the chaperone sequences are in the N-terminus region, four are non-overlapping chaperone sequences in the conserved α crystallin core domain, and a single non-overlapping chaperone sequence is in the C-terminus extension of αB crystallin.
Two αB crystallin sequences, 73DRFSVNLDVKHFS85 and 131LTITSSLSDGV141, that were in the conserved α crystallin core domain and were observed to have positive interactions with pre-heated target proteins in the pin array were synthesized to determine their chaperone activity in vitro. The chaperone activity of the peptides was measured as their ability to protect against the thermal aggregation of three chaperone target proteins βH crystallin, ADH and CS in chaperone assays performed at 50° C. (
The seven sequences identified as interactive domains for chaperone function in human αB crystallin using protein pin arrays (Table 4: Column 6) were assigned secondary structure based on electron spin resonance (ESR) and homology modeling data (
The observed interactive domains were mapped to a computed 3D homology model of human αB crystallin to identify the 3-D structure of the interactive chaperone sequences (
Small heat shock proteins, sHSPs, are a family of stress proteins and molecular chaperones with molecular weights up to 43 kDa that contain an N-terminus domain variable in length and primary sequence, a conserved α crystallin core domain, and a C-terminal extension domain that contains the highly conserved I-X-I/V amino acid motif. In this report, protein pin arrays identified seven interactive sequences, 9WIRRPFFPFHSP20, 43SLSPFYLRPPSFLRAP58, 75FSVNLDVK82, 113FISREFHR120, 131LTITSSLS138, 141GVLTVNGP148 and 157RTIPITRE164, as being important for the chaperone activity of human αB crystallin using endogenous target proteins βH/γD crystallins and non-physiological targets ADH and CS. Although it is possible that interactions of αB crystallin with native βH crystallin and γD crystallin that require secondary structure might evade detection by the protein pin arrays, it is likely that the pin arrays identified most sequences involved in the interactions of αB crystallin with native βH crystallin and γD crystallin. The interactive peptides identified by the pin arrays were not the most hydrophobic peptides in the human αB crystallin protein pin array. Based on the hydrophobicity values provided by the manufacturer, the interactive peptides, 9WIRRPFFPFHSP20, 43SLSPFYLRPPSFLRAP58 and 131LTITSSLS138, were quite hydrophobic. Fourteen non-interactive peptides in the human αB crystallin pin array were more hydrophobic than the remaining interactive peptides, 75FSVNLDVK82, 113FISREFHR120, 141GVLTVNGP148 and 157RTIPITRE164. Ghosh and Clark, Protein Sci 14: 684-95, 2005.
Far UVCD analysis indicated that there was <10% loss of secondary structure of βH crystallin and γD crystallin when they were heated at 45° C. for fifteen minutes. However, the magnitude of the absorption peaks in the near UVCD spectra of βH crystallin and γD crystallin decreased by ~20-25%, which indicated conformational changes in the tertiary structures of those proteins. In the absence of significant unfolding and loss of secondary structure of βH crystallin and γD crystallin as determined by far UVCD, the increased interaction between interactive αB crystallin peptides and pre-heated βH crystallin and γD crystallin suggested that the interactive domains of αB crystallin detected conformational changes in tertiary structure that resulted from pre-heating βH crystallin and γD crystallin. It appeared that the pin arrays were as sensitive as near UVCD in detecting perturbations in the tertiary structure of unfolded proteins.
Two of the seven chaperone sequences were in the N-terminus, four in the conserved α crystallin core domain, and one in the C-terminal extension containing the highly conserved I-X-I/V amino acid motif. The pin array results suggested that point mutations within the interactive domains can be expected to modify the chaperone activity of αB crystallin and point mutations outside the interactive domains can be expected to have little or no effect on chaperone activity. While systematic studies have not been reported for the sequence motifs identified using the pin arrays to date, literature results are consistent with the suggestion that the sequences identified by the pin arrays are important motifs for the chaperone-like activity of human αB crystallin. Muchowski et al., J Mol Biol 289: 397-411, 1999; Gupta and Srivastava, Invest Ophthalmol Vis Sci 45: 206-14, 2004; Liao et al., Biochem Biophys Res Commun 297: 309-16, 2002; Kumar et al., J Biol Chem 274: 24137-41, 1999; Horwitz et al., Int J Biol Macromol 22: 263-9, 1998; Plater et al., J Biol Chem 271: 28558-66, 1996; Derham et al., Eur J Biochem 268: 713-21, 2001. Mutagenesis of αB crystallin in which deletions of the C-terminus extension included Arginine 157 resulted in diminished chaperone activity in vitro when compared to full-length αB crystallin. Thampi and Abraham, Biochemistry 42: 11857-63, 2003. Arginine 157 is present in the 157RTIPITRE164 chaperone sequence identified by the pin arrays. X-ray solution scattering on a truncation mutant of αB crystallin (αB crystallin 57-157) indicated that the α crystallin domain in the absence of the N- and C-terminal extensions formed dimers and had decreased chaperone-like activity in vitro as compared to full-length αB crystallin. Feil et al., J Biol Chem 276: 12024-9, 2001.
When two consensus chaperone sequences 73DRFSVNLDVKHFS85 and 131LTITSSLSDGV141 belonging to the α crystallin core domain were synthesized and tested for protection against thermal unfolding and aggregation of chaperone target proteins βH crystallin, ADH and CS in vitro, a substantial protective effect was observed. The chaperone assays confirmed that the sequences identified using the pin array were important for the chaperone activity of αB crystallin and were consistent with an earlier study in which hydrophobic probes and chaperone assays identified the αB crystallin sequence 73DRFSVNLDVK82 as an interactive sequence for chaperone activity. Sharma et al., J Biol Chem 279: 6027-34, 2004. Selected point or combination mutations in the interactive sequences of αB crystallin can be expected to improve or diminish chaperone activity. A higher concentration of both peptides was required to protect against the aggregation of βH crystallin and ADH, than to protect against the aggregation of CS. Circular dichroism analysis indicated that βH crystallin was partially unfolded and both ADH and CS were almost completely unfolded at 50° C. Taken together, the chaperone assay and circular dichroism data suggested that the peptides were more efficient in protecting against the aggregation of a completely unfolded protein and less efficient in protecting against the aggregation of partially unfolded or native-like proteins.
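The peptide labels in this description follow a residue-numbered notation: the position of the first residue, the one-letter amino acid sequence, and the position of the last residue. The following is a minimal sketch, not part of the described methods, of how such labels could be parsed and checked for internal consistency (the numbering span must equal the sequence length). The function names `parse_peptide` and `contains` are hypothetical and introduced only for illustration.

```python
import re

# Notation used in the text: start residue number, one-letter amino acid
# sequence, end residue number, e.g. "73DRFSVNLDVKHFS85".
PEPTIDE = re.compile(r"^(\d+)([A-Z]+)(\d+)$")


def parse_peptide(label):
    """Split a residue-numbered label into (start, sequence, end).

    Raises ValueError if the label does not match the notation or if the
    numbering span disagrees with the sequence length.
    """
    match = PEPTIDE.match(label)
    if match is None:
        raise ValueError(f"not a residue-numbered peptide: {label!r}")
    start = int(match.group(1))
    sequence = match.group(2)
    end = int(match.group(3))
    span = end - start + 1
    if span != len(sequence):
        raise ValueError(
            f"{label!r}: numbering spans {span} residues, "
            f"sequence has {len(sequence)}"
        )
    return start, sequence, end


def contains(outer, inner):
    """Return True if peptide `inner` lies inside `outer` with identical residues."""
    o_start, o_seq, o_end = parse_peptide(outer)
    i_start, i_seq, i_end = parse_peptide(inner)
    if i_start < o_start or i_end > o_end:
        return False
    offset = i_start - o_start
    return o_seq[offset:offset + len(i_seq)] == i_seq
```

For example, `contains("73DRFSVNLDVKHFS85", "73DRFSVNLDVK82")` confirms that the sequence reported by Sharma et al. lies within the first synthesized core domain peptide. The check operates on the labels only and says nothing about structure or activity.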
The interactive sequences identified using the pin arrays were mapped onto a 3-dimensional model of αB crystallin to analyze the structural topography of the chaperone interface. Ghosh and Clark, Protein Sci 14: 684-95, 2005. In the absence of an X-ray crystal or NMR structure, it was observed that superposition of the computed αB crystallin homology model with the crystal structure of Mj sHSP16.5 and wheat sHSP16.9 was remarkable with a Cα root mean square deviation of 2.06 Å for the α crystallin core domain. Kim et al., Nature 394: 595-9, 1998; van Montfort et al., Nat Struct Biol 8: 1025-30, 2001. Secondary structure was assigned to the interactive sequences identified by the pin arrays on the basis of electron paramagnetic resonance data (EPR) and a multiple sequence alignment of human αB crystallin with the crystal structures of wheat sHSP16.9 and Mj sHSP16.5. Kim et al., Nature 394: 595-9, 1998; van Montfort et al., Nat Struct Biol 8: 1025-30, 2001; Koteiche and McHaourab, J Mol Biol 294: 561-77, 1999; Koteiche et al., Biochemistry 37: 12681-8, 1998. The N-terminal chaperone sequence 9WIRRPFFPFHSP20 was unstructured and formed a surface that was 70% hydrophobic while the sequence 43SLSPFYLRPPSF54 formed a helix-turn-helix motif with an external surface that was 72% hydrophobic and favorable for binding exposed hydrophobic patches of unfolding proteins. Three of the four sequences in the α crystallin core domain, 75FSVNLDVK82 (β3), 131LTITSSLS138 (β8) and 141GVLTVNGP148 (β9) were β strands and formed a surface that was 67% hydrophobic. The C-terminal chaperone sequence 157RTIPITRE164 containing the highly conserved I-X-I/V motif was unstructured and formed a surface that was 59% hydrophobic. The chaperone sequences 43SLSPFYLRPPSF54, 75FSVNLDVK82, 131LTITSSLS138, 141GVLTVNGP148 and 157RTIPITRE164 identified in this report overlapped significantly with sequences identified previously as subunit-subunit interaction sites in αB crystallin using protein pin arrays (
The lack of interactive sequences similar to the αB crystallin N-terminal and C-terminal chaperone sequences, 9WIRRPFFPFHSP20, 43SLSPFYLRPPSFLRAP58 and 157RTIPITRE164, in sHSPs of C. elegans (sHSP12.2, sHSP12.3 and sHSP12.6) could account for the absence of chaperone-like activity of C. elegans sHSPs in vitro. Kokke et al., FEBS Lett 433: 228-32, 1998. The interface formed by the α crystallin core domain peptides 75FSVNLDVK82 (β3), 131LTITSSLS138 (β8) and 141GVLTVNGP148 (β9), which interacted with all four chaperone target proteins and collectively formed an external surface that was 67% hydrophobic, was previously identified as the interface for the assembly of human αB crystallin subunits using pin arrays. Ghosh and Clark, Protein Sci 14: 684-95, 2005. The interface provides residues for both hydrophobic and hydrophilic interactions with native proteins (αA and αB crystallin) as well as with unfolding chaperone target proteins (βH crystallin, γD crystallin). The structure of the α crystallin core domain is highly conserved in the small heat shock protein family, and sequences homologous to the αB crystallin chaperone sequences 75FSVNLDVK82, 113FISREFHR120, 131LTITSSLS138 and 141GVLTVNGP148 in other small heat shock proteins are expected to be involved in the chaperone function of other sHSPs. In summary, pin array assays, in vitro chaperone assays and circular dichroism spectroscopy of target proteins identified the sequences in full-length αB crystallin that were responsible for interactions with a broad range of target proteins including proteins that are almost completely unfolded (ADH and CS) and proteins that are partially unfolded (βH crystallin and γD crystallin). Protein pin arrays were effective in identifying protein-protein interactive domains in human αB crystallin that were important in oligomeric assembly and in interactions with unfolding chaperone target proteins.
Further investigation of the specific sequences and 3-dimensional structures of the interactive domains in physiologically relevant small heat shock proteins, including human sHSP27 and Mycobacterium tuberculosis sHSP16.3, will provide new information on the function of sHSPs and molecular chaperones in disease. The current results suggest that the collective response of sHSPs to protein unfolding involves several interactive domains and that sHSPs are exquisitely sensitive to protein unfolding. These results could account for the selectivity and sensitivity of small heat shock proteins, their adaptation to the needs of specific cells and their response to stress.
Effect of αB Crystallin and Five αB Crystallin Derived Peptides on the Fibrillization of Aβ or γD Crystallin
Aβ Thioflavin T fluorescence. Aβ fibrils were grown in the presence and absence of αB crystallin and peptides under conditions similar to those previously described for the presence of αB crystallin. Bakthisaran et al., Biochem J, 2005. Peptides were dissolved to 9.1 mM in trifluoroethanol (TFE) and diluted to 0.91 mM stock solutions in 50 mM PBS, 100 mM NaCl, pH 7.4. 3.5 mg of Thioflavin T was dissolved in 100 μl of 50 mM glycine, pH 8.5. A stock solution of 1 mg/ml (0.22 mM) Aβ was prepared. 20 μl Aβ, 50 μl chaperone and 2 μl Thioflavin T were mixed with 28 μl of 50 mM PBS, 100 mM NaCl, pH 7.4. Fluorescence was read using a Beckman Coulter DTX 880 multimode detector. Samples were heated at 50° C. for 72 hours and fluorescence was read again.
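The final concentrations in the Aβ reaction mixture follow from the stated stock concentrations and volumes by the dilution relation C1·V1 = C2·V2. The sketch below is illustrative only. It assumes that the 50 μl of chaperone is taken from the 0.91 mM peptide stock (the stock concentration of full-length αB crystallin is not stated here), and the function name `final_concentration` is hypothetical.

```python
def final_concentration(stock_mM, volume_ul, total_ul):
    """Concentration after dilution into total_ul, from C1*V1 = C2*V2."""
    return stock_mM * volume_ul / total_ul


# Volumes of the Abeta reaction mixture given in the text (microlitres).
volumes_ul = {"Abeta": 20, "chaperone": 50, "ThioflavinT": 2, "buffer": 28}
total_ul = sum(volumes_ul.values())

# Abeta stock is stated as 1 mg/ml (0.22 mM).
abeta_mM = final_concentration(0.22, volumes_ul["Abeta"], total_ul)

# Assumption: the chaperone is a peptide added from the 0.91 mM stock.
peptide_mM = final_concentration(0.91, volumes_ul["chaperone"], total_ul)

# Peptides were dissolved at 9.1 mM in TFE and diluted to 0.91 mM.
stock_dilution_factor = 9.1 / 0.91
```

Under that assumption the 100 μl mixture contains about 0.044 mM Aβ and about 0.455 mM peptide, and the peptide stock represents a ten-fold dilution of the TFE solution.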
γD crystallin Thioflavin T fluorescence. γD crystallin fibrils were formed in a manner similar to that described for γ crystallin. Meehan et al., J Biol Chem 279: 3413-3419, 2004. Peptides were dissolved to 9.1 mM in TFE and diluted to 0.91 mM stock solutions in 10% TFE, pH 2.0. 3.5 mg of Thioflavin T was dissolved in 100 μl of 50 mM glycine, pH 8.5. A stock solution of 1 mg/ml (0.22 mM) Aβ was prepared. 20 μl Aβ, 5 μl chaperone and 2 μl Thioflavin T were mixed with 73 μl of 10% TFE, pH 2.0. Fluorescence was read using a Beckman Coulter DTX 880 multimode detector. Samples were heated at 50° C. for 72 hours and fluorescence was read again.
Chaperone Assays. Chaperone assays of the five peptides were performed using previously established methods with minor modifications. Muchowski et al., J Mol Biol 289: 397-411, 1999. A 1:1 monomeric molar ratio of chaperone:target protein was determined to be the optimum ratio, and all subsequent chaperone assays of the β3 mutants were performed in triplicate at a 1:1 monomeric molar ratio of chaperone:target protein in a 96-well ELISA micro-titer plate. 0.1 mmoles of the chaperone and βL crystallin were mixed in a total volume of 200 μL buffer (5 mM PBS, pH 7.0). Light scattering at λ=340 nm was measured using a Multiskan MCC/340 plate reader before heating (time=0). After the first reading, the plate was heated at 50° C. and readings were taken at fifteen-minute intervals for one hour.
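The reading schedule described above (one reading before heating, then one every fifteen minutes for one hour, in triplicate) can be organized as in the following sketch. It is an illustration only and does not reproduce any analysis from this description. In particular, the `percent_protection` formula is an assumed, commonly used convention for summarizing light scattering data, not a formula stated in the source, and all function names are hypothetical.

```python
from statistics import mean

# Reading schedule from the text: time=0 (before heating), then every
# fifteen minutes for one hour.
TIME_POINTS_MIN = list(range(0, 61, 15))


def mean_time_course(replicates):
    """Average replicate wells (e.g. a triplicate) at each time point."""
    if any(len(r) != len(TIME_POINTS_MIN) for r in replicates):
        raise ValueError("each replicate needs one reading per time point")
    return [mean(values) for values in zip(*replicates)]


def scattering_increase(course):
    """Rise in 340 nm scattering relative to the time=0 reading."""
    return [value - course[0] for value in course]


def percent_protection(target_alone, with_chaperone):
    """Assumed convention (not from the source): the reduction in the final
    scattering rise caused by the chaperone, as a percentage of the rise
    seen for the target protein alone."""
    rise_target = target_alone[-1] - target_alone[0]
    rise_mixed = with_chaperone[-1] - with_chaperone[0]
    if rise_target <= 0:
        raise ValueError("target protein alone shows no scattering increase")
    return 100.0 * (1.0 - rise_mixed / rise_target)
```

Averaging per time point before comparing curves keeps well-to-well variation within a triplicate from being read as a chaperone effect. Any readings passed to these functions would come from the plate reader, and none are supplied here.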
Table 5 shows the fluorescence of Aβ or γD crystallin in the absence or presence of wild type αB crystallin and five αB crystallin derived peptides.
αB crystallin
ST SLSPFYLRPPSFLRAP
DR FSVNLDVKHFS
HG KHEERQDE
LT ITSSLSSDGV
ER TIPITRE
This description of the invention will enable those skilled in the art to obtain the results presented here within a wide range of equivalent parameters, concentrations, and conditions, without departing from the spirit and scope of the invention and without undue experimentation.
While this invention has been described in connection with specific embodiments thereof, it will be understood that it is capable of further modifications. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth as follows in the scope of the appended claims.
All references cited herein, including journal articles or abstracts, published or corresponding U.S. or foreign patent applications, issued U.S. or foreign patents, or any other references, are entirely incorporated by reference herein, including all data, tables, figures, and text presented in the cited references. Additionally, the entire contents of the references cited within the references cited herein are also entirely incorporated by reference.
This application is related to U.S. Provisional Application No. 60/625,364, filed Nov. 4, 2004, and U.S. Provisional Application No. 60/724,961, filed Oct. 7, 2005, which are incorporated herein by reference in their entirety.
This invention was made with government support under Grant No. EY04542 from the National Eye Institute of the National Institutes of Health. The Government has certain rights in this invention.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US05/40161 | 11/4/2005 | WO | 00 | 12/3/2007 |
Number | Date | Country | |
---|---|---|---|
60625364 | Nov 2004 | US | |
60724961 | Oct 2005 | US |