Claims
- 1. A method for identifying a site on a first protein, wherein the site has a particular structure that is essentially not present in a second protein, comprising:
(a) providing purified first and second proteins; (b) subjecting the first and second proteins to analysis by mass spectrometry; (c) subjecting the first and the second protein to NMR spectroscopic analysis; (d) subjecting the first and second protein to X-ray diffraction analysis; and (e) comparing the analyses of the first protein obtained in (b)-(d), which analyses may be performed in any order, with that of the second protein obtained in (b)-(d), to thereby identify a site on the first protein that is essentially not present on the second protein, such that a molecule that binds to the first protein is not expected to bind substantially to the second protein.
- 2. A method for identifying a site on a first protein, wherein the site has a particular structure that is present with sufficient similarity in a second protein, comprising:
(a) providing purified first and second proteins; (b) subjecting the first and second proteins to analysis by mass spectrometry; (c) subjecting the first and the second protein to NMR spectroscopic analysis; (d) subjecting the first and second protein to X-ray diffraction analysis; and (e) comparing the analyses obtained in (b)-(d), which may be performed in any order, to thereby identify a site on the first protein that is present with sufficient similarity on the second protein, such that a molecule that binds to the first protein is expected to bind substantially to the second protein.
- 3. The method of claim 1, wherein the first and the second proteins are structurally related proteins.
- 4. The method of claim 2, wherein the first and the second proteins are homologs of each other.
- 5. The method of claim 4, wherein the amino acid sequences of the first and the second proteins are at least 80% identical.
- 6. The method of claim 2, wherein the atomic coordinates for the two or more proteins have a root mean square deviation of not more than 1.5 Å for all backbone atoms shared in common in the site.
- 7. The method of claim 2, wherein the atomic coordinates for the two or more proteins have a root mean square deviation of not more than 1.5 Å for all side chain atoms and Cα atoms shared in common in the site.
- 8. The method of claim 2, wherein the first and the second proteins are structurally unrelated polypeptides.
- 9. The method of claim 1, wherein the first and the second proteins have a substantially similar biological activity.
- 10. The method of claim 1, wherein the first and the second proteins are one of the following: kinases, proteases, phosphatases, P450s, conjugation enzymes, ATPases, GTPases, nucleotide binding proteins, DNA processing enzymes, helicases, polymerases, RNA polymerases, DNA polymerases, GPCRs, intracellular receptors, metabolic enzymes, nuclear receptors, channels, phosphodiesterases, Ca binding proteins, bacterial proteins, non-membrane bacterial proteins, human proteins that bind viral proteins, viral proteins, or non-membrane viral proteins.
- 11. The method of claim 1, further comprising repeating (a)-(e) on a third protein and including the third protein in the comparison of (e).
- 12. The method of claim 2, further comprising repeating (a)-(e) on at least about 10% of the polypeptides in a defined proteome and including the polypeptides in the comparison of (e).
- 13. The method of claim 12, wherein the defined proteome comprises non-membrane proteins, membrane proteins, proteins in an organelle, or proteins in a pathway.
- 14. The method of claim 2, wherein the first and the second proteins are in the same biosynthetic pathway.
- 15. The method of claim 1, further comprising identifying a compound that binds to the site on the first protein using structure guided drug design.
- 16. The method of claim 15, the structure guided drug design comprising:
(i) supplying a computer modeling application with a set of structure coordinates and structural information obtained from (b)-(d); (ii) supplying the computer modeling application with a set of structure coordinates for a chemical entity; and (iii) determining whether the chemical entity is expected to bind to the first protein.
- 17. The method of claim 16, wherein (iii) for the structure guided drug design further comprises performing a fitting operation between the chemical entity and the site of the first protein, followed by computationally analyzing the results of the fitting operation to quantify the association between the chemical entity and the site of the first protein.
- 18. The method of claim 16, wherein the structure guided drug design comprises:
(1) supplying a computer modeling application with a set of structure coordinates and structural information obtained from (b)-(d); (2) supplying the computer modeling application with a set of structure coordinates for a chemical entity; (3) evaluating the potential binding interactions between the chemical entity and the site of the first protein; (4) structurally modifying the chemical entity to yield a set of structure coordinates for a modified chemical entity; and (5) determining whether the chemical entity is expected to bind to the first protein.
- 19. The method of claim 16, wherein the structure guided drug design comprises:
(1) supplying a computer modeling application with a set of structure coordinates and structural information obtained from (b)-(d); (2) computationally building a chemical entity represented by a set of structure coordinates; and (3) determining whether the chemical entity is expected to bind to the first protein.
- 20. The method of claim 2, further comprising identifying a compound that binds to the site on the first protein using structure guided drug design.
- 21. The method of claim 2, further comprising identifying a compound that is expected to bind to the site on the first protein and determining the ability of the compound to bind to the first and the second proteins using an activity assay, wherein a change in the activity of one of the proteins in the presence of the compound indicates that the compound modulates the activity of the protein.
- 22. The method of claim 1, wherein the mass spectrometry analysis identifies the primary sequence of the protein, the type and location of post translational modifications of the protein, or regions of the protein which interact with another molecule.
- 23. The method of claim 1, wherein the NMR spectroscopic analysis involves 1D NMR, 2D NMR or 15N/1H correlation spectroscopy.
- 24. A computer readable storage medium comprising structural data, wherein the data comprise the identity of a first and a second protein and the three dimensional structural information of the first and the second proteins obtained using the method of claim 1.
- 25. A database comprising the identity of two or more proteins and the three dimensional structure information of the two or more proteins obtained using the method of claim 2.
- 26. The method of claim 1, wherein several of the experimental procedures for one or more of the analyses are automated.
- 27. The method of claim 2, wherein the first and the second proteins are at least about 80% pure by weight.
- 28. The method of claim 1, wherein either of the crystallized first or second proteins diffracts X-rays to a resolution of about 3.5 Å or better.
- 29. The method of claim 1, further comprising subjecting the first and second proteins to proteolytic digestion prior to the analysis by mass spectrometry.
- 30. The method of claim 2, wherein the NMR spectroscopic analysis is used to determine information about the three dimensional structure, the conformational state, the aggregation level, or the state of unfolding of the protein.
- 31. The method of claim 2, wherein the X-ray diffraction is used to determine the three dimensional structure of the first and second proteins.
- 32. The method of claim 1, wherein the first and the second protein comprise one or more labels.
- 33. A method for identifying a compound that binds preferably to a first protein relative to a second protein, comprising:
(a) providing purified first and second proteins; (b) subjecting each of the first and second protein to one or more of the following in any order:
(i) NMR spectroscopic analysis in the absence of the compound; (ii) NMR spectroscopic analysis in the presence of the compound; (iii) X-ray diffraction analysis of a crystal in the absence of the compound; (iv) X-ray diffraction analysis of a co-crystal of the first protein with the compound and optionally X-ray diffraction analysis of a co-crystal of the second protein with the compound; and (v) analysis by mass spectrometry; and (c) comparing the information from the analyses obtained in (b) for the first protein and the second protein, to thereby identify a compound that binds preferably to the first protein relative to the second protein.
- 34. A method for identifying a compound that binds to a first and to a second protein, comprising:
(a) providing purified first and second proteins; (b) subjecting each of the first and second protein to one or more of the following in any order:
(i) NMR spectroscopic analysis in the absence of the compound; (ii) NMR spectroscopic analysis in the presence of the compound; (iii) X-ray diffraction analysis of a crystal in the absence of the compound; (iv) X-ray diffraction analysis of a co-crystal with the compound; and (v) analysis by mass spectrometry; and (c) comparing the information from the analyses obtained in (b) for the first protein and the second protein, to thereby identify a compound that binds to the first and to the second protein.
- 35. The method of claim 33, wherein each of the first and second proteins is subjected in (b) to at least (i), (ii) and (v).
- 36. The method of claim 33, wherein one or the other of the first and second proteins is subjected in (b) to at least (i), (ii), (iii) and (v).
- 37. The method of claim 34, wherein each of the first and second proteins is subjected in (b) to at least (iv) and (v).
- 38. The method of claim 34, wherein each of the first and second proteins is subjected in (b) to at least (ii), (iii) and (v).
- 39. The method of claim 34, wherein each of the first and second proteins is subjected in (b) to (v), and one or the other of the first and second proteins is subjected in (b) to at least (i), (ii), and (iii).
- 40. The method of claim 34, wherein the second protein is a mutant of the first protein.
- 41. The method of claim 33, wherein the first and the second proteins are orthologs.
- 42. The method of claim 33, wherein the first and the second proteins are from different species.
- 43. The method of claim 42, wherein the species are microbial species.
- 44. The method of claim 42, wherein the species are mammalian species.
- 45. The method of claim 42, wherein one species is microbial and one species is mammalian.
- 46. The method of claim 33, wherein the first and the second proteins are involved in different biosynthetic pathways.
- 47. The method of claim 33, further comprising repeating (a)-(c) on a third protein and including the third protein in the comparison of (c).
- 48. The method of claim 34, further comprising repeating (a)-(c) on at least about 10% of the polypeptides in a defined proteome and including the polypeptides in the comparison of (c).
- 49. The method of claim 48, wherein the defined proteome comprises non-membrane proteins, membrane proteins, proteins in an organelle, or proteins in a pathway.
- 50. The method of claim 33, which further comprises characterizing the ability of the compound to interact with the first and second proteins using a computational method.
- 51. The method of claim 33, further comprising identifying the compound that binds to the first protein using structure guided drug design.
- 52. The method of claim 34, further comprising identifying the compound that binds to the first protein using structure guided drug design.
- 53. The method of claim 34, which further comprises characterizing the ability of the compound to interact with the first and second proteins using a computational method.
- 54. The method of claim 33, wherein the method comprises analysis of the first protein and second protein by mass spectrometry, and further comprising subjecting the first and second proteins to proteolytic digestion prior to the analysis by mass spectrometry.
- 55. The method of claim 29, further comprising identifying a compound that is expected to bind to the site on the first protein, wherein the proteolytic digestion of the first and second proteins is carried out in the presence of a compound.
- 56. The method of claim 54, wherein the proteolytic digestion of the first and second proteins is carried out in the presence of the compound.
- 57. The method of claim 34, which further comprises determining the ability of the compound to bind to the first and the second proteins using an activity assay, wherein a change in the activity of one of the proteins in the presence of the compound indicates that the compound modulates the activity of the protein.
- 58. The method of claim 34, wherein the compound is a polypeptide, nucleic acid, or small molecule.
- 59. The method of claim 58, wherein the compound is isolated from a naturally occurring source.
- 60. The method of claim 58, wherein the compound is a member of a library of compounds.
- 61. The method of claim 33, wherein the method comprises analysis of the first protein and second protein by mass spectrometry, and wherein the mass spectrometry analysis identifies the primary sequence of the protein, the type and location of post translational modifications of the protein, or regions of the protein which interact with another molecule.
- 62. The method of claim 34, wherein the method comprises analysis of the first protein and second protein by one of the two NMR analyses, and wherein the NMR spectroscopic analysis is used to determine information about the three dimensional structure, the conformational state, the aggregation level, or the state of unfolding of the protein.
- 63. The method of claim 33, wherein the method comprises analysis of the first protein and second protein by one of the two NMR analyses, and wherein the NMR spectroscopic analysis involves 1D NMR, 2D NMR or 15N/1H correlation spectroscopy.
- 64. The method of claim 34, wherein the method comprises analysis of the first protein and second protein by one of the two X-ray diffraction analyses, and wherein the X-ray diffraction is used to determine the three dimensional structures of the first and second protein optionally with the compound.
- 65. The method of claim 33, wherein the first and the second protein comprise one or more labels.
- 66. The method of claim 32, wherein the first and the second protein comprise an isotopic label.
- 67. The method of claim 65, wherein the first and the second protein comprise an isotopic label.
- 68. The method of claim 66, wherein the isotopic label is selected from the group consisting of potassium-40 (40K), carbon-14 (14C), tritium (3H), sulphur-35 (35S), phosphorus-32 (32P), technetium-99m (99mTc), thallium-201 (201Tl), gallium-67 (67Ga), indium-111 (111In), iodine-123 (123I), iodine-131 (131I), yttrium-90 (90Y), samarium-153 (153Sm), rhenium-186 (186Re), rhenium-188 (188Re), dysprosium-165 (165Dy), holmium-166 (166Ho), hydrogen-1 (1H), hydrogen-2 (2H), hydrogen-3 (3H), phosphorus-31 (31P), sodium-23 (23Na), nitrogen-14 (14N), nitrogen-15 (15N), carbon-13 (13C) and fluorine-19 (19F).
- 69. The method of claim 66, wherein the first and the second proteins comprise at least two different isotopic labels.
- 70. The method of claim 67, wherein the first and the second proteins comprise at least two different isotopic labels.
- 71. The method of claim 66, wherein the first and the second proteins comprise at least one 15N label and at least one 13C label.
- 72. The method of claim, wherein the first and the second proteins comprise a heavy atom label.
- 73. The method of claim 72, wherein the heavy atom label is selected from the group consisting of cobalt, selenium, krypton, bromine, strontium, molybdenum, ruthenium, rhodium, palladium, silver, cadmium, tin, iodine, xenon, barium, lanthanum, cerium, praseodymium, neodymium, samarium, europium, gadolinium, terbium, dysprosium, holmium, erbium, thulium, ytterbium, lutetium, tantalum, tungsten, rhenium, osmium, iridium, platinum, gold, mercury, thallium, lead, thorium and uranium.
- 74. The method of claim 32, wherein the first and the second proteins comprise at least one seleno-methionine.
- 75. The method of claim 67, wherein the first and the second proteins comprise at least one isotopic label and at least one heavy atom label.
- 76. A computer readable storage medium comprising structural data, wherein the data comprise the identity of a first and a second protein, the identity of a compound, and the three dimensional structure information of the first and the second proteins obtained using the method of claim 33.
- 77. A database comprising the identity of two or more proteins, the identity of a compound, and the three dimensional structure information of the two or more proteins obtained using the method of claim 34.
- 78. The method of claim 33, wherein the first and the second proteins are at least about 70% soluble as measured by light scattering.
- 79. The method of claim 34, wherein the first and the second proteins are fused to at least one heterologous polypeptide.
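By way of illustration only, the numerical thresholds recited in claims 5 to 7 (sequence identity of at least 80%, and a root mean square deviation of not more than 1.5 Å over atoms shared in common in the site) might be checked with a short routine such as the following Python sketch. The function names `percent_identity` and `rmsd` are hypothetical and do not appear in this application. The sketch assumes that the two sequences have already been aligned to equal length, that identity is counted over ungapped alignment columns (one of several possible conventions), and that the two coordinate sets list the same atoms in the same order and have already been superimposed; no superposition step is performed here.

```python
import math


def percent_identity(seq_a, seq_b):
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: identity is counted over columns where neither sequence
    has a gap. Other conventions use a different denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def rmsd(coords_a, coords_b):
    """Root mean square deviation (in the input units, e.g. angstroms).

    Assumption: both lists hold (x, y, z) tuples for the same atoms in the
    same order, and the structures are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up example data, not taken from any real protein.
first_seq = "ACDEFGHIKL"
second_seq = "ACDEFGHIKV"
first_site = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
second_site = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

identity = percent_identity(first_seq, second_seq)
deviation = rmsd(first_site, second_site)

identity_ok = identity >= 80.0   # threshold recited in claim 5
rmsd_ok = deviation <= 1.5       # threshold recited in claims 6 and 7
print(identity, deviation, identity_ok, rmsd_ok)
```

In practice the coordinates would be obtained from the X-ray or NMR analyses and superimposed before the deviation is computed; the plain lists of tuples above merely stand in for that data.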
RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional Application No. 60/275,216, filed Mar. 12, 2001, which is incorporated herein in its entirety.
Provisional Applications (1)
| Number | Date | Country |
| 60275216 | Mar 2001 | US |