Claims
- 1. A process for determination of three-dimensional macromolecular atomic structures of proteins comprising:
a. clustering a plurality of gene products into at least one family of homologous sequences; b. preparing, purifying and characterizing at least one protein encoded by a sequence selected from each of said families; c. crystallizing the purified proteins in parallel against crystallization screens; d. measuring diffraction data produced by the crystallized proteins using a multi-wavelength anomalous diffraction method; and e. analyzing the diffraction data by a multi-wavelength anomalous diffraction phasing method, building an atomic model and refining the model based on the diffraction data to determine a three-dimensional macromolecular atomic structure for each of the crystallized proteins.
- 2. The process of claim 1, further comprising the step of freezing the protein crystal.
- 3. The process of claim 1, wherein at least one family is pan-genomic.
- 4. The process of claim 1, further comprising storing said models in a database.
- 5. The process of claim 1, further comprising the step of cloning simultaneously, in parallel for each family, cDNAs from representative species so as to produce at least one expression vector for at least one expression system.
- 6. The process of claim 1, further comprising screening constructs for expression prior to preparing and purifying said proteins.
- 7. The process of claim 1, further comprising testing crystals for diffraction characteristics.
- 8. The process of claim 1, further comprising analyzing the refined model for structural information selected from the group consisting of functional motifs, surface characteristics, active sites, and macromolecular contact sites to store the active site and macromolecular contact site definition in a database.
- 9. The process of claim 8, further comprising storing the structural information in the database.
- 10. The process of claim 8, further comprising defining classes of compounds predicted to have binding potency using the active site information.
- 11. The process of claim 10, further comprising storing the definition in the database.
- 12. The process of claim 1, further comprising developing homology models of the atomic structures.
- 13. The process of claim 12, further comprising storing the homology models in a database.
- 14. A process for determination of three-dimensional macromolecular atomic structures of proteins comprising:
a. organizing systematically known structural information into a database so as to cluster known gene products into at least one family of homologous sequences; b. preparing, purifying and characterizing proteins encoded by sequences selected from said families; c. crystallizing the purified proteins in parallel against crystallization screens; d. measuring diffraction data produced by the crystallized proteins using a multi-wavelength anomalous diffraction method; e. analyzing the diffraction data by a multi-wavelength anomalous diffraction phasing method, building an atomic model and refining the model based on the diffraction data to determine a three-dimensional macromolecular atomic structure of the crystallized proteins; and f. updating the database with the atomic structure information.
- 15. The process of claim 14, further comprising the step of freezing the protein crystal.
- 16. The process of claim 14, wherein at least one family is pan-genomic.
- 17. The process of claim 14, further comprising the step of cloning simultaneously, in parallel for each family, at least one cDNA from a representative species so as to produce at least one expression vector for at least one expression system.
- 18. The process of claim 14, further comprising screening constructs for expression prior to preparing and purifying said proteins.
- 19. The process of claim 14, further comprising testing crystals for diffraction characteristics.
- 20. The process of claim 14, further comprising analyzing the refined model for structural information selected from the group consisting of functional motifs, surface characteristics, active sites, and macromolecular contact sites.
- 21. The process of claim 20, further comprising storing the structural information in the database.
- 22. The process of claim 20, further comprising defining classes of compounds predicted to have binding potency using the active site information.
- 23. The process of claim 22, further comprising storing the definition in the database.
- 24. The process of claim 14, further comprising developing homology models of the atomic structures.
- 25. The process of claim 24, further comprising storing the homology models in the database.
- 26. A database generated by a process comprising the steps of:
a. clustering a plurality of gene products into at least one family of homologous sequences; b. preparing, purifying and characterizing at least one protein encoded by a sequence selected from each of said families; c. crystallizing the purified proteins in parallel against crystallization screens; d. freezing at least one crystallized protein from each family and measuring diffraction data produced by the frozen crystallized proteins using a multi-wavelength anomalous diffraction method; and e. analyzing the diffraction data by a multi-wavelength anomalous diffraction phasing method, building an atomic model and refining the model based on the diffraction data to determine a three-dimensional macromolecular atomic structure for each of the crystallized proteins.
- 27. A database generated by a process comprising the steps of:
a. organizing systematically known structural information into a database so as to cluster known gene products into at least one family of homologous sequences; b. preparing, purifying and characterizing proteins encoded by sequences selected from said families; c. crystallizing the purified proteins in parallel against crystallization screens; d. freezing the crystallized proteins and measuring diffraction data produced by the frozen crystallized proteins using a multi-wavelength anomalous diffraction method; e. analyzing the diffraction data by a multi-wavelength anomalous diffraction phasing method, building an atomic model and refining the model based on the diffraction data to determine a three-dimensional macromolecular atomic structure of the crystallized proteins; and f. updating the database with the atomic structure information.
- 28. A database for storing structural genomics information, said information comprising:
a. transgenome sequence cluster information; b. three-dimensional structures of molecules, wherein said molecules are selected from the group consisting of family exemplars, specific targets, and homology models; and c. functional annotations selected from the group consisting of surface descriptions, conservation patterns, active sites, and macromolecule binding epitopes.
- 29. A system for determining a plurality of three-dimensional atomic structures, each of which is associated with a corresponding target protein, comprising:
a. a genomics database storing protein sequence and structure information; b. a bioinformatics tool adapted to use the information stored in the database to identify proteins belonging to at least one family having corresponding homologous sequences; c. a protein synthesis apparatus for synthesizing the identified proteins; d. a screening apparatus for selecting from the synthesized proteins a plurality of target proteins; e. a protein processing apparatus for purifying each of the target proteins; f. a crystallization apparatus for crystallizing each of the target proteins; and g. an X-ray crystallography apparatus for performing high throughput crystallography on the crystallized proteins, wherein the X-ray crystallography apparatus measures diffraction data for the crystallized proteins, analyzes the diffraction data, and builds three-dimensional atomic structures of the crystallized proteins, and stores the structures of the target proteins in the database.
- 30. The system of claim 29, further comprising a cryoprotection apparatus for freezing the target protein crystals.
- 31. The system of claim 29, further comprising a structure extraction apparatus for analyzing the atomic structures to determine functional motifs and surface characteristics to define active sites or macromolecular contact sites.
- 32. The system of claim 31, further comprising an apparatus to store the active site and macromolecular contact site definitions in the database.
- 33. The system of claim 29, further comprising a homology binding tool for developing a homology model for one or more of the atomic structures.
- 34. The system of claim 33, further comprising an apparatus to store the homology models in the database.
- 35. The system of claim 29, wherein said proteins are synthesized using a cloned DNA expression system.
- 36. A database for storing three-dimensional atomic structures for a plurality of target proteins, wherein the database is created in accordance with a process comprising the steps of:
a. storing protein sequence and structure information in the database; b. using the stored information to identify proteins belonging to at least one family having corresponding homologous sequences; c. selecting from the identified proteins a plurality of target proteins representative of the at least one family; d. synthesizing the plurality of target proteins; e. purifying the synthesized proteins; f. crystallizing the purified proteins; g. measuring diffraction data for the crystallized proteins; h. building three-dimensional atomic structures of the crystallized proteins based on the diffraction data; and i. storing the structures of the target proteins in the database.
- 37. The database of claim 36, wherein said process further comprises the step of freezing the crystallized proteins.
- 38. The database of claim 36, wherein said process further comprises the step of analyzing the atomic structures to determine functional motifs and surface characteristics to define active sites and macromolecular contact sites to store the active site and macromolecular contact site definitions in the database.
- 39. The database of claim 36, wherein said process further comprises the step of developing a homology model for one or more of the target proteins to store the homology model in the database.
- 40. The database of claim 36, wherein one or more protein is synthesized using a cloned DNA expression system.
- 41. A process for determination of three-dimensional macromolecular atomic structures of proteins comprising:
a. organizing structural information for a plurality of proteins in a database; b. clustering the plurality of proteins into at least one family of homologous sequences; c. for at least one family, preparing, purifying and characterizing corresponding proteins; d. crystallizing the purified proteins against crystallization screens so as to produce a plurality of purified protein crystals; e. analyzing diffraction data obtained from the crystals by a multi-wavelength anomalous diffraction phasing method, building an atomic model and refining the model based on the diffraction data to determine a three-dimensional macromolecular atomic structure for at least one of the crystallized proteins; and f. updating the database with the additional structural information.
- 42. The process of claim 41, further comprising the step of testing the purified protein crystals for diffraction characteristics.
- 43. The process of claim 41, further comprising the step of freezing the crystals.
- 44. The process of claim 41, further comprising the step of analyzing the atomic structures to determine functional motifs and surface characteristics to define active sites and macromolecular contact sites.
- 45. The process of claim 44, further comprising the step of storing the active site and macromolecular contact site definitions in the database.
- 46. The process of claim 41, further comprising the step of developing a homology model for one or more of the target proteins.
- 47. The process of claim 46, further comprising the step of storing the homology model in the database.
- 48. The process of claim 41, wherein one or more protein is synthesized using a cloned DNA expression system.
- 49. The process of claim 41, wherein at least one cluster of proteins is pan-genomic.
- 50. A database for storing three-dimensional macromolecular atomic structures of proteins, the database being generated by a process comprising the steps of:
a. organizing structural information for a plurality of proteins in a database; b. clustering the plurality of proteins into at least one family of homologous sequences; c. for at least one family, preparing, purifying and characterizing corresponding proteins; d. crystallizing the purified proteins against crystallization screens so as to produce a plurality of purified protein crystals; e. analyzing diffraction data obtained from the crystals by a multi-wavelength anomalous diffraction phasing method, building an atomic model and refining the model based on the diffraction data to determine a three-dimensional macromolecular atomic structure for at least one of the crystallized proteins; and f. updating the database with the additional structural information.
- 51. The database of claim 50, wherein said process further comprises testing the purified protein crystals for diffraction characteristics.
- 52. The database of claim 50, wherein said process further comprises freezing the protein crystals.
- 53. The database of claim 50, wherein said process further comprises the step of analyzing the atomic structures to determine functional motifs and surface characteristics to define active sites and macromolecular contact sites to store the active site and macromolecular contact site definitions in the database.
- 54. The database of claim 50, wherein said process further comprises the step of developing a homology model for one or more of the target proteins to store the homology model in the database.
- 55. The database of claim 50, wherein at least one protein is synthesized using a cloned DNA expression system.
- 56. The database of claim 50, wherein at least one family is pan-genomic.
- 57. A system for determining a plurality of three-dimensional atomic structures, each of which is associated with a corresponding protein, comprising:
a. a genomics database storing protein sequence information; b. a bioinformatics tool adapted to use the information stored in the database to identify and cluster proteins into at least one family of homologous sequences and to select proteins to synthesize; c. a protein synthesis apparatus for synthesizing the selected proteins; d. a protein processing apparatus for purifying the synthesized proteins; e. a crystallization apparatus for crystallizing the purified proteins; and f. an X-ray crystallography apparatus for performing high throughput crystallography on the crystallized proteins, wherein the X-ray crystallography apparatus measures diffraction data for the crystallized proteins, analyzes the diffraction data, and builds three-dimensional atomic structures of the selected proteins.
- 58. The system of claim 57, further comprising a cryoprotection apparatus for freezing the protein crystals.
- 59. The system of claim 57, wherein at least one protein is synthesized using a cloned DNA expression system.
- 60. The system of claim 57, further comprising a screening apparatus for selecting synthesized proteins for purification.
- 61. The system of claim 57 wherein said three-dimensional atomic structures are stored in said database.
- 62. The system of claim 57, further comprising a structure extraction apparatus for analyzing the atomic structures to determine functional motifs and surface characteristics to define active sites and macromolecular contact sites.
- 63. The system of claim 62, further comprising an apparatus to store the active site and macromolecular contact site definitions in the database.
- 64. The system of claim 57, further comprising a homology model building tool for developing a homology model for one or more of the atomic structures.
- 65. The system of claim 64, further comprising an apparatus to store the homology models in the database.
- 66. A system for determining a plurality of three-dimensional atomic structures, each of which is associated with a corresponding protein, comprising:
a. a genomics database storing protein sequence, structure, and functional information; b. a bioinformatics tool adapted to use the information stored in the database to cluster proteins into at least one family of homologous sequences and to select proteins to synthesize; c. a protein synthesis apparatus for synthesizing the selected proteins; d. a protein processing apparatus for purifying the synthesized proteins; e. a crystallization apparatus for crystallizing the purified proteins; and f. an X-ray crystallography apparatus for performing high throughput crystallography on the crystallized proteins, wherein the X-ray crystallography apparatus measures diffraction data for the proteins, analyzes the diffraction data, builds three-dimensional atomic structures of the crystallized proteins, and stores the structures of the selected proteins in the database.
- 67. The system of claim 66, further comprising a cryoprotection apparatus for freezing the protein crystals.
- 68. The system of claim 66, further comprising a screening apparatus for selecting synthesized proteins for purification.
- 69. The system of claim 66, further comprising a structure extraction apparatus for analyzing the structures to determine functional motifs and surface characteristics to define active sites and macromolecular contact sites.
- 70. The system of claim 69, further comprising an apparatus to store the active site and macromolecular contact site definition in the database.
- 71. The system of claim 66, further comprising a homology model building tool for developing a homology model for one or more of the atomic structures.
- 72. The system of claim 71, further comprising an apparatus to store the homology models in the database.
- 73. A process for determining a plurality of three-dimensional atomic structures, each of which is associated with a corresponding target protein, comprising:
a. accessing a database storing protein sequence, structure, and functional information; b. identifying proteins belonging to a plurality of at least one family of homologous sequences; c. selecting target proteins to synthesize from said families; d. synthesizing the selected target proteins; e. purifying the synthesized target proteins; f. crystallizing the purified target proteins; g. using an X-ray crystallography apparatus to perform high throughput crystallography on the crystallized proteins, wherein the X-ray crystallography apparatus measures diffraction data for the crystallized proteins, analyzes the diffraction data, and builds three-dimensional atomic structures of the target proteins; and h. storing the structures in the database.
- 74. The process of claim 73, further comprising the step of freezing the protein crystals.
- 75. The process of claim 73, wherein at least one protein is synthesized using a cloned DNA expression system.
- 76. The process of claim 73, further comprising screening selected synthesized proteins for purification.
- 77. The process of claim 73, further comprising analyzing the atomic structures to determine functional motifs and surface characteristics to define active sites and macromolecular contact sites.
- 78. The process of claim 77, further comprising storing the active site and macromolecular contact site definition in the database.
- 79. The process of claim 73, further comprising developing a homology model for one or more of the atomic structures.
- 80. The process of claim 79, further comprising storing the homology model in the database.
- 81. A database generated by a process comprising the steps of:
a. identifying proteins belonging to at least one family of homologous sequences; b. selecting target proteins to synthesize from said families; c. synthesizing the selected target proteins; d. purifying the synthesized target proteins; e. crystallizing the purified target proteins; f. using an X-ray crystallography apparatus to perform high throughput crystallography on the crystallized proteins, wherein the X-ray crystallography apparatus measures diffraction data for the crystallized proteins, analyzes the diffraction data, and builds three-dimensional atomic structures of the target proteins; and g. storing the atomic structures in the database.
- 82. The database of claim 81, wherein said process further comprises freezing the protein crystals.
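The clustering step recited in several of the independent claims above (grouping gene products into families of homologous sequences) can be illustrated with a minimal sketch. The claims do not specify a clustering method, so everything here is an assumption made for illustration: the function names `identity` and `cluster_families`, the use of pre-aligned equal-length sequences, the fractional-identity score, the 0.5 threshold, and the single-linkage grouping are all hypothetical and are not taken from the source.

```python
from __future__ import annotations

from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned sequences.

    Assumption: both sequences are already aligned to the same length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_families(seqs: dict[str, str], threshold: float = 0.5) -> list[set[str]]:
    """Group sequence names into families by single-linkage clustering.

    Two sequences are linked when their identity meets the (hypothetical)
    threshold; families are the connected components of those links.
    """
    parent = {name: name for name in seqs}

    def find(n: str) -> str:
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    families: dict[str, set[str]] = {}
    for name in seqs:
        families.setdefault(find(name), set()).add(name)
    return list(families.values())


# Toy input: made-up sequences, not real gene products.
example = {"p1": "MKTAYIAK", "p2": "MKTAYIAR", "p3": "GGGSSSPP"}
families = cluster_families(example)
```

With the toy input, `p1` and `p2` differ at one of eight positions and fall into one family, while `p3` shares no positions with either and forms its own. A production pipeline would more plausibly rely on an alignment-based similarity search than on position-wise identity; the sketch only shows the shape of the grouping.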
Parent Case Info
[0001] This application is a Rule 1.53(b) continuation of U.S. Ser. No. 09/911,100 filed Jul. 20, 2001, which is a continuation of PCT International Application No. PCT/US00/01600, filed Jan. 21, 2000, claiming priority of U.S. Ser. No. 09/235,986, filed Jan. 22, 1999.
Continuations (2)
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 09911100 | Jul 2001 | US |
| Child | 10242196 | Sep 2002 | US |
| Parent | PCT/US00/01600 | Jan 2000 | US |
| Child | 09911100 | Jul 2001 | US |