Claims
- 1. A method for creating a Group Signature for a plurality of compounds having related activities, said method comprising:
a) providing a plurality of expression datasets, each expression dataset comprising the expression response of a first plurality of genes in a subject cell following exposure to a compound, wherein said plurality of expression datasets comprises an expression dataset for each of a plurality of test compounds having a similar or identical biological activity, and an expression dataset for each of a plurality of control compounds that lack the biological activity of the test compounds; b) deriving a discrimination metric that distinguishes the plurality of test compounds from the control compounds based on gene expression to provide a distinctive gene set; and c) selecting a second plurality of genes from said distinctive gene set to provide a Group Signature for said plurality of test compounds.
- 2. The method of claim 1, wherein step b) comprises:
i) ordering the expression datasets by Principal Component Analysis to provide a plurality of principal components; ii) identifying the Principal Component that distinguishes the plurality of test compounds from the plurality of control compounds to the greatest degree, to provide a test Principal Component; and iii) identifying the genes that distinguish the test Principal Component from the control compounds to the greatest degree to provide a distinctive gene set.
- 3. The method of claim 2, wherein said distinctive gene set is selected by identifying the genes that have the greatest eigenvalues in the test Principal Component.
- 4. The method of claim 1, wherein said discrimination metric comprises selecting a set of genes identified using the Golub distinction metric.
- 5. The method of claim 1, wherein said plurality of genes comprises at least 1,000 genes.
- 6. The method of claim 5, wherein said plurality of genes comprises at least 4,000 genes.
- 7. The method of claim 6, wherein said plurality of genes comprises at least 10,000 genes.
- 8. The method of claim 1, wherein the number of control compounds is less than the number of test compounds.
- 9. The method of claim 1, wherein said distinctive gene set comprises only upregulated genes.
- 10. The method of claim 2, wherein said distinctive gene set is selected by identifying the upregulated genes that have the greatest eigenvalues in the test Principal Component.
- 11. The method of claim 1, further comprising:
d) storing said expression datasets in a database; and e) repeating steps a)-d) with a different set of test compounds.
- 12. The method of claim 1, further comprising:
d) contacting a subject cell expressing a plurality of proteins with each test compound; and e) measuring the change in amount of each protein resulting from said contact to provide a protein response dataset for each compound.
- 13. The method of claim 12, further comprising:
f) storing said expression datasets and said protein response datasets in a database; and g) repeating steps a)-f) with a different set of test compounds.
- 14. The method of claim 1, wherein said Group Signature consists of one to 50 genes.
- 15. The method of claim 14, wherein said Group Signature consists of one to 25 genes.
- 16. The method of claim 15, wherein said Group Signature consists of no more than three genes.
- 17. The method of claim 1, wherein said Group Signature comprises at least three genes.
- 18. The method of claim 17, wherein said Group Signature comprises at least 5 genes.
- 19. The method of claim 18, wherein said Group Signature comprises at least 10 genes.
- 20. The method of claim 19, wherein said Group Signature comprises at least 15 genes.
- 21. A method for creating a Group Signature for a plurality of compounds having related activities, said method comprising:
a) providing a plurality of test compounds having a similar or identical biological activity, and a plurality of control compounds that lack the biological activity of the test compounds; b) contacting each compound with a subject cell; c) measuring the expression response of a first plurality of genes for each subject cell to provide an expression dataset for each compound; d) ordering the expression datasets by Principal Component Analysis to provide a plurality of principal components; e) identifying the Principal Component that distinguishes the plurality of test compounds from the plurality of control compounds to the greatest degree, to provide a test Principal Component; f) identifying the genes that distinguish the test Principal Component from the control compounds to the greatest degree to provide a distinctive gene set; and g) selecting a second plurality of genes from said distinctive gene set to provide a Group Signature for said plurality of test compounds.
- 22. The method of claim 21, wherein said compounds are contacted with the cell in vivo.
- 23. A method for creating a Drug Signature capable of distinguishing the activity of a selected drug compound from a plurality of compounds having related activities, said method comprising:
a) providing a plurality of expression datasets, each expression dataset comprising the expression response of a plurality of genes in a subject cell following exposure to a compound, wherein said plurality of expression datasets comprises an expression dataset for said selected drug compound and an expression dataset for each of a plurality of test compounds having a similar or identical biological activity; b) deriving a discrimination metric that distinguishes the selected drug compound from the plurality of test compounds based on gene expression to provide a distinctive gene set; and c) selecting a plurality of genes from said distinctive gene set to provide a Drug Signature for said selected drug compound.
- 24. The method of claim 23, wherein step b) comprises:
i) ordering the expression datasets by Principal Component Analysis to provide a plurality of principal components; ii) identifying the Principal Component that distinguishes the plurality of test compounds from the plurality of control compounds to the greatest degree, to provide a test Principal Component; and iii) identifying the genes that distinguish the test Principal Component from the control compounds to the greatest degree to provide a distinctive gene set.
- 25. The method of claim 24, wherein said distinctive gene set is selected by identifying the genes that have the greatest eigenvalues in the test Principal Component.
- 26. The method of claim 23, wherein said discrimination metric comprises selecting a set of genes identified using the Golub distinction metric.
- 27. The method of claim 23, wherein said Drug Signature comprises at least three genes.
- 28. The method of claim 27, wherein said Drug Signature comprises at least five genes.
- 29. The method of claim 28, wherein said Drug Signature comprises at least ten genes.
- 30. The method of claim 23, wherein said Drug Signature consists of one to fifty genes.
- 31. The method of claim 30, wherein said Drug Signature consists of one to 25 genes.
- 32. The method of claim 31, wherein said Drug Signature consists of one to three genes.
- 33. The method of claim 23, wherein said Drug Signature comprises only upregulated genes.
- 34. A method for creating a Drug Signature capable of distinguishing the activity of a selected drug compound from a plurality of compounds having related activities, said method comprising:
a) providing said selected drug compound and a plurality of test compounds having a similar or identical primary biological activity; b) contacting each compound with a subject cell; c) measuring the expression response of a first plurality of genes for each subject cell to provide an expression dataset for each compound; d) ordering the expression datasets by Principal Component Analysis to provide a plurality of principal components; e) identifying the Principal Component that distinguishes the selected drug compound from said plurality of test compounds to the greatest degree, to provide a distinguishing Principal Component; f) identifying the genes that contribute to the distinguishing Principal Component to the greatest degree to provide a distinguishing gene set; and g) selecting a second plurality of genes from said distinguishing gene set to provide a Drug Signature for said selected drug compound.
- 35. The method of claim 34, wherein said compounds are contacted with the cell in vivo.
- 36. A Group Signature database, comprising:
a plurality of Group Signature records, wherein each Group Signature record comprises
indicia of at least one compound, wherein all compounds within a Group exhibit a similar or identical primary bioactivity; indicia of a set of genes, wherein the expression of said genes is modulated in response to exposure to a compound having a primary bioactivity similar or identical to the primary bioactivity of a compound indicated in the Group record, and wherein said set of genes distinguishes said Group from all other Groups within said Group Signature database.
- 37. The Group Signature database of claim 36, wherein said plurality of Group Signature records comprises at least 10 Group Signature records.
- 38. The Group Signature database of claim 37, wherein said plurality of Group Signature records comprises at least 25 Group Signature records.
- 39. The Group Signature database of claim 36, wherein said set of genes for each Group Signature record comprises at least 5 genes.
- 40. The Group Signature database of claim 39, wherein said set of genes for each Group Signature record comprises at least 10 genes.
- 41. The Group Signature database of claim 36, wherein said set of genes for each Group Signature record consists of one to 50 genes.
- 42. The Group Signature database of claim 41, wherein said set of genes for each Group Signature record consists of one to 25 genes.
- 43. The Group Signature database of claim 36, wherein said database further comprises stress records, wherein each stress record comprises:
indicia of a stress; and indicia of a set of genes, wherein expression of said genes is modulated in response to said stress, and wherein said set of genes distinguishes said stress from all other stresses and Groups within said Group Signature database.
- 44. The Group Signature database of claim 43, wherein said stress is selected from the group consisting of elevated temperature, depressed temperature, elevated oxygen pressure, depressed oxygen pressure, elevated CO2 pressure, depressed CO2 pressure, starvation, dehydration, overcrowding, sleep deprivation, pain, infection, exposure to toxins, and light deprivation.
- 45. A Drug Signature database, comprising:
a plurality of Drug Signature records, wherein each Drug Signature record comprises
indicia of one compound; and indicia of a set of genes, wherein expression of said genes is modulated in response to exposure to said compound, and wherein said set of genes distinguishes said compound from all other compounds within said Drug Signature database.
- 46. The Drug Signature database of claim 45, wherein said plurality of Drug Signature records comprises at least 10 records.
- 47. The Drug Signature database of claim 46, wherein said plurality of Drug Signature records comprises at least 50 records.
- 48. The Drug Signature database of claim 45, wherein said set of genes for each Drug Signature record comprises at least 5 genes.
- 49. The Drug Signature database of claim 48, wherein said set of genes for each Drug Signature record comprises at least 10 genes.
- 50. The Drug Signature database of claim 45, wherein said set of genes for each Drug Signature record consists of one to 50 genes.
- 51. The Drug Signature database of claim 50, wherein said set of genes for each Drug Signature record consists of one to 25 genes.
- 52. A method for determining the activity of a drug candidate, said method comprising:
a) providing a Group Signature database, said Group Signature database comprising a plurality of Group Signature records, wherein each Group Signature record comprises indicia of at least one compound, wherein all compounds within a Group exhibit a similar or identical primary bioactivity; and indicia of a set of genes, wherein expression of said genes is modulated in response to exposure to a compound having a primary bioactivity similar or identical to the primary bioactivity of a compound indicated in the Group record, and wherein said set of genes distinguishes said Group from all other Groups within said Group Signature database; b) providing a drug candidate expression dataset for said drug candidate, said drug candidate expression dataset comprising the expression response of a plurality of genes in a subject cell following exposure to said drug candidate; c) comparing said drug candidate expression dataset with each Group Signature; d) selecting the Group Signature most similar to said drug candidate expression dataset; e) identifying the activity of the drug candidate to be the primary bioactivity exhibited by the compounds within the most similar Group Signature.
- 53. The method of claim 52, wherein the similarity of the drug candidate expression dataset to each Group Signature is measured by a similarity score of S = Π_x RelRk_x.
- 54. The method of claim 52, wherein said drug candidate expression dataset consists of one to 200 genes.
- 55. The method of claim 54, wherein said Group Signature database further comprises bioassay data for each compound, and said drug candidate expression dataset further comprises bioassay data for said drug candidate.
- 56. A method for designing a Group Signature reagent, comprising:
a) providing a plurality of expression datasets, each expression dataset comprising the expression response of a first plurality of genes in a subject cell following exposure to a compound, wherein said plurality of expression datasets comprises an expression dataset for each of a plurality of test compounds having a similar or identical biological activity, and an expression dataset for each of a plurality of control compounds that lack the biological activity of the test compounds; b) deriving a discrimination metric that distinguishes the plurality of test compounds from the control compounds based on gene expression to provide a distinctive gene set; c) selecting a second plurality of genes from said distinctive gene set to provide a Group Signature for said plurality of test compounds; and d) providing a set of polynucleotide probes capable of hybridizing specifically to one or more sequences of said second plurality of genes in said Group Signature to provide a Group Signature probe set.
- 57. The method of claim 56, wherein step b) comprises:
i) ordering the expression datasets by Principal Component Analysis to provide a plurality of principal components; ii) identifying the Principal Component that distinguishes the plurality of test compounds from the plurality of control compounds to the greatest degree, to provide a test Principal Component; and iii) identifying the genes that distinguish the test Principal Component from the control compounds to the greatest degree to provide a distinctive gene set.
- 58. The method of claim 57, wherein said distinctive gene set is selected by identifying the genes that have the greatest eigenvalues in the test Principal Component.
- 59. The method of claim 56, wherein said discrimination metric comprises selecting a set of genes identified using the Golub distinction metric.
- 60. The method of claim 56, further comprising:
e) repeating steps a)-d) to generate a plurality of different Group Signatures for unrelated compounds.
- 61. The method of claim 60, further comprising:
f) attaching said Group Signature probe set to a solid support in a defined location to form a Group Signature Array.
- 62. The method of claim 61, wherein said Group Signature Array comprises at least 100 Group Signature probe sets.
- 63. The method of claim 62, wherein said Group Signature Array comprises at least 500 Group Signature probe sets.
- 64. The method of claim 63, wherein said Group Signature Array comprises at least 1,000 Group Signature probe sets.
- 65. A Group Signature Array prepared in accordance with the method of claim 61.
- 66. A kit comprising a suitable container means, a Group Signature Array of claim 65, and instructions for using said kit.
- 67. A method for designing a Drug Signature reagent, comprising:
a) providing a plurality of expression datasets, each expression dataset comprising the expression response of a plurality of genes in a subject cell following exposure to a compound, wherein said plurality of expression datasets comprises an expression dataset for said selected drug compound and an expression dataset for each of a plurality of test compounds having a similar or identical biological activity; b) deriving a discrimination metric that distinguishes the plurality of test compounds from the control compounds based on gene expression to provide a distinctive gene set; c) selecting a plurality of genes from said distinctive gene set to provide a Drug Signature for said selected drug compound; and d) providing a set of polynucleotide probes capable of hybridizing specifically to the sequences of said genes in said Drug Signature to form a Drug Signature probe set.
- 68. The method of claim 67, wherein step b) comprises:
i) ordering the expression datasets by Principal Component Analysis to provide a plurality of principal components; ii) identifying the Principal Component that distinguishes the plurality of test compounds from the plurality of control compounds to the greatest degree, to provide a test Principal Component; and iii) identifying the genes that distinguish the test Principal Component from the control compounds to the greatest degree to provide a distinctive gene set.
- 69. The method of claim 68, wherein said distinctive gene set is selected by identifying the genes that have the greatest eigenvalues in the test Principal Component.
- 70. The method of claim 67, wherein said discrimination metric comprises selecting a set of genes identified using the Golub distinction metric.
- 71. The method of claim 67, further comprising:
e) repeating steps a)-d) to generate a plurality of different Drug Signatures for unrelated compounds.
- 72. The method of claim 67, further comprising:
e) attaching said Drug Signature probe set to a solid support in a defined location to form a Drug Signature Array.
- 73. The method of claim 67, wherein said Drug Signature Array comprises at least 100 Drug Signature probe sets.
- 74. The method of claim 73, wherein said Drug Signature Array comprises at least 500 Drug Signature probe sets.
- 75. The method of claim 74, wherein said Drug Signature Array comprises at least 1,000 Drug Signature probe sets.
- 76. The method of claim 75, wherein said Drug Signature Array comprises at least 10,000 Drug Signature probe sets.
- 77. A Drug Signature Array prepared in accordance with the method of claim 72.
- 78. A kit comprising a suitable container means, a Drug Signature Array of claim 77, and instructions for using said kit.
- 79. A method for determining the activity of a drug candidate, said method comprising:
a) providing a Group Signature Array, said Group Signature Array comprising a solid support having affixed thereto a plurality of Group Signature probe sets, wherein each Group Signature probe set comprises a set of polynucleotide probes capable of hybridizing specifically to the sequences of the genes in each Group Signature, wherein said Group Signatures are obtained by:
i) providing a plurality of expression datasets, each expression dataset comprising the expression response of a plurality of genes in a subject cell following exposure to a compound, wherein said plurality of expression datasets comprises an expression dataset for each of a plurality of test compounds having a similar or identical biological activity, and an expression dataset for each of a plurality of control compounds that lack the biological activity of the test compounds; ii) deriving a discrimination metric that distinguishes the plurality of test compounds from the control compounds based on gene expression to provide a distinctive gene set; iii) selecting a plurality of genes from said distinctive gene set to provide a Group Signature for said plurality of test compounds; and iv) repeating steps i)-iii) for each Group Signature; b) contacting a subject cell with said drug candidate; c) extracting mRNA from said subject cell; d) reverse-transcribing said mRNA to cDNA; e) contacting said Group Signature Array with said cDNA; and f) determining whether any Group Signature probe set exhibits increased binding of cDNA.
- 80. A method for screening a library of compounds, wherein the library comprises a plurality of drug candidates, comprising:
a) determining the activity of each drug candidate according to the method of claim 79; and b) selecting a drug candidate, wherein the Group Signature probe set exhibits increased binding to the cDNA that results from contacting the subject cell with said drug candidate.
- 81. A polynucleotide probe set for detecting fibrate-like activity, the set comprising:
a plurality of polynucleotides capable of hybridizing specifically to genes selected from the group consisting of Rat for cytochrome P452, Rat cytochrome P450, Rat cytochrome P450-LA-omega (lauric acid omega-hydroxylase), Rat Sulfotransferase K2, Rat cytochrome P450-LA-omega (lauric acid omega-hydroxylase), Rat Cyp4a locus, encoding cytochrome P450 (IVA3), Rat cytochrome P450, Rat mitochondrial 3-2-trans-enoyl-CoA isomerase, Rat carnitine octanoyltransferase, Rat Wistar peroxisomal enoyl hydratase-like protein (PXEL), Rat mitochondrial long-chain 3-ketoacyl-CoA thiolase β-subunit of mitochondrial trifunctional protein, Rat liver fatty acid binding protein (FABP), Rat pyruvate dehydrogenase kinase isoenzyme 4 (PDK4), Rat mitochondrial isoform of cytochrome b5, Hypothetical protein Rv3224, Rat peroxisomal enoyl-CoA: hydrotase-3-hydroxyacyl-CoA bifunctional enzyme, Rat peroxisomal membrane protein Pmp26p (Peroxin-11), Rat acyl-CoA hydrolase, Rat acyl-CoA oxidase, Rat acyl-CoA hydrolase, Rat 2,4-dienoyl-CoA reductase precursor, Rat mitochondrial 3-hydroxy-3-methylglutaryl-CoA synthase, Rat peroxisomal enoyl-CoA: hydrotase-3-hydroxyacyl-CoA bifunctional enzyme, and Mouse peroxisomal long chain acyl-CoA thioesterase Ib (Pte1b).
- 82. The polynucleotide probe set of claim 81, wherein said plurality of polynucleotides are capable of hybridizing specifically to at least 3 genes.
- 83. The polynucleotide probe set of claim 82, wherein said plurality of polynucleotides are capable of hybridizing specifically to at least 5 genes.
- 84. The polynucleotide probe set of claim 83, wherein said plurality of polynucleotides are capable of hybridizing specifically to at least 10 genes.
- 85. A kit comprising a suitable container means, a polynucleotide probe set of claim 81, and instructions for using said kit.
- 86. A polynucleotide probe set for detecting gemfibrozil-like activity, the set comprising:
a plurality of polynucleotides capable of hybridizing specifically to genes selected from the group consisting of Rat fatty acid synthase, Rat cholesterol 7α-hydroxylase, Mouse acetyl-CoA synthetase, Mouse Vanin-1, Rat kidney-specific protein (KS), Rat 2,3-oxidosqualene:lanosterol cyclase, Rat aldehyde dehydrogenase, and Rat thymosin β-10.
- 87. The polynucleotide probe set of claim 86, wherein said plurality of polynucleotides are capable of hybridizing specifically to at least 3 genes.
- 88. The polynucleotide probe set of claim 87, wherein said plurality of polynucleotides are capable of hybridizing specifically to at least 5 genes.
- 89. The polynucleotide probe set of claim 88, wherein said plurality of polynucleotides are capable of hybridizing specifically to at least 10 genes.
- 90. A kit comprising a suitable container means, a polynucleotide probe set of claim 86, and instructions for using said kit.
- 91. A method for screening drug candidates for fibrate activity, the method comprising:
a) contacting a subject cell with a drug candidate; b) extracting mRNA from said subject cell; c) reverse-transcribing said mRNA into cDNA; d) hybridizing said cDNA to a fibrate signature probe set, said probe set comprising a plurality of polynucleotides capable of hybridizing specifically to a fibrate signature gene, wherein said fibrate signature genes are selected from the group consisting of Rat for cytochrome P452, Rat cytochrome P450, Rat cytochrome P450-LA-omega (lauric acid omega-hydroxylase), Rat Sulfotransferase K2, Rat cytochrome P450-LA-omega (lauric acid omega-hydroxylase), Rat Cyp4a locus, encoding cytochrome P450 (IVA3), Rat cytochrome P450, Rat mitochondrial 3-2-trans-enoyl-CoA isomerase, Rat carnitine octanoyltransferase, Rat Wistar peroxisomal enoyl hydratase-like protein (PXEL), Rat mitochondrial long-chain 3-ketoacyl-CoA thiolase β-subunit of mitochondrial trifunctional protein, Rat liver fatty acid binding protein (FABP), Rat pyruvate dehydrogenase kinase isoenzyme 4 (PDK4), Rat mitochondrial isoform of cytochrome b5, Hypothetical protein Rv3224, Rat peroxisomal enoyl-CoA: hydrotase-3-hydroxyacyl-CoA bifunctional enzyme, Rat peroxisomal membrane protein Pmp26p (Peroxin-11), Rat acyl-CoA hydrolase, Rat acyl-CoA oxidase, Rat acyl-CoA hydrolase, Rat 2,4-dienoyl-CoA reductase precursor, Rat mitochondrial 3-hydroxy-3-methylglutaryl-CoA synthase, Rat peroxisomal enoyl-CoA: hydrotase-3-hydroxyacyl-CoA bifunctional enzyme, and Mouse peroxisomal long chain acyl-CoA thioesterase Ib (Pte1b); and e) determining if said subject cell exhibits increased expression of a fibrate signature gene.
- 92. A database product, comprising:
a computer-readable medium, said medium storing thereon a Group Signature database, said database comprising a plurality of Group Signature records, wherein each Group Signature record comprises indicia of at least one compound, wherein all compounds within a Group exhibit a similar or identical primary bioactivity; and indicia of a set of genes, wherein expression of said genes is modulated in response to exposure to a compound having a primary bioactivity similar or identical to the primary bioactivity of a compound indicated in the Group record, and wherein said set of genes distinguishes said Group from all other Groups within said Group Signature database.
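By way of illustration only, step b) of claim 1, as narrowed by claims 2 to 4, might be sketched in Python as follows. The sketch scores genes in two ways: with the signal-to-noise statistic commonly referred to as the Golub distinction metric, and with the per-gene coefficients (loadings) of the principal component that best separates test compounds from control compounds. Everything named here is an assumption made for the example and is not taken from the claims: the function names, the criterion used to pick the test Principal Component, the reading of "greatest eigenvalues" as the largest per-gene loadings, and the small expression matrix.

```python
import numpy as np

def golub_scores(expr, is_test):
    """Per-gene signal-to-noise: (mean_test - mean_ctrl) / (sd_test + sd_ctrl)."""
    expr = np.asarray(expr, dtype=float)
    is_test = np.asarray(is_test, dtype=bool)
    test, ctrl = expr[is_test], expr[~is_test]
    # Sample standard deviations; the tiny constant only guards against 0/0.
    noise = test.std(axis=0, ddof=1) + ctrl.std(axis=0, ddof=1)
    return (test.mean(axis=0) - ctrl.mean(axis=0)) / (noise + 1e-12)

def pca_gene_weights(expr, is_test):
    """Return (index of the test Principal Component, per-gene loadings on it)."""
    expr = np.asarray(expr, dtype=float)
    is_test = np.asarray(is_test, dtype=bool)
    centered = expr - expr.mean(axis=0)
    # Rows of vt are principal axes in gene space; u * s are compound scores.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s
    gap = scores[is_test].mean(axis=0) - scores[~is_test].mean(axis=0)
    spread = scores[is_test].std(axis=0) + scores[~is_test].std(axis=0) + 1e-12
    # Assumed criterion: largest |gap| / spread, ignoring numerically empty components.
    separation = np.where(s > 1e-10 * s[0], np.abs(gap) / spread, 0.0)
    best = int(np.argmax(separation))
    # Flip the sign so that a positive loading means "higher in test compounds".
    return best, vt[best] * np.sign(gap[best])

def top_genes(weights, n_genes, upregulated_only=False):
    """Indices of the n_genes largest-magnitude weights (optionally positive only)."""
    weights = np.asarray(weights, dtype=float)
    if upregulated_only:
        candidates = np.flatnonzero(weights > 0)
    else:
        candidates = np.arange(weights.size)
    ranked = candidates[np.argsort(-np.abs(weights[candidates]))]
    return ranked[:n_genes].tolist()

# Hypothetical data: rows are compounds, columns are genes (e.g. log ratios).
expr = [
    [2.0, 1.8, 0.10, -0.10, 0.00],   # test compounds
    [2.2, 1.9, -0.10, 0.10, 0.05],
    [1.9, 2.1, 0.00, 0.00, -0.05],
    [2.1, 2.0, 0.05, -0.05, 0.10],
    [0.0, 0.1, 0.10, 0.00, -0.10],   # control compounds
    [0.1, -0.1, -0.10, 0.10, 0.00],
    [-0.1, 0.0, 0.00, -0.10, 0.10],
]
is_test = [True, True, True, True, False, False, False]

golub_signature = top_genes(golub_scores(expr, is_test), n_genes=2)
test_pc, loadings = pca_gene_weights(expr, is_test)
pca_signature = top_genes(loadings, n_genes=2, upregulated_only=True)
print(golub_signature, test_pc, pca_signature)
```

In this sketch each group needs at least two compounds so that the sample standard deviations are defined. Passing `upregulated_only=True` corresponds loosely to the restriction of claims 9 and 10, and `n_genes` plays the role of the size of the second plurality of genes selected in step c).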
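The record structure of claims 36 and 92 and the matching steps c) to e) of claim 52 can likewise be sketched, again under loudly stated assumptions. The claims do not define RelRk; this example assumes it denotes the relative rank of a signature gene within the drug candidate's expression dataset (rank divided by the number of measured genes, so the most strongly induced gene scores 1.0), and takes the similarity score of claim 53 to be the product of those values over the signature genes. The class name, field names, gene labels, and activity labels are hypothetical.

```python
from dataclasses import dataclass
from math import prod
from typing import Dict, Sequence, Tuple

@dataclass(frozen=True)
class GroupSignatureRecord:
    bioactivity: str             # primary bioactivity shared by the Group
    compounds: Tuple[str, ...]   # indicia of at least one compound
    genes: Tuple[str, ...]       # indicia of the signature gene set

def relative_ranks(expression: Dict[str, float]) -> Dict[str, float]:
    """Assumed meaning of RelRk: rank / n, with the most induced gene at 1.0."""
    ordered = sorted(expression, key=expression.get)
    return {gene: (i + 1) / len(ordered) for i, gene in enumerate(ordered)}

def similarity(record: GroupSignatureRecord, rel_rank: Dict[str, float]) -> float:
    """Product of the relative ranks of the signature genes.

    Raises KeyError if a signature gene was not measured for the candidate.
    """
    return prod(rel_rank[gene] for gene in record.genes)

def most_similar_group(records: Sequence[GroupSignatureRecord],
                       expression: Dict[str, float]) -> GroupSignatureRecord:
    """Steps c) and d): score every Group Signature and keep the best one."""
    rel_rank = relative_ranks(expression)
    return max(records, key=lambda record: similarity(record, rel_rank))

# Hypothetical database with two Group Signature records.
records = [
    GroupSignatureRecord("activity A", ("compound 1", "compound 2"), ("g1", "g2")),
    GroupSignatureRecord("activity B", ("compound 3",), ("g4", "g5")),
]

# Hypothetical expression response of a drug candidate (gene -> log ratio).
candidate = {"g1": 2.5, "g2": 1.9, "g3": 0.1, "g4": -0.2, "g5": -1.0, "g6": 0.0}

best = most_similar_group(records, candidate)
# Step e): the candidate is assigned the primary bioactivity of the best match.
print(best.bioactivity)
```

Because a product of values no greater than 1 shrinks as more genes are multiplied in, scores computed this way are directly comparable only between signatures of equal length; how the original method handles signatures of different sizes is not stated in the claims.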
Parent Case Info
[0001] This application claims the benefit of U.S. Provisional Application No. 60/360,728, filed Feb. 28, 2002.
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60360728 | Feb 2002 | US |