Claims
- 1. A method of identifying a set of informative genes whose expression correlates with a chemosensitive class or a chemoresistant class across samples which are determined to be either sensitive or resistant to a compound by growth inhibition, wherein more than one compound is assessed, comprising the steps of:
a) sorting genes by degree to which their expression in said sample which is determined to be either sensitive or resistant to a compound by growth inhibition correlates with the chemosensitive class or the chemoresistant class; b) determining whether said correlation is stronger than expected by chance; and c) repeating steps a) and b) for each compound being assessed; wherein a gene whose expression correlates with either the chemosensitive class or chemoresistant class more strongly than expected by chance is an informative gene, thereby identifying a set of informative genes.
- 2. The method of claim 1, wherein the degree the expression of a gene correlates with the chemosensitive class or the chemoresistant class is determined by:
- 3. The method of claim 2, further comprising:
a) assigning a weight to each informative gene according to the degree of correlation between an expression value of the gene and the correlation to the chemosensitive class or the chemoresistant class, and b) repeating step a) for each compound being assessed, to thereby obtain a classifier.
- 4. The method of claim 1, further comprising performing cross-validation.
- 5. The method of claim 4, wherein performing cross-validation comprises:
a) eliminating a sample used to build the classifier; b) using a weighted voting routine, building a cross-validation model for classifying without the eliminated sample; and c) using the cross-validation model, classifying the eliminated sample including comparing the gene expression values of the eliminated sample to level of gene expression of the cross-validation model.
- 6. The method of claim 5, further comprising:
a) dividing the samples into a training set and a test set; b) using the classifier, classifying each sample of the test set; and c) determining the accuracy of the classifier, as compared to a classification of the sample as expected by chance.
- 7. A method of predicting the sensitivity or resistance of a sample to one or more compounds, wherein a sample is assigned to a chemosensitive class or a chemoresistant class, comprising the steps of:
a) determining a weighted vote for the chemosensitive class or the chemoresistant class for one or more informative genes in said sample in accordance with a classifier built with a weighted voting scheme, wherein the magnitude of each vote depends on the expression level of the gene in said sample and on the degree of correlation of the gene's expression with being either chemosensitive or chemoresistant; and b) summing the votes to determine the winning class.
- 8. The method of claim 7, wherein the weighted voting scheme is performed in accordance with:
- 9. The method of claim 8, wherein a set of informative genes whose expression correlates with a chemosensitive class or a chemoresistant class among samples is identified.
- 10. The method of claim 9, wherein identifying a set of informative genes, comprises the steps of:
a) sorting genes by degree to which their expression in said sample which is determined to be either sensitive or resistant to a compound by growth inhibition correlates with the chemosensitive class or the chemoresistant class; b) determining whether said correlation is stronger than expected by chance; and c) repeating steps a) and b) for each compound being assessed; wherein a gene whose expression correlates with either the chemosensitive class or chemoresistant class more strongly than expected by chance is an informative gene, thereby identifying a set of informative genes.
- 11. The method of claim 10, wherein the degree the expression of a gene correlates with the chemosensitive class or the chemoresistant class is determined by:
- 12. The method of claim 11, wherein building the classifier comprises:
a) assigning a weight to each informative gene according to the degree of correlation between an expression value of the gene and the correlation to the chemosensitive class or the chemoresistant class, and b) repeating step a) for each compound being assessed, to thereby obtain a classifier.
- 13. The method of claim 12, wherein the number of informative genes used in the weighted voting scheme is at least 50.
- 14. A method of assigning a sample to a chemosensitive class or a chemoresistant class, comprising the steps of:
a) determining a weighted vote for the chemosensitive class or the chemoresistant class for one or more informative genes in said sample in accordance with a classifier built with a weighted voting scheme from gene expression data from samples which are determined to be either chemosensitive or chemoresistant to more than one compound by growth inhibition, wherein the magnitude of each vote depends on the expression level of the gene in said sample and on the degree of correlation of the gene's expression with being either sensitive or resistant; and b) summing the votes to determine the winning class.
- 15. A method for classifying a sample obtained from an individual in a chemosensitive class or a chemoresistant class, comprising:
a) obtaining a gene expression product for two or more genes from the sample; b) assessing the gene expression product for the genes to thereby determine the levels of the gene expression product for the genes; c) using a model built with a weighted voting scheme from gene expression data from samples which are determined to be either chemosensitive or chemoresistant to more than one compound by growth inhibition, classifying the sample including comparing the gene expression values of the sample to that of the model.
- 16. A method for determining whether a compound is effective for the treatment of cancer, comprising:
a) assessing the sample to obtain gene expression values for one or more informative genes; b) determining a weighted vote for a chemosensitive class or a chemoresistant class for the informative genes in said sample in accordance with a classifier built with a weighted voting scheme from samples which have been determined to be either sensitive or resistant to a compound by growth inhibition, wherein the magnitude of each vote depends on the expression level of the gene in said sample and on the degree of correlation of the gene's expression with being either chemosensitive or chemoresistant; and c) summing the votes to determine the winning class.
- 17. A method for determining a mechanism of action in the proliferation or non-proliferation of cells in a sample which is determined to be either sensitive or resistant to a compound by growth inhibition, wherein more than one compound is assessed, comprising the steps of:
a) sorting genes by the degree to which their expression in said sample which is determined to be either sensitive or resistant to a compound by growth inhibition correlates with a chemosensitive class or a chemoresistant class; b) determining whether said correlation is stronger than expected by chance; and c) repeating steps a) and b) for each compound being assessed; wherein a gene whose expression correlates with the chemoresistant class more strongly than expected by chance is a gene involved in the proliferation of cells; and wherein a gene whose expression correlates with the chemosensitive class more strongly than expected by chance is a gene involved in the non-proliferation of cells, thereby determining a mechanism of action.
- 18. A method of determining a treatment plan for an individual undergoing chemotherapy, comprising:
a) obtaining a sample from the individual; b) assessing the sample for the level of gene expression for one or more genes; and c) using a model built with a weighted voting scheme from gene expression data from samples which are determined to be either chemosensitive or chemoresistant to more than one compound by growth inhibition, classifying the sample including comparing the gene expression values of the sample to that of the model, to thereby determine a treatment plan.
- 19. In a computer system, a method for classifying a sample obtained from an individual in a chemosensitive class or a chemoresistant class, wherein gene expression values are determined for the sample to be tested, comprising:
a) receiving the gene expression values for two or more genes from the sample to be tested; b) using a model built with a weighted voting scheme from gene expression data from samples which are determined to be either sensitive or resistant to more than one compound by growth inhibition, classifying the sample in a chemosensitive class or a chemoresistant class including comparing the gene expression values of the sample to that of the model, to thereby produce a classification of the sample; and c) providing an output indication of the classification.
- 20. The method of claim 19, wherein the model is built according to:
- 21. The method of claim 20, wherein the vote for the first class is determined by obtaining a sum of the absolute values of the positive votes for the first class, and the vote for the second class is determined by obtaining a sum of the absolute values of the negative votes for the second class.
- 22. The method of claim 21, wherein the weighted voting scheme builds the model using a portion of genes that correlate to the chemosensitive class or chemoresistant class.
- 23. The method of claim 19, wherein the degree the expression of a gene correlates with the chemosensitive class or the chemoresistant class is determined by:
- 24. In a computer system, a method for classifying a sample obtained from an individual in a chemosensitive class or a chemoresistant class, wherein gene expression values are determined for the sample to be tested, comprising:
a) providing a model built by a weighted voting scheme from gene expression data from samples which are determined to be either sensitive or resistant to more than one compound by growth inhibition; b) assessing the sample for the level of gene expression for more than one gene, to thereby obtain a gene expression value for each gene; c) using the model built with a weighted voting scheme, classifying the sample in a chemosensitive class or a chemoresistant class including comparing the gene expression values of the sample to that of the model, to thereby produce a classification of the sample; and d) providing an output indication of the classification.
- 25. The method of claim 24, wherein the weighted voting scheme is performed in accordance with:
- 26. The method of claim 25, wherein the vote for the first class is determined by obtaining a sum of the absolute values of the positive votes for the first class, and the vote for the second class is determined by obtaining a sum of the absolute values of the negative votes for the second class.
- 27. The method of claim 24, wherein the weighted voting scheme builds the model using a portion of genes that correlate to the chemosensitive class or chemoresistant class.
- 28. The method of claim 27, wherein the degree the expression of a gene correlates with the chemosensitive class or the chemoresistant class is determined by:
- 29. In a computer system, a method for building a model for classifying at least one sample to be tested having a gene expression product, comprising:
a) receiving a vector for gene expression values of two or more samples which are determined to be either sensitive or resistant to a compound by growth inhibition, the vector being a series of gene expression values for the samples; b) determining genes that are relevant for classification of said sample in a chemosensitive class or a chemoresistant class; and c) using a weighted voting routine, constructing the model for classifying the samples using at least a portion of the genes determined in step b).
- 30. The method of claim 29, wherein determining genes that are relevant for classification of a sample in the chemosensitive class or the chemoresistant class, further comprises the steps of:
a) sorting genes by degree to which their expression in one or more samples which is determined to be either sensitive or resistant to a compound by growth inhibition correlates with the chemosensitive class or the chemoresistant class; b) determining whether said correlation is stronger than expected by chance; and c) repeating steps a) and b) for each compound being assessed; wherein a gene whose expression correlates with either the chemosensitive class or chemoresistant class more strongly than expected by chance is a gene that is relevant for classification.
- 31. The method of claim 30, wherein the degree the expression of a gene correlates with the chemosensitive class or the chemoresistant class is determined by:
- 32. The method of claim 31, further comprising:
a) assigning a weight to each informative gene according to the degree of correlation between an expression value of the gene and the correlation to the chemosensitive class or the chemoresistant class, and b) repeating step a) for each compound being assessed, to thereby obtain a classifier.
- 33. The method of claim 32, wherein the weighted voting routine employs:
- 34. The method of claim 29, further comprising performing cross-validation.
- 35. The method of claim 34, wherein performing cross-validation comprises:
a) eliminating a sample used to build the classifier; b) using a weighted voting routine, building a cross-validation model for classifying without the eliminated sample; and c) using the cross-validation model, classifying the eliminated sample including comparing the gene expression values of the eliminated sample to level of gene expression of the cross-validation model.
- 36. The method of claim 29, further comprising:
a) dividing the samples into a training set and a test set; b) using the classifier, classifying each sample of the test set; and c) determining the accuracy of the classifier, as compared to a classification of the sample as expected by chance.
- 37. The method of claim 29, further comprising filtering out any gene expression values in the sample that exhibit an insignificant change.
- 38. The method of claim 37, further comprising normalizing the gene expression value of the vectors.
- 39. A computer apparatus for classifying a sample into a chemosensitive class or a chemoresistant class, wherein the sample is obtained from an individual, wherein the apparatus comprises:
a) a source of gene expression values of the sample; b) a processor routine executed by a digital processor, coupled to receive the gene expression values from the source, the processor routine determining classification of the sample by comparing the gene expression values of the sample to a model built with a weighted voting scheme from gene expression data from samples which are determined to be either sensitive or resistant to more than one compound by growth inhibition; and c) an output assembly, coupled to the digital processor, for providing an indication of the classification of the sample.
- 40. The computer apparatus of claim 39, wherein the model is built according to:
- 41. The computer apparatus of claim 40, wherein the vote for the first class is determined by obtaining a sum of the absolute values of the positive votes for the first class, and the vote for the second class is determined by obtaining a sum of the absolute values of the negative votes for the second class.
- 42. The computer apparatus of claim 39, wherein the output assembly comprises a display of the classification.
- 43. A computer apparatus for constructing a model for classifying at least one sample into a chemosensitive class or a chemoresistant class, wherein the apparatus comprises:
a) a source of vectors for gene expression values from two or more samples which are determined to be either sensitive or resistant to more than one compound by growth inhibition, the vector being a series of gene expression values for the samples; b) a processor routine executed by a digital processor, coupled to receive the gene expression values of the vectors from the source, the processor routine determining relevant genes for classifying the sample in a chemosensitive class or a chemoresistant class, and constructing the model with a portion of the relevant genes by utilizing a weighted voting scheme.
- 44. The computer apparatus of claim 43, further comprising an output assembly, coupled to the digital processor, for providing the model.
- 45. The computer apparatus of claim 43, wherein the processor routine determining relevant genes for classifying the sample in the chemosensitive class or the chemoresistant class, further comprises a routine that, for each compound being assessed:
a) sorts genes by degree to which their expression in one or more samples which is determined to be either sensitive or resistant to a compound by growth inhibition correlates with the chemosensitive class or the chemoresistant class; and b) determines whether said correlation is stronger than expected by chance; wherein a gene whose expression correlates with either the chemosensitive class or chemoresistant class more strongly than expected by chance is a gene that is relevant for classification.
- 46. The computer apparatus of claim 45, wherein the processor routine determines the degree the expression of a gene correlates with the chemosensitive class or the chemoresistant class by:
- 47. The computer apparatus of claim 46, wherein, for each compound being assessed, the weighted voting routine assigns a weight to each informative gene according to the degree of correlation between an expression value of the gene and the correlation to the chemosensitive class or the chemoresistant class.
- 48. The computer apparatus of claim 47, wherein a weighted voting routine employs:
- 49. The computer apparatus of claim 48, wherein the vote for the first class is determined by obtaining a sum of the absolute values of the positive votes for the first class, and the vote for the second class is determined by obtaining a sum of the absolute values of the negative votes for the second class.
- 50. The computer apparatus of claim 43, further comprising a filter, coupled between the source and the processor routine, for filtering out any of the gene expression values in a sample that exhibit an insignificant change.
- 51. The computer apparatus of claim 43, further comprising a normalizer, coupled to the filter, for normalizing the gene expression values.
- 52. The computer apparatus of claim 43, wherein the output assembly comprises a display of the model.
- 53. The computer apparatus of claim 52, wherein the output assembly comprises a graphical representation.
- 54. The computer apparatus of claim 53, wherein the graphical representation is color coordinated.
- 55. The computer apparatus of claim 54, wherein the color coordination comprises shades of contiguous colors.
- 56. A machine readable computer assembly for classifying a sample into a chemosensitive class or a chemoresistant class, wherein the sample is obtained from an individual, wherein the apparatus comprises:
a) a source of gene expression values of the sample; b) a processor routine executed by a digital processor, coupled to receive the gene expression values from the source, the processor routine determining classification of the sample by comparing the gene expression values of the sample to a model built with a weighted voting scheme from gene expression data from samples which are determined to be either sensitive or resistant to more than one compound by growth inhibition; and c) an output assembly, coupled to the digital processor, for providing an indication of the classification of the sample.
- 57. A machine readable computer assembly for constructing a model for classifying at least one sample into a chemosensitive class or a chemoresistant class, wherein the apparatus comprises:
a) a source of vectors for gene expression values from two or more samples which are determined to be either sensitive or resistant to more than one compound by growth inhibition, the vector being a series of gene expression values for the samples; b) a processor routine executed by a digital processor, coupled to receive the gene expression values of the vectors from the source, the processor routine determining relevant genes for classifying the sample in a chemosensitive class or a chemoresistant class, and constructing the model with a portion of the relevant genes by utilizing a weighted voting scheme.
- 58. A method of identifying a set of informative genes whose expression correlates with a treatment-sensitive class or a treatment-resistant class across samples which are determined to be either sensitive or resistant to a treatment, wherein more than one compound is assessed, comprising the steps of:
a) sorting genes by degree to which their expression in said sample which is determined to be either sensitive or resistant to a treatment correlates with the treatment-sensitive class or the treatment-resistant class; b) determining whether said correlation is stronger than expected by chance; and c) repeating steps a) and b) for each compound being assessed; wherein a gene whose expression correlates with either the treatment-sensitive class or the treatment-resistant class more strongly than expected by chance is an informative gene, thereby identifying a set of informative genes.
- 59. A method of identifying a set of informative genes whose expression correlates with a drug-sensitive class or a drug-resistant class across samples which are determined to be either sensitive or resistant to a compound, wherein more than one compound is assessed, comprising the steps of:
a) sorting genes by degree to which their expression in said sample which is determined to be either sensitive or resistant to a compound correlates with the drug-sensitive class or the drug-resistant class; b) determining whether said correlation is stronger than expected by chance; and c) repeating steps a) and b) for each compound being assessed; wherein a gene whose expression correlates with either the drug-sensitive class or drug-resistant class more strongly than expected by chance is an informative gene, thereby identifying a set of informative genes.
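For illustration only, the gene-selection, weighted-voting, and cross-validation steps recited in the claims above can be sketched in Python. The formulas the claims refer to ("determined by:", "performed in accordance with:") are not reproduced in this text, so the sketch assumes a signal-to-noise style correlation measure (difference of class means divided by the sum of class standard deviations) and a vote equal to the gene weight times the distance of the sample's expression from the midpoint of the class means. Both are assumptions, as are all function names, the permutation count, and the 0.05 cutoff; none should be read as the claimed formulas.

```python
import random
from statistics import mean, pstdev

SENSITIVE, RESISTANT = "sensitive", "resistant"


def split_by_class(values, labels):
    """Split one gene's expression values into (sensitive, resistant) lists."""
    sens = [v for v, lab in zip(values, labels) if lab == SENSITIVE]
    res = [v for v, lab in zip(values, labels) if lab == RESISTANT]
    return sens, res


def correlation(values, labels):
    """Assumed measure: (mean_s - mean_r) / (sd_s + sd_r); 0.0 if undefined."""
    sens, res = split_by_class(values, labels)
    spread = pstdev(sens) + pstdev(res)
    return (mean(sens) - mean(res)) / spread if spread > 0 else 0.0


def informative_genes(expr, labels, n_perm=200, alpha=0.05, seed=0):
    """Keep genes whose correlation beats a label-permutation null, strongest first."""
    rng = random.Random(seed)
    kept = {}
    for gene, values in expr.items():
        observed = abs(correlation(values, labels))
        shuffled = list(labels)
        hits = 0
        for _ in range(n_perm):
            rng.shuffle(shuffled)  # break any real gene/class association
            if abs(correlation(values, shuffled)) >= observed:
                hits += 1
        if (hits + 1) / (n_perm + 1) <= alpha:  # empirical p-value
            kept[gene] = observed
    return sorted(kept, key=kept.get, reverse=True)


def build_classifier(expr, labels):
    """Weight each informative gene by its correlation; store the class midpoint."""
    model = {}
    for gene in informative_genes(expr, labels):
        sens, res = split_by_class(expr[gene], labels)
        model[gene] = (correlation(expr[gene], labels), (mean(sens) + mean(res)) / 2)
    return model


def classify(model, sample):
    """Sum positive votes for 'sensitive' and |negative| votes for 'resistant'."""
    vote_s = vote_r = 0.0
    for gene, (weight, midpoint) in model.items():
        vote = weight * (sample[gene] - midpoint)
        if vote > 0:
            vote_s += vote
        else:
            vote_r += -vote
    # Ties (including an empty model) fall to 'sensitive' in this sketch.
    return SENSITIVE if vote_s >= vote_r else RESISTANT


def leave_one_out_accuracy(expr, labels):
    """Hold out each sample, rebuild the model without it, classify the held-out one."""
    correct = 0
    for i in range(len(labels)):
        train = {g: v[:i] + v[i + 1:] for g, v in expr.items()}
        model = build_classifier(train, labels[:i] + labels[i + 1:])
        held_out = {g: v[i] for g, v in expr.items()}
        correct += classify(model, held_out) == labels[i]
    return correct / len(labels)
```

Here `expr` is assumed to map each gene name to a list of expression values, one per sample, and `labels` gives the class of each sample for a single compound. Where more than one compound is assessed, the same functions would be called once per compound with that compound's label vector.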
RELATED APPLICATIONS
[0001] This Application is a Continuation In Part of application Ser. No. 09/544,627, filed Apr. 6, 2000, which claims the benefit of U.S. Provisional Application No. 60/188,765, filed Mar. 13, 2000; U.S. Provisional Application No. 60/159,477, filed on Oct. 14, 1999; U.S. Provisional Application No. 60/158,467, filed on Oct. 8, 1999; U.S. Provisional Application No. 60/135,397, filed May 21, 1999; and U.S. Provisional Application No. 60/128,664, filed Apr. 9, 1999. This Application also claims the benefit of U.S. Provisional Application No. 60/236,769, filed on Sep. 29, 2000. The entire teachings of the above applications are incorporated herein by reference.
Provisional Applications (6)
| Number | Date | Country |
|---|---|---|
| 60188765 | Mar 2000 | US |
| 60159477 | Oct 1999 | US |
| 60158467 | Oct 1999 | US |
| 60135397 | May 1999 | US |
| 60128664 | Apr 1999 | US |
| 60236769 | Sep 2000 | US |
Continuation in Parts (1)
| | Number | Date | Country |
|---|---|---|---|
| Parent | 09544627 | Apr 2000 | US |
| Child | 09968627 | Oct 2001 | US |