The field of the invention is molecular biology, genetics, oncology, bioinformatics and diagnostic testing.
Most cancer drugs are effective in some patients, but not others. This results from genetic variation among tumors, and can be observed even among tumors within the same patient. Variable patient response is particularly pronounced with respect to targeted therapeutics. Therefore, the full potential of targeted therapies cannot be realized without suitable tests for determining which patients will benefit from which drugs. According to the National Institutes of Health (NIH), the term “biomarker” is defined as “a characteristic that is objectively measured and evaluated as an indicator of normal biologic or pathogenic processes or pharmacological response to a therapeutic intervention.”
The development of improved diagnostics based on the discovery of biomarkers has the potential to accelerate new drug development by identifying, in advance, those patients most likely to show a clinical response to a given drug. This would significantly reduce the size, length and cost of clinical trials. Technologies such as genomics, proteomics and molecular imaging currently enable rapid, sensitive and reliable detection of specific gene mutations, expression levels of particular genes, and other molecular biomarkers. In spite of the availability of various technologies for molecular characterization of tumors, the clinical utilization of cancer biomarkers remains largely unrealized because few cancer biomarkers have been discovered. For example, a recent review article states:
Another recent review article on cancer biomarkers contains the following comments:
There is a well-recognized need for methods of identifying multigene biomarkers for identifying which patients are suitable candidates for treatment with a given drug or therapy. This is particularly true with regard to targeted cancer therapeutics.
Using gene expression profiling technologies, proprietary bioinformatics tools, and applied statistics, we have discovered that the mammalian genome can be usefully represented by 51 non-overlapping, functionally relevant groups of genes whose intra-group transcript level is coordinately regulated, i.e., strongly correlated, or “coherent,” across various microarray datasets. We have designated these groups of genes Transcription Clusters 1-51 (TC1-TC51). Based on this discovery, we have discovered a broadly applicable method for rapidly identifying: (a) a multigene predictive biomarker for sensitivity or resistance to an anti-cancer drug of interest; or (b) a multigene cancer prognostic biomarker. We call such a multigene biomarker a Predictive Gene Set, or PGS.
A PGS can be based on one transcription cluster or a multiplicity of transcription clusters. In some embodiments, a PGS is based on one or more transcription clusters in their entirety. In other embodiments, the PGS is based on a subset of genes in a single transcription cluster or subsets of a multiplicity of transcription clusters. A subset of genes from any given transcription cluster is representative of the entire transcription cluster from which it is taken, because expression of the genes within that transcription cluster is coherent. Thus, when a subset of genes in a transcription cluster is used, the subset is a representative subset of genes from the transcription cluster.
Provided herein is a method for identifying a predictive gene set (“PGS”) for classifying a cancerous tissue as sensitive or resistant to a particular anticancer drug or class of drug. The method comprises the steps of (a) measuring expression levels of a representative number of genes (such as 10, 15, 20 or more genes) from a transcription cluster in Table 1, in (i) a set of tissue samples from a population of cancerous tissues identified as sensitive to the anticancer drug, and (ii) a set of tissue samples from a population of cancerous tissues identified as resistant to the anticancer drug; and (b) determining whether there is a statistically significant difference between the expression levels of the representative number of genes in the set of tissue samples from the sensitive population, and the set of tissue samples from the resistant population. A representative number of genes whose gene expression levels in the sensitive population are significantly different from their gene expression levels in the resistant population is a PGS for classifying a sample as sensitive or resistant to the anticancer drug. A Student's t test or Gene Set Enrichment Analysis (GSEA) can be used for determining whether there is a statistically significant difference between the expression levels of the representative number of genes in the set of tissue samples from the sensitive population and the set of tissue samples from the resistant population. In some embodiments, steps (a) and (b) are performed for each of the 51 transcription clusters disclosed herein. The tissue sample may be a tumor sample or a blood sample.
Provided herein is another method for identifying a PGS for classifying a cancerous tissue as sensitive or resistant to a particular anticancer drug or class of drug. The method comprises (a) measuring the expression levels of the ten genes in
Provided herein is a method for identifying a PGS for classifying a cancer patient as having a good prognosis or a poor prognosis. The method comprises (a) measuring the expression levels of a representative number of genes (such as 10, 15, 20 or more genes) from a transcription cluster in Table 1 in: (i) a set of tissue samples from a population of cancer patients identified as having a good prognosis, and (ii) a set of tissue samples from a population of cancer patients identified as having a poor prognosis; and (b) determining whether there is a statistically significant difference between the expression levels of the representative number of genes in the set of tissue samples from the good prognosis population, and the set of tissue samples from the poor prognosis population. A representative number of genes whose gene expression levels in the good prognosis population are significantly different from their gene expression levels in the poor prognosis population is a PGS for classifying a patient as having a good prognosis or poor prognosis. A Student's t test or Gene Set Enrichment Analysis (GSEA) can be used for determining whether there is a statistically significant difference between the expression levels of the representative number of genes in the set of tissue samples from the good prognosis population and the set of tissue samples from the poor prognosis population. In some embodiments, steps (a) and (b) are performed for each of the 51 transcription clusters disclosed herein. The tissue sample may be a tumor sample or a blood sample.
Provided herein is another method for identifying a PGS for classifying a cancer patient as having a good prognosis or a poor prognosis. The method comprises (a) measuring the expression levels of the ten genes in
Provided herein is a method of identifying a human tumor as likely to be sensitive or resistant to treatment with the anti-cancer drug tivozanib. The method comprises (a) measuring, in a sample from the tumor, the relative expression level of each gene in a PGS that comprises at least 10 of the genes from TC50; and (b) calculating a PGS score according to the algorithm
wherein E1, E2, . . . En are the expression values of the n genes in the PGS, wherein n is the number of genes in the PGS, and wherein a PGS score below a defined threshold indicates that the tumor is likely to be sensitive to tivozanib, and a PGS score above the defined threshold indicates that the tumor is likely to be resistant to tivozanib. In one embodiment, the PGS comprises a 10-gene subset of TC50. An exemplary 10-gene subset from TC50 is MRC1, ALOX5AP, TM6SF1, CTSB, FCGR2B, TBXAS1, MS4A4A, MSR1, NCKAP1L, and FLI1. Another exemplary 10-gene subset from TC50 is LAPTM5, FCER1G, CD48, BIN2, C1QB, NCF2, CD14, TLR2, CCL5, and CD163.
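The scoring formula itself is not reproduced in this text. The following is a minimal sketch, assuming the PGS score is the arithmetic mean of the n expression values, which is consistent with the "mean expression score" used elsewhere in this description. The expression values and the threshold are hypothetical and serve only to illustrate the direction of the call.

```python
from statistics import mean

# Exemplary 10-gene subset of TC50 listed in the text.
TC50_SUBSET = ["MRC1", "ALOX5AP", "TM6SF1", "CTSB", "FCGR2B",
               "TBXAS1", "MS4A4A", "MSR1", "NCKAP1L", "FLI1"]

def pgs_score(expression, genes):
    """ASSUMPTION: PGS score = arithmetic mean of E1..En."""
    return mean(expression[g] for g in genes)

def classify_tivozanib(score, threshold):
    # Below the threshold -> likely sensitive; above -> likely resistant.
    # A score exactly at the threshold is called resistant here; the text
    # does not address that case, so this is an arbitrary choice.
    return "sensitive" if score < threshold else "resistant"

# Hypothetical relative expression values, for illustration only.
values = [1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0, 2.0, 1.0, 3.0]
sample = dict(zip(TC50_SUBSET, values))

score = pgs_score(sample, TC50_SUBSET)
call = classify_tivozanib(score, threshold=2.5)  # hypothetical threshold
print(score, call)
```

In practice the threshold would come from a threshold determination analysis on response data for the tumor type in question, not from a fixed constant.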
In some embodiments, the method of identifying a human tumor as likely to be sensitive or resistant to treatment with tivozanib includes performing a threshold determination analysis, thereby generating a defined threshold. The threshold determination analysis can include a receiver operator characteristic curve analysis. The relative gene expression level for each gene in the PGS can be determined (e.g., measured) by DNA microarray analysis, qRT-PCR analysis, qNPA analysis, a molecular barcode-based assay, or a multiplex bead-based assay.
Provided herein is a method of identifying a human tumor as likely to be sensitive or resistant to treatment with rapamycin. The method comprises (a) measuring, in a sample from the tumor, the relative expression level of each gene in a PGS that comprises (i) at least 10 genes from TC33; and (ii) at least 10 genes from TC26; and (b) calculating a PGS score according to the algorithm:
wherein E1, E2, . . . Em are the expression values of the m genes from TC33 (for example, wherein m is at least 10 genes), which are up-regulated in sensitive tumors; and F1, F2, . . . Fn are the expression values of the n genes from TC26 (for example, wherein n is at least 10 genes), which are up-regulated in resistant tumors. A PGS score above the defined threshold indicates that the tumor is likely to be sensitive to rapamycin, and a PGS score below the defined threshold indicates that the tumor is likely to be resistant to rapamycin. An exemplary PGS comprises the following genes: FRY, HLF, HMBS, RCAN2, HMGA1, ITPR1, ENPP2, SLC16A4, ANK2, PIK3R1, DTL, CTPS, GINS2, GMNN, MCM5, PRIM1, SNRPA, TK1, UCK2, and PCNA.
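The two-cluster formula is likewise not reproduced in this text. One form consistent with the stated directions (genes E up-regulated in sensitive tumors, genes F up-regulated in resistant tumors, higher score meaning sensitive) is the difference of the two cluster means. The sketch below assumes that form; the input values and the threshold are hypothetical.

```python
from statistics import mean

def two_cluster_pgs_score(up_in_sensitive, up_in_resistant):
    """ASSUMPTION: score = mean(E1..Em) - mean(F1..Fn)."""
    return mean(up_in_sensitive) - mean(up_in_resistant)

def classify_rapamycin(score, threshold):
    # Above the threshold -> likely sensitive; below -> likely resistant.
    return "sensitive" if score > threshold else "resistant"

# Hypothetical expression values for m = 10 and n = 10 genes.
e_values = [5.0] * 10   # genes up-regulated in sensitive tumors
f_values = [3.0] * 10   # genes up-regulated in resistant tumors

rapa_score = two_cluster_pgs_score(e_values, f_values)
rapa_call = classify_rapamycin(rapa_score, threshold=0.0)  # hypothetical
print(rapa_score, rapa_call)
```

Because the two groups of genes move in opposite directions, subtracting one mean from the other widens the separation between sensitive and resistant samples compared with either cluster alone.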
In some embodiments, the method of identifying a human tumor as likely to be sensitive or resistant to treatment with rapamycin includes performing a threshold determination analysis, thereby generating a defined threshold. The threshold determination analysis can include a receiver operator characteristic curve analysis. The relative gene expression level for each gene in the PGS can be determined (e.g., measured) by DNA microarray analysis, qRT-PCR analysis, qNPA analysis, a molecular barcode-based assay, or a multiplex bead-based assay.
Provided herein is a method of classifying a human breast cancer patient as having a good prognosis or a poor prognosis. The method comprises (a) measuring, in a sample from a tumor obtained from the patient, the relative expression level of each gene in a PGS that comprises (i) at least 10 genes from TC35; and (ii) at least 10 genes from TC26; and (b) calculating a PGS score according to the algorithm:
wherein E1, E2, . . . Em are the expression values of the m genes from TC35 (for example, wherein m is at least 10 genes), which are up-regulated in good prognosis patients; and F1, F2, . . . Fn are the expression values of the n genes from TC26 (for example, wherein n is at least 10 genes), which are up-regulated in poor prognosis patients. A PGS score above the defined threshold indicates that the patient has a good prognosis, and a PGS score below the defined threshold indicates that the patient is likely to have a poor prognosis. An exemplary PGS comprises the following genes: RPL29, RPL36A, RPS8, RPS9, EEF1B2, RPS10P5, RPL13A, RPL36, RPL18, RPL14, DTL, CTPS, GINS2, GMNN, MCM5, PRIM1, SNRPA, TK1, UCK2, and PCNA.
In some embodiments, the method of classifying a human breast cancer patient as having a good prognosis or a poor prognosis includes performing a threshold determination analysis, thereby generating a defined threshold. The threshold determination analysis can include a receiver operator characteristic curve analysis. The relative gene expression level for each gene in the PGS can be determined (e.g., measured) by DNA microarray analysis, qRT-PCR analysis, qNPA analysis, a molecular barcode-based assay, or a multiplex bead-based assay.
Provided herein is a probe set comprising probes for at least 10 genes from each transcription cluster in Table 1, provided that the probe set is not a whole-genome microarray chip. Examples of suitable probe sets include a microarray probe set, a set of PCR primers, a qNPA probe set, a probe set comprising molecular bar codes (e.g., NanoString® Technology) or a probe set wherein probes are affixed to beads (e.g., QuantiGene® Plex assay system). In one embodiment, the probe set comprises probes for each of the 510 genes listed in
These and other aspects and advantages of the invention will become apparent upon consideration of the following figures, detailed description, and claims.
As used herein, “coherence” means, when applied to a set of genes, that expression levels of the members of the set display a statistically significant tendency to increase or decrease in concert, within a given type of tissue, e.g., tumor tissue. Without intending to be bound by theory, the inventors note that coherence is likely to indicate that the coherent genes share a common involvement in one or more biological functions.
As used herein, “optimum threshold PGS score” means the threshold PGS score at which the classifier gives the most desirable balance between the cost of false negative calls and false positive calls.
As used herein, “Predictive Gene Set” or “PGS” means, with respect to a given phenotype, e.g., sensitivity or resistance to a particular cancer drug, a set of ten or more genes whose PGS score in a given type of tissue sample significantly correlates with the given phenotype in the given type of tissue.
As used herein, “good prognosis” means that a patient is expected to have no distant metastases of a tumor within five years of initial diagnosis of cancer.
As used herein, “poor prognosis” means that a patient is expected to have distant metastases of a tumor within five years of initial diagnosis of cancer.
As used herein, “probe” means a molecule that can be used for measuring the expression of a particular gene. Exemplary probes include PCR primers, as well as gene-specific DNA oligonucleotide probes such as microarray probes affixed to a microarray substrate, quantitative nuclease protection assay probes, probes linked to molecular barcodes, and probes affixed to beads.
As used herein, “receiver operating characteristic” (ROC) curve means a graphical plot of true positive rate (sensitivity) versus false positive rate (1 − specificity) for a binary classifier system. In construction of an ROC curve, the following definitions apply:
False negative rate: FNR=1−TPR
True positive rate: TPR=true positive/(true positive+false negative)
False positive rate: FPR=false positive/(false positive+true negative)
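The three rate definitions above can be computed directly from a list of scores and known outcomes. The sketch below is illustrative only: the rule "call a sample positive when its score is at or above the threshold" is an assumed convention, the data are hypothetical, and both classes are assumed to be present so that no denominator is zero.

```python
def confusion_rates(scores, is_positive, threshold):
    """Return TPR, FPR and FNR at one threshold (positive if score >= threshold)."""
    tp = fp = tn = fn = 0
    for s, pos in zip(scores, is_positive):
        called = s >= threshold
        if called and pos:
            tp += 1
        elif called and not pos:
            fp += 1
        elif pos:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return {"TPR": tpr, "FPR": fpr, "FNR": 1 - tpr}

def roc_points(scores, is_positive):
    """One (FPR, TPR) point per distinct score, from strictest threshold down."""
    points = []
    for t in sorted(set(scores), reverse=True):
        r = confusion_rates(scores, is_positive, t)
        points.append((r["FPR"], r["TPR"]))
    return points

# Hypothetical scores and outcomes.
demo_scores = [0.9, 0.6, 0.7, 0.2]
demo_truth = [True, True, False, False]
rates = confusion_rates(demo_scores, demo_truth, threshold=0.65)
curve = roc_points(demo_scores, demo_truth)
```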
As used herein, “response” or “responding” to treatment means, with regard to a treated tumor, that the tumor displays: (a) slowing of growth, (b) cessation of growth, or (c) regression. A tumor that responds to therapy is a “responder” and is “sensitive” to treatment. A tumor that does not respond to therapy is a “non-responder” and is “resistant” to treatment.
As used herein, “threshold determination analysis” means analysis of a dataset representing a given tumor type, e.g., human renal cell carcinoma, to determine a threshold PGS score, e.g., an optimum threshold PGS score, for that particular tumor type. In the context of a threshold determination analysis, the dataset representing a given tumor type includes (a) actual response data (response or non-response), and (b) a PGS score for each tumor from a group of tumor-bearing mice or humans.
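A threshold determination analysis of this kind can be sketched as a search over candidate thresholds that minimizes a weighted cost of false negative and false positive calls. The cost weights, the use of midpoints between observed scores as candidates, and the convention that a score above the threshold predicts response are all assumptions made for illustration; for a PGS in which low scores indicate sensitivity, the comparisons would be reversed.

```python
def optimum_threshold(scores, responded, cost_fn=1.0, cost_fp=1.0):
    """Return (threshold, cost) minimizing cost_fn*FN + cost_fp*FP.

    Predicts 'responder' when score > threshold. Requires at least two
    distinct scores; otherwise there are no candidates and None is returned.
    """
    ordered = sorted(set(scores))
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best = None
    for t in candidates:
        fn = sum(1 for s, r in zip(scores, responded) if r and s <= t)
        fp = sum(1 for s, r in zip(scores, responded) if not r and s > t)
        cost = cost_fn * fn + cost_fp * fp
        if best is None or cost < best[1]:
            best = (t, cost)
    return best

# Hypothetical dataset: one PGS score and one observed response per tumor.
tumor_scores = [1.0, 2.0, 3.0, 4.0]
tumor_response = [False, False, True, True]
best_threshold = optimum_threshold(tumor_scores, tumor_response)
print(best_threshold)
```

Raising `cost_fn` relative to `cost_fp` shifts the chosen threshold toward calling more tumors responders, which is one way to express the "most desirable balance" between the two kinds of error.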
Current thinking among many biologists is that the approximately 25,000 genes expressed in mammals are subject to complex regulation in order to carry out the development and function of the organism. Groups of genes function together in coordinated systems such as DNA replication, protein synthesis, neural development, etc. Currently, there is no comprehensive methodology for studying and characterizing coordinated expression of genes across the entire genome, at the transcriptional level.
We set out to group, or “bin,” genes into different functional groups or pathways, based on expression microarray data. We developed a stepwise statistical methodology to identify sets of coordinately regulated genes. The first step was to calculate a correlation coefficient for the expression level of every gene with respect to every other gene, in each of eight human datasets. This resulted in a 13,000 by 13,000 matrix of correlation scores based on data from commercial microarray chips (Affymetrix U133A). K-means clustering then was carried out across the 13,000 by 13,000 matrix of correlation scores. Because the 13,000 genes on the microarray chips are scattered across the entire human genome, and because these 13,000 genes are generally considered to include the most important human genes, the 13,000-gene chips are considered “whole genome” microarrays.
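The first two steps (a gene-by-gene correlation matrix, then K-means clustering of its rows) can be sketched on a toy scale as follows. This is not the implementation used for the analysis described here. It handles a single dataset only, since the text does not say how correlations from the eight datasets were combined; it uses Pearson correlation as an assumed choice of coefficient; and it seeds the centroids with the first k rows purely to make the example deterministic.

```python
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_matrix(profiles):
    """profiles[i] is the expression of gene i across samples."""
    return [[pearson(p, q) for q in profiles] for p in profiles]

def kmeans(rows, k, iterations=20):
    """Plain Lloyd-style K-means; first k rows seed the centroids."""
    centroids = [list(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))
            for row in rows
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical expression profiles: genes 0 and 2 rise together,
# genes 1 and 3 fall together.
toy_profiles = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [10.0, 8.0, 6.0, 4.0, 2.0],
]
toy_labels = kmeans(correlation_matrix(toy_profiles), k=2)
print(toy_labels)
```

Clustering the rows of the correlation matrix, not the raw expression values, groups genes by how similarly they correlate with every other gene.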
Historically, many investigators have found correlations between expression levels of certain genes and a biological condition or phenotype of interest. Such correlations, however, have had very limited usefulness. This is because the correlations typically do not hold up across datasets, e.g., human breast tumors vs. mouse breast tumors; human breast tumors vs. human lung tumors; or one gene expression technology platform (Affymetrix) vs. another gene expression technology platform (Agilent).
We have avoided this pitfall by identifying gene expression correlations that are observed across multiple, diverse datasets. By applying K-means cluster analysis (Lloyd et al., 1982, IEEE Transactions on Information Theory 28:129-137) to measured RNA expression values for all 13,000 human genes, across multiple independent data sets, we sorted the universe of transcribed human genes, the “transcriptome,” into 100 unique, non-overlapping sets of genes whose expression levels, in terms of transcriptional flux, move (increase or decrease) together. The coordinated variation in gene transcript level observed across multiple data sets is an empirical phenomenon that we call “coherence.”
After identifying the 100 non-overlapping gene groups through K-means cluster analysis, we performed an optimization process that included the following steps: (a) application of a coherency threshold, which eliminated outliers (individual genes) within each of the 100 groups; (b) identification and removal of individual genes whose expression value varied excessively, when tested in an Affymetrix system versus an Agilent system; and (c) application of a threshold for the minimum number of genes in any cluster, after steps (a) and (b). The end result of this optimization process was a set of 51 defined, highly coherent, non-overlapping, gene lists which we call “transcription clusters.” By mathematically reducing the complexity of a biological system containing tens of thousands of genes down to 51 groups of genes that can be represented by as few as ten genes per group, this set of 51 transcription clusters has proven to be a powerful tool for interpreting and utilizing gene expression data. The genes in each transcription cluster are listed in Table 1 (below) and identified by both Human Genome Organization (HUGO) symbol and Entrez Identifier.
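Steps (a) through (c) amount to two per-gene filters followed by a per-cluster size filter. The sketch below shows that structure only. The per-gene coherence measure, the cross-platform discrepancy measure, and all three cutoff values are hypothetical placeholders, because the text does not state how they were computed or what values were used.

```python
def refine_clusters(clusters, coherence, platform_gap,
                    min_coherence, max_gap, min_genes):
    """Apply steps (a)-(c) to a dict of cluster name -> list of genes.

    coherence[g]     hypothetical per-gene coherence with its own cluster
    platform_gap[g]  hypothetical cross-platform discrepancy for gene g
    """
    refined = {}
    for name, genes in clusters.items():
        kept = [g for g in genes
                if coherence[g] >= min_coherence      # step (a)
                and platform_gap[g] <= max_gap]       # step (b)
        if len(kept) >= min_genes:                    # step (c)
            refined[name] = kept
    return refined

# Hypothetical inputs; gene and cluster names are placeholders.
raw_clusters = {"K1": ["g1", "g2", "g3"], "K2": ["g4", "g5"]}
gene_coherence = {"g1": 0.9, "g2": 0.8, "g3": 0.2, "g4": 0.9, "g5": 0.1}
gene_gap = {"g1": 0.1, "g2": 0.2, "g3": 0.1, "g4": 2.0, "g5": 0.1}

refined = refine_clusters(raw_clusters, gene_coherence, gene_gap,
                          min_coherence=0.5, max_gap=1.0, min_genes=2)
print(refined)
```

Applying the size filter last matters: a group can start out large and still be discarded once its outliers and platform-sensitive genes are removed.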
Although the transcription clusters were identified by mathematical analysis, we have demonstrated that the transcription clusters have biological significance. We have found the transcription clusters to be highly enriched for a wide variety of basic biological structures or functions. Examples of associations between transcription clusters and basic biological structures or functions are listed in Table 2 below.
For some transcription clusters, the associated biology (structure and/or function), is presumed to exist, but has not been identified yet. It is important to note, however, that the practice of the methods disclosed herein, e.g., identifying a PGS for classifying a cancerous tissue as sensitive or resistant to an anticancer drug, does not require knowledge of any biological structure or function associated with any transcription cluster. Utilization of the methods described herein depends solely on two types of correlations: (1) the correlations among transcript levels within each transcription cluster; and (2) the correlation between the mean expression score for a transcription cluster and phenotype, e.g., drug sensitivity versus drug resistance, or good prognosis versus poor prognosis. Our discovery that many different basic biological structures and functions are associated with, or represented by, the disclosed transcription clusters, is strong evidence that numerous and varied phenotypic traits can be correlated readily with one or more of the transcription clusters by a person of skill in the art, without undue experimentation.
Once a transcription cluster has been associated with a phenotype of interest (such as tumor sensitivity or resistance to a particular drug), that transcription cluster (or a subset of that transcription cluster) can be used as a multigene biomarker for that phenotype. In other words, a transcription cluster, or a subset thereof, is a PGS for the phenotype(s) associated with that transcription cluster. Any given transcription cluster can be associated with more than one phenotype.
A phenotype can be associated with more than one transcription cluster. The more than one transcription cluster, or subsets thereof, can be a PGS for the phenotype(s) associated with those transcription clusters.
In certain embodiments, one or more transcription clusters from Table 1 may be optionally excluded from the analysis. For example, TC1, TC2, TC3, TC4, TC5, TC6, TC7, TC8, TC9, TC10, TC11, TC12, TC13, TC14, TC15, TC16, TC17, TC18, TC19, TC20, TC21, TC22, TC23, TC24, TC25, TC26, TC27, TC28, TC29, TC30, TC31, TC32, TC33, TC34, TC35, TC36, TC37, TC38, TC39, TC40, TC41, TC42, TC43, TC44, TC45, TC46, TC47, TC48, TC49, TC50, or TC51 may be excluded from the analysis.
In order to practice the methods disclosed herein, the skilled person needs gene expression data, e.g., conventional microarray data or quantitative PCR data, from: (a) a population shown to be positive for the phenotype of interest, and (b) a population shown to be negative for the phenotype of interest (collectively, “response data”). Examples of populations that can be used to generate response data include populations of tissue samples (tumor samples or blood samples) that represent populations of human patients or animal models, for example, mouse models of cancer. The necessary response data can be obtained readily by the skilled person, using nothing more than conventional methods, materials and instrumentation for measuring gene expression or transcript abundance in a tissue sample. Suitable methods, materials and instrumentation are well-known and commercially available. Once the response data are in hand, the methods described herein can be performed by using the lists of genes in the transcription clusters set forth above in Table 1, and mathematical calculations that are described herein.
As described in more detail in Example 2 below, we measured the transcript levels of subsets of genes from all 51 transcription clusters in tissue samples from a population of tumor samples shown to be sensitive to tivozanib; and a population of tumor samples shown to be resistant to tivozanib. Next, we calculated a cluster score for each cluster, in each individual in each population. Then, with respect to each transcription cluster, we used a Student's t-test to calculate whether the cluster scores of the tivozanib-sensitive population were significantly different from the cluster scores of the tivozanib-resistant population. We found that with regard to TC50, there was a statistically significant difference between the cluster scores of the tivozanib-sensitive population and the cluster scores of the tivozanib-resistant population.
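The per-cluster comparison can be sketched as a two-sample Student's t statistic computed for each cluster in turn. The code below uses the classical pooled-variance form and returns the statistic with its degrees of freedom; converting that to a p-value requires a t distribution, for which a statistics library would normally be used. The cluster scores shown are hypothetical and are not data from Example 2.

```python
from math import sqrt
from statistics import mean, variance

def student_t(a, b):
    """Pooled-variance two-sample t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df

# Hypothetical cluster scores: (sensitive population, resistant population).
cluster_scores = {
    "TC50": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
    "TC1": ([2.0, 3.0, 4.0], [2.5, 3.0, 3.5]),
}

t_results = {name: student_t(sens, res)
             for name, (sens, res) in cluster_scores.items()}
for name, (t, df) in t_results.items():
    print(name, round(t, 3), df)
```

When all 51 clusters are tested against the same phenotype, some correction for multiple comparisons would usually be considered; the text does not state whether one was applied.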
The transcription clusters disclosed herein resulted from a genome-wide analysis, and the transcription clusters represent widely divergent biological structures and functions that are not unique to cancer biology. The transcription cluster useful for predicting response to tivozanib, TC50, is highly enriched for genes expressed by a particular class of hematopoietic cells that infiltrate certain tumors. Hematopoietic cells are critical for many biological processes. In principle, any phenotype mediated by this class of hematopoietic cells can be identified by a test for expression of TC50.
Populations.
The methods disclosed herein can be used on the basis of: (a) gene expression data (transcript abundance data) from a population of human patients, animal models or tumors, shown to be positive for the phenotypic trait of interest, e.g., response to a particular drug, or cancer prognosis; together with (b) relative gene expression data or relative transcript abundance data from populations shown to differ with respect to a phenotypic trait of interest, such as sensitivity to a particular cancer drug, and/or overall prognosis in cancer treatment. Preferably, the classified populations that differ in the phenotypic trait of interest are otherwise generally comparable. For example, if a drug sensitive population is a group of a particular strain of mice, the resistant population should be a group of the same strain of mice. In another example, if the sensitive population is a set of human kidney tumor biopsy samples, the resistant population should be a set of human kidney tumor biopsy samples.
Phenotype Definition.
Suitable criteria for phenotypic classification will depend on the phenotypes of interest. For example, if the phenotypes of interest are sensitivity and resistance of tumors to treatment with a particular anti-tumor agent, tumors can be classified on the basis of one or more parameters such as tumor growth inhibition (TGI) assessed at a single endpoint, TGI assessed over time in terms of a growth curve, or tumor histology. For a given parameter, a threshold or cut-off value can be set for distinguishing a positive phenotype from a negative phenotype. A particular percent TGI is sometimes used as a threshold or cut-off. For example, this could be clinically defined RECIST criteria (Response Evaluation Criteria In Solid Tumors) for measuring TGI in human clinical trials. In another example, the timing of an inflection point in a tumor growth curve is used. In another example, a given score in a histological assessment is used. There is considerable latitude in selection of suitable parameters and suitable thresholds for phenotype definition. For anti-tumor drug response classification, suitable phenotype definitions will depend on factors including the tumor type and the particular drug involved. Selection of suitable parameters and suitable thresholds for phenotype definition is within the skill in the art.
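As one illustration of a single-endpoint cut-off, the sketch below classifies a tumor by percent TGI. The text does not define how percent TGI is calculated; the formula used here (one minus the ratio of treated to control volume change, times 100) is one commonly used definition and is an assumption, as are the cut-off value and the tumor volumes.

```python
def percent_tgi(treated_start, treated_end, control_start, control_end):
    """ASSUMED definition: %TGI = (1 - delta_treated / delta_control) * 100."""
    delta_t = treated_end - treated_start
    delta_c = control_end - control_start
    return (1 - delta_t / delta_c) * 100

def classify_response(tgi, cutoff):
    # At or above the cut-off -> responder ("sensitive").
    return "sensitive" if tgi >= cutoff else "resistant"

# Hypothetical tumor volumes (arbitrary units) at start and at the endpoint.
tgi = percent_tgi(treated_start=100.0, treated_end=150.0,
                  control_start=100.0, control_end=300.0)
phenotype = classify_response(tgi, cutoff=50.0)  # hypothetical cut-off
print(tgi, phenotype)
```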
Tissue Samples.
A tissue sample from a tumor in a human patient or a tumor in a mouse model can be used as a source of RNA, so that an individual mean expression score for each transcription cluster, and a population mean expression score for each transcription cluster, can be determined. Examples of tumors are carcinomas, sarcomas, gliomas and lymphomas. The tissue sample can be obtained by using conventional tumor biopsy instruments and procedures. Endoscopic biopsy, excisional biopsy, incisional biopsy, fine needle biopsy, punch biopsy, shave biopsy and skin biopsy are examples of recognized medical procedures that can be used by one of skill in the art to obtain tumor samples for use in practicing the invention. The tumor tissue sample should be large enough to provide sufficient RNA for measuring individual gene expression levels.
The tumor tissue sample can be in any form that allows quantitative analysis of gene expression or transcript abundance. In some embodiments, RNA is isolated from the tissue sample prior to quantitative analysis. Some methods of RNA analysis, however, do not require RNA extraction, e.g., the qNPA™ technology commercially available from High Throughput Genomics, Inc. (Tucson, Ariz.). Accordingly, the tissue sample can be fresh, preserved through suitable cryogenic techniques, or preserved through non-cryogenic techniques. Tissue samples used in the invention can be clinical biopsy specimens, which often are fixed in formalin and then embedded in paraffin. Samples in this form are commonly known as formalin-fixed, paraffin-embedded (FFPE) tissue. Techniques of tissue preparation and tissue preservation suitable for use in the present invention are well-known to those skilled in the art.
Expression levels for a representative number of genes from a given transcription cluster are the input values used to calculate the individual mean expression score for that transcription cluster, in a given tissue sample. Each tissue sample is a member of a population, e.g., a sensitive population or a resistant population. The individual mean expression scores for all the individuals in a given population then are used to calculate the population mean expression score for a given transcription cluster, in a given population. So for each tissue sample, it is necessary to determine, i.e., measure, the expression levels of individual genes in a transcription cluster. Gene expression levels (transcript abundance) can be determined by any suitable method. Exemplary methods for measuring individual gene expression levels include DNA microarray analysis, qRT-PCR, qNPA™, the NanoString® technology, and the QuantiGene® Plex assay system, each of which is discussed below.
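The two levels of averaging described above can be written out as follows. This is a minimal sketch: the gene names, the expression values, and the dictionary layout of a sample are hypothetical.

```python
from statistics import mean

def individual_mean_score(sample_expr, cluster_genes):
    """Mean expression of the representative cluster genes in one sample."""
    return mean(sample_expr[g] for g in cluster_genes)

def population_mean_score(samples, cluster_genes):
    """Mean of the individual mean expression scores across a population."""
    return mean(individual_mean_score(s, cluster_genes) for s in samples)

# Hypothetical representative genes and two samples from one population.
representative_genes = ["geneA", "geneB"]
sensitive_population = [
    {"geneA": 2.0, "geneB": 4.0},   # individual score 3.0
    {"geneA": 4.0, "geneB": 6.0},   # individual score 5.0
]

individual_scores = [individual_mean_score(s, representative_genes)
                     for s in sensitive_population]
population_score = population_mean_score(sensitive_population,
                                         representative_genes)
print(individual_scores, population_score)
```

Keeping the individual scores, not only the population mean, is what allows a significance test between populations later on.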
RNA Isolation.
DNA microarray analysis and qRT-PCR generally involve RNA isolation from a tissue sample. Methods for rapid and efficient extraction of eukaryotic mRNA, i.e., poly(A) RNA, from tissue samples are well-established and known to those of skill in the art. See, e.g., Ausubel et al., 1997, Current Protocols in Molecular Biology, John Wiley & Sons. The tissue sample can be fresh, frozen or formalin-fixed, paraffin-embedded (FFPE) clinical study tumor specimens. In general, RNA isolated from fresh or frozen tissue samples tends to be less fragmented than RNA from FFPE samples. FFPE samples of tumor material, however, are more readily available, and FFPE samples are suitable sources of RNA for use in methods of the present invention. For a discussion of FFPE samples as sources of RNA for gene expression profiling by RT-PCR, see, e.g., Clark-Langone et al., 2007, BMC Genomics 8:279. Also see, De Andrés et al., 1995, Biotechniques 18:42-44; and Baker et al., U.S. Patent Application Publication No. 2005/0095634. The use of commercially available kits with vendor's instructions for RNA extraction and preparation is widespread and common. Commercial vendors of various RNA isolation products and complete kits include Qiagen (Valencia, Calif.), Invitrogen (Carlsbad, Calif.), Ambion (Austin, Tex.) and Exiqon (Woburn, Mass.).
In general, RNA isolation begins with tissue/cell disruption. During tissue/cell disruption, it is desirable to minimize RNA degradation by RNases. One approach to limiting RNase activity during the RNA isolation process is to ensure that a denaturant is in contact with cellular contents as soon as the cells are disrupted. Another common practice is to include one or more proteases in the RNA isolation process. Optionally, fresh tissue samples are immersed in an RNA stabilization solution, at room temperature, as soon as they are collected. The stabilization solution rapidly permeates the cells, stabilizing the RNA for storage at 4° C., for subsequent isolation. One such stabilization solution is available commercially as RNAlater® (Ambion, Austin, Tex.).
In some protocols, total RNA is isolated from disrupted tumor material by cesium chloride density gradient centrifugation. In general, mRNA makes up approximately 1% to 5% of total cellular RNA. Immobilized oligo(dT), e.g., oligo(dT) cellulose, is commonly used to separate mRNA from ribosomal RNA and transfer RNA. If stored after isolation, RNA must be stored under RNase-free conditions. Methods for stable storage of isolated RNA are known in the art. Various commercial products for stable storage of RNA are available.
Microarray Analysis.
The mRNA expression level for multiple genes can be measured using conventional DNA microarray expression profiling technology. A DNA microarray is a collection of specific DNA segments or probes affixed to a solid surface or substrate such as glass, plastic or silicon, with each specific DNA segment occupying a known location in the array. Hybridization with a sample of labeled RNA, usually under stringent hybridization conditions, allows detection and quantitation of RNA molecules corresponding to each probe in the array. After stringent washing to remove non-specifically bound sample material, the microarray is scanned by confocal laser microscopy or other suitable detection method. Modern commercial DNA microarrays, often known as DNA chips, typically contain tens of thousands of probes, and thus can measure expression of tens of thousands of genes simultaneously. Such microarrays can be used in practicing the disclosed methods. Alternatively, custom chips containing as few probes as those needed to measure expression of the genes of the transcription clusters, plus any desired controls or standards, can be used.
To facilitate data normalization, a two-color microarray reader can be used. In a two-color (two-channel) system, samples are labeled with a first fluorophore that emits at a first wavelength, while an RNA or cDNA standard is labeled with a second fluorophore that emits at a different wavelength. For example, Cy3 (570 nm) and Cy5 (670 nm) often are employed together in two-color microarray systems.
DNA microarray technology is well-developed, commercially available, and widely employed. Therefore, in performing the methods disclosed herein, the skilled person can use microarray technology to measure expression levels of genes in the transcription cluster without undue experimentation. DNA microarray chips, reagents (such as those for RNA or cDNA preparation, RNA or cDNA labeling, hybridization and washing solutions), instruments (such as microarray readers) and protocols are well-known in the art and available from various commercial sources. Commercial vendors of microarray systems include Agilent Technologies (Santa Clara, Calif.) and Affymetrix (Santa Clara, Calif.), but other microarray systems can be used.
Quantitative RT-PCR.
The level of mRNA representing individual genes in a transcription cluster can be measured using conventional quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) technology. Advantages of qRT-PCR include sensitivity, flexibility, quantitative accuracy, and ability to discriminate between closely related mRNAs. Guidance concerning the processing of tissue samples for quantitative PCR is available from various sources, including manufacturers and vendors of commercial products for qRT-PCR (e.g., Qiagen (Valencia, Calif.) and Ambion (Austin, Tex.)). Instrument systems for automated performance of qRT-PCR are commercially available and used routinely in many laboratories. An example of a well-known commercial system is the Applied Biosystems 7900HT Fast Real-Time PCR System (Applied Biosystems, Foster City, Calif.).
Once isolated mRNA is in hand, the first step in gene expression profiling by RT-PCR is the reverse transcription of the mRNA template into cDNA, which is then exponentially amplified in a PCR reaction. Two commonly used reverse transcriptases are avian myeloblastosis virus reverse transcriptase (AMV-RT) and Moloney murine leukemia virus reverse transcriptase (MMLV-RT). The reverse transcription reaction typically is primed with specific primers, random hexamers, or oligo(dT) primers. Suitable primers are commercially available, e.g., GeneAmp® RNA PCR kit (Perkin Elmer, Waltham, Mass.). The resulting cDNA product can be used as a template in the subsequent polymerase chain reaction.
The PCR step is carried out using a thermostable DNA-dependent DNA polymerase. The polymerase most commonly used in PCR systems is a Thermus aquaticus (Taq) polymerase. The selectivity of PCR results from the use of primers that are complementary to the DNA region targeted for amplification, i.e., regions of the cDNAs reverse transcribed from the genes of the Transcription Cluster. Therefore, when qRT-PCR is employed in the present invention, primers specific to each gene in a given Transcription Cluster are based on the cDNA sequence of the gene. Commercial technologies such as SYBR® green or TaqMan® (Applied Biosystems, Foster City, Calif.) can be used in accordance with the vendor's instructions. Messenger RNA levels can be normalized for differences in loading among samples by comparing the levels of housekeeping genes such as beta-actin or GAPDH. The level of mRNA expression can be expressed relative to any single control sample such as mRNA from normal, non-tumor tissue or cells. Alternatively, it can be expressed relative to mRNA from a pool of tumor samples, or tumor cell lines, or from a commercially available set of control mRNA.
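By way of illustration only, normalization against a housekeeping gene and expression relative to a control sample are often computed with the comparative Ct (2^−ΔΔCt) calculation. The source does not name a specific calculation, so the following Python sketch is an assumption: the function name, the Ct values, and the premise of approximately 100% amplification efficiency are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene versus a control sample (2^-ddCt).

    Assumes roughly 100% PCR efficiency (product doubles each cycle) and a
    housekeeping (reference) gene that is stably expressed in both samples.
    """
    delta_sample = ct_target_sample - ct_ref_sample      # normalize tumor sample
    delta_control = ct_target_control - ct_ref_control   # normalize control sample
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: target gene and GAPDH in a tumor and in normal tissue.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)  # 4.0
```

A lower Ct means more starting template, so a target that crosses threshold two cycles earlier than in the control (after housekeeping normalization) is reported as about four-fold higher.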
Suitable primer sets for PCR analysis of expression levels of genes in a transcription cluster can be designed and synthesized by one of skill in the art, without undue experimentation. Alternatively, complete PCR primer sets for practicing the disclosed methods can be purchased from commercial sources, e.g., Applied Biosystems, based on the identities of genes in the transcription clusters, as listed in Table 1. PCR primers preferably are about 17 to 25 nucleotides in length. Primers can be designed to have a particular melting temperature (Tm), using conventional algorithms for Tm estimation. Software for primer design and Tm estimation is available commercially, e.g., Primer Express™ (Applied Biosystems), and also is available on the internet, e.g., Primer3 (Massachusetts Institute of Technology). By applying established principles of PCR primer design, a large number of different primers can be used to measure the expression level of any given gene. Accordingly, the disclosed methods are not limited with respect to which particular primers are used for any given gene in a transcription cluster.
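As a rough illustration of Tm estimation, the following Python sketch applies a basic GC-content formula that is commonly used for oligonucleotides of 14 or more nucleotides. It is offered only as an assumption-laden example: dedicated primer design tools typically use more detailed thermodynamic models, and the function name and primer sequence are hypothetical.

```python
def estimate_tm(primer):
    """Rough Tm in degrees C from base composition, for primers of 14+ nt.

    Uses Tm = 64.9 + 41 * (G + C - 16.4) / N, where N is the primer length.
    Salt and primer concentration effects are ignored.
    """
    seq = primer.upper()
    if len(seq) < 14 or set(seq) - set("ACGT"):
        raise ValueError("expected an A/C/G/T sequence of at least 14 nucleotides")
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


primer = "AGCTGACCTGAAGTTCATCT"  # hypothetical 20-mer
print(round(estimate_tm(primer), 1))
```

Such an estimate is useful mainly for screening candidate primers into a similar Tm range before a more careful calculation is made.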
Quantitative Nuclease Protection Assay.
An example of a suitable method for determining expression levels of genes in a transcription cluster without performing an RNA extraction step is the quantitative nuclease protection assay (qNPA™), which is commercially available from High Throughput Genomics, Inc. (aka “HTG”; Tucson, Ariz.). In the qNPA method, samples are treated in a 96-well plate with a proprietary Lysis Buffer (HTG), which releases total RNA into solution. Gene-specific DNA oligonucleotides, i.e., specific for each gene in a given Transcription Cluster, are added directly to the Lysis Buffer solution, and they hybridize to the RNA present in the Lysis Buffer solution. The DNA oligonucleotides are added in excess, to ensure that all RNA molecules complementary to the DNA oligonucleotides are hybridized. After the hybridization step, S1 nuclease is added to the mixture. The S1 nuclease digests the non-hybridized portion of the target RNA, all of the non-target RNA, and excess DNA oligonucleotides. Then the S1 nuclease enzyme is inactivated. The RNA::DNA heteroduplexes are treated to remove the RNA portion of the duplex, leaving only the previously protected oligonucleotide probes. The surviving DNA oligonucleotides are a stoichiometrically representative library of the original RNA sample. The qNPA oligonucleotide library can be quantified using the ArrayPlate Detection System (HTG).
NanoString® nCounter® Analysis.
Another example of a technology suitable for determining expression levels of genes in a transcription cluster is a commercially available assay system based on probes with molecular “barcodes,” namely the NanoString® nCounter™ Analysis system (NanoString® Technologies, Seattle, Wash.). This system is designed to detect and count hundreds of unique transcripts in a single reaction. Each color-coded barcode is attached to a single target-specific probe corresponding to a gene of interest, e.g., a gene in a transcription cluster. When mixed together with controls, probes form a multiplexed “CodeSet.” The NanoString® technology employs two approximately 50-base probes per mRNA, which hybridize in solution. A “reporter probe” carries the signal, and a “capture probe” allows the complex to be immobilized for data collection. After hybridization, the excess probes are removed, and the probe/target complexes are aligned and immobilized in nCounter® cartridges, which are placed in a digital analyzer. The nCounter® analysis system is an integrated system comprising an automated sample prep station, a digital analyzer, the CodeSet (molecular barcodes), and all of the reagents and consumables needed to perform the analysis.
QuantiGene® Plex Assay.
Another example of a technology suitable for determining expression levels of genes in a transcription cluster is a commercially available assay system known as the QuantiGene® Plex Assay (Panomics, Fremont, Calif.). This technology combines branched DNA signal amplification with xMAP (multi-analyte profiling) beads, to enable simultaneous quantification of multiple RNA targets directly from fresh, frozen or FFPE tissue samples, or purified RNA preparations. For further description of this technology, see, e.g., Flagella et al., 2006, Anal. Biochem. 352:50-60.
Practice of the methods disclosed herein is not limited to the use of any particular technology for generation of gene expression data. As discussed above, various accurate and reliable systems, including protocols, reagents and instrumentation are commercially available. Selection and use of a suitable system for generating gene expression data for use in the methods described herein is a design choice, and can be accomplished by a person of skill in the art, without undue experimentation.
Cluster Scores and Statistical Differences between Populations
A cluster score for any given transcription cluster in each tissue sample can be calculated according to the following algorithm:
cluster score=(1/n)*(E1+E2+ . . . +En)
wherein E1, E2, . . . En are the relative expression values obtained with respect to each of the n genes representing each transcription cluster.
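By way of illustration only, the cluster score calculation can be sketched in Python as follows. The sketch assumes that the cluster score is the arithmetic mean of the relative expression values, which is consistent with the “mean expression score” wording used later in this description; the gene names and expression values are hypothetical.

```python
from statistics import mean


def cluster_score(expression_by_gene, cluster_genes):
    """Mean relative expression over the genes representing one cluster."""
    values = [expression_by_gene[gene] for gene in cluster_genes]
    return mean(values)


# Hypothetical relative expression values for one tissue sample.
sample = {"GENE_A": 1.2, "GENE_B": 0.8, "GENE_C": 1.0, "GENE_D": -0.5}
score = cluster_score(sample, ["GENE_A", "GENE_B", "GENE_C"])
print(score)
```

Repeating the call for each transcription cluster and each tissue sample yields one score per cluster per sample, which is the input to the statistical comparison between populations.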
A cluster score can be calculated for each of the 51 transcription clusters in each tissue sample in the drug sensitive population and each member tissue sample in the drug resistant population.
Statistical significance can be calculated in various ways well-known in the art, e.g., a t-test or a Kolmogorov-Smirnov test. For example, a Student's t-test can be performed by using the cluster score of each individual and then calculating a p-value using a two-sample t-test between the drug sensitive population and the drug resistant population. See Example 2 below. Another suitable method is to perform a Kolmogorov-Smirnov test, as in the GSEA algorithm described in Subramanian, Tamayo et al., 2005, Proc. Nat'l Acad. Sci. USA 102:15545-15550. Statistical significance may also be calculated by applying Fisher's exact test (Fisher, 1922, J. Royal Statistical Soc. 85:87-94; Agresti, 1992, Statistical Science 7:131-153) to calculate a p-value between the drug sensitive population and the drug resistant population.
A statistically significant difference may be based on commonly used statistical cutoffs well-known in the art. For example, a statistically significant difference may be a p-value of less than or equal to 0.05, 0.01, 0.005 or 0.001. The p-value can be calculated using algorithms such as the Student's t-test, the Kolmogorov-Smirnov test, or Fisher's exact test. It is contemplated herein that determining a statistically significant difference, using a suitable algorithm, is within the skill in the art, and that the skilled person can select an appropriate statistical cutoff for determining significance, based on the drug and population (e.g., tumor sample or patient population) being tested.
In some embodiments, the correlation between expression of a transcription cluster and a phenotype of interest, e.g., drug resistance, is established through the use of expression measurements for all the genes in a transcription cluster. However, the use of expression measurements for all the genes in a transcription cluster is optional. In some embodiments, the correlation between expression of a transcription cluster and a phenotype is established through the use of expression measurements for a subset, i.e., a representative number of genes, from the transcription cluster. Subsets of a transcription cluster can be used reliably to represent the entire transcription cluster, because within each transcription cluster, the genes are expressed coherently. By definition, gene expression levels (as represented by transcript abundance) within a given transcription cluster are correlated. In general, a larger subset yields a more accurate cluster score, with the marginal increase in accuracy per additional gene decreasing as the size of the subset increases. A smaller subset provides convenience and economy. For example, if each transcription cluster is represented by 10 genes, the entire set of 51 transcription clusters can be effectively represented by only 510 probes, which can be incorporated into a single microarray chip, a single PCR kit, a single nCounter Analysis™ assay (NanoString® Technologies), or a single QuantiGene® Plex assay (Panomics, Fremont, Calif.), using technology that is currently available from commercial vendors.
Such a reduction in the number of probes can be advantageous in biomarker discovery projects, i.e., associating clinical phenotypes in oncology (drug response or prognosis) with specific sets of biologically relevant genes (biomarkers), and in clinical assays. Often, in clinical practice, small amounts of tissue are collected, without regard to preserving the integrity of the RNA in the sample. Consequently, the quantity and quality of RNA can be insufficient for precise measurement of the expression of large numbers of genes. By greatly reducing the number of genes to be assayed, e.g., a 100-fold reduction, the use of subsets of the transcription clusters enables robust transcription cluster analysis from small amounts of tissue that yield low-quality RNA.
The optimal number of genes employed to represent each transcription cluster can be viewed as a balance between assay robustness and convenience. When a subset of a transcription cluster is used, the subset preferably contains ten or more genes. The selection of a suitable number to be the representative number can be done by a person of skill in the art, without undue experimentation.
We sought to demonstrate with mathematical rigor, that essentially any subset of at least ten genes from any one of Transcription Clusters 1-51 would be a highly effective surrogate for the entire transcription cluster from which it was taken. In other words, we sought to determine whether any randomly selected 10-gene subset would yield an individual mean expression score highly correlated with the individual mean expression score calculated from expression scores for every member of the respective transcription cluster. To accomplish this, we generated 10,000 randomly chosen 10-gene subsets from each transcription cluster. Then we calculated the correlation between each of the 10,000 individual mean expression scores and the individual mean expression score for all genes of the transcription cluster.
Table 3 shows the worst correlation p-value of the 10,000 Pearson correlation comparisons for every transcription cluster. For each of the 51 transcription clusters, every one of the 10,000 randomly selected 10-gene subsets yields an individual mean expression score that is significantly correlated with the individual mean expression score calculated from the complete transcription cluster. This is a rigorous mathematical demonstration that essentially any 10-gene subset from any of the 51 transcription clusters is sufficiently representative of the entire transcription cluster, that it can be employed as a highly effective surrogate for the entire transcription cluster, thereby greatly reducing the number of gene expression measurements (and thus, the number of probes) needed to establish an association between a transcription cluster and a phenotype of interest.
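The random-subset analysis described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the analysis actually performed: it uses synthetic data in place of microarray data, it reports the lowest Pearson correlation coefficient rather than the worst p-value shown in Table 3, and all function names and parameters are hypothetical.

```python
import random


def avg(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = avg(x), avg(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def worst_subset_correlation(matrix, subset_size=10, n_subsets=10000, seed=0):
    """matrix[g][s] is the expression of gene g in sample s for one cluster.

    Returns the lowest Pearson r between the subset mean score and the
    full-cluster mean score across n_subsets randomly drawn subsets.
    """
    rng = random.Random(seed)
    n_genes, n_samples = len(matrix), len(matrix[0])
    full = [avg(matrix[g][s] for g in range(n_genes)) for s in range(n_samples)]
    worst = 1.0
    for _ in range(n_subsets):
        genes = rng.sample(range(n_genes), subset_size)
        sub = [avg(matrix[g][s] for g in genes) for s in range(n_samples)]
        worst = min(worst, pearson(sub, full))
    return worst


# Synthetic coherent cluster: 40 genes share one per-sample signal plus noise.
rng = random.Random(1)
signal = [rng.gauss(0, 1) for _ in range(30)]
cluster = [[s + rng.gauss(0, 0.5) for s in signal] for _ in range(40)]
worst = worst_subset_correlation(cluster, n_subsets=500)
print(worst)
```

Because the genes in a coherent cluster track a shared signal, even the least representative random subset remains strongly correlated with the full-cluster mean, which is the property the analysis above relies on.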
In a further example of subset-based embodiments, we demonstrated with mathematical rigor that, for any of the transcription clusters, any ten-gene subset comprising at least five genes from the subset representing that cluster in
In this demonstration, for each of the 51 transcription clusters, we generated 10,000 new ten-gene subsets wherein at least five genes were taken from the ten-gene subset representing that cluster in
A predictive gene set (PGS) is a multigene biomarker that is useful for classifying a type of tissue, e.g., a mammalian tumor, with respect to a particular phenotype. Examples of particular phenotypes are: (a) sensitive to a particular cancer drug; (b) resistant to a particular cancer drug; (c) likely to have a good outcome upon treatment (good prognosis); and (d) likely to have a poor outcome upon treatment (poor prognosis).
Disclosed herein is a general method for identifying novel predictive gene sets by using one or more of the 51 transcription clusters set forth herein. When a transcription cluster is shown to yield cluster scores significantly correlated with a phenotype of interest, the PGS is based on, or derived from, that transcription cluster. In some embodiments, the PGS includes all the genes in the transcription cluster. In other embodiments, the PGS includes only a subset of genes from the transcription cluster, rather than the entire transcription cluster. Preferably, a PGS identified using the methods described herein will include ten or more genes, e.g., 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 42, 44, 46, 48 or 50 genes from the transcription cluster.
In some embodiments, more than one transcription cluster is associated with a phenotype of interest. In such a situation, a PGS can be based on any one of the associated transcription clusters, or a multiplicity of the associated transcription clusters.
The predictive value of a PGS is achieved by measuring (with respect to a tissue sample) the expression levels of each of at least 10 of the genes in the PGS, and calculating a PGS score for the tissue sample according to the following algorithm:
PGS score=(1/n)*(E1+E2+ . . . +En)
wherein E1, E2, . . . En are the expression values of the n genes in the PGS.
Optionally, expression levels of additional genes, e.g., housekeeping genes to be used as internal standards, may be measured in addition to the PGS.
It should be noted that although the algorithms for calculating cluster scores and PGS scores are essentially the same, and both calculations involve gene expression values, a cluster score is not the same as a PGS score. The difference is in the context. A cluster score is associated with a sample of known phenotype, which sample is being used in a method of identifying a PGS. In contrast, a PGS score is associated with a sample of unknown phenotype, which sample is being tested and classified as to likely phenotype.
PGS scores are interpreted with respect to a threshold PGS score. PGS scores higher than the threshold PGS score will be interpreted as indicating a tissue sample classified as likely to have a first phenotype, e.g., a tumor likely to be sensitive to treatment with a particular drug. PGS scores lower than the threshold PGS score will be interpreted as indicating a tissue sample classified as likely to have a second phenotype, e.g., a tumor likely to be resistant to treatment with the drug. With respect to tumors, a given threshold PGS score may vary, depending on tumor type. In the context of the disclosed methods, the term “tumor type” takes into account (a) species (mouse or human); and (b) organ or tissue of origin. Optionally, tumor type further takes into account tumor categorization based on gene expression characteristics, e.g., HER2-positive breast tumors, or non-small cell lung tumors expressing a particular EGFR mutation.
For any given tumor type, an optimum threshold PGS score can be determined (or at least approximated) empirically by performing a threshold determination analysis. Preferably, threshold determination analysis includes receiver operating characteristic (ROC) curve analysis.
ROC curve analysis is a well-known statistical technique, the application of which is within ordinary skill in the art. For a discussion of ROC curve analysis, see generally Zweig et al., 1993, “Receiver operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine,” Clin. Chem. 39:561-577; and Pepe, 2003, The statistical evaluation of medical tests for classification and prediction, Oxford Press, New York.
PGS scores and the optimum threshold PGS score may vary from tumor type to tumor type. Therefore, a threshold determination analysis preferably is performed on one or more datasets representing any given tumor type to be tested using the disclosed methods. The dataset used for threshold determination analysis includes: (a) actual response data (response or non-response), and (b) a PGS score for each tumor sample from a group of human tumors or mouse tumors. Once a PGS score threshold is determined with respect to a given tumor type, that threshold can be applied to interpret PGS scores from tumors of that tumor type.
The ROC curve analysis is performed essentially as follows. Any sample with a PGS score greater than the threshold is identified as a non-responder. Any sample with a PGS score less than or equal to the threshold is identified as a responder. For every PGS score from a tested set of samples, “responders” and “non-responders” (hypothetical calls) are classified using that PGS score as the threshold. This process enables calculation of the true positive rate (TPR; y vector) and the false positive rate (FPR; x vector) for each potential threshold, through comparison of hypothetical calls against the actual response data for the data set. Then an ROC curve is constructed by making a dot plot, using the TPR vector and the FPR vector. If the ROC curve is above the diagonal from the (0, 0) point to the (1.0, 1.0) point, it shows that the PGS test result is a better test than random (see, e.g.,
The ROC curve can be used to identify the best operating point. The best operating point is the one that yields the best balance between the cost of false positives weighed against the cost of false negatives. These costs need not be equal. The average expected cost of classification at point x,y in the ROC space is denoted by the expression
C=(1−p)*alpha*x+p*beta*(1−y)
wherein:
alpha=cost of a false positive,
beta=cost of missing a positive (false negative), and
p=proportion of positive cases.
False positives and false negatives can be weighted differently by assigning different values for alpha and beta. For example, if the phenotypic trait of interest is drug response, and it is decided to include more patients in the responder group at the cost of treating more patients who are non-responders, one can put more weight on alpha. In the present case, it is assumed that the cost of a false positive and the cost of a false negative are the same (alpha equals beta). Therefore, the average expected cost of classification at point x,y in the ROC space is:
C′=(1−p)*x+p*(1−y).
The smallest C′ can be found by evaluating C′ at every (x, y) pair of false positive rate and true positive rate on the ROC curve. The optimum PGS score threshold is the PGS score corresponding to the (x, y) pair at which C′ is smallest. For example, as shown in Example 2, the optimum PGS score threshold, as determined using this approach, was found to be 1.62.
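The threshold determination described above can be sketched in Python as follows. The sketch is illustrative only and rests on stated assumptions: “positive” is taken to mean responder (the text does not say which class is treated as positive), the scores and response labels are hypothetical, and the function names are not part of the disclosed method.

```python
def roc_points(scores, is_responder):
    """Return (threshold, FPR, TPR) using each observed score as a threshold.

    Convention from the text: score > threshold is called a non-responder,
    score <= threshold is called a responder. "Positive" means responder,
    which is an assumption made for this sketch.
    """
    n_pos = sum(is_responder)
    n_neg = len(is_responder) - n_pos
    points = []
    for t in sorted(set(scores)):
        calls = [s <= t for s in scores]  # True = called responder
        tp = sum(c and r for c, r in zip(calls, is_responder))
        fp = sum(c and not r for c, r in zip(calls, is_responder))
        points.append((t, fp / n_neg, tp / n_pos))
    return points


def best_threshold(points, p, alpha=1.0, beta=1.0):
    """Threshold that minimizes C = (1-p)*alpha*x + p*beta*(1-y)."""
    def cost(point):
        _, x, y = point
        return (1 - p) * alpha * x + p * beta * (1 - y)
    return min(points, key=cost)[0]


# Hypothetical PGS scores and actual responses for eight tumors.
scores = [0.4, 0.9, 1.1, 1.5, 1.8, 2.2, 2.6, 3.0]
responder = [True, True, True, True, False, False, True, False]
pts = roc_points(scores, responder)
p = sum(responder) / len(responder)  # proportion of positive cases
print(best_threshold(pts, p))  # 1.5
```

With alpha and beta left equal, the calculation reduces to minimizing C′; passing unequal values shifts the chosen threshold toward fewer false positives or fewer false negatives.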
In addition to predicting whether a tumor will be sensitive or resistant to treatment with a particular drug, e.g., tivozanib, a PGS score provides an approximate, but useful, indication of how likely a tumor is to be sensitive or resistant, according to the magnitude of the PGS score.
The invention is further illustrated by the following examples. The examples are provided for illustrative purposes only, and are not to be construed as limiting the scope or content of the invention in any way.
A genetically diverse population of more than 100 murine breast tumors (BH archive) was used to identify tumors that are sensitive to a drug of interest (responders) and tumors that are resistant to the same drug of interest (non-responders). The BH archive was established by in vivo propagation and cryopreservation of primary tumor material from more than 100 spontaneous murine breast tumors derived from engineered chimeric mice that develop HER2-dependent, inducible spontaneous breast tumors.
The mice were produced essentially as follows. Ink4a homozygous null murine ES cells were co-transfected with the following four constructs, as separate fragments: MMTV-rtTA, TetO-HER2V659Eneu, TetO-luciferase and PGK-puromycin. ES cells carrying these constructs were injected into 3-day-old C57BL/6 blastocysts, which were transplanted into pseudo-pregnant female mice for gestation leading to birth of the chimeric mice. The mouse mammary tumor virus long terminal repeat (MMTV) was used to drive breast-specific expression of the reverse tetracycline transactivator (rtTA). The rtTA provided for breast-specific expression of the HER2 activated oncogene, when doxycycline was provided to the mice in their drinking water. Following induction of the tetracycline-responsive promoter by doxycycline, the mice developed invasive mammary carcinomas with a latency of about 2 to 6 months.
The BH archive of more than 100 tumors was produced essentially as follows. Primary tumor cells were isolated from the chimeric animals by physical disruption of the tumors using cell strainers. Typically 1×10⁵ cells were mixed with Matrigel (50:50 by vol.) and injected subcutaneously into female NCr nu/nu mice. When these tumors grew to approximately 500 mm³, which typically required 2 to 4 weeks, they were collected for one further round of in vivo propagation, after which tumor material was cryopreserved in liquid nitrogen. To characterize the propagated and archived tumors, 1×10⁵ cells from each individual tumor line were thawed and injected subcutaneously in BALB/c nude mice. When the tumors reached a mean size of 500 to 800 mm³, animals were sacrificed and tumors were surgically removed for further analysis.
The BH tumor archive was characterized at the tissue, cellular and molecular level. Analyses included general histopathology (architecture, cytology, desmoplasia, extent of necrosis, vasculature morphology), IHC (e.g., CD31 for tumor vasculature, Ki67 for tumor cell proliferation, signaling proteins for pathway activation), and global molecular profiling (microarray for RNA expression, array CGH for DNA copy number), as well as RNA and protein expression levels for specific genes (qRT-PCR, immunoassays). Such analyses revealed a remarkable degree of molecular variation, which was manifest in key phenotypic parameters such as tumor growth rate, microvasculature, and variable sensitivity to different cancer drugs.
For example, among the approximately 100 BH murine tumors, histopathologic analysis revealed subtypes each with distinct morphologic features including level of stromal cell involvement, cytokeratin staining, and cellular architecture. One subtype exhibited nested cytokeratin-positive, epithelial cells surrounded by collagen-positive, fibroblast-like stromal cells, along with slower proliferation rate, while a second subtype exhibited solid sheet, epithelioid malignant cells with little stromal involvement, and faster proliferation rates. These and other subtypes are also distinguishable by their gene expression profiles.
Tumors in the BH murine tumor archive were tested for sensitivity to treatment with tivozanib. Evaluation of tumor response to this drug treatment was performed essentially as follows. Subcutaneously transplanted tumors were established by injecting physically disrupted tumor cells (mixed with Matrigel) into 6-week-old female BALB/c nude mice. When the tumors reached approximately 100-200 mm³, 20 tumor-bearing mice were randomized into two groups. Group 1 received vehicle. Group 2 received tivozanib at 5 mg/kg daily by oral gavage. Tumors were measured twice per week by a caliper, and tumor volume was calculated.
These studies revealed significant tumor-to-tumor variation in growth inhibition in response to tivozanib. The variation in response was expected, because the mouse model tumors had been propagated from spontaneously arising tumors, and were therefore expected to contain differing sets of secondary de novo mutations that contributed to tumorigenesis. The variation in drug response was useful and desirable, because it modeled the tumor-to-tumor variation in drug response displayed by naturally occurring human tumors. Tivozanib-sensitive tumors and tivozanib-resistant tumors were identified (classified) on the basis of tumor growth inhibition, histopathology and IHC (CD31). Typically, tivozanib-sensitive tumors exhibited no tumor progression (by caliper measurement), and close to complete tumor killing, except for the peripheries, when the tumor-bearing mice were treated with 5 mg/kg tivozanib.
Messenger RNA (approx. 6 μg) from each tumor in the BH archive was amplified and hybridized, using a custom Agilent microarray (Agilent mouse 40K chip). Conventional microarray technology was used to measure the expression of approximately 40,000 genes in tissue samples from each of the 66 tumors. Comparison of the gene expression profile of a mouse tumor sample to control sample (universal mouse reference RNA from Stratagene, cat. #740100-41) was performed, and commercially available feature extraction software (Agilent Technologies, Santa Clara, Calif.) was used for feature extraction and data normalization.
Differences between tivozanib-sensitive tumors and tivozanib-resistant tumors, with respect to average (aggregate) expression of genes in different transcription clusters, were evaluated using a Student's t-test. The t-test was performed essentially as follows. Gene expression values from the microarray analysis described above were used to calculate a cluster score for each transcription cluster in each tumor. Then a p-value for each transcription cluster was calculated by applying a two-sample t-test comparing tivozanib-sensitive tumors and tivozanib-resistant tumors. False discovery rates (FDR) also were calculated. The p-values and false discovery rates for the ten highest-scoring transcription clusters are shown in Table 4.
Transcription clusters with a false discovery rate greater than 0.005 were eliminated from further consideration. Two transcription clusters, i.e., TC50 and TC48, were identified as having a false discovery rate lower than 0.005. TC50 was identified as having the lowest false discovery rate, i.e., 0.003. High expression of TC50 correlates with tivozanib resistance.
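The filtering step described above can be sketched in Python as follows. The source does not state which false discovery rate procedure was used, so the Benjamini-Hochberg adjustment shown here is an assumption; the cluster names and p-values are hypothetical and are not the values in Table 4. Only the 0.005 cutoff is taken from the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-cluster p-values from the two-sample t-test.
p_by_cluster = {"TC_a": 0.0001, "TC_b": 0.0004, "TC_c": 0.03, "TC_d": 0.2}
q = benjamini_hochberg(list(p_by_cluster.values()))
kept = [name for name, fdr in zip(p_by_cluster, q) if fdr < 0.005]
print(kept)  # ['TC_a', 'TC_b']
```

Clusters whose adjusted value falls below the cutoff are carried forward as candidate sources of a PGS; the rest are eliminated from further consideration.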
This example demonstrates the power of the disclosed method. In this example, mathematical analysis of conventional microarray expression profiling led to TC50, which is associated with certain subsets of myeloid cells that can mediate non-VEGF-dependent angiogenesis, thereby providing a mechanism of tivozanib resistance.
The predictive power of the tivozanib PGS (TC50) identified in Example 2 was evaluated in an experiment involving a population of 25 tumors previously classified as tivozanib-sensitive or tivozanib-resistant, based on actual drug response testing with tivozanib, as described in Examples 1 and 2. These 25 tumors were from a proprietary archive of primary mouse tumors in which the driving oncogene is HER2. In this example, the PGS employed was the following 10-gene subset from TC50:
MRC1
ALOX5AP
TM6SF1
CTSB
FCGR2B
TBXAS1
MS4A4A
MSR1
NCKAP1L
FLI1
A PGS score for each of the tumors was calculated from gene expression data obtained by conventional microarray analysis. We calculated the tivozanib PGS score according to the following algorithm:
wherein E1, E2, . . . En are the expression values of the n genes in the PGS.
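The equation referred to above does not appear in this text. A form consistent with the definitions given, offered here as an assumption rather than as a reproduction of the original, is the arithmetic mean of the n expression values:

```latex
\mathrm{PGS.score} = \frac{1}{n} \sum_{i=1}^{n} E_i
```

Under this assumed form, a tumor with high average expression of the ten TC50 genes receives a high PGS score.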
The data from this experiment are summarized as a waterfall plot shown in
When this threshold was applied, the test yielded a correct prediction of tivozanib-sensitivity (response) or tivozanib-resistance (non-response) for 22 out of the 25 tumors (
In this example, the Fisher's exact test p-value was 0.00722, which is the probability of observing this test result by chance alone. This p-value is 6.9-fold lower than the conventional cut-off for statistical significance, i.e., p=0.05.
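A Fisher's exact test on a 2x2 table of predicted versus actual response can be sketched as follows. The table counts behind the reported p-value are not given in this text, so the counts used here are hypothetical and are not the study's data. The sketch also assumes a two-sided test, which the text does not specify. The last line reproduces the fold comparison against p=0.05 from the reported p-value.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(k) for k in range(k_min, k_max + 1))
               if p <= observed * (1 + 1e-9))


# Hypothetical table: rows = predicted class, columns = actual class
p_value = fisher_exact_two_sided(3, 1, 1, 3)

# Fold difference between the 0.05 cut-off and the p-value reported in the text
fold_below_cutoff = 0.05 / 0.00722
```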
Tumors from the BH murine tumor archive were tested for sensitivity to treatment with rapamycin (also known as sirolimus, or RAPAMUNE®). Evaluation of tumor response to rapamycin treatment was performed essentially as follows. Subcutaneously transplanted tumors were established by injecting physically disrupted tumor cells (primary tumor material), mixed with Matrigel, into 6-week-old female BALB/c nude mice. When the tumors reached approximately 100-200 mm³, 20 tumor-bearing mice were randomized into two groups. Group 1 received vehicle. Group 2 received rapamycin at 0.1 mg/kg daily, by intraperitoneal injection. Tumors were measured twice per week with a caliper, and tumor volume was calculated. These studies revealed significant tumor-to-tumor variation in growth inhibition in response to rapamycin. Rapamycin-resistant tumors were defined as those exhibiting 50% tumor growth inhibition or less. Rapamycin-sensitive tumors were defined as those exhibiting more than 50% tumor growth inhibition. Out of 66 tumors tested, 41 were found to be rapamycin-sensitive, and 25 were found to be rapamycin-resistant.
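The sensitive/resistant classification can be sketched as follows. Only the 50% cut-off is taken from the text. The text gives no formula for tumor volume or for tumor growth inhibition (TGI), so both formulas below are assumptions based on common conventions, and all function names and numbers are hypothetical.

```python
def caliper_volume(length_mm, width_mm):
    """Tumor volume in mm^3 (assumed formula: length x width^2 / 2)."""
    return length_mm * width_mm ** 2 / 2


def percent_tgi(treated_volume, control_volume):
    """Percent tumor growth inhibition (assumed: 100 x (1 - treated / control))."""
    return 100.0 * (1.0 - treated_volume / control_volume)


def classify_response(tgi_percent, cutoff=50.0):
    """More than 50% inhibition is sensitive; 50% or less is resistant."""
    return "sensitive" if tgi_percent > cutoff else "resistant"


# Hypothetical mean end-of-study volumes for the treated and vehicle groups
tgi = percent_tgi(treated_volume=300.0, control_volume=1000.0)
label = classify_response(tgi)
```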
Preparation of mRNA from the tumors, and microarray analysis, were as described above in Example 2. To identify differences between rapamycin-sensitive and rapamycin-resistant tumors with respect to enrichment of expression of the 51 transcription clusters, we applied Gene Set Enrichment Analysis (GSEA) to the RNA expression data from the 41 rapamycin-sensitive tumors, and the 25 rapamycin-resistant tumors. (For a discussion of GSEA, see Subramanian et al., 2005, “Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles,” Proc. Natl. Acad. Sci. USA 102: 15545-15550.)
Application of GSEA to the RNA expression data revealed significant differences between the rapamycin-sensitive group and the rapamycin-resistant group, with respect to expression of the 51 transcription clusters. Table 6 (below) shows GSEA results for the sensitive group of tumors. When ranked by false discovery rate q-value, the transcription cluster most enriched for high expression was found to be TC33.
Table 7 (below) shows GSEA results for the resistant group of tumors. When ranked by false discovery rate q-value, the transcription cluster most enriched for high expression was found to be TC26.
The top enriched transcription cluster for rapamycin-sensitive tumors (TC33) and the top enriched transcription cluster for rapamycin-resistant tumors (TC26) were used to generate a 20-gene rapamycin PGS, which consists of 10 genes from TC33 and 10 genes from TC26. This particular rapamycin PGS contains the following 20 genes:
Since the PGS contains 10 genes that are up-regulated in sensitive tumors and 10 genes that are up-regulated in resistant tumors, the following algorithm was used to calculate the rapamycin PGS score:
wherein E1, E2, . . . Em are the expression values of the m-gene signature up-regulated in sensitive tumors (TC33); and wherein F1, F2, . . . Fn are the expression values of the n-gene signature upregulated in resistant tumors (TC26). In the example above, m is 10, and n is 10.
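The equation referred to above does not appear in this text. A form consistent with the definitions of E and F, offered here as an assumption rather than as a reproduction of the original, is the mean expression of the signature up-regulated in sensitive tumors minus the mean expression of the signature up-regulated in resistant tumors:

```latex
\mathrm{PGS.score} = \frac{1}{m} \sum_{i=1}^{m} E_i \;-\; \frac{1}{n} \sum_{j=1}^{n} F_j
```

Under this assumed form, a higher score indicates relatively higher TC33 expression and therefore a tumor more likely to be rapamycin-sensitive.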
The predictive power of the rapamycin PGS identified in Example 4 was evaluated in an experiment involving a population of 66 tumors previously classified as rapamycin-sensitive or rapamycin-resistant, based on actual drug response testing with rapamycin, as described in Example 4. These 66 tumors were from a proprietary archive of primary mouse tumors in which the driving oncogene is HER2. A rapamycin PGS score for each tumor was calculated from gene expression data obtained by conventional microarray analysis. The data from this experiment are summarized as a waterfall plot shown in
When this threshold was applied, the test yielded a correct prediction of rapamycin-sensitivity (response) or rapamycin-resistance (non-response) with regard to 45 out of the 66 tumors (
In this example, the Fisher's exact test p-value was 0.000815, which is the probability of observing this test result by chance alone. This p-value is 61.4-fold lower than the conventional cut-off for statistical significance, i.e., p=0.05.
A population of 295 breast tumors (NKI breast cancer dataset) was used to separate tumors that have a short interval to distant metastases (poor prognosis, metastasis within 5 years) from tumors that have a long interval to distant metastases (good prognosis, no metastasis within 5 years). Among the 295 NKI breast tumors, 196 samples were classified as good prognosis and 78 samples were classified as poor prognosis.
Differentially expressed gene sets representing biological pathways were identified when the 196 good prognosis tumors from the NKI breast dataset were compared against the 78 poor prognosis tumors from the same dataset. Differences in enrichment of pathway gene lists between good prognosis and poor prognosis tumors were evaluated by employing Gene Set Enrichment Analysis (GSEA) with respect to the 51 transcription clusters. This comparison demonstrated that, of the transcription clusters whose member genes exhibited a significant difference in expression, TC35 (associated with ribosomes) is the top over-expressed transcription cluster in the good prognosis group (Table 9).
TC26 (associated with proliferation) is the top over-expressed cluster in the poor prognosis group, as shown in the GSEA results presented in Table 10.
The most enriched transcription cluster for the good prognosis tumors (TC35), and the most enriched transcription cluster for the poor prognosis tumors (TC26) were used to generate a 20-gene breast cancer prognosis PGS, which consists of ten genes from TC35 and ten genes from TC26. This particular breast cancer PGS contains the following 20 genes:
Since the breast cancer prognosis PGS contains 10 genes that are up-regulated in good prognosis tumors and 10 genes that are up-regulated in poor prognosis tumors, the following algorithm was used to calculate the breast cancer prognosis PGS scores:
wherein E1, E2, . . . Em are the expression values of the m-gene signature up-regulated in good prognosis tumors (TC35); and wherein F1, F2, . . . Fn are the expression values of the n-gene signature upregulated in poor prognosis tumors (TC26). In the example above, m is 10, and n is 10.
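A score built from two opposing signatures, as defined above, can be sketched as follows. The equation itself is not reproduced in this text, so the sketch assumes the score is the mean of the E values minus the mean of the F values. The function name, gene names and expression values are hypothetical placeholders.

```python
from statistics import mean


def two_signature_score(expression, up_in_good, up_in_poor):
    """Assumed score: mean(E) - mean(F).

    expression : dict mapping gene name -> expression value for one tumor
    up_in_good : genes up-regulated in good prognosis tumors (the E values)
    up_in_poor : genes up-regulated in poor prognosis tumors (the F values)
    """
    e_mean = mean(expression[g] for g in up_in_good)
    f_mean = mean(expression[g] for g in up_in_poor)
    return e_mean - f_mean


# Hypothetical tumor with two genes per signature (the text uses m = n = 10)
tumor = {"good_1": 2.0, "good_2": 4.0, "poor_1": 1.0, "poor_2": 1.0}
score = two_signature_score(tumor, ["good_1", "good_2"], ["poor_1", "poor_2"])
```

Under this assumption, a higher score corresponds to relatively higher TC35 expression, the pattern associated above with good prognosis.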
The prognostic PGS identified in Example 6 (above) was validated in an independent breast cancer dataset, i.e., the Wang breast cancer dataset (Wang et al., 2005, Lancet 365:671-679). A population of 286 breast tumors from the Wang breast cancer dataset was used as an independent validation dataset. The samples in the Wang dataset had clinical annotation including Overall Survival Time and Event (dead or not). The 20-gene breast cancer prognostic PGS identified in Example 6 was an effective predictor of patient outcome. This is shown in
The following prophetic example illustrates in detail how the skilled person could use the disclosed methods to predict human response to tivozanib, using TaqMan® data.
With regard to a given tumor type (e.g., renal cell carcinoma), tumor samples (archival FFPE blocks, fresh samples or frozen samples) are obtained from human patients (indirectly through a hospital or clinical laboratory) prior to treatment of the patients with tivozanib. Fresh or frozen tumor samples are placed in 10% neutral-buffered formalin for 5-10 hours before being alcohol dehydrated and embedded in paraffin, according to standard histology procedures.
RNA is extracted from 10 μm FFPE sections. Paraffin is removed by xylene extraction followed by ethanol washing. RNA is isolated using a commercial RNA preparation kit. RNA is quantitated using a suitable commercial kit, e.g., the RiboGreen® fluorescence method (Molecular Probes, Eugene, Oreg.). RNA size is analyzed by conventional methods.
Reverse transcription is carried out using the SuperScript™ First-Strand Synthesis Kit for qRT-PCR (Invitrogen). Total RNA and pooled gene-specific primers are present at 10-50 ng/μl and 100 nM (each), respectively.
For each gene in the PGS, qRT-PCR primers are designed using commercial software, e.g., Primer Express® software (Applied Biosystems, Foster City, Calif.). The oligonucleotide primers are synthesized using a commercial synthesizer instrument and appropriate reagents, as recommended by the instrument manufacturer or vendor. Probes are labeled using a suitable commercial labeling kit.
TaqMan® reactions are performed in 384-well plates, using an Applied Biosystems 7900HT instrument according to the manufacturer's instructions. Expression of each gene in the PGS is measured in duplicate 5 μl reactions, using cDNA synthesized from 1 ng of total RNA per reaction well. Final primer and probe concentrations are 0.9 μM (each primer) and 0.2 μM, respectively. PCR cycling is carried out according to a standard operating procedure. To verify that the qRT-PCR signal is due to RNA rather than contaminating DNA, for each gene tested, a no RT control is run in parallel. The threshold cycle for a given amplification curve during qRT-PCR occurs at the point the fluorescent signal from probe cleavage grows beyond a specified fluorescence threshold setting. Test samples with greater initial template exceed the threshold value at earlier amplification cycles.
To compare gene expression levels across all the samples, normalization based on five reference genes (housekeeping genes whose expression level is similar across all samples of the evaluated tumor type) is used to correct for differences arising from variation in RNA quality, and total quantity of RNA, in each assay well. A reference CT (threshold cycle) for each sample is defined as the average measured CT of the reference genes. Normalized mRNA levels of test genes are defined as ΔCT, where ΔCT = reference CT minus test gene CT.
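The normalization just described can be sketched as follows. The reference gene names (REF1 to REF5) and all CT values are hypothetical placeholders; only the definitions of the reference CT and of ΔCT are taken from the text.

```python
from statistics import mean


def delta_ct(sample_ct, reference_genes, test_genes):
    """Return {test gene: reference CT - test gene CT} for one sample.

    The reference CT is the average measured CT of the reference genes.
    A larger delta-CT means the test gene crossed the threshold earlier,
    i.e., more starting template.
    """
    reference_ct = mean(sample_ct[g] for g in reference_genes)
    return {g: reference_ct - sample_ct[g] for g in test_genes}


# Hypothetical CT values for one tumor sample
sample = {
    "REF1": 20.0, "REF2": 21.0, "REF3": 22.0, "REF4": 23.0, "REF5": 24.0,
    "MRC1": 25.0, "CTSB": 20.0,
}
normalized = delta_ct(sample, ["REF1", "REF2", "REF3", "REF4", "REF5"], ["MRC1", "CTSB"])
```

The resulting ΔCT values would then serve as the expression values entering the PGS score.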
The PGS score for each tumor sample is calculated from the gene expression levels, according to the algorithm set forth above. The actual response data associated with tested tumor samples are obtained from the hospital or clinical laboratory supplying the tumor samples. Clinical response is typically defined in terms of tumor shrinkage, e.g., 30% shrinkage, as determined by suitable imaging technique, e.g., CT scan. In some cases, human clinical response is defined in terms of time, e.g., progression free survival time. The optimal threshold PGS score for the given tumor type is calculated, as described above. Subsequently, this optimal threshold PGS score is used to predict whether newly-tested human tumors of the same tumor type will be responsive or non-responsive to treatment with tivozanib.
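One way to choose a threshold PGS score from tumors with known response can be sketched as follows. The method used to compute the optimal threshold is not described in this part of the text, so the criterion here (maximizing the number of correct calls) is an assumption. The direction of the call follows the statement that high TC50 expression correlates with tivozanib resistance. The function name and the toy data are hypothetical.

```python
def best_threshold(scores, is_resistant):
    """Return (cutoff, number correct), calling 'resistant' when score > cutoff.

    Candidate cutoffs are the midpoints between adjacent distinct scores.
    Returns None if fewer than two distinct scores are supplied.
    """
    ordered = sorted(set(scores))
    candidates = [(low + high) / 2 for low, high in zip(ordered, ordered[1:])]
    best = None
    for cutoff in candidates:
        correct = sum((s > cutoff) == r for s, r in zip(scores, is_resistant))
        if best is None or correct > best[1]:
            best = (cutoff, correct)
    return best


# Toy data (hypothetical): PGS scores and known response for four tumors
toy_scores = [1.0, 2.0, 3.0, 4.0]
toy_resistant = [False, False, True, True]
cutoff, n_correct = best_threshold(toy_scores, toy_resistant)
```

A newly tested tumor of the same type would then be called non-responsive if its PGS score exceeds the chosen cutoff, and responsive otherwise.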
The entire disclosure of each of the patent documents and scientific articles cited herein is incorporated by reference for all purposes.
The invention can be embodied in other specific forms without departing from the essential characteristics thereof. The foregoing embodiments therefore are to be considered illustrative rather than limiting on the invention described herein. The scope of the invention is indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein.
This application is a divisional of U.S. patent application Ser. No. 13/669,275, filed Nov. 5, 2012, which claims the benefit of and priority to U.S. provisional application Ser. No. 61/579,530, filed Dec. 22, 2011; the entire contents of each of which are incorporated herein by reference.
Number | Date | Country |
---|---|---|
61579530 | Dec 2011 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 13669275 | Nov 2012 | US |
Child | 13775928 | | US |