CALCULATED INDEX OF GENOMIC EXPRESSION OF ESTROGEN RECEPTOR (ER) AND ER-RELATED GENES

Abstract
The present invention provides the identification and combination of genes that are expressed in tumors that are responsive to a given therapeutic agent and whose combined expression can be used as an index that correlates with responsiveness to that therapeutic agent. One or more of the genes of the present invention may be used as markers (or surrogate markers) to identify tumors that are likely to be successfully treated by that agent or class of agents such as hormonal or endocrine treatment.
Description
I. FIELD OF THE INVENTION

The present invention relates to the fields of medicine and molecular biology, particularly transcriptional profiling, molecular arrays and predictive tools for response to cancer treatment.


II. BACKGROUND

Endocrine treatments of breast cancer target the activity of estrogen receptor alpha (ER, gene name ESR1). The current challenges for treatment of patients with ER-positive breast cancer include the ability to predict benefit from endocrine (hormonal) therapy and/or chemotherapy, to select among endocrine agents, and to define the duration and sequence of endocrine treatments. These challenges are each conceptually related to the state of ER activity in a patient's breast cancer. Since ER acts principally at the level of transcriptional control, a genomic index to measure downstream ER-associated gene expression activity in a patient's tumor sample can help quantify ER pathway activity, and thus dependence on estrogen, and intrinsic sensitivity to endocrine therapy. Treatment-specific predictors can enable available multiplex genomic technology to provide a way to specifically address a distinct clinical decision or treatment choice.


SUMMARY OF THE INVENTION

Embodiments of the invention include methods of calculating an index, e.g., an estrogen receptor (ER) reporter index or a sensitivity to endocrine treatment (SET) index, for assessing the hormonal sensitivity of a tumor comprising one or more of the steps of: (a) obtaining gene expression data from samples obtained from a plurality of patients; (b) calculating one or more reference gene expression profiles from a plurality of patients with a specific diagnosis, e.g., cancer diagnosis; (c) normalizing the expression data of additional samples to the reference gene expression profile; (d) measuring and reporting estrogen receptor (ER) gene expression from the profile as a method for defining ER status of a cancer; (e) identifying the genes to define a profile to measure ER-related transcriptional activity in any cancer sample; (f) defining one or more reference ER-related gene expression profiles; (g) calculating a weighted index or index (e.g., a SET index) based on ER-related gene expression in any patient sample(s) and the ER-related reference profile; and/or (h) combining the measurements of ER gene expression and the index (e.g., weighted index or SET index) for ER-related gene expression to measure and report the gene expression of ER and ER-related transcriptional profile as a continuous or categorical result. Certain aspects include assessing the likely sensitivity of any cancer to treatment by measuring ER and ER-related gene expression singly or as a combined result. In certain embodiments, the cancer is suspected of being a hormone-sensitive cancer, preferably an estrogen-sensitive cancer. In certain aspects, the suspected estrogen-sensitive cancer is breast cancer. The ER-related genes may include one or more genes selected from two hundred ER-related genes or gene probes.
In certain aspects of the invention, ER-related genes or gene probes include 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, or 200 ER-related genes or gene probes. In particular embodiments one or more genes are selected from Table 1 or Table 2. The weighted or calculated index may be based on similarity with the reference ER-related gene expression profile(s). In a further aspect of the invention, similarity is calculated based on: (a) an algorithm to calculate a distance metric, such as one or a combination of Euclidean, Mahalanobis, or general Minkowski norms; and/or (b) calculation of a correlation coefficient for the sample based on expression levels or ranks of expression levels. The calculation of the weighted or reporter index may include various parameters (e.g., patient covariates) related to the disease condition including, but not limited to, the parameters or characteristics of tumor size, nodal status, grade, age, and/or evaluation of prognosis based on distant relapse-free survival (DRFS) or overall survival (OS) of patients.
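As an illustration of the similarity measures named above, the following minimal sketch computes a general Minkowski distance (which reduces to the Euclidean distance for p = 2) and a rank-based correlation between a sample's ER-related expression profile and a reference profile. The function names and the choice of Spearman's coefficient as the rank correlation are assumptions made for illustration and are not taken from the specification.

```python
import numpy as np
from scipy.stats import spearmanr


def minkowski_distance(sample, reference, p=2.0):
    """General Minkowski norm of the profile difference.

    p=2 gives the Euclidean distance and p=1 the Manhattan distance.
    """
    diff = np.abs(np.asarray(sample, dtype=float) - np.asarray(reference, dtype=float))
    return float(np.sum(diff ** p) ** (1.0 / p))


def rank_correlation(sample, reference):
    """Spearman correlation, i.e. a correlation computed on ranks of expression levels."""
    rho, _pvalue = spearmanr(sample, reference)
    return float(rho)
```

A smaller distance or a larger correlation indicates a sample profile that is closer to the reference ER-related profile; either quantity could serve as the similarity term in a calculated index.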


Embodiments of the invention include patients that are ER-positive and receiving hormonal therapy. In certain aspects the hormonal therapy includes, but is not limited to tamoxifen therapy and may include other known hormonal therapies used to treat cancers, particularly breast cancer. The treatment administered is typically a hormonal therapy, chemotherapy or a combination of the two. Additional aspects of the invention include evaluation of risk stratification of noncancerous cells and may be used to mitigate or prevent future disease. Still further aspects of the invention include normalization by a single digital standard. The method may further comprise normalizing expression data of the one or more samples to the ER-related gene expression profile. The expression data can be normalized to a digital standard. The digital standard can be a gene expression profile from a reference sample.


Further embodiments of the invention include methods of assessing patient sensitivity to treatment comprising one or more steps of: (a) determining expression levels of the ER gene and/or one or more additional ER-related genes; (b) calculating the value of the ER reporter index (e.g., a SET index); (c) assessing or predicting the response to hormonal therapy based on the value of the index; (d) assessing or predicting the response to an administered treatment (e.g., chemotherapy) based on the value of the index, and/or (e) selecting a treatment(s) for a patient based on consideration of the predicted responsiveness to hormonal therapy and/or chemotherapy.


Yet still further embodiments of the invention include a calculated index for predicting response (e.g., a response to treatment) produced by a method comprising the steps of: (a) obtaining gene expression data from samples obtained from a plurality of cancer patients; (b) normalizing the gene expression data; and (c) calculating an index (e.g., a weighted or SET index) based on the ER gene and one or more additional ER-related gene expression levels in the patient sample. In certain aspects the ER-related genes are selected as described supra. Parameters (e.g., patient covariates) used in conjunction with the calculation of the index include, but are not limited to, tumor size, nodal status, grade, age, evaluation of distant relapse-free survival (DRFS) or of overall survival (OS) of the patients, and various combinations thereof. Typically, the patients are ER-positive and receiving hormonal therapy, preferably tamoxifen therapy. The methods of the invention may also include treatment administered as a combination of one or more cancer drugs. In particular aspects, the treatment administered is a hormonal therapy, a chemotherapy, or a combination of hormonal therapy and chemotherapy.


Yet still further embodiments of the invention include a calculated index for predicting response to therapy for late-stage (recurrent) cancer produced by a method comprising the steps of: (a) obtaining gene expression data from samples obtained from a plurality of stage IV cancer patients; (b) normalizing the expression data; (c) calculating an index based on the ER gene and/or one or more additional ER-related gene expression levels in the patient sample; and (d) predicting response to therapy. Typically, the patients are ER-positive and have previously received, or are currently receiving, hormonal therapy. The methods of the invention may also include treatment administered as a combination of one or more cancer drugs. In particular aspects, the treatment administered is a hormonal therapy, a chemotherapy, or a combination of hormonal therapy and chemotherapy.


Other embodiments of the invention include methods of assessing, e.g., assessing quantitatively, the estrogen receptor (ER) status of a cancer sample by measuring transcriptional activity comprising two or more of the steps of: (a) obtaining a sample of cancerous tissue from a patient; (b) determining mRNA gene expression levels of the ER gene in the sample; (c) establishing a cut-off ER mRNA value from the distribution of ER transcripts in a plurality of cancer samples, and/or (d) assessing ER status based on the mRNA level of the ER gene in the sample relative to the pre-determined cut-off level of mRNA transcript. The sample may be a biopsy sample, a surgically excised sample, a sample of bodily fluids, a fine needle aspiration biopsy, core needle biopsy, tissue sample, or exfoliative cytology sample. In certain aspects, the patient is a cancer patient, a patient suspected of having hormone-sensitive cancer, a patient suspected of having an estrogen or progesterone sensitive cancer, and/or a patient having or suspected of having breast cancer. In further aspects of the invention, the expression levels of the genes are determined by hybridization, nucleic acid amplification, or array hybridization, such as nucleic acid array hybridization. In certain aspects the nucleic acid array is a microarray. In still further embodiments, nucleic acid amplification is by polymerase chain reaction (PCR).


Embodiments of the invention may also include kits for the determination of ER status of cancer comprising: (a) reagents for determining expression levels of the ER gene and/or one or more additional ER-related genes in a sample; and/or (b) algorithm and software encoding the algorithm for calculating an ER reporter index from expression of ER and ER-related genes in a sample to determine the sensitivity of a patient to hormonal therapy.


Other embodiments of the invention are discussed throughout this application. Any embodiment discussed with respect to one aspect of the invention applies to other aspects of the invention as well and vice versa. The embodiments in the Example section are understood to be embodiments of the invention that are applicable to all aspects of the invention.


The terms “inhibiting,” “reducing,” or “prevention,” or any variation of these terms, when used in the claims and/or the specification includes any measurable decrease or complete inhibition to achieve a desired result.


The use of the word “a” or “an” when used in conjunction with the term “comprising” in the claims and/or the specification may mean “one,” but it is also consistent with the meaning of “one or more,” “at least one,” and “one or more than one.”


Throughout this application, the term “about” is used to indicate that a value includes the standard deviation of error for the device or method being employed to determine the value.


The use of the term “or” in the claims is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and/or.”


As used in this specification and claim(s), the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.


Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating specific embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.




DESCRIPTION OF THE DRAWINGS

The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of the specific embodiments presented herein.



FIG. 1. Selection probabilities Pg(50), Pg(100), Pg(200) for the 200 top-ranking probe sets in terms of their Spearman's rank correlation with the ESR1 transcript (probe set 205225_at) plotted as a function of the probe set's rank in the original dataset. Probabilities were estimated from 1000 bootstrap samples of the original dataset.



FIG. 2. Distribution of ranks of the top 200 genes estimated from 1000 bootstrap replications of the original dataset as a function of the magnitude of the Spearman's rank correlation with the ESR1 transcript.



FIGS. 3A-3D. Distribution of the index of expression of the 200 ER-related genes by ER status for (FIG. 3A) 277 tamoxifen-treated patients and (FIG. 3B) 286 node-negative untreated patients. (FIGS. 3C and 3D) Dependence of ER gene expression index on ESR1 mRNA expression for patient populations corresponding to panels (FIG. 3A) and (FIG. 3B).



FIG. 4. Replicate measurements of ESR1 expression, PGR expression, ER reporter index and sensitivity to endocrine treatment (SET) index in 35 sample pairs of experimental replicates using residual RNA. Also shown is the 45° line through the origin. FIG. 4A (ESR1), FIG. 4B (PGR), FIG. 4C (ER Reporter Index), and FIG. 4D (SET Index).



FIGS. 5A-5C. Predicted marginal risk of distant relapse at 10 years in ER-positive breast cancer patients treated with adjuvant tamoxifen as a continuous function of genomic covariates: (FIG. 5A) ESR1 (ER) expression level, (FIG. 5B) log-transformed PGR expression level, and (FIG. 5C) genomic sensitivity to endocrine therapy (SET) index. The dashed lines show the 95% confidence interval of the predicted risk rates.



FIGS. 6A-6D. Kaplan-Meier estimates of relapse-free survival in ER-positive patients treated with adjuvant tamoxifen (FIG. 6A, FIG. 6C) or in patients not receiving systemic therapy after surgery (FIG. 6B, FIG. 6D). Groups were defined by the SET index (FIG. 6A, FIG. 6B) or the median-dichotomized log-transformed PGR expression (FIG. 6C, FIG. 6D). P-values are from the log-rank test.



FIGS. 7A-7B. Kaplan-Meier estimates of relapse-free survival in ER-positive patients treated with adjuvant tamoxifen grouped by nodal status: (FIG. 7A) node-negative group; (FIG. 7B) node-positive group. P-values are from the log-rank test.



FIGS. 8A-8D. Box plots demonstrate genomic measurements in 351 ER-positive samples categorized by AJCC Stage (58 stage I, 123 stage IIA, 107 stage IIB, 44 stage III, and 18 stage IV). Each box indicates the median and interquartile range, and the whisker lines extend 1.5× the interquartile range above the 75th percentile and below the 25th percentile. FIG. 8A=SET index; FIG. 8B=ESR1; FIG. 8C=Log PGR; FIG. 8D=GAPDH.




DETAILED DESCRIPTION OF THE INVENTION

It has already been established that the overall transcriptional profile in breast cancers is dependent on ER status, being largely determined in ER-positive breast cancer by the genomic activity of ER on the transcription of numerous genes (Perou et al., 2000; van't Veer et al., 2002; Gruvberger et al., 2001; Pusztai et al., 2003). The inventors contemplate that the amount of ER-associated reporter gene expression is an indicator of ER transcriptional activity, likely dependence on ER activity, and sensitivity to hormonal therapy. Differences in expression of ER mRNA (the receptor) and ER reporter genes (the transcriptional output) might contribute to variable response of patients with ER-positive breast cancers to hormonal therapy (Buzdar, 2001; Howell and Dowsett, 2004; Hess et al., 2003). Herein, a set of genes that are co-expressed with ER is defined from an independent public database of Affymetrix U133A gene profiles from 286 lymph node-negative breast cancers, and an index score for their expression is calculated (Wang et al., 2005). Another goal was to determine whether the expression level of the ESR1 gene, and the value of this index for expression of ER reporter (associated) genes, are associated with distant relapse-free survival (DRFS) in other patients following adjuvant hormonal therapy with tamoxifen.


There are four main approaches to improving the ability to predict responsiveness to endocrine therapies. One approach is a standard predictive or chemopredictive study focused on treatment, in which a sufficiently powered discovery population of subjects is used to define a predictive test that must then be proven to be accurate in a similarly sized validation population (Ransohoff, 2005; Ransohoff 2004). Several studies have used this approach to define predictive genes for adjuvant tamoxifen therapy (Ma et al., 2004; Jansen et al., 2005; Loi et al., 2005). There are advantages to this approach, particularly when samples are available from mature studies for retrospective analysis. But two disadvantages are that the study design is empirical and that adjuvant treatment introduces surgery as a confounding variable, because it is impossible to ever know which patients were cured by their surgery and would never relapse, irrespective of their sensitivity to systemic therapy. Neoadjuvant chemotherapy trials enable a direct comparison of tumor characteristics with pathologic response (Ayers et al., 2004). While an empirical study design is needed for chemopredictive studies of cytotoxic chemotherapy regimens because multiple cellular pathways are likely to be disrupted, endocrine therapy of breast cancer specifically targets ER-mediated tumor growth and survival. The compositions and methods of the present invention may define and measure this ER-mediated effect supplanting the need for a limited empirical study design.


A second approach is to identify genes that are downregulated in vivo after treatment with an endocrine agent. This involves a small sample size of patients who undergo repeat biopsies, but is complicated by the selection of agent and dose used, variable timing of downregulation of different genes after therapy, and variable treatment effect in different tumors.


A third approach is to quantify receptor expression as accurately as possible. Semiquantitative scoring of ER immunofluorescent/immunohistochemical (IFIC) staining is related to disease-free survival following adjuvant tamoxifen (Harvey et al., 1999). For example, measurement of 16 selected genes (mostly related to ER, proliferation, and HER-2) using RT-PCR in a central reference laboratory predicts survival of women with tamoxifen-treated node-negative breast cancer (Paik et al., 2004). In a recent report, measurement of ER mRNA using RT-PCR diagnoses ER IHC status with 93% overall accuracy (Esteva et al., 2005). It was also recently reported that ER mRNA measurements from the same RT-PCR assay predict survival after adjuvant tamoxifen (Paik et al., 2005). So, if gene expression microarrays can reliably measure ER mRNA in a way that can be standardized in different laboratories, those measurements should predict response to endocrine treatment. Certain aspects of the invention described herein demonstrate that measurements of ER mRNA expression levels from microarrays also predict distant relapse-free survival following adjuvant tamoxifen therapy (Tables 4 and 5, and FIG. 6). However, other gene expression measurements from the microarray are informative as well.


A fourth approach, selected by the inventors, measures ER gene expression and the transcriptional output from ER activity, taking advantage of the high-throughput microarray platform. This approach theoretically applies to all endocrine treatments and does not require the empirical discovery and validation study populations. If a continuous scale of endocrine responsiveness exists, then specific endocrine treatments could be matched to likely response. Some patients would have an excellent response from tamoxifen, but others may need more potent endocrine treatment to respond to the same extent. A challenge with this approach is to accurately define the number and correct ER reporter genes to measure. The approach was to define ER reporter genes from a large, independent data set of 286 breast cancer profiles from Affymetrix U133A arrays. It is not necessary that these patients receive endocrine treatment, or to know their immunohistochemical ER status or survival, in order to define the genes most correlated with ER gene expression. Even with the relatively large sample size of 286 cases, the inventors calculated that 200 genes should be included as reporter genes in order to contain the 50 most ER-related genes with 98.5% confidence and the 100 most related genes with about 90% confidence (FIG. 1). This demonstrates the importance of a sufficiently large reporter gene set to capture a reliable transcriptional signature for ER activity in breast cancers (Perou et al., 2000; Van't Veer et al., 2002; Gruvberger et al., 2001; Pusztai et al., 2003).
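The bootstrap reasoning behind the size of the reporter gene list can be sketched as follows. This is a minimal illustration, not the inventors' code: it resamples patients with replacement, re-ranks probe sets by the magnitude of their Spearman correlation with the ESR1 transcript, and reports how often each probe set lands in a list of a given size. The function names, the use of the absolute correlation for ranking, and the matrix layout (samples in rows, probe sets in columns) are assumptions.

```python
import numpy as np
from scipy.stats import rankdata


def spearman_with_target(expr, target):
    """Spearman correlation of every column of expr with the target vector."""
    ranks = rankdata(expr, axis=0)
    t_ranks = rankdata(target)
    rc = ranks - ranks.mean(axis=0)
    tc = t_ranks - t_ranks.mean()
    denom = np.sqrt((rc ** 2).sum(axis=0) * (tc ** 2).sum())
    return (rc * tc[:, None]).sum(axis=0) / denom


def selection_probabilities(expr, target, list_size, n_boot=1000, seed=0):
    """Fraction of bootstrap samples in which each probe set ranks within list_size."""
    expr = np.asarray(expr, dtype=float)
    target = np.asarray(target, dtype=float)
    rng = np.random.default_rng(seed)
    n_samples, n_probes = expr.shape
    counts = np.zeros(n_probes)
    for _ in range(n_boot):
        idx = rng.integers(0, n_samples, size=n_samples)  # resample patients
        rho = spearman_with_target(expr[idx], target[idx])
        top = np.argsort(-np.abs(rho))[:list_size]        # strongest correlations first
        counts[top] += 1
    return counts / n_boot
```

Under these assumptions, calling `selection_probabilities(expr, esr1, list_size=200)` and inspecting the values for the probe sets ranked in the top 50 of the original dataset gives the kind of selection probability plotted in FIG. 1.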


If quantitative measurements of the ER-related expression, expression of ER mRNA, and/or ER activity (represented by a calculated index of ER reporter gene expression) accurately predict benefit from hormonal therapy, it is possible to develop a continuous genomic scale of measurement for ER expression and activity. This scale could be used to identify subsets of patients with ER-positive breast cancer that: (1) are expected to benefit from tamoxifen alone, (2) require more potent endocrine therapy, (3) may require chemotherapy along with endocrine therapy, or (4) are unlikely to benefit from any endocrine therapy.


To assess expression of at least 5, 25, 50, 100 or 200 reporter (ER-related) genes in a sample, the inventors first developed a gene-expression-based ER associated index. ER-positive and ER-negative reference signatures, or centroids, were then described as the median log-transformed expression value of each of the 200 reporter genes in the 209 ER-positive and 77 ER-negative subjects, respectively. For new samples, the similarity between the log-transformed 200-gene ER associated gene expression signature and the reference centroids was determined based on Hoeffding's D statistic (Hollander and Wolfe, 1999). D takes into account the joint rankings of the two variables and thus provides a robust measure of association that, unlike correlation-based statistics, will detect nonmonotonic associations (in statistical terms, it detects a much broader class of alternatives to independence than correlation-based statistics). The ER reporter index (RI) was defined as the difference between the similarities with the ER+ and ER− reference centroids: RI = D+ − D−, where D+ and D− denote Hoeffding's D between the sample signature and the ER-positive and ER-negative centroids, respectively.
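The centroid and reporter index calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: Hoeffding's D is implemented with the commonly used scaling by 30 (so that D ranges from −0.5 to 1) and with the usual half and quarter contributions for ties, which the specification does not spell out; all function names are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata


def hoeffding_d(x, y):
    """Hoeffding's D measure of dependence (scaled by 30; assumed convention)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if n != y.size or n < 5:
        raise ValueError("need two equal-length vectors with at least 5 values")
    r = rankdata(x)  # ranks of x (average ranks for ties)
    s = rankdata(y)  # ranks of y
    # Bivariate rank: 1 + number of points below point i in both x and y;
    # a tie in one coordinate counts 1/2, a tie in both counts 1/4.
    lx = (x[None, :] < x[:, None]) + 0.5 * (x[None, :] == x[:, None])
    ly = (y[None, :] < y[:, None]) + 0.5 * (y[None, :] == y[:, None])
    q = 1.0 + (lx * ly).sum(axis=1) - 0.25  # subtract the point's own 1/4 term
    d1 = np.sum((q - 1) * (q - 2))
    d2 = np.sum((r - 1) * (r - 2) * (s - 1) * (s - 2))
    d3 = np.sum((r - 2) * (s - 2) * (q - 1))
    num = (n - 2) * (n - 3) * d1 + d2 - 2 * (n - 2) * d3
    return 30.0 * num / (n * (n - 1) * (n - 2) * (n - 3) * (n - 4))


def reference_centroids(log_expr, er_positive):
    """Median log-expression per reporter gene in ER-positive and ER-negative cases."""
    log_expr = np.asarray(log_expr, dtype=float)  # rows: samples, columns: genes
    er_positive = np.asarray(er_positive, dtype=bool)
    return (np.median(log_expr[er_positive], axis=0),
            np.median(log_expr[~er_positive], axis=0))


def reporter_index(sample_log_expr, pos_centroid, neg_centroid):
    """RI = D+ - D-: similarity to the ER+ centroid minus similarity to the ER- centroid."""
    return (hoeffding_d(sample_log_expr, pos_centroid)
            - hoeffding_d(sample_log_expr, neg_centroid))
```

Because D is computed from ranks only, the sketch gives the same RI for any strictly increasing rescaling of a sample's expression values, which is consistent with the index being usable across differently normalized datasets.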


The 200-gene signature of a tumor with high ER-dependent transcriptional activity will more closely resemble the ER-positive centroid; therefore D+ will be greater than D− (the similarity with the ER-negative centroid) and RI will be positive. The opposite will be the case for tumors with low ER-related activity, and thus RI will be small or negative. Subtraction of D− normalizes the reporter index relative to the basal levels of expression of the ER-related genes in ER-negative tumors. Because of this, and since D is a distribution-free statistic, RI is relatively insensitive to the method used to normalize the microarray data and therefore can be computed across datasets. From the RI, a genomic index of sensitivity to endocrine therapy (SET) was calculated as follows: SET = 100(RI + 0.2)^3. The offset of 0.2 translated RI to mostly positive values, and the shifted index was then transformed to normality using an unconditional Box-Cox power transformation. Finally, the maximum likelihood estimate of the exponent was rounded to the closest integer and the index was scaled to a maximum value of 10.
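The transformation from RI to the SET index can be sketched as follows. This is a minimal illustration under stated assumptions: the offset (0.2), exponent (3) and multiplier (100) are taken from the formula above, while the helper for estimating the exponent uses SciPy's Box-Cox maximum likelihood fit and drops non-positive shifted values, which is an assumption about how such a fit might be carried out. The function names are hypothetical.

```python
import numpy as np
from scipy.stats import boxcox


def set_index(ri, offset=0.2, exponent=3, scale=100.0):
    """SET = scale * (RI + offset) ** exponent, with the defaults from the text."""
    return scale * (np.asarray(ri, dtype=float) + offset) ** exponent


def estimate_integer_exponent(ri_values, offset=0.2):
    """Maximum likelihood Box-Cox exponent for the shifted RI, rounded to an integer.

    Box-Cox requires strictly positive data, so non-positive shifted values
    are dropped here (an assumption made for this sketch).
    """
    shifted = np.asarray(ri_values, dtype=float) + offset
    shifted = shifted[shifted > 0]
    _transformed, lam = boxcox(shifted)
    return int(round(lam))
```

For example, an RI of 0 maps to a SET value of 100 × 0.2^3 = 0.8, and an RI of about 0.26 maps to a value just under 10, the stated maximum of the scale.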


Embodiments of the present invention also provide a clinically relevant measurement of estrogen receptor (ER) activity within cells by accurately quantifying the transcriptional output due to estrogen receptor activity. This measure or index of the ER pathway or ER activity is an index or measure of the dependence on this growth pathway, and therefore, likely susceptibility to an anti-estrogen receptor hormonal therapy. There are a growing number of hormonal therapies that are used for patients with cancer or to protect from cancer and that vary in their efficacy, cost, and side effects. Aspects of the invention will assist doctors to make improved recommendations about whether and how long to use hormonal therapy for patients with breast cancer or ER-positive breast cancer, particularly those with ER-positive status as established by the existing immunochemical assay, and which hormonal therapy to prescribe for a patient based on the amount of ER-related transcriptional activity measured from a patient's biopsy that indicates the likely sensitivity to hormonal therapy and so matches the treatment selected to the predicted sensitivity to treatment.


Embodiments of the invention are pathway-specific, are applicable to any sample cohort, and are not dependent on inherent biostatistical bias that can limit the accuracy of predictive profiles derived empirically from discovery and validation trial designs linking genes to observed clinical or pathological responses. One advantage of the assay, in addition to its ability to link genomic activity to clinical or pathological response, is that it is quantitative, accurate, and directly comparable using results from different laboratories.


In one aspect of the invention, a calculated index is used to measure the expression of many genes that represent activity of the estrogen receptor pathway within the cells that provides independently predictive information about likely response to hormonal therapy, and that improves the response prediction otherwise obtained by measuring expression of the estrogen receptor alone. The invention includes the methods for standardizing the expression values of future samples to a normalization standard that will allow direct comparison of the results to past samples, such as from a clinical trial. The invention also includes the biostatistical methods to calculate and report the results.


In certain aspects of the invention, measurements of ER and ER-related genes from microarrays have been demonstrated to be comparable in standardized datasets from two different laboratories that analyzed two different types of clinical samples (fine needle aspiration cytology samples and surgical tissue samples) and to accurately diagnose ER status as defined by existing immunochemical assays. In further aspects of the invention, measurements of ER and ER-related genes using this technique have been demonstrated to independently predict distant relapse-free survival in patients who were treated with local therapy (surgery/radiation) followed by post-operative hormonal therapy with tamoxifen. In still further aspects, these gene expression measurements were demonstrated to outperform existing measurements of ER for prediction of survival with this hormonal therapy. In yet still further aspects, measurements of ER-related genes were demonstrated to add to the predictive accuracy of measurements of ER gene expression in the survival analysis of tamoxifen-treated women.


Further embodiments of the invention include kits for the measurement, analysis, and reporting of ER expression and transcriptional output. A kit may include, but is not limited to microarray, quantitative RT-PCR, or other genomic platform reagents and materials, as well as hardware and/or software for performing at least a portion of the methods described. For example, custom microarrays or analysis methods for existing microarrays are contemplated. Also, methods of the invention include methods of accessing and using a reporting system that compares a single result to a scale of clinical trial results. In yet still further aspects of the invention, a digital standard for data normalization is contemplated so that the assay result values from future samples would be able to be directly compared with the assay value results from past samples, such as from specific clinical trials.
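One way a digital standard could be applied is sketched below. The specification does not state the normalization algorithm, so this example assumes a rank-based (quantile) mapping in which each value of a new sample is replaced by the value at the same relative rank in a stored reference profile. The function name and the mapping itself are assumptions made for illustration only.

```python
import numpy as np
from scipy.stats import rankdata


def normalize_to_standard(sample, standard):
    """Map a sample's expression values onto the distribution of a reference profile.

    Assumed approach: quantile mapping. The k-th smallest sample value is
    replaced by the value at the matching relative position in the sorted
    standard (linear interpolation if the lengths differ).
    """
    sample = np.asarray(sample, dtype=float)
    ref_sorted = np.sort(np.asarray(standard, dtype=float))
    n, m = sample.size, ref_sorted.size
    if n < 2 or m < 2:
        raise ValueError("need at least two values in sample and standard")
    ranks = rankdata(sample, method="ordinal") - 1   # 0 .. n-1
    positions = ranks / (n - 1) * (m - 1)            # rescale to 0 .. m-1
    return np.interp(positions, np.arange(m), ref_sorted)
```

With such a mapping, every normalized sample shares the value distribution of the stored standard, so a result from a future sample can be placed on the same scale as results from past samples.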


The clinical relevance for measurements of ER mRNA and ER related genes from microarrays is also demonstrated herein. Some exemplary advantages to the current composition and methods include, but are not limited to: (1) standardized, quantitative reporting of ER mRNA expression that is comparable in different sample types and laboratories, (2) use of different methods for defining genomic profiles to predict response to adjuvant endocrine treatments, and (3) combining ER-related reporter genes expression to develop a measurable scale or index of estrogen dependence and likely sensitivity to endocrine therapy.


The performance of certain embodiments of a microarray-based ER determination is presented in relation to the current immunohistochemical “gold” standard for evaluation of ER. It is important to remember that IHC assays for ER in routine clinical use are imperfect. The existing IHC assay for ER has only modest positive predictive value (30-60%) for response to various single agent hormonal therapies (Bonneterre et al., 2000; Mouridsen et al., 2001). There are also occasional false negative results. Many of the recognized inter-laboratory differences that affect the IHC results for ER are caused in part by problems associated with tissue fixation methods and antigen retrieval in paraffin tissue sections (Rhodes et al., 2000; Rudiger et al., 2002; Rhodes, 2003; Taylor et al., 1994; Regitnig et al., 2002). Finally, IHC is at least a qualitative assay (reported as positive or negative) and at most a semiquantitative assay (reported as a score). There is still a need to further improve the accuracy with which pathologic assays for ER can predict response to endocrine therapies.


Microarrays provide a suitable method to measure ER expression from clinical samples. ER mRNA levels measured by microarrays, such as Affymetrix U133A gene chips, in fine needle aspirates (FNA), core needle biopsy, and/or frozen tumor tissue samples of breast cancer correlated closely with protein expression by enzyme immunoassay and by routine immunohistochemistry. This is consistent with the previously observed correlation between ER mRNA expression using Northern blot and ER protein expression (Lacroix et al., 2001). An expression level of ER mRNA (ESR1 probe set 205225_at) ≥ 500 correctly identified ER-positive tumors (IHC ≥ 10%) with overall accuracy of 96% (95% CI, 90%-99%) in the original set of 82 FNAs, and this threshold was validated with 95% overall accuracy (95% CI, 88%-98%) in an independent set of 94 tissue samples (see Table 3). If any ER staining is considered to be ER-positive, the overall accuracy was 98% for FNAs and 99% for tissues. These results indicate that ER status can be reliably determined from gene expression microarray data, with the advantage of providing comparable results from cytologic and surgical samples, and from different laboratories. With appropriately standardized methods for analysis of data, a microarray platform may also provide robust clinical information of ER status.
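The cut-off based ER call and its agreement with IHC can be sketched as follows. This is a minimal illustration: the cut-off of 500 is the value stated above, while the function names and the use of an exact (Clopper-Pearson) binomial interval are assumptions, since the specification does not say how the confidence intervals were computed.

```python
import numpy as np
from scipy.stats import beta

ESR1_CUTOFF = 500.0  # expression level of the ESR1 probe set stated in the text


def call_er_status(esr1_expression, cutoff=ESR1_CUTOFF):
    """Return True where the ESR1 expression level is at or above the cut-off."""
    return np.asarray(esr1_expression, dtype=float) >= cutoff


def overall_accuracy(called_positive, ihc_positive, level=0.95):
    """Agreement with IHC and an exact binomial (Clopper-Pearson) interval (assumed method)."""
    called = np.asarray(called_positive, dtype=bool)
    truth = np.asarray(ihc_positive, dtype=bool)
    n = truth.size
    k = int(np.sum(called == truth))  # number of concordant calls
    alpha = 1.0 - level
    lower = float(beta.ppf(alpha / 2, k, n - k + 1)) if k > 0 else 0.0
    upper = float(beta.ppf(1 - alpha / 2, k + 1, n - k)) if k < n else 1.0
    return k / n, lower, upper
```

Applied to a cohort, `call_er_status` gives the microarray-based ER call for each sample, and `overall_accuracy` summarizes how often that call matches the IHC result.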


ER-positive breast cancer includes a continuum of ER expression that might reflect a continuum of biologic behavior and endocrine sensitivity. Others have reported that some breast cancers are difficult to predict as ER-positive based on transcriptional profile and described non-estrogenic growth effects, such as HER-2, more frequently in this small subset of tumors with aggressive natural history (Kun et al., 2003). Indeed, ER mRNA levels are lower in breast cancers that are positive for both ER and HER2 (Konecny et al., 2003). Another group defined a gene expression signature from cDNA arrays that could predict ER protein levels (enzyme immunoassay) and another signature that predicted flow cytometric S-phase measurements (Gruvberger et al., 2004). Their finding of a reciprocal relationship supports the concept that less ER-positive breast cancers are more proliferative. This relationship is also factored into the calculation of the Recurrence Score that adds the values for proliferation and HER-2 gene groups and subtracts the values for the ER gene group (Paik et al., 2004; Paik et al., 2005). Molecular classification from unsupervised cluster analysis shows the same thing by identifying subtypes of luminal-type (ER-positive) breast cancer (Sorlie et al., 2001). The inverse relationship between ER expression and genes associated with proliferation and other growth pathways is best explained by viewing differentiation as a continuum in which cells become increasingly less proliferative and more dependent on ER stimulation as they differentiate. It follows that there would be an inverse relationship between greater sensitivity to endocrine therapy in differentiated tumors and greater sensitivity to chemotherapy in less differentiated tumors. Measurements along this scale could be valuable for treatment selection.


Randomized clinical trials have demonstrated a survival benefit for some patients who receive additional endocrine therapy with an aromatase inhibitor (compared to placebo) after 5 years of adjuvant tamoxifen (Goss et al., 2003; Bryant and Wolmark, 2003). Although there was a 24% relative reduction in deaths after 2.4 years of letrozole, the absolute difference in recurrence or new primaries was only 2.2% at 2.4 years (Goss et al., 2003; Burnstein, 2003). Without a test to identify patients who actually benefit from prolonged adjuvant endocrine therapy, a decision to routinely extend adjuvant endocrine treatment (possibly for an indefinite period) in all women with ER-positive cancer could be a costly and potentially avoidable practice for the healthcare community that would benefit an unidentified minority (Buzdar, 2001). It is therefore helpful to consider that this genomic SET index of ER-associated gene expression might identify patients with intermediate endocrine sensitivity as candidates for extended adjuvant endocrine therapy.


A genomic scale of intrinsic endocrine sensitivity might also provide an improved scientific basis for selection of the most appropriate subjects for inclusion in clinical trials. The ATAC and BIG 1-98 trials enrolled 9,366 and 8,010 postmenopausal women, respectively, and both demonstrated 3% absolute improvement in disease-free survival (DFS) at 5 years from adjuvant aromatase inhibition, compared to tamoxifen (Howell et al., 2005; Thurlimann et al., 2005). Aromatase inhibition as first-line endocrine treatment for all postmenopausal women with ER-positive breast cancer would achieve this survival benefit in 3% of patients at significant cost, and might relegate an effective and less expensive treatment (tamoxifen) to relative obscurity. It is also likely that identification of potentially informative subjects, based on predicted partial endocrine sensitivity from indicators such as the SET index, could reduce the size and cost of adjuvant trials, demonstrate larger absolute survival benefit from improved treatment, and establish who should receive each treatment in routine practice after a positive trial result.
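The arithmetic behind the trial-size argument can be sketched as follows. The 3% absolute improvement is taken from the text; the number-needed-to-treat calculation is the standard reciprocal of the absolute benefit, and the enrichment fraction of 25% is a purely hypothetical assumption used to show how restricting enrollment to a sensitive subset would raise the absolute benefit observed within that subset.

```python
def number_needed_to_treat(absolute_benefit: float) -> float:
    """Reciprocal of the absolute benefit (expressed as a fraction between 0 and 1)."""
    if not 0.0 < absolute_benefit <= 1.0:
        raise ValueError("absolute_benefit must be in the interval (0, 1]")
    return 1.0 / absolute_benefit


def benefit_in_enriched_subset(overall_benefit: float, subset_fraction: float) -> float:
    """Absolute benefit within a subset, assuming all of the benefit arises in that subset."""
    if not 0.0 < subset_fraction <= 1.0:
        raise ValueError("subset_fraction must be in the interval (0, 1]")
    return overall_benefit / subset_fraction


overall = 0.03  # 3% absolute improvement in DFS at 5 years, from the text
print(round(number_needed_to_treat(overall), 1))

# Hypothetical assumption: the whole benefit is confined to 25% of enrolled patients.
enriched = benefit_in_enriched_subset(overall, subset_fraction=0.25)
print(round(enriched, 2), round(number_needed_to_treat(enriched), 1))
```

Under that assumption, a trial restricted to the informative subset would observe a larger absolute difference and so could be smaller; the real fraction of informative patients is not stated in the text.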


As the cost and complexity of endocrine therapy increase, diagnostic tools are needed not merely for prognosis but, with a strong biological rationale, to demonstrate clinical benefit when they are used to guide the selection and duration of endocrine therapy. Indicators such as the SET index can predict response to tamoxifen rather than intrinsic prognosis, and should be independent of stage, grade, and the expression levels of ESR1 and PGR. Continuing validation of the SET index with samples from trials of other hormonal agents would help refine this clinical interpretation.

TABLE 1. Reporter genes for ER-related genomic activity and use in calculating index

Rank | Probe Set ID | Unigene ID | Gene Symbol | Rs | Pg (200)
1 | 209603_at | 169946 | GATA3 | 0.783 | 1.000
2 | 215304_at | 159264 |  | 0.779 | 1.000
3 | 218195_at | 15929 | C6orf211 | 0.774 | 1.000
4 | 212956_at | 411317 | KIAA0882 | 0.771 | 1.000
5 | 209604_s_at | 169946 | GATA3 | 0.764 | 1.000
6 | 202088_at | 79136 | SLC39A6 | 0.757 | 1.000
7 | 209602_s_at | 169946 | GATA3 | 0.749 | 1.000
8 | 212496_s_at | 301011 | JMJD2B | 0.733 | 1.000
9 | 212960_at | 411317 | KIAA0882 | 0.724 | 1.000
10 | 215867_x_at | 5344 | AP1G1 | 0.724 | 1.000
11 | 214164_x_at | 512620 | CA12 | 0.721 | 1.000
12 | 203963_at | 512620 | CA12 | 0.719 | 1.000
13 | 41660_at | 252387 | CELSR1 | 0.709 | 1.000
14 | 218259_at | 151076 | MRTF-B | 0.695 | 1.000
15 | 204667_at | 163484 | FOXA1 | 0.689 | 1.000
16 | 211712_s_at | 430324 | ANXA9 | 0.684 | 1.000
17 | 218532_s_at | 82273 | FLJ20152 | 0.677 | 1.000
18 | 212970_at | 15740 | FLJ14001 | 0.677 | 1.000
19 | 209459_s_at | 1588 | ABAT | 0.676 | 0.999
20 | 204508_s_at | 512620 | CA12 | 0.675 | 1.000
21 | 218976_at | 260720 | DNAJC12 | 0.673 | 0.998
22 | 217838_s_at | 241471 | EVL | 0.673 | 1.000
23 | 218211_s_at | 297405 | MLPH | 0.669 | 1.000
24 | 222275_at | 124165 | MRPS30 | 0.666 | 1.000
25 | 218471_s_at | 129213 | BBS1 | 0.666 | 0.999
26 | 214053_at | 7888 |  | 0.666 | 0.999
27 | 203438_at | 155223 | STC2 | 0.664 | 1.000
28 | 213234_at | 6189 | KIAA1467 | 0.664 | 0.999
29 | 219197_s_at | 435861 | SCUBE2 | 0.657 | 0.999
30 | 212692_s_at | 209846 | LRBA | 0.657 | 0.999
31 | 200711_s_at | 171626 | SKP1A | 0.654 | 1.000
32 | 205074_at | 15813 | SLC22A5 | 0.653 | 1.000
33 | 203685_at | 501181 | BCL2 | 0.653 | 1.000
34 | 209460_at | 1588 | ABAT | 0.653 | 0.999
35 | 222125_s_at | 271224 | PH-4 | 0.651 | 1.000
36 | 204798_at | 407830 | MYB | 0.651 | 0.999
37 | 212985_at | 15740 | FLJ14001 | 0.648 | 1.000
38 | 203929_s_at | 101174 | MAPT | 0.647 | 0.998
39 | 202089_s_at | 79136 | SLC39A6 | 0.642 | 0.997
40 | 205696_s_at | 444372 | GFRA1 | 0.639 | 0.997
41 | 209681_at | 30246 | SLC19A2 | 0.637 | 0.999
42 | 212495_at | 301011 | JMJD2B | 0.637 | 0.999
43 | 218510_x_at | 82273 | FLJ20152 | 0.634 | 0.995
44 | 208682_s_at | 376719 | MAGED2 | 0.632 | 0.994
45 | 212195_at | 529772 |  | 0.630 | 0.997
46 | 51192_at | 29173 | SSH-3 | 0.630 | 0.999
47 | 40016_g_at | 212787 | KIAA0303 | 0.628 | 0.997
48 | 212638_s_at | 450060 | WWP1 | 0.627 | 0.994
49 | 218692_at | 354793 | FLJ20366 | 0.624 | 0.991
50 | 213077_at | 283283 | FLJ21940 | 0.623 | 0.985
51 | 203439_s_at | 155223 | STC2 | 0.623 | 0.995
52 | 212441_at | 79276 | KIAA0232 | 0.622 | 0.988
53 | 210652_s_at | 112949 | C1orf34 | 0.621 | 0.990
54 | 219981_x_at | 288995 | ZNF587 | 0.620 | 0.984
55 | 205186_at | 406050 | DNALI1 | 0.620 | 0.990
56 | 213627_at | 376719 | MAGED2 | 0.620 | 0.987
57 | 200670_at | 437638 | XBP1 | 0.617 | 0.985
58 | 218437_s_at | 30824 | LZTFL1 | 0.617 | 0.987
59 | 206754_s_at | 1360 | CYP2B6 | 0.616 | 0.985
60 | 209696_at | 360509 | FBP1 | 0.616 | 0.987
61 | 201826_s_at | 238126 | CGI-49 | 0.615 | 0.984
62 | 219833_s_at | 446047 | EFHC1 | 0.610 | 0.975
63 | 203928_x_at | 101174 | MAPT | 0.610 | 0.976
64 | 216092_s_at | 22891 | SLC7A8 | 0.609 | 0.985
65 | 200810_s_at | 437351 | CIRBP | 0.609 | 0.977
66 | 204811_s_at | 389415 | CACNA2D2 | 0.609 | 0.968
67 | 44654_at | 294005 | G6PC3 | 0.609 | 0.974
68 | 202371_at | 194329 | FLJ21174 | 0.608 | 0.970
69 | 209173_at | 226391 | AGR2 | 0.607 | 0.971
70 | 212196_at | 529772 |  | 0.606 | 0.953
71 | 210720_s_at | 324104 | APBA2BP | 0.606 | 0.965
72 | 204497_at | 20196 | ADCY9 | 0.605 | 0.965
73 | 214440_at | 155956 | NAT1 | 0.604 | 0.960
74 | 205009_at | 350470 | TFF1 | 0.603 | 0.964
75 | 204862_s_at | 81687 | NME3 | 0.601 | 0.971
76 | 219562_at | 3797 | RAB26 | 0.600 | 0.949
77 | 50965_at | 3797 | RAB26 | 0.599 | 0.951
78 | 218966_at | 111782 | MYO5C | 0.598 | 0.961
79 | 217979_at | 364544 | TM4SF13 | 0.596 | 0.972
80 | 209759_s_at | 403436 | DCI | 0.596 | 0.938
81 | 212637_s_at | 450060 | WWP1 | 0.594 | 0.951
82 | 218094_s_at | 256086 | C20orf35 | 0.592 | 0.954
83 | 219222_at | 11916 | RBKS | 0.592 | 0.941
84 | 202121_s_at | 12107 | BC-2 | 0.591 | 0.940
85 | 215001_s_at | 442669 | GLUL | 0.591 | 0.940
86 | 210085_s_at | 430324 | ANXA9 | 0.590 | 0.934
87 | 210958_s_at | 212787 | KIAA0303 | 0.589 | 0.940
88 | 201596_x_at | 406013 | KRT18 | 0.588 | 0.928
89 | 212209_at | 435249 | THRAP2 | 0.587 | 0.923
90 | 221139_s_at | 279815 | CSAD | 0.586 | 0.924
91 | 201384_s_at | 458271 | M17S2 | 0.586 | 0.910
92 | 213283_s_at | 416358 | SALL2 | 0.586 | 0.927
93 | 202908_at | 26077 | WFS1 | 0.585 | 0.917
94 | 219786_at | 121378 | MTL5 | 0.585 | 0.918
95 | 214109_at | 209846 | LRBA | 0.584 | 0.930
96 | 203791_at | 181042 | DMXL1 | 0.583 | 0.914
97 | 205012_s_at | 155482 | HAGH | 0.583 | 0.903
98 | 212492_s_at | 301011 | JMJD2B | 0.582 | 0.902
99 | 218026_at | 16059 | HSPC009 | 0.579 | 0.905
100 | 210272_at | 1360 | CYP2B6 | 0.579 | 0.897
101 | 204199_at | 432842 | RALGPS1 | 0.577 | 0.892
102 | 202752_x_at | 22891 | SLC7A8 | 0.577 | 0.886
103 | 217645_at | 531103 |  | 0.576 | 0.882
104 | 213419_at | 324125 | APBB2 | 0.576 | 0.888
105 | 219919_s_at | 29173 | SSH-3 | 0.575 | 0.861
106 | 213365_at | 248437 | MGC16943 | 0.574 | 0.861
107 | 219206_x_at | 126372 | CGI-119 | 0.574 | 0.883
108 | 221751_at | 388400 | PANK3 | 0.573 | 0.875
109 | 211596_s_at | 528353 | LRIG1 | 0.572 | 0.863
110 | 221963_x_at | 356530 |  | 0.572 | 0.867
111 | 202641_at | 182215 | ARL3 | 0.572 | 0.850
112 | 201754_at | 351875 | COX6C | 0.571 | 0.857
113 | 219741_x_at | 515644 | ZNF552 | 0.569 | 0.848
114 | 209224_s_at |  | NDUFA2 | 0.568 | 0.862
115 | 212099_at | 406064 | RHOB | 0.568 | 0.836
116 | 205794_s_at | 292511 | NOVA1 | 0.568 | 0.836
117 | 219913_s_at | 171342 | CRNKL1 | 0.568 | 0.816
118 | 204934_s_at | 432750 | HPN | 0.567 | 0.830
119 | 209341_s_at | 413513 | IKBKB | 0.567 | 0.816
120 | 204231_s_at | 528334 | FAAH | 0.567 | 0.817
121 | 203571_s_at | 511763 | C10orf116 | 0.567 | 0.807
122 | 204045_at | 95243 | TCEAL1 | 0.566 | 0.833
123 | 202636_at | 147159 | RNF103 | 0.566 | 0.788
124 | 202962_at | 15711 | KIF13B | 0.565 | 0.798
125 | 208865_at | 318381 | CSNK1A1 | 0.563 | 0.801
126 | 201825_s_at | 238126 | CGI-49 | 0.563 | 0.806
127 | 219686_at | 58241 | STK32B | 0.562 | 0.806
128 | 57540_at | 11916 | RBKS | 0.560 | 0.782
129 | 212416_at | 31218 | SCAMP1 | 0.559 | 0.801
130 | 201170_s_at | 171825 | BHLHB2 | 0.559 | 0.758
131 | 40093_at | 155048 | LU | 0.558 | 0.773
132 | 219414_at | 12079 | CLSTN2 | 0.557 | 0.761
133 | 209623_at | 167531 | MCCC2 | 0.556 | 0.758
134 | 202772_at | 444925 | HMGCL | 0.555 | 0.752
135 | 208517_x_at | 446567 | BTF3 | 0.553 | 0.734
136 | 213018_at | 21145 | ODAG | 0.552 | 0.764
137 | 204703_at | 251328 | TTC10 | 0.551 | 0.731
138 | 203801_at | 247324 | MRPS14 | 0.551 | 0.730
139 | 203246_s_at | 437083 | TUSC4 | 0.550 | 0.733
140 | 218769_s_at | 239154 | ANKRA2 | 0.549 | 0.740
141 | 203476_at | 82128 | TPBG | 0.549 | 0.706
142 | 217770_at | 437388 | PIGT | 0.548 | 0.736
143 | 35666_at | 32981 | SEMA3F | 0.547 | 0.694
144 | 212508_at | 24719 | MOAP1 | 0.546 | 0.686
145 | 208712_at | 371468 | CCND1 | 0.545 | 0.703
146 | 204863_s_at | 71968 | IL6ST | 0.544 | 0.710
147 | 204284_at | 303090 | PPP1R3C | 0.544 | 0.672
148 | 203628_at | 239176 | IGF1R | 0.544 | 0.674
149 | 200719_at | 171626 | SKP1A | 0.544 | 0.668
150 | 214919_s_at |  | MASK-BP3 | 0.544 | 0.669
151 | 205376_at | 153687 | INPP4B | 0.544 | 0.691
152 | 202263_at | 334832 | CYB5R1 | 0.543 | 0.674
153 | 218450_at | 294133 | HEBP1 | 0.543 | 0.660
154 | 213285_at | 146180 | LOC161291 | 0.543 | 0.666
155 | 209740_s_at | 264 | DXS1283E | 0.543 | 0.653
156 | 205380_at | 15456 | PDZK1 | 0.543 | 0.661
157 | 203144_s_at | 368916 | KIAA0040 | 0.543 | 0.656
158 | 214552_s_at | 390163 | RABEP1 | 0.542 | 0.660
159 | 202814_s_at | 15299 | HIS1 | 0.540 | 0.629
160 | 205776_at | 396595 | FMO5 | 0.539 | 0.633
161 | 217906_at | 415236 | KLHDC2 | 0.539 | 0.640
162 | 212148_at | 408222 | PBX1 | 0.539 | 0.620
163 | 220581_at | 287738 | C6orf97 | 0.538 | 0.643
164 | 200811_at | 437351 | CIRBP | 0.538 | 0.574
165 | 217894_at | 239155 | KCTD3 | 0.538 | 0.580
166 | 206197_at | 72050 | NME5 | 0.537 | 0.610
167 | 202454_s_at | 306251 | ERBB3 | 0.537 | 0.614
168 | 218394_at | 22795 | FLJ22386 | 0.535 | 0.601
169 | 201413_at | 356894 | HSD17B4 | 0.535 | 0.593
170 | 40569_at | 458361 | ZNF42 | 0.535 | 0.574
171 | 221856_s_at | 3346 | FLJ11280 | 0.535 | 0.576
172 | 210336_x_at | 458361 | ZNF42 | 0.534 | 0.584
173 | 211621_at | 99915 | AR | 0.533 | 0.573
174 | 204623_at | 82961 | TFF3 | 0.533 | 0.533
175 | 40148_at | 324125 | APBB2 | 0.533 | 0.581
176 | 212446_s_at | 387400 | LASS6 | 0.532 | 0.543
177 | 210735_s_at | 279916 | CA12 | 0.531 | 0.540
178 | 214924_s_at | 457063 | OIP106 | 0.531 | 0.561
179 | 203071_at | 82222 | SEMA3B | 0.531 | 0.522
180 | 213527_s_at | 301463 | LOC146542 | 0.530 | 0.531
181 | 208617_s_at | 82911 | PTP4A2 | 0.530 | 0.517
182 | 213249_at | 76798 | FBXL7 | 0.529 | 0.552
183 | 205645_at | 334168 | REPS2 | 0.529 | 0.520
184 | 208788_at | 343667 | ELOVL5 | 0.529 | 0.543
185 | 205769_at | 11729 | SLC27A2 | 0.528 | 0.501
186 | 213712_at | 246107 | ELOVL2 | 0.528 | 0.510
187 | 212697_at | 432850 | LOC162427 | 0.528 | 0.503
188 | 219900_s_at | 435303 | FLJ20626 | 0.528 | 0.485
189 | 213832_at | 23729 |  | 0.527 | 0.490
190 | 213049_at | 167031 | GARNL1 | 0.527 | 0.474
191 | 59437_at | 414028 | C9orf116 | 0.527 | 0.504
192 | 204072_s_at | 390874 | 13CDNA73 | 0.526 | 0.451
193 | 210108_at | 399966 | CACNA1D | 0.526 | 0.489
194 | 214855_s_at | 167031 | GARNL1 | 0.525 | 0.459
195 | 209662_at | 528302 | CETN3 | 0.525 | 0.441
196 | 219687_at | 58650 | MART2 | 0.525 | 0.470
197 | 217191_x_at |  | COX6CP1 | 0.524 | 0.440
198 | 203538_at | 13572 | CAMLG | 0.524 | 0.442
199 | 213702_x_at | 324808 | ASAH1 | 0.522 | 0.456
200 | 212744_at | 26471 | BBS4 | 0.522 | 0.458


In some aspects, without intending to be bound by any single theory, the ER reporter index can be of importance for tumors with high ER mRNA expression. If ER mRNA and the reporter index are both high, this can describe a highly endocrine-dependent state for which tamoxifen alone seems to be sufficient for prolonged survival benefit. Patients with high ER mRNA expression but a low reporter index appear to derive initial benefit from tamoxifen, but that benefit is not sustained over the long term. Those patients' tumors are likely to be partially endocrine-dependent and might benefit from more potent endocrine therapy in the adjuvant setting. Some women might also benefit from more potent endocrine therapy. A measurable scale of ER gene expression and genomic activity might be applicable to any endocrine therapy that targets ER or other hormonal receptor activity. The relation of an index to the efficacy of different endocrine therapies could be used to guide the selection of first-line treatment (e.g., chemotherapy versus endocrine therapy), to influence the selection of endocrine agent based on likely endocrine sensitivity, and possibly to re-evaluate endocrine sensitivity if ER-positive breast cancer recurs.
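The two states described above for tumors with high ER mRNA expression can be expressed as a simple decision rule. The following is a minimal sketch only: the cutoff values, the function name, and the category labels are hypothetical assumptions introduced for illustration, and tumors without high ER mRNA expression are deliberately left uncategorized because the paragraph does not characterize them.

```python
from enum import Enum


class EndocrineDependence(Enum):
    HIGH = "highly endocrine-dependent"
    PARTIAL = "partially endocrine-dependent"
    NOT_CHARACTERIZED = "not characterized by this rule"


def classify_endocrine_dependence(
    esr1_expression: float,
    reporter_index: float,
    esr1_cutoff: float,
    index_cutoff: float,
) -> EndocrineDependence:
    """Combine ER mRNA level and reporter index into a categorical result.

    Both cutoffs are caller-supplied, hypothetical values; the text does not
    define numeric boundaries for "high" ER mRNA or a "high" reporter index.
    """
    if esr1_expression < esr1_cutoff:
        return EndocrineDependence.NOT_CHARACTERIZED
    if reporter_index >= index_cutoff:
        return EndocrineDependence.HIGH
    return EndocrineDependence.PARTIAL


# Hypothetical inputs and cutoffs, for illustration only.
result = classify_endocrine_dependence(
    esr1_expression=2400.0, reporter_index=0.2, esr1_cutoff=1000.0, index_cutoff=0.5
)
print(result.value)
```

A continuous report of both measurements, rather than categories, would be an equally valid reading of the text; the categorical form is used here only because it is shorter to illustrate.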


Typically, for clinical utility, one would define the optimal probe set for ESR1 (ERα gene) on the Affymetrix U133A GeneChip™ to measure ER gene expression. The ESR1 205225_at probe set produces the highest median and greatest range of expression and the strongest correlation with ER status, because this probe set recognizes the most 3′ end of ESR1 (NetAffx search tool at www.affymetrix.com). The initial reverse transcription (RT) of mRNA sequences in each sample begins at the poly-A tail at the 3′ end of the mRNA. Therefore, the 3′ end is likely to be the most represented part of any mRNA sequence, and probes that target the 3′ end generally produce the strongest hybridization signal.
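The comparison of candidate probe sets by median and range of expression can be sketched as follows. The expression values are hypothetical and the function names are invented for the example; the sketch ranks candidates by median and then by range, and it omits the third criterion named above (correlation with ER status), which would require ER status labels for each sample.

```python
from statistics import median
from typing import Dict, Mapping, Sequence, Tuple


def summarize_probe(values: Sequence[float]) -> Dict[str, float]:
    """Median and range (maximum minus minimum) of one probe set across samples."""
    return {"median": median(values), "range": max(values) - min(values)}


def pick_probe_set(expression_by_probe: Mapping[str, Sequence[float]]) -> str:
    """Return the probe set with the highest median, breaking ties by range."""

    def sort_key(probe: str) -> Tuple[float, float]:
        summary = summarize_probe(expression_by_probe[probe])
        return (summary["median"], summary["range"])

    return max(expression_by_probe, key=sort_key)


# Hypothetical expression values across five samples; not data from the study.
candidates = {
    "205225_at": [120.0, 900.0, 2500.0, 4100.0, 60.0],
    "211233_x_at": [80.0, 300.0, 700.0, 950.0, 50.0],
    "215552_s_at": [40.0, 150.0, 400.0, 600.0, 30.0],
}
print(pick_probe_set(candidates))
```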


In other aspects of the invention, it is preferred that biostatistical methods be used that allow standardization of microarray data from any contributing laboratory. At present, direct comparison of IHC results for ER from multiple centers is difficult because technical staining methods differ, positive and negative tissue controls are laboratory-dependent, and interpretation of staining is subject to the judgment of the individual pathologist or the threshold setting of the image analysis system being used (Rhodes et al., 2000; Rhodes, 2003; Regitnig et al., 2002). Even in quantitative RT-PCR assays, the expression of genes of interest is calculated relative to only one or several intrinsic housekeeper genes in each assay. The techniques for RNA extraction from fresh samples and preparation for hybridization to Affymetrix microarrays are available from standardized laboratory protocols. However, it should not be overlooked that uniform normalization of microarray data from every breast cancer sample to a digital standard (e.g., U133A dCHIP dataset) will consistently calculate the expression of all genes of interest relative to the expression of thousands of intrinsic control genes. This availability of multiple controls to standardize expression levels of all genes on the microarray is a robust mathematical control that can explain the comparable results from measurements of ER mRNA expression levels in different sample types and in different laboratories. Adoption of an agreed dCHIP standard for data normalization of breast cancer samples using the Affymetrix U133A array could lead to a digital standard available to laboratories for clinical trials and for routine diagnostics.
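The idea of expressing every gene relative to a shared digital standard can be illustrated with a deliberately simplified sketch. This is not the dCHIP procedure: it is a single global scaling factor, chosen so that the median of a set of control genes in the sample matches the median of the same genes in the reference profile. The gene names, values, and function name are hypothetical.

```python
from statistics import median
from typing import Dict, Iterable, Mapping


def normalize_to_reference(
    sample: Mapping[str, float],
    reference: Mapping[str, float],
    control_genes: Iterable[str],
) -> Dict[str, float]:
    """Scale a sample so its control-gene median equals the reference median.

    Simplified stand-in for normalization to a digital standard; a real
    pipeline would use many more control genes and a non-linear fit.
    """
    controls = list(control_genes)
    if not controls:
        raise ValueError("at least one control gene is required")
    sample_median = median(sample[g] for g in controls)
    reference_median = median(reference[g] for g in controls)
    if sample_median <= 0:
        raise ValueError("control-gene median of the sample must be positive")
    factor = reference_median / sample_median
    return {gene: value * factor for gene, value in sample.items()}


# Hypothetical reference standard and sample; not data from the study.
reference_profile = {"ctrl_a": 100.0, "ctrl_b": 200.0, "ctrl_c": 300.0}
raw_sample = {"ctrl_a": 50.0, "ctrl_b": 100.0, "ctrl_c": 150.0, "ESR1": 400.0}
normalized = normalize_to_reference(raw_sample, reference_profile, reference_profile.keys())
print(normalized["ESR1"])
```

Because every laboratory would scale to the same reference profile, a fixed expression cutoff applied after normalization would carry the same meaning across sites, which is the point made in the paragraph above.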


The implications of establishing standard analysis tools for development of a useful clinical assay are clear. When diagnostic microarrays are introduced into the clinic through a central reference laboratory, then uniform data normalization and standardized experimental procedure require internal quality control procedures by the central laboratory. However, in a decentralized system where each center performs its own profiling following a standard procedure using the same microarray platform, a single digital standard should be available for data normalization. This allows different laboratories to generate data that is directly comparable to a common standard.

TABLE 2. Genes indicative of the responsiveness of a cancer cell to therapy

Probe.Set | Accession | Name | T-stat | P-val
203930_s_at | NM_016835.1 | Microtubule-associated protein | −6.42 | 5.25 × 10^-08
212745_s_at | AI813772 | Bardet-Biedl syndrome 4 | −6.25 | 9.40 × 10^-08
203928_x_at | NM_016835.1 | Microtubule-associated protein | −5.99 | 2.70 × 10^-07
206401_s_at | J03778.1 | Microtubule-associated protein | −5.73 | 7.02 × 10^-07
203929_s_at | NM_016835.1 | Microtubule-associated protein | −5.52 | 1.26 × 10^-06
212207_at | AK023837.1 | KIAA1025 protein | −5.37 | 2.21 × 10^-06
212046_x_at | X60188.1 | Mitogen-activated protein kinase | −5.33 | 3.43 × 10^-06
210469_at | BC002915.1 | Discs, large (Drosophila) homol | −5.28 | 3.53 × 10^-06
205074_at | NM_003060.1 | Solute carrier family 22 (organ | −5.13 | 5.45 × 10^-06
204509_at | NM_017689.1 | Hypothetical protein FLJ20151 | −5.02 | 6.15 × 10^-06
205696_s_at | NM_005264.1 | GDNF family receptor alpha 1 | −5.00 | 1.06 × 10^-05
219741_x_at | NM_024762.1 | Hypothetical protein FLJ21603 | −4.94 | 1.00 × 10^-05
215616_s_at | AB020683.1 | KIAA0876 protein | −4.86 | 1.43 × 10^-05
208945_s_at | NM_003766.1 | Beclin 1 (coiled-coil, myosin-l | −4.86 | 1.48 × 10^-05
217542_at | BE930512 | ESTs | −4.80 | 1.84 × 10^-05
202204_s_at | AF124145.1 | Autocrine motility factor recep | −4.74 | 2.05 × 10^-05
204916_at | NM_005855.1 | Receptor (calcitonin) activity | −4.70 | 2.92 × 10^-05
218769_s_at | NM_023039.1 | Ankyrin repeat, family A (RFXAN | −4.70 | 2.58 × 10^-05
219981_x_at | NM_017961.1 | Hypothetical protein FLJ20813 | −4.66 | 4.44 × 10^-05
222131_x_at | BC004327.1 | Hypothetical protein BC014942 | −4.64 | 3.26 × 10^-05
213234_at | AB040900.1 | KIAA1467 protein | −4.60 | 3.73 × 10^-05
219197_s_at | AI424243 | CEGP1 protein | −4.57 | 3.45 × 10^-05
205425_at | NM_005338.3 | Huntington interacting protein | −4.51 | 8.86 × 10^-05
213504_at | W63732 | COP9 subunit 6 (MOV34 homolog, | −4.50 | 4.98 × 10^-05
201413_at | NM_000414.1 | Hydroxysteroid (17-beta) dehydr | −4.46 | 5.71 × 10^-05
203050_at | NM_005657.1 | Tumor protein p53 binding prote | −4.45 | 7.53 × 10^-05
212494_at | AB028998.1 | KIAA1075 protein | −4.43 | 9.46 × 10^-05
209173_at | AF088867.1 | Anterior gradient 2 homolog (Xe | −4.41 | 6.36 × 10^-05
201124_at | AL048423 | Integrin, beta 5 | −4.41 | 7.76 × 10^-05
205354_at | NM_000156.3 | Guanidinoacetate N-methyltransf | −4.39 | 8.11 × 10^-05
212444_at | AA156240 | Homo sapiens cDNA: FLJ22182 fis | −4.37 | 7.71 × 10^-05
205225_at | NM_000125.1 | Estrogen receptor 1 | −4.37 | 8.12 × 10^-05
211000_s_at | AB015706.1 | Interleukin 6 signal transducer | −4.36 | 9.16 × 10^-05
204012_s_at | AL529189 | KIAA0547 gene product | −4.36 | 8.63 × 10^-05
203682_s_at | NM_002225.2 | Isovaleryl Coenzyme A dehydroge | −4.35 | 7.60 × 10^-05
220357_s_at | NM_016276.1 | Serum/glucocorticoid regulated | −4.35 | 5.94 × 10^-05
216173_at | AK025360.1 | Homo sapiens cDNA: FLJ21707 fis | −4.32 | 7.65 × 10^-05
210230_at | BC003629.1 | RNA, U2 small nuclear | −4.26 | 9.95 × 10^-05
219044_at | NM_018271.1 | Hypothetical protein FLJ10916 | −4.25 | 1.75 × 10^-04
218761_at | NM_017610.1 | Likely ortholog of mouse Arkadi | −4.23 | 1.35 × 10^-04
210826_x_at | AF098533.1 | RAD17 homolog (S. pombe) | −4.22 | 1.44 × 10^-04
210831_s_at | L27489.1 | Prostaglandin E receptor 3 (sub | −4.22 | 1.07 × 10^-04
211233_x_at | M12674.1 | Estrogen receptor 1 | −4.21 | 1.20 × 10^-04
218807_at | NM_006113.2 | Vav 3 oncogene | −4.20 | 1.46 × 10^-04
210129_s_at | AF078842.1 | DKFZP434B103 protein | −4.19 | 1.09 × 10^-04
39313_at | AB002342 | Protein kinase, lysine deficien | −4.19 | 1.23 × 10^-04
213245_at | AL120173 | Homo sapiens cDNA FLJ30781 fis, | −4.18 | 1.43 × 10^-04
214053_at | AW772192 | Homo sapiens clone 23736 mRNA s | −4.18 | 1.51 × 10^-04
205352_at | NM_005025.1 | Serine (or cysteine) proteinase | −4.17 | 1.47 × 10^-04
213623_at | NM_007054.1 | Kinesin family member 3A | −4.15 | 1.88 × 10^-04
215304_at | U79293.1 | Human clone 23948 mRNA sequence | −4.13 | 1.40 × 10^-04
203009_at | NM_005581.1 | Lutheran blood group (Auberger | −4.13 | 1.80 × 10^-04
218692_at | NM_017786.1 | Hypothetical protein FLJ20366 | −4.13 | 1.76 × 10^-04
218976_at | NM_021800.1 | J domain containing protein 1 | −4.12 | 1.76 × 10^-04
201405_s_at | NM_006833.1 | COP9 subunit 6 (MOV34 homolog, | −4.11 | 1.63 × 10^-04
202168_at | NM_003187.1 | TAF9 RNA polymerase II, TATA bo | −4.11 | 2.01 × 10^-04
216109_at | AK025348.1 | Homo sapiens cDNA: FLJ21695 fis | −4.11 | 1.77 × 10^-04
219051_x_at | NM_024042.1 | Hypothetical protein MGC2601 | −4.10 | 2.34 × 10^-04
210908_s_at | AB055804.1 | Prefoldin 5 | −4.09 | 1.71 × 10^-04
221728_x_at | AK025198.1 | Homo sapiens cDNA FLJ30298 fis, | −4.07 | 2.11 × 10^-04
203187_at | NM_001380.1 | Dedicator of cyto-kinesis 1 | −4.06 | 2.22 × 10^-04
212660_at | AI735639 | KIAA0239 protein | −4.04 | 2.56 × 10^-04
212956_at | AB020689.1 | KIAA0882 protein | −4.01 | 2.27 × 10^-04
217838_s_at | NM_016337.1 | RNB6 | −4.01 | 2.14 × 10^-04
218621_at | NM_016173.1 | HEMK homolog 7 kb | −4.01 | 1.92 × 10^-04
201681_s_at | AB0111855.1 | Discs, large (Drosophila) homol | −4.01 | 2.49 × 10^-04
209884_s_at | AF047033.1 | Solute carrier family 4, sodium | −4.00 | 2.98 × 10^-04
201557_at | NM_014232.1 | Vesicle-associated membrane pro | −3.99 | 2.23 × 10^-04
219338_s_at | NM_017691.1 | Hypothetical protein FLJ20156 | −3.99 | 2.94 × 10^-04
217828_at | NM_024755.1 | Hypothetical protein FLJ13213 | −3.98 | 2.42 × 10^-04
209339_at | U76248.1 | Seven in absentia homolog 2 (Dr | −3.98 | 2.26 × 10^-04
214218_s_at | AV699347 | Homo sapiens cDNA FLJ30298 fis, | −3.97 | 2.82 × 10^-04
221643_s_at | AF016005.1 | Arginine-glutamic acid dipeptid | −3.96 | 2.57 × 10^-04
218211_s_at | NM_024101.1 | Melanophilin | −3.95 | 3.05 × 10^-04
221483_s_at | AF084555.1 | Cyclic AMP phosphoprotein, 19 k | −3.95 | 2.83 × 10^-04
211864_s_at | AF207990.1 | Fer-1-like 3, myoferlin (C. ele | −3.92 | 3.29 × 10^-04
202392_s_at | NM_014338.1 | Phosphatidylserine decarboxylas | −3.92 | 4.33 × 10^-04
214164_x_at | BF752277 | Adaptor-related protein complex | −3.91 | 3.52 × 10^-04
204862_s_at | NM_002513.1 | Non-metastatic cells 3, protein | −3.91 | 3.55 × 10^-04
215552_s_at | AI073549 | Estrogen receptor 1 | −3.91 | 3.33 × 10^-04
211235_s_at | AF258450.1 | Estrogen receptor 1 | −3.90 | 3.13 × 10^-04
210833_at | AL031429 | Prostaglandin E receptor 3 (sub | −3.89 | 3.06 × 10^-04
204660_at | NM_005262.1 | Growth factor, augmenter of liv | −3.89 | 2.79 × 10^-04
211234_x_at | AF258449.1 | Estrogen receptor 1 | −3.89 | 3.10 × 10^-04
201508_at | NM_001552.1 | Insulin-like growth factor bind | −3.88 | 4.04 × 10^-04
213527_s_at | AI350500 | Similar to hypothetical protein | −3.85 | 4.33 × 10^-04
202048_s_at | NM_014292.1 | Chromobox homolog 6 | −3.84 | 4.15 × 10^-04
206794_at | NM_005235.1 | v-erb-a erythroblastic leukemia | −3.84 | 3.87 × 10^-04
201798_s_at | NM_013451.1 | Fer-1-like 3, myoferlin (C. ele | −3.83 | 4.44 × 10^-04
213523_at | AI671049 | Cyclin E1 | 3.81 | 4.14 × 10^-04
209050_s_at | AI421559 | Ral guanine nucleotide dissocia | 3.83 | 4.07 × 10^-04
217294_s_at | U88968.1 | Enolase 1, (alpha) | 3.84 | 4.48 × 10^-04
201555_at | NM_002388.2 | MCM3 minichromosome maintenance | 3.84 | 4.41 × 10^-04
201030_x_at | NM_002300.1 | Lactate dehydrogenase B | 3.85 | 3.85 × 10^-04
202912_at | NM_001124.1 | Adrenomedullin | 3.86 | 3.59 × 10^-04
204050_s_at | NM_001833.1 | Clathrin, light polypeptide (Lc | 3.88 | 3.97 × 10^-04
202342_s_at | NM_015271.1 | Tripartite motif-containing 2 | 3.88 | 4.43 × 10^-04
209393_s_at | AF047695.1 | Eukaryotic translation initiati | 3.89 | 4.21 × 10^-04
219774_at | NM_019044.1 | Hypothetical protein FLJ10996 | 3.93 | 3.86 × 10^-04
204162_at | NM_006101.1 | Highly expressed in cancer, ric | 3.93 | 2.94 × 10^-04
216237_s_at | AA807529 | MCM5 minichromosome maintenance | 3.96 | 2.84 × 10^-04
214581_x_at | BE568134 | Tumor necrosis factor receptor | 3.99 | 3.07 × 10^-04
209408_at | U63743.1 | Kinesin-like 6 (mitotic centrom | 3.99 | 2.23 × 10^-04
208370_s_at | NM_004414.2 | Down syndrome critical region g | 4.02 | 2.94 × 10^-04
203744_at | NM_005342.1 | High-mobility group box 3 | 4.02 | 2.02 × 10^-04
209575_at | BC001903.1 | Interleukin 10 receptor, beta | 4.03 | 2.84 × 10^-04
200934_at | NM_003472.1 | DEK oncogene (DNA binding) | 4.05 | 2.54 × 10^-04
202341_s_at | AA149745 | Tripartite motif-containing 2 | 4.06 | 2.87 × 10^-04
200996_at | NM_005721.2 | ARP3 actin-related protein 3 ho | 4.06 | 2.42 × 10^-04
206392_s_at | NM_002888.1 | Retinoic acid receptor responde | 4.06 | 2.28 × 10^-04
206391_at | NM_002888.1 | Retinoic acid receptor responde | 4.07 | 2.52 × 10^-04
201797_s_at | NM_006295.1 | Valyl-tRNA synthetase 2 | 4.07 | 2.17 × 10^-04
209358_at | AF118094.1 | TAF11 RNA polymerase II, TATA b | 4.07 | 2.34 × 10^-04
209201_x_at | L01639.1 | Chemokine (C-X-C motif) recepto | 4.09 | 2.80 × 10^-04
209016_s_at | BC002700.1 | Keratin 7 | 4.14 | 1.69 × 10^-04
221957_at | BF939522 | Pyruvate dehydrogenase kinase, | 4.15 | 2.22 × 10^-04
218350_s_at | NM_015895.1 | Geminin, DNA replication inhibi | 4.16 | 1.64 × 10^-04
201897_s_at | NM_001826.1 | p53-regulated DDA3 | 4.21 | 1.36 × 10^-04
209642_at | AF043294.2 | BUB1 budding uninhibited by ben | 4.22 | 1.22 × 10^-04
201930_at | NM_005915.2 | MCM6 minichromosome maintenance | 4.23 | 1.16 × 10^-04
202870_s_at | NM_001255.1 | CDC20 cell division cycle 20 ho | 4.23 | 1.07 × 10^-04
221485_at | NM_004776.1 | UDP-Gal:betaGlcNAc beta 1,4- ga | 4.26 | 1.08 × 10^-04
211919_s_at | AF348491.1 | Chemokine (C-X-C motif) recepto | 4.27 | 1.61 × 10^-04
218887_at | NM_015950.1 | Mitochondrial ribosomal protein | 4.27 | 8.93 × 10^-05
216295_s_at | X81636.1 | H.sapiens clathrin light chain | 4.28 | 1.17 × 10^-04
218726_at | NM_018410.1 | Hypothetical protein DKFZp762E1 | 4.28 | 1.19 × 10^-04
204989_s_at | BF305661 | Integrin, beta 4 | 4.30 | 1.01 × 10^-04
221872_at | AI669229 | Retinoic acid receptor responde | 4.31 | 1.12 × 10^-04
206746_at | NM_001195.2 | Beaded filament structural prot | 4.32 | 9.33 × 10^-05
201231_s_at | NM_001428.1 | Enolase 1, (alpha) | 4.42 | 5.76 × 10^-05
204203_at | NM_001806.1 | CCAAT/enhancer binding protein | 4.42 | 6.44 × 10^-05
211555_s_at | AF020340.1 | Guanylate cyclase 1, soluble, b | 4.47 | 5.11 × 10^-05
202200_s_at | NM_003137.1 | SFRS protein kinase 1 | 4.47 | 5.17 × 10^-05
213101_s_at | Z78330 | Homo sapiens mRNA; cDNA DKFZp68 | 4.49 | 7.76 × 10^-05
204600_at | NM_004443.1 | EphB3 | 4.51 | 5.81 × 10^-05
212689_s_at | AA524505 | Zinc finger protein | 4.52 | 5.10 × 10^-05
209773_s_at | BC001886.1 | Ribonucleotide reductase M2 pol | 4.55 | 3.18 × 10^-05
204962_s_at | NM_001809.2 | Centromere protein A, 17kDa | 4.62 | 3.00 × 10^-05
211519_s_at | AY026505.1 | Kinesin-like 6 (mitotic centrom | 4.62 | 2.41 × 10^-05
204825_at | NM_014791.1 | Maternal embryonic leucine zipp | 4.73 | 2.45 × 10^-05
203287_at | NM_005558.1 | Ladinin 1 | 4.74 | 2.06 × 10^-05
204913_s_at | AI360875 | SRY (sex determining region Y)- | 4.77 | 2.44 × 10^-05
217028_at | AJ224869 |  | 4.82 | 2.56 × 10^-05
204750_s_at | BF196457 | Desmocollin 2 | 4.84 | 1.78 × 10^-05
216222_s_at | AI561354 | Myosin X | 4.84 | 1.93 × 10^-05
1438_at | X75208 | EphB3 | 5.02 | 9.02 × 10^-06
203693_s_at | NM_001949.2 | E2F transcription factor 3 | 5.17 | 4.83 × 10^-06
205548_s_at | NM_006806.1 | BTG family, member 3 | 5.64 | 1.96 × 10^-06
201976_s_at | NM_012334.1 | Myosin X | 5.68 | 8.74 × 10^-07
213134_x_at | AI765445 | BTG family, member 3 | 5.76 | 1.31 × 10^-06
40016_g_at | AB002301 | KIAA0303 protein | 4.26 | 1.071 × 10^-04
206352_s_at | AB013818 | peroxisome biogenesis factor 10 | 4.28 | 5.79 × 10^-05
205074_at | AB015050 | solute carrier family 22 member 5 | 4.64 | 2.24 × 10^-05
213527_s_at | AC002310 | similar to hypothetical protein MGC13138 | 4.62 | 3.16 × 10^-05
216835_s_at | AF035299 | docking protein 1, 62 kDa | 4.44 | 3.32 × 10^-05
209617_s_at | AF035302 | catenin (cadherin-associated protein), delta 2 (neural plakophilin-related arm-repeat protein) | 5.16 | 1.7 × 10^-06
208945_s_at | AF139131 | beclin 1 (coiled-coil, myosin-like BCL2 interacting protein) | 5.61 | 5.0 × 10^-07
222275_at | AI039469 | mitochondrial ribosomal protein S30 | 4.51 | 2.16 × 10^-05
203929_s_at | AI056359 | microtubule-associated protein tau | 6.60 | 0.0 × 10^-04
215552_s_at | AI073549 | Estrogen receptor 1 | 4.51 | 2.51 × 10^-05
212956_at | AI348094 | KIAA0882 protein | 4.40 | 7.0 × 10^-05
204913_s_at | AI360875 | SRY (sex determining region Y)-box 11 | −4.45 | 9.92 × 10^-05
213855_s_at | AI500366 | lipase, hormone-sensitive | 4.17 | 1.08 × 10^-04
212239_at | AI680192 | phosphoinositide-3-kinase, regulatory subunit, polypeptide 1 (p85 alpha) | 4.36 | 4.71 × 10^-05
203928_x_at | AI870749 | microtubule-associated protein tau | 5.91 | 8 × 10^-08
214124_x_at | AL043487 | FGFR1 oncogene partner | 5.18 | 3.1 × 10^-06
212195_at | AL049265 | MRNA; cDNA DKFZp564F053 | 4.25 | 1.11 × 10^-04
210222_s_at | BC000314 | reticulon 1 | 4.08 | 1.07 × 10^-04
210958_s_at | BC003646 | KIAA0303 protein | 4.43 | 4.26 × 10^-05
204863_s_at | BE856546 | interleukin 6 signal transducer (gp130, oncostatin M receptor) | 4.28 | 8.20 × 10^-05
213911_s_at | BF718636 | H2A histone family, member Z | −4.16 | 1.10 × 10^-04
212207_at | BG426689 | thyroid hormone receptor associated protein 2 | 6.06 | 1.0 × 10^-07
209696_at | D26054 | fructose-1,6-bisphosphatase 1 | 4.29 | 9.21 × 10^-05
209443_at | J02639 | serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 5 | 4.21 | 6.95 × 10^-05
202862_at | NM_000137 | fumarylacetoacetate hydrolase (fumarylacetoacetase) | 4.34 | 5.59 × 10^-05
214440_at | NM_000662 | N-acetyltransferase 1 (arylamine N-acetyltransferase) | 4.24 | 6.75 × 10^-05
208305_at | NM_000926 | progesterone receptor | 4.15 | 8.19 × 10^-05
202204_s_at | NM_001144 | autocrine motility factor receptor | 5.28 | 1.29 × 10^-06
204862_s_at | NM_002513 | non-metastatic cells 3, protein expressed in | 4.30 | 8.95 × 10^-05
202641_at | NM_004311 | ADP-ribosylation factor-like 3 | 4.24 | 9.46 × 10^-05
200896_x_at | NM_004494 | hepatoma-derived growth factor (high-mobility group protein 1-like) | −4.87 | 1.38 × 10^-05
203071_at | NM_004636 | sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3B | 4.65 | 1.63 × 10^-05
205012_s_at | NM_005326 | hydroxyacylglutathione hydrolase | 4.60 | 3.62 × 10^-05
204916_at | NM_005855 | receptor (calcitonin) activity modifying protein 1 | 5.47 | 5.10 × 10^-07
204792_s_at | NM_014714 | KIAA0590 gene product | 4.14 | 1.12 × 10^-04
208202_s_at | NM_015288 | PHD finger protein 15 | 4.18 | 1.08 × 10^-04
217770_at | NM_015937 | phosphatidylinositol glycan, class T | 4.33 | 5.43 × 10^-05
218671_s_at | NM_016311 | ATPase inhibitory factor 1 | 4.18 | 9.04 × 10^-05
219872_at | NM_016613 | hypothetical protein DKFZp434L142 | 4.10 | 1.03 × 10^-04
219197_s_at | NM_020974 | signal peptide, CUB domain, EGF-like 2 | 5.43 | 6.8 × 10^-07
203485_at | NM_021136 | reticulon 1 | 4.18 | 7.56 × 10^-05
206936_x_at | NM_022335 | NADH dehydrogenase (ubiquinone) 1, subcomplex unknown, 2, 14.5kDa | 4.28 | 6.46 × 10^-05
220540_at | NM_022358 | potassium channel, subfamily K, member 15 | 4.68 | 1.32 × 10^-05
219438_at | NM_024522 | hypothetical protein FLJ12650 | 4.82 | 6.68 × 10^-06
205696_s_at | 2674 U97144 | GDNF family receptor alpha 1 | 4.89 | 7.15 × 10^-06


In addition to other known methods of cancer therapy, hormone therapies may be employed in the treatment of patients identified as having hormone-sensitive cancers. Hormones, or other compounds that stimulate or inhibit these pathways, can bind to hormone receptors, blocking a cancer's ability to get the hormones it needs for growth. By altering the hormone supply, hormone therapy can inhibit growth of a tumor or shrink the tumor. Typically, these cancer treatments only work for hormone-sensitive cancers. If a cancer is hormone sensitive, a patient might benefit from hormone therapy as part of cancer treatment. Sensitivity to hormones is usually determined by taking a sample of a tumor (biopsy) and analyzing it in a laboratory.


Cancers that are most likely to be hormone-receptive include breast cancer, prostate cancer, ovarian cancer, and endometrial cancer. Not every cancer of these types is hormone-sensitive, however, which is why the cancer must be analyzed to determine whether hormone therapy is appropriate.


Hormone therapy may be used in combination with other types of cancer treatments, including surgery, radiation and chemotherapy. A hormone therapy can be used before a primary cancer treatment, such as before surgery to remove a tumor. This is called neoadjuvant therapy. Hormone therapy can sometimes shrink a tumor to a more manageable size so that it's easier to remove during surgery.


Hormone therapy is sometimes given in addition to the primary treatment—usually after—in an effort to prevent the cancer from recurring (adjuvant therapy). In some cases of advanced (metastatic) cancers, such as in advanced prostate cancer and advanced breast cancer, hormone therapy is sometimes used as a primary treatment.


Hormone therapy can be given in several forms, including:

(A) Surgery. Surgery can reduce the levels of hormones in the body by removing the organs that produce them, including the testicles (orchiectomy or castration), the ovaries (oophorectomy) in premenopausal women, the adrenal gland (adrenalectomy) in postmenopausal women, and the pituitary gland (hypophysectomy) in women. Because certain drugs can duplicate the hormone-suppressive effects of surgery in many situations, drugs are used more often than surgery for hormone therapy. And because removal of the testicles or ovaries limits an individual's options for having children, younger people are more likely to choose drugs over surgery.

(B) Radiation. Radiation is used to suppress the production of hormones. As with surgery, it is used most commonly to stop hormone production in the testicles, ovaries, and adrenal and pituitary glands.

(C) Pharmaceuticals. Various drugs can alter the production of estrogen and testosterone. These can be taken in pill form or by injection. The most common types of drugs for hormone-receptive cancers include:

(1) Anti-hormones, which block the cancer cell's ability to interact with the hormones that stimulate or support cancer growth. Though these drugs do not reduce the production of hormones, they block the cell's ability to use them. Anti-hormones include the anti-estrogens tamoxifen (Nolvadex) and toremifene (Fareston) for breast cancer, and the anti-androgens flutamide (Eulexin) and bicalutamide (Casodex) for prostate cancer.

(2) Aromatase inhibitors. Aromatase inhibitors (AIs) target enzymes that produce estrogen in postmenopausal women, thus reducing the amount of estrogen available to fuel tumors. AIs are only used in postmenopausal women because the drugs cannot prevent the production of estrogen in women who have not yet been through menopause. Approved AIs include letrozole (Femara), anastrozole (Arimidex) and exemestane (Aromasin). It has yet to be determined if AIs are helpful for men with cancer.

(3) Luteinizing hormone-releasing hormone (LH-RH) agonists and antagonists. LH-RH agonists (sometimes called analogs) and LH-RH antagonists reduce the level of hormones by altering the mechanisms in the brain that tell the body to produce hormones. LH-RH agonists are essentially a chemical alternative to surgical removal of the ovaries in women or of the testicles in men. Depending on the cancer type, a patient might choose this route in the hope of having children in the future and to avoid surgical castration. In most cases the effects of these drugs are reversible. Examples of LH-RH agonists include leuprolide (Lupron, Viadur, Eligard) for prostate cancer, goserelin (Zoladex) for breast and prostate cancers, and triptorelin (Trelstar) for ovarian and prostate cancers; abarelix (Plenaxis) is an example of an LH-RH antagonist.


One class of pharmaceuticals is the Selective Estrogen Receptor Modulators, or SERMs. SERMs block the action of estrogen in the breast and certain other tissues by occupying estrogen receptors inside cells. SERMs include, but are not limited to, tamoxifen (brand name: Nolvadex; generic: tamoxifen citrate), raloxifene (brand name: Evista), and toremifene (brand name: Fareston).


EXAMPLES

The following examples are given for the purpose of illustrating various embodiments of the invention and are not meant to limit the present invention in any fashion. One skilled in the art will appreciate readily that the present invention is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those objects, ends and advantages inherent herein. The present examples, along with the methods described herein are presently representative of preferred embodiments, are exemplary, and are not intended as limitations on the scope of the invention. Changes therein and other uses which are encompassed within the spirit of the invention as defined by the scope of the claims will occur to those skilled in the art.


Example 1
Material and Methods

Patients and Samples. Studies were conducted using different cohorts of samples: 132 patients (82 were ER-positive) from UT M.D. Anderson Cancer Center (MDACC) prior to pre-operative adjuvant chemotherapy, 18 patients from MDACC with metastatic (AJCC Stage IV) ER-positive breast cancer, 277 patients from three different institutions (109 from Oxford, UK; 87 from Guy's Hospital, London UK; 81 from Uppsala, Sweden) who were uniformly treated with adjuvant tamoxifen, and 286 patients (209 were ER-positive) with node-negative disease from a single institution who did not receive any systemic chemotherapy treatment. At MDACC, pre-treatment fine needle aspiration (FNA) samples of primary breast cancer were obtained using a 23-gauge needle and the cells from 1-2 passes were collected into a vial containing 1 ml of RNAlater™ solution (Ambion, Austin Tex.) and stored at −80° C. until use, whereas archival frozen samples were evaluated from resected, metastatic, ER-positive breast cancer. All patients signed an informed consent for voluntary participation to collect samples for research. At other institutions, fresh tissue samples of surgically resected primary breast cancer were frozen in OCT compound and stored at −80° C.


Patients in this study had invasive breast carcinoma and were characterized for estrogen receptor (ER) expression using immunohistochemistry (IHC) and/or enzyme immunoassay (EIA). Immunohistochemical (IHC) assay for ER was performed on formalin-fixed paraffin-embedded (FFPE) tissue sections or Carnoy's-fixed FNA smears using the following methods: FFPE slides were first deparaffinized, then slides (FFPE or FNA) were passed through decreasing alcohol concentrations, rehydrated, treated with hydrogen peroxide (5 minutes), exposed to antigen retrieval by steaming the slides in tris-EDTA buffer at 95° C. for 45 minutes, cooled to room temperature (RT) for 20 minutes, and incubated with primary mouse monoclonal antibody 6F11 (Novocastra/Vector Laboratories, Burlingame, Calif.) at a dilution of 1:50 for 30 minutes at RT (Gong et al., 2004). The Envision method was employed on a Dako Autostainer instrument for the rest of the procedure according to the manufacturer's instructions (Dako Corporation, Carpinteria, Calif.). The slides were then counterstained with hematoxylin, cleared, and mounted. Appropriate negative and positive controls were included. The 96 breast cancers from OXF were ER-positive by enzyme immunoassay as previously described, containing >10 femtomoles of ER/mg protein (Blankenstein et al., 1987).


Estrogen receptor (ER) expression was characterized using immunohistochemistry (IHC) and/or enzyme immunoassay (EIA). IHC staining of ER was interpreted at MDACC as positive (P) if ≧10% of the tumor cells demonstrated nuclear staining, low expression (L) if <10% of the tumor cell nuclei stained, and negative (N) if there was no nuclear staining. Low expression (<10%) is reported in routine patient care as negative, but some of those patients potentially benefit from hormonal therapy (Harvey et al., 1999).
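
The MDACC interpretation rule for ER staining can be written as a small function. This is an illustrative sketch only: the function name `classify_er_ihc` and its input (the percentage of tumor cells showing nuclear staining) are assumptions introduced here, not part of the described assay.

```python
def classify_er_ihc(percent_stained: float) -> str:
    """Return 'P', 'L', or 'N' following the staining rule described in the text.

    percent_stained is a hypothetical input: the percentage (0-100) of tumor
    cells that show nuclear ER staining.
    """
    if not 0.0 <= percent_stained <= 100.0:
        raise ValueError("percent_stained must be between 0 and 100")
    if percent_stained >= 10.0:
        return "P"  # positive: at least 10% of tumor cells stain
    if percent_stained > 0.0:
        return "L"  # low expression: some staining, but below 10%
    return "N"      # negative: no nuclear staining
```

Under this rule a tumor with 5% staining is returned as "L", which would be reported as negative in routine care even though it is distinguished here.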


RNA extraction and gene expression profiling. RNA was extracted from the MDACC FNA samples using the RNeasy Kit™ (Qiagen, Valencia Calif.). The amount and quality of RNA were assessed with a DU-640 U.V. Spectrophotometer (Beckman Coulter, Fullerton, Calif.), and the RNA was considered adequate for further analysis if the OD260/280 ratio was ≧1.8 and the total RNA yield was ≧1.0 μg. RNA was extracted from the tissue samples using Trizol (Invitrogen, Carlsbad, Calif.) according to the manufacturer's instructions. The quality of the RNA was assessed based on the RNA profile generated by the Bioanalyzer (Agilent Technologies, Palo Alto, Calif.). Differences in the cellular composition of the FNA and tissue samples have been reported previously (Symmans et al., 2003). In brief, FNA samples on average contain 80% neoplastic cells, 15% leukocytes, and very few (<5%) non-lymphoid stromal cells (endothelial cells, fibroblasts, myofibroblasts, and adipocytes), whereas tissue samples on average contain 50% neoplastic cells, 30% non-lymphoid stromal cells, and 20% leukocytes (Symmans et al., 2003). A standard T7 amplification protocol was used to generate cRNA for hybridization to the microarray. No second round amplification was performed. Briefly, mRNA sequences in the total RNA from each sample were reverse-transcribed with SuperScript II in the presence of T7-(dT)24 primer to produce cDNA. Second-strand cDNA synthesis was performed in the presence of DNA Polymerase I, DNA ligase, and RNase H. The double-stranded cDNA was blunt-ended using T4 DNA polymerase and purified by phenol/chloroform extraction. Transcription of double-stranded cDNA into cRNA was performed in the presence of biotin-ribonucleotides using the BioArray High Yield RNA transcript labeling kit (Enzo Laboratories). Biotin-labeled cRNA was purified using Qiagen RNeasy columns (Qiagen Inc.), quantified and fragmented at 94° C. for 35 minutes in the presence of 1× fragmentation buffer.
Fragmented cRNA from each sample was hybridized to an Affymetrix U133A gene chip overnight at 42° C. The U133A chip contains 22,215 different probe sets that correspond to 13,739 human UniGene clusters (genes). Hybridization cocktail was prepared as described in the Affymetrix technical manual. dChip V1.3 software (available via the internet at dchip.org) was used to generate probe level intensities and quality measures including median intensity, % of probe set outliers and % of single probe outliers for each chip.


Microarray Data Analysis. The raw intensity files (CEL) from each microarray were normalized using dChip V1.3 software (dchip.org). After normalization, the 75th percentile of pixel level was used as the intensity level for each feature on a microarray (see mdanderson.org/pdf/biostats_utmdabtr00503.pdf via the world wide web). Multiple features representing each probe set were aggregated using the perfect match model to form a single measure of intensity.


Definition of ER Reporter Genes. ER reporter genes were defined from an independent public dataset of Affymetrix U133A transcriptional profiles from 286 node-negative breast cancer samples (Wang et al., 2005). Expression data had been normalized to an average probe set intensity of 600 per array (Wang et al., 2005). The dataset was filtered to include 9789 probe sets with most variable expression, where P0≧5, P75−P25≧100, and P95/P5≧3 (Pq is the qth percentile of intensity for each probe set). Those were ranked by Spearman's rho (Kendall and Gibbons, 1990) with ER mRNA (ESR1 probe set 205225_at) expression, of which 2217 probe sets were significantly and positively associated with ESR1 (t-test of correlation coefficients with one-sided significance level of 99.9% and estimated false discovery rate (FDR) of 0.45%). The size of the reporter gene set was then determined by a bootstrap-based method that accounts for sampling variability in the correlation coefficient and in the resulting probe sets rankings (Pepe et al., 2003). The entire dataset was re-sampled 1000 times with replacement at the subject level (i.e., when one of the 286 subjects was selected in the bootstrap sample, the 2217 candidate probe sets from that subject were included in the dataset). Each probe set was ranked according to its correlation with ESR1 in each bootstrap dataset. The probability (P) of selection for each probe set (g) in a reporter gene set of defined length (k) was calculated as P[Rank(g)≦k]. A similar computation provided estimates of the power to detect the truly co-expressed genes from a study of a given size (Pepe et al., 2003).
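
The bootstrap estimate of the selection probability P[Rank(g)≦k] can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in the study: the function names, the input layout (a dictionary of probe-set values per subject), the tie-averaged ranks, and the treatment of a degenerate resample as zero correlation are all assumptions made for this sketch.

```python
import random
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0  # mean of positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # assumption: a degenerate resample counts as no correlation
    return cov / sqrt(var_x * var_y)


def selection_probabilities(probes, esr1, k, n_boot=1000, seed=0):
    """Estimate P[Rank(g) <= k] for each probe set g by subject-level bootstrap.

    probes: dict mapping probe-set name -> list of expression values per subject
    esr1:   list of ESR1 expression values for the same subjects
    """
    rng = random.Random(seed)
    n_subjects = len(esr1)
    hits = {name: 0 for name in probes}
    for _ in range(n_boot):
        # Resample subjects with replacement; every probe set of a drawn
        # subject is carried along with that subject.
        drawn = [rng.randrange(n_subjects) for _ in range(n_subjects)]
        esr1_b = [esr1[i] for i in drawn]
        rho = {name: spearman_rho([vals[i] for i in drawn], esr1_b)
               for name, vals in probes.items()}
        ranked = sorted(rho, key=lambda name: -rho[name])  # rank 1 = strongest
        for name in ranked[:k]:
            hits[name] += 1
    return {name: hits[name] / n_boot for name in probes}
```

Calling `selection_probabilities(probes, esr1, k=200)` on the candidate probe sets would give, for each probe set, the fraction of bootstrap datasets in which it ranks within the top 200.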


Genes that are truly co-expressed with ESR1 have selection probabilities close to 1, but the selection probability diminishes quickly for lower order probe sets (FIG. 1). The probability of selecting the top 50 ER-associated probes would be 98.5% if the ER reporter gene list included 200 probes, 87.0% if 100 probes, and 41.3% if 50 probes (FIG. 1). An ER reporter list with 200 top-ranking probes would include the top 50 probes with 98.5% probability and the top 100 probes with about 93% probability (FIG. 1). The distribution of ranks is very tight for genes that are strongly correlated with ESR1, which have median ranks close to 1 (FIG. 2). However, both the median rank and the variance of the distribution of ranks increase for genes that are moderately correlated with ESR1. The gene ranks for genes with Spearman's rho>0.65 are less than 200, with the exception of a few outliers (FIG. 2). Therefore, as opposed to selecting the reporter genes by choosing an arbitrary cutoff on the correlation coefficient, this approach identifies the 100 genes that are most strongly correlated with ESR1 with high power (>93%). The size of the reporter gene set was selected to be 200 probe sets, based on the bootstrap-estimated selection probabilities (FIG. 1) and the requirement to detect the top 100 truly co-expressed genes with >90% power. The original dataset was re-sampled with replacement at the subject level (i.e., when one of the 286 subjects was selected in the bootstrap sample, the 2217 candidate probe sets from that subject were included in the dataset) to generate 1000 different bootstrap datasets. Each candidate probe set was ranked according to its correlation with ESR1 within each bootstrap dataset, and the degree of confidence in the ranking of each probe set was quantified in terms of the selection probability, Pg(k). The probability (P) of selection for each probe set (g) in a reporter gene set of defined length (k) was calculated as P[Rank(g)≦k].


Calculation of Expression Index (Sensitivity to Endocrine Treatment Index). To quantify the expression of the 200 reporter genes in new samples, the inventors first developed a gene-expression-based ER-associated index. ER-positive and ER-negative reference signatures, or centroids, were then defined as the median log-transformed expression value of each of the 200 reporter genes in the 209 ER-positive and 77 ER-negative subjects, respectively. For new samples, the similarity between the log-transformed 200-gene ER-associated gene expression signature and each reference centroid was determined based on Hoeffding's D statistic (Hollander and Wolfe, 1999). D takes into account the joint rankings of the two variables and thus provides a robust measure of association that, unlike correlation-based statistics, will detect nonmonotonic associations (in statistical terms, it detects a much broader class of alternatives to independence than correlation-based statistics). The ER reporter index (RI) was defined as the difference between the similarities with the ER-positive and ER-negative reference centroids: RI = D⁺ − D⁻, where D⁺ and D⁻ are the values of Hoeffding's D computed against the ER-positive and ER-negative centroids, respectively.


The 200-gene signature of a tumor with high ER-dependent transcriptional activity resembles more closely the ER-positive centroid, and therefore D⁺ (similarity to the ER-positive centroid) will be greater than D⁻ (similarity to the ER-negative centroid) and RI will be positive. The opposite will be the case for tumors with low ER-related activity, and thus RI will be small or negative. Subtraction of D⁻ normalizes the reporter index relative to the basal levels of expression of the ER-related genes in ER-negative tumors. Because of this, and since D is a distribution-free statistic, RI is relatively insensitive to the method used to normalize the microarray data and therefore can be computed across datasets. From the RI, a genomic index of sensitivity to endocrine therapy (SET) was calculated as follows: SET = 100(RI + 0.2)³. The offset of 0.2 translated RI to mostly positive values, which were then transformed to normality using an unconditional Box-Cox power transformation. Finally, the maximum likelihood estimate of the exponent was rounded to the closest integer and the index was scaled to a maximum value of 10.
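
The calculation of the reporter index and SET index can be sketched as follows. This is a hedged illustration, not the inventors' implementation: `hoeffding_d` uses the standard rank-based formula for Hoeffding's D scaled to the range −0.5 to 1, and it assumes no tied values because the text does not say how ties were handled. The SET formula is read as a cube, 100(RI + 0.2)^3, consistent with the statement that the Box-Cox exponent was rounded to an integer. All function names are hypothetical.

```python
def simple_ranks(values):
    """Ranks 1..n; assumes no tied values (a simplifying assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def hoeffding_d(x, y):
    """Hoeffding's D between two equal-length sequences (no ties, n >= 5)."""
    n = len(x)
    if n != len(y) or n < 5:
        raise ValueError("need two sequences of equal length, at least 5")
    r = simple_ranks(x)
    s = simple_ranks(y)
    # q[i] = 1 + number of points strictly below point i in both coordinates
    q = [1 + sum(1 for j in range(n) if x[j] < x[i] and y[j] < y[i])
         for i in range(n)]
    d1 = sum((qi - 1) * (qi - 2) for qi in q)
    d2 = sum((ri - 1) * (ri - 2) * (si - 1) * (si - 2) for ri, si in zip(r, s))
    d3 = sum((ri - 2) * (si - 2) * (qi - 1) for ri, si, qi in zip(r, s, q))
    numerator = (n - 2) * (n - 3) * d1 + d2 - 2 * (n - 2) * d3
    denominator = n * (n - 1) * (n - 2) * (n - 3) * (n - 4)
    return 30.0 * numerator / denominator


def reporter_index(sample, centroid_pos, centroid_neg):
    """RI: similarity to the ER-positive centroid minus similarity to the ER-negative one."""
    return hoeffding_d(sample, centroid_pos) - hoeffding_d(sample, centroid_neg)


def set_index(ri):
    """SET = 100 * (RI + 0.2)^3, reading the exponent in the text as a cube."""
    return 100.0 * (ri + 0.2) ** 3
```

In use, `sample`, `centroid_pos`, and `centroid_neg` would each hold the 200 log-transformed reporter values in the same probe-set order. Because D depends only on ranks, rescaling a sample by any monotone increasing transformation leaves RI unchanged, which is the property the text relies on for cross-dataset use.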


Statistical Analysis of Distant Relapse-Free Survival (DRFS). Distant relapse-free survival (DRFS) was defined as the interval from breast surgery until diagnosis of distant metastasis. Covariate effects on distant relapse risk after tamoxifen treatment were evaluated using log-rank test in multivariate Cox proportional hazards models stratified by institution. The covariates included were the genomic measurement of likely sensitivity to endocrine therapy (SET index), gene expression levels of estrogen receptor (ESR1, probe set 205225) and progesterone receptor (PGR, probe set 208305), age at diagnosis, tumor histologic grade and tumor stage (revised American Joint Committee on Cancer (AJCC) staging system). ESR1 was normally distributed, but PGR levels were log-transformed to normality. To determine the continuous relation between the SET index and 10-year DRFS, the data were fitted by Cox proportional hazards models having a smoothing spline approximation with 2 degrees of freedom of the SET index as the only covariate (Therneau and Grambsch, 2000). The baseline cumulative hazard rate was estimated from the Cox model based on the Nelson-Aalen estimator and the predicted rate of distant relapse was then obtained from the Breslow-type estimator of the survival function. Confidence intervals of the survival estimate were calculated based on the Tsiatis variance estimates of the cumulative log-hazards (Therneau and Grambsch, 2000). A similar approach was used to determine the continuous relation between ESR1 and PGR expression and DRFS.


Likely sensitivity to endocrine therapy was classified as low, intermediate, or high using cutoff points of the SET index values. The cutoff points were determined by fitting, on the entire dataset (n=277), a stratified multivariate Cox model to predict DRFS in relation to age, histologic grade, stage, median-dichotomized ESR1, median-dichotomized PGR, and the trichotomous SET indicator variable using different thresholds. Thresholds that resulted in maximum or near maximum log-profile likelihood for this model were selected as the most informative cut points for predicting DRFS (Tableman and Kim, 2004). The same thresholds were maintained for subsequent analyses of the untreated patients. All statistical computations were performed in R (R Development Core Team, 2005).


Example 2
Correlation Between ER mRNA Expression Levels and ER Status

Intensity values of ESR1 (ER) gene expression from microarray experiments were compared to the results from standard IHC and enzyme immunoassays in 82 FNA samples (MDACC). The Affymetrix U133A GeneChip™ has six probe sets that recognize ESR1 mRNA at different sequence locations. A comparison of the different probe sets using the 82 FNA dataset is presented in Table 3. All the ESR1 probe sets showed high correlation with ER status determined by immunohistochemistry (Kruskal-Wallis test, p<0.0001). The probe set 205225_at had the highest mean, median, and range of expression and was most correlated with ER status (Spearman's correlation, R=0.85, Table 3).

TABLE 3
The mean, median, and range of expression of the six probe sets that identify ERα gene (ESR1) are compared using the results from 82 FNA samples. Expression of each ESR1 probe set is correlated to ER status (positive, low, or negative) and to the expression of the ESR1 205225 probe set (R values, Spearman's rank correlation test).

ESR1         Signal Intensity           Spearman Correlation With
Probe Set    Mean    Median   Range     ER Status    205225
205225       1633    912      6802      0.85         1.00
215552       192     136      671       0.81         0.86
217190       152     122      429       0.72         0.84
211233       234     178      663       0.71         0.88
211235       189     139      674       0.69         0.88
211234       236     209      462       0.64         0.83


Example 3
ER Reporter Genes

The consistency of identifying top-ranking genes depends on factors that affect the sampling variability in the correlation coefficient, such as the size of the dataset and the strength of the underlying true association between the candidate genes and ESR1. The inventors evaluated the consistency in the ranking of the candidate ER reporter genes in terms of the selection probability estimated from 1000 bootstrapped datasets. FIG. 1 shows that the selection probability was high for the top-ranking probes, i.e., the top-ranking probes rank consistently at the top of the list, but it diminished quickly with increasing rank. Furthermore, the selection probability of a candidate gene of a given rank showed a strong dependence on the number of candidate probes selected. For example, the probability of consistently selecting the truly top 50 ER-associated probes was 98.5% if the top 200 candidate probes are selected, 87.0% if the top 100 probes are selected, and only 41.3% if the top 50 probes are selected (FIG. 1). Based on these considerations, the inventors defined the ER reporter list to include the 200 top-ranking probes to ensure that the 100 most-strongly associated probes with ESR1, which are expected to be biologically relevant, would be among the reporter genes with about 90% probability. The entire list included 200 probe sets (excluding those that detect ESR1) representing 163 different genes and 7 uncharacterized transcripts (Table 1).


Example 4
ER Reporter Index is Independent of ESR1 Expression

The ER reporter index (RI) was calculated for the tamoxifen-treated group and the node-negative untreated group. The RI was predominantly positive in ER-positive subjects and predominately negative in ER-negative subjects with the two ER-conditional distributions being distinct and well separated (FIGS. 3A and 3B), which supports ER RI as an indicator of ER-associated activity. To evaluate whether the levels of ER RI are correlated with ESR1 mRNA expression levels, the RI was plotted vs. ESR1 expression for both groups (FIGS. 3C and 3D). Although both ESR1 mRNA and RI were lower in ER-negative subjects, there was no apparent trend in ER-positive subjects. This suggests that, even though the estrogen reporter genes were identified as being co-expressed with ESR1, the overall expression pattern of this group of genes as captured by the ER reporter index conveys information on ER-signaling that is not captured by ESR1.


Example 5
Reproducibility of Reporter Genes and SET Index

The in vitro transcription and microarray hybridization steps were repeated using residual sample RNA from 35 FNA samples. The 35 original and replicate sample pairs demonstrated excellent reproducibility of the gene expression measurements and calculated indices (FIG. 4). The concordance correlation coefficients (Lin, 1989; 2000) were 0.979 (95% CI 0.958-0.989) for the pairs of ESR1 expression measurements, 0.953 (95% CI 0.909-0.976) for PGR expression, 0.985 (95% CI 0.972-0.992) for ER reporter index values, and 0.972 (95% CI 0.945-0.986) for the pairs of SET index measurements, indicating excellent accuracy (minimal deviation of the best fit line from the 45° line) and good precision in all cases.
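
For readers unfamiliar with the concordance correlation coefficient, a minimal sketch of its usual definition follows. The function name and inputs are hypothetical, and the sketch uses 1/n moments as in Lin's definition; it does not reproduce the confidence intervals reported here.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    with variances and covariance computed using 1/n.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0.0:
        raise ValueError("both sequences are constant and equal")
    return 2.0 * cov / denominator
```

Unlike an ordinary correlation coefficient, this measure drops below 1 when replicates are perfectly correlated but shifted away from the 45° line: original values of 1, 2, 3 paired with replicate values of 2, 3, 4 give 4/7.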


Example 6
Characterization of ER Reporter Genes

The 200 ER reporter probe sets represent 163 unique genes and 7 uncharacterized transcripts (Table 1). These contain twenty-seven probe sets that represent 23 genes on chromosome 5, and 20 probe sets that represent 18 genes on chromosome 1. Mapping the 163 genes to the KEGG pathway database indicated representation of several signaling pathways including focal adhesion, Wnt, Jak-STAT, and MAPK signaling pathways. Furthermore, mapping to gene ontology (GO) categories indicated that the biological processes “fatty acid metabolism,” “pyrimidine ribonucleotide biosynthesis,” and “apoptosis” are over-represented in this set relative to chance based on the hypergeometric test (p-values<0.03). The distributions of reporter genes for ER-positive and ER-negative breast cancers were distinct and well separated, consistent with an indicator of ER-associated activity (FIGS. 3A and 3B). Both ESR1 and reporter genes were lower in ER-negative subjects, but there was no apparent correlation in ER-positive subjects (FIGS. 3C and 3D). Therefore, although the ER reporter genes were identified by their co-expression with ESR1, the overall expression pattern of this group of genes (as captured by the index) conveys information on ER-signaling that is independent of ER gene expression level alone.
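
The over-representation test for a gene ontology category can be sketched with the upper tail of the hypergeometric distribution. This is an illustrative sketch only; the function name and the four count arguments are assumptions, and no counts from the actual analysis are implied.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, drawn, observed):
    """P(X >= observed) when `drawn` genes are sampled without replacement
    from `population` genes, of which `annotated` belong to the category.
    """
    if not (0 <= annotated <= population and 0 <= drawn <= population):
        raise ValueError("counts must satisfy 0 <= annotated, drawn <= population")
    if observed < 0:
        raise ValueError("observed must be non-negative")
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible outcomes contribute nothing to the sum.
    favorable = sum(comb(annotated, k) * comb(population - annotated, drawn - k)
                    for k in range(observed, upper + 1))
    return favorable / total
```

Here `drawn` would be the number of reporter genes mapped to the ontology, `annotated` the number of genes in the reference set that carry the category label, and `observed` the number of reporter genes carrying it; a small returned value suggests the category is over-represented relative to chance.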


Example 7
Distant Relapse after Adjuvant Tamoxifen Therapy

Univariate Cox proportional hazards models were employed to evaluate the risk of distant relapse at 10 years after adjuvant tamoxifen treatment as continuous functions of expression levels of the estrogen receptor gene (ESR1), progesterone receptor gene (PGR), and the 200-gene index of reporter genes for sensitivity to endocrine therapy (SET index) (FIG. 5). ER gene expression (ESR1, FIG. 5A) was not a significant predictor of 10-year relapse rate (LRT p=0.16), but higher progesterone receptor gene expression (PGR, FIG. 5B) was significantly associated with lower relapse rates at 10 years (HR 0.62; 95% CI 0.44-0.88; LRT p=0.005). Higher SET index levels (FIG. 5C) were also significantly associated with lower 10-year relapse rates (HR 0.70; 95% CI 0.56-0.86; LRT p<0.001). The mean relapse-free survival at 10 years for subjects with SET index<2 was 57.1% (95% CI 41.1-80.3) whereas for those with SET index>5 was 90.0% (95% CI 82.5-97.7) (FIG. 5C).


Example 8
Distant Relapse in Untreated Patients—SET Index is Independent of Prognosis

To address the possibility that observed differences in DRFS could be due to indolent prognosis, rather than benefit from adjuvant tamoxifen, the same covariates were evaluated as potential prognostic factors of DRFS in 209 ER-positive patients who did not receive adjuvant systemic therapy. Consistent with the effects in the tamoxifen treated group, ER expression level (ESR1, FIG. 6A) was not significantly associated with the 5-year relapse rate in untreated patients (LRT p=0.75), and higher progesterone receptor (PGR, FIG. 6B) was significantly associated with lower relapse rates at 5 years (HR 0.78, 95% CI 0.67-0.90; LRT p<0.001). However, the effect of the SET index (FIG. 6C) on the 5-year relapse rate in untreated patients was small and marginally significant (HR 0.90, 95% CI 0.82-1.00; LRT p=0.043).


Example 9
Independence of Genomic Predictors in Multivariate Survival Analyses

The continuous gene-expression-based predictors (ESR1, PGR, and SET index) were evaluated in a multivariate Cox model in relation to patient's age, tumor histologic grade and tumor AJCC stage for ER-positive patients treated with adjuvant tamoxifen. SET index was a significant predictor of relapse after adjuvant tamoxifen treatment (HR 0.72; 95% CI 0.54-0.95), whereas the effect of PGR expression was not statistically significant (Table 4, Treated Patients). Conversely, when patients with ER-positive breast cancer who did not receive adjuvant treatment were evaluated with the same multivariate model, it was found that PGR expression was independently prognostic (HR 0.72; 95% CI 0.58-0.89), whereas the effect of SET index was not statistically significant (Table 4, Untreated Patients). Therefore the SET index was independently predictive of benefit from adjuvant tamoxifen therapy, but not prognostic in patients with ER-positive breast cancer who did not receive adjuvant treatment.

TABLE 4
Multivariate Cox analysis of continuous gene-expression-based covariates of DRFS in patients with ER-positive breast cancer. Treated patients (left column) received adjuvant tamoxifen, whereas untreated patients (right column) had node-negative disease and did not receive adjuvant treatment. ‡PGR expression values were log-transformed.

                                          Treated Patients (n=211)       Untreated Patients (n=142)
Effect                                    HR (95% CI)         P-value    HR (95% CI)         P-value
Age (>50 vs. ≦50)                         1.09 (0.30-3.90)    0.89       0.59 (0.31-1.11)    0.10
Histologic Grade (3 vs. 1 or 2)           1.09 (0.54-2.22)    0.81       1.93 (0.92-4.04)    0.08
AJCC Stage (II or III vs. I)              1.96 (0.80-4.78)    0.14       1.13 (0.64-1.97)    0.68
ER Expression                             1.00 (1.00-1.00)    0.72       1.00 (1.00-1.00)    0.13
PGR Expression‡                           0.93 (0.61-1.40)    0.72       0.72 (0.58-0.89)    0.002
Sensitivity to Endocrine Therapy Index    0.72 (0.54-0.95)    0.022      0.99 (0.86-1.14)    0.86


The SET index was developed to measure ER-related gene expression in breast cancer samples, with the hypothesis that this would represent intrinsic endocrine sensitivity. The inventors found that SET index had a steep and linear association with improved 10-year relapse-free survival in women who received tamoxifen as their only adjuvant therapy (FIG. 5C), and was the only significant factor in multivariate analysis of DRFS that included grade, stage, age, and expression levels of ESR1 and PGR (Table 4). The information from SET index is mostly predictive of benefit from endocrine treatment, rather than prognosis (FIG. 6, Table 4).


Example 10
Classes of Endocrine Sensitivity Defined by SET Index

The almost linear functional dependence of the likelihood of distant relapse on the genomic endocrine sensitivity (SET) index (FIG. 5C) makes it possible to define three classes by specifying two cut points. Optimal thresholds were chosen to maximize the predictability of the trichotomous SET index in a multivariate Cox model, and occurred at the 50th and 65th percentiles of SET distribution corresponding to index values 3.71 and 4.23, respectively. The three classes of predicted sensitivity to endocrine therapy (low, intermediate, and high sensitivity) were evaluated in a multivariate Cox model stratified by institution that included dichotomized age, histologic grade, AJCC stage, and the median-dichotomized gene expression of ESR1 and PGR. The likelihood of distant relapse after tamoxifen therapy was significantly lower in those in the high SET group, compared with the low SET group (HR=0.24, 95% CI 0.09-0.59, p=0.002). There was no significant difference between intermediate and low SET groups (HR=0.67; 95% CI 0.30-1.49; p=0.33).
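
The two cut points translate directly into a three-class assignment. The sketch below is illustrative: the function name is hypothetical, and assigning a value that falls exactly on a cut point to the higher class is an assumption, since the text does not specify boundary handling.

```python
LOW_CUT = 3.71   # 50th percentile of the SET distribution, from the text
HIGH_CUT = 4.23  # 65th percentile of the SET distribution, from the text


def set_class(set_index: float) -> str:
    """Map a SET index value to 'low', 'intermediate', or 'high' sensitivity."""
    if set_index < LOW_CUT:
        return "low"
    if set_index < HIGH_CUT:
        return "intermediate"
    return "high"  # assumption: values at or above the upper cut point are high
```

With these thresholds, about 50% of the tamoxifen-treated cohort falls in the low class, 15% in the intermediate class, and 35% in the high class.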


Example 11
SET Index and Classes Correlate with Distant Relapse-Free Survival

Kaplan-Meier estimators of relapse-free survival were compared for the three classes of SET index in the patients with ER-positive breast cancer who received adjuvant tamoxifen (FIG. 7A) and in those who did not receive adjuvant therapy (FIG. 7B). The 35% of subjects with high SET had improved and sustained survival benefit from adjuvant tamoxifen, whereas the 50% of subjects with low SET did not obtain as much benefit from adjuvant tamoxifen (FIG. 7A). Most interesting were the 15% of subjects with intermediate SET. In the untreated cohort (FIG. 7B), subjects with intermediate SET had similar prognosis to those with low SET. However, in the tamoxifen treated cohort (FIG. 7A), subjects with intermediate SET had similar prognosis to those with high SET for the first 6 to 7 years of follow up. Furthermore, within 2 years after the completion of endocrine therapy these patients with intermediate SET began to experience distant relapse at a rate that was similar to the low SET group during the first 3 to 4 years of follow up (FIGS. 7A and 7B). Finally, the Kaplan-Meier estimators of relapse-free survival based on PGR expression (FIGS. 7C and 7D) confirm the combined prognostic and predictive effects of PGR (also shown in FIGS. 5B and 6B) and demonstrate less pronounced separation of the survival curves than SET in tamoxifen treated subjects (FIGS. 7A and 7C).
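
As a reminder of how a Kaplan-Meier curve is built for one SET class, a minimal sketch follows. The function name and inputs are hypothetical: `times` holds follow-up in years and `events` marks whether distant relapse was observed (True) or the subject was censored (False). Confidence intervals are not computed.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time with at least one observed event."""
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        relapses = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if relapses:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - relapses / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Censored subjects lower the number at risk without lowering the curve, which is why the estimate can be compared between cohorts with different follow-up, such as the treated and untreated groups.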


The inventors observed the same effects of SET class on DRFS of patients treated with adjuvant tamoxifen when this cohort was stratified by known nodal status and the three classes of SET index were evaluated separately in 115 node-negative patients (FIG. 8A) and 140 node-positive patients (FIG. 8B). These three classes of SET appear to identify approximately 35% of patients who have sustained benefit from adjuvant tamoxifen alone, approximately 50% who have minimal benefit from tamoxifen, and approximately 15% of patients whose benefit from tamoxifen continues during their adjuvant treatment, but is not sustained after endocrine therapy is completed.


Patients with high endocrine sensitivity (SET index values in upper 35%) had sustained benefit from adjuvant tamoxifen, compared to untreated patients (FIG. 7). This effect was evident when comparing untreated prognosis with tamoxifen treatment in node-negative patients (FIGS. 7B and 8A). Rare relapse events during tamoxifen treatment might still occur because of individual differences in compliance, metabolism due to variant genotype of cytochrome p450 2D6, or interaction from selective serotonin reuptake inhibitors used as antidepressants or to treat hot flashes. These can limit metabolism of tamoxifen to more active metabolites, thereby decreasing treatment efficacy, and are obviously unrelated to the activity of ER in the breast cancer cells (Stearns et al., 2003; Jin et al., 2005). Patients with low SET index values (lower 50%) derived minimal benefit from adjuvant tamoxifen, irrespective of nodal status (FIGS. 11 and 12). The effect of adjuvant tamoxifen (compared to untreated prognosis) is particularly revealing for patients with intermediate SET index (FIG. 7). These patients derived benefit from tamoxifen during their adjuvant treatment, but relinquished this survival benefit after cessation of treatment. Subjects with intermediate SET index started to accrue distant relapse events within 2 years of discontinuing adjuvant tamoxifen, and at a rate that was similar to the subjects with low SET index (treatment or prognosis) in the early period of follow up. This suggests that intermediate SET index values identified patients who might benefit from prolonged and/or more effective endocrine therapy used in current crossover treatment strategies (Goss et al., 2003).


Example 12
SET Index and Chemotherapy Response in ER-Positive Breast Cancer

Groups with low, intermediate, and high SET index were compared with pathologic response outcome in the 82 patients with ER-positive breast cancer who received neoadjuvant chemotherapy with paclitaxel (12 weekly cycles) followed by fluorouracil, doxorubicin, and cyclophosphamide (4 cycles q3 weeks) (Ayers et al., 2004). The same SET classes were used as for the survival analyses after adjuvant tamoxifen. There were 8 patients with ER-positive cancer who achieved pathologic complete response (pCR) in the breast and axilla, of which 7 had low SET and one had intermediate SET (Table 5). Conversely, none of the 11 patients with ER-positive breast cancer and high SET, and only one of 11 patients with intermediate SET, achieved pCR from neoadjuvant T/FAC chemotherapy (Table 5).

TABLE 5
Pathologic response to neoadjuvant T/FAC chemotherapy in ER-positive patients compared with predicted sensitivity to endocrine therapy (SET risk groups).

Sensitivity to Endocrine    Chemotherapy Response (ER+ patients)
Therapy (SET) Group         Complete Pathologic Response    Residual Disease
Low                         7                               53
Intermediate                1                               10
High                        0                               11


Example 13
SET Index and Stage of ER-Positive Cancer

There was a progressive decline in the values for the sensitivity to endocrine therapy (SET) index with increasing AJCC stage of ER-positive breast cancers (FIG. 8A, p<0.001). The decrease is only marginally significant for the transcriptional levels of ESR1 (FIG. 8B, p=0.04) and PGR (FIG. 8C, p=0.05), whereas the transcriptional level of a housekeeper gene (GAPDH) does not vary with stage (FIG. 8D, p=0.77). This analysis was done for 351 breast cancers that were ER-positive by IHC and had known stage of disease at the time of sample (58 stage I, 123 stage IIA, 107 stage IIB, 44 stage III, and 18 stage IV). The significance of stage-related trends was evaluated by treating tumor stage as an ordinal covariate in ordinary least squares regression with orthogonal polynomial contrasts. The p-values correspond to the significance of the linear term (based on the t-test). All samples from Stage I to III breast cancer were collected prior to any treatment. The 18 samples of Stage IV ER-positive breast cancer were from relapsed disease in 17 patients and at the time of initial presentation in one, and these included 14 patients who had received previous hormonal treatment with tamoxifen and/or aromatase inhibition. There was no obvious difference in the genomic expression levels of ESR1 or SET index in the 14 patients with Stage IV breast cancer who had received prior hormonal therapy, compared to the 4 who had not (ANOVA p=0.9).


Stage-dependent differences in biomarker measurements have obvious clinical importance, particularly for biomarkers of critical targeted cellular pathways. SET index values successively declined with advancing stage, whereas changes in ESR1 and PGR were less distinct (FIG. 8). One explanation is that tumors with less intrinsic dependence on estrogen are more biologically aggressive, and hence more likely to present with larger size and nodal metastasis. Additionally, biological progression of ER-positive breast cancer probably includes progressive dissociation from estrogen dependence through recruitment of other growth and survival pathways. The SET index captures these important differences in tumor biology with greater acuity than measurements of ER and PR. If a significant decrease in genomic SET index values between matched primary tumors and subsequent distant metastases were demonstrated, then the SET index could be used to monitor changes in the ER genomic pathway (and endocrine sensitivity) during the course of disease.


REFERENCES

The following references, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.

  • U.S. Pat. No. 6,673,914
  • U.S. Pat. No. 6,521,415
  • U.S. Pat. No. 6,162,606
  • U.S. Pat. No. 6,107,034
  • U.S. Pat. No. 5,693,465
  • U.S. Pat. No. 5,384,260
  • U.S. Pat. No. 5,292,638
  • U.S. Pat. No. 5,030,417
  • U.S. Pat. No. 4,968,603
  • U.S. Pat. No. 4,806,464


OTHER REFERENCES



  • Ayers et al., J. Clin. Oncol., 22:2284-2293, 2004.

  • Blankenstein et al., Clin. Chim. Acta, 165:189-195, 1987.

  • Bonneterre et al., J. Clin. Oncol., 18:3748-57, 2000.

  • Bryant and Wolmark, N. Engl. J. Med., 349(19):1855-1857, 2003.

  • Burstein, N. Engl. J. Med., 349(19):1857-1859, 2003.

  • Buzdar, Semin. Oncol., 28:291-304, 2001.

  • Esteva et al., Clin. Cancer Res., 11:3315-9, 2005.

  • Gong et al., Cancer, 102:34-40, 2004.

  • Goss et al., N. Engl. J. Med., 349(19):1793-1802, 2003.

  • Gruvberger-Saal et al., Mol. Cancer. Ther., 3:161-168, 2004.

  • Gruvberger et al., Cancer Res., 61:5979-5984, 2001.

  • Harvey et al., J. Clin. Oncol., 17:1474-1481, 1999.

  • Hess et al., Breast Cancer Res. Treat., 78:105-118, 2003.

  • Hollander and Wolfe, In: Probability and Statistics, Wiley Series, NY: John Wiley & Sons, Inc., 1999.

  • Howell and Dowsett, Breast Cancer Res., 6:269-274, 2004.

  • Howell et al., Lancet., 365(9453):60-62, 2005.

  • Jansen et al., J. Clin. Oncol., 23:732-740, 2005.

  • Jin et al., J. Natl. Cancer Inst., 97(1):30-39, 2005.

  • Kendall and Gibbons, In: Rank Correlation Methods, NY, Oxford University Press, 1990.

  • Konecny et al., J. Natl. Cancer Inst., 95:142-153, 2003.

  • Kun et al., Hum. Mol. Genet., 12:3245-3258, 2003.

  • Lacroix et al., Breast Cancer Res. Treat., 67:263-271, 2001.

  • Loi et al., Proc. Am. Soc. Clin. Oncol., Abstract #509, 2005.

  • Ma et al., Cancer Cell, 5:607-616, 2004.

  • Mouridsen et al., J. Clin. Oncol., 19:2596-2606, 2001.

  • Paik et al., N. Engl. J. Med., 351:2817-2826, 2004.

  • Paik et al., Proc. Am. Soc. Clin. Oncol., Abstract #510, 2005.

  • Pepe et al., Biometrics, 59:133-142, 2003.

  • Perou et al., Nature, 406:747-752, 2000.

  • Pusztai et al., Clin. Cancer Res., 9:2406-2415, 2003.

  • Ransohoff, Nat. Rev. Cancer, 4:309-314, 2004.

  • Ransohoff, Nat. Rev. Cancer, 5:142-149, 2005.

  • Regitnig et al., Virchows Arch., 441:328-34, 2002.

  • Rhodes et al., J. Clin. Pathol., 53:125-130, 2000.

  • Rhodes, Am. J. Surg. Pathol., 27(9):1284-1285, 2003.

  • Rudiger et al., Am. J. Surg. Pathol., 26:873-882, 2002.

  • Sorlie et al., Proc. Natl. Acad. Sci. USA, 98:10869-10874, 2001.

  • Stearns et al., J. Natl. Cancer Inst., 95(23):1758-1764, 2003.

  • Symmans et al., Cancer, 97:2960-2971, 2003.

  • Tableman and Kim, In: Survival Analysis Using S: Analysis of Time-to-Event Data, FL: Chapman & Hall/CRC, 2004.

  • Taylor et al., Hum. Pathol., 25:263-270, 1994.

  • Therneau and Grambsch, In: Modeling Survival Data: Extending the Cox Model, NY, Springer-Verlag, 2000.

  • Thurlimann et al., N. Engl. J. Med., 353(26):2747-2757, 2005.

  • van't Veer et al., Nature, 415:530-536, 2002.

  • Wang et al., Lancet., 365:671-679, 2005.


Claims
  • 1. A method of assessing cancer patient sensitivity to treatment comprising the step of preparing a sensitivity to endocrine therapy (SET) index based on expression in a patient sample of one or more ER-related genes selected from Table 1.
  • 2. The method of claim 1, further comprising selecting a treatment based on the SET index.
  • 3. The method of claim 1, wherein the ER-related genes comprise 25 or more ER related genes of Table 1.
  • 4. The method of claim 3, wherein the ER-related genes comprise 50 or more ER related genes of Table 1.
  • 5. The method of claim 4, wherein the ER-related genes comprise 100 or more ER related genes of Table 1.
  • 6. The method of claim 4, wherein the ER-related genes comprise 200 ER related genes of Table 1.
  • 7. The method of claim 1, wherein the SET index includes covariates of tumor size, nodal status, grade, and age.
  • 8. The method of claim 1, wherein the SET index includes evaluation of overall survival (OS).
  • 9. The method of claim 8, wherein the SET index includes evaluation of distant relapse-free survival (DRFS).
  • 10. The method of claim 1, wherein the treatment is a combination of one or more cancer therapies.
  • 11. The method of claim 1, wherein the treatment is hormonal therapy.
  • 12. The method of claim 11, wherein the hormonal therapy is tamoxifen therapy, aromatase inhibitor therapy, or SERM therapy.
  • 13. The method of claim 11, wherein the treatment is chemotherapy.
  • 14. The method of claim 11, wherein the treatment is a combination of hormonal therapy and chemotherapy.
  • 15. The method of claim 1, wherein the patients are diagnosed with early or late-stage cancer.
  • 16. A method of calculating a sensitivity to endocrine treatment (SET) index comprising the steps of: (a) identifying a gene set of one or more estrogen receptor (ER)-related genes indicative of ER transcriptional activity by assessing gene expression in a reference population of tumor samples from cancer patients, defining a reference ER-related gene set; and (b) preparing a calculated index using an assessment of ER-related gene expression in one or more samples relative to the reference ER-related gene expression.
  • 17. The method of claim 16, further comprising assessing sensitivity of a cancer to therapy using the calculated index.
  • 18. The method of claim 17, wherein the therapy is hormonal therapy or chemotherapy.
  • 19. The method of claim 18, wherein the therapy comprises both hormonal therapy and chemotherapy.
  • 20. The method of claim 19, further comprising selecting a class or individual hormonal therapy.
  • 21. The method of claim 20, wherein the hormonal therapy is tamoxifen therapy, aromatase inhibitor therapy, or SERM therapy.
  • 22. The method of claim 17, further comprising identifying a patient that will benefit from an extended duration of therapy.
  • 23. The method of claim 16, wherein all or part of the reference tumor samples are from patients diagnosed with a hormone sensitive cancer.
  • 24. The method of claim 23, wherein the hormone sensitive cancer is an estrogen sensitive cancer.
  • 25. The method of claim 24, wherein the estrogen-sensitive cancer is breast cancer.
  • 26. The method of claim 16, wherein the gene set comprises 25 to 200 ER related genes.
  • 27. The method of claim 26, wherein the gene set comprises 50 to 200 ER related genes.
  • 28. The method of claim 27, wherein the gene set comprises 200 ER related genes.
  • 29. The method of claim 16, wherein the calculated index includes a metric indicative of ER status of all or part of the reference tumor samples.
  • 30. The method of claim 16, wherein the calculated index includes covariates of tumor size, nodal status, grade, and age.
  • 31. The method of claim 16, wherein the calculated index includes evaluation of survival of the patient population sampled for all or part of the reference population of tumor samples.
  • 32. The method of claim 31, wherein calculation of the index includes evaluation of distant relapse-free survival (DRFS) of the patient population.
  • 33. The method of claim 16, wherein the patient population includes ER-positive samples or both ER-positive and ER-negative samples.
  • 34. The method of claim 16, further comprising normalizing expression data of the one or more samples to the ER-related gene expression profile.
  • 35. The method of claim 34, wherein the expression data is normalized to a digital standard.
  • 36. The method of claim 35, wherein the digital standard is a gene expression profile from a reference sample.
  • 37.-42. (canceled)
Parent Case Info

This application claims priority to U.S. Provisional Patent Applications Ser. No. 60/715,403, filed on Sep. 9, 2005, and Ser. No. 60/822,879, filed on Aug. 18, 2006, each of which is incorporated herein by reference in its entirety.

Provisional Applications (2)
Number Date Country
60715403 Sep 2005 US
60822879 Aug 2006 US