The invention relates to the identification of nucleic acid and protein expression profiles and nucleic acids, products, and antibodies thereto that are outcome prognostic in prostate cancer.
Prostate cancer will account for an estimated 30% (189,000) of new cancer cases in men in the United States in 2002 (1). Many of these newly diagnosed cases are a result of the extensive use of prostate-specific antigen (PSA) screening and the subsequent diagnosis of prostate cancer at an early stage and age. However, despite the introduction of PSA screening, the mortality from prostate cancer has remained relatively constant. The implications of this are that: (1) there is a large group of men diagnosed with prostate cancer for whom radical treatment is probably unnecessary and who will die with their prostate cancer rather than from it; and (2) there is a group of men for whom early detection offers the possibility of cure that may be denied by delay. Consequently, identifying these groups of men at the time of diagnosis is critical to the optimal management of prostate cancer.
While the benefits of PSA screening are widely debated, this serum marker remains one of only a small number of preoperative parameters of prognostic utility. In order to enhance the predictive value of individual parameters with outcomes, nomograms have been developed that incorporate parameters that are measured routinely in clinical practice to predict the probability of PSA relapse free survival of individual patients both prior to and following therapy (2-6). Models such as these currently form the basis of routine clinical decision-making, but such classification systems cannot explore differences in outcomes observed between cancers with similar histopathological features. Hence, there remains a critical need for increased accuracy in the subcategorization of prostate cancers to identify those with an aggressive phenotype.
There are a considerable number of publications assessing the ability of biomarkers to predict an earlier time to relapse of prostate cancer following radical prostatectomy (reviewed in ref. (17)). Despite these data, there remain no molecular markers of routine clinical utility which differentiate localized prostate cancers with an aggressive phenotype, and clinicians still rely on conventional preoperative and postoperative prognostic indicators such as pretreatment PSA levels, pathological stage and Gleason grade in routine decision-making. This most likely reflects the fact that studies that have correlated differences in gene expression with patient outcome have assessed candidate genes with limited predictive power that provide no additional prognostic information above the conventional variables. This accentuates the need to discover novel genes with strong predictive ability.
One approach is to define patterns of gene expression that correlate with disease phenotype and patient outcome. Here, we undertook a systematic search for novel biomarkers of prostate cancer prognosis by outcome-based analyses of transcript profiles.
The present invention evaluates gene expression profiles and identifies prognostic genes of prostate cancers. The present invention provides a method of determining prognosis of prostate cancer and predicting prostate cancer outcome of a patient. The method comprises the steps of first establishing the threshold value of at least one prognostic gene of prostate cancer. Then, the amount of the prognostic gene from a prostate tissue of a patient afflicted with prostate cancer is determined. The amount of the prognostic gene present in that patient is compared with the established threshold value of the prognostic gene, whereby the prognostic outcome of the patient is determined.
In certain embodiments, the amount of the prognostic gene is determined by the quantitation of a transcript encoding the sequence of the prognostic gene; or a polypeptide encoded by the transcript. The quantitation of the transcript can be based on hybridization to the transcript. The quantitation of the polypeptide can be based on antibody detection. The method optionally comprises a step of amplifying nucleic acids from the tissue sample before the evaluating. In some embodiments, the evaluating is of a plurality of sequences. The method may further comprise determining prostate-specific antigen (PSA) level. The prognosis contributes to selection of a therapeutic strategy.
The file of this patent contains at least one drawing executed in color. Copies of this patent with color drawing(s) will be provided by the Patent and Trademark Office upon request and payment of the necessary fee.
units (Y-axis) of fluorescence signal detected by microarray analysis.
Current models of prostate cancer classification are poor at distinguishing between tumors that have similar histopathological features but vary in clinical course and outcome. In the present invention, we have applied classical survival analysis to genome-wide gene expression profiles of prostate cancers and preoperative prostate-specific antigen levels from each patient, to identify prognostic markers of disease relapse that provide additional predictive value relative to prostate-specific antigen concentration. The present invention demonstrates that multivariable survival analysis can be applied to gene expression profiles of prostate cancers with censored follow-up data and used to identify molecular markers of prostate cancer relapse with strong predictive power and relevance to the etiology of this disease.
Prostate Cancer and Treatments
Prostate cancer is found mainly in older men. Prostate cancer is the most commonly diagnosed internal malignancy and second most common cause of cancer death in men in the U.S., resulting in approximately 40,000 deaths each year. Landis et al. (1998) CA Cancer J. Clin. 48:6-29; and Greenlee, et al. (2000) CA Cancer J. Clin 50:7-13. The incidence of prostate cancer has been increasing rapidly over the past 20 years in many parts of the world. Nakata, et al. (2000) Int. J. Urol. 7:254-257; and Majeed, et al. (2000) BJU Int. 85:1058-1062. It develops as the result of a pathologic transformation of normal prostate cells. In tumorigenesis, the cancer cell undergoes initiation, proliferation, and loss of contact inhibition, culminating in invasion of surrounding tissue and, ultimately, metastasis.
Prostate cancer is a disease in which malignant (cancer) cells form in the tissues of the prostate. The prostate is a gland in the male reproductive system located just below the bladder (the organ that collects and empties urine) and in front of the rectum (the lower part of the intestine). It is about the size of a walnut and surrounds part of the urethra (the tube that empties urine from the bladder). The prostate gland produces fluid that makes up part of the semen. See generally, Boyle, et al. (2002) Textbook of Prostate Cancer Isis Medical Media, ISBN: 1901865304; Kantoff (ed. 2002) Prostate Cancer: Principles and Practice Lippincott, ISBN: 0781720060; Carroll (2001) Prostate Cancer Decker, ISBN: 1550091301; Belldegrun, et al. (2000) New Perspectives in Prostate Cancer Isis Medical Media, ISBN: 1901865568; Lepor (1999) Prostatic Diseases Saunders, ISBN: 072167416X; Petrovich, et al. (eds. 1996) Carcinoma of the Prostate: Innovations in Management, Springer Verlag, ISBN: 3540587497; and standard prostate cancer medical texts.
Four types of standard treatment are used for prostate cancer: watchful waiting, surgery, radiation therapy, or hormone ablation therapy. See, e.g., the National Cancer Institute (NCI) description of prostate cancer, www.cancer.gov.
Watchful waiting is closely monitoring a patient's condition but withholding treatment until symptoms appear or change. This is usually used in older men with other medical problems and early stage disease.
Surgery is usually offered to prostate cancer patients in good health who are younger than 70 years old. The main surgery options are pelvic lymphadenectomy, radical prostatectomy, perineal prostatectomy, and transurethral resection of the prostate.
Pelvic lymphadenectomy is a surgical procedure to take out lymph nodes in the pelvis to see if they contain cancer. If the lymph nodes contain cancer, the doctor will not remove the prostate and may recommend other treatment. Radical prostatectomy (RP) is surgery to remove the entire prostate. Radical prostatectomy is done only if tests show the cancer has not spread outside the prostate. The two types of radical prostatectomy are retropubic prostatectomy, which removes the prostate through an incision made in the abdominal wall, and removal of surrounding lymph nodes (lymphadenectomy) can be done at the same time; and perineal prostatectomy, which is surgery to remove the prostate through an incision made between the scrotum and the anus, and if surrounding lymph nodes are to be removed, this is usually done through a separate incision. Transurethral resection of the prostate is a surgical procedure to remove tissue from the prostate using an instrument inserted through the urethra. This operation is sometimes done to relieve symptoms caused by the tumor before other treatment is given. Transurethral resection of the prostate may also be done in men who cannot have a radical prostatectomy because of age or illness.
Impotence and leakage of urine from the bladder or stool from the rectum may occur in men treated with surgery. In some cases, doctors can use a technique known as nerve-sparing surgery. This type of surgery may save the nerves that control erection. However, men with large tumors or tumors that are very close to the nerves may not be able to have this surgery.
Radiation therapy is the use of x-rays or other types of radiation to kill cancer cells and shrink tumors. Radiation therapy may use external radiation (using a machine outside the body) or internal radiation. Internal radiation involves putting radioisotopes (materials that produce radiation) through thin plastic tubes into the area where cancer cells are found. Prostate cancer is treated with external and internal (implant) radiation. Radiation therapy may be used alone or in addition to surgery. Impotence and urinary problems may occur in men treated with radiation therapy.
Hormone therapy is the fourth of the standard treatments. Hormones are chemicals produced by glands in the body and circulated in the bloodstream. Hormone therapy is the use of hormones to stop cancer cells from growing. Male hormones (especially testosterone) can help prostate cancer grow. To stop the cancer from growing, female hormones or drugs that decrease production of male hormones may be given. Hormone therapy used in the treatment of prostate cancer may include the following: estrogens (hormones that promote female sex characteristics), which can prevent the testicles from producing testosterone, although estrogens are seldom used today in the treatment of prostate cancer because of the risk of serious side effects; luteinizing hormone-releasing hormone agonists, e.g., leuprolide, goserelin, and buserelin, which also can prevent the testicles from producing testosterone; antiandrogens, e.g., flutamide and bicalutamide, which can block the action of androgens (hormones that promote male sex characteristics); drugs that can prevent the adrenal glands from making androgens, e.g., ketoconazole and aminoglutethimide; and orchiectomy, which is surgery to remove the testicles, the main source of male hormones, to decrease hormone production. Hot flashes, impaired sexual function, and loss of desire for sex may occur in men treated with hormone therapy.
Deaths from prostate cancer are typically a result of metastasis of a prostate tumor. Therefore, early detection of the development of prostate cancer is critical in reducing mortality from this disease. Measuring levels of prostate-specific antigen (PSA) has become a very common method for early detection and screening, and may have contributed to the slight decrease in the mortality rate from prostate cancer in recent years. Nowroozi, et al. (1998) Cancer Control 5:522-531. However, many cases are not diagnosed until the disease has progressed to an advanced stage.
Prognosis, Outcome
Prognosis is typically recognized as a forecast of the probable course and outcome of a disease. See Dorland's Medical Dictionary. As such, it involves inputs of both statistical probability, requiring numbers of samples, and outcome data. Herein, outcome data is utilized in the form of prostate cancer recurrence after RP. A patient population of many dozens is included, providing statistical power.
The ability to determine which cases of prostate cancer will respond to treatment, and to which type of treatment, would be useful in appropriate allocation of treatment resources. As indicated above, the various standard therapies have significantly different risks and potential side effects. Accurate prognosis would also minimize application of treatment regimens which have low likelihood of success. It could also avoid delay in the application of alternative treatments which may have higher likelihoods of success for a particular presented case. Thus, the ability to evaluate individual prostate cases for markers which subset into responsive and non-responsive groups for particular treatments is very useful.
Current models of prostate cancer classification are poor at distinguishing between tumors that have similar histopathological features but vary in clinical course and outcome. Kattan, et al. (1998) J. Nat'l Cancer Inst. 90:766-771; and Kattan, et al. (1999) J. Clin. Oncol. 17:1499-1507. Identification of novel prognostic molecular markers is a priority if radical treatment is to be offered on a more selective basis to those prostate cancer patients with clinically significant disease. A novel strategy is described to discover molecular markers for prostate cancer prognosis by assessing genome-wide gene expression in many localized prostate cancers and modeling these data based on each patient's known clinical outcome and preoperative serum prostate-specific antigen concentration. The study herein is directed to molecularly defining different forms of prostate cancer in a way that can translate directly into prognosis. Such prognosis allows for application of a treatment regimen having a greater statistical likelihood of cost-effective treatment and minimization of negative side effects from the different treatment options.
Prostate cancer biopsy samples were collected and analyzed for gene expression across most genes of the human genome. Among genes detected at appropriate levels, correlations with outcome data were evaluated. Genes whose expression levels correlated with statistical significance to outcome data were identified.
This approach identified about 270 genes that demonstrated a strong association (P<0.01) with disease outcome, e.g., prostate cancer relapse, and were superior in their predictive ability relative to prostate-specific antigen levels, one of the standard markers. One of these genes, the putative calcium channel protein trp-p8, is androgen-regulated and loss of trp-p8 appears to be associated with aggressive disease. The findings provide a method of survival analysis of gene expression profiles of cancers with censored follow-up data and identify novel molecular markers of prostate cancer progression with strong predictive power that may be used to select prostate cancers with an aggressive phenotype.
Thus, the invention herein provides statistical correlations of marker expression in appropriate samples with disease outcome.
Survival Analysis
The present invention provides the application of classical multivariable survival analysis to a prostate cancer microarray data set incorporating the expression profiles of over 46,000 genes, to identify markers of disease outcome. This technique provides several significant advances over previous methods of analysis that have been used to discover markers of disease outcome from microarray data. In contrast to previously described statistical methods that rely on the classification of tumors based on known outcome (18) or known classifiers of patient outcome (e.g., estrogen receptor status) (19, 20), this technique provides for censored data. This enables these analyses to proceed prior to the occurrence of all events, in this case, PSA relapse. Moreover, this survival analysis incorporates the time taken to PSA relapse and may also include covariates (e.g., preoperative serum PSA levels) in order to identify genes that provide additional predictive value above conventional markers of outcome. The statistical analyses described herein have also incorporated a stringent method of estimating the pFDR that was recently described (10). This method is designed specifically for the analysis of microarray data where general dependence between hypotheses or “clumpy dependence” exists, where 50 or more genes interact in common pathways to produce some overall process (10). However, this is the first instance in which it has been applied to microarray data from a survival analysis.
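How censored follow-up enters such an analysis can be illustrated with a minimal Python sketch of the partial log-likelihood of a Cox proportional hazards model, the model type named elsewhere in this description. This is an illustration under stated assumptions, not the implementation used in the study: the function name, the toy follow-up data, and the assumption of no tied relapse times are all hypothetical.

```python
import math


def cox_partial_log_likelihood(times, relapsed, covariates, betas):
    """Cox partial log-likelihood, assuming no tied relapse times.

    times      -- follow-up time for each patient
    relapsed   -- True if PSA relapse was observed, False if censored
    covariates -- one row per patient, e.g. [probeset intensity, preoperative PSA]
    betas      -- one regression coefficient per covariate
    """
    # Linear predictor (log relative hazard) for each patient.
    eta = [sum(b * x for b, x in zip(betas, row)) for row in covariates]
    log_lik = 0.0
    for i, t_i in enumerate(times):
        if not relapsed[i]:
            # Censored patients add no event term of their own ...
            continue
        # ... but they remain in the risk set of every relapse that occurs
        # while they are still under follow-up.
        risk_set = sum(
            math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i
        )
        log_lik += eta[i] - math.log(risk_set)
    return log_lik


# Hypothetical toy data: four patients, two of whom are censored.
times = [5.0, 8.0, 12.0, 20.0]
relapsed = [True, False, True, False]
covariates = [[2.1, 9.0], [1.4, 6.5], [1.9, 12.0], [0.7, 4.2]]

# With all coefficients at zero, each relapse contributes -log(risk set size).
null_log_lik = cox_partial_log_likelihood(times, relapsed, covariates, [0.0, 0.0])
```

In a per-gene screen of this kind, the coefficients would be chosen to maximize this quantity, with the PSA covariate kept in every model so that each gene is judged on the information it adds beyond PSA.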
A recently published analysis to discover new markers of prostate cancer outcome utilized microarray analyses of prostate cancers to classify small groups of tumors where the recurrence status was known (21). That study found that no single gene was statistically associated with recurrence at P<0.05 and instead adopted a 5-gene model that most commonly included chromogranin A and inositol triphosphate receptor 3 (IP3R). The significant differences between our study and these previously published data are (1) our adoption of a Cox proportional hazards model, and (2) our observation that 277 individual genes were predictive for prostate cancer relapse, none of which overlapped with the genes in the 5-gene model identified by Singh et al. (2002). There are two prevailing explanations for the latter discrepancy. Firstly, the number of genes interrogated by oligonucleotide microarrays in our study was 4-fold greater; trp-p8 is an example of a gene which was not present in the oligonucleotide array used in the previous study. As a result, the genes identified by Singh et al. (2002), were associated with P values of less significance than those presented in Tables 1 and 2. Secondly, by utilizing a statistical method that applies to censored data, we were able to take into account the varying times to prostate cancer relapse in this model. Therefore, we were able to use our full data set in the analysis, rather than restricting the analysis to those patients with a specified length of follow-up. The larger data set and concomitant increase in statistical power may also contribute to our results differing from those of Singh et al.
The TRP channels are made of subunits with six membrane-spanning domains, with both carboxy and amino termini located intracellularly, that probably assemble into tetramers to form non-selective cationic channels, which allow for the influx of calcium ions into the cell. Trp-p8 or TRPM8 is a member of the TRPM subfamily of TRP ion channels that have potential roles in Ca2+-dependent signaling, control of cell cycle proliferation, cell division and cell migration. Ligand binding to some membrane receptors initiates a sequence of events that lead to the activation of phospholipase C, generating inositol-1,4,5-triphosphate which opens the intracellular ion channel IP3R and liberates Ca2+ from the endoplasmic reticulum. Activation of the TRP channels accompanies this chain of events, allowing the influx of calcium ions into the cells, although their activation is not necessarily directly linked to Ca2+ depletion from internal stores (22). Calnexin, which is also identified in this analysis as a marker of potential prognostic utility (P=0.004), is believed to be a key chaperone involved in the folding, assembly and oligomerization of newly synthesised IP3R receptors (24). Thus, our study suggests an important role for phosphatidylinositol signal transduction.
Our observation that loss of trp-p8 is associated with a poor prognosis is also reminiscent of the prognostic role of another of the TRPM subfamily, TRPM1 or melastatin, in melanoma. Downregulation of melastatin mRNA in primary cutaneous melanoma is a prognostic marker for metastasis in patients with localized melanoma and is independent of conventional clinicopathological predictors of metastases (25). Recent studies showed that the rat (26) and mouse (27) orthologues of trp-p8 are functional calcium channels that respond to cold stimuli. Although cold is unlikely to be the natural stimulus for trp-p8 in the prostate, the implication that the human trp-p8 protein may be a functional Ca2+ channel suggests a role in the regulation of intracellular Ca2+ levels with possible effects on cell motility, cell proliferation and resistance to apoptotic stimuli.
In summary, our analyses have identified a group of genes that strongly correlate with prostate cancer relapse and contribute unique information to relapse prediction above preoperative PSA.
Prognosis Determination
One application of the survival analysis results is to generate a prognostic test for prostate cancer. First, we use TAQMAN® analysis to determine the absolute levels of prognostic genes in 75-150 or more prostate cancer patients. Then we correlate the absolute levels of the prognostic genes with patient outcome by a statistical analysis and determine threshold levels of the prognostic genes, from which we establish a profile of the threshold level of each prognostic gene associated with a good outcome. To determine the prognosis of a prostate cancer patient, the absolute level of one or more prognostic genes of this patient is determined and then compared with the above established threshold values. An absolute level higher (or lower, depending on the prognostic gene) than the threshold value indicates a good outcome.
The normalized quantitative level of absolute gene expression of a prognostic gene, from which outcome is predicted, is determined first. Quantitative polymerase chain reaction (PCR)-based methods can be applied. RT-PCR (reverse transcriptase PCR) primers are designed for selected prognostic genes, in order to perform a TAQMAN® analysis.
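The comparison step described above can be sketched in Python as follows. This is a minimal sketch, not a validated test: the class name, the function name, and the numeric threshold are hypothetical, and treating a level exactly equal to the threshold as not indicating a good outcome is an assumption. Only the direction used for trp-p8 comes from this description, which associates loss of trp-p8 with aggressive disease.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneThreshold:
    """Threshold rule for one prognostic gene (all values hypothetical)."""
    threshold: float       # established threshold expression level
    higher_is_good: bool   # True if levels above the threshold indicate a good outcome


def indicates_good_outcome(level: float, rule: GeneThreshold) -> bool:
    """Compare a patient's normalized expression level with the threshold.

    Assumption: a level exactly equal to the threshold does not count as
    indicating a good outcome, whichever direction the rule points.
    """
    if rule.higher_is_good:
        return level > rule.threshold
    return level < rule.threshold


# Hypothetical threshold value; the direction reflects the statement that
# loss of trp-p8 appears to be associated with aggressive disease.
trp_p8_rule = GeneThreshold(threshold=10.0, higher_is_good=True)

patient_level = 12.5  # hypothetical normalized expression level
result = indicates_good_outcome(patient_level, trp_p8_rule)
```

Keeping the direction of the association alongside each threshold lets a single comparison function serve genes whose high expression is favorable as well as genes whose low expression is favorable.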
TAQMAN® analysis is a real-time quantitative PCR, which is a powerful method used for gene expression analysis, genotyping, pathogen detection/quantitation, mutation screening and DNA quantitation. See, e.g., Bartlett (2003) PCR Protocols (2d ed.) Humana Press; and O'Connell (2002) RT-PCR Protocols, Humana Press. The technology uses, e.g., an ABI Prism instrument (TAQMAN®) to detect accumulation of PCR products continuously during the PCR process, thus allowing easy and accurate quantitation in the early exponential phase of PCR. The basis for PCR quantitation in the ABI instrument is to continuously measure PCR product accumulation using a dual-labeled fluorogenic oligonucleotide probe called a TAQMAN® probe. This probe is composed of a short (ca. 20-25 bases) oligodeoxynucleotide that is labeled with two different fluorescent dyes. On the 5′ terminus is a reporter dye and on the 3′ terminus is a quenching dye. This oligonucleotide probe sequence is homologous to an internal target sequence present in the PCR amplicon. When the probe is intact, energy transfer occurs between the two fluorophores and emission from the reporter is quenched by the quencher. During the extension phase of PCR, the probe is cleaved by the 5′ nuclease activity of Taq polymerase, thereby releasing the reporter from the oligonucleotide-quencher and producing an increase in reporter emission intensity. The laser light source excites each well and a CCD camera measures the fluorescence spectrum and intensity from each well to generate real-time data during PCR amplification. The ABI Prism software examines the fluorescence intensity of reporter and quencher dyes and calculates the increase in normalized reporter emission intensity over the course of the amplification. The results are then plotted versus time, represented by cycle number, to produce a continuous measure of PCR amplification.
To provide precise quantification of initial target in each PCR reaction, the amplification plot is examined at a point during the early log phase of product accumulation. This is accomplished by assigning a fluorescence threshold above background and determining the time point at which each sample's amplification plot reaches the threshold (defined as the threshold cycle number or CT). Differences in threshold cycle number are used to quantify the relative amount of PCR target contained within each tube as described previously.
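The threshold-cycle calculation described above can be sketched in Python. This is a minimal sketch under stated assumptions rather than the instrument software's algorithm: the function names are hypothetical, the crossing point is located by linear interpolation between cycles, and the conversion from a CT difference to a relative amount assumes that product doubles every cycle (an amplification efficiency of 2), which this description does not specify.

```python
def threshold_cycle(fluorescence, threshold):
    """Return the fractional cycle at which the curve first reaches the threshold.

    fluorescence -- normalized reporter signal, one value per cycle,
                    with fluorescence[0] belonging to cycle 1
    threshold    -- fluorescence threshold set above background
    Returns None if the curve never reaches the threshold.
    """
    for cycle in range(1, len(fluorescence)):
        prev, curr = fluorescence[cycle - 1], fluorescence[cycle]
        if prev < threshold <= curr:
            # Linear interpolation between cycle `cycle` and cycle `cycle + 1`.
            return cycle + (threshold - prev) / (curr - prev)
    return None


def relative_amount(ct_sample, ct_reference, efficiency=2.0):
    """Relative starting amount of target in a sample versus a reference.

    Assumes the product is multiplied by `efficiency` every cycle, so a
    sample that reaches the threshold earlier started with more target.
    """
    return efficiency ** (ct_reference - ct_sample)


# Hypothetical amplification curve that doubles each cycle.
curve = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
ct = threshold_cycle(curve, threshold=12.0)

# A sample crossing three cycles before the reference holds 2**3 times as much target.
fold = relative_amount(ct_sample=22.0, ct_reference=25.0)
```

Reading the curve in the early log phase, as the text specifies, is what makes the per-cycle multiplication assumption reasonable; later cycles plateau and no longer reflect the starting amount.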
The TAQMAN® primers are designed within the open-reading frame of the prognostic gene of interest so that the amplicon averages 80 bp. Prostate tissue samples from 70-150 or more prostate cancer patients with known histories are collected and RNA is extracted from these samples using standard methods. TAQMAN® analysis is performed on these samples for the appropriate genes. Using the TAQMAN® analysis, the normalized absolute levels of the prognostic genes are then correlated with patient outcome. Using statistical analyses the threshold level of gene expression, which predicts outcome, is then determined. Subsequent patient samples can then be analyzed for potential of relapse and the physician can better define the patient treatment based on whether the patient is predicted to relapse. Subsetting of the data into various outcomes is achieved through statistical analyses. (Snedecor and Cochran (1994) Statistical Methods (8th ed.) Iowa State University Press; and Duda, et al. (2001) Pattern Classification (2d ed.) Wiley and Sons.)
Genes, Markers, Kits
The present study provides specific identification of multiple genes whose expression levels in biological samples will serve as markers to evaluate prostate cancer cases. These markers have been selected for statistical correlation to disease outcome data on a large number of prostate cancer patients.
The expression levels of these markers in a biological sample may be evaluated by many methods. They may be evaluated for RNA expression levels. Hybridization methods are typically used, and may take the form of a PCR or related amplification method. Alternatively, a number of qualitative or quantitative hybridization methods may be used, typically with some standard of comparison, e.g., actin message. Alternatively, measurement of protein levels may be performed by many means. Typically, antibody based methods are used, e.g., ELISA, radioimmunoassay, etc., which may not require isolation of the specific marker from other proteins. Other means for evaluation of expression levels may be applied upon purification of the marker. Antibody purification may be performed, though separation of the protein from others, and evaluation of specific bands or peaks on protein separation, may provide the same results. Thus, e.g., mass spectroscopy of a protein sample may indicate that quantitation of a particular peak will allow detection of the corresponding marker. Multidimensional protein separations may provide for quantitation of specific purified entities.
Tables 1A-C describe markers of the invention useful for the prognosis of prostate cancer.
Table 1A shows radical prostatectomy samples that were analyzed using the Eos Hu03 GENECHIP®, which contains 59680 probesets. Each probeset's intensity measure was entered as a continuous explanatory variable in a Cox proportional hazards regression survival analysis predicting relapse. Pretreatment PSA concentration was entered as a predictor in each analysis. The interquartile range hazard ratio (IQR HR) for each probeset was then calculated. This approach was used since in conventional Cox proportional hazards analyses, the hazards ratios for a covariate are computed by raising e, the base of natural logarithms, to the power of its regression coefficient. However, because the expression data are treated here as continuous covariates, hazards ratios expressed in this manner illustrate only the change in risk of relapse associated with a change of 1 unit on the expression scale, a change too small to be meaningful. To put the hazard ratios and associated confidence limits on a more interpretable scale, presented here is the hazards ratio associated with a change in expression values equivalent to 1 interquartile range (IQR) of the sample data for each probeset. The IQR is simply the 75th percentile minus the 25th percentile, and thus contains the middle 50 percent of observations. From this analysis, 266 probesets were found to be significant predictors of relapse at P<0.01.
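The rescaling described above amounts to raising e to the power of the regression coefficient multiplied by the probeset's IQR. A minimal Python sketch follows; the function names and the toy values are hypothetical, and the particular percentile interpolation rule (the "inclusive" method of the standard library) is an assumption, since this description does not say how the 25th and 75th percentiles were computed.

```python
import math
import statistics


def interquartile_range(values):
    """75th percentile minus 25th percentile of the sample data."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def iqr_hazard_ratio(beta, values):
    """Hazard ratio for a change in expression of one interquartile range.

    beta   -- Cox regression coefficient per 1 unit of expression
    values -- expression values of the probeset across all samples
    The conventional per-unit hazard ratio is exp(beta); the rescaled
    version is exp(beta * IQR).
    """
    return math.exp(beta * interquartile_range(values))


# Hypothetical expression values and coefficient for one probeset.
expression = [1.0, 2.0, 3.0, 4.0, 5.0]
beta = math.log(2) / 2

per_unit_hr = math.exp(beta)                  # change in risk per 1 expression unit
iqr_hr = iqr_hazard_ratio(beta, expression)   # change in risk per IQR (here IQR = 2)
```

Because each probeset is rescaled by its own IQR, the resulting hazard ratios compare a typical high-expressing tumor with a typical low-expressing tumor for that probeset, which makes values comparable across probesets with very different intensity ranges.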
Table 1B lists the accession numbers for Pkey's lacking UnigeneID's for table 1A. For each probeset is listed the gene cluster number from which oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and Alignment Tools (DoubleTwist, Oakland Calif.). Genbank accession numbers for sequences comprising each cluster are listed in the “Accession” column.
Table 1C shows genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in table 1A. For each predicted exon, is listed the genomic sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.
Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
p value: p value for relapse prediction (see Table 1A description)
Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. “Dunham I. et al.” refers to the publication entitled “The DNA sequence of human chromosome 22.” Dunham I. et al., Nature (1999) 402: 489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
Note: the ExAccn prefix “NM_” is abbreviated to “NM” in Tables 1A-C.
Table 2 lists the first 50 genes, ranked by P value, identified by survival analysis to be associated with prostate cancer relapse.
a: The risk of relapse is the IQR HR calculated for each probeset as described in “Materials and Methods.”
Sequences described therein, where incomplete, may be extended either by informatics techniques, or by techniques of biochemistry and molecular biology. Many well known methods are available. See, e.g., Mount (2001) Bioinformatics: Sequence and Genome Analysis CSH Press, NY; Baxevanis and Ouellette (eds. 1998) Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (2d. ed.) Wiley-Liss; Ausubel, et al. (eds. 1999 and supplements) Current Protocols in Molecular Biology Lippincott; and Sambrook, et al. (2001) Molecular Cloning: A Laboratory Manual (3d ed., Vol. 1-3) CSH Press.
Nucleic acid sequences are particularly described. Using linkages to publicly accessible databases, e.g., GenBank accession numbers, sequences are described whose presence or absence in the samples provides prognostic capacity. Correlations are made between the detection of such sequence and the outcomes of the prostate cancers. Thus, detection of physically linked, e.g., adjacent or contiguous, sequence will be equivalent. The correlation between presence of a 5′ segment will be equivalent to such with a 3′ segment of the same physical molecule.
The tables also provide protein sequences which correspond to the identified nucleic acid sequences. The amino acid embodiments of the markers will also exhibit similar correlations with outcome. Thus, the use of the protein embodiments can also be used in the invention. Proteins or fragments can be produced, and antibodies generated. See, e.g., Coligan (1991) Current Protocols in Immunology Lippincott; Harlow and Lane (1988) Antibodies: A Laboratory Manual CSH Press; and Goding (1986) Monoclonal Antibodies: Principles and Practice (2d ed.) Academic Press.
Kits for use in the prognostic methods are also made available. The kits will include reagents for detecting the markers, e.g., at the nucleic acid or protein level. Thus, for nucleic acid expression level prognosis kits, typically PCR primers or detectable hybridization probes will be included. For protein level prognosis kits, typically antibodies will be used to quantitate or detect the appropriate gene products. Typically, instructions will be provided, and the kits may also include buffers or instructions for proper disposal of the materials.
Diagnostic, Therapeutic Applications
After prostate cancer has been identified, tests are performed to find out if cancer cells have spread within the prostate or to other parts of the body. Prostate cancer is typically classified into stages I-IV. The following tests and procedures may be used in the staging process: radionuclide bone scan, pelvic lymphadenectomy, CT scan, and seminal vesicle biopsy.
The list of targets may have other diagnostic applications besides outcome prediction. These identified markers may be valuable in such stage subsetting, distinct from outcome subsetting. Typically, after initial diagnosis, tests are performed to determine if cancer cells have spread within the prostate or to other parts of the body. Evaluation of the identified markers, singly or in combinations, may substitute for other tests to assign stage, or add to them for confirmation. Alternatively, the detection of one or more of these markers may be used as early detection screens for prostate cancer. Preferably, if the marker is soluble or released into a readily accessible body fluid, e.g., serum, semen, or urine, a diagnostic test for detection may allow for early detection of prostate cancer.
The invention is illustrated further by the following examples, which are not to be construed as limiting the invention in scope to the specific procedures described in them.
Tissue Collection and Preparation of RNA
A cohort of 72 fresh-frozen prostate cancers was collected from patients with localized prostate cancer treated by radical prostatectomy (RP) at St. Vincent's Hospital, Sydney. The primary outcome, disease-specific relapse, was measured from the date of RP and was defined as a rise in serum PSA above 0.3 ng/ml with subsequent further rises. Following inking of the external limits of the prostate immediately after removal and prior to formalin fixation, up to six 5 mm core biopsies were taken and stored at −80° C. for later RNA extraction. The proportion of invasive cancer in the biopsy sample was then estimated retrospectively by either frozen sectioning of the biopsy and hematoxylin and eosin staining, or by examination of archival formalin-fixed, paraffin-embedded tissue surrounding the biopsy site. Only those biopsies that contained ≧75% invasive cancer were used for subsequent transcript profiling. Only one biopsy per patient was analyzed.
Xenograft Model
The androgen-dependent LuCaP-35 (7) prostate cancer xenograft was grown as subcutaneous tumors in nude male mice. To study the androgen-withdrawal process, tumor-bearing mice were castrated and monitored for tumor regression and PSA levels. Tumors were harvested from mice prior to castration and at various time points (1-100 days) post-castration, and were processed for microarray analysis. For data analysis and identification of androgen-regulated genes, the samples were binned in two groups consisting of days 0-2 and days 5-100 post-castration. Genes that showed a significant (P<0.01) difference in the means of each group were identified by a standard Student's t-test.
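The binning and two-group comparison described above can be sketched as follows. This is a minimal illustration, not the procedure actually run: the function names and the `(day, intensity)` sample layout are assumptions, and only the pooled-variance t statistic is computed (converting it to a P value requires a t-distribution table or a statistics package).

```python
from math import sqrt
from statistics import mean, variance


def bin_by_day(samples):
    """Split (day, intensity) pairs into the two groups named in the text.

    Days 0-2 form the early group and days 5-100 the late group;
    samples from any other day fall in neither bin and are dropped.
    """
    early = [value for day, value in samples if 0 <= day <= 2]
    late = [value for day, value in samples if 5 <= day <= 100]
    return early, late


def student_t(group_a, group_b):
    """Pooled-variance (Student's) t statistic for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))


# Hypothetical intensities for one probeset; the day-3 sample is not binned.
samples = [(0, 10.0), (1, 12.0), (2, 11.0), (3, 99.0), (5, 4.0), (20, 5.0), (100, 6.0)]
early, late = bin_by_day(samples)
t_stat = student_t(early, late)
```

The statistic has `len(early) + len(late) - 2` degrees of freedom.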
RNA Extraction and Microarray Protocols
Total RNA was prepared from fresh-frozen prostate and xenograft tissue by extraction with Trizol reagent (Life Technologies Inc., Gaithersburg, Md.) and was reverse transcribed using a primer containing oligo(dT) and a T7 promoter sequence. The resulting cDNAs were then in vitro transcribed in the presence of biotinylated nucleotides (Bio-11-CTP and Bio-16-UTP) using the T7 MEGAscript kit (Ambion, Austin, Tex.).
The biotinylated targets were hybridized to the Eos Hu03, a customized Affymetrix GENECHIP® (Affymetrix, Santa Clara, Calif.) oligonucleotide array comprising 59,619 probesets representing 46,000 unique sequences including both known and FGENESH predicted exons that were based on the first draft of the human genome. Hybridization signals were visualised using phycoerythrin-conjugated streptavidin (Molecular Probes, Eugene, Oreg.). Normalization of the data was performed as follows. The probe-level intensity data from each array were fitted to a fixed gamma distribution with a mean of 300 and a shape parameter of 0.81. This normalization procedure removes between-chip variation attributable to non-biological factors. Then, for each probeset, a single measure of average intensity was calculated using Tukey's trimean of the intensity of the constituent probes (8). Finally, a correction for nonspecific hybridization was applied, in which the average intensity measure of a “null” probeset consisting of probes with scrambled sequence was subtracted from all other probesets on the chip.
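The last two steps above, the trimean summary and the null-probeset subtraction, can be illustrated with a short sketch. Tukey's trimean is (Q1 + 2 × median + Q3) / 4. The quartile convention used here (linear interpolation between order statistics) and all function names are assumptions for illustration, and the gamma-distribution normalization step is not reproduced.

```python
def quantile(values, q):
    """Quantile by linear interpolation between order statistics (assumed convention)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def trimean(values):
    """Tukey's trimean: a weighted average of the quartiles and the median."""
    q1 = quantile(values, 0.25)
    med = quantile(values, 0.50)
    q3 = quantile(values, 0.75)
    return (q1 + 2 * med + q3) / 4


def probeset_intensity(probe_intensities, null_probe_intensities):
    """Trimean of a probeset minus the trimean of the scrambled-sequence null probeset."""
    return trimean(probe_intensities) - trimean(null_probe_intensities)


# Hypothetical probe values: one outlying probe (100) barely moves the trimean.
clean = trimean([1, 2, 3, 4, 5])
with_outlier = trimean([1, 2, 3, 4, 100])
corrected = probeset_intensity([1, 2, 3, 4, 100], [0, 1, 1, 1, 2])
```

Because the trimean depends only on the quartiles and the median, a single aberrant probe has little influence on the probeset summary, which is presumably why a robust estimator was preferred over the mean.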
Statistical Methods
Prior to survival analysis, a screen was applied to the expression data to eliminate probesets without meaningful variation. For each probeset, the ratio of the 90th percentile to the 15th percentile intensity measure was required to be at least 2, and the minimum expression level was required to be at least 150 average intensity units. Separate Cox proportional hazards analyses with pretreatment PSA concentration dichotomised at 20 ng/ml and gene expression modeled as a continuous variable were used to identify gene expression that correlated with PSA recurrence (9). The IQR hazards ratio was computed by multiplying the regression coefficient for each probeset by its own interquartile range prior to exponentiation. The positive false discovery rate (pFDR) was calculated using the method described by Storey and Tibshirani (10). Schoenfeld residuals were used to assess the proportional hazards assumption for the two probesets for trp-p8 and the assumption was found to be upheld in both cases.
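The variation screen described above can be sketched as a simple per-probeset filter. This is an illustration under stated assumptions: the percentile convention (linear interpolation) and the function names are hypothetical, and "minimum expression level" is read literally as the lowest intensity across samples, which is one possible interpretation of the text.

```python
def percentile(values, pct):
    """Percentile by linear interpolation between order statistics (assumed convention)."""
    ordered = sorted(values)
    pos = (pct / 100) * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def passes_screen(intensities, min_ratio=2.0, min_level=150.0):
    """Return True if a probeset shows enough expression and cross-sample variation."""
    # Literal reading of the text: the lowest value must reach min_level.
    # Checking this first also guarantees a positive denominator below.
    if min(intensities) < min_level:
        return False
    ratio = percentile(intensities, 90) / percentile(intensities, 15)
    return ratio >= min_ratio


# Hypothetical probesets measured across ten samples.
flat = [200.0] * 10
variable = [150, 160, 170, 180, 200, 300, 400, 500, 600, 700]
weak = [50, 60, 70, 80, 90, 100, 200, 300, 400, 500]
```

Here `flat` fails on the ratio, `weak` fails on the expression floor, and `variable` passes both criteria.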
Variables of clinical relevance were also modeled in univariate analyses for their ability to predict disease-free survival in the 72 prostate cancers using the Cox proportional hazards model. Trp-p8 mRNA expression, assessed by ISH, was reported as proportions within histological groups and compared between groups using Fisher's exact test. The expression dataset of 277 selected probesets from 72 samples was reordered according to cluster analysis in both dimensions (probesets and samples). In each analysis, the distance metric was the square root of (1−r), where r is the standard Pearson product-moment correlation. The clustering algorithm used was Ward's minimum variance method (11).
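The distance metric named above, the square root of (1−r), can be written out directly. The sketch below covers only the distance computation; the function names are illustrative, and the Ward's clustering step itself is not implemented here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_distance(x, y):
    """Distance sqrt(1 - r): 0 for perfectly correlated profiles, sqrt(2) for r = -1."""
    r = pearson_r(x, y)
    # Clamp at zero in case rounding pushes r marginally above 1.
    return sqrt(max(0.0, 1.0 - r))


# Hypothetical expression profiles across three samples.
same_shape = correlation_distance([1, 2, 3], [2, 4, 6])
opposite = correlation_distance([1, 2, 3], [3, 2, 1])
```

A matrix of such pairwise distances, computed once across probesets and once across samples, is the kind of input a hierarchical clustering routine would consume.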
In order to evaluate the ability of the 11 genes used by Singh et al. to accurately predict relapse status in aggregate in our dataset, we entered these 11 probesets into a multivariate Cox regression model and used variable selection methods to choose a subset of predictors. Three different methods were used (forward selection, backward elimination, and stepwise selection), all using P=0.15 as the inclusion/exclusion criterion. In each case, the final model using 4 probesets had a significance level of P=0.0029 by the likelihood ratio test.
All statistical analyses were performed using SAS (SAS Institute Inc., Cary, N.C.).
Tissue Microarray and In Situ Hybridization
Tissue microarrays were constructed as described previously (12) and comprised prostate cancer samples from 95 patients who are part of a previously published cohort of patients treated for localized prostate cancer by RP alone at St. Vincent's Hospital, Sydney (13). In addition, 13 prostate cancer specimens were collected from patients treated for localized prostate cancer by RP who had received at least 3 months (range 3-10 months) of preoperative neoadjuvant hormonal treatment (5 with anti-androgens alone, 6 with a combination of a Gn-RH analogue and anti-androgens and 2 with a Gn-RH analogue alone). Trp-p8 expression in these 13 samples was assessed on conventional tissue sections.
For ISH, a 424-base pair probe for trp-p8 was derived from the 3′ end of the trp-p8 gene and transcribed to produce a DIG-labeled riboprobe using an RNA DIG-labeling kit (Roche, Mannheim, Germany). ISH was performed on the VENTANA DISCOVERY™ instrument (Ventana Medical Systems, Tucson, Ariz.) using the RIBOMAP™ kit with protease P2 for 2 minutes (Ventana Medical Systems, Tucson, Ariz.) and hybridization for 8 hours at 65° C. Chromogenic detection was achieved with the BLUEMAP™ detection system as described by the manufacturer (Ventana Medical Systems, Tucson, Ariz.).
In this study, we sought to discover novel biomarkers that might predict PSA relapse following radical prostatectomy, using outcome-based statistical tools to analyze gene expression profiles of 72 prostate cancers. A criterion for selection was the ability to predict recurrence better than preoperative serum PSA concentration alone, since PSA is one of only a handful of markers that provide preoperative prognostic information. The 72 prostate tissues were collected at the time of radical prostatectomy (RP) from patients undergoing treatment for localized prostate cancer at St. Vincent's Hospital Campus, Sydney, Australia. At last follow-up (median=28.25 months, range 4.9-90.3 months), 17 of the 72 (23.6%) patients had relapsed, of whom 14 demonstrated a rise in postoperative PSA levels while 3 were diagnosed with a rising PSA and local recurrence of disease. Consistent with published data (5, 6, 13), the significant predictors of prostate cancer relapse in this cohort on univariate analysis were Gleason score (HR=1.88, P=0.027), surgical margins (HR=4.90, P=0.035) and preoperative PSA concentration (HR=4.43, P=0.006) (Table 1). The overall relapse rate of 23.6% and median time to relapse of 14 months in this group of 72 patients were similar to those observed in a cohort of 732 patients treated for localized prostate cancer by RP at the same institution between 1986 and 1999 (13).
aGleason score was modeled as a continuous variable.
RNA was extracted from a core biopsy taken at the time of RP for each of the 72 cases that comprised ≧75% cancer tissue. Biotinylated RNA from each sample was then analyzed with a customized GENECHIP® expression array, the Eos Hu03 (14). This single GENECHIP® microarray design is representative of greater than 90% of the expressed human genome based on the first public draft and comprises 59,619 probesets representative of both known and predicted genes (15). An initial screen was applied to the microarray probesets to choose genes expressed with reliable intensity and adequate cross-sample variance. This screen reduced the initial set of 59,619 probesets to a subset of 8,521 probesets for further examination.
Each probeset's intensity value was entered as a continuous explanatory variable in a Cox proportional hazards survival analysis predicting relapse. Pretreatment PSA concentration was also entered as a predictor in each analysis. From this analysis, 264 probesets were found to be significant predictors of relapse at P<0.01. To assist interpretation, we next calculated the interquartile range hazard ratio (IQR HR) for each probeset. Because the expression data are treated here as continuous covariates, hazards ratios expressed in their natural scale illustrate only the change in risk of relapse associated with a change of 1 unit on the expression scale, a change too small to be comprehended easily. To put the hazard ratios and associated confidence limits on a more interpretable scale, we present here the hazards ratio associated with a change in expression values equivalent to 1 interquartile range (IQR) of the sample data for each probeset. The IQR is simply the 75th percentile minus the 25th percentile, and thus contains the middle 50 percent of observations.
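The rescaling described above amounts to multiplying the Cox regression coefficient by the probeset's interquartile range before exponentiating, i.e. IQR HR = exp(β × IQR). The sketch below assumes the coefficient has already been estimated by a Cox model; the coefficient and intensity values are invented for illustration, and the quartile convention (linear interpolation) is an assumption.

```python
from math import exp


def quantile(values, q):
    """Quantile by linear interpolation between order statistics (assumed convention)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def interquartile_range(values):
    """75th percentile minus 25th percentile of the sample data."""
    return quantile(values, 0.75) - quantile(values, 0.25)


def iqr_hazard_ratio(beta, intensities):
    """Hazard ratio for a change in expression of one interquartile range."""
    return exp(beta * interquartile_range(intensities))


# Hypothetical probeset: per-unit coefficient of -0.005, IQR of 200 intensity units.
intensities = [100, 200, 300, 400, 500]
per_unit_hr = exp(-0.005)
iqr_hr = iqr_hazard_ratio(-0.005, intensities)
```

In this made-up case the per-unit hazard ratio is about 0.995, which is hard to interpret, while the IQR HR is exp(−1), about 0.37. Confidence limits can be placed on the same scale by applying the same multiplication to the limits of the coefficient.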
The multiple hypothesis testing problem has been recognized as an important issue to address in microarray research. The large number of tests that are performed simultaneously on thousands of probesets greatly increases the chances of making Type I errors (or false-positive findings). To assess the effect of multiple hypothesis testing, we adapted a method developed by Storey and Tibshirani (2001) for calculating the positive false discovery rate (pFDR), an estimate of the proportion of false positives present in a set of findings (10). This technique was developed explicitly for use with microarray data, for which the usual assumption of independence among tests is untenable. The procedure can be briefly summarized as follows. First, null data were simulated by randomly permuting the relapse status of subjects and re-performing the survival analyses. In each simulation, the number of relapsers and non-relapsers (17 and 55, respectively) remained constant, but these designations were shuffled and assigned to patients at random. The permutation was performed 500 times, and for each simulation, the number of findings at P<0.01 was noted. The mean number of findings across the 500 permutations was 85.9. This figure, an estimate of the expected number of false positives under null conditions, was then divided by the number of actual findings (n=264) to obtain an estimate of the proportion of false-positive findings. After the application of a correction factor (10), the final estimate for the pFDR was 23%. Thus, we can expect that approximately 61 of the 264 findings are false positives.
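The permutation step and the ratio that precedes the correction factor can be sketched as follows. This is only an outline under stated assumptions: the function names are hypothetical, the survival analyses that would be re-run on each permuted dataset are omitted, and the correction factor from reference (10) is not reproduced, so the sketch yields the uncorrected proportion rather than the final 23% estimate.

```python
import random


def permute_relapse_status(status, rng):
    """Shuffle relapse labels across patients, keeping the group sizes fixed."""
    shuffled = list(status)
    rng.shuffle(shuffled)
    return shuffled


def uncorrected_false_positive_proportion(null_counts, n_actual):
    """Mean number of findings under permutation divided by the actual findings."""
    mean_null = sum(null_counts) / len(null_counts)
    return mean_null / n_actual


# 17 relapsers (1) and 55 non-relapsers (0), as in the cohort described above.
status = [1] * 17 + [0] * 55
rng = random.Random(0)  # seeded only so the illustration is repeatable
permuted = permute_relapse_status(status, rng)

# Using the reported summary figures: a mean of 85.9 null findings, 264 actual.
proportion = uncorrected_false_positive_proportion([85.9], 264)
```

With the figures reported in the text, the uncorrected proportion is 85.9 / 264, or roughly 0.325; the published correction factor then reduces this to the stated pFDR.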
Identification of the Candidate Marker Genes
The 277 probesets (Table 1A-1C) identified by survival analysis included both known genes and hypothetical genes of unknown function, as well as ESTs.
Cluster analysis performed in both dimensions on the 72 RP samples and these 277 probesets using the Ward's minimum variance procedure identified two gene expression subgroups (
Notably, three of the 277 probesets showing strongest correlation with relapse in our model were identified as the gene for the putative calcium channel protein, trp-p8 (16). For all three probesets, loss of expression of trp-p8 mRNA was associated with significantly shorter PSA relapse-free survival, with IQR HRs of 0.26 (0.12-0.54; P<0.001), 0.32 (0.16-0.66; P=0.0022) and 0.27 (0.12-0.66; P=0.0045), respectively, when PSA was included in the analysis. Loss of trp-p8 also remained a significant predictor of PSA relapse when modeled alone or with Gleason score (data not shown). Subsequent analysis showed that expression of trp-p8 mRNA was primarily restricted to the prostate. Low-level expression was detected in normal liver and no detectable expression was seen in 32 other distinct normal tissues examined by oligonucleotide microarray analysis (
To gain further insight into the putative association of trp-p8 with androgen regulation, we examined the levels of trp-p8 expression in the prostate tissue of patients who were treated with androgen deprivation therapy (neoadjuvant hormonal therapy, NHT) prior to RP. In situ hybridization (ISH) for trp-p8 mRNA was performed on RP specimens from 13 patients who had received at least 3 months of preoperative NHT, and the levels were compared with tissue from 95 patients treated with RP alone (
Taken together, these data from cell lines, prostate cancer xenografts and clinical specimens, combined with the original finding that trp-p8 mRNA levels correlated strongly with prostate cancer relapse, strongly support the conclusion that trp-p8 expression is androgen-regulated and may be associated with the transition to androgen-independent disease. A monoclonal antibody to trp-p8 can be produced that will be used to assess protein expression by immunohistochemistry in an independent cohort of formalin-fixed, paraffin-embedded prostate cancer specimens with known prostate cancer outcome (13).
It should be apparent that, given the guidance, illustrations and examples provided herein, various alternate embodiments, modifications or manipulations of the present invention would be suggested to a skilled artisan, and these are included within the spirit and purview of this application and the scope of the appended claims.
This application is a continuation of pending U.S. patent application Ser. No. 10/603,505, filed Jun. 24, 2003, which claims the benefit of provisional application 60/391,309, filed Jun. 24, 2002, which is incorporated herein in its entirety.
Number | Date | Country
---|---|---
60391309 | Jun 2002 | US
Relation | Number | Date | Country
---|---|---|---
Parent | 10603505 | Jun 2003 | US
Child | 11365745 | Feb 2006 | US